2023-10-02 01:35:27,300 INFO [train.py:1114] (2/4) Training started 2023-10-02 01:35:27,301 INFO [train.py:1124] (2/4) Device: cuda:2 2023-10-02 01:35:27,335 INFO [train.py:1136] (2/4) {'best_train_loss': inf, 'best_valid_loss': inf, 'best_train_epoch': -1, 'best_valid_epoch': -1, 'batch_idx_train': 0, 'log_interval': 50, 'reset_interval': 200, 'valid_interval': 3000, 'feature_dim': 80, 'subsampling_factor': 4, 'warm_step': 2000, 'env_info': {'k2-version': '1.24.3', 'k2-build-type': 'Release', 'k2-with-cuda': True, 'k2-git-sha1': '821ebc378e7fb99b8adc81950227963332821e01', 'k2-git-date': 'Wed Jul 19 15:38:25 2023', 'lhotse-version': '1.16.0.dev+git.1db4d97a.clean', 'torch-version': '1.11.0+cu102', 'torch-cuda-available': True, 'torch-cuda-version': '10.2', 'python-version': '3.9', 'icefall-git-branch': 'dev/bilingual', 'icefall-git-sha1': '4897f2c0-dirty', 'icefall-git-date': 'Thu Sep 28 11:38:28 2023', 'icefall-path': '/star-home/jinzengrui/lib/miniconda3/envs/dev39/lib/python3.9/site-packages/icefall-1.0-py3.9.egg', 'k2-path': '/star-home/jinzengrui/lib/miniconda3/envs/dev39/lib/python3.9/site-packages/k2-1.24.3.dev20230721+cuda10.2.torch1.11.0-py3.9-linux-x86_64.egg/k2/__init__.py', 'lhotse-path': '/star-home/jinzengrui/lib/miniconda3/envs/dev39/lib/python3.9/site-packages/lhotse-1.16.0.dev0+git.1db4d97a.clean-py3.9.egg/lhotse/__init__.py', 'hostname': 'de-74279-k2-train-7-1218101249-5d97868c7c-tp8w2', 'IP address': '10.177.6.147'}, 'world_size': 4, 'master_port': 12354, 'tensorboard': True, 'num_epochs': 50, 'start_epoch': 21, 'start_batch': 0, 'exp_dir': PosixPath('zipformer/exp-w-tal-csasr'), 'bpe_model': 'data/lang_bbpe_2000/bbpe.model', 'base_lr': 0.045, 'lr_batches': 7500, 'lr_epochs': 3.5, 'ref_duration': 600, 'context_size': 2, 'prune_range': 5, 'lm_scale': 0.25, 'am_scale': 0.0, 'simple_loss_scale': 0.5, 'ctc_loss_scale': 0.2, 'seed': 42, 'print_diagnostics': False, 'inf_check': False, 'save_every_n': 4000, 'keep_last_k': 30, 'average_period': 200, 'use_fp16': True, 'use_tal_csasr': True, 'use_librispeech': True, 'num_encoder_layers': '2,2,3,4,3,2', 'downsampling_factor': '1,2,4,8,4,2', 'feedforward_dim': '512,768,1024,1536,1024,768', 'num_heads': '4,4,4,8,4,4', 'encoder_dim': '192,256,384,512,384,256', 'query_head_dim': '32', 'value_head_dim': '12', 'pos_head_dim': '4', 'pos_dim': 48, 'encoder_unmasked_dim': '192,192,256,256,256,192', 'cnn_module_kernel': '31,31,15,15,15,31', 'decoder_dim': 512, 'joiner_dim': 512, 'causal': False, 'chunk_size': '16,32,64,-1', 'left_context_frames': '64,128,256,-1', 'use_transducer': True, 'use_ctc': False, 'manifest_dir': PosixPath('data/fbank'), 'max_duration': 1000, 'bucketing_sampler': True, 'num_buckets': 30, 'concatenate_cuts': False, 'duration_factor': 1.0, 'gap': 1.0, 'on_the_fly_feats': False, 'shuffle': True, 'drop_last': True, 'return_cuts': True, 'num_workers': 2, 'enable_spec_aug': True, 'spec_aug_time_warp_factor': 80, 'enable_musan': True, 'input_strategy': 'PrecomputedFeatures', 'blank_id': 0, 'vocab_size': 2000} 2023-10-02 01:35:27,335 INFO [train.py:1138] (2/4) About to create model 2023-10-02 01:35:27,990 INFO [train.py:1142] (2/4) Number of model parameters: 68625511 2023-10-02 01:35:27,992 INFO [checkpoint.py:112] (2/4) Loading checkpoint from zipformer/exp-w-tal-csasr/epoch-20.pt 2023-10-02 01:35:37,492 INFO [train.py:1157] (2/4) Using DDP 2023-10-02 01:35:37,825 INFO [train.py:1169] (2/4) Loading optimizer state dict 2023-10-02 01:35:38,599 INFO [train.py:1177] (2/4) Loading scheduler state dict 2023-10-02 01:35:38,600 INFO [multi_dataset.py:40] (2/4) About to get multidataset train cuts 2023-10-02 01:35:38,600 INFO [multi_dataset.py:43] (2/4) Loading Aishell-2 in lazy mode 2023-10-02 01:35:38,662 INFO [multi_dataset.py:50] (2/4) Loading TAL-CSASR in lazy mode 2023-10-02 01:35:38,664 INFO [multi_dataset.py:57] (2/4) Loading LibriSpeech in lazy mode 2023-10-02 01:35:38,664 INFO [multi_dataset.py:161] (2/4) About to get train-clean-100 cuts 2023-10-02 01:35:38,665 INFO [multi_dataset.py:168] (2/4) About to get train-clean-360 cuts 2023-10-02 01:35:38,680 INFO [multi_dataset.py:175] (2/4) About to get train-other-500 cuts 2023-10-02 01:35:48,665 INFO [asr_datamodule.py:218] (2/4) Enable MUSAN 2023-10-02 01:35:48,666 INFO [asr_datamodule.py:219] (2/4) About to get Musan cuts 2023-10-02 01:35:51,092 INFO [asr_datamodule.py:243] (2/4) Enable SpecAugment 2023-10-02 01:35:51,092 INFO [asr_datamodule.py:244] (2/4) Time warp factor: 80 2023-10-02 01:35:51,092 INFO [asr_datamodule.py:254] (2/4) Num frame mask: 10 2023-10-02 01:35:51,092 INFO [asr_datamodule.py:267] (2/4) About to create train dataset 2023-10-02 01:35:51,093 INFO [asr_datamodule.py:294] (2/4) Using DynamicBucketingSampler. 2023-10-02 01:35:51,106 WARNING [train.py:1204] (2/4) Exclude cut with ID _14629_210_15709_1_1530604298182_956899_6-232695-0_sp1.1 from training. Duration: 0.8545625 2023-10-02 01:35:51,124 WARNING [train.py:1204] (2/4) Exclude cut with ID _13921_210_9994_1_1530841782272_3503520_327-69677-0 from training. Duration: 0.9 2023-10-02 01:35:51,148 WARNING [train.py:1204] (2/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_372-298093-0 from training. Duration: 0.8 2023-10-02 01:35:51,163 WARNING [train.py:1204] (2/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_515-139111-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 01:35:51,388 WARNING [train.py:1204] (2/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_85-65056-0_sp1.1 from training. Duration: 0.87275 2023-10-02 01:35:51,430 WARNING [train.py:1204] (2/4) Exclude cut with ID _15113_210_8254_1_1530842438579_3716279_209-54533-0_sp1.1 from training. Duration: 0.87275 2023-10-02 01:35:51,481 WARNING [train.py:1204] (2/4) Exclude cut with ID _70447_210_12392_1_1535972499300_3160420_123-312838-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 01:35:51,506 WARNING [train.py:1204] (2/4) Exclude cut with ID _60113_210_16271_1_1535164212481_4386079_70-63596-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 01:35:51,531 WARNING [train.py:1204] (2/4) Exclude cut with ID _14632_210_9988_1_1530691279392_3923200_225-188509-0_sp1.1 from training. Duration: 0.963625 2023-10-02 01:35:51,576 WARNING [train.py:1204] (2/4) Exclude cut with ID _34734_210_4525_1_1533365977542_4448720_584-180532-0_sp1.1 from training. Duration: 0.97275 2023-10-02 01:35:51,588 WARNING [train.py:1204] (2/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_351-294650-0_sp1.1 from training. Duration: 0.57275 2023-10-02 01:35:51,884 WARNING [train.py:1204] (2/4) Exclude cut with ID _13252_210_1925_1_1530671372047_3336909_263-168388-0_sp0.9 from training. Duration: 0.9555625 2023-10-02 01:35:51,908 WARNING [train.py:1204] (2/4) Exclude cut with ID _22083_210_8254_1_1531702862439_3678970_201-85738-0 from training. Duration: 0.9 2023-10-02 01:35:51,991 WARNING [train.py:1204] (2/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_237-58433-0 from training. Duration: 0.74 2023-10-02 01:35:52,014 WARNING [train.py:1204] (2/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_262-250826-0 from training. Duration: 0.99 2023-10-02 01:35:52,118 WARNING [train.py:1204] (2/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_927-43827-0_sp0.9 from training. Duration: 0.82225 2023-10-02 01:35:52,146 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_2821_210_2715_1_1526108385_3763560_73-296673-0 from training. Duration: 0.83 2023-10-02 01:35:52,215 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6287_210_3273_1_1527509506_2653716_182-199887-0 from training. Duration: 0.51 2023-10-02 01:35:52,706 WARNING [train.py:1204] (2/4) Exclude cut with ID _70519_210_4163_1_1535961595994_7458230_899-80223-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 01:35:52,944 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_210-234949-0_sp1.1 from training. Duration: 0.97275 2023-10-02 01:35:52,981 WARNING [train.py:1204] (2/4) Exclude cut with ID _30196_210_1664_1_1533708496522_7074140_768-278176-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 01:35:53,055 WARNING [train.py:1204] (2/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_315-128283-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 01:35:53,132 WARNING [train.py:1204] (2/4) Exclude cut with ID _14886_210_4220_1_1530702020355_3516190_330-323941-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 01:35:53,376 WARNING [train.py:1204] (2/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_264-348896-0_sp1.1 from training. Duration: 0.77275 2023-10-02 01:35:53,396 WARNING [train.py:1204] (2/4) Exclude cut with ID _37086_210_9038_1_1532999540891_7021250_433-10563-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 01:35:53,430 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_53638_210_21581_1_1534297319_5808449_885-80172-0_sp1.1 from training. Duration: 0.87275 2023-10-02 01:35:53,514 WARNING [train.py:1204] (2/4) Exclude cut with ID _14623_210_1925_1_1530773354962_3632609_157-218771-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 01:35:53,524 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_1368-78165-0_sp1.1 from training. Duration: 0.97275 2023-10-02 01:35:53,528 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_309-65177-0_sp0.9 from training. Duration: 0.97775 2023-10-02 01:35:53,533 WARNING [train.py:1204] (2/4) Exclude cut with ID _21632_210_2253_1_1531994001765_3744140_291-138812-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 01:35:53,544 WARNING [train.py:1204] (2/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_669-93163-0_sp1.1 from training. Duration: 0.8909375 2023-10-02 01:35:54,050 WARNING [train.py:1204] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_292-215346-0 from training. Duration: 0.97 2023-10-02 01:35:54,074 WARNING [train.py:1204] (2/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_286-222073-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 01:35:54,334 WARNING [train.py:1204] (2/4) Exclude cut with ID _59256_210_5986_1_1535076085997_3589120_121-237386-0_sp1.1 from training. Duration: 0.87275 2023-10-02 01:35:54,347 WARNING [train.py:1204] (2/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_96-320736-0 from training. Duration: 0.86 2023-10-02 01:35:54,355 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_128-202104-0 from training. Duration: 0.63 2023-10-02 01:35:54,435 WARNING [train.py:1204] (2/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_242-3472-0_sp0.9 from training. Duration: 0.811125 2023-10-02 01:35:54,475 WARNING [train.py:1204] (2/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_154-74182-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 01:35:54,478 WARNING [train.py:1204] (2/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_187-221566-0 from training. Duration: 0.43 2023-10-02 01:35:54,626 WARNING [train.py:1204] (2/4) Exclude cut with ID _6194_210_1534_1_1527511826160_3941194_14-282876-0 from training. Duration: 0.71 2023-10-02 01:35:54,663 WARNING [train.py:1204] (2/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_660-211088-0_sp1.1 from training. Duration: 0.563625 2023-10-02 01:35:54,974 WARNING [train.py:1204] (2/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_526-62079-0_sp1.1 from training. Duration: 0.6090625 2023-10-02 01:35:55,116 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_92-246270-0_sp1.1 from training. Duration: 0.77275 2023-10-02 01:35:55,200 WARNING [train.py:1204] (2/4) Exclude cut with ID IC0287W0023-142350 from training. Duration: 0.956 2023-10-02 01:35:55,270 WARNING [train.py:1204] (2/4) Exclude cut with ID _58605_210_12210_1_1534723195939_3904259_48-269402-0 from training. Duration: 0.96 2023-10-02 01:35:55,284 WARNING [train.py:1204] (2/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_416-257730-0_sp0.9 from training. Duration: 0.72225 2023-10-02 01:35:55,352 WARNING [train.py:1204] (2/4) Exclude cut with ID _7877_210_6014_1_1528286358099_3750136_185-17516-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 01:35:55,452 WARNING [train.py:1204] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_23-189234-0 from training. Duration: 0.83 2023-10-02 01:35:55,469 WARNING [train.py:1204] (2/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_237-89684-0 from training. Duration: 0.96 2023-10-02 01:35:55,553 WARNING [train.py:1204] (2/4) Exclude cut with ID _13916_210_9994_1_1530788341126_3763149_116-21034-0 from training. Duration: 0.96 2023-10-02 01:35:55,715 INFO [asr_datamodule.py:309] (2/4) About to create train dataloader 2023-10-02 01:35:55,716 INFO [multi_dataset.py:103] (2/4) About to get multidataset dev cuts 2023-10-02 01:35:55,716 INFO [multi_dataset.py:106] (2/4) Loading Aishell-2 DEV set in lazy mode 2023-10-02 01:35:55,744 INFO [multi_dataset.py:182] (2/4) About to get dev-clean cuts 2023-10-02 01:35:55,751 INFO [multi_dataset.py:189] (2/4) About to get dev-other cuts 2023-10-02 01:35:55,780 INFO [asr_datamodule.py:340] (2/4) About to create dev dataset 2023-10-02 01:35:56,217 INFO [asr_datamodule.py:357] (2/4) About to create dev dataloader 2023-10-02 01:35:56,217 INFO [train.py:1358] (2/4) Sanity check -- see if any of the batches in epoch 1 would cause OOM. 2023-10-02 01:35:56,228 WARNING [train.py:1204] (2/4) Exclude cut with ID _14629_210_15709_1_1530604298182_956899_6-232695-0_sp1.1 from training. Duration: 0.8545625 2023-10-02 01:35:56,247 WARNING [train.py:1204] (2/4) Exclude cut with ID _13921_210_9994_1_1530841782272_3503520_327-69677-0 from training. Duration: 0.9 2023-10-02 01:35:56,270 WARNING [train.py:1204] (2/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_372-298093-0 from training. Duration: 0.8 2023-10-02 01:35:56,283 WARNING [train.py:1204] (2/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_515-139111-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 01:35:56,500 WARNING [train.py:1204] (2/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_85-65056-0_sp1.1 from training. Duration: 0.87275 2023-10-02 01:35:56,538 WARNING [train.py:1204] (2/4) Exclude cut with ID _15113_210_8254_1_1530842438579_3716279_209-54533-0_sp1.1 from training. Duration: 0.87275 2023-10-02 01:35:56,903 WARNING [train.py:1204] (2/4) Exclude cut with ID _70447_210_12392_1_1535972499300_3160420_123-312838-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 01:35:56,927 WARNING [train.py:1204] (2/4) Exclude cut with ID _60113_210_16271_1_1535164212481_4386079_70-63596-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 01:35:56,953 WARNING [train.py:1204] (2/4) Exclude cut with ID _14632_210_9988_1_1530691279392_3923200_225-188509-0_sp1.1 from training. Duration: 0.963625 2023-10-02 01:35:56,996 WARNING [train.py:1204] (2/4) Exclude cut with ID _34734_210_4525_1_1533365977542_4448720_584-180532-0_sp1.1 from training. Duration: 0.97275 2023-10-02 01:35:57,009 WARNING [train.py:1204] (2/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_351-294650-0_sp1.1 from training. Duration: 0.57275 2023-10-02 01:35:57,322 WARNING [train.py:1204] (2/4) Exclude cut with ID _13252_210_1925_1_1530671372047_3336909_263-168388-0_sp0.9 from training. Duration: 0.9555625 2023-10-02 01:35:57,348 WARNING [train.py:1204] (2/4) Exclude cut with ID _22083_210_8254_1_1531702862439_3678970_201-85738-0 from training. Duration: 0.9 2023-10-02 01:35:57,435 WARNING [train.py:1204] (2/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_237-58433-0 from training. Duration: 0.74 2023-10-02 01:35:57,460 WARNING [train.py:1204] (2/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_262-250826-0 from training. Duration: 0.99 2023-10-02 01:35:57,567 WARNING [train.py:1204] (2/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_927-43827-0_sp0.9 from training. Duration: 0.82225 2023-10-02 01:35:57,597 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_2821_210_2715_1_1526108385_3763560_73-296673-0 from training. Duration: 0.83 2023-10-02 01:35:57,670 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6287_210_3273_1_1527509506_2653716_182-199887-0 from training. Duration: 0.51 2023-10-02 01:35:57,812 WARNING [train.py:1204] (2/4) Exclude cut with ID _70519_210_4163_1_1535961595994_7458230_899-80223-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 01:35:58,059 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_210-234949-0_sp1.1 from training. Duration: 0.97275 2023-10-02 01:35:58,095 WARNING [train.py:1204] (2/4) Exclude cut with ID _30196_210_1664_1_1533708496522_7074140_768-278176-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 01:35:58,176 WARNING [train.py:1204] (2/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_315-128283-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 01:35:58,255 WARNING [train.py:1204] (2/4) Exclude cut with ID _14886_210_4220_1_1530702020355_3516190_330-323941-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 01:35:58,499 WARNING [train.py:1204] (2/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_264-348896-0_sp1.1 from training. Duration: 0.77275 2023-10-02 01:35:58,519 WARNING [train.py:1204] (2/4) Exclude cut with ID _37086_210_9038_1_1532999540891_7021250_433-10563-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 01:35:58,564 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_53638_210_21581_1_1534297319_5808449_885-80172-0_sp1.1 from training. Duration: 0.87275 2023-10-02 01:35:59,101 WARNING [train.py:1204] (2/4) Exclude cut with ID _14623_210_1925_1_1530773354962_3632609_157-218771-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 01:35:59,112 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_1368-78165-0_sp1.1 from training. Duration: 0.97275 2023-10-02 01:35:59,116 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_309-65177-0_sp0.9 from training. Duration: 0.97775 2023-10-02 01:35:59,121 WARNING [train.py:1204] (2/4) Exclude cut with ID _21632_210_2253_1_1531994001765_3744140_291-138812-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 01:35:59,133 WARNING [train.py:1204] (2/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_669-93163-0_sp1.1 from training. Duration: 0.8909375 2023-10-02 01:35:59,651 WARNING [train.py:1204] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_292-215346-0 from training. Duration: 0.97 2023-10-02 01:35:59,673 WARNING [train.py:1204] (2/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_286-222073-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 01:35:59,935 WARNING [train.py:1204] (2/4) Exclude cut with ID _59256_210_5986_1_1535076085997_3589120_121-237386-0_sp1.1 from training. Duration: 0.87275 2023-10-02 01:35:59,948 WARNING [train.py:1204] (2/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_96-320736-0 from training. Duration: 0.86 2023-10-02 01:35:59,956 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_128-202104-0 from training. Duration: 0.63 2023-10-02 01:36:00,035 WARNING [train.py:1204] (2/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_242-3472-0_sp0.9 from training. Duration: 0.811125 2023-10-02 01:36:00,076 WARNING [train.py:1204] (2/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_154-74182-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 01:36:00,079 WARNING [train.py:1204] (2/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_187-221566-0 from training. Duration: 0.43 2023-10-02 01:36:00,228 WARNING [train.py:1204] (2/4) Exclude cut with ID _6194_210_1534_1_1527511826160_3941194_14-282876-0 from training. Duration: 0.71 2023-10-02 01:36:00,265 WARNING [train.py:1204] (2/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_660-211088-0_sp1.1 from training. Duration: 0.563625 2023-10-02 01:36:00,585 WARNING [train.py:1204] (2/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_526-62079-0_sp1.1 from training. Duration: 0.6090625 2023-10-02 01:36:00,724 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_92-246270-0_sp1.1 from training. Duration: 0.77275 2023-10-02 01:36:00,807 WARNING [train.py:1204] (2/4) Exclude cut with ID IC0287W0023-142350 from training. Duration: 0.956 2023-10-02 01:36:00,879 WARNING [train.py:1204] (2/4) Exclude cut with ID _58605_210_12210_1_1534723195939_3904259_48-269402-0 from training. Duration: 0.96 2023-10-02 01:36:00,892 WARNING [train.py:1204] (2/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_416-257730-0_sp0.9 from training. Duration: 0.72225 2023-10-02 01:36:00,962 WARNING [train.py:1204] (2/4) Exclude cut with ID _7877_210_6014_1_1528286358099_3750136_185-17516-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 01:36:01,591 WARNING [train.py:1204] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_23-189234-0 from training. Duration: 0.83 2023-10-02 01:36:01,606 WARNING [train.py:1204] (2/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_237-89684-0 from training. Duration: 0.96 2023-10-02 01:36:01,672 WARNING [train.py:1204] (2/4) Exclude cut with ID _13916_210_9994_1_1530788341126_3763149_116-21034-0 from training. Duration: 0.96 2023-10-02 01:36:01,943 WARNING [train.py:1204] (2/4) Exclude cut with ID _14629_210_15709_1_1530604298182_956899_6-232695-0_sp1.1 from training. Duration: 0.8545625 2023-10-02 01:36:01,962 WARNING [train.py:1204] (2/4) Exclude cut with ID _13921_210_9994_1_1530841782272_3503520_327-69677-0 from training. Duration: 0.9 2023-10-02 01:36:01,985 WARNING [train.py:1204] (2/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_372-298093-0 from training. Duration: 0.8 2023-10-02 01:36:01,998 WARNING [train.py:1204] (2/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_515-139111-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 01:36:02,209 WARNING [train.py:1204] (2/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_85-65056-0_sp1.1 from training. Duration: 0.87275 2023-10-02 01:36:02,247 WARNING [train.py:1204] (2/4) Exclude cut with ID _15113_210_8254_1_1530842438579_3716279_209-54533-0_sp1.1 from training. Duration: 0.87275 2023-10-02 01:36:02,290 WARNING [train.py:1204] (2/4) Exclude cut with ID _70447_210_12392_1_1535972499300_3160420_123-312838-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 01:36:02,314 WARNING [train.py:1204] (2/4) Exclude cut with ID _60113_210_16271_1_1535164212481_4386079_70-63596-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 01:36:02,340 WARNING [train.py:1204] (2/4) Exclude cut with ID _14632_210_9988_1_1530691279392_3923200_225-188509-0_sp1.1 from training. Duration: 0.963625 2023-10-02 01:36:02,384 WARNING [train.py:1204] (2/4) Exclude cut with ID _34734_210_4525_1_1533365977542_4448720_584-180532-0_sp1.1 from training. Duration: 0.97275 2023-10-02 01:36:02,396 WARNING [train.py:1204] (2/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_351-294650-0_sp1.1 from training. Duration: 0.57275 2023-10-02 01:36:02,694 WARNING [train.py:1204] (2/4) Exclude cut with ID _13252_210_1925_1_1530671372047_3336909_263-168388-0_sp0.9 from training. Duration: 0.9555625 2023-10-02 01:36:02,718 WARNING [train.py:1204] (2/4) Exclude cut with ID _22083_210_8254_1_1531702862439_3678970_201-85738-0 from training. Duration: 0.9 2023-10-02 01:36:02,803 WARNING [train.py:1204] (2/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_237-58433-0 from training. Duration: 0.74 2023-10-02 01:36:02,827 WARNING [train.py:1204] (2/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_262-250826-0 from training. Duration: 0.99 2023-10-02 01:36:02,927 WARNING [train.py:1204] (2/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_927-43827-0_sp0.9 from training. Duration: 0.82225 2023-10-02 01:36:02,956 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_2821_210_2715_1_1526108385_3763560_73-296673-0 from training. Duration: 0.83 2023-10-02 01:36:03,023 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6287_210_3273_1_1527509506_2653716_182-199887-0 from training. Duration: 0.51 2023-10-02 01:36:03,142 WARNING [train.py:1204] (2/4) Exclude cut with ID _70519_210_4163_1_1535961595994_7458230_899-80223-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 01:36:03,385 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_210-234949-0_sp1.1 from training. Duration: 0.97275 2023-10-02 01:36:03,420 WARNING [train.py:1204] (2/4) Exclude cut with ID _30196_210_1664_1_1533708496522_7074140_768-278176-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 01:36:03,496 WARNING [train.py:1204] (2/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_315-128283-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 01:36:03,587 WARNING [train.py:1204] (2/4) Exclude cut with ID _14886_210_4220_1_1530702020355_3516190_330-323941-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 01:36:03,823 WARNING [train.py:1204] (2/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_264-348896-0_sp1.1 from training. Duration: 0.77275 2023-10-02 01:36:03,843 WARNING [train.py:1204] (2/4) Exclude cut with ID _37086_210_9038_1_1532999540891_7021250_433-10563-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 01:36:03,877 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_53638_210_21581_1_1534297319_5808449_885-80172-0_sp1.1 from training. Duration: 0.87275 2023-10-02 01:36:03,959 WARNING [train.py:1204] (2/4) Exclude cut with ID _14623_210_1925_1_1530773354962_3632609_157-218771-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 01:36:03,968 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_1368-78165-0_sp1.1 from training. Duration: 0.97275 2023-10-02 01:36:03,972 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_309-65177-0_sp0.9 from training. Duration: 0.97775 2023-10-02 01:36:03,977 WARNING [train.py:1204] (2/4) Exclude cut with ID _21632_210_2253_1_1531994001765_3744140_291-138812-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 01:36:03,988 WARNING [train.py:1204] (2/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_669-93163-0_sp1.1 from training. Duration: 0.8909375 2023-10-02 01:36:04,923 WARNING [train.py:1204] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_292-215346-0 from training. Duration: 0.97 2023-10-02 01:36:04,947 WARNING [train.py:1204] (2/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_286-222073-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 01:36:05,209 WARNING [train.py:1204] (2/4) Exclude cut with ID _59256_210_5986_1_1535076085997_3589120_121-237386-0_sp1.1 from training. Duration: 0.87275 2023-10-02 01:36:05,222 WARNING [train.py:1204] (2/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_96-320736-0 from training. Duration: 0.86 2023-10-02 01:36:05,230 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_128-202104-0 from training. Duration: 0.63 2023-10-02 01:36:05,319 WARNING [train.py:1204] (2/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_242-3472-0_sp0.9 from training. Duration: 0.811125 2023-10-02 01:36:05,360 WARNING [train.py:1204] (2/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_154-74182-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 01:36:05,364 WARNING [train.py:1204] (2/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_187-221566-0 from training. Duration: 0.43 2023-10-02 01:36:05,518 WARNING [train.py:1204] (2/4) Exclude cut with ID _6194_210_1534_1_1527511826160_3941194_14-282876-0 from training. Duration: 0.71 2023-10-02 01:36:05,557 WARNING [train.py:1204] (2/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_660-211088-0_sp1.1 from training. Duration: 0.563625 2023-10-02 01:36:05,885 WARNING [train.py:1204] (2/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_526-62079-0_sp1.1 from training. Duration: 0.6090625 2023-10-02 01:36:06,039 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_92-246270-0_sp1.1 from training. Duration: 0.77275 2023-10-02 01:36:06,124 WARNING [train.py:1204] (2/4) Exclude cut with ID IC0287W0023-142350 from training. Duration: 0.956 2023-10-02 01:36:06,196 WARNING [train.py:1204] (2/4) Exclude cut with ID _58605_210_12210_1_1534723195939_3904259_48-269402-0 from training. Duration: 0.96 2023-10-02 01:36:06,211 WARNING [train.py:1204] (2/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_416-257730-0_sp0.9 from training. Duration: 0.72225 2023-10-02 01:36:06,280 WARNING [train.py:1204] (2/4) Exclude cut with ID _7877_210_6014_1_1528286358099_3750136_185-17516-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 01:36:06,378 WARNING [train.py:1204] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_23-189234-0 from training. Duration: 0.83 2023-10-02 01:36:06,393 WARNING [train.py:1204] (2/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_237-89684-0 from training. Duration: 0.96 2023-10-02 01:36:06,464 WARNING [train.py:1204] (2/4) Exclude cut with ID _13916_210_9994_1_1530788341126_3763149_116-21034-0 from training. Duration: 0.96 2023-10-02 01:36:06,625 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_3133_210_2087_1_1526106446_2324690_246-125000-0_sp1.1 from training. Duration: 0.563625 2023-10-02 01:36:07,953 WARNING [train.py:1204] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_402-253027-0 from training. Duration: 0.54 2023-10-02 01:36:08,007 WARNING [train.py:1204] (2/4) Exclude cut with ID _59288_210_5854_1_1535077757774_3412246_158-289345-0_sp1.1 from training. Duration: 0.92725 2023-10-02 01:36:08,141 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_28211_210_6388_1_1532861506_4523224_128-328162-0_sp1.1 from training. Duration: 0.8181875 2023-10-02 01:36:08,536 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_788-278281-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 01:36:08,543 WARNING [train.py:1204] (2/4) Exclude cut with ID _49728_210_12651_1_1534041034294_6788150_624-195301-0_sp0.9 from training. Duration: 0.9444375 2023-10-02 01:36:08,594 WARNING [train.py:1204] (2/4) Exclude cut with ID _14860_210_4915_1_1530702020340_7343240_267-139775-0_sp1.1 from training. Duration: 0.963625 2023-10-02 01:36:08,650 WARNING [train.py:1204] (2/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_332-221057-0 from training. Duration: 0.68 2023-10-02 01:36:08,774 WARNING [train.py:1204] (2/4) Exclude cut with ID _40402_210_18219_1_1533522324115_2953150_242-76943-0 from training. Duration: 0.94 2023-10-02 01:36:08,956 WARNING [train.py:1204] (2/4) Exclude cut with ID _16747_210_5986_1_1531811544733_2749109_87-349587-0_sp1.1 from training. Duration: 0.963625 2023-10-02 01:36:09,022 WARNING [train.py:1204] (2/4) Exclude cut with ID _22982_210_1883_1_1531882790447_5449409_505-346452-0_sp1.1 from training. Duration: 0.963625 2023-10-02 01:36:09,277 WARNING [train.py:1204] (2/4) Exclude cut with ID _17956_210_4353_1_1531209568125_6979970_13-192107-0_sp1.1 from training. Duration: 0.963625 2023-10-02 01:36:09,313 WARNING [train.py:1204] (2/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_430-307666-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 01:36:09,366 WARNING [train.py:1204] (2/4) Exclude cut with ID _2895_210_2477_1_1525956785794_4063750_289-11475-0_sp1.1 from training. Duration: 0.7454375 2023-10-02 01:36:09,379 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_3910_210_1794_1_1526742306_334530_28-238909-0_sp0.9 from training. Duration: 0.9 2023-10-02 01:36:09,506 WARNING [train.py:1204] (2/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_18-130336-0 from training. Duration: 0.79 2023-10-02 01:36:09,643 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_548-294316-0_sp0.9 from training. Duration: 0.9 2023-10-02 01:36:10,184 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_42-142473-0_sp1.1 from training. Duration: 0.6545625 2023-10-02 01:36:10,197 WARNING [train.py:1204] (2/4) Exclude cut with ID _59170_210_6721_1_1534983633960_4203335_326-48066-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 01:36:10,357 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_45886_210_17024_1_1533709215_471063_7-79165-0 from training. Duration: 0.98 2023-10-02 01:36:10,688 WARNING [train.py:1204] (2/4) Exclude cut with ID _44585_210_18628_1_1533605350387_4518199_280-73534-0_sp0.9 from training. Duration: 0.988875 2023-10-02 01:36:10,694 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_3475_210_2028_1_1526193578_3622569_265-276625-0_sp1.1 from training. Duration: 0.6909375 2023-10-02 01:36:10,870 WARNING [train.py:1204] (2/4) Exclude cut with ID _64631_210_27767_1_1535349580068_4165160_88-225232-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 01:36:11,184 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_205-265518-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 01:36:12,082 WARNING [train.py:1204] (2/4) Exclude cut with ID _40402_210_18219_1_1533522324115_2953150_271-253734-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 01:36:12,506 WARNING [train.py:1204] (2/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_282-257791-0 from training. Duration: 0.86 2023-10-02 01:36:12,771 WARNING [train.py:1204] (2/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_356-45054-0 from training. Duration: 0.86 2023-10-02 01:36:12,817 WARNING [train.py:1204] (2/4) Exclude cut with ID _69584_210_5219_1_1535785133775_4044450_38-348116-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 01:36:12,819 WARNING [train.py:1204] (2/4) Exclude cut with ID _66763_210_17301_1_1535788667617_241750_21-328480-0_sp1.1 from training. Duration: 0.936375 2023-10-02 01:36:12,886 WARNING [train.py:1204] (2/4) Exclude cut with ID _26528_210_9070_1_1532570410842_3851310_497-97218-0_sp0.9 from training. Duration: 0.9555625 2023-10-02 01:36:12,933 WARNING [train.py:1204] (2/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_592-101075-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 01:36:13,060 WARNING [train.py:1204] (2/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_678-159307-0 from training. Duration: 0.85 2023-10-02 01:36:13,222 WARNING [train.py:1204] (2/4) Exclude cut with ID _28427_210_4220_1_1532581240967_3061069_166-319773-0_sp1.1 from training. Duration: 0.936375 2023-10-02 01:36:13,300 WARNING [train.py:1204] (2/4) Exclude cut with ID _29877_210_6228_1_1532397791106_5276149_606-224984-0_sp1.1 from training. Duration: 0.936375 2023-10-02 01:36:13,574 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_125-203105-0_sp1.1 from training. Duration: 0.72725 2023-10-02 01:36:13,845 WARNING [train.py:1204] (2/4) Exclude cut with ID IC0620W0113-306765 from training. Duration: 0.768 2023-10-02 01:36:13,985 WARNING [train.py:1204] (2/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_410-287857-0_sp1.1 from training. Duration: 0.8454375 2023-10-02 01:36:14,231 WARNING [train.py:1204] (2/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_78-95260-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 01:36:14,397 WARNING [train.py:1204] (2/4) Exclude cut with ID _58059_210_5854_1_1534901471149_3164808_6-73080-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 01:36:14,419 WARNING [train.py:1204] (2/4) Exclude cut with ID _5166_210_2084_1_1527073197160_4087993_331-117829-0 from training. Duration: 0.62 2023-10-02 01:36:14,483 WARNING [train.py:1204] (2/4) Exclude cut with ID _14880_210_8895_1_1530680426655_3795259_289-212817-0_sp0.9 from training. Duration: 0.7666875 2023-10-02 01:36:14,514 WARNING [train.py:1204] (2/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_620-144779-0_sp1.1 from training. Duration: 0.92725 2023-10-02 01:36:14,677 WARNING [train.py:1204] (2/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_561-288575-0_sp1.1 from training. Duration: 0.963625 2023-10-02 01:36:14,765 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_22657_210_6890_1_1532086002_1637216_122-188128-0_sp1.1 from training. Duration: 0.963625 2023-10-02 01:36:14,990 WARNING [train.py:1204] (2/4) Exclude cut with ID _16756_210_4915_1_1531047803763_7104259_604-86607-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 01:36:15,241 WARNING [train.py:1204] (2/4) Exclude cut with ID _15150_210_8341_1_1530752411803_7256384_469-239509-0 from training. Duration: 0.68 2023-10-02 01:36:15,249 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_10987_210_4163_1_1529760555_4123902_302-87926-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 01:36:15,773 WARNING [train.py:1204] (2/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_469-94673-0_sp0.9 from training. Duration: 0.87775 2023-10-02 01:36:16,542 WARNING [train.py:1204] (2/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_798-106179-0 from training. Duration: 0.83 2023-10-02 01:36:16,688 WARNING [train.py:1204] (2/4) Exclude cut with ID _16744_210_5986_1_1531724405384_3907567_242-111316-0 from training. Duration: 0.93 2023-10-02 01:36:16,849 WARNING [train.py:1204] (2/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_938-205135-0_sp0.9 from training. Duration: 0.9666875 2023-10-02 01:36:16,956 WARNING [train.py:1204] (2/4) Exclude cut with ID _64135_210_2253_1_1535677267876_7608972_133-188266-0_sp0.9 from training. Duration: 0.9 2023-10-02 01:36:16,964 WARNING [train.py:1204] (2/4) Exclude cut with ID _44585_210_18628_1_1533605350387_4518199_623-346820-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 01:36:17,040 WARNING [train.py:1204] (2/4) Exclude cut with ID _42326_210_7770_1_1533814282531_3707269_454-266903-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 01:36:17,129 WARNING [train.py:1204] (2/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_5-55893-0_sp1.1 from training. Duration: 0.5 2023-10-02 01:36:17,165 WARNING [train.py:1204] (2/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_577-332600-0_sp1.1 from training. Duration: 0.5181875 2023-10-02 01:36:17,166 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_957-138882-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 01:36:17,744 WARNING [train.py:1204] (2/4) Exclude cut with ID _12556_210_8341_1_1530081024100_7327690_219-38004-0_sp1.1 from training. Duration: 0.97275 2023-10-02 01:36:17,849 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_397-147196-0_sp1.1 from training. Duration: 0.72725 2023-10-02 01:36:17,867 WARNING [train.py:1204] (2/4) Exclude cut with ID _3541_210_2477_1_1526475408709_3263973_231-104736-0_sp0.9 from training. Duration: 0.8666875 2023-10-02 01:36:17,941 WARNING [train.py:1204] (2/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_398-334558-0 from training. Duration: 0.96 2023-10-02 01:36:18,074 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_257-20733-0_sp1.1 from training. Duration: 0.6909375 2023-10-02 01:36:18,157 WARNING [train.py:1204] (2/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_146-282400-0_sp1.1 from training. Duration: 0.6454375 2023-10-02 01:36:18,160 WARNING [train.py:1204] (2/4) Exclude cut with ID _51899_210_20034_1_1534129254466_3669220_265-215428-0 from training. Duration: 0.98 2023-10-02 01:36:18,244 WARNING [train.py:1204] (2/4) Exclude cut with ID _34423_210_8750_1_1533430798456_3330199_219-106541-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 01:36:18,362 WARNING [train.py:1204] (2/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_458-67321-0 from training. Duration: 0.97 2023-10-02 01:36:18,896 WARNING [train.py:1204] (2/4) Exclude cut with ID _6653_210_2629_1_1527919172441_6732679_1051-182606-0_sp1.1 from training. Duration: 0.936375 2023-10-02 01:36:18,919 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_53555_210_5632_1_1534595220_7232297_603-4396-0_sp1.1 from training. Duration: 0.97275 2023-10-02 01:36:19,039 WARNING [train.py:1204] (2/4) Exclude cut with ID _54218_210_12210_1_1534413646388_4409259_159-144367-0_sp1.1 from training. Duration: 0.963625 2023-10-02 01:36:19,147 WARNING [train.py:1204] (2/4) Exclude cut with ID _7606_210_4915_1_1528894253277_5080690_301-10732-0_sp0.9 from training. Duration: 0.9 2023-10-02 01:36:19,150 WARNING [train.py:1204] (2/4) Exclude cut with ID _15026_210_8750_1_1530777602316_3291850_211-195043-0_sp0.9 from training. Duration: 0.888875 2023-10-02 01:36:19,318 WARNING [train.py:1204] (2/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_146-282400-0 from training. Duration: 0.71 2023-10-02 01:36:19,362 WARNING [train.py:1204] (2/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_65-88242-0 from training. Duration: 0.56 2023-10-02 01:36:19,470 WARNING [train.py:1204] (2/4) Exclude cut with ID _55857_210_2346_1_1534644007831_6898209_80-107757-0_sp1.1 from training. Duration: 0.963625 2023-10-02 01:36:19,502 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_15126_210_6753_1_1530778677_2591185_120-277707-0_sp0.9 from training. Duration: 0.888875 2023-10-02 01:36:19,656 WARNING [train.py:1204] (2/4) Exclude cut with ID _44778_210_2054_1_1533798127203_7443090_1033-168593-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 01:36:19,710 WARNING [train.py:1204] (2/4) Exclude cut with ID _46683_210_9642_1_1533730913492_4591116_475-30711-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 01:36:19,740 WARNING [train.py:1204] (2/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_253-308377-0 from training. Duration: 0.49 2023-10-02 01:36:19,808 WARNING [train.py:1204] (2/4) Exclude cut with ID _4680_210_3525_1_1527249757953_893386_75-78979-0 from training. Duration: 0.91 2023-10-02 01:36:19,876 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_5522_210_3273_1_1527249606_3909096_318-203153-0_sp0.9 from training. Duration: 0.47775 2023-10-02 01:36:19,978 WARNING [train.py:1204] (2/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_550-170116-0_sp1.1 from training. Duration: 0.92725 2023-10-02 01:36:20,013 WARNING [train.py:1204] (2/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_277-210798-0_sp0.9 from training. Duration: 0.611125 2023-10-02 01:36:20,098 WARNING [train.py:1204] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_620-276793-0 from training. Duration: 0.59 2023-10-02 01:36:20,115 WARNING [train.py:1204] (2/4) Exclude cut with ID _3067_210_2667_1_1526212974640_4503168_119-175776-0 from training. Duration: 0.74 2023-10-02 01:36:20,184 WARNING [train.py:1204] (2/4) Exclude cut with ID _55209_210_1531_1_1534675828996_4597160_681-195827-0_sp1.1 from training. Duration: 0.92725 2023-10-02 01:36:20,252 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_427-78097-0_sp1.1 from training. Duration: 0.72725 2023-10-02 01:36:20,940 WARNING [train.py:1204] (2/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_96-290039-0_sp0.9 from training. Duration: 0.62225 2023-10-02 01:36:20,959 WARNING [train.py:1204] (2/4) Exclude cut with ID _45329_210_7005_1_1534157951211_6774339_287-292174-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 01:36:21,188 WARNING [train.py:1204] (2/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_200-292228-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 01:36:21,420 WARNING [train.py:1204] (2/4) Exclude cut with ID _69884_210_9771_1_1535846392818_4166169_291-194999-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 01:36:21,654 WARNING [train.py:1204] (2/4) Exclude cut with ID _58895_210_8750_1_1535184081320_3649089_265-323810-0_sp1.1 from training. Duration: 0.87275 2023-10-02 01:36:21,796 WARNING [train.py:1204] (2/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_58-68956-0 from training. Duration: 0.83 2023-10-02 01:36:21,810 WARNING [train.py:1204] (2/4) Exclude cut with ID _39867_210_9826_1_1533553222030_9344259_322-115153-0_sp1.1 from training. Duration: 0.963625 2023-10-02 01:36:22,106 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_3371_210_1794_1_1526384462_5580777_396-339812-0_sp1.1 from training. Duration: 0.77275 2023-10-02 01:36:22,125 WARNING [train.py:1204] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_731-2428-0_sp0.9 from training. Duration: 0.92225 2023-10-02 01:36:22,155 WARNING [train.py:1204] (2/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_98-741-0_sp1.1 from training. Duration: 0.72725 2023-10-02 01:36:22,157 WARNING [train.py:1204] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_238-245655-0_sp0.9 from training. Duration: 0.9 2023-10-02 01:36:22,183 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6788_210_1388_1_1527898698_4562021_91-197981-0_sp0.9 from training. Duration: 0.92225 2023-10-02 01:36:22,285 WARNING [train.py:1204] (2/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_860-96198-0 from training. Duration: 0.73 2023-10-02 01:36:22,451 WARNING [train.py:1204] (2/4) Exclude cut with ID _14268_210_9206_1_1530622771285_4714099_331-105351-0_sp0.9 from training. Duration: 0.72225 2023-10-02 01:36:22,477 WARNING [train.py:1204] (2/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_204-137133-0_sp1.1 from training. Duration: 0.92725 2023-10-02 01:36:22,497 WARNING [train.py:1204] (2/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_5-46496-0_sp1.1 from training. Duration: 0.936375 2023-10-02 01:36:22,513 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_160-218721-0_sp1.1 from training. Duration: 0.87275 2023-10-02 01:36:22,827 WARNING [train.py:1204] (2/4) Exclude cut with ID _45329_210_7005_1_1534157951211_6774339_503-287883-0 from training. Duration: 0.88 2023-10-02 01:36:22,910 WARNING [train.py:1204] (2/4) Exclude cut with ID _57572_210_5517_1_1534921219624_3809484_110-79134-0_sp1.1 from training. Duration: 0.92725 2023-10-02 01:36:22,998 WARNING [train.py:1204] (2/4) Exclude cut with ID _44585_210_18628_1_1533605350387_4518199_421-37641-0_sp1.1 from training. Duration: 0.936375 2023-10-02 01:36:23,093 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_42-142473-0_sp0.9 from training. Duration: 0.8 2023-10-02 01:36:23,237 WARNING [train.py:1204] (2/4) Exclude cut with ID _5166_210_2084_1_1527073197160_4087993_310-114452-0_sp0.9 from training. Duration: 0.6555625 2023-10-02 01:36:23,503 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9029W0120-509520 from training. Duration: 0.768 2023-10-02 01:36:23,523 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9029W0433-509820 from training. Duration: 0.896 2023-10-02 01:36:23,642 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_40222_210_6228_1_1533385298_4481939_163-334586-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 01:36:23,643 WARNING [train.py:1204] (2/4) Exclude cut with ID _48677_210_5663_1_1533902405919_3892989_408-40740-0_sp1.1 from training. Duration: 0.7545625 2023-10-02 01:36:23,927 WARNING [train.py:1204] (2/4) Exclude cut with ID _24314_210_4915_1_1532135078116_7170179_521-258868-0_sp0.9 from training. Duration: 0.87775 2023-10-02 01:36:24,089 WARNING [train.py:1204] (2/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_576-51198-0_sp1.1 from training. Duration: 0.92725 2023-10-02 01:36:24,240 WARNING [train.py:1204] (2/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_306-216871-0_sp1.1 from training. Duration: 0.97275 2023-10-02 01:36:24,636 WARNING [train.py:1204] (2/4) Exclude cut with ID _20684_210_6917_1_1531567751646_4203930_463-119982-0_sp1.1 from training. Duration: 0.97275 2023-10-02 01:36:24,720 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9084W0012-536850 from training. Duration: 0.64 2023-10-02 01:36:25,492 WARNING [train.py:1204] (2/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_847-41334-0_sp1.1 from training. Duration: 0.4 2023-10-02 01:36:25,779 WARNING [train.py:1204] (2/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_326-165900-0_sp1.1 from training. Duration: 0.62725 2023-10-02 01:36:25,858 WARNING [train.py:1204] (2/4) Exclude cut with ID _7537_210_2084_1_1528502395431_3924359_128-268029-0_sp1.1 from training. Duration: 0.8909375 2023-10-02 01:36:26,033 WARNING [train.py:1204] (2/4) Exclude cut with ID _67499_210_17282_1_1535509805220_3692430_333-155030-0_sp1.1 from training. Duration: 0.97275 2023-10-02 01:36:26,306 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_34008_210_10431_1_1533345424_8183247_588-257177-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 01:36:26,509 WARNING [train.py:1204] (2/4) Exclude cut with ID _25914_210_6400_1_1532141317058_644740_52-96808-0_sp1.1 from training. Duration: 0.82725 2023-10-02 01:36:26,653 WARNING [train.py:1204] (2/4) Exclude cut with ID _56494_210_4220_1_1535284837625_3033169_69-138150-0_sp1.1 from training. Duration: 0.8545625 2023-10-02 01:36:26,828 WARNING [train.py:1204] (2/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_19-143553-0_sp1.1 from training. Duration: 0.97275 2023-10-02 01:36:26,878 WARNING [train.py:1204] (2/4) Exclude cut with ID _21878_210_3525_1_1531738772002_7264330_582-130735-0_sp1.1 from training. Duration: 0.936375 2023-10-02 01:36:27,005 WARNING [train.py:1204] (2/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_943-201356-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 01:36:27,014 WARNING [train.py:1204] (2/4) Exclude cut with ID _3231_210_2106_1_1526205864934_6784220_422-229276-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 01:36:27,036 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_48049_210_5632_1_1533864553_3519937_73-198717-0_sp1.1 from training. Duration: 0.97275 2023-10-02 01:36:27,108 WARNING [train.py:1204] (2/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_329-285801-0 from training. Duration: 0.91 2023-10-02 01:36:27,130 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9171W0114-579000 from training. Duration: 0.896 2023-10-02 01:36:27,149 WARNING [train.py:1204] (2/4) Exclude cut with ID _3120_210_3525_1_1526036143112_4072980_210-132675-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 01:36:27,226 WARNING [train.py:1204] (2/4) Exclude cut with ID _55857_210_2346_1_1534644007831_6898209_745-98391-0_sp0.9 from training. Duration: 0.8444375 2023-10-02 01:36:27,294 WARNING [train.py:1204] (2/4) Exclude cut with ID _5250_210_5219_1_1527249561806_8692110_615-322035-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 01:36:27,307 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_36756_210_7234_1_1533026991_966276_119-315767-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 01:36:27,322 WARNING [train.py:1204] (2/4) Exclude cut with ID _13916_210_9994_1_1530788341126_3763149_60-191556-0_sp1.1 from training. Duration: 0.5545625 2023-10-02 01:36:27,350 WARNING [train.py:1204] (2/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_177-236809-0_sp1.1 from training. Duration: 0.4454375 2023-10-02 01:36:27,401 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7002_210_1794_1_1527939683_5157286_184-229630-0_sp1.1 from training. Duration: 0.67275 2023-10-02 01:36:27,407 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_50428_210_5724_1_1534247532_4126418_464-18322-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 01:36:27,475 WARNING [train.py:1204] (2/4) Exclude cut with ID _35799_210_5854_1_1533263487887_3529071_466-131402-0_sp1.1 from training. Duration: 0.936375 2023-10-02 01:36:27,562 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_1259-132768-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 01:36:27,629 WARNING [train.py:1204] (2/4) Exclude cut with ID _6694_210_4599_1_1527852637490_3965529_144-185051-0_sp1.1 from training. Duration: 0.87275 2023-10-02 01:36:27,668 WARNING [train.py:1204] (2/4) Exclude cut with ID _64639_210_5816_1_1535367619437_3528000_59-133290-0_sp1.1 from training. Duration: 0.963625 2023-10-02 01:36:27,872 WARNING [train.py:1204] (2/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_223-49525-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 01:36:28,109 WARNING [train.py:1204] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_748-36150-0_sp1.1 from training. Duration: 0.82725 2023-10-02 01:36:28,110 WARNING [train.py:1204] (2/4) Exclude cut with ID _19896_210_1883_1_1531709022662_6649569_52-221627-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 01:36:28,171 WARNING [train.py:1204] (2/4) Exclude cut with ID _6789_210_1534_1_1527854471671_2870519_296-115766-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 01:36:28,401 WARNING [train.py:1204] (2/4) Exclude cut with ID _14624_210_1925_1_1530858690657_3642219_100-19871-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 01:36:28,442 WARNING [train.py:1204] (2/4) Exclude cut with ID _12555_210_8750_1_1530075594402_3780239_50-293675-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 01:36:28,663 WARNING [train.py:1204] (2/4) Exclude cut with ID _14880_210_8895_1_1530680426655_3795259_289-212817-0_sp1.1 from training. Duration: 0.62725 2023-10-02 01:36:28,725 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_14056_210_4163_1_1530604483_7572827_388-211626-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 01:36:29,007 WARNING [train.py:1204] (2/4) Exclude cut with ID _5289_210_2084_1_1527678202736_3779299_392-261988-0 from training. Duration: 0.65 2023-10-02 01:36:29,014 WARNING [train.py:1204] (2/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_124-345840-0 from training. Duration: 0.52 2023-10-02 01:36:29,018 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_2655_210_2087_1_1525688959_3897400_437-337690-0 from training. Duration: 0.63 2023-10-02 01:36:29,255 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_3780_210_2087_1_1526467391_239068_13-274633-0_sp1.1 from training. Duration: 0.92725 2023-10-02 01:36:29,263 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_805-71497-0_sp1.1 from training. Duration: 0.6454375 2023-10-02 01:36:29,373 WARNING [train.py:1204] (2/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_434-83000-0_sp1.1 from training. Duration: 0.8909375 2023-10-02 01:36:30,126 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_17613_210_2151_1_1531137569_2366160_285-219531-0_sp1.1 from training. Duration: 0.97275 2023-10-02 01:36:30,141 WARNING [train.py:1204] (2/4) Exclude cut with ID _56108_210_16380_1_1534505446159_3887959_22-37858-0_sp1.1 from training. Duration: 0.936375 2023-10-02 01:36:30,183 WARNING [train.py:1204] (2/4) Exclude cut with ID _28427_210_4220_1_1532581240967_3061069_200-88444-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 01:36:30,251 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_43802_210_2290_1_1533516203_8829504_77-279042-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 01:36:30,355 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9309W0193-640320 from training. Duration: 0.896 2023-10-02 01:36:30,506 WARNING [train.py:1204] (2/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_516-324055-0_sp1.1 from training. Duration: 0.936375 2023-10-02 01:36:30,900 WARNING [train.py:1204] (2/4) Exclude cut with ID _40094_210_16380_1_1533294005773_3641159_10-261240-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 01:36:31,145 WARNING [train.py:1204] (2/4) Exclude cut with ID _6267_210_2084_1_1527764658526_3894730_375-219887-0_sp1.1 from training. Duration: 0.6090625 2023-10-02 01:36:31,241 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6788_210_1388_1_1527898698_4562021_180-75288-0 from training. Duration: 0.99 2023-10-02 01:36:31,486 WARNING [train.py:1204] (2/4) Exclude cut with ID _8030_210_7497_1_1528444886781_3531089_262-86750-0_sp1.1 from training. Duration: 0.736375 2023-10-02 01:36:31,498 WARNING [train.py:1204] (2/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_934-325498-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 01:36:31,524 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_456-22141-0_sp0.9 from training. Duration: 0.97775 2023-10-02 01:36:31,679 WARNING [train.py:1204] (2/4) Exclude cut with ID _49643_210_7989_1_1534640244224_3939031_598-161926-0_sp1.1 from training. Duration: 0.8181875 2023-10-02 01:36:31,817 WARNING [train.py:1204] (2/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_345-49721-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 01:36:31,924 WARNING [train.py:1204] (2/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_440-141811-0_sp1.1 from training. Duration: 0.863625 2023-10-02 01:36:32,014 WARNING [train.py:1204] (2/4) Exclude cut with ID _64048_210_10626_1_1535198382802_3835179_394-212120-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 01:36:32,068 WARNING [train.py:1204] (2/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_91-277684-0 from training. Duration: 0.74 2023-10-02 01:36:32,464 WARNING [train.py:1204] (2/4) Exclude cut with ID _23038_210_9570_1_1532001584158_3652419_191-346343-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 01:36:32,528 WARNING [train.py:1204] (2/4) Exclude cut with ID _15140_210_9288_1_1530791388450_4880149_351-347974-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 01:36:32,574 WARNING [train.py:1204] (2/4) Exclude cut with ID _58605_210_12210_1_1534723195939_3904259_48-269402-0_sp1.1 from training. Duration: 0.87275 2023-10-02 01:36:32,579 WARNING [train.py:1204] (2/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_401-53423-0_sp0.9 from training. Duration: 0.611125 2023-10-02 01:36:32,749 WARNING [train.py:1204] (2/4) Exclude cut with ID _42659_210_2070_1_1533607442765_3120380_158-49004-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 01:36:32,898 WARNING [train.py:1204] (2/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_81-75849-0_sp1.1 from training. Duration: 0.3545625 2023-10-02 01:36:33,090 WARNING [train.py:1204] (2/4) Exclude cut with ID _27052_210_5986_1_1532329197187_3585009_314-106499-0_sp1.1 from training. Duration: 0.836375 2023-10-02 01:36:33,261 WARNING [train.py:1204] (2/4) Exclude cut with ID _4626_210_3532_1_1526817652717_3916859_446-206656-0_sp0.9 from training. Duration: 0.8444375 2023-10-02 01:36:33,362 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_53378_210_17282_1_1534221071_5769509_158-114960-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 01:36:33,523 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_3171_210_2087_1_1526109084_2330007_37-64334-0_sp0.9 from training. Duration: 0.911125 2023-10-02 01:36:33,556 WARNING [train.py:1204] (2/4) Exclude cut with ID _5410_210_1388_1_1527397055795_1719020_39-112221-0 from training. Duration: 0.69 2023-10-02 01:36:33,589 WARNING [train.py:1204] (2/4) Exclude cut with ID _8064_210_5399_1_1528372299291_4282973_4-91037-0_sp0.9 from training. Duration: 0.97775 2023-10-02 01:36:33,611 WARNING [train.py:1204] (2/4) Exclude cut with ID ID0079W0443-717795 from training. Duration: 0.541 2023-10-02 01:36:33,866 WARNING [train.py:1204] (2/4) Exclude cut with ID _34734_210_4525_1_1533365977542_4448720_766-297928-0_sp1.1 from training. Duration: 0.936375 2023-10-02 01:36:34,801 WARNING [train.py:1204] (2/4) Exclude cut with ID _51471_210_1889_1_1534399391390_3094250_310-337793-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 01:36:34,829 WARNING [train.py:1204] (2/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_678-159307-0_sp0.9 from training. Duration: 0.9444375 2023-10-02 01:36:35,047 WARNING [train.py:1204] (2/4) Exclude cut with ID _50093_210_4220_1_1534417141207_3432299_259-322252-0 from training. Duration: 0.92 2023-10-02 01:36:35,111 WARNING [train.py:1204] (2/4) Exclude cut with ID _47790_210_16187_1_1533862814066_4207519_67-103444-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 01:36:35,128 WARNING [train.py:1204] (2/4) Exclude cut with ID _40551_210_10635_1_1533776424632_3416179_311-247578-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 01:36:35,344 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_4633_210_2040_1_1526824163_4145343_416-76117-0 from training. Duration: 0.95 2023-10-02 01:36:35,448 WARNING [train.py:1204] (2/4) Exclude cut with ID _5166_210_2084_1_1527073197160_4087993_331-117829-0_sp0.9 from training. Duration: 0.688875 2023-10-02 01:36:35,556 WARNING [train.py:1204] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_299-247059-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 01:36:35,628 WARNING [train.py:1204] (2/4) Exclude cut with ID _64002_210_9570_1_1535198605157_38979_2-240845-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 01:36:35,909 WARNING [train.py:1204] (2/4) Exclude cut with ID _4681_210_3525_1_1527250689464_120889_15-126760-0_sp0.9 from training. Duration: 0.9 2023-10-02 01:36:35,921 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_21500_210_2054_1_1532149310_2716758_256-156611-0_sp1.1 from training. Duration: 0.936375 2023-10-02 01:36:35,941 WARNING [train.py:1204] (2/4) Exclude cut with ID _16908_210_5724_1_1531119157248_3772700_624-203729-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 01:36:37,364 WARNING [train.py:1204] (2/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_605-219732-0_sp1.1 from training. Duration: 0.8545625 2023-10-02 01:36:37,397 WARNING [train.py:1204] (2/4) Exclude cut with ID _44718_210_5816_1_1534050051869_6469030_380-167520-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 01:36:37,464 WARNING [train.py:1204] (2/4) Exclude cut with ID _19896_210_1883_1_1531709022662_6649569_94-229927-0_sp1.1 from training. Duration: 0.8818125 2023-10-02 01:36:37,518 WARNING [train.py:1204] (2/4) Exclude cut with ID _19514_210_9259_1_1531530071330_5084230_519-174300-0_sp1.1 from training. Duration: 0.963625 2023-10-02 01:36:37,602 WARNING [train.py:1204] (2/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_340-174176-0_sp0.9 from training. Duration: 0.5444375 2023-10-02 01:36:37,606 WARNING [train.py:1204] (2/4) Exclude cut with ID _8263_210_1664_1_1528439452891_970220_68-181805-0_sp1.1 from training. Duration: 0.5818125 2023-10-02 01:36:37,728 WARNING [train.py:1204] (2/4) Exclude cut with ID _3120_210_3525_1_1526036143112_4072980_348-257723-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 01:36:37,791 WARNING [train.py:1204] (2/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_453-133983-0_sp1.1 from training. Duration: 0.7090625 2023-10-02 01:36:37,844 WARNING [train.py:1204] (2/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_17-86060-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 01:36:37,869 WARNING [train.py:1204] (2/4) Exclude cut with ID _29877_210_6228_1_1532397791106_5276149_228-184657-0_sp1.1 from training. Duration: 0.97275 2023-10-02 01:36:37,978 WARNING [train.py:1204] (2/4) Exclude cut with ID _18516_210_9206_1_1531704653206_3692406_198-334870-0 from training. Duration: 0.92 2023-10-02 01:36:38,019 WARNING [train.py:1204] (2/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_340-174176-0_sp1.1 from training. Duration: 0.4454375 2023-10-02 01:36:38,031 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_3133_210_2087_1_1526106446_2324690_14-139186-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 01:36:38,964 WARNING [train.py:1204] (2/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_126-29104-0_sp0.9 from training. Duration: 0.9444375 2023-10-02 01:36:39,446 WARNING [train.py:1204] (2/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_84-10310-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 01:36:39,974 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_15517_210_6753_1_1530954008_3487336_79-273076-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 01:36:40,018 WARNING [train.py:1204] (2/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_371-18169-0_sp0.9 from training. Duration: 0.9555625 2023-10-02 01:36:40,524 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_68702_210_23689_1_1535799577_3420068_170-92788-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 01:36:40,678 WARNING [train.py:1204] (2/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_78-182912-0 from training. Duration: 0.59 2023-10-02 01:36:40,731 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_3019_210_2028_1_1526044245_3264865_297-12384-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 01:36:40,735 WARNING [train.py:1204] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_147-111137-0_sp1.1 from training. Duration: 0.863625 2023-10-02 01:36:40,740 WARNING [train.py:1204] (2/4) Exclude cut with ID _28323_210_16187_1_1532480395667_4100300_117-8859-0_sp1.1 from training. Duration: 0.97275 2023-10-02 01:36:40,811 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7762_210_1794_1_1528199508_4057408_419-125009-0_sp1.1 from training. Duration: 0.7545625 2023-10-02 01:36:40,916 WARNING [train.py:1204] (2/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_268-87909-0 from training. Duration: 0.99 2023-10-02 01:36:41,001 WARNING [train.py:1204] (2/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_118-311357-0_sp1.1 from training. Duration: 0.92725 2023-10-02 01:36:41,014 WARNING [train.py:1204] (2/4) Exclude cut with ID ID0377W0003-870225 from training. Duration: 0.852 2023-10-02 01:36:41,141 WARNING [train.py:1204] (2/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_56-5838-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 01:36:41,295 WARNING [train.py:1204] (2/4) Exclude cut with ID _14603_210_4220_1_1530615688441_3097319_90-31383-0_sp0.9 from training. Duration: 0.9666875 2023-10-02 01:36:41,393 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_31583_210_3852_1_1532595851_4024413_58-86657-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 01:36:41,429 WARNING [train.py:1204] (2/4) Exclude cut with ID _30304_210_6319_1_1532476827261_7582060_344-113875-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 01:36:41,611 WARNING [train.py:1204] (2/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_278-104676-0_sp1.1 from training. Duration: 0.8909375 2023-10-02 01:36:41,639 WARNING [train.py:1204] (2/4) Exclude cut with ID _55844_210_2346_1_1534471212569_7625707_764-2387-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 01:36:41,753 WARNING [train.py:1204] (2/4) Exclude cut with ID _27430_210_7770_1_1532604689375_1378590_157-14963-0_sp1.1 from training. Duration: 0.963625 2023-10-02 01:36:42,010 WARNING [train.py:1204] (2/4) Exclude cut with ID _7160_210_1876_1_1527940923231_3703799_282-322611-0_sp1.1 from training. Duration: 0.7909375 2023-10-02 01:36:43,434 WARNING [train.py:1204] (2/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_190-138640-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 01:36:43,635 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_33348_210_5632_1_1533086912_3324030_63-270922-0_sp1.1 from training. Duration: 0.97275 2023-10-02 01:36:43,679 WARNING [train.py:1204] (2/4) Exclude cut with ID _54871_210_22356_1_1534665798253_3856359_135-184281-0_sp1.1 from training. Duration: 0.9 2023-10-02 01:36:44,126 WARNING [train.py:1204] (2/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_277-210798-0_sp1.1 from training. Duration: 0.5 2023-10-02 01:36:44,180 WARNING [train.py:1204] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_402-253027-0_sp0.9 from training. Duration: 0.6 2023-10-02 01:36:44,261 WARNING [train.py:1204] (2/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_540-275685-0_sp0.9 from training. Duration: 0.988875 2023-10-02 01:36:44,309 WARNING [train.py:1204] (2/4) Exclude cut with ID _59493_210_17282_1_1535078419077_3225879_118-18960-0_sp1.1 from training. Duration: 0.936375 2023-10-02 01:36:44,385 WARNING [train.py:1204] (2/4) Exclude cut with ID _6194_210_1534_1_1527511826160_3941194_321-148641-0_sp1.1 from training. Duration: 0.5818125 2023-10-02 01:36:44,391 WARNING [train.py:1204] (2/4) Exclude cut with ID _61133_210_9969_1_1535196777227_3696409_333-270325-0_sp1.1 from training. Duration: 0.8454375 2023-10-02 01:36:44,452 WARNING [train.py:1204] (2/4) Exclude cut with ID _49376_210_5986_1_1533985278732_3370740_307-145221-0_sp1.1 from training. Duration: 0.936375 2023-10-02 01:36:44,623 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6846_210_6753_1_1527850767_4295158_2-315073-0_sp0.9 from training. Duration: 0.97775 2023-10-02 01:36:44,823 WARNING [train.py:1204] (2/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_17-25045-0 from training. Duration: 0.59 2023-10-02 01:36:44,842 WARNING [train.py:1204] (2/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_503-333510-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 01:36:44,987 WARNING [train.py:1204] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_238-245655-0_sp1.1 from training. Duration: 0.736375 2023-10-02 01:36:45,013 WARNING [train.py:1204] (2/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_329-82235-0_sp0.9 from training. Duration: 0.911125 2023-10-02 01:36:45,024 WARNING [train.py:1204] (2/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_325-113798-0_sp0.9 from training. Duration: 0.9444375 2023-10-02 01:36:45,096 WARNING [train.py:1204] (2/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_395-37222-0_sp0.9 from training. Duration: 0.8444375 2023-10-02 01:36:45,168 WARNING [train.py:1204] (2/4) Exclude cut with ID _64135_210_2253_1_1535677267876_7608972_498-5039-0_sp1.1 from training. Duration: 0.7090625 2023-10-02 01:36:45,184 WARNING [train.py:1204] (2/4) Exclude cut with ID _14922_210_6228_1_1530707435273_3693310_303-228738-0_sp0.9 from training. Duration: 0.9333125 2023-10-02 01:36:45,331 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_577-220899-0_sp1.1 from training. Duration: 0.92725 2023-10-02 01:36:45,438 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_3475_210_2028_1_1526193578_3622569_377-166133-0_sp1.1 from training. Duration: 0.7818125 2023-10-02 01:36:45,466 WARNING [train.py:1204] (2/4) Exclude cut with ID _49376_210_5986_1_1533985278732_3370740_272-313512-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 01:36:45,765 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7762_210_1794_1_1528199508_4057408_422-227034-0_sp0.9 from training. Duration: 0.811125 2023-10-02 01:36:46,048 WARNING [train.py:1204] (2/4) Exclude cut with ID _18895_210_6228_1_1531357274234_3784930_74-341428-0_sp1.1 from training. Duration: 0.92725 2023-10-02 01:36:46,269 WARNING [train.py:1204] (2/4) Exclude cut with ID _44902_210_16187_1_1533888006062_3792078_149-67212-0_sp1.1 from training. Duration: 0.7909375 2023-10-02 01:36:46,660 WARNING [train.py:1204] (2/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_243-126020-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 01:36:46,785 WARNING [train.py:1204] (2/4) Exclude cut with ID _54218_210_12210_1_1534413646388_4409259_259-6561-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 01:36:47,021 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_41452_210_7291_1_1533437687_3941281_405-11559-0 from training. Duration: 0.91 2023-10-02 01:36:47,740 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_20120_210_6753_1_1531643035_5482859_759-225084-0_sp1.1 from training. Duration: 0.97275 2023-10-02 01:36:47,762 WARNING [train.py:1204] (2/4) Exclude cut with ID _14747_210_8341_1_1530685797463_3654089_147-131229-0_sp0.9 from training. Duration: 0.8444375 2023-10-02 01:36:47,896 WARNING [train.py:1204] (2/4) Exclude cut with ID _14867_210_1968_1_1530684179890_3846969_319-250868-0 from training. Duration: 0.99 2023-10-02 01:36:47,916 WARNING [train.py:1204] (2/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_580-93798-0_sp0.9 from training. Duration: 0.62225 2023-10-02 01:36:48,029 WARNING [train.py:1204] (2/4) Exclude cut with ID _9679_210_7497_1_1529129212906_4363929_512-313889-0_sp0.9 from training. Duration: 0.9 2023-10-02 01:36:48,038 WARNING [train.py:1204] (2/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_18-189198-0 from training. Duration: 0.9 2023-10-02 01:36:48,374 WARNING [train.py:1204] (2/4) Exclude cut with ID _49755_210_18053_1_1534581005846_3218709_137-64179-0_sp1.1 from training. Duration: 0.92725 2023-10-02 01:36:48,428 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_48049_210_5632_1_1533864553_3519937_306-283794-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 01:36:48,666 WARNING [train.py:1204] (2/4) Exclude cut with ID _43552_210_8913_1_1533726031059_3523429_25-120161-0_sp1.1 from training. Duration: 0.8909375 2023-10-02 01:36:48,720 WARNING [train.py:1204] (2/4) Exclude cut with ID _8833_210_5219_1_1528592665210_7553059_319-349181-0 from training. Duration: 0.99 2023-10-02 01:36:48,819 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_296-332325-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 01:36:48,925 WARNING [train.py:1204] (2/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_436-186311-0_sp1.1 from training. Duration: 0.5909375 2023-10-02 01:36:48,943 WARNING [train.py:1204] (2/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_348-127789-0 from training. Duration: 0.85 2023-10-02 01:36:48,948 WARNING [train.py:1204] (2/4) Exclude cut with ID _3992_210_1747_1_1526644816031_3731060_443-195183-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 01:36:49,297 WARNING [train.py:1204] (2/4) Exclude cut with ID _3465_210_3525_1_1526175667060_3574049_308-231735-0_sp1.1 from training. Duration: 0.663625 2023-10-02 01:36:49,636 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6786_210_1388_1_1527839699_4194495_378-46070-0_sp1.1 from training. Duration: 0.6545625 2023-10-02 01:36:49,679 WARNING [train.py:1204] (2/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_63-101996-0 from training. Duration: 0.88 2023-10-02 01:36:49,945 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_2541_210_1794_1_1525590269_3084765_156-173713-0 from training. Duration: 0.81 2023-10-02 01:36:49,995 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_217-239796-0_sp1.1 from training. Duration: 0.963625 2023-10-02 01:36:50,150 WARNING [train.py:1204] (2/4) Exclude cut with ID _21878_210_3525_1_1531738772002_7264330_530-329400-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 01:36:50,311 WARNING [train.py:1204] (2/4) Exclude cut with ID _21783_210_11357_1_1531742458730_3762679_150-62920-0_sp1.1 from training. Duration: 0.963625 2023-10-02 01:36:50,312 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_5522_210_3273_1_1527249606_3909096_366-172508-0 from training. Duration: 0.66 2023-10-02 01:36:50,314 WARNING [train.py:1204] (2/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_303-340446-0_sp1.1 from training. Duration: 0.8181875 2023-10-02 01:36:50,463 WARNING [train.py:1204] (2/4) Exclude cut with ID _8699_210_2069_1_1528630346831_3960409_160-24408-0_sp1.1 from training. Duration: 0.82725 2023-10-02 01:36:50,644 WARNING [train.py:1204] (2/4) Exclude cut with ID _41314_210_12651_1_1533459476543_3766460_299-187801-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 01:36:50,686 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_34886_210_10633_1_1533203882_1755661_92-345288-0_sp1.1 from training. Duration: 0.97275 2023-10-02 01:36:51,066 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_5438_210_1794_1_1527336687_2600009_104-288274-0_sp1.1 from training. Duration: 0.52725 2023-10-02 01:36:51,070 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_154-316450-0 from training. Duration: 0.76 2023-10-02 01:36:51,143 WARNING [train.py:1204] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_23-189234-0_sp0.9 from training. Duration: 0.92225 2023-10-02 01:36:51,362 WARNING [train.py:1204] (2/4) Exclude cut with ID _4779_210_2084_1_1527318022944_3914893_346-168820-0_sp1.1 from training. Duration: 0.963625 2023-10-02 01:36:52,068 WARNING [train.py:1204] (2/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_5-55893-0 from training. Duration: 0.55 2023-10-02 01:36:52,181 WARNING [train.py:1204] (2/4) Exclude cut with ID _20455_210_5219_1_1531565996775_8098100_180-24517-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 01:36:52,547 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_3133_210_2087_1_1526106446_2324690_243-143119-0_sp0.9 from training. Duration: 0.8555625 2023-10-02 01:36:52,738 WARNING [train.py:1204] (2/4) Exclude cut with ID _65585_210_12210_1_1535450451584_3764098_293-212045-0_sp1.1 from training. Duration: 0.92725 2023-10-02 01:36:52,742 WARNING [train.py:1204] (2/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_577-332600-0 from training. Duration: 0.57 2023-10-02 01:36:53,063 WARNING [train.py:1204] (2/4) Exclude cut with ID _30039_210_13270_1_1532484612900_2725150_104-289874-0_sp1.1 from training. Duration: 0.963625 2023-10-02 01:36:53,071 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_3172_210_2087_1_1526292929_3987865_197-97762-0_sp1.1 from training. Duration: 0.6818125 2023-10-02 01:36:53,290 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_256-150908-0_sp1.1 from training. Duration: 0.963625 2023-10-02 01:36:53,393 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_296-287678-0_sp1.1 from training. Duration: 0.863625 2023-10-02 01:36:53,441 WARNING [train.py:1204] (2/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_443-30034-0 from training. Duration: 0.64 2023-10-02 01:36:53,472 WARNING [train.py:1204] (2/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_494-199538-0_sp0.9 from training. Duration: 0.8333125 2023-10-02 01:36:53,522 WARNING [train.py:1204] (2/4) Exclude cut with ID _19708_210_9206_1_1532136650733_4238364_61-162325-0_sp1.1 from training. Duration: 0.97275 2023-10-02 01:36:53,633 WARNING [train.py:1204] (2/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_475-127455-0 from training. Duration: 0.7 2023-10-02 01:36:53,764 WARNING [train.py:1204] (2/4) Exclude cut with ID _48677_210_5663_1_1533902405919_3892989_188-11581-0_sp1.1 from training. Duration: 0.963625 2023-10-02 01:36:53,785 WARNING [train.py:1204] (2/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_136-8381-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 01:36:53,925 WARNING [train.py:1204] (2/4) Exclude cut with ID _55328_210_27767_1_1534580973260_4088170_270-229555-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 01:36:53,953 WARNING [train.py:1204] (2/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_372-266951-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 01:36:54,029 WARNING [train.py:1204] (2/4) Exclude cut with ID _47790_210_16187_1_1533862814066_4207519_14-242149-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 01:36:54,405 WARNING [train.py:1204] (2/4) Exclude cut with ID _7877_210_6014_1_1528286358099_3750136_159-76512-0_sp1.1 from training. Duration: 0.936375 2023-10-02 01:36:54,407 WARNING [train.py:1204] (2/4) Exclude cut with ID _8263_210_1664_1_1528438953494_294100_40-204731-0_sp1.1 from training. Duration: 0.4545625 2023-10-02 01:36:54,622 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8278_210_2550_1_1528637099_4220413_694-238413-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 01:36:55,081 WARNING [train.py:1204] (2/4) Exclude cut with ID _47278_210_12041_1_1533902418349_3576110_421-172710-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 01:36:55,311 WARNING [train.py:1204] (2/4) Exclude cut with ID _14970_210_15709_1_1530773543493_1715730_215-282680-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 01:36:55,349 WARNING [train.py:1204] (2/4) Exclude cut with ID _64639_210_5816_1_1535367619437_3528000_306-149449-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 01:36:55,577 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_28-291982-0 from training. Duration: 0.97 2023-10-02 01:36:55,675 WARNING [train.py:1204] (2/4) Exclude cut with ID _55134_210_21982_1_1534590028377_3882170_412-15870-0_sp1.1 from training. Duration: 0.936375 2023-10-02 01:36:55,700 WARNING [train.py:1204] (2/4) Exclude cut with ID _13684_210_7497_1_1530619354863_3598359_312-235606-0 from training. Duration: 0.6 2023-10-02 01:36:55,883 WARNING [train.py:1204] (2/4) Exclude cut with ID _37112_210_2575_1_1532997007673_7268610_347-238993-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 01:36:55,945 WARNING [train.py:1204] (2/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_121-284152-0 from training. Duration: 0.59 2023-10-02 01:36:56,625 WARNING [train.py:1204] (2/4) Exclude cut with ID _2941_210_2577_1_1526199673150_2489759_228-2961-0_sp1.1 from training. Duration: 0.97275 2023-10-02 01:36:56,850 WARNING [train.py:1204] (2/4) Exclude cut with ID _28323_210_16187_1_1532480395667_4100300_531-24157-0 from training. Duration: 0.98 2023-10-02 01:36:57,011 WARNING [train.py:1204] (2/4) Exclude cut with ID _32704_210_8954_1_1533017488840_6747370_266-329755-0_sp0.9 from training. Duration: 0.988875 2023-10-02 01:36:57,152 WARNING [train.py:1204] (2/4) Exclude cut with ID _6653_210_2629_1_1527919172441_6732679_1038-146421-0_sp1.1 from training. Duration: 0.97275 2023-10-02 01:36:57,216 WARNING [train.py:1204] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_391-22148-0_sp0.9 from training. Duration: 0.8555625 2023-10-02 01:36:57,318 WARNING [train.py:1204] (2/4) Exclude cut with ID _64521_210_14699_1_1535243427259_3815328_543-341117-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 01:36:57,335 WARNING [train.py:1204] (2/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_52-168712-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 01:36:57,367 WARNING [train.py:1204] (2/4) Exclude cut with ID _33920_210_4220_1_1533257851500_3084399_198-334063-0_sp1.1 from training. Duration: 0.8545625 2023-10-02 01:36:57,391 WARNING [train.py:1204] (2/4) Exclude cut with ID _50093_210_4220_1_1534417141207_3432299_69-111987-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 01:36:57,438 WARNING [train.py:1204] (2/4) Exclude cut with ID _14821_210_8750_1_1530680503991_6829359_189-297451-0_sp0.9 from training. Duration: 0.611125 2023-10-02 01:36:57,553 WARNING [train.py:1204] (2/4) Exclude cut with ID _8030_210_7497_1_1528444886781_3531089_262-86750-0_sp0.9 from training. Duration: 0.9 2023-10-02 01:36:57,558 WARNING [train.py:1204] (2/4) Exclude cut with ID _24612_210_8913_1_1531998054193_3806359_294-191196-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 01:36:58,121 WARNING [train.py:1204] (2/4) Exclude cut with ID _16909_210_5724_1_1531292079912_4118919_707-342896-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 01:36:58,141 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_9460_210_4211_1_1528954587_10049667_572-37035-0_sp0.9 from training. Duration: 0.888875 2023-10-02 01:36:58,206 WARNING [train.py:1204] (2/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_762-109886-0_sp1.1 from training. Duration: 0.82725 2023-10-02 01:36:58,243 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_14953_210_2151_1_1530701710_5616165_519-326393-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 01:36:58,668 WARNING [train.py:1204] (2/4) Exclude cut with ID _25914_210_6400_1_1532141317058_644740_52-96808-0 from training. Duration: 0.91 2023-10-02 01:36:58,671 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_68095_210_20185_1_1535718226_8888855_1079-94525-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 01:36:59,064 WARNING [train.py:1204] (2/4) Exclude cut with ID _14860_210_4915_1_1530702020340_7343240_746-3647-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 01:36:59,064 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_70379_210_7925_1_1535941455_557455_49-101720-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 01:36:59,083 WARNING [train.py:1204] (2/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_8-307291-0_sp1.1 from training. Duration: 0.936375 2023-10-02 01:36:59,200 WARNING [train.py:1204] (2/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_117-290916-0 from training. Duration: 0.51 2023-10-02 01:36:59,361 WARNING [train.py:1204] (2/4) Exclude cut with ID _22020_210_2461_1_1531724164513_6326900_944-339840-0_sp1.1 from training. Duration: 0.963625 2023-10-02 01:36:59,426 WARNING [train.py:1204] (2/4) Exclude cut with ID IC0515W0043-254911 from training. Duration: 0.976 2023-10-02 01:36:59,524 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_3910_210_1794_1_1526742306_334530_28-238909-0 from training. Duration: 0.81 2023-10-02 01:36:59,544 WARNING [train.py:1204] (2/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_269-267275-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 01:36:59,799 WARNING [train.py:1204] (2/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_72-251255-0_sp1.1 from training. Duration: 0.8545625 2023-10-02 01:36:59,810 WARNING [train.py:1204] (2/4) Exclude cut with ID _14886_210_4220_1_1530702020355_3516190_175-229073-0 from training. Duration: 0.45 2023-10-02 01:36:59,983 WARNING [train.py:1204] (2/4) Exclude cut with ID _40519_210_13794_1_1533715228655_7337390_921-209452-0_sp1.1 from training. Duration: 0.963625 2023-10-02 01:37:00,162 WARNING [train.py:1204] (2/4) Exclude cut with ID _29857_210_20034_1_1532395886753_3798160_335-163562-0_sp0.9 from training. Duration: 0.9444375 2023-10-02 01:37:00,276 WARNING [train.py:1204] (2/4) Exclude cut with ID _58583_210_5236_1_1534726835266_3574249_384-58445-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 01:37:00,352 WARNING [train.py:1204] (2/4) Exclude cut with ID _44011_210_2046_1_1533722401691_3631310_96-315656-0_sp1.1 from training. Duration: 0.963625 2023-10-02 01:37:00,364 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_68484_210_1794_1_1535849804_7678097_549-170613-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 01:37:01,154 WARNING [train.py:1204] (2/4) Exclude cut with ID _37892_210_19596_1_1533198677897_3964058_214-148058-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 01:37:01,401 WARNING [train.py:1204] (2/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_415-78792-0_sp0.9 from training. Duration: 0.9 2023-10-02 01:37:01,566 WARNING [train.py:1204] (2/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_383-205833-0_sp1.1 from training. Duration: 0.863625 2023-10-02 01:37:01,660 WARNING [train.py:1204] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_167-137255-0 from training. Duration: 0.64 2023-10-02 01:37:01,668 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8275_210_4198_1_1528455320_3892571_484-154786-0_sp1.1 from training. Duration: 0.963625 2023-10-02 01:37:01,707 WARNING [train.py:1204] (2/4) Exclude cut with ID _40284_210_2084_1_1533513679814_3704279_396-319412-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 01:37:01,845 WARNING [train.py:1204] (2/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_280-120942-0_sp0.9 from training. Duration: 0.8555625 2023-10-02 01:37:01,908 WARNING [train.py:1204] (2/4) Exclude cut with ID _45630_210_10556_1_1533949174498_3902049_558-240889-0_sp1.1 from training. Duration: 0.92725 2023-10-02 01:37:02,115 WARNING [train.py:1204] (2/4) Exclude cut with ID _56235_210_10640_1_1534557335235_4161099_197-307458-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 01:37:02,213 WARNING [train.py:1204] (2/4) Exclude cut with ID _16909_210_5724_1_1531292079912_4118919_163-254843-0_sp1.1 from training. Duration: 0.92725 2023-10-02 01:37:02,342 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_103-84010-0 from training. Duration: 0.69 2023-10-02 01:37:02,468 WARNING [train.py:1204] (2/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_401-158239-0 from training. Duration: 0.71 2023-10-02 01:37:02,469 WARNING [train.py:1204] (2/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_899-233658-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 01:37:02,575 WARNING [train.py:1204] (2/4) Exclude cut with ID _23825_210_13449_1_1531915313526_3497150_240-271367-0 from training. Duration: 0.97 2023-10-02 01:37:02,621 WARNING [train.py:1204] (2/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_424-134619-0_sp1.1 from training. Duration: 0.92725 2023-10-02 01:37:02,918 WARNING [train.py:1204] (2/4) Exclude cut with ID _44831_210_13427_1_1533816011549_3689035_32-188147-0_sp1.1 from training. Duration: 0.87275 2023-10-02 01:37:02,921 WARNING [train.py:1204] (2/4) Exclude cut with ID _59170_210_6721_1_1534983633960_4203335_314-63879-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 01:37:02,943 WARNING [train.py:1204] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_602-275260-0 from training. Duration: 0.82 2023-10-02 01:37:02,981 WARNING [train.py:1204] (2/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_511-306767-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 01:37:03,012 WARNING [train.py:1204] (2/4) Exclude cut with ID _41817_210_18835_1_1533434477160_7423049_565-201775-0_sp1.1 from training. Duration: 0.92725 2023-10-02 01:37:03,021 WARNING [train.py:1204] (2/4) Exclude cut with ID _28317_210_12742_1_1532480302381_8510325_778-173151-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 01:37:03,067 WARNING [train.py:1204] (2/4) Exclude cut with ID _14998_210_5219_1_1530705582080_7474970_680-51001-0_sp1.1 from training. Duration: 0.963625 2023-10-02 01:37:03,234 WARNING [train.py:1204] (2/4) Exclude cut with ID IC0677W0059-334891 from training. Duration: 0.896 2023-10-02 01:37:03,300 WARNING [train.py:1204] (2/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_370-331600-0 from training. Duration: 0.94 2023-10-02 01:37:03,664 WARNING [train.py:1204] (2/4) Exclude cut with ID _66840_210_5219_1_1535452168426_4196349_446-240192-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 01:37:03,767 WARNING [train.py:1204] (2/4) Exclude cut with ID _65242_210_18628_1_1535369393717_3823149_38-6732-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 01:37:03,841 WARNING [train.py:1204] (2/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_302-2550-0 from training. Duration: 0.81 2023-10-02 01:37:03,928 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_43802_210_2290_1_1533516203_8829504_100-348441-0 from training. Duration: 0.91 2023-10-02 01:37:04,157 WARNING [train.py:1204] (2/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_329-285801-0_sp1.1 from training. Duration: 0.82725 2023-10-02 01:37:04,317 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_30167_210_3528_1_1532428012_4772375_343-334712-0_sp1.1 from training. Duration: 0.936375 2023-10-02 01:37:05,465 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7543_210_6753_1_1528543318_4552_1-85072-0 from training. Duration: 0.83 2023-10-02 01:37:05,725 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7002_210_1794_1_1527935634_4034444_487-236501-0_sp1.1 from training. Duration: 0.6 2023-10-02 01:37:05,829 WARNING [train.py:1204] (2/4) Exclude cut with ID _7160_210_1876_1_1527940923231_3703799_282-322611-0 from training. Duration: 0.87 2023-10-02 01:37:06,021 WARNING [train.py:1204] (2/4) Exclude cut with ID _67335_210_5219_1_1535525975652_4275229_570-262983-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 01:37:06,149 WARNING [train.py:1204] (2/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_625-338546-0_sp1.1 from training. Duration: 0.8 2023-10-02 01:37:06,176 WARNING [train.py:1204] (2/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_176-56120-0 from training. Duration: 0.83 2023-10-02 01:37:06,424 WARNING [train.py:1204] (2/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_963-187109-0_sp1.1 from training. Duration: 0.9 2023-10-02 01:37:06,595 WARNING [train.py:1204] (2/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_121-284152-0_sp0.9 from training. Duration: 0.6555625 2023-10-02 01:37:06,714 WARNING [train.py:1204] (2/4) Exclude cut with ID _45021_210_9994_1_1533619751094_3330288_104-81711-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 01:37:06,931 WARNING [train.py:1204] (2/4) Exclude cut with ID _51382_210_23862_1_1534492801838_3650104_129-343914-0_sp1.1 from training. Duration: 0.936375 2023-10-02 01:37:06,955 WARNING [train.py:1204] (2/4) Exclude cut with ID _14970_210_15709_1_1530775547361_1966965_8-169924-0 from training. Duration: 0.92 2023-10-02 01:37:07,114 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_2295_210_2087_1_1525500993_2407863_234-83104-0_sp1.1 from training. Duration: 0.563625 2023-10-02 01:37:07,210 WARNING [train.py:1204] (2/4) Exclude cut with ID _14687_210_5854_1_1530703838259_3664889_115-239046-0 from training. Duration: 0.67 2023-10-02 01:37:07,367 WARNING [train.py:1204] (2/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_690-126306-0_sp1.1 from training. Duration: 0.7090625 2023-10-02 01:37:07,374 WARNING [train.py:1204] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_625-102955-0_sp0.9 from training. Duration: 0.8555625 2023-10-02 01:37:07,575 WARNING [train.py:1204] (2/4) Exclude cut with ID _9389_210_1664_1_1529217337301_5289269_289-213417-0 from training. Duration: 0.89 2023-10-02 01:37:07,732 WARNING [train.py:1204] (2/4) Exclude cut with ID _13253_210_1925_1_1530757775819_3577873_191-144977-0_sp0.9 from training. Duration: 0.8666875 2023-10-02 01:37:07,779 WARNING [train.py:1204] (2/4) Exclude cut with ID _21783_210_11357_1_1531742458730_3762679_325-155703-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 01:37:07,821 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_2295_210_2087_1_1525500993_2407863_116-238894-0_sp0.9 from training. Duration: 0.77775 2023-10-02 01:37:07,951 WARNING [train.py:1204] (2/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_94-126128-0 from training. Duration: 0.86 2023-10-02 01:37:07,965 WARNING [train.py:1204] (2/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_395-310220-0_sp0.9 from training. Duration: 0.988875 2023-10-02 01:37:08,044 WARNING [train.py:1204] (2/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_821-110852-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 01:37:08,103 WARNING [train.py:1204] (2/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_64-248359-0_sp1.1 from training. Duration: 0.62725 2023-10-02 01:37:08,119 WARNING [train.py:1204] (2/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_138-187031-0 from training. Duration: 0.99 2023-10-02 01:37:08,152 WARNING [train.py:1204] (2/4) Exclude cut with ID _4122_210_2084_1_1526713164417_4353280_528-206771-0_sp0.9 from training. Duration: 0.97775 2023-10-02 01:37:08,248 WARNING [train.py:1204] (2/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_844-96989-0_sp1.1 from training. Duration: 0.6454375 2023-10-02 01:37:08,419 WARNING [train.py:1204] (2/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_106-30782-0_sp1.1 from training. Duration: 0.72725 2023-10-02 01:37:09,132 WARNING [train.py:1204] (2/4) Exclude cut with ID _14683_210_5219_1_1530698352608_3974859_281-250040-0_sp1.1 from training. Duration: 0.936375 2023-10-02 01:37:09,174 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_46013_210_7234_1_1533776362_3744032_463-52378-0_sp0.9 from training. Duration: 0.9555625 2023-10-02 01:37:09,974 WARNING [train.py:1204] (2/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_299-34413-0 from training. Duration: 0.65 2023-10-02 01:37:10,042 WARNING [train.py:1204] (2/4) Exclude cut with ID _35799_210_5854_1_1533263487887_3529071_490-326847-0 from training. Duration: 0.99 2023-10-02 01:37:10,346 WARNING [train.py:1204] (2/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_589-9070-0_sp0.9 from training. Duration: 0.788875 2023-10-02 01:37:10,527 WARNING [train.py:1204] (2/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_884-29515-0_sp1.1 from training. Duration: 0.936375 2023-10-02 01:37:10,664 WARNING [train.py:1204] (2/4) Exclude cut with ID _15012_210_1968_1_1530856820429_5095350_474-119995-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 01:37:10,969 WARNING [train.py:1204] (2/4) Exclude cut with ID _65585_210_12210_1_1535450451584_3764098_211-97124-0_sp1.1 from training. Duration: 0.963625 2023-10-02 01:37:11,021 WARNING [train.py:1204] (2/4) Exclude cut with ID _33640_210_2655_1_1533117605479_4855469_666-224463-0_sp1.1 from training. Duration: 0.963625 2023-10-02 01:37:11,194 WARNING [train.py:1204] (2/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_35-153886-0 from training. Duration: 0.84 2023-10-02 01:37:11,256 WARNING [train.py:1204] (2/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_210-106047-0 from training. Duration: 0.97 2023-10-02 01:37:11,405 WARNING [train.py:1204] (2/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_479-125611-0 from training. Duration: 0.96 2023-10-02 01:37:11,434 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_9417_210_1794_1_1529235261_4614131_427-153963-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 01:37:11,542 WARNING [train.py:1204] (2/4) Exclude cut with ID _23818_210_6228_1_1531917063884_3676920_133-170571-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 01:37:11,601 WARNING [train.py:1204] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_602-275260-0_sp1.1 from training. Duration: 0.7454375 2023-10-02 01:37:11,743 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9029W0121-509521 from training. Duration: 0.896 2023-10-02 01:37:11,763 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9029W0434-509821 from training. Duration: 0.64 2023-10-02 01:37:11,779 WARNING [train.py:1204] (2/4) Exclude cut with ID _23038_210_9570_1_1532001584158_3652419_289-307034-0_sp1.1 from training. Duration: 0.936375 2023-10-02 01:37:11,909 WARNING [train.py:1204] (2/4) Exclude cut with ID _29345_210_4381_1_1532689168263_3860169_149-257268-0_sp0.9 from training. Duration: 0.9 2023-10-02 01:37:11,999 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_405-339986-0_sp1.1 from training. Duration: 0.463625 2023-10-02 01:37:12,250 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_2655_210_2087_1_1525688959_3897400_437-337690-0_sp1.1 from training. Duration: 0.57275 2023-10-02 01:37:12,281 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7543_210_6753_1_1528541907_1399322_165-323619-0_sp1.1 from training. Duration: 0.62725 2023-10-02 01:37:12,340 WARNING [train.py:1204] (2/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_508-335369-0_sp1.1 from training. Duration: 0.436375 2023-10-02 01:37:12,401 WARNING [train.py:1204] (2/4) Exclude cut with ID _7852_210_5219_1_1528286471316_8139400_358-287492-0 from training. Duration: 0.96 2023-10-02 01:37:12,578 WARNING [train.py:1204] (2/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_322-217220-0_sp0.9 from training. Duration: 0.9555625 2023-10-02 01:37:12,717 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_3171_210_2087_1_1526109084_2330007_66-260583-0_sp0.9 from training. Duration: 0.8 2023-10-02 01:37:12,762 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8558_210_1410_1_1528505225_7613406_131-329524-0_sp1.1 from training. Duration: 0.7181875 2023-10-02 01:37:12,902 WARNING [train.py:1204] (2/4) Exclude cut with ID _9235_210_5219_1_1528891154253_8046120_30-326681-0 from training. Duration: 0.83 2023-10-02 01:37:13,172 WARNING [train.py:1204] (2/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_58-68956-0_sp0.9 from training. Duration: 0.92225 2023-10-02 01:37:13,243 WARNING [train.py:1204] (2/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_255-312654-0 from training. Duration: 0.91 2023-10-02 01:37:13,326 WARNING [train.py:1204] (2/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_99-167392-0 from training. Duration: 0.61 2023-10-02 01:37:13,441 WARNING [train.py:1204] (2/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_94-126128-0_sp0.9 from training. Duration: 0.9555625 2023-10-02 01:37:14,412 WARNING [train.py:1204] (2/4) Exclude cut with ID _68635_210_9317_1_1535770813410_4315870_187-192817-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 01:37:14,511 WARNING [train.py:1204] (2/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_277-204970-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 01:37:14,656 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6163_210_6400_1_1527506746_4263068_283-137811-0_sp0.9 from training. Duration: 0.8555625 2023-10-02 01:37:14,685 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9144W0121-566446 from training. Duration: 0.896 2023-10-02 01:37:14,967 WARNING [train.py:1204] (2/4) Exclude cut with ID _49371_210_12071_1_1533952761021_3812190_351-315519-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 01:37:15,022 WARNING [train.py:1204] (2/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_115-326778-0_sp1.1 from training. Duration: 0.8454375 2023-10-02 01:37:15,092 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_17292_210_6753_1_1531125388_2943734_436-198965-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 01:37:15,111 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9165W0117-576751 from training. Duration: 0.896 2023-10-02 01:37:15,218 WARNING [train.py:1204] (2/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_212-149947-0 from training. Duration: 0.73 2023-10-02 01:37:15,236 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_13757_210_3528_1_1530440682_4107239_65-167544-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 01:37:15,479 WARNING [train.py:1204] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_700-183549-0_sp0.9 from training. Duration: 0.7555625 2023-10-02 01:37:15,804 WARNING [train.py:1204] (2/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_216-343007-0_sp0.9 from training. Duration: 0.7333125 2023-10-02 01:37:15,916 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7544_210_6753_1_1528587722_3576686_295-258479-0_sp1.1 from training. Duration: 0.763625 2023-10-02 01:37:16,102 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_365-287106-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 01:37:16,114 WARNING [train.py:1204] (2/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_300-3747-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 01:37:16,175 WARNING [train.py:1204] (2/4) Exclude cut with ID _14069_210_5632_1_1530512692033_4803390_487-303201-0_sp1.1 from training. Duration: 0.92725 2023-10-02 01:37:16,978 WARNING [train.py:1204] (2/4) Exclude cut with ID _14459_210_2228_1_1531443641946_7516700_668-123026-0_sp1.1 from training. Duration: 0.936375 2023-10-02 01:37:17,002 WARNING [train.py:1204] (2/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_108-53929-0_sp1.1 from training. Duration: 0.536375 2023-10-02 01:37:17,037 WARNING [train.py:1204] (2/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_266-137104-0_sp0.9 from training. Duration: 0.72225 2023-10-02 01:37:17,070 WARNING [train.py:1204] (2/4) Exclude cut with ID _12302_210_5219_1_1529924446466_8107280_902-168619-0_sp1.1 from training. Duration: 0.936375 2023-10-02 01:37:17,088 WARNING [train.py:1204] (2/4) Exclude cut with ID _59502_210_12210_1_1535107024698_1835139_146-162495-0 from training. Duration: 0.92 2023-10-02 01:37:17,111 WARNING [train.py:1204] (2/4) Exclude cut with ID _3465_210_3525_1_1526175667060_3574049_412-287526-0_sp1.1 from training. Duration: 0.6545625 2023-10-02 01:37:17,347 WARNING [train.py:1204] (2/4) Exclude cut with ID _7160_210_1876_1_1527940923231_3703799_120-150943-0_sp0.9 from training. Duration: 0.9 2023-10-02 01:37:17,404 WARNING [train.py:1204] (2/4) Exclude cut with ID _15037_210_2253_1_1530752294104_3267650_296-192597-0_sp1.1 from training. Duration: 0.736375 2023-10-02 01:37:17,421 WARNING [train.py:1204] (2/4) Exclude cut with ID _13252_210_1925_1_1530671372047_3336909_397-216497-0_sp1.1 from training. Duration: 0.87275 2023-10-02 01:37:17,434 WARNING [train.py:1204] (2/4) Exclude cut with ID _40094_210_16380_1_1533294005773_3641159_76-269320-0_sp1.1 from training. Duration: 0.936375 2023-10-02 01:37:17,492 WARNING [train.py:1204] (2/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_449-15508-0 from training. Duration: 0.82 2023-10-02 01:37:17,715 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9309W0194-640321 from training. Duration: 0.64 2023-10-02 01:37:17,904 WARNING [train.py:1204] (2/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_284-312178-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 01:37:18,018 WARNING [train.py:1204] (2/4) Exclude cut with ID _29853_210_7497_1_1532505600851_6642110_401-55937-0_sp1.1 from training. Duration: 0.92725 2023-10-02 01:37:18,113 WARNING [train.py:1204] (2/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_716-24918-0_sp1.1 from training. Duration: 0.92725 2023-10-02 01:37:18,143 WARNING [train.py:1204] (2/4) Exclude cut with ID _19637_210_4220_1_1531537167676_3205060_314-305638-0_sp1.1 from training. Duration: 0.92725 2023-10-02 01:37:18,823 WARNING [train.py:1204] (2/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_475-127455-0_sp1.1 from training. Duration: 0.636375 2023-10-02 01:37:18,978 WARNING [train.py:1204] (2/4) Exclude cut with ID _5444_210_2151_1_1527418808464_5312210_581-315411-0 from training. Duration: 0.62 2023-10-02 01:37:19,191 WARNING [train.py:1204] (2/4) Exclude cut with ID _44902_210_16187_1_1533888006062_3792078_149-67212-0_sp0.9 from training. Duration: 0.9666875 2023-10-02 01:37:19,287 WARNING [train.py:1204] (2/4) Exclude cut with ID _31602_210_5854_1_1532677021456_4458250_63-63253-0_sp1.1 from training. Duration: 0.963625 2023-10-02 01:37:19,562 WARNING [train.py:1204] (2/4) Exclude cut with ID _55209_210_1531_1_1534675828996_4597160_660-203992-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 01:37:19,812 WARNING [train.py:1204] (2/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_83-190624-0_sp1.1 from training. Duration: 0.92725 2023-10-02 01:37:20,236 WARNING [train.py:1204] (2/4) Exclude cut with ID _58605_210_12210_1_1534723195939_3904259_154-139635-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 01:37:20,461 WARNING [train.py:1204] (2/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_164-224052-0 from training. Duration: 0.7 2023-10-02 01:37:20,476 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_1126-189071-0_sp1.1 from training. Duration: 0.963625 2023-10-02 01:37:20,495 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_15396_210_6890_1_1531308473_1938936_310-214789-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 01:37:20,722 WARNING [train.py:1204] (2/4) Exclude cut with ID _8395_210_1531_1_1528597871523_4411579_431-132470-0 from training. Duration: 0.74 2023-10-02 01:37:20,784 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_3780_210_2087_1_1526467760_2130483_170-259044-0_sp0.9 from training. Duration: 0.77775 2023-10-02 01:37:20,895 WARNING [train.py:1204] (2/4) Exclude cut with ID _12733_210_8341_1_1530167424556_3646380_537-230640-0_sp1.1 from training. Duration: 0.963625 2023-10-02 01:37:21,245 WARNING [train.py:1204] (2/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_159-281310-0 from training. Duration: 0.95 2023-10-02 01:37:21,403 WARNING [train.py:1204] (2/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_434-83000-0 from training. Duration: 0.98 2023-10-02 01:37:21,421 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_4405_210_1849_1_1526732205_3593939_493-32464-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 01:37:21,430 WARNING [train.py:1204] (2/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_342-100157-0 from training. Duration: 0.54 2023-10-02 01:37:21,501 WARNING [train.py:1204] (2/4) Exclude cut with ID _44778_210_2054_1_1533798127203_7443090_765-250133-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 01:37:21,505 WARNING [train.py:1204] (2/4) Exclude cut with ID _38670_210_15379_1_1533448810659_4391059_81-312245-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 01:37:21,607 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_50050_210_16223_1_1535435796_7290138_1273-247611-0_sp1.1 from training. Duration: 0.936375 2023-10-02 01:37:21,661 WARNING [train.py:1204] (2/4) Exclude cut with ID _44831_210_13427_1_1533816011549_3689035_399-18591-0_sp1.1 from training. Duration: 0.936375 2023-10-02 01:37:21,676 WARNING [train.py:1204] (2/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_557-324258-0_sp0.9 from training. Duration: 0.9 2023-10-02 01:37:21,759 WARNING [train.py:1204] (2/4) Exclude cut with ID _3465_210_3525_1_1526175667060_3574049_62-335552-0_sp1.1 from training. Duration: 0.7909375 2023-10-02 01:37:21,947 WARNING [train.py:1204] (2/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_673-264125-0_sp1.1 from training. Duration: 0.963625 2023-10-02 01:37:22,071 WARNING [train.py:1204] (2/4) Exclude cut with ID _8030_210_7497_1_1528444886781_3531089_262-86750-0 from training. Duration: 0.81 2023-10-02 01:37:22,087 WARNING [train.py:1204] (2/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_444-28041-0_sp1.1 from training. Duration: 0.77275 2023-10-02 01:37:22,427 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_34008_210_10431_1_1533345424_8183247_516-14858-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 01:37:22,441 WARNING [train.py:1204] (2/4) Exclude cut with ID _32704_210_8954_1_1533017488840_6747370_54-248218-0_sp1.1 from training. Duration: 0.936375 2023-10-02 01:37:23,292 WARNING [train.py:1204] (2/4) Exclude cut with ID _13253_210_1925_1_1530757775819_3577873_300-119782-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 01:37:23,378 WARNING [train.py:1204] (2/4) Exclude cut with ID _39290_210_4220_1_1533862778760_3075118_21-240703-0_sp1.1 from training. Duration: 0.936375 2023-10-02 01:37:23,720 WARNING [train.py:1204] (2/4) Exclude cut with ID 774-127930-0014-10412-0_sp1.1 from training. Duration: 0.95 2023-10-02 01:37:23,785 WARNING [train.py:1204] (2/4) Exclude cut with ID _67499_210_17282_1_1535509805220_3692430_20-179346-0 from training. Duration: 0.97 2023-10-02 01:37:23,904 WARNING [train.py:1204] (2/4) Exclude cut with ID _8365_210_2064_1_1528592311070_4677690_431-294102-0_sp0.9 from training. Duration: 0.988875 2023-10-02 01:37:24,252 WARNING [train.py:1204] (2/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_365-1492-0_sp1.1 from training. Duration: 0.82725 2023-10-02 01:37:24,291 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_105-114797-0_sp0.9 from training. Duration: 0.9333125 2023-10-02 01:37:24,403 WARNING [train.py:1204] (2/4) Exclude cut with ID _6274_210_2084_1_1527937583837_7494020_769-168302-0_sp0.9 from training. Duration: 0.788875 2023-10-02 01:37:24,621 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_40340_210_9024_1_1533433916_4132944_320-270277-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 01:37:24,626 WARNING [train.py:1204] (2/4) Exclude cut with ID ID0203W0003-781711 from training. Duration: 0.958 2023-10-02 01:37:24,701 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_30434_210_13049_1_1532519384_981746_52-12932-0_sp1.1 from training. Duration: 0.936375 2023-10-02 01:37:24,795 WARNING [train.py:1204] (2/4) Exclude cut with ID _8263_210_1664_1_1528438953494_294100_40-204731-0_sp0.9 from training. Duration: 0.5555625 2023-10-02 01:37:24,999 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1032-236863-0_sp0.9 from training. Duration: 0.9333125 2023-10-02 01:37:25,065 WARNING [train.py:1204] (2/4) Exclude cut with ID _8394_210_6702_1_1528590504316_3540189_335-274780-0_sp0.9 from training. Duration: 0.8666875 2023-10-02 01:37:25,072 WARNING [train.py:1204] (2/4) Exclude cut with ID _66873_210_5517_1_1535454136506_4293370_654-52713-0_sp1.1 from training. Duration: 0.7 2023-10-02 01:37:25,188 WARNING [train.py:1204] (2/4) Exclude cut with ID _5250_210_5219_1_1527249561806_8692110_742-267856-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 01:37:25,290 WARNING [train.py:1204] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_74-48717-0 from training. Duration: 0.58 2023-10-02 01:37:25,402 WARNING [train.py:1204] (2/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_18-46756-0 from training. Duration: 0.79 2023-10-02 01:37:25,466 WARNING [train.py:1204] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_612-56061-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 01:37:25,474 WARNING [train.py:1204] (2/4) Exclude cut with ID _56423_210_5854_1_1534896061049_3165969_412-2403-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 01:37:25,544 WARNING [train.py:1204] (2/4) Exclude cut with ID _62574_210_2655_1_1535180407058_3933249_120-196848-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 01:37:25,545 WARNING [train.py:1204] (2/4) Exclude cut with ID _40519_210_13794_1_1533715228655_7337390_916-51000-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 01:37:25,755 WARNING [train.py:1204] (2/4) Exclude cut with ID _65263_210_22356_1_1535763598691_3921037_108-21902-0_sp1.1 from training. Duration: 0.9 2023-10-02 01:37:25,842 WARNING [train.py:1204] (2/4) Exclude cut with ID _4122_210_2084_1_1526713164417_4353280_528-206771-0_sp1.1 from training. Duration: 0.8 2023-10-02 01:37:26,042 WARNING [train.py:1204] (2/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_43-325657-0_sp1.1 from training. Duration: 0.92725 2023-10-02 01:37:26,094 WARNING [train.py:1204] (2/4) Exclude cut with ID _44659_210_2721_1_1534036120061_6614860_314-82041-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 01:37:26,161 WARNING [train.py:1204] (2/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_81-300212-0_sp0.9 from training. Duration: 0.6666875 2023-10-02 01:37:26,243 WARNING [train.py:1204] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_665-196155-0_sp0.9 from training. Duration: 0.7444375 2023-10-02 01:37:26,348 WARNING [train.py:1204] (2/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_140-336881-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 01:37:26,433 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6290_210_3273_1_1527595224_1282787_115-87095-0_sp0.9 from training. Duration: 0.611125 2023-10-02 01:37:26,497 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_53373_210_5399_1_1534229471_4715695_607-259685-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 01:37:26,617 WARNING [train.py:1204] (2/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_175-163894-0_sp1.1 from training. Duration: 0.67275 2023-10-02 01:37:26,642 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7002_210_1794_1_1527939683_5157286_1-281264-0_sp1.1 from training. Duration: 0.4 2023-10-02 01:37:27,761 WARNING [train.py:1204] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_605-257205-0 from training. Duration: 0.59 2023-10-02 01:37:28,043 WARNING [train.py:1204] (2/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_454-314901-0 from training. Duration: 0.46 2023-10-02 01:37:28,144 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_46032_210_14656_1_1533781412_4507969_287-130422-0_sp1.1 from training. Duration: 0.97275 2023-10-02 01:37:28,172 WARNING [train.py:1204] (2/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_186-309161-0_sp1.1 from training. Duration: 0.4909375 2023-10-02 01:37:28,205 WARNING [train.py:1204] (2/4) Exclude cut with ID _42659_210_2070_1_1533607442765_3120380_45-342307-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 01:37:28,692 WARNING [train.py:1204] (2/4) Exclude cut with ID _12555_210_8750_1_1530075594402_3780239_478-326818-0_sp1.1 from training. Duration: 0.963625 2023-10-02 01:37:28,837 WARNING [train.py:1204] (2/4) Exclude cut with ID _7606_210_4915_1_1528894253277_5080690_127-177761-0_sp1.1 from training. Duration: 0.5818125 2023-10-02 01:37:28,972 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6788_210_1388_1_1527898698_4562021_91-197981-0 from training. Duration: 0.83 2023-10-02 01:37:29,105 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8558_210_1410_1_1528505225_7613406_131-329524-0_sp0.9 from training. Duration: 0.87775 2023-10-02 01:37:29,234 WARNING [train.py:1204] (2/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_153-153537-0_sp1.1 from training. Duration: 0.82725 2023-10-02 01:37:29,409 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_17292_210_6753_1_1531125388_2943734_285-349105-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 01:37:29,596 WARNING [train.py:1204] (2/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_37-41919-0 from training. Duration: 0.87 2023-10-02 01:37:29,638 WARNING [train.py:1204] (2/4) Exclude cut with ID _60860_210_8254_1_1534896015875_3557169_172-294852-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 01:37:30,102 WARNING [train.py:1204] (2/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_146-276944-0 from training. Duration: 0.83 2023-10-02 01:37:30,344 WARNING [train.py:1204] (2/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_1254-44082-0_sp1.1 from training. Duration: 0.936375 2023-10-02 01:37:30,356 WARNING [train.py:1204] (2/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_1115-180350-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 01:37:30,393 WARNING [train.py:1204] (2/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_60-146189-0_sp1.1 from training. Duration: 0.8909375 2023-10-02 01:37:30,845 WARNING [train.py:1204] (2/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_517-153649-0_sp1.1 from training. Duration: 0.92725 2023-10-02 01:37:30,863 WARNING [train.py:1204] (2/4) Exclude cut with ID _39571_210_13449_1_1533801584143_3651077_37-266257-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 01:37:30,916 WARNING [train.py:1204] (2/4) Exclude cut with ID _6653_210_2629_1_1527919172441_6732679_527-276118-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 01:37:31,960 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7696_210_2040_1_1528207167_3716099_117-125434-0_sp1.1 from training. Duration: 0.6909375 2023-10-02 01:37:32,289 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_25767_210_2161_1_1532307483_3867748_478-156360-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 01:37:32,293 WARNING [train.py:1204] (2/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_177-49092-0_sp1.1 from training. Duration: 0.82725 2023-10-02 01:37:32,300 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_11305_210_1794_1_1529750428_7178148_680-317073-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 01:37:32,831 WARNING [train.py:1204] (2/4) Exclude cut with ID _53565_210_10635_1_1534337218335_3820259_287-18120-0 from training. Duration: 0.97 2023-10-02 01:37:33,197 WARNING [train.py:1204] (2/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_498-40153-0_sp1.1 from training. Duration: 0.57275 2023-10-02 01:37:33,219 WARNING [train.py:1204] (2/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_555-63066-0_sp1.1 from training. Duration: 0.963625 2023-10-02 01:37:33,511 WARNING [train.py:1204] (2/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_395-310220-0 from training. Duration: 0.89 2023-10-02 01:37:33,582 WARNING [train.py:1204] (2/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_937-208281-0_sp1.1 from training. Duration: 0.77275 2023-10-02 01:37:33,777 WARNING [train.py:1204] (2/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_120-211630-0 from training. Duration: 0.92 2023-10-02 01:37:33,813 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_454-276429-0_sp1.1 from training. Duration: 0.7909375 2023-10-02 01:37:33,848 WARNING [train.py:1204] (2/4) Exclude cut with ID _22826_210_9994_1_1531908201522_3279169_404-75889-0_sp1.1 from training. Duration: 0.8090625 2023-10-02 01:37:34,356 WARNING [train.py:1204] (2/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_282-136641-0_sp0.9 from training. Duration: 0.4666875 2023-10-02 01:37:34,474 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_809-17156-0_sp1.1 from training. Duration: 0.663625 2023-10-02 01:37:34,644 WARNING [train.py:1204] (2/4) Exclude cut with ID _9235_210_5219_1_1528891154253_8046120_1003-101638-0_sp0.9 from training. Duration: 0.811125 2023-10-02 01:37:34,736 WARNING [train.py:1204] (2/4) Exclude cut with ID _22826_210_9994_1_1531908201522_3279169_404-75889-0_sp0.9 from training. Duration: 0.988875 2023-10-02 01:37:34,890 WARNING [train.py:1204] (2/4) Exclude cut with ID _59288_210_5854_1_1535077757774_3412246_570-295255-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 01:37:35,088 WARNING [train.py:1204] (2/4) Exclude cut with ID _3465_210_3525_1_1526175667060_3574049_412-287526-0 from training. Duration: 0.72 2023-10-02 01:37:35,149 WARNING [train.py:1204] (2/4) Exclude cut with ID _44010_210_4525_1_1533884262964_4013620_649-79655-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 01:37:36,166 WARNING [train.py:1204] (2/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_57-270037-0_sp0.9 from training. Duration: 0.8555625 2023-10-02 01:37:36,166 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_34815_210_6388_1_1533381320_3571903_302-255963-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 01:37:36,407 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_47449_210_5399_1_1534076445_4006929_573-301852-0_sp1.1 from training. Duration: 0.97275 2023-10-02 01:37:36,682 WARNING [train.py:1204] (2/4) Exclude cut with ID 6709-74022-0004-86860-0_sp1.1 from training. Duration: 0.9409375 2023-10-02 01:37:36,800 WARNING [train.py:1204] (2/4) Exclude cut with ID _40519_210_13794_1_1533715228655_7337390_111-71991-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 01:37:36,846 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_30167_210_3528_1_1532428012_4772375_169-214186-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 01:37:37,097 WARNING [train.py:1204] (2/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_20-235313-0_sp1.1 from training. Duration: 0.963625 2023-10-02 01:37:37,135 WARNING [train.py:1204] (2/4) Exclude cut with ID _4681_210_3525_1_1527251040321_1235713_123-24428-0_sp0.9 from training. Duration: 0.5333125 2023-10-02 01:37:37,331 WARNING [train.py:1204] (2/4) Exclude cut with ID _31602_210_5854_1_1532677021456_4458250_444-198752-0_sp1.1 from training. Duration: 0.97275 2023-10-02 01:37:37,382 WARNING [train.py:1204] (2/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_9-279442-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 01:37:37,438 WARNING [train.py:1204] (2/4) Exclude cut with ID _6275_210_2084_1_1528110062213_7669049_973-198085-0_sp0.9 from training. Duration: 0.8333125 2023-10-02 01:37:37,528 WARNING [train.py:1204] (2/4) Exclude cut with ID _29853_210_7497_1_1532505600851_6642110_567-250221-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 01:37:37,642 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6545_210_2040_1_1527774670_4245258_407-48070-0_sp1.1 from training. Duration: 0.6909375 2023-10-02 01:37:37,844 WARNING [train.py:1204] (2/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_224-343295-0_sp0.9 from training. Duration: 0.611125 2023-10-02 01:37:37,864 WARNING [train.py:1204] (2/4) Exclude cut with ID IC0125W0011-61817 from training. Duration: 0.88 2023-10-02 01:37:37,883 WARNING [train.py:1204] (2/4) Exclude cut with ID _60113_210_16271_1_1535164212481_4386079_435-154371-0_sp1.1 from training. Duration: 0.97275 2023-10-02 01:37:37,903 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_25836_210_16554_1_1532138167_53197_3-9394-0_sp1.1 from training. Duration: 0.92725 2023-10-02 01:37:38,184 WARNING [train.py:1204] (2/4) Exclude cut with ID _29343_210_4381_1_1532602757752_3674109_57-55125-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 01:37:38,281 WARNING [train.py:1204] (2/4) Exclude cut with ID _56235_210_10640_1_1534557335235_4161099_267-23560-0_sp1.1 from training. Duration: 0.936375 2023-10-02 01:37:38,311 WARNING [train.py:1204] (2/4) Exclude cut with ID _30196_210_1664_1_1533708496522_7074140_1150-278867-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 01:37:38,367 WARNING [train.py:1204] (2/4) Exclude cut with ID _6270_210_2084_1_1527850911573_3856420_26-277343-0_sp0.9 from training. Duration: 0.911125 2023-10-02 01:37:38,434 WARNING [train.py:1204] (2/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_325-62802-0 from training. Duration: 0.87 2023-10-02 01:37:38,512 WARNING [train.py:1204] (2/4) Exclude cut with ID _56524_210_9642_1_1534572011779_3619170_145-310842-0_sp1.1 from training. Duration: 0.9 2023-10-02 01:37:38,547 WARNING [train.py:1204] (2/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_860-96198-0_sp0.9 from training. Duration: 0.811125 2023-10-02 01:37:38,648 WARNING [train.py:1204] (2/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_17-25045-0_sp1.1 from training. Duration: 0.536375 2023-10-02 01:37:38,670 WARNING [train.py:1204] (2/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_349-69727-0_sp1.1 from training. Duration: 0.936375 2023-10-02 01:37:38,760 WARNING [train.py:1204] (2/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_580-93798-0_sp1.1 from training. Duration: 0.5090625 2023-10-02 01:37:38,861 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7762_210_1794_1_1528199508_4057408_419-125009-0 from training. Duration: 0.83 2023-10-02 01:37:38,948 WARNING [train.py:1204] (2/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_356-167816-0 from training. Duration: 0.72 2023-10-02 01:37:38,975 WARNING [train.py:1204] (2/4) Exclude cut with ID _29345_210_4381_1_1532689168263_3860169_225-97100-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 01:37:38,978 WARNING [train.py:1204] (2/4) Exclude cut with ID _14227_210_4381_1_1530701804286_3940259_374-194712-0_sp1.1 from training. Duration: 0.92725 2023-10-02 01:37:39,009 WARNING [train.py:1204] (2/4) Exclude cut with ID _40137_210_12210_1_1533290484046_3795180_249-187059-0_sp0.9 from training. Duration: 0.97775 2023-10-02 01:37:39,036 WARNING [train.py:1204] (2/4) Exclude cut with ID _22826_210_9994_1_1531908201522_3279169_389-317194-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 01:37:39,174 WARNING [train.py:1204] (2/4) Exclude cut with ID _27485_210_4916_1_1532739191677_3668269_239-122918-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 01:37:39,615 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_2245_210_2715_1_1525511548_3540254_128-117609-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 01:37:39,649 WARNING [train.py:1204] (2/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_160-276178-0_sp1.1 from training. Duration: 0.963625 2023-10-02 01:37:39,756 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_31920_210_10633_1_1532860009_1686094_1-279735-0_sp1.1 from training. Duration: 0.97275 2023-10-02 01:37:40,590 WARNING [train.py:1204] (2/4) Exclude cut with ID _51078_210_9070_1_1534161557424_3805250_325-262383-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 01:37:40,621 WARNING [train.py:1204] (2/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_643-52007-0_sp0.9 from training. Duration: 0.6333125 2023-10-02 01:37:40,656 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_51937_210_5014_1_1534573610_4022025_518-299248-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 01:37:41,166 WARNING [train.py:1204] (2/4) Exclude cut with ID _13252_210_1925_1_1530671372047_3336909_166-314607-0_sp1.1 from training. Duration: 0.4909375 2023-10-02 01:37:41,170 WARNING [train.py:1204] (2/4) Exclude cut with ID _44778_210_2054_1_1533798127203_7443090_1087-282939-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 01:37:41,218 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_23850_210_4238_1_1532232597_1632433_32-185135-0_sp1.1 from training. Duration: 0.7818125 2023-10-02 01:37:41,239 WARNING [train.py:1204] (2/4) Exclude cut with ID _59500_210_8254_1_1534809610680_3736009_145-89359-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 01:37:41,636 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_340-336408-0 from training. Duration: 0.93 2023-10-02 01:37:41,728 WARNING [train.py:1204] (2/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_9-21737-0 from training. Duration: 0.97 2023-10-02 01:37:41,927 WARNING [train.py:1204] (2/4) Exclude cut with ID _2220_210_1819_1_1525509109898_3658680_50-99837-0 from training. Duration: 0.54 2023-10-02 01:37:41,987 WARNING [train.py:1204] (2/4) Exclude cut with ID _30304_210_6319_1_1532476827261_7582060_879-299350-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 01:37:42,107 WARNING [train.py:1204] (2/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_905-160074-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 01:37:42,240 WARNING [train.py:1204] (2/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_395-73570-0 from training. Duration: 0.99 2023-10-02 01:37:42,552 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7264_210_4938_1_1527939911_8132095_800-91995-0_sp1.1 from training. Duration: 0.7818125 2023-10-02 01:37:42,759 WARNING [train.py:1204] (2/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_77-311404-0_sp1.1 from training. Duration: 0.8545625 2023-10-02 01:37:42,870 WARNING [train.py:1204] (2/4) Exclude cut with ID _13679_210_7497_1_1530534293662_3355098_107-80772-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 01:37:42,984 WARNING [train.py:1204] (2/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_224-343295-0_sp1.1 from training. Duration: 0.5 2023-10-02 01:37:43,030 WARNING [train.py:1204] (2/4) Exclude cut with ID _54871_210_22356_1_1534665798253_3856359_156-253482-0_sp1.1 from training. Duration: 0.963625 2023-10-02 01:37:43,260 WARNING [train.py:1204] (2/4) Exclude cut with ID _49728_210_12651_1_1534041034294_6788150_309-125532-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 01:37:43,501 WARNING [train.py:1204] (2/4) Exclude cut with ID _13913_210_9994_1_1530579615187_3383897_473-277682-0_sp0.9 from training. Duration: 0.4333125 2023-10-02 01:37:43,505 WARNING [train.py:1204] (2/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_3-276701-0_sp1.1 from training. Duration: 0.82725 2023-10-02 01:37:43,640 WARNING [train.py:1204] (2/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_282-136641-0 from training. Duration: 0.42 2023-10-02 01:37:43,861 WARNING [train.py:1204] (2/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_127-566-0 from training. Duration: 0.84 2023-10-02 01:37:44,128 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6163_210_6400_1_1527506746_4263068_283-137811-0_sp1.1 from training. Duration: 0.7 2023-10-02 01:37:44,166 WARNING [train.py:1204] (2/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_34-183685-0_sp1.1 from training. Duration: 0.8909375 2023-10-02 01:37:44,886 WARNING [train.py:1204] (2/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_154-135696-0_sp0.9 from training. Duration: 0.888875 2023-10-02 01:37:45,182 WARNING [train.py:1204] (2/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_676-51014-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 01:37:45,259 WARNING [train.py:1204] (2/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_38-169920-0 from training. Duration: 0.91 2023-10-02 01:37:45,622 WARNING [train.py:1204] (2/4) Exclude cut with ID _6653_210_2629_1_1527919172441_6732679_747-114465-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 01:37:45,677 WARNING [train.py:1204] (2/4) Exclude cut with ID _8021_210_7994_1_1528376451358_3653069_100-140231-0_sp1.1 from training. Duration: 0.6090625 2023-10-02 01:37:45,694 WARNING [train.py:1204] (2/4) Exclude cut with ID _64135_210_2253_1_1535677267876_7608972_133-188266-0 from training. Duration: 0.81 2023-10-02 01:37:45,972 WARNING [train.py:1204] (2/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_714-310538-0_sp1.1 from training. Duration: 0.7818125 2023-10-02 01:37:46,084 WARNING [train.py:1204] (2/4) Exclude cut with ID _39867_210_9826_1_1533553222030_9344259_1134-288498-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 01:37:46,315 WARNING [train.py:1204] (2/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_211-237289-0_sp1.1 from training. Duration: 0.936375 2023-10-02 01:37:46,696 WARNING [train.py:1204] (2/4) Exclude cut with ID _59202_210_26984_1_1534816810612_3549174_137-318710-0_sp0.9 from training. Duration: 0.988875 2023-10-02 01:37:46,729 WARNING [train.py:1204] (2/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_86-66192-0 from training. Duration: 0.55 2023-10-02 01:37:46,958 WARNING [train.py:1204] (2/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_540-275685-0 from training. Duration: 0.89 2023-10-02 01:37:46,986 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8776_210_2040_1_1528638837_4088036_244-278763-0 from training. Duration: 0.9 2023-10-02 01:37:47,175 WARNING [train.py:1204] (2/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_9-219972-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 01:37:47,324 WARNING [train.py:1204] (2/4) Exclude cut with ID _44659_210_2721_1_1534036120061_6614860_656-283823-0_sp1.1 from training. Duration: 0.92725 2023-10-02 01:37:47,400 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_69630_210_7234_1_1535789601_661049_77-216385-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 01:37:47,593 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_19952_210_2161_1_1531875469_3851750_210-308919-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 01:37:47,598 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_4737_210_2040_1_1526998689_3293008_378-178650-0 from training. Duration: 0.95 2023-10-02 01:37:47,920 WARNING [train.py:1204] (2/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_224-343295-0 from training. Duration: 0.55 2023-10-02 01:37:47,942 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7002_210_1794_1_1527939683_5157286_184-229630-0 from training. Duration: 0.74 2023-10-02 01:37:47,969 WARNING [train.py:1204] (2/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_6-214815-0 from training. Duration: 0.85 2023-10-02 01:37:48,048 WARNING [train.py:1204] (2/4) Exclude cut with ID _8699_210_2069_1_1528630346831_3960409_160-24408-0 from training. Duration: 0.91 2023-10-02 01:37:48,087 WARNING [train.py:1204] (2/4) Exclude cut with ID _5585_210_2667_1_1527418813324_8007340_89-332072-0 from training. Duration: 0.64 2023-10-02 01:37:48,127 WARNING [train.py:1204] (2/4) Exclude cut with ID _5250_210_5219_1_1527249561806_8692110_528-259800-0_sp1.1 from training. Duration: 0.963625 2023-10-02 01:37:48,207 WARNING [train.py:1204] (2/4) Exclude cut with ID _55383_210_10635_1_1534468426172_2231669_134-128246-0 from training. Duration: 0.89 2023-10-02 01:37:48,291 WARNING [train.py:1204] (2/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_400-338274-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 01:37:48,340 WARNING [train.py:1204] (2/4) Exclude cut with ID _14224_210_4381_1_1530529062037_4264010_364-22817-0_sp0.9 from training. Duration: 0.788875 2023-10-02 01:37:48,467 WARNING [train.py:1204] (2/4) Exclude cut with ID _42863_210_9070_1_1533902398788_3696920_331-291057-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 01:37:49,237 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_5436_210_1794_1_1527331317_2541494_229-211016-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 01:37:49,272 WARNING [train.py:1204] (2/4) Exclude cut with ID _6456_210_1534_1_1527681634212_3181049_345-252877-0_sp0.9 from training. Duration: 0.711125 2023-10-02 01:37:49,306 WARNING [train.py:1204] (2/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_96-81499-0_sp1.1 from training. Duration: 0.92725 2023-10-02 01:37:49,497 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_5575_210_1410_1_1527301305_2570170_342-265572-0_sp1.1 from training. Duration: 0.8545625 2023-10-02 01:37:49,572 WARNING [train.py:1204] (2/4) Exclude cut with ID _5444_210_2151_1_1527418808464_5312210_726-27165-0_sp0.9 from training. Duration: 0.7444375 2023-10-02 01:37:49,726 WARNING [train.py:1204] (2/4) Exclude cut with ID _22083_210_8254_1_1531702862439_3678970_304-327750-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 01:37:49,895 WARNING [train.py:1204] (2/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_245-226199-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 01:37:49,939 WARNING [train.py:1204] (2/4) Exclude cut with ID _42875_210_14856_1_1533704397487_4012433_102-199499-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 01:37:49,992 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_51225_210_3601_1_1534400634_2327383_55-301844-0 from training. Duration: 0.93 2023-10-02 01:37:50,102 WARNING [train.py:1204] (2/4) Exclude cut with ID _28317_210_12742_1_1532480302381_8510325_916-195848-0_sp1.1 from training. Duration: 0.836375 2023-10-02 01:37:50,163 WARNING [train.py:1204] (2/4) Exclude cut with ID _44718_210_5816_1_1534050051869_6469030_497-311999-0_sp1.1 from training. Duration: 0.936375 2023-10-02 01:37:50,294 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_34886_210_10633_1_1533203882_1755661_91-194765-0_sp1.1 from training. Duration: 0.936375 2023-10-02 01:37:50,417 WARNING [train.py:1204] (2/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_310-26186-0_sp0.9 from training. Duration: 0.77775 2023-10-02 01:37:50,520 WARNING [train.py:1204] (2/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_416-257730-0 from training. Duration: 0.65 2023-10-02 01:37:50,526 WARNING [train.py:1204] (2/4) Exclude cut with ID _9389_210_1664_1_1529217337301_5289269_209-154760-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 01:37:50,700 WARNING [train.py:1204] (2/4) Exclude cut with ID _4092_210_2151_1_1526643044449_3135759_227-213972-0 from training. Duration: 0.54 2023-10-02 01:37:50,728 WARNING [train.py:1204] (2/4) Exclude cut with ID IC0679W0044-335852 from training. Duration: 0.768 2023-10-02 01:37:50,789 WARNING [train.py:1204] (2/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_42-186162-0 from training. Duration: 0.92 2023-10-02 01:37:50,795 WARNING [train.py:1204] (2/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_302-2550-0_sp0.9 from training. Duration: 0.9 2023-10-02 01:37:50,815 WARNING [train.py:1204] (2/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_356-31216-0_sp0.9 from training. Duration: 0.8333125 2023-10-02 01:37:51,004 WARNING [train.py:1204] (2/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_125-322172-0_sp1.1 from training. Duration: 0.8454375 2023-10-02 01:37:51,488 WARNING [train.py:1204] (2/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_141-89727-0_sp0.9 from training. Duration: 0.788875 2023-10-02 01:37:51,504 WARNING [train.py:1204] (2/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_647-337784-0_sp1.1 from training. Duration: 0.97275 2023-10-02 01:37:51,513 WARNING [train.py:1204] (2/4) Exclude cut with ID _6493_210_1747_1_1528459210424_4012470_513-140967-0_sp1.1 from training. Duration: 0.7181875 2023-10-02 01:37:51,692 WARNING [train.py:1204] (2/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_87-266334-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 01:37:51,807 WARNING [train.py:1204] (2/4) Exclude cut with ID _26528_210_9070_1_1532570410842_3851310_526-261338-0_sp1.1 from training. Duration: 0.92725 2023-10-02 01:37:51,870 WARNING [train.py:1204] (2/4) Exclude cut with ID _14224_210_4381_1_1530529062037_4264010_380-343670-0 from training. Duration: 0.88 2023-10-02 01:37:51,929 WARNING [train.py:1204] (2/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_449-15508-0_sp1.1 from training. Duration: 0.7454375 2023-10-02 01:37:52,035 WARNING [train.py:1204] (2/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_372-6869-0_sp0.9 from training. Duration: 0.488875 2023-10-02 01:37:52,100 WARNING [train.py:1204] (2/4) Exclude cut with ID _26907_210_7234_1_1532221740694_3205263_337-2494-0_sp0.9 from training. Duration: 0.97775 2023-10-02 01:37:52,325 WARNING [train.py:1204] (2/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_10-100367-0_sp1.1 from training. Duration: 0.8909375 2023-10-02 01:37:52,351 WARNING [train.py:1204] (2/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_47-167373-0 from training. Duration: 0.92 2023-10-02 01:37:52,426 WARNING [train.py:1204] (2/4) Exclude cut with ID _28323_210_16187_1_1532480395667_4100300_628-282179-0_sp1.1 from training. Duration: 0.97275 2023-10-02 01:37:52,830 WARNING [train.py:1204] (2/4) Exclude cut with ID _20455_210_5219_1_1531565996775_8098100_213-280898-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 01:37:52,927 WARNING [train.py:1204] (2/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_383-316000-0_sp1.1 from training. Duration: 0.8181875 2023-10-02 01:37:53,541 WARNING [train.py:1204] (2/4) Exclude cut with ID _18513_210_9206_1_1531618215223_3862620_16-78180-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 01:37:53,708 WARNING [train.py:1204] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_330-327861-0_sp1.1 from training. Duration: 0.6090625 2023-10-02 01:37:54,016 WARNING [train.py:1204] (2/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_356-31216-0 from training. Duration: 0.75 2023-10-02 01:37:54,054 WARNING [train.py:1204] (2/4) Exclude cut with ID _25248_210_8750_1_1532062825941_5554180_255-148539-0_sp1.1 from training. Duration: 0.963625 2023-10-02 01:37:54,098 WARNING [train.py:1204] (2/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_269-154726-0_sp1.1 from training. Duration: 0.7818125 2023-10-02 01:37:54,323 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7002_210_1794_1_1527939683_5157286_163-87552-0_sp1.1 from training. Duration: 0.7818125 2023-10-02 01:37:54,363 WARNING [train.py:1204] (2/4) Exclude cut with ID _54871_210_22356_1_1534665798253_3856359_97-151896-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 01:37:54,587 WARNING [train.py:1204] (2/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_207-7026-0_sp1.1 from training. Duration: 0.97275 2023-10-02 01:37:54,623 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_92-46009-0_sp0.9 from training. Duration: 0.6 2023-10-02 01:37:55,189 WARNING [train.py:1204] (2/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_122-178847-0_sp1.1 from training. Duration: 0.97275 2023-10-02 01:37:55,193 WARNING [train.py:1204] (2/4) Exclude cut with ID _44778_210_2054_1_1533798127203_7443090_94-348956-0_sp1.1 from training. Duration: 0.92725 2023-10-02 01:37:55,344 WARNING [train.py:1204] (2/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_1018-244089-0_sp1.1 from training. Duration: 0.963625 2023-10-02 01:37:55,362 WARNING [train.py:1204] (2/4) Exclude cut with ID _56235_210_10640_1_1534557335235_4161099_87-118423-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 01:37:55,464 WARNING [train.py:1204] (2/4) Exclude cut with ID _53713_210_24116_1_1534314487762_7416751_504-212292-0_sp1.1 from training. Duration: 0.92725 2023-10-02 01:37:55,492 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_446-150327-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 01:37:55,637 WARNING [train.py:1204] (2/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_682-245989-0_sp1.1 from training. Duration: 0.92725 2023-10-02 01:37:55,911 WARNING [train.py:1204] (2/4) Exclude cut with ID _35723_210_1794_1_1533088341043_628250_8-234524-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 01:37:55,976 WARNING [train.py:1204] (2/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_64-248359-0_sp0.9 from training. Duration: 0.7666875 2023-10-02 01:37:56,189 WARNING [train.py:1204] (2/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_49-97158-0 from training. Duration: 0.76 2023-10-02 01:37:56,253 WARNING [train.py:1204] (2/4) Exclude cut with ID _24314_210_4915_1_1532135078116_7170179_743-915-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 01:37:56,342 WARNING [train.py:1204] (2/4) Exclude cut with ID _61194_210_18628_1_1535007555628_3750909_353-323468-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 01:37:56,373 WARNING [train.py:1204] (2/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_387-131538-0_sp1.1 from training. Duration: 0.736375 2023-10-02 01:37:56,403 WARNING [train.py:1204] (2/4) Exclude cut with ID _44424_210_18163_1_1533623394848_1213110_79-187603-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 01:37:56,430 WARNING [train.py:1204] (2/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_86-19601-0 from training. Duration: 0.85 2023-10-02 01:37:56,459 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_50050_210_16223_1_1535435796_7290138_103-283529-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 01:37:56,513 WARNING [train.py:1204] (2/4) Exclude cut with ID _13913_210_9994_1_1530579615187_3383897_473-277682-0 from training. Duration: 0.39 2023-10-02 01:37:56,748 WARNING [train.py:1204] (2/4) Exclude cut with ID _42326_210_7770_1_1533814282531_3707269_368-68029-0_sp1.1 from training. Duration: 0.92725 2023-10-02 01:37:56,929 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_41428_210_2161_1_1533774184_3992532_446-339645-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 01:37:57,096 WARNING [train.py:1204] (2/4) Exclude cut with ID _69584_210_5219_1_1535785133775_4044450_325-7656-0_sp1.1 from training. Duration: 0.7818125 2023-10-02 01:37:57,103 WARNING [train.py:1204] (2/4) Exclude cut with ID _21632_210_2253_1_1531994001765_3744140_134-184387-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 01:37:57,976 WARNING [train.py:1204] (2/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_550-184707-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 01:37:58,001 WARNING [train.py:1204] (2/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_106-341780-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 01:37:58,326 WARNING [train.py:1204] (2/4) Exclude cut with ID _45871_210_5665_1_1533864614245_3296060_253-132552-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 01:37:58,377 WARNING [train.py:1204] (2/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_395-310220-0_sp1.1 from training. Duration: 0.8090625 2023-10-02 01:37:58,391 WARNING [train.py:1204] (2/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_795-33252-0_sp0.9 from training. Duration: 0.411125 2023-10-02 01:37:58,557 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9002W0003-495962 from training. Duration: 0.946 2023-10-02 01:37:58,576 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_43-71989-0 from training. Duration: 0.57 2023-10-02 01:37:58,593 WARNING [train.py:1204] (2/4) Exclude cut with ID _8699_210_2069_1_1528630346831_3960409_155-39766-0_sp1.1 from training. Duration: 0.7090625 2023-10-02 01:37:58,604 WARNING [train.py:1204] (2/4) Exclude cut with ID _27894_210_12392_1_1532415496078_6902439_788-232684-0_sp1.1 from training. Duration: 0.97275 2023-10-02 01:37:58,719 WARNING [train.py:1204] (2/4) Exclude cut with ID _56494_210_4220_1_1535284837625_3033169_10-39296-0_sp1.1 from training. Duration: 0.92725 2023-10-02 01:37:58,753 WARNING [train.py:1204] (2/4) Exclude cut with ID _40901_210_5665_1_1533363789958_3856130_341-20819-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 01:37:59,126 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9029W0435-509822 from training. Duration: 0.896 2023-10-02 01:37:59,169 WARNING [train.py:1204] (2/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_51-117576-0 from training. Duration: 0.4 2023-10-02 01:37:59,290 WARNING [train.py:1204] (2/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_197-28164-0_sp1.1 from training. Duration: 0.72725 2023-10-02 01:37:59,444 WARNING [train.py:1204] (2/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_145-137790-0_sp1.1 from training. Duration: 0.7545625 2023-10-02 01:37:59,726 WARNING [train.py:1204] (2/4) Exclude cut with ID _47790_210_16187_1_1533862814066_4207519_2-39653-0_sp1.1 from training. Duration: 0.8909375 2023-10-02 01:37:59,979 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_3780_210_2087_1_1526467760_2130483_186-208240-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 01:38:00,059 WARNING [train.py:1204] (2/4) Exclude cut with ID _24055_210_12651_1_1532310907143_976979_92-59356-0 from training. Duration: 0.91 2023-10-02 01:38:00,114 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_3910_210_1794_1_1526742306_334530_28-238909-0_sp1.1 from training. Duration: 0.736375 2023-10-02 01:38:00,322 WARNING [train.py:1204] (2/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_269-154726-0 from training. Duration: 0.86 2023-10-02 01:38:00,742 WARNING [train.py:1204] (2/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_326-165900-0_sp0.9 from training. Duration: 0.7666875 2023-10-02 01:38:00,959 WARNING [train.py:1204] (2/4) Exclude cut with ID _5294_210_1747_1_1527249643928_3696179_470-276077-0_sp1.1 from training. Duration: 0.963625 2023-10-02 01:38:01,013 WARNING [train.py:1204] (2/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_372-131163-0 from training. Duration: 0.94 2023-10-02 01:38:01,065 WARNING [train.py:1204] (2/4) Exclude cut with ID _45297_210_2721_1_1533715351381_3264949_351-147136-0_sp1.1 from training. Duration: 0.7909375 2023-10-02 01:38:01,144 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_1072-12671-0_sp1.1 from training. Duration: 0.92725 2023-10-02 01:38:01,225 WARNING [train.py:1204] (2/4) Exclude cut with ID _15133_210_6228_1_1530794512074_3080379_54-198193-0 from training. Duration: 0.94 2023-10-02 01:38:01,241 WARNING [train.py:1204] (2/4) Exclude cut with ID _30196_210_1664_1_1533708496522_7074140_1194-270988-0_sp1.1 from training. Duration: 0.97275 2023-10-02 01:38:01,379 WARNING [train.py:1204] (2/4) Exclude cut with ID _50310_210_15488_1_1534068043345_3641080_289-193637-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 01:38:01,488 WARNING [train.py:1204] (2/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_313-263495-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 01:38:01,598 WARNING [train.py:1204] (2/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_18-130336-0_sp1.1 from training. Duration: 0.7181875 2023-10-02 01:38:01,667 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_47896_210_20508_1_1534492518_5958868_646-287209-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 01:38:02,378 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_294-163793-0 from training. Duration: 0.82 2023-10-02 01:38:02,432 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_8-319957-0 from training. Duration: 0.96 2023-10-02 01:38:02,435 WARNING [train.py:1204] (2/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_1211-226066-0 from training. Duration: 0.76 2023-10-02 01:38:02,568 WARNING [train.py:1204] (2/4) Exclude cut with ID _4779_210_2084_1_1527318022944_3914893_500-285013-0_sp0.9 from training. Duration: 0.7666875 2023-10-02 01:38:02,595 WARNING [train.py:1204] (2/4) Exclude cut with ID _22826_210_9994_1_1531908201522_3279169_347-232268-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 01:38:02,736 WARNING [train.py:1204] (2/4) Exclude cut with ID _29877_210_6228_1_1532397791106_5276149_111-55056-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 01:38:02,772 WARNING [train.py:1204] (2/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_812-271646-0_sp1.1 from training. Duration: 0.92725 2023-10-02 01:38:02,808 WARNING [train.py:1204] (2/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_235-262124-0_sp1.1 from training. Duration: 0.6090625 2023-10-02 01:38:03,177 WARNING [train.py:1204] (2/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_14-16530-0_sp1.1 from training. Duration: 0.97275 2023-10-02 01:38:03,224 WARNING [train.py:1204] (2/4) Exclude cut with ID _44718_210_5816_1_1534050051869_6469030_568-304512-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 01:38:03,250 WARNING [train.py:1204] (2/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_1033-274376-0 from training. Duration: 0.87 2023-10-02 01:38:03,505 WARNING [train.py:1204] (2/4) Exclude cut with ID _4681_210_3525_1_1527251040321_1235713_123-24428-0 from training. Duration: 0.48 2023-10-02 01:38:03,733 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_37818_210_4238_1_1533620910_4720083_251-288764-0_sp1.1 from training. Duration: 0.97275 2023-10-02 01:38:03,813 WARNING [train.py:1204] (2/4) Exclude cut with ID _65585_210_12210_1_1535450451584_3764098_112-294301-0 from training. Duration: 0.88 2023-10-02 01:38:04,084 WARNING [train.py:1204] (2/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_329-113173-0 from training. Duration: 0.62 2023-10-02 01:38:04,173 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6767_210_3273_1_1527855783_1718932_173-256601-0 from training. Duration: 0.8 2023-10-02 01:38:04,371 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9266W0001-627677 from training. Duration: 0.896 2023-10-02 01:38:04,381 WARNING [train.py:1204] (2/4) Exclude cut with ID _49643_210_7989_1_1534640244224_3939031_611-185515-0_sp1.1 from training. Duration: 0.8545625 2023-10-02 01:38:04,384 WARNING [train.py:1204] (2/4) Exclude cut with ID _31649_210_6228_1_1532584608063_4091078_687-237771-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 01:38:04,393 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_12868_210_6753_1_1530702479_3610564_148-172861-0_sp0.9 from training. Duration: 0.5555625 2023-10-02 01:38:04,597 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_5129_210_6400_1_1527161005_3579054_107-69738-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 01:38:04,693 WARNING [train.py:1204] (2/4) Exclude cut with ID _40137_210_12210_1_1533290484046_3795180_36-301164-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 01:38:04,730 WARNING [train.py:1204] (2/4) Exclude cut with ID _7537_210_2084_1_1528502395431_3924359_113-199933-0 from training. Duration: 0.59 2023-10-02 01:38:04,887 WARNING [train.py:1204] (2/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_547-5424-0_sp1.1 from training. Duration: 0.8545625 2023-10-02 01:38:04,948 WARNING [train.py:1204] (2/4) Exclude cut with ID _43552_210_8913_1_1533726031059_3523429_147-284024-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 01:38:05,049 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_294-163793-0_sp1.1 from training. Duration: 0.7454375 2023-10-02 01:38:05,083 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_3780_210_2087_1_1526467760_2130483_170-259044-0_sp1.1 from training. Duration: 0.636375 2023-10-02 01:38:05,237 WARNING [train.py:1204] (2/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_726-270780-0_sp1.1 from training. Duration: 0.7818125 2023-10-02 01:38:05,338 WARNING [train.py:1204] (2/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_214-291369-0_sp1.1 from training. Duration: 0.57275 2023-10-02 01:38:05,392 WARNING [train.py:1204] (2/4) Exclude cut with ID _21881_210_3525_1_1531825242757_3798380_247-102591-0 from training. Duration: 0.98 2023-10-02 01:38:05,721 WARNING [train.py:1204] (2/4) Exclude cut with ID _8064_210_5399_1_1528372299291_4282973_18-174647-0_sp1.1 from training. Duration: 0.82725 2023-10-02 01:38:05,724 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_415-52611-0_sp1.1 from training. Duration: 0.936375 2023-10-02 01:38:05,777 WARNING [train.py:1204] (2/4) Exclude cut with ID _8394_210_6702_1_1528590504316_3540189_379-49457-0_sp0.9 from training. Duration: 0.9666875 2023-10-02 01:38:05,789 WARNING [train.py:1204] (2/4) Exclude cut with ID _64521_210_14699_1_1535243427259_3815328_368-195106-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 01:38:05,870 WARNING [train.py:1204] (2/4) Exclude cut with ID _28394_210_8913_1_1532430049518_3650118_51-334124-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 01:38:06,677 WARNING [train.py:1204] (2/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_317-161769-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 01:38:06,839 WARNING [train.py:1204] (2/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_40-129756-0_sp0.9 from training. Duration: 0.9 2023-10-02 01:38:06,931 WARNING [train.py:1204] (2/4) Exclude cut with ID _3067_210_2667_1_1526212974640_4503168_119-175776-0_sp0.9 from training. Duration: 0.82225 2023-10-02 01:38:06,980 WARNING [train.py:1204] (2/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_1004-304915-0_sp1.1 from training. Duration: 0.92725 2023-10-02 01:38:07,055 WARNING [train.py:1204] (2/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_359-118926-0_sp0.9 from training. Duration: 0.8 2023-10-02 01:38:07,622 WARNING [train.py:1204] (2/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_51-200220-0_sp0.9 from training. Duration: 0.62225 2023-10-02 01:38:07,707 WARNING [train.py:1204] (2/4) Exclude cut with ID _46683_210_9642_1_1533730913492_4591116_530-326202-0_sp1.1 from training. Duration: 0.936375 2023-10-02 01:38:07,753 WARNING [train.py:1204] (2/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_567-135114-0 from training. Duration: 0.87 2023-10-02 01:38:07,787 WARNING [train.py:1204] (2/4) Exclude cut with ID _66863_210_18053_1_1535698758334_8590319_1092-271784-0_sp1.1 from training. Duration: 0.87275 2023-10-02 01:38:07,796 WARNING [train.py:1204] (2/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_762-282614-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 01:38:07,964 WARNING [train.py:1204] (2/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_280-120942-0 from training. Duration: 0.77 2023-10-02 01:38:08,348 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_31583_210_3852_1_1532595851_4024413_27-170931-0_sp1.1 from training. Duration: 0.963625 2023-10-02 01:38:08,534 WARNING [train.py:1204] (2/4) Exclude cut with ID _12369_210_4381_1_1530356078581_4388520_266-271628-0_sp1.1 from training. Duration: 0.92725 2023-10-02 01:38:08,594 WARNING [train.py:1204] (2/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_41-31157-0 from training. Duration: 0.57 2023-10-02 01:38:08,817 WARNING [train.py:1204] (2/4) Exclude cut with ID _21878_210_3525_1_1531738772002_7264330_113-18163-0_sp1.1 from training. Duration: 0.8090625 2023-10-02 01:38:08,856 WARNING [train.py:1204] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_147-111137-0 from training. Duration: 0.95 2023-10-02 01:38:08,930 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1083-220122-0_sp1.1 from training. Duration: 0.463625 2023-10-02 01:38:09,026 WARNING [train.py:1204] (2/4) Exclude cut with ID _14884_210_2252_1_1530707459689_5561250_591-173406-0_sp1.1 from training. Duration: 0.87275 2023-10-02 01:38:09,031 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_24677_210_1794_1_1532002693_4341296_149-303793-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 01:38:09,081 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_12868_210_6753_1_1530702479_3610564_133-564-0_sp1.1 from training. Duration: 0.7181875 2023-10-02 01:38:09,118 WARNING [train.py:1204] (2/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_625-338546-0_sp0.9 from training. Duration: 0.97775 2023-10-02 01:38:09,918 WARNING [train.py:1204] (2/4) Exclude cut with ID _25882_210_3036_1_1532775665815_6479510_624-320169-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 01:38:09,925 WARNING [train.py:1204] (2/4) Exclude cut with ID _20622_210_3852_1_1531622427080_1093839_127-325326-0_sp1.1 from training. Duration: 0.92725 2023-10-02 01:38:09,958 WARNING [train.py:1204] (2/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_464-33931-0_sp1.1 from training. Duration: 0.4909375 2023-10-02 01:38:10,163 WARNING [train.py:1204] (2/4) Exclude cut with ID _66112_210_10635_1_1535590236305_3801484_120-304588-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 01:38:11,119 WARNING [train.py:1204] (2/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_110-130961-0 from training. Duration: 0.47 2023-10-02 01:38:11,268 WARNING [train.py:1204] (2/4) Exclude cut with ID _16908_210_5724_1_1531119157248_3772700_64-54020-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 01:38:11,575 WARNING [train.py:1204] (2/4) Exclude cut with ID _4068_210_2629_1_1526697350395_1546259_224-288693-0_sp1.1 from training. Duration: 0.536375 2023-10-02 01:38:11,616 WARNING [train.py:1204] (2/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_191-41023-0_sp0.9 from training. Duration: 0.788875 2023-10-02 01:38:11,678 WARNING [train.py:1204] (2/4) Exclude cut with ID ID0205W0255-783002 from training. Duration: 0.958 2023-10-02 01:38:11,746 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_162-262717-0 from training. Duration: 0.83 2023-10-02 01:38:12,163 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_224-322394-0_sp1.1 from training. Duration: 0.6 2023-10-02 01:38:12,181 WARNING [train.py:1204] (2/4) Exclude cut with ID _4866_210_2577_1_1527404412284_4364659_133-1696-0_sp1.1 from training. Duration: 0.9 2023-10-02 01:38:12,237 WARNING [train.py:1204] (2/4) Exclude cut with ID _6275_210_2084_1_1528110062213_7669049_973-198085-0_sp1.1 from training. Duration: 0.6818125 2023-10-02 01:38:12,701 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_47449_210_5399_1_1534076445_4006929_410-237806-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 01:38:12,710 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7543_210_6753_1_1528543318_4552_1-85072-0_sp0.9 from training. Duration: 0.92225 2023-10-02 01:38:12,857 WARNING [train.py:1204] (2/4) Exclude cut with ID _65263_210_22356_1_1535763598691_3921037_108-21902-0 from training. Duration: 0.99 2023-10-02 01:38:12,872 WARNING [train.py:1204] (2/4) Exclude cut with ID _65585_210_12210_1_1535450451584_3764098_309-174818-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 01:38:13,017 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_309-65177-0 from training. Duration: 0.88 2023-10-02 01:38:13,154 WARNING [train.py:1204] (2/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_363-185703-0_sp1.1 from training. Duration: 0.863625 2023-10-02 01:38:13,184 WARNING [train.py:1204] (2/4) Exclude cut with ID _42326_210_7770_1_1533814282531_3707269_442-238692-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 01:38:13,294 WARNING [train.py:1204] (2/4) Exclude cut with ID _69884_210_9771_1_1535846392818_4166169_226-254446-0_sp1.1 from training. Duration: 0.936375 2023-10-02 01:38:13,314 WARNING [train.py:1204] (2/4) Exclude cut with ID _29853_210_7497_1_1532505600851_6642110_287-299903-0_sp1.1 from training. Duration: 0.963625 2023-10-02 01:38:13,588 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1015-343884-0 from training. Duration: 0.62 2023-10-02 01:38:13,625 WARNING [train.py:1204] (2/4) Exclude cut with ID ID0305W0008-833402 from training. Duration: 0.91 2023-10-02 01:38:13,773 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6673_210_3273_1_1527767790_3173240_351-167954-0_sp1.1 from training. Duration: 0.436375 2023-10-02 01:38:13,792 WARNING [train.py:1204] (2/4) Exclude cut with ID _6456_210_1534_1_1527681634212_3181049_345-252877-0 from training. Duration: 0.64 2023-10-02 01:38:13,991 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_13-205182-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 01:38:14,248 WARNING [train.py:1204] (2/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_427-208901-0 from training. Duration: 0.68 2023-10-02 01:38:15,208 WARNING [train.py:1204] (2/4) Exclude cut with ID _49039_210_6315_1_1533967537541_3846118_9-148921-0_sp1.1 from training. Duration: 0.97275 2023-10-02 01:38:15,439 WARNING [train.py:1204] (2/4) Exclude cut with ID _59493_210_17282_1_1535078419077_3225879_143-70317-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 01:38:15,472 WARNING [train.py:1204] (2/4) Exclude cut with ID _27967_210_12140_1_1532588391859_8419130_1056-236369-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 01:38:15,540 WARNING [train.py:1204] (2/4) Exclude cut with ID _6537_210_1883_1_1527987559710_3597129_382-133062-0_sp0.9 from training. Duration: 0.7555625 2023-10-02 01:38:15,728 WARNING [train.py:1204] (2/4) Exclude cut with ID ID0376W0255-869957 from training. Duration: 0.9510625 2023-10-02 01:38:15,987 WARNING [train.py:1204] (2/4) Exclude cut with ID _49371_210_12071_1_1533952761021_3812190_483-240691-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 01:38:16,067 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_12868_210_6753_1_1530701932_530275_18-76322-0_sp1.1 from training. Duration: 0.7909375 2023-10-02 01:38:16,128 WARNING [train.py:1204] (2/4) Exclude cut with ID _53950_210_5816_1_1534417264919_3862951_367-26344-0_sp1.1 from training. Duration: 0.97275 2023-10-02 01:38:16,147 WARNING [train.py:1204] (2/4) Exclude cut with ID _5444_210_2151_1_1527418808464_5312210_634-194174-0_sp1.1 from training. Duration: 0.8818125 2023-10-02 01:38:16,163 WARNING [train.py:1204] (2/4) Exclude cut with ID _56494_210_4220_1_1535284837625_3033169_69-138150-0 from training. Duration: 0.94 2023-10-02 01:38:16,267 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7543_210_6753_1_1528544010_1110402_25-183860-0_sp0.9 from training. Duration: 0.52225 2023-10-02 01:38:16,381 WARNING [train.py:1204] (2/4) Exclude cut with ID _40284_210_2084_1_1533513679814_3704279_287-191918-0_sp1.1 from training. Duration: 0.963625 2023-10-02 01:38:16,499 WARNING [train.py:1204] (2/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_291-270881-0 from training. Duration: 0.97 2023-10-02 01:38:16,547 WARNING [train.py:1204] (2/4) Exclude cut with ID _12555_210_8750_1_1530075594402_3780239_449-267986-0_sp0.9 from training. Duration: 0.92225 2023-10-02 01:38:16,822 WARNING [train.py:1204] (2/4) Exclude cut with ID _3992_210_1747_1_1526644816031_3731060_268-64520-0_sp1.1 from training. Duration: 0.963625 2023-10-02 01:38:16,829 WARNING [train.py:1204] (2/4) Exclude cut with ID _39411_210_5986_1_1533538825030_3343649_173-238248-0_sp0.9 from training. Duration: 0.92225 2023-10-02 01:38:16,854 WARNING [train.py:1204] (2/4) Exclude cut with ID _32704_210_8954_1_1533017488840_6747370_828-313097-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 01:38:16,954 WARNING [train.py:1204] (2/4) Exclude cut with ID _26907_210_7234_1_1532221740694_3205263_337-2494-0 from training. Duration: 0.88 2023-10-02 01:38:17,084 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_229-45300-0_sp1.1 from training. Duration: 0.5181875 2023-10-02 01:38:17,235 WARNING [train.py:1204] (2/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_300-27235-0_sp1.1 from training. Duration: 0.7909375 2023-10-02 01:38:17,352 WARNING [train.py:1204] (2/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_48-338976-0_sp1.1 from training. Duration: 0.8090625 2023-10-02 01:38:17,709 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_13091_210_7925_1_1530609447_8987354_792-195353-0_sp1.1 from training. Duration: 0.936375 2023-10-02 01:38:17,721 WARNING [train.py:1204] (2/4) Exclude cut with ID _56524_210_9642_1_1534572011779_3619170_311-54500-0_sp1.1 from training. Duration: 0.97275 2023-10-02 01:38:17,980 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7543_210_6753_1_1528544010_1110402_25-183860-0 from training. Duration: 0.47 2023-10-02 01:38:18,125 WARNING [train.py:1204] (2/4) Exclude cut with ID 2411-132532-0017-82279-0_sp1.1 from training. Duration: 0.9681875 2023-10-02 01:38:18,125 WARNING [train.py:1204] (2/4) Exclude cut with ID _14170_210_5219_1_1530525949868_4469650_272-249985-0_sp0.9 from training. Duration: 0.7444375 2023-10-02 01:38:18,189 WARNING [train.py:1204] (2/4) Exclude cut with ID _55644_210_11856_1_1534593599582_4103719_320-229006-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 01:38:18,242 WARNING [train.py:1204] (2/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_177-100554-0_sp1.1 from training. Duration: 0.963625 2023-10-02 01:38:18,244 WARNING [train.py:1204] (2/4) Exclude cut with ID _14998_210_5219_1_1530705582080_7474970_787-294416-0_sp1.1 from training. Duration: 0.7454375 2023-10-02 01:38:18,590 WARNING [train.py:1204] (2/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_318-59097-0 from training. Duration: 0.76 2023-10-02 01:38:18,680 WARNING [train.py:1204] (2/4) Exclude cut with ID _12369_210_4381_1_1530356078581_4388520_480-13146-0_sp1.1 from training. Duration: 0.77275 2023-10-02 01:38:19,395 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_469-338558-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 01:38:19,462 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_40896_210_15710_1_1533433956_7100802_367-126897-0_sp1.1 from training. Duration: 0.963625 2023-10-02 01:38:19,479 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6545_210_2040_1_1527774670_4245258_379-14182-0 from training. Duration: 0.8 2023-10-02 01:38:19,499 WARNING [train.py:1204] (2/4) Exclude cut with ID _21631_210_2253_1_1531907880778_3160339_197-79498-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 01:38:19,503 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_12868_210_6753_1_1530702479_3610564_416-293746-0_sp1.1 from training. Duration: 0.8181875 2023-10-02 01:38:19,541 WARNING [train.py:1204] (2/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_52-211683-0 from training. Duration: 0.9 2023-10-02 01:38:19,850 WARNING [train.py:1204] (2/4) Exclude cut with ID _48678_210_5663_1_1533988830947_3769199_306-255886-0_sp0.9 from training. Duration: 0.8555625 2023-10-02 01:38:20,075 WARNING [train.py:1204] (2/4) Exclude cut with ID _65134_210_14763_1_1535335393985_3505368_296-24733-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 01:38:20,410 WARNING [train.py:1204] (2/4) Exclude cut with ID _14898_210_9206_1_1530766812377_4177190_335-99364-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 01:38:20,503 WARNING [train.py:1204] (2/4) Exclude cut with ID _54871_210_22356_1_1534665798253_3856359_19-342632-0 from training. Duration: 0.91 2023-10-02 01:38:20,524 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_53071_210_16554_1_1534490405_2954066_265-170472-0 from training. Duration: 0.83 2023-10-02 01:38:20,796 WARNING [train.py:1204] (2/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_189-298172-0_sp1.1 from training. Duration: 0.963625 2023-10-02 01:38:21,040 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7615_210_4211_1_1528110965_4580160_72-235211-0 from training. Duration: 0.76 2023-10-02 01:38:21,094 WARNING [train.py:1204] (2/4) Exclude cut with ID _47790_210_16187_1_1533862814066_4207519_155-90874-0_sp1.1 from training. Duration: 0.936375 2023-10-02 01:38:21,426 WARNING [train.py:1204] (2/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_164-216310-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 01:38:21,531 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_46013_210_7234_1_1533776362_3744032_463-52378-0 from training. Duration: 0.86 2023-10-02 01:38:21,534 WARNING [train.py:1204] (2/4) Exclude cut with ID _61133_210_9969_1_1535196777227_3696409_333-270325-0 from training. Duration: 0.93 2023-10-02 01:38:21,908 WARNING [train.py:1204] (2/4) Exclude cut with ID _5172_210_1883_1_1527246062567_3577219_18-101380-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 01:38:21,916 WARNING [train.py:1204] (2/4) Exclude cut with ID _30800_210_22382_1_1532499347904_4515959_577-236858-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 01:38:22,030 WARNING [train.py:1204] (2/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_1006-177345-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 01:38:22,255 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6291_210_3273_1_1527683649_1078498_4-266348-0 from training. Duration: 0.78 2023-10-02 01:38:22,524 WARNING [train.py:1204] (2/4) Exclude cut with ID _55857_210_2346_1_1534644007831_6898209_745-98391-0 from training. Duration: 0.76 2023-10-02 01:38:22,660 WARNING [train.py:1204] (2/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_375-244976-0 from training. Duration: 0.75 2023-10-02 01:38:22,695 WARNING [train.py:1204] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_4-248404-0_sp1.1 from training. Duration: 0.97275 2023-10-02 01:38:22,822 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8453_210_4211_1_1528502940_8562730_596-306820-0 from training. Duration: 0.97 2023-10-02 01:38:22,938 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_211-339622-0_sp0.9 from training. Duration: 0.5 2023-10-02 01:38:22,965 WARNING [train.py:1204] (2/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_393-57888-0 from training. Duration: 0.64 2023-10-02 01:38:23,688 WARNING [train.py:1204] (2/4) Exclude cut with ID _23825_210_13449_1_1531915313526_3497150_143-343575-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 01:38:23,754 WARNING [train.py:1204] (2/4) Exclude cut with ID _58605_210_12210_1_1534723195939_3904259_36-12256-0_sp1.1 from training. Duration: 0.936375 2023-10-02 01:38:24,374 WARNING [train.py:1204] (2/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_38-67795-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 01:38:24,425 WARNING [train.py:1204] (2/4) Exclude cut with ID _64631_210_27767_1_1535349580068_4165160_308-327832-0_sp1.1 from training. Duration: 0.92725 2023-10-02 01:38:24,483 WARNING [train.py:1204] (2/4) Exclude cut with ID _37086_210_9038_1_1532999540891_7021250_726-81176-0_sp1.1 from training. Duration: 0.936375 2023-10-02 01:38:24,537 WARNING [train.py:1204] (2/4) Exclude cut with ID _47790_210_16187_1_1533862814066_4207519_47-141521-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 01:38:24,541 WARNING [train.py:1204] (2/4) Exclude cut with ID _3067_210_2667_1_1526212974640_4503168_250-285937-0 from training. Duration: 0.8 2023-10-02 01:38:24,577 WARNING [train.py:1204] (2/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_239-24000-0_sp1.1 from training. Duration: 0.97275 2023-10-02 01:38:24,634 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_3371_210_1794_1_1526384462_5580777_396-339812-0_sp0.9 from training. Duration: 0.9444375 2023-10-02 01:38:24,684 WARNING [train.py:1204] (2/4) Exclude cut with ID _16909_210_5724_1_1531292079912_4118919_580-173454-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 01:38:24,736 WARNING [train.py:1204] (2/4) Exclude cut with ID IC0124W0002-61308 from training. Duration: 0.982 2023-10-02 01:38:24,981 WARNING [train.py:1204] (2/4) Exclude cut with ID _31602_210_5854_1_1532677021456_4458250_249-96604-0 from training. Duration: 0.89 2023-10-02 01:38:25,071 WARNING [train.py:1204] (2/4) Exclude cut with ID _69584_210_5219_1_1535785133775_4044450_325-7656-0 from training. Duration: 0.86 2023-10-02 01:38:25,208 WARNING [train.py:1204] (2/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_478-162247-0 from training. Duration: 0.82 2023-10-02 01:38:25,360 WARNING [train.py:1204] (2/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_686-54383-0_sp1.1 from training. Duration: 0.7909375 2023-10-02 01:38:25,826 WARNING [train.py:1204] (2/4) Exclude cut with ID _29303_210_2575_1_1532392237303_7410649_85-270092-0_sp1.1 from training. Duration: 0.936375 2023-10-02 01:38:25,845 WARNING [train.py:1204] (2/4) Exclude cut with ID _2322_210_2667_1_1525604548971_3656299_440-80494-0_sp1.1 from training. Duration: 0.8 2023-10-02 01:38:25,874 WARNING [train.py:1204] (2/4) Exclude cut with ID _47346_210_9476_1_1533815971553_3536160_201-40644-0_sp1.1 from training. Duration: 0.936375 2023-10-02 01:38:25,981 WARNING [train.py:1204] (2/4) Exclude cut with ID _56235_210_10640_1_1534557335235_4161099_343-139898-0_sp1.1 from training. Duration: 0.87275 2023-10-02 01:38:26,131 WARNING [train.py:1204] (2/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_1012-279473-0 from training. Duration: 0.67 2023-10-02 01:38:26,229 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7931_210_1794_1_1528594728_5564059_280-279560-0_sp1.1 from training. Duration: 0.8909375 2023-10-02 01:38:26,279 WARNING [train.py:1204] (2/4) Exclude cut with ID _8263_210_1664_1_1528439452891_970220_68-181805-0 from training. Duration: 0.64 2023-10-02 01:38:26,321 WARNING [train.py:1204] (2/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_605-133395-0 from training. Duration: 0.66 2023-10-02 01:38:26,436 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_43-171109-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 01:38:26,439 WARNING [train.py:1204] (2/4) Exclude cut with ID _40094_210_16380_1_1533294005773_3641159_107-280739-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 01:38:26,604 WARNING [train.py:1204] (2/4) Exclude cut with ID _14821_210_8750_1_1530680503991_6829359_2-227151-0_sp0.9 from training. Duration: 0.92225 2023-10-02 01:38:26,782 WARNING [train.py:1204] (2/4) Exclude cut with ID _14268_210_9206_1_1530622771285_4714099_331-105351-0_sp1.1 from training. Duration: 0.5909375 2023-10-02 01:38:26,937 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_26326_210_7925_1_1533002493_3592406_451-82741-0_sp1.1 from training. Duration: 0.97275 2023-10-02 01:38:27,247 WARNING [train.py:1204] (2/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_191-59784-0_sp0.9 from training. Duration: 0.97775 2023-10-02 01:38:27,974 WARNING [train.py:1204] (2/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_445-117398-0_sp1.1 from training. Duration: 0.8090625 2023-10-02 01:38:28,125 WARNING [train.py:1204] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_605-257205-0_sp0.9 from training. Duration: 0.6555625 2023-10-02 01:38:28,216 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_50050_210_16223_1_1535435796_7290138_592-258841-0_sp1.1 from training. Duration: 0.936375 2023-10-02 01:38:28,374 WARNING [train.py:1204] (2/4) Exclude cut with ID _14348_210_8341_1_1530599434996_3778640_240-232370-0_sp0.9 from training. Duration: 0.9333125 2023-10-02 01:38:28,524 WARNING [train.py:1204] (2/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_239-316802-0_sp0.9 from training. Duration: 0.7666875 2023-10-02 01:38:28,661 WARNING [train.py:1204] (2/4) Exclude cut with ID _45560_210_21982_1_1533812603951_3584999_111-6376-0_sp1.1 from training. Duration: 0.763625 2023-10-02 01:38:28,880 WARNING [train.py:1204] (2/4) Exclude cut with ID _54189_210_16380_1_1534332670039_3911710_606-311481-0_sp1.1 from training. Duration: 0.963625 2023-10-02 01:38:28,951 WARNING [train.py:1204] (2/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_237-332027-0_sp1.1 from training. Duration: 0.863625 2023-10-02 01:38:28,973 WARNING [train.py:1204] (2/4) Exclude cut with ID _66070_210_7640_1_1535441393096_7805310_489-292631-0_sp0.9 from training. Duration: 0.87775 2023-10-02 01:38:29,051 WARNING [train.py:1204] (2/4) Exclude cut with ID _49728_210_12651_1_1534041034294_6788150_624-195301-0_sp1.1 from training. Duration: 0.77275 2023-10-02 01:38:29,092 WARNING [train.py:1204] (2/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_44-79427-0 from training. Duration: 0.98 2023-10-02 01:38:29,161 WARNING [train.py:1204] (2/4) Exclude cut with ID _35350_210_16040_1_1533040191183_4075299_490-198354-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 01:38:29,200 WARNING [train.py:1204] (2/4) Exclude cut with ID _5166_210_2084_1_1527073197160_4087993_309-222545-0 from training. Duration: 0.84 2023-10-02 01:38:29,402 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_13091_210_7925_1_1530609447_8987354_790-199469-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 01:38:29,408 WARNING [train.py:1204] (2/4) Exclude cut with ID _21632_210_2253_1_1531994001765_3744140_110-274654-0 from training. Duration: 0.82 2023-10-02 01:38:29,413 WARNING [train.py:1204] (2/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_498-40153-0_sp0.9 from training. Duration: 0.7 2023-10-02 01:38:29,935 WARNING [train.py:1204] (2/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_363-282909-0_sp1.1 from training. Duration: 0.936375 2023-10-02 01:38:29,996 WARNING [train.py:1204] (2/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_178-177582-0_sp0.9 from training. Duration: 0.82225 2023-10-02 01:38:30,013 WARNING [train.py:1204] (2/4) Exclude cut with ID _24612_210_8913_1_1531998054193_3806359_357-35487-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 01:38:30,249 WARNING [train.py:1204] (2/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_138-332773-0 from training. Duration: 0.81 2023-10-02 01:38:30,270 WARNING [train.py:1204] (2/4) Exclude cut with ID _7624_210_1531_1_1528109802198_4933101_418-349834-0 from training. Duration: 0.6 2023-10-02 01:38:30,301 WARNING [train.py:1204] (2/4) Exclude cut with ID _7537_210_2084_1_1528502395431_3924359_14-312906-0_sp0.9 from training. Duration: 0.9333125 2023-10-02 01:38:30,526 WARNING [train.py:1204] (2/4) Exclude cut with ID _60057_210_10640_1_1534903065793_3333150_362-230377-0 from training. Duration: 0.87 2023-10-02 01:38:30,777 WARNING [train.py:1204] (2/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_138-64210-0 from training. Duration: 0.73 2023-10-02 01:38:30,848 WARNING [train.py:1204] (2/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_454-339504-0_sp1.1 from training. Duration: 0.97275 2023-10-02 01:38:31,096 WARNING [train.py:1204] (2/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_508-335369-0_sp0.9 from training. Duration: 0.5333125 2023-10-02 01:38:31,238 WARNING [train.py:1204] (2/4) Exclude cut with ID _5608_210_2106_1_1527415439089_6819768_909-82767-0_sp0.9 from training. Duration: 0.67775 2023-10-02 01:38:31,262 WARNING [train.py:1204] (2/4) Exclude cut with ID _6694_210_4599_1_1527852637490_3965529_99-11611-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 01:38:31,328 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_2655_210_2087_1_1525688959_3897400_52-279899-0_sp1.1 from training. Duration: 0.62725 2023-10-02 01:38:31,600 WARNING [train.py:1204] (2/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_375-245677-0_sp1.1 from training. Duration: 0.736375 2023-10-02 01:38:32,467 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_23850_210_4238_1_1532232597_1632433_32-185135-0 from training. Duration: 0.86 2023-10-02 01:38:32,609 WARNING [train.py:1204] (2/4) Exclude cut with ID _15113_210_8254_1_1530842438579_3716279_209-54533-0 from training. Duration: 0.96 2023-10-02 01:38:32,637 WARNING [train.py:1204] (2/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_401-53423-0 from training. Duration: 0.55 2023-10-02 01:38:32,678 WARNING [train.py:1204] (2/4) Exclude cut with ID _14867_210_1968_1_1530684179890_3846969_319-250868-0_sp1.1 from training. Duration: 0.9 2023-10-02 01:38:32,701 WARNING [train.py:1204] (2/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_106-30782-0_sp0.9 from training. Duration: 0.888875 2023-10-02 01:38:32,825 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_5960_210_1794_1_1527403865_3990999_515-2613-0 from training. Duration: 0.88 2023-10-02 01:38:33,102 WARNING [train.py:1204] (2/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_798-106179-0_sp0.9 from training. Duration: 0.92225 2023-10-02 01:38:33,257 WARNING [train.py:1204] (2/4) Exclude cut with ID _14227_210_4381_1_1530701804286_3940259_125-275642-0_sp1.1 from training. Duration: 0.9 2023-10-02 01:38:33,259 WARNING [train.py:1204] (2/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_79-200392-0_sp1.1 from training. Duration: 0.8818125 2023-10-02 01:38:33,310 WARNING [train.py:1204] (2/4) Exclude cut with ID _48836_210_12210_1_1534075848293_3904150_429-35475-0_sp0.9 from training. Duration: 0.988875 2023-10-02 01:38:33,324 WARNING [train.py:1204] (2/4) Exclude cut with ID _69884_210_9771_1_1535846392818_4166169_224-172517-0_sp1.1 from training. Duration: 0.97275 2023-10-02 01:38:33,624 WARNING [train.py:1204] (2/4) Exclude cut with ID _15113_210_8254_1_1530842438579_3716279_76-232507-0_sp1.1 from training. Duration: 0.97275 2023-10-02 01:38:33,636 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_135-5501-0 from training. Duration: 0.98 2023-10-02 01:38:33,805 WARNING [train.py:1204] (2/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_563-347832-0_sp0.9 from training. Duration: 0.988875 2023-10-02 01:38:33,815 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6786_210_1388_1_1527839699_4194495_273-149841-0 from training. Duration: 0.92 2023-10-02 01:38:33,833 WARNING [train.py:1204] (2/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_172-215932-0 from training. Duration: 0.66 2023-10-02 01:38:33,912 WARNING [train.py:1204] (2/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_939-132910-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 01:38:34,198 WARNING [train.py:1204] (2/4) Exclude cut with ID _49371_210_12071_1_1533952761021_3812190_16-246001-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 01:38:34,608 WARNING [train.py:1204] (2/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_680-189165-0_sp1.1 from training. Duration: 0.936375 2023-10-02 01:38:34,934 WARNING [train.py:1204] (2/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_53-300544-0_sp1.1 from training. Duration: 0.6090625 2023-10-02 01:38:35,017 WARNING [train.py:1204] (2/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_540-275685-0_sp1.1 from training. Duration: 0.8090625 2023-10-02 01:38:35,037 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_38965_210_4852_1_1533174906_3484735_453-106579-0_sp1.1 from training. Duration: 0.92725 2023-10-02 01:38:35,087 WARNING [train.py:1204] (2/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_105-328174-0 from training. Duration: 0.76 2023-10-02 01:38:35,193 WARNING [train.py:1204] (2/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_279-126205-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 01:38:35,368 WARNING [train.py:1204] (2/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_5-55893-0_sp0.9 from training. Duration: 0.611125 2023-10-02 01:38:35,539 WARNING [train.py:1204] (2/4) Exclude cut with ID _13678_210_7497_1_1530530069226_3463170_141-10835-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 01:38:35,725 WARNING [train.py:1204] (2/4) Exclude cut with ID _22083_210_8254_1_1531702862439_3678970_201-85738-0_sp1.1 from training. Duration: 0.8181875 2023-10-02 01:38:35,753 WARNING [train.py:1204] (2/4) Exclude cut with ID _4697_210_2069_1_1526815801953_3823098_51-1049-0 from training. Duration: 0.75 2023-10-02 01:38:35,838 WARNING [train.py:1204] (2/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_206-10097-0_sp0.9 from training. Duration: 0.5444375 2023-10-02 01:38:35,944 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_4528_210_3273_1_1527076654_3276777_360-63661-0_sp1.1 from training. Duration: 0.92725 2023-10-02 01:38:35,958 WARNING [train.py:1204] (2/4) Exclude cut with ID _64086_210_7994_1_1535270417019_7435949_449-31567-0_sp1.1 from training. Duration: 0.8818125 2023-10-02 01:38:36,795 WARNING [train.py:1204] (2/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_245-174553-0_sp0.9 from training. Duration: 0.97775 2023-10-02 01:38:36,944 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6545_210_2040_1_1527774670_4245258_379-14182-0_sp1.1 from training. Duration: 0.72725 2023-10-02 01:38:37,277 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_3133_210_2087_1_1526106446_2324690_256-206070-0_sp1.1 from training. Duration: 0.963625 2023-10-02 01:38:37,514 WARNING [train.py:1204] (2/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_278-104676-0 from training. Duration: 0.98 2023-10-02 01:38:37,587 WARNING [train.py:1204] (2/4) Exclude cut with ID IC0671W0065-331908 from training. Duration: 0.896 2023-10-02 01:38:37,642 WARNING [train.py:1204] (2/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_244-129917-0_sp1.1 from training. Duration: 0.97275 2023-10-02 01:38:37,818 WARNING [train.py:1204] (2/4) Exclude cut with ID _7852_210_5219_1_1528286471316_8139400_1129-170857-0_sp1.1 from training. Duration: 0.97275 2023-10-02 01:38:37,890 WARNING [train.py:1204] (2/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_443-30034-0_sp0.9 from training. Duration: 0.711125 2023-10-02 01:38:37,937 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_31583_210_3852_1_1532595851_4024413_250-332999-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 01:38:38,036 WARNING [train.py:1204] (2/4) Exclude cut with ID _8772_210_1850_1_1528635595972_3664011_247-336448-0 from training. Duration: 0.74 2023-10-02 01:38:38,090 WARNING [train.py:1204] (2/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_567-135114-0_sp0.9 from training. Duration: 0.9666875 2023-10-02 01:38:38,094 WARNING [train.py:1204] (2/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_557-349534-0_sp1.1 from training. Duration: 0.82725 2023-10-02 01:38:38,115 WARNING [train.py:1204] (2/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_1341-26728-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 01:38:38,169 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_40222_210_6228_1_1533385298_4481939_375-328051-0_sp1.1 from training. Duration: 0.97275 2023-10-02 01:38:38,201 WARNING [train.py:1204] (2/4) Exclude cut with ID _4697_210_2069_1_1526815801953_3823098_276-96593-0 from training. Duration: 0.77 2023-10-02 01:38:38,604 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_53071_210_16554_1_1534490405_2954066_265-170472-0_sp0.9 from training. Duration: 0.92225 2023-10-02 01:38:38,627 WARNING [train.py:1204] (2/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_365-1492-0 from training. Duration: 0.91 2023-10-02 01:38:38,797 WARNING [train.py:1204] (2/4) Exclude cut with ID _66873_210_5517_1_1535454136506_4293370_654-52713-0_sp0.9 from training. Duration: 0.8555625 2023-10-02 01:38:39,142 WARNING [train.py:1204] (2/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_113-92608-0_sp1.1 from training. Duration: 0.4909375 2023-10-02 01:38:39,372 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_46013_210_7234_1_1533776362_3744032_50-141535-0 from training. Duration: 0.99 2023-10-02 01:38:39,387 WARNING [train.py:1204] (2/4) Exclude cut with ID _13942_210_9988_1_1530604906036_3733360_499-55676-0_sp1.1 from training. Duration: 0.563625 2023-10-02 01:38:39,534 WARNING [train.py:1204] (2/4) Exclude cut with ID _30800_210_22382_1_1532499347904_4515959_375-65143-0_sp1.1 from training. Duration: 0.97275 2023-10-02 01:38:39,725 WARNING [train.py:1204] (2/4) Exclude cut with ID _16756_210_4915_1_1531047803763_7104259_199-325608-0_sp1.1 from training. Duration: 0.92725 2023-10-02 01:38:39,755 WARNING [train.py:1204] (2/4) Exclude cut with ID _31649_210_6228_1_1532584608063_4091078_443-124391-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 01:38:39,869 WARNING [train.py:1204] (2/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_146-218763-0 from training. Duration: 0.86 2023-10-02 01:38:39,930 WARNING [train.py:1204] (2/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_567-135114-0_sp1.1 from training. Duration: 0.7909375 2023-10-02 01:38:39,971 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_26326_210_7925_1_1533002493_3592406_462-288299-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 01:38:40,076 WARNING [train.py:1204] (2/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_322-217220-0 from training. Duration: 0.86 2023-10-02 01:38:40,114 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_238-256668-0_sp1.1 from training. Duration: 0.62725 2023-10-02 01:38:40,162 WARNING [train.py:1204] (2/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_371-18169-0 from training. Duration: 0.86 2023-10-02 01:38:40,267 WARNING [train.py:1204] (2/4) Exclude cut with ID _58480_210_12392_1_1534811121111_3646160_23-74512-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 01:38:40,271 WARNING [train.py:1204] (2/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_784-213099-0_sp1.1 from training. Duration: 0.8454375 2023-10-02 01:38:40,982 WARNING [train.py:1204] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_625-102955-0_sp1.1 from training. Duration: 0.7 2023-10-02 01:38:41,338 WARNING [train.py:1204] (2/4) Exclude cut with ID _49748_210_24116_1_1533986971737_3158910_176-174327-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 01:38:41,530 WARNING [train.py:1204] (2/4) Exclude cut with ID _50310_210_15488_1_1534068043345_3641080_432-96140-0_sp1.1 from training. Duration: 0.936375 2023-10-02 01:38:41,686 WARNING [train.py:1204] (2/4) Exclude cut with ID _40402_210_18219_1_1533522324115_2953150_76-14910-0_sp1.1 from training. Duration: 0.92725 2023-10-02 01:38:41,714 WARNING [train.py:1204] (2/4) Exclude cut with ID _21783_210_11357_1_1531742458730_3762679_40-19458-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 01:38:41,745 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_14056_210_4163_1_1530604483_7572827_46-93547-0 from training. Duration: 0.95 2023-10-02 01:38:41,782 WARNING [train.py:1204] (2/4) Exclude cut with ID _21631_210_2253_1_1531907880778_3160339_321-35325-0_sp1.1 from training. Duration: 0.963625 2023-10-02 01:38:41,984 WARNING [train.py:1204] (2/4) Exclude cut with ID _4779_210_2084_1_1527318022944_3914893_500-285013-0 from training. Duration: 0.69 2023-10-02 01:38:42,088 WARNING [train.py:1204] (2/4) Exclude cut with ID _43552_210_8913_1_1533726031059_3523429_301-93324-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 01:38:42,094 WARNING [train.py:1204] (2/4) Exclude cut with ID _6473_210_1741_1_1527901307396_3815380_108-17335-0_sp1.1 from training. Duration: 0.5909375 2023-10-02 01:38:42,553 WARNING [train.py:1204] (2/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_206-10097-0 from training. Duration: 0.49 2023-10-02 01:38:42,731 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_36756_210_7234_1_1533026991_966276_103-77513-0_sp1.1 from training. Duration: 0.97275 2023-10-02 01:38:42,990 WARNING [train.py:1204] (2/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_662-18193-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 01:38:43,060 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_26433_210_12108_1_1532157158_4056480_439-52438-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 01:38:43,109 WARNING [train.py:1204] (2/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_464-33931-0 from training. Duration: 0.54 2023-10-02 01:38:43,123 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_3474_210_2028_1_1526188513_4825246_145-309131-0_sp0.9 from training. Duration: 0.82225 2023-10-02 01:38:43,139 WARNING [train.py:1204] (2/4) Exclude cut with ID _9679_210_7497_1_1529129212906_4363929_446-308321-0_sp1.1 from training. Duration: 0.963625 2023-10-02 01:38:43,435 WARNING [train.py:1204] (2/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_662-275621-0 from training. Duration: 0.42 2023-10-02 01:38:43,530 WARNING [train.py:1204] (2/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_344-337148-0_sp1.1 from training. Duration: 0.97275 2023-10-02 01:38:43,660 WARNING [train.py:1204] (2/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_759-179071-0_sp1.1 from training. Duration: 0.92725 2023-10-02 01:38:44,328 WARNING [train.py:1204] (2/4) Exclude cut with ID _28783_210_7943_1_1532671164808_7155479_293-12188-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 01:38:45,478 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_17613_210_2151_1_1531137569_2366160_235-320304-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 01:38:45,493 WARNING [train.py:1204] (2/4) Exclude cut with ID _14349_210_8341_1_1530615710818_3442470_170-336541-0 from training. Duration: 0.86 2023-10-02 01:38:45,566 WARNING [train.py:1204] (2/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_375-218677-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 01:38:45,599 WARNING [train.py:1204] (2/4) Exclude cut with ID _28317_210_12742_1_1532480302381_8510325_829-202956-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 01:38:46,040 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9029W0436-509823 from training. Duration: 0.768 2023-10-02 01:38:46,170 WARNING [train.py:1204] (2/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_231-112214-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 01:38:46,672 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9059W0470-524883 from training. Duration: 0.3415625 2023-10-02 01:38:46,972 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_43266_210_1794_1_1533521789_4497218_206-79512-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 01:38:47,091 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_294-163793-0_sp0.9 from training. Duration: 0.911125 2023-10-02 01:38:47,118 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6290_210_3273_1_1527595224_1282787_115-87095-0_sp1.1 from training. Duration: 0.5 2023-10-02 01:38:47,153 WARNING [train.py:1204] (2/4) Exclude cut with ID _14100_210_8341_1_1530529320613_3678069_279-55760-0_sp1.1 from training. Duration: 0.8454375 2023-10-02 01:38:47,399 WARNING [train.py:1204] (2/4) Exclude cut with ID _14621_210_1925_1_1530686901185_3675420_419-234200-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 01:38:47,779 WARNING [train.py:1204] (2/4) Exclude cut with ID _60229_210_27767_1_1534838406158_3655350_353-213656-0_sp1.1 from training. Duration: 0.863625 2023-10-02 01:38:47,791 WARNING [train.py:1204] (2/4) Exclude cut with ID _48678_210_5663_1_1533988830947_3769199_306-255886-0_sp1.1 from training. Duration: 0.7 2023-10-02 01:38:47,941 WARNING [train.py:1204] (2/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_466-3492-0_sp1.1 from training. Duration: 0.97275 2023-10-02 01:38:47,942 WARNING [train.py:1204] (2/4) Exclude cut with ID _8755_210_1876_1_1528628559357_3258209_199-242167-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 01:38:48,013 WARNING [train.py:1204] (2/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_237-89684-0_sp1.1 from training. Duration: 0.87275 2023-10-02 01:38:48,157 WARNING [train.py:1204] (2/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_70-84121-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 01:38:48,300 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7544_210_6753_1_1528587722_3576686_172-171750-0_sp1.1 from training. Duration: 0.5909375 2023-10-02 01:38:48,406 WARNING [train.py:1204] (2/4) Exclude cut with ID _24271_210_12658_1_1531987270022_7190519_40-321822-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 01:38:48,424 WARNING [train.py:1204] (2/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_227-83612-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 01:38:48,583 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9156W0015-572508 from training. Duration: 0.896 2023-10-02 01:38:48,771 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_292-265928-0 from training. Duration: 0.51 2023-10-02 01:38:49,033 WARNING [train.py:1204] (2/4) Exclude cut with ID _14747_210_8341_1_1530685797463_3654089_147-131229-0_sp1.1 from training. Duration: 0.6909375 2023-10-02 01:38:49,786 WARNING [train.py:1204] (2/4) Exclude cut with ID _45297_210_2721_1_1533715351381_3264949_351-147136-0_sp0.9 from training. Duration: 0.9666875 2023-10-02 01:38:49,946 WARNING [train.py:1204] (2/4) Exclude cut with ID _5250_210_5219_1_1527249561806_8692110_1102-324568-0_sp1.1 from training. Duration: 0.97275 2023-10-02 01:38:50,102 WARNING [train.py:1204] (2/4) Exclude cut with ID _18516_210_9206_1_1531704653206_3692406_465-75446-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 01:38:50,104 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9203W0126-595623 from training. Duration: 0.896 2023-10-02 01:38:50,190 WARNING [train.py:1204] (2/4) Exclude cut with ID _55383_210_10635_1_1534468426172_2231669_139-328410-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 01:38:50,793 WARNING [train.py:1204] (2/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_166-246557-0_sp0.9 from training. Duration: 0.711125 2023-10-02 01:38:50,797 WARNING [train.py:1204] (2/4) Exclude cut with ID _30039_210_13270_1_1532484612900_2725150_239-304872-0_sp1.1 from training. Duration: 0.963625 2023-10-02 01:38:50,832 WARNING [train.py:1204] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_6-226750-0 from training. Duration: 0.94 2023-10-02 01:38:50,906 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_135-5501-0_sp1.1 from training. Duration: 0.8909375 2023-10-02 01:38:51,184 WARNING [train.py:1204] (2/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_359-118926-0 from training. Duration: 0.72 2023-10-02 01:38:51,435 WARNING [train.py:1204] (2/4) Exclude cut with ID _8064_210_5399_1_1528372299291_4282973_4-91037-0 from training. Duration: 0.88 2023-10-02 01:38:51,472 WARNING [train.py:1204] (2/4) Exclude cut with ID _6653_210_2629_1_1527919172441_6732679_415-288492-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 01:38:51,557 WARNING [train.py:1204] (2/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_544-287179-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 01:38:51,660 WARNING [train.py:1204] (2/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_122-32606-0_sp1.1 from training. Duration: 0.92725 2023-10-02 01:38:51,733 WARNING [train.py:1204] (2/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_121-284152-0_sp1.1 from training. Duration: 0.536375 2023-10-02 01:38:51,858 WARNING [train.py:1204] (2/4) Exclude cut with ID _44718_210_5816_1_1534050051869_6469030_350-299477-0_sp1.1 from training. Duration: 0.97275 2023-10-02 01:38:51,876 WARNING [train.py:1204] (2/4) Exclude cut with ID _6653_210_2629_1_1527919172441_6732679_1103-56445-0_sp0.9 from training. Duration: 0.811125 2023-10-02 01:38:51,947 WARNING [train.py:1204] (2/4) Exclude cut with ID _12369_210_4381_1_1530356078581_4388520_64-298917-0_sp0.9 from training. Duration: 0.97775 2023-10-02 01:38:51,996 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_2245_210_2715_1_1525511548_3540254_310-52483-0 from training. Duration: 0.88 2023-10-02 01:38:52,053 WARNING [train.py:1204] (2/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_401-158239-0_sp1.1 from training. Duration: 0.6454375 2023-10-02 01:38:52,102 WARNING [train.py:1204] (2/4) Exclude cut with ID _59502_210_12210_1_1535107024698_1835139_146-162495-0_sp1.1 from training. Duration: 0.836375 2023-10-02 01:38:52,122 WARNING [train.py:1204] (2/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_41-31157-0_sp0.9 from training. Duration: 0.6333125 2023-10-02 01:38:52,305 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_68717_210_3528_1_1535628683_1556759_95-36722-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 01:38:52,312 WARNING [train.py:1204] (2/4) Exclude cut with ID _30196_210_1664_1_1533708496522_7074140_654-162611-0_sp1.1 from training. Duration: 0.92725 2023-10-02 01:38:52,619 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_9302_210_2550_1_1528889962_4655416_638-301620-0_sp0.9 from training. Duration: 0.52225 2023-10-02 01:38:52,768 WARNING [train.py:1204] (2/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_478-162247-0_sp1.1 from training. Duration: 0.7454375 2023-10-02 01:38:52,972 WARNING [train.py:1204] (2/4) Exclude cut with ID _45297_210_2721_1_1533715351381_3264949_353-9541-0 from training. Duration: 0.97 2023-10-02 01:38:53,278 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9653W0190-675468 from training. Duration: 0.9935 2023-10-02 01:38:53,403 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_49730_210_13049_1_1534075294_1830676_214-84436-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 01:38:54,192 WARNING [train.py:1204] (2/4) Exclude cut with ID _50093_210_4220_1_1534417141207_3432299_259-322252-0_sp1.1 from training. Duration: 0.836375 2023-10-02 01:38:54,300 WARNING [train.py:1204] (2/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_599-50220-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 01:38:54,415 WARNING [train.py:1204] (2/4) Exclude cut with ID _21878_210_3525_1_1531738772002_7264330_174-109016-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 01:38:54,598 WARNING [train.py:1204] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_665-196155-0 from training. Duration: 0.67 2023-10-02 01:38:54,872 WARNING [train.py:1204] (2/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_656-188361-0_sp1.1 from training. Duration: 0.7909375 2023-10-02 01:38:54,945 WARNING [train.py:1204] (2/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_60-18123-0_sp1.1 from training. Duration: 0.8 2023-10-02 01:38:54,999 WARNING [train.py:1204] (2/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_535-14410-0 from training. Duration: 0.47 2023-10-02 01:38:55,137 WARNING [train.py:1204] (2/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_94-126128-0_sp1.1 from training. Duration: 0.7818125 2023-10-02 01:38:55,231 WARNING [train.py:1204] (2/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_356-31216-0_sp1.1 from training. Duration: 0.6818125 2023-10-02 01:38:55,500 WARNING [train.py:1204] (2/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_443-30034-0_sp1.1 from training. Duration: 0.5818125 2023-10-02 01:38:55,581 WARNING [train.py:1204] (2/4) Exclude cut with ID _26907_210_7234_1_1532221740694_3205263_337-2494-0_sp1.1 from training. Duration: 0.8 2023-10-02 01:38:55,674 WARNING [train.py:1204] (2/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_49-97158-0_sp0.9 from training. Duration: 0.8444375 2023-10-02 01:38:55,691 WARNING [train.py:1204] (2/4) Exclude cut with ID _7877_210_6014_1_1528286358099_3750136_553-170128-0_sp1.1 from training. Duration: 0.963625 2023-10-02 01:38:55,863 WARNING [train.py:1204] (2/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_202-293523-0_sp0.9 from training. Duration: 0.8 2023-10-02 01:38:56,141 WARNING [train.py:1204] (2/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_188-171673-0_sp0.9 from training. Duration: 0.6444375 2023-10-02 01:38:56,155 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_120-41668-0_sp1.1 from training. Duration: 0.636375 2023-10-02 01:38:56,159 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_40223_210_6228_1_1533298404_4812267_613-78769-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 01:38:56,289 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_53861_210_5399_1_1534422156_4272077_150-336899-0_sp1.1 from training. Duration: 0.963625 2023-10-02 01:38:56,353 WARNING [train.py:1204] (2/4) Exclude cut with ID _14459_210_2228_1_1531443641946_7516700_1146-56843-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 01:38:56,558 WARNING [train.py:1204] (2/4) Exclude cut with ID _39867_210_9826_1_1533553222030_9344259_1194-299928-0_sp1.1 from training. Duration: 0.92725 2023-10-02 01:38:56,650 WARNING [train.py:1204] (2/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_214-291369-0_sp0.9 from training. Duration: 0.7 2023-10-02 01:38:56,980 WARNING [train.py:1204] (2/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_358-215558-0 from training. Duration: 0.93 2023-10-02 01:38:57,046 WARNING [train.py:1204] (2/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_577-287150-0_sp0.9 from training. Duration: 0.911125 2023-10-02 01:38:57,250 WARNING [train.py:1204] (2/4) Exclude cut with ID _60860_210_8254_1_1534896015875_3557169_229-187367-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 01:38:57,305 WARNING [train.py:1204] (2/4) Exclude cut with ID _6275_210_2084_1_1528110062213_7669049_857-298942-0 from training. Duration: 0.85 2023-10-02 01:38:57,343 WARNING [train.py:1204] (2/4) Exclude cut with ID _40137_210_12210_1_1533290484046_3795180_249-187059-0_sp1.1 from training. Duration: 0.8 2023-10-02 01:38:57,360 WARNING [train.py:1204] (2/4) Exclude cut with ID ID0171W0508-765603 from training. Duration: 0.541 2023-10-02 01:38:57,384 WARNING [train.py:1204] (2/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_521-219439-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 01:38:57,397 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_40223_210_6228_1_1533298404_4812267_741-302616-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 01:38:57,732 WARNING [train.py:1204] (2/4) Exclude cut with ID _65293_210_9070_1_1535545549765_4004210_438-154735-0_sp1.1 from training. Duration: 0.92725 2023-10-02 01:38:58,595 WARNING [train.py:1204] (2/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_564-132950-0_sp1.1 from training. Duration: 0.92725 2023-10-02 01:38:58,628 WARNING [train.py:1204] (2/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_291-270881-0_sp1.1 from training. Duration: 0.8818125 2023-10-02 01:38:58,746 WARNING [train.py:1204] (2/4) Exclude cut with ID _60229_210_27767_1_1534838406158_3655350_353-213656-0 from training. Duration: 0.95 2023-10-02 01:38:58,763 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_3133_210_2087_1_1526106446_2324690_243-143119-0 from training. Duration: 0.77 2023-10-02 01:38:58,801 WARNING [train.py:1204] (2/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_690-126306-0 from training. Duration: 0.78 2023-10-02 01:38:59,033 WARNING [train.py:1204] (2/4) Exclude cut with ID _30304_210_6319_1_1532476827261_7582060_347-95003-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 01:38:59,154 WARNING [train.py:1204] (2/4) Exclude cut with ID _2322_210_2667_1_1525604548971_3656299_440-80494-0 from training. Duration: 0.88 2023-10-02 01:38:59,168 WARNING [train.py:1204] (2/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_370-229434-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 01:38:59,425 WARNING [train.py:1204] (2/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_124-345840-0_sp1.1 from training. Duration: 0.47275 2023-10-02 01:38:59,432 WARNING [train.py:1204] (2/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_165-123581-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 01:38:59,617 WARNING [train.py:1204] (2/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_453-282588-0 from training. Duration: 0.98 2023-10-02 01:38:59,636 WARNING [train.py:1204] (2/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_299-34413-0_sp0.9 from training. Duration: 0.72225 2023-10-02 01:38:59,675 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_40036_210_13405_1_1533648647_1315388_48-146016-0_sp1.1 from training. Duration: 0.8454375 2023-10-02 01:38:59,705 WARNING [train.py:1204] (2/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_99-167392-0_sp0.9 from training. Duration: 0.67775 2023-10-02 01:38:59,755 WARNING [train.py:1204] (2/4) Exclude cut with ID _48677_210_5663_1_1533902405919_3892989_37-207114-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 01:38:59,877 WARNING [train.py:1204] (2/4) Exclude cut with ID _29857_210_20034_1_1532395886753_3798160_335-163562-0 from training. Duration: 0.85 2023-10-02 01:39:00,049 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_36492_210_7927_1_1533261987_4790834_703-343351-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 01:39:00,154 WARNING [train.py:1204] (2/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_80-147301-0_sp0.9 from training. Duration: 0.9333125 2023-10-02 01:39:00,244 WARNING [train.py:1204] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_218-295097-0_sp0.9 from training. Duration: 0.8555625 2023-10-02 01:39:00,431 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_2295_210_2087_1_1525500993_2407863_116-238894-0_sp1.1 from training. Duration: 0.636375 2023-10-02 01:39:00,683 WARNING [train.py:1204] (2/4) Exclude cut with ID _3231_210_2106_1_1526205864934_6784220_697-171068-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 01:39:00,701 WARNING [train.py:1204] (2/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_102-77064-0 from training. Duration: 0.93 2023-10-02 01:39:01,066 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_13091_210_7925_1_1530609447_8987354_1186-261957-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 01:39:01,198 WARNING [train.py:1204] (2/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_91-277684-0_sp1.1 from training. Duration: 0.67275 2023-10-02 01:39:01,281 WARNING [train.py:1204] (2/4) Exclude cut with ID _31088_210_16446_1_1532520012284_3440919_159-313751-0_sp1.1 from training. Duration: 0.936375 2023-10-02 01:39:01,418 WARNING [train.py:1204] (2/4) Exclude cut with ID _42326_210_7770_1_1533814282531_3707269_227-203721-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 01:39:01,535 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_3531_210_3528_1_1526294203_5216456_309-227302-0_sp1.1 from training. Duration: 0.62725 2023-10-02 01:39:01,597 WARNING [train.py:1204] (2/4) Exclude cut with ID _14087_210_2253_1_1530529224512_3014029_226-242140-0 from training. Duration: 0.81 2023-10-02 01:39:01,929 WARNING [train.py:1204] (2/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_122-109023-0_sp1.1 from training. Duration: 0.6909375 2023-10-02 01:39:02,024 WARNING [train.py:1204] (2/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_660-211088-0_sp0.9 from training. Duration: 0.688875 2023-10-02 01:39:02,775 WARNING [train.py:1204] (2/4) Exclude cut with ID _39290_210_4220_1_1533862778760_3075118_92-311600-0 from training. Duration: 0.88 2023-10-02 01:39:03,033 WARNING [train.py:1204] (2/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_390-339137-0_sp1.1 from training. Duration: 0.5090625 2023-10-02 01:39:03,305 WARNING [train.py:1204] (2/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_449-341946-0_sp1.1 from training. Duration: 0.963625 2023-10-02 01:39:03,378 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_68095_210_20185_1_1535718226_8888855_884-327007-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 01:39:03,472 WARNING [train.py:1204] (2/4) Exclude cut with ID _42326_210_7770_1_1533814282531_3707269_308-222998-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 01:39:03,574 WARNING [train.py:1204] (2/4) Exclude cut with ID _40399_210_4353_1_1533305658194_3279406_344-238708-0_sp1.1 from training. Duration: 0.963625 2023-10-02 01:39:03,647 WARNING [train.py:1204] (2/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_87-131427-0_sp1.1 from training. Duration: 0.7090625 2023-10-02 01:39:03,694 WARNING [train.py:1204] (2/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_110-130961-0_sp0.9 from training. Duration: 0.52225 2023-10-02 01:39:03,735 WARNING [train.py:1204] (2/4) Exclude cut with ID _55153_210_2253_1_1534384786677_3162096_148-79022-0 from training. Duration: 0.96 2023-10-02 01:39:04,166 WARNING [train.py:1204] (2/4) Exclude cut with ID _9679_210_7497_1_1529129212906_4363929_512-313889-0_sp1.1 from training. Duration: 0.736375 2023-10-02 01:39:04,183 WARNING [train.py:1204] (2/4) Exclude cut with ID _7970_210_1883_1_1528455616296_3472230_25-178407-0_sp1.1 from training. Duration: 0.6454375 2023-10-02 01:39:04,305 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_3133_210_2087_1_1526106446_2324690_246-125000-0 from training. Duration: 0.62 2023-10-02 01:39:04,353 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_15126_210_6753_1_1530778677_2591185_113-157651-0_sp0.9 from training. Duration: 0.7555625 2023-10-02 01:39:04,480 WARNING [train.py:1204] (2/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_105-63309-0_sp1.1 from training. Duration: 0.8909375 2023-10-02 01:39:04,673 WARNING [train.py:1204] (2/4) Exclude cut with ID _29877_210_6228_1_1532397791106_5276149_578-177271-0_sp1.1 from training. Duration: 0.936375 2023-10-02 01:39:04,727 WARNING [train.py:1204] (2/4) Exclude cut with ID _44831_210_13427_1_1533816011549_3689035_32-188147-0 from training. Duration: 0.96 2023-10-02 01:39:04,788 WARNING [train.py:1204] (2/4) Exclude cut with ID _12369_210_4381_1_1530356078581_4388520_300-254337-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 01:39:04,800 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_128-241008-0 from training. Duration: 0.94 2023-10-02 01:39:04,968 WARNING [train.py:1204] (2/4) Exclude cut with ID _21878_210_3525_1_1531738772002_7264330_53-279178-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 01:39:05,215 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_41452_210_7291_1_1533437687_3941281_273-334088-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 01:39:05,224 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_128-241008-0_sp1.1 from training. Duration: 0.8545625 2023-10-02 01:39:05,472 WARNING [train.py:1204] (2/4) Exclude cut with ID _8394_210_6702_1_1528590504316_3540189_379-49457-0 from training. Duration: 0.87 2023-10-02 01:39:05,549 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_105-114797-0 from training. Duration: 0.84 2023-10-02 01:39:05,637 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_14928_210_5632_1_1530773461_4714000_454-112198-0 from training. Duration: 0.66 2023-10-02 01:39:05,949 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_456-22141-0_sp1.1 from training. Duration: 0.8 2023-10-02 01:39:06,187 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_115-233929-0 from training. Duration: 0.72 2023-10-02 01:39:06,336 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_616-16795-0_sp1.1 from training. Duration: 0.963625 2023-10-02 01:39:07,464 WARNING [train.py:1204] (2/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_448-212135-0 from training. Duration: 0.91 2023-10-02 01:39:07,724 WARNING [train.py:1204] (2/4) Exclude cut with ID _46683_210_9642_1_1533730913492_4591116_201-205677-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 01:39:07,887 WARNING [train.py:1204] (2/4) Exclude cut with ID _64086_210_7994_1_1535270417019_7435949_43-80483-0_sp1.1 from training. Duration: 0.97275 2023-10-02 01:39:08,123 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_240-134124-0_sp1.1 from training. Duration: 0.963625 2023-10-02 01:39:08,155 WARNING [train.py:1204] (2/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_907-120940-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 01:39:08,275 WARNING [train.py:1204] (2/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_848-280957-0_sp1.1 from training. Duration: 0.7909375 2023-10-02 01:39:08,318 WARNING [train.py:1204] (2/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_243-42116-0_sp0.9 from training. Duration: 0.811125 2023-10-02 01:39:08,616 WARNING [train.py:1204] (2/4) Exclude cut with ID _6694_210_4599_1_1527852637490_3965529_116-109398-0_sp0.9 from training. Duration: 0.811125 2023-10-02 01:39:08,726 WARNING [train.py:1204] (2/4) Exclude cut with ID _24314_210_4915_1_1532135078116_7170179_521-258868-0 from training. Duration: 0.79 2023-10-02 01:39:08,834 WARNING [train.py:1204] (2/4) Exclude cut with ID _2220_210_1819_1_1525509109898_3658680_50-99837-0_sp0.9 from training. Duration: 0.6 2023-10-02 01:39:08,887 WARNING [train.py:1204] (2/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_313-134930-0_sp1.1 from training. Duration: 0.8818125 2023-10-02 01:39:09,112 WARNING [train.py:1204] (2/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_125-144174-0 from training. Duration: 0.39 2023-10-02 01:39:09,184 WARNING [train.py:1204] (2/4) Exclude cut with ID _59288_210_5854_1_1535077757774_3412246_275-293897-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 01:39:09,284 WARNING [train.py:1204] (2/4) Exclude cut with ID _40519_210_13794_1_1533715228655_7337390_722-52880-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 01:39:09,298 WARNING [train.py:1204] (2/4) Exclude cut with ID _7537_210_2084_1_1528502395431_3924359_128-268029-0 from training. Duration: 0.98 2023-10-02 01:39:09,445 WARNING [train.py:1204] (2/4) Exclude cut with ID _8021_210_7994_1_1528376451358_3653069_179-45206-0 from training. Duration: 0.86 2023-10-02 01:39:09,589 WARNING [train.py:1204] (2/4) Exclude cut with ID _13252_210_1925_1_1530671372047_3336909_88-276668-0 from training. Duration: 0.99 2023-10-02 01:39:09,704 WARNING [train.py:1204] (2/4) Exclude cut with ID _22613_210_5219_1_1532253599686_4090170_290-212862-0_sp1.1 from training. Duration: 0.97275 2023-10-02 01:39:09,708 WARNING [train.py:1204] (2/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_465-214726-0 from training. Duration: 0.86 2023-10-02 01:39:10,531 WARNING [train.py:1204] (2/4) Exclude cut with ID _56480_210_4220_1_1534679942079_3075409_138-36031-0_sp1.1 from training. Duration: 0.97275 2023-10-02 01:39:11,798 WARNING [train.py:1204] (2/4) Exclude cut with ID _58605_210_12210_1_1534723195939_3904259_300-285878-0_sp1.1 from training. Duration: 0.97275 2023-10-02 01:39:11,829 WARNING [train.py:1204] (2/4) Exclude cut with ID _40901_210_5665_1_1533363789958_3856130_114-340229-0_sp1.1 from training. Duration: 0.936375 2023-10-02 01:39:11,878 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_489-230703-0 from training. Duration: 0.85 2023-10-02 01:39:12,104 WARNING [train.py:1204] (2/4) Exclude cut with ID _66873_210_5517_1_1535454136506_4293370_322-81243-0_sp1.1 from training. Duration: 0.936375 2023-10-02 01:39:12,208 WARNING [train.py:1204] (2/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_271-269084-0 from training. Duration: 0.83 2023-10-02 01:39:12,225 WARNING [train.py:1204] (2/4) Exclude cut with ID _13252_210_1925_1_1530671372047_3336909_166-314607-0_sp0.9 from training. Duration: 0.6 2023-10-02 01:39:12,281 WARNING [train.py:1204] (2/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_340-343302-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 01:39:12,471 WARNING [train.py:1204] (2/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_286-112015-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 01:39:12,616 WARNING [train.py:1204] (2/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_328-264776-0 from training. Duration: 0.67 2023-10-02 01:39:12,769 WARNING [train.py:1204] (2/4) Exclude cut with ID _4866_210_2577_1_1527404412284_4364659_256-306830-0_sp0.9 from training. Duration: 0.9666875 2023-10-02 01:39:13,188 WARNING [train.py:1204] (2/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_239-316802-0 from training. Duration: 0.69 2023-10-02 01:39:13,337 WARNING [train.py:1204] (2/4) Exclude cut with ID _29857_210_20034_1_1532395886753_3798160_279-185801-0 from training. Duration: 0.91 2023-10-02 01:39:13,789 WARNING [train.py:1204] (2/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_656-188361-0 from training. Duration: 0.87 2023-10-02 01:39:13,911 WARNING [train.py:1204] (2/4) Exclude cut with ID _44718_210_5816_1_1534050051869_6469030_100-230287-0_sp1.1 from training. Duration: 0.936375 2023-10-02 01:39:14,150 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8275_210_4198_1_1528455320_3892571_276-215622-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 01:39:14,210 WARNING [train.py:1204] (2/4) Exclude cut with ID _30304_210_6319_1_1532476827261_7582060_698-309430-0_sp1.1 from training. Duration: 0.963625 2023-10-02 01:39:14,604 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_365-217119-0 from training. Duration: 0.63 2023-10-02 01:39:14,722 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7264_210_4938_1_1527939911_8132095_800-91995-0 from training. Duration: 0.86 2023-10-02 01:39:16,063 WARNING [train.py:1204] (2/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_49-97158-0_sp1.1 from training. Duration: 0.6909375 2023-10-02 01:39:16,216 WARNING [train.py:1204] (2/4) Exclude cut with ID _61132_210_9969_1_1535021792964_3755419_61-57720-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 01:39:16,374 WARNING [train.py:1204] (2/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_345-306832-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 01:39:16,404 WARNING [train.py:1204] (2/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_164-224052-0_sp0.9 from training. Duration: 0.77775 2023-10-02 01:39:16,694 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_34886_210_10633_1_1533203882_1755661_76-156096-0_sp1.1 from training. Duration: 0.7909375 2023-10-02 01:39:16,777 WARNING [train.py:1204] (2/4) Exclude cut with ID 4133-6541-0027-40495-0_sp1.1 from training. Duration: 0.9681875 2023-10-02 01:39:17,422 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_514-211165-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 01:39:17,479 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_41211_210_2544_1_1533348859_3702910_97-212695-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 01:39:17,780 WARNING [train.py:1204] (2/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_406-45086-0 from training. Duration: 0.8 2023-10-02 01:39:17,855 WARNING [train.py:1204] (2/4) Exclude cut with ID _65263_210_22356_1_1535763598691_3921037_49-222917-0_sp1.1 from training. Duration: 0.836375 2023-10-02 01:39:17,899 WARNING [train.py:1204] (2/4) Exclude cut with ID _4866_210_2577_1_1527404412284_4364659_352-157930-0_sp1.1 from training. Duration: 0.736375 2023-10-02 01:39:17,981 WARNING [train.py:1204] (2/4) Exclude cut with ID _4366_210_1388_1_1526792053633_3613819_231-211402-0_sp1.1 from training. Duration: 0.8545625 2023-10-02 01:39:18,033 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_98-21576-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 01:39:18,111 WARNING [train.py:1204] (2/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_348-127789-0_sp0.9 from training. Duration: 0.9444375 2023-10-02 01:39:18,134 WARNING [train.py:1204] (2/4) Exclude cut with ID _45871_210_5665_1_1533864614245_3296060_98-329557-0_sp1.1 from training. Duration: 0.97275 2023-10-02 01:39:18,187 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6846_210_6753_1_1527850767_4295158_2-315073-0_sp1.1 from training. Duration: 0.8 2023-10-02 01:39:18,316 WARNING [train.py:1204] (2/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_145-137790-0 from training. Duration: 0.83 2023-10-02 01:39:18,329 WARNING [train.py:1204] (2/4) Exclude cut with ID _14603_210_4220_1_1530615688441_3097319_133-158308-0_sp0.9 from training. Duration: 0.9333125 2023-10-02 01:39:18,668 WARNING [train.py:1204] (2/4) Exclude cut with ID _14898_210_9206_1_1530766812377_4177190_320-11758-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 01:39:18,912 WARNING [train.py:1204] (2/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_406-45086-0_sp0.9 from training. Duration: 0.888875 2023-10-02 01:39:20,078 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_5438_210_1794_1_1527336687_2600009_104-288274-0 from training. Duration: 0.58 2023-10-02 01:39:20,170 WARNING [train.py:1204] (2/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_108-53929-0_sp0.9 from training. Duration: 0.6555625 2023-10-02 01:39:20,242 WARNING [train.py:1204] (2/4) Exclude cut with ID _30800_210_22382_1_1532499347904_4515959_444-286390-0_sp1.1 from training. Duration: 0.963625 2023-10-02 01:39:20,424 WARNING [train.py:1204] (2/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_125-144174-0_sp0.9 from training. Duration: 0.4333125 2023-10-02 01:39:20,513 WARNING [train.py:1204] (2/4) Exclude cut with ID _55209_210_1531_1_1534675828996_4597160_118-345621-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 01:39:20,664 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_45886_210_17024_1_1533709215_471063_7-79165-0_sp1.1 from training. Duration: 0.8909375 2023-10-02 01:39:20,905 WARNING [train.py:1204] (2/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_175-163894-0_sp0.9 from training. Duration: 0.82225 2023-10-02 01:39:21,059 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_50188_210_4198_1_1534503621_3646041_353-81682-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 01:39:21,064 WARNING [train.py:1204] (2/4) Exclude cut with ID _17875_210_2087_1_1531224012907_1821339_175-80565-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 01:39:21,072 WARNING [train.py:1204] (2/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_159-328157-0_sp1.1 from training. Duration: 0.47275 2023-10-02 01:39:21,462 WARNING [train.py:1204] (2/4) Exclude cut with ID _60057_210_10640_1_1534903065793_3333150_137-267579-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 01:39:21,525 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_92-46009-0_sp1.1 from training. Duration: 0.4909375 2023-10-02 01:39:21,623 WARNING [train.py:1204] (2/4) Exclude cut with ID _53203_210_2087_1_1534246218309_475590_48-68507-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 01:39:21,638 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_28211_210_6388_1_1532861506_4523224_128-328162-0 from training. Duration: 0.9 2023-10-02 01:39:21,719 WARNING [train.py:1204] (2/4) Exclude cut with ID _4697_210_2069_1_1526815801953_3823098_51-1049-0_sp1.1 from training. Duration: 0.6818125 2023-10-02 01:39:21,819 WARNING [train.py:1204] (2/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_383-316000-0 from training. Duration: 0.9 2023-10-02 01:39:21,870 WARNING [train.py:1204] (2/4) Exclude cut with ID _4779_210_2084_1_1527318022944_3914893_185-295153-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 01:39:21,963 WARNING [train.py:1204] (2/4) Exclude cut with ID _3231_210_2106_1_1526205864934_6784220_171-256004-0_sp0.9 from training. Duration: 0.9555625 2023-10-02 01:39:21,964 WARNING [train.py:1204] (2/4) Exclude cut with ID _41314_210_12651_1_1533459476543_3766460_87-272068-0 from training. Duration: 0.83 2023-10-02 01:39:22,083 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_164-152777-0_sp1.1 from training. Duration: 0.92725 2023-10-02 01:39:22,141 WARNING [train.py:1204] (2/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_18-130336-0_sp0.9 from training. Duration: 0.87775 2023-10-02 01:39:22,208 WARNING [train.py:1204] (2/4) Exclude cut with ID _14886_210_4220_1_1530702020355_3516190_175-229073-0_sp1.1 from training. Duration: 0.4090625 2023-10-02 01:39:22,227 WARNING [train.py:1204] (2/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_791-45149-0_sp0.9 from training. Duration: 0.9555625 2023-10-02 01:39:22,317 WARNING [train.py:1204] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_468-189058-0_sp1.1 from training. Duration: 0.87275 2023-10-02 01:39:22,438 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8278_210_2550_1_1528637099_4220413_32-142780-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 01:39:22,620 WARNING [train.py:1204] (2/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_146-218763-0_sp0.9 from training. Duration: 0.9555625 2023-10-02 01:39:22,893 WARNING [train.py:1204] (2/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_348-127789-0_sp1.1 from training. Duration: 0.77275 2023-10-02 01:39:22,898 WARNING [train.py:1204] (2/4) Exclude cut with ID _47790_210_16187_1_1533862814066_4207519_288-285445-0_sp1.1 from training. Duration: 0.936375 2023-10-02 01:39:23,069 WARNING [train.py:1204] (2/4) Exclude cut with ID _14603_210_4220_1_1530615688441_3097319_308-181732-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 01:39:23,094 WARNING [train.py:1204] (2/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_895-117428-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 01:39:23,249 WARNING [train.py:1204] (2/4) Exclude cut with ID _9235_210_5219_1_1528891154253_8046120_387-159008-0_sp0.9 from training. Duration: 0.9555625 2023-10-02 01:39:23,260 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_37351_210_7925_1_1533012931_3812113_265-85780-0_sp0.9 from training. Duration: 0.97775 2023-10-02 01:39:23,283 WARNING [train.py:1204] (2/4) Exclude cut with ID _40551_210_10635_1_1533776424632_3416179_361-3964-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 01:39:23,321 WARNING [train.py:1204] (2/4) Exclude cut with ID _25882_210_3036_1_1532775665815_6479510_485-122742-0_sp1.1 from training. Duration: 0.97275 2023-10-02 01:39:24,230 WARNING [train.py:1204] (2/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_98-51676-0 from training. Duration: 0.73 2023-10-02 01:39:24,438 WARNING [train.py:1204] (2/4) Exclude cut with ID _30548_210_16040_1_1532608097130_3146969_371-280610-0_sp1.1 from training. Duration: 0.92725 2023-10-02 01:39:24,689 WARNING [train.py:1204] (2/4) Exclude cut with ID IC0671W0081-331924 from training. Duration: 0.896 2023-10-02 01:39:24,817 WARNING [train.py:1204] (2/4) Exclude cut with ID _40284_210_2084_1_1533513679814_3704279_380-160775-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 01:39:24,922 WARNING [train.py:1204] (2/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_57-270037-0_sp1.1 from training. Duration: 0.7 2023-10-02 01:39:25,037 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_853-150237-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 01:39:25,121 WARNING [train.py:1204] (2/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_491-261923-0 from training. Duration: 0.98 2023-10-02 01:39:25,470 WARNING [train.py:1204] (2/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_836-244045-0_sp1.1 from training. Duration: 0.97275 2023-10-02 01:39:25,562 WARNING [train.py:1204] (2/4) Exclude cut with ID _14880_210_8895_1_1530680426655_3795259_289-212817-0 from training. Duration: 0.69 2023-10-02 01:39:25,666 WARNING [train.py:1204] (2/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_303-340446-0 from training. Duration: 0.9 2023-10-02 01:39:25,772 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_37152_210_9697_1_1533106573_3844188_418-41736-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 01:39:26,049 WARNING [train.py:1204] (2/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_371-18169-0_sp1.1 from training. Duration: 0.7818125 2023-10-02 01:39:26,078 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_187-219984-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 01:39:26,204 WARNING [train.py:1204] (2/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_63-267585-0 from training. Duration: 0.9 2023-10-02 01:39:26,370 WARNING [train.py:1204] (2/4) Exclude cut with ID _27013_210_1389_1_1532437173240_3252649_222-125573-0 from training. Duration: 0.98 2023-10-02 01:39:26,414 WARNING [train.py:1204] (2/4) Exclude cut with ID _14821_210_8750_1_1530680503991_6829359_189-297451-0 from training. Duration: 0.55 2023-10-02 01:39:26,515 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_557-163391-0_sp1.1 from training. Duration: 0.97275 2023-10-02 01:39:26,617 WARNING [train.py:1204] (2/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_65-88242-0_sp1.1 from training. Duration: 0.5090625 2023-10-02 01:39:27,418 WARNING [train.py:1204] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_218-295097-0 from training. Duration: 0.77 2023-10-02 01:39:27,445 WARNING [train.py:1204] (2/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_686-247600-0_sp1.1 from training. Duration: 0.836375 2023-10-02 01:39:27,448 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_397-147196-0_sp0.9 from training. Duration: 0.888875 2023-10-02 01:39:27,542 WARNING [train.py:1204] (2/4) Exclude cut with ID _53950_210_5816_1_1534417264919_3862951_237-136449-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 01:39:27,621 WARNING [train.py:1204] (2/4) Exclude cut with ID _3120_210_3525_1_1526036143112_4072980_74-178400-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 01:39:27,687 WARNING [train.py:1204] (2/4) Exclude cut with ID _5166_210_2084_1_1527073197160_4087993_309-222545-0_sp0.9 from training. Duration: 0.9333125 2023-10-02 01:39:27,774 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7544_210_6753_1_1528587722_3576686_172-171750-0 from training. Duration: 0.65 2023-10-02 01:39:27,885 WARNING [train.py:1204] (2/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_3-237434-0_sp0.9 from training. Duration: 0.8444375 2023-10-02 01:39:27,937 WARNING [train.py:1204] (2/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_101-293367-0_sp0.9 from training. Duration: 0.7 2023-10-02 01:39:28,566 WARNING [train.py:1204] (2/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_818-213552-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 01:39:28,639 WARNING [train.py:1204] (2/4) Exclude cut with ID _14349_210_8341_1_1530615710818_3442470_115-285975-0_sp1.1 from training. Duration: 0.7818125 2023-10-02 01:39:28,790 WARNING [train.py:1204] (2/4) Exclude cut with ID _30800_210_22382_1_1532499347904_4515959_291-309749-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 01:39:28,900 WARNING [train.py:1204] (2/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_715-3462-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 01:39:29,320 WARNING [train.py:1204] (2/4) Exclude cut with ID _59502_210_12210_1_1535107024698_1835139_13-331827-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 01:39:29,338 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_4528_210_3273_1_1527076654_3276777_342-168068-0 from training. Duration: 0.87 2023-10-02 01:39:29,405 WARNING [train.py:1204] (2/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_215-141059-0_sp0.9 from training. Duration: 0.688875 2023-10-02 01:39:29,429 WARNING [train.py:1204] (2/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_106-286130-0_sp1.1 from training. Duration: 0.8909375 2023-10-02 01:39:29,522 WARNING [train.py:1204] (2/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_480-77655-0_sp1.1 from training. Duration: 0.97275 2023-10-02 01:39:29,746 WARNING [train.py:1204] (2/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_848-280957-0 from training. Duration: 0.87 2023-10-02 01:39:30,071 WARNING [train.py:1204] (2/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_122-109023-0 from training. Duration: 0.76 2023-10-02 01:39:30,206 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_11305_210_1794_1_1529750428_7178148_645-233480-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 01:39:30,261 WARNING [train.py:1204] (2/4) Exclude cut with ID _7562_210_2084_1_1528887767916_3946229_345-169156-0 from training. Duration: 0.58 2023-10-02 01:39:30,443 WARNING [train.py:1204] (2/4) Exclude cut with ID _7562_210_2084_1_1528887767916_3946229_345-169156-0_sp1.1 from training. Duration: 0.52725 2023-10-02 01:39:30,597 WARNING [train.py:1204] (2/4) Exclude cut with ID _13253_210_1925_1_1530757775819_3577873_253-246386-0_sp1.1 from training. Duration: 0.8181875 2023-10-02 01:39:30,681 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_136-264444-0_sp1.1 from training. Duration: 0.97275 2023-10-02 01:39:30,694 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_2168_210_2290_1_1525606920_1849400_136-222991-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 01:39:30,831 WARNING [train.py:1204] (2/4) Exclude cut with ID _7852_210_5219_1_1528286471316_8139400_242-105336-0 from training. Duration: 0.72 2023-10-02 01:39:30,879 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_5960_210_1794_1_1527403865_3990999_515-2613-0_sp1.1 from training. Duration: 0.8 2023-10-02 01:39:30,893 WARNING [train.py:1204] (2/4) Exclude cut with ID _39688_210_19596_1_1533371626725_4154159_551-167834-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 01:39:30,947 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_43802_210_2290_1_1533516203_8829504_85-195240-0 from training. Duration: 0.87 2023-10-02 01:39:31,024 WARNING [train.py:1204] (2/4) Exclude cut with ID _63171_210_15488_1_1535104792794_3295110_113-113534-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 01:39:31,412 WARNING [train.py:1204] (2/4) Exclude cut with ID _8833_210_5219_1_1528592665210_7553059_319-349181-0_sp1.1 from training. Duration: 0.9 2023-10-02 01:39:31,413 WARNING [train.py:1204] (2/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_225-43381-0 from training. Duration: 0.94 2023-10-02 01:39:31,808 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_5959_210_1794_1_1527400400_3445726_412-44052-0_sp0.9 from training. Duration: 0.8555625 2023-10-02 01:39:31,934 WARNING [train.py:1204] (2/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_356-167816-0_sp1.1 from training. Duration: 0.6545625 2023-10-02 01:39:32,225 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9012W0112-501244 from training. Duration: 0.768 2023-10-02 01:39:32,280 WARNING [train.py:1204] (2/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_177-326952-0_sp1.1 from training. Duration: 0.92725 2023-10-02 01:39:32,284 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9016W0116-502789 from training. Duration: 0.896 2023-10-02 01:39:32,406 WARNING [train.py:1204] (2/4) Exclude cut with ID _56235_210_10640_1_1534557335235_4161099_557-204815-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 01:39:33,147 WARNING [train.py:1204] (2/4) Exclude cut with ID _55857_210_2346_1_1534644007831_6898209_279-229469-0_sp1.1 from training. Duration: 0.936375 2023-10-02 01:39:33,218 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9029W0375-509764 from training. Duration: 0.896 2023-10-02 01:39:33,303 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_2295_210_2087_1_1525500993_2407863_234-83104-0_sp0.9 from training. Duration: 0.688875 2023-10-02 01:39:33,513 WARNING [train.py:1204] (2/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_266-137104-0 from training. Duration: 0.65 2023-10-02 01:39:33,620 WARNING [train.py:1204] (2/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_289-163927-0_sp1.1 from training. Duration: 0.92725 2023-10-02 01:39:33,877 WARNING [train.py:1204] (2/4) Exclude cut with ID _12733_210_8341_1_1530167424556_3646380_86-225314-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 01:39:33,896 WARNING [train.py:1204] (2/4) Exclude cut with ID _7877_210_6014_1_1528286358099_3750136_618-284064-0_sp1.1 from training. Duration: 0.92725 2023-10-02 01:39:33,960 WARNING [train.py:1204] (2/4) Exclude cut with ID _55844_210_2346_1_1534471212569_7625707_1017-31469-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 01:39:33,999 WARNING [train.py:1204] (2/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_617-241446-0_sp1.1 from training. Duration: 0.92725 2023-10-02 01:39:34,059 WARNING [train.py:1204] (2/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_866-173440-0_sp1.1 from training. Duration: 0.8181875 2023-10-02 01:39:34,201 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_9460_210_4211_1_1528954587_10049667_480-260269-0 from training. Duration: 0.56 2023-10-02 01:39:34,242 WARNING [train.py:1204] (2/4) Exclude cut with ID _44659_210_2721_1_1534036120061_6614860_626-298884-0 from training. Duration: 0.97 2023-10-02 01:39:34,279 WARNING [train.py:1204] (2/4) Exclude cut with ID _53565_210_10635_1_1534337218335_3820259_287-18120-0_sp1.1 from training. Duration: 0.8818125 2023-10-02 01:39:34,352 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_791-197891-0 from training. Duration: 0.84 2023-10-02 01:39:34,397 WARNING [train.py:1204] (2/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_527-238990-0 from training. Duration: 0.9 2023-10-02 01:39:34,561 WARNING [train.py:1204] (2/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_449-65969-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 01:39:34,679 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_553-341452-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 01:39:34,720 WARNING [train.py:1204] (2/4) Exclude cut with ID _16909_210_5724_1_1531292079912_4118919_300-47574-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 01:39:34,731 WARNING [train.py:1204] (2/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_70-27732-0_sp1.1 from training. Duration: 0.8545625 2023-10-02 01:39:34,848 WARNING [train.py:1204] (2/4) Exclude cut with ID _44718_210_5816_1_1534050051869_6469030_727-28074-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 01:39:34,900 WARNING [train.py:1204] (2/4) Exclude cut with ID _55857_210_2346_1_1534644007831_6898209_357-123664-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 01:39:35,095 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9123W0004-555964 from training. Duration: 0.896 2023-10-02 01:39:35,125 WARNING [train.py:1204] (2/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_445-145208-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 01:39:35,147 WARNING [train.py:1204] (2/4) Exclude cut with ID _3675_210_4557_1_1526639934408_4182089_456-18305-0_sp1.1 from training. Duration: 0.6909375 2023-10-02 01:39:35,219 WARNING [train.py:1204] (2/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_1-99680-0_sp1.1 from training. Duration: 0.6090625 2023-10-02 01:39:35,385 WARNING [train.py:1204] (2/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_146-282400-0_sp0.9 from training. Duration: 0.788875 2023-10-02 01:39:35,403 WARNING [train.py:1204] (2/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_844-96989-0 from training. Duration: 0.71 2023-10-02 01:39:35,521 WARNING [train.py:1204] (2/4) Exclude cut with ID _29853_210_7497_1_1532505600851_6642110_128-146364-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 01:39:35,542 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_224-322394-0 from training. Duration: 0.66 2023-10-02 01:39:35,642 WARNING [train.py:1204] (2/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_191-59784-0 from training. Duration: 0.88 2023-10-02 01:39:35,647 WARNING [train.py:1204] (2/4) Exclude cut with ID _12556_210_8341_1_1530081024100_7327690_725-193477-0 from training. Duration: 0.98 2023-10-02 01:39:35,694 WARNING [train.py:1204] (2/4) Exclude cut with ID _5166_210_2084_1_1527073197160_4087993_523-79838-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 01:39:35,786 WARNING [train.py:1204] (2/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_398-334558-0_sp1.1 from training. Duration: 0.87275 2023-10-02 01:39:36,090 WARNING [train.py:1204] (2/4) Exclude cut with ID _6456_210_1534_1_1527681634212_3181049_171-36697-0_sp1.1 from training. Duration: 0.936375 2023-10-02 01:39:36,274 WARNING [train.py:1204] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_700-183549-0 from training. Duration: 0.68 2023-10-02 01:39:36,276 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8853_210_3273_1_1528633899_818968_51-187685-0 from training. Duration: 0.69 2023-10-02 01:39:37,467 WARNING [train.py:1204] (2/4) Exclude cut with ID _7719_210_3525_1_1528456153163_4296960_333-110399-0_sp1.1 from training. Duration: 0.87275 2023-10-02 01:39:37,749 WARNING [train.py:1204] (2/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_1022-83740-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 01:39:37,764 WARNING [train.py:1204] (2/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_61-327062-0_sp0.9 from training. Duration: 0.9 2023-10-02 01:39:37,776 WARNING [train.py:1204] (2/4) Exclude cut with ID _47251_210_18628_1_1533814063128_4137250_214-88076-0_sp1.1 from training. Duration: 0.8 2023-10-02 01:39:37,844 WARNING [train.py:1204] (2/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_839-173017-0 from training. Duration: 0.41 2023-10-02 01:39:38,268 WARNING [train.py:1204] (2/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_299-34413-0_sp1.1 from training. Duration: 0.5909375 2023-10-02 01:39:38,384 WARNING [train.py:1204] (2/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_664-159457-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 01:39:38,595 WARNING [train.py:1204] (2/4) Exclude cut with ID _66873_210_5517_1_1535454136506_4293370_483-291760-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 01:39:38,787 WARNING [train.py:1204] (2/4) Exclude cut with ID _30800_210_22382_1_1532499347904_4515959_287-255474-0_sp1.1 from training. Duration: 0.92725 2023-10-02 01:39:38,833 WARNING [train.py:1204] (2/4) Exclude cut with ID _48677_210_5663_1_1533902405919_3892989_111-11439-0_sp1.1 from training. Duration: 0.87275 2023-10-02 01:39:38,850 WARNING [train.py:1204] (2/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_399-156791-0 from training. Duration: 0.83 2023-10-02 01:39:38,888 WARNING [train.py:1204] (2/4) Exclude cut with ID _3992_210_1747_1_1526644816031_3731060_36-147855-0_sp0.9 from training. Duration: 0.8666875 2023-10-02 01:39:39,008 WARNING [train.py:1204] (2/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_442-320317-0_sp1.1 from training. Duration: 0.7454375 2023-10-02 01:39:39,047 WARNING [train.py:1204] (2/4) Exclude cut with ID _55429_210_10626_1_1534467588527_7174143_953-80868-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 01:39:39,134 WARNING [train.py:1204] (2/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_159-328157-0_sp0.9 from training. Duration: 0.57775 2023-10-02 01:39:39,140 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9309W0197-640324 from training. Duration: 0.896 2023-10-02 01:39:39,347 WARNING [train.py:1204] (2/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_361-30822-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 01:39:39,752 WARNING [train.py:1204] (2/4) Exclude cut with ID _5250_210_5219_1_1527249561806_8692110_636-251213-0 from training. Duration: 0.94 2023-10-02 01:39:40,116 WARNING [train.py:1204] (2/4) Exclude cut with ID _49728_210_12651_1_1534041034294_6788150_468-333827-0_sp1.1 from training. Duration: 0.963625 2023-10-02 01:39:40,193 WARNING [train.py:1204] (2/4) Exclude cut with ID _25882_210_3036_1_1532775665815_6479510_623-191759-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 01:39:40,269 WARNING [train.py:1204] (2/4) Exclude cut with ID _5189_210_4557_1_1527248319428_3769229_199-53526-0 from training. Duration: 0.42 2023-10-02 01:39:40,395 WARNING [train.py:1204] (2/4) Exclude cut with ID _6274_210_2084_1_1527937583837_7494020_32-205288-0_sp0.9 from training. Duration: 0.8666875 2023-10-02 01:39:40,497 WARNING [train.py:1204] (2/4) Exclude cut with ID _65263_210_22356_1_1535763598691_3921037_401-346646-0_sp1.1 from training. Duration: 0.963625 2023-10-02 01:39:40,508 WARNING [train.py:1204] (2/4) Exclude cut with ID _12555_210_8750_1_1530075594402_3780239_449-267986-0_sp1.1 from training. Duration: 0.7545625 2023-10-02 01:39:40,520 WARNING [train.py:1204] (2/4) Exclude cut with ID _69884_210_9771_1_1535846392818_4166169_430-201020-0_sp1.1 from training. Duration: 0.8818125 2023-10-02 01:39:40,596 WARNING [train.py:1204] (2/4) Exclude cut with ID _8394_210_6702_1_1528590504316_3540189_379-49457-0_sp1.1 from training. Duration: 0.7909375 2023-10-02 01:39:40,866 WARNING [train.py:1204] (2/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_200-105218-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 01:39:40,918 WARNING [train.py:1204] (2/4) Exclude cut with ID _3867_210_1883_1_1526641200575_4475830_319-349850-0 from training. Duration: 0.67 2023-10-02 01:39:41,000 WARNING [train.py:1204] (2/4) Exclude cut with ID _28317_210_12742_1_1532480302381_8510325_916-195848-0 from training. Duration: 0.92 2023-10-02 01:39:41,165 WARNING [train.py:1204] (2/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_436-186311-0 from training. Duration: 0.65 2023-10-02 01:39:41,929 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_3910_210_1794_1_1526777667_3747260_302-180617-0_sp1.1 from training. Duration: 0.97275 2023-10-02 01:39:42,045 WARNING [train.py:1204] (2/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_102-92751-0 from training. Duration: 0.99 2023-10-02 01:39:42,099 WARNING [train.py:1204] (2/4) Exclude cut with ID _51899_210_20034_1_1534129254466_3669220_265-215428-0_sp1.1 from training. Duration: 0.8909375 2023-10-02 01:39:42,299 WARNING [train.py:1204] (2/4) Exclude cut with ID _58583_210_5236_1_1534726835266_3574249_438-198993-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 01:39:42,606 WARNING [train.py:1204] (2/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_166-166990-0_sp1.1 from training. Duration: 0.87275 2023-10-02 01:39:42,851 WARNING [train.py:1204] (2/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_907-151398-0 from training. Duration: 0.37 2023-10-02 01:39:43,060 WARNING [train.py:1204] (2/4) Exclude cut with ID _38670_210_15379_1_1533448810659_4391059_452-256669-0_sp1.1 from training. Duration: 0.936375 2023-10-02 01:39:43,109 WARNING [train.py:1204] (2/4) Exclude cut with ID _48677_210_5663_1_1533902405919_3892989_408-40740-0 from training. Duration: 0.83 2023-10-02 01:39:43,154 WARNING [train.py:1204] (2/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_903-158756-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 01:39:43,212 WARNING [train.py:1204] (2/4) Exclude cut with ID _13252_210_1925_1_1530671372047_3336909_397-216497-0 from training. Duration: 0.96 2023-10-02 01:39:43,604 WARNING [train.py:1204] (2/4) Exclude cut with ID _55429_210_10626_1_1534467588527_7174143_97-335084-0 from training. Duration: 0.97 2023-10-02 01:39:44,104 WARNING [train.py:1204] (2/4) Exclude cut with ID _45329_210_7005_1_1534157951211_6774339_453-267730-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 01:39:44,170 WARNING [train.py:1204] (2/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_153-97441-0 from training. Duration: 0.56 2023-10-02 01:39:44,227 WARNING [train.py:1204] (2/4) Exclude cut with ID _40901_210_5665_1_1533363789958_3856130_312-329122-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 01:39:44,284 WARNING [train.py:1204] (2/4) Exclude cut with ID _30800_210_22382_1_1532499347904_4515959_97-126471-0_sp1.1 from training. Duration: 0.97275 2023-10-02 01:39:44,284 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_11305_210_1794_1_1529750428_7178148_257-312870-0_sp1.1 from training. Duration: 0.8 2023-10-02 01:39:44,529 WARNING [train.py:1204] (2/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_454-314901-0_sp0.9 from training. Duration: 0.511125 2023-10-02 01:39:44,850 WARNING [train.py:1204] (2/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_282-136641-0_sp1.1 from training. Duration: 0.3818125 2023-10-02 01:39:44,978 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_46013_210_7234_1_1533776362_3744032_493-160724-0_sp1.1 from training. Duration: 0.963625 2023-10-02 01:39:45,020 WARNING [train.py:1204] (2/4) Exclude cut with ID _67831_210_12392_1_1535612464202_3092612_259-33442-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 01:39:45,080 WARNING [train.py:1204] (2/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_289-62384-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 01:39:45,150 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6545_210_2040_1_1527774670_4245258_379-14182-0_sp0.9 from training. Duration: 0.888875 2023-10-02 01:39:45,270 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_548-294316-0_sp1.1 from training. Duration: 0.736375 2023-10-02 01:39:45,353 WARNING [train.py:1204] (2/4) Exclude cut with ID _6275_210_2084_1_1528110062213_7669049_857-298942-0_sp1.1 from training. Duration: 0.77275 2023-10-02 01:39:45,481 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6673_210_3273_1_1527767790_3173240_327-306185-0_sp1.1 from training. Duration: 0.7545625 2023-10-02 01:39:46,576 WARNING [train.py:1204] (2/4) Exclude cut with ID _61008_210_10633_1_1534852769198_6974370_814-107647-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 01:39:46,639 WARNING [train.py:1204] (2/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_116-332008-0_sp1.1 from training. Duration: 0.8909375 2023-10-02 01:39:46,802 WARNING [train.py:1204] (2/4) Exclude cut with ID _32704_210_8954_1_1533017488840_6747370_266-329755-0 from training. Duration: 0.89 2023-10-02 01:39:46,805 WARNING [train.py:1204] (2/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_271-269084-0_sp0.9 from training. Duration: 0.92225 2023-10-02 01:39:46,877 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_3171_210_2087_1_1526109084_2330007_66-260583-0 from training. Duration: 0.72 2023-10-02 01:39:47,255 WARNING [train.py:1204] (2/4) Exclude cut with ID _5250_210_5219_1_1527249561806_8692110_1326-276386-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 01:39:47,406 WARNING [train.py:1204] (2/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_372-30302-0_sp1.1 from training. Duration: 0.7818125 2023-10-02 01:39:47,465 WARNING [train.py:1204] (2/4) Exclude cut with ID _25882_210_3036_1_1532775665815_6479510_631-257029-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 01:39:47,473 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_3691_210_3528_1_1526468169_4062053_355-334432-0 from training. Duration: 0.87 2023-10-02 01:39:47,480 WARNING [train.py:1204] (2/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_839-173017-0_sp1.1 from training. Duration: 0.37275 2023-10-02 01:39:47,487 WARNING [train.py:1204] (2/4) Exclude cut with ID _14998_210_5219_1_1530705582080_7474970_441-91466-0 from training. Duration: 0.97 2023-10-02 01:39:47,535 WARNING [train.py:1204] (2/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_108-53929-0 from training. Duration: 0.59 2023-10-02 01:39:47,826 WARNING [train.py:1204] (2/4) Exclude cut with ID _9389_210_1664_1_1529217337301_5289269_260-1942-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 01:39:47,902 WARNING [train.py:1204] (2/4) Exclude cut with ID _56423_210_5854_1_1534896061049_3165969_198-61547-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 01:39:47,940 WARNING [train.py:1204] (2/4) Exclude cut with ID _44778_210_2054_1_1533798127203_7443090_412-142122-0_sp1.1 from training. Duration: 0.92725 2023-10-02 01:39:47,960 WARNING [train.py:1204] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_88-341718-0_sp1.1 from training. Duration: 0.6 2023-10-02 01:39:48,113 WARNING [train.py:1204] (2/4) Exclude cut with ID _61079_210_4525_1_1534921008198_4011419_334-298697-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 01:39:48,277 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_5465_210_4211_1_1527227253_55357_8-2499-0_sp1.1 from training. Duration: 0.996375 2023-10-02 01:39:48,462 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_447-201565-0_sp1.1 from training. Duration: 0.97275 2023-10-02 01:39:48,468 WARNING [train.py:1204] (2/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_253-161789-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 01:39:48,470 WARNING [train.py:1204] (2/4) Exclude cut with ID _44304_210_15374_1_1533639352765_4613129_302-22084-0_sp0.9 from training. Duration: 0.9555625 2023-10-02 01:39:48,488 WARNING [train.py:1204] (2/4) Exclude cut with ID _26664_210_7770_1_1532258943280_3418270_187-80815-0_sp1.1 from training. Duration: 0.8454375 2023-10-02 01:39:48,566 WARNING [train.py:1204] (2/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_68-212801-0 from training. Duration: 0.73 2023-10-02 01:39:48,613 WARNING [train.py:1204] (2/4) Exclude cut with ID _29345_210_4381_1_1532689168263_3860169_149-257268-0 from training. Duration: 0.81 2023-10-02 01:39:49,064 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_213-103282-0_sp1.1 from training. Duration: 0.7090625 2023-10-02 01:39:49,179 WARNING [train.py:1204] (2/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_156-302675-0_sp1.1 from training. Duration: 0.5 2023-10-02 01:39:49,771 WARNING [train.py:1204] (2/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_125-322172-0 from training. Duration: 0.93 2023-10-02 01:39:50,469 WARNING [train.py:1204] (2/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_493-60474-0_sp1.1 from training. Duration: 0.963625 2023-10-02 01:39:50,647 WARNING [train.py:1204] (2/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_156-302675-0 from training. Duration: 0.55 2023-10-02 01:39:50,852 WARNING [train.py:1204] (2/4) Exclude cut with ID _41314_210_12651_1_1533459476543_3766460_123-267862-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 01:39:51,040 WARNING [train.py:1204] (2/4) Exclude cut with ID _28427_210_4220_1_1532581240967_3061069_201-143747-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 01:39:51,065 WARNING [train.py:1204] (2/4) Exclude cut with ID _8772_210_1850_1_1528635595972_3664011_96-12756-0_sp1.1 from training. Duration: 0.8909375 2023-10-02 01:39:51,108 WARNING [train.py:1204] (2/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_1347-145675-0_sp1.1 from training. Duration: 0.936375 2023-10-02 01:39:51,224 WARNING [train.py:1204] (2/4) Exclude cut with ID _9235_210_5219_1_1528891154253_8046120_387-159008-0_sp1.1 from training. Duration: 0.7818125 2023-10-02 01:39:51,253 WARNING [train.py:1204] (2/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_400-201896-0_sp1.1 from training. Duration: 0.963625 2023-10-02 01:39:51,473 WARNING [train.py:1204] (2/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_326-174612-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 01:39:51,507 WARNING [train.py:1204] (2/4) Exclude cut with ID _55844_210_2346_1_1534471212569_7625707_390-116233-0_sp1.1 from training. Duration: 0.963625 2023-10-02 01:39:51,537 WARNING [train.py:1204] (2/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_419-317062-0_sp1.1 from training. Duration: 0.82725 2023-10-02 01:39:51,599 WARNING [train.py:1204] (2/4) Exclude cut with ID _29877_210_6228_1_1532397791106_5276149_266-62291-0_sp0.9 from training. Duration: 0.9666875 2023-10-02 01:39:51,658 WARNING [train.py:1204] (2/4) Exclude cut with ID _7857_210_1534_1_1528286515745_6831470_431-338528-0_sp1.1 from training. Duration: 0.92725 2023-10-02 01:39:51,729 WARNING [train.py:1204] (2/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_87-131427-0_sp0.9 from training. Duration: 0.8666875 2023-10-02 01:39:51,949 WARNING [train.py:1204] (2/4) Exclude cut with ID _24055_210_12651_1_1532310907143_976979_92-59356-0_sp1.1 from training. Duration: 0.82725 2023-10-02 01:39:52,036 WARNING [train.py:1204] (2/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_96-290039-0 from training. Duration: 0.56 2023-10-02 01:39:52,153 WARNING [train.py:1204] (2/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_48-182155-0_sp0.9 from training. Duration: 0.9666875 2023-10-02 01:39:52,193 WARNING [train.py:1204] (2/4) Exclude cut with ID _4626_210_3532_1_1526817652717_3916859_446-206656-0 from training. Duration: 0.76 2023-10-02 01:39:52,293 WARNING [train.py:1204] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_168-32906-0 from training. Duration: 0.69 2023-10-02 01:39:52,315 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_3780_210_2087_1_1526467760_2130483_170-259044-0 from training. Duration: 0.7 2023-10-02 01:39:52,335 WARNING [train.py:1204] (2/4) Exclude cut with ID _65134_210_14763_1_1535335393985_3505368_153-216718-0_sp1.1 from training. Duration: 0.92725 2023-10-02 01:39:52,420 WARNING [train.py:1204] (2/4) Exclude cut with ID _64639_210_5816_1_1535367619437_3528000_461-329156-0_sp1.1 from training. Duration: 0.87275 2023-10-02 01:39:52,454 WARNING [train.py:1204] (2/4) Exclude cut with ID _21989_210_5854_1_1531899214931_3271360_126-347531-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 01:39:52,514 WARNING [train.py:1204] (2/4) Exclude cut with ID _7614_210_4915_1_1529326764092_4537359_96-162197-0_sp1.1 from training. Duration: 0.963625 2023-10-02 01:39:52,515 WARNING [train.py:1204] (2/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_1083-147536-0 from training. Duration: 0.97 2023-10-02 01:39:52,808 WARNING [train.py:1204] (2/4) Exclude cut with ID _64029_210_4381_1_1535178502772_3756459_275-16710-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 01:39:52,927 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7264_210_4938_1_1527939911_8132095_800-91995-0_sp0.9 from training. Duration: 0.9555625 2023-10-02 01:39:52,965 WARNING [train.py:1204] (2/4) Exclude cut with ID _3844_210_1390_1_1526727803582_2871209_50-193879-0_sp1.1 from training. Duration: 0.936375 2023-10-02 01:39:53,171 WARNING [train.py:1204] (2/4) Exclude cut with ID _46952_210_5986_1_1534230010357_3107379_232-344503-0 from training. Duration: 0.96 2023-10-02 01:39:53,479 WARNING [train.py:1204] (2/4) Exclude cut with ID _25808_210_13292_1_1532343514562_7366109_206-14076-0_sp1.1 from training. Duration: 0.936375 2023-10-02 01:39:53,480 WARNING [train.py:1204] (2/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_55-31904-0_sp0.9 from training. Duration: 0.97775 2023-10-02 01:39:53,497 WARNING [train.py:1204] (2/4) Exclude cut with ID _14632_210_9988_1_1530691279392_3923200_161-129530-0 from training. Duration: 0.96 2023-10-02 01:39:53,554 WARNING [train.py:1204] (2/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_770-200430-0_sp1.1 from training. Duration: 0.8090625 2023-10-02 01:39:53,563 WARNING [train.py:1204] (2/4) Exclude cut with ID _6270_210_2084_1_1527850911573_3856420_26-277343-0_sp1.1 from training. Duration: 0.7454375 2023-10-02 01:39:53,564 WARNING [train.py:1204] (2/4) Exclude cut with ID _69884_210_9771_1_1535846392818_4166169_88-23776-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 01:39:53,796 WARNING [train.py:1204] (2/4) Exclude cut with ID _60860_210_8254_1_1534896015875_3557169_245-256666-0_sp1.1 from training. Duration: 0.8818125 2023-10-02 01:39:53,830 WARNING [train.py:1204] (2/4) Exclude cut with ID _5250_210_5219_1_1527249561806_8692110_636-251213-0_sp1.1 from training. Duration: 0.8545625 2023-10-02 01:39:53,850 WARNING [train.py:1204] (2/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_540-131164-0 from training. Duration: 0.92 2023-10-02 01:39:54,661 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_5931_210_2040_1_1527428481_4547999_321-193614-0_sp0.9 from training. Duration: 0.7444375 2023-10-02 01:39:55,223 WARNING [train.py:1204] (2/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_373-225403-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 01:39:55,404 WARNING [train.py:1204] (2/4) Exclude cut with ID _20622_210_3852_1_1531622427080_1093839_57-326027-0_sp1.1 from training. Duration: 0.97275 2023-10-02 01:39:55,874 WARNING [train.py:1204] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_605-257205-0_sp1.1 from training. Duration: 0.536375 2023-10-02 01:39:55,903 WARNING [train.py:1204] (2/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_442-320317-0_sp0.9 from training. Duration: 0.911125 2023-10-02 01:39:55,966 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_35361_210_4238_1_1533262810_7793476_675-87012-0_sp1.1 from training. Duration: 0.8090625 2023-10-02 01:39:56,000 WARNING [train.py:1204] (2/4) Exclude cut with ID _4184_210_2550_1_1526730192013_2219380_306-44736-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 01:39:56,218 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_124-113562-0 from training. Duration: 0.94 2023-10-02 01:39:56,388 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_2821_210_2715_1_1526108385_3763560_73-296673-0_sp0.9 from training. Duration: 0.92225 2023-10-02 01:39:56,418 WARNING [train.py:1204] (2/4) Exclude cut with ID _10301_210_5886_1_1529222275332_3910620_216-46959-0_sp1.1 from training. Duration: 0.92725 2023-10-02 01:39:56,554 WARNING [train.py:1204] (2/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_91-277684-0_sp0.9 from training. Duration: 0.82225 2023-10-02 01:39:56,644 WARNING [train.py:1204] (2/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_351-294650-0_sp0.9 from training. Duration: 0.7 2023-10-02 01:39:56,777 WARNING [train.py:1204] (2/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_209-205365-0 from training. Duration: 0.79 2023-10-02 01:39:56,827 WARNING [train.py:1204] (2/4) Exclude cut with ID _14603_210_4220_1_1530615688441_3097319_97-325053-0 from training. Duration: 0.92 2023-10-02 01:39:56,938 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_49348_210_7927_1_1533954500_3691300_109-276430-0_sp1.1 from training. Duration: 0.92725 2023-10-02 01:39:57,063 WARNING [train.py:1204] (2/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_912-47473-0 from training. Duration: 0.68 2023-10-02 01:39:57,139 WARNING [train.py:1204] (2/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_96-320736-0_sp0.9 from training. Duration: 0.9555625 2023-10-02 01:39:57,700 WARNING [train.py:1204] (2/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_1252-235481-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 01:39:57,779 WARNING [train.py:1204] (2/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_813-261554-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 01:39:57,851 WARNING [train.py:1204] (2/4) Exclude cut with ID _21632_210_2253_1_1531994001765_3744140_110-274654-0_sp0.9 from training. Duration: 0.911125 2023-10-02 01:39:57,990 WARNING [train.py:1204] (2/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_381-317791-0_sp1.1 from training. Duration: 0.663625 2023-10-02 01:39:58,002 WARNING [train.py:1204] (2/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_51-117576-0_sp1.1 from training. Duration: 0.363625 2023-10-02 01:39:58,029 WARNING [train.py:1204] (2/4) Exclude cut with ID _42321_210_7770_1_1533468631352_4366895_104-215915-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 01:39:58,204 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_40896_210_15710_1_1533433956_7100802_216-22824-0_sp1.1 from training. Duration: 0.92725 2023-10-02 01:39:58,205 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_14831_210_7927_1_1530700200_5583610_181-27467-0 from training. Duration: 0.91 2023-10-02 01:39:58,265 WARNING [train.py:1204] (2/4) Exclude cut with ID _30304_210_6319_1_1532476827261_7582060_1024-265645-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 01:39:58,267 WARNING [train.py:1204] (2/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_412-233904-0_sp1.1 from training. Duration: 0.936375 2023-10-02 01:39:58,297 WARNING [train.py:1204] (2/4) Exclude cut with ID _30191_210_1664_1_1533189915047_7196670_911-223461-0_sp1.1 from training. Duration: 0.92725 2023-10-02 01:39:58,304 WARNING [train.py:1204] (2/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_558-148266-0_sp1.1 from training. Duration: 0.963625 2023-10-02 01:39:59,141 WARNING [train.py:1204] (2/4) Exclude cut with ID _58480_210_12392_1_1534811121111_3646160_212-164495-0_sp1.1 from training. Duration: 0.936375 2023-10-02 01:39:59,143 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_454-276429-0_sp0.9 from training. Duration: 0.9666875 2023-10-02 01:39:59,224 WARNING [train.py:1204] (2/4) Exclude cut with ID _24736_210_3279_1_1532332894218_2838700_73-197086-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 01:39:59,275 WARNING [train.py:1204] (2/4) Exclude cut with ID _5294_210_1747_1_1527249643928_3696179_529-242298-0_sp1.1 from training. Duration: 0.863625 2023-10-02 01:39:59,305 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_53555_210_5632_1_1534595220_7232297_345-171438-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 01:39:59,608 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_28211_210_6388_1_1532861506_4523224_22-25766-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 01:39:59,687 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7002_210_1794_1_1527939683_5157286_163-87552-0 from training. Duration: 0.86 2023-10-02 01:39:59,849 WARNING [train.py:1204] (2/4) Exclude cut with ID _56927_210_9969_1_1534848970840_3695229_409-225129-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 01:39:59,968 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_26192_210_7927_1_1532137269_1040115_21-219331-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 01:40:00,125 WARNING [train.py:1204] (2/4) Exclude cut with ID _15012_210_1968_1_1530856820429_5095350_662-210469-0 from training. Duration: 0.57 2023-10-02 01:40:00,588 WARNING [train.py:1204] (2/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_582-191016-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 01:40:00,766 WARNING [train.py:1204] (2/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_270-210032-0_sp1.1 from training. Duration: 0.963625 2023-10-02 01:40:00,808 WARNING [train.py:1204] (2/4) Exclude cut with ID _6789_210_1534_1_1527857937218_3118950_262-48853-0_sp0.9 from training. Duration: 0.62225 2023-10-02 01:40:00,873 WARNING [train.py:1204] (2/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_847-41334-0 from training. Duration: 0.44 2023-10-02 01:40:00,910 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_40896_210_15710_1_1533433956_7100802_573-46532-0_sp1.1 from training. Duration: 0.92725 2023-10-02 01:40:01,137 WARNING [train.py:1204] (2/4) Exclude cut with ID _21881_210_3525_1_1531825242757_3798380_247-102591-0_sp1.1 from training. Duration: 0.8909375 2023-10-02 01:40:01,170 WARNING [train.py:1204] (2/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_200-21395-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 01:40:01,441 WARNING [train.py:1204] (2/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_711-15688-0 from training. Duration: 0.43 2023-10-02 01:40:01,615 WARNING [train.py:1204] (2/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_53-75025-0_sp1.1 from training. Duration: 0.963625 2023-10-02 01:40:01,801 WARNING [train.py:1204] (2/4) Exclude cut with ID _6194_210_1534_1_1527511826160_3941194_321-148641-0 from training. Duration: 0.64 2023-10-02 01:40:01,846 WARNING [train.py:1204] (2/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_337-294431-0_sp1.1 from training. Duration: 0.936375 2023-10-02 01:40:01,967 WARNING [train.py:1204] (2/4) Exclude cut with ID _21632_210_2253_1_1531994001765_3744140_110-274654-0_sp1.1 from training. Duration: 0.7454375 2023-10-02 01:40:02,160 WARNING [train.py:1204] (2/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_135-174080-0_sp1.1 from training. Duration: 0.4818125 2023-10-02 01:40:02,269 WARNING [train.py:1204] (2/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_21-174514-0 from training. Duration: 0.43 2023-10-02 01:40:02,418 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_38965_210_4852_1_1533174906_3484735_215-294589-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 01:40:02,450 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_548-294316-0 from training. Duration: 0.81 2023-10-02 01:40:03,672 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_103-84010-0_sp1.1 from training. Duration: 0.62725 2023-10-02 01:40:03,863 WARNING [train.py:1204] (2/4) Exclude cut with ID _18199_210_6109_1_1531475789864_4288790_295-198247-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 01:40:03,889 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_13757_210_3528_1_1530440682_4107239_103-313224-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 01:40:04,137 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_34886_210_10633_1_1533203882_1755661_67-294673-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 01:40:04,146 WARNING [train.py:1204] (2/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_350-101734-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 01:40:04,316 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_37357_210_17024_1_1533089504_2316121_204-153986-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 01:40:04,437 WARNING [train.py:1204] (2/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_673-115569-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 01:40:04,615 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_489-230703-0_sp0.9 from training. Duration: 0.9444375 2023-10-02 01:40:04,689 WARNING [train.py:1204] (2/4) Exclude cut with ID _5250_210_5219_1_1527249561806_8692110_575-102496-0_sp1.1 from training. Duration: 0.936375 2023-10-02 01:40:04,765 WARNING [train.py:1204] (2/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_669-93163-0 from training. Duration: 0.98 2023-10-02 01:40:04,892 WARNING [train.py:1204] (2/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_249-281349-0_sp0.9 from training. Duration: 0.9444375 2023-10-02 01:40:05,084 WARNING [train.py:1204] (2/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_106-30782-0 from training. Duration: 0.8 2023-10-02 01:40:05,161 WARNING [train.py:1204] (2/4) Exclude cut with ID _8772_210_1850_1_1528635595972_3664011_398-320049-0_sp1.1 from training. Duration: 0.8 2023-10-02 01:40:05,314 WARNING [train.py:1204] (2/4) Exclude cut with ID _44778_210_2054_1_1533798127203_7443090_516-151727-0_sp1.1 from training. Duration: 0.963625 2023-10-02 01:40:05,376 WARNING [train.py:1204] (2/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_325-62802-0_sp1.1 from training. Duration: 0.7909375 2023-10-02 01:40:05,674 WARNING [train.py:1204] (2/4) Exclude cut with ID _14368_210_4220_1_1530532714870_3595529_93-58002-0_sp0.9 from training. Duration: 0.6333125 2023-10-02 01:40:05,705 WARNING [train.py:1204] (2/4) Exclude cut with ID _4681_210_3525_1_1527251040321_1235713_123-24428-0_sp1.1 from training. Duration: 0.436375 2023-10-02 01:40:05,769 WARNING [train.py:1204] (2/4) Exclude cut with ID _5250_210_5219_1_1527249561806_8692110_1185-56473-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 01:40:05,914 WARNING [train.py:1204] (2/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_96-244299-0_sp1.1 from training. Duration: 0.8 2023-10-02 01:40:06,180 WARNING [train.py:1204] (2/4) Exclude cut with ID _59500_210_8254_1_1534809610680_3736009_104-72815-0_sp1.1 from training. Duration: 0.963625 2023-10-02 01:40:06,363 WARNING [train.py:1204] (2/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_468-283113-0_sp1.1 from training. Duration: 0.87275 2023-10-02 01:40:06,492 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6239_210_4211_1_1527765584_4306698_528-102186-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 01:40:06,551 WARNING [train.py:1204] (2/4) Exclude cut with ID _45297_210_2721_1_1533715351381_3264949_351-147136-0 from training. Duration: 0.87 2023-10-02 01:40:06,555 WARNING [train.py:1204] (2/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_520-51744-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 01:40:06,646 WARNING [train.py:1204] (2/4) Exclude cut with ID _5166_210_2084_1_1527073197160_4087993_310-114452-0_sp1.1 from training. Duration: 0.536375 2023-10-02 01:40:06,649 WARNING [train.py:1204] (2/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_450-165710-0_sp1.1 from training. Duration: 0.92725 2023-10-02 01:40:06,659 WARNING [train.py:1204] (2/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_280-120942-0_sp1.1 from training. Duration: 0.7 2023-10-02 01:40:06,698 WARNING [train.py:1204] (2/4) Exclude cut with ID _65585_210_12210_1_1535450451584_3764098_112-294301-0_sp0.9 from training. Duration: 0.97775 2023-10-02 01:40:06,745 WARNING [train.py:1204] (2/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_98-51676-0_sp0.9 from training. Duration: 0.811125 2023-10-02 01:40:07,567 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_33348_210_5632_1_1533086912_3324030_85-28583-0_sp1.1 from training. Duration: 0.7454375 2023-10-02 01:40:07,659 WARNING [train.py:1204] (2/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_336-208232-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 01:40:07,802 WARNING [train.py:1204] (2/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_339-104791-0_sp0.9 from training. Duration: 0.5444375 2023-10-02 01:40:08,005 WARNING [train.py:1204] (2/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_559-29840-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 01:40:08,246 WARNING [train.py:1204] (2/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_117-290916-0_sp0.9 from training. Duration: 0.5666875 2023-10-02 01:40:08,356 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_43802_210_2290_1_1533516203_8829504_185-112289-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 01:40:08,564 WARNING [train.py:1204] (2/4) Exclude cut with ID _59256_210_5986_1_1535076085997_3589120_155-298797-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 01:40:08,768 WARNING [train.py:1204] (2/4) Exclude cut with ID _44659_210_2721_1_1534036120061_6614860_246-206531-0_sp1.1 from training. Duration: 0.92725 2023-10-02 01:40:08,828 WARNING [train.py:1204] (2/4) Exclude cut with ID _23818_210_6228_1_1531917063884_3676920_469-151944-0_sp1.1 from training. Duration: 0.92725 2023-10-02 01:40:08,914 WARNING [train.py:1204] (2/4) Exclude cut with ID _13252_210_1925_1_1530671372047_3336909_88-276668-0_sp1.1 from training. Duration: 0.9 2023-10-02 01:40:09,083 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_15101_210_7927_1_1530786415_5033274_320-327527-0_sp1.1 from training. Duration: 0.87275 2023-10-02 01:40:09,091 WARNING [train.py:1204] (2/4) Exclude cut with ID _14998_210_5219_1_1530705582080_7474970_446-112117-0 from training. Duration: 0.9 2023-10-02 01:40:09,176 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_30280_210_1881_1_1532872779_3660139_400-254159-0_sp1.1 from training. Duration: 0.936375 2023-10-02 01:40:09,427 WARNING [train.py:1204] (2/4) Exclude cut with ID _40284_210_2084_1_1533513679814_3704279_158-345341-0_sp1.1 from training. Duration: 0.936375 2023-10-02 01:40:09,714 WARNING [train.py:1204] (2/4) Exclude cut with ID 3033-130750-0096-55598-0 from training. Duration: 0.83 2023-10-02 01:40:09,876 WARNING [train.py:1204] (2/4) Exclude cut with ID _9389_210_1664_1_1529217337301_5289269_379-128794-0 from training. Duration: 0.85 2023-10-02 01:40:09,898 WARNING [train.py:1204] (2/4) Exclude cut with ID _3675_210_4557_1_1526639934408_4182089_456-18305-0 from training. Duration: 0.76 2023-10-02 01:40:09,934 WARNING [train.py:1204] (2/4) Exclude cut with ID _29853_210_7497_1_1532505600851_6642110_571-249460-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 01:40:10,021 WARNING [train.py:1204] (2/4) Exclude cut with ID _4068_210_2629_1_1526697350395_1546259_230-325568-0_sp1.1 from training. Duration: 0.92725 2023-10-02 01:40:10,022 WARNING [train.py:1204] (2/4) Exclude cut with ID _34415_210_8750_1_1533085213375_3451116_213-267570-0_sp1.1 from training. Duration: 0.963625 2023-10-02 01:40:10,121 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6767_210_3273_1_1527855783_1718932_142-127132-0_sp1.1 from training. Duration: 0.863625 2023-10-02 01:40:10,620 WARNING [train.py:1204] (2/4) Exclude cut with ID IC0653W0001-322925 from training. Duration: 0.896 2023-10-02 01:40:10,700 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_4528_210_3273_1_1527076654_3276777_209-219804-0_sp1.1 from training. Duration: 0.62725 2023-10-02 01:40:10,816 WARNING [train.py:1204] (2/4) Exclude cut with ID _55844_210_2346_1_1534471212569_7625707_187-204971-0_sp1.1 from training. Duration: 0.936375 2023-10-02 01:40:10,950 WARNING [train.py:1204] (2/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_369-325184-0 from training. Duration: 0.99 2023-10-02 01:40:10,983 WARNING [train.py:1204] (2/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_791-45149-0 from training. Duration: 0.86 2023-10-02 01:40:11,034 WARNING [train.py:1204] (2/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_317-211107-0_sp1.1 from training. Duration: 0.836375 2023-10-02 01:40:11,129 WARNING [train.py:1204] (2/4) Exclude cut with ID _30800_210_22382_1_1532499347904_4515959_488-330913-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 01:40:11,224 WARNING [train.py:1204] (2/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_506-42614-0_sp1.1 from training. Duration: 0.7181875 2023-10-02 01:40:12,191 WARNING [train.py:1204] (2/4) Exclude cut with ID _8696_210_1876_1_1528545632440_3345058_209-54069-0 from training. Duration: 0.67 2023-10-02 01:40:12,647 WARNING [train.py:1204] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_215-202973-0_sp1.1 from training. Duration: 0.8 2023-10-02 01:40:12,716 WARNING [train.py:1204] (2/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_321-215742-0 from training. Duration: 0.5 2023-10-02 01:40:12,761 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_40896_210_15710_1_1533433956_7100802_428-32114-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 01:40:12,776 WARNING [train.py:1204] (2/4) Exclude cut with ID _27013_210_1389_1_1532437173240_3252649_467-263151-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 01:40:12,816 WARNING [train.py:1204] (2/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_734-129761-0_sp0.9 from training. Duration: 0.911125 2023-10-02 01:40:12,863 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_521-214276-0 from training. Duration: 0.44 2023-10-02 01:40:13,105 WARNING [train.py:1204] (2/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_80-147301-0_sp1.1 from training. Duration: 0.763625 2023-10-02 01:40:13,115 WARNING [train.py:1204] (2/4) Exclude cut with ID _9679_210_7497_1_1529129212906_4363929_56-307796-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 01:40:13,343 WARNING [train.py:1204] (2/4) Exclude cut with ID _14832_210_8750_1_1530691351539_3171348_1-54315-0 from training. Duration: 0.87 2023-10-02 01:40:13,344 WARNING [train.py:1204] (2/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_558-155750-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 01:40:13,519 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_142-309370-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 01:40:13,562 WARNING [train.py:1204] (2/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_18-46756-0_sp0.9 from training. Duration: 0.87775 2023-10-02 01:40:13,568 WARNING [train.py:1204] (2/4) Exclude cut with ID _61148_210_6897_1_1535094022726_4217610_279-212159-0_sp1.1 from training. Duration: 0.936375 2023-10-02 01:40:13,683 WARNING [train.py:1204] (2/4) Exclude cut with ID _21987_210_5854_1_1531821500482_3637038_164-2641-0_sp1.1 from training. Duration: 0.936375 2023-10-02 01:40:13,733 WARNING [train.py:1204] (2/4) Exclude cut with ID _8064_210_5399_1_1528372299291_4282973_395-340852-0_sp1.1 from training. Duration: 0.8454375 2023-10-02 01:40:13,874 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1077-184261-0_sp1.1 from training. Duration: 0.963625 2023-10-02 01:40:13,882 WARNING [train.py:1204] (2/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_1176-204992-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 01:40:14,080 WARNING [train.py:1204] (2/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_563-347832-0_sp1.1 from training. Duration: 0.8090625 2023-10-02 01:40:14,137 WARNING [train.py:1204] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_380-204756-0_sp0.9 from training. Duration: 0.9555625 2023-10-02 01:40:14,659 WARNING [train.py:1204] (2/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_212-10501-0_sp1.1 from training. Duration: 0.8545625 2023-10-02 01:40:14,687 WARNING [train.py:1204] (2/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_445-117398-0 from training. Duration: 0.89 2023-10-02 01:40:14,964 WARNING [train.py:1204] (2/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_29-280622-0 from training. Duration: 0.82 2023-10-02 01:40:15,209 WARNING [train.py:1204] (2/4) Exclude cut with ID _26664_210_7770_1_1532258943280_3418270_187-80815-0 from training. Duration: 0.93 2023-10-02 01:40:15,487 WARNING [train.py:1204] (2/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_414-349640-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 01:40:15,503 WARNING [train.py:1204] (2/4) Exclude cut with ID _45021_210_9994_1_1533619751094_3330288_96-239010-0 from training. Duration: 0.91 2023-10-02 01:40:15,505 WARNING [train.py:1204] (2/4) Exclude cut with ID _5189_210_4557_1_1527248319428_3769229_199-53526-0_sp0.9 from training. Duration: 0.4666875 2023-10-02 01:40:16,821 WARNING [train.py:1204] (2/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_63-101996-0_sp1.1 from training. Duration: 0.8 2023-10-02 01:40:16,966 WARNING [train.py:1204] (2/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_199-86432-0 from training. Duration: 0.98 2023-10-02 01:40:17,166 WARNING [train.py:1204] (2/4) Exclude cut with ID _36399_210_16049_1_1532998771377_3698333_207-259013-0_sp1.1 from training. Duration: 0.92725 2023-10-02 01:40:17,208 WARNING [train.py:1204] (2/4) Exclude cut with ID _62574_210_2655_1_1535180407058_3933249_567-111756-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 01:40:17,500 WARNING [train.py:1204] (2/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_436-318113-0 from training. Duration: 0.75 2023-10-02 01:40:17,545 WARNING [train.py:1204] (2/4) Exclude cut with ID _53379_210_20034_1_1534222027363_3854654_43-305214-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 01:40:17,565 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_15126_210_6753_1_1530778677_2591185_113-157651-0_sp1.1 from training. Duration: 0.6181875 2023-10-02 01:40:17,677 WARNING [train.py:1204] (2/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_146-218763-0_sp1.1 from training. Duration: 0.7818125 2023-10-02 01:40:17,843 WARNING [train.py:1204] (2/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_714-310538-0_sp0.9 from training. Duration: 0.9555625 2023-10-02 01:40:18,015 WARNING [train.py:1204] (2/4) Exclude cut with ID _24271_210_12658_1_1531987270022_7190519_709-16721-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 01:40:18,244 WARNING [train.py:1204] (2/4) Exclude cut with ID _5190_210_4557_1_1527244368799_373439_28-197030-0_sp0.9 from training. Duration: 0.8 2023-10-02 01:40:18,279 WARNING [train.py:1204] (2/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_267-197536-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 01:40:18,308 WARNING [train.py:1204] (2/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_658-282610-0_sp1.1 from training. Duration: 0.4818125 2023-10-02 01:40:18,319 WARNING [train.py:1204] (2/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_257-67050-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 01:40:18,476 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_14906_210_7927_1_1530754190_9408633_961-53402-0_sp1.1 from training. Duration: 0.936375 2023-10-02 01:40:18,587 WARNING [train.py:1204] (2/4) Exclude cut with ID _60113_210_16271_1_1535164212481_4386079_490-280290-0_sp1.1 from training. Duration: 0.8909375 2023-10-02 01:40:18,809 WARNING [train.py:1204] (2/4) Exclude cut with ID _14100_210_8341_1_1530529320613_3678069_258-224599-0 from training. Duration: 0.77 2023-10-02 01:40:18,861 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6961_210_2087_1_1527937168_6815011_565-326898-0_sp1.1 from training. Duration: 0.936375 2023-10-02 01:40:19,149 WARNING [train.py:1204] (2/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_239-316802-0_sp1.1 from training. Duration: 0.62725 2023-10-02 01:40:19,204 WARNING [train.py:1204] (2/4) Exclude cut with ID _55857_210_2346_1_1534644007831_6898209_745-98391-0_sp1.1 from training. Duration: 0.6909375 2023-10-02 01:40:19,207 WARNING [train.py:1204] (2/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_154-135696-0 from training. Duration: 0.8 2023-10-02 01:40:19,221 WARNING [train.py:1204] (2/4) Exclude cut with ID _14368_210_4220_1_1530532714870_3595529_93-58002-0 from training. Duration: 0.57 2023-10-02 01:40:19,352 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9025W0005-507335 from training. Duration: 0.896 2023-10-02 01:40:19,421 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9029W0034-509435 from training. Duration: 0.64 2023-10-02 01:40:19,553 WARNING [train.py:1204] (2/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_76-203867-0_sp1.1 from training. Duration: 0.6545625 2023-10-02 01:40:19,563 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_46639_210_8341_1_1533726088_4495500_66-346325-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 01:40:19,577 WARNING [train.py:1204] (2/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_145-137790-0_sp0.9 from training. Duration: 0.92225 2023-10-02 01:40:19,596 WARNING [train.py:1204] (2/4) Exclude cut with ID _36522_210_8887_1_1533177030611_1772478_7-327299-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 01:40:19,677 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9042W0009-515615 from training. Duration: 0.896 2023-10-02 01:40:19,684 WARNING [train.py:1204] (2/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_39-248870-0_sp0.9 from training. Duration: 0.7666875 2023-10-02 01:40:19,734 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_35441_210_16554_1_1533347572_4092680_362-242912-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 01:40:19,816 WARNING [train.py:1204] (2/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_216-343007-0_sp1.1 from training. Duration: 0.6 2023-10-02 01:40:19,911 WARNING [train.py:1204] (2/4) Exclude cut with ID _3867_210_1883_1_1526641200575_4475830_272-215563-0_sp0.9 from training. Duration: 0.6333125 2023-10-02 01:40:20,012 WARNING [train.py:1204] (2/4) Exclude cut with ID _21783_210_11357_1_1531742458730_3762679_553-175603-0_sp1.1 from training. Duration: 0.92725 2023-10-02 01:40:20,021 WARNING [train.py:1204] (2/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_963-187109-0 from training. Duration: 0.99 2023-10-02 01:40:20,833 WARNING [train.py:1204] (2/4) Exclude cut with ID _44973_210_1511_1_1533794381163_3634770_495-211466-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 01:40:20,843 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9068W0002-529085 from training. Duration: 0.896 2023-10-02 01:40:20,871 WARNING [train.py:1204] (2/4) Exclude cut with ID _5294_210_1747_1_1527249643928_3696179_180-100713-0_sp1.1 from training. Duration: 0.736375 2023-10-02 01:40:20,973 WARNING [train.py:1204] (2/4) Exclude cut with ID _35799_210_5854_1_1533263487887_3529071_149-49689-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 01:40:21,348 WARNING [train.py:1204] (2/4) Exclude cut with ID _67725_210_27767_1_1535608756671_5105359_36-220794-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 01:40:21,490 WARNING [train.py:1204] (2/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_338-96520-0_sp1.1 from training. Duration: 0.8545625 2023-10-02 01:40:21,567 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9104W0004-547160 from training. Duration: 0.896 2023-10-02 01:40:21,640 WARNING [train.py:1204] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_330-327861-0 from training. Duration: 0.67 2023-10-02 01:40:21,768 WARNING [train.py:1204] (2/4) Exclude cut with ID _14087_210_2253_1_1530529224512_3014029_156-257607-0_sp0.9 from training. Duration: 0.92225 2023-10-02 01:40:21,821 WARNING [train.py:1204] (2/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_476-322050-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 01:40:21,836 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9118W0017-553385 from training. Duration: 0.896 2023-10-02 01:40:21,879 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9120W0028-554435 from training. Duration: 0.896 2023-10-02 01:40:22,206 WARNING [train.py:1204] (2/4) Exclude cut with ID _56524_210_9642_1_1534572011779_3619170_145-310842-0 from training. Duration: 0.99 2023-10-02 01:40:22,346 WARNING [train.py:1204] (2/4) Exclude cut with ID _51631_210_8254_1_1534129228420_4084794_270-222508-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 01:40:22,621 WARNING [train.py:1204] (2/4) Exclude cut with ID _14268_210_9206_1_1530622771285_4714099_331-105351-0 from training. Duration: 0.65 2023-10-02 01:40:22,749 WARNING [train.py:1204] (2/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_40-129756-0 from training. Duration: 0.81 2023-10-02 01:40:23,407 WARNING [train.py:1204] (2/4) Exclude cut with ID _3541_210_2477_1_1526475408709_3263973_231-104736-0 from training. Duration: 0.78 2023-10-02 01:40:23,612 WARNING [train.py:1204] (2/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_214-291369-0 from training. Duration: 0.63 2023-10-02 01:40:23,624 WARNING [train.py:1204] (2/4) Exclude cut with ID _62574_210_2655_1_1535180407058_3933249_369-84347-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 01:40:23,672 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9210W0111-599240 from training. Duration: 0.896 2023-10-02 01:40:23,682 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_92-246270-0 from training. Duration: 0.85 2023-10-02 01:40:23,719 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_37152_210_9697_1_1533106573_3844188_29-239070-0 from training. Duration: 0.98 2023-10-02 01:40:23,768 WARNING [train.py:1204] (2/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_239-22076-0 from training. Duration: 0.85 2023-10-02 01:40:23,770 WARNING [train.py:1204] (2/4) Exclude cut with ID _65962_210_29927_1_1535436026519_4174930_368-44639-0_sp1.1 from training. Duration: 0.97275 2023-10-02 01:40:24,080 WARNING [train.py:1204] (2/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_152-212533-0 from training. Duration: 0.97 2023-10-02 01:40:24,296 WARNING [train.py:1204] (2/4) Exclude cut with ID _14603_210_4220_1_1530615688441_3097319_90-31383-0_sp1.1 from training. Duration: 0.7909375 2023-10-02 01:40:25,143 WARNING [train.py:1204] (2/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_196-152868-0_sp1.1 from training. Duration: 0.92725 2023-10-02 01:40:25,144 WARNING [train.py:1204] (2/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_140-181176-0 from training. Duration: 0.92 2023-10-02 01:40:25,301 WARNING [train.py:1204] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_168-32906-0_sp0.9 from training. Duration: 0.7666875 2023-10-02 01:40:25,575 WARNING [train.py:1204] (2/4) Exclude cut with ID _14922_210_6228_1_1530707435273_3693310_13-165512-0 from training. Duration: 0.82 2023-10-02 01:40:25,608 WARNING [train.py:1204] (2/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_381-317791-0_sp0.9 from training. Duration: 0.811125 2023-10-02 01:40:26,084 WARNING [train.py:1204] (2/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_63-267585-0_sp1.1 from training. Duration: 0.8181875 2023-10-02 01:40:26,086 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_326-303945-0_sp1.1 from training. Duration: 0.9 2023-10-02 01:40:26,106 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_194-143381-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 01:40:26,180 WARNING [train.py:1204] (2/4) Exclude cut with ID _39290_210_4220_1_1533862778760_3075118_194-16667-0_sp1.1 from training. Duration: 0.963625 2023-10-02 01:40:26,290 WARNING [train.py:1204] (2/4) Exclude cut with ID _15012_210_1968_1_1530856820429_5095350_662-210469-0_sp0.9 from training. Duration: 0.6333125 2023-10-02 01:40:26,328 WARNING [train.py:1204] (2/4) Exclude cut with ID _4697_210_2069_1_1526815801953_3823098_68-219807-0_sp1.1 from training. Duration: 0.42725 2023-10-02 01:40:26,388 WARNING [train.py:1204] (2/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_321-22821-0_sp0.9 from training. Duration: 0.988875 2023-10-02 01:40:26,583 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_14056_210_4163_1_1530604483_7572827_1037-45317-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 01:40:26,584 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_403-39619-0_sp1.1 from training. Duration: 0.763625 2023-10-02 01:40:26,775 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_20120_210_6753_1_1531643035_5482859_510-96996-0_sp0.9 from training. Duration: 0.9666875 2023-10-02 01:40:26,778 WARNING [train.py:1204] (2/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_225-180155-0_sp1.1 from training. Duration: 0.92725 2023-10-02 01:40:26,829 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_278-272612-0_sp0.9 from training. Duration: 0.811125 2023-10-02 01:40:26,916 WARNING [train.py:1204] (2/4) Exclude cut with ID _53713_210_24116_1_1534314487762_7416751_691-70244-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 01:40:27,144 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6788_210_1388_1_1527898698_4562021_91-197981-0_sp1.1 from training. Duration: 0.7545625 2023-10-02 01:40:27,320 WARNING [train.py:1204] (2/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_60-18123-0_sp0.9 from training. Duration: 0.97775 2023-10-02 01:40:27,373 WARNING [train.py:1204] (2/4) Exclude cut with ID _35799_210_5854_1_1533263487887_3529071_5-214617-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 01:40:27,377 WARNING [train.py:1204] (2/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_890-5117-0_sp1.1 from training. Duration: 0.7090625 2023-10-02 01:40:27,573 WARNING [train.py:1204] (2/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_660-211088-0 from training. Duration: 0.62 2023-10-02 01:40:27,627 WARNING [train.py:1204] (2/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_314-126819-0_sp0.9 from training. Duration: 0.5555625 2023-10-02 01:40:27,655 WARNING [train.py:1204] (2/4) Exclude cut with ID _21632_210_2253_1_1531994001765_3744140_362-66225-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 01:40:27,743 WARNING [train.py:1204] (2/4) Exclude cut with ID _61118_210_9527_1_1534897668884_3752630_173-258555-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 01:40:27,954 WARNING [train.py:1204] (2/4) Exclude cut with ID _7719_210_3525_1_1528456153163_4296960_88-250259-0_sp0.9 from training. Duration: 0.9333125 2023-10-02 01:40:27,995 WARNING [train.py:1204] (2/4) Exclude cut with ID _15140_210_9288_1_1530791388450_4880149_539-153708-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 01:40:28,015 WARNING [train.py:1204] (2/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_253-42544-0_sp1.1 from training. Duration: 0.97275 2023-10-02 01:40:28,197 WARNING [train.py:1204] (2/4) Exclude cut with ID _33640_210_2655_1_1533117605479_4855469_479-296877-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 01:40:28,402 WARNING [train.py:1204] (2/4) Exclude cut with ID 3033-130750-0096-55598-0_sp1.1 from training. Duration: 0.7545625 2023-10-02 01:40:28,419 WARNING [train.py:1204] (2/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_191-41023-0_sp1.1 from training. Duration: 0.6454375 2023-10-02 01:40:28,446 WARNING [train.py:1204] (2/4) Exclude cut with ID _8365_210_2064_1_1528592311070_4677690_431-294102-0_sp1.1 from training. Duration: 0.8090625 2023-10-02 01:40:28,469 WARNING [train.py:1204] (2/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_893-296030-0_sp1.1 from training. Duration: 0.97275 2023-10-02 01:40:28,722 WARNING [train.py:1204] (2/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_401-343820-0_sp1.1 from training. Duration: 0.97275 2023-10-02 01:40:29,544 WARNING [train.py:1204] (2/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_127-566-0_sp1.1 from training. Duration: 0.763625 2023-10-02 01:40:29,546 WARNING [train.py:1204] (2/4) Exclude cut with ID _14100_210_8341_1_1530529320613_3678069_126-272847-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 01:40:29,574 WARNING [train.py:1204] (2/4) Exclude cut with ID _15133_210_6228_1_1530794512074_3080379_29-264458-0_sp1.1 from training. Duration: 0.52725 2023-10-02 01:40:29,574 WARNING [train.py:1204] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_127-119109-0 from training. Duration: 0.72 2023-10-02 01:40:29,652 WARNING [train.py:1204] (2/4) Exclude cut with ID _6537_210_1883_1_1527987559710_3597129_382-133062-0_sp1.1 from training. Duration: 0.6181875 2023-10-02 01:40:29,670 WARNING [train.py:1204] (2/4) Exclude cut with ID _29529_210_7234_1_1532393961455_3977460_391-219682-0_sp1.1 from training. Duration: 0.936375 2023-10-02 01:40:29,710 WARNING [train.py:1204] (2/4) Exclude cut with ID _37537_210_2253_1_1533171758568_3421949_96-137189-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 01:40:30,017 WARNING [train.py:1204] (2/4) Exclude cut with ID _58414_210_8254_1_1535421609162_7151182_15-276301-0_sp1.1 from training. Duration: 0.97275 2023-10-02 01:40:30,199 WARNING [train.py:1204] (2/4) Exclude cut with ID _59502_210_12210_1_1535107024698_1835139_110-289074-0_sp1.1 from training. Duration: 0.92725 2023-10-02 01:40:30,472 WARNING [train.py:1204] (2/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_367-61330-0_sp1.1 from training. Duration: 0.6818125 2023-10-02 01:40:30,710 WARNING [train.py:1204] (2/4) Exclude cut with ID _45297_210_2721_1_1533715351381_3264949_353-9541-0_sp1.1 from training. Duration: 0.8818125 2023-10-02 01:40:30,748 WARNING [train.py:1204] (2/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_85-138257-0_sp0.9 from training. Duration: 0.788875 2023-10-02 01:40:30,779 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_3787_210_2040_1_1526477544_5478586_603-120324-0 from training. Duration: 0.81 2023-10-02 01:40:30,828 WARNING [train.py:1204] (2/4) Exclude cut with ID _42863_210_9070_1_1533902398788_3696920_411-204131-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 01:40:31,127 WARNING [train.py:1204] (2/4) Exclude cut with ID _30196_210_1664_1_1533708496522_7074140_822-289909-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 01:40:31,199 WARNING [train.py:1204] (2/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_32-335103-0_sp0.9 from training. Duration: 0.911125 2023-10-02 01:40:31,256 WARNING [train.py:1204] (2/4) Exclude cut with ID _6493_210_1747_1_1528459210424_4012470_513-140967-0_sp0.9 from training. Duration: 0.87775 2023-10-02 01:40:31,856 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_47896_210_20508_1_1534492518_5958868_439-257766-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 01:40:31,930 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_1294-63152-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 01:40:32,119 WARNING [train.py:1204] (2/4) Exclude cut with ID _59170_210_6721_1_1534983633960_4203335_319-166512-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 01:40:32,275 WARNING [train.py:1204] (2/4) Exclude cut with ID _3541_210_2477_1_1526475408709_3263973_146-3470-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 01:40:32,458 WARNING [train.py:1204] (2/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_678-159307-0_sp1.1 from training. Duration: 0.77275 2023-10-02 01:40:32,498 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_343-48822-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 01:40:32,559 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_4692_210_3528_1_1526899755_4323254_107-44401-0 from training. Duration: 0.86 2023-10-02 01:40:32,564 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6291_210_3273_1_1527683649_1078498_74-292608-0_sp1.1 from training. Duration: 0.6545625 2023-10-02 01:40:32,648 WARNING [train.py:1204] (2/4) Exclude cut with ID _55644_210_11856_1_1534593599582_4103719_293-248694-0_sp1.1 from training. Duration: 0.97275 2023-10-02 01:40:32,736 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_3172_210_2087_1_1526292929_3987865_197-97762-0 from training. Duration: 0.75 2023-10-02 01:40:32,923 WARNING [train.py:1204] (2/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_264-348896-0_sp0.9 from training. Duration: 0.9444375 2023-10-02 01:40:33,835 WARNING [train.py:1204] (2/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_209-205365-0_sp0.9 from training. Duration: 0.87775 2023-10-02 01:40:33,912 WARNING [train.py:1204] (2/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_222-198127-0_sp0.9 from training. Duration: 0.9333125 2023-10-02 01:40:33,958 WARNING [train.py:1204] (2/4) Exclude cut with ID _57572_210_5517_1_1534921219624_3809484_52-43530-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 01:40:34,072 WARNING [train.py:1204] (2/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_96-320736-0_sp1.1 from training. Duration: 0.7818125 2023-10-02 01:40:34,226 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_13091_210_7925_1_1530609447_8987354_1159-225508-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 01:40:34,445 WARNING [train.py:1204] (2/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_237-44332-0 from training. Duration: 0.94 2023-10-02 01:40:34,495 WARNING [train.py:1204] (2/4) Exclude cut with ID _7970_210_1883_1_1528455616296_3472230_25-178407-0_sp0.9 from training. Duration: 0.788875 2023-10-02 01:40:34,721 WARNING [train.py:1204] (2/4) Exclude cut with ID _29003_210_19596_1_1532507391720_3894098_357-257062-0_sp1.1 from training. Duration: 0.936375 2023-10-02 01:40:34,883 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_809-17156-0 from training. Duration: 0.73 2023-10-02 01:40:34,975 WARNING [train.py:1204] (2/4) Exclude cut with ID _7160_210_1876_1_1527940923231_3703799_302-150728-0_sp1.1 from training. Duration: 0.7090625 2023-10-02 01:40:34,992 WARNING [train.py:1204] (2/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_444-28041-0_sp0.9 from training. Duration: 0.9444375 2023-10-02 01:40:35,194 WARNING [train.py:1204] (2/4) Exclude cut with ID _52892_210_6721_1_1534378941259_4124129_278-251666-0_sp1.1 from training. Duration: 0.92725 2023-10-02 01:40:35,289 WARNING [train.py:1204] (2/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_115-326778-0 from training. Duration: 0.93 2023-10-02 01:40:35,401 WARNING [train.py:1204] (2/4) Exclude cut with ID _41391_210_7994_1_1533603771872_3638201_31-188961-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 01:40:35,542 WARNING [train.py:1204] (2/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_1073-6264-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 01:40:35,566 WARNING [train.py:1204] (2/4) Exclude cut with ID _66868_210_9527_1_1535509287954_4561140_435-259515-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 01:40:35,659 WARNING [train.py:1204] (2/4) Exclude cut with ID _14886_210_4220_1_1530702020355_3516190_159-73748-0 from training. Duration: 0.83 2023-10-02 01:40:35,688 WARNING [train.py:1204] (2/4) Exclude cut with ID _15150_210_8341_1_1530752411803_7256384_469-239509-0_sp1.1 from training. Duration: 0.6181875 2023-10-02 01:40:35,839 WARNING [train.py:1204] (2/4) Exclude cut with ID _8064_210_5399_1_1528372299291_4282973_395-340852-0 from training. Duration: 0.93 2023-10-02 01:40:35,842 WARNING [train.py:1204] (2/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_138-187031-0_sp1.1 from training. Duration: 0.9 2023-10-02 01:40:36,089 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_13757_210_3528_1_1530440682_4107239_48-106347-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 01:40:36,159 WARNING [train.py:1204] (2/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_164-224052-0_sp1.1 from training. Duration: 0.636375 2023-10-02 01:40:36,163 WARNING [train.py:1204] (2/4) Exclude cut with ID _59288_210_5854_1_1535077757774_3412246_408-322648-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 01:40:36,222 WARNING [train.py:1204] (2/4) Exclude cut with ID _30191_210_1664_1_1533189915047_7196670_827-36359-0_sp1.1 from training. Duration: 0.963625 2023-10-02 01:40:36,323 WARNING [train.py:1204] (2/4) Exclude cut with ID _4680_210_3525_1_1527249757953_893386_75-78979-0_sp1.1 from training. Duration: 0.82725 2023-10-02 01:40:36,378 WARNING [train.py:1204] (2/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_383-165234-0 from training. Duration: 0.94 2023-10-02 01:40:36,598 WARNING [train.py:1204] (2/4) Exclude cut with ID _19896_210_1883_1_1531709022662_6649569_94-229927-0 from training. Duration: 0.97 2023-10-02 01:40:36,610 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6788_210_1388_1_1527898698_4562021_180-75288-0_sp1.1 from training. Duration: 0.9 2023-10-02 01:40:36,617 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_26433_210_12108_1_1532157158_4056480_54-321349-0_sp1.1 from training. Duration: 0.97275 2023-10-02 01:40:37,031 WARNING [train.py:1204] (2/4) Exclude cut with ID _56108_210_16380_1_1534505446159_3887959_265-161493-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 01:40:37,135 WARNING [train.py:1204] (2/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_395-111602-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 01:40:37,148 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_154-316450-0_sp0.9 from training. Duration: 0.8444375 2023-10-02 01:40:37,220 WARNING [train.py:1204] (2/4) Exclude cut with ID _43552_210_8913_1_1533726031059_3523429_381-116041-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 01:40:37,356 WARNING [train.py:1204] (2/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_65-104124-0_sp1.1 from training. Duration: 0.963625 2023-10-02 01:40:37,406 WARNING [train.py:1204] (2/4) Exclude cut with ID _6536_210_1883_1_1527850783339_3319069_169-23811-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 01:40:37,447 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_289-180904-0_sp0.9 from training. Duration: 0.8444375 2023-10-02 01:40:37,464 WARNING [train.py:1204] (2/4) Exclude cut with ID _56406_210_9288_1_1534899618664_3496149_226-242400-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 01:40:38,151 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_24899_210_16554_1_1532046422_3827462_276-216330-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 01:40:38,404 WARNING [train.py:1204] (2/4) Exclude cut with ID _29345_210_4381_1_1532689168263_3860169_198-174328-0_sp1.1 from training. Duration: 0.82725 2023-10-02 01:40:38,486 WARNING [train.py:1204] (2/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_338-96520-0 from training. Duration: 0.94 2023-10-02 01:40:38,858 WARNING [train.py:1204] (2/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_7-111954-0_sp1.1 from training. Duration: 0.5090625 2023-10-02 01:40:38,957 WARNING [train.py:1204] (2/4) Exclude cut with ID _36692_210_5665_1_1533608967420_398809_8-16261-0_sp1.1 from training. Duration: 0.97275 2023-10-02 01:40:39,185 WARNING [train.py:1204] (2/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_86-212657-0_sp1.1 from training. Duration: 0.97275 2023-10-02 01:40:39,191 WARNING [train.py:1204] (2/4) Exclude cut with ID _44659_210_2721_1_1534036120061_6614860_626-298884-0_sp1.1 from training. Duration: 0.8818125 2023-10-02 01:40:39,471 WARNING [train.py:1204] (2/4) Exclude cut with ID _49755_210_18053_1_1534581005846_3218709_120-257918-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 01:40:39,613 WARNING [train.py:1204] (2/4) Exclude cut with ID _21881_210_3525_1_1531825242757_3798380_400-113821-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 01:40:39,615 WARNING [train.py:1204] (2/4) Exclude cut with ID _21783_210_11357_1_1531742458730_3762679_387-236773-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 01:40:39,682 WARNING [train.py:1204] (2/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_613-232856-0_sp1.1 from training. Duration: 0.4909375 2023-10-02 01:40:39,709 WARNING [train.py:1204] (2/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_181-78632-0_sp0.9 from training. Duration: 0.8666875 2023-10-02 01:40:39,871 WARNING [train.py:1204] (2/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_175-17485-0_sp1.1 from training. Duration: 0.97275 2023-10-02 01:40:39,964 WARNING [train.py:1204] (2/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_1004-183422-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 01:40:40,181 WARNING [train.py:1204] (2/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_32-65962-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 01:40:40,247 WARNING [train.py:1204] (2/4) Exclude cut with ID _15458_210_8341_1_1530858614263_3834410_40-298664-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 01:40:40,620 WARNING [train.py:1204] (2/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_380-261863-0_sp1.1 from training. Duration: 0.963625 2023-10-02 01:40:40,766 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_372-173012-0_sp1.1 from training. Duration: 0.863625 2023-10-02 01:40:40,808 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_57-82205-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 01:40:40,878 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_3691_210_3528_1_1526468169_4062053_355-334432-0_sp0.9 from training. Duration: 0.9666875 2023-10-02 01:40:41,017 WARNING [train.py:1204] (2/4) Exclude cut with ID _19023_210_4353_1_1531378892552_5820780_28-36615-0 from training. Duration: 0.97 2023-10-02 01:40:41,026 WARNING [train.py:1204] (2/4) Exclude cut with ID _29853_210_7497_1_1532505600851_6642110_286-251333-0_sp1.1 from training. Duration: 0.92725 2023-10-02 01:40:41,141 WARNING [train.py:1204] (2/4) Exclude cut with ID _31602_210_5854_1_1532677021456_4458250_518-80679-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 01:40:41,172 WARNING [train.py:1204] (2/4) Exclude cut with ID _14922_210_6228_1_1530707435273_3693310_38-187851-0_sp0.9 from training. Duration: 0.688875 2023-10-02 01:40:42,454 WARNING [train.py:1204] (2/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_39-248870-0_sp1.1 from training. Duration: 0.62725 2023-10-02 01:40:42,471 WARNING [train.py:1204] (2/4) Exclude cut with ID _16908_210_5724_1_1531119157248_3772700_154-31455-0_sp1.1 from training. Duration: 0.97275 2023-10-02 01:40:42,609 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8453_210_4211_1_1528502940_8562730_347-330673-0 from training. Duration: 0.72 2023-10-02 01:40:42,759 WARNING [train.py:1204] (2/4) Exclude cut with ID _30191_210_1664_1_1533189915047_7196670_674-204247-0_sp1.1 from training. Duration: 0.97275 2023-10-02 01:40:42,878 WARNING [train.py:1204] (2/4) Exclude cut with ID _8833_210_5219_1_1528592665210_7553059_329-125156-0 from training. Duration: 0.82 2023-10-02 01:40:42,927 WARNING [train.py:1204] (2/4) Exclude cut with ID _4779_210_2084_1_1527318022944_3914893_500-285013-0_sp1.1 from training. Duration: 0.62725 2023-10-02 01:40:43,148 WARNING [train.py:1204] (2/4) Exclude cut with ID _14421_210_8750_1_1530604795825_3542290_190-313578-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 01:40:43,324 WARNING [train.py:1204] (2/4) Exclude cut with ID _35799_210_5854_1_1533263487887_3529071_538-273574-0_sp1.1 from training. Duration: 0.936375 2023-10-02 01:40:43,394 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_14928_210_5632_1_1530773461_4714000_454-112198-0_sp1.1 from training. Duration: 0.6 2023-10-02 01:40:43,449 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_468-143126-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 01:40:43,564 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_4528_210_3273_1_1527076654_3276777_258-121681-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 01:40:43,668 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_397-132565-0_sp1.1 from training. Duration: 0.9 2023-10-02 01:40:43,697 WARNING [train.py:1204] (2/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_454-123002-0_sp0.9 from training. Duration: 0.7666875 2023-10-02 01:40:43,958 WARNING [train.py:1204] (2/4) Exclude cut with ID _27052_210_5986_1_1532329197187_3585009_297-86890-0_sp1.1 from training. Duration: 0.936375 2023-10-02 01:40:44,110 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_51225_210_3601_1_1534400634_2327383_55-301844-0_sp1.1 from training. Duration: 0.8454375 2023-10-02 01:40:44,261 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6786_210_1388_1_1527839699_4194495_378-46070-0_sp0.9 from training. Duration: 0.8 2023-10-02 01:40:44,363 WARNING [train.py:1204] (2/4) Exclude cut with ID _42334_210_7770_1_1534206610396_3605880_55-19758-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 01:40:44,653 WARNING [train.py:1204] (2/4) Exclude cut with ID _14884_210_2252_1_1530707459689_5561250_114-201399-0_sp0.9 from training. Duration: 0.8444375 2023-10-02 01:40:45,050 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_99-251314-0_sp0.9 from training. Duration: 0.9666875 2023-10-02 01:40:45,117 WARNING [train.py:1204] (2/4) Exclude cut with ID _26528_210_9070_1_1532570410842_3851310_497-97218-0 from training. Duration: 0.86 2023-10-02 01:40:45,506 WARNING [train.py:1204] (2/4) Exclude cut with ID _55328_210_27767_1_1534580973260_4088170_230-317241-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 01:40:45,584 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_125-203105-0_sp0.9 from training. Duration: 0.888875 2023-10-02 01:40:45,752 WARNING [train.py:1204] (2/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_55-31904-0_sp1.1 from training. Duration: 0.8 2023-10-02 01:40:45,913 WARNING [train.py:1204] (2/4) Exclude cut with ID _8064_210_5399_1_1528372299291_4282973_18-174647-0 from training. Duration: 0.91 2023-10-02 01:40:46,823 WARNING [train.py:1204] (2/4) Exclude cut with ID IC0165W0005-81726 from training. Duration: 0.952 2023-10-02 01:40:46,824 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_31583_210_3852_1_1532595851_4024413_148-171847-0_sp1.1 from training. Duration: 0.963625 2023-10-02 01:40:46,866 WARNING [train.py:1204] (2/4) Exclude cut with ID _21989_210_5854_1_1531899214931_3271360_116-138769-0_sp1.1 from training. Duration: 0.936375 2023-10-02 01:40:46,932 WARNING [train.py:1204] (2/4) Exclude cut with ID _66070_210_7640_1_1535441393096_7805310_489-292631-0_sp1.1 from training. Duration: 0.7181875 2023-10-02 01:40:47,034 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_20120_210_6753_1_1531643035_5482859_202-27960-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 01:40:47,041 WARNING [train.py:1204] (2/4) Exclude cut with ID _5294_210_1747_1_1527249643928_3696179_342-270278-0 from training. Duration: 0.55 2023-10-02 01:40:47,079 WARNING [train.py:1204] (2/4) Exclude cut with ID _38323_210_12108_1_1533121233369_3303049_416-11919-0 from training. Duration: 0.96 2023-10-02 01:40:47,221 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6545_210_2040_1_1527774670_4245258_407-48070-0_sp0.9 from training. Duration: 0.8444375 2023-10-02 01:40:47,460 WARNING [train.py:1204] (2/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_503-30719-0_sp1.1 from training. Duration: 0.8909375 2023-10-02 01:40:47,526 WARNING [train.py:1204] (2/4) Exclude cut with ID _49643_210_7989_1_1534640244224_3939031_234-326497-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 01:40:47,704 WARNING [train.py:1204] (2/4) Exclude cut with ID _14170_210_5219_1_1530525949868_4469650_301-77873-0_sp1.1 from training. Duration: 0.963625 2023-10-02 01:40:47,774 WARNING [train.py:1204] (2/4) Exclude cut with ID _45297_210_2721_1_1533715351381_3264949_152-154697-0_sp1.1 from training. Duration: 0.97275 2023-10-02 01:40:47,781 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_3371_210_1794_1_1526384462_5580777_396-339812-0 from training. Duration: 0.85 2023-10-02 01:40:47,889 WARNING [train.py:1204] (2/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_399-156791-0_sp1.1 from training. Duration: 0.7545625 2023-10-02 01:40:47,935 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7696_210_2040_1_1528207167_3716099_117-125434-0 from training. Duration: 0.76 2023-10-02 01:40:47,940 WARNING [train.py:1204] (2/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_96-244299-0 from training. Duration: 0.88 2023-10-02 01:40:48,075 WARNING [train.py:1204] (2/4) Exclude cut with ID _37892_210_19596_1_1533198677897_3964058_519-343462-0_sp1.1 from training. Duration: 0.92725 2023-10-02 01:40:48,097 WARNING [train.py:1204] (2/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_202-183922-0_sp1.1 from training. Duration: 0.77275 2023-10-02 01:40:48,103 WARNING [train.py:1204] (2/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_453-133983-0 from training. Duration: 0.78 2023-10-02 01:40:48,149 WARNING [train.py:1204] (2/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_178-39722-0_sp1.1 from training. Duration: 0.4454375 2023-10-02 01:40:48,560 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_15126_210_6753_1_1530778677_2591185_113-157651-0 from training. Duration: 0.68 2023-10-02 01:40:48,561 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_271-234962-0_sp1.1 from training. Duration: 0.7181875 2023-10-02 01:40:48,779 WARNING [train.py:1204] (2/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_195-64967-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 01:40:48,810 WARNING [train.py:1204] (2/4) Exclude cut with ID _43552_210_8913_1_1533726031059_3523429_365-215065-0_sp1.1 from training. Duration: 0.936375 2023-10-02 01:40:49,093 WARNING [train.py:1204] (2/4) Exclude cut with ID _31602_210_5854_1_1532677021456_4458250_249-96604-0_sp0.9 from training. Duration: 0.988875 2023-10-02 01:40:49,146 WARNING [train.py:1204] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_307-158462-0 from training. Duration: 0.69 2023-10-02 01:40:49,202 WARNING [train.py:1204] (2/4) Exclude cut with ID _30918_210_6400_1_1532754078600_3125920_407-223394-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 01:40:49,206 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6673_210_3273_1_1527767790_3173240_351-167954-0_sp0.9 from training. Duration: 0.5333125 2023-10-02 01:40:49,371 WARNING [train.py:1204] (2/4) Exclude cut with ID _4866_210_2577_1_1527404412284_4364659_256-306830-0 from training. Duration: 0.87 2023-10-02 01:40:49,422 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_17613_210_2151_1_1531137569_2366160_222-131118-0_sp1.1 from training. Duration: 0.963625 2023-10-02 01:40:49,444 WARNING [train.py:1204] (2/4) Exclude cut with ID _45560_210_21982_1_1533812603951_3584999_111-6376-0 from training. Duration: 0.84 2023-10-02 01:40:49,464 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_216-73841-0 from training. Duration: 0.96 2023-10-02 01:40:49,510 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_14058_210_4163_1_1530690953_6327180_49-210687-0 from training. Duration: 0.94 2023-10-02 01:40:49,696 WARNING [train.py:1204] (2/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_506-42614-0_sp0.9 from training. Duration: 0.87775 2023-10-02 01:40:49,859 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_3133_210_2087_1_1526106446_2324690_25-332138-0_sp1.1 from training. Duration: 0.82725 2023-10-02 01:40:50,067 WARNING [train.py:1204] (2/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_589-9070-0_sp1.1 from training. Duration: 0.6454375 2023-10-02 01:40:50,181 WARNING [train.py:1204] (2/4) Exclude cut with ID _6194_210_1534_1_1527511826160_3941194_14-282876-0_sp1.1 from training. Duration: 0.6454375 2023-10-02 01:40:50,277 WARNING [train.py:1204] (2/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_1166-90654-0_sp1.1 from training. Duration: 0.92725 2023-10-02 01:40:51,020 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_14056_210_4163_1_1530604483_7572827_218-255858-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 01:40:51,022 WARNING [train.py:1204] (2/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_1285-256622-0 from training. Duration: 0.84 2023-10-02 01:40:51,041 WARNING [train.py:1204] (2/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_205-286261-0_sp1.1 from training. Duration: 0.963625 2023-10-02 01:40:51,055 WARNING [train.py:1204] (2/4) Exclude cut with ID _68777_210_4353_1_1535810390731_3117904_6-154119-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 01:40:51,136 WARNING [train.py:1204] (2/4) Exclude cut with ID _22982_210_1883_1_1531882790447_5449409_565-199393-0_sp1.1 from training. Duration: 0.92725 2023-10-02 01:40:51,145 WARNING [train.py:1204] (2/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_278-154492-0 from training. Duration: 0.76 2023-10-02 01:40:51,289 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_426-313557-0 from training. Duration: 0.53 2023-10-02 01:40:51,364 WARNING [train.py:1204] (2/4) Exclude cut with ID _41817_210_18835_1_1533434477160_7423049_555-293448-0 from training. Duration: 0.8 2023-10-02 01:40:51,650 WARNING [train.py:1204] (2/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_32-335103-0_sp1.1 from training. Duration: 0.7454375 2023-10-02 01:40:51,858 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_2655_210_2087_1_1525688959_3897400_202-147557-0_sp1.1 from training. Duration: 0.77275 2023-10-02 01:40:51,898 WARNING [train.py:1204] (2/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_158-250116-0 from training. Duration: 0.62 2023-10-02 01:40:52,319 WARNING [train.py:1204] (2/4) Exclude cut with ID _55429_210_10626_1_1534467588527_7174143_528-88083-0_sp1.1 from training. Duration: 0.963625 2023-10-02 01:40:52,490 WARNING [train.py:1204] (2/4) Exclude cut with ID _8030_210_7497_1_1528444886781_3531089_32-31837-0_sp1.1 from training. Duration: 0.8818125 2023-10-02 01:40:52,526 WARNING [train.py:1204] (2/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_744-86168-0_sp1.1 from training. Duration: 0.97275 2023-10-02 01:40:52,540 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1113-7369-0_sp0.9 from training. Duration: 0.9444375 2023-10-02 01:40:52,556 WARNING [train.py:1204] (2/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_124-345840-0_sp0.9 from training. Duration: 0.57775 2023-10-02 01:40:52,606 WARNING [train.py:1204] (2/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_328-264776-0_sp0.9 from training. Duration: 0.7444375 2023-10-02 01:40:52,754 WARNING [train.py:1204] (2/4) Exclude cut with ID _18895_210_6228_1_1531357274234_3784930_469-166615-0_sp1.1 from training. Duration: 0.963625 2023-10-02 01:40:52,762 WARNING [train.py:1204] (2/4) Exclude cut with ID _8395_210_1531_1_1528597871523_4411579_431-132470-0_sp0.9 from training. Duration: 0.82225 2023-10-02 01:40:52,831 WARNING [train.py:1204] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_554-45617-0_sp1.1 from training. Duration: 0.9 2023-10-02 01:40:52,858 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_35361_210_4238_1_1533262810_7793476_524-188242-0_sp1.1 from training. Duration: 0.92725 2023-10-02 01:40:52,994 WARNING [train.py:1204] (2/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_938-205135-0 from training. Duration: 0.87 2023-10-02 01:40:53,144 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_5931_210_2040_1_1527428481_4547999_321-193614-0 from training. Duration: 0.67 2023-10-02 01:40:53,162 WARNING [train.py:1204] (2/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_445-269521-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 01:40:53,346 WARNING [train.py:1204] (2/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_3-33116-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 01:40:53,347 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_4633_210_2040_1_1526824163_4145343_416-76117-0_sp1.1 from training. Duration: 0.863625 2023-10-02 01:40:53,384 WARNING [train.py:1204] (2/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_1033-274376-0_sp1.1 from training. Duration: 0.7909375 2023-10-02 01:40:53,427 WARNING [train.py:1204] (2/4) Exclude cut with ID _8772_210_1850_1_1528635595972_3664011_398-320049-0_sp0.9 from training. Duration: 0.97775 2023-10-02 01:40:53,836 WARNING [train.py:1204] (2/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_321-215742-0_sp0.9 from training. Duration: 0.5555625 2023-10-02 01:40:53,961 WARNING [train.py:1204] (2/4) Exclude cut with ID _28394_210_8913_1_1532430049518_3650118_272-190950-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 01:40:54,063 WARNING [train.py:1204] (2/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_31-299371-0_sp1.1 from training. Duration: 0.92725 2023-10-02 01:40:54,197 WARNING [train.py:1204] (2/4) Exclude cut with ID _55383_210_10635_1_1534468426172_2231669_134-128246-0_sp0.9 from training. Duration: 0.988875 2023-10-02 01:40:54,199 WARNING [train.py:1204] (2/4) Exclude cut with ID _40519_210_13794_1_1533715228655_7337390_887-9547-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 01:40:54,245 WARNING [train.py:1204] (2/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_310-109461-0_sp0.9 from training. Duration: 0.888875 2023-10-02 01:40:54,491 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_2009_210_2071_1_1525481134_4588937_258-250196-0_sp1.1 from training. Duration: 0.92725 2023-10-02 01:40:54,543 WARNING [train.py:1204] (2/4) Exclude cut with ID _9235_210_5219_1_1528891154253_8046120_1186-72702-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 01:40:54,545 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_14831_210_7927_1_1530700200_5583610_181-27467-0_sp1.1 from training. Duration: 0.82725 2023-10-02 01:40:55,417 WARNING [train.py:1204] (2/4) Exclude cut with ID _40402_210_18219_1_1533522324115_2953150_278-132109-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 01:40:55,443 WARNING [train.py:1204] (2/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_261-297503-0 from training. Duration: 0.99 2023-10-02 01:40:55,776 WARNING [train.py:1204] (2/4) Exclude cut with ID _19896_210_1883_1_1531709022662_6649569_313-265903-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 01:40:55,884 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_105-114797-0_sp1.1 from training. Duration: 0.763625 2023-10-02 01:40:55,990 WARNING [train.py:1204] (2/4) Exclude cut with ID _53755_210_8254_1_1534577312941_4707360_9-65843-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 01:40:56,019 WARNING [train.py:1204] (2/4) Exclude cut with ID _18516_210_9206_1_1531704653206_3692406_361-224549-0_sp1.1 from training. Duration: 0.97275 2023-10-02 01:40:56,054 WARNING [train.py:1204] (2/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_403-310975-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 01:40:56,100 WARNING [train.py:1204] (2/4) Exclude cut with ID _5444_210_2151_1_1527418808464_5312210_581-315411-0_sp0.9 from training. Duration: 0.688875 2023-10-02 01:40:56,149 WARNING [train.py:1204] (2/4) Exclude cut with ID _23825_210_13449_1_1531915313526_3497150_464-154904-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 01:40:56,157 WARNING [train.py:1204] (2/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_937-208281-0_sp0.9 from training. Duration: 0.9444375 2023-10-02 01:40:56,215 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_37839_210_7234_1_1533542462_3689303_158-181212-0_sp1.1 from training. Duration: 0.963625 2023-10-02 01:40:56,299 WARNING [train.py:1204] (2/4) Exclude cut with ID _8772_210_1850_1_1528635595972_3664011_398-320049-0 from training. Duration: 0.88 2023-10-02 01:40:56,415 WARNING [train.py:1204] (2/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_718-250907-0_sp1.1 from training. Duration: 0.62725 2023-10-02 01:40:56,489 WARNING [train.py:1204] (2/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_641-126395-0_sp1.1 from training. Duration: 0.92725 2023-10-02 01:40:56,528 WARNING [train.py:1204] (2/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_166-241493-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 01:40:56,613 WARNING [train.py:1204] (2/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_526-62079-0_sp0.9 from training. Duration: 0.7444375 2023-10-02 01:40:56,715 WARNING [train.py:1204] (2/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_439-275083-0_sp1.1 from training. Duration: 0.936375 2023-10-02 01:40:56,914 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_66907_210_21581_1_1535615661_3590177_493-229032-0_sp1.1 from training. Duration: 0.92725 2023-10-02 01:40:56,945 WARNING [train.py:1204] (2/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_89-321955-0_sp1.1 from training. Duration: 0.72725 2023-10-02 01:40:57,055 WARNING [train.py:1204] (2/4) Exclude cut with ID _29857_210_20034_1_1532395886753_3798160_273-226195-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 01:40:57,060 WARNING [train.py:1204] (2/4) Exclude cut with ID _22826_210_9994_1_1531908201522_3279169_404-75889-0 from training. Duration: 0.89 2023-10-02 01:40:57,092 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_271-234962-0_sp0.9 from training. Duration: 0.87775 2023-10-02 01:40:57,322 WARNING [train.py:1204] (2/4) Exclude cut with ID _22961_210_8254_1_1531976366672_4278180_156-339685-0_sp1.1 from training. Duration: 0.97275 2023-10-02 01:40:57,371 WARNING [train.py:1204] (2/4) Exclude cut with ID _37892_210_19596_1_1533198677897_3964058_425-231927-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 01:40:57,464 WARNING [train.py:1204] (2/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_102-288670-0_sp1.1 from training. Duration: 0.97275 2023-10-02 01:40:57,554 WARNING [train.py:1204] (2/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_283-220372-0_sp1.1 from training. Duration: 0.5090625 2023-10-02 01:40:57,608 WARNING [train.py:1204] (2/4) Exclude cut with ID _44304_210_15374_1_1533639352765_4613129_86-1699-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 01:40:57,715 WARNING [train.py:1204] (2/4) Exclude cut with ID _2891_210_2550_1_1525874398521_4427710_327-121590-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 01:40:57,721 WARNING [train.py:1204] (2/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_369-222060-0 from training. Duration: 0.71 2023-10-02 01:40:57,870 WARNING [train.py:1204] (2/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_762-109886-0 from training. Duration: 0.91 2023-10-02 01:40:57,896 WARNING [train.py:1204] (2/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_477-255782-0_sp0.9 from training. Duration: 0.911125 2023-10-02 01:40:57,929 WARNING [train.py:1204] (2/4) Exclude cut with ID IC0669W0061-330906 from training. Duration: 0.768 2023-10-02 01:40:57,980 WARNING [train.py:1204] (2/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_347-213071-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 01:40:58,023 WARNING [train.py:1204] (2/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_507-303370-0_sp1.1 from training. Duration: 0.87275 2023-10-02 01:40:58,101 WARNING [train.py:1204] (2/4) Exclude cut with ID _15012_210_1968_1_1530856820429_5095350_688-178849-0 from training. Duration: 0.64 2023-10-02 01:40:58,106 WARNING [train.py:1204] (2/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_28-70987-0_sp0.9 from training. Duration: 0.9666875 2023-10-02 01:40:58,115 WARNING [train.py:1204] (2/4) Exclude cut with ID _5910_210_1389_1_1527337972454_5050100_555-159815-0 from training. Duration: 0.98 2023-10-02 01:40:58,139 WARNING [train.py:1204] (2/4) Exclude cut with ID IC0679W0064-335871 from training. Duration: 0.768 2023-10-02 01:40:58,139 WARNING [train.py:1204] (2/4) Exclude cut with ID IC0679W0079-335886 from training. Duration: 0.896 2023-10-02 01:40:58,181 WARNING [train.py:1204] (2/4) Exclude cut with ID _5608_210_2106_1_1527415439089_6819768_909-82767-0 from training. Duration: 0.61 2023-10-02 01:40:58,314 WARNING [train.py:1204] (2/4) Exclude cut with ID _23038_210_9570_1_1532001584158_3652419_411-323967-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 01:40:58,371 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_40896_210_15710_1_1533433956_7100802_350-231748-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 01:40:58,381 WARNING [train.py:1204] (2/4) Exclude cut with ID _5289_210_2084_1_1527678202736_3779299_142-103317-0_sp0.9 from training. Duration: 0.9333125 2023-10-02 01:40:58,450 WARNING [train.py:1204] (2/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_314-144055-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 01:40:58,515 WARNING [train.py:1204] (2/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_177-236809-0_sp0.9 from training. Duration: 0.5444375 2023-10-02 01:40:58,642 WARNING [train.py:1204] (2/4) Exclude cut with ID _23038_210_9570_1_1532001584158_3652419_82-334733-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 01:40:58,649 WARNING [train.py:1204] (2/4) Exclude cut with ID _43552_210_8913_1_1533726031059_3523429_225-22526-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 01:40:59,867 WARNING [train.py:1204] (2/4) Exclude cut with ID _30327_210_16049_1_1532484161295_3924169_401-70819-0_sp1.1 from training. Duration: 0.7818125 2023-10-02 01:40:59,901 WARNING [train.py:1204] (2/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_538-32291-0 from training. Duration: 0.83 2023-10-02 01:41:00,195 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_37357_210_17024_1_1533089504_2316121_258-211605-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 01:41:00,533 WARNING [train.py:1204] (2/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_550-335198-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 01:41:00,592 WARNING [train.py:1204] (2/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_98-741-0_sp0.9 from training. Duration: 0.888875 2023-10-02 01:41:00,632 WARNING [train.py:1204] (2/4) Exclude cut with ID _38323_210_12108_1_1533121233369_3303049_251-238988-0_sp1.1 from training. Duration: 0.92725 2023-10-02 01:41:00,715 WARNING [train.py:1204] (2/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_259-34038-0_sp0.9 from training. Duration: 0.82225 2023-10-02 01:41:00,922 WARNING [train.py:1204] (2/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_1060-180633-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 01:41:00,959 WARNING [train.py:1204] (2/4) Exclude cut with ID _8755_210_1876_1_1528628559357_3258209_211-37473-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 01:41:00,973 WARNING [train.py:1204] (2/4) Exclude cut with ID _4122_210_2084_1_1526713164417_4353280_528-206771-0 from training. Duration: 0.88 2023-10-02 01:41:01,342 WARNING [train.py:1204] (2/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_64-248359-0 from training. Duration: 0.69 2023-10-02 01:41:01,521 WARNING [train.py:1204] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_402-253027-0_sp1.1 from training. Duration: 0.4909375 2023-10-02 01:41:01,947 WARNING [train.py:1204] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_625-102955-0 from training. Duration: 0.77 2023-10-02 01:41:02,156 WARNING [train.py:1204] (2/4) Exclude cut with ID _14998_210_5219_1_1530705582080_7474970_611-219324-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 01:41:02,255 WARNING [train.py:1204] (2/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_718-68505-0_sp0.9 from training. Duration: 0.988875 2023-10-02 01:41:02,309 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_353-182855-0_sp1.1 from training. Duration: 0.87275 2023-10-02 01:41:02,600 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_5960_210_1794_1_1527403865_3990999_515-2613-0_sp0.9 from training. Duration: 0.97775 2023-10-02 01:41:02,618 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_5575_210_1410_1_1527301305_2570170_342-265572-0 from training. Duration: 0.94 2023-10-02 01:41:02,887 WARNING [train.py:1204] (2/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_329-113173-0_sp0.9 from training. Duration: 0.688875 2023-10-02 01:41:02,984 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_15396_210_6890_1_1531308473_1938936_213-312506-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 01:41:03,039 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_521-214276-0_sp1.1 from training. Duration: 0.4 2023-10-02 01:41:03,262 WARNING [train.py:1204] (2/4) Exclude cut with ID _15026_210_8750_1_1530777602316_3291850_211-195043-0_sp1.1 from training. Duration: 0.72725 2023-10-02 01:41:04,090 WARNING [train.py:1204] (2/4) Exclude cut with ID _29343_210_4381_1_1532602757752_3674109_108-257494-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 01:41:04,261 WARNING [train.py:1204] (2/4) Exclude cut with ID _64414_210_17301_1_1535243252048_3757759_403-58151-0_sp1.1 from training. Duration: 0.963625 2023-10-02 01:41:04,376 WARNING [train.py:1204] (2/4) Exclude cut with ID _36512_210_6987_1_1533033143998_3126069_109-177787-0_sp1.1 from training. Duration: 0.92725 2023-10-02 01:41:04,540 WARNING [train.py:1204] (2/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_498-40153-0 from training. Duration: 0.63 2023-10-02 01:41:04,624 WARNING [train.py:1204] (2/4) Exclude cut with ID _56447_210_1389_1_1535074245910_3697189_161-334523-0_sp1.1 from training. Duration: 0.936375 2023-10-02 01:41:04,725 WARNING [train.py:1204] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_238-245655-0 from training. Duration: 0.81 2023-10-02 01:41:04,884 WARNING [train.py:1204] (2/4) Exclude cut with ID _64631_210_27767_1_1535349580068_4165160_108-43865-0_sp1.1 from training. Duration: 0.92725 2023-10-02 01:41:04,889 WARNING [train.py:1204] (2/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_243-42116-0_sp1.1 from training. Duration: 0.663625 2023-10-02 01:41:04,915 WARNING [train.py:1204] (2/4) Exclude cut with ID _66868_210_9527_1_1535509287954_4561140_594-132160-0_sp1.1 from training. Duration: 0.92725 2023-10-02 01:41:05,073 WARNING [train.py:1204] (2/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_48-338976-0_sp0.9 from training. Duration: 0.988875 2023-10-02 01:41:05,114 WARNING [train.py:1204] (2/4) Exclude cut with ID _5250_210_5219_1_1527249561806_8692110_137-100595-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 01:41:05,128 WARNING [train.py:1204] (2/4) Exclude cut with ID _49748_210_24116_1_1533986971737_3158910_12-271669-0_sp1.1 from training. Duration: 0.936375 2023-10-02 01:41:05,160 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_37357_210_17024_1_1533089504_2316121_206-80421-0_sp1.1 from training. Duration: 0.936375 2023-10-02 01:41:05,257 WARNING [train.py:1204] (2/4) Exclude cut with ID _7537_210_2084_1_1528502395431_3924359_113-199933-0_sp1.1 from training. Duration: 0.536375 2023-10-02 01:41:05,309 WARNING [train.py:1204] (2/4) Exclude cut with ID _55440_210_12651_1_1534496713364_2980520_31-59289-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 01:41:05,471 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1083-220122-0_sp0.9 from training. Duration: 0.5666875 2023-10-02 01:41:05,690 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_426-313557-0_sp1.1 from training. Duration: 0.4818125 2023-10-02 01:41:05,745 WARNING [train.py:1204] (2/4) Exclude cut with ID _35799_210_5854_1_1533263487887_3529071_324-94843-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 01:41:05,914 WARNING [train.py:1204] (2/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_264-108685-0_sp1.1 from training. Duration: 0.5 2023-10-02 01:41:06,016 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9012W0006-501141 from training. Duration: 0.896 2023-10-02 01:41:06,177 WARNING [train.py:1204] (2/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_890-5117-0_sp0.9 from training. Duration: 0.8666875 2023-10-02 01:41:06,220 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9023W0008-506301 from training. Duration: 0.896 2023-10-02 01:41:06,305 WARNING [train.py:1204] (2/4) Exclude cut with ID _13916_210_9994_1_1530788341126_3763149_60-191556-0_sp0.9 from training. Duration: 0.67775 2023-10-02 01:41:06,343 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9029W0331-509721 from training. Duration: 0.896 2023-10-02 01:41:06,467 WARNING [train.py:1204] (2/4) Exclude cut with ID _35823_210_6917_1_1533187881914_7297830_567-108596-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 01:41:06,515 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7543_210_6753_1_1528544010_1110402_25-183860-0_sp1.1 from training. Duration: 0.42725 2023-10-02 01:41:06,618 WARNING [train.py:1204] (2/4) Exclude cut with ID _47278_210_12041_1_1533902418349_3576110_495-260035-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 01:41:06,732 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9051W0015-520281 from training. Duration: 0.9045625 2023-10-02 01:41:06,842 WARNING [train.py:1204] (2/4) Exclude cut with ID _14459_210_2228_1_1531443641946_7516700_707-66193-0_sp1.1 from training. Duration: 0.97275 2023-10-02 01:41:07,026 WARNING [train.py:1204] (2/4) Exclude cut with ID _14087_210_2253_1_1530529224512_3014029_219-224445-0_sp1.1 from training. Duration: 0.763625 2023-10-02 01:41:07,434 WARNING [train.py:1204] (2/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_345-167986-0_sp0.9 from training. Duration: 0.9333125 2023-10-02 01:41:07,566 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_34815_210_6388_1_1533381320_3571903_279-305565-0_sp1.1 from training. Duration: 0.836375 2023-10-02 01:41:08,410 WARNING [train.py:1204] (2/4) Exclude cut with ID _21987_210_5854_1_1531821500482_3637038_317-179212-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 01:41:08,469 WARNING [train.py:1204] (2/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_395-37222-0_sp1.1 from training. Duration: 0.6909375 2023-10-02 01:41:08,545 WARNING [train.py:1204] (2/4) Exclude cut with ID _8064_210_5399_1_1528372299291_4282973_168-50337-0_sp1.1 from training. Duration: 0.763625 2023-10-02 01:41:08,725 WARNING [train.py:1204] (2/4) Exclude cut with ID _60113_210_16271_1_1535164212481_4386079_559-167191-0 from training. Duration: 0.93 2023-10-02 01:41:08,737 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_47228_210_4983_1_1533780021_3672027_344-179642-0_sp1.1 from training. Duration: 0.92725 2023-10-02 01:41:08,782 WARNING [train.py:1204] (2/4) Exclude cut with ID _22961_210_8254_1_1531976366672_4278180_317-199546-0_sp0.9 from training. Duration: 0.9555625 2023-10-02 01:41:08,933 WARNING [train.py:1204] (2/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_141-89727-0 from training. Duration: 0.71 2023-10-02 01:41:08,991 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_78-116075-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 01:41:09,012 WARNING [train.py:1204] (2/4) Exclude cut with ID _64086_210_7994_1_1535270417019_7435949_52-235756-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 01:41:09,152 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7615_210_4211_1_1528110965_4580160_72-235211-0_sp1.1 from training. Duration: 0.6909375 2023-10-02 01:41:09,466 WARNING [train.py:1204] (2/4) Exclude cut with ID _14069_210_5632_1_1530512692033_4803390_287-27950-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 01:41:09,579 WARNING [train.py:1204] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_74-48717-0_sp0.9 from training. Duration: 0.6444375 2023-10-02 01:41:09,613 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7544_210_6753_1_1528591327_1471426_172-348777-0_sp1.1 from training. Duration: 0.6 2023-10-02 01:41:09,718 WARNING [train.py:1204] (2/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_220-191080-0 from training. Duration: 0.98 2023-10-02 01:41:09,816 WARNING [train.py:1204] (2/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_1081-137641-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 01:41:09,992 WARNING [train.py:1204] (2/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_278-235459-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 01:41:10,276 WARNING [train.py:1204] (2/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_1181-77470-0_sp1.1 from training. Duration: 0.936375 2023-10-02 01:41:10,416 WARNING [train.py:1204] (2/4) Exclude cut with ID _60860_210_8254_1_1534896015875_3557169_198-5531-0_sp1.1 from training. Duration: 0.936375 2023-10-02 01:41:10,539 WARNING [train.py:1204] (2/4) Exclude cut with ID _12369_210_4381_1_1530356078581_4388520_17-289781-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 01:41:10,544 WARNING [train.py:1204] (2/4) Exclude cut with ID _50310_210_15488_1_1534068043345_3641080_146-199752-0_sp1.1 from training. Duration: 0.92725 2023-10-02 01:41:10,724 WARNING [train.py:1204] (2/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_969-213354-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 01:41:10,815 WARNING [train.py:1204] (2/4) Exclude cut with ID _70519_210_4163_1_1535961595994_7458230_66-344156-0_sp1.1 from training. Duration: 0.963625 2023-10-02 01:41:11,138 WARNING [train.py:1204] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_380-204756-0_sp1.1 from training. Duration: 0.7818125 2023-10-02 01:41:11,288 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_128-202104-0_sp0.9 from training. Duration: 0.7 2023-10-02 01:41:11,629 WARNING [train.py:1204] (2/4) Exclude cut with ID _4697_210_2069_1_1526815801953_3823098_51-1049-0_sp0.9 from training. Duration: 0.8333125 2023-10-02 01:41:11,654 WARNING [train.py:1204] (2/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_86-66192-0_sp1.1 from training. Duration: 0.5 2023-10-02 01:41:11,732 WARNING [train.py:1204] (2/4) Exclude cut with ID _14069_210_5632_1_1530512692033_4803390_95-1896-0_sp1.1 from training. Duration: 0.8909375 2023-10-02 01:41:12,753 WARNING [train.py:1204] (2/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_29-293896-0_sp1.1 from training. Duration: 0.5545625 2023-10-02 01:41:12,938 WARNING [train.py:1204] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_167-137255-0_sp0.9 from training. Duration: 0.711125 2023-10-02 01:41:12,938 WARNING [train.py:1204] (2/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_217-95056-0 from training. Duration: 0.98 2023-10-02 01:41:12,958 WARNING [train.py:1204] (2/4) Exclude cut with ID _47278_210_12041_1_1533902418349_3576110_97-266746-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 01:41:13,004 WARNING [train.py:1204] (2/4) Exclude cut with ID _14224_210_4381_1_1530529062037_4264010_380-343670-0_sp1.1 from training. Duration: 0.8 2023-10-02 01:41:13,204 WARNING [train.py:1204] (2/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_107-332398-0 from training. Duration: 0.78 2023-10-02 01:41:13,434 WARNING [train.py:1204] (2/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_91-267696-0_sp1.1 from training. Duration: 0.7909375 2023-10-02 01:41:13,453 WARNING [train.py:1204] (2/4) Exclude cut with ID _41298_210_9019_1_1533430371450_4822143_498-186993-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 01:41:13,887 WARNING [train.py:1204] (2/4) Exclude cut with ID _17956_210_4353_1_1531209568125_6979970_54-77610-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 01:41:13,934 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_40047_210_2290_1_1533290519_2374311_257-335162-0_sp1.1 from training. Duration: 0.863625 2023-10-02 01:41:14,133 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9653W0193-675471 from training. Duration: 0.9656875 2023-10-02 01:41:14,282 WARNING [train.py:1204] (2/4) Exclude cut with ID _25248_210_8750_1_1532062825941_5554180_243-240536-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 01:41:14,767 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_32360_210_4198_1_1532689160_3648436_60-51908-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 01:41:14,775 WARNING [train.py:1204] (2/4) Exclude cut with ID _4092_210_2151_1_1526643044449_3135759_227-213972-0_sp0.9 from training. Duration: 0.6 2023-10-02 01:41:14,825 WARNING [train.py:1204] (2/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_392-173500-0_sp1.1 from training. Duration: 0.97275 2023-10-02 01:41:14,876 WARNING [train.py:1204] (2/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_71-329150-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 01:41:14,878 WARNING [train.py:1204] (2/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_866-173440-0 from training. Duration: 0.9 2023-10-02 01:41:15,008 WARNING [train.py:1204] (2/4) Exclude cut with ID _14832_210_8750_1_1530691351539_3171348_1-54315-0_sp0.9 from training. Duration: 0.9666875 2023-10-02 01:41:15,189 WARNING [train.py:1204] (2/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_430-116139-0_sp1.1 from training. Duration: 0.77275 2023-10-02 01:41:15,223 WARNING [train.py:1204] (2/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_356-45054-0_sp0.9 from training. Duration: 0.9555625 2023-10-02 01:41:15,516 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_3531_210_3528_1_1526294203_5216456_309-227302-0_sp0.9 from training. Duration: 0.7666875 2023-10-02 01:41:15,706 WARNING [train.py:1204] (2/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_301-330455-0_sp1.1 from training. Duration: 0.72725 2023-10-02 01:41:15,972 WARNING [train.py:1204] (2/4) Exclude cut with ID _23557_210_8750_1_1531890106370_6846255_667-145945-0_sp1.1 from training. Duration: 0.92725 2023-10-02 01:41:16,308 WARNING [train.py:1204] (2/4) Exclude cut with ID _14860_210_4915_1_1530702020340_7343240_836-328489-0_sp1.1 from training. Duration: 0.8181875 2023-10-02 01:41:16,348 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_37152_210_9697_1_1533106573_3844188_416-333249-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 01:41:17,160 WARNING [train.py:1204] (2/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_538-32291-0_sp0.9 from training. Duration: 0.92225 2023-10-02 01:41:17,352 WARNING [train.py:1204] (2/4) Exclude cut with ID _32704_210_8954_1_1533017488840_6747370_965-176824-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 01:41:17,602 WARNING [train.py:1204] (2/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_86-19601-0_sp1.1 from training. Duration: 0.77275 2023-10-02 01:41:17,671 WARNING [train.py:1204] (2/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_436-318113-0_sp1.1 from training. Duration: 0.6818125 2023-10-02 01:41:17,697 WARNING [train.py:1204] (2/4) Exclude cut with ID _41817_210_18835_1_1533434477160_7423049_555-293448-0_sp0.9 from training. Duration: 0.888875 2023-10-02 01:41:17,707 WARNING [train.py:1204] (2/4) Exclude cut with ID _4697_210_2069_1_1526815801953_3823098_68-219807-0 from training. Duration: 0.47 2023-10-02 01:41:17,975 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7544_210_6753_1_1528591327_1471426_33-346031-0_sp1.1 from training. Duration: 0.5545625 2023-10-02 01:41:17,977 WARNING [train.py:1204] (2/4) Exclude cut with ID _49748_210_24116_1_1533986971737_3158910_263-52113-0_sp1.1 from training. Duration: 0.97275 2023-10-02 01:41:18,036 WARNING [train.py:1204] (2/4) Exclude cut with ID _64639_210_5816_1_1535367619437_3528000_340-24580-0_sp1.1 from training. Duration: 0.92725 2023-10-02 01:41:18,049 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_47228_210_4983_1_1533780021_3672027_479-114941-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 01:41:18,078 WARNING [train.py:1204] (2/4) Exclude cut with ID _44585_210_18628_1_1533605350387_4518199_506-119659-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 01:41:18,168 WARNING [train.py:1204] (2/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_314-126819-0_sp1.1 from training. Duration: 0.4545625 2023-10-02 01:41:18,171 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_2541_210_1794_1_1525590269_3084765_156-173713-0_sp0.9 from training. Duration: 0.9 2023-10-02 01:41:18,207 WARNING [train.py:1204] (2/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_108-255430-0 from training. Duration: 0.97 2023-10-02 01:41:18,221 WARNING [train.py:1204] (2/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_791-45149-0_sp1.1 from training. Duration: 0.7818125 2023-10-02 01:41:18,227 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_53378_210_17282_1_1534227054_1261425_10-118295-0_sp1.1 from training. Duration: 0.97275 2023-10-02 01:41:18,299 WARNING [train.py:1204] (2/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_148-158509-0 from training. Duration: 0.41 2023-10-02 01:41:18,754 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_34726_210_4525_1_1533019474_5030395_687-291305-0_sp1.1 from training. Duration: 0.936375 2023-10-02 01:41:19,043 WARNING [train.py:1204] (2/4) Exclude cut with ID _14860_210_4915_1_1530702020340_7343240_264-324537-0_sp1.1 from training. Duration: 0.963625 2023-10-02 01:41:19,351 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_41251_210_15368_1_1533429767_4820695_591-46386-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 01:41:19,387 WARNING [train.py:1204] (2/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_86-19601-0_sp0.9 from training. Duration: 0.9444375 2023-10-02 01:41:19,420 WARNING [train.py:1204] (2/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_401-255491-0_sp0.9 from training. Duration: 0.77775 2023-10-02 01:41:19,581 WARNING [train.py:1204] (2/4) Exclude cut with ID _8696_210_1876_1_1528545632440_3345058_209-54069-0_sp1.1 from training. Duration: 0.6090625 2023-10-02 01:41:19,589 WARNING [train.py:1204] (2/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_369-325184-0_sp1.1 from training. Duration: 0.9 2023-10-02 01:41:19,662 WARNING [train.py:1204] (2/4) Exclude cut with ID _14687_210_5854_1_1530703838259_3664889_115-239046-0_sp0.9 from training. Duration: 0.7444375 2023-10-02 01:41:19,698 WARNING [train.py:1204] (2/4) Exclude cut with ID _8064_210_5399_1_1528372299291_4282973_168-50337-0 from training. Duration: 0.84 2023-10-02 01:41:20,153 WARNING [train.py:1204] (2/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_909-64151-0_sp1.1 from training. Duration: 0.8545625 2023-10-02 01:41:20,163 WARNING [train.py:1204] (2/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_597-154640-0 from training. Duration: 0.9 2023-10-02 01:41:20,607 WARNING [train.py:1204] (2/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_126-274694-0 from training. Duration: 0.98 2023-10-02 01:41:21,438 WARNING [train.py:1204] (2/4) Exclude cut with ID _21783_210_11357_1_1531742458730_3762679_344-269786-0_sp1.1 from training. Duration: 0.97275 2023-10-02 01:41:21,646 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_37818_210_4238_1_1533620910_4720083_322-253154-0_sp1.1 from training. Duration: 0.92725 2023-10-02 01:41:21,649 WARNING [train.py:1204] (2/4) Exclude cut with ID _58895_210_8750_1_1535184081320_3649089_221-15823-0_sp1.1 from training. Duration: 0.92725 2023-10-02 01:41:21,665 WARNING [train.py:1204] (2/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_9-21737-0_sp1.1 from training. Duration: 0.8818125 2023-10-02 01:41:21,719 WARNING [train.py:1204] (2/4) Exclude cut with ID _14069_210_5632_1_1530512692033_4803390_522-341588-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 01:41:21,802 WARNING [train.py:1204] (2/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_363-185703-0 from training. Duration: 0.95 2023-10-02 01:41:22,041 WARNING [train.py:1204] (2/4) Exclude cut with ID _41817_210_18835_1_1533434477160_7423049_265-76216-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 01:41:22,474 WARNING [train.py:1204] (2/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_841-341679-0 from training. Duration: 0.58 2023-10-02 01:41:22,607 WARNING [train.py:1204] (2/4) Exclude cut with ID _55429_210_10626_1_1534467588527_7174143_97-335084-0_sp1.1 from training. Duration: 0.8818125 2023-10-02 01:41:22,852 WARNING [train.py:1204] (2/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_690-126306-0_sp0.9 from training. Duration: 0.8666875 2023-10-02 01:41:22,860 WARNING [train.py:1204] (2/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_102-92751-0_sp1.1 from training. Duration: 0.9 2023-10-02 01:41:23,035 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_3787_210_2040_1_1526477544_5478586_603-120324-0_sp1.1 from training. Duration: 0.736375 2023-10-02 01:41:23,131 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_33348_210_5632_1_1533086912_3324030_85-28583-0 from training. Duration: 0.82 2023-10-02 01:41:23,193 WARNING [train.py:1204] (2/4) Exclude cut with ID _60057_210_10640_1_1534903065793_3333150_362-230377-0_sp1.1 from training. Duration: 0.7909375 2023-10-02 01:41:23,359 WARNING [train.py:1204] (2/4) Exclude cut with ID _60113_210_16271_1_1535164212481_4386079_194-146486-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 01:41:23,380 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_53753_210_7234_1_1534320003_3594148_171-277668-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 01:41:23,413 WARNING [train.py:1204] (2/4) Exclude cut with ID _34423_210_8750_1_1533430798456_3330199_251-332788-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 01:41:23,685 WARNING [train.py:1204] (2/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_663-166973-0_sp0.9 from training. Duration: 0.888875 2023-10-02 01:41:23,846 WARNING [train.py:1204] (2/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_927-43827-0 from training. Duration: 0.74 2023-10-02 01:41:23,892 WARNING [train.py:1204] (2/4) Exclude cut with ID _28323_210_16187_1_1532480395667_4100300_306-74561-0_sp1.1 from training. Duration: 0.8545625 2023-10-02 01:41:24,094 WARNING [train.py:1204] (2/4) Exclude cut with ID _15037_210_2253_1_1530752294104_3267650_139-337989-0_sp1.1 from training. Duration: 0.92725 2023-10-02 01:41:24,109 WARNING [train.py:1204] (2/4) Exclude cut with ID _49039_210_6315_1_1533967537541_3846118_460-263670-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 01:41:24,253 WARNING [train.py:1204] (2/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_186-309161-0 from training. Duration: 0.54 2023-10-02 01:41:24,348 WARNING [train.py:1204] (2/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_87-278686-0_sp1.1 from training. Duration: 0.7 2023-10-02 01:41:24,548 WARNING [train.py:1204] (2/4) Exclude cut with ID _27052_210_5986_1_1532329197187_3585009_314-106499-0 from training. Duration: 0.92 2023-10-02 01:41:24,563 WARNING [train.py:1204] (2/4) Exclude cut with ID _3465_210_3525_1_1526175667060_3574049_412-287526-0_sp0.9 from training. Duration: 0.8 2023-10-02 01:41:24,871 WARNING [train.py:1204] (2/4) Exclude cut with ID _22961_210_8254_1_1531976366672_4278180_317-199546-0 from training. Duration: 0.86 2023-10-02 01:41:25,794 WARNING [train.py:1204] (2/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_381-317791-0 from training. Duration: 0.73 2023-10-02 01:41:25,842 WARNING [train.py:1204] (2/4) Exclude cut with ID _44010_210_4525_1_1533884262964_4013620_560-310411-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 01:41:25,851 WARNING [train.py:1204] (2/4) Exclude cut with ID _5294_210_1747_1_1527249643928_3696179_342-270278-0_sp0.9 from training. Duration: 0.611125 2023-10-02 01:41:25,881 WARNING [train.py:1204] (2/4) Exclude cut with ID ID0473W0002-918951 from training. Duration: 0.924 2023-10-02 01:41:25,897 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1083-220122-0 from training. Duration: 0.51 2023-10-02 01:41:26,094 WARNING [train.py:1204] (2/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_138-154812-0 from training. Duration: 0.78 2023-10-02 01:41:26,293 WARNING [train.py:1204] (2/4) Exclude cut with ID _44982_210_1511_1_1534312820108_3726029_331-19420-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 01:41:26,610 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_2655_210_2087_1_1525688959_3897400_202-147557-0_sp0.9 from training. Duration: 0.9444375 2023-10-02 01:41:26,889 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_23850_210_4238_1_1532232597_1632433_32-185135-0_sp0.9 from training. Duration: 0.9555625 2023-10-02 01:41:26,997 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_14056_210_4163_1_1530604483_7572827_46-93547-0_sp1.1 from training. Duration: 0.863625 2023-10-02 01:41:27,052 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7931_210_1794_1_1528594728_5564059_280-279560-0 from training. Duration: 0.98 2023-10-02 01:41:27,087 WARNING [train.py:1204] (2/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_937-208281-0 from training. Duration: 0.85 2023-10-02 01:41:27,631 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_14928_210_5632_1_1530773461_4714000_454-112198-0_sp0.9 from training. Duration: 0.7333125 2023-10-02 01:41:27,632 WARNING [train.py:1204] (2/4) Exclude cut with ID _32704_210_8954_1_1533017488840_6747370_944-153333-0_sp1.1 from training. Duration: 0.963625 2023-10-02 01:41:27,831 WARNING [train.py:1204] (2/4) Exclude cut with ID _14821_210_8750_1_1530680503991_6829359_2-227151-0 from training. Duration: 0.83 2023-10-02 01:41:27,861 WARNING [train.py:1204] (2/4) Exclude cut with ID _28461_210_2253_1_1532771775189_3041299_96-152158-0_sp1.1 from training. Duration: 0.77275 2023-10-02 01:41:27,940 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7837_210_1792_1_1528453445_5841305_654-54195-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 01:41:27,990 WARNING [train.py:1204] (2/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_344-152526-0 from training. Duration: 0.53 2023-10-02 01:41:28,441 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_30167_210_3528_1_1532428012_4772375_480-142230-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 01:41:28,600 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_160-218721-0 from training. Duration: 0.96 2023-10-02 01:41:28,923 WARNING [train.py:1204] (2/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_65-88242-0_sp0.9 from training. Duration: 0.62225 2023-10-02 01:41:29,226 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_34886_210_10633_1_1533203882_1755661_76-156096-0 from training. Duration: 0.87 2023-10-02 01:41:30,066 WARNING [train.py:1204] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_188-38388-0_sp1.1 from training. Duration: 0.92725 2023-10-02 01:41:30,194 WARNING [train.py:1204] (2/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_81-289335-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 01:41:30,510 WARNING [train.py:1204] (2/4) Exclude cut with ID _59493_210_17282_1_1535078419077_3225879_110-8778-0_sp1.1 from training. Duration: 0.936375 2023-10-02 01:41:30,559 WARNING [train.py:1204] (2/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_188-260011-0 from training. Duration: 0.73 2023-10-02 01:41:30,604 WARNING [train.py:1204] (2/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_453-133983-0_sp0.9 from training. Duration: 0.8666875 2023-10-02 01:41:31,097 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8054_210_7927_1_1528627721_4984094_553-17379-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 01:41:31,264 WARNING [train.py:1204] (2/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_341-171072-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 01:41:31,457 WARNING [train.py:1204] (2/4) Exclude cut with ID _30800_210_22382_1_1532499347904_4515959_265-257286-0_sp1.1 from training. Duration: 0.963625 2023-10-02 01:41:31,555 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_403-39619-0_sp0.9 from training. Duration: 0.9333125 2023-10-02 01:41:31,559 WARNING [train.py:1204] (2/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_464-33931-0_sp0.9 from training. Duration: 0.6 2023-10-02 01:41:31,580 WARNING [train.py:1204] (2/4) Exclude cut with ID _60804_210_9642_1_1534831199859_7268299_632-143863-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 01:41:31,595 WARNING [train.py:1204] (2/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_431-310997-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 01:41:31,640 WARNING [train.py:1204] (2/4) Exclude cut with ID _31649_210_6228_1_1532584608063_4091078_114-263860-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 01:41:31,651 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6961_210_2087_1_1527937168_6815011_562-165518-0_sp0.9 from training. Duration: 0.8555625 2023-10-02 01:41:31,981 WARNING [train.py:1204] (2/4) Exclude cut with ID _27052_210_5986_1_1532329197187_3585009_338-325870-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 01:41:32,048 WARNING [train.py:1204] (2/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_469-94673-0_sp1.1 from training. Duration: 0.7181875 2023-10-02 01:41:32,066 WARNING [train.py:1204] (2/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_259-34038-0 from training. Duration: 0.74 2023-10-02 01:41:32,170 WARNING [train.py:1204] (2/4) Exclude cut with ID _14922_210_6228_1_1530707435273_3693310_388-129552-0 from training. Duration: 0.96 2023-10-02 01:41:32,324 WARNING [train.py:1204] (2/4) Exclude cut with ID _28394_210_8913_1_1532430049518_3650118_342-251455-0_sp1.1 from training. Duration: 0.936375 2023-10-02 01:41:32,341 WARNING [train.py:1204] (2/4) Exclude cut with ID _53731_210_7994_1_1534553979244_3757214_8-191390-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 01:41:32,393 WARNING [train.py:1204] (2/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_613-232856-0 from training. Duration: 0.54 2023-10-02 01:41:32,463 WARNING [train.py:1204] (2/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_557-349534-0 from training. Duration: 0.91 2023-10-02 01:41:32,472 WARNING [train.py:1204] (2/4) Exclude cut with ID _14224_210_4381_1_1530529062037_4264010_379-184223-0 from training. Duration: 0.85 2023-10-02 01:41:32,475 WARNING [train.py:1204] (2/4) Exclude cut with ID IC0125W0001-61807 from training. Duration: 0.966 2023-10-02 01:41:32,580 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_229-45300-0 from training. Duration: 0.57 2023-10-02 01:41:32,656 WARNING [train.py:1204] (2/4) Exclude cut with ID _30304_210_6319_1_1532476827261_7582060_1069-24743-0_sp1.1 from training. Duration: 0.92725 2023-10-02 01:41:32,761 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_41452_210_7291_1_1533437687_3941281_323-172873-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 01:41:32,775 WARNING [train.py:1204] (2/4) Exclude cut with ID _37086_210_9038_1_1532999540891_7021250_802-226709-0_sp1.1 from training. Duration: 0.97275 2023-10-02 01:41:32,855 WARNING [train.py:1204] (2/4) Exclude cut with ID IC0144W0500-71767 from training. Duration: 0.931875 2023-10-02 01:41:32,936 WARNING [train.py:1204] (2/4) Exclude cut with ID _39846_210_12651_1_1533605310265_2115212_233-129699-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 01:41:32,994 WARNING [train.py:1204] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_574-45541-0_sp0.9 from training. Duration: 0.72225 2023-10-02 01:41:33,256 WARNING [train.py:1204] (2/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_372-298093-0_sp1.1 from training. Duration: 0.72725 2023-10-02 01:41:33,268 WARNING [train.py:1204] (2/4) Exclude cut with ID _62574_210_2655_1_1535180407058_3933249_34-26343-0_sp1.1 from training. Duration: 0.97275 2023-10-02 01:41:33,575 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_512-256238-0_sp1.1 from training. Duration: 0.963625 2023-10-02 01:41:33,580 WARNING [train.py:1204] (2/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_196-55502-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 01:41:34,293 WARNING [train.py:1204] (2/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_195-167011-0 from training. Duration: 0.75 2023-10-02 01:41:34,662 WARNING [train.py:1204] (2/4) Exclude cut with ID _13678_210_7497_1_1530530069226_3463170_345-230416-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 01:41:34,694 WARNING [train.py:1204] (2/4) Exclude cut with ID _51078_210_9070_1_1534161557424_3805250_225-327809-0_sp1.1 from training. Duration: 0.963625 2023-10-02 01:41:34,937 WARNING [train.py:1204] (2/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_18-189198-0_sp1.1 from training. Duration: 0.8181875 2023-10-02 01:41:34,960 WARNING [train.py:1204] (2/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_329-82235-0_sp1.1 from training. Duration: 0.7454375 2023-10-02 01:41:34,967 WARNING [train.py:1204] (2/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_726-270780-0_sp0.9 from training. Duration: 0.9555625 2023-10-02 01:41:35,006 WARNING [train.py:1204] (2/4) Exclude cut with ID _66873_210_5517_1_1535454136506_4293370_654-52713-0 from training. Duration: 0.77 2023-10-02 01:41:35,366 WARNING [train.py:1204] (2/4) Exclude cut with ID _20455_210_5219_1_1531565996775_8098100_191-16006-0_sp1.1 from training. Duration: 0.963625 2023-10-02 01:41:35,504 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_3171_210_2087_1_1526109084_2330007_66-260583-0_sp1.1 from training. Duration: 0.6545625 2023-10-02 01:41:35,556 WARNING [train.py:1204] (2/4) Exclude cut with ID _38670_210_15379_1_1533448810659_4391059_63-129006-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 01:41:35,816 WARNING [train.py:1204] (2/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_393-57888-0_sp0.9 from training. Duration: 0.711125 2023-10-02 01:41:35,862 WARNING [train.py:1204] (2/4) Exclude cut with ID _6275_210_2084_1_1528110062213_7669049_857-298942-0_sp0.9 from training. Duration: 0.9444375 2023-10-02 01:41:35,964 WARNING [train.py:1204] (2/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_239-22076-0_sp0.9 from training. Duration: 0.9444375 2023-10-02 01:41:35,993 WARNING [train.py:1204] (2/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_424-227947-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 01:41:36,128 WARNING [train.py:1204] (2/4) Exclude cut with ID _14069_210_5632_1_1530512692033_4803390_95-1896-0 from training. Duration: 0.98 2023-10-02 01:41:36,243 WARNING [train.py:1204] (2/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_96-244299-0_sp0.9 from training. Duration: 0.97775 2023-10-02 01:41:36,805 WARNING [train.py:1204] (2/4) Exclude cut with ID _46683_210_9642_1_1533730913492_4591116_562-319825-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 01:41:36,919 WARNING [train.py:1204] (2/4) Exclude cut with ID _64639_210_5816_1_1535367619437_3528000_284-173474-0_sp1.1 from training. Duration: 0.963625 2023-10-02 01:41:36,957 WARNING [train.py:1204] (2/4) Exclude cut with ID _29529_210_7234_1_1532393961455_3977460_497-100133-0_sp1.1 from training. Duration: 0.936375 2023-10-02 01:41:37,001 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_12868_210_6753_1_1530701932_530275_18-76322-0_sp0.9 from training. Duration: 0.9666875 2023-10-02 01:41:37,046 WARNING [train.py:1204] (2/4) Exclude cut with ID _50093_210_4220_1_1534417141207_3432299_116-303447-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 01:41:37,136 WARNING [train.py:1204] (2/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_44-79427-0_sp1.1 from training. Duration: 0.8909375 2023-10-02 01:41:37,189 WARNING [train.py:1204] (2/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_887-249555-0 from training. Duration: 0.77 2023-10-02 01:41:37,443 WARNING [train.py:1204] (2/4) Exclude cut with ID _28461_210_2253_1_1532771775189_3041299_96-152158-0_sp0.9 from training. Duration: 0.9444375 2023-10-02 01:41:37,496 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_14058_210_4163_1_1530690953_6327180_49-210687-0_sp1.1 from training. Duration: 0.8545625 2023-10-02 01:41:37,761 WARNING [train.py:1204] (2/4) Exclude cut with ID _40399_210_4353_1_1533305658194_3279406_351-127903-0_sp1.1 from training. Duration: 0.97275 2023-10-02 01:41:37,766 WARNING [train.py:1204] (2/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_184-206939-0_sp1.1 from training. Duration: 0.936375 2023-10-02 01:41:38,726 WARNING [train.py:1204] (2/4) Exclude cut with ID _14100_210_8341_1_1530529320613_3678069_258-224599-0_sp1.1 from training. Duration: 0.7 2023-10-02 01:41:38,751 WARNING [train.py:1204] (2/4) Exclude cut with ID _6493_210_1747_1_1528459210424_4012470_324-323854-0 from training. Duration: 0.92 2023-10-02 01:41:38,843 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_12868_210_6753_1_1530702479_3610564_151-325626-0_sp1.1 from training. Duration: 0.8818125 2023-10-02 01:41:38,944 WARNING [train.py:1204] (2/4) Exclude cut with ID _14528_210_8254_1_1530669700514_4306939_106-43145-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 01:41:38,957 WARNING [train.py:1204] (2/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_686-54383-0 from training. Duration: 0.87 2023-10-02 01:41:38,997 WARNING [train.py:1204] (2/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_801-102362-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 01:41:39,071 WARNING [train.py:1204] (2/4) Exclude cut with ID _48677_210_5663_1_1533902405919_3892989_408-40740-0_sp0.9 from training. Duration: 0.92225 2023-10-02 01:41:39,375 WARNING [train.py:1204] (2/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_448-212135-0_sp1.1 from training. Duration: 0.82725 2023-10-02 01:41:39,541 WARNING [train.py:1204] (2/4) Exclude cut with ID _59170_210_6721_1_1534983633960_4203335_320-124047-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 01:41:39,590 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_4692_210_3528_1_1526899755_4323254_107-44401-0_sp1.1 from training. Duration: 0.7818125 2023-10-02 01:41:39,854 WARNING [train.py:1204] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_731-2428-0 from training. Duration: 0.83 2023-10-02 01:41:39,940 WARNING [train.py:1204] (2/4) Exclude cut with ID _29857_210_20034_1_1532395886753_3798160_335-163562-0_sp1.1 from training. Duration: 0.77275 2023-10-02 01:41:40,404 WARNING [train.py:1204] (2/4) Exclude cut with ID _14832_210_8750_1_1530691351539_3171348_171-275004-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 01:41:41,123 WARNING [train.py:1204] (2/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_105-328174-0_sp0.9 from training. Duration: 0.8444375 2023-10-02 01:41:41,171 WARNING [train.py:1204] (2/4) Exclude cut with ID _68965_210_1883_1_1535698962272_3497490_164-118421-0_sp1.1 from training. Duration: 0.936375 2023-10-02 01:41:41,258 WARNING [train.py:1204] (2/4) Exclude cut with ID _19896_210_1883_1_1531709022662_6649569_380-24170-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 01:41:41,270 WARNING [train.py:1204] (2/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_505-205270-0_sp1.1 from training. Duration: 0.3454375 2023-10-02 01:41:41,776 WARNING [train.py:1204] (2/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_427-208901-0_sp1.1 from training. Duration: 0.6181875 2023-10-02 01:41:41,824 WARNING [train.py:1204] (2/4) Exclude cut with ID _23818_210_6228_1_1531917063884_3676920_85-326916-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 01:41:41,919 WARNING [train.py:1204] (2/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_101-293367-0_sp1.1 from training. Duration: 0.57275 2023-10-02 01:41:41,926 WARNING [train.py:1204] (2/4) Exclude cut with ID _5409_210_1388_1_1527292850189_5865191_178-127970-0_sp1.1 from training. Duration: 0.5181875 2023-10-02 01:41:41,976 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_403-39619-0 from training. Duration: 0.84 2023-10-02 01:41:42,106 WARNING [train.py:1204] (2/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_427-87841-0_sp1.1 from training. Duration: 0.963625 2023-10-02 01:41:42,164 WARNING [train.py:1204] (2/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_286-68181-0_sp1.1 from training. Duration: 0.97275 2023-10-02 01:41:42,195 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6673_210_3273_1_1527767790_3173240_327-306185-0 from training. Duration: 0.83 2023-10-02 01:41:42,217 WARNING [train.py:1204] (2/4) Exclude cut with ID _3675_210_4557_1_1526639934408_4182089_106-203713-0_sp1.1 from training. Duration: 0.963625 2023-10-02 01:41:42,274 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_53071_210_16554_1_1534493434_1276796_130-36076-0 from training. Duration: 0.95 2023-10-02 01:41:42,313 WARNING [train.py:1204] (2/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_586-93054-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 01:41:43,291 WARNING [train.py:1204] (2/4) Exclude cut with ID _23825_210_13449_1_1531915313526_3497150_262-10571-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 01:41:43,361 WARNING [train.py:1204] (2/4) Exclude cut with ID _28783_210_7943_1_1532671164808_7155479_432-67885-0_sp1.1 from training. Duration: 0.97275 2023-10-02 01:41:43,554 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6287_210_3273_1_1527509506_2653716_182-199887-0_sp0.9 from training. Duration: 0.5666875 2023-10-02 01:41:43,601 WARNING [train.py:1204] (2/4) Exclude cut with ID _3231_210_2106_1_1526205864934_6784220_171-256004-0 from training. Duration: 0.86 2023-10-02 01:41:43,670 WARNING [train.py:1204] (2/4) Exclude cut with ID _69051_210_1531_1_1535885566901_3829538_332-92090-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 01:41:43,826 WARNING [train.py:1204] (2/4) Exclude cut with ID _3675_210_4557_1_1526639934408_4182089_104-324125-0_sp1.1 from training. Duration: 0.963625 2023-10-02 01:41:44,114 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_41764_210_2521_1_1533686110_4089313_501-2119-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 01:41:44,399 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_14056_210_4163_1_1530604483_7572827_56-100988-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 01:41:44,622 WARNING [train.py:1204] (2/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_398-239819-0_sp1.1 from training. Duration: 0.9 2023-10-02 01:41:44,980 WARNING [train.py:1204] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_74-48717-0_sp1.1 from training. Duration: 0.52725 2023-10-02 01:41:45,180 WARNING [train.py:1204] (2/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_177-49092-0 from training. Duration: 0.91 2023-10-02 01:41:45,206 WARNING [train.py:1204] (2/4) Exclude cut with ID _62337_210_12392_1_1535009273105_3412229_44-176353-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 01:41:45,213 WARNING [train.py:1204] (2/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_110-130961-0_sp1.1 from training. Duration: 0.42725 2023-10-02 01:41:45,358 WARNING [train.py:1204] (2/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_37-189262-0_sp1.1 from training. Duration: 0.8909375 2023-10-02 01:41:45,359 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_3172_210_2087_1_1526292929_3987865_197-97762-0_sp0.9 from training. Duration: 0.8333125 2023-10-02 01:41:45,457 WARNING [train.py:1204] (2/4) Exclude cut with ID IC0677W0065-334897 from training. Duration: 0.896 2023-10-02 01:41:45,458 WARNING [train.py:1204] (2/4) Exclude cut with ID IC0677W0080-334912 from training. Duration: 0.768 2023-10-02 01:41:45,462 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_120-41668-0 from training. Duration: 0.7 2023-10-02 01:41:45,724 WARNING [train.py:1204] (2/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_460-205287-0_sp1.1 from training. Duration: 0.963625 2023-10-02 01:41:45,890 WARNING [train.py:1204] (2/4) Exclude cut with ID _14632_210_9988_1_1530691279392_3923200_102-204477-0 from training. Duration: 0.97 2023-10-02 01:41:45,895 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_34815_210_6388_1_1533381320_3571903_279-305565-0 from training. Duration: 0.92 2023-10-02 01:41:45,975 WARNING [train.py:1204] (2/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_91-148831-0_sp1.1 from training. Duration: 0.9 2023-10-02 01:41:46,061 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6291_210_3273_1_1527683649_1078498_82-221196-0 from training. Duration: 0.76 2023-10-02 01:41:46,310 WARNING [train.py:1204] (2/4) Exclude cut with ID _38020_210_5986_1_1533112252259_7006089_493-102745-0 from training. Duration: 0.98 2023-10-02 01:41:46,515 WARNING [train.py:1204] (2/4) Exclude cut with ID _61133_210_9969_1_1535196777227_3696409_40-93442-0_sp1.1 from training. Duration: 0.92725 2023-10-02 01:41:46,650 WARNING [train.py:1204] (2/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_45-258852-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 01:41:46,692 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_53071_210_16554_1_1534493434_1276796_130-36076-0_sp1.1 from training. Duration: 0.863625 2023-10-02 01:41:47,484 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8694_210_6400_1_1529065754_3260756_332-13553-0_sp1.1 from training. Duration: 0.92725 2023-10-02 01:41:47,573 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_405-339986-0 from training. Duration: 0.51 2023-10-02 01:41:47,621 WARNING [train.py:1204] (2/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_927-43827-0_sp1.1 from training. Duration: 0.67275 2023-10-02 01:41:47,930 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_397-147196-0 from training. Duration: 0.8 2023-10-02 01:41:48,051 WARNING [train.py:1204] (2/4) Exclude cut with ID 3557-8342-0013-54691-0_sp1.1 from training. Duration: 0.836375 2023-10-02 01:41:48,201 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_47438_210_5399_1_1533797471_4513356_626-127722-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 01:41:48,371 WARNING [train.py:1204] (2/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_256-323883-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 01:41:48,375 WARNING [train.py:1204] (2/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_839-173017-0_sp0.9 from training. Duration: 0.4555625 2023-10-02 01:41:48,412 WARNING [train.py:1204] (2/4) Exclude cut with ID _5289_210_2084_1_1527678202736_3779299_392-261988-0_sp1.1 from training. Duration: 0.5909375 2023-10-02 01:41:48,462 WARNING [train.py:1204] (2/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_119-224384-0_sp1.1 from training. Duration: 0.936375 2023-10-02 01:41:48,489 WARNING [train.py:1204] (2/4) Exclude cut with ID _57523_210_1856_1_1534766294545_3732989_262-267933-0_sp1.1 from training. Duration: 0.963625 2023-10-02 01:41:48,713 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_35361_210_4238_1_1533262810_7793476_675-87012-0_sp0.9 from training. Duration: 0.988875 2023-10-02 01:41:48,729 WARNING [train.py:1204] (2/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_17-90541-0 from training. Duration: 0.94 2023-10-02 01:41:48,754 WARNING [train.py:1204] (2/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_535-14410-0_sp1.1 from training. Duration: 0.42725 2023-10-02 01:41:48,759 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_24673_210_6400_1_1531968633_778659_94-154511-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 01:41:48,762 WARNING [train.py:1204] (2/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_212-10501-0 from training. Duration: 0.94 2023-10-02 01:41:49,765 WARNING [train.py:1204] (2/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_383-285196-0_sp1.1 from training. Duration: 0.8818125 2023-10-02 01:41:50,127 WARNING [train.py:1204] (2/4) Exclude cut with ID _45560_210_21982_1_1533812603951_3584999_238-47020-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 01:41:50,138 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_21500_210_2054_1_1532149310_2716758_278-18053-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 01:41:50,153 WARNING [train.py:1204] (2/4) Exclude cut with ID _56235_210_10640_1_1534557335235_4161099_453-9626-0_sp1.1 from training. Duration: 0.936375 2023-10-02 01:41:50,256 WARNING [train.py:1204] (2/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_153-97441-0_sp1.1 from training. Duration: 0.5090625 2023-10-02 01:41:50,761 WARNING [train.py:1204] (2/4) Exclude cut with ID _67725_210_27767_1_1535608756671_5105359_431-120895-0_sp1.1 from training. Duration: 0.963625 2023-10-02 01:41:50,978 WARNING [train.py:1204] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_574-45541-0_sp1.1 from training. Duration: 0.5909375 2023-10-02 01:41:50,994 WARNING [train.py:1204] (2/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_162-103620-0_sp1.1 from training. Duration: 0.8454375 2023-10-02 01:41:51,012 WARNING [train.py:1204] (2/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_734-129761-0_sp1.1 from training. Duration: 0.7454375 2023-10-02 01:41:51,070 WARNING [train.py:1204] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_620-276793-0_sp1.1 from training. Duration: 0.536375 2023-10-02 01:41:51,106 WARNING [train.py:1204] (2/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_153-97441-0_sp0.9 from training. Duration: 0.62225 2023-10-02 01:41:51,971 WARNING [train.py:1204] (2/4) Exclude cut with ID _12733_210_8341_1_1530167424556_3646380_523-85621-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 01:41:51,994 WARNING [train.py:1204] (2/4) Exclude cut with ID _64048_210_10626_1_1535198382802_3835179_352-179834-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 01:41:52,352 WARNING [train.py:1204] (2/4) Exclude cut with ID _55162_210_17071_1_1534408248583_3753480_43-242364-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 01:41:52,375 WARNING [train.py:1204] (2/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_661-264857-0 from training. Duration: 0.93 2023-10-02 01:41:52,389 WARNING [train.py:1204] (2/4) Exclude cut with ID _6274_210_2084_1_1527937583837_7494020_325-237800-0_sp1.1 from training. Duration: 0.97275 2023-10-02 01:41:52,425 WARNING [train.py:1204] (2/4) Exclude cut with ID _55328_210_27767_1_1534580973260_4088170_385-171459-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 01:41:52,512 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_3780_210_2087_1_1526467389_2145572_176-107140-0_sp1.1 from training. Duration: 0.62725 2023-10-02 01:41:52,617 WARNING [train.py:1204] (2/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_375-244976-0_sp1.1 from training. Duration: 0.6818125 2023-10-02 01:41:52,730 WARNING [train.py:1204] (2/4) Exclude cut with ID _35941_210_15379_1_1532995294515_4023089_309-33162-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 01:41:53,201 WARNING [train.py:1204] (2/4) Exclude cut with ID _14459_210_2228_1_1531443641946_7516700_1185-22038-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 01:41:53,351 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_1197-69460-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 01:41:53,559 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9012W0022-501157 from training. Duration: 0.896 2023-10-02 01:41:53,802 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9023W0009-506302 from training. Duration: 0.896 2023-10-02 01:41:53,824 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_2821_210_2715_1_1526108385_3763560_73-296673-0_sp1.1 from training. Duration: 0.7545625 2023-10-02 01:41:53,850 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9025W0113-507442 from training. Duration: 0.896 2023-10-02 01:41:53,940 WARNING [train.py:1204] (2/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_739-2098-0 from training. Duration: 0.92 2023-10-02 01:41:53,959 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9029W0332-509722 from training. Duration: 0.768 2023-10-02 01:41:54,173 WARNING [train.py:1204] (2/4) Exclude cut with ID _16908_210_5724_1_1531119157248_3772700_68-101953-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 01:41:54,199 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9042W0011-515617 from training. Duration: 0.768 2023-10-02 01:41:54,307 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_3019_210_2028_1_1526044245_3264865_191-72235-0 from training. Duration: 0.97 2023-10-02 01:41:54,429 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9052W0006-520792 from training. Duration: 0.896 2023-10-02 01:41:54,567 WARNING [train.py:1204] (2/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_245-174553-0_sp1.1 from training. Duration: 0.8 2023-10-02 01:41:54,707 WARNING [train.py:1204] (2/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_202-67017-0 from training. Duration: 0.82 2023-10-02 01:41:54,804 WARNING [train.py:1204] (2/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_304-59227-0 from training. Duration: 0.77 2023-10-02 01:41:54,938 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_3133_210_2087_1_1526106446_2324690_246-125000-0_sp0.9 from training. Duration: 0.688875 2023-10-02 01:41:54,965 WARNING [train.py:1204] (2/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_243-42116-0 from training. Duration: 0.73 2023-10-02 01:41:55,146 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9087W0121-538492 from training. Duration: 0.896 2023-10-02 01:41:55,160 WARNING [train.py:1204] (2/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_120-76843-0 from training. Duration: 0.55 2023-10-02 01:41:56,382 WARNING [train.py:1204] (2/4) Exclude cut with ID _7852_210_5219_1_1528286471316_8139400_850-304621-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 01:41:56,393 WARNING [train.py:1204] (2/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_154-285415-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 01:41:56,404 WARNING [train.py:1204] (2/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_538-278194-0_sp1.1 from training. Duration: 0.92725 2023-10-02 01:41:56,404 WARNING [train.py:1204] (2/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_119-207162-0 from training. Duration: 0.71 2023-10-02 01:41:56,558 WARNING [train.py:1204] (2/4) Exclude cut with ID _2334_210_2667_1_1525608238880_4705129_459-104997-0_sp1.1 from training. Duration: 0.863625 2023-10-02 01:41:56,975 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9147W0019-567847 from training. Duration: 0.768 2023-10-02 01:41:57,383 WARNING [train.py:1204] (2/4) Exclude cut with ID _69884_210_9771_1_1535846392818_4166169_228-189439-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 01:41:57,424 WARNING [train.py:1204] (2/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_196-77172-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 01:41:57,492 WARNING [train.py:1204] (2/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_625-338546-0 from training. Duration: 0.88 2023-10-02 01:41:57,548 WARNING [train.py:1204] (2/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_193-327630-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 01:41:57,550 WARNING [train.py:1204] (2/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_391-24006-0_sp1.1 from training. Duration: 0.92725 2023-10-02 01:41:57,625 WARNING [train.py:1204] (2/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_197-28164-0 from training. Duration: 0.8 2023-10-02 01:41:57,886 WARNING [train.py:1204] (2/4) Exclude cut with ID _8395_210_1531_1_1528597871523_4411579_431-132470-0_sp1.1 from training. Duration: 0.67275 2023-10-02 01:41:57,904 WARNING [train.py:1204] (2/4) Exclude cut with ID _35350_210_16040_1_1533040191183_4075299_465-248326-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 01:41:58,084 WARNING [train.py:1204] (2/4) Exclude cut with ID _16756_210_4915_1_1531047803763_7104259_101-141922-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 01:41:58,349 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9212W0005-600172 from training. Duration: 0.896 2023-10-02 01:41:58,368 WARNING [train.py:1204] (2/4) Exclude cut with ID _63186_210_2084_1_1535414479705_3908649_378-166820-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 01:41:58,380 WARNING [train.py:1204] (2/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_52-211683-0_sp1.1 from training. Duration: 0.8181875 2023-10-02 01:41:58,794 WARNING [train.py:1204] (2/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_1147-237729-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 01:41:58,860 WARNING [train.py:1204] (2/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_234-14174-0_sp0.9 from training. Duration: 0.911125 2023-10-02 01:41:58,880 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_454-276429-0 from training. Duration: 0.87 2023-10-02 01:41:58,958 WARNING [train.py:1204] (2/4) Exclude cut with ID _53713_210_24116_1_1534314487762_7416751_755-165313-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 01:41:59,119 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8278_210_2550_1_1528637099_4220413_436-317072-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 01:41:59,200 WARNING [train.py:1204] (2/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_28-324729-0_sp1.1 from training. Duration: 0.8545625 2023-10-02 01:41:59,604 WARNING [train.py:1204] (2/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_326-165900-0 from training. Duration: 0.69 2023-10-02 01:41:59,666 WARNING [train.py:1204] (2/4) Exclude cut with ID _53489_210_6228_1_1534298357169_3835970_96-30246-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 01:41:59,795 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8275_210_4198_1_1528455320_3892571_491-17707-0_sp1.1 from training. Duration: 0.5818125 2023-10-02 01:42:00,722 WARNING [train.py:1204] (2/4) Exclude cut with ID _14348_210_8341_1_1530599434996_3778640_240-232370-0 from training. Duration: 0.84 2023-10-02 01:42:00,732 WARNING [train.py:1204] (2/4) Exclude cut with ID _45012_210_12651_1_1533608435136_5371380_54-112623-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 01:42:00,827 WARNING [train.py:1204] (2/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_328-264776-0_sp1.1 from training. Duration: 0.6090625 2023-10-02 01:42:00,870 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9325W0001-648427 from training. Duration: 0.768 2023-10-02 01:42:00,879 WARNING [train.py:1204] (2/4) Exclude cut with ID _56480_210_4220_1_1534679942079_3075409_49-74717-0_sp1.1 from training. Duration: 0.936375 2023-10-02 01:42:01,070 WARNING [train.py:1204] (2/4) Exclude cut with ID _59500_210_8254_1_1534809610680_3736009_6-241869-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 01:42:01,243 WARNING [train.py:1204] (2/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_172-215932-0_sp0.9 from training. Duration: 0.7333125 2023-10-02 01:42:01,337 WARNING [train.py:1204] (2/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_17-90541-0_sp1.1 from training. Duration: 0.8545625 2023-10-02 01:42:01,501 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_43831_210_2158_1_1533725502_5872791_525-89447-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 01:42:01,582 WARNING [train.py:1204] (2/4) Exclude cut with ID _12302_210_5219_1_1529924446466_8107280_907-182949-0 from training. Duration: 0.92 2023-10-02 01:42:01,594 WARNING [train.py:1204] (2/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_402-102311-0_sp0.9 from training. Duration: 0.7444375 2023-10-02 01:42:01,630 WARNING [train.py:1204] (2/4) Exclude cut with ID _8021_210_7994_1_1528376451358_3653069_179-45206-0_sp0.9 from training. Duration: 0.9555625 2023-10-02 01:42:01,855 WARNING [train.py:1204] (2/4) Exclude cut with ID _40137_210_12210_1_1533290484046_3795180_249-187059-0 from training. Duration: 0.88 2023-10-02 01:42:01,954 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9653W0194-675472 from training. Duration: 0.9658125 2023-10-02 01:42:02,138 WARNING [train.py:1204] (2/4) Exclude cut with ID _51332_210_9988_1_1534244371836_3801270_336-255604-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 01:42:02,329 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_20120_210_6753_1_1531643035_5482859_510-96996-0 from training. Duration: 0.87 2023-10-02 01:42:02,341 WARNING [train.py:1204] (2/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_1167-20437-0_sp1.1 from training. Duration: 0.92725 2023-10-02 01:42:02,391 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_3475_210_2028_1_1526193578_3622569_265-276625-0 from training. Duration: 0.76 2023-10-02 01:42:02,721 WARNING [train.py:1204] (2/4) Exclude cut with ID _14860_210_4915_1_1530702020340_7343240_470-172418-0_sp1.1 from training. Duration: 0.97275 2023-10-02 01:42:02,729 WARNING [train.py:1204] (2/4) Exclude cut with ID _5444_210_2151_1_1527418808464_5312210_581-315411-0_sp1.1 from training. Duration: 0.563625 2023-10-02 01:42:02,761 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_34450_210_9547_1_1533099443_4109050_519-342439-0_sp1.1 from training. Duration: 0.97275 2023-10-02 01:42:02,798 WARNING [train.py:1204] (2/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_50-175961-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 01:42:03,159 WARNING [train.py:1204] (2/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_372-6869-0 from training. Duration: 0.44 2023-10-02 01:42:03,164 WARNING [train.py:1204] (2/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_3-237434-0 from training. Duration: 0.76 2023-10-02 01:42:03,364 WARNING [train.py:1204] (2/4) Exclude cut with ID _8772_210_1850_1_1528635595972_3664011_247-336448-0_sp1.1 from training. Duration: 0.67275 2023-10-02 01:42:03,622 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_427-78097-0 from training. Duration: 0.8 2023-10-02 01:42:03,645 WARNING [train.py:1204] (2/4) Exclude cut with ID _69884_210_9771_1_1535846392818_4166169_466-252727-0_sp1.1 from training. Duration: 0.97275 2023-10-02 01:42:03,738 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_15972_210_6400_1_1530877934_3405692_316-35478-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 01:42:03,773 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_809-17156-0_sp0.9 from training. Duration: 0.811125 2023-10-02 01:42:03,813 WARNING [train.py:1204] (2/4) Exclude cut with ID _57572_210_5517_1_1534921219624_3809484_594-263474-0_sp1.1 from training. Duration: 0.936375 2023-10-02 01:42:03,863 WARNING [train.py:1204] (2/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_190-228204-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 01:42:03,995 WARNING [train.py:1204] (2/4) Exclude cut with ID _19514_210_9259_1_1531530071330_5084230_117-21193-0_sp1.1 from training. Duration: 0.936375 2023-10-02 01:42:04,144 WARNING [train.py:1204] (2/4) Exclude cut with ID _69584_210_5219_1_1535785133775_4044450_190-21315-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 01:42:04,226 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_53071_210_16554_1_1534493434_1276796_184-302050-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 01:42:04,270 WARNING [train.py:1204] (2/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_120-76843-0_sp1.1 from training. Duration: 0.5 2023-10-02 01:42:04,995 WARNING [train.py:1204] (2/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_318-162017-0_sp1.1 from training. Duration: 0.82725 2023-10-02 01:42:05,109 WARNING [train.py:1204] (2/4) Exclude cut with ID _46067_210_4220_1_1534125571066_3307450_79-46864-0_sp1.1 from training. Duration: 0.92725 2023-10-02 01:42:05,160 WARNING [train.py:1204] (2/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_358-215558-0_sp1.1 from training. Duration: 0.8454375 2023-10-02 01:42:05,229 WARNING [train.py:1204] (2/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_621-66764-0_sp1.1 from training. Duration: 0.92725 2023-10-02 01:42:05,355 WARNING [train.py:1204] (2/4) Exclude cut with ID _42863_210_9070_1_1533902398788_3696920_338-156851-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 01:42:05,384 WARNING [train.py:1204] (2/4) Exclude cut with ID _5289_210_2084_1_1527678202736_3779299_392-261988-0_sp0.9 from training. Duration: 0.72225 2023-10-02 01:42:05,642 WARNING [train.py:1204] (2/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_409-84682-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 01:42:05,719 WARNING [train.py:1204] (2/4) Exclude cut with ID _9235_210_5219_1_1528891154253_8046120_30-326681-0_sp0.9 from training. Duration: 0.92225 2023-10-02 01:42:05,723 WARNING [train.py:1204] (2/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_389-184695-0_sp1.1 from training. Duration: 0.92725 2023-10-02 01:42:05,769 WARNING [train.py:1204] (2/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_442-320317-0 from training. Duration: 0.82 2023-10-02 01:42:06,045 WARNING [train.py:1204] (2/4) Exclude cut with ID _44778_210_2054_1_1533798127203_7443090_756-29911-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 01:42:06,148 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_401-112036-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 01:42:06,249 WARNING [train.py:1204] (2/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_181-308827-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 01:42:06,333 WARNING [train.py:1204] (2/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_332-258725-0_sp1.1 from training. Duration: 0.963625 2023-10-02 01:42:06,409 WARNING [train.py:1204] (2/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_188-260011-0_sp0.9 from training. Duration: 0.811125 2023-10-02 01:42:06,450 WARNING [train.py:1204] (2/4) Exclude cut with ID _49039_210_6315_1_1533967537541_3846118_99-302239-0_sp1.1 from training. Duration: 0.963625 2023-10-02 01:42:06,617 WARNING [train.py:1204] (2/4) Exclude cut with ID _4122_210_2084_1_1526713164417_4353280_312-294828-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 01:42:06,623 WARNING [train.py:1204] (2/4) Exclude cut with ID _14922_210_6228_1_1530707435273_3693310_303-228738-0 from training. Duration: 0.84 2023-10-02 01:42:06,844 WARNING [train.py:1204] (2/4) Exclude cut with ID _14349_210_8341_1_1530615710818_3442470_115-285975-0_sp0.9 from training. Duration: 0.9555625 2023-10-02 01:42:06,992 WARNING [train.py:1204] (2/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_125-144174-0_sp1.1 from training. Duration: 0.3545625 2023-10-02 01:42:07,154 WARNING [train.py:1204] (2/4) Exclude cut with ID _53565_210_10635_1_1534337218335_3820259_351-54570-0_sp1.1 from training. Duration: 0.97275 2023-10-02 01:42:07,179 WARNING [train.py:1204] (2/4) Exclude cut with ID _28317_210_12742_1_1532480302381_8510325_410-40065-0_sp1.1 from training. Duration: 0.963625 2023-10-02 01:42:07,207 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_38965_210_4852_1_1533174906_3484735_167-339442-0_sp1.1 from training. Duration: 0.963625 2023-10-02 01:42:07,283 WARNING [train.py:1204] (2/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_265-164486-0_sp1.1 from training. Duration: 0.87275 2023-10-02 01:42:07,300 WARNING [train.py:1204] (2/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_725-182484-0_sp1.1 from training. Duration: 0.92725 2023-10-02 01:42:07,336 WARNING [train.py:1204] (2/4) Exclude cut with ID _28317_210_12742_1_1532480302381_8510325_607-213951-0_sp0.9 from training. Duration: 0.9333125 2023-10-02 01:42:07,358 WARNING [train.py:1204] (2/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_605-133395-0_sp1.1 from training. Duration: 0.6 2023-10-02 01:42:07,377 WARNING [train.py:1204] (2/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_351-294650-0 from training. Duration: 0.63 2023-10-02 01:42:07,433 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_14953_210_2151_1_1530701710_5616165_710-165400-0_sp1.1 from training. Duration: 0.7909375 2023-10-02 01:42:07,554 WARNING [train.py:1204] (2/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_237-58433-0_sp1.1 from training. Duration: 0.67275 2023-10-02 01:42:07,655 WARNING [train.py:1204] (2/4) Exclude cut with ID _5190_210_4557_1_1527244368799_373439_28-197030-0_sp1.1 from training. Duration: 0.6545625 2023-10-02 01:42:07,721 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_743-327651-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 01:42:08,009 WARNING [train.py:1204] (2/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_76-203867-0_sp0.9 from training. Duration: 0.8 2023-10-02 01:42:08,115 WARNING [train.py:1204] (2/4) Exclude cut with ID _14603_210_4220_1_1530615688441_3097319_274-274798-0 from training. Duration: 0.98 2023-10-02 01:42:08,164 WARNING [train.py:1204] (2/4) Exclude cut with ID _14087_210_2253_1_1530529224512_3014029_226-242140-0_sp1.1 from training. Duration: 0.736375 2023-10-02 01:42:08,526 WARNING [train.py:1204] (2/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_352-59958-0_sp1.1 from training. Duration: 0.7818125 2023-10-02 01:42:08,530 WARNING [train.py:1204] (2/4) Exclude cut with ID _39411_210_5986_1_1533538825030_3343649_191-251819-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 01:42:09,534 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7046_210_6753_1_1527895337_6920917_642-106799-0_sp1.1 from training. Duration: 0.836375 2023-10-02 01:42:09,549 WARNING [train.py:1204] (2/4) Exclude cut with ID _22826_210_9994_1_1531908201522_3279169_412-3442-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 01:42:09,567 WARNING [train.py:1204] (2/4) Exclude cut with ID _60057_210_10640_1_1534903065793_3333150_171-181133-0_sp0.9 from training. Duration: 0.97775 2023-10-02 01:42:09,575 WARNING [train.py:1204] (2/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_154-135696-0_sp1.1 from training. Duration: 0.72725 2023-10-02 01:42:09,842 WARNING [train.py:1204] (2/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_148-186805-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 01:42:10,037 WARNING [train.py:1204] (2/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_605-165641-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 01:42:10,051 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_174-277364-0_sp0.9 from training. Duration: 0.788875 2023-10-02 01:42:10,063 WARNING [train.py:1204] (2/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_89-321955-0_sp0.9 from training. Duration: 0.888875 2023-10-02 01:42:10,399 WARNING [train.py:1204] (2/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_438-105429-0_sp1.1 from training. Duration: 0.963625 2023-10-02 01:42:10,402 WARNING [train.py:1204] (2/4) Exclude cut with ID _14970_210_15709_1_1530773543493_1715730_187-20259-0_sp0.9 from training. Duration: 0.988875 2023-10-02 01:42:11,015 WARNING [train.py:1204] (2/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_385-193052-0_sp0.9 from training. Duration: 0.9555625 2023-10-02 01:42:11,058 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7113_210_1849_1_1527942685_3448768_194-311626-0_sp1.1 from training. Duration: 0.92725 2023-10-02 01:42:11,338 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_3780_210_2087_1_1526467389_2145572_176-107140-0_sp0.9 from training. Duration: 0.7666875 2023-10-02 01:42:11,340 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_36756_210_7234_1_1533026991_966276_123-213457-0_sp1.1 from training. Duration: 0.936375 2023-10-02 01:42:11,610 WARNING [train.py:1204] (2/4) Exclude cut with ID _23557_210_8750_1_1531890106370_6846255_456-244861-0_sp1.1 from training. Duration: 0.963625 2023-10-02 01:42:11,666 WARNING [train.py:1204] (2/4) Exclude cut with ID _37086_210_9038_1_1532999540891_7021250_242-349170-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 01:42:11,730 WARNING [train.py:1204] (2/4) Exclude cut with ID _41298_210_9019_1_1533430371450_4822143_471-281351-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 01:42:11,830 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_53378_210_17282_1_1534221071_5769509_572-126176-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 01:42:11,942 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_1507-263032-0_sp1.1 from training. Duration: 0.963625 2023-10-02 01:42:11,969 WARNING [train.py:1204] (2/4) Exclude cut with ID _13679_210_7497_1_1530534293662_3355098_411-186088-0_sp1.1 from training. Duration: 0.8545625 2023-10-02 01:42:12,143 WARNING [train.py:1204] (2/4) Exclude cut with ID _29345_210_4381_1_1532689168263_3860169_149-257268-0_sp1.1 from training. Duration: 0.736375 2023-10-02 01:42:12,260 WARNING [train.py:1204] (2/4) Exclude cut with ID _47790_210_16187_1_1533862814066_4207519_381-252014-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 01:42:12,273 WARNING [train.py:1204] (2/4) Exclude cut with ID _35799_210_5854_1_1533263487887_3529071_512-2717-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 01:42:12,498 WARNING [train.py:1204] (2/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_505-205270-0_sp0.9 from training. Duration: 0.42225 2023-10-02 01:42:12,585 WARNING [train.py:1204] (2/4) Exclude cut with ID _14998_210_5219_1_1530705582080_7474970_731-20117-0_sp1.1 from training. Duration: 0.936375 2023-10-02 01:42:12,750 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_37351_210_7925_1_1533012931_3812113_265-85780-0_sp1.1 from training. Duration: 0.8 2023-10-02 01:42:12,762 WARNING [train.py:1204] (2/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_313-134930-0 from training. Duration: 0.97 2023-10-02 01:42:13,519 WARNING [train.py:1204] (2/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_374-138971-0 from training. Duration: 0.94 2023-10-02 01:42:13,564 WARNING [train.py:1204] (2/4) Exclude cut with ID _30304_210_6319_1_1532476827261_7582060_1227-287886-0_sp1.1 from training. Duration: 0.936375 2023-10-02 01:42:13,787 WARNING [train.py:1204] (2/4) Exclude cut with ID _59493_210_17282_1_1535078419077_3225879_332-217544-0_sp1.1 from training. Duration: 0.97275 2023-10-02 01:42:13,833 WARNING [train.py:1204] (2/4) Exclude cut with ID _23514_210_1389_1_1531832139785_3031439_361-119640-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 01:42:13,981 WARNING [train.py:1204] (2/4) Exclude cut with ID _69584_210_5219_1_1535785133775_4044450_142-118058-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 01:42:14,009 WARNING [train.py:1204] (2/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_178-177582-0_sp1.1 from training. Duration: 0.67275 2023-10-02 01:42:14,047 WARNING [train.py:1204] (2/4) Exclude cut with ID _25303_210_17519_1_1532050207576_3635970_41-195572-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 01:42:14,285 WARNING [train.py:1204] (2/4) Exclude cut with ID _55328_210_27767_1_1534580973260_4088170_329-341322-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 01:42:14,311 WARNING [train.py:1204] (2/4) Exclude cut with ID _5608_210_2106_1_1527415439089_6819768_909-82767-0_sp1.1 from training. Duration: 0.5545625 2023-10-02 01:42:14,351 WARNING [train.py:1204] (2/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_261-297503-0_sp1.1 from training. Duration: 0.9 2023-10-02 01:42:14,357 WARNING [train.py:1204] (2/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_459-291706-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 01:42:14,358 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7762_210_1794_1_1528199508_4057408_422-227034-0 from training. Duration: 0.73 2023-10-02 01:42:14,367 WARNING [train.py:1204] (2/4) Exclude cut with ID _3067_210_2667_1_1526212974640_4503168_250-285937-0_sp0.9 from training. Duration: 0.888875 2023-10-02 01:42:14,397 WARNING [train.py:1204] (2/4) Exclude cut with ID _66873_210_5517_1_1535454136506_4293370_332-253745-0_sp1.1 from training. Duration: 0.936375 2023-10-02 01:42:14,842 WARNING [train.py:1204] (2/4) Exclude cut with ID _13938_210_9988_1_1530518718978_3693960_304-112784-0 from training. Duration: 0.46 2023-10-02 01:42:14,999 WARNING [train.py:1204] (2/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_226-267957-0_sp1.1 from training. Duration: 0.92725 2023-10-02 01:42:15,035 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_41822_210_13270_1_1533955976_4379284_608-254567-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 01:42:15,103 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_124-113562-0_sp1.1 from training. Duration: 0.8545625 2023-10-02 01:42:15,148 WARNING [train.py:1204] (2/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_117-290916-0_sp1.1 from training. Duration: 0.463625 2023-10-02 01:42:15,220 WARNING [train.py:1204] (2/4) Exclude cut with ID _28427_210_4220_1_1532581240967_3061069_102-66533-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 01:42:15,295 WARNING [train.py:1204] (2/4) Exclude cut with ID _55383_210_10635_1_1534468426172_2231669_134-128246-0_sp1.1 from training. Duration: 0.8090625 2023-10-02 01:42:15,343 WARNING [train.py:1204] (2/4) Exclude cut with ID _45871_210_5665_1_1533864614245_3296060_49-308316-0_sp1.1 from training. Duration: 0.97275 2023-10-02 01:42:15,386 WARNING [train.py:1204] (2/4) Exclude cut with ID _57572_210_5517_1_1534921219624_3809484_176-199205-0_sp1.1 from training. Duration: 0.97275 2023-10-02 01:42:15,679 WARNING [train.py:1204] (2/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_45-265684-0_sp1.1 from training. Duration: 0.6545625 2023-10-02 01:42:15,696 WARNING [train.py:1204] (2/4) Exclude cut with ID _15150_210_8341_1_1530752411803_7256384_469-239509-0_sp0.9 from training. Duration: 0.7555625 2023-10-02 01:42:15,740 WARNING [train.py:1204] (2/4) Exclude cut with ID _28461_210_2253_1_1532771775189_3041299_136-270644-0_sp1.1 from training. Duration: 0.8454375 2023-10-02 01:42:15,814 WARNING [train.py:1204] (2/4) Exclude cut with ID _69584_210_5219_1_1535785133775_4044450_302-47302-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 01:42:15,904 WARNING [train.py:1204] (2/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_166-128941-0_sp1.1 from training. Duration: 0.4909375 2023-10-02 01:42:16,384 WARNING [train.py:1204] (2/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_559-120504-0_sp1.1 from training. Duration: 0.97275 2023-10-02 01:42:16,555 WARNING [train.py:1204] (2/4) Exclude cut with ID _6456_210_1534_1_1527681634212_3181049_345-252877-0_sp1.1 from training. Duration: 0.5818125 2023-10-02 01:42:16,589 WARNING [train.py:1204] (2/4) Exclude cut with ID _28317_210_12742_1_1532480302381_8510325_902-233003-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 01:42:16,916 WARNING [train.py:1204] (2/4) Exclude cut with ID _36538_210_9994_1_1533204020363_3795620_204-261385-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 01:42:16,920 WARNING [train.py:1204] (2/4) Exclude cut with ID _14821_210_8750_1_1530680503991_6829359_189-297451-0_sp1.1 from training. Duration: 0.5 2023-10-02 01:42:16,959 WARNING [train.py:1204] (2/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_1211-226066-0_sp0.9 from training. Duration: 0.8444375 2023-10-02 01:42:17,983 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6786_210_1388_1_1527839699_4194495_273-149841-0_sp1.1 from training. Duration: 0.836375 2023-10-02 01:42:18,015 WARNING [train.py:1204] (2/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_380-162648-0_sp1.1 from training. Duration: 0.8545625 2023-10-02 01:42:18,016 WARNING [train.py:1204] (2/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_494-199538-0 from training. Duration: 0.75 2023-10-02 01:42:18,364 WARNING [train.py:1204] (2/4) Exclude cut with ID _49097_210_16049_1_1533952780207_4360979_276-87053-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 01:42:18,512 WARNING [train.py:1204] (2/4) Exclude cut with ID _5444_210_2151_1_1527418808464_5312210_726-27165-0 from training. Duration: 0.67 2023-10-02 01:42:18,868 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_128-202104-0_sp1.1 from training. Duration: 0.57275 2023-10-02 01:42:19,036 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_11349_210_6753_1_1529800861_1101207_26-65602-0_sp1.1 from training. Duration: 0.87275 2023-10-02 01:42:19,069 WARNING [train.py:1204] (2/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_159-328157-0 from training. Duration: 0.52 2023-10-02 01:42:19,148 WARNING [train.py:1204] (2/4) Exclude cut with ID _14069_210_5632_1_1530512692033_4803390_98-21934-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 01:42:19,289 WARNING [train.py:1204] (2/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_352-59958-0_sp0.9 from training. Duration: 0.9555625 2023-10-02 01:42:19,336 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6163_210_6400_1_1527506746_4263068_283-137811-0 from training. Duration: 0.77 2023-10-02 01:42:19,370 WARNING [train.py:1204] (2/4) Exclude cut with ID _21989_210_5854_1_1531899214931_3271360_133-44759-0_sp1.1 from training. Duration: 0.7 2023-10-02 01:42:19,581 WARNING [train.py:1204] (2/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_21-174514-0_sp0.9 from training. Duration: 0.47775 2023-10-02 01:42:19,613 WARNING [train.py:1204] (2/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_201-78811-0_sp1.1 from training. Duration: 0.963625 2023-10-02 01:42:19,645 WARNING [train.py:1204] (2/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_537-164630-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 01:42:19,732 WARNING [train.py:1204] (2/4) Exclude cut with ID _14087_210_2253_1_1530529224512_3014029_156-257607-0 from training. Duration: 0.83 2023-10-02 01:42:19,742 WARNING [train.py:1204] (2/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_308-154450-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 01:42:19,777 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_45399_210_4852_1_1533624725_7579737_214-240574-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 01:42:19,919 WARNING [train.py:1204] (2/4) Exclude cut with ID _29303_210_2575_1_1532392237303_7410649_615-305517-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 01:42:19,951 WARNING [train.py:1204] (2/4) Exclude cut with ID IC0125W0002-61808 from training. Duration: 0.708 2023-10-02 01:42:19,952 WARNING [train.py:1204] (2/4) Exclude cut with ID IC0125W0017-61823 from training. Duration: 0.759 2023-10-02 01:42:20,224 WARNING [train.py:1204] (2/4) Exclude cut with ID _53219_210_4525_1_1534834836008_4064825_4-145291-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 01:42:20,389 WARNING [train.py:1204] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_185-174805-0_sp1.1 from training. Duration: 0.6181875 2023-10-02 01:42:20,409 WARNING [train.py:1204] (2/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_635-193046-0_sp1.1 from training. Duration: 0.92725 2023-10-02 01:42:20,662 WARNING [train.py:1204] (2/4) Exclude cut with ID _46067_210_4220_1_1534125571066_3307450_321-258866-0_sp1.1 from training. Duration: 0.97275 2023-10-02 01:42:20,813 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8558_210_1410_1_1528505225_7613406_131-329524-0 from training. Duration: 0.79 2023-10-02 01:42:20,957 WARNING [train.py:1204] (2/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_847-41334-0_sp0.9 from training. Duration: 0.488875 2023-10-02 01:42:21,065 WARNING [train.py:1204] (2/4) Exclude cut with ID _14227_210_4381_1_1530701804286_3940259_125-275642-0 from training. Duration: 0.99 2023-10-02 01:42:21,171 WARNING [train.py:1204] (2/4) Exclude cut with ID _25248_210_8750_1_1532062825941_5554180_248-177255-0_sp1.1 from training. Duration: 0.963625 2023-10-02 01:42:21,220 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_40047_210_2290_1_1533290519_2374311_195-165693-0_sp1.1 from training. Duration: 0.8090625 2023-10-02 01:42:21,223 WARNING [train.py:1204] (2/4) Exclude cut with ID _23109_210_4381_1_1532150828777_901949_29-258672-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 01:42:22,144 WARNING [train.py:1204] (2/4) Exclude cut with ID _14821_210_8750_1_1530680503991_6829359_2-227151-0_sp1.1 from training. Duration: 0.7545625 2023-10-02 01:42:22,187 WARNING [train.py:1204] (2/4) Exclude cut with ID _49643_210_7989_1_1534640244224_3939031_565-53982-0_sp1.1 from training. Duration: 0.963625 2023-10-02 01:42:22,190 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7544_210_6753_1_1528591327_1471426_33-346031-0_sp0.9 from training. Duration: 0.67775 2023-10-02 01:42:22,265 WARNING [train.py:1204] (2/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_95-61782-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 01:42:22,847 WARNING [train.py:1204] (2/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_686-54383-0_sp0.9 from training. Duration: 0.9666875 2023-10-02 01:42:22,988 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_14056_210_4163_1_1530604483_7572827_505-276823-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 01:42:23,163 WARNING [train.py:1204] (2/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_745-128443-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 01:42:23,216 WARNING [train.py:1204] (2/4) Exclude cut with ID _3844_210_1390_1_1526727803582_2871209_20-251664-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 01:42:23,290 WARNING [train.py:1204] (2/4) Exclude cut with ID _36538_210_9994_1_1533204020363_3795620_487-164628-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 01:42:23,675 WARNING [train.py:1204] (2/4) Exclude cut with ID _5166_210_2084_1_1527073197160_4087993_309-222545-0_sp1.1 from training. Duration: 0.763625 2023-10-02 01:42:23,687 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_3531_210_3528_1_1526294203_5216456_309-227302-0 from training. Duration: 0.69 2023-10-02 01:42:23,751 WARNING [train.py:1204] (2/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_42-192460-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 01:42:23,822 WARNING [train.py:1204] (2/4) Exclude cut with ID _22083_210_8254_1_1531702862439_3678970_49-234546-0_sp1.1 from training. Duration: 0.7545625 2023-10-02 01:42:23,826 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_427-78097-0_sp0.9 from training. Duration: 0.888875 2023-10-02 01:42:24,139 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_43802_210_2290_1_1533516203_8829504_378-205254-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 01:42:24,188 WARNING [train.py:1204] (2/4) Exclude cut with ID _45329_210_7005_1_1534157951211_6774339_672-314830-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 01:42:24,451 WARNING [train.py:1204] (2/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_90-30673-0_sp1.1 from training. Duration: 0.763625 2023-10-02 01:42:24,596 WARNING [train.py:1204] (2/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_563-329943-0_sp1.1 from training. Duration: 0.9 2023-10-02 01:42:24,618 WARNING [train.py:1204] (2/4) Exclude cut with ID _21632_210_2253_1_1531994001765_3744140_154-84090-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 01:42:24,619 WARNING [train.py:1204] (2/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_103-336064-0_sp0.9 from training. Duration: 0.5666875 2023-10-02 01:42:24,667 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_43-71989-0_sp0.9 from training. Duration: 0.6333125 2023-10-02 01:42:24,735 WARNING [train.py:1204] (2/4) Exclude cut with ID _3867_210_1883_1_1526641200575_4475830_319-349850-0_sp0.9 from training. Duration: 0.7444375 2023-10-02 01:42:24,826 WARNING [train.py:1204] (2/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_629-179454-0_sp1.1 from training. Duration: 0.963625 2023-10-02 01:42:24,834 WARNING [train.py:1204] (2/4) Exclude cut with ID _65263_210_22356_1_1535763598691_3921037_49-222917-0 from training. Duration: 0.92 2023-10-02 01:42:24,861 WARNING [train.py:1204] (2/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_602-331772-0_sp1.1 from training. Duration: 0.936375 2023-10-02 01:42:24,956 WARNING [train.py:1204] (2/4) Exclude cut with ID _58414_210_8254_1_1535421609162_7151182_292-134978-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 01:42:24,991 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_38793_210_2544_1_1533176114_3849320_200-299292-0_sp1.1 from training. Duration: 0.936375 2023-10-02 01:42:25,117 WARNING [train.py:1204] (2/4) Exclude cut with ID _14970_210_15709_1_1530775547361_1966965_6-23328-0 from training. Duration: 0.86 2023-10-02 01:42:25,191 WARNING [train.py:1204] (2/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_386-225511-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 01:42:25,212 WARNING [train.py:1204] (2/4) Exclude cut with ID _38323_210_12108_1_1533121233369_3303049_416-11919-0_sp1.1 from training. Duration: 0.87275 2023-10-02 01:42:25,266 WARNING [train.py:1204] (2/4) Exclude cut with ID _4866_210_2577_1_1527404412284_4364659_256-306830-0_sp1.1 from training. Duration: 0.7909375 2023-10-02 01:42:25,394 WARNING [train.py:1204] (2/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_53-300544-0 from training. Duration: 0.67 2023-10-02 01:42:26,427 WARNING [train.py:1204] (2/4) Exclude cut with ID _41314_210_12651_1_1533459476543_3766460_80-74184-0_sp1.1 from training. Duration: 0.7545625 2023-10-02 01:42:26,449 WARNING [train.py:1204] (2/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_332-256168-0_sp1.1 from training. Duration: 0.6818125 2023-10-02 01:42:26,486 WARNING [train.py:1204] (2/4) Exclude cut with ID _48836_210_12210_1_1534075848293_3904150_429-35475-0_sp1.1 from training. Duration: 0.8090625 2023-10-02 01:42:26,638 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_144-135833-0_sp1.1 from training. Duration: 0.97275 2023-10-02 01:42:26,942 WARNING [train.py:1204] (2/4) Exclude cut with ID _14224_210_4381_1_1530529062037_4264010_380-343670-0_sp0.9 from training. Duration: 0.97775 2023-10-02 01:42:26,968 WARNING [train.py:1204] (2/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_213-265174-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 01:42:26,995 WARNING [train.py:1204] (2/4) Exclude cut with ID _20455_210_5219_1_1531565996775_8098100_86-314283-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 01:42:27,210 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_41750_210_1960_1_1533520678_3948042_68-223813-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 01:42:27,289 WARNING [train.py:1204] (2/4) Exclude cut with ID _55153_210_2253_1_1534384786677_3162096_24-198429-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 01:42:27,384 WARNING [train.py:1204] (2/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_119-207162-0_sp0.9 from training. Duration: 0.788875 2023-10-02 01:42:27,451 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_12868_210_6753_1_1530702479_3610564_148-172861-0 from training. Duration: 0.5 2023-10-02 01:42:27,901 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_5522_210_3273_1_1527249606_3909096_318-203153-0 from training. Duration: 0.43 2023-10-02 01:42:27,910 WARNING [train.py:1204] (2/4) Exclude cut with ID _14884_210_2252_1_1530707459689_5561250_288-189949-0_sp1.1 from training. Duration: 0.97275 2023-10-02 01:42:28,034 WARNING [train.py:1204] (2/4) Exclude cut with ID _5294_210_1747_1_1527249643928_3696179_529-242298-0 from training. Duration: 0.95 2023-10-02 01:42:28,091 WARNING [train.py:1204] (2/4) Exclude cut with ID _36538_210_9994_1_1533204020363_3795620_503-110289-0_sp1.1 from training. Duration: 0.936375 2023-10-02 01:42:28,299 WARNING [train.py:1204] (2/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_385-193052-0 from training. Duration: 0.86 2023-10-02 01:42:28,358 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_456-22141-0 from training. Duration: 0.88 2023-10-02 01:42:28,483 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_38965_210_4852_1_1533174906_3484735_284-117840-0_sp1.1 from training. Duration: 0.936375 2023-10-02 01:42:29,421 WARNING [train.py:1204] (2/4) Exclude cut with ID _32704_210_8954_1_1533017488840_6747370_75-201625-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 01:42:29,549 WARNING [train.py:1204] (2/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_255-312654-0_sp1.1 from training. Duration: 0.82725 2023-10-02 01:42:29,559 WARNING [train.py:1204] (2/4) Exclude cut with ID _47251_210_18628_1_1533814063128_4137250_214-88076-0_sp0.9 from training. Duration: 0.97775 2023-10-02 01:42:29,646 WARNING [train.py:1204] (2/4) Exclude cut with ID _4092_210_2151_1_1526643044449_3135759_227-213972-0_sp1.1 from training. Duration: 0.4909375 2023-10-02 01:42:29,673 WARNING [train.py:1204] (2/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_912-47473-0_sp1.1 from training. Duration: 0.6181875 2023-10-02 01:42:29,705 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6767_210_3273_1_1527855783_1718932_173-256601-0_sp1.1 from training. Duration: 0.72725 2023-10-02 01:42:29,858 WARNING [train.py:1204] (2/4) Exclude cut with ID _6274_210_2084_1_1527937583837_7494020_32-205288-0_sp1.1 from training. Duration: 0.7090625 2023-10-02 01:42:29,898 WARNING [train.py:1204] (2/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_39-248870-0 from training. Duration: 0.69 2023-10-02 01:42:30,719 WARNING [train.py:1204] (2/4) Exclude cut with ID _3541_210_2477_1_1526475408709_3263973_377-296839-0_sp1.1 from training. Duration: 0.5 2023-10-02 01:42:30,749 WARNING [train.py:1204] (2/4) Exclude cut with ID _13679_210_7497_1_1530534293662_3355098_413-56154-0_sp1.1 from training. Duration: 0.963625 2023-10-02 01:42:30,800 WARNING [train.py:1204] (2/4) Exclude cut with ID _14970_210_15709_1_1530775547361_1966965_201-84544-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 01:42:30,858 WARNING [train.py:1204] (2/4) Exclude cut with ID _21783_210_11357_1_1531742458730_3762679_105-95775-0_sp1.1 from training. Duration: 0.936375 2023-10-02 01:42:31,040 WARNING [train.py:1204] (2/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_131-191669-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 01:42:31,063 WARNING [train.py:1204] (2/4) Exclude cut with ID _14860_210_4915_1_1530702020340_7343240_695-4323-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 01:42:31,165 WARNING [train.py:1204] (2/4) Exclude cut with ID _39867_210_9826_1_1533553222030_9344259_991-130565-0_sp1.1 from training. Duration: 0.97275 2023-10-02 01:42:31,307 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_50-3289-0_sp1.1 from training. Duration: 0.82725 2023-10-02 01:42:31,446 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_54-347670-0_sp1.1 from training. Duration: 0.92725 2023-10-02 01:42:31,494 WARNING [train.py:1204] (2/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_200-153445-0_sp1.1 from training. Duration: 0.936375 2023-10-02 01:42:31,570 WARNING [train.py:1204] (2/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_279-302911-0_sp1.1 from training. Duration: 0.97275 2023-10-02 01:42:31,726 WARNING [train.py:1204] (2/4) Exclude cut with ID _14886_210_4220_1_1530702020355_3516190_159-73748-0_sp0.9 from training. Duration: 0.92225 2023-10-02 01:42:32,103 WARNING [train.py:1204] (2/4) Exclude cut with ID _53379_210_20034_1_1534222027363_3854654_23-222715-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 01:42:32,228 WARNING [train.py:1204] (2/4) Exclude cut with ID _12555_210_8750_1_1530075594402_3780239_449-267986-0 from training. Duration: 0.83 2023-10-02 01:42:32,238 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7544_210_6753_1_1528587722_3576686_295-258479-0 from training. Duration: 0.84 2023-10-02 01:42:32,373 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7002_210_1794_1_1527935634_4034444_487-236501-0_sp0.9 from training. Duration: 0.7333125 2023-10-02 01:42:32,438 WARNING [train.py:1204] (2/4) Exclude cut with ID _21783_210_11357_1_1531742458730_3762679_244-19107-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 01:42:32,461 WARNING [train.py:1204] (2/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_390-339137-0 from training. Duration: 0.56 2023-10-02 01:42:32,523 WARNING [train.py:1204] (2/4) Exclude cut with ID _7562_210_2084_1_1528887767916_3946229_186-326422-0_sp1.1 from training. Duration: 0.763625 2023-10-02 01:42:32,599 WARNING [train.py:1204] (2/4) Exclude cut with ID _33640_210_2655_1_1533117605479_4855469_645-145235-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 01:42:32,621 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_25767_210_2161_1_1532307483_3867748_506-171777-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 01:42:32,655 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_5438_210_1794_1_1527336687_2600009_233-58072-0_sp1.1 from training. Duration: 0.7 2023-10-02 01:42:32,656 WARNING [train.py:1204] (2/4) Exclude cut with ID IC0669W0063-330908 from training. Duration: 0.896 2023-10-02 01:42:32,697 WARNING [train.py:1204] (2/4) Exclude cut with ID IC0671W0070-331913 from training. Duration: 0.896 2023-10-02 01:42:32,700 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_258-310570-0_sp1.1 from training. Duration: 0.7454375 2023-10-02 01:42:32,745 WARNING [train.py:1204] (2/4) Exclude cut with ID _9461_210_4557_1_1529060514726_3677451_162-177507-0_sp1.1 from training. Duration: 0.97275 2023-10-02 01:42:33,107 WARNING [train.py:1204] (2/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_26-175553-0_sp0.9 from training. Duration: 0.67775 2023-10-02 01:42:33,138 WARNING [train.py:1204] (2/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_7-150481-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 01:42:33,193 WARNING [train.py:1204] (2/4) Exclude cut with ID _23557_210_8750_1_1531890106370_6846255_72-273262-0_sp1.1 from training. Duration: 0.963625 2023-10-02 01:42:33,295 WARNING [train.py:1204] (2/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_367-61330-0 from training. Duration: 0.75 2023-10-02 01:42:33,590 WARNING [train.py:1204] (2/4) Exclude cut with ID _39867_210_9826_1_1533553222030_9344259_1337-11141-0_sp1.1 from training. Duration: 0.936375 2023-10-02 01:42:33,619 WARNING [train.py:1204] (2/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_101-293367-0 from training. Duration: 0.63 2023-10-02 01:42:33,694 WARNING [train.py:1204] (2/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_10-100367-0 from training. Duration: 0.98 2023-10-02 01:42:33,807 WARNING [train.py:1204] (2/4) Exclude cut with ID _60057_210_10640_1_1534903065793_3333150_171-181133-0_sp1.1 from training. Duration: 0.8 2023-10-02 01:42:33,816 WARNING [train.py:1204] (2/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_844-96989-0_sp0.9 from training. Duration: 0.788875 2023-10-02 01:42:33,983 WARNING [train.py:1204] (2/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_301-144595-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 01:42:34,068 WARNING [train.py:1204] (2/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_115-147343-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 01:42:34,359 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_3171_210_2087_1_1526109084_2330007_37-64334-0_sp1.1 from training. Duration: 0.7454375 2023-10-02 01:42:35,012 WARNING [train.py:1204] (2/4) Exclude cut with ID _19514_210_9259_1_1531530071330_5084230_769-272914-0_sp1.1 from training. Duration: 0.936375 2023-10-02 01:42:35,188 WARNING [train.py:1204] (2/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_76-311963-0_sp0.9 from training. Duration: 0.77775 2023-10-02 01:42:35,233 WARNING [train.py:1204] (2/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_526-62079-0 from training. Duration: 0.67 2023-10-02 01:42:35,270 WARNING [train.py:1204] (2/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_7-111954-0_sp0.9 from training. Duration: 0.62225 2023-10-02 01:42:35,392 WARNING [train.py:1204] (2/4) Exclude cut with ID _57997_210_4163_1_1534766425889_3961049_360-279140-0_sp1.1 from training. Duration: 0.97275 2023-10-02 01:42:35,588 WARNING [train.py:1204] (2/4) Exclude cut with ID _4866_210_2577_1_1527404412284_4364659_133-1696-0 from training. Duration: 0.99 2023-10-02 01:42:35,656 WARNING [train.py:1204] (2/4) Exclude cut with ID _5190_210_4557_1_1527244368799_373439_28-197030-0 from training. Duration: 0.72 2023-10-02 01:42:35,889 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_53-39003-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 01:42:35,891 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6673_210_3273_1_1527767790_3173240_351-167954-0 from training. Duration: 0.48 2023-10-02 01:42:35,913 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_2822_210_2715_1_1526112586_1999587_165-36656-0_sp1.1 from training. Duration: 0.8090625 2023-10-02 01:42:36,120 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_328-264892-0_sp1.1 from training. Duration: 0.92725 2023-10-02 01:42:36,124 WARNING [train.py:1204] (2/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_29-293896-0_sp0.9 from training. Duration: 0.67775 2023-10-02 01:42:36,357 WARNING [train.py:1204] (2/4) Exclude cut with ID _65263_210_22356_1_1535763598691_3921037_465-55448-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 01:42:36,416 WARNING [train.py:1204] (2/4) Exclude cut with ID _21989_210_5854_1_1531899214931_3271360_311-53106-0_sp1.1 from training. Duration: 0.97275 2023-10-02 01:42:36,715 WARNING [train.py:1204] (2/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_389-277915-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 01:42:36,891 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_69630_210_7234_1_1535791094_677837_63-277139-0_sp1.1 from training. Duration: 0.963625 2023-10-02 01:42:37,011 WARNING [train.py:1204] (2/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_85-138257-0 from training. Duration: 0.71 2023-10-02 01:42:37,036 WARNING [train.py:1204] (2/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_686-247600-0 from training. Duration: 0.92 2023-10-02 01:42:37,043 WARNING [train.py:1204] (2/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_108-255430-0_sp1.1 from training. Duration: 0.8818125 2023-10-02 01:42:37,384 WARNING [train.py:1204] (2/4) Exclude cut with ID _14884_210_2252_1_1530707459689_5561250_114-201399-0_sp1.1 from training. Duration: 0.6909375 2023-10-02 01:42:37,561 WARNING [train.py:1204] (2/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_506-42614-0 from training. Duration: 0.79 2023-10-02 01:42:37,677 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_43802_210_2290_1_1533516203_8829504_85-195240-0_sp1.1 from training. Duration: 0.7909375 2023-10-02 01:42:38,046 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_141-36243-0_sp1.1 from training. Duration: 0.97275 2023-10-02 01:42:38,693 WARNING [train.py:1204] (2/4) Exclude cut with ID _7852_210_5219_1_1528286471316_8139400_358-287492-0_sp1.1 from training. Duration: 0.87275 2023-10-02 01:42:38,718 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_805-71497-0_sp0.9 from training. Duration: 0.788875 2023-10-02 01:42:39,433 WARNING [train.py:1204] (2/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_178-39722-0 from training. Duration: 0.49 2023-10-02 01:42:39,668 WARNING [train.py:1204] (2/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_428-181925-0_sp1.1 from training. Duration: 0.963625 2023-10-02 01:42:39,672 WARNING [train.py:1204] (2/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_770-200430-0 from training. Duration: 0.89 2023-10-02 01:42:39,712 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_1104-271009-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 01:42:39,775 WARNING [train.py:1204] (2/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_128-296581-0_sp0.9 from training. Duration: 0.82225 2023-10-02 01:42:40,246 WARNING [train.py:1204] (2/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_19-62511-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 01:42:40,381 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_805-71497-0 from training. Duration: 0.71 2023-10-02 01:42:40,469 WARNING [train.py:1204] (2/4) Exclude cut with ID _44778_210_2054_1_1533798127203_7443090_573-152077-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 01:42:40,474 WARNING [train.py:1204] (2/4) Exclude cut with ID _42875_210_14856_1_1533704397487_4012433_393-177816-0_sp1.1 from training. Duration: 0.963625 2023-10-02 01:42:40,593 WARNING [train.py:1204] (2/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_278-35159-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 01:42:40,698 WARNING [train.py:1204] (2/4) Exclude cut with ID _15037_210_2253_1_1530752294104_3267650_13-130361-0_sp1.1 from training. Duration: 0.9 2023-10-02 01:42:40,800 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_3019_210_2028_1_1526044245_3264865_127-135615-0 from training. Duration: 0.94 2023-10-02 01:42:40,859 WARNING [train.py:1204] (2/4) Exclude cut with ID _28317_210_12742_1_1532480302381_8510325_607-213951-0 from training. Duration: 0.84 2023-10-02 01:42:40,933 WARNING [train.py:1204] (2/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_436-318113-0_sp0.9 from training. Duration: 0.8333125 2023-10-02 01:42:40,934 WARNING [train.py:1204] (2/4) Exclude cut with ID _13938_210_9988_1_1530518718978_3693960_148-152225-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 01:42:41,380 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_2541_210_1794_1_1525590269_3084765_156-173713-0_sp1.1 from training. Duration: 0.736375 2023-10-02 01:42:41,517 WARNING [train.py:1204] (2/4) Exclude cut with ID _28323_210_16187_1_1532480395667_4100300_531-24157-0_sp1.1 from training. Duration: 0.8909375 2023-10-02 01:42:41,672 WARNING [train.py:1204] (2/4) Exclude cut with ID _29303_210_2575_1_1532392237303_7410649_382-217492-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 01:42:41,712 WARNING [train.py:1204] (2/4) Exclude cut with ID _58895_210_8750_1_1535184081320_3649089_270-158013-0_sp1.1 from training. Duration: 0.92725 2023-10-02 01:42:41,947 WARNING [train.py:1204] (2/4) Exclude cut with ID _29277_210_3279_1_1532581109293_3173069_311-206227-0_sp1.1 from training. Duration: 0.936375 2023-10-02 01:42:41,965 WARNING [train.py:1204] (2/4) Exclude cut with ID _7160_210_1876_1_1527940923231_3703799_282-322611-0_sp0.9 from training. Duration: 0.9666875 2023-10-02 01:42:42,097 WARNING [train.py:1204] (2/4) Exclude cut with ID _53219_210_4525_1_1534834836008_4064825_562-32295-0_sp1.1 from training. Duration: 0.963625 2023-10-02 01:42:42,187 WARNING [train.py:1204] (2/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_408-87330-0_sp1.1 from training. Duration: 0.963625 2023-10-02 01:42:42,194 WARNING [train.py:1204] (2/4) Exclude cut with ID _59202_210_26984_1_1534816810612_3549174_137-318710-0_sp1.1 from training. Duration: 0.8090625 2023-10-02 01:42:42,389 WARNING [train.py:1204] (2/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_372-30302-0 from training. Duration: 0.86 2023-10-02 01:42:42,736 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7543_210_6753_1_1528541907_1399322_178-56269-0_sp1.1 from training. Duration: 0.95275 2023-10-02 01:42:42,759 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9109W0012-549758 from training. Duration: 0.512 2023-10-02 01:42:42,823 WARNING [train.py:1204] (2/4) Exclude cut with ID _5410_210_1388_1_1527397055795_1719020_39-112221-0_sp0.9 from training. Duration: 0.7666875 2023-10-02 01:42:42,962 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9121W0012-554933 from training. Duration: 0.896 2023-10-02 01:42:43,062 WARNING [train.py:1204] (2/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_476-272757-0 from training. Duration: 0.93 2023-10-02 01:42:43,089 WARNING [train.py:1204] (2/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_1085-171718-0_sp1.1 from training. Duration: 0.92725 2023-10-02 01:42:43,157 WARNING [train.py:1204] (2/4) Exclude cut with ID _8480_210_1874_1_1528589391205_6728330_309-139896-0_sp1.1 from training. Duration: 0.736375 2023-10-02 01:42:43,157 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9130W0113-559703 from training. Duration: 0.896 2023-10-02 01:42:43,161 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_292-265928-0_sp1.1 from training. Duration: 0.463625 2023-10-02 01:42:44,024 WARNING [train.py:1204] (2/4) Exclude cut with ID _8772_210_1850_1_1528635595972_3664011_96-12756-0 from training. Duration: 0.98 2023-10-02 01:42:44,092 WARNING [train.py:1204] (2/4) Exclude cut with ID _12556_210_8341_1_1530081024100_7327690_725-193477-0_sp1.1 from training. Duration: 0.8909375 2023-10-02 01:42:44,127 WARNING [train.py:1204] (2/4) Exclude cut with ID _5289_210_2084_1_1527678202736_3779299_142-103317-0_sp1.1 from training. Duration: 0.763625 2023-10-02 01:42:44,322 WARNING [train.py:1204] (2/4) Exclude cut with ID _45012_210_12651_1_1533608435136_5371380_206-51075-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 01:42:44,435 WARNING [train.py:1204] (2/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_176-56120-0_sp1.1 from training. Duration: 0.7545625 2023-10-02 01:42:44,459 WARNING [train.py:1204] (2/4) Exclude cut with ID _40284_210_2084_1_1533513679814_3704279_220-201864-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 01:42:44,479 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9165W0004-576638 from training. Duration: 0.896 2023-10-02 01:42:44,514 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_5438_210_1794_1_1527336687_2600009_218-343336-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 01:42:44,536 WARNING [train.py:1204] (2/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_318-162017-0 from training. Duration: 0.91 2023-10-02 01:42:45,009 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_49544_210_1794_1_1534379770_5262083_483-349298-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 01:42:45,106 WARNING [train.py:1204] (2/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_197-28164-0_sp0.9 from training. Duration: 0.888875 2023-10-02 01:42:45,176 WARNING [train.py:1204] (2/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_643-52007-0 from training. Duration: 0.57 2023-10-02 01:42:45,202 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_38965_210_4852_1_1533174906_3484735_403-332963-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 01:42:45,338 WARNING [train.py:1204] (2/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_340-174176-0 from training. Duration: 0.49 2023-10-02 01:42:45,528 WARNING [train.py:1204] (2/4) Exclude cut with ID _56708_210_13177_1_1535252351282_3422899_363-917-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 01:42:45,663 WARNING [train.py:1204] (2/4) Exclude cut with ID _19896_210_1883_1_1531709022662_6649569_525-159063-0_sp1.1 from training. Duration: 0.936375 2023-10-02 01:42:45,688 WARNING [train.py:1204] (2/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_430-116139-0_sp0.9 from training. Duration: 0.9444375 2023-10-02 01:42:45,825 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_53753_210_7234_1_1534320003_3594148_86-329250-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 01:42:45,834 WARNING [train.py:1204] (2/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_406-275649-0_sp1.1 from training. Duration: 0.3818125 2023-10-02 01:42:45,952 WARNING [train.py:1204] (2/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_1083-147536-0_sp1.1 from training. Duration: 0.8818125 2023-10-02 01:42:46,026 WARNING [train.py:1204] (2/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_54-311741-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 01:42:46,031 WARNING [train.py:1204] (2/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_237-58433-0_sp0.9 from training. Duration: 0.82225 2023-10-02 01:42:46,067 WARNING [train.py:1204] (2/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_98-51676-0_sp1.1 from training. Duration: 0.663625 2023-10-02 01:42:46,128 WARNING [train.py:1204] (2/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_386-270472-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 01:42:46,227 WARNING [train.py:1204] (2/4) Exclude cut with ID _19023_210_4353_1_1531378892552_5820780_83-254794-0_sp1.1 from training. Duration: 0.7818125 2023-10-02 01:42:46,323 WARNING [train.py:1204] (2/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_314-93656-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 01:42:46,341 WARNING [train.py:1204] (2/4) Exclude cut with ID _14170_210_5219_1_1530525949868_4469650_336-213530-0 from training. Duration: 0.9 2023-10-02 01:42:46,463 WARNING [train.py:1204] (2/4) Exclude cut with ID _36686_210_5665_1_1533778225913_3646410_65-221120-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 01:42:46,650 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_26433_210_12108_1_1532157158_4056480_271-68178-0_sp1.1 from training. Duration: 0.92725 2023-10-02 01:42:46,698 WARNING [train.py:1204] (2/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_46-207599-0_sp0.9 from training. Duration: 0.711125 2023-10-02 01:42:46,961 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9303W0114-637133 from training. Duration: 0.896 2023-10-02 01:42:47,007 WARNING [train.py:1204] (2/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_490-207861-0 from training. Duration: 0.98 2023-10-02 01:42:47,180 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6767_210_3273_1_1527855783_1718932_173-256601-0_sp0.9 from training. Duration: 0.888875 2023-10-02 01:42:47,224 WARNING [train.py:1204] (2/4) Exclude cut with ID _13921_210_9994_1_1530841782272_3503520_327-69677-0_sp1.1 from training. Duration: 0.8181875 2023-10-02 01:42:47,289 WARNING [train.py:1204] (2/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_508-335369-0 from training. Duration: 0.48 2023-10-02 01:42:47,311 WARNING [train.py:1204] (2/4) Exclude cut with ID _64631_210_27767_1_1535349580068_4165160_24-46567-0_sp1.1 from training. Duration: 0.963625 2023-10-02 01:42:48,433 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_89-35748-0_sp1.1 from training. Duration: 0.7181875 2023-10-02 01:42:49,105 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_26433_210_12108_1_1532157158_4056480_578-262107-0_sp1.1 from training. Duration: 0.9 2023-10-02 01:42:49,556 WARNING [train.py:1204] (2/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_129-337802-0 from training. Duration: 0.72 2023-10-02 01:42:49,655 WARNING [train.py:1204] (2/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_356-167816-0_sp0.9 from training. Duration: 0.8 2023-10-02 01:42:49,860 WARNING [train.py:1204] (2/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_137-116442-0_sp1.1 from training. Duration: 0.7090625 2023-10-02 01:42:49,903 WARNING [train.py:1204] (2/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_358-172940-0_sp1.1 from training. Duration: 0.963625 2023-10-02 01:42:49,939 WARNING [train.py:1204] (2/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_668-344147-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 01:42:50,101 WARNING [train.py:1204] (2/4) Exclude cut with ID _62337_210_12392_1_1535009273105_3412229_255-157667-0_sp1.1 from training. Duration: 0.936375 2023-10-02 01:42:50,104 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_486-151131-0 from training. Duration: 0.85 2023-10-02 01:42:50,271 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_53638_210_21581_1_1534297319_5808449_885-80172-0 from training. Duration: 0.96 2023-10-02 01:42:50,379 WARNING [train.py:1204] (2/4) Exclude cut with ID _50093_210_4220_1_1534417141207_3432299_105-183067-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 01:42:50,452 WARNING [train.py:1204] (2/4) Exclude cut with ID _7562_210_2084_1_1528887767916_3946229_345-169156-0_sp0.9 from training. Duration: 0.6444375 2023-10-02 01:42:50,613 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_4528_210_3273_1_1527076654_3276777_209-219804-0_sp0.9 from training. Duration: 0.7666875 2023-10-02 01:42:50,648 WARNING [train.py:1204] (2/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_35-153886-0_sp0.9 from training. Duration: 0.9333125 2023-10-02 01:42:50,717 WARNING [train.py:1204] (2/4) Exclude cut with ID _30800_210_22382_1_1532499347904_4515959_332-121925-0_sp1.1 from training. Duration: 0.92725 2023-10-02 01:42:50,719 WARNING [train.py:1204] (2/4) Exclude cut with ID _59202_210_26984_1_1534816810612_3549174_495-65515-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 01:42:50,970 WARNING [train.py:1204] (2/4) Exclude cut with ID _6274_210_2084_1_1527937583837_7494020_769-168302-0_sp1.1 from training. Duration: 0.6454375 2023-10-02 01:42:51,006 WARNING [train.py:1204] (2/4) Exclude cut with ID _24271_210_12658_1_1531987270022_7190519_708-71345-0_sp1.1 from training. Duration: 0.936375 2023-10-02 01:42:51,006 WARNING [train.py:1204] (2/4) Exclude cut with ID 3033-130750-0096-55598-0_sp0.9 from training. Duration: 0.92225 2023-10-02 01:42:51,144 WARNING [train.py:1204] (2/4) Exclude cut with ID _14087_210_2253_1_1530529224512_3014029_219-224445-0_sp0.9 from training. Duration: 0.9333125 2023-10-02 01:42:51,315 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_174-277364-0 from training. Duration: 0.71 2023-10-02 01:42:51,410 WARNING [train.py:1204] (2/4) Exclude cut with ID _45021_210_9994_1_1533619751094_3330288_96-239010-0_sp1.1 from training. Duration: 0.82725 2023-10-02 01:42:51,443 WARNING [train.py:1204] (2/4) Exclude cut with ID _31602_210_5854_1_1532677021456_4458250_489-305116-0_sp1.1 from training. Duration: 0.97275 2023-10-02 01:42:51,462 WARNING [train.py:1204] (2/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_269-154726-0_sp0.9 from training. Duration: 0.9555625 2023-10-02 01:42:51,772 WARNING [train.py:1204] (2/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_126-54418-0_sp1.1 from training. Duration: 0.92725 2023-10-02 01:42:51,811 WARNING [train.py:1204] (2/4) Exclude cut with ID _42326_210_7770_1_1533814282531_3707269_182-224159-0_sp1.1 from training. Duration: 0.92725 2023-10-02 01:42:51,907 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7002_210_1794_1_1527939683_5157286_1-281264-0_sp0.9 from training. Duration: 0.488875 2023-10-02 01:42:51,959 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_86-16881-0 from training. Duration: 0.58 2023-10-02 01:42:51,986 WARNING [train.py:1204] (2/4) Exclude cut with ID _19514_210_9259_1_1531530071330_5084230_637-279094-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 01:42:52,031 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_26433_210_12108_1_1532157158_4056480_578-262107-0 from training. Duration: 0.99 2023-10-02 01:42:52,702 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6291_210_3273_1_1527683649_1078498_82-221196-0_sp1.1 from training. Duration: 0.6909375 2023-10-02 01:42:52,814 WARNING [train.py:1204] (2/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_402-102311-0 from training. Duration: 0.67 2023-10-02 01:42:53,030 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1015-343884-0_sp1.1 from training. Duration: 0.563625 2023-10-02 01:42:53,106 WARNING [train.py:1204] (2/4) Exclude cut with ID _13684_210_7497_1_1530619354863_3598359_312-235606-0_sp1.1 from training. Duration: 0.5454375 2023-10-02 01:42:53,139 WARNING [train.py:1204] (2/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_55-31904-0 from training. Duration: 0.88 2023-10-02 01:42:53,226 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_3133_210_2087_1_1526106446_2324690_25-332138-0 from training. Duration: 0.91 2023-10-02 01:42:53,228 WARNING [train.py:1204] (2/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_912-282412-0_sp0.9 from training. Duration: 0.5444375 2023-10-02 01:42:53,315 WARNING [train.py:1204] (2/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_458-299435-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 01:42:53,409 WARNING [train.py:1204] (2/4) Exclude cut with ID _69584_210_5219_1_1535785133775_4044450_275-88403-0_sp1.1 from training. Duration: 0.92725 2023-10-02 01:42:53,420 WARNING [train.py:1204] (2/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_841-341679-0_sp1.1 from training. Duration: 0.52725 2023-10-02 01:42:53,426 WARNING [train.py:1204] (2/4) Exclude cut with ID _41391_210_7994_1_1533603771872_3638201_1-54521-0_sp1.1 from training. Duration: 0.97275 2023-10-02 01:42:53,490 WARNING [train.py:1204] (2/4) Exclude cut with ID _56235_210_10640_1_1534557335235_4161099_495-186609-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 01:42:53,678 WARNING [train.py:1204] (2/4) Exclude cut with ID _4681_210_3525_1_1527250689464_120889_15-126760-0 from training. Duration: 0.81 2023-10-02 01:42:53,833 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_3019_210_2028_1_1526044245_3264865_127-135615-0_sp1.1 from training. Duration: 0.8545625 2023-10-02 01:42:53,987 WARNING [train.py:1204] (2/4) Exclude cut with ID _30191_210_1664_1_1533189915047_7196670_915-21561-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 01:42:54,020 WARNING [train.py:1204] (2/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_6-214815-0_sp0.9 from training. Duration: 0.9444375 2023-10-02 01:42:54,313 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_45399_210_4852_1_1533624725_7579737_190-137494-0_sp1.1 from training. Duration: 0.97275 2023-10-02 01:42:54,498 WARNING [train.py:1204] (2/4) Exclude cut with ID _41314_210_12651_1_1533459476543_3766460_207-132038-0 from training. Duration: 0.94 2023-10-02 01:42:54,972 WARNING [train.py:1204] (2/4) Exclude cut with ID _14603_210_4220_1_1530615688441_3097319_90-31383-0 from training. Duration: 0.87 2023-10-02 01:42:55,020 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_15517_210_6753_1_1530952293_1664795_119-31001-0 from training. Duration: 0.94 2023-10-02 01:42:55,062 WARNING [train.py:1204] (2/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_684-85342-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 01:42:55,358 WARNING [train.py:1204] (2/4) Exclude cut with ID _14603_210_4220_1_1530615688441_3097319_97-325053-0_sp1.1 from training. Duration: 0.836375 2023-10-02 01:42:55,610 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_68702_210_23689_1_1535799577_3420068_168-25150-0_sp1.1 from training. Duration: 0.97275 2023-10-02 01:42:55,619 WARNING [train.py:1204] (2/4) Exclude cut with ID _65134_210_14763_1_1535335393985_3505368_111-27984-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 01:42:55,677 WARNING [train.py:1204] (2/4) Exclude cut with ID _48678_210_5663_1_1533988830947_3769199_276-226230-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 01:42:55,907 WARNING [train.py:1204] (2/4) Exclude cut with ID _28427_210_4220_1_1532581240967_3061069_43-84928-0_sp1.1 from training. Duration: 0.9 2023-10-02 01:42:55,950 WARNING [train.py:1204] (2/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_158-250116-0_sp1.1 from training. Duration: 0.563625 2023-10-02 01:42:55,972 WARNING [train.py:1204] (2/4) Exclude cut with ID _66873_210_5517_1_1535454136506_4293370_279-332556-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 01:42:56,003 WARNING [train.py:1204] (2/4) Exclude cut with ID _42863_210_9070_1_1533902398788_3696920_488-230146-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 01:42:56,004 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_387-247159-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 01:42:56,103 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6729_210_2550_1_1528032481_1788725_261-42619-0_sp1.1 from training. Duration: 0.97275 2023-10-02 01:42:56,245 WARNING [train.py:1204] (2/4) Exclude cut with ID _19896_210_1883_1_1531709022662_6649569_831-44724-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 01:42:57,089 WARNING [train.py:1204] (2/4) Exclude cut with ID _55153_210_2253_1_1534384786677_3162096_280-285944-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 01:42:57,130 WARNING [train.py:1204] (2/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_448-4048-0 from training. Duration: 0.96 2023-10-02 01:42:57,188 WARNING [train.py:1204] (2/4) Exclude cut with ID _29678_210_7893_1_1532421042787_3906118_456-202548-0_sp1.1 from training. Duration: 0.97275 2023-10-02 01:42:57,207 WARNING [train.py:1204] (2/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_107-332398-0_sp1.1 from training. Duration: 0.7090625 2023-10-02 01:42:57,457 WARNING [train.py:1204] (2/4) Exclude cut with ID _13921_210_9994_1_1530838702800_3003429_46-149444-0_sp1.1 from training. Duration: 0.8545625 2023-10-02 01:42:57,486 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_257-20733-0_sp0.9 from training. Duration: 0.8444375 2023-10-02 01:42:57,558 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_53753_210_7234_1_1534320003_3594148_385-234809-0_sp1.1 from training. Duration: 0.963625 2023-10-02 01:42:57,609 WARNING [train.py:1204] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_292-215346-0_sp1.1 from training. Duration: 0.8818125 2023-10-02 01:42:57,937 WARNING [train.py:1204] (2/4) Exclude cut with ID _68965_210_1883_1_1535698962272_3497490_196-124268-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 01:42:57,984 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_23801_210_16223_1_1532066392_3294657_570-5439-0_sp1.1 from training. Duration: 0.8818125 2023-10-02 01:42:58,409 WARNING [train.py:1204] (2/4) Exclude cut with ID _5166_210_2084_1_1527073197160_4087993_349-6928-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 01:42:58,462 WARNING [train.py:1204] (2/4) Exclude cut with ID _60621_210_17071_1_1535013053530_3671739_226-255202-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 01:42:58,465 WARNING [train.py:1204] (2/4) Exclude cut with ID _41817_210_18835_1_1533434477160_7423049_448-130302-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 01:42:58,617 WARNING [train.py:1204] (2/4) Exclude cut with ID _6653_210_2629_1_1527919172441_6732679_758-143677-0_sp1.1 from training. Duration: 0.8909375 2023-10-02 01:42:58,675 WARNING [train.py:1204] (2/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_912-47473-0_sp0.9 from training. Duration: 0.7555625 2023-10-02 01:42:58,702 WARNING [train.py:1204] (2/4) Exclude cut with ID _6694_210_4599_1_1527852637490_3965529_356-169233-0_sp1.1 from training. Duration: 0.9 2023-10-02 01:42:58,792 WARNING [train.py:1204] (2/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_1-99680-0 from training. Duration: 0.67 2023-10-02 01:42:58,887 WARNING [train.py:1204] (2/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_199-86432-0_sp1.1 from training. Duration: 0.8909375 2023-10-02 01:42:58,910 WARNING [train.py:1204] (2/4) Exclude cut with ID _61148_210_6897_1_1535094022726_4217610_362-182430-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 01:42:59,010 WARNING [train.py:1204] (2/4) Exclude cut with ID _49376_210_5986_1_1533985278732_3370740_301-154763-0 from training. Duration: 0.98 2023-10-02 01:42:59,168 WARNING [train.py:1204] (2/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_572-185150-0_sp1.1 from training. Duration: 0.8818125 2023-10-02 01:42:59,556 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_47896_210_20508_1_1534492518_5958868_558-314572-0_sp1.1 from training. Duration: 0.8818125 2023-10-02 01:42:59,676 WARNING [train.py:1204] (2/4) Exclude cut with ID _45560_210_21982_1_1533812603951_3584999_111-6376-0_sp0.9 from training. Duration: 0.9333125 2023-10-02 01:42:59,853 WARNING [train.py:1204] (2/4) Exclude cut with ID _14170_210_5219_1_1530525949868_4469650_412-317077-0_sp1.1 from training. Duration: 0.6818125 2023-10-02 01:43:00,005 WARNING [train.py:1204] (2/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_718-68505-0 from training. Duration: 0.89 2023-10-02 01:43:00,174 WARNING [train.py:1204] (2/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_77-311404-0 from training. Duration: 0.94 2023-10-02 01:43:00,247 WARNING [train.py:1204] (2/4) Exclude cut with ID _60229_210_27767_1_1534838406158_3655350_264-279573-0 from training. Duration: 0.83 2023-10-02 01:43:00,356 WARNING [train.py:1204] (2/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_106-300985-0_sp1.1 from training. Duration: 0.8181875 2023-10-02 01:43:00,594 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_4183_210_2087_1_1526729100_1869838_79-277189-0_sp1.1 from training. Duration: 0.963625 2023-10-02 01:43:00,609 WARNING [train.py:1204] (2/4) Exclude cut with ID _28427_210_4220_1_1532581240967_3061069_195-295268-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 01:43:01,452 WARNING [train.py:1204] (2/4) Exclude cut with ID _4092_210_2151_1_1526643044449_3135759_356-278694-0_sp0.9 from training. Duration: 0.52225 2023-10-02 01:43:01,719 WARNING [train.py:1204] (2/4) Exclude cut with ID _14459_210_2228_1_1531443641946_7516700_777-71446-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 01:43:02,086 WARNING [train.py:1204] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_520-126729-0 from training. Duration: 0.7 2023-10-02 01:43:02,425 WARNING [train.py:1204] (2/4) Exclude cut with ID _13684_210_7497_1_1530619354863_3598359_312-235606-0_sp0.9 from training. Duration: 0.6666875 2023-10-02 01:43:02,465 WARNING [train.py:1204] (2/4) Exclude cut with ID _37537_210_2253_1_1533171758568_3421949_283-349402-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 01:43:02,494 WARNING [train.py:1204] (2/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_29-117932-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 01:43:02,522 WARNING [train.py:1204] (2/4) Exclude cut with ID _29303_210_2575_1_1532392237303_7410649_721-283351-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 01:43:02,585 WARNING [train.py:1204] (2/4) Exclude cut with ID _14886_210_4220_1_1530702020355_3516190_175-229073-0_sp0.9 from training. Duration: 0.5 2023-10-02 01:43:02,702 WARNING [train.py:1204] (2/4) Exclude cut with ID _15458_210_8341_1_1530858614263_3834410_70-268830-0_sp1.1 from training. Duration: 0.97275 2023-10-02 01:43:02,727 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_2295_210_2087_1_1525500993_2407863_234-83104-0 from training. Duration: 0.62 2023-10-02 01:43:02,728 WARNING [train.py:1204] (2/4) Exclude cut with ID _22961_210_8254_1_1531976366672_4278180_317-199546-0_sp1.1 from training. Duration: 0.7818125 2023-10-02 01:43:02,849 WARNING [train.py:1204] (2/4) Exclude cut with ID _27894_210_12392_1_1532415496078_6902439_547-196022-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 01:43:02,975 WARNING [train.py:1204] (2/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_46-207599-0 from training. Duration: 0.64 2023-10-02 01:43:03,092 WARNING [train.py:1204] (2/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_426-326246-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 01:43:03,316 WARNING [train.py:1204] (2/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_445-117398-0_sp0.9 from training. Duration: 0.988875 2023-10-02 01:43:03,375 WARNING [train.py:1204] (2/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_563-347832-0 from training. Duration: 0.89 2023-10-02 01:43:03,454 WARNING [train.py:1204] (2/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_252-216674-0 from training. Duration: 0.54 2023-10-02 01:43:03,543 WARNING [train.py:1204] (2/4) Exclude cut with ID _59611_210_8895_1_1534741198240_3650361_187-212779-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 01:43:03,605 WARNING [train.py:1204] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_282-159367-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 01:43:03,766 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_68702_210_23689_1_1535799577_3420068_33-136883-0_sp1.1 from training. Duration: 0.936375 2023-10-02 01:43:03,783 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_47228_210_4983_1_1533780021_3672027_184-293706-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 01:43:03,813 WARNING [train.py:1204] (2/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_656-188361-0_sp0.9 from training. Duration: 0.9666875 2023-10-02 01:43:03,935 WARNING [train.py:1204] (2/4) Exclude cut with ID _4866_210_2577_1_1527404412284_4364659_352-157930-0_sp0.9 from training. Duration: 0.9 2023-10-02 01:43:03,936 WARNING [train.py:1204] (2/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_210-106047-0_sp1.1 from training. Duration: 0.8818125 2023-10-02 01:43:04,092 WARNING [train.py:1204] (2/4) Exclude cut with ID _9389_210_1664_1_1529217337301_5289269_289-213417-0_sp1.1 from training. Duration: 0.8090625 2023-10-02 01:43:04,121 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_3910_210_1794_1_1526777667_3747260_178-41186-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 01:43:04,132 WARNING [train.py:1204] (2/4) Exclude cut with ID _27052_210_5986_1_1532329197187_3585009_24-172263-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 01:43:04,136 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_5522_210_3273_1_1527249606_3909096_318-203153-0_sp1.1 from training. Duration: 0.3909375 2023-10-02 01:43:04,462 WARNING [train.py:1204] (2/4) Exclude cut with ID _27894_210_12392_1_1532415496078_6902439_222-51982-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 01:43:04,576 WARNING [train.py:1204] (2/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_87-131427-0 from training. Duration: 0.78 2023-10-02 01:43:04,731 WARNING [train.py:1204] (2/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_372-298093-0_sp0.9 from training. Duration: 0.888875 2023-10-02 01:43:04,789 WARNING [train.py:1204] (2/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_663-166973-0 from training. Duration: 0.8 2023-10-02 01:43:04,815 WARNING [train.py:1204] (2/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_429-177469-0_sp1.1 from training. Duration: 0.936375 2023-10-02 01:43:04,851 WARNING [train.py:1204] (2/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_724-172651-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 01:43:05,489 WARNING [train.py:1204] (2/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_103-336064-0 from training. Duration: 0.51 2023-10-02 01:43:06,300 WARNING [train.py:1204] (2/4) Exclude cut with ID _48677_210_5663_1_1533902405919_3892989_111-11439-0 from training. Duration: 0.96 2023-10-02 01:43:06,508 WARNING [train.py:1204] (2/4) Exclude cut with ID _8085_210_4557_1_1528455633493_3716100_95-103398-0_sp1.1 from training. Duration: 0.963625 2023-10-02 01:43:06,575 WARNING [train.py:1204] (2/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_1041-110837-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 01:43:06,698 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_50050_210_16223_1_1535435796_7290138_1155-121757-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 01:43:06,700 WARNING [train.py:1204] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_602-275260-0_sp0.9 from training. Duration: 0.911125 2023-10-02 01:43:06,755 WARNING [train.py:1204] (2/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_265-164486-0 from training. Duration: 0.96 2023-10-02 01:43:06,920 WARNING [train.py:1204] (2/4) Exclude cut with ID _46067_210_4220_1_1534125571066_3307450_142-229375-0_sp1.1 from training. Duration: 0.963625 2023-10-02 01:43:06,942 WARNING [train.py:1204] (2/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_310-26186-0_sp1.1 from training. Duration: 0.636375 2023-10-02 01:43:07,056 WARNING [train.py:1204] (2/4) Exclude cut with ID _28461_210_2253_1_1532771775189_3041299_136-270644-0 from training. Duration: 0.93 2023-10-02 01:43:07,188 WARNING [train.py:1204] (2/4) Exclude cut with ID _14860_210_4915_1_1530702020340_7343240_438-286372-0_sp1.1 from training. Duration: 0.936375 2023-10-02 01:43:07,344 WARNING [train.py:1204] (2/4) Exclude cut with ID IC0128W0004-63309 from training. Duration: 0.831 2023-10-02 01:43:07,549 WARNING [train.py:1204] (2/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_135-174080-0 from training. Duration: 0.53 2023-10-02 01:43:07,595 WARNING [train.py:1204] (2/4) Exclude cut with ID _41326_210_5854_1_1533778076509_3492080_277-59511-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 01:43:07,693 WARNING [train.py:1204] (2/4) Exclude cut with ID IC0144W0112-71379 from training. Duration: 0.992 2023-10-02 01:43:07,761 WARNING [train.py:1204] (2/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_711-15688-0_sp0.9 from training. Duration: 0.47775 2023-10-02 01:43:07,872 WARNING [train.py:1204] (2/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_237-332027-0 from training. Duration: 0.95 2023-10-02 01:43:07,926 WARNING [train.py:1204] (2/4) Exclude cut with ID _7970_210_1883_1_1528455616296_3472230_25-178407-0 from training. Duration: 0.71 2023-10-02 01:43:07,927 WARNING [train.py:1204] (2/4) Exclude cut with ID _8480_210_1874_1_1528589391205_6728330_309-139896-0 from training. Duration: 0.81 2023-10-02 01:43:07,943 WARNING [train.py:1204] (2/4) Exclude cut with ID _39571_210_13449_1_1533801584143_3651077_319-268877-0_sp1.1 from training. Duration: 0.936375 2023-10-02 01:43:07,947 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_43802_210_2290_1_1533516203_8829504_155-246880-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 01:43:08,040 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_566-122599-0_sp1.1 from training. Duration: 0.936375 2023-10-02 01:43:08,177 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_12868_210_6753_1_1530701932_530275_18-76322-0 from training. Duration: 0.87 2023-10-02 01:43:08,338 WARNING [train.py:1204] (2/4) Exclude cut with ID _56494_210_4220_1_1535284837625_3033169_159-106035-0_sp1.1 from training. Duration: 0.963625 2023-10-02 01:43:08,377 WARNING [train.py:1204] (2/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_98-276013-0_sp1.1 from training. Duration: 0.963625 2023-10-02 01:43:08,436 WARNING [train.py:1204] (2/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_330-216124-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 01:43:08,569 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_177-58005-0_sp0.9 from training. Duration: 0.688875 2023-10-02 01:43:08,829 WARNING [train.py:1204] (2/4) Exclude cut with ID _56235_210_10640_1_1534557335235_4161099_343-139898-0 from training. Duration: 0.96 2023-10-02 01:43:08,897 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6961_210_2087_1_1527937168_6815011_395-185060-0_sp1.1 from training. Duration: 0.863625 2023-10-02 01:43:09,065 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7002_210_1794_1_1527939683_5157286_184-229630-0_sp0.9 from training. Duration: 0.82225 2023-10-02 01:43:09,112 WARNING [train.py:1204] (2/4) Exclude cut with ID _59170_210_6721_1_1534983633960_4203335_153-18895-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 01:43:09,783 WARNING [train.py:1204] (2/4) Exclude cut with ID _49643_210_7989_1_1534640244224_3939031_598-161926-0 from training. Duration: 0.9 2023-10-02 01:43:09,983 WARNING [train.py:1204] (2/4) Exclude cut with ID _4068_210_2629_1_1526697350395_1546259_224-288693-0 from training. Duration: 0.59 2023-10-02 01:43:10,044 WARNING [train.py:1204] (2/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_718-68505-0_sp1.1 from training. Duration: 0.8090625 2023-10-02 01:43:10,096 WARNING [train.py:1204] (2/4) Exclude cut with ID _4068_210_2629_1_1526697350395_1546259_224-288693-0_sp0.9 from training. Duration: 0.6555625 2023-10-02 01:43:10,107 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_2296_210_2087_1_1525503448_3397335_57-185430-0_sp0.9 from training. Duration: 0.6666875 2023-10-02 01:43:10,166 WARNING [train.py:1204] (2/4) Exclude cut with ID _15458_210_8341_1_1530858614263_3834410_49-235472-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 01:43:10,168 WARNING [train.py:1204] (2/4) Exclude cut with ID _64135_210_2253_1_1535677267876_7608972_498-5039-0_sp0.9 from training. Duration: 0.8666875 2023-10-02 01:43:10,280 WARNING [train.py:1204] (2/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_283-220372-0_sp0.9 from training. Duration: 0.62225 2023-10-02 01:43:10,288 WARNING [train.py:1204] (2/4) Exclude cut with ID _6473_210_1741_1_1527901307396_3815380_108-17335-0_sp0.9 from training. Duration: 0.72225 2023-10-02 01:43:10,368 WARNING [train.py:1204] (2/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_113-92608-0 from training. Duration: 0.54 2023-10-02 01:43:10,416 WARNING [train.py:1204] (2/4) Exclude cut with ID _14170_210_5219_1_1530525949868_4469650_272-249985-0_sp1.1 from training. Duration: 0.6090625 2023-10-02 01:43:10,424 WARNING [train.py:1204] (2/4) Exclude cut with ID _50376_210_13401_1_1534031502101_4217740_518-187390-0_sp1.1 from training. Duration: 0.97275 2023-10-02 01:43:10,553 WARNING [train.py:1204] (2/4) Exclude cut with ID _59738_210_15379_1_1534849512562_3941470_165-248260-0_sp1.1 from training. Duration: 0.92725 2023-10-02 01:43:10,568 WARNING [train.py:1204] (2/4) Exclude cut with ID _23818_210_6228_1_1531917063884_3676920_400-343925-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 01:43:10,627 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6961_210_2087_1_1527937168_6815011_562-165518-0 from training. Duration: 0.77 2023-10-02 01:43:10,689 WARNING [train.py:1204] (2/4) Exclude cut with ID _24271_210_12658_1_1531987270022_7190519_289-207931-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 01:43:10,832 WARNING [train.py:1204] (2/4) Exclude cut with ID _14632_210_9988_1_1530691279392_3923200_332-105904-0 from training. Duration: 0.9 2023-10-02 01:43:10,845 WARNING [train.py:1204] (2/4) Exclude cut with ID _40402_210_18219_1_1533522324115_2953150_203-232438-0_sp1.1 from training. Duration: 0.97275 2023-10-02 01:43:10,941 WARNING [train.py:1204] (2/4) Exclude cut with ID _13252_210_1925_1_1530671372047_3336909_263-168388-0 from training. Duration: 0.86 2023-10-02 01:43:11,036 WARNING [train.py:1204] (2/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_174-205058-0 from training. Duration: 0.95 2023-10-02 01:43:11,162 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_94-195801-0_sp1.1 from training. Duration: 0.8545625 2023-10-02 01:43:11,188 WARNING [train.py:1204] (2/4) Exclude cut with ID _55153_210_2253_1_1534384786677_3162096_269-103204-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 01:43:11,270 WARNING [train.py:1204] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_748-36150-0 from training. Duration: 0.91 2023-10-02 01:43:11,352 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_12868_210_6753_1_1530702479_3610564_148-172861-0_sp1.1 from training. Duration: 0.4545625 2023-10-02 01:43:11,401 WARNING [train.py:1204] (2/4) Exclude cut with ID _67499_210_17282_1_1535509805220_3692430_460-77730-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 01:43:11,610 WARNING [train.py:1204] (2/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_374-20753-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 01:43:11,735 WARNING [train.py:1204] (2/4) Exclude cut with ID _45329_210_7005_1_1534157951211_6774339_374-289983-0_sp1.1 from training. Duration: 0.97275 2023-10-02 01:43:11,768 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_34886_210_10633_1_1533203882_1755661_76-156096-0_sp0.9 from training. Duration: 0.9666875 2023-10-02 01:43:12,161 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_238-256668-0_sp0.9 from training. Duration: 0.7666875 2023-10-02 01:43:12,209 WARNING [train.py:1204] (2/4) Exclude cut with ID _46683_210_9642_1_1533730913492_4591116_489-146727-0_sp1.1 from training. Duration: 0.97275 2023-10-02 01:43:12,402 WARNING [train.py:1204] (2/4) Exclude cut with ID _13938_210_9988_1_1530518718978_3693960_304-112784-0_sp0.9 from training. Duration: 0.511125 2023-10-02 01:43:12,783 WARNING [train.py:1204] (2/4) Exclude cut with ID _24271_210_12658_1_1531987270022_7190519_243-46549-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 01:43:12,787 WARNING [train.py:1204] (2/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_78-182912-0_sp1.1 from training. Duration: 0.536375 2023-10-02 01:43:13,077 WARNING [train.py:1204] (2/4) Exclude cut with ID _57572_210_5517_1_1534921219624_3809484_121-321334-0_sp1.1 from training. Duration: 0.97275 2023-10-02 01:43:13,188 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_40047_210_2290_1_1533290519_2374311_254-102149-0_sp1.1 from training. Duration: 0.963625 2023-10-02 01:43:13,189 WARNING [train.py:1204] (2/4) Exclude cut with ID _28461_210_2253_1_1532771775189_3041299_96-152158-0 from training. Duration: 0.85 2023-10-02 01:43:13,395 WARNING [train.py:1204] (2/4) Exclude cut with ID _29877_210_6228_1_1532397791106_5276149_161-102979-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 01:43:14,317 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_3787_210_2040_1_1526477544_5478586_603-120324-0_sp0.9 from training. Duration: 0.9 2023-10-02 01:43:14,555 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_41750_210_1960_1_1533520678_3948042_247-213456-0_sp1.1 from training. Duration: 0.97275 2023-10-02 01:43:15,236 WARNING [train.py:1204] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_127-119109-0_sp0.9 from training. Duration: 0.8 2023-10-02 01:43:15,940 WARNING [train.py:1204] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_215-202973-0 from training. Duration: 0.88 2023-10-02 01:43:16,020 WARNING [train.py:1204] (2/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_17-320491-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 01:43:16,225 WARNING [train.py:1204] (2/4) Exclude cut with ID _5166_210_2084_1_1527073197160_4087993_310-114452-0 from training. Duration: 0.59 2023-10-02 01:43:16,298 WARNING [train.py:1204] (2/4) Exclude cut with ID _15133_210_6228_1_1530794512074_3080379_29-264458-0_sp0.9 from training. Duration: 0.6444375 2023-10-02 01:43:16,551 WARNING [train.py:1204] (2/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_159-281310-0_sp1.1 from training. Duration: 0.863625 2023-10-02 01:43:16,560 WARNING [train.py:1204] (2/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_195-167011-0_sp0.9 from training. Duration: 0.8333125 2023-10-02 01:43:16,655 WARNING [train.py:1204] (2/4) Exclude cut with ID _14087_210_2253_1_1530529224512_3014029_156-257607-0_sp1.1 from training. Duration: 0.7545625 2023-10-02 01:43:16,955 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_2295_210_2087_1_1525500993_2407863_116-238894-0 from training. Duration: 0.7 2023-10-02 01:43:17,092 WARNING [train.py:1204] (2/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_321-335475-0_sp0.9 from training. Duration: 0.5 2023-10-02 01:43:17,266 WARNING [train.py:1204] (2/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_188-114464-0 from training. Duration: 0.82 2023-10-02 01:43:17,517 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_2592_210_2290_1_1525697025_4882925_157-70512-0 from training. Duration: 0.91 2023-10-02 01:43:17,644 WARNING [train.py:1204] (2/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_138-64210-0_sp1.1 from training. Duration: 0.663625 2023-10-02 01:43:18,725 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_11305_210_1794_1_1529750428_7178148_651-95702-0_sp1.1 from training. Duration: 0.963625 2023-10-02 01:43:18,745 WARNING [train.py:1204] (2/4) Exclude cut with ID _14224_210_4381_1_1530529062037_4264010_379-184223-0_sp0.9 from training. Duration: 0.9444375 2023-10-02 01:43:18,769 WARNING [train.py:1204] (2/4) Exclude cut with ID _8394_210_6702_1_1528590504316_3540189_381-125325-0_sp1.1 from training. Duration: 0.963625 2023-10-02 01:43:18,842 WARNING [train.py:1204] (2/4) Exclude cut with ID IC0622W0021-307659 from training. Duration: 0.896 2023-10-02 01:43:18,844 WARNING [train.py:1204] (2/4) Exclude cut with ID _7624_210_1531_1_1528109802198_4933101_418-349834-0_sp1.1 from training. Duration: 0.5454375 2023-10-02 01:43:19,147 WARNING [train.py:1204] (2/4) Exclude cut with ID _49728_210_12651_1_1534041034294_6788150_607-3416-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 01:43:19,277 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8275_210_4198_1_1528455320_3892571_491-17707-0 from training. Duration: 0.64 2023-10-02 01:43:19,321 WARNING [train.py:1204] (2/4) Exclude cut with ID _5287_210_2084_1_1527418883040_3878670_333-48508-0 from training. Duration: 0.6 2023-10-02 01:43:19,394 WARNING [train.py:1204] (2/4) Exclude cut with ID _8365_210_2064_1_1528592311070_4677690_371-122295-0 from training. Duration: 0.55 2023-10-02 01:43:19,507 WARNING [train.py:1204] (2/4) Exclude cut with ID _14860_210_4915_1_1530702020340_7343240_836-328489-0 from training. Duration: 0.9 2023-10-02 01:43:19,647 WARNING [train.py:1204] (2/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_1033-274376-0_sp0.9 from training. Duration: 0.9666875 2023-10-02 01:43:19,833 WARNING [train.py:1204] (2/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_475-127455-0_sp0.9 from training. Duration: 0.77775 2023-10-02 01:43:19,839 WARNING [train.py:1204] (2/4) Exclude cut with ID IC0671W0071-331914 from training. Duration: 0.896 2023-10-02 01:43:19,869 WARNING [train.py:1204] (2/4) Exclude cut with ID _30327_210_16049_1_1532484161295_3924169_2-1523-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 01:43:19,872 WARNING [train.py:1204] (2/4) Exclude cut with ID _24314_210_4915_1_1532135078116_7170179_460-208279-0_sp1.1 from training. Duration: 0.97275 2023-10-02 01:43:19,984 WARNING [train.py:1204] (2/4) Exclude cut with ID IC0679W0052-335859 from training. Duration: 0.64 2023-10-02 01:43:20,335 WARNING [train.py:1204] (2/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_3-237434-0_sp1.1 from training. Duration: 0.6909375 2023-10-02 01:43:20,559 WARNING [train.py:1204] (2/4) Exclude cut with ID _69884_210_9771_1_1535846392818_4166169_405-185283-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 01:43:21,109 WARNING [train.py:1204] (2/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_927-83684-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 01:43:21,117 WARNING [train.py:1204] (2/4) Exclude cut with ID _6694_210_4599_1_1527852637490_3965529_356-169233-0 from training. Duration: 0.99 2023-10-02 01:43:21,197 WARNING [train.py:1204] (2/4) Exclude cut with ID _19896_210_1883_1_1531709022662_6649569_562-233001-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 01:43:21,240 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_20120_210_6753_1_1531643035_5482859_354-78629-0_sp1.1 from training. Duration: 0.963625 2023-10-02 01:43:21,241 WARNING [train.py:1204] (2/4) Exclude cut with ID _60135_210_13427_1_1534935557922_3651319_96-168198-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 01:43:21,347 WARNING [train.py:1204] (2/4) Exclude cut with ID _31602_210_5854_1_1532677021456_4458250_249-96604-0_sp1.1 from training. Duration: 0.8090625 2023-10-02 01:43:21,379 WARNING [train.py:1204] (2/4) Exclude cut with ID _14100_210_8341_1_1530529320613_3678069_258-224599-0_sp0.9 from training. Duration: 0.8555625 2023-10-02 01:43:21,549 WARNING [train.py:1204] (2/4) Exclude cut with ID _19708_210_9206_1_1532136650733_4238364_215-160135-0_sp1.1 from training. Duration: 0.97275 2023-10-02 01:43:21,583 WARNING [train.py:1204] (2/4) Exclude cut with ID _21878_210_3525_1_1531738772002_7264330_113-18163-0_sp0.9 from training. Duration: 0.988875 2023-10-02 01:43:21,633 WARNING [train.py:1204] (2/4) Exclude cut with ID _64135_210_2253_1_1535677267876_7608972_552-337340-0_sp1.1 from training. Duration: 0.92725 2023-10-02 01:43:21,662 WARNING [train.py:1204] (2/4) Exclude cut with ID _67725_210_27767_1_1535608756671_5105359_10-12887-0_sp1.1 from training. Duration: 0.97275 2023-10-02 01:43:21,668 WARNING [train.py:1204] (2/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_401-284840-0_sp1.1 from training. Duration: 0.97275 2023-10-02 01:43:21,688 WARNING [train.py:1204] (2/4) Exclude cut with ID _44659_210_2721_1_1534036120061_6614860_483-141285-0_sp1.1 from training. Duration: 0.936375 2023-10-02 01:43:21,944 WARNING [train.py:1204] (2/4) Exclude cut with ID _9461_210_4557_1_1529060514726_3677451_358-332734-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 01:43:22,042 WARNING [train.py:1204] (2/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_788-298907-0_sp1.1 from training. Duration: 0.8090625 2023-10-02 01:43:22,807 WARNING [train.py:1204] (2/4) Exclude cut with ID _39411_210_5986_1_1533538825030_3343649_354-267196-0_sp1.1 from training. Duration: 0.92725 2023-10-02 01:43:22,841 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_14953_210_2151_1_1530701710_5616165_192-309792-0_sp1.1 from training. Duration: 0.97275 2023-10-02 01:43:22,966 WARNING [train.py:1204] (2/4) Exclude cut with ID _59170_210_6721_1_1534983633960_4203335_290-151255-0_sp1.1 from training. Duration: 0.92725 2023-10-02 01:43:22,991 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_5575_210_1410_1_1527301305_2570170_249-93769-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 01:43:23,010 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_46013_210_7234_1_1533776362_3744032_463-52378-0_sp1.1 from training. Duration: 0.7818125 2023-10-02 01:43:23,387 WARNING [train.py:1204] (2/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_580-93798-0 from training. Duration: 0.56 2023-10-02 01:43:23,450 WARNING [train.py:1204] (2/4) Exclude cut with ID _19514_210_9259_1_1531530071330_5084230_385-13762-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 01:43:23,454 WARNING [train.py:1204] (2/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_458-67321-0_sp1.1 from training. Duration: 0.8818125 2023-10-02 01:43:23,587 WARNING [train.py:1204] (2/4) Exclude cut with ID _41298_210_9019_1_1533430371450_4822143_188-154791-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 01:43:23,647 WARNING [train.py:1204] (2/4) Exclude cut with ID _15037_210_2253_1_1530752294104_3267650_296-192597-0_sp0.9 from training. Duration: 0.9 2023-10-02 01:43:24,095 WARNING [train.py:1204] (2/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_661-264857-0_sp1.1 from training. Duration: 0.8454375 2023-10-02 01:43:24,613 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_34008_210_10431_1_1533345424_8183247_662-317331-0_sp1.1 from training. Duration: 0.8545625 2023-10-02 01:43:24,634 WARNING [train.py:1204] (2/4) Exclude cut with ID _46502_210_13794_1_1533724452198_3657159_241-278329-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 01:43:24,634 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_11349_210_6753_1_1529800861_1101207_26-65602-0 from training. Duration: 0.96 2023-10-02 01:43:24,645 WARNING [train.py:1204] (2/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_448-4048-0_sp1.1 from training. Duration: 0.87275 2023-10-02 01:43:24,651 WARNING [train.py:1204] (2/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_662-275621-0_sp1.1 from training. Duration: 0.3818125 2023-10-02 01:43:24,703 WARNING [train.py:1204] (2/4) Exclude cut with ID _13938_210_9988_1_1530518718978_3693960_256-54454-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 01:43:24,893 WARNING [train.py:1204] (2/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_256-32662-0 from training. Duration: 0.9 2023-10-02 01:43:24,952 WARNING [train.py:1204] (2/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_326-17061-0 from training. Duration: 0.94 2023-10-02 01:43:24,990 WARNING [train.py:1204] (2/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_505-97987-0_sp1.1 from training. Duration: 0.7818125 2023-10-02 01:43:25,097 WARNING [train.py:1204] (2/4) Exclude cut with ID _66873_210_5517_1_1535454136506_4293370_316-294364-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 01:43:25,163 WARNING [train.py:1204] (2/4) Exclude cut with ID _19514_210_9259_1_1531530071330_5084230_762-231063-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 01:43:25,214 WARNING [train.py:1204] (2/4) Exclude cut with ID _4697_210_2069_1_1526815801953_3823098_68-219807-0_sp0.9 from training. Duration: 0.52225 2023-10-02 01:43:25,258 WARNING [train.py:1204] (2/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_479-158053-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 01:43:25,521 WARNING [train.py:1204] (2/4) Exclude cut with ID _66112_210_10635_1_1535590236305_3801484_83-38181-0_sp1.1 from training. Duration: 0.936375 2023-10-02 01:43:25,542 WARNING [train.py:1204] (2/4) Exclude cut with ID _55857_210_2346_1_1534644007831_6898209_255-146346-0_sp1.1 from training. Duration: 0.963625 2023-10-02 01:43:25,687 WARNING [train.py:1204] (2/4) Exclude cut with ID _48836_210_12210_1_1534075848293_3904150_429-35475-0 from training. Duration: 0.89 2023-10-02 01:43:25,697 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_50154_210_13731_1_1534750862_1080961_119-187163-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 01:43:25,892 WARNING [train.py:1204] (2/4) Exclude cut with ID _8793_210_6310_1_1528542985139_2310009_121-179782-0_sp0.9 from training. Duration: 0.888875 2023-10-02 01:43:25,895 WARNING [train.py:1204] (2/4) Exclude cut with ID _14884_210_2252_1_1530707459689_5561250_591-173406-0 from training. Duration: 0.96 2023-10-02 01:43:26,123 WARNING [train.py:1204] (2/4) Exclude cut with ID _15133_210_6228_1_1530794512074_3080379_54-198193-0_sp1.1 from training. Duration: 0.8545625 2023-10-02 01:43:26,131 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6545_210_2040_1_1527774670_4245258_407-48070-0 from training. Duration: 0.76 2023-10-02 01:43:26,278 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1032-236863-0 from training. Duration: 0.84 2023-10-02 01:43:26,353 WARNING [train.py:1204] (2/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_507-303370-0 from training. Duration: 0.96 2023-10-02 01:43:26,374 WARNING [train.py:1204] (2/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_164-282366-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 01:43:27,297 WARNING [train.py:1204] (2/4) Exclude cut with ID _39867_210_9826_1_1533553222030_9344259_312-158727-0_sp1.1 from training. Duration: 0.963625 2023-10-02 01:43:27,392 WARNING [train.py:1204] (2/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_174-205058-0_sp1.1 from training. Duration: 0.863625 2023-10-02 01:43:27,426 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_11305_210_1794_1_1529750428_7178148_166-237358-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 01:43:27,571 WARNING [train.py:1204] (2/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_26-175553-0_sp1.1 from training. Duration: 0.5545625 2023-10-02 01:43:27,603 WARNING [train.py:1204] (2/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_96-290039-0_sp1.1 from training. Duration: 0.5090625 2023-10-02 01:43:27,772 WARNING [train.py:1204] (2/4) Exclude cut with ID _53219_210_4525_1_1534834836008_4064825_261-101691-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 01:43:27,894 WARNING [train.py:1204] (2/4) Exclude cut with ID _24271_210_12658_1_1531987270022_7190519_96-293390-0_sp1.1 from training. Duration: 0.936375 2023-10-02 01:43:28,265 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_271-234962-0 from training. Duration: 0.79 2023-10-02 01:43:28,348 WARNING [train.py:1204] (2/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_136-30660-0_sp1.1 from training. Duration: 0.97275 2023-10-02 01:43:28,362 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_20120_210_6753_1_1531643035_5482859_625-281046-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 01:43:28,466 WARNING [train.py:1204] (2/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_333-168193-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 01:43:28,524 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9029W0269-509664 from training. Duration: 0.896 2023-10-02 01:43:28,604 WARNING [train.py:1204] (2/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_450-147393-0_sp1.1 from training. Duration: 0.92725 2023-10-02 01:43:28,666 WARNING [train.py:1204] (2/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_643-52007-0_sp1.1 from training. Duration: 0.5181875 2023-10-02 01:43:28,754 WARNING [train.py:1204] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_330-327861-0_sp0.9 from training. Duration: 0.7444375 2023-10-02 01:43:28,770 WARNING [train.py:1204] (2/4) Exclude cut with ID _47790_210_16187_1_1533862814066_4207519_260-317651-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 01:43:28,791 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9044W0010-516654 from training. Duration: 0.896 2023-10-02 01:43:29,118 WARNING [train.py:1204] (2/4) Exclude cut with ID _14087_210_2253_1_1530529224512_3014029_265-118378-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 01:43:29,123 WARNING [train.py:1204] (2/4) Exclude cut with ID _6194_210_1534_1_1527511826160_3941194_321-148641-0_sp0.9 from training. Duration: 0.711125 2023-10-02 01:43:29,272 WARNING [train.py:1204] (2/4) Exclude cut with ID _54871_210_22356_1_1534665798253_3856359_101-169176-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 01:43:29,275 WARNING [train.py:1204] (2/4) Exclude cut with ID 497-129325-0061-62254-0_sp1.1 from training. Duration: 0.97725 2023-10-02 01:43:29,391 WARNING [train.py:1204] (2/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_795-33252-0_sp1.1 from training. Duration: 0.336375 2023-10-02 01:43:29,411 WARNING [train.py:1204] (2/4) Exclude cut with ID _28317_210_12742_1_1532480302381_8510325_168-246176-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 01:43:29,504 WARNING [train.py:1204] (2/4) Exclude cut with ID _7160_210_1876_1_1527940923231_3703799_120-150943-0_sp1.1 from training. Duration: 0.736375 2023-10-02 01:43:29,644 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9084W0005-536844 from training. Duration: 0.768 2023-10-02 01:43:29,779 WARNING [train.py:1204] (2/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_234-260682-0 from training. Duration: 0.84 2023-10-02 01:43:29,827 WARNING [train.py:1204] (2/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_188-114464-0_sp0.9 from training. Duration: 0.911125 2023-10-02 01:43:30,000 WARNING [train.py:1204] (2/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_81-75849-0 from training. Duration: 0.39 2023-10-02 01:43:30,171 WARNING [train.py:1204] (2/4) Exclude cut with ID _14224_210_4381_1_1530529062037_4264010_379-184223-0_sp1.1 from training. Duration: 0.77275 2023-10-02 01:43:30,365 WARNING [train.py:1204] (2/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_454-123002-0_sp1.1 from training. Duration: 0.62725 2023-10-02 01:43:30,397 WARNING [train.py:1204] (2/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_887-249555-0_sp1.1 from training. Duration: 0.7 2023-10-02 01:43:30,631 WARNING [train.py:1204] (2/4) Exclude cut with ID _31088_210_16446_1_1532520012284_3440919_20-260902-0_sp1.1 from training. Duration: 0.97275 2023-10-02 01:43:30,683 WARNING [train.py:1204] (2/4) Exclude cut with ID _45329_210_7005_1_1534157951211_6774339_856-344574-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 01:43:30,686 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_36756_210_7234_1_1533026991_966276_4-164118-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 01:43:30,700 WARNING [train.py:1204] (2/4) Exclude cut with ID _8365_210_2064_1_1528592311070_4677690_371-122295-0_sp1.1 from training. Duration: 0.5 2023-10-02 01:43:30,862 WARNING [train.py:1204] (2/4) Exclude cut with ID _5910_210_1389_1_1527337972454_5050100_555-159815-0_sp1.1 from training. Duration: 0.8909375 2023-10-02 01:43:30,886 WARNING [train.py:1204] (2/4) Exclude cut with ID _37086_210_9038_1_1532999540891_7021250_469-147941-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 01:43:30,960 WARNING [train.py:1204] (2/4) Exclude cut with ID _3067_210_2667_1_1526212974640_4503168_459-173867-0_sp0.9 from training. Duration: 0.97775 2023-10-02 01:43:31,692 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9156W0006-572499 from training. Duration: 0.896 2023-10-02 01:43:31,778 WARNING [train.py:1204] (2/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_277-210798-0 from training. Duration: 0.55 2023-10-02 01:43:31,949 WARNING [train.py:1204] (2/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_103-336064-0_sp1.1 from training. Duration: 0.463625 2023-10-02 01:43:31,997 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_3414_210_1988_1_1526212718_4813195_331-290945-0_sp1.1 from training. Duration: 0.936375 2023-10-02 01:43:31,998 WARNING [train.py:1204] (2/4) Exclude cut with ID _67499_210_17282_1_1535509805220_3692430_365-29276-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 01:43:32,155 WARNING [train.py:1204] (2/4) Exclude cut with ID _14821_210_8750_1_1530680503991_6829359_69-250847-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 01:43:32,157 WARNING [train.py:1204] (2/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_1102-61301-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 01:43:32,262 WARNING [train.py:1204] (2/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_166-246557-0_sp1.1 from training. Duration: 0.5818125 2023-10-02 01:43:32,315 WARNING [train.py:1204] (2/4) Exclude cut with ID _15351_210_10626_1_1530856635653_3576210_214-129918-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 01:43:32,336 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_120-41668-0_sp0.9 from training. Duration: 0.77775 2023-10-02 01:43:32,423 WARNING [train.py:1204] (2/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_27-72278-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 01:43:32,551 WARNING [train.py:1204] (2/4) Exclude cut with ID _24314_210_4915_1_1532135078116_7170179_521-258868-0_sp1.1 from training. Duration: 0.7181875 2023-10-02 01:43:32,756 WARNING [train.py:1204] (2/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_143-86344-0 from training. Duration: 0.77 2023-10-02 01:43:32,786 WARNING [train.py:1204] (2/4) Exclude cut with ID _53489_210_6228_1_1534298357169_3835970_465-157433-0_sp1.1 from training. Duration: 0.92725 2023-10-02 01:43:32,871 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_248-319353-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 01:43:33,033 WARNING [train.py:1204] (2/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_102-77064-0_sp1.1 from training. Duration: 0.8454375 2023-10-02 01:43:33,070 WARNING [train.py:1204] (2/4) Exclude cut with ID _9389_210_1664_1_1529217337301_5289269_379-128794-0_sp1.1 from training. Duration: 0.77275 2023-10-02 01:43:33,165 WARNING [train.py:1204] (2/4) Exclude cut with ID _3231_210_2106_1_1526205864934_6784220_236-260835-0_sp1.1 from training. Duration: 0.97275 2023-10-02 01:43:33,291 WARNING [train.py:1204] (2/4) Exclude cut with ID _24271_210_12658_1_1531987270022_7190519_184-342363-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 01:43:33,292 WARNING [train.py:1204] (2/4) Exclude cut with ID _27894_210_12392_1_1532415496078_6902439_67-221663-0_sp1.1 from training. Duration: 0.92725 2023-10-02 01:43:33,502 WARNING [train.py:1204] (2/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_304-59227-0_sp1.1 from training. Duration: 0.7 2023-10-02 01:43:33,614 WARNING [train.py:1204] (2/4) Exclude cut with ID _58605_210_12210_1_1534723195939_3904259_458-42232-0_sp1.1 from training. Duration: 0.92725 2023-10-02 01:43:33,706 WARNING [train.py:1204] (2/4) Exclude cut with ID _14922_210_6228_1_1530707435273_3693310_13-165512-0_sp0.9 from training. Duration: 0.911125 2023-10-02 01:43:33,863 WARNING [train.py:1204] (2/4) Exclude cut with ID _58602_210_2346_1_1534903210085_6908830_792-102241-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 01:43:34,010 WARNING [train.py:1204] (2/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_588-125165-0_sp0.9 from training. Duration: 0.92225 2023-10-02 01:43:34,160 WARNING [train.py:1204] (2/4) Exclude cut with ID _54218_210_12210_1_1534413646388_4409259_239-98429-0_sp1.1 from training. Duration: 0.97275 2023-10-02 01:43:34,225 WARNING [train.py:1204] (2/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_538-32291-0_sp1.1 from training. Duration: 0.7545625 2023-10-02 01:43:34,329 WARNING [train.py:1204] (2/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_419-317062-0 from training. Duration: 0.91 2023-10-02 01:43:34,438 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9309W0202-640329 from training. Duration: 0.896 2023-10-02 01:43:34,473 WARNING [train.py:1204] (2/4) Exclude cut with ID _3231_210_2106_1_1526205864934_6784220_419-313291-0_sp1.1 from training. Duration: 0.97275 2023-10-02 01:43:34,725 WARNING [train.py:1204] (2/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_619-344499-0 from training. Duration: 0.63 2023-10-02 01:43:34,736 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_278-272612-0 from training. Duration: 0.73 2023-10-02 01:43:34,844 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_103-84010-0_sp0.9 from training. Duration: 0.7666875 2023-10-02 01:43:34,880 WARNING [train.py:1204] (2/4) Exclude cut with ID _26528_210_9070_1_1532570410842_3851310_497-97218-0_sp1.1 from training. Duration: 0.7818125 2023-10-02 01:43:34,962 WARNING [train.py:1204] (2/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_262-84613-0_sp1.1 from training. Duration: 0.936375 2023-10-02 01:43:35,004 WARNING [train.py:1204] (2/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_387-131538-0 from training. Duration: 0.81 2023-10-02 01:43:35,026 WARNING [train.py:1204] (2/4) Exclude cut with ID _8030_210_7497_1_1528444886781_3531089_224-300489-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 01:43:35,041 WARNING [train.py:1204] (2/4) Exclude cut with ID _14349_210_8341_1_1530615710818_3442470_170-336541-0_sp0.9 from training. Duration: 0.9555625 2023-10-02 01:43:35,224 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_47069_210_15368_1_1533781267_4431320_180-31746-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 01:43:35,352 WARNING [train.py:1204] (2/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_447-7041-0_sp1.1 from training. Duration: 0.92725 2023-10-02 01:43:35,383 WARNING [train.py:1204] (2/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_2-55283-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 01:43:36,038 WARNING [train.py:1204] (2/4) Exclude cut with ID _12555_210_8750_1_1530075594402_3780239_70-181911-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 01:43:36,322 WARNING [train.py:1204] (2/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_311-328022-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 01:43:36,493 WARNING [train.py:1204] (2/4) Exclude cut with ID _14528_210_8254_1_1530669700514_4306939_330-2987-0_sp1.1 from training. Duration: 0.92725 2023-10-02 01:43:36,543 WARNING [train.py:1204] (2/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_385-184858-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 01:43:36,847 WARNING [train.py:1204] (2/4) Exclude cut with ID _64414_210_17301_1_1535243252048_3757759_348-333360-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 01:43:36,909 WARNING [train.py:1204] (2/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_753-214751-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 01:43:37,095 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_17613_210_2151_1_1531137569_2366160_202-208013-0_sp1.1 from training. Duration: 0.92725 2023-10-02 01:43:37,104 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_43831_210_2158_1_1533725502_5872791_610-129300-0_sp1.1 from training. Duration: 0.936375 2023-10-02 01:43:37,280 WARNING [train.py:1204] (2/4) Exclude cut with ID _54218_210_12210_1_1534413646388_4409259_321-23852-0_sp1.1 from training. Duration: 0.936375 2023-10-02 01:43:37,434 WARNING [train.py:1204] (2/4) Exclude cut with ID _39411_210_5986_1_1533538825030_3343649_173-238248-0 from training. Duration: 0.83 2023-10-02 01:43:37,440 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_405-339986-0_sp0.9 from training. Duration: 0.5666875 2023-10-02 01:43:37,482 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_2009_210_2071_1_1525481134_4588937_385-302100-0 from training. Duration: 0.87 2023-10-02 01:43:37,517 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_161-276996-0_sp1.1 from training. Duration: 0.863625 2023-10-02 01:43:37,612 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_177-58005-0 from training. Duration: 0.62 2023-10-02 01:43:37,700 WARNING [train.py:1204] (2/4) Exclude cut with ID _64639_210_5816_1_1535367619437_3528000_146-343734-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 01:43:37,792 WARNING [train.py:1204] (2/4) Exclude cut with ID _3845_210_1390_1_1526731326141_3523610_72-61441-0_sp1.1 from training. Duration: 0.92725 2023-10-02 01:43:38,402 WARNING [train.py:1204] (2/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_991-125028-0_sp1.1 from training. Duration: 0.936375 2023-10-02 01:43:38,484 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7544_210_6753_1_1528591327_1471426_172-348777-0 from training. Duration: 0.66 2023-10-02 01:43:38,538 WARNING [train.py:1204] (2/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_416-257730-0_sp1.1 from training. Duration: 0.5909375 2023-10-02 01:43:38,606 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1113-7369-0_sp1.1 from training. Duration: 0.77275 2023-10-02 01:43:38,735 WARNING [train.py:1204] (2/4) Exclude cut with ID _40402_210_18219_1_1533522324115_2953150_242-76943-0_sp1.1 from training. Duration: 0.8545625 2023-10-02 01:43:39,111 WARNING [train.py:1204] (2/4) Exclude cut with ID _44718_210_5816_1_1534050051869_6469030_190-49648-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 01:43:39,275 WARNING [train.py:1204] (2/4) Exclude cut with ID _47251_210_18628_1_1533814063128_4137250_214-88076-0 from training. Duration: 0.88 2023-10-02 01:43:39,313 WARNING [train.py:1204] (2/4) Exclude cut with ID _5585_210_2667_1_1527418813324_8007340_89-332072-0_sp1.1 from training. Duration: 0.5818125 2023-10-02 01:43:39,341 WARNING [train.py:1204] (2/4) Exclude cut with ID _41817_210_18835_1_1533434477160_7423049_555-293448-0_sp1.1 from training. Duration: 0.72725 2023-10-02 01:43:39,479 WARNING [train.py:1204] (2/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_286-331314-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 01:43:39,543 WARNING [train.py:1204] (2/4) Exclude cut with ID _13921_210_9994_1_1530838702800_3003429_46-149444-0 from training. Duration: 0.94 2023-10-02 01:43:39,606 WARNING [train.py:1204] (2/4) Exclude cut with ID _44778_210_2054_1_1533798127203_7443090_768-43595-0_sp1.1 from training. Duration: 0.936375 2023-10-02 01:43:39,625 WARNING [train.py:1204] (2/4) Exclude cut with ID _35355_210_16040_1_1533212988977_4016229_74-190076-0 from training. Duration: 0.8 2023-10-02 01:43:39,749 WARNING [train.py:1204] (2/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_1007-316380-0_sp1.1 from training. Duration: 0.963625 2023-10-02 01:43:39,801 WARNING [train.py:1204] (2/4) Exclude cut with ID _23557_210_8750_1_1531890106370_6846255_150-283437-0_sp1.1 from training. Duration: 0.963625 2023-10-02 01:43:39,846 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_3474_210_2028_1_1526188513_4825246_145-309131-0_sp1.1 from training. Duration: 0.67275 2023-10-02 01:43:40,561 WARNING [train.py:1204] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_391-22148-0_sp1.1 from training. Duration: 0.7 2023-10-02 01:43:40,585 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_238-256668-0 from training. Duration: 0.69 2023-10-02 01:43:40,900 WARNING [train.py:1204] (2/4) Exclude cut with ID _8394_210_6702_1_1528590504316_3540189_372-198878-0 from training. Duration: 0.6 2023-10-02 01:43:40,919 WARNING [train.py:1204] (2/4) Exclude cut with ID ID0174W0006-766659 from training. Duration: 0.953 2023-10-02 01:43:40,944 WARNING [train.py:1204] (2/4) Exclude cut with ID _6274_210_2084_1_1527937583837_7494020_668-23019-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 01:43:41,245 WARNING [train.py:1204] (2/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_322-74328-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 01:43:41,251 WARNING [train.py:1204] (2/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_872-264525-0_sp1.1 from training. Duration: 0.5909375 2023-10-02 01:43:41,316 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_53378_210_17282_1_1534221071_5769509_641-286201-0_sp1.1 from training. Duration: 0.97275 2023-10-02 01:43:41,411 WARNING [train.py:1204] (2/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_199-295611-0_sp1.1 from training. Duration: 0.5 2023-10-02 01:43:41,810 WARNING [train.py:1204] (2/4) Exclude cut with ID _5250_210_5219_1_1527249561806_8692110_1270-317971-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 01:43:41,926 WARNING [train.py:1204] (2/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_784-213099-0 from training. Duration: 0.93 2023-10-02 01:43:42,270 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_115-233929-0_sp0.9 from training. Duration: 0.8 2023-10-02 01:43:42,405 WARNING [train.py:1204] (2/4) Exclude cut with ID _54871_210_22356_1_1534665798253_3856359_256-223809-0_sp1.1 from training. Duration: 0.97275 2023-10-02 01:43:42,436 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_20120_210_6753_1_1531643035_5482859_285-87781-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 01:43:42,532 WARNING [train.py:1204] (2/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_619-344499-0_sp1.1 from training. Duration: 0.57275 2023-10-02 01:43:42,974 WARNING [train.py:1204] (2/4) Exclude cut with ID _60229_210_27767_1_1534838406158_3655350_264-279573-0_sp0.9 from training. Duration: 0.92225 2023-10-02 01:43:43,279 WARNING [train.py:1204] (2/4) Exclude cut with ID _7719_210_3525_1_1528456153163_4296960_333-110399-0 from training. Duration: 0.96 2023-10-02 01:43:43,713 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_50423_210_13270_1_1534128598_5021734_759-90833-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 01:43:43,768 WARNING [train.py:1204] (2/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_151-254798-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 01:43:43,804 WARNING [train.py:1204] (2/4) Exclude cut with ID _3465_210_3525_1_1526175667060_3574049_450-203637-0 from training. Duration: 0.92 2023-10-02 01:43:43,829 WARNING [train.py:1204] (2/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_673-36456-0_sp1.1 from training. Duration: 0.936375 2023-10-02 01:43:43,875 WARNING [train.py:1204] (2/4) Exclude cut with ID _36538_210_9994_1_1533204020363_3795620_131-8685-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 01:43:43,931 WARNING [train.py:1204] (2/4) Exclude cut with ID _12556_210_8341_1_1530081024100_7327690_713-55626-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 01:43:43,981 WARNING [train.py:1204] (2/4) Exclude cut with ID _2322_210_2667_1_1525604548971_3656299_440-80494-0_sp0.9 from training. Duration: 0.97775 2023-10-02 01:43:44,223 WARNING [train.py:1204] (2/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_210-325081-0_sp1.1 from training. Duration: 0.97275 2023-10-02 01:43:45,065 WARNING [train.py:1204] (2/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_375-244976-0_sp0.9 from training. Duration: 0.8333125 2023-10-02 01:43:45,071 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_15517_210_6753_1_1530952293_1664795_119-31001-0_sp1.1 from training. Duration: 0.8545625 2023-10-02 01:43:45,487 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_41452_210_7291_1_1533437687_3941281_319-42326-0_sp1.1 from training. Duration: 0.92725 2023-10-02 01:43:45,631 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_372-173012-0 from training. Duration: 0.95 2023-10-02 01:43:46,005 WARNING [train.py:1204] (2/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_41-31157-0_sp1.1 from training. Duration: 0.5181875 2023-10-02 01:43:46,305 WARNING [train.py:1204] (2/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_91-212486-0 from training. Duration: 0.68 2023-10-02 01:43:46,588 WARNING [train.py:1204] (2/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_1267-272693-0 from training. Duration: 0.73 2023-10-02 01:43:46,623 WARNING [train.py:1204] (2/4) Exclude cut with ID _13938_210_9988_1_1530518718978_3693960_152-67046-0_sp1.1 from training. Duration: 0.936375 2023-10-02 01:43:46,765 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_4528_210_3273_1_1527076654_3276777_342-168068-0_sp1.1 from training. Duration: 0.7909375 2023-10-02 01:43:46,773 WARNING [train.py:1204] (2/4) Exclude cut with ID _7606_210_4915_1_1528894253277_5080690_127-177761-0 from training. Duration: 0.64 2023-10-02 01:43:46,860 WARNING [train.py:1204] (2/4) Exclude cut with ID _45329_210_7005_1_1534157951211_6774339_335-137972-0_sp1.1 from training. Duration: 0.92725 2023-10-02 01:43:47,098 WARNING [train.py:1204] (2/4) Exclude cut with ID _5166_210_2084_1_1527073197160_4087993_331-117829-0_sp1.1 from training. Duration: 0.563625 2023-10-02 01:43:47,501 WARNING [train.py:1204] (2/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_125-125818-0_sp1.1 from training. Duration: 0.87275 2023-10-02 01:43:47,571 WARNING [train.py:1204] (2/4) Exclude cut with ID _6789_210_1534_1_1527854471671_2870519_48-294897-0_sp1.1 from training. Duration: 0.963625 2023-10-02 01:43:47,645 WARNING [train.py:1204] (2/4) Exclude cut with ID _6694_210_4599_1_1527852637490_3965529_116-109398-0_sp1.1 from training. Duration: 0.663625 2023-10-02 01:43:47,654 WARNING [train.py:1204] (2/4) Exclude cut with ID _52892_210_6721_1_1534378941259_4124129_222-172685-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 01:43:47,840 WARNING [train.py:1204] (2/4) Exclude cut with ID _50655_210_18628_1_1534229942193_3772154_320-132275-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 01:43:48,286 WARNING [train.py:1204] (2/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_76-311963-0 from training. Duration: 0.7 2023-10-02 01:43:49,294 WARNING [train.py:1204] (2/4) Exclude cut with ID _6274_210_2084_1_1527937583837_7494020_32-205288-0 from training. Duration: 0.78 2023-10-02 01:43:49,318 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_86-16881-0_sp0.9 from training. Duration: 0.6444375 2023-10-02 01:43:49,352 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_271-297505-0_sp1.1 from training. Duration: 0.9 2023-10-02 01:43:49,458 WARNING [train.py:1204] (2/4) Exclude cut with ID _57572_210_5517_1_1534921219624_3809484_187-295293-0_sp1.1 from training. Duration: 0.963625 2023-10-02 01:43:49,847 WARNING [train.py:1204] (2/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_563-329943-0 from training. Duration: 0.99 2023-10-02 01:43:49,937 WARNING [train.py:1204] (2/4) Exclude cut with ID _7160_210_1876_1_1527940923231_3703799_302-150728-0_sp0.9 from training. Duration: 0.8666875 2023-10-02 01:43:50,231 WARNING [train.py:1204] (2/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_926-7526-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 01:43:50,258 WARNING [train.py:1204] (2/4) Exclude cut with ID _62574_210_2655_1_1535180407058_3933249_467-22348-0_sp1.1 from training. Duration: 0.936375 2023-10-02 01:43:50,367 WARNING [train.py:1204] (2/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_280-161633-0_sp0.9 from training. Duration: 0.811125 2023-10-02 01:43:50,490 WARNING [train.py:1204] (2/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_385-193052-0_sp1.1 from training. Duration: 0.7818125 2023-10-02 01:43:50,645 WARNING [train.py:1204] (2/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_444-28041-0 from training. Duration: 0.85 2023-10-02 01:43:50,650 WARNING [train.py:1204] (2/4) Exclude cut with ID _66070_210_7640_1_1535441393096_7805310_341-249237-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 01:43:50,762 WARNING [train.py:1204] (2/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_425-269826-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 01:43:50,967 WARNING [train.py:1204] (2/4) Exclude cut with ID _53950_210_5816_1_1534417264919_3862951_124-185597-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 01:43:51,169 WARNING [train.py:1204] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_380-204756-0 from training. Duration: 0.86 2023-10-02 01:43:51,404 WARNING [train.py:1204] (2/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_282-257791-0_sp0.9 from training. Duration: 0.9555625 2023-10-02 01:43:51,748 WARNING [train.py:1204] (2/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_372-131163-0_sp1.1 from training. Duration: 0.8545625 2023-10-02 01:43:51,861 WARNING [train.py:1204] (2/4) Exclude cut with ID _28317_210_12742_1_1532480302381_8510325_447-233513-0_sp1.1 from training. Duration: 0.963625 2023-10-02 01:43:52,077 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_54-118333-0_sp1.1 from training. Duration: 0.97275 2023-10-02 01:43:52,417 WARNING [train.py:1204] (2/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_388-230239-0_sp1.1 from training. Duration: 0.963625 2023-10-02 01:43:52,428 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_40047_210_2290_1_1533290519_2374311_141-54639-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 01:43:52,512 WARNING [train.py:1204] (2/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_91-267696-0_sp0.9 from training. Duration: 0.9666875 2023-10-02 01:43:52,544 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_14056_210_4163_1_1530604483_7572827_532-137847-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 01:43:53,466 WARNING [train.py:1204] (2/4) Exclude cut with ID _8064_210_5399_1_1528372299291_4282973_155-283231-0_sp1.1 from training. Duration: 0.97275 2023-10-02 01:43:53,693 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_2245_210_2715_1_1525511548_3540254_310-52483-0_sp1.1 from training. Duration: 0.8 2023-10-02 01:43:53,764 WARNING [train.py:1204] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_185-174805-0 from training. Duration: 0.68 2023-10-02 01:43:53,930 WARNING [train.py:1204] (2/4) Exclude cut with ID _15012_210_1968_1_1530856820429_5095350_662-210469-0_sp1.1 from training. Duration: 0.5181875 2023-10-02 01:43:54,165 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_426-313557-0_sp0.9 from training. Duration: 0.588875 2023-10-02 01:43:54,385 WARNING [train.py:1204] (2/4) Exclude cut with ID _42326_210_7770_1_1533814282531_3707269_407-204343-0_sp1.1 from training. Duration: 0.97275 2023-10-02 01:43:54,395 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_258-310570-0 from training. Duration: 0.82 2023-10-02 01:43:54,766 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_4737_210_2040_1_1526998689_3293008_378-178650-0_sp1.1 from training. Duration: 0.863625 2023-10-02 01:43:54,850 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_47896_210_20508_1_1534492518_5958868_432-327451-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 01:43:54,943 WARNING [train.py:1204] (2/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_188-114464-0_sp1.1 from training. Duration: 0.7454375 2023-10-02 01:43:54,949 WARNING [train.py:1204] (2/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_798-106179-0_sp1.1 from training. Duration: 0.7545625 2023-10-02 01:43:54,980 WARNING [train.py:1204] (2/4) Exclude cut with ID _14368_210_4220_1_1530532714870_3595529_143-117421-0_sp1.1 from training. Duration: 0.52725 2023-10-02 01:43:55,039 WARNING [train.py:1204] (2/4) Exclude cut with ID _67499_210_17282_1_1535509805220_3692430_361-78786-0_sp1.1 from training. Duration: 0.963625 2023-10-02 01:43:55,067 WARNING [train.py:1204] (2/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_712-342563-0_sp1.1 from training. Duration: 0.8818125 2023-10-02 01:43:55,115 WARNING [train.py:1204] (2/4) Exclude cut with ID _9235_210_5219_1_1528891154253_8046120_387-159008-0 from training. Duration: 0.86 2023-10-02 01:43:55,296 WARNING [train.py:1204] (2/4) Exclude cut with ID _33640_210_2655_1_1533117605479_4855469_959-18820-0_sp1.1 from training. Duration: 0.963625 2023-10-02 01:43:55,342 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_12868_210_6753_1_1530702479_3610564_133-564-0_sp0.9 from training. Duration: 0.87775 2023-10-02 01:43:55,469 WARNING [train.py:1204] (2/4) Exclude cut with ID _23514_210_1389_1_1531832139785_3031439_12-29287-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 01:43:55,762 WARNING [train.py:1204] (2/4) Exclude cut with ID _23514_210_1389_1_1531832139785_3031439_489-185052-0_sp1.1 from training. Duration: 0.963625 2023-10-02 01:43:55,840 WARNING [train.py:1204] (2/4) Exclude cut with ID _5444_210_2151_1_1527418808464_5312210_634-194174-0 from training. Duration: 0.97 2023-10-02 01:43:55,857 WARNING [train.py:1204] (2/4) Exclude cut with ID _6275_210_2084_1_1528110062213_7669049_91-26919-0_sp1.1 from training. Duration: 0.8818125 2023-10-02 01:43:56,090 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_37351_210_7925_1_1533012931_3812113_516-82303-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 01:43:56,214 WARNING [train.py:1204] (2/4) Exclude cut with ID _41314_210_12651_1_1533459476543_3766460_80-74184-0_sp0.9 from training. Duration: 0.92225 2023-10-02 01:43:56,283 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_9460_210_4211_1_1528954587_10049667_495-202963-0_sp1.1 from training. Duration: 0.963625 2023-10-02 01:43:56,418 WARNING [train.py:1204] (2/4) Exclude cut with ID _14632_210_9988_1_1530691279392_3923200_332-105904-0_sp1.1 from training. Duration: 0.8181875 2023-10-02 01:43:56,438 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_40764_210_1925_1_1533882359_3752345_493-167886-0_sp1.1 from training. Duration: 0.92725 2023-10-02 01:43:56,570 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_196-289637-0 from training. Duration: 0.75 2023-10-02 01:43:56,638 WARNING [train.py:1204] (2/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_26-175553-0 from training. Duration: 0.61 2023-10-02 01:43:56,679 WARNING [train.py:1204] (2/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_890-5117-0 from training. Duration: 0.78 2023-10-02 01:43:56,733 WARNING [train.py:1204] (2/4) Exclude cut with ID _41314_210_12651_1_1533459476543_3766460_206-306860-0_sp1.1 from training. Duration: 0.92725 2023-10-02 01:43:56,853 WARNING [train.py:1204] (2/4) Exclude cut with ID _25882_210_3036_1_1532775665815_6479510_570-79824-0_sp1.1 from training. Duration: 0.963625 2023-10-02 01:43:56,930 WARNING [train.py:1204] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_731-2428-0_sp1.1 from training. Duration: 0.7545625 2023-10-02 01:43:56,956 WARNING [train.py:1204] (2/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_138-154812-0_sp0.9 from training. Duration: 0.8666875 2023-10-02 01:43:57,812 WARNING [train.py:1204] (2/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_178-39722-0_sp0.9 from training. Duration: 0.5444375 2023-10-02 01:43:57,875 WARNING [train.py:1204] (2/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_1285-256622-0_sp0.9 from training. Duration: 0.9333125 2023-10-02 01:43:58,264 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_41428_210_2161_1_1533774184_3992532_65-271323-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 01:43:58,357 WARNING [train.py:1204] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_308-161067-0 from training. Duration: 0.66 2023-10-02 01:43:58,368 WARNING [train.py:1204] (2/4) Exclude cut with ID _3067_210_2667_1_1526212974640_4503168_459-173867-0 from training. Duration: 0.88 2023-10-02 01:43:58,370 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_2221_210_1988_1_1525597051_4935541_415-296882-0_sp1.1 from training. Duration: 0.7 2023-10-02 01:43:58,577 WARNING [train.py:1204] (2/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_354-192266-0_sp1.1 from training. Duration: 0.936375 2023-10-02 01:43:58,652 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_258-310570-0_sp0.9 from training. Duration: 0.911125 2023-10-02 01:43:58,781 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_548-141039-0_sp1.1 from training. Duration: 0.963625 2023-10-02 01:43:59,009 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_15101_210_7927_1_1530786415_5033274_320-327527-0 from training. Duration: 0.96 2023-10-02 01:43:59,068 WARNING [train.py:1204] (2/4) Exclude cut with ID _2895_210_2477_1_1525956785794_4063750_289-11475-0_sp0.9 from training. Duration: 0.911125 2023-10-02 01:43:59,204 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7543_210_6753_1_1528543318_4552_1-85072-0_sp1.1 from training. Duration: 0.7545625 2023-10-02 01:43:59,291 WARNING [train.py:1204] (2/4) Exclude cut with ID _64639_210_5816_1_1535367619437_3528000_461-329156-0 from training. Duration: 0.96 2023-10-02 01:43:59,431 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_211-339622-0 from training. Duration: 0.45 2023-10-02 01:43:59,581 WARNING [train.py:1204] (2/4) Exclude cut with ID _20455_210_5219_1_1531565996775_8098100_402-112739-0_sp1.1 from training. Duration: 0.963625 2023-10-02 01:43:59,679 WARNING [train.py:1204] (2/4) Exclude cut with ID _53489_210_6228_1_1534298357169_3835970_256-115565-0_sp1.1 from training. Duration: 0.936375 2023-10-02 01:43:59,699 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_26326_210_7925_1_1533002493_3592406_360-305260-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 01:43:59,734 WARNING [train.py:1204] (2/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_18-204383-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 01:43:59,966 WARNING [train.py:1204] (2/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_306-232480-0_sp1.1 from training. Duration: 0.87275 2023-10-02 01:44:00,612 WARNING [train.py:1204] (2/4) Exclude cut with ID _41161_210_20034_1_1533431249657_3555314_116-299186-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 01:44:00,757 WARNING [train.py:1204] (2/4) Exclude cut with ID _13913_210_9994_1_1530579615187_3383897_473-277682-0_sp1.1 from training. Duration: 0.3545625 2023-10-02 01:44:00,991 WARNING [train.py:1204] (2/4) Exclude cut with ID _58895_210_8750_1_1535184081320_3649089_49-225702-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 01:44:01,079 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_3080_210_2071_1_1526086102_4365166_312-8726-0_sp1.1 from training. Duration: 0.82725 2023-10-02 01:44:01,193 WARNING [train.py:1204] (2/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_489-190175-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 01:44:01,260 WARNING [train.py:1204] (2/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_345-95717-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 01:44:01,283 WARNING [train.py:1204] (2/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_85-138257-0_sp1.1 from training. Duration: 0.6454375 2023-10-02 01:44:02,123 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7046_210_6753_1_1527895337_6920917_783-278109-0_sp1.1 from training. Duration: 0.7 2023-10-02 01:44:02,162 WARNING [train.py:1204] (2/4) Exclude cut with ID _3465_210_3525_1_1526175667060_3574049_450-203637-0_sp1.1 from training. Duration: 0.836375 2023-10-02 01:44:02,193 WARNING [train.py:1204] (2/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_416-132893-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 01:44:02,210 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_2655_210_2087_1_1525688959_3897400_437-337690-0_sp0.9 from training. Duration: 0.7 2023-10-02 01:44:02,293 WARNING [train.py:1204] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_700-183549-0_sp1.1 from training. Duration: 0.6181875 2023-10-02 01:44:02,753 WARNING [train.py:1204] (2/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_191-59784-0_sp1.1 from training. Duration: 0.8 2023-10-02 01:44:02,764 WARNING [train.py:1204] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_218-295097-0_sp1.1 from training. Duration: 0.7 2023-10-02 01:44:02,892 WARNING [train.py:1204] (2/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_310-109461-0_sp1.1 from training. Duration: 0.72725 2023-10-02 01:44:02,896 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_33348_210_5632_1_1533086912_3324030_131-321149-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 01:44:02,961 WARNING [train.py:1204] (2/4) Exclude cut with ID _64135_210_2253_1_1535677267876_7608972_133-188266-0_sp1.1 from training. Duration: 0.736375 2023-10-02 01:44:02,978 WARNING [train.py:1204] (2/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_505-205270-0 from training. Duration: 0.38 2023-10-02 01:44:02,994 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_67223_210_4238_1_1535612120_2537431_102-88248-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 01:44:03,112 WARNING [train.py:1204] (2/4) Exclude cut with ID _54871_210_22356_1_1534665798253_3856359_60-235433-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 01:44:03,184 WARNING [train.py:1204] (2/4) Exclude cut with ID _5189_210_4557_1_1527248319428_3769229_199-53526-0_sp1.1 from training. Duration: 0.3818125 2023-10-02 01:44:03,549 WARNING [train.py:1204] (2/4) Exclude cut with ID _45329_210_7005_1_1534157951211_6774339_439-90979-0_sp1.1 from training. Duration: 0.963625 2023-10-02 01:44:03,619 WARNING [train.py:1204] (2/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_1162-132880-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 01:44:03,693 WARNING [train.py:1204] (2/4) Exclude cut with ID _56423_210_5854_1_1534896061049_3165969_220-22317-0_sp1.1 from training. Duration: 0.963625 2023-10-02 01:44:03,851 WARNING [train.py:1204] (2/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_909-64151-0 from training. Duration: 0.94 2023-10-02 01:44:03,924 WARNING [train.py:1204] (2/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_465-156500-0_sp0.9 from training. Duration: 0.8 2023-10-02 01:44:04,011 WARNING [train.py:1204] (2/4) Exclude cut with ID _44304_210_15374_1_1533639352765_4613129_302-22084-0 from training. Duration: 0.86 2023-10-02 01:44:04,056 WARNING [train.py:1204] (2/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_663-166973-0_sp1.1 from training. Duration: 0.72725 2023-10-02 01:44:04,223 WARNING [train.py:1204] (2/4) Exclude cut with ID _45329_210_7005_1_1534157951211_6774339_503-287883-0_sp0.9 from training. Duration: 0.97775 2023-10-02 01:44:04,238 WARNING [train.py:1204] (2/4) Exclude cut with ID _60057_210_10640_1_1534903065793_3333150_362-230377-0_sp0.9 from training. Duration: 0.9666875 2023-10-02 01:44:04,512 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_4183_210_2087_1_1526729100_1869838_143-280962-0 from training. Duration: 0.56 2023-10-02 01:44:04,617 WARNING [train.py:1204] (2/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_281-1313-0_sp1.1 from training. Duration: 0.936375 2023-10-02 01:44:05,052 WARNING [train.py:1204] (2/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_64-215118-0_sp1.1 from training. Duration: 0.936375 2023-10-02 01:44:05,106 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_2822_210_2715_1_1526112586_1999587_165-36656-0 from training. Duration: 0.89 2023-10-02 01:44:05,155 WARNING [train.py:1204] (2/4) Exclude cut with ID _22961_210_8254_1_1531976366672_4278180_402-208830-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 01:44:05,439 WARNING [train.py:1204] (2/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_98-193494-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 01:44:05,521 WARNING [train.py:1204] (2/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_383-285196-0 from training. Duration: 0.97 2023-10-02 01:44:06,572 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6767_210_3273_1_1527855783_1718932_142-127132-0 from training. Duration: 0.95 2023-10-02 01:44:06,598 WARNING [train.py:1204] (2/4) Exclude cut with ID _39867_210_9826_1_1533553222030_9344259_1018-339635-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 01:44:06,675 WARNING [train.py:1204] (2/4) Exclude cut with ID _14603_210_4220_1_1530615688441_3097319_274-274798-0_sp1.1 from training. Duration: 0.8909375 2023-10-02 01:44:06,833 WARNING [train.py:1204] (2/4) Exclude cut with ID _2334_210_2667_1_1525608238880_4705129_376-263393-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 01:44:06,887 WARNING [train.py:1204] (2/4) Exclude cut with ID _29303_210_2575_1_1532392237303_7410649_363-344675-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 01:44:06,977 WARNING [train.py:1204] (2/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_58-68956-0_sp1.1 from training. Duration: 0.7545625 2023-10-02 01:44:07,265 WARNING [train.py:1204] (2/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_709-311571-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 01:44:07,393 WARNING [train.py:1204] (2/4) Exclude cut with ID _46067_210_4220_1_1534125571066_3307450_136-257407-0_sp1.1 from training. Duration: 0.963625 2023-10-02 01:44:07,485 WARNING [train.py:1204] (2/4) Exclude cut with ID _6493_210_1747_1_1528459210424_4012470_324-323854-0_sp1.1 from training. Duration: 0.836375 2023-10-02 01:44:07,544 WARNING [train.py:1204] (2/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_234-260682-0_sp1.1 from training. Duration: 0.763625 2023-10-02 01:44:07,613 WARNING [train.py:1204] (2/4) Exclude cut with ID _36634_210_1794_1_1533296352450_1399884_55-119451-0_sp1.1 from training. Duration: 0.963625 2023-10-02 01:44:07,617 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_40047_210_2290_1_1533290519_2374311_195-165693-0 from training. Duration: 0.89 2023-10-02 01:44:07,860 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_5522_210_3273_1_1527249606_3909096_366-172508-0_sp0.9 from training. Duration: 0.7333125 2023-10-02 01:44:07,935 WARNING [train.py:1204] (2/4) Exclude cut with ID _21878_210_3525_1_1531738772002_7264330_187-10504-0_sp1.1 from training. Duration: 0.963625 2023-10-02 01:44:08,155 WARNING [train.py:1204] (2/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_282-257791-0_sp1.1 from training. Duration: 0.7818125 2023-10-02 01:44:08,388 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_43831_210_2158_1_1533725502_5872791_467-288776-0_sp1.1 from training. Duration: 0.92725 2023-10-02 01:44:08,472 WARNING [train.py:1204] (2/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_138-154812-0_sp1.1 from training. Duration: 0.7090625 2023-10-02 01:44:08,522 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_1154-185930-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 01:44:08,538 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_2655_210_2087_1_1525688959_3897400_52-279899-0 from training. Duration: 0.69 2023-10-02 01:44:08,602 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_141-89113-0_sp1.1 from training. Duration: 0.7818125 2023-10-02 01:44:08,808 WARNING [train.py:1204] (2/4) Exclude cut with ID _55134_210_21982_1_1534590028377_3882170_297-47310-0_sp1.1 from training. Duration: 0.963625 2023-10-02 01:44:08,872 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_486-151131-0_sp1.1 from training. Duration: 0.77275 2023-10-02 01:44:08,996 WARNING [train.py:1204] (2/4) Exclude cut with ID _44778_210_2054_1_1533798127203_7443090_21-232980-0_sp1.1 from training. Duration: 0.936375 2023-10-02 01:44:09,124 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6165_210_3528_1_1527590982_4379841_338-106380-0_sp1.1 from training. Duration: 0.92725 2023-10-02 01:44:09,192 WARNING [train.py:1204] (2/4) Exclude cut with ID _55162_210_17071_1_1534408248583_3753480_355-257218-0_sp1.1 from training. Duration: 0.97275 2023-10-02 01:44:09,264 WARNING [train.py:1204] (2/4) Exclude cut with ID _2895_210_2477_1_1525956785794_4063750_289-11475-0 from training. Duration: 0.82 2023-10-02 01:44:09,760 WARNING [train.py:1204] (2/4) Exclude cut with ID _16756_210_4915_1_1531047803763_7104259_839-183431-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 01:44:09,869 WARNING [train.py:1204] (2/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_427-208901-0_sp0.9 from training. Duration: 0.7555625 2023-10-02 01:44:09,918 WARNING [train.py:1204] (2/4) Exclude cut with ID _36512_210_6987_1_1533033143998_3126069_181-306500-0 from training. Duration: 0.95 2023-10-02 01:44:10,880 WARNING [train.py:1204] (2/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_28-70987-0_sp1.1 from training. Duration: 0.7909375 2023-10-02 01:44:11,262 WARNING [train.py:1204] (2/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_590-58996-0_sp1.1 from training. Duration: 0.936375 2023-10-02 01:44:11,466 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_48730_210_5399_1_1533885886_2212769_108-47561-0_sp1.1 from training. Duration: 0.936375 2023-10-02 01:44:11,917 WARNING [train.py:1204] (2/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_401-158239-0_sp0.9 from training. Duration: 0.788875 2023-10-02 01:44:11,938 WARNING [train.py:1204] (2/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_278-154492-0_sp1.1 from training. Duration: 0.6909375 2023-10-02 01:44:11,943 WARNING [train.py:1204] (2/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_137-116442-0 from training. Duration: 0.78 2023-10-02 01:44:12,071 WARNING [train.py:1204] (2/4) Exclude cut with ID _3541_210_2477_1_1526475408709_3263973_189-317158-0 from training. Duration: 0.9 2023-10-02 01:44:12,189 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_47896_210_20508_1_1534492518_5958868_558-314572-0 from training. Duration: 0.97 2023-10-02 01:44:12,360 WARNING [train.py:1204] (2/4) Exclude cut with ID _66070_210_7640_1_1535441393096_7805310_290-40284-0_sp1.1 from training. Duration: 0.97275 2023-10-02 01:44:12,406 WARNING [train.py:1204] (2/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_110-224688-0_sp1.1 from training. Duration: 0.8818125 2023-10-02 01:44:12,489 WARNING [train.py:1204] (2/4) Exclude cut with ID _7719_210_3525_1_1528456153163_4296960_88-250259-0 from training. Duration: 0.84 2023-10-02 01:44:12,540 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8453_210_4211_1_1528502940_8562730_1157-114102-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 01:44:12,564 WARNING [train.py:1204] (2/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_317-175062-0_sp1.1 from training. Duration: 0.7454375 2023-10-02 01:44:12,577 WARNING [train.py:1204] (2/4) Exclude cut with ID _29857_210_20034_1_1532395886753_3798160_254-347254-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 01:44:12,659 WARNING [train.py:1204] (2/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_28-70987-0 from training. Duration: 0.87 2023-10-02 01:44:12,727 WARNING [train.py:1204] (2/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_321-335475-0 from training. Duration: 0.45 2023-10-02 01:44:12,970 WARNING [train.py:1204] (2/4) Exclude cut with ID _40901_210_5665_1_1533363789958_3856130_356-107330-0_sp1.1 from training. Duration: 0.936375 2023-10-02 01:44:13,025 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_3780_210_2087_1_1526467389_2145572_176-107140-0 from training. Duration: 0.69 2023-10-02 01:44:13,430 WARNING [train.py:1204] (2/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_80-135200-0 from training. Duration: 0.79 2023-10-02 01:44:13,568 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_33348_210_5632_1_1533086912_3324030_85-28583-0_sp0.9 from training. Duration: 0.911125 2023-10-02 01:44:13,843 WARNING [train.py:1204] (2/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_29-293896-0 from training. Duration: 0.61 2023-10-02 01:44:13,973 WARNING [train.py:1204] (2/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_194-252800-0 from training. Duration: 0.97 2023-10-02 01:44:14,251 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_360-52446-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 01:44:14,252 WARNING [train.py:1204] (2/4) Exclude cut with ID _9389_210_1664_1_1529217337301_5289269_289-213417-0_sp0.9 from training. Duration: 0.988875 2023-10-02 01:44:14,926 WARNING [train.py:1204] (2/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_143-86344-0_sp0.9 from training. Duration: 0.8555625 2023-10-02 01:44:15,153 WARNING [train.py:1204] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_520-126729-0_sp0.9 from training. Duration: 0.77775 2023-10-02 01:44:15,181 WARNING [train.py:1204] (2/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_76-308663-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 01:44:15,246 WARNING [train.py:1204] (2/4) Exclude cut with ID _8021_210_7994_1_1528376451358_3653069_100-140231-0 from training. Duration: 0.67 2023-10-02 01:44:15,332 WARNING [train.py:1204] (2/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_387-131538-0_sp0.9 from training. Duration: 0.9 2023-10-02 01:44:15,361 WARNING [train.py:1204] (2/4) Exclude cut with ID _6275_210_2084_1_1528110062213_7669049_365-182054-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 01:44:15,405 WARNING [train.py:1204] (2/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_345-340690-0 from training. Duration: 0.93 2023-10-02 01:44:15,508 WARNING [train.py:1204] (2/4) Exclude cut with ID _5409_210_1388_1_1527292850189_5865191_178-127970-0_sp0.9 from training. Duration: 0.6333125 2023-10-02 01:44:15,569 WARNING [train.py:1204] (2/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_352-12734-0_sp1.1 from training. Duration: 0.92725 2023-10-02 01:44:15,572 WARNING [train.py:1204] (2/4) Exclude cut with ID _65134_210_14763_1_1535335393985_3505368_121-129373-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 01:44:15,723 WARNING [train.py:1204] (2/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_557-324258-0_sp1.1 from training. Duration: 0.736375 2023-10-02 01:44:15,901 WARNING [train.py:1204] (2/4) Exclude cut with ID _48678_210_5663_1_1533988830947_3769199_306-255886-0 from training. Duration: 0.77 2023-10-02 01:44:15,991 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7544_210_6753_1_1528587000_691468_87-178599-0 from training. Duration: 0.87 2023-10-02 01:44:16,047 WARNING [train.py:1204] (2/4) Exclude cut with ID _2941_210_2577_1_1526199673150_2489759_208-236402-0_sp1.1 from training. Duration: 0.963625 2023-10-02 01:44:16,209 WARNING [train.py:1204] (2/4) Exclude cut with ID _38670_210_15379_1_1533448810659_4391059_65-169397-0 from training. Duration: 0.97 2023-10-02 01:44:16,322 WARNING [train.py:1204] (2/4) Exclude cut with ID _30304_210_6319_1_1532476827261_7582060_306-328175-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 01:44:16,336 WARNING [train.py:1204] (2/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_212-149947-0_sp1.1 from training. Duration: 0.663625 2023-10-02 01:44:16,748 WARNING [train.py:1204] (2/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_321-22821-0_sp1.1 from training. Duration: 0.8090625 2023-10-02 01:44:16,883 WARNING [train.py:1204] (2/4) Exclude cut with ID _44010_210_4525_1_1533884262964_4013620_483-69110-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 01:44:16,891 WARNING [train.py:1204] (2/4) Exclude cut with ID _14922_210_6228_1_1530707435273_3693310_38-187851-0_sp1.1 from training. Duration: 0.563625 2023-10-02 01:44:16,975 WARNING [train.py:1204] (2/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_129-337802-0_sp0.9 from training. Duration: 0.8 2023-10-02 01:44:17,033 WARNING [train.py:1204] (2/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_268-87909-0_sp1.1 from training. Duration: 0.9 2023-10-02 01:44:17,225 WARNING [train.py:1204] (2/4) Exclude cut with ID _21878_210_3525_1_1531738772002_7264330_559-184862-0_sp1.1 from training. Duration: 0.8545625 2023-10-02 01:44:17,408 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_46032_210_14656_1_1533781412_4507969_237-120766-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 01:44:17,504 WARNING [train.py:1204] (2/4) Exclude cut with ID _28427_210_4220_1_1532581240967_3061069_330-170906-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 01:44:17,509 WARNING [train.py:1204] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_401-51578-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 01:44:17,635 WARNING [train.py:1204] (2/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_209-205365-0_sp1.1 from training. Duration: 0.7181875 2023-10-02 01:44:17,725 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_23801_210_16223_1_1532066392_3294657_57-242170-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 01:44:17,780 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_127-37371-0_sp1.1 from training. Duration: 0.936375 2023-10-02 01:44:18,192 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9171W0019-578905 from training. Duration: 0.896 2023-10-02 01:44:18,426 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_24605_210_1794_1_1532047863_4149755_349-49056-0_sp1.1 from training. Duration: 0.92725 2023-10-02 01:44:18,441 WARNING [train.py:1204] (2/4) Exclude cut with ID _12302_210_5219_1_1529924446466_8107280_897-21237-0_sp1.1 from training. Duration: 0.936375 2023-10-02 01:44:18,522 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_9460_210_4211_1_1528954587_10049667_480-260269-0_sp0.9 from training. Duration: 0.62225 2023-10-02 01:44:19,218 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_348-152548-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 01:44:19,293 WARNING [train.py:1204] (2/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_301-330455-0_sp0.9 from training. Duration: 0.888875 2023-10-02 01:44:19,419 WARNING [train.py:1204] (2/4) Exclude cut with ID _61008_210_10633_1_1534852769198_6974370_345-91705-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 01:44:19,557 WARNING [train.py:1204] (2/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_304-273433-0 from training. Duration: 0.99 2023-10-02 01:44:19,560 WARNING [train.py:1204] (2/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_359-345972-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 01:44:19,737 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_2296_210_2087_1_1525503448_3397335_57-185430-0_sp1.1 from training. Duration: 0.5454375 2023-10-02 01:44:19,888 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_1256-120974-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 01:44:19,951 WARNING [train.py:1204] (2/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_372-30302-0_sp0.9 from training. Duration: 0.9555625 2023-10-02 01:44:20,112 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_42012_210_7927_1_1533439936_3871556_451-344262-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 01:44:20,113 WARNING [train.py:1204] (2/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_28-324729-0 from training. Duration: 0.94 2023-10-02 01:44:20,130 WARNING [train.py:1204] (2/4) Exclude cut with ID _64135_210_2253_1_1535677267876_7608972_313-18394-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 01:44:20,227 WARNING [train.py:1204] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_620-276793-0_sp0.9 from training. Duration: 0.6555625 2023-10-02 01:44:20,263 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7544_210_6753_1_1528591327_1471426_172-348777-0_sp0.9 from training. Duration: 0.7333125 2023-10-02 01:44:20,495 WARNING [train.py:1204] (2/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_332-221057-0_sp0.9 from training. Duration: 0.7555625 2023-10-02 01:44:20,744 WARNING [train.py:1204] (2/4) Exclude cut with ID _19023_210_4353_1_1531378892552_5820780_5-300035-0_sp1.1 from training. Duration: 0.92725 2023-10-02 01:44:20,799 WARNING [train.py:1204] (2/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_343-309023-0_sp1.1 from training. Duration: 0.963625 2023-10-02 01:44:20,882 WARNING [train.py:1204] (2/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_37-41919-0_sp1.1 from training. Duration: 0.7909375 2023-10-02 01:44:20,913 WARNING [train.py:1204] (2/4) Exclude cut with ID _3541_210_2477_1_1526475408709_3263973_41-182910-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 01:44:20,972 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_229-45300-0_sp0.9 from training. Duration: 0.6333125 2023-10-02 01:44:21,141 WARNING [train.py:1204] (2/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_40-214716-0_sp1.1 from training. Duration: 0.963625 2023-10-02 01:44:21,232 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9309W0203-640330 from training. Duration: 0.896 2023-10-02 01:44:21,763 WARNING [train.py:1204] (2/4) Exclude cut with ID _60229_210_27767_1_1534838406158_3655350_264-279573-0_sp1.1 from training. Duration: 0.7545625 2023-10-02 01:44:21,790 WARNING [train.py:1204] (2/4) Exclude cut with ID _8394_210_6702_1_1528590504316_3540189_372-198878-0_sp0.9 from training. Duration: 0.6666875 2023-10-02 01:44:21,894 WARNING [train.py:1204] (2/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_359-118926-0_sp1.1 from training. Duration: 0.6545625 2023-10-02 01:44:21,904 WARNING [train.py:1204] (2/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_85-65056-0 from training. Duration: 0.96 2023-10-02 01:44:21,921 WARNING [train.py:1204] (2/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_89-242335-0_sp1.1 from training. Duration: 0.863625 2023-10-02 01:44:22,186 WARNING [train.py:1204] (2/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_128-49444-0_sp1.1 from training. Duration: 0.936375 2023-10-02 01:44:22,256 WARNING [train.py:1204] (2/4) Exclude cut with ID _30327_210_16049_1_1532484161295_3924169_401-70819-0 from training. Duration: 0.86 2023-10-02 01:44:22,342 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_41822_210_13270_1_1533955976_4379284_260-256391-0_sp1.1 from training. Duration: 0.936375 2023-10-02 01:44:22,468 WARNING [train.py:1204] (2/4) Exclude cut with ID _67335_210_5219_1_1535525975652_4275229_599-247317-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 01:44:22,655 WARNING [train.py:1204] (2/4) Exclude cut with ID _29857_210_20034_1_1532395886753_3798160_209-239267-0_sp1.1 from training. Duration: 0.936375 2023-10-02 01:44:22,690 WARNING [train.py:1204] (2/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_46-207599-0_sp1.1 from training. Duration: 0.5818125 2023-10-02 01:44:23,469 WARNING [train.py:1204] (2/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_912-282412-0_sp1.1 from training. Duration: 0.4454375 2023-10-02 01:44:23,741 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_4183_210_2087_1_1526729100_1869838_141-166600-0_sp1.1 from training. Duration: 0.863625 2023-10-02 01:44:23,751 WARNING [train.py:1204] (2/4) Exclude cut with ID _15037_210_2253_1_1530752294104_3267650_296-192597-0 from training. Duration: 0.81 2023-10-02 01:44:23,800 WARNING [train.py:1204] (2/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_571-220562-0_sp1.1 from training. Duration: 0.963625 2023-10-02 01:44:23,817 WARNING [train.py:1204] (2/4) Exclude cut with ID _5287_210_2084_1_1527418883040_3878670_12-221186-0 from training. Duration: 0.98 2023-10-02 01:44:24,165 WARNING [train.py:1204] (2/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_149-85602-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 01:44:24,186 WARNING [train.py:1204] (2/4) Exclude cut with ID _9235_210_5219_1_1528891154253_8046120_1003-101638-0_sp1.1 from training. Duration: 0.663625 2023-10-02 01:44:24,430 WARNING [train.py:1204] (2/4) Exclude cut with ID _55844_210_2346_1_1534471212569_7625707_1099-33689-0_sp1.1 from training. Duration: 0.92725 2023-10-02 01:44:24,472 WARNING [train.py:1204] (2/4) Exclude cut with ID _7606_210_4915_1_1528894253277_5080690_301-10732-0 from training. Duration: 0.81 2023-10-02 01:44:24,640 WARNING [train.py:1204] (2/4) Exclude cut with ID _22826_210_9994_1_1531908201522_3279169_403-34698-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 01:44:24,646 WARNING [train.py:1204] (2/4) Exclude cut with ID _8480_210_1874_1_1528589391205_6728330_755-112625-0_sp0.9 from training. Duration: 0.82225 2023-10-02 01:44:24,665 WARNING [train.py:1204] (2/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_527-238990-0_sp1.1 from training. Duration: 0.8181875 2023-10-02 01:44:24,695 WARNING [train.py:1204] (2/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_426-7328-0_sp1.1 from training. Duration: 0.92725 2023-10-02 01:44:24,977 WARNING [train.py:1204] (2/4) Exclude cut with ID _3541_210_2477_1_1526475408709_3263973_189-317158-0_sp1.1 from training. Duration: 0.8181875 2023-10-02 01:44:25,052 WARNING [train.py:1204] (2/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_162-103620-0 from training. Duration: 0.93 2023-10-02 01:44:25,165 WARNING [train.py:1204] (2/4) Exclude cut with ID _22083_210_8254_1_1531702862439_3678970_49-234546-0 from training. Duration: 0.83 2023-10-02 01:44:25,210 WARNING [train.py:1204] (2/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_370-331600-0_sp1.1 from training. Duration: 0.8545625 2023-10-02 01:44:25,238 WARNING [train.py:1204] (2/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_166-59042-0_sp1.1 from training. Duration: 0.97275 2023-10-02 01:44:25,360 WARNING [train.py:1204] (2/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_138-332773-0_sp1.1 from training. Duration: 0.736375 2023-10-02 01:44:25,423 WARNING [train.py:1204] (2/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_90-30673-0_sp0.9 from training. Duration: 0.9333125 2023-10-02 01:44:25,618 WARNING [train.py:1204] (2/4) Exclude cut with ID _27967_210_12140_1_1532588391859_8419130_737-41898-0_sp1.1 from training. Duration: 0.936375 2023-10-02 01:44:25,715 WARNING [train.py:1204] (2/4) Exclude cut with ID _8833_210_5219_1_1528592665210_7553059_329-125156-0_sp1.1 from training. Duration: 0.7454375 2023-10-02 01:44:25,835 WARNING [train.py:1204] (2/4) Exclude cut with ID _41817_210_18835_1_1533434477160_7423049_352-42537-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 01:44:25,974 WARNING [train.py:1204] (2/4) Exclude cut with ID _8365_210_2064_1_1528592311070_4677690_431-294102-0 from training. Duration: 0.89 2023-10-02 01:44:26,095 WARNING [train.py:1204] (2/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_454-314901-0_sp1.1 from training. Duration: 0.4181875 2023-10-02 01:44:26,310 WARNING [train.py:1204] (2/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_80-135200-0_sp0.9 from training. Duration: 0.87775 2023-10-02 01:44:26,362 WARNING [train.py:1204] (2/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_181-78632-0 from training. Duration: 0.78 2023-10-02 01:44:26,437 WARNING [train.py:1204] (2/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_300-27235-0_sp0.9 from training. Duration: 0.9666875 2023-10-02 01:44:26,509 WARNING [train.py:1204] (2/4) Exclude cut with ID _47790_210_16187_1_1533862814066_4207519_96-177176-0_sp1.1 from training. Duration: 0.97275 2023-10-02 01:44:26,613 WARNING [train.py:1204] (2/4) Exclude cut with ID _35941_210_15379_1_1532995294515_4023089_246-348742-0_sp1.1 from training. Duration: 0.97275 2023-10-02 01:44:26,724 WARNING [train.py:1204] (2/4) Exclude cut with ID _38670_210_15379_1_1533448810659_4391059_65-169397-0_sp1.1 from training. Duration: 0.8818125 2023-10-02 01:44:26,939 WARNING [train.py:1204] (2/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_1003-99642-0_sp1.1 from training. Duration: 0.963625 2023-10-02 01:44:27,785 WARNING [train.py:1204] (2/4) Exclude cut with ID _40402_210_18219_1_1533522324115_2953150_320-14078-0_sp1.1 from training. Duration: 0.863625 2023-10-02 01:44:27,892 WARNING [train.py:1204] (2/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_367-61330-0_sp0.9 from training. Duration: 0.8333125 2023-10-02 01:44:28,062 WARNING [train.py:1204] (2/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_515-298514-0_sp1.1 from training. Duration: 0.92725 2023-10-02 01:44:28,276 WARNING [train.py:1204] (2/4) Exclude cut with ID _53731_210_7994_1_1534553979244_3757214_471-320815-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 01:44:28,336 WARNING [train.py:1204] (2/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_306-232480-0 from training. Duration: 0.96 2023-10-02 01:44:28,405 WARNING [train.py:1204] (2/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_478-162247-0_sp0.9 from training. Duration: 0.911125 2023-10-02 01:44:28,508 WARNING [train.py:1204] (2/4) Exclude cut with ID _56423_210_5854_1_1534896061049_3165969_345-284696-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 01:44:28,541 WARNING [train.py:1204] (2/4) Exclude cut with ID _44778_210_2054_1_1533798127203_7443090_600-122253-0_sp1.1 from training. Duration: 0.963625 2023-10-02 01:44:28,801 WARNING [train.py:1204] (2/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_199-295611-0 from training. Duration: 0.55 2023-10-02 01:44:29,082 WARNING [train.py:1204] (2/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_734-129761-0 from training. Duration: 0.82 2023-10-02 01:44:29,181 WARNING [train.py:1204] (2/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_349-169257-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 01:44:29,218 WARNING [train.py:1204] (2/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_202-67017-0_sp0.9 from training. Duration: 0.911125 2023-10-02 01:44:29,355 WARNING [train.py:1204] (2/4) Exclude cut with ID _16908_210_5724_1_1531119157248_3772700_61-258978-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 01:44:29,738 WARNING [train.py:1204] (2/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_172-156931-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 01:44:29,823 WARNING [train.py:1204] (2/4) Exclude cut with ID _4092_210_2151_1_1526643044449_3135759_356-278694-0_sp1.1 from training. Duration: 0.42725 2023-10-02 01:44:30,101 WARNING [train.py:1204] (2/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_116-332008-0 from training. Duration: 0.98 2023-10-02 01:44:30,337 WARNING [train.py:1204] (2/4) Exclude cut with ID _29003_210_19596_1_1532507391720_3894098_358-114789-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 01:44:30,583 WARNING [train.py:1204] (2/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_152-212533-0_sp1.1 from training. Duration: 0.8818125 2023-10-02 01:44:30,638 WARNING [train.py:1204] (2/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_267-214116-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 01:44:30,885 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7543_210_6753_1_1528541907_1399322_165-323619-0_sp0.9 from training. Duration: 0.7666875 2023-10-02 01:44:31,196 WARNING [train.py:1204] (2/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_577-332600-0_sp0.9 from training. Duration: 0.6333125 2023-10-02 01:44:31,900 WARNING [train.py:1204] (2/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_342-100157-0_sp0.9 from training. Duration: 0.6 2023-10-02 01:44:32,038 WARNING [train.py:1204] (2/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_141-89727-0_sp1.1 from training. Duration: 0.6454375 2023-10-02 01:44:32,150 WARNING [train.py:1204] (2/4) Exclude cut with ID _3992_210_1747_1_1526644816031_3731060_107-251434-0_sp1.1 from training. Duration: 0.9 2023-10-02 01:44:32,297 WARNING [train.py:1204] (2/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_401-53423-0_sp1.1 from training. Duration: 0.5 2023-10-02 01:44:32,899 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7762_210_1794_1_1528199508_4057408_422-227034-0_sp1.1 from training. Duration: 0.663625 2023-10-02 01:44:33,195 WARNING [train.py:1204] (2/4) Exclude cut with ID _4184_210_2550_1_1526730192013_2219380_102-250299-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 01:44:33,286 WARNING [train.py:1204] (2/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_253-308377-0_sp0.9 from training. Duration: 0.5444375 2023-10-02 01:44:33,355 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_271-297505-0 from training. Duration: 0.99 2023-10-02 01:44:33,448 WARNING [train.py:1204] (2/4) Exclude cut with ID _35941_210_15379_1_1532995294515_4023089_213-142973-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 01:44:33,588 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7544_210_6753_1_1528587722_3576686_162-63018-0_sp1.1 from training. Duration: 0.92725 2023-10-02 01:44:33,693 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_365-217119-0_sp0.9 from training. Duration: 0.7 2023-10-02 01:44:33,832 WARNING [train.py:1204] (2/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_662-275621-0_sp0.9 from training. Duration: 0.4666875 2023-10-02 01:44:33,833 WARNING [train.py:1204] (2/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_374-187555-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 01:44:33,901 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_289-180904-0_sp1.1 from training. Duration: 0.6909375 2023-10-02 01:44:34,035 WARNING [train.py:1204] (2/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_189-47206-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 01:44:34,084 WARNING [train.py:1204] (2/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_234-14174-0_sp1.1 from training. Duration: 0.7454375 2023-10-02 01:44:34,131 WARNING [train.py:1204] (2/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_576-76621-0_sp1.1 from training. Duration: 0.963625 2023-10-02 01:44:34,208 WARNING [train.py:1204] (2/4) Exclude cut with ID _13253_210_1925_1_1530757775819_3577873_253-246386-0 from training. Duration: 0.9 2023-10-02 01:44:34,495 WARNING [train.py:1204] (2/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_372-6869-0_sp1.1 from training. Duration: 0.4 2023-10-02 01:44:34,566 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_21434_210_4238_1_1531916339_1222252_22-54954-0_sp1.1 from training. Duration: 0.936375 2023-10-02 01:44:34,758 WARNING [train.py:1204] (2/4) Exclude cut with ID _29853_210_7497_1_1532505600851_6642110_492-170361-0_sp1.1 from training. Duration: 0.92725 2023-10-02 01:44:34,950 WARNING [train.py:1204] (2/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_505-97987-0_sp0.9 from training. Duration: 0.9555625 2023-10-02 01:44:34,990 WARNING [train.py:1204] (2/4) Exclude cut with ID _8365_210_2064_1_1528592311070_4677690_371-122295-0_sp0.9 from training. Duration: 0.611125 2023-10-02 01:44:35,117 WARNING [train.py:1204] (2/4) Exclude cut with ID _14268_210_9206_1_1530622771285_4714099_637-86630-0_sp1.1 from training. Duration: 0.463625 2023-10-02 01:44:35,128 WARNING [train.py:1204] (2/4) Exclude cut with ID _29857_210_20034_1_1532395886753_3798160_372-5678-0_sp1.1 from training. Duration: 0.963625 2023-10-02 01:44:35,295 WARNING [train.py:1204] (2/4) Exclude cut with ID _52892_210_6721_1_1534378941259_4124129_90-165108-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 01:44:35,397 WARNING [train.py:1204] (2/4) Exclude cut with ID _27052_210_5986_1_1532329197187_3585009_143-339198-0_sp1.1 from training. Duration: 0.963625 2023-10-02 01:44:36,357 WARNING [train.py:1204] (2/4) Exclude cut with ID _14268_210_9206_1_1530622771285_4714099_637-86630-0_sp0.9 from training. Duration: 0.5666875 2023-10-02 01:44:36,368 WARNING [train.py:1204] (2/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_401-255491-0 from training. Duration: 0.7 2023-10-02 01:44:36,523 WARNING [train.py:1204] (2/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_178-6946-0_sp1.1 from training. Duration: 0.97275 2023-10-02 01:44:37,008 WARNING [train.py:1204] (2/4) Exclude cut with ID _27485_210_4916_1_1532739191677_3668269_460-163885-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 01:44:37,360 WARNING [train.py:1204] (2/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_270-26053-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 01:44:37,531 WARNING [train.py:1204] (2/4) Exclude cut with ID _14170_210_5219_1_1530525949868_4469650_412-317077-0 from training. Duration: 0.75 2023-10-02 01:44:37,699 WARNING [train.py:1204] (2/4) Exclude cut with ID _5289_210_2084_1_1527678202736_3779299_142-103317-0 from training. Duration: 0.84 2023-10-02 01:44:37,759 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_150-143438-0_sp1.1 from training. Duration: 0.92725 2023-10-02 01:44:37,898 WARNING [train.py:1204] (2/4) Exclude cut with ID _27894_210_12392_1_1532415496078_6902439_668-269779-0_sp1.1 from training. Duration: 0.97275 2023-10-02 01:44:37,979 WARNING [train.py:1204] (2/4) Exclude cut with ID _18513_210_9206_1_1531618215223_3862620_56-222075-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 01:44:38,088 WARNING [train.py:1204] (2/4) Exclude cut with ID _4092_210_2151_1_1526643044449_3135759_356-278694-0 from training. Duration: 0.47 2023-10-02 01:44:38,355 WARNING [train.py:1204] (2/4) Exclude cut with ID _25808_210_13292_1_1532343514562_7366109_633-47524-0 from training. Duration: 0.99 2023-10-02 01:44:38,372 WARNING [train.py:1204] (2/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_872-264525-0 from training. Duration: 0.65 2023-10-02 01:44:38,405 WARNING [train.py:1204] (2/4) Exclude cut with ID _14632_210_9988_1_1530691279392_3923200_239-232545-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 01:44:38,485 WARNING [train.py:1204] (2/4) Exclude cut with ID _14349_210_8341_1_1530615710818_3442470_153-27432-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 01:44:38,869 WARNING [train.py:1204] (2/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_37-41919-0_sp0.9 from training. Duration: 0.9666875 2023-10-02 01:44:38,943 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_196-289637-0_sp0.9 from training. Duration: 0.8333125 2023-10-02 01:44:39,289 WARNING [train.py:1204] (2/4) Exclude cut with ID _13678_210_7497_1_1530530069226_3463170_371-3221-0_sp1.1 from training. Duration: 0.8181875 2023-10-02 01:44:39,363 WARNING [train.py:1204] (2/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_557-324258-0 from training. Duration: 0.81 2023-10-02 01:44:39,431 WARNING [train.py:1204] (2/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_1012-279473-0_sp1.1 from training. Duration: 0.6090625 2023-10-02 01:44:39,506 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_66916_210_14656_1_1535725525_326566_7-92649-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 01:44:40,383 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_9460_210_4211_1_1528954587_10049667_480-260269-0_sp1.1 from training. Duration: 0.5090625 2023-10-02 01:44:40,559 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_40896_210_15710_1_1533433956_7100802_121-155001-0_sp1.1 from training. Duration: 0.92725 2023-10-02 01:44:40,735 WARNING [train.py:1204] (2/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_188-171673-0_sp1.1 from training. Duration: 0.52725 2023-10-02 01:44:40,809 WARNING [train.py:1204] (2/4) Exclude cut with ID _47790_210_16187_1_1533862814066_4207519_2-39653-0 from training. Duration: 0.98 2023-10-02 01:44:41,310 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_23850_210_4238_1_1532232597_1632433_112-229012-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 01:44:41,441 WARNING [train.py:1204] (2/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_626-79528-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 01:44:41,615 WARNING [train.py:1204] (2/4) Exclude cut with ID _30304_210_6319_1_1532476827261_7582060_856-90052-0_sp1.1 from training. Duration: 0.963625 2023-10-02 01:44:41,664 WARNING [train.py:1204] (2/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_122-109023-0_sp0.9 from training. Duration: 0.8444375 2023-10-02 01:44:41,912 WARNING [train.py:1204] (2/4) Exclude cut with ID _64414_210_17301_1_1535243252048_3757759_37-70328-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 01:44:41,986 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_20120_210_6753_1_1531643035_5482859_622-344907-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 01:44:42,048 WARNING [train.py:1204] (2/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_410-321956-0_sp1.1 from training. Duration: 0.92725 2023-10-02 01:44:42,068 WARNING [train.py:1204] (2/4) Exclude cut with ID _7562_210_2084_1_1528887767916_3946229_186-326422-0_sp0.9 from training. Duration: 0.9333125 2023-10-02 01:44:42,180 WARNING [train.py:1204] (2/4) Exclude cut with ID _37537_210_2253_1_1533171758568_3421949_127-285192-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 01:44:42,330 WARNING [train.py:1204] (2/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_723-228168-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 01:44:42,413 WARNING [train.py:1204] (2/4) Exclude cut with ID _13678_210_7497_1_1530530069226_3463170_426-246984-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 01:44:42,438 WARNING [train.py:1204] (2/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_399-156791-0_sp0.9 from training. Duration: 0.92225 2023-10-02 01:44:42,509 WARNING [train.py:1204] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_391-22148-0 from training. Duration: 0.77 2023-10-02 01:44:42,540 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_3475_210_2028_1_1526193578_3622569_64-297072-0 from training. Duration: 0.91 2023-10-02 01:44:42,607 WARNING [train.py:1204] (2/4) Exclude cut with ID _17875_210_2087_1_1531224012907_1821339_225-280015-0_sp1.1 from training. Duration: 0.963625 2023-10-02 01:44:42,635 WARNING [train.py:1204] (2/4) Exclude cut with ID _50655_210_18628_1_1534229942193_3772154_325-246169-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 01:44:42,862 WARNING [train.py:1204] (2/4) Exclude cut with ID _29529_210_7234_1_1532393961455_3977460_597-257348-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 01:44:42,926 WARNING [train.py:1204] (2/4) Exclude cut with ID _58895_210_8750_1_1535184081320_3649089_77-173485-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 01:44:42,951 WARNING [train.py:1204] (2/4) Exclude cut with ID _6270_210_2084_1_1527850911573_3856420_26-277343-0 from training. Duration: 0.82 2023-10-02 01:44:43,140 WARNING [train.py:1204] (2/4) Exclude cut with ID _14368_210_4220_1_1530532714870_3595529_144-3287-0 from training. Duration: 0.67 2023-10-02 01:44:43,259 WARNING [train.py:1204] (2/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_293-145278-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 01:44:43,392 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_40047_210_2290_1_1533290519_2374311_257-335162-0 from training. Duration: 0.95 2023-10-02 01:44:43,546 WARNING [train.py:1204] (2/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_619-344499-0_sp0.9 from training. Duration: 0.7 2023-10-02 01:44:43,934 WARNING [train.py:1204] (2/4) Exclude cut with ID _19504_210_9994_1_1531648863242_3325220_456-315379-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 01:44:44,012 WARNING [train.py:1204] (2/4) Exclude cut with ID _55857_210_2346_1_1534644007831_6898209_631-221692-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 01:44:44,931 WARNING [train.py:1204] (2/4) Exclude cut with ID _64631_210_27767_1_1535349580068_4165160_324-3682-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 01:44:44,950 WARNING [train.py:1204] (2/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_317-211107-0 from training. Duration: 0.92 2023-10-02 01:44:45,133 WARNING [train.py:1204] (2/4) Exclude cut with ID _6789_210_1534_1_1527857937218_3118950_311-61262-0 from training. Duration: 0.65 2023-10-02 01:44:45,311 WARNING [train.py:1204] (2/4) Exclude cut with ID _28323_210_16187_1_1532480395667_4100300_365-68882-0_sp1.1 from training. Duration: 0.92725 2023-10-02 01:44:45,353 WARNING [train.py:1204] (2/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_816-181825-0_sp1.1 from training. Duration: 0.92725 2023-10-02 01:44:45,551 WARNING [train.py:1204] (2/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_132-269633-0_sp1.1 from training. Duration: 0.6181875 2023-10-02 01:44:45,561 WARNING [train.py:1204] (2/4) Exclude cut with ID _44585_210_18628_1_1533605350387_4518199_280-73534-0_sp1.1 from training. Duration: 0.8090625 2023-10-02 01:44:45,634 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_68095_210_20185_1_1535718226_8888855_1009-164755-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 01:44:45,712 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_40223_210_6228_1_1533298404_4812267_400-103813-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 01:44:45,713 WARNING [train.py:1204] (2/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_491-261923-0_sp1.1 from training. Duration: 0.8909375 2023-10-02 01:44:45,727 WARNING [train.py:1204] (2/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_712-342563-0 from training. Duration: 0.97 2023-10-02 01:44:45,795 WARNING [train.py:1204] (2/4) Exclude cut with ID _41161_210_20034_1_1533431249657_3555314_175-190032-0_sp1.1 from training. Duration: 0.963625 2023-10-02 01:44:45,933 WARNING [train.py:1204] (2/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_788-298907-0 from training. Duration: 0.89 2023-10-02 01:44:45,948 WARNING [train.py:1204] (2/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_225-70961-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 01:44:45,962 WARNING [train.py:1204] (2/4) Exclude cut with ID _58583_210_5236_1_1534726835266_3574249_235-175994-0_sp1.1 from training. Duration: 0.92725 2023-10-02 01:44:46,110 WARNING [train.py:1204] (2/4) Exclude cut with ID _12302_210_5219_1_1529924446466_8107280_907-182949-0_sp1.1 from training. Duration: 0.836375 2023-10-02 01:44:46,160 WARNING [train.py:1204] (2/4) Exclude cut with ID _54871_210_22356_1_1534665798253_3856359_94-193692-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 01:44:46,247 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_4528_210_3273_1_1527076654_3276777_342-168068-0_sp0.9 from training. Duration: 0.9666875 2023-10-02 01:44:46,300 WARNING [train.py:1204] (2/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_368-74239-0_sp1.1 from training. Duration: 0.92725 2023-10-02 01:44:46,301 WARNING [train.py:1204] (2/4) Exclude cut with ID _44831_210_13427_1_1533816011549_3689035_291-295928-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 01:44:46,372 WARNING [train.py:1204] (2/4) Exclude cut with ID _56406_210_9288_1_1534899618664_3496149_455-131997-0_sp1.1 from training. Duration: 0.97275 2023-10-02 01:44:46,379 WARNING [train.py:1204] (2/4) Exclude cut with ID _6473_210_1741_1_1527901307396_3815380_108-17335-0 from training. Duration: 0.65 2023-10-02 01:44:46,427 WARNING [train.py:1204] (2/4) Exclude cut with ID _14224_210_4381_1_1530529062037_4264010_286-135600-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 01:44:46,662 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_14056_210_4163_1_1530604483_7572827_928-321675-0_sp1.1 from training. Duration: 0.936375 2023-10-02 01:44:46,736 WARNING [train.py:1204] (2/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_81-300212-0_sp1.1 from training. Duration: 0.5454375 2023-10-02 01:44:46,770 WARNING [train.py:1204] (2/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_29-280622-0_sp0.9 from training. Duration: 0.911125 2023-10-02 01:44:46,892 WARNING [train.py:1204] (2/4) Exclude cut with ID _61079_210_4525_1_1534921008198_4011419_228-176447-0_sp1.1 from training. Duration: 0.936375 2023-10-02 01:44:47,074 WARNING [train.py:1204] (2/4) Exclude cut with ID _8394_210_6702_1_1528590504316_3540189_372-198878-0_sp1.1 from training. Duration: 0.5454375 2023-10-02 01:44:47,085 WARNING [train.py:1204] (2/4) Exclude cut with ID _64521_210_14699_1_1535243427259_3815328_244-208725-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 01:44:47,239 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8275_210_4198_1_1528455320_3892571_491-17707-0_sp0.9 from training. Duration: 0.711125 2023-10-02 01:44:47,355 WARNING [train.py:1204] (2/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_375-245677-0 from training. Duration: 0.81 2023-10-02 01:44:47,370 WARNING [train.py:1204] (2/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_690-286055-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 01:44:47,511 WARNING [train.py:1204] (2/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_89-321955-0 from training. Duration: 0.8 2023-10-02 01:44:47,549 WARNING [train.py:1204] (2/4) Exclude cut with ID _18895_210_6228_1_1531357274234_3784930_213-98721-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 01:44:47,628 WARNING [train.py:1204] (2/4) Exclude cut with ID _19708_210_9206_1_1532136650733_4238364_347-65240-0 from training. Duration: 0.97 2023-10-02 01:44:47,756 WARNING [train.py:1204] (2/4) Exclude cut with ID _64135_210_2253_1_1535677267876_7608972_498-5039-0 from training. Duration: 0.78 2023-10-02 01:44:48,079 WARNING [train.py:1204] (2/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_146-276944-0_sp1.1 from training. Duration: 0.7545625 2023-10-02 01:44:48,161 WARNING [train.py:1204] (2/4) Exclude cut with ID _63171_210_15488_1_1535104792794_3295110_155-34488-0_sp1.1 from training. Duration: 0.97275 2023-10-02 01:44:48,173 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_5438_210_1794_1_1527336687_2600009_233-58072-0_sp0.9 from training. Duration: 0.8555625 2023-10-02 01:44:48,217 WARNING [train.py:1204] (2/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_63-101996-0_sp0.9 from training. Duration: 0.97775 2023-10-02 01:44:49,132 WARNING [train.py:1204] (2/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_271-269084-0_sp1.1 from training. Duration: 0.7545625 2023-10-02 01:44:49,334 WARNING [train.py:1204] (2/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_345-167986-0_sp1.1 from training. Duration: 0.763625 2023-10-02 01:44:49,508 WARNING [train.py:1204] (2/4) Exclude cut with ID _39290_210_4220_1_1533862778760_3075118_92-311600-0_sp1.1 from training. Duration: 0.8 2023-10-02 01:44:49,514 WARNING [train.py:1204] (2/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_209-65306-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 01:44:49,602 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_26433_210_12108_1_1532157158_4056480_317-335807-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 01:44:50,093 WARNING [train.py:1204] (2/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_238-94127-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 01:44:50,146 WARNING [train.py:1204] (2/4) Exclude cut with ID _41314_210_12651_1_1533459476543_3766460_207-132038-0_sp1.1 from training. Duration: 0.8545625 2023-10-02 01:44:50,651 WARNING [train.py:1204] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_167-137255-0_sp1.1 from training. Duration: 0.5818125 2023-10-02 01:44:50,749 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_2009_210_2071_1_1525481134_4588937_385-302100-0_sp0.9 from training. Duration: 0.9666875 2023-10-02 01:44:51,480 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_15517_210_6753_1_1530954008_3487336_253-110617-0_sp1.1 from training. Duration: 0.936375 2023-10-02 01:44:51,686 WARNING [train.py:1204] (2/4) Exclude cut with ID _39290_210_4220_1_1533862778760_3075118_92-311600-0_sp0.9 from training. Duration: 0.97775 2023-10-02 01:44:51,728 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_42-142473-0 from training. Duration: 0.72 2023-10-02 01:44:51,774 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_2296_210_2087_1_1525503448_3397335_57-185430-0 from training. Duration: 0.6 2023-10-02 01:44:51,792 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_11305_210_1794_1_1529750428_7178148_257-312870-0_sp0.9 from training. Duration: 0.97775 2023-10-02 01:44:51,924 WARNING [train.py:1204] (2/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_222-198127-0 from training. Duration: 0.84 2023-10-02 01:44:52,037 WARNING [train.py:1204] (2/4) Exclude cut with ID _65263_210_22356_1_1535763598691_3921037_423-202699-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 01:44:52,096 WARNING [train.py:1204] (2/4) Exclude cut with ID _15037_210_2253_1_1530752294104_3267650_13-130361-0 from training. Duration: 0.99 2023-10-02 01:44:52,637 WARNING [train.py:1204] (2/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_628-198080-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 01:44:52,749 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_3171_210_2087_1_1526109084_2330007_37-64334-0 from training. Duration: 0.82 2023-10-02 01:44:52,788 WARNING [train.py:1204] (2/4) Exclude cut with ID _5287_210_2084_1_1527418883040_3878670_12-221186-0_sp1.1 from training. Duration: 0.8909375 2023-10-02 01:44:53,601 WARNING [train.py:1204] (2/4) Exclude cut with ID _6653_210_2629_1_1527919172441_6732679_1103-56445-0_sp1.1 from training. Duration: 0.663625 2023-10-02 01:44:53,717 WARNING [train.py:1204] (2/4) Exclude cut with ID _65263_210_22356_1_1535763598691_3921037_169-140068-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 01:44:54,132 WARNING [train.py:1204] (2/4) Exclude cut with ID IC0671W0118-331961 from training. Duration: 0.896 2023-10-02 01:44:54,184 WARNING [train.py:1204] (2/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_402-102311-0_sp1.1 from training. Duration: 0.6090625 2023-10-02 01:44:54,217 WARNING [train.py:1204] (2/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_202-293523-0 from training. Duration: 0.72 2023-10-02 01:44:54,280 WARNING [train.py:1204] (2/4) Exclude cut with ID IC0679W0038-335846 from training. Duration: 0.768 2023-10-02 01:44:54,308 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_37357_210_17024_1_1533089504_2316121_79-198866-0_sp1.1 from training. Duration: 0.936375 2023-10-02 01:44:54,558 WARNING [train.py:1204] (2/4) Exclude cut with ID _7606_210_4915_1_1528894253277_5080690_292-271285-0_sp1.1 from training. Duration: 0.963625 2023-10-02 01:44:54,568 WARNING [train.py:1204] (2/4) Exclude cut with ID _3541_210_2477_1_1526475408709_3263973_377-296839-0_sp0.9 from training. Duration: 0.611125 2023-10-02 01:44:54,580 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_45886_210_17024_1_1533704917_2435333_58-131069-0_sp1.1 from training. Duration: 0.963625 2023-10-02 01:44:54,768 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6961_210_2087_1_1527937168_6815011_395-185060-0 from training. Duration: 0.95 2023-10-02 01:44:54,977 WARNING [train.py:1204] (2/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_304-266479-0_sp1.1 from training. Duration: 0.97275 2023-10-02 01:44:55,041 WARNING [train.py:1204] (2/4) Exclude cut with ID _3867_210_1883_1_1526641200575_4475830_319-349850-0_sp1.1 from training. Duration: 0.6090625 2023-10-02 01:44:55,058 WARNING [train.py:1204] (2/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_304-59227-0_sp0.9 from training. Duration: 0.8555625 2023-10-02 01:44:55,114 WARNING [train.py:1204] (2/4) Exclude cut with ID _8021_210_7994_1_1528376451358_3653069_100-140231-0_sp0.9 from training. Duration: 0.7444375 2023-10-02 01:44:55,183 WARNING [train.py:1204] (2/4) Exclude cut with ID _8833_210_5219_1_1528592665210_7553059_329-125156-0_sp0.9 from training. Duration: 0.911125 2023-10-02 01:44:55,992 WARNING [train.py:1204] (2/4) Exclude cut with ID _14459_210_2228_1_1531443641946_7516700_792-287234-0_sp1.1 from training. Duration: 0.92725 2023-10-02 01:44:56,014 WARNING [train.py:1204] (2/4) Exclude cut with ID _19708_210_9206_1_1532136650733_4238364_347-65240-0_sp1.1 from training. Duration: 0.8818125 2023-10-02 01:44:56,392 WARNING [train.py:1204] (2/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_70-27732-0 from training. Duration: 0.94 2023-10-02 01:44:56,812 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_397-132565-0 from training. Duration: 0.99 2023-10-02 01:44:56,815 WARNING [train.py:1204] (2/4) Exclude cut with ID _49728_210_12651_1_1534041034294_6788150_624-195301-0 from training. Duration: 0.85 2023-10-02 01:44:56,852 WARNING [train.py:1204] (2/4) Exclude cut with ID _55429_210_10626_1_1534467588527_7174143_377-169391-0_sp1.1 from training. Duration: 0.87275 2023-10-02 01:44:56,955 WARNING [train.py:1204] (2/4) Exclude cut with ID _49376_210_5986_1_1533985278732_3370740_354-344695-0_sp1.1 from training. Duration: 0.9 2023-10-02 01:44:58,167 WARNING [train.py:1204] (2/4) Exclude cut with ID _4697_210_2069_1_1526815801953_3823098_276-96593-0_sp1.1 from training. Duration: 0.7 2023-10-02 01:44:58,185 WARNING [train.py:1204] (2/4) Exclude cut with ID _15012_210_1968_1_1530856820429_5095350_688-178849-0_sp0.9 from training. Duration: 0.711125 2023-10-02 01:44:58,218 WARNING [train.py:1204] (2/4) Exclude cut with ID _25565_210_12392_1_1532423009382_3952098_372-69023-0_sp1.1 from training. Duration: 0.936375 2023-10-02 01:44:58,265 WARNING [train.py:1204] (2/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_61-327062-0_sp1.1 from training. Duration: 0.736375 2023-10-02 01:44:58,305 WARNING [train.py:1204] (2/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_215-141059-0 from training. Duration: 0.62 2023-10-02 01:44:58,677 WARNING [train.py:1204] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_6-226750-0_sp1.1 from training. Duration: 0.8545625 2023-10-02 01:44:58,766 WARNING [train.py:1204] (2/4) Exclude cut with ID _14348_210_8341_1_1530599434996_3778640_240-232370-0_sp1.1 from training. Duration: 0.763625 2023-10-02 01:44:59,055 WARNING [train.py:1204] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_468-189058-0 from training. Duration: 0.96 2023-10-02 01:44:59,766 WARNING [train.py:1204] (2/4) Exclude cut with ID _21987_210_5854_1_1531821500482_3637038_247-93340-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 01:45:00,267 WARNING [train.py:1204] (2/4) Exclude cut with ID _66868_210_9527_1_1535509287954_4561140_436-191602-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 01:45:00,674 WARNING [train.py:1204] (2/4) Exclude cut with ID _18895_210_6228_1_1531357274234_3784930_394-58203-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 01:45:00,733 WARNING [train.py:1204] (2/4) Exclude cut with ID _58605_210_12210_1_1534723195939_3904259_258-281498-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 01:45:00,773 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_33348_210_5632_1_1533086912_3324030_245-292752-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 01:45:00,802 WARNING [train.py:1204] (2/4) Exclude cut with ID _67831_210_12392_1_1535612464202_3092612_346-2148-0 from training. Duration: 0.93 2023-10-02 01:45:00,867 WARNING [train.py:1204] (2/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_369-222060-0_sp0.9 from training. Duration: 0.788875 2023-10-02 01:45:00,937 WARNING [train.py:1204] (2/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_245-174553-0 from training. Duration: 0.88 2023-10-02 01:45:00,943 WARNING [train.py:1204] (2/4) Exclude cut with ID _3541_210_2477_1_1526475408709_3263973_231-104736-0_sp1.1 from training. Duration: 0.7090625 2023-10-02 01:45:00,956 WARNING [train.py:1204] (2/4) Exclude cut with ID _44585_210_18628_1_1533605350387_4518199_280-73534-0 from training. Duration: 0.89 2023-10-02 01:45:01,121 WARNING [train.py:1204] (2/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_22-201476-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 01:45:01,358 WARNING [train.py:1204] (2/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_325-62802-0_sp0.9 from training. Duration: 0.9666875 2023-10-02 01:45:01,378 WARNING [train.py:1204] (2/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_199-274062-0_sp1.1 from training. Duration: 0.97275 2023-10-02 01:45:01,381 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_8-319957-0_sp1.1 from training. Duration: 0.87275 2023-10-02 01:45:01,403 WARNING [train.py:1204] (2/4) Exclude cut with ID _29343_210_4381_1_1532602757752_3674109_224-349726-0_sp1.1 from training. Duration: 0.936375 2023-10-02 01:45:01,409 WARNING [train.py:1204] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_520-126729-0_sp1.1 from training. Duration: 0.636375 2023-10-02 01:45:01,561 WARNING [train.py:1204] (2/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_239-22076-0_sp1.1 from training. Duration: 0.77275 2023-10-02 01:45:01,693 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9012W0011-501146 from training. Duration: 0.896 2023-10-02 01:45:02,409 WARNING [train.py:1204] (2/4) Exclude cut with ID _29857_210_20034_1_1532395886753_3798160_279-185801-0_sp1.1 from training. Duration: 0.82725 2023-10-02 01:45:02,466 WARNING [train.py:1204] (2/4) Exclude cut with ID _43552_210_8913_1_1533726031059_3523429_261-223933-0_sp1.1 from training. Duration: 0.92725 2023-10-02 01:45:02,679 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9029W0025-509426 from training. Duration: 0.896 2023-10-02 01:45:02,753 WARNING [train.py:1204] (2/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_337-181502-0_sp1.1 from training. Duration: 0.5909375 2023-10-02 01:45:02,766 WARNING [train.py:1204] (2/4) Exclude cut with ID _68635_210_9317_1_1535770813410_4315870_159-332599-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 01:45:03,273 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_161-276996-0 from training. Duration: 0.95 2023-10-02 01:45:03,307 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7088_210_1874_1_1527939776_1101113_127-68870-0_sp1.1 from training. Duration: 0.936375 2023-10-02 01:45:03,473 WARNING [train.py:1204] (2/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_322-217220-0_sp1.1 from training. Duration: 0.7818125 2023-10-02 01:45:03,548 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9072W0002-530636 from training. Duration: 0.768 2023-10-02 01:45:03,647 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_340-336408-0_sp1.1 from training. Duration: 0.8454375 2023-10-02 01:45:03,715 WARNING [train.py:1204] (2/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_395-37222-0 from training. Duration: 0.76 2023-10-02 01:45:03,718 WARNING [train.py:1204] (2/4) Exclude cut with ID _64099_210_17301_1_1535526977656_1578188_159-209156-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 01:45:03,791 WARNING [train.py:1204] (2/4) Exclude cut with ID _53950_210_5816_1_1534417264919_3862951_28-29681-0_sp1.1 from training. Duration: 0.92725 2023-10-02 01:45:03,885 WARNING [train.py:1204] (2/4) Exclude cut with ID _4681_210_3525_1_1527250689464_120889_15-126760-0_sp1.1 from training. Duration: 0.736375 2023-10-02 01:45:04,004 WARNING [train.py:1204] (2/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_242-3472-0_sp1.1 from training. Duration: 0.663625 2023-10-02 01:45:04,057 WARNING [train.py:1204] (2/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_436-186311-0_sp0.9 from training. Duration: 0.72225 2023-10-02 01:45:04,079 WARNING [train.py:1204] (2/4) Exclude cut with ID _3675_210_4557_1_1526639934408_4182089_452-172147-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 01:45:04,212 WARNING [train.py:1204] (2/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_80-147301-0 from training. Duration: 0.84 2023-10-02 01:45:04,244 WARNING [train.py:1204] (2/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_351-107588-0_sp1.1 from training. Duration: 0.92725 2023-10-02 01:45:04,369 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9111W0002-550781 from training. Duration: 0.896 2023-10-02 01:45:04,714 WARNING [train.py:1204] (2/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_465-156500-0_sp1.1 from training. Duration: 0.6545625 2023-10-02 01:45:04,938 WARNING [train.py:1204] (2/4) Exclude cut with ID _13938_210_9988_1_1530518718978_3693960_304-112784-0_sp1.1 from training. Duration: 0.4181875 2023-10-02 01:45:05,096 WARNING [train.py:1204] (2/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_202-67017-0_sp1.1 from training. Duration: 0.7454375 2023-10-02 01:45:05,158 WARNING [train.py:1204] (2/4) Exclude cut with ID _58414_210_8254_1_1535421609162_7151182_448-4358-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 01:45:05,237 WARNING [train.py:1204] (2/4) Exclude cut with ID _66840_210_5219_1_1535452168426_4196349_203-243234-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 01:45:05,314 WARNING [train.py:1204] (2/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_201-213513-0_sp1.1 from training. Duration: 0.97275 2023-10-02 01:45:05,695 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6291_210_3273_1_1527683649_1078498_117-187560-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 01:45:05,867 WARNING [train.py:1204] (2/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_135-174080-0_sp0.9 from training. Duration: 0.588875 2023-10-02 01:45:05,915 WARNING [train.py:1204] (2/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_45-265684-0 from training. Duration: 0.72 2023-10-02 01:45:06,677 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_435-313305-0_sp1.1 from training. Duration: 0.6454375 2023-10-02 01:45:06,718 WARNING [train.py:1204] (2/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_235-307337-0_sp1.1 from training. Duration: 0.8545625 2023-10-02 01:45:06,811 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7762_210_1794_1_1528199508_4057408_419-125009-0_sp0.9 from training. Duration: 0.92225 2023-10-02 01:45:06,915 WARNING [train.py:1204] (2/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_215-141059-0_sp1.1 from training. Duration: 0.563625 2023-10-02 01:45:06,995 WARNING [train.py:1204] (2/4) Exclude cut with ID _27013_210_1389_1_1532437173240_3252649_222-125573-0_sp1.1 from training. Duration: 0.8909375 2023-10-02 01:45:07,279 WARNING [train.py:1204] (2/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_126-274694-0_sp1.1 from training. Duration: 0.8909375 2023-10-02 01:45:07,471 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_2592_210_2290_1_1525697025_4882925_157-70512-0_sp1.1 from training. Duration: 0.82725 2023-10-02 01:45:07,538 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_292-265928-0_sp0.9 from training. Duration: 0.5666875 2023-10-02 01:45:07,673 WARNING [train.py:1204] (2/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_1211-226066-0_sp1.1 from training. Duration: 0.6909375 2023-10-02 01:45:07,737 WARNING [train.py:1204] (2/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_394-265747-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 01:45:07,982 WARNING [train.py:1204] (2/4) Exclude cut with ID _48782_210_12658_1_1534467701544_2300422_239-67139-0_sp1.1 from training. Duration: 0.97275 2023-10-02 01:45:08,115 WARNING [train.py:1204] (2/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_329-113173-0_sp1.1 from training. Duration: 0.563625 2023-10-02 01:45:08,312 WARNING [train.py:1204] (2/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_81-75849-0_sp0.9 from training. Duration: 0.4333125 2023-10-02 01:45:08,445 WARNING [train.py:1204] (2/4) Exclude cut with ID _9235_210_5219_1_1528891154253_8046120_1003-101638-0 from training. Duration: 0.73 2023-10-02 01:45:08,474 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9309W0189-640316 from training. Duration: 0.64 2023-10-02 01:45:08,673 WARNING [train.py:1204] (2/4) Exclude cut with ID _45329_210_7005_1_1534157951211_6774339_503-287883-0_sp1.1 from training. Duration: 0.8 2023-10-02 01:45:09,179 WARNING [train.py:1204] (2/4) Exclude cut with ID _8833_210_5219_1_1528592665210_7553059_611-80313-0 from training. Duration: 0.64 2023-10-02 01:45:09,264 WARNING [train.py:1204] (2/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_335-106626-0_sp1.1 from training. Duration: 0.963625 2023-10-02 01:45:09,527 WARNING [train.py:1204] (2/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_237-44332-0_sp1.1 from training. Duration: 0.8545625 2023-10-02 01:45:09,742 WARNING [train.py:1204] (2/4) Exclude cut with ID _46952_210_5986_1_1534230010357_3107379_178-147341-0_sp1.1 from training. Duration: 0.97275 2023-10-02 01:45:09,779 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_13091_210_7925_1_1530609447_8987354_946-154426-0_sp1.1 from training. Duration: 0.936375 2023-10-02 01:45:09,793 WARNING [train.py:1204] (2/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_326-17061-0_sp1.1 from training. Duration: 0.8545625 2023-10-02 01:45:10,019 WARNING [train.py:1204] (2/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_38-169920-0_sp1.1 from training. Duration: 0.82725 2023-10-02 01:45:10,242 WARNING [train.py:1204] (2/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_283-220372-0 from training. Duration: 0.56 2023-10-02 01:45:10,249 WARNING [train.py:1204] (2/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_252-216674-0_sp1.1 from training. Duration: 0.4909375 2023-10-02 01:45:10,993 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_202-91290-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 01:45:11,118 WARNING [train.py:1204] (2/4) Exclude cut with ID _41314_210_12651_1_1533459476543_3766460_80-74184-0 from training. Duration: 0.83 2023-10-02 01:45:11,433 WARNING [train.py:1204] (2/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_411-271358-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 01:45:11,919 WARNING [train.py:1204] (2/4) Exclude cut with ID _68777_210_4353_1_1535810390731_3117904_129-166012-0 from training. Duration: 0.85 2023-10-02 01:45:12,006 WARNING [train.py:1204] (2/4) Exclude cut with ID _44982_210_1511_1_1534312820108_3726029_410-109759-0_sp1.1 from training. Duration: 0.963625 2023-10-02 01:45:12,010 WARNING [train.py:1204] (2/4) Exclude cut with ID _14224_210_4381_1_1530529062037_4264010_364-22817-0_sp1.1 from training. Duration: 0.6454375 2023-10-02 01:45:12,180 WARNING [train.py:1204] (2/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_89-242335-0 from training. Duration: 0.95 2023-10-02 01:45:12,188 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_12868_210_6753_1_1530702479_3610564_151-325626-0 from training. Duration: 0.97 2023-10-02 01:45:12,189 WARNING [train.py:1204] (2/4) Exclude cut with ID _6275_210_2084_1_1528110062213_7669049_786-264332-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 01:45:12,371 WARNING [train.py:1204] (2/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_35-153886-0_sp1.1 from training. Duration: 0.763625 2023-10-02 01:45:12,464 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_24007_210_1794_1_1531918925_1663875_116-100916-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 01:45:12,492 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_162-262717-0_sp1.1 from training. Duration: 0.7545625 2023-10-02 01:45:13,063 WARNING [train.py:1204] (2/4) Exclude cut with ID _8263_210_1664_1_1528438953494_294100_40-204731-0 from training. Duration: 0.5 2023-10-02 01:45:13,163 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_12868_210_6753_1_1530702479_3610564_416-293746-0 from training. Duration: 0.9 2023-10-02 01:45:13,316 WARNING [train.py:1204] (2/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_605-219732-0 from training. Duration: 0.94 2023-10-02 01:45:13,388 WARNING [train.py:1204] (2/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_454-123002-0 from training. Duration: 0.69 2023-10-02 01:45:13,393 WARNING [train.py:1204] (2/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_25-304273-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 01:45:13,458 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6860_210_4852_1_1527917457_3833327_451-39702-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 01:45:13,487 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_304-16505-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 01:45:13,513 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6291_210_3273_1_1527683649_1078498_4-266348-0_sp1.1 from training. Duration: 0.7090625 2023-10-02 01:45:13,592 WARNING [train.py:1204] (2/4) Exclude cut with ID ID0149W0001-753701 from training. Duration: 0.989 2023-10-02 01:45:13,833 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_41251_210_15368_1_1533429767_4820695_394-55393-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 01:45:13,941 WARNING [train.py:1204] (2/4) Exclude cut with ID _8793_210_6310_1_1528542985139_2310009_121-179782-0_sp1.1 from training. Duration: 0.72725 2023-10-02 01:45:13,954 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_2729_210_3528_1_1525863505_3882214_421-73047-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 01:45:14,022 WARNING [train.py:1204] (2/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_278-154492-0_sp0.9 from training. Duration: 0.8444375 2023-10-02 01:45:14,323 WARNING [train.py:1204] (2/4) Exclude cut with ID _14170_210_5219_1_1530525949868_4469650_412-317077-0_sp0.9 from training. Duration: 0.8333125 2023-10-02 01:45:14,420 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_727-138317-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 01:45:14,457 WARNING [train.py:1204] (2/4) Exclude cut with ID _25808_210_13292_1_1532343514562_7366109_345-335048-0_sp1.1 from training. Duration: 0.92725 2023-10-02 01:45:14,469 WARNING [train.py:1204] (2/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_506-23200-0 from training. Duration: 0.97 2023-10-02 01:45:14,550 WARNING [train.py:1204] (2/4) Exclude cut with ID _44718_210_5816_1_1534050051869_6469030_643-245187-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 01:45:14,554 WARNING [train.py:1204] (2/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_42-186162-0_sp1.1 from training. Duration: 0.836375 2023-10-02 01:45:14,565 WARNING [train.py:1204] (2/4) Exclude cut with ID _3231_210_2106_1_1526205864934_6784220_171-256004-0_sp1.1 from training. Duration: 0.7818125 2023-10-02 01:45:14,582 WARNING [train.py:1204] (2/4) Exclude cut with ID _25914_210_6400_1_1532141317058_644740_43-248478-0_sp1.1 from training. Duration: 0.936375 2023-10-02 01:45:15,229 WARNING [train.py:1204] (2/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_577-287150-0 from training. Duration: 0.82 2023-10-02 01:45:15,420 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_46015_210_7234_1_1533863319_3087837_376-276068-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 01:45:15,510 WARNING [train.py:1204] (2/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_51-200220-0 from training. Duration: 0.56 2023-10-02 01:45:15,604 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_452-96101-0_sp1.1 from training. Duration: 0.963625 2023-10-02 01:45:15,785 WARNING [train.py:1204] (2/4) Exclude cut with ID _13252_210_1925_1_1530671372047_3336909_263-168388-0_sp1.1 from training. Duration: 0.7818125 2023-10-02 01:45:15,786 WARNING [train.py:1204] (2/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_469-94673-0 from training. Duration: 0.79 2023-10-02 01:45:15,910 WARNING [train.py:1204] (2/4) Exclude cut with ID _14922_210_6228_1_1530707435273_3693310_303-228738-0_sp1.1 from training. Duration: 0.763625 2023-10-02 01:45:15,961 WARNING [train.py:1204] (2/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_262-250826-0_sp1.1 from training. Duration: 0.9 2023-10-02 01:45:16,007 WARNING [train.py:1204] (2/4) Exclude cut with ID _68777_210_4353_1_1535810390731_3117904_129-166012-0_sp1.1 from training. Duration: 0.77275 2023-10-02 01:45:16,166 WARNING [train.py:1204] (2/4) Exclude cut with ID _58895_210_8750_1_1535184081320_3649089_265-323810-0 from training. Duration: 0.96 2023-10-02 01:45:16,268 WARNING [train.py:1204] (2/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_120-76843-0_sp0.9 from training. Duration: 0.611125 2023-10-02 01:45:16,338 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7544_210_6753_1_1528587722_3576686_295-258479-0_sp0.9 from training. Duration: 0.9333125 2023-10-02 01:45:16,553 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7046_210_6753_1_1527895337_6920917_783-278109-0 from training. Duration: 0.77 2023-10-02 01:45:16,600 WARNING [train.py:1204] (2/4) Exclude cut with ID _42326_210_7770_1_1533814282531_3707269_113-308323-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 01:45:16,658 WARNING [train.py:1204] (2/4) Exclude cut with ID _7852_210_5219_1_1528286471316_8139400_242-105336-0_sp0.9 from training. Duration: 0.8 2023-10-02 01:45:16,885 WARNING [train.py:1204] (2/4) Exclude cut with ID _56235_210_10640_1_1534557335235_4161099_114-277905-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 01:45:17,242 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_13118_210_7925_1_1530755612_7608297_42-66683-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 01:45:17,466 WARNING [train.py:1204] (2/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_901-79438-0_sp1.1 from training. Duration: 0.8545625 2023-10-02 01:45:17,520 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_30280_210_1881_1_1532872779_3660139_211-182194-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 01:45:18,044 WARNING [train.py:1204] (2/4) Exclude cut with ID _14898_210_9206_1_1530766812377_4177190_621-45732-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 01:45:18,059 WARNING [train.py:1204] (2/4) Exclude cut with ID _35350_210_16040_1_1533040191183_4075299_224-133775-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 01:45:18,310 WARNING [train.py:1204] (2/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_147-293311-0_sp1.1 from training. Duration: 0.8545625 2023-10-02 01:45:18,480 WARNING [train.py:1204] (2/4) Exclude cut with ID _12302_210_5219_1_1529924446466_8107280_835-279754-0_sp1.1 from training. Duration: 0.8181875 2023-10-02 01:45:18,805 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8453_210_4211_1_1528502940_8562730_347-330673-0_sp0.9 from training. Duration: 0.8 2023-10-02 01:45:19,508 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_213-103282-0_sp0.9 from training. Duration: 0.8666875 2023-10-02 01:45:19,601 WARNING [train.py:1204] (2/4) Exclude cut with ID _49728_210_12651_1_1534041034294_6788150_66-43496-0_sp1.1 from training. Duration: 0.97275 2023-10-02 01:45:19,609 WARNING [train.py:1204] (2/4) Exclude cut with ID _19023_210_4353_1_1531378892552_5820780_28-36615-0_sp1.1 from training. Duration: 0.8818125 2023-10-02 01:45:19,809 WARNING [train.py:1204] (2/4) Exclude cut with ID _14349_210_8341_1_1530615710818_3442470_115-285975-0 from training. Duration: 0.86 2023-10-02 01:45:19,842 WARNING [train.py:1204] (2/4) Exclude cut with ID _12734_210_8341_1_1530183614296_3891929_236-47464-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 01:45:19,893 WARNING [train.py:1204] (2/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_177-236809-0 from training. Duration: 0.49 2023-10-02 01:45:19,956 WARNING [train.py:1204] (2/4) Exclude cut with ID _13678_210_7497_1_1530530069226_3463170_371-3221-0 from training. Duration: 0.9 2023-10-02 01:45:19,971 WARNING [train.py:1204] (2/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_48-338976-0 from training. Duration: 0.89 2023-10-02 01:45:20,087 WARNING [train.py:1204] (2/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_1167-309744-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 01:45:20,417 WARNING [train.py:1204] (2/4) Exclude cut with ID _30327_210_16049_1_1532484161295_3924169_401-70819-0_sp0.9 from training. Duration: 0.9555625 2023-10-02 01:45:20,431 WARNING [train.py:1204] (2/4) Exclude cut with ID _55134_210_21982_1_1534590028377_3882170_346-91022-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 01:45:20,685 WARNING [train.py:1204] (2/4) Exclude cut with ID _14459_210_2228_1_1531443641946_7516700_174-118801-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 01:45:20,750 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_269-278006-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 01:45:20,820 WARNING [train.py:1204] (2/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_390-339137-0_sp0.9 from training. Duration: 0.62225 2023-10-02 01:45:20,971 WARNING [train.py:1204] (2/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_253-308377-0_sp1.1 from training. Duration: 0.4454375 2023-10-02 01:45:20,992 WARNING [train.py:1204] (2/4) Exclude cut with ID _23825_210_13449_1_1531915313526_3497150_240-271367-0_sp1.1 from training. Duration: 0.8818125 2023-10-02 01:45:21,090 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1015-343884-0_sp0.9 from training. Duration: 0.688875 2023-10-02 01:45:21,420 WARNING [train.py:1204] (2/4) Exclude cut with ID _23514_210_1389_1_1531832139785_3031439_335-48210-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 01:45:21,735 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_309-65177-0_sp1.1 from training. Duration: 0.8 2023-10-02 01:45:21,798 WARNING [train.py:1204] (2/4) Exclude cut with ID _14368_210_4220_1_1530532714870_3595529_231-137892-0 from training. Duration: 0.99 2023-10-02 01:45:21,985 WARNING [train.py:1204] (2/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_57-270037-0 from training. Duration: 0.77 2023-10-02 01:45:21,989 WARNING [train.py:1204] (2/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_465-214726-0_sp1.1 from training. Duration: 0.7818125 2023-10-02 01:45:22,069 WARNING [train.py:1204] (2/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_106-300985-0 from training. Duration: 0.9 2023-10-02 01:45:22,074 WARNING [train.py:1204] (2/4) Exclude cut with ID _14632_210_9988_1_1530691279392_3923200_161-129530-0_sp1.1 from training. Duration: 0.87275 2023-10-02 01:45:22,090 WARNING [train.py:1204] (2/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_514-88370-0_sp1.1 from training. Duration: 0.963625 2023-10-02 01:45:22,297 WARNING [train.py:1204] (2/4) Exclude cut with ID _56495_210_4234_1_1534748080334_4136219_305-334505-0_sp1.1 from training. Duration: 0.92725 2023-10-02 01:45:22,366 WARNING [train.py:1204] (2/4) Exclude cut with ID _42875_210_14856_1_1533704397487_4012433_61-264914-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 01:45:22,618 WARNING [train.py:1204] (2/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_91-148831-0 from training. Duration: 0.99 2023-10-02 01:45:22,865 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_435-313305-0_sp0.9 from training. Duration: 0.788875 2023-10-02 01:45:23,005 WARNING [train.py:1204] (2/4) Exclude cut with ID _16744_210_5986_1_1531724405384_3907567_242-111316-0_sp1.1 from training. Duration: 0.8454375 2023-10-02 01:45:23,692 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_257-20733-0 from training. Duration: 0.76 2023-10-02 01:45:23,748 WARNING [train.py:1204] (2/4) Exclude cut with ID _8064_210_5399_1_1528372299291_4282973_4-91037-0_sp1.1 from training. Duration: 0.8 2023-10-02 01:45:23,838 WARNING [train.py:1204] (2/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_3-276701-0 from training. Duration: 0.91 2023-10-02 01:45:23,993 WARNING [train.py:1204] (2/4) Exclude cut with ID _3992_210_1747_1_1526644816031_3731060_36-147855-0_sp1.1 from training. Duration: 0.7090625 2023-10-02 01:45:24,086 WARNING [train.py:1204] (2/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_1296-298671-0_sp1.1 from training. Duration: 0.963625 2023-10-02 01:45:24,173 WARNING [train.py:1204] (2/4) Exclude cut with ID _68965_210_1883_1_1535698962272_3497490_309-334593-0_sp1.1 from training. Duration: 0.92725 2023-10-02 01:45:24,268 WARNING [train.py:1204] (2/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_128-296581-0 from training. Duration: 0.74 2023-10-02 01:45:24,269 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_17613_210_2151_1_1531137569_2366160_170-299079-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 01:45:24,270 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6287_210_3273_1_1527509506_2653716_182-199887-0_sp1.1 from training. Duration: 0.463625 2023-10-02 01:45:24,413 WARNING [train.py:1204] (2/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_80-135200-0_sp1.1 from training. Duration: 0.7181875 2023-10-02 01:45:24,585 WARNING [train.py:1204] (2/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_34-183685-0 from training. Duration: 0.98 2023-10-02 01:45:24,605 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8278_210_2550_1_1528637099_4220413_418-210155-0_sp1.1 from training. Duration: 0.92725 2023-10-02 01:45:24,616 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8853_210_3273_1_1528633899_818968_51-187685-0_sp0.9 from training. Duration: 0.7666875 2023-10-02 01:45:24,632 WARNING [train.py:1204] (2/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_332-221057-0_sp1.1 from training. Duration: 0.6181875 2023-10-02 01:45:24,730 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_2221_210_1988_1_1525597051_4935541_415-296882-0 from training. Duration: 0.77 2023-10-02 01:45:24,763 WARNING [train.py:1204] (2/4) Exclude cut with ID _33640_210_2655_1_1533117605479_4855469_770-278185-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 01:45:24,802 WARNING [train.py:1204] (2/4) Exclude cut with ID _5287_210_2084_1_1527418883040_3878670_333-48508-0_sp1.1 from training. Duration: 0.5454375 2023-10-02 01:45:24,855 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_43802_210_2290_1_1533516203_8829504_142-276839-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 01:45:24,997 WARNING [train.py:1204] (2/4) Exclude cut with ID _28394_210_8913_1_1532430049518_3650118_126-139371-0_sp1.1 from training. Duration: 0.92725 2023-10-02 01:45:25,010 WARNING [train.py:1204] (2/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_76-203867-0 from training. Duration: 0.72 2023-10-02 01:45:25,054 WARNING [train.py:1204] (2/4) Exclude cut with ID _6194_210_1534_1_1527511826160_3941194_14-282876-0_sp0.9 from training. Duration: 0.788875 2023-10-02 01:45:25,432 WARNING [train.py:1204] (2/4) Exclude cut with ID _35355_210_16040_1_1533212988977_4016229_74-190076-0_sp1.1 from training. Duration: 0.72725 2023-10-02 01:45:25,571 WARNING [train.py:1204] (2/4) Exclude cut with ID _33920_210_4220_1_1533257851500_3084399_198-334063-0 from training. Duration: 0.94 2023-10-02 01:45:25,690 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6291_210_3273_1_1527683649_1078498_4-266348-0_sp0.9 from training. Duration: 0.8666875 2023-10-02 01:45:25,845 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_23856_210_4238_1_1531994785_3087934_122-38075-0_sp1.1 from training. Duration: 0.936375 2023-10-02 01:45:25,929 WARNING [train.py:1204] (2/4) Exclude cut with ID _4697_210_2069_1_1526815801953_3823098_276-96593-0_sp0.9 from training. Duration: 0.8555625 2023-10-02 01:45:25,991 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_308-278734-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 01:45:25,992 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_5190_210_4557_1_1527245078_2965646_353-250603-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 01:45:26,190 WARNING [train.py:1204] (2/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_472-216663-0 from training. Duration: 0.92 2023-10-02 01:45:26,400 WARNING [train.py:1204] (2/4) Exclude cut with ID _7537_210_2084_1_1528502395431_3924359_14-312906-0 from training. Duration: 0.84 2023-10-02 01:45:26,447 WARNING [train.py:1204] (2/4) Exclude cut with ID _29003_210_19596_1_1532507391720_3894098_559-236660-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 01:45:26,614 WARNING [train.py:1204] (2/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_312-204798-0_sp1.1 from training. Duration: 0.8454375 2023-10-02 01:45:26,801 WARNING [train.py:1204] (2/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_17-309070-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 01:45:27,017 WARNING [train.py:1204] (2/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_344-152526-0_sp0.9 from training. Duration: 0.588875 2023-10-02 01:45:27,155 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_41452_210_7291_1_1533437687_3941281_405-11559-0_sp1.1 from training. Duration: 0.82725 2023-10-02 01:45:27,184 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_48730_210_5399_1_1533885886_2212769_157-124052-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 01:45:27,237 WARNING [train.py:1204] (2/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_415-78792-0 from training. Duration: 0.81 2023-10-02 01:45:27,250 WARNING [train.py:1204] (2/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_345-340690-0_sp1.1 from training. Duration: 0.8454375 2023-10-02 01:45:27,991 WARNING [train.py:1204] (2/4) Exclude cut with ID _23825_210_13449_1_1531915313526_3497150_124-8138-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 01:45:28,030 WARNING [train.py:1204] (2/4) Exclude cut with ID _49258_210_7994_1_1534122017863_3738861_194-24864-0_sp1.1 from training. Duration: 0.936375 2023-10-02 01:45:28,052 WARNING [train.py:1204] (2/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_369-222060-0_sp1.1 from training. Duration: 0.6454375 2023-10-02 01:45:28,190 WARNING [train.py:1204] (2/4) Exclude cut with ID _12369_210_4381_1_1530356078581_4388520_480-13146-0_sp0.9 from training. Duration: 0.9444375 2023-10-02 01:45:28,341 WARNING [train.py:1204] (2/4) Exclude cut with ID _13253_210_1925_1_1530757775819_3577873_191-144977-0 from training. Duration: 0.78 2023-10-02 01:45:28,359 WARNING [train.py:1204] (2/4) Exclude cut with ID _30304_210_6319_1_1532476827261_7582060_1128-140266-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 01:45:28,674 WARNING [train.py:1204] (2/4) Exclude cut with ID _8772_210_1850_1_1528635595972_3664011_247-336448-0_sp0.9 from training. Duration: 0.82225 2023-10-02 01:45:28,761 WARNING [train.py:1204] (2/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_318-59097-0_sp1.1 from training. Duration: 0.6909375 2023-10-02 01:45:28,966 WARNING [train.py:1204] (2/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_540-131164-0_sp1.1 from training. Duration: 0.836375 2023-10-02 01:45:29,057 WARNING [train.py:1204] (2/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_39-110836-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 01:45:29,221 WARNING [train.py:1204] (2/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_860-96198-0_sp1.1 from training. Duration: 0.663625 2023-10-02 01:45:29,241 WARNING [train.py:1204] (2/4) Exclude cut with ID _15133_210_6228_1_1530794512074_3080379_29-264458-0 from training. Duration: 0.58 2023-10-02 01:45:29,266 WARNING [train.py:1204] (2/4) Exclude cut with ID _55209_210_1531_1_1534675828996_4597160_797-348343-0_sp1.1 from training. Duration: 0.963625 2023-10-02 01:45:29,380 WARNING [train.py:1204] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_23-189234-0_sp1.1 from training. Duration: 0.7545625 2023-10-02 01:45:29,772 WARNING [train.py:1204] (2/4) Exclude cut with ID _5410_210_1388_1_1527397055795_1719020_39-112221-0_sp1.1 from training. Duration: 0.62725 2023-10-02 01:45:29,894 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_43802_210_2290_1_1533516203_8829504_100-348441-0_sp1.1 from training. Duration: 0.82725 2023-10-02 01:45:30,323 WARNING [train.py:1204] (2/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_143-86344-0_sp1.1 from training. Duration: 0.7 2023-10-02 01:45:30,522 WARNING [train.py:1204] (2/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_126-29104-0 from training. Duration: 0.85 2023-10-02 01:45:30,710 WARNING [train.py:1204] (2/4) Exclude cut with ID _39846_210_12651_1_1533603651992_1318030_138-31771-0_sp1.1 from training. Duration: 0.963625 2023-10-02 01:45:31,050 WARNING [train.py:1204] (2/4) Exclude cut with ID _2220_210_1819_1_1525509109898_3658680_50-99837-0_sp1.1 from training. Duration: 0.4909375 2023-10-02 01:45:31,113 WARNING [train.py:1204] (2/4) Exclude cut with ID _6274_210_2084_1_1527937583837_7494020_836-210550-0_sp1.1 from training. Duration: 0.97275 2023-10-02 01:45:31,257 WARNING [train.py:1204] (2/4) Exclude cut with ID _28323_210_16187_1_1532480395667_4100300_306-74561-0 from training. Duration: 0.94 2023-10-02 01:45:32,237 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_177-58005-0_sp1.1 from training. Duration: 0.563625 2023-10-02 01:45:32,577 WARNING [train.py:1204] (2/4) Exclude cut with ID _54871_210_22356_1_1534665798253_3856359_19-342632-0_sp1.1 from training. Duration: 0.82725 2023-10-02 01:45:32,599 WARNING [train.py:1204] (2/4) Exclude cut with ID _35355_210_16040_1_1533212988977_4016229_74-190076-0_sp0.9 from training. Duration: 0.888875 2023-10-02 01:45:32,826 WARNING [train.py:1204] (2/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_285-97786-0_sp1.1 from training. Duration: 0.97275 2023-10-02 01:45:33,198 WARNING [train.py:1204] (2/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_337-181502-0_sp0.9 from training. Duration: 0.72225 2023-10-02 01:45:33,232 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_353-182855-0 from training. Duration: 0.96 2023-10-02 01:45:33,269 WARNING [train.py:1204] (2/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_409-327730-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 01:45:33,459 WARNING [train.py:1204] (2/4) Exclude cut with ID _8030_210_7497_1_1528444886781_3531089_373-19403-0_sp1.1 from training. Duration: 0.97275 2023-10-02 01:45:33,723 WARNING [train.py:1204] (2/4) Exclude cut with ID _6789_210_1534_1_1527857937218_3118950_231-340237-0_sp1.1 from training. Duration: 0.936375 2023-10-02 01:45:34,022 WARNING [train.py:1204] (2/4) Exclude cut with ID _6493_210_1747_1_1528459210424_4012470_536-57832-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 01:45:34,048 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7046_210_6753_1_1527895337_6920917_756-116002-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 01:45:34,220 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_37152_210_9697_1_1533106573_3844188_29-239070-0_sp1.1 from training. Duration: 0.8909375 2023-10-02 01:45:34,232 WARNING [train.py:1204] (2/4) Exclude cut with ID _19023_210_4353_1_1531378892552_5820780_61-334539-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 01:45:34,356 WARNING [train.py:1204] (2/4) Exclude cut with ID _14170_210_5219_1_1530525949868_4469650_159-28488-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 01:45:34,429 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_40896_210_15710_1_1533433956_7100802_764-71272-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 01:45:34,508 WARNING [train.py:1204] (2/4) Exclude cut with ID _44902_210_16187_1_1533888006062_3792078_258-178274-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 01:45:34,672 WARNING [train.py:1204] (2/4) Exclude cut with ID _7537_210_2084_1_1528502395431_3924359_14-312906-0_sp1.1 from training. Duration: 0.763625 2023-10-02 01:45:34,777 WARNING [train.py:1204] (2/4) Exclude cut with ID _49643_210_7989_1_1534640244224_3939031_674-116639-0_sp1.1 from training. Duration: 0.963625 2023-10-02 01:45:34,877 WARNING [train.py:1204] (2/4) Exclude cut with ID _22826_210_9994_1_1531908201522_3279169_415-161-0 from training. Duration: 0.98 2023-10-02 01:45:35,118 WARNING [train.py:1204] (2/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_264-348896-0 from training. Duration: 0.85 2023-10-02 01:45:35,128 WARNING [train.py:1204] (2/4) Exclude cut with ID _51899_210_20034_1_1534129254466_3669220_233-161492-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 01:45:35,182 WARNING [train.py:1204] (2/4) Exclude cut with ID _31255_210_13427_1_1532865619495_3815143_233-200023-0_sp1.1 from training. Duration: 0.936375 2023-10-02 01:45:35,206 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_23850_210_4238_1_1532232597_1632433_100-229879-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 01:45:35,292 WARNING [train.py:1204] (2/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_375-245677-0_sp0.9 from training. Duration: 0.9 2023-10-02 01:45:35,295 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_40223_210_6228_1_1533298404_4812267_355-317415-0_sp1.1 from training. Duration: 0.97275 2023-10-02 01:45:35,346 WARNING [train.py:1204] (2/4) Exclude cut with ID _9679_210_7497_1_1529129212906_4363929_235-81488-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 01:45:35,634 WARNING [train.py:1204] (2/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_172-215932-0_sp1.1 from training. Duration: 0.6 2023-10-02 01:45:35,707 WARNING [train.py:1204] (2/4) Exclude cut with ID _12369_210_4381_1_1530356078581_4388520_64-298917-0_sp1.1 from training. Duration: 0.8 2023-10-02 01:45:36,676 WARNING [train.py:1204] (2/4) Exclude cut with ID _38020_210_5986_1_1533112252259_7006089_131-53533-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 01:45:36,797 WARNING [train.py:1204] (2/4) Exclude cut with ID _42334_210_7770_1_1534206610396_3605880_392-48154-0_sp1.1 from training. Duration: 0.963625 2023-10-02 01:45:36,878 WARNING [train.py:1204] (2/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_337-181502-0 from training. Duration: 0.65 2023-10-02 01:45:36,885 WARNING [train.py:1204] (2/4) Exclude cut with ID _6267_210_2084_1_1527764658526_3894730_375-219887-0_sp0.9 from training. Duration: 0.7444375 2023-10-02 01:45:36,960 WARNING [train.py:1204] (2/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_60-146189-0 from training. Duration: 0.98 2023-10-02 01:45:37,081 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_3133_210_2087_1_1526106446_2324690_243-143119-0_sp1.1 from training. Duration: 0.7 2023-10-02 01:45:37,212 WARNING [train.py:1204] (2/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_1267-272693-0_sp0.9 from training. Duration: 0.811125 2023-10-02 01:45:37,344 WARNING [train.py:1204] (2/4) Exclude cut with ID _69884_210_9771_1_1535846392818_4166169_83-182645-0_sp1.1 from training. Duration: 0.97275 2023-10-02 01:45:37,363 WARNING [train.py:1204] (2/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_403-292260-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 01:45:37,559 WARNING [train.py:1204] (2/4) Exclude cut with ID _55429_210_10626_1_1534467588527_7174143_377-169391-0 from training. Duration: 0.96 2023-10-02 01:45:37,691 WARNING [train.py:1204] (2/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_494-199538-0_sp1.1 from training. Duration: 0.6818125 2023-10-02 01:45:37,752 WARNING [train.py:1204] (2/4) Exclude cut with ID _3067_210_2667_1_1526212974640_4503168_119-175776-0_sp1.1 from training. Duration: 0.67275 2023-10-02 01:45:38,055 WARNING [train.py:1204] (2/4) Exclude cut with ID _3231_210_2106_1_1526205864934_6784220_183-303839-0_sp1.1 from training. Duration: 0.97275 2023-10-02 01:45:38,238 WARNING [train.py:1204] (2/4) Exclude cut with ID _18513_210_9206_1_1531618215223_3862620_165-38958-0_sp1.1 from training. Duration: 0.963625 2023-10-02 01:45:38,348 WARNING [train.py:1204] (2/4) Exclude cut with ID _41314_210_12651_1_1533459476543_3766460_87-272068-0_sp1.1 from training. Duration: 0.7545625 2023-10-02 01:45:38,438 WARNING [train.py:1204] (2/4) Exclude cut with ID _3541_210_2477_1_1526475408709_3263973_353-95-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 01:45:38,568 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_2009_210_2071_1_1525481134_4588937_73-302813-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 01:45:38,676 WARNING [train.py:1204] (2/4) Exclude cut with ID _64220_210_9527_1_1535250151772_4875259_88-95011-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 01:45:38,745 WARNING [train.py:1204] (2/4) Exclude cut with ID _44902_210_16187_1_1533888006062_3792078_160-12107-0_sp1.1 from training. Duration: 0.92725 2023-10-02 01:45:38,760 WARNING [train.py:1204] (2/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_547-5424-0 from training. Duration: 0.94 2023-10-02 01:45:38,910 WARNING [train.py:1204] (2/4) Exclude cut with ID _21783_210_11357_1_1531742458730_3762679_268-85656-0_sp1.1 from training. Duration: 0.936375 2023-10-02 01:45:39,278 WARNING [train.py:1204] (2/4) Exclude cut with ID _66210_210_12651_1_1535446812922_3365850_42-284985-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 01:45:39,319 WARNING [train.py:1204] (2/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_465-214726-0_sp0.9 from training. Duration: 0.9555625 2023-10-02 01:45:39,625 WARNING [train.py:1204] (2/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_50-95770-0_sp1.1 from training. Duration: 0.936375 2023-10-02 01:45:40,116 WARNING [train.py:1204] (2/4) Exclude cut with ID _30800_210_22382_1_1532499347904_4515959_269-77567-0_sp1.1 from training. Duration: 0.963625 2023-10-02 01:45:40,118 WARNING [train.py:1204] (2/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_7-111954-0 from training. Duration: 0.56 2023-10-02 01:45:40,212 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_43802_210_2290_1_1533516203_8829504_85-195240-0_sp0.9 from training. Duration: 0.9666875 2023-10-02 01:45:41,019 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_2822_210_2715_1_1526112586_1999587_165-36656-0_sp0.9 from training. Duration: 0.988875 2023-10-02 01:45:41,048 WARNING [train.py:1204] (2/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_181-78632-0_sp1.1 from training. Duration: 0.7090625 2023-10-02 01:45:41,062 WARNING [train.py:1204] (2/4) Exclude cut with ID IC0671W0044-331887 from training. Duration: 0.896 2023-10-02 01:45:41,306 WARNING [train.py:1204] (2/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_266-137104-0_sp1.1 from training. Duration: 0.5909375 2023-10-02 01:45:41,432 WARNING [train.py:1204] (2/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_137-116442-0_sp0.9 from training. Duration: 0.8666875 2023-10-02 01:45:41,655 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_23801_210_16223_1_1532066392_3294657_570-5439-0 from training. Duration: 0.97 2023-10-02 01:45:41,677 WARNING [train.py:1204] (2/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_29-280622-0_sp1.1 from training. Duration: 0.7454375 2023-10-02 01:45:41,695 WARNING [train.py:1204] (2/4) Exclude cut with ID 3557-8342-0013-54691-0 from training. Duration: 0.92 2023-10-02 01:45:41,852 WARNING [train.py:1204] (2/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_711-15688-0_sp1.1 from training. Duration: 0.3909375 2023-10-02 01:45:41,970 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_15126_210_6753_1_1530778677_2591185_120-277707-0_sp1.1 from training. Duration: 0.72725 2023-10-02 01:45:42,190 WARNING [train.py:1204] (2/4) Exclude cut with ID _14970_210_15709_1_1530775547361_1966965_8-169924-0_sp1.1 from training. Duration: 0.836375 2023-10-02 01:45:42,191 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_4692_210_3528_1_1526899755_4323254_107-44401-0_sp0.9 from training. Duration: 0.9555625 2023-10-02 01:45:42,283 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_3475_210_2028_1_1526193578_3622569_265-276625-0_sp0.9 from training. Duration: 0.8444375 2023-10-02 01:45:42,379 WARNING [train.py:1204] (2/4) Exclude cut with ID _55153_210_2253_1_1534384786677_3162096_148-79022-0_sp1.1 from training. Duration: 0.87275 2023-10-02 01:45:42,469 WARNING [train.py:1204] (2/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_292-36336-0_sp1.1 from training. Duration: 0.92725 2023-10-02 01:45:42,477 WARNING [train.py:1204] (2/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_329-82235-0 from training. Duration: 0.82 2023-10-02 01:45:42,559 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_11305_210_1794_1_1529750428_7178148_257-312870-0 from training. Duration: 0.88 2023-10-02 01:45:42,739 WARNING [train.py:1204] (2/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_436-4493-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 01:45:42,954 WARNING [train.py:1204] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_321-54129-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 01:45:42,967 WARNING [train.py:1204] (2/4) Exclude cut with ID _5287_210_2084_1_1527418883040_3878670_333-48508-0_sp0.9 from training. Duration: 0.6666875 2023-10-02 01:45:42,988 WARNING [train.py:1204] (2/4) Exclude cut with ID _26528_210_9070_1_1532570410842_3851310_244-278158-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 01:45:43,029 WARNING [train.py:1204] (2/4) Exclude cut with ID _46952_210_5986_1_1534230010357_3107379_232-344503-0_sp1.1 from training. Duration: 0.87275 2023-10-02 01:45:43,046 WARNING [train.py:1204] (2/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_43-206757-0 from training. Duration: 0.75 2023-10-02 01:45:43,047 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_4183_210_2087_1_1526729100_1869838_141-166600-0 from training. Duration: 0.95 2023-10-02 01:45:43,114 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_296-287678-0 from training. Duration: 0.95 2023-10-02 01:45:43,176 WARNING [train.py:1204] (2/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_265-60343-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 01:45:43,200 WARNING [train.py:1204] (2/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_332-256168-0 from training. Duration: 0.75 2023-10-02 01:45:43,224 WARNING [train.py:1204] (2/4) Exclude cut with ID _13679_210_7497_1_1530534293662_3355098_411-186088-0 from training. Duration: 0.94 2023-10-02 01:45:43,537 WARNING [train.py:1204] (2/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_458-126779-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 01:45:43,638 WARNING [train.py:1204] (2/4) Exclude cut with ID IC0803W0380-397572 from training. Duration: 0.032 2023-10-02 01:45:43,714 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_278-272612-0_sp1.1 from training. Duration: 0.663625 2023-10-02 01:45:43,851 WARNING [train.py:1204] (2/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_491-263101-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 01:45:43,856 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_53555_210_5632_1_1534595220_7232297_334-336778-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 01:45:44,011 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_50-3289-0 from training. Duration: 0.91 2023-10-02 01:45:44,087 WARNING [train.py:1204] (2/4) Exclude cut with ID _8699_210_2069_1_1528630346831_3960409_155-39766-0_sp0.9 from training. Duration: 0.8666875 2023-10-02 01:45:44,091 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_502-82145-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 01:45:44,153 WARNING [train.py:1204] (2/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_550-43132-0_sp1.1 from training. Duration: 0.97275 2023-10-02 01:45:44,173 WARNING [train.py:1204] (2/4) Exclude cut with ID _14998_210_5219_1_1530705582080_7474970_446-112117-0_sp1.1 from training. Duration: 0.8181875 2023-10-02 01:45:44,224 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_3691_210_3528_1_1526468169_4062053_355-334432-0_sp1.1 from training. Duration: 0.7909375 2023-10-02 01:45:44,411 WARNING [train.py:1204] (2/4) Exclude cut with ID _58605_210_12210_1_1534723195939_3904259_239-94720-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 01:45:45,261 WARNING [train.py:1204] (2/4) Exclude cut with ID _49376_210_5986_1_1533985278732_3370740_391-344072-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 01:45:45,348 WARNING [train.py:1204] (2/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_558-322014-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 01:45:45,385 WARNING [train.py:1204] (2/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_256-32662-0_sp1.1 from training. Duration: 0.8181875 2023-10-02 01:45:45,773 WARNING [train.py:1204] (2/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_572-185150-0 from training. Duration: 0.97 2023-10-02 01:45:45,810 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_89-35748-0_sp0.9 from training. Duration: 0.87775 2023-10-02 01:45:46,165 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6291_210_3273_1_1527683649_1078498_88-66747-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 01:45:46,359 WARNING [train.py:1204] (2/4) Exclude cut with ID _19504_210_9994_1_1531648863242_3325220_357-237029-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 01:45:46,554 WARNING [train.py:1204] (2/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_302-2550-0_sp1.1 from training. Duration: 0.736375 2023-10-02 01:45:46,555 WARNING [train.py:1204] (2/4) Exclude cut with ID _9461_210_4557_1_1529060514726_3677451_59-194017-0_sp1.1 from training. Duration: 0.963625 2023-10-02 01:45:46,907 WARNING [train.py:1204] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_665-196155-0_sp1.1 from training. Duration: 0.6090625 2023-10-02 01:45:47,182 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_326-315007-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 01:45:47,364 WARNING [train.py:1204] (2/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_105-328174-0_sp1.1 from training. Duration: 0.6909375 2023-10-02 01:45:47,378 WARNING [train.py:1204] (2/4) Exclude cut with ID _56494_210_4220_1_1535284837625_3033169_106-263722-0_sp1.1 from training. Duration: 0.97275 2023-10-02 01:45:47,626 WARNING [train.py:1204] (2/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_770-200430-0_sp0.9 from training. Duration: 0.988875 2023-10-02 01:45:47,797 WARNING [train.py:1204] (2/4) Exclude cut with ID _65640_210_6310_1_1535368201445_3173250_209-13008-0_sp1.1 from training. Duration: 0.9 2023-10-02 01:45:47,934 WARNING [train.py:1204] (2/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_212-149947-0_sp0.9 from training. Duration: 0.811125 2023-10-02 01:45:48,313 WARNING [train.py:1204] (2/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_188-171673-0 from training. Duration: 0.58 2023-10-02 01:45:48,393 WARNING [train.py:1204] (2/4) Exclude cut with ID _58895_210_8750_1_1535184081320_3649089_286-281724-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 01:45:48,499 WARNING [train.py:1204] (2/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_16-254909-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 01:45:48,874 WARNING [train.py:1204] (2/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_52-95655-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 01:45:49,671 WARNING [train.py:1204] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_554-45617-0 from training. Duration: 0.99 2023-10-02 01:45:49,902 WARNING [train.py:1204] (2/4) Exclude cut with ID _14683_210_5219_1_1530698352608_3974859_473-344611-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 01:45:50,024 WARNING [train.py:1204] (2/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_17-25045-0_sp0.9 from training. Duration: 0.6555625 2023-10-02 01:45:50,357 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9072W0003-530637 from training. Duration: 0.768 2023-10-02 01:45:50,476 WARNING [train.py:1204] (2/4) Exclude cut with ID _38670_210_15379_1_1533448810659_4391059_76-74025-0_sp1.1 from training. Duration: 0.936375 2023-10-02 01:45:50,537 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6291_210_3273_1_1527683649_1078498_74-292608-0_sp0.9 from training. Duration: 0.8 2023-10-02 01:45:50,616 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9084W0025-536862 from training. Duration: 0.896 2023-10-02 01:45:50,670 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9087W0021-538392 from training. Duration: 0.768 2023-10-02 01:45:50,677 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7543_210_6753_1_1528543323_645515_56-163234-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 01:45:50,706 WARNING [train.py:1204] (2/4) Exclude cut with ID _45560_210_21982_1_1533812603951_3584999_44-332643-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 01:45:50,769 WARNING [train.py:1204] (2/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_1285-256622-0_sp1.1 from training. Duration: 0.763625 2023-10-02 01:45:50,819 WARNING [train.py:1204] (2/4) Exclude cut with ID _52781_210_16187_1_1534297729099_1118149_141-325632-0_sp1.1 from training. Duration: 0.936375 2023-10-02 01:45:50,887 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_40896_210_15710_1_1533433956_7100802_1051-137631-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 01:45:50,920 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_13091_210_7925_1_1530609447_8987354_233-18333-0_sp1.1 from training. Duration: 0.97275 2023-10-02 01:45:51,113 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_34008_210_10431_1_1533345424_8183247_662-317331-0 from training. Duration: 0.94 2023-10-02 01:45:51,119 WARNING [train.py:1204] (2/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_291-248920-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 01:45:51,121 WARNING [train.py:1204] (2/4) Exclude cut with ID _65263_210_22356_1_1535763598691_3921037_274-288481-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 01:45:51,142 WARNING [train.py:1204] (2/4) Exclude cut with ID _5189_210_4557_1_1527248319428_3769229_233-168080-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 01:45:51,196 WARNING [train.py:1204] (2/4) Exclude cut with ID _54871_210_22356_1_1534665798253_3856359_135-184281-0 from training. Duration: 0.99 2023-10-02 01:45:51,294 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9123W0012-555972 from training. Duration: 0.896 2023-10-02 01:45:51,297 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9123W0118-556077 from training. Duration: 0.896 2023-10-02 01:45:51,306 WARNING [train.py:1204] (2/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_406-275649-0 from training. Duration: 0.42 2023-10-02 01:45:51,537 WARNING [train.py:1204] (2/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_309-13684-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 01:45:51,556 WARNING [train.py:1204] (2/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_129-337802-0_sp1.1 from training. Duration: 0.6545625 2023-10-02 01:45:51,591 WARNING [train.py:1204] (2/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_649-266825-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 01:45:51,663 WARNING [train.py:1204] (2/4) Exclude cut with ID _40519_210_13794_1_1533715228655_7337390_895-224742-0_sp1.1 from training. Duration: 0.92725 2023-10-02 01:45:51,830 WARNING [train.py:1204] (2/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_234-14174-0 from training. Duration: 0.82 2023-10-02 01:45:51,988 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9156W0009-572502 from training. Duration: 0.768 2023-10-02 01:45:51,994 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_424-245529-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 01:45:52,329 WARNING [train.py:1204] (2/4) Exclude cut with ID _26528_210_9070_1_1532570410842_3851310_232-10149-0_sp1.1 from training. Duration: 0.963625 2023-10-02 01:45:52,336 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_17292_210_6753_1_1531125388_2943734_152-30410-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 01:45:52,504 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_35441_210_16554_1_1533347572_4092680_291-82075-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 01:45:52,654 WARNING [train.py:1204] (2/4) Exclude cut with ID _12369_210_4381_1_1530356078581_4388520_64-298917-0 from training. Duration: 0.88 2023-10-02 01:45:52,682 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_5618_210_1874_1_1527311008_5851_1-214501-0_sp0.9 from training. Duration: 0.7655625 2023-10-02 01:45:52,722 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_9460_210_4211_1_1528954587_10049667_572-37035-0 from training. Duration: 0.8 2023-10-02 01:45:52,746 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9193W0023-590322 from training. Duration: 0.896 2023-10-02 01:45:52,832 WARNING [train.py:1204] (2/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_206-10097-0_sp1.1 from training. Duration: 0.4454375 2023-10-02 01:45:52,857 WARNING [train.py:1204] (2/4) Exclude cut with ID _55134_210_21982_1_1534590028377_3882170_17-51731-0_sp1.1 from training. Duration: 0.963625 2023-10-02 01:45:53,018 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_14953_210_2151_1_1530701710_5616165_710-165400-0 from training. Duration: 0.87 2023-10-02 01:45:53,153 WARNING [train.py:1204] (2/4) Exclude cut with ID _53203_210_2087_1_1534246218309_475590_61-85777-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 01:45:53,246 WARNING [train.py:1204] (2/4) Exclude cut with ID _55844_210_2346_1_1534471212569_7625707_1050-10453-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 01:45:53,270 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9218W0013-603297 from training. Duration: 0.896 2023-10-02 01:45:54,159 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_20120_210_6753_1_1531643035_5482859_267-102818-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 01:45:54,160 WARNING [train.py:1204] (2/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_191-41023-0 from training. Duration: 0.71 2023-10-02 01:45:54,219 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9231W0202-610227 from training. Duration: 0.896 2023-10-02 01:45:54,444 WARNING [train.py:1204] (2/4) Exclude cut with ID _6537_210_1883_1_1527987559710_3597129_382-133062-0 from training. Duration: 0.68 2023-10-02 01:45:54,487 WARNING [train.py:1204] (2/4) Exclude cut with ID _28427_210_4220_1_1532581240967_3061069_43-84928-0 from training. Duration: 0.99 2023-10-02 01:45:54,510 WARNING [train.py:1204] (2/4) Exclude cut with ID _59256_210_5986_1_1535076085997_3589120_121-237386-0 from training. Duration: 0.96 2023-10-02 01:45:54,531 WARNING [train.py:1204] (2/4) Exclude cut with ID _8480_210_1874_1_1528589391205_6728330_137-256986-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 01:45:54,626 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_9417_210_1794_1_1529235261_4614131_479-346963-0_sp1.1 from training. Duration: 0.936375 2023-10-02 01:45:54,665 WARNING [train.py:1204] (2/4) Exclude cut with ID _26474_210_9570_1_1532520022699_3623380_267-7362-0_sp1.1 from training. Duration: 0.936375 2023-10-02 01:45:54,745 WARNING [train.py:1204] (2/4) Exclude cut with ID _55328_210_27767_1_1534580973260_4088170_287-61272-0_sp1.1 from training. Duration: 0.97275 2023-10-02 01:45:54,873 WARNING [train.py:1204] (2/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_589-9070-0 from training. Duration: 0.71 2023-10-02 01:45:54,887 WARNING [train.py:1204] (2/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_148-158509-0_sp1.1 from training. Duration: 0.37275 2023-10-02 01:45:54,973 WARNING [train.py:1204] (2/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_164-184548-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 01:45:55,149 WARNING [train.py:1204] (2/4) Exclude cut with ID _14970_210_15709_1_1530775547361_1966965_6-23328-0_sp1.1 from training. Duration: 0.7818125 2023-10-02 01:45:55,153 WARNING [train.py:1204] (2/4) Exclude cut with ID _29303_210_2575_1_1532392237303_7410649_612-165069-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 01:45:55,270 WARNING [train.py:1204] (2/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_383-321224-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 01:45:55,315 WARNING [train.py:1204] (2/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_283-160525-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 01:45:55,325 WARNING [train.py:1204] (2/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_505-97987-0 from training. Duration: 0.86 2023-10-02 01:45:55,397 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9309W0190-640317 from training. Duration: 0.768 2023-10-02 01:45:55,683 WARNING [train.py:1204] (2/4) Exclude cut with ID _50093_210_4220_1_1534417141207_3432299_280-297390-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 01:45:56,158 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_17292_210_6753_1_1531125388_2943734_320-71385-0_sp1.1 from training. Duration: 0.97275 2023-10-02 01:45:56,319 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_213-103282-0 from training. Duration: 0.78 2023-10-02 01:45:56,562 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7615_210_4211_1_1528110965_4580160_72-235211-0_sp0.9 from training. Duration: 0.8444375 2023-10-02 01:45:56,757 WARNING [train.py:1204] (2/4) Exclude cut with ID _14867_210_1968_1_1530684179890_3846969_173-320977-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 01:45:56,920 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7046_210_6753_1_1527895337_6920917_783-278109-0_sp0.9 from training. Duration: 0.8555625 2023-10-02 01:45:56,974 WARNING [train.py:1204] (2/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_90-30673-0 from training. Duration: 0.84 2023-10-02 01:45:56,987 WARNING [train.py:1204] (2/4) Exclude cut with ID _22826_210_9994_1_1531908201522_3279169_415-161-0_sp1.1 from training. Duration: 0.8909375 2023-10-02 01:45:57,003 WARNING [train.py:1204] (2/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_86-66192-0_sp0.9 from training. Duration: 0.611125 2023-10-02 01:45:57,005 WARNING [train.py:1204] (2/4) Exclude cut with ID _4626_210_3532_1_1526817652717_3916859_446-206656-0_sp1.1 from training. Duration: 0.6909375 2023-10-02 01:45:57,083 WARNING [train.py:1204] (2/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_166-128941-0_sp0.9 from training. Duration: 0.6 2023-10-02 01:45:57,402 WARNING [train.py:1204] (2/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_345-167986-0 from training. Duration: 0.84 2023-10-02 01:45:57,600 WARNING [train.py:1204] (2/4) Exclude cut with ID _6275_210_2084_1_1528110062213_7669049_973-198085-0 from training. Duration: 0.75 2023-10-02 01:45:57,678 WARNING [train.py:1204] (2/4) Exclude cut with ID _60057_210_10640_1_1534903065793_3333150_171-181133-0 from training. Duration: 0.88 2023-10-02 01:45:57,691 WARNING [train.py:1204] (2/4) Exclude cut with ID _19504_210_9994_1_1531648863242_3325220_336-196513-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 01:45:57,708 WARNING [train.py:1204] (2/4) Exclude cut with ID _6493_210_1747_1_1528459210424_4012470_513-140967-0 from training. Duration: 0.79 2023-10-02 01:45:58,418 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6673_210_3273_1_1527767790_3173240_327-306185-0_sp0.9 from training. Duration: 0.92225 2023-10-02 01:45:58,644 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1032-236863-0_sp1.1 from training. Duration: 0.763625 2023-10-02 01:45:58,824 WARNING [train.py:1204] (2/4) Exclude cut with ID _14867_210_1968_1_1530684179890_3846969_413-29292-0 from training. Duration: 0.9 2023-10-02 01:45:59,099 WARNING [train.py:1204] (2/4) Exclude cut with ID _47251_210_18628_1_1533814063128_4137250_186-343322-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 01:45:59,285 WARNING [train.py:1204] (2/4) Exclude cut with ID _56235_210_10640_1_1534557335235_4161099_624-46091-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 01:45:59,317 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_791-197891-0_sp0.9 from training. Duration: 0.9333125 2023-10-02 01:45:59,813 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_50050_210_16223_1_1535435796_7290138_786-288457-0_sp1.1 from training. Duration: 0.92725 2023-10-02 01:45:59,837 WARNING [train.py:1204] (2/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_213-227283-0_sp1.1 from training. Duration: 0.963625 2023-10-02 01:46:00,132 WARNING [train.py:1204] (2/4) Exclude cut with ID _42659_210_2070_1_1533607442765_3120380_58-341121-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 01:46:00,288 WARNING [train.py:1204] (2/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_506-23200-0_sp1.1 from training. Duration: 0.8818125 2023-10-02 01:46:00,474 WARNING [train.py:1204] (2/4) Exclude cut with ID _14170_210_5219_1_1530525949868_4469650_336-213530-0_sp1.1 from training. Duration: 0.8181875 2023-10-02 01:46:00,504 WARNING [train.py:1204] (2/4) Exclude cut with ID _64135_210_2253_1_1535677267876_7608972_180-5312-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 01:46:00,611 WARNING [train.py:1204] (2/4) Exclude cut with ID _12302_210_5219_1_1529924446466_8107280_835-279754-0 from training. Duration: 0.9 2023-10-02 01:46:00,613 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_20120_210_6753_1_1531643035_5482859_510-96996-0_sp1.1 from training. Duration: 0.7909375 2023-10-02 01:46:00,692 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_15517_210_6753_1_1530954008_3487336_216-187834-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 01:46:00,716 WARNING [train.py:1204] (2/4) Exclude cut with ID _19504_210_9994_1_1531648863242_3325220_414-113427-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 01:46:00,717 WARNING [train.py:1204] (2/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_264-108685-0_sp0.9 from training. Duration: 0.611125 2023-10-02 01:46:00,912 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_5438_210_1794_1_1527336687_2600009_104-288274-0_sp0.9 from training. Duration: 0.6444375 2023-10-02 01:46:01,312 WARNING [train.py:1204] (2/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_520-39496-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 01:46:01,355 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_37818_210_4238_1_1533620910_4720083_315-308565-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 01:46:01,492 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_2009_210_2071_1_1525481134_4588937_385-302100-0_sp1.1 from training. Duration: 0.7909375 2023-10-02 01:46:01,555 WARNING [train.py:1204] (2/4) Exclude cut with ID _58895_210_8750_1_1535184081320_3649089_277-53219-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 01:46:01,651 WARNING [train.py:1204] (2/4) Exclude cut with ID _26664_210_7770_1_1532258943280_3418270_121-111258-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 01:46:01,947 WARNING [train.py:1204] (2/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_99-167392-0_sp1.1 from training. Duration: 0.5545625 2023-10-02 01:46:02,054 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1113-7369-0 from training. Duration: 0.85 2023-10-02 01:46:02,069 WARNING [train.py:1204] (2/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_352-59958-0 from training. Duration: 0.86 2023-10-02 01:46:02,073 WARNING [train.py:1204] (2/4) Exclude cut with ID _14886_210_4220_1_1530702020355_3516190_159-73748-0_sp1.1 from training. Duration: 0.7545625 2023-10-02 01:46:02,864 WARNING [train.py:1204] (2/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_131-119569-0_sp1.1 from training. Duration: 0.92725 2023-10-02 01:46:02,955 WARNING [train.py:1204] (2/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_216-343007-0 from training. Duration: 0.66 2023-10-02 01:46:03,229 WARNING [train.py:1204] (2/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_113-92608-0_sp0.9 from training. Duration: 0.6 2023-10-02 01:46:03,533 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_40047_210_2290_1_1533290519_2374311_132-123271-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 01:46:03,587 WARNING [train.py:1204] (2/4) Exclude cut with ID _8793_210_6310_1_1528542985139_2310009_121-179782-0 from training. Duration: 0.8 2023-10-02 01:46:03,601 WARNING [train.py:1204] (2/4) Exclude cut with ID _43997_210_4525_1_1533538632555_5189329_283-218639-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 01:46:03,604 WARNING [train.py:1204] (2/4) Exclude cut with ID _49376_210_5986_1_1533985278732_3370740_297-78397-0_sp1.1 from training. Duration: 0.936375 2023-10-02 01:46:03,618 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_3475_210_2028_1_1526193578_3622569_377-166133-0_sp0.9 from training. Duration: 0.9555625 2023-10-02 01:46:03,631 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_520-213149-0_sp1.1 from training. Duration: 0.92725 2023-10-02 01:46:03,789 WARNING [train.py:1204] (2/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_567-144483-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 01:46:04,047 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_4183_210_2087_1_1526729100_1869838_143-280962-0_sp0.9 from training. Duration: 0.62225 2023-10-02 01:46:04,090 WARNING [train.py:1204] (2/4) Exclude cut with ID _49643_210_7989_1_1534640244224_3939031_611-185515-0 from training. Duration: 0.94 2023-10-02 01:46:04,208 WARNING [train.py:1204] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_88-341718-0_sp0.9 from training. Duration: 0.7333125 2023-10-02 01:46:04,368 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_51225_210_3601_1_1534400634_2327383_49-235441-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 01:46:04,464 WARNING [train.py:1204] (2/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_240-294400-0_sp1.1 from training. Duration: 0.936375 2023-10-02 01:46:04,596 WARNING [train.py:1204] (2/4) Exclude cut with ID _55429_210_10626_1_1534467588527_7174143_640-304825-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 01:46:04,625 WARNING [train.py:1204] (2/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_187-221566-0_sp1.1 from training. Duration: 0.3909375 2023-10-02 01:46:04,821 WARNING [train.py:1204] (2/4) Exclude cut with ID _7606_210_4915_1_1528894253277_5080690_127-177761-0_sp0.9 from training. Duration: 0.711125 2023-10-02 01:46:05,039 WARNING [train.py:1204] (2/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_430-116139-0 from training. Duration: 0.85 2023-10-02 01:46:05,171 WARNING [train.py:1204] (2/4) Exclude cut with ID _3675_210_4557_1_1526639934408_4182089_456-18305-0_sp0.9 from training. Duration: 0.8444375 2023-10-02 01:46:05,538 WARNING [train.py:1204] (2/4) Exclude cut with ID _18513_210_9206_1_1531618215223_3862620_402-5957-0_sp1.1 from training. Duration: 0.9 2023-10-02 01:46:05,632 WARNING [train.py:1204] (2/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_465-156500-0 from training. Duration: 0.72 2023-10-02 01:46:05,633 WARNING [train.py:1204] (2/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_714-310538-0 from training. Duration: 0.86 2023-10-02 01:46:05,774 WARNING [train.py:1204] (2/4) Exclude cut with ID _39411_210_5986_1_1533538825030_3343649_335-301187-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 01:46:06,136 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_41750_210_1960_1_1533520678_3948042_241-158616-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 01:46:06,196 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_46013_210_7234_1_1533776362_3744032_50-141535-0_sp1.1 from training. Duration: 0.9 2023-10-02 01:46:06,267 WARNING [train.py:1204] (2/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_43-206757-0_sp0.9 from training. Duration: 0.8333125 2023-10-02 01:46:06,326 WARNING [train.py:1204] (2/4) Exclude cut with ID _22961_210_8254_1_1531976366672_4278180_172-276296-0_sp1.1 from training. Duration: 0.97275 2023-10-02 01:46:06,347 WARNING [train.py:1204] (2/4) Exclude cut with ID _17956_210_4353_1_1531209568125_6979970_1305-301903-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 01:46:06,403 WARNING [train.py:1204] (2/4) Exclude cut with ID _28323_210_16187_1_1532480395667_4100300_604-44507-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 01:46:07,935 WARNING [train.py:1204] (2/4) Exclude cut with ID _23514_210_1389_1_1531832139785_3031439_88-337544-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 01:46:08,000 WARNING [train.py:1204] (2/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_932-3672-0_sp1.1 from training. Duration: 0.963625 2023-10-02 01:46:08,204 WARNING [train.py:1204] (2/4) Exclude cut with ID _46502_210_13794_1_1533724452198_3657159_207-109991-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 01:46:08,270 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_216-73841-0_sp1.1 from training. Duration: 0.87275 2023-10-02 01:46:08,352 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8453_210_4211_1_1528502940_8562730_347-330673-0_sp1.1 from training. Duration: 0.6545625 2023-10-02 01:46:08,787 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_4183_210_2087_1_1526729100_1869838_143-280962-0_sp1.1 from training. Duration: 0.5090625 2023-10-02 01:46:09,090 WARNING [train.py:1204] (2/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_325-113798-0_sp1.1 from training. Duration: 0.77275 2023-10-02 01:46:09,460 WARNING [train.py:1204] (2/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_91-212486-0_sp0.9 from training. Duration: 0.7555625 2023-10-02 01:46:09,659 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8776_210_2040_1_1528638837_4088036_244-278763-0_sp1.1 from training. Duration: 0.8181875 2023-10-02 01:46:09,697 WARNING [train.py:1204] (2/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_106-286130-0 from training. Duration: 0.98 2023-10-02 01:46:09,762 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_2746_210_1988_1_1525872816_3795627_372-205861-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 01:46:09,812 WARNING [train.py:1204] (2/4) Exclude cut with ID _30800_210_22382_1_1532499347904_4515959_240-54646-0_sp1.1 from training. Duration: 0.936375 2023-10-02 01:46:09,990 WARNING [train.py:1204] (2/4) Exclude cut with ID _59500_210_8254_1_1534809610680_3736009_162-180695-0_sp1.1 from training. Duration: 0.936375 2023-10-02 01:46:10,017 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7544_210_6753_1_1528591327_1471426_146-93357-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 01:46:10,213 WARNING [train.py:1204] (2/4) Exclude cut with ID _42657_210_2070_1_1533520654016_3327008_405-87432-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 01:46:10,233 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_4528_210_3273_1_1527076654_3276777_209-219804-0 from training. Duration: 0.69 2023-10-02 01:46:10,234 WARNING [train.py:1204] (2/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_78-182912-0_sp0.9 from training. Duration: 0.6555625 2023-10-02 01:46:10,240 WARNING [train.py:1204] (2/4) Exclude cut with ID _31255_210_13427_1_1532865619495_3815143_322-114998-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 01:46:10,601 WARNING [train.py:1204] (2/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_848-280957-0_sp0.9 from training. Duration: 0.9666875 2023-10-02 01:46:10,619 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7696_210_2040_1_1528207167_3716099_117-125434-0_sp0.9 from training. Duration: 0.8444375 2023-10-02 01:46:10,805 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_1002-109192-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 01:46:11,580 WARNING [train.py:1204] (2/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_138-64210-0_sp0.9 from training. Duration: 0.811125 2023-10-02 01:46:11,601 WARNING [train.py:1204] (2/4) Exclude cut with ID _25808_210_13292_1_1532343514562_7366109_633-47524-0_sp1.1 from training. Duration: 0.9 2023-10-02 01:46:11,750 WARNING [train.py:1204] (2/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_297-5233-0 from training. Duration: 0.95 2023-10-02 01:46:11,900 WARNING [train.py:1204] (2/4) Exclude cut with ID _8394_210_6702_1_1528590504316_3540189_335-274780-0 from training. Duration: 0.78 2023-10-02 01:46:11,905 WARNING [train.py:1204] (2/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_301-330455-0 from training. Duration: 0.8 2023-10-02 01:46:12,182 WARNING [train.py:1204] (2/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_383-205833-0 from training. Duration: 0.95 2023-10-02 01:46:12,408 WARNING [train.py:1204] (2/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_48-182155-0 from training. Duration: 0.87 2023-10-02 01:46:12,500 WARNING [train.py:1204] (2/4) Exclude cut with ID _14832_210_8750_1_1530691351539_3171348_1-54315-0_sp1.1 from training. Duration: 0.7909375 2023-10-02 01:46:12,717 WARNING [train.py:1204] (2/4) Exclude cut with ID _44982_210_1511_1_1534312820108_3726029_413-100814-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 01:46:12,765 WARNING [train.py:1204] (2/4) Exclude cut with ID _3231_210_2106_1_1526205864934_6784220_104-261392-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 01:46:13,000 WARNING [train.py:1204] (2/4) Exclude cut with ID _24314_210_4915_1_1532135078116_7170179_1-13588-0_sp1.1 from training. Duration: 0.97275 2023-10-02 01:46:13,301 WARNING [train.py:1204] (2/4) Exclude cut with ID _44304_210_15374_1_1533639352765_4613129_141-8356-0_sp1.1 from training. Duration: 0.963625 2023-10-02 01:46:13,464 WARNING [train.py:1204] (2/4) Exclude cut with ID _37892_210_19596_1_1533198677897_3964058_374-335034-0_sp1.1 from training. Duration: 0.936375 2023-10-02 01:46:13,602 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_195-257512-0_sp0.9 from training. Duration: 0.7444375 2023-10-02 01:46:13,604 WARNING [train.py:1204] (2/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_1267-272693-0_sp1.1 from training. Duration: 0.663625 2023-10-02 01:46:13,605 WARNING [train.py:1204] (2/4) Exclude cut with ID _30196_210_1664_1_1533708496522_7074140_690-326155-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 01:46:13,787 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_68847_210_14656_1_1536070214_2586843_38-20717-0_sp1.1 from training. Duration: 0.97275 2023-10-02 01:46:13,829 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_99-251314-0_sp1.1 from training. Duration: 0.7909375 2023-10-02 01:46:14,070 WARNING [train.py:1204] (2/4) Exclude cut with ID _49376_210_5986_1_1533985278732_3370740_225-95630-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 01:46:14,270 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_2655_210_2087_1_1525688959_3897400_202-147557-0 from training. Duration: 0.85 2023-10-02 01:46:14,335 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_5618_210_1874_1_1527311008_5851_1-214501-0_sp1.1 from training. Duration: 0.626375 2023-10-02 01:46:14,398 WARNING [train.py:1204] (2/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_225-43381-0_sp1.1 from training. Duration: 0.8545625 2023-10-02 01:46:14,515 WARNING [train.py:1204] (2/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_242-3472-0 from training. Duration: 0.73 2023-10-02 01:46:14,792 WARNING [train.py:1204] (2/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_339-104791-0 from training. Duration: 0.49 2023-10-02 01:46:14,868 WARNING [train.py:1204] (2/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_488-92261-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 01:46:15,119 WARNING [train.py:1204] (2/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_175-163894-0 from training. Duration: 0.74 2023-10-02 01:46:15,862 WARNING [train.py:1204] (2/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_41-234353-0_sp0.9 from training. Duration: 0.7555625 2023-10-02 01:46:16,114 WARNING [train.py:1204] (2/4) Exclude cut with ID _47251_210_18628_1_1533814063128_4137250_116-275927-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 01:46:16,155 WARNING [train.py:1204] (2/4) Exclude cut with ID _4697_210_2069_1_1526815801953_3823098_61-223324-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 01:46:16,166 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6961_210_2087_1_1527937168_6815011_562-165518-0_sp1.1 from training. Duration: 0.7 2023-10-02 01:46:16,309 WARNING [train.py:1204] (2/4) Exclude cut with ID _21989_210_5854_1_1531899214931_3271360_133-44759-0 from training. Duration: 0.77 2023-10-02 01:46:16,485 WARNING [train.py:1204] (2/4) Exclude cut with ID _23557_210_8750_1_1531890106370_6846255_77-284923-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 01:46:16,662 WARNING [train.py:1204] (2/4) Exclude cut with ID _13678_210_7497_1_1530530069226_3463170_464-296127-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 01:46:16,682 WARNING [train.py:1204] (2/4) Exclude cut with ID _22613_210_5219_1_1532253599686_4090170_393-36663-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 01:46:16,774 WARNING [train.py:1204] (2/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_235-262124-0_sp0.9 from training. Duration: 0.7444375 2023-10-02 01:46:16,879 WARNING [train.py:1204] (2/4) Exclude cut with ID _8480_210_1874_1_1528589391205_6728330_755-112625-0 from training. Duration: 0.74 2023-10-02 01:46:16,920 WARNING [train.py:1204] (2/4) Exclude cut with ID _6456_210_1534_1_1527681634212_3181049_215-309359-0 from training. Duration: 0.88 2023-10-02 01:46:16,948 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_3019_210_2028_1_1526044245_3264865_191-72235-0_sp1.1 from training. Duration: 0.8818125 2023-10-02 01:46:17,039 WARNING [train.py:1204] (2/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_380-162648-0 from training. Duration: 0.94 2023-10-02 01:46:17,210 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_92-46009-0 from training. Duration: 0.54 2023-10-02 01:46:17,226 WARNING [train.py:1204] (2/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_53-300544-0_sp0.9 from training. Duration: 0.7444375 2023-10-02 01:46:17,333 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_68235_210_1925_1_1535863247_4484656_101-250943-0_sp1.1 from training. Duration: 0.97275 2023-10-02 01:46:17,354 WARNING [train.py:1204] (2/4) Exclude cut with ID _14970_210_15709_1_1530775547361_1966965_204-22182-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 01:46:17,463 WARNING [train.py:1204] (2/4) Exclude cut with ID _55153_210_2253_1_1534384786677_3162096_310-130092-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 01:46:17,473 WARNING [train.py:1204] (2/4) Exclude cut with ID _60113_210_16271_1_1535164212481_4386079_559-167191-0_sp1.1 from training. Duration: 0.8454375 2023-10-02 01:46:17,589 WARNING [train.py:1204] (2/4) Exclude cut with ID _14368_210_4220_1_1530532714870_3595529_93-58002-0_sp1.1 from training. Duration: 0.5181875 2023-10-02 01:46:17,647 WARNING [train.py:1204] (2/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_110-224688-0 from training. Duration: 0.97 2023-10-02 01:46:17,794 WARNING [train.py:1204] (2/4) Exclude cut with ID _40402_210_18219_1_1533522324115_2953150_178-178004-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 01:46:17,802 WARNING [train.py:1204] (2/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_321-335475-0_sp1.1 from training. Duration: 0.4090625 2023-10-02 01:46:17,876 WARNING [train.py:1204] (2/4) Exclude cut with ID _66863_210_18053_1_1535698758334_8590319_1092-271784-0 from training. Duration: 0.96 2023-10-02 01:46:17,882 WARNING [train.py:1204] (2/4) Exclude cut with ID _28317_210_12742_1_1532480302381_8510325_607-213951-0_sp1.1 from training. Duration: 0.763625 2023-10-02 01:46:17,898 WARNING [train.py:1204] (2/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_468-283113-0 from training. Duration: 0.96 2023-10-02 01:46:18,150 WARNING [train.py:1204] (2/4) Exclude cut with ID _14922_210_6228_1_1530707435273_3693310_13-165512-0_sp1.1 from training. Duration: 0.7454375 2023-10-02 01:46:18,160 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_69630_210_7234_1_1535788670_910989_56-169762-0_sp1.1 from training. Duration: 0.92725 2023-10-02 01:46:18,319 WARNING [train.py:1204] (2/4) Exclude cut with ID _67335_210_5219_1_1535525975652_4275229_121-80357-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 01:46:18,367 WARNING [train.py:1204] (2/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_126-12079-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 01:46:18,411 WARNING [train.py:1204] (2/4) Exclude cut with ID _5294_210_1747_1_1527249643928_3696179_342-270278-0_sp1.1 from training. Duration: 0.5 2023-10-02 01:46:18,533 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_30322_210_1925_1_1532843478_826817_35-328264-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 01:46:18,681 WARNING [train.py:1204] (2/4) Exclude cut with ID _8480_210_1874_1_1528589391205_6728330_755-112625-0_sp1.1 from training. Duration: 0.67275 2023-10-02 01:46:18,901 WARNING [train.py:1204] (2/4) Exclude cut with ID _38670_210_15379_1_1533448810659_4391059_132-146412-0_sp1.1 from training. Duration: 0.97275 2023-10-02 01:46:19,000 WARNING [train.py:1204] (2/4) Exclude cut with ID _22020_210_2461_1_1531724164513_6326900_563-39691-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 01:46:19,208 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_9460_210_4211_1_1528954587_10049667_572-37035-0_sp1.1 from training. Duration: 0.72725 2023-10-02 01:46:19,229 WARNING [train.py:1204] (2/4) Exclude cut with ID _68777_210_4353_1_1535810390731_3117904_129-166012-0_sp0.9 from training. Duration: 0.9444375 2023-10-02 01:46:19,287 WARNING [train.py:1204] (2/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_487-91921-0_sp1.1 from training. Duration: 0.963625 2023-10-02 01:46:19,388 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_195-257512-0 from training. Duration: 0.67 2023-10-02 01:46:19,495 WARNING [train.py:1204] (2/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_188-260011-0_sp1.1 from training. Duration: 0.663625 2023-10-02 01:46:20,363 WARNING [train.py:1204] (2/4) Exclude cut with ID _8480_210_1874_1_1528589391205_6728330_309-139896-0_sp0.9 from training. Duration: 0.9 2023-10-02 01:46:20,435 WARNING [train.py:1204] (2/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_866-321248-0_sp1.1 from training. Duration: 0.963625 2023-10-02 01:46:20,630 WARNING [train.py:1204] (2/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_510-216513-0_sp1.1 from training. Duration: 0.97275 2023-10-02 01:46:21,122 WARNING [train.py:1204] (2/4) Exclude cut with ID _6653_210_2629_1_1527919172441_6732679_758-143677-0 from training. Duration: 0.98 2023-10-02 01:46:21,217 WARNING [train.py:1204] (2/4) Exclude cut with ID _55134_210_21982_1_1534590028377_3882170_50-214427-0_sp1.1 from training. Duration: 0.97275 2023-10-02 01:46:21,408 WARNING [train.py:1204] (2/4) Exclude cut with ID _31649_210_6228_1_1532584608063_4091078_482-301330-0_sp1.1 from training. Duration: 0.97275 2023-10-02 01:46:21,637 WARNING [train.py:1204] (2/4) Exclude cut with ID _35799_210_5854_1_1533263487887_3529071_490-326847-0_sp1.1 from training. Duration: 0.9 2023-10-02 01:46:21,649 WARNING [train.py:1204] (2/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_984-90716-0_sp1.1 from training. Duration: 0.963625 2023-10-02 01:46:22,077 WARNING [train.py:1204] (2/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_166-246557-0 from training. Duration: 0.64 2023-10-02 01:46:22,078 WARNING [train.py:1204] (2/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_51-200220-0_sp1.1 from training. Duration: 0.5090625 2023-10-02 01:46:22,159 WARNING [train.py:1204] (2/4) Exclude cut with ID _51382_210_23862_1_1534492801838_3650104_275-92796-0_sp1.1 from training. Duration: 0.936375 2023-10-02 01:46:22,567 WARNING [train.py:1204] (2/4) Exclude cut with ID _29853_210_7497_1_1532505600851_6642110_461-142829-0_sp1.1 from training. Duration: 0.97275 2023-10-02 01:46:22,616 WARNING [train.py:1204] (2/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_788-298907-0_sp0.9 from training. Duration: 0.988875 2023-10-02 01:46:22,742 WARNING [train.py:1204] (2/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_739-2098-0_sp1.1 from training. Duration: 0.836375 2023-10-02 01:46:23,010 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7543_210_6753_1_1528541907_1399322_165-323619-0 from training. Duration: 0.69 2023-10-02 01:46:23,095 WARNING [train.py:1204] (2/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_401-255491-0_sp1.1 from training. Duration: 0.636375 2023-10-02 01:46:23,413 WARNING [train.py:1204] (2/4) Exclude cut with ID _66873_210_5517_1_1535454136506_4293370_24-143283-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 01:46:23,487 WARNING [train.py:1204] (2/4) Exclude cut with ID _14459_210_2228_1_1531443641946_7516700_1221-280439-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 01:46:23,662 WARNING [train.py:1204] (2/4) Exclude cut with ID _59202_210_26984_1_1534816810612_3549174_469-213482-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 01:46:23,662 WARNING [train.py:1204] (2/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_148-158509-0_sp0.9 from training. Duration: 0.4555625 2023-10-02 01:46:23,710 WARNING [train.py:1204] (2/4) Exclude cut with ID _24314_210_4915_1_1532135078116_7170179_863-259095-0_sp1.1 from training. Duration: 0.97275 2023-10-02 01:46:23,774 WARNING [train.py:1204] (2/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_321-22821-0 from training. Duration: 0.89 2023-10-02 01:46:23,806 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_43831_210_2158_1_1533725502_5872791_55-163137-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 01:46:24,474 WARNING [train.py:1204] (2/4) Exclude cut with ID _27430_210_7770_1_1532604689375_1378590_26-25207-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 01:46:24,598 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_17613_210_2151_1_1531137569_2366160_223-316461-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 01:46:24,663 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_53679_210_4852_1_1534556726_4186141_541-213370-0_sp1.1 from training. Duration: 0.963625 2023-10-02 01:46:24,730 WARNING [train.py:1204] (2/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_369-295086-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 01:46:24,771 WARNING [train.py:1204] (2/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_91-267696-0 from training. Duration: 0.87 2023-10-02 01:46:24,848 WARNING [train.py:1204] (2/4) Exclude cut with ID _5294_210_1747_1_1527249643928_3696179_180-100713-0 from training. Duration: 0.81 2023-10-02 01:46:24,888 WARNING [train.py:1204] (2/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_32-335103-0 from training. Duration: 0.82 2023-10-02 01:46:24,894 WARNING [train.py:1204] (2/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_133-312727-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 01:46:25,000 WARNING [train.py:1204] (2/4) Exclude cut with ID _59288_210_5854_1_1535077757774_3412246_391-165934-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 01:46:25,059 WARNING [train.py:1204] (2/4) Exclude cut with ID _28323_210_16187_1_1532480395667_4100300_488-1281-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 01:46:25,151 WARNING [train.py:1204] (2/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_124-193207-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 01:46:25,883 WARNING [train.py:1204] (2/4) Exclude cut with ID _14087_210_2253_1_1530529224512_3014029_226-242140-0_sp0.9 from training. Duration: 0.9 2023-10-02 01:46:26,124 WARNING [train.py:1204] (2/4) Exclude cut with ID _29277_210_3279_1_1532581109293_3173069_392-3694-0_sp1.1 from training. Duration: 0.936375 2023-10-02 01:46:26,153 WARNING [train.py:1204] (2/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_563-329842-0_sp1.1 from training. Duration: 0.97275 2023-10-02 01:46:26,201 WARNING [train.py:1204] (2/4) Exclude cut with ID _58583_210_5236_1_1534726835266_3574249_391-87755-0_sp1.1 from training. Duration: 0.92725 2023-10-02 01:46:26,219 WARNING [train.py:1204] (2/4) Exclude cut with ID _28427_210_4220_1_1532581240967_3061069_281-237827-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 01:46:26,240 WARNING [train.py:1204] (2/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_44-147898-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 01:46:26,340 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_24007_210_1794_1_1531918925_1663875_125-320772-0_sp1.1 from training. Duration: 0.97275 2023-10-02 01:46:26,590 WARNING [train.py:1204] (2/4) Exclude cut with ID _14368_210_4220_1_1530532714870_3595529_143-117421-0 from training. Duration: 0.58 2023-10-02 01:46:26,860 WARNING [train.py:1204] (2/4) Exclude cut with ID _69584_210_5219_1_1535785133775_4044450_325-7656-0_sp0.9 from training. Duration: 0.9555625 2023-10-02 01:46:26,999 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_37351_210_7925_1_1533012931_3812113_265-85780-0 from training. Duration: 0.88 2023-10-02 01:46:27,019 WARNING [train.py:1204] (2/4) Exclude cut with ID _18516_210_9206_1_1531704653206_3692406_331-237896-0_sp1.1 from training. Duration: 0.936375 2023-10-02 01:46:27,093 WARNING [train.py:1204] (2/4) Exclude cut with ID _14629_210_15709_1_1530604298182_956899_6-232695-0 from training. Duration: 0.94 2023-10-02 01:46:27,117 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7544_210_6753_1_1528587000_691468_87-178599-0_sp1.1 from training. Duration: 0.7909375 2023-10-02 01:46:27,176 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_326-303945-0 from training. Duration: 0.99 2023-10-02 01:46:27,198 WARNING [train.py:1204] (2/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_503-30719-0 from training. Duration: 0.98 2023-10-02 01:46:27,205 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_11305_210_1794_1_1529750428_7178148_299-344640-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 01:46:27,258 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_53071_210_16554_1_1534490405_2954066_265-170472-0_sp1.1 from training. Duration: 0.7545625 2023-10-02 01:46:27,544 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_174-277364-0_sp1.1 from training. Duration: 0.6454375 2023-10-02 01:46:27,644 WARNING [train.py:1204] (2/4) Exclude cut with ID _58414_210_8254_1_1535421609162_7151182_193-151884-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 01:46:27,660 WARNING [train.py:1204] (2/4) Exclude cut with ID IC0651W0104-322063 from training. Duration: 0.768 2023-10-02 01:46:27,908 WARNING [train.py:1204] (2/4) Exclude cut with ID _21881_210_3525_1_1531825242757_3798380_139-305982-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 01:46:28,025 WARNING [train.py:1204] (2/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_79-200392-0 from training. Duration: 0.97 2023-10-02 01:46:28,256 WARNING [train.py:1204] (2/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_451-280894-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 01:46:28,264 WARNING [train.py:1204] (2/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_304-273433-0_sp1.1 from training. Duration: 0.9 2023-10-02 01:46:28,949 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_5959_210_1794_1_1527400400_3445726_412-44052-0 from training. Duration: 0.77 2023-10-02 01:46:29,073 WARNING [train.py:1204] (2/4) Exclude cut with ID _4681_210_3525_1_1527251040321_1235713_148-26159-0_sp1.1 from training. Duration: 0.963625 2023-10-02 01:46:29,304 WARNING [train.py:1204] (2/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_887-249555-0_sp0.9 from training. Duration: 0.8555625 2023-10-02 01:46:29,590 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_25236_210_2158_1_1532051968_280567_37-6860-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 01:46:29,974 WARNING [train.py:1204] (2/4) Exclude cut with ID _29277_210_3279_1_1532581109293_3173069_417-115527-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 01:46:30,190 WARNING [train.py:1204] (2/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_552-134748-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 01:46:30,270 WARNING [train.py:1204] (2/4) Exclude cut with ID _43176_210_8254_1_1533524417469_4174339_188-174143-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 01:46:30,318 WARNING [train.py:1204] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_127-119109-0_sp1.1 from training. Duration: 0.6545625 2023-10-02 01:46:30,508 WARNING [train.py:1204] (2/4) Exclude cut with ID _14922_210_6228_1_1530707435273_3693310_38-187851-0 from training. Duration: 0.62 2023-10-02 01:46:30,546 WARNING [train.py:1204] (2/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_588-125165-0 from training. Duration: 0.83 2023-10-02 01:46:30,626 WARNING [train.py:1204] (2/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_344-152526-0_sp1.1 from training. Duration: 0.4818125 2023-10-02 01:46:30,628 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7002_210_1794_1_1527935634_4034444_487-236501-0 from training. Duration: 0.66 2023-10-02 01:46:30,748 WARNING [train.py:1204] (2/4) Exclude cut with ID _23825_210_13449_1_1531915313526_3497150_379-16981-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 01:46:31,263 WARNING [train.py:1204] (2/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_297-5233-0_sp1.1 from training. Duration: 0.863625 2023-10-02 01:46:31,267 WARNING [train.py:1204] (2/4) Exclude cut with ID _41298_210_9019_1_1533430371450_4822143_178-283538-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 01:46:31,299 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7046_210_6753_1_1527895337_6920917_642-106799-0 from training. Duration: 0.92 2023-10-02 01:46:31,321 WARNING [train.py:1204] (2/4) Exclude cut with ID _6274_210_2084_1_1527937583837_7494020_532-291952-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 01:46:31,439 WARNING [train.py:1204] (2/4) Exclude cut with ID _47278_210_12041_1_1533902418349_3576110_260-270468-0_sp1.1 from training. Duration: 0.92725 2023-10-02 01:46:31,444 WARNING [train.py:1204] (2/4) Exclude cut with ID _14368_210_4220_1_1530532714870_3595529_144-3287-0_sp0.9 from training. Duration: 0.7444375 2023-10-02 01:46:31,569 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_162-262717-0_sp0.9 from training. Duration: 0.92225 2023-10-02 01:46:31,767 WARNING [train.py:1204] (2/4) Exclude cut with ID _3465_210_3525_1_1526175667060_3574049_62-335552-0_sp0.9 from training. Duration: 0.9666875 2023-10-02 01:46:31,772 WARNING [train.py:1204] (2/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_726-272852-0_sp1.1 from training. Duration: 0.92725 2023-10-02 01:46:31,846 WARNING [train.py:1204] (2/4) Exclude cut with ID _14898_210_9206_1_1530766812377_4177190_26-135492-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 01:46:32,124 WARNING [train.py:1204] (2/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_200-95304-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 01:46:32,169 WARNING [train.py:1204] (2/4) Exclude cut with ID _45012_210_12651_1_1533608435136_5371380_240-37101-0_sp1.1 from training. Duration: 0.8454375 2023-10-02 01:46:32,177 WARNING [train.py:1204] (2/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_132-269633-0_sp0.9 from training. Duration: 0.7555625 2023-10-02 01:46:32,244 WARNING [train.py:1204] (2/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_907-151398-0_sp0.9 from training. Duration: 0.411125 2023-10-02 01:46:32,318 WARNING [train.py:1204] (2/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_45-265684-0_sp0.9 from training. Duration: 0.8 2023-10-02 01:46:32,413 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6291_210_3273_1_1527683649_1078498_74-292608-0 from training. Duration: 0.72 2023-10-02 01:46:33,618 WARNING [train.py:1204] (2/4) Exclude cut with ID _59202_210_26984_1_1534816810612_3549174_221-84819-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 01:46:33,686 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_33696_210_3852_1_1533082780_2663375_181-325595-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 01:46:33,741 WARNING [train.py:1204] (2/4) Exclude cut with ID _47251_210_18628_1_1533814063128_4137250_351-154105-0_sp1.1 from training. Duration: 0.963625 2023-10-02 01:46:33,846 WARNING [train.py:1204] (2/4) Exclude cut with ID _35501_210_9288_1_1532998507986_3192189_351-334740-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 01:46:33,981 WARNING [train.py:1204] (2/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_342-100157-0_sp1.1 from training. Duration: 0.4909375 2023-10-02 01:46:34,344 WARNING [train.py:1204] (2/4) Exclude cut with ID _6789_210_1534_1_1527857937218_3118950_262-48853-0 from training. Duration: 0.56 2023-10-02 01:46:34,378 WARNING [train.py:1204] (2/4) Exclude cut with ID _69884_210_9771_1_1535846392818_4166169_430-201020-0 from training. Duration: 0.97 2023-10-02 01:46:34,592 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_71-136518-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 01:46:34,673 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_28-291982-0_sp1.1 from training. Duration: 0.8818125 2023-10-02 01:46:34,695 WARNING [train.py:1204] (2/4) Exclude cut with ID _54218_210_12210_1_1534413646388_4409259_437-275326-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 01:46:34,948 WARNING [train.py:1204] (2/4) Exclude cut with ID _55134_210_21982_1_1534590028377_3882170_40-309139-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 01:46:35,425 WARNING [train.py:1204] (2/4) Exclude cut with ID _32704_210_8954_1_1533017488840_6747370_266-329755-0_sp1.1 from training. Duration: 0.8090625 2023-10-02 01:46:35,525 WARNING [train.py:1204] (2/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_21-174514-0_sp1.1 from training. Duration: 0.3909375 2023-10-02 01:46:35,739 WARNING [train.py:1204] (2/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_409-207378-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 01:46:36,180 WARNING [train.py:1204] (2/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_132-269633-0 from training. Duration: 0.68 2023-10-02 01:46:36,287 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_2245_210_2715_1_1525511548_3540254_310-52483-0_sp0.9 from training. Duration: 0.97775 2023-10-02 01:46:36,456 WARNING [train.py:1204] (2/4) Exclude cut with ID _27430_210_7770_1_1532604689375_1378590_47-77403-0_sp1.1 from training. Duration: 0.92725 2023-10-02 01:46:36,724 WARNING [train.py:1204] (2/4) Exclude cut with ID _7562_210_2084_1_1528887767916_3946229_186-326422-0 from training. Duration: 0.84 2023-10-02 01:46:36,833 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_14056_210_4163_1_1530604483_7572827_230-35993-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 01:46:37,162 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_23854_210_1794_1_1531969696_1329157_57-123491-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 01:46:37,220 WARNING [train.py:1204] (2/4) Exclude cut with ID _14747_210_8341_1_1530685797463_3654089_147-131229-0 from training. Duration: 0.76 2023-10-02 01:46:37,988 WARNING [train.py:1204] (2/4) Exclude cut with ID _30191_210_1664_1_1533189915047_7196670_430-19539-0_sp1.1 from training. Duration: 0.92725 2023-10-02 01:46:38,009 WARNING [train.py:1204] (2/4) Exclude cut with ID _67725_210_27767_1_1535608756671_5105359_44-50762-0_sp1.1 from training. Duration: 0.963625 2023-10-02 01:46:38,209 WARNING [train.py:1204] (2/4) Exclude cut with ID _27485_210_4916_1_1532739191677_3668269_217-161755-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 01:46:38,250 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_5931_210_2040_1_1527428481_4547999_321-193614-0_sp1.1 from training. Duration: 0.6090625 2023-10-02 01:46:38,267 WARNING [train.py:1204] (2/4) Exclude cut with ID _6267_210_2084_1_1527764658526_3894730_375-219887-0 from training. Duration: 0.67 2023-10-02 01:46:38,351 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9087W0022-538393 from training. Duration: 0.768 2023-10-02 01:46:38,452 WARNING [train.py:1204] (2/4) Exclude cut with ID _68777_210_4353_1_1535810390731_3117904_83-196459-0_sp1.1 from training. Duration: 0.963625 2023-10-02 01:46:38,629 WARNING [train.py:1204] (2/4) Exclude cut with ID _69884_210_9771_1_1535846392818_4166169_455-343812-0_sp1.1 from training. Duration: 0.92725 2023-10-02 01:46:38,629 WARNING [train.py:1204] (2/4) Exclude cut with ID _13916_210_9994_1_1530788341126_3763149_414-320174-0_sp1.1 from training. Duration: 0.92725 2023-10-02 01:46:38,633 WARNING [train.py:1204] (2/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_398-239819-0 from training. Duration: 0.99 2023-10-02 01:46:38,751 WARNING [train.py:1204] (2/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_199-232314-0_sp1.1 from training. Duration: 0.92725 2023-10-02 01:46:39,053 WARNING [train.py:1204] (2/4) Exclude cut with ID _2334_210_2667_1_1525608238880_4705129_459-104997-0 from training. Duration: 0.95 2023-10-02 01:46:39,287 WARNING [train.py:1204] (2/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_441-209395-0_sp1.1 from training. Duration: 0.936375 2023-10-02 01:46:39,365 WARNING [train.py:1204] (2/4) Exclude cut with ID _27052_210_5986_1_1532329197187_3585009_320-80393-0_sp1.1 from training. Duration: 0.97275 2023-10-02 01:46:39,706 WARNING [train.py:1204] (2/4) Exclude cut with ID _21783_210_11357_1_1531742458730_3762679_89-217253-0_sp1.1 from training. Duration: 0.97275 2023-10-02 01:46:39,719 WARNING [train.py:1204] (2/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_453-282588-0_sp1.1 from training. Duration: 0.8909375 2023-10-02 01:46:39,850 WARNING [train.py:1204] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_88-341718-0 from training. Duration: 0.66 2023-10-02 01:46:39,887 WARNING [train.py:1204] (2/4) Exclude cut with ID _43552_210_8913_1_1533726031059_3523429_230-3521-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 01:46:40,140 WARNING [train.py:1204] (2/4) Exclude cut with ID _9679_210_7497_1_1529129212906_4363929_512-313889-0 from training. Duration: 0.81 2023-10-02 01:46:40,285 WARNING [train.py:1204] (2/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_194-252800-0_sp1.1 from training. Duration: 0.8818125 2023-10-02 01:46:40,316 WARNING [train.py:1204] (2/4) Exclude cut with ID _55209_210_1531_1_1534675828996_4597160_239-293541-0_sp1.1 from training. Duration: 0.963625 2023-10-02 01:46:40,371 WARNING [train.py:1204] (2/4) Exclude cut with ID _35799_210_5854_1_1533263487887_3529071_15-325131-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 01:46:40,782 WARNING [train.py:1204] (2/4) Exclude cut with ID _14100_210_8341_1_1530529320613_3678069_279-55760-0 from training. Duration: 0.93 2023-10-02 01:46:41,135 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_489-230703-0_sp1.1 from training. Duration: 0.77275 2023-10-02 01:46:41,287 WARNING [train.py:1204] (2/4) Exclude cut with ID _6694_210_4599_1_1527852637490_3965529_144-185051-0 from training. Duration: 0.96 2023-10-02 01:46:41,357 WARNING [train.py:1204] (2/4) Exclude cut with ID _28461_210_2253_1_1532771775189_3041299_1-228155-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 01:46:41,591 WARNING [train.py:1204] (2/4) Exclude cut with ID _32704_210_8954_1_1533017488840_6747370_686-252323-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 01:46:41,593 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_40896_210_15710_1_1533433956_7100802_1075-294633-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 01:46:41,604 WARNING [train.py:1204] (2/4) Exclude cut with ID _22083_210_8254_1_1531702862439_3678970_30-92879-0_sp1.1 from training. Duration: 0.97275 2023-10-02 01:46:41,676 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9245W0121-616888 from training. Duration: 0.896 2023-10-02 01:46:41,679 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6786_210_1388_1_1527839699_4194495_378-46070-0 from training. Duration: 0.72 2023-10-02 01:46:42,693 WARNING [train.py:1204] (2/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_317-175062-0 from training. Duration: 0.82 2023-10-02 01:46:42,905 WARNING [train.py:1204] (2/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_238-89390-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 01:46:43,071 WARNING [train.py:1204] (2/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_524-310811-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 01:46:43,270 WARNING [train.py:1204] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_307-158462-0_sp0.9 from training. Duration: 0.7666875 2023-10-02 01:46:43,271 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9309W0191-640318 from training. Duration: 0.512 2023-10-02 01:46:43,275 WARNING [train.py:1204] (2/4) Exclude cut with ID _55153_210_2253_1_1534384786677_3162096_100-204617-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 01:46:43,495 WARNING [train.py:1204] (2/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_112-149213-0_sp1.1 from training. Duration: 0.863625 2023-10-02 01:46:43,685 WARNING [train.py:1204] (2/4) Exclude cut with ID _67831_210_12392_1_1535612464202_3092612_346-2148-0_sp1.1 from training. Duration: 0.8454375 2023-10-02 01:46:43,825 WARNING [train.py:1204] (2/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_51-117576-0_sp0.9 from training. Duration: 0.4444375 2023-10-02 01:46:43,847 WARNING [train.py:1204] (2/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_300-27235-0 from training. Duration: 0.87 2023-10-02 01:46:43,901 WARNING [train.py:1204] (2/4) Exclude cut with ID _40399_210_4353_1_1533305658194_3279406_195-116825-0_sp1.1 from training. Duration: 0.97275 2023-10-02 01:46:43,929 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7002_210_1794_1_1527939683_5157286_163-87552-0_sp0.9 from training. Duration: 0.9555625 2023-10-02 01:46:43,968 WARNING [train.py:1204] (2/4) Exclude cut with ID _19514_210_9259_1_1531530071330_5084230_560-237597-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 01:46:44,060 WARNING [train.py:1204] (2/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_41-234353-0_sp1.1 from training. Duration: 0.6181875 2023-10-02 01:46:44,084 WARNING [train.py:1204] (2/4) Exclude cut with ID _6274_210_2084_1_1527937583837_7494020_769-168302-0 from training. Duration: 0.71 2023-10-02 01:46:44,262 WARNING [train.py:1204] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_574-45541-0 from training. Duration: 0.65 2023-10-02 01:46:44,451 WARNING [train.py:1204] (2/4) Exclude cut with ID _50310_210_15488_1_1534068043345_3641080_262-67120-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 01:46:44,658 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_53071_210_16554_1_1534493434_1276796_35-11927-0_sp1.1 from training. Duration: 0.963625 2023-10-02 01:46:44,659 WARNING [train.py:1204] (2/4) Exclude cut with ID _3992_210_1747_1_1526644816031_3731060_107-251434-0 from training. Duration: 0.99 2023-10-02 01:46:44,707 WARNING [train.py:1204] (2/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_412-54488-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 01:46:44,819 WARNING [train.py:1204] (2/4) Exclude cut with ID _18516_210_9206_1_1531704653206_3692406_77-258490-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 01:46:44,947 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_40047_210_2290_1_1533290519_2374311_195-165693-0_sp0.9 from training. Duration: 0.988875 2023-10-02 01:46:44,988 WARNING [train.py:1204] (2/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_296-36971-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 01:46:45,062 WARNING [train.py:1204] (2/4) Exclude cut with ID _14970_210_15709_1_1530773543493_1715730_187-20259-0_sp1.1 from training. Duration: 0.8090625 2023-10-02 01:46:45,330 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_53373_210_5399_1_1534229471_4715695_135-305927-0_sp1.1 from training. Duration: 0.936375 2023-10-02 01:46:45,356 WARNING [train.py:1204] (2/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_60-18123-0 from training. Duration: 0.88 2023-10-02 01:46:45,434 WARNING [train.py:1204] (2/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_166-166990-0 from training. Duration: 0.96 2023-10-02 01:46:45,526 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_5438_210_1794_1_1527336687_2600009_233-58072-0 from training. Duration: 0.77 2023-10-02 01:46:45,709 WARNING [train.py:1204] (2/4) Exclude cut with ID _8833_210_5219_1_1528592665210_7553059_611-80313-0_sp0.9 from training. Duration: 0.711125 2023-10-02 01:46:45,742 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_53555_210_5632_1_1534595220_7232297_118-43239-0_sp1.1 from training. Duration: 0.936375 2023-10-02 01:46:45,835 WARNING [train.py:1204] (2/4) Exclude cut with ID _12369_210_4381_1_1530356078581_4388520_480-13146-0 from training. Duration: 0.85 2023-10-02 01:46:46,845 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_892-28814-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 01:46:46,892 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_24899_210_16554_1_1532046422_3827462_160-21255-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 01:46:47,296 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_154-316450-0_sp1.1 from training. Duration: 0.6909375 2023-10-02 01:46:47,385 WARNING [train.py:1204] (2/4) Exclude cut with ID _30196_210_1664_1_1533708496522_7074140_1225-301232-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 01:46:47,411 WARNING [train.py:1204] (2/4) Exclude cut with ID _28783_210_7943_1_1532671164808_7155479_414-208100-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 01:46:47,464 WARNING [train.py:1204] (2/4) Exclude cut with ID _19023_210_4353_1_1531378892552_5820780_83-254794-0 from training. Duration: 0.86 2023-10-02 01:46:47,535 WARNING [train.py:1204] (2/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_591-83752-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 01:46:47,588 WARNING [train.py:1204] (2/4) Exclude cut with ID _29877_210_6228_1_1532397791106_5276149_266-62291-0_sp1.1 from training. Duration: 0.7909375 2023-10-02 01:46:47,726 WARNING [train.py:1204] (2/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_280-161633-0_sp1.1 from training. Duration: 0.663625 2023-10-02 01:46:48,041 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7543_210_6753_1_1528539468_333764_39-156634-0_sp1.1 from training. Duration: 0.97275 2023-10-02 01:46:48,185 WARNING [train.py:1204] (2/4) Exclude cut with ID _67499_210_17282_1_1535509805220_3692430_505-312248-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 01:46:48,209 WARNING [train.py:1204] (2/4) Exclude cut with ID _5294_210_1747_1_1527249643928_3696179_180-100713-0_sp0.9 from training. Duration: 0.9 2023-10-02 01:46:48,317 WARNING [train.py:1204] (2/4) Exclude cut with ID _26528_210_9070_1_1532570410842_3851310_508-148132-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 01:46:48,327 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_211-339622-0_sp1.1 from training. Duration: 0.4090625 2023-10-02 01:46:48,341 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_40896_210_15710_1_1533433956_7100802_339-138356-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 01:46:48,417 WARNING [train.py:1204] (2/4) Exclude cut with ID _27060_210_5986_1_1532674838922_3522549_211-19933-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 01:46:48,615 WARNING [train.py:1204] (2/4) Exclude cut with ID _60229_210_27767_1_1534838406158_3655350_457-117923-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 01:46:48,844 WARNING [train.py:1204] (2/4) Exclude cut with ID _55429_210_10626_1_1534467588527_7174143_460-141439-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 01:46:48,956 WARNING [train.py:1204] (2/4) Exclude cut with ID _59170_210_6721_1_1534983633960_4203335_263-10780-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 01:46:48,977 WARNING [train.py:1204] (2/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_872-264525-0_sp0.9 from training. Duration: 0.72225 2023-10-02 01:46:49,067 WARNING [train.py:1204] (2/4) Exclude cut with ID _7624_210_1531_1_1528109802198_4933101_418-349834-0_sp0.9 from training. Duration: 0.6666875 2023-10-02 01:46:49,183 WARNING [train.py:1204] (2/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_27-53998-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 01:46:49,303 WARNING [train.py:1204] (2/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_202-183922-0 from training. Duration: 0.85 2023-10-02 01:46:49,323 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7002_210_1794_1_1527939683_5157286_1-281264-0 from training. Duration: 0.44 2023-10-02 01:46:49,391 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_53753_210_7234_1_1534320003_3594148_493-9314-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 01:46:49,411 WARNING [train.py:1204] (2/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_217-95056-0_sp1.1 from training. Duration: 0.8909375 2023-10-02 01:46:49,467 WARNING [train.py:1204] (2/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_406-45086-0_sp1.1 from training. Duration: 0.72725 2023-10-02 01:46:49,473 WARNING [train.py:1204] (2/4) Exclude cut with ID _31649_210_6228_1_1532584608063_4091078_505-213821-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 01:46:49,486 WARNING [train.py:1204] (2/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_317-175062-0_sp0.9 from training. Duration: 0.911125 2023-10-02 01:46:49,650 WARNING [train.py:1204] (2/4) Exclude cut with ID _5444_210_2151_1_1527418808464_5312210_726-27165-0_sp1.1 from training. Duration: 0.6090625 2023-10-02 01:46:49,686 WARNING [train.py:1204] (2/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_494-170437-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 01:46:49,948 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_474-64054-0_sp1.1 from training. Duration: 0.936375 2023-10-02 01:46:50,164 WARNING [train.py:1204] (2/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_521-7556-0_sp1.1 from training. Duration: 0.97275 2023-10-02 01:46:50,297 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_269-348505-0_sp1.1 from training. Duration: 0.92725 2023-10-02 01:46:51,253 WARNING [train.py:1204] (2/4) Exclude cut with ID _21878_210_3525_1_1531738772002_7264330_559-184862-0 from training. Duration: 0.94 2023-10-02 01:46:51,281 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_68702_210_23689_1_1535799577_3420068_92-76040-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 01:46:51,325 WARNING [train.py:1204] (2/4) Exclude cut with ID _22826_210_9994_1_1531908201522_3279169_227-102052-0_sp1.1 from training. Duration: 0.97275 2023-10-02 01:46:51,356 WARNING [train.py:1204] (2/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_718-250907-0_sp0.9 from training. Duration: 0.7666875 2023-10-02 01:46:51,418 WARNING [train.py:1204] (2/4) Exclude cut with ID _14069_210_5632_1_1530512692033_4803390_352-143891-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 01:46:51,683 WARNING [train.py:1204] (2/4) Exclude cut with ID _6267_210_2084_1_1527764658526_3894730_65-106602-0_sp1.1 from training. Duration: 0.936375 2023-10-02 01:46:51,757 WARNING [train.py:1204] (2/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_202-183922-0_sp0.9 from training. Duration: 0.9444375 2023-10-02 01:46:51,769 WARNING [train.py:1204] (2/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_465-268827-0_sp1.1 from training. Duration: 0.97275 2023-10-02 01:46:51,826 WARNING [train.py:1204] (2/4) Exclude cut with ID _12369_210_4381_1_1530356078581_4388520_58-29064-0_sp1.1 from training. Duration: 0.9 2023-10-02 01:46:51,870 WARNING [train.py:1204] (2/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_1-99680-0_sp0.9 from training. Duration: 0.7444375 2023-10-02 01:46:51,947 WARNING [train.py:1204] (2/4) Exclude cut with ID _13253_210_1925_1_1530757775819_3577873_191-144977-0_sp1.1 from training. Duration: 0.7090625 2023-10-02 01:46:52,229 WARNING [train.py:1204] (2/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_276-112684-0_sp1.1 from training. Duration: 0.92725 2023-10-02 01:46:52,237 WARNING [train.py:1204] (2/4) Exclude cut with ID _64639_210_5816_1_1535367619437_3528000_358-411-0_sp1.1 from training. Duration: 0.97275 2023-10-02 01:46:52,243 WARNING [train.py:1204] (2/4) Exclude cut with ID _31649_210_6228_1_1532584608063_4091078_106-21199-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 01:46:52,344 WARNING [train.py:1204] (2/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_901-79438-0 from training. Duration: 0.94 2023-10-02 01:46:52,544 WARNING [train.py:1204] (2/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_718-250907-0 from training. Duration: 0.69 2023-10-02 01:46:52,556 WARNING [train.py:1204] (2/4) Exclude cut with ID _5287_210_2084_1_1527418883040_3878670_363-194161-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 01:46:52,562 WARNING [train.py:1204] (2/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_586-321321-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 01:46:52,627 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_68900_210_2087_1_1535713157_3708529_164-292661-0_sp1.1 from training. Duration: 0.963625 2023-10-02 01:46:52,628 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_26433_210_12108_1_1532157158_4056480_114-144878-0_sp1.1 from training. Duration: 0.97275 2023-10-02 01:46:52,846 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_15517_210_6753_1_1530954008_3487336_96-341874-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 01:46:53,380 WARNING [train.py:1204] (2/4) Exclude cut with ID _14268_210_9206_1_1530622771285_4714099_637-86630-0 from training. Duration: 0.51 2023-10-02 01:46:53,487 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_28211_210_6388_1_1532861506_4523224_121-320092-0_sp1.1 from training. Duration: 0.92725 2023-10-02 01:46:53,850 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_141-89113-0_sp0.9 from training. Duration: 0.9555625 2023-10-02 01:46:53,961 WARNING [train.py:1204] (2/4) Exclude cut with ID _6789_210_1534_1_1527857937218_3118950_311-61262-0_sp1.1 from training. Duration: 0.5909375 2023-10-02 01:46:53,980 WARNING [train.py:1204] (2/4) Exclude cut with ID _58480_210_12392_1_1534811121111_3646160_389-200461-0_sp1.1 from training. Duration: 0.97275 2023-10-02 01:46:54,239 WARNING [train.py:1204] (2/4) Exclude cut with ID _41326_210_5854_1_1533778076509_3492080_28-156553-0_sp1.1 from training. Duration: 0.92725 2023-10-02 01:46:54,327 WARNING [train.py:1204] (2/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_577-287150-0_sp1.1 from training. Duration: 0.7454375 2023-10-02 01:46:54,413 WARNING [train.py:1204] (2/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_40-129756-0_sp1.1 from training. Duration: 0.736375 2023-10-02 01:46:54,422 WARNING [train.py:1204] (2/4) Exclude cut with ID _60113_210_16271_1_1535164212481_4386079_490-280290-0 from training. Duration: 0.98 2023-10-02 01:46:54,720 WARNING [train.py:1204] (2/4) Exclude cut with ID _58059_210_5854_1_1534901471149_3164808_34-39905-0_sp1.1 from training. Duration: 0.963625 2023-10-02 01:46:55,459 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_791-197891-0_sp1.1 from training. Duration: 0.763625 2023-10-02 01:46:55,484 WARNING [train.py:1204] (2/4) Exclude cut with ID _6789_210_1534_1_1527857937218_3118950_262-48853-0_sp1.1 from training. Duration: 0.5090625 2023-10-02 01:46:55,540 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_2655_210_2087_1_1525688959_3897400_52-279899-0_sp0.9 from training. Duration: 0.7666875 2023-10-02 01:46:55,544 WARNING [train.py:1204] (2/4) Exclude cut with ID _14884_210_2252_1_1530707459689_5561250_114-201399-0 from training. Duration: 0.76 2023-10-02 01:46:55,758 WARNING [train.py:1204] (2/4) Exclude cut with ID _19023_210_4353_1_1531378892552_5820780_83-254794-0_sp0.9 from training. Duration: 0.9555625 2023-10-02 01:46:55,773 WARNING [train.py:1204] (2/4) Exclude cut with ID _53489_210_6228_1_1534298357169_3835970_10-297919-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 01:46:56,091 WARNING [train.py:1204] (2/4) Exclude cut with ID _44585_210_18628_1_1533605350387_4518199_418-3055-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 01:46:56,103 WARNING [train.py:1204] (2/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_310-109461-0 from training. Duration: 0.8 2023-10-02 01:46:56,161 WARNING [train.py:1204] (2/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_235-262124-0 from training. Duration: 0.67 2023-10-02 01:46:56,266 WARNING [train.py:1204] (2/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_195-167011-0_sp1.1 from training. Duration: 0.6818125 2023-10-02 01:46:56,814 WARNING [train.py:1204] (2/4) Exclude cut with ID _7852_210_5219_1_1528286471316_8139400_188-55162-0_sp1.1 from training. Duration: 0.936375 2023-10-02 01:46:56,857 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_35361_210_4238_1_1533262810_7793476_675-87012-0 from training. Duration: 0.89 2023-10-02 01:46:56,933 WARNING [train.py:1204] (2/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_465-81427-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 01:46:56,940 WARNING [train.py:1204] (2/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_597-154640-0_sp1.1 from training. Duration: 0.8181875 2023-10-02 01:46:57,068 WARNING [train.py:1204] (2/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_186-309161-0_sp0.9 from training. Duration: 0.6 2023-10-02 01:46:57,155 WARNING [train.py:1204] (2/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_546-119312-0 from training. Duration: 0.99 2023-10-02 01:46:57,159 WARNING [train.py:1204] (2/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_330-240060-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 01:46:57,268 WARNING [train.py:1204] (2/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_477-255782-0 from training. Duration: 0.82 2023-10-02 01:46:57,293 WARNING [train.py:1204] (2/4) Exclude cut with ID _14970_210_15709_1_1530773543493_1715730_150-248939-0_sp1.1 from training. Duration: 0.97275 2023-10-02 01:46:57,398 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_556-206615-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 01:46:57,468 WARNING [train.py:1204] (2/4) Exclude cut with ID _3465_210_3525_1_1526175667060_3574049_308-231735-0 from training. Duration: 0.73 2023-10-02 01:46:58,101 WARNING [train.py:1204] (2/4) Exclude cut with ID _27485_210_4916_1_1532739191677_3668269_66-236571-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 01:46:58,504 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_2221_210_1988_1_1525597051_4935541_415-296882-0_sp0.9 from training. Duration: 0.8555625 2023-10-02 01:46:58,511 WARNING [train.py:1204] (2/4) Exclude cut with ID _64521_210_14699_1_1535243427259_3815328_231-78630-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 01:46:58,876 WARNING [train.py:1204] (2/4) Exclude cut with ID _13942_210_9988_1_1530604906036_3733360_499-55676-0 from training. Duration: 0.62 2023-10-02 01:46:58,878 WARNING [train.py:1204] (2/4) Exclude cut with ID _49376_210_5986_1_1533985278732_3370740_301-154763-0_sp1.1 from training. Duration: 0.8909375 2023-10-02 01:46:59,916 WARNING [train.py:1204] (2/4) Exclude cut with ID _27485_210_4916_1_1532739191677_3668269_255-216698-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 01:46:59,995 WARNING [train.py:1204] (2/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_137-334143-0_sp1.1 from training. Duration: 0.97275 2023-10-02 01:47:00,012 WARNING [train.py:1204] (2/4) Exclude cut with ID _6456_210_1534_1_1527681634212_3181049_215-309359-0_sp1.1 from training. Duration: 0.8 2023-10-02 01:47:00,224 WARNING [train.py:1204] (2/4) Exclude cut with ID _65640_210_6310_1_1535368201445_3173250_209-13008-0 from training. Duration: 0.99 2023-10-02 01:47:00,591 WARNING [train.py:1204] (2/4) Exclude cut with ID _12369_210_4381_1_1530356078581_4388520_58-29064-0 from training. Duration: 0.99 2023-10-02 01:47:00,890 WARNING [train.py:1204] (2/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_410-287857-0 from training. Duration: 0.93 2023-10-02 01:47:00,983 WARNING [train.py:1204] (2/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_153-153537-0 from training. Duration: 0.91 2023-10-02 01:47:01,019 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7544_210_6753_1_1528587722_3576686_172-171750-0_sp0.9 from training. Duration: 0.72225 2023-10-02 01:47:01,057 WARNING [train.py:1204] (2/4) Exclude cut with ID _64521_210_14699_1_1535243427259_3815328_464-183853-0_sp1.1 from training. Duration: 0.97275 2023-10-02 01:47:01,069 WARNING [train.py:1204] (2/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_385-210308-0_sp1.1 from training. Duration: 0.963625 2023-10-02 01:47:01,084 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_46013_210_7234_1_1533776362_3744032_354-261865-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 01:47:01,090 WARNING [train.py:1204] (2/4) Exclude cut with ID _3067_210_2667_1_1526212974640_4503168_250-285937-0_sp1.1 from training. Duration: 0.72725 2023-10-02 01:47:01,177 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_40036_210_13405_1_1533648647_1315388_48-146016-0 from training. Duration: 0.93 2023-10-02 01:47:01,442 WARNING [train.py:1204] (2/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_280-161633-0 from training. Duration: 0.73 2023-10-02 01:47:01,490 WARNING [train.py:1204] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_308-161067-0_sp0.9 from training. Duration: 0.7333125 2023-10-02 01:47:01,592 WARNING [train.py:1204] (2/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_68-212801-0_sp0.9 from training. Duration: 0.811125 2023-10-02 01:47:01,644 WARNING [train.py:1204] (2/4) Exclude cut with ID _6789_210_1534_1_1527857937218_3118950_311-61262-0_sp0.9 from training. Duration: 0.72225 2023-10-02 01:47:01,803 WARNING [train.py:1204] (2/4) Exclude cut with ID _14632_210_9988_1_1530691279392_3923200_102-204477-0_sp1.1 from training. Duration: 0.8818125 2023-10-02 01:47:01,861 WARNING [train.py:1204] (2/4) Exclude cut with ID _65242_210_18628_1_1535369393717_3823149_290-117517-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 01:47:01,968 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_69630_210_7234_1_1535788670_910989_35-337140-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 01:47:01,981 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_5618_210_1874_1_1527311008_5851_1-214501-0 from training. Duration: 0.689 2023-10-02 01:47:02,123 WARNING [train.py:1204] (2/4) Exclude cut with ID _8064_210_5399_1_1528372299291_4282973_168-50337-0_sp0.9 from training. Duration: 0.9333125 2023-10-02 01:47:02,224 WARNING [train.py:1204] (2/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_185-14438-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 01:47:02,246 WARNING [train.py:1204] (2/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_61-327062-0 from training. Duration: 0.81 2023-10-02 01:47:02,251 WARNING [train.py:1204] (2/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_98-741-0 from training. Duration: 0.8 2023-10-02 01:47:02,540 WARNING [train.py:1204] (2/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_312-204798-0 from training. Duration: 0.93 2023-10-02 01:47:02,663 WARNING [train.py:1204] (2/4) Exclude cut with ID _14998_210_5219_1_1530705582080_7474970_787-294416-0_sp0.9 from training. Duration: 0.911125 2023-10-02 01:47:02,744 WARNING [train.py:1204] (2/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_140-181176-0_sp1.1 from training. Duration: 0.836375 2023-10-02 01:47:02,782 WARNING [train.py:1204] (2/4) Exclude cut with ID _14687_210_5854_1_1530703838259_3664889_115-239046-0_sp1.1 from training. Duration: 0.6090625 2023-10-02 01:47:02,836 WARNING [train.py:1204] (2/4) Exclude cut with ID _56423_210_5854_1_1534896061049_3165969_370-230614-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 01:47:02,880 WARNING [train.py:1204] (2/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_406-275649-0_sp0.9 from training. Duration: 0.4666875 2023-10-02 01:47:02,894 WARNING [train.py:1204] (2/4) Exclude cut with ID _15150_210_8341_1_1530752411803_7256384_22-138274-0_sp1.1 from training. Duration: 0.87275 2023-10-02 01:47:02,943 WARNING [train.py:1204] (2/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_125-125818-0 from training. Duration: 0.96 2023-10-02 01:47:03,151 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_4399_210_1874_1_1526705485_2400757_220-47974-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 01:47:03,884 WARNING [train.py:1204] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_308-161067-0_sp1.1 from training. Duration: 0.6 2023-10-02 01:47:04,008 WARNING [train.py:1204] (2/4) Exclude cut with ID _53489_210_6228_1_1534298357169_3835970_102-57645-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 01:47:04,231 WARNING [train.py:1204] (2/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_178-177582-0 from training. Duration: 0.74 2023-10-02 01:47:04,296 WARNING [train.py:1204] (2/4) Exclude cut with ID _14349_210_8341_1_1530615710818_3442470_170-336541-0_sp1.1 from training. Duration: 0.7818125 2023-10-02 01:47:04,362 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_521-214276-0_sp0.9 from training. Duration: 0.488875 2023-10-02 01:47:04,408 WARNING [train.py:1204] (2/4) Exclude cut with ID _66070_210_7640_1_1535441393096_7805310_489-292631-0 from training. Duration: 0.79 2023-10-02 01:47:04,927 WARNING [train.py:1204] (2/4) Exclude cut with ID _14069_210_5632_1_1530512692033_4803390_438-340023-0_sp1.1 from training. Duration: 0.936375 2023-10-02 01:47:05,030 WARNING [train.py:1204] (2/4) Exclude cut with ID _7852_210_5219_1_1528286471316_8139400_242-105336-0_sp1.1 from training. Duration: 0.6545625 2023-10-02 01:47:05,137 WARNING [train.py:1204] (2/4) Exclude cut with ID _3867_210_1883_1_1526641200575_4475830_272-215563-0 from training. Duration: 0.57 2023-10-02 01:47:05,153 WARNING [train.py:1204] (2/4) Exclude cut with ID _14368_210_4220_1_1530532714870_3595529_143-117421-0_sp0.9 from training. Duration: 0.6444375 2023-10-02 01:47:05,162 WARNING [train.py:1204] (2/4) Exclude cut with ID _9235_210_5219_1_1528891154253_8046120_30-326681-0_sp1.1 from training. Duration: 0.7545625 2023-10-02 01:47:05,297 WARNING [train.py:1204] (2/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_538-218448-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 01:47:05,584 WARNING [train.py:1204] (2/4) Exclude cut with ID _33640_210_2655_1_1533117605479_4855469_60-192970-0_sp1.1 from training. Duration: 0.963625 2023-10-02 01:47:05,596 WARNING [train.py:1204] (2/4) Exclude cut with ID _65585_210_12210_1_1535450451584_3764098_112-294301-0_sp1.1 from training. Duration: 0.8 2023-10-02 01:47:05,623 WARNING [train.py:1204] (2/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_643-37412-0_sp1.1 from training. Duration: 0.936375 2023-10-02 01:47:05,640 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_9302_210_2550_1_1528889962_4655416_638-301620-0_sp1.1 from training. Duration: 0.42725 2023-10-02 01:47:05,744 WARNING [train.py:1204] (2/4) Exclude cut with ID _3867_210_1883_1_1526641200575_4475830_272-215563-0_sp1.1 from training. Duration: 0.5181875 2023-10-02 01:47:05,979 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_53679_210_4852_1_1534556726_4186141_358-154143-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 01:47:05,997 WARNING [train.py:1204] (2/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_841-341679-0_sp0.9 from training. Duration: 0.6444375 2023-10-02 01:47:06,224 WARNING [train.py:1204] (2/4) Exclude cut with ID _7160_210_1876_1_1527940923231_3703799_302-150728-0 from training. Duration: 0.78 2023-10-02 01:47:06,268 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_125-203105-0 from training. Duration: 0.8 2023-10-02 01:47:06,288 WARNING [train.py:1204] (2/4) Exclude cut with ID _5585_210_2667_1_1527418813324_8007340_89-332072-0_sp0.9 from training. Duration: 0.711125 2023-10-02 01:47:06,371 WARNING [train.py:1204] (2/4) Exclude cut with ID _3992_210_1747_1_1526644816031_3731060_36-147855-0 from training. Duration: 0.78 2023-10-02 01:47:06,387 WARNING [train.py:1204] (2/4) Exclude cut with ID _59256_210_5986_1_1535076085997_3589120_172-239045-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 01:47:06,396 WARNING [train.py:1204] (2/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_472-216663-0_sp1.1 from training. Duration: 0.836375 2023-10-02 01:47:06,446 WARNING [train.py:1204] (2/4) Exclude cut with ID _36512_210_6987_1_1533033143998_3126069_181-306500-0_sp1.1 from training. Duration: 0.863625 2023-10-02 01:47:06,461 WARNING [train.py:1204] (2/4) Exclude cut with ID _56524_210_9642_1_1534572011779_3619170_492-95008-0_sp1.1 from training. Duration: 0.97275 2023-10-02 01:47:06,622 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_33348_210_5632_1_1533086912_3324030_119-90326-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 01:47:06,682 WARNING [train.py:1204] (2/4) Exclude cut with ID _14368_210_4220_1_1530532714870_3595529_231-137892-0_sp1.1 from training. Duration: 0.9 2023-10-02 01:47:06,808 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8453_210_4211_1_1528502940_8562730_596-306820-0_sp1.1 from training. Duration: 0.8818125 2023-10-02 01:47:06,923 WARNING [train.py:1204] (2/4) Exclude cut with ID _27052_210_5986_1_1532329197187_3585009_50-242750-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 01:47:07,095 WARNING [train.py:1204] (2/4) Exclude cut with ID _13913_210_9994_1_1530579615187_3383897_358-98435-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 01:47:07,106 WARNING [train.py:1204] (2/4) Exclude cut with ID _27013_210_1389_1_1532437173240_3252649_399-229778-0_sp1.1 from training. Duration: 0.963625 2023-10-02 01:47:07,359 WARNING [train.py:1204] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_185-174805-0_sp0.9 from training. Duration: 0.7555625 2023-10-02 01:47:08,350 WARNING [train.py:1204] (2/4) Exclude cut with ID _65585_210_12210_1_1535450451584_3764098_196-221014-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 01:47:08,475 WARNING [train.py:1204] (2/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_202-293523-0_sp1.1 from training. Duration: 0.6545625 2023-10-02 01:47:08,603 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8275_210_4198_1_1528455320_3892571_213-127634-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 01:47:08,639 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_40896_210_15710_1_1533433956_7100802_921-231678-0_sp1.1 from training. Duration: 0.97275 2023-10-02 01:47:08,747 WARNING [train.py:1204] (2/4) Exclude cut with ID _38020_210_5986_1_1533112252259_7006089_493-102745-0_sp1.1 from training. Duration: 0.8909375 2023-10-02 01:47:08,851 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6846_210_6753_1_1527850767_4295158_2-315073-0 from training. Duration: 0.88 2023-10-02 01:47:08,908 WARNING [train.py:1204] (2/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_249-281349-0 from training. Duration: 0.85 2023-10-02 01:47:09,073 WARNING [train.py:1204] (2/4) Exclude cut with ID _18513_210_9206_1_1531618215223_3862620_314-83429-0_sp1.1 from training. Duration: 0.97275 2023-10-02 01:47:09,214 WARNING [train.py:1204] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_215-202973-0_sp0.9 from training. Duration: 0.97775 2023-10-02 01:47:09,232 WARNING [train.py:1204] (2/4) Exclude cut with ID _7719_210_3525_1_1528456153163_4296960_88-250259-0_sp1.1 from training. Duration: 0.763625 2023-10-02 01:47:09,340 WARNING [train.py:1204] (2/4) Exclude cut with ID _7606_210_4915_1_1528894253277_5080690_301-10732-0_sp1.1 from training. Duration: 0.736375 2023-10-02 01:47:09,349 WARNING [train.py:1204] (2/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_818-11907-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 01:47:09,430 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_5959_210_1794_1_1527400400_3445726_412-44052-0_sp1.1 from training. Duration: 0.7 2023-10-02 01:47:09,560 WARNING [train.py:1204] (2/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_6-279195-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 01:47:09,742 WARNING [train.py:1204] (2/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_252-216674-0_sp0.9 from training. Duration: 0.6 2023-10-02 01:47:09,896 WARNING [train.py:1204] (2/4) Exclude cut with ID _14368_210_4220_1_1530532714870_3595529_144-3287-0_sp1.1 from training. Duration: 0.6090625 2023-10-02 01:47:10,045 WARNING [train.py:1204] (2/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_342-3879-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 01:47:10,086 WARNING [train.py:1204] (2/4) Exclude cut with ID _33640_210_2655_1_1533117605479_4855469_584-65630-0_sp1.1 from training. Duration: 0.97275 2023-10-02 01:47:10,156 WARNING [train.py:1204] (2/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_37-189262-0 from training. Duration: 0.98 2023-10-02 01:47:10,200 WARNING [train.py:1204] (2/4) Exclude cut with ID _9389_210_1664_1_1529217337301_5289269_379-128794-0_sp0.9 from training. Duration: 0.9444375 2023-10-02 01:47:10,303 WARNING [train.py:1204] (2/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_119-207162-0_sp1.1 from training. Duration: 0.6454375 2023-10-02 01:47:10,617 WARNING [train.py:1204] (2/4) Exclude cut with ID _2941_210_2577_1_1526199673150_2489759_200-333643-0_sp1.1 from training. Duration: 0.936375 2023-10-02 01:47:10,712 WARNING [train.py:1204] (2/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_479-125611-0_sp1.1 from training. Duration: 0.87275 2023-10-02 01:47:10,727 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_3475_210_2028_1_1526193578_3622569_64-297072-0_sp1.1 from training. Duration: 0.82725 2023-10-02 01:47:10,760 WARNING [train.py:1204] (2/4) Exclude cut with ID _53565_210_10635_1_1534337218335_3820259_139-346291-0_sp1.1 from training. Duration: 0.97275 2023-10-02 01:47:10,806 WARNING [train.py:1204] (2/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_588-125165-0_sp1.1 from training. Duration: 0.7545625 2023-10-02 01:47:10,890 WARNING [train.py:1204] (2/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_938-205135-0_sp1.1 from training. Duration: 0.7909375 2023-10-02 01:47:11,055 WARNING [train.py:1204] (2/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_216-48707-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 01:47:11,120 WARNING [train.py:1204] (2/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_477-255782-0_sp1.1 from training. Duration: 0.7454375 2023-10-02 01:47:11,167 WARNING [train.py:1204] (2/4) Exclude cut with ID _16909_210_5724_1_1531292079912_4118919_583-92842-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 01:47:11,251 WARNING [train.py:1204] (2/4) Exclude cut with ID _60860_210_8254_1_1534896015875_3557169_245-256666-0 from training. Duration: 0.97 2023-10-02 01:47:11,561 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_45399_210_4852_1_1533624725_7579737_349-116173-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 01:47:12,561 WARNING [train.py:1204] (2/4) Exclude cut with ID _8699_210_2069_1_1528630346831_3960409_155-39766-0 from training. Duration: 0.78 2023-10-02 01:47:12,580 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_86-16881-0_sp1.1 from training. Duration: 0.52725 2023-10-02 01:47:13,078 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_14056_210_4163_1_1530604483_7572827_191-159384-0_sp1.1 from training. Duration: 0.97275 2023-10-02 01:47:13,092 WARNING [train.py:1204] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_307-158462-0_sp1.1 from training. Duration: 0.62725 2023-10-02 01:47:13,221 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7544_210_6753_1_1528591327_1471426_33-346031-0 from training. Duration: 0.61 2023-10-02 01:47:13,333 WARNING [train.py:1204] (2/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_112-149213-0 from training. Duration: 0.95 2023-10-02 01:47:13,482 WARNING [train.py:1204] (2/4) Exclude cut with ID _55844_210_2346_1_1534471212569_7625707_925-191342-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 01:47:13,617 WARNING [train.py:1204] (2/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_87-278686-0 from training. Duration: 0.77 2023-10-02 01:47:13,652 WARNING [train.py:1204] (2/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_415-78792-0_sp1.1 from training. Duration: 0.736375 2023-10-02 01:47:13,667 WARNING [train.py:1204] (2/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_107-332398-0_sp0.9 from training. Duration: 0.8666875 2023-10-02 01:47:13,747 WARNING [train.py:1204] (2/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_72-251255-0 from training. Duration: 0.94 2023-10-02 01:47:13,776 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_431-227273-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 01:47:13,834 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7544_210_6753_1_1528587000_691468_87-178599-0_sp0.9 from training. Duration: 0.9666875 2023-10-02 01:47:13,903 WARNING [train.py:1204] (2/4) Exclude cut with ID _13916_210_9994_1_1530788341126_3763149_60-191556-0 from training. Duration: 0.61 2023-10-02 01:47:13,905 WARNING [train.py:1204] (2/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_746-164782-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 01:47:13,958 WARNING [train.py:1204] (2/4) Exclude cut with ID _33640_210_2655_1_1533117605479_4855469_596-21066-0_sp1.1 from training. Duration: 0.92725 2023-10-02 01:47:14,091 WARNING [train.py:1204] (2/4) Exclude cut with ID _40402_210_18219_1_1533522324115_2953150_320-14078-0 from training. Duration: 0.95 2023-10-02 01:47:14,124 WARNING [train.py:1204] (2/4) Exclude cut with ID _29877_210_6228_1_1532397791106_5276149_266-62291-0 from training. Duration: 0.87 2023-10-02 01:47:14,185 WARNING [train.py:1204] (2/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_176-56120-0_sp0.9 from training. Duration: 0.92225 2023-10-02 01:47:14,228 WARNING [train.py:1204] (2/4) Exclude cut with ID _6275_210_2084_1_1528110062213_7669049_91-26919-0 from training. Duration: 0.97 2023-10-02 01:47:14,242 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_224-322394-0_sp0.9 from training. Duration: 0.7333125 2023-10-02 01:47:14,290 WARNING [train.py:1204] (2/4) Exclude cut with ID _44982_210_1511_1_1534312820108_3726029_586-297059-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 01:47:14,347 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_196-289637-0_sp1.1 from training. Duration: 0.6818125 2023-10-02 01:47:14,348 WARNING [train.py:1204] (2/4) Exclude cut with ID _18513_210_9206_1_1531618215223_3862620_402-5957-0 from training. Duration: 0.99 2023-10-02 01:47:14,354 WARNING [train.py:1204] (2/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_81-300212-0 from training. Duration: 0.6 2023-10-02 01:47:14,522 WARNING [train.py:1204] (2/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_166-128941-0 from training. Duration: 0.54 2023-10-02 01:47:14,535 WARNING [train.py:1204] (2/4) Exclude cut with ID _58059_210_5854_1_1534901471149_3164808_57-309660-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 01:47:14,595 WARNING [train.py:1204] (2/4) Exclude cut with ID _60621_210_17071_1_1535013053530_3671739_73-155814-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 01:47:14,693 WARNING [train.py:1204] (2/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_726-270780-0 from training. Duration: 0.86 2023-10-02 01:47:14,715 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8853_210_3273_1_1528633899_818968_51-187685-0_sp1.1 from training. Duration: 0.62725 2023-10-02 01:47:14,846 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_315-212794-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 01:47:14,923 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_53373_210_5399_1_1534229471_4715695_314-122795-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 01:47:15,047 WARNING [train.py:1204] (2/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_187-221566-0_sp0.9 from training. Duration: 0.47775 2023-10-02 01:47:15,182 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_12868_210_6753_1_1530702479_3610564_133-564-0 from training. Duration: 0.79 2023-10-02 01:47:15,220 WARNING [train.py:1204] (2/4) Exclude cut with ID _55153_210_2253_1_1534384786677_3162096_255-228344-0_sp1.1 from training. Duration: 0.936375 2023-10-02 01:47:15,351 WARNING [train.py:1204] (2/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_383-165234-0_sp1.1 from training. Duration: 0.8545625 2023-10-02 01:47:15,643 WARNING [train.py:1204] (2/4) Exclude cut with ID IC0677W0056-334889 from training. Duration: 0.768 2023-10-02 01:47:15,864 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_14953_210_2151_1_1530701710_5616165_710-165400-0_sp0.9 from training. Duration: 0.9666875 2023-10-02 01:47:15,967 WARNING [train.py:1204] (2/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_355-236205-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 01:47:15,969 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_34450_210_9547_1_1533099443_4109050_503-246893-0_sp1.1 from training. Duration: 0.963625 2023-10-02 01:47:16,239 WARNING [train.py:1204] (2/4) Exclude cut with ID _43552_210_8913_1_1533726031059_3523429_25-120161-0 from training. Duration: 0.98 2023-10-02 01:47:16,275 WARNING [train.py:1204] (2/4) Exclude cut with ID _66112_210_10635_1_1535590236305_3801484_205-311930-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 01:47:16,951 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_48049_210_5632_1_1533864553_3519937_185-281218-0_sp1.1 from training. Duration: 0.92725 2023-10-02 01:47:16,978 WARNING [train.py:1204] (2/4) Exclude cut with ID _48782_210_12658_1_1534467701544_2300422_191-332415-0_sp1.1 from training. Duration: 0.97275 2023-10-02 01:47:17,120 WARNING [train.py:1204] (2/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_907-151398-0_sp1.1 from training. Duration: 0.336375 2023-10-02 01:47:17,168 WARNING [train.py:1204] (2/4) Exclude cut with ID _67499_210_17282_1_1535509805220_3692430_20-179346-0_sp1.1 from training. Duration: 0.8818125 2023-10-02 01:47:17,373 WARNING [train.py:1204] (2/4) Exclude cut with ID _14998_210_5219_1_1530705582080_7474970_441-91466-0_sp1.1 from training. Duration: 0.8818125 2023-10-02 01:47:17,460 WARNING [train.py:1204] (2/4) Exclude cut with ID _46681_210_9642_1_1533893005571_7603359_786-164007-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 01:47:17,878 WARNING [train.py:1204] (2/4) Exclude cut with ID _14970_210_15709_1_1530773543493_1715730_187-20259-0 from training. Duration: 0.89 2023-10-02 01:47:18,204 WARNING [train.py:1204] (2/4) Exclude cut with ID _12734_210_8341_1_1530183614296_3891929_383-339075-0_sp1.1 from training. Duration: 0.963625 2023-10-02 01:47:18,891 WARNING [train.py:1204] (2/4) Exclude cut with ID _37086_210_9038_1_1532999540891_7021250_834-322225-0_sp1.1 from training. Duration: 0.92725 2023-10-02 01:47:19,028 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_10987_210_4163_1_1529760555_4123902_56-290559-0_sp1.1 from training. Duration: 0.963625 2023-10-02 01:47:19,039 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_92-246270-0_sp0.9 from training. Duration: 0.9444375 2023-10-02 01:47:19,070 WARNING [train.py:1204] (2/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_56-13561-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 01:47:19,102 WARNING [train.py:1204] (2/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_393-57888-0_sp1.1 from training. Duration: 0.5818125 2023-10-02 01:47:19,132 WARNING [train.py:1204] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_168-32906-0_sp1.1 from training. Duration: 0.62725 2023-10-02 01:47:19,193 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_51225_210_3601_1_1534400634_2327383_127-207553-0_sp1.1 from training. Duration: 0.963625 2023-10-02 01:47:19,501 WARNING [train.py:1204] (2/4) Exclude cut with ID _58583_210_5236_1_1534726835266_3574249_494-203521-0_sp1.1 from training. Duration: 0.963625 2023-10-02 01:47:19,520 WARNING [train.py:1204] (2/4) Exclude cut with ID _49376_210_5986_1_1533985278732_3370740_354-344695-0 from training. Duration: 0.99 2023-10-02 01:47:19,589 WARNING [train.py:1204] (2/4) Exclude cut with ID _8021_210_7994_1_1528376451358_3653069_179-45206-0_sp1.1 from training. Duration: 0.7818125 2023-10-02 01:47:19,764 WARNING [train.py:1204] (2/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_742-154017-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 01:47:19,873 WARNING [train.py:1204] (2/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_222-198127-0_sp1.1 from training. Duration: 0.763625 2023-10-02 01:47:19,921 WARNING [train.py:1204] (2/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_235-307337-0 from training. Duration: 0.94 2023-10-02 01:47:20,012 WARNING [train.py:1204] (2/4) Exclude cut with ID _8393_210_6702_1_1528505860249_3457328_453-176500-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 01:47:20,044 WARNING [train.py:1204] (2/4) Exclude cut with ID _53565_210_10635_1_1534337218335_3820259_102-179528-0_sp1.1 from training. Duration: 0.97275 2023-10-02 01:47:20,226 WARNING [train.py:1204] (2/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_43-206757-0_sp1.1 from training. Duration: 0.6818125 2023-10-02 01:47:20,332 WARNING [train.py:1204] (2/4) Exclude cut with ID _8394_210_6702_1_1528590504316_3540189_335-274780-0_sp1.1 from training. Duration: 0.7090625 2023-10-02 01:47:20,368 WARNING [train.py:1204] (2/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_128-296581-0_sp1.1 from training. Duration: 0.67275 2023-10-02 01:47:21,629 WARNING [train.py:1204] (2/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_111-112362-0 from training. Duration: 0.91 2023-10-02 01:47:21,683 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_5522_210_3273_1_1527249606_3909096_366-172508-0_sp1.1 from training. Duration: 0.6 2023-10-02 01:47:21,790 WARNING [train.py:1204] (2/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_47-167373-0_sp1.1 from training. Duration: 0.836375 2023-10-02 01:47:21,832 WARNING [train.py:1204] (2/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_41-234353-0 from training. Duration: 0.68 2023-10-02 01:47:21,879 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6291_210_3273_1_1527683649_1078498_82-221196-0_sp0.9 from training. Duration: 0.8444375 2023-10-02 01:47:21,977 WARNING [train.py:1204] (2/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_47-4814-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 01:47:22,007 WARNING [train.py:1204] (2/4) Exclude cut with ID _43541_210_10626_1_1533796233418_3635440_40-247993-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 01:47:22,069 WARNING [train.py:1204] (2/4) Exclude cut with ID _21989_210_5854_1_1531899214931_3271360_133-44759-0_sp0.9 from training. Duration: 0.8555625 2023-10-02 01:47:22,072 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_94-195801-0 from training. Duration: 0.94 2023-10-02 01:47:22,154 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_141-89113-0 from training. Duration: 0.86 2023-10-02 01:47:22,250 WARNING [train.py:1204] (2/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_683-284578-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 01:47:22,407 WARNING [train.py:1204] (2/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_6-214815-0_sp1.1 from training. Duration: 0.77275 2023-10-02 01:47:22,639 WARNING [train.py:1204] (2/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_556-212082-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 01:47:22,685 WARNING [train.py:1204] (2/4) Exclude cut with ID _44902_210_16187_1_1533888006062_3792078_149-67212-0 from training. Duration: 0.87 2023-10-02 01:47:22,817 WARNING [train.py:1204] (2/4) Exclude cut with ID _61148_210_6897_1_1535094022726_4217610_333-159440-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 01:47:23,054 WARNING [train.py:1204] (2/4) Exclude cut with ID _3120_210_3525_1_1526036143112_4072980_404-249994-0 from training. Duration: 0.99 2023-10-02 01:47:23,144 WARNING [train.py:1204] (2/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_234-260682-0_sp0.9 from training. Duration: 0.9333125 2023-10-02 01:47:23,161 WARNING [train.py:1204] (2/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_374-138971-0_sp1.1 from training. Duration: 0.8545625 2023-10-02 01:47:23,242 WARNING [train.py:1204] (2/4) Exclude cut with ID _53713_210_24116_1_1534314487762_7416751_743-302085-0_sp1.1 from training. Duration: 0.92725 2023-10-02 01:47:23,303 WARNING [train.py:1204] (2/4) Exclude cut with ID _18516_210_9206_1_1531704653206_3692406_198-334870-0_sp1.1 from training. Duration: 0.836375 2023-10-02 01:47:23,434 WARNING [train.py:1204] (2/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_395-73570-0_sp1.1 from training. Duration: 0.9 2023-10-02 01:47:23,621 WARNING [train.py:1204] (2/4) Exclude cut with ID _46683_210_9642_1_1533730913492_4591116_421-19547-0_sp1.1 from training. Duration: 0.936375 2023-10-02 01:47:23,676 WARNING [train.py:1204] (2/4) Exclude cut with ID _19896_210_1883_1_1531709022662_6649569_714-8668-0_sp1.1 from training. Duration: 0.963625 2023-10-02 01:47:23,703 WARNING [train.py:1204] (2/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_49-101072-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 01:47:23,719 WARNING [train.py:1204] (2/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_220-191080-0_sp1.1 from training. Duration: 0.8909375 2023-10-02 01:47:23,764 WARNING [train.py:1204] (2/4) Exclude cut with ID _3465_210_3525_1_1526175667060_3574049_62-335552-0 from training. Duration: 0.87 2023-10-02 01:47:23,869 WARNING [train.py:1204] (2/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_111-112362-0_sp1.1 from training. Duration: 0.82725 2023-10-02 01:47:23,999 WARNING [train.py:1204] (2/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_318-59097-0_sp0.9 from training. Duration: 0.8444375 2023-10-02 01:47:24,266 WARNING [train.py:1204] (2/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_98-255710-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 01:47:24,272 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9042W0003-515609 from training. Duration: 0.896 2023-10-02 01:47:24,286 WARNING [train.py:1204] (2/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_158-250116-0_sp0.9 from training. Duration: 0.688875 2023-10-02 01:47:24,394 WARNING [train.py:1204] (2/4) Exclude cut with ID _26750_210_5665_1_1532226678442_4063230_7-346885-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 01:47:24,428 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_37839_210_7234_1_1533542462_3689303_357-227078-0_sp1.1 from training. Duration: 0.963625 2023-10-02 01:47:24,462 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9052W0119-520904 from training. Duration: 0.896 2023-10-02 01:47:24,680 WARNING [train.py:1204] (2/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_249-281349-0_sp1.1 from training. Duration: 0.77275 2023-10-02 01:47:24,692 WARNING [train.py:1204] (2/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_147-293311-0 from training. Duration: 0.94 2023-10-02 01:47:24,693 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_158-21166-0_sp1.1 from training. Duration: 0.963625 2023-10-02 01:47:24,979 WARNING [train.py:1204] (2/4) Exclude cut with ID _14100_210_8341_1_1530529320613_3678069_207-332194-0_sp1.1 from training. Duration: 0.92725 2023-10-02 01:47:25,026 WARNING [train.py:1204] (2/4) Exclude cut with ID _9389_210_1664_1_1529217337301_5289269_324-211031-0_sp1.1 from training. Duration: 0.963625 2023-10-02 01:47:25,038 WARNING [train.py:1204] (2/4) Exclude cut with ID _14603_210_4220_1_1530615688441_3097319_133-158308-0 from training. Duration: 0.84 2023-10-02 01:47:25,171 WARNING [train.py:1204] (2/4) Exclude cut with ID _13252_210_1925_1_1530671372047_3336909_166-314607-0 from training. Duration: 0.54 2023-10-02 01:47:25,973 WARNING [train.py:1204] (2/4) Exclude cut with ID _32704_210_8954_1_1533017488840_6747370_649-79538-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 01:47:25,980 WARNING [train.py:1204] (2/4) Exclude cut with ID _28394_210_8913_1_1532430049518_3650118_343-115667-0_sp1.1 from training. Duration: 0.97275 2023-10-02 01:47:26,025 WARNING [train.py:1204] (2/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_87-278686-0_sp0.9 from training. Duration: 0.8555625 2023-10-02 01:47:26,218 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9109W0003-549749 from training. Duration: 0.768 2023-10-02 01:47:26,384 WARNING [train.py:1204] (2/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_126-29104-0_sp1.1 from training. Duration: 0.77275 2023-10-02 01:47:26,575 WARNING [train.py:1204] (2/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_325-113798-0 from training. Duration: 0.85 2023-10-02 01:47:26,634 WARNING [train.py:1204] (2/4) Exclude cut with ID _6694_210_4599_1_1527852637490_3965529_116-109398-0 from training. Duration: 0.73 2023-10-02 01:47:26,676 WARNING [train.py:1204] (2/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_46-57119-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 01:47:26,796 WARNING [train.py:1204] (2/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_546-119312-0_sp1.1 from training. Duration: 0.9 2023-10-02 01:47:27,001 WARNING [train.py:1204] (2/4) Exclude cut with ID _60057_210_10640_1_1534903065793_3333150_468-306939-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 01:47:27,043 WARNING [train.py:1204] (2/4) Exclude cut with ID _15150_210_8341_1_1530752411803_7256384_22-138274-0 from training. Duration: 0.96 2023-10-02 01:47:27,074 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_289-180904-0 from training. Duration: 0.76 2023-10-02 01:47:27,440 WARNING [train.py:1204] (2/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_64-133721-0_sp1.1 from training. Duration: 0.92725 2023-10-02 01:47:27,496 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_195-257512-0_sp1.1 from training. Duration: 0.6090625 2023-10-02 01:47:27,785 WARNING [train.py:1204] (2/4) Exclude cut with ID _66873_210_5517_1_1535454136506_4293370_704-101963-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 01:47:28,008 WARNING [train.py:1204] (2/4) Exclude cut with ID _4366_210_1388_1_1526792053633_3613819_231-211402-0 from training. Duration: 0.94 2023-10-02 01:47:28,060 WARNING [train.py:1204] (2/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_679-190385-0_sp1.1 from training. Duration: 0.97275 2023-10-02 01:47:28,212 WARNING [train.py:1204] (2/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_298-81459-0_sp1.1 from training. Duration: 0.963625 2023-10-02 01:47:28,220 WARNING [train.py:1204] (2/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_658-282610-0_sp0.9 from training. Duration: 0.588875 2023-10-02 01:47:28,430 WARNING [train.py:1204] (2/4) Exclude cut with ID _57572_210_5517_1_1534921219624_3809484_196-283734-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 01:47:28,501 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_53071_210_16554_1_1534490405_2954066_294-198554-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 01:47:28,506 WARNING [train.py:1204] (2/4) Exclude cut with ID _4866_210_2577_1_1527404412284_4364659_352-157930-0 from training. Duration: 0.81 2023-10-02 01:47:28,661 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9231W0113-610139 from training. Duration: 0.896 2023-10-02 01:47:28,723 WARNING [train.py:1204] (2/4) Exclude cut with ID _24271_210_12658_1_1531987270022_7190519_638-171906-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 01:47:28,761 WARNING [train.py:1204] (2/4) Exclude cut with ID _14170_210_5219_1_1530525949868_4469650_272-249985-0 from training. Duration: 0.67 2023-10-02 01:47:28,776 WARNING [train.py:1204] (2/4) Exclude cut with ID _64086_210_7994_1_1535270417019_7435949_449-31567-0 from training. Duration: 0.97 2023-10-02 01:47:29,065 WARNING [train.py:1204] (2/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_109-282568-0_sp1.1 from training. Duration: 0.97275 2023-10-02 01:47:30,276 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_121-45934-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 01:47:30,468 WARNING [train.py:1204] (2/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_314-126819-0 from training. Duration: 0.5 2023-10-02 01:47:30,496 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9309W0192-640319 from training. Duration: 0.512 2023-10-02 01:47:30,510 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9310W0003-640649 from training. Duration: 0.896 2023-10-02 01:47:30,664 WARNING [train.py:1204] (2/4) Exclude cut with ID _7160_210_1876_1_1527940923231_3703799_120-150943-0 from training. Duration: 0.81 2023-10-02 01:47:30,665 WARNING [train.py:1204] (2/4) Exclude cut with ID _40380_210_9059_1_1533380594759_4187380_6-33508-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 01:47:30,874 WARNING [train.py:1204] (2/4) Exclude cut with ID _59202_210_26984_1_1534816810612_3549174_137-318710-0 from training. Duration: 0.89 2023-10-02 01:47:31,139 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_435-313305-0 from training. Duration: 0.71 2023-10-02 01:47:31,327 WARNING [train.py:1204] (2/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_91-212486-0_sp1.1 from training. Duration: 0.6181875 2023-10-02 01:47:31,430 WARNING [train.py:1204] (2/4) Exclude cut with ID _14603_210_4220_1_1530615688441_3097319_133-158308-0_sp1.1 from training. Duration: 0.763625 2023-10-02 01:47:31,583 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_99-251314-0 from training. Duration: 0.87 2023-10-02 01:47:31,737 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_365-217119-0_sp1.1 from training. Duration: 0.57275 2023-10-02 01:47:31,777 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_3475_210_2028_1_1526193578_3622569_377-166133-0 from training. Duration: 0.86 2023-10-02 01:47:32,196 WARNING [train.py:1204] (2/4) Exclude cut with ID _13916_210_9994_1_1530788341126_3763149_116-21034-0_sp1.1 from training. Duration: 0.87275 2023-10-02 01:47:32,206 WARNING [train.py:1204] (2/4) Exclude cut with ID _64220_210_9527_1_1535250151772_4875259_142-216831-0_sp1.1 from training. Duration: 0.936375 2023-10-02 01:47:32,207 WARNING [train.py:1204] (2/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_490-207861-0_sp1.1 from training. Duration: 0.8909375 2023-10-02 01:47:32,260 WARNING [train.py:1204] (2/4) Exclude cut with ID _41314_210_12651_1_1533459476543_3766460_87-272068-0_sp0.9 from training. Duration: 0.92225 2023-10-02 01:47:32,293 WARNING [train.py:1204] (2/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_18-46756-0_sp1.1 from training. Duration: 0.7181875 2023-10-02 01:47:32,371 WARNING [train.py:1204] (2/4) Exclude cut with ID _39411_210_5986_1_1533538825030_3343649_98-311335-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 01:47:32,435 WARNING [train.py:1204] (2/4) Exclude cut with ID _21878_210_3525_1_1531738772002_7264330_113-18163-0 from training. Duration: 0.89 2023-10-02 01:47:32,437 WARNING [train.py:1204] (2/4) Exclude cut with ID _6653_210_2629_1_1527919172441_6732679_1103-56445-0 from training. Duration: 0.73 2023-10-02 01:47:32,498 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_3474_210_2028_1_1526188513_4825246_145-309131-0 from training. Duration: 0.74 2023-10-02 01:47:32,523 WARNING [train.py:1204] (2/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_120-211630-0_sp1.1 from training. Duration: 0.836375 2023-10-02 01:47:32,533 WARNING [train.py:1204] (2/4) Exclude cut with ID _8030_210_7497_1_1528444886781_3531089_32-31837-0 from training. Duration: 0.97 2023-10-02 01:47:32,665 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_19949_210_2161_1_1531706337_928079_8-160956-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 01:47:32,706 WARNING [train.py:1204] (2/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_321-215742-0_sp1.1 from training. Duration: 0.4545625 2023-10-02 01:47:32,851 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_583-124650-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 01:47:32,957 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_30434_210_13049_1_1532519384_981746_164-182755-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 01:47:33,267 WARNING [train.py:1204] (2/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_339-104791-0_sp1.1 from training. Duration: 0.4454375 2023-10-02 01:47:33,290 WARNING [train.py:1204] (2/4) Exclude cut with ID _15026_210_8750_1_1530777602316_3291850_211-195043-0 from training. Duration: 0.8 2023-10-02 01:47:33,378 WARNING [train.py:1204] (2/4) Exclude cut with ID _29877_210_6228_1_1532397791106_5276149_487-182647-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 01:47:33,416 WARNING [train.py:1204] (2/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_127-566-0_sp0.9 from training. Duration: 0.9333125 2023-10-02 01:47:33,582 WARNING [train.py:1204] (2/4) Exclude cut with ID _8263_210_1664_1_1528439452891_970220_68-181805-0_sp0.9 from training. Duration: 0.711125 2023-10-02 01:47:33,584 WARNING [train.py:1204] (2/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_504-72513-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 01:47:33,592 WARNING [train.py:1204] (2/4) Exclude cut with ID _64639_210_5816_1_1535367619437_3528000_324-265233-0_sp1.1 from training. Duration: 0.963625 2023-10-02 01:47:33,672 WARNING [train.py:1204] (2/4) Exclude cut with ID _6456_210_1534_1_1527681634212_3181049_215-309359-0_sp0.9 from training. Duration: 0.97775 2023-10-02 01:47:33,679 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_115-233929-0_sp1.1 from training. Duration: 0.6545625 2023-10-02 01:47:33,711 WARNING [train.py:1204] (2/4) Exclude cut with ID _14087_210_2253_1_1530529224512_3014029_219-224445-0 from training. Duration: 0.84 2023-10-02 01:47:34,472 WARNING [train.py:1204] (2/4) Exclude cut with ID _14867_210_1968_1_1530684179890_3846969_413-29292-0_sp1.1 from training. Duration: 0.8181875 2023-10-02 01:47:34,517 WARNING [train.py:1204] (2/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_1012-279473-0_sp0.9 from training. Duration: 0.7444375 2023-10-02 01:47:34,673 WARNING [train.py:1204] (2/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_605-133395-0_sp0.9 from training. Duration: 0.7333125 2023-10-02 01:47:34,819 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_89-35748-0 from training. Duration: 0.79 2023-10-02 01:47:34,927 WARNING [train.py:1204] (2/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_476-272757-0_sp1.1 from training. Duration: 0.8454375 2023-10-02 01:47:35,341 WARNING [train.py:1204] (2/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_449-15508-0_sp0.9 from training. Duration: 0.911125 2023-10-02 01:47:35,503 WARNING [train.py:1204] (2/4) Exclude cut with ID _29345_210_4381_1_1532689168263_3860169_198-174328-0 from training. Duration: 0.91 2023-10-02 01:47:35,768 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_15517_210_6753_1_1530952293_1664795_125-158157-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 01:47:36,217 WARNING [train.py:1204] (2/4) Exclude cut with ID _25565_210_12392_1_1532423009382_3952098_420-113180-0_sp1.1 from training. Duration: 0.963625 2023-10-02 01:47:36,304 WARNING [train.py:1204] (2/4) Exclude cut with ID _51471_210_1889_1_1534399391390_3094250_236-146773-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 01:47:36,590 WARNING [train.py:1204] (2/4) Exclude cut with ID _30039_210_13270_1_1532484612900_2725150_175-35012-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 01:47:36,619 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_23850_210_4238_1_1532232597_1632433_99-275843-0_sp1.1 from training. Duration: 0.936375 2023-10-02 01:47:36,792 WARNING [train.py:1204] (2/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_658-282610-0 from training. Duration: 0.53 2023-10-02 01:47:37,053 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_37818_210_4238_1_1533620910_4720083_234-65934-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 01:47:37,132 WARNING [train.py:1204] (2/4) Exclude cut with ID _3067_210_2667_1_1526212974640_4503168_459-173867-0_sp1.1 from training. Duration: 0.8 2023-10-02 01:47:37,160 WARNING [train.py:1204] (2/4) Exclude cut with ID _8696_210_1876_1_1528545632440_3345058_209-54069-0_sp0.9 from training. Duration: 0.7444375 2023-10-02 01:47:37,385 WARNING [train.py:1204] (2/4) Exclude cut with ID _62574_210_2655_1_1535180407058_3933249_420-236465-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 01:47:37,457 WARNING [train.py:1204] (2/4) Exclude cut with ID _46681_210_9642_1_1533893005571_7603359_333-4042-0_sp1.1 from training. Duration: 0.936375 2023-10-02 01:47:37,542 WARNING [train.py:1204] (2/4) Exclude cut with ID _3541_210_2477_1_1526475408709_3263973_377-296839-0 from training. Duration: 0.55 2023-10-02 01:47:37,922 WARNING [train.py:1204] (2/4) Exclude cut with ID _43997_210_4525_1_1533538632555_5189329_426-92599-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 01:47:38,031 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_43-71989-0_sp1.1 from training. Duration: 0.5181875 2023-10-02 01:47:38,808 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_2009_210_2071_1_1525481134_4588937_140-345530-0_sp1.1 from training. Duration: 0.963625 2023-10-02 01:47:38,816 WARNING [train.py:1204] (2/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_138-332773-0_sp0.9 from training. Duration: 0.9 2023-10-02 01:47:38,877 WARNING [train.py:1204] (2/4) Exclude cut with ID _13942_210_9988_1_1530604906036_3733360_499-55676-0_sp0.9 from training. Duration: 0.688875 2023-10-02 01:47:38,907 WARNING [train.py:1204] (2/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_259-34038-0_sp1.1 from training. Duration: 0.67275 2023-10-02 01:47:38,920 WARNING [train.py:1204] (2/4) Exclude cut with ID _3120_210_3525_1_1526036143112_4072980_404-249994-0_sp1.1 from training. Duration: 0.9 2023-10-02 01:47:38,955 WARNING [train.py:1204] (2/4) Exclude cut with ID _40402_210_18219_1_1533522324115_2953150_222-108810-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 01:47:39,196 WARNING [train.py:1204] (2/4) Exclude cut with ID _22083_210_8254_1_1531702862439_3678970_49-234546-0_sp0.9 from training. Duration: 0.92225 2023-10-02 01:47:39,306 WARNING [train.py:1204] (2/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_199-295611-0_sp0.9 from training. Duration: 0.611125 2023-10-02 01:47:39,481 WARNING [train.py:1204] (2/4) Exclude cut with ID _43552_210_8913_1_1533726031059_3523429_43-192945-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 01:47:39,746 WARNING [train.py:1204] (2/4) Exclude cut with ID _14224_210_4381_1_1530529062037_4264010_364-22817-0 from training. Duration: 0.71 2023-10-02 01:47:39,853 WARNING [train.py:1204] (2/4) Exclude cut with ID _14922_210_6228_1_1530707435273_3693310_388-129552-0_sp1.1 from training. Duration: 0.87275 2023-10-02 01:47:39,914 WARNING [train.py:1204] (2/4) Exclude cut with ID _6537_210_1883_1_1527987559710_3597129_190-280227-0_sp1.1 from training. Duration: 0.97275 2023-10-02 01:47:40,073 WARNING [train.py:1204] (2/4) Exclude cut with ID _55844_210_2346_1_1534471212569_7625707_398-56897-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 01:47:40,145 WARNING [train.py:1204] (2/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_48-182155-0_sp1.1 from training. Duration: 0.7909375 2023-10-02 01:47:40,172 WARNING [train.py:1204] (2/4) Exclude cut with ID _29529_210_7234_1_1532393961455_3977460_582-18084-0_sp1.1 from training. Duration: 0.97275 2023-10-02 01:47:40,310 WARNING [train.py:1204] (2/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_912-282412-0 from training. Duration: 0.49 2023-10-02 01:47:40,474 WARNING [train.py:1204] (2/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_332-256168-0_sp0.9 from training. Duration: 0.8333125 2023-10-02 01:47:40,534 WARNING [train.py:1204] (2/4) Exclude cut with ID _64220_210_9527_1_1535250151772_4875259_55-250624-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 01:47:40,705 WARNING [train.py:1204] (2/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_264-108685-0 from training. Duration: 0.55 2023-10-02 01:47:40,893 WARNING [train.py:1204] (2/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_613-232856-0_sp0.9 from training. Duration: 0.6 2023-10-02 01:47:40,997 WARNING [train.py:1204] (2/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_68-212801-0_sp1.1 from training. Duration: 0.663625 2023-10-02 01:47:41,040 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6290_210_3273_1_1527595224_1282787_115-87095-0 from training. Duration: 0.55 2023-10-02 01:47:41,082 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_3080_210_2071_1_1526086102_4365166_312-8726-0 from training. Duration: 0.91 2023-10-02 01:47:41,266 WARNING [train.py:1204] (2/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_105-63309-0 from training. Duration: 0.98 2023-10-02 01:47:41,344 WARNING [train.py:1204] (2/4) Exclude cut with ID _2941_210_2577_1_1526199673150_2489759_201-160779-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 01:47:41,347 WARNING [train.py:1204] (2/4) Exclude cut with ID ID0406W0226-885434 from training. Duration: 0.997 2023-10-02 01:47:41,353 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_17292_210_6753_1_1531125388_2943734_353-328201-0_sp1.1 from training. Duration: 0.97275 2023-10-02 01:47:41,469 WARNING [train.py:1204] (2/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_572-96029-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 01:47:41,510 WARNING [train.py:1204] (2/4) Exclude cut with ID _14970_210_15709_1_1530775547361_1966965_6-23328-0_sp0.9 from training. Duration: 0.9555625 2023-10-02 01:47:41,564 WARNING [train.py:1204] (2/4) Exclude cut with ID _45012_210_12651_1_1533608435136_5371380_240-37101-0 from training. Duration: 0.93 2023-10-02 01:47:41,635 WARNING [train.py:1204] (2/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_685-90183-0_sp1.1 from training. Duration: 0.936375 2023-10-02 01:47:41,814 WARNING [train.py:1204] (2/4) Exclude cut with ID _13252_210_1925_1_1530671372047_3336909_41-22245-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 01:47:42,032 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_15126_210_6753_1_1530778677_2591185_120-277707-0 from training. Duration: 0.8 2023-10-02 01:47:42,070 WARNING [train.py:1204] (2/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_310-26186-0 from training. Duration: 0.7 2023-10-02 01:47:42,090 WARNING [train.py:1204] (2/4) Exclude cut with ID _14998_210_5219_1_1530705582080_7474970_787-294416-0 from training. Duration: 0.82 2023-10-02 01:47:43,045 WARNING [train.py:1204] (2/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_795-33252-0 from training. Duration: 0.37 2023-10-02 01:47:43,086 WARNING [train.py:1204] (2/4) Exclude cut with ID _7537_210_2084_1_1528502395431_3924359_113-199933-0_sp0.9 from training. Duration: 0.6555625 2023-10-02 01:47:43,502 WARNING [train.py:1204] (2/4) Exclude cut with ID _3465_210_3525_1_1526175667060_3574049_308-231735-0_sp0.9 from training. Duration: 0.811125 2023-10-02 01:47:43,519 WARNING [train.py:1204] (2/4) Exclude cut with ID _19504_210_9994_1_1531648863242_3325220_354-296765-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 01:47:43,637 WARNING [train.py:1204] (2/4) Exclude cut with ID _5409_210_1388_1_1527292850189_5865191_178-127970-0 from training. Duration: 0.57 2023-10-02 01:47:43,677 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_1466-248056-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 01:47:43,700 WARNING [train.py:1204] (2/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_535-14410-0_sp0.9 from training. Duration: 0.52225 2023-10-02 01:47:43,713 WARNING [train.py:1204] (2/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_747-129941-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 01:47:43,753 WARNING [train.py:1204] (2/4) Exclude cut with ID _15012_210_1968_1_1530856820429_5095350_688-178849-0_sp1.1 from training. Duration: 0.5818125 2023-10-02 01:47:43,986 WARNING [train.py:1204] (2/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_356-45054-0_sp1.1 from training. Duration: 0.7818125 2023-10-02 01:47:44,074 WARNING [train.py:1204] (2/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_146-276944-0_sp0.9 from training. Duration: 0.92225 2023-10-02 01:47:44,315 WARNING [train.py:1204] (2/4) Exclude cut with ID _39204_210_4220_1_1533880547368_3191120_346-180306-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 01:47:44,409 WARNING [train.py:1204] (2/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_641-158017-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 01:47:44,409 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_19952_210_2161_1_1531875469_3851750_274-42137-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 01:47:44,779 WARNING [train.py:1204] (2/4) Exclude cut with ID _60804_210_9642_1_1534831199859_7268299_87-327819-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 01:47:44,843 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_9302_210_2550_1_1528889962_4655416_638-301620-0 from training. Duration: 0.47 2023-10-02 01:47:44,907 WARNING [train.py:1204] (2/4) Exclude cut with ID _44304_210_15374_1_1533639352765_4613129_302-22084-0_sp1.1 from training. Duration: 0.7818125 2023-10-02 01:47:44,928 WARNING [train.py:1204] (2/4) Exclude cut with ID _18513_210_9206_1_1531618215223_3862620_177-227063-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 01:47:45,021 WARNING [train.py:1204] (2/4) Exclude cut with ID _41326_210_5854_1_1533778076509_3492080_510-45444-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 01:47:45,068 WARNING [train.py:1204] (2/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_76-311963-0_sp1.1 from training. Duration: 0.636375 2023-10-02 01:47:45,165 WARNING [train.py:1204] (2/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_156-302675-0_sp0.9 from training. Duration: 0.611125 2023-10-02 01:47:45,389 WARNING [train.py:1204] (2/4) Exclude cut with ID _53489_210_6228_1_1534298357169_3835970_367-148411-0_sp1.1 from training. Duration: 0.936375 2023-10-02 01:47:45,618 WARNING [train.py:1204] (2/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_389-144525-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 01:47:45,650 WARNING [train.py:1204] (2/4) Exclude cut with ID _14069_210_5632_1_1530512692033_4803390_158-238427-0_sp1.1 from training. Duration: 0.963625 2023-10-02 01:47:45,740 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_486-151131-0_sp0.9 from training. Duration: 0.9444375 2023-10-02 01:47:46,140 WARNING [train.py:1204] (2/4) Exclude cut with ID _39846_210_12651_1_1533603651992_1318030_188-71831-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 01:47:46,258 WARNING [train.py:1204] (2/4) Exclude cut with ID _39411_210_5986_1_1533538825030_3343649_173-238248-0_sp1.1 from training. Duration: 0.7545625 2023-10-02 01:47:47,067 WARNING [train.py:1204] (2/4) Exclude cut with ID _20684_210_6917_1_1531567751646_4203930_469-49081-0_sp1.1 from training. Duration: 0.92725 2023-10-02 01:47:47,176 WARNING [train.py:1204] (2/4) Exclude cut with ID _8833_210_5219_1_1528592665210_7553059_611-80313-0_sp1.1 from training. Duration: 0.5818125 2023-10-02 01:47:47,325 WARNING [train.py:1204] (2/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_440-141811-0 from training. Duration: 0.95 2023-10-02 01:47:47,344 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_41211_210_2544_1_1533348859_3702910_236-223652-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 01:47:47,437 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_40222_210_6228_1_1533385298_4481939_220-163998-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 01:48:15,934 INFO [train.py:1386] (2/4) Maximum memory allocated so far is 20163MB 2023-10-02 01:48:18,279 INFO [train.py:1386] (2/4) Maximum memory allocated so far is 20163MB 2023-10-02 01:48:21,315 INFO [train.py:1386] (2/4) Maximum memory allocated so far is 20163MB 2023-10-02 01:48:23,306 INFO [train.py:1386] (2/4) Maximum memory allocated so far is 20163MB 2023-10-02 01:48:29,321 INFO [train.py:1386] (2/4) Maximum memory allocated so far is 20163MB 2023-10-02 01:48:32,208 INFO [train.py:1386] (2/4) Maximum memory allocated so far is 20163MB 2023-10-02 01:48:32,220 INFO [train.py:1267] (2/4) Loading grad scaler state dict 2023-10-02 01:48:49,325 WARNING [train.py:1204] (2/4) Exclude cut with ID _14629_210_15709_1_1530604298182_956899_6-232695-0_sp1.1 from training. Duration: 0.8545625 2023-10-02 01:48:49,346 WARNING [train.py:1204] (2/4) Exclude cut with ID _13921_210_9994_1_1530841782272_3503520_327-69677-0 from training. Duration: 0.9 2023-10-02 01:48:49,370 WARNING [train.py:1204] (2/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_372-298093-0 from training. Duration: 0.8 2023-10-02 01:48:49,383 WARNING [train.py:1204] (2/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_515-139111-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 01:48:49,618 WARNING [train.py:1204] (2/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_85-65056-0_sp1.1 from training. Duration: 0.87275 2023-10-02 01:48:49,656 WARNING [train.py:1204] (2/4) Exclude cut with ID _15113_210_8254_1_1530842438579_3716279_209-54533-0_sp1.1 from training. Duration: 0.87275 2023-10-02 01:48:49,700 WARNING [train.py:1204] (2/4) Exclude cut with ID _70447_210_12392_1_1535972499300_3160420_123-312838-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 01:48:49,728 WARNING [train.py:1204] (2/4) Exclude cut with ID _60113_210_16271_1_1535164212481_4386079_70-63596-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 01:48:49,755 WARNING [train.py:1204] (2/4) Exclude cut with ID _14632_210_9988_1_1530691279392_3923200_225-188509-0_sp1.1 from training. Duration: 0.963625 2023-10-02 01:48:49,801 WARNING [train.py:1204] (2/4) Exclude cut with ID _34734_210_4525_1_1533365977542_4448720_584-180532-0_sp1.1 from training. Duration: 0.97275 2023-10-02 01:48:49,815 WARNING [train.py:1204] (2/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_351-294650-0_sp1.1 from training. Duration: 0.57275 2023-10-02 01:48:50,139 WARNING [train.py:1204] (2/4) Exclude cut with ID _13252_210_1925_1_1530671372047_3336909_263-168388-0_sp0.9 from training. Duration: 0.9555625 2023-10-02 01:48:50,165 WARNING [train.py:1204] (2/4) Exclude cut with ID _22083_210_8254_1_1531702862439_3678970_201-85738-0 from training. Duration: 0.9 2023-10-02 01:48:50,259 WARNING [train.py:1204] (2/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_237-58433-0 from training. Duration: 0.74 2023-10-02 01:48:50,284 WARNING [train.py:1204] (2/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_262-250826-0 from training. Duration: 0.99 2023-10-02 01:48:50,382 WARNING [train.py:1204] (2/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_927-43827-0_sp0.9 from training. Duration: 0.82225 2023-10-02 01:48:50,410 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_2821_210_2715_1_1526108385_3763560_73-296673-0 from training. Duration: 0.83 2023-10-02 01:48:50,480 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6287_210_3273_1_1527509506_2653716_182-199887-0 from training. Duration: 0.51 2023-10-02 01:48:50,604 WARNING [train.py:1204] (2/4) Exclude cut with ID _70519_210_4163_1_1535961595994_7458230_899-80223-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 01:48:50,855 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_210-234949-0_sp1.1 from training. Duration: 0.97275 2023-10-02 01:48:50,891 WARNING [train.py:1204] (2/4) Exclude cut with ID _30196_210_1664_1_1533708496522_7074140_768-278176-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 01:48:50,966 WARNING [train.py:1204] (2/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_315-128283-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 01:48:51,044 WARNING [train.py:1204] (2/4) Exclude cut with ID _14886_210_4220_1_1530702020355_3516190_330-323941-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 01:48:51,285 WARNING [train.py:1204] (2/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_264-348896-0_sp1.1 from training. Duration: 0.77275 2023-10-02 01:48:51,306 WARNING [train.py:1204] (2/4) Exclude cut with ID _37086_210_9038_1_1532999540891_7021250_433-10563-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 01:48:51,339 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_53638_210_21581_1_1534297319_5808449_885-80172-0_sp1.1 from training. Duration: 0.87275 2023-10-02 01:48:51,424 WARNING [train.py:1204] (2/4) Exclude cut with ID _14623_210_1925_1_1530773354962_3632609_157-218771-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 01:48:51,434 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_1368-78165-0_sp1.1 from training. Duration: 0.97275 2023-10-02 01:48:51,438 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_309-65177-0_sp0.9 from training. Duration: 0.97775 2023-10-02 01:48:51,443 WARNING [train.py:1204] (2/4) Exclude cut with ID _21632_210_2253_1_1531994001765_3744140_291-138812-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 01:48:51,455 WARNING [train.py:1204] (2/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_669-93163-0_sp1.1 from training. Duration: 0.8909375 2023-10-02 01:48:52,459 WARNING [train.py:1204] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_292-215346-0 from training. Duration: 0.97 2023-10-02 01:48:52,481 WARNING [train.py:1204] (2/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_286-222073-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 01:48:52,741 WARNING [train.py:1204] (2/4) Exclude cut with ID _59256_210_5986_1_1535076085997_3589120_121-237386-0_sp1.1 from training. Duration: 0.87275 2023-10-02 01:48:52,753 WARNING [train.py:1204] (2/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_96-320736-0 from training. Duration: 0.86 2023-10-02 01:48:52,762 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_128-202104-0 from training. Duration: 0.63 2023-10-02 01:48:52,840 WARNING [train.py:1204] (2/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_242-3472-0_sp0.9 from training. Duration: 0.811125 2023-10-02 01:48:52,881 WARNING [train.py:1204] (2/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_154-74182-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 01:48:52,885 WARNING [train.py:1204] (2/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_187-221566-0 from training. Duration: 0.43 2023-10-02 01:48:53,036 WARNING [train.py:1204] (2/4) Exclude cut with ID _6194_210_1534_1_1527511826160_3941194_14-282876-0 from training. Duration: 0.71 2023-10-02 01:48:53,074 WARNING [train.py:1204] (2/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_660-211088-0_sp1.1 from training. Duration: 0.563625 2023-10-02 01:48:53,399 WARNING [train.py:1204] (2/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_526-62079-0_sp1.1 from training. Duration: 0.6090625 2023-10-02 01:48:53,544 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_92-246270-0_sp1.1 from training. Duration: 0.77275 2023-10-02 01:48:53,630 WARNING [train.py:1204] (2/4) Exclude cut with ID IC0287W0023-142350 from training. Duration: 0.956 2023-10-02 01:48:53,706 WARNING [train.py:1204] (2/4) Exclude cut with ID _58605_210_12210_1_1534723195939_3904259_48-269402-0 from training. Duration: 0.96 2023-10-02 01:48:53,720 WARNING [train.py:1204] (2/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_416-257730-0_sp0.9 from training. Duration: 0.72225 2023-10-02 01:48:53,793 WARNING [train.py:1204] (2/4) Exclude cut with ID _7877_210_6014_1_1528286358099_3750136_185-17516-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 01:48:53,896 WARNING [train.py:1204] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_23-189234-0 from training. Duration: 0.83 2023-10-02 01:48:53,913 WARNING [train.py:1204] (2/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_237-89684-0 from training. Duration: 0.96 2023-10-02 01:48:53,978 WARNING [train.py:1204] (2/4) Exclude cut with ID _13916_210_9994_1_1530788341126_3763149_116-21034-0 from training. Duration: 0.96 2023-10-02 01:48:54,145 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_3133_210_2087_1_1526106446_2324690_246-125000-0_sp1.1 from training. Duration: 0.563625 2023-10-02 01:48:57,892 INFO [train.py:1046] (2/4) Epoch 21, batch 0, loss[loss=0.1643, simple_loss=0.2518, pruned_loss=0.03843, over 24649.00 frames. ], tot_loss[loss=0.1643, simple_loss=0.2518, pruned_loss=0.03843, over 24649.00 frames. ], batch size: 68, lr: 4.96e-03, grad_scale: 16.0 2023-10-02 01:48:57,892 INFO [train.py:1069] (2/4) Computing validation loss 2023-10-02 01:49:10,080 INFO [train.py:1078] (2/4) Epoch 21, validation: loss=0.2779, simple_loss=0.2712, pruned_loss=0.1423, over 1125622.00 frames. 2023-10-02 01:49:10,081 INFO [train.py:1079] (2/4) Maximum memory allocated so far is 20163MB 2023-10-02 01:49:13,148 WARNING [train.py:1204] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_402-253027-0 from training. Duration: 0.54 2023-10-02 01:49:13,207 WARNING [train.py:1204] (2/4) Exclude cut with ID _59288_210_5854_1_1535077757774_3412246_158-289345-0_sp1.1 from training. Duration: 0.92725 2023-10-02 01:49:14,005 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.2.encoder.layers.0.feed_forward3.out_whiten, num_groups=1, num_channels=384, metric=11.26 vs. limit=15.0 2023-10-02 01:49:16,575 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_28211_210_6388_1_1532861506_4523224_128-328162-0_sp1.1 from training. Duration: 0.8181875 2023-10-02 01:49:19,403 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_788-278281-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 01:49:20,936 WARNING [train.py:1204] (2/4) Exclude cut with ID _49728_210_12651_1_1534041034294_6788150_624-195301-0_sp0.9 from training. Duration: 0.9444375 2023-10-02 01:49:20,970 WARNING [train.py:1204] (2/4) Exclude cut with ID _14860_210_4915_1_1530702020340_7343240_267-139775-0_sp1.1 from training. Duration: 0.963625 2023-10-02 01:49:21,044 WARNING [train.py:1204] (2/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_332-221057-0 from training. Duration: 0.68 2023-10-02 01:49:22,996 WARNING [train.py:1204] (2/4) Exclude cut with ID _40402_210_18219_1_1533522324115_2953150_242-76943-0 from training. Duration: 0.94 2023-10-02 01:49:25,652 WARNING [train.py:1204] (2/4) Exclude cut with ID _16747_210_5986_1_1531811544733_2749109_87-349587-0_sp1.1 from training. Duration: 0.963625 2023-10-02 01:49:25,708 WARNING [train.py:1204] (2/4) Exclude cut with ID _22982_210_1883_1_1531882790447_5449409_505-346452-0_sp1.1 from training. Duration: 0.963625 2023-10-02 01:49:27,227 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.feed_forward2.hidden_balancer.prob, batch_count=708346.6666666666, ans=0.125 2023-10-02 01:49:28,398 WARNING [train.py:1204] (2/4) Exclude cut with ID _17956_210_4353_1_1531209568125_6979970_13-192107-0_sp1.1 from training. Duration: 0.963625 2023-10-02 01:49:28,436 WARNING [train.py:1204] (2/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_430-307666-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 01:49:28,473 WARNING [train.py:1204] (2/4) Exclude cut with ID _2895_210_2477_1_1525956785794_4063750_289-11475-0_sp1.1 from training. Duration: 0.7454375 2023-10-02 01:49:29,763 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.491e+02 1.851e+02 2.013e+02 2.311e+02 4.182e+02, threshold=4.026e+02, percent-clipped=1.0 2023-10-02 01:49:29,827 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_3910_210_1794_1_1526742306_334530_28-238909-0_sp0.9 from training. Duration: 0.9 2023-10-02 01:49:31,202 WARNING [train.py:1204] (2/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_18-130336-0 from training. Duration: 0.79 2023-10-02 01:49:34,027 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_548-294316-0_sp0.9 from training. Duration: 0.9 2023-10-02 01:49:37,291 INFO [scaling.py:1118] (2/4) WithLoss: name=encoder.encoders.3.encoder.layers.2.self_attn_weights, loss-sum=0.000e+00 2023-10-02 01:49:39,920 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.0.self_attn_weights.pos_emb_skip_rate, batch_count=708413.3333333334, ans=0.0 2023-10-02 01:49:41,787 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_42-142473-0_sp1.1 from training. Duration: 0.6545625 2023-10-02 01:49:41,792 WARNING [train.py:1204] (2/4) Exclude cut with ID _59170_210_6721_1_1534983633960_4203335_326-48066-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 01:49:43,281 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_45886_210_17024_1_1533709215_471063_7-79165-0 from training. Duration: 0.98 2023-10-02 01:49:47,968 WARNING [train.py:1204] (2/4) Exclude cut with ID _44585_210_18628_1_1533605350387_4518199_280-73534-0_sp0.9 from training. Duration: 0.988875 2023-10-02 01:49:47,975 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_3475_210_2028_1_1526193578_3622569_265-276625-0_sp1.1 from training. Duration: 0.6909375 2023-10-02 01:49:50,479 WARNING [train.py:1204] (2/4) Exclude cut with ID _64631_210_27767_1_1535349580068_4165160_88-225232-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 01:49:54,064 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_205-265518-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 01:49:55,095 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.feed_forward3.hidden_balancer.prob, batch_count=708480.0, ans=0.125 2023-10-02 01:49:57,656 WARNING [train.py:1204] (2/4) Exclude cut with ID _40402_210_18219_1_1533522324115_2953150_271-253734-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 01:50:01,766 WARNING [train.py:1204] (2/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_282-257791-0 from training. Duration: 0.86 2023-10-02 01:50:04,698 WARNING [train.py:1204] (2/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_356-45054-0 from training. Duration: 0.86 2023-10-02 01:50:04,744 WARNING [train.py:1204] (2/4) Exclude cut with ID _69584_210_5219_1_1535785133775_4044450_38-348116-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 01:50:04,745 WARNING [train.py:1204] (2/4) Exclude cut with ID _66763_210_17301_1_1535788667617_241750_21-328480-0_sp1.1 from training. Duration: 0.936375 2023-10-02 01:50:06,124 WARNING [train.py:1204] (2/4) Exclude cut with ID _26528_210_9070_1_1532570410842_3851310_497-97218-0_sp0.9 from training. Duration: 0.9555625 2023-10-02 01:50:06,185 WARNING [train.py:1204] (2/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_592-101075-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 01:50:07,561 WARNING [train.py:1204] (2/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_678-159307-0 from training. Duration: 0.85 2023-10-02 01:50:10,779 WARNING [train.py:1204] (2/4) Exclude cut with ID _28427_210_4220_1_1532581240967_3061069_166-319773-0_sp1.1 from training. Duration: 0.936375 2023-10-02 01:50:10,866 WARNING [train.py:1204] (2/4) Exclude cut with ID _29877_210_6228_1_1532397791106_5276149_606-224984-0_sp1.1 from training. Duration: 0.936375 2023-10-02 01:50:12,317 INFO [scaling.py:1118] (2/4) WithLoss: name=encoder.encoders.1.encoder.layers.1.self_attn_weights, loss-sum=0.000e+00 2023-10-02 01:50:15,227 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_125-203105-0_sp1.1 from training. Duration: 0.72725 2023-10-02 01:50:15,373 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.0.feed_forward1.out_proj.dropout_p, batch_count=708546.6666666666, ans=0.1 2023-10-02 01:50:15,455 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.nonlin_attention.balancer.prob, batch_count=708546.6666666666, ans=0.125 2023-10-02 01:50:18,065 WARNING [train.py:1204] (2/4) Exclude cut with ID IC0620W0113-306765 from training. Duration: 0.768 2023-10-02 01:50:19,367 WARNING [train.py:1204] (2/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_410-287857-0_sp1.1 from training. Duration: 0.8454375 2023-10-02 01:50:22,159 WARNING [train.py:1204] (2/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_78-95260-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 01:50:24,126 WARNING [train.py:1204] (2/4) Exclude cut with ID _58059_210_5854_1_1534901471149_3164808_6-73080-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 01:50:26,147 INFO [train.py:1046] (2/4) Epoch 21, batch 50, loss[loss=0.1403, simple_loss=0.2185, pruned_loss=0.03107, over 24421.00 frames. ], tot_loss[loss=0.175, simple_loss=0.2533, pruned_loss=0.04834, over 1073440.32 frames. ], batch size: 58, lr: 4.96e-03, grad_scale: 16.0 2023-10-02 01:50:26,194 WARNING [train.py:1204] (2/4) Exclude cut with ID _5166_210_2084_1_1527073197160_4087993_331-117829-0 from training. Duration: 0.62 2023-10-02 01:50:26,270 WARNING [train.py:1204] (2/4) Exclude cut with ID _14880_210_8895_1_1530680426655_3795259_289-212817-0_sp0.9 from training. Duration: 0.7666875 2023-10-02 01:50:26,298 WARNING [train.py:1204] (2/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_620-144779-0_sp1.1 from training. Duration: 0.92725 2023-10-02 01:50:28,974 WARNING [train.py:1204] (2/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_561-288575-0_sp1.1 from training. Duration: 0.963625 2023-10-02 01:50:29,084 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_22657_210_6890_1_1532086002_1637216_122-188128-0_sp1.1 from training. Duration: 0.963625 2023-10-02 01:50:31,933 WARNING [train.py:1204] (2/4) Exclude cut with ID _16756_210_4915_1_1531047803763_7104259_604-86607-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 01:50:32,135 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.conv_module2.balancer1.prob, batch_count=708613.3333333334, ans=0.125 2023-10-02 01:50:36,209 WARNING [train.py:1204] (2/4) Exclude cut with ID _15150_210_8341_1_1530752411803_7256384_469-239509-0 from training. Duration: 0.68 2023-10-02 01:50:36,230 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_10987_210_4163_1_1529760555_4123902_302-87926-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 01:50:42,290 WARNING [train.py:1204] (2/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_469-94673-0_sp0.9 from training. Duration: 0.87775 2023-10-02 01:50:43,693 WARNING [train.py:1204] (2/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_798-106179-0 from training. Duration: 0.83 2023-10-02 01:50:45,226 WARNING [train.py:1204] (2/4) Exclude cut with ID _16744_210_5986_1_1531724405384_3907567_242-111316-0 from training. Duration: 0.93 2023-10-02 01:50:47,364 WARNING [train.py:1204] (2/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_938-205135-0_sp0.9 from training. Duration: 0.9666875 2023-10-02 01:50:48,782 WARNING [train.py:1204] (2/4) Exclude cut with ID _64135_210_2253_1_1535677267876_7608972_133-188266-0_sp0.9 from training. Duration: 0.9 2023-10-02 01:50:48,786 WARNING [train.py:1204] (2/4) Exclude cut with ID _44585_210_18628_1_1533605350387_4518199_623-346820-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 01:50:50,097 WARNING [train.py:1204] (2/4) Exclude cut with ID _42326_210_7770_1_1533814282531_3707269_454-266903-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 01:50:50,169 WARNING [train.py:1204] (2/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_5-55893-0_sp1.1 from training. Duration: 0.5 2023-10-02 01:50:51,473 WARNING [train.py:1204] (2/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_577-332600-0_sp1.1 from training. Duration: 0.5181875 2023-10-02 01:50:51,474 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_957-138882-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 01:51:02,377 WARNING [train.py:1204] (2/4) Exclude cut with ID _12556_210_8341_1_1530081024100_7327690_219-38004-0_sp1.1 from training. Duration: 0.97275 2023-10-02 01:51:03,803 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_397-147196-0_sp1.1 from training. Duration: 0.72725 2023-10-02 01:51:03,821 WARNING [train.py:1204] (2/4) Exclude cut with ID _3541_210_2477_1_1526475408709_3263973_231-104736-0_sp0.9 from training. Duration: 0.8666875 2023-10-02 01:51:05,265 WARNING [train.py:1204] (2/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_398-334558-0 from training. Duration: 0.96 2023-10-02 01:51:05,791 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.5.encoder.layers.1.feed_forward1.out_whiten, num_groups=1, num_channels=256, metric=7.03 vs. limit=15.0 2023-10-02 01:51:06,724 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_257-20733-0_sp1.1 from training. Duration: 0.6909375 2023-10-02 01:51:08,129 WARNING [train.py:1204] (2/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_146-282400-0_sp1.1 from training. Duration: 0.6454375 2023-10-02 01:51:08,139 WARNING [train.py:1204] (2/4) Exclude cut with ID _51899_210_20034_1_1534129254466_3669220_265-215428-0 from training. Duration: 0.98 2023-10-02 01:51:09,485 WARNING [train.py:1204] (2/4) Exclude cut with ID _34423_210_8750_1_1533430798456_3330199_219-106541-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 01:51:09,760 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.self_attn_weights.pos_emb_skip_rate, batch_count=708813.3333333334, ans=0.0 2023-10-02 01:51:10,987 WARNING [train.py:1204] (2/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_458-67321-0 from training. Duration: 0.97 2023-10-02 01:51:17,054 WARNING [train.py:1204] (2/4) Exclude cut with ID _6653_210_2629_1_1527919172441_6732679_1051-182606-0_sp1.1 from training. Duration: 0.936375 2023-10-02 01:51:17,075 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_53555_210_5632_1_1534595220_7232297_603-4396-0_sp1.1 from training. Duration: 0.97275 2023-10-02 01:51:18,402 WARNING [train.py:1204] (2/4) Exclude cut with ID _54218_210_12210_1_1534413646388_4409259_159-144367-0_sp1.1 from training. Duration: 0.963625 2023-10-02 01:51:18,506 WARNING [train.py:1204] (2/4) Exclude cut with ID _7606_210_4915_1_1528894253277_5080690_301-10732-0_sp0.9 from training. Duration: 0.9 2023-10-02 01:51:18,510 WARNING [train.py:1204] (2/4) Exclude cut with ID _15026_210_8750_1_1530777602316_3291850_211-195043-0_sp0.9 from training. Duration: 0.888875 2023-10-02 01:51:23,050 WARNING [train.py:1204] (2/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_146-282400-0 from training. Duration: 0.71 2023-10-02 01:51:23,096 WARNING [train.py:1204] (2/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_65-88242-0 from training. Duration: 0.56 2023-10-02 01:51:25,725 WARNING [train.py:1204] (2/4) Exclude cut with ID _55857_210_2346_1_1534644007831_6898209_80-107757-0_sp1.1 from training. Duration: 0.963625 2023-10-02 01:51:25,777 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_15126_210_6753_1_1530778677_2591185_120-277707-0_sp0.9 from training. Duration: 0.888875 2023-10-02 01:51:27,279 WARNING [train.py:1204] (2/4) Exclude cut with ID _44778_210_2054_1_1533798127203_7443090_1033-168593-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 01:51:28,581 WARNING [train.py:1204] (2/4) Exclude cut with ID _46683_210_9642_1_1533730913492_4591116_475-30711-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 01:51:28,605 WARNING [train.py:1204] (2/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_253-308377-0 from training. Duration: 0.49 2023-10-02 01:51:29,986 WARNING [train.py:1204] (2/4) Exclude cut with ID _4680_210_3525_1_1527249757953_893386_75-78979-0 from training. Duration: 0.91 2023-10-02 01:51:30,064 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_5522_210_3273_1_1527249606_3909096_318-203153-0_sp0.9 from training. Duration: 0.47775 2023-10-02 01:51:32,102 WARNING [train.py:1204] (2/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_550-170116-0_sp1.1 from training. Duration: 0.92725 2023-10-02 01:51:33,998 WARNING [train.py:1204] (2/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_277-210798-0_sp0.9 from training. Duration: 0.611125 2023-10-02 01:51:35,371 WARNING [train.py:1204] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_620-276793-0 from training. Duration: 0.59 2023-10-02 01:51:35,389 WARNING [train.py:1204] (2/4) Exclude cut with ID _3067_210_2667_1_1526212974640_4503168_119-175776-0 from training. Duration: 0.74 2023-10-02 01:51:36,728 WARNING [train.py:1204] (2/4) Exclude cut with ID _55209_210_1531_1_1534675828996_4597160_681-195827-0_sp1.1 from training. Duration: 0.92725 2023-10-02 01:51:38,053 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_427-78097-0_sp1.1 from training. Duration: 0.72725 2023-10-02 01:51:39,451 WARNING [train.py:1204] (2/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_96-290039-0_sp0.9 from training. Duration: 0.62225 2023-10-02 01:51:39,461 WARNING [train.py:1204] (2/4) Exclude cut with ID _45329_210_7005_1_1534157951211_6774339_287-292174-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 01:51:40,828 INFO [train.py:1046] (2/4) Epoch 21, batch 100, loss[loss=0.1775, simple_loss=0.2616, pruned_loss=0.04668, over 23967.00 frames. ], tot_loss[loss=0.1767, simple_loss=0.2551, pruned_loss=0.04918, over 1895224.37 frames. ], batch size: 86, lr: 4.96e-03, grad_scale: 16.0 2023-10-02 01:51:42,246 WARNING [train.py:1204] (2/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_200-292228-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 01:51:43,664 WARNING [train.py:1204] (2/4) Exclude cut with ID _69884_210_9771_1_1535846392818_4166169_291-194999-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 01:51:48,336 WARNING [train.py:1204] (2/4) Exclude cut with ID _58895_210_8750_1_1535184081320_3649089_265-323810-0_sp1.1 from training. Duration: 0.87275 2023-10-02 01:51:49,725 WARNING [train.py:1204] (2/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_58-68956-0 from training. Duration: 0.83 2023-10-02 01:51:49,730 WARNING [train.py:1204] (2/4) Exclude cut with ID _39867_210_9826_1_1533553222030_9344259_322-115153-0_sp1.1 from training. Duration: 0.963625 2023-10-02 01:51:54,549 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_3371_210_1794_1_1526384462_5580777_396-339812-0_sp1.1 from training. Duration: 0.77275 2023-10-02 01:51:54,568 WARNING [train.py:1204] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_731-2428-0_sp0.9 from training. Duration: 0.92225 2023-10-02 01:51:54,585 WARNING [train.py:1204] (2/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_98-741-0_sp1.1 from training. Duration: 0.72725 2023-10-02 01:51:54,587 WARNING [train.py:1204] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_238-245655-0_sp0.9 from training. Duration: 0.9 2023-10-02 01:51:54,618 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6788_210_1388_1_1527898698_4562021_91-197981-0_sp0.9 from training. Duration: 0.92225 2023-10-02 01:51:57,330 WARNING [train.py:1204] (2/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_860-96198-0 from training. Duration: 0.73 2023-10-02 01:51:58,763 WARNING [train.py:1204] (2/4) Exclude cut with ID _14268_210_9206_1_1530622771285_4714099_331-105351-0_sp0.9 from training. Duration: 0.72225 2023-10-02 01:52:00,005 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.547e+02 1.895e+02 2.091e+02 2.365e+02 3.412e+02, threshold=4.183e+02, percent-clipped=0.0 2023-10-02 01:52:00,081 WARNING [train.py:1204] (2/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_204-137133-0_sp1.1 from training. Duration: 0.92725 2023-10-02 01:52:00,099 WARNING [train.py:1204] (2/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_5-46496-0_sp1.1 from training. Duration: 0.936375 2023-10-02 01:52:00,106 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_160-218721-0_sp1.1 from training. Duration: 0.87275 2023-10-02 01:52:02,986 WARNING [train.py:1204] (2/4) Exclude cut with ID _45329_210_7005_1_1534157951211_6774339_503-287883-0 from training. Duration: 0.88 2023-10-02 01:52:04,421 WARNING [train.py:1204] (2/4) Exclude cut with ID _57572_210_5517_1_1534921219624_3809484_110-79134-0_sp1.1 from training. Duration: 0.92725 2023-10-02 01:52:04,502 WARNING [train.py:1204] (2/4) Exclude cut with ID _44585_210_18628_1_1533605350387_4518199_421-37641-0_sp1.1 from training. Duration: 0.936375 2023-10-02 01:52:04,576 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_42-142473-0_sp0.9 from training. Duration: 0.8 2023-10-02 01:52:08,638 WARNING [train.py:1204] (2/4) Exclude cut with ID _5166_210_2084_1_1527073197160_4087993_310-114452-0_sp0.9 from training. Duration: 0.6555625 2023-10-02 01:52:11,427 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9029W0120-509520 from training. Duration: 0.768 2023-10-02 01:52:11,440 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9029W0433-509820 from training. Duration: 0.896 2023-10-02 01:52:12,837 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_40222_210_6228_1_1533385298_4481939_163-334586-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 01:52:12,837 WARNING [train.py:1204] (2/4) Exclude cut with ID _48677_210_5663_1_1533902405919_3892989_408-40740-0_sp1.1 from training. Duration: 0.7545625 2023-10-02 01:52:17,164 WARNING [train.py:1204] (2/4) Exclude cut with ID _24314_210_4915_1_1532135078116_7170179_521-258868-0_sp0.9 from training. Duration: 0.87775 2023-10-02 01:52:19,107 WARNING [train.py:1204] (2/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_576-51198-0_sp1.1 from training. Duration: 0.92725 2023-10-02 01:52:20,470 WARNING [train.py:1204] (2/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_306-216871-0_sp1.1 from training. Duration: 0.97275 2023-10-02 01:52:25,250 WARNING [train.py:1204] (2/4) Exclude cut with ID _20684_210_6917_1_1531567751646_4203930_463-119982-0_sp1.1 from training. Duration: 0.97275 2023-10-02 01:52:26,658 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9084W0012-536850 from training. Duration: 0.64 2023-10-02 01:52:28,002 WARNING [train.py:1204] (2/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_847-41334-0_sp1.1 from training. Duration: 0.4 2023-10-02 01:52:30,777 WARNING [train.py:1204] (2/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_326-165900-0_sp1.1 from training. Duration: 0.62725 2023-10-02 01:52:32,147 WARNING [train.py:1204] (2/4) Exclude cut with ID _7537_210_2084_1_1528502395431_3924359_128-268029-0_sp1.1 from training. Duration: 0.8909375 2023-10-02 01:52:33,626 WARNING [train.py:1204] (2/4) Exclude cut with ID _67499_210_17282_1_1535509805220_3692430_333-155030-0_sp1.1 from training. Duration: 0.97275 2023-10-02 01:52:33,873 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.feed_forward1.hidden_balancer.prob, batch_count=709146.6666666666, ans=0.125 2023-10-02 01:52:36,911 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_34008_210_10431_1_1533345424_8183247_588-257177-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 01:52:38,978 WARNING [train.py:1204] (2/4) Exclude cut with ID _25914_210_6400_1_1532141317058_644740_52-96808-0_sp1.1 from training. Duration: 0.82725 2023-10-02 01:52:40,311 WARNING [train.py:1204] (2/4) Exclude cut with ID _56494_210_4220_1_1535284837625_3033169_69-138150-0_sp1.1 from training. Duration: 0.8545625 2023-10-02 01:52:41,815 WARNING [train.py:1204] (2/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_19-143553-0_sp1.1 from training. Duration: 0.97275 2023-10-02 01:52:42,092 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.conv_module2.balancer2.prob, batch_count=709213.3333333334, ans=0.125 2023-10-02 01:52:43,063 WARNING [train.py:1204] (2/4) Exclude cut with ID _21878_210_3525_1_1531738772002_7264330_582-130735-0_sp1.1 from training. Duration: 0.936375 2023-10-02 01:52:44,456 WARNING [train.py:1204] (2/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_943-201356-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 01:52:44,468 WARNING [train.py:1204] (2/4) Exclude cut with ID _3231_210_2106_1_1526205864934_6784220_422-229276-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 01:52:44,496 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_48049_210_5632_1_1533864553_3519937_73-198717-0_sp1.1 from training. Duration: 0.97275 2023-10-02 01:52:44,560 WARNING [train.py:1204] (2/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_329-285801-0 from training. Duration: 0.91 2023-10-02 01:52:44,584 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9171W0114-579000 from training. Duration: 0.896 2023-10-02 01:52:44,595 WARNING [train.py:1204] (2/4) Exclude cut with ID _3120_210_3525_1_1526036143112_4072980_210-132675-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 01:52:45,980 WARNING [train.py:1204] (2/4) Exclude cut with ID _55857_210_2346_1_1534644007831_6898209_745-98391-0_sp0.9 from training. Duration: 0.8444375 2023-10-02 01:52:46,044 WARNING [train.py:1204] (2/4) Exclude cut with ID _5250_210_5219_1_1527249561806_8692110_615-322035-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 01:52:46,048 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_36756_210_7234_1_1533026991_966276_119-315767-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 01:52:46,063 WARNING [train.py:1204] (2/4) Exclude cut with ID _13916_210_9994_1_1530788341126_3763149_60-191556-0_sp1.1 from training. Duration: 0.5545625 2023-10-02 01:52:46,086 WARNING [train.py:1204] (2/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_177-236809-0_sp1.1 from training. Duration: 0.4454375 2023-10-02 01:52:48,092 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7002_210_1794_1_1527939683_5157286_184-229630-0_sp1.1 from training. Duration: 0.67275 2023-10-02 01:52:48,098 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_50428_210_5724_1_1534247532_4126418_464-18322-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 01:52:49,562 WARNING [train.py:1204] (2/4) Exclude cut with ID _35799_210_5854_1_1533263487887_3529071_466-131402-0_sp1.1 from training. Duration: 0.936375 2023-10-02 01:52:50,895 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_1259-132768-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 01:52:52,119 WARNING [train.py:1204] (2/4) Exclude cut with ID _6694_210_4599_1_1527852637490_3965529_144-185051-0_sp1.1 from training. Duration: 0.87275 2023-10-02 01:52:52,162 WARNING [train.py:1204] (2/4) Exclude cut with ID _64639_210_5816_1_1535367619437_3528000_59-133290-0_sp1.1 from training. Duration: 0.963625 2023-10-02 01:52:54,128 WARNING [train.py:1204] (2/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_223-49525-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 01:52:56,944 INFO [train.py:1046] (2/4) Epoch 21, batch 150, loss[loss=0.1628, simple_loss=0.2415, pruned_loss=0.0421, over 21111.00 frames. ], tot_loss[loss=0.1779, simple_loss=0.2553, pruned_loss=0.05025, over 2519837.52 frames. ], batch size: 46, lr: 4.95e-03, grad_scale: 16.0 2023-10-02 01:52:57,016 WARNING [train.py:1204] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_748-36150-0_sp1.1 from training. Duration: 0.82725 2023-10-02 01:52:57,016 WARNING [train.py:1204] (2/4) Exclude cut with ID _19896_210_1883_1_1531709022662_6649569_52-221627-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 01:52:57,080 WARNING [train.py:1204] (2/4) Exclude cut with ID _6789_210_1534_1_1527854471671_2870519_296-115766-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 01:52:57,331 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.feed_forward1.hidden_balancer.prob, batch_count=709280.0, ans=0.125 2023-10-02 01:52:59,883 WARNING [train.py:1204] (2/4) Exclude cut with ID _14624_210_1925_1_1530858690657_3642219_100-19871-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 01:52:59,935 WARNING [train.py:1204] (2/4) Exclude cut with ID _12555_210_8750_1_1530075594402_3780239_50-293675-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 01:53:00,201 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.nonlin_attention.balancer.prob, batch_count=709280.0, ans=0.125 2023-10-02 01:53:01,416 WARNING [train.py:1204] (2/4) Exclude cut with ID _14880_210_8895_1_1530680426655_3795259_289-212817-0_sp1.1 from training. Duration: 0.62725 2023-10-02 01:53:02,769 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_14056_210_4163_1_1530604483_7572827_388-211626-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 01:53:04,885 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.conv_module2.balancer1.max_abs, batch_count=709280.0, ans=10.0 2023-10-02 01:53:07,911 WARNING [train.py:1204] (2/4) Exclude cut with ID _5289_210_2084_1_1527678202736_3779299_392-261988-0 from training. Duration: 0.65 2023-10-02 01:53:07,919 WARNING [train.py:1204] (2/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_124-345840-0 from training. Duration: 0.52 2023-10-02 01:53:07,924 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_2655_210_2087_1_1525688959_3897400_437-337690-0 from training. Duration: 0.63 2023-10-02 01:53:12,553 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_3780_210_2087_1_1526467391_239068_13-274633-0_sp1.1 from training. Duration: 0.92725 2023-10-02 01:53:12,557 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_805-71497-0_sp1.1 from training. Duration: 0.6454375 2023-10-02 01:53:12,632 WARNING [train.py:1204] (2/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_434-83000-0_sp1.1 from training. Duration: 0.8909375 2023-10-02 01:53:13,972 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_17613_210_2151_1_1531137569_2366160_285-219531-0_sp1.1 from training. Duration: 0.97275 2023-10-02 01:53:13,977 WARNING [train.py:1204] (2/4) Exclude cut with ID _56108_210_16380_1_1534505446159_3887959_22-37858-0_sp1.1 from training. Duration: 0.936375 2023-10-02 01:53:14,007 WARNING [train.py:1204] (2/4) Exclude cut with ID _28427_210_4220_1_1532581240967_3061069_200-88444-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 01:53:15,382 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_43802_210_2290_1_1533516203_8829504_77-279042-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 01:53:18,148 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9309W0193-640320 from training. Duration: 0.896 2023-10-02 01:53:19,650 WARNING [train.py:1204] (2/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_516-324055-0_sp1.1 from training. Duration: 0.936375 2023-10-02 01:53:26,304 WARNING [train.py:1204] (2/4) Exclude cut with ID _40094_210_16380_1_1533294005773_3641159_10-261240-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 01:53:29,269 WARNING [train.py:1204] (2/4) Exclude cut with ID _6267_210_2084_1_1527764658526_3894730_375-219887-0_sp1.1 from training. Duration: 0.6090625 2023-10-02 01:53:30,624 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6788_210_1388_1_1527898698_4562021_180-75288-0 from training. Duration: 0.99 2023-10-02 01:53:32,308 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.self_attn_weights.pos_emb_skip_rate, batch_count=709413.3333333334, ans=0.0 2023-10-02 01:53:33,391 WARNING [train.py:1204] (2/4) Exclude cut with ID _8030_210_7497_1_1528444886781_3531089_262-86750-0_sp1.1 from training. Duration: 0.736375 2023-10-02 01:53:33,403 WARNING [train.py:1204] (2/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_934-325498-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 01:53:33,429 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_456-22141-0_sp0.9 from training. Duration: 0.97775 2023-10-02 01:53:36,617 WARNING [train.py:1204] (2/4) Exclude cut with ID _49643_210_7989_1_1534640244224_3939031_598-161926-0_sp1.1 from training. Duration: 0.8181875 2023-10-02 01:53:36,730 WARNING [train.py:1204] (2/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_345-49721-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 01:53:36,971 INFO [scaling.py:1118] (2/4) WithLoss: name=encoder.encoders.3.encoder.layers.3.self_attn_weights, loss-sum=0.000e+00 2023-10-02 01:53:38,070 WARNING [train.py:1204] (2/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_440-141811-0_sp1.1 from training. Duration: 0.863625 2023-10-02 01:53:41,275 WARNING [train.py:1204] (2/4) Exclude cut with ID _64048_210_10626_1_1535198382802_3835179_394-212120-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 01:53:41,338 WARNING [train.py:1204] (2/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_91-277684-0 from training. Duration: 0.74 2023-10-02 01:53:44,398 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.conv_module2.balancer1.prob, batch_count=709480.0, ans=0.125 2023-10-02 01:53:46,925 WARNING [train.py:1204] (2/4) Exclude cut with ID _23038_210_9570_1_1532001584158_3652419_191-346343-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 01:53:46,988 WARNING [train.py:1204] (2/4) Exclude cut with ID _15140_210_9288_1_1530791388450_4880149_351-347974-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 01:53:48,298 WARNING [train.py:1204] (2/4) Exclude cut with ID _58605_210_12210_1_1534723195939_3904259_48-269402-0_sp1.1 from training. Duration: 0.87275 2023-10-02 01:53:48,304 WARNING [train.py:1204] (2/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_401-53423-0_sp0.9 from training. Duration: 0.611125 2023-10-02 01:53:49,720 WARNING [train.py:1204] (2/4) Exclude cut with ID _42659_210_2070_1_1533607442765_3120380_158-49004-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 01:53:51,096 WARNING [train.py:1204] (2/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_81-75849-0_sp1.1 from training. Duration: 0.3545625 2023-10-02 01:53:52,394 WARNING [train.py:1204] (2/4) Exclude cut with ID _27052_210_5986_1_1532329197187_3585009_314-106499-0_sp1.1 from training. Duration: 0.836375 2023-10-02 01:53:54,364 WARNING [train.py:1204] (2/4) Exclude cut with ID _4626_210_3532_1_1526817652717_3916859_446-206656-0_sp0.9 from training. Duration: 0.8444375 2023-10-02 01:53:55,886 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_53378_210_17282_1_1534221071_5769509_158-114960-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 01:53:57,748 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_3171_210_2087_1_1526109084_2330007_37-64334-0_sp0.9 from training. Duration: 0.911125 2023-10-02 01:53:57,773 WARNING [train.py:1204] (2/4) Exclude cut with ID _5410_210_1388_1_1527397055795_1719020_39-112221-0 from training. Duration: 0.69 2023-10-02 01:53:59,151 WARNING [train.py:1204] (2/4) Exclude cut with ID _8064_210_5399_1_1528372299291_4282973_4-91037-0_sp0.9 from training. Duration: 0.97775 2023-10-02 01:53:59,165 WARNING [train.py:1204] (2/4) Exclude cut with ID ID0079W0443-717795 from training. Duration: 0.541 2023-10-02 01:53:59,419 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.feed_forward1.out_proj.dropout_p, batch_count=709546.6666666666, ans=0.1 2023-10-02 01:54:03,112 WARNING [train.py:1204] (2/4) Exclude cut with ID _34734_210_4525_1_1533365977542_4448720_766-297928-0_sp1.1 from training. Duration: 0.936375 2023-10-02 01:54:06,537 WARNING [train.py:1204] (2/4) Exclude cut with ID _51471_210_1889_1_1534399391390_3094250_310-337793-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 01:54:06,548 WARNING [train.py:1204] (2/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_678-159307-0_sp0.9 from training. Duration: 0.9444375 2023-10-02 01:54:09,288 WARNING [train.py:1204] (2/4) Exclude cut with ID _50093_210_4220_1_1534417141207_3432299_259-322252-0 from training. Duration: 0.92 2023-10-02 01:54:09,347 WARNING [train.py:1204] (2/4) Exclude cut with ID _47790_210_16187_1_1533862814066_4207519_67-103444-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 01:54:11,293 WARNING [train.py:1204] (2/4) Exclude cut with ID _40551_210_10635_1_1533776424632_3416179_311-247578-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 01:54:12,620 INFO [train.py:1046] (2/4) Epoch 21, batch 200, loss[loss=0.1979, simple_loss=0.2795, pruned_loss=0.05815, over 24040.00 frames. ], tot_loss[loss=0.1806, simple_loss=0.2567, pruned_loss=0.05222, over 2996630.01 frames. ], batch size: 80, lr: 4.95e-03, grad_scale: 8.0 2023-10-02 01:54:12,737 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_4633_210_2040_1_1526824163_4145343_416-76117-0 from training. Duration: 0.95 2023-10-02 01:54:12,827 WARNING [train.py:1204] (2/4) Exclude cut with ID _5166_210_2084_1_1527073197160_4087993_331-117829-0_sp0.9 from training. Duration: 0.688875 2023-10-02 01:54:14,185 WARNING [train.py:1204] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_299-247059-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 01:54:15,656 WARNING [train.py:1204] (2/4) Exclude cut with ID _64002_210_9570_1_1535198605157_38979_2-240845-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 01:54:20,974 WARNING [train.py:1204] (2/4) Exclude cut with ID _4681_210_3525_1_1527250689464_120889_15-126760-0_sp0.9 from training. Duration: 0.9 2023-10-02 01:54:20,994 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_21500_210_2054_1_1532149310_2716758_256-156611-0_sp1.1 from training. Duration: 0.936375 2023-10-02 01:54:21,016 WARNING [train.py:1204] (2/4) Exclude cut with ID _16908_210_5724_1_1531119157248_3772700_624-203729-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 01:54:32,803 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.474e+02 1.906e+02 2.104e+02 2.322e+02 3.848e+02, threshold=4.208e+02, percent-clipped=0.0 2023-10-02 01:54:40,386 WARNING [train.py:1204] (2/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_605-219732-0_sp1.1 from training. Duration: 0.8545625 2023-10-02 01:54:40,424 WARNING [train.py:1204] (2/4) Exclude cut with ID _44718_210_5816_1_1534050051869_6469030_380-167520-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 01:54:43,603 WARNING [train.py:1204] (2/4) Exclude cut with ID _19896_210_1883_1_1531709022662_6649569_94-229927-0_sp1.1 from training. Duration: 0.8818125 2023-10-02 01:54:43,661 WARNING [train.py:1204] (2/4) Exclude cut with ID _19514_210_9259_1_1531530071330_5084230_519-174300-0_sp1.1 from training. Duration: 0.963625 2023-10-02 01:54:45,049 WARNING [train.py:1204] (2/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_340-174176-0_sp0.9 from training. Duration: 0.5444375 2023-10-02 01:54:45,053 WARNING [train.py:1204] (2/4) Exclude cut with ID _8263_210_1664_1_1528439452891_970220_68-181805-0_sp1.1 from training. Duration: 0.5818125 2023-10-02 01:54:46,475 WARNING [train.py:1204] (2/4) Exclude cut with ID _3120_210_3525_1_1526036143112_4072980_348-257723-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 01:54:47,834 WARNING [train.py:1204] (2/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_453-133983-0_sp1.1 from training. Duration: 0.7090625 2023-10-02 01:54:47,889 WARNING [train.py:1204] (2/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_17-86060-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 01:54:47,924 WARNING [train.py:1204] (2/4) Exclude cut with ID _29877_210_6228_1_1532397791106_5276149_228-184657-0_sp1.1 from training. Duration: 0.97275 2023-10-02 01:54:49,369 WARNING [train.py:1204] (2/4) Exclude cut with ID _18516_210_9206_1_1531704653206_3692406_198-334870-0 from training. Duration: 0.92 2023-10-02 01:54:49,413 WARNING [train.py:1204] (2/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_340-174176-0_sp1.1 from training. Duration: 0.4454375 2023-10-02 01:54:49,532 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.conv_module1.balancer1.min_positive, batch_count=709746.6666666666, ans=0.025 2023-10-02 01:54:50,759 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_3133_210_2087_1_1526106446_2324690_14-139186-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 01:54:53,628 WARNING [train.py:1204] (2/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_126-29104-0_sp0.9 from training. Duration: 0.9444375 2023-10-02 01:54:58,535 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.feed_forward3.hidden_balancer.prob, batch_count=709813.3333333334, ans=0.125 2023-10-02 01:55:01,030 WARNING [train.py:1204] (2/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_84-10310-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 01:55:09,914 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_15517_210_6753_1_1530954008_3487336_79-273076-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 01:55:09,954 WARNING [train.py:1204] (2/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_371-18169-0_sp0.9 from training. Duration: 0.9555625 2023-10-02 01:55:18,097 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_68702_210_23689_1_1535799577_3420068_170-92788-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 01:55:20,889 WARNING [train.py:1204] (2/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_78-182912-0 from training. Duration: 0.59 2023-10-02 01:55:20,941 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_3019_210_2028_1_1526044245_3264865_297-12384-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 01:55:22,241 WARNING [train.py:1204] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_147-111137-0_sp1.1 from training. Duration: 0.863625 2023-10-02 01:55:22,247 WARNING [train.py:1204] (2/4) Exclude cut with ID _28323_210_16187_1_1532480395667_4100300_117-8859-0_sp1.1 from training. Duration: 0.97275 2023-10-02 01:55:23,727 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7762_210_1794_1_1528199508_4057408_419-125009-0_sp1.1 from training. Duration: 0.7545625 2023-10-02 01:55:25,287 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.conv_module2.balancer2.min_positive, batch_count=709946.6666666666, ans=0.05 2023-10-02 01:55:26,250 INFO [train.py:1046] (2/4) Epoch 21, batch 250, loss[loss=0.1876, simple_loss=0.2678, pruned_loss=0.05366, over 24491.00 frames. ], tot_loss[loss=0.1786, simple_loss=0.2544, pruned_loss=0.05142, over 3388461.85 frames. ], batch size: 66, lr: 4.95e-03, grad_scale: 8.0 2023-10-02 01:55:26,294 WARNING [train.py:1204] (2/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_268-87909-0 from training. Duration: 0.99 2023-10-02 01:55:26,362 WARNING [train.py:1204] (2/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_118-311357-0_sp1.1 from training. Duration: 0.92725 2023-10-02 01:55:26,380 WARNING [train.py:1204] (2/4) Exclude cut with ID ID0377W0003-870225 from training. Duration: 0.852 2023-10-02 01:55:26,571 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.ff2_skip_rate, batch_count=709946.6666666666, ans=0.0 2023-10-02 01:55:29,010 WARNING [train.py:1204] (2/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_56-5838-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 01:55:29,131 WARNING [train.py:1204] (2/4) Exclude cut with ID _14603_210_4220_1_1530615688441_3097319_90-31383-0_sp0.9 from training. Duration: 0.9666875 2023-10-02 01:55:30,531 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_31583_210_3852_1_1532595851_4024413_58-86657-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 01:55:31,953 WARNING [train.py:1204] (2/4) Exclude cut with ID _30304_210_6319_1_1532476827261_7582060_344-113875-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 01:55:35,050 WARNING [train.py:1204] (2/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_278-104676-0_sp1.1 from training. Duration: 0.8909375 2023-10-02 01:55:35,083 WARNING [train.py:1204] (2/4) Exclude cut with ID _55844_210_2346_1_1534471212569_7625707_764-2387-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 01:55:35,683 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.3.encoder.layers.0.feed_forward1.out_whiten, num_groups=1, num_channels=512, metric=14.60 vs. limit=15.0 2023-10-02 01:55:36,451 WARNING [train.py:1204] (2/4) Exclude cut with ID _27430_210_7770_1_1532604689375_1378590_157-14963-0_sp1.1 from training. Duration: 0.963625 2023-10-02 01:55:41,127 WARNING [train.py:1204] (2/4) Exclude cut with ID _7160_210_1876_1_1527940923231_3703799_282-322611-0_sp1.1 from training. Duration: 0.7909375 2023-10-02 01:55:41,388 INFO [scaling.py:1118] (2/4) WithLoss: name=encoder.encoders.2.encoder.layers.0.self_attn_weights, loss-sum=0.000e+00 2023-10-02 01:55:42,809 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.conv_module1.balancer2.prob, batch_count=710013.3333333334, ans=0.125 2023-10-02 01:55:50,865 WARNING [train.py:1204] (2/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_190-138640-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 01:55:51,170 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.feed_forward1.out_proj.dropout_p, batch_count=710013.3333333334, ans=0.1 2023-10-02 01:55:52,310 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_33348_210_5632_1_1533086912_3324030_63-270922-0_sp1.1 from training. Duration: 0.97275 2023-10-02 01:55:53,613 WARNING [train.py:1204] (2/4) Exclude cut with ID _54871_210_22356_1_1534665798253_3856359_135-184281-0_sp1.1 from training. Duration: 0.9 2023-10-02 01:55:58,371 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.3.encoder.layers.2.self_attn_weights.whiten_keys, num_groups=8, num_channels=256, metric=4.28 vs. limit=6.0 2023-10-02 01:55:59,110 WARNING [train.py:1204] (2/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_277-210798-0_sp1.1 from training. Duration: 0.5 2023-10-02 01:56:00,585 WARNING [train.py:1204] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_402-253027-0_sp0.9 from training. Duration: 0.6 2023-10-02 01:56:00,654 WARNING [train.py:1204] (2/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_540-275685-0_sp0.9 from training. Duration: 0.988875 2023-10-02 01:56:00,691 WARNING [train.py:1204] (2/4) Exclude cut with ID _59493_210_17282_1_1535078419077_3225879_118-18960-0_sp1.1 from training. Duration: 0.936375 2023-10-02 01:56:02,031 WARNING [train.py:1204] (2/4) Exclude cut with ID _6194_210_1534_1_1527511826160_3941194_321-148641-0_sp1.1 from training. Duration: 0.5818125 2023-10-02 01:56:02,037 WARNING [train.py:1204] (2/4) Exclude cut with ID _61133_210_9969_1_1535196777227_3696409_333-270325-0_sp1.1 from training. Duration: 0.8454375 2023-10-02 01:56:03,330 WARNING [train.py:1204] (2/4) Exclude cut with ID _49376_210_5986_1_1533985278732_3370740_307-145221-0_sp1.1 from training. Duration: 0.936375 2023-10-02 01:56:05,331 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6846_210_6753_1_1527850767_4295158_2-315073-0_sp0.9 from training. Duration: 0.97775 2023-10-02 01:56:06,925 WARNING [train.py:1204] (2/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_17-25045-0 from training. Duration: 0.59 2023-10-02 01:56:08,198 WARNING [train.py:1204] (2/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_503-333510-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 01:56:08,445 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.bypass.skip_rate, batch_count=710080.0, ans=0.09899494936611666 2023-10-02 01:56:10,274 WARNING [train.py:1204] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_238-245655-0_sp1.1 from training. Duration: 0.736375 2023-10-02 01:56:11,619 WARNING [train.py:1204] (2/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_329-82235-0_sp0.9 from training. Duration: 0.911125 2023-10-02 01:56:11,624 WARNING [train.py:1204] (2/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_325-113798-0_sp0.9 from training. Duration: 0.9444375 2023-10-02 01:56:11,711 WARNING [train.py:1204] (2/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_395-37222-0_sp0.9 from training. Duration: 0.8444375 2023-10-02 01:56:14,363 WARNING [train.py:1204] (2/4) Exclude cut with ID _64135_210_2253_1_1535677267876_7608972_498-5039-0_sp1.1 from training. Duration: 0.7090625 2023-10-02 01:56:14,371 WARNING [train.py:1204] (2/4) Exclude cut with ID _14922_210_6228_1_1530707435273_3693310_303-228738-0_sp0.9 from training. Duration: 0.9333125 2023-10-02 01:56:15,806 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_577-220899-0_sp1.1 from training. Duration: 0.92725 2023-10-02 01:56:17,164 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_3475_210_2028_1_1526193578_3622569_377-166133-0_sp1.1 from training. Duration: 0.7818125 2023-10-02 01:56:18,465 WARNING [train.py:1204] (2/4) Exclude cut with ID _49376_210_5986_1_1533985278732_3370740_272-313512-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 01:56:23,678 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7762_210_1794_1_1528199508_4057408_422-227034-0_sp0.9 from training. Duration: 0.811125 2023-10-02 01:56:26,778 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.bypass_mid.scale_min, batch_count=710213.3333333334, ans=0.2 2023-10-02 01:56:27,934 WARNING [train.py:1204] (2/4) Exclude cut with ID _18895_210_6228_1_1531357274234_3784930_74-341428-0_sp1.1 from training. Duration: 0.92725 2023-10-02 01:56:30,712 WARNING [train.py:1204] (2/4) Exclude cut with ID _44902_210_16187_1_1533888006062_3792078_149-67212-0_sp1.1 from training. Duration: 0.7909375 2023-10-02 01:56:36,248 WARNING [train.py:1204] (2/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_243-126020-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 01:56:37,641 WARNING [train.py:1204] (2/4) Exclude cut with ID _54218_210_12210_1_1534413646388_4409259_259-6561-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 01:56:39,193 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_41452_210_7291_1_1533437687_3941281_405-11559-0 from training. Duration: 0.91 2023-10-02 01:56:40,514 INFO [train.py:1046] (2/4) Epoch 21, batch 300, loss[loss=0.1897, simple_loss=0.2506, pruned_loss=0.06443, over 23757.00 frames. ], tot_loss[loss=0.1767, simple_loss=0.2527, pruned_loss=0.05034, over 3691964.59 frames. ], batch size: 164, lr: 4.95e-03, grad_scale: 8.0 2023-10-02 01:56:40,645 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_20120_210_6753_1_1531643035_5482859_759-225084-0_sp1.1 from training. Duration: 0.97275 2023-10-02 01:56:40,665 WARNING [train.py:1204] (2/4) Exclude cut with ID _14747_210_8341_1_1530685797463_3654089_147-131229-0_sp0.9 from training. Duration: 0.8444375 2023-10-02 01:56:42,562 WARNING [train.py:1204] (2/4) Exclude cut with ID _14867_210_1968_1_1530684179890_3846969_319-250868-0 from training. Duration: 0.99 2023-10-02 01:56:42,583 WARNING [train.py:1204] (2/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_580-93798-0_sp0.9 from training. Duration: 0.62225 2023-10-02 01:56:44,627 WARNING [train.py:1204] (2/4) Exclude cut with ID _9679_210_7497_1_1529129212906_4363929_512-313889-0_sp0.9 from training. Duration: 0.9 2023-10-02 01:56:44,641 WARNING [train.py:1204] (2/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_18-189198-0 from training. Duration: 0.9 2023-10-02 01:56:49,979 WARNING [train.py:1204] (2/4) Exclude cut with ID _49755_210_18053_1_1534581005846_3218709_137-64179-0_sp1.1 from training. Duration: 0.92725 2023-10-02 01:56:51,350 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_48049_210_5632_1_1533864553_3519937_306-283794-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 01:56:54,879 WARNING [train.py:1204] (2/4) Exclude cut with ID _43552_210_8913_1_1533726031059_3523429_25-120161-0_sp1.1 from training. Duration: 0.8909375 2023-10-02 01:56:56,311 WARNING [train.py:1204] (2/4) Exclude cut with ID _8833_210_5219_1_1528592665210_7553059_319-349181-0 from training. Duration: 0.99 2023-10-02 01:56:56,396 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_296-332325-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 01:56:57,822 WARNING [train.py:1204] (2/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_436-186311-0_sp1.1 from training. Duration: 0.5909375 2023-10-02 01:56:57,840 WARNING [train.py:1204] (2/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_348-127789-0 from training. Duration: 0.85 2023-10-02 01:56:57,844 WARNING [train.py:1204] (2/4) Exclude cut with ID _3992_210_1747_1_1526644816031_3731060_443-195183-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 01:57:02,046 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.563e+02 1.853e+02 2.062e+02 2.397e+02 3.479e+02, threshold=4.124e+02, percent-clipped=0.0 2023-10-02 01:57:02,144 WARNING [train.py:1204] (2/4) Exclude cut with ID _3465_210_3525_1_1526175667060_3574049_308-231735-0_sp1.1 from training. Duration: 0.663625 2023-10-02 01:57:04,818 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6786_210_1388_1_1527839699_4194495_378-46070-0_sp1.1 from training. Duration: 0.6545625 2023-10-02 01:57:04,856 WARNING [train.py:1204] (2/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_63-101996-0 from training. Duration: 0.88 2023-10-02 01:57:09,147 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_2541_210_1794_1_1525590269_3084765_156-173713-0 from training. Duration: 0.81 2023-10-02 01:57:09,193 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_217-239796-0_sp1.1 from training. Duration: 0.963625 2023-10-02 01:57:12,504 WARNING [train.py:1204] (2/4) Exclude cut with ID _21878_210_3525_1_1531738772002_7264330_530-329400-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 01:57:13,113 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.5.encoder.layers.0.self_attn_weights.whiten_keys, num_groups=4, num_channels=128, metric=1.92 vs. limit=6.0 2023-10-02 01:57:13,919 WARNING [train.py:1204] (2/4) Exclude cut with ID _21783_210_11357_1_1531742458730_3762679_150-62920-0_sp1.1 from training. Duration: 0.963625 2023-10-02 01:57:13,919 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_5522_210_3273_1_1527249606_3909096_366-172508-0 from training. Duration: 0.66 2023-10-02 01:57:13,929 WARNING [train.py:1204] (2/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_303-340446-0_sp1.1 from training. Duration: 0.8181875 2023-10-02 01:57:17,233 WARNING [train.py:1204] (2/4) Exclude cut with ID _8699_210_2069_1_1528630346831_3960409_160-24408-0_sp1.1 from training. Duration: 0.82725 2023-10-02 01:57:17,940 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.conv_module2.whiten.whitening_limit, batch_count=710413.3333333334, ans=15.0 2023-10-02 01:57:20,173 WARNING [train.py:1204] (2/4) Exclude cut with ID _41314_210_12651_1_1533459476543_3766460_299-187801-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 01:57:20,199 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_34886_210_10633_1_1533203882_1755661_92-345288-0_sp1.1 from training. Duration: 0.97275 2023-10-02 01:57:25,072 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_5438_210_1794_1_1527336687_2600009_104-288274-0_sp1.1 from training. Duration: 0.52725 2023-10-02 01:57:25,080 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_154-316450-0 from training. Duration: 0.76 2023-10-02 01:57:25,689 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.feed_forward3.out_whiten.whitening_limit, batch_count=710480.0, ans=15.0 2023-10-02 01:57:26,446 WARNING [train.py:1204] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_23-189234-0_sp0.9 from training. Duration: 0.92225 2023-10-02 01:57:28,315 WARNING [train.py:1204] (2/4) Exclude cut with ID _4779_210_2084_1_1527318022944_3914893_346-168820-0_sp1.1 from training. Duration: 0.963625 2023-10-02 01:57:29,621 WARNING [train.py:1204] (2/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_5-55893-0 from training. Duration: 0.55 2023-10-02 01:57:29,708 WARNING [train.py:1204] (2/4) Exclude cut with ID _20455_210_5219_1_1531565996775_8098100_180-24517-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 01:57:32,729 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_3133_210_2087_1_1526106446_2324690_243-143119-0_sp0.9 from training. Duration: 0.8555625 2023-10-02 01:57:35,465 WARNING [train.py:1204] (2/4) Exclude cut with ID _65585_210_12210_1_1535450451584_3764098_293-212045-0_sp1.1 from training. Duration: 0.92725 2023-10-02 01:57:35,476 WARNING [train.py:1204] (2/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_577-332600-0 from training. Duration: 0.57 2023-10-02 01:57:36,997 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.conv_module1.balancer2.prob, batch_count=710480.0, ans=0.125 2023-10-02 01:57:39,504 WARNING [train.py:1204] (2/4) Exclude cut with ID _30039_210_13270_1_1532484612900_2725150_104-289874-0_sp1.1 from training. Duration: 0.963625 2023-10-02 01:57:39,511 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_3172_210_2087_1_1526292929_3987865_197-97762-0_sp1.1 from training. Duration: 0.6818125 2023-10-02 01:57:42,957 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_256-150908-0_sp1.1 from training. Duration: 0.963625 2023-10-02 01:57:44,356 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_296-287678-0_sp1.1 from training. Duration: 0.863625 2023-10-02 01:57:45,621 WARNING [train.py:1204] (2/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_443-30034-0 from training. Duration: 0.64 2023-10-02 01:57:45,663 WARNING [train.py:1204] (2/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_494-199538-0_sp0.9 from training. Duration: 0.8333125 2023-10-02 01:57:46,147 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.5.encoder.layers.0.conv_module2.whiten, num_groups=1, num_channels=256, metric=4.09 vs. limit=15.0 2023-10-02 01:57:46,977 WARNING [train.py:1204] (2/4) Exclude cut with ID _19708_210_9206_1_1532136650733_4238364_61-162325-0_sp1.1 from training. Duration: 0.97275 2023-10-02 01:57:48,263 WARNING [train.py:1204] (2/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_475-127455-0 from training. Duration: 0.7 2023-10-02 01:57:48,404 WARNING [train.py:1204] (2/4) Exclude cut with ID _48677_210_5663_1_1533902405919_3892989_188-11581-0_sp1.1 from training. Duration: 0.963625 2023-10-02 01:57:48,427 WARNING [train.py:1204] (2/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_136-8381-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 01:57:49,840 WARNING [train.py:1204] (2/4) Exclude cut with ID _55328_210_27767_1_1534580973260_4088170_270-229555-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 01:57:51,184 WARNING [train.py:1204] (2/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_372-266951-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 01:57:51,244 WARNING [train.py:1204] (2/4) Exclude cut with ID _47790_210_16187_1_1533862814066_4207519_14-242149-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 01:57:56,567 INFO [train.py:1046] (2/4) Epoch 21, batch 350, loss[loss=0.1572, simple_loss=0.2382, pruned_loss=0.0381, over 24452.00 frames. ], tot_loss[loss=0.1749, simple_loss=0.2506, pruned_loss=0.04959, over 3915555.00 frames. ], batch size: 58, lr: 4.95e-03, grad_scale: 8.0 2023-10-02 01:57:56,679 WARNING [train.py:1204] (2/4) Exclude cut with ID _7877_210_6014_1_1528286358099_3750136_159-76512-0_sp1.1 from training. Duration: 0.936375 2023-10-02 01:57:56,681 WARNING [train.py:1204] (2/4) Exclude cut with ID _8263_210_1664_1_1528438953494_294100_40-204731-0_sp1.1 from training. Duration: 0.4545625 2023-10-02 01:57:59,520 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.conv_module1.balancer2.prob, batch_count=710613.3333333334, ans=0.125 2023-10-02 01:58:00,604 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8278_210_2550_1_1528637099_4220413_694-238413-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 01:58:05,015 WARNING [train.py:1204] (2/4) Exclude cut with ID _47278_210_12041_1_1533902418349_3576110_421-172710-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 01:58:06,686 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.bypass_mid.scale_min, batch_count=710613.3333333334, ans=0.2 2023-10-02 01:58:08,355 WARNING [train.py:1204] (2/4) Exclude cut with ID _14970_210_15709_1_1530773543493_1715730_215-282680-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 01:58:08,406 WARNING [train.py:1204] (2/4) Exclude cut with ID _64639_210_5816_1_1535367619437_3528000_306-149449-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 01:58:11,300 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_28-291982-0 from training. Duration: 0.97 2023-10-02 01:58:11,404 WARNING [train.py:1204] (2/4) Exclude cut with ID _55134_210_21982_1_1534590028377_3882170_412-15870-0_sp1.1 from training. Duration: 0.936375 2023-10-02 01:58:13,149 WARNING [train.py:1204] (2/4) Exclude cut with ID _13684_210_7497_1_1530619354863_3598359_312-235606-0 from training. Duration: 0.6 2023-10-02 01:58:16,344 WARNING [train.py:1204] (2/4) Exclude cut with ID _37112_210_2575_1_1532997007673_7268610_347-238993-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 01:58:16,377 WARNING [train.py:1204] (2/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_121-284152-0 from training. Duration: 0.59 2023-10-02 01:58:16,436 WARNING [train.py:1204] (2/4) Exclude cut with ID _2941_210_2577_1_1526199673150_2489759_228-2961-0_sp1.1 from training. Duration: 0.97275 2023-10-02 01:58:20,626 WARNING [train.py:1204] (2/4) Exclude cut with ID _28323_210_16187_1_1532480395667_4100300_531-24157-0 from training. Duration: 0.98 2023-10-02 01:58:22,024 WARNING [train.py:1204] (2/4) Exclude cut with ID _32704_210_8954_1_1533017488840_6747370_266-329755-0_sp0.9 from training. Duration: 0.988875 2023-10-02 01:58:22,793 WARNING [train.py:1204] (2/4) Exclude cut with ID _6653_210_2629_1_1527919172441_6732679_1038-146421-0_sp1.1 from training. Duration: 0.97275 2023-10-02 01:58:24,250 WARNING [train.py:1204] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_391-22148-0_sp0.9 from training. Duration: 0.8555625 2023-10-02 01:58:25,045 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.2.encoder.layers.0.feed_forward3.out_whiten, num_groups=1, num_channels=384, metric=11.04 vs. limit=15.0 2023-10-02 01:58:26,939 WARNING [train.py:1204] (2/4) Exclude cut with ID _64521_210_14699_1_1535243427259_3815328_543-341117-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 01:58:26,949 WARNING [train.py:1204] (2/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_52-168712-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 01:58:26,988 WARNING [train.py:1204] (2/4) Exclude cut with ID _33920_210_4220_1_1533257851500_3084399_198-334063-0_sp1.1 from training. Duration: 0.8545625 2023-10-02 01:58:27,006 WARNING [train.py:1204] (2/4) Exclude cut with ID _50093_210_4220_1_1534417141207_3432299_69-111987-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 01:58:27,045 WARNING [train.py:1204] (2/4) Exclude cut with ID _14821_210_8750_1_1530680503991_6829359_189-297451-0_sp0.9 from training. Duration: 0.611125 2023-10-02 01:58:30,223 WARNING [train.py:1204] (2/4) Exclude cut with ID _8030_210_7497_1_1528444886781_3531089_262-86750-0_sp0.9 from training. Duration: 0.9 2023-10-02 01:58:30,229 WARNING [train.py:1204] (2/4) Exclude cut with ID _24612_210_8913_1_1531998054193_3806359_294-191196-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 01:58:35,941 WARNING [train.py:1204] (2/4) Exclude cut with ID _16909_210_5724_1_1531292079912_4118919_707-342896-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 01:58:35,962 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_9460_210_4211_1_1528954587_10049667_572-37035-0_sp0.9 from training. Duration: 0.888875 2023-10-02 01:58:37,330 WARNING [train.py:1204] (2/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_762-109886-0_sp1.1 from training. Duration: 0.82725 2023-10-02 01:58:38,662 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_14953_210_2151_1_1530701710_5616165_519-326393-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 01:58:42,985 WARNING [train.py:1204] (2/4) Exclude cut with ID _25914_210_6400_1_1532141317058_644740_52-96808-0 from training. Duration: 0.91 2023-10-02 01:58:42,989 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_68095_210_20185_1_1535718226_8888855_1079-94525-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 01:58:47,752 WARNING [train.py:1204] (2/4) Exclude cut with ID _14860_210_4915_1_1530702020340_7343240_746-3647-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 01:58:47,753 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_70379_210_7925_1_1535941455_557455_49-101720-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 01:58:47,766 WARNING [train.py:1204] (2/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_8-307291-0_sp1.1 from training. Duration: 0.936375 2023-10-02 01:58:49,179 WARNING [train.py:1204] (2/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_117-290916-0 from training. Duration: 0.51 2023-10-02 01:58:53,908 WARNING [train.py:1204] (2/4) Exclude cut with ID _22020_210_2461_1_1531724164513_6326900_944-339840-0_sp1.1 from training. Duration: 0.963625 2023-10-02 01:58:55,177 WARNING [train.py:1204] (2/4) Exclude cut with ID IC0515W0043-254911 from training. Duration: 0.976 2023-10-02 01:58:56,612 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_3910_210_1794_1_1526742306_334530_28-238909-0 from training. Duration: 0.81 2023-10-02 01:58:56,627 WARNING [train.py:1204] (2/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_269-267275-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 01:59:00,651 WARNING [train.py:1204] (2/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_72-251255-0_sp1.1 from training. Duration: 0.8545625 2023-10-02 01:59:00,670 WARNING [train.py:1204] (2/4) Exclude cut with ID _14886_210_4220_1_1530702020355_3516190_175-229073-0 from training. Duration: 0.45 2023-10-02 01:59:02,766 WARNING [train.py:1204] (2/4) Exclude cut with ID _40519_210_13794_1_1533715228655_7337390_921-209452-0_sp1.1 from training. Duration: 0.963625 2023-10-02 01:59:02,908 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.0.feed_forward1.hidden_balancer.prob, batch_count=710880.0, ans=0.125 2023-10-02 01:59:04,391 WARNING [train.py:1204] (2/4) Exclude cut with ID _29857_210_20034_1_1532395886753_3798160_335-163562-0_sp0.9 from training. Duration: 0.9444375 2023-10-02 01:59:04,511 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.1.conv_skip_rate, batch_count=710880.0, ans=0.0 2023-10-02 01:59:07,340 WARNING [train.py:1204] (2/4) Exclude cut with ID _58583_210_5236_1_1534726835266_3574249_384-58445-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 01:59:07,405 WARNING [train.py:1204] (2/4) Exclude cut with ID _44011_210_2046_1_1533722401691_3631310_96-315656-0_sp1.1 from training. Duration: 0.963625 2023-10-02 01:59:07,409 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_68484_210_1794_1_1535849804_7678097_549-170613-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 01:59:10,071 WARNING [train.py:1204] (2/4) Exclude cut with ID _37892_210_19596_1_1533198677897_3964058_214-148058-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 01:59:10,851 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.1.encoder.layers.1.feed_forward1.out_whiten, num_groups=1, num_channels=256, metric=7.49 vs. limit=15.0 2023-10-02 01:59:11,354 INFO [train.py:1046] (2/4) Epoch 21, batch 400, loss[loss=0.1791, simple_loss=0.2517, pruned_loss=0.05321, over 23834.00 frames. ], tot_loss[loss=0.174, simple_loss=0.2496, pruned_loss=0.04923, over 4077086.64 frames. ], batch size: 195, lr: 4.95e-03, grad_scale: 16.0 2023-10-02 01:59:12,837 WARNING [train.py:1204] (2/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_415-78792-0_sp0.9 from training. Duration: 0.9 2023-10-02 01:59:14,289 WARNING [train.py:1204] (2/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_383-205833-0_sp1.1 from training. Duration: 0.863625 2023-10-02 01:59:14,369 WARNING [train.py:1204] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_167-137255-0 from training. Duration: 0.64 2023-10-02 01:59:15,744 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8275_210_4198_1_1528455320_3892571_484-154786-0_sp1.1 from training. Duration: 0.963625 2023-10-02 01:59:15,827 WARNING [train.py:1204] (2/4) Exclude cut with ID _40284_210_2084_1_1533513679814_3704279_396-319412-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 01:59:19,041 WARNING [train.py:1204] (2/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_280-120942-0_sp0.9 from training. Duration: 0.8555625 2023-10-02 01:59:19,103 WARNING [train.py:1204] (2/4) Exclude cut with ID _45630_210_10556_1_1533949174498_3902049_558-240889-0_sp1.1 from training. Duration: 0.92725 2023-10-02 01:59:21,898 WARNING [train.py:1204] (2/4) Exclude cut with ID _56235_210_10640_1_1534557335235_4161099_197-307458-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 01:59:23,233 WARNING [train.py:1204] (2/4) Exclude cut with ID _16909_210_5724_1_1531292079912_4118919_163-254843-0_sp1.1 from training. Duration: 0.92725 2023-10-02 01:59:25,201 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_103-84010-0 from training. Duration: 0.69 2023-10-02 01:59:25,408 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.1.feed_forward1.out_proj.dropout_p, batch_count=711013.3333333334, ans=0.1 2023-10-02 01:59:26,577 WARNING [train.py:1204] (2/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_401-158239-0 from training. Duration: 0.71 2023-10-02 01:59:26,578 WARNING [train.py:1204] (2/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_899-233658-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 01:59:26,682 WARNING [train.py:1204] (2/4) Exclude cut with ID _23825_210_13449_1_1531915313526_3497150_240-271367-0 from training. Duration: 0.97 2023-10-02 01:59:26,719 WARNING [train.py:1204] (2/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_424-134619-0_sp1.1 from training. Duration: 0.92725 2023-10-02 01:59:28,242 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.bypass.scale_min, batch_count=711013.3333333334, ans=0.2 2023-10-02 01:59:30,966 WARNING [train.py:1204] (2/4) Exclude cut with ID _44831_210_13427_1_1533816011549_3689035_32-188147-0_sp1.1 from training. Duration: 0.87275 2023-10-02 01:59:30,969 WARNING [train.py:1204] (2/4) Exclude cut with ID _59170_210_6721_1_1534983633960_4203335_314-63879-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 01:59:30,992 WARNING [train.py:1204] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_602-275260-0 from training. Duration: 0.82 2023-10-02 01:59:31,021 WARNING [train.py:1204] (2/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_511-306767-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 01:59:31,048 WARNING [train.py:1204] (2/4) Exclude cut with ID _41817_210_18835_1_1533434477160_7423049_565-201775-0_sp1.1 from training. Duration: 0.92725 2023-10-02 01:59:32,630 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.499e+02 1.816e+02 1.987e+02 2.321e+02 3.446e+02, threshold=3.973e+02, percent-clipped=0.0 2023-10-02 01:59:32,718 WARNING [train.py:1204] (2/4) Exclude cut with ID _28317_210_12742_1_1532480302381_8510325_778-173151-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 01:59:32,764 WARNING [train.py:1204] (2/4) Exclude cut with ID _14998_210_5219_1_1530705582080_7474970_680-51001-0_sp1.1 from training. Duration: 0.963625 2023-10-02 01:59:33,018 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.self_attn_weights.pos_emb_skip_rate, batch_count=711013.3333333334, ans=0.0 2023-10-02 01:59:34,205 WARNING [train.py:1204] (2/4) Exclude cut with ID IC0677W0059-334891 from training. Duration: 0.896 2023-10-02 01:59:34,242 WARNING [train.py:1204] (2/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_370-331600-0 from training. Duration: 0.94 2023-10-02 01:59:37,250 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.1.conv_module2.balancer1.prob, batch_count=711013.3333333334, ans=0.125 2023-10-02 01:59:37,678 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.3.encoder.layers.3.feed_forward2.out_whiten, num_groups=1, num_channels=512, metric=10.48 vs. limit=15.0 2023-10-02 01:59:38,456 WARNING [train.py:1204] (2/4) Exclude cut with ID _66840_210_5219_1_1535452168426_4196349_446-240192-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 01:59:39,804 WARNING [train.py:1204] (2/4) Exclude cut with ID _65242_210_18628_1_1535369393717_3823149_38-6732-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 01:59:41,127 WARNING [train.py:1204] (2/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_302-2550-0 from training. Duration: 0.81 2023-10-02 01:59:41,219 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_43802_210_2290_1_1533516203_8829504_100-348441-0 from training. Duration: 0.91 2023-10-02 01:59:44,545 WARNING [train.py:1204] (2/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_329-285801-0_sp1.1 from training. Duration: 0.82725 2023-10-02 01:59:48,729 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_30167_210_3528_1_1532428012_4772375_343-334712-0_sp1.1 from training. Duration: 0.936375 2023-10-02 01:59:50,414 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.conv_module1.balancer1.min_positive, batch_count=711080.0, ans=0.025 2023-10-02 01:59:54,056 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.4.encoder.layers.1.feed_forward3.out_whiten, num_groups=1, num_channels=384, metric=10.97 vs. limit=15.0 2023-10-02 01:59:54,953 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7543_210_6753_1_1528543318_4552_1-85072-0 from training. Duration: 0.83 2023-10-02 01:59:57,885 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7002_210_1794_1_1527935634_4034444_487-236501-0_sp1.1 from training. Duration: 0.6 2023-10-02 01:59:59,296 WARNING [train.py:1204] (2/4) Exclude cut with ID _7160_210_1876_1_1527940923231_3703799_282-322611-0 from training. Duration: 0.87 2023-10-02 02:00:00,714 WARNING [train.py:1204] (2/4) Exclude cut with ID _67335_210_5219_1_1535525975652_4275229_570-262983-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 02:00:03,407 WARNING [train.py:1204] (2/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_625-338546-0_sp1.1 from training. Duration: 0.8 2023-10-02 02:00:03,443 WARNING [train.py:1204] (2/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_176-56120-0 from training. Duration: 0.83 2023-10-02 02:00:07,998 WARNING [train.py:1204] (2/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_963-187109-0_sp1.1 from training. Duration: 0.9 2023-10-02 02:00:09,406 WARNING [train.py:1204] (2/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_121-284152-0_sp0.9 from training. Duration: 0.6555625 2023-10-02 02:00:10,829 WARNING [train.py:1204] (2/4) Exclude cut with ID _45021_210_9994_1_1533619751094_3330288_104-81711-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 02:00:12,491 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.conv_module1.balancer2.min_positive, batch_count=711213.3333333334, ans=0.05 2023-10-02 02:00:13,579 WARNING [train.py:1204] (2/4) Exclude cut with ID _51382_210_23862_1_1534492801838_3650104_129-343914-0_sp1.1 from training. Duration: 0.936375 2023-10-02 02:00:13,616 WARNING [train.py:1204] (2/4) Exclude cut with ID _14970_210_15709_1_1530775547361_1966965_8-169924-0 from training. Duration: 0.92 2023-10-02 02:00:16,999 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_2295_210_2087_1_1525500993_2407863_234-83104-0_sp1.1 from training. Duration: 0.563625 2023-10-02 02:00:17,070 WARNING [train.py:1204] (2/4) Exclude cut with ID _14687_210_5854_1_1530703838259_3664889_115-239046-0 from training. Duration: 0.67 2023-10-02 02:00:19,787 WARNING [train.py:1204] (2/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_690-126306-0_sp1.1 from training. Duration: 0.7090625 2023-10-02 02:00:19,795 WARNING [train.py:1204] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_625-102955-0_sp0.9 from training. Duration: 0.8555625 2023-10-02 02:00:21,207 WARNING [train.py:1204] (2/4) Exclude cut with ID _9389_210_1664_1_1529217337301_5289269_289-213417-0 from training. Duration: 0.89 2023-10-02 02:00:23,128 WARNING [train.py:1204] (2/4) Exclude cut with ID _13253_210_1925_1_1530757775819_3577873_191-144977-0_sp0.9 from training. Duration: 0.8666875 2023-10-02 02:00:24,438 WARNING [train.py:1204] (2/4) Exclude cut with ID _21783_210_11357_1_1531742458730_3762679_325-155703-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 02:00:24,473 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_2295_210_2087_1_1525500993_2407863_116-238894-0_sp0.9 from training. Duration: 0.77775 2023-10-02 02:00:25,793 INFO [train.py:1046] (2/4) Epoch 21, batch 450, loss[loss=0.1987, simple_loss=0.2665, pruned_loss=0.06548, over 23469.00 frames. ], tot_loss[loss=0.1754, simple_loss=0.251, pruned_loss=0.04994, over 4207185.54 frames. ], batch size: 285, lr: 4.95e-03, grad_scale: 16.0 2023-10-02 02:00:25,891 WARNING [train.py:1204] (2/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_94-126128-0 from training. Duration: 0.86 2023-10-02 02:00:25,909 WARNING [train.py:1204] (2/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_395-310220-0_sp0.9 from training. Duration: 0.988875 2023-10-02 02:00:26,005 WARNING [train.py:1204] (2/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_821-110852-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 02:00:26,201 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.feed_forward2.hidden_balancer.prob, batch_count=711280.0, ans=0.125 2023-10-02 02:00:27,253 WARNING [train.py:1204] (2/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_64-248359-0_sp1.1 from training. Duration: 0.62725 2023-10-02 02:00:27,272 WARNING [train.py:1204] (2/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_138-187031-0 from training. Duration: 0.99 2023-10-02 02:00:28,559 WARNING [train.py:1204] (2/4) Exclude cut with ID _4122_210_2084_1_1526713164417_4353280_528-206771-0_sp0.9 from training. Duration: 0.97775 2023-10-02 02:00:30,011 WARNING [train.py:1204] (2/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_844-96989-0_sp1.1 from training. Duration: 0.6454375 2023-10-02 02:00:31,271 WARNING [train.py:1204] (2/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_106-30782-0_sp1.1 from training. Duration: 0.72725 2023-10-02 02:00:41,643 WARNING [train.py:1204] (2/4) Exclude cut with ID _14683_210_5219_1_1530698352608_3974859_281-250040-0_sp1.1 from training. Duration: 0.936375 2023-10-02 02:00:42,979 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_46013_210_7234_1_1533776362_3744032_463-52378-0_sp0.9 from training. Duration: 0.9555625 2023-10-02 02:00:44,421 WARNING [train.py:1204] (2/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_299-34413-0 from training. Duration: 0.65 2023-10-02 02:00:44,491 WARNING [train.py:1204] (2/4) Exclude cut with ID _35799_210_5854_1_1533263487887_3529071_490-326847-0 from training. Duration: 0.99 2023-10-02 02:00:47,962 WARNING [train.py:1204] (2/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_589-9070-0_sp0.9 from training. Duration: 0.788875 2023-10-02 02:00:50,023 WARNING [train.py:1204] (2/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_884-29515-0_sp1.1 from training. Duration: 0.936375 2023-10-02 02:00:51,436 WARNING [train.py:1204] (2/4) Exclude cut with ID _15012_210_1968_1_1530856820429_5095350_474-119995-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 02:00:54,745 WARNING [train.py:1204] (2/4) Exclude cut with ID _65585_210_12210_1_1535450451584_3764098_211-97124-0_sp1.1 from training. Duration: 0.963625 2023-10-02 02:00:56,151 WARNING [train.py:1204] (2/4) Exclude cut with ID _33640_210_2655_1_1533117605479_4855469_666-224463-0_sp1.1 from training. Duration: 0.963625 2023-10-02 02:00:57,801 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.nonlin_attention.balancer.min_positive, batch_count=711413.3333333334, ans=0.05 2023-10-02 02:00:58,771 WARNING [train.py:1204] (2/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_35-153886-0 from training. Duration: 0.84 2023-10-02 02:00:58,817 WARNING [train.py:1204] (2/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_210-106047-0 from training. Duration: 0.97 2023-10-02 02:01:00,288 WARNING [train.py:1204] (2/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_479-125611-0 from training. Duration: 0.96 2023-10-02 02:01:00,308 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_9417_210_1794_1_1529235261_4614131_427-153963-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 02:01:01,758 WARNING [train.py:1204] (2/4) Exclude cut with ID _23818_210_6228_1_1531917063884_3676920_133-170571-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 02:01:03,183 WARNING [train.py:1204] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_602-275260-0_sp1.1 from training. Duration: 0.7454375 2023-10-02 02:01:04,599 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9029W0121-509521 from training. Duration: 0.896 2023-10-02 02:01:04,607 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9029W0434-509821 from training. Duration: 0.64 2023-10-02 02:01:04,635 WARNING [train.py:1204] (2/4) Exclude cut with ID _23038_210_9570_1_1532001584158_3652419_289-307034-0_sp1.1 from training. Duration: 0.936375 2023-10-02 02:01:06,085 WARNING [train.py:1204] (2/4) Exclude cut with ID _29345_210_4381_1_1532689168263_3860169_149-257268-0_sp0.9 from training. Duration: 0.9 2023-10-02 02:01:06,719 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.3.encoder.layers.1.whiten, num_groups=1, num_channels=512, metric=4.94 vs. limit=12.0 2023-10-02 02:01:07,545 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_405-339986-0_sp1.1 from training. Duration: 0.463625 2023-10-02 02:01:12,502 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_2655_210_2087_1_1525688959_3897400_437-337690-0_sp1.1 from training. Duration: 0.57275 2023-10-02 02:01:13,807 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7543_210_6753_1_1528541907_1399322_165-323619-0_sp1.1 from training. Duration: 0.62725 2023-10-02 02:01:13,855 WARNING [train.py:1204] (2/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_508-335369-0_sp1.1 from training. Duration: 0.436375 2023-10-02 02:01:13,915 WARNING [train.py:1204] (2/4) Exclude cut with ID _7852_210_5219_1_1528286471316_8139400_358-287492-0 from training. Duration: 0.96 2023-10-02 02:01:18,642 WARNING [train.py:1204] (2/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_322-217220-0_sp0.9 from training. Duration: 0.9555625 2023-10-02 02:01:20,024 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_3171_210_2087_1_1526109084_2330007_66-260583-0_sp0.9 from training. Duration: 0.8 2023-10-02 02:01:20,065 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8558_210_1410_1_1528505225_7613406_131-329524-0_sp1.1 from training. Duration: 0.7181875 2023-10-02 02:01:23,318 WARNING [train.py:1204] (2/4) Exclude cut with ID _9235_210_5219_1_1528891154253_8046120_30-326681-0 from training. Duration: 0.83 2023-10-02 02:01:26,589 WARNING [train.py:1204] (2/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_58-68956-0_sp0.9 from training. Duration: 0.92225 2023-10-02 02:01:27,921 WARNING [train.py:1204] (2/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_255-312654-0 from training. Duration: 0.91 2023-10-02 02:01:29,312 WARNING [train.py:1204] (2/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_99-167392-0 from training. Duration: 0.61 2023-10-02 02:01:29,397 WARNING [train.py:1204] (2/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_94-126128-0_sp0.9 from training. Duration: 0.9555625 2023-10-02 02:01:29,615 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.attention_skip_rate, batch_count=711546.6666666666, ans=0.0 2023-10-02 02:01:35,009 WARNING [train.py:1204] (2/4) Exclude cut with ID _68635_210_9317_1_1535770813410_4315870_187-192817-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 02:01:36,355 WARNING [train.py:1204] (2/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_277-204970-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 02:01:36,548 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.ff2_skip_rate, batch_count=711546.6666666666, ans=0.0 2023-10-02 02:01:37,153 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.2.encoder.layers.0.nonlin_attention.whiten1, num_groups=1, num_channels=288, metric=8.57 vs. limit=10.0 2023-10-02 02:01:37,771 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6163_210_6400_1_1527506746_4263068_283-137811-0_sp0.9 from training. Duration: 0.8555625 2023-10-02 02:01:37,805 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9144W0121-566446 from training. Duration: 0.896 2023-10-02 02:01:40,465 INFO [train.py:1046] (2/4) Epoch 21, batch 500, loss[loss=0.1779, simple_loss=0.2661, pruned_loss=0.04479, over 24419.00 frames. ], tot_loss[loss=0.1764, simple_loss=0.2519, pruned_loss=0.05042, over 4310944.99 frames. ], batch size: 69, lr: 4.95e-03, grad_scale: 16.0 2023-10-02 02:01:41,992 WARNING [train.py:1204] (2/4) Exclude cut with ID _49371_210_12071_1_1533952761021_3812190_351-315519-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 02:01:43,489 WARNING [train.py:1204] (2/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_115-326778-0_sp1.1 from training. Duration: 0.8454375 2023-10-02 02:01:43,542 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_17292_210_6753_1_1531125388_2943734_436-198965-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 02:01:43,558 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9165W0117-576751 from training. Duration: 0.896 2023-10-02 02:01:45,414 WARNING [train.py:1204] (2/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_212-149947-0 from training. Duration: 0.73 2023-10-02 02:01:45,421 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_13757_210_3528_1_1530440682_4107239_65-167544-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 02:01:47,014 WARNING [train.py:1204] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_700-183549-0_sp0.9 from training. Duration: 0.7555625 2023-10-02 02:01:50,462 WARNING [train.py:1204] (2/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_216-343007-0_sp0.9 from training. Duration: 0.7333125 2023-10-02 02:01:50,674 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.conv_skip_rate, batch_count=711613.3333333334, ans=0.0 2023-10-02 02:01:53,739 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7544_210_6753_1_1528587722_3576686_295-258479-0_sp1.1 from training. Duration: 0.763625 2023-10-02 02:01:56,512 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_365-287106-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 02:01:56,523 WARNING [train.py:1204] (2/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_300-3747-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 02:01:58,225 WARNING [train.py:1204] (2/4) Exclude cut with ID _14069_210_5632_1_1530512692033_4803390_487-303201-0_sp1.1 from training. Duration: 0.92725 2023-10-02 02:02:02,239 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.454e+02 2.030e+02 2.224e+02 2.686e+02 4.005e+02, threshold=4.448e+02, percent-clipped=1.0 2023-10-02 02:02:06,711 WARNING [train.py:1204] (2/4) Exclude cut with ID _14459_210_2228_1_1531443641946_7516700_668-123026-0_sp1.1 from training. Duration: 0.936375 2023-10-02 02:02:06,732 WARNING [train.py:1204] (2/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_108-53929-0_sp1.1 from training. Duration: 0.536375 2023-10-02 02:02:06,761 WARNING [train.py:1204] (2/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_266-137104-0_sp0.9 from training. Duration: 0.72225 2023-10-02 02:02:06,795 WARNING [train.py:1204] (2/4) Exclude cut with ID _12302_210_5219_1_1529924446466_8107280_902-168619-0_sp1.1 from training. Duration: 0.936375 2023-10-02 02:02:08,040 WARNING [train.py:1204] (2/4) Exclude cut with ID _59502_210_12210_1_1535107024698_1835139_146-162495-0 from training. Duration: 0.92 2023-10-02 02:02:08,055 WARNING [train.py:1204] (2/4) Exclude cut with ID _3465_210_3525_1_1526175667060_3574049_412-287526-0_sp1.1 from training. Duration: 0.6545625 2023-10-02 02:02:09,677 WARNING [train.py:1204] (2/4) Exclude cut with ID _7160_210_1876_1_1527940923231_3703799_120-150943-0_sp0.9 from training. Duration: 0.9 2023-10-02 02:02:10,988 WARNING [train.py:1204] (2/4) Exclude cut with ID _15037_210_2253_1_1530752294104_3267650_296-192597-0_sp1.1 from training. Duration: 0.736375 2023-10-02 02:02:12,332 WARNING [train.py:1204] (2/4) Exclude cut with ID _13252_210_1925_1_1530671372047_3336909_397-216497-0_sp1.1 from training. Duration: 0.87275 2023-10-02 02:02:12,354 WARNING [train.py:1204] (2/4) Exclude cut with ID _40094_210_16380_1_1533294005773_3641159_76-269320-0_sp1.1 from training. Duration: 0.936375 2023-10-02 02:02:12,415 WARNING [train.py:1204] (2/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_449-15508-0 from training. Duration: 0.82 2023-10-02 02:02:15,718 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9309W0194-640321 from training. Duration: 0.64 2023-10-02 02:02:16,536 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.1.encoder.layers.1.conv_module1.whiten, num_groups=1, num_channels=256, metric=9.97 vs. limit=15.0 2023-10-02 02:02:18,050 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.ff2_skip_rate, batch_count=711746.6666666666, ans=0.0 2023-10-02 02:02:19,251 WARNING [train.py:1204] (2/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_284-312178-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 02:02:19,348 WARNING [train.py:1204] (2/4) Exclude cut with ID _29853_210_7497_1_1532505600851_6642110_401-55937-0_sp1.1 from training. Duration: 0.92725 2023-10-02 02:02:20,693 WARNING [train.py:1204] (2/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_716-24918-0_sp1.1 from training. Duration: 0.92725 2023-10-02 02:02:21,953 WARNING [train.py:1204] (2/4) Exclude cut with ID _19637_210_4220_1_1531537167676_3205060_314-305638-0_sp1.1 from training. Duration: 0.92725 2023-10-02 02:02:21,995 WARNING [train.py:1204] (2/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_475-127455-0_sp1.1 from training. Duration: 0.636375 2023-10-02 02:02:25,343 WARNING [train.py:1204] (2/4) Exclude cut with ID _5444_210_2151_1_1527418808464_5312210_581-315411-0 from training. Duration: 0.62 2023-10-02 02:02:28,551 WARNING [train.py:1204] (2/4) Exclude cut with ID _44902_210_16187_1_1533888006062_3792078_149-67212-0_sp0.9 from training. Duration: 0.9666875 2023-10-02 02:02:29,940 WARNING [train.py:1204] (2/4) Exclude cut with ID _31602_210_5854_1_1532677021456_4458250_63-63253-0_sp1.1 from training. Duration: 0.963625 2023-10-02 02:02:32,837 WARNING [train.py:1204] (2/4) Exclude cut with ID _55209_210_1531_1_1534675828996_4597160_660-203992-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 02:02:35,566 WARNING [train.py:1204] (2/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_83-190624-0_sp1.1 from training. Duration: 0.92725 2023-10-02 02:02:39,778 WARNING [train.py:1204] (2/4) Exclude cut with ID _58605_210_12210_1_1534723195939_3904259_154-139635-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 02:02:40,214 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.feed_forward1.hidden_balancer.prob, batch_count=711880.0, ans=0.125 2023-10-02 02:02:41,260 WARNING [train.py:1204] (2/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_164-224052-0 from training. Duration: 0.7 2023-10-02 02:02:41,277 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_1126-189071-0_sp1.1 from training. Duration: 0.963625 2023-10-02 02:02:41,297 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_15396_210_6890_1_1531308473_1938936_310-214789-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 02:02:44,555 WARNING [train.py:1204] (2/4) Exclude cut with ID _8395_210_1531_1_1528597871523_4411579_431-132470-0 from training. Duration: 0.74 2023-10-02 02:02:46,424 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_3780_210_2087_1_1526467760_2130483_170-259044-0_sp0.9 from training. Duration: 0.77775 2023-10-02 02:02:47,781 WARNING [train.py:1204] (2/4) Exclude cut with ID _12733_210_8341_1_1530167424556_3646380_537-230640-0_sp1.1 from training. Duration: 0.963625 2023-10-02 02:02:52,213 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.ff3_skip_rate, batch_count=711880.0, ans=0.0 2023-10-02 02:02:53,936 WARNING [train.py:1204] (2/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_159-281310-0 from training. Duration: 0.95 2023-10-02 02:02:55,388 WARNING [train.py:1204] (2/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_434-83000-0 from training. Duration: 0.98 2023-10-02 02:02:55,415 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_4405_210_1849_1_1526732205_3593939_493-32464-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 02:02:55,423 WARNING [train.py:1204] (2/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_342-100157-0 from training. Duration: 0.54 2023-10-02 02:02:55,479 WARNING [train.py:1204] (2/4) Exclude cut with ID _44778_210_2054_1_1533798127203_7443090_765-250133-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 02:02:55,491 WARNING [train.py:1204] (2/4) Exclude cut with ID _38670_210_15379_1_1533448810659_4391059_81-312245-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 02:02:56,813 INFO [train.py:1046] (2/4) Epoch 21, batch 550, loss[loss=0.1539, simple_loss=0.236, pruned_loss=0.03586, over 24291.00 frames. ], tot_loss[loss=0.1774, simple_loss=0.253, pruned_loss=0.05087, over 4404169.84 frames. ], batch size: 61, lr: 4.95e-03, grad_scale: 8.0 2023-10-02 02:02:56,876 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_50050_210_16223_1_1535435796_7290138_1273-247611-0_sp1.1 from training. Duration: 0.936375 2023-10-02 02:02:57,580 WARNING [train.py:1204] (2/4) Exclude cut with ID _44831_210_13427_1_1533816011549_3689035_399-18591-0_sp1.1 from training. Duration: 0.936375 2023-10-02 02:02:57,595 WARNING [train.py:1204] (2/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_557-324258-0_sp0.9 from training. Duration: 0.9 2023-10-02 02:02:58,927 WARNING [train.py:1204] (2/4) Exclude cut with ID _3465_210_3525_1_1526175667060_3574049_62-335552-0_sp1.1 from training. Duration: 0.7909375 2023-10-02 02:03:01,617 WARNING [train.py:1204] (2/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_673-264125-0_sp1.1 from training. Duration: 0.963625 2023-10-02 02:03:03,001 WARNING [train.py:1204] (2/4) Exclude cut with ID _8030_210_7497_1_1528444886781_3531089_262-86750-0 from training. Duration: 0.81 2023-10-02 02:03:03,014 WARNING [train.py:1204] (2/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_444-28041-0_sp1.1 from training. Duration: 0.77275 2023-10-02 02:03:03,757 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.3.encoder.layers.2.nonlin_attention.whiten1, num_groups=1, num_channels=384, metric=4.23 vs. limit=10.0 2023-10-02 02:03:07,281 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_34008_210_10431_1_1533345424_8183247_516-14858-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 02:03:07,304 WARNING [train.py:1204] (2/4) Exclude cut with ID _32704_210_8954_1_1533017488840_6747370_54-248218-0_sp1.1 from training. Duration: 0.936375 2023-10-02 02:03:10,685 WARNING [train.py:1204] (2/4) Exclude cut with ID _13253_210_1925_1_1530757775819_3577873_300-119782-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 02:03:10,849 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.balancer2.prob, batch_count=712013.3333333334, ans=0.125 2023-10-02 02:03:13,343 WARNING [train.py:1204] (2/4) Exclude cut with ID _39290_210_4220_1_1533862778760_3075118_21-240703-0_sp1.1 from training. Duration: 0.936375 2023-10-02 02:03:18,298 WARNING [train.py:1204] (2/4) Exclude cut with ID 774-127930-0014-10412-0_sp1.1 from training. Duration: 0.95 2023-10-02 02:03:18,364 WARNING [train.py:1204] (2/4) Exclude cut with ID _67499_210_17282_1_1535509805220_3692430_20-179346-0 from training. Duration: 0.97 2023-10-02 02:03:18,520 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder_embed.conv.5.prob, batch_count=712013.3333333334, ans=0.125 2023-10-02 02:03:19,766 WARNING [train.py:1204] (2/4) Exclude cut with ID _8365_210_2064_1_1528592311070_4677690_431-294102-0_sp0.9 from training. Duration: 0.988875 2023-10-02 02:03:26,258 WARNING [train.py:1204] (2/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_365-1492-0_sp1.1 from training. Duration: 0.82725 2023-10-02 02:03:26,299 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_105-114797-0_sp0.9 from training. Duration: 0.9333125 2023-10-02 02:03:26,567 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.feed_forward2.hidden_balancer.prob, batch_count=712080.0, ans=0.125 2023-10-02 02:03:29,096 WARNING [train.py:1204] (2/4) Exclude cut with ID _6274_210_2084_1_1527937583837_7494020_769-168302-0_sp0.9 from training. Duration: 0.788875 2023-10-02 02:03:29,291 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.self_attn_weights.pos_emb_skip_rate, batch_count=712080.0, ans=0.0 2023-10-02 02:03:32,288 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_40340_210_9024_1_1533433916_4132944_320-270277-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 02:03:32,293 WARNING [train.py:1204] (2/4) Exclude cut with ID ID0203W0003-781711 from training. Duration: 0.958 2023-10-02 02:03:32,366 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_30434_210_13049_1_1532519384_981746_52-12932-0_sp1.1 from training. Duration: 0.936375 2023-10-02 02:03:33,843 WARNING [train.py:1204] (2/4) Exclude cut with ID _8263_210_1664_1_1528438953494_294100_40-204731-0_sp0.9 from training. Duration: 0.5555625 2023-10-02 02:03:36,483 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1032-236863-0_sp0.9 from training. Duration: 0.9333125 2023-10-02 02:03:36,560 WARNING [train.py:1204] (2/4) Exclude cut with ID _8394_210_6702_1_1528590504316_3540189_335-274780-0_sp0.9 from training. Duration: 0.8666875 2023-10-02 02:03:37,673 WARNING [train.py:1204] (2/4) Exclude cut with ID _66873_210_5517_1_1535454136506_4293370_654-52713-0_sp1.1 from training. Duration: 0.7 2023-10-02 02:03:37,757 WARNING [train.py:1204] (2/4) Exclude cut with ID _5250_210_5219_1_1527249561806_8692110_742-267856-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 02:03:39,080 WARNING [train.py:1204] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_74-48717-0 from training. Duration: 0.58 2023-10-02 02:03:40,634 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.conv_skip_rate, batch_count=712146.6666666666, ans=0.0 2023-10-02 02:03:41,704 WARNING [train.py:1204] (2/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_18-46756-0 from training. Duration: 0.79 2023-10-02 02:03:43,042 WARNING [train.py:1204] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_612-56061-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 02:03:43,058 WARNING [train.py:1204] (2/4) Exclude cut with ID _56423_210_5854_1_1534896061049_3165969_412-2403-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 02:03:44,493 WARNING [train.py:1204] (2/4) Exclude cut with ID _62574_210_2655_1_1535180407058_3933249_120-196848-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 02:03:44,493 WARNING [train.py:1204] (2/4) Exclude cut with ID _40519_210_13794_1_1533715228655_7337390_916-51000-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 02:03:48,593 WARNING [train.py:1204] (2/4) Exclude cut with ID _65263_210_22356_1_1535763598691_3921037_108-21902-0_sp1.1 from training. Duration: 0.9 2023-10-02 02:03:50,000 WARNING [train.py:1204] (2/4) Exclude cut with ID _4122_210_2084_1_1526713164417_4353280_528-206771-0_sp1.1 from training. Duration: 0.8 2023-10-02 02:03:52,107 WARNING [train.py:1204] (2/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_43-325657-0_sp1.1 from training. Duration: 0.92725 2023-10-02 02:03:52,151 WARNING [train.py:1204] (2/4) Exclude cut with ID _44659_210_2721_1_1534036120061_6614860_314-82041-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 02:03:53,501 WARNING [train.py:1204] (2/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_81-300212-0_sp0.9 from training. Duration: 0.6666875 2023-10-02 02:03:55,329 WARNING [train.py:1204] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_665-196155-0_sp0.9 from training. Duration: 0.7444375 2023-10-02 02:03:56,767 WARNING [train.py:1204] (2/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_140-336881-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 02:03:57,322 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.4.encoder.layers.0.feed_forward3.out_whiten, num_groups=1, num_channels=384, metric=14.05 vs. limit=15.0 2023-10-02 02:03:58,075 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6290_210_3273_1_1527595224_1282787_115-87095-0_sp0.9 from training. Duration: 0.611125 2023-10-02 02:04:00,034 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_53373_210_5399_1_1534229471_4715695_607-259685-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 02:04:01,383 WARNING [train.py:1204] (2/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_175-163894-0_sp1.1 from training. Duration: 0.67275 2023-10-02 02:04:01,399 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7002_210_1794_1_1527939683_5157286_1-281264-0_sp1.1 from training. Duration: 0.4 2023-10-02 02:04:05,128 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.feed_forward1.out_proj.dropout_p, batch_count=712213.3333333334, ans=0.1 2023-10-02 02:04:06,310 WARNING [train.py:1204] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_605-257205-0 from training. Duration: 0.59 2023-10-02 02:04:07,754 WARNING [train.py:1204] (2/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_454-314901-0 from training. Duration: 0.46 2023-10-02 02:04:10,516 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_46032_210_14656_1_1533781412_4507969_287-130422-0_sp1.1 from training. Duration: 0.97275 2023-10-02 02:04:10,545 WARNING [train.py:1204] (2/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_186-309161-0_sp1.1 from training. Duration: 0.4909375 2023-10-02 02:04:10,581 WARNING [train.py:1204] (2/4) Exclude cut with ID _42659_210_2070_1_1533607442765_3120380_45-342307-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 02:04:11,862 INFO [train.py:1046] (2/4) Epoch 21, batch 600, loss[loss=0.1602, simple_loss=0.224, pruned_loss=0.04819, over 22821.00 frames. ], tot_loss[loss=0.1768, simple_loss=0.2527, pruned_loss=0.05044, over 4481523.76 frames. ], batch size: 322, lr: 4.94e-03, grad_scale: 8.0 2023-10-02 02:04:13,580 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.self_attn_weights.pos_emb_skip_rate, batch_count=712280.0, ans=0.0 2023-10-02 02:04:17,649 WARNING [train.py:1204] (2/4) Exclude cut with ID _12555_210_8750_1_1530075594402_3780239_478-326818-0_sp1.1 from training. Duration: 0.963625 2023-10-02 02:04:19,059 WARNING [train.py:1204] (2/4) Exclude cut with ID _7606_210_4915_1_1528894253277_5080690_127-177761-0_sp1.1 from training. Duration: 0.5818125 2023-10-02 02:04:20,664 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6788_210_1388_1_1527898698_4562021_91-197981-0 from training. Duration: 0.83 2023-10-02 02:04:20,963 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.bypass.skip_rate, batch_count=712280.0, ans=0.07 2023-10-02 02:04:22,683 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8558_210_1410_1_1528505225_7613406_131-329524-0_sp0.9 from training. Duration: 0.87775 2023-10-02 02:04:24,034 WARNING [train.py:1204] (2/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_153-153537-0_sp1.1 from training. Duration: 0.82725 2023-10-02 02:04:26,020 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_17292_210_6753_1_1531125388_2943734_285-349105-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 02:04:29,437 WARNING [train.py:1204] (2/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_37-41919-0 from training. Duration: 0.87 2023-10-02 02:04:29,459 WARNING [train.py:1204] (2/4) Exclude cut with ID _60860_210_8254_1_1534896015875_3557169_172-294852-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 02:04:31,165 INFO [scaling.py:1118] (2/4) WithLoss: name=encoder.encoders.4.encoder.layers.1.self_attn_weights, loss-sum=0.000e+00 2023-10-02 02:04:35,556 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.460e+02 1.808e+02 2.033e+02 2.469e+02 3.913e+02, threshold=4.067e+02, percent-clipped=0.0 2023-10-02 02:04:35,651 WARNING [train.py:1204] (2/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_146-276944-0 from training. Duration: 0.83 2023-10-02 02:04:38,345 WARNING [train.py:1204] (2/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_1254-44082-0_sp1.1 from training. Duration: 0.936375 2023-10-02 02:04:38,364 WARNING [train.py:1204] (2/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_1115-180350-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 02:04:38,400 WARNING [train.py:1204] (2/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_60-146189-0_sp1.1 from training. Duration: 0.8909375 2023-10-02 02:04:43,866 WARNING [train.py:1204] (2/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_517-153649-0_sp1.1 from training. Duration: 0.92725 2023-10-02 02:04:43,878 WARNING [train.py:1204] (2/4) Exclude cut with ID _39571_210_13449_1_1533801584143_3651077_37-266257-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 02:04:43,920 WARNING [train.py:1204] (2/4) Exclude cut with ID _6653_210_2629_1_1527919172441_6732679_527-276118-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 02:04:45,704 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.ff2_skip_rate, batch_count=712413.3333333334, ans=0.0 2023-10-02 02:04:49,433 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7696_210_2040_1_1528207167_3716099_117-125434-0_sp1.1 from training. Duration: 0.6909375 2023-10-02 02:04:51,792 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.5.encoder.layers.0.feed_forward3.out_whiten, num_groups=1, num_channels=256, metric=9.87 vs. limit=15.0 2023-10-02 02:04:53,458 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_25767_210_2161_1_1532307483_3867748_478-156360-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 02:04:53,462 WARNING [train.py:1204] (2/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_177-49092-0_sp1.1 from training. Duration: 0.82725 2023-10-02 02:04:53,470 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_11305_210_1794_1_1529750428_7178148_680-317073-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 02:05:03,004 WARNING [train.py:1204] (2/4) Exclude cut with ID _53565_210_10635_1_1534337218335_3820259_287-18120-0 from training. Duration: 0.97 2023-10-02 02:05:07,164 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.out_combiner.scale_min, batch_count=712480.0, ans=0.2 2023-10-02 02:05:08,246 WARNING [train.py:1204] (2/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_498-40153-0_sp1.1 from training. Duration: 0.57275 2023-10-02 02:05:08,267 WARNING [train.py:1204] (2/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_555-63066-0_sp1.1 from training. Duration: 0.963625 2023-10-02 02:05:11,163 WARNING [train.py:1204] (2/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_395-310220-0 from training. Duration: 0.89 2023-10-02 02:05:12,514 WARNING [train.py:1204] (2/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_937-208281-0_sp1.1 from training. Duration: 0.77275 2023-10-02 02:05:15,222 WARNING [train.py:1204] (2/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_120-211630-0 from training. Duration: 0.92 2023-10-02 02:05:15,264 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_454-276429-0_sp1.1 from training. Duration: 0.7909375 2023-10-02 02:05:15,301 WARNING [train.py:1204] (2/4) Exclude cut with ID _22826_210_9994_1_1531908201522_3279169_404-75889-0_sp1.1 from training. Duration: 0.8090625 2023-10-02 02:05:20,851 WARNING [train.py:1204] (2/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_282-136641-0_sp0.9 from training. Duration: 0.4666875 2023-10-02 02:05:20,973 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_809-17156-0_sp1.1 from training. Duration: 0.663625 2023-10-02 02:05:23,022 WARNING [train.py:1204] (2/4) Exclude cut with ID _9235_210_5219_1_1528891154253_8046120_1003-101638-0_sp0.9 from training. Duration: 0.811125 2023-10-02 02:05:24,993 WARNING [train.py:1204] (2/4) Exclude cut with ID _22826_210_9994_1_1531908201522_3279169_404-75889-0_sp0.9 from training. Duration: 0.988875 2023-10-02 02:05:26,415 WARNING [train.py:1204] (2/4) Exclude cut with ID _59288_210_5854_1_1535077757774_3412246_570-295255-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 02:05:27,954 INFO [train.py:1046] (2/4) Epoch 21, batch 650, loss[loss=0.1889, simple_loss=0.2688, pruned_loss=0.05449, over 23863.00 frames. ], tot_loss[loss=0.1755, simple_loss=0.251, pruned_loss=0.05003, over 4524739.85 frames. ], batch size: 86, lr: 4.94e-03, grad_scale: 8.0 2023-10-02 02:05:29,387 WARNING [train.py:1204] (2/4) Exclude cut with ID _3465_210_3525_1_1526175667060_3574049_412-287526-0 from training. Duration: 0.72 2023-10-02 02:05:30,702 WARNING [train.py:1204] (2/4) Exclude cut with ID _44010_210_4525_1_1533884262964_4013620_649-79655-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 02:05:35,684 WARNING [train.py:1204] (2/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_57-270037-0_sp0.9 from training. Duration: 0.8555625 2023-10-02 02:05:35,685 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_34815_210_6388_1_1533381320_3571903_302-255963-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 02:05:38,595 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.feed_forward1.hidden_balancer.prob, batch_count=712613.3333333334, ans=0.125 2023-10-02 02:05:39,703 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_47449_210_5399_1_1534076445_4006929_573-301852-0_sp1.1 from training. Duration: 0.97275 2023-10-02 02:05:43,803 WARNING [train.py:1204] (2/4) Exclude cut with ID 6709-74022-0004-86860-0_sp1.1 from training. Duration: 0.9409375 2023-10-02 02:05:46,549 WARNING [train.py:1204] (2/4) Exclude cut with ID _40519_210_13794_1_1533715228655_7337390_111-71991-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 02:05:46,604 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_30167_210_3528_1_1532428012_4772375_169-214186-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 02:05:47,430 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.1.encoder.layers.0.feed_forward1.out_whiten, num_groups=1, num_channels=256, metric=5.72 vs. limit=15.0 2023-10-02 02:05:48,162 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.conv_module2.balancer2.prob, batch_count=712680.0, ans=0.125 2023-10-02 02:05:50,716 WARNING [train.py:1204] (2/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_20-235313-0_sp1.1 from training. Duration: 0.963625 2023-10-02 02:05:52,070 WARNING [train.py:1204] (2/4) Exclude cut with ID _4681_210_3525_1_1527251040321_1235713_123-24428-0_sp0.9 from training. Duration: 0.5333125 2023-10-02 02:05:55,430 WARNING [train.py:1204] (2/4) Exclude cut with ID _31602_210_5854_1_1532677021456_4458250_444-198752-0_sp1.1 from training. Duration: 0.97275 2023-10-02 02:05:55,480 WARNING [train.py:1204] (2/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_9-279442-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 02:05:56,881 WARNING [train.py:1204] (2/4) Exclude cut with ID _6275_210_2084_1_1528110062213_7669049_973-198085-0_sp0.9 from training. Duration: 0.8333125 2023-10-02 02:05:58,800 WARNING [train.py:1204] (2/4) Exclude cut with ID _29853_210_7497_1_1532505600851_6642110_567-250221-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 02:05:59,022 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.bypass_mid.scale_min, batch_count=712746.6666666666, ans=0.2 2023-10-02 02:06:00,158 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6545_210_2040_1_1527774670_4245258_407-48070-0_sp1.1 from training. Duration: 0.6909375 2023-10-02 02:06:01,665 WARNING [train.py:1204] (2/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_224-343295-0_sp0.9 from training. Duration: 0.611125 2023-10-02 02:06:01,689 WARNING [train.py:1204] (2/4) Exclude cut with ID IC0125W0011-61817 from training. Duration: 0.88 2023-10-02 02:06:01,705 WARNING [train.py:1204] (2/4) Exclude cut with ID _60113_210_16271_1_1535164212481_4386079_435-154371-0_sp1.1 from training. Duration: 0.97275 2023-10-02 02:06:01,728 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_25836_210_16554_1_1532138167_53197_3-9394-0_sp1.1 from training. Duration: 0.92725 2023-10-02 02:06:05,171 WARNING [train.py:1204] (2/4) Exclude cut with ID _29343_210_4381_1_1532602757752_3674109_57-55125-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 02:06:05,366 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.conv_module1.balancer2.prob, batch_count=712746.6666666666, ans=0.125 2023-10-02 02:06:06,519 WARNING [train.py:1204] (2/4) Exclude cut with ID _56235_210_10640_1_1534557335235_4161099_267-23560-0_sp1.1 from training. Duration: 0.936375 2023-10-02 02:06:07,791 WARNING [train.py:1204] (2/4) Exclude cut with ID _30196_210_1664_1_1533708496522_7074140_1150-278867-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 02:06:07,849 WARNING [train.py:1204] (2/4) Exclude cut with ID _6270_210_2084_1_1527850911573_3856420_26-277343-0_sp0.9 from training. Duration: 0.911125 2023-10-02 02:06:09,250 WARNING [train.py:1204] (2/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_325-62802-0 from training. Duration: 0.87 2023-10-02 02:06:09,332 WARNING [train.py:1204] (2/4) Exclude cut with ID _56524_210_9642_1_1534572011779_3619170_145-310842-0_sp1.1 from training. Duration: 0.9 2023-10-02 02:06:09,372 WARNING [train.py:1204] (2/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_860-96198-0_sp0.9 from training. Duration: 0.811125 2023-10-02 02:06:10,708 WARNING [train.py:1204] (2/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_17-25045-0_sp1.1 from training. Duration: 0.536375 2023-10-02 02:06:10,737 WARNING [train.py:1204] (2/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_349-69727-0_sp1.1 from training. Duration: 0.936375 2023-10-02 02:06:13,472 WARNING [train.py:1204] (2/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_580-93798-0_sp1.1 from training. Duration: 0.5090625 2023-10-02 02:06:13,584 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7762_210_1794_1_1528199508_4057408_419-125009-0 from training. Duration: 0.83 2023-10-02 02:06:14,956 WARNING [train.py:1204] (2/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_356-167816-0 from training. Duration: 0.72 2023-10-02 02:06:14,992 WARNING [train.py:1204] (2/4) Exclude cut with ID _29345_210_4381_1_1532689168263_3860169_225-97100-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 02:06:14,998 WARNING [train.py:1204] (2/4) Exclude cut with ID _14227_210_4381_1_1530701804286_3940259_374-194712-0_sp1.1 from training. Duration: 0.92725 2023-10-02 02:06:15,021 WARNING [train.py:1204] (2/4) Exclude cut with ID _40137_210_12210_1_1533290484046_3795180_249-187059-0_sp0.9 from training. Duration: 0.97775 2023-10-02 02:06:15,038 WARNING [train.py:1204] (2/4) Exclude cut with ID _22826_210_9994_1_1531908201522_3279169_389-317194-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 02:06:17,828 WARNING [train.py:1204] (2/4) Exclude cut with ID _27485_210_4916_1_1532739191677_3668269_239-122918-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 02:06:24,674 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_2245_210_2715_1_1525511548_3540254_128-117609-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 02:06:24,715 WARNING [train.py:1204] (2/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_160-276178-0_sp1.1 from training. Duration: 0.963625 2023-10-02 02:06:26,681 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_31920_210_10633_1_1532860009_1686094_1-279735-0_sp1.1 from training. Duration: 0.97275 2023-10-02 02:06:30,031 WARNING [train.py:1204] (2/4) Exclude cut with ID _51078_210_9070_1_1534161557424_3805250_325-262383-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 02:06:30,057 WARNING [train.py:1204] (2/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_643-52007-0_sp0.9 from training. Duration: 0.6333125 2023-10-02 02:06:30,270 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.feed_forward1.out_proj.dropout_p, batch_count=712880.0, ans=0.1 2023-10-02 02:06:31,488 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_51937_210_5014_1_1534573610_4022025_518-299248-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 02:06:38,407 WARNING [train.py:1204] (2/4) Exclude cut with ID _13252_210_1925_1_1530671372047_3336909_166-314607-0_sp1.1 from training. Duration: 0.4909375 2023-10-02 02:06:38,411 WARNING [train.py:1204] (2/4) Exclude cut with ID _44778_210_2054_1_1533798127203_7443090_1087-282939-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 02:06:40,323 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_23850_210_4238_1_1532232597_1632433_32-185135-0_sp1.1 from training. Duration: 0.7818125 2023-10-02 02:06:40,351 WARNING [train.py:1204] (2/4) Exclude cut with ID _59500_210_8254_1_1534809610680_3736009_145-89359-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 02:06:41,391 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.0.layers.0.feed_forward3.out_whiten, num_groups=1, num_channels=192, metric=10.73 vs. limit=15.0 2023-10-02 02:06:41,698 INFO [train.py:1046] (2/4) Epoch 21, batch 700, loss[loss=0.1637, simple_loss=0.242, pruned_loss=0.0427, over 23093.00 frames. ], tot_loss[loss=0.1747, simple_loss=0.2504, pruned_loss=0.04949, over 4584467.80 frames. ], batch size: 50, lr: 4.94e-03, grad_scale: 8.0 2023-10-02 02:06:42,029 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.0.feed_forward1.hidden_balancer.prob, batch_count=712946.6666666666, ans=0.125 2023-10-02 02:06:43,246 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_340-336408-0 from training. Duration: 0.93 2023-10-02 02:06:44,602 WARNING [train.py:1204] (2/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_9-21737-0 from training. Duration: 0.97 2023-10-02 02:06:47,315 WARNING [train.py:1204] (2/4) Exclude cut with ID _2220_210_1819_1_1525509109898_3658680_50-99837-0 from training. Duration: 0.54 2023-10-02 02:06:47,384 WARNING [train.py:1204] (2/4) Exclude cut with ID _30304_210_6319_1_1532476827261_7582060_879-299350-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 02:06:48,749 WARNING [train.py:1204] (2/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_905-160074-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 02:06:50,209 WARNING [train.py:1204] (2/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_395-73570-0 from training. Duration: 0.99 2023-10-02 02:06:51,847 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.bypass.scale_min, batch_count=712946.6666666666, ans=0.2 2023-10-02 02:06:51,905 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.bypass.skip_rate, batch_count=712946.6666666666, ans=0.07 2023-10-02 02:06:54,461 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7264_210_4938_1_1527939911_8132095_800-91995-0_sp1.1 from training. Duration: 0.7818125 2023-10-02 02:06:57,798 WARNING [train.py:1204] (2/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_77-311404-0_sp1.1 from training. Duration: 0.8545625 2023-10-02 02:06:59,213 WARNING [train.py:1204] (2/4) Exclude cut with ID _13679_210_7497_1_1530534293662_3355098_107-80772-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 02:07:00,638 WARNING [train.py:1204] (2/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_224-343295-0_sp1.1 from training. Duration: 0.5 2023-10-02 02:07:01,872 WARNING [train.py:1204] (2/4) Exclude cut with ID _54871_210_22356_1_1534665798253_3856359_156-253482-0_sp1.1 from training. Duration: 0.963625 2023-10-02 02:07:03,243 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.445e+02 1.955e+02 2.291e+02 2.559e+02 6.231e+02, threshold=4.583e+02, percent-clipped=1.0 2023-10-02 02:07:04,776 WARNING [train.py:1204] (2/4) Exclude cut with ID _49728_210_12651_1_1534041034294_6788150_309-125532-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 02:07:06,250 WARNING [train.py:1204] (2/4) Exclude cut with ID _13913_210_9994_1_1530579615187_3383897_473-277682-0_sp0.9 from training. Duration: 0.4333125 2023-10-02 02:07:06,254 WARNING [train.py:1204] (2/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_3-276701-0_sp1.1 from training. Duration: 0.82725 2023-10-02 02:07:07,715 WARNING [train.py:1204] (2/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_282-136641-0 from training. Duration: 0.42 2023-10-02 02:07:09,961 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=713080.0, ans=0.1 2023-10-02 02:07:09,983 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.feed_forward3.hidden_balancer.prob, batch_count=713080.0, ans=0.125 2023-10-02 02:07:11,087 WARNING [train.py:1204] (2/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_127-566-0 from training. Duration: 0.84 2023-10-02 02:07:12,755 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.bypass.skip_rate, batch_count=713080.0, ans=0.07 2023-10-02 02:07:15,149 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6163_210_6400_1_1527506746_4263068_283-137811-0_sp1.1 from training. Duration: 0.7 2023-10-02 02:07:15,182 WARNING [train.py:1204] (2/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_34-183685-0_sp1.1 from training. Duration: 0.8909375 2023-10-02 02:07:16,575 WARNING [train.py:1204] (2/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_154-135696-0_sp0.9 from training. Duration: 0.888875 2023-10-02 02:07:19,485 WARNING [train.py:1204] (2/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_676-51014-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 02:07:19,543 WARNING [train.py:1204] (2/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_38-169920-0 from training. Duration: 0.91 2023-10-02 02:07:23,740 WARNING [train.py:1204] (2/4) Exclude cut with ID _6653_210_2629_1_1527919172441_6732679_747-114465-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 02:07:23,784 WARNING [train.py:1204] (2/4) Exclude cut with ID _8021_210_7994_1_1528376451358_3653069_100-140231-0_sp1.1 from training. Duration: 0.6090625 2023-10-02 02:07:25,066 WARNING [train.py:1204] (2/4) Exclude cut with ID _64135_210_2253_1_1535677267876_7608972_133-188266-0 from training. Duration: 0.81 2023-10-02 02:07:27,270 WARNING [train.py:1204] (2/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_714-310538-0_sp1.1 from training. Duration: 0.7818125 2023-10-02 02:07:29,176 WARNING [train.py:1204] (2/4) Exclude cut with ID _39867_210_9826_1_1533553222030_9344259_1134-288498-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 02:07:30,685 WARNING [train.py:1204] (2/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_211-237289-0_sp1.1 from training. Duration: 0.936375 2023-10-02 02:07:35,339 WARNING [train.py:1204] (2/4) Exclude cut with ID _59202_210_26984_1_1534816810612_3549174_137-318710-0_sp0.9 from training. Duration: 0.988875 2023-10-02 02:07:35,365 WARNING [train.py:1204] (2/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_86-66192-0 from training. Duration: 0.55 2023-10-02 02:07:39,470 WARNING [train.py:1204] (2/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_540-275685-0 from training. Duration: 0.89 2023-10-02 02:07:39,515 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8776_210_2040_1_1528638837_4088036_244-278763-0 from training. Duration: 0.9 2023-10-02 02:07:42,241 WARNING [train.py:1204] (2/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_9-219972-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 02:07:43,782 WARNING [train.py:1204] (2/4) Exclude cut with ID _44659_210_2721_1_1534036120061_6614860_656-283823-0_sp1.1 from training. Duration: 0.92725 2023-10-02 02:07:45,135 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_69630_210_7234_1_1535789601_661049_77-216385-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 02:07:47,892 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_19952_210_2161_1_1531875469_3851750_210-308919-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 02:07:47,898 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_4737_210_2040_1_1526998689_3293008_378-178650-0 from training. Duration: 0.95 2023-10-02 02:07:52,019 WARNING [train.py:1204] (2/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_224-343295-0 from training. Duration: 0.55 2023-10-02 02:07:52,043 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7002_210_1794_1_1527939683_5157286_184-229630-0 from training. Duration: 0.74 2023-10-02 02:07:52,074 WARNING [train.py:1204] (2/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_6-214815-0 from training. Duration: 0.85 2023-10-02 02:07:53,503 WARNING [train.py:1204] (2/4) Exclude cut with ID _8699_210_2069_1_1528630346831_3960409_160-24408-0 from training. Duration: 0.91 2023-10-02 02:07:54,806 INFO [train.py:1046] (2/4) Epoch 21, batch 750, loss[loss=0.1895, simple_loss=0.2698, pruned_loss=0.05458, over 24326.00 frames. ], tot_loss[loss=0.1752, simple_loss=0.2505, pruned_loss=0.04997, over 4601867.95 frames. ], batch size: 77, lr: 4.94e-03, grad_scale: 8.0 2023-10-02 02:07:54,857 WARNING [train.py:1204] (2/4) Exclude cut with ID _5585_210_2667_1_1527418813324_8007340_89-332072-0 from training. Duration: 0.64 2023-10-02 02:07:54,906 WARNING [train.py:1204] (2/4) Exclude cut with ID _5250_210_5219_1_1527249561806_8692110_528-259800-0_sp1.1 from training. Duration: 0.963625 2023-10-02 02:07:57,000 WARNING [train.py:1204] (2/4) Exclude cut with ID _55383_210_10635_1_1534468426172_2231669_134-128246-0 from training. Duration: 0.89 2023-10-02 02:07:58,990 WARNING [train.py:1204] (2/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_400-338274-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 02:08:00,334 WARNING [train.py:1204] (2/4) Exclude cut with ID _14224_210_4381_1_1530529062037_4264010_364-22817-0_sp0.9 from training. Duration: 0.788875 2023-10-02 02:08:00,450 WARNING [train.py:1204] (2/4) Exclude cut with ID _42863_210_9070_1_1533902398788_3696920_331-291057-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 02:08:03,089 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_5436_210_1794_1_1527331317_2541494_229-211016-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 02:08:04,378 WARNING [train.py:1204] (2/4) Exclude cut with ID _6456_210_1534_1_1527681634212_3181049_345-252877-0_sp0.9 from training. Duration: 0.711125 2023-10-02 02:08:04,402 WARNING [train.py:1204] (2/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_96-81499-0_sp1.1 from training. Duration: 0.92725 2023-10-02 02:08:07,637 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_5575_210_1410_1_1527301305_2570170_342-265572-0_sp1.1 from training. Duration: 0.8545625 2023-10-02 02:08:07,701 WARNING [train.py:1204] (2/4) Exclude cut with ID _5444_210_2151_1_1527418808464_5312210_726-27165-0_sp0.9 from training. Duration: 0.7444375 2023-10-02 02:08:09,139 WARNING [train.py:1204] (2/4) Exclude cut with ID _22083_210_8254_1_1531702862439_3678970_304-327750-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 02:08:11,986 WARNING [train.py:1204] (2/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_245-226199-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 02:08:12,014 WARNING [train.py:1204] (2/4) Exclude cut with ID _42875_210_14856_1_1533704397487_4012433_102-199499-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 02:08:13,390 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_51225_210_3601_1_1534400634_2327383_55-301844-0 from training. Duration: 0.93 2023-10-02 02:08:14,781 WARNING [train.py:1204] (2/4) Exclude cut with ID _28317_210_12742_1_1532480302381_8510325_916-195848-0_sp1.1 from training. Duration: 0.836375 2023-10-02 02:08:14,847 WARNING [train.py:1204] (2/4) Exclude cut with ID _44718_210_5816_1_1534050051869_6469030_497-311999-0_sp1.1 from training. Duration: 0.936375 2023-10-02 02:08:16,229 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_34886_210_10633_1_1533203882_1755661_91-194765-0_sp1.1 from training. Duration: 0.936375 2023-10-02 02:08:16,436 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.conv_module2.balancer2.prob, batch_count=713346.6666666666, ans=0.125 2023-10-02 02:08:19,080 WARNING [train.py:1204] (2/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_310-26186-0_sp0.9 from training. Duration: 0.77775 2023-10-02 02:08:20,435 WARNING [train.py:1204] (2/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_416-257730-0 from training. Duration: 0.65 2023-10-02 02:08:20,440 WARNING [train.py:1204] (2/4) Exclude cut with ID _9389_210_1664_1_1529217337301_5289269_209-154760-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 02:08:20,661 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.balancer1.prob, batch_count=713346.6666666666, ans=0.125 2023-10-02 02:08:23,144 WARNING [train.py:1204] (2/4) Exclude cut with ID _4092_210_2151_1_1526643044449_3135759_227-213972-0 from training. Duration: 0.54 2023-10-02 02:08:23,171 WARNING [train.py:1204] (2/4) Exclude cut with ID IC0679W0044-335852 from training. Duration: 0.768 2023-10-02 02:08:24,593 WARNING [train.py:1204] (2/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_42-186162-0 from training. Duration: 0.92 2023-10-02 02:08:24,600 WARNING [train.py:1204] (2/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_302-2550-0_sp0.9 from training. Duration: 0.9 2023-10-02 02:08:24,613 WARNING [train.py:1204] (2/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_356-31216-0_sp0.9 from training. Duration: 0.8333125 2023-10-02 02:08:27,377 WARNING [train.py:1204] (2/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_125-322172-0_sp1.1 from training. Duration: 0.8454375 2023-10-02 02:08:35,267 WARNING [train.py:1204] (2/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_141-89727-0_sp0.9 from training. Duration: 0.788875 2023-10-02 02:08:35,283 WARNING [train.py:1204] (2/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_647-337784-0_sp1.1 from training. Duration: 0.97275 2023-10-02 02:08:35,294 WARNING [train.py:1204] (2/4) Exclude cut with ID _6493_210_1747_1_1528459210424_4012470_513-140967-0_sp1.1 from training. Duration: 0.7181875 2023-10-02 02:08:36,760 WARNING [train.py:1204] (2/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_87-266334-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 02:08:37,474 WARNING [train.py:1204] (2/4) Exclude cut with ID _26528_210_9070_1_1532570410842_3851310_526-261338-0_sp1.1 from training. Duration: 0.92725 2023-10-02 02:08:37,516 WARNING [train.py:1204] (2/4) Exclude cut with ID _14224_210_4381_1_1530529062037_4264010_380-343670-0 from training. Duration: 0.88 2023-10-02 02:08:38,795 WARNING [train.py:1204] (2/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_449-15508-0_sp1.1 from training. Duration: 0.7454375 2023-10-02 02:08:40,225 WARNING [train.py:1204] (2/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_372-6869-0_sp0.9 from training. Duration: 0.488875 2023-10-02 02:08:40,289 WARNING [train.py:1204] (2/4) Exclude cut with ID _26907_210_7234_1_1532221740694_3205263_337-2494-0_sp0.9 from training. Duration: 0.97775 2023-10-02 02:08:44,296 WARNING [train.py:1204] (2/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_10-100367-0_sp1.1 from training. Duration: 0.8909375 2023-10-02 02:08:45,637 WARNING [train.py:1204] (2/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_47-167373-0 from training. Duration: 0.92 2023-10-02 02:08:46,922 WARNING [train.py:1204] (2/4) Exclude cut with ID _28323_210_16187_1_1532480395667_4100300_628-282179-0_sp1.1 from training. Duration: 0.97275 2023-10-02 02:08:51,283 WARNING [train.py:1204] (2/4) Exclude cut with ID _20455_210_5219_1_1531565996775_8098100_213-280898-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 02:08:52,780 WARNING [train.py:1204] (2/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_383-316000-0_sp1.1 from training. Duration: 0.8181875 2023-10-02 02:08:52,822 WARNING [train.py:1204] (2/4) Exclude cut with ID _18513_210_9206_1_1531618215223_3862620_16-78180-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 02:08:54,301 WARNING [train.py:1204] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_330-327861-0_sp1.1 from training. Duration: 0.6090625 2023-10-02 02:08:58,351 WARNING [train.py:1204] (2/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_356-31216-0 from training. Duration: 0.75 2023-10-02 02:08:58,370 WARNING [train.py:1204] (2/4) Exclude cut with ID _25248_210_8750_1_1532062825941_5554180_255-148539-0_sp1.1 from training. Duration: 0.963625 2023-10-02 02:08:58,417 WARNING [train.py:1204] (2/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_269-154726-0_sp1.1 from training. Duration: 0.7818125 2023-10-02 02:09:03,508 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7002_210_1794_1_1527939683_5157286_163-87552-0_sp1.1 from training. Duration: 0.7818125 2023-10-02 02:09:03,541 WARNING [train.py:1204] (2/4) Exclude cut with ID _54871_210_22356_1_1534665798253_3856359_97-151896-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 02:09:05,001 WARNING [train.py:1204] (2/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_207-7026-0_sp1.1 from training. Duration: 0.97275 2023-10-02 02:09:05,686 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_92-46009-0_sp0.9 from training. Duration: 0.6 2023-10-02 02:09:09,591 INFO [train.py:1046] (2/4) Epoch 21, batch 800, loss[loss=0.1623, simple_loss=0.2325, pruned_loss=0.04603, over 23739.00 frames. ], tot_loss[loss=0.1762, simple_loss=0.2516, pruned_loss=0.05038, over 4618035.71 frames. ], batch size: 135, lr: 4.94e-03, grad_scale: 16.0 2023-10-02 02:09:15,300 WARNING [train.py:1204] (2/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_122-178847-0_sp1.1 from training. Duration: 0.97275 2023-10-02 02:09:15,303 WARNING [train.py:1204] (2/4) Exclude cut with ID _44778_210_2054_1_1533798127203_7443090_94-348956-0_sp1.1 from training. Duration: 0.92725 2023-10-02 02:09:18,101 WARNING [train.py:1204] (2/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_1018-244089-0_sp1.1 from training. Duration: 0.963625 2023-10-02 02:09:18,112 WARNING [train.py:1204] (2/4) Exclude cut with ID _56235_210_10640_1_1534557335235_4161099_87-118423-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 02:09:18,200 WARNING [train.py:1204] (2/4) Exclude cut with ID _53713_210_24116_1_1534314487762_7416751_504-212292-0_sp1.1 from training. Duration: 0.92725 2023-10-02 02:09:19,510 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_446-150327-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 02:09:20,866 WARNING [train.py:1204] (2/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_682-245989-0_sp1.1 from training. Duration: 0.92725 2023-10-02 02:09:24,902 WARNING [train.py:1204] (2/4) Exclude cut with ID _35723_210_1794_1_1533088341043_628250_8-234524-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 02:09:24,969 WARNING [train.py:1204] (2/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_64-248359-0_sp0.9 from training. Duration: 0.7666875 2023-10-02 02:09:27,208 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.1.encoder.layers.1.self_attn_weights.whiten_keys, num_groups=4, num_channels=128, metric=2.76 vs. limit=6.0 2023-10-02 02:09:27,853 WARNING [train.py:1204] (2/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_49-97158-0 from training. Duration: 0.76 2023-10-02 02:09:27,915 WARNING [train.py:1204] (2/4) Exclude cut with ID _24314_210_4915_1_1532135078116_7170179_743-915-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 02:09:29,294 WARNING [train.py:1204] (2/4) Exclude cut with ID _61194_210_18628_1_1535007555628_3750909_353-323468-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 02:09:29,315 WARNING [train.py:1204] (2/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_387-131538-0_sp1.1 from training. Duration: 0.736375 2023-10-02 02:09:29,342 WARNING [train.py:1204] (2/4) Exclude cut with ID _44424_210_18163_1_1533623394848_1213110_79-187603-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 02:09:30,649 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.477e+02 1.851e+02 2.052e+02 2.465e+02 3.868e+02, threshold=4.103e+02, percent-clipped=0.0 2023-10-02 02:09:30,742 WARNING [train.py:1204] (2/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_86-19601-0 from training. Duration: 0.85 2023-10-02 02:09:30,763 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_50050_210_16223_1_1535435796_7290138_103-283529-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 02:09:30,803 WARNING [train.py:1204] (2/4) Exclude cut with ID _13913_210_9994_1_1530579615187_3383897_473-277682-0 from training. Duration: 0.39 2023-10-02 02:09:32,469 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.ff2_skip_rate, batch_count=713680.0, ans=0.0 2023-10-02 02:09:34,447 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.ff2_skip_rate, batch_count=713680.0, ans=0.0 2023-10-02 02:09:35,578 WARNING [train.py:1204] (2/4) Exclude cut with ID _42326_210_7770_1_1533814282531_3707269_368-68029-0_sp1.1 from training. Duration: 0.92725 2023-10-02 02:09:37,562 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_41428_210_2161_1_1533774184_3992532_446-339645-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 02:09:40,738 WARNING [train.py:1204] (2/4) Exclude cut with ID _69584_210_5219_1_1535785133775_4044450_325-7656-0_sp1.1 from training. Duration: 0.7818125 2023-10-02 02:09:40,745 WARNING [train.py:1204] (2/4) Exclude cut with ID _21632_210_2253_1_1531994001765_3744140_134-184387-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 02:09:42,266 WARNING [train.py:1204] (2/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_550-184707-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 02:09:42,281 WARNING [train.py:1204] (2/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_106-341780-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 02:09:47,799 WARNING [train.py:1204] (2/4) Exclude cut with ID _45871_210_5665_1_1533864614245_3296060_253-132552-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 02:09:47,838 WARNING [train.py:1204] (2/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_395-310220-0_sp1.1 from training. Duration: 0.8090625 2023-10-02 02:09:47,858 WARNING [train.py:1204] (2/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_795-33252-0_sp0.9 from training. Duration: 0.411125 2023-10-02 02:09:49,276 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9002W0003-495962 from training. Duration: 0.946 2023-10-02 02:09:49,302 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_43-71989-0 from training. Duration: 0.57 2023-10-02 02:09:49,322 WARNING [train.py:1204] (2/4) Exclude cut with ID _8699_210_2069_1_1528630346831_3960409_155-39766-0_sp1.1 from training. Duration: 0.7090625 2023-10-02 02:09:49,333 WARNING [train.py:1204] (2/4) Exclude cut with ID _27894_210_12392_1_1532415496078_6902439_788-232684-0_sp1.1 from training. Duration: 0.97275 2023-10-02 02:09:52,163 WARNING [train.py:1204] (2/4) Exclude cut with ID _56494_210_4220_1_1535284837625_3033169_10-39296-0_sp1.1 from training. Duration: 0.92725 2023-10-02 02:09:52,191 WARNING [train.py:1204] (2/4) Exclude cut with ID _40901_210_5665_1_1533363789958_3856130_341-20819-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 02:09:56,558 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9029W0435-509822 from training. Duration: 0.896 2023-10-02 02:09:57,870 WARNING [train.py:1204] (2/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_51-117576-0 from training. Duration: 0.4 2023-10-02 02:09:59,245 WARNING [train.py:1204] (2/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_197-28164-0_sp1.1 from training. Duration: 0.72725 2023-10-02 02:10:00,624 WARNING [train.py:1204] (2/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_145-137790-0_sp1.1 from training. Duration: 0.7545625 2023-10-02 02:10:05,311 WARNING [train.py:1204] (2/4) Exclude cut with ID _47790_210_16187_1_1533862814066_4207519_2-39653-0_sp1.1 from training. Duration: 0.8909375 2023-10-02 02:10:08,864 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_3780_210_2087_1_1526467760_2130483_186-208240-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 02:10:10,178 WARNING [train.py:1204] (2/4) Exclude cut with ID _24055_210_12651_1_1532310907143_976979_92-59356-0 from training. Duration: 0.91 2023-10-02 02:10:10,237 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_3910_210_1794_1_1526742306_334530_28-238909-0_sp1.1 from training. Duration: 0.736375 2023-10-02 02:10:15,065 WARNING [train.py:1204] (2/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_269-154726-0 from training. Duration: 0.86 2023-10-02 02:10:16,726 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.conv_module1.balancer2.prob, batch_count=713880.0, ans=0.125 2023-10-02 02:10:22,081 WARNING [train.py:1204] (2/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_326-165900-0_sp0.9 from training. Duration: 0.7666875 2023-10-02 02:10:23,522 INFO [train.py:1046] (2/4) Epoch 21, batch 850, loss[loss=0.2307, simple_loss=0.2918, pruned_loss=0.08474, over 19361.00 frames. ], tot_loss[loss=0.1766, simple_loss=0.2525, pruned_loss=0.05039, over 4648536.46 frames. ], batch size: 388, lr: 4.94e-03, grad_scale: 16.0 2023-10-02 02:10:23,611 WARNING [train.py:1204] (2/4) Exclude cut with ID _5294_210_1747_1_1527249643928_3696179_470-276077-0_sp1.1 from training. Duration: 0.963625 2023-10-02 02:10:23,651 WARNING [train.py:1204] (2/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_372-131163-0 from training. Duration: 0.94 2023-10-02 02:10:25,043 WARNING [train.py:1204] (2/4) Exclude cut with ID _45297_210_2721_1_1533715351381_3264949_351-147136-0_sp1.1 from training. Duration: 0.7909375 2023-10-02 02:10:25,122 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_1072-12671-0_sp1.1 from training. Duration: 0.92725 2023-10-02 02:10:26,585 WARNING [train.py:1204] (2/4) Exclude cut with ID _15133_210_6228_1_1530794512074_3080379_54-198193-0 from training. Duration: 0.94 2023-10-02 02:10:26,601 WARNING [train.py:1204] (2/4) Exclude cut with ID _30196_210_1664_1_1533708496522_7074140_1194-270988-0_sp1.1 from training. Duration: 0.97275 2023-10-02 02:10:27,916 WARNING [train.py:1204] (2/4) Exclude cut with ID _50310_210_15488_1_1534068043345_3641080_289-193637-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 02:10:28,006 WARNING [train.py:1204] (2/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_313-263495-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 02:10:28,223 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.conv_skip_rate, batch_count=713946.6666666666, ans=0.0 2023-10-02 02:10:29,438 WARNING [train.py:1204] (2/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_18-130336-0_sp1.1 from training. Duration: 0.7181875 2023-10-02 02:10:30,858 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_47896_210_20508_1_1534492518_5958868_646-287209-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 02:10:33,590 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_294-163793-0 from training. Duration: 0.82 2023-10-02 02:10:33,631 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_8-319957-0 from training. Duration: 0.96 2023-10-02 02:10:33,636 WARNING [train.py:1204] (2/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_1211-226066-0 from training. Duration: 0.76 2023-10-02 02:10:35,058 WARNING [train.py:1204] (2/4) Exclude cut with ID _4779_210_2084_1_1527318022944_3914893_500-285013-0_sp0.9 from training. Duration: 0.7666875 2023-10-02 02:10:35,075 WARNING [train.py:1204] (2/4) Exclude cut with ID _22826_210_9994_1_1531908201522_3279169_347-232268-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 02:10:37,111 WARNING [train.py:1204] (2/4) Exclude cut with ID _29877_210_6228_1_1532397791106_5276149_111-55056-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 02:10:39,033 WARNING [train.py:1204] (2/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_812-271646-0_sp1.1 from training. Duration: 0.92725 2023-10-02 02:10:39,078 WARNING [train.py:1204] (2/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_235-262124-0_sp1.1 from training. Duration: 0.6090625 2023-10-02 02:10:40,748 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.feed_forward1.hidden_balancer.prob, batch_count=714013.3333333334, ans=0.125 2023-10-02 02:10:44,372 WARNING [train.py:1204] (2/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_14-16530-0_sp1.1 from training. Duration: 0.97275 2023-10-02 02:10:44,406 WARNING [train.py:1204] (2/4) Exclude cut with ID _44718_210_5816_1_1534050051869_6469030_568-304512-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 02:10:44,427 WARNING [train.py:1204] (2/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_1033-274376-0 from training. Duration: 0.87 2023-10-02 02:10:47,524 INFO [scaling.py:1118] (2/4) WithLoss: name=encoder.encoders.5.encoder.layers.0.self_attn_weights, loss-sum=0.000e+00 2023-10-02 02:10:48,675 WARNING [train.py:1204] (2/4) Exclude cut with ID _4681_210_3525_1_1527251040321_1235713_123-24428-0 from training. Duration: 0.48 2023-10-02 02:10:50,394 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.1.conv_module2.balancer1.prob, batch_count=714013.3333333334, ans=0.125 2023-10-02 02:10:51,486 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_37818_210_4238_1_1533620910_4720083_251-288764-0_sp1.1 from training. Duration: 0.97275 2023-10-02 02:10:52,868 WARNING [train.py:1204] (2/4) Exclude cut with ID _65585_210_12210_1_1535450451584_3764098_112-294301-0 from training. Duration: 0.88 2023-10-02 02:10:53,757 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.2.encoder.layers.1.nonlin_attention.whiten1, num_groups=1, num_channels=288, metric=5.39 vs. limit=10.0 2023-10-02 02:10:55,855 WARNING [train.py:1204] (2/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_329-113173-0 from training. Duration: 0.62 2023-10-02 02:10:57,232 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6767_210_3273_1_1527855783_1718932_173-256601-0 from training. Duration: 0.8 2023-10-02 02:11:00,004 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9266W0001-627677 from training. Duration: 0.896 2023-10-02 02:11:00,014 WARNING [train.py:1204] (2/4) Exclude cut with ID _49643_210_7989_1_1534640244224_3939031_611-185515-0_sp1.1 from training. Duration: 0.8545625 2023-10-02 02:11:00,020 WARNING [train.py:1204] (2/4) Exclude cut with ID _31649_210_6228_1_1532584608063_4091078_687-237771-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 02:11:00,029 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_12868_210_6753_1_1530702479_3610564_148-172861-0_sp0.9 from training. Duration: 0.5555625 2023-10-02 02:11:02,795 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_5129_210_6400_1_1527161005_3579054_107-69738-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 02:11:04,140 WARNING [train.py:1204] (2/4) Exclude cut with ID _40137_210_12210_1_1533290484046_3795180_36-301164-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 02:11:04,174 WARNING [train.py:1204] (2/4) Exclude cut with ID _7537_210_2084_1_1528502395431_3924359_113-199933-0 from training. Duration: 0.59 2023-10-02 02:11:06,932 WARNING [train.py:1204] (2/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_547-5424-0_sp1.1 from training. Duration: 0.8545625 2023-10-02 02:11:08,385 WARNING [train.py:1204] (2/4) Exclude cut with ID _43552_210_8913_1_1533726031059_3523429_147-284024-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 02:11:08,461 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_294-163793-0_sp1.1 from training. Duration: 0.7454375 2023-10-02 02:11:10,482 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_3780_210_2087_1_1526467760_2130483_170-259044-0_sp1.1 from training. Duration: 0.636375 2023-10-02 02:11:10,630 WARNING [train.py:1204] (2/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_726-270780-0_sp1.1 from training. Duration: 0.7818125 2023-10-02 02:11:12,789 WARNING [train.py:1204] (2/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_214-291369-0_sp1.1 from training. Duration: 0.57275 2023-10-02 02:11:12,843 WARNING [train.py:1204] (2/4) Exclude cut with ID _21881_210_3525_1_1531825242757_3798380_247-102591-0 from training. Duration: 0.98 2023-10-02 02:11:17,472 WARNING [train.py:1204] (2/4) Exclude cut with ID _8064_210_5399_1_1528372299291_4282973_18-174647-0_sp1.1 from training. Duration: 0.82725 2023-10-02 02:11:17,474 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_415-52611-0_sp1.1 from training. Duration: 0.936375 2023-10-02 02:11:18,878 WARNING [train.py:1204] (2/4) Exclude cut with ID _8394_210_6702_1_1528590504316_3540189_379-49457-0_sp0.9 from training. Duration: 0.9666875 2023-10-02 02:11:18,892 WARNING [train.py:1204] (2/4) Exclude cut with ID _64521_210_14699_1_1535243427259_3815328_368-195106-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 02:11:18,962 WARNING [train.py:1204] (2/4) Exclude cut with ID _28394_210_8913_1_1532430049518_3650118_51-334124-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 02:11:20,387 WARNING [train.py:1204] (2/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_317-161769-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 02:11:21,821 WARNING [train.py:1204] (2/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_40-129756-0_sp0.9 from training. Duration: 0.9 2023-10-02 02:11:23,353 WARNING [train.py:1204] (2/4) Exclude cut with ID _3067_210_2667_1_1526212974640_4503168_119-175776-0_sp0.9 from training. Duration: 0.82225 2023-10-02 02:11:24,723 WARNING [train.py:1204] (2/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_1004-304915-0_sp1.1 from training. Duration: 0.92725 2023-10-02 02:11:26,076 WARNING [train.py:1204] (2/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_359-118926-0_sp0.9 from training. Duration: 0.8 2023-10-02 02:11:33,025 WARNING [train.py:1204] (2/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_51-200220-0_sp0.9 from training. Duration: 0.62225 2023-10-02 02:11:33,289 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.conv_module1.balancer2.prob, batch_count=714213.3333333334, ans=0.125 2023-10-02 02:11:34,321 WARNING [train.py:1204] (2/4) Exclude cut with ID _46683_210_9642_1_1533730913492_4591116_530-326202-0_sp1.1 from training. Duration: 0.936375 2023-10-02 02:11:34,368 WARNING [train.py:1204] (2/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_567-135114-0 from training. Duration: 0.87 2023-10-02 02:11:34,504 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.nonlin_attention.balancer.min_positive, batch_count=714213.3333333334, ans=0.05 2023-10-02 02:11:35,731 WARNING [train.py:1204] (2/4) Exclude cut with ID _66863_210_18053_1_1535698758334_8590319_1092-271784-0_sp1.1 from training. Duration: 0.87275 2023-10-02 02:11:35,750 WARNING [train.py:1204] (2/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_762-282614-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 02:11:37,346 WARNING [train.py:1204] (2/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_280-120942-0 from training. Duration: 0.77 2023-10-02 02:11:38,248 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.1.encoder.layers.0.feed_forward1.out_whiten, num_groups=1, num_channels=256, metric=5.49 vs. limit=15.0 2023-10-02 02:11:38,614 INFO [train.py:1046] (2/4) Epoch 21, batch 900, loss[loss=0.1743, simple_loss=0.2441, pruned_loss=0.05224, over 24453.00 frames. ], tot_loss[loss=0.1762, simple_loss=0.2522, pruned_loss=0.05008, over 4667208.31 frames. ], batch size: 58, lr: 4.94e-03, grad_scale: 16.0 2023-10-02 02:11:42,405 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_31583_210_3852_1_1532595851_4024413_27-170931-0_sp1.1 from training. Duration: 0.963625 2023-10-02 02:11:45,191 WARNING [train.py:1204] (2/4) Exclude cut with ID _12369_210_4381_1_1530356078581_4388520_266-271628-0_sp1.1 from training. Duration: 0.92725 2023-10-02 02:11:47,111 WARNING [train.py:1204] (2/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_41-31157-0 from training. Duration: 0.57 2023-10-02 02:11:48,668 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.bypass.skip_rate, batch_count=714280.0, ans=0.09899494936611666 2023-10-02 02:11:51,136 WARNING [train.py:1204] (2/4) Exclude cut with ID _21878_210_3525_1_1531738772002_7264330_113-18163-0_sp1.1 from training. Duration: 0.8090625 2023-10-02 02:11:51,177 WARNING [train.py:1204] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_147-111137-0 from training. Duration: 0.95 2023-10-02 02:11:52,585 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1083-220122-0_sp1.1 from training. Duration: 0.463625 2023-10-02 02:11:52,902 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.nonlin_attention.balancer.prob, batch_count=714346.6666666666, ans=0.125 2023-10-02 02:11:54,016 WARNING [train.py:1204] (2/4) Exclude cut with ID _14884_210_2252_1_1530707459689_5561250_591-173406-0_sp1.1 from training. Duration: 0.87275 2023-10-02 02:11:54,022 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_24677_210_1794_1_1532002693_4341296_149-303793-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 02:11:54,079 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_12868_210_6753_1_1530702479_3610564_133-564-0_sp1.1 from training. Duration: 0.7181875 2023-10-02 02:11:54,103 WARNING [train.py:1204] (2/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_625-338546-0_sp0.9 from training. Duration: 0.97775 2023-10-02 02:12:00,924 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.512e+02 1.833e+02 2.065e+02 2.368e+02 4.209e+02, threshold=4.129e+02, percent-clipped=1.0 2023-10-02 02:12:02,522 WARNING [train.py:1204] (2/4) Exclude cut with ID _25882_210_3036_1_1532775665815_6479510_624-320169-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 02:12:02,537 WARNING [train.py:1204] (2/4) Exclude cut with ID _20622_210_3852_1_1531622427080_1093839_127-325326-0_sp1.1 from training. Duration: 0.92725 2023-10-02 02:12:02,560 WARNING [train.py:1204] (2/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_464-33931-0_sp1.1 from training. Duration: 0.4909375 2023-10-02 02:12:04,133 WARNING [train.py:1204] (2/4) Exclude cut with ID _66112_210_10635_1_1535590236305_3801484_120-304588-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 02:12:11,744 WARNING [train.py:1204] (2/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_110-130961-0 from training. Duration: 0.47 2023-10-02 02:12:13,841 WARNING [train.py:1204] (2/4) Exclude cut with ID _16908_210_5724_1_1531119157248_3772700_64-54020-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 02:12:20,003 WARNING [train.py:1204] (2/4) Exclude cut with ID _4068_210_2629_1_1526697350395_1546259_224-288693-0_sp1.1 from training. Duration: 0.536375 2023-10-02 02:12:20,048 WARNING [train.py:1204] (2/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_191-41023-0_sp0.9 from training. Duration: 0.788875 2023-10-02 02:12:21,438 WARNING [train.py:1204] (2/4) Exclude cut with ID ID0205W0255-783002 from training. Duration: 0.958 2023-10-02 02:12:21,513 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_162-262717-0 from training. Duration: 0.83 2023-10-02 02:12:27,166 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_224-322394-0_sp1.1 from training. Duration: 0.6 2023-10-02 02:12:27,180 WARNING [train.py:1204] (2/4) Exclude cut with ID _4866_210_2577_1_1527404412284_4364659_133-1696-0_sp1.1 from training. Duration: 0.9 2023-10-02 02:12:28,522 WARNING [train.py:1204] (2/4) Exclude cut with ID _6275_210_2084_1_1528110062213_7669049_973-198085-0_sp1.1 from training. Duration: 0.6818125 2023-10-02 02:12:35,586 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_47449_210_5399_1_1534076445_4006929_410-237806-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 02:12:35,608 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7543_210_6753_1_1528543318_4552_1-85072-0_sp0.9 from training. Duration: 0.92225 2023-10-02 02:12:35,831 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=714480.0, ans=0.1 2023-10-02 02:12:38,289 WARNING [train.py:1204] (2/4) Exclude cut with ID _65263_210_22356_1_1535763598691_3921037_108-21902-0 from training. Duration: 0.99 2023-10-02 02:12:38,293 WARNING [train.py:1204] (2/4) Exclude cut with ID _65585_210_12210_1_1535450451584_3764098_309-174818-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 02:12:41,075 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_309-65177-0 from training. Duration: 0.88 2023-10-02 02:12:41,251 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.0.conv_module2.balancer2.prob, batch_count=714546.6666666666, ans=0.125 2023-10-02 02:12:43,207 WARNING [train.py:1204] (2/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_363-185703-0_sp1.1 from training. Duration: 0.863625 2023-10-02 02:12:43,234 WARNING [train.py:1204] (2/4) Exclude cut with ID _42326_210_7770_1_1533814282531_3707269_442-238692-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 02:12:43,389 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.conv_module2.balancer2.min_abs, batch_count=714546.6666666666, ans=0.5 2023-10-02 02:12:43,804 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.4.encoder.layers.1.self_attn_weights.whiten_keys, num_groups=4, num_channels=128, metric=2.16 vs. limit=6.0 2023-10-02 02:12:44,635 WARNING [train.py:1204] (2/4) Exclude cut with ID _69884_210_9771_1_1535846392818_4166169_226-254446-0_sp1.1 from training. Duration: 0.936375 2023-10-02 02:12:46,376 WARNING [train.py:1204] (2/4) Exclude cut with ID _29853_210_7497_1_1532505600851_6642110_287-299903-0_sp1.1 from training. Duration: 0.963625 2023-10-02 02:12:51,645 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1015-343884-0 from training. Duration: 0.62 2023-10-02 02:12:51,681 WARNING [train.py:1204] (2/4) Exclude cut with ID ID0305W0008-833402 from training. Duration: 0.91 2023-10-02 02:12:52,916 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6673_210_3273_1_1527767790_3173240_351-167954-0_sp1.1 from training. Duration: 0.436375 2023-10-02 02:12:52,928 WARNING [train.py:1204] (2/4) Exclude cut with ID _6456_210_1534_1_1527681634212_3181049_345-252877-0 from training. Duration: 0.64 2023-10-02 02:12:54,268 INFO [train.py:1046] (2/4) Epoch 21, batch 950, loss[loss=0.2071, simple_loss=0.2712, pruned_loss=0.07154, over 19719.00 frames. ], tot_loss[loss=0.1764, simple_loss=0.2521, pruned_loss=0.05034, over 4674606.46 frames. ], batch size: 388, lr: 4.94e-03, grad_scale: 16.0 2023-10-02 02:12:55,683 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_13-205182-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 02:12:59,697 WARNING [train.py:1204] (2/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_427-208901-0 from training. Duration: 0.68 2023-10-02 02:13:02,554 WARNING [train.py:1204] (2/4) Exclude cut with ID _49039_210_6315_1_1533967537541_3846118_9-148921-0_sp1.1 from training. Duration: 0.97275 2023-10-02 02:13:05,959 WARNING [train.py:1204] (2/4) Exclude cut with ID _59493_210_17282_1_1535078419077_3225879_143-70317-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 02:13:05,992 WARNING [train.py:1204] (2/4) Exclude cut with ID _27967_210_12140_1_1532588391859_8419130_1056-236369-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 02:13:07,277 WARNING [train.py:1204] (2/4) Exclude cut with ID _6537_210_1883_1_1527987559710_3597129_382-133062-0_sp0.9 from training. Duration: 0.7555625 2023-10-02 02:13:09,891 WARNING [train.py:1204] (2/4) Exclude cut with ID ID0376W0255-869957 from training. Duration: 0.9510625 2023-10-02 02:13:14,091 WARNING [train.py:1204] (2/4) Exclude cut with ID _49371_210_12071_1_1533952761021_3812190_483-240691-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 02:13:14,149 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_12868_210_6753_1_1530701932_530275_18-76322-0_sp1.1 from training. Duration: 0.7909375 2023-10-02 02:13:14,205 WARNING [train.py:1204] (2/4) Exclude cut with ID _53950_210_5816_1_1534417264919_3862951_367-26344-0_sp1.1 from training. Duration: 0.97275 2023-10-02 02:13:14,223 WARNING [train.py:1204] (2/4) Exclude cut with ID _5444_210_2151_1_1527418808464_5312210_634-194174-0_sp1.1 from training. Duration: 0.8818125 2023-10-02 02:13:16,228 WARNING [train.py:1204] (2/4) Exclude cut with ID _56494_210_4220_1_1535284837625_3033169_69-138150-0 from training. Duration: 0.94 2023-10-02 02:13:16,340 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7543_210_6753_1_1528544010_1110402_25-183860-0_sp0.9 from training. Duration: 0.52225 2023-10-02 02:13:17,766 WARNING [train.py:1204] (2/4) Exclude cut with ID _40284_210_2084_1_1533513679814_3704279_287-191918-0_sp1.1 from training. Duration: 0.963625 2023-10-02 02:13:20,004 WARNING [train.py:1204] (2/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_291-270881-0 from training. Duration: 0.97 2023-10-02 02:13:20,047 WARNING [train.py:1204] (2/4) Exclude cut with ID _12555_210_8750_1_1530075594402_3780239_449-267986-0_sp0.9 from training. Duration: 0.92225 2023-10-02 02:13:22,853 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.1.encoder.layers.1.conv_module1.whiten, num_groups=1, num_channels=256, metric=12.13 vs. limit=15.0 2023-10-02 02:13:24,738 WARNING [train.py:1204] (2/4) Exclude cut with ID _3992_210_1747_1_1526644816031_3731060_268-64520-0_sp1.1 from training. Duration: 0.963625 2023-10-02 02:13:26,003 WARNING [train.py:1204] (2/4) Exclude cut with ID _39411_210_5986_1_1533538825030_3343649_173-238248-0_sp0.9 from training. Duration: 0.92225 2023-10-02 02:13:26,032 WARNING [train.py:1204] (2/4) Exclude cut with ID _32704_210_8954_1_1533017488840_6747370_828-313097-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 02:13:27,391 WARNING [train.py:1204] (2/4) Exclude cut with ID _26907_210_7234_1_1532221740694_3205263_337-2494-0 from training. Duration: 0.88 2023-10-02 02:13:28,994 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_229-45300-0_sp1.1 from training. Duration: 0.5181875 2023-10-02 02:13:30,386 WARNING [train.py:1204] (2/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_300-27235-0_sp1.1 from training. Duration: 0.7909375 2023-10-02 02:13:31,786 WARNING [train.py:1204] (2/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_48-338976-0_sp1.1 from training. Duration: 0.8090625 2023-10-02 02:13:34,752 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.nonlin_attention.balancer.prob, batch_count=714746.6666666666, ans=0.125 2023-10-02 02:13:35,773 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_13091_210_7925_1_1530609447_8987354_792-195353-0_sp1.1 from training. Duration: 0.936375 2023-10-02 02:13:35,779 WARNING [train.py:1204] (2/4) Exclude cut with ID _56524_210_9642_1_1534572011779_3619170_311-54500-0_sp1.1 from training. Duration: 0.97275 2023-10-02 02:13:38,655 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7543_210_6753_1_1528544010_1110402_25-183860-0 from training. Duration: 0.47 2023-10-02 02:13:38,918 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.feed_forward1.hidden_balancer.prob, batch_count=714813.3333333334, ans=0.125 2023-10-02 02:13:40,342 WARNING [train.py:1204] (2/4) Exclude cut with ID 2411-132532-0017-82279-0_sp1.1 from training. Duration: 0.9681875 2023-10-02 02:13:40,343 WARNING [train.py:1204] (2/4) Exclude cut with ID _14170_210_5219_1_1530525949868_4469650_272-249985-0_sp0.9 from training. Duration: 0.7444375 2023-10-02 02:13:42,988 WARNING [train.py:1204] (2/4) Exclude cut with ID _55644_210_11856_1_1534593599582_4103719_320-229006-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 02:13:43,054 WARNING [train.py:1204] (2/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_177-100554-0_sp1.1 from training. Duration: 0.963625 2023-10-02 02:13:43,056 WARNING [train.py:1204] (2/4) Exclude cut with ID _14998_210_5219_1_1530705582080_7474970_787-294416-0_sp1.1 from training. Duration: 0.7454375 2023-10-02 02:13:46,458 WARNING [train.py:1204] (2/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_318-59097-0 from training. Duration: 0.76 2023-10-02 02:13:47,819 WARNING [train.py:1204] (2/4) Exclude cut with ID _12369_210_4381_1_1530356078581_4388520_480-13146-0_sp1.1 from training. Duration: 0.77275 2023-10-02 02:13:49,872 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_469-338558-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 02:13:51,896 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_40896_210_15710_1_1533433956_7100802_367-126897-0_sp1.1 from training. Duration: 0.963625 2023-10-02 02:13:51,914 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6545_210_2040_1_1527774670_4245258_379-14182-0 from training. Duration: 0.8 2023-10-02 02:13:51,926 WARNING [train.py:1204] (2/4) Exclude cut with ID _21631_210_2253_1_1531907880778_3160339_197-79498-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 02:13:51,930 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_12868_210_6753_1_1530702479_3610564_416-293746-0_sp1.1 from training. Duration: 0.8181875 2023-10-02 02:13:51,970 WARNING [train.py:1204] (2/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_52-211683-0 from training. Duration: 0.9 2023-10-02 02:13:56,037 WARNING [train.py:1204] (2/4) Exclude cut with ID _48678_210_5663_1_1533988830947_3769199_306-255886-0_sp0.9 from training. Duration: 0.8555625 2023-10-02 02:13:58,910 WARNING [train.py:1204] (2/4) Exclude cut with ID _65134_210_14763_1_1535335393985_3505368_296-24733-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 02:13:59,104 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.0.nonlin_attention.balancer.prob, batch_count=714880.0, ans=0.125 2023-10-02 02:14:01,913 WARNING [train.py:1204] (2/4) Exclude cut with ID _14898_210_9206_1_1530766812377_4177190_335-99364-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 02:14:03,191 WARNING [train.py:1204] (2/4) Exclude cut with ID _54871_210_22356_1_1534665798253_3856359_19-342632-0 from training. Duration: 0.91 2023-10-02 02:14:03,207 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_53071_210_16554_1_1534490405_2954066_265-170472-0 from training. Duration: 0.83 2023-10-02 02:14:06,097 WARNING [train.py:1204] (2/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_189-298172-0_sp1.1 from training. Duration: 0.963625 2023-10-02 02:14:08,787 INFO [train.py:1046] (2/4) Epoch 21, batch 1000, loss[loss=0.152, simple_loss=0.2182, pruned_loss=0.04288, over 23645.00 frames. ], tot_loss[loss=0.1757, simple_loss=0.2509, pruned_loss=0.05024, over 4678009.35 frames. ], batch size: 232, lr: 4.93e-03, grad_scale: 16.0 2023-10-02 02:14:08,929 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7615_210_4211_1_1528110965_4580160_72-235211-0 from training. Duration: 0.76 2023-10-02 02:14:08,974 WARNING [train.py:1204] (2/4) Exclude cut with ID _47790_210_16187_1_1533862814066_4207519_155-90874-0_sp1.1 from training. Duration: 0.936375 2023-10-02 02:14:11,342 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.0.layers.1.feed_forward1.out_whiten, num_groups=1, num_channels=192, metric=4.71 vs. limit=15.0 2023-10-02 02:14:14,881 WARNING [train.py:1204] (2/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_164-216310-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 02:14:16,784 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_46013_210_7234_1_1533776362_3744032_463-52378-0 from training. Duration: 0.86 2023-10-02 02:14:16,787 WARNING [train.py:1204] (2/4) Exclude cut with ID _61133_210_9969_1_1535196777227_3696409_333-270325-0 from training. Duration: 0.93 2023-10-02 02:14:17,093 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.conv_module2.balancer2.prob, batch_count=714946.6666666666, ans=0.125 2023-10-02 02:14:22,108 WARNING [train.py:1204] (2/4) Exclude cut with ID _5172_210_1883_1_1527246062567_3577219_18-101380-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 02:14:22,116 WARNING [train.py:1204] (2/4) Exclude cut with ID _30800_210_22382_1_1532499347904_4515959_577-236858-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 02:14:24,749 WARNING [train.py:1204] (2/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_1006-177345-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 02:14:27,377 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6291_210_3273_1_1527683649_1078498_4-266348-0 from training. Duration: 0.78 2023-10-02 02:14:30,328 WARNING [train.py:1204] (2/4) Exclude cut with ID _55857_210_2346_1_1534644007831_6898209_745-98391-0 from training. Duration: 0.76 2023-10-02 02:14:31,673 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.523e+02 1.852e+02 2.059e+02 2.423e+02 3.876e+02, threshold=4.119e+02, percent-clipped=0.0 2023-10-02 02:14:31,771 WARNING [train.py:1204] (2/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_375-244976-0 from training. Duration: 0.75 2023-10-02 02:14:32,302 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.4.encoder.layers.0.nonlin_attention.whiten2, num_groups=1, num_channels=384, metric=10.57 vs. limit=15.0 2023-10-02 02:14:32,518 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.0.layers.1.whiten, num_groups=1, num_channels=192, metric=3.44 vs. limit=12.0 2023-10-02 02:14:33,156 WARNING [train.py:1204] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_4-248404-0_sp1.1 from training. Duration: 0.97275 2023-10-02 02:14:34,500 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8453_210_4211_1_1528502940_8562730_596-306820-0 from training. Duration: 0.97 2023-10-02 02:14:37,100 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_211-339622-0_sp0.9 from training. Duration: 0.5 2023-10-02 02:14:37,126 WARNING [train.py:1204] (2/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_393-57888-0 from training. Duration: 0.64 2023-10-02 02:14:37,226 WARNING [train.py:1204] (2/4) Exclude cut with ID _23825_210_13449_1_1531915313526_3497150_143-343575-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 02:14:38,642 WARNING [train.py:1204] (2/4) Exclude cut with ID _58605_210_12210_1_1534723195939_3904259_36-12256-0_sp1.1 from training. Duration: 0.936375 2023-10-02 02:14:47,587 WARNING [train.py:1204] (2/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_38-67795-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 02:14:47,646 WARNING [train.py:1204] (2/4) Exclude cut with ID _64631_210_27767_1_1535349580068_4165160_308-327832-0_sp1.1 from training. Duration: 0.92725 2023-10-02 02:14:49,589 WARNING [train.py:1204] (2/4) Exclude cut with ID _37086_210_9038_1_1532999540891_7021250_726-81176-0_sp1.1 from training. Duration: 0.936375 2023-10-02 02:14:51,535 WARNING [train.py:1204] (2/4) Exclude cut with ID _47790_210_16187_1_1533862814066_4207519_47-141521-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 02:14:51,540 WARNING [train.py:1204] (2/4) Exclude cut with ID _3067_210_2667_1_1526212974640_4503168_250-285937-0 from training. Duration: 0.8 2023-10-02 02:14:51,579 WARNING [train.py:1204] (2/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_239-24000-0_sp1.1 from training. Duration: 0.97275 2023-10-02 02:14:51,822 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=715080.0, ans=0.1 2023-10-02 02:14:52,866 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_3371_210_1794_1_1526384462_5580777_396-339812-0_sp0.9 from training. Duration: 0.9444375 2023-10-02 02:14:54,288 WARNING [train.py:1204] (2/4) Exclude cut with ID _16909_210_5724_1_1531292079912_4118919_580-173454-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 02:14:54,358 WARNING [train.py:1204] (2/4) Exclude cut with ID IC0124W0002-61308 from training. Duration: 0.982 2023-10-02 02:14:57,067 WARNING [train.py:1204] (2/4) Exclude cut with ID _31602_210_5854_1_1532677021456_4458250_249-96604-0 from training. Duration: 0.89 2023-10-02 02:14:58,366 WARNING [train.py:1204] (2/4) Exclude cut with ID _69584_210_5219_1_1535785133775_4044450_325-7656-0 from training. Duration: 0.86 2023-10-02 02:15:01,012 WARNING [train.py:1204] (2/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_478-162247-0 from training. Duration: 0.82 2023-10-02 02:15:02,515 WARNING [train.py:1204] (2/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_686-54383-0_sp1.1 from training. Duration: 0.7909375 2023-10-02 02:15:08,170 WARNING [train.py:1204] (2/4) Exclude cut with ID _29303_210_2575_1_1532392237303_7410649_85-270092-0_sp1.1 from training. Duration: 0.936375 2023-10-02 02:15:08,187 WARNING [train.py:1204] (2/4) Exclude cut with ID _2322_210_2667_1_1525604548971_3656299_440-80494-0_sp1.1 from training. Duration: 0.8 2023-10-02 02:15:09,495 WARNING [train.py:1204] (2/4) Exclude cut with ID _47346_210_9476_1_1533815971553_3536160_201-40644-0_sp1.1 from training. Duration: 0.936375 2023-10-02 02:15:09,603 WARNING [train.py:1204] (2/4) Exclude cut with ID _56235_210_10640_1_1534557335235_4161099_343-139898-0_sp1.1 from training. Duration: 0.87275 2023-10-02 02:15:10,991 WARNING [train.py:1204] (2/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_1012-279473-0 from training. Duration: 0.67 2023-10-02 02:15:12,341 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7931_210_1794_1_1528594728_5564059_280-279560-0_sp1.1 from training. Duration: 0.8909375 2023-10-02 02:15:12,384 WARNING [train.py:1204] (2/4) Exclude cut with ID _8263_210_1664_1_1528439452891_970220_68-181805-0 from training. Duration: 0.64 2023-10-02 02:15:12,423 WARNING [train.py:1204] (2/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_605-133395-0 from training. Duration: 0.66 2023-10-02 02:15:13,892 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_43-171109-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 02:15:13,895 WARNING [train.py:1204] (2/4) Exclude cut with ID _40094_210_16380_1_1533294005773_3641159_107-280739-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 02:15:16,670 WARNING [train.py:1204] (2/4) Exclude cut with ID _14821_210_8750_1_1530680503991_6829359_2-227151-0_sp0.9 from training. Duration: 0.92225 2023-10-02 02:15:20,466 WARNING [train.py:1204] (2/4) Exclude cut with ID _14268_210_9206_1_1530622771285_4714099_331-105351-0_sp1.1 from training. Duration: 0.5909375 2023-10-02 02:15:21,801 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_26326_210_7925_1_1533002493_3592406_451-82741-0_sp1.1 from training. Duration: 0.97275 2023-10-02 02:15:23,777 INFO [train.py:1046] (2/4) Epoch 21, batch 1050, loss[loss=0.1783, simple_loss=0.248, pruned_loss=0.05431, over 23719.00 frames. ], tot_loss[loss=0.1748, simple_loss=0.2493, pruned_loss=0.0501, over 4674405.11 frames. ], batch size: 179, lr: 4.93e-03, grad_scale: 16.0 2023-10-02 02:15:23,917 WARNING [train.py:1204] (2/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_191-59784-0_sp0.9 from training. Duration: 0.97775 2023-10-02 02:15:25,307 WARNING [train.py:1204] (2/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_445-117398-0_sp1.1 from training. Duration: 0.8090625 2023-10-02 02:15:26,718 WARNING [train.py:1204] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_605-257205-0_sp0.9 from training. Duration: 0.6555625 2023-10-02 02:15:28,046 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_50050_210_16223_1_1535435796_7290138_592-258841-0_sp1.1 from training. Duration: 0.936375 2023-10-02 02:15:29,523 WARNING [train.py:1204] (2/4) Exclude cut with ID _14348_210_8341_1_1530599434996_3778640_240-232370-0_sp0.9 from training. Duration: 0.9333125 2023-10-02 02:15:32,241 WARNING [train.py:1204] (2/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_239-316802-0_sp0.9 from training. Duration: 0.7666875 2023-10-02 02:15:33,580 WARNING [train.py:1204] (2/4) Exclude cut with ID _45560_210_21982_1_1533812603951_3584999_111-6376-0_sp1.1 from training. Duration: 0.763625 2023-10-02 02:15:36,298 WARNING [train.py:1204] (2/4) Exclude cut with ID _54189_210_16380_1_1534332670039_3911710_606-311481-0_sp1.1 from training. Duration: 0.963625 2023-10-02 02:15:36,390 WARNING [train.py:1204] (2/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_237-332027-0_sp1.1 from training. Duration: 0.863625 2023-10-02 02:15:36,415 WARNING [train.py:1204] (2/4) Exclude cut with ID _66070_210_7640_1_1535441393096_7805310_489-292631-0_sp0.9 from training. Duration: 0.87775 2023-10-02 02:15:37,697 WARNING [train.py:1204] (2/4) Exclude cut with ID _49728_210_12651_1_1534041034294_6788150_624-195301-0_sp1.1 from training. Duration: 0.77275 2023-10-02 02:15:39,165 WARNING [train.py:1204] (2/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_44-79427-0 from training. Duration: 0.98 2023-10-02 02:15:39,232 WARNING [train.py:1204] (2/4) Exclude cut with ID _35350_210_16040_1_1533040191183_4075299_490-198354-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 02:15:39,270 WARNING [train.py:1204] (2/4) Exclude cut with ID _5166_210_2084_1_1527073197160_4087993_309-222545-0 from training. Duration: 0.84 2023-10-02 02:15:40,707 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_13091_210_7925_1_1530609447_8987354_790-199469-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 02:15:40,712 WARNING [train.py:1204] (2/4) Exclude cut with ID _21632_210_2253_1_1531994001765_3744140_110-274654-0 from training. Duration: 0.82 2023-10-02 02:15:42,091 WARNING [train.py:1204] (2/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_498-40153-0_sp0.9 from training. Duration: 0.7 2023-10-02 02:15:48,727 WARNING [train.py:1204] (2/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_363-282909-0_sp1.1 from training. Duration: 0.936375 2023-10-02 02:15:48,794 WARNING [train.py:1204] (2/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_178-177582-0_sp0.9 from training. Duration: 0.82225 2023-10-02 02:15:48,812 WARNING [train.py:1204] (2/4) Exclude cut with ID _24612_210_8913_1_1531998054193_3806359_357-35487-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 02:15:53,393 WARNING [train.py:1204] (2/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_138-332773-0 from training. Duration: 0.81 2023-10-02 02:15:53,417 WARNING [train.py:1204] (2/4) Exclude cut with ID _7624_210_1531_1_1528109802198_4933101_418-349834-0 from training. Duration: 0.6 2023-10-02 02:15:54,699 WARNING [train.py:1204] (2/4) Exclude cut with ID _7537_210_2084_1_1528502395431_3924359_14-312906-0_sp0.9 from training. Duration: 0.9333125 2023-10-02 02:15:56,161 WARNING [train.py:1204] (2/4) Exclude cut with ID _60057_210_10640_1_1534903065793_3333150_362-230377-0 from training. Duration: 0.87 2023-10-02 02:15:59,060 WARNING [train.py:1204] (2/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_138-64210-0 from training. Duration: 0.73 2023-10-02 02:16:00,386 WARNING [train.py:1204] (2/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_454-339504-0_sp1.1 from training. Duration: 0.97275 2023-10-02 02:16:03,128 WARNING [train.py:1204] (2/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_508-335369-0_sp0.9 from training. Duration: 0.5333125 2023-10-02 02:16:04,593 WARNING [train.py:1204] (2/4) Exclude cut with ID _5608_210_2106_1_1527415439089_6819768_909-82767-0_sp0.9 from training. Duration: 0.67775 2023-10-02 02:16:05,865 WARNING [train.py:1204] (2/4) Exclude cut with ID _6694_210_4599_1_1527852637490_3965529_99-11611-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 02:16:05,922 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_2655_210_2087_1_1525688959_3897400_52-279899-0_sp1.1 from training. Duration: 0.62725 2023-10-02 02:16:08,829 WARNING [train.py:1204] (2/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_375-245677-0_sp1.1 from training. Duration: 0.736375 2023-10-02 02:16:08,976 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.ff3_skip_rate, batch_count=715480.0, ans=0.0 2023-10-02 02:16:11,746 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_23850_210_4238_1_1532232597_1632433_32-185135-0 from training. Duration: 0.86 2023-10-02 02:16:13,057 WARNING [train.py:1204] (2/4) Exclude cut with ID _15113_210_8254_1_1530842438579_3716279_209-54533-0 from training. Duration: 0.96 2023-10-02 02:16:13,094 WARNING [train.py:1204] (2/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_401-53423-0 from training. Duration: 0.55 2023-10-02 02:16:13,312 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.feed_forward1.hidden_balancer.prob, batch_count=715480.0, ans=0.125 2023-10-02 02:16:14,395 WARNING [train.py:1204] (2/4) Exclude cut with ID _14867_210_1968_1_1530684179890_3846969_319-250868-0_sp1.1 from training. Duration: 0.9 2023-10-02 02:16:14,421 WARNING [train.py:1204] (2/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_106-30782-0_sp0.9 from training. Duration: 0.888875 2023-10-02 02:16:16,530 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_5960_210_1794_1_1527403865_3990999_515-2613-0 from training. Duration: 0.88 2023-10-02 02:16:20,703 WARNING [train.py:1204] (2/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_798-106179-0_sp0.9 from training. Duration: 0.92225 2023-10-02 02:16:23,976 WARNING [train.py:1204] (2/4) Exclude cut with ID _14227_210_4381_1_1530701804286_3940259_125-275642-0_sp1.1 from training. Duration: 0.9 2023-10-02 02:16:23,978 WARNING [train.py:1204] (2/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_79-200392-0_sp1.1 from training. Duration: 0.8818125 2023-10-02 02:16:24,030 WARNING [train.py:1204] (2/4) Exclude cut with ID _48836_210_12210_1_1534075848293_3904150_429-35475-0_sp0.9 from training. Duration: 0.988875 2023-10-02 02:16:24,052 WARNING [train.py:1204] (2/4) Exclude cut with ID _69884_210_9771_1_1535846392818_4166169_224-172517-0_sp1.1 from training. Duration: 0.97275 2023-10-02 02:16:28,186 WARNING [train.py:1204] (2/4) Exclude cut with ID _15113_210_8254_1_1530842438579_3716279_76-232507-0_sp1.1 from training. Duration: 0.97275 2023-10-02 02:16:28,188 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_135-5501-0 from training. Duration: 0.98 2023-10-02 02:16:29,619 WARNING [train.py:1204] (2/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_563-347832-0_sp0.9 from training. Duration: 0.988875 2023-10-02 02:16:29,637 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6786_210_1388_1_1527839699_4194495_273-149841-0 from training. Duration: 0.92 2023-10-02 02:16:31,005 WARNING [train.py:1204] (2/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_172-215932-0 from training. Duration: 0.66 2023-10-02 02:16:31,073 WARNING [train.py:1204] (2/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_939-132910-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 02:16:33,856 WARNING [train.py:1204] (2/4) Exclude cut with ID _49371_210_12071_1_1533952761021_3812190_16-246001-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 02:16:36,661 INFO [train.py:1046] (2/4) Epoch 21, batch 1100, loss[loss=0.1687, simple_loss=0.2571, pruned_loss=0.04012, over 24491.00 frames. ], tot_loss[loss=0.1742, simple_loss=0.249, pruned_loss=0.04972, over 4664524.51 frames. ], batch size: 69, lr: 4.93e-03, grad_scale: 16.0 2023-10-02 02:16:39,526 WARNING [train.py:1204] (2/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_680-189165-0_sp1.1 from training. Duration: 0.936375 2023-10-02 02:16:44,957 WARNING [train.py:1204] (2/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_53-300544-0_sp1.1 from training. Duration: 0.6090625 2023-10-02 02:16:47,672 WARNING [train.py:1204] (2/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_540-275685-0_sp1.1 from training. Duration: 0.8090625 2023-10-02 02:16:47,700 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_38965_210_4852_1_1533174906_3484735_453-106579-0_sp1.1 from training. Duration: 0.92725 2023-10-02 02:16:48,781 WARNING [train.py:1204] (2/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_105-328174-0 from training. Duration: 0.76 2023-10-02 02:16:49,059 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.attention_skip_rate, batch_count=715613.3333333334, ans=0.0 2023-10-02 02:16:50,595 WARNING [train.py:1204] (2/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_279-126205-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 02:16:53,410 WARNING [train.py:1204] (2/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_5-55893-0_sp0.9 from training. Duration: 0.611125 2023-10-02 02:16:54,241 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.2.encoder.layers.1.feed_forward2.out_whiten, num_groups=1, num_channels=384, metric=4.52 vs. limit=15.0 2023-10-02 02:16:54,804 WARNING [train.py:1204] (2/4) Exclude cut with ID _13678_210_7497_1_1530530069226_3463170_141-10835-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 02:16:57,960 WARNING [train.py:1204] (2/4) Exclude cut with ID _22083_210_8254_1_1531702862439_3678970_201-85738-0_sp1.1 from training. Duration: 0.8181875 2023-10-02 02:16:57,985 WARNING [train.py:1204] (2/4) Exclude cut with ID _4697_210_2069_1_1526815801953_3823098_51-1049-0 from training. Duration: 0.75 2023-10-02 02:16:59,257 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.521e+02 1.784e+02 1.995e+02 2.356e+02 3.579e+02, threshold=3.990e+02, percent-clipped=0.0 2023-10-02 02:16:59,403 WARNING [train.py:1204] (2/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_206-10097-0_sp0.9 from training. Duration: 0.5444375 2023-10-02 02:17:00,669 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_4528_210_3273_1_1527076654_3276777_360-63661-0_sp1.1 from training. Duration: 0.92725 2023-10-02 02:17:00,677 WARNING [train.py:1204] (2/4) Exclude cut with ID _64086_210_7994_1_1535270417019_7435949_449-31567-0_sp1.1 from training. Duration: 0.8818125 2023-10-02 02:17:03,459 WARNING [train.py:1204] (2/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_245-174553-0_sp0.9 from training. Duration: 0.97775 2023-10-02 02:17:04,832 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6545_210_2040_1_1527774670_4245258_379-14182-0_sp1.1 from training. Duration: 0.72725 2023-10-02 02:17:07,825 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.ff3_skip_rate, batch_count=715746.6666666666, ans=0.0 2023-10-02 02:17:09,116 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_3133_210_2087_1_1526106446_2324690_256-206070-0_sp1.1 from training. Duration: 0.963625 2023-10-02 02:17:13,314 WARNING [train.py:1204] (2/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_278-104676-0 from training. Duration: 0.98 2023-10-02 02:17:13,388 WARNING [train.py:1204] (2/4) Exclude cut with ID IC0671W0065-331908 from training. Duration: 0.896 2023-10-02 02:17:14,633 WARNING [train.py:1204] (2/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_244-129917-0_sp1.1 from training. Duration: 0.97275 2023-10-02 02:17:17,450 WARNING [train.py:1204] (2/4) Exclude cut with ID _7852_210_5219_1_1528286471316_8139400_1129-170857-0_sp1.1 from training. Duration: 0.97275 2023-10-02 02:17:17,531 WARNING [train.py:1204] (2/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_443-30034-0_sp0.9 from training. Duration: 0.711125 2023-10-02 02:17:19,286 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_31583_210_3852_1_1532595851_4024413_250-332999-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 02:17:22,515 WARNING [train.py:1204] (2/4) Exclude cut with ID _8772_210_1850_1_1528635595972_3664011_247-336448-0 from training. Duration: 0.74 2023-10-02 02:17:22,584 WARNING [train.py:1204] (2/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_567-135114-0_sp0.9 from training. Duration: 0.9666875 2023-10-02 02:17:22,588 WARNING [train.py:1204] (2/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_557-349534-0_sp1.1 from training. Duration: 0.82725 2023-10-02 02:17:22,600 WARNING [train.py:1204] (2/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_1341-26728-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 02:17:22,649 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_40222_210_6228_1_1533385298_4481939_375-328051-0_sp1.1 from training. Duration: 0.97275 2023-10-02 02:17:24,600 WARNING [train.py:1204] (2/4) Exclude cut with ID _4697_210_2069_1_1526815801953_3823098_276-96593-0 from training. Duration: 0.77 2023-10-02 02:17:30,762 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_53071_210_16554_1_1534490405_2954066_265-170472-0_sp0.9 from training. Duration: 0.92225 2023-10-02 02:17:30,788 WARNING [train.py:1204] (2/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_365-1492-0 from training. Duration: 0.91 2023-10-02 02:17:33,407 WARNING [train.py:1204] (2/4) Exclude cut with ID _66873_210_5517_1_1535454136506_4293370_654-52713-0_sp0.9 from training. Duration: 0.8555625 2023-10-02 02:17:33,728 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.ff3_skip_rate, batch_count=715813.3333333334, ans=0.0 2023-10-02 02:17:33,790 INFO [scaling.py:1118] (2/4) WithLoss: name=encoder.encoders.4.encoder.layers.1.self_attn_weights, loss-sum=0.000e+00 2023-10-02 02:17:36,343 WARNING [train.py:1204] (2/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_113-92608-0_sp1.1 from training. Duration: 0.4909375 2023-10-02 02:17:39,144 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_46013_210_7234_1_1533776362_3744032_50-141535-0 from training. Duration: 0.99 2023-10-02 02:17:39,152 WARNING [train.py:1204] (2/4) Exclude cut with ID _13942_210_9988_1_1530604906036_3733360_499-55676-0_sp1.1 from training. Duration: 0.563625 2023-10-02 02:17:40,548 WARNING [train.py:1204] (2/4) Exclude cut with ID _30800_210_22382_1_1532499347904_4515959_375-65143-0_sp1.1 from training. Duration: 0.97275 2023-10-02 02:17:42,093 WARNING [train.py:1204] (2/4) Exclude cut with ID _16756_210_4915_1_1531047803763_7104259_199-325608-0_sp1.1 from training. Duration: 0.92725 2023-10-02 02:17:42,129 WARNING [train.py:1204] (2/4) Exclude cut with ID _31649_210_6228_1_1532584608063_4091078_443-124391-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 02:17:44,786 WARNING [train.py:1204] (2/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_146-218763-0 from training. Duration: 0.86 2023-10-02 02:17:46,141 WARNING [train.py:1204] (2/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_567-135114-0_sp1.1 from training. Duration: 0.7909375 2023-10-02 02:17:46,191 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_26326_210_7925_1_1533002493_3592406_462-288299-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 02:17:47,535 WARNING [train.py:1204] (2/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_322-217220-0 from training. Duration: 0.86 2023-10-02 02:17:47,561 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_238-256668-0_sp1.1 from training. Duration: 0.62725 2023-10-02 02:17:48,838 WARNING [train.py:1204] (2/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_371-18169-0 from training. Duration: 0.86 2023-10-02 02:17:48,947 WARNING [train.py:1204] (2/4) Exclude cut with ID _58480_210_12392_1_1534811121111_3646160_23-74512-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 02:17:48,952 WARNING [train.py:1204] (2/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_784-213099-0_sp1.1 from training. Duration: 0.8454375 2023-10-02 02:17:50,265 INFO [train.py:1046] (2/4) Epoch 21, batch 1150, loss[loss=0.1735, simple_loss=0.2455, pruned_loss=0.0508, over 23362.00 frames. ], tot_loss[loss=0.1745, simple_loss=0.2496, pruned_loss=0.04964, over 4685639.95 frames. ], batch size: 134, lr: 4.93e-03, grad_scale: 16.0 2023-10-02 02:17:50,369 WARNING [train.py:1204] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_625-102955-0_sp1.1 from training. Duration: 0.7 2023-10-02 02:17:56,037 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.nonlin_attention.balancer.prob, batch_count=715946.6666666666, ans=0.125 2023-10-02 02:17:57,222 WARNING [train.py:1204] (2/4) Exclude cut with ID _49748_210_24116_1_1533986971737_3158910_176-174327-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 02:17:59,824 WARNING [train.py:1204] (2/4) Exclude cut with ID _50310_210_15488_1_1534068043345_3641080_432-96140-0_sp1.1 from training. Duration: 0.936375 2023-10-02 02:18:01,291 WARNING [train.py:1204] (2/4) Exclude cut with ID _40402_210_18219_1_1533522324115_2953150_76-14910-0_sp1.1 from training. Duration: 0.92725 2023-10-02 02:18:01,337 WARNING [train.py:1204] (2/4) Exclude cut with ID _21783_210_11357_1_1531742458730_3762679_40-19458-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 02:18:01,380 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_14056_210_4163_1_1530604483_7572827_46-93547-0 from training. Duration: 0.95 2023-10-02 02:18:01,571 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.attention_skip_rate, batch_count=715946.6666666666, ans=0.0 2023-10-02 02:18:03,760 WARNING [train.py:1204] (2/4) Exclude cut with ID _21631_210_2253_1_1531907880778_3160339_321-35325-0_sp1.1 from training. Duration: 0.963625 2023-10-02 02:18:05,207 WARNING [train.py:1204] (2/4) Exclude cut with ID _4779_210_2084_1_1527318022944_3914893_500-285013-0 from training. Duration: 0.69 2023-10-02 02:18:06,632 WARNING [train.py:1204] (2/4) Exclude cut with ID _43552_210_8913_1_1533726031059_3523429_301-93324-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 02:18:06,643 WARNING [train.py:1204] (2/4) Exclude cut with ID _6473_210_1741_1_1527901307396_3815380_108-17335-0_sp1.1 from training. Duration: 0.5909375 2023-10-02 02:18:10,916 WARNING [train.py:1204] (2/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_206-10097-0 from training. Duration: 0.49 2023-10-02 02:18:12,328 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_36756_210_7234_1_1533026991_966276_103-77513-0_sp1.1 from training. Duration: 0.97275 2023-10-02 02:18:15,634 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.4.encoder.layers.0.nonlin_attention.whiten1, num_groups=1, num_channels=288, metric=9.42 vs. limit=10.0 2023-10-02 02:18:16,373 WARNING [train.py:1204] (2/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_662-18193-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 02:18:16,435 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_26433_210_12108_1_1532157158_4056480_439-52438-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 02:18:16,477 WARNING [train.py:1204] (2/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_464-33931-0 from training. Duration: 0.54 2023-10-02 02:18:16,483 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_3474_210_2028_1_1526188513_4825246_145-309131-0_sp0.9 from training. Duration: 0.82225 2023-10-02 02:18:16,493 WARNING [train.py:1204] (2/4) Exclude cut with ID _9679_210_7497_1_1529129212906_4363929_446-308321-0_sp1.1 from training. Duration: 0.963625 2023-10-02 02:18:23,106 WARNING [train.py:1204] (2/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_662-275621-0 from training. Duration: 0.42 2023-10-02 02:18:23,240 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.conv_module1.balancer2.prob, batch_count=716080.0, ans=0.125 2023-10-02 02:18:24,497 WARNING [train.py:1204] (2/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_344-337148-0_sp1.1 from training. Duration: 0.97275 2023-10-02 02:18:25,893 WARNING [train.py:1204] (2/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_759-179071-0_sp1.1 from training. Duration: 0.92725 2023-10-02 02:18:32,961 WARNING [train.py:1204] (2/4) Exclude cut with ID _28783_210_7943_1_1532671164808_7155479_293-12188-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 02:18:37,953 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_17613_210_2151_1_1531137569_2366160_235-320304-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 02:18:37,976 WARNING [train.py:1204] (2/4) Exclude cut with ID _14349_210_8341_1_1530615710818_3442470_170-336541-0 from training. Duration: 0.86 2023-10-02 02:18:39,426 WARNING [train.py:1204] (2/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_375-218677-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 02:18:39,475 WARNING [train.py:1204] (2/4) Exclude cut with ID _28317_210_12742_1_1532480302381_8510325_829-202956-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 02:18:47,098 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.5.encoder.layers.0.feed_forward2.out_whiten, num_groups=1, num_channels=256, metric=7.30 vs. limit=15.0 2023-10-02 02:18:47,932 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9029W0436-509823 from training. Duration: 0.768 2023-10-02 02:18:49,488 WARNING [train.py:1204] (2/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_231-112214-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 02:18:57,505 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9059W0470-524883 from training. Duration: 0.3415625 2023-10-02 02:18:58,187 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.4.encoder.layers.0.conv_module2.whiten, num_groups=1, num_channels=384, metric=7.94 vs. limit=15.0 2023-10-02 02:19:00,430 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_43266_210_1794_1_1533521789_4497218_206-79512-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 02:19:01,753 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_294-163793-0_sp0.9 from training. Duration: 0.911125 2023-10-02 02:19:01,772 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6290_210_3273_1_1527595224_1282787_115-87095-0_sp1.1 from training. Duration: 0.5 2023-10-02 02:19:01,818 WARNING [train.py:1204] (2/4) Exclude cut with ID _14100_210_8341_1_1530529320613_3678069_279-55760-0_sp1.1 from training. Duration: 0.8454375 2023-10-02 02:19:05,837 INFO [train.py:1046] (2/4) Epoch 21, batch 1200, loss[loss=0.1578, simple_loss=0.228, pruned_loss=0.0438, over 24462.00 frames. ], tot_loss[loss=0.1743, simple_loss=0.2496, pruned_loss=0.04955, over 4692016.43 frames. ], batch size: 58, lr: 4.93e-03, grad_scale: 32.0 2023-10-02 02:19:05,940 WARNING [train.py:1204] (2/4) Exclude cut with ID _14621_210_1925_1_1530686901185_3675420_419-234200-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 02:19:08,047 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.1.conv_module2.balancer1.prob, batch_count=716280.0, ans=0.125 2023-10-02 02:19:10,698 WARNING [train.py:1204] (2/4) Exclude cut with ID _60229_210_27767_1_1534838406158_3655350_353-213656-0_sp1.1 from training. Duration: 0.863625 2023-10-02 02:19:10,706 WARNING [train.py:1204] (2/4) Exclude cut with ID _48678_210_5663_1_1533988830947_3769199_306-255886-0_sp1.1 from training. Duration: 0.7 2023-10-02 02:19:13,364 WARNING [train.py:1204] (2/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_466-3492-0_sp1.1 from training. Duration: 0.97275 2023-10-02 02:19:13,365 WARNING [train.py:1204] (2/4) Exclude cut with ID _8755_210_1876_1_1528628559357_3258209_199-242167-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 02:19:13,436 WARNING [train.py:1204] (2/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_237-89684-0_sp1.1 from training. Duration: 0.87275 2023-10-02 02:19:14,851 WARNING [train.py:1204] (2/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_70-84121-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 02:19:16,299 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7544_210_6753_1_1528587722_3576686_172-171750-0_sp1.1 from training. Duration: 0.5909375 2023-10-02 02:19:18,993 WARNING [train.py:1204] (2/4) Exclude cut with ID _24271_210_12658_1_1531987270022_7190519_40-321822-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 02:19:19,024 WARNING [train.py:1204] (2/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_227-83612-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 02:19:21,714 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9156W0015-572508 from training. Duration: 0.896 2023-10-02 02:19:23,264 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_292-265928-0 from training. Duration: 0.51 2023-10-02 02:19:27,888 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.586e+02 1.837e+02 2.069e+02 2.341e+02 3.988e+02, threshold=4.139e+02, percent-clipped=0.0 2023-10-02 02:19:28,010 WARNING [train.py:1204] (2/4) Exclude cut with ID _14747_210_8341_1_1530685797463_3654089_147-131229-0_sp1.1 from training. Duration: 0.6909375 2023-10-02 02:19:30,780 WARNING [train.py:1204] (2/4) Exclude cut with ID _45297_210_2721_1_1533715351381_3264949_351-147136-0_sp0.9 from training. Duration: 0.9666875 2023-10-02 02:19:32,220 WARNING [train.py:1204] (2/4) Exclude cut with ID _5250_210_5219_1_1527249561806_8692110_1102-324568-0_sp1.1 from training. Duration: 0.97275 2023-10-02 02:19:33,661 WARNING [train.py:1204] (2/4) Exclude cut with ID _18516_210_9206_1_1531704653206_3692406_465-75446-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 02:19:33,663 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9203W0126-595623 from training. Duration: 0.896 2023-10-02 02:19:35,110 WARNING [train.py:1204] (2/4) Exclude cut with ID _55383_210_10635_1_1534468426172_2231669_139-328410-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 02:19:39,505 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.nonlin_attention.balancer.min_positive, batch_count=716413.3333333334, ans=0.05 2023-10-02 02:19:41,369 WARNING [train.py:1204] (2/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_166-246557-0_sp0.9 from training. Duration: 0.711125 2023-10-02 02:19:41,373 WARNING [train.py:1204] (2/4) Exclude cut with ID _30039_210_13270_1_1532484612900_2725150_239-304872-0_sp1.1 from training. Duration: 0.963625 2023-10-02 02:19:41,617 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.conv_module2.balancer2.prob, batch_count=716413.3333333334, ans=0.125 2023-10-02 02:19:42,751 WARNING [train.py:1204] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_6-226750-0 from training. Duration: 0.94 2023-10-02 02:19:44,073 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_135-5501-0_sp1.1 from training. Duration: 0.8909375 2023-10-02 02:19:48,061 WARNING [train.py:1204] (2/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_359-118926-0 from training. Duration: 0.72 2023-10-02 02:19:52,348 WARNING [train.py:1204] (2/4) Exclude cut with ID _8064_210_5399_1_1528372299291_4282973_4-91037-0 from training. Duration: 0.88 2023-10-02 02:19:52,373 WARNING [train.py:1204] (2/4) Exclude cut with ID _6653_210_2629_1_1527919172441_6732679_415-288492-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 02:19:54,256 WARNING [train.py:1204] (2/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_544-287179-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 02:19:55,662 WARNING [train.py:1204] (2/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_122-32606-0_sp1.1 from training. Duration: 0.92725 2023-10-02 02:19:57,029 WARNING [train.py:1204] (2/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_121-284152-0_sp1.1 from training. Duration: 0.536375 2023-10-02 02:19:58,498 WARNING [train.py:1204] (2/4) Exclude cut with ID _44718_210_5816_1_1534050051869_6469030_350-299477-0_sp1.1 from training. Duration: 0.97275 2023-10-02 02:19:58,521 WARNING [train.py:1204] (2/4) Exclude cut with ID _6653_210_2629_1_1527919172441_6732679_1103-56445-0_sp0.9 from training. Duration: 0.811125 2023-10-02 02:19:58,719 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.bypass_mid.scale_min, batch_count=716480.0, ans=0.2 2023-10-02 02:20:00,524 WARNING [train.py:1204] (2/4) Exclude cut with ID _12369_210_4381_1_1530356078581_4388520_64-298917-0_sp0.9 from training. Duration: 0.97775 2023-10-02 02:20:00,583 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_2245_210_2715_1_1525511548_3540254_310-52483-0 from training. Duration: 0.88 2023-10-02 02:20:02,764 WARNING [train.py:1204] (2/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_401-158239-0_sp1.1 from training. Duration: 0.6454375 2023-10-02 02:20:02,812 WARNING [train.py:1204] (2/4) Exclude cut with ID _59502_210_12210_1_1535107024698_1835139_146-162495-0_sp1.1 from training. Duration: 0.836375 2023-10-02 02:20:02,822 WARNING [train.py:1204] (2/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_41-31157-0_sp0.9 from training. Duration: 0.6333125 2023-10-02 02:20:05,679 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_68717_210_3528_1_1535628683_1556759_95-36722-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 02:20:05,686 WARNING [train.py:1204] (2/4) Exclude cut with ID _30196_210_1664_1_1533708496522_7074140_654-162611-0_sp1.1 from training. Duration: 0.92725 2023-10-02 02:20:08,611 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_9302_210_2550_1_1528889962_4655416_638-301620-0_sp0.9 from training. Duration: 0.52225 2023-10-02 02:20:10,085 WARNING [train.py:1204] (2/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_478-162247-0_sp1.1 from training. Duration: 0.7454375 2023-10-02 02:20:12,715 WARNING [train.py:1204] (2/4) Exclude cut with ID _45297_210_2721_1_1533715351381_3264949_353-9541-0 from training. Duration: 0.97 2023-10-02 02:20:16,212 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9653W0190-675468 from training. Duration: 0.9935 2023-10-02 02:20:18,928 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_49730_210_13049_1_1534075294_1830676_214-84436-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 02:20:20,214 INFO [train.py:1046] (2/4) Epoch 21, batch 1250, loss[loss=0.1686, simple_loss=0.2497, pruned_loss=0.04373, over 24645.00 frames. ], tot_loss[loss=0.1754, simple_loss=0.251, pruned_loss=0.04986, over 4704511.37 frames. ], batch size: 65, lr: 4.93e-03, grad_scale: 16.0 2023-10-02 02:20:20,366 WARNING [train.py:1204] (2/4) Exclude cut with ID _50093_210_4220_1_1534417141207_3432299_259-322252-0_sp1.1 from training. Duration: 0.836375 2023-10-02 02:20:22,145 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.4.encoder.layers.0.whiten, num_groups=1, num_channels=384, metric=5.69 vs. limit=12.0 2023-10-02 02:20:22,969 WARNING [train.py:1204] (2/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_599-50220-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 02:20:24,367 WARNING [train.py:1204] (2/4) Exclude cut with ID _21878_210_3525_1_1531738772002_7264330_174-109016-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 02:20:28,784 WARNING [train.py:1204] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_665-196155-0 from training. Duration: 0.67 2023-10-02 02:20:30,878 WARNING [train.py:1204] (2/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_656-188361-0_sp1.1 from training. Duration: 0.7909375 2023-10-02 02:20:32,695 WARNING [train.py:1204] (2/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_60-18123-0_sp1.1 from training. Duration: 0.8 2023-10-02 02:20:32,757 WARNING [train.py:1204] (2/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_535-14410-0 from training. Duration: 0.47 2023-10-02 02:20:33,053 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.feed_forward3.hidden_balancer.prob, batch_count=716613.3333333334, ans=0.125 2023-10-02 02:20:35,770 WARNING [train.py:1204] (2/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_94-126128-0_sp1.1 from training. Duration: 0.7818125 2023-10-02 02:20:37,144 WARNING [train.py:1204] (2/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_356-31216-0_sp1.1 from training. Duration: 0.6818125 2023-10-02 02:20:40,140 WARNING [train.py:1204] (2/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_443-30034-0_sp1.1 from training. Duration: 0.5818125 2023-10-02 02:20:41,576 WARNING [train.py:1204] (2/4) Exclude cut with ID _26907_210_7234_1_1532221740694_3205263_337-2494-0_sp1.1 from training. Duration: 0.8 2023-10-02 02:20:41,660 WARNING [train.py:1204] (2/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_49-97158-0_sp0.9 from training. Duration: 0.8444375 2023-10-02 02:20:41,670 WARNING [train.py:1204] (2/4) Exclude cut with ID _7877_210_6014_1_1528286358099_3750136_553-170128-0_sp1.1 from training. Duration: 0.963625 2023-10-02 02:20:42,171 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.4.encoder.layers.2.nonlin_attention.whiten1, num_groups=1, num_channels=288, metric=4.33 vs. limit=10.0 2023-10-02 02:20:44,381 WARNING [train.py:1204] (2/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_202-293523-0_sp0.9 from training. Duration: 0.8 2023-10-02 02:20:48,058 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.conv_module1.balancer1.prob, batch_count=716680.0, ans=0.125 2023-10-02 02:20:49,164 WARNING [train.py:1204] (2/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_188-171673-0_sp0.9 from training. Duration: 0.6444375 2023-10-02 02:20:49,198 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_120-41668-0_sp1.1 from training. Duration: 0.636375 2023-10-02 02:20:49,205 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_40223_210_6228_1_1533298404_4812267_613-78769-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 02:20:50,583 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_53861_210_5399_1_1534422156_4272077_150-336899-0_sp1.1 from training. Duration: 0.963625 2023-10-02 02:20:51,308 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.2.encoder.layers.1.feed_forward1.out_whiten, num_groups=1, num_channels=384, metric=9.78 vs. limit=15.0 2023-10-02 02:20:51,861 WARNING [train.py:1204] (2/4) Exclude cut with ID _14459_210_2228_1_1531443641946_7516700_1146-56843-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 02:20:54,806 WARNING [train.py:1204] (2/4) Exclude cut with ID _39867_210_9826_1_1533553222030_9344259_1194-299928-0_sp1.1 from training. Duration: 0.92725 2023-10-02 02:20:56,592 WARNING [train.py:1204] (2/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_214-291369-0_sp0.9 from training. Duration: 0.7 2023-10-02 02:21:00,770 WARNING [train.py:1204] (2/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_358-215558-0 from training. Duration: 0.93 2023-10-02 02:21:02,062 WARNING [train.py:1204] (2/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_577-287150-0_sp0.9 from training. Duration: 0.911125 2023-10-02 02:21:04,245 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.attention_skip_rate, batch_count=716813.3333333334, ans=0.0 2023-10-02 02:21:05,335 WARNING [train.py:1204] (2/4) Exclude cut with ID _60860_210_8254_1_1534896015875_3557169_229-187367-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 02:21:06,660 WARNING [train.py:1204] (2/4) Exclude cut with ID _6275_210_2084_1_1528110062213_7669049_857-298942-0 from training. Duration: 0.85 2023-10-02 02:21:06,708 WARNING [train.py:1204] (2/4) Exclude cut with ID _40137_210_12210_1_1533290484046_3795180_249-187059-0_sp1.1 from training. Duration: 0.8 2023-10-02 02:21:06,714 WARNING [train.py:1204] (2/4) Exclude cut with ID ID0171W0508-765603 from training. Duration: 0.541 2023-10-02 02:21:06,728 WARNING [train.py:1204] (2/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_521-219439-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 02:21:06,733 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_40223_210_6228_1_1533298404_4812267_741-302616-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 02:21:09,076 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.conv_skip_rate, batch_count=716813.3333333334, ans=0.0 2023-10-02 02:21:11,673 WARNING [train.py:1204] (2/4) Exclude cut with ID _65293_210_9070_1_1535545549765_4004210_438-154735-0_sp1.1 from training. Duration: 0.92725 2023-10-02 02:21:13,217 WARNING [train.py:1204] (2/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_564-132950-0_sp1.1 from training. Duration: 0.92725 2023-10-02 02:21:13,246 WARNING [train.py:1204] (2/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_291-270881-0_sp1.1 from training. Duration: 0.8818125 2023-10-02 02:21:14,642 WARNING [train.py:1204] (2/4) Exclude cut with ID _60229_210_27767_1_1534838406158_3655350_353-213656-0 from training. Duration: 0.95 2023-10-02 02:21:14,660 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_3133_210_2087_1_1526106446_2324690_243-143119-0 from training. Duration: 0.77 2023-10-02 02:21:16,114 WARNING [train.py:1204] (2/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_690-126306-0 from training. Duration: 0.78 2023-10-02 02:21:17,822 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.bypass.skip_rate, batch_count=716813.3333333334, ans=0.04949747468305833 2023-10-02 02:21:18,953 WARNING [train.py:1204] (2/4) Exclude cut with ID _30304_210_6319_1_1532476827261_7582060_347-95003-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 02:21:19,630 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.3.encoder.layers.2.feed_forward1.out_whiten, num_groups=1, num_channels=512, metric=10.16 vs. limit=15.0 2023-10-02 02:21:20,681 WARNING [train.py:1204] (2/4) Exclude cut with ID _2322_210_2667_1_1525604548971_3656299_440-80494-0 from training. Duration: 0.88 2023-10-02 02:21:20,695 WARNING [train.py:1204] (2/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_370-229434-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 02:21:23,545 WARNING [train.py:1204] (2/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_124-345840-0_sp1.1 from training. Duration: 0.47275 2023-10-02 02:21:23,553 WARNING [train.py:1204] (2/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_165-123581-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 02:21:24,990 WARNING [train.py:1204] (2/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_453-282588-0 from training. Duration: 0.98 2023-10-02 02:21:25,012 WARNING [train.py:1204] (2/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_299-34413-0_sp0.9 from training. Duration: 0.72225 2023-10-02 02:21:26,312 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_40036_210_13405_1_1533648647_1315388_48-146016-0_sp1.1 from training. Duration: 0.8454375 2023-10-02 02:21:26,342 WARNING [train.py:1204] (2/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_99-167392-0_sp0.9 from training. Duration: 0.67775 2023-10-02 02:21:26,396 WARNING [train.py:1204] (2/4) Exclude cut with ID _48677_210_5663_1_1533902405919_3892989_37-207114-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 02:21:26,574 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.feed_forward1.hidden_balancer.prob, batch_count=716880.0, ans=0.125 2023-10-02 02:21:29,646 WARNING [train.py:1204] (2/4) Exclude cut with ID _29857_210_20034_1_1532395886753_3798160_335-163562-0 from training. Duration: 0.85 2023-10-02 02:21:31,105 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_36492_210_7927_1_1533261987_4790834_703-343351-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 02:21:33,083 WARNING [train.py:1204] (2/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_80-147301-0_sp0.9 from training. Duration: 0.9333125 2023-10-02 02:21:33,170 WARNING [train.py:1204] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_218-295097-0_sp0.9 from training. Duration: 0.8555625 2023-10-02 02:21:36,627 INFO [train.py:1046] (2/4) Epoch 21, batch 1300, loss[loss=0.1995, simple_loss=0.2788, pruned_loss=0.06006, over 23730.00 frames. ], tot_loss[loss=0.1756, simple_loss=0.2512, pruned_loss=0.04993, over 4704562.97 frames. ], batch size: 85, lr: 4.93e-03, grad_scale: 16.0 2023-10-02 02:21:36,698 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_2295_210_2087_1_1525500993_2407863_116-238894-0_sp1.1 from training. Duration: 0.636375 2023-10-02 02:21:39,293 WARNING [train.py:1204] (2/4) Exclude cut with ID _3231_210_2106_1_1526205864934_6784220_697-171068-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 02:21:39,931 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.2.encoder.layers.2.self_attn1.whiten, num_groups=1, num_channels=384, metric=18.23 vs. limit=22.5 2023-10-02 02:21:40,633 WARNING [train.py:1204] (2/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_102-77064-0 from training. Duration: 0.93 2023-10-02 02:21:44,813 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_13091_210_7925_1_1530609447_8987354_1186-261957-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 02:21:46,214 WARNING [train.py:1204] (2/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_91-277684-0_sp1.1 from training. Duration: 0.67275 2023-10-02 02:21:47,623 WARNING [train.py:1204] (2/4) Exclude cut with ID _31088_210_16446_1_1532520012284_3440919_159-313751-0_sp1.1 from training. Duration: 0.936375 2023-10-02 02:21:48,906 WARNING [train.py:1204] (2/4) Exclude cut with ID _42326_210_7770_1_1533814282531_3707269_227-203721-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 02:21:48,994 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_3531_210_3528_1_1526294203_5216456_309-227302-0_sp1.1 from training. Duration: 0.62725 2023-10-02 02:21:50,834 WARNING [train.py:1204] (2/4) Exclude cut with ID _14087_210_2253_1_1530529224512_3014029_226-242140-0 from training. Duration: 0.81 2023-10-02 02:21:55,059 WARNING [train.py:1204] (2/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_122-109023-0_sp1.1 from training. Duration: 0.6909375 2023-10-02 02:21:55,342 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.balancer_na.min_abs, batch_count=717013.3333333334, ans=0.02 2023-10-02 02:21:56,468 WARNING [train.py:1204] (2/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_660-211088-0_sp0.9 from training. Duration: 0.688875 2023-10-02 02:21:56,561 WARNING [train.py:1204] (2/4) Exclude cut with ID _39290_210_4220_1_1533862778760_3075118_92-311600-0 from training. Duration: 0.88 2023-10-02 02:21:59,129 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.481e+02 1.796e+02 1.993e+02 2.237e+02 3.308e+02, threshold=3.985e+02, percent-clipped=0.0 2023-10-02 02:22:01,083 WARNING [train.py:1204] (2/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_390-339137-0_sp1.1 from training. Duration: 0.5090625 2023-10-02 02:22:05,332 WARNING [train.py:1204] (2/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_449-341946-0_sp1.1 from training. Duration: 0.963625 2023-10-02 02:22:05,408 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_68095_210_20185_1_1535718226_8888855_884-327007-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 02:22:07,397 WARNING [train.py:1204] (2/4) Exclude cut with ID _42326_210_7770_1_1533814282531_3707269_308-222998-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 02:22:08,785 WARNING [train.py:1204] (2/4) Exclude cut with ID _40399_210_4353_1_1533305658194_3279406_344-238708-0_sp1.1 from training. Duration: 0.963625 2023-10-02 02:22:10,213 WARNING [train.py:1204] (2/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_87-131427-0_sp1.1 from training. Duration: 0.7090625 2023-10-02 02:22:11,532 WARNING [train.py:1204] (2/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_110-130961-0_sp0.9 from training. Duration: 0.52225 2023-10-02 02:22:11,573 WARNING [train.py:1204] (2/4) Exclude cut with ID _55153_210_2253_1_1534384786677_3162096_148-79022-0 from training. Duration: 0.96 2023-10-02 02:22:11,773 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.0.feed_forward1.out_proj.dropout_p, batch_count=717080.0, ans=0.1 2023-10-02 02:22:11,868 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.conv_skip_rate, batch_count=717080.0, ans=0.0 2023-10-02 02:22:15,787 WARNING [train.py:1204] (2/4) Exclude cut with ID _9679_210_7497_1_1529129212906_4363929_512-313889-0_sp1.1 from training. Duration: 0.736375 2023-10-02 02:22:15,799 WARNING [train.py:1204] (2/4) Exclude cut with ID _7970_210_1883_1_1528455616296_3472230_25-178407-0_sp1.1 from training. Duration: 0.6454375 2023-10-02 02:22:18,454 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_3133_210_2087_1_1526106446_2324690_246-125000-0 from training. Duration: 0.62 2023-10-02 02:22:18,509 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_15126_210_6753_1_1530778677_2591185_113-157651-0_sp0.9 from training. Duration: 0.7555625 2023-10-02 02:22:20,268 WARNING [train.py:1204] (2/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_105-63309-0_sp1.1 from training. Duration: 0.8909375 2023-10-02 02:22:21,814 WARNING [train.py:1204] (2/4) Exclude cut with ID _29877_210_6228_1_1532397791106_5276149_578-177271-0_sp1.1 from training. Duration: 0.936375 2023-10-02 02:22:23,187 WARNING [train.py:1204] (2/4) Exclude cut with ID _44831_210_13427_1_1533816011549_3689035_32-188147-0 from training. Duration: 0.96 2023-10-02 02:22:23,233 WARNING [train.py:1204] (2/4) Exclude cut with ID _12369_210_4381_1_1530356078581_4388520_300-254337-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 02:22:24,590 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_128-241008-0 from training. Duration: 0.94 2023-10-02 02:22:25,919 WARNING [train.py:1204] (2/4) Exclude cut with ID _21878_210_3525_1_1531738772002_7264330_53-279178-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 02:22:28,846 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_41452_210_7291_1_1533437687_3941281_273-334088-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 02:22:28,850 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_128-241008-0_sp1.1 from training. Duration: 0.8545625 2023-10-02 02:22:34,220 WARNING [train.py:1204] (2/4) Exclude cut with ID _8394_210_6702_1_1528590504316_3540189_379-49457-0 from training. Duration: 0.87 2023-10-02 02:22:34,289 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_105-114797-0 from training. Duration: 0.84 2023-10-02 02:22:37,028 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_14928_210_5632_1_1530773461_4714000_454-112198-0 from training. Duration: 0.66 2023-10-02 02:22:40,464 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_456-22141-0_sp1.1 from training. Duration: 0.8 2023-10-02 02:22:43,287 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_115-233929-0 from training. Duration: 0.72 2023-10-02 02:22:44,664 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_616-16795-0_sp1.1 from training. Duration: 0.963625 2023-10-02 02:22:50,630 INFO [train.py:1046] (2/4) Epoch 21, batch 1350, loss[loss=0.1775, simple_loss=0.2616, pruned_loss=0.04672, over 23692.00 frames. ], tot_loss[loss=0.1752, simple_loss=0.2508, pruned_loss=0.04983, over 4711780.52 frames. ], batch size: 85, lr: 4.93e-03, grad_scale: 16.0 2023-10-02 02:22:50,684 WARNING [train.py:1204] (2/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_448-212135-0 from training. Duration: 0.91 2023-10-02 02:22:53,448 WARNING [train.py:1204] (2/4) Exclude cut with ID _46683_210_9642_1_1533730913492_4591116_201-205677-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 02:22:55,032 WARNING [train.py:1204] (2/4) Exclude cut with ID _64086_210_7994_1_1535270417019_7435949_43-80483-0_sp1.1 from training. Duration: 0.97275 2023-10-02 02:22:56,512 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_240-134124-0_sp1.1 from training. Duration: 0.963625 2023-10-02 02:22:56,660 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.conv_skip_rate, batch_count=717280.0, ans=0.0 2023-10-02 02:22:58,355 WARNING [train.py:1204] (2/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_907-120940-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 02:23:00,303 WARNING [train.py:1204] (2/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_848-280957-0_sp1.1 from training. Duration: 0.7909375 2023-10-02 02:23:01,518 WARNING [train.py:1204] (2/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_243-42116-0_sp0.9 from training. Duration: 0.811125 2023-10-02 02:23:04,732 WARNING [train.py:1204] (2/4) Exclude cut with ID _6694_210_4599_1_1527852637490_3965529_116-109398-0_sp0.9 from training. Duration: 0.811125 2023-10-02 02:23:04,916 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.attention_skip_rate, batch_count=717346.6666666666, ans=0.0 2023-10-02 02:23:06,129 WARNING [train.py:1204] (2/4) Exclude cut with ID _24314_210_4915_1_1532135078116_7170179_521-258868-0 from training. Duration: 0.79 2023-10-02 02:23:07,471 WARNING [train.py:1204] (2/4) Exclude cut with ID _2220_210_1819_1_1525509109898_3658680_50-99837-0_sp0.9 from training. Duration: 0.6 2023-10-02 02:23:08,207 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.2.encoder.layers.1.nonlin_attention.whiten1, num_groups=1, num_channels=288, metric=5.32 vs. limit=10.0 2023-10-02 02:23:09,360 WARNING [train.py:1204] (2/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_313-134930-0_sp1.1 from training. Duration: 0.8818125 2023-10-02 02:23:11,013 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.conv_skip_rate, batch_count=717346.6666666666, ans=0.0 2023-10-02 02:23:12,236 WARNING [train.py:1204] (2/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_125-144174-0 from training. Duration: 0.39 2023-10-02 02:23:13,574 WARNING [train.py:1204] (2/4) Exclude cut with ID _59288_210_5854_1_1535077757774_3412246_275-293897-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 02:23:13,676 WARNING [train.py:1204] (2/4) Exclude cut with ID _40519_210_13794_1_1533715228655_7337390_722-52880-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 02:23:13,701 WARNING [train.py:1204] (2/4) Exclude cut with ID _7537_210_2084_1_1528502395431_3924359_128-268029-0 from training. Duration: 0.98 2023-10-02 02:23:15,187 WARNING [train.py:1204] (2/4) Exclude cut with ID _8021_210_7994_1_1528376451358_3653069_179-45206-0 from training. Duration: 0.86 2023-10-02 02:23:17,985 WARNING [train.py:1204] (2/4) Exclude cut with ID _13252_210_1925_1_1530671372047_3336909_88-276668-0 from training. Duration: 0.99 2023-10-02 02:23:20,647 WARNING [train.py:1204] (2/4) Exclude cut with ID _22613_210_5219_1_1532253599686_4090170_290-212862-0_sp1.1 from training. Duration: 0.97275 2023-10-02 02:23:20,652 WARNING [train.py:1204] (2/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_465-214726-0 from training. Duration: 0.86 2023-10-02 02:23:32,307 WARNING [train.py:1204] (2/4) Exclude cut with ID _56480_210_4220_1_1534679942079_3075409_138-36031-0_sp1.1 from training. Duration: 0.97275 2023-10-02 02:23:41,331 WARNING [train.py:1204] (2/4) Exclude cut with ID _58605_210_12210_1_1534723195939_3904259_300-285878-0_sp1.1 from training. Duration: 0.97275 2023-10-02 02:23:42,674 WARNING [train.py:1204] (2/4) Exclude cut with ID _40901_210_5665_1_1533363789958_3856130_114-340229-0_sp1.1 from training. Duration: 0.936375 2023-10-02 02:23:42,712 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_489-230703-0 from training. Duration: 0.85 2023-10-02 02:23:45,435 WARNING [train.py:1204] (2/4) Exclude cut with ID _66873_210_5517_1_1535454136506_4293370_322-81243-0_sp1.1 from training. Duration: 0.936375 2023-10-02 02:23:45,637 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.bypass.scale_min, batch_count=717480.0, ans=0.2 2023-10-02 02:23:48,144 WARNING [train.py:1204] (2/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_271-269084-0 from training. Duration: 0.83 2023-10-02 02:23:48,155 WARNING [train.py:1204] (2/4) Exclude cut with ID _13252_210_1925_1_1530671372047_3336909_166-314607-0_sp0.9 from training. Duration: 0.6 2023-10-02 02:23:48,217 WARNING [train.py:1204] (2/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_340-343302-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 02:23:51,085 WARNING [train.py:1204] (2/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_286-112015-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 02:23:52,469 WARNING [train.py:1204] (2/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_328-264776-0 from training. Duration: 0.67 2023-10-02 02:23:53,915 WARNING [train.py:1204] (2/4) Exclude cut with ID _4866_210_2577_1_1527404412284_4364659_256-306830-0_sp0.9 from training. Duration: 0.9666875 2023-10-02 02:23:54,184 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.conv_module2.balancer2.min_positive, batch_count=717546.6666666666, ans=0.05 2023-10-02 02:23:59,186 WARNING [train.py:1204] (2/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_239-316802-0 from training. Duration: 0.69 2023-10-02 02:24:01,465 WARNING [train.py:1204] (2/4) Exclude cut with ID _29857_210_20034_1_1532395886753_3798160_279-185801-0 from training. Duration: 0.91 2023-10-02 02:24:06,662 INFO [train.py:1046] (2/4) Epoch 21, batch 1400, loss[loss=0.1838, simple_loss=0.2617, pruned_loss=0.05299, over 23704.00 frames. ], tot_loss[loss=0.174, simple_loss=0.2487, pruned_loss=0.04958, over 4706858.64 frames. ], batch size: 85, lr: 4.93e-03, grad_scale: 16.0 2023-10-02 02:24:06,778 WARNING [train.py:1204] (2/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_656-188361-0 from training. Duration: 0.87 2023-10-02 02:24:06,984 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.feed_forward3.hidden_balancer.prob, batch_count=717613.3333333334, ans=0.125 2023-10-02 02:24:08,177 WARNING [train.py:1204] (2/4) Exclude cut with ID _44718_210_5816_1_1534050051869_6469030_100-230287-0_sp1.1 from training. Duration: 0.936375 2023-10-02 02:24:11,640 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8275_210_4198_1_1528455320_3892571_276-215622-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 02:24:12,969 WARNING [train.py:1204] (2/4) Exclude cut with ID _30304_210_6319_1_1532476827261_7582060_698-309430-0_sp1.1 from training. Duration: 0.963625 2023-10-02 02:24:14,609 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.1.feed_forward1.out_proj.dropout_p, batch_count=717613.3333333334, ans=0.1 2023-10-02 02:24:15,793 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_365-217119-0 from training. Duration: 0.63 2023-10-02 02:24:17,227 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7264_210_4938_1_1527939911_8132095_800-91995-0 from training. Duration: 0.86 2023-10-02 02:24:20,129 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.conv_skip_rate, batch_count=717680.0, ans=0.0 2023-10-02 02:24:26,949 WARNING [train.py:1204] (2/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_49-97158-0_sp1.1 from training. Duration: 0.6909375 2023-10-02 02:24:30,121 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.472e+02 1.883e+02 2.074e+02 2.449e+02 3.328e+02, threshold=4.148e+02, percent-clipped=0.0 2023-10-02 02:24:30,872 WARNING [train.py:1204] (2/4) Exclude cut with ID _61132_210_9969_1_1535021792964_3755419_61-57720-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 02:24:32,322 WARNING [train.py:1204] (2/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_345-306832-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 02:24:32,344 WARNING [train.py:1204] (2/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_164-224052-0_sp0.9 from training. Duration: 0.77775 2023-10-02 02:24:37,078 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_34886_210_10633_1_1533203882_1755661_76-156096-0_sp1.1 from training. Duration: 0.7909375 2023-10-02 02:24:38,383 WARNING [train.py:1204] (2/4) Exclude cut with ID 4133-6541-0027-40495-0_sp1.1 from training. Duration: 0.9681875 2023-10-02 02:24:47,508 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.conv_module2.balancer2.min_positive, batch_count=717746.6666666666, ans=0.05 2023-10-02 02:24:47,540 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.nonlin_attention.balancer.max_positive, batch_count=717746.6666666666, ans=0.95 2023-10-02 02:24:48,707 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_514-211165-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 02:24:48,755 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_41211_210_2544_1_1533348859_3702910_97-212695-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 02:24:54,311 WARNING [train.py:1204] (2/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_406-45086-0 from training. Duration: 0.8 2023-10-02 02:24:54,367 WARNING [train.py:1204] (2/4) Exclude cut with ID _65263_210_22356_1_1535763598691_3921037_49-222917-0_sp1.1 from training. Duration: 0.836375 2023-10-02 02:24:54,413 WARNING [train.py:1204] (2/4) Exclude cut with ID _4866_210_2577_1_1527404412284_4364659_352-157930-0_sp1.1 from training. Duration: 0.736375 2023-10-02 02:24:55,680 WARNING [train.py:1204] (2/4) Exclude cut with ID _4366_210_1388_1_1526792053633_3613819_231-211402-0_sp1.1 from training. Duration: 0.8545625 2023-10-02 02:24:55,729 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_98-21576-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 02:24:57,117 WARNING [train.py:1204] (2/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_348-127789-0_sp0.9 from training. Duration: 0.9444375 2023-10-02 02:24:57,135 WARNING [train.py:1204] (2/4) Exclude cut with ID _45871_210_5665_1_1533864614245_3296060_98-329557-0_sp1.1 from training. Duration: 0.97275 2023-10-02 02:24:57,183 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6846_210_6753_1_1527850767_4295158_2-315073-0_sp1.1 from training. Duration: 0.8 2023-10-02 02:24:59,900 WARNING [train.py:1204] (2/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_145-137790-0 from training. Duration: 0.83 2023-10-02 02:24:59,914 WARNING [train.py:1204] (2/4) Exclude cut with ID _14603_210_4220_1_1530615688441_3097319_133-158308-0_sp0.9 from training. Duration: 0.9333125 2023-10-02 02:25:05,078 WARNING [train.py:1204] (2/4) Exclude cut with ID _14898_210_9206_1_1530766812377_4177190_320-11758-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 02:25:05,399 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.conv_skip_rate, batch_count=717880.0, ans=0.0 2023-10-02 02:25:07,999 WARNING [train.py:1204] (2/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_406-45086-0_sp0.9 from training. Duration: 0.888875 2023-10-02 02:25:09,539 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder_embed.dropout.p, batch_count=717880.0, ans=0.1 2023-10-02 02:25:10,073 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.3.encoder.layers.3.self_attn1.whiten, num_groups=1, num_channels=512, metric=13.84 vs. limit=22.5 2023-10-02 02:25:14,210 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_5438_210_1794_1_1527336687_2600009_104-288274-0 from training. Duration: 0.58 2023-10-02 02:25:15,484 WARNING [train.py:1204] (2/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_108-53929-0_sp0.9 from training. Duration: 0.6555625 2023-10-02 02:25:16,823 WARNING [train.py:1204] (2/4) Exclude cut with ID _30800_210_22382_1_1532499347904_4515959_444-286390-0_sp1.1 from training. Duration: 0.963625 2023-10-02 02:25:18,266 WARNING [train.py:1204] (2/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_125-144174-0_sp0.9 from training. Duration: 0.4333125 2023-10-02 02:25:19,760 WARNING [train.py:1204] (2/4) Exclude cut with ID _55209_210_1531_1_1534675828996_4597160_118-345621-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 02:25:21,071 INFO [train.py:1046] (2/4) Epoch 21, batch 1450, loss[loss=0.187, simple_loss=0.2536, pruned_loss=0.06015, over 23686.00 frames. ], tot_loss[loss=0.1739, simple_loss=0.2487, pruned_loss=0.04957, over 4714065.34 frames. ], batch size: 232, lr: 4.92e-03, grad_scale: 16.0 2023-10-02 02:25:21,155 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_45886_210_17024_1_1533709215_471063_7-79165-0_sp1.1 from training. Duration: 0.8909375 2023-10-02 02:25:22,035 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.0.layers.1.nonlin_attention.whiten1, num_groups=1, num_channels=144, metric=7.06 vs. limit=10.0 2023-10-02 02:25:23,886 WARNING [train.py:1204] (2/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_175-163894-0_sp0.9 from training. Duration: 0.82225 2023-10-02 02:25:26,606 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_50188_210_4198_1_1534503621_3646041_353-81682-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 02:25:26,620 WARNING [train.py:1204] (2/4) Exclude cut with ID _17875_210_2087_1_1531224012907_1821339_175-80565-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 02:25:26,630 WARNING [train.py:1204] (2/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_159-328157-0_sp1.1 from training. Duration: 0.47275 2023-10-02 02:25:32,850 WARNING [train.py:1204] (2/4) Exclude cut with ID _60057_210_10640_1_1534903065793_3333150_137-267579-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 02:25:33,124 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=717946.6666666666, ans=0.1 2023-10-02 02:25:34,798 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_92-46009-0_sp1.1 from training. Duration: 0.4909375 2023-10-02 02:25:36,185 WARNING [train.py:1204] (2/4) Exclude cut with ID _53203_210_2087_1_1534246218309_475590_48-68507-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 02:25:36,217 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_28211_210_6388_1_1532861506_4523224_128-328162-0 from training. Duration: 0.9 2023-10-02 02:25:37,561 WARNING [train.py:1204] (2/4) Exclude cut with ID _4697_210_2069_1_1526815801953_3823098_51-1049-0_sp1.1 from training. Duration: 0.6818125 2023-10-02 02:25:39,562 WARNING [train.py:1204] (2/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_383-316000-0 from training. Duration: 0.9 2023-10-02 02:25:40,897 WARNING [train.py:1204] (2/4) Exclude cut with ID _4779_210_2084_1_1527318022944_3914893_185-295153-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 02:25:40,969 WARNING [train.py:1204] (2/4) Exclude cut with ID _3231_210_2106_1_1526205864934_6784220_171-256004-0_sp0.9 from training. Duration: 0.9555625 2023-10-02 02:25:40,970 WARNING [train.py:1204] (2/4) Exclude cut with ID _41314_210_12651_1_1533459476543_3766460_87-272068-0 from training. Duration: 0.83 2023-10-02 02:25:42,343 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_164-152777-0_sp1.1 from training. Duration: 0.92725 2023-10-02 02:25:43,661 WARNING [train.py:1204] (2/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_18-130336-0_sp0.9 from training. Duration: 0.87775 2023-10-02 02:25:43,711 WARNING [train.py:1204] (2/4) Exclude cut with ID _14886_210_4220_1_1530702020355_3516190_175-229073-0_sp1.1 from training. Duration: 0.4090625 2023-10-02 02:25:43,728 WARNING [train.py:1204] (2/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_791-45149-0_sp0.9 from training. Duration: 0.9555625 2023-10-02 02:25:45,538 WARNING [train.py:1204] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_468-189058-0_sp1.1 from training. Duration: 0.87275 2023-10-02 02:25:46,929 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8278_210_2550_1_1528637099_4220413_32-142780-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 02:25:49,667 WARNING [train.py:1204] (2/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_146-218763-0_sp0.9 from training. Duration: 0.9555625 2023-10-02 02:25:53,164 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.3.encoder.layers.1.self_attn2.whiten, num_groups=1, num_channels=512, metric=15.13 vs. limit=22.5 2023-10-02 02:25:53,905 WARNING [train.py:1204] (2/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_348-127789-0_sp1.1 from training. Duration: 0.77275 2023-10-02 02:25:53,911 WARNING [train.py:1204] (2/4) Exclude cut with ID _47790_210_16187_1_1533862814066_4207519_288-285445-0_sp1.1 from training. Duration: 0.936375 2023-10-02 02:25:54,166 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.attention_skip_rate, batch_count=718080.0, ans=0.0 2023-10-02 02:25:55,357 WARNING [train.py:1204] (2/4) Exclude cut with ID _14603_210_4220_1_1530615688441_3097319_308-181732-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 02:25:55,399 WARNING [train.py:1204] (2/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_895-117428-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 02:25:56,730 WARNING [train.py:1204] (2/4) Exclude cut with ID _9235_210_5219_1_1528891154253_8046120_387-159008-0_sp0.9 from training. Duration: 0.9555625 2023-10-02 02:25:56,749 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_37351_210_7925_1_1533012931_3812113_265-85780-0_sp0.9 from training. Duration: 0.97775 2023-10-02 02:25:56,762 WARNING [train.py:1204] (2/4) Exclude cut with ID _40551_210_10635_1_1533776424632_3416179_361-3964-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 02:25:58,119 WARNING [train.py:1204] (2/4) Exclude cut with ID _25882_210_3036_1_1532775665815_6479510_485-122742-0_sp1.1 from training. Duration: 0.97275 2023-10-02 02:26:01,636 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.2.encoder.layers.2.whiten, num_groups=1, num_channels=384, metric=4.69 vs. limit=12.0 2023-10-02 02:26:02,259 WARNING [train.py:1204] (2/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_98-51676-0 from training. Duration: 0.73 2023-10-02 02:26:03,669 WARNING [train.py:1204] (2/4) Exclude cut with ID _30548_210_16040_1_1532608097130_3146969_371-280610-0_sp1.1 from training. Duration: 0.92725 2023-10-02 02:26:09,078 WARNING [train.py:1204] (2/4) Exclude cut with ID IC0671W0081-331924 from training. Duration: 0.896 2023-10-02 02:26:10,480 WARNING [train.py:1204] (2/4) Exclude cut with ID _40284_210_2084_1_1533513679814_3704279_380-160775-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 02:26:11,830 WARNING [train.py:1204] (2/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_57-270037-0_sp1.1 from training. Duration: 0.7 2023-10-02 02:26:13,193 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_853-150237-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 02:26:13,275 WARNING [train.py:1204] (2/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_491-261923-0 from training. Duration: 0.98 2023-10-02 02:26:18,109 WARNING [train.py:1204] (2/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_836-244045-0_sp1.1 from training. Duration: 0.97275 2023-10-02 02:26:18,183 WARNING [train.py:1204] (2/4) Exclude cut with ID _14880_210_8895_1_1530680426655_3795259_289-212817-0 from training. Duration: 0.69 2023-10-02 02:26:20,762 WARNING [train.py:1204] (2/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_303-340446-0 from training. Duration: 0.9 2023-10-02 02:26:20,863 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_37152_210_9697_1_1533106573_3844188_418-41736-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 02:26:24,817 WARNING [train.py:1204] (2/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_371-18169-0_sp1.1 from training. Duration: 0.7818125 2023-10-02 02:26:24,851 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_187-219984-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 02:26:25,046 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.balancer1.prob, batch_count=718213.3333333334, ans=0.125 2023-10-02 02:26:26,233 WARNING [train.py:1204] (2/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_63-267585-0 from training. Duration: 0.9 2023-10-02 02:26:28,961 WARNING [train.py:1204] (2/4) Exclude cut with ID _27013_210_1389_1_1532437173240_3252649_222-125573-0 from training. Duration: 0.98 2023-10-02 02:26:29,019 WARNING [train.py:1204] (2/4) Exclude cut with ID _14821_210_8750_1_1530680503991_6829359_189-297451-0 from training. Duration: 0.55 2023-10-02 02:26:30,442 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_557-163391-0_sp1.1 from training. Duration: 0.97275 2023-10-02 02:26:31,784 WARNING [train.py:1204] (2/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_65-88242-0_sp1.1 from training. Duration: 0.5090625 2023-10-02 02:26:32,025 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.feed_forward3.hidden_balancer.prob, batch_count=718213.3333333334, ans=0.125 2023-10-02 02:26:32,322 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.4.encoder.layers.1.nonlin_attention.whiten2, num_groups=1, num_channels=384, metric=3.29 vs. limit=15.0 2023-10-02 02:26:34,529 INFO [train.py:1046] (2/4) Epoch 21, batch 1500, loss[loss=0.176, simple_loss=0.2648, pruned_loss=0.0436, over 24294.00 frames. ], tot_loss[loss=0.1746, simple_loss=0.2499, pruned_loss=0.04961, over 4724305.37 frames. ], batch size: 74, lr: 4.92e-03, grad_scale: 8.0 2023-10-02 02:26:42,872 WARNING [train.py:1204] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_218-295097-0 from training. Duration: 0.77 2023-10-02 02:26:42,891 WARNING [train.py:1204] (2/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_686-247600-0_sp1.1 from training. Duration: 0.836375 2023-10-02 02:26:42,895 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_397-147196-0_sp0.9 from training. Duration: 0.888875 2023-10-02 02:26:42,989 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder_embed.convnext.layerdrop_rate, batch_count=718280.0, ans=0.015 2023-10-02 02:26:44,258 WARNING [train.py:1204] (2/4) Exclude cut with ID _53950_210_5816_1_1534417264919_3862951_237-136449-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 02:26:44,325 WARNING [train.py:1204] (2/4) Exclude cut with ID _3120_210_3525_1_1526036143112_4072980_74-178400-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 02:26:45,621 WARNING [train.py:1204] (2/4) Exclude cut with ID _5166_210_2084_1_1527073197160_4087993_309-222545-0_sp0.9 from training. Duration: 0.9333125 2023-10-02 02:26:45,689 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7544_210_6753_1_1528587722_3576686_172-171750-0 from training. Duration: 0.65 2023-10-02 02:26:47,579 WARNING [train.py:1204] (2/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_3-237434-0_sp0.9 from training. Duration: 0.8444375 2023-10-02 02:26:47,623 WARNING [train.py:1204] (2/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_101-293367-0_sp0.9 from training. Duration: 0.7 2023-10-02 02:26:47,630 WARNING [train.py:1204] (2/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_818-213552-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 02:26:49,064 WARNING [train.py:1204] (2/4) Exclude cut with ID _14349_210_8341_1_1530615710818_3442470_115-285975-0_sp1.1 from training. Duration: 0.7818125 2023-10-02 02:26:49,308 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.balancer1.prob, batch_count=718346.6666666666, ans=0.125 2023-10-02 02:26:50,486 WARNING [train.py:1204] (2/4) Exclude cut with ID _30800_210_22382_1_1532499347904_4515959_291-309749-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 02:26:51,772 WARNING [train.py:1204] (2/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_715-3462-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 02:26:53,227 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.nonlin_attention.balancer.min_positive, batch_count=718346.6666666666, ans=0.05 2023-10-02 02:26:57,341 WARNING [train.py:1204] (2/4) Exclude cut with ID _59502_210_12210_1_1535107024698_1835139_13-331827-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 02:26:57,353 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_4528_210_3273_1_1527076654_3276777_342-168068-0 from training. Duration: 0.87 2023-10-02 02:26:58,729 WARNING [train.py:1204] (2/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_215-141059-0_sp0.9 from training. Duration: 0.688875 2023-10-02 02:27:00,061 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.376e+02 1.883e+02 2.099e+02 2.535e+02 4.584e+02, threshold=4.198e+02, percent-clipped=1.0 2023-10-02 02:27:00,147 WARNING [train.py:1204] (2/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_106-286130-0_sp1.1 from training. Duration: 0.8909375 2023-10-02 02:27:01,453 WARNING [train.py:1204] (2/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_480-77655-0_sp1.1 from training. Duration: 0.97275 2023-10-02 02:27:04,231 WARNING [train.py:1204] (2/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_848-280957-0 from training. Duration: 0.87 2023-10-02 02:27:07,527 WARNING [train.py:1204] (2/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_122-109023-0 from training. Duration: 0.76 2023-10-02 02:27:09,390 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_11305_210_1794_1_1529750428_7178148_645-233480-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 02:27:09,446 WARNING [train.py:1204] (2/4) Exclude cut with ID _7562_210_2084_1_1528887767916_3946229_345-169156-0 from training. Duration: 0.58 2023-10-02 02:27:12,195 WARNING [train.py:1204] (2/4) Exclude cut with ID _7562_210_2084_1_1528887767916_3946229_345-169156-0_sp1.1 from training. Duration: 0.52725 2023-10-02 02:27:12,778 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.4.encoder.layers.1.nonlin_attention.whiten1, num_groups=1, num_channels=288, metric=3.92 vs. limit=10.0 2023-10-02 02:27:13,678 WARNING [train.py:1204] (2/4) Exclude cut with ID _13253_210_1925_1_1530757775819_3577873_253-246386-0_sp1.1 from training. Duration: 0.8181875 2023-10-02 02:27:14,996 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_136-264444-0_sp1.1 from training. Duration: 0.97275 2023-10-02 02:27:15,008 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_2168_210_2290_1_1525606920_1849400_136-222991-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 02:27:15,115 WARNING [train.py:1204] (2/4) Exclude cut with ID _7852_210_5219_1_1528286471316_8139400_242-105336-0 from training. Duration: 0.72 2023-10-02 02:27:16,440 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_5960_210_1794_1_1527403865_3990999_515-2613-0_sp1.1 from training. Duration: 0.8 2023-10-02 02:27:16,454 WARNING [train.py:1204] (2/4) Exclude cut with ID _39688_210_19596_1_1533371626725_4154159_551-167834-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 02:27:18,285 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_43802_210_2290_1_1533516203_8829504_85-195240-0 from training. Duration: 0.87 2023-10-02 02:27:18,343 WARNING [train.py:1204] (2/4) Exclude cut with ID _63171_210_15488_1_1535104792794_3295110_113-113534-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 02:27:22,455 WARNING [train.py:1204] (2/4) Exclude cut with ID _8833_210_5219_1_1528592665210_7553059_319-349181-0_sp1.1 from training. Duration: 0.9 2023-10-02 02:27:22,463 WARNING [train.py:1204] (2/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_225-43381-0 from training. Duration: 0.94 2023-10-02 02:27:27,922 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_5959_210_1794_1_1527400400_3445726_412-44052-0_sp0.9 from training. Duration: 0.8555625 2023-10-02 02:27:29,289 WARNING [train.py:1204] (2/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_356-167816-0_sp1.1 from training. Duration: 0.6545625 2023-10-02 02:27:32,141 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9012W0112-501244 from training. Duration: 0.768 2023-10-02 02:27:32,847 WARNING [train.py:1204] (2/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_177-326952-0_sp1.1 from training. Duration: 0.92725 2023-10-02 02:27:32,851 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9016W0116-502789 from training. Duration: 0.896 2023-10-02 02:27:34,261 WARNING [train.py:1204] (2/4) Exclude cut with ID _56235_210_10640_1_1534557335235_4161099_557-204815-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 02:27:35,641 WARNING [train.py:1204] (2/4) Exclude cut with ID _55857_210_2346_1_1534644007831_6898209_279-229469-0_sp1.1 from training. Duration: 0.936375 2023-10-02 02:27:35,882 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.conv_module2.balancer1.prob, batch_count=718546.6666666666, ans=0.125 2023-10-02 02:27:36,814 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9029W0375-509764 from training. Duration: 0.896 2023-10-02 02:27:38,675 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_2295_210_2087_1_1525500993_2407863_234-83104-0_sp0.9 from training. Duration: 0.688875 2023-10-02 02:27:38,870 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.1.bypass.scale_min, batch_count=718546.6666666666, ans=0.2 2023-10-02 02:27:40,758 WARNING [train.py:1204] (2/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_266-137104-0 from training. Duration: 0.65 2023-10-02 02:27:42,167 WARNING [train.py:1204] (2/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_289-163927-0_sp1.1 from training. Duration: 0.92725 2023-10-02 02:27:44,980 WARNING [train.py:1204] (2/4) Exclude cut with ID _12733_210_8341_1_1530167424556_3646380_86-225314-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 02:27:46,314 WARNING [train.py:1204] (2/4) Exclude cut with ID _7877_210_6014_1_1528286358099_3750136_618-284064-0_sp1.1 from training. Duration: 0.92725 2023-10-02 02:27:46,375 WARNING [train.py:1204] (2/4) Exclude cut with ID _55844_210_2346_1_1534471212569_7625707_1017-31469-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 02:27:48,121 WARNING [train.py:1204] (2/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_617-241446-0_sp1.1 from training. Duration: 0.92725 2023-10-02 02:27:48,163 WARNING [train.py:1204] (2/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_866-173440-0_sp1.1 from training. Duration: 0.8181875 2023-10-02 02:27:48,293 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_9460_210_4211_1_1528954587_10049667_480-260269-0 from training. Duration: 0.56 2023-10-02 02:27:49,558 INFO [train.py:1046] (2/4) Epoch 21, batch 1550, loss[loss=0.1982, simple_loss=0.2622, pruned_loss=0.06715, over 23737.00 frames. ], tot_loss[loss=0.1763, simple_loss=0.2513, pruned_loss=0.05064, over 4712141.91 frames. ], batch size: 179, lr: 4.92e-03, grad_scale: 8.0 2023-10-02 02:27:49,638 WARNING [train.py:1204] (2/4) Exclude cut with ID _44659_210_2721_1_1534036120061_6614860_626-298884-0 from training. Duration: 0.97 2023-10-02 02:27:49,681 WARNING [train.py:1204] (2/4) Exclude cut with ID _53565_210_10635_1_1534337218335_3820259_287-18120-0_sp1.1 from training. Duration: 0.8818125 2023-10-02 02:27:50,971 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_791-197891-0 from training. Duration: 0.84 2023-10-02 02:27:52,315 WARNING [train.py:1204] (2/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_527-238990-0 from training. Duration: 0.9 2023-10-02 02:27:53,872 WARNING [train.py:1204] (2/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_449-65969-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 02:27:55,392 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_553-341452-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 02:27:56,772 WARNING [train.py:1204] (2/4) Exclude cut with ID _16909_210_5724_1_1531292079912_4118919_300-47574-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 02:27:56,784 WARNING [train.py:1204] (2/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_70-27732-0_sp1.1 from training. Duration: 0.8545625 2023-10-02 02:27:58,184 WARNING [train.py:1204] (2/4) Exclude cut with ID _44718_210_5816_1_1534050051869_6469030_727-28074-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 02:27:59,556 WARNING [train.py:1204] (2/4) Exclude cut with ID _55857_210_2346_1_1534644007831_6898209_357-123664-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 02:28:01,609 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9123W0004-555964 from training. Duration: 0.896 2023-10-02 02:28:01,653 WARNING [train.py:1204] (2/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_445-145208-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 02:28:02,918 WARNING [train.py:1204] (2/4) Exclude cut with ID _3675_210_4557_1_1526639934408_4182089_456-18305-0_sp1.1 from training. Duration: 0.6909375 2023-10-02 02:28:04,955 WARNING [train.py:1204] (2/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_1-99680-0_sp1.1 from training. Duration: 0.6090625 2023-10-02 02:28:06,270 WARNING [train.py:1204] (2/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_146-282400-0_sp0.9 from training. Duration: 0.788875 2023-10-02 02:28:06,291 WARNING [train.py:1204] (2/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_844-96989-0 from training. Duration: 0.71 2023-10-02 02:28:07,717 WARNING [train.py:1204] (2/4) Exclude cut with ID _29853_210_7497_1_1532505600851_6642110_128-146364-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 02:28:09,036 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_224-322394-0 from training. Duration: 0.66 2023-10-02 02:28:09,126 WARNING [train.py:1204] (2/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_191-59784-0 from training. Duration: 0.88 2023-10-02 02:28:11,021 WARNING [train.py:1204] (2/4) Exclude cut with ID _12556_210_8341_1_1530081024100_7327690_725-193477-0 from training. Duration: 0.98 2023-10-02 02:28:11,072 WARNING [train.py:1204] (2/4) Exclude cut with ID _5166_210_2084_1_1527073197160_4087993_523-79838-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 02:28:13,773 WARNING [train.py:1204] (2/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_398-334558-0_sp1.1 from training. Duration: 0.87275 2023-10-02 02:28:13,897 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder_embed.conv.2.prob, batch_count=718680.0, ans=0.125 2023-10-02 02:28:16,543 WARNING [train.py:1204] (2/4) Exclude cut with ID _6456_210_1534_1_1527681634212_3181049_171-36697-0_sp1.1 from training. Duration: 0.936375 2023-10-02 02:28:19,307 WARNING [train.py:1204] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_700-183549-0 from training. Duration: 0.68 2023-10-02 02:28:19,309 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8853_210_3273_1_1528633899_818968_51-187685-0 from training. Duration: 0.69 2023-10-02 02:28:28,397 WARNING [train.py:1204] (2/4) Exclude cut with ID _7719_210_3525_1_1528456153163_4296960_333-110399-0_sp1.1 from training. Duration: 0.87275 2023-10-02 02:28:31,243 WARNING [train.py:1204] (2/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_1022-83740-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 02:28:31,261 WARNING [train.py:1204] (2/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_61-327062-0_sp0.9 from training. Duration: 0.9 2023-10-02 02:28:31,274 WARNING [train.py:1204] (2/4) Exclude cut with ID _47251_210_18628_1_1533814063128_4137250_214-88076-0_sp1.1 from training. Duration: 0.8 2023-10-02 02:28:32,577 WARNING [train.py:1204] (2/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_839-173017-0 from training. Duration: 0.41 2023-10-02 02:28:32,808 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.ff2_skip_rate, batch_count=718813.3333333334, ans=0.0 2023-10-02 02:28:34,219 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.bypass.skip_rate, batch_count=718813.3333333334, ans=0.09899494936611666 2023-10-02 02:28:37,496 WARNING [train.py:1204] (2/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_299-34413-0_sp1.1 from training. Duration: 0.5909375 2023-10-02 02:28:38,894 WARNING [train.py:1204] (2/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_664-159457-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 02:28:39,411 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.3.encoder.layers.2.whiten, num_groups=1, num_channels=512, metric=4.65 vs. limit=12.0 2023-10-02 02:28:42,836 WARNING [train.py:1204] (2/4) Exclude cut with ID _66873_210_5517_1_1535454136506_4293370_483-291760-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 02:28:46,181 WARNING [train.py:1204] (2/4) Exclude cut with ID _30800_210_22382_1_1532499347904_4515959_287-255474-0_sp1.1 from training. Duration: 0.92725 2023-10-02 02:28:46,229 WARNING [train.py:1204] (2/4) Exclude cut with ID _48677_210_5663_1_1533902405919_3892989_111-11439-0_sp1.1 from training. Duration: 0.87275 2023-10-02 02:28:47,558 WARNING [train.py:1204] (2/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_399-156791-0 from training. Duration: 0.83 2023-10-02 02:28:47,590 WARNING [train.py:1204] (2/4) Exclude cut with ID _3992_210_1747_1_1526644816031_3731060_36-147855-0_sp0.9 from training. Duration: 0.8666875 2023-10-02 02:28:47,722 WARNING [train.py:1204] (2/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_442-320317-0_sp1.1 from training. Duration: 0.7454375 2023-10-02 02:28:49,112 WARNING [train.py:1204] (2/4) Exclude cut with ID _55429_210_10626_1_1534467588527_7174143_953-80868-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 02:28:49,203 WARNING [train.py:1204] (2/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_159-328157-0_sp0.9 from training. Duration: 0.57775 2023-10-02 02:28:49,210 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9309W0197-640324 from training. Duration: 0.896 2023-10-02 02:28:53,297 WARNING [train.py:1204] (2/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_361-30822-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 02:28:55,667 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.conv_module2.balancer2.prob, batch_count=718880.0, ans=0.125 2023-10-02 02:28:58,195 WARNING [train.py:1204] (2/4) Exclude cut with ID _5250_210_5219_1_1527249561806_8692110_636-251213-0 from training. Duration: 0.94 2023-10-02 02:29:02,587 WARNING [train.py:1204] (2/4) Exclude cut with ID _49728_210_12651_1_1534041034294_6788150_468-333827-0_sp1.1 from training. Duration: 0.963625 2023-10-02 02:29:02,818 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.conv_module1.balancer2.prob, batch_count=718946.6666666666, ans=0.125 2023-10-02 02:29:02,826 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=718946.6666666666, ans=0.1 2023-10-02 02:29:03,920 INFO [train.py:1046] (2/4) Epoch 21, batch 1600, loss[loss=0.1595, simple_loss=0.2458, pruned_loss=0.03666, over 24351.00 frames. ], tot_loss[loss=0.1754, simple_loss=0.2507, pruned_loss=0.05008, over 4712778.23 frames. ], batch size: 74, lr: 4.92e-03, grad_scale: 16.0 2023-10-02 02:29:03,964 WARNING [train.py:1204] (2/4) Exclude cut with ID _25882_210_3036_1_1532775665815_6479510_623-191759-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 02:29:04,018 WARNING [train.py:1204] (2/4) Exclude cut with ID _5189_210_4557_1_1527248319428_3769229_199-53526-0 from training. Duration: 0.42 2023-10-02 02:29:05,374 WARNING [train.py:1204] (2/4) Exclude cut with ID _6274_210_2084_1_1527937583837_7494020_32-205288-0_sp0.9 from training. Duration: 0.8666875 2023-10-02 02:29:05,591 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.bypass_mid.scale_min, batch_count=718946.6666666666, ans=0.2 2023-10-02 02:29:06,695 WARNING [train.py:1204] (2/4) Exclude cut with ID _65263_210_22356_1_1535763598691_3921037_401-346646-0_sp1.1 from training. Duration: 0.963625 2023-10-02 02:29:06,699 WARNING [train.py:1204] (2/4) Exclude cut with ID _12555_210_8750_1_1530075594402_3780239_449-267986-0_sp1.1 from training. Duration: 0.7545625 2023-10-02 02:29:07,975 WARNING [train.py:1204] (2/4) Exclude cut with ID _69884_210_9771_1_1535846392818_4166169_430-201020-0_sp1.1 from training. Duration: 0.8818125 2023-10-02 02:29:08,047 WARNING [train.py:1204] (2/4) Exclude cut with ID _8394_210_6702_1_1528590504316_3540189_379-49457-0_sp1.1 from training. Duration: 0.7909375 2023-10-02 02:29:10,158 WARNING [train.py:1204] (2/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_200-105218-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 02:29:11,430 WARNING [train.py:1204] (2/4) Exclude cut with ID _3867_210_1883_1_1526641200575_4475830_319-349850-0 from training. Duration: 0.67 2023-10-02 02:29:12,821 WARNING [train.py:1204] (2/4) Exclude cut with ID _28317_210_12742_1_1532480302381_8510325_916-195848-0 from training. Duration: 0.92 2023-10-02 02:29:15,947 WARNING [train.py:1204] (2/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_436-186311-0 from training. Duration: 0.65 2023-10-02 02:29:19,130 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_3910_210_1794_1_1526777667_3747260_302-180617-0_sp1.1 from training. Duration: 0.97275 2023-10-02 02:29:20,503 WARNING [train.py:1204] (2/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_102-92751-0 from training. Duration: 0.99 2023-10-02 02:29:20,562 WARNING [train.py:1204] (2/4) Exclude cut with ID _51899_210_20034_1_1534129254466_3669220_265-215428-0_sp1.1 from training. Duration: 0.8909375 2023-10-02 02:29:24,061 WARNING [train.py:1204] (2/4) Exclude cut with ID _58583_210_5236_1_1534726835266_3574249_438-198993-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 02:29:26,981 WARNING [train.py:1204] (2/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_166-166990-0_sp1.1 from training. Duration: 0.87275 2023-10-02 02:29:28,522 WARNING [train.py:1204] (2/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_907-151398-0 from training. Duration: 0.37 2023-10-02 02:29:29,736 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.480e+02 1.788e+02 1.991e+02 2.210e+02 3.444e+02, threshold=3.981e+02, percent-clipped=0.0 2023-10-02 02:29:31,176 WARNING [train.py:1204] (2/4) Exclude cut with ID _38670_210_15379_1_1533448810659_4391059_452-256669-0_sp1.1 from training. Duration: 0.936375 2023-10-02 02:29:32,553 WARNING [train.py:1204] (2/4) Exclude cut with ID _48677_210_5663_1_1533902405919_3892989_408-40740-0 from training. Duration: 0.83 2023-10-02 02:29:32,586 WARNING [train.py:1204] (2/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_903-158756-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 02:29:32,636 WARNING [train.py:1204] (2/4) Exclude cut with ID _13252_210_1925_1_1530671372047_3336909_397-216497-0 from training. Duration: 0.96 2023-10-02 02:29:36,934 WARNING [train.py:1204] (2/4) Exclude cut with ID _55429_210_10626_1_1534467588527_7174143_97-335084-0 from training. Duration: 0.97 2023-10-02 02:29:44,727 WARNING [train.py:1204] (2/4) Exclude cut with ID _45329_210_7005_1_1534157951211_6774339_453-267730-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 02:29:46,715 WARNING [train.py:1204] (2/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_153-97441-0 from training. Duration: 0.56 2023-10-02 02:29:46,782 WARNING [train.py:1204] (2/4) Exclude cut with ID _40901_210_5665_1_1533363789958_3856130_312-329122-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 02:29:48,032 WARNING [train.py:1204] (2/4) Exclude cut with ID _30800_210_22382_1_1532499347904_4515959_97-126471-0_sp1.1 from training. Duration: 0.97275 2023-10-02 02:29:48,032 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_11305_210_1794_1_1529750428_7178148_257-312870-0_sp1.1 from training. Duration: 0.8 2023-10-02 02:29:49,560 WARNING [train.py:1204] (2/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_454-314901-0_sp0.9 from training. Duration: 0.511125 2023-10-02 02:29:52,996 WARNING [train.py:1204] (2/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_282-136641-0_sp1.1 from training. Duration: 0.3818125 2023-10-02 02:29:54,461 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_46013_210_7234_1_1533776362_3744032_493-160724-0_sp1.1 from training. Duration: 0.963625 2023-10-02 02:29:54,500 WARNING [train.py:1204] (2/4) Exclude cut with ID _67831_210_12392_1_1535612464202_3092612_259-33442-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 02:29:55,873 WARNING [train.py:1204] (2/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_289-62384-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 02:29:57,239 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6545_210_2040_1_1527774670_4245258_379-14182-0_sp0.9 from training. Duration: 0.888875 2023-10-02 02:29:57,362 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_548-294316-0_sp1.1 from training. Duration: 0.736375 2023-10-02 02:29:58,664 WARNING [train.py:1204] (2/4) Exclude cut with ID _6275_210_2084_1_1528110062213_7669049_857-298942-0_sp1.1 from training. Duration: 0.77275 2023-10-02 02:30:00,032 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6673_210_3273_1_1527767790_3173240_327-306185-0_sp1.1 from training. Duration: 0.7545625 2023-10-02 02:30:05,624 WARNING [train.py:1204] (2/4) Exclude cut with ID _61008_210_10633_1_1534852769198_6974370_814-107647-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 02:30:05,711 WARNING [train.py:1204] (2/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_116-332008-0_sp1.1 from training. Duration: 0.8909375 2023-10-02 02:30:09,008 WARNING [train.py:1204] (2/4) Exclude cut with ID _32704_210_8954_1_1533017488840_6747370_266-329755-0 from training. Duration: 0.89 2023-10-02 02:30:09,011 WARNING [train.py:1204] (2/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_271-269084-0_sp0.9 from training. Duration: 0.92225 2023-10-02 02:30:10,248 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_3171_210_2087_1_1526109084_2330007_66-260583-0 from training. Duration: 0.72 2023-10-02 02:30:14,408 WARNING [train.py:1204] (2/4) Exclude cut with ID _5250_210_5219_1_1527249561806_8692110_1326-276386-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 02:30:17,637 WARNING [train.py:1204] (2/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_372-30302-0_sp1.1 from training. Duration: 0.7818125 2023-10-02 02:30:17,696 WARNING [train.py:1204] (2/4) Exclude cut with ID _25882_210_3036_1_1532775665815_6479510_631-257029-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 02:30:17,703 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_3691_210_3528_1_1526468169_4062053_355-334432-0 from training. Duration: 0.87 2023-10-02 02:30:17,710 WARNING [train.py:1204] (2/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_839-173017-0_sp1.1 from training. Duration: 0.37275 2023-10-02 02:30:17,726 WARNING [train.py:1204] (2/4) Exclude cut with ID _14998_210_5219_1_1530705582080_7474970_441-91466-0 from training. Duration: 0.97 2023-10-02 02:30:17,760 WARNING [train.py:1204] (2/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_108-53929-0 from training. Duration: 0.59 2023-10-02 02:30:19,679 INFO [train.py:1046] (2/4) Epoch 21, batch 1650, loss[loss=0.1867, simple_loss=0.2683, pruned_loss=0.05253, over 24427.00 frames. ], tot_loss[loss=0.1761, simple_loss=0.2517, pruned_loss=0.05019, over 4723828.17 frames. ], batch size: 77, lr: 4.92e-03, grad_scale: 16.0 2023-10-02 02:30:22,613 WARNING [train.py:1204] (2/4) Exclude cut with ID _9389_210_1664_1_1529217337301_5289269_260-1942-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 02:30:22,685 WARNING [train.py:1204] (2/4) Exclude cut with ID _56423_210_5854_1_1534896061049_3165969_198-61547-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 02:30:22,713 WARNING [train.py:1204] (2/4) Exclude cut with ID _44778_210_2054_1_1533798127203_7443090_412-142122-0_sp1.1 from training. Duration: 0.92725 2023-10-02 02:30:23,972 WARNING [train.py:1204] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_88-341718-0_sp1.1 from training. Duration: 0.6 2023-10-02 02:30:26,720 WARNING [train.py:1204] (2/4) Exclude cut with ID _61079_210_4525_1_1534921008198_4011419_334-298697-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 02:30:28,194 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_5465_210_4211_1_1527227253_55357_8-2499-0_sp1.1 from training. Duration: 0.996375 2023-10-02 02:30:32,236 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_447-201565-0_sp1.1 from training. Duration: 0.97275 2023-10-02 02:30:32,248 WARNING [train.py:1204] (2/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_253-161789-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 02:30:32,251 WARNING [train.py:1204] (2/4) Exclude cut with ID _44304_210_15374_1_1533639352765_4613129_302-22084-0_sp0.9 from training. Duration: 0.9555625 2023-10-02 02:30:32,259 WARNING [train.py:1204] (2/4) Exclude cut with ID _26664_210_7770_1_1532258943280_3418270_187-80815-0_sp1.1 from training. Duration: 0.8454375 2023-10-02 02:30:33,651 WARNING [train.py:1204] (2/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_68-212801-0 from training. Duration: 0.73 2023-10-02 02:30:33,680 WARNING [train.py:1204] (2/4) Exclude cut with ID _29345_210_4381_1_1532689168263_3860169_149-257268-0 from training. Duration: 0.81 2023-10-02 02:30:39,295 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_213-103282-0_sp1.1 from training. Duration: 0.7090625 2023-10-02 02:30:42,451 WARNING [train.py:1204] (2/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_156-302675-0_sp1.1 from training. Duration: 0.5 2023-10-02 02:30:51,221 WARNING [train.py:1204] (2/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_125-322172-0 from training. Duration: 0.93 2023-10-02 02:30:52,638 WARNING [train.py:1204] (2/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_493-60474-0_sp1.1 from training. Duration: 0.963625 2023-10-02 02:30:54,016 WARNING [train.py:1204] (2/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_156-302675-0 from training. Duration: 0.55 2023-10-02 02:30:57,917 WARNING [train.py:1204] (2/4) Exclude cut with ID _41314_210_12651_1_1533459476543_3766460_123-267862-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 02:30:59,570 WARNING [train.py:1204] (2/4) Exclude cut with ID _28427_210_4220_1_1532581240967_3061069_201-143747-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 02:31:00,891 WARNING [train.py:1204] (2/4) Exclude cut with ID _8772_210_1850_1_1528635595972_3664011_96-12756-0_sp1.1 from training. Duration: 0.8909375 2023-10-02 02:31:02,196 WARNING [train.py:1204] (2/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_1347-145675-0_sp1.1 from training. Duration: 0.936375 2023-10-02 02:31:03,557 WARNING [train.py:1204] (2/4) Exclude cut with ID _9235_210_5219_1_1528891154253_8046120_387-159008-0_sp1.1 from training. Duration: 0.7818125 2023-10-02 02:31:03,574 WARNING [train.py:1204] (2/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_400-201896-0_sp1.1 from training. Duration: 0.963625 2023-10-02 02:31:03,751 WARNING [train.py:1204] (2/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_326-174612-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 02:31:05,034 WARNING [train.py:1204] (2/4) Exclude cut with ID _55844_210_2346_1_1534471212569_7625707_390-116233-0_sp1.1 from training. Duration: 0.963625 2023-10-02 02:31:05,079 WARNING [train.py:1204] (2/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_419-317062-0_sp1.1 from training. Duration: 0.82725 2023-10-02 02:31:06,450 WARNING [train.py:1204] (2/4) Exclude cut with ID _29877_210_6228_1_1532397791106_5276149_266-62291-0_sp0.9 from training. Duration: 0.9666875 2023-10-02 02:31:07,807 WARNING [train.py:1204] (2/4) Exclude cut with ID _7857_210_1534_1_1528286515745_6831470_431-338528-0_sp1.1 from training. Duration: 0.92725 2023-10-02 02:31:07,872 WARNING [train.py:1204] (2/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_87-131427-0_sp0.9 from training. Duration: 0.8666875 2023-10-02 02:31:10,576 WARNING [train.py:1204] (2/4) Exclude cut with ID _24055_210_12651_1_1532310907143_976979_92-59356-0_sp1.1 from training. Duration: 0.82725 2023-10-02 02:31:12,502 WARNING [train.py:1204] (2/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_96-290039-0 from training. Duration: 0.56 2023-10-02 02:31:15,794 WARNING [train.py:1204] (2/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_48-182155-0_sp0.9 from training. Duration: 0.9666875 2023-10-02 02:31:15,839 WARNING [train.py:1204] (2/4) Exclude cut with ID _4626_210_3532_1_1526817652717_3916859_446-206656-0 from training. Duration: 0.76 2023-10-02 02:31:17,223 WARNING [train.py:1204] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_168-32906-0 from training. Duration: 0.69 2023-10-02 02:31:17,243 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_3780_210_2087_1_1526467760_2130483_170-259044-0 from training. Duration: 0.7 2023-10-02 02:31:17,260 WARNING [train.py:1204] (2/4) Exclude cut with ID _65134_210_14763_1_1535335393985_3505368_153-216718-0_sp1.1 from training. Duration: 0.92725 2023-10-02 02:31:18,606 WARNING [train.py:1204] (2/4) Exclude cut with ID _64639_210_5816_1_1535367619437_3528000_461-329156-0_sp1.1 from training. Duration: 0.87275 2023-10-02 02:31:18,648 WARNING [train.py:1204] (2/4) Exclude cut with ID _21989_210_5854_1_1531899214931_3271360_126-347531-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 02:31:19,971 WARNING [train.py:1204] (2/4) Exclude cut with ID _7614_210_4915_1_1529326764092_4537359_96-162197-0_sp1.1 from training. Duration: 0.963625 2023-10-02 02:31:19,972 WARNING [train.py:1204] (2/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_1083-147536-0 from training. Duration: 0.97 2023-10-02 02:31:23,381 WARNING [train.py:1204] (2/4) Exclude cut with ID _64029_210_4381_1_1535178502772_3756459_275-16710-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 02:31:24,240 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.1.encoder.layers.0.feed_forward3.out_whiten, num_groups=1, num_channels=256, metric=12.07 vs. limit=15.0 2023-10-02 02:31:26,060 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7264_210_4938_1_1527939911_8132095_800-91995-0_sp0.9 from training. Duration: 0.9555625 2023-10-02 02:31:26,109 WARNING [train.py:1204] (2/4) Exclude cut with ID _3844_210_1390_1_1526727803582_2871209_50-193879-0_sp1.1 from training. Duration: 0.936375 2023-10-02 02:31:28,938 WARNING [train.py:1204] (2/4) Exclude cut with ID _46952_210_5986_1_1534230010357_3107379_232-344503-0 from training. Duration: 0.96 2023-10-02 02:31:33,167 INFO [train.py:1046] (2/4) Epoch 21, batch 1700, loss[loss=0.1716, simple_loss=0.2367, pruned_loss=0.05324, over 23598.00 frames. ], tot_loss[loss=0.1753, simple_loss=0.2505, pruned_loss=0.05007, over 4723583.03 frames. ], batch size: 256, lr: 4.92e-03, grad_scale: 16.0 2023-10-02 02:31:33,220 WARNING [train.py:1204] (2/4) Exclude cut with ID _25808_210_13292_1_1532343514562_7366109_206-14076-0_sp1.1 from training. Duration: 0.936375 2023-10-02 02:31:33,221 WARNING [train.py:1204] (2/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_55-31904-0_sp0.9 from training. Duration: 0.97775 2023-10-02 02:31:33,246 WARNING [train.py:1204] (2/4) Exclude cut with ID _14632_210_9988_1_1530691279392_3923200_161-129530-0 from training. Duration: 0.96 2023-10-02 02:31:33,310 WARNING [train.py:1204] (2/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_770-200430-0_sp1.1 from training. Duration: 0.8090625 2023-10-02 02:31:33,319 WARNING [train.py:1204] (2/4) Exclude cut with ID _6270_210_2084_1_1527850911573_3856420_26-277343-0_sp1.1 from training. Duration: 0.7454375 2023-10-02 02:31:33,321 WARNING [train.py:1204] (2/4) Exclude cut with ID _69884_210_9771_1_1535846392818_4166169_88-23776-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 02:31:36,246 WARNING [train.py:1204] (2/4) Exclude cut with ID _60860_210_8254_1_1534896015875_3557169_245-256666-0_sp1.1 from training. Duration: 0.8818125 2023-10-02 02:31:36,273 WARNING [train.py:1204] (2/4) Exclude cut with ID _5250_210_5219_1_1527249561806_8692110_636-251213-0_sp1.1 from training. Duration: 0.8545625 2023-10-02 02:31:37,493 WARNING [train.py:1204] (2/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_540-131164-0 from training. Duration: 0.92 2023-10-02 02:31:39,068 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_5931_210_2040_1_1527428481_4547999_321-193614-0_sp0.9 from training. Duration: 0.7444375 2023-10-02 02:31:48,408 WARNING [train.py:1204] (2/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_373-225403-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 02:31:51,269 WARNING [train.py:1204] (2/4) Exclude cut with ID _20622_210_3852_1_1531622427080_1093839_57-326027-0_sp1.1 from training. Duration: 0.97275 2023-10-02 02:31:56,779 WARNING [train.py:1204] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_605-257205-0_sp1.1 from training. Duration: 0.536375 2023-10-02 02:31:56,799 WARNING [train.py:1204] (2/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_442-320317-0_sp0.9 from training. Duration: 0.911125 2023-10-02 02:31:56,843 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_35361_210_4238_1_1533262810_7793476_675-87012-0_sp1.1 from training. Duration: 0.8090625 2023-10-02 02:31:58,199 WARNING [train.py:1204] (2/4) Exclude cut with ID _4184_210_2550_1_1526730192013_2219380_306-44736-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 02:31:58,456 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.conv_skip_rate, batch_count=719680.0, ans=0.0 2023-10-02 02:31:59,511 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.562e+02 1.914e+02 2.162e+02 2.469e+02 4.106e+02, threshold=4.325e+02, percent-clipped=1.0 2023-10-02 02:32:00,847 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_124-113562-0 from training. Duration: 0.94 2023-10-02 02:32:02,282 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_2821_210_2715_1_1526108385_3763560_73-296673-0_sp0.9 from training. Duration: 0.92225 2023-10-02 02:32:02,303 WARNING [train.py:1204] (2/4) Exclude cut with ID _10301_210_5886_1_1529222275332_3910620_216-46959-0_sp1.1 from training. Duration: 0.92725 2023-10-02 02:32:05,065 WARNING [train.py:1204] (2/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_91-277684-0_sp0.9 from training. Duration: 0.82225 2023-10-02 02:32:05,153 WARNING [train.py:1204] (2/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_351-294650-0_sp0.9 from training. Duration: 0.7 2023-10-02 02:32:07,875 WARNING [train.py:1204] (2/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_209-205365-0 from training. Duration: 0.79 2023-10-02 02:32:07,935 WARNING [train.py:1204] (2/4) Exclude cut with ID _14603_210_4220_1_1530615688441_3097319_97-325053-0 from training. Duration: 0.92 2023-10-02 02:32:09,374 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_49348_210_7927_1_1533954500_3691300_109-276430-0_sp1.1 from training. Duration: 0.92725 2023-10-02 02:32:10,802 WARNING [train.py:1204] (2/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_912-47473-0 from training. Duration: 0.68 2023-10-02 02:32:12,841 WARNING [train.py:1204] (2/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_96-320736-0_sp0.9 from training. Duration: 0.9555625 2023-10-02 02:32:22,974 WARNING [train.py:1204] (2/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_1252-235481-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 02:32:23,206 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.balancer2.prob, batch_count=719813.3333333334, ans=0.125 2023-10-02 02:32:24,366 WARNING [train.py:1204] (2/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_813-261554-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 02:32:24,428 WARNING [train.py:1204] (2/4) Exclude cut with ID _21632_210_2253_1_1531994001765_3744140_110-274654-0_sp0.9 from training. Duration: 0.911125 2023-10-02 02:32:26,377 WARNING [train.py:1204] (2/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_381-317791-0_sp1.1 from training. Duration: 0.663625 2023-10-02 02:32:26,396 WARNING [train.py:1204] (2/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_51-117576-0_sp1.1 from training. Duration: 0.363625 2023-10-02 02:32:26,418 WARNING [train.py:1204] (2/4) Exclude cut with ID _42321_210_7770_1_1533468631352_4366895_104-215915-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 02:32:29,880 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_40896_210_15710_1_1533433956_7100802_216-22824-0_sp1.1 from training. Duration: 0.92725 2023-10-02 02:32:29,881 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_14831_210_7927_1_1530700200_5583610_181-27467-0 from training. Duration: 0.91 2023-10-02 02:32:29,947 WARNING [train.py:1204] (2/4) Exclude cut with ID _30304_210_6319_1_1532476827261_7582060_1024-265645-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 02:32:29,949 WARNING [train.py:1204] (2/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_412-233904-0_sp1.1 from training. Duration: 0.936375 2023-10-02 02:32:31,239 WARNING [train.py:1204] (2/4) Exclude cut with ID _30191_210_1664_1_1533189915047_7196670_911-223461-0_sp1.1 from training. Duration: 0.92725 2023-10-02 02:32:31,248 WARNING [train.py:1204] (2/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_558-148266-0_sp1.1 from training. Duration: 0.963625 2023-10-02 02:32:32,637 WARNING [train.py:1204] (2/4) Exclude cut with ID _58480_210_12392_1_1534811121111_3646160_212-164495-0_sp1.1 from training. Duration: 0.936375 2023-10-02 02:32:32,638 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_454-276429-0_sp0.9 from training. Duration: 0.9666875 2023-10-02 02:32:34,031 WARNING [train.py:1204] (2/4) Exclude cut with ID _24736_210_3279_1_1532332894218_2838700_73-197086-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 02:32:34,090 WARNING [train.py:1204] (2/4) Exclude cut with ID _5294_210_1747_1_1527249643928_3696179_529-242298-0_sp1.1 from training. Duration: 0.863625 2023-10-02 02:32:34,126 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_53555_210_5632_1_1534595220_7232297_345-171438-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 02:32:36,916 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.conv_module1.balancer2.prob, batch_count=719880.0, ans=0.125 2023-10-02 02:32:39,415 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_28211_210_6388_1_1532861506_4523224_22-25766-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 02:32:40,789 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7002_210_1794_1_1527939683_5157286_163-87552-0 from training. Duration: 0.86 2023-10-02 02:32:42,203 WARNING [train.py:1204] (2/4) Exclude cut with ID _56927_210_9969_1_1534848970840_3695229_409-225129-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 02:32:43,567 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_26192_210_7927_1_1532137269_1040115_21-219331-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 02:32:43,702 INFO [scaling.py:1118] (2/4) WithLoss: name=encoder.encoders.0.layers.0.self_attn_weights, loss-sum=0.000e+00 2023-10-02 02:32:45,870 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.ff3_skip_rate, batch_count=719880.0, ans=0.0 2023-10-02 02:32:47,496 WARNING [train.py:1204] (2/4) Exclude cut with ID _15012_210_1968_1_1530856820429_5095350_662-210469-0 from training. Duration: 0.57 2023-10-02 02:32:48,774 INFO [train.py:1046] (2/4) Epoch 21, batch 1750, loss[loss=0.1745, simple_loss=0.2577, pruned_loss=0.04563, over 24576.00 frames. ], tot_loss[loss=0.1751, simple_loss=0.25, pruned_loss=0.05013, over 4712820.91 frames. ], batch size: 71, lr: 4.92e-03, grad_scale: 16.0 2023-10-02 02:32:51,749 WARNING [train.py:1204] (2/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_582-191016-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 02:32:53,247 WARNING [train.py:1204] (2/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_270-210032-0_sp1.1 from training. Duration: 0.963625 2023-10-02 02:32:54,526 WARNING [train.py:1204] (2/4) Exclude cut with ID _6789_210_1534_1_1527857937218_3118950_262-48853-0_sp0.9 from training. Duration: 0.62225 2023-10-02 02:32:54,593 WARNING [train.py:1204] (2/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_847-41334-0 from training. Duration: 0.44 2023-10-02 02:32:56,351 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_40896_210_15710_1_1533433956_7100802_573-46532-0_sp1.1 from training. Duration: 0.92725 2023-10-02 02:32:58,216 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.5.encoder.layers.0.conv_module2.whiten, num_groups=1, num_channels=256, metric=4.95 vs. limit=15.0 2023-10-02 02:32:59,604 WARNING [train.py:1204] (2/4) Exclude cut with ID _21881_210_3525_1_1531825242757_3798380_247-102591-0_sp1.1 from training. Duration: 0.8909375 2023-10-02 02:32:59,637 WARNING [train.py:1204] (2/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_200-21395-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 02:33:06,811 WARNING [train.py:1204] (2/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_711-15688-0 from training. Duration: 0.43 2023-10-02 02:33:08,300 WARNING [train.py:1204] (2/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_53-75025-0_sp1.1 from training. Duration: 0.963625 2023-10-02 02:33:09,691 WARNING [train.py:1204] (2/4) Exclude cut with ID _6194_210_1534_1_1527511826160_3941194_321-148641-0 from training. Duration: 0.64 2023-10-02 02:33:09,721 WARNING [train.py:1204] (2/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_337-294431-0_sp1.1 from training. Duration: 0.936375 2023-10-02 02:33:11,104 WARNING [train.py:1204] (2/4) Exclude cut with ID _21632_210_2253_1_1531994001765_3744140_110-274654-0_sp1.1 from training. Duration: 0.7454375 2023-10-02 02:33:12,782 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.ff2_skip_rate, batch_count=720013.3333333334, ans=0.0 2023-10-02 02:33:13,917 WARNING [train.py:1204] (2/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_135-174080-0_sp1.1 from training. Duration: 0.4818125 2023-10-02 02:33:15,306 WARNING [train.py:1204] (2/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_21-174514-0 from training. Duration: 0.43 2023-10-02 02:33:17,182 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_38965_210_4852_1_1533174906_3484735_215-294589-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 02:33:18,545 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_548-294316-0 from training. Duration: 0.81 2023-10-02 02:33:26,069 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_103-84010-0_sp1.1 from training. Duration: 0.62725 2023-10-02 02:33:28,800 WARNING [train.py:1204] (2/4) Exclude cut with ID _18199_210_6109_1_1531475789864_4288790_295-198247-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 02:33:28,835 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_13757_210_3528_1_1530440682_4107239_103-313224-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 02:33:32,119 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_34886_210_10633_1_1533203882_1755661_67-294673-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 02:33:32,128 WARNING [train.py:1204] (2/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_350-101734-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 02:33:34,189 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_37357_210_17024_1_1533089504_2316121_204-153986-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 02:33:34,429 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.ff2_skip_rate, batch_count=720080.0, ans=0.0 2023-10-02 02:33:35,567 WARNING [train.py:1204] (2/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_673-115569-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 02:33:37,087 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_489-230703-0_sp0.9 from training. Duration: 0.9444375 2023-10-02 02:33:37,143 WARNING [train.py:1204] (2/4) Exclude cut with ID _5250_210_5219_1_1527249561806_8692110_575-102496-0_sp1.1 from training. Duration: 0.936375 2023-10-02 02:33:37,968 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.0.layers.1.whiten, num_groups=1, num_channels=192, metric=4.44 vs. limit=12.0 2023-10-02 02:33:38,482 WARNING [train.py:1204] (2/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_669-93163-0 from training. Duration: 0.98 2023-10-02 02:33:39,854 WARNING [train.py:1204] (2/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_249-281349-0_sp0.9 from training. Duration: 0.9444375 2023-10-02 02:33:41,300 WARNING [train.py:1204] (2/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_106-30782-0 from training. Duration: 0.8 2023-10-02 02:33:41,378 WARNING [train.py:1204] (2/4) Exclude cut with ID _8772_210_1850_1_1528635595972_3664011_398-320049-0_sp1.1 from training. Duration: 0.8 2023-10-02 02:33:42,740 WARNING [train.py:1204] (2/4) Exclude cut with ID _44778_210_2054_1_1533798127203_7443090_516-151727-0_sp1.1 from training. Duration: 0.963625 2023-10-02 02:33:43,532 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.ff2_skip_rate, batch_count=720146.6666666666, ans=0.0 2023-10-02 02:33:44,721 WARNING [train.py:1204] (2/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_325-62802-0_sp1.1 from training. Duration: 0.7909375 2023-10-02 02:33:47,716 WARNING [train.py:1204] (2/4) Exclude cut with ID _14368_210_4220_1_1530532714870_3595529_93-58002-0_sp0.9 from training. Duration: 0.6333125 2023-10-02 02:33:47,756 WARNING [train.py:1204] (2/4) Exclude cut with ID _4681_210_3525_1_1527251040321_1235713_123-24428-0_sp1.1 from training. Duration: 0.436375 2023-10-02 02:33:47,818 WARNING [train.py:1204] (2/4) Exclude cut with ID _5250_210_5219_1_1527249561806_8692110_1185-56473-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 02:33:51,123 WARNING [train.py:1204] (2/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_96-244299-0_sp1.1 from training. Duration: 0.8 2023-10-02 02:33:55,822 WARNING [train.py:1204] (2/4) Exclude cut with ID _59500_210_8254_1_1534809610680_3736009_104-72815-0_sp1.1 from training. Duration: 0.963625 2023-10-02 02:33:56,071 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.feed_forward3.hidden_balancer.prob, batch_count=720213.3333333334, ans=0.125 2023-10-02 02:33:58,587 WARNING [train.py:1204] (2/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_468-283113-0_sp1.1 from training. Duration: 0.87275 2023-10-02 02:33:59,973 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6239_210_4211_1_1527765584_4306698_528-102186-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 02:34:00,049 WARNING [train.py:1204] (2/4) Exclude cut with ID _45297_210_2721_1_1533715351381_3264949_351-147136-0 from training. Duration: 0.87 2023-10-02 02:34:00,054 WARNING [train.py:1204] (2/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_520-51744-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 02:34:01,458 WARNING [train.py:1204] (2/4) Exclude cut with ID _5166_210_2084_1_1527073197160_4087993_310-114452-0_sp1.1 from training. Duration: 0.536375 2023-10-02 02:34:01,464 WARNING [train.py:1204] (2/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_450-165710-0_sp1.1 from training. Duration: 0.92725 2023-10-02 02:34:01,486 WARNING [train.py:1204] (2/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_280-120942-0_sp1.1 from training. Duration: 0.7 2023-10-02 02:34:01,526 WARNING [train.py:1204] (2/4) Exclude cut with ID _65585_210_12210_1_1535450451584_3764098_112-294301-0_sp0.9 from training. Duration: 0.97775 2023-10-02 02:34:03,385 WARNING [train.py:1204] (2/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_98-51676-0_sp0.9 from training. Duration: 0.811125 2023-10-02 02:34:03,657 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.feed_forward1.out_proj.dropout_p, batch_count=720213.3333333334, ans=0.1 2023-10-02 02:34:07,460 INFO [train.py:1046] (2/4) Epoch 21, batch 1800, loss[loss=0.1771, simple_loss=0.2501, pruned_loss=0.05207, over 23647.00 frames. ], tot_loss[loss=0.1746, simple_loss=0.2493, pruned_loss=0.04989, over 4701401.04 frames. ], batch size: 149, lr: 4.92e-03, grad_scale: 16.0 2023-10-02 02:34:07,511 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_33348_210_5632_1_1533086912_3324030_85-28583-0_sp1.1 from training. Duration: 0.7454375 2023-10-02 02:34:07,599 WARNING [train.py:1204] (2/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_336-208232-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 02:34:07,720 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.1.self_attn_weights.pos_emb_skip_rate, batch_count=720280.0, ans=0.0 2023-10-02 02:34:08,947 WARNING [train.py:1204] (2/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_339-104791-0_sp0.9 from training. Duration: 0.5444375 2023-10-02 02:34:10,336 WARNING [train.py:1204] (2/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_559-29840-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 02:34:15,050 WARNING [train.py:1204] (2/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_117-290916-0_sp0.9 from training. Duration: 0.5666875 2023-10-02 02:34:15,134 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_43802_210_2290_1_1533516203_8829504_185-112289-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 02:34:17,658 WARNING [train.py:1204] (2/4) Exclude cut with ID _59256_210_5986_1_1535076085997_3589120_155-298797-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 02:34:20,914 WARNING [train.py:1204] (2/4) Exclude cut with ID _44659_210_2721_1_1534036120061_6614860_246-206531-0_sp1.1 from training. Duration: 0.92725 2023-10-02 02:34:20,961 WARNING [train.py:1204] (2/4) Exclude cut with ID _23818_210_6228_1_1531917063884_3676920_469-151944-0_sp1.1 from training. Duration: 0.92725 2023-10-02 02:34:22,323 WARNING [train.py:1204] (2/4) Exclude cut with ID _13252_210_1925_1_1530671372047_3336909_88-276668-0_sp1.1 from training. Duration: 0.9 2023-10-02 02:34:23,743 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_15101_210_7927_1_1530786415_5033274_320-327527-0_sp1.1 from training. Duration: 0.87275 2023-10-02 02:34:23,751 WARNING [train.py:1204] (2/4) Exclude cut with ID _14998_210_5219_1_1530705582080_7474970_446-112117-0 from training. Duration: 0.9 2023-10-02 02:34:25,748 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_30280_210_1881_1_1532872779_3660139_400-254159-0_sp1.1 from training. Duration: 0.936375 2023-10-02 02:34:28,445 WARNING [train.py:1204] (2/4) Exclude cut with ID _40284_210_2084_1_1533513679814_3704279_158-345341-0_sp1.1 from training. Duration: 0.936375 2023-10-02 02:34:28,732 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.feed_forward1.out_proj.dropout_p, batch_count=720346.6666666666, ans=0.1 2023-10-02 02:34:29,185 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.3.encoder.layers.3.self_attn1.whiten, num_groups=1, num_channels=512, metric=12.32 vs. limit=22.5 2023-10-02 02:34:32,579 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.574e+02 1.823e+02 2.052e+02 2.264e+02 3.155e+02, threshold=4.104e+02, percent-clipped=0.0 2023-10-02 02:34:32,706 WARNING [train.py:1204] (2/4) Exclude cut with ID 3033-130750-0096-55598-0 from training. Duration: 0.83 2023-10-02 02:34:32,878 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.conv_skip_rate, batch_count=720346.6666666666, ans=0.0 2023-10-02 02:34:34,225 WARNING [train.py:1204] (2/4) Exclude cut with ID _9389_210_1664_1_1529217337301_5289269_379-128794-0 from training. Duration: 0.85 2023-10-02 02:34:34,350 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.ff2_skip_rate, batch_count=720346.6666666666, ans=0.0 2023-10-02 02:34:35,517 WARNING [train.py:1204] (2/4) Exclude cut with ID _3675_210_4557_1_1526639934408_4182089_456-18305-0 from training. Duration: 0.76 2023-10-02 02:34:35,554 WARNING [train.py:1204] (2/4) Exclude cut with ID _29853_210_7497_1_1532505600851_6642110_571-249460-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 02:34:35,640 WARNING [train.py:1204] (2/4) Exclude cut with ID _4068_210_2629_1_1526697350395_1546259_230-325568-0_sp1.1 from training. Duration: 0.92725 2023-10-02 02:34:35,641 WARNING [train.py:1204] (2/4) Exclude cut with ID _34415_210_8750_1_1533085213375_3451116_213-267570-0_sp1.1 from training. Duration: 0.963625 2023-10-02 02:34:37,529 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6767_210_3273_1_1527855783_1718932_142-127132-0_sp1.1 from training. Duration: 0.863625 2023-10-02 02:34:44,449 WARNING [train.py:1204] (2/4) Exclude cut with ID IC0653W0001-322925 from training. Duration: 0.896 2023-10-02 02:34:47,583 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_4528_210_3273_1_1527076654_3276777_209-219804-0_sp1.1 from training. Duration: 0.62725 2023-10-02 02:34:48,974 WARNING [train.py:1204] (2/4) Exclude cut with ID _55844_210_2346_1_1534471212569_7625707_187-204971-0_sp1.1 from training. Duration: 0.936375 2023-10-02 02:34:50,876 WARNING [train.py:1204] (2/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_369-325184-0 from training. Duration: 0.99 2023-10-02 02:34:50,901 WARNING [train.py:1204] (2/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_791-45149-0 from training. Duration: 0.86 2023-10-02 02:34:52,190 WARNING [train.py:1204] (2/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_317-211107-0_sp1.1 from training. Duration: 0.836375 2023-10-02 02:34:53,637 WARNING [train.py:1204] (2/4) Exclude cut with ID _30800_210_22382_1_1532499347904_4515959_488-330913-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 02:34:55,062 WARNING [train.py:1204] (2/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_506-42614-0_sp1.1 from training. Duration: 0.7181875 2023-10-02 02:34:57,350 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.feed_forward1.hidden_balancer.prob, batch_count=720480.0, ans=0.125 2023-10-02 02:34:57,396 INFO [scaling.py:1118] (2/4) WithLoss: name=encoder.encoders.3.encoder.layers.3.self_attn_weights, loss-sum=0.000e+00 2023-10-02 02:34:59,762 WARNING [train.py:1204] (2/4) Exclude cut with ID _8696_210_1876_1_1528545632440_3345058_209-54069-0 from training. Duration: 0.67 2023-10-02 02:35:05,472 WARNING [train.py:1204] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_215-202973-0_sp1.1 from training. Duration: 0.8 2023-10-02 02:35:05,549 WARNING [train.py:1204] (2/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_321-215742-0 from training. Duration: 0.5 2023-10-02 02:35:06,929 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_40896_210_15710_1_1533433956_7100802_428-32114-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 02:35:06,936 WARNING [train.py:1204] (2/4) Exclude cut with ID _27013_210_1389_1_1532437173240_3252649_467-263151-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 02:35:08,301 WARNING [train.py:1204] (2/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_734-129761-0_sp0.9 from training. Duration: 0.911125 2023-10-02 02:35:08,354 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_521-214276-0 from training. Duration: 0.44 2023-10-02 02:35:11,229 WARNING [train.py:1204] (2/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_80-147301-0_sp1.1 from training. Duration: 0.763625 2023-10-02 02:35:11,231 WARNING [train.py:1204] (2/4) Exclude cut with ID _9679_210_7497_1_1529129212906_4363929_56-307796-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 02:35:14,525 WARNING [train.py:1204] (2/4) Exclude cut with ID _14832_210_8750_1_1530691351539_3171348_1-54315-0 from training. Duration: 0.87 2023-10-02 02:35:14,525 WARNING [train.py:1204] (2/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_558-155750-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 02:35:15,996 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_142-309370-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 02:35:16,037 WARNING [train.py:1204] (2/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_18-46756-0_sp0.9 from training. Duration: 0.87775 2023-10-02 02:35:16,044 WARNING [train.py:1204] (2/4) Exclude cut with ID _61148_210_6897_1_1535094022726_4217610_279-212159-0_sp1.1 from training. Duration: 0.936375 2023-10-02 02:35:17,434 WARNING [train.py:1204] (2/4) Exclude cut with ID _21987_210_5854_1_1531821500482_3637038_164-2641-0_sp1.1 from training. Duration: 0.936375 2023-10-02 02:35:18,652 WARNING [train.py:1204] (2/4) Exclude cut with ID _8064_210_5399_1_1528372299291_4282973_395-340852-0_sp1.1 from training. Duration: 0.8454375 2023-10-02 02:35:20,579 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1077-184261-0_sp1.1 from training. Duration: 0.963625 2023-10-02 02:35:20,593 WARNING [train.py:1204] (2/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_1176-204992-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 02:35:22,585 INFO [train.py:1046] (2/4) Epoch 21, batch 1850, loss[loss=0.1865, simple_loss=0.2526, pruned_loss=0.06018, over 23793.00 frames. ], tot_loss[loss=0.1751, simple_loss=0.2503, pruned_loss=0.04998, over 4701172.81 frames. ], batch size: 212, lr: 4.92e-03, grad_scale: 16.0 2023-10-02 02:35:24,091 WARNING [train.py:1204] (2/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_563-347832-0_sp1.1 from training. Duration: 0.8090625 2023-10-02 02:35:25,495 WARNING [train.py:1204] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_380-204756-0_sp0.9 from training. Duration: 0.9555625 2023-10-02 02:35:34,060 WARNING [train.py:1204] (2/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_212-10501-0_sp1.1 from training. Duration: 0.8545625 2023-10-02 02:35:34,084 WARNING [train.py:1204] (2/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_445-117398-0 from training. Duration: 0.89 2023-10-02 02:35:38,444 WARNING [train.py:1204] (2/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_29-280622-0 from training. Duration: 0.82 2023-10-02 02:35:42,505 WARNING [train.py:1204] (2/4) Exclude cut with ID _26664_210_7770_1_1532258943280_3418270_187-80815-0 from training. Duration: 0.93 2023-10-02 02:35:45,379 WARNING [train.py:1204] (2/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_414-349640-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 02:35:47,305 WARNING [train.py:1204] (2/4) Exclude cut with ID _45021_210_9994_1_1533619751094_3330288_96-239010-0 from training. Duration: 0.91 2023-10-02 02:35:47,307 WARNING [train.py:1204] (2/4) Exclude cut with ID _5189_210_4557_1_1527248319428_3769229_199-53526-0_sp0.9 from training. Duration: 0.4666875 2023-10-02 02:35:49,718 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.2.encoder.layers.2.nonlin_attention.whiten2, num_groups=1, num_channels=384, metric=5.57 vs. limit=15.0 2023-10-02 02:35:53,724 WARNING [train.py:1204] (2/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_63-101996-0_sp1.1 from training. Duration: 0.8 2023-10-02 02:35:55,015 WARNING [train.py:1204] (2/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_199-86432-0 from training. Duration: 0.98 2023-10-02 02:35:57,676 WARNING [train.py:1204] (2/4) Exclude cut with ID _36399_210_16049_1_1532998771377_3698333_207-259013-0_sp1.1 from training. Duration: 0.92725 2023-10-02 02:35:57,704 WARNING [train.py:1204] (2/4) Exclude cut with ID _62574_210_2655_1_1535180407058_3933249_567-111756-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 02:36:00,527 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.attention_skip_rate, batch_count=720746.6666666666, ans=0.0 2023-10-02 02:36:02,244 WARNING [train.py:1204] (2/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_436-318113-0 from training. Duration: 0.75 2023-10-02 02:36:02,297 WARNING [train.py:1204] (2/4) Exclude cut with ID _53379_210_20034_1_1534222027363_3854654_43-305214-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 02:36:02,319 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_15126_210_6753_1_1530778677_2591185_113-157651-0_sp1.1 from training. Duration: 0.6181875 2023-10-02 02:36:03,743 WARNING [train.py:1204] (2/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_146-218763-0_sp1.1 from training. Duration: 0.7818125 2023-10-02 02:36:05,237 WARNING [train.py:1204] (2/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_714-310538-0_sp0.9 from training. Duration: 0.9555625 2023-10-02 02:36:08,011 WARNING [train.py:1204] (2/4) Exclude cut with ID _24271_210_12658_1_1531987270022_7190519_709-16721-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 02:36:10,784 WARNING [train.py:1204] (2/4) Exclude cut with ID _5190_210_4557_1_1527244368799_373439_28-197030-0_sp0.9 from training. Duration: 0.8 2023-10-02 02:36:10,813 WARNING [train.py:1204] (2/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_267-197536-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 02:36:10,835 WARNING [train.py:1204] (2/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_658-282610-0_sp1.1 from training. Duration: 0.4818125 2023-10-02 02:36:10,843 WARNING [train.py:1204] (2/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_257-67050-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 02:36:13,568 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_14906_210_7927_1_1530754190_9408633_961-53402-0_sp1.1 from training. Duration: 0.936375 2023-10-02 02:36:15,656 WARNING [train.py:1204] (2/4) Exclude cut with ID _60113_210_16271_1_1535164212481_4386079_490-280290-0_sp1.1 from training. Duration: 0.8909375 2023-10-02 02:36:18,528 WARNING [train.py:1204] (2/4) Exclude cut with ID _14100_210_8341_1_1530529320613_3678069_258-224599-0 from training. Duration: 0.77 2023-10-02 02:36:18,583 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6961_210_2087_1_1527937168_6815011_565-326898-0_sp1.1 from training. Duration: 0.936375 2023-10-02 02:36:23,292 WARNING [train.py:1204] (2/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_239-316802-0_sp1.1 from training. Duration: 0.62725 2023-10-02 02:36:23,358 WARNING [train.py:1204] (2/4) Exclude cut with ID _55857_210_2346_1_1534644007831_6898209_745-98391-0_sp1.1 from training. Duration: 0.6909375 2023-10-02 02:36:23,361 WARNING [train.py:1204] (2/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_154-135696-0 from training. Duration: 0.8 2023-10-02 02:36:23,363 WARNING [train.py:1204] (2/4) Exclude cut with ID _14368_210_4220_1_1530532714870_3595529_93-58002-0 from training. Duration: 0.57 2023-10-02 02:36:26,044 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9025W0005-507335 from training. Duration: 0.896 2023-10-02 02:36:26,111 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9029W0034-509435 from training. Duration: 0.64 2023-10-02 02:36:28,826 WARNING [train.py:1204] (2/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_76-203867-0_sp1.1 from training. Duration: 0.6545625 2023-10-02 02:36:28,835 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_46639_210_8341_1_1533726088_4495500_66-346325-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 02:36:28,841 WARNING [train.py:1204] (2/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_145-137790-0_sp0.9 from training. Duration: 0.92225 2023-10-02 02:36:28,857 WARNING [train.py:1204] (2/4) Exclude cut with ID _36522_210_8887_1_1533177030611_1772478_7-327299-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 02:36:30,291 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9042W0009-515615 from training. Duration: 0.896 2023-10-02 02:36:30,307 WARNING [train.py:1204] (2/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_39-248870-0_sp0.9 from training. Duration: 0.7666875 2023-10-02 02:36:31,647 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_35441_210_16554_1_1533347572_4092680_362-242912-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 02:36:31,727 WARNING [train.py:1204] (2/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_216-343007-0_sp1.1 from training. Duration: 0.6 2023-10-02 02:36:32,483 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.3.encoder.layers.0.feed_forward1.out_whiten, num_groups=1, num_channels=512, metric=9.00 vs. limit=15.0 2023-10-02 02:36:33,105 WARNING [train.py:1204] (2/4) Exclude cut with ID _3867_210_1883_1_1526641200575_4475830_272-215563-0_sp0.9 from training. Duration: 0.6333125 2023-10-02 02:36:34,451 WARNING [train.py:1204] (2/4) Exclude cut with ID _21783_210_11357_1_1531742458730_3762679_553-175603-0_sp1.1 from training. Duration: 0.92725 2023-10-02 02:36:34,468 WARNING [train.py:1204] (2/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_963-187109-0 from training. Duration: 0.99 2023-10-02 02:36:36,493 INFO [train.py:1046] (2/4) Epoch 21, batch 1900, loss[loss=0.1897, simple_loss=0.2587, pruned_loss=0.0604, over 23669.00 frames. ], tot_loss[loss=0.1761, simple_loss=0.2513, pruned_loss=0.05042, over 4711226.94 frames. ], batch size: 256, lr: 4.91e-03, grad_scale: 16.0 2023-10-02 02:36:38,047 WARNING [train.py:1204] (2/4) Exclude cut with ID _44973_210_1511_1_1533794381163_3634770_495-211466-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 02:36:38,066 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9068W0002-529085 from training. Duration: 0.896 2023-10-02 02:36:38,080 WARNING [train.py:1204] (2/4) Exclude cut with ID _5294_210_1747_1_1527249643928_3696179_180-100713-0_sp1.1 from training. Duration: 0.736375 2023-10-02 02:36:39,509 WARNING [train.py:1204] (2/4) Exclude cut with ID _35799_210_5854_1_1533263487887_3529071_149-49689-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 02:36:43,940 WARNING [train.py:1204] (2/4) Exclude cut with ID _67725_210_27767_1_1535608756671_5105359_36-220794-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 02:36:47,063 WARNING [train.py:1204] (2/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_338-96520-0_sp1.1 from training. Duration: 0.8545625 2023-10-02 02:36:47,126 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9104W0004-547160 from training. Duration: 0.896 2023-10-02 02:36:48,497 WARNING [train.py:1204] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_330-327861-0 from training. Duration: 0.67 2023-10-02 02:36:49,936 WARNING [train.py:1204] (2/4) Exclude cut with ID _14087_210_2253_1_1530529224512_3014029_156-257607-0_sp0.9 from training. Duration: 0.92225 2023-10-02 02:36:49,992 WARNING [train.py:1204] (2/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_476-322050-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 02:36:51,229 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9118W0017-553385 from training. Duration: 0.896 2023-10-02 02:36:51,262 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9120W0028-554435 from training. Duration: 0.896 2023-10-02 02:36:54,474 WARNING [train.py:1204] (2/4) Exclude cut with ID _56524_210_9642_1_1534572011779_3619170_145-310842-0 from training. Duration: 0.99 2023-10-02 02:36:57,752 WARNING [train.py:1204] (2/4) Exclude cut with ID _51631_210_8254_1_1534129228420_4084794_270-222508-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 02:37:00,603 WARNING [train.py:1204] (2/4) Exclude cut with ID _14268_210_9206_1_1530622771285_4714099_331-105351-0 from training. Duration: 0.65 2023-10-02 02:37:01,998 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.526e+02 1.796e+02 1.986e+02 2.247e+02 3.290e+02, threshold=3.972e+02, percent-clipped=0.0 2023-10-02 02:37:02,173 WARNING [train.py:1204] (2/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_40-129756-0 from training. Duration: 0.81 2023-10-02 02:37:07,977 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.conv_module2.balancer2.prob, batch_count=721080.0, ans=0.125 2023-10-02 02:37:12,347 WARNING [train.py:1204] (2/4) Exclude cut with ID _3541_210_2477_1_1526475408709_3263973_231-104736-0 from training. Duration: 0.78 2023-10-02 02:37:16,383 WARNING [train.py:1204] (2/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_214-291369-0 from training. Duration: 0.63 2023-10-02 02:37:16,391 WARNING [train.py:1204] (2/4) Exclude cut with ID _62574_210_2655_1_1535180407058_3933249_369-84347-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 02:37:17,748 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9210W0111-599240 from training. Duration: 0.896 2023-10-02 02:37:17,752 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_92-246270-0 from training. Duration: 0.85 2023-10-02 02:37:17,792 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_37152_210_9697_1_1533106573_3844188_29-239070-0 from training. Duration: 0.98 2023-10-02 02:37:17,847 WARNING [train.py:1204] (2/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_239-22076-0 from training. Duration: 0.85 2023-10-02 02:37:17,850 WARNING [train.py:1204] (2/4) Exclude cut with ID _65962_210_29927_1_1535436026519_4174930_368-44639-0_sp1.1 from training. Duration: 0.97275 2023-10-02 02:37:22,740 WARNING [train.py:1204] (2/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_152-212533-0 from training. Duration: 0.97 2023-10-02 02:37:25,911 WARNING [train.py:1204] (2/4) Exclude cut with ID _14603_210_4220_1_1530615688441_3097319_90-31383-0_sp1.1 from training. Duration: 0.7909375 2023-10-02 02:37:30,549 WARNING [train.py:1204] (2/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_196-152868-0_sp1.1 from training. Duration: 0.92725 2023-10-02 02:37:30,551 WARNING [train.py:1204] (2/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_140-181176-0 from training. Duration: 0.92 2023-10-02 02:37:30,695 WARNING [train.py:1204] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_168-32906-0_sp0.9 from training. Duration: 0.7666875 2023-10-02 02:37:34,809 WARNING [train.py:1204] (2/4) Exclude cut with ID _14922_210_6228_1_1530707435273_3693310_13-165512-0 from training. Duration: 0.82 2023-10-02 02:37:34,852 WARNING [train.py:1204] (2/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_381-317791-0_sp0.9 from training. Duration: 0.811125 2023-10-02 02:37:40,628 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.1.balancer1.prob, batch_count=721213.3333333334, ans=0.125 2023-10-02 02:37:42,320 WARNING [train.py:1204] (2/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_63-267585-0_sp1.1 from training. Duration: 0.8181875 2023-10-02 02:37:42,330 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_326-303945-0_sp1.1 from training. Duration: 0.9 2023-10-02 02:37:42,342 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_194-143381-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 02:37:43,714 WARNING [train.py:1204] (2/4) Exclude cut with ID _39290_210_4220_1_1533862778760_3075118_194-16667-0_sp1.1 from training. Duration: 0.963625 2023-10-02 02:37:43,813 WARNING [train.py:1204] (2/4) Exclude cut with ID _15012_210_1968_1_1530856820429_5095350_662-210469-0_sp0.9 from training. Duration: 0.6333125 2023-10-02 02:37:43,837 WARNING [train.py:1204] (2/4) Exclude cut with ID _4697_210_2069_1_1526815801953_3823098_68-219807-0_sp1.1 from training. Duration: 0.42725 2023-10-02 02:37:44,027 INFO [scaling.py:1118] (2/4) WithLoss: name=encoder.encoders.1.encoder.layers.1.self_attn_weights, loss-sum=0.000e+00 2023-10-02 02:37:45,219 WARNING [train.py:1204] (2/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_321-22821-0_sp0.9 from training. Duration: 0.988875 2023-10-02 02:37:47,997 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_14056_210_4163_1_1530604483_7572827_1037-45317-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 02:37:47,998 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_403-39619-0_sp1.1 from training. Duration: 0.763625 2023-10-02 02:37:49,487 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_20120_210_6753_1_1531643035_5482859_510-96996-0_sp0.9 from training. Duration: 0.9666875 2023-10-02 02:37:49,490 WARNING [train.py:1204] (2/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_225-180155-0_sp1.1 from training. Duration: 0.92725 2023-10-02 02:37:49,534 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_278-272612-0_sp0.9 from training. Duration: 0.811125 2023-10-02 02:37:50,867 INFO [train.py:1046] (2/4) Epoch 21, batch 1950, loss[loss=0.244, simple_loss=0.3006, pruned_loss=0.0937, over 19133.00 frames. ], tot_loss[loss=0.1774, simple_loss=0.2525, pruned_loss=0.05109, over 4711099.92 frames. ], batch size: 388, lr: 4.91e-03, grad_scale: 16.0 2023-10-02 02:37:50,996 WARNING [train.py:1204] (2/4) Exclude cut with ID _53713_210_24116_1_1534314487762_7416751_691-70244-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 02:37:54,307 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6788_210_1388_1_1527898698_4562021_91-197981-0_sp1.1 from training. Duration: 0.7545625 2023-10-02 02:37:57,434 WARNING [train.py:1204] (2/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_60-18123-0_sp0.9 from training. Duration: 0.97775 2023-10-02 02:37:57,485 WARNING [train.py:1204] (2/4) Exclude cut with ID _35799_210_5854_1_1533263487887_3529071_5-214617-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 02:37:57,489 WARNING [train.py:1204] (2/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_890-5117-0_sp1.1 from training. Duration: 0.7090625 2023-10-02 02:38:00,801 WARNING [train.py:1204] (2/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_660-211088-0 from training. Duration: 0.62 2023-10-02 02:38:00,874 WARNING [train.py:1204] (2/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_314-126819-0_sp0.9 from training. Duration: 0.5555625 2023-10-02 02:38:00,897 WARNING [train.py:1204] (2/4) Exclude cut with ID _21632_210_2253_1_1531994001765_3744140_362-66225-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 02:38:02,270 WARNING [train.py:1204] (2/4) Exclude cut with ID _61118_210_9527_1_1534897668884_3752630_173-258555-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 02:38:05,225 WARNING [train.py:1204] (2/4) Exclude cut with ID _7719_210_3525_1_1528456153163_4296960_88-250259-0_sp0.9 from training. Duration: 0.9333125 2023-10-02 02:38:05,247 WARNING [train.py:1204] (2/4) Exclude cut with ID _15140_210_9288_1_1530791388450_4880149_539-153708-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 02:38:06,574 WARNING [train.py:1204] (2/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_253-42544-0_sp1.1 from training. Duration: 0.97275 2023-10-02 02:38:08,031 WARNING [train.py:1204] (2/4) Exclude cut with ID _33640_210_2655_1_1533117605479_4855469_479-296877-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 02:38:12,051 WARNING [train.py:1204] (2/4) Exclude cut with ID 3033-130750-0096-55598-0_sp1.1 from training. Duration: 0.7545625 2023-10-02 02:38:12,061 WARNING [train.py:1204] (2/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_191-41023-0_sp1.1 from training. Duration: 0.6454375 2023-10-02 02:38:12,076 WARNING [train.py:1204] (2/4) Exclude cut with ID _8365_210_2064_1_1528592311070_4677690_431-294102-0_sp1.1 from training. Duration: 0.8090625 2023-10-02 02:38:12,111 WARNING [train.py:1204] (2/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_893-296030-0_sp1.1 from training. Duration: 0.97275 2023-10-02 02:38:15,470 WARNING [train.py:1204] (2/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_401-343820-0_sp1.1 from training. Duration: 0.97275 2023-10-02 02:38:18,171 WARNING [train.py:1204] (2/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_127-566-0_sp1.1 from training. Duration: 0.763625 2023-10-02 02:38:18,173 WARNING [train.py:1204] (2/4) Exclude cut with ID _14100_210_8341_1_1530529320613_3678069_126-272847-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 02:38:18,190 WARNING [train.py:1204] (2/4) Exclude cut with ID _15133_210_6228_1_1530794512074_3080379_29-264458-0_sp1.1 from training. Duration: 0.52725 2023-10-02 02:38:18,191 WARNING [train.py:1204] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_127-119109-0 from training. Duration: 0.72 2023-10-02 02:38:19,617 WARNING [train.py:1204] (2/4) Exclude cut with ID _6537_210_1883_1_1527987559710_3597129_382-133062-0_sp1.1 from training. Duration: 0.6181875 2023-10-02 02:38:20,872 WARNING [train.py:1204] (2/4) Exclude cut with ID _29529_210_7234_1_1532393961455_3977460_391-219682-0_sp1.1 from training. Duration: 0.936375 2023-10-02 02:38:20,920 WARNING [train.py:1204] (2/4) Exclude cut with ID _37537_210_2253_1_1533171758568_3421949_96-137189-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 02:38:23,767 WARNING [train.py:1204] (2/4) Exclude cut with ID _58414_210_8254_1_1535421609162_7151182_15-276301-0_sp1.1 from training. Duration: 0.97275 2023-10-02 02:38:25,562 WARNING [train.py:1204] (2/4) Exclude cut with ID _59502_210_12210_1_1535107024698_1835139_110-289074-0_sp1.1 from training. Duration: 0.92725 2023-10-02 02:38:30,836 WARNING [train.py:1204] (2/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_367-61330-0_sp1.1 from training. Duration: 0.6818125 2023-10-02 02:38:33,616 WARNING [train.py:1204] (2/4) Exclude cut with ID _45297_210_2721_1_1533715351381_3264949_353-9541-0_sp1.1 from training. Duration: 0.8818125 2023-10-02 02:38:33,641 WARNING [train.py:1204] (2/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_85-138257-0_sp0.9 from training. Duration: 0.788875 2023-10-02 02:38:33,673 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_3787_210_2040_1_1526477544_5478586_603-120324-0 from training. Duration: 0.81 2023-10-02 02:38:33,714 WARNING [train.py:1204] (2/4) Exclude cut with ID _42863_210_9070_1_1533902398788_3696920_411-204131-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 02:38:40,392 WARNING [train.py:1204] (2/4) Exclude cut with ID _30196_210_1664_1_1533708496522_7074140_822-289909-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 02:38:40,480 WARNING [train.py:1204] (2/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_32-335103-0_sp0.9 from training. Duration: 0.911125 2023-10-02 02:38:41,846 WARNING [train.py:1204] (2/4) Exclude cut with ID _6493_210_1747_1_1528459210424_4012470_513-140967-0_sp0.9 from training. Duration: 0.87775 2023-10-02 02:38:45,378 INFO [scaling.py:1118] (2/4) WithLoss: name=encoder.encoders.4.encoder.layers.0.self_attn_weights, loss-sum=0.000e+00 2023-10-02 02:38:49,423 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_47896_210_20508_1_1534492518_5958868_439-257766-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 02:38:50,799 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_1294-63152-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 02:38:52,270 WARNING [train.py:1204] (2/4) Exclude cut with ID _59170_210_6721_1_1534983633960_4203335_319-166512-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 02:38:56,716 WARNING [train.py:1204] (2/4) Exclude cut with ID _3541_210_2477_1_1526475408709_3263973_146-3470-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 02:38:59,954 WARNING [train.py:1204] (2/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_678-159307-0_sp1.1 from training. Duration: 0.77275 2023-10-02 02:39:00,006 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_343-48822-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 02:39:01,368 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_4692_210_3528_1_1526899755_4323254_107-44401-0 from training. Duration: 0.86 2023-10-02 02:39:01,373 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6291_210_3273_1_1527683649_1078498_74-292608-0_sp1.1 from training. Duration: 0.6545625 2023-10-02 02:39:02,674 WARNING [train.py:1204] (2/4) Exclude cut with ID _55644_210_11856_1_1534593599582_4103719_293-248694-0_sp1.1 from training. Duration: 0.97275 2023-10-02 02:39:04,733 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_3172_210_2087_1_1526292929_3987865_197-97762-0 from training. Duration: 0.75 2023-10-02 02:39:06,060 INFO [train.py:1046] (2/4) Epoch 21, batch 2000, loss[loss=0.1815, simple_loss=0.2708, pruned_loss=0.04608, over 24629.00 frames. ], tot_loss[loss=0.1774, simple_loss=0.2524, pruned_loss=0.0512, over 4711086.14 frames. ], batch size: 73, lr: 4.91e-03, grad_scale: 32.0 2023-10-02 02:39:06,172 WARNING [train.py:1204] (2/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_264-348896-0_sp0.9 from training. Duration: 0.9444375 2023-10-02 02:39:10,142 WARNING [train.py:1204] (2/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_209-205365-0_sp0.9 from training. Duration: 0.87775 2023-10-02 02:39:10,214 WARNING [train.py:1204] (2/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_222-198127-0_sp0.9 from training. Duration: 0.9333125 2023-10-02 02:39:10,250 WARNING [train.py:1204] (2/4) Exclude cut with ID _57572_210_5517_1_1534921219624_3809484_52-43530-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 02:39:11,676 WARNING [train.py:1204] (2/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_96-320736-0_sp1.1 from training. Duration: 0.7818125 2023-10-02 02:39:14,234 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_13091_210_7925_1_1530609447_8987354_1159-225508-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 02:39:17,691 WARNING [train.py:1204] (2/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_237-44332-0 from training. Duration: 0.94 2023-10-02 02:39:17,743 WARNING [train.py:1204] (2/4) Exclude cut with ID _7970_210_1883_1_1528455616296_3472230_25-178407-0_sp0.9 from training. Duration: 0.788875 2023-10-02 02:39:20,809 WARNING [train.py:1204] (2/4) Exclude cut with ID _29003_210_19596_1_1532507391720_3894098_357-257062-0_sp1.1 from training. Duration: 0.936375 2023-10-02 02:39:22,227 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_809-17156-0 from training. Duration: 0.73 2023-10-02 02:39:23,663 WARNING [train.py:1204] (2/4) Exclude cut with ID _7160_210_1876_1_1527940923231_3703799_302-150728-0_sp1.1 from training. Duration: 0.7090625 2023-10-02 02:39:23,670 WARNING [train.py:1204] (2/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_444-28041-0_sp0.9 from training. Duration: 0.9444375 2023-10-02 02:39:25,113 WARNING [train.py:1204] (2/4) Exclude cut with ID _52892_210_6721_1_1534378941259_4124129_278-251666-0_sp1.1 from training. Duration: 0.92725 2023-10-02 02:39:26,886 WARNING [train.py:1204] (2/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_115-326778-0 from training. Duration: 0.93 2023-10-02 02:39:26,983 WARNING [train.py:1204] (2/4) Exclude cut with ID _41391_210_7994_1_1533603771872_3638201_31-188961-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 02:39:28,957 WARNING [train.py:1204] (2/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_1073-6264-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 02:39:28,982 WARNING [train.py:1204] (2/4) Exclude cut with ID _66868_210_9527_1_1535509287954_4561140_435-259515-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 02:39:30,399 WARNING [train.py:1204] (2/4) Exclude cut with ID _14886_210_4220_1_1530702020355_3516190_159-73748-0 from training. Duration: 0.83 2023-10-02 02:39:31,579 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.495e+02 1.854e+02 2.083e+02 2.376e+02 3.627e+02, threshold=4.166e+02, percent-clipped=0.0 2023-10-02 02:39:31,673 WARNING [train.py:1204] (2/4) Exclude cut with ID _15150_210_8341_1_1530752411803_7256384_469-239509-0_sp1.1 from training. Duration: 0.6181875 2023-10-02 02:39:34,977 WARNING [train.py:1204] (2/4) Exclude cut with ID _8064_210_5399_1_1528372299291_4282973_395-340852-0 from training. Duration: 0.93 2023-10-02 02:39:34,981 WARNING [train.py:1204] (2/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_138-187031-0_sp1.1 from training. Duration: 0.9 2023-10-02 02:39:37,939 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_13757_210_3528_1_1530440682_4107239_48-106347-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 02:39:37,996 WARNING [train.py:1204] (2/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_164-224052-0_sp1.1 from training. Duration: 0.636375 2023-10-02 02:39:38,007 WARNING [train.py:1204] (2/4) Exclude cut with ID _59288_210_5854_1_1535077757774_3412246_408-322648-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 02:39:39,286 WARNING [train.py:1204] (2/4) Exclude cut with ID _30191_210_1664_1_1533189915047_7196670_827-36359-0_sp1.1 from training. Duration: 0.963625 2023-10-02 02:39:40,677 WARNING [train.py:1204] (2/4) Exclude cut with ID _4680_210_3525_1_1527249757953_893386_75-78979-0_sp1.1 from training. Duration: 0.82725 2023-10-02 02:39:40,743 WARNING [train.py:1204] (2/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_383-165234-0 from training. Duration: 0.94 2023-10-02 02:39:41,481 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.3.encoder.layers.1.conv_module2.whiten, num_groups=1, num_channels=512, metric=3.25 vs. limit=15.0 2023-10-02 02:39:43,536 WARNING [train.py:1204] (2/4) Exclude cut with ID _19896_210_1883_1_1531709022662_6649569_94-229927-0 from training. Duration: 0.97 2023-10-02 02:39:43,550 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6788_210_1388_1_1527898698_4562021_180-75288-0_sp1.1 from training. Duration: 0.9 2023-10-02 02:39:43,557 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_26433_210_12108_1_1532157158_4056480_54-321349-0_sp1.1 from training. Duration: 0.97275 2023-10-02 02:39:48,413 WARNING [train.py:1204] (2/4) Exclude cut with ID _56108_210_16380_1_1534505446159_3887959_265-161493-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 02:39:49,821 WARNING [train.py:1204] (2/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_395-111602-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 02:39:51,048 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_154-316450-0_sp0.9 from training. Duration: 0.8444375 2023-10-02 02:39:51,126 WARNING [train.py:1204] (2/4) Exclude cut with ID _43552_210_8913_1_1533726031059_3523429_381-116041-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 02:39:53,812 WARNING [train.py:1204] (2/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_65-104124-0_sp1.1 from training. Duration: 0.963625 2023-10-02 02:39:53,883 WARNING [train.py:1204] (2/4) Exclude cut with ID _6536_210_1883_1_1527850783339_3319069_169-23811-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 02:39:55,674 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_289-180904-0_sp0.9 from training. Duration: 0.8444375 2023-10-02 02:39:55,684 WARNING [train.py:1204] (2/4) Exclude cut with ID _56406_210_9288_1_1534899618664_3496149_226-242400-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 02:39:57,090 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_24899_210_16554_1_1532046422_3827462_276-216330-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 02:39:57,872 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.0.balancer2.prob, batch_count=721813.3333333334, ans=0.125 2023-10-02 02:40:00,261 WARNING [train.py:1204] (2/4) Exclude cut with ID _29345_210_4381_1_1532689168263_3860169_198-174328-0_sp1.1 from training. Duration: 0.82725 2023-10-02 02:40:00,320 WARNING [train.py:1204] (2/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_338-96520-0 from training. Duration: 0.94 2023-10-02 02:40:05,834 WARNING [train.py:1204] (2/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_7-111954-0_sp1.1 from training. Duration: 0.5090625 2023-10-02 02:40:05,928 WARNING [train.py:1204] (2/4) Exclude cut with ID _36692_210_5665_1_1533608967420_398809_8-16261-0_sp1.1 from training. Duration: 0.97275 2023-10-02 02:40:10,571 WARNING [train.py:1204] (2/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_86-212657-0_sp1.1 from training. Duration: 0.97275 2023-10-02 02:40:10,578 WARNING [train.py:1204] (2/4) Exclude cut with ID _44659_210_2721_1_1534036120061_6614860_626-298884-0_sp1.1 from training. Duration: 0.8818125 2023-10-02 02:40:13,007 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.0.layers.0.conv_module1.whiten, num_groups=1, num_channels=192, metric=10.65 vs. limit=15.0 2023-10-02 02:40:13,391 WARNING [train.py:1204] (2/4) Exclude cut with ID _49755_210_18053_1_1534581005846_3218709_120-257918-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 02:40:14,829 WARNING [train.py:1204] (2/4) Exclude cut with ID _21881_210_3525_1_1531825242757_3798380_400-113821-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 02:40:14,832 WARNING [train.py:1204] (2/4) Exclude cut with ID _21783_210_11357_1_1531742458730_3762679_387-236773-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 02:40:16,183 WARNING [train.py:1204] (2/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_613-232856-0_sp1.1 from training. Duration: 0.4909375 2023-10-02 02:40:16,211 WARNING [train.py:1204] (2/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_181-78632-0_sp0.9 from training. Duration: 0.8666875 2023-10-02 02:40:19,379 WARNING [train.py:1204] (2/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_175-17485-0_sp1.1 from training. Duration: 0.97275 2023-10-02 02:40:20,611 INFO [train.py:1046] (2/4) Epoch 21, batch 2050, loss[loss=0.1598, simple_loss=0.2174, pruned_loss=0.05105, over 23448.00 frames. ], tot_loss[loss=0.1768, simple_loss=0.2517, pruned_loss=0.05096, over 4709426.90 frames. ], batch size: 285, lr: 4.91e-03, grad_scale: 32.0 2023-10-02 02:40:20,718 WARNING [train.py:1204] (2/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_1004-183422-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 02:40:23,564 WARNING [train.py:1204] (2/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_32-65962-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 02:40:24,925 WARNING [train.py:1204] (2/4) Exclude cut with ID _15458_210_8341_1_1530858614263_3834410_40-298664-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 02:40:29,150 WARNING [train.py:1204] (2/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_380-261863-0_sp1.1 from training. Duration: 0.963625 2023-10-02 02:40:30,520 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_372-173012-0_sp1.1 from training. Duration: 0.863625 2023-10-02 02:40:31,867 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_57-82205-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 02:40:33,111 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_3691_210_3528_1_1526468169_4062053_355-334432-0_sp0.9 from training. Duration: 0.9666875 2023-10-02 02:40:34,535 WARNING [train.py:1204] (2/4) Exclude cut with ID _19023_210_4353_1_1531378892552_5820780_28-36615-0 from training. Duration: 0.97 2023-10-02 02:40:34,542 WARNING [train.py:1204] (2/4) Exclude cut with ID _29853_210_7497_1_1532505600851_6642110_286-251333-0_sp1.1 from training. Duration: 0.92725 2023-10-02 02:40:36,047 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.balancer1.prob, batch_count=722013.3333333334, ans=0.125 2023-10-02 02:40:37,174 WARNING [train.py:1204] (2/4) Exclude cut with ID _31602_210_5854_1_1532677021456_4458250_518-80679-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 02:40:37,218 WARNING [train.py:1204] (2/4) Exclude cut with ID _14922_210_6228_1_1530707435273_3693310_38-187851-0_sp0.9 from training. Duration: 0.688875 2023-10-02 02:40:42,203 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.conv_skip_rate, batch_count=722013.3333333334, ans=0.0 2023-10-02 02:40:44,871 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=722013.3333333334, ans=0.1 2023-10-02 02:40:46,171 WARNING [train.py:1204] (2/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_39-248870-0_sp1.1 from training. Duration: 0.62725 2023-10-02 02:40:46,187 WARNING [train.py:1204] (2/4) Exclude cut with ID _16908_210_5724_1_1531119157248_3772700_154-31455-0_sp1.1 from training. Duration: 0.97275 2023-10-02 02:40:49,016 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8453_210_4211_1_1528502940_8562730_347-330673-0 from training. Duration: 0.72 2023-10-02 02:40:50,888 WARNING [train.py:1204] (2/4) Exclude cut with ID _30191_210_1664_1_1533189915047_7196670_674-204247-0_sp1.1 from training. Duration: 0.97275 2023-10-02 02:40:52,194 WARNING [train.py:1204] (2/4) Exclude cut with ID _8833_210_5219_1_1528592665210_7553059_329-125156-0 from training. Duration: 0.82 2023-10-02 02:40:53,492 WARNING [train.py:1204] (2/4) Exclude cut with ID _4779_210_2084_1_1527318022944_3914893_500-285013-0_sp1.1 from training. Duration: 0.62725 2023-10-02 02:40:56,303 WARNING [train.py:1204] (2/4) Exclude cut with ID _14421_210_8750_1_1530604795825_3542290_190-313578-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 02:40:58,143 WARNING [train.py:1204] (2/4) Exclude cut with ID _35799_210_5854_1_1533263487887_3529071_538-273574-0_sp1.1 from training. Duration: 0.936375 2023-10-02 02:40:58,231 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_14928_210_5632_1_1530773461_4714000_454-112198-0_sp1.1 from training. Duration: 0.6 2023-10-02 02:41:00,098 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_468-143126-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 02:41:01,583 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_4528_210_3273_1_1527076654_3276777_258-121681-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 02:41:02,978 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_397-132565-0_sp1.1 from training. Duration: 0.9 2023-10-02 02:41:04,297 WARNING [train.py:1204] (2/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_454-123002-0_sp0.9 from training. Duration: 0.7666875 2023-10-02 02:41:05,800 WARNING [train.py:1204] (2/4) Exclude cut with ID _27052_210_5986_1_1532329197187_3585009_297-86890-0_sp1.1 from training. Duration: 0.936375 2023-10-02 02:41:07,217 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_51225_210_3601_1_1534400634_2327383_55-301844-0_sp1.1 from training. Duration: 0.8454375 2023-10-02 02:41:09,253 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6786_210_1388_1_1527839699_4194495_378-46070-0_sp0.9 from training. Duration: 0.8 2023-10-02 02:41:10,630 WARNING [train.py:1204] (2/4) Exclude cut with ID _42334_210_7770_1_1534206610396_3605880_55-19758-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 02:41:14,657 WARNING [train.py:1204] (2/4) Exclude cut with ID _14884_210_2252_1_1530707459689_5561250_114-201399-0_sp0.9 from training. Duration: 0.8444375 2023-10-02 02:41:14,925 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.bypass.scale_min, batch_count=722146.6666666666, ans=0.2 2023-10-02 02:41:20,416 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_99-251314-0_sp0.9 from training. Duration: 0.9666875 2023-10-02 02:41:22,294 WARNING [train.py:1204] (2/4) Exclude cut with ID _26528_210_9070_1_1532570410842_3851310_497-97218-0 from training. Duration: 0.86 2023-10-02 02:41:27,839 WARNING [train.py:1204] (2/4) Exclude cut with ID _55328_210_27767_1_1534580973260_4088170_230-317241-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 02:41:27,903 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_125-203105-0_sp0.9 from training. Duration: 0.888875 2023-10-02 02:41:28,472 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.4.encoder.layers.1.self_attn2.whiten, num_groups=1, num_channels=384, metric=12.52 vs. limit=22.5 2023-10-02 02:41:30,036 WARNING [train.py:1204] (2/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_55-31904-0_sp1.1 from training. Duration: 0.8 2023-10-02 02:41:32,624 WARNING [train.py:1204] (2/4) Exclude cut with ID _8064_210_5399_1_1528372299291_4282973_18-174647-0 from training. Duration: 0.91 2023-10-02 02:41:32,808 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.conv_module1.balancer1.prob, batch_count=722213.3333333334, ans=0.125 2023-10-02 02:41:35,354 INFO [train.py:1046] (2/4) Epoch 21, batch 2100, loss[loss=0.1667, simple_loss=0.2485, pruned_loss=0.04243, over 24475.00 frames. ], tot_loss[loss=0.1754, simple_loss=0.2503, pruned_loss=0.05022, over 4711413.43 frames. ], batch size: 66, lr: 4.91e-03, grad_scale: 32.0 2023-10-02 02:41:35,448 WARNING [train.py:1204] (2/4) Exclude cut with ID IC0165W0005-81726 from training. Duration: 0.952 2023-10-02 02:41:35,449 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_31583_210_3852_1_1532595851_4024413_148-171847-0_sp1.1 from training. Duration: 0.963625 2023-10-02 02:41:35,499 WARNING [train.py:1204] (2/4) Exclude cut with ID _21989_210_5854_1_1531899214931_3271360_116-138769-0_sp1.1 from training. Duration: 0.936375 2023-10-02 02:41:36,846 WARNING [train.py:1204] (2/4) Exclude cut with ID _66070_210_7640_1_1535441393096_7805310_489-292631-0_sp1.1 from training. Duration: 0.7181875 2023-10-02 02:41:36,918 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_20120_210_6753_1_1531643035_5482859_202-27960-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 02:41:36,932 WARNING [train.py:1204] (2/4) Exclude cut with ID _5294_210_1747_1_1527249643928_3696179_342-270278-0 from training. Duration: 0.55 2023-10-02 02:41:38,750 WARNING [train.py:1204] (2/4) Exclude cut with ID _38323_210_12108_1_1533121233369_3303049_416-11919-0 from training. Duration: 0.96 2023-10-02 02:41:38,871 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6545_210_2040_1_1527774670_4245258_407-48070-0_sp0.9 from training. Duration: 0.8444375 2023-10-02 02:41:41,600 WARNING [train.py:1204] (2/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_503-30719-0_sp1.1 from training. Duration: 0.8909375 2023-10-02 02:41:42,971 WARNING [train.py:1204] (2/4) Exclude cut with ID _49643_210_7989_1_1534640244224_3939031_234-326497-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 02:41:43,135 WARNING [train.py:1204] (2/4) Exclude cut with ID _14170_210_5219_1_1530525949868_4469650_301-77873-0_sp1.1 from training. Duration: 0.963625 2023-10-02 02:41:44,488 WARNING [train.py:1204] (2/4) Exclude cut with ID _45297_210_2721_1_1533715351381_3264949_152-154697-0_sp1.1 from training. Duration: 0.97275 2023-10-02 02:41:44,503 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_3371_210_1794_1_1526384462_5580777_396-339812-0 from training. Duration: 0.85 2023-10-02 02:41:45,949 WARNING [train.py:1204] (2/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_399-156791-0_sp1.1 from training. Duration: 0.7545625 2023-10-02 02:41:47,219 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7696_210_2040_1_1528207167_3716099_117-125434-0 from training. Duration: 0.76 2023-10-02 02:41:47,224 WARNING [train.py:1204] (2/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_96-244299-0 from training. Duration: 0.88 2023-10-02 02:41:49,868 WARNING [train.py:1204] (2/4) Exclude cut with ID _37892_210_19596_1_1533198677897_3964058_519-343462-0_sp1.1 from training. Duration: 0.92725 2023-10-02 02:41:49,894 WARNING [train.py:1204] (2/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_202-183922-0_sp1.1 from training. Duration: 0.77275 2023-10-02 02:41:49,914 WARNING [train.py:1204] (2/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_453-133983-0 from training. Duration: 0.78 2023-10-02 02:41:51,891 WARNING [train.py:1204] (2/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_178-39722-0_sp1.1 from training. Duration: 0.4454375 2023-10-02 02:41:54,771 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_15126_210_6753_1_1530778677_2591185_113-157651-0 from training. Duration: 0.68 2023-10-02 02:41:54,772 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_271-234962-0_sp1.1 from training. Duration: 0.7181875 2023-10-02 02:41:58,084 WARNING [train.py:1204] (2/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_195-64967-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 02:41:59,359 WARNING [train.py:1204] (2/4) Exclude cut with ID _43552_210_8913_1_1533726031059_3523429_365-215065-0_sp1.1 from training. Duration: 0.936375 2023-10-02 02:42:00,601 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.496e+02 1.890e+02 2.103e+02 2.367e+02 4.500e+02, threshold=4.205e+02, percent-clipped=1.0 2023-10-02 02:42:02,709 WARNING [train.py:1204] (2/4) Exclude cut with ID _31602_210_5854_1_1532677021456_4458250_249-96604-0_sp0.9 from training. Duration: 0.988875 2023-10-02 02:42:04,061 WARNING [train.py:1204] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_307-158462-0 from training. Duration: 0.69 2023-10-02 02:42:05,406 WARNING [train.py:1204] (2/4) Exclude cut with ID _30918_210_6400_1_1532754078600_3125920_407-223394-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 02:42:05,410 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6673_210_3273_1_1527767790_3173240_351-167954-0_sp0.9 from training. Duration: 0.5333125 2023-10-02 02:42:06,933 WARNING [train.py:1204] (2/4) Exclude cut with ID _4866_210_2577_1_1527404412284_4364659_256-306830-0 from training. Duration: 0.87 2023-10-02 02:42:08,227 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_17613_210_2151_1_1531137569_2366160_222-131118-0_sp1.1 from training. Duration: 0.963625 2023-10-02 02:42:08,245 WARNING [train.py:1204] (2/4) Exclude cut with ID _45560_210_21982_1_1533812603951_3584999_111-6376-0 from training. Duration: 0.84 2023-10-02 02:42:08,280 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_216-73841-0 from training. Duration: 0.96 2023-10-02 02:42:10,422 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_14058_210_4163_1_1530690953_6327180_49-210687-0 from training. Duration: 0.94 2023-10-02 02:42:11,774 WARNING [train.py:1204] (2/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_506-42614-0_sp0.9 from training. Duration: 0.87775 2023-10-02 02:42:13,194 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_3133_210_2087_1_1526106446_2324690_25-332138-0_sp1.1 from training. Duration: 0.82725 2023-10-02 02:42:13,451 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.out_combiner.scale_min, batch_count=722413.3333333334, ans=0.2 2023-10-02 02:42:15,908 WARNING [train.py:1204] (2/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_589-9070-0_sp1.1 from training. Duration: 0.6454375 2023-10-02 02:42:16,016 WARNING [train.py:1204] (2/4) Exclude cut with ID _6194_210_1534_1_1527511826160_3941194_14-282876-0_sp1.1 from training. Duration: 0.6454375 2023-10-02 02:42:17,424 WARNING [train.py:1204] (2/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_1166-90654-0_sp1.1 from training. Duration: 0.92725 2023-10-02 02:42:18,876 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_14056_210_4163_1_1530604483_7572827_218-255858-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 02:42:18,878 WARNING [train.py:1204] (2/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_1285-256622-0 from training. Duration: 0.84 2023-10-02 02:42:18,903 WARNING [train.py:1204] (2/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_205-286261-0_sp1.1 from training. Duration: 0.963625 2023-10-02 02:42:18,926 WARNING [train.py:1204] (2/4) Exclude cut with ID _68777_210_4353_1_1535810390731_3117904_6-154119-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 02:42:20,260 WARNING [train.py:1204] (2/4) Exclude cut with ID _22982_210_1883_1_1531882790447_5449409_565-199393-0_sp1.1 from training. Duration: 0.92725 2023-10-02 02:42:20,270 WARNING [train.py:1204] (2/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_278-154492-0 from training. Duration: 0.76 2023-10-02 02:42:22,524 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_426-313557-0 from training. Duration: 0.53 2023-10-02 02:42:22,591 WARNING [train.py:1204] (2/4) Exclude cut with ID _41817_210_18835_1_1533434477160_7423049_555-293448-0 from training. Duration: 0.8 2023-10-02 02:42:28,384 WARNING [train.py:1204] (2/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_32-335103-0_sp1.1 from training. Duration: 0.7454375 2023-10-02 02:42:30,571 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.2.encoder.layers.0.nonlin_attention.whiten1, num_groups=1, num_channels=288, metric=7.92 vs. limit=10.0 2023-10-02 02:42:31,165 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_2655_210_2087_1_1525688959_3897400_202-147557-0_sp1.1 from training. Duration: 0.77275 2023-10-02 02:42:33,231 WARNING [train.py:1204] (2/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_158-250116-0 from training. Duration: 0.62 2023-10-02 02:42:37,531 WARNING [train.py:1204] (2/4) Exclude cut with ID _55429_210_10626_1_1534467588527_7174143_528-88083-0_sp1.1 from training. Duration: 0.963625 2023-10-02 02:42:37,793 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.feed_forward2.hidden_balancer.prob, batch_count=722546.6666666666, ans=0.125 2023-10-02 02:42:40,230 WARNING [train.py:1204] (2/4) Exclude cut with ID _8030_210_7497_1_1528444886781_3531089_32-31837-0_sp1.1 from training. Duration: 0.8818125 2023-10-02 02:42:40,261 WARNING [train.py:1204] (2/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_744-86168-0_sp1.1 from training. Duration: 0.97275 2023-10-02 02:42:40,270 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1113-7369-0_sp0.9 from training. Duration: 0.9444375 2023-10-02 02:42:40,296 WARNING [train.py:1204] (2/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_124-345840-0_sp0.9 from training. Duration: 0.57775 2023-10-02 02:42:40,345 WARNING [train.py:1204] (2/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_328-264776-0_sp0.9 from training. Duration: 0.7444375 2023-10-02 02:42:42,324 WARNING [train.py:1204] (2/4) Exclude cut with ID _18895_210_6228_1_1531357274234_3784930_469-166615-0_sp1.1 from training. Duration: 0.963625 2023-10-02 02:42:42,341 WARNING [train.py:1204] (2/4) Exclude cut with ID _8395_210_1531_1_1528597871523_4411579_431-132470-0_sp0.9 from training. Duration: 0.82225 2023-10-02 02:42:43,683 WARNING [train.py:1204] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_554-45617-0_sp1.1 from training. Duration: 0.9 2023-10-02 02:42:43,706 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_35361_210_4238_1_1533262810_7793476_524-188242-0_sp1.1 from training. Duration: 0.92725 2023-10-02 02:42:45,085 WARNING [train.py:1204] (2/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_938-205135-0 from training. Duration: 0.87 2023-10-02 02:42:46,607 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_5931_210_2040_1_1527428481_4547999_321-193614-0 from training. Duration: 0.67 2023-10-02 02:42:46,619 WARNING [train.py:1204] (2/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_445-269521-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 02:42:49,191 WARNING [train.py:1204] (2/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_3-33116-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 02:42:49,192 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_4633_210_2040_1_1526824163_4145343_416-76117-0_sp1.1 from training. Duration: 0.863625 2023-10-02 02:42:49,770 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.conv_module2.whiten.whitening_limit, batch_count=722613.3333333334, ans=15.0 2023-10-02 02:42:50,486 INFO [train.py:1046] (2/4) Epoch 21, batch 2150, loss[loss=0.1362, simple_loss=0.2105, pruned_loss=0.03092, over 24306.00 frames. ], tot_loss[loss=0.1748, simple_loss=0.2498, pruned_loss=0.04995, over 4707526.47 frames. ], batch size: 56, lr: 4.91e-03, grad_scale: 8.0 2023-10-02 02:42:50,558 WARNING [train.py:1204] (2/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_1033-274376-0_sp1.1 from training. Duration: 0.7909375 2023-10-02 02:42:50,587 WARNING [train.py:1204] (2/4) Exclude cut with ID _8772_210_1850_1_1528635595972_3664011_398-320049-0_sp0.9 from training. Duration: 0.97775 2023-10-02 02:42:50,905 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.ff2_skip_rate, batch_count=722613.3333333334, ans=0.0 2023-10-02 02:42:55,189 WARNING [train.py:1204] (2/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_321-215742-0_sp0.9 from training. Duration: 0.5555625 2023-10-02 02:42:55,403 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.1.conv_module2.balancer2.min_positive, batch_count=722613.3333333334, ans=0.05 2023-10-02 02:42:56,635 WARNING [train.py:1204] (2/4) Exclude cut with ID _28394_210_8913_1_1532430049518_3650118_272-190950-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 02:42:57,900 WARNING [train.py:1204] (2/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_31-299371-0_sp1.1 from training. Duration: 0.92725 2023-10-02 02:42:59,316 WARNING [train.py:1204] (2/4) Exclude cut with ID _55383_210_10635_1_1534468426172_2231669_134-128246-0_sp0.9 from training. Duration: 0.988875 2023-10-02 02:42:59,326 WARNING [train.py:1204] (2/4) Exclude cut with ID _40519_210_13794_1_1533715228655_7337390_887-9547-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 02:42:59,360 WARNING [train.py:1204] (2/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_310-109461-0_sp0.9 from training. Duration: 0.888875 2023-10-02 02:43:04,358 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_2009_210_2071_1_1525481134_4588937_258-250196-0_sp1.1 from training. Duration: 0.92725 2023-10-02 02:43:05,705 WARNING [train.py:1204] (2/4) Exclude cut with ID _9235_210_5219_1_1528891154253_8046120_1186-72702-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 02:43:05,707 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_14831_210_7927_1_1530700200_5583610_181-27467-0_sp1.1 from training. Duration: 0.82725 2023-10-02 02:43:08,540 WARNING [train.py:1204] (2/4) Exclude cut with ID _40402_210_18219_1_1533522324115_2953150_278-132109-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 02:43:08,567 WARNING [train.py:1204] (2/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_261-297503-0 from training. Duration: 0.99 2023-10-02 02:43:13,281 WARNING [train.py:1204] (2/4) Exclude cut with ID _19896_210_1883_1_1531709022662_6649569_313-265903-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 02:43:13,385 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_105-114797-0_sp1.1 from training. Duration: 0.763625 2023-10-02 02:43:14,761 WARNING [train.py:1204] (2/4) Exclude cut with ID _53755_210_8254_1_1534577312941_4707360_9-65843-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 02:43:14,789 WARNING [train.py:1204] (2/4) Exclude cut with ID _18516_210_9206_1_1531704653206_3692406_361-224549-0_sp1.1 from training. Duration: 0.97275 2023-10-02 02:43:16,250 WARNING [train.py:1204] (2/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_403-310975-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 02:43:16,301 WARNING [train.py:1204] (2/4) Exclude cut with ID _5444_210_2151_1_1527418808464_5312210_581-315411-0_sp0.9 from training. Duration: 0.688875 2023-10-02 02:43:16,336 WARNING [train.py:1204] (2/4) Exclude cut with ID _23825_210_13449_1_1531915313526_3497150_464-154904-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 02:43:17,617 WARNING [train.py:1204] (2/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_937-208281-0_sp0.9 from training. Duration: 0.9444375 2023-10-02 02:43:17,686 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_37839_210_7234_1_1533542462_3689303_158-181212-0_sp1.1 from training. Duration: 0.963625 2023-10-02 02:43:19,049 WARNING [train.py:1204] (2/4) Exclude cut with ID _8772_210_1850_1_1528635595972_3664011_398-320049-0 from training. Duration: 0.88 2023-10-02 02:43:20,496 WARNING [train.py:1204] (2/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_718-250907-0_sp1.1 from training. Duration: 0.62725 2023-10-02 02:43:20,575 WARNING [train.py:1204] (2/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_641-126395-0_sp1.1 from training. Duration: 0.92725 2023-10-02 02:43:20,597 WARNING [train.py:1204] (2/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_166-241493-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 02:43:23,924 WARNING [train.py:1204] (2/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_526-62079-0_sp0.9 from training. Duration: 0.7444375 2023-10-02 02:43:24,034 WARNING [train.py:1204] (2/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_439-275083-0_sp1.1 from training. Duration: 0.936375 2023-10-02 02:43:25,461 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_66907_210_21581_1_1535615661_3590177_493-229032-0_sp1.1 from training. Duration: 0.92725 2023-10-02 02:43:26,876 WARNING [train.py:1204] (2/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_89-321955-0_sp1.1 from training. Duration: 0.72725 2023-10-02 02:43:28,259 WARNING [train.py:1204] (2/4) Exclude cut with ID _29857_210_20034_1_1532395886753_3798160_273-226195-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 02:43:28,266 WARNING [train.py:1204] (2/4) Exclude cut with ID _22826_210_9994_1_1531908201522_3279169_404-75889-0 from training. Duration: 0.89 2023-10-02 02:43:28,302 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_271-234962-0_sp0.9 from training. Duration: 0.87775 2023-10-02 02:43:28,819 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.5.encoder.layers.0.feed_forward1.out_whiten, num_groups=1, num_channels=256, metric=6.52 vs. limit=15.0 2023-10-02 02:43:29,059 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.2.encoder.layers.1.feed_forward2.out_whiten, num_groups=1, num_channels=384, metric=4.31 vs. limit=15.0 2023-10-02 02:43:32,074 WARNING [train.py:1204] (2/4) Exclude cut with ID _22961_210_8254_1_1531976366672_4278180_156-339685-0_sp1.1 from training. Duration: 0.97275 2023-10-02 02:43:32,116 WARNING [train.py:1204] (2/4) Exclude cut with ID _37892_210_19596_1_1533198677897_3964058_425-231927-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 02:43:33,480 WARNING [train.py:1204] (2/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_102-288670-0_sp1.1 from training. Duration: 0.97275 2023-10-02 02:43:36,135 WARNING [train.py:1204] (2/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_283-220372-0_sp1.1 from training. Duration: 0.5090625 2023-10-02 02:43:37,484 WARNING [train.py:1204] (2/4) Exclude cut with ID _44304_210_15374_1_1533639352765_4613129_86-1699-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 02:43:38,917 WARNING [train.py:1204] (2/4) Exclude cut with ID _2891_210_2550_1_1525874398521_4427710_327-121590-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 02:43:38,931 WARNING [train.py:1204] (2/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_369-222060-0 from training. Duration: 0.71 2023-10-02 02:43:40,343 WARNING [train.py:1204] (2/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_762-109886-0 from training. Duration: 0.91 2023-10-02 02:43:40,374 WARNING [train.py:1204] (2/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_477-255782-0_sp0.9 from training. Duration: 0.911125 2023-10-02 02:43:40,409 WARNING [train.py:1204] (2/4) Exclude cut with ID IC0669W0061-330906 from training. Duration: 0.768 2023-10-02 02:43:40,447 WARNING [train.py:1204] (2/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_347-213071-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 02:43:42,223 WARNING [train.py:1204] (2/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_507-303370-0_sp1.1 from training. Duration: 0.87275 2023-10-02 02:43:42,294 WARNING [train.py:1204] (2/4) Exclude cut with ID _15012_210_1968_1_1530856820429_5095350_688-178849-0 from training. Duration: 0.64 2023-10-02 02:43:42,299 WARNING [train.py:1204] (2/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_28-70987-0_sp0.9 from training. Duration: 0.9666875 2023-10-02 02:43:42,304 WARNING [train.py:1204] (2/4) Exclude cut with ID _5910_210_1389_1_1527337972454_5050100_555-159815-0 from training. Duration: 0.98 2023-10-02 02:43:42,329 WARNING [train.py:1204] (2/4) Exclude cut with ID IC0679W0064-335871 from training. Duration: 0.768 2023-10-02 02:43:42,330 WARNING [train.py:1204] (2/4) Exclude cut with ID IC0679W0079-335886 from training. Duration: 0.896 2023-10-02 02:43:42,473 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.feed_forward1.hidden_balancer.prob, batch_count=722813.3333333334, ans=0.125 2023-10-02 02:43:43,661 WARNING [train.py:1204] (2/4) Exclude cut with ID _5608_210_2106_1_1527415439089_6819768_909-82767-0 from training. Duration: 0.61 2023-10-02 02:43:46,256 WARNING [train.py:1204] (2/4) Exclude cut with ID _23038_210_9570_1_1532001584158_3652419_411-323967-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 02:43:47,029 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.2.encoder.layers.2.conv_module2.whiten, num_groups=1, num_channels=384, metric=4.83 vs. limit=15.0 2023-10-02 02:43:47,627 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_40896_210_15710_1_1533433956_7100802_350-231748-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 02:43:47,646 WARNING [train.py:1204] (2/4) Exclude cut with ID _5289_210_2084_1_1527678202736_3779299_142-103317-0_sp0.9 from training. Duration: 0.9333125 2023-10-02 02:43:47,731 WARNING [train.py:1204] (2/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_314-144055-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 02:43:49,136 WARNING [train.py:1204] (2/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_177-236809-0_sp0.9 from training. Duration: 0.5444375 2023-10-02 02:43:50,558 WARNING [train.py:1204] (2/4) Exclude cut with ID _23038_210_9570_1_1532001584158_3652419_82-334733-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 02:43:50,565 WARNING [train.py:1204] (2/4) Exclude cut with ID _43552_210_8913_1_1533726031059_3523429_225-22526-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 02:43:50,890 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.feed_forward1.hidden_balancer.prob, batch_count=722880.0, ans=0.125 2023-10-02 02:43:57,310 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.conv_module1.balancer1.prob, batch_count=722880.0, ans=0.125 2023-10-02 02:44:00,340 WARNING [train.py:1204] (2/4) Exclude cut with ID _30327_210_16049_1_1532484161295_3924169_401-70819-0_sp1.1 from training. Duration: 0.7818125 2023-10-02 02:44:00,410 WARNING [train.py:1204] (2/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_538-32291-0 from training. Duration: 0.83 2023-10-02 02:44:05,115 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_37357_210_17024_1_1533089504_2316121_258-211605-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 02:44:06,411 INFO [train.py:1046] (2/4) Epoch 21, batch 2200, loss[loss=0.167, simple_loss=0.2552, pruned_loss=0.03943, over 23998.00 frames. ], tot_loss[loss=0.1747, simple_loss=0.2501, pruned_loss=0.04964, over 4716497.64 frames. ], batch size: 86, lr: 4.91e-03, grad_scale: 8.0 2023-10-02 02:44:09,281 WARNING [train.py:1204] (2/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_550-335198-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 02:44:10,622 WARNING [train.py:1204] (2/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_98-741-0_sp0.9 from training. Duration: 0.888875 2023-10-02 02:44:10,677 WARNING [train.py:1204] (2/4) Exclude cut with ID _38323_210_12108_1_1533121233369_3303049_251-238988-0_sp1.1 from training. Duration: 0.92725 2023-10-02 02:44:11,995 WARNING [train.py:1204] (2/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_259-34038-0_sp0.9 from training. Duration: 0.82225 2023-10-02 02:44:15,336 WARNING [train.py:1204] (2/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_1060-180633-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 02:44:15,372 WARNING [train.py:1204] (2/4) Exclude cut with ID _8755_210_1876_1_1528628559357_3258209_211-37473-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 02:44:15,379 WARNING [train.py:1204] (2/4) Exclude cut with ID _4122_210_2084_1_1526713164417_4353280_528-206771-0 from training. Duration: 0.88 2023-10-02 02:44:19,643 WARNING [train.py:1204] (2/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_64-248359-0 from training. Duration: 0.69 2023-10-02 02:44:21,185 WARNING [train.py:1204] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_402-253027-0_sp1.1 from training. Duration: 0.4909375 2023-10-02 02:44:25,208 WARNING [train.py:1204] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_625-102955-0 from training. Duration: 0.77 2023-10-02 02:44:28,584 WARNING [train.py:1204] (2/4) Exclude cut with ID _14998_210_5219_1_1530705582080_7474970_611-219324-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 02:44:28,839 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.conv_module2.balancer2.prob, batch_count=723013.3333333334, ans=0.125 2023-10-02 02:44:29,910 WARNING [train.py:1204] (2/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_718-68505-0_sp0.9 from training. Duration: 0.988875 2023-10-02 02:44:29,957 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_353-182855-0_sp1.1 from training. Duration: 0.87275 2023-10-02 02:44:31,983 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_5960_210_1794_1_1527403865_3990999_515-2613-0_sp0.9 from training. Duration: 0.97775 2023-10-02 02:44:33,920 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_5575_210_1410_1_1527301305_2570170_342-265572-0 from training. Duration: 0.94 2023-10-02 02:44:35,306 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.554e+02 1.860e+02 2.025e+02 2.303e+02 3.643e+02, threshold=4.050e+02, percent-clipped=0.0 2023-10-02 02:44:36,861 WARNING [train.py:1204] (2/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_329-113173-0_sp0.9 from training. Duration: 0.688875 2023-10-02 02:44:38,163 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_15396_210_6890_1_1531308473_1938936_213-312506-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 02:44:39,592 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_521-214276-0_sp1.1 from training. Duration: 0.4 2023-10-02 02:44:39,857 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.feed_forward1.out_proj.dropout_p, batch_count=723080.0, ans=0.1 2023-10-02 02:44:41,195 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.conv_module2.balancer1.prob, batch_count=723080.0, ans=0.125 2023-10-02 02:44:42,332 WARNING [train.py:1204] (2/4) Exclude cut with ID _15026_210_8750_1_1530777602316_3291850_211-195043-0_sp1.1 from training. Duration: 0.72725 2023-10-02 02:44:44,153 WARNING [train.py:1204] (2/4) Exclude cut with ID _29343_210_4381_1_1532602757752_3674109_108-257494-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 02:44:47,005 WARNING [train.py:1204] (2/4) Exclude cut with ID _64414_210_17301_1_1535243252048_3757759_403-58151-0_sp1.1 from training. Duration: 0.963625 2023-10-02 02:44:48,413 WARNING [train.py:1204] (2/4) Exclude cut with ID _36512_210_6987_1_1533033143998_3126069_109-177787-0_sp1.1 from training. Duration: 0.92725 2023-10-02 02:44:49,846 WARNING [train.py:1204] (2/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_498-40153-0 from training. Duration: 0.63 2023-10-02 02:44:51,217 WARNING [train.py:1204] (2/4) Exclude cut with ID _56447_210_1389_1_1535074245910_3697189_161-334523-0_sp1.1 from training. Duration: 0.936375 2023-10-02 02:44:52,630 WARNING [train.py:1204] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_238-245655-0 from training. Duration: 0.81 2023-10-02 02:44:54,045 WARNING [train.py:1204] (2/4) Exclude cut with ID _64631_210_27767_1_1535349580068_4165160_108-43865-0_sp1.1 from training. Duration: 0.92725 2023-10-02 02:44:54,050 WARNING [train.py:1204] (2/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_243-42116-0_sp1.1 from training. Duration: 0.663625 2023-10-02 02:44:55,386 WARNING [train.py:1204] (2/4) Exclude cut with ID _66868_210_9527_1_1535509287954_4561140_594-132160-0_sp1.1 from training. Duration: 0.92725 2023-10-02 02:44:56,849 WARNING [train.py:1204] (2/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_48-338976-0_sp0.9 from training. Duration: 0.988875 2023-10-02 02:44:56,909 WARNING [train.py:1204] (2/4) Exclude cut with ID _5250_210_5219_1_1527249561806_8692110_137-100595-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 02:44:56,915 WARNING [train.py:1204] (2/4) Exclude cut with ID _49748_210_24116_1_1533986971737_3158910_12-271669-0_sp1.1 from training. Duration: 0.936375 2023-10-02 02:44:58,901 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_37357_210_17024_1_1533089504_2316121_206-80421-0_sp1.1 from training. Duration: 0.936375 2023-10-02 02:45:00,275 WARNING [train.py:1204] (2/4) Exclude cut with ID _7537_210_2084_1_1528502395431_3924359_113-199933-0_sp1.1 from training. Duration: 0.536375 2023-10-02 02:45:00,325 WARNING [train.py:1204] (2/4) Exclude cut with ID _55440_210_12651_1_1534496713364_2980520_31-59289-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 02:45:00,557 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=723146.6666666666, ans=0.1 2023-10-02 02:45:01,812 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1083-220122-0_sp0.9 from training. Duration: 0.5666875 2023-10-02 02:45:06,320 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_426-313557-0_sp1.1 from training. Duration: 0.4818125 2023-10-02 02:45:06,369 WARNING [train.py:1204] (2/4) Exclude cut with ID _35799_210_5854_1_1533263487887_3529071_324-94843-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 02:45:08,300 WARNING [train.py:1204] (2/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_264-108685-0_sp1.1 from training. Duration: 0.5 2023-10-02 02:45:08,396 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9012W0006-501141 from training. Duration: 0.896 2023-10-02 02:45:11,153 WARNING [train.py:1204] (2/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_890-5117-0_sp0.9 from training. Duration: 0.8666875 2023-10-02 02:45:11,215 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9023W0008-506301 from training. Duration: 0.896 2023-10-02 02:45:13,713 WARNING [train.py:1204] (2/4) Exclude cut with ID _13916_210_9994_1_1530788341126_3763149_60-191556-0_sp0.9 from training. Duration: 0.67775 2023-10-02 02:45:13,749 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9029W0331-509721 from training. Duration: 0.896 2023-10-02 02:45:15,760 WARNING [train.py:1204] (2/4) Exclude cut with ID _35823_210_6917_1_1533187881914_7297830_567-108596-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 02:45:15,824 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7543_210_6753_1_1528544010_1110402_25-183860-0_sp1.1 from training. Duration: 0.42725 2023-10-02 02:45:17,250 WARNING [train.py:1204] (2/4) Exclude cut with ID _47278_210_12041_1_1533902418349_3576110_495-260035-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 02:45:19,879 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9051W0015-520281 from training. Duration: 0.9045625 2023-10-02 02:45:21,295 INFO [train.py:1046] (2/4) Epoch 21, batch 2250, loss[loss=0.194, simple_loss=0.2629, pruned_loss=0.06256, over 23660.00 frames. ], tot_loss[loss=0.1749, simple_loss=0.2504, pruned_loss=0.04975, over 4716349.74 frames. ], batch size: 256, lr: 4.91e-03, grad_scale: 8.0 2023-10-02 02:45:21,351 WARNING [train.py:1204] (2/4) Exclude cut with ID _14459_210_2228_1_1531443641946_7516700_707-66193-0_sp1.1 from training. Duration: 0.97275 2023-10-02 02:45:22,353 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.0.layers.0.nonlin_attention.whiten2, num_groups=1, num_channels=192, metric=4.81 vs. limit=15.0 2023-10-02 02:45:22,722 WARNING [train.py:1204] (2/4) Exclude cut with ID _14087_210_2253_1_1530529224512_3014029_219-224445-0_sp1.1 from training. Duration: 0.763625 2023-10-02 02:45:24,332 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.feed_forward1.out_proj.dropout_p, batch_count=723280.0, ans=0.1 2023-10-02 02:45:28,376 WARNING [train.py:1204] (2/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_345-167986-0_sp0.9 from training. Duration: 0.9333125 2023-10-02 02:45:28,485 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_34815_210_6388_1_1533381320_3571903_279-305565-0_sp1.1 from training. Duration: 0.836375 2023-10-02 02:45:32,945 WARNING [train.py:1204] (2/4) Exclude cut with ID _21987_210_5854_1_1531821500482_3637038_317-179212-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 02:45:33,012 WARNING [train.py:1204] (2/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_395-37222-0_sp1.1 from training. Duration: 0.6909375 2023-10-02 02:45:34,408 WARNING [train.py:1204] (2/4) Exclude cut with ID _8064_210_5399_1_1528372299291_4282973_168-50337-0_sp1.1 from training. Duration: 0.763625 2023-10-02 02:45:38,144 WARNING [train.py:1204] (2/4) Exclude cut with ID _60113_210_16271_1_1535164212481_4386079_559-167191-0 from training. Duration: 0.93 2023-10-02 02:45:38,153 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_47228_210_4983_1_1533780021_3672027_344-179642-0_sp1.1 from training. Duration: 0.92725 2023-10-02 02:45:38,186 WARNING [train.py:1204] (2/4) Exclude cut with ID _22961_210_8254_1_1531976366672_4278180_317-199546-0_sp0.9 from training. Duration: 0.9555625 2023-10-02 02:45:39,593 WARNING [train.py:1204] (2/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_141-89727-0 from training. Duration: 0.71 2023-10-02 02:45:41,036 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_78-116075-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 02:45:41,054 WARNING [train.py:1204] (2/4) Exclude cut with ID _64086_210_7994_1_1535270417019_7435949_52-235756-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 02:45:43,686 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7615_210_4211_1_1528110965_4580160_72-235211-0_sp1.1 from training. Duration: 0.6909375 2023-10-02 02:45:48,493 WARNING [train.py:1204] (2/4) Exclude cut with ID _14069_210_5632_1_1530512692033_4803390_287-27950-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 02:45:49,974 WARNING [train.py:1204] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_74-48717-0_sp0.9 from training. Duration: 0.6444375 2023-10-02 02:45:50,004 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7544_210_6753_1_1528591327_1471426_172-348777-0_sp1.1 from training. Duration: 0.6 2023-10-02 02:45:51,413 WARNING [train.py:1204] (2/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_220-191080-0 from training. Duration: 0.98 2023-10-02 02:45:52,697 WARNING [train.py:1204] (2/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_1081-137641-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 02:45:54,124 WARNING [train.py:1204] (2/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_278-235459-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 02:45:57,388 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.3.encoder.layers.2.self_attn_weights.whiten_keys, num_groups=8, num_channels=256, metric=4.20 vs. limit=6.0 2023-10-02 02:45:59,484 WARNING [train.py:1204] (2/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_1181-77470-0_sp1.1 from training. Duration: 0.936375 2023-10-02 02:46:02,170 WARNING [train.py:1204] (2/4) Exclude cut with ID _60860_210_8254_1_1534896015875_3557169_198-5531-0_sp1.1 from training. Duration: 0.936375 2023-10-02 02:46:03,487 WARNING [train.py:1204] (2/4) Exclude cut with ID _12369_210_4381_1_1530356078581_4388520_17-289781-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 02:46:03,500 WARNING [train.py:1204] (2/4) Exclude cut with ID _50310_210_15488_1_1534068043345_3641080_146-199752-0_sp1.1 from training. Duration: 0.92725 2023-10-02 02:46:06,957 WARNING [train.py:1204] (2/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_969-213354-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 02:46:08,443 WARNING [train.py:1204] (2/4) Exclude cut with ID _70519_210_4163_1_1535961595994_7458230_66-344156-0_sp1.1 from training. Duration: 0.963625 2023-10-02 02:46:12,538 WARNING [train.py:1204] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_380-204756-0_sp1.1 from training. Duration: 0.7818125 2023-10-02 02:46:15,355 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_128-202104-0_sp0.9 from training. Duration: 0.7 2023-10-02 02:46:21,334 WARNING [train.py:1204] (2/4) Exclude cut with ID _4697_210_2069_1_1526815801953_3823098_51-1049-0_sp0.9 from training. Duration: 0.8333125 2023-10-02 02:46:21,351 WARNING [train.py:1204] (2/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_86-66192-0_sp1.1 from training. Duration: 0.5 2023-10-02 02:46:21,409 WARNING [train.py:1204] (2/4) Exclude cut with ID _14069_210_5632_1_1530512692033_4803390_95-1896-0_sp1.1 from training. Duration: 0.8909375 2023-10-02 02:46:25,549 WARNING [train.py:1204] (2/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_29-293896-0_sp1.1 from training. Duration: 0.5545625 2023-10-02 02:46:28,233 WARNING [train.py:1204] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_167-137255-0_sp0.9 from training. Duration: 0.711125 2023-10-02 02:46:28,233 WARNING [train.py:1204] (2/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_217-95056-0 from training. Duration: 0.98 2023-10-02 02:46:29,561 WARNING [train.py:1204] (2/4) Exclude cut with ID _47278_210_12041_1_1533902418349_3576110_97-266746-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 02:46:29,613 WARNING [train.py:1204] (2/4) Exclude cut with ID _14224_210_4381_1_1530529062037_4264010_380-343670-0_sp1.1 from training. Duration: 0.8 2023-10-02 02:46:32,425 WARNING [train.py:1204] (2/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_107-332398-0 from training. Duration: 0.78 2023-10-02 02:46:35,176 INFO [train.py:1046] (2/4) Epoch 21, batch 2300, loss[loss=0.1718, simple_loss=0.2411, pruned_loss=0.05127, over 23460.00 frames. ], tot_loss[loss=0.1748, simple_loss=0.2506, pruned_loss=0.04949, over 4716195.61 frames. ], batch size: 134, lr: 4.91e-03, grad_scale: 8.0 2023-10-02 02:46:35,247 WARNING [train.py:1204] (2/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_91-267696-0_sp1.1 from training. Duration: 0.7909375 2023-10-02 02:46:35,277 WARNING [train.py:1204] (2/4) Exclude cut with ID _41298_210_9019_1_1533430371450_4822143_498-186993-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 02:46:41,362 WARNING [train.py:1204] (2/4) Exclude cut with ID _17956_210_4353_1_1531209568125_6979970_54-77610-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 02:46:41,413 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_40047_210_2290_1_1533290519_2374311_257-335162-0_sp1.1 from training. Duration: 0.863625 2023-10-02 02:46:45,200 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9653W0193-675471 from training. Duration: 0.9656875 2023-10-02 02:46:46,539 WARNING [train.py:1204] (2/4) Exclude cut with ID _25248_210_8750_1_1532062825941_5554180_243-240536-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 02:46:54,036 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_32360_210_4198_1_1532689160_3648436_60-51908-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 02:46:54,051 WARNING [train.py:1204] (2/4) Exclude cut with ID _4092_210_2151_1_1526643044449_3135759_227-213972-0_sp0.9 from training. Duration: 0.6 2023-10-02 02:46:55,357 WARNING [train.py:1204] (2/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_392-173500-0_sp1.1 from training. Duration: 0.97275 2023-10-02 02:46:56,619 WARNING [train.py:1204] (2/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_71-329150-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 02:46:56,621 WARNING [train.py:1204] (2/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_866-173440-0 from training. Duration: 0.9 2023-10-02 02:46:56,699 WARNING [train.py:1204] (2/4) Exclude cut with ID _14832_210_8750_1_1530691351539_3171348_1-54315-0_sp0.9 from training. Duration: 0.9666875 2023-10-02 02:46:58,173 WARNING [train.py:1204] (2/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_430-116139-0_sp1.1 from training. Duration: 0.77275 2023-10-02 02:46:59,465 WARNING [train.py:1204] (2/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_356-45054-0_sp0.9 from training. Duration: 0.9555625 2023-10-02 02:47:03,558 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_3531_210_3528_1_1526294203_5216456_309-227302-0_sp0.9 from training. Duration: 0.7666875 2023-10-02 02:47:04,818 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.547e+02 1.907e+02 2.156e+02 2.525e+02 3.499e+02, threshold=4.312e+02, percent-clipped=0.0 2023-10-02 02:47:06,272 WARNING [train.py:1204] (2/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_301-330455-0_sp1.1 from training. Duration: 0.72725 2023-10-02 02:47:06,468 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.bypass.scale_min, batch_count=723746.6666666666, ans=0.2 2023-10-02 02:47:10,362 WARNING [train.py:1204] (2/4) Exclude cut with ID _23557_210_8750_1_1531890106370_6846255_667-145945-0_sp1.1 from training. Duration: 0.92725 2023-10-02 02:47:16,914 WARNING [train.py:1204] (2/4) Exclude cut with ID _14860_210_4915_1_1530702020340_7343240_836-328489-0_sp1.1 from training. Duration: 0.8181875 2023-10-02 02:47:16,958 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_37152_210_9697_1_1533106573_3844188_416-333249-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 02:47:20,029 WARNING [train.py:1204] (2/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_538-32291-0_sp0.9 from training. Duration: 0.92225 2023-10-02 02:47:20,233 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.balancer1.prob, batch_count=723813.3333333334, ans=0.125 2023-10-02 02:47:22,883 WARNING [train.py:1204] (2/4) Exclude cut with ID _32704_210_8954_1_1533017488840_6747370_965-176824-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 02:47:27,445 WARNING [train.py:1204] (2/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_86-19601-0_sp1.1 from training. Duration: 0.77275 2023-10-02 02:47:27,535 WARNING [train.py:1204] (2/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_436-318113-0_sp1.1 from training. Duration: 0.6818125 2023-10-02 02:47:28,897 WARNING [train.py:1204] (2/4) Exclude cut with ID _41817_210_18835_1_1533434477160_7423049_555-293448-0_sp0.9 from training. Duration: 0.888875 2023-10-02 02:47:28,911 WARNING [train.py:1204] (2/4) Exclude cut with ID _4697_210_2069_1_1526815801953_3823098_68-219807-0 from training. Duration: 0.47 2023-10-02 02:47:33,040 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7544_210_6753_1_1528591327_1471426_33-346031-0_sp1.1 from training. Duration: 0.5545625 2023-10-02 02:47:33,042 WARNING [train.py:1204] (2/4) Exclude cut with ID _49748_210_24116_1_1533986971737_3158910_263-52113-0_sp1.1 from training. Duration: 0.97275 2023-10-02 02:47:33,094 WARNING [train.py:1204] (2/4) Exclude cut with ID _64639_210_5816_1_1535367619437_3528000_340-24580-0_sp1.1 from training. Duration: 0.92725 2023-10-02 02:47:33,107 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_47228_210_4983_1_1533780021_3672027_479-114941-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 02:47:34,395 WARNING [train.py:1204] (2/4) Exclude cut with ID _44585_210_18628_1_1533605350387_4518199_506-119659-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 02:47:34,481 WARNING [train.py:1204] (2/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_314-126819-0_sp1.1 from training. Duration: 0.4545625 2023-10-02 02:47:34,484 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_2541_210_1794_1_1525590269_3084765_156-173713-0_sp0.9 from training. Duration: 0.9 2023-10-02 02:47:35,827 WARNING [train.py:1204] (2/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_108-255430-0 from training. Duration: 0.97 2023-10-02 02:47:35,835 WARNING [train.py:1204] (2/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_791-45149-0_sp1.1 from training. Duration: 0.7818125 2023-10-02 02:47:35,841 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_53378_210_17282_1_1534227054_1261425_10-118295-0_sp1.1 from training. Duration: 0.97275 2023-10-02 02:47:35,892 WARNING [train.py:1204] (2/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_148-158509-0 from training. Duration: 0.41 2023-10-02 02:47:41,367 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_34726_210_4525_1_1533019474_5030395_687-291305-0_sp1.1 from training. Duration: 0.936375 2023-10-02 02:47:43,072 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.conv_module2.balancer2.prob, batch_count=723880.0, ans=0.125 2023-10-02 02:47:44,599 WARNING [train.py:1204] (2/4) Exclude cut with ID _14860_210_4915_1_1530702020340_7343240_264-324537-0_sp1.1 from training. Duration: 0.963625 2023-10-02 02:47:44,781 INFO [scaling.py:1118] (2/4) WithLoss: name=encoder.encoders.2.encoder.layers.0.self_attn_weights, loss-sum=0.000e+00 2023-10-02 02:47:49,007 INFO [train.py:1046] (2/4) Epoch 21, batch 2350, loss[loss=0.1459, simple_loss=0.2238, pruned_loss=0.034, over 24455.00 frames. ], tot_loss[loss=0.1759, simple_loss=0.2519, pruned_loss=0.04994, over 4718663.50 frames. ], batch size: 58, lr: 4.90e-03, grad_scale: 8.0 2023-10-02 02:47:50,495 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_41251_210_15368_1_1533429767_4820695_591-46386-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 02:47:51,160 WARNING [train.py:1204] (2/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_86-19601-0_sp0.9 from training. Duration: 0.9444375 2023-10-02 02:47:51,187 WARNING [train.py:1204] (2/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_401-255491-0_sp0.9 from training. Duration: 0.77775 2023-10-02 02:47:52,554 WARNING [train.py:1204] (2/4) Exclude cut with ID _8696_210_1876_1_1528545632440_3345058_209-54069-0_sp1.1 from training. Duration: 0.6090625 2023-10-02 02:47:52,563 WARNING [train.py:1204] (2/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_369-325184-0_sp1.1 from training. Duration: 0.9 2023-10-02 02:47:53,947 WARNING [train.py:1204] (2/4) Exclude cut with ID _14687_210_5854_1_1530703838259_3664889_115-239046-0_sp0.9 from training. Duration: 0.7444375 2023-10-02 02:47:53,979 WARNING [train.py:1204] (2/4) Exclude cut with ID _8064_210_5399_1_1528372299291_4282973_168-50337-0 from training. Duration: 0.84 2023-10-02 02:47:55,443 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=723946.6666666666, ans=0.1 2023-10-02 02:47:55,455 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.balancer2.prob, batch_count=723946.6666666666, ans=0.125 2023-10-02 02:48:00,050 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.conv_module2.balancer1.prob, batch_count=723946.6666666666, ans=0.125 2023-10-02 02:48:01,368 WARNING [train.py:1204] (2/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_909-64151-0_sp1.1 from training. Duration: 0.8545625 2023-10-02 02:48:01,383 WARNING [train.py:1204] (2/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_597-154640-0 from training. Duration: 0.9 2023-10-02 02:48:05,533 WARNING [train.py:1204] (2/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_126-274694-0 from training. Duration: 0.98 2023-10-02 02:48:09,434 WARNING [train.py:1204] (2/4) Exclude cut with ID _21783_210_11357_1_1531742458730_3762679_344-269786-0_sp1.1 from training. Duration: 0.97275 2023-10-02 02:48:10,883 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_37818_210_4238_1_1533620910_4720083_322-253154-0_sp1.1 from training. Duration: 0.92725 2023-10-02 02:48:10,890 WARNING [train.py:1204] (2/4) Exclude cut with ID _58895_210_8750_1_1535184081320_3649089_221-15823-0_sp1.1 from training. Duration: 0.92725 2023-10-02 02:48:10,909 WARNING [train.py:1204] (2/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_9-21737-0_sp1.1 from training. Duration: 0.8818125 2023-10-02 02:48:10,948 WARNING [train.py:1204] (2/4) Exclude cut with ID _14069_210_5632_1_1530512692033_4803390_522-341588-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 02:48:11,239 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.bypass_mid.scale_min, batch_count=724013.3333333334, ans=0.2 2023-10-02 02:48:12,357 WARNING [train.py:1204] (2/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_363-185703-0 from training. Duration: 0.95 2023-10-02 02:48:12,630 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.bypass_mid.scale_min, batch_count=724013.3333333334, ans=0.2 2023-10-02 02:48:12,642 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.bypass.scale_min, batch_count=724013.3333333334, ans=0.2 2023-10-02 02:48:13,045 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.4.encoder.layers.1.nonlin_attention.whiten1, num_groups=1, num_channels=288, metric=4.24 vs. limit=10.0 2023-10-02 02:48:13,932 WARNING [train.py:1204] (2/4) Exclude cut with ID _41817_210_18835_1_1533434477160_7423049_265-76216-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 02:48:20,369 WARNING [train.py:1204] (2/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_841-341679-0 from training. Duration: 0.58 2023-10-02 02:48:21,805 WARNING [train.py:1204] (2/4) Exclude cut with ID _55429_210_10626_1_1534467588527_7174143_97-335084-0_sp1.1 from training. Duration: 0.8818125 2023-10-02 02:48:24,698 WARNING [train.py:1204] (2/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_690-126306-0_sp0.9 from training. Duration: 0.8666875 2023-10-02 02:48:24,707 WARNING [train.py:1204] (2/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_102-92751-0_sp1.1 from training. Duration: 0.9 2023-10-02 02:48:26,860 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_3787_210_2040_1_1526477544_5478586_603-120324-0_sp1.1 from training. Duration: 0.736375 2023-10-02 02:48:28,316 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_33348_210_5632_1_1533086912_3324030_85-28583-0 from training. Duration: 0.82 2023-10-02 02:48:29,606 WARNING [train.py:1204] (2/4) Exclude cut with ID _60057_210_10640_1_1534903065793_3333150_362-230377-0_sp1.1 from training. Duration: 0.7909375 2023-10-02 02:48:31,080 WARNING [train.py:1204] (2/4) Exclude cut with ID _60113_210_16271_1_1535164212481_4386079_194-146486-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 02:48:31,098 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_53753_210_7234_1_1534320003_3594148_171-277668-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 02:48:32,454 WARNING [train.py:1204] (2/4) Exclude cut with ID _34423_210_8750_1_1533430798456_3330199_251-332788-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 02:48:33,940 WARNING [train.py:1204] (2/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_663-166973-0_sp0.9 from training. Duration: 0.888875 2023-10-02 02:48:36,667 WARNING [train.py:1204] (2/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_927-43827-0 from training. Duration: 0.74 2023-10-02 02:48:36,714 WARNING [train.py:1204] (2/4) Exclude cut with ID _28323_210_16187_1_1532480395667_4100300_306-74561-0_sp1.1 from training. Duration: 0.8545625 2023-10-02 02:48:40,778 WARNING [train.py:1204] (2/4) Exclude cut with ID _15037_210_2253_1_1530752294104_3267650_139-337989-0_sp1.1 from training. Duration: 0.92725 2023-10-02 02:48:40,799 WARNING [train.py:1204] (2/4) Exclude cut with ID _49039_210_6315_1_1533967537541_3846118_460-263670-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 02:48:42,224 WARNING [train.py:1204] (2/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_186-309161-0 from training. Duration: 0.54 2023-10-02 02:48:42,309 WARNING [train.py:1204] (2/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_87-278686-0_sp1.1 from training. Duration: 0.7 2023-10-02 02:48:45,869 WARNING [train.py:1204] (2/4) Exclude cut with ID _27052_210_5986_1_1532329197187_3585009_314-106499-0 from training. Duration: 0.92 2023-10-02 02:48:45,893 WARNING [train.py:1204] (2/4) Exclude cut with ID _3465_210_3525_1_1526175667060_3574049_412-287526-0_sp0.9 from training. Duration: 0.8 2023-10-02 02:48:49,891 WARNING [train.py:1204] (2/4) Exclude cut with ID _22961_210_8254_1_1531976366672_4278180_317-199546-0 from training. Duration: 0.86 2023-10-02 02:48:51,401 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.conv_module2.balancer1.max_abs, batch_count=724213.3333333334, ans=10.0 2023-10-02 02:48:53,843 WARNING [train.py:1204] (2/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_381-317791-0 from training. Duration: 0.73 2023-10-02 02:48:53,892 WARNING [train.py:1204] (2/4) Exclude cut with ID _44010_210_4525_1_1533884262964_4013620_560-310411-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 02:48:53,897 WARNING [train.py:1204] (2/4) Exclude cut with ID _5294_210_1747_1_1527249643928_3696179_342-270278-0_sp0.9 from training. Duration: 0.611125 2023-10-02 02:48:53,916 WARNING [train.py:1204] (2/4) Exclude cut with ID ID0473W0002-918951 from training. Duration: 0.924 2023-10-02 02:48:55,640 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1083-220122-0 from training. Duration: 0.51 2023-10-02 02:48:57,144 WARNING [train.py:1204] (2/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_138-154812-0 from training. Duration: 0.78 2023-10-02 02:48:57,313 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.1.feed_forward2.hidden_balancer.prob, batch_count=724213.3333333334, ans=0.125 2023-10-02 02:49:01,231 WARNING [train.py:1204] (2/4) Exclude cut with ID _44982_210_1511_1_1534312820108_3726029_331-19420-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 02:49:04,166 INFO [train.py:1046] (2/4) Epoch 21, batch 2400, loss[loss=0.1885, simple_loss=0.2665, pruned_loss=0.05526, over 23339.00 frames. ], tot_loss[loss=0.1754, simple_loss=0.2514, pruned_loss=0.04972, over 4709536.96 frames. ], batch size: 93, lr: 4.90e-03, grad_scale: 16.0 2023-10-02 02:49:04,272 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_2655_210_2087_1_1525688959_3897400_202-147557-0_sp0.9 from training. Duration: 0.9444375 2023-10-02 02:49:07,342 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_23850_210_4238_1_1532232597_1632433_32-185135-0_sp0.9 from training. Duration: 0.9555625 2023-10-02 02:49:10,114 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_14056_210_4163_1_1530604483_7572827_46-93547-0_sp1.1 from training. Duration: 0.863625 2023-10-02 02:49:11,527 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7931_210_1794_1_1528594728_5564059_280-279560-0 from training. Duration: 0.98 2023-10-02 02:49:11,557 WARNING [train.py:1204] (2/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_937-208281-0 from training. Duration: 0.85 2023-10-02 02:49:19,664 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_14928_210_5632_1_1530773461_4714000_454-112198-0_sp0.9 from training. Duration: 0.7333125 2023-10-02 02:49:19,665 WARNING [train.py:1204] (2/4) Exclude cut with ID _32704_210_8954_1_1533017488840_6747370_944-153333-0_sp1.1 from training. Duration: 0.963625 2023-10-02 02:49:21,867 WARNING [train.py:1204] (2/4) Exclude cut with ID _14821_210_8750_1_1530680503991_6829359_2-227151-0 from training. Duration: 0.83 2023-10-02 02:49:21,891 WARNING [train.py:1204] (2/4) Exclude cut with ID _28461_210_2253_1_1532771775189_3041299_96-152158-0_sp1.1 from training. Duration: 0.77275 2023-10-02 02:49:21,960 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7837_210_1792_1_1528453445_5841305_654-54195-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 02:49:23,244 WARNING [train.py:1204] (2/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_344-152526-0 from training. Duration: 0.53 2023-10-02 02:49:28,000 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_30167_210_3528_1_1532428012_4772375_480-142230-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 02:49:30,669 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_160-218721-0 from training. Duration: 0.96 2023-10-02 02:49:35,035 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.568e+02 1.859e+02 2.075e+02 2.399e+02 3.778e+02, threshold=4.149e+02, percent-clipped=0.0 2023-10-02 02:49:36,395 WARNING [train.py:1204] (2/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_65-88242-0_sp0.9 from training. Duration: 0.62225 2023-10-02 02:49:40,707 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_34886_210_10633_1_1533203882_1755661_76-156096-0 from training. Duration: 0.87 2023-10-02 02:49:43,459 WARNING [train.py:1204] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_188-38388-0_sp1.1 from training. Duration: 0.92725 2023-10-02 02:49:43,562 WARNING [train.py:1204] (2/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_81-289335-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 02:49:48,319 WARNING [train.py:1204] (2/4) Exclude cut with ID _59493_210_17282_1_1535078419077_3225879_110-8778-0_sp1.1 from training. Duration: 0.936375 2023-10-02 02:49:48,352 WARNING [train.py:1204] (2/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_188-260011-0 from training. Duration: 0.73 2023-10-02 02:49:49,708 WARNING [train.py:1204] (2/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_453-133983-0_sp0.9 from training. Duration: 0.8666875 2023-10-02 02:49:56,387 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8054_210_7927_1_1528627721_4984094_553-17379-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 02:49:58,282 WARNING [train.py:1204] (2/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_341-171072-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 02:49:59,901 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.conv_module1.balancer1.prob, batch_count=724480.0, ans=0.125 2023-10-02 02:50:01,009 WARNING [train.py:1204] (2/4) Exclude cut with ID _30800_210_22382_1_1532499347904_4515959_265-257286-0_sp1.1 from training. Duration: 0.963625 2023-10-02 02:50:02,390 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_403-39619-0_sp0.9 from training. Duration: 0.9333125 2023-10-02 02:50:02,394 WARNING [train.py:1204] (2/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_464-33931-0_sp0.9 from training. Duration: 0.6 2023-10-02 02:50:02,421 WARNING [train.py:1204] (2/4) Exclude cut with ID _60804_210_9642_1_1534831199859_7268299_632-143863-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 02:50:02,427 WARNING [train.py:1204] (2/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_431-310997-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 02:50:02,472 WARNING [train.py:1204] (2/4) Exclude cut with ID _31649_210_6228_1_1532584608063_4091078_114-263860-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 02:50:02,483 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6961_210_2087_1_1527937168_6815011_562-165518-0_sp0.9 from training. Duration: 0.8555625 2023-10-02 02:50:05,470 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.balancer2.prob, batch_count=724546.6666666666, ans=0.125 2023-10-02 02:50:07,839 WARNING [train.py:1204] (2/4) Exclude cut with ID _27052_210_5986_1_1532329197187_3585009_338-325870-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 02:50:07,891 WARNING [train.py:1204] (2/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_469-94673-0_sp1.1 from training. Duration: 0.7181875 2023-10-02 02:50:07,922 WARNING [train.py:1204] (2/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_259-34038-0 from training. Duration: 0.74 2023-10-02 02:50:09,240 WARNING [train.py:1204] (2/4) Exclude cut with ID _14922_210_6228_1_1530707435273_3693310_388-129552-0 from training. Duration: 0.96 2023-10-02 02:50:12,030 WARNING [train.py:1204] (2/4) Exclude cut with ID _28394_210_8913_1_1532430049518_3650118_342-251455-0_sp1.1 from training. Duration: 0.936375 2023-10-02 02:50:12,062 WARNING [train.py:1204] (2/4) Exclude cut with ID _53731_210_7994_1_1534553979244_3757214_8-191390-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 02:50:12,113 WARNING [train.py:1204] (2/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_613-232856-0 from training. Duration: 0.54 2023-10-02 02:50:14,097 WARNING [train.py:1204] (2/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_557-349534-0 from training. Duration: 0.91 2023-10-02 02:50:14,106 WARNING [train.py:1204] (2/4) Exclude cut with ID _14224_210_4381_1_1530529062037_4264010_379-184223-0 from training. Duration: 0.85 2023-10-02 02:50:14,110 WARNING [train.py:1204] (2/4) Exclude cut with ID IC0125W0001-61807 from training. Duration: 0.966 2023-10-02 02:50:15,407 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_229-45300-0 from training. Duration: 0.57 2023-10-02 02:50:15,483 WARNING [train.py:1204] (2/4) Exclude cut with ID _30304_210_6319_1_1532476827261_7582060_1069-24743-0_sp1.1 from training. Duration: 0.92725 2023-10-02 02:50:16,972 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.0.nonlin_attention.balancer.prob, batch_count=724546.6666666666, ans=0.125 2023-10-02 02:50:18,346 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_41452_210_7291_1_1533437687_3941281_323-172873-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 02:50:18,362 WARNING [train.py:1204] (2/4) Exclude cut with ID _37086_210_9038_1_1532999540891_7021250_802-226709-0_sp1.1 from training. Duration: 0.97275 2023-10-02 02:50:19,694 INFO [train.py:1046] (2/4) Epoch 21, batch 2450, loss[loss=0.1719, simple_loss=0.2608, pruned_loss=0.04151, over 24484.00 frames. ], tot_loss[loss=0.174, simple_loss=0.2503, pruned_loss=0.04891, over 4699963.56 frames. ], batch size: 69, lr: 4.90e-03, grad_scale: 16.0 2023-10-02 02:50:19,784 WARNING [train.py:1204] (2/4) Exclude cut with ID IC0144W0500-71767 from training. Duration: 0.931875 2023-10-02 02:50:19,852 WARNING [train.py:1204] (2/4) Exclude cut with ID _39846_210_12651_1_1533605310265_2115212_233-129699-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 02:50:21,208 WARNING [train.py:1204] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_574-45541-0_sp0.9 from training. Duration: 0.72225 2023-10-02 02:50:23,233 WARNING [train.py:1204] (2/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_372-298093-0_sp1.1 from training. Duration: 0.72725 2023-10-02 02:50:24,681 WARNING [train.py:1204] (2/4) Exclude cut with ID _62574_210_2655_1_1535180407058_3933249_34-26343-0_sp1.1 from training. Duration: 0.97275 2023-10-02 02:50:27,735 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_512-256238-0_sp1.1 from training. Duration: 0.963625 2023-10-02 02:50:27,740 WARNING [train.py:1204] (2/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_196-55502-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 02:50:27,964 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.conv_skip_rate, batch_count=724613.3333333334, ans=0.0 2023-10-02 02:50:29,787 WARNING [train.py:1204] (2/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_195-167011-0 from training. Duration: 0.75 2023-10-02 02:50:35,264 WARNING [train.py:1204] (2/4) Exclude cut with ID _13678_210_7497_1_1530530069226_3463170_345-230416-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 02:50:35,274 WARNING [train.py:1204] (2/4) Exclude cut with ID _51078_210_9070_1_1534161557424_3805250_225-327809-0_sp1.1 from training. Duration: 0.963625 2023-10-02 02:50:38,160 WARNING [train.py:1204] (2/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_18-189198-0_sp1.1 from training. Duration: 0.8181875 2023-10-02 02:50:38,170 WARNING [train.py:1204] (2/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_329-82235-0_sp1.1 from training. Duration: 0.7454375 2023-10-02 02:50:38,178 WARNING [train.py:1204] (2/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_726-270780-0_sp0.9 from training. Duration: 0.9555625 2023-10-02 02:50:39,521 WARNING [train.py:1204] (2/4) Exclude cut with ID _66873_210_5517_1_1535454136506_4293370_654-52713-0 from training. Duration: 0.77 2023-10-02 02:50:41,148 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.bypass_mid.scale_min, batch_count=724680.0, ans=0.2 2023-10-02 02:50:44,308 WARNING [train.py:1204] (2/4) Exclude cut with ID _20455_210_5219_1_1531565996775_8098100_191-16006-0_sp1.1 from training. Duration: 0.963625 2023-10-02 02:50:45,715 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_3171_210_2087_1_1526109084_2330007_66-260583-0_sp1.1 from training. Duration: 0.6545625 2023-10-02 02:50:47,002 WARNING [train.py:1204] (2/4) Exclude cut with ID _38670_210_15379_1_1533448810659_4391059_63-129006-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 02:50:49,781 WARNING [train.py:1204] (2/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_393-57888-0_sp0.9 from training. Duration: 0.711125 2023-10-02 02:50:49,820 WARNING [train.py:1204] (2/4) Exclude cut with ID _6275_210_2084_1_1528110062213_7669049_857-298942-0_sp0.9 from training. Duration: 0.9444375 2023-10-02 02:50:53,294 WARNING [train.py:1204] (2/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_239-22076-0_sp0.9 from training. Duration: 0.9444375 2023-10-02 02:50:53,338 WARNING [train.py:1204] (2/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_424-227947-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 02:50:55,327 WARNING [train.py:1204] (2/4) Exclude cut with ID _14069_210_5632_1_1530512692033_4803390_95-1896-0 from training. Duration: 0.98 2023-10-02 02:50:55,864 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.4.encoder.layers.1.conv_module2.whiten, num_groups=1, num_channels=384, metric=4.10 vs. limit=15.0 2023-10-02 02:50:56,687 WARNING [train.py:1204] (2/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_96-244299-0_sp0.9 from training. Duration: 0.97775 2023-10-02 02:51:04,326 WARNING [train.py:1204] (2/4) Exclude cut with ID _46683_210_9642_1_1533730913492_4591116_562-319825-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 02:51:05,728 WARNING [train.py:1204] (2/4) Exclude cut with ID _64639_210_5816_1_1535367619437_3528000_284-173474-0_sp1.1 from training. Duration: 0.963625 2023-10-02 02:51:05,756 WARNING [train.py:1204] (2/4) Exclude cut with ID _29529_210_7234_1_1532393961455_3977460_497-100133-0_sp1.1 from training. Duration: 0.936375 2023-10-02 02:51:07,076 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_12868_210_6753_1_1530701932_530275_18-76322-0_sp0.9 from training. Duration: 0.9666875 2023-10-02 02:51:07,113 WARNING [train.py:1204] (2/4) Exclude cut with ID _50093_210_4220_1_1534417141207_3432299_116-303447-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 02:51:08,433 WARNING [train.py:1204] (2/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_44-79427-0_sp1.1 from training. Duration: 0.8909375 2023-10-02 02:51:09,757 WARNING [train.py:1204] (2/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_887-249555-0 from training. Duration: 0.77 2023-10-02 02:51:12,592 WARNING [train.py:1204] (2/4) Exclude cut with ID _28461_210_2253_1_1532771775189_3041299_96-152158-0_sp0.9 from training. Duration: 0.9444375 2023-10-02 02:51:12,645 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_14058_210_4163_1_1530690953_6327180_49-210687-0_sp1.1 from training. Duration: 0.8545625 2023-10-02 02:51:14,760 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.conv_module2.balancer2.prob, batch_count=724813.3333333334, ans=0.125 2023-10-02 02:51:17,352 WARNING [train.py:1204] (2/4) Exclude cut with ID _40399_210_4353_1_1533305658194_3279406_351-127903-0_sp1.1 from training. Duration: 0.97275 2023-10-02 02:51:17,359 WARNING [train.py:1204] (2/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_184-206939-0_sp1.1 from training. Duration: 0.936375 2023-10-02 02:51:17,660 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.feed_forward1.out_proj.dropout_p, batch_count=724813.3333333334, ans=0.1 2023-10-02 02:51:22,736 WARNING [train.py:1204] (2/4) Exclude cut with ID _14100_210_8341_1_1530529320613_3678069_258-224599-0_sp1.1 from training. Duration: 0.7 2023-10-02 02:51:22,758 WARNING [train.py:1204] (2/4) Exclude cut with ID _6493_210_1747_1_1528459210424_4012470_324-323854-0 from training. Duration: 0.92 2023-10-02 02:51:24,017 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_12868_210_6753_1_1530702479_3610564_151-325626-0_sp1.1 from training. Duration: 0.8818125 2023-10-02 02:51:24,092 WARNING [train.py:1204] (2/4) Exclude cut with ID _14528_210_8254_1_1530669700514_4306939_106-43145-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 02:51:24,113 WARNING [train.py:1204] (2/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_686-54383-0 from training. Duration: 0.87 2023-10-02 02:51:24,158 WARNING [train.py:1204] (2/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_801-102362-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 02:51:27,324 WARNING [train.py:1204] (2/4) Exclude cut with ID _48677_210_5663_1_1533902405919_3892989_408-40740-0_sp0.9 from training. Duration: 0.92225 2023-10-02 02:51:29,002 WARNING [train.py:1204] (2/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_448-212135-0_sp1.1 from training. Duration: 0.82725 2023-10-02 02:51:29,130 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.0.feed_forward1.out_proj.dropout_p, batch_count=724880.0, ans=0.1 2023-10-02 02:51:29,226 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.bypass.skip_rate, batch_count=724880.0, ans=0.09899494936611666 2023-10-02 02:51:30,421 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.conv_module1.balancer2.prob, batch_count=724880.0, ans=0.125 2023-10-02 02:51:31,637 WARNING [train.py:1204] (2/4) Exclude cut with ID _59170_210_6721_1_1534983633960_4203335_320-124047-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 02:51:33,016 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_4692_210_3528_1_1526899755_4323254_107-44401-0_sp1.1 from training. Duration: 0.7818125 2023-10-02 02:51:34,472 INFO [train.py:1046] (2/4) Epoch 21, batch 2500, loss[loss=0.1838, simple_loss=0.2553, pruned_loss=0.05618, over 23524.00 frames. ], tot_loss[loss=0.1737, simple_loss=0.2496, pruned_loss=0.04893, over 4699035.90 frames. ], batch size: 134, lr: 4.90e-03, grad_scale: 16.0 2023-10-02 02:51:36,392 WARNING [train.py:1204] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_731-2428-0 from training. Duration: 0.83 2023-10-02 02:51:36,472 WARNING [train.py:1204] (2/4) Exclude cut with ID _29857_210_20034_1_1532395886753_3798160_335-163562-0_sp1.1 from training. Duration: 0.77275 2023-10-02 02:51:42,100 WARNING [train.py:1204] (2/4) Exclude cut with ID _14832_210_8750_1_1530691351539_3171348_171-275004-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 02:51:49,880 WARNING [train.py:1204] (2/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_105-328174-0_sp0.9 from training. Duration: 0.8444375 2023-10-02 02:51:51,176 WARNING [train.py:1204] (2/4) Exclude cut with ID _68965_210_1883_1_1535698962272_3497490_164-118421-0_sp1.1 from training. Duration: 0.936375 2023-10-02 02:51:52,518 WARNING [train.py:1204] (2/4) Exclude cut with ID _19896_210_1883_1_1531709022662_6649569_380-24170-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 02:51:52,523 WARNING [train.py:1204] (2/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_505-205270-0_sp1.1 from training. Duration: 0.3454375 2023-10-02 02:51:59,144 WARNING [train.py:1204] (2/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_427-208901-0_sp1.1 from training. Duration: 0.6181875 2023-10-02 02:51:59,204 WARNING [train.py:1204] (2/4) Exclude cut with ID _23818_210_6228_1_1531917063884_3676920_85-326916-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 02:52:00,003 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.0.layers.1.nonlin_attention.whiten1, num_groups=1, num_channels=144, metric=5.33 vs. limit=10.0 2023-10-02 02:52:00,607 WARNING [train.py:1204] (2/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_101-293367-0_sp1.1 from training. Duration: 0.57275 2023-10-02 02:52:00,615 WARNING [train.py:1204] (2/4) Exclude cut with ID _5409_210_1388_1_1527292850189_5865191_178-127970-0_sp1.1 from training. Duration: 0.5181875 2023-10-02 02:52:02,006 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_403-39619-0 from training. Duration: 0.84 2023-10-02 02:52:03,445 WARNING [train.py:1204] (2/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_427-87841-0_sp1.1 from training. Duration: 0.963625 2023-10-02 02:52:04,788 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.542e+02 1.878e+02 2.187e+02 2.689e+02 3.332e+02, threshold=4.374e+02, percent-clipped=0.0 2023-10-02 02:52:04,903 WARNING [train.py:1204] (2/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_286-68181-0_sp1.1 from training. Duration: 0.97275 2023-10-02 02:52:04,939 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6673_210_3273_1_1527767790_3173240_327-306185-0 from training. Duration: 0.83 2023-10-02 02:52:04,961 WARNING [train.py:1204] (2/4) Exclude cut with ID _3675_210_4557_1_1526639934408_4182089_106-203713-0_sp1.1 from training. Duration: 0.963625 2023-10-02 02:52:06,883 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_53071_210_16554_1_1534493434_1276796_130-36076-0 from training. Duration: 0.95 2023-10-02 02:52:06,916 WARNING [train.py:1204] (2/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_586-93054-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 02:52:11,145 WARNING [train.py:1204] (2/4) Exclude cut with ID _23825_210_13449_1_1531915313526_3497150_262-10571-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 02:52:12,464 WARNING [train.py:1204] (2/4) Exclude cut with ID _28783_210_7943_1_1532671164808_7155479_432-67885-0_sp1.1 from training. Duration: 0.97275 2023-10-02 02:52:15,222 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6287_210_3273_1_1527509506_2653716_182-199887-0_sp0.9 from training. Duration: 0.5666875 2023-10-02 02:52:17,207 WARNING [train.py:1204] (2/4) Exclude cut with ID _3231_210_2106_1_1526205864934_6784220_171-256004-0 from training. Duration: 0.86 2023-10-02 02:52:17,261 WARNING [train.py:1204] (2/4) Exclude cut with ID _69051_210_1531_1_1535885566901_3829538_332-92090-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 02:52:17,411 WARNING [train.py:1204] (2/4) Exclude cut with ID _3675_210_4557_1_1526639934408_4182089_104-324125-0_sp1.1 from training. Duration: 0.963625 2023-10-02 02:52:21,478 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_41764_210_2521_1_1533686110_4089313_501-2119-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 02:52:25,349 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_14056_210_4163_1_1530604483_7572827_56-100988-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 02:52:26,946 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.attention_skip_rate, batch_count=725146.6666666666, ans=0.0 2023-10-02 02:52:28,867 WARNING [train.py:1204] (2/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_398-239819-0_sp1.1 from training. Duration: 0.9 2023-10-02 02:52:29,115 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=725146.6666666666, ans=0.1 2023-10-02 02:52:34,469 WARNING [train.py:1204] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_74-48717-0_sp1.1 from training. Duration: 0.52725 2023-10-02 02:52:35,841 WARNING [train.py:1204] (2/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_177-49092-0 from training. Duration: 0.91 2023-10-02 02:52:35,867 WARNING [train.py:1204] (2/4) Exclude cut with ID _62337_210_12392_1_1535009273105_3412229_44-176353-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 02:52:37,752 WARNING [train.py:1204] (2/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_110-130961-0_sp1.1 from training. Duration: 0.42725 2023-10-02 02:52:39,253 WARNING [train.py:1204] (2/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_37-189262-0_sp1.1 from training. Duration: 0.8909375 2023-10-02 02:52:39,253 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_3172_210_2087_1_1526292929_3987865_197-97762-0_sp0.9 from training. Duration: 0.8333125 2023-10-02 02:52:39,871 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.2.encoder.layers.2.whiten, num_groups=1, num_channels=384, metric=6.13 vs. limit=12.0 2023-10-02 02:52:40,640 WARNING [train.py:1204] (2/4) Exclude cut with ID IC0677W0065-334897 from training. Duration: 0.896 2023-10-02 02:52:40,640 WARNING [train.py:1204] (2/4) Exclude cut with ID IC0677W0080-334912 from training. Duration: 0.768 2023-10-02 02:52:40,645 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_120-41668-0 from training. Duration: 0.7 2023-10-02 02:52:43,421 WARNING [train.py:1204] (2/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_460-205287-0_sp1.1 from training. Duration: 0.963625 2023-10-02 02:52:46,149 WARNING [train.py:1204] (2/4) Exclude cut with ID _14632_210_9988_1_1530691279392_3923200_102-204477-0 from training. Duration: 0.97 2023-10-02 02:52:46,155 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_34815_210_6388_1_1533381320_3571903_279-305565-0 from training. Duration: 0.92 2023-10-02 02:52:46,232 WARNING [train.py:1204] (2/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_91-148831-0_sp1.1 from training. Duration: 0.9 2023-10-02 02:52:49,406 INFO [train.py:1046] (2/4) Epoch 21, batch 2550, loss[loss=0.1932, simple_loss=0.2738, pruned_loss=0.05629, over 24469.00 frames. ], tot_loss[loss=0.174, simple_loss=0.2498, pruned_loss=0.04912, over 4706998.70 frames. ], batch size: 69, lr: 4.90e-03, grad_scale: 16.0 2023-10-02 02:52:49,448 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6291_210_3273_1_1527683649_1078498_82-221196-0 from training. Duration: 0.76 2023-10-02 02:52:49,649 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.conv_skip_rate, batch_count=725280.0, ans=0.0 2023-10-02 02:52:50,997 WARNING [train.py:1204] (2/4) Exclude cut with ID _38020_210_5986_1_1533112252259_7006089_493-102745-0 from training. Duration: 0.98 2023-10-02 02:52:53,746 WARNING [train.py:1204] (2/4) Exclude cut with ID _61133_210_9969_1_1535196777227_3696409_40-93442-0_sp1.1 from training. Duration: 0.92725 2023-10-02 02:52:55,192 WARNING [train.py:1204] (2/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_45-258852-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 02:52:55,235 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_53071_210_16554_1_1534493434_1276796_130-36076-0_sp1.1 from training. Duration: 0.863625 2023-10-02 02:52:57,806 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8694_210_6400_1_1529065754_3260756_332-13553-0_sp1.1 from training. Duration: 0.92725 2023-10-02 02:52:59,777 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_405-339986-0 from training. Duration: 0.51 2023-10-02 02:52:59,820 WARNING [train.py:1204] (2/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_927-43827-0_sp1.1 from training. Duration: 0.67275 2023-10-02 02:53:01,591 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.conv_module2.balancer1.min_positive, batch_count=725280.0, ans=0.025 2023-10-02 02:53:01,605 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.conv_module1.balancer2.prob, batch_count=725280.0, ans=0.125 2023-10-02 02:53:01,625 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.balancer1.prob, batch_count=725280.0, ans=0.125 2023-10-02 02:53:03,449 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_397-147196-0 from training. Duration: 0.8 2023-10-02 02:53:06,192 WARNING [train.py:1204] (2/4) Exclude cut with ID 3557-8342-0013-54691-0_sp1.1 from training. Duration: 0.836375 2023-10-02 02:53:07,619 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_47438_210_5399_1_1533797471_4513356_626-127722-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 02:53:07,796 WARNING [train.py:1204] (2/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_256-323883-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 02:53:09,640 WARNING [train.py:1204] (2/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_839-173017-0_sp0.9 from training. Duration: 0.4555625 2023-10-02 02:53:09,692 WARNING [train.py:1204] (2/4) Exclude cut with ID _5289_210_2084_1_1527678202736_3779299_392-261988-0_sp1.1 from training. Duration: 0.5909375 2023-10-02 02:53:11,006 WARNING [train.py:1204] (2/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_119-224384-0_sp1.1 from training. Duration: 0.936375 2023-10-02 02:53:11,040 WARNING [train.py:1204] (2/4) Exclude cut with ID _57523_210_1856_1_1534766294545_3732989_262-267933-0_sp1.1 from training. Duration: 0.963625 2023-10-02 02:53:13,769 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_35361_210_4238_1_1533262810_7793476_675-87012-0_sp0.9 from training. Duration: 0.988875 2023-10-02 02:53:13,794 WARNING [train.py:1204] (2/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_17-90541-0 from training. Duration: 0.94 2023-10-02 02:53:13,818 WARNING [train.py:1204] (2/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_535-14410-0_sp1.1 from training. Duration: 0.42725 2023-10-02 02:53:13,823 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_24673_210_6400_1_1531968633_778659_94-154511-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 02:53:13,826 WARNING [train.py:1204] (2/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_212-10501-0 from training. Duration: 0.94 2023-10-02 02:53:25,456 WARNING [train.py:1204] (2/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_383-285196-0_sp1.1 from training. Duration: 0.8818125 2023-10-02 02:53:25,599 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.feed_forward3.hidden_balancer.prob, batch_count=725413.3333333334, ans=0.125 2023-10-02 02:53:32,739 WARNING [train.py:1204] (2/4) Exclude cut with ID _45560_210_21982_1_1533812603951_3584999_238-47020-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 02:53:32,748 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_21500_210_2054_1_1532149310_2716758_278-18053-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 02:53:32,755 WARNING [train.py:1204] (2/4) Exclude cut with ID _56235_210_10640_1_1534557335235_4161099_453-9626-0_sp1.1 from training. Duration: 0.936375 2023-10-02 02:53:32,848 WARNING [train.py:1204] (2/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_153-97441-0_sp1.1 from training. Duration: 0.5090625 2023-10-02 02:53:37,822 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.conv_module1.balancer2.min_abs, batch_count=725480.0, ans=0.5 2023-10-02 02:53:39,529 WARNING [train.py:1204] (2/4) Exclude cut with ID _67725_210_27767_1_1535608756671_5105359_431-120895-0_sp1.1 from training. Duration: 0.963625 2023-10-02 02:53:40,992 WARNING [train.py:1204] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_574-45541-0_sp1.1 from training. Duration: 0.5909375 2023-10-02 02:53:41,007 WARNING [train.py:1204] (2/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_162-103620-0_sp1.1 from training. Duration: 0.8454375 2023-10-02 02:53:42,330 WARNING [train.py:1204] (2/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_734-129761-0_sp1.1 from training. Duration: 0.7454375 2023-10-02 02:53:42,367 WARNING [train.py:1204] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_620-276793-0_sp1.1 from training. Duration: 0.536375 2023-10-02 02:53:42,400 WARNING [train.py:1204] (2/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_153-97441-0_sp0.9 from training. Duration: 0.62225 2023-10-02 02:53:45,312 WARNING [train.py:1204] (2/4) Exclude cut with ID _12733_210_8341_1_1530167424556_3646380_523-85621-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 02:53:46,641 WARNING [train.py:1204] (2/4) Exclude cut with ID _64048_210_10626_1_1535198382802_3835179_352-179834-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 02:53:50,139 WARNING [train.py:1204] (2/4) Exclude cut with ID _55162_210_17071_1_1534408248583_3753480_43-242364-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 02:53:50,165 WARNING [train.py:1204] (2/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_661-264857-0 from training. Duration: 0.93 2023-10-02 02:53:50,171 WARNING [train.py:1204] (2/4) Exclude cut with ID _6274_210_2084_1_1527937583837_7494020_325-237800-0_sp1.1 from training. Duration: 0.97275 2023-10-02 02:53:50,207 WARNING [train.py:1204] (2/4) Exclude cut with ID _55328_210_27767_1_1534580973260_4088170_385-171459-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 02:53:51,515 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_3780_210_2087_1_1526467389_2145572_176-107140-0_sp1.1 from training. Duration: 0.62725 2023-10-02 02:53:52,868 WARNING [train.py:1204] (2/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_375-244976-0_sp1.1 from training. Duration: 0.6818125 2023-10-02 02:53:54,275 WARNING [train.py:1204] (2/4) Exclude cut with ID _35941_210_15379_1_1532995294515_4023089_309-33162-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 02:54:01,746 WARNING [train.py:1204] (2/4) Exclude cut with ID _14459_210_2228_1_1531443641946_7516700_1185-22038-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 02:54:03,274 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.attention_skip_rate, batch_count=725613.3333333334, ans=0.0 2023-10-02 02:54:04,384 INFO [train.py:1046] (2/4) Epoch 21, batch 2600, loss[loss=0.1733, simple_loss=0.252, pruned_loss=0.04736, over 23359.00 frames. ], tot_loss[loss=0.1749, simple_loss=0.2508, pruned_loss=0.04951, over 4719204.65 frames. ], batch size: 105, lr: 4.90e-03, grad_scale: 16.0 2023-10-02 02:54:04,441 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_1197-69460-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 02:54:05,935 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9012W0022-501157 from training. Duration: 0.896 2023-10-02 02:54:09,994 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9023W0009-506302 from training. Duration: 0.896 2023-10-02 02:54:10,019 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_2821_210_2715_1_1526108385_3763560_73-296673-0_sp1.1 from training. Duration: 0.7545625 2023-10-02 02:54:10,059 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9025W0113-507442 from training. Duration: 0.896 2023-10-02 02:54:10,153 WARNING [train.py:1204] (2/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_739-2098-0 from training. Duration: 0.92 2023-10-02 02:54:11,473 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9029W0332-509722 from training. Duration: 0.768 2023-10-02 02:54:12,970 WARNING [train.py:1204] (2/4) Exclude cut with ID _16908_210_5724_1_1531119157248_3772700_68-101953-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 02:54:12,993 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9042W0011-515617 from training. Duration: 0.768 2023-10-02 02:54:14,372 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_3019_210_2028_1_1526044245_3264865_191-72235-0 from training. Duration: 0.97 2023-10-02 02:54:16,228 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9052W0006-520792 from training. Duration: 0.896 2023-10-02 02:54:17,608 WARNING [train.py:1204] (2/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_245-174553-0_sp1.1 from training. Duration: 0.8 2023-10-02 02:54:20,338 WARNING [train.py:1204] (2/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_202-67017-0 from training. Duration: 0.82 2023-10-02 02:54:21,753 WARNING [train.py:1204] (2/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_304-59227-0 from training. Duration: 0.77 2023-10-02 02:54:21,850 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_3133_210_2087_1_1526106446_2324690_246-125000-0_sp0.9 from training. Duration: 0.688875 2023-10-02 02:54:23,208 WARNING [train.py:1204] (2/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_243-42116-0 from training. Duration: 0.73 2023-10-02 02:54:24,713 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9087W0121-538492 from training. Duration: 0.896 2023-10-02 02:54:24,741 WARNING [train.py:1204] (2/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_120-76843-0 from training. Duration: 0.55 2023-10-02 02:54:30,926 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.0.feed_forward1.out_proj.dropout_p, batch_count=725680.0, ans=0.1 2023-10-02 02:54:32,044 WARNING [train.py:1204] (2/4) Exclude cut with ID _7852_210_5219_1_1528286471316_8139400_850-304621-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 02:54:32,062 WARNING [train.py:1204] (2/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_154-285415-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 02:54:32,073 WARNING [train.py:1204] (2/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_538-278194-0_sp1.1 from training. Duration: 0.92725 2023-10-02 02:54:32,074 WARNING [train.py:1204] (2/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_119-207162-0 from training. Duration: 0.71 2023-10-02 02:54:34,847 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.561e+02 1.872e+02 2.087e+02 2.390e+02 4.163e+02, threshold=4.174e+02, percent-clipped=0.0 2023-10-02 02:54:36,277 WARNING [train.py:1204] (2/4) Exclude cut with ID _2334_210_2667_1_1525608238880_4705129_459-104997-0_sp1.1 from training. Duration: 0.863625 2023-10-02 02:54:42,940 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9147W0019-567847 from training. Duration: 0.768 2023-10-02 02:54:45,978 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.conv_skip_rate, batch_count=725746.6666666666, ans=0.0 2023-10-02 02:54:47,146 WARNING [train.py:1204] (2/4) Exclude cut with ID _69884_210_9771_1_1535846392818_4166169_228-189439-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 02:54:48,947 WARNING [train.py:1204] (2/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_196-77172-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 02:54:49,022 WARNING [train.py:1204] (2/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_625-338546-0 from training. Duration: 0.88 2023-10-02 02:54:49,086 WARNING [train.py:1204] (2/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_193-327630-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 02:54:49,087 WARNING [train.py:1204] (2/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_391-24006-0_sp1.1 from training. Duration: 0.92725 2023-10-02 02:54:50,386 WARNING [train.py:1204] (2/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_197-28164-0 from training. Duration: 0.8 2023-10-02 02:54:53,168 WARNING [train.py:1204] (2/4) Exclude cut with ID _8395_210_1531_1_1528597871523_4411579_431-132470-0_sp1.1 from training. Duration: 0.67275 2023-10-02 02:54:53,197 WARNING [train.py:1204] (2/4) Exclude cut with ID _35350_210_16040_1_1533040191183_4075299_465-248326-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 02:54:53,489 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.feed_forward3.hidden_balancer.prob, batch_count=725813.3333333334, ans=0.125 2023-10-02 02:54:54,615 WARNING [train.py:1204] (2/4) Exclude cut with ID _16756_210_4915_1_1531047803763_7104259_101-141922-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 02:54:58,609 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9212W0005-600172 from training. Duration: 0.896 2023-10-02 02:54:58,630 WARNING [train.py:1204] (2/4) Exclude cut with ID _63186_210_2084_1_1535414479705_3908649_378-166820-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 02:54:58,652 WARNING [train.py:1204] (2/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_52-211683-0_sp1.1 from training. Duration: 0.8181875 2023-10-02 02:55:06,216 WARNING [train.py:1204] (2/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_1147-237729-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 02:55:06,298 WARNING [train.py:1204] (2/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_234-14174-0_sp0.9 from training. Duration: 0.911125 2023-10-02 02:55:06,321 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_454-276429-0 from training. Duration: 0.87 2023-10-02 02:55:07,616 WARNING [train.py:1204] (2/4) Exclude cut with ID _53713_210_24116_1_1534314487762_7416751_755-165313-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 02:55:09,074 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8278_210_2550_1_1528637099_4220413_436-317072-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 02:55:10,411 WARNING [train.py:1204] (2/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_28-324729-0_sp1.1 from training. Duration: 0.8545625 2023-10-02 02:55:16,384 WARNING [train.py:1204] (2/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_326-165900-0 from training. Duration: 0.69 2023-10-02 02:55:16,454 WARNING [train.py:1204] (2/4) Exclude cut with ID _53489_210_6228_1_1534298357169_3835970_96-30246-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 02:55:17,976 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8275_210_4198_1_1528455320_3892571_491-17707-0_sp1.1 from training. Duration: 0.5818125 2023-10-02 02:55:19,856 INFO [train.py:1046] (2/4) Epoch 21, batch 2650, loss[loss=0.2265, simple_loss=0.2848, pruned_loss=0.08408, over 19769.00 frames. ], tot_loss[loss=0.1753, simple_loss=0.2511, pruned_loss=0.04976, over 4719463.32 frames. ], batch size: 388, lr: 4.90e-03, grad_scale: 16.0 2023-10-02 02:55:22,769 WARNING [train.py:1204] (2/4) Exclude cut with ID _14348_210_8341_1_1530599434996_3778640_240-232370-0 from training. Duration: 0.84 2023-10-02 02:55:22,778 WARNING [train.py:1204] (2/4) Exclude cut with ID _45012_210_12651_1_1533608435136_5371380_54-112623-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 02:55:22,864 WARNING [train.py:1204] (2/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_328-264776-0_sp1.1 from training. Duration: 0.6090625 2023-10-02 02:55:23,094 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.bypass.skip_rate, batch_count=725946.6666666666, ans=0.04949747468305833 2023-10-02 02:55:24,114 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9325W0001-648427 from training. Duration: 0.768 2023-10-02 02:55:25,394 WARNING [train.py:1204] (2/4) Exclude cut with ID _56480_210_4220_1_1534679942079_3075409_49-74717-0_sp1.1 from training. Duration: 0.936375 2023-10-02 02:55:28,102 WARNING [train.py:1204] (2/4) Exclude cut with ID _59500_210_8254_1_1534809610680_3736009_6-241869-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 02:55:29,682 WARNING [train.py:1204] (2/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_172-215932-0_sp0.9 from training. Duration: 0.7333125 2023-10-02 02:55:32,358 WARNING [train.py:1204] (2/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_17-90541-0_sp1.1 from training. Duration: 0.8545625 2023-10-02 02:55:34,366 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_43831_210_2158_1_1533725502_5872791_525-89447-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 02:55:35,822 WARNING [train.py:1204] (2/4) Exclude cut with ID _12302_210_5219_1_1529924446466_8107280_907-182949-0 from training. Duration: 0.92 2023-10-02 02:55:35,834 WARNING [train.py:1204] (2/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_402-102311-0_sp0.9 from training. Duration: 0.7444375 2023-10-02 02:55:35,865 WARNING [train.py:1204] (2/4) Exclude cut with ID _8021_210_7994_1_1528376451358_3653069_179-45206-0_sp0.9 from training. Duration: 0.9555625 2023-10-02 02:55:37,334 WARNING [train.py:1204] (2/4) Exclude cut with ID _40137_210_12210_1_1533290484046_3795180_249-187059-0 from training. Duration: 0.88 2023-10-02 02:55:40,105 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9653W0194-675472 from training. Duration: 0.9658125 2023-10-02 02:55:40,461 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.conv_module1.balancer1.prob, batch_count=726013.3333333334, ans=0.125 2023-10-02 02:55:41,596 WARNING [train.py:1204] (2/4) Exclude cut with ID _51332_210_9988_1_1534244371836_3801270_336-255604-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 02:55:41,780 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.conv_skip_rate, batch_count=726013.3333333334, ans=0.0 2023-10-02 02:55:45,129 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_20120_210_6753_1_1531643035_5482859_510-96996-0 from training. Duration: 0.87 2023-10-02 02:55:45,147 WARNING [train.py:1204] (2/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_1167-20437-0_sp1.1 from training. Duration: 0.92725 2023-10-02 02:55:45,630 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.4.encoder.layers.1.whiten, num_groups=1, num_channels=384, metric=6.13 vs. limit=12.0 2023-10-02 02:55:46,665 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_3475_210_2028_1_1526193578_3622569_265-276625-0 from training. Duration: 0.76 2023-10-02 02:55:50,799 WARNING [train.py:1204] (2/4) Exclude cut with ID _14860_210_4915_1_1530702020340_7343240_470-172418-0_sp1.1 from training. Duration: 0.97275 2023-10-02 02:55:50,807 WARNING [train.py:1204] (2/4) Exclude cut with ID _5444_210_2151_1_1527418808464_5312210_581-315411-0_sp1.1 from training. Duration: 0.563625 2023-10-02 02:55:52,695 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_34450_210_9547_1_1533099443_4109050_519-342439-0_sp1.1 from training. Duration: 0.97275 2023-10-02 02:55:52,734 WARNING [train.py:1204] (2/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_50-175961-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 02:55:58,413 WARNING [train.py:1204] (2/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_372-6869-0 from training. Duration: 0.44 2023-10-02 02:55:58,419 WARNING [train.py:1204] (2/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_3-237434-0 from training. Duration: 0.76 2023-10-02 02:56:01,272 WARNING [train.py:1204] (2/4) Exclude cut with ID _8772_210_1850_1_1528635595972_3664011_247-336448-0_sp1.1 from training. Duration: 0.67275 2023-10-02 02:56:04,176 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_427-78097-0 from training. Duration: 0.8 2023-10-02 02:56:04,192 WARNING [train.py:1204] (2/4) Exclude cut with ID _69884_210_9771_1_1535846392818_4166169_466-252727-0_sp1.1 from training. Duration: 0.97275 2023-10-02 02:56:04,364 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.out_combiner.scale_min, batch_count=726146.6666666666, ans=0.2 2023-10-02 02:56:06,108 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_15972_210_6400_1_1530877934_3405692_316-35478-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 02:56:06,150 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_809-17156-0_sp0.9 from training. Duration: 0.811125 2023-10-02 02:56:07,439 WARNING [train.py:1204] (2/4) Exclude cut with ID _57572_210_5517_1_1534921219624_3809484_594-263474-0_sp1.1 from training. Duration: 0.936375 2023-10-02 02:56:07,466 WARNING [train.py:1204] (2/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_190-228204-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 02:56:08,859 WARNING [train.py:1204] (2/4) Exclude cut with ID _19514_210_9259_1_1531530071330_5084230_117-21193-0_sp1.1 from training. Duration: 0.936375 2023-10-02 02:56:10,280 WARNING [train.py:1204] (2/4) Exclude cut with ID _69584_210_5219_1_1535785133775_4044450_190-21315-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 02:56:11,032 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.2.encoder.layers.1.self_attn2.whiten, num_groups=1, num_channels=384, metric=15.82 vs. limit=22.5 2023-10-02 02:56:11,638 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_53071_210_16554_1_1534493434_1276796_184-302050-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 02:56:12,947 WARNING [train.py:1204] (2/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_120-76843-0_sp1.1 from training. Duration: 0.5 2023-10-02 02:56:13,019 WARNING [train.py:1204] (2/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_318-162017-0_sp1.1 from training. Duration: 0.82725 2023-10-02 02:56:14,455 WARNING [train.py:1204] (2/4) Exclude cut with ID _46067_210_4220_1_1534125571066_3307450_79-46864-0_sp1.1 from training. Duration: 0.92725 2023-10-02 02:56:16,282 WARNING [train.py:1204] (2/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_358-215558-0_sp1.1 from training. Duration: 0.8454375 2023-10-02 02:56:17,646 WARNING [train.py:1204] (2/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_621-66764-0_sp1.1 from training. Duration: 0.92725 2023-10-02 02:56:20,708 WARNING [train.py:1204] (2/4) Exclude cut with ID _42863_210_9070_1_1533902398788_3696920_338-156851-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 02:56:20,737 WARNING [train.py:1204] (2/4) Exclude cut with ID _5289_210_2084_1_1527678202736_3779299_392-261988-0_sp0.9 from training. Duration: 0.72225 2023-10-02 02:56:23,505 WARNING [train.py:1204] (2/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_409-84682-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 02:56:23,593 WARNING [train.py:1204] (2/4) Exclude cut with ID _9235_210_5219_1_1528891154253_8046120_30-326681-0_sp0.9 from training. Duration: 0.92225 2023-10-02 02:56:23,597 WARNING [train.py:1204] (2/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_389-184695-0_sp1.1 from training. Duration: 0.92725 2023-10-02 02:56:23,634 WARNING [train.py:1204] (2/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_442-320317-0 from training. Duration: 0.82 2023-10-02 02:56:28,405 WARNING [train.py:1204] (2/4) Exclude cut with ID _44778_210_2054_1_1533798127203_7443090_756-29911-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 02:56:28,503 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_401-112036-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 02:56:29,996 WARNING [train.py:1204] (2/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_181-308827-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 02:56:31,297 WARNING [train.py:1204] (2/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_332-258725-0_sp1.1 from training. Duration: 0.963625 2023-10-02 02:56:32,632 WARNING [train.py:1204] (2/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_188-260011-0_sp0.9 from training. Duration: 0.811125 2023-10-02 02:56:32,678 WARNING [train.py:1204] (2/4) Exclude cut with ID _49039_210_6315_1_1533967537541_3846118_99-302239-0_sp1.1 from training. Duration: 0.963625 2023-10-02 02:56:33,932 INFO [train.py:1046] (2/4) Epoch 21, batch 2700, loss[loss=0.1723, simple_loss=0.258, pruned_loss=0.04332, over 24083.00 frames. ], tot_loss[loss=0.1757, simple_loss=0.2519, pruned_loss=0.04974, over 4726176.99 frames. ], batch size: 80, lr: 4.90e-03, grad_scale: 16.0 2023-10-02 02:56:36,726 WARNING [train.py:1204] (2/4) Exclude cut with ID _4122_210_2084_1_1526713164417_4353280_312-294828-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 02:56:36,731 WARNING [train.py:1204] (2/4) Exclude cut with ID _14922_210_6228_1_1530707435273_3693310_303-228738-0 from training. Duration: 0.84 2023-10-02 02:56:40,069 WARNING [train.py:1204] (2/4) Exclude cut with ID _14349_210_8341_1_1530615710818_3442470_115-285975-0_sp0.9 from training. Duration: 0.9555625 2023-10-02 02:56:41,520 WARNING [train.py:1204] (2/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_125-144174-0_sp1.1 from training. Duration: 0.3545625 2023-10-02 02:56:44,279 WARNING [train.py:1204] (2/4) Exclude cut with ID _53565_210_10635_1_1534337218335_3820259_351-54570-0_sp1.1 from training. Duration: 0.97275 2023-10-02 02:56:44,307 WARNING [train.py:1204] (2/4) Exclude cut with ID _28317_210_12742_1_1532480302381_8510325_410-40065-0_sp1.1 from training. Duration: 0.963625 2023-10-02 02:56:44,335 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_38965_210_4852_1_1533174906_3484735_167-339442-0_sp1.1 from training. Duration: 0.963625 2023-10-02 02:56:44,600 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.bypass.skip_rate, batch_count=726280.0, ans=0.09899494936611666 2023-10-02 02:56:45,712 WARNING [train.py:1204] (2/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_265-164486-0_sp1.1 from training. Duration: 0.87275 2023-10-02 02:56:45,728 WARNING [train.py:1204] (2/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_725-182484-0_sp1.1 from training. Duration: 0.92725 2023-10-02 02:56:45,762 WARNING [train.py:1204] (2/4) Exclude cut with ID _28317_210_12742_1_1532480302381_8510325_607-213951-0_sp0.9 from training. Duration: 0.9333125 2023-10-02 02:56:45,775 WARNING [train.py:1204] (2/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_605-133395-0_sp1.1 from training. Duration: 0.6 2023-10-02 02:56:45,789 WARNING [train.py:1204] (2/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_351-294650-0 from training. Duration: 0.63 2023-10-02 02:56:47,782 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_14953_210_2151_1_1530701710_5616165_710-165400-0_sp1.1 from training. Duration: 0.7909375 2023-10-02 02:56:47,913 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.0.conv_module1.balancer1.prob, batch_count=726346.6666666666, ans=0.125 2023-10-02 02:56:51,007 WARNING [train.py:1204] (2/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_237-58433-0_sp1.1 from training. Duration: 0.67275 2023-10-02 02:56:51,091 WARNING [train.py:1204] (2/4) Exclude cut with ID _5190_210_4557_1_1527244368799_373439_28-197030-0_sp1.1 from training. Duration: 0.6545625 2023-10-02 02:56:51,134 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_743-327651-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 02:56:52,619 INFO [scaling.py:1118] (2/4) WithLoss: name=encoder.encoders.2.encoder.layers.0.self_attn_weights, loss-sum=0.000e+00 2023-10-02 02:56:55,131 WARNING [train.py:1204] (2/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_76-203867-0_sp0.9 from training. Duration: 0.8 2023-10-02 02:56:56,539 WARNING [train.py:1204] (2/4) Exclude cut with ID _14603_210_4220_1_1530615688441_3097319_274-274798-0 from training. Duration: 0.98 2023-10-02 02:56:56,592 WARNING [train.py:1204] (2/4) Exclude cut with ID _14087_210_2253_1_1530529224512_3014029_226-242140-0_sp1.1 from training. Duration: 0.736375 2023-10-02 02:57:01,224 WARNING [train.py:1204] (2/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_352-59958-0_sp1.1 from training. Duration: 0.7818125 2023-10-02 02:57:01,229 WARNING [train.py:1204] (2/4) Exclude cut with ID _39411_210_5986_1_1533538825030_3343649_191-251819-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 02:57:01,346 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder_embed.conv.2.prob, batch_count=726346.6666666666, ans=0.125 2023-10-02 02:57:04,041 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.527e+02 1.850e+02 2.012e+02 2.260e+02 2.930e+02, threshold=4.023e+02, percent-clipped=0.0 2023-10-02 02:57:05,450 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7046_210_6753_1_1527895337_6920917_642-106799-0_sp1.1 from training. Duration: 0.836375 2023-10-02 02:57:05,456 WARNING [train.py:1204] (2/4) Exclude cut with ID _22826_210_9994_1_1531908201522_3279169_412-3442-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 02:57:05,488 WARNING [train.py:1204] (2/4) Exclude cut with ID _60057_210_10640_1_1534903065793_3333150_171-181133-0_sp0.9 from training. Duration: 0.97775 2023-10-02 02:57:05,496 WARNING [train.py:1204] (2/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_154-135696-0_sp1.1 from training. Duration: 0.72725 2023-10-02 02:57:08,689 WARNING [train.py:1204] (2/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_148-186805-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 02:57:08,863 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.bypass.scale_min, batch_count=726413.3333333334, ans=0.2 2023-10-02 02:57:11,504 WARNING [train.py:1204] (2/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_605-165641-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 02:57:11,511 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_174-277364-0_sp0.9 from training. Duration: 0.788875 2023-10-02 02:57:11,523 WARNING [train.py:1204] (2/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_89-321955-0_sp0.9 from training. Duration: 0.888875 2023-10-02 02:57:16,178 WARNING [train.py:1204] (2/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_438-105429-0_sp1.1 from training. Duration: 0.963625 2023-10-02 02:57:16,181 WARNING [train.py:1204] (2/4) Exclude cut with ID _14970_210_15709_1_1530773543493_1715730_187-20259-0_sp0.9 from training. Duration: 0.988875 2023-10-02 02:57:25,144 WARNING [train.py:1204] (2/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_385-193052-0_sp0.9 from training. Duration: 0.9555625 2023-10-02 02:57:25,185 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7113_210_1849_1_1527942685_3448768_194-311626-0_sp1.1 from training. Duration: 0.92725 2023-10-02 02:57:28,095 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_3780_210_2087_1_1526467389_2145572_176-107140-0_sp0.9 from training. Duration: 0.7666875 2023-10-02 02:57:28,097 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_36756_210_7234_1_1533026991_966276_123-213457-0_sp1.1 from training. Duration: 0.936375 2023-10-02 02:57:32,847 WARNING [train.py:1204] (2/4) Exclude cut with ID _23557_210_8750_1_1531890106370_6846255_456-244861-0_sp1.1 from training. Duration: 0.963625 2023-10-02 02:57:34,111 WARNING [train.py:1204] (2/4) Exclude cut with ID _37086_210_9038_1_1532999540891_7021250_242-349170-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 02:57:34,168 WARNING [train.py:1204] (2/4) Exclude cut with ID _41298_210_9019_1_1533430371450_4822143_471-281351-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 02:57:34,262 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_53378_210_17282_1_1534221071_5769509_572-126176-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 02:57:35,674 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_1507-263032-0_sp1.1 from training. Duration: 0.963625 2023-10-02 02:57:35,710 WARNING [train.py:1204] (2/4) Exclude cut with ID _13679_210_7497_1_1530534293662_3355098_411-186088-0_sp1.1 from training. Duration: 0.8545625 2023-10-02 02:57:38,872 WARNING [train.py:1204] (2/4) Exclude cut with ID _29345_210_4381_1_1532689168263_3860169_149-257268-0_sp1.1 from training. Duration: 0.736375 2023-10-02 02:57:40,271 WARNING [train.py:1204] (2/4) Exclude cut with ID _47790_210_16187_1_1533862814066_4207519_381-252014-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 02:57:40,283 WARNING [train.py:1204] (2/4) Exclude cut with ID _35799_210_5854_1_1533263487887_3529071_512-2717-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 02:57:44,997 WARNING [train.py:1204] (2/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_505-205270-0_sp0.9 from training. Duration: 0.42225 2023-10-02 02:57:46,383 WARNING [train.py:1204] (2/4) Exclude cut with ID _14998_210_5219_1_1530705582080_7474970_731-20117-0_sp1.1 from training. Duration: 0.936375 2023-10-02 02:57:48,972 INFO [train.py:1046] (2/4) Epoch 21, batch 2750, loss[loss=0.1871, simple_loss=0.2677, pruned_loss=0.05323, over 23780.00 frames. ], tot_loss[loss=0.176, simple_loss=0.2523, pruned_loss=0.04982, over 4726734.25 frames. ], batch size: 85, lr: 4.90e-03, grad_scale: 16.0 2023-10-02 02:57:49,100 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_37351_210_7925_1_1533012931_3812113_265-85780-0_sp1.1 from training. Duration: 0.8 2023-10-02 02:57:49,107 WARNING [train.py:1204] (2/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_313-134930-0 from training. Duration: 0.97 2023-10-02 02:57:49,226 WARNING [train.py:1204] (2/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_374-138971-0 from training. Duration: 0.94 2023-10-02 02:57:49,261 WARNING [train.py:1204] (2/4) Exclude cut with ID _30304_210_6319_1_1532476827261_7582060_1227-287886-0_sp1.1 from training. Duration: 0.936375 2023-10-02 02:57:50,940 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.nonlin_attention.balancer.prob, batch_count=726613.3333333334, ans=0.125 2023-10-02 02:57:54,132 WARNING [train.py:1204] (2/4) Exclude cut with ID _59493_210_17282_1_1535078419077_3225879_332-217544-0_sp1.1 from training. Duration: 0.97275 2023-10-02 02:57:55,406 WARNING [train.py:1204] (2/4) Exclude cut with ID _23514_210_1389_1_1531832139785_3031439_361-119640-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 02:57:56,848 WARNING [train.py:1204] (2/4) Exclude cut with ID _69584_210_5219_1_1535785133775_4044450_142-118058-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 02:57:56,868 WARNING [train.py:1204] (2/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_178-177582-0_sp1.1 from training. Duration: 0.67275 2023-10-02 02:57:56,905 WARNING [train.py:1204] (2/4) Exclude cut with ID _25303_210_17519_1_1532050207576_3635970_41-195572-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 02:57:57,081 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=726613.3333333334, ans=0.1 2023-10-02 02:58:02,028 WARNING [train.py:1204] (2/4) Exclude cut with ID _55328_210_27767_1_1534580973260_4088170_329-341322-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 02:58:02,065 WARNING [train.py:1204] (2/4) Exclude cut with ID _5608_210_2106_1_1527415439089_6819768_909-82767-0_sp1.1 from training. Duration: 0.5545625 2023-10-02 02:58:02,110 WARNING [train.py:1204] (2/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_261-297503-0_sp1.1 from training. Duration: 0.9 2023-10-02 02:58:03,383 WARNING [train.py:1204] (2/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_459-291706-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 02:58:03,384 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7762_210_1794_1_1528199508_4057408_422-227034-0 from training. Duration: 0.73 2023-10-02 02:58:03,395 WARNING [train.py:1204] (2/4) Exclude cut with ID _3067_210_2667_1_1526212974640_4503168_250-285937-0_sp0.9 from training. Duration: 0.888875 2023-10-02 02:58:03,409 WARNING [train.py:1204] (2/4) Exclude cut with ID _66873_210_5517_1_1535454136506_4293370_332-253745-0_sp1.1 from training. Duration: 0.936375 2023-10-02 02:58:07,745 WARNING [train.py:1204] (2/4) Exclude cut with ID _13938_210_9988_1_1530518718978_3693960_304-112784-0 from training. Duration: 0.46 2023-10-02 02:58:09,211 WARNING [train.py:1204] (2/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_226-267957-0_sp1.1 from training. Duration: 0.92725 2023-10-02 02:58:09,251 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_41822_210_13270_1_1533955976_4379284_608-254567-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 02:58:11,043 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_124-113562-0_sp1.1 from training. Duration: 0.8545625 2023-10-02 02:58:11,077 WARNING [train.py:1204] (2/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_117-290916-0_sp1.1 from training. Duration: 0.463625 2023-10-02 02:58:12,513 WARNING [train.py:1204] (2/4) Exclude cut with ID _28427_210_4220_1_1532581240967_3061069_102-66533-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 02:58:13,875 WARNING [train.py:1204] (2/4) Exclude cut with ID _55383_210_10635_1_1534468426172_2231669_134-128246-0_sp1.1 from training. Duration: 0.8090625 2023-10-02 02:58:13,927 WARNING [train.py:1204] (2/4) Exclude cut with ID _45871_210_5665_1_1533864614245_3296060_49-308316-0_sp1.1 from training. Duration: 0.97275 2023-10-02 02:58:13,970 WARNING [train.py:1204] (2/4) Exclude cut with ID _57572_210_5517_1_1534921219624_3809484_176-199205-0_sp1.1 from training. Duration: 0.97275 2023-10-02 02:58:18,585 WARNING [train.py:1204] (2/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_45-265684-0_sp1.1 from training. Duration: 0.6545625 2023-10-02 02:58:18,611 WARNING [train.py:1204] (2/4) Exclude cut with ID _15150_210_8341_1_1530752411803_7256384_469-239509-0_sp0.9 from training. Duration: 0.7555625 2023-10-02 02:58:20,038 WARNING [train.py:1204] (2/4) Exclude cut with ID _28461_210_2253_1_1532771775189_3041299_136-270644-0_sp1.1 from training. Duration: 0.8454375 2023-10-02 02:58:21,389 WARNING [train.py:1204] (2/4) Exclude cut with ID _69584_210_5219_1_1535785133775_4044450_302-47302-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 02:58:23,280 WARNING [train.py:1204] (2/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_166-128941-0_sp1.1 from training. Duration: 0.4909375 2023-10-02 02:58:30,229 WARNING [train.py:1204] (2/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_559-120504-0_sp1.1 from training. Duration: 0.97275 2023-10-02 02:58:33,547 WARNING [train.py:1204] (2/4) Exclude cut with ID _6456_210_1534_1_1527681634212_3181049_345-252877-0_sp1.1 from training. Duration: 0.5818125 2023-10-02 02:58:33,574 WARNING [train.py:1204] (2/4) Exclude cut with ID _28317_210_12742_1_1532480302381_8510325_902-233003-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 02:58:37,772 WARNING [train.py:1204] (2/4) Exclude cut with ID _36538_210_9994_1_1533204020363_3795620_204-261385-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 02:58:37,776 WARNING [train.py:1204] (2/4) Exclude cut with ID _14821_210_8750_1_1530680503991_6829359_189-297451-0_sp1.1 from training. Duration: 0.5 2023-10-02 02:58:37,830 WARNING [train.py:1204] (2/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_1211-226066-0_sp0.9 from training. Duration: 0.8444375 2023-10-02 02:58:38,894 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.1.encoder.layers.0.conv_module2.whiten, num_groups=1, num_channels=256, metric=5.68 vs. limit=15.0 2023-10-02 02:58:43,826 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6786_210_1388_1_1527839699_4194495_273-149841-0_sp1.1 from training. Duration: 0.836375 2023-10-02 02:58:43,860 WARNING [train.py:1204] (2/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_380-162648-0_sp1.1 from training. Duration: 0.8545625 2023-10-02 02:58:43,861 WARNING [train.py:1204] (2/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_494-199538-0 from training. Duration: 0.75 2023-10-02 02:58:49,840 WARNING [train.py:1204] (2/4) Exclude cut with ID _49097_210_16049_1_1533952780207_4360979_276-87053-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 02:58:51,168 WARNING [train.py:1204] (2/4) Exclude cut with ID _5444_210_2151_1_1527418808464_5312210_726-27165-0 from training. Duration: 0.67 2023-10-02 02:58:54,591 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_128-202104-0_sp1.1 from training. Duration: 0.57275 2023-10-02 02:58:57,399 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_11349_210_6753_1_1529800861_1101207_26-65602-0_sp1.1 from training. Duration: 0.87275 2023-10-02 02:58:57,433 WARNING [train.py:1204] (2/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_159-328157-0 from training. Duration: 0.52 2023-10-02 02:58:58,782 WARNING [train.py:1204] (2/4) Exclude cut with ID _14069_210_5632_1_1530512692033_4803390_98-21934-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 02:59:00,184 WARNING [train.py:1204] (2/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_352-59958-0_sp0.9 from training. Duration: 0.9555625 2023-10-02 02:59:01,380 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6163_210_6400_1_1527506746_4263068_283-137811-0 from training. Duration: 0.77 2023-10-02 02:59:01,409 WARNING [train.py:1204] (2/4) Exclude cut with ID _21989_210_5854_1_1531899214931_3271360_133-44759-0_sp1.1 from training. Duration: 0.7 2023-10-02 02:59:04,141 INFO [train.py:1046] (2/4) Epoch 21, batch 2800, loss[loss=0.1567, simple_loss=0.2394, pruned_loss=0.03702, over 24352.00 frames. ], tot_loss[loss=0.1752, simple_loss=0.2514, pruned_loss=0.04948, over 4722356.22 frames. ], batch size: 61, lr: 4.89e-03, grad_scale: 32.0 2023-10-02 02:59:04,278 WARNING [train.py:1204] (2/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_21-174514-0_sp0.9 from training. Duration: 0.47775 2023-10-02 02:59:06,217 WARNING [train.py:1204] (2/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_201-78811-0_sp1.1 from training. Duration: 0.963625 2023-10-02 02:59:06,254 WARNING [train.py:1204] (2/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_537-164630-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 02:59:07,591 WARNING [train.py:1204] (2/4) Exclude cut with ID _14087_210_2253_1_1530529224512_3014029_156-257607-0 from training. Duration: 0.83 2023-10-02 02:59:07,608 WARNING [train.py:1204] (2/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_308-154450-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 02:59:08,911 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_45399_210_4852_1_1533624725_7579737_214-240574-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 02:59:10,364 WARNING [train.py:1204] (2/4) Exclude cut with ID _29303_210_2575_1_1532392237303_7410649_615-305517-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 02:59:10,414 WARNING [train.py:1204] (2/4) Exclude cut with ID IC0125W0002-61808 from training. Duration: 0.708 2023-10-02 02:59:10,415 WARNING [train.py:1204] (2/4) Exclude cut with ID IC0125W0017-61823 from training. Duration: 0.759 2023-10-02 02:59:13,706 WARNING [train.py:1204] (2/4) Exclude cut with ID _53219_210_4525_1_1534834836008_4064825_4-145291-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 02:59:15,192 WARNING [train.py:1204] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_185-174805-0_sp1.1 from training. Duration: 0.6181875 2023-10-02 02:59:15,205 WARNING [train.py:1204] (2/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_635-193046-0_sp1.1 from training. Duration: 0.92725 2023-10-02 02:59:15,483 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.feed_forward1.hidden_balancer.prob, batch_count=726946.6666666666, ans=0.125 2023-10-02 02:59:16,842 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.bypass.scale_min, batch_count=726946.6666666666, ans=0.2 2023-10-02 02:59:17,883 WARNING [train.py:1204] (2/4) Exclude cut with ID _46067_210_4220_1_1534125571066_3307450_321-258866-0_sp1.1 from training. Duration: 0.97275 2023-10-02 02:59:20,970 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8558_210_1410_1_1528505225_7613406_131-329524-0 from training. Duration: 0.79 2023-10-02 02:59:22,382 WARNING [train.py:1204] (2/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_847-41334-0_sp0.9 from training. Duration: 0.488875 2023-10-02 02:59:22,534 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.conv_module1.balancer1.prob, batch_count=727013.3333333334, ans=0.125 2023-10-02 02:59:24,162 WARNING [train.py:1204] (2/4) Exclude cut with ID _14227_210_4381_1_1530701804286_3940259_125-275642-0 from training. Duration: 0.99 2023-10-02 02:59:25,582 WARNING [train.py:1204] (2/4) Exclude cut with ID _25248_210_8750_1_1532062825941_5554180_248-177255-0_sp1.1 from training. Duration: 0.963625 2023-10-02 02:59:25,621 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_40047_210_2290_1_1533290519_2374311_195-165693-0_sp1.1 from training. Duration: 0.8090625 2023-10-02 02:59:26,842 WARNING [train.py:1204] (2/4) Exclude cut with ID _23109_210_4381_1_1532150828777_901949_29-258672-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 02:59:29,779 WARNING [train.py:1204] (2/4) Exclude cut with ID _14821_210_8750_1_1530680503991_6829359_2-227151-0_sp1.1 from training. Duration: 0.7545625 2023-10-02 02:59:29,811 WARNING [train.py:1204] (2/4) Exclude cut with ID _49643_210_7989_1_1534640244224_3939031_565-53982-0_sp1.1 from training. Duration: 0.963625 2023-10-02 02:59:29,815 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7544_210_6753_1_1528591327_1471426_33-346031-0_sp0.9 from training. Duration: 0.67775 2023-10-02 02:59:31,468 WARNING [train.py:1204] (2/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_95-61782-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 02:59:34,440 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.528e+02 1.948e+02 2.235e+02 2.690e+02 3.655e+02, threshold=4.470e+02, percent-clipped=0.0 2023-10-02 02:59:38,644 WARNING [train.py:1204] (2/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_686-54383-0_sp0.9 from training. Duration: 0.9666875 2023-10-02 02:59:40,722 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_14056_210_4163_1_1530604483_7572827_505-276823-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 02:59:42,089 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder_embed.conv.2.prob, batch_count=727080.0, ans=0.125 2023-10-02 02:59:42,091 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder_embed.convnext.hidden_balancer.prob, batch_count=727080.0, ans=0.125 2023-10-02 02:59:43,512 WARNING [train.py:1204] (2/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_745-128443-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 02:59:43,692 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.conv_module2.balancer1.prob, batch_count=727080.0, ans=0.125 2023-10-02 02:59:44,813 WARNING [train.py:1204] (2/4) Exclude cut with ID _3844_210_1390_1_1526727803582_2871209_20-251664-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 02:59:46,242 WARNING [train.py:1204] (2/4) Exclude cut with ID _36538_210_9994_1_1533204020363_3795620_487-164628-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 02:59:49,059 WARNING [train.py:1204] (2/4) Exclude cut with ID _5166_210_2084_1_1527073197160_4087993_309-222545-0_sp1.1 from training. Duration: 0.763625 2023-10-02 02:59:49,078 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_3531_210_3528_1_1526294203_5216456_309-227302-0 from training. Duration: 0.69 2023-10-02 02:59:49,130 WARNING [train.py:1204] (2/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_42-192460-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 02:59:50,411 WARNING [train.py:1204] (2/4) Exclude cut with ID _22083_210_8254_1_1531702862439_3678970_49-234546-0_sp1.1 from training. Duration: 0.7545625 2023-10-02 02:59:50,418 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_427-78097-0_sp0.9 from training. Duration: 0.888875 2023-10-02 02:59:55,528 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_43802_210_2290_1_1533516203_8829504_378-205254-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 02:59:56,727 WARNING [train.py:1204] (2/4) Exclude cut with ID _45329_210_7005_1_1534157951211_6774339_672-314830-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 02:59:58,373 WARNING [train.py:1204] (2/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_90-30673-0_sp1.1 from training. Duration: 0.763625 2023-10-02 03:00:01,183 WARNING [train.py:1204] (2/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_563-329943-0_sp1.1 from training. Duration: 0.9 2023-10-02 03:00:01,200 WARNING [train.py:1204] (2/4) Exclude cut with ID _21632_210_2253_1_1531994001765_3744140_154-84090-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 03:00:01,201 WARNING [train.py:1204] (2/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_103-336064-0_sp0.9 from training. Duration: 0.5666875 2023-10-02 03:00:02,491 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_43-71989-0_sp0.9 from training. Duration: 0.6333125 2023-10-02 03:00:02,544 WARNING [train.py:1204] (2/4) Exclude cut with ID _3867_210_1883_1_1526641200575_4475830_319-349850-0_sp0.9 from training. Duration: 0.7444375 2023-10-02 03:00:04,512 WARNING [train.py:1204] (2/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_629-179454-0_sp1.1 from training. Duration: 0.963625 2023-10-02 03:00:04,527 WARNING [train.py:1204] (2/4) Exclude cut with ID _65263_210_22356_1_1535763598691_3921037_49-222917-0 from training. Duration: 0.92 2023-10-02 03:00:04,547 WARNING [train.py:1204] (2/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_602-331772-0_sp1.1 from training. Duration: 0.936375 2023-10-02 03:00:07,102 WARNING [train.py:1204] (2/4) Exclude cut with ID _58414_210_8254_1_1535421609162_7151182_292-134978-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 03:00:07,127 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_38793_210_2544_1_1533176114_3849320_200-299292-0_sp1.1 from training. Duration: 0.936375 2023-10-02 03:00:07,231 WARNING [train.py:1204] (2/4) Exclude cut with ID _14970_210_15709_1_1530775547361_1966965_6-23328-0 from training. Duration: 0.86 2023-10-02 03:00:07,920 WARNING [train.py:1204] (2/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_386-225511-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 03:00:07,931 WARNING [train.py:1204] (2/4) Exclude cut with ID _38323_210_12108_1_1533121233369_3303049_416-11919-0_sp1.1 from training. Duration: 0.87275 2023-10-02 03:00:09,186 WARNING [train.py:1204] (2/4) Exclude cut with ID _4866_210_2577_1_1527404412284_4364659_256-306830-0_sp1.1 from training. Duration: 0.7909375 2023-10-02 03:00:09,300 WARNING [train.py:1204] (2/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_53-300544-0 from training. Duration: 0.67 2023-10-02 03:00:14,886 WARNING [train.py:1204] (2/4) Exclude cut with ID _41314_210_12651_1_1533459476543_3766460_80-74184-0_sp1.1 from training. Duration: 0.7545625 2023-10-02 03:00:14,909 WARNING [train.py:1204] (2/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_332-256168-0_sp1.1 from training. Duration: 0.6818125 2023-10-02 03:00:16,188 WARNING [train.py:1204] (2/4) Exclude cut with ID _48836_210_12210_1_1534075848293_3904150_429-35475-0_sp1.1 from training. Duration: 0.8090625 2023-10-02 03:00:18,732 INFO [train.py:1046] (2/4) Epoch 21, batch 2850, loss[loss=0.181, simple_loss=0.264, pruned_loss=0.04903, over 24017.00 frames. ], tot_loss[loss=0.1742, simple_loss=0.2498, pruned_loss=0.04927, over 4705609.17 frames. ], batch size: 80, lr: 4.89e-03, grad_scale: 16.0 2023-10-02 03:00:18,788 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_144-135833-0_sp1.1 from training. Duration: 0.97275 2023-10-02 03:00:23,556 WARNING [train.py:1204] (2/4) Exclude cut with ID _14224_210_4381_1_1530529062037_4264010_380-343670-0_sp0.9 from training. Duration: 0.97775 2023-10-02 03:00:23,583 WARNING [train.py:1204] (2/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_213-265174-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 03:00:23,608 WARNING [train.py:1204] (2/4) Exclude cut with ID _20455_210_5219_1_1531565996775_8098100_86-314283-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 03:00:26,816 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_41750_210_1960_1_1533520678_3948042_68-223813-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 03:00:26,883 WARNING [train.py:1204] (2/4) Exclude cut with ID _55153_210_2253_1_1534384786677_3162096_24-198429-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 03:00:28,280 WARNING [train.py:1204] (2/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_119-207162-0_sp0.9 from training. Duration: 0.788875 2023-10-02 03:00:29,624 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_12868_210_6753_1_1530702479_3610564_148-172861-0 from training. Duration: 0.5 2023-10-02 03:00:37,035 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_5522_210_3273_1_1527249606_3909096_318-203153-0 from training. Duration: 0.43 2023-10-02 03:00:37,042 WARNING [train.py:1204] (2/4) Exclude cut with ID _14884_210_2252_1_1530707459689_5561250_288-189949-0_sp1.1 from training. Duration: 0.97275 2023-10-02 03:00:38,409 WARNING [train.py:1204] (2/4) Exclude cut with ID _5294_210_1747_1_1527249643928_3696179_529-242298-0 from training. Duration: 0.95 2023-10-02 03:00:40,234 WARNING [train.py:1204] (2/4) Exclude cut with ID _36538_210_9994_1_1533204020363_3795620_503-110289-0_sp1.1 from training. Duration: 0.936375 2023-10-02 03:00:41,788 WARNING [train.py:1204] (2/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_385-193052-0 from training. Duration: 0.86 2023-10-02 03:00:43,104 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_456-22141-0 from training. Duration: 0.88 2023-10-02 03:00:44,466 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_38965_210_4852_1_1533174906_3484735_284-117840-0_sp1.1 from training. Duration: 0.936375 2023-10-02 03:00:49,220 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.balancer2.prob, batch_count=727413.3333333334, ans=0.125 2023-10-02 03:00:54,978 WARNING [train.py:1204] (2/4) Exclude cut with ID _32704_210_8954_1_1533017488840_6747370_75-201625-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 03:00:56,822 WARNING [train.py:1204] (2/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_255-312654-0_sp1.1 from training. Duration: 0.82725 2023-10-02 03:00:56,835 WARNING [train.py:1204] (2/4) Exclude cut with ID _47251_210_18628_1_1533814063128_4137250_214-88076-0_sp0.9 from training. Duration: 0.97775 2023-10-02 03:00:58,366 WARNING [train.py:1204] (2/4) Exclude cut with ID _4092_210_2151_1_1526643044449_3135759_227-213972-0_sp1.1 from training. Duration: 0.4909375 2023-10-02 03:00:58,381 WARNING [train.py:1204] (2/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_912-47473-0_sp1.1 from training. Duration: 0.6181875 2023-10-02 03:00:58,404 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6767_210_3273_1_1527855783_1718932_173-256601-0_sp1.1 from training. Duration: 0.72725 2023-10-02 03:00:59,782 WARNING [train.py:1204] (2/4) Exclude cut with ID _6274_210_2084_1_1527937583837_7494020_32-205288-0_sp1.1 from training. Duration: 0.7090625 2023-10-02 03:00:59,824 WARNING [train.py:1204] (2/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_39-248870-0 from training. Duration: 0.69 2023-10-02 03:01:00,485 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.2.encoder.layers.2.nonlin_attention.whiten1, num_groups=1, num_channels=288, metric=5.19 vs. limit=10.0 2023-10-02 03:01:01,173 WARNING [train.py:1204] (2/4) Exclude cut with ID _3541_210_2477_1_1526475408709_3263973_377-296839-0_sp1.1 from training. Duration: 0.5 2023-10-02 03:01:01,188 WARNING [train.py:1204] (2/4) Exclude cut with ID _13679_210_7497_1_1530534293662_3355098_413-56154-0_sp1.1 from training. Duration: 0.963625 2023-10-02 03:01:03,134 WARNING [train.py:1204] (2/4) Exclude cut with ID _14970_210_15709_1_1530775547361_1966965_201-84544-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 03:01:03,189 WARNING [train.py:1204] (2/4) Exclude cut with ID _21783_210_11357_1_1531742458730_3762679_105-95775-0_sp1.1 from training. Duration: 0.936375 2023-10-02 03:01:05,924 WARNING [train.py:1204] (2/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_131-191669-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 03:01:05,964 WARNING [train.py:1204] (2/4) Exclude cut with ID _14860_210_4915_1_1530702020340_7343240_695-4323-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 03:01:07,357 WARNING [train.py:1204] (2/4) Exclude cut with ID _39867_210_9826_1_1533553222030_9344259_991-130565-0_sp1.1 from training. Duration: 0.97275 2023-10-02 03:01:08,831 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_50-3289-0_sp1.1 from training. Duration: 0.82725 2023-10-02 03:01:10,637 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_54-347670-0_sp1.1 from training. Duration: 0.92725 2023-10-02 03:01:10,871 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.balancer1.prob, batch_count=727480.0, ans=0.125 2023-10-02 03:01:11,977 WARNING [train.py:1204] (2/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_200-153445-0_sp1.1 from training. Duration: 0.936375 2023-10-02 03:01:13,409 WARNING [train.py:1204] (2/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_279-302911-0_sp1.1 from training. Duration: 0.97275 2023-10-02 03:01:16,035 WARNING [train.py:1204] (2/4) Exclude cut with ID _14886_210_4220_1_1530702020355_3516190_159-73748-0_sp0.9 from training. Duration: 0.92225 2023-10-02 03:01:19,169 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=727546.6666666666, ans=0.1 2023-10-02 03:01:20,806 WARNING [train.py:1204] (2/4) Exclude cut with ID _53379_210_20034_1_1534222027363_3854654_23-222715-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 03:01:22,284 WARNING [train.py:1204] (2/4) Exclude cut with ID _12555_210_8750_1_1530075594402_3780239_449-267986-0 from training. Duration: 0.83 2023-10-02 03:01:22,301 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7544_210_6753_1_1528587722_3576686_295-258479-0 from training. Duration: 0.84 2023-10-02 03:01:23,761 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7002_210_1794_1_1527935634_4034444_487-236501-0_sp0.9 from training. Duration: 0.7333125 2023-10-02 03:01:25,122 WARNING [train.py:1204] (2/4) Exclude cut with ID _21783_210_11357_1_1531742458730_3762679_244-19107-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 03:01:25,146 WARNING [train.py:1204] (2/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_390-339137-0 from training. Duration: 0.56 2023-10-02 03:01:25,206 WARNING [train.py:1204] (2/4) Exclude cut with ID _7562_210_2084_1_1528887767916_3946229_186-326422-0_sp1.1 from training. Duration: 0.763625 2023-10-02 03:01:26,625 WARNING [train.py:1204] (2/4) Exclude cut with ID _33640_210_2655_1_1533117605479_4855469_645-145235-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 03:01:26,644 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_25767_210_2161_1_1532307483_3867748_506-171777-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 03:01:26,679 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_5438_210_1794_1_1527336687_2600009_233-58072-0_sp1.1 from training. Duration: 0.7 2023-10-02 03:01:26,679 WARNING [train.py:1204] (2/4) Exclude cut with ID IC0669W0063-330908 from training. Duration: 0.896 2023-10-02 03:01:27,977 WARNING [train.py:1204] (2/4) Exclude cut with ID IC0671W0070-331913 from training. Duration: 0.896 2023-10-02 03:01:27,980 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_258-310570-0_sp1.1 from training. Duration: 0.7454375 2023-10-02 03:01:28,045 WARNING [train.py:1204] (2/4) Exclude cut with ID _9461_210_4557_1_1529060514726_3677451_162-177507-0_sp1.1 from training. Duration: 0.97275 2023-10-02 03:01:32,685 WARNING [train.py:1204] (2/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_26-175553-0_sp0.9 from training. Duration: 0.67775 2023-10-02 03:01:32,709 WARNING [train.py:1204] (2/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_7-150481-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 03:01:33,952 INFO [train.py:1046] (2/4) Epoch 21, batch 2900, loss[loss=0.172, simple_loss=0.252, pruned_loss=0.04603, over 23185.00 frames. ], tot_loss[loss=0.1741, simple_loss=0.2501, pruned_loss=0.04902, over 4719879.89 frames. ], batch size: 93, lr: 4.89e-03, grad_scale: 16.0 2023-10-02 03:01:34,055 WARNING [train.py:1204] (2/4) Exclude cut with ID _23557_210_8750_1_1531890106370_6846255_72-273262-0_sp1.1 from training. Duration: 0.963625 2023-10-02 03:01:35,857 WARNING [train.py:1204] (2/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_367-61330-0 from training. Duration: 0.75 2023-10-02 03:01:39,983 WARNING [train.py:1204] (2/4) Exclude cut with ID _39867_210_9826_1_1533553222030_9344259_1337-11141-0_sp1.1 from training. Duration: 0.936375 2023-10-02 03:01:40,001 WARNING [train.py:1204] (2/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_101-293367-0 from training. Duration: 0.63 2023-10-02 03:01:41,811 WARNING [train.py:1204] (2/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_10-100367-0 from training. Duration: 0.98 2023-10-02 03:01:43,351 WARNING [train.py:1204] (2/4) Exclude cut with ID _60057_210_10640_1_1534903065793_3333150_171-181133-0_sp1.1 from training. Duration: 0.8 2023-10-02 03:01:43,353 WARNING [train.py:1204] (2/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_844-96989-0_sp0.9 from training. Duration: 0.788875 2023-10-02 03:01:44,800 WARNING [train.py:1204] (2/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_301-144595-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 03:01:46,264 WARNING [train.py:1204] (2/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_115-147343-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 03:01:49,099 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_3171_210_2087_1_1526109084_2330007_37-64334-0_sp1.1 from training. Duration: 0.7454375 2023-10-02 03:01:49,153 WARNING [train.py:1204] (2/4) Exclude cut with ID _19514_210_9259_1_1531530071330_5084230_769-272914-0_sp1.1 from training. Duration: 0.936375 2023-10-02 03:01:52,597 WARNING [train.py:1204] (2/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_76-311963-0_sp0.9 from training. Duration: 0.77775 2023-10-02 03:01:52,630 WARNING [train.py:1204] (2/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_526-62079-0 from training. Duration: 0.67 2023-10-02 03:01:53,937 WARNING [train.py:1204] (2/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_7-111954-0_sp0.9 from training. Duration: 0.62225 2023-10-02 03:01:55,324 WARNING [train.py:1204] (2/4) Exclude cut with ID _57997_210_4163_1_1534766425889_3961049_360-279140-0_sp1.1 from training. Duration: 0.97275 2023-10-02 03:01:56,893 WARNING [train.py:1204] (2/4) Exclude cut with ID _4866_210_2577_1_1527404412284_4364659_133-1696-0 from training. Duration: 0.99 2023-10-02 03:01:56,954 WARNING [train.py:1204] (2/4) Exclude cut with ID _5190_210_4557_1_1527244368799_373439_28-197030-0 from training. Duration: 0.72 2023-10-02 03:01:59,736 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_53-39003-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 03:01:59,738 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6673_210_3273_1_1527767790_3173240_351-167954-0 from training. Duration: 0.48 2023-10-02 03:01:59,759 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_2822_210_2715_1_1526112586_1999587_165-36656-0_sp1.1 from training. Duration: 0.8090625 2023-10-02 03:02:02,978 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_328-264892-0_sp1.1 from training. Duration: 0.92725 2023-10-02 03:02:02,982 WARNING [train.py:1204] (2/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_29-293896-0_sp0.9 from training. Duration: 0.67775 2023-10-02 03:02:06,013 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.668e+02 1.860e+02 2.101e+02 2.423e+02 4.328e+02, threshold=4.202e+02, percent-clipped=0.0 2023-10-02 03:02:06,101 WARNING [train.py:1204] (2/4) Exclude cut with ID _65263_210_22356_1_1535763598691_3921037_465-55448-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 03:02:06,304 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.feed_forward1.hidden_balancer.prob, batch_count=727746.6666666666, ans=0.125 2023-10-02 03:02:07,414 WARNING [train.py:1204] (2/4) Exclude cut with ID _21989_210_5854_1_1531899214931_3271360_311-53106-0_sp1.1 from training. Duration: 0.97275 2023-10-02 03:02:11,651 WARNING [train.py:1204] (2/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_389-277915-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 03:02:14,986 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_69630_210_7234_1_1535791094_677837_63-277139-0_sp1.1 from training. Duration: 0.963625 2023-10-02 03:02:16,556 WARNING [train.py:1204] (2/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_85-138257-0 from training. Duration: 0.71 2023-10-02 03:02:16,573 WARNING [train.py:1204] (2/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_686-247600-0 from training. Duration: 0.92 2023-10-02 03:02:16,592 WARNING [train.py:1204] (2/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_108-255430-0_sp1.1 from training. Duration: 0.8818125 2023-10-02 03:02:19,383 WARNING [train.py:1204] (2/4) Exclude cut with ID _14884_210_2252_1_1530707459689_5561250_114-201399-0_sp1.1 from training. Duration: 0.6909375 2023-10-02 03:02:20,854 WARNING [train.py:1204] (2/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_506-42614-0 from training. Duration: 0.79 2023-10-02 03:02:22,734 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_43802_210_2290_1_1533516203_8829504_85-195240-0_sp1.1 from training. Duration: 0.7909375 2023-10-02 03:02:26,966 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_141-36243-0_sp1.1 from training. Duration: 0.97275 2023-10-02 03:02:37,720 WARNING [train.py:1204] (2/4) Exclude cut with ID _7852_210_5219_1_1528286471316_8139400_358-287492-0_sp1.1 from training. Duration: 0.87275 2023-10-02 03:02:37,736 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_805-71497-0_sp0.9 from training. Duration: 0.788875 2023-10-02 03:02:37,823 WARNING [train.py:1204] (2/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_178-39722-0 from training. Duration: 0.49 2023-10-02 03:02:41,819 WARNING [train.py:1204] (2/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_428-181925-0_sp1.1 from training. Duration: 0.963625 2023-10-02 03:02:43,040 WARNING [train.py:1204] (2/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_770-200430-0 from training. Duration: 0.89 2023-10-02 03:02:43,075 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_1104-271009-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 03:02:43,122 WARNING [train.py:1204] (2/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_128-296581-0_sp0.9 from training. Duration: 0.82225 2023-10-02 03:02:49,257 INFO [train.py:1046] (2/4) Epoch 21, batch 2950, loss[loss=0.151, simple_loss=0.2389, pruned_loss=0.03157, over 24481.00 frames. ], tot_loss[loss=0.175, simple_loss=0.2513, pruned_loss=0.04932, over 4723128.75 frames. ], batch size: 66, lr: 4.89e-03, grad_scale: 8.0 2023-10-02 03:02:49,355 WARNING [train.py:1204] (2/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_19-62511-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 03:02:49,566 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.feed_forward3.hidden_balancer.prob, batch_count=727946.6666666666, ans=0.125 2023-10-02 03:02:49,596 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=727946.6666666666, ans=0.1 2023-10-02 03:02:50,776 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_805-71497-0 from training. Duration: 0.71 2023-10-02 03:02:50,853 WARNING [train.py:1204] (2/4) Exclude cut with ID _44778_210_2054_1_1533798127203_7443090_573-152077-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 03:02:50,858 WARNING [train.py:1204] (2/4) Exclude cut with ID _42875_210_14856_1_1533704397487_4012433_393-177816-0_sp1.1 from training. Duration: 0.963625 2023-10-02 03:02:52,236 WARNING [train.py:1204] (2/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_278-35159-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 03:02:55,629 WARNING [train.py:1204] (2/4) Exclude cut with ID _15037_210_2253_1_1530752294104_3267650_13-130361-0_sp1.1 from training. Duration: 0.9 2023-10-02 03:02:55,740 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_3019_210_2028_1_1526044245_3264865_127-135615-0 from training. Duration: 0.94 2023-10-02 03:02:56,960 WARNING [train.py:1204] (2/4) Exclude cut with ID _28317_210_12742_1_1532480302381_8510325_607-213951-0 from training. Duration: 0.84 2023-10-02 03:02:58,212 WARNING [train.py:1204] (2/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_436-318113-0_sp0.9 from training. Duration: 0.8333125 2023-10-02 03:02:58,213 WARNING [train.py:1204] (2/4) Exclude cut with ID _13938_210_9988_1_1530518718978_3693960_148-152225-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 03:03:04,553 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_2541_210_1794_1_1525590269_3084765_156-173713-0_sp1.1 from training. Duration: 0.736375 2023-10-02 03:03:05,968 WARNING [train.py:1204] (2/4) Exclude cut with ID _28323_210_16187_1_1532480395667_4100300_531-24157-0_sp1.1 from training. Duration: 0.8909375 2023-10-02 03:03:06,227 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.bypass.skip_rate, batch_count=728013.3333333334, ans=0.07 2023-10-02 03:03:07,939 WARNING [train.py:1204] (2/4) Exclude cut with ID _29303_210_2575_1_1532392237303_7410649_382-217492-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 03:03:09,278 WARNING [train.py:1204] (2/4) Exclude cut with ID _58895_210_8750_1_1535184081320_3649089_270-158013-0_sp1.1 from training. Duration: 0.92725 2023-10-02 03:03:13,435 WARNING [train.py:1204] (2/4) Exclude cut with ID _29277_210_3279_1_1532581109293_3173069_311-206227-0_sp1.1 from training. Duration: 0.936375 2023-10-02 03:03:13,455 WARNING [train.py:1204] (2/4) Exclude cut with ID _7160_210_1876_1_1527940923231_3703799_282-322611-0_sp0.9 from training. Duration: 0.9666875 2023-10-02 03:03:14,911 WARNING [train.py:1204] (2/4) Exclude cut with ID _53219_210_4525_1_1534834836008_4064825_562-32295-0_sp1.1 from training. Duration: 0.963625 2023-10-02 03:03:16,310 WARNING [train.py:1204] (2/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_408-87330-0_sp1.1 from training. Duration: 0.963625 2023-10-02 03:03:16,317 WARNING [train.py:1204] (2/4) Exclude cut with ID _59202_210_26984_1_1534816810612_3549174_137-318710-0_sp1.1 from training. Duration: 0.8090625 2023-10-02 03:03:17,773 WARNING [train.py:1204] (2/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_372-30302-0 from training. Duration: 0.86 2023-10-02 03:03:23,988 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7543_210_6753_1_1528541907_1399322_178-56269-0_sp1.1 from training. Duration: 0.95275 2023-10-02 03:03:24,004 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9109W0012-549758 from training. Duration: 0.512 2023-10-02 03:03:25,267 WARNING [train.py:1204] (2/4) Exclude cut with ID _5410_210_1388_1_1527397055795_1719020_39-112221-0_sp0.9 from training. Duration: 0.7666875 2023-10-02 03:03:26,694 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9121W0012-554933 from training. Duration: 0.896 2023-10-02 03:03:28,577 WARNING [train.py:1204] (2/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_476-272757-0 from training. Duration: 0.93 2023-10-02 03:03:28,597 WARNING [train.py:1204] (2/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_1085-171718-0_sp1.1 from training. Duration: 0.92725 2023-10-02 03:03:28,656 WARNING [train.py:1204] (2/4) Exclude cut with ID _8480_210_1874_1_1528589391205_6728330_309-139896-0_sp1.1 from training. Duration: 0.736375 2023-10-02 03:03:28,657 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9130W0113-559703 from training. Duration: 0.896 2023-10-02 03:03:28,661 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_292-265928-0_sp1.1 from training. Duration: 0.463625 2023-10-02 03:03:31,455 WARNING [train.py:1204] (2/4) Exclude cut with ID _8772_210_1850_1_1528635595972_3664011_96-12756-0 from training. Duration: 0.98 2023-10-02 03:03:32,795 WARNING [train.py:1204] (2/4) Exclude cut with ID _12556_210_8341_1_1530081024100_7327690_725-193477-0_sp1.1 from training. Duration: 0.8909375 2023-10-02 03:03:32,844 WARNING [train.py:1204] (2/4) Exclude cut with ID _5289_210_2084_1_1527678202736_3779299_142-103317-0_sp1.1 from training. Duration: 0.763625 2023-10-02 03:03:34,357 WARNING [train.py:1204] (2/4) Exclude cut with ID _45012_210_12651_1_1533608435136_5371380_206-51075-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 03:03:37,099 WARNING [train.py:1204] (2/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_176-56120-0_sp1.1 from training. Duration: 0.7545625 2023-10-02 03:03:37,124 WARNING [train.py:1204] (2/4) Exclude cut with ID _40284_210_2084_1_1533513679814_3704279_220-201864-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 03:03:37,162 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9165W0004-576638 from training. Duration: 0.896 2023-10-02 03:03:37,201 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_5438_210_1794_1_1527336687_2600009_218-343336-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 03:03:39,147 WARNING [train.py:1204] (2/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_318-162017-0 from training. Duration: 0.91 2023-10-02 03:03:43,414 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_49544_210_1794_1_1534379770_5262083_483-349298-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 03:03:43,607 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.conv_module2.balancer2.min_positive, batch_count=728146.6666666666, ans=0.05 2023-10-02 03:03:44,734 WARNING [train.py:1204] (2/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_197-28164-0_sp0.9 from training. Duration: 0.888875 2023-10-02 03:03:44,790 WARNING [train.py:1204] (2/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_643-52007-0 from training. Duration: 0.57 2023-10-02 03:03:44,815 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_38965_210_4852_1_1533174906_3484735_403-332963-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 03:03:46,141 WARNING [train.py:1204] (2/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_340-174176-0 from training. Duration: 0.49 2023-10-02 03:03:49,450 WARNING [train.py:1204] (2/4) Exclude cut with ID _56708_210_13177_1_1535252351282_3422899_363-917-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 03:03:49,567 WARNING [train.py:1204] (2/4) Exclude cut with ID _19896_210_1883_1_1531709022662_6649569_525-159063-0_sp1.1 from training. Duration: 0.936375 2023-10-02 03:03:50,815 WARNING [train.py:1204] (2/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_430-116139-0_sp0.9 from training. Duration: 0.9444375 2023-10-02 03:03:52,205 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_53753_210_7234_1_1534320003_3594148_86-329250-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 03:03:52,214 WARNING [train.py:1204] (2/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_406-275649-0_sp1.1 from training. Duration: 0.3818125 2023-10-02 03:03:53,590 WARNING [train.py:1204] (2/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_1083-147536-0_sp1.1 from training. Duration: 0.8818125 2023-10-02 03:03:54,952 WARNING [train.py:1204] (2/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_54-311741-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 03:03:54,957 WARNING [train.py:1204] (2/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_237-58433-0_sp0.9 from training. Duration: 0.82225 2023-10-02 03:03:54,998 WARNING [train.py:1204] (2/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_98-51676-0_sp1.1 from training. Duration: 0.663625 2023-10-02 03:03:56,242 WARNING [train.py:1204] (2/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_386-270472-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 03:03:56,331 WARNING [train.py:1204] (2/4) Exclude cut with ID _19023_210_4353_1_1531378892552_5820780_83-254794-0_sp1.1 from training. Duration: 0.7818125 2023-10-02 03:03:58,199 WARNING [train.py:1204] (2/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_314-93656-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 03:03:58,217 WARNING [train.py:1204] (2/4) Exclude cut with ID _14170_210_5219_1_1530525949868_4469650_336-213530-0 from training. Duration: 0.9 2023-10-02 03:03:59,618 WARNING [train.py:1204] (2/4) Exclude cut with ID _36686_210_5665_1_1533778225913_3646410_65-221120-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 03:04:01,221 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_26433_210_12108_1_1532157158_4056480_271-68178-0_sp1.1 from training. Duration: 0.92725 2023-10-02 03:04:01,255 WARNING [train.py:1204] (2/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_46-207599-0_sp0.9 from training. Duration: 0.711125 2023-10-02 03:04:04,045 INFO [train.py:1046] (2/4) Epoch 21, batch 3000, loss[loss=0.1755, simple_loss=0.2464, pruned_loss=0.05234, over 23801.00 frames. ], tot_loss[loss=0.1757, simple_loss=0.2518, pruned_loss=0.0498, over 4728010.30 frames. ], batch size: 212, lr: 4.89e-03, grad_scale: 8.0 2023-10-02 03:04:04,046 INFO [train.py:1069] (2/4) Computing validation loss 2023-10-02 03:04:18,540 INFO [train.py:1078] (2/4) Epoch 21, validation: loss=0.3071, simple_loss=0.2764, pruned_loss=0.1689, over 1125622.00 frames. 2023-10-02 03:04:18,541 INFO [train.py:1079] (2/4) Maximum memory allocated so far is 20919MB 2023-10-02 03:04:20,001 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9303W0114-637133 from training. Duration: 0.896 2023-10-02 03:04:20,027 WARNING [train.py:1204] (2/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_490-207861-0 from training. Duration: 0.98 2023-10-02 03:04:24,002 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6767_210_3273_1_1527855783_1718932_173-256601-0_sp0.9 from training. Duration: 0.888875 2023-10-02 03:04:24,057 WARNING [train.py:1204] (2/4) Exclude cut with ID _13921_210_9994_1_1530841782272_3503520_327-69677-0_sp1.1 from training. Duration: 0.8181875 2023-10-02 03:04:24,104 WARNING [train.py:1204] (2/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_508-335369-0 from training. Duration: 0.48 2023-10-02 03:04:25,496 WARNING [train.py:1204] (2/4) Exclude cut with ID _64631_210_27767_1_1535349580068_4165160_24-46567-0_sp1.1 from training. Duration: 0.963625 2023-10-02 03:04:31,834 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_89-35748-0_sp1.1 from training. Duration: 0.7181875 2023-10-02 03:04:33,477 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.balancer1.prob, batch_count=728346.6666666666, ans=0.125 2023-10-02 03:04:37,015 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.0.layers.1.self_attn_weights.whiten_keys, num_groups=4, num_channels=128, metric=2.31 vs. limit=6.0 2023-10-02 03:04:40,673 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_26433_210_12108_1_1532157158_4056480_578-262107-0_sp1.1 from training. Duration: 0.9 2023-10-02 03:04:46,829 WARNING [train.py:1204] (2/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_129-337802-0 from training. Duration: 0.72 2023-10-02 03:04:46,918 WARNING [train.py:1204] (2/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_356-167816-0_sp0.9 from training. Duration: 0.8 2023-10-02 03:04:51,182 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.562e+02 1.800e+02 1.986e+02 2.235e+02 3.212e+02, threshold=3.971e+02, percent-clipped=0.0 2023-10-02 03:04:51,270 WARNING [train.py:1204] (2/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_137-116442-0_sp1.1 from training. Duration: 0.7090625 2023-10-02 03:04:51,318 WARNING [train.py:1204] (2/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_358-172940-0_sp1.1 from training. Duration: 0.963625 2023-10-02 03:04:51,345 WARNING [train.py:1204] (2/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_668-344147-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 03:04:51,668 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.bypass.skip_rate, batch_count=728413.3333333334, ans=0.09899494936611666 2023-10-02 03:04:54,183 WARNING [train.py:1204] (2/4) Exclude cut with ID _62337_210_12392_1_1535009273105_3412229_255-157667-0_sp1.1 from training. Duration: 0.936375 2023-10-02 03:04:54,186 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_486-151131-0 from training. Duration: 0.85 2023-10-02 03:04:55,737 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_53638_210_21581_1_1534297319_5808449_885-80172-0 from training. Duration: 0.96 2023-10-02 03:04:58,798 WARNING [train.py:1204] (2/4) Exclude cut with ID _50093_210_4220_1_1534417141207_3432299_105-183067-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 03:04:58,857 WARNING [train.py:1204] (2/4) Exclude cut with ID _7562_210_2084_1_1528887767916_3946229_345-169156-0_sp0.9 from training. Duration: 0.6444375 2023-10-02 03:05:01,648 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_4528_210_3273_1_1527076654_3276777_209-219804-0_sp0.9 from training. Duration: 0.7666875 2023-10-02 03:05:01,683 WARNING [train.py:1204] (2/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_35-153886-0_sp0.9 from training. Duration: 0.9333125 2023-10-02 03:05:01,742 WARNING [train.py:1204] (2/4) Exclude cut with ID _30800_210_22382_1_1532499347904_4515959_332-121925-0_sp1.1 from training. Duration: 0.92725 2023-10-02 03:05:01,744 WARNING [train.py:1204] (2/4) Exclude cut with ID _59202_210_26984_1_1534816810612_3549174_495-65515-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 03:05:07,135 WARNING [train.py:1204] (2/4) Exclude cut with ID _6274_210_2084_1_1527937583837_7494020_769-168302-0_sp1.1 from training. Duration: 0.6454375 2023-10-02 03:05:07,188 WARNING [train.py:1204] (2/4) Exclude cut with ID _24271_210_12658_1_1531987270022_7190519_708-71345-0_sp1.1 from training. Duration: 0.936375 2023-10-02 03:05:07,188 WARNING [train.py:1204] (2/4) Exclude cut with ID 3033-130750-0096-55598-0_sp0.9 from training. Duration: 0.92225 2023-10-02 03:05:09,864 WARNING [train.py:1204] (2/4) Exclude cut with ID _14087_210_2253_1_1530529224512_3014029_219-224445-0_sp0.9 from training. Duration: 0.9333125 2023-10-02 03:05:12,021 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_174-277364-0 from training. Duration: 0.71 2023-10-02 03:05:14,066 WARNING [train.py:1204] (2/4) Exclude cut with ID _45021_210_9994_1_1533619751094_3330288_96-239010-0_sp1.1 from training. Duration: 0.82725 2023-10-02 03:05:14,099 WARNING [train.py:1204] (2/4) Exclude cut with ID _31602_210_5854_1_1532677021456_4458250_489-305116-0_sp1.1 from training. Duration: 0.97275 2023-10-02 03:05:14,117 WARNING [train.py:1204] (2/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_269-154726-0_sp0.9 from training. Duration: 0.9555625 2023-10-02 03:05:14,470 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.ff3_skip_rate, batch_count=728480.0, ans=0.0 2023-10-02 03:05:15,704 WARNING [train.py:1204] (2/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_126-54418-0_sp1.1 from training. Duration: 0.92725 2023-10-02 03:05:17,139 WARNING [train.py:1204] (2/4) Exclude cut with ID _42326_210_7770_1_1533814282531_3707269_182-224159-0_sp1.1 from training. Duration: 0.92725 2023-10-02 03:05:18,626 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7002_210_1794_1_1527939683_5157286_1-281264-0_sp0.9 from training. Duration: 0.488875 2023-10-02 03:05:18,671 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_86-16881-0 from training. Duration: 0.58 2023-10-02 03:05:19,976 WARNING [train.py:1204] (2/4) Exclude cut with ID _19514_210_9259_1_1531530071330_5084230_637-279094-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 03:05:20,001 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_26433_210_12108_1_1532157158_4056480_578-262107-0 from training. Duration: 0.99 2023-10-02 03:05:20,045 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6291_210_3273_1_1527683649_1078498_82-221196-0_sp1.1 from training. Duration: 0.6909375 2023-10-02 03:05:22,433 WARNING [train.py:1204] (2/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_402-102311-0 from training. Duration: 0.67 2023-10-02 03:05:24,502 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.nonlin_attention.balancer.prob, batch_count=728546.6666666666, ans=0.125 2023-10-02 03:05:25,083 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.2.encoder.layers.0.self_attn2.whiten, num_groups=1, num_channels=384, metric=18.92 vs. limit=22.5 2023-10-02 03:05:25,683 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1015-343884-0_sp1.1 from training. Duration: 0.563625 2023-10-02 03:05:27,152 WARNING [train.py:1204] (2/4) Exclude cut with ID _13684_210_7497_1_1530619354863_3598359_312-235606-0_sp1.1 from training. Duration: 0.5454375 2023-10-02 03:05:27,183 WARNING [train.py:1204] (2/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_55-31904-0 from training. Duration: 0.88 2023-10-02 03:05:27,251 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_3133_210_2087_1_1526106446_2324690_25-332138-0 from training. Duration: 0.91 2023-10-02 03:05:27,252 WARNING [train.py:1204] (2/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_912-282412-0_sp0.9 from training. Duration: 0.5444375 2023-10-02 03:05:27,310 WARNING [train.py:1204] (2/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_458-299435-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 03:05:28,700 WARNING [train.py:1204] (2/4) Exclude cut with ID _69584_210_5219_1_1535785133775_4044450_275-88403-0_sp1.1 from training. Duration: 0.92725 2023-10-02 03:05:28,717 WARNING [train.py:1204] (2/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_841-341679-0_sp1.1 from training. Duration: 0.52725 2023-10-02 03:05:28,723 WARNING [train.py:1204] (2/4) Exclude cut with ID _41391_210_7994_1_1533603771872_3638201_1-54521-0_sp1.1 from training. Duration: 0.97275 2023-10-02 03:05:30,621 WARNING [train.py:1204] (2/4) Exclude cut with ID _56235_210_10640_1_1534557335235_4161099_495-186609-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 03:05:31,277 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.3.encoder.layers.2.feed_forward2.out_whiten, num_groups=1, num_channels=512, metric=12.32 vs. limit=15.0 2023-10-02 03:05:32,090 WARNING [train.py:1204] (2/4) Exclude cut with ID _4681_210_3525_1_1527250689464_120889_15-126760-0 from training. Duration: 0.81 2023-10-02 03:05:33,384 INFO [train.py:1046] (2/4) Epoch 21, batch 3050, loss[loss=0.1795, simple_loss=0.2632, pruned_loss=0.04788, over 24433.00 frames. ], tot_loss[loss=0.1763, simple_loss=0.2526, pruned_loss=0.04997, over 4735612.23 frames. ], batch size: 69, lr: 4.89e-03, grad_scale: 8.0 2023-10-02 03:05:33,547 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_3019_210_2028_1_1526044245_3264865_127-135615-0_sp1.1 from training. Duration: 0.8545625 2023-10-02 03:05:36,331 WARNING [train.py:1204] (2/4) Exclude cut with ID _30191_210_1664_1_1533189915047_7196670_915-21561-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 03:05:37,602 WARNING [train.py:1204] (2/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_6-214815-0_sp0.9 from training. Duration: 0.9444375 2023-10-02 03:05:42,187 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_45399_210_4852_1_1533624725_7579737_190-137494-0_sp1.1 from training. Duration: 0.97275 2023-10-02 03:05:45,521 WARNING [train.py:1204] (2/4) Exclude cut with ID _41314_210_12651_1_1533459476543_3766460_207-132038-0 from training. Duration: 0.94 2023-10-02 03:05:45,741 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.conv_module1.balancer1.prob, batch_count=728613.3333333334, ans=0.125 2023-10-02 03:05:49,708 WARNING [train.py:1204] (2/4) Exclude cut with ID _14603_210_4220_1_1530615688441_3097319_90-31383-0 from training. Duration: 0.87 2023-10-02 03:05:51,040 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_15517_210_6753_1_1530952293_1664795_119-31001-0 from training. Duration: 0.94 2023-10-02 03:05:51,080 WARNING [train.py:1204] (2/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_684-85342-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 03:05:55,620 WARNING [train.py:1204] (2/4) Exclude cut with ID _14603_210_4220_1_1530615688441_3097319_97-325053-0_sp1.1 from training. Duration: 0.836375 2023-10-02 03:05:59,889 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_68702_210_23689_1_1535799577_3420068_168-25150-0_sp1.1 from training. Duration: 0.97275 2023-10-02 03:05:59,907 WARNING [train.py:1204] (2/4) Exclude cut with ID _65134_210_14763_1_1535335393985_3505368_111-27984-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 03:05:59,956 WARNING [train.py:1204] (2/4) Exclude cut with ID _48678_210_5663_1_1533988830947_3769199_276-226230-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 03:06:00,136 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.bypass.scale_min, batch_count=728680.0, ans=0.2 2023-10-02 03:06:02,711 WARNING [train.py:1204] (2/4) Exclude cut with ID _28427_210_4220_1_1532581240967_3061069_43-84928-0_sp1.1 from training. Duration: 0.9 2023-10-02 03:06:04,539 WARNING [train.py:1204] (2/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_158-250116-0_sp1.1 from training. Duration: 0.563625 2023-10-02 03:06:04,579 WARNING [train.py:1204] (2/4) Exclude cut with ID _66873_210_5517_1_1535454136506_4293370_279-332556-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 03:06:05,863 WARNING [train.py:1204] (2/4) Exclude cut with ID _42863_210_9070_1_1533902398788_3696920_488-230146-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 03:06:05,864 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_387-247159-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 03:06:07,208 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6729_210_2550_1_1528032481_1788725_261-42619-0_sp1.1 from training. Duration: 0.97275 2023-10-02 03:06:08,639 WARNING [train.py:1204] (2/4) Exclude cut with ID _19896_210_1883_1_1531709022662_6649569_831-44724-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 03:06:08,918 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.conv_module2.balancer2.prob, batch_count=728746.6666666666, ans=0.125 2023-10-02 03:06:10,101 WARNING [train.py:1204] (2/4) Exclude cut with ID _55153_210_2253_1_1534384786677_3162096_280-285944-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 03:06:10,132 WARNING [train.py:1204] (2/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_448-4048-0 from training. Duration: 0.96 2023-10-02 03:06:11,409 WARNING [train.py:1204] (2/4) Exclude cut with ID _29678_210_7893_1_1532421042787_3906118_456-202548-0_sp1.1 from training. Duration: 0.97275 2023-10-02 03:06:11,422 WARNING [train.py:1204] (2/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_107-332398-0_sp1.1 from training. Duration: 0.7090625 2023-10-02 03:06:16,165 WARNING [train.py:1204] (2/4) Exclude cut with ID _13921_210_9994_1_1530838702800_3003429_46-149444-0_sp1.1 from training. Duration: 0.8545625 2023-10-02 03:06:17,493 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_257-20733-0_sp0.9 from training. Duration: 0.8444375 2023-10-02 03:06:17,546 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_53753_210_7234_1_1534320003_3594148_385-234809-0_sp1.1 from training. Duration: 0.963625 2023-10-02 03:06:17,593 WARNING [train.py:1204] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_292-215346-0_sp1.1 from training. Duration: 0.8818125 2023-10-02 03:06:22,309 WARNING [train.py:1204] (2/4) Exclude cut with ID _68965_210_1883_1_1535698962272_3497490_196-124268-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 03:06:23,601 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_23801_210_16223_1_1532066392_3294657_570-5439-0_sp1.1 from training. Duration: 0.8818125 2023-10-02 03:06:29,840 WARNING [train.py:1204] (2/4) Exclude cut with ID _5166_210_2084_1_1527073197160_4087993_349-6928-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 03:06:29,898 WARNING [train.py:1204] (2/4) Exclude cut with ID _60621_210_17071_1_1535013053530_3671739_226-255202-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 03:06:29,902 WARNING [train.py:1204] (2/4) Exclude cut with ID _41817_210_18835_1_1533434477160_7423049_448-130302-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 03:06:31,358 WARNING [train.py:1204] (2/4) Exclude cut with ID _6653_210_2629_1_1527919172441_6732679_758-143677-0_sp1.1 from training. Duration: 0.8909375 2023-10-02 03:06:32,649 WARNING [train.py:1204] (2/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_912-47473-0_sp0.9 from training. Duration: 0.7555625 2023-10-02 03:06:32,677 WARNING [train.py:1204] (2/4) Exclude cut with ID _6694_210_4599_1_1527852637490_3965529_356-169233-0_sp1.1 from training. Duration: 0.9 2023-10-02 03:06:34,068 WARNING [train.py:1204] (2/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_1-99680-0 from training. Duration: 0.67 2023-10-02 03:06:35,448 WARNING [train.py:1204] (2/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_199-86432-0_sp1.1 from training. Duration: 0.8909375 2023-10-02 03:06:35,470 WARNING [train.py:1204] (2/4) Exclude cut with ID _61148_210_6897_1_1535094022726_4217610_362-182430-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 03:06:37,316 WARNING [train.py:1204] (2/4) Exclude cut with ID _49376_210_5986_1_1533985278732_3370740_301-154763-0 from training. Duration: 0.98 2023-10-02 03:06:38,766 WARNING [train.py:1204] (2/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_572-185150-0_sp1.1 from training. Duration: 0.8818125 2023-10-02 03:06:39,808 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.0.layers.1.self_attn_weights.whiten_keys, num_groups=4, num_channels=128, metric=2.67 vs. limit=6.0 2023-10-02 03:06:46,124 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_47896_210_20508_1_1534492518_5958868_558-314572-0_sp1.1 from training. Duration: 0.8818125 2023-10-02 03:06:47,545 INFO [train.py:1046] (2/4) Epoch 21, batch 3100, loss[loss=0.1524, simple_loss=0.2304, pruned_loss=0.03722, over 20728.00 frames. ], tot_loss[loss=0.1753, simple_loss=0.2518, pruned_loss=0.04942, over 4717153.69 frames. ], batch size: 45, lr: 4.89e-03, grad_scale: 8.0 2023-10-02 03:06:49,034 WARNING [train.py:1204] (2/4) Exclude cut with ID _45560_210_21982_1_1533812603951_3584999_111-6376-0_sp0.9 from training. Duration: 0.9333125 2023-10-02 03:06:51,117 WARNING [train.py:1204] (2/4) Exclude cut with ID _14170_210_5219_1_1530525949868_4469650_412-317077-0_sp1.1 from training. Duration: 0.6818125 2023-10-02 03:06:52,544 WARNING [train.py:1204] (2/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_718-68505-0 from training. Duration: 0.89 2023-10-02 03:06:53,959 WARNING [train.py:1204] (2/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_77-311404-0 from training. Duration: 0.94 2023-10-02 03:06:55,527 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.ff2_skip_rate, batch_count=728946.6666666666, ans=0.0 2023-10-02 03:06:56,695 WARNING [train.py:1204] (2/4) Exclude cut with ID _60229_210_27767_1_1534838406158_3655350_264-279573-0 from training. Duration: 0.83 2023-10-02 03:06:56,836 WARNING [train.py:1204] (2/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_106-300985-0_sp1.1 from training. Duration: 0.8181875 2023-10-02 03:06:59,795 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_4183_210_2087_1_1526729100_1869838_79-277189-0_sp1.1 from training. Duration: 0.963625 2023-10-02 03:06:59,804 WARNING [train.py:1204] (2/4) Exclude cut with ID _28427_210_4220_1_1532581240967_3061069_195-295268-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 03:07:03,108 WARNING [train.py:1204] (2/4) Exclude cut with ID _4092_210_2151_1_1526643044449_3135759_356-278694-0_sp0.9 from training. Duration: 0.52225 2023-10-02 03:07:04,817 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.feed_forward1.out_proj.dropout_p, batch_count=729013.3333333334, ans=0.1 2023-10-02 03:07:07,159 WARNING [train.py:1204] (2/4) Exclude cut with ID _14459_210_2228_1_1531443641946_7516700_777-71446-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 03:07:11,958 WARNING [train.py:1204] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_520-126729-0 from training. Duration: 0.7 2023-10-02 03:07:16,808 WARNING [train.py:1204] (2/4) Exclude cut with ID _13684_210_7497_1_1530619354863_3598359_312-235606-0_sp0.9 from training. Duration: 0.6666875 2023-10-02 03:07:16,848 WARNING [train.py:1204] (2/4) Exclude cut with ID _37537_210_2253_1_1533171758568_3421949_283-349402-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 03:07:16,880 WARNING [train.py:1204] (2/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_29-117932-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 03:07:16,900 WARNING [train.py:1204] (2/4) Exclude cut with ID _29303_210_2575_1_1532392237303_7410649_721-283351-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 03:07:18,288 WARNING [train.py:1204] (2/4) Exclude cut with ID _14886_210_4220_1_1530702020355_3516190_175-229073-0_sp0.9 from training. Duration: 0.5 2023-10-02 03:07:21,276 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.522e+02 1.950e+02 2.237e+02 2.681e+02 4.003e+02, threshold=4.474e+02, percent-clipped=0.0 2023-10-02 03:07:21,374 WARNING [train.py:1204] (2/4) Exclude cut with ID _15458_210_8341_1_1530858614263_3834410_70-268830-0_sp1.1 from training. Duration: 0.97275 2023-10-02 03:07:21,405 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_2295_210_2087_1_1525500993_2407863_234-83104-0 from training. Duration: 0.62 2023-10-02 03:07:21,406 WARNING [train.py:1204] (2/4) Exclude cut with ID _22961_210_8254_1_1531976366672_4278180_317-199546-0_sp1.1 from training. Duration: 0.7818125 2023-10-02 03:07:21,989 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.4.encoder.layers.2.self_attn_weights.whiten_keys, num_groups=4, num_channels=128, metric=1.98 vs. limit=6.0 2023-10-02 03:07:22,802 WARNING [train.py:1204] (2/4) Exclude cut with ID _27894_210_12392_1_1532415496078_6902439_547-196022-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 03:07:24,149 WARNING [train.py:1204] (2/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_46-207599-0 from training. Duration: 0.64 2023-10-02 03:07:25,605 WARNING [train.py:1204] (2/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_426-326246-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 03:07:28,415 WARNING [train.py:1204] (2/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_445-117398-0_sp0.9 from training. Duration: 0.988875 2023-10-02 03:07:29,696 WARNING [train.py:1204] (2/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_563-347832-0 from training. Duration: 0.89 2023-10-02 03:07:31,768 WARNING [train.py:1204] (2/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_252-216674-0 from training. Duration: 0.54 2023-10-02 03:07:31,852 WARNING [train.py:1204] (2/4) Exclude cut with ID _59611_210_8895_1_1534741198240_3650361_187-212779-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 03:07:33,157 WARNING [train.py:1204] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_282-159367-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 03:07:35,994 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_68702_210_23689_1_1535799577_3420068_33-136883-0_sp1.1 from training. Duration: 0.936375 2023-10-02 03:07:37,315 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_47228_210_4983_1_1533780021_3672027_184-293706-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 03:07:37,340 WARNING [train.py:1204] (2/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_656-188361-0_sp0.9 from training. Duration: 0.9666875 2023-10-02 03:07:38,742 WARNING [train.py:1204] (2/4) Exclude cut with ID _4866_210_2577_1_1527404412284_4364659_352-157930-0_sp0.9 from training. Duration: 0.9 2023-10-02 03:07:38,742 WARNING [train.py:1204] (2/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_210-106047-0_sp1.1 from training. Duration: 0.8818125 2023-10-02 03:07:40,164 WARNING [train.py:1204] (2/4) Exclude cut with ID _9389_210_1664_1_1529217337301_5289269_289-213417-0_sp1.1 from training. Duration: 0.8090625 2023-10-02 03:07:40,205 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_3910_210_1794_1_1526777667_3747260_178-41186-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 03:07:40,209 WARNING [train.py:1204] (2/4) Exclude cut with ID _27052_210_5986_1_1532329197187_3585009_24-172263-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 03:07:40,213 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_5522_210_3273_1_1527249606_3909096_318-203153-0_sp1.1 from training. Duration: 0.3909375 2023-10-02 03:07:46,215 WARNING [train.py:1204] (2/4) Exclude cut with ID _27894_210_12392_1_1532415496078_6902439_222-51982-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 03:07:46,321 WARNING [train.py:1204] (2/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_87-131427-0 from training. Duration: 0.78 2023-10-02 03:07:49,564 WARNING [train.py:1204] (2/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_372-298093-0_sp0.9 from training. Duration: 0.888875 2023-10-02 03:07:49,621 WARNING [train.py:1204] (2/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_663-166973-0 from training. Duration: 0.8 2023-10-02 03:07:49,649 WARNING [train.py:1204] (2/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_429-177469-0_sp1.1 from training. Duration: 0.936375 2023-10-02 03:07:50,962 WARNING [train.py:1204] (2/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_724-172651-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 03:07:51,002 WARNING [train.py:1204] (2/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_103-336064-0 from training. Duration: 0.51 2023-10-02 03:07:52,049 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.0.layers.0.nonlin_attention.whiten1, num_groups=1, num_channels=144, metric=9.08 vs. limit=10.0 2023-10-02 03:08:01,480 WARNING [train.py:1204] (2/4) Exclude cut with ID _48677_210_5663_1_1533902405919_3892989_111-11439-0 from training. Duration: 0.96 2023-10-02 03:08:02,732 INFO [train.py:1046] (2/4) Epoch 21, batch 3150, loss[loss=0.1899, simple_loss=0.2743, pruned_loss=0.05274, over 24365.00 frames. ], tot_loss[loss=0.1746, simple_loss=0.2509, pruned_loss=0.0492, over 4726481.61 frames. ], batch size: 77, lr: 4.89e-03, grad_scale: 8.0 2023-10-02 03:08:02,897 WARNING [train.py:1204] (2/4) Exclude cut with ID _8085_210_4557_1_1528455633493_3716100_95-103398-0_sp1.1 from training. Duration: 0.963625 2023-10-02 03:08:04,731 WARNING [train.py:1204] (2/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_1041-110837-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 03:08:06,246 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_50050_210_16223_1_1535435796_7290138_1155-121757-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 03:08:07,425 WARNING [train.py:1204] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_602-275260-0_sp0.9 from training. Duration: 0.911125 2023-10-02 03:08:07,506 WARNING [train.py:1204] (2/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_265-164486-0 from training. Duration: 0.96 2023-10-02 03:08:08,804 WARNING [train.py:1204] (2/4) Exclude cut with ID _46067_210_4220_1_1534125571066_3307450_142-229375-0_sp1.1 from training. Duration: 0.963625 2023-10-02 03:08:08,842 WARNING [train.py:1204] (2/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_310-26186-0_sp1.1 from training. Duration: 0.636375 2023-10-02 03:08:10,128 WARNING [train.py:1204] (2/4) Exclude cut with ID _28461_210_2253_1_1532771775189_3041299_136-270644-0 from training. Duration: 0.93 2023-10-02 03:08:11,621 WARNING [train.py:1204] (2/4) Exclude cut with ID _14860_210_4915_1_1530702020340_7343240_438-286372-0_sp1.1 from training. Duration: 0.936375 2023-10-02 03:08:13,690 WARNING [train.py:1204] (2/4) Exclude cut with ID IC0128W0004-63309 from training. Duration: 0.831 2023-10-02 03:08:15,304 WARNING [train.py:1204] (2/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_135-174080-0 from training. Duration: 0.53 2023-10-02 03:08:15,342 WARNING [train.py:1204] (2/4) Exclude cut with ID _41326_210_5854_1_1533778076509_3492080_277-59511-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 03:08:17,175 WARNING [train.py:1204] (2/4) Exclude cut with ID IC0144W0112-71379 from training. Duration: 0.992 2023-10-02 03:08:17,243 WARNING [train.py:1204] (2/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_711-15688-0_sp0.9 from training. Duration: 0.47775 2023-10-02 03:08:18,601 WARNING [train.py:1204] (2/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_237-332027-0 from training. Duration: 0.95 2023-10-02 03:08:20,466 WARNING [train.py:1204] (2/4) Exclude cut with ID _7970_210_1883_1_1528455616296_3472230_25-178407-0 from training. Duration: 0.71 2023-10-02 03:08:20,467 WARNING [train.py:1204] (2/4) Exclude cut with ID _8480_210_1874_1_1528589391205_6728330_309-139896-0 from training. Duration: 0.81 2023-10-02 03:08:20,484 WARNING [train.py:1204] (2/4) Exclude cut with ID _39571_210_13449_1_1533801584143_3651077_319-268877-0_sp1.1 from training. Duration: 0.936375 2023-10-02 03:08:20,488 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_43802_210_2290_1_1533516203_8829504_155-246880-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 03:08:21,824 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_566-122599-0_sp1.1 from training. Duration: 0.936375 2023-10-02 03:08:21,934 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_12868_210_6753_1_1530701932_530275_18-76322-0 from training. Duration: 0.87 2023-10-02 03:08:23,290 WARNING [train.py:1204] (2/4) Exclude cut with ID _56494_210_4220_1_1535284837625_3033169_159-106035-0_sp1.1 from training. Duration: 0.963625 2023-10-02 03:08:24,584 WARNING [train.py:1204] (2/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_98-276013-0_sp1.1 from training. Duration: 0.963625 2023-10-02 03:08:24,659 WARNING [train.py:1204] (2/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_330-216124-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 03:08:24,803 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.1.ff2_skip_rate, batch_count=729346.6666666666, ans=0.0 2023-10-02 03:08:27,420 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_177-58005-0_sp0.9 from training. Duration: 0.688875 2023-10-02 03:08:31,625 WARNING [train.py:1204] (2/4) Exclude cut with ID _56235_210_10640_1_1534557335235_4161099_343-139898-0 from training. Duration: 0.96 2023-10-02 03:08:31,687 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6961_210_2087_1_1527937168_6815011_395-185060-0_sp1.1 from training. Duration: 0.863625 2023-10-02 03:08:34,922 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7002_210_1794_1_1527939683_5157286_184-229630-0_sp0.9 from training. Duration: 0.82225 2023-10-02 03:08:34,961 WARNING [train.py:1204] (2/4) Exclude cut with ID _59170_210_6721_1_1534983633960_4203335_153-18895-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 03:08:35,011 WARNING [train.py:1204] (2/4) Exclude cut with ID _49643_210_7989_1_1534640244224_3939031_598-161926-0 from training. Duration: 0.9 2023-10-02 03:08:37,857 WARNING [train.py:1204] (2/4) Exclude cut with ID _4068_210_2629_1_1526697350395_1546259_224-288693-0 from training. Duration: 0.59 2023-10-02 03:08:39,200 WARNING [train.py:1204] (2/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_718-68505-0_sp1.1 from training. Duration: 0.8090625 2023-10-02 03:08:39,244 WARNING [train.py:1204] (2/4) Exclude cut with ID _4068_210_2629_1_1526697350395_1546259_224-288693-0_sp0.9 from training. Duration: 0.6555625 2023-10-02 03:08:39,255 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_2296_210_2087_1_1525503448_3397335_57-185430-0_sp0.9 from training. Duration: 0.6666875 2023-10-02 03:08:40,505 WARNING [train.py:1204] (2/4) Exclude cut with ID _15458_210_8341_1_1530858614263_3834410_49-235472-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 03:08:40,507 WARNING [train.py:1204] (2/4) Exclude cut with ID _64135_210_2253_1_1535677267876_7608972_498-5039-0_sp0.9 from training. Duration: 0.8666875 2023-10-02 03:08:42,443 WARNING [train.py:1204] (2/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_283-220372-0_sp0.9 from training. Duration: 0.62225 2023-10-02 03:08:42,451 WARNING [train.py:1204] (2/4) Exclude cut with ID _6473_210_1741_1_1527901307396_3815380_108-17335-0_sp0.9 from training. Duration: 0.72225 2023-10-02 03:08:43,832 WARNING [train.py:1204] (2/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_113-92608-0 from training. Duration: 0.54 2023-10-02 03:08:44,114 INFO [scaling.py:1118] (2/4) WithLoss: name=encoder.encoders.3.encoder.layers.3.self_attn_weights, loss-sum=0.000e+00 2023-10-02 03:08:45,763 WARNING [train.py:1204] (2/4) Exclude cut with ID _14170_210_5219_1_1530525949868_4469650_272-249985-0_sp1.1 from training. Duration: 0.6090625 2023-10-02 03:08:45,771 WARNING [train.py:1204] (2/4) Exclude cut with ID _50376_210_13401_1_1534031502101_4217740_518-187390-0_sp1.1 from training. Duration: 0.97275 2023-10-02 03:08:47,111 WARNING [train.py:1204] (2/4) Exclude cut with ID _59738_210_15379_1_1534849512562_3941470_165-248260-0_sp1.1 from training. Duration: 0.92725 2023-10-02 03:08:47,120 WARNING [train.py:1204] (2/4) Exclude cut with ID _23818_210_6228_1_1531917063884_3676920_400-343925-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 03:08:48,510 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6961_210_2087_1_1527937168_6815011_562-165518-0 from training. Duration: 0.77 2023-10-02 03:08:50,385 WARNING [train.py:1204] (2/4) Exclude cut with ID _24271_210_12658_1_1531987270022_7190519_289-207931-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 03:08:51,825 WARNING [train.py:1204] (2/4) Exclude cut with ID _14632_210_9988_1_1530691279392_3923200_332-105904-0 from training. Duration: 0.9 2023-10-02 03:08:51,845 WARNING [train.py:1204] (2/4) Exclude cut with ID _40402_210_18219_1_1533522324115_2953150_203-232438-0_sp1.1 from training. Duration: 0.97275 2023-10-02 03:08:53,196 WARNING [train.py:1204] (2/4) Exclude cut with ID _13252_210_1925_1_1530671372047_3336909_263-168388-0 from training. Duration: 0.86 2023-10-02 03:08:53,267 WARNING [train.py:1204] (2/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_174-205058-0 from training. Duration: 0.95 2023-10-02 03:08:54,631 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_94-195801-0_sp1.1 from training. Duration: 0.8545625 2023-10-02 03:08:55,994 WARNING [train.py:1204] (2/4) Exclude cut with ID _55153_210_2253_1_1534384786677_3162096_269-103204-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 03:08:57,332 WARNING [train.py:1204] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_748-36150-0 from training. Duration: 0.91 2023-10-02 03:08:57,410 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_12868_210_6753_1_1530702479_3610564_148-172861-0_sp1.1 from training. Duration: 0.4545625 2023-10-02 03:08:58,662 WARNING [train.py:1204] (2/4) Exclude cut with ID _67499_210_17282_1_1535509805220_3692430_460-77730-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 03:09:01,374 WARNING [train.py:1204] (2/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_374-20753-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 03:09:02,842 WARNING [train.py:1204] (2/4) Exclude cut with ID _45329_210_7005_1_1534157951211_6774339_374-289983-0_sp1.1 from training. Duration: 0.97275 2023-10-02 03:09:02,879 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_34886_210_10633_1_1533203882_1755661_76-156096-0_sp0.9 from training. Duration: 0.9666875 2023-10-02 03:09:07,627 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_238-256668-0_sp0.9 from training. Duration: 0.7666875 2023-10-02 03:09:07,691 WARNING [train.py:1204] (2/4) Exclude cut with ID _46683_210_9642_1_1533730913492_4591116_489-146727-0_sp1.1 from training. Duration: 0.97275 2023-10-02 03:09:10,240 WARNING [train.py:1204] (2/4) Exclude cut with ID _13938_210_9988_1_1530518718978_3693960_304-112784-0_sp0.9 from training. Duration: 0.511125 2023-10-02 03:09:14,964 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.feed_forward2.hidden_balancer.prob, batch_count=729546.6666666666, ans=0.125 2023-10-02 03:09:16,708 WARNING [train.py:1204] (2/4) Exclude cut with ID _24271_210_12658_1_1531987270022_7190519_243-46549-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 03:09:16,719 WARNING [train.py:1204] (2/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_78-182912-0_sp1.1 from training. Duration: 0.536375 2023-10-02 03:09:17,970 INFO [train.py:1046] (2/4) Epoch 21, batch 3200, loss[loss=0.1927, simple_loss=0.2612, pruned_loss=0.06208, over 23812.00 frames. ], tot_loss[loss=0.1736, simple_loss=0.2491, pruned_loss=0.04909, over 4712135.27 frames. ], batch size: 179, lr: 4.89e-03, grad_scale: 16.0 2023-10-02 03:09:20,813 WARNING [train.py:1204] (2/4) Exclude cut with ID _57572_210_5517_1_1534921219624_3809484_121-321334-0_sp1.1 from training. Duration: 0.97275 2023-10-02 03:09:22,210 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_40047_210_2290_1_1533290519_2374311_254-102149-0_sp1.1 from training. Duration: 0.963625 2023-10-02 03:09:22,211 WARNING [train.py:1204] (2/4) Exclude cut with ID _28461_210_2253_1_1532771775189_3041299_96-152158-0 from training. Duration: 0.85 2023-10-02 03:09:24,138 WARNING [train.py:1204] (2/4) Exclude cut with ID _29877_210_6228_1_1532397791106_5276149_161-102979-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 03:09:29,684 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_3787_210_2040_1_1526477544_5478586_603-120324-0_sp0.9 from training. Duration: 0.9 2023-10-02 03:09:29,889 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.bypass_mid.scale_min, batch_count=729613.3333333334, ans=0.2 2023-10-02 03:09:32,429 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_41750_210_1960_1_1533520678_3948042_247-213456-0_sp1.1 from training. Duration: 0.97275 2023-10-02 03:09:34,570 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.conv_module1.balancer2.prob, batch_count=729680.0, ans=0.125 2023-10-02 03:09:38,859 WARNING [train.py:1204] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_127-119109-0_sp0.9 from training. Duration: 0.8 2023-10-02 03:09:46,466 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.bypass_mid.scale_min, batch_count=729746.6666666666, ans=0.2 2023-10-02 03:09:49,405 WARNING [train.py:1204] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_215-202973-0 from training. Duration: 0.88 2023-10-02 03:09:49,655 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.conv_module1.balancer2.prob, batch_count=729746.6666666666, ans=0.125 2023-10-02 03:09:50,801 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.481e+02 1.933e+02 2.083e+02 2.478e+02 3.380e+02, threshold=4.166e+02, percent-clipped=0.0 2023-10-02 03:09:50,881 WARNING [train.py:1204] (2/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_17-320491-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 03:09:51,152 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.feed_forward1.out_proj.dropout_p, batch_count=729746.6666666666, ans=0.1 2023-10-02 03:09:51,230 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=729746.6666666666, ans=0.1 2023-10-02 03:09:54,192 WARNING [train.py:1204] (2/4) Exclude cut with ID _5166_210_2084_1_1527073197160_4087993_310-114452-0 from training. Duration: 0.59 2023-10-02 03:09:54,896 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.2.encoder.layers.1.feed_forward3.out_whiten, num_groups=1, num_channels=384, metric=9.20 vs. limit=15.0 2023-10-02 03:09:55,419 WARNING [train.py:1204] (2/4) Exclude cut with ID _15133_210_6228_1_1530794512074_3080379_29-264458-0_sp0.9 from training. Duration: 0.6444375 2023-10-02 03:09:59,556 WARNING [train.py:1204] (2/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_159-281310-0_sp1.1 from training. Duration: 0.863625 2023-10-02 03:09:59,566 WARNING [train.py:1204] (2/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_195-167011-0_sp0.9 from training. Duration: 0.8333125 2023-10-02 03:10:00,946 WARNING [train.py:1204] (2/4) Exclude cut with ID _14087_210_2253_1_1530529224512_3014029_156-257607-0_sp1.1 from training. Duration: 0.7545625 2023-10-02 03:10:05,753 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_2295_210_2087_1_1525500993_2407863_116-238894-0 from training. Duration: 0.7 2023-10-02 03:10:07,139 WARNING [train.py:1204] (2/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_321-335475-0_sp0.9 from training. Duration: 0.5 2023-10-02 03:10:08,625 WARNING [train.py:1204] (2/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_188-114464-0 from training. Duration: 0.82 2023-10-02 03:10:09,421 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.1.encoder.layers.1.self_attn1.whiten, num_groups=1, num_channels=256, metric=12.36 vs. limit=22.5 2023-10-02 03:10:11,355 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_2592_210_2290_1_1525697025_4882925_157-70512-0 from training. Duration: 0.91 2023-10-02 03:10:13,100 WARNING [train.py:1204] (2/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_138-64210-0_sp1.1 from training. Duration: 0.663625 2023-10-02 03:10:19,008 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_11305_210_1794_1_1529750428_7178148_651-95702-0_sp1.1 from training. Duration: 0.963625 2023-10-02 03:10:20,681 WARNING [train.py:1204] (2/4) Exclude cut with ID _14224_210_4381_1_1530529062037_4264010_379-184223-0_sp0.9 from training. Duration: 0.9444375 2023-10-02 03:10:20,707 WARNING [train.py:1204] (2/4) Exclude cut with ID _8394_210_6702_1_1528590504316_3540189_381-125325-0_sp1.1 from training. Duration: 0.963625 2023-10-02 03:10:20,761 WARNING [train.py:1204] (2/4) Exclude cut with ID IC0622W0021-307659 from training. Duration: 0.896 2023-10-02 03:10:20,763 WARNING [train.py:1204] (2/4) Exclude cut with ID _7624_210_1531_1_1528109802198_4933101_418-349834-0_sp1.1 from training. Duration: 0.5454375 2023-10-02 03:10:23,627 WARNING [train.py:1204] (2/4) Exclude cut with ID _49728_210_12651_1_1534041034294_6788150_607-3416-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 03:10:25,152 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8275_210_4198_1_1528455320_3892571_491-17707-0 from training. Duration: 0.64 2023-10-02 03:10:26,940 WARNING [train.py:1204] (2/4) Exclude cut with ID _5287_210_2084_1_1527418883040_3878670_333-48508-0 from training. Duration: 0.6 2023-10-02 03:10:27,023 WARNING [train.py:1204] (2/4) Exclude cut with ID _8365_210_2064_1_1528592311070_4677690_371-122295-0 from training. Duration: 0.55 2023-10-02 03:10:29,824 WARNING [train.py:1204] (2/4) Exclude cut with ID _14860_210_4915_1_1530702020340_7343240_836-328489-0 from training. Duration: 0.9 2023-10-02 03:10:31,277 WARNING [train.py:1204] (2/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_1033-274376-0_sp0.9 from training. Duration: 0.9666875 2023-10-02 03:10:32,612 INFO [train.py:1046] (2/4) Epoch 21, batch 3250, loss[loss=0.175, simple_loss=0.2677, pruned_loss=0.04117, over 24485.00 frames. ], tot_loss[loss=0.1736, simple_loss=0.2493, pruned_loss=0.04894, over 4711077.10 frames. ], batch size: 69, lr: 4.88e-03, grad_scale: 16.0 2023-10-02 03:10:33,997 WARNING [train.py:1204] (2/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_475-127455-0_sp0.9 from training. Duration: 0.77775 2023-10-02 03:10:34,003 WARNING [train.py:1204] (2/4) Exclude cut with ID IC0671W0071-331914 from training. Duration: 0.896 2023-10-02 03:10:34,026 WARNING [train.py:1204] (2/4) Exclude cut with ID _30327_210_16049_1_1532484161295_3924169_2-1523-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 03:10:35,317 WARNING [train.py:1204] (2/4) Exclude cut with ID _24314_210_4915_1_1532135078116_7170179_460-208279-0_sp1.1 from training. Duration: 0.97275 2023-10-02 03:10:36,015 WARNING [train.py:1204] (2/4) Exclude cut with ID IC0679W0052-335859 from training. Duration: 0.64 2023-10-02 03:10:38,710 WARNING [train.py:1204] (2/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_3-237434-0_sp1.1 from training. Duration: 0.6909375 2023-10-02 03:10:41,511 WARNING [train.py:1204] (2/4) Exclude cut with ID _69884_210_9771_1_1535846392818_4166169_405-185283-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 03:10:44,550 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=729946.6666666666, ans=0.1 2023-10-02 03:10:48,357 WARNING [train.py:1204] (2/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_927-83684-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 03:10:48,365 WARNING [train.py:1204] (2/4) Exclude cut with ID _6694_210_4599_1_1527852637490_3965529_356-169233-0 from training. Duration: 0.99 2023-10-02 03:10:50,316 WARNING [train.py:1204] (2/4) Exclude cut with ID _19896_210_1883_1_1531709022662_6649569_562-233001-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 03:10:50,356 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_20120_210_6753_1_1531643035_5482859_354-78629-0_sp1.1 from training. Duration: 0.963625 2023-10-02 03:10:50,357 WARNING [train.py:1204] (2/4) Exclude cut with ID _60135_210_13427_1_1534935557922_3651319_96-168198-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 03:10:53,666 WARNING [train.py:1204] (2/4) Exclude cut with ID _31602_210_5854_1_1532677021456_4458250_249-96604-0_sp1.1 from training. Duration: 0.8090625 2023-10-02 03:10:53,696 WARNING [train.py:1204] (2/4) Exclude cut with ID _14100_210_8341_1_1530529320613_3678069_258-224599-0_sp0.9 from training. Duration: 0.8555625 2023-10-02 03:10:56,430 WARNING [train.py:1204] (2/4) Exclude cut with ID _19708_210_9206_1_1532136650733_4238364_215-160135-0_sp1.1 from training. Duration: 0.97275 2023-10-02 03:10:56,468 WARNING [train.py:1204] (2/4) Exclude cut with ID _21878_210_3525_1_1531738772002_7264330_113-18163-0_sp0.9 from training. Duration: 0.988875 2023-10-02 03:10:56,503 WARNING [train.py:1204] (2/4) Exclude cut with ID _64135_210_2253_1_1535677267876_7608972_552-337340-0_sp1.1 from training. Duration: 0.92725 2023-10-02 03:10:57,843 WARNING [train.py:1204] (2/4) Exclude cut with ID _67725_210_27767_1_1535608756671_5105359_10-12887-0_sp1.1 from training. Duration: 0.97275 2023-10-02 03:10:57,850 WARNING [train.py:1204] (2/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_401-284840-0_sp1.1 from training. Duration: 0.97275 2023-10-02 03:10:57,877 WARNING [train.py:1204] (2/4) Exclude cut with ID _44659_210_2721_1_1534036120061_6614860_483-141285-0_sp1.1 from training. Duration: 0.936375 2023-10-02 03:11:01,256 WARNING [train.py:1204] (2/4) Exclude cut with ID _9461_210_4557_1_1529060514726_3677451_358-332734-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 03:11:02,621 WARNING [train.py:1204] (2/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_788-298907-0_sp1.1 from training. Duration: 0.8090625 2023-10-02 03:11:04,046 WARNING [train.py:1204] (2/4) Exclude cut with ID _39411_210_5986_1_1533538825030_3343649_354-267196-0_sp1.1 from training. Duration: 0.92725 2023-10-02 03:11:04,086 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_14953_210_2151_1_1530701710_5616165_192-309792-0_sp1.1 from training. Duration: 0.97275 2023-10-02 03:11:05,878 WARNING [train.py:1204] (2/4) Exclude cut with ID _59170_210_6721_1_1534983633960_4203335_290-151255-0_sp1.1 from training. Duration: 0.92725 2023-10-02 03:11:07,092 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_5575_210_1410_1_1527301305_2570170_249-93769-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 03:11:07,101 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_46013_210_7234_1_1533776362_3744032_463-52378-0_sp1.1 from training. Duration: 0.7818125 2023-10-02 03:11:11,235 WARNING [train.py:1204] (2/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_580-93798-0 from training. Duration: 0.56 2023-10-02 03:11:11,468 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=730080.0, ans=0.1 2023-10-02 03:11:12,532 WARNING [train.py:1204] (2/4) Exclude cut with ID _19514_210_9259_1_1531530071330_5084230_385-13762-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 03:11:12,544 WARNING [train.py:1204] (2/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_458-67321-0_sp1.1 from training. Duration: 0.8818125 2023-10-02 03:11:13,923 WARNING [train.py:1204] (2/4) Exclude cut with ID _41298_210_9019_1_1533430371450_4822143_188-154791-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 03:11:13,993 WARNING [train.py:1204] (2/4) Exclude cut with ID _15037_210_2253_1_1530752294104_3267650_296-192597-0_sp0.9 from training. Duration: 0.9 2023-10-02 03:11:19,496 WARNING [train.py:1204] (2/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_661-264857-0_sp1.1 from training. Duration: 0.8454375 2023-10-02 03:11:25,624 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_34008_210_10431_1_1533345424_8183247_662-317331-0_sp1.1 from training. Duration: 0.8545625 2023-10-02 03:11:26,889 WARNING [train.py:1204] (2/4) Exclude cut with ID _46502_210_13794_1_1533724452198_3657159_241-278329-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 03:11:26,889 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_11349_210_6753_1_1529800861_1101207_26-65602-0 from training. Duration: 0.96 2023-10-02 03:11:26,893 WARNING [train.py:1204] (2/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_448-4048-0_sp1.1 from training. Duration: 0.87275 2023-10-02 03:11:26,900 WARNING [train.py:1204] (2/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_662-275621-0_sp1.1 from training. Duration: 0.3818125 2023-10-02 03:11:26,960 WARNING [train.py:1204] (2/4) Exclude cut with ID _13938_210_9988_1_1530518718978_3693960_256-54454-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 03:11:29,861 WARNING [train.py:1204] (2/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_256-32662-0 from training. Duration: 0.9 2023-10-02 03:11:29,919 WARNING [train.py:1204] (2/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_326-17061-0 from training. Duration: 0.94 2023-10-02 03:11:31,809 WARNING [train.py:1204] (2/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_505-97987-0_sp1.1 from training. Duration: 0.7818125 2023-10-02 03:11:31,900 WARNING [train.py:1204] (2/4) Exclude cut with ID _66873_210_5517_1_1535454136506_4293370_316-294364-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 03:11:33,160 WARNING [train.py:1204] (2/4) Exclude cut with ID _19514_210_9259_1_1531530071330_5084230_762-231063-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 03:11:33,214 WARNING [train.py:1204] (2/4) Exclude cut with ID _4697_210_2069_1_1526815801953_3823098_68-219807-0_sp0.9 from training. Duration: 0.52225 2023-10-02 03:11:35,074 WARNING [train.py:1204] (2/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_479-158053-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 03:11:37,843 WARNING [train.py:1204] (2/4) Exclude cut with ID _66112_210_10635_1_1535590236305_3801484_83-38181-0_sp1.1 from training. Duration: 0.936375 2023-10-02 03:11:37,857 WARNING [train.py:1204] (2/4) Exclude cut with ID _55857_210_2346_1_1534644007831_6898209_255-146346-0_sp1.1 from training. Duration: 0.963625 2023-10-02 03:11:39,222 WARNING [train.py:1204] (2/4) Exclude cut with ID _48836_210_12210_1_1534075848293_3904150_429-35475-0 from training. Duration: 0.89 2023-10-02 03:11:39,233 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_50154_210_13731_1_1534750862_1080961_119-187163-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 03:11:40,815 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.bypass.skip_rate, batch_count=730213.3333333334, ans=0.07 2023-10-02 03:11:41,943 WARNING [train.py:1204] (2/4) Exclude cut with ID _8793_210_6310_1_1528542985139_2310009_121-179782-0_sp0.9 from training. Duration: 0.888875 2023-10-02 03:11:41,946 WARNING [train.py:1204] (2/4) Exclude cut with ID _14884_210_2252_1_1530707459689_5561250_591-173406-0 from training. Duration: 0.96 2023-10-02 03:11:44,839 WARNING [train.py:1204] (2/4) Exclude cut with ID _15133_210_6228_1_1530794512074_3080379_54-198193-0_sp1.1 from training. Duration: 0.8545625 2023-10-02 03:11:44,855 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6545_210_2040_1_1527774670_4245258_407-48070-0 from training. Duration: 0.76 2023-10-02 03:11:46,132 INFO [train.py:1046] (2/4) Epoch 21, batch 3300, loss[loss=0.2211, simple_loss=0.2785, pruned_loss=0.08179, over 19182.00 frames. ], tot_loss[loss=0.1741, simple_loss=0.2497, pruned_loss=0.04927, over 4700960.75 frames. ], batch size: 388, lr: 4.88e-03, grad_scale: 16.0 2023-10-02 03:11:46,290 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1032-236863-0 from training. Duration: 0.84 2023-10-02 03:11:47,646 WARNING [train.py:1204] (2/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_507-303370-0 from training. Duration: 0.96 2023-10-02 03:11:48,324 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.2.encoder.layers.2.feed_forward1.out_whiten, num_groups=1, num_channels=384, metric=9.36 vs. limit=15.0 2023-10-02 03:11:49,028 WARNING [train.py:1204] (2/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_164-282366-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 03:11:49,268 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.feed_forward1.hidden_balancer.prob, batch_count=730280.0, ans=0.125 2023-10-02 03:11:54,256 WARNING [train.py:1204] (2/4) Exclude cut with ID _39867_210_9826_1_1533553222030_9344259_312-158727-0_sp1.1 from training. Duration: 0.963625 2023-10-02 03:11:55,791 WARNING [train.py:1204] (2/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_174-205058-0_sp1.1 from training. Duration: 0.863625 2023-10-02 03:11:55,812 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_11305_210_1794_1_1529750428_7178148_166-237358-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 03:11:58,542 WARNING [train.py:1204] (2/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_26-175553-0_sp1.1 from training. Duration: 0.5545625 2023-10-02 03:11:58,577 WARNING [train.py:1204] (2/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_96-290039-0_sp1.1 from training. Duration: 0.5090625 2023-10-02 03:12:01,477 WARNING [train.py:1204] (2/4) Exclude cut with ID _53219_210_4525_1_1534834836008_4064825_261-101691-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 03:12:01,743 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.feed_forward2.hidden_balancer.prob, batch_count=730346.6666666666, ans=0.125 2023-10-02 03:12:02,997 WARNING [train.py:1204] (2/4) Exclude cut with ID _24271_210_12658_1_1531987270022_7190519_96-293390-0_sp1.1 from training. Duration: 0.936375 2023-10-02 03:12:08,283 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_271-234962-0 from training. Duration: 0.79 2023-10-02 03:12:08,344 WARNING [train.py:1204] (2/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_136-30660-0_sp1.1 from training. Duration: 0.97275 2023-10-02 03:12:08,358 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_20120_210_6753_1_1531643035_5482859_625-281046-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 03:12:08,572 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.feed_forward1.out_proj.dropout_p, batch_count=730346.6666666666, ans=0.1 2023-10-02 03:12:08,804 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.5.encoder.layers.1.feed_forward3.out_whiten, num_groups=1, num_channels=256, metric=10.70 vs. limit=15.0 2023-10-02 03:12:09,738 WARNING [train.py:1204] (2/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_333-168193-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 03:12:09,790 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9029W0269-509664 from training. Duration: 0.896 2023-10-02 03:12:11,136 WARNING [train.py:1204] (2/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_450-147393-0_sp1.1 from training. Duration: 0.92725 2023-10-02 03:12:12,490 WARNING [train.py:1204] (2/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_643-52007-0_sp1.1 from training. Duration: 0.5181875 2023-10-02 03:12:13,850 WARNING [train.py:1204] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_330-327861-0_sp0.9 from training. Duration: 0.7444375 2023-10-02 03:12:13,868 WARNING [train.py:1204] (2/4) Exclude cut with ID _47790_210_16187_1_1533862814066_4207519_260-317651-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 03:12:15,101 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9044W0010-516654 from training. Duration: 0.896 2023-10-02 03:12:17,909 WARNING [train.py:1204] (2/4) Exclude cut with ID _14087_210_2253_1_1530529224512_3014029_265-118378-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 03:12:17,929 WARNING [train.py:1204] (2/4) Exclude cut with ID _6194_210_1534_1_1527511826160_3941194_321-148641-0_sp0.9 from training. Duration: 0.711125 2023-10-02 03:12:19,241 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.513e+02 1.870e+02 2.047e+02 2.284e+02 2.829e+02, threshold=4.094e+02, percent-clipped=0.0 2023-10-02 03:12:20,673 WARNING [train.py:1204] (2/4) Exclude cut with ID _54871_210_22356_1_1534665798253_3856359_101-169176-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 03:12:20,675 WARNING [train.py:1204] (2/4) Exclude cut with ID 497-129325-0061-62254-0_sp1.1 from training. Duration: 0.97725 2023-10-02 03:12:21,992 WARNING [train.py:1204] (2/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_795-33252-0_sp1.1 from training. Duration: 0.336375 2023-10-02 03:12:22,013 WARNING [train.py:1204] (2/4) Exclude cut with ID _28317_210_12742_1_1532480302381_8510325_168-246176-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 03:12:23,960 WARNING [train.py:1204] (2/4) Exclude cut with ID _7160_210_1876_1_1527940923231_3703799_120-150943-0_sp1.1 from training. Duration: 0.736375 2023-10-02 03:12:26,599 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9084W0005-536844 from training. Duration: 0.768 2023-10-02 03:12:26,794 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.conv_module1.balancer2.min_abs, batch_count=730413.3333333334, ans=0.5 2023-10-02 03:12:28,002 WARNING [train.py:1204] (2/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_234-260682-0 from training. Duration: 0.84 2023-10-02 03:12:30,038 WARNING [train.py:1204] (2/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_188-114464-0_sp0.9 from training. Duration: 0.911125 2023-10-02 03:12:31,525 WARNING [train.py:1204] (2/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_81-75849-0 from training. Duration: 0.39 2023-10-02 03:12:32,938 WARNING [train.py:1204] (2/4) Exclude cut with ID _14224_210_4381_1_1530529062037_4264010_379-184223-0_sp1.1 from training. Duration: 0.77275 2023-10-02 03:12:35,724 WARNING [train.py:1204] (2/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_454-123002-0_sp1.1 from training. Duration: 0.62725 2023-10-02 03:12:37,624 WARNING [train.py:1204] (2/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_887-249555-0_sp1.1 from training. Duration: 0.7 2023-10-02 03:12:39,123 WARNING [train.py:1204] (2/4) Exclude cut with ID _31088_210_16446_1_1532520012284_3440919_20-260902-0_sp1.1 from training. Duration: 0.97275 2023-10-02 03:12:40,983 WARNING [train.py:1204] (2/4) Exclude cut with ID _45329_210_7005_1_1534157951211_6774339_856-344574-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 03:12:40,985 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_36756_210_7234_1_1533026991_966276_4-164118-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 03:12:41,009 WARNING [train.py:1204] (2/4) Exclude cut with ID _8365_210_2064_1_1528592311070_4677690_371-122295-0_sp1.1 from training. Duration: 0.5 2023-10-02 03:12:43,682 WARNING [train.py:1204] (2/4) Exclude cut with ID _5910_210_1389_1_1527337972454_5050100_555-159815-0_sp1.1 from training. Duration: 0.8909375 2023-10-02 03:12:43,691 WARNING [train.py:1204] (2/4) Exclude cut with ID _37086_210_9038_1_1532999540891_7021250_469-147941-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 03:12:45,006 WARNING [train.py:1204] (2/4) Exclude cut with ID _3067_210_2667_1_1526212974640_4503168_459-173867-0_sp0.9 from training. Duration: 0.97775 2023-10-02 03:12:46,425 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9156W0006-572499 from training. Duration: 0.896 2023-10-02 03:12:47,806 WARNING [train.py:1204] (2/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_277-210798-0 from training. Duration: 0.55 2023-10-02 03:12:49,257 WARNING [train.py:1204] (2/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_103-336064-0_sp1.1 from training. Duration: 0.463625 2023-10-02 03:12:49,313 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_3414_210_1988_1_1526212718_4813195_331-290945-0_sp1.1 from training. Duration: 0.936375 2023-10-02 03:12:49,314 WARNING [train.py:1204] (2/4) Exclude cut with ID _67499_210_17282_1_1535509805220_3692430_365-29276-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 03:12:50,698 WARNING [train.py:1204] (2/4) Exclude cut with ID _14821_210_8750_1_1530680503991_6829359_69-250847-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 03:12:50,699 WARNING [train.py:1204] (2/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_1102-61301-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 03:12:52,066 WARNING [train.py:1204] (2/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_166-246557-0_sp1.1 from training. Duration: 0.5818125 2023-10-02 03:12:52,120 WARNING [train.py:1204] (2/4) Exclude cut with ID _15351_210_10626_1_1530856635653_3576210_214-129918-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 03:12:53,393 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_120-41668-0_sp0.9 from training. Duration: 0.77775 2023-10-02 03:12:54,773 WARNING [train.py:1204] (2/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_27-72278-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 03:12:54,883 WARNING [train.py:1204] (2/4) Exclude cut with ID _24314_210_4915_1_1532135078116_7170179_521-258868-0_sp1.1 from training. Duration: 0.7181875 2023-10-02 03:12:58,181 WARNING [train.py:1204] (2/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_143-86344-0 from training. Duration: 0.77 2023-10-02 03:12:58,218 WARNING [train.py:1204] (2/4) Exclude cut with ID _53489_210_6228_1_1534298357169_3835970_465-157433-0_sp1.1 from training. Duration: 0.92725 2023-10-02 03:13:00,134 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_248-319353-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 03:13:01,441 INFO [train.py:1046] (2/4) Epoch 21, batch 3350, loss[loss=0.1776, simple_loss=0.2483, pruned_loss=0.05342, over 23220.00 frames. ], tot_loss[loss=0.1744, simple_loss=0.2501, pruned_loss=0.04937, over 4703924.36 frames. ], batch size: 119, lr: 4.88e-03, grad_scale: 16.0 2023-10-02 03:13:02,803 WARNING [train.py:1204] (2/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_102-77064-0_sp1.1 from training. Duration: 0.8454375 2023-10-02 03:13:02,827 WARNING [train.py:1204] (2/4) Exclude cut with ID _9389_210_1664_1_1529217337301_5289269_379-128794-0_sp1.1 from training. Duration: 0.77275 2023-10-02 03:13:05,592 WARNING [train.py:1204] (2/4) Exclude cut with ID _3231_210_2106_1_1526205864934_6784220_236-260835-0_sp1.1 from training. Duration: 0.97275 2023-10-02 03:13:05,714 WARNING [train.py:1204] (2/4) Exclude cut with ID _24271_210_12658_1_1531987270022_7190519_184-342363-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 03:13:05,715 WARNING [train.py:1204] (2/4) Exclude cut with ID _27894_210_12392_1_1532415496078_6902439_67-221663-0_sp1.1 from training. Duration: 0.92725 2023-10-02 03:13:10,523 WARNING [train.py:1204] (2/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_304-59227-0_sp1.1 from training. Duration: 0.7 2023-10-02 03:13:11,987 WARNING [train.py:1204] (2/4) Exclude cut with ID _58605_210_12210_1_1534723195939_3904259_458-42232-0_sp1.1 from training. Duration: 0.92725 2023-10-02 03:13:12,073 WARNING [train.py:1204] (2/4) Exclude cut with ID _14922_210_6228_1_1530707435273_3693310_13-165512-0_sp0.9 from training. Duration: 0.911125 2023-10-02 03:13:13,645 WARNING [train.py:1204] (2/4) Exclude cut with ID _58602_210_2346_1_1534903210085_6908830_792-102241-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 03:13:16,312 WARNING [train.py:1204] (2/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_588-125165-0_sp0.9 from training. Duration: 0.92225 2023-10-02 03:13:17,655 WARNING [train.py:1204] (2/4) Exclude cut with ID _54218_210_12210_1_1534413646388_4409259_239-98429-0_sp1.1 from training. Duration: 0.97275 2023-10-02 03:13:18,948 WARNING [train.py:1204] (2/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_538-32291-0_sp1.1 from training. Duration: 0.7545625 2023-10-02 03:13:20,280 WARNING [train.py:1204] (2/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_419-317062-0 from training. Duration: 0.91 2023-10-02 03:13:21,699 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9309W0202-640329 from training. Duration: 0.896 2023-10-02 03:13:21,733 WARNING [train.py:1204] (2/4) Exclude cut with ID _3231_210_2106_1_1526205864934_6784220_419-313291-0_sp1.1 from training. Duration: 0.97275 2023-10-02 03:13:24,658 WARNING [train.py:1204] (2/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_619-344499-0 from training. Duration: 0.63 2023-10-02 03:13:24,677 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_278-272612-0 from training. Duration: 0.73 2023-10-02 03:13:26,119 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_103-84010-0_sp0.9 from training. Duration: 0.7666875 2023-10-02 03:13:26,141 WARNING [train.py:1204] (2/4) Exclude cut with ID _26528_210_9070_1_1532570410842_3851310_497-97218-0_sp1.1 from training. Duration: 0.7818125 2023-10-02 03:13:28,898 WARNING [train.py:1204] (2/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_262-84613-0_sp1.1 from training. Duration: 0.936375 2023-10-02 03:13:28,931 WARNING [train.py:1204] (2/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_387-131538-0 from training. Duration: 0.81 2023-10-02 03:13:28,967 WARNING [train.py:1204] (2/4) Exclude cut with ID _8030_210_7497_1_1528444886781_3531089_224-300489-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 03:13:30,659 WARNING [train.py:1204] (2/4) Exclude cut with ID _14349_210_8341_1_1530615710818_3442470_170-336541-0_sp0.9 from training. Duration: 0.9555625 2023-10-02 03:13:32,590 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_47069_210_15368_1_1533781267_4431320_180-31746-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 03:13:33,953 WARNING [train.py:1204] (2/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_447-7041-0_sp1.1 from training. Duration: 0.92725 2023-10-02 03:13:35,301 WARNING [train.py:1204] (2/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_2-55283-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 03:13:35,363 WARNING [train.py:1204] (2/4) Exclude cut with ID _12555_210_8750_1_1530075594402_3780239_70-181911-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 03:13:38,227 WARNING [train.py:1204] (2/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_311-328022-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 03:13:39,643 WARNING [train.py:1204] (2/4) Exclude cut with ID _14528_210_8254_1_1530669700514_4306939_330-2987-0_sp1.1 from training. Duration: 0.92725 2023-10-02 03:13:39,692 WARNING [train.py:1204] (2/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_385-184858-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 03:13:43,474 WARNING [train.py:1204] (2/4) Exclude cut with ID _64414_210_17301_1_1535243252048_3757759_348-333360-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 03:13:44,829 WARNING [train.py:1204] (2/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_753-214751-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 03:13:47,518 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_17613_210_2151_1_1531137569_2366160_202-208013-0_sp1.1 from training. Duration: 0.92725 2023-10-02 03:13:47,527 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_43831_210_2158_1_1533725502_5872791_610-129300-0_sp1.1 from training. Duration: 0.936375 2023-10-02 03:13:49,100 WARNING [train.py:1204] (2/4) Exclude cut with ID _54218_210_12210_1_1534413646388_4409259_321-23852-0_sp1.1 from training. Duration: 0.936375 2023-10-02 03:13:51,804 WARNING [train.py:1204] (2/4) Exclude cut with ID _39411_210_5986_1_1533538825030_3343649_173-238248-0 from training. Duration: 0.83 2023-10-02 03:13:51,811 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_405-339986-0_sp0.9 from training. Duration: 0.5666875 2023-10-02 03:13:51,841 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_2009_210_2071_1_1525481134_4588937_385-302100-0 from training. Duration: 0.87 2023-10-02 03:13:51,879 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_161-276996-0_sp1.1 from training. Duration: 0.863625 2023-10-02 03:13:53,363 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_177-58005-0 from training. Duration: 0.62 2023-10-02 03:13:53,427 WARNING [train.py:1204] (2/4) Exclude cut with ID _64639_210_5816_1_1535367619437_3528000_146-343734-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 03:13:54,813 WARNING [train.py:1204] (2/4) Exclude cut with ID _3845_210_1390_1_1526731326141_3523610_72-61441-0_sp1.1 from training. Duration: 0.92725 2023-10-02 03:14:02,762 WARNING [train.py:1204] (2/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_991-125028-0_sp1.1 from training. Duration: 0.936375 2023-10-02 03:14:04,161 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7544_210_6753_1_1528591327_1471426_172-348777-0 from training. Duration: 0.66 2023-10-02 03:14:05,459 WARNING [train.py:1204] (2/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_416-257730-0_sp1.1 from training. Duration: 0.5909375 2023-10-02 03:14:06,789 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1113-7369-0_sp1.1 from training. Duration: 0.77275 2023-10-02 03:14:06,891 WARNING [train.py:1204] (2/4) Exclude cut with ID _40402_210_18219_1_1533522324115_2953150_242-76943-0_sp1.1 from training. Duration: 0.8545625 2023-10-02 03:14:11,086 WARNING [train.py:1204] (2/4) Exclude cut with ID _44718_210_5816_1_1534050051869_6469030_190-49648-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 03:14:14,395 WARNING [train.py:1204] (2/4) Exclude cut with ID _47251_210_18628_1_1533814063128_4137250_214-88076-0 from training. Duration: 0.88 2023-10-02 03:14:14,440 WARNING [train.py:1204] (2/4) Exclude cut with ID _5585_210_2667_1_1527418813324_8007340_89-332072-0_sp1.1 from training. Duration: 0.5818125 2023-10-02 03:14:14,472 WARNING [train.py:1204] (2/4) Exclude cut with ID _41817_210_18835_1_1533434477160_7423049_555-293448-0_sp1.1 from training. Duration: 0.72725 2023-10-02 03:14:15,754 INFO [train.py:1046] (2/4) Epoch 21, batch 3400, loss[loss=0.1841, simple_loss=0.2702, pruned_loss=0.04901, over 24347.00 frames. ], tot_loss[loss=0.1753, simple_loss=0.2512, pruned_loss=0.04971, over 4707082.54 frames. ], batch size: 77, lr: 4.88e-03, grad_scale: 8.0 2023-10-02 03:14:17,177 WARNING [train.py:1204] (2/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_286-331314-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 03:14:18,386 WARNING [train.py:1204] (2/4) Exclude cut with ID _13921_210_9994_1_1530838702800_3003429_46-149444-0 from training. Duration: 0.94 2023-10-02 03:14:19,729 WARNING [train.py:1204] (2/4) Exclude cut with ID _44778_210_2054_1_1533798127203_7443090_768-43595-0_sp1.1 from training. Duration: 0.936375 2023-10-02 03:14:19,737 WARNING [train.py:1204] (2/4) Exclude cut with ID _35355_210_16040_1_1533212988977_4016229_74-190076-0 from training. Duration: 0.8 2023-10-02 03:14:21,098 WARNING [train.py:1204] (2/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_1007-316380-0_sp1.1 from training. Duration: 0.963625 2023-10-02 03:14:21,152 WARNING [train.py:1204] (2/4) Exclude cut with ID _23557_210_8750_1_1531890106370_6846255_150-283437-0_sp1.1 from training. Duration: 0.963625 2023-10-02 03:14:21,196 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_3474_210_2028_1_1526188513_4825246_145-309131-0_sp1.1 from training. Duration: 0.67275 2023-10-02 03:14:22,537 WARNING [train.py:1204] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_391-22148-0_sp1.1 from training. Duration: 0.7 2023-10-02 03:14:22,552 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_238-256668-0 from training. Duration: 0.69 2023-10-02 03:14:26,798 WARNING [train.py:1204] (2/4) Exclude cut with ID _8394_210_6702_1_1528590504316_3540189_372-198878-0 from training. Duration: 0.6 2023-10-02 03:14:26,814 WARNING [train.py:1204] (2/4) Exclude cut with ID ID0174W0006-766659 from training. Duration: 0.953 2023-10-02 03:14:26,834 WARNING [train.py:1204] (2/4) Exclude cut with ID _6274_210_2084_1_1527937583837_7494020_668-23019-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 03:14:32,216 WARNING [train.py:1204] (2/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_322-74328-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 03:14:32,222 WARNING [train.py:1204] (2/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_872-264525-0_sp1.1 from training. Duration: 0.5909375 2023-10-02 03:14:33,519 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_53378_210_17282_1_1534221071_5769509_641-286201-0_sp1.1 from training. Duration: 0.97275 2023-10-02 03:14:34,809 WARNING [train.py:1204] (2/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_199-295611-0_sp1.1 from training. Duration: 0.5 2023-10-02 03:14:36,494 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.1.balancer2.prob, batch_count=731013.3333333334, ans=0.125 2023-10-02 03:14:38,247 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.5.encoder.layers.1.nonlin_attention.whiten1, num_groups=1, num_channels=192, metric=3.50 vs. limit=10.0 2023-10-02 03:14:39,119 WARNING [train.py:1204] (2/4) Exclude cut with ID _5250_210_5219_1_1527249561806_8692110_1270-317971-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 03:14:39,821 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.3.encoder.layers.2.feed_forward1.out_whiten, num_groups=1, num_channels=512, metric=8.46 vs. limit=15.0 2023-10-02 03:14:40,515 WARNING [train.py:1204] (2/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_784-213099-0 from training. Duration: 0.93 2023-10-02 03:14:44,741 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.1.encoder.layers.0.whiten, num_groups=1, num_channels=256, metric=6.02 vs. limit=12.0 2023-10-02 03:14:48,119 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_115-233929-0_sp0.9 from training. Duration: 0.8 2023-10-02 03:14:48,258 WARNING [train.py:1204] (2/4) Exclude cut with ID _54871_210_22356_1_1534665798253_3856359_256-223809-0_sp1.1 from training. Duration: 0.97275 2023-10-02 03:14:49,938 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.541e+02 1.878e+02 1.982e+02 2.218e+02 2.946e+02, threshold=3.964e+02, percent-clipped=0.0 2023-10-02 03:14:50,028 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_20120_210_6753_1_1531643035_5482859_285-87781-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 03:14:50,114 WARNING [train.py:1204] (2/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_619-344499-0_sp1.1 from training. Duration: 0.57275 2023-10-02 03:14:56,346 WARNING [train.py:1204] (2/4) Exclude cut with ID _60229_210_27767_1_1534838406158_3655350_264-279573-0_sp0.9 from training. Duration: 0.92225 2023-10-02 03:15:01,040 WARNING [train.py:1204] (2/4) Exclude cut with ID _7719_210_3525_1_1528456153163_4296960_333-110399-0 from training. Duration: 0.96 2023-10-02 03:15:05,920 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_50423_210_13270_1_1534128598_5021734_759-90833-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 03:15:05,984 WARNING [train.py:1204] (2/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_151-254798-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 03:15:06,017 WARNING [train.py:1204] (2/4) Exclude cut with ID _3465_210_3525_1_1526175667060_3574049_450-203637-0 from training. Duration: 0.92 2023-10-02 03:15:06,048 WARNING [train.py:1204] (2/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_673-36456-0_sp1.1 from training. Duration: 0.936375 2023-10-02 03:15:07,369 WARNING [train.py:1204] (2/4) Exclude cut with ID _36538_210_9994_1_1533204020363_3795620_131-8685-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 03:15:08,728 WARNING [train.py:1204] (2/4) Exclude cut with ID _12556_210_8341_1_1530081024100_7327690_713-55626-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 03:15:08,771 WARNING [train.py:1204] (2/4) Exclude cut with ID _2322_210_2667_1_1525604548971_3656299_440-80494-0_sp0.9 from training. Duration: 0.97775 2023-10-02 03:15:11,600 WARNING [train.py:1204] (2/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_210-325081-0_sp1.1 from training. Duration: 0.97275 2023-10-02 03:15:15,021 WARNING [train.py:1204] (2/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_375-244976-0_sp0.9 from training. Duration: 0.8333125 2023-10-02 03:15:15,027 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_15517_210_6753_1_1530952293_1664795_119-31001-0_sp1.1 from training. Duration: 0.8545625 2023-10-02 03:15:18,004 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_41452_210_7291_1_1533437687_3941281_319-42326-0_sp1.1 from training. Duration: 0.92725 2023-10-02 03:15:21,291 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_372-173012-0 from training. Duration: 0.95 2023-10-02 03:15:26,714 WARNING [train.py:1204] (2/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_41-31157-0_sp1.1 from training. Duration: 0.5181875 2023-10-02 03:15:30,850 INFO [train.py:1046] (2/4) Epoch 21, batch 3450, loss[loss=0.1603, simple_loss=0.2346, pruned_loss=0.04304, over 20688.00 frames. ], tot_loss[loss=0.1759, simple_loss=0.2517, pruned_loss=0.05007, over 4707266.48 frames. ], batch size: 45, lr: 4.88e-03, grad_scale: 8.0 2023-10-02 03:15:30,954 WARNING [train.py:1204] (2/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_91-212486-0 from training. Duration: 0.68 2023-10-02 03:15:36,434 WARNING [train.py:1204] (2/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_1267-272693-0 from training. Duration: 0.73 2023-10-02 03:15:36,461 WARNING [train.py:1204] (2/4) Exclude cut with ID _13938_210_9988_1_1530518718978_3693960_152-67046-0_sp1.1 from training. Duration: 0.936375 2023-10-02 03:15:37,922 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_4528_210_3273_1_1527076654_3276777_342-168068-0_sp1.1 from training. Duration: 0.7909375 2023-10-02 03:15:37,941 WARNING [train.py:1204] (2/4) Exclude cut with ID _7606_210_4915_1_1528894253277_5080690_127-177761-0 from training. Duration: 0.64 2023-10-02 03:15:39,286 WARNING [train.py:1204] (2/4) Exclude cut with ID _45329_210_7005_1_1534157951211_6774339_335-137972-0_sp1.1 from training. Duration: 0.92725 2023-10-02 03:15:40,836 WARNING [train.py:1204] (2/4) Exclude cut with ID _5166_210_2084_1_1527073197160_4087993_331-117829-0_sp1.1 from training. Duration: 0.563625 2023-10-02 03:15:46,694 WARNING [train.py:1204] (2/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_125-125818-0_sp1.1 from training. Duration: 0.87275 2023-10-02 03:15:46,745 WARNING [train.py:1204] (2/4) Exclude cut with ID _6789_210_1534_1_1527854471671_2870519_48-294897-0_sp1.1 from training. Duration: 0.963625 2023-10-02 03:15:47,575 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.2.encoder.layers.0.nonlin_attention.whiten1, num_groups=1, num_channels=288, metric=8.11 vs. limit=10.0 2023-10-02 03:15:48,557 WARNING [train.py:1204] (2/4) Exclude cut with ID _6694_210_4599_1_1527852637490_3965529_116-109398-0_sp1.1 from training. Duration: 0.663625 2023-10-02 03:15:48,576 WARNING [train.py:1204] (2/4) Exclude cut with ID _52892_210_6721_1_1534378941259_4124129_222-172685-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 03:15:49,993 WARNING [train.py:1204] (2/4) Exclude cut with ID _50655_210_18628_1_1534229942193_3772154_320-132275-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 03:15:51,670 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.conv_module2.balancer2.prob, batch_count=731346.6666666666, ans=0.125 2023-10-02 03:15:55,430 WARNING [train.py:1204] (2/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_76-311963-0 from training. Duration: 0.7 2023-10-02 03:16:01,386 WARNING [train.py:1204] (2/4) Exclude cut with ID _6274_210_2084_1_1527937583837_7494020_32-205288-0 from training. Duration: 0.78 2023-10-02 03:16:02,703 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_86-16881-0_sp0.9 from training. Duration: 0.6444375 2023-10-02 03:16:02,749 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_271-297505-0_sp1.1 from training. Duration: 0.9 2023-10-02 03:16:04,170 WARNING [train.py:1204] (2/4) Exclude cut with ID _57572_210_5517_1_1534921219624_3809484_187-295293-0_sp1.1 from training. Duration: 0.963625 2023-10-02 03:16:05,255 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.0.layers.1.nonlin_attention.whiten1, num_groups=1, num_channels=144, metric=6.06 vs. limit=10.0 2023-10-02 03:16:09,157 WARNING [train.py:1204] (2/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_563-329943-0 from training. Duration: 0.99 2023-10-02 03:16:09,244 WARNING [train.py:1204] (2/4) Exclude cut with ID _7160_210_1876_1_1527940923231_3703799_302-150728-0_sp0.9 from training. Duration: 0.8666875 2023-10-02 03:16:14,990 WARNING [train.py:1204] (2/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_926-7526-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 03:16:15,018 WARNING [train.py:1204] (2/4) Exclude cut with ID _62574_210_2655_1_1535180407058_3933249_467-22348-0_sp1.1 from training. Duration: 0.936375 2023-10-02 03:16:15,111 WARNING [train.py:1204] (2/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_280-161633-0_sp0.9 from training. Duration: 0.811125 2023-10-02 03:16:17,863 WARNING [train.py:1204] (2/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_385-193052-0_sp1.1 from training. Duration: 0.7818125 2023-10-02 03:16:20,634 WARNING [train.py:1204] (2/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_444-28041-0 from training. Duration: 0.85 2023-10-02 03:16:20,639 WARNING [train.py:1204] (2/4) Exclude cut with ID _66070_210_7640_1_1535441393096_7805310_341-249237-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 03:16:22,388 WARNING [train.py:1204] (2/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_425-269826-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 03:16:25,089 WARNING [train.py:1204] (2/4) Exclude cut with ID _53950_210_5816_1_1534417264919_3862951_124-185597-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 03:16:26,674 WARNING [train.py:1204] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_380-204756-0 from training. Duration: 0.86 2023-10-02 03:16:28,391 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.ff3_skip_rate, batch_count=731480.0, ans=0.0 2023-10-02 03:16:29,530 WARNING [train.py:1204] (2/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_282-257791-0_sp0.9 from training. Duration: 0.9555625 2023-10-02 03:16:35,311 WARNING [train.py:1204] (2/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_372-131163-0_sp1.1 from training. Duration: 0.8545625 2023-10-02 03:16:36,634 WARNING [train.py:1204] (2/4) Exclude cut with ID _28317_210_12742_1_1532480302381_8510325_447-233513-0_sp1.1 from training. Duration: 0.963625 2023-10-02 03:16:38,640 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_54-118333-0_sp1.1 from training. Duration: 0.97275 2023-10-02 03:16:42,881 WARNING [train.py:1204] (2/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_388-230239-0_sp1.1 from training. Duration: 0.963625 2023-10-02 03:16:42,898 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_40047_210_2290_1_1533290519_2374311_141-54639-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 03:16:44,708 WARNING [train.py:1204] (2/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_91-267696-0_sp0.9 from training. Duration: 0.9666875 2023-10-02 03:16:44,732 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_14056_210_4163_1_1530604483_7572827_532-137847-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 03:16:45,040 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.conv_module1.balancer2.prob, batch_count=731613.3333333334, ans=0.125 2023-10-02 03:16:46,170 INFO [train.py:1046] (2/4) Epoch 21, batch 3500, loss[loss=0.1772, simple_loss=0.241, pruned_loss=0.05674, over 23785.00 frames. ], tot_loss[loss=0.1749, simple_loss=0.25, pruned_loss=0.04986, over 4704372.90 frames. ], batch size: 164, lr: 4.88e-03, grad_scale: 8.0 2023-10-02 03:16:48,909 WARNING [train.py:1204] (2/4) Exclude cut with ID _8064_210_5399_1_1528372299291_4282973_155-283231-0_sp1.1 from training. Duration: 0.97275 2023-10-02 03:16:52,150 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_2245_210_2715_1_1525511548_3540254_310-52483-0_sp1.1 from training. Duration: 0.8 2023-10-02 03:16:52,205 WARNING [train.py:1204] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_185-174805-0 from training. Duration: 0.68 2023-10-02 03:16:53,732 WARNING [train.py:1204] (2/4) Exclude cut with ID _15012_210_1968_1_1530856820429_5095350_662-210469-0_sp1.1 from training. Duration: 0.5181875 2023-10-02 03:16:56,522 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_426-313557-0_sp0.9 from training. Duration: 0.588875 2023-10-02 03:16:59,664 WARNING [train.py:1204] (2/4) Exclude cut with ID _42326_210_7770_1_1533814282531_3707269_407-204343-0_sp1.1 from training. Duration: 0.97275 2023-10-02 03:16:59,678 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_258-310570-0 from training. Duration: 0.82 2023-10-02 03:17:02,589 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.self_attn_weights.pos_emb_skip_rate, batch_count=731680.0, ans=0.0 2023-10-02 03:17:05,178 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_4737_210_2040_1_1526998689_3293008_378-178650-0_sp1.1 from training. Duration: 0.863625 2023-10-02 03:17:05,480 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.conv_skip_rate, batch_count=731680.0, ans=0.0 2023-10-02 03:17:06,602 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_47896_210_20508_1_1534492518_5958868_432-327451-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 03:17:06,698 WARNING [train.py:1204] (2/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_188-114464-0_sp1.1 from training. Duration: 0.7454375 2023-10-02 03:17:06,704 WARNING [train.py:1204] (2/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_798-106179-0_sp1.1 from training. Duration: 0.7545625 2023-10-02 03:17:06,730 WARNING [train.py:1204] (2/4) Exclude cut with ID _14368_210_4220_1_1530532714870_3595529_143-117421-0_sp1.1 from training. Duration: 0.52725 2023-10-02 03:17:07,997 WARNING [train.py:1204] (2/4) Exclude cut with ID _67499_210_17282_1_1535509805220_3692430_361-78786-0_sp1.1 from training. Duration: 0.963625 2023-10-02 03:17:08,032 WARNING [train.py:1204] (2/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_712-342563-0_sp1.1 from training. Duration: 0.8818125 2023-10-02 03:17:08,080 WARNING [train.py:1204] (2/4) Exclude cut with ID _9235_210_5219_1_1528891154253_8046120_387-159008-0 from training. Duration: 0.86 2023-10-02 03:17:09,173 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.0.layers.0.feed_forward1.out_whiten, num_groups=1, num_channels=192, metric=8.67 vs. limit=15.0 2023-10-02 03:17:11,627 WARNING [train.py:1204] (2/4) Exclude cut with ID _33640_210_2655_1_1533117605479_4855469_959-18820-0_sp1.1 from training. Duration: 0.963625 2023-10-02 03:17:12,931 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_12868_210_6753_1_1530702479_3610564_133-564-0_sp0.9 from training. Duration: 0.87775 2023-10-02 03:17:14,283 WARNING [train.py:1204] (2/4) Exclude cut with ID _23514_210_1389_1_1531832139785_3031439_12-29287-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 03:17:17,537 WARNING [train.py:1204] (2/4) Exclude cut with ID _23514_210_1389_1_1531832139785_3031439_489-185052-0_sp1.1 from training. Duration: 0.963625 2023-10-02 03:17:18,926 WARNING [train.py:1204] (2/4) Exclude cut with ID _5444_210_2151_1_1527418808464_5312210_634-194174-0 from training. Duration: 0.97 2023-10-02 03:17:18,943 WARNING [train.py:1204] (2/4) Exclude cut with ID _6275_210_2084_1_1528110062213_7669049_91-26919-0_sp1.1 from training. Duration: 0.8818125 2023-10-02 03:17:19,224 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.balancer1.prob, batch_count=731746.6666666666, ans=0.125 2023-10-02 03:17:20,267 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.581e+02 1.924e+02 2.111e+02 2.557e+02 4.190e+02, threshold=4.222e+02, percent-clipped=1.0 2023-10-02 03:17:20,934 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.4.encoder.layers.0.nonlin_attention.whiten1, num_groups=1, num_channels=288, metric=7.59 vs. limit=10.0 2023-10-02 03:17:21,629 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_37351_210_7925_1_1533012931_3812113_516-82303-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 03:17:23,616 WARNING [train.py:1204] (2/4) Exclude cut with ID _41314_210_12651_1_1533459476543_3766460_80-74184-0_sp0.9 from training. Duration: 0.92225 2023-10-02 03:17:23,697 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_9460_210_4211_1_1528954587_10049667_495-202963-0_sp1.1 from training. Duration: 0.963625 2023-10-02 03:17:25,118 WARNING [train.py:1204] (2/4) Exclude cut with ID _14632_210_9988_1_1530691279392_3923200_332-105904-0_sp1.1 from training. Duration: 0.8181875 2023-10-02 03:17:26,437 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_40764_210_1925_1_1533882359_3752345_493-167886-0_sp1.1 from training. Duration: 0.92725 2023-10-02 03:17:26,573 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_196-289637-0 from training. Duration: 0.75 2023-10-02 03:17:27,993 WARNING [train.py:1204] (2/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_26-175553-0 from training. Duration: 0.61 2023-10-02 03:17:29,954 WARNING [train.py:1204] (2/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_890-5117-0 from training. Duration: 0.78 2023-10-02 03:17:30,018 WARNING [train.py:1204] (2/4) Exclude cut with ID _41314_210_12651_1_1533459476543_3766460_206-306860-0_sp1.1 from training. Duration: 0.92725 2023-10-02 03:17:31,487 WARNING [train.py:1204] (2/4) Exclude cut with ID _25882_210_3036_1_1532775665815_6479510_570-79824-0_sp1.1 from training. Duration: 0.963625 2023-10-02 03:17:32,836 WARNING [train.py:1204] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_731-2428-0_sp1.1 from training. Duration: 0.7545625 2023-10-02 03:17:32,865 WARNING [train.py:1204] (2/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_138-154812-0_sp0.9 from training. Duration: 0.8666875 2023-10-02 03:17:33,733 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.1.encoder.layers.0.nonlin_attention.whiten1, num_groups=1, num_channels=192, metric=6.11 vs. limit=10.0 2023-10-02 03:17:34,305 WARNING [train.py:1204] (2/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_178-39722-0_sp0.9 from training. Duration: 0.5444375 2023-10-02 03:17:35,633 WARNING [train.py:1204] (2/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_1285-256622-0_sp0.9 from training. Duration: 0.9333125 2023-10-02 03:17:41,547 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_41428_210_2161_1_1533774184_3992532_65-271323-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 03:17:41,640 WARNING [train.py:1204] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_308-161067-0 from training. Duration: 0.66 2023-10-02 03:17:41,644 WARNING [train.py:1204] (2/4) Exclude cut with ID _3067_210_2667_1_1526212974640_4503168_459-173867-0 from training. Duration: 0.88 2023-10-02 03:17:41,646 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_2221_210_1988_1_1525597051_4935541_415-296882-0_sp1.1 from training. Duration: 0.7 2023-10-02 03:17:46,152 WARNING [train.py:1204] (2/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_354-192266-0_sp1.1 from training. Duration: 0.936375 2023-10-02 03:17:46,239 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_258-310570-0_sp0.9 from training. Duration: 0.911125 2023-10-02 03:17:47,603 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_548-141039-0_sp1.1 from training. Duration: 0.963625 2023-10-02 03:17:50,372 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_15101_210_7927_1_1530786415_5033274_320-327527-0 from training. Duration: 0.96 2023-10-02 03:17:51,857 WARNING [train.py:1204] (2/4) Exclude cut with ID _2895_210_2477_1_1525956785794_4063750_289-11475-0_sp0.9 from training. Duration: 0.911125 2023-10-02 03:17:53,785 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7543_210_6753_1_1528543318_4552_1-85072-0_sp1.1 from training. Duration: 0.7545625 2023-10-02 03:17:53,864 WARNING [train.py:1204] (2/4) Exclude cut with ID _64639_210_5816_1_1535367619437_3528000_461-329156-0 from training. Duration: 0.96 2023-10-02 03:17:56,616 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_211-339622-0 from training. Duration: 0.45 2023-10-02 03:17:58,770 WARNING [train.py:1204] (2/4) Exclude cut with ID _20455_210_5219_1_1531565996775_8098100_402-112739-0_sp1.1 from training. Duration: 0.963625 2023-10-02 03:17:58,889 WARNING [train.py:1204] (2/4) Exclude cut with ID _53489_210_6228_1_1534298357169_3835970_256-115565-0_sp1.1 from training. Duration: 0.936375 2023-10-02 03:18:00,130 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_26326_210_7925_1_1533002493_3592406_360-305260-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 03:18:00,160 WARNING [train.py:1204] (2/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_18-204383-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 03:18:01,516 INFO [train.py:1046] (2/4) Epoch 21, batch 3550, loss[loss=0.1498, simple_loss=0.198, pruned_loss=0.0508, over 19190.00 frames. ], tot_loss[loss=0.1728, simple_loss=0.248, pruned_loss=0.04874, over 4696179.99 frames. ], batch size: 389, lr: 4.88e-03, grad_scale: 8.0 2023-10-02 03:18:03,461 WARNING [train.py:1204] (2/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_306-232480-0_sp1.1 from training. Duration: 0.87275 2023-10-02 03:18:11,329 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.5.encoder.layers.0.self_attn1.whiten, num_groups=1, num_channels=256, metric=13.27 vs. limit=22.5 2023-10-02 03:18:13,751 WARNING [train.py:1204] (2/4) Exclude cut with ID _41161_210_20034_1_1533431249657_3555314_116-299186-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 03:18:15,417 WARNING [train.py:1204] (2/4) Exclude cut with ID _13913_210_9994_1_1530579615187_3383897_473-277682-0_sp1.1 from training. Duration: 0.3545625 2023-10-02 03:18:17,543 WARNING [train.py:1204] (2/4) Exclude cut with ID _58895_210_8750_1_1535184081320_3649089_49-225702-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 03:18:20,065 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_3080_210_2071_1_1526086102_4365166_312-8726-0_sp1.1 from training. Duration: 0.82725 2023-10-02 03:18:21,526 WARNING [train.py:1204] (2/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_489-190175-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 03:18:21,598 WARNING [train.py:1204] (2/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_345-95717-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 03:18:21,609 WARNING [train.py:1204] (2/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_85-138257-0_sp1.1 from training. Duration: 0.6454375 2023-10-02 03:18:23,861 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.0.layers.0.conv_module1.whiten, num_groups=1, num_channels=192, metric=11.81 vs. limit=15.0 2023-10-02 03:18:24,509 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7046_210_6753_1_1527895337_6920917_783-278109-0_sp1.1 from training. Duration: 0.7 2023-10-02 03:18:24,636 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.self_attn_weights.pos_emb_skip_rate, batch_count=732013.3333333334, ans=0.0 2023-10-02 03:18:26,422 WARNING [train.py:1204] (2/4) Exclude cut with ID _3465_210_3525_1_1526175667060_3574049_450-203637-0_sp1.1 from training. Duration: 0.836375 2023-10-02 03:18:27,693 WARNING [train.py:1204] (2/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_416-132893-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 03:18:27,715 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_2655_210_2087_1_1525688959_3897400_437-337690-0_sp0.9 from training. Duration: 0.7 2023-10-02 03:18:29,094 WARNING [train.py:1204] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_700-183549-0_sp1.1 from training. Duration: 0.6181875 2023-10-02 03:18:32,228 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.conv_skip_rate, batch_count=732080.0, ans=0.0 2023-10-02 03:18:33,869 WARNING [train.py:1204] (2/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_191-59784-0_sp1.1 from training. Duration: 0.8 2023-10-02 03:18:33,879 WARNING [train.py:1204] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_218-295097-0_sp1.1 from training. Duration: 0.7 2023-10-02 03:18:36,556 WARNING [train.py:1204] (2/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_310-109461-0_sp1.1 from training. Duration: 0.72725 2023-10-02 03:18:36,569 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_33348_210_5632_1_1533086912_3324030_131-321149-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 03:18:36,619 WARNING [train.py:1204] (2/4) Exclude cut with ID _64135_210_2253_1_1535677267876_7608972_133-188266-0_sp1.1 from training. Duration: 0.736375 2023-10-02 03:18:37,899 WARNING [train.py:1204] (2/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_505-205270-0 from training. Duration: 0.38 2023-10-02 03:18:37,911 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_67223_210_4238_1_1535612120_2537431_102-88248-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 03:18:39,460 WARNING [train.py:1204] (2/4) Exclude cut with ID _54871_210_22356_1_1534665798253_3856359_60-235433-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 03:18:40,069 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.3.encoder.layers.3.whiten, num_groups=1, num_channels=512, metric=3.87 vs. limit=12.0 2023-10-02 03:18:40,736 WARNING [train.py:1204] (2/4) Exclude cut with ID _5189_210_4557_1_1527248319428_3769229_199-53526-0_sp1.1 from training. Duration: 0.3818125 2023-10-02 03:18:45,581 WARNING [train.py:1204] (2/4) Exclude cut with ID _45329_210_7005_1_1534157951211_6774339_439-90979-0_sp1.1 from training. Duration: 0.963625 2023-10-02 03:18:45,839 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.nonlin_attention.balancer.prob, batch_count=732146.6666666666, ans=0.125 2023-10-02 03:18:46,957 WARNING [train.py:1204] (2/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_1162-132880-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 03:18:47,031 WARNING [train.py:1204] (2/4) Exclude cut with ID _56423_210_5854_1_1534896061049_3165969_220-22317-0_sp1.1 from training. Duration: 0.963625 2023-10-02 03:18:47,282 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.conv_module1.balancer2.min_abs, batch_count=732146.6666666666, ans=0.5 2023-10-02 03:18:49,832 WARNING [train.py:1204] (2/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_909-64151-0 from training. Duration: 0.94 2023-10-02 03:18:49,894 WARNING [train.py:1204] (2/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_465-156500-0_sp0.9 from training. Duration: 0.8 2023-10-02 03:18:51,759 WARNING [train.py:1204] (2/4) Exclude cut with ID _44304_210_15374_1_1533639352765_4613129_302-22084-0 from training. Duration: 0.86 2023-10-02 03:18:53,076 WARNING [train.py:1204] (2/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_663-166973-0_sp1.1 from training. Duration: 0.72725 2023-10-02 03:18:54,455 WARNING [train.py:1204] (2/4) Exclude cut with ID _45329_210_7005_1_1534157951211_6774339_503-287883-0_sp0.9 from training. Duration: 0.97775 2023-10-02 03:18:54,470 WARNING [train.py:1204] (2/4) Exclude cut with ID _60057_210_10640_1_1534903065793_3333150_362-230377-0_sp0.9 from training. Duration: 0.9666875 2023-10-02 03:18:57,361 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_4183_210_2087_1_1526729100_1869838_143-280962-0 from training. Duration: 0.56 2023-10-02 03:18:58,726 WARNING [train.py:1204] (2/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_281-1313-0_sp1.1 from training. Duration: 0.936375 2023-10-02 03:19:04,866 WARNING [train.py:1204] (2/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_64-215118-0_sp1.1 from training. Duration: 0.936375 2023-10-02 03:19:06,703 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_2822_210_2715_1_1526112586_1999587_165-36656-0 from training. Duration: 0.89 2023-10-02 03:19:06,754 WARNING [train.py:1204] (2/4) Exclude cut with ID _22961_210_8254_1_1531976366672_4278180_402-208830-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 03:19:10,863 WARNING [train.py:1204] (2/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_98-193494-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 03:19:12,715 WARNING [train.py:1204] (2/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_383-285196-0 from training. Duration: 0.97 2023-10-02 03:19:17,019 INFO [train.py:1046] (2/4) Epoch 21, batch 3600, loss[loss=0.1631, simple_loss=0.2338, pruned_loss=0.0462, over 21107.00 frames. ], tot_loss[loss=0.1725, simple_loss=0.2484, pruned_loss=0.04834, over 4703006.23 frames. ], batch size: 46, lr: 4.88e-03, grad_scale: 16.0 2023-10-02 03:19:17,187 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6767_210_3273_1_1527855783_1718932_142-127132-0 from training. Duration: 0.95 2023-10-02 03:19:18,405 WARNING [train.py:1204] (2/4) Exclude cut with ID _39867_210_9826_1_1533553222030_9344259_1018-339635-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 03:19:19,753 WARNING [train.py:1204] (2/4) Exclude cut with ID _14603_210_4220_1_1530615688441_3097319_274-274798-0_sp1.1 from training. Duration: 0.8909375 2023-10-02 03:19:21,206 WARNING [train.py:1204] (2/4) Exclude cut with ID _2334_210_2667_1_1525608238880_4705129_376-263393-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 03:19:22,491 WARNING [train.py:1204] (2/4) Exclude cut with ID _29303_210_2575_1_1532392237303_7410649_363-344675-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 03:19:23,236 WARNING [train.py:1204] (2/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_58-68956-0_sp1.1 from training. Duration: 0.7545625 2023-10-02 03:19:26,117 WARNING [train.py:1204] (2/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_709-311571-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 03:19:27,487 WARNING [train.py:1204] (2/4) Exclude cut with ID _46067_210_4220_1_1534125571066_3307450_136-257407-0_sp1.1 from training. Duration: 0.963625 2023-10-02 03:19:27,568 WARNING [train.py:1204] (2/4) Exclude cut with ID _6493_210_1747_1_1528459210424_4012470_324-323854-0_sp1.1 from training. Duration: 0.836375 2023-10-02 03:19:28,942 WARNING [train.py:1204] (2/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_234-260682-0_sp1.1 from training. Duration: 0.763625 2023-10-02 03:19:30,710 WARNING [train.py:1204] (2/4) Exclude cut with ID _36634_210_1794_1_1533296352450_1399884_55-119451-0_sp1.1 from training. Duration: 0.963625 2023-10-02 03:19:30,715 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_40047_210_2290_1_1533290519_2374311_195-165693-0 from training. Duration: 0.89 2023-10-02 03:19:33,592 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_5522_210_3273_1_1527249606_3909096_366-172508-0_sp0.9 from training. Duration: 0.7333125 2023-10-02 03:19:34,983 WARNING [train.py:1204] (2/4) Exclude cut with ID _21878_210_3525_1_1531738772002_7264330_187-10504-0_sp1.1 from training. Duration: 0.963625 2023-10-02 03:19:39,751 WARNING [train.py:1204] (2/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_282-257791-0_sp1.1 from training. Duration: 0.7818125 2023-10-02 03:19:43,067 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_43831_210_2158_1_1533725502_5872791_467-288776-0_sp1.1 from training. Duration: 0.92725 2023-10-02 03:19:44,448 WARNING [train.py:1204] (2/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_138-154812-0_sp1.1 from training. Duration: 0.7090625 2023-10-02 03:19:44,490 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_1154-185930-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 03:19:44,507 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_2655_210_2087_1_1525688959_3897400_52-279899-0 from training. Duration: 0.69 2023-10-02 03:19:45,923 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_141-89113-0_sp1.1 from training. Duration: 0.7818125 2023-10-02 03:19:47,387 WARNING [train.py:1204] (2/4) Exclude cut with ID _55134_210_21982_1_1534590028377_3882170_297-47310-0_sp1.1 from training. Duration: 0.963625 2023-10-02 03:19:48,593 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_486-151131-0_sp1.1 from training. Duration: 0.77275 2023-10-02 03:19:50,033 WARNING [train.py:1204] (2/4) Exclude cut with ID _44778_210_2054_1_1533798127203_7443090_21-232980-0_sp1.1 from training. Duration: 0.936375 2023-10-02 03:19:51,256 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.584e+02 1.800e+02 2.043e+02 2.517e+02 4.317e+02, threshold=4.086e+02, percent-clipped=1.0 2023-10-02 03:19:52,770 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6165_210_3528_1_1527590982_4379841_338-106380-0_sp1.1 from training. Duration: 0.92725 2023-10-02 03:19:54,147 WARNING [train.py:1204] (2/4) Exclude cut with ID _55162_210_17071_1_1534408248583_3753480_355-257218-0_sp1.1 from training. Duration: 0.97275 2023-10-02 03:19:54,224 WARNING [train.py:1204] (2/4) Exclude cut with ID _2895_210_2477_1_1525956785794_4063750_289-11475-0 from training. Duration: 0.82 2023-10-02 03:20:00,247 WARNING [train.py:1204] (2/4) Exclude cut with ID _16756_210_4915_1_1531047803763_7104259_839-183431-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 03:20:01,610 WARNING [train.py:1204] (2/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_427-208901-0_sp0.9 from training. Duration: 0.7555625 2023-10-02 03:20:03,610 WARNING [train.py:1204] (2/4) Exclude cut with ID _36512_210_6987_1_1533033143998_3126069_181-306500-0 from training. Duration: 0.95 2023-10-02 03:20:08,351 WARNING [train.py:1204] (2/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_28-70987-0_sp1.1 from training. Duration: 0.7909375 2023-10-02 03:20:14,036 WARNING [train.py:1204] (2/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_590-58996-0_sp1.1 from training. Duration: 0.936375 2023-10-02 03:20:18,773 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_48730_210_5399_1_1533885886_2212769_108-47561-0_sp1.1 from training. Duration: 0.936375 2023-10-02 03:20:20,905 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.3.encoder.layers.0.self_attn1.whiten, num_groups=1, num_channels=512, metric=20.95 vs. limit=22.5 2023-10-02 03:20:24,299 WARNING [train.py:1204] (2/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_401-158239-0_sp0.9 from training. Duration: 0.788875 2023-10-02 03:20:25,604 WARNING [train.py:1204] (2/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_278-154492-0_sp1.1 from training. Duration: 0.6909375 2023-10-02 03:20:25,610 WARNING [train.py:1204] (2/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_137-116442-0 from training. Duration: 0.78 2023-10-02 03:20:25,731 WARNING [train.py:1204] (2/4) Exclude cut with ID _3541_210_2477_1_1526475408709_3263973_189-317158-0 from training. Duration: 0.9 2023-10-02 03:20:27,069 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_47896_210_20508_1_1534492518_5958868_558-314572-0 from training. Duration: 0.97 2023-10-02 03:20:29,162 WARNING [train.py:1204] (2/4) Exclude cut with ID _66070_210_7640_1_1535441393096_7805310_290-40284-0_sp1.1 from training. Duration: 0.97275 2023-10-02 03:20:29,193 WARNING [train.py:1204] (2/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_110-224688-0_sp1.1 from training. Duration: 0.8818125 2023-10-02 03:20:30,408 WARNING [train.py:1204] (2/4) Exclude cut with ID _7719_210_3525_1_1528456153163_4296960_88-250259-0 from training. Duration: 0.84 2023-10-02 03:20:31,673 INFO [train.py:1046] (2/4) Epoch 21, batch 3650, loss[loss=0.1536, simple_loss=0.2315, pruned_loss=0.03785, over 24317.00 frames. ], tot_loss[loss=0.173, simple_loss=0.2493, pruned_loss=0.04839, over 4713043.30 frames. ], batch size: 61, lr: 4.88e-03, grad_scale: 16.0 2023-10-02 03:20:31,743 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8453_210_4211_1_1528502940_8562730_1157-114102-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 03:20:31,775 WARNING [train.py:1204] (2/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_317-175062-0_sp1.1 from training. Duration: 0.7454375 2023-10-02 03:20:31,781 WARNING [train.py:1204] (2/4) Exclude cut with ID _29857_210_20034_1_1532395886753_3798160_254-347254-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 03:20:31,865 WARNING [train.py:1204] (2/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_28-70987-0 from training. Duration: 0.87 2023-10-02 03:20:33,218 WARNING [train.py:1204] (2/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_321-335475-0 from training. Duration: 0.45 2023-10-02 03:20:36,536 WARNING [train.py:1204] (2/4) Exclude cut with ID _40901_210_5665_1_1533363789958_3856130_356-107330-0_sp1.1 from training. Duration: 0.936375 2023-10-02 03:20:37,906 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_3780_210_2087_1_1526467389_2145572_176-107140-0 from training. Duration: 0.69 2023-10-02 03:20:42,793 WARNING [train.py:1204] (2/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_80-135200-0 from training. Duration: 0.79 2023-10-02 03:20:44,322 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_33348_210_5632_1_1533086912_3324030_85-28583-0_sp0.9 from training. Duration: 0.911125 2023-10-02 03:20:47,148 WARNING [train.py:1204] (2/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_29-293896-0 from training. Duration: 0.61 2023-10-02 03:20:49,143 WARNING [train.py:1204] (2/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_194-252800-0 from training. Duration: 0.97 2023-10-02 03:20:50,767 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.conv_skip_rate, batch_count=732680.0, ans=0.0 2023-10-02 03:20:53,375 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_360-52446-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 03:20:53,376 WARNING [train.py:1204] (2/4) Exclude cut with ID _9389_210_1664_1_1529217337301_5289269_289-213417-0_sp0.9 from training. Duration: 0.988875 2023-10-02 03:20:53,423 WARNING [train.py:1204] (2/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_143-86344-0_sp0.9 from training. Duration: 0.8555625 2023-10-02 03:20:53,678 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.bypass_mid.scale_min, batch_count=732680.0, ans=0.2 2023-10-02 03:20:54,848 WARNING [train.py:1204] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_520-126729-0_sp0.9 from training. Duration: 0.77775 2023-10-02 03:20:54,866 WARNING [train.py:1204] (2/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_76-308663-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 03:20:56,206 WARNING [train.py:1204] (2/4) Exclude cut with ID _8021_210_7994_1_1528376451358_3653069_100-140231-0 from training. Duration: 0.67 2023-10-02 03:20:57,469 WARNING [train.py:1204] (2/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_387-131538-0_sp0.9 from training. Duration: 0.9 2023-10-02 03:20:57,504 WARNING [train.py:1204] (2/4) Exclude cut with ID _6275_210_2084_1_1528110062213_7669049_365-182054-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 03:20:59,293 WARNING [train.py:1204] (2/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_345-340690-0 from training. Duration: 0.93 2023-10-02 03:21:00,664 WARNING [train.py:1204] (2/4) Exclude cut with ID _5409_210_1388_1_1527292850189_5865191_178-127970-0_sp0.9 from training. Duration: 0.6333125 2023-10-02 03:21:00,716 WARNING [train.py:1204] (2/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_352-12734-0_sp1.1 from training. Duration: 0.92725 2023-10-02 03:21:00,719 WARNING [train.py:1204] (2/4) Exclude cut with ID _65134_210_14763_1_1535335393985_3505368_121-129373-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 03:21:03,386 WARNING [train.py:1204] (2/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_557-324258-0_sp1.1 from training. Duration: 0.736375 2023-10-02 03:21:04,838 WARNING [train.py:1204] (2/4) Exclude cut with ID _48678_210_5663_1_1533988830947_3769199_306-255886-0 from training. Duration: 0.77 2023-10-02 03:21:06,252 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7544_210_6753_1_1528587000_691468_87-178599-0 from training. Duration: 0.87 2023-10-02 03:21:06,321 WARNING [train.py:1204] (2/4) Exclude cut with ID _2941_210_2577_1_1526199673150_2489759_208-236402-0_sp1.1 from training. Duration: 0.963625 2023-10-02 03:21:08,354 WARNING [train.py:1204] (2/4) Exclude cut with ID _38670_210_15379_1_1533448810659_4391059_65-169397-0 from training. Duration: 0.97 2023-10-02 03:21:10,267 WARNING [train.py:1204] (2/4) Exclude cut with ID _30304_210_6319_1_1532476827261_7582060_306-328175-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 03:21:10,276 WARNING [train.py:1204] (2/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_212-149947-0_sp1.1 from training. Duration: 0.663625 2023-10-02 03:21:10,448 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.conv_module2.balancer2.min_positive, batch_count=732746.6666666666, ans=0.05 2023-10-02 03:21:15,838 WARNING [train.py:1204] (2/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_321-22821-0_sp1.1 from training. Duration: 0.8090625 2023-10-02 03:21:17,869 WARNING [train.py:1204] (2/4) Exclude cut with ID _44010_210_4525_1_1533884262964_4013620_483-69110-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 03:21:17,886 WARNING [train.py:1204] (2/4) Exclude cut with ID _14922_210_6228_1_1530707435273_3693310_38-187851-0_sp1.1 from training. Duration: 0.563625 2023-10-02 03:21:20,609 WARNING [train.py:1204] (2/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_129-337802-0_sp0.9 from training. Duration: 0.8 2023-10-02 03:21:20,679 WARNING [train.py:1204] (2/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_268-87909-0_sp1.1 from training. Duration: 0.9 2023-10-02 03:21:23,407 WARNING [train.py:1204] (2/4) Exclude cut with ID _21878_210_3525_1_1531738772002_7264330_559-184862-0_sp1.1 from training. Duration: 0.8545625 2023-10-02 03:21:26,210 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_46032_210_14656_1_1533781412_4507969_237-120766-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 03:21:26,291 WARNING [train.py:1204] (2/4) Exclude cut with ID _28427_210_4220_1_1532581240967_3061069_330-170906-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 03:21:26,296 WARNING [train.py:1204] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_401-51578-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 03:21:26,442 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.bypass.skip_rate, batch_count=732813.3333333334, ans=0.07 2023-10-02 03:21:28,911 WARNING [train.py:1204] (2/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_209-205365-0_sp1.1 from training. Duration: 0.7181875 2023-10-02 03:21:28,987 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_23801_210_16223_1_1532066392_3294657_57-242170-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 03:21:29,038 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_127-37371-0_sp1.1 from training. Duration: 0.936375 2023-10-02 03:21:35,176 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9171W0019-578905 from training. Duration: 0.896 2023-10-02 03:21:37,203 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.3.encoder.layers.3.feed_forward3.out_whiten, num_groups=1, num_channels=512, metric=15.02 vs. limit=15.0 2023-10-02 03:21:38,520 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_24605_210_1794_1_1532047863_4149755_349-49056-0_sp1.1 from training. Duration: 0.92725 2023-10-02 03:21:38,531 WARNING [train.py:1204] (2/4) Exclude cut with ID _12302_210_5219_1_1529924446466_8107280_897-21237-0_sp1.1 from training. Duration: 0.936375 2023-10-02 03:21:38,608 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_9460_210_4211_1_1528954587_10049667_480-260269-0_sp0.9 from training. Duration: 0.62225 2023-10-02 03:21:40,471 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_348-152548-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 03:21:41,790 WARNING [train.py:1204] (2/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_301-330455-0_sp0.9 from training. Duration: 0.888875 2023-10-02 03:21:43,080 WARNING [train.py:1204] (2/4) Exclude cut with ID _61008_210_10633_1_1534852769198_6974370_345-91705-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 03:21:45,785 WARNING [train.py:1204] (2/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_304-273433-0 from training. Duration: 0.99 2023-10-02 03:21:45,789 WARNING [train.py:1204] (2/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_359-345972-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 03:21:45,991 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.conv_skip_rate, batch_count=732946.6666666666, ans=0.0 2023-10-02 03:21:47,721 INFO [train.py:1046] (2/4) Epoch 21, batch 3700, loss[loss=0.1882, simple_loss=0.2751, pruned_loss=0.05065, over 24550.00 frames. ], tot_loss[loss=0.1744, simple_loss=0.2504, pruned_loss=0.04918, over 4720655.00 frames. ], batch size: 71, lr: 4.87e-03, grad_scale: 16.0 2023-10-02 03:21:49,359 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_2296_210_2087_1_1525503448_3397335_57-185430-0_sp1.1 from training. Duration: 0.5454375 2023-10-02 03:21:52,117 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_1256-120974-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 03:21:52,185 WARNING [train.py:1204] (2/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_372-30302-0_sp0.9 from training. Duration: 0.9555625 2023-10-02 03:21:55,035 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_42012_210_7927_1_1533439936_3871556_451-344262-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 03:21:55,036 WARNING [train.py:1204] (2/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_28-324729-0 from training. Duration: 0.94 2023-10-02 03:21:55,044 WARNING [train.py:1204] (2/4) Exclude cut with ID _64135_210_2253_1_1535677267876_7608972_313-18394-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 03:21:56,460 WARNING [train.py:1204] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_620-276793-0_sp0.9 from training. Duration: 0.6555625 2023-10-02 03:21:56,494 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7544_210_6753_1_1528591327_1471426_172-348777-0_sp0.9 from training. Duration: 0.7333125 2023-10-02 03:21:59,263 WARNING [train.py:1204] (2/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_332-221057-0_sp0.9 from training. Duration: 0.7555625 2023-10-02 03:22:03,843 WARNING [train.py:1204] (2/4) Exclude cut with ID _19023_210_4353_1_1531378892552_5820780_5-300035-0_sp1.1 from training. Duration: 0.92725 2023-10-02 03:22:05,107 WARNING [train.py:1204] (2/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_343-309023-0_sp1.1 from training. Duration: 0.963625 2023-10-02 03:22:05,181 WARNING [train.py:1204] (2/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_37-41919-0_sp1.1 from training. Duration: 0.7909375 2023-10-02 03:22:05,205 WARNING [train.py:1204] (2/4) Exclude cut with ID _3541_210_2477_1_1526475408709_3263973_41-182910-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 03:22:06,539 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_229-45300-0_sp0.9 from training. Duration: 0.6333125 2023-10-02 03:22:09,261 WARNING [train.py:1204] (2/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_40-214716-0_sp1.1 from training. Duration: 0.963625 2023-10-02 03:22:09,359 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9309W0203-640330 from training. Duration: 0.896 2023-10-02 03:22:11,668 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.2.encoder.layers.1.whiten, num_groups=1, num_channels=384, metric=4.36 vs. limit=12.0 2023-10-02 03:22:16,424 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.bypass.scale_min, batch_count=733080.0, ans=0.2 2023-10-02 03:22:17,628 WARNING [train.py:1204] (2/4) Exclude cut with ID _60229_210_27767_1_1534838406158_3655350_264-279573-0_sp1.1 from training. Duration: 0.7545625 2023-10-02 03:22:17,665 WARNING [train.py:1204] (2/4) Exclude cut with ID _8394_210_6702_1_1528590504316_3540189_372-198878-0_sp0.9 from training. Duration: 0.6666875 2023-10-02 03:22:19,705 WARNING [train.py:1204] (2/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_359-118926-0_sp1.1 from training. Duration: 0.6545625 2023-10-02 03:22:19,724 WARNING [train.py:1204] (2/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_85-65056-0 from training. Duration: 0.96 2023-10-02 03:22:19,757 WARNING [train.py:1204] (2/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_89-242335-0_sp1.1 from training. Duration: 0.863625 2023-10-02 03:22:21,998 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.1.encoder.layers.0.feed_forward1.out_whiten, num_groups=1, num_channels=256, metric=5.71 vs. limit=15.0 2023-10-02 03:22:22,357 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.473e+02 1.917e+02 2.171e+02 2.489e+02 3.807e+02, threshold=4.342e+02, percent-clipped=0.0 2023-10-02 03:22:23,856 WARNING [train.py:1204] (2/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_128-49444-0_sp1.1 from training. Duration: 0.936375 2023-10-02 03:22:25,180 WARNING [train.py:1204] (2/4) Exclude cut with ID _30327_210_16049_1_1532484161295_3924169_401-70819-0 from training. Duration: 0.86 2023-10-02 03:22:26,567 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_41822_210_13270_1_1533955976_4379284_260-256391-0_sp1.1 from training. Duration: 0.936375 2023-10-02 03:22:26,680 WARNING [train.py:1204] (2/4) Exclude cut with ID _67335_210_5219_1_1535525975652_4275229_599-247317-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 03:22:29,401 WARNING [train.py:1204] (2/4) Exclude cut with ID _29857_210_20034_1_1532395886753_3798160_209-239267-0_sp1.1 from training. Duration: 0.936375 2023-10-02 03:22:29,426 WARNING [train.py:1204] (2/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_46-207599-0_sp1.1 from training. Duration: 0.5818125 2023-10-02 03:22:31,787 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.0.layers.0.feed_forward1.out_whiten, num_groups=1, num_channels=192, metric=10.28 vs. limit=15.0 2023-10-02 03:22:32,125 WARNING [train.py:1204] (2/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_912-282412-0_sp1.1 from training. Duration: 0.4454375 2023-10-02 03:22:35,309 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_4183_210_2087_1_1526729100_1869838_141-166600-0_sp1.1 from training. Duration: 0.863625 2023-10-02 03:22:35,321 WARNING [train.py:1204] (2/4) Exclude cut with ID _15037_210_2253_1_1530752294104_3267650_296-192597-0 from training. Duration: 0.81 2023-10-02 03:22:35,362 WARNING [train.py:1204] (2/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_571-220562-0_sp1.1 from training. Duration: 0.963625 2023-10-02 03:22:35,379 WARNING [train.py:1204] (2/4) Exclude cut with ID _5287_210_2084_1_1527418883040_3878670_12-221186-0 from training. Duration: 0.98 2023-10-02 03:22:40,991 WARNING [train.py:1204] (2/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_149-85602-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 03:22:42,274 WARNING [train.py:1204] (2/4) Exclude cut with ID _9235_210_5219_1_1528891154253_8046120_1003-101638-0_sp1.1 from training. Duration: 0.663625 2023-10-02 03:22:44,433 WARNING [train.py:1204] (2/4) Exclude cut with ID _55844_210_2346_1_1534471212569_7625707_1099-33689-0_sp1.1 from training. Duration: 0.92725 2023-10-02 03:22:44,486 WARNING [train.py:1204] (2/4) Exclude cut with ID _7606_210_4915_1_1528894253277_5080690_301-10732-0 from training. Duration: 0.81 2023-10-02 03:22:47,605 WARNING [train.py:1204] (2/4) Exclude cut with ID _22826_210_9994_1_1531908201522_3279169_403-34698-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 03:22:47,611 WARNING [train.py:1204] (2/4) Exclude cut with ID _8480_210_1874_1_1528589391205_6728330_755-112625-0_sp0.9 from training. Duration: 0.82225 2023-10-02 03:22:48,933 WARNING [train.py:1204] (2/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_527-238990-0_sp1.1 from training. Duration: 0.8181875 2023-10-02 03:22:48,955 WARNING [train.py:1204] (2/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_426-7328-0_sp1.1 from training. Duration: 0.92725 2023-10-02 03:22:51,792 WARNING [train.py:1204] (2/4) Exclude cut with ID _3541_210_2477_1_1526475408709_3263973_189-317158-0_sp1.1 from training. Duration: 0.8181875 2023-10-02 03:22:51,863 WARNING [train.py:1204] (2/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_162-103620-0 from training. Duration: 0.93 2023-10-02 03:22:53,307 WARNING [train.py:1204] (2/4) Exclude cut with ID _22083_210_8254_1_1531702862439_3678970_49-234546-0 from training. Duration: 0.83 2023-10-02 03:22:54,600 WARNING [train.py:1204] (2/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_370-331600-0_sp1.1 from training. Duration: 0.8545625 2023-10-02 03:22:54,626 WARNING [train.py:1204] (2/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_166-59042-0_sp1.1 from training. Duration: 0.97275 2023-10-02 03:22:56,002 WARNING [train.py:1204] (2/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_138-332773-0_sp1.1 from training. Duration: 0.736375 2023-10-02 03:22:56,077 WARNING [train.py:1204] (2/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_90-30673-0_sp0.9 from training. Duration: 0.9333125 2023-10-02 03:23:00,112 WARNING [train.py:1204] (2/4) Exclude cut with ID _27967_210_12140_1_1532588391859_8419130_737-41898-0_sp1.1 from training. Duration: 0.936375 2023-10-02 03:23:01,483 INFO [train.py:1046] (2/4) Epoch 21, batch 3750, loss[loss=0.2404, simple_loss=0.2995, pruned_loss=0.09066, over 19592.00 frames. ], tot_loss[loss=0.1754, simple_loss=0.2521, pruned_loss=0.0494, over 4723066.79 frames. ], batch size: 388, lr: 4.87e-03, grad_scale: 16.0 2023-10-02 03:23:01,555 WARNING [train.py:1204] (2/4) Exclude cut with ID _8833_210_5219_1_1528592665210_7553059_329-125156-0_sp1.1 from training. Duration: 0.7454375 2023-10-02 03:23:01,656 WARNING [train.py:1204] (2/4) Exclude cut with ID _41817_210_18835_1_1533434477160_7423049_352-42537-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 03:23:04,929 WARNING [train.py:1204] (2/4) Exclude cut with ID _8365_210_2064_1_1528592311070_4677690_431-294102-0 from training. Duration: 0.89 2023-10-02 03:23:06,319 WARNING [train.py:1204] (2/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_454-314901-0_sp1.1 from training. Duration: 0.4181875 2023-10-02 03:23:09,106 WARNING [train.py:1204] (2/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_80-135200-0_sp0.9 from training. Duration: 0.87775 2023-10-02 03:23:09,153 WARNING [train.py:1204] (2/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_181-78632-0 from training. Duration: 0.78 2023-10-02 03:23:10,497 WARNING [train.py:1204] (2/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_300-27235-0_sp0.9 from training. Duration: 0.9666875 2023-10-02 03:23:10,570 WARNING [train.py:1204] (2/4) Exclude cut with ID _47790_210_16187_1_1533862814066_4207519_96-177176-0_sp1.1 from training. Duration: 0.97275 2023-10-02 03:23:12,589 WARNING [train.py:1204] (2/4) Exclude cut with ID _35941_210_15379_1_1532995294515_4023089_246-348742-0_sp1.1 from training. Duration: 0.97275 2023-10-02 03:23:12,721 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.1.conv_module1.balancer1.prob, batch_count=733280.0, ans=0.125 2023-10-02 03:23:14,085 WARNING [train.py:1204] (2/4) Exclude cut with ID _38670_210_15379_1_1533448810659_4391059_65-169397-0_sp1.1 from training. Duration: 0.8818125 2023-10-02 03:23:17,348 WARNING [train.py:1204] (2/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_1003-99642-0_sp1.1 from training. Duration: 0.963625 2023-10-02 03:23:20,751 WARNING [train.py:1204] (2/4) Exclude cut with ID _40402_210_18219_1_1533522324115_2953150_320-14078-0_sp1.1 from training. Duration: 0.863625 2023-10-02 03:23:20,841 WARNING [train.py:1204] (2/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_367-61330-0_sp0.9 from training. Duration: 0.8333125 2023-10-02 03:23:23,621 WARNING [train.py:1204] (2/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_515-298514-0_sp1.1 from training. Duration: 0.92725 2023-10-02 03:23:26,455 WARNING [train.py:1204] (2/4) Exclude cut with ID _53731_210_7994_1_1534553979244_3757214_471-320815-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 03:23:26,503 WARNING [train.py:1204] (2/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_306-232480-0 from training. Duration: 0.96 2023-10-02 03:23:26,565 WARNING [train.py:1204] (2/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_478-162247-0_sp0.9 from training. Duration: 0.911125 2023-10-02 03:23:26,825 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=733346.6666666666, ans=0.1 2023-10-02 03:23:27,934 WARNING [train.py:1204] (2/4) Exclude cut with ID _56423_210_5854_1_1534896061049_3165969_345-284696-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 03:23:27,993 WARNING [train.py:1204] (2/4) Exclude cut with ID _44778_210_2054_1_1533798127203_7443090_600-122253-0_sp1.1 from training. Duration: 0.963625 2023-10-02 03:23:30,910 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.feed_forward2.hidden_balancer.prob, batch_count=733413.3333333334, ans=0.125 2023-10-02 03:23:31,975 WARNING [train.py:1204] (2/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_199-295611-0 from training. Duration: 0.55 2023-10-02 03:23:35,311 WARNING [train.py:1204] (2/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_734-129761-0 from training. Duration: 0.82 2023-10-02 03:23:36,650 WARNING [train.py:1204] (2/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_349-169257-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 03:23:37,984 WARNING [train.py:1204] (2/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_202-67017-0_sp0.9 from training. Duration: 0.911125 2023-10-02 03:23:39,377 WARNING [train.py:1204] (2/4) Exclude cut with ID _16908_210_5724_1_1531119157248_3772700_61-258978-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 03:23:42,997 WARNING [train.py:1204] (2/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_172-156931-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 03:23:46,218 WARNING [train.py:1204] (2/4) Exclude cut with ID _4092_210_2151_1_1526643044449_3135759_356-278694-0_sp1.1 from training. Duration: 0.42725 2023-10-02 03:23:47,018 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.1.encoder.layers.0.self_attn1.whiten, num_groups=1, num_channels=256, metric=13.56 vs. limit=22.5 2023-10-02 03:23:51,615 WARNING [train.py:1204] (2/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_116-332008-0 from training. Duration: 0.98 2023-10-02 03:23:54,471 WARNING [train.py:1204] (2/4) Exclude cut with ID _29003_210_19596_1_1532507391720_3894098_358-114789-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 03:23:54,779 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.conv_module1.balancer1.prob, batch_count=733480.0, ans=0.125 2023-10-02 03:23:57,311 WARNING [train.py:1204] (2/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_152-212533-0_sp1.1 from training. Duration: 0.8818125 2023-10-02 03:23:57,387 WARNING [train.py:1204] (2/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_267-214116-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 03:24:01,479 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7543_210_6753_1_1528541907_1399322_165-323619-0_sp0.9 from training. Duration: 0.7666875 2023-10-02 03:24:04,212 WARNING [train.py:1204] (2/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_577-332600-0_sp0.9 from training. Duration: 0.6333125 2023-10-02 03:24:06,363 WARNING [train.py:1204] (2/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_342-100157-0_sp0.9 from training. Duration: 0.6 2023-10-02 03:24:07,786 WARNING [train.py:1204] (2/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_141-89727-0_sp1.1 from training. Duration: 0.6454375 2023-10-02 03:24:08,061 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.conv_skip_rate, batch_count=733546.6666666666, ans=0.0 2023-10-02 03:24:09,291 WARNING [train.py:1204] (2/4) Exclude cut with ID _3992_210_1747_1_1526644816031_3731060_107-251434-0_sp1.1 from training. Duration: 0.9 2023-10-02 03:24:13,242 WARNING [train.py:1204] (2/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_401-53423-0_sp1.1 from training. Duration: 0.5 2023-10-02 03:24:16,559 INFO [train.py:1046] (2/4) Epoch 21, batch 3800, loss[loss=0.1869, simple_loss=0.2719, pruned_loss=0.05089, over 24694.00 frames. ], tot_loss[loss=0.1755, simple_loss=0.2523, pruned_loss=0.04941, over 4736194.61 frames. ], batch size: 73, lr: 4.87e-03, grad_scale: 16.0 2023-10-02 03:24:16,924 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.nonlin_attention.balancer.prob, batch_count=733613.3333333334, ans=0.125 2023-10-02 03:24:21,831 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7762_210_1794_1_1528199508_4057408_422-227034-0_sp1.1 from training. Duration: 0.663625 2023-10-02 03:24:24,727 WARNING [train.py:1204] (2/4) Exclude cut with ID _4184_210_2550_1_1526730192013_2219380_102-250299-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 03:24:26,045 WARNING [train.py:1204] (2/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_253-308377-0_sp0.9 from training. Duration: 0.5444375 2023-10-02 03:24:27,510 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_271-297505-0 from training. Duration: 0.99 2023-10-02 03:24:28,872 WARNING [train.py:1204] (2/4) Exclude cut with ID _35941_210_15379_1_1532995294515_4023089_213-142973-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 03:24:31,603 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7544_210_6753_1_1528587722_3576686_162-63018-0_sp1.1 from training. Duration: 0.92725 2023-10-02 03:24:31,674 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_365-217119-0_sp0.9 from training. Duration: 0.7 2023-10-02 03:24:34,406 WARNING [train.py:1204] (2/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_662-275621-0_sp0.9 from training. Duration: 0.4666875 2023-10-02 03:24:34,408 WARNING [train.py:1204] (2/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_374-187555-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 03:24:34,484 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_289-180904-0_sp1.1 from training. Duration: 0.6909375 2023-10-02 03:24:35,870 WARNING [train.py:1204] (2/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_189-47206-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 03:24:37,236 WARNING [train.py:1204] (2/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_234-14174-0_sp1.1 from training. Duration: 0.7454375 2023-10-02 03:24:37,272 WARNING [train.py:1204] (2/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_576-76621-0_sp1.1 from training. Duration: 0.963625 2023-10-02 03:24:38,019 WARNING [train.py:1204] (2/4) Exclude cut with ID _13253_210_1925_1_1530757775819_3577873_253-246386-0 from training. Duration: 0.9 2023-10-02 03:24:40,892 WARNING [train.py:1204] (2/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_372-6869-0_sp1.1 from training. Duration: 0.4 2023-10-02 03:24:42,195 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_21434_210_4238_1_1531916339_1222252_22-54954-0_sp1.1 from training. Duration: 0.936375 2023-10-02 03:24:42,396 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.feed_forward3.hidden_balancer.prob, batch_count=733680.0, ans=0.125 2023-10-02 03:24:42,434 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.bypass.skip_rate, batch_count=733680.0, ans=0.07 2023-10-02 03:24:46,980 WARNING [train.py:1204] (2/4) Exclude cut with ID _29853_210_7497_1_1532505600851_6642110_492-170361-0_sp1.1 from training. Duration: 0.92725 2023-10-02 03:24:48,376 WARNING [train.py:1204] (2/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_505-97987-0_sp0.9 from training. Duration: 0.9555625 2023-10-02 03:24:49,865 WARNING [train.py:1204] (2/4) Exclude cut with ID _8365_210_2064_1_1528592311070_4677690_371-122295-0_sp0.9 from training. Duration: 0.611125 2023-10-02 03:24:51,632 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.549e+02 1.912e+02 2.251e+02 2.632e+02 4.326e+02, threshold=4.502e+02, percent-clipped=0.0 2023-10-02 03:24:51,726 WARNING [train.py:1204] (2/4) Exclude cut with ID _14268_210_9206_1_1530622771285_4714099_637-86630-0_sp1.1 from training. Duration: 0.463625 2023-10-02 03:24:51,738 WARNING [train.py:1204] (2/4) Exclude cut with ID _29857_210_20034_1_1532395886753_3798160_372-5678-0_sp1.1 from training. Duration: 0.963625 2023-10-02 03:24:53,890 WARNING [train.py:1204] (2/4) Exclude cut with ID _52892_210_6721_1_1534378941259_4124129_90-165108-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 03:24:54,102 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.nonlin_attention.balancer.prob, batch_count=733746.6666666666, ans=0.125 2023-10-02 03:24:55,277 WARNING [train.py:1204] (2/4) Exclude cut with ID _27052_210_5986_1_1532329197187_3585009_143-339198-0_sp1.1 from training. Duration: 0.963625 2023-10-02 03:24:56,875 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.conv_module2.balancer1.prob, batch_count=733746.6666666666, ans=0.125 2023-10-02 03:24:59,318 WARNING [train.py:1204] (2/4) Exclude cut with ID _14268_210_9206_1_1530622771285_4714099_637-86630-0_sp0.9 from training. Duration: 0.5666875 2023-10-02 03:24:59,320 WARNING [train.py:1204] (2/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_401-255491-0 from training. Duration: 0.7 2023-10-02 03:25:00,749 WARNING [train.py:1204] (2/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_178-6946-0_sp1.1 from training. Duration: 0.97275 2023-10-02 03:25:00,909 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.bypass.skip_rate, batch_count=733813.3333333334, ans=0.07 2023-10-02 03:25:01,359 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.3.encoder.layers.3.self_attn_weights.whiten_keys, num_groups=8, num_channels=256, metric=3.38 vs. limit=6.0 2023-10-02 03:25:06,449 WARNING [train.py:1204] (2/4) Exclude cut with ID _27485_210_4916_1_1532739191677_3668269_460-163885-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 03:25:11,117 WARNING [train.py:1204] (2/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_270-26053-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 03:25:12,574 WARNING [train.py:1204] (2/4) Exclude cut with ID _14170_210_5219_1_1530525949868_4469650_412-317077-0 from training. Duration: 0.75 2023-10-02 03:25:14,488 WARNING [train.py:1204] (2/4) Exclude cut with ID _5289_210_2084_1_1527678202736_3779299_142-103317-0 from training. Duration: 0.84 2023-10-02 03:25:15,809 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_150-143438-0_sp1.1 from training. Duration: 0.92725 2023-10-02 03:25:15,981 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.attention_skip_rate, batch_count=733880.0, ans=0.0 2023-10-02 03:25:16,372 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.4.encoder.layers.2.feed_forward2.out_whiten, num_groups=1, num_channels=384, metric=8.73 vs. limit=15.0 2023-10-02 03:25:18,457 WARNING [train.py:1204] (2/4) Exclude cut with ID _27894_210_12392_1_1532415496078_6902439_668-269779-0_sp1.1 from training. Duration: 0.97275 2023-10-02 03:25:19,859 WARNING [train.py:1204] (2/4) Exclude cut with ID _18513_210_9206_1_1531618215223_3862620_56-222075-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 03:25:21,779 WARNING [train.py:1204] (2/4) Exclude cut with ID _4092_210_2151_1_1526643044449_3135759_356-278694-0 from training. Duration: 0.47 2023-10-02 03:25:24,619 WARNING [train.py:1204] (2/4) Exclude cut with ID _25808_210_13292_1_1532343514562_7366109_633-47524-0 from training. Duration: 0.99 2023-10-02 03:25:24,629 WARNING [train.py:1204] (2/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_872-264525-0 from training. Duration: 0.65 2023-10-02 03:25:24,658 WARNING [train.py:1204] (2/4) Exclude cut with ID _14632_210_9988_1_1530691279392_3923200_239-232545-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 03:25:25,401 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.2.encoder.layers.1.self_attn1.whiten, num_groups=1, num_channels=384, metric=18.61 vs. limit=22.5 2023-10-02 03:25:26,629 WARNING [train.py:1204] (2/4) Exclude cut with ID _14349_210_8341_1_1530615710818_3442470_153-27432-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 03:25:30,910 WARNING [train.py:1204] (2/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_37-41919-0_sp0.9 from training. Duration: 0.9666875 2023-10-02 03:25:30,970 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_196-289637-0_sp0.9 from training. Duration: 0.8333125 2023-10-02 03:25:32,221 INFO [train.py:1046] (2/4) Epoch 21, batch 3850, loss[loss=0.1785, simple_loss=0.2683, pruned_loss=0.04434, over 24442.00 frames. ], tot_loss[loss=0.1749, simple_loss=0.2513, pruned_loss=0.04929, over 4731041.25 frames. ], batch size: 69, lr: 4.87e-03, grad_scale: 16.0 2023-10-02 03:25:37,213 WARNING [train.py:1204] (2/4) Exclude cut with ID _13678_210_7497_1_1530530069226_3463170_371-3221-0_sp1.1 from training. Duration: 0.8181875 2023-10-02 03:25:38,353 WARNING [train.py:1204] (2/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_557-324258-0 from training. Duration: 0.81 2023-10-02 03:25:38,421 WARNING [train.py:1204] (2/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_1012-279473-0_sp1.1 from training. Duration: 0.6090625 2023-10-02 03:25:39,815 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_66916_210_14656_1_1535725525_326566_7-92649-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 03:25:42,719 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_9460_210_4211_1_1528954587_10049667_480-260269-0_sp1.1 from training. Duration: 0.5090625 2023-10-02 03:25:47,331 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_40896_210_15710_1_1533433956_7100802_121-155001-0_sp1.1 from training. Duration: 0.92725 2023-10-02 03:25:48,803 WARNING [train.py:1204] (2/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_188-171673-0_sp1.1 from training. Duration: 0.52725 2023-10-02 03:25:48,871 WARNING [train.py:1204] (2/4) Exclude cut with ID _47790_210_16187_1_1533862814066_4207519_2-39653-0 from training. Duration: 0.98 2023-10-02 03:25:49,148 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.attention_skip_rate, batch_count=734013.3333333334, ans=0.0 2023-10-02 03:25:56,386 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_23850_210_4238_1_1532232597_1632433_112-229012-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 03:25:57,798 WARNING [train.py:1204] (2/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_626-79528-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 03:26:01,097 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.feed_forward3.out_whiten.whitening_limit, batch_count=734080.0, ans=15.0 2023-10-02 03:26:01,936 WARNING [train.py:1204] (2/4) Exclude cut with ID _30304_210_6319_1_1532476827261_7582060_856-90052-0_sp1.1 from training. Duration: 0.963625 2023-10-02 03:26:01,970 WARNING [train.py:1204] (2/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_122-109023-0_sp0.9 from training. Duration: 0.8444375 2023-10-02 03:26:04,771 WARNING [train.py:1204] (2/4) Exclude cut with ID _64414_210_17301_1_1535243252048_3757759_37-70328-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 03:26:04,845 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_20120_210_6753_1_1531643035_5482859_622-344907-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 03:26:06,123 WARNING [train.py:1204] (2/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_410-321956-0_sp1.1 from training. Duration: 0.92725 2023-10-02 03:26:06,140 WARNING [train.py:1204] (2/4) Exclude cut with ID _7562_210_2084_1_1528887767916_3946229_186-326422-0_sp0.9 from training. Duration: 0.9333125 2023-10-02 03:26:06,261 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.1.feed_forward1.out_proj.dropout_p, batch_count=734080.0, ans=0.1 2023-10-02 03:26:07,520 WARNING [train.py:1204] (2/4) Exclude cut with ID _37537_210_2253_1_1533171758568_3421949_127-285192-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 03:26:08,956 WARNING [train.py:1204] (2/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_723-228168-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 03:26:10,346 WARNING [train.py:1204] (2/4) Exclude cut with ID _13678_210_7497_1_1530530069226_3463170_426-246984-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 03:26:10,369 WARNING [train.py:1204] (2/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_399-156791-0_sp0.9 from training. Duration: 0.92225 2023-10-02 03:26:10,416 WARNING [train.py:1204] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_391-22148-0 from training. Duration: 0.77 2023-10-02 03:26:10,450 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_3475_210_2028_1_1526193578_3622569_64-297072-0 from training. Duration: 0.91 2023-10-02 03:26:12,297 WARNING [train.py:1204] (2/4) Exclude cut with ID _17875_210_2087_1_1531224012907_1821339_225-280015-0_sp1.1 from training. Duration: 0.963625 2023-10-02 03:26:12,315 WARNING [train.py:1204] (2/4) Exclude cut with ID _50655_210_18628_1_1534229942193_3772154_325-246169-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 03:26:15,124 WARNING [train.py:1204] (2/4) Exclude cut with ID _29529_210_7234_1_1532393961455_3977460_597-257348-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 03:26:15,163 WARNING [train.py:1204] (2/4) Exclude cut with ID _58895_210_8750_1_1535184081320_3649089_77-173485-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 03:26:15,182 WARNING [train.py:1204] (2/4) Exclude cut with ID _6270_210_2084_1_1527850911573_3856420_26-277343-0 from training. Duration: 0.82 2023-10-02 03:26:16,847 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.feed_forward1.hidden_balancer.prob, batch_count=734146.6666666666, ans=0.125 2023-10-02 03:26:18,442 WARNING [train.py:1204] (2/4) Exclude cut with ID _14368_210_4220_1_1530532714870_3595529_144-3287-0 from training. Duration: 0.67 2023-10-02 03:26:19,877 WARNING [train.py:1204] (2/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_293-145278-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 03:26:23,136 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_40047_210_2290_1_1533290519_2374311_257-335162-0 from training. Duration: 0.95 2023-10-02 03:26:24,501 WARNING [train.py:1204] (2/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_619-344499-0_sp0.9 from training. Duration: 0.7 2023-10-02 03:26:30,377 WARNING [train.py:1204] (2/4) Exclude cut with ID _19504_210_9994_1_1531648863242_3325220_456-315379-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 03:26:30,454 WARNING [train.py:1204] (2/4) Exclude cut with ID _55857_210_2346_1_1534644007831_6898209_631-221692-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 03:26:30,963 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.5.encoder.layers.0.conv_module2.whiten, num_groups=1, num_channels=256, metric=4.15 vs. limit=15.0 2023-10-02 03:26:33,361 WARNING [train.py:1204] (2/4) Exclude cut with ID _64631_210_27767_1_1535349580068_4165160_324-3682-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 03:26:34,696 WARNING [train.py:1204] (2/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_317-211107-0 from training. Duration: 0.92 2023-10-02 03:26:36,196 WARNING [train.py:1204] (2/4) Exclude cut with ID _6789_210_1534_1_1527857937218_3118950_311-61262-0 from training. Duration: 0.65 2023-10-02 03:26:40,187 WARNING [train.py:1204] (2/4) Exclude cut with ID _28323_210_16187_1_1532480395667_4100300_365-68882-0_sp1.1 from training. Duration: 0.92725 2023-10-02 03:26:40,216 WARNING [train.py:1204] (2/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_816-181825-0_sp1.1 from training. Duration: 0.92725 2023-10-02 03:26:40,399 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=734213.3333333334, ans=0.1 2023-10-02 03:26:40,879 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.2.encoder.layers.2.self_attn2.whiten, num_groups=1, num_channels=384, metric=18.63 vs. limit=22.5 2023-10-02 03:26:43,556 WARNING [train.py:1204] (2/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_132-269633-0_sp1.1 from training. Duration: 0.6181875 2023-10-02 03:26:43,573 WARNING [train.py:1204] (2/4) Exclude cut with ID _44585_210_18628_1_1533605350387_4518199_280-73534-0_sp1.1 from training. Duration: 0.8090625 2023-10-02 03:26:43,626 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_68095_210_20185_1_1535718226_8888855_1009-164755-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 03:26:44,978 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_40223_210_6228_1_1533298404_4812267_400-103813-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 03:26:44,979 WARNING [train.py:1204] (2/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_491-261923-0_sp1.1 from training. Duration: 0.8909375 2023-10-02 03:26:44,984 WARNING [train.py:1204] (2/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_712-342563-0 from training. Duration: 0.97 2023-10-02 03:26:45,062 WARNING [train.py:1204] (2/4) Exclude cut with ID _41161_210_20034_1_1533431249657_3555314_175-190032-0_sp1.1 from training. Duration: 0.963625 2023-10-02 03:26:46,310 INFO [train.py:1046] (2/4) Epoch 21, batch 3900, loss[loss=0.1815, simple_loss=0.2647, pruned_loss=0.04911, over 24643.00 frames. ], tot_loss[loss=0.1743, simple_loss=0.2505, pruned_loss=0.04904, over 4725099.60 frames. ], batch size: 68, lr: 4.87e-03, grad_scale: 8.0 2023-10-02 03:26:47,818 WARNING [train.py:1204] (2/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_788-298907-0 from training. Duration: 0.89 2023-10-02 03:26:47,835 WARNING [train.py:1204] (2/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_225-70961-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 03:26:47,848 WARNING [train.py:1204] (2/4) Exclude cut with ID _58583_210_5236_1_1534726835266_3574249_235-175994-0_sp1.1 from training. Duration: 0.92725 2023-10-02 03:26:49,771 WARNING [train.py:1204] (2/4) Exclude cut with ID _12302_210_5219_1_1529924446466_8107280_907-182949-0_sp1.1 from training. Duration: 0.836375 2023-10-02 03:26:49,822 WARNING [train.py:1204] (2/4) Exclude cut with ID _54871_210_22356_1_1534665798253_3856359_94-193692-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 03:26:51,243 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_4528_210_3273_1_1527076654_3276777_342-168068-0_sp0.9 from training. Duration: 0.9666875 2023-10-02 03:26:51,297 WARNING [train.py:1204] (2/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_368-74239-0_sp1.1 from training. Duration: 0.92725 2023-10-02 03:26:51,298 WARNING [train.py:1204] (2/4) Exclude cut with ID _44831_210_13427_1_1533816011549_3689035_291-295928-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 03:26:53,264 WARNING [train.py:1204] (2/4) Exclude cut with ID _56406_210_9288_1_1534899618664_3496149_455-131997-0_sp1.1 from training. Duration: 0.97275 2023-10-02 03:26:53,271 WARNING [train.py:1204] (2/4) Exclude cut with ID _6473_210_1741_1_1527901307396_3815380_108-17335-0 from training. Duration: 0.65 2023-10-02 03:26:53,308 WARNING [train.py:1204] (2/4) Exclude cut with ID _14224_210_4381_1_1530529062037_4264010_286-135600-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 03:26:53,583 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.conv_skip_rate, batch_count=734280.0, ans=0.0 2023-10-02 03:26:54,796 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_14056_210_4163_1_1530604483_7572827_928-321675-0_sp1.1 from training. Duration: 0.936375 2023-10-02 03:26:56,153 WARNING [train.py:1204] (2/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_81-300212-0_sp1.1 from training. Duration: 0.5454375 2023-10-02 03:26:56,196 WARNING [train.py:1204] (2/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_29-280622-0_sp0.9 from training. Duration: 0.911125 2023-10-02 03:26:57,576 WARNING [train.py:1204] (2/4) Exclude cut with ID _61079_210_4525_1_1534921008198_4011419_228-176447-0_sp1.1 from training. Duration: 0.936375 2023-10-02 03:27:02,137 WARNING [train.py:1204] (2/4) Exclude cut with ID _8394_210_6702_1_1528590504316_3540189_372-198878-0_sp1.1 from training. Duration: 0.5454375 2023-10-02 03:27:02,154 WARNING [train.py:1204] (2/4) Exclude cut with ID _64521_210_14699_1_1535243427259_3815328_244-208725-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 03:27:03,581 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8275_210_4198_1_1528455320_3892571_491-17707-0_sp0.9 from training. Duration: 0.711125 2023-10-02 03:27:05,007 WARNING [train.py:1204] (2/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_375-245677-0 from training. Duration: 0.81 2023-10-02 03:27:06,400 WARNING [train.py:1204] (2/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_690-286055-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 03:27:06,536 WARNING [train.py:1204] (2/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_89-321955-0 from training. Duration: 0.8 2023-10-02 03:27:06,574 WARNING [train.py:1204] (2/4) Exclude cut with ID _18895_210_6228_1_1531357274234_3784930_213-98721-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 03:27:09,192 WARNING [train.py:1204] (2/4) Exclude cut with ID _19708_210_9206_1_1532136650733_4238364_347-65240-0 from training. Duration: 0.97 2023-10-02 03:27:09,305 WARNING [train.py:1204] (2/4) Exclude cut with ID _64135_210_2253_1_1535677267876_7608972_498-5039-0 from training. Duration: 0.78 2023-10-02 03:27:09,509 INFO [scaling.py:1118] (2/4) WithLoss: name=encoder.encoders.3.encoder.layers.0.self_attn_weights, loss-sum=0.000e+00 2023-10-02 03:27:13,080 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.2.encoder.layers.0.nonlin_attention.whiten1, num_groups=1, num_channels=288, metric=7.67 vs. limit=10.0 2023-10-02 03:27:15,100 WARNING [train.py:1204] (2/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_146-276944-0_sp1.1 from training. Duration: 0.7545625 2023-10-02 03:27:16,437 WARNING [train.py:1204] (2/4) Exclude cut with ID _63171_210_15488_1_1535104792794_3295110_155-34488-0_sp1.1 from training. Duration: 0.97275 2023-10-02 03:27:16,457 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_5438_210_1794_1_1527336687_2600009_233-58072-0_sp0.9 from training. Duration: 0.8555625 2023-10-02 03:27:18,367 WARNING [train.py:1204] (2/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_63-101996-0_sp0.9 from training. Duration: 0.97775 2023-10-02 03:27:21,328 WARNING [train.py:1204] (2/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_271-269084-0_sp1.1 from training. Duration: 0.7545625 2023-10-02 03:27:22,591 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.494e+02 1.899e+02 2.124e+02 2.634e+02 1.113e+03, threshold=4.247e+02, percent-clipped=1.0 2023-10-02 03:27:24,840 WARNING [train.py:1204] (2/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_345-167986-0_sp1.1 from training. Duration: 0.763625 2023-10-02 03:27:26,298 WARNING [train.py:1204] (2/4) Exclude cut with ID _39290_210_4220_1_1533862778760_3075118_92-311600-0_sp1.1 from training. Duration: 0.8 2023-10-02 03:27:26,304 WARNING [train.py:1204] (2/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_209-65306-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 03:27:27,257 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.0.layers.0.nonlin_attention.whiten1, num_groups=1, num_channels=144, metric=9.59 vs. limit=10.0 2023-10-02 03:27:27,612 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_26433_210_12108_1_1532157158_4056480_317-335807-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 03:27:33,470 WARNING [train.py:1204] (2/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_238-94127-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 03:27:33,525 WARNING [train.py:1204] (2/4) Exclude cut with ID _41314_210_12651_1_1533459476543_3766460_207-132038-0_sp1.1 from training. Duration: 0.8545625 2023-10-02 03:27:40,857 WARNING [train.py:1204] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_167-137255-0_sp1.1 from training. Duration: 0.5818125 2023-10-02 03:27:40,951 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_2009_210_2071_1_1525481134_4588937_385-302100-0_sp0.9 from training. Duration: 0.9666875 2023-10-02 03:27:47,587 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.3.encoder.layers.1.conv_module1.whiten, num_groups=1, num_channels=512, metric=3.31 vs. limit=15.0 2023-10-02 03:27:51,790 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_15517_210_6753_1_1530954008_3487336_253-110617-0_sp1.1 from training. Duration: 0.936375 2023-10-02 03:27:53,274 WARNING [train.py:1204] (2/4) Exclude cut with ID _39290_210_4220_1_1533862778760_3075118_92-311600-0_sp0.9 from training. Duration: 0.97775 2023-10-02 03:27:54,621 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_42-142473-0 from training. Duration: 0.72 2023-10-02 03:27:54,665 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_2296_210_2087_1_1525503448_3397335_57-185430-0 from training. Duration: 0.6 2023-10-02 03:27:56,491 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_11305_210_1794_1_1529750428_7178148_257-312870-0_sp0.9 from training. Duration: 0.97775 2023-10-02 03:27:56,617 WARNING [train.py:1204] (2/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_222-198127-0 from training. Duration: 0.84 2023-10-02 03:27:58,039 WARNING [train.py:1204] (2/4) Exclude cut with ID _65263_210_22356_1_1535763598691_3921037_423-202699-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 03:27:58,276 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.feed_forward3.hidden_balancer.prob, batch_count=734546.6666666666, ans=0.125 2023-10-02 03:27:59,455 WARNING [train.py:1204] (2/4) Exclude cut with ID _15037_210_2253_1_1530752294104_3267650_13-130361-0 from training. Duration: 0.99 2023-10-02 03:28:02,206 INFO [train.py:1046] (2/4) Epoch 21, batch 3950, loss[loss=0.1675, simple_loss=0.2561, pruned_loss=0.03941, over 24291.00 frames. ], tot_loss[loss=0.1733, simple_loss=0.25, pruned_loss=0.04831, over 4722968.84 frames. ], batch size: 74, lr: 4.87e-03, grad_scale: 8.0 2023-10-02 03:28:02,900 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.3.encoder.layers.2.self_attn2.whiten, num_groups=1, num_channels=512, metric=16.78 vs. limit=22.5 2023-10-02 03:28:05,679 WARNING [train.py:1204] (2/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_628-198080-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 03:28:07,149 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_3171_210_2087_1_1526109084_2330007_37-64334-0 from training. Duration: 0.82 2023-10-02 03:28:07,211 WARNING [train.py:1204] (2/4) Exclude cut with ID _5287_210_2084_1_1527418883040_3878670_12-221186-0_sp1.1 from training. Duration: 0.8909375 2023-10-02 03:28:10,028 WARNING [train.py:1204] (2/4) Exclude cut with ID _6653_210_2629_1_1527919172441_6732679_1103-56445-0_sp1.1 from training. Duration: 0.663625 2023-10-02 03:28:12,772 WARNING [train.py:1204] (2/4) Exclude cut with ID _65263_210_22356_1_1535763598691_3921037_169-140068-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 03:28:16,177 WARNING [train.py:1204] (2/4) Exclude cut with ID IC0671W0118-331961 from training. Duration: 0.896 2023-10-02 03:28:17,984 WARNING [train.py:1204] (2/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_402-102311-0_sp1.1 from training. Duration: 0.6090625 2023-10-02 03:28:18,010 WARNING [train.py:1204] (2/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_202-293523-0 from training. Duration: 0.72 2023-10-02 03:28:19,302 WARNING [train.py:1204] (2/4) Exclude cut with ID IC0679W0038-335846 from training. Duration: 0.768 2023-10-02 03:28:19,322 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_37357_210_17024_1_1533089504_2316121_79-198866-0_sp1.1 from training. Duration: 0.936375 2023-10-02 03:28:21,991 WARNING [train.py:1204] (2/4) Exclude cut with ID _7606_210_4915_1_1528894253277_5080690_292-271285-0_sp1.1 from training. Duration: 0.963625 2023-10-02 03:28:22,002 WARNING [train.py:1204] (2/4) Exclude cut with ID _3541_210_2477_1_1526475408709_3263973_377-296839-0_sp0.9 from training. Duration: 0.611125 2023-10-02 03:28:22,004 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_45886_210_17024_1_1533704917_2435333_58-131069-0_sp1.1 from training. Duration: 0.963625 2023-10-02 03:28:25,454 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6961_210_2087_1_1527937168_6815011_395-185060-0 from training. Duration: 0.95 2023-10-02 03:28:26,867 WARNING [train.py:1204] (2/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_304-266479-0_sp1.1 from training. Duration: 0.97275 2023-10-02 03:28:27,522 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.3.encoder.layers.2.nonlin_attention.whiten1, num_groups=1, num_channels=384, metric=5.27 vs. limit=10.0 2023-10-02 03:28:28,265 WARNING [train.py:1204] (2/4) Exclude cut with ID _3867_210_1883_1_1526641200575_4475830_319-349850-0_sp1.1 from training. Duration: 0.6090625 2023-10-02 03:28:28,282 WARNING [train.py:1204] (2/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_304-59227-0_sp0.9 from training. Duration: 0.8555625 2023-10-02 03:28:28,369 WARNING [train.py:1204] (2/4) Exclude cut with ID _8021_210_7994_1_1528376451358_3653069_100-140231-0_sp0.9 from training. Duration: 0.7444375 2023-10-02 03:28:29,658 WARNING [train.py:1204] (2/4) Exclude cut with ID _8833_210_5219_1_1528592665210_7553059_329-125156-0_sp0.9 from training. Duration: 0.911125 2023-10-02 03:28:31,377 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=734746.6666666666, ans=0.1 2023-10-02 03:28:35,965 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.conv_module2.balancer2.prob, batch_count=734746.6666666666, ans=0.125 2023-10-02 03:28:39,920 WARNING [train.py:1204] (2/4) Exclude cut with ID _14459_210_2228_1_1531443641946_7516700_792-287234-0_sp1.1 from training. Duration: 0.92725 2023-10-02 03:28:39,946 WARNING [train.py:1204] (2/4) Exclude cut with ID _19708_210_9206_1_1532136650733_4238364_347-65240-0_sp1.1 from training. Duration: 0.8818125 2023-10-02 03:28:44,911 WARNING [train.py:1204] (2/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_70-27732-0 from training. Duration: 0.94 2023-10-02 03:28:50,429 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.ff3_skip_rate, batch_count=734813.3333333334, ans=0.0 2023-10-02 03:28:51,507 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_397-132565-0 from training. Duration: 0.99 2023-10-02 03:28:51,519 WARNING [train.py:1204] (2/4) Exclude cut with ID _49728_210_12651_1_1534041034294_6788150_624-195301-0 from training. Duration: 0.85 2023-10-02 03:28:51,565 WARNING [train.py:1204] (2/4) Exclude cut with ID _55429_210_10626_1_1534467588527_7174143_377-169391-0_sp1.1 from training. Duration: 0.87275 2023-10-02 03:28:53,181 INFO [scaling.py:1118] (2/4) WithLoss: name=encoder.encoders.1.encoder.layers.1.self_attn_weights, loss-sum=0.000e+00 2023-10-02 03:28:54,938 WARNING [train.py:1204] (2/4) Exclude cut with ID _49376_210_5986_1_1533985278732_3370740_354-344695-0_sp1.1 from training. Duration: 0.9 2023-10-02 03:28:55,126 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.conv_module1.balancer2.prob, batch_count=734813.3333333334, ans=0.125 2023-10-02 03:29:02,585 WARNING [train.py:1204] (2/4) Exclude cut with ID _4697_210_2069_1_1526815801953_3823098_276-96593-0_sp1.1 from training. Duration: 0.7 2023-10-02 03:29:02,601 WARNING [train.py:1204] (2/4) Exclude cut with ID _15012_210_1968_1_1530856820429_5095350_688-178849-0_sp0.9 from training. Duration: 0.711125 2023-10-02 03:29:03,832 WARNING [train.py:1204] (2/4) Exclude cut with ID _25565_210_12392_1_1532423009382_3952098_372-69023-0_sp1.1 from training. Duration: 0.936375 2023-10-02 03:29:03,858 WARNING [train.py:1204] (2/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_61-327062-0_sp1.1 from training. Duration: 0.736375 2023-10-02 03:29:03,899 WARNING [train.py:1204] (2/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_215-141059-0 from training. Duration: 0.62 2023-10-02 03:29:04,277 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.bypass_mid.scale_min, batch_count=734880.0, ans=0.2 2023-10-02 03:29:07,469 WARNING [train.py:1204] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_6-226750-0_sp1.1 from training. Duration: 0.8545625 2023-10-02 03:29:07,634 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.balancer2.prob, batch_count=734880.0, ans=0.125 2023-10-02 03:29:08,907 WARNING [train.py:1204] (2/4) Exclude cut with ID _14348_210_8341_1_1530599434996_3778640_240-232370-0_sp1.1 from training. Duration: 0.763625 2023-10-02 03:29:10,414 INFO [scaling.py:1118] (2/4) WithLoss: name=encoder.encoders.1.encoder.layers.0.self_attn_weights, loss-sum=0.000e+00 2023-10-02 03:29:13,054 WARNING [train.py:1204] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_468-189058-0 from training. Duration: 0.96 2023-10-02 03:29:17,012 INFO [train.py:1046] (2/4) Epoch 21, batch 4000, loss[loss=0.1773, simple_loss=0.2505, pruned_loss=0.05208, over 23678.00 frames. ], tot_loss[loss=0.174, simple_loss=0.2503, pruned_loss=0.04881, over 4708863.03 frames. ], batch size: 149, lr: 4.87e-03, grad_scale: 16.0 2023-10-02 03:29:19,219 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=734946.6666666666, ans=0.1 2023-10-02 03:29:23,129 WARNING [train.py:1204] (2/4) Exclude cut with ID _21987_210_5854_1_1531821500482_3637038_247-93340-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 03:29:29,800 WARNING [train.py:1204] (2/4) Exclude cut with ID _66868_210_9527_1_1535509287954_4561140_436-191602-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 03:29:33,373 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.1.encoder.layers.0.self_attn1.whiten, num_groups=1, num_channels=256, metric=11.09 vs. limit=22.5 2023-10-02 03:29:35,368 WARNING [train.py:1204] (2/4) Exclude cut with ID _18895_210_6228_1_1531357274234_3784930_394-58203-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 03:29:35,420 WARNING [train.py:1204] (2/4) Exclude cut with ID _58605_210_12210_1_1534723195939_3904259_258-281498-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 03:29:36,741 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_33348_210_5632_1_1533086912_3324030_245-292752-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 03:29:36,759 WARNING [train.py:1204] (2/4) Exclude cut with ID _67831_210_12392_1_1535612464202_3092612_346-2148-0 from training. Duration: 0.93 2023-10-02 03:29:38,672 WARNING [train.py:1204] (2/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_369-222060-0_sp0.9 from training. Duration: 0.788875 2023-10-02 03:29:38,724 WARNING [train.py:1204] (2/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_245-174553-0 from training. Duration: 0.88 2023-10-02 03:29:38,729 WARNING [train.py:1204] (2/4) Exclude cut with ID _3541_210_2477_1_1526475408709_3263973_231-104736-0_sp1.1 from training. Duration: 0.7090625 2023-10-02 03:29:38,739 WARNING [train.py:1204] (2/4) Exclude cut with ID _44585_210_18628_1_1533605350387_4518199_280-73534-0 from training. Duration: 0.89 2023-10-02 03:29:41,528 WARNING [train.py:1204] (2/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_22-201476-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 03:29:44,367 WARNING [train.py:1204] (2/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_325-62802-0_sp0.9 from training. Duration: 0.9666875 2023-10-02 03:29:44,379 WARNING [train.py:1204] (2/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_199-274062-0_sp1.1 from training. Duration: 0.97275 2023-10-02 03:29:44,381 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_8-319957-0_sp1.1 from training. Duration: 0.87275 2023-10-02 03:29:44,402 WARNING [train.py:1204] (2/4) Exclude cut with ID _29343_210_4381_1_1532602757752_3674109_224-349726-0_sp1.1 from training. Duration: 0.936375 2023-10-02 03:29:44,406 WARNING [train.py:1204] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_520-126729-0_sp1.1 from training. Duration: 0.636375 2023-10-02 03:29:47,147 WARNING [train.py:1204] (2/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_239-22076-0_sp1.1 from training. Duration: 0.77275 2023-10-02 03:29:48,521 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9012W0011-501146 from training. Duration: 0.896 2023-10-02 03:29:49,981 WARNING [train.py:1204] (2/4) Exclude cut with ID _29857_210_20034_1_1532395886753_3798160_279-185801-0_sp1.1 from training. Duration: 0.82725 2023-10-02 03:29:50,024 WARNING [train.py:1204] (2/4) Exclude cut with ID _43552_210_8913_1_1533726031059_3523429_261-223933-0_sp1.1 from training. Duration: 0.92725 2023-10-02 03:29:53,497 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9029W0025-509426 from training. Duration: 0.896 2023-10-02 03:29:54,647 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.537e+02 1.771e+02 2.002e+02 2.200e+02 3.082e+02, threshold=4.004e+02, percent-clipped=0.0 2023-10-02 03:29:54,762 WARNING [train.py:1204] (2/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_337-181502-0_sp1.1 from training. Duration: 0.5909375 2023-10-02 03:29:54,783 WARNING [train.py:1204] (2/4) Exclude cut with ID _68635_210_9317_1_1535770813410_4315870_159-332599-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 03:29:59,223 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_161-276996-0 from training. Duration: 0.95 2023-10-02 03:30:01,271 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7088_210_1874_1_1527939776_1101113_127-68870-0_sp1.1 from training. Duration: 0.936375 2023-10-02 03:30:02,165 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.1.encoder.layers.0.conv_module2.whiten, num_groups=1, num_channels=256, metric=7.07 vs. limit=15.0 2023-10-02 03:30:03,247 WARNING [train.py:1204] (2/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_322-217220-0_sp1.1 from training. Duration: 0.7818125 2023-10-02 03:30:04,608 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9072W0002-530636 from training. Duration: 0.768 2023-10-02 03:30:05,972 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_340-336408-0_sp1.1 from training. Duration: 0.8454375 2023-10-02 03:30:06,046 WARNING [train.py:1204] (2/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_395-37222-0 from training. Duration: 0.76 2023-10-02 03:30:06,049 WARNING [train.py:1204] (2/4) Exclude cut with ID _64099_210_17301_1_1535526977656_1578188_159-209156-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 03:30:07,366 WARNING [train.py:1204] (2/4) Exclude cut with ID _53950_210_5816_1_1534417264919_3862951_28-29681-0_sp1.1 from training. Duration: 0.92725 2023-10-02 03:30:07,583 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.attention_skip_rate, batch_count=735146.6666666666, ans=0.0 2023-10-02 03:30:09,050 WARNING [train.py:1204] (2/4) Exclude cut with ID _4681_210_3525_1_1527250689464_120889_15-126760-0_sp1.1 from training. Duration: 0.736375 2023-10-02 03:30:10,463 WARNING [train.py:1204] (2/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_242-3472-0_sp1.1 from training. Duration: 0.663625 2023-10-02 03:30:10,504 WARNING [train.py:1204] (2/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_436-186311-0_sp0.9 from training. Duration: 0.72225 2023-10-02 03:30:10,520 WARNING [train.py:1204] (2/4) Exclude cut with ID _3675_210_4557_1_1526639934408_4182089_452-172147-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 03:30:12,246 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.attention_skip_rate, batch_count=735146.6666666666, ans=0.0 2023-10-02 03:30:13,358 WARNING [train.py:1204] (2/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_80-147301-0 from training. Duration: 0.84 2023-10-02 03:30:13,385 WARNING [train.py:1204] (2/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_351-107588-0_sp1.1 from training. Duration: 0.92725 2023-10-02 03:30:16,137 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9111W0002-550781 from training. Duration: 0.896 2023-10-02 03:30:18,071 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.feed_forward1.hidden_balancer.prob, batch_count=735213.3333333334, ans=0.125 2023-10-02 03:30:19,259 WARNING [train.py:1204] (2/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_465-156500-0_sp1.1 from training. Duration: 0.6545625 2023-10-02 03:30:22,653 WARNING [train.py:1204] (2/4) Exclude cut with ID _13938_210_9988_1_1530518718978_3693960_304-112784-0_sp1.1 from training. Duration: 0.4181875 2023-10-02 03:30:25,144 WARNING [train.py:1204] (2/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_202-67017-0_sp1.1 from training. Duration: 0.7454375 2023-10-02 03:30:25,192 WARNING [train.py:1204] (2/4) Exclude cut with ID _58414_210_8254_1_1535421609162_7151182_448-4358-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 03:30:26,597 WARNING [train.py:1204] (2/4) Exclude cut with ID _66840_210_5219_1_1535452168426_4196349_203-243234-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 03:30:28,004 WARNING [train.py:1204] (2/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_201-213513-0_sp1.1 from training. Duration: 0.97275 2023-10-02 03:30:33,391 INFO [train.py:1046] (2/4) Epoch 21, batch 4050, loss[loss=0.1663, simple_loss=0.2386, pruned_loss=0.04701, over 23288.00 frames. ], tot_loss[loss=0.1738, simple_loss=0.2503, pruned_loss=0.04859, over 4724020.62 frames. ], batch size: 119, lr: 4.87e-03, grad_scale: 8.0 2023-10-02 03:30:34,873 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6291_210_3273_1_1527683649_1078498_117-187560-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 03:30:36,296 WARNING [train.py:1204] (2/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_135-174080-0_sp0.9 from training. Duration: 0.588875 2023-10-02 03:30:37,685 WARNING [train.py:1204] (2/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_45-265684-0 from training. Duration: 0.72 2023-10-02 03:30:39,030 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_435-313305-0_sp1.1 from training. Duration: 0.6454375 2023-10-02 03:30:40,464 WARNING [train.py:1204] (2/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_235-307337-0_sp1.1 from training. Duration: 0.8545625 2023-10-02 03:30:40,557 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7762_210_1794_1_1528199508_4057408_419-125009-0_sp0.9 from training. Duration: 0.92225 2023-10-02 03:30:41,826 WARNING [train.py:1204] (2/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_215-141059-0_sp1.1 from training. Duration: 0.563625 2023-10-02 03:30:43,922 WARNING [train.py:1204] (2/4) Exclude cut with ID _27013_210_1389_1_1532437173240_3252649_222-125573-0_sp1.1 from training. Duration: 0.8909375 2023-10-02 03:30:44,226 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.conv_module2.balancer2.min_abs, batch_count=735280.0, ans=0.5 2023-10-02 03:30:46,785 WARNING [train.py:1204] (2/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_126-274694-0_sp1.1 from training. Duration: 0.8909375 2023-10-02 03:30:49,601 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_2592_210_2290_1_1525697025_4882925_157-70512-0_sp1.1 from training. Duration: 0.82725 2023-10-02 03:30:50,894 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_292-265928-0_sp0.9 from training. Duration: 0.5666875 2023-10-02 03:30:52,364 WARNING [train.py:1204] (2/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_1211-226066-0_sp1.1 from training. Duration: 0.6909375 2023-10-02 03:30:52,415 WARNING [train.py:1204] (2/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_394-265747-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 03:30:56,994 WARNING [train.py:1204] (2/4) Exclude cut with ID _48782_210_12658_1_1534467701544_2300422_239-67139-0_sp1.1 from training. Duration: 0.97275 2023-10-02 03:30:58,471 WARNING [train.py:1204] (2/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_329-113173-0_sp1.1 from training. Duration: 0.563625 2023-10-02 03:31:01,313 WARNING [train.py:1204] (2/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_81-75849-0_sp0.9 from training. Duration: 0.4333125 2023-10-02 03:31:02,747 WARNING [train.py:1204] (2/4) Exclude cut with ID _9235_210_5219_1_1528891154253_8046120_1003-101638-0 from training. Duration: 0.73 2023-10-02 03:31:02,778 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9309W0189-640316 from training. Duration: 0.64 2023-10-02 03:31:04,852 WARNING [train.py:1204] (2/4) Exclude cut with ID _45329_210_7005_1_1534157951211_6774339_503-287883-0_sp1.1 from training. Duration: 0.8 2023-10-02 03:31:12,208 WARNING [train.py:1204] (2/4) Exclude cut with ID _8833_210_5219_1_1528592665210_7553059_611-80313-0 from training. Duration: 0.64 2023-10-02 03:31:12,279 WARNING [train.py:1204] (2/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_335-106626-0_sp1.1 from training. Duration: 0.963625 2023-10-02 03:31:15,689 WARNING [train.py:1204] (2/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_237-44332-0_sp1.1 from training. Duration: 0.8545625 2023-10-02 03:31:18,434 WARNING [train.py:1204] (2/4) Exclude cut with ID _46952_210_5986_1_1534230010357_3107379_178-147341-0_sp1.1 from training. Duration: 0.97275 2023-10-02 03:31:19,820 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_13091_210_7925_1_1530609447_8987354_946-154426-0_sp1.1 from training. Duration: 0.936375 2023-10-02 03:31:19,842 WARNING [train.py:1204] (2/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_326-17061-0_sp1.1 from training. Duration: 0.8545625 2023-10-02 03:31:22,612 WARNING [train.py:1204] (2/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_38-169920-0_sp1.1 from training. Duration: 0.82725 2023-10-02 03:31:27,137 WARNING [train.py:1204] (2/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_283-220372-0 from training. Duration: 0.56 2023-10-02 03:31:27,144 WARNING [train.py:1204] (2/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_252-216674-0_sp1.1 from training. Duration: 0.4909375 2023-10-02 03:31:28,475 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_202-91290-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 03:31:29,811 WARNING [train.py:1204] (2/4) Exclude cut with ID _41314_210_12651_1_1533459476543_3766460_80-74184-0 from training. Duration: 0.83 2023-10-02 03:31:31,605 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.conv_skip_rate, batch_count=735546.6666666666, ans=0.0 2023-10-02 03:31:32,834 WARNING [train.py:1204] (2/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_411-271358-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 03:31:40,966 WARNING [train.py:1204] (2/4) Exclude cut with ID _68777_210_4353_1_1535810390731_3117904_129-166012-0 from training. Duration: 0.85 2023-10-02 03:31:42,347 WARNING [train.py:1204] (2/4) Exclude cut with ID _44982_210_1511_1_1534312820108_3726029_410-109759-0_sp1.1 from training. Duration: 0.963625 2023-10-02 03:31:42,360 WARNING [train.py:1204] (2/4) Exclude cut with ID _14224_210_4381_1_1530529062037_4264010_364-22817-0_sp1.1 from training. Duration: 0.6454375 2023-10-02 03:31:42,522 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.1.self_attn_weights.pos_emb_skip_rate, batch_count=735546.6666666666, ans=0.0 2023-10-02 03:31:43,831 WARNING [train.py:1204] (2/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_89-242335-0 from training. Duration: 0.95 2023-10-02 03:31:43,838 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_12868_210_6753_1_1530702479_3610564_151-325626-0 from training. Duration: 0.97 2023-10-02 03:31:43,839 WARNING [train.py:1204] (2/4) Exclude cut with ID _6275_210_2084_1_1528110062213_7669049_786-264332-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 03:31:47,190 WARNING [train.py:1204] (2/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_35-153886-0_sp1.1 from training. Duration: 0.763625 2023-10-02 03:31:48,516 INFO [train.py:1046] (2/4) Epoch 21, batch 4100, loss[loss=0.1506, simple_loss=0.2326, pruned_loss=0.03437, over 24656.00 frames. ], tot_loss[loss=0.1743, simple_loss=0.2507, pruned_loss=0.04892, over 4718411.91 frames. ], batch size: 65, lr: 4.87e-03, grad_scale: 8.0 2023-10-02 03:31:48,592 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_24007_210_1794_1_1531918925_1663875_116-100916-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 03:31:48,620 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_162-262717-0_sp1.1 from training. Duration: 0.7545625 2023-10-02 03:31:54,461 WARNING [train.py:1204] (2/4) Exclude cut with ID _8263_210_1664_1_1528438953494_294100_40-204731-0 from training. Duration: 0.5 2023-10-02 03:31:57,802 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_12868_210_6753_1_1530702479_3610564_416-293746-0 from training. Duration: 0.9 2023-10-02 03:31:57,930 WARNING [train.py:1204] (2/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_605-219732-0 from training. Duration: 0.94 2023-10-02 03:31:58,073 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.balancer2.prob, batch_count=735613.3333333334, ans=0.125 2023-10-02 03:31:59,306 WARNING [train.py:1204] (2/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_454-123002-0 from training. Duration: 0.69 2023-10-02 03:31:59,315 WARNING [train.py:1204] (2/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_25-304273-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 03:31:59,455 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.0.feed_forward3.hidden_balancer.prob, batch_count=735613.3333333334, ans=0.125 2023-10-02 03:32:00,685 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6860_210_4852_1_1527917457_3833327_451-39702-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 03:32:00,717 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_304-16505-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 03:32:02,014 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6291_210_3273_1_1527683649_1078498_4-266348-0_sp1.1 from training. Duration: 0.7090625 2023-10-02 03:32:02,098 WARNING [train.py:1204] (2/4) Exclude cut with ID ID0149W0001-753701 from training. Duration: 0.989 2023-10-02 03:32:06,836 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_41251_210_15368_1_1533429767_4820695_394-55393-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 03:32:06,910 WARNING [train.py:1204] (2/4) Exclude cut with ID _8793_210_6310_1_1528542985139_2310009_121-179782-0_sp1.1 from training. Duration: 0.72725 2023-10-02 03:32:06,924 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_2729_210_3528_1_1525863505_3882214_421-73047-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 03:32:06,994 WARNING [train.py:1204] (2/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_278-154492-0_sp0.9 from training. Duration: 0.8444375 2023-10-02 03:32:10,295 WARNING [train.py:1204] (2/4) Exclude cut with ID _14170_210_5219_1_1530525949868_4469650_412-317077-0_sp0.9 from training. Duration: 0.8333125 2023-10-02 03:32:11,678 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_727-138317-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 03:32:12,937 WARNING [train.py:1204] (2/4) Exclude cut with ID _25808_210_13292_1_1532343514562_7366109_345-335048-0_sp1.1 from training. Duration: 0.92725 2023-10-02 03:32:12,959 WARNING [train.py:1204] (2/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_506-23200-0 from training. Duration: 0.97 2023-10-02 03:32:13,038 WARNING [train.py:1204] (2/4) Exclude cut with ID _44718_210_5816_1_1534050051869_6469030_643-245187-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 03:32:13,041 WARNING [train.py:1204] (2/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_42-186162-0_sp1.1 from training. Duration: 0.836375 2023-10-02 03:32:14,178 WARNING [train.py:1204] (2/4) Exclude cut with ID _3231_210_2106_1_1526205864934_6784220_171-256004-0_sp1.1 from training. Duration: 0.7818125 2023-10-02 03:32:14,203 WARNING [train.py:1204] (2/4) Exclude cut with ID _25914_210_6400_1_1532141317058_644740_43-248478-0_sp1.1 from training. Duration: 0.936375 2023-10-02 03:32:15,528 WARNING [train.py:1204] (2/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_577-287150-0 from training. Duration: 0.82 2023-10-02 03:32:17,537 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_46015_210_7234_1_1533863319_3087837_376-276068-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 03:32:18,933 WARNING [train.py:1204] (2/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_51-200220-0 from training. Duration: 0.56 2023-10-02 03:32:20,427 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_452-96101-0_sp1.1 from training. Duration: 0.963625 2023-10-02 03:32:23,100 WARNING [train.py:1204] (2/4) Exclude cut with ID _13252_210_1925_1_1530671372047_3336909_263-168388-0_sp1.1 from training. Duration: 0.7818125 2023-10-02 03:32:23,100 WARNING [train.py:1204] (2/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_469-94673-0 from training. Duration: 0.79 2023-10-02 03:32:25,649 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.570e+02 1.845e+02 2.063e+02 2.238e+02 3.608e+02, threshold=4.126e+02, percent-clipped=0.0 2023-10-02 03:32:25,723 WARNING [train.py:1204] (2/4) Exclude cut with ID _14922_210_6228_1_1530707435273_3693310_303-228738-0_sp1.1 from training. Duration: 0.763625 2023-10-02 03:32:25,775 WARNING [train.py:1204] (2/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_262-250826-0_sp1.1 from training. Duration: 0.9 2023-10-02 03:32:27,117 WARNING [train.py:1204] (2/4) Exclude cut with ID _68777_210_4353_1_1535810390731_3117904_129-166012-0_sp1.1 from training. Duration: 0.77275 2023-10-02 03:32:28,836 WARNING [train.py:1204] (2/4) Exclude cut with ID _58895_210_8750_1_1535184081320_3649089_265-323810-0 from training. Duration: 0.96 2023-10-02 03:32:30,866 WARNING [train.py:1204] (2/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_120-76843-0_sp0.9 from training. Duration: 0.611125 2023-10-02 03:32:32,177 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7544_210_6753_1_1528587722_3576686_295-258479-0_sp0.9 from training. Duration: 0.9333125 2023-10-02 03:32:34,960 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7046_210_6753_1_1527895337_6920917_783-278109-0 from training. Duration: 0.77 2023-10-02 03:32:35,000 WARNING [train.py:1204] (2/4) Exclude cut with ID _42326_210_7770_1_1533814282531_3707269_113-308323-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 03:32:35,046 WARNING [train.py:1204] (2/4) Exclude cut with ID _7852_210_5219_1_1528286471316_8139400_242-105336-0_sp0.9 from training. Duration: 0.8 2023-10-02 03:32:39,770 WARNING [train.py:1204] (2/4) Exclude cut with ID _56235_210_10640_1_1534557335235_4161099_114-277905-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 03:32:45,573 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_13118_210_7925_1_1530755612_7608297_42-66683-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 03:32:45,800 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.bypass.scale_min, batch_count=735813.3333333334, ans=0.2 2023-10-02 03:32:48,409 WARNING [train.py:1204] (2/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_901-79438-0_sp1.1 from training. Duration: 0.8545625 2023-10-02 03:32:50,246 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_30280_210_1881_1_1532872779_3660139_211-182194-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 03:32:55,940 WARNING [train.py:1204] (2/4) Exclude cut with ID _14898_210_9206_1_1530766812377_4177190_621-45732-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 03:32:55,948 WARNING [train.py:1204] (2/4) Exclude cut with ID _35350_210_16040_1_1533040191183_4075299_224-133775-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 03:32:57,982 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.3.encoder.layers.0.feed_forward3.out_whiten, num_groups=1, num_channels=512, metric=8.63 vs. limit=15.0 2023-10-02 03:32:58,677 WARNING [train.py:1204] (2/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_147-293311-0_sp1.1 from training. Duration: 0.8545625 2023-10-02 03:33:00,074 WARNING [train.py:1204] (2/4) Exclude cut with ID _12302_210_5219_1_1529924446466_8107280_835-279754-0_sp1.1 from training. Duration: 0.8181875 2023-10-02 03:33:02,347 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.feed_forward1.out_proj.dropout_p, batch_count=735946.6666666666, ans=0.1 2023-10-02 03:33:03,329 INFO [train.py:1046] (2/4) Epoch 21, batch 4150, loss[loss=0.1696, simple_loss=0.2517, pruned_loss=0.04375, over 23347.00 frames. ], tot_loss[loss=0.1743, simple_loss=0.2509, pruned_loss=0.04883, over 4714038.61 frames. ], batch size: 93, lr: 4.86e-03, grad_scale: 8.0 2023-10-02 03:33:04,892 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8453_210_4211_1_1528502940_8562730_347-330673-0_sp0.9 from training. Duration: 0.8 2023-10-02 03:33:06,281 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_213-103282-0_sp0.9 from training. Duration: 0.8666875 2023-10-02 03:33:08,375 WARNING [train.py:1204] (2/4) Exclude cut with ID _49728_210_12651_1_1534041034294_6788150_66-43496-0_sp1.1 from training. Duration: 0.97275 2023-10-02 03:33:08,377 WARNING [train.py:1204] (2/4) Exclude cut with ID _19023_210_4353_1_1531378892552_5820780_28-36615-0_sp1.1 from training. Duration: 0.8818125 2023-10-02 03:33:11,068 WARNING [train.py:1204] (2/4) Exclude cut with ID _14349_210_8341_1_1530615710818_3442470_115-285975-0 from training. Duration: 0.86 2023-10-02 03:33:11,093 WARNING [train.py:1204] (2/4) Exclude cut with ID _12734_210_8341_1_1530183614296_3891929_236-47464-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 03:33:12,868 WARNING [train.py:1204] (2/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_177-236809-0 from training. Duration: 0.49 2023-10-02 03:33:12,940 WARNING [train.py:1204] (2/4) Exclude cut with ID _13678_210_7497_1_1530530069226_3463170_371-3221-0 from training. Duration: 0.9 2023-10-02 03:33:14,194 WARNING [train.py:1204] (2/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_48-338976-0 from training. Duration: 0.89 2023-10-02 03:33:14,316 WARNING [train.py:1204] (2/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_1167-309744-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 03:33:18,862 WARNING [train.py:1204] (2/4) Exclude cut with ID _30327_210_16049_1_1532484161295_3924169_401-70819-0_sp0.9 from training. Duration: 0.9555625 2023-10-02 03:33:18,879 WARNING [train.py:1204] (2/4) Exclude cut with ID _55134_210_21982_1_1534590028377_3882170_346-91022-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 03:33:21,041 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.balancer1.prob, batch_count=736013.3333333334, ans=0.125 2023-10-02 03:33:23,464 WARNING [train.py:1204] (2/4) Exclude cut with ID _14459_210_2228_1_1531443641946_7516700_174-118801-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 03:33:23,542 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_269-278006-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 03:33:24,876 WARNING [train.py:1204] (2/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_390-339137-0_sp0.9 from training. Duration: 0.62225 2023-10-02 03:33:26,335 WARNING [train.py:1204] (2/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_253-308377-0_sp1.1 from training. Duration: 0.4454375 2023-10-02 03:33:26,346 WARNING [train.py:1204] (2/4) Exclude cut with ID _23825_210_13449_1_1531915313526_3497150_240-271367-0_sp1.1 from training. Duration: 0.8818125 2023-10-02 03:33:27,717 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1015-343884-0_sp0.9 from training. Duration: 0.688875 2023-10-02 03:33:30,761 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.conv_module1.balancer2.min_abs, batch_count=736013.3333333334, ans=0.5 2023-10-02 03:33:31,815 WARNING [train.py:1204] (2/4) Exclude cut with ID _23514_210_1389_1_1531832139785_3031439_335-48210-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 03:33:35,318 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_309-65177-0_sp1.1 from training. Duration: 0.8 2023-10-02 03:33:35,967 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.3.encoder.layers.0.nonlin_attention.whiten2, num_groups=1, num_channels=512, metric=8.41 vs. limit=15.0 2023-10-02 03:33:37,838 WARNING [train.py:1204] (2/4) Exclude cut with ID _14368_210_4220_1_1530532714870_3595529_231-137892-0 from training. Duration: 0.99 2023-10-02 03:33:39,811 WARNING [train.py:1204] (2/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_57-270037-0 from training. Duration: 0.77 2023-10-02 03:33:39,821 WARNING [train.py:1204] (2/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_465-214726-0_sp1.1 from training. Duration: 0.7818125 2023-10-02 03:33:42,674 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.1.encoder.layers.1.feed_forward1.out_whiten, num_groups=1, num_channels=256, metric=6.25 vs. limit=15.0 2023-10-02 03:33:43,131 WARNING [train.py:1204] (2/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_106-300985-0 from training. Duration: 0.9 2023-10-02 03:33:43,135 WARNING [train.py:1204] (2/4) Exclude cut with ID _14632_210_9988_1_1530691279392_3923200_161-129530-0_sp1.1 from training. Duration: 0.87275 2023-10-02 03:33:43,146 WARNING [train.py:1204] (2/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_514-88370-0_sp1.1 from training. Duration: 0.963625 2023-10-02 03:33:44,636 WARNING [train.py:1204] (2/4) Exclude cut with ID _56495_210_4234_1_1534748080334_4136219_305-334505-0_sp1.1 from training. Duration: 0.92725 2023-10-02 03:33:45,953 WARNING [train.py:1204] (2/4) Exclude cut with ID _42875_210_14856_1_1533704397487_4012433_61-264914-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 03:33:50,781 WARNING [train.py:1204] (2/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_91-148831-0 from training. Duration: 0.99 2023-10-02 03:33:52,430 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.conv_module2.balancer2.prob, batch_count=736146.6666666666, ans=0.125 2023-10-02 03:33:53,533 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_435-313305-0_sp0.9 from training. Duration: 0.788875 2023-10-02 03:33:54,912 WARNING [train.py:1204] (2/4) Exclude cut with ID _16744_210_5986_1_1531724405384_3907567_242-111316-0_sp1.1 from training. Duration: 0.8454375 2023-10-02 03:33:55,892 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.0.layers.0.conv_module2.whiten, num_groups=1, num_channels=192, metric=8.25 vs. limit=15.0 2023-10-02 03:33:56,331 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_257-20733-0 from training. Duration: 0.76 2023-10-02 03:33:56,381 WARNING [train.py:1204] (2/4) Exclude cut with ID _8064_210_5399_1_1528372299291_4282973_4-91037-0_sp1.1 from training. Duration: 0.8 2023-10-02 03:33:57,720 WARNING [train.py:1204] (2/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_3-276701-0 from training. Duration: 0.91 2023-10-02 03:34:00,547 WARNING [train.py:1204] (2/4) Exclude cut with ID _3992_210_1747_1_1526644816031_3731060_36-147855-0_sp1.1 from training. Duration: 0.7090625 2023-10-02 03:34:00,639 WARNING [train.py:1204] (2/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_1296-298671-0_sp1.1 from training. Duration: 0.963625 2023-10-02 03:34:01,987 WARNING [train.py:1204] (2/4) Exclude cut with ID _68965_210_1883_1_1535698962272_3497490_309-334593-0_sp1.1 from training. Duration: 0.92725 2023-10-02 03:34:03,302 WARNING [train.py:1204] (2/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_128-296581-0 from training. Duration: 0.74 2023-10-02 03:34:03,302 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_17613_210_2151_1_1531137569_2366160_170-299079-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 03:34:03,304 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6287_210_3273_1_1527509506_2653716_182-199887-0_sp1.1 from training. Duration: 0.463625 2023-10-02 03:34:05,963 WARNING [train.py:1204] (2/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_80-135200-0_sp1.1 from training. Duration: 0.7181875 2023-10-02 03:34:07,330 WARNING [train.py:1204] (2/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_34-183685-0 from training. Duration: 0.98 2023-10-02 03:34:07,359 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8278_210_2550_1_1528637099_4220413_418-210155-0_sp1.1 from training. Duration: 0.92725 2023-10-02 03:34:07,362 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8853_210_3273_1_1528633899_818968_51-187685-0_sp0.9 from training. Duration: 0.7666875 2023-10-02 03:34:07,376 WARNING [train.py:1204] (2/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_332-221057-0_sp1.1 from training. Duration: 0.6181875 2023-10-02 03:34:08,258 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.ff2_skip_rate, batch_count=736213.3333333334, ans=0.0 2023-10-02 03:34:09,253 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_2221_210_1988_1_1525597051_4935541_415-296882-0 from training. Duration: 0.77 2023-10-02 03:34:09,286 WARNING [train.py:1204] (2/4) Exclude cut with ID _33640_210_2655_1_1533117605479_4855469_770-278185-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 03:34:09,321 WARNING [train.py:1204] (2/4) Exclude cut with ID _5287_210_2084_1_1527418883040_3878670_333-48508-0_sp1.1 from training. Duration: 0.5454375 2023-10-02 03:34:11,298 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_43802_210_2290_1_1533516203_8829504_142-276839-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 03:34:11,418 WARNING [train.py:1204] (2/4) Exclude cut with ID _28394_210_8913_1_1532430049518_3650118_126-139371-0_sp1.1 from training. Duration: 0.92725 2023-10-02 03:34:11,431 WARNING [train.py:1204] (2/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_76-203867-0 from training. Duration: 0.72 2023-10-02 03:34:12,776 WARNING [train.py:1204] (2/4) Exclude cut with ID _6194_210_1534_1_1527511826160_3941194_14-282876-0_sp0.9 from training. Duration: 0.788875 2023-10-02 03:34:16,808 WARNING [train.py:1204] (2/4) Exclude cut with ID _35355_210_16040_1_1533212988977_4016229_74-190076-0_sp1.1 from training. Duration: 0.72725 2023-10-02 03:34:18,534 INFO [train.py:1046] (2/4) Epoch 21, batch 4200, loss[loss=0.1656, simple_loss=0.2133, pruned_loss=0.05894, over 19542.00 frames. ], tot_loss[loss=0.1737, simple_loss=0.2503, pruned_loss=0.04856, over 4711073.46 frames. ], batch size: 388, lr: 4.86e-03, grad_scale: 8.0 2023-10-02 03:34:18,700 WARNING [train.py:1204] (2/4) Exclude cut with ID _33920_210_4220_1_1533257851500_3084399_198-334063-0 from training. Duration: 0.94 2023-10-02 03:34:20,130 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6291_210_3273_1_1527683649_1078498_4-266348-0_sp0.9 from training. Duration: 0.8666875 2023-10-02 03:34:22,847 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_23856_210_4238_1_1531994785_3087934_122-38075-0_sp1.1 from training. Duration: 0.936375 2023-10-02 03:34:24,238 WARNING [train.py:1204] (2/4) Exclude cut with ID _4697_210_2069_1_1526815801953_3823098_276-96593-0_sp0.9 from training. Duration: 0.8555625 2023-10-02 03:34:24,298 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_308-278734-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 03:34:24,299 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_5190_210_4557_1_1527245078_2965646_353-250603-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 03:34:27,117 WARNING [train.py:1204] (2/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_472-216663-0 from training. Duration: 0.92 2023-10-02 03:34:29,941 WARNING [train.py:1204] (2/4) Exclude cut with ID _7537_210_2084_1_1528502395431_3924359_14-312906-0 from training. Duration: 0.84 2023-10-02 03:34:31,219 WARNING [train.py:1204] (2/4) Exclude cut with ID _29003_210_19596_1_1532507391720_3894098_559-236660-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 03:34:32,711 WARNING [train.py:1204] (2/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_312-204798-0_sp1.1 from training. Duration: 0.8454375 2023-10-02 03:34:35,472 WARNING [train.py:1204] (2/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_17-309070-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 03:34:39,832 WARNING [train.py:1204] (2/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_344-152526-0_sp0.9 from training. Duration: 0.588875 2023-10-02 03:34:39,961 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_41452_210_7291_1_1533437687_3941281_405-11559-0_sp1.1 from training. Duration: 0.82725 2023-10-02 03:34:39,991 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_48730_210_5399_1_1533885886_2212769_157-124052-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 03:34:42,079 WARNING [train.py:1204] (2/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_415-78792-0 from training. Duration: 0.81 2023-10-02 03:34:42,084 WARNING [train.py:1204] (2/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_345-340690-0_sp1.1 from training. Duration: 0.8454375 2023-10-02 03:34:43,501 WARNING [train.py:1204] (2/4) Exclude cut with ID _23825_210_13449_1_1531915313526_3497150_124-8138-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 03:34:44,911 WARNING [train.py:1204] (2/4) Exclude cut with ID _49258_210_7994_1_1534122017863_3738861_194-24864-0_sp1.1 from training. Duration: 0.936375 2023-10-02 03:34:44,927 WARNING [train.py:1204] (2/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_369-222060-0_sp1.1 from training. Duration: 0.6454375 2023-10-02 03:34:46,371 WARNING [train.py:1204] (2/4) Exclude cut with ID _12369_210_4381_1_1530356078581_4388520_480-13146-0_sp0.9 from training. Duration: 0.9444375 2023-10-02 03:34:47,858 WARNING [train.py:1204] (2/4) Exclude cut with ID _13253_210_1925_1_1530757775819_3577873_191-144977-0 from training. Duration: 0.78 2023-10-02 03:34:47,882 WARNING [train.py:1204] (2/4) Exclude cut with ID _30304_210_6319_1_1532476827261_7582060_1128-140266-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 03:34:52,657 WARNING [train.py:1204] (2/4) Exclude cut with ID _8772_210_1850_1_1528635595972_3664011_247-336448-0_sp0.9 from training. Duration: 0.82225 2023-10-02 03:34:52,733 WARNING [train.py:1204] (2/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_318-59097-0_sp1.1 from training. Duration: 0.6909375 2023-10-02 03:34:54,164 WARNING [train.py:1204] (2/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_540-131164-0_sp1.1 from training. Duration: 0.836375 2023-10-02 03:34:54,352 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.bypass.scale_min, batch_count=736413.3333333334, ans=0.2 2023-10-02 03:34:55,340 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.596e+02 1.914e+02 2.073e+02 2.283e+02 3.336e+02, threshold=4.146e+02, percent-clipped=0.0 2023-10-02 03:34:55,683 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=736413.3333333334, ans=0.1 2023-10-02 03:34:56,952 WARNING [train.py:1204] (2/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_39-110836-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 03:34:58,459 WARNING [train.py:1204] (2/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_860-96198-0_sp1.1 from training. Duration: 0.663625 2023-10-02 03:34:58,469 WARNING [train.py:1204] (2/4) Exclude cut with ID _15133_210_6228_1_1530794512074_3080379_29-264458-0 from training. Duration: 0.58 2023-10-02 03:34:58,487 WARNING [train.py:1204] (2/4) Exclude cut with ID _55209_210_1531_1_1534675828996_4597160_797-348343-0_sp1.1 from training. Duration: 0.963625 2023-10-02 03:34:59,985 WARNING [train.py:1204] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_23-189234-0_sp1.1 from training. Duration: 0.7545625 2023-10-02 03:35:01,622 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.conv_module2.balancer1.prob, batch_count=736480.0, ans=0.125 2023-10-02 03:35:02,934 WARNING [train.py:1204] (2/4) Exclude cut with ID _5410_210_1388_1_1527397055795_1719020_39-112221-0_sp1.1 from training. Duration: 0.62725 2023-10-02 03:35:05,191 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_43802_210_2290_1_1533516203_8829504_100-348441-0_sp1.1 from training. Duration: 0.82725 2023-10-02 03:35:13,386 WARNING [train.py:1204] (2/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_143-86344-0_sp1.1 from training. Duration: 0.7 2023-10-02 03:35:14,870 WARNING [train.py:1204] (2/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_126-29104-0 from training. Duration: 0.85 2023-10-02 03:35:17,697 WARNING [train.py:1204] (2/4) Exclude cut with ID _39846_210_12651_1_1533603651992_1318030_138-31771-0_sp1.1 from training. Duration: 0.963625 2023-10-02 03:35:23,801 WARNING [train.py:1204] (2/4) Exclude cut with ID _2220_210_1819_1_1525509109898_3658680_50-99837-0_sp1.1 from training. Duration: 0.4909375 2023-10-02 03:35:23,867 WARNING [train.py:1204] (2/4) Exclude cut with ID _6274_210_2084_1_1527937583837_7494020_836-210550-0_sp1.1 from training. Duration: 0.97275 2023-10-02 03:35:24,026 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.bypass.scale_min, batch_count=736546.6666666666, ans=0.2 2023-10-02 03:35:25,268 WARNING [train.py:1204] (2/4) Exclude cut with ID _28323_210_16187_1_1532480395667_4100300_306-74561-0 from training. Duration: 0.94 2023-10-02 03:35:30,882 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_177-58005-0_sp1.1 from training. Duration: 0.563625 2023-10-02 03:35:33,676 INFO [train.py:1046] (2/4) Epoch 21, batch 4250, loss[loss=0.175, simple_loss=0.2512, pruned_loss=0.04941, over 24435.00 frames. ], tot_loss[loss=0.1736, simple_loss=0.2497, pruned_loss=0.04874, over 4716064.78 frames. ], batch size: 63, lr: 4.86e-03, grad_scale: 8.0 2023-10-02 03:35:35,107 WARNING [train.py:1204] (2/4) Exclude cut with ID _54871_210_22356_1_1534665798253_3856359_19-342632-0_sp1.1 from training. Duration: 0.82725 2023-10-02 03:35:35,121 WARNING [train.py:1204] (2/4) Exclude cut with ID _35355_210_16040_1_1533212988977_4016229_74-190076-0_sp0.9 from training. Duration: 0.888875 2023-10-02 03:35:39,673 WARNING [train.py:1204] (2/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_285-97786-0_sp1.1 from training. Duration: 0.97275 2023-10-02 03:35:44,557 WARNING [train.py:1204] (2/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_337-181502-0_sp0.9 from training. Duration: 0.72225 2023-10-02 03:35:44,594 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_353-182855-0 from training. Duration: 0.96 2023-10-02 03:35:44,624 WARNING [train.py:1204] (2/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_409-327730-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 03:35:49,431 WARNING [train.py:1204] (2/4) Exclude cut with ID _8030_210_7497_1_1528444886781_3531089_373-19403-0_sp1.1 from training. Duration: 0.97275 2023-10-02 03:35:53,910 WARNING [train.py:1204] (2/4) Exclude cut with ID _6789_210_1534_1_1527857937218_3118950_231-340237-0_sp1.1 from training. Duration: 0.936375 2023-10-02 03:35:56,788 WARNING [train.py:1204] (2/4) Exclude cut with ID _6493_210_1747_1_1528459210424_4012470_536-57832-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 03:35:56,800 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7046_210_6753_1_1527895337_6920917_756-116002-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 03:35:59,526 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_37152_210_9697_1_1533106573_3844188_29-239070-0_sp1.1 from training. Duration: 0.8909375 2023-10-02 03:35:59,529 WARNING [train.py:1204] (2/4) Exclude cut with ID _19023_210_4353_1_1531378892552_5820780_61-334539-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 03:36:00,909 WARNING [train.py:1204] (2/4) Exclude cut with ID _14170_210_5219_1_1530525949868_4469650_159-28488-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 03:36:02,338 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_40896_210_15710_1_1533433956_7100802_764-71272-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 03:36:03,639 WARNING [train.py:1204] (2/4) Exclude cut with ID _44902_210_16187_1_1533888006062_3792078_258-178274-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 03:36:05,028 WARNING [train.py:1204] (2/4) Exclude cut with ID _7537_210_2084_1_1528502395431_3924359_14-312906-0_sp1.1 from training. Duration: 0.763625 2023-10-02 03:36:05,113 WARNING [train.py:1204] (2/4) Exclude cut with ID _49643_210_7989_1_1534640244224_3939031_674-116639-0_sp1.1 from training. Duration: 0.963625 2023-10-02 03:36:06,568 WARNING [train.py:1204] (2/4) Exclude cut with ID _22826_210_9994_1_1531908201522_3279169_415-161-0 from training. Duration: 0.98 2023-10-02 03:36:09,735 WARNING [train.py:1204] (2/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_264-348896-0 from training. Duration: 0.85 2023-10-02 03:36:09,742 WARNING [train.py:1204] (2/4) Exclude cut with ID _51899_210_20034_1_1534129254466_3669220_233-161492-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 03:36:09,798 WARNING [train.py:1204] (2/4) Exclude cut with ID _31255_210_13427_1_1532865619495_3815143_233-200023-0_sp1.1 from training. Duration: 0.936375 2023-10-02 03:36:11,092 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_23850_210_4238_1_1532232597_1632433_100-229879-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 03:36:11,173 WARNING [train.py:1204] (2/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_375-245677-0_sp0.9 from training. Duration: 0.9 2023-10-02 03:36:11,176 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_40223_210_6228_1_1533298404_4812267_355-317415-0_sp1.1 from training. Duration: 0.97275 2023-10-02 03:36:11,234 WARNING [train.py:1204] (2/4) Exclude cut with ID _9679_210_7497_1_1529129212906_4363929_235-81488-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 03:36:11,979 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.2.encoder.layers.0.feed_forward1.out_whiten, num_groups=1, num_channels=384, metric=8.23 vs. limit=15.0 2023-10-02 03:36:13,864 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.2.encoder.layers.1.nonlin_attention.whiten2, num_groups=1, num_channels=384, metric=6.30 vs. limit=15.0 2023-10-02 03:36:16,274 WARNING [train.py:1204] (2/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_172-215932-0_sp1.1 from training. Duration: 0.6 2023-10-02 03:36:17,616 WARNING [train.py:1204] (2/4) Exclude cut with ID _12369_210_4381_1_1530356078581_4388520_64-298917-0_sp1.1 from training. Duration: 0.8 2023-10-02 03:36:17,872 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.bypass.scale_min, batch_count=736813.3333333334, ans=0.2 2023-10-02 03:36:18,558 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.0.layers.1.feed_forward1.out_whiten, num_groups=1, num_channels=192, metric=5.28 vs. limit=15.0 2023-10-02 03:36:21,753 WARNING [train.py:1204] (2/4) Exclude cut with ID _38020_210_5986_1_1533112252259_7006089_131-53533-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 03:36:23,170 WARNING [train.py:1204] (2/4) Exclude cut with ID _42334_210_7770_1_1534206610396_3605880_392-48154-0_sp1.1 from training. Duration: 0.963625 2023-10-02 03:36:25,131 WARNING [train.py:1204] (2/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_337-181502-0 from training. Duration: 0.65 2023-10-02 03:36:25,147 WARNING [train.py:1204] (2/4) Exclude cut with ID _6267_210_2084_1_1527764658526_3894730_375-219887-0_sp0.9 from training. Duration: 0.7444375 2023-10-02 03:36:25,230 WARNING [train.py:1204] (2/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_60-146189-0 from training. Duration: 0.98 2023-10-02 03:36:26,594 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_3133_210_2087_1_1526106446_2324690_243-143119-0_sp1.1 from training. Duration: 0.7 2023-10-02 03:36:28,064 WARNING [train.py:1204] (2/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_1267-272693-0_sp0.9 from training. Duration: 0.811125 2023-10-02 03:36:29,525 WARNING [train.py:1204] (2/4) Exclude cut with ID _69884_210_9771_1_1535846392818_4166169_83-182645-0_sp1.1 from training. Duration: 0.97275 2023-10-02 03:36:29,558 WARNING [train.py:1204] (2/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_403-292260-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 03:36:32,441 WARNING [train.py:1204] (2/4) Exclude cut with ID _55429_210_10626_1_1534467588527_7174143_377-169391-0 from training. Duration: 0.96 2023-10-02 03:36:32,659 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.conv_module1.balancer1.prob, batch_count=736880.0, ans=0.125 2023-10-02 03:36:33,773 WARNING [train.py:1204] (2/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_494-199538-0_sp1.1 from training. Duration: 0.6818125 2023-10-02 03:36:35,058 WARNING [train.py:1204] (2/4) Exclude cut with ID _3067_210_2667_1_1526212974640_4503168_119-175776-0_sp1.1 from training. Duration: 0.67275 2023-10-02 03:36:39,816 WARNING [train.py:1204] (2/4) Exclude cut with ID _3231_210_2106_1_1526205864934_6784220_183-303839-0_sp1.1 from training. Duration: 0.97275 2023-10-02 03:36:42,533 WARNING [train.py:1204] (2/4) Exclude cut with ID _18513_210_9206_1_1531618215223_3862620_165-38958-0_sp1.1 from training. Duration: 0.963625 2023-10-02 03:36:44,492 WARNING [train.py:1204] (2/4) Exclude cut with ID _41314_210_12651_1_1533459476543_3766460_87-272068-0_sp1.1 from training. Duration: 0.7545625 2023-10-02 03:36:44,589 WARNING [train.py:1204] (2/4) Exclude cut with ID _3541_210_2477_1_1526475408709_3263973_353-95-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 03:36:45,987 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_2009_210_2071_1_1525481134_4588937_73-302813-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 03:36:47,364 WARNING [train.py:1204] (2/4) Exclude cut with ID _64220_210_9527_1_1535250151772_4875259_88-95011-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 03:36:49,216 INFO [train.py:1046] (2/4) Epoch 21, batch 4300, loss[loss=0.1785, simple_loss=0.2583, pruned_loss=0.04933, over 24022.00 frames. ], tot_loss[loss=0.1732, simple_loss=0.2489, pruned_loss=0.04871, over 4709862.83 frames. ], batch size: 80, lr: 4.86e-03, grad_scale: 8.0 2023-10-02 03:36:49,300 WARNING [train.py:1204] (2/4) Exclude cut with ID _44902_210_16187_1_1533888006062_3792078_160-12107-0_sp1.1 from training. Duration: 0.92725 2023-10-02 03:36:49,307 WARNING [train.py:1204] (2/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_547-5424-0 from training. Duration: 0.94 2023-10-02 03:36:51,969 WARNING [train.py:1204] (2/4) Exclude cut with ID _21783_210_11357_1_1531742458730_3762679_268-85656-0_sp1.1 from training. Duration: 0.936375 2023-10-02 03:36:53,766 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.bypass_mid.scale_min, batch_count=736946.6666666666, ans=0.2 2023-10-02 03:36:54,889 WARNING [train.py:1204] (2/4) Exclude cut with ID _66210_210_12651_1_1535446812922_3365850_42-284985-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 03:36:56,769 WARNING [train.py:1204] (2/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_465-214726-0_sp0.9 from training. Duration: 0.9555625 2023-10-02 03:37:00,913 WARNING [train.py:1204] (2/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_50-95770-0_sp1.1 from training. Duration: 0.936375 2023-10-02 03:37:09,133 WARNING [train.py:1204] (2/4) Exclude cut with ID _30800_210_22382_1_1532499347904_4515959_269-77567-0_sp1.1 from training. Duration: 0.963625 2023-10-02 03:37:09,135 WARNING [train.py:1204] (2/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_7-111954-0 from training. Duration: 0.56 2023-10-02 03:37:09,235 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_43802_210_2290_1_1533516203_8829504_85-195240-0_sp0.9 from training. Duration: 0.9666875 2023-10-02 03:37:12,696 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_2822_210_2715_1_1526112586_1999587_165-36656-0_sp0.9 from training. Duration: 0.988875 2023-10-02 03:37:12,711 WARNING [train.py:1204] (2/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_181-78632-0_sp1.1 from training. Duration: 0.7090625 2023-10-02 03:37:12,735 WARNING [train.py:1204] (2/4) Exclude cut with ID IC0671W0044-331887 from training. Duration: 0.896 2023-10-02 03:37:15,384 WARNING [train.py:1204] (2/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_266-137104-0_sp1.1 from training. Duration: 0.5909375 2023-10-02 03:37:16,765 WARNING [train.py:1204] (2/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_137-116442-0_sp0.9 from training. Duration: 0.8666875 2023-10-02 03:37:20,272 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_23801_210_16223_1_1532066392_3294657_570-5439-0 from training. Duration: 0.97 2023-10-02 03:37:20,284 WARNING [train.py:1204] (2/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_29-280622-0_sp1.1 from training. Duration: 0.7454375 2023-10-02 03:37:21,507 WARNING [train.py:1204] (2/4) Exclude cut with ID 3557-8342-0013-54691-0 from training. Duration: 0.92 2023-10-02 03:37:22,903 WARNING [train.py:1204] (2/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_711-15688-0_sp1.1 from training. Duration: 0.3909375 2023-10-02 03:37:24,310 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_15126_210_6753_1_1530778677_2591185_120-277707-0_sp1.1 from training. Duration: 0.72725 2023-10-02 03:37:25,611 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.579e+02 1.828e+02 2.109e+02 2.471e+02 4.007e+02, threshold=4.219e+02, percent-clipped=0.0 2023-10-02 03:37:27,740 WARNING [train.py:1204] (2/4) Exclude cut with ID _14970_210_15709_1_1530775547361_1966965_8-169924-0_sp1.1 from training. Duration: 0.836375 2023-10-02 03:37:27,742 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_4692_210_3528_1_1526899755_4323254_107-44401-0_sp0.9 from training. Duration: 0.9555625 2023-10-02 03:37:29,107 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_3475_210_2028_1_1526193578_3622569_265-276625-0_sp0.9 from training. Duration: 0.8444375 2023-10-02 03:37:30,565 WARNING [train.py:1204] (2/4) Exclude cut with ID _55153_210_2253_1_1534384786677_3162096_148-79022-0_sp1.1 from training. Duration: 0.87275 2023-10-02 03:37:30,641 WARNING [train.py:1204] (2/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_292-36336-0_sp1.1 from training. Duration: 0.92725 2023-10-02 03:37:31,945 WARNING [train.py:1204] (2/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_329-82235-0 from training. Duration: 0.82 2023-10-02 03:37:32,021 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_11305_210_1794_1_1529750428_7178148_257-312870-0 from training. Duration: 0.88 2023-10-02 03:37:34,714 WARNING [train.py:1204] (2/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_436-4493-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 03:37:37,541 WARNING [train.py:1204] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_321-54129-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 03:37:37,548 WARNING [train.py:1204] (2/4) Exclude cut with ID _5287_210_2084_1_1527418883040_3878670_333-48508-0_sp0.9 from training. Duration: 0.6666875 2023-10-02 03:37:37,581 WARNING [train.py:1204] (2/4) Exclude cut with ID _26528_210_9070_1_1532570410842_3851310_244-278158-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 03:37:37,614 WARNING [train.py:1204] (2/4) Exclude cut with ID _46952_210_5986_1_1534230010357_3107379_232-344503-0_sp1.1 from training. Duration: 0.87275 2023-10-02 03:37:37,624 WARNING [train.py:1204] (2/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_43-206757-0 from training. Duration: 0.75 2023-10-02 03:37:38,873 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_4183_210_2087_1_1526729100_1869838_141-166600-0 from training. Duration: 0.95 2023-10-02 03:37:38,932 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_296-287678-0 from training. Duration: 0.95 2023-10-02 03:37:40,226 WARNING [train.py:1204] (2/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_265-60343-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 03:37:40,250 WARNING [train.py:1204] (2/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_332-256168-0 from training. Duration: 0.75 2023-10-02 03:37:41,536 WARNING [train.py:1204] (2/4) Exclude cut with ID _13679_210_7497_1_1530534293662_3355098_411-186088-0 from training. Duration: 0.94 2023-10-02 03:37:44,684 WARNING [train.py:1204] (2/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_458-126779-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 03:37:46,118 WARNING [train.py:1204] (2/4) Exclude cut with ID IC0803W0380-397572 from training. Duration: 0.032 2023-10-02 03:37:46,164 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_278-272612-0_sp1.1 from training. Duration: 0.663625 2023-10-02 03:37:49,435 WARNING [train.py:1204] (2/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_491-263101-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 03:37:49,440 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_53555_210_5632_1_1534595220_7232297_334-336778-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 03:37:50,845 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_50-3289-0 from training. Duration: 0.91 2023-10-02 03:37:52,683 WARNING [train.py:1204] (2/4) Exclude cut with ID _8699_210_2069_1_1528630346831_3960409_155-39766-0_sp0.9 from training. Duration: 0.8666875 2023-10-02 03:37:52,687 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_502-82145-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 03:37:53,997 WARNING [train.py:1204] (2/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_550-43132-0_sp1.1 from training. Duration: 0.97275 2023-10-02 03:37:54,027 WARNING [train.py:1204] (2/4) Exclude cut with ID _14998_210_5219_1_1530705582080_7474970_446-112117-0_sp1.1 from training. Duration: 0.8181875 2023-10-02 03:37:55,370 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_3691_210_3528_1_1526468169_4062053_355-334432-0_sp1.1 from training. Duration: 0.7909375 2023-10-02 03:37:56,790 WARNING [train.py:1204] (2/4) Exclude cut with ID _58605_210_12210_1_1534723195939_3904259_239-94720-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 03:37:59,951 WARNING [train.py:1204] (2/4) Exclude cut with ID _49376_210_5986_1_1533985278732_3370740_391-344072-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 03:38:00,019 WARNING [train.py:1204] (2/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_558-322014-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 03:38:00,047 WARNING [train.py:1204] (2/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_256-32662-0_sp1.1 from training. Duration: 0.8181875 2023-10-02 03:38:02,689 INFO [train.py:1046] (2/4) Epoch 21, batch 4350, loss[loss=0.1794, simple_loss=0.2519, pruned_loss=0.05347, over 23577.00 frames. ], tot_loss[loss=0.174, simple_loss=0.2503, pruned_loss=0.04885, over 4724529.33 frames. ], batch size: 232, lr: 4.86e-03, grad_scale: 8.0 2023-10-02 03:38:05,668 WARNING [train.py:1204] (2/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_572-185150-0 from training. Duration: 0.97 2023-10-02 03:38:05,704 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_89-35748-0_sp0.9 from training. Duration: 0.87775 2023-10-02 03:38:06,544 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.1.encoder.layers.0.conv_module2.whiten, num_groups=1, num_channels=256, metric=9.68 vs. limit=15.0 2023-10-02 03:38:09,886 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6291_210_3273_1_1527683649_1078498_88-66747-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 03:38:13,231 WARNING [train.py:1204] (2/4) Exclude cut with ID _19504_210_9994_1_1531648863242_3325220_357-237029-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 03:38:16,214 WARNING [train.py:1204] (2/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_302-2550-0_sp1.1 from training. Duration: 0.736375 2023-10-02 03:38:16,214 WARNING [train.py:1204] (2/4) Exclude cut with ID _9461_210_4557_1_1529060514726_3677451_59-194017-0_sp1.1 from training. Duration: 0.963625 2023-10-02 03:38:21,053 WARNING [train.py:1204] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_665-196155-0_sp1.1 from training. Duration: 0.6090625 2023-10-02 03:38:25,624 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_326-315007-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 03:38:27,106 WARNING [train.py:1204] (2/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_105-328174-0_sp1.1 from training. Duration: 0.6909375 2023-10-02 03:38:28,385 WARNING [train.py:1204] (2/4) Exclude cut with ID _56494_210_4220_1_1535284837625_3033169_106-263722-0_sp1.1 from training. Duration: 0.97275 2023-10-02 03:38:31,249 WARNING [train.py:1204] (2/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_770-200430-0_sp0.9 from training. Duration: 0.988875 2023-10-02 03:38:33,113 WARNING [train.py:1204] (2/4) Exclude cut with ID _65640_210_6310_1_1535368201445_3173250_209-13008-0_sp1.1 from training. Duration: 0.9 2023-10-02 03:38:33,204 WARNING [train.py:1204] (2/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_212-149947-0_sp0.9 from training. Duration: 0.811125 2023-10-02 03:38:34,663 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder_embed.conv.5.prob, batch_count=737413.3333333334, ans=0.125 2023-10-02 03:38:37,499 WARNING [train.py:1204] (2/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_188-171673-0 from training. Duration: 0.58 2023-10-02 03:38:38,763 WARNING [train.py:1204] (2/4) Exclude cut with ID _58895_210_8750_1_1535184081320_3649089_286-281724-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 03:38:38,842 WARNING [train.py:1204] (2/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_16-254909-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 03:38:44,877 WARNING [train.py:1204] (2/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_52-95655-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 03:38:47,673 WARNING [train.py:1204] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_554-45617-0 from training. Duration: 0.99 2023-10-02 03:38:51,147 WARNING [train.py:1204] (2/4) Exclude cut with ID _14683_210_5219_1_1530698352608_3974859_473-344611-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 03:38:52,998 WARNING [train.py:1204] (2/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_17-25045-0_sp0.9 from training. Duration: 0.6555625 2023-10-02 03:38:57,157 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9072W0003-530637 from training. Duration: 0.768 2023-10-02 03:38:59,793 WARNING [train.py:1204] (2/4) Exclude cut with ID _38670_210_15379_1_1533448810659_4391059_76-74025-0_sp1.1 from training. Duration: 0.936375 2023-10-02 03:38:59,842 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6291_210_3273_1_1527683649_1078498_74-292608-0_sp0.9 from training. Duration: 0.8 2023-10-02 03:39:01,185 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9084W0025-536862 from training. Duration: 0.896 2023-10-02 03:39:01,244 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9087W0021-538392 from training. Duration: 0.768 2023-10-02 03:39:01,254 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7543_210_6753_1_1528543323_645515_56-163234-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 03:39:02,610 WARNING [train.py:1204] (2/4) Exclude cut with ID _45560_210_21982_1_1533812603951_3584999_44-332643-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 03:39:02,694 WARNING [train.py:1204] (2/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_1285-256622-0_sp1.1 from training. Duration: 0.763625 2023-10-02 03:39:04,457 WARNING [train.py:1204] (2/4) Exclude cut with ID _52781_210_16187_1_1534297729099_1118149_141-325632-0_sp1.1 from training. Duration: 0.936375 2023-10-02 03:39:05,831 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_40896_210_15710_1_1533433956_7100802_1051-137631-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 03:39:05,867 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_13091_210_7925_1_1530609447_8987354_233-18333-0_sp1.1 from training. Duration: 0.97275 2023-10-02 03:39:08,631 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_34008_210_10431_1_1533345424_8183247_662-317331-0 from training. Duration: 0.94 2023-10-02 03:39:08,637 WARNING [train.py:1204] (2/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_291-248920-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 03:39:08,638 WARNING [train.py:1204] (2/4) Exclude cut with ID _65263_210_22356_1_1535763598691_3921037_274-288481-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 03:39:08,669 WARNING [train.py:1204] (2/4) Exclude cut with ID _5189_210_4557_1_1527248319428_3769229_233-168080-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 03:39:10,017 WARNING [train.py:1204] (2/4) Exclude cut with ID _54871_210_22356_1_1534665798253_3856359_135-184281-0 from training. Duration: 0.99 2023-10-02 03:39:11,405 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9123W0012-555972 from training. Duration: 0.896 2023-10-02 03:39:11,411 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9123W0118-556077 from training. Duration: 0.896 2023-10-02 03:39:11,428 WARNING [train.py:1204] (2/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_406-275649-0 from training. Duration: 0.42 2023-10-02 03:39:12,960 WARNING [train.py:1204] (2/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_309-13684-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 03:39:14,216 WARNING [train.py:1204] (2/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_129-337802-0_sp1.1 from training. Duration: 0.6545625 2023-10-02 03:39:14,241 WARNING [train.py:1204] (2/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_649-266825-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 03:39:14,289 WARNING [train.py:1204] (2/4) Exclude cut with ID _40519_210_13794_1_1533715228655_7337390_895-224742-0_sp1.1 from training. Duration: 0.92725 2023-10-02 03:39:17,362 INFO [train.py:1046] (2/4) Epoch 21, batch 4400, loss[loss=0.1752, simple_loss=0.2462, pruned_loss=0.05209, over 23800.00 frames. ], tot_loss[loss=0.1749, simple_loss=0.2511, pruned_loss=0.04939, over 4718463.22 frames. ], batch size: 150, lr: 4.86e-03, grad_scale: 16.0 2023-10-02 03:39:17,402 WARNING [train.py:1204] (2/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_234-14174-0 from training. Duration: 0.82 2023-10-02 03:39:18,697 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9156W0009-572502 from training. Duration: 0.768 2023-10-02 03:39:18,704 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_424-245529-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 03:39:22,268 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.2.encoder.layers.0.self_attn_weights.whiten_keys, num_groups=4, num_channels=128, metric=4.62 vs. limit=6.0 2023-10-02 03:39:23,106 WARNING [train.py:1204] (2/4) Exclude cut with ID _26528_210_9070_1_1532570410842_3851310_232-10149-0_sp1.1 from training. Duration: 0.963625 2023-10-02 03:39:23,113 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_17292_210_6753_1_1531125388_2943734_152-30410-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 03:39:23,392 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.bypass.skip_rate, batch_count=737613.3333333334, ans=0.09899494936611666 2023-10-02 03:39:25,156 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_35441_210_16554_1_1533347572_4092680_291-82075-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 03:39:26,642 WARNING [train.py:1204] (2/4) Exclude cut with ID _12369_210_4381_1_1530356078581_4388520_64-298917-0 from training. Duration: 0.88 2023-10-02 03:39:26,659 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_5618_210_1874_1_1527311008_5851_1-214501-0_sp0.9 from training. Duration: 0.7655625 2023-10-02 03:39:27,954 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_9460_210_4211_1_1528954587_10049667_572-37035-0 from training. Duration: 0.8 2023-10-02 03:39:27,973 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9193W0023-590322 from training. Duration: 0.896 2023-10-02 03:39:28,068 WARNING [train.py:1204] (2/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_206-10097-0_sp1.1 from training. Duration: 0.4454375 2023-10-02 03:39:29,396 WARNING [train.py:1204] (2/4) Exclude cut with ID _55134_210_21982_1_1534590028377_3882170_17-51731-0_sp1.1 from training. Duration: 0.963625 2023-10-02 03:39:30,824 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_14953_210_2151_1_1530701710_5616165_710-165400-0 from training. Duration: 0.87 2023-10-02 03:39:32,741 WARNING [train.py:1204] (2/4) Exclude cut with ID _53203_210_2087_1_1534246218309_475590_61-85777-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 03:39:34,150 WARNING [train.py:1204] (2/4) Exclude cut with ID _55844_210_2346_1_1534471212569_7625707_1050-10453-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 03:39:34,160 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9218W0013-603297 from training. Duration: 0.896 2023-10-02 03:39:34,400 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.balancer2.prob, batch_count=737680.0, ans=0.125 2023-10-02 03:39:38,240 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_20120_210_6753_1_1531643035_5482859_267-102818-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 03:39:38,240 WARNING [train.py:1204] (2/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_191-41023-0 from training. Duration: 0.71 2023-10-02 03:39:38,293 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9231W0202-610227 from training. Duration: 0.896 2023-10-02 03:39:41,182 WARNING [train.py:1204] (2/4) Exclude cut with ID _6537_210_1883_1_1527987559710_3597129_382-133062-0 from training. Duration: 0.68 2023-10-02 03:39:42,535 WARNING [train.py:1204] (2/4) Exclude cut with ID _28427_210_4220_1_1532581240967_3061069_43-84928-0 from training. Duration: 0.99 2023-10-02 03:39:42,558 WARNING [train.py:1204] (2/4) Exclude cut with ID _59256_210_5986_1_1535076085997_3589120_121-237386-0 from training. Duration: 0.96 2023-10-02 03:39:42,578 WARNING [train.py:1204] (2/4) Exclude cut with ID _8480_210_1874_1_1528589391205_6728330_137-256986-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 03:39:42,672 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_9417_210_1794_1_1529235261_4614131_479-346963-0_sp1.1 from training. Duration: 0.936375 2023-10-02 03:39:43,893 WARNING [train.py:1204] (2/4) Exclude cut with ID _26474_210_9570_1_1532520022699_3623380_267-7362-0_sp1.1 from training. Duration: 0.936375 2023-10-02 03:39:43,957 WARNING [train.py:1204] (2/4) Exclude cut with ID _55328_210_27767_1_1534580973260_4088170_287-61272-0_sp1.1 from training. Duration: 0.97275 2023-10-02 03:39:45,364 WARNING [train.py:1204] (2/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_589-9070-0 from training. Duration: 0.71 2023-10-02 03:39:45,372 WARNING [train.py:1204] (2/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_148-158509-0_sp1.1 from training. Duration: 0.37275 2023-10-02 03:39:47,230 WARNING [train.py:1204] (2/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_164-184548-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 03:39:47,546 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.conv_module2.balancer2.prob, batch_count=737746.6666666666, ans=0.125 2023-10-02 03:39:48,704 WARNING [train.py:1204] (2/4) Exclude cut with ID _14970_210_15709_1_1530775547361_1966965_6-23328-0_sp1.1 from training. Duration: 0.7818125 2023-10-02 03:39:48,709 WARNING [train.py:1204] (2/4) Exclude cut with ID _29303_210_2575_1_1532392237303_7410649_612-165069-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 03:39:50,308 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.feed_forward1.hidden_balancer.prob, batch_count=737746.6666666666, ans=0.125 2023-10-02 03:39:51,952 WARNING [train.py:1204] (2/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_383-321224-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 03:39:51,994 WARNING [train.py:1204] (2/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_283-160525-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 03:39:52,005 WARNING [train.py:1204] (2/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_505-97987-0 from training. Duration: 0.86 2023-10-02 03:39:53,347 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9309W0190-640317 from training. Duration: 0.768 2023-10-02 03:39:55,150 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.620e+02 1.905e+02 2.159e+02 2.393e+02 3.383e+02, threshold=4.317e+02, percent-clipped=0.0 2023-10-02 03:39:55,471 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=737746.6666666666, ans=0.1 2023-10-02 03:39:56,649 WARNING [train.py:1204] (2/4) Exclude cut with ID _50093_210_4220_1_1534417141207_3432299_280-297390-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 03:40:04,080 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_17292_210_6753_1_1531125388_2943734_320-71385-0_sp1.1 from training. Duration: 0.97275 2023-10-02 03:40:05,550 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_213-103282-0 from training. Duration: 0.78 2023-10-02 03:40:09,647 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7615_210_4211_1_1528110965_4580160_72-235211-0_sp0.9 from training. Duration: 0.8444375 2023-10-02 03:40:11,137 WARNING [train.py:1204] (2/4) Exclude cut with ID _14867_210_1968_1_1530684179890_3846969_173-320977-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 03:40:15,149 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7046_210_6753_1_1527895337_6920917_783-278109-0_sp0.9 from training. Duration: 0.8555625 2023-10-02 03:40:15,199 WARNING [train.py:1204] (2/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_90-30673-0 from training. Duration: 0.84 2023-10-02 03:40:16,569 WARNING [train.py:1204] (2/4) Exclude cut with ID _22826_210_9994_1_1531908201522_3279169_415-161-0_sp1.1 from training. Duration: 0.8909375 2023-10-02 03:40:16,583 WARNING [train.py:1204] (2/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_86-66192-0_sp0.9 from training. Duration: 0.611125 2023-10-02 03:40:16,586 WARNING [train.py:1204] (2/4) Exclude cut with ID _4626_210_3532_1_1526817652717_3916859_446-206656-0_sp1.1 from training. Duration: 0.6909375 2023-10-02 03:40:16,664 WARNING [train.py:1204] (2/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_166-128941-0_sp0.9 from training. Duration: 0.6 2023-10-02 03:40:20,306 WARNING [train.py:1204] (2/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_345-167986-0 from training. Duration: 0.84 2023-10-02 03:40:23,081 WARNING [train.py:1204] (2/4) Exclude cut with ID _6275_210_2084_1_1528110062213_7669049_973-198085-0 from training. Duration: 0.75 2023-10-02 03:40:26,285 WARNING [train.py:1204] (2/4) Exclude cut with ID _60057_210_10640_1_1534903065793_3333150_171-181133-0 from training. Duration: 0.88 2023-10-02 03:40:26,307 WARNING [train.py:1204] (2/4) Exclude cut with ID _19504_210_9994_1_1531648863242_3325220_336-196513-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 03:40:26,321 WARNING [train.py:1204] (2/4) Exclude cut with ID _6493_210_1747_1_1528459210424_4012470_513-140967-0 from training. Duration: 0.79 2023-10-02 03:40:26,393 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6673_210_3273_1_1527767790_3173240_327-306185-0_sp0.9 from training. Duration: 0.92225 2023-10-02 03:40:29,228 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1032-236863-0_sp1.1 from training. Duration: 0.763625 2023-10-02 03:40:30,665 WARNING [train.py:1204] (2/4) Exclude cut with ID _14867_210_1968_1_1530684179890_3846969_413-29292-0 from training. Duration: 0.9 2023-10-02 03:40:31,974 INFO [train.py:1046] (2/4) Epoch 21, batch 4450, loss[loss=0.1578, simple_loss=0.2303, pruned_loss=0.04271, over 17161.00 frames. ], tot_loss[loss=0.1754, simple_loss=0.2521, pruned_loss=0.04934, over 4725759.04 frames. ], batch size: 36, lr: 4.86e-03, grad_scale: 8.0 2023-10-02 03:40:32,263 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.conv_module1.balancer2.prob, batch_count=737946.6666666666, ans=0.125 2023-10-02 03:40:33,541 WARNING [train.py:1204] (2/4) Exclude cut with ID _47251_210_18628_1_1533814063128_4137250_186-343322-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 03:40:36,539 WARNING [train.py:1204] (2/4) Exclude cut with ID _56235_210_10640_1_1534557335235_4161099_624-46091-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 03:40:36,590 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_791-197891-0_sp0.9 from training. Duration: 0.9333125 2023-10-02 03:40:36,907 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.conv_module2.balancer2.prob, batch_count=737946.6666666666, ans=0.125 2023-10-02 03:40:38,364 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.nonlin_attention.balancer.prob, batch_count=737946.6666666666, ans=0.125 2023-10-02 03:40:41,039 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_50050_210_16223_1_1535435796_7290138_786-288457-0_sp1.1 from training. Duration: 0.92725 2023-10-02 03:40:41,064 WARNING [train.py:1204] (2/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_213-227283-0_sp1.1 from training. Duration: 0.963625 2023-10-02 03:40:43,035 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.4.encoder.layers.0.conv_module1.whiten, num_groups=1, num_channels=384, metric=6.89 vs. limit=15.0 2023-10-02 03:40:45,186 WARNING [train.py:1204] (2/4) Exclude cut with ID _42659_210_2070_1_1533607442765_3120380_58-341121-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 03:40:47,874 WARNING [train.py:1204] (2/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_506-23200-0_sp1.1 from training. Duration: 0.8818125 2023-10-02 03:40:49,910 WARNING [train.py:1204] (2/4) Exclude cut with ID _14170_210_5219_1_1530525949868_4469650_336-213530-0_sp1.1 from training. Duration: 0.8181875 2023-10-02 03:40:49,928 WARNING [train.py:1204] (2/4) Exclude cut with ID _64135_210_2253_1_1535677267876_7608972_180-5312-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 03:40:52,545 WARNING [train.py:1204] (2/4) Exclude cut with ID _12302_210_5219_1_1529924446466_8107280_835-279754-0 from training. Duration: 0.9 2023-10-02 03:40:52,547 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_20120_210_6753_1_1531643035_5482859_510-96996-0_sp1.1 from training. Duration: 0.7909375 2023-10-02 03:40:53,956 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_15517_210_6753_1_1530954008_3487336_216-187834-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 03:40:53,982 WARNING [train.py:1204] (2/4) Exclude cut with ID _19504_210_9994_1_1531648863242_3325220_414-113427-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 03:40:53,983 WARNING [train.py:1204] (2/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_264-108685-0_sp0.9 from training. Duration: 0.611125 2023-10-02 03:40:56,163 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_5438_210_1794_1_1527336687_2600009_104-288274-0_sp0.9 from training. Duration: 0.6444375 2023-10-02 03:41:01,686 WARNING [train.py:1204] (2/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_520-39496-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 03:41:01,749 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_37818_210_4238_1_1533620910_4720083_315-308565-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 03:41:02,089 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=738080.0, ans=0.1 2023-10-02 03:41:03,059 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_2009_210_2071_1_1525481134_4588937_385-302100-0_sp1.1 from training. Duration: 0.7909375 2023-10-02 03:41:04,361 WARNING [train.py:1204] (2/4) Exclude cut with ID _58895_210_8750_1_1535184081320_3649089_277-53219-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 03:41:06,244 WARNING [train.py:1204] (2/4) Exclude cut with ID _26664_210_7770_1_1532258943280_3418270_121-111258-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 03:41:07,738 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.conv_module2.balancer1.prob, batch_count=738080.0, ans=0.125 2023-10-02 03:41:10,313 WARNING [train.py:1204] (2/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_99-167392-0_sp1.1 from training. Duration: 0.5545625 2023-10-02 03:41:11,733 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1113-7369-0 from training. Duration: 0.85 2023-10-02 03:41:13,034 WARNING [train.py:1204] (2/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_352-59958-0 from training. Duration: 0.86 2023-10-02 03:41:13,038 WARNING [train.py:1204] (2/4) Exclude cut with ID _14886_210_4220_1_1530702020355_3516190_159-73748-0_sp1.1 from training. Duration: 0.7545625 2023-10-02 03:41:14,524 WARNING [train.py:1204] (2/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_131-119569-0_sp1.1 from training. Duration: 0.92725 2023-10-02 03:41:15,916 WARNING [train.py:1204] (2/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_216-343007-0 from training. Duration: 0.66 2023-10-02 03:41:18,845 WARNING [train.py:1204] (2/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_113-92608-0_sp0.9 from training. Duration: 0.6 2023-10-02 03:41:21,657 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_40047_210_2290_1_1533290519_2374311_132-123271-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 03:41:23,594 WARNING [train.py:1204] (2/4) Exclude cut with ID _8793_210_6310_1_1528542985139_2310009_121-179782-0 from training. Duration: 0.8 2023-10-02 03:41:25,276 WARNING [train.py:1204] (2/4) Exclude cut with ID _43997_210_4525_1_1533538632555_5189329_283-218639-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 03:41:25,279 WARNING [train.py:1204] (2/4) Exclude cut with ID _49376_210_5986_1_1533985278732_3370740_297-78397-0_sp1.1 from training. Duration: 0.936375 2023-10-02 03:41:25,301 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_3475_210_2028_1_1526193578_3622569_377-166133-0_sp0.9 from training. Duration: 0.9555625 2023-10-02 03:41:25,307 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_520-213149-0_sp1.1 from training. Duration: 0.92725 2023-10-02 03:41:26,734 WARNING [train.py:1204] (2/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_567-144483-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 03:41:31,436 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_4183_210_2087_1_1526729100_1869838_143-280962-0_sp0.9 from training. Duration: 0.62225 2023-10-02 03:41:31,480 WARNING [train.py:1204] (2/4) Exclude cut with ID _49643_210_7989_1_1534640244224_3939031_611-185515-0 from training. Duration: 0.94 2023-10-02 03:41:32,908 WARNING [train.py:1204] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_88-341718-0_sp0.9 from training. Duration: 0.7333125 2023-10-02 03:41:34,315 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_51225_210_3601_1_1534400634_2327383_49-235441-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 03:41:36,231 WARNING [train.py:1204] (2/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_240-294400-0_sp1.1 from training. Duration: 0.936375 2023-10-02 03:41:37,585 WARNING [train.py:1204] (2/4) Exclude cut with ID _55429_210_10626_1_1534467588527_7174143_640-304825-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 03:41:37,608 WARNING [train.py:1204] (2/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_187-221566-0_sp1.1 from training. Duration: 0.3909375 2023-10-02 03:41:40,476 WARNING [train.py:1204] (2/4) Exclude cut with ID _7606_210_4915_1_1528894253277_5080690_127-177761-0_sp0.9 from training. Duration: 0.711125 2023-10-02 03:41:41,954 WARNING [train.py:1204] (2/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_430-116139-0 from training. Duration: 0.85 2023-10-02 03:41:42,144 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.conv_module2.balancer2.prob, batch_count=738213.3333333334, ans=0.125 2023-10-02 03:41:43,379 WARNING [train.py:1204] (2/4) Exclude cut with ID _3675_210_4557_1_1526639934408_4182089_456-18305-0_sp0.9 from training. Duration: 0.8444375 2023-10-02 03:41:46,095 INFO [train.py:1046] (2/4) Epoch 21, batch 4500, loss[loss=0.1798, simple_loss=0.2561, pruned_loss=0.05175, over 23273.00 frames. ], tot_loss[loss=0.1757, simple_loss=0.2528, pruned_loss=0.04931, over 4720877.39 frames. ], batch size: 105, lr: 4.86e-03, grad_scale: 8.0 2023-10-02 03:41:47,477 WARNING [train.py:1204] (2/4) Exclude cut with ID _18513_210_9206_1_1531618215223_3862620_402-5957-0_sp1.1 from training. Duration: 0.9 2023-10-02 03:41:48,937 WARNING [train.py:1204] (2/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_465-156500-0 from training. Duration: 0.72 2023-10-02 03:41:48,938 WARNING [train.py:1204] (2/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_714-310538-0 from training. Duration: 0.86 2023-10-02 03:41:51,581 WARNING [train.py:1204] (2/4) Exclude cut with ID _39411_210_5986_1_1533538825030_3343649_335-301187-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 03:41:52,230 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.3.encoder.layers.0.nonlin_attention.whiten2, num_groups=1, num_channels=512, metric=8.30 vs. limit=15.0 2023-10-02 03:41:53,616 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.attention_skip_rate, batch_count=738280.0, ans=0.0 2023-10-02 03:41:54,072 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.3.encoder.layers.1.self_attn_weights.whiten_keys, num_groups=8, num_channels=256, metric=4.91 vs. limit=6.0 2023-10-02 03:41:56,654 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_41750_210_1960_1_1533520678_3948042_241-158616-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 03:41:57,959 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_46013_210_7234_1_1533776362_3744032_50-141535-0_sp1.1 from training. Duration: 0.9 2023-10-02 03:41:58,171 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.conv_module2.balancer2.prob, batch_count=738280.0, ans=0.125 2023-10-02 03:41:59,941 WARNING [train.py:1204] (2/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_43-206757-0_sp0.9 from training. Duration: 0.8333125 2023-10-02 03:42:00,003 WARNING [train.py:1204] (2/4) Exclude cut with ID _22961_210_8254_1_1531976366672_4278180_172-276296-0_sp1.1 from training. Duration: 0.97275 2023-10-02 03:42:01,271 WARNING [train.py:1204] (2/4) Exclude cut with ID _17956_210_4353_1_1531209568125_6979970_1305-301903-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 03:42:01,302 WARNING [train.py:1204] (2/4) Exclude cut with ID _28323_210_16187_1_1532480395667_4100300_604-44507-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 03:42:03,109 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.feed_forward1.out_proj.dropout_p, batch_count=738346.6666666666, ans=0.1 2023-10-02 03:42:11,755 WARNING [train.py:1204] (2/4) Exclude cut with ID _23514_210_1389_1_1531832139785_3031439_88-337544-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 03:42:13,050 WARNING [train.py:1204] (2/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_932-3672-0_sp1.1 from training. Duration: 0.963625 2023-10-02 03:42:15,783 WARNING [train.py:1204] (2/4) Exclude cut with ID _46502_210_13794_1_1533724452198_3657159_207-109991-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 03:42:17,096 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_216-73841-0_sp1.1 from training. Duration: 0.87275 2023-10-02 03:42:17,182 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8453_210_4211_1_1528502940_8562730_347-330673-0_sp1.1 from training. Duration: 0.6545625 2023-10-02 03:42:21,507 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_4183_210_2087_1_1526729100_1869838_143-280962-0_sp1.1 from training. Duration: 0.5090625 2023-10-02 03:42:24,900 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.629e+02 1.913e+02 2.114e+02 2.419e+02 4.024e+02, threshold=4.228e+02, percent-clipped=0.0 2023-10-02 03:42:25,053 WARNING [train.py:1204] (2/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_325-113798-0_sp1.1 from training. Duration: 0.77275 2023-10-02 03:42:29,751 WARNING [train.py:1204] (2/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_91-212486-0_sp0.9 from training. Duration: 0.7555625 2023-10-02 03:42:33,012 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8776_210_2040_1_1528638837_4088036_244-278763-0_sp1.1 from training. Duration: 0.8181875 2023-10-02 03:42:34,327 WARNING [train.py:1204] (2/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_106-286130-0 from training. Duration: 0.98 2023-10-02 03:42:35,792 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_2746_210_1988_1_1525872816_3795627_372-205861-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 03:42:35,840 WARNING [train.py:1204] (2/4) Exclude cut with ID _30800_210_22382_1_1532499347904_4515959_240-54646-0_sp1.1 from training. Duration: 0.936375 2023-10-02 03:42:39,071 WARNING [train.py:1204] (2/4) Exclude cut with ID _59500_210_8254_1_1534809610680_3736009_162-180695-0_sp1.1 from training. Duration: 0.936375 2023-10-02 03:42:39,092 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7544_210_6753_1_1528591327_1471426_146-93357-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 03:42:40,547 WARNING [train.py:1204] (2/4) Exclude cut with ID _42657_210_2070_1_1533520654016_3327008_405-87432-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 03:42:40,585 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_4528_210_3273_1_1527076654_3276777_209-219804-0 from training. Duration: 0.69 2023-10-02 03:42:40,585 WARNING [train.py:1204] (2/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_78-182912-0_sp0.9 from training. Duration: 0.6555625 2023-10-02 03:42:40,592 WARNING [train.py:1204] (2/4) Exclude cut with ID _31255_210_13427_1_1532865619495_3815143_322-114998-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 03:42:44,678 WARNING [train.py:1204] (2/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_848-280957-0_sp0.9 from training. Duration: 0.9666875 2023-10-02 03:42:44,697 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7696_210_2040_1_1528207167_3716099_117-125434-0_sp0.9 from training. Duration: 0.8444375 2023-10-02 03:42:45,006 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.feed_forward1.out_proj.dropout_p, batch_count=738546.6666666666, ans=0.1 2023-10-02 03:42:47,566 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_1002-109192-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 03:42:48,964 WARNING [train.py:1204] (2/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_138-64210-0_sp0.9 from training. Duration: 0.811125 2023-10-02 03:42:48,979 WARNING [train.py:1204] (2/4) Exclude cut with ID _25808_210_13292_1_1532343514562_7366109_633-47524-0_sp1.1 from training. Duration: 0.9 2023-10-02 03:42:51,062 WARNING [train.py:1204] (2/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_297-5233-0 from training. Duration: 0.95 2023-10-02 03:42:51,293 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.bypass.skip_rate, batch_count=738546.6666666666, ans=0.07 2023-10-02 03:42:53,869 WARNING [train.py:1204] (2/4) Exclude cut with ID _8394_210_6702_1_1528590504316_3540189_335-274780-0 from training. Duration: 0.78 2023-10-02 03:42:53,883 WARNING [train.py:1204] (2/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_301-330455-0 from training. Duration: 0.8 2023-10-02 03:42:56,743 WARNING [train.py:1204] (2/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_383-205833-0 from training. Duration: 0.95 2023-10-02 03:43:00,725 WARNING [train.py:1204] (2/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_48-182155-0 from training. Duration: 0.87 2023-10-02 03:43:01,955 INFO [train.py:1046] (2/4) Epoch 21, batch 4550, loss[loss=0.1555, simple_loss=0.2153, pruned_loss=0.04784, over 22713.00 frames. ], tot_loss[loss=0.1752, simple_loss=0.2519, pruned_loss=0.04919, over 4722988.71 frames. ], batch size: 322, lr: 4.86e-03, grad_scale: 8.0 2023-10-02 03:43:02,080 WARNING [train.py:1204] (2/4) Exclude cut with ID _14832_210_8750_1_1530691351539_3171348_1-54315-0_sp1.1 from training. Duration: 0.7909375 2023-10-02 03:43:06,130 WARNING [train.py:1204] (2/4) Exclude cut with ID _44982_210_1511_1_1534312820108_3726029_413-100814-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 03:43:06,192 WARNING [train.py:1204] (2/4) Exclude cut with ID _3231_210_2106_1_1526205864934_6784220_104-261392-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 03:43:06,420 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=738613.3333333334, ans=0.1 2023-10-02 03:43:08,940 WARNING [train.py:1204] (2/4) Exclude cut with ID _24314_210_4915_1_1532135078116_7170179_1-13588-0_sp1.1 from training. Duration: 0.97275 2023-10-02 03:43:13,698 WARNING [train.py:1204] (2/4) Exclude cut with ID _44304_210_15374_1_1533639352765_4613129_141-8356-0_sp1.1 from training. Duration: 0.963625 2023-10-02 03:43:15,102 WARNING [train.py:1204] (2/4) Exclude cut with ID _37892_210_19596_1_1533198677897_3964058_374-335034-0_sp1.1 from training. Duration: 0.936375 2023-10-02 03:43:15,389 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.conv_module1.balancer1.prob, batch_count=738680.0, ans=0.125 2023-10-02 03:43:16,580 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_195-257512-0_sp0.9 from training. Duration: 0.7444375 2023-10-02 03:43:16,582 WARNING [train.py:1204] (2/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_1267-272693-0_sp1.1 from training. Duration: 0.663625 2023-10-02 03:43:16,583 WARNING [train.py:1204] (2/4) Exclude cut with ID _30196_210_1664_1_1533708496522_7074140_690-326155-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 03:43:20,630 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_68847_210_14656_1_1536070214_2586843_38-20717-0_sp1.1 from training. Duration: 0.97275 2023-10-02 03:43:20,668 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_99-251314-0_sp1.1 from training. Duration: 0.7909375 2023-10-02 03:43:24,157 WARNING [train.py:1204] (2/4) Exclude cut with ID _49376_210_5986_1_1533985278732_3370740_225-95630-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 03:43:26,939 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_2655_210_2087_1_1525688959_3897400_202-147557-0 from training. Duration: 0.85 2023-10-02 03:43:26,991 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_5618_210_1874_1_1527311008_5851_1-214501-0_sp1.1 from training. Duration: 0.626375 2023-10-02 03:43:28,816 WARNING [train.py:1204] (2/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_225-43381-0_sp1.1 from training. Duration: 0.8545625 2023-10-02 03:43:30,953 WARNING [train.py:1204] (2/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_242-3472-0 from training. Duration: 0.73 2023-10-02 03:43:34,905 WARNING [train.py:1204] (2/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_339-104791-0 from training. Duration: 0.49 2023-10-02 03:43:34,984 WARNING [train.py:1204] (2/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_488-92261-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 03:43:35,183 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=738746.6666666666, ans=0.1 2023-10-02 03:43:36,433 WARNING [train.py:1204] (2/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_175-163894-0 from training. Duration: 0.74 2023-10-02 03:43:37,943 WARNING [train.py:1204] (2/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_41-234353-0_sp0.9 from training. Duration: 0.7555625 2023-10-02 03:43:40,649 WARNING [train.py:1204] (2/4) Exclude cut with ID _47251_210_18628_1_1533814063128_4137250_116-275927-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 03:43:40,675 WARNING [train.py:1204] (2/4) Exclude cut with ID _4697_210_2069_1_1526815801953_3823098_61-223324-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 03:43:40,687 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6961_210_2087_1_1527937168_6815011_562-165518-0_sp1.1 from training. Duration: 0.7 2023-10-02 03:43:43,983 WARNING [train.py:1204] (2/4) Exclude cut with ID _21989_210_5854_1_1531899214931_3271360_133-44759-0 from training. Duration: 0.77 2023-10-02 03:43:45,329 WARNING [train.py:1204] (2/4) Exclude cut with ID _23557_210_8750_1_1531890106370_6846255_77-284923-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 03:43:45,635 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.feed_forward3.hidden_balancer.prob, batch_count=738813.3333333334, ans=0.125 2023-10-02 03:43:46,780 WARNING [train.py:1204] (2/4) Exclude cut with ID _13678_210_7497_1_1530530069226_3463170_464-296127-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 03:43:48,124 WARNING [train.py:1204] (2/4) Exclude cut with ID _22613_210_5219_1_1532253599686_4090170_393-36663-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 03:43:49,483 WARNING [train.py:1204] (2/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_235-262124-0_sp0.9 from training. Duration: 0.7444375 2023-10-02 03:43:52,166 WARNING [train.py:1204] (2/4) Exclude cut with ID _8480_210_1874_1_1528589391205_6728330_755-112625-0 from training. Duration: 0.74 2023-10-02 03:43:52,221 WARNING [train.py:1204] (2/4) Exclude cut with ID _6456_210_1534_1_1527681634212_3181049_215-309359-0 from training. Duration: 0.88 2023-10-02 03:43:52,249 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_3019_210_2028_1_1526044245_3264865_191-72235-0_sp1.1 from training. Duration: 0.8818125 2023-10-02 03:43:53,996 WARNING [train.py:1204] (2/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_380-162648-0 from training. Duration: 0.94 2023-10-02 03:43:54,294 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.balancer_na.min_abs, batch_count=738813.3333333334, ans=0.02 2023-10-02 03:43:55,498 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_92-46009-0 from training. Duration: 0.54 2023-10-02 03:43:55,693 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.conv_module2.balancer1.prob, batch_count=738813.3333333334, ans=0.125 2023-10-02 03:43:56,742 WARNING [train.py:1204] (2/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_53-300544-0_sp0.9 from training. Duration: 0.7444375 2023-10-02 03:43:56,860 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_68235_210_1925_1_1535863247_4484656_101-250943-0_sp1.1 from training. Duration: 0.97275 2023-10-02 03:43:56,873 WARNING [train.py:1204] (2/4) Exclude cut with ID _14970_210_15709_1_1530775547361_1966965_204-22182-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 03:43:58,937 WARNING [train.py:1204] (2/4) Exclude cut with ID _55153_210_2253_1_1534384786677_3162096_310-130092-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 03:43:58,955 WARNING [train.py:1204] (2/4) Exclude cut with ID _60113_210_16271_1_1535164212481_4386079_559-167191-0_sp1.1 from training. Duration: 0.8454375 2023-10-02 03:44:00,797 WARNING [train.py:1204] (2/4) Exclude cut with ID _14368_210_4220_1_1530532714870_3595529_93-58002-0_sp1.1 from training. Duration: 0.5181875 2023-10-02 03:44:02,178 WARNING [train.py:1204] (2/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_110-224688-0 from training. Duration: 0.97 2023-10-02 03:44:03,574 WARNING [train.py:1204] (2/4) Exclude cut with ID _40402_210_18219_1_1533522324115_2953150_178-178004-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 03:44:03,589 WARNING [train.py:1204] (2/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_321-335475-0_sp1.1 from training. Duration: 0.4090625 2023-10-02 03:44:03,649 WARNING [train.py:1204] (2/4) Exclude cut with ID _66863_210_18053_1_1535698758334_8590319_1092-271784-0 from training. Duration: 0.96 2023-10-02 03:44:03,655 WARNING [train.py:1204] (2/4) Exclude cut with ID _28317_210_12742_1_1532480302381_8510325_607-213951-0_sp1.1 from training. Duration: 0.763625 2023-10-02 03:44:04,921 WARNING [train.py:1204] (2/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_468-283113-0 from training. Duration: 0.96 2023-10-02 03:44:07,800 WARNING [train.py:1204] (2/4) Exclude cut with ID _14922_210_6228_1_1530707435273_3693310_13-165512-0_sp1.1 from training. Duration: 0.7454375 2023-10-02 03:44:07,814 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_69630_210_7234_1_1535788670_910989_56-169762-0_sp1.1 from training. Duration: 0.92725 2023-10-02 03:44:10,540 WARNING [train.py:1204] (2/4) Exclude cut with ID _67335_210_5219_1_1535525975652_4275229_121-80357-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 03:44:11,886 WARNING [train.py:1204] (2/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_126-12079-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 03:44:11,924 WARNING [train.py:1204] (2/4) Exclude cut with ID _5294_210_1747_1_1527249643928_3696179_342-270278-0_sp1.1 from training. Duration: 0.5 2023-10-02 03:44:13,852 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_30322_210_1925_1_1532843478_826817_35-328264-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 03:44:15,230 WARNING [train.py:1204] (2/4) Exclude cut with ID _8480_210_1874_1_1528589391205_6728330_755-112625-0_sp1.1 from training. Duration: 0.67275 2023-10-02 03:44:16,618 INFO [train.py:1046] (2/4) Epoch 21, batch 4600, loss[loss=0.1734, simple_loss=0.2413, pruned_loss=0.05276, over 23816.00 frames. ], tot_loss[loss=0.174, simple_loss=0.2501, pruned_loss=0.0489, over 4716471.98 frames. ], batch size: 164, lr: 4.85e-03, grad_scale: 8.0 2023-10-02 03:44:18,112 WARNING [train.py:1204] (2/4) Exclude cut with ID _38670_210_15379_1_1533448810659_4391059_132-146412-0_sp1.1 from training. Duration: 0.97275 2023-10-02 03:44:18,195 WARNING [train.py:1204] (2/4) Exclude cut with ID _22020_210_2461_1_1531724164513_6326900_563-39691-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 03:44:18,295 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.0.attention_skip_rate, batch_count=738946.6666666666, ans=0.0 2023-10-02 03:44:20,865 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_9460_210_4211_1_1528954587_10049667_572-37035-0_sp1.1 from training. Duration: 0.72725 2023-10-02 03:44:20,880 WARNING [train.py:1204] (2/4) Exclude cut with ID _68777_210_4353_1_1535810390731_3117904_129-166012-0_sp0.9 from training. Duration: 0.9444375 2023-10-02 03:44:21,025 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.feed_forward2.hidden_balancer.prob, batch_count=738946.6666666666, ans=0.125 2023-10-02 03:44:22,226 WARNING [train.py:1204] (2/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_487-91921-0_sp1.1 from training. Duration: 0.963625 2023-10-02 03:44:23,606 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_195-257512-0 from training. Duration: 0.67 2023-10-02 03:44:26,926 WARNING [train.py:1204] (2/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_188-260011-0_sp1.1 from training. Duration: 0.663625 2023-10-02 03:44:30,398 WARNING [train.py:1204] (2/4) Exclude cut with ID _8480_210_1874_1_1528589391205_6728330_309-139896-0_sp0.9 from training. Duration: 0.9 2023-10-02 03:44:30,452 WARNING [train.py:1204] (2/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_866-321248-0_sp1.1 from training. Duration: 0.963625 2023-10-02 03:44:30,642 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.balancer2.prob, batch_count=739013.3333333334, ans=0.125 2023-10-02 03:44:33,702 WARNING [train.py:1204] (2/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_510-216513-0_sp1.1 from training. Duration: 0.97275 2023-10-02 03:44:37,017 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.5.encoder.layers.0.conv_module2.whiten, num_groups=1, num_channels=256, metric=4.77 vs. limit=15.0 2023-10-02 03:44:39,190 WARNING [train.py:1204] (2/4) Exclude cut with ID _6653_210_2629_1_1527919172441_6732679_758-143677-0 from training. Duration: 0.98 2023-10-02 03:44:40,566 WARNING [train.py:1204] (2/4) Exclude cut with ID _55134_210_21982_1_1534590028377_3882170_50-214427-0_sp1.1 from training. Duration: 0.97275 2023-10-02 03:44:42,159 WARNING [train.py:1204] (2/4) Exclude cut with ID _31649_210_6228_1_1532584608063_4091078_482-301330-0_sp1.1 from training. Duration: 0.97275 2023-10-02 03:44:44,284 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.ff2_skip_rate, batch_count=739013.3333333334, ans=0.0 2023-10-02 03:44:46,857 WARNING [train.py:1204] (2/4) Exclude cut with ID _35799_210_5854_1_1533263487887_3529071_490-326847-0_sp1.1 from training. Duration: 0.9 2023-10-02 03:44:46,873 WARNING [train.py:1204] (2/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_984-90716-0_sp1.1 from training. Duration: 0.963625 2023-10-02 03:44:51,146 WARNING [train.py:1204] (2/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_166-246557-0 from training. Duration: 0.64 2023-10-02 03:44:51,146 WARNING [train.py:1204] (2/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_51-200220-0_sp1.1 from training. Duration: 0.5090625 2023-10-02 03:44:52,470 WARNING [train.py:1204] (2/4) Exclude cut with ID _51382_210_23862_1_1534492801838_3650104_275-92796-0_sp1.1 from training. Duration: 0.936375 2023-10-02 03:44:54,061 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.self_attn_weights.pos_emb_skip_rate, batch_count=739080.0, ans=0.0 2023-10-02 03:44:55,817 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.576e+02 1.844e+02 1.997e+02 2.333e+02 3.874e+02, threshold=3.995e+02, percent-clipped=0.0 2023-10-02 03:44:56,109 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.1.ff2_skip_rate, batch_count=739080.0, ans=0.0 2023-10-02 03:44:58,735 WARNING [train.py:1204] (2/4) Exclude cut with ID _29853_210_7497_1_1532505600851_6642110_461-142829-0_sp1.1 from training. Duration: 0.97275 2023-10-02 03:44:58,783 WARNING [train.py:1204] (2/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_788-298907-0_sp0.9 from training. Duration: 0.988875 2023-10-02 03:45:00,868 WARNING [train.py:1204] (2/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_739-2098-0_sp1.1 from training. Duration: 0.836375 2023-10-02 03:45:06,588 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7543_210_6753_1_1528541907_1399322_165-323619-0 from training. Duration: 0.69 2023-10-02 03:45:07,967 WARNING [train.py:1204] (2/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_401-255491-0_sp1.1 from training. Duration: 0.636375 2023-10-02 03:45:10,986 WARNING [train.py:1204] (2/4) Exclude cut with ID _66873_210_5517_1_1535454136506_4293370_24-143283-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 03:45:12,334 WARNING [train.py:1204] (2/4) Exclude cut with ID _14459_210_2228_1_1531443641946_7516700_1221-280439-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 03:45:15,079 WARNING [train.py:1204] (2/4) Exclude cut with ID _59202_210_26984_1_1534816810612_3549174_469-213482-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 03:45:15,080 WARNING [train.py:1204] (2/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_148-158509-0_sp0.9 from training. Duration: 0.4555625 2023-10-02 03:45:15,115 WARNING [train.py:1204] (2/4) Exclude cut with ID _24314_210_4915_1_1532135078116_7170179_863-259095-0_sp1.1 from training. Duration: 0.97275 2023-10-02 03:45:16,492 WARNING [train.py:1204] (2/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_321-22821-0 from training. Duration: 0.89 2023-10-02 03:45:16,513 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_43831_210_2158_1_1533725502_5872791_55-163137-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 03:45:16,567 WARNING [train.py:1204] (2/4) Exclude cut with ID _27430_210_7770_1_1532604689375_1378590_26-25207-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 03:45:18,428 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_17613_210_2151_1_1531137569_2366160_223-316461-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 03:45:19,768 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_53679_210_4852_1_1534556726_4186141_541-213370-0_sp1.1 from training. Duration: 0.963625 2023-10-02 03:45:19,840 WARNING [train.py:1204] (2/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_369-295086-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 03:45:21,178 WARNING [train.py:1204] (2/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_91-267696-0 from training. Duration: 0.87 2023-10-02 03:45:21,244 WARNING [train.py:1204] (2/4) Exclude cut with ID _5294_210_1747_1_1527249643928_3696179_180-100713-0 from training. Duration: 0.81 2023-10-02 03:45:22,511 WARNING [train.py:1204] (2/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_32-335103-0 from training. Duration: 0.82 2023-10-02 03:45:22,516 WARNING [train.py:1204] (2/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_133-312727-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 03:45:22,603 WARNING [train.py:1204] (2/4) Exclude cut with ID _59288_210_5854_1_1535077757774_3412246_391-165934-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 03:45:23,890 WARNING [train.py:1204] (2/4) Exclude cut with ID _28323_210_16187_1_1532480395667_4100300_488-1281-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 03:45:24,186 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.balancer2.prob, batch_count=739213.3333333334, ans=0.125 2023-10-02 03:45:25,304 WARNING [train.py:1204] (2/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_124-193207-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 03:45:32,094 INFO [train.py:1046] (2/4) Epoch 21, batch 4650, loss[loss=0.1624, simple_loss=0.2354, pruned_loss=0.04465, over 23562.00 frames. ], tot_loss[loss=0.1736, simple_loss=0.2497, pruned_loss=0.04875, over 4716180.51 frames. ], batch size: 134, lr: 4.85e-03, grad_scale: 8.0 2023-10-02 03:45:35,306 WARNING [train.py:1204] (2/4) Exclude cut with ID _14087_210_2253_1_1530529224512_3014029_226-242140-0_sp0.9 from training. Duration: 0.9 2023-10-02 03:45:38,245 WARNING [train.py:1204] (2/4) Exclude cut with ID _29277_210_3279_1_1532581109293_3173069_392-3694-0_sp1.1 from training. Duration: 0.936375 2023-10-02 03:45:38,273 WARNING [train.py:1204] (2/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_563-329842-0_sp1.1 from training. Duration: 0.97275 2023-10-02 03:45:39,535 WARNING [train.py:1204] (2/4) Exclude cut with ID _58583_210_5236_1_1534726835266_3574249_391-87755-0_sp1.1 from training. Duration: 0.92725 2023-10-02 03:45:39,552 WARNING [train.py:1204] (2/4) Exclude cut with ID _28427_210_4220_1_1532581240967_3061069_281-237827-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 03:45:39,574 WARNING [train.py:1204] (2/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_44-147898-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 03:45:39,650 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_24007_210_1794_1_1531918925_1663875_125-320772-0_sp1.1 from training. Duration: 0.97275 2023-10-02 03:45:43,745 WARNING [train.py:1204] (2/4) Exclude cut with ID _14368_210_4220_1_1530532714870_3595529_143-117421-0 from training. Duration: 0.58 2023-10-02 03:45:46,426 WARNING [train.py:1204] (2/4) Exclude cut with ID _69584_210_5219_1_1535785133775_4044450_325-7656-0_sp0.9 from training. Duration: 0.9555625 2023-10-02 03:45:47,895 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_37351_210_7925_1_1533012931_3812113_265-85780-0 from training. Duration: 0.88 2023-10-02 03:45:47,930 WARNING [train.py:1204] (2/4) Exclude cut with ID _18516_210_9206_1_1531704653206_3692406_331-237896-0_sp1.1 from training. Duration: 0.936375 2023-10-02 03:45:49,778 WARNING [train.py:1204] (2/4) Exclude cut with ID _14629_210_15709_1_1530604298182_956899_6-232695-0 from training. Duration: 0.94 2023-10-02 03:45:49,807 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7544_210_6753_1_1528587000_691468_87-178599-0_sp1.1 from training. Duration: 0.7909375 2023-10-02 03:45:51,174 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_326-303945-0 from training. Duration: 0.99 2023-10-02 03:45:51,207 WARNING [train.py:1204] (2/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_503-30719-0 from training. Duration: 0.98 2023-10-02 03:45:51,218 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_11305_210_1794_1_1529750428_7178148_299-344640-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 03:45:51,284 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_53071_210_16554_1_1534490405_2954066_265-170472-0_sp1.1 from training. Duration: 0.7545625 2023-10-02 03:45:55,374 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_174-277364-0_sp1.1 from training. Duration: 0.6454375 2023-10-02 03:45:56,819 WARNING [train.py:1204] (2/4) Exclude cut with ID _58414_210_8254_1_1535421609162_7151182_193-151884-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 03:45:56,836 WARNING [train.py:1204] (2/4) Exclude cut with ID IC0651W0104-322063 from training. Duration: 0.768 2023-10-02 03:46:00,303 WARNING [train.py:1204] (2/4) Exclude cut with ID _21881_210_3525_1_1531825242757_3798380_139-305982-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 03:46:00,397 WARNING [train.py:1204] (2/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_79-200392-0 from training. Duration: 0.97 2023-10-02 03:46:04,438 WARNING [train.py:1204] (2/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_451-280894-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 03:46:04,456 WARNING [train.py:1204] (2/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_304-273433-0_sp1.1 from training. Duration: 0.9 2023-10-02 03:46:04,530 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_5959_210_1794_1_1527400400_3445726_412-44052-0 from training. Duration: 0.77 2023-10-02 03:46:05,915 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.conv_module1.balancer2.prob, batch_count=739413.3333333334, ans=0.125 2023-10-02 03:46:06,894 WARNING [train.py:1204] (2/4) Exclude cut with ID _4681_210_3525_1_1527251040321_1235713_148-26159-0_sp1.1 from training. Duration: 0.963625 2023-10-02 03:46:09,490 WARNING [train.py:1204] (2/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_887-249555-0_sp0.9 from training. Duration: 0.8555625 2023-10-02 03:46:12,368 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_25236_210_2158_1_1532051968_280567_37-6860-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 03:46:14,016 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.conv_module1.balancer2.prob, batch_count=739413.3333333334, ans=0.125 2023-10-02 03:46:17,877 WARNING [train.py:1204] (2/4) Exclude cut with ID _29277_210_3279_1_1532581109293_3173069_417-115527-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 03:46:20,618 WARNING [train.py:1204] (2/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_552-134748-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 03:46:22,528 WARNING [train.py:1204] (2/4) Exclude cut with ID _43176_210_8254_1_1533524417469_4174339_188-174143-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 03:46:23,984 WARNING [train.py:1204] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_127-119109-0_sp1.1 from training. Duration: 0.6545625 2023-10-02 03:46:26,786 WARNING [train.py:1204] (2/4) Exclude cut with ID _14922_210_6228_1_1530707435273_3693310_38-187851-0 from training. Duration: 0.62 2023-10-02 03:46:26,821 WARNING [train.py:1204] (2/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_588-125165-0 from training. Duration: 0.83 2023-10-02 03:46:28,218 WARNING [train.py:1204] (2/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_344-152526-0_sp1.1 from training. Duration: 0.4818125 2023-10-02 03:46:28,220 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7002_210_1794_1_1527935634_4034444_487-236501-0 from training. Duration: 0.66 2023-10-02 03:46:29,674 WARNING [train.py:1204] (2/4) Exclude cut with ID _23825_210_13449_1_1531915313526_3497150_379-16981-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 03:46:37,201 WARNING [train.py:1204] (2/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_297-5233-0_sp1.1 from training. Duration: 0.863625 2023-10-02 03:46:37,212 WARNING [train.py:1204] (2/4) Exclude cut with ID _41298_210_9019_1_1533430371450_4822143_178-283538-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 03:46:38,248 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7046_210_6753_1_1527895337_6920917_642-106799-0 from training. Duration: 0.92 2023-10-02 03:46:38,270 WARNING [train.py:1204] (2/4) Exclude cut with ID _6274_210_2084_1_1527937583837_7494020_532-291952-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 03:46:39,728 WARNING [train.py:1204] (2/4) Exclude cut with ID _47278_210_12041_1_1533902418349_3576110_260-270468-0_sp1.1 from training. Duration: 0.92725 2023-10-02 03:46:39,733 WARNING [train.py:1204] (2/4) Exclude cut with ID _14368_210_4220_1_1530532714870_3595529_144-3287-0_sp0.9 from training. Duration: 0.7444375 2023-10-02 03:46:41,103 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_162-262717-0_sp0.9 from training. Duration: 0.92225 2023-10-02 03:46:43,727 WARNING [train.py:1204] (2/4) Exclude cut with ID _3465_210_3525_1_1526175667060_3574049_62-335552-0_sp0.9 from training. Duration: 0.9666875 2023-10-02 03:46:43,740 WARNING [train.py:1204] (2/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_726-272852-0_sp1.1 from training. Duration: 0.92725 2023-10-02 03:46:45,206 WARNING [train.py:1204] (2/4) Exclude cut with ID _14898_210_9206_1_1530766812377_4177190_26-135492-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 03:46:46,497 INFO [train.py:1046] (2/4) Epoch 21, batch 4700, loss[loss=0.1658, simple_loss=0.2412, pruned_loss=0.04519, over 23405.00 frames. ], tot_loss[loss=0.1744, simple_loss=0.2503, pruned_loss=0.0492, over 4719034.17 frames. ], batch size: 134, lr: 4.85e-03, grad_scale: 8.0 2023-10-02 03:46:46,811 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.feed_forward1.out_proj.dropout_p, batch_count=739613.3333333334, ans=0.1 2023-10-02 03:46:48,044 WARNING [train.py:1204] (2/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_200-95304-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 03:46:49,349 WARNING [train.py:1204] (2/4) Exclude cut with ID _45012_210_12651_1_1533608435136_5371380_240-37101-0_sp1.1 from training. Duration: 0.8454375 2023-10-02 03:46:49,359 WARNING [train.py:1204] (2/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_132-269633-0_sp0.9 from training. Duration: 0.7555625 2023-10-02 03:46:49,439 WARNING [train.py:1204] (2/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_907-151398-0_sp0.9 from training. Duration: 0.411125 2023-10-02 03:46:50,820 WARNING [train.py:1204] (2/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_45-265684-0_sp0.9 from training. Duration: 0.8 2023-10-02 03:46:52,164 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6291_210_3273_1_1527683649_1078498_74-292608-0 from training. Duration: 0.72 2023-10-02 03:46:58,315 WARNING [train.py:1204] (2/4) Exclude cut with ID _59202_210_26984_1_1534816810612_3549174_221-84819-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 03:46:59,627 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_33696_210_3852_1_1533082780_2663375_181-325595-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 03:46:59,663 WARNING [train.py:1204] (2/4) Exclude cut with ID _47251_210_18628_1_1533814063128_4137250_351-154105-0_sp1.1 from training. Duration: 0.963625 2023-10-02 03:47:01,018 WARNING [train.py:1204] (2/4) Exclude cut with ID _35501_210_9288_1_1532998507986_3192189_351-334740-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 03:47:02,408 WARNING [train.py:1204] (2/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_342-100157-0_sp1.1 from training. Duration: 0.4909375 2023-10-02 03:47:02,693 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.feed_forward3.hidden_balancer.prob, batch_count=739680.0, ans=0.125 2023-10-02 03:47:04,496 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.feed_forward1.out_proj.dropout_p, batch_count=739680.0, ans=0.1 2023-10-02 03:47:08,799 WARNING [train.py:1204] (2/4) Exclude cut with ID _6789_210_1534_1_1527857937218_3118950_262-48853-0 from training. Duration: 0.56 2023-10-02 03:47:08,826 WARNING [train.py:1204] (2/4) Exclude cut with ID _69884_210_9771_1_1535846392818_4166169_430-201020-0 from training. Duration: 0.97 2023-10-02 03:47:12,317 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_71-136518-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 03:47:13,736 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_28-291982-0_sp1.1 from training. Duration: 0.8818125 2023-10-02 03:47:13,763 WARNING [train.py:1204] (2/4) Exclude cut with ID _54218_210_12210_1_1534413646388_4409259_437-275326-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 03:47:16,653 WARNING [train.py:1204] (2/4) Exclude cut with ID _55134_210_21982_1_1534590028377_3882170_40-309139-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 03:47:22,204 WARNING [train.py:1204] (2/4) Exclude cut with ID _32704_210_8954_1_1533017488840_6747370_266-329755-0_sp1.1 from training. Duration: 0.8090625 2023-10-02 03:47:22,293 WARNING [train.py:1204] (2/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_21-174514-0_sp1.1 from training. Duration: 0.3909375 2023-10-02 03:47:25,316 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.458e+02 1.835e+02 1.990e+02 2.333e+02 4.154e+02, threshold=3.980e+02, percent-clipped=1.0 2023-10-02 03:47:25,419 WARNING [train.py:1204] (2/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_409-207378-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 03:47:27,252 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.feed_forward1.out_proj.dropout_p, batch_count=739746.6666666666, ans=0.1 2023-10-02 03:47:31,208 WARNING [train.py:1204] (2/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_132-269633-0 from training. Duration: 0.68 2023-10-02 03:47:31,297 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_2245_210_2715_1_1525511548_3540254_310-52483-0_sp0.9 from training. Duration: 0.97775 2023-10-02 03:47:34,001 WARNING [train.py:1204] (2/4) Exclude cut with ID _27430_210_7770_1_1532604689375_1378590_47-77403-0_sp1.1 from training. Duration: 0.92725 2023-10-02 03:47:38,789 WARNING [train.py:1204] (2/4) Exclude cut with ID _7562_210_2084_1_1528887767916_3946229_186-326422-0 from training. Duration: 0.84 2023-10-02 03:47:40,323 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_14056_210_4163_1_1530604483_7572827_230-35993-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 03:47:44,990 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_23854_210_1794_1_1531969696_1329157_57-123491-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 03:47:46,455 WARNING [train.py:1204] (2/4) Exclude cut with ID _14747_210_8341_1_1530685797463_3654089_147-131229-0 from training. Duration: 0.76 2023-10-02 03:47:47,942 WARNING [train.py:1204] (2/4) Exclude cut with ID _30191_210_1664_1_1533189915047_7196670_430-19539-0_sp1.1 from training. Duration: 0.92725 2023-10-02 03:47:47,963 WARNING [train.py:1204] (2/4) Exclude cut with ID _67725_210_27767_1_1535608756671_5105359_44-50762-0_sp1.1 from training. Duration: 0.963625 2023-10-02 03:47:49,424 WARNING [train.py:1204] (2/4) Exclude cut with ID _27485_210_4916_1_1532739191677_3668269_217-161755-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 03:47:50,752 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_5931_210_2040_1_1527428481_4547999_321-193614-0_sp1.1 from training. Duration: 0.6090625 2023-10-02 03:47:50,773 WARNING [train.py:1204] (2/4) Exclude cut with ID _6267_210_2084_1_1527764658526_3894730_375-219887-0 from training. Duration: 0.67 2023-10-02 03:47:52,175 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9087W0022-538393 from training. Duration: 0.768 2023-10-02 03:47:53,556 WARNING [train.py:1204] (2/4) Exclude cut with ID _68777_210_4353_1_1535810390731_3117904_83-196459-0_sp1.1 from training. Duration: 0.963625 2023-10-02 03:47:53,733 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.0.conv_module2.balancer1.prob, batch_count=739880.0, ans=0.125 2023-10-02 03:47:54,978 WARNING [train.py:1204] (2/4) Exclude cut with ID _69884_210_9771_1_1535846392818_4166169_455-343812-0_sp1.1 from training. Duration: 0.92725 2023-10-02 03:47:54,979 WARNING [train.py:1204] (2/4) Exclude cut with ID _13916_210_9994_1_1530788341126_3763149_414-320174-0_sp1.1 from training. Duration: 0.92725 2023-10-02 03:47:54,982 WARNING [train.py:1204] (2/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_398-239819-0 from training. Duration: 0.99 2023-10-02 03:47:56,374 WARNING [train.py:1204] (2/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_199-232314-0_sp1.1 from training. Duration: 0.92725 2023-10-02 03:47:59,778 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.bypass.skip_rate, batch_count=739946.6666666666, ans=0.07 2023-10-02 03:48:00,876 INFO [train.py:1046] (2/4) Epoch 21, batch 4750, loss[loss=0.1751, simple_loss=0.2592, pruned_loss=0.0455, over 24476.00 frames. ], tot_loss[loss=0.1743, simple_loss=0.2506, pruned_loss=0.04896, over 4727693.71 frames. ], batch size: 63, lr: 4.85e-03, grad_scale: 8.0 2023-10-02 03:48:00,921 WARNING [train.py:1204] (2/4) Exclude cut with ID _2334_210_2667_1_1525608238880_4705129_459-104997-0 from training. Duration: 0.95 2023-10-02 03:48:02,417 WARNING [train.py:1204] (2/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_441-209395-0_sp1.1 from training. Duration: 0.936375 2023-10-02 03:48:03,825 WARNING [train.py:1204] (2/4) Exclude cut with ID _27052_210_5986_1_1532329197187_3585009_320-80393-0_sp1.1 from training. Duration: 0.97275 2023-10-02 03:48:05,544 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.ff2_skip_rate, batch_count=739946.6666666666, ans=0.0 2023-10-02 03:48:07,380 WARNING [train.py:1204] (2/4) Exclude cut with ID _21783_210_11357_1_1531742458730_3762679_89-217253-0_sp1.1 from training. Duration: 0.97275 2023-10-02 03:48:07,393 WARNING [train.py:1204] (2/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_453-282588-0_sp1.1 from training. Duration: 0.8909375 2023-10-02 03:48:08,812 WARNING [train.py:1204] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_88-341718-0 from training. Duration: 0.66 2023-10-02 03:48:08,859 WARNING [train.py:1204] (2/4) Exclude cut with ID _43552_210_8913_1_1533726031059_3523429_230-3521-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 03:48:10,506 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.bypass.scale_min, batch_count=739946.6666666666, ans=0.2 2023-10-02 03:48:13,723 WARNING [train.py:1204] (2/4) Exclude cut with ID _9679_210_7497_1_1529129212906_4363929_512-313889-0 from training. Duration: 0.81 2023-10-02 03:48:16,179 WARNING [train.py:1204] (2/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_194-252800-0_sp1.1 from training. Duration: 0.8818125 2023-10-02 03:48:16,220 WARNING [train.py:1204] (2/4) Exclude cut with ID _55209_210_1531_1_1534675828996_4597160_239-293541-0_sp1.1 from training. Duration: 0.963625 2023-10-02 03:48:17,498 WARNING [train.py:1204] (2/4) Exclude cut with ID _35799_210_5854_1_1533263487887_3529071_15-325131-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 03:48:21,163 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.2.encoder.layers.1.feed_forward3.out_whiten, num_groups=1, num_channels=384, metric=10.39 vs. limit=15.0 2023-10-02 03:48:21,844 WARNING [train.py:1204] (2/4) Exclude cut with ID _14100_210_8341_1_1530529320613_3678069_279-55760-0 from training. Duration: 0.93 2023-10-02 03:48:26,393 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.bypass.scale_min, batch_count=740013.3333333334, ans=0.2 2023-10-02 03:48:27,440 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_489-230703-0_sp1.1 from training. Duration: 0.77275 2023-10-02 03:48:28,836 WARNING [train.py:1204] (2/4) Exclude cut with ID _6694_210_4599_1_1527852637490_3965529_144-185051-0 from training. Duration: 0.96 2023-10-02 03:48:28,899 WARNING [train.py:1204] (2/4) Exclude cut with ID _28461_210_2253_1_1532771775189_3041299_1-228155-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 03:48:30,555 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=740080.0, ans=0.1 2023-10-02 03:48:33,672 WARNING [train.py:1204] (2/4) Exclude cut with ID _32704_210_8954_1_1533017488840_6747370_686-252323-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 03:48:33,674 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_40896_210_15710_1_1533433956_7100802_1075-294633-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 03:48:33,685 WARNING [train.py:1204] (2/4) Exclude cut with ID _22083_210_8254_1_1531702862439_3678970_30-92879-0_sp1.1 from training. Duration: 0.97275 2023-10-02 03:48:35,073 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9245W0121-616888 from training. Duration: 0.896 2023-10-02 03:48:35,076 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6786_210_1388_1_1527839699_4194495_378-46070-0 from training. Duration: 0.72 2023-10-02 03:48:40,581 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.feed_forward1.hidden_balancer.prob, batch_count=740080.0, ans=0.125 2023-10-02 03:48:41,673 WARNING [train.py:1204] (2/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_317-175062-0 from training. Duration: 0.82 2023-10-02 03:48:43,202 WARNING [train.py:1204] (2/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_238-89390-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 03:48:45,986 WARNING [train.py:1204] (2/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_524-310811-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 03:48:47,885 WARNING [train.py:1204] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_307-158462-0_sp0.9 from training. Duration: 0.7666875 2023-10-02 03:48:47,886 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9309W0191-640318 from training. Duration: 0.512 2023-10-02 03:48:47,890 WARNING [train.py:1204] (2/4) Exclude cut with ID _55153_210_2253_1_1534384786677_3162096_100-204617-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 03:48:50,706 WARNING [train.py:1204] (2/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_112-149213-0_sp1.1 from training. Duration: 0.863625 2023-10-02 03:48:52,185 WARNING [train.py:1204] (2/4) Exclude cut with ID _67831_210_12392_1_1535612464202_3092612_346-2148-0_sp1.1 from training. Duration: 0.8454375 2023-10-02 03:48:53,461 WARNING [train.py:1204] (2/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_51-117576-0_sp0.9 from training. Duration: 0.4444375 2023-10-02 03:48:53,498 WARNING [train.py:1204] (2/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_300-27235-0 from training. Duration: 0.87 2023-10-02 03:48:54,781 WARNING [train.py:1204] (2/4) Exclude cut with ID _40399_210_4353_1_1533305658194_3279406_195-116825-0_sp1.1 from training. Duration: 0.97275 2023-10-02 03:48:54,810 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7002_210_1794_1_1527939683_5157286_163-87552-0_sp0.9 from training. Duration: 0.9555625 2023-10-02 03:48:56,248 WARNING [train.py:1204] (2/4) Exclude cut with ID _19514_210_9259_1_1531530071330_5084230_560-237597-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 03:48:56,326 WARNING [train.py:1204] (2/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_41-234353-0_sp1.1 from training. Duration: 0.6181875 2023-10-02 03:48:56,344 WARNING [train.py:1204] (2/4) Exclude cut with ID _6274_210_2084_1_1527937583837_7494020_769-168302-0 from training. Duration: 0.71 2023-10-02 03:48:59,211 WARNING [train.py:1204] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_574-45541-0 from training. Duration: 0.65 2023-10-02 03:49:01,933 WARNING [train.py:1204] (2/4) Exclude cut with ID _50310_210_15488_1_1534068043345_3641080_262-67120-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 03:49:04,109 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_53071_210_16554_1_1534493434_1276796_35-11927-0_sp1.1 from training. Duration: 0.963625 2023-10-02 03:49:05,435 WARNING [train.py:1204] (2/4) Exclude cut with ID _3992_210_1747_1_1526644816031_3731060_107-251434-0 from training. Duration: 0.99 2023-10-02 03:49:05,482 WARNING [train.py:1204] (2/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_412-54488-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 03:49:05,572 WARNING [train.py:1204] (2/4) Exclude cut with ID _18516_210_9206_1_1531704653206_3692406_77-258490-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 03:49:06,933 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_40047_210_2290_1_1533290519_2374311_195-165693-0_sp0.9 from training. Duration: 0.988875 2023-10-02 03:49:08,937 WARNING [train.py:1204] (2/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_296-36971-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 03:49:10,839 WARNING [train.py:1204] (2/4) Exclude cut with ID _14970_210_15709_1_1530773543493_1715730_187-20259-0_sp1.1 from training. Duration: 0.8090625 2023-10-02 03:49:12,386 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_53373_210_5399_1_1534229471_4715695_135-305927-0_sp1.1 from training. Duration: 0.936375 2023-10-02 03:49:12,408 WARNING [train.py:1204] (2/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_60-18123-0 from training. Duration: 0.88 2023-10-02 03:49:13,749 WARNING [train.py:1204] (2/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_166-166990-0 from training. Duration: 0.96 2023-10-02 03:49:13,998 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.feed_forward3.hidden_balancer.prob, batch_count=740213.3333333334, ans=0.125 2023-10-02 03:49:15,105 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_5438_210_1794_1_1527336687_2600009_233-58072-0 from training. Duration: 0.77 2023-10-02 03:49:17,014 INFO [train.py:1046] (2/4) Epoch 21, batch 4800, loss[loss=0.1554, simple_loss=0.237, pruned_loss=0.03693, over 24473.00 frames. ], tot_loss[loss=0.1754, simple_loss=0.2517, pruned_loss=0.04952, over 4710968.22 frames. ], batch size: 63, lr: 4.85e-03, grad_scale: 16.0 2023-10-02 03:49:18,410 WARNING [train.py:1204] (2/4) Exclude cut with ID _8833_210_5219_1_1528592665210_7553059_611-80313-0_sp0.9 from training. Duration: 0.711125 2023-10-02 03:49:18,431 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_53555_210_5632_1_1534595220_7232297_118-43239-0_sp1.1 from training. Duration: 0.936375 2023-10-02 03:49:19,745 WARNING [train.py:1204] (2/4) Exclude cut with ID _12369_210_4381_1_1530356078581_4388520_480-13146-0 from training. Duration: 0.85 2023-10-02 03:49:24,173 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_892-28814-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 03:49:24,377 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.conv_module2.balancer1.max_abs, batch_count=740280.0, ans=10.0 2023-10-02 03:49:25,633 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_24899_210_16554_1_1532046422_3827462_160-21255-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 03:49:29,863 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_154-316450-0_sp1.1 from training. Duration: 0.6909375 2023-10-02 03:49:31,647 WARNING [train.py:1204] (2/4) Exclude cut with ID _30196_210_1664_1_1533708496522_7074140_1225-301232-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 03:49:31,677 WARNING [train.py:1204] (2/4) Exclude cut with ID _28783_210_7943_1_1532671164808_7155479_414-208100-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 03:49:31,718 WARNING [train.py:1204] (2/4) Exclude cut with ID _19023_210_4353_1_1531378892552_5820780_83-254794-0 from training. Duration: 0.86 2023-10-02 03:49:33,194 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.bypass.scale_min, batch_count=740346.6666666666, ans=0.2 2023-10-02 03:49:34,400 WARNING [train.py:1204] (2/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_591-83752-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 03:49:34,435 WARNING [train.py:1204] (2/4) Exclude cut with ID _29877_210_6228_1_1532397791106_5276149_266-62291-0_sp1.1 from training. Duration: 0.7909375 2023-10-02 03:49:34,560 WARNING [train.py:1204] (2/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_280-161633-0_sp1.1 from training. Duration: 0.663625 2023-10-02 03:49:37,418 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7543_210_6753_1_1528539468_333764_39-156634-0_sp1.1 from training. Duration: 0.97275 2023-10-02 03:49:39,570 WARNING [train.py:1204] (2/4) Exclude cut with ID _67499_210_17282_1_1535509805220_3692430_505-312248-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 03:49:39,600 WARNING [train.py:1204] (2/4) Exclude cut with ID _5294_210_1747_1_1527249643928_3696179_180-100713-0_sp0.9 from training. Duration: 0.9 2023-10-02 03:49:41,737 WARNING [train.py:1204] (2/4) Exclude cut with ID _26528_210_9070_1_1532570410842_3851310_508-148132-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 03:49:41,749 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_211-339622-0_sp1.1 from training. Duration: 0.4090625 2023-10-02 03:49:41,768 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_40896_210_15710_1_1533433956_7100802_339-138356-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 03:49:43,095 WARNING [train.py:1204] (2/4) Exclude cut with ID _27060_210_5986_1_1532674838922_3522549_211-19933-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 03:49:46,538 WARNING [train.py:1204] (2/4) Exclude cut with ID _60229_210_27767_1_1534838406158_3655350_457-117923-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 03:49:49,047 WARNING [train.py:1204] (2/4) Exclude cut with ID _55429_210_10626_1_1534467588527_7174143_460-141439-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 03:49:50,385 WARNING [train.py:1204] (2/4) Exclude cut with ID _59170_210_6721_1_1534983633960_4203335_263-10780-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 03:49:50,404 WARNING [train.py:1204] (2/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_872-264525-0_sp0.9 from training. Duration: 0.72225 2023-10-02 03:49:51,869 WARNING [train.py:1204] (2/4) Exclude cut with ID _7624_210_1531_1_1528109802198_4933101_418-349834-0_sp0.9 from training. Duration: 0.6666875 2023-10-02 03:49:53,239 WARNING [train.py:1204] (2/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_27-53998-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 03:49:55,817 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.579e+02 1.875e+02 2.035e+02 2.300e+02 3.149e+02, threshold=4.071e+02, percent-clipped=0.0 2023-10-02 03:49:55,892 WARNING [train.py:1204] (2/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_202-183922-0 from training. Duration: 0.85 2023-10-02 03:49:55,914 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7002_210_1794_1_1527939683_5157286_1-281264-0 from training. Duration: 0.44 2023-10-02 03:49:57,261 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_53753_210_7234_1_1534320003_3594148_493-9314-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 03:49:57,282 WARNING [train.py:1204] (2/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_217-95056-0_sp1.1 from training. Duration: 0.8909375 2023-10-02 03:49:57,326 WARNING [train.py:1204] (2/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_406-45086-0_sp1.1 from training. Duration: 0.72725 2023-10-02 03:49:57,332 WARNING [train.py:1204] (2/4) Exclude cut with ID _31649_210_6228_1_1532584608063_4091078_505-213821-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 03:49:57,340 WARNING [train.py:1204] (2/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_317-175062-0_sp0.9 from training. Duration: 0.911125 2023-10-02 03:49:58,713 WARNING [train.py:1204] (2/4) Exclude cut with ID _5444_210_2151_1_1527418808464_5312210_726-27165-0_sp1.1 from training. Duration: 0.6090625 2023-10-02 03:50:00,060 WARNING [train.py:1204] (2/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_494-170437-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 03:50:02,844 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.2.encoder.layers.1.self_attn1.whiten, num_groups=1, num_channels=384, metric=17.88 vs. limit=22.5 2023-10-02 03:50:03,449 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_474-64054-0_sp1.1 from training. Duration: 0.936375 2023-10-02 03:50:07,365 WARNING [train.py:1204] (2/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_521-7556-0_sp1.1 from training. Duration: 0.97275 2023-10-02 03:50:08,731 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_269-348505-0_sp1.1 from training. Duration: 0.92725 2023-10-02 03:50:15,498 WARNING [train.py:1204] (2/4) Exclude cut with ID _21878_210_3525_1_1531738772002_7264330_559-184862-0 from training. Duration: 0.94 2023-10-02 03:50:15,538 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_68702_210_23689_1_1535799577_3420068_92-76040-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 03:50:15,577 WARNING [train.py:1204] (2/4) Exclude cut with ID _22826_210_9994_1_1531908201522_3279169_227-102052-0_sp1.1 from training. Duration: 0.97275 2023-10-02 03:50:15,607 WARNING [train.py:1204] (2/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_718-250907-0_sp0.9 from training. Duration: 0.7666875 2023-10-02 03:50:16,995 WARNING [train.py:1204] (2/4) Exclude cut with ID _14069_210_5632_1_1530512692033_4803390_352-143891-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 03:50:20,201 WARNING [train.py:1204] (2/4) Exclude cut with ID _6267_210_2084_1_1527764658526_3894730_65-106602-0_sp1.1 from training. Duration: 0.936375 2023-10-02 03:50:21,587 WARNING [train.py:1204] (2/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_202-183922-0_sp0.9 from training. Duration: 0.9444375 2023-10-02 03:50:21,595 WARNING [train.py:1204] (2/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_465-268827-0_sp1.1 from training. Duration: 0.97275 2023-10-02 03:50:21,646 WARNING [train.py:1204] (2/4) Exclude cut with ID _12369_210_4381_1_1530356078581_4388520_58-29064-0_sp1.1 from training. Duration: 0.9 2023-10-02 03:50:22,915 WARNING [train.py:1204] (2/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_1-99680-0_sp0.9 from training. Duration: 0.7444375 2023-10-02 03:50:24,201 WARNING [train.py:1204] (2/4) Exclude cut with ID _13253_210_1925_1_1530757775819_3577873_191-144977-0_sp1.1 from training. Duration: 0.7090625 2023-10-02 03:50:24,500 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.conv_skip_rate, batch_count=740546.6666666666, ans=0.0 2023-10-02 03:50:24,730 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.5.encoder.layers.1.self_attn_weights.whiten_keys, num_groups=4, num_channels=128, metric=1.90 vs. limit=6.0 2023-10-02 03:50:27,032 WARNING [train.py:1204] (2/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_276-112684-0_sp1.1 from training. Duration: 0.92725 2023-10-02 03:50:27,040 WARNING [train.py:1204] (2/4) Exclude cut with ID _64639_210_5816_1_1535367619437_3528000_358-411-0_sp1.1 from training. Duration: 0.97275 2023-10-02 03:50:27,046 WARNING [train.py:1204] (2/4) Exclude cut with ID _31649_210_6228_1_1532584608063_4091078_106-21199-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 03:50:28,447 WARNING [train.py:1204] (2/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_901-79438-0 from training. Duration: 0.94 2023-10-02 03:50:31,105 INFO [train.py:1046] (2/4) Epoch 21, batch 4850, loss[loss=0.1489, simple_loss=0.2286, pruned_loss=0.03458, over 24625.00 frames. ], tot_loss[loss=0.1755, simple_loss=0.2523, pruned_loss=0.04937, over 4724368.92 frames. ], batch size: 60, lr: 4.85e-03, grad_scale: 16.0 2023-10-02 03:50:31,199 WARNING [train.py:1204] (2/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_718-250907-0 from training. Duration: 0.69 2023-10-02 03:50:31,205 WARNING [train.py:1204] (2/4) Exclude cut with ID _5287_210_2084_1_1527418883040_3878670_363-194161-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 03:50:31,209 WARNING [train.py:1204] (2/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_586-321321-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 03:50:31,281 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_68900_210_2087_1_1535713157_3708529_164-292661-0_sp1.1 from training. Duration: 0.963625 2023-10-02 03:50:31,282 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_26433_210_12108_1_1532157158_4056480_114-144878-0_sp1.1 from training. Duration: 0.97275 2023-10-02 03:50:34,626 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_15517_210_6753_1_1530954008_3487336_96-341874-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 03:50:43,516 WARNING [train.py:1204] (2/4) Exclude cut with ID _14268_210_9206_1_1530622771285_4714099_637-86630-0 from training. Duration: 0.51 2023-10-02 03:50:45,451 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_28211_210_6388_1_1532861506_4523224_121-320092-0_sp1.1 from training. Duration: 0.92725 2023-10-02 03:50:49,599 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_141-89113-0_sp0.9 from training. Duration: 0.9555625 2023-10-02 03:50:49,688 WARNING [train.py:1204] (2/4) Exclude cut with ID _6789_210_1534_1_1527857937218_3118950_311-61262-0_sp1.1 from training. Duration: 0.5909375 2023-10-02 03:50:50,965 WARNING [train.py:1204] (2/4) Exclude cut with ID _58480_210_12392_1_1534811121111_3646160_389-200461-0_sp1.1 from training. Duration: 0.97275 2023-10-02 03:50:54,140 WARNING [train.py:1204] (2/4) Exclude cut with ID _41326_210_5854_1_1533778076509_3492080_28-156553-0_sp1.1 from training. Duration: 0.92725 2023-10-02 03:50:55,455 WARNING [train.py:1204] (2/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_577-287150-0_sp1.1 from training. Duration: 0.7454375 2023-10-02 03:50:56,826 WARNING [train.py:1204] (2/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_40-129756-0_sp1.1 from training. Duration: 0.736375 2023-10-02 03:50:56,836 WARNING [train.py:1204] (2/4) Exclude cut with ID _60113_210_16271_1_1535164212481_4386079_490-280290-0 from training. Duration: 0.98 2023-10-02 03:51:01,129 WARNING [train.py:1204] (2/4) Exclude cut with ID _58059_210_5854_1_1534901471149_3164808_34-39905-0_sp1.1 from training. Duration: 0.963625 2023-10-02 03:51:01,336 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.conv_module2.balancer1.min_positive, batch_count=740746.6666666666, ans=0.025 2023-10-02 03:51:02,548 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_791-197891-0_sp1.1 from training. Duration: 0.763625 2023-10-02 03:51:02,561 WARNING [train.py:1204] (2/4) Exclude cut with ID _6789_210_1534_1_1527857937218_3118950_262-48853-0_sp1.1 from training. Duration: 0.5090625 2023-10-02 03:51:02,715 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=740746.6666666666, ans=0.1 2023-10-02 03:51:03,879 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_2655_210_2087_1_1525688959_3897400_52-279899-0_sp0.9 from training. Duration: 0.7666875 2023-10-02 03:51:03,884 WARNING [train.py:1204] (2/4) Exclude cut with ID _14884_210_2252_1_1530707459689_5561250_114-201399-0 from training. Duration: 0.76 2023-10-02 03:51:07,158 WARNING [train.py:1204] (2/4) Exclude cut with ID _19023_210_4353_1_1531378892552_5820780_83-254794-0_sp0.9 from training. Duration: 0.9555625 2023-10-02 03:51:07,189 WARNING [train.py:1204] (2/4) Exclude cut with ID _53489_210_6228_1_1534298357169_3835970_10-297919-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 03:51:10,227 WARNING [train.py:1204] (2/4) Exclude cut with ID _44585_210_18628_1_1533605350387_4518199_418-3055-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 03:51:10,240 WARNING [train.py:1204] (2/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_310-109461-0 from training. Duration: 0.8 2023-10-02 03:51:10,292 WARNING [train.py:1204] (2/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_235-262124-0 from training. Duration: 0.67 2023-10-02 03:51:11,648 WARNING [train.py:1204] (2/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_195-167011-0_sp1.1 from training. Duration: 0.6818125 2023-10-02 03:51:19,880 WARNING [train.py:1204] (2/4) Exclude cut with ID _7852_210_5219_1_1528286471316_8139400_188-55162-0_sp1.1 from training. Duration: 0.936375 2023-10-02 03:51:21,155 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_35361_210_4238_1_1533262810_7793476_675-87012-0 from training. Duration: 0.89 2023-10-02 03:51:21,233 WARNING [train.py:1204] (2/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_465-81427-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 03:51:21,240 WARNING [train.py:1204] (2/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_597-154640-0_sp1.1 from training. Duration: 0.8181875 2023-10-02 03:51:22,649 WARNING [train.py:1204] (2/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_186-309161-0_sp0.9 from training. Duration: 0.6 2023-10-02 03:51:24,878 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.conv_module2.balancer2.prob, batch_count=740813.3333333334, ans=0.125 2023-10-02 03:51:25,892 WARNING [train.py:1204] (2/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_546-119312-0 from training. Duration: 0.99 2023-10-02 03:51:25,902 WARNING [train.py:1204] (2/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_330-240060-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 03:51:27,170 WARNING [train.py:1204] (2/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_477-255782-0 from training. Duration: 0.82 2023-10-02 03:51:27,187 WARNING [train.py:1204] (2/4) Exclude cut with ID _14970_210_15709_1_1530773543493_1715730_150-248939-0_sp1.1 from training. Duration: 0.97275 2023-10-02 03:51:28,591 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_556-206615-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 03:51:28,661 WARNING [train.py:1204] (2/4) Exclude cut with ID _3465_210_3525_1_1526175667060_3574049_308-231735-0 from training. Duration: 0.73 2023-10-02 03:51:31,093 INFO [scaling.py:1022] (2/4) Whitening: name=encoder_embed.convnext.out_whiten, num_groups=1, num_channels=128, metric=4.68 vs. limit=5.0 2023-10-02 03:51:33,124 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.balancer2.prob, batch_count=740880.0, ans=0.125 2023-10-02 03:51:36,310 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.3.encoder.layers.2.nonlin_attention.whiten2, num_groups=1, num_channels=512, metric=5.59 vs. limit=15.0 2023-10-02 03:51:38,953 WARNING [train.py:1204] (2/4) Exclude cut with ID _27485_210_4916_1_1532739191677_3668269_66-236571-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 03:51:43,305 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_2221_210_1988_1_1525597051_4935541_415-296882-0_sp0.9 from training. Duration: 0.8555625 2023-10-02 03:51:43,313 WARNING [train.py:1204] (2/4) Exclude cut with ID _64521_210_14699_1_1535243427259_3815328_231-78630-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 03:51:46,610 INFO [train.py:1046] (2/4) Epoch 21, batch 4900, loss[loss=0.1652, simple_loss=0.245, pruned_loss=0.04268, over 24343.00 frames. ], tot_loss[loss=0.1746, simple_loss=0.2509, pruned_loss=0.04912, over 4728851.63 frames. ], batch size: 61, lr: 4.85e-03, grad_scale: 16.0 2023-10-02 03:51:48,133 WARNING [train.py:1204] (2/4) Exclude cut with ID _13942_210_9988_1_1530604906036_3733360_499-55676-0 from training. Duration: 0.62 2023-10-02 03:51:48,135 WARNING [train.py:1204] (2/4) Exclude cut with ID _49376_210_5986_1_1533985278732_3370740_301-154763-0_sp1.1 from training. Duration: 0.8909375 2023-10-02 03:51:52,438 WARNING [train.py:1204] (2/4) Exclude cut with ID _27485_210_4916_1_1532739191677_3668269_255-216698-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 03:51:53,729 WARNING [train.py:1204] (2/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_137-334143-0_sp1.1 from training. Duration: 0.97275 2023-10-02 03:51:53,748 WARNING [train.py:1204] (2/4) Exclude cut with ID _6456_210_1534_1_1527681634212_3181049_215-309359-0_sp1.1 from training. Duration: 0.8 2023-10-02 03:51:54,059 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.bypass.scale_min, batch_count=740946.6666666666, ans=0.2 2023-10-02 03:51:57,047 WARNING [train.py:1204] (2/4) Exclude cut with ID _65640_210_6310_1_1535368201445_3173250_209-13008-0 from training. Duration: 0.99 2023-10-02 03:52:01,421 WARNING [train.py:1204] (2/4) Exclude cut with ID _12369_210_4381_1_1530356078581_4388520_58-29064-0 from training. Duration: 0.99 2023-10-02 03:52:04,359 WARNING [train.py:1204] (2/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_410-287857-0 from training. Duration: 0.93 2023-10-02 03:52:05,764 WARNING [train.py:1204] (2/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_153-153537-0 from training. Duration: 0.91 2023-10-02 03:52:05,810 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7544_210_6753_1_1528587722_3576686_172-171750-0_sp0.9 from training. Duration: 0.72225 2023-10-02 03:52:05,840 WARNING [train.py:1204] (2/4) Exclude cut with ID _64521_210_14699_1_1535243427259_3815328_464-183853-0_sp1.1 from training. Duration: 0.97275 2023-10-02 03:52:05,853 WARNING [train.py:1204] (2/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_385-210308-0_sp1.1 from training. Duration: 0.963625 2023-10-02 03:52:05,868 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_46013_210_7234_1_1533776362_3744032_354-261865-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 03:52:05,879 WARNING [train.py:1204] (2/4) Exclude cut with ID _3067_210_2667_1_1526212974640_4503168_250-285937-0_sp1.1 from training. Duration: 0.72725 2023-10-02 03:52:07,882 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_40036_210_13405_1_1533648647_1315388_48-146016-0 from training. Duration: 0.93 2023-10-02 03:52:12,593 WARNING [train.py:1204] (2/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_280-161633-0 from training. Duration: 0.73 2023-10-02 03:52:14,475 WARNING [train.py:1204] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_308-161067-0_sp0.9 from training. Duration: 0.7333125 2023-10-02 03:52:15,878 WARNING [train.py:1204] (2/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_68-212801-0_sp0.9 from training. Duration: 0.811125 2023-10-02 03:52:17,359 WARNING [train.py:1204] (2/4) Exclude cut with ID _6789_210_1534_1_1527857937218_3118950_311-61262-0_sp0.9 from training. Duration: 0.72225 2023-10-02 03:52:18,782 WARNING [train.py:1204] (2/4) Exclude cut with ID _14632_210_9988_1_1530691279392_3923200_102-204477-0_sp1.1 from training. Duration: 0.8818125 2023-10-02 03:52:20,210 WARNING [train.py:1204] (2/4) Exclude cut with ID _65242_210_18628_1_1535369393717_3823149_290-117517-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 03:52:20,326 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_69630_210_7234_1_1535788670_910989_35-337140-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 03:52:20,334 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_5618_210_1874_1_1527311008_5851_1-214501-0 from training. Duration: 0.689 2023-10-02 03:52:22,951 WARNING [train.py:1204] (2/4) Exclude cut with ID _8064_210_5399_1_1528372299291_4282973_168-50337-0_sp0.9 from training. Duration: 0.9333125 2023-10-02 03:52:23,040 WARNING [train.py:1204] (2/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_185-14438-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 03:52:23,064 WARNING [train.py:1204] (2/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_61-327062-0 from training. Duration: 0.81 2023-10-02 03:52:23,069 WARNING [train.py:1204] (2/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_98-741-0 from training. Duration: 0.8 2023-10-02 03:52:24,720 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.conv_module2.balancer2.prob, batch_count=741080.0, ans=0.125 2023-10-02 03:52:25,620 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.526e+02 1.942e+02 2.183e+02 2.601e+02 5.042e+02, threshold=4.365e+02, percent-clipped=7.0 2023-10-02 03:52:26,383 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.3.encoder.layers.3.self_attn2.whiten, num_groups=1, num_channels=512, metric=10.85 vs. limit=22.5 2023-10-02 03:52:27,094 WARNING [train.py:1204] (2/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_312-204798-0 from training. Duration: 0.93 2023-10-02 03:52:28,529 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.2.encoder.layers.0.self_attn_weights.whiten_keys, num_groups=4, num_channels=128, metric=4.59 vs. limit=6.0 2023-10-02 03:52:29,052 WARNING [train.py:1204] (2/4) Exclude cut with ID _14998_210_5219_1_1530705582080_7474970_787-294416-0_sp0.9 from training. Duration: 0.911125 2023-10-02 03:52:31,707 WARNING [train.py:1204] (2/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_140-181176-0_sp1.1 from training. Duration: 0.836375 2023-10-02 03:52:31,752 WARNING [train.py:1204] (2/4) Exclude cut with ID _14687_210_5854_1_1530703838259_3664889_115-239046-0_sp1.1 from training. Duration: 0.6090625 2023-10-02 03:52:33,081 WARNING [train.py:1204] (2/4) Exclude cut with ID _56423_210_5854_1_1534896061049_3165969_370-230614-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 03:52:33,128 WARNING [train.py:1204] (2/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_406-275649-0_sp0.9 from training. Duration: 0.4666875 2023-10-02 03:52:33,139 WARNING [train.py:1204] (2/4) Exclude cut with ID _15150_210_8341_1_1530752411803_7256384_22-138274-0_sp1.1 from training. Duration: 0.87275 2023-10-02 03:52:33,183 WARNING [train.py:1204] (2/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_125-125818-0 from training. Duration: 0.96 2023-10-02 03:52:35,998 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_4399_210_1874_1_1526705485_2400757_220-47974-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 03:52:37,432 WARNING [train.py:1204] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_308-161067-0_sp1.1 from training. Duration: 0.6 2023-10-02 03:52:39,532 WARNING [train.py:1204] (2/4) Exclude cut with ID _53489_210_6228_1_1534298357169_3835970_102-57645-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 03:52:42,968 WARNING [train.py:1204] (2/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_178-177582-0 from training. Duration: 0.74 2023-10-02 03:52:44,350 WARNING [train.py:1204] (2/4) Exclude cut with ID _14349_210_8341_1_1530615710818_3442470_170-336541-0_sp1.1 from training. Duration: 0.7818125 2023-10-02 03:52:44,409 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_521-214276-0_sp0.9 from training. Duration: 0.488875 2023-10-02 03:52:45,745 WARNING [train.py:1204] (2/4) Exclude cut with ID _66070_210_7640_1_1535441393096_7805310_489-292631-0 from training. Duration: 0.79 2023-10-02 03:52:47,288 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.nonlin_attention.balancer.max_positive, batch_count=741213.3333333334, ans=0.95 2023-10-02 03:52:53,382 WARNING [train.py:1204] (2/4) Exclude cut with ID _14069_210_5632_1_1530512692033_4803390_438-340023-0_sp1.1 from training. Duration: 0.936375 2023-10-02 03:52:54,773 WARNING [train.py:1204] (2/4) Exclude cut with ID _7852_210_5219_1_1528286471316_8139400_242-105336-0_sp1.1 from training. Duration: 0.6545625 2023-10-02 03:52:56,214 WARNING [train.py:1204] (2/4) Exclude cut with ID _3867_210_1883_1_1526641200575_4475830_272-215563-0 from training. Duration: 0.57 2023-10-02 03:52:56,221 WARNING [train.py:1204] (2/4) Exclude cut with ID _14368_210_4220_1_1530532714870_3595529_143-117421-0_sp0.9 from training. Duration: 0.6444375 2023-10-02 03:52:56,230 WARNING [train.py:1204] (2/4) Exclude cut with ID _9235_210_5219_1_1528891154253_8046120_30-326681-0_sp1.1 from training. Duration: 0.7545625 2023-10-02 03:52:57,732 WARNING [train.py:1204] (2/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_538-218448-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 03:53:01,702 INFO [train.py:1046] (2/4) Epoch 21, batch 4950, loss[loss=0.1856, simple_loss=0.2497, pruned_loss=0.06074, over 23702.00 frames. ], tot_loss[loss=0.1734, simple_loss=0.2498, pruned_loss=0.04847, over 4735666.63 frames. ], batch size: 164, lr: 4.85e-03, grad_scale: 16.0 2023-10-02 03:53:01,817 WARNING [train.py:1204] (2/4) Exclude cut with ID _33640_210_2655_1_1533117605479_4855469_60-192970-0_sp1.1 from training. Duration: 0.963625 2023-10-02 03:53:01,828 WARNING [train.py:1204] (2/4) Exclude cut with ID _65585_210_12210_1_1535450451584_3764098_112-294301-0_sp1.1 from training. Duration: 0.8 2023-10-02 03:53:03,612 WARNING [train.py:1204] (2/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_643-37412-0_sp1.1 from training. Duration: 0.936375 2023-10-02 03:53:03,628 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_9302_210_2550_1_1528889962_4655416_638-301620-0_sp1.1 from training. Duration: 0.42725 2023-10-02 03:53:04,998 WARNING [train.py:1204] (2/4) Exclude cut with ID _3867_210_1883_1_1526641200575_4475830_272-215563-0_sp1.1 from training. Duration: 0.5181875 2023-10-02 03:53:07,736 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_53679_210_4852_1_1534556726_4186141_358-154143-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 03:53:07,751 WARNING [train.py:1204] (2/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_841-341679-0_sp0.9 from training. Duration: 0.6444375 2023-10-02 03:53:11,137 WARNING [train.py:1204] (2/4) Exclude cut with ID _7160_210_1876_1_1527940923231_3703799_302-150728-0 from training. Duration: 0.78 2023-10-02 03:53:11,172 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_125-203105-0 from training. Duration: 0.8 2023-10-02 03:53:11,186 WARNING [train.py:1204] (2/4) Exclude cut with ID _5585_210_2667_1_1527418813324_8007340_89-332072-0_sp0.9 from training. Duration: 0.711125 2023-10-02 03:53:12,555 WARNING [train.py:1204] (2/4) Exclude cut with ID _3992_210_1747_1_1526644816031_3731060_36-147855-0 from training. Duration: 0.78 2023-10-02 03:53:12,572 WARNING [train.py:1204] (2/4) Exclude cut with ID _59256_210_5986_1_1535076085997_3589120_172-239045-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 03:53:12,580 WARNING [train.py:1204] (2/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_472-216663-0_sp1.1 from training. Duration: 0.836375 2023-10-02 03:53:14,313 WARNING [train.py:1204] (2/4) Exclude cut with ID _36512_210_6987_1_1533033143998_3126069_181-306500-0_sp1.1 from training. Duration: 0.863625 2023-10-02 03:53:14,328 WARNING [train.py:1204] (2/4) Exclude cut with ID _56524_210_9642_1_1534572011779_3619170_492-95008-0_sp1.1 from training. Duration: 0.97275 2023-10-02 03:53:16,345 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_33348_210_5632_1_1533086912_3324030_119-90326-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 03:53:17,813 WARNING [train.py:1204] (2/4) Exclude cut with ID _14368_210_4220_1_1530532714870_3595529_231-137892-0_sp1.1 from training. Duration: 0.9 2023-10-02 03:53:19,049 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8453_210_4211_1_1528502940_8562730_596-306820-0_sp1.1 from training. Duration: 0.8818125 2023-10-02 03:53:20,544 WARNING [train.py:1204] (2/4) Exclude cut with ID _27052_210_5986_1_1532329197187_3585009_50-242750-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 03:53:22,644 WARNING [train.py:1204] (2/4) Exclude cut with ID _13913_210_9994_1_1530579615187_3383897_358-98435-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 03:53:22,657 WARNING [train.py:1204] (2/4) Exclude cut with ID _27013_210_1389_1_1532437173240_3252649_399-229778-0_sp1.1 from training. Duration: 0.963625 2023-10-02 03:53:25,311 WARNING [train.py:1204] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_185-174805-0_sp0.9 from training. Duration: 0.7555625 2023-10-02 03:53:29,339 WARNING [train.py:1204] (2/4) Exclude cut with ID _65585_210_12210_1_1535450451584_3764098_196-221014-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 03:53:30,715 WARNING [train.py:1204] (2/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_202-293523-0_sp1.1 from training. Duration: 0.6545625 2023-10-02 03:53:32,272 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8275_210_4198_1_1528455320_3892571_213-127634-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 03:53:33,620 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_40896_210_15710_1_1533433956_7100802_921-231678-0_sp1.1 from training. Duration: 0.97275 2023-10-02 03:53:35,605 WARNING [train.py:1204] (2/4) Exclude cut with ID _38020_210_5986_1_1533112252259_7006089_493-102745-0_sp1.1 from training. Duration: 0.8909375 2023-10-02 03:53:35,691 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6846_210_6753_1_1527850767_4295158_2-315073-0 from training. Duration: 0.88 2023-10-02 03:53:35,753 WARNING [train.py:1204] (2/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_249-281349-0 from training. Duration: 0.85 2023-10-02 03:53:38,364 WARNING [train.py:1204] (2/4) Exclude cut with ID _18513_210_9206_1_1531618215223_3862620_314-83429-0_sp1.1 from training. Duration: 0.97275 2023-10-02 03:53:41,602 WARNING [train.py:1204] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_215-202973-0_sp0.9 from training. Duration: 0.97775 2023-10-02 03:53:41,611 WARNING [train.py:1204] (2/4) Exclude cut with ID _7719_210_3525_1_1528456153163_4296960_88-250259-0_sp1.1 from training. Duration: 0.763625 2023-10-02 03:53:43,087 WARNING [train.py:1204] (2/4) Exclude cut with ID _7606_210_4915_1_1528894253277_5080690_301-10732-0_sp1.1 from training. Duration: 0.736375 2023-10-02 03:53:43,096 WARNING [train.py:1204] (2/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_818-11907-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 03:53:43,333 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.balancer1.prob, batch_count=741413.3333333334, ans=0.125 2023-10-02 03:53:44,461 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_5959_210_1794_1_1527400400_3445726_412-44052-0_sp1.1 from training. Duration: 0.7 2023-10-02 03:53:46,884 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.4.encoder.layers.1.self_attn_weights.whiten_keys, num_groups=4, num_channels=128, metric=2.50 vs. limit=6.0 2023-10-02 03:53:47,632 WARNING [train.py:1204] (2/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_6-279195-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 03:53:49,089 WARNING [train.py:1204] (2/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_252-216674-0_sp0.9 from training. Duration: 0.6 2023-10-02 03:53:51,055 WARNING [train.py:1204] (2/4) Exclude cut with ID _14368_210_4220_1_1530532714870_3595529_144-3287-0_sp1.1 from training. Duration: 0.6090625 2023-10-02 03:53:52,595 WARNING [train.py:1204] (2/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_342-3879-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 03:53:53,906 WARNING [train.py:1204] (2/4) Exclude cut with ID _33640_210_2655_1_1533117605479_4855469_584-65630-0_sp1.1 from training. Duration: 0.97275 2023-10-02 03:53:53,979 WARNING [train.py:1204] (2/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_37-189262-0 from training. Duration: 0.98 2023-10-02 03:53:54,007 WARNING [train.py:1204] (2/4) Exclude cut with ID _9389_210_1664_1_1529217337301_5289269_379-128794-0_sp0.9 from training. Duration: 0.9444375 2023-10-02 03:53:55,350 WARNING [train.py:1204] (2/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_119-207162-0_sp1.1 from training. Duration: 0.6454375 2023-10-02 03:53:58,251 WARNING [train.py:1204] (2/4) Exclude cut with ID _2941_210_2577_1_1526199673150_2489759_200-333643-0_sp1.1 from training. Duration: 0.936375 2023-10-02 03:54:00,827 WARNING [train.py:1204] (2/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_479-125611-0_sp1.1 from training. Duration: 0.87275 2023-10-02 03:54:00,846 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_3475_210_2028_1_1526193578_3622569_64-297072-0_sp1.1 from training. Duration: 0.82725 2023-10-02 03:54:00,883 WARNING [train.py:1204] (2/4) Exclude cut with ID _53565_210_10635_1_1534337218335_3820259_139-346291-0_sp1.1 from training. Duration: 0.97275 2023-10-02 03:54:00,922 WARNING [train.py:1204] (2/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_588-125165-0_sp1.1 from training. Duration: 0.7545625 2023-10-02 03:54:02,270 WARNING [train.py:1204] (2/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_938-205135-0_sp1.1 from training. Duration: 0.7909375 2023-10-02 03:54:03,743 WARNING [train.py:1204] (2/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_216-48707-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 03:54:05,583 WARNING [train.py:1204] (2/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_477-255782-0_sp1.1 from training. Duration: 0.7454375 2023-10-02 03:54:05,613 WARNING [train.py:1204] (2/4) Exclude cut with ID _16909_210_5724_1_1531292079912_4118919_583-92842-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 03:54:06,974 WARNING [train.py:1204] (2/4) Exclude cut with ID _60860_210_8254_1_1534896015875_3557169_245-256666-0 from training. Duration: 0.97 2023-10-02 03:54:11,330 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_45399_210_4852_1_1533624725_7579737_349-116173-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 03:54:14,991 WARNING [train.py:1204] (2/4) Exclude cut with ID _8699_210_2069_1_1528630346831_3960409_155-39766-0 from training. Duration: 0.78 2023-10-02 03:54:15,002 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_86-16881-0_sp1.1 from training. Duration: 0.52725 2023-10-02 03:54:15,251 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.nonlin_attention.balancer.prob, batch_count=741546.6666666666, ans=0.125 2023-10-02 03:54:18,266 INFO [train.py:1046] (2/4) Epoch 21, batch 5000, loss[loss=0.1894, simple_loss=0.2655, pruned_loss=0.05659, over 23444.00 frames. ], tot_loss[loss=0.1726, simple_loss=0.2493, pruned_loss=0.048, over 4745236.26 frames. ], batch size: 106, lr: 4.85e-03, grad_scale: 16.0 2023-10-02 03:54:19,998 INFO [scaling.py:1118] (2/4) WithLoss: name=encoder.encoders.3.encoder.layers.3.self_attn_weights, loss-sum=0.000e+00 2023-10-02 03:54:22,407 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_14056_210_4163_1_1530604483_7572827_191-159384-0_sp1.1 from training. Duration: 0.97275 2023-10-02 03:54:22,415 WARNING [train.py:1204] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_307-158462-0_sp1.1 from training. Duration: 0.62725 2023-10-02 03:54:23,875 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7544_210_6753_1_1528591327_1471426_33-346031-0 from training. Duration: 0.61 2023-10-02 03:54:25,283 WARNING [train.py:1204] (2/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_112-149213-0 from training. Duration: 0.95 2023-10-02 03:54:26,696 WARNING [train.py:1204] (2/4) Exclude cut with ID _55844_210_2346_1_1534471212569_7625707_925-191342-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 03:54:29,524 WARNING [train.py:1204] (2/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_87-278686-0 from training. Duration: 0.77 2023-10-02 03:54:29,551 WARNING [train.py:1204] (2/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_415-78792-0_sp1.1 from training. Duration: 0.736375 2023-10-02 03:54:30,790 WARNING [train.py:1204] (2/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_107-332398-0_sp0.9 from training. Duration: 0.8666875 2023-10-02 03:54:30,897 WARNING [train.py:1204] (2/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_72-251255-0 from training. Duration: 0.94 2023-10-02 03:54:32,164 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_431-227273-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 03:54:32,226 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7544_210_6753_1_1528587000_691468_87-178599-0_sp0.9 from training. Duration: 0.9666875 2023-10-02 03:54:33,553 WARNING [train.py:1204] (2/4) Exclude cut with ID _13916_210_9994_1_1530788341126_3763149_60-191556-0 from training. Duration: 0.61 2023-10-02 03:54:33,555 WARNING [train.py:1204] (2/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_746-164782-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 03:54:33,606 WARNING [train.py:1204] (2/4) Exclude cut with ID _33640_210_2655_1_1533117605479_4855469_596-21066-0_sp1.1 from training. Duration: 0.92725 2023-10-02 03:54:36,769 WARNING [train.py:1204] (2/4) Exclude cut with ID _40402_210_18219_1_1533522324115_2953150_320-14078-0 from training. Duration: 0.95 2023-10-02 03:54:38,084 WARNING [train.py:1204] (2/4) Exclude cut with ID _29877_210_6228_1_1532397791106_5276149_266-62291-0 from training. Duration: 0.87 2023-10-02 03:54:38,150 WARNING [train.py:1204] (2/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_176-56120-0_sp0.9 from training. Duration: 0.92225 2023-10-02 03:54:38,193 WARNING [train.py:1204] (2/4) Exclude cut with ID _6275_210_2084_1_1528110062213_7669049_91-26919-0 from training. Duration: 0.97 2023-10-02 03:54:38,200 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_224-322394-0_sp0.9 from training. Duration: 0.7333125 2023-10-02 03:54:39,492 WARNING [train.py:1204] (2/4) Exclude cut with ID _44982_210_1511_1_1534312820108_3726029_586-297059-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 03:54:40,933 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_196-289637-0_sp1.1 from training. Duration: 0.6818125 2023-10-02 03:54:40,935 WARNING [train.py:1204] (2/4) Exclude cut with ID _18513_210_9206_1_1531618215223_3862620_402-5957-0 from training. Duration: 0.99 2023-10-02 03:54:40,940 WARNING [train.py:1204] (2/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_81-300212-0 from training. Duration: 0.6 2023-10-02 03:54:43,748 WARNING [train.py:1204] (2/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_166-128941-0 from training. Duration: 0.54 2023-10-02 03:54:43,773 WARNING [train.py:1204] (2/4) Exclude cut with ID _58059_210_5854_1_1534901471149_3164808_57-309660-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 03:54:45,073 WARNING [train.py:1204] (2/4) Exclude cut with ID _60621_210_17071_1_1535013053530_3671739_73-155814-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 03:54:45,169 WARNING [train.py:1204] (2/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_726-270780-0 from training. Duration: 0.86 2023-10-02 03:54:45,180 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8853_210_3273_1_1528633899_818968_51-187685-0_sp1.1 from training. Duration: 0.62725 2023-10-02 03:54:47,085 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_315-212794-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 03:54:48,531 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_53373_210_5399_1_1534229471_4715695_314-122795-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 03:54:50,521 WARNING [train.py:1204] (2/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_187-221566-0_sp0.9 from training. Duration: 0.47775 2023-10-02 03:54:50,833 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.self_attn_weights.pos_emb_skip_rate, batch_count=741746.6666666666, ans=0.0 2023-10-02 03:54:51,971 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_12868_210_6753_1_1530702479_3610564_133-564-0 from training. Duration: 0.79 2023-10-02 03:54:53,138 WARNING [train.py:1204] (2/4) Exclude cut with ID _55153_210_2253_1_1534384786677_3162096_255-228344-0_sp1.1 from training. Duration: 0.936375 2023-10-02 03:54:53,249 WARNING [train.py:1204] (2/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_383-165234-0_sp1.1 from training. Duration: 0.8545625 2023-10-02 03:54:55,823 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.620e+02 1.891e+02 2.105e+02 2.416e+02 3.579e+02, threshold=4.211e+02, percent-clipped=0.0 2023-10-02 03:54:57,361 WARNING [train.py:1204] (2/4) Exclude cut with ID IC0677W0056-334889 from training. Duration: 0.768 2023-10-02 03:55:00,192 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_14953_210_2151_1_1530701710_5616165_710-165400-0_sp0.9 from training. Duration: 0.9666875 2023-10-02 03:55:01,612 WARNING [train.py:1204] (2/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_355-236205-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 03:55:01,613 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_34450_210_9547_1_1533099443_4109050_503-246893-0_sp1.1 from training. Duration: 0.963625 2023-10-02 03:55:01,860 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.nonlin_attention.balancer.prob, batch_count=741813.3333333334, ans=0.125 2023-10-02 03:55:04,441 WARNING [train.py:1204] (2/4) Exclude cut with ID _43552_210_8913_1_1533726031059_3523429_25-120161-0 from training. Duration: 0.98 2023-10-02 03:55:05,794 WARNING [train.py:1204] (2/4) Exclude cut with ID _66112_210_10635_1_1535590236305_3801484_205-311930-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 03:55:05,815 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_48049_210_5632_1_1533864553_3519937_185-281218-0_sp1.1 from training. Duration: 0.92725 2023-10-02 03:55:05,845 WARNING [train.py:1204] (2/4) Exclude cut with ID _48782_210_12658_1_1534467701544_2300422_191-332415-0_sp1.1 from training. Duration: 0.97275 2023-10-02 03:55:08,919 WARNING [train.py:1204] (2/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_907-151398-0_sp1.1 from training. Duration: 0.336375 2023-10-02 03:55:08,957 WARNING [train.py:1204] (2/4) Exclude cut with ID _67499_210_17282_1_1535509805220_3692430_20-179346-0_sp1.1 from training. Duration: 0.8818125 2023-10-02 03:55:10,392 WARNING [train.py:1204] (2/4) Exclude cut with ID _14998_210_5219_1_1530705582080_7474970_441-91466-0_sp1.1 from training. Duration: 0.8818125 2023-10-02 03:55:11,774 WARNING [train.py:1204] (2/4) Exclude cut with ID _46681_210_9642_1_1533893005571_7603359_786-164007-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 03:55:17,768 WARNING [train.py:1204] (2/4) Exclude cut with ID _14970_210_15709_1_1530773543493_1715730_187-20259-0 from training. Duration: 0.89 2023-10-02 03:55:21,310 WARNING [train.py:1204] (2/4) Exclude cut with ID _12734_210_8341_1_1530183614296_3891929_383-339075-0_sp1.1 from training. Duration: 0.963625 2023-10-02 03:55:30,927 INFO [train.py:1046] (2/4) Epoch 21, batch 5050, loss[loss=0.1842, simple_loss=0.2567, pruned_loss=0.05585, over 15571.00 frames. ], tot_loss[loss=0.1737, simple_loss=0.25, pruned_loss=0.04865, over 4728275.49 frames. ], batch size: 33, lr: 4.84e-03, grad_scale: 16.0 2023-10-02 03:55:31,086 WARNING [train.py:1204] (2/4) Exclude cut with ID _37086_210_9038_1_1532999540891_7021250_834-322225-0_sp1.1 from training. Duration: 0.92725 2023-10-02 03:55:32,460 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_10987_210_4163_1_1529760555_4123902_56-290559-0_sp1.1 from training. Duration: 0.963625 2023-10-02 03:55:32,467 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_92-246270-0_sp0.9 from training. Duration: 0.9444375 2023-10-02 03:55:32,485 WARNING [train.py:1204] (2/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_56-13561-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 03:55:32,521 WARNING [train.py:1204] (2/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_393-57888-0_sp1.1 from training. Duration: 0.5818125 2023-10-02 03:55:33,888 WARNING [train.py:1204] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_168-32906-0_sp1.1 from training. Duration: 0.62725 2023-10-02 03:55:33,930 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_51225_210_3601_1_1534400634_2327383_127-207553-0_sp1.1 from training. Duration: 0.963625 2023-10-02 03:55:36,335 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.0.layers.0.feed_forward2.out_whiten, num_groups=1, num_channels=192, metric=7.33 vs. limit=15.0 2023-10-02 03:55:38,191 WARNING [train.py:1204] (2/4) Exclude cut with ID _58583_210_5236_1_1534726835266_3574249_494-203521-0_sp1.1 from training. Duration: 0.963625 2023-10-02 03:55:38,203 WARNING [train.py:1204] (2/4) Exclude cut with ID _49376_210_5986_1_1533985278732_3370740_354-344695-0 from training. Duration: 0.99 2023-10-02 03:55:40,243 WARNING [train.py:1204] (2/4) Exclude cut with ID _8021_210_7994_1_1528376451358_3653069_179-45206-0_sp1.1 from training. Duration: 0.7818125 2023-10-02 03:55:43,025 WARNING [train.py:1204] (2/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_742-154017-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 03:55:43,225 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.balancer1.prob, batch_count=741946.6666666666, ans=0.125 2023-10-02 03:55:44,262 WARNING [train.py:1204] (2/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_222-198127-0_sp1.1 from training. Duration: 0.763625 2023-10-02 03:55:44,319 WARNING [train.py:1204] (2/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_235-307337-0 from training. Duration: 0.94 2023-10-02 03:55:45,778 WARNING [train.py:1204] (2/4) Exclude cut with ID _8393_210_6702_1_1528505860249_3457328_453-176500-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 03:55:47,065 WARNING [train.py:1204] (2/4) Exclude cut with ID _53565_210_10635_1_1534337218335_3820259_102-179528-0_sp1.1 from training. Duration: 0.97275 2023-10-02 03:55:48,532 WARNING [train.py:1204] (2/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_43-206757-0_sp1.1 from training. Duration: 0.6818125 2023-10-02 03:55:48,631 WARNING [train.py:1204] (2/4) Exclude cut with ID _8394_210_6702_1_1528590504316_3540189_335-274780-0_sp1.1 from training. Duration: 0.7090625 2023-10-02 03:55:50,308 WARNING [train.py:1204] (2/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_128-296581-0_sp1.1 from training. Duration: 0.67275 2023-10-02 03:55:59,716 WARNING [train.py:1204] (2/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_111-112362-0 from training. Duration: 0.91 2023-10-02 03:55:59,953 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.conv_skip_rate, batch_count=742080.0, ans=0.0 2023-10-02 03:56:01,051 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_5522_210_3273_1_1527249606_3909096_366-172508-0_sp1.1 from training. Duration: 0.6 2023-10-02 03:56:01,129 WARNING [train.py:1204] (2/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_47-167373-0_sp1.1 from training. Duration: 0.836375 2023-10-02 03:56:02,477 WARNING [train.py:1204] (2/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_41-234353-0 from training. Duration: 0.68 2023-10-02 03:56:03,877 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6291_210_3273_1_1527683649_1078498_82-221196-0_sp0.9 from training. Duration: 0.8444375 2023-10-02 03:56:03,971 WARNING [train.py:1204] (2/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_47-4814-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 03:56:03,998 WARNING [train.py:1204] (2/4) Exclude cut with ID _43541_210_10626_1_1533796233418_3635440_40-247993-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 03:56:04,040 WARNING [train.py:1204] (2/4) Exclude cut with ID _21989_210_5854_1_1531899214931_3271360_133-44759-0_sp0.9 from training. Duration: 0.8555625 2023-10-02 03:56:04,043 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_94-195801-0 from training. Duration: 0.94 2023-10-02 03:56:05,344 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_141-89113-0 from training. Duration: 0.86 2023-10-02 03:56:05,623 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.feed_forward1.hidden_balancer.prob, batch_count=742080.0, ans=0.125 2023-10-02 03:56:06,688 WARNING [train.py:1204] (2/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_683-284578-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 03:56:09,423 WARNING [train.py:1204] (2/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_6-214815-0_sp1.1 from training. Duration: 0.77275 2023-10-02 03:56:12,625 WARNING [train.py:1204] (2/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_556-212082-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 03:56:12,663 WARNING [train.py:1204] (2/4) Exclude cut with ID _44902_210_16187_1_1533888006062_3792078_149-67212-0 from training. Duration: 0.87 2023-10-02 03:56:15,379 WARNING [train.py:1204] (2/4) Exclude cut with ID _61148_210_6897_1_1535094022726_4217610_333-159440-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 03:56:18,275 WARNING [train.py:1204] (2/4) Exclude cut with ID _3120_210_3525_1_1526036143112_4072980_404-249994-0 from training. Duration: 0.99 2023-10-02 03:56:19,678 WARNING [train.py:1204] (2/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_234-260682-0_sp0.9 from training. Duration: 0.9333125 2023-10-02 03:56:19,710 WARNING [train.py:1204] (2/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_374-138971-0_sp1.1 from training. Duration: 0.8545625 2023-10-02 03:56:19,777 WARNING [train.py:1204] (2/4) Exclude cut with ID _53713_210_24116_1_1534314487762_7416751_743-302085-0_sp1.1 from training. Duration: 0.92725 2023-10-02 03:56:19,843 WARNING [train.py:1204] (2/4) Exclude cut with ID _18516_210_9206_1_1531704653206_3692406_198-334870-0_sp1.1 from training. Duration: 0.836375 2023-10-02 03:56:21,855 WARNING [train.py:1204] (2/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_395-73570-0_sp1.1 from training. Duration: 0.9 2023-10-02 03:56:25,093 WARNING [train.py:1204] (2/4) Exclude cut with ID _46683_210_9642_1_1533730913492_4591116_421-19547-0_sp1.1 from training. Duration: 0.936375 2023-10-02 03:56:25,152 WARNING [train.py:1204] (2/4) Exclude cut with ID _19896_210_1883_1_1531709022662_6649569_714-8668-0_sp1.1 from training. Duration: 0.963625 2023-10-02 03:56:25,173 WARNING [train.py:1204] (2/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_49-101072-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 03:56:25,181 WARNING [train.py:1204] (2/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_220-191080-0_sp1.1 from training. Duration: 0.8909375 2023-10-02 03:56:27,118 WARNING [train.py:1204] (2/4) Exclude cut with ID _3465_210_3525_1_1526175667060_3574049_62-335552-0 from training. Duration: 0.87 2023-10-02 03:56:28,390 WARNING [train.py:1204] (2/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_111-112362-0_sp1.1 from training. Duration: 0.82725 2023-10-02 03:56:28,653 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.conv_skip_rate, batch_count=742146.6666666666, ans=0.0 2023-10-02 03:56:29,847 WARNING [train.py:1204] (2/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_318-59097-0_sp0.9 from training. Duration: 0.8444375 2023-10-02 03:56:33,881 WARNING [train.py:1204] (2/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_98-255710-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 03:56:33,887 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9042W0003-515609 from training. Duration: 0.896 2023-10-02 03:56:33,895 WARNING [train.py:1204] (2/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_158-250116-0_sp0.9 from training. Duration: 0.688875 2023-10-02 03:56:35,222 WARNING [train.py:1204] (2/4) Exclude cut with ID _26750_210_5665_1_1532226678442_4063230_7-346885-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 03:56:35,259 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_37839_210_7234_1_1533542462_3689303_357-227078-0_sp1.1 from training. Duration: 0.963625 2023-10-02 03:56:35,296 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9052W0119-520904 from training. Duration: 0.896 2023-10-02 03:56:36,750 WARNING [train.py:1204] (2/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_249-281349-0_sp1.1 from training. Duration: 0.77275 2023-10-02 03:56:36,755 WARNING [train.py:1204] (2/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_147-293311-0 from training. Duration: 0.94 2023-10-02 03:56:36,756 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_158-21166-0_sp1.1 from training. Duration: 0.963625 2023-10-02 03:56:41,325 WARNING [train.py:1204] (2/4) Exclude cut with ID _14100_210_8341_1_1530529320613_3678069_207-332194-0_sp1.1 from training. Duration: 0.92725 2023-10-02 03:56:41,552 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.conv_module1.balancer2.min_positive, batch_count=742213.3333333334, ans=0.05 2023-10-02 03:56:42,473 WARNING [train.py:1204] (2/4) Exclude cut with ID _9389_210_1664_1_1529217337301_5289269_324-211031-0_sp1.1 from training. Duration: 0.963625 2023-10-02 03:56:42,497 WARNING [train.py:1204] (2/4) Exclude cut with ID _14603_210_4220_1_1530615688441_3097319_133-158308-0 from training. Duration: 0.84 2023-10-02 03:56:44,000 WARNING [train.py:1204] (2/4) Exclude cut with ID _13252_210_1925_1_1530671372047_3336909_166-314607-0 from training. Duration: 0.54 2023-10-02 03:56:45,282 INFO [train.py:1046] (2/4) Epoch 21, batch 5100, loss[loss=0.1581, simple_loss=0.2296, pruned_loss=0.0433, over 19564.00 frames. ], tot_loss[loss=0.174, simple_loss=0.2505, pruned_loss=0.04873, over 4728333.52 frames. ], batch size: 42, lr: 4.84e-03, grad_scale: 16.0 2023-10-02 03:56:46,777 WARNING [train.py:1204] (2/4) Exclude cut with ID _32704_210_8954_1_1533017488840_6747370_649-79538-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 03:56:46,785 WARNING [train.py:1204] (2/4) Exclude cut with ID _28394_210_8913_1_1532430049518_3650118_343-115667-0_sp1.1 from training. Duration: 0.97275 2023-10-02 03:56:46,851 WARNING [train.py:1204] (2/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_87-278686-0_sp0.9 from training. Duration: 0.8555625 2023-10-02 03:56:49,672 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9109W0003-549749 from training. Duration: 0.768 2023-10-02 03:56:52,369 WARNING [train.py:1204] (2/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_126-29104-0_sp1.1 from training. Duration: 0.77275 2023-10-02 03:56:54,482 WARNING [train.py:1204] (2/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_325-113798-0 from training. Duration: 0.85 2023-10-02 03:56:54,534 WARNING [train.py:1204] (2/4) Exclude cut with ID _6694_210_4599_1_1527852637490_3965529_116-109398-0 from training. Duration: 0.73 2023-10-02 03:56:54,573 WARNING [train.py:1204] (2/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_46-57119-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 03:56:57,777 WARNING [train.py:1204] (2/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_546-119312-0_sp1.1 from training. Duration: 0.9 2023-10-02 03:57:00,584 WARNING [train.py:1204] (2/4) Exclude cut with ID _60057_210_10640_1_1534903065793_3333150_468-306939-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 03:57:00,626 WARNING [train.py:1204] (2/4) Exclude cut with ID _15150_210_8341_1_1530752411803_7256384_22-138274-0 from training. Duration: 0.96 2023-10-02 03:57:01,786 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_289-180904-0 from training. Duration: 0.76 2023-10-02 03:57:04,748 WARNING [train.py:1204] (2/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_64-133721-0_sp1.1 from training. Duration: 0.92725 2023-10-02 03:57:06,054 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_195-257512-0_sp1.1 from training. Duration: 0.6090625 2023-10-02 03:57:08,831 WARNING [train.py:1204] (2/4) Exclude cut with ID _66873_210_5517_1_1535454136506_4293370_704-101963-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 03:57:09,430 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.3.encoder.layers.2.nonlin_attention.whiten2, num_groups=1, num_channels=512, metric=6.41 vs. limit=15.0 2023-10-02 03:57:12,107 WARNING [train.py:1204] (2/4) Exclude cut with ID _4366_210_1388_1_1526792053633_3613819_231-211402-0 from training. Duration: 0.94 2023-10-02 03:57:13,542 WARNING [train.py:1204] (2/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_679-190385-0_sp1.1 from training. Duration: 0.97275 2023-10-02 03:57:14,967 WARNING [train.py:1204] (2/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_298-81459-0_sp1.1 from training. Duration: 0.963625 2023-10-02 03:57:14,983 WARNING [train.py:1204] (2/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_658-282610-0_sp0.9 from training. Duration: 0.588875 2023-10-02 03:57:15,277 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.nonlin_attention.balancer.prob, batch_count=742413.3333333334, ans=0.125 2023-10-02 03:57:16,425 WARNING [train.py:1204] (2/4) Exclude cut with ID _57572_210_5517_1_1534921219624_3809484_196-283734-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 03:57:19,096 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_53071_210_16554_1_1534490405_2954066_294-198554-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 03:57:19,102 WARNING [train.py:1204] (2/4) Exclude cut with ID _4866_210_2577_1_1527404412284_4364659_352-157930-0 from training. Duration: 0.81 2023-10-02 03:57:20,591 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9231W0113-610139 from training. Duration: 0.896 2023-10-02 03:57:20,661 WARNING [train.py:1204] (2/4) Exclude cut with ID _24271_210_12658_1_1531987270022_7190519_638-171906-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 03:57:21,957 WARNING [train.py:1204] (2/4) Exclude cut with ID _14170_210_5219_1_1530525949868_4469650_272-249985-0 from training. Duration: 0.67 2023-10-02 03:57:21,966 WARNING [train.py:1204] (2/4) Exclude cut with ID _64086_210_7994_1_1535270417019_7435949_449-31567-0 from training. Duration: 0.97 2023-10-02 03:57:22,291 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.bypass.scale_min, batch_count=742413.3333333334, ans=0.2 2023-10-02 03:57:22,321 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.conv_skip_rate, batch_count=742413.3333333334, ans=0.0 2023-10-02 03:57:23,361 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.477e+02 1.859e+02 2.079e+02 2.348e+02 3.705e+02, threshold=4.159e+02, percent-clipped=0.0 2023-10-02 03:57:25,666 WARNING [train.py:1204] (2/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_109-282568-0_sp1.1 from training. Duration: 0.97275 2023-10-02 03:57:35,538 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_121-45934-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 03:57:35,785 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.attention_skip_rate, batch_count=742480.0, ans=0.0 2023-10-02 03:57:38,303 WARNING [train.py:1204] (2/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_314-126819-0 from training. Duration: 0.5 2023-10-02 03:57:38,333 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9309W0192-640319 from training. Duration: 0.512 2023-10-02 03:57:38,340 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9310W0003-640649 from training. Duration: 0.896 2023-10-02 03:57:39,758 WARNING [train.py:1204] (2/4) Exclude cut with ID _7160_210_1876_1_1527940923231_3703799_120-150943-0 from training. Duration: 0.81 2023-10-02 03:57:39,760 WARNING [train.py:1204] (2/4) Exclude cut with ID _40380_210_9059_1_1533380594759_4187380_6-33508-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 03:57:41,900 WARNING [train.py:1204] (2/4) Exclude cut with ID _59202_210_26984_1_1534816810612_3549174_137-318710-0 from training. Duration: 0.89 2023-10-02 03:57:45,959 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_435-313305-0 from training. Duration: 0.71 2023-10-02 03:57:47,258 WARNING [train.py:1204] (2/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_91-212486-0_sp1.1 from training. Duration: 0.6181875 2023-10-02 03:57:48,618 WARNING [train.py:1204] (2/4) Exclude cut with ID _14603_210_4220_1_1530615688441_3097319_133-158308-0_sp1.1 from training. Duration: 0.763625 2023-10-02 03:57:51,358 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_99-251314-0 from training. Duration: 0.87 2023-10-02 03:57:52,722 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_365-217119-0_sp1.1 from training. Duration: 0.57275 2023-10-02 03:57:54,612 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_3475_210_2028_1_1526193578_3622569_377-166133-0 from training. Duration: 0.86 2023-10-02 03:57:59,327 INFO [train.py:1046] (2/4) Epoch 21, batch 5150, loss[loss=0.1941, simple_loss=0.2618, pruned_loss=0.06322, over 23762.00 frames. ], tot_loss[loss=0.1754, simple_loss=0.2517, pruned_loss=0.04954, over 4719191.66 frames. ], batch size: 195, lr: 4.84e-03, grad_scale: 16.0 2023-10-02 03:57:59,440 WARNING [train.py:1204] (2/4) Exclude cut with ID _13916_210_9994_1_1530788341126_3763149_116-21034-0_sp1.1 from training. Duration: 0.87275 2023-10-02 03:57:59,456 WARNING [train.py:1204] (2/4) Exclude cut with ID _64220_210_9527_1_1535250151772_4875259_142-216831-0_sp1.1 from training. Duration: 0.936375 2023-10-02 03:57:59,456 WARNING [train.py:1204] (2/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_490-207861-0_sp1.1 from training. Duration: 0.8909375 2023-10-02 03:58:00,749 WARNING [train.py:1204] (2/4) Exclude cut with ID _41314_210_12651_1_1533459476543_3766460_87-272068-0_sp0.9 from training. Duration: 0.92225 2023-10-02 03:58:00,901 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.conv_module2.balancer2.min_positive, batch_count=742613.3333333334, ans=0.05 2023-10-02 03:58:02,090 WARNING [train.py:1204] (2/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_18-46756-0_sp1.1 from training. Duration: 0.7181875 2023-10-02 03:58:03,474 WARNING [train.py:1204] (2/4) Exclude cut with ID _39411_210_5986_1_1533538825030_3343649_98-311335-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 03:58:03,619 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.1.feed_forward2.hidden_balancer.prob, batch_count=742613.3333333334, ans=0.125 2023-10-02 03:58:04,672 WARNING [train.py:1204] (2/4) Exclude cut with ID _21878_210_3525_1_1531738772002_7264330_113-18163-0 from training. Duration: 0.89 2023-10-02 03:58:04,674 WARNING [train.py:1204] (2/4) Exclude cut with ID _6653_210_2629_1_1527919172441_6732679_1103-56445-0 from training. Duration: 0.73 2023-10-02 03:58:04,724 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_3474_210_2028_1_1526188513_4825246_145-309131-0 from training. Duration: 0.74 2023-10-02 03:58:04,741 WARNING [train.py:1204] (2/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_120-211630-0_sp1.1 from training. Duration: 0.836375 2023-10-02 03:58:06,022 WARNING [train.py:1204] (2/4) Exclude cut with ID _8030_210_7497_1_1528444886781_3531089_32-31837-0 from training. Duration: 0.97 2023-10-02 03:58:06,133 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_19949_210_2161_1_1531706337_928079_8-160956-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 03:58:06,182 WARNING [train.py:1204] (2/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_321-215742-0_sp1.1 from training. Duration: 0.4545625 2023-10-02 03:58:07,618 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_583-124650-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 03:58:09,119 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_30434_210_13049_1_1532519384_981746_164-182755-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 03:58:09,372 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.conv_module1.balancer1.prob, batch_count=742613.3333333334, ans=0.125 2023-10-02 03:58:14,030 WARNING [train.py:1204] (2/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_339-104791-0_sp1.1 from training. Duration: 0.4454375 2023-10-02 03:58:14,049 WARNING [train.py:1204] (2/4) Exclude cut with ID _15026_210_8750_1_1530777602316_3291850_211-195043-0 from training. Duration: 0.8 2023-10-02 03:58:16,812 WARNING [train.py:1204] (2/4) Exclude cut with ID _29877_210_6228_1_1532397791106_5276149_487-182647-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 03:58:16,870 WARNING [train.py:1204] (2/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_127-566-0_sp0.9 from training. Duration: 0.9333125 2023-10-02 03:58:19,589 WARNING [train.py:1204] (2/4) Exclude cut with ID _8263_210_1664_1_1528439452891_970220_68-181805-0_sp0.9 from training. Duration: 0.711125 2023-10-02 03:58:19,591 WARNING [train.py:1204] (2/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_504-72513-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 03:58:19,601 WARNING [train.py:1204] (2/4) Exclude cut with ID _64639_210_5816_1_1535367619437_3528000_324-265233-0_sp1.1 from training. Duration: 0.963625 2023-10-02 03:58:19,673 WARNING [train.py:1204] (2/4) Exclude cut with ID _6456_210_1534_1_1527681634212_3181049_215-309359-0_sp0.9 from training. Duration: 0.97775 2023-10-02 03:58:19,678 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_115-233929-0_sp1.1 from training. Duration: 0.6545625 2023-10-02 03:58:19,725 WARNING [train.py:1204] (2/4) Exclude cut with ID _14087_210_2253_1_1530529224512_3014029_219-224445-0 from training. Duration: 0.84 2023-10-02 03:58:22,416 WARNING [train.py:1204] (2/4) Exclude cut with ID _14867_210_1968_1_1530684179890_3846969_413-29292-0_sp1.1 from training. Duration: 0.8181875 2023-10-02 03:58:22,475 WARNING [train.py:1204] (2/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_1012-279473-0_sp0.9 from training. Duration: 0.7444375 2023-10-02 03:58:25,133 WARNING [train.py:1204] (2/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_605-133395-0_sp0.9 from training. Duration: 0.7333125 2023-10-02 03:58:27,014 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_89-35748-0 from training. Duration: 0.79 2023-10-02 03:58:27,119 WARNING [train.py:1204] (2/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_476-272757-0_sp1.1 from training. Duration: 0.8454375 2023-10-02 03:58:33,849 WARNING [train.py:1204] (2/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_449-15508-0_sp0.9 from training. Duration: 0.911125 2023-10-02 03:58:33,973 WARNING [train.py:1204] (2/4) Exclude cut with ID _29345_210_4381_1_1532689168263_3860169_198-174328-0 from training. Duration: 0.91 2023-10-02 03:58:38,075 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_15517_210_6753_1_1530952293_1664795_125-158157-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 03:58:46,836 WARNING [train.py:1204] (2/4) Exclude cut with ID _25565_210_12392_1_1532423009382_3952098_420-113180-0_sp1.1 from training. Duration: 0.963625 2023-10-02 03:58:46,916 WARNING [train.py:1204] (2/4) Exclude cut with ID _51471_210_1889_1_1534399391390_3094250_236-146773-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 03:58:49,750 WARNING [train.py:1204] (2/4) Exclude cut with ID _30039_210_13270_1_1532484612900_2725150_175-35012-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 03:58:49,794 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_23850_210_4238_1_1532232597_1632433_99-275843-0_sp1.1 from training. Duration: 0.936375 2023-10-02 03:58:51,239 WARNING [train.py:1204] (2/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_658-282610-0 from training. Duration: 0.53 2023-10-02 03:58:57,119 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_37818_210_4238_1_1533620910_4720083_234-65934-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 03:58:58,456 WARNING [train.py:1204] (2/4) Exclude cut with ID _3067_210_2667_1_1526212974640_4503168_459-173867-0_sp1.1 from training. Duration: 0.8 2023-10-02 03:58:58,479 WARNING [train.py:1204] (2/4) Exclude cut with ID _8696_210_1876_1_1528545632440_3345058_209-54069-0_sp0.9 from training. Duration: 0.7444375 2023-10-02 03:59:00,021 WARNING [train.py:1204] (2/4) Exclude cut with ID _62574_210_2655_1_1535180407058_3933249_420-236465-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 03:59:01,939 WARNING [train.py:1204] (2/4) Exclude cut with ID _46681_210_9642_1_1533893005571_7603359_333-4042-0_sp1.1 from training. Duration: 0.936375 2023-10-02 03:59:03,293 WARNING [train.py:1204] (2/4) Exclude cut with ID _3541_210_2477_1_1526475408709_3263973_377-296839-0 from training. Duration: 0.55 2023-10-02 03:59:07,753 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.balancer1.prob, batch_count=742880.0, ans=0.125 2023-10-02 03:59:08,926 WARNING [train.py:1204] (2/4) Exclude cut with ID _43997_210_4525_1_1533538632555_5189329_426-92599-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 03:59:09,021 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_43-71989-0_sp1.1 from training. Duration: 0.5181875 2023-10-02 03:59:10,450 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_2009_210_2071_1_1525481134_4588937_140-345530-0_sp1.1 from training. Duration: 0.963625 2023-10-02 03:59:10,458 WARNING [train.py:1204] (2/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_138-332773-0_sp0.9 from training. Duration: 0.9 2023-10-02 03:59:11,778 WARNING [train.py:1204] (2/4) Exclude cut with ID _13942_210_9988_1_1530604906036_3733360_499-55676-0_sp0.9 from training. Duration: 0.688875 2023-10-02 03:59:11,797 WARNING [train.py:1204] (2/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_259-34038-0_sp1.1 from training. Duration: 0.67275 2023-10-02 03:59:11,810 WARNING [train.py:1204] (2/4) Exclude cut with ID _3120_210_3525_1_1526036143112_4072980_404-249994-0_sp1.1 from training. Duration: 0.9 2023-10-02 03:59:13,713 INFO [train.py:1046] (2/4) Epoch 21, batch 5200, loss[loss=0.1652, simple_loss=0.2484, pruned_loss=0.04106, over 24327.00 frames. ], tot_loss[loss=0.176, simple_loss=0.2526, pruned_loss=0.04974, over 4712987.69 frames. ], batch size: 61, lr: 4.84e-03, grad_scale: 32.0 2023-10-02 03:59:13,770 WARNING [train.py:1204] (2/4) Exclude cut with ID _40402_210_18219_1_1533522324115_2953150_222-108810-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 03:59:16,602 WARNING [train.py:1204] (2/4) Exclude cut with ID _22083_210_8254_1_1531702862439_3678970_49-234546-0_sp0.9 from training. Duration: 0.92225 2023-10-02 03:59:18,015 WARNING [train.py:1204] (2/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_199-295611-0_sp0.9 from training. Duration: 0.611125 2023-10-02 03:59:20,667 WARNING [train.py:1204] (2/4) Exclude cut with ID _43552_210_8913_1_1533726031059_3523429_43-192945-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 03:59:23,448 WARNING [train.py:1204] (2/4) Exclude cut with ID _14224_210_4381_1_1530529062037_4264010_364-22817-0 from training. Duration: 0.71 2023-10-02 03:59:24,843 WARNING [train.py:1204] (2/4) Exclude cut with ID _14922_210_6228_1_1530707435273_3693310_388-129552-0_sp1.1 from training. Duration: 0.87275 2023-10-02 03:59:26,625 WARNING [train.py:1204] (2/4) Exclude cut with ID _6537_210_1883_1_1527987559710_3597129_190-280227-0_sp1.1 from training. Duration: 0.97275 2023-10-02 03:59:28,101 WARNING [train.py:1204] (2/4) Exclude cut with ID _55844_210_2346_1_1534471212569_7625707_398-56897-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 03:59:29,503 WARNING [train.py:1204] (2/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_48-182155-0_sp1.1 from training. Duration: 0.7909375 2023-10-02 03:59:29,522 WARNING [train.py:1204] (2/4) Exclude cut with ID _29529_210_7234_1_1532393961455_3977460_582-18084-0_sp1.1 from training. Duration: 0.97275 2023-10-02 03:59:30,743 WARNING [train.py:1204] (2/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_912-282412-0 from training. Duration: 0.49 2023-10-02 03:59:35,467 WARNING [train.py:1204] (2/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_332-256168-0_sp0.9 from training. Duration: 0.8333125 2023-10-02 03:59:35,520 WARNING [train.py:1204] (2/4) Exclude cut with ID _64220_210_9527_1_1535250151772_4875259_55-250624-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 03:59:38,251 WARNING [train.py:1204] (2/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_264-108685-0 from training. Duration: 0.55 2023-10-02 03:59:40,977 WARNING [train.py:1204] (2/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_613-232856-0_sp0.9 from training. Duration: 0.6 2023-10-02 03:59:41,063 WARNING [train.py:1204] (2/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_68-212801-0_sp1.1 from training. Duration: 0.663625 2023-10-02 03:59:42,443 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6290_210_3273_1_1527595224_1282787_115-87095-0 from training. Duration: 0.55 2023-10-02 03:59:42,491 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_3080_210_2071_1_1526086102_4365166_312-8726-0 from training. Duration: 0.91 2023-10-02 03:59:44,643 WARNING [train.py:1204] (2/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_105-63309-0 from training. Duration: 0.98 2023-10-02 03:59:46,087 WARNING [train.py:1204] (2/4) Exclude cut with ID _2941_210_2577_1_1526199673150_2489759_201-160779-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 03:59:46,090 WARNING [train.py:1204] (2/4) Exclude cut with ID ID0406W0226-885434 from training. Duration: 0.997 2023-10-02 03:59:46,104 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_17292_210_6753_1_1531125388_2943734_353-328201-0_sp1.1 from training. Duration: 0.97275 2023-10-02 03:59:47,482 WARNING [train.py:1204] (2/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_572-96029-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 03:59:47,515 WARNING [train.py:1204] (2/4) Exclude cut with ID _14970_210_15709_1_1530775547361_1966965_6-23328-0_sp0.9 from training. Duration: 0.9555625 2023-10-02 03:59:48,738 WARNING [train.py:1204] (2/4) Exclude cut with ID _45012_210_12651_1_1533608435136_5371380_240-37101-0 from training. Duration: 0.93 2023-10-02 03:59:48,797 WARNING [train.py:1204] (2/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_685-90183-0_sp1.1 from training. Duration: 0.936375 2023-10-02 03:59:50,601 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.5.encoder.layers.0.conv_module1.whiten, num_groups=1, num_channels=256, metric=6.63 vs. limit=15.0 2023-10-02 03:59:51,334 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.459e+02 1.817e+02 2.050e+02 2.419e+02 3.713e+02, threshold=4.100e+02, percent-clipped=0.0 2023-10-02 03:59:51,432 WARNING [train.py:1204] (2/4) Exclude cut with ID _13252_210_1925_1_1530671372047_3336909_41-22245-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 03:59:52,986 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_15126_210_6753_1_1530778677_2591185_120-277707-0 from training. Duration: 0.8 2023-10-02 03:59:53,021 WARNING [train.py:1204] (2/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_310-26186-0 from training. Duration: 0.7 2023-10-02 03:59:54,372 WARNING [train.py:1204] (2/4) Exclude cut with ID _14998_210_5219_1_1530705582080_7474970_787-294416-0 from training. Duration: 0.82 2023-10-02 03:59:59,274 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.conv_skip_rate, batch_count=743146.6666666666, ans=0.0 2023-10-02 04:00:00,397 WARNING [train.py:1204] (2/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_795-33252-0 from training. Duration: 0.37 2023-10-02 04:00:00,435 WARNING [train.py:1204] (2/4) Exclude cut with ID _7537_210_2084_1_1528502395431_3924359_113-199933-0_sp0.9 from training. Duration: 0.6555625 2023-10-02 04:00:07,136 WARNING [train.py:1204] (2/4) Exclude cut with ID _3465_210_3525_1_1526175667060_3574049_308-231735-0_sp0.9 from training. Duration: 0.811125 2023-10-02 04:00:07,154 WARNING [train.py:1204] (2/4) Exclude cut with ID _19504_210_9994_1_1531648863242_3325220_354-296765-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 04:00:07,387 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.conv_module1.balancer2.prob, batch_count=743146.6666666666, ans=0.125 2023-10-02 04:00:08,561 WARNING [train.py:1204] (2/4) Exclude cut with ID _5409_210_1388_1_1527292850189_5865191_178-127970-0 from training. Duration: 0.57 2023-10-02 04:00:08,606 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_1466-248056-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 04:00:09,941 WARNING [train.py:1204] (2/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_535-14410-0_sp0.9 from training. Duration: 0.52225 2023-10-02 04:00:09,944 WARNING [train.py:1204] (2/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_747-129941-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 04:00:09,993 WARNING [train.py:1204] (2/4) Exclude cut with ID _15012_210_1968_1_1530856820429_5095350_688-178849-0_sp1.1 from training. Duration: 0.5818125 2023-10-02 04:00:12,892 WARNING [train.py:1204] (2/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_356-45054-0_sp1.1 from training. Duration: 0.7818125 2023-10-02 04:00:14,724 WARNING [train.py:1204] (2/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_146-276944-0_sp0.9 from training. Duration: 0.92225 2023-10-02 04:00:18,850 WARNING [train.py:1204] (2/4) Exclude cut with ID _39204_210_4220_1_1533880547368_3191120_346-180306-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 04:00:20,125 WARNING [train.py:1204] (2/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_641-158017-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 04:00:20,125 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_19952_210_2161_1_1531875469_3851750_274-42137-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 04:00:21,813 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.balancer2.prob, batch_count=743213.3333333334, ans=0.125 2023-10-02 04:00:24,404 WARNING [train.py:1204] (2/4) Exclude cut with ID _60804_210_9642_1_1534831199859_7268299_87-327819-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 04:00:24,472 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_9302_210_2550_1_1528889962_4655416_638-301620-0 from training. Duration: 0.47 2023-10-02 04:00:24,630 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.feed_forward1.out_proj.dropout_p, batch_count=743213.3333333334, ans=0.1 2023-10-02 04:00:25,828 WARNING [train.py:1204] (2/4) Exclude cut with ID _44304_210_15374_1_1533639352765_4613129_302-22084-0_sp1.1 from training. Duration: 0.7818125 2023-10-02 04:00:25,839 WARNING [train.py:1204] (2/4) Exclude cut with ID _18513_210_9206_1_1531618215223_3862620_177-227063-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 04:00:27,631 INFO [train.py:1046] (2/4) Epoch 21, batch 5250, loss[loss=0.1691, simple_loss=0.224, pruned_loss=0.05714, over 19535.00 frames. ], tot_loss[loss=0.1757, simple_loss=0.2522, pruned_loss=0.04962, over 4707334.92 frames. ], batch size: 388, lr: 4.84e-03, grad_scale: 16.0 2023-10-02 04:00:27,741 WARNING [train.py:1204] (2/4) Exclude cut with ID _41326_210_5854_1_1533778076509_3492080_510-45444-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 04:00:29,009 WARNING [train.py:1204] (2/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_76-311963-0_sp1.1 from training. Duration: 0.636375 2023-10-02 04:00:29,103 WARNING [train.py:1204] (2/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_156-302675-0_sp0.9 from training. Duration: 0.611125 2023-10-02 04:00:30,659 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.conv_module1.balancer1.prob, batch_count=743280.0, ans=0.125 2023-10-02 04:00:31,757 WARNING [train.py:1204] (2/4) Exclude cut with ID _53489_210_6228_1_1534298357169_3835970_367-148411-0_sp1.1 from training. Duration: 0.936375 2023-10-02 04:00:34,947 WARNING [train.py:1204] (2/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_389-144525-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 04:00:35,000 WARNING [train.py:1204] (2/4) Exclude cut with ID _14069_210_5632_1_1530512692033_4803390_158-238427-0_sp1.1 from training. Duration: 0.963625 2023-10-02 04:00:36,389 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_486-151131-0_sp0.9 from training. Duration: 0.9444375 2023-10-02 04:00:41,330 WARNING [train.py:1204] (2/4) Exclude cut with ID _39846_210_12651_1_1533603651992_1318030_188-71831-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 04:00:42,685 WARNING [train.py:1204] (2/4) Exclude cut with ID _39411_210_5986_1_1533538825030_3343649_173-238248-0_sp1.1 from training. Duration: 0.7545625 2023-10-02 04:00:44,087 WARNING [train.py:1204] (2/4) Exclude cut with ID _20684_210_6917_1_1531567751646_4203930_469-49081-0_sp1.1 from training. Duration: 0.92725 2023-10-02 04:00:47,298 WARNING [train.py:1204] (2/4) Exclude cut with ID _8833_210_5219_1_1528592665210_7553059_611-80313-0_sp1.1 from training. Duration: 0.5818125 2023-10-02 04:00:48,698 WARNING [train.py:1204] (2/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_440-141811-0 from training. Duration: 0.95 2023-10-02 04:00:48,718 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_41211_210_2544_1_1533348859_3702910_236-223652-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 04:00:50,139 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_40222_210_6228_1_1533385298_4481939_220-163998-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 04:01:12,102 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.ff2_skip_rate, batch_count=743480.0, ans=0.0 2023-10-02 04:01:37,127 INFO [train.py:1046] (2/4) Epoch 21, batch 5300, loss[loss=0.189, simple_loss=0.2695, pruned_loss=0.05419, over 23707.00 frames. ], tot_loss[loss=0.1754, simple_loss=0.2519, pruned_loss=0.04939, over 4721350.05 frames. ], batch size: 85, lr: 4.84e-03, grad_scale: 16.0 2023-10-02 04:01:40,202 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.bypass_mid.scale_min, batch_count=743613.3333333334, ans=0.2 2023-10-02 04:01:41,410 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.conv_module1.balancer2.prob, batch_count=743613.3333333334, ans=0.125 2023-10-02 04:01:51,143 WARNING [train.py:1204] (2/4) Exclude cut with ID _14629_210_15709_1_1530604298182_956899_6-232695-0_sp1.1 from training. Duration: 0.8545625 2023-10-02 04:01:51,162 WARNING [train.py:1204] (2/4) Exclude cut with ID _13921_210_9994_1_1530841782272_3503520_327-69677-0 from training. Duration: 0.9 2023-10-02 04:01:51,186 WARNING [train.py:1204] (2/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_372-298093-0 from training. Duration: 0.8 2023-10-02 04:01:51,199 WARNING [train.py:1204] (2/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_515-139111-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 04:01:51,414 WARNING [train.py:1204] (2/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_85-65056-0_sp1.1 from training. Duration: 0.87275 2023-10-02 04:01:51,451 WARNING [train.py:1204] (2/4) Exclude cut with ID _15113_210_8254_1_1530842438579_3716279_209-54533-0_sp1.1 from training. Duration: 0.87275 2023-10-02 04:01:51,497 WARNING [train.py:1204] (2/4) Exclude cut with ID _70447_210_12392_1_1535972499300_3160420_123-312838-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 04:01:51,522 WARNING [train.py:1204] (2/4) Exclude cut with ID _60113_210_16271_1_1535164212481_4386079_70-63596-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 04:01:51,549 WARNING [train.py:1204] (2/4) Exclude cut with ID _14632_210_9988_1_1530691279392_3923200_225-188509-0_sp1.1 from training. Duration: 0.963625 2023-10-02 04:01:51,593 WARNING [train.py:1204] (2/4) Exclude cut with ID _34734_210_4525_1_1533365977542_4448720_584-180532-0_sp1.1 from training. Duration: 0.97275 2023-10-02 04:01:51,606 WARNING [train.py:1204] (2/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_351-294650-0_sp1.1 from training. Duration: 0.57275 2023-10-02 04:01:51,908 WARNING [train.py:1204] (2/4) Exclude cut with ID _13252_210_1925_1_1530671372047_3336909_263-168388-0_sp0.9 from training. Duration: 0.9555625 2023-10-02 04:01:51,933 WARNING [train.py:1204] (2/4) Exclude cut with ID _22083_210_8254_1_1531702862439_3678970_201-85738-0 from training. Duration: 0.9 2023-10-02 04:01:52,016 WARNING [train.py:1204] (2/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_237-58433-0 from training. Duration: 0.74 2023-10-02 04:01:52,040 WARNING [train.py:1204] (2/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_262-250826-0 from training. Duration: 0.99 2023-10-02 04:01:52,135 WARNING [train.py:1204] (2/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_927-43827-0_sp0.9 from training. Duration: 0.82225 2023-10-02 04:01:52,167 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_2821_210_2715_1_1526108385_3763560_73-296673-0 from training. Duration: 0.83 2023-10-02 04:01:52,238 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6287_210_3273_1_1527509506_2653716_182-199887-0 from training. Duration: 0.51 2023-10-02 04:01:52,362 WARNING [train.py:1204] (2/4) Exclude cut with ID _70519_210_4163_1_1535961595994_7458230_899-80223-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 04:01:52,992 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_210-234949-0_sp1.1 from training. Duration: 0.97275 2023-10-02 04:01:53,027 WARNING [train.py:1204] (2/4) Exclude cut with ID _30196_210_1664_1_1533708496522_7074140_768-278176-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 04:01:53,104 WARNING [train.py:1204] (2/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_315-128283-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 04:01:53,182 WARNING [train.py:1204] (2/4) Exclude cut with ID _14886_210_4220_1_1530702020355_3516190_330-323941-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 04:01:53,420 WARNING [train.py:1204] (2/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_264-348896-0_sp1.1 from training. Duration: 0.77275 2023-10-02 04:01:53,442 WARNING [train.py:1204] (2/4) Exclude cut with ID _37086_210_9038_1_1532999540891_7021250_433-10563-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 04:01:53,476 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_53638_210_21581_1_1534297319_5808449_885-80172-0_sp1.1 from training. Duration: 0.87275 2023-10-02 04:01:53,559 WARNING [train.py:1204] (2/4) Exclude cut with ID _14623_210_1925_1_1530773354962_3632609_157-218771-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 04:01:53,569 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_1368-78165-0_sp1.1 from training. Duration: 0.97275 2023-10-02 04:01:53,574 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_309-65177-0_sp0.9 from training. Duration: 0.97775 2023-10-02 04:01:53,579 WARNING [train.py:1204] (2/4) Exclude cut with ID _21632_210_2253_1_1531994001765_3744140_291-138812-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 04:01:53,590 WARNING [train.py:1204] (2/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_669-93163-0_sp1.1 from training. Duration: 0.8909375 2023-10-02 04:01:54,095 WARNING [train.py:1204] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_292-215346-0 from training. Duration: 0.97 2023-10-02 04:01:54,117 WARNING [train.py:1204] (2/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_286-222073-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 04:01:54,384 WARNING [train.py:1204] (2/4) Exclude cut with ID _59256_210_5986_1_1535076085997_3589120_121-237386-0_sp1.1 from training. Duration: 0.87275 2023-10-02 04:01:54,397 WARNING [train.py:1204] (2/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_96-320736-0 from training. Duration: 0.86 2023-10-02 04:01:54,405 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_128-202104-0 from training. Duration: 0.63 2023-10-02 04:01:54,486 WARNING [train.py:1204] (2/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_242-3472-0_sp0.9 from training. Duration: 0.811125 2023-10-02 04:01:54,528 WARNING [train.py:1204] (2/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_154-74182-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 04:01:54,532 WARNING [train.py:1204] (2/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_187-221566-0 from training. Duration: 0.43 2023-10-02 04:01:54,703 WARNING [train.py:1204] (2/4) Exclude cut with ID _6194_210_1534_1_1527511826160_3941194_14-282876-0 from training. Duration: 0.71 2023-10-02 04:01:54,744 WARNING [train.py:1204] (2/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_660-211088-0_sp1.1 from training. Duration: 0.563625 2023-10-02 04:01:55,568 WARNING [train.py:1204] (2/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_526-62079-0_sp1.1 from training. Duration: 0.6090625 2023-10-02 04:01:55,708 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_92-246270-0_sp1.1 from training. Duration: 0.77275 2023-10-02 04:01:55,790 WARNING [train.py:1204] (2/4) Exclude cut with ID IC0287W0023-142350 from training. Duration: 0.956 2023-10-02 04:01:55,863 WARNING [train.py:1204] (2/4) Exclude cut with ID _58605_210_12210_1_1534723195939_3904259_48-269402-0 from training. Duration: 0.96 2023-10-02 04:01:55,876 WARNING [train.py:1204] (2/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_416-257730-0_sp0.9 from training. Duration: 0.72225 2023-10-02 04:01:55,947 WARNING [train.py:1204] (2/4) Exclude cut with ID _7877_210_6014_1_1528286358099_3750136_185-17516-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 04:01:56,044 WARNING [train.py:1204] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_23-189234-0 from training. Duration: 0.83 2023-10-02 04:01:56,060 WARNING [train.py:1204] (2/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_237-89684-0 from training. Duration: 0.96 2023-10-02 04:01:56,126 WARNING [train.py:1204] (2/4) Exclude cut with ID _13916_210_9994_1_1530788341126_3763149_116-21034-0 from training. Duration: 0.96 2023-10-02 04:01:56,269 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_3133_210_2087_1_1526106446_2324690_246-125000-0_sp1.1 from training. Duration: 0.563625 2023-10-02 04:02:03,121 INFO [train.py:1046] (2/4) Epoch 22, batch 0, loss[loss=0.1631, simple_loss=0.2482, pruned_loss=0.03894, over 24494.00 frames. ], tot_loss[loss=0.1631, simple_loss=0.2482, pruned_loss=0.03894, over 24494.00 frames. ], batch size: 66, lr: 4.73e-03, grad_scale: 32.0 2023-10-02 04:02:03,122 INFO [train.py:1069] (2/4) Computing validation loss 2023-10-02 04:02:16,020 INFO [train.py:1078] (2/4) Epoch 22, validation: loss=0.3002, simple_loss=0.2661, pruned_loss=0.1671, over 1125622.00 frames. 2023-10-02 04:02:16,022 INFO [train.py:1079] (2/4) Maximum memory allocated so far is 20967MB 2023-10-02 04:02:17,682 WARNING [train.py:1204] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_402-253027-0 from training. Duration: 0.54 2023-10-02 04:02:19,118 WARNING [train.py:1204] (2/4) Exclude cut with ID _59288_210_5854_1_1535077757774_3412246_158-289345-0_sp1.1 from training. Duration: 0.92725 2023-10-02 04:02:20,573 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_28211_210_6388_1_1532861506_4523224_128-328162-0_sp1.1 from training. Duration: 0.8181875 2023-10-02 04:02:25,137 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_788-278281-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 04:02:25,144 WARNING [train.py:1204] (2/4) Exclude cut with ID _49728_210_12651_1_1534041034294_6788150_624-195301-0_sp0.9 from training. Duration: 0.9444375 2023-10-02 04:02:26,496 WARNING [train.py:1204] (2/4) Exclude cut with ID _14860_210_4915_1_1530702020340_7343240_267-139775-0_sp1.1 from training. Duration: 0.963625 2023-10-02 04:02:26,566 WARNING [train.py:1204] (2/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_332-221057-0 from training. Duration: 0.68 2023-10-02 04:02:27,956 WARNING [train.py:1204] (2/4) Exclude cut with ID _40402_210_18219_1_1533522324115_2953150_242-76943-0 from training. Duration: 0.94 2023-10-02 04:02:31,396 WARNING [train.py:1204] (2/4) Exclude cut with ID _16747_210_5986_1_1531811544733_2749109_87-349587-0_sp1.1 from training. Duration: 0.963625 2023-10-02 04:02:31,459 WARNING [train.py:1204] (2/4) Exclude cut with ID _22982_210_1883_1_1531882790447_5449409_505-346452-0_sp1.1 from training. Duration: 0.963625 2023-10-02 04:02:34,226 WARNING [train.py:1204] (2/4) Exclude cut with ID _17956_210_4353_1_1531209568125_6979970_13-192107-0_sp1.1 from training. Duration: 0.963625 2023-10-02 04:02:34,271 WARNING [train.py:1204] (2/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_430-307666-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 04:02:35,616 WARNING [train.py:1204] (2/4) Exclude cut with ID _2895_210_2477_1_1525956785794_4063750_289-11475-0_sp1.1 from training. Duration: 0.7454375 2023-10-02 04:02:35,637 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_3910_210_1794_1_1526742306_334530_28-238909-0_sp0.9 from training. Duration: 0.9 2023-10-02 04:02:38,929 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.547e+02 2.142e+02 2.497e+02 3.157e+02 5.918e+02, threshold=4.995e+02, percent-clipped=12.0 2023-10-02 04:02:39,031 WARNING [train.py:1204] (2/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_18-130336-0 from training. Duration: 0.79 2023-10-02 04:02:40,452 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_548-294316-0_sp0.9 from training. Duration: 0.9 2023-10-02 04:02:48,073 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_42-142473-0_sp1.1 from training. Duration: 0.6545625 2023-10-02 04:02:48,078 WARNING [train.py:1204] (2/4) Exclude cut with ID _59170_210_6721_1_1534983633960_4203335_326-48066-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 04:02:48,287 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.conv_module2.balancer1.min_positive, batch_count=743826.6666666666, ans=0.025 2023-10-02 04:02:49,561 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_45886_210_17024_1_1533709215_471063_7-79165-0 from training. Duration: 0.98 2023-10-02 04:02:55,117 WARNING [train.py:1204] (2/4) Exclude cut with ID _44585_210_18628_1_1533605350387_4518199_280-73534-0_sp0.9 from training. Duration: 0.988875 2023-10-02 04:02:55,124 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_3475_210_2028_1_1526193578_3622569_265-276625-0_sp1.1 from training. Duration: 0.6909375 2023-10-02 04:02:58,634 WARNING [train.py:1204] (2/4) Exclude cut with ID _64631_210_27767_1_1535349580068_4165160_88-225232-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 04:03:01,642 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_205-265518-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 04:03:06,318 WARNING [train.py:1204] (2/4) Exclude cut with ID _40402_210_18219_1_1533522324115_2953150_271-253734-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 04:03:06,484 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.balancer2.prob, batch_count=743893.3333333334, ans=0.125 2023-10-02 04:03:10,709 WARNING [train.py:1204] (2/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_282-257791-0 from training. Duration: 0.86 2023-10-02 04:03:14,209 WARNING [train.py:1204] (2/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_356-45054-0 from training. Duration: 0.86 2023-10-02 04:03:15,498 WARNING [train.py:1204] (2/4) Exclude cut with ID _69584_210_5219_1_1535785133775_4044450_38-348116-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 04:03:15,499 WARNING [train.py:1204] (2/4) Exclude cut with ID _66763_210_17301_1_1535788667617_241750_21-328480-0_sp1.1 from training. Duration: 0.936375 2023-10-02 04:03:15,566 WARNING [train.py:1204] (2/4) Exclude cut with ID _26528_210_9070_1_1532570410842_3851310_497-97218-0_sp0.9 from training. Duration: 0.9555625 2023-10-02 04:03:16,852 WARNING [train.py:1204] (2/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_592-101075-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 04:03:18,203 WARNING [train.py:1204] (2/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_678-159307-0 from training. Duration: 0.85 2023-10-02 04:03:20,197 WARNING [train.py:1204] (2/4) Exclude cut with ID _28427_210_4220_1_1532581240967_3061069_166-319773-0_sp1.1 from training. Duration: 0.936375 2023-10-02 04:03:22,670 WARNING [train.py:1204] (2/4) Exclude cut with ID _29877_210_6228_1_1532397791106_5276149_606-224984-0_sp1.1 from training. Duration: 0.936375 2023-10-02 04:03:27,126 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_125-203105-0_sp1.1 from training. Duration: 0.72725 2023-10-02 04:03:29,115 WARNING [train.py:1204] (2/4) Exclude cut with ID IC0620W0113-306765 from training. Duration: 0.768 2023-10-02 04:03:30,464 WARNING [train.py:1204] (2/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_410-287857-0_sp1.1 from training. Duration: 0.8454375 2023-10-02 04:03:31,745 INFO [train.py:1046] (2/4) Epoch 22, batch 50, loss[loss=0.1838, simple_loss=0.27, pruned_loss=0.04882, over 24506.00 frames. ], tot_loss[loss=0.1763, simple_loss=0.2539, pruned_loss=0.04937, over 1077679.05 frames. ], batch size: 71, lr: 4.72e-03, grad_scale: 32.0 2023-10-02 04:03:32,414 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.4.encoder.layers.1.conv_module1.whiten, num_groups=1, num_channels=384, metric=3.03 vs. limit=15.0 2023-10-02 04:03:35,004 WARNING [train.py:1204] (2/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_78-95260-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 04:03:36,593 WARNING [train.py:1204] (2/4) Exclude cut with ID _58059_210_5854_1_1534901471149_3164808_6-73080-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 04:03:36,616 WARNING [train.py:1204] (2/4) Exclude cut with ID _5166_210_2084_1_1527073197160_4087993_331-117829-0 from training. Duration: 0.62 2023-10-02 04:03:36,680 WARNING [train.py:1204] (2/4) Exclude cut with ID _14880_210_8895_1_1530680426655_3795259_289-212817-0_sp0.9 from training. Duration: 0.7666875 2023-10-02 04:03:36,714 WARNING [train.py:1204] (2/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_620-144779-0_sp1.1 from training. Duration: 0.92725 2023-10-02 04:03:37,637 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.0.layers.1.feed_forward3.out_whiten, num_groups=1, num_channels=192, metric=9.77 vs. limit=15.0 2023-10-02 04:03:39,425 WARNING [train.py:1204] (2/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_561-288575-0_sp1.1 from training. Duration: 0.963625 2023-10-02 04:03:39,522 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_22657_210_6890_1_1532086002_1637216_122-188128-0_sp1.1 from training. Duration: 0.963625 2023-10-02 04:03:41,144 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.bypass.skip_rate, batch_count=744026.6666666666, ans=0.07 2023-10-02 04:03:42,271 WARNING [train.py:1204] (2/4) Exclude cut with ID _16756_210_4915_1_1531047803763_7104259_604-86607-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 04:03:45,657 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.feed_forward1.out_proj.dropout_p, batch_count=744093.3333333334, ans=0.1 2023-10-02 04:03:46,743 WARNING [train.py:1204] (2/4) Exclude cut with ID _15150_210_8341_1_1530752411803_7256384_469-239509-0 from training. Duration: 0.68 2023-10-02 04:03:46,751 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_10987_210_4163_1_1529760555_4123902_302-87926-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 04:03:52,483 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.0.layers.0.whiten, num_groups=1, num_channels=192, metric=4.25 vs. limit=12.0 2023-10-02 04:03:52,823 WARNING [train.py:1204] (2/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_469-94673-0_sp0.9 from training. Duration: 0.87775 2023-10-02 04:03:54,280 WARNING [train.py:1204] (2/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_798-106179-0 from training. Duration: 0.83 2023-10-02 04:03:55,833 WARNING [train.py:1204] (2/4) Exclude cut with ID _16744_210_5986_1_1531724405384_3907567_242-111316-0 from training. Duration: 0.93 2023-10-02 04:03:57,208 WARNING [train.py:1204] (2/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_938-205135-0_sp0.9 from training. Duration: 0.9666875 2023-10-02 04:04:00,302 WARNING [train.py:1204] (2/4) Exclude cut with ID _64135_210_2253_1_1535677267876_7608972_133-188266-0_sp0.9 from training. Duration: 0.9 2023-10-02 04:04:00,306 WARNING [train.py:1204] (2/4) Exclude cut with ID _44585_210_18628_1_1533605350387_4518199_623-346820-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 04:04:00,388 WARNING [train.py:1204] (2/4) Exclude cut with ID _42326_210_7770_1_1533814282531_3707269_454-266903-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 04:04:01,764 WARNING [train.py:1204] (2/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_5-55893-0_sp1.1 from training. Duration: 0.5 2023-10-02 04:04:03,072 WARNING [train.py:1204] (2/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_577-332600-0_sp1.1 from training. Duration: 0.5181875 2023-10-02 04:04:03,072 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_957-138882-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 04:04:09,363 WARNING [train.py:1204] (2/4) Exclude cut with ID _12556_210_8341_1_1530081024100_7327690_219-38004-0_sp1.1 from training. Duration: 0.97275 2023-10-02 04:04:10,912 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.0.feed_forward1.out_proj.dropout_p, batch_count=744160.0, ans=0.1 2023-10-02 04:04:12,061 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_397-147196-0_sp1.1 from training. Duration: 0.72725 2023-10-02 04:04:12,079 WARNING [train.py:1204] (2/4) Exclude cut with ID _3541_210_2477_1_1526475408709_3263973_231-104736-0_sp0.9 from training. Duration: 0.8666875 2023-10-02 04:04:12,146 WARNING [train.py:1204] (2/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_398-334558-0 from training. Duration: 0.96 2023-10-02 04:04:14,819 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_257-20733-0_sp1.1 from training. Duration: 0.6909375 2023-10-02 04:04:16,486 WARNING [train.py:1204] (2/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_146-282400-0_sp1.1 from training. Duration: 0.6454375 2023-10-02 04:04:16,489 WARNING [train.py:1204] (2/4) Exclude cut with ID _51899_210_20034_1_1534129254466_3669220_265-215428-0 from training. Duration: 0.98 2023-10-02 04:04:16,550 WARNING [train.py:1204] (2/4) Exclude cut with ID _34423_210_8750_1_1533430798456_3330199_219-106541-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 04:04:17,979 WARNING [train.py:1204] (2/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_458-67321-0 from training. Duration: 0.97 2023-10-02 04:04:25,554 WARNING [train.py:1204] (2/4) Exclude cut with ID _6653_210_2629_1_1527919172441_6732679_1051-182606-0_sp1.1 from training. Duration: 0.936375 2023-10-02 04:04:25,580 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_53555_210_5632_1_1534595220_7232297_603-4396-0_sp1.1 from training. Duration: 0.97275 2023-10-02 04:04:26,960 WARNING [train.py:1204] (2/4) Exclude cut with ID _54218_210_12210_1_1534413646388_4409259_159-144367-0_sp1.1 from training. Duration: 0.963625 2023-10-02 04:04:30,228 WARNING [train.py:1204] (2/4) Exclude cut with ID _7606_210_4915_1_1528894253277_5080690_301-10732-0_sp0.9 from training. Duration: 0.9 2023-10-02 04:04:30,232 WARNING [train.py:1204] (2/4) Exclude cut with ID _15026_210_8750_1_1530777602316_3291850_211-195043-0_sp0.9 from training. Duration: 0.888875 2023-10-02 04:04:31,746 WARNING [train.py:1204] (2/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_146-282400-0 from training. Duration: 0.71 2023-10-02 04:04:31,773 WARNING [train.py:1204] (2/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_65-88242-0 from training. Duration: 0.56 2023-10-02 04:04:33,137 WARNING [train.py:1204] (2/4) Exclude cut with ID _55857_210_2346_1_1534644007831_6898209_80-107757-0_sp1.1 from training. Duration: 0.963625 2023-10-02 04:04:33,166 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_15126_210_6753_1_1530778677_2591185_120-277707-0_sp0.9 from training. Duration: 0.888875 2023-10-02 04:04:34,675 WARNING [train.py:1204] (2/4) Exclude cut with ID _44778_210_2054_1_1533798127203_7443090_1033-168593-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 04:04:36,002 WARNING [train.py:1204] (2/4) Exclude cut with ID _46683_210_9642_1_1533730913492_4591116_475-30711-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 04:04:37,386 WARNING [train.py:1204] (2/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_253-308377-0 from training. Duration: 0.49 2023-10-02 04:04:37,450 WARNING [train.py:1204] (2/4) Exclude cut with ID _4680_210_3525_1_1527249757953_893386_75-78979-0 from training. Duration: 0.91 2023-10-02 04:04:38,757 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_5522_210_3273_1_1527249606_3909096_318-203153-0_sp0.9 from training. Duration: 0.47775 2023-10-02 04:04:41,745 WARNING [train.py:1204] (2/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_550-170116-0_sp1.1 from training. Duration: 0.92725 2023-10-02 04:04:41,778 WARNING [train.py:1204] (2/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_277-210798-0_sp0.9 from training. Duration: 0.611125 2023-10-02 04:04:41,833 WARNING [train.py:1204] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_620-276793-0 from training. Duration: 0.59 2023-10-02 04:04:41,852 WARNING [train.py:1204] (2/4) Exclude cut with ID _3067_210_2667_1_1526212974640_4503168_119-175776-0 from training. Duration: 0.74 2023-10-02 04:04:43,217 WARNING [train.py:1204] (2/4) Exclude cut with ID _55209_210_1531_1_1534675828996_4597160_681-195827-0_sp1.1 from training. Duration: 0.92725 2023-10-02 04:04:43,299 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_427-78097-0_sp1.1 from training. Duration: 0.72725 2023-10-02 04:04:44,735 WARNING [train.py:1204] (2/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_96-290039-0_sp0.9 from training. Duration: 0.62225 2023-10-02 04:04:45,991 INFO [train.py:1046] (2/4) Epoch 22, batch 100, loss[loss=0.1562, simple_loss=0.2306, pruned_loss=0.04092, over 24439.00 frames. ], tot_loss[loss=0.1745, simple_loss=0.2518, pruned_loss=0.04855, over 1899582.52 frames. ], batch size: 58, lr: 4.72e-03, grad_scale: 32.0 2023-10-02 04:04:46,034 WARNING [train.py:1204] (2/4) Exclude cut with ID _45329_210_7005_1_1534157951211_6774339_287-292174-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 04:04:49,355 WARNING [train.py:1204] (2/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_200-292228-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 04:04:49,583 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.feed_forward3.hidden_balancer.prob, batch_count=744360.0, ans=0.125 2023-10-02 04:04:52,081 WARNING [train.py:1204] (2/4) Exclude cut with ID _69884_210_9771_1_1535846392818_4166169_291-194999-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 04:04:54,860 WARNING [train.py:1204] (2/4) Exclude cut with ID _58895_210_8750_1_1535184081320_3649089_265-323810-0_sp1.1 from training. Duration: 0.87275 2023-10-02 04:04:54,987 WARNING [train.py:1204] (2/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_58-68956-0 from training. Duration: 0.83 2023-10-02 04:04:54,991 WARNING [train.py:1204] (2/4) Exclude cut with ID _39867_210_9826_1_1533553222030_9344259_322-115153-0_sp1.1 from training. Duration: 0.963625 2023-10-02 04:04:59,509 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_3371_210_1794_1_1526384462_5580777_396-339812-0_sp1.1 from training. Duration: 0.77275 2023-10-02 04:04:59,520 WARNING [train.py:1204] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_731-2428-0_sp0.9 from training. Duration: 0.92225 2023-10-02 04:04:59,547 WARNING [train.py:1204] (2/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_98-741-0_sp1.1 from training. Duration: 0.72725 2023-10-02 04:04:59,549 WARNING [train.py:1204] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_238-245655-0_sp0.9 from training. Duration: 0.9 2023-10-02 04:04:59,581 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6788_210_1388_1_1527898698_4562021_91-197981-0_sp0.9 from training. Duration: 0.92225 2023-10-02 04:05:01,621 WARNING [train.py:1204] (2/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_860-96198-0 from training. Duration: 0.73 2023-10-02 04:05:03,063 WARNING [train.py:1204] (2/4) Exclude cut with ID _14268_210_9206_1_1530622771285_4714099_331-105351-0_sp0.9 from training. Duration: 0.72225 2023-10-02 04:05:04,380 WARNING [train.py:1204] (2/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_204-137133-0_sp1.1 from training. Duration: 0.92725 2023-10-02 04:05:04,397 WARNING [train.py:1204] (2/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_5-46496-0_sp1.1 from training. Duration: 0.936375 2023-10-02 04:05:04,404 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_160-218721-0_sp1.1 from training. Duration: 0.87275 2023-10-02 04:05:04,682 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.conv_module1.balancer2.prob, batch_count=744426.6666666666, ans=0.125 2023-10-02 04:05:06,073 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.conv_module1.balancer1.prob, batch_count=744426.6666666666, ans=0.125 2023-10-02 04:05:08,459 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.503e+02 1.848e+02 2.160e+02 2.576e+02 4.696e+02, threshold=4.321e+02, percent-clipped=0.0 2023-10-02 04:05:08,546 WARNING [train.py:1204] (2/4) Exclude cut with ID _45329_210_7005_1_1534157951211_6774339_503-287883-0 from training. Duration: 0.88 2023-10-02 04:05:08,630 WARNING [train.py:1204] (2/4) Exclude cut with ID _57572_210_5517_1_1534921219624_3809484_110-79134-0_sp1.1 from training. Duration: 0.92725 2023-10-02 04:05:08,844 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.conv_module1.balancer1.prob, batch_count=744426.6666666666, ans=0.125 2023-10-02 04:05:09,985 WARNING [train.py:1204] (2/4) Exclude cut with ID _44585_210_18628_1_1533605350387_4518199_421-37641-0_sp1.1 from training. Duration: 0.936375 2023-10-02 04:05:11,807 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_42-142473-0_sp0.9 from training. Duration: 0.8 2023-10-02 04:05:13,203 WARNING [train.py:1204] (2/4) Exclude cut with ID _5166_210_2084_1_1527073197160_4087993_310-114452-0_sp0.9 from training. Duration: 0.6555625 2023-10-02 04:05:17,213 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9029W0120-509520 from training. Duration: 0.768 2023-10-02 04:05:17,236 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9029W0433-509820 from training. Duration: 0.896 2023-10-02 04:05:17,570 INFO [scaling.py:1118] (2/4) WithLoss: name=encoder.encoders.5.encoder.layers.0.self_attn_weights, loss-sum=0.000e+00 2023-10-02 04:05:18,650 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_40222_210_6228_1_1533385298_4481939_163-334586-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 04:05:18,650 WARNING [train.py:1204] (2/4) Exclude cut with ID _48677_210_5663_1_1533902405919_3892989_408-40740-0_sp1.1 from training. Duration: 0.7545625 2023-10-02 04:05:22,028 WARNING [train.py:1204] (2/4) Exclude cut with ID _24314_210_4915_1_1532135078116_7170179_521-258868-0_sp0.9 from training. Duration: 0.87775 2023-10-02 04:05:23,455 WARNING [train.py:1204] (2/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_576-51198-0_sp1.1 from training. Duration: 0.92725 2023-10-02 04:05:23,574 WARNING [train.py:1204] (2/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_306-216871-0_sp1.1 from training. Duration: 0.97275 2023-10-02 04:05:28,490 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=744493.3333333334, ans=0.1 2023-10-02 04:05:29,649 WARNING [train.py:1204] (2/4) Exclude cut with ID _20684_210_6917_1_1531567751646_4203930_463-119982-0_sp1.1 from training. Duration: 0.97275 2023-10-02 04:05:31,007 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9084W0012-536850 from training. Duration: 0.64 2023-10-02 04:05:33,145 WARNING [train.py:1204] (2/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_847-41334-0_sp1.1 from training. Duration: 0.4 2023-10-02 04:05:34,687 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.conv_module2.balancer2.prob, batch_count=744560.0, ans=0.125 2023-10-02 04:05:37,231 WARNING [train.py:1204] (2/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_326-165900-0_sp1.1 from training. Duration: 0.62725 2023-10-02 04:05:37,296 WARNING [train.py:1204] (2/4) Exclude cut with ID _7537_210_2084_1_1528502395431_3924359_128-268029-0_sp1.1 from training. Duration: 0.8909375 2023-10-02 04:05:39,910 WARNING [train.py:1204] (2/4) Exclude cut with ID _67499_210_17282_1_1535509805220_3692430_333-155030-0_sp1.1 from training. Duration: 0.97275 2023-10-02 04:05:43,370 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_34008_210_10431_1_1533345424_8183247_588-257177-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 04:05:46,180 WARNING [train.py:1204] (2/4) Exclude cut with ID _25914_210_6400_1_1532141317058_644740_52-96808-0_sp1.1 from training. Duration: 0.82725 2023-10-02 04:05:47,635 WARNING [train.py:1204] (2/4) Exclude cut with ID _56494_210_4220_1_1535284837625_3033169_69-138150-0_sp1.1 from training. Duration: 0.8545625 2023-10-02 04:05:50,397 WARNING [train.py:1204] (2/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_19-143553-0_sp1.1 from training. Duration: 0.97275 2023-10-02 04:05:50,444 WARNING [train.py:1204] (2/4) Exclude cut with ID _21878_210_3525_1_1531738772002_7264330_582-130735-0_sp1.1 from training. Duration: 0.936375 2023-10-02 04:05:50,699 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.conv_module2.balancer2.min_abs, batch_count=744626.6666666666, ans=0.5 2023-10-02 04:05:53,575 WARNING [train.py:1204] (2/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_943-201356-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 04:05:53,593 WARNING [train.py:1204] (2/4) Exclude cut with ID _3231_210_2106_1_1526205864934_6784220_422-229276-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 04:05:53,623 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_48049_210_5632_1_1533864553_3519937_73-198717-0_sp1.1 from training. Duration: 0.97275 2023-10-02 04:05:54,922 WARNING [train.py:1204] (2/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_329-285801-0 from training. Duration: 0.91 2023-10-02 04:05:54,937 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9171W0114-579000 from training. Duration: 0.896 2023-10-02 04:05:54,947 WARNING [train.py:1204] (2/4) Exclude cut with ID _3120_210_3525_1_1526036143112_4072980_210-132675-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 04:05:56,336 WARNING [train.py:1204] (2/4) Exclude cut with ID _55857_210_2346_1_1534644007831_6898209_745-98391-0_sp0.9 from training. Duration: 0.8444375 2023-10-02 04:05:56,396 WARNING [train.py:1204] (2/4) Exclude cut with ID _5250_210_5219_1_1527249561806_8692110_615-322035-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 04:05:56,401 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_36756_210_7234_1_1533026991_966276_119-315767-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 04:05:56,422 WARNING [train.py:1204] (2/4) Exclude cut with ID _13916_210_9994_1_1530788341126_3763149_60-191556-0_sp1.1 from training. Duration: 0.5545625 2023-10-02 04:05:56,438 WARNING [train.py:1204] (2/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_177-236809-0_sp1.1 from training. Duration: 0.4454375 2023-10-02 04:05:56,477 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7002_210_1794_1_1527939683_5157286_184-229630-0_sp1.1 from training. Duration: 0.67275 2023-10-02 04:05:58,270 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_50428_210_5724_1_1534247532_4126418_464-18322-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 04:05:58,870 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.3.encoder.layers.2.feed_forward2.out_whiten, num_groups=1, num_channels=512, metric=13.37 vs. limit=15.0 2023-10-02 04:05:59,613 WARNING [train.py:1204] (2/4) Exclude cut with ID _35799_210_5854_1_1533263487887_3529071_466-131402-0_sp1.1 from training. Duration: 0.936375 2023-10-02 04:05:59,682 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_1259-132768-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 04:06:00,985 INFO [train.py:1046] (2/4) Epoch 22, batch 150, loss[loss=0.1491, simple_loss=0.2326, pruned_loss=0.03274, over 24273.00 frames. ], tot_loss[loss=0.1764, simple_loss=0.2528, pruned_loss=0.05001, over 2523784.54 frames. ], batch size: 61, lr: 4.72e-03, grad_scale: 32.0 2023-10-02 04:06:01,046 WARNING [train.py:1204] (2/4) Exclude cut with ID _6694_210_4599_1_1527852637490_3965529_144-185051-0_sp1.1 from training. Duration: 0.87275 2023-10-02 04:06:02,436 WARNING [train.py:1204] (2/4) Exclude cut with ID _64639_210_5816_1_1535367619437_3528000_59-133290-0_sp1.1 from training. Duration: 0.963625 2023-10-02 04:06:05,804 WARNING [train.py:1204] (2/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_223-49525-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 04:06:08,637 WARNING [train.py:1204] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_748-36150-0_sp1.1 from training. Duration: 0.82725 2023-10-02 04:06:08,638 WARNING [train.py:1204] (2/4) Exclude cut with ID _19896_210_1883_1_1531709022662_6649569_52-221627-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 04:06:08,707 WARNING [train.py:1204] (2/4) Exclude cut with ID _6789_210_1534_1_1527854471671_2870519_296-115766-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 04:06:11,550 WARNING [train.py:1204] (2/4) Exclude cut with ID _14624_210_1925_1_1530858690657_3642219_100-19871-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 04:06:11,582 WARNING [train.py:1204] (2/4) Exclude cut with ID _12555_210_8750_1_1530075594402_3780239_50-293675-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 04:06:14,316 WARNING [train.py:1204] (2/4) Exclude cut with ID _14880_210_8895_1_1530680426655_3795259_289-212817-0_sp1.1 from training. Duration: 0.62725 2023-10-02 04:06:15,674 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_14056_210_4163_1_1530604483_7572827_388-211626-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 04:06:20,320 WARNING [train.py:1204] (2/4) Exclude cut with ID _5289_210_2084_1_1527678202736_3779299_392-261988-0 from training. Duration: 0.65 2023-10-02 04:06:20,334 WARNING [train.py:1204] (2/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_124-345840-0 from training. Duration: 0.52 2023-10-02 04:06:20,338 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_2655_210_2087_1_1525688959_3897400_437-337690-0 from training. Duration: 0.63 2023-10-02 04:06:23,051 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_3780_210_2087_1_1526467391_239068_13-274633-0_sp1.1 from training. Duration: 0.92725 2023-10-02 04:06:23,055 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_805-71497-0_sp1.1 from training. Duration: 0.6454375 2023-10-02 04:06:24,399 WARNING [train.py:1204] (2/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_434-83000-0_sp1.1 from training. Duration: 0.8909375 2023-10-02 04:06:26,300 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_17613_210_2151_1_1531137569_2366160_285-219531-0_sp1.1 from training. Duration: 0.97275 2023-10-02 04:06:26,304 WARNING [train.py:1204] (2/4) Exclude cut with ID _56108_210_16380_1_1534505446159_3887959_22-37858-0_sp1.1 from training. Duration: 0.936375 2023-10-02 04:06:26,336 WARNING [train.py:1204] (2/4) Exclude cut with ID _28427_210_4220_1_1532581240967_3061069_200-88444-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 04:06:26,399 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_43802_210_2290_1_1533516203_8829504_77-279042-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 04:06:27,718 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9309W0193-640320 from training. Duration: 0.896 2023-10-02 04:06:30,488 WARNING [train.py:1204] (2/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_516-324055-0_sp1.1 from training. Duration: 0.936375 2023-10-02 04:06:35,645 WARNING [train.py:1204] (2/4) Exclude cut with ID _40094_210_16380_1_1533294005773_3641159_10-261240-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 04:06:38,235 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.0.layers.0.nonlin_attention.whiten1, num_groups=1, num_channels=144, metric=9.92 vs. limit=10.0 2023-10-02 04:06:38,602 WARNING [train.py:1204] (2/4) Exclude cut with ID _6267_210_2084_1_1527764658526_3894730_375-219887-0_sp1.1 from training. Duration: 0.6090625 2023-10-02 04:06:39,953 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6788_210_1388_1_1527898698_4562021_180-75288-0 from training. Duration: 0.99 2023-10-02 04:06:43,972 WARNING [train.py:1204] (2/4) Exclude cut with ID _8030_210_7497_1_1528444886781_3531089_262-86750-0_sp1.1 from training. Duration: 0.736375 2023-10-02 04:06:44,000 WARNING [train.py:1204] (2/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_934-325498-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 04:06:45,402 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_456-22141-0_sp0.9 from training. Duration: 0.97775 2023-10-02 04:06:46,861 WARNING [train.py:1204] (2/4) Exclude cut with ID _49643_210_7989_1_1534640244224_3939031_598-161926-0_sp1.1 from training. Duration: 0.8181875 2023-10-02 04:06:48,847 WARNING [train.py:1204] (2/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_345-49721-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 04:06:50,200 WARNING [train.py:1204] (2/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_440-141811-0_sp1.1 from training. Duration: 0.863625 2023-10-02 04:06:50,288 WARNING [train.py:1204] (2/4) Exclude cut with ID _64048_210_10626_1_1535198382802_3835179_394-212120-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 04:06:50,328 WARNING [train.py:1204] (2/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_91-277684-0 from training. Duration: 0.74 2023-10-02 04:06:54,428 WARNING [train.py:1204] (2/4) Exclude cut with ID _23038_210_9570_1_1532001584158_3652419_191-346343-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 04:06:55,823 WARNING [train.py:1204] (2/4) Exclude cut with ID _15140_210_9288_1_1530791388450_4880149_351-347974-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 04:06:55,863 WARNING [train.py:1204] (2/4) Exclude cut with ID _58605_210_12210_1_1534723195939_3904259_48-269402-0_sp1.1 from training. Duration: 0.87275 2023-10-02 04:06:55,869 WARNING [train.py:1204] (2/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_401-53423-0_sp0.9 from training. Duration: 0.611125 2023-10-02 04:06:59,065 WARNING [train.py:1204] (2/4) Exclude cut with ID _42659_210_2070_1_1533607442765_3120380_158-49004-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 04:07:00,434 WARNING [train.py:1204] (2/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_81-75849-0_sp1.1 from training. Duration: 0.3545625 2023-10-02 04:07:02,517 WARNING [train.py:1204] (2/4) Exclude cut with ID _27052_210_5986_1_1532329197187_3585009_314-106499-0_sp1.1 from training. Duration: 0.836375 2023-10-02 04:07:02,761 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.ff2_skip_rate, batch_count=744960.0, ans=0.0 2023-10-02 04:07:03,897 WARNING [train.py:1204] (2/4) Exclude cut with ID _4626_210_3532_1_1526817652717_3916859_446-206656-0_sp0.9 from training. Duration: 0.8444375 2023-10-02 04:07:05,265 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_53378_210_17282_1_1534221071_5769509_158-114960-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 04:07:07,134 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_3171_210_2087_1_1526109084_2330007_37-64334-0_sp0.9 from training. Duration: 0.911125 2023-10-02 04:07:08,431 WARNING [train.py:1204] (2/4) Exclude cut with ID _5410_210_1388_1_1527397055795_1719020_39-112221-0 from training. Duration: 0.69 2023-10-02 04:07:08,457 WARNING [train.py:1204] (2/4) Exclude cut with ID _8064_210_5399_1_1528372299291_4282973_4-91037-0_sp0.9 from training. Duration: 0.97775 2023-10-02 04:07:08,479 WARNING [train.py:1204] (2/4) Exclude cut with ID ID0079W0443-717795 from training. Duration: 0.541 2023-10-02 04:07:12,576 WARNING [train.py:1204] (2/4) Exclude cut with ID _34734_210_4525_1_1533365977542_4448720_766-297928-0_sp1.1 from training. Duration: 0.936375 2023-10-02 04:07:15,450 INFO [train.py:1046] (2/4) Epoch 22, batch 200, loss[loss=0.1874, simple_loss=0.2711, pruned_loss=0.05188, over 24021.00 frames. ], tot_loss[loss=0.1768, simple_loss=0.2536, pruned_loss=0.05002, over 3014608.89 frames. ], batch size: 80, lr: 4.72e-03, grad_scale: 16.0 2023-10-02 04:07:15,540 WARNING [train.py:1204] (2/4) Exclude cut with ID _51471_210_1889_1_1534399391390_3094250_310-337793-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 04:07:15,558 WARNING [train.py:1204] (2/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_678-159307-0_sp0.9 from training. Duration: 0.9444375 2023-10-02 04:07:17,109 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.out_combiner.scale_min, batch_count=745026.6666666666, ans=0.2 2023-10-02 04:07:20,147 WARNING [train.py:1204] (2/4) Exclude cut with ID _50093_210_4220_1_1534417141207_3432299_259-322252-0 from training. Duration: 0.92 2023-10-02 04:07:20,220 WARNING [train.py:1204] (2/4) Exclude cut with ID _47790_210_16187_1_1533862814066_4207519_67-103444-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 04:07:21,602 WARNING [train.py:1204] (2/4) Exclude cut with ID _40551_210_10635_1_1533776424632_3416179_311-247578-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 04:07:23,072 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_4633_210_2040_1_1526824163_4145343_416-76117-0 from training. Duration: 0.95 2023-10-02 04:07:24,423 WARNING [train.py:1204] (2/4) Exclude cut with ID _5166_210_2084_1_1527073197160_4087993_331-117829-0_sp0.9 from training. Duration: 0.688875 2023-10-02 04:07:25,878 WARNING [train.py:1204] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_299-247059-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 04:07:27,233 WARNING [train.py:1204] (2/4) Exclude cut with ID _64002_210_9570_1_1535198605157_38979_2-240845-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 04:07:31,924 WARNING [train.py:1204] (2/4) Exclude cut with ID _4681_210_3525_1_1527250689464_120889_15-126760-0_sp0.9 from training. Duration: 0.9 2023-10-02 04:07:31,947 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_21500_210_2054_1_1532149310_2716758_256-156611-0_sp1.1 from training. Duration: 0.936375 2023-10-02 04:07:31,965 WARNING [train.py:1204] (2/4) Exclude cut with ID _16908_210_5724_1_1531119157248_3772700_624-203729-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 04:07:40,070 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.425e+02 1.856e+02 2.062e+02 2.415e+02 5.556e+02, threshold=4.124e+02, percent-clipped=1.0 2023-10-02 04:07:50,018 WARNING [train.py:1204] (2/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_605-219732-0_sp1.1 from training. Duration: 0.8545625 2023-10-02 04:07:50,532 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.4.encoder.layers.1.whiten, num_groups=1, num_channels=384, metric=4.27 vs. limit=12.0 2023-10-02 04:07:51,412 WARNING [train.py:1204] (2/4) Exclude cut with ID _44718_210_5816_1_1534050051869_6469030_380-167520-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 04:07:51,489 WARNING [train.py:1204] (2/4) Exclude cut with ID _19896_210_1883_1_1531709022662_6649569_94-229927-0_sp1.1 from training. Duration: 0.8818125 2023-10-02 04:07:53,343 WARNING [train.py:1204] (2/4) Exclude cut with ID _19514_210_9259_1_1531530071330_5084230_519-174300-0_sp1.1 from training. Duration: 0.963625 2023-10-02 04:07:53,476 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.0.bypass.scale_min, batch_count=745160.0, ans=0.2 2023-10-02 04:07:54,704 WARNING [train.py:1204] (2/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_340-174176-0_sp0.9 from training. Duration: 0.5444375 2023-10-02 04:07:54,707 WARNING [train.py:1204] (2/4) Exclude cut with ID _8263_210_1664_1_1528439452891_970220_68-181805-0_sp1.1 from training. Duration: 0.5818125 2023-10-02 04:07:54,820 WARNING [train.py:1204] (2/4) Exclude cut with ID _3120_210_3525_1_1526036143112_4072980_348-257723-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 04:07:56,149 WARNING [train.py:1204] (2/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_453-133983-0_sp1.1 from training. Duration: 0.7090625 2023-10-02 04:07:57,568 WARNING [train.py:1204] (2/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_17-86060-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 04:07:57,595 WARNING [train.py:1204] (2/4) Exclude cut with ID _29877_210_6228_1_1532397791106_5276149_228-184657-0_sp1.1 from training. Duration: 0.97275 2023-10-02 04:07:58,582 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.0.layers.0.feed_forward2.out_whiten, num_groups=1, num_channels=192, metric=6.03 vs. limit=15.0 2023-10-02 04:07:59,004 WARNING [train.py:1204] (2/4) Exclude cut with ID _18516_210_9206_1_1531704653206_3692406_198-334870-0 from training. Duration: 0.92 2023-10-02 04:07:59,044 WARNING [train.py:1204] (2/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_340-174176-0_sp1.1 from training. Duration: 0.4454375 2023-10-02 04:07:59,062 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_3133_210_2087_1_1526106446_2324690_14-139186-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 04:08:05,147 WARNING [train.py:1204] (2/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_126-29104-0_sp0.9 from training. Duration: 0.9444375 2023-10-02 04:08:10,671 WARNING [train.py:1204] (2/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_84-10310-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 04:08:16,846 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_15517_210_6753_1_1530954008_3487336_79-273076-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 04:08:16,899 WARNING [train.py:1204] (2/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_371-18169-0_sp0.9 from training. Duration: 0.9555625 2023-10-02 04:08:21,270 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.1.conv_skip_rate, batch_count=745293.3333333334, ans=0.0 2023-10-02 04:08:25,798 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_68702_210_23689_1_1535799577_3420068_170-92788-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 04:08:26,452 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.4.encoder.layers.1.nonlin_attention.whiten2, num_groups=1, num_channels=384, metric=3.18 vs. limit=15.0 2023-10-02 04:08:28,546 WARNING [train.py:1204] (2/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_78-182912-0 from training. Duration: 0.59 2023-10-02 04:08:28,603 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_3019_210_2028_1_1526044245_3264865_297-12384-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 04:08:28,608 WARNING [train.py:1204] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_147-111137-0_sp1.1 from training. Duration: 0.863625 2023-10-02 04:08:29,820 INFO [train.py:1046] (2/4) Epoch 22, batch 250, loss[loss=0.1716, simple_loss=0.2512, pruned_loss=0.04603, over 24444.00 frames. ], tot_loss[loss=0.1766, simple_loss=0.2529, pruned_loss=0.05014, over 3394519.17 frames. ], batch size: 63, lr: 4.72e-03, grad_scale: 16.0 2023-10-02 04:08:29,870 WARNING [train.py:1204] (2/4) Exclude cut with ID _28323_210_16187_1_1532480395667_4100300_117-8859-0_sp1.1 from training. Duration: 0.97275 2023-10-02 04:08:29,959 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7762_210_1794_1_1528199508_4057408_419-125009-0_sp1.1 from training. Duration: 0.7545625 2023-10-02 04:08:30,197 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.conv_module1.balancer2.prob, batch_count=745360.0, ans=0.125 2023-10-02 04:08:31,361 WARNING [train.py:1204] (2/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_268-87909-0 from training. Duration: 0.99 2023-10-02 04:08:31,429 WARNING [train.py:1204] (2/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_118-311357-0_sp1.1 from training. Duration: 0.92725 2023-10-02 04:08:31,442 WARNING [train.py:1204] (2/4) Exclude cut with ID ID0377W0003-870225 from training. Duration: 0.852 2023-10-02 04:08:33,512 WARNING [train.py:1204] (2/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_56-5838-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 04:08:34,111 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.3.encoder.layers.1.feed_forward3.out_whiten, num_groups=1, num_channels=512, metric=12.54 vs. limit=15.0 2023-10-02 04:08:36,707 WARNING [train.py:1204] (2/4) Exclude cut with ID _14603_210_4220_1_1530615688441_3097319_90-31383-0_sp0.9 from training. Duration: 0.9666875 2023-10-02 04:08:36,803 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_31583_210_3852_1_1532595851_4024413_58-86657-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 04:08:38,119 WARNING [train.py:1204] (2/4) Exclude cut with ID _30304_210_6319_1_1532476827261_7582060_344-113875-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 04:08:39,498 WARNING [train.py:1204] (2/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_278-104676-0_sp1.1 from training. Duration: 0.8909375 2023-10-02 04:08:40,874 WARNING [train.py:1204] (2/4) Exclude cut with ID _55844_210_2346_1_1534471212569_7625707_764-2387-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 04:08:42,328 WARNING [train.py:1204] (2/4) Exclude cut with ID _27430_210_7770_1_1532604689375_1378590_157-14963-0_sp1.1 from training. Duration: 0.963625 2023-10-02 04:08:45,592 WARNING [train.py:1204] (2/4) Exclude cut with ID _7160_210_1876_1_1527940923231_3703799_282-322611-0_sp1.1 from training. Duration: 0.7909375 2023-10-02 04:08:55,475 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.feed_forward3.hidden_balancer.prob, batch_count=745426.6666666666, ans=0.125 2023-10-02 04:08:57,214 WARNING [train.py:1204] (2/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_190-138640-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 04:09:00,023 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_33348_210_5632_1_1533086912_3324030_63-270922-0_sp1.1 from training. Duration: 0.97275 2023-10-02 04:09:00,053 WARNING [train.py:1204] (2/4) Exclude cut with ID _54871_210_22356_1_1534665798253_3856359_135-184281-0_sp1.1 from training. Duration: 0.9 2023-10-02 04:09:01,742 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.nonlin_attention.balancer.prob, batch_count=745493.3333333334, ans=0.125 2023-10-02 04:09:06,871 WARNING [train.py:1204] (2/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_277-210798-0_sp1.1 from training. Duration: 0.5 2023-10-02 04:09:06,933 WARNING [train.py:1204] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_402-253027-0_sp0.9 from training. Duration: 0.6 2023-10-02 04:09:09,567 WARNING [train.py:1204] (2/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_540-275685-0_sp0.9 from training. Duration: 0.988875 2023-10-02 04:09:09,609 WARNING [train.py:1204] (2/4) Exclude cut with ID _59493_210_17282_1_1535078419077_3225879_118-18960-0_sp1.1 from training. Duration: 0.936375 2023-10-02 04:09:10,739 WARNING [train.py:1204] (2/4) Exclude cut with ID _6194_210_1534_1_1527511826160_3941194_321-148641-0_sp1.1 from training. Duration: 0.5818125 2023-10-02 04:09:10,746 WARNING [train.py:1204] (2/4) Exclude cut with ID _61133_210_9969_1_1535196777227_3696409_333-270325-0_sp1.1 from training. Duration: 0.8454375 2023-10-02 04:09:10,812 WARNING [train.py:1204] (2/4) Exclude cut with ID _49376_210_5986_1_1533985278732_3370740_307-145221-0_sp1.1 from training. Duration: 0.936375 2023-10-02 04:09:15,117 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6846_210_6753_1_1527850767_4295158_2-315073-0_sp0.9 from training. Duration: 0.97775 2023-10-02 04:09:18,454 WARNING [train.py:1204] (2/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_17-25045-0 from training. Duration: 0.59 2023-10-02 04:09:18,475 WARNING [train.py:1204] (2/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_503-333510-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 04:09:19,861 WARNING [train.py:1204] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_238-245655-0_sp1.1 from training. Duration: 0.736375 2023-10-02 04:09:19,883 WARNING [train.py:1204] (2/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_329-82235-0_sp0.9 from training. Duration: 0.911125 2023-10-02 04:09:19,888 WARNING [train.py:1204] (2/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_325-113798-0_sp0.9 from training. Duration: 0.9444375 2023-10-02 04:09:19,971 WARNING [train.py:1204] (2/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_395-37222-0_sp0.9 from training. Duration: 0.8444375 2023-10-02 04:09:20,046 WARNING [train.py:1204] (2/4) Exclude cut with ID _64135_210_2253_1_1535677267876_7608972_498-5039-0_sp1.1 from training. Duration: 0.7090625 2023-10-02 04:09:21,372 WARNING [train.py:1204] (2/4) Exclude cut with ID _14922_210_6228_1_1530707435273_3693310_303-228738-0_sp0.9 from training. Duration: 0.9333125 2023-10-02 04:09:22,758 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_577-220899-0_sp1.1 from training. Duration: 0.92725 2023-10-02 04:09:24,191 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_3475_210_2028_1_1526193578_3622569_377-166133-0_sp1.1 from training. Duration: 0.7818125 2023-10-02 04:09:25,553 WARNING [train.py:1204] (2/4) Exclude cut with ID _49376_210_5986_1_1533985278732_3370740_272-313512-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 04:09:27,641 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7762_210_1794_1_1528199508_4057408_422-227034-0_sp0.9 from training. Duration: 0.811125 2023-10-02 04:09:31,829 WARNING [train.py:1204] (2/4) Exclude cut with ID _18895_210_6228_1_1531357274234_3784930_74-341428-0_sp1.1 from training. Duration: 0.92725 2023-10-02 04:09:31,987 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.0.conv_module1.balancer2.prob, batch_count=745626.6666666666, ans=0.125 2023-10-02 04:09:34,512 WARNING [train.py:1204] (2/4) Exclude cut with ID _44902_210_16187_1_1533888006062_3792078_149-67212-0_sp1.1 from training. Duration: 0.7909375 2023-10-02 04:09:39,177 WARNING [train.py:1204] (2/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_243-126020-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 04:09:41,237 WARNING [train.py:1204] (2/4) Exclude cut with ID _54218_210_12210_1_1534413646388_4409259_259-6561-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 04:09:45,216 INFO [train.py:1046] (2/4) Epoch 22, batch 300, loss[loss=0.1718, simple_loss=0.2544, pruned_loss=0.04461, over 24484.00 frames. ], tot_loss[loss=0.1749, simple_loss=0.2508, pruned_loss=0.0495, over 3688060.56 frames. ], batch size: 63, lr: 4.72e-03, grad_scale: 16.0 2023-10-02 04:09:45,255 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_41452_210_7291_1_1533437687_3941281_405-11559-0 from training. Duration: 0.91 2023-10-02 04:09:45,349 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_20120_210_6753_1_1531643035_5482859_759-225084-0_sp1.1 from training. Duration: 0.97275 2023-10-02 04:09:45,362 WARNING [train.py:1204] (2/4) Exclude cut with ID _14747_210_8341_1_1530685797463_3654089_147-131229-0_sp0.9 from training. Duration: 0.8444375 2023-10-02 04:09:45,919 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.4.encoder.layers.2.nonlin_attention.whiten2, num_groups=1, num_channels=384, metric=3.49 vs. limit=15.0 2023-10-02 04:09:48,693 WARNING [train.py:1204] (2/4) Exclude cut with ID _14867_210_1968_1_1530684179890_3846969_319-250868-0 from training. Duration: 0.99 2023-10-02 04:09:48,716 WARNING [train.py:1204] (2/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_580-93798-0_sp0.9 from training. Duration: 0.62225 2023-10-02 04:09:50,313 WARNING [train.py:1204] (2/4) Exclude cut with ID _9679_210_7497_1_1529129212906_4363929_512-313889-0_sp0.9 from training. Duration: 0.9 2023-10-02 04:09:50,323 WARNING [train.py:1204] (2/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_18-189198-0 from training. Duration: 0.9 2023-10-02 04:09:54,473 WARNING [train.py:1204] (2/4) Exclude cut with ID _49755_210_18053_1_1534581005846_3218709_137-64179-0_sp1.1 from training. Duration: 0.92725 2023-10-02 04:09:55,380 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.0.layers.0.self_attn1.whiten, num_groups=1, num_channels=192, metric=12.10 vs. limit=22.5 2023-10-02 04:09:55,693 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_48049_210_5632_1_1533864553_3519937_306-283794-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 04:09:59,079 WARNING [train.py:1204] (2/4) Exclude cut with ID _43552_210_8913_1_1533726031059_3523429_25-120161-0_sp1.1 from training. Duration: 0.8909375 2023-10-02 04:10:00,364 WARNING [train.py:1204] (2/4) Exclude cut with ID _8833_210_5219_1_1528592665210_7553059_319-349181-0 from training. Duration: 0.99 2023-10-02 04:10:00,692 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.nonlin_attention.balancer.max_positive, batch_count=745760.0, ans=0.95 2023-10-02 04:10:01,782 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_296-332325-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 04:10:03,143 WARNING [train.py:1204] (2/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_436-186311-0_sp1.1 from training. Duration: 0.5909375 2023-10-02 04:10:03,161 WARNING [train.py:1204] (2/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_348-127789-0 from training. Duration: 0.85 2023-10-02 04:10:03,166 WARNING [train.py:1204] (2/4) Exclude cut with ID _3992_210_1747_1_1526644816031_3731060_443-195183-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 04:10:08,535 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.601e+02 1.896e+02 2.196e+02 2.526e+02 3.714e+02, threshold=4.393e+02, percent-clipped=0.0 2023-10-02 04:10:08,604 WARNING [train.py:1204] (2/4) Exclude cut with ID _3465_210_3525_1_1526175667060_3574049_308-231735-0_sp1.1 from training. Duration: 0.663625 2023-10-02 04:10:11,830 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6786_210_1388_1_1527839699_4194495_378-46070-0_sp1.1 from training. Duration: 0.6545625 2023-10-02 04:10:11,873 WARNING [train.py:1204] (2/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_63-101996-0 from training. Duration: 0.88 2023-10-02 04:10:12,127 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.bypass_mid.scale_min, batch_count=745760.0, ans=0.2 2023-10-02 04:10:15,249 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_2541_210_1794_1_1525590269_3084765_156-173713-0 from training. Duration: 0.81 2023-10-02 04:10:15,302 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_217-239796-0_sp1.1 from training. Duration: 0.963625 2023-10-02 04:10:17,396 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.bypass.scale_min, batch_count=745826.6666666666, ans=0.2 2023-10-02 04:10:18,531 WARNING [train.py:1204] (2/4) Exclude cut with ID _21878_210_3525_1_1531738772002_7264330_530-329400-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 04:10:20,038 WARNING [train.py:1204] (2/4) Exclude cut with ID _21783_210_11357_1_1531742458730_3762679_150-62920-0_sp1.1 from training. Duration: 0.963625 2023-10-02 04:10:20,038 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_5522_210_3273_1_1527249606_3909096_366-172508-0 from training. Duration: 0.66 2023-10-02 04:10:20,041 WARNING [train.py:1204] (2/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_303-340446-0_sp1.1 from training. Duration: 0.8181875 2023-10-02 04:10:22,757 WARNING [train.py:1204] (2/4) Exclude cut with ID _8699_210_2069_1_1528630346831_3960409_160-24408-0_sp1.1 from training. Duration: 0.82725 2023-10-02 04:10:22,911 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder_embed.convnext.out_balancer.prob, batch_count=745826.6666666666, ans=0.125 2023-10-02 04:10:24,164 WARNING [train.py:1204] (2/4) Exclude cut with ID _41314_210_12651_1_1533459476543_3766460_299-187801-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 04:10:24,193 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_34886_210_10633_1_1533203882_1755661_92-345288-0_sp1.1 from training. Duration: 0.97275 2023-10-02 04:10:27,587 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_5438_210_1794_1_1527336687_2600009_104-288274-0_sp1.1 from training. Duration: 0.52725 2023-10-02 04:10:27,591 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_154-316450-0 from training. Duration: 0.76 2023-10-02 04:10:28,974 WARNING [train.py:1204] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_23-189234-0_sp0.9 from training. Duration: 0.92225 2023-10-02 04:10:31,701 WARNING [train.py:1204] (2/4) Exclude cut with ID _4779_210_2084_1_1527318022944_3914893_346-168820-0_sp1.1 from training. Duration: 0.963625 2023-10-02 04:10:33,063 WARNING [train.py:1204] (2/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_5-55893-0 from training. Duration: 0.55 2023-10-02 04:10:33,142 WARNING [train.py:1204] (2/4) Exclude cut with ID _20455_210_5219_1_1531565996775_8098100_180-24517-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 04:10:37,413 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_3133_210_2087_1_1526106446_2324690_243-143119-0_sp0.9 from training. Duration: 0.8555625 2023-10-02 04:10:40,553 WARNING [train.py:1204] (2/4) Exclude cut with ID _65585_210_12210_1_1535450451584_3764098_293-212045-0_sp1.1 from training. Duration: 0.92725 2023-10-02 04:10:40,557 WARNING [train.py:1204] (2/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_577-332600-0 from training. Duration: 0.57 2023-10-02 04:10:44,908 INFO [scaling.py:1118] (2/4) WithLoss: name=encoder.encoders.3.encoder.layers.0.self_attn_weights, loss-sum=0.000e+00 2023-10-02 04:10:45,833 WARNING [train.py:1204] (2/4) Exclude cut with ID _30039_210_13270_1_1532484612900_2725150_104-289874-0_sp1.1 from training. Duration: 0.963625 2023-10-02 04:10:45,840 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_3172_210_2087_1_1526292929_3987865_197-97762-0_sp1.1 from training. Duration: 0.6818125 2023-10-02 04:10:47,307 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_256-150908-0_sp1.1 from training. Duration: 0.963625 2023-10-02 04:10:48,682 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_296-287678-0_sp1.1 from training. Duration: 0.863625 2023-10-02 04:10:50,042 WARNING [train.py:1204] (2/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_443-30034-0 from training. Duration: 0.64 2023-10-02 04:10:50,068 WARNING [train.py:1204] (2/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_494-199538-0_sp0.9 from training. Duration: 0.8333125 2023-10-02 04:10:50,120 WARNING [train.py:1204] (2/4) Exclude cut with ID _19708_210_9206_1_1532136650733_4238364_61-162325-0_sp1.1 from training. Duration: 0.97275 2023-10-02 04:10:50,748 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.4.encoder.layers.1.whiten, num_groups=1, num_channels=384, metric=4.61 vs. limit=12.0 2023-10-02 04:10:51,481 WARNING [train.py:1204] (2/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_475-127455-0 from training. Duration: 0.7 2023-10-02 04:10:52,743 WARNING [train.py:1204] (2/4) Exclude cut with ID _48677_210_5663_1_1533902405919_3892989_188-11581-0_sp1.1 from training. Duration: 0.963625 2023-10-02 04:10:52,780 WARNING [train.py:1204] (2/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_136-8381-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 04:10:54,263 WARNING [train.py:1204] (2/4) Exclude cut with ID _55328_210_27767_1_1534580973260_4088170_270-229555-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 04:10:56,083 WARNING [train.py:1204] (2/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_372-266951-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 04:10:56,143 WARNING [train.py:1204] (2/4) Exclude cut with ID _47790_210_16187_1_1533862814066_4207519_14-242149-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 04:11:00,118 INFO [train.py:1046] (2/4) Epoch 22, batch 350, loss[loss=0.1587, simple_loss=0.2376, pruned_loss=0.03987, over 24464.00 frames. ], tot_loss[loss=0.1739, simple_loss=0.2501, pruned_loss=0.04888, over 3928787.35 frames. ], batch size: 63, lr: 4.72e-03, grad_scale: 16.0 2023-10-02 04:11:01,566 WARNING [train.py:1204] (2/4) Exclude cut with ID _7877_210_6014_1_1528286358099_3750136_159-76512-0_sp1.1 from training. Duration: 0.936375 2023-10-02 04:11:01,568 WARNING [train.py:1204] (2/4) Exclude cut with ID _8263_210_1664_1_1528438953494_294100_40-204731-0_sp1.1 from training. Duration: 0.4545625 2023-10-02 04:11:01,855 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.conv_module2.balancer1.prob, batch_count=746026.6666666666, ans=0.125 2023-10-02 04:11:05,620 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8278_210_2550_1_1528637099_4220413_694-238413-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 04:11:05,907 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.feed_forward3.hidden_balancer.prob, batch_count=746026.6666666666, ans=0.125 2023-10-02 04:11:08,603 WARNING [train.py:1204] (2/4) Exclude cut with ID _47278_210_12041_1_1533902418349_3576110_421-172710-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 04:11:13,091 WARNING [train.py:1204] (2/4) Exclude cut with ID _14970_210_15709_1_1530773543493_1715730_215-282680-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 04:11:13,134 WARNING [train.py:1204] (2/4) Exclude cut with ID _64639_210_5816_1_1535367619437_3528000_306-149449-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 04:11:16,945 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_28-291982-0 from training. Duration: 0.97 2023-10-02 04:11:18,336 WARNING [train.py:1204] (2/4) Exclude cut with ID _55134_210_21982_1_1534590028377_3882170_412-15870-0_sp1.1 from training. Duration: 0.936375 2023-10-02 04:11:18,365 WARNING [train.py:1204] (2/4) Exclude cut with ID _13684_210_7497_1_1530619354863_3598359_312-235606-0 from training. Duration: 0.6 2023-10-02 04:11:19,017 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.3.encoder.layers.2.self_attn1.whiten, num_groups=1, num_channels=512, metric=14.02 vs. limit=22.5 2023-10-02 04:11:21,097 WARNING [train.py:1204] (2/4) Exclude cut with ID _37112_210_2575_1_1532997007673_7268610_347-238993-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 04:11:21,149 WARNING [train.py:1204] (2/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_121-284152-0 from training. Duration: 0.59 2023-10-02 04:11:22,562 WARNING [train.py:1204] (2/4) Exclude cut with ID _2941_210_2577_1_1526199673150_2489759_228-2961-0_sp1.1 from training. Duration: 0.97275 2023-10-02 04:11:25,305 WARNING [train.py:1204] (2/4) Exclude cut with ID _28323_210_16187_1_1532480395667_4100300_531-24157-0 from training. Duration: 0.98 2023-10-02 04:11:28,137 WARNING [train.py:1204] (2/4) Exclude cut with ID _32704_210_8954_1_1533017488840_6747370_266-329755-0_sp0.9 from training. Duration: 0.988875 2023-10-02 04:11:30,061 WARNING [train.py:1204] (2/4) Exclude cut with ID _6653_210_2629_1_1527919172441_6732679_1038-146421-0_sp1.1 from training. Duration: 0.97275 2023-10-02 04:11:31,412 WARNING [train.py:1204] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_391-22148-0_sp0.9 from training. Duration: 0.8555625 2023-10-02 04:11:32,832 WARNING [train.py:1204] (2/4) Exclude cut with ID _64521_210_14699_1_1535243427259_3815328_543-341117-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 04:11:32,843 WARNING [train.py:1204] (2/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_52-168712-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 04:11:32,884 WARNING [train.py:1204] (2/4) Exclude cut with ID _33920_210_4220_1_1533257851500_3084399_198-334063-0_sp1.1 from training. Duration: 0.8545625 2023-10-02 04:11:32,904 WARNING [train.py:1204] (2/4) Exclude cut with ID _50093_210_4220_1_1534417141207_3432299_69-111987-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 04:11:32,936 WARNING [train.py:1204] (2/4) Exclude cut with ID _14821_210_8750_1_1530680503991_6829359_189-297451-0_sp0.9 from training. Duration: 0.611125 2023-10-02 04:11:35,697 WARNING [train.py:1204] (2/4) Exclude cut with ID _8030_210_7497_1_1528444886781_3531089_262-86750-0_sp0.9 from training. Duration: 0.9 2023-10-02 04:11:35,709 WARNING [train.py:1204] (2/4) Exclude cut with ID _24612_210_8913_1_1531998054193_3806359_294-191196-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 04:11:38,777 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.ff2_skip_rate, batch_count=746160.0, ans=0.0 2023-10-02 04:11:42,545 WARNING [train.py:1204] (2/4) Exclude cut with ID _16909_210_5724_1_1531292079912_4118919_707-342896-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 04:11:42,559 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_9460_210_4211_1_1528954587_10049667_572-37035-0_sp0.9 from training. Duration: 0.888875 2023-10-02 04:11:42,605 WARNING [train.py:1204] (2/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_762-109886-0_sp1.1 from training. Duration: 0.82725 2023-10-02 04:11:42,633 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_14953_210_2151_1_1530701710_5616165_519-326393-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 04:11:48,189 WARNING [train.py:1204] (2/4) Exclude cut with ID _25914_210_6400_1_1532141317058_644740_52-96808-0 from training. Duration: 0.91 2023-10-02 04:11:48,194 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_68095_210_20185_1_1535718226_8888855_1079-94525-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 04:11:52,316 WARNING [train.py:1204] (2/4) Exclude cut with ID _14860_210_4915_1_1530702020340_7343240_746-3647-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 04:11:52,317 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_70379_210_7925_1_1535941455_557455_49-101720-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 04:11:52,336 WARNING [train.py:1204] (2/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_8-307291-0_sp1.1 from training. Duration: 0.936375 2023-10-02 04:11:53,627 WARNING [train.py:1204] (2/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_117-290916-0 from training. Duration: 0.51 2023-10-02 04:11:54,221 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.3.encoder.layers.3.conv_module2.whiten, num_groups=1, num_channels=512, metric=4.99 vs. limit=15.0 2023-10-02 04:11:55,061 WARNING [train.py:1204] (2/4) Exclude cut with ID _22020_210_2461_1_1531724164513_6326900_944-339840-0_sp1.1 from training. Duration: 0.963625 2023-10-02 04:11:55,137 WARNING [train.py:1204] (2/4) Exclude cut with ID IC0515W0043-254911 from training. Duration: 0.976 2023-10-02 04:11:57,077 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_3910_210_1794_1_1526742306_334530_28-238909-0 from training. Duration: 0.81 2023-10-02 04:11:58,304 WARNING [train.py:1204] (2/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_269-267275-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 04:12:01,156 WARNING [train.py:1204] (2/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_72-251255-0_sp1.1 from training. Duration: 0.8545625 2023-10-02 04:12:01,167 WARNING [train.py:1204] (2/4) Exclude cut with ID _14886_210_4220_1_1530702020355_3516190_175-229073-0 from training. Duration: 0.45 2023-10-02 04:12:03,907 WARNING [train.py:1204] (2/4) Exclude cut with ID _40519_210_13794_1_1533715228655_7337390_921-209452-0_sp1.1 from training. Duration: 0.963625 2023-10-02 04:12:05,484 WARNING [train.py:1204] (2/4) Exclude cut with ID _29857_210_20034_1_1532395886753_3798160_335-163562-0_sp0.9 from training. Duration: 0.9444375 2023-10-02 04:12:07,811 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.0.layers.1.nonlin_attention.whiten2, num_groups=1, num_channels=192, metric=5.50 vs. limit=15.0 2023-10-02 04:12:08,172 WARNING [train.py:1204] (2/4) Exclude cut with ID _58583_210_5236_1_1534726835266_3574249_384-58445-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 04:12:09,555 WARNING [train.py:1204] (2/4) Exclude cut with ID _44011_210_2046_1_1533722401691_3631310_96-315656-0_sp1.1 from training. Duration: 0.963625 2023-10-02 04:12:09,560 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_68484_210_1794_1_1535849804_7678097_549-170613-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 04:12:11,022 WARNING [train.py:1204] (2/4) Exclude cut with ID _37892_210_19596_1_1533198677897_3964058_214-148058-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 04:12:13,730 INFO [train.py:1046] (2/4) Epoch 22, batch 400, loss[loss=0.1529, simple_loss=0.2273, pruned_loss=0.03931, over 24443.00 frames. ], tot_loss[loss=0.1726, simple_loss=0.2489, pruned_loss=0.04813, over 4103133.70 frames. ], batch size: 58, lr: 4.72e-03, grad_scale: 32.0 2023-10-02 04:12:13,822 WARNING [train.py:1204] (2/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_415-78792-0_sp0.9 from training. Duration: 0.9 2023-10-02 04:12:15,853 WARNING [train.py:1204] (2/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_383-205833-0_sp1.1 from training. Duration: 0.863625 2023-10-02 04:12:17,198 WARNING [train.py:1204] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_167-137255-0 from training. Duration: 0.64 2023-10-02 04:12:17,206 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8275_210_4198_1_1528455320_3892571_484-154786-0_sp1.1 from training. Duration: 0.963625 2023-10-02 04:12:18,580 WARNING [train.py:1204] (2/4) Exclude cut with ID _40284_210_2084_1_1533513679814_3704279_396-319412-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 04:12:18,711 WARNING [train.py:1204] (2/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_280-120942-0_sp0.9 from training. Duration: 0.8555625 2023-10-02 04:12:20,033 WARNING [train.py:1204] (2/4) Exclude cut with ID _45630_210_10556_1_1533949174498_3902049_558-240889-0_sp1.1 from training. Duration: 0.92725 2023-10-02 04:12:20,194 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.0.bypass.scale_min, batch_count=746360.0, ans=0.2 2023-10-02 04:12:22,774 WARNING [train.py:1204] (2/4) Exclude cut with ID _56235_210_10640_1_1534557335235_4161099_197-307458-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 04:12:24,161 WARNING [train.py:1204] (2/4) Exclude cut with ID _16909_210_5724_1_1531292079912_4118919_163-254843-0_sp1.1 from training. Duration: 0.92725 2023-10-02 04:12:24,857 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.2.encoder.layers.2.self_attn_weights.whiten_keys, num_groups=4, num_channels=128, metric=3.30 vs. limit=6.0 2023-10-02 04:12:25,497 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_103-84010-0 from training. Duration: 0.69 2023-10-02 04:12:28,947 WARNING [train.py:1204] (2/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_401-158239-0 from training. Duration: 0.71 2023-10-02 04:12:28,948 WARNING [train.py:1204] (2/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_899-233658-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 04:12:30,299 WARNING [train.py:1204] (2/4) Exclude cut with ID _23825_210_13449_1_1531915313526_3497150_240-271367-0 from training. Duration: 0.97 2023-10-02 04:12:31,616 WARNING [train.py:1204] (2/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_424-134619-0_sp1.1 from training. Duration: 0.92725 2023-10-02 04:12:34,511 WARNING [train.py:1204] (2/4) Exclude cut with ID _44831_210_13427_1_1533816011549_3689035_32-188147-0_sp1.1 from training. Duration: 0.87275 2023-10-02 04:12:34,514 WARNING [train.py:1204] (2/4) Exclude cut with ID _59170_210_6721_1_1534983633960_4203335_314-63879-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 04:12:34,531 WARNING [train.py:1204] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_602-275260-0 from training. Duration: 0.82 2023-10-02 04:12:35,945 WARNING [train.py:1204] (2/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_511-306767-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 04:12:35,984 WARNING [train.py:1204] (2/4) Exclude cut with ID _41817_210_18835_1_1533434477160_7423049_565-201775-0_sp1.1 from training. Duration: 0.92725 2023-10-02 04:12:37,213 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.458e+02 1.706e+02 1.907e+02 2.140e+02 3.847e+02, threshold=3.815e+02, percent-clipped=0.0 2023-10-02 04:12:37,289 WARNING [train.py:1204] (2/4) Exclude cut with ID _28317_210_12742_1_1532480302381_8510325_778-173151-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 04:12:37,336 WARNING [train.py:1204] (2/4) Exclude cut with ID _14998_210_5219_1_1530705582080_7474970_680-51001-0_sp1.1 from training. Duration: 0.963625 2023-10-02 04:12:38,735 WARNING [train.py:1204] (2/4) Exclude cut with ID IC0677W0059-334891 from training. Duration: 0.896 2023-10-02 04:12:40,101 WARNING [train.py:1204] (2/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_370-331600-0 from training. Duration: 0.94 2023-10-02 04:12:44,291 WARNING [train.py:1204] (2/4) Exclude cut with ID _66840_210_5219_1_1535452168426_4196349_446-240192-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 04:12:44,539 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.out_combiner.scale_min, batch_count=746493.3333333334, ans=0.2 2023-10-02 04:12:46,221 WARNING [train.py:1204] (2/4) Exclude cut with ID _65242_210_18628_1_1535369393717_3823149_38-6732-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 04:12:46,289 WARNING [train.py:1204] (2/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_302-2550-0 from training. Duration: 0.81 2023-10-02 04:12:49,415 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_43802_210_2290_1_1533516203_8829504_100-348441-0 from training. Duration: 0.91 2023-10-02 04:12:52,187 WARNING [train.py:1204] (2/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_329-285801-0_sp1.1 from training. Duration: 0.82725 2023-10-02 04:12:53,733 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_30167_210_3528_1_1532428012_4772375_343-334712-0_sp1.1 from training. Duration: 0.936375 2023-10-02 04:12:55,381 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.attention_skip_rate, batch_count=746493.3333333334, ans=0.0 2023-10-02 04:12:59,377 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7543_210_6753_1_1528543318_4552_1-85072-0 from training. Duration: 0.83 2023-10-02 04:13:02,647 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7002_210_1794_1_1527935634_4034444_487-236501-0_sp1.1 from training. Duration: 0.6 2023-10-02 04:13:05,259 WARNING [train.py:1204] (2/4) Exclude cut with ID _7160_210_1876_1_1527940923231_3703799_282-322611-0 from training. Duration: 0.87 2023-10-02 04:13:06,694 WARNING [train.py:1204] (2/4) Exclude cut with ID _67335_210_5219_1_1535525975652_4275229_570-262983-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 04:13:08,129 WARNING [train.py:1204] (2/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_625-338546-0_sp1.1 from training. Duration: 0.8 2023-10-02 04:13:08,152 WARNING [train.py:1204] (2/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_176-56120-0 from training. Duration: 0.83 2023-10-02 04:13:10,966 WARNING [train.py:1204] (2/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_963-187109-0_sp1.1 from training. Duration: 0.9 2023-10-02 04:13:11,195 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.bypass.scale_min, batch_count=746626.6666666666, ans=0.2 2023-10-02 04:13:13,798 WARNING [train.py:1204] (2/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_121-284152-0_sp0.9 from training. Duration: 0.6555625 2023-10-02 04:13:15,060 WARNING [train.py:1204] (2/4) Exclude cut with ID _45021_210_9994_1_1533619751094_3330288_104-81711-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 04:13:18,582 WARNING [train.py:1204] (2/4) Exclude cut with ID _51382_210_23862_1_1534492801838_3650104_129-343914-0_sp1.1 from training. Duration: 0.936375 2023-10-02 04:13:18,625 WARNING [train.py:1204] (2/4) Exclude cut with ID _14970_210_15709_1_1530775547361_1966965_8-169924-0 from training. Duration: 0.92 2023-10-02 04:13:20,479 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_2295_210_2087_1_1525500993_2407863_234-83104-0_sp1.1 from training. Duration: 0.563625 2023-10-02 04:13:24,497 WARNING [train.py:1204] (2/4) Exclude cut with ID _14687_210_5854_1_1530703838259_3664889_115-239046-0 from training. Duration: 0.67 2023-10-02 04:13:25,939 WARNING [train.py:1204] (2/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_690-126306-0_sp1.1 from training. Duration: 0.7090625 2023-10-02 04:13:25,954 WARNING [train.py:1204] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_625-102955-0_sp0.9 from training. Duration: 0.8555625 2023-10-02 04:13:26,280 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.feed_forward2.hidden_balancer.prob, batch_count=746626.6666666666, ans=0.125 2023-10-02 04:13:27,382 WARNING [train.py:1204] (2/4) Exclude cut with ID _9389_210_1664_1_1529217337301_5289269_289-213417-0 from training. Duration: 0.89 2023-10-02 04:13:27,521 WARNING [train.py:1204] (2/4) Exclude cut with ID _13253_210_1925_1_1530757775819_3577873_191-144977-0_sp0.9 from training. Duration: 0.8666875 2023-10-02 04:13:28,217 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.3.encoder.layers.1.conv_module2.whiten, num_groups=1, num_channels=512, metric=2.92 vs. limit=15.0 2023-10-02 04:13:28,842 WARNING [train.py:1204] (2/4) Exclude cut with ID _21783_210_11357_1_1531742458730_3762679_325-155703-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 04:13:30,143 INFO [train.py:1046] (2/4) Epoch 22, batch 450, loss[loss=0.1801, simple_loss=0.2687, pruned_loss=0.04578, over 24452.00 frames. ], tot_loss[loss=0.173, simple_loss=0.2491, pruned_loss=0.04843, over 4236796.15 frames. ], batch size: 69, lr: 4.72e-03, grad_scale: 16.0 2023-10-02 04:13:30,197 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_2295_210_2087_1_1525500993_2407863_116-238894-0_sp0.9 from training. Duration: 0.77775 2023-10-02 04:13:30,420 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.out_combiner.scale_min, batch_count=746693.3333333334, ans=0.2 2023-10-02 04:13:31,510 WARNING [train.py:1204] (2/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_94-126128-0 from training. Duration: 0.86 2023-10-02 04:13:32,708 WARNING [train.py:1204] (2/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_395-310220-0_sp0.9 from training. Duration: 0.988875 2023-10-02 04:13:32,796 WARNING [train.py:1204] (2/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_821-110852-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 04:13:34,739 WARNING [train.py:1204] (2/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_64-248359-0_sp1.1 from training. Duration: 0.62725 2023-10-02 04:13:34,755 WARNING [train.py:1204] (2/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_138-187031-0 from training. Duration: 0.99 2023-10-02 04:13:34,806 WARNING [train.py:1204] (2/4) Exclude cut with ID _4122_210_2084_1_1526713164417_4353280_528-206771-0_sp0.9 from training. Duration: 0.97775 2023-10-02 04:13:35,731 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.0.layers.1.feed_forward1.out_whiten, num_groups=1, num_channels=192, metric=4.53 vs. limit=15.0 2023-10-02 04:13:36,152 WARNING [train.py:1204] (2/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_844-96989-0_sp1.1 from training. Duration: 0.6454375 2023-10-02 04:13:38,861 WARNING [train.py:1204] (2/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_106-30782-0_sp1.1 from training. Duration: 0.72725 2023-10-02 04:13:45,959 WARNING [train.py:1204] (2/4) Exclude cut with ID _14683_210_5219_1_1530698352608_3974859_281-250040-0_sp1.1 from training. Duration: 0.936375 2023-10-02 04:13:47,787 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_46013_210_7234_1_1533776362_3744032_463-52378-0_sp0.9 from training. Duration: 0.9555625 2023-10-02 04:13:49,167 WARNING [train.py:1204] (2/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_299-34413-0 from training. Duration: 0.65 2023-10-02 04:13:49,238 WARNING [train.py:1204] (2/4) Exclude cut with ID _35799_210_5854_1_1533263487887_3529071_490-326847-0 from training. Duration: 0.99 2023-10-02 04:13:52,778 WARNING [train.py:1204] (2/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_589-9070-0_sp0.9 from training. Duration: 0.788875 2023-10-02 04:13:55,472 WARNING [train.py:1204] (2/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_884-29515-0_sp1.1 from training. Duration: 0.936375 2023-10-02 04:13:56,892 WARNING [train.py:1204] (2/4) Exclude cut with ID _15012_210_1968_1_1530856820429_5095350_474-119995-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 04:14:00,947 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.balancer2.prob, batch_count=746826.6666666666, ans=0.125 2023-10-02 04:14:02,101 WARNING [train.py:1204] (2/4) Exclude cut with ID _65585_210_12210_1_1535450451584_3764098_211-97124-0_sp1.1 from training. Duration: 0.963625 2023-10-02 04:14:03,425 WARNING [train.py:1204] (2/4) Exclude cut with ID _33640_210_2655_1_1533117605479_4855469_666-224463-0_sp1.1 from training. Duration: 0.963625 2023-10-02 04:14:04,892 WARNING [train.py:1204] (2/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_35-153886-0 from training. Duration: 0.84 2023-10-02 04:14:06,727 WARNING [train.py:1204] (2/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_210-106047-0 from training. Duration: 0.97 2023-10-02 04:14:08,209 WARNING [train.py:1204] (2/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_479-125611-0 from training. Duration: 0.96 2023-10-02 04:14:08,229 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_9417_210_1794_1_1529235261_4614131_427-153963-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 04:14:09,586 WARNING [train.py:1204] (2/4) Exclude cut with ID _23818_210_6228_1_1531917063884_3676920_133-170571-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 04:14:10,931 WARNING [train.py:1204] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_602-275260-0_sp1.1 from training. Duration: 0.7454375 2023-10-02 04:14:11,053 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9029W0121-509521 from training. Duration: 0.896 2023-10-02 04:14:11,060 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9029W0434-509821 from training. Duration: 0.64 2023-10-02 04:14:12,390 WARNING [train.py:1204] (2/4) Exclude cut with ID _23038_210_9570_1_1532001584158_3652419_289-307034-0_sp1.1 from training. Duration: 0.936375 2023-10-02 04:14:13,855 WARNING [train.py:1204] (2/4) Exclude cut with ID _29345_210_4381_1_1532689168263_3860169_149-257268-0_sp0.9 from training. Duration: 0.9 2023-10-02 04:14:13,922 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_405-339986-0_sp1.1 from training. Duration: 0.463625 2023-10-02 04:14:18,491 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_2655_210_2087_1_1525688959_3897400_437-337690-0_sp1.1 from training. Duration: 0.57275 2023-10-02 04:14:18,528 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7543_210_6753_1_1528541907_1399322_165-323619-0_sp1.1 from training. Duration: 0.62725 2023-10-02 04:14:19,870 WARNING [train.py:1204] (2/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_508-335369-0_sp1.1 from training. Duration: 0.436375 2023-10-02 04:14:19,944 WARNING [train.py:1204] (2/4) Exclude cut with ID _7852_210_5219_1_1528286471316_8139400_358-287492-0 from training. Duration: 0.96 2023-10-02 04:14:21,387 WARNING [train.py:1204] (2/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_322-217220-0_sp0.9 from training. Duration: 0.9555625 2023-10-02 04:14:23,553 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_3171_210_2087_1_1526109084_2330007_66-260583-0_sp0.9 from training. Duration: 0.8 2023-10-02 04:14:23,584 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8558_210_1410_1_1528505225_7613406_131-329524-0_sp1.1 from training. Duration: 0.7181875 2023-10-02 04:14:25,065 WARNING [train.py:1204] (2/4) Exclude cut with ID _9235_210_5219_1_1528891154253_8046120_30-326681-0 from training. Duration: 0.83 2023-10-02 04:14:29,083 WARNING [train.py:1204] (2/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_58-68956-0_sp0.9 from training. Duration: 0.92225 2023-10-02 04:14:30,501 WARNING [train.py:1204] (2/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_255-312654-0 from training. Duration: 0.91 2023-10-02 04:14:30,580 WARNING [train.py:1204] (2/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_99-167392-0 from training. Duration: 0.61 2023-10-02 04:14:31,945 WARNING [train.py:1204] (2/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_94-126128-0_sp0.9 from training. Duration: 0.9555625 2023-10-02 04:14:36,564 WARNING [train.py:1204] (2/4) Exclude cut with ID _68635_210_9317_1_1535770813410_4315870_187-192817-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 04:14:39,318 WARNING [train.py:1204] (2/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_277-204970-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 04:14:40,739 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6163_210_6400_1_1527506746_4263068_283-137811-0_sp0.9 from training. Duration: 0.8555625 2023-10-02 04:14:40,765 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9144W0121-566446 from training. Duration: 0.896 2023-10-02 04:14:43,465 INFO [train.py:1046] (2/4) Epoch 22, batch 500, loss[loss=0.1629, simple_loss=0.2458, pruned_loss=0.04002, over 24453.00 frames. ], tot_loss[loss=0.1741, simple_loss=0.2502, pruned_loss=0.04895, over 4342332.92 frames. ], batch size: 63, lr: 4.72e-03, grad_scale: 16.0 2023-10-02 04:14:43,541 WARNING [train.py:1204] (2/4) Exclude cut with ID _49371_210_12071_1_1533952761021_3812190_351-315519-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 04:14:43,826 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.conv_module1.balancer2.prob, batch_count=747026.6666666666, ans=0.125 2023-10-02 04:14:44,921 WARNING [train.py:1204] (2/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_115-326778-0_sp1.1 from training. Duration: 0.8454375 2023-10-02 04:14:44,987 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_17292_210_6753_1_1531125388_2943734_436-198965-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 04:14:44,995 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9165W0117-576751 from training. Duration: 0.896 2023-10-02 04:14:46,390 WARNING [train.py:1204] (2/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_212-149947-0 from training. Duration: 0.73 2023-10-02 04:14:46,403 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_13757_210_3528_1_1530440682_4107239_65-167544-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 04:14:46,522 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.1.bypass.skip_rate, batch_count=747026.6666666666, ans=0.035 2023-10-02 04:14:51,373 WARNING [train.py:1204] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_700-183549-0_sp0.9 from training. Duration: 0.7555625 2023-10-02 04:14:54,234 WARNING [train.py:1204] (2/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_216-343007-0_sp0.9 from training. Duration: 0.7333125 2023-10-02 04:14:57,398 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7544_210_6753_1_1528587722_3576686_295-258479-0_sp1.1 from training. Duration: 0.763625 2023-10-02 04:14:58,923 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_365-287106-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 04:14:58,942 WARNING [train.py:1204] (2/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_300-3747-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 04:14:59,768 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.2.encoder.layers.1.conv_module1.whiten, num_groups=1, num_channels=384, metric=10.63 vs. limit=15.0 2023-10-02 04:15:00,291 WARNING [train.py:1204] (2/4) Exclude cut with ID _14069_210_5632_1_1530512692033_4803390_487-303201-0_sp1.1 from training. Duration: 0.92725 2023-10-02 04:15:00,475 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.feed_forward1.hidden_balancer.prob, batch_count=747093.3333333334, ans=0.125 2023-10-02 04:15:02,582 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.1.encoder.layers.1.nonlin_attention.whiten1, num_groups=1, num_channels=192, metric=6.01 vs. limit=10.0 2023-10-02 04:15:09,055 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.526e+02 1.808e+02 1.987e+02 2.210e+02 3.214e+02, threshold=3.973e+02, percent-clipped=0.0 2023-10-02 04:15:09,189 WARNING [train.py:1204] (2/4) Exclude cut with ID _14459_210_2228_1_1531443641946_7516700_668-123026-0_sp1.1 from training. Duration: 0.936375 2023-10-02 04:15:10,457 WARNING [train.py:1204] (2/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_108-53929-0_sp1.1 from training. Duration: 0.536375 2023-10-02 04:15:10,501 WARNING [train.py:1204] (2/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_266-137104-0_sp0.9 from training. Duration: 0.72225 2023-10-02 04:15:10,536 WARNING [train.py:1204] (2/4) Exclude cut with ID _12302_210_5219_1_1529924446466_8107280_902-168619-0_sp1.1 from training. Duration: 0.936375 2023-10-02 04:15:11,834 WARNING [train.py:1204] (2/4) Exclude cut with ID _59502_210_12210_1_1535107024698_1835139_146-162495-0 from training. Duration: 0.92 2023-10-02 04:15:11,857 WARNING [train.py:1204] (2/4) Exclude cut with ID _3465_210_3525_1_1526175667060_3574049_412-287526-0_sp1.1 from training. Duration: 0.6545625 2023-10-02 04:15:16,119 WARNING [train.py:1204] (2/4) Exclude cut with ID _7160_210_1876_1_1527940923231_3703799_120-150943-0_sp0.9 from training. Duration: 0.9 2023-10-02 04:15:17,496 WARNING [train.py:1204] (2/4) Exclude cut with ID _15037_210_2253_1_1530752294104_3267650_296-192597-0_sp1.1 from training. Duration: 0.736375 2023-10-02 04:15:17,527 WARNING [train.py:1204] (2/4) Exclude cut with ID _13252_210_1925_1_1530671372047_3336909_397-216497-0_sp1.1 from training. Duration: 0.87275 2023-10-02 04:15:17,542 WARNING [train.py:1204] (2/4) Exclude cut with ID _40094_210_16380_1_1533294005773_3641159_76-269320-0_sp1.1 from training. Duration: 0.936375 2023-10-02 04:15:18,880 WARNING [train.py:1204] (2/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_449-15508-0 from training. Duration: 0.82 2023-10-02 04:15:22,818 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.balancer2.prob, batch_count=747160.0, ans=0.125 2023-10-02 04:15:23,916 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9309W0194-640321 from training. Duration: 0.64 2023-10-02 04:15:25,403 WARNING [train.py:1204] (2/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_284-312178-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 04:15:26,804 WARNING [train.py:1204] (2/4) Exclude cut with ID _29853_210_7497_1_1532505600851_6642110_401-55937-0_sp1.1 from training. Duration: 0.92725 2023-10-02 04:15:28,247 WARNING [train.py:1204] (2/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_716-24918-0_sp1.1 from training. Duration: 0.92725 2023-10-02 04:15:28,279 WARNING [train.py:1204] (2/4) Exclude cut with ID _19637_210_4220_1_1531537167676_3205060_314-305638-0_sp1.1 from training. Duration: 0.92725 2023-10-02 04:15:28,537 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.attention_skip_rate, batch_count=747226.6666666666, ans=0.0 2023-10-02 04:15:30,242 WARNING [train.py:1204] (2/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_475-127455-0_sp1.1 from training. Duration: 0.636375 2023-10-02 04:15:31,878 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.ff2_skip_rate, batch_count=747226.6666666666, ans=0.0 2023-10-02 04:15:32,960 WARNING [train.py:1204] (2/4) Exclude cut with ID _5444_210_2151_1_1527418808464_5312210_581-315411-0 from training. Duration: 0.62 2023-10-02 04:15:35,689 WARNING [train.py:1204] (2/4) Exclude cut with ID _44902_210_16187_1_1533888006062_3792078_149-67212-0_sp0.9 from training. Duration: 0.9666875 2023-10-02 04:15:37,068 WARNING [train.py:1204] (2/4) Exclude cut with ID _31602_210_5854_1_1532677021456_4458250_63-63253-0_sp1.1 from training. Duration: 0.963625 2023-10-02 04:15:40,465 WARNING [train.py:1204] (2/4) Exclude cut with ID _55209_210_1531_1_1534675828996_4597160_660-203992-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 04:15:44,535 WARNING [train.py:1204] (2/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_83-190624-0_sp1.1 from training. Duration: 0.92725 2023-10-02 04:15:47,571 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder_embed.dropout.p, batch_count=747293.3333333334, ans=0.1 2023-10-02 04:15:50,227 WARNING [train.py:1204] (2/4) Exclude cut with ID _58605_210_12210_1_1534723195939_3904259_154-139635-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 04:15:51,646 WARNING [train.py:1204] (2/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_164-224052-0 from training. Duration: 0.7 2023-10-02 04:15:51,655 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_1126-189071-0_sp1.1 from training. Duration: 0.963625 2023-10-02 04:15:51,674 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_15396_210_6890_1_1531308473_1938936_310-214789-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 04:15:55,200 WARNING [train.py:1204] (2/4) Exclude cut with ID _8395_210_1531_1_1528597871523_4411579_431-132470-0 from training. Duration: 0.74 2023-10-02 04:15:57,185 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_3780_210_2087_1_1526467760_2130483_170-259044-0_sp0.9 from training. Duration: 0.77775 2023-10-02 04:15:57,943 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.3.encoder.layers.0.whiten, num_groups=1, num_channels=512, metric=5.78 vs. limit=12.0 2023-10-02 04:15:58,493 INFO [train.py:1046] (2/4) Epoch 22, batch 550, loss[loss=0.1782, simple_loss=0.2523, pruned_loss=0.05203, over 23298.00 frames. ], tot_loss[loss=0.1746, simple_loss=0.251, pruned_loss=0.04903, over 4428983.22 frames. ], batch size: 105, lr: 4.71e-03, grad_scale: 16.0 2023-10-02 04:15:58,548 WARNING [train.py:1204] (2/4) Exclude cut with ID _12733_210_8341_1_1530167424556_3646380_537-230640-0_sp1.1 from training. Duration: 0.963625 2023-10-02 04:15:59,349 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.2.encoder.layers.2.self_attn_weights.whiten_keys, num_groups=4, num_channels=128, metric=3.78 vs. limit=6.0 2023-10-02 04:16:01,880 WARNING [train.py:1204] (2/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_159-281310-0 from training. Duration: 0.95 2023-10-02 04:16:04,559 WARNING [train.py:1204] (2/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_434-83000-0 from training. Duration: 0.98 2023-10-02 04:16:04,585 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_4405_210_1849_1_1526732205_3593939_493-32464-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 04:16:04,593 WARNING [train.py:1204] (2/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_342-100157-0 from training. Duration: 0.54 2023-10-02 04:16:04,639 WARNING [train.py:1204] (2/4) Exclude cut with ID _44778_210_2054_1_1533798127203_7443090_765-250133-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 04:16:05,893 WARNING [train.py:1204] (2/4) Exclude cut with ID _38670_210_15379_1_1533448810659_4391059_81-312245-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 04:16:05,971 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_50050_210_16223_1_1535435796_7290138_1273-247611-0_sp1.1 from training. Duration: 0.936375 2023-10-02 04:16:06,028 WARNING [train.py:1204] (2/4) Exclude cut with ID _44831_210_13427_1_1533816011549_3689035_399-18591-0_sp1.1 from training. Duration: 0.936375 2023-10-02 04:16:06,043 WARNING [train.py:1204] (2/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_557-324258-0_sp0.9 from training. Duration: 0.9 2023-10-02 04:16:07,441 WARNING [train.py:1204] (2/4) Exclude cut with ID _3465_210_3525_1_1526175667060_3574049_62-335552-0_sp1.1 from training. Duration: 0.7909375 2023-10-02 04:16:10,820 WARNING [train.py:1204] (2/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_673-264125-0_sp1.1 from training. Duration: 0.963625 2023-10-02 04:16:12,097 WARNING [train.py:1204] (2/4) Exclude cut with ID _8030_210_7497_1_1528444886781_3531089_262-86750-0 from training. Duration: 0.81 2023-10-02 04:16:12,129 WARNING [train.py:1204] (2/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_444-28041-0_sp1.1 from training. Duration: 0.77275 2023-10-02 04:16:17,637 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_34008_210_10431_1_1533345424_8183247_516-14858-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 04:16:17,649 WARNING [train.py:1204] (2/4) Exclude cut with ID _32704_210_8954_1_1533017488840_6747370_54-248218-0_sp1.1 from training. Duration: 0.936375 2023-10-02 04:16:19,089 WARNING [train.py:1204] (2/4) Exclude cut with ID _13253_210_1925_1_1530757775819_3577873_300-119782-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 04:16:19,195 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.0.conv_skip_rate, batch_count=747426.6666666666, ans=0.0 2023-10-02 04:16:20,516 WARNING [train.py:1204] (2/4) Exclude cut with ID _39290_210_4220_1_1533862778760_3075118_21-240703-0_sp1.1 from training. Duration: 0.936375 2023-10-02 04:16:25,380 WARNING [train.py:1204] (2/4) Exclude cut with ID 774-127930-0014-10412-0_sp1.1 from training. Duration: 0.95 2023-10-02 04:16:25,447 WARNING [train.py:1204] (2/4) Exclude cut with ID _67499_210_17282_1_1535509805220_3692430_20-179346-0 from training. Duration: 0.97 2023-10-02 04:16:27,254 WARNING [train.py:1204] (2/4) Exclude cut with ID _8365_210_2064_1_1528592311070_4677690_431-294102-0_sp0.9 from training. Duration: 0.988875 2023-10-02 04:16:33,236 WARNING [train.py:1204] (2/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_365-1492-0_sp1.1 from training. Duration: 0.82725 2023-10-02 04:16:33,268 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_105-114797-0_sp0.9 from training. Duration: 0.9333125 2023-10-02 04:16:34,673 WARNING [train.py:1204] (2/4) Exclude cut with ID _6274_210_2084_1_1527937583837_7494020_769-168302-0_sp0.9 from training. Duration: 0.788875 2023-10-02 04:16:34,957 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.bypass.scale_min, batch_count=747493.3333333334, ans=0.2 2023-10-02 04:16:37,755 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.conv_module1.balancer2.prob, batch_count=747493.3333333334, ans=0.125 2023-10-02 04:16:38,796 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_40340_210_9024_1_1533433916_4132944_320-270277-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 04:16:38,808 WARNING [train.py:1204] (2/4) Exclude cut with ID ID0203W0003-781711 from training. Duration: 0.958 2023-10-02 04:16:40,526 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_30434_210_13049_1_1532519384_981746_52-12932-0_sp1.1 from training. Duration: 0.936375 2023-10-02 04:16:41,913 WARNING [train.py:1204] (2/4) Exclude cut with ID _8263_210_1664_1_1528438953494_294100_40-204731-0_sp0.9 from training. Duration: 0.5555625 2023-10-02 04:16:43,434 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1032-236863-0_sp0.9 from training. Duration: 0.9333125 2023-10-02 04:16:44,873 WARNING [train.py:1204] (2/4) Exclude cut with ID _8394_210_6702_1_1528590504316_3540189_335-274780-0_sp0.9 from training. Duration: 0.8666875 2023-10-02 04:16:44,881 WARNING [train.py:1204] (2/4) Exclude cut with ID _66873_210_5517_1_1535454136506_4293370_654-52713-0_sp1.1 from training. Duration: 0.7 2023-10-02 04:16:46,270 WARNING [train.py:1204] (2/4) Exclude cut with ID _5250_210_5219_1_1527249561806_8692110_742-267856-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 04:16:46,367 WARNING [train.py:1204] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_74-48717-0 from training. Duration: 0.58 2023-10-02 04:16:47,765 WARNING [train.py:1204] (2/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_18-46756-0 from training. Duration: 0.79 2023-10-02 04:16:49,163 WARNING [train.py:1204] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_612-56061-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 04:16:49,172 WARNING [train.py:1204] (2/4) Exclude cut with ID _56423_210_5854_1_1534896061049_3165969_412-2403-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 04:16:49,229 WARNING [train.py:1204] (2/4) Exclude cut with ID _62574_210_2655_1_1535180407058_3933249_120-196848-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 04:16:49,229 WARNING [train.py:1204] (2/4) Exclude cut with ID _40519_210_13794_1_1533715228655_7337390_916-51000-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 04:16:51,965 WARNING [train.py:1204] (2/4) Exclude cut with ID _65263_210_22356_1_1535763598691_3921037_108-21902-0_sp1.1 from training. Duration: 0.9 2023-10-02 04:16:53,304 WARNING [train.py:1204] (2/4) Exclude cut with ID _4122_210_2084_1_1526713164417_4353280_528-206771-0_sp1.1 from training. Duration: 0.8 2023-10-02 04:16:56,643 WARNING [train.py:1204] (2/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_43-325657-0_sp1.1 from training. Duration: 0.92725 2023-10-02 04:16:56,698 WARNING [train.py:1204] (2/4) Exclude cut with ID _44659_210_2721_1_1534036120061_6614860_314-82041-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 04:16:58,107 WARNING [train.py:1204] (2/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_81-300212-0_sp0.9 from training. Duration: 0.6666875 2023-10-02 04:16:59,462 WARNING [train.py:1204] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_665-196155-0_sp0.9 from training. Duration: 0.7444375 2023-10-02 04:17:00,859 WARNING [train.py:1204] (2/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_140-336881-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 04:17:02,142 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6290_210_3273_1_1527595224_1282787_115-87095-0_sp0.9 from training. Duration: 0.611125 2023-10-02 04:17:02,191 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_53373_210_5399_1_1534229471_4715695_607-259685-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 04:17:05,424 WARNING [train.py:1204] (2/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_175-163894-0_sp1.1 from training. Duration: 0.67275 2023-10-02 04:17:05,455 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7002_210_1794_1_1527939683_5157286_1-281264-0_sp1.1 from training. Duration: 0.4 2023-10-02 04:17:10,585 WARNING [train.py:1204] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_605-257205-0 from training. Duration: 0.59 2023-10-02 04:17:13,291 INFO [train.py:1046] (2/4) Epoch 22, batch 600, loss[loss=0.2205, simple_loss=0.2864, pruned_loss=0.07732, over 19649.00 frames. ], tot_loss[loss=0.1753, simple_loss=0.2517, pruned_loss=0.04946, over 4489308.62 frames. ], batch size: 388, lr: 4.71e-03, grad_scale: 16.0 2023-10-02 04:17:14,580 WARNING [train.py:1204] (2/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_454-314901-0 from training. Duration: 0.46 2023-10-02 04:17:15,968 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_46032_210_14656_1_1533781412_4507969_287-130422-0_sp1.1 from training. Duration: 0.97275 2023-10-02 04:17:15,985 WARNING [train.py:1204] (2/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_186-309161-0_sp1.1 from training. Duration: 0.4909375 2023-10-02 04:17:16,021 WARNING [train.py:1204] (2/4) Exclude cut with ID _42659_210_2070_1_1533607442765_3120380_45-342307-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 04:17:17,700 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.ff2_skip_rate, batch_count=747693.3333333334, ans=0.0 2023-10-02 04:17:21,604 WARNING [train.py:1204] (2/4) Exclude cut with ID _12555_210_8750_1_1530075594402_3780239_478-326818-0_sp1.1 from training. Duration: 0.963625 2023-10-02 04:17:22,449 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.0.layers.1.whiten, num_groups=1, num_channels=192, metric=3.45 vs. limit=12.0 2023-10-02 04:17:24,388 WARNING [train.py:1204] (2/4) Exclude cut with ID _7606_210_4915_1_1528894253277_5080690_127-177761-0_sp1.1 from training. Duration: 0.5818125 2023-10-02 04:17:26,420 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6788_210_1388_1_1527898698_4562021_91-197981-0 from training. Duration: 0.83 2023-10-02 04:17:27,915 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8558_210_1410_1_1528505225_7613406_131-329524-0_sp0.9 from training. Duration: 0.87775 2023-10-02 04:17:31,187 WARNING [train.py:1204] (2/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_153-153537-0_sp1.1 from training. Duration: 0.82725 2023-10-02 04:17:32,576 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_17292_210_6753_1_1531125388_2943734_285-349105-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 04:17:35,447 WARNING [train.py:1204] (2/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_37-41919-0 from training. Duration: 0.87 2023-10-02 04:17:35,467 WARNING [train.py:1204] (2/4) Exclude cut with ID _60860_210_8254_1_1534896015875_3557169_172-294852-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 04:17:38,704 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.510e+02 1.855e+02 2.187e+02 2.560e+02 3.889e+02, threshold=4.374e+02, percent-clipped=0.0 2023-10-02 04:17:41,610 WARNING [train.py:1204] (2/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_146-276944-0 from training. Duration: 0.83 2023-10-02 04:17:44,865 WARNING [train.py:1204] (2/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_1254-44082-0_sp1.1 from training. Duration: 0.936375 2023-10-02 04:17:44,887 WARNING [train.py:1204] (2/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_1115-180350-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 04:17:46,152 WARNING [train.py:1204] (2/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_60-146189-0_sp1.1 from training. Duration: 0.8909375 2023-10-02 04:17:50,464 WARNING [train.py:1204] (2/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_517-153649-0_sp1.1 from training. Duration: 0.92725 2023-10-02 04:17:50,484 WARNING [train.py:1204] (2/4) Exclude cut with ID _39571_210_13449_1_1533801584143_3651077_37-266257-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 04:17:51,826 WARNING [train.py:1204] (2/4) Exclude cut with ID _6653_210_2629_1_1527919172441_6732679_527-276118-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 04:17:58,102 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7696_210_2040_1_1528207167_3716099_117-125434-0_sp1.1 from training. Duration: 0.6909375 2023-10-02 04:18:01,379 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_25767_210_2161_1_1532307483_3867748_478-156360-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 04:18:01,383 WARNING [train.py:1204] (2/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_177-49092-0_sp1.1 from training. Duration: 0.82725 2023-10-02 04:18:01,390 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_11305_210_1794_1_1529750428_7178148_680-317073-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 04:18:08,964 WARNING [train.py:1204] (2/4) Exclude cut with ID _53565_210_10635_1_1534337218335_3820259_287-18120-0 from training. Duration: 0.97 2023-10-02 04:18:13,880 WARNING [train.py:1204] (2/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_498-40153-0_sp1.1 from training. Duration: 0.57275 2023-10-02 04:18:14,485 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.4.encoder.layers.1.feed_forward3.out_whiten, num_groups=1, num_channels=384, metric=12.02 vs. limit=15.0 2023-10-02 04:18:15,252 WARNING [train.py:1204] (2/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_555-63066-0_sp1.1 from training. Duration: 0.963625 2023-10-02 04:18:18,157 WARNING [train.py:1204] (2/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_395-310220-0 from training. Duration: 0.89 2023-10-02 04:18:19,528 WARNING [train.py:1204] (2/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_937-208281-0_sp1.1 from training. Duration: 0.77275 2023-10-02 04:18:22,483 WARNING [train.py:1204] (2/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_120-211630-0 from training. Duration: 0.92 2023-10-02 04:18:22,509 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_454-276429-0_sp1.1 from training. Duration: 0.7909375 2023-10-02 04:18:22,557 WARNING [train.py:1204] (2/4) Exclude cut with ID _22826_210_9994_1_1531908201522_3279169_404-75889-0_sp1.1 from training. Duration: 0.8090625 2023-10-02 04:18:28,665 INFO [train.py:1046] (2/4) Epoch 22, batch 650, loss[loss=0.1622, simple_loss=0.2436, pruned_loss=0.04041, over 24610.00 frames. ], tot_loss[loss=0.174, simple_loss=0.2499, pruned_loss=0.04909, over 4530390.24 frames. ], batch size: 65, lr: 4.71e-03, grad_scale: 16.0 2023-10-02 04:18:28,781 WARNING [train.py:1204] (2/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_282-136641-0_sp0.9 from training. Duration: 0.4666875 2023-10-02 04:18:30,202 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_809-17156-0_sp1.1 from training. Duration: 0.663625 2023-10-02 04:18:30,412 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.balancer1.prob, batch_count=748026.6666666666, ans=0.125 2023-10-02 04:18:32,954 WARNING [train.py:1204] (2/4) Exclude cut with ID _9235_210_5219_1_1528891154253_8046120_1003-101638-0_sp0.9 from training. Duration: 0.811125 2023-10-02 04:18:34,376 WARNING [train.py:1204] (2/4) Exclude cut with ID _22826_210_9994_1_1531908201522_3279169_404-75889-0_sp0.9 from training. Duration: 0.988875 2023-10-02 04:18:34,638 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.conv_module2.balancer2.min_positive, batch_count=748026.6666666666, ans=0.05 2023-10-02 04:18:35,727 WARNING [train.py:1204] (2/4) Exclude cut with ID _59288_210_5854_1_1535077757774_3412246_570-295255-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 04:18:37,266 WARNING [train.py:1204] (2/4) Exclude cut with ID _3465_210_3525_1_1526175667060_3574049_412-287526-0 from training. Duration: 0.72 2023-10-02 04:18:39,184 WARNING [train.py:1204] (2/4) Exclude cut with ID _44010_210_4525_1_1533884262964_4013620_649-79655-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 04:18:40,748 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.conv_skip_rate, batch_count=748026.6666666666, ans=0.0 2023-10-02 04:18:42,076 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.ff2_skip_rate, batch_count=748093.3333333334, ans=0.0 2023-10-02 04:18:45,209 WARNING [train.py:1204] (2/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_57-270037-0_sp0.9 from training. Duration: 0.8555625 2023-10-02 04:18:45,210 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_34815_210_6388_1_1533381320_3571903_302-255963-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 04:18:49,542 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_47449_210_5399_1_1534076445_4006929_573-301852-0_sp1.1 from training. Duration: 0.97275 2023-10-02 04:18:52,457 WARNING [train.py:1204] (2/4) Exclude cut with ID 6709-74022-0004-86860-0_sp1.1 from training. Duration: 0.9409375 2023-10-02 04:18:53,833 WARNING [train.py:1204] (2/4) Exclude cut with ID _40519_210_13794_1_1533715228655_7337390_111-71991-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 04:18:55,187 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_30167_210_3528_1_1532428012_4772375_169-214186-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 04:18:58,643 WARNING [train.py:1204] (2/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_20-235313-0_sp1.1 from training. Duration: 0.963625 2023-10-02 04:18:58,695 WARNING [train.py:1204] (2/4) Exclude cut with ID _4681_210_3525_1_1527251040321_1235713_123-24428-0_sp0.9 from training. Duration: 0.5333125 2023-10-02 04:19:01,888 WARNING [train.py:1204] (2/4) Exclude cut with ID _31602_210_5854_1_1532677021456_4458250_444-198752-0_sp1.1 from training. Duration: 0.97275 2023-10-02 04:19:01,940 WARNING [train.py:1204] (2/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_9-279442-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 04:19:01,982 WARNING [train.py:1204] (2/4) Exclude cut with ID _6275_210_2084_1_1528110062213_7669049_973-198085-0_sp0.9 from training. Duration: 0.8333125 2023-10-02 04:19:04,624 WARNING [train.py:1204] (2/4) Exclude cut with ID _29853_210_7497_1_1532505600851_6642110_567-250221-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 04:19:04,728 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6545_210_2040_1_1527774670_4245258_407-48070-0_sp1.1 from training. Duration: 0.6909375 2023-10-02 04:19:07,478 WARNING [train.py:1204] (2/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_224-343295-0_sp0.9 from training. Duration: 0.611125 2023-10-02 04:19:07,492 WARNING [train.py:1204] (2/4) Exclude cut with ID IC0125W0011-61817 from training. Duration: 0.88 2023-10-02 04:19:07,501 WARNING [train.py:1204] (2/4) Exclude cut with ID _60113_210_16271_1_1535164212481_4386079_435-154371-0_sp1.1 from training. Duration: 0.97275 2023-10-02 04:19:07,521 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_25836_210_16554_1_1532138167_53197_3-9394-0_sp1.1 from training. Duration: 0.92725 2023-10-02 04:19:11,953 WARNING [train.py:1204] (2/4) Exclude cut with ID _29343_210_4381_1_1532602757752_3674109_57-55125-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 04:19:12,067 WARNING [train.py:1204] (2/4) Exclude cut with ID _56235_210_10640_1_1534557335235_4161099_267-23560-0_sp1.1 from training. Duration: 0.936375 2023-10-02 04:19:13,377 WARNING [train.py:1204] (2/4) Exclude cut with ID _30196_210_1664_1_1533708496522_7074140_1150-278867-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 04:19:13,430 WARNING [train.py:1204] (2/4) Exclude cut with ID _6270_210_2084_1_1527850911573_3856420_26-277343-0_sp0.9 from training. Duration: 0.911125 2023-10-02 04:19:13,498 WARNING [train.py:1204] (2/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_325-62802-0 from training. Duration: 0.87 2023-10-02 04:19:16,728 WARNING [train.py:1204] (2/4) Exclude cut with ID _56524_210_9642_1_1534572011779_3619170_145-310842-0_sp1.1 from training. Duration: 0.9 2023-10-02 04:19:16,767 WARNING [train.py:1204] (2/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_860-96198-0_sp0.9 from training. Duration: 0.811125 2023-10-02 04:19:18,070 WARNING [train.py:1204] (2/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_17-25045-0_sp1.1 from training. Duration: 0.536375 2023-10-02 04:19:18,087 WARNING [train.py:1204] (2/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_349-69727-0_sp1.1 from training. Duration: 0.936375 2023-10-02 04:19:18,236 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.1.attention_skip_rate, batch_count=748226.6666666666, ans=0.0 2023-10-02 04:19:19,402 WARNING [train.py:1204] (2/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_580-93798-0_sp1.1 from training. Duration: 0.5090625 2023-10-02 04:19:20,707 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7762_210_1794_1_1528199508_4057408_419-125009-0 from training. Duration: 0.83 2023-10-02 04:19:22,070 WARNING [train.py:1204] (2/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_356-167816-0 from training. Duration: 0.72 2023-10-02 04:19:22,095 WARNING [train.py:1204] (2/4) Exclude cut with ID _29345_210_4381_1_1532689168263_3860169_225-97100-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 04:19:22,099 WARNING [train.py:1204] (2/4) Exclude cut with ID _14227_210_4381_1_1530701804286_3940259_374-194712-0_sp1.1 from training. Duration: 0.92725 2023-10-02 04:19:23,382 WARNING [train.py:1204] (2/4) Exclude cut with ID _40137_210_12210_1_1533290484046_3795180_249-187059-0_sp0.9 from training. Duration: 0.97775 2023-10-02 04:19:23,398 WARNING [train.py:1204] (2/4) Exclude cut with ID _22826_210_9994_1_1531908201522_3279169_389-317194-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 04:19:24,941 WARNING [train.py:1204] (2/4) Exclude cut with ID _27485_210_4916_1_1532739191677_3668269_239-122918-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 04:19:27,752 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.2.encoder.layers.2.nonlin_attention.whiten2, num_groups=1, num_channels=384, metric=5.16 vs. limit=15.0 2023-10-02 04:19:31,426 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_2245_210_2715_1_1525511548_3540254_128-117609-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 04:19:31,455 WARNING [train.py:1204] (2/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_160-276178-0_sp1.1 from training. Duration: 0.963625 2023-10-02 04:19:33,426 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_31920_210_10633_1_1532860009_1686094_1-279735-0_sp1.1 from training. Duration: 0.97275 2023-10-02 04:19:34,930 WARNING [train.py:1204] (2/4) Exclude cut with ID _51078_210_9070_1_1534161557424_3805250_325-262383-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 04:19:34,948 WARNING [train.py:1204] (2/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_643-52007-0_sp0.9 from training. Duration: 0.6333125 2023-10-02 04:19:36,281 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_51937_210_5014_1_1534573610_4022025_518-299248-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 04:19:42,281 WARNING [train.py:1204] (2/4) Exclude cut with ID _13252_210_1925_1_1530671372047_3336909_166-314607-0_sp1.1 from training. Duration: 0.4909375 2023-10-02 04:19:42,285 WARNING [train.py:1204] (2/4) Exclude cut with ID _44778_210_2054_1_1533798127203_7443090_1087-282939-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 04:19:42,322 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_23850_210_4238_1_1532232597_1632433_32-185135-0_sp1.1 from training. Duration: 0.7818125 2023-10-02 04:19:42,362 WARNING [train.py:1204] (2/4) Exclude cut with ID _59500_210_8254_1_1534809610680_3736009_145-89359-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 04:19:43,285 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.0.layers.0.whiten, num_groups=1, num_channels=192, metric=4.23 vs. limit=12.0 2023-10-02 04:19:43,623 INFO [train.py:1046] (2/4) Epoch 22, batch 700, loss[loss=0.1753, simple_loss=0.2425, pruned_loss=0.05407, over 23911.00 frames. ], tot_loss[loss=0.1732, simple_loss=0.2495, pruned_loss=0.04848, over 4575073.22 frames. ], batch size: 195, lr: 4.71e-03, grad_scale: 16.0 2023-10-02 04:19:47,194 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_340-336408-0 from training. Duration: 0.93 2023-10-02 04:19:47,260 WARNING [train.py:1204] (2/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_9-21737-0 from training. Duration: 0.97 2023-10-02 04:19:49,949 WARNING [train.py:1204] (2/4) Exclude cut with ID _2220_210_1819_1_1525509109898_3658680_50-99837-0 from training. Duration: 0.54 2023-10-02 04:19:50,020 WARNING [train.py:1204] (2/4) Exclude cut with ID _30304_210_6319_1_1532476827261_7582060_879-299350-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 04:19:51,384 WARNING [train.py:1204] (2/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_905-160074-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 04:19:54,524 WARNING [train.py:1204] (2/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_395-73570-0 from training. Duration: 0.99 2023-10-02 04:19:58,588 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7264_210_4938_1_1527939911_8132095_800-91995-0_sp1.1 from training. Duration: 0.7818125 2023-10-02 04:20:02,105 WARNING [train.py:1204] (2/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_77-311404-0_sp1.1 from training. Duration: 0.8545625 2023-10-02 04:20:03,530 WARNING [train.py:1204] (2/4) Exclude cut with ID _13679_210_7497_1_1530534293662_3355098_107-80772-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 04:20:06,260 WARNING [train.py:1204] (2/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_224-343295-0_sp1.1 from training. Duration: 0.5 2023-10-02 04:20:06,291 WARNING [train.py:1204] (2/4) Exclude cut with ID _54871_210_22356_1_1534665798253_3856359_156-253482-0_sp1.1 from training. Duration: 0.963625 2023-10-02 04:20:08,875 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.658e+02 1.977e+02 2.354e+02 2.667e+02 4.737e+02, threshold=4.709e+02, percent-clipped=1.0 2023-10-02 04:20:09,037 WARNING [train.py:1204] (2/4) Exclude cut with ID _49728_210_12651_1_1534041034294_6788150_309-125532-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 04:20:09,844 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.2.encoder.layers.0.nonlin_attention.whiten2, num_groups=1, num_channels=384, metric=8.37 vs. limit=15.0 2023-10-02 04:20:11,187 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.feed_forward3.hidden_balancer.prob, batch_count=748426.6666666666, ans=0.125 2023-10-02 04:20:13,771 WARNING [train.py:1204] (2/4) Exclude cut with ID _13913_210_9994_1_1530579615187_3383897_473-277682-0_sp0.9 from training. Duration: 0.4333125 2023-10-02 04:20:13,775 WARNING [train.py:1204] (2/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_3-276701-0_sp1.1 from training. Duration: 0.82725 2023-10-02 04:20:13,913 WARNING [train.py:1204] (2/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_282-136641-0 from training. Duration: 0.42 2023-10-02 04:20:15,506 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.conv_module1.balancer1.prob, batch_count=748493.3333333334, ans=0.125 2023-10-02 04:20:17,940 WARNING [train.py:1204] (2/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_127-566-0 from training. Duration: 0.84 2023-10-02 04:20:21,425 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6163_210_6400_1_1527506746_4263068_283-137811-0_sp1.1 from training. Duration: 0.7 2023-10-02 04:20:21,461 WARNING [train.py:1204] (2/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_34-183685-0_sp1.1 from training. Duration: 0.8909375 2023-10-02 04:20:21,662 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.conv_module1.balancer2.min_abs, batch_count=748493.3333333334, ans=0.5 2023-10-02 04:20:24,152 WARNING [train.py:1204] (2/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_154-135696-0_sp0.9 from training. Duration: 0.888875 2023-10-02 04:20:26,920 WARNING [train.py:1204] (2/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_676-51014-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 04:20:28,356 WARNING [train.py:1204] (2/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_38-169920-0 from training. Duration: 0.91 2023-10-02 04:20:29,046 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.4.encoder.layers.0.feed_forward3.out_whiten, num_groups=1, num_channels=384, metric=14.33 vs. limit=15.0 2023-10-02 04:20:32,412 WARNING [train.py:1204] (2/4) Exclude cut with ID _6653_210_2629_1_1527919172441_6732679_747-114465-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 04:20:32,462 WARNING [train.py:1204] (2/4) Exclude cut with ID _8021_210_7994_1_1528376451358_3653069_100-140231-0_sp1.1 from training. Duration: 0.6090625 2023-10-02 04:20:33,640 WARNING [train.py:1204] (2/4) Exclude cut with ID _64135_210_2253_1_1535677267876_7608972_133-188266-0 from training. Duration: 0.81 2023-10-02 04:20:35,186 WARNING [train.py:1204] (2/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_714-310538-0_sp1.1 from training. Duration: 0.7818125 2023-10-02 04:20:36,543 WARNING [train.py:1204] (2/4) Exclude cut with ID _39867_210_9826_1_1533553222030_9344259_1134-288498-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 04:20:39,199 WARNING [train.py:1204] (2/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_211-237289-0_sp1.1 from training. Duration: 0.936375 2023-10-02 04:20:42,916 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.conv_module1.balancer1.prob, batch_count=748626.6666666666, ans=0.125 2023-10-02 04:20:44,054 WARNING [train.py:1204] (2/4) Exclude cut with ID _59202_210_26984_1_1534816810612_3549174_137-318710-0_sp0.9 from training. Duration: 0.988875 2023-10-02 04:20:45,354 WARNING [train.py:1204] (2/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_86-66192-0 from training. Duration: 0.55 2023-10-02 04:20:48,146 WARNING [train.py:1204] (2/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_540-275685-0 from training. Duration: 0.89 2023-10-02 04:20:48,198 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8776_210_2040_1_1528638837_4088036_244-278763-0 from training. Duration: 0.9 2023-10-02 04:20:51,358 WARNING [train.py:1204] (2/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_9-219972-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 04:20:54,062 WARNING [train.py:1204] (2/4) Exclude cut with ID _44659_210_2721_1_1534036120061_6614860_656-283823-0_sp1.1 from training. Duration: 0.92725 2023-10-02 04:20:55,461 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_69630_210_7234_1_1535789601_661049_77-216385-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 04:20:55,598 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_19952_210_2161_1_1531875469_3851750_210-308919-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 04:20:55,603 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_4737_210_2040_1_1526998689_3293008_378-178650-0 from training. Duration: 0.95 2023-10-02 04:20:55,823 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.conv_skip_rate, batch_count=748626.6666666666, ans=0.0 2023-10-02 04:20:58,668 INFO [train.py:1046] (2/4) Epoch 22, batch 750, loss[loss=0.1693, simple_loss=0.2489, pruned_loss=0.04486, over 24456.00 frames. ], tot_loss[loss=0.1723, simple_loss=0.2486, pruned_loss=0.04794, over 4616602.70 frames. ], batch size: 63, lr: 4.71e-03, grad_scale: 16.0 2023-10-02 04:21:00,111 WARNING [train.py:1204] (2/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_224-343295-0 from training. Duration: 0.55 2023-10-02 04:21:00,135 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7002_210_1794_1_1527939683_5157286_184-229630-0 from training. Duration: 0.74 2023-10-02 04:21:00,808 WARNING [train.py:1204] (2/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_6-214815-0 from training. Duration: 0.85 2023-10-02 04:21:02,130 WARNING [train.py:1204] (2/4) Exclude cut with ID _8699_210_2069_1_1528630346831_3960409_160-24408-0 from training. Duration: 0.91 2023-10-02 04:21:02,174 WARNING [train.py:1204] (2/4) Exclude cut with ID _5585_210_2667_1_1527418813324_8007340_89-332072-0 from training. Duration: 0.64 2023-10-02 04:21:02,208 WARNING [train.py:1204] (2/4) Exclude cut with ID _5250_210_5219_1_1527249561806_8692110_528-259800-0_sp1.1 from training. Duration: 0.963625 2023-10-02 04:21:03,492 WARNING [train.py:1204] (2/4) Exclude cut with ID _55383_210_10635_1_1534468426172_2231669_134-128246-0 from training. Duration: 0.89 2023-10-02 04:21:03,745 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.conv_module1.balancer2.prob, batch_count=748693.3333333334, ans=0.125 2023-10-02 04:21:04,941 WARNING [train.py:1204] (2/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_400-338274-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 04:21:05,006 WARNING [train.py:1204] (2/4) Exclude cut with ID _14224_210_4381_1_1530529062037_4264010_364-22817-0_sp0.9 from training. Duration: 0.788875 2023-10-02 04:21:06,338 WARNING [train.py:1204] (2/4) Exclude cut with ID _42863_210_9070_1_1533902398788_3696920_331-291057-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 04:21:07,764 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_5436_210_1794_1_1527331317_2541494_229-211016-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 04:21:08,015 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.conv_module2.balancer1.prob, batch_count=748693.3333333334, ans=0.125 2023-10-02 04:21:09,145 WARNING [train.py:1204] (2/4) Exclude cut with ID _6456_210_1534_1_1527681634212_3181049_345-252877-0_sp0.9 from training. Duration: 0.711125 2023-10-02 04:21:09,183 WARNING [train.py:1204] (2/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_96-81499-0_sp1.1 from training. Duration: 0.92725 2023-10-02 04:21:13,816 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_5575_210_1410_1_1527301305_2570170_342-265572-0_sp1.1 from training. Duration: 0.8545625 2023-10-02 04:21:15,191 WARNING [train.py:1204] (2/4) Exclude cut with ID _5444_210_2151_1_1527418808464_5312210_726-27165-0_sp0.9 from training. Duration: 0.7444375 2023-10-02 04:21:16,620 WARNING [train.py:1204] (2/4) Exclude cut with ID _22083_210_8254_1_1531702862439_3678970_304-327750-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 04:21:19,467 WARNING [train.py:1204] (2/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_245-226199-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 04:21:19,505 WARNING [train.py:1204] (2/4) Exclude cut with ID _42875_210_14856_1_1533704397487_4012433_102-199499-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 04:21:19,554 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_51225_210_3601_1_1534400634_2327383_55-301844-0 from training. Duration: 0.93 2023-10-02 04:21:22,789 WARNING [train.py:1204] (2/4) Exclude cut with ID _28317_210_12742_1_1532480302381_8510325_916-195848-0_sp1.1 from training. Duration: 0.836375 2023-10-02 04:21:22,861 WARNING [train.py:1204] (2/4) Exclude cut with ID _44718_210_5816_1_1534050051869_6469030_497-311999-0_sp1.1 from training. Duration: 0.936375 2023-10-02 04:21:24,235 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_34886_210_10633_1_1533203882_1755661_91-194765-0_sp1.1 from training. Duration: 0.936375 2023-10-02 04:21:25,629 WARNING [train.py:1204] (2/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_310-26186-0_sp0.9 from training. Duration: 0.77775 2023-10-02 04:21:26,999 WARNING [train.py:1204] (2/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_416-257730-0 from training. Duration: 0.65 2023-10-02 04:21:27,005 WARNING [train.py:1204] (2/4) Exclude cut with ID _9389_210_1664_1_1529217337301_5289269_209-154760-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 04:21:29,150 WARNING [train.py:1204] (2/4) Exclude cut with ID _4092_210_2151_1_1526643044449_3135759_227-213972-0 from training. Duration: 0.54 2023-10-02 04:21:29,170 WARNING [train.py:1204] (2/4) Exclude cut with ID IC0679W0044-335852 from training. Duration: 0.768 2023-10-02 04:21:30,530 WARNING [train.py:1204] (2/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_42-186162-0 from training. Duration: 0.92 2023-10-02 04:21:30,543 WARNING [train.py:1204] (2/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_302-2550-0_sp0.9 from training. Duration: 0.9 2023-10-02 04:21:31,883 WARNING [train.py:1204] (2/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_356-31216-0_sp0.9 from training. Duration: 0.8333125 2023-10-02 04:21:33,954 WARNING [train.py:1204] (2/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_125-322172-0_sp1.1 from training. Duration: 0.8454375 2023-10-02 04:21:39,400 WARNING [train.py:1204] (2/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_141-89727-0_sp0.9 from training. Duration: 0.788875 2023-10-02 04:21:39,416 WARNING [train.py:1204] (2/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_647-337784-0_sp1.1 from training. Duration: 0.97275 2023-10-02 04:21:39,425 WARNING [train.py:1204] (2/4) Exclude cut with ID _6493_210_1747_1_1528459210424_4012470_513-140967-0_sp1.1 from training. Duration: 0.7181875 2023-10-02 04:21:39,579 WARNING [train.py:1204] (2/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_87-266334-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 04:21:42,249 WARNING [train.py:1204] (2/4) Exclude cut with ID _26528_210_9070_1_1532570410842_3851310_526-261338-0_sp1.1 from training. Duration: 0.92725 2023-10-02 04:21:42,307 WARNING [train.py:1204] (2/4) Exclude cut with ID _14224_210_4381_1_1530529062037_4264010_380-343670-0 from training. Duration: 0.88 2023-10-02 04:21:43,657 WARNING [train.py:1204] (2/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_449-15508-0_sp1.1 from training. Duration: 0.7454375 2023-10-02 04:21:43,752 WARNING [train.py:1204] (2/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_372-6869-0_sp0.9 from training. Duration: 0.488875 2023-10-02 04:21:44,767 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.0.layers.0.self_attn2.whiten, num_groups=1, num_channels=192, metric=10.91 vs. limit=22.5 2023-10-02 04:21:45,479 WARNING [train.py:1204] (2/4) Exclude cut with ID _26907_210_7234_1_1532221740694_3205263_337-2494-0_sp0.9 from training. Duration: 0.97775 2023-10-02 04:21:48,864 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.self_attn_weights.whiten_keys.whitening_limit, batch_count=748893.3333333334, ans=6.0 2023-10-02 04:21:49,567 WARNING [train.py:1204] (2/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_10-100367-0_sp1.1 from training. Duration: 0.8909375 2023-10-02 04:21:51,182 WARNING [train.py:1204] (2/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_47-167373-0 from training. Duration: 0.92 2023-10-02 04:21:51,246 WARNING [train.py:1204] (2/4) Exclude cut with ID _28323_210_16187_1_1532480395667_4100300_628-282179-0_sp1.1 from training. Duration: 0.97275 2023-10-02 04:21:55,442 WARNING [train.py:1204] (2/4) Exclude cut with ID _20455_210_5219_1_1531565996775_8098100_213-280898-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 04:21:56,928 WARNING [train.py:1204] (2/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_383-316000-0_sp1.1 from training. Duration: 0.8181875 2023-10-02 04:21:58,234 WARNING [train.py:1204] (2/4) Exclude cut with ID _18513_210_9206_1_1531618215223_3862620_16-78180-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 04:22:01,559 WARNING [train.py:1204] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_330-327861-0_sp1.1 from training. Duration: 0.6090625 2023-10-02 04:22:06,118 WARNING [train.py:1204] (2/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_356-31216-0 from training. Duration: 0.75 2023-10-02 04:22:06,136 WARNING [train.py:1204] (2/4) Exclude cut with ID _25248_210_8750_1_1532062825941_5554180_255-148539-0_sp1.1 from training. Duration: 0.963625 2023-10-02 04:22:06,178 WARNING [train.py:1204] (2/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_269-154726-0_sp1.1 from training. Duration: 0.7818125 2023-10-02 04:22:07,699 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7002_210_1794_1_1527939683_5157286_163-87552-0_sp1.1 from training. Duration: 0.7818125 2023-10-02 04:22:09,030 WARNING [train.py:1204] (2/4) Exclude cut with ID _54871_210_22356_1_1534665798253_3856359_97-151896-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 04:22:10,555 WARNING [train.py:1204] (2/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_207-7026-0_sp1.1 from training. Duration: 0.97275 2023-10-02 04:22:11,956 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_92-46009-0_sp0.9 from training. Duration: 0.6 2023-10-02 04:22:13,155 INFO [train.py:1046] (2/4) Epoch 22, batch 800, loss[loss=0.2146, simple_loss=0.2758, pruned_loss=0.07673, over 19608.00 frames. ], tot_loss[loss=0.1734, simple_loss=0.2492, pruned_loss=0.04877, over 4632321.28 frames. ], batch size: 388, lr: 4.71e-03, grad_scale: 32.0 2023-10-02 04:22:13,981 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.2.encoder.layers.2.self_attn1.whiten, num_groups=1, num_channels=384, metric=19.08 vs. limit=22.5 2023-10-02 04:22:18,960 WARNING [train.py:1204] (2/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_122-178847-0_sp1.1 from training. Duration: 0.97275 2023-10-02 04:22:18,964 WARNING [train.py:1204] (2/4) Exclude cut with ID _44778_210_2054_1_1533798127203_7443090_94-348956-0_sp1.1 from training. Duration: 0.92725 2023-10-02 04:22:19,306 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.conv_module2.balancer1.prob, batch_count=749026.6666666666, ans=0.125 2023-10-02 04:22:20,797 WARNING [train.py:1204] (2/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_1018-244089-0_sp1.1 from training. Duration: 0.963625 2023-10-02 04:22:20,808 WARNING [train.py:1204] (2/4) Exclude cut with ID _56235_210_10640_1_1534557335235_4161099_87-118423-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 04:22:22,195 WARNING [train.py:1204] (2/4) Exclude cut with ID _53713_210_24116_1_1534314487762_7416751_504-212292-0_sp1.1 from training. Duration: 0.92725 2023-10-02 04:22:22,215 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_446-150327-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 04:22:24,177 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.bypass.skip_rate, batch_count=749026.6666666666, ans=0.09899494936611666 2023-10-02 04:22:25,361 WARNING [train.py:1204] (2/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_682-245989-0_sp1.1 from training. Duration: 0.92725 2023-10-02 04:22:28,016 WARNING [train.py:1204] (2/4) Exclude cut with ID _35723_210_1794_1_1533088341043_628250_8-234524-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 04:22:29,816 WARNING [train.py:1204] (2/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_64-248359-0_sp0.9 from training. Duration: 0.7666875 2023-10-02 04:22:31,311 WARNING [train.py:1204] (2/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_49-97158-0 from training. Duration: 0.76 2023-10-02 04:22:32,624 WARNING [train.py:1204] (2/4) Exclude cut with ID _24314_210_4915_1_1532135078116_7170179_743-915-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 04:22:33,360 WARNING [train.py:1204] (2/4) Exclude cut with ID _61194_210_18628_1_1535007555628_3750909_353-323468-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 04:22:33,387 WARNING [train.py:1204] (2/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_387-131538-0_sp1.1 from training. Duration: 0.736375 2023-10-02 04:22:34,503 WARNING [train.py:1204] (2/4) Exclude cut with ID _44424_210_18163_1_1533623394848_1213110_79-187603-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 04:22:34,549 WARNING [train.py:1204] (2/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_86-19601-0 from training. Duration: 0.85 2023-10-02 04:22:35,835 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_50050_210_16223_1_1535435796_7290138_103-283529-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 04:22:35,874 WARNING [train.py:1204] (2/4) Exclude cut with ID _13913_210_9994_1_1530579615187_3383897_473-277682-0 from training. Duration: 0.39 2023-10-02 04:22:38,791 WARNING [train.py:1204] (2/4) Exclude cut with ID _42326_210_7770_1_1533814282531_3707269_368-68029-0_sp1.1 from training. Duration: 0.92725 2023-10-02 04:22:40,033 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.522e+02 1.751e+02 1.954e+02 2.249e+02 3.221e+02, threshold=3.908e+02, percent-clipped=0.0 2023-10-02 04:22:41,491 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_41428_210_2161_1_1533774184_3992532_446-339645-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 04:22:44,333 WARNING [train.py:1204] (2/4) Exclude cut with ID _69584_210_5219_1_1535785133775_4044450_325-7656-0_sp1.1 from training. Duration: 0.7818125 2023-10-02 04:22:45,626 WARNING [train.py:1204] (2/4) Exclude cut with ID _21632_210_2253_1_1531994001765_3744140_134-184387-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 04:22:47,220 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.feed_forward1.out_proj.dropout_p, batch_count=749160.0, ans=0.1 2023-10-02 04:22:48,328 WARNING [train.py:1204] (2/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_550-184707-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 04:22:48,347 WARNING [train.py:1204] (2/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_106-341780-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 04:22:48,763 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.conv_module2.balancer2.prob, batch_count=749160.0, ans=0.125 2023-10-02 04:22:49,981 WARNING [train.py:1204] (2/4) Exclude cut with ID _45871_210_5665_1_1533864614245_3296060_253-132552-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 04:22:50,203 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.conv_module1.balancer2.prob, batch_count=749160.0, ans=0.125 2023-10-02 04:22:51,892 WARNING [train.py:1204] (2/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_395-310220-0_sp1.1 from training. Duration: 0.8090625 2023-10-02 04:22:51,906 WARNING [train.py:1204] (2/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_795-33252-0_sp0.9 from training. Duration: 0.411125 2023-10-02 04:22:55,158 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9002W0003-495962 from training. Duration: 0.946 2023-10-02 04:22:55,184 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_43-71989-0 from training. Duration: 0.57 2023-10-02 04:22:55,203 WARNING [train.py:1204] (2/4) Exclude cut with ID _8699_210_2069_1_1528630346831_3960409_155-39766-0_sp1.1 from training. Duration: 0.7090625 2023-10-02 04:22:56,497 WARNING [train.py:1204] (2/4) Exclude cut with ID _27894_210_12392_1_1532415496078_6902439_788-232684-0_sp1.1 from training. Duration: 0.97275 2023-10-02 04:22:57,971 WARNING [train.py:1204] (2/4) Exclude cut with ID _56494_210_4220_1_1535284837625_3033169_10-39296-0_sp1.1 from training. Duration: 0.92725 2023-10-02 04:22:57,999 WARNING [train.py:1204] (2/4) Exclude cut with ID _40901_210_5665_1_1533363789958_3856130_341-20819-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 04:23:02,688 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9029W0435-509822 from training. Duration: 0.896 2023-10-02 04:23:02,735 WARNING [train.py:1204] (2/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_51-117576-0 from training. Duration: 0.4 2023-10-02 04:23:04,143 WARNING [train.py:1204] (2/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_197-28164-0_sp1.1 from training. Duration: 0.72725 2023-10-02 04:23:06,177 WARNING [train.py:1204] (2/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_145-137790-0_sp1.1 from training. Duration: 0.7545625 2023-10-02 04:23:08,786 WARNING [train.py:1204] (2/4) Exclude cut with ID _47790_210_16187_1_1533862814066_4207519_2-39653-0_sp1.1 from training. Duration: 0.8909375 2023-10-02 04:23:11,737 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_3780_210_2087_1_1526467760_2130483_186-208240-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 04:23:13,103 WARNING [train.py:1204] (2/4) Exclude cut with ID _24055_210_12651_1_1532310907143_976979_92-59356-0 from training. Duration: 0.91 2023-10-02 04:23:14,355 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_3910_210_1794_1_1526742306_334530_28-238909-0_sp1.1 from training. Duration: 0.736375 2023-10-02 04:23:15,844 WARNING [train.py:1204] (2/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_269-154726-0 from training. Duration: 0.86 2023-10-02 04:23:23,535 WARNING [train.py:1204] (2/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_326-165900-0_sp0.9 from training. Duration: 0.7666875 2023-10-02 04:23:26,674 WARNING [train.py:1204] (2/4) Exclude cut with ID _5294_210_1747_1_1527249643928_3696179_470-276077-0_sp1.1 from training. Duration: 0.963625 2023-10-02 04:23:26,713 WARNING [train.py:1204] (2/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_372-131163-0 from training. Duration: 0.94 2023-10-02 04:23:26,766 WARNING [train.py:1204] (2/4) Exclude cut with ID _45297_210_2721_1_1533715351381_3264949_351-147136-0_sp1.1 from training. Duration: 0.7909375 2023-10-02 04:23:26,862 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder_embed.conv.5.prob, batch_count=749360.0, ans=0.125 2023-10-02 04:23:28,147 INFO [train.py:1046] (2/4) Epoch 22, batch 850, loss[loss=0.1688, simple_loss=0.2555, pruned_loss=0.04105, over 24519.00 frames. ], tot_loss[loss=0.1743, simple_loss=0.2503, pruned_loss=0.04916, over 4652501.09 frames. ], batch size: 71, lr: 4.71e-03, grad_scale: 16.0 2023-10-02 04:23:28,217 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_1072-12671-0_sp1.1 from training. Duration: 0.92725 2023-10-02 04:23:29,637 WARNING [train.py:1204] (2/4) Exclude cut with ID _15133_210_6228_1_1530794512074_3080379_54-198193-0 from training. Duration: 0.94 2023-10-02 04:23:29,664 WARNING [train.py:1204] (2/4) Exclude cut with ID _30196_210_1664_1_1533708496522_7074140_1194-270988-0_sp1.1 from training. Duration: 0.97275 2023-10-02 04:23:31,662 WARNING [train.py:1204] (2/4) Exclude cut with ID _50310_210_15488_1_1534068043345_3641080_289-193637-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 04:23:33,058 WARNING [train.py:1204] (2/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_313-263495-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 04:23:34,500 WARNING [train.py:1204] (2/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_18-130336-0_sp1.1 from training. Duration: 0.7181875 2023-10-02 04:23:36,404 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_47896_210_20508_1_1534492518_5958868_646-287209-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 04:23:36,520 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_294-163793-0 from training. Duration: 0.82 2023-10-02 04:23:37,640 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_8-319957-0 from training. Duration: 0.96 2023-10-02 04:23:37,643 WARNING [train.py:1204] (2/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_1211-226066-0 from training. Duration: 0.76 2023-10-02 04:23:39,116 WARNING [train.py:1204] (2/4) Exclude cut with ID _4779_210_2084_1_1527318022944_3914893_500-285013-0_sp0.9 from training. Duration: 0.7666875 2023-10-02 04:23:39,134 WARNING [train.py:1204] (2/4) Exclude cut with ID _22826_210_9994_1_1531908201522_3279169_347-232268-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 04:23:41,935 WARNING [train.py:1204] (2/4) Exclude cut with ID _29877_210_6228_1_1532397791106_5276149_111-55056-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 04:23:41,960 WARNING [train.py:1204] (2/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_812-271646-0_sp1.1 from training. Duration: 0.92725 2023-10-02 04:23:41,984 WARNING [train.py:1204] (2/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_235-262124-0_sp1.1 from training. Duration: 0.6090625 2023-10-02 04:23:46,141 WARNING [train.py:1204] (2/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_14-16530-0_sp1.1 from training. Duration: 0.97275 2023-10-02 04:23:46,180 WARNING [train.py:1204] (2/4) Exclude cut with ID _44718_210_5816_1_1534050051869_6469030_568-304512-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 04:23:46,211 WARNING [train.py:1204] (2/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_1033-274376-0 from training. Duration: 0.87 2023-10-02 04:23:48,882 WARNING [train.py:1204] (2/4) Exclude cut with ID _4681_210_3525_1_1527251040321_1235713_123-24428-0 from training. Duration: 0.48 2023-10-02 04:23:53,663 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_37818_210_4238_1_1533620910_4720083_251-288764-0_sp1.1 from training. Duration: 0.97275 2023-10-02 04:23:53,744 WARNING [train.py:1204] (2/4) Exclude cut with ID _65585_210_12210_1_1535450451584_3764098_112-294301-0 from training. Duration: 0.88 2023-10-02 04:23:55,721 WARNING [train.py:1204] (2/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_329-113173-0 from training. Duration: 0.62 2023-10-02 04:23:57,179 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6767_210_3273_1_1527855783_1718932_173-256601-0 from training. Duration: 0.8 2023-10-02 04:24:00,389 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9266W0001-627677 from training. Duration: 0.896 2023-10-02 04:24:00,411 WARNING [train.py:1204] (2/4) Exclude cut with ID _49643_210_7989_1_1534640244224_3939031_611-185515-0_sp1.1 from training. Duration: 0.8545625 2023-10-02 04:24:00,415 WARNING [train.py:1204] (2/4) Exclude cut with ID _31649_210_6228_1_1532584608063_4091078_687-237771-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 04:24:00,425 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_12868_210_6753_1_1530702479_3610564_148-172861-0_sp0.9 from training. Duration: 0.5555625 2023-10-02 04:24:01,840 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.bypass.scale_min, batch_count=749493.3333333334, ans=0.2 2023-10-02 04:24:04,746 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_5129_210_6400_1_1527161005_3579054_107-69738-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 04:24:04,842 WARNING [train.py:1204] (2/4) Exclude cut with ID _40137_210_12210_1_1533290484046_3795180_36-301164-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 04:24:06,035 WARNING [train.py:1204] (2/4) Exclude cut with ID _7537_210_2084_1_1528502395431_3924359_113-199933-0 from training. Duration: 0.59 2023-10-02 04:24:06,197 WARNING [train.py:1204] (2/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_547-5424-0_sp1.1 from training. Duration: 0.8545625 2023-10-02 04:24:07,461 WARNING [train.py:1204] (2/4) Exclude cut with ID _43552_210_8913_1_1533726031059_3523429_147-284024-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 04:24:07,545 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_294-163793-0_sp1.1 from training. Duration: 0.7454375 2023-10-02 04:24:08,968 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_3780_210_2087_1_1526467760_2130483_170-259044-0_sp1.1 from training. Duration: 0.636375 2023-10-02 04:24:10,403 WARNING [train.py:1204] (2/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_726-270780-0_sp1.1 from training. Duration: 0.7818125 2023-10-02 04:24:10,574 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=749493.3333333334, ans=0.1 2023-10-02 04:24:11,686 WARNING [train.py:1204] (2/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_214-291369-0_sp1.1 from training. Duration: 0.57275 2023-10-02 04:24:11,739 WARNING [train.py:1204] (2/4) Exclude cut with ID _21881_210_3525_1_1531825242757_3798380_247-102591-0 from training. Duration: 0.98 2023-10-02 04:24:14,638 WARNING [train.py:1204] (2/4) Exclude cut with ID _8064_210_5399_1_1528372299291_4282973_18-174647-0_sp1.1 from training. Duration: 0.82725 2023-10-02 04:24:14,641 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_415-52611-0_sp1.1 from training. Duration: 0.936375 2023-10-02 04:24:16,111 WARNING [train.py:1204] (2/4) Exclude cut with ID _8394_210_6702_1_1528590504316_3540189_379-49457-0_sp0.9 from training. Duration: 0.9666875 2023-10-02 04:24:16,130 WARNING [train.py:1204] (2/4) Exclude cut with ID _64521_210_14699_1_1535243427259_3815328_368-195106-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 04:24:16,330 INFO [scaling.py:1118] (2/4) WithLoss: name=encoder.encoders.2.encoder.layers.2.self_attn_weights, loss-sum=0.000e+00 2023-10-02 04:24:18,712 WARNING [train.py:1204] (2/4) Exclude cut with ID _28394_210_8913_1_1532430049518_3650118_51-334124-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 04:24:21,489 WARNING [train.py:1204] (2/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_317-161769-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 04:24:22,927 WARNING [train.py:1204] (2/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_40-129756-0_sp0.9 from training. Duration: 0.9 2023-10-02 04:24:26,277 WARNING [train.py:1204] (2/4) Exclude cut with ID _3067_210_2667_1_1526212974640_4503168_119-175776-0_sp0.9 from training. Duration: 0.82225 2023-10-02 04:24:26,335 WARNING [train.py:1204] (2/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_1004-304915-0_sp1.1 from training. Duration: 0.92725 2023-10-02 04:24:28,230 WARNING [train.py:1204] (2/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_359-118926-0_sp0.9 from training. Duration: 0.8 2023-10-02 04:24:37,902 WARNING [train.py:1204] (2/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_51-200220-0_sp0.9 from training. Duration: 0.62225 2023-10-02 04:24:39,249 WARNING [train.py:1204] (2/4) Exclude cut with ID _46683_210_9642_1_1533730913492_4591116_530-326202-0_sp1.1 from training. Duration: 0.936375 2023-10-02 04:24:39,298 WARNING [train.py:1204] (2/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_567-135114-0 from training. Duration: 0.87 2023-10-02 04:24:39,321 WARNING [train.py:1204] (2/4) Exclude cut with ID _66863_210_18053_1_1535698758334_8590319_1092-271784-0_sp1.1 from training. Duration: 0.87275 2023-10-02 04:24:39,330 WARNING [train.py:1204] (2/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_762-282614-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 04:24:42,043 WARNING [train.py:1204] (2/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_280-120942-0 from training. Duration: 0.77 2023-10-02 04:24:43,313 INFO [train.py:1046] (2/4) Epoch 22, batch 900, loss[loss=0.1803, simple_loss=0.2667, pruned_loss=0.04693, over 24671.00 frames. ], tot_loss[loss=0.1753, simple_loss=0.2514, pruned_loss=0.04956, over 4660706.53 frames. ], batch size: 68, lr: 4.71e-03, grad_scale: 16.0 2023-10-02 04:24:48,761 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_31583_210_3852_1_1532595851_4024413_27-170931-0_sp1.1 from training. Duration: 0.963625 2023-10-02 04:24:52,824 WARNING [train.py:1204] (2/4) Exclude cut with ID _12369_210_4381_1_1530356078581_4388520_266-271628-0_sp1.1 from training. Duration: 0.92725 2023-10-02 04:24:52,867 WARNING [train.py:1204] (2/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_41-31157-0 from training. Duration: 0.57 2023-10-02 04:24:55,095 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.1.encoder.layers.0.feed_forward3.out_whiten, num_groups=1, num_channels=256, metric=11.75 vs. limit=15.0 2023-10-02 04:24:55,654 WARNING [train.py:1204] (2/4) Exclude cut with ID _21878_210_3525_1_1531738772002_7264330_113-18163-0_sp1.1 from training. Duration: 0.8090625 2023-10-02 04:24:55,684 WARNING [train.py:1204] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_147-111137-0 from training. Duration: 0.95 2023-10-02 04:24:57,094 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1083-220122-0_sp1.1 from training. Duration: 0.463625 2023-10-02 04:24:58,469 WARNING [train.py:1204] (2/4) Exclude cut with ID _14884_210_2252_1_1530707459689_5561250_591-173406-0_sp1.1 from training. Duration: 0.87275 2023-10-02 04:24:58,475 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_24677_210_1794_1_1532002693_4341296_149-303793-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 04:24:58,524 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_12868_210_6753_1_1530702479_3610564_133-564-0_sp1.1 from training. Duration: 0.7181875 2023-10-02 04:24:59,890 WARNING [train.py:1204] (2/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_625-338546-0_sp0.9 from training. Duration: 0.97775 2023-10-02 04:25:03,542 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=749760.0, ans=0.1 2023-10-02 04:25:10,079 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.575e+02 1.859e+02 2.170e+02 2.493e+02 3.460e+02, threshold=4.340e+02, percent-clipped=0.0 2023-10-02 04:25:10,195 WARNING [train.py:1204] (2/4) Exclude cut with ID _25882_210_3036_1_1532775665815_6479510_624-320169-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 04:25:10,205 WARNING [train.py:1204] (2/4) Exclude cut with ID _20622_210_3852_1_1531622427080_1093839_127-325326-0_sp1.1 from training. Duration: 0.92725 2023-10-02 04:25:10,239 WARNING [train.py:1204] (2/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_464-33931-0_sp1.1 from training. Duration: 0.4909375 2023-10-02 04:25:11,698 WARNING [train.py:1204] (2/4) Exclude cut with ID _66112_210_10635_1_1535590236305_3801484_120-304588-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 04:25:11,903 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.conv_module2.balancer2.prob, batch_count=749826.6666666666, ans=0.125 2023-10-02 04:25:15,945 WARNING [train.py:1204] (2/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_110-130961-0 from training. Duration: 0.47 2023-10-02 04:25:17,281 WARNING [train.py:1204] (2/4) Exclude cut with ID _16908_210_5724_1_1531119157248_3772700_64-54020-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 04:25:18,990 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.nonlin_attention.balancer.prob, batch_count=749826.6666666666, ans=0.125 2023-10-02 04:25:19,673 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.1.encoder.layers.0.feed_forward2.out_whiten, num_groups=1, num_channels=256, metric=4.70 vs. limit=15.0 2023-10-02 04:25:20,264 WARNING [train.py:1204] (2/4) Exclude cut with ID _4068_210_2629_1_1526697350395_1546259_224-288693-0_sp1.1 from training. Duration: 0.536375 2023-10-02 04:25:21,490 WARNING [train.py:1204] (2/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_191-41023-0_sp0.9 from training. Duration: 0.788875 2023-10-02 04:25:21,578 WARNING [train.py:1204] (2/4) Exclude cut with ID ID0205W0255-783002 from training. Duration: 0.958 2023-10-02 04:25:22,925 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_162-262717-0 from training. Duration: 0.83 2023-10-02 04:25:30,084 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_224-322394-0_sp1.1 from training. Duration: 0.6 2023-10-02 04:25:30,092 WARNING [train.py:1204] (2/4) Exclude cut with ID _4866_210_2577_1_1527404412284_4364659_133-1696-0_sp1.1 from training. Duration: 0.9 2023-10-02 04:25:30,155 WARNING [train.py:1204] (2/4) Exclude cut with ID _6275_210_2084_1_1528110062213_7669049_973-198085-0_sp1.1 from training. Duration: 0.6818125 2023-10-02 04:25:38,270 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_47449_210_5399_1_1534076445_4006929_410-237806-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 04:25:38,285 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7543_210_6753_1_1528543318_4552_1-85072-0_sp0.9 from training. Duration: 0.92225 2023-10-02 04:25:39,698 WARNING [train.py:1204] (2/4) Exclude cut with ID _65263_210_22356_1_1535763598691_3921037_108-21902-0 from training. Duration: 0.99 2023-10-02 04:25:40,978 WARNING [train.py:1204] (2/4) Exclude cut with ID _65585_210_12210_1_1535450451584_3764098_309-174818-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 04:25:42,440 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_309-65177-0 from training. Duration: 0.88 2023-10-02 04:25:45,131 WARNING [train.py:1204] (2/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_363-185703-0_sp1.1 from training. Duration: 0.863625 2023-10-02 04:25:45,166 WARNING [train.py:1204] (2/4) Exclude cut with ID _42326_210_7770_1_1533814282531_3707269_442-238692-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 04:25:46,632 WARNING [train.py:1204] (2/4) Exclude cut with ID _69884_210_9771_1_1535846392818_4166169_226-254446-0_sp1.1 from training. Duration: 0.936375 2023-10-02 04:25:46,650 WARNING [train.py:1204] (2/4) Exclude cut with ID _29853_210_7497_1_1532505600851_6642110_287-299903-0_sp1.1 from training. Duration: 0.963625 2023-10-02 04:25:52,080 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1015-343884-0 from training. Duration: 0.62 2023-10-02 04:25:53,391 WARNING [train.py:1204] (2/4) Exclude cut with ID ID0305W0008-833402 from training. Duration: 0.91 2023-10-02 04:25:53,509 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6673_210_3273_1_1527767790_3173240_351-167954-0_sp1.1 from training. Duration: 0.436375 2023-10-02 04:25:53,527 WARNING [train.py:1204] (2/4) Exclude cut with ID _6456_210_1534_1_1527681634212_3181049_345-252877-0 from training. Duration: 0.64 2023-10-02 04:25:56,132 INFO [train.py:1046] (2/4) Epoch 22, batch 950, loss[loss=0.1533, simple_loss=0.232, pruned_loss=0.03732, over 24604.00 frames. ], tot_loss[loss=0.1761, simple_loss=0.252, pruned_loss=0.05007, over 4667550.07 frames. ], batch size: 60, lr: 4.71e-03, grad_scale: 16.0 2023-10-02 04:25:56,243 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_13-205182-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 04:25:58,218 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.4.encoder.layers.0.feed_forward1.out_whiten, num_groups=1, num_channels=384, metric=5.78 vs. limit=15.0 2023-10-02 04:25:59,198 INFO [scaling.py:1118] (2/4) WithLoss: name=encoder.encoders.4.encoder.layers.2.self_attn_weights, loss-sum=0.000e+00 2023-10-02 04:26:00,858 WARNING [train.py:1204] (2/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_427-208901-0 from training. Duration: 0.68 2023-10-02 04:26:01,113 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=750026.6666666666, ans=0.1 2023-10-02 04:26:05,151 WARNING [train.py:1204] (2/4) Exclude cut with ID _49039_210_6315_1_1533967537541_3846118_9-148921-0_sp1.1 from training. Duration: 0.97275 2023-10-02 04:26:08,435 WARNING [train.py:1204] (2/4) Exclude cut with ID _59493_210_17282_1_1535078419077_3225879_143-70317-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 04:26:08,471 WARNING [train.py:1204] (2/4) Exclude cut with ID _27967_210_12140_1_1532588391859_8419130_1056-236369-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 04:26:10,207 WARNING [train.py:1204] (2/4) Exclude cut with ID _6537_210_1883_1_1527987559710_3597129_382-133062-0_sp0.9 from training. Duration: 0.7555625 2023-10-02 04:26:12,948 WARNING [train.py:1204] (2/4) Exclude cut with ID ID0376W0255-869957 from training. Duration: 0.9510625 2023-10-02 04:26:14,580 WARNING [train.py:1204] (2/4) Exclude cut with ID _49371_210_12071_1_1533952761021_3812190_483-240691-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 04:26:15,832 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_12868_210_6753_1_1530701932_530275_18-76322-0_sp1.1 from training. Duration: 0.7909375 2023-10-02 04:26:17,173 WARNING [train.py:1204] (2/4) Exclude cut with ID _53950_210_5816_1_1534417264919_3862951_367-26344-0_sp1.1 from training. Duration: 0.97275 2023-10-02 04:26:17,192 WARNING [train.py:1204] (2/4) Exclude cut with ID _5444_210_2151_1_1527418808464_5312210_634-194174-0_sp1.1 from training. Duration: 0.8818125 2023-10-02 04:26:17,208 WARNING [train.py:1204] (2/4) Exclude cut with ID _56494_210_4220_1_1535284837625_3033169_69-138150-0 from training. Duration: 0.94 2023-10-02 04:26:17,299 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7543_210_6753_1_1528544010_1110402_25-183860-0_sp0.9 from training. Duration: 0.52225 2023-10-02 04:26:19,939 WARNING [train.py:1204] (2/4) Exclude cut with ID _40284_210_2084_1_1533513679814_3704279_287-191918-0_sp1.1 from training. Duration: 0.963625 2023-10-02 04:26:21,403 WARNING [train.py:1204] (2/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_291-270881-0 from training. Duration: 0.97 2023-10-02 04:26:21,451 WARNING [train.py:1204] (2/4) Exclude cut with ID _12555_210_8750_1_1530075594402_3780239_449-267986-0_sp0.9 from training. Duration: 0.92225 2023-10-02 04:26:24,448 WARNING [train.py:1204] (2/4) Exclude cut with ID _3992_210_1747_1_1526644816031_3731060_268-64520-0_sp1.1 from training. Duration: 0.963625 2023-10-02 04:26:24,457 WARNING [train.py:1204] (2/4) Exclude cut with ID _39411_210_5986_1_1533538825030_3343649_173-238248-0_sp0.9 from training. Duration: 0.92225 2023-10-02 04:26:24,499 WARNING [train.py:1204] (2/4) Exclude cut with ID _32704_210_8954_1_1533017488840_6747370_828-313097-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 04:26:24,580 WARNING [train.py:1204] (2/4) Exclude cut with ID _26907_210_7234_1_1532221740694_3205263_337-2494-0 from training. Duration: 0.88 2023-10-02 04:26:27,211 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_229-45300-0_sp1.1 from training. Duration: 0.5181875 2023-10-02 04:26:28,602 WARNING [train.py:1204] (2/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_300-27235-0_sp1.1 from training. Duration: 0.7909375 2023-10-02 04:26:29,889 WARNING [train.py:1204] (2/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_48-338976-0_sp1.1 from training. Duration: 0.8090625 2023-10-02 04:26:35,217 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_13091_210_7925_1_1530609447_8987354_792-195353-0_sp1.1 from training. Duration: 0.936375 2023-10-02 04:26:35,223 WARNING [train.py:1204] (2/4) Exclude cut with ID _56524_210_9642_1_1534572011779_3619170_311-54500-0_sp1.1 from training. Duration: 0.97275 2023-10-02 04:26:39,242 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.balancer1.prob, batch_count=750160.0, ans=0.125 2023-10-02 04:26:40,442 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7543_210_6753_1_1528544010_1110402_25-183860-0 from training. Duration: 0.47 2023-10-02 04:26:43,087 WARNING [train.py:1204] (2/4) Exclude cut with ID 2411-132532-0017-82279-0_sp1.1 from training. Duration: 0.9681875 2023-10-02 04:26:43,088 WARNING [train.py:1204] (2/4) Exclude cut with ID _14170_210_5219_1_1530525949868_4469650_272-249985-0_sp0.9 from training. Duration: 0.7444375 2023-10-02 04:26:43,152 WARNING [train.py:1204] (2/4) Exclude cut with ID _55644_210_11856_1_1534593599582_4103719_320-229006-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 04:26:44,510 WARNING [train.py:1204] (2/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_177-100554-0_sp1.1 from training. Duration: 0.963625 2023-10-02 04:26:44,511 WARNING [train.py:1204] (2/4) Exclude cut with ID _14998_210_5219_1_1530705582080_7474970_787-294416-0_sp1.1 from training. Duration: 0.7454375 2023-10-02 04:26:48,736 WARNING [train.py:1204] (2/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_318-59097-0 from training. Duration: 0.76 2023-10-02 04:26:48,817 WARNING [train.py:1204] (2/4) Exclude cut with ID _12369_210_4381_1_1530356078581_4388520_480-13146-0_sp1.1 from training. Duration: 0.77275 2023-10-02 04:26:51,603 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_469-338558-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 04:26:52,917 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_40896_210_15710_1_1533433956_7100802_367-126897-0_sp1.1 from training. Duration: 0.963625 2023-10-02 04:26:52,932 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6545_210_2040_1_1527774670_4245258_379-14182-0 from training. Duration: 0.8 2023-10-02 04:26:52,942 WARNING [train.py:1204] (2/4) Exclude cut with ID _21631_210_2253_1_1531907880778_3160339_197-79498-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 04:26:52,946 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_12868_210_6753_1_1530702479_3610564_416-293746-0_sp1.1 from training. Duration: 0.8181875 2023-10-02 04:26:52,985 WARNING [train.py:1204] (2/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_52-211683-0 from training. Duration: 0.9 2023-10-02 04:26:56,897 WARNING [train.py:1204] (2/4) Exclude cut with ID _48678_210_5663_1_1533988830947_3769199_306-255886-0_sp0.9 from training. Duration: 0.8555625 2023-10-02 04:26:59,585 WARNING [train.py:1204] (2/4) Exclude cut with ID _65134_210_14763_1_1535335393985_3505368_296-24733-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 04:27:01,674 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.bypass_mid.scale_min, batch_count=750293.3333333334, ans=0.2 2023-10-02 04:27:02,325 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.1.encoder.layers.1.conv_module1.whiten, num_groups=1, num_channels=256, metric=10.30 vs. limit=15.0 2023-10-02 04:27:04,130 WARNING [train.py:1204] (2/4) Exclude cut with ID _14898_210_9206_1_1530766812377_4177190_335-99364-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 04:27:05,965 WARNING [train.py:1204] (2/4) Exclude cut with ID _54871_210_22356_1_1534665798253_3856359_19-342632-0 from training. Duration: 0.91 2023-10-02 04:27:06,001 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_53071_210_16554_1_1534490405_2954066_265-170472-0 from training. Duration: 0.83 2023-10-02 04:27:10,326 INFO [train.py:1046] (2/4) Epoch 22, batch 1000, loss[loss=0.1697, simple_loss=0.2609, pruned_loss=0.0393, over 24652.00 frames. ], tot_loss[loss=0.1751, simple_loss=0.2508, pruned_loss=0.04965, over 4685373.10 frames. ], batch size: 73, lr: 4.70e-03, grad_scale: 8.0 2023-10-02 04:27:13,569 WARNING [train.py:1204] (2/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_189-298172-0_sp1.1 from training. Duration: 0.963625 2023-10-02 04:27:14,114 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.4.encoder.layers.2.feed_forward2.out_whiten, num_groups=1, num_channels=384, metric=8.43 vs. limit=15.0 2023-10-02 04:27:16,469 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7615_210_4211_1_1528110965_4580160_72-235211-0 from training. Duration: 0.76 2023-10-02 04:27:16,532 WARNING [train.py:1204] (2/4) Exclude cut with ID _47790_210_16187_1_1533862814066_4207519_155-90874-0_sp1.1 from training. Duration: 0.936375 2023-10-02 04:27:20,781 WARNING [train.py:1204] (2/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_164-216310-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 04:27:22,067 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_46013_210_7234_1_1533776362_3744032_463-52378-0 from training. Duration: 0.86 2023-10-02 04:27:22,076 WARNING [train.py:1204] (2/4) Exclude cut with ID _61133_210_9969_1_1535196777227_3696409_333-270325-0 from training. Duration: 0.93 2023-10-02 04:27:26,309 WARNING [train.py:1204] (2/4) Exclude cut with ID _5172_210_1883_1_1527246062567_3577219_18-101380-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 04:27:26,317 WARNING [train.py:1204] (2/4) Exclude cut with ID _30800_210_22382_1_1532499347904_4515959_577-236858-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 04:27:28,071 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.5.encoder.layers.1.conv_module1.whiten, num_groups=1, num_channels=256, metric=4.23 vs. limit=15.0 2023-10-02 04:27:28,981 WARNING [train.py:1204] (2/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_1006-177345-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 04:27:30,525 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6291_210_3273_1_1527683649_1078498_4-266348-0 from training. Duration: 0.78 2023-10-02 04:27:33,903 WARNING [train.py:1204] (2/4) Exclude cut with ID _55857_210_2346_1_1534644007831_6898209_745-98391-0 from training. Duration: 0.76 2023-10-02 04:27:35,521 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.feed_forward1.out_proj.dropout_p, batch_count=750426.6666666666, ans=0.1 2023-10-02 04:27:36,584 WARNING [train.py:1204] (2/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_375-244976-0 from training. Duration: 0.75 2023-10-02 04:27:36,628 WARNING [train.py:1204] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_4-248404-0_sp1.1 from training. Duration: 0.97275 2023-10-02 04:27:38,285 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.537e+02 1.816e+02 2.103e+02 2.458e+02 3.993e+02, threshold=4.207e+02, percent-clipped=0.0 2023-10-02 04:27:40,101 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8453_210_4211_1_1528502940_8562730_596-306820-0 from training. Duration: 0.97 2023-10-02 04:27:40,206 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_211-339622-0_sp0.9 from training. Duration: 0.5 2023-10-02 04:27:41,505 WARNING [train.py:1204] (2/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_393-57888-0 from training. Duration: 0.64 2023-10-02 04:27:43,444 WARNING [train.py:1204] (2/4) Exclude cut with ID _23825_210_13449_1_1531915313526_3497150_143-343575-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 04:27:43,505 WARNING [train.py:1204] (2/4) Exclude cut with ID _58605_210_12210_1_1534723195939_3904259_36-12256-0_sp1.1 from training. Duration: 0.936375 2023-10-02 04:27:50,746 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.conv_module2.balancer1.prob, batch_count=750493.3333333334, ans=0.125 2023-10-02 04:27:51,864 WARNING [train.py:1204] (2/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_38-67795-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 04:27:53,095 WARNING [train.py:1204] (2/4) Exclude cut with ID _64631_210_27767_1_1535349580068_4165160_308-327832-0_sp1.1 from training. Duration: 0.92725 2023-10-02 04:27:53,154 WARNING [train.py:1204] (2/4) Exclude cut with ID _37086_210_9038_1_1532999540891_7021250_726-81176-0_sp1.1 from training. Duration: 0.936375 2023-10-02 04:27:53,211 WARNING [train.py:1204] (2/4) Exclude cut with ID _47790_210_16187_1_1533862814066_4207519_47-141521-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 04:27:53,214 WARNING [train.py:1204] (2/4) Exclude cut with ID _3067_210_2667_1_1526212974640_4503168_250-285937-0 from training. Duration: 0.8 2023-10-02 04:27:54,521 WARNING [train.py:1204] (2/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_239-24000-0_sp1.1 from training. Duration: 0.97275 2023-10-02 04:27:55,919 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_3371_210_1794_1_1526384462_5580777_396-339812-0_sp0.9 from training. Duration: 0.9444375 2023-10-02 04:27:55,962 WARNING [train.py:1204] (2/4) Exclude cut with ID _16909_210_5724_1_1531292079912_4118919_580-173454-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 04:27:56,029 WARNING [train.py:1204] (2/4) Exclude cut with ID IC0124W0002-61308 from training. Duration: 0.982 2023-10-02 04:28:00,163 WARNING [train.py:1204] (2/4) Exclude cut with ID _31602_210_5854_1_1532677021456_4458250_249-96604-0 from training. Duration: 0.89 2023-10-02 04:28:01,553 WARNING [train.py:1204] (2/4) Exclude cut with ID _69584_210_5219_1_1535785133775_4044450_325-7656-0 from training. Duration: 0.86 2023-10-02 04:28:02,919 WARNING [train.py:1204] (2/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_478-162247-0 from training. Duration: 0.82 2023-10-02 04:28:04,384 WARNING [train.py:1204] (2/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_686-54383-0_sp1.1 from training. Duration: 0.7909375 2023-10-02 04:28:06,667 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.feed_forward3.hidden_balancer.prob, batch_count=750560.0, ans=0.125 2023-10-02 04:28:08,119 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.feed_forward1.hidden_balancer.prob, batch_count=750626.6666666666, ans=0.125 2023-10-02 04:28:09,935 WARNING [train.py:1204] (2/4) Exclude cut with ID _29303_210_2575_1_1532392237303_7410649_85-270092-0_sp1.1 from training. Duration: 0.936375 2023-10-02 04:28:09,951 WARNING [train.py:1204] (2/4) Exclude cut with ID _2322_210_2667_1_1525604548971_3656299_440-80494-0_sp1.1 from training. Duration: 0.8 2023-10-02 04:28:09,986 WARNING [train.py:1204] (2/4) Exclude cut with ID _47346_210_9476_1_1533815971553_3536160_201-40644-0_sp1.1 from training. Duration: 0.936375 2023-10-02 04:28:11,960 WARNING [train.py:1204] (2/4) Exclude cut with ID _56235_210_10640_1_1534557335235_4161099_343-139898-0_sp1.1 from training. Duration: 0.87275 2023-10-02 04:28:13,351 WARNING [train.py:1204] (2/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_1012-279473-0 from training. Duration: 0.67 2023-10-02 04:28:13,455 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7931_210_1794_1_1528594728_5564059_280-279560-0_sp1.1 from training. Duration: 0.8909375 2023-10-02 04:28:13,491 WARNING [train.py:1204] (2/4) Exclude cut with ID _8263_210_1664_1_1528439452891_970220_68-181805-0 from training. Duration: 0.64 2023-10-02 04:28:14,806 WARNING [train.py:1204] (2/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_605-133395-0 from training. Duration: 0.66 2023-10-02 04:28:14,905 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_43-171109-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 04:28:14,908 WARNING [train.py:1204] (2/4) Exclude cut with ID _40094_210_16380_1_1533294005773_3641159_107-280739-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 04:28:15,092 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.conv_module1.balancer2.min_abs, batch_count=750626.6666666666, ans=0.5 2023-10-02 04:28:19,027 WARNING [train.py:1204] (2/4) Exclude cut with ID _14821_210_8750_1_1530680503991_6829359_2-227151-0_sp0.9 from training. Duration: 0.92225 2023-10-02 04:28:20,502 WARNING [train.py:1204] (2/4) Exclude cut with ID _14268_210_9206_1_1530622771285_4714099_331-105351-0_sp1.1 from training. Duration: 0.5909375 2023-10-02 04:28:20,718 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.conv_module2.balancer2.prob, batch_count=750626.6666666666, ans=0.125 2023-10-02 04:28:23,078 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_26326_210_7925_1_1533002493_3592406_451-82741-0_sp1.1 from training. Duration: 0.97275 2023-10-02 04:28:24,500 INFO [train.py:1046] (2/4) Epoch 22, batch 1050, loss[loss=0.1516, simple_loss=0.229, pruned_loss=0.0371, over 18964.00 frames. ], tot_loss[loss=0.1731, simple_loss=0.2485, pruned_loss=0.04882, over 4679699.00 frames. ], batch size: 41, lr: 4.70e-03, grad_scale: 8.0 2023-10-02 04:28:26,005 WARNING [train.py:1204] (2/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_191-59784-0_sp0.9 from training. Duration: 0.97775 2023-10-02 04:28:28,622 WARNING [train.py:1204] (2/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_445-117398-0_sp1.1 from training. Duration: 0.8090625 2023-10-02 04:28:30,059 WARNING [train.py:1204] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_605-257205-0_sp0.9 from training. Duration: 0.6555625 2023-10-02 04:28:31,425 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_50050_210_16223_1_1535435796_7290138_592-258841-0_sp1.1 from training. Duration: 0.936375 2023-10-02 04:28:31,567 WARNING [train.py:1204] (2/4) Exclude cut with ID _14348_210_8341_1_1530599434996_3778640_240-232370-0_sp0.9 from training. Duration: 0.9333125 2023-10-02 04:28:31,775 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.balancer1.prob, batch_count=750693.3333333334, ans=0.125 2023-10-02 04:28:36,165 WARNING [train.py:1204] (2/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_239-316802-0_sp0.9 from training. Duration: 0.7666875 2023-10-02 04:28:37,553 WARNING [train.py:1204] (2/4) Exclude cut with ID _45560_210_21982_1_1533812603951_3584999_111-6376-0_sp1.1 from training. Duration: 0.763625 2023-10-02 04:28:40,235 WARNING [train.py:1204] (2/4) Exclude cut with ID _54189_210_16380_1_1534332670039_3911710_606-311481-0_sp1.1 from training. Duration: 0.963625 2023-10-02 04:28:41,596 WARNING [train.py:1204] (2/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_237-332027-0_sp1.1 from training. Duration: 0.863625 2023-10-02 04:28:41,619 WARNING [train.py:1204] (2/4) Exclude cut with ID _66070_210_7640_1_1535441393096_7805310_489-292631-0_sp0.9 from training. Duration: 0.87775 2023-10-02 04:28:41,683 WARNING [train.py:1204] (2/4) Exclude cut with ID _49728_210_12651_1_1534041034294_6788150_624-195301-0_sp1.1 from training. Duration: 0.77275 2023-10-02 04:28:43,600 WARNING [train.py:1204] (2/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_44-79427-0 from training. Duration: 0.98 2023-10-02 04:28:43,649 WARNING [train.py:1204] (2/4) Exclude cut with ID _35350_210_16040_1_1533040191183_4075299_490-198354-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 04:28:44,957 WARNING [train.py:1204] (2/4) Exclude cut with ID _5166_210_2084_1_1527073197160_4087993_309-222545-0 from training. Duration: 0.84 2023-10-02 04:28:47,762 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_13091_210_7925_1_1530609447_8987354_790-199469-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 04:28:47,774 WARNING [train.py:1204] (2/4) Exclude cut with ID _21632_210_2253_1_1531994001765_3744140_110-274654-0 from training. Duration: 0.82 2023-10-02 04:28:47,780 WARNING [train.py:1204] (2/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_498-40153-0_sp0.9 from training. Duration: 0.7 2023-10-02 04:28:54,788 WARNING [train.py:1204] (2/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_363-282909-0_sp1.1 from training. Duration: 0.936375 2023-10-02 04:28:56,237 WARNING [train.py:1204] (2/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_178-177582-0_sp0.9 from training. Duration: 0.82225 2023-10-02 04:28:56,248 WARNING [train.py:1204] (2/4) Exclude cut with ID _24612_210_8913_1_1531998054193_3806359_357-35487-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 04:28:57,689 WARNING [train.py:1204] (2/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_138-332773-0 from training. Duration: 0.81 2023-10-02 04:28:59,020 WARNING [train.py:1204] (2/4) Exclude cut with ID _7624_210_1531_1_1528109802198_4933101_418-349834-0 from training. Duration: 0.6 2023-10-02 04:28:59,043 WARNING [train.py:1204] (2/4) Exclude cut with ID _7537_210_2084_1_1528502395431_3924359_14-312906-0_sp0.9 from training. Duration: 0.9333125 2023-10-02 04:29:01,861 WARNING [train.py:1204] (2/4) Exclude cut with ID _60057_210_10640_1_1534903065793_3333150_362-230377-0 from training. Duration: 0.87 2023-10-02 04:29:02,073 WARNING [train.py:1204] (2/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_138-64210-0 from training. Duration: 0.73 2023-10-02 04:29:03,408 WARNING [train.py:1204] (2/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_454-339504-0_sp1.1 from training. Duration: 0.97275 2023-10-02 04:29:06,737 WARNING [train.py:1204] (2/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_508-335369-0_sp0.9 from training. Duration: 0.5333125 2023-10-02 04:29:08,521 WARNING [train.py:1204] (2/4) Exclude cut with ID _5608_210_2106_1_1527415439089_6819768_909-82767-0_sp0.9 from training. Duration: 0.67775 2023-10-02 04:29:09,830 WARNING [train.py:1204] (2/4) Exclude cut with ID _6694_210_4599_1_1527852637490_3965529_99-11611-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 04:29:09,907 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_2655_210_2087_1_1525688959_3897400_52-279899-0_sp1.1 from training. Duration: 0.62725 2023-10-02 04:29:12,462 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.3.encoder.layers.1.feed_forward2.out_whiten, num_groups=1, num_channels=512, metric=8.25 vs. limit=15.0 2023-10-02 04:29:15,100 WARNING [train.py:1204] (2/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_375-245677-0_sp1.1 from training. Duration: 0.736375 2023-10-02 04:29:18,023 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_23850_210_4238_1_1532232597_1632433_32-185135-0 from training. Duration: 0.86 2023-10-02 04:29:19,438 WARNING [train.py:1204] (2/4) Exclude cut with ID _15113_210_8254_1_1530842438579_3716279_209-54533-0 from training. Duration: 0.96 2023-10-02 04:29:19,469 WARNING [train.py:1204] (2/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_401-53423-0 from training. Duration: 0.55 2023-10-02 04:29:19,515 WARNING [train.py:1204] (2/4) Exclude cut with ID _14867_210_1968_1_1530684179890_3846969_319-250868-0_sp1.1 from training. Duration: 0.9 2023-10-02 04:29:20,845 WARNING [train.py:1204] (2/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_106-30782-0_sp0.9 from training. Duration: 0.888875 2023-10-02 04:29:22,208 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_5960_210_1794_1_1527403865_3990999_515-2613-0 from training. Duration: 0.88 2023-10-02 04:29:25,106 WARNING [train.py:1204] (2/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_798-106179-0_sp0.9 from training. Duration: 0.92225 2023-10-02 04:29:26,424 WARNING [train.py:1204] (2/4) Exclude cut with ID _14227_210_4381_1_1530701804286_3940259_125-275642-0_sp1.1 from training. Duration: 0.9 2023-10-02 04:29:26,426 WARNING [train.py:1204] (2/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_79-200392-0_sp1.1 from training. Duration: 0.8818125 2023-10-02 04:29:26,484 WARNING [train.py:1204] (2/4) Exclude cut with ID _48836_210_12210_1_1534075848293_3904150_429-35475-0_sp0.9 from training. Duration: 0.988875 2023-10-02 04:29:26,498 WARNING [train.py:1204] (2/4) Exclude cut with ID _69884_210_9771_1_1535846392818_4166169_224-172517-0_sp1.1 from training. Duration: 0.97275 2023-10-02 04:29:30,589 WARNING [train.py:1204] (2/4) Exclude cut with ID _15113_210_8254_1_1530842438579_3716279_76-232507-0_sp1.1 from training. Duration: 0.97275 2023-10-02 04:29:30,591 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_135-5501-0 from training. Duration: 0.98 2023-10-02 04:29:33,201 WARNING [train.py:1204] (2/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_563-347832-0_sp0.9 from training. Duration: 0.988875 2023-10-02 04:29:33,217 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6786_210_1388_1_1527839699_4194495_273-149841-0 from training. Duration: 0.92 2023-10-02 04:29:33,245 WARNING [train.py:1204] (2/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_172-215932-0 from training. Duration: 0.66 2023-10-02 04:29:33,372 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.nonlin_attention.balancer.prob, batch_count=750960.0, ans=0.125 2023-10-02 04:29:35,077 WARNING [train.py:1204] (2/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_939-132910-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 04:29:38,536 WARNING [train.py:1204] (2/4) Exclude cut with ID _49371_210_12071_1_1533952761021_3812190_16-246001-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 04:29:39,816 INFO [train.py:1046] (2/4) Epoch 22, batch 1100, loss[loss=0.1671, simple_loss=0.2565, pruned_loss=0.03886, over 24318.00 frames. ], tot_loss[loss=0.1727, simple_loss=0.2483, pruned_loss=0.04855, over 4692273.33 frames. ], batch size: 74, lr: 4.70e-03, grad_scale: 8.0 2023-10-02 04:29:44,255 INFO [scaling.py:1118] (2/4) WithLoss: name=encoder.encoders.2.encoder.layers.2.self_attn_weights, loss-sum=0.000e+00 2023-10-02 04:29:45,889 WARNING [train.py:1204] (2/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_680-189165-0_sp1.1 from training. Duration: 0.936375 2023-10-02 04:29:48,990 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.feed_forward3.hidden_balancer.prob, batch_count=751026.6666666666, ans=0.125 2023-10-02 04:29:50,060 WARNING [train.py:1204] (2/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_53-300544-0_sp1.1 from training. Duration: 0.6090625 2023-10-02 04:29:52,811 WARNING [train.py:1204] (2/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_540-275685-0_sp1.1 from training. Duration: 0.8090625 2023-10-02 04:29:52,828 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_38965_210_4852_1_1533174906_3484735_453-106579-0_sp1.1 from training. Duration: 0.92725 2023-10-02 04:29:52,874 WARNING [train.py:1204] (2/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_105-328174-0 from training. Duration: 0.76 2023-10-02 04:29:54,326 WARNING [train.py:1204] (2/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_279-126205-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 04:29:54,624 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.conv_skip_rate, batch_count=751093.3333333334, ans=0.0 2023-10-02 04:29:54,654 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.balancer_ff3.min_abs, batch_count=751093.3333333334, ans=0.2 2023-10-02 04:29:55,785 WARNING [train.py:1204] (2/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_5-55893-0_sp0.9 from training. Duration: 0.611125 2023-10-02 04:29:57,237 WARNING [train.py:1204] (2/4) Exclude cut with ID _13678_210_7497_1_1530530069226_3463170_141-10835-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 04:29:58,925 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.self_attn_weights.pos_emb_skip_rate, batch_count=751093.3333333334, ans=0.0 2023-10-02 04:29:59,985 WARNING [train.py:1204] (2/4) Exclude cut with ID _22083_210_8254_1_1531702862439_3678970_201-85738-0_sp1.1 from training. Duration: 0.8181875 2023-10-02 04:30:00,004 WARNING [train.py:1204] (2/4) Exclude cut with ID _4697_210_2069_1_1526815801953_3823098_51-1049-0 from training. Duration: 0.75 2023-10-02 04:30:00,177 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.conv_module1.balancer1.prob, batch_count=751093.3333333334, ans=0.125 2023-10-02 04:30:02,546 WARNING [train.py:1204] (2/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_206-10097-0_sp0.9 from training. Duration: 0.5444375 2023-10-02 04:30:03,858 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_4528_210_3273_1_1527076654_3276777_360-63661-0_sp1.1 from training. Duration: 0.92725 2023-10-02 04:30:03,866 WARNING [train.py:1204] (2/4) Exclude cut with ID _64086_210_7994_1_1535270417019_7435949_449-31567-0_sp1.1 from training. Duration: 0.8818125 2023-10-02 04:30:06,500 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.545e+02 1.819e+02 2.068e+02 2.421e+02 4.208e+02, threshold=4.136e+02, percent-clipped=1.0 2023-10-02 04:30:06,585 WARNING [train.py:1204] (2/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_245-174553-0_sp0.9 from training. Duration: 0.97775 2023-10-02 04:30:06,732 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6545_210_2040_1_1527774670_4245258_379-14182-0_sp1.1 from training. Duration: 0.72725 2023-10-02 04:30:13,528 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_3133_210_2087_1_1526106446_2324690_256-206070-0_sp1.1 from training. Duration: 0.963625 2023-10-02 04:30:17,016 WARNING [train.py:1204] (2/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_278-104676-0 from training. Duration: 0.98 2023-10-02 04:30:17,101 WARNING [train.py:1204] (2/4) Exclude cut with ID IC0671W0065-331908 from training. Duration: 0.896 2023-10-02 04:30:18,349 WARNING [train.py:1204] (2/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_244-129917-0_sp1.1 from training. Duration: 0.97275 2023-10-02 04:30:19,840 WARNING [train.py:1204] (2/4) Exclude cut with ID _7852_210_5219_1_1528286471316_8139400_1129-170857-0_sp1.1 from training. Duration: 0.97275 2023-10-02 04:30:21,209 WARNING [train.py:1204] (2/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_443-30034-0_sp0.9 from training. Duration: 0.711125 2023-10-02 04:30:21,236 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_31583_210_3852_1_1532595851_4024413_250-332999-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 04:30:22,646 WARNING [train.py:1204] (2/4) Exclude cut with ID _8772_210_1850_1_1528635595972_3664011_247-336448-0 from training. Duration: 0.74 2023-10-02 04:30:23,969 WARNING [train.py:1204] (2/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_567-135114-0_sp0.9 from training. Duration: 0.9666875 2023-10-02 04:30:23,973 WARNING [train.py:1204] (2/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_557-349534-0_sp1.1 from training. Duration: 0.82725 2023-10-02 04:30:23,992 WARNING [train.py:1204] (2/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_1341-26728-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 04:30:25,313 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_40222_210_6228_1_1533385298_4481939_375-328051-0_sp1.1 from training. Duration: 0.97275 2023-10-02 04:30:25,344 WARNING [train.py:1204] (2/4) Exclude cut with ID _4697_210_2069_1_1526815801953_3823098_276-96593-0 from training. Duration: 0.77 2023-10-02 04:30:29,658 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_53071_210_16554_1_1534490405_2954066_265-170472-0_sp0.9 from training. Duration: 0.92225 2023-10-02 04:30:29,674 WARNING [train.py:1204] (2/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_365-1492-0 from training. Duration: 0.91 2023-10-02 04:30:31,258 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.feed_forward1.out_proj.dropout_p, batch_count=751226.6666666666, ans=0.1 2023-10-02 04:30:32,235 WARNING [train.py:1204] (2/4) Exclude cut with ID _66873_210_5517_1_1535454136506_4293370_654-52713-0_sp0.9 from training. Duration: 0.8555625 2023-10-02 04:30:35,054 WARNING [train.py:1204] (2/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_113-92608-0_sp1.1 from training. Duration: 0.4909375 2023-10-02 04:30:38,367 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_46013_210_7234_1_1533776362_3744032_50-141535-0 from training. Duration: 0.99 2023-10-02 04:30:38,375 WARNING [train.py:1204] (2/4) Exclude cut with ID _13942_210_9988_1_1530604906036_3733360_499-55676-0_sp1.1 from training. Duration: 0.563625 2023-10-02 04:30:40,365 WARNING [train.py:1204] (2/4) Exclude cut with ID _30800_210_22382_1_1532499347904_4515959_375-65143-0_sp1.1 from training. Duration: 0.97275 2023-10-02 04:30:41,741 WARNING [train.py:1204] (2/4) Exclude cut with ID _16756_210_4915_1_1531047803763_7104259_199-325608-0_sp1.1 from training. Duration: 0.92725 2023-10-02 04:30:43,112 WARNING [train.py:1204] (2/4) Exclude cut with ID _31649_210_6228_1_1532584608063_4091078_443-124391-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 04:30:44,456 WARNING [train.py:1204] (2/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_146-218763-0 from training. Duration: 0.86 2023-10-02 04:30:46,337 WARNING [train.py:1204] (2/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_567-135114-0_sp1.1 from training. Duration: 0.7909375 2023-10-02 04:30:46,388 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_26326_210_7925_1_1533002493_3592406_462-288299-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 04:30:46,471 WARNING [train.py:1204] (2/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_322-217220-0 from training. Duration: 0.86 2023-10-02 04:30:47,824 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_238-256668-0_sp1.1 from training. Duration: 0.62725 2023-10-02 04:30:47,873 WARNING [train.py:1204] (2/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_371-18169-0 from training. Duration: 0.86 2023-10-02 04:30:49,280 WARNING [train.py:1204] (2/4) Exclude cut with ID _58480_210_12392_1_1534811121111_3646160_23-74512-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 04:30:49,285 WARNING [train.py:1204] (2/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_784-213099-0_sp1.1 from training. Duration: 0.8454375 2023-10-02 04:30:50,730 WARNING [train.py:1204] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_625-102955-0_sp1.1 from training. Duration: 0.7 2023-10-02 04:30:53,554 INFO [train.py:1046] (2/4) Epoch 22, batch 1150, loss[loss=0.1766, simple_loss=0.2506, pruned_loss=0.05127, over 23348.00 frames. ], tot_loss[loss=0.1737, simple_loss=0.2496, pruned_loss=0.04887, over 4686769.15 frames. ], batch size: 285, lr: 4.70e-03, grad_scale: 8.0 2023-10-02 04:30:53,720 WARNING [train.py:1204] (2/4) Exclude cut with ID _49748_210_24116_1_1533986971737_3158910_176-174327-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 04:30:56,413 WARNING [train.py:1204] (2/4) Exclude cut with ID _50310_210_15488_1_1534068043345_3641080_432-96140-0_sp1.1 from training. Duration: 0.936375 2023-10-02 04:30:59,178 WARNING [train.py:1204] (2/4) Exclude cut with ID _40402_210_18219_1_1533522324115_2953150_76-14910-0_sp1.1 from training. Duration: 0.92725 2023-10-02 04:30:59,215 WARNING [train.py:1204] (2/4) Exclude cut with ID _21783_210_11357_1_1531742458730_3762679_40-19458-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 04:30:59,246 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_14056_210_4163_1_1530604483_7572827_46-93547-0 from training. Duration: 0.95 2023-10-02 04:31:00,520 WARNING [train.py:1204] (2/4) Exclude cut with ID _21631_210_2253_1_1531907880778_3160339_321-35325-0_sp1.1 from training. Duration: 0.963625 2023-10-02 04:31:03,335 WARNING [train.py:1204] (2/4) Exclude cut with ID _4779_210_2084_1_1527318022944_3914893_500-285013-0 from training. Duration: 0.69 2023-10-02 04:31:06,041 WARNING [train.py:1204] (2/4) Exclude cut with ID _43552_210_8913_1_1533726031059_3523429_301-93324-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 04:31:06,047 WARNING [train.py:1204] (2/4) Exclude cut with ID _6473_210_1741_1_1527901307396_3815380_108-17335-0_sp1.1 from training. Duration: 0.5909375 2023-10-02 04:31:12,774 WARNING [train.py:1204] (2/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_206-10097-0 from training. Duration: 0.49 2023-10-02 04:31:14,197 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_36756_210_7234_1_1533026991_966276_103-77513-0_sp1.1 from training. Duration: 0.97275 2023-10-02 04:31:18,297 WARNING [train.py:1204] (2/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_662-18193-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 04:31:18,569 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.conv_module1.balancer1.prob, batch_count=751426.6666666666, ans=0.125 2023-10-02 04:31:19,956 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_26433_210_12108_1_1532157158_4056480_439-52438-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 04:31:19,992 WARNING [train.py:1204] (2/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_464-33931-0 from training. Duration: 0.54 2023-10-02 04:31:20,001 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_3474_210_2028_1_1526188513_4825246_145-309131-0_sp0.9 from training. Duration: 0.82225 2023-10-02 04:31:20,021 WARNING [train.py:1204] (2/4) Exclude cut with ID _9679_210_7497_1_1529129212906_4363929_446-308321-0_sp1.1 from training. Duration: 0.963625 2023-10-02 04:31:24,250 WARNING [train.py:1204] (2/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_662-275621-0 from training. Duration: 0.42 2023-10-02 04:31:25,565 WARNING [train.py:1204] (2/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_344-337148-0_sp1.1 from training. Duration: 0.97275 2023-10-02 04:31:26,975 WARNING [train.py:1204] (2/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_759-179071-0_sp1.1 from training. Duration: 0.92725 2023-10-02 04:31:29,157 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.4.encoder.layers.0.conv_module2.whiten, num_groups=1, num_channels=384, metric=7.03 vs. limit=15.0 2023-10-02 04:31:30,140 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.conv_module2.balancer1.max_abs, batch_count=751493.3333333334, ans=10.0 2023-10-02 04:31:32,837 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.feed_forward3.hidden_balancer.prob, batch_count=751493.3333333334, ans=0.125 2023-10-02 04:31:33,207 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.3.encoder.layers.0.feed_forward1.out_whiten, num_groups=1, num_channels=512, metric=10.97 vs. limit=15.0 2023-10-02 04:31:35,396 WARNING [train.py:1204] (2/4) Exclude cut with ID _28783_210_7943_1_1532671164808_7155479_293-12188-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 04:31:36,082 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.3.encoder.layers.2.nonlin_attention.whiten2, num_groups=1, num_channels=512, metric=7.52 vs. limit=15.0 2023-10-02 04:31:38,524 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.self_attn_weights.pos_emb_skip_rate, batch_count=751560.0, ans=0.0 2023-10-02 04:31:41,123 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_17613_210_2151_1_1531137569_2366160_235-320304-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 04:31:42,910 WARNING [train.py:1204] (2/4) Exclude cut with ID _14349_210_8341_1_1530615710818_3442470_170-336541-0 from training. Duration: 0.86 2023-10-02 04:31:42,969 WARNING [train.py:1204] (2/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_375-218677-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 04:31:43,015 WARNING [train.py:1204] (2/4) Exclude cut with ID _28317_210_12742_1_1532480302381_8510325_829-202956-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 04:31:47,844 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9029W0436-509823 from training. Duration: 0.768 2023-10-02 04:31:49,262 WARNING [train.py:1204] (2/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_231-112214-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 04:31:55,287 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9059W0470-524883 from training. Duration: 0.3415625 2023-10-02 04:32:00,585 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_43266_210_1794_1_1533521789_4497218_206-79512-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 04:32:01,986 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_294-163793-0_sp0.9 from training. Duration: 0.911125 2023-10-02 04:32:02,016 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6290_210_3273_1_1527595224_1282787_115-87095-0_sp1.1 from training. Duration: 0.5 2023-10-02 04:32:02,057 WARNING [train.py:1204] (2/4) Exclude cut with ID _14100_210_8341_1_1530529320613_3678069_279-55760-0_sp1.1 from training. Duration: 0.8454375 2023-10-02 04:32:04,987 WARNING [train.py:1204] (2/4) Exclude cut with ID _14621_210_1925_1_1530686901185_3675420_419-234200-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 04:32:06,278 INFO [train.py:1046] (2/4) Epoch 22, batch 1200, loss[loss=0.1769, simple_loss=0.2662, pruned_loss=0.04377, over 24326.00 frames. ], tot_loss[loss=0.1741, simple_loss=0.2501, pruned_loss=0.04907, over 4699126.18 frames. ], batch size: 74, lr: 4.70e-03, grad_scale: 16.0 2023-10-02 04:32:07,345 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.1.encoder.layers.1.nonlin_attention.whiten1, num_groups=1, num_channels=192, metric=5.53 vs. limit=10.0 2023-10-02 04:32:09,324 WARNING [train.py:1204] (2/4) Exclude cut with ID _60229_210_27767_1_1534838406158_3655350_353-213656-0_sp1.1 from training. Duration: 0.863625 2023-10-02 04:32:09,340 WARNING [train.py:1204] (2/4) Exclude cut with ID _48678_210_5663_1_1533988830947_3769199_306-255886-0_sp1.1 from training. Duration: 0.7 2023-10-02 04:32:11,448 WARNING [train.py:1204] (2/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_466-3492-0_sp1.1 from training. Duration: 0.97275 2023-10-02 04:32:11,449 WARNING [train.py:1204] (2/4) Exclude cut with ID _8755_210_1876_1_1528628559357_3258209_199-242167-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 04:32:12,766 WARNING [train.py:1204] (2/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_237-89684-0_sp1.1 from training. Duration: 0.87275 2023-10-02 04:32:14,408 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.self_attn_weights.pos_emb_skip_rate, batch_count=751693.3333333334, ans=0.0 2023-10-02 04:32:16,185 WARNING [train.py:1204] (2/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_70-84121-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 04:32:17,639 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7544_210_6753_1_1528587722_3576686_172-171750-0_sp1.1 from training. Duration: 0.5909375 2023-10-02 04:32:17,747 WARNING [train.py:1204] (2/4) Exclude cut with ID _24271_210_12658_1_1531987270022_7190519_40-321822-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 04:32:17,772 WARNING [train.py:1204] (2/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_227-83612-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 04:32:20,451 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9156W0015-572508 from training. Duration: 0.896 2023-10-02 04:32:23,804 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_292-265928-0 from training. Duration: 0.51 2023-10-02 04:32:25,394 WARNING [train.py:1204] (2/4) Exclude cut with ID _14747_210_8341_1_1530685797463_3654089_147-131229-0_sp1.1 from training. Duration: 0.6909375 2023-10-02 04:32:26,863 WARNING [train.py:1204] (2/4) Exclude cut with ID _45297_210_2721_1_1533715351381_3264949_351-147136-0_sp0.9 from training. Duration: 0.9666875 2023-10-02 04:32:28,308 WARNING [train.py:1204] (2/4) Exclude cut with ID _5250_210_5219_1_1527249561806_8692110_1102-324568-0_sp1.1 from training. Duration: 0.97275 2023-10-02 04:32:31,024 WARNING [train.py:1204] (2/4) Exclude cut with ID _18516_210_9206_1_1531704653206_3692406_465-75446-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 04:32:31,026 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9203W0126-595623 from training. Duration: 0.896 2023-10-02 04:32:32,370 WARNING [train.py:1204] (2/4) Exclude cut with ID _55383_210_10635_1_1534468426172_2231669_139-328410-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 04:32:33,919 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.ff2_skip_rate, batch_count=751760.0, ans=0.0 2023-10-02 04:32:34,985 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.564e+02 1.785e+02 1.972e+02 2.130e+02 2.698e+02, threshold=3.944e+02, percent-clipped=0.0 2023-10-02 04:32:38,113 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=751826.6666666666, ans=0.1 2023-10-02 04:32:39,223 WARNING [train.py:1204] (2/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_166-246557-0_sp0.9 from training. Duration: 0.711125 2023-10-02 04:32:39,235 WARNING [train.py:1204] (2/4) Exclude cut with ID _30039_210_13270_1_1532484612900_2725150_239-304872-0_sp1.1 from training. Duration: 0.963625 2023-10-02 04:32:39,257 WARNING [train.py:1204] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_6-226750-0 from training. Duration: 0.94 2023-10-02 04:32:39,974 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_135-5501-0_sp1.1 from training. Duration: 0.8909375 2023-10-02 04:32:44,498 WARNING [train.py:1204] (2/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_359-118926-0 from training. Duration: 0.72 2023-10-02 04:32:48,576 WARNING [train.py:1204] (2/4) Exclude cut with ID _8064_210_5399_1_1528372299291_4282973_4-91037-0 from training. Duration: 0.88 2023-10-02 04:32:48,594 WARNING [train.py:1204] (2/4) Exclude cut with ID _6653_210_2629_1_1527919172441_6732679_415-288492-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 04:32:49,979 WARNING [train.py:1204] (2/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_544-287179-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 04:32:51,411 WARNING [train.py:1204] (2/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_122-32606-0_sp1.1 from training. Duration: 0.92725 2023-10-02 04:32:51,473 WARNING [train.py:1204] (2/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_121-284152-0_sp1.1 from training. Duration: 0.536375 2023-10-02 04:32:53,460 WARNING [train.py:1204] (2/4) Exclude cut with ID _44718_210_5816_1_1534050051869_6469030_350-299477-0_sp1.1 from training. Duration: 0.97275 2023-10-02 04:32:53,473 WARNING [train.py:1204] (2/4) Exclude cut with ID _6653_210_2629_1_1527919172441_6732679_1103-56445-0_sp0.9 from training. Duration: 0.811125 2023-10-02 04:32:53,532 WARNING [train.py:1204] (2/4) Exclude cut with ID _12369_210_4381_1_1530356078581_4388520_64-298917-0_sp0.9 from training. Duration: 0.97775 2023-10-02 04:32:54,838 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_2245_210_2715_1_1525511548_3540254_310-52483-0 from training. Duration: 0.88 2023-10-02 04:32:56,145 WARNING [train.py:1204] (2/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_401-158239-0_sp1.1 from training. Duration: 0.6454375 2023-10-02 04:32:56,187 WARNING [train.py:1204] (2/4) Exclude cut with ID _59502_210_12210_1_1535107024698_1835139_146-162495-0_sp1.1 from training. Duration: 0.836375 2023-10-02 04:32:56,196 WARNING [train.py:1204] (2/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_41-31157-0_sp0.9 from training. Duration: 0.6333125 2023-10-02 04:32:58,953 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_68717_210_3528_1_1535628683_1556759_95-36722-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 04:32:58,960 WARNING [train.py:1204] (2/4) Exclude cut with ID _30196_210_1664_1_1533708496522_7074140_654-162611-0_sp1.1 from training. Duration: 0.92725 2023-10-02 04:33:03,042 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_9302_210_2550_1_1528889962_4655416_638-301620-0_sp0.9 from training. Duration: 0.52225 2023-10-02 04:33:04,520 WARNING [train.py:1204] (2/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_478-162247-0_sp1.1 from training. Duration: 0.7454375 2023-10-02 04:33:07,334 WARNING [train.py:1204] (2/4) Exclude cut with ID _45297_210_2721_1_1533715351381_3264949_353-9541-0 from training. Duration: 0.97 2023-10-02 04:33:11,040 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.ff2_skip_rate, batch_count=751960.0, ans=0.0 2023-10-02 04:33:12,025 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9653W0190-675468 from training. Duration: 0.9935 2023-10-02 04:33:15,244 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_49730_210_13049_1_1534075294_1830676_214-84436-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 04:33:15,966 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.2.encoder.layers.2.self_attn2.whiten, num_groups=1, num_channels=384, metric=18.36 vs. limit=22.5 2023-10-02 04:33:16,670 WARNING [train.py:1204] (2/4) Exclude cut with ID _50093_210_4220_1_1534417141207_3432299_259-322252-0_sp1.1 from training. Duration: 0.836375 2023-10-02 04:33:17,995 WARNING [train.py:1204] (2/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_599-50220-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 04:33:19,355 WARNING [train.py:1204] (2/4) Exclude cut with ID _21878_210_3525_1_1531738772002_7264330_174-109016-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 04:33:20,645 INFO [train.py:1046] (2/4) Epoch 22, batch 1250, loss[loss=0.1703, simple_loss=0.2451, pruned_loss=0.04771, over 23546.00 frames. ], tot_loss[loss=0.1743, simple_loss=0.2503, pruned_loss=0.04917, over 4713640.76 frames. ], batch size: 134, lr: 4.70e-03, grad_scale: 16.0 2023-10-02 04:33:22,141 WARNING [train.py:1204] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_665-196155-0 from training. Duration: 0.67 2023-10-02 04:33:26,303 WARNING [train.py:1204] (2/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_656-188361-0_sp1.1 from training. Duration: 0.7909375 2023-10-02 04:33:26,370 WARNING [train.py:1204] (2/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_60-18123-0_sp1.1 from training. Duration: 0.8 2023-10-02 04:33:28,139 WARNING [train.py:1204] (2/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_535-14410-0 from training. Duration: 0.47 2023-10-02 04:33:28,266 WARNING [train.py:1204] (2/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_94-126128-0_sp1.1 from training. Duration: 0.7818125 2023-10-02 04:33:29,616 WARNING [train.py:1204] (2/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_356-31216-0_sp1.1 from training. Duration: 0.6818125 2023-10-02 04:33:29,905 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.conv_skip_rate, batch_count=752026.6666666666, ans=0.0 2023-10-02 04:33:32,419 WARNING [train.py:1204] (2/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_443-30034-0_sp1.1 from training. Duration: 0.5818125 2023-10-02 04:33:33,764 WARNING [train.py:1204] (2/4) Exclude cut with ID _26907_210_7234_1_1532221740694_3205263_337-2494-0_sp1.1 from training. Duration: 0.8 2023-10-02 04:33:33,844 WARNING [train.py:1204] (2/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_49-97158-0_sp0.9 from training. Duration: 0.8444375 2023-10-02 04:33:33,854 WARNING [train.py:1204] (2/4) Exclude cut with ID _7877_210_6014_1_1528286358099_3750136_553-170128-0_sp1.1 from training. Duration: 0.963625 2023-10-02 04:33:34,008 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.1.nonlin_attention.balancer.prob, batch_count=752093.3333333334, ans=0.125 2023-10-02 04:33:37,758 WARNING [train.py:1204] (2/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_202-293523-0_sp0.9 from training. Duration: 0.8 2023-10-02 04:33:39,881 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.nonlin_attention.balancer.max_positive, batch_count=752093.3333333334, ans=0.95 2023-10-02 04:33:42,911 WARNING [train.py:1204] (2/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_188-171673-0_sp0.9 from training. Duration: 0.6444375 2023-10-02 04:33:42,934 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_120-41668-0_sp1.1 from training. Duration: 0.636375 2023-10-02 04:33:42,938 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_40223_210_6228_1_1533298404_4812267_613-78769-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 04:33:44,361 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_53861_210_5399_1_1534422156_4272077_150-336899-0_sp1.1 from training. Duration: 0.963625 2023-10-02 04:33:46,305 WARNING [train.py:1204] (2/4) Exclude cut with ID _14459_210_2228_1_1531443641946_7516700_1146-56843-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 04:33:49,263 WARNING [train.py:1204] (2/4) Exclude cut with ID _39867_210_9826_1_1533553222030_9344259_1194-299928-0_sp1.1 from training. Duration: 0.92725 2023-10-02 04:33:49,366 WARNING [train.py:1204] (2/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_214-291369-0_sp0.9 from training. Duration: 0.7 2023-10-02 04:33:54,783 WARNING [train.py:1204] (2/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_358-215558-0 from training. Duration: 0.93 2023-10-02 04:33:54,841 WARNING [train.py:1204] (2/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_577-287150-0_sp0.9 from training. Duration: 0.911125 2023-10-02 04:33:56,535 WARNING [train.py:1204] (2/4) Exclude cut with ID _60860_210_8254_1_1534896015875_3557169_229-187367-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 04:33:56,582 WARNING [train.py:1204] (2/4) Exclude cut with ID _6275_210_2084_1_1528110062213_7669049_857-298942-0 from training. Duration: 0.85 2023-10-02 04:33:58,387 WARNING [train.py:1204] (2/4) Exclude cut with ID _40137_210_12210_1_1533290484046_3795180_249-187059-0_sp1.1 from training. Duration: 0.8 2023-10-02 04:33:58,394 WARNING [train.py:1204] (2/4) Exclude cut with ID ID0171W0508-765603 from training. Duration: 0.541 2023-10-02 04:33:58,409 WARNING [train.py:1204] (2/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_521-219439-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 04:33:58,414 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_40223_210_6228_1_1533298404_4812267_741-302616-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 04:34:01,229 WARNING [train.py:1204] (2/4) Exclude cut with ID _65293_210_9070_1_1535545549765_4004210_438-154735-0_sp1.1 from training. Duration: 0.92725 2023-10-02 04:34:03,987 WARNING [train.py:1204] (2/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_564-132950-0_sp1.1 from training. Duration: 0.92725 2023-10-02 04:34:05,239 WARNING [train.py:1204] (2/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_291-270881-0_sp1.1 from training. Duration: 0.8818125 2023-10-02 04:34:06,680 WARNING [train.py:1204] (2/4) Exclude cut with ID _60229_210_27767_1_1534838406158_3655350_353-213656-0 from training. Duration: 0.95 2023-10-02 04:34:06,691 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_3133_210_2087_1_1526106446_2324690_243-143119-0 from training. Duration: 0.77 2023-10-02 04:34:08,000 WARNING [train.py:1204] (2/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_690-126306-0 from training. Duration: 0.78 2023-10-02 04:34:11,338 WARNING [train.py:1204] (2/4) Exclude cut with ID _30304_210_6319_1_1532476827261_7582060_347-95003-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 04:34:11,444 WARNING [train.py:1204] (2/4) Exclude cut with ID _2322_210_2667_1_1525604548971_3656299_440-80494-0 from training. Duration: 0.88 2023-10-02 04:34:11,466 WARNING [train.py:1204] (2/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_370-229434-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 04:34:16,001 WARNING [train.py:1204] (2/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_124-345840-0_sp1.1 from training. Duration: 0.47275 2023-10-02 04:34:16,017 WARNING [train.py:1204] (2/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_165-123581-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 04:34:17,875 WARNING [train.py:1204] (2/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_453-282588-0 from training. Duration: 0.98 2023-10-02 04:34:17,899 WARNING [train.py:1204] (2/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_299-34413-0_sp0.9 from training. Duration: 0.72225 2023-10-02 04:34:19,086 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_40036_210_13405_1_1533648647_1315388_48-146016-0_sp1.1 from training. Duration: 0.8454375 2023-10-02 04:34:19,114 WARNING [train.py:1204] (2/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_99-167392-0_sp0.9 from training. Duration: 0.67775 2023-10-02 04:34:20,484 WARNING [train.py:1204] (2/4) Exclude cut with ID _48677_210_5663_1_1533902405919_3892989_37-207114-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 04:34:21,900 WARNING [train.py:1204] (2/4) Exclude cut with ID _29857_210_20034_1_1532395886753_3798160_335-163562-0 from training. Duration: 0.85 2023-10-02 04:34:24,656 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_36492_210_7927_1_1533261987_4790834_703-343351-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 04:34:24,746 WARNING [train.py:1204] (2/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_80-147301-0_sp0.9 from training. Duration: 0.9333125 2023-10-02 04:34:26,067 WARNING [train.py:1204] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_218-295097-0_sp0.9 from training. Duration: 0.8555625 2023-10-02 04:34:27,519 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_2295_210_2087_1_1525500993_2407863_116-238894-0_sp1.1 from training. Duration: 0.636375 2023-10-02 04:34:32,103 WARNING [train.py:1204] (2/4) Exclude cut with ID _3231_210_2106_1_1526205864934_6784220_697-171068-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 04:34:33,370 WARNING [train.py:1204] (2/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_102-77064-0 from training. Duration: 0.93 2023-10-02 04:34:34,838 INFO [train.py:1046] (2/4) Epoch 22, batch 1300, loss[loss=0.158, simple_loss=0.2365, pruned_loss=0.03971, over 24605.00 frames. ], tot_loss[loss=0.175, simple_loss=0.251, pruned_loss=0.04946, over 4718256.80 frames. ], batch size: 60, lr: 4.70e-03, grad_scale: 16.0 2023-10-02 04:34:36,229 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_13091_210_7925_1_1530609447_8987354_1186-261957-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 04:34:36,335 WARNING [train.py:1204] (2/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_91-277684-0_sp1.1 from training. Duration: 0.67275 2023-10-02 04:34:37,705 WARNING [train.py:1204] (2/4) Exclude cut with ID _31088_210_16446_1_1532520012284_3440919_159-313751-0_sp1.1 from training. Duration: 0.936375 2023-10-02 04:34:39,134 WARNING [train.py:1204] (2/4) Exclude cut with ID _42326_210_7770_1_1533814282531_3707269_227-203721-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 04:34:40,984 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_3531_210_3528_1_1526294203_5216456_309-227302-0_sp1.1 from training. Duration: 0.62725 2023-10-02 04:34:42,820 WARNING [train.py:1204] (2/4) Exclude cut with ID _14087_210_2253_1_1530529224512_3014029_226-242140-0 from training. Duration: 0.81 2023-10-02 04:34:50,211 WARNING [train.py:1204] (2/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_122-109023-0_sp1.1 from training. Duration: 0.6909375 2023-10-02 04:34:50,292 WARNING [train.py:1204] (2/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_660-211088-0_sp0.9 from training. Duration: 0.688875 2023-10-02 04:34:51,636 WARNING [train.py:1204] (2/4) Exclude cut with ID _39290_210_4220_1_1533862778760_3075118_92-311600-0 from training. Duration: 0.88 2023-10-02 04:34:55,822 WARNING [train.py:1204] (2/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_390-339137-0_sp1.1 from training. Duration: 0.5090625 2023-10-02 04:34:58,509 WARNING [train.py:1204] (2/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_449-341946-0_sp1.1 from training. Duration: 0.963625 2023-10-02 04:34:59,964 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_68095_210_20185_1_1535718226_8888855_884-327007-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 04:35:00,058 WARNING [train.py:1204] (2/4) Exclude cut with ID _42326_210_7770_1_1533814282531_3707269_308-222998-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 04:35:01,949 WARNING [train.py:1204] (2/4) Exclude cut with ID _40399_210_4353_1_1533305658194_3279406_344-238708-0_sp1.1 from training. Duration: 0.963625 2023-10-02 04:35:03,257 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.531e+02 1.912e+02 2.109e+02 2.278e+02 3.691e+02, threshold=4.217e+02, percent-clipped=0.0 2023-10-02 04:35:03,356 WARNING [train.py:1204] (2/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_87-131427-0_sp1.1 from training. Duration: 0.7090625 2023-10-02 04:35:03,407 WARNING [train.py:1204] (2/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_110-130961-0_sp0.9 from training. Duration: 0.52225 2023-10-02 04:35:03,446 WARNING [train.py:1204] (2/4) Exclude cut with ID _55153_210_2253_1_1534384786677_3162096_148-79022-0 from training. Duration: 0.96 2023-10-02 04:35:03,716 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.balancer2.prob, batch_count=752493.3333333334, ans=0.125 2023-10-02 04:35:07,834 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.conv_module2.balancer1.prob, batch_count=752493.3333333334, ans=0.125 2023-10-02 04:35:09,009 WARNING [train.py:1204] (2/4) Exclude cut with ID _9679_210_7497_1_1529129212906_4363929_512-313889-0_sp1.1 from training. Duration: 0.736375 2023-10-02 04:35:09,021 WARNING [train.py:1204] (2/4) Exclude cut with ID _7970_210_1883_1_1528455616296_3472230_25-178407-0_sp1.1 from training. Duration: 0.6454375 2023-10-02 04:35:11,687 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_3133_210_2087_1_1526106446_2324690_246-125000-0 from training. Duration: 0.62 2023-10-02 04:35:13,005 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.3.encoder.layers.0.self_attn2.whiten, num_groups=1, num_channels=512, metric=14.95 vs. limit=22.5 2023-10-02 04:35:13,577 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_15126_210_6753_1_1530778677_2591185_113-157651-0_sp0.9 from training. Duration: 0.7555625 2023-10-02 04:35:15,513 WARNING [train.py:1204] (2/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_105-63309-0_sp1.1 from training. Duration: 0.8909375 2023-10-02 04:35:18,330 WARNING [train.py:1204] (2/4) Exclude cut with ID _29877_210_6228_1_1532397791106_5276149_578-177271-0_sp1.1 from training. Duration: 0.936375 2023-10-02 04:35:18,381 WARNING [train.py:1204] (2/4) Exclude cut with ID _44831_210_13427_1_1533816011549_3689035_32-188147-0 from training. Duration: 0.96 2023-10-02 04:35:18,441 WARNING [train.py:1204] (2/4) Exclude cut with ID _12369_210_4381_1_1530356078581_4388520_300-254337-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 04:35:20,448 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_128-241008-0 from training. Duration: 0.94 2023-10-02 04:35:21,819 WARNING [train.py:1204] (2/4) Exclude cut with ID _21878_210_3525_1_1531738772002_7264330_53-279178-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 04:35:27,119 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_41452_210_7291_1_1533437687_3941281_273-334088-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 04:35:27,122 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_128-241008-0_sp1.1 from training. Duration: 0.8545625 2023-10-02 04:35:29,952 WARNING [train.py:1204] (2/4) Exclude cut with ID _8394_210_6702_1_1528590504316_3540189_379-49457-0 from training. Duration: 0.87 2023-10-02 04:35:30,001 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_105-114797-0 from training. Duration: 0.84 2023-10-02 04:35:31,414 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_14928_210_5632_1_1530773461_4714000_454-112198-0 from training. Duration: 0.66 2023-10-02 04:35:37,508 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_456-22141-0_sp1.1 from training. Duration: 0.8 2023-10-02 04:35:40,295 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_115-233929-0 from training. Duration: 0.72 2023-10-02 04:35:41,675 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_616-16795-0_sp1.1 from training. Duration: 0.963625 2023-10-02 04:35:47,791 WARNING [train.py:1204] (2/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_448-212135-0 from training. Duration: 0.91 2023-10-02 04:35:48,370 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.3.encoder.layers.2.self_attn1.whiten, num_groups=1, num_channels=512, metric=16.55 vs. limit=22.5 2023-10-02 04:35:49,072 INFO [train.py:1046] (2/4) Epoch 22, batch 1350, loss[loss=0.1508, simple_loss=0.2022, pruned_loss=0.04971, over 19646.00 frames. ], tot_loss[loss=0.1741, simple_loss=0.2498, pruned_loss=0.04918, over 4722171.95 frames. ], batch size: 389, lr: 4.70e-03, grad_scale: 16.0 2023-10-02 04:35:52,458 WARNING [train.py:1204] (2/4) Exclude cut with ID _46683_210_9642_1_1533730913492_4591116_201-205677-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 04:35:52,718 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.bypass.scale_min, batch_count=752693.3333333334, ans=0.2 2023-10-02 04:35:55,059 WARNING [train.py:1204] (2/4) Exclude cut with ID _64086_210_7994_1_1535270417019_7435949_43-80483-0_sp1.1 from training. Duration: 0.97275 2023-10-02 04:35:56,588 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_240-134124-0_sp1.1 from training. Duration: 0.963625 2023-10-02 04:35:57,927 WARNING [train.py:1204] (2/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_907-120940-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 04:35:58,035 WARNING [train.py:1204] (2/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_848-280957-0_sp1.1 from training. Duration: 0.7909375 2023-10-02 04:35:59,318 WARNING [train.py:1204] (2/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_243-42116-0_sp0.9 from training. Duration: 0.811125 2023-10-02 04:36:03,643 WARNING [train.py:1204] (2/4) Exclude cut with ID _6694_210_4599_1_1527852637490_3965529_116-109398-0_sp0.9 from training. Duration: 0.811125 2023-10-02 04:36:04,923 WARNING [train.py:1204] (2/4) Exclude cut with ID _24314_210_4915_1_1532135078116_7170179_521-258868-0 from training. Duration: 0.79 2023-10-02 04:36:05,030 WARNING [train.py:1204] (2/4) Exclude cut with ID _2220_210_1819_1_1525509109898_3658680_50-99837-0_sp0.9 from training. Duration: 0.6 2023-10-02 04:36:06,411 WARNING [train.py:1204] (2/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_313-134930-0_sp1.1 from training. Duration: 0.8818125 2023-10-02 04:36:09,808 WARNING [train.py:1204] (2/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_125-144174-0 from training. Duration: 0.39 2023-10-02 04:36:11,168 WARNING [train.py:1204] (2/4) Exclude cut with ID _59288_210_5854_1_1535077757774_3412246_275-293897-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 04:36:12,581 WARNING [train.py:1204] (2/4) Exclude cut with ID _40519_210_13794_1_1533715228655_7337390_722-52880-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 04:36:12,596 WARNING [train.py:1204] (2/4) Exclude cut with ID _7537_210_2084_1_1528502395431_3924359_128-268029-0 from training. Duration: 0.98 2023-10-02 04:36:13,946 WARNING [train.py:1204] (2/4) Exclude cut with ID _8021_210_7994_1_1528376451358_3653069_179-45206-0 from training. Duration: 0.86 2023-10-02 04:36:15,331 WARNING [train.py:1204] (2/4) Exclude cut with ID _13252_210_1925_1_1530671372047_3336909_88-276668-0 from training. Duration: 0.99 2023-10-02 04:36:17,305 WARNING [train.py:1204] (2/4) Exclude cut with ID _22613_210_5219_1_1532253599686_4090170_290-212862-0_sp1.1 from training. Duration: 0.97275 2023-10-02 04:36:17,309 WARNING [train.py:1204] (2/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_465-214726-0 from training. Duration: 0.86 2023-10-02 04:36:22,579 INFO [scaling.py:1118] (2/4) WithLoss: name=encoder.encoders.4.encoder.layers.1.self_attn_weights, loss-sum=0.000e+00 2023-10-02 04:36:27,792 WARNING [train.py:1204] (2/4) Exclude cut with ID _56480_210_4220_1_1534679942079_3075409_138-36031-0_sp1.1 from training. Duration: 0.97275 2023-10-02 04:36:36,402 WARNING [train.py:1204] (2/4) Exclude cut with ID _58605_210_12210_1_1534723195939_3904259_300-285878-0_sp1.1 from training. Duration: 0.97275 2023-10-02 04:36:36,425 WARNING [train.py:1204] (2/4) Exclude cut with ID _40901_210_5665_1_1533363789958_3856130_114-340229-0_sp1.1 from training. Duration: 0.936375 2023-10-02 04:36:36,471 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_489-230703-0 from training. Duration: 0.85 2023-10-02 04:36:39,688 WARNING [train.py:1204] (2/4) Exclude cut with ID _66873_210_5517_1_1535454136506_4293370_322-81243-0_sp1.1 from training. Duration: 0.936375 2023-10-02 04:36:41,000 WARNING [train.py:1204] (2/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_271-269084-0 from training. Duration: 0.83 2023-10-02 04:36:41,012 WARNING [train.py:1204] (2/4) Exclude cut with ID _13252_210_1925_1_1530671372047_3336909_166-314607-0_sp0.9 from training. Duration: 0.6 2023-10-02 04:36:41,090 WARNING [train.py:1204] (2/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_340-343302-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 04:36:45,803 WARNING [train.py:1204] (2/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_286-112015-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 04:36:47,225 WARNING [train.py:1204] (2/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_328-264776-0 from training. Duration: 0.67 2023-10-02 04:36:48,676 WARNING [train.py:1204] (2/4) Exclude cut with ID _4866_210_2577_1_1527404412284_4364659_256-306830-0_sp0.9 from training. Duration: 0.9666875 2023-10-02 04:36:54,198 WARNING [train.py:1204] (2/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_239-316802-0 from training. Duration: 0.69 2023-10-02 04:36:54,496 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.feed_forward1.hidden_balancer.prob, batch_count=752960.0, ans=0.125 2023-10-02 04:36:55,642 WARNING [train.py:1204] (2/4) Exclude cut with ID _29857_210_20034_1_1532395886753_3798160_279-185801-0 from training. Duration: 0.91 2023-10-02 04:37:01,152 WARNING [train.py:1204] (2/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_656-188361-0 from training. Duration: 0.87 2023-10-02 04:37:01,285 WARNING [train.py:1204] (2/4) Exclude cut with ID _44718_210_5816_1_1534050051869_6469030_100-230287-0_sp1.1 from training. Duration: 0.936375 2023-10-02 04:37:02,556 INFO [train.py:1046] (2/4) Epoch 22, batch 1400, loss[loss=0.1842, simple_loss=0.2525, pruned_loss=0.05795, over 23805.00 frames. ], tot_loss[loss=0.1729, simple_loss=0.2484, pruned_loss=0.04873, over 4708246.05 frames. ], batch size: 212, lr: 4.70e-03, grad_scale: 16.0 2023-10-02 04:37:04,095 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8275_210_4198_1_1528455320_3892571_276-215622-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 04:37:04,144 WARNING [train.py:1204] (2/4) Exclude cut with ID _30304_210_6319_1_1532476827261_7582060_698-309430-0_sp1.1 from training. Duration: 0.963625 2023-10-02 04:37:10,036 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_365-217119-0 from training. Duration: 0.63 2023-10-02 04:37:11,459 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7264_210_4938_1_1527939911_8132095_800-91995-0 from training. Duration: 0.86 2023-10-02 04:37:11,763 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.bypass_mid.scale_min, batch_count=753026.6666666666, ans=0.2 2023-10-02 04:37:20,010 WARNING [train.py:1204] (2/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_49-97158-0_sp1.1 from training. Duration: 0.6909375 2023-10-02 04:37:21,321 WARNING [train.py:1204] (2/4) Exclude cut with ID _61132_210_9969_1_1535021792964_3755419_61-57720-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 04:37:24,035 WARNING [train.py:1204] (2/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_345-306832-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 04:37:24,078 WARNING [train.py:1204] (2/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_164-224052-0_sp0.9 from training. Duration: 0.77775 2023-10-02 04:37:29,500 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_34886_210_10633_1_1533203882_1755661_76-156096-0_sp1.1 from training. Duration: 0.7909375 2023-10-02 04:37:29,587 WARNING [train.py:1204] (2/4) Exclude cut with ID 4133-6541-0027-40495-0_sp1.1 from training. Duration: 0.9681875 2023-10-02 04:37:30,876 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.430e+02 1.782e+02 2.050e+02 2.266e+02 3.252e+02, threshold=4.099e+02, percent-clipped=0.0 2023-10-02 04:37:39,824 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_514-211165-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 04:37:39,899 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_41211_210_2544_1_1533348859_3702910_97-212695-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 04:37:44,451 WARNING [train.py:1204] (2/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_406-45086-0 from training. Duration: 0.8 2023-10-02 04:37:45,792 WARNING [train.py:1204] (2/4) Exclude cut with ID _65263_210_22356_1_1535763598691_3921037_49-222917-0_sp1.1 from training. Duration: 0.836375 2023-10-02 04:37:46,469 WARNING [train.py:1204] (2/4) Exclude cut with ID _4866_210_2577_1_1527404412284_4364659_352-157930-0_sp1.1 from training. Duration: 0.736375 2023-10-02 04:37:47,652 WARNING [train.py:1204] (2/4) Exclude cut with ID _4366_210_1388_1_1526792053633_3613819_231-211402-0_sp1.1 from training. Duration: 0.8545625 2023-10-02 04:37:49,501 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_98-21576-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 04:37:49,573 WARNING [train.py:1204] (2/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_348-127789-0_sp0.9 from training. Duration: 0.9444375 2023-10-02 04:37:49,588 WARNING [train.py:1204] (2/4) Exclude cut with ID _45871_210_5665_1_1533864614245_3296060_98-329557-0_sp1.1 from training. Duration: 0.97275 2023-10-02 04:37:49,641 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6846_210_6753_1_1527850767_4295158_2-315073-0_sp1.1 from training. Duration: 0.8 2023-10-02 04:37:52,267 WARNING [train.py:1204] (2/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_145-137790-0 from training. Duration: 0.83 2023-10-02 04:37:52,288 WARNING [train.py:1204] (2/4) Exclude cut with ID _14603_210_4220_1_1530615688441_3097319_133-158308-0_sp0.9 from training. Duration: 0.9333125 2023-10-02 04:37:56,338 WARNING [train.py:1204] (2/4) Exclude cut with ID _14898_210_9206_1_1530766812377_4177190_320-11758-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 04:37:59,150 WARNING [train.py:1204] (2/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_406-45086-0_sp0.9 from training. Duration: 0.888875 2023-10-02 04:38:07,555 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_5438_210_1794_1_1527336687_2600009_104-288274-0 from training. Duration: 0.58 2023-10-02 04:38:07,636 WARNING [train.py:1204] (2/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_108-53929-0_sp0.9 from training. Duration: 0.6555625 2023-10-02 04:38:09,409 WARNING [train.py:1204] (2/4) Exclude cut with ID _30800_210_22382_1_1532499347904_4515959_444-286390-0_sp1.1 from training. Duration: 0.963625 2023-10-02 04:38:11,273 WARNING [train.py:1204] (2/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_125-144174-0_sp0.9 from training. Duration: 0.4333125 2023-10-02 04:38:13,899 WARNING [train.py:1204] (2/4) Exclude cut with ID _55209_210_1531_1_1534675828996_4597160_118-345621-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 04:38:15,366 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_45886_210_17024_1_1533709215_471063_7-79165-0_sp1.1 from training. Duration: 0.8909375 2023-10-02 04:38:17,293 INFO [train.py:1046] (2/4) Epoch 22, batch 1450, loss[loss=0.157, simple_loss=0.2449, pruned_loss=0.0346, over 24652.00 frames. ], tot_loss[loss=0.1727, simple_loss=0.248, pruned_loss=0.04869, over 4710419.78 frames. ], batch size: 65, lr: 4.70e-03, grad_scale: 16.0 2023-10-02 04:38:18,899 WARNING [train.py:1204] (2/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_175-163894-0_sp0.9 from training. Duration: 0.82225 2023-10-02 04:38:21,034 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_50188_210_4198_1_1534503621_3646041_353-81682-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 04:38:21,040 WARNING [train.py:1204] (2/4) Exclude cut with ID _17875_210_2087_1_1531224012907_1821339_175-80565-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 04:38:21,049 WARNING [train.py:1204] (2/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_159-328157-0_sp1.1 from training. Duration: 0.47275 2023-10-02 04:38:27,083 WARNING [train.py:1204] (2/4) Exclude cut with ID _60057_210_10640_1_1534903065793_3333150_137-267579-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 04:38:27,136 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_92-46009-0_sp1.1 from training. Duration: 0.4909375 2023-10-02 04:38:28,461 WARNING [train.py:1204] (2/4) Exclude cut with ID _53203_210_2087_1_1534246218309_475590_48-68507-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 04:38:29,811 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_28211_210_6388_1_1532861506_4523224_128-328162-0 from training. Duration: 0.9 2023-10-02 04:38:29,889 WARNING [train.py:1204] (2/4) Exclude cut with ID _4697_210_2069_1_1526815801953_3823098_51-1049-0_sp1.1 from training. Duration: 0.6818125 2023-10-02 04:38:31,231 WARNING [train.py:1204] (2/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_383-316000-0 from training. Duration: 0.9 2023-10-02 04:38:32,511 WARNING [train.py:1204] (2/4) Exclude cut with ID _4779_210_2084_1_1527318022944_3914893_185-295153-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 04:38:32,583 WARNING [train.py:1204] (2/4) Exclude cut with ID _3231_210_2106_1_1526205864934_6784220_171-256004-0_sp0.9 from training. Duration: 0.9555625 2023-10-02 04:38:32,584 WARNING [train.py:1204] (2/4) Exclude cut with ID _41314_210_12651_1_1533459476543_3766460_87-272068-0 from training. Duration: 0.83 2023-10-02 04:38:33,966 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_164-152777-0_sp1.1 from training. Duration: 0.92725 2023-10-02 04:38:35,344 WARNING [train.py:1204] (2/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_18-130336-0_sp0.9 from training. Duration: 0.87775 2023-10-02 04:38:35,403 WARNING [train.py:1204] (2/4) Exclude cut with ID _14886_210_4220_1_1530702020355_3516190_175-229073-0_sp1.1 from training. Duration: 0.4090625 2023-10-02 04:38:35,415 WARNING [train.py:1204] (2/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_791-45149-0_sp0.9 from training. Duration: 0.9555625 2023-10-02 04:38:36,844 WARNING [train.py:1204] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_468-189058-0_sp1.1 from training. Duration: 0.87275 2023-10-02 04:38:39,419 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8278_210_2550_1_1528637099_4220413_32-142780-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 04:38:42,376 WARNING [train.py:1204] (2/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_146-218763-0_sp0.9 from training. Duration: 0.9555625 2023-10-02 04:38:42,597 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.0.conv_module2.balancer2.prob, batch_count=753426.6666666666, ans=0.125 2023-10-02 04:38:45,610 WARNING [train.py:1204] (2/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_348-127789-0_sp1.1 from training. Duration: 0.77275 2023-10-02 04:38:45,614 WARNING [train.py:1204] (2/4) Exclude cut with ID _47790_210_16187_1_1533862814066_4207519_288-285445-0_sp1.1 from training. Duration: 0.936375 2023-10-02 04:38:48,420 WARNING [train.py:1204] (2/4) Exclude cut with ID _14603_210_4220_1_1530615688441_3097319_308-181732-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 04:38:48,462 WARNING [train.py:1204] (2/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_895-117428-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 04:38:50,349 WARNING [train.py:1204] (2/4) Exclude cut with ID _9235_210_5219_1_1528891154253_8046120_387-159008-0_sp0.9 from training. Duration: 0.9555625 2023-10-02 04:38:50,838 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.4.encoder.layers.1.conv_module2.whiten, num_groups=1, num_channels=384, metric=2.58 vs. limit=15.0 2023-10-02 04:38:51,661 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_37351_210_7925_1_1533012931_3812113_265-85780-0_sp0.9 from training. Duration: 0.97775 2023-10-02 04:38:51,681 WARNING [train.py:1204] (2/4) Exclude cut with ID _40551_210_10635_1_1533776424632_3416179_361-3964-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 04:38:51,735 WARNING [train.py:1204] (2/4) Exclude cut with ID _25882_210_3036_1_1532775665815_6479510_485-122742-0_sp1.1 from training. Duration: 0.97275 2023-10-02 04:38:52,029 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.out_combiner.scale_min, batch_count=753493.3333333334, ans=0.2 2023-10-02 04:38:57,135 WARNING [train.py:1204] (2/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_98-51676-0 from training. Duration: 0.73 2023-10-02 04:38:58,586 WARNING [train.py:1204] (2/4) Exclude cut with ID _30548_210_16040_1_1532608097130_3146969_371-280610-0_sp1.1 from training. Duration: 0.92725 2023-10-02 04:39:01,410 WARNING [train.py:1204] (2/4) Exclude cut with ID IC0671W0081-331924 from training. Duration: 0.896 2023-10-02 04:39:02,786 WARNING [train.py:1204] (2/4) Exclude cut with ID _40284_210_2084_1_1533513679814_3704279_380-160775-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 04:39:04,216 WARNING [train.py:1204] (2/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_57-270037-0_sp1.1 from training. Duration: 0.7 2023-10-02 04:39:05,580 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_853-150237-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 04:39:08,294 WARNING [train.py:1204] (2/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_491-261923-0 from training. Duration: 0.98 2023-10-02 04:39:11,145 WARNING [train.py:1204] (2/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_836-244045-0_sp1.1 from training. Duration: 0.97275 2023-10-02 04:39:12,547 WARNING [train.py:1204] (2/4) Exclude cut with ID _14880_210_8895_1_1530680426655_3795259_289-212817-0 from training. Duration: 0.69 2023-10-02 04:39:14,050 WARNING [train.py:1204] (2/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_303-340446-0 from training. Duration: 0.9 2023-10-02 04:39:15,489 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_37152_210_9697_1_1533106573_3844188_418-41736-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 04:39:18,675 WARNING [train.py:1204] (2/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_371-18169-0_sp1.1 from training. Duration: 0.7818125 2023-10-02 04:39:18,720 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_187-219984-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 04:39:20,125 WARNING [train.py:1204] (2/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_63-267585-0 from training. Duration: 0.9 2023-10-02 04:39:23,420 WARNING [train.py:1204] (2/4) Exclude cut with ID _27013_210_1389_1_1532437173240_3252649_222-125573-0 from training. Duration: 0.98 2023-10-02 04:39:23,621 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.self_attn_weights.pos_emb_skip_rate, batch_count=753626.6666666666, ans=0.0 2023-10-02 04:39:25,335 WARNING [train.py:1204] (2/4) Exclude cut with ID _14821_210_8750_1_1530680503991_6829359_189-297451-0 from training. Duration: 0.55 2023-10-02 04:39:25,429 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_557-163391-0_sp1.1 from training. Duration: 0.97275 2023-10-02 04:39:26,898 WARNING [train.py:1204] (2/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_65-88242-0_sp1.1 from training. Duration: 0.5090625 2023-10-02 04:39:32,467 INFO [train.py:1046] (2/4) Epoch 22, batch 1500, loss[loss=0.1707, simple_loss=0.2492, pruned_loss=0.04614, over 20955.00 frames. ], tot_loss[loss=0.1731, simple_loss=0.2488, pruned_loss=0.04871, over 4712023.84 frames. ], batch size: 45, lr: 4.69e-03, grad_scale: 16.0 2023-10-02 04:39:36,609 WARNING [train.py:1204] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_218-295097-0 from training. Duration: 0.77 2023-10-02 04:39:36,637 WARNING [train.py:1204] (2/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_686-247600-0_sp1.1 from training. Duration: 0.836375 2023-10-02 04:39:36,648 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_397-147196-0_sp0.9 from training. Duration: 0.888875 2023-10-02 04:39:38,013 WARNING [train.py:1204] (2/4) Exclude cut with ID _53950_210_5816_1_1534417264919_3862951_237-136449-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 04:39:38,084 WARNING [train.py:1204] (2/4) Exclude cut with ID _3120_210_3525_1_1526036143112_4072980_74-178400-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 04:39:38,307 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.feed_forward1.out_proj.dropout_p, batch_count=753693.3333333334, ans=0.1 2023-10-02 04:39:39,414 WARNING [train.py:1204] (2/4) Exclude cut with ID _5166_210_2084_1_1527073197160_4087993_309-222545-0_sp0.9 from training. Duration: 0.9333125 2023-10-02 04:39:40,818 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7544_210_6753_1_1528587722_3576686_172-171750-0 from training. Duration: 0.65 2023-10-02 04:39:42,205 WARNING [train.py:1204] (2/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_3-237434-0_sp0.9 from training. Duration: 0.8444375 2023-10-02 04:39:42,250 WARNING [train.py:1204] (2/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_101-293367-0_sp0.9 from training. Duration: 0.7 2023-10-02 04:39:42,260 WARNING [train.py:1204] (2/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_818-213552-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 04:39:43,577 WARNING [train.py:1204] (2/4) Exclude cut with ID _14349_210_8341_1_1530615710818_3442470_115-285975-0_sp1.1 from training. Duration: 0.7818125 2023-10-02 04:39:44,894 WARNING [train.py:1204] (2/4) Exclude cut with ID _30800_210_22382_1_1532499347904_4515959_291-309749-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 04:39:46,887 WARNING [train.py:1204] (2/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_715-3462-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 04:39:51,482 WARNING [train.py:1204] (2/4) Exclude cut with ID _59502_210_12210_1_1535107024698_1835139_13-331827-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 04:39:51,499 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_4528_210_3273_1_1527076654_3276777_342-168068-0 from training. Duration: 0.87 2023-10-02 04:39:52,837 WARNING [train.py:1204] (2/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_215-141059-0_sp0.9 from training. Duration: 0.688875 2023-10-02 04:39:52,869 WARNING [train.py:1204] (2/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_106-286130-0_sp1.1 from training. Duration: 0.8909375 2023-10-02 04:39:52,946 WARNING [train.py:1204] (2/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_480-77655-0_sp1.1 from training. Duration: 0.97275 2023-10-02 04:39:53,208 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.feed_forward3.hidden_balancer.prob, batch_count=753760.0, ans=0.125 2023-10-02 04:39:56,376 WARNING [train.py:1204] (2/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_848-280957-0 from training. Duration: 0.87 2023-10-02 04:40:00,496 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.485e+02 1.944e+02 2.195e+02 2.609e+02 5.119e+02, threshold=4.390e+02, percent-clipped=1.0 2023-10-02 04:40:00,609 WARNING [train.py:1204] (2/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_122-109023-0 from training. Duration: 0.76 2023-10-02 04:40:01,975 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_11305_210_1794_1_1529750428_7178148_645-233480-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 04:40:02,028 WARNING [train.py:1204] (2/4) Exclude cut with ID _7562_210_2084_1_1528887767916_3946229_345-169156-0 from training. Duration: 0.58 2023-10-02 04:40:04,813 WARNING [train.py:1204] (2/4) Exclude cut with ID _7562_210_2084_1_1528887767916_3946229_345-169156-0_sp1.1 from training. Duration: 0.52725 2023-10-02 04:40:07,364 WARNING [train.py:1204] (2/4) Exclude cut with ID _13253_210_1925_1_1530757775819_3577873_253-246386-0_sp1.1 from training. Duration: 0.8181875 2023-10-02 04:40:07,602 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.nonlin_attention.balancer.prob, batch_count=753826.6666666666, ans=0.125 2023-10-02 04:40:08,686 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_136-264444-0_sp1.1 from training. Duration: 0.97275 2023-10-02 04:40:08,699 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_2168_210_2290_1_1525606920_1849400_136-222991-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 04:40:10,127 WARNING [train.py:1204] (2/4) Exclude cut with ID _7852_210_5219_1_1528286471316_8139400_242-105336-0 from training. Duration: 0.72 2023-10-02 04:40:10,189 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_5960_210_1794_1_1527403865_3990999_515-2613-0_sp1.1 from training. Duration: 0.8 2023-10-02 04:40:11,494 WARNING [train.py:1204] (2/4) Exclude cut with ID _39688_210_19596_1_1533371626725_4154159_551-167834-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 04:40:11,546 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_43802_210_2290_1_1533516203_8829504_85-195240-0 from training. Duration: 0.87 2023-10-02 04:40:11,604 WARNING [train.py:1204] (2/4) Exclude cut with ID _63171_210_15488_1_1535104792794_3295110_113-113534-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 04:40:16,395 WARNING [train.py:1204] (2/4) Exclude cut with ID _8833_210_5219_1_1528592665210_7553059_319-349181-0_sp1.1 from training. Duration: 0.9 2023-10-02 04:40:16,396 WARNING [train.py:1204] (2/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_225-43381-0 from training. Duration: 0.94 2023-10-02 04:40:22,415 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_5959_210_1794_1_1527400400_3445726_412-44052-0_sp0.9 from training. Duration: 0.8555625 2023-10-02 04:40:23,785 WARNING [train.py:1204] (2/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_356-167816-0_sp1.1 from training. Duration: 0.6545625 2023-10-02 04:40:24,069 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.attention_skip_rate, batch_count=753893.3333333334, ans=0.0 2023-10-02 04:40:27,408 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9012W0112-501244 from training. Duration: 0.768 2023-10-02 04:40:28,800 WARNING [train.py:1204] (2/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_177-326952-0_sp1.1 from training. Duration: 0.92725 2023-10-02 04:40:28,804 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9016W0116-502789 from training. Duration: 0.896 2023-10-02 04:40:31,498 WARNING [train.py:1204] (2/4) Exclude cut with ID _56235_210_10640_1_1534557335235_4161099_557-204815-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 04:40:31,601 WARNING [train.py:1204] (2/4) Exclude cut with ID _55857_210_2346_1_1534644007831_6898209_279-229469-0_sp1.1 from training. Duration: 0.936375 2023-10-02 04:40:32,905 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9029W0375-509764 from training. Duration: 0.896 2023-10-02 04:40:34,219 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_2295_210_2087_1_1525500993_2407863_234-83104-0_sp0.9 from training. Duration: 0.688875 2023-10-02 04:40:35,924 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.1.bypass_mid.scale_min, batch_count=753960.0, ans=0.2 2023-10-02 04:40:37,118 WARNING [train.py:1204] (2/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_266-137104-0 from training. Duration: 0.65 2023-10-02 04:40:37,241 WARNING [train.py:1204] (2/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_289-163927-0_sp1.1 from training. Duration: 0.92725 2023-10-02 04:40:37,478 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.self_attn_weights.pos_emb_skip_rate, batch_count=753960.0, ans=0.0 2023-10-02 04:40:37,838 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.3.encoder.layers.1.conv_module1.whiten, num_groups=1, num_channels=512, metric=2.59 vs. limit=15.0 2023-10-02 04:40:41,340 WARNING [train.py:1204] (2/4) Exclude cut with ID _12733_210_8341_1_1530167424556_3646380_86-225314-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 04:40:41,360 WARNING [train.py:1204] (2/4) Exclude cut with ID _7877_210_6014_1_1528286358099_3750136_618-284064-0_sp1.1 from training. Duration: 0.92725 2023-10-02 04:40:42,665 WARNING [train.py:1204] (2/4) Exclude cut with ID _55844_210_2346_1_1534471212569_7625707_1017-31469-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 04:40:42,698 WARNING [train.py:1204] (2/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_617-241446-0_sp1.1 from training. Duration: 0.92725 2023-10-02 04:40:42,764 WARNING [train.py:1204] (2/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_866-173440-0_sp1.1 from training. Duration: 0.8181875 2023-10-02 04:40:45,337 INFO [train.py:1046] (2/4) Epoch 22, batch 1550, loss[loss=0.147, simple_loss=0.2282, pruned_loss=0.03285, over 24325.00 frames. ], tot_loss[loss=0.1738, simple_loss=0.2498, pruned_loss=0.04887, over 4710144.87 frames. ], batch size: 56, lr: 4.69e-03, grad_scale: 8.0 2023-10-02 04:40:45,456 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_9460_210_4211_1_1528954587_10049667_480-260269-0 from training. Duration: 0.56 2023-10-02 04:40:47,177 WARNING [train.py:1204] (2/4) Exclude cut with ID _44659_210_2721_1_1534036120061_6614860_626-298884-0 from training. Duration: 0.97 2023-10-02 04:40:47,204 WARNING [train.py:1204] (2/4) Exclude cut with ID _53565_210_10635_1_1534337218335_3820259_287-18120-0_sp1.1 from training. Duration: 0.8818125 2023-10-02 04:40:48,552 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_791-197891-0 from training. Duration: 0.84 2023-10-02 04:40:48,587 WARNING [train.py:1204] (2/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_527-238990-0 from training. Duration: 0.9 2023-10-02 04:40:50,063 WARNING [train.py:1204] (2/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_449-65969-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 04:40:50,221 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.conv_module1.balancer1.prob, batch_count=754026.6666666666, ans=0.125 2023-10-02 04:40:51,460 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_553-341452-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 04:40:51,494 WARNING [train.py:1204] (2/4) Exclude cut with ID _16909_210_5724_1_1531292079912_4118919_300-47574-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 04:40:51,655 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.out_combiner.scale_min, batch_count=754026.6666666666, ans=0.2 2023-10-02 04:40:52,764 WARNING [train.py:1204] (2/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_70-27732-0_sp1.1 from training. Duration: 0.8545625 2023-10-02 04:40:53,540 WARNING [train.py:1204] (2/4) Exclude cut with ID _44718_210_5816_1_1534050051869_6469030_727-28074-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 04:40:53,646 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.0.ff3_skip_rate, batch_count=754026.6666666666, ans=0.0 2023-10-02 04:40:54,791 WARNING [train.py:1204] (2/4) Exclude cut with ID _55857_210_2346_1_1534644007831_6898209_357-123664-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 04:40:58,400 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9123W0004-555964 from training. Duration: 0.896 2023-10-02 04:40:58,419 WARNING [train.py:1204] (2/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_445-145208-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 04:40:58,451 WARNING [train.py:1204] (2/4) Exclude cut with ID _3675_210_4557_1_1526639934408_4182089_456-18305-0_sp1.1 from training. Duration: 0.6909375 2023-10-02 04:40:59,880 WARNING [train.py:1204] (2/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_1-99680-0_sp1.1 from training. Duration: 0.6090625 2023-10-02 04:41:02,502 WARNING [train.py:1204] (2/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_146-282400-0_sp0.9 from training. Duration: 0.788875 2023-10-02 04:41:02,514 WARNING [train.py:1204] (2/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_844-96989-0 from training. Duration: 0.71 2023-10-02 04:41:02,638 WARNING [train.py:1204] (2/4) Exclude cut with ID _29853_210_7497_1_1532505600851_6642110_128-146364-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 04:41:03,883 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_224-322394-0 from training. Duration: 0.66 2023-10-02 04:41:05,273 WARNING [train.py:1204] (2/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_191-59784-0 from training. Duration: 0.88 2023-10-02 04:41:05,286 WARNING [train.py:1204] (2/4) Exclude cut with ID _12556_210_8341_1_1530081024100_7327690_725-193477-0 from training. Duration: 0.98 2023-10-02 04:41:05,329 WARNING [train.py:1204] (2/4) Exclude cut with ID _5166_210_2084_1_1527073197160_4087993_523-79838-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 04:41:06,714 WARNING [train.py:1204] (2/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_398-334558-0_sp1.1 from training. Duration: 0.87275 2023-10-02 04:41:11,027 WARNING [train.py:1204] (2/4) Exclude cut with ID _6456_210_1534_1_1527681634212_3181049_171-36697-0_sp1.1 from training. Duration: 0.936375 2023-10-02 04:41:12,536 WARNING [train.py:1204] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_700-183549-0 from training. Duration: 0.68 2023-10-02 04:41:12,538 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8853_210_3273_1_1528633899_818968_51-187685-0 from training. Duration: 0.69 2023-10-02 04:41:20,041 WARNING [train.py:1204] (2/4) Exclude cut with ID _7719_210_3525_1_1528456153163_4296960_333-110399-0_sp1.1 from training. Duration: 0.87275 2023-10-02 04:41:24,931 WARNING [train.py:1204] (2/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_1022-83740-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 04:41:26,194 WARNING [train.py:1204] (2/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_61-327062-0_sp0.9 from training. Duration: 0.9 2023-10-02 04:41:26,199 WARNING [train.py:1204] (2/4) Exclude cut with ID _47251_210_18628_1_1533814063128_4137250_214-88076-0_sp1.1 from training. Duration: 0.8 2023-10-02 04:41:27,445 WARNING [train.py:1204] (2/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_839-173017-0 from training. Duration: 0.41 2023-10-02 04:41:34,220 WARNING [train.py:1204] (2/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_299-34413-0_sp1.1 from training. Duration: 0.5909375 2023-10-02 04:41:34,328 WARNING [train.py:1204] (2/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_664-159457-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 04:41:37,151 WARNING [train.py:1204] (2/4) Exclude cut with ID _66873_210_5517_1_1535454136506_4293370_483-291760-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 04:41:38,580 WARNING [train.py:1204] (2/4) Exclude cut with ID _30800_210_22382_1_1532499347904_4515959_287-255474-0_sp1.1 from training. Duration: 0.92725 2023-10-02 04:41:39,904 WARNING [train.py:1204] (2/4) Exclude cut with ID _48677_210_5663_1_1533902405919_3892989_111-11439-0_sp1.1 from training. Duration: 0.87275 2023-10-02 04:41:39,915 WARNING [train.py:1204] (2/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_399-156791-0 from training. Duration: 0.83 2023-10-02 04:41:41,242 WARNING [train.py:1204] (2/4) Exclude cut with ID _3992_210_1747_1_1526644816031_3731060_36-147855-0_sp0.9 from training. Duration: 0.8666875 2023-10-02 04:41:42,607 WARNING [train.py:1204] (2/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_442-320317-0_sp1.1 from training. Duration: 0.7454375 2023-10-02 04:41:42,625 WARNING [train.py:1204] (2/4) Exclude cut with ID _55429_210_10626_1_1534467588527_7174143_953-80868-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 04:41:44,018 WARNING [train.py:1204] (2/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_159-328157-0_sp0.9 from training. Duration: 0.57775 2023-10-02 04:41:44,026 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9309W0197-640324 from training. Duration: 0.896 2023-10-02 04:41:46,872 WARNING [train.py:1204] (2/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_361-30822-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 04:41:51,319 WARNING [train.py:1204] (2/4) Exclude cut with ID _5250_210_5219_1_1527249561806_8692110_636-251213-0 from training. Duration: 0.94 2023-10-02 04:41:57,953 WARNING [train.py:1204] (2/4) Exclude cut with ID _49728_210_12651_1_1534041034294_6788150_468-333827-0_sp1.1 from training. Duration: 0.963625 2023-10-02 04:41:58,046 WARNING [train.py:1204] (2/4) Exclude cut with ID _25882_210_3036_1_1532775665815_6479510_623-191759-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 04:41:59,390 WARNING [train.py:1204] (2/4) Exclude cut with ID _5189_210_4557_1_1527248319428_3769229_199-53526-0 from training. Duration: 0.42 2023-10-02 04:42:00,696 INFO [train.py:1046] (2/4) Epoch 22, batch 1600, loss[loss=0.1622, simple_loss=0.2434, pruned_loss=0.04056, over 24601.00 frames. ], tot_loss[loss=0.1746, simple_loss=0.2508, pruned_loss=0.04923, over 4714164.12 frames. ], batch size: 60, lr: 4.69e-03, grad_scale: 16.0 2023-10-02 04:42:00,831 WARNING [train.py:1204] (2/4) Exclude cut with ID _6274_210_2084_1_1527937583837_7494020_32-205288-0_sp0.9 from training. Duration: 0.8666875 2023-10-02 04:42:02,729 WARNING [train.py:1204] (2/4) Exclude cut with ID _65263_210_22356_1_1535763598691_3921037_401-346646-0_sp1.1 from training. Duration: 0.963625 2023-10-02 04:42:02,733 WARNING [train.py:1204] (2/4) Exclude cut with ID _12555_210_8750_1_1530075594402_3780239_449-267986-0_sp1.1 from training. Duration: 0.7545625 2023-10-02 04:42:02,753 WARNING [train.py:1204] (2/4) Exclude cut with ID _69884_210_9771_1_1535846392818_4166169_430-201020-0_sp1.1 from training. Duration: 0.8818125 2023-10-02 04:42:04,213 WARNING [train.py:1204] (2/4) Exclude cut with ID _8394_210_6702_1_1528590504316_3540189_379-49457-0_sp1.1 from training. Duration: 0.7909375 2023-10-02 04:42:08,309 WARNING [train.py:1204] (2/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_200-105218-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 04:42:08,378 WARNING [train.py:1204] (2/4) Exclude cut with ID _3867_210_1883_1_1526641200575_4475830_319-349850-0 from training. Duration: 0.67 2023-10-02 04:42:09,713 WARNING [train.py:1204] (2/4) Exclude cut with ID _28317_210_12742_1_1532480302381_8510325_916-195848-0 from training. Duration: 0.92 2023-10-02 04:42:12,323 WARNING [train.py:1204] (2/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_436-186311-0 from training. Duration: 0.65 2023-10-02 04:42:13,773 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_3910_210_1794_1_1526777667_3747260_302-180617-0_sp1.1 from training. Duration: 0.97275 2023-10-02 04:42:15,175 WARNING [train.py:1204] (2/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_102-92751-0 from training. Duration: 0.99 2023-10-02 04:42:15,244 WARNING [train.py:1204] (2/4) Exclude cut with ID _51899_210_20034_1_1534129254466_3669220_265-215428-0_sp1.1 from training. Duration: 0.8909375 2023-10-02 04:42:17,999 WARNING [train.py:1204] (2/4) Exclude cut with ID _58583_210_5236_1_1534726835266_3574249_438-198993-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 04:42:19,644 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=754426.6666666666, ans=0.1 2023-10-02 04:42:22,137 WARNING [train.py:1204] (2/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_166-166990-0_sp1.1 from training. Duration: 0.87275 2023-10-02 04:42:24,835 WARNING [train.py:1204] (2/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_907-151398-0 from training. Duration: 0.37 2023-10-02 04:42:28,584 WARNING [train.py:1204] (2/4) Exclude cut with ID _38670_210_15379_1_1533448810659_4391059_452-256669-0_sp1.1 from training. Duration: 0.936375 2023-10-02 04:42:29,798 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.588e+02 1.849e+02 2.015e+02 2.329e+02 4.994e+02, threshold=4.030e+02, percent-clipped=1.0 2023-10-02 04:42:29,896 WARNING [train.py:1204] (2/4) Exclude cut with ID _48677_210_5663_1_1533902405919_3892989_408-40740-0 from training. Duration: 0.83 2023-10-02 04:42:29,944 WARNING [train.py:1204] (2/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_903-158756-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 04:42:31,254 WARNING [train.py:1204] (2/4) Exclude cut with ID _13252_210_1925_1_1530671372047_3336909_397-216497-0 from training. Duration: 0.96 2023-10-02 04:42:33,437 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.balancer1.prob, batch_count=754493.3333333334, ans=0.125 2023-10-02 04:42:38,044 WARNING [train.py:1204] (2/4) Exclude cut with ID _55429_210_10626_1_1534467588527_7174143_97-335084-0 from training. Duration: 0.97 2023-10-02 04:42:43,721 WARNING [train.py:1204] (2/4) Exclude cut with ID _45329_210_7005_1_1534157951211_6774339_453-267730-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 04:42:45,087 WARNING [train.py:1204] (2/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_153-97441-0 from training. Duration: 0.56 2023-10-02 04:42:45,259 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.feed_forward1.hidden_balancer.prob, batch_count=754560.0, ans=0.125 2023-10-02 04:42:46,430 WARNING [train.py:1204] (2/4) Exclude cut with ID _40901_210_5665_1_1533363789958_3856130_312-329122-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 04:42:46,478 WARNING [train.py:1204] (2/4) Exclude cut with ID _30800_210_22382_1_1532499347904_4515959_97-126471-0_sp1.1 from training. Duration: 0.97275 2023-10-02 04:42:46,478 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_11305_210_1794_1_1529750428_7178148_257-312870-0_sp1.1 from training. Duration: 0.8 2023-10-02 04:42:46,675 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=754560.0, ans=0.1 2023-10-02 04:42:47,940 WARNING [train.py:1204] (2/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_454-314901-0_sp0.9 from training. Duration: 0.511125 2023-10-02 04:42:53,145 WARNING [train.py:1204] (2/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_282-136641-0_sp1.1 from training. Duration: 0.3818125 2023-10-02 04:42:54,585 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_46013_210_7234_1_1533776362_3744032_493-160724-0_sp1.1 from training. Duration: 0.963625 2023-10-02 04:42:54,615 WARNING [train.py:1204] (2/4) Exclude cut with ID _67831_210_12392_1_1535612464202_3092612_259-33442-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 04:42:55,940 WARNING [train.py:1204] (2/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_289-62384-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 04:42:56,016 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6545_210_2040_1_1527774670_4245258_379-14182-0_sp0.9 from training. Duration: 0.888875 2023-10-02 04:42:59,344 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_548-294316-0_sp1.1 from training. Duration: 0.736375 2023-10-02 04:42:59,432 WARNING [train.py:1204] (2/4) Exclude cut with ID _6275_210_2084_1_1528110062213_7669049_857-298942-0_sp1.1 from training. Duration: 0.77275 2023-10-02 04:43:00,756 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6673_210_3273_1_1527767790_3173240_327-306185-0_sp1.1 from training. Duration: 0.7545625 2023-10-02 04:43:08,333 WARNING [train.py:1204] (2/4) Exclude cut with ID _61008_210_10633_1_1534852769198_6974370_814-107647-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 04:43:08,410 WARNING [train.py:1204] (2/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_116-332008-0_sp1.1 from training. Duration: 0.8909375 2023-10-02 04:43:11,367 WARNING [train.py:1204] (2/4) Exclude cut with ID _32704_210_8954_1_1533017488840_6747370_266-329755-0 from training. Duration: 0.89 2023-10-02 04:43:11,370 WARNING [train.py:1204] (2/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_271-269084-0_sp0.9 from training. Duration: 0.92225 2023-10-02 04:43:12,855 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.1.bypass.scale_min, batch_count=754626.6666666666, ans=0.2 2023-10-02 04:43:14,086 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_3171_210_2087_1_1526109084_2330007_66-260583-0 from training. Duration: 0.72 2023-10-02 04:43:15,451 INFO [train.py:1046] (2/4) Epoch 22, batch 1650, loss[loss=0.164, simple_loss=0.2553, pruned_loss=0.03634, over 24305.00 frames. ], tot_loss[loss=0.1752, simple_loss=0.2518, pruned_loss=0.0493, over 4712660.64 frames. ], batch size: 74, lr: 4.69e-03, grad_scale: 16.0 2023-10-02 04:43:18,307 WARNING [train.py:1204] (2/4) Exclude cut with ID _5250_210_5219_1_1527249561806_8692110_1326-276386-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 04:43:18,426 WARNING [train.py:1204] (2/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_372-30302-0_sp1.1 from training. Duration: 0.7818125 2023-10-02 04:43:18,676 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.nonlin_attention.balancer.prob, batch_count=754693.3333333334, ans=0.125 2023-10-02 04:43:19,745 WARNING [train.py:1204] (2/4) Exclude cut with ID _25882_210_3036_1_1532775665815_6479510_631-257029-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 04:43:19,753 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_3691_210_3528_1_1526468169_4062053_355-334432-0 from training. Duration: 0.87 2023-10-02 04:43:19,762 WARNING [train.py:1204] (2/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_839-173017-0_sp1.1 from training. Duration: 0.37275 2023-10-02 04:43:19,770 WARNING [train.py:1204] (2/4) Exclude cut with ID _14998_210_5219_1_1530705582080_7474970_441-91466-0 from training. Duration: 0.97 2023-10-02 04:43:19,822 WARNING [train.py:1204] (2/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_108-53929-0 from training. Duration: 0.59 2023-10-02 04:43:23,964 WARNING [train.py:1204] (2/4) Exclude cut with ID _9389_210_1664_1_1529217337301_5289269_260-1942-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 04:43:24,025 WARNING [train.py:1204] (2/4) Exclude cut with ID _56423_210_5854_1_1534896061049_3165969_198-61547-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 04:43:25,211 WARNING [train.py:1204] (2/4) Exclude cut with ID _44778_210_2054_1_1533798127203_7443090_412-142122-0_sp1.1 from training. Duration: 0.92725 2023-10-02 04:43:25,230 WARNING [train.py:1204] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_88-341718-0_sp1.1 from training. Duration: 0.6 2023-10-02 04:43:27,978 WARNING [train.py:1204] (2/4) Exclude cut with ID _61079_210_4525_1_1534921008198_4011419_334-298697-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 04:43:31,318 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_5465_210_4211_1_1527227253_55357_8-2499-0_sp1.1 from training. Duration: 0.996375 2023-10-02 04:43:33,361 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_447-201565-0_sp1.1 from training. Duration: 0.97275 2023-10-02 04:43:33,368 WARNING [train.py:1204] (2/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_253-161789-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 04:43:33,373 WARNING [train.py:1204] (2/4) Exclude cut with ID _44304_210_15374_1_1533639352765_4613129_302-22084-0_sp0.9 from training. Duration: 0.9555625 2023-10-02 04:43:33,396 WARNING [train.py:1204] (2/4) Exclude cut with ID _26664_210_7770_1_1532258943280_3418270_187-80815-0_sp1.1 from training. Duration: 0.8454375 2023-10-02 04:43:34,177 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.2.encoder.layers.1.feed_forward2.out_whiten, num_groups=1, num_channels=384, metric=4.13 vs. limit=15.0 2023-10-02 04:43:34,787 WARNING [train.py:1204] (2/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_68-212801-0 from training. Duration: 0.73 2023-10-02 04:43:34,833 WARNING [train.py:1204] (2/4) Exclude cut with ID _29345_210_4381_1_1532689168263_3860169_149-257268-0 from training. Duration: 0.81 2023-10-02 04:43:41,079 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_213-103282-0_sp1.1 from training. Duration: 0.7090625 2023-10-02 04:43:43,690 WARNING [train.py:1204] (2/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_156-302675-0_sp1.1 from training. Duration: 0.5 2023-10-02 04:43:50,700 WARNING [train.py:1204] (2/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_125-322172-0 from training. Duration: 0.93 2023-10-02 04:43:50,772 WARNING [train.py:1204] (2/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_493-60474-0_sp1.1 from training. Duration: 0.963625 2023-10-02 04:43:52,286 WARNING [train.py:1204] (2/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_156-302675-0 from training. Duration: 0.55 2023-10-02 04:43:55,115 WARNING [train.py:1204] (2/4) Exclude cut with ID _41314_210_12651_1_1533459476543_3766460_123-267862-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 04:43:58,343 WARNING [train.py:1204] (2/4) Exclude cut with ID _28427_210_4220_1_1532581240967_3061069_201-143747-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 04:43:58,371 WARNING [train.py:1204] (2/4) Exclude cut with ID _8772_210_1850_1_1528635595972_3664011_96-12756-0_sp1.1 from training. Duration: 0.8909375 2023-10-02 04:43:59,681 WARNING [train.py:1204] (2/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_1347-145675-0_sp1.1 from training. Duration: 0.936375 2023-10-02 04:43:59,777 WARNING [train.py:1204] (2/4) Exclude cut with ID _9235_210_5219_1_1528891154253_8046120_387-159008-0_sp1.1 from training. Duration: 0.7818125 2023-10-02 04:43:59,793 WARNING [train.py:1204] (2/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_400-201896-0_sp1.1 from training. Duration: 0.963625 2023-10-02 04:44:02,582 WARNING [train.py:1204] (2/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_326-174612-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 04:44:03,877 WARNING [train.py:1204] (2/4) Exclude cut with ID _55844_210_2346_1_1534471212569_7625707_390-116233-0_sp1.1 from training. Duration: 0.963625 2023-10-02 04:44:03,918 WARNING [train.py:1204] (2/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_419-317062-0_sp1.1 from training. Duration: 0.82725 2023-10-02 04:44:05,908 WARNING [train.py:1204] (2/4) Exclude cut with ID _29877_210_6228_1_1532397791106_5276149_266-62291-0_sp0.9 from training. Duration: 0.9666875 2023-10-02 04:44:07,738 WARNING [train.py:1204] (2/4) Exclude cut with ID _7857_210_1534_1_1528286515745_6831470_431-338528-0_sp1.1 from training. Duration: 0.92725 2023-10-02 04:44:09,107 WARNING [train.py:1204] (2/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_87-131427-0_sp0.9 from training. Duration: 0.8666875 2023-10-02 04:44:11,924 WARNING [train.py:1204] (2/4) Exclude cut with ID _24055_210_12651_1_1532310907143_976979_92-59356-0_sp1.1 from training. Duration: 0.82725 2023-10-02 04:44:13,253 WARNING [train.py:1204] (2/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_96-290039-0 from training. Duration: 0.56 2023-10-02 04:44:14,642 WARNING [train.py:1204] (2/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_48-182155-0_sp0.9 from training. Duration: 0.9666875 2023-10-02 04:44:14,686 WARNING [train.py:1204] (2/4) Exclude cut with ID _4626_210_3532_1_1526817652717_3916859_446-206656-0 from training. Duration: 0.76 2023-10-02 04:44:16,204 WARNING [train.py:1204] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_168-32906-0 from training. Duration: 0.69 2023-10-02 04:44:17,470 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_3780_210_2087_1_1526467760_2130483_170-259044-0 from training. Duration: 0.7 2023-10-02 04:44:17,488 WARNING [train.py:1204] (2/4) Exclude cut with ID _65134_210_14763_1_1535335393985_3505368_153-216718-0_sp1.1 from training. Duration: 0.92725 2023-10-02 04:44:17,550 WARNING [train.py:1204] (2/4) Exclude cut with ID _64639_210_5816_1_1535367619437_3528000_461-329156-0_sp1.1 from training. Duration: 0.87275 2023-10-02 04:44:17,588 WARNING [train.py:1204] (2/4) Exclude cut with ID _21989_210_5854_1_1531899214931_3271360_126-347531-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 04:44:18,867 WARNING [train.py:1204] (2/4) Exclude cut with ID _7614_210_4915_1_1529326764092_4537359_96-162197-0_sp1.1 from training. Duration: 0.963625 2023-10-02 04:44:18,870 WARNING [train.py:1204] (2/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_1083-147536-0 from training. Duration: 0.97 2023-10-02 04:44:21,707 WARNING [train.py:1204] (2/4) Exclude cut with ID _64029_210_4381_1_1535178502772_3756459_275-16710-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 04:44:23,112 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7264_210_4938_1_1527939911_8132095_800-91995-0_sp0.9 from training. Duration: 0.9555625 2023-10-02 04:44:23,150 WARNING [train.py:1204] (2/4) Exclude cut with ID _3844_210_1390_1_1526727803582_2871209_50-193879-0_sp1.1 from training. Duration: 0.936375 2023-10-02 04:44:23,393 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.feed_forward1.out_proj.dropout_p, batch_count=754960.0, ans=0.1 2023-10-02 04:44:25,862 WARNING [train.py:1204] (2/4) Exclude cut with ID _46952_210_5986_1_1534230010357_3107379_232-344503-0 from training. Duration: 0.96 2023-10-02 04:44:28,641 INFO [train.py:1046] (2/4) Epoch 22, batch 1700, loss[loss=0.1664, simple_loss=0.2287, pruned_loss=0.05207, over 23442.00 frames. ], tot_loss[loss=0.1751, simple_loss=0.2513, pruned_loss=0.04944, over 4706282.39 frames. ], batch size: 285, lr: 4.69e-03, grad_scale: 16.0 2023-10-02 04:44:30,585 WARNING [train.py:1204] (2/4) Exclude cut with ID _25808_210_13292_1_1532343514562_7366109_206-14076-0_sp1.1 from training. Duration: 0.936375 2023-10-02 04:44:30,586 WARNING [train.py:1204] (2/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_55-31904-0_sp0.9 from training. Duration: 0.97775 2023-10-02 04:44:31,889 WARNING [train.py:1204] (2/4) Exclude cut with ID _14632_210_9988_1_1530691279392_3923200_161-129530-0 from training. Duration: 0.96 2023-10-02 04:44:33,260 WARNING [train.py:1204] (2/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_770-200430-0_sp1.1 from training. Duration: 0.8090625 2023-10-02 04:44:33,287 WARNING [train.py:1204] (2/4) Exclude cut with ID _6270_210_2084_1_1527850911573_3856420_26-277343-0_sp1.1 from training. Duration: 0.7454375 2023-10-02 04:44:33,289 WARNING [train.py:1204] (2/4) Exclude cut with ID _69884_210_9771_1_1535846392818_4166169_88-23776-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 04:44:33,575 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.conv_module1.balancer1.max_abs, batch_count=755026.6666666666, ans=10.0 2023-10-02 04:44:34,787 WARNING [train.py:1204] (2/4) Exclude cut with ID _60860_210_8254_1_1534896015875_3557169_245-256666-0_sp1.1 from training. Duration: 0.8818125 2023-10-02 04:44:36,640 WARNING [train.py:1204] (2/4) Exclude cut with ID _5250_210_5219_1_1527249561806_8692110_636-251213-0_sp1.1 from training. Duration: 0.8545625 2023-10-02 04:44:36,657 WARNING [train.py:1204] (2/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_540-131164-0 from training. Duration: 0.92 2023-10-02 04:44:38,900 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.balancer2.prob, batch_count=755026.6666666666, ans=0.125 2023-10-02 04:44:40,118 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_5931_210_2040_1_1527428481_4547999_321-193614-0_sp0.9 from training. Duration: 0.7444375 2023-10-02 04:44:44,489 WARNING [train.py:1204] (2/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_373-225403-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 04:44:47,352 WARNING [train.py:1204] (2/4) Exclude cut with ID _20622_210_3852_1_1531622427080_1093839_57-326027-0_sp1.1 from training. Duration: 0.97275 2023-10-02 04:44:51,802 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.bypass.scale_min, batch_count=755093.3333333334, ans=0.2 2023-10-02 04:44:52,906 WARNING [train.py:1204] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_605-257205-0_sp1.1 from training. Duration: 0.536375 2023-10-02 04:44:52,928 WARNING [train.py:1204] (2/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_442-320317-0_sp0.9 from training. Duration: 0.911125 2023-10-02 04:44:52,976 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_35361_210_4238_1_1533262810_7793476_675-87012-0_sp1.1 from training. Duration: 0.8090625 2023-10-02 04:44:54,418 WARNING [train.py:1204] (2/4) Exclude cut with ID _4184_210_2550_1_1526730192013_2219380_306-44736-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 04:44:57,228 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_124-113562-0 from training. Duration: 0.94 2023-10-02 04:44:58,523 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.468e+02 1.824e+02 2.037e+02 2.362e+02 3.685e+02, threshold=4.074e+02, percent-clipped=0.0 2023-10-02 04:44:58,665 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_2821_210_2715_1_1526108385_3763560_73-296673-0_sp0.9 from training. Duration: 0.92225 2023-10-02 04:44:58,693 WARNING [train.py:1204] (2/4) Exclude cut with ID _10301_210_5886_1_1529222275332_3910620_216-46959-0_sp1.1 from training. Duration: 0.92725 2023-10-02 04:45:01,294 WARNING [train.py:1204] (2/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_91-277684-0_sp0.9 from training. Duration: 0.82225 2023-10-02 04:45:02,552 WARNING [train.py:1204] (2/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_351-294650-0_sp0.9 from training. Duration: 0.7 2023-10-02 04:45:04,067 WARNING [train.py:1204] (2/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_209-205365-0 from training. Duration: 0.79 2023-10-02 04:45:05,967 WARNING [train.py:1204] (2/4) Exclude cut with ID _14603_210_4220_1_1530615688441_3097319_97-325053-0 from training. Duration: 0.92 2023-10-02 04:45:06,071 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_49348_210_7927_1_1533954500_3691300_109-276430-0_sp1.1 from training. Duration: 0.92725 2023-10-02 04:45:08,068 WARNING [train.py:1204] (2/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_912-47473-0 from training. Duration: 0.68 2023-10-02 04:45:09,546 WARNING [train.py:1204] (2/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_96-320736-0_sp0.9 from training. Duration: 0.9555625 2023-10-02 04:45:16,688 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.conv_skip_rate, batch_count=755226.6666666666, ans=0.0 2023-10-02 04:45:17,733 WARNING [train.py:1204] (2/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_1252-235481-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 04:45:19,188 WARNING [train.py:1204] (2/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_813-261554-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 04:45:20,402 WARNING [train.py:1204] (2/4) Exclude cut with ID _21632_210_2253_1_1531994001765_3744140_110-274654-0_sp0.9 from training. Duration: 0.911125 2023-10-02 04:45:20,522 WARNING [train.py:1204] (2/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_381-317791-0_sp1.1 from training. Duration: 0.663625 2023-10-02 04:45:20,533 WARNING [train.py:1204] (2/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_51-117576-0_sp1.1 from training. Duration: 0.363625 2023-10-02 04:45:20,548 WARNING [train.py:1204] (2/4) Exclude cut with ID _42321_210_7770_1_1533468631352_4366895_104-215915-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 04:45:23,246 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_40896_210_15710_1_1533433956_7100802_216-22824-0_sp1.1 from training. Duration: 0.92725 2023-10-02 04:45:23,247 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_14831_210_7927_1_1530700200_5583610_181-27467-0 from training. Duration: 0.91 2023-10-02 04:45:23,307 WARNING [train.py:1204] (2/4) Exclude cut with ID _30304_210_6319_1_1532476827261_7582060_1024-265645-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 04:45:23,309 WARNING [train.py:1204] (2/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_412-233904-0_sp1.1 from training. Duration: 0.936375 2023-10-02 04:45:24,664 WARNING [train.py:1204] (2/4) Exclude cut with ID _30191_210_1664_1_1533189915047_7196670_911-223461-0_sp1.1 from training. Duration: 0.92725 2023-10-02 04:45:24,679 WARNING [train.py:1204] (2/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_558-148266-0_sp1.1 from training. Duration: 0.963625 2023-10-02 04:45:27,726 WARNING [train.py:1204] (2/4) Exclude cut with ID _58480_210_12392_1_1534811121111_3646160_212-164495-0_sp1.1 from training. Duration: 0.936375 2023-10-02 04:45:27,728 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_454-276429-0_sp0.9 from training. Duration: 0.9666875 2023-10-02 04:45:29,774 WARNING [train.py:1204] (2/4) Exclude cut with ID _24736_210_3279_1_1532332894218_2838700_73-197086-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 04:45:30,949 WARNING [train.py:1204] (2/4) Exclude cut with ID _5294_210_1747_1_1527249643928_3696179_529-242298-0_sp1.1 from training. Duration: 0.863625 2023-10-02 04:45:30,987 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_53555_210_5632_1_1534595220_7232297_345-171438-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 04:45:33,955 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_28211_210_6388_1_1532861506_4523224_22-25766-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 04:45:35,990 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7002_210_1794_1_1527939683_5157286_163-87552-0 from training. Duration: 0.86 2023-10-02 04:45:39,097 WARNING [train.py:1204] (2/4) Exclude cut with ID _56927_210_9969_1_1534848970840_3695229_409-225129-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 04:45:40,502 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_26192_210_7927_1_1532137269_1040115_21-219331-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 04:45:43,263 WARNING [train.py:1204] (2/4) Exclude cut with ID _15012_210_1968_1_1530856820429_5095350_662-210469-0 from training. Duration: 0.57 2023-10-02 04:45:44,587 INFO [train.py:1046] (2/4) Epoch 22, batch 1750, loss[loss=0.1676, simple_loss=0.244, pruned_loss=0.04559, over 24444.00 frames. ], tot_loss[loss=0.1739, simple_loss=0.2497, pruned_loss=0.04906, over 4711316.78 frames. ], batch size: 58, lr: 4.69e-03, grad_scale: 16.0 2023-10-02 04:45:48,944 WARNING [train.py:1204] (2/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_582-191016-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 04:45:51,630 WARNING [train.py:1204] (2/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_270-210032-0_sp1.1 from training. Duration: 0.963625 2023-10-02 04:45:51,661 WARNING [train.py:1204] (2/4) Exclude cut with ID _6789_210_1534_1_1527857937218_3118950_262-48853-0_sp0.9 from training. Duration: 0.62225 2023-10-02 04:45:53,033 WARNING [train.py:1204] (2/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_847-41334-0 from training. Duration: 0.44 2023-10-02 04:45:53,071 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_40896_210_15710_1_1533433956_7100802_573-46532-0_sp1.1 from training. Duration: 0.92725 2023-10-02 04:45:55,946 WARNING [train.py:1204] (2/4) Exclude cut with ID _21881_210_3525_1_1531825242757_3798380_247-102591-0_sp1.1 from training. Duration: 0.8909375 2023-10-02 04:45:55,979 WARNING [train.py:1204] (2/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_200-21395-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 04:46:00,984 WARNING [train.py:1204] (2/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_711-15688-0 from training. Duration: 0.43 2023-10-02 04:46:02,249 WARNING [train.py:1204] (2/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_53-75025-0_sp1.1 from training. Duration: 0.963625 2023-10-02 04:46:05,516 WARNING [train.py:1204] (2/4) Exclude cut with ID _6194_210_1534_1_1527511826160_3941194_321-148641-0 from training. Duration: 0.64 2023-10-02 04:46:05,560 WARNING [train.py:1204] (2/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_337-294431-0_sp1.1 from training. Duration: 0.936375 2023-10-02 04:46:06,992 WARNING [train.py:1204] (2/4) Exclude cut with ID _21632_210_2253_1_1531994001765_3744140_110-274654-0_sp1.1 from training. Duration: 0.7454375 2023-10-02 04:46:10,355 WARNING [train.py:1204] (2/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_135-174080-0_sp1.1 from training. Duration: 0.4818125 2023-10-02 04:46:10,454 WARNING [train.py:1204] (2/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_21-174514-0 from training. Duration: 0.43 2023-10-02 04:46:13,798 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_38965_210_4852_1_1533174906_3484735_215-294589-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 04:46:13,829 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_548-294316-0 from training. Duration: 0.81 2023-10-02 04:46:20,823 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_103-84010-0_sp1.1 from training. Duration: 0.62725 2023-10-02 04:46:21,514 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.2.encoder.layers.1.conv_module1.whiten, num_groups=1, num_channels=384, metric=9.88 vs. limit=15.0 2023-10-02 04:46:23,633 WARNING [train.py:1204] (2/4) Exclude cut with ID _18199_210_6109_1_1531475789864_4288790_295-198247-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 04:46:23,652 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_13757_210_3528_1_1530440682_4107239_103-313224-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 04:46:26,437 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_34886_210_10633_1_1533203882_1755661_67-294673-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 04:46:26,456 WARNING [train.py:1204] (2/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_350-101734-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 04:46:27,885 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_37357_210_17024_1_1533089504_2316121_204-153986-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 04:46:29,411 WARNING [train.py:1204] (2/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_673-115569-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 04:46:32,798 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_489-230703-0_sp0.9 from training. Duration: 0.9444375 2023-10-02 04:46:32,847 WARNING [train.py:1204] (2/4) Exclude cut with ID _5250_210_5219_1_1527249561806_8692110_575-102496-0_sp1.1 from training. Duration: 0.936375 2023-10-02 04:46:33,015 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.1.conv_module1.balancer2.prob, batch_count=755560.0, ans=0.125 2023-10-02 04:46:34,858 WARNING [train.py:1204] (2/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_669-93163-0 from training. Duration: 0.98 2023-10-02 04:46:36,330 WARNING [train.py:1204] (2/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_249-281349-0_sp0.9 from training. Duration: 0.9444375 2023-10-02 04:46:36,532 INFO [scaling.py:1118] (2/4) WithLoss: name=encoder.encoders.2.encoder.layers.2.self_attn_weights, loss-sum=0.000e+00 2023-10-02 04:46:39,565 WARNING [train.py:1204] (2/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_106-30782-0 from training. Duration: 0.8 2023-10-02 04:46:40,883 WARNING [train.py:1204] (2/4) Exclude cut with ID _8772_210_1850_1_1528635595972_3664011_398-320049-0_sp1.1 from training. Duration: 0.8 2023-10-02 04:46:42,279 WARNING [train.py:1204] (2/4) Exclude cut with ID _44778_210_2054_1_1533798127203_7443090_516-151727-0_sp1.1 from training. Duration: 0.963625 2023-10-02 04:46:42,359 WARNING [train.py:1204] (2/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_325-62802-0_sp1.1 from training. Duration: 0.7909375 2023-10-02 04:46:47,243 WARNING [train.py:1204] (2/4) Exclude cut with ID _14368_210_4220_1_1530532714870_3595529_93-58002-0_sp0.9 from training. Duration: 0.6333125 2023-10-02 04:46:48,577 WARNING [train.py:1204] (2/4) Exclude cut with ID _4681_210_3525_1_1527251040321_1235713_123-24428-0_sp1.1 from training. Duration: 0.436375 2023-10-02 04:46:48,646 WARNING [train.py:1204] (2/4) Exclude cut with ID _5250_210_5219_1_1527249561806_8692110_1185-56473-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 04:46:50,069 WARNING [train.py:1204] (2/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_96-244299-0_sp1.1 from training. Duration: 0.8 2023-10-02 04:46:54,334 WARNING [train.py:1204] (2/4) Exclude cut with ID _59500_210_8254_1_1534809610680_3736009_104-72815-0_sp1.1 from training. Duration: 0.963625 2023-10-02 04:46:54,527 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.balancer2.prob, batch_count=755626.6666666666, ans=0.125 2023-10-02 04:46:55,795 WARNING [train.py:1204] (2/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_468-283113-0_sp1.1 from training. Duration: 0.87275 2023-10-02 04:46:56,037 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.nonlin_attention.balancer.prob, batch_count=755626.6666666666, ans=0.125 2023-10-02 04:46:56,341 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.4.encoder.layers.0.conv_module2.whiten, num_groups=1, num_channels=384, metric=7.68 vs. limit=15.0 2023-10-02 04:46:57,125 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6239_210_4211_1_1527765584_4306698_528-102186-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 04:46:57,207 WARNING [train.py:1204] (2/4) Exclude cut with ID _45297_210_2721_1_1533715351381_3264949_351-147136-0 from training. Duration: 0.87 2023-10-02 04:46:58,601 WARNING [train.py:1204] (2/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_520-51744-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 04:46:58,712 WARNING [train.py:1204] (2/4) Exclude cut with ID _5166_210_2084_1_1527073197160_4087993_310-114452-0_sp1.1 from training. Duration: 0.536375 2023-10-02 04:46:58,716 WARNING [train.py:1204] (2/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_450-165710-0_sp1.1 from training. Duration: 0.92725 2023-10-02 04:46:58,726 WARNING [train.py:1204] (2/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_280-120942-0_sp1.1 from training. Duration: 0.7 2023-10-02 04:47:00,009 INFO [train.py:1046] (2/4) Epoch 22, batch 1800, loss[loss=0.1564, simple_loss=0.2377, pruned_loss=0.03758, over 24676.00 frames. ], tot_loss[loss=0.1733, simple_loss=0.2489, pruned_loss=0.04886, over 4718809.95 frames. ], batch size: 65, lr: 4.69e-03, grad_scale: 16.0 2023-10-02 04:47:00,068 WARNING [train.py:1204] (2/4) Exclude cut with ID _65585_210_12210_1_1535450451584_3764098_112-294301-0_sp0.9 from training. Duration: 0.97775 2023-10-02 04:47:00,112 WARNING [train.py:1204] (2/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_98-51676-0_sp0.9 from training. Duration: 0.811125 2023-10-02 04:47:00,267 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.1.conv_module1.balancer1.max_abs, batch_count=755693.3333333334, ans=10.0 2023-10-02 04:47:03,372 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_33348_210_5632_1_1533086912_3324030_85-28583-0_sp1.1 from training. Duration: 0.7454375 2023-10-02 04:47:03,454 WARNING [train.py:1204] (2/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_336-208232-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 04:47:05,422 WARNING [train.py:1204] (2/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_339-104791-0_sp0.9 from training. Duration: 0.5444375 2023-10-02 04:47:06,860 WARNING [train.py:1204] (2/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_559-29840-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 04:47:11,723 WARNING [train.py:1204] (2/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_117-290916-0_sp0.9 from training. Duration: 0.5666875 2023-10-02 04:47:13,154 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_43802_210_2290_1_1533516203_8829504_185-112289-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 04:47:15,151 WARNING [train.py:1204] (2/4) Exclude cut with ID _59256_210_5986_1_1535076085997_3589120_155-298797-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 04:47:17,919 WARNING [train.py:1204] (2/4) Exclude cut with ID _44659_210_2721_1_1534036120061_6614860_246-206531-0_sp1.1 from training. Duration: 0.92725 2023-10-02 04:47:19,154 WARNING [train.py:1204] (2/4) Exclude cut with ID _23818_210_6228_1_1531917063884_3676920_469-151944-0_sp1.1 from training. Duration: 0.92725 2023-10-02 04:47:19,225 WARNING [train.py:1204] (2/4) Exclude cut with ID _13252_210_1925_1_1530671372047_3336909_88-276668-0_sp1.1 from training. Duration: 0.9 2023-10-02 04:47:20,676 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_15101_210_7927_1_1530786415_5033274_320-327527-0_sp1.1 from training. Duration: 0.87275 2023-10-02 04:47:20,683 WARNING [train.py:1204] (2/4) Exclude cut with ID _14998_210_5219_1_1530705582080_7474970_446-112117-0 from training. Duration: 0.9 2023-10-02 04:47:22,035 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_30280_210_1881_1_1532872779_3660139_400-254159-0_sp1.1 from training. Duration: 0.936375 2023-10-02 04:47:24,893 WARNING [train.py:1204] (2/4) Exclude cut with ID _40284_210_2084_1_1533513679814_3704279_158-345341-0_sp1.1 from training. Duration: 0.936375 2023-10-02 04:47:27,667 WARNING [train.py:1204] (2/4) Exclude cut with ID 3033-130750-0096-55598-0 from training. Duration: 0.83 2023-10-02 04:47:30,293 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.502e+02 1.851e+02 2.122e+02 2.393e+02 3.759e+02, threshold=4.245e+02, percent-clipped=0.0 2023-10-02 04:47:30,447 WARNING [train.py:1204] (2/4) Exclude cut with ID _9389_210_1664_1_1529217337301_5289269_379-128794-0 from training. Duration: 0.85 2023-10-02 04:47:30,467 WARNING [train.py:1204] (2/4) Exclude cut with ID _3675_210_4557_1_1526639934408_4182089_456-18305-0 from training. Duration: 0.76 2023-10-02 04:47:31,783 WARNING [train.py:1204] (2/4) Exclude cut with ID _29853_210_7497_1_1532505600851_6642110_571-249460-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 04:47:33,698 WARNING [train.py:1204] (2/4) Exclude cut with ID _4068_210_2629_1_1526697350395_1546259_230-325568-0_sp1.1 from training. Duration: 0.92725 2023-10-02 04:47:33,698 WARNING [train.py:1204] (2/4) Exclude cut with ID _34415_210_8750_1_1533085213375_3451116_213-267570-0_sp1.1 from training. Duration: 0.963625 2023-10-02 04:47:35,074 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6767_210_3273_1_1527855783_1718932_142-127132-0_sp1.1 from training. Duration: 0.863625 2023-10-02 04:47:40,194 WARNING [train.py:1204] (2/4) Exclude cut with ID IC0653W0001-322925 from training. Duration: 0.896 2023-10-02 04:47:41,572 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_4528_210_3273_1_1527076654_3276777_209-219804-0_sp1.1 from training. Duration: 0.62725 2023-10-02 04:47:43,429 WARNING [train.py:1204] (2/4) Exclude cut with ID _55844_210_2346_1_1534471212569_7625707_187-204971-0_sp1.1 from training. Duration: 0.936375 2023-10-02 04:47:46,163 WARNING [train.py:1204] (2/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_369-325184-0 from training. Duration: 0.99 2023-10-02 04:47:47,480 WARNING [train.py:1204] (2/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_791-45149-0 from training. Duration: 0.86 2023-10-02 04:47:47,531 WARNING [train.py:1204] (2/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_317-211107-0_sp1.1 from training. Duration: 0.836375 2023-10-02 04:47:48,935 WARNING [train.py:1204] (2/4) Exclude cut with ID _30800_210_22382_1_1532499347904_4515959_488-330913-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 04:47:50,337 WARNING [train.py:1204] (2/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_506-42614-0_sp1.1 from training. Duration: 0.7181875 2023-10-02 04:47:54,474 WARNING [train.py:1204] (2/4) Exclude cut with ID _8696_210_1876_1_1528545632440_3345058_209-54069-0 from training. Duration: 0.67 2023-10-02 04:47:59,921 WARNING [train.py:1204] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_215-202973-0_sp1.1 from training. Duration: 0.8 2023-10-02 04:48:00,002 WARNING [train.py:1204] (2/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_321-215742-0 from training. Duration: 0.5 2023-10-02 04:48:01,282 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_40896_210_15710_1_1533433956_7100802_428-32114-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 04:48:01,290 WARNING [train.py:1204] (2/4) Exclude cut with ID _27013_210_1389_1_1532437173240_3252649_467-263151-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 04:48:03,062 WARNING [train.py:1204] (2/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_734-129761-0_sp0.9 from training. Duration: 0.911125 2023-10-02 04:48:03,103 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_521-214276-0 from training. Duration: 0.44 2023-10-02 04:48:04,613 WARNING [train.py:1204] (2/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_80-147301-0_sp1.1 from training. Duration: 0.763625 2023-10-02 04:48:04,615 WARNING [train.py:1204] (2/4) Exclude cut with ID _9679_210_7497_1_1529129212906_4363929_56-307796-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 04:48:06,600 WARNING [train.py:1204] (2/4) Exclude cut with ID _14832_210_8750_1_1530691351539_3171348_1-54315-0 from training. Duration: 0.87 2023-10-02 04:48:06,601 WARNING [train.py:1204] (2/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_558-155750-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 04:48:10,222 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_142-309370-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 04:48:11,502 WARNING [train.py:1204] (2/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_18-46756-0_sp0.9 from training. Duration: 0.87775 2023-10-02 04:48:11,509 WARNING [train.py:1204] (2/4) Exclude cut with ID _61148_210_6897_1_1535094022726_4217610_279-212159-0_sp1.1 from training. Duration: 0.936375 2023-10-02 04:48:12,378 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.0.layers.1.nonlin_attention.whiten1, num_groups=1, num_channels=144, metric=5.23 vs. limit=10.0 2023-10-02 04:48:12,914 WARNING [train.py:1204] (2/4) Exclude cut with ID _21987_210_5854_1_1531821500482_3637038_164-2641-0_sp1.1 from training. Duration: 0.936375 2023-10-02 04:48:12,958 WARNING [train.py:1204] (2/4) Exclude cut with ID _8064_210_5399_1_1528372299291_4282973_395-340852-0_sp1.1 from training. Duration: 0.8454375 2023-10-02 04:48:14,290 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1077-184261-0_sp1.1 from training. Duration: 0.963625 2023-10-02 04:48:14,298 WARNING [train.py:1204] (2/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_1176-204992-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 04:48:16,096 INFO [train.py:1046] (2/4) Epoch 22, batch 1850, loss[loss=0.1879, simple_loss=0.2602, pruned_loss=0.05781, over 22882.00 frames. ], tot_loss[loss=0.1742, simple_loss=0.2498, pruned_loss=0.04931, over 4714123.56 frames. ], batch size: 322, lr: 4.69e-03, grad_scale: 16.0 2023-10-02 04:48:18,879 WARNING [train.py:1204] (2/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_563-347832-0_sp1.1 from training. Duration: 0.8090625 2023-10-02 04:48:18,942 WARNING [train.py:1204] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_380-204756-0_sp0.9 from training. Duration: 0.9555625 2023-10-02 04:48:25,856 WARNING [train.py:1204] (2/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_212-10501-0_sp1.1 from training. Duration: 0.8545625 2023-10-02 04:48:25,894 WARNING [train.py:1204] (2/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_445-117398-0 from training. Duration: 0.89 2023-10-02 04:48:28,685 WARNING [train.py:1204] (2/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_29-280622-0 from training. Duration: 0.82 2023-10-02 04:48:28,955 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.conv_skip_rate, batch_count=756093.3333333334, ans=0.0 2023-10-02 04:48:31,944 WARNING [train.py:1204] (2/4) Exclude cut with ID _26664_210_7770_1_1532258943280_3418270_187-80815-0 from training. Duration: 0.93 2023-10-02 04:48:35,954 WARNING [train.py:1204] (2/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_414-349640-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 04:48:35,978 WARNING [train.py:1204] (2/4) Exclude cut with ID _45021_210_9994_1_1533619751094_3330288_96-239010-0 from training. Duration: 0.91 2023-10-02 04:48:37,294 WARNING [train.py:1204] (2/4) Exclude cut with ID _5189_210_4557_1_1527248319428_3769229_199-53526-0_sp0.9 from training. Duration: 0.4666875 2023-10-02 04:48:45,661 WARNING [train.py:1204] (2/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_63-101996-0_sp1.1 from training. Duration: 0.8 2023-10-02 04:48:47,105 WARNING [train.py:1204] (2/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_199-86432-0 from training. Duration: 0.98 2023-10-02 04:48:50,175 WARNING [train.py:1204] (2/4) Exclude cut with ID _36399_210_16049_1_1532998771377_3698333_207-259013-0_sp1.1 from training. Duration: 0.92725 2023-10-02 04:48:50,202 WARNING [train.py:1204] (2/4) Exclude cut with ID _62574_210_2655_1_1535180407058_3933249_567-111756-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 04:48:51,799 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.nonlin_attention.balancer.prob, batch_count=756160.0, ans=0.125 2023-10-02 04:48:54,342 WARNING [train.py:1204] (2/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_436-318113-0 from training. Duration: 0.75 2023-10-02 04:48:55,703 WARNING [train.py:1204] (2/4) Exclude cut with ID _53379_210_20034_1_1534222027363_3854654_43-305214-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 04:48:55,733 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_15126_210_6753_1_1530778677_2591185_113-157651-0_sp1.1 from training. Duration: 0.6181875 2023-10-02 04:48:57,255 WARNING [train.py:1204] (2/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_146-218763-0_sp1.1 from training. Duration: 0.7818125 2023-10-02 04:48:58,694 WARNING [train.py:1204] (2/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_714-310538-0_sp0.9 from training. Duration: 0.9555625 2023-10-02 04:49:03,420 WARNING [train.py:1204] (2/4) Exclude cut with ID _24271_210_12658_1_1531987270022_7190519_709-16721-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 04:49:04,966 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.self_attn_weights.pos_emb_skip_rate, batch_count=756226.6666666666, ans=0.0 2023-10-02 04:49:06,106 WARNING [train.py:1204] (2/4) Exclude cut with ID _5190_210_4557_1_1527244368799_373439_28-197030-0_sp0.9 from training. Duration: 0.8 2023-10-02 04:49:06,157 WARNING [train.py:1204] (2/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_267-197536-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 04:49:06,187 WARNING [train.py:1204] (2/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_658-282610-0_sp1.1 from training. Duration: 0.4818125 2023-10-02 04:49:06,196 WARNING [train.py:1204] (2/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_257-67050-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 04:49:07,514 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_14906_210_7927_1_1530754190_9408633_961-53402-0_sp1.1 from training. Duration: 0.936375 2023-10-02 04:49:08,906 WARNING [train.py:1204] (2/4) Exclude cut with ID _60113_210_16271_1_1535164212481_4386079_490-280290-0_sp1.1 from training. Duration: 0.8909375 2023-10-02 04:49:13,027 WARNING [train.py:1204] (2/4) Exclude cut with ID _14100_210_8341_1_1530529320613_3678069_258-224599-0 from training. Duration: 0.77 2023-10-02 04:49:13,522 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.5.encoder.layers.0.feed_forward2.out_whiten, num_groups=1, num_channels=256, metric=11.08 vs. limit=15.0 2023-10-02 04:49:14,235 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6961_210_2087_1_1527937168_6815011_565-326898-0_sp1.1 from training. Duration: 0.936375 2023-10-02 04:49:18,298 WARNING [train.py:1204] (2/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_239-316802-0_sp1.1 from training. Duration: 0.62725 2023-10-02 04:49:19,761 WARNING [train.py:1204] (2/4) Exclude cut with ID _55857_210_2346_1_1534644007831_6898209_745-98391-0_sp1.1 from training. Duration: 0.6909375 2023-10-02 04:49:19,764 WARNING [train.py:1204] (2/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_154-135696-0 from training. Duration: 0.8 2023-10-02 04:49:19,766 WARNING [train.py:1204] (2/4) Exclude cut with ID _14368_210_4220_1_1530532714870_3595529_93-58002-0 from training. Duration: 0.57 2023-10-02 04:49:23,126 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9025W0005-507335 from training. Duration: 0.896 2023-10-02 04:49:23,206 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9029W0034-509435 from training. Duration: 0.64 2023-10-02 04:49:24,601 WARNING [train.py:1204] (2/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_76-203867-0_sp1.1 from training. Duration: 0.6545625 2023-10-02 04:49:24,609 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_46639_210_8341_1_1533726088_4495500_66-346325-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 04:49:24,615 WARNING [train.py:1204] (2/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_145-137790-0_sp0.9 from training. Duration: 0.92225 2023-10-02 04:49:24,637 WARNING [train.py:1204] (2/4) Exclude cut with ID _36522_210_8887_1_1533177030611_1772478_7-327299-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 04:49:25,972 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9042W0009-515615 from training. Duration: 0.896 2023-10-02 04:49:25,993 WARNING [train.py:1204] (2/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_39-248870-0_sp0.9 from training. Duration: 0.7666875 2023-10-02 04:49:26,038 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_35441_210_16554_1_1533347572_4092680_362-242912-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 04:49:27,386 WARNING [train.py:1204] (2/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_216-343007-0_sp1.1 from training. Duration: 0.6 2023-10-02 04:49:28,037 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.3.encoder.layers.3.self_attn1.whiten, num_groups=1, num_channels=512, metric=14.50 vs. limit=22.5 2023-10-02 04:49:28,871 WARNING [train.py:1204] (2/4) Exclude cut with ID _3867_210_1883_1_1526641200575_4475830_272-215563-0_sp0.9 from training. Duration: 0.6333125 2023-10-02 04:49:30,108 INFO [train.py:1046] (2/4) Epoch 22, batch 1900, loss[loss=0.1744, simple_loss=0.2456, pruned_loss=0.05155, over 23802.00 frames. ], tot_loss[loss=0.1748, simple_loss=0.2503, pruned_loss=0.04965, over 4719132.52 frames. ], batch size: 212, lr: 4.69e-03, grad_scale: 16.0 2023-10-02 04:49:30,188 WARNING [train.py:1204] (2/4) Exclude cut with ID _21783_210_11357_1_1531742458730_3762679_553-175603-0_sp1.1 from training. Duration: 0.92725 2023-10-02 04:49:30,204 WARNING [train.py:1204] (2/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_963-187109-0 from training. Duration: 0.99 2023-10-02 04:49:31,627 WARNING [train.py:1204] (2/4) Exclude cut with ID _44973_210_1511_1_1533794381163_3634770_495-211466-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 04:49:31,637 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9068W0002-529085 from training. Duration: 0.896 2023-10-02 04:49:31,650 WARNING [train.py:1204] (2/4) Exclude cut with ID _5294_210_1747_1_1527249643928_3696179_180-100713-0_sp1.1 from training. Duration: 0.736375 2023-10-02 04:49:33,650 WARNING [train.py:1204] (2/4) Exclude cut with ID _35799_210_5854_1_1533263487887_3529071_149-49689-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 04:49:38,901 WARNING [train.py:1204] (2/4) Exclude cut with ID _67725_210_27767_1_1535608756671_5105359_36-220794-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 04:49:40,409 WARNING [train.py:1204] (2/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_338-96520-0_sp1.1 from training. Duration: 0.8545625 2023-10-02 04:49:41,833 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9104W0004-547160 from training. Duration: 0.896 2023-10-02 04:49:43,141 WARNING [train.py:1204] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_330-327861-0 from training. Duration: 0.67 2023-10-02 04:49:45,118 WARNING [train.py:1204] (2/4) Exclude cut with ID _14087_210_2253_1_1530529224512_3014029_156-257607-0_sp0.9 from training. Duration: 0.92225 2023-10-02 04:49:45,180 WARNING [train.py:1204] (2/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_476-322050-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 04:49:47,162 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9118W0017-553385 from training. Duration: 0.896 2023-10-02 04:49:47,187 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9120W0028-554435 from training. Duration: 0.896 2023-10-02 04:49:50,086 WARNING [train.py:1204] (2/4) Exclude cut with ID _56524_210_9642_1_1534572011779_3619170_145-310842-0 from training. Duration: 0.99 2023-10-02 04:49:52,646 WARNING [train.py:1204] (2/4) Exclude cut with ID _51631_210_8254_1_1534129228420_4084794_270-222508-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 04:49:57,350 WARNING [train.py:1204] (2/4) Exclude cut with ID _14268_210_9206_1_1530622771285_4714099_331-105351-0 from training. Duration: 0.65 2023-10-02 04:49:57,493 WARNING [train.py:1204] (2/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_40-129756-0 from training. Duration: 0.81 2023-10-02 04:49:57,737 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.self_attn_weights.pos_emb_skip_rate, batch_count=756426.6666666666, ans=0.0 2023-10-02 04:50:00,301 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.540e+02 1.830e+02 1.988e+02 2.362e+02 3.579e+02, threshold=3.977e+02, percent-clipped=0.0 2023-10-02 04:50:07,818 WARNING [train.py:1204] (2/4) Exclude cut with ID _3541_210_2477_1_1526475408709_3263973_231-104736-0 from training. Duration: 0.78 2023-10-02 04:50:10,653 WARNING [train.py:1204] (2/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_214-291369-0 from training. Duration: 0.63 2023-10-02 04:50:10,667 WARNING [train.py:1204] (2/4) Exclude cut with ID _62574_210_2655_1_1535180407058_3933249_369-84347-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 04:50:10,721 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9210W0111-599240 from training. Duration: 0.896 2023-10-02 04:50:10,725 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_92-246270-0 from training. Duration: 0.85 2023-10-02 04:50:12,027 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_37152_210_9697_1_1533106573_3844188_29-239070-0 from training. Duration: 0.98 2023-10-02 04:50:12,107 WARNING [train.py:1204] (2/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_239-22076-0 from training. Duration: 0.85 2023-10-02 04:50:12,109 WARNING [train.py:1204] (2/4) Exclude cut with ID _65962_210_29927_1_1535436026519_4174930_368-44639-0_sp1.1 from training. Duration: 0.97275 2023-10-02 04:50:14,861 WARNING [train.py:1204] (2/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_152-212533-0 from training. Duration: 0.97 2023-10-02 04:50:18,363 WARNING [train.py:1204] (2/4) Exclude cut with ID _14603_210_4220_1_1530615688441_3097319_90-31383-0_sp1.1 from training. Duration: 0.7909375 2023-10-02 04:50:19,208 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.0.layers.1.self_attn1.whiten, num_groups=1, num_channels=192, metric=9.73 vs. limit=22.5 2023-10-02 04:50:20,542 WARNING [train.py:1204] (2/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_196-152868-0_sp1.1 from training. Duration: 0.92725 2023-10-02 04:50:20,544 WARNING [train.py:1204] (2/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_140-181176-0 from training. Duration: 0.92 2023-10-02 04:50:23,229 WARNING [train.py:1204] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_168-32906-0_sp0.9 from training. Duration: 0.7666875 2023-10-02 04:50:25,564 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.bypass.skip_rate, batch_count=756560.0, ans=0.04949747468305833 2023-10-02 04:50:26,611 WARNING [train.py:1204] (2/4) Exclude cut with ID _14922_210_6228_1_1530707435273_3693310_13-165512-0 from training. Duration: 0.82 2023-10-02 04:50:27,863 WARNING [train.py:1204] (2/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_381-317791-0_sp0.9 from training. Duration: 0.811125 2023-10-02 04:50:35,243 WARNING [train.py:1204] (2/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_63-267585-0_sp1.1 from training. Duration: 0.8181875 2023-10-02 04:50:35,245 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_326-303945-0_sp1.1 from training. Duration: 0.9 2023-10-02 04:50:35,257 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_194-143381-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 04:50:35,349 WARNING [train.py:1204] (2/4) Exclude cut with ID _39290_210_4220_1_1533862778760_3075118_194-16667-0_sp1.1 from training. Duration: 0.963625 2023-10-02 04:50:35,579 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.feed_forward3.hidden_balancer.prob, batch_count=756626.6666666666, ans=0.125 2023-10-02 04:50:36,704 WARNING [train.py:1204] (2/4) Exclude cut with ID _15012_210_1968_1_1530856820429_5095350_662-210469-0_sp0.9 from training. Duration: 0.6333125 2023-10-02 04:50:38,033 WARNING [train.py:1204] (2/4) Exclude cut with ID _4697_210_2069_1_1526815801953_3823098_68-219807-0_sp1.1 from training. Duration: 0.42725 2023-10-02 04:50:39,463 WARNING [train.py:1204] (2/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_321-22821-0_sp0.9 from training. Duration: 0.988875 2023-10-02 04:50:40,942 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_14056_210_4163_1_1530604483_7572827_1037-45317-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 04:50:40,944 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_403-39619-0_sp1.1 from training. Duration: 0.763625 2023-10-02 04:50:42,467 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.conv_skip_rate, batch_count=756626.6666666666, ans=0.0 2023-10-02 04:50:43,677 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_20120_210_6753_1_1531643035_5482859_510-96996-0_sp0.9 from training. Duration: 0.9666875 2023-10-02 04:50:43,686 WARNING [train.py:1204] (2/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_225-180155-0_sp1.1 from training. Duration: 0.92725 2023-10-02 04:50:44,909 INFO [train.py:1046] (2/4) Epoch 22, batch 1950, loss[loss=0.1772, simple_loss=0.2441, pruned_loss=0.05516, over 23664.00 frames. ], tot_loss[loss=0.1756, simple_loss=0.2513, pruned_loss=0.04997, over 4705189.19 frames. ], batch size: 149, lr: 4.69e-03, grad_scale: 16.0 2023-10-02 04:50:44,980 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_278-272612-0_sp0.9 from training. Duration: 0.811125 2023-10-02 04:50:46,466 WARNING [train.py:1204] (2/4) Exclude cut with ID _53713_210_24116_1_1534314487762_7416751_691-70244-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 04:50:49,738 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6788_210_1388_1_1527898698_4562021_91-197981-0_sp1.1 from training. Duration: 0.7545625 2023-10-02 04:50:52,469 WARNING [train.py:1204] (2/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_60-18123-0_sp0.9 from training. Duration: 0.97775 2023-10-02 04:50:52,523 WARNING [train.py:1204] (2/4) Exclude cut with ID _35799_210_5854_1_1533263487887_3529071_5-214617-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 04:50:52,529 WARNING [train.py:1204] (2/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_890-5117-0_sp1.1 from training. Duration: 0.7090625 2023-10-02 04:50:57,170 WARNING [train.py:1204] (2/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_660-211088-0 from training. Duration: 0.62 2023-10-02 04:50:57,224 WARNING [train.py:1204] (2/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_314-126819-0_sp0.9 from training. Duration: 0.5555625 2023-10-02 04:50:57,254 WARNING [train.py:1204] (2/4) Exclude cut with ID _21632_210_2253_1_1531994001765_3744140_362-66225-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 04:50:58,652 WARNING [train.py:1204] (2/4) Exclude cut with ID _61118_210_9527_1_1534897668884_3752630_173-258555-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 04:51:01,292 WARNING [train.py:1204] (2/4) Exclude cut with ID _7719_210_3525_1_1528456153163_4296960_88-250259-0_sp0.9 from training. Duration: 0.9333125 2023-10-02 04:51:01,320 WARNING [train.py:1204] (2/4) Exclude cut with ID _15140_210_9288_1_1530791388450_4880149_539-153708-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 04:51:01,347 WARNING [train.py:1204] (2/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_253-42544-0_sp1.1 from training. Duration: 0.97275 2023-10-02 04:51:04,080 WARNING [train.py:1204] (2/4) Exclude cut with ID _33640_210_2655_1_1533117605479_4855469_479-296877-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 04:51:07,260 WARNING [train.py:1204] (2/4) Exclude cut with ID 3033-130750-0096-55598-0_sp1.1 from training. Duration: 0.7545625 2023-10-02 04:51:07,278 WARNING [train.py:1204] (2/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_191-41023-0_sp1.1 from training. Duration: 0.6454375 2023-10-02 04:51:07,300 WARNING [train.py:1204] (2/4) Exclude cut with ID _8365_210_2064_1_1528592311070_4677690_431-294102-0_sp1.1 from training. Duration: 0.8090625 2023-10-02 04:51:08,539 WARNING [train.py:1204] (2/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_893-296030-0_sp1.1 from training. Duration: 0.97275 2023-10-02 04:51:11,417 WARNING [train.py:1204] (2/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_401-343820-0_sp1.1 from training. Duration: 0.97275 2023-10-02 04:51:14,353 WARNING [train.py:1204] (2/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_127-566-0_sp1.1 from training. Duration: 0.763625 2023-10-02 04:51:14,354 WARNING [train.py:1204] (2/4) Exclude cut with ID _14100_210_8341_1_1530529320613_3678069_126-272847-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 04:51:14,368 WARNING [train.py:1204] (2/4) Exclude cut with ID _15133_210_6228_1_1530794512074_3080379_29-264458-0_sp1.1 from training. Duration: 0.52725 2023-10-02 04:51:14,368 WARNING [train.py:1204] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_127-119109-0 from training. Duration: 0.72 2023-10-02 04:51:15,737 WARNING [train.py:1204] (2/4) Exclude cut with ID _6537_210_1883_1_1527987559710_3597129_382-133062-0_sp1.1 from training. Duration: 0.6181875 2023-10-02 04:51:15,748 WARNING [train.py:1204] (2/4) Exclude cut with ID _29529_210_7234_1_1532393961455_3977460_391-219682-0_sp1.1 from training. Duration: 0.936375 2023-10-02 04:51:17,090 WARNING [train.py:1204] (2/4) Exclude cut with ID _37537_210_2253_1_1533171758568_3421949_96-137189-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 04:51:17,336 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.conv_module2.balancer2.prob, batch_count=756826.6666666666, ans=0.125 2023-10-02 04:51:21,133 WARNING [train.py:1204] (2/4) Exclude cut with ID _58414_210_8254_1_1535421609162_7151182_15-276301-0_sp1.1 from training. Duration: 0.97275 2023-10-02 04:51:21,398 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=756826.6666666666, ans=0.1 2023-10-02 04:51:23,224 WARNING [train.py:1204] (2/4) Exclude cut with ID _59502_210_12210_1_1535107024698_1835139_110-289074-0_sp1.1 from training. Duration: 0.92725 2023-10-02 04:51:26,051 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.1.feed_forward1.out_proj.dropout_p, batch_count=756826.6666666666, ans=0.1 2023-10-02 04:51:27,431 WARNING [train.py:1204] (2/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_367-61330-0_sp1.1 from training. Duration: 0.6818125 2023-10-02 04:51:32,539 WARNING [train.py:1204] (2/4) Exclude cut with ID _45297_210_2721_1_1533715351381_3264949_353-9541-0_sp1.1 from training. Duration: 0.8818125 2023-10-02 04:51:32,579 WARNING [train.py:1204] (2/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_85-138257-0_sp0.9 from training. Duration: 0.788875 2023-10-02 04:51:32,617 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_3787_210_2040_1_1526477544_5478586_603-120324-0 from training. Duration: 0.81 2023-10-02 04:51:34,082 WARNING [train.py:1204] (2/4) Exclude cut with ID _42863_210_9070_1_1533902398788_3696920_411-204131-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 04:51:38,162 WARNING [train.py:1204] (2/4) Exclude cut with ID _30196_210_1664_1_1533708496522_7074140_822-289909-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 04:51:38,253 WARNING [train.py:1204] (2/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_32-335103-0_sp0.9 from training. Duration: 0.911125 2023-10-02 04:51:39,590 WARNING [train.py:1204] (2/4) Exclude cut with ID _6493_210_1747_1_1528459210424_4012470_513-140967-0_sp0.9 from training. Duration: 0.87775 2023-10-02 04:51:43,284 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.nonlin_attention.balancer.max_positive, batch_count=756960.0, ans=0.95 2023-10-02 04:51:47,179 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_47896_210_20508_1_1534492518_5958868_439-257766-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 04:51:47,262 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_1294-63152-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 04:51:49,932 WARNING [train.py:1204] (2/4) Exclude cut with ID _59170_210_6721_1_1534983633960_4203335_319-166512-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 04:51:51,504 WARNING [train.py:1204] (2/4) Exclude cut with ID _3541_210_2477_1_1526475408709_3263973_146-3470-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 04:51:54,632 WARNING [train.py:1204] (2/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_678-159307-0_sp1.1 from training. Duration: 0.77275 2023-10-02 04:51:54,666 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_343-48822-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 04:51:55,846 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_4692_210_3528_1_1526899755_4323254_107-44401-0 from training. Duration: 0.86 2023-10-02 04:51:55,852 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6291_210_3273_1_1527683649_1078498_74-292608-0_sp1.1 from training. Duration: 0.6545625 2023-10-02 04:51:57,257 WARNING [train.py:1204] (2/4) Exclude cut with ID _55644_210_11856_1_1534593599582_4103719_293-248694-0_sp1.1 from training. Duration: 0.97275 2023-10-02 04:51:59,247 INFO [train.py:1046] (2/4) Epoch 22, batch 2000, loss[loss=0.1721, simple_loss=0.2339, pruned_loss=0.05515, over 22701.00 frames. ], tot_loss[loss=0.1755, simple_loss=0.2514, pruned_loss=0.04979, over 4709112.31 frames. ], batch size: 322, lr: 4.68e-03, grad_scale: 32.0 2023-10-02 04:51:59,308 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_3172_210_2087_1_1526292929_3987865_197-97762-0 from training. Duration: 0.75 2023-10-02 04:52:00,825 WARNING [train.py:1204] (2/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_264-348896-0_sp0.9 from training. Duration: 0.9444375 2023-10-02 04:52:01,041 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.feed_forward1.out_proj.dropout_p, batch_count=757026.6666666666, ans=0.1 2023-10-02 04:52:04,141 WARNING [train.py:1204] (2/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_209-205365-0_sp0.9 from training. Duration: 0.87775 2023-10-02 04:52:05,478 WARNING [train.py:1204] (2/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_222-198127-0_sp0.9 from training. Duration: 0.9333125 2023-10-02 04:52:05,538 WARNING [train.py:1204] (2/4) Exclude cut with ID _57572_210_5517_1_1534921219624_3809484_52-43530-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 04:52:08,171 WARNING [train.py:1204] (2/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_96-320736-0_sp1.1 from training. Duration: 0.7818125 2023-10-02 04:52:09,525 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_13091_210_7925_1_1530609447_8987354_1159-225508-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 04:52:11,649 WARNING [train.py:1204] (2/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_237-44332-0 from training. Duration: 0.94 2023-10-02 04:52:13,028 WARNING [train.py:1204] (2/4) Exclude cut with ID _7970_210_1883_1_1528455616296_3472230_25-178407-0_sp0.9 from training. Duration: 0.788875 2023-10-02 04:52:14,523 WARNING [train.py:1204] (2/4) Exclude cut with ID _29003_210_19596_1_1532507391720_3894098_357-257062-0_sp1.1 from training. Duration: 0.936375 2023-10-02 04:52:16,083 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_809-17156-0 from training. Duration: 0.73 2023-10-02 04:52:18,792 WARNING [train.py:1204] (2/4) Exclude cut with ID _7160_210_1876_1_1527940923231_3703799_302-150728-0_sp1.1 from training. Duration: 0.7090625 2023-10-02 04:52:18,810 WARNING [train.py:1204] (2/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_444-28041-0_sp0.9 from training. Duration: 0.9444375 2023-10-02 04:52:21,492 WARNING [train.py:1204] (2/4) Exclude cut with ID _52892_210_6721_1_1534378941259_4124129_278-251666-0_sp1.1 from training. Duration: 0.92725 2023-10-02 04:52:23,468 WARNING [train.py:1204] (2/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_115-326778-0 from training. Duration: 0.93 2023-10-02 04:52:23,569 WARNING [train.py:1204] (2/4) Exclude cut with ID _41391_210_7994_1_1533603771872_3638201_31-188961-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 04:52:24,991 WARNING [train.py:1204] (2/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_1073-6264-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 04:52:26,264 WARNING [train.py:1204] (2/4) Exclude cut with ID _66868_210_9527_1_1535509287954_4561140_435-259515-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 04:52:26,361 WARNING [train.py:1204] (2/4) Exclude cut with ID _14886_210_4220_1_1530702020355_3516190_159-73748-0 from training. Duration: 0.83 2023-10-02 04:52:27,615 WARNING [train.py:1204] (2/4) Exclude cut with ID _15150_210_8341_1_1530752411803_7256384_469-239509-0_sp1.1 from training. Duration: 0.6181875 2023-10-02 04:52:29,699 WARNING [train.py:1204] (2/4) Exclude cut with ID _8064_210_5399_1_1528372299291_4282973_395-340852-0 from training. Duration: 0.93 2023-10-02 04:52:29,702 WARNING [train.py:1204] (2/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_138-187031-0_sp1.1 from training. Duration: 0.9 2023-10-02 04:52:30,915 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.642e+02 1.905e+02 2.121e+02 2.548e+02 4.469e+02, threshold=4.243e+02, percent-clipped=4.0 2023-10-02 04:52:32,955 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_13757_210_3528_1_1530440682_4107239_48-106347-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 04:52:34,311 WARNING [train.py:1204] (2/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_164-224052-0_sp1.1 from training. Duration: 0.636375 2023-10-02 04:52:34,315 WARNING [train.py:1204] (2/4) Exclude cut with ID _59288_210_5854_1_1535077757774_3412246_408-322648-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 04:52:34,353 WARNING [train.py:1204] (2/4) Exclude cut with ID _30191_210_1664_1_1533189915047_7196670_827-36359-0_sp1.1 from training. Duration: 0.963625 2023-10-02 04:52:35,736 WARNING [train.py:1204] (2/4) Exclude cut with ID _4680_210_3525_1_1527249757953_893386_75-78979-0_sp1.1 from training. Duration: 0.82725 2023-10-02 04:52:37,011 WARNING [train.py:1204] (2/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_383-165234-0 from training. Duration: 0.94 2023-10-02 04:52:39,776 WARNING [train.py:1204] (2/4) Exclude cut with ID _19896_210_1883_1_1531709022662_6649569_94-229927-0 from training. Duration: 0.97 2023-10-02 04:52:39,782 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6788_210_1388_1_1527898698_4562021_180-75288-0_sp1.1 from training. Duration: 0.9 2023-10-02 04:52:39,789 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_26433_210_12108_1_1532157158_4056480_54-321349-0_sp1.1 from training. Duration: 0.97275 2023-10-02 04:52:44,465 WARNING [train.py:1204] (2/4) Exclude cut with ID _56108_210_16380_1_1534505446159_3887959_265-161493-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 04:52:44,752 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.bypass.skip_rate, batch_count=757226.6666666666, ans=0.04949747468305833 2023-10-02 04:52:45,828 WARNING [train.py:1204] (2/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_395-111602-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 04:52:45,834 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_154-316450-0_sp0.9 from training. Duration: 0.8444375 2023-10-02 04:52:45,900 WARNING [train.py:1204] (2/4) Exclude cut with ID _43552_210_8913_1_1533726031059_3523429_381-116041-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 04:52:47,412 WARNING [train.py:1204] (2/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_65-104124-0_sp1.1 from training. Duration: 0.963625 2023-10-02 04:52:48,686 WARNING [train.py:1204] (2/4) Exclude cut with ID _6536_210_1883_1_1527850783339_3319069_169-23811-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 04:52:50,505 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_289-180904-0_sp0.9 from training. Duration: 0.8444375 2023-10-02 04:52:50,515 WARNING [train.py:1204] (2/4) Exclude cut with ID _56406_210_9288_1_1534899618664_3496149_226-242400-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 04:52:50,612 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_24899_210_16554_1_1532046422_3827462_276-216330-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 04:52:53,377 WARNING [train.py:1204] (2/4) Exclude cut with ID _29345_210_4381_1_1532689168263_3860169_198-174328-0_sp1.1 from training. Duration: 0.82725 2023-10-02 04:52:53,439 WARNING [train.py:1204] (2/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_338-96520-0 from training. Duration: 0.94 2023-10-02 04:52:57,144 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.0.layers.0.nonlin_attention.whiten2, num_groups=1, num_channels=192, metric=5.39 vs. limit=15.0 2023-10-02 04:52:59,427 WARNING [train.py:1204] (2/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_7-111954-0_sp1.1 from training. Duration: 0.5090625 2023-10-02 04:52:59,517 WARNING [train.py:1204] (2/4) Exclude cut with ID _36692_210_5665_1_1533608967420_398809_8-16261-0_sp1.1 from training. Duration: 0.97275 2023-10-02 04:52:59,701 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=757293.3333333334, ans=0.1 2023-10-02 04:53:04,107 WARNING [train.py:1204] (2/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_86-212657-0_sp1.1 from training. Duration: 0.97275 2023-10-02 04:53:04,123 WARNING [train.py:1204] (2/4) Exclude cut with ID _44659_210_2721_1_1534036120061_6614860_626-298884-0_sp1.1 from training. Duration: 0.8818125 2023-10-02 04:53:06,938 WARNING [train.py:1204] (2/4) Exclude cut with ID _49755_210_18053_1_1534581005846_3218709_120-257918-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 04:53:08,672 WARNING [train.py:1204] (2/4) Exclude cut with ID _21881_210_3525_1_1531825242757_3798380_400-113821-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 04:53:08,674 WARNING [train.py:1204] (2/4) Exclude cut with ID _21783_210_11357_1_1531742458730_3762679_387-236773-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 04:53:10,027 WARNING [train.py:1204] (2/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_613-232856-0_sp1.1 from training. Duration: 0.4909375 2023-10-02 04:53:10,046 WARNING [train.py:1204] (2/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_181-78632-0_sp0.9 from training. Duration: 0.8666875 2023-10-02 04:53:14,707 INFO [train.py:1046] (2/4) Epoch 22, batch 2050, loss[loss=0.161, simple_loss=0.2143, pruned_loss=0.05386, over 19611.00 frames. ], tot_loss[loss=0.1749, simple_loss=0.2509, pruned_loss=0.04946, over 4704663.89 frames. ], batch size: 388, lr: 4.68e-03, grad_scale: 16.0 2023-10-02 04:53:14,770 WARNING [train.py:1204] (2/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_175-17485-0_sp1.1 from training. Duration: 0.97275 2023-10-02 04:53:16,123 WARNING [train.py:1204] (2/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_1004-183422-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 04:53:18,881 WARNING [train.py:1204] (2/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_32-65962-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 04:53:18,953 WARNING [train.py:1204] (2/4) Exclude cut with ID _15458_210_8341_1_1530858614263_3834410_40-298664-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 04:53:24,860 WARNING [train.py:1204] (2/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_380-261863-0_sp1.1 from training. Duration: 0.963625 2023-10-02 04:53:26,347 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_372-173012-0_sp1.1 from training. Duration: 0.863625 2023-10-02 04:53:26,403 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_57-82205-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 04:53:27,757 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_3691_210_3528_1_1526468169_4062053_355-334432-0_sp0.9 from training. Duration: 0.9666875 2023-10-02 04:53:29,160 WARNING [train.py:1204] (2/4) Exclude cut with ID _19023_210_4353_1_1531378892552_5820780_28-36615-0 from training. Duration: 0.97 2023-10-02 04:53:29,164 WARNING [train.py:1204] (2/4) Exclude cut with ID _29853_210_7497_1_1532505600851_6642110_286-251333-0_sp1.1 from training. Duration: 0.92725 2023-10-02 04:53:29,264 WARNING [train.py:1204] (2/4) Exclude cut with ID _31602_210_5854_1_1532677021456_4458250_518-80679-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 04:53:31,322 WARNING [train.py:1204] (2/4) Exclude cut with ID _14922_210_6228_1_1530707435273_3693310_38-187851-0_sp0.9 from training. Duration: 0.688875 2023-10-02 04:53:31,522 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.bypass_mid.scale_min, batch_count=757426.6666666666, ans=0.2 2023-10-02 04:53:40,343 WARNING [train.py:1204] (2/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_39-248870-0_sp1.1 from training. Duration: 0.62725 2023-10-02 04:53:40,979 WARNING [train.py:1204] (2/4) Exclude cut with ID _16908_210_5724_1_1531119157248_3772700_154-31455-0_sp1.1 from training. Duration: 0.97275 2023-10-02 04:53:42,328 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8453_210_4211_1_1528502940_8562730_347-330673-0 from training. Duration: 0.72 2023-10-02 04:53:43,822 WARNING [train.py:1204] (2/4) Exclude cut with ID _30191_210_1664_1_1533189915047_7196670_674-204247-0_sp1.1 from training. Duration: 0.97275 2023-10-02 04:53:45,369 WARNING [train.py:1204] (2/4) Exclude cut with ID _8833_210_5219_1_1528592665210_7553059_329-125156-0 from training. Duration: 0.82 2023-10-02 04:53:45,399 WARNING [train.py:1204] (2/4) Exclude cut with ID _4779_210_2084_1_1527318022944_3914893_500-285013-0_sp1.1 from training. Duration: 0.62725 2023-10-02 04:53:45,984 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.4.encoder.layers.0.whiten, num_groups=1, num_channels=384, metric=4.72 vs. limit=12.0 2023-10-02 04:53:47,037 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.conv_module1.balancer2.prob, batch_count=757493.3333333334, ans=0.125 2023-10-02 04:53:48,196 WARNING [train.py:1204] (2/4) Exclude cut with ID _14421_210_8750_1_1530604795825_3542290_190-313578-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 04:53:51,412 WARNING [train.py:1204] (2/4) Exclude cut with ID _35799_210_5854_1_1533263487887_3529071_538-273574-0_sp1.1 from training. Duration: 0.936375 2023-10-02 04:53:52,884 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_14928_210_5632_1_1530773461_4714000_454-112198-0_sp1.1 from training. Duration: 0.6 2023-10-02 04:53:52,917 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_468-143126-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 04:53:55,633 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_4528_210_3273_1_1527076654_3276777_258-121681-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 04:53:56,973 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_397-132565-0_sp1.1 from training. Duration: 0.9 2023-10-02 04:53:56,990 WARNING [train.py:1204] (2/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_454-123002-0_sp0.9 from training. Duration: 0.7666875 2023-10-02 04:53:57,197 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.bypass_mid.scale_min, batch_count=757493.3333333334, ans=0.2 2023-10-02 04:54:00,477 WARNING [train.py:1204] (2/4) Exclude cut with ID _27052_210_5986_1_1532329197187_3585009_297-86890-0_sp1.1 from training. Duration: 0.936375 2023-10-02 04:54:01,933 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_51225_210_3601_1_1534400634_2327383_55-301844-0_sp1.1 from training. Duration: 0.8454375 2023-10-02 04:54:03,397 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6786_210_1388_1_1527839699_4194495_378-46070-0_sp0.9 from training. Duration: 0.8 2023-10-02 04:54:04,752 WARNING [train.py:1204] (2/4) Exclude cut with ID _42334_210_7770_1_1534206610396_3605880_55-19758-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 04:54:08,079 WARNING [train.py:1204] (2/4) Exclude cut with ID _14884_210_2252_1_1530707459689_5561250_114-201399-0_sp0.9 from training. Duration: 0.8444375 2023-10-02 04:54:14,118 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_99-251314-0_sp0.9 from training. Duration: 0.9666875 2023-10-02 04:54:14,270 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.bypass.skip_rate, batch_count=757626.6666666666, ans=0.07 2023-10-02 04:54:16,790 WARNING [train.py:1204] (2/4) Exclude cut with ID _26528_210_9070_1_1532570410842_3851310_497-97218-0 from training. Duration: 0.86 2023-10-02 04:54:20,980 WARNING [train.py:1204] (2/4) Exclude cut with ID _55328_210_27767_1_1534580973260_4088170_230-317241-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 04:54:22,378 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_125-203105-0_sp0.9 from training. Duration: 0.888875 2023-10-02 04:54:24,057 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.bypass_mid.scale_min, batch_count=757626.6666666666, ans=0.2 2023-10-02 04:54:25,225 WARNING [train.py:1204] (2/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_55-31904-0_sp1.1 from training. Duration: 0.8 2023-10-02 04:54:25,359 WARNING [train.py:1204] (2/4) Exclude cut with ID _8064_210_5399_1_1528372299291_4282973_18-174647-0 from training. Duration: 0.91 2023-10-02 04:54:29,516 WARNING [train.py:1204] (2/4) Exclude cut with ID IC0165W0005-81726 from training. Duration: 0.952 2023-10-02 04:54:29,517 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_31583_210_3852_1_1532595851_4024413_148-171847-0_sp1.1 from training. Duration: 0.963625 2023-10-02 04:54:29,599 WARNING [train.py:1204] (2/4) Exclude cut with ID _21989_210_5854_1_1531899214931_3271360_116-138769-0_sp1.1 from training. Duration: 0.936375 2023-10-02 04:54:30,829 INFO [train.py:1046] (2/4) Epoch 22, batch 2100, loss[loss=0.1739, simple_loss=0.2581, pruned_loss=0.04481, over 24100.00 frames. ], tot_loss[loss=0.1741, simple_loss=0.2498, pruned_loss=0.04919, over 4702011.48 frames. ], batch size: 80, lr: 4.68e-03, grad_scale: 16.0 2023-10-02 04:54:30,916 WARNING [train.py:1204] (2/4) Exclude cut with ID _66070_210_7640_1_1535441393096_7805310_489-292631-0_sp1.1 from training. Duration: 0.7181875 2023-10-02 04:54:31,003 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_20120_210_6753_1_1531643035_5482859_202-27960-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 04:54:31,011 WARNING [train.py:1204] (2/4) Exclude cut with ID _5294_210_1747_1_1527249643928_3696179_342-270278-0 from training. Duration: 0.55 2023-10-02 04:54:31,302 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.0.feed_forward1.out_proj.dropout_p, batch_count=757693.3333333334, ans=0.1 2023-10-02 04:54:32,494 WARNING [train.py:1204] (2/4) Exclude cut with ID _38323_210_12108_1_1533121233369_3303049_416-11919-0 from training. Duration: 0.96 2023-10-02 04:54:34,064 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.conv_module1.balancer2.prob, batch_count=757693.3333333334, ans=0.125 2023-10-02 04:54:35,095 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6545_210_2040_1_1527774670_4245258_407-48070-0_sp0.9 from training. Duration: 0.8444375 2023-10-02 04:54:37,840 WARNING [train.py:1204] (2/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_503-30719-0_sp1.1 from training. Duration: 0.8909375 2023-10-02 04:54:37,910 WARNING [train.py:1204] (2/4) Exclude cut with ID _49643_210_7989_1_1534640244224_3939031_234-326497-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 04:54:41,250 WARNING [train.py:1204] (2/4) Exclude cut with ID _14170_210_5219_1_1530525949868_4469650_301-77873-0_sp1.1 from training. Duration: 0.963625 2023-10-02 04:54:41,308 WARNING [train.py:1204] (2/4) Exclude cut with ID _45297_210_2721_1_1533715351381_3264949_152-154697-0_sp1.1 from training. Duration: 0.97275 2023-10-02 04:54:41,323 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_3371_210_1794_1_1526384462_5580777_396-339812-0 from training. Duration: 0.85 2023-10-02 04:54:43,239 WARNING [train.py:1204] (2/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_399-156791-0_sp1.1 from training. Duration: 0.7545625 2023-10-02 04:54:44,537 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7696_210_2040_1_1528207167_3716099_117-125434-0 from training. Duration: 0.76 2023-10-02 04:54:44,548 WARNING [train.py:1204] (2/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_96-244299-0 from training. Duration: 0.88 2023-10-02 04:54:47,279 WARNING [train.py:1204] (2/4) Exclude cut with ID _37892_210_19596_1_1533198677897_3964058_519-343462-0_sp1.1 from training. Duration: 0.92725 2023-10-02 04:54:47,312 WARNING [train.py:1204] (2/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_202-183922-0_sp1.1 from training. Duration: 0.77275 2023-10-02 04:54:47,319 WARNING [train.py:1204] (2/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_453-133983-0 from training. Duration: 0.78 2023-10-02 04:54:47,363 WARNING [train.py:1204] (2/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_178-39722-0_sp1.1 from training. Duration: 0.4454375 2023-10-02 04:54:53,038 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_15126_210_6753_1_1530778677_2591185_113-157651-0 from training. Duration: 0.68 2023-10-02 04:54:53,039 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_271-234962-0_sp1.1 from training. Duration: 0.7181875 2023-10-02 04:54:55,813 WARNING [train.py:1204] (2/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_195-64967-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 04:54:55,860 WARNING [train.py:1204] (2/4) Exclude cut with ID _43552_210_8913_1_1533726031059_3523429_365-215065-0_sp1.1 from training. Duration: 0.936375 2023-10-02 04:55:01,103 WARNING [train.py:1204] (2/4) Exclude cut with ID _31602_210_5854_1_1532677021456_4458250_249-96604-0_sp0.9 from training. Duration: 0.988875 2023-10-02 04:55:01,176 WARNING [train.py:1204] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_307-158462-0 from training. Duration: 0.69 2023-10-02 04:55:02,525 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.504e+02 1.868e+02 2.085e+02 2.437e+02 3.685e+02, threshold=4.169e+02, percent-clipped=0.0 2023-10-02 04:55:02,613 WARNING [train.py:1204] (2/4) Exclude cut with ID _30918_210_6400_1_1532754078600_3125920_407-223394-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 04:55:02,623 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6673_210_3273_1_1527767790_3173240_351-167954-0_sp0.9 from training. Duration: 0.5333125 2023-10-02 04:55:03,978 WARNING [train.py:1204] (2/4) Exclude cut with ID _4866_210_2577_1_1527404412284_4364659_256-306830-0 from training. Duration: 0.87 2023-10-02 04:55:04,023 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_17613_210_2151_1_1531137569_2366160_222-131118-0_sp1.1 from training. Duration: 0.963625 2023-10-02 04:55:04,034 WARNING [train.py:1204] (2/4) Exclude cut with ID _45560_210_21982_1_1533812603951_3584999_111-6376-0 from training. Duration: 0.84 2023-10-02 04:55:05,371 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_216-73841-0 from training. Duration: 0.96 2023-10-02 04:55:05,407 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_14058_210_4163_1_1530690953_6327180_49-210687-0 from training. Duration: 0.94 2023-10-02 04:55:06,814 WARNING [train.py:1204] (2/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_506-42614-0_sp0.9 from training. Duration: 0.87775 2023-10-02 04:55:09,455 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_3133_210_2087_1_1526106446_2324690_25-332138-0_sp1.1 from training. Duration: 0.82725 2023-10-02 04:55:11,403 WARNING [train.py:1204] (2/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_589-9070-0_sp1.1 from training. Duration: 0.6454375 2023-10-02 04:55:12,890 WARNING [train.py:1204] (2/4) Exclude cut with ID _6194_210_1534_1_1527511826160_3941194_14-282876-0_sp1.1 from training. Duration: 0.6454375 2023-10-02 04:55:14,308 WARNING [train.py:1204] (2/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_1166-90654-0_sp1.1 from training. Duration: 0.92725 2023-10-02 04:55:17,546 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_14056_210_4163_1_1530604483_7572827_218-255858-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 04:55:17,548 WARNING [train.py:1204] (2/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_1285-256622-0 from training. Duration: 0.84 2023-10-02 04:55:17,568 WARNING [train.py:1204] (2/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_205-286261-0_sp1.1 from training. Duration: 0.963625 2023-10-02 04:55:17,585 WARNING [train.py:1204] (2/4) Exclude cut with ID _68777_210_4353_1_1535810390731_3117904_6-154119-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 04:55:17,650 WARNING [train.py:1204] (2/4) Exclude cut with ID _22982_210_1883_1_1531882790447_5449409_565-199393-0_sp1.1 from training. Duration: 0.92725 2023-10-02 04:55:18,988 WARNING [train.py:1204] (2/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_278-154492-0 from training. Duration: 0.76 2023-10-02 04:55:20,350 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_426-313557-0 from training. Duration: 0.53 2023-10-02 04:55:20,405 WARNING [train.py:1204] (2/4) Exclude cut with ID _41817_210_18835_1_1533434477160_7423049_555-293448-0 from training. Duration: 0.8 2023-10-02 04:55:24,474 WARNING [train.py:1204] (2/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_32-335103-0_sp1.1 from training. Duration: 0.7454375 2023-10-02 04:55:28,469 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_2655_210_2087_1_1525688959_3897400_202-147557-0_sp1.1 from training. Duration: 0.77275 2023-10-02 04:55:28,510 WARNING [train.py:1204] (2/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_158-250116-0 from training. Duration: 0.62 2023-10-02 04:55:34,010 WARNING [train.py:1204] (2/4) Exclude cut with ID _55429_210_10626_1_1534467588527_7174143_528-88083-0_sp1.1 from training. Duration: 0.963625 2023-10-02 04:55:35,486 WARNING [train.py:1204] (2/4) Exclude cut with ID _8030_210_7497_1_1528444886781_3531089_32-31837-0_sp1.1 from training. Duration: 0.8818125 2023-10-02 04:55:36,931 WARNING [train.py:1204] (2/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_744-86168-0_sp1.1 from training. Duration: 0.97275 2023-10-02 04:55:36,938 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1113-7369-0_sp0.9 from training. Duration: 0.9444375 2023-10-02 04:55:36,954 WARNING [train.py:1204] (2/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_124-345840-0_sp0.9 from training. Duration: 0.57775 2023-10-02 04:55:37,007 WARNING [train.py:1204] (2/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_328-264776-0_sp0.9 from training. Duration: 0.7444375 2023-10-02 04:55:37,178 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.1.ff2_skip_rate, batch_count=757960.0, ans=0.0 2023-10-02 04:55:38,437 WARNING [train.py:1204] (2/4) Exclude cut with ID _18895_210_6228_1_1531357274234_3784930_469-166615-0_sp1.1 from training. Duration: 0.963625 2023-10-02 04:55:38,443 WARNING [train.py:1204] (2/4) Exclude cut with ID _8395_210_1531_1_1528597871523_4411579_431-132470-0_sp0.9 from training. Duration: 0.82225 2023-10-02 04:55:41,021 WARNING [train.py:1204] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_554-45617-0_sp1.1 from training. Duration: 0.9 2023-10-02 04:55:41,041 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_35361_210_4238_1_1533262810_7793476_524-188242-0_sp1.1 from training. Duration: 0.92725 2023-10-02 04:55:42,987 WARNING [train.py:1204] (2/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_938-205135-0 from training. Duration: 0.87 2023-10-02 04:55:44,443 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_5931_210_2040_1_1527428481_4547999_321-193614-0 from training. Duration: 0.67 2023-10-02 04:55:44,470 WARNING [train.py:1204] (2/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_445-269521-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 04:55:45,762 INFO [train.py:1046] (2/4) Epoch 22, batch 2150, loss[loss=0.1837, simple_loss=0.2439, pruned_loss=0.06169, over 22729.00 frames. ], tot_loss[loss=0.1737, simple_loss=0.2494, pruned_loss=0.04902, over 4714216.34 frames. ], batch size: 322, lr: 4.68e-03, grad_scale: 16.0 2023-10-02 04:55:46,594 WARNING [train.py:1204] (2/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_3-33116-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 04:55:46,604 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_4633_210_2040_1_1526824163_4145343_416-76117-0_sp1.1 from training. Duration: 0.863625 2023-10-02 04:55:46,632 WARNING [train.py:1204] (2/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_1033-274376-0_sp1.1 from training. Duration: 0.7909375 2023-10-02 04:55:47,991 WARNING [train.py:1204] (2/4) Exclude cut with ID _8772_210_1850_1_1528635595972_3664011_398-320049-0_sp0.9 from training. Duration: 0.97775 2023-10-02 04:55:52,165 WARNING [train.py:1204] (2/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_321-215742-0_sp0.9 from training. Duration: 0.5555625 2023-10-02 04:55:53,547 WARNING [train.py:1204] (2/4) Exclude cut with ID _28394_210_8913_1_1532430049518_3650118_272-190950-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 04:55:54,905 WARNING [train.py:1204] (2/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_31-299371-0_sp1.1 from training. Duration: 0.92725 2023-10-02 04:55:55,166 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=758026.6666666666, ans=0.1 2023-10-02 04:55:56,339 WARNING [train.py:1204] (2/4) Exclude cut with ID _55383_210_10635_1_1534468426172_2231669_134-128246-0_sp0.9 from training. Duration: 0.988875 2023-10-02 04:55:56,345 WARNING [train.py:1204] (2/4) Exclude cut with ID _40519_210_13794_1_1533715228655_7337390_887-9547-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 04:55:58,110 WARNING [train.py:1204] (2/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_310-109461-0_sp0.9 from training. Duration: 0.888875 2023-10-02 04:56:01,458 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_2009_210_2071_1_1525481134_4588937_258-250196-0_sp1.1 from training. Duration: 0.92725 2023-10-02 04:56:01,513 WARNING [train.py:1204] (2/4) Exclude cut with ID _9235_210_5219_1_1528891154253_8046120_1186-72702-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 04:56:01,515 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_14831_210_7927_1_1530700200_5583610_181-27467-0_sp1.1 from training. Duration: 0.82725 2023-10-02 04:56:04,334 WARNING [train.py:1204] (2/4) Exclude cut with ID _40402_210_18219_1_1533522324115_2953150_278-132109-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 04:56:05,631 WARNING [train.py:1204] (2/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_261-297503-0 from training. Duration: 0.99 2023-10-02 04:56:08,642 WARNING [train.py:1204] (2/4) Exclude cut with ID _19896_210_1883_1_1531709022662_6649569_313-265903-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 04:56:11,319 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_105-114797-0_sp1.1 from training. Duration: 0.763625 2023-10-02 04:56:13,215 WARNING [train.py:1204] (2/4) Exclude cut with ID _53755_210_8254_1_1534577312941_4707360_9-65843-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 04:56:13,238 WARNING [train.py:1204] (2/4) Exclude cut with ID _18516_210_9206_1_1531704653206_3692406_361-224549-0_sp1.1 from training. Duration: 0.97275 2023-10-02 04:56:13,269 WARNING [train.py:1204] (2/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_403-310975-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 04:56:13,320 WARNING [train.py:1204] (2/4) Exclude cut with ID _5444_210_2151_1_1527418808464_5312210_581-315411-0_sp0.9 from training. Duration: 0.688875 2023-10-02 04:56:14,732 WARNING [train.py:1204] (2/4) Exclude cut with ID _23825_210_13449_1_1531915313526_3497150_464-154904-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 04:56:14,742 WARNING [train.py:1204] (2/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_937-208281-0_sp0.9 from training. Duration: 0.9444375 2023-10-02 04:56:16,626 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_37839_210_7234_1_1533542462_3689303_158-181212-0_sp1.1 from training. Duration: 0.963625 2023-10-02 04:56:16,739 WARNING [train.py:1204] (2/4) Exclude cut with ID _8772_210_1850_1_1528635595972_3664011_398-320049-0 from training. Duration: 0.88 2023-10-02 04:56:18,158 WARNING [train.py:1204] (2/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_718-250907-0_sp1.1 from training. Duration: 0.62725 2023-10-02 04:56:19,440 WARNING [train.py:1204] (2/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_641-126395-0_sp1.1 from training. Duration: 0.92725 2023-10-02 04:56:19,477 WARNING [train.py:1204] (2/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_166-241493-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 04:56:20,855 WARNING [train.py:1204] (2/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_526-62079-0_sp0.9 from training. Duration: 0.7444375 2023-10-02 04:56:20,949 WARNING [train.py:1204] (2/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_439-275083-0_sp1.1 from training. Duration: 0.936375 2023-10-02 04:56:23,637 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_66907_210_21581_1_1535615661_3590177_493-229032-0_sp1.1 from training. Duration: 0.92725 2023-10-02 04:56:24,935 WARNING [train.py:1204] (2/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_89-321955-0_sp1.1 from training. Duration: 0.72725 2023-10-02 04:56:25,955 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.0.layers.1.feed_forward1.out_whiten, num_groups=1, num_channels=192, metric=4.82 vs. limit=15.0 2023-10-02 04:56:26,330 WARNING [train.py:1204] (2/4) Exclude cut with ID _29857_210_20034_1_1532395886753_3798160_273-226195-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 04:56:26,336 WARNING [train.py:1204] (2/4) Exclude cut with ID _22826_210_9994_1_1531908201522_3279169_404-75889-0 from training. Duration: 0.89 2023-10-02 04:56:26,361 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_271-234962-0_sp0.9 from training. Duration: 0.87775 2023-10-02 04:56:28,456 WARNING [train.py:1204] (2/4) Exclude cut with ID _22961_210_8254_1_1531976366672_4278180_156-339685-0_sp1.1 from training. Duration: 0.97275 2023-10-02 04:56:29,751 WARNING [train.py:1204] (2/4) Exclude cut with ID _37892_210_19596_1_1533198677897_3964058_425-231927-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 04:56:31,593 WARNING [train.py:1204] (2/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_102-288670-0_sp1.1 from training. Duration: 0.97275 2023-10-02 04:56:32,845 WARNING [train.py:1204] (2/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_283-220372-0_sp1.1 from training. Duration: 0.5090625 2023-10-02 04:56:32,946 WARNING [train.py:1204] (2/4) Exclude cut with ID _44304_210_15374_1_1533639352765_4613129_86-1699-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 04:56:34,277 WARNING [train.py:1204] (2/4) Exclude cut with ID _2891_210_2550_1_1525874398521_4427710_327-121590-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 04:56:34,284 WARNING [train.py:1204] (2/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_369-222060-0 from training. Duration: 0.71 2023-10-02 04:56:36,918 WARNING [train.py:1204] (2/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_762-109886-0 from training. Duration: 0.91 2023-10-02 04:56:36,950 WARNING [train.py:1204] (2/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_477-255782-0_sp0.9 from training. Duration: 0.911125 2023-10-02 04:56:38,341 WARNING [train.py:1204] (2/4) Exclude cut with ID IC0669W0061-330906 from training. Duration: 0.768 2023-10-02 04:56:38,384 WARNING [train.py:1204] (2/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_347-213071-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 04:56:38,414 WARNING [train.py:1204] (2/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_507-303370-0_sp1.1 from training. Duration: 0.87275 2023-10-02 04:56:39,021 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.3.encoder.layers.2.self_attn_weights.whiten_keys, num_groups=8, num_channels=256, metric=4.17 vs. limit=6.0 2023-10-02 04:56:39,723 WARNING [train.py:1204] (2/4) Exclude cut with ID _15012_210_1968_1_1530856820429_5095350_688-178849-0 from training. Duration: 0.64 2023-10-02 04:56:39,732 WARNING [train.py:1204] (2/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_28-70987-0_sp0.9 from training. Duration: 0.9666875 2023-10-02 04:56:39,735 WARNING [train.py:1204] (2/4) Exclude cut with ID _5910_210_1389_1_1527337972454_5050100_555-159815-0 from training. Duration: 0.98 2023-10-02 04:56:41,038 WARNING [train.py:1204] (2/4) Exclude cut with ID IC0679W0064-335871 from training. Duration: 0.768 2023-10-02 04:56:41,038 WARNING [train.py:1204] (2/4) Exclude cut with ID IC0679W0079-335886 from training. Duration: 0.896 2023-10-02 04:56:41,086 WARNING [train.py:1204] (2/4) Exclude cut with ID _5608_210_2106_1_1527415439089_6819768_909-82767-0 from training. Duration: 0.61 2023-10-02 04:56:43,124 WARNING [train.py:1204] (2/4) Exclude cut with ID _23038_210_9570_1_1532001584158_3652419_411-323967-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 04:56:43,186 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_40896_210_15710_1_1533433956_7100802_350-231748-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 04:56:43,196 WARNING [train.py:1204] (2/4) Exclude cut with ID _5289_210_2084_1_1527678202736_3779299_142-103317-0_sp0.9 from training. Duration: 0.9333125 2023-10-02 04:56:44,549 WARNING [train.py:1204] (2/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_314-144055-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 04:56:45,927 WARNING [train.py:1204] (2/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_177-236809-0_sp0.9 from training. Duration: 0.5444375 2023-10-02 04:56:46,148 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.conv_module1.balancer1.prob, batch_count=758293.3333333334, ans=0.125 2023-10-02 04:56:46,160 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.conv_skip_rate, batch_count=758293.3333333334, ans=0.0 2023-10-02 04:56:47,416 WARNING [train.py:1204] (2/4) Exclude cut with ID _23038_210_9570_1_1532001584158_3652419_82-334733-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 04:56:47,424 WARNING [train.py:1204] (2/4) Exclude cut with ID _43552_210_8913_1_1533726031059_3523429_225-22526-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 04:56:48,170 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.3.encoder.layers.2.self_attn_weights.whiten_keys, num_groups=8, num_channels=256, metric=4.22 vs. limit=6.0 2023-10-02 04:56:52,318 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.1.balancer2.prob, batch_count=758293.3333333334, ans=0.125 2023-10-02 04:56:53,468 WARNING [train.py:1204] (2/4) Exclude cut with ID _30327_210_16049_1_1532484161295_3924169_401-70819-0_sp1.1 from training. Duration: 0.7818125 2023-10-02 04:56:53,517 WARNING [train.py:1204] (2/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_538-32291-0 from training. Duration: 0.83 2023-10-02 04:56:59,454 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_37357_210_17024_1_1533089504_2316121_258-211605-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 04:57:00,786 INFO [train.py:1046] (2/4) Epoch 22, batch 2200, loss[loss=0.1663, simple_loss=0.2546, pruned_loss=0.03899, over 24451.00 frames. ], tot_loss[loss=0.1738, simple_loss=0.249, pruned_loss=0.04929, over 4712273.18 frames. ], batch size: 69, lr: 4.68e-03, grad_scale: 8.0 2023-10-02 04:57:02,666 WARNING [train.py:1204] (2/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_550-335198-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 04:57:04,003 WARNING [train.py:1204] (2/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_98-741-0_sp0.9 from training. Duration: 0.888875 2023-10-02 04:57:04,047 WARNING [train.py:1204] (2/4) Exclude cut with ID _38323_210_12108_1_1533121233369_3303049_251-238988-0_sp1.1 from training. Duration: 0.92725 2023-10-02 04:57:05,443 WARNING [train.py:1204] (2/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_259-34038-0_sp0.9 from training. Duration: 0.82225 2023-10-02 04:57:08,182 WARNING [train.py:1204] (2/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_1060-180633-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 04:57:09,459 WARNING [train.py:1204] (2/4) Exclude cut with ID _8755_210_1876_1_1528628559357_3258209_211-37473-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 04:57:09,465 WARNING [train.py:1204] (2/4) Exclude cut with ID _4122_210_2084_1_1526713164417_4353280_528-206771-0 from training. Duration: 0.88 2023-10-02 04:57:09,724 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.0.ff2_skip_rate, batch_count=758360.0, ans=0.0 2023-10-02 04:57:14,257 WARNING [train.py:1204] (2/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_64-248359-0 from training. Duration: 0.69 2023-10-02 04:57:17,052 WARNING [train.py:1204] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_402-253027-0_sp1.1 from training. Duration: 0.4909375 2023-10-02 04:57:23,265 WARNING [train.py:1204] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_625-102955-0 from training. Duration: 0.77 2023-10-02 04:57:24,836 WARNING [train.py:1204] (2/4) Exclude cut with ID _14998_210_5219_1_1530705582080_7474970_611-219324-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 04:57:24,922 WARNING [train.py:1204] (2/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_718-68505-0_sp0.9 from training. Duration: 0.988875 2023-10-02 04:57:26,213 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_353-182855-0_sp1.1 from training. Duration: 0.87275 2023-10-02 04:57:29,597 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_5960_210_1794_1_1527403865_3990999_515-2613-0_sp0.9 from training. Duration: 0.97775 2023-10-02 04:57:29,625 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_5575_210_1410_1_1527301305_2570170_342-265572-0 from training. Duration: 0.94 2023-10-02 04:57:32,966 WARNING [train.py:1204] (2/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_329-113173-0_sp0.9 from training. Duration: 0.688875 2023-10-02 04:57:34,191 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.478e+02 1.806e+02 1.968e+02 2.209e+02 3.586e+02, threshold=3.937e+02, percent-clipped=0.0 2023-10-02 04:57:34,570 INFO [scaling.py:1118] (2/4) WithLoss: name=encoder.encoders.3.encoder.layers.2.self_attn_weights, loss-sum=0.000e+00 2023-10-02 04:57:35,637 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_15396_210_6890_1_1531308473_1938936_213-312506-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 04:57:36,972 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_521-214276-0_sp1.1 from training. Duration: 0.4 2023-10-02 04:57:39,869 WARNING [train.py:1204] (2/4) Exclude cut with ID _15026_210_8750_1_1530777602316_3291850_211-195043-0_sp1.1 from training. Duration: 0.72725 2023-10-02 04:57:41,346 WARNING [train.py:1204] (2/4) Exclude cut with ID _29343_210_4381_1_1532602757752_3674109_108-257494-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 04:57:42,754 WARNING [train.py:1204] (2/4) Exclude cut with ID _64414_210_17301_1_1535243252048_3757759_403-58151-0_sp1.1 from training. Duration: 0.963625 2023-10-02 04:57:43,601 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.conv_module2.balancer2.prob, batch_count=758493.3333333334, ans=0.125 2023-10-02 04:57:44,811 WARNING [train.py:1204] (2/4) Exclude cut with ID _36512_210_6987_1_1533033143998_3126069_109-177787-0_sp1.1 from training. Duration: 0.92725 2023-10-02 04:57:46,223 WARNING [train.py:1204] (2/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_498-40153-0 from training. Duration: 0.63 2023-10-02 04:57:47,592 WARNING [train.py:1204] (2/4) Exclude cut with ID _56447_210_1389_1_1535074245910_3697189_161-334523-0_sp1.1 from training. Duration: 0.936375 2023-10-02 04:57:48,980 WARNING [train.py:1204] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_238-245655-0 from training. Duration: 0.81 2023-10-02 04:57:52,176 WARNING [train.py:1204] (2/4) Exclude cut with ID _64631_210_27767_1_1535349580068_4165160_108-43865-0_sp1.1 from training. Duration: 0.92725 2023-10-02 04:57:52,183 WARNING [train.py:1204] (2/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_243-42116-0_sp1.1 from training. Duration: 0.663625 2023-10-02 04:57:52,206 WARNING [train.py:1204] (2/4) Exclude cut with ID _66868_210_9527_1_1535509287954_4561140_594-132160-0_sp1.1 from training. Duration: 0.92725 2023-10-02 04:57:52,355 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.1.nonlin_attention.balancer.prob, batch_count=758560.0, ans=0.125 2023-10-02 04:57:54,899 WARNING [train.py:1204] (2/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_48-338976-0_sp0.9 from training. Duration: 0.988875 2023-10-02 04:57:54,944 WARNING [train.py:1204] (2/4) Exclude cut with ID _5250_210_5219_1_1527249561806_8692110_137-100595-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 04:57:54,950 WARNING [train.py:1204] (2/4) Exclude cut with ID _49748_210_24116_1_1533986971737_3158910_12-271669-0_sp1.1 from training. Duration: 0.936375 2023-10-02 04:57:54,976 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_37357_210_17024_1_1533089504_2316121_206-80421-0_sp1.1 from training. Duration: 0.936375 2023-10-02 04:57:56,338 WARNING [train.py:1204] (2/4) Exclude cut with ID _7537_210_2084_1_1528502395431_3924359_113-199933-0_sp1.1 from training. Duration: 0.536375 2023-10-02 04:57:56,379 WARNING [train.py:1204] (2/4) Exclude cut with ID _55440_210_12651_1_1534496713364_2980520_31-59289-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 04:57:59,345 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1083-220122-0_sp0.9 from training. Duration: 0.5666875 2023-10-02 04:58:02,592 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_426-313557-0_sp1.1 from training. Duration: 0.4818125 2023-10-02 04:58:03,880 WARNING [train.py:1204] (2/4) Exclude cut with ID _35799_210_5854_1_1533263487887_3529071_324-94843-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 04:58:05,963 WARNING [train.py:1204] (2/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_264-108685-0_sp1.1 from training. Duration: 0.5 2023-10-02 04:58:07,307 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9012W0006-501141 from training. Duration: 0.896 2023-10-02 04:58:08,791 WARNING [train.py:1204] (2/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_890-5117-0_sp0.9 from training. Duration: 0.8666875 2023-10-02 04:58:10,298 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9023W0008-506301 from training. Duration: 0.896 2023-10-02 04:58:10,378 WARNING [train.py:1204] (2/4) Exclude cut with ID _13916_210_9994_1_1530788341126_3763149_60-191556-0_sp0.9 from training. Duration: 0.67775 2023-10-02 04:58:11,650 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9029W0331-509721 from training. Duration: 0.896 2023-10-02 04:58:13,041 WARNING [train.py:1204] (2/4) Exclude cut with ID _35823_210_6917_1_1533187881914_7297830_567-108596-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 04:58:15,029 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7543_210_6753_1_1528544010_1110402_25-183860-0_sp1.1 from training. Duration: 0.42725 2023-10-02 04:58:15,139 WARNING [train.py:1204] (2/4) Exclude cut with ID _47278_210_12041_1_1533902418349_3576110_495-260035-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 04:58:16,413 INFO [train.py:1046] (2/4) Epoch 22, batch 2250, loss[loss=0.1824, simple_loss=0.2567, pruned_loss=0.05401, over 23599.00 frames. ], tot_loss[loss=0.174, simple_loss=0.2497, pruned_loss=0.04913, over 4718062.40 frames. ], batch size: 256, lr: 4.68e-03, grad_scale: 8.0 2023-10-02 04:58:16,509 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9051W0015-520281 from training. Duration: 0.9045625 2023-10-02 04:58:19,231 WARNING [train.py:1204] (2/4) Exclude cut with ID _14459_210_2228_1_1531443641946_7516700_707-66193-0_sp1.1 from training. Duration: 0.97275 2023-10-02 04:58:21,272 WARNING [train.py:1204] (2/4) Exclude cut with ID _14087_210_2253_1_1530529224512_3014029_219-224445-0_sp1.1 from training. Duration: 0.763625 2023-10-02 04:58:22,831 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.conv_module2.balancer1.prob, batch_count=758693.3333333334, ans=0.125 2023-10-02 04:58:26,748 WARNING [train.py:1204] (2/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_345-167986-0_sp0.9 from training. Duration: 0.9333125 2023-10-02 04:58:27,363 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.3.encoder.layers.3.nonlin_attention.whiten2, num_groups=1, num_channels=512, metric=5.75 vs. limit=15.0 2023-10-02 04:58:28,198 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_34815_210_6388_1_1533381320_3571903_279-305565-0_sp1.1 from training. Duration: 0.836375 2023-10-02 04:58:31,003 WARNING [train.py:1204] (2/4) Exclude cut with ID _21987_210_5854_1_1531821500482_3637038_317-179212-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 04:58:31,069 WARNING [train.py:1204] (2/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_395-37222-0_sp1.1 from training. Duration: 0.6909375 2023-10-02 04:58:32,469 WARNING [train.py:1204] (2/4) Exclude cut with ID _8064_210_5399_1_1528372299291_4282973_168-50337-0_sp1.1 from training. Duration: 0.763625 2023-10-02 04:58:33,861 WARNING [train.py:1204] (2/4) Exclude cut with ID _60113_210_16271_1_1535164212481_4386079_559-167191-0 from training. Duration: 0.93 2023-10-02 04:58:33,870 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_47228_210_4983_1_1533780021_3672027_344-179642-0_sp1.1 from training. Duration: 0.92725 2023-10-02 04:58:33,896 WARNING [train.py:1204] (2/4) Exclude cut with ID _22961_210_8254_1_1531976366672_4278180_317-199546-0_sp0.9 from training. Duration: 0.9555625 2023-10-02 04:58:35,846 WARNING [train.py:1204] (2/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_141-89727-0 from training. Duration: 0.71 2023-10-02 04:58:37,605 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_78-116075-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 04:58:37,618 WARNING [train.py:1204] (2/4) Exclude cut with ID _64086_210_7994_1_1535270417019_7435949_52-235756-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 04:58:39,073 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7615_210_4211_1_1528110965_4580160_72-235211-0_sp1.1 from training. Duration: 0.6909375 2023-10-02 04:58:42,836 WARNING [train.py:1204] (2/4) Exclude cut with ID _14069_210_5632_1_1530512692033_4803390_287-27950-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 04:58:44,815 WARNING [train.py:1204] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_74-48717-0_sp0.9 from training. Duration: 0.6444375 2023-10-02 04:58:46,144 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7544_210_6753_1_1528591327_1471426_172-348777-0_sp1.1 from training. Duration: 0.6 2023-10-02 04:58:46,446 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.conv_skip_rate, batch_count=758826.6666666666, ans=0.0 2023-10-02 04:58:47,557 WARNING [train.py:1204] (2/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_220-191080-0 from training. Duration: 0.98 2023-10-02 04:58:49,030 WARNING [train.py:1204] (2/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_1081-137641-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 04:58:50,545 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.feed_forward3.hidden_balancer.prob, batch_count=758826.6666666666, ans=0.125 2023-10-02 04:58:53,099 WARNING [train.py:1204] (2/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_278-235459-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 04:58:56,541 WARNING [train.py:1204] (2/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_1181-77470-0_sp1.1 from training. Duration: 0.936375 2023-10-02 04:58:57,968 WARNING [train.py:1204] (2/4) Exclude cut with ID _60860_210_8254_1_1534896015875_3557169_198-5531-0_sp1.1 from training. Duration: 0.936375 2023-10-02 04:58:59,325 WARNING [train.py:1204] (2/4) Exclude cut with ID _12369_210_4381_1_1530356078581_4388520_17-289781-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 04:58:59,330 WARNING [train.py:1204] (2/4) Exclude cut with ID _50310_210_15488_1_1534068043345_3641080_146-199752-0_sp1.1 from training. Duration: 0.92725 2023-10-02 04:59:01,582 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.0.layers.0.self_attn1.whiten, num_groups=1, num_channels=192, metric=11.47 vs. limit=22.5 2023-10-02 04:59:02,064 WARNING [train.py:1204] (2/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_969-213354-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 04:59:03,388 WARNING [train.py:1204] (2/4) Exclude cut with ID _70519_210_4163_1_1535961595994_7458230_66-344156-0_sp1.1 from training. Duration: 0.963625 2023-10-02 04:59:08,057 WARNING [train.py:1204] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_380-204756-0_sp1.1 from training. Duration: 0.7818125 2023-10-02 04:59:10,089 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_128-202104-0_sp0.9 from training. Duration: 0.7 2023-10-02 04:59:14,330 WARNING [train.py:1204] (2/4) Exclude cut with ID _4697_210_2069_1_1526815801953_3823098_51-1049-0_sp0.9 from training. Duration: 0.8333125 2023-10-02 04:59:14,354 WARNING [train.py:1204] (2/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_86-66192-0_sp1.1 from training. Duration: 0.5 2023-10-02 04:59:15,690 WARNING [train.py:1204] (2/4) Exclude cut with ID _14069_210_5632_1_1530512692033_4803390_95-1896-0_sp1.1 from training. Duration: 0.8909375 2023-10-02 04:59:21,846 WARNING [train.py:1204] (2/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_29-293896-0_sp1.1 from training. Duration: 0.5545625 2023-10-02 04:59:23,357 WARNING [train.py:1204] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_167-137255-0_sp0.9 from training. Duration: 0.711125 2023-10-02 04:59:23,358 WARNING [train.py:1204] (2/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_217-95056-0 from training. Duration: 0.98 2023-10-02 04:59:23,378 WARNING [train.py:1204] (2/4) Exclude cut with ID _47278_210_12041_1_1533902418349_3576110_97-266746-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 04:59:24,727 WARNING [train.py:1204] (2/4) Exclude cut with ID _14224_210_4381_1_1530529062037_4264010_380-343670-0_sp1.1 from training. Duration: 0.8 2023-10-02 04:59:28,151 WARNING [train.py:1204] (2/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_107-332398-0 from training. Duration: 0.78 2023-10-02 04:59:30,851 INFO [train.py:1046] (2/4) Epoch 22, batch 2300, loss[loss=0.1864, simple_loss=0.2672, pruned_loss=0.05282, over 24340.00 frames. ], tot_loss[loss=0.1749, simple_loss=0.251, pruned_loss=0.04937, over 4714264.02 frames. ], batch size: 74, lr: 4.68e-03, grad_scale: 8.0 2023-10-02 04:59:30,935 WARNING [train.py:1204] (2/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_91-267696-0_sp1.1 from training. Duration: 0.7909375 2023-10-02 04:59:30,969 WARNING [train.py:1204] (2/4) Exclude cut with ID _41298_210_9019_1_1533430371450_4822143_498-186993-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 04:59:37,232 WARNING [train.py:1204] (2/4) Exclude cut with ID _17956_210_4353_1_1531209568125_6979970_54-77610-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 04:59:37,279 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_40047_210_2290_1_1533290519_2374311_257-335162-0_sp1.1 from training. Duration: 0.863625 2023-10-02 04:59:39,911 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9653W0193-675471 from training. Duration: 0.9656875 2023-10-02 04:59:41,246 WARNING [train.py:1204] (2/4) Exclude cut with ID _25248_210_8750_1_1532062825941_5554180_243-240536-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 04:59:45,002 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.feed_forward1.out_proj.dropout_p, batch_count=759093.3333333334, ans=0.1 2023-10-02 04:59:48,068 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_32360_210_4198_1_1532689160_3648436_60-51908-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 04:59:49,354 WARNING [train.py:1204] (2/4) Exclude cut with ID _4092_210_2151_1_1526643044449_3135759_227-213972-0_sp0.9 from training. Duration: 0.6 2023-10-02 04:59:49,398 WARNING [train.py:1204] (2/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_392-173500-0_sp1.1 from training. Duration: 0.97275 2023-10-02 04:59:50,777 WARNING [train.py:1204] (2/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_71-329150-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 04:59:50,779 WARNING [train.py:1204] (2/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_866-173440-0 from training. Duration: 0.9 2023-10-02 04:59:50,865 WARNING [train.py:1204] (2/4) Exclude cut with ID _14832_210_8750_1_1530691351539_3171348_1-54315-0_sp0.9 from training. Duration: 0.9666875 2023-10-02 04:59:52,510 INFO [scaling.py:1118] (2/4) WithLoss: name=encoder.encoders.2.encoder.layers.0.self_attn_weights, loss-sum=0.000e+00 2023-10-02 04:59:53,591 WARNING [train.py:1204] (2/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_430-116139-0_sp1.1 from training. Duration: 0.77275 2023-10-02 04:59:53,617 WARNING [train.py:1204] (2/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_356-45054-0_sp0.9 from training. Duration: 0.9555625 2023-10-02 04:59:56,441 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_3531_210_3528_1_1526294203_5216456_309-227302-0_sp0.9 from training. Duration: 0.7666875 2023-10-02 04:59:58,432 WARNING [train.py:1204] (2/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_301-330455-0_sp1.1 from training. Duration: 0.72725 2023-10-02 05:00:02,688 WARNING [train.py:1204] (2/4) Exclude cut with ID _23557_210_8750_1_1531890106370_6846255_667-145945-0_sp1.1 from training. Duration: 0.92725 2023-10-02 05:00:03,914 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.601e+02 1.999e+02 2.256e+02 2.584e+02 4.812e+02, threshold=4.513e+02, percent-clipped=1.0 2023-10-02 05:00:08,939 WARNING [train.py:1204] (2/4) Exclude cut with ID _14860_210_4915_1_1530702020340_7343240_836-328489-0_sp1.1 from training. Duration: 0.8181875 2023-10-02 05:00:08,997 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_37152_210_9697_1_1533106573_3844188_416-333249-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 05:00:10,594 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.nonlin_attention.balancer.prob, batch_count=759160.0, ans=0.125 2023-10-02 05:00:11,766 WARNING [train.py:1204] (2/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_538-32291-0_sp0.9 from training. Duration: 0.92225 2023-10-02 05:00:15,155 WARNING [train.py:1204] (2/4) Exclude cut with ID _32704_210_8954_1_1533017488840_6747370_965-176824-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 05:00:16,158 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.1.encoder.layers.0.feed_forward1.out_whiten, num_groups=1, num_channels=256, metric=5.71 vs. limit=15.0 2023-10-02 05:00:19,347 WARNING [train.py:1204] (2/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_86-19601-0_sp1.1 from training. Duration: 0.77275 2023-10-02 05:00:20,038 WARNING [train.py:1204] (2/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_436-318113-0_sp1.1 from training. Duration: 0.6818125 2023-10-02 05:00:21,347 WARNING [train.py:1204] (2/4) Exclude cut with ID _41817_210_18835_1_1533434477160_7423049_555-293448-0_sp0.9 from training. Duration: 0.888875 2023-10-02 05:00:21,365 WARNING [train.py:1204] (2/4) Exclude cut with ID _4697_210_2069_1_1526815801953_3823098_68-219807-0 from training. Duration: 0.47 2023-10-02 05:00:25,569 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7544_210_6753_1_1528591327_1471426_33-346031-0_sp1.1 from training. Duration: 0.5545625 2023-10-02 05:00:25,571 WARNING [train.py:1204] (2/4) Exclude cut with ID _49748_210_24116_1_1533986971737_3158910_263-52113-0_sp1.1 from training. Duration: 0.97275 2023-10-02 05:00:26,907 WARNING [train.py:1204] (2/4) Exclude cut with ID _64639_210_5816_1_1535367619437_3528000_340-24580-0_sp1.1 from training. Duration: 0.92725 2023-10-02 05:00:26,929 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_47228_210_4983_1_1533780021_3672027_479-114941-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 05:00:28,253 WARNING [train.py:1204] (2/4) Exclude cut with ID _44585_210_18628_1_1533605350387_4518199_506-119659-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 05:00:28,343 WARNING [train.py:1204] (2/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_314-126819-0_sp1.1 from training. Duration: 0.4545625 2023-10-02 05:00:28,346 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_2541_210_1794_1_1525590269_3084765_156-173713-0_sp0.9 from training. Duration: 0.9 2023-10-02 05:00:28,513 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.feed_forward2.hidden_balancer.prob, batch_count=759226.6666666666, ans=0.125 2023-10-02 05:00:29,723 WARNING [train.py:1204] (2/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_108-255430-0 from training. Duration: 0.97 2023-10-02 05:00:29,731 WARNING [train.py:1204] (2/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_791-45149-0_sp1.1 from training. Duration: 0.7818125 2023-10-02 05:00:29,744 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_53378_210_17282_1_1534227054_1261425_10-118295-0_sp1.1 from training. Duration: 0.97275 2023-10-02 05:00:31,079 WARNING [train.py:1204] (2/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_148-158509-0 from training. Duration: 0.41 2023-10-02 05:00:35,840 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_34726_210_4525_1_1533019474_5030395_687-291305-0_sp1.1 from training. Duration: 0.936375 2023-10-02 05:00:39,866 WARNING [train.py:1204] (2/4) Exclude cut with ID _14860_210_4915_1_1530702020340_7343240_264-324537-0_sp1.1 from training. Duration: 0.963625 2023-10-02 05:00:43,341 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.feed_forward1.out_proj.dropout_p, batch_count=759293.3333333334, ans=0.1 2023-10-02 05:00:43,365 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.nonlin_attention.balancer.min_positive, batch_count=759293.3333333334, ans=0.05 2023-10-02 05:00:44,610 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_41251_210_15368_1_1533429767_4820695_591-46386-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 05:00:44,639 WARNING [train.py:1204] (2/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_86-19601-0_sp0.9 from training. Duration: 0.9444375 2023-10-02 05:00:44,658 WARNING [train.py:1204] (2/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_401-255491-0_sp0.9 from training. Duration: 0.77775 2023-10-02 05:00:45,921 INFO [train.py:1046] (2/4) Epoch 22, batch 2350, loss[loss=0.1586, simple_loss=0.2478, pruned_loss=0.03469, over 24675.00 frames. ], tot_loss[loss=0.1754, simple_loss=0.2513, pruned_loss=0.0498, over 4709282.06 frames. ], batch size: 73, lr: 4.68e-03, grad_scale: 8.0 2023-10-02 05:00:47,300 WARNING [train.py:1204] (2/4) Exclude cut with ID _8696_210_1876_1_1528545632440_3345058_209-54069-0_sp1.1 from training. Duration: 0.6090625 2023-10-02 05:00:47,309 WARNING [train.py:1204] (2/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_369-325184-0_sp1.1 from training. Duration: 0.9 2023-10-02 05:00:47,371 WARNING [train.py:1204] (2/4) Exclude cut with ID _14687_210_5854_1_1530703838259_3664889_115-239046-0_sp0.9 from training. Duration: 0.7444375 2023-10-02 05:00:48,712 WARNING [train.py:1204] (2/4) Exclude cut with ID _8064_210_5399_1_1528372299291_4282973_168-50337-0 from training. Duration: 0.84 2023-10-02 05:00:54,026 WARNING [train.py:1204] (2/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_909-64151-0_sp1.1 from training. Duration: 0.8545625 2023-10-02 05:00:54,042 WARNING [train.py:1204] (2/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_597-154640-0 from training. Duration: 0.9 2023-10-02 05:01:00,944 WARNING [train.py:1204] (2/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_126-274694-0 from training. Duration: 0.98 2023-10-02 05:01:01,134 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.conv_skip_rate, batch_count=759426.6666666666, ans=0.0 2023-10-02 05:01:03,687 WARNING [train.py:1204] (2/4) Exclude cut with ID _21783_210_11357_1_1531742458730_3762679_344-269786-0_sp1.1 from training. Duration: 0.97275 2023-10-02 05:01:06,460 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_37818_210_4238_1_1533620910_4720083_322-253154-0_sp1.1 from training. Duration: 0.92725 2023-10-02 05:01:06,463 WARNING [train.py:1204] (2/4) Exclude cut with ID _58895_210_8750_1_1535184081320_3649089_221-15823-0_sp1.1 from training. Duration: 0.92725 2023-10-02 05:01:06,472 WARNING [train.py:1204] (2/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_9-21737-0_sp1.1 from training. Duration: 0.8818125 2023-10-02 05:01:06,502 WARNING [train.py:1204] (2/4) Exclude cut with ID _14069_210_5632_1_1530512692033_4803390_522-341588-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 05:01:08,472 WARNING [train.py:1204] (2/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_363-185703-0 from training. Duration: 0.95 2023-10-02 05:01:11,287 WARNING [train.py:1204] (2/4) Exclude cut with ID _41817_210_18835_1_1533434477160_7423049_265-76216-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 05:01:13,912 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.3.encoder.layers.3.nonlin_attention.whiten2, num_groups=1, num_channels=512, metric=5.95 vs. limit=15.0 2023-10-02 05:01:15,377 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.2.encoder.layers.2.self_attn_weights.whiten_keys, num_groups=4, num_channels=128, metric=3.23 vs. limit=6.0 2023-10-02 05:01:15,961 WARNING [train.py:1204] (2/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_841-341679-0 from training. Duration: 0.58 2023-10-02 05:01:16,156 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.ff2_skip_rate, batch_count=759493.3333333334, ans=0.0 2023-10-02 05:01:17,361 WARNING [train.py:1204] (2/4) Exclude cut with ID _55429_210_10626_1_1534467588527_7174143_97-335084-0_sp1.1 from training. Duration: 0.8818125 2023-10-02 05:01:20,642 WARNING [train.py:1204] (2/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_690-126306-0_sp0.9 from training. Duration: 0.8666875 2023-10-02 05:01:20,650 WARNING [train.py:1204] (2/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_102-92751-0_sp1.1 from training. Duration: 0.9 2023-10-02 05:01:23,393 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_3787_210_2040_1_1526477544_5478586_603-120324-0_sp1.1 from training. Duration: 0.736375 2023-10-02 05:01:25,505 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_33348_210_5632_1_1533086912_3324030_85-28583-0 from training. Duration: 0.82 2023-10-02 05:01:25,577 WARNING [train.py:1204] (2/4) Exclude cut with ID _60057_210_10640_1_1534903065793_3333150_362-230377-0_sp1.1 from training. Duration: 0.7909375 2023-10-02 05:01:28,002 WARNING [train.py:1204] (2/4) Exclude cut with ID _60113_210_16271_1_1535164212481_4386079_194-146486-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 05:01:28,014 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_53753_210_7234_1_1534320003_3594148_171-277668-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 05:01:28,062 WARNING [train.py:1204] (2/4) Exclude cut with ID _34423_210_8750_1_1533430798456_3330199_251-332788-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 05:01:31,033 WARNING [train.py:1204] (2/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_663-166973-0_sp0.9 from training. Duration: 0.888875 2023-10-02 05:01:32,482 WARNING [train.py:1204] (2/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_927-43827-0 from training. Duration: 0.74 2023-10-02 05:01:33,784 WARNING [train.py:1204] (2/4) Exclude cut with ID _28323_210_16187_1_1532480395667_4100300_306-74561-0_sp1.1 from training. Duration: 0.8545625 2023-10-02 05:01:36,610 WARNING [train.py:1204] (2/4) Exclude cut with ID _15037_210_2253_1_1530752294104_3267650_139-337989-0_sp1.1 from training. Duration: 0.92725 2023-10-02 05:01:38,434 WARNING [train.py:1204] (2/4) Exclude cut with ID _49039_210_6315_1_1533967537541_3846118_460-263670-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 05:01:39,841 WARNING [train.py:1204] (2/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_186-309161-0 from training. Duration: 0.54 2023-10-02 05:01:39,932 WARNING [train.py:1204] (2/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_87-278686-0_sp1.1 from training. Duration: 0.7 2023-10-02 05:01:42,730 WARNING [train.py:1204] (2/4) Exclude cut with ID _27052_210_5986_1_1532329197187_3585009_314-106499-0 from training. Duration: 0.92 2023-10-02 05:01:42,755 WARNING [train.py:1204] (2/4) Exclude cut with ID _3465_210_3525_1_1526175667060_3574049_412-287526-0_sp0.9 from training. Duration: 0.8 2023-10-02 05:01:43,632 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.2.encoder.layers.0.feed_forward3.out_whiten, num_groups=1, num_channels=384, metric=12.62 vs. limit=15.0 2023-10-02 05:01:47,668 WARNING [train.py:1204] (2/4) Exclude cut with ID _22961_210_8254_1_1531976366672_4278180_317-199546-0 from training. Duration: 0.86 2023-10-02 05:01:48,587 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.1.encoder.layers.0.conv_module1.whiten, num_groups=1, num_channels=256, metric=10.57 vs. limit=15.0 2023-10-02 05:01:52,464 WARNING [train.py:1204] (2/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_381-317791-0 from training. Duration: 0.73 2023-10-02 05:01:53,677 WARNING [train.py:1204] (2/4) Exclude cut with ID _44010_210_4525_1_1533884262964_4013620_560-310411-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 05:01:53,689 WARNING [train.py:1204] (2/4) Exclude cut with ID _5294_210_1747_1_1527249643928_3696179_342-270278-0_sp0.9 from training. Duration: 0.611125 2023-10-02 05:01:53,712 WARNING [train.py:1204] (2/4) Exclude cut with ID ID0473W0002-918951 from training. Duration: 0.924 2023-10-02 05:01:55,374 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1083-220122-0 from training. Duration: 0.51 2023-10-02 05:01:56,871 WARNING [train.py:1204] (2/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_138-154812-0 from training. Duration: 0.78 2023-10-02 05:01:59,727 WARNING [train.py:1204] (2/4) Exclude cut with ID _44982_210_1511_1_1534312820108_3726029_331-19420-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 05:02:01,008 INFO [train.py:1046] (2/4) Epoch 22, batch 2400, loss[loss=0.1928, simple_loss=0.2699, pruned_loss=0.05787, over 23758.00 frames. ], tot_loss[loss=0.1748, simple_loss=0.2505, pruned_loss=0.04952, over 4705189.95 frames. ], batch size: 85, lr: 4.68e-03, grad_scale: 16.0 2023-10-02 05:02:04,103 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_2655_210_2087_1_1525688959_3897400_202-147557-0_sp0.9 from training. Duration: 0.9444375 2023-10-02 05:02:06,853 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_23850_210_4238_1_1532232597_1632433_32-185135-0_sp0.9 from training. Duration: 0.9555625 2023-10-02 05:02:06,967 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_14056_210_4163_1_1530604483_7572827_46-93547-0_sp1.1 from training. Duration: 0.863625 2023-10-02 05:02:08,359 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7931_210_1794_1_1528594728_5564059_280-279560-0 from training. Duration: 0.98 2023-10-02 05:02:08,413 WARNING [train.py:1204] (2/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_937-208281-0 from training. Duration: 0.85 2023-10-02 05:02:16,340 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_14928_210_5632_1_1530773461_4714000_454-112198-0_sp0.9 from training. Duration: 0.7333125 2023-10-02 05:02:16,341 WARNING [train.py:1204] (2/4) Exclude cut with ID _32704_210_8954_1_1533017488840_6747370_944-153333-0_sp1.1 from training. Duration: 0.963625 2023-10-02 05:02:17,821 WARNING [train.py:1204] (2/4) Exclude cut with ID _14821_210_8750_1_1530680503991_6829359_2-227151-0 from training. Duration: 0.83 2023-10-02 05:02:19,098 WARNING [train.py:1204] (2/4) Exclude cut with ID _28461_210_2253_1_1532771775189_3041299_96-152158-0_sp1.1 from training. Duration: 0.77275 2023-10-02 05:02:20,479 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7837_210_1792_1_1528453445_5841305_654-54195-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 05:02:20,523 WARNING [train.py:1204] (2/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_344-152526-0 from training. Duration: 0.53 2023-10-02 05:02:22,492 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=759760.0, ans=0.1 2023-10-02 05:02:28,281 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_30167_210_3528_1_1532428012_4772375_480-142230-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 05:02:29,745 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_160-218721-0 from training. Duration: 0.96 2023-10-02 05:02:32,725 WARNING [train.py:1204] (2/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_65-88242-0_sp0.9 from training. Duration: 0.62225 2023-10-02 05:02:34,041 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.507e+02 1.808e+02 2.044e+02 2.322e+02 3.355e+02, threshold=4.088e+02, percent-clipped=0.0 2023-10-02 05:02:34,430 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.attention_skip_rate, batch_count=759826.6666666666, ans=0.0 2023-10-02 05:02:36,880 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_34886_210_10633_1_1533203882_1755661_76-156096-0 from training. Duration: 0.87 2023-10-02 05:02:39,642 WARNING [train.py:1204] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_188-38388-0_sp1.1 from training. Duration: 0.92725 2023-10-02 05:02:41,474 WARNING [train.py:1204] (2/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_81-289335-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 05:02:41,697 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.ff3_skip_rate, batch_count=759826.6666666666, ans=0.0 2023-10-02 05:02:41,752 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=759826.6666666666, ans=0.1 2023-10-02 05:02:44,408 WARNING [train.py:1204] (2/4) Exclude cut with ID _59493_210_17282_1_1535078419077_3225879_110-8778-0_sp1.1 from training. Duration: 0.936375 2023-10-02 05:02:46,373 WARNING [train.py:1204] (2/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_188-260011-0 from training. Duration: 0.73 2023-10-02 05:02:47,686 WARNING [train.py:1204] (2/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_453-133983-0_sp0.9 from training. Duration: 0.8666875 2023-10-02 05:02:52,078 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8054_210_7927_1_1528627721_4984094_553-17379-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 05:02:55,977 WARNING [train.py:1204] (2/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_341-171072-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 05:02:58,709 WARNING [train.py:1204] (2/4) Exclude cut with ID _30800_210_22382_1_1532499347904_4515959_265-257286-0_sp1.1 from training. Duration: 0.963625 2023-10-02 05:03:00,014 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_403-39619-0_sp0.9 from training. Duration: 0.9333125 2023-10-02 05:03:00,019 WARNING [train.py:1204] (2/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_464-33931-0_sp0.9 from training. Duration: 0.6 2023-10-02 05:03:00,053 WARNING [train.py:1204] (2/4) Exclude cut with ID _60804_210_9642_1_1534831199859_7268299_632-143863-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 05:03:00,060 WARNING [train.py:1204] (2/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_431-310997-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 05:03:01,442 WARNING [train.py:1204] (2/4) Exclude cut with ID _31649_210_6228_1_1532584608063_4091078_114-263860-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 05:03:01,460 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6961_210_2087_1_1527937168_6815011_562-165518-0_sp0.9 from training. Duration: 0.8555625 2023-10-02 05:03:05,531 WARNING [train.py:1204] (2/4) Exclude cut with ID _27052_210_5986_1_1532329197187_3585009_338-325870-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 05:03:06,871 WARNING [train.py:1204] (2/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_469-94673-0_sp1.1 from training. Duration: 0.7181875 2023-10-02 05:03:06,904 WARNING [train.py:1204] (2/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_259-34038-0 from training. Duration: 0.74 2023-10-02 05:03:08,261 WARNING [train.py:1204] (2/4) Exclude cut with ID _14922_210_6228_1_1530707435273_3693310_388-129552-0 from training. Duration: 0.96 2023-10-02 05:03:09,910 WARNING [train.py:1204] (2/4) Exclude cut with ID _28394_210_8913_1_1532430049518_3650118_342-251455-0_sp1.1 from training. Duration: 0.936375 2023-10-02 05:03:09,928 WARNING [train.py:1204] (2/4) Exclude cut with ID _53731_210_7994_1_1534553979244_3757214_8-191390-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 05:03:10,115 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.ff2_skip_rate, batch_count=759960.0, ans=0.0 2023-10-02 05:03:11,276 WARNING [train.py:1204] (2/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_613-232856-0 from training. Duration: 0.54 2023-10-02 05:03:11,336 WARNING [train.py:1204] (2/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_557-349534-0 from training. Duration: 0.91 2023-10-02 05:03:12,978 WARNING [train.py:1204] (2/4) Exclude cut with ID _14224_210_4381_1_1530529062037_4264010_379-184223-0 from training. Duration: 0.85 2023-10-02 05:03:12,982 WARNING [train.py:1204] (2/4) Exclude cut with ID IC0125W0001-61807 from training. Duration: 0.966 2023-10-02 05:03:13,062 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_229-45300-0 from training. Duration: 0.57 2023-10-02 05:03:14,447 WARNING [train.py:1204] (2/4) Exclude cut with ID _30304_210_6319_1_1532476827261_7582060_1069-24743-0_sp1.1 from training. Duration: 0.92725 2023-10-02 05:03:15,812 INFO [train.py:1046] (2/4) Epoch 22, batch 2450, loss[loss=0.1643, simple_loss=0.2244, pruned_loss=0.05208, over 23420.00 frames. ], tot_loss[loss=0.1738, simple_loss=0.2493, pruned_loss=0.0491, over 4703545.80 frames. ], batch size: 285, lr: 4.67e-03, grad_scale: 16.0 2023-10-02 05:03:15,890 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_41452_210_7291_1_1533437687_3941281_323-172873-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 05:03:15,912 WARNING [train.py:1204] (2/4) Exclude cut with ID _37086_210_9038_1_1532999540891_7021250_802-226709-0_sp1.1 from training. Duration: 0.97275 2023-10-02 05:03:16,125 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=760026.6666666666, ans=0.1 2023-10-02 05:03:17,817 WARNING [train.py:1204] (2/4) Exclude cut with ID IC0144W0500-71767 from training. Duration: 0.931875 2023-10-02 05:03:17,879 WARNING [train.py:1204] (2/4) Exclude cut with ID _39846_210_12651_1_1533605310265_2115212_233-129699-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 05:03:17,934 WARNING [train.py:1204] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_574-45541-0_sp0.9 from training. Duration: 0.72225 2023-10-02 05:03:22,119 WARNING [train.py:1204] (2/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_372-298093-0_sp1.1 from training. Duration: 0.72725 2023-10-02 05:03:22,140 WARNING [train.py:1204] (2/4) Exclude cut with ID _62574_210_2655_1_1535180407058_3933249_34-26343-0_sp1.1 from training. Duration: 0.97275 2023-10-02 05:03:22,313 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.conv_skip_rate, batch_count=760026.6666666666, ans=0.0 2023-10-02 05:03:25,427 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_512-256238-0_sp1.1 from training. Duration: 0.963625 2023-10-02 05:03:25,433 WARNING [train.py:1204] (2/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_196-55502-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 05:03:27,290 WARNING [train.py:1204] (2/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_195-167011-0 from training. Duration: 0.75 2023-10-02 05:03:31,548 WARNING [train.py:1204] (2/4) Exclude cut with ID _13678_210_7497_1_1530530069226_3463170_345-230416-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 05:03:32,726 WARNING [train.py:1204] (2/4) Exclude cut with ID _51078_210_9070_1_1534161557424_3805250_225-327809-0_sp1.1 from training. Duration: 0.963625 2023-10-02 05:03:34,333 WARNING [train.py:1204] (2/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_18-189198-0_sp1.1 from training. Duration: 0.8181875 2023-10-02 05:03:34,343 WARNING [train.py:1204] (2/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_329-82235-0_sp1.1 from training. Duration: 0.7454375 2023-10-02 05:03:35,653 WARNING [train.py:1204] (2/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_726-270780-0_sp0.9 from training. Duration: 0.9555625 2023-10-02 05:03:35,704 WARNING [train.py:1204] (2/4) Exclude cut with ID _66873_210_5517_1_1535454136506_4293370_654-52713-0 from training. Duration: 0.77 2023-10-02 05:03:39,794 WARNING [train.py:1204] (2/4) Exclude cut with ID _20455_210_5219_1_1531565996775_8098100_191-16006-0_sp1.1 from training. Duration: 0.963625 2023-10-02 05:03:41,296 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.conv_skip_rate, batch_count=760093.3333333334, ans=0.0 2023-10-02 05:03:43,004 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_3171_210_2087_1_1526109084_2330007_66-260583-0_sp1.1 from training. Duration: 0.6545625 2023-10-02 05:03:43,052 WARNING [train.py:1204] (2/4) Exclude cut with ID _38670_210_15379_1_1533448810659_4391059_63-129006-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 05:03:46,577 WARNING [train.py:1204] (2/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_393-57888-0_sp0.9 from training. Duration: 0.711125 2023-10-02 05:03:46,604 WARNING [train.py:1204] (2/4) Exclude cut with ID _6275_210_2084_1_1528110062213_7669049_857-298942-0_sp0.9 from training. Duration: 0.9444375 2023-10-02 05:03:46,800 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.feed_forward1.hidden_balancer.prob, batch_count=760160.0, ans=0.125 2023-10-02 05:03:47,993 WARNING [train.py:1204] (2/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_239-22076-0_sp0.9 from training. Duration: 0.9444375 2023-10-02 05:03:48,024 WARNING [train.py:1204] (2/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_424-227947-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 05:03:50,683 WARNING [train.py:1204] (2/4) Exclude cut with ID _14069_210_5632_1_1530512692033_4803390_95-1896-0 from training. Duration: 0.98 2023-10-02 05:03:51,998 WARNING [train.py:1204] (2/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_96-244299-0_sp0.9 from training. Duration: 0.97775 2023-10-02 05:04:00,203 WARNING [train.py:1204] (2/4) Exclude cut with ID _46683_210_9642_1_1533730913492_4591116_562-319825-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 05:04:01,638 WARNING [train.py:1204] (2/4) Exclude cut with ID _64639_210_5816_1_1535367619437_3528000_284-173474-0_sp1.1 from training. Duration: 0.963625 2023-10-02 05:04:02,972 WARNING [train.py:1204] (2/4) Exclude cut with ID _29529_210_7234_1_1532393961455_3977460_497-100133-0_sp1.1 from training. Duration: 0.936375 2023-10-02 05:04:02,997 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_12868_210_6753_1_1530701932_530275_18-76322-0_sp0.9 from training. Duration: 0.9666875 2023-10-02 05:04:03,042 WARNING [train.py:1204] (2/4) Exclude cut with ID _50093_210_4220_1_1534417141207_3432299_116-303447-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 05:04:04,477 WARNING [train.py:1204] (2/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_44-79427-0_sp1.1 from training. Duration: 0.8909375 2023-10-02 05:04:04,545 WARNING [train.py:1204] (2/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_887-249555-0 from training. Duration: 0.77 2023-10-02 05:04:07,268 WARNING [train.py:1204] (2/4) Exclude cut with ID _28461_210_2253_1_1532771775189_3041299_96-152158-0_sp0.9 from training. Duration: 0.9444375 2023-10-02 05:04:08,607 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_14058_210_4163_1_1530690953_6327180_49-210687-0_sp1.1 from training. Duration: 0.8545625 2023-10-02 05:04:11,342 WARNING [train.py:1204] (2/4) Exclude cut with ID _40399_210_4353_1_1533305658194_3279406_351-127903-0_sp1.1 from training. Duration: 0.97275 2023-10-02 05:04:12,514 WARNING [train.py:1204] (2/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_184-206939-0_sp1.1 from training. Duration: 0.936375 2023-10-02 05:04:14,268 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.balancer1.prob, batch_count=760293.3333333334, ans=0.125 2023-10-02 05:04:17,452 WARNING [train.py:1204] (2/4) Exclude cut with ID _14100_210_8341_1_1530529320613_3678069_258-224599-0_sp1.1 from training. Duration: 0.7 2023-10-02 05:04:17,476 WARNING [train.py:1204] (2/4) Exclude cut with ID _6493_210_1747_1_1528459210424_4012470_324-323854-0 from training. Duration: 0.92 2023-10-02 05:04:17,562 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_12868_210_6753_1_1530702479_3610564_151-325626-0_sp1.1 from training. Duration: 0.8818125 2023-10-02 05:04:18,881 WARNING [train.py:1204] (2/4) Exclude cut with ID _14528_210_8254_1_1530669700514_4306939_106-43145-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 05:04:18,895 WARNING [train.py:1204] (2/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_686-54383-0 from training. Duration: 0.87 2023-10-02 05:04:19,187 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.conv_module2.balancer2.prob, batch_count=760293.3333333334, ans=0.125 2023-10-02 05:04:20,281 WARNING [train.py:1204] (2/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_801-102362-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 05:04:20,361 WARNING [train.py:1204] (2/4) Exclude cut with ID _48677_210_5663_1_1533902405919_3892989_408-40740-0_sp0.9 from training. Duration: 0.92225 2023-10-02 05:04:24,757 WARNING [train.py:1204] (2/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_448-212135-0_sp1.1 from training. Duration: 0.82725 2023-10-02 05:04:24,981 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.feed_forward2.hidden_balancer.prob, batch_count=760293.3333333334, ans=0.125 2023-10-02 05:04:26,467 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.conv_skip_rate, batch_count=760293.3333333334, ans=0.0 2023-10-02 05:04:28,004 WARNING [train.py:1204] (2/4) Exclude cut with ID _59170_210_6721_1_1534983633960_4203335_320-124047-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 05:04:28,696 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_4692_210_3528_1_1526899755_4323254_107-44401-0_sp1.1 from training. Duration: 0.7818125 2023-10-02 05:04:31,457 INFO [train.py:1046] (2/4) Epoch 22, batch 2500, loss[loss=0.1767, simple_loss=0.2589, pruned_loss=0.04723, over 24432.00 frames. ], tot_loss[loss=0.1729, simple_loss=0.2485, pruned_loss=0.04864, over 4705211.48 frames. ], batch size: 69, lr: 4.67e-03, grad_scale: 16.0 2023-10-02 05:04:31,598 WARNING [train.py:1204] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_731-2428-0 from training. Duration: 0.83 2023-10-02 05:04:32,998 WARNING [train.py:1204] (2/4) Exclude cut with ID _29857_210_20034_1_1532395886753_3798160_335-163562-0_sp1.1 from training. Duration: 0.77275 2023-10-02 05:04:38,529 WARNING [train.py:1204] (2/4) Exclude cut with ID _14832_210_8750_1_1530691351539_3171348_171-275004-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 05:04:48,126 WARNING [train.py:1204] (2/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_105-328174-0_sp0.9 from training. Duration: 0.8444375 2023-10-02 05:04:48,158 WARNING [train.py:1204] (2/4) Exclude cut with ID _68965_210_1883_1_1535698962272_3497490_164-118421-0_sp1.1 from training. Duration: 0.936375 2023-10-02 05:04:49,504 WARNING [train.py:1204] (2/4) Exclude cut with ID _19896_210_1883_1_1531709022662_6649569_380-24170-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 05:04:49,509 WARNING [train.py:1204] (2/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_505-205270-0_sp1.1 from training. Duration: 0.3454375 2023-10-02 05:04:54,030 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.balancer1.prob, batch_count=760426.6666666666, ans=0.125 2023-10-02 05:04:56,600 WARNING [train.py:1204] (2/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_427-208901-0_sp1.1 from training. Duration: 0.6181875 2023-10-02 05:04:56,748 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.0.conv_module1.balancer1.prob, batch_count=760426.6666666666, ans=0.125 2023-10-02 05:04:58,002 WARNING [train.py:1204] (2/4) Exclude cut with ID _23818_210_6228_1_1531917063884_3676920_85-326916-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 05:04:58,082 WARNING [train.py:1204] (2/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_101-293367-0_sp1.1 from training. Duration: 0.57275 2023-10-02 05:04:58,092 WARNING [train.py:1204] (2/4) Exclude cut with ID _5409_210_1388_1_1527292850189_5865191_178-127970-0_sp1.1 from training. Duration: 0.5181875 2023-10-02 05:04:58,350 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.feed_forward2.hidden_balancer.prob, batch_count=760426.6666666666, ans=0.125 2023-10-02 05:04:59,946 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_403-39619-0 from training. Duration: 0.84 2023-10-02 05:05:00,674 WARNING [train.py:1204] (2/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_427-87841-0_sp1.1 from training. Duration: 0.963625 2023-10-02 05:05:01,788 WARNING [train.py:1204] (2/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_286-68181-0_sp1.1 from training. Duration: 0.97275 2023-10-02 05:05:01,829 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6673_210_3273_1_1527767790_3173240_327-306185-0 from training. Duration: 0.83 2023-10-02 05:05:01,844 WARNING [train.py:1204] (2/4) Exclude cut with ID _3675_210_4557_1_1526639934408_4182089_106-203713-0_sp1.1 from training. Duration: 0.963625 2023-10-02 05:05:03,200 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_53071_210_16554_1_1534493434_1276796_130-36076-0 from training. Duration: 0.95 2023-10-02 05:05:03,237 WARNING [train.py:1204] (2/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_586-93054-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 05:05:04,480 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.512e+02 1.863e+02 2.107e+02 2.380e+02 3.578e+02, threshold=4.214e+02, percent-clipped=0.0 2023-10-02 05:05:07,463 WARNING [train.py:1204] (2/4) Exclude cut with ID _23825_210_13449_1_1531915313526_3497150_262-10571-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 05:05:07,523 WARNING [train.py:1204] (2/4) Exclude cut with ID _28783_210_7943_1_1532671164808_7155479_432-67885-0_sp1.1 from training. Duration: 0.97275 2023-10-02 05:05:10,279 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6287_210_3273_1_1527509506_2653716_182-199887-0_sp0.9 from training. Duration: 0.5666875 2023-10-02 05:05:10,324 WARNING [train.py:1204] (2/4) Exclude cut with ID _3231_210_2106_1_1526205864934_6784220_171-256004-0 from training. Duration: 0.86 2023-10-02 05:05:10,385 WARNING [train.py:1204] (2/4) Exclude cut with ID _69051_210_1531_1_1535885566901_3829538_332-92090-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 05:05:10,677 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.feed_forward3.hidden_balancer.prob, batch_count=760493.3333333334, ans=0.125 2023-10-02 05:05:11,735 WARNING [train.py:1204] (2/4) Exclude cut with ID _3675_210_4557_1_1526639934408_4182089_104-324125-0_sp1.1 from training. Duration: 0.963625 2023-10-02 05:05:15,915 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_41764_210_2521_1_1533686110_4089313_501-2119-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 05:05:20,517 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_14056_210_4163_1_1530604483_7572827_56-100988-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 05:05:21,947 WARNING [train.py:1204] (2/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_398-239819-0_sp1.1 from training. Duration: 0.9 2023-10-02 05:05:26,912 WARNING [train.py:1204] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_74-48717-0_sp1.1 from training. Duration: 0.52725 2023-10-02 05:05:29,108 WARNING [train.py:1204] (2/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_177-49092-0 from training. Duration: 0.91 2023-10-02 05:05:30,428 WARNING [train.py:1204] (2/4) Exclude cut with ID _62337_210_12392_1_1535009273105_3412229_44-176353-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 05:05:30,437 WARNING [train.py:1204] (2/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_110-130961-0_sp1.1 from training. Duration: 0.42725 2023-10-02 05:05:31,933 WARNING [train.py:1204] (2/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_37-189262-0_sp1.1 from training. Duration: 0.8909375 2023-10-02 05:05:31,934 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_3172_210_2087_1_1526292929_3987865_197-97762-0_sp0.9 from training. Duration: 0.8333125 2023-10-02 05:05:33,325 WARNING [train.py:1204] (2/4) Exclude cut with ID IC0677W0065-334897 from training. Duration: 0.896 2023-10-02 05:05:33,325 WARNING [train.py:1204] (2/4) Exclude cut with ID IC0677W0080-334912 from training. Duration: 0.768 2023-10-02 05:05:33,337 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_120-41668-0 from training. Duration: 0.7 2023-10-02 05:05:33,584 INFO [scaling.py:1118] (2/4) WithLoss: name=encoder.encoders.4.encoder.layers.1.self_attn_weights, loss-sum=0.000e+00 2023-10-02 05:05:35,978 WARNING [train.py:1204] (2/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_460-205287-0_sp1.1 from training. Duration: 0.963625 2023-10-02 05:05:36,224 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.conv_module1.balancer2.prob, batch_count=760626.6666666666, ans=0.125 2023-10-02 05:05:37,407 WARNING [train.py:1204] (2/4) Exclude cut with ID _14632_210_9988_1_1530691279392_3923200_102-204477-0 from training. Duration: 0.97 2023-10-02 05:05:38,719 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_34815_210_6388_1_1533381320_3571903_279-305565-0 from training. Duration: 0.92 2023-10-02 05:05:38,795 WARNING [train.py:1204] (2/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_91-148831-0_sp1.1 from training. Duration: 0.9 2023-10-02 05:05:40,190 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6291_210_3273_1_1527683649_1078498_82-221196-0 from training. Duration: 0.76 2023-10-02 05:05:42,912 WARNING [train.py:1204] (2/4) Exclude cut with ID _38020_210_5986_1_1533112252259_7006089_493-102745-0 from training. Duration: 0.98 2023-10-02 05:05:46,203 INFO [train.py:1046] (2/4) Epoch 22, batch 2550, loss[loss=0.184, simple_loss=0.2552, pruned_loss=0.05639, over 23796.00 frames. ], tot_loss[loss=0.173, simple_loss=0.2489, pruned_loss=0.0486, over 4694414.64 frames. ], batch size: 232, lr: 4.67e-03, grad_scale: 16.0 2023-10-02 05:05:46,264 WARNING [train.py:1204] (2/4) Exclude cut with ID _61133_210_9969_1_1535196777227_3696409_40-93442-0_sp1.1 from training. Duration: 0.92725 2023-10-02 05:05:47,538 WARNING [train.py:1204] (2/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_45-258852-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 05:05:47,593 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_53071_210_16554_1_1534493434_1276796_130-36076-0_sp1.1 from training. Duration: 0.863625 2023-10-02 05:05:50,751 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8694_210_6400_1_1529065754_3260756_332-13553-0_sp1.1 from training. Duration: 0.92725 2023-10-02 05:05:50,821 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_405-339986-0 from training. Duration: 0.51 2023-10-02 05:05:52,081 WARNING [train.py:1204] (2/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_927-43827-0_sp1.1 from training. Duration: 0.67275 2023-10-02 05:05:55,005 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_397-147196-0 from training. Duration: 0.8 2023-10-02 05:05:56,926 WARNING [train.py:1204] (2/4) Exclude cut with ID 3557-8342-0013-54691-0_sp1.1 from training. Duration: 0.836375 2023-10-02 05:05:58,975 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_47438_210_5399_1_1533797471_4513356_626-127722-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 05:06:01,717 WARNING [train.py:1204] (2/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_256-323883-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 05:06:01,728 WARNING [train.py:1204] (2/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_839-173017-0_sp0.9 from training. Duration: 0.4555625 2023-10-02 05:06:03,074 WARNING [train.py:1204] (2/4) Exclude cut with ID _5289_210_2084_1_1527678202736_3779299_392-261988-0_sp1.1 from training. Duration: 0.5909375 2023-10-02 05:06:03,118 WARNING [train.py:1204] (2/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_119-224384-0_sp1.1 from training. Duration: 0.936375 2023-10-02 05:06:03,152 WARNING [train.py:1204] (2/4) Exclude cut with ID _57523_210_1856_1_1534766294545_3732989_262-267933-0_sp1.1 from training. Duration: 0.963625 2023-10-02 05:06:06,005 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_35361_210_4238_1_1533262810_7793476_675-87012-0_sp0.9 from training. Duration: 0.988875 2023-10-02 05:06:06,023 WARNING [train.py:1204] (2/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_17-90541-0 from training. Duration: 0.94 2023-10-02 05:06:07,396 WARNING [train.py:1204] (2/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_535-14410-0_sp1.1 from training. Duration: 0.42725 2023-10-02 05:06:07,401 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_24673_210_6400_1_1531968633_778659_94-154511-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 05:06:07,404 WARNING [train.py:1204] (2/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_212-10501-0 from training. Duration: 0.94 2023-10-02 05:06:17,719 WARNING [train.py:1204] (2/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_383-285196-0_sp1.1 from training. Duration: 0.8818125 2023-10-02 05:06:20,303 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.3.encoder.layers.0.self_attn_weights.whiten_keys, num_groups=8, num_channels=256, metric=5.09 vs. limit=6.0 2023-10-02 05:06:23,609 WARNING [train.py:1204] (2/4) Exclude cut with ID _45560_210_21982_1_1533812603951_3584999_238-47020-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 05:06:23,620 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_21500_210_2054_1_1532149310_2716758_278-18053-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 05:06:23,629 WARNING [train.py:1204] (2/4) Exclude cut with ID _56235_210_10640_1_1534557335235_4161099_453-9626-0_sp1.1 from training. Duration: 0.936375 2023-10-02 05:06:23,737 WARNING [train.py:1204] (2/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_153-97441-0_sp1.1 from training. Duration: 0.5090625 2023-10-02 05:06:30,135 WARNING [train.py:1204] (2/4) Exclude cut with ID _67725_210_27767_1_1535608756671_5105359_431-120895-0_sp1.1 from training. Duration: 0.963625 2023-10-02 05:06:33,009 WARNING [train.py:1204] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_574-45541-0_sp1.1 from training. Duration: 0.5909375 2023-10-02 05:06:33,026 WARNING [train.py:1204] (2/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_162-103620-0_sp1.1 from training. Duration: 0.8454375 2023-10-02 05:06:33,052 WARNING [train.py:1204] (2/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_734-129761-0_sp1.1 from training. Duration: 0.7454375 2023-10-02 05:06:34,374 WARNING [train.py:1204] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_620-276793-0_sp1.1 from training. Duration: 0.536375 2023-10-02 05:06:34,416 WARNING [train.py:1204] (2/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_153-97441-0_sp0.9 from training. Duration: 0.62225 2023-10-02 05:06:37,007 WARNING [train.py:1204] (2/4) Exclude cut with ID _12733_210_8341_1_1530167424556_3646380_523-85621-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 05:06:37,019 WARNING [train.py:1204] (2/4) Exclude cut with ID _64048_210_10626_1_1535198382802_3835179_352-179834-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 05:06:42,730 WARNING [train.py:1204] (2/4) Exclude cut with ID _55162_210_17071_1_1534408248583_3753480_43-242364-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 05:06:42,745 WARNING [train.py:1204] (2/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_661-264857-0 from training. Duration: 0.93 2023-10-02 05:06:42,752 WARNING [train.py:1204] (2/4) Exclude cut with ID _6274_210_2084_1_1527937583837_7494020_325-237800-0_sp1.1 from training. Duration: 0.97275 2023-10-02 05:06:44,064 WARNING [train.py:1204] (2/4) Exclude cut with ID _55328_210_27767_1_1534580973260_4088170_385-171459-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 05:06:45,495 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_3780_210_2087_1_1526467389_2145572_176-107140-0_sp1.1 from training. Duration: 0.62725 2023-10-02 05:06:47,241 WARNING [train.py:1204] (2/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_375-244976-0_sp1.1 from training. Duration: 0.6818125 2023-10-02 05:06:49,518 WARNING [train.py:1204] (2/4) Exclude cut with ID _35941_210_15379_1_1532995294515_4023089_309-33162-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 05:06:51,783 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.2.encoder.layers.0.nonlin_attention.whiten1, num_groups=1, num_channels=288, metric=8.12 vs. limit=10.0 2023-10-02 05:06:52,553 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.ff2_skip_rate, batch_count=760960.0, ans=0.0 2023-10-02 05:06:55,030 WARNING [train.py:1204] (2/4) Exclude cut with ID _14459_210_2228_1_1531443641946_7516700_1185-22038-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 05:06:57,738 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_1197-69460-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 05:07:01,063 INFO [train.py:1046] (2/4) Epoch 22, batch 2600, loss[loss=0.1707, simple_loss=0.2449, pruned_loss=0.04824, over 23623.00 frames. ], tot_loss[loss=0.1745, simple_loss=0.2505, pruned_loss=0.04927, over 4691981.57 frames. ], batch size: 135, lr: 4.67e-03, grad_scale: 16.0 2023-10-02 05:07:01,162 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9012W0022-501157 from training. Duration: 0.896 2023-10-02 05:07:02,195 INFO [scaling.py:1118] (2/4) WithLoss: name=encoder.encoders.5.encoder.layers.0.self_attn_weights, loss-sum=2.508e-03 2023-10-02 05:07:03,195 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9023W0009-506302 from training. Duration: 0.896 2023-10-02 05:07:03,209 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_2821_210_2715_1_1526108385_3763560_73-296673-0_sp1.1 from training. Duration: 0.7545625 2023-10-02 05:07:03,235 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9025W0113-507442 from training. Duration: 0.896 2023-10-02 05:07:03,309 WARNING [train.py:1204] (2/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_739-2098-0 from training. Duration: 0.92 2023-10-02 05:07:04,611 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9029W0332-509722 from training. Duration: 0.768 2023-10-02 05:07:05,969 WARNING [train.py:1204] (2/4) Exclude cut with ID _16908_210_5724_1_1531119157248_3772700_68-101953-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 05:07:05,995 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9042W0011-515617 from training. Duration: 0.768 2023-10-02 05:07:07,476 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_3019_210_2028_1_1526044245_3264865_191-72235-0 from training. Duration: 0.97 2023-10-02 05:07:08,880 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9052W0006-520792 from training. Duration: 0.896 2023-10-02 05:07:11,583 WARNING [train.py:1204] (2/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_245-174553-0_sp1.1 from training. Duration: 0.8 2023-10-02 05:07:14,203 WARNING [train.py:1204] (2/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_202-67017-0 from training. Duration: 0.82 2023-10-02 05:07:14,449 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.feed_forward2.hidden_balancer.prob, batch_count=761093.3333333334, ans=0.125 2023-10-02 05:07:15,657 WARNING [train.py:1204] (2/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_304-59227-0 from training. Duration: 0.77 2023-10-02 05:07:17,007 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_3133_210_2087_1_1526106446_2324690_246-125000-0_sp0.9 from training. Duration: 0.688875 2023-10-02 05:07:17,044 WARNING [train.py:1204] (2/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_243-42116-0 from training. Duration: 0.73 2023-10-02 05:07:20,895 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9087W0121-538492 from training. Duration: 0.896 2023-10-02 05:07:20,909 WARNING [train.py:1204] (2/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_120-76843-0 from training. Duration: 0.55 2023-10-02 05:07:27,882 WARNING [train.py:1204] (2/4) Exclude cut with ID _7852_210_5219_1_1528286471316_8139400_850-304621-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 05:07:27,901 WARNING [train.py:1204] (2/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_154-285415-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 05:07:28,146 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.conv_module2.balancer2.prob, batch_count=761093.3333333334, ans=0.125 2023-10-02 05:07:29,158 WARNING [train.py:1204] (2/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_538-278194-0_sp1.1 from training. Duration: 0.92725 2023-10-02 05:07:29,159 WARNING [train.py:1204] (2/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_119-207162-0 from training. Duration: 0.71 2023-10-02 05:07:30,648 WARNING [train.py:1204] (2/4) Exclude cut with ID _2334_210_2667_1_1525608238880_4705129_459-104997-0_sp1.1 from training. Duration: 0.863625 2023-10-02 05:07:33,965 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.582e+02 1.850e+02 2.100e+02 2.387e+02 3.462e+02, threshold=4.200e+02, percent-clipped=0.0 2023-10-02 05:07:38,639 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9147W0019-567847 from training. Duration: 0.768 2023-10-02 05:07:44,262 WARNING [train.py:1204] (2/4) Exclude cut with ID _69884_210_9771_1_1535846392818_4166169_228-189439-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 05:07:45,669 WARNING [train.py:1204] (2/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_196-77172-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 05:07:45,754 WARNING [train.py:1204] (2/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_625-338546-0 from training. Duration: 0.88 2023-10-02 05:07:47,274 WARNING [train.py:1204] (2/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_193-327630-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 05:07:47,276 WARNING [train.py:1204] (2/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_391-24006-0_sp1.1 from training. Duration: 0.92725 2023-10-02 05:07:47,371 WARNING [train.py:1204] (2/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_197-28164-0 from training. Duration: 0.8 2023-10-02 05:07:48,905 WARNING [train.py:1204] (2/4) Exclude cut with ID _8395_210_1531_1_1528597871523_4411579_431-132470-0_sp1.1 from training. Duration: 0.67275 2023-10-02 05:07:50,178 WARNING [train.py:1204] (2/4) Exclude cut with ID _35350_210_16040_1_1533040191183_4075299_465-248326-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 05:07:51,680 WARNING [train.py:1204] (2/4) Exclude cut with ID _16756_210_4915_1_1531047803763_7104259_101-141922-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 05:07:53,729 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=761226.6666666666, ans=0.1 2023-10-02 05:07:56,326 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9212W0005-600172 from training. Duration: 0.896 2023-10-02 05:07:56,357 WARNING [train.py:1204] (2/4) Exclude cut with ID _63186_210_2084_1_1535414479705_3908649_378-166820-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 05:07:56,579 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.conv_module1.balancer1.prob, batch_count=761226.6666666666, ans=0.125 2023-10-02 05:07:57,703 WARNING [train.py:1204] (2/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_52-211683-0_sp1.1 from training. Duration: 0.8181875 2023-10-02 05:07:58,039 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.ff2_skip_rate, batch_count=761226.6666666666, ans=0.0 2023-10-02 05:07:59,460 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.conv_skip_rate, batch_count=761293.3333333334, ans=0.0 2023-10-02 05:08:03,246 WARNING [train.py:1204] (2/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_1147-237729-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 05:08:03,334 WARNING [train.py:1204] (2/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_234-14174-0_sp0.9 from training. Duration: 0.911125 2023-10-02 05:08:03,353 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_454-276429-0 from training. Duration: 0.87 2023-10-02 05:08:04,688 WARNING [train.py:1204] (2/4) Exclude cut with ID _53713_210_24116_1_1534314487762_7416751_755-165313-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 05:08:06,091 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8278_210_2550_1_1528637099_4220413_436-317072-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 05:08:08,121 WARNING [train.py:1204] (2/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_28-324729-0_sp1.1 from training. Duration: 0.8545625 2023-10-02 05:08:13,820 WARNING [train.py:1204] (2/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_326-165900-0 from training. Duration: 0.69 2023-10-02 05:08:15,225 INFO [train.py:1046] (2/4) Epoch 22, batch 2650, loss[loss=0.1521, simple_loss=0.2338, pruned_loss=0.03522, over 24306.00 frames. ], tot_loss[loss=0.1745, simple_loss=0.2507, pruned_loss=0.04912, over 4708025.86 frames. ], batch size: 61, lr: 4.67e-03, grad_scale: 16.0 2023-10-02 05:08:15,286 WARNING [train.py:1204] (2/4) Exclude cut with ID _53489_210_6228_1_1534298357169_3835970_96-30246-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 05:08:17,055 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8275_210_4198_1_1528455320_3892571_491-17707-0_sp1.1 from training. Duration: 0.5818125 2023-10-02 05:08:19,843 WARNING [train.py:1204] (2/4) Exclude cut with ID _14348_210_8341_1_1530599434996_3778640_240-232370-0 from training. Duration: 0.84 2023-10-02 05:08:19,853 WARNING [train.py:1204] (2/4) Exclude cut with ID _45012_210_12651_1_1533608435136_5371380_54-112623-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 05:08:21,208 WARNING [train.py:1204] (2/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_328-264776-0_sp1.1 from training. Duration: 0.6090625 2023-10-02 05:08:21,484 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.self_attn_weights.pos_emb_skip_rate, batch_count=761360.0, ans=0.0 2023-10-02 05:08:23,069 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9325W0001-648427 from training. Duration: 0.768 2023-10-02 05:08:23,078 WARNING [train.py:1204] (2/4) Exclude cut with ID _56480_210_4220_1_1534679942079_3075409_49-74717-0_sp1.1 from training. Duration: 0.936375 2023-10-02 05:08:24,517 WARNING [train.py:1204] (2/4) Exclude cut with ID _59500_210_8254_1_1534809610680_3736009_6-241869-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 05:08:26,519 WARNING [train.py:1204] (2/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_172-215932-0_sp0.9 from training. Duration: 0.7333125 2023-10-02 05:08:27,891 WARNING [train.py:1204] (2/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_17-90541-0_sp1.1 from training. Duration: 0.8545625 2023-10-02 05:08:30,724 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_43831_210_2158_1_1533725502_5872791_525-89447-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 05:08:30,802 WARNING [train.py:1204] (2/4) Exclude cut with ID _12302_210_5219_1_1529924446466_8107280_907-182949-0 from training. Duration: 0.92 2023-10-02 05:08:30,815 WARNING [train.py:1204] (2/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_402-102311-0_sp0.9 from training. Duration: 0.7444375 2023-10-02 05:08:32,108 WARNING [train.py:1204] (2/4) Exclude cut with ID _8021_210_7994_1_1528376451358_3653069_179-45206-0_sp0.9 from training. Duration: 0.9555625 2023-10-02 05:08:35,017 WARNING [train.py:1204] (2/4) Exclude cut with ID _40137_210_12210_1_1533290484046_3795180_249-187059-0 from training. Duration: 0.88 2023-10-02 05:08:36,329 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9653W0194-675472 from training. Duration: 0.9658125 2023-10-02 05:08:38,945 WARNING [train.py:1204] (2/4) Exclude cut with ID _51332_210_9988_1_1534244371836_3801270_336-255604-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 05:08:41,613 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.conv_module1.balancer2.prob, batch_count=761426.6666666666, ans=0.125 2023-10-02 05:08:42,678 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_20120_210_6753_1_1531643035_5482859_510-96996-0 from training. Duration: 0.87 2023-10-02 05:08:43,998 WARNING [train.py:1204] (2/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_1167-20437-0_sp1.1 from training. Duration: 0.92725 2023-10-02 05:08:44,069 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_3475_210_2028_1_1526193578_3622569_265-276625-0 from training. Duration: 0.76 2023-10-02 05:08:44,320 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.feed_forward1.hidden_balancer.prob, batch_count=761493.3333333334, ans=0.125 2023-10-02 05:08:48,153 WARNING [train.py:1204] (2/4) Exclude cut with ID _14860_210_4915_1_1530702020340_7343240_470-172418-0_sp1.1 from training. Duration: 0.97275 2023-10-02 05:08:48,170 WARNING [train.py:1204] (2/4) Exclude cut with ID _5444_210_2151_1_1527418808464_5312210_581-315411-0_sp1.1 from training. Duration: 0.563625 2023-10-02 05:08:48,200 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_34450_210_9547_1_1533099443_4109050_519-342439-0_sp1.1 from training. Duration: 0.97275 2023-10-02 05:08:49,518 WARNING [train.py:1204] (2/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_50-175961-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 05:08:49,818 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.balancer1.prob, batch_count=761493.3333333334, ans=0.125 2023-10-02 05:08:52,387 WARNING [train.py:1204] (2/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_372-6869-0 from training. Duration: 0.44 2023-10-02 05:08:52,393 WARNING [train.py:1204] (2/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_3-237434-0 from training. Duration: 0.76 2023-10-02 05:08:56,940 WARNING [train.py:1204] (2/4) Exclude cut with ID _8772_210_1850_1_1528635595972_3664011_247-336448-0_sp1.1 from training. Duration: 0.67275 2023-10-02 05:09:01,590 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_427-78097-0 from training. Duration: 0.8 2023-10-02 05:09:01,611 WARNING [train.py:1204] (2/4) Exclude cut with ID _69884_210_9771_1_1535846392818_4166169_466-252727-0_sp1.1 from training. Duration: 0.97275 2023-10-02 05:09:01,680 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_15972_210_6400_1_1530877934_3405692_316-35478-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 05:09:02,987 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_809-17156-0_sp0.9 from training. Duration: 0.811125 2023-10-02 05:09:03,027 WARNING [train.py:1204] (2/4) Exclude cut with ID _57572_210_5517_1_1534921219624_3809484_594-263474-0_sp1.1 from training. Duration: 0.936375 2023-10-02 05:09:03,056 WARNING [train.py:1204] (2/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_190-228204-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 05:09:04,435 WARNING [train.py:1204] (2/4) Exclude cut with ID _19514_210_9259_1_1531530071330_5084230_117-21193-0_sp1.1 from training. Duration: 0.936375 2023-10-02 05:09:05,838 WARNING [train.py:1204] (2/4) Exclude cut with ID _69584_210_5219_1_1535785133775_4044450_190-21315-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 05:09:05,920 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_53071_210_16554_1_1534493434_1276796_184-302050-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 05:09:07,260 WARNING [train.py:1204] (2/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_120-76843-0_sp1.1 from training. Duration: 0.5 2023-10-02 05:09:08,608 WARNING [train.py:1204] (2/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_318-162017-0_sp1.1 from training. Duration: 0.82725 2023-10-02 05:09:10,544 WARNING [train.py:1204] (2/4) Exclude cut with ID _46067_210_4220_1_1534125571066_3307450_79-46864-0_sp1.1 from training. Duration: 0.92725 2023-10-02 05:09:10,593 WARNING [train.py:1204] (2/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_358-215558-0_sp1.1 from training. Duration: 0.8454375 2023-10-02 05:09:12,413 WARNING [train.py:1204] (2/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_621-66764-0_sp1.1 from training. Duration: 0.92725 2023-10-02 05:09:13,104 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.3.encoder.layers.3.nonlin_attention.whiten1, num_groups=1, num_channels=384, metric=5.36 vs. limit=10.0 2023-10-02 05:09:13,834 WARNING [train.py:1204] (2/4) Exclude cut with ID _42863_210_9070_1_1533902398788_3696920_338-156851-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 05:09:13,856 WARNING [train.py:1204] (2/4) Exclude cut with ID _5289_210_2084_1_1527678202736_3779299_392-261988-0_sp0.9 from training. Duration: 0.72225 2023-10-02 05:09:16,658 WARNING [train.py:1204] (2/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_409-84682-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 05:09:17,962 WARNING [train.py:1204] (2/4) Exclude cut with ID _9235_210_5219_1_1528891154253_8046120_30-326681-0_sp0.9 from training. Duration: 0.92225 2023-10-02 05:09:17,974 WARNING [train.py:1204] (2/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_389-184695-0_sp1.1 from training. Duration: 0.92725 2023-10-02 05:09:18,631 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.3.encoder.layers.2.self_attn_weights.whiten_keys, num_groups=8, num_channels=256, metric=4.20 vs. limit=6.0 2023-10-02 05:09:19,287 WARNING [train.py:1204] (2/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_442-320317-0 from training. Duration: 0.82 2023-10-02 05:09:20,799 WARNING [train.py:1204] (2/4) Exclude cut with ID _44778_210_2054_1_1533798127203_7443090_756-29911-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 05:09:23,435 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_401-112036-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 05:09:24,880 WARNING [train.py:1204] (2/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_181-308827-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 05:09:26,945 WARNING [train.py:1204] (2/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_332-258725-0_sp1.1 from training. Duration: 0.963625 2023-10-02 05:09:28,296 WARNING [train.py:1204] (2/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_188-260011-0_sp0.9 from training. Duration: 0.811125 2023-10-02 05:09:28,333 WARNING [train.py:1204] (2/4) Exclude cut with ID _49039_210_6315_1_1533967537541_3846118_99-302239-0_sp1.1 from training. Duration: 0.963625 2023-10-02 05:09:29,691 INFO [train.py:1046] (2/4) Epoch 22, batch 2700, loss[loss=0.164, simple_loss=0.2494, pruned_loss=0.03935, over 24308.00 frames. ], tot_loss[loss=0.1749, simple_loss=0.2513, pruned_loss=0.04924, over 4712661.64 frames. ], batch size: 61, lr: 4.67e-03, grad_scale: 16.0 2023-10-02 05:09:31,640 WARNING [train.py:1204] (2/4) Exclude cut with ID _4122_210_2084_1_1526713164417_4353280_312-294828-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 05:09:31,646 WARNING [train.py:1204] (2/4) Exclude cut with ID _14922_210_6228_1_1530707435273_3693310_303-228738-0 from training. Duration: 0.84 2023-10-02 05:09:33,142 WARNING [train.py:1204] (2/4) Exclude cut with ID _14349_210_8341_1_1530615710818_3442470_115-285975-0_sp0.9 from training. Duration: 0.9555625 2023-10-02 05:09:35,835 WARNING [train.py:1204] (2/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_125-144174-0_sp1.1 from training. Duration: 0.3545625 2023-10-02 05:09:38,424 WARNING [train.py:1204] (2/4) Exclude cut with ID _53565_210_10635_1_1534337218335_3820259_351-54570-0_sp1.1 from training. Duration: 0.97275 2023-10-02 05:09:38,443 WARNING [train.py:1204] (2/4) Exclude cut with ID _28317_210_12742_1_1532480302381_8510325_410-40065-0_sp1.1 from training. Duration: 0.963625 2023-10-02 05:09:38,473 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_38965_210_4852_1_1533174906_3484735_167-339442-0_sp1.1 from training. Duration: 0.963625 2023-10-02 05:09:38,555 WARNING [train.py:1204] (2/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_265-164486-0_sp1.1 from training. Duration: 0.87275 2023-10-02 05:09:39,860 WARNING [train.py:1204] (2/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_725-182484-0_sp1.1 from training. Duration: 0.92725 2023-10-02 05:09:39,888 WARNING [train.py:1204] (2/4) Exclude cut with ID _28317_210_12742_1_1532480302381_8510325_607-213951-0_sp0.9 from training. Duration: 0.9333125 2023-10-02 05:09:39,902 WARNING [train.py:1204] (2/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_605-133395-0_sp1.1 from training. Duration: 0.6 2023-10-02 05:09:39,916 WARNING [train.py:1204] (2/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_351-294650-0 from training. Duration: 0.63 2023-10-02 05:09:39,973 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_14953_210_2151_1_1530701710_5616165_710-165400-0_sp1.1 from training. Duration: 0.7909375 2023-10-02 05:09:41,984 WARNING [train.py:1204] (2/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_237-58433-0_sp1.1 from training. Duration: 0.67275 2023-10-02 05:09:43,376 WARNING [train.py:1204] (2/4) Exclude cut with ID _5190_210_4557_1_1527244368799_373439_28-197030-0_sp1.1 from training. Duration: 0.6545625 2023-10-02 05:09:44,645 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_743-327651-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 05:09:47,512 WARNING [train.py:1204] (2/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_76-203867-0_sp0.9 from training. Duration: 0.8 2023-10-02 05:09:47,606 WARNING [train.py:1204] (2/4) Exclude cut with ID _14603_210_4220_1_1530615688441_3097319_274-274798-0 from training. Duration: 0.98 2023-10-02 05:09:49,023 WARNING [train.py:1204] (2/4) Exclude cut with ID _14087_210_2253_1_1530529224512_3014029_226-242140-0_sp1.1 from training. Duration: 0.736375 2023-10-02 05:09:52,344 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.4.encoder.layers.2.conv_module2.whiten, num_groups=1, num_channels=384, metric=3.01 vs. limit=15.0 2023-10-02 05:09:53,266 WARNING [train.py:1204] (2/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_352-59958-0_sp1.1 from training. Duration: 0.7818125 2023-10-02 05:09:53,277 WARNING [train.py:1204] (2/4) Exclude cut with ID _39411_210_5986_1_1533538825030_3343649_191-251819-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 05:09:53,487 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.feed_forward1.out_proj.dropout_p, batch_count=761760.0, ans=0.1 2023-10-02 05:09:59,906 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7046_210_6753_1_1527895337_6920917_642-106799-0_sp1.1 from training. Duration: 0.836375 2023-10-02 05:09:59,913 WARNING [train.py:1204] (2/4) Exclude cut with ID _22826_210_9994_1_1531908201522_3279169_412-3442-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 05:09:59,939 WARNING [train.py:1204] (2/4) Exclude cut with ID _60057_210_10640_1_1534903065793_3333150_171-181133-0_sp0.9 from training. Duration: 0.97775 2023-10-02 05:10:00,176 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.self_attn_weights.pos_emb_skip_rate, batch_count=761826.6666666666, ans=0.0 2023-10-02 05:10:01,228 WARNING [train.py:1204] (2/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_154-135696-0_sp1.1 from training. Duration: 0.72725 2023-10-02 05:10:01,431 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.0.conv_module2.balancer2.prob, batch_count=761826.6666666666, ans=0.125 2023-10-02 05:10:02,471 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.447e+02 1.897e+02 2.080e+02 2.344e+02 3.157e+02, threshold=4.159e+02, percent-clipped=0.0 2023-10-02 05:10:03,972 WARNING [train.py:1204] (2/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_148-186805-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 05:10:06,791 WARNING [train.py:1204] (2/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_605-165641-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 05:10:06,798 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_174-277364-0_sp0.9 from training. Duration: 0.788875 2023-10-02 05:10:06,812 WARNING [train.py:1204] (2/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_89-321955-0_sp0.9 from training. Duration: 0.888875 2023-10-02 05:10:12,812 WARNING [train.py:1204] (2/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_438-105429-0_sp1.1 from training. Duration: 0.963625 2023-10-02 05:10:12,816 WARNING [train.py:1204] (2/4) Exclude cut with ID _14970_210_15709_1_1530773543493_1715730_187-20259-0_sp0.9 from training. Duration: 0.988875 2023-10-02 05:10:19,170 WARNING [train.py:1204] (2/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_385-193052-0_sp0.9 from training. Duration: 0.9555625 2023-10-02 05:10:20,554 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7113_210_1849_1_1527942685_3448768_194-311626-0_sp1.1 from training. Duration: 0.92725 2023-10-02 05:10:23,374 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_3780_210_2087_1_1526467389_2145572_176-107140-0_sp0.9 from training. Duration: 0.7666875 2023-10-02 05:10:23,375 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_36756_210_7234_1_1533026991_966276_123-213457-0_sp1.1 from training. Duration: 0.936375 2023-10-02 05:10:24,908 WARNING [train.py:1204] (2/4) Exclude cut with ID _23557_210_8750_1_1531890106370_6846255_456-244861-0_sp1.1 from training. Duration: 0.963625 2023-10-02 05:10:26,766 WARNING [train.py:1204] (2/4) Exclude cut with ID _37086_210_9038_1_1532999540891_7021250_242-349170-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 05:10:28,123 WARNING [train.py:1204] (2/4) Exclude cut with ID _41298_210_9019_1_1533430371450_4822143_471-281351-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 05:10:28,232 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_53378_210_17282_1_1534221071_5769509_572-126176-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 05:10:29,614 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_1507-263032-0_sp1.1 from training. Duration: 0.963625 2023-10-02 05:10:29,652 WARNING [train.py:1204] (2/4) Exclude cut with ID _13679_210_7497_1_1530534293662_3355098_411-186088-0_sp1.1 from training. Duration: 0.8545625 2023-10-02 05:10:34,123 WARNING [train.py:1204] (2/4) Exclude cut with ID _29345_210_4381_1_1532689168263_3860169_149-257268-0_sp1.1 from training. Duration: 0.736375 2023-10-02 05:10:35,422 WARNING [train.py:1204] (2/4) Exclude cut with ID _47790_210_16187_1_1533862814066_4207519_381-252014-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 05:10:35,428 WARNING [train.py:1204] (2/4) Exclude cut with ID _35799_210_5854_1_1533263487887_3529071_512-2717-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 05:10:38,141 WARNING [train.py:1204] (2/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_505-205270-0_sp0.9 from training. Duration: 0.42225 2023-10-02 05:10:38,225 WARNING [train.py:1204] (2/4) Exclude cut with ID _14998_210_5219_1_1530705582080_7474970_731-20117-0_sp1.1 from training. Duration: 0.936375 2023-10-02 05:10:39,667 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_37351_210_7925_1_1533012931_3812113_265-85780-0_sp1.1 from training. Duration: 0.8 2023-10-02 05:10:39,674 WARNING [train.py:1204] (2/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_313-134930-0 from training. Duration: 0.97 2023-10-02 05:10:41,502 WARNING [train.py:1204] (2/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_374-138971-0 from training. Duration: 0.94 2023-10-02 05:10:43,573 WARNING [train.py:1204] (2/4) Exclude cut with ID _30304_210_6319_1_1532476827261_7582060_1227-287886-0_sp1.1 from training. Duration: 0.936375 2023-10-02 05:10:44,759 INFO [train.py:1046] (2/4) Epoch 22, batch 2750, loss[loss=0.182, simple_loss=0.2531, pruned_loss=0.05548, over 23548.00 frames. ], tot_loss[loss=0.1742, simple_loss=0.2509, pruned_loss=0.0488, over 4716777.05 frames. ], batch size: 149, lr: 4.67e-03, grad_scale: 16.0 2023-10-02 05:10:46,227 WARNING [train.py:1204] (2/4) Exclude cut with ID _59493_210_17282_1_1535078419077_3225879_332-217544-0_sp1.1 from training. Duration: 0.97275 2023-10-02 05:10:46,261 WARNING [train.py:1204] (2/4) Exclude cut with ID _23514_210_1389_1_1531832139785_3031439_361-119640-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 05:10:46,718 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.5.encoder.layers.0.nonlin_attention.whiten1, num_groups=1, num_channels=192, metric=4.33 vs. limit=10.0 2023-10-02 05:10:47,512 WARNING [train.py:1204] (2/4) Exclude cut with ID _69584_210_5219_1_1535785133775_4044450_142-118058-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 05:10:48,882 WARNING [train.py:1204] (2/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_178-177582-0_sp1.1 from training. Duration: 0.67275 2023-10-02 05:10:48,927 WARNING [train.py:1204] (2/4) Exclude cut with ID _25303_210_17519_1_1532050207576_3635970_41-195572-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 05:10:53,041 WARNING [train.py:1204] (2/4) Exclude cut with ID _55328_210_27767_1_1534580973260_4088170_329-341322-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 05:10:53,083 WARNING [train.py:1204] (2/4) Exclude cut with ID _5608_210_2106_1_1527415439089_6819768_909-82767-0_sp1.1 from training. Duration: 0.5545625 2023-10-02 05:10:53,115 WARNING [train.py:1204] (2/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_261-297503-0_sp1.1 from training. Duration: 0.9 2023-10-02 05:10:54,410 WARNING [train.py:1204] (2/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_459-291706-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 05:10:54,411 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7762_210_1794_1_1528199508_4057408_422-227034-0 from training. Duration: 0.73 2023-10-02 05:10:54,420 WARNING [train.py:1204] (2/4) Exclude cut with ID _3067_210_2667_1_1526212974640_4503168_250-285937-0_sp0.9 from training. Duration: 0.888875 2023-10-02 05:10:54,434 WARNING [train.py:1204] (2/4) Exclude cut with ID _66873_210_5517_1_1535454136506_4293370_332-253745-0_sp1.1 from training. Duration: 0.936375 2023-10-02 05:11:00,920 WARNING [train.py:1204] (2/4) Exclude cut with ID _13938_210_9988_1_1530518718978_3693960_304-112784-0 from training. Duration: 0.46 2023-10-02 05:11:02,316 WARNING [train.py:1204] (2/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_226-267957-0_sp1.1 from training. Duration: 0.92725 2023-10-02 05:11:02,364 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_41822_210_13270_1_1533955976_4379284_608-254567-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 05:11:03,612 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_124-113562-0_sp1.1 from training. Duration: 0.8545625 2023-10-02 05:11:03,641 WARNING [train.py:1204] (2/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_117-290916-0_sp1.1 from training. Duration: 0.463625 2023-10-02 05:11:05,090 WARNING [train.py:1204] (2/4) Exclude cut with ID _28427_210_4220_1_1532581240967_3061069_102-66533-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 05:11:06,443 WARNING [train.py:1204] (2/4) Exclude cut with ID _55383_210_10635_1_1534468426172_2231669_134-128246-0_sp1.1 from training. Duration: 0.8090625 2023-10-02 05:11:06,739 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.bypass.skip_rate, batch_count=762093.3333333334, ans=0.04949747468305833 2023-10-02 05:11:07,818 WARNING [train.py:1204] (2/4) Exclude cut with ID _45871_210_5665_1_1533864614245_3296060_49-308316-0_sp1.1 from training. Duration: 0.97275 2023-10-02 05:11:07,873 WARNING [train.py:1204] (2/4) Exclude cut with ID _57572_210_5517_1_1534921219624_3809484_176-199205-0_sp1.1 from training. Duration: 0.97275 2023-10-02 05:11:13,735 WARNING [train.py:1204] (2/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_45-265684-0_sp1.1 from training. Duration: 0.6545625 2023-10-02 05:11:13,777 WARNING [train.py:1204] (2/4) Exclude cut with ID _15150_210_8341_1_1530752411803_7256384_469-239509-0_sp0.9 from training. Duration: 0.7555625 2023-10-02 05:11:13,813 WARNING [train.py:1204] (2/4) Exclude cut with ID _28461_210_2253_1_1532771775189_3041299_136-270644-0_sp1.1 from training. Duration: 0.8454375 2023-10-02 05:11:15,216 WARNING [train.py:1204] (2/4) Exclude cut with ID _69584_210_5219_1_1535785133775_4044450_302-47302-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 05:11:17,154 WARNING [train.py:1204] (2/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_166-128941-0_sp1.1 from training. Duration: 0.4909375 2023-10-02 05:11:17,391 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.conv_module1.balancer2.prob, batch_count=762160.0, ans=0.125 2023-10-02 05:11:22,680 WARNING [train.py:1204] (2/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_559-120504-0_sp1.1 from training. Duration: 0.97275 2023-10-02 05:11:25,608 WARNING [train.py:1204] (2/4) Exclude cut with ID _6456_210_1534_1_1527681634212_3181049_345-252877-0_sp1.1 from training. Duration: 0.5818125 2023-10-02 05:11:25,653 WARNING [train.py:1204] (2/4) Exclude cut with ID _28317_210_12742_1_1532480302381_8510325_902-233003-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 05:11:30,326 WARNING [train.py:1204] (2/4) Exclude cut with ID _36538_210_9994_1_1533204020363_3795620_204-261385-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 05:11:30,338 WARNING [train.py:1204] (2/4) Exclude cut with ID _14821_210_8750_1_1530680503991_6829359_189-297451-0_sp1.1 from training. Duration: 0.5 2023-10-02 05:11:30,378 WARNING [train.py:1204] (2/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_1211-226066-0_sp0.9 from training. Duration: 0.8444375 2023-10-02 05:11:30,552 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.conv_skip_rate, batch_count=762226.6666666666, ans=0.0 2023-10-02 05:11:36,480 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6786_210_1388_1_1527839699_4194495_273-149841-0_sp1.1 from training. Duration: 0.836375 2023-10-02 05:11:36,516 WARNING [train.py:1204] (2/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_380-162648-0_sp1.1 from training. Duration: 0.8545625 2023-10-02 05:11:36,516 WARNING [train.py:1204] (2/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_494-199538-0 from training. Duration: 0.75 2023-10-02 05:11:42,313 WARNING [train.py:1204] (2/4) Exclude cut with ID _49097_210_16049_1_1533952780207_4360979_276-87053-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 05:11:43,750 WARNING [train.py:1204] (2/4) Exclude cut with ID _5444_210_2151_1_1527418808464_5312210_726-27165-0 from training. Duration: 0.67 2023-10-02 05:11:49,856 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_128-202104-0_sp1.1 from training. Duration: 0.57275 2023-10-02 05:11:51,325 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_11349_210_6753_1_1529800861_1101207_26-65602-0_sp1.1 from training. Duration: 0.87275 2023-10-02 05:11:51,351 WARNING [train.py:1204] (2/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_159-328157-0 from training. Duration: 0.52 2023-10-02 05:11:52,703 WARNING [train.py:1204] (2/4) Exclude cut with ID _14069_210_5632_1_1530512692033_4803390_98-21934-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 05:11:52,839 WARNING [train.py:1204] (2/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_352-59958-0_sp0.9 from training. Duration: 0.9555625 2023-10-02 05:11:54,183 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6163_210_6400_1_1527506746_4263068_283-137811-0 from training. Duration: 0.77 2023-10-02 05:11:54,223 WARNING [train.py:1204] (2/4) Exclude cut with ID _21989_210_5854_1_1531899214931_3271360_133-44759-0_sp1.1 from training. Duration: 0.7 2023-10-02 05:11:57,468 WARNING [train.py:1204] (2/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_21-174514-0_sp0.9 from training. Duration: 0.47775 2023-10-02 05:11:58,746 INFO [train.py:1046] (2/4) Epoch 22, batch 2800, loss[loss=0.1755, simple_loss=0.2501, pruned_loss=0.05039, over 23219.00 frames. ], tot_loss[loss=0.1734, simple_loss=0.2503, pruned_loss=0.04825, over 4732695.12 frames. ], batch size: 93, lr: 4.67e-03, grad_scale: 32.0 2023-10-02 05:11:58,796 WARNING [train.py:1204] (2/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_201-78811-0_sp1.1 from training. Duration: 0.963625 2023-10-02 05:11:58,850 WARNING [train.py:1204] (2/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_537-164630-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 05:12:00,190 WARNING [train.py:1204] (2/4) Exclude cut with ID _14087_210_2253_1_1530529224512_3014029_156-257607-0 from training. Duration: 0.83 2023-10-02 05:12:00,208 WARNING [train.py:1204] (2/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_308-154450-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 05:12:00,259 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_45399_210_4852_1_1533624725_7579737_214-240574-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 05:12:02,265 WARNING [train.py:1204] (2/4) Exclude cut with ID _29303_210_2575_1_1532392237303_7410649_615-305517-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 05:12:03,534 WARNING [train.py:1204] (2/4) Exclude cut with ID IC0125W0002-61808 from training. Duration: 0.708 2023-10-02 05:12:03,535 WARNING [train.py:1204] (2/4) Exclude cut with ID IC0125W0017-61823 from training. Duration: 0.759 2023-10-02 05:12:06,423 WARNING [train.py:1204] (2/4) Exclude cut with ID _53219_210_4525_1_1534834836008_4064825_4-145291-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 05:12:09,085 WARNING [train.py:1204] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_185-174805-0_sp1.1 from training. Duration: 0.6181875 2023-10-02 05:12:09,106 WARNING [train.py:1204] (2/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_635-193046-0_sp1.1 from training. Duration: 0.92725 2023-10-02 05:12:12,066 WARNING [train.py:1204] (2/4) Exclude cut with ID _46067_210_4220_1_1534125571066_3307450_321-258866-0_sp1.1 from training. Duration: 0.97275 2023-10-02 05:12:12,714 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.4.encoder.layers.0.whiten, num_groups=1, num_channels=384, metric=4.94 vs. limit=12.0 2023-10-02 05:12:13,975 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8558_210_1410_1_1528505225_7613406_131-329524-0 from training. Duration: 0.79 2023-10-02 05:12:15,426 WARNING [train.py:1204] (2/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_847-41334-0_sp0.9 from training. Duration: 0.488875 2023-10-02 05:12:16,778 WARNING [train.py:1204] (2/4) Exclude cut with ID _14227_210_4381_1_1530701804286_3940259_125-275642-0 from training. Duration: 0.99 2023-10-02 05:12:19,473 WARNING [train.py:1204] (2/4) Exclude cut with ID _25248_210_8750_1_1532062825941_5554180_248-177255-0_sp1.1 from training. Duration: 0.963625 2023-10-02 05:12:20,116 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_40047_210_2290_1_1533290519_2374311_195-165693-0_sp1.1 from training. Duration: 0.8090625 2023-10-02 05:12:20,126 WARNING [train.py:1204] (2/4) Exclude cut with ID _23109_210_4381_1_1532150828777_901949_29-258672-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 05:12:20,315 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.nonlin_attention.balancer.prob, batch_count=762426.6666666666, ans=0.125 2023-10-02 05:12:23,965 WARNING [train.py:1204] (2/4) Exclude cut with ID _14821_210_8750_1_1530680503991_6829359_2-227151-0_sp1.1 from training. Duration: 0.7545625 2023-10-02 05:12:24,009 WARNING [train.py:1204] (2/4) Exclude cut with ID _49643_210_7989_1_1534640244224_3939031_565-53982-0_sp1.1 from training. Duration: 0.963625 2023-10-02 05:12:24,013 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7544_210_6753_1_1528591327_1471426_33-346031-0_sp0.9 from training. Duration: 0.67775 2023-10-02 05:12:25,402 WARNING [train.py:1204] (2/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_95-61782-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 05:12:31,712 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.490e+02 1.904e+02 2.151e+02 2.380e+02 3.525e+02, threshold=4.302e+02, percent-clipped=0.0 2023-10-02 05:12:34,905 WARNING [train.py:1204] (2/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_686-54383-0_sp0.9 from training. Duration: 0.9666875 2023-10-02 05:12:36,312 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_14056_210_4163_1_1530604483_7572827_505-276823-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 05:12:37,744 WARNING [train.py:1204] (2/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_745-128443-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 05:12:39,038 WARNING [train.py:1204] (2/4) Exclude cut with ID _3844_210_1390_1_1526727803582_2871209_20-251664-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 05:12:39,096 WARNING [train.py:1204] (2/4) Exclude cut with ID _36538_210_9994_1_1533204020363_3795620_487-164628-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 05:12:39,255 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.balancer1.prob, batch_count=762493.3333333334, ans=0.125 2023-10-02 05:12:43,415 WARNING [train.py:1204] (2/4) Exclude cut with ID _5166_210_2084_1_1527073197160_4087993_309-222545-0_sp1.1 from training. Duration: 0.763625 2023-10-02 05:12:43,436 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_3531_210_3528_1_1526294203_5216456_309-227302-0 from training. Duration: 0.69 2023-10-02 05:12:44,799 WARNING [train.py:1204] (2/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_42-192460-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 05:12:46,686 WARNING [train.py:1204] (2/4) Exclude cut with ID _22083_210_8254_1_1531702862439_3678970_49-234546-0_sp1.1 from training. Duration: 0.7545625 2023-10-02 05:12:46,690 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_427-78097-0_sp0.9 from training. Duration: 0.888875 2023-10-02 05:12:51,568 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_43802_210_2290_1_1533516203_8829504_378-205254-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 05:12:52,874 WARNING [train.py:1204] (2/4) Exclude cut with ID _45329_210_7005_1_1534157951211_6774339_672-314830-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 05:12:55,759 WARNING [train.py:1204] (2/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_90-30673-0_sp1.1 from training. Duration: 0.763625 2023-10-02 05:12:55,963 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.bypass_mid.scale_min, batch_count=762560.0, ans=0.2 2023-10-02 05:12:57,296 WARNING [train.py:1204] (2/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_563-329943-0_sp1.1 from training. Duration: 0.9 2023-10-02 05:12:57,309 WARNING [train.py:1204] (2/4) Exclude cut with ID _21632_210_2253_1_1531994001765_3744140_154-84090-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 05:12:57,310 WARNING [train.py:1204] (2/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_103-336064-0_sp0.9 from training. Duration: 0.5666875 2023-10-02 05:12:57,533 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.conv_module2.balancer1.prob, batch_count=762626.6666666666, ans=0.125 2023-10-02 05:12:58,622 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_43-71989-0_sp0.9 from training. Duration: 0.6333125 2023-10-02 05:12:58,675 WARNING [train.py:1204] (2/4) Exclude cut with ID _3867_210_1883_1_1526641200575_4475830_319-349850-0_sp0.9 from training. Duration: 0.7444375 2023-10-02 05:13:00,494 WARNING [train.py:1204] (2/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_629-179454-0_sp1.1 from training. Duration: 0.963625 2023-10-02 05:13:00,504 WARNING [train.py:1204] (2/4) Exclude cut with ID _65263_210_22356_1_1535763598691_3921037_49-222917-0 from training. Duration: 0.92 2023-10-02 05:13:01,788 WARNING [train.py:1204] (2/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_602-331772-0_sp1.1 from training. Duration: 0.936375 2023-10-02 05:13:03,095 WARNING [train.py:1204] (2/4) Exclude cut with ID _58414_210_8254_1_1535421609162_7151182_292-134978-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 05:13:03,129 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_38793_210_2544_1_1533176114_3849320_200-299292-0_sp1.1 from training. Duration: 0.936375 2023-10-02 05:13:03,291 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.feed_forward1.hidden_balancer.prob, batch_count=762626.6666666666, ans=0.125 2023-10-02 05:13:04,512 WARNING [train.py:1204] (2/4) Exclude cut with ID _14970_210_15709_1_1530775547361_1966965_6-23328-0 from training. Duration: 0.86 2023-10-02 05:13:05,952 WARNING [train.py:1204] (2/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_386-225511-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 05:13:05,972 WARNING [train.py:1204] (2/4) Exclude cut with ID _38323_210_12108_1_1533121233369_3303049_416-11919-0_sp1.1 from training. Duration: 0.87275 2023-10-02 05:13:08,243 WARNING [train.py:1204] (2/4) Exclude cut with ID _4866_210_2577_1_1527404412284_4364659_256-306830-0_sp1.1 from training. Duration: 0.7909375 2023-10-02 05:13:08,371 WARNING [train.py:1204] (2/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_53-300544-0 from training. Duration: 0.67 2023-10-02 05:13:14,067 INFO [train.py:1046] (2/4) Epoch 22, batch 2850, loss[loss=0.1575, simple_loss=0.2365, pruned_loss=0.03927, over 23414.00 frames. ], tot_loss[loss=0.1726, simple_loss=0.2491, pruned_loss=0.04805, over 4720020.15 frames. ], batch size: 119, lr: 4.67e-03, grad_scale: 32.0 2023-10-02 05:13:14,160 WARNING [train.py:1204] (2/4) Exclude cut with ID _41314_210_12651_1_1533459476543_3766460_80-74184-0_sp1.1 from training. Duration: 0.7545625 2023-10-02 05:13:14,175 WARNING [train.py:1204] (2/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_332-256168-0_sp1.1 from training. Duration: 0.6818125 2023-10-02 05:13:15,497 WARNING [train.py:1204] (2/4) Exclude cut with ID _48836_210_12210_1_1534075848293_3904150_429-35475-0_sp1.1 from training. Duration: 0.8090625 2023-10-02 05:13:17,042 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.nonlin_attention.balancer.max_positive, batch_count=762693.3333333334, ans=0.95 2023-10-02 05:13:18,162 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_144-135833-0_sp1.1 from training. Duration: 0.97275 2023-10-02 05:13:18,410 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.conv_skip_rate, batch_count=762693.3333333334, ans=0.0 2023-10-02 05:13:21,448 WARNING [train.py:1204] (2/4) Exclude cut with ID _14224_210_4381_1_1530529062037_4264010_380-343670-0_sp0.9 from training. Duration: 0.97775 2023-10-02 05:13:23,435 WARNING [train.py:1204] (2/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_213-265174-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 05:13:23,458 WARNING [train.py:1204] (2/4) Exclude cut with ID _20455_210_5219_1_1531565996775_8098100_86-314283-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 05:13:25,986 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_41750_210_1960_1_1533520678_3948042_68-223813-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 05:13:26,051 WARNING [train.py:1204] (2/4) Exclude cut with ID _55153_210_2253_1_1534384786677_3162096_24-198429-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 05:13:26,298 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.conv_skip_rate, batch_count=762693.3333333334, ans=0.0 2023-10-02 05:13:27,399 WARNING [train.py:1204] (2/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_119-207162-0_sp0.9 from training. Duration: 0.788875 2023-10-02 05:13:27,574 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.0.nonlin_attention.balancer.prob, batch_count=762760.0, ans=0.125 2023-10-02 05:13:27,697 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.feed_forward1.out_proj.dropout_p, batch_count=762760.0, ans=0.1 2023-10-02 05:13:28,762 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_12868_210_6753_1_1530702479_3610564_148-172861-0 from training. Duration: 0.5 2023-10-02 05:13:33,508 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_5522_210_3273_1_1527249606_3909096_318-203153-0 from training. Duration: 0.43 2023-10-02 05:13:33,515 WARNING [train.py:1204] (2/4) Exclude cut with ID _14884_210_2252_1_1530707459689_5561250_288-189949-0_sp1.1 from training. Duration: 0.97275 2023-10-02 05:13:34,961 WARNING [train.py:1204] (2/4) Exclude cut with ID _5294_210_1747_1_1527249643928_3696179_529-242298-0 from training. Duration: 0.95 2023-10-02 05:13:36,313 WARNING [train.py:1204] (2/4) Exclude cut with ID _36538_210_9994_1_1533204020363_3795620_503-110289-0_sp1.1 from training. Duration: 0.936375 2023-10-02 05:13:37,732 WARNING [train.py:1204] (2/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_385-193052-0 from training. Duration: 0.86 2023-10-02 05:13:39,758 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_456-22141-0 from training. Duration: 0.88 2023-10-02 05:13:42,499 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_38965_210_4852_1_1533174906_3484735_284-117840-0_sp1.1 from training. Duration: 0.936375 2023-10-02 05:13:51,696 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.conv_module2.balancer1.prob, batch_count=762826.6666666666, ans=0.125 2023-10-02 05:13:52,891 WARNING [train.py:1204] (2/4) Exclude cut with ID _32704_210_8954_1_1533017488840_6747370_75-201625-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 05:13:52,989 WARNING [train.py:1204] (2/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_255-312654-0_sp1.1 from training. Duration: 0.82725 2023-10-02 05:13:54,276 WARNING [train.py:1204] (2/4) Exclude cut with ID _47251_210_18628_1_1533814063128_4137250_214-88076-0_sp0.9 from training. Duration: 0.97775 2023-10-02 05:13:55,653 WARNING [train.py:1204] (2/4) Exclude cut with ID _4092_210_2151_1_1526643044449_3135759_227-213972-0_sp1.1 from training. Duration: 0.4909375 2023-10-02 05:13:55,675 WARNING [train.py:1204] (2/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_912-47473-0_sp1.1 from training. Duration: 0.6181875 2023-10-02 05:13:55,702 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6767_210_3273_1_1527855783_1718932_173-256601-0_sp1.1 from training. Duration: 0.72725 2023-10-02 05:13:57,106 WARNING [train.py:1204] (2/4) Exclude cut with ID _6274_210_2084_1_1527937583837_7494020_32-205288-0_sp1.1 from training. Duration: 0.7090625 2023-10-02 05:13:58,547 WARNING [train.py:1204] (2/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_39-248870-0 from training. Duration: 0.69 2023-10-02 05:14:00,451 WARNING [train.py:1204] (2/4) Exclude cut with ID _3541_210_2477_1_1526475408709_3263973_377-296839-0_sp1.1 from training. Duration: 0.5 2023-10-02 05:14:00,465 WARNING [train.py:1204] (2/4) Exclude cut with ID _13679_210_7497_1_1530534293662_3355098_413-56154-0_sp1.1 from training. Duration: 0.963625 2023-10-02 05:14:00,514 WARNING [train.py:1204] (2/4) Exclude cut with ID _14970_210_15709_1_1530775547361_1966965_201-84544-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 05:14:00,634 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.conv_module1.balancer2.prob, batch_count=762893.3333333334, ans=0.125 2023-10-02 05:14:01,809 WARNING [train.py:1204] (2/4) Exclude cut with ID _21783_210_11357_1_1531742458730_3762679_105-95775-0_sp1.1 from training. Duration: 0.936375 2023-10-02 05:14:04,487 WARNING [train.py:1204] (2/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_131-191669-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 05:14:05,841 WARNING [train.py:1204] (2/4) Exclude cut with ID _14860_210_4915_1_1530702020340_7343240_695-4323-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 05:14:07,169 WARNING [train.py:1204] (2/4) Exclude cut with ID _39867_210_9826_1_1533553222030_9344259_991-130565-0_sp1.1 from training. Duration: 0.97275 2023-10-02 05:14:09,784 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_50-3289-0_sp1.1 from training. Duration: 0.82725 2023-10-02 05:14:11,235 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_54-347670-0_sp1.1 from training. Duration: 0.92725 2023-10-02 05:14:11,299 WARNING [train.py:1204] (2/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_200-153445-0_sp1.1 from training. Duration: 0.936375 2023-10-02 05:14:13,270 WARNING [train.py:1204] (2/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_279-302911-0_sp1.1 from training. Duration: 0.97275 2023-10-02 05:14:15,964 WARNING [train.py:1204] (2/4) Exclude cut with ID _14886_210_4220_1_1530702020355_3516190_159-73748-0_sp0.9 from training. Duration: 0.92225 2023-10-02 05:14:18,888 WARNING [train.py:1204] (2/4) Exclude cut with ID _53379_210_20034_1_1534222027363_3854654_23-222715-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 05:14:19,143 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.conv_skip_rate, batch_count=762960.0, ans=0.0 2023-10-02 05:14:20,334 WARNING [train.py:1204] (2/4) Exclude cut with ID _12555_210_8750_1_1530075594402_3780239_449-267986-0 from training. Duration: 0.83 2023-10-02 05:14:20,511 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.conv_module2.balancer2.prob, batch_count=762960.0, ans=0.125 2023-10-02 05:14:22,359 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7544_210_6753_1_1528587722_3576686_295-258479-0 from training. Duration: 0.84 2023-10-02 05:14:23,620 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7002_210_1794_1_1527935634_4034444_487-236501-0_sp0.9 from training. Duration: 0.7333125 2023-10-02 05:14:23,676 WARNING [train.py:1204] (2/4) Exclude cut with ID _21783_210_11357_1_1531742458730_3762679_244-19107-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 05:14:23,692 WARNING [train.py:1204] (2/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_390-339137-0 from training. Duration: 0.56 2023-10-02 05:14:23,742 WARNING [train.py:1204] (2/4) Exclude cut with ID _7562_210_2084_1_1528887767916_3946229_186-326422-0_sp1.1 from training. Duration: 0.763625 2023-10-02 05:14:25,617 WARNING [train.py:1204] (2/4) Exclude cut with ID _33640_210_2655_1_1533117605479_4855469_645-145235-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 05:14:25,639 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_25767_210_2161_1_1532307483_3867748_506-171777-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 05:14:26,920 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_5438_210_1794_1_1527336687_2600009_233-58072-0_sp1.1 from training. Duration: 0.7 2023-10-02 05:14:26,921 WARNING [train.py:1204] (2/4) Exclude cut with ID IC0669W0063-330908 from training. Duration: 0.896 2023-10-02 05:14:26,962 WARNING [train.py:1204] (2/4) Exclude cut with ID IC0671W0070-331913 from training. Duration: 0.896 2023-10-02 05:14:26,965 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_258-310570-0_sp1.1 from training. Duration: 0.7454375 2023-10-02 05:14:27,028 WARNING [train.py:1204] (2/4) Exclude cut with ID _9461_210_4557_1_1529060514726_3677451_162-177507-0_sp1.1 from training. Duration: 0.97275 2023-10-02 05:14:28,338 INFO [train.py:1046] (2/4) Epoch 22, batch 2900, loss[loss=0.1777, simple_loss=0.2522, pruned_loss=0.05162, over 23434.00 frames. ], tot_loss[loss=0.1725, simple_loss=0.2487, pruned_loss=0.04822, over 4714527.24 frames. ], batch size: 134, lr: 4.67e-03, grad_scale: 32.0 2023-10-02 05:14:29,058 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.4.encoder.layers.2.conv_module1.whiten, num_groups=1, num_channels=384, metric=2.70 vs. limit=15.0 2023-10-02 05:14:31,353 WARNING [train.py:1204] (2/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_26-175553-0_sp0.9 from training. Duration: 0.67775 2023-10-02 05:14:32,574 WARNING [train.py:1204] (2/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_7-150481-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 05:14:32,625 WARNING [train.py:1204] (2/4) Exclude cut with ID _23557_210_8750_1_1531890106370_6846255_72-273262-0_sp1.1 from training. Duration: 0.963625 2023-10-02 05:14:34,459 WARNING [train.py:1204] (2/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_367-61330-0 from training. Duration: 0.75 2023-10-02 05:14:38,575 WARNING [train.py:1204] (2/4) Exclude cut with ID _39867_210_9826_1_1533553222030_9344259_1337-11141-0_sp1.1 from training. Duration: 0.936375 2023-10-02 05:14:38,603 WARNING [train.py:1204] (2/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_101-293367-0 from training. Duration: 0.63 2023-10-02 05:14:39,990 WARNING [train.py:1204] (2/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_10-100367-0 from training. Duration: 0.98 2023-10-02 05:14:41,959 WARNING [train.py:1204] (2/4) Exclude cut with ID _60057_210_10640_1_1534903065793_3333150_171-181133-0_sp1.1 from training. Duration: 0.8 2023-10-02 05:14:41,961 WARNING [train.py:1204] (2/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_844-96989-0_sp0.9 from training. Duration: 0.788875 2023-10-02 05:14:43,436 WARNING [train.py:1204] (2/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_301-144595-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 05:14:43,536 WARNING [train.py:1204] (2/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_115-147343-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 05:14:47,586 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_3171_210_2087_1_1526109084_2330007_37-64334-0_sp1.1 from training. Duration: 0.7454375 2023-10-02 05:14:47,636 WARNING [train.py:1204] (2/4) Exclude cut with ID _19514_210_9259_1_1531530071330_5084230_769-272914-0_sp1.1 from training. Duration: 0.936375 2023-10-02 05:14:50,314 WARNING [train.py:1204] (2/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_76-311963-0_sp0.9 from training. Duration: 0.77775 2023-10-02 05:14:51,652 WARNING [train.py:1204] (2/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_526-62079-0 from training. Duration: 0.67 2023-10-02 05:14:51,715 WARNING [train.py:1204] (2/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_7-111954-0_sp0.9 from training. Duration: 0.62225 2023-10-02 05:14:53,702 WARNING [train.py:1204] (2/4) Exclude cut with ID _57997_210_4163_1_1534766425889_3961049_360-279140-0_sp1.1 from training. Duration: 0.97275 2023-10-02 05:14:57,049 WARNING [train.py:1204] (2/4) Exclude cut with ID _4866_210_2577_1_1527404412284_4364659_133-1696-0 from training. Duration: 0.99 2023-10-02 05:14:57,119 WARNING [train.py:1204] (2/4) Exclude cut with ID _5190_210_4557_1_1527244368799_373439_28-197030-0 from training. Duration: 0.72 2023-10-02 05:14:59,968 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_53-39003-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 05:14:59,970 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6673_210_3273_1_1527767790_3173240_351-167954-0 from training. Duration: 0.48 2023-10-02 05:14:59,995 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_2822_210_2715_1_1526112586_1999587_165-36656-0_sp1.1 from training. Duration: 0.8090625 2023-10-02 05:15:02,605 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.540e+02 1.846e+02 2.042e+02 2.296e+02 2.937e+02, threshold=4.085e+02, percent-clipped=0.0 2023-10-02 05:15:03,993 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_328-264892-0_sp1.1 from training. Duration: 0.92725 2023-10-02 05:15:03,997 WARNING [train.py:1204] (2/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_29-293896-0_sp0.9 from training. Duration: 0.67775 2023-10-02 05:15:06,072 WARNING [train.py:1204] (2/4) Exclude cut with ID _65263_210_22356_1_1535763598691_3921037_465-55448-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 05:15:07,419 WARNING [train.py:1204] (2/4) Exclude cut with ID _21989_210_5854_1_1531899214931_3271360_311-53106-0_sp1.1 from training. Duration: 0.97275 2023-10-02 05:15:10,300 WARNING [train.py:1204] (2/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_389-277915-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 05:15:13,004 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_69630_210_7234_1_1535791094_677837_63-277139-0_sp1.1 from training. Duration: 0.963625 2023-10-02 05:15:14,337 WARNING [train.py:1204] (2/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_85-138257-0 from training. Duration: 0.71 2023-10-02 05:15:14,359 WARNING [train.py:1204] (2/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_686-247600-0 from training. Duration: 0.92 2023-10-02 05:15:14,371 WARNING [train.py:1204] (2/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_108-255430-0_sp1.1 from training. Duration: 0.8818125 2023-10-02 05:15:14,866 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.4.encoder.layers.1.whiten, num_groups=1, num_channels=384, metric=4.53 vs. limit=12.0 2023-10-02 05:15:18,883 WARNING [train.py:1204] (2/4) Exclude cut with ID _14884_210_2252_1_1530707459689_5561250_114-201399-0_sp1.1 from training. Duration: 0.6909375 2023-10-02 05:15:19,582 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.4.encoder.layers.0.feed_forward3.out_whiten, num_groups=1, num_channels=384, metric=14.14 vs. limit=15.0 2023-10-02 05:15:20,375 WARNING [train.py:1204] (2/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_506-42614-0 from training. Duration: 0.79 2023-10-02 05:15:21,796 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_43802_210_2290_1_1533516203_8829504_85-195240-0_sp1.1 from training. Duration: 0.7909375 2023-10-02 05:15:26,873 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_141-36243-0_sp1.1 from training. Duration: 0.97275 2023-10-02 05:15:35,778 WARNING [train.py:1204] (2/4) Exclude cut with ID _7852_210_5219_1_1528286471316_8139400_358-287492-0_sp1.1 from training. Duration: 0.87275 2023-10-02 05:15:35,803 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_805-71497-0_sp0.9 from training. Duration: 0.788875 2023-10-02 05:15:37,163 WARNING [train.py:1204] (2/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_178-39722-0 from training. Duration: 0.49 2023-10-02 05:15:41,174 WARNING [train.py:1204] (2/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_428-181925-0_sp1.1 from training. Duration: 0.963625 2023-10-02 05:15:41,178 WARNING [train.py:1204] (2/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_770-200430-0 from training. Duration: 0.89 2023-10-02 05:15:41,226 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_1104-271009-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 05:15:41,281 WARNING [train.py:1204] (2/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_128-296581-0_sp0.9 from training. Duration: 0.82225 2023-10-02 05:15:42,578 INFO [train.py:1046] (2/4) Epoch 22, batch 2950, loss[loss=0.1853, simple_loss=0.2622, pruned_loss=0.05425, over 23315.00 frames. ], tot_loss[loss=0.173, simple_loss=0.2495, pruned_loss=0.04832, over 4714354.00 frames. ], batch size: 105, lr: 4.66e-03, grad_scale: 16.0 2023-10-02 05:15:46,193 WARNING [train.py:1204] (2/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_19-62511-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 05:15:48,822 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_805-71497-0 from training. Duration: 0.71 2023-10-02 05:15:50,073 WARNING [train.py:1204] (2/4) Exclude cut with ID _44778_210_2054_1_1533798127203_7443090_573-152077-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 05:15:50,079 WARNING [train.py:1204] (2/4) Exclude cut with ID _42875_210_14856_1_1533704397487_4012433_393-177816-0_sp1.1 from training. Duration: 0.963625 2023-10-02 05:15:51,462 WARNING [train.py:1204] (2/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_278-35159-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 05:15:54,166 WARNING [train.py:1204] (2/4) Exclude cut with ID _15037_210_2253_1_1530752294104_3267650_13-130361-0_sp1.1 from training. Duration: 0.9 2023-10-02 05:15:54,263 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_3019_210_2028_1_1526044245_3264865_127-135615-0 from training. Duration: 0.94 2023-10-02 05:15:56,286 WARNING [train.py:1204] (2/4) Exclude cut with ID _28317_210_12742_1_1532480302381_8510325_607-213951-0 from training. Duration: 0.84 2023-10-02 05:15:56,350 WARNING [train.py:1204] (2/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_436-318113-0_sp0.9 from training. Duration: 0.8333125 2023-10-02 05:15:56,351 WARNING [train.py:1204] (2/4) Exclude cut with ID _13938_210_9988_1_1530518718978_3693960_148-152225-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 05:16:01,474 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.conv_module2.balancer1.min_positive, batch_count=763426.6666666666, ans=0.025 2023-10-02 05:16:02,674 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_2541_210_1794_1_1525590269_3084765_156-173713-0_sp1.1 from training. Duration: 0.736375 2023-10-02 05:16:04,235 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.ff2_skip_rate, batch_count=763426.6666666666, ans=0.0 2023-10-02 05:16:05,900 WARNING [train.py:1204] (2/4) Exclude cut with ID _28323_210_16187_1_1532480395667_4100300_531-24157-0_sp1.1 from training. Duration: 0.8909375 2023-10-02 05:16:07,399 WARNING [train.py:1204] (2/4) Exclude cut with ID _29303_210_2575_1_1532392237303_7410649_382-217492-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 05:16:07,626 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.conv_module2.balancer2.min_positive, batch_count=763426.6666666666, ans=0.05 2023-10-02 05:16:08,702 WARNING [train.py:1204] (2/4) Exclude cut with ID _58895_210_8750_1_1535184081320_3649089_270-158013-0_sp1.1 from training. Duration: 0.92725 2023-10-02 05:16:11,524 WARNING [train.py:1204] (2/4) Exclude cut with ID _29277_210_3279_1_1532581109293_3173069_311-206227-0_sp1.1 from training. Duration: 0.936375 2023-10-02 05:16:11,545 WARNING [train.py:1204] (2/4) Exclude cut with ID _7160_210_1876_1_1527940923231_3703799_282-322611-0_sp0.9 from training. Duration: 0.9666875 2023-10-02 05:16:14,210 WARNING [train.py:1204] (2/4) Exclude cut with ID _53219_210_4525_1_1534834836008_4064825_562-32295-0_sp1.1 from training. Duration: 0.963625 2023-10-02 05:16:14,284 WARNING [train.py:1204] (2/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_408-87330-0_sp1.1 from training. Duration: 0.963625 2023-10-02 05:16:14,299 WARNING [train.py:1204] (2/4) Exclude cut with ID _59202_210_26984_1_1534816810612_3549174_137-318710-0_sp1.1 from training. Duration: 0.8090625 2023-10-02 05:16:17,064 WARNING [train.py:1204] (2/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_372-30302-0 from training. Duration: 0.86 2023-10-02 05:16:21,960 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7543_210_6753_1_1528541907_1399322_178-56269-0_sp1.1 from training. Duration: 0.95275 2023-10-02 05:16:21,978 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9109W0012-549758 from training. Duration: 0.512 2023-10-02 05:16:22,049 WARNING [train.py:1204] (2/4) Exclude cut with ID _5410_210_1388_1_1527397055795_1719020_39-112221-0_sp0.9 from training. Duration: 0.7666875 2023-10-02 05:16:23,484 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9121W0012-554933 from training. Duration: 0.896 2023-10-02 05:16:26,187 WARNING [train.py:1204] (2/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_476-272757-0 from training. Duration: 0.93 2023-10-02 05:16:26,213 WARNING [train.py:1204] (2/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_1085-171718-0_sp1.1 from training. Duration: 0.92725 2023-10-02 05:16:26,267 WARNING [train.py:1204] (2/4) Exclude cut with ID _8480_210_1874_1_1528589391205_6728330_309-139896-0_sp1.1 from training. Duration: 0.736375 2023-10-02 05:16:26,267 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9130W0113-559703 from training. Duration: 0.896 2023-10-02 05:16:26,271 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_292-265928-0_sp1.1 from training. Duration: 0.463625 2023-10-02 05:16:29,655 WARNING [train.py:1204] (2/4) Exclude cut with ID _8772_210_1850_1_1528635595972_3664011_96-12756-0 from training. Duration: 0.98 2023-10-02 05:16:29,712 WARNING [train.py:1204] (2/4) Exclude cut with ID _12556_210_8341_1_1530081024100_7327690_725-193477-0_sp1.1 from training. Duration: 0.8909375 2023-10-02 05:16:29,747 WARNING [train.py:1204] (2/4) Exclude cut with ID _5289_210_2084_1_1527678202736_3779299_142-103317-0_sp1.1 from training. Duration: 0.763625 2023-10-02 05:16:30,024 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.conv_module2.balancer2.prob, batch_count=763560.0, ans=0.125 2023-10-02 05:16:32,925 WARNING [train.py:1204] (2/4) Exclude cut with ID _45012_210_12651_1_1533608435136_5371380_206-51075-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 05:16:33,180 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=763560.0, ans=0.1 2023-10-02 05:16:34,316 WARNING [train.py:1204] (2/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_176-56120-0_sp1.1 from training. Duration: 0.7545625 2023-10-02 05:16:34,347 WARNING [train.py:1204] (2/4) Exclude cut with ID _40284_210_2084_1_1533513679814_3704279_220-201864-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 05:16:34,376 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9165W0004-576638 from training. Duration: 0.896 2023-10-02 05:16:35,182 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.0.layers.1.feed_forward3.out_whiten, num_groups=1, num_channels=192, metric=8.01 vs. limit=15.0 2023-10-02 05:16:35,701 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_5438_210_1794_1_1527336687_2600009_218-343336-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 05:16:35,735 WARNING [train.py:1204] (2/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_318-162017-0 from training. Duration: 0.91 2023-10-02 05:16:43,182 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_49544_210_1794_1_1534379770_5262083_483-349298-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 05:16:44,467 WARNING [train.py:1204] (2/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_197-28164-0_sp0.9 from training. Duration: 0.888875 2023-10-02 05:16:44,523 WARNING [train.py:1204] (2/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_643-52007-0 from training. Duration: 0.57 2023-10-02 05:16:44,548 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_38965_210_4852_1_1533174906_3484735_403-332963-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 05:16:45,939 WARNING [train.py:1204] (2/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_340-174176-0 from training. Duration: 0.49 2023-10-02 05:16:50,556 WARNING [train.py:1204] (2/4) Exclude cut with ID _56708_210_13177_1_1535252351282_3422899_363-917-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 05:16:52,010 WARNING [train.py:1204] (2/4) Exclude cut with ID _19896_210_1883_1_1531709022662_6649569_525-159063-0_sp1.1 from training. Duration: 0.936375 2023-10-02 05:16:53,338 WARNING [train.py:1204] (2/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_430-116139-0_sp0.9 from training. Duration: 0.9444375 2023-10-02 05:16:53,460 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_53753_210_7234_1_1534320003_3594148_86-329250-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 05:16:53,469 WARNING [train.py:1204] (2/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_406-275649-0_sp1.1 from training. Duration: 0.3818125 2023-10-02 05:16:54,968 WARNING [train.py:1204] (2/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_1083-147536-0_sp1.1 from training. Duration: 0.8818125 2023-10-02 05:16:57,523 INFO [train.py:1046] (2/4) Epoch 22, batch 3000, loss[loss=0.1668, simple_loss=0.2566, pruned_loss=0.03849, over 24447.00 frames. ], tot_loss[loss=0.1733, simple_loss=0.2504, pruned_loss=0.04809, over 4722156.23 frames. ], batch size: 69, lr: 4.66e-03, grad_scale: 16.0 2023-10-02 05:16:57,523 INFO [train.py:1069] (2/4) Computing validation loss 2023-10-02 05:17:11,210 INFO [zipformer.py:1853] (2/4) name=encoder.encoders.5.encoder.layers.0.self_attn_weights, attn_weights_entropy = tensor([1.7660, 1.7436, 4.2408, 3.8843], device='cuda:2') 2023-10-02 05:17:11,895 INFO [zipformer.py:1853] (2/4) name=encoder.encoders.1.encoder.layers.1.self_attn_weights, attn_weights_entropy = tensor([4.0708, 3.8405, 3.6081, 3.5816], device='cuda:2') 2023-10-02 05:17:15,402 INFO [train.py:1078] (2/4) Epoch 22, validation: loss=0.3452, simple_loss=0.2763, pruned_loss=0.2071, over 1125622.00 frames. 2023-10-02 05:17:15,402 INFO [train.py:1079] (2/4) Maximum memory allocated so far is 20967MB 2023-10-02 05:17:15,444 WARNING [train.py:1204] (2/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_54-311741-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 05:17:15,450 WARNING [train.py:1204] (2/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_237-58433-0_sp0.9 from training. Duration: 0.82225 2023-10-02 05:17:15,483 WARNING [train.py:1204] (2/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_98-51676-0_sp1.1 from training. Duration: 0.663625 2023-10-02 05:17:15,558 WARNING [train.py:1204] (2/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_386-270472-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 05:17:16,870 WARNING [train.py:1204] (2/4) Exclude cut with ID _19023_210_4353_1_1531378892552_5820780_83-254794-0_sp1.1 from training. Duration: 0.7818125 2023-10-02 05:17:16,953 WARNING [train.py:1204] (2/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_314-93656-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 05:17:16,964 WARNING [train.py:1204] (2/4) Exclude cut with ID _14170_210_5219_1_1530525949868_4469650_336-213530-0 from training. Duration: 0.9 2023-10-02 05:17:18,356 WARNING [train.py:1204] (2/4) Exclude cut with ID _36686_210_5665_1_1533778225913_3646410_65-221120-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 05:17:18,518 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=763693.3333333334, ans=0.1 2023-10-02 05:17:21,223 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_26433_210_12108_1_1532157158_4056480_271-68178-0_sp1.1 from training. Duration: 0.92725 2023-10-02 05:17:21,276 WARNING [train.py:1204] (2/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_46-207599-0_sp0.9 from training. Duration: 0.711125 2023-10-02 05:17:24,085 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9303W0114-637133 from training. Duration: 0.896 2023-10-02 05:17:25,944 WARNING [train.py:1204] (2/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_490-207861-0 from training. Duration: 0.98 2023-10-02 05:17:27,480 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6767_210_3273_1_1527855783_1718932_173-256601-0_sp0.9 from training. Duration: 0.888875 2023-10-02 05:17:28,712 WARNING [train.py:1204] (2/4) Exclude cut with ID _13921_210_9994_1_1530841782272_3503520_327-69677-0_sp1.1 from training. Duration: 0.8181875 2023-10-02 05:17:28,778 WARNING [train.py:1204] (2/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_508-335369-0 from training. Duration: 0.48 2023-10-02 05:17:30,091 WARNING [train.py:1204] (2/4) Exclude cut with ID _64631_210_27767_1_1535349580068_4165160_24-46567-0_sp1.1 from training. Duration: 0.963625 2023-10-02 05:17:35,262 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.attention_skip_rate, batch_count=763760.0, ans=0.0 2023-10-02 05:17:36,415 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_89-35748-0_sp1.1 from training. Duration: 0.7181875 2023-10-02 05:17:45,801 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_26433_210_12108_1_1532157158_4056480_578-262107-0_sp1.1 from training. Duration: 0.9 2023-10-02 05:17:49,821 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.586e+02 1.840e+02 2.064e+02 2.389e+02 3.388e+02, threshold=4.128e+02, percent-clipped=0.0 2023-10-02 05:17:52,723 WARNING [train.py:1204] (2/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_129-337802-0 from training. Duration: 0.72 2023-10-02 05:17:52,814 WARNING [train.py:1204] (2/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_356-167816-0_sp0.9 from training. Duration: 0.8 2023-10-02 05:17:55,559 WARNING [train.py:1204] (2/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_137-116442-0_sp1.1 from training. Duration: 0.7090625 2023-10-02 05:17:56,838 WARNING [train.py:1204] (2/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_358-172940-0_sp1.1 from training. Duration: 0.963625 2023-10-02 05:17:56,865 WARNING [train.py:1204] (2/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_668-344147-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 05:17:59,898 WARNING [train.py:1204] (2/4) Exclude cut with ID _62337_210_12392_1_1535009273105_3412229_255-157667-0_sp1.1 from training. Duration: 0.936375 2023-10-02 05:17:59,901 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_486-151131-0 from training. Duration: 0.85 2023-10-02 05:18:00,036 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_53638_210_21581_1_1534297319_5808449_885-80172-0 from training. Duration: 0.96 2023-10-02 05:18:00,259 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.conv_skip_rate, batch_count=763893.3333333334, ans=0.0 2023-10-02 05:18:01,394 WARNING [train.py:1204] (2/4) Exclude cut with ID _50093_210_4220_1_1534417141207_3432299_105-183067-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 05:18:03,323 WARNING [train.py:1204] (2/4) Exclude cut with ID _7562_210_2084_1_1528887767916_3946229_345-169156-0_sp0.9 from training. Duration: 0.6444375 2023-10-02 05:18:03,553 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.balancer2.prob, batch_count=763893.3333333334, ans=0.125 2023-10-02 05:18:04,720 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_4528_210_3273_1_1527076654_3276777_209-219804-0_sp0.9 from training. Duration: 0.7666875 2023-10-02 05:18:06,031 WARNING [train.py:1204] (2/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_35-153886-0_sp0.9 from training. Duration: 0.9333125 2023-10-02 05:18:06,099 WARNING [train.py:1204] (2/4) Exclude cut with ID _30800_210_22382_1_1532499347904_4515959_332-121925-0_sp1.1 from training. Duration: 0.92725 2023-10-02 05:18:06,101 WARNING [train.py:1204] (2/4) Exclude cut with ID _59202_210_26984_1_1534816810612_3549174_495-65515-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 05:18:08,908 WARNING [train.py:1204] (2/4) Exclude cut with ID _6274_210_2084_1_1527937583837_7494020_769-168302-0_sp1.1 from training. Duration: 0.6454375 2023-10-02 05:18:10,893 WARNING [train.py:1204] (2/4) Exclude cut with ID _24271_210_12658_1_1531987270022_7190519_708-71345-0_sp1.1 from training. Duration: 0.936375 2023-10-02 05:18:10,894 WARNING [train.py:1204] (2/4) Exclude cut with ID 3033-130750-0096-55598-0_sp0.9 from training. Duration: 0.92225 2023-10-02 05:18:13,739 WARNING [train.py:1204] (2/4) Exclude cut with ID _14087_210_2253_1_1530529224512_3014029_219-224445-0_sp0.9 from training. Duration: 0.9333125 2023-10-02 05:18:15,221 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_174-277364-0 from training. Duration: 0.71 2023-10-02 05:18:17,265 WARNING [train.py:1204] (2/4) Exclude cut with ID _45021_210_9994_1_1533619751094_3330288_96-239010-0_sp1.1 from training. Duration: 0.82725 2023-10-02 05:18:17,303 WARNING [train.py:1204] (2/4) Exclude cut with ID _31602_210_5854_1_1532677021456_4458250_489-305116-0_sp1.1 from training. Duration: 0.97275 2023-10-02 05:18:17,331 WARNING [train.py:1204] (2/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_269-154726-0_sp0.9 from training. Duration: 0.9555625 2023-10-02 05:18:21,404 WARNING [train.py:1204] (2/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_126-54418-0_sp1.1 from training. Duration: 0.92725 2023-10-02 05:18:22,744 WARNING [train.py:1204] (2/4) Exclude cut with ID _42326_210_7770_1_1533814282531_3707269_182-224159-0_sp1.1 from training. Duration: 0.92725 2023-10-02 05:18:22,842 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7002_210_1794_1_1527939683_5157286_1-281264-0_sp0.9 from training. Duration: 0.488875 2023-10-02 05:18:22,888 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_86-16881-0 from training. Duration: 0.58 2023-10-02 05:18:24,640 WARNING [train.py:1204] (2/4) Exclude cut with ID _19514_210_9259_1_1531530071330_5084230_637-279094-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 05:18:24,672 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_26433_210_12108_1_1532157158_4056480_578-262107-0 from training. Duration: 0.99 2023-10-02 05:18:25,940 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6291_210_3273_1_1527683649_1078498_82-221196-0_sp1.1 from training. Duration: 0.6909375 2023-10-02 05:18:27,444 WARNING [train.py:1204] (2/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_402-102311-0 from training. Duration: 0.67 2023-10-02 05:18:29,031 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1015-343884-0_sp1.1 from training. Duration: 0.563625 2023-10-02 05:18:29,113 WARNING [train.py:1204] (2/4) Exclude cut with ID _13684_210_7497_1_1530619354863_3598359_312-235606-0_sp1.1 from training. Duration: 0.5454375 2023-10-02 05:18:29,134 WARNING [train.py:1204] (2/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_55-31904-0 from training. Duration: 0.88 2023-10-02 05:18:29,203 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_3133_210_2087_1_1526106446_2324690_25-332138-0 from training. Duration: 0.91 2023-10-02 05:18:29,204 WARNING [train.py:1204] (2/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_912-282412-0_sp0.9 from training. Duration: 0.5444375 2023-10-02 05:18:30,956 INFO [train.py:1046] (2/4) Epoch 22, batch 3050, loss[loss=0.1644, simple_loss=0.2421, pruned_loss=0.04335, over 24485.00 frames. ], tot_loss[loss=0.1746, simple_loss=0.2515, pruned_loss=0.04885, over 4718339.20 frames. ], batch size: 63, lr: 4.66e-03, grad_scale: 16.0 2023-10-02 05:18:31,060 WARNING [train.py:1204] (2/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_458-299435-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 05:18:31,306 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.feed_forward1.out_proj.dropout_p, batch_count=764026.6666666666, ans=0.1 2023-10-02 05:18:32,413 WARNING [train.py:1204] (2/4) Exclude cut with ID _69584_210_5219_1_1535785133775_4044450_275-88403-0_sp1.1 from training. Duration: 0.92725 2023-10-02 05:18:32,427 WARNING [train.py:1204] (2/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_841-341679-0_sp1.1 from training. Duration: 0.52725 2023-10-02 05:18:32,433 WARNING [train.py:1204] (2/4) Exclude cut with ID _41391_210_7994_1_1533603771872_3638201_1-54521-0_sp1.1 from training. Duration: 0.97275 2023-10-02 05:18:33,750 WARNING [train.py:1204] (2/4) Exclude cut with ID _56235_210_10640_1_1534557335235_4161099_495-186609-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 05:18:35,729 WARNING [train.py:1204] (2/4) Exclude cut with ID _4681_210_3525_1_1527250689464_120889_15-126760-0 from training. Duration: 0.81 2023-10-02 05:18:38,483 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.conv_module1.balancer2.min_positive, batch_count=764026.6666666666, ans=0.05 2023-10-02 05:18:39,542 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_3019_210_2028_1_1526044245_3264865_127-135615-0_sp1.1 from training. Duration: 0.8545625 2023-10-02 05:18:41,433 WARNING [train.py:1204] (2/4) Exclude cut with ID _30191_210_1664_1_1533189915047_7196670_915-21561-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 05:18:41,473 WARNING [train.py:1204] (2/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_6-214815-0_sp0.9 from training. Duration: 0.9444375 2023-10-02 05:18:44,406 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_45399_210_4852_1_1533624725_7579737_190-137494-0_sp1.1 from training. Duration: 0.97275 2023-10-02 05:18:47,953 WARNING [train.py:1204] (2/4) Exclude cut with ID _41314_210_12651_1_1533459476543_3766460_207-132038-0 from training. Duration: 0.94 2023-10-02 05:18:53,408 WARNING [train.py:1204] (2/4) Exclude cut with ID _14603_210_4220_1_1530615688441_3097319_90-31383-0 from training. Duration: 0.87 2023-10-02 05:18:53,460 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_15517_210_6753_1_1530952293_1664795_119-31001-0 from training. Duration: 0.94 2023-10-02 05:18:53,665 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.conv_module2.balancer1.prob, batch_count=764093.3333333334, ans=0.125 2023-10-02 05:18:54,786 WARNING [train.py:1204] (2/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_684-85342-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 05:18:57,686 WARNING [train.py:1204] (2/4) Exclude cut with ID _14603_210_4220_1_1530615688441_3097319_97-325053-0_sp1.1 from training. Duration: 0.836375 2023-10-02 05:19:00,466 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_68702_210_23689_1_1535799577_3420068_168-25150-0_sp1.1 from training. Duration: 0.97275 2023-10-02 05:19:00,474 WARNING [train.py:1204] (2/4) Exclude cut with ID _65134_210_14763_1_1535335393985_3505368_111-27984-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 05:19:00,514 WARNING [train.py:1204] (2/4) Exclude cut with ID _48678_210_5663_1_1533988830947_3769199_276-226230-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 05:19:03,860 WARNING [train.py:1204] (2/4) Exclude cut with ID _28427_210_4220_1_1532581240967_3061069_43-84928-0_sp1.1 from training. Duration: 0.9 2023-10-02 05:19:05,172 WARNING [train.py:1204] (2/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_158-250116-0_sp1.1 from training. Duration: 0.563625 2023-10-02 05:19:05,197 WARNING [train.py:1204] (2/4) Exclude cut with ID _66873_210_5517_1_1535454136506_4293370_279-332556-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 05:19:05,232 WARNING [train.py:1204] (2/4) Exclude cut with ID _42863_210_9070_1_1533902398788_3696920_488-230146-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 05:19:05,233 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_387-247159-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 05:19:08,358 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6729_210_2550_1_1528032481_1788725_261-42619-0_sp1.1 from training. Duration: 0.97275 2023-10-02 05:19:09,794 WARNING [train.py:1204] (2/4) Exclude cut with ID _19896_210_1883_1_1531709022662_6649569_831-44724-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 05:19:13,322 WARNING [train.py:1204] (2/4) Exclude cut with ID _55153_210_2253_1_1534384786677_3162096_280-285944-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 05:19:13,350 WARNING [train.py:1204] (2/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_448-4048-0 from training. Duration: 0.96 2023-10-02 05:19:14,736 WARNING [train.py:1204] (2/4) Exclude cut with ID _29678_210_7893_1_1532421042787_3906118_456-202548-0_sp1.1 from training. Duration: 0.97275 2023-10-02 05:19:14,748 WARNING [train.py:1204] (2/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_107-332398-0_sp1.1 from training. Duration: 0.7090625 2023-10-02 05:19:18,195 WARNING [train.py:1204] (2/4) Exclude cut with ID _13921_210_9994_1_1530838702800_3003429_46-149444-0_sp1.1 from training. Duration: 0.8545625 2023-10-02 05:19:18,233 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_257-20733-0_sp0.9 from training. Duration: 0.8444375 2023-10-02 05:19:19,580 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_53753_210_7234_1_1534320003_3594148_385-234809-0_sp1.1 from training. Duration: 0.963625 2023-10-02 05:19:19,646 WARNING [train.py:1204] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_292-215346-0_sp1.1 from training. Duration: 0.8818125 2023-10-02 05:19:22,534 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.conv_skip_rate, batch_count=764226.6666666666, ans=0.0 2023-10-02 05:19:25,074 WARNING [train.py:1204] (2/4) Exclude cut with ID _68965_210_1883_1_1535698962272_3497490_196-124268-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 05:19:26,416 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_23801_210_16223_1_1532066392_3294657_570-5439-0_sp1.1 from training. Duration: 0.8818125 2023-10-02 05:19:30,703 WARNING [train.py:1204] (2/4) Exclude cut with ID _5166_210_2084_1_1527073197160_4087993_349-6928-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 05:19:30,744 WARNING [train.py:1204] (2/4) Exclude cut with ID _60621_210_17071_1_1535013053530_3671739_226-255202-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 05:19:30,748 WARNING [train.py:1204] (2/4) Exclude cut with ID _41817_210_18835_1_1533434477160_7423049_448-130302-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 05:19:32,157 WARNING [train.py:1204] (2/4) Exclude cut with ID _6653_210_2629_1_1527919172441_6732679_758-143677-0_sp1.1 from training. Duration: 0.8909375 2023-10-02 05:19:33,521 WARNING [train.py:1204] (2/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_912-47473-0_sp0.9 from training. Duration: 0.7555625 2023-10-02 05:19:33,570 WARNING [train.py:1204] (2/4) Exclude cut with ID _6694_210_4599_1_1527852637490_3965529_356-169233-0_sp1.1 from training. Duration: 0.9 2023-10-02 05:19:35,565 WARNING [train.py:1204] (2/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_1-99680-0 from training. Duration: 0.67 2023-10-02 05:19:38,058 WARNING [train.py:1204] (2/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_199-86432-0_sp1.1 from training. Duration: 0.8909375 2023-10-02 05:19:38,081 WARNING [train.py:1204] (2/4) Exclude cut with ID _61148_210_6897_1_1535094022726_4217610_362-182430-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 05:19:38,791 WARNING [train.py:1204] (2/4) Exclude cut with ID _49376_210_5986_1_1533985278732_3370740_301-154763-0 from training. Duration: 0.98 2023-10-02 05:19:39,001 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.balancer1.prob, batch_count=764293.3333333334, ans=0.125 2023-10-02 05:19:39,975 WARNING [train.py:1204] (2/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_572-185150-0_sp1.1 from training. Duration: 0.8818125 2023-10-02 05:19:45,035 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_47896_210_20508_1_1534492518_5958868_558-314572-0_sp1.1 from training. Duration: 0.8818125 2023-10-02 05:19:46,781 INFO [train.py:1046] (2/4) Epoch 22, batch 3100, loss[loss=0.1717, simple_loss=0.2331, pruned_loss=0.05518, over 22545.00 frames. ], tot_loss[loss=0.1743, simple_loss=0.2511, pruned_loss=0.04872, over 4716803.74 frames. ], batch size: 322, lr: 4.66e-03, grad_scale: 16.0 2023-10-02 05:19:48,162 WARNING [train.py:1204] (2/4) Exclude cut with ID _45560_210_21982_1_1533812603951_3584999_111-6376-0_sp0.9 from training. Duration: 0.9333125 2023-10-02 05:19:49,580 WARNING [train.py:1204] (2/4) Exclude cut with ID _14170_210_5219_1_1530525949868_4469650_412-317077-0_sp1.1 from training. Duration: 0.6818125 2023-10-02 05:19:49,735 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.1.conv_skip_rate, batch_count=764360.0, ans=0.0 2023-10-02 05:19:52,284 WARNING [train.py:1204] (2/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_718-68505-0 from training. Duration: 0.89 2023-10-02 05:19:55,156 WARNING [train.py:1204] (2/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_77-311404-0 from training. Duration: 0.94 2023-10-02 05:19:55,247 WARNING [train.py:1204] (2/4) Exclude cut with ID _60229_210_27767_1_1534838406158_3655350_264-279573-0 from training. Duration: 0.83 2023-10-02 05:19:58,015 WARNING [train.py:1204] (2/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_106-300985-0_sp1.1 from training. Duration: 0.8181875 2023-10-02 05:19:59,788 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.conv_module2.balancer2.min_abs, batch_count=764426.6666666666, ans=0.5 2023-10-02 05:20:00,820 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_4183_210_2087_1_1526729100_1869838_79-277189-0_sp1.1 from training. Duration: 0.963625 2023-10-02 05:20:00,829 WARNING [train.py:1204] (2/4) Exclude cut with ID _28427_210_4220_1_1532581240967_3061069_195-295268-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 05:20:03,624 WARNING [train.py:1204] (2/4) Exclude cut with ID _4092_210_2151_1_1526643044449_3135759_356-278694-0_sp0.9 from training. Duration: 0.52225 2023-10-02 05:20:07,003 WARNING [train.py:1204] (2/4) Exclude cut with ID _14459_210_2228_1_1531443641946_7516700_777-71446-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 05:20:11,563 WARNING [train.py:1204] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_520-126729-0 from training. Duration: 0.7 2023-10-02 05:20:18,115 WARNING [train.py:1204] (2/4) Exclude cut with ID _13684_210_7497_1_1530619354863_3598359_312-235606-0_sp0.9 from training. Duration: 0.6666875 2023-10-02 05:20:19,352 WARNING [train.py:1204] (2/4) Exclude cut with ID _37537_210_2253_1_1533171758568_3421949_283-349402-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 05:20:19,395 WARNING [train.py:1204] (2/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_29-117932-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 05:20:20,664 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.591e+02 1.816e+02 1.968e+02 2.176e+02 3.408e+02, threshold=3.937e+02, percent-clipped=0.0 2023-10-02 05:20:20,740 WARNING [train.py:1204] (2/4) Exclude cut with ID _29303_210_2575_1_1532392237303_7410649_721-283351-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 05:20:20,796 WARNING [train.py:1204] (2/4) Exclude cut with ID _14886_210_4220_1_1530702020355_3516190_175-229073-0_sp0.9 from training. Duration: 0.5 2023-10-02 05:20:20,917 WARNING [train.py:1204] (2/4) Exclude cut with ID _15458_210_8341_1_1530858614263_3834410_70-268830-0_sp1.1 from training. Duration: 0.97275 2023-10-02 05:20:20,933 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_2295_210_2087_1_1525500993_2407863_234-83104-0 from training. Duration: 0.62 2023-10-02 05:20:20,933 WARNING [train.py:1204] (2/4) Exclude cut with ID _22961_210_8254_1_1531976366672_4278180_317-199546-0_sp1.1 from training. Duration: 0.7818125 2023-10-02 05:20:22,290 WARNING [train.py:1204] (2/4) Exclude cut with ID _27894_210_12392_1_1532415496078_6902439_547-196022-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 05:20:23,649 WARNING [train.py:1204] (2/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_46-207599-0 from training. Duration: 0.64 2023-10-02 05:20:25,099 WARNING [train.py:1204] (2/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_426-326246-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 05:20:29,091 WARNING [train.py:1204] (2/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_445-117398-0_sp0.9 from training. Duration: 0.988875 2023-10-02 05:20:29,143 WARNING [train.py:1204] (2/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_563-347832-0 from training. Duration: 0.89 2023-10-02 05:20:30,568 WARNING [train.py:1204] (2/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_252-216674-0 from training. Duration: 0.54 2023-10-02 05:20:30,649 WARNING [train.py:1204] (2/4) Exclude cut with ID _59611_210_8895_1_1534741198240_3650361_187-212779-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 05:20:30,705 WARNING [train.py:1204] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_282-159367-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 05:20:30,856 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.1.feed_forward1.out_proj.dropout_p, batch_count=764560.0, ans=0.1 2023-10-02 05:20:34,101 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_68702_210_23689_1_1535799577_3420068_33-136883-0_sp1.1 from training. Duration: 0.936375 2023-10-02 05:20:34,126 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_47228_210_4983_1_1533780021_3672027_184-293706-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 05:20:34,275 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.bypass.scale_min, batch_count=764560.0, ans=0.2 2023-10-02 05:20:35,462 WARNING [train.py:1204] (2/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_656-188361-0_sp0.9 from training. Duration: 0.9666875 2023-10-02 05:20:35,586 WARNING [train.py:1204] (2/4) Exclude cut with ID _4866_210_2577_1_1527404412284_4364659_352-157930-0_sp0.9 from training. Duration: 0.9 2023-10-02 05:20:35,586 WARNING [train.py:1204] (2/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_210-106047-0_sp1.1 from training. Duration: 0.8818125 2023-10-02 05:20:38,150 WARNING [train.py:1204] (2/4) Exclude cut with ID _9389_210_1664_1_1529217337301_5289269_289-213417-0_sp1.1 from training. Duration: 0.8090625 2023-10-02 05:20:38,797 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_3910_210_1794_1_1526777667_3747260_178-41186-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 05:20:38,807 WARNING [train.py:1204] (2/4) Exclude cut with ID _27052_210_5986_1_1532329197187_3585009_24-172263-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 05:20:38,811 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_5522_210_3273_1_1527249606_3909096_318-203153-0_sp1.1 from training. Duration: 0.3909375 2023-10-02 05:20:41,713 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.conv_module1.balancer2.prob, batch_count=764560.0, ans=0.125 2023-10-02 05:20:42,949 WARNING [train.py:1204] (2/4) Exclude cut with ID _27894_210_12392_1_1532415496078_6902439_222-51982-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 05:20:44,297 WARNING [train.py:1204] (2/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_87-131427-0 from training. Duration: 0.78 2023-10-02 05:20:46,383 WARNING [train.py:1204] (2/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_372-298093-0_sp0.9 from training. Duration: 0.888875 2023-10-02 05:20:48,143 WARNING [train.py:1204] (2/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_663-166973-0 from training. Duration: 0.8 2023-10-02 05:20:48,658 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.4.encoder.layers.2.conv_module2.whiten, num_groups=1, num_channels=384, metric=3.21 vs. limit=15.0 2023-10-02 05:20:49,494 WARNING [train.py:1204] (2/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_429-177469-0_sp1.1 from training. Duration: 0.936375 2023-10-02 05:20:49,528 WARNING [train.py:1204] (2/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_724-172651-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 05:20:49,566 WARNING [train.py:1204] (2/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_103-336064-0 from training. Duration: 0.51 2023-10-02 05:20:59,361 WARNING [train.py:1204] (2/4) Exclude cut with ID _48677_210_5663_1_1533902405919_3892989_111-11439-0 from training. Duration: 0.96 2023-10-02 05:21:00,689 INFO [train.py:1046] (2/4) Epoch 22, batch 3150, loss[loss=0.1833, simple_loss=0.2501, pruned_loss=0.0582, over 23734.00 frames. ], tot_loss[loss=0.1736, simple_loss=0.2498, pruned_loss=0.04868, over 4721653.01 frames. ], batch size: 164, lr: 4.66e-03, grad_scale: 16.0 2023-10-02 05:21:02,217 WARNING [train.py:1204] (2/4) Exclude cut with ID _8085_210_4557_1_1528455633493_3716100_95-103398-0_sp1.1 from training. Duration: 0.963625 2023-10-02 05:21:02,287 WARNING [train.py:1204] (2/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_1041-110837-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 05:21:04,159 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_50050_210_16223_1_1535435796_7290138_1155-121757-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 05:21:04,161 WARNING [train.py:1204] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_602-275260-0_sp0.9 from training. Duration: 0.911125 2023-10-02 05:21:04,218 WARNING [train.py:1204] (2/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_265-164486-0 from training. Duration: 0.96 2023-10-02 05:21:05,573 WARNING [train.py:1204] (2/4) Exclude cut with ID _46067_210_4220_1_1534125571066_3307450_142-229375-0_sp1.1 from training. Duration: 0.963625 2023-10-02 05:21:05,602 WARNING [train.py:1204] (2/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_310-26186-0_sp1.1 from training. Duration: 0.636375 2023-10-02 05:21:06,957 WARNING [train.py:1204] (2/4) Exclude cut with ID _28461_210_2253_1_1532771775189_3041299_136-270644-0 from training. Duration: 0.93 2023-10-02 05:21:09,951 WARNING [train.py:1204] (2/4) Exclude cut with ID _14860_210_4915_1_1530702020340_7343240_438-286372-0_sp1.1 from training. Duration: 0.936375 2023-10-02 05:21:11,345 WARNING [train.py:1204] (2/4) Exclude cut with ID IC0128W0004-63309 from training. Duration: 0.831 2023-10-02 05:21:14,140 WARNING [train.py:1204] (2/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_135-174080-0 from training. Duration: 0.53 2023-10-02 05:21:14,191 WARNING [train.py:1204] (2/4) Exclude cut with ID _41326_210_5854_1_1533778076509_3492080_277-59511-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 05:21:15,534 WARNING [train.py:1204] (2/4) Exclude cut with ID IC0144W0112-71379 from training. Duration: 0.992 2023-10-02 05:21:17,476 WARNING [train.py:1204] (2/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_711-15688-0_sp0.9 from training. Duration: 0.47775 2023-10-02 05:21:19,323 WARNING [train.py:1204] (2/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_237-332027-0 from training. Duration: 0.95 2023-10-02 05:21:20,680 WARNING [train.py:1204] (2/4) Exclude cut with ID _7970_210_1883_1_1528455616296_3472230_25-178407-0 from training. Duration: 0.71 2023-10-02 05:21:20,681 WARNING [train.py:1204] (2/4) Exclude cut with ID _8480_210_1874_1_1528589391205_6728330_309-139896-0 from training. Duration: 0.81 2023-10-02 05:21:20,705 WARNING [train.py:1204] (2/4) Exclude cut with ID _39571_210_13449_1_1533801584143_3651077_319-268877-0_sp1.1 from training. Duration: 0.936375 2023-10-02 05:21:20,709 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_43802_210_2290_1_1533516203_8829504_155-246880-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 05:21:22,200 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_566-122599-0_sp1.1 from training. Duration: 0.936375 2023-10-02 05:21:23,569 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_12868_210_6753_1_1530701932_530275_18-76322-0 from training. Duration: 0.87 2023-10-02 05:21:23,860 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.nonlin_attention.balancer.prob, batch_count=764760.0, ans=0.125 2023-10-02 05:21:24,978 WARNING [train.py:1204] (2/4) Exclude cut with ID _56494_210_4220_1_1535284837625_3033169_159-106035-0_sp1.1 from training. Duration: 0.963625 2023-10-02 05:21:25,007 WARNING [train.py:1204] (2/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_98-276013-0_sp1.1 from training. Duration: 0.963625 2023-10-02 05:21:26,320 WARNING [train.py:1204] (2/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_330-216124-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 05:21:26,667 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.conv_module1.balancer1.prob, batch_count=764760.0, ans=0.125 2023-10-02 05:21:27,740 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_177-58005-0_sp0.9 from training. Duration: 0.688875 2023-10-02 05:21:31,780 WARNING [train.py:1204] (2/4) Exclude cut with ID _56235_210_10640_1_1534557335235_4161099_343-139898-0 from training. Duration: 0.96 2023-10-02 05:21:31,857 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6961_210_2087_1_1527937168_6815011_395-185060-0_sp1.1 from training. Duration: 0.863625 2023-10-02 05:21:35,059 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7002_210_1794_1_1527939683_5157286_184-229630-0_sp0.9 from training. Duration: 0.82225 2023-10-02 05:21:36,362 WARNING [train.py:1204] (2/4) Exclude cut with ID _59170_210_6721_1_1534983633960_4203335_153-18895-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 05:21:36,410 WARNING [train.py:1204] (2/4) Exclude cut with ID _49643_210_7989_1_1534640244224_3939031_598-161926-0 from training. Duration: 0.9 2023-10-02 05:21:39,942 WARNING [train.py:1204] (2/4) Exclude cut with ID _4068_210_2629_1_1526697350395_1546259_224-288693-0 from training. Duration: 0.59 2023-10-02 05:21:41,125 WARNING [train.py:1204] (2/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_718-68505-0_sp1.1 from training. Duration: 0.8090625 2023-10-02 05:21:41,177 WARNING [train.py:1204] (2/4) Exclude cut with ID _4068_210_2629_1_1526697350395_1546259_224-288693-0_sp0.9 from training. Duration: 0.6555625 2023-10-02 05:21:41,188 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_2296_210_2087_1_1525503448_3397335_57-185430-0_sp0.9 from training. Duration: 0.6666875 2023-10-02 05:21:42,503 WARNING [train.py:1204] (2/4) Exclude cut with ID _15458_210_8341_1_1530858614263_3834410_49-235472-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 05:21:42,506 WARNING [train.py:1204] (2/4) Exclude cut with ID _64135_210_2253_1_1535677267876_7608972_498-5039-0_sp0.9 from training. Duration: 0.8666875 2023-10-02 05:21:45,137 WARNING [train.py:1204] (2/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_283-220372-0_sp0.9 from training. Duration: 0.62225 2023-10-02 05:21:45,145 WARNING [train.py:1204] (2/4) Exclude cut with ID _6473_210_1741_1_1527901307396_3815380_108-17335-0_sp0.9 from training. Duration: 0.72225 2023-10-02 05:21:45,233 WARNING [train.py:1204] (2/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_113-92608-0 from training. Duration: 0.54 2023-10-02 05:21:46,552 WARNING [train.py:1204] (2/4) Exclude cut with ID _14170_210_5219_1_1530525949868_4469650_272-249985-0_sp1.1 from training. Duration: 0.6090625 2023-10-02 05:21:46,561 WARNING [train.py:1204] (2/4) Exclude cut with ID _50376_210_13401_1_1534031502101_4217740_518-187390-0_sp1.1 from training. Duration: 0.97275 2023-10-02 05:21:47,984 WARNING [train.py:1204] (2/4) Exclude cut with ID _59738_210_15379_1_1534849512562_3941470_165-248260-0_sp1.1 from training. Duration: 0.92725 2023-10-02 05:21:48,002 WARNING [train.py:1204] (2/4) Exclude cut with ID _23818_210_6228_1_1531917063884_3676920_400-343925-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 05:21:48,066 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6961_210_2087_1_1527937168_6815011_562-165518-0 from training. Duration: 0.77 2023-10-02 05:21:50,017 WARNING [train.py:1204] (2/4) Exclude cut with ID _24271_210_12658_1_1531987270022_7190519_289-207931-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 05:21:50,262 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.bypass.skip_rate, batch_count=764893.3333333334, ans=0.07 2023-10-02 05:21:52,732 WARNING [train.py:1204] (2/4) Exclude cut with ID _14632_210_9988_1_1530691279392_3923200_332-105904-0 from training. Duration: 0.9 2023-10-02 05:21:52,743 WARNING [train.py:1204] (2/4) Exclude cut with ID _40402_210_18219_1_1533522324115_2953150_203-232438-0_sp1.1 from training. Duration: 0.97275 2023-10-02 05:21:52,828 WARNING [train.py:1204] (2/4) Exclude cut with ID _13252_210_1925_1_1530671372047_3336909_263-168388-0 from training. Duration: 0.86 2023-10-02 05:21:54,201 WARNING [train.py:1204] (2/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_174-205058-0 from training. Duration: 0.95 2023-10-02 05:21:55,522 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_94-195801-0_sp1.1 from training. Duration: 0.8545625 2023-10-02 05:21:55,553 WARNING [train.py:1204] (2/4) Exclude cut with ID _55153_210_2253_1_1534384786677_3162096_269-103204-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 05:21:56,975 WARNING [train.py:1204] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_748-36150-0 from training. Duration: 0.91 2023-10-02 05:21:58,305 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_12868_210_6753_1_1530702479_3610564_148-172861-0_sp1.1 from training. Duration: 0.4545625 2023-10-02 05:21:58,360 WARNING [train.py:1204] (2/4) Exclude cut with ID _67499_210_17282_1_1535509805220_3692430_460-77730-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 05:21:59,897 WARNING [train.py:1204] (2/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_374-20753-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 05:22:01,244 WARNING [train.py:1204] (2/4) Exclude cut with ID _45329_210_7005_1_1534157951211_6774339_374-289983-0_sp1.1 from training. Duration: 0.97275 2023-10-02 05:22:02,517 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_34886_210_10633_1_1533203882_1755661_76-156096-0_sp0.9 from training. Duration: 0.9666875 2023-10-02 05:22:07,290 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_238-256668-0_sp0.9 from training. Duration: 0.7666875 2023-10-02 05:22:09,300 WARNING [train.py:1204] (2/4) Exclude cut with ID _46683_210_9642_1_1533730913492_4591116_489-146727-0_sp1.1 from training. Duration: 0.97275 2023-10-02 05:22:10,512 WARNING [train.py:1204] (2/4) Exclude cut with ID _13938_210_9988_1_1530518718978_3693960_304-112784-0_sp0.9 from training. Duration: 0.511125 2023-10-02 05:22:14,641 INFO [train.py:1046] (2/4) Epoch 22, batch 3200, loss[loss=0.1522, simple_loss=0.2471, pruned_loss=0.02861, over 24300.00 frames. ], tot_loss[loss=0.1717, simple_loss=0.2478, pruned_loss=0.04783, over 4715016.51 frames. ], batch size: 74, lr: 4.66e-03, grad_scale: 32.0 2023-10-02 05:22:16,157 WARNING [train.py:1204] (2/4) Exclude cut with ID _24271_210_12658_1_1531987270022_7190519_243-46549-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 05:22:16,161 WARNING [train.py:1204] (2/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_78-182912-0_sp1.1 from training. Duration: 0.536375 2023-10-02 05:22:21,007 WARNING [train.py:1204] (2/4) Exclude cut with ID _57572_210_5517_1_1534921219624_3809484_121-321334-0_sp1.1 from training. Duration: 0.97275 2023-10-02 05:22:21,113 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_40047_210_2290_1_1533290519_2374311_254-102149-0_sp1.1 from training. Duration: 0.963625 2023-10-02 05:22:22,488 WARNING [train.py:1204] (2/4) Exclude cut with ID _28461_210_2253_1_1532771775189_3041299_96-152158-0 from training. Duration: 0.85 2023-10-02 05:22:25,303 WARNING [train.py:1204] (2/4) Exclude cut with ID _29877_210_6228_1_1532397791106_5276149_161-102979-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 05:22:28,206 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_3787_210_2040_1_1526477544_5478586_603-120324-0_sp0.9 from training. Duration: 0.9 2023-10-02 05:22:31,013 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_41750_210_1960_1_1533520678_3948042_247-213456-0_sp1.1 from training. Duration: 0.97275 2023-10-02 05:22:34,507 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.3.encoder.layers.3.feed_forward1.out_whiten, num_groups=1, num_channels=512, metric=5.87 vs. limit=15.0 2023-10-02 05:22:40,319 WARNING [train.py:1204] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_127-119109-0_sp0.9 from training. Duration: 0.8 2023-10-02 05:22:49,244 WARNING [train.py:1204] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_215-202973-0 from training. Duration: 0.88 2023-10-02 05:22:50,471 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.635e+02 1.957e+02 2.150e+02 2.421e+02 3.635e+02, threshold=4.301e+02, percent-clipped=0.0 2023-10-02 05:22:50,604 WARNING [train.py:1204] (2/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_17-320491-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 05:22:53,953 WARNING [train.py:1204] (2/4) Exclude cut with ID _5166_210_2084_1_1527073197160_4087993_310-114452-0 from training. Duration: 0.59 2023-10-02 05:22:54,030 WARNING [train.py:1204] (2/4) Exclude cut with ID _15133_210_6228_1_1530794512074_3080379_29-264458-0_sp0.9 from training. Duration: 0.6444375 2023-10-02 05:22:58,301 WARNING [train.py:1204] (2/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_159-281310-0_sp1.1 from training. Duration: 0.863625 2023-10-02 05:22:58,312 WARNING [train.py:1204] (2/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_195-167011-0_sp0.9 from training. Duration: 0.8333125 2023-10-02 05:22:59,675 WARNING [train.py:1204] (2/4) Exclude cut with ID _14087_210_2253_1_1530529224512_3014029_156-257607-0_sp1.1 from training. Duration: 0.7545625 2023-10-02 05:23:02,077 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.0.layers.0.feed_forward1.out_whiten, num_groups=1, num_channels=192, metric=8.49 vs. limit=15.0 2023-10-02 05:23:03,798 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_2295_210_2087_1_1525500993_2407863_116-238894-0 from training. Duration: 0.7 2023-10-02 05:23:05,222 WARNING [train.py:1204] (2/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_321-335475-0_sp0.9 from training. Duration: 0.5 2023-10-02 05:23:06,686 WARNING [train.py:1204] (2/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_188-114464-0 from training. Duration: 0.82 2023-10-02 05:23:09,967 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_2592_210_2290_1_1525697025_4882925_157-70512-0 from training. Duration: 0.91 2023-10-02 05:23:13,100 WARNING [train.py:1204] (2/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_138-64210-0_sp1.1 from training. Duration: 0.663625 2023-10-02 05:23:18,666 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_11305_210_1794_1_1529750428_7178148_651-95702-0_sp1.1 from training. Duration: 0.963625 2023-10-02 05:23:18,687 WARNING [train.py:1204] (2/4) Exclude cut with ID _14224_210_4381_1_1530529062037_4264010_379-184223-0_sp0.9 from training. Duration: 0.9444375 2023-10-02 05:23:19,968 WARNING [train.py:1204] (2/4) Exclude cut with ID _8394_210_6702_1_1528590504316_3540189_381-125325-0_sp1.1 from training. Duration: 0.963625 2023-10-02 05:23:20,029 WARNING [train.py:1204] (2/4) Exclude cut with ID IC0622W0021-307659 from training. Duration: 0.896 2023-10-02 05:23:20,032 WARNING [train.py:1204] (2/4) Exclude cut with ID _7624_210_1531_1_1528109802198_4933101_418-349834-0_sp1.1 from training. Duration: 0.5454375 2023-10-02 05:23:25,434 WARNING [train.py:1204] (2/4) Exclude cut with ID _49728_210_12651_1_1534041034294_6788150_607-3416-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 05:23:26,869 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8275_210_4198_1_1528455320_3892571_491-17707-0 from training. Duration: 0.64 2023-10-02 05:23:26,916 WARNING [train.py:1204] (2/4) Exclude cut with ID _5287_210_2084_1_1527418883040_3878670_333-48508-0 from training. Duration: 0.6 2023-10-02 05:23:28,253 WARNING [train.py:1204] (2/4) Exclude cut with ID _8365_210_2064_1_1528592311070_4677690_371-122295-0 from training. Duration: 0.55 2023-10-02 05:23:29,512 INFO [train.py:1046] (2/4) Epoch 22, batch 3250, loss[loss=0.1795, simple_loss=0.2493, pruned_loss=0.05484, over 23572.00 frames. ], tot_loss[loss=0.1719, simple_loss=0.2481, pruned_loss=0.04788, over 4726159.73 frames. ], batch size: 256, lr: 4.66e-03, grad_scale: 16.0 2023-10-02 05:23:29,590 WARNING [train.py:1204] (2/4) Exclude cut with ID _14860_210_4915_1_1530702020340_7343240_836-328489-0 from training. Duration: 0.9 2023-10-02 05:23:29,842 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.ff3_skip_rate, batch_count=765360.0, ans=0.0 2023-10-02 05:23:30,987 WARNING [train.py:1204] (2/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_1033-274376-0_sp0.9 from training. Duration: 0.9666875 2023-10-02 05:23:33,864 WARNING [train.py:1204] (2/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_475-127455-0_sp0.9 from training. Duration: 0.77775 2023-10-02 05:23:33,872 WARNING [train.py:1204] (2/4) Exclude cut with ID IC0671W0071-331914 from training. Duration: 0.896 2023-10-02 05:23:33,911 WARNING [train.py:1204] (2/4) Exclude cut with ID _30327_210_16049_1_1532484161295_3924169_2-1523-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 05:23:33,914 WARNING [train.py:1204] (2/4) Exclude cut with ID _24314_210_4915_1_1532135078116_7170179_460-208279-0_sp1.1 from training. Duration: 0.97275 2023-10-02 05:23:35,364 WARNING [train.py:1204] (2/4) Exclude cut with ID IC0679W0052-335859 from training. Duration: 0.64 2023-10-02 05:23:39,464 WARNING [train.py:1204] (2/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_3-237434-0_sp1.1 from training. Duration: 0.6909375 2023-10-02 05:23:41,559 WARNING [train.py:1204] (2/4) Exclude cut with ID _69884_210_9771_1_1535846392818_4166169_405-185283-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 05:23:41,801 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.ff3_skip_rate, batch_count=765360.0, ans=0.0 2023-10-02 05:23:47,796 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.4.encoder.layers.2.self_attn1.whiten, num_groups=1, num_channels=384, metric=12.63 vs. limit=22.5 2023-10-02 05:23:48,629 WARNING [train.py:1204] (2/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_927-83684-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 05:23:48,640 WARNING [train.py:1204] (2/4) Exclude cut with ID _6694_210_4599_1_1527852637490_3965529_356-169233-0 from training. Duration: 0.99 2023-10-02 05:23:49,945 WARNING [train.py:1204] (2/4) Exclude cut with ID _19896_210_1883_1_1531709022662_6649569_562-233001-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 05:23:51,217 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_20120_210_6753_1_1531643035_5482859_354-78629-0_sp1.1 from training. Duration: 0.963625 2023-10-02 05:23:51,219 WARNING [train.py:1204] (2/4) Exclude cut with ID _60135_210_13427_1_1534935557922_3651319_96-168198-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 05:23:53,315 WARNING [train.py:1204] (2/4) Exclude cut with ID _31602_210_5854_1_1532677021456_4458250_249-96604-0_sp1.1 from training. Duration: 0.8090625 2023-10-02 05:23:53,353 WARNING [train.py:1204] (2/4) Exclude cut with ID _14100_210_8341_1_1530529320613_3678069_258-224599-0_sp0.9 from training. Duration: 0.8555625 2023-10-02 05:23:56,713 WARNING [train.py:1204] (2/4) Exclude cut with ID _19708_210_9206_1_1532136650733_4238364_215-160135-0_sp1.1 from training. Duration: 0.97275 2023-10-02 05:23:56,752 WARNING [train.py:1204] (2/4) Exclude cut with ID _21878_210_3525_1_1531738772002_7264330_113-18163-0_sp0.9 from training. Duration: 0.988875 2023-10-02 05:23:56,802 WARNING [train.py:1204] (2/4) Exclude cut with ID _64135_210_2253_1_1535677267876_7608972_552-337340-0_sp1.1 from training. Duration: 0.92725 2023-10-02 05:23:56,835 WARNING [train.py:1204] (2/4) Exclude cut with ID _67725_210_27767_1_1535608756671_5105359_10-12887-0_sp1.1 from training. Duration: 0.97275 2023-10-02 05:23:56,842 WARNING [train.py:1204] (2/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_401-284840-0_sp1.1 from training. Duration: 0.97275 2023-10-02 05:23:56,860 WARNING [train.py:1204] (2/4) Exclude cut with ID _44659_210_2721_1_1534036120061_6614860_483-141285-0_sp1.1 from training. Duration: 0.936375 2023-10-02 05:23:57,112 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.ff2_skip_rate, batch_count=765426.6666666666, ans=0.0 2023-10-02 05:24:00,910 WARNING [train.py:1204] (2/4) Exclude cut with ID _9461_210_4557_1_1529060514726_3677451_358-332734-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 05:24:03,740 WARNING [train.py:1204] (2/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_788-298907-0_sp1.1 from training. Duration: 0.8090625 2023-10-02 05:24:05,123 WARNING [train.py:1204] (2/4) Exclude cut with ID _39411_210_5986_1_1533538825030_3343649_354-267196-0_sp1.1 from training. Duration: 0.92725 2023-10-02 05:24:05,142 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_14953_210_2151_1_1530701710_5616165_192-309792-0_sp1.1 from training. Duration: 0.97275 2023-10-02 05:24:06,544 WARNING [train.py:1204] (2/4) Exclude cut with ID _59170_210_6721_1_1534983633960_4203335_290-151255-0_sp1.1 from training. Duration: 0.92725 2023-10-02 05:24:06,585 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_5575_210_1410_1_1527301305_2570170_249-93769-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 05:24:06,597 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_46013_210_7234_1_1533776362_3744032_463-52378-0_sp1.1 from training. Duration: 0.7818125 2023-10-02 05:24:11,023 WARNING [train.py:1204] (2/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_580-93798-0 from training. Duration: 0.56 2023-10-02 05:24:11,064 WARNING [train.py:1204] (2/4) Exclude cut with ID _19514_210_9259_1_1531530071330_5084230_385-13762-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 05:24:11,068 WARNING [train.py:1204] (2/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_458-67321-0_sp1.1 from training. Duration: 0.8818125 2023-10-02 05:24:12,934 WARNING [train.py:1204] (2/4) Exclude cut with ID _41298_210_9019_1_1533430371450_4822143_188-154791-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 05:24:14,095 WARNING [train.py:1204] (2/4) Exclude cut with ID _15037_210_2253_1_1530752294104_3267650_296-192597-0_sp0.9 from training. Duration: 0.9 2023-10-02 05:24:19,951 WARNING [train.py:1204] (2/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_661-264857-0_sp1.1 from training. Duration: 0.8454375 2023-10-02 05:24:26,495 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=765560.0, ans=0.1 2023-10-02 05:24:27,684 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_34008_210_10431_1_1533345424_8183247_662-317331-0_sp1.1 from training. Duration: 0.8545625 2023-10-02 05:24:27,707 WARNING [train.py:1204] (2/4) Exclude cut with ID _46502_210_13794_1_1533724452198_3657159_241-278329-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 05:24:27,707 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_11349_210_6753_1_1529800861_1101207_26-65602-0 from training. Duration: 0.96 2023-10-02 05:24:27,717 WARNING [train.py:1204] (2/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_448-4048-0_sp1.1 from training. Duration: 0.87275 2023-10-02 05:24:27,724 WARNING [train.py:1204] (2/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_662-275621-0_sp1.1 from training. Duration: 0.3818125 2023-10-02 05:24:29,103 WARNING [train.py:1204] (2/4) Exclude cut with ID _13938_210_9988_1_1530518718978_3693960_256-54454-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 05:24:30,723 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.conv_module2.balancer1.min_positive, batch_count=765626.6666666666, ans=0.025 2023-10-02 05:24:31,820 WARNING [train.py:1204] (2/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_256-32662-0 from training. Duration: 0.9 2023-10-02 05:24:31,860 WARNING [train.py:1204] (2/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_326-17061-0 from training. Duration: 0.94 2023-10-02 05:24:31,894 WARNING [train.py:1204] (2/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_505-97987-0_sp1.1 from training. Duration: 0.7818125 2023-10-02 05:24:33,435 WARNING [train.py:1204] (2/4) Exclude cut with ID _66873_210_5517_1_1535454136506_4293370_316-294364-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 05:24:33,492 WARNING [train.py:1204] (2/4) Exclude cut with ID _19514_210_9259_1_1531530071330_5084230_762-231063-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 05:24:34,938 WARNING [train.py:1204] (2/4) Exclude cut with ID _4697_210_2069_1_1526815801953_3823098_68-219807-0_sp0.9 from training. Duration: 0.52225 2023-10-02 05:24:34,983 WARNING [train.py:1204] (2/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_479-158053-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 05:24:39,040 WARNING [train.py:1204] (2/4) Exclude cut with ID _66112_210_10635_1_1535590236305_3801484_83-38181-0_sp1.1 from training. Duration: 0.936375 2023-10-02 05:24:39,062 WARNING [train.py:1204] (2/4) Exclude cut with ID _55857_210_2346_1_1534644007831_6898209_255-146346-0_sp1.1 from training. Duration: 0.963625 2023-10-02 05:24:40,411 WARNING [train.py:1204] (2/4) Exclude cut with ID _48836_210_12210_1_1534075848293_3904150_429-35475-0 from training. Duration: 0.89 2023-10-02 05:24:40,421 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_50154_210_13731_1_1534750862_1080961_119-187163-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 05:24:43,103 INFO [train.py:1046] (2/4) Epoch 22, batch 3300, loss[loss=0.1818, simple_loss=0.25, pruned_loss=0.05686, over 23810.00 frames. ], tot_loss[loss=0.1732, simple_loss=0.2492, pruned_loss=0.04863, over 4721229.00 frames. ], batch size: 195, lr: 4.66e-03, grad_scale: 16.0 2023-10-02 05:24:43,160 WARNING [train.py:1204] (2/4) Exclude cut with ID _8793_210_6310_1_1528542985139_2310009_121-179782-0_sp0.9 from training. Duration: 0.888875 2023-10-02 05:24:43,162 WARNING [train.py:1204] (2/4) Exclude cut with ID _14884_210_2252_1_1530707459689_5561250_591-173406-0 from training. Duration: 0.96 2023-10-02 05:24:45,343 WARNING [train.py:1204] (2/4) Exclude cut with ID _15133_210_6228_1_1530794512074_3080379_54-198193-0_sp1.1 from training. Duration: 0.8545625 2023-10-02 05:24:47,391 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6545_210_2040_1_1527774670_4245258_407-48070-0 from training. Duration: 0.76 2023-10-02 05:24:48,795 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1032-236863-0 from training. Duration: 0.84 2023-10-02 05:24:50,209 WARNING [train.py:1204] (2/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_507-303370-0 from training. Duration: 0.96 2023-10-02 05:24:51,436 WARNING [train.py:1204] (2/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_164-282366-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 05:24:52,426 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.1.encoder.layers.1.self_attn_weights.whiten_keys, num_groups=4, num_channels=128, metric=2.80 vs. limit=6.0 2023-10-02 05:24:53,781 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=765693.3333333334, ans=0.1 2023-10-02 05:24:54,897 WARNING [train.py:1204] (2/4) Exclude cut with ID _39867_210_9826_1_1533553222030_9344259_312-158727-0_sp1.1 from training. Duration: 0.963625 2023-10-02 05:24:54,980 WARNING [train.py:1204] (2/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_174-205058-0_sp1.1 from training. Duration: 0.863625 2023-10-02 05:24:56,302 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_11305_210_1794_1_1529750428_7178148_166-237358-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 05:24:57,720 WARNING [train.py:1204] (2/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_26-175553-0_sp1.1 from training. Duration: 0.5545625 2023-10-02 05:24:57,750 WARNING [train.py:1204] (2/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_96-290039-0_sp1.1 from training. Duration: 0.5090625 2023-10-02 05:24:57,936 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.feed_forward2.hidden_balancer.prob, batch_count=765760.0, ans=0.125 2023-10-02 05:25:00,993 WARNING [train.py:1204] (2/4) Exclude cut with ID _53219_210_4525_1_1534834836008_4064825_261-101691-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 05:25:02,334 WARNING [train.py:1204] (2/4) Exclude cut with ID _24271_210_12658_1_1531987270022_7190519_96-293390-0_sp1.1 from training. Duration: 0.936375 2023-10-02 05:25:06,472 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_271-234962-0 from training. Duration: 0.79 2023-10-02 05:25:06,541 WARNING [train.py:1204] (2/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_136-30660-0_sp1.1 from training. Duration: 0.97275 2023-10-02 05:25:06,562 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_20120_210_6753_1_1531643035_5482859_625-281046-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 05:25:09,287 WARNING [train.py:1204] (2/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_333-168193-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 05:25:10,656 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9029W0269-509664 from training. Duration: 0.896 2023-10-02 05:25:12,024 WARNING [train.py:1204] (2/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_450-147393-0_sp1.1 from training. Duration: 0.92725 2023-10-02 05:25:12,326 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.conv_module1.balancer1.prob, batch_count=765826.6666666666, ans=0.125 2023-10-02 05:25:13,351 WARNING [train.py:1204] (2/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_643-52007-0_sp1.1 from training. Duration: 0.5181875 2023-10-02 05:25:13,415 WARNING [train.py:1204] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_330-327861-0_sp0.9 from training. Duration: 0.7444375 2023-10-02 05:25:13,430 WARNING [train.py:1204] (2/4) Exclude cut with ID _47790_210_16187_1_1533862814066_4207519_260-317651-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 05:25:13,447 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9044W0010-516654 from training. Duration: 0.896 2023-10-02 05:25:13,614 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.conv_module2.balancer2.prob, batch_count=765826.6666666666, ans=0.125 2023-10-02 05:25:18,302 WARNING [train.py:1204] (2/4) Exclude cut with ID _14087_210_2253_1_1530529224512_3014029_265-118378-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 05:25:18,308 WARNING [train.py:1204] (2/4) Exclude cut with ID _6194_210_1534_1_1527511826160_3941194_321-148641-0_sp0.9 from training. Duration: 0.711125 2023-10-02 05:25:19,559 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.537e+02 1.834e+02 2.092e+02 2.306e+02 3.229e+02, threshold=4.184e+02, percent-clipped=0.0 2023-10-02 05:25:19,697 WARNING [train.py:1204] (2/4) Exclude cut with ID _54871_210_22356_1_1534665798253_3856359_101-169176-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 05:25:19,700 WARNING [train.py:1204] (2/4) Exclude cut with ID 497-129325-0061-62254-0_sp1.1 from training. Duration: 0.97725 2023-10-02 05:25:21,618 WARNING [train.py:1204] (2/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_795-33252-0_sp1.1 from training. Duration: 0.336375 2023-10-02 05:25:21,638 WARNING [train.py:1204] (2/4) Exclude cut with ID _28317_210_12742_1_1532480302381_8510325_168-246176-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 05:25:23,015 WARNING [train.py:1204] (2/4) Exclude cut with ID _7160_210_1876_1_1527940923231_3703799_120-150943-0_sp1.1 from training. Duration: 0.736375 2023-10-02 05:25:24,474 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9084W0005-536844 from training. Duration: 0.768 2023-10-02 05:25:26,316 WARNING [train.py:1204] (2/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_234-260682-0 from training. Duration: 0.84 2023-10-02 05:25:26,357 WARNING [train.py:1204] (2/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_188-114464-0_sp0.9 from training. Duration: 0.911125 2023-10-02 05:25:26,480 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.ff2_skip_rate, batch_count=765826.6666666666, ans=0.0 2023-10-02 05:25:27,776 WARNING [train.py:1204] (2/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_81-75849-0 from training. Duration: 0.39 2023-10-02 05:25:31,271 WARNING [train.py:1204] (2/4) Exclude cut with ID _14224_210_4381_1_1530529062037_4264010_379-184223-0_sp1.1 from training. Duration: 0.77275 2023-10-02 05:25:34,093 WARNING [train.py:1204] (2/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_454-123002-0_sp1.1 from training. Duration: 0.62725 2023-10-02 05:25:34,142 WARNING [train.py:1204] (2/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_887-249555-0_sp1.1 from training. Duration: 0.7 2023-10-02 05:25:37,105 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.bypass.scale_min, batch_count=765893.3333333334, ans=0.2 2023-10-02 05:25:38,280 WARNING [train.py:1204] (2/4) Exclude cut with ID _31088_210_16446_1_1532520012284_3440919_20-260902-0_sp1.1 from training. Duration: 0.97275 2023-10-02 05:25:38,325 WARNING [train.py:1204] (2/4) Exclude cut with ID _45329_210_7005_1_1534157951211_6774339_856-344574-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 05:25:38,327 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_36756_210_7234_1_1533026991_966276_4-164118-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 05:25:39,756 WARNING [train.py:1204] (2/4) Exclude cut with ID _8365_210_2064_1_1528592311070_4677690_371-122295-0_sp1.1 from training. Duration: 0.5 2023-10-02 05:25:40,071 INFO [scaling.py:1118] (2/4) WithLoss: name=encoder.encoders.3.encoder.layers.2.self_attn_weights, loss-sum=0.000e+00 2023-10-02 05:25:42,498 WARNING [train.py:1204] (2/4) Exclude cut with ID _5910_210_1389_1_1527337972454_5050100_555-159815-0_sp1.1 from training. Duration: 0.8909375 2023-10-02 05:25:42,514 WARNING [train.py:1204] (2/4) Exclude cut with ID _37086_210_9038_1_1532999540891_7021250_469-147941-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 05:25:43,825 WARNING [train.py:1204] (2/4) Exclude cut with ID _3067_210_2667_1_1526212974640_4503168_459-173867-0_sp0.9 from training. Duration: 0.97775 2023-10-02 05:25:46,477 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9156W0006-572499 from training. Duration: 0.896 2023-10-02 05:25:46,557 WARNING [train.py:1204] (2/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_277-210798-0 from training. Duration: 0.55 2023-10-02 05:25:48,586 WARNING [train.py:1204] (2/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_103-336064-0_sp1.1 from training. Duration: 0.463625 2023-10-02 05:25:48,641 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_3414_210_1988_1_1526212718_4813195_331-290945-0_sp1.1 from training. Duration: 0.936375 2023-10-02 05:25:48,643 WARNING [train.py:1204] (2/4) Exclude cut with ID _67499_210_17282_1_1535509805220_3692430_365-29276-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 05:25:49,866 WARNING [train.py:1204] (2/4) Exclude cut with ID _14821_210_8750_1_1530680503991_6829359_69-250847-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 05:25:49,867 WARNING [train.py:1204] (2/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_1102-61301-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 05:25:51,227 WARNING [train.py:1204] (2/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_166-246557-0_sp1.1 from training. Duration: 0.5818125 2023-10-02 05:25:53,138 WARNING [train.py:1204] (2/4) Exclude cut with ID _15351_210_10626_1_1530856635653_3576210_214-129918-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 05:25:53,154 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_120-41668-0_sp0.9 from training. Duration: 0.77775 2023-10-02 05:25:54,444 WARNING [train.py:1204] (2/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_27-72278-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 05:25:54,576 WARNING [train.py:1204] (2/4) Exclude cut with ID _24314_210_4915_1_1532135078116_7170179_521-258868-0_sp1.1 from training. Duration: 0.7181875 2023-10-02 05:25:55,199 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.4.encoder.layers.0.self_attn_weights.whiten_keys, num_groups=4, num_channels=128, metric=1.77 vs. limit=6.0 2023-10-02 05:25:56,266 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.feed_forward1.hidden_balancer.prob, batch_count=765960.0, ans=0.125 2023-10-02 05:25:57,832 WARNING [train.py:1204] (2/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_143-86344-0 from training. Duration: 0.77 2023-10-02 05:25:57,876 WARNING [train.py:1204] (2/4) Exclude cut with ID _53489_210_6228_1_1534298357169_3835970_465-157433-0_sp1.1 from training. Duration: 0.92725 2023-10-02 05:25:59,253 INFO [train.py:1046] (2/4) Epoch 22, batch 3350, loss[loss=0.1683, simple_loss=0.2445, pruned_loss=0.04607, over 21311.00 frames. ], tot_loss[loss=0.1741, simple_loss=0.2501, pruned_loss=0.04903, over 4718816.05 frames. ], batch size: 46, lr: 4.66e-03, grad_scale: 8.0 2023-10-02 05:25:59,356 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_248-319353-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 05:25:59,544 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.ff2_skip_rate, batch_count=766026.6666666666, ans=0.0 2023-10-02 05:26:02,015 WARNING [train.py:1204] (2/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_102-77064-0_sp1.1 from training. Duration: 0.8454375 2023-10-02 05:26:02,047 WARNING [train.py:1204] (2/4) Exclude cut with ID _9389_210_1664_1_1529217337301_5289269_379-128794-0_sp1.1 from training. Duration: 0.77275 2023-10-02 05:26:03,380 WARNING [train.py:1204] (2/4) Exclude cut with ID _3231_210_2106_1_1526205864934_6784220_236-260835-0_sp1.1 from training. Duration: 0.97275 2023-10-02 05:26:03,592 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.out_combiner.scale_min, batch_count=766026.6666666666, ans=0.2 2023-10-02 05:26:06,743 WARNING [train.py:1204] (2/4) Exclude cut with ID _24271_210_12658_1_1531987270022_7190519_184-342363-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 05:26:06,743 WARNING [train.py:1204] (2/4) Exclude cut with ID _27894_210_12392_1_1532415496078_6902439_67-221663-0_sp1.1 from training. Duration: 0.92725 2023-10-02 05:26:06,963 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.0.self_attn_weights.pos_emb_skip_rate, batch_count=766026.6666666666, ans=0.0 2023-10-02 05:26:07,085 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=766026.6666666666, ans=0.1 2023-10-02 05:26:08,260 WARNING [train.py:1204] (2/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_304-59227-0_sp1.1 from training. Duration: 0.7 2023-10-02 05:26:09,719 WARNING [train.py:1204] (2/4) Exclude cut with ID _58605_210_12210_1_1534723195939_3904259_458-42232-0_sp1.1 from training. Duration: 0.92725 2023-10-02 05:26:11,187 WARNING [train.py:1204] (2/4) Exclude cut with ID _14922_210_6228_1_1530707435273_3693310_13-165512-0_sp0.9 from training. Duration: 0.911125 2023-10-02 05:26:12,631 WARNING [train.py:1204] (2/4) Exclude cut with ID _58602_210_2346_1_1534903210085_6908830_792-102241-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 05:26:15,386 WARNING [train.py:1204] (2/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_588-125165-0_sp0.9 from training. Duration: 0.92225 2023-10-02 05:26:16,822 WARNING [train.py:1204] (2/4) Exclude cut with ID _54218_210_12210_1_1534413646388_4409259_239-98429-0_sp1.1 from training. Duration: 0.97275 2023-10-02 05:26:16,866 WARNING [train.py:1204] (2/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_538-32291-0_sp1.1 from training. Duration: 0.7545625 2023-10-02 05:26:18,712 WARNING [train.py:1204] (2/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_419-317062-0 from training. Duration: 0.91 2023-10-02 05:26:20,211 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9309W0202-640329 from training. Duration: 0.896 2023-10-02 05:26:20,236 WARNING [train.py:1204] (2/4) Exclude cut with ID _3231_210_2106_1_1526205864934_6784220_419-313291-0_sp1.1 from training. Duration: 0.97275 2023-10-02 05:26:20,503 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.conv_skip_rate, batch_count=766093.3333333334, ans=0.0 2023-10-02 05:26:23,747 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.ff2_skip_rate, batch_count=766093.3333333334, ans=0.0 2023-10-02 05:26:24,974 WARNING [train.py:1204] (2/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_619-344499-0 from training. Duration: 0.63 2023-10-02 05:26:24,986 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_278-272612-0 from training. Duration: 0.73 2023-10-02 05:26:25,078 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_103-84010-0_sp0.9 from training. Duration: 0.7666875 2023-10-02 05:26:26,418 WARNING [train.py:1204] (2/4) Exclude cut with ID _26528_210_9070_1_1532570410842_3851310_497-97218-0_sp1.1 from training. Duration: 0.7818125 2023-10-02 05:26:27,768 WARNING [train.py:1204] (2/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_262-84613-0_sp1.1 from training. Duration: 0.936375 2023-10-02 05:26:27,799 WARNING [train.py:1204] (2/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_387-131538-0 from training. Duration: 0.81 2023-10-02 05:26:27,831 WARNING [train.py:1204] (2/4) Exclude cut with ID _8030_210_7497_1_1528444886781_3531089_224-300489-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 05:26:27,853 WARNING [train.py:1204] (2/4) Exclude cut with ID _14349_210_8341_1_1530615710818_3442470_170-336541-0_sp0.9 from training. Duration: 0.9555625 2023-10-02 05:26:30,538 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_47069_210_15368_1_1533781267_4431320_180-31746-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 05:26:30,644 WARNING [train.py:1204] (2/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_447-7041-0_sp1.1 from training. Duration: 0.92725 2023-10-02 05:26:30,670 WARNING [train.py:1204] (2/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_2-55283-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 05:26:32,699 WARNING [train.py:1204] (2/4) Exclude cut with ID _12555_210_8750_1_1530075594402_3780239_70-181911-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 05:26:37,223 WARNING [train.py:1204] (2/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_311-328022-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 05:26:40,028 WARNING [train.py:1204] (2/4) Exclude cut with ID _14528_210_8254_1_1530669700514_4306939_330-2987-0_sp1.1 from training. Duration: 0.92725 2023-10-02 05:26:40,084 WARNING [train.py:1204] (2/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_385-184858-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 05:26:44,300 WARNING [train.py:1204] (2/4) Exclude cut with ID _64414_210_17301_1_1535243252048_3757759_348-333360-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 05:26:45,629 WARNING [train.py:1204] (2/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_753-214751-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 05:26:46,995 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_17613_210_2151_1_1531137569_2366160_202-208013-0_sp1.1 from training. Duration: 0.92725 2023-10-02 05:26:47,010 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_43831_210_2158_1_1533725502_5872791_610-129300-0_sp1.1 from training. Duration: 0.936375 2023-10-02 05:26:48,396 WARNING [train.py:1204] (2/4) Exclude cut with ID _54218_210_12210_1_1534413646388_4409259_321-23852-0_sp1.1 from training. Duration: 0.936375 2023-10-02 05:26:50,408 WARNING [train.py:1204] (2/4) Exclude cut with ID _39411_210_5986_1_1533538825030_3343649_173-238248-0 from training. Duration: 0.83 2023-10-02 05:26:50,415 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_405-339986-0_sp0.9 from training. Duration: 0.5666875 2023-10-02 05:26:50,442 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_2009_210_2071_1_1525481134_4588937_385-302100-0 from training. Duration: 0.87 2023-10-02 05:26:51,499 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_161-276996-0_sp1.1 from training. Duration: 0.863625 2023-10-02 05:26:51,591 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_177-58005-0 from training. Duration: 0.62 2023-10-02 05:26:52,950 WARNING [train.py:1204] (2/4) Exclude cut with ID _64639_210_5816_1_1535367619437_3528000_146-343734-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 05:26:55,028 WARNING [train.py:1204] (2/4) Exclude cut with ID _3845_210_1390_1_1526731326141_3523610_72-61441-0_sp1.1 from training. Duration: 0.92725 2023-10-02 05:26:55,844 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.1.encoder.layers.1.self_attn2.whiten, num_groups=1, num_channels=256, metric=10.98 vs. limit=22.5 2023-10-02 05:27:00,107 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.2.encoder.layers.0.feed_forward2.out_whiten, num_groups=1, num_channels=384, metric=5.03 vs. limit=15.0 2023-10-02 05:27:00,691 WARNING [train.py:1204] (2/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_991-125028-0_sp1.1 from training. Duration: 0.936375 2023-10-02 05:27:02,004 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7544_210_6753_1_1528591327_1471426_172-348777-0 from training. Duration: 0.66 2023-10-02 05:27:02,048 WARNING [train.py:1204] (2/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_416-257730-0_sp1.1 from training. Duration: 0.5909375 2023-10-02 05:27:02,113 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1113-7369-0_sp1.1 from training. Duration: 0.77275 2023-10-02 05:27:04,198 WARNING [train.py:1204] (2/4) Exclude cut with ID _40402_210_18219_1_1533522324115_2953150_242-76943-0_sp1.1 from training. Duration: 0.8545625 2023-10-02 05:27:04,448 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.1.self_attn_weights.pos_emb_skip_rate, batch_count=766293.3333333334, ans=0.0 2023-10-02 05:27:07,784 WARNING [train.py:1204] (2/4) Exclude cut with ID _44718_210_5816_1_1534050051869_6469030_190-49648-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 05:27:10,521 WARNING [train.py:1204] (2/4) Exclude cut with ID _47251_210_18628_1_1533814063128_4137250_214-88076-0 from training. Duration: 0.88 2023-10-02 05:27:10,566 WARNING [train.py:1204] (2/4) Exclude cut with ID _5585_210_2667_1_1527418813324_8007340_89-332072-0_sp1.1 from training. Duration: 0.5818125 2023-10-02 05:27:10,590 WARNING [train.py:1204] (2/4) Exclude cut with ID _41817_210_18835_1_1533434477160_7423049_555-293448-0_sp1.1 from training. Duration: 0.72725 2023-10-02 05:27:11,998 WARNING [train.py:1204] (2/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_286-331314-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 05:27:13,262 WARNING [train.py:1204] (2/4) Exclude cut with ID _13921_210_9994_1_1530838702800_3003429_46-149444-0 from training. Duration: 0.94 2023-10-02 05:27:14,638 INFO [train.py:1046] (2/4) Epoch 22, batch 3400, loss[loss=0.1535, simple_loss=0.2313, pruned_loss=0.03781, over 24563.00 frames. ], tot_loss[loss=0.176, simple_loss=0.2516, pruned_loss=0.05016, over 4709333.62 frames. ], batch size: 60, lr: 4.66e-03, grad_scale: 8.0 2023-10-02 05:27:14,690 WARNING [train.py:1204] (2/4) Exclude cut with ID _44778_210_2054_1_1533798127203_7443090_768-43595-0_sp1.1 from training. Duration: 0.936375 2023-10-02 05:27:14,707 WARNING [train.py:1204] (2/4) Exclude cut with ID _35355_210_16040_1_1533212988977_4016229_74-190076-0 from training. Duration: 0.8 2023-10-02 05:27:16,764 WARNING [train.py:1204] (2/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_1007-316380-0_sp1.1 from training. Duration: 0.963625 2023-10-02 05:27:17,988 WARNING [train.py:1204] (2/4) Exclude cut with ID _23557_210_8750_1_1531890106370_6846255_150-283437-0_sp1.1 from training. Duration: 0.963625 2023-10-02 05:27:18,040 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_3474_210_2028_1_1526188513_4825246_145-309131-0_sp1.1 from training. Duration: 0.67275 2023-10-02 05:27:19,390 WARNING [train.py:1204] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_391-22148-0_sp1.1 from training. Duration: 0.7 2023-10-02 05:27:19,412 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_238-256668-0 from training. Duration: 0.69 2023-10-02 05:27:25,404 WARNING [train.py:1204] (2/4) Exclude cut with ID _8394_210_6702_1_1528590504316_3540189_372-198878-0 from training. Duration: 0.6 2023-10-02 05:27:25,414 WARNING [train.py:1204] (2/4) Exclude cut with ID ID0174W0006-766659 from training. Duration: 0.953 2023-10-02 05:27:25,429 WARNING [train.py:1204] (2/4) Exclude cut with ID _6274_210_2084_1_1527937583837_7494020_668-23019-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 05:27:28,266 WARNING [train.py:1204] (2/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_322-74328-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 05:27:28,273 WARNING [train.py:1204] (2/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_872-264525-0_sp1.1 from training. Duration: 0.5909375 2023-10-02 05:27:29,773 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_53378_210_17282_1_1534221071_5769509_641-286201-0_sp1.1 from training. Duration: 0.97275 2023-10-02 05:27:31,124 WARNING [train.py:1204] (2/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_199-295611-0_sp1.1 from training. Duration: 0.5 2023-10-02 05:27:35,842 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.0.bypass_mid.scale_min, batch_count=766426.6666666666, ans=0.2 2023-10-02 05:27:37,093 WARNING [train.py:1204] (2/4) Exclude cut with ID _5250_210_5219_1_1527249561806_8692110_1270-317971-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 05:27:38,489 WARNING [train.py:1204] (2/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_784-213099-0 from training. Duration: 0.93 2023-10-02 05:27:40,739 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.self_attn_weights.pos_emb_skip_rate, batch_count=766426.6666666666, ans=0.0 2023-10-02 05:27:43,257 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_115-233929-0_sp0.9 from training. Duration: 0.8 2023-10-02 05:27:45,977 WARNING [train.py:1204] (2/4) Exclude cut with ID _54871_210_22356_1_1534665798253_3856359_256-223809-0_sp1.1 from training. Duration: 0.97275 2023-10-02 05:27:46,023 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_20120_210_6753_1_1531643035_5482859_285-87781-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 05:27:46,172 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.ff2_skip_rate, batch_count=766493.3333333334, ans=0.0 2023-10-02 05:27:47,429 WARNING [train.py:1204] (2/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_619-344499-0_sp1.1 from training. Duration: 0.57275 2023-10-02 05:27:51,063 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.conv_module2.balancer1.min_positive, batch_count=766493.3333333334, ans=0.025 2023-10-02 05:27:54,013 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.491e+02 1.880e+02 2.089e+02 2.300e+02 3.158e+02, threshold=4.178e+02, percent-clipped=0.0 2023-10-02 05:27:54,081 WARNING [train.py:1204] (2/4) Exclude cut with ID _60229_210_27767_1_1534838406158_3655350_264-279573-0_sp0.9 from training. Duration: 0.92225 2023-10-02 05:27:55,528 WARNING [train.py:1204] (2/4) Exclude cut with ID _7719_210_3525_1_1528456153163_4296960_333-110399-0 from training. Duration: 0.96 2023-10-02 05:28:01,003 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_50423_210_13270_1_1534128598_5021734_759-90833-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 05:28:01,274 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.conv_module2.balancer1.prob, batch_count=766560.0, ans=0.125 2023-10-02 05:28:02,361 WARNING [train.py:1204] (2/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_151-254798-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 05:28:02,408 WARNING [train.py:1204] (2/4) Exclude cut with ID _3465_210_3525_1_1526175667060_3574049_450-203637-0 from training. Duration: 0.92 2023-10-02 05:28:03,770 WARNING [train.py:1204] (2/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_673-36456-0_sp1.1 from training. Duration: 0.936375 2023-10-02 05:28:03,815 WARNING [train.py:1204] (2/4) Exclude cut with ID _36538_210_9994_1_1533204020363_3795620_131-8685-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 05:28:05,813 WARNING [train.py:1204] (2/4) Exclude cut with ID _12556_210_8341_1_1530081024100_7327690_713-55626-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 05:28:05,848 WARNING [train.py:1204] (2/4) Exclude cut with ID _2322_210_2667_1_1525604548971_3656299_440-80494-0_sp0.9 from training. Duration: 0.97775 2023-10-02 05:28:07,329 WARNING [train.py:1204] (2/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_210-325081-0_sp1.1 from training. Duration: 0.97275 2023-10-02 05:28:12,055 WARNING [train.py:1204] (2/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_375-244976-0_sp0.9 from training. Duration: 0.8333125 2023-10-02 05:28:12,060 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_15517_210_6753_1_1530952293_1664795_119-31001-0_sp1.1 from training. Duration: 0.8545625 2023-10-02 05:28:15,520 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.3.encoder.layers.1.feed_forward2.out_whiten, num_groups=1, num_channels=512, metric=7.84 vs. limit=15.0 2023-10-02 05:28:16,359 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_41452_210_7291_1_1533437687_3941281_319-42326-0_sp1.1 from training. Duration: 0.92725 2023-10-02 05:28:16,567 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.attention_skip_rate, batch_count=766626.6666666666, ans=0.0 2023-10-02 05:28:17,756 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_372-173012-0 from training. Duration: 0.95 2023-10-02 05:28:22,324 WARNING [train.py:1204] (2/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_41-31157-0_sp1.1 from training. Duration: 0.5181875 2023-10-02 05:28:27,247 WARNING [train.py:1204] (2/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_91-212486-0 from training. Duration: 0.68 2023-10-02 05:28:30,080 INFO [train.py:1046] (2/4) Epoch 22, batch 3450, loss[loss=0.1692, simple_loss=0.2602, pruned_loss=0.03911, over 24423.00 frames. ], tot_loss[loss=0.1754, simple_loss=0.2515, pruned_loss=0.0496, over 4726310.99 frames. ], batch size: 69, lr: 4.65e-03, grad_scale: 4.0 2023-10-02 05:28:30,164 WARNING [train.py:1204] (2/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_1267-272693-0 from training. Duration: 0.73 2023-10-02 05:28:30,200 WARNING [train.py:1204] (2/4) Exclude cut with ID _13938_210_9988_1_1530518718978_3693960_152-67046-0_sp1.1 from training. Duration: 0.936375 2023-10-02 05:28:33,381 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_4528_210_3273_1_1527076654_3276777_342-168068-0_sp1.1 from training. Duration: 0.7909375 2023-10-02 05:28:33,398 WARNING [train.py:1204] (2/4) Exclude cut with ID _7606_210_4915_1_1528894253277_5080690_127-177761-0 from training. Duration: 0.64 2023-10-02 05:28:33,461 WARNING [train.py:1204] (2/4) Exclude cut with ID _45329_210_7005_1_1534157951211_6774339_335-137972-0_sp1.1 from training. Duration: 0.92725 2023-10-02 05:28:36,277 WARNING [train.py:1204] (2/4) Exclude cut with ID _5166_210_2084_1_1527073197160_4087993_331-117829-0_sp1.1 from training. Duration: 0.563625 2023-10-02 05:28:43,777 WARNING [train.py:1204] (2/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_125-125818-0_sp1.1 from training. Duration: 0.87275 2023-10-02 05:28:45,210 WARNING [train.py:1204] (2/4) Exclude cut with ID _6789_210_1534_1_1527854471671_2870519_48-294897-0_sp1.1 from training. Duration: 0.963625 2023-10-02 05:28:46,577 WARNING [train.py:1204] (2/4) Exclude cut with ID _6694_210_4599_1_1527852637490_3965529_116-109398-0_sp1.1 from training. Duration: 0.663625 2023-10-02 05:28:46,589 WARNING [train.py:1204] (2/4) Exclude cut with ID _52892_210_6721_1_1534378941259_4124129_222-172685-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 05:28:48,066 WARNING [train.py:1204] (2/4) Exclude cut with ID _50655_210_18628_1_1534229942193_3772154_320-132275-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 05:28:54,308 WARNING [train.py:1204] (2/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_76-311963-0 from training. Duration: 0.7 2023-10-02 05:28:59,872 WARNING [train.py:1204] (2/4) Exclude cut with ID _6274_210_2084_1_1527937583837_7494020_32-205288-0 from training. Duration: 0.78 2023-10-02 05:28:59,896 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_86-16881-0_sp0.9 from training. Duration: 0.6444375 2023-10-02 05:28:59,937 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_271-297505-0_sp1.1 from training. Duration: 0.9 2023-10-02 05:29:02,674 WARNING [train.py:1204] (2/4) Exclude cut with ID _57572_210_5517_1_1534921219624_3809484_187-295293-0_sp1.1 from training. Duration: 0.963625 2023-10-02 05:29:03,637 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.1.encoder.layers.0.self_attn_weights.whiten_keys, num_groups=4, num_channels=128, metric=3.59 vs. limit=6.0 2023-10-02 05:29:04,586 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.5.encoder.layers.1.self_attn1.whiten, num_groups=1, num_channels=256, metric=12.18 vs. limit=22.5 2023-10-02 05:29:07,657 WARNING [train.py:1204] (2/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_563-329943-0 from training. Duration: 0.99 2023-10-02 05:29:08,271 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.4.encoder.layers.1.feed_forward1.out_whiten, num_groups=1, num_channels=384, metric=9.62 vs. limit=15.0 2023-10-02 05:29:09,004 WARNING [train.py:1204] (2/4) Exclude cut with ID _7160_210_1876_1_1527940923231_3703799_302-150728-0_sp0.9 from training. Duration: 0.8666875 2023-10-02 05:29:11,944 WARNING [train.py:1204] (2/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_926-7526-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 05:29:11,972 WARNING [train.py:1204] (2/4) Exclude cut with ID _62574_210_2655_1_1535180407058_3933249_467-22348-0_sp1.1 from training. Duration: 0.936375 2023-10-02 05:29:13,802 WARNING [train.py:1204] (2/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_280-161633-0_sp0.9 from training. Duration: 0.811125 2023-10-02 05:29:15,193 WARNING [train.py:1204] (2/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_385-193052-0_sp1.1 from training. Duration: 0.7818125 2023-10-02 05:29:16,777 WARNING [train.py:1204] (2/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_444-28041-0 from training. Duration: 0.85 2023-10-02 05:29:16,788 WARNING [train.py:1204] (2/4) Exclude cut with ID _66070_210_7640_1_1535441393096_7805310_341-249237-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 05:29:18,215 WARNING [train.py:1204] (2/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_425-269826-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 05:29:20,941 WARNING [train.py:1204] (2/4) Exclude cut with ID _53950_210_5816_1_1534417264919_3862951_124-185597-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 05:29:21,089 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.self_attn_weights.pos_emb_skip_rate, batch_count=766893.3333333334, ans=0.0 2023-10-02 05:29:24,344 WARNING [train.py:1204] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_380-204756-0 from training. Duration: 0.86 2023-10-02 05:29:29,008 WARNING [train.py:1204] (2/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_282-257791-0_sp0.9 from training. Duration: 0.9555625 2023-10-02 05:29:34,547 WARNING [train.py:1204] (2/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_372-131163-0_sp1.1 from training. Duration: 0.8545625 2023-10-02 05:29:35,963 WARNING [train.py:1204] (2/4) Exclude cut with ID _28317_210_12742_1_1532480302381_8510325_447-233513-0_sp1.1 from training. Duration: 0.963625 2023-10-02 05:29:36,290 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.feed_forward1.hidden_balancer.prob, batch_count=766960.0, ans=0.125 2023-10-02 05:29:39,194 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_54-118333-0_sp1.1 from training. Duration: 0.97275 2023-10-02 05:29:42,078 WARNING [train.py:1204] (2/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_388-230239-0_sp1.1 from training. Duration: 0.963625 2023-10-02 05:29:42,280 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.self_attn_weights.pos_emb_skip_rate, batch_count=766960.0, ans=0.0 2023-10-02 05:29:43,384 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_40047_210_2290_1_1533290519_2374311_141-54639-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 05:29:43,451 WARNING [train.py:1204] (2/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_91-267696-0_sp0.9 from training. Duration: 0.9666875 2023-10-02 05:29:44,731 INFO [train.py:1046] (2/4) Epoch 22, batch 3500, loss[loss=0.1694, simple_loss=0.2473, pruned_loss=0.04575, over 24488.00 frames. ], tot_loss[loss=0.174, simple_loss=0.2499, pruned_loss=0.04905, over 4707588.82 frames. ], batch size: 66, lr: 4.65e-03, grad_scale: 8.0 2023-10-02 05:29:44,781 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_14056_210_4163_1_1530604483_7572827_532-137847-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 05:29:49,578 WARNING [train.py:1204] (2/4) Exclude cut with ID _8064_210_5399_1_1528372299291_4282973_155-283231-0_sp1.1 from training. Duration: 0.97275 2023-10-02 05:29:52,371 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_2245_210_2715_1_1525511548_3540254_310-52483-0_sp1.1 from training. Duration: 0.8 2023-10-02 05:29:53,698 WARNING [train.py:1204] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_185-174805-0 from training. Duration: 0.68 2023-10-02 05:29:54,299 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.5.encoder.layers.1.self_attn_weights.whiten_keys, num_groups=4, num_channels=128, metric=1.63 vs. limit=6.0 2023-10-02 05:29:55,072 WARNING [train.py:1204] (2/4) Exclude cut with ID _15012_210_1968_1_1530856820429_5095350_662-210469-0_sp1.1 from training. Duration: 0.5181875 2023-10-02 05:29:57,145 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_426-313557-0_sp0.9 from training. Duration: 0.588875 2023-10-02 05:29:59,890 WARNING [train.py:1204] (2/4) Exclude cut with ID _42326_210_7770_1_1533814282531_3707269_407-204343-0_sp1.1 from training. Duration: 0.97275 2023-10-02 05:29:59,906 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_258-310570-0 from training. Duration: 0.82 2023-10-02 05:30:00,070 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.balancer1.prob, batch_count=767093.3333333334, ans=0.125 2023-10-02 05:30:04,538 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_4737_210_2040_1_1526998689_3293008_378-178650-0_sp1.1 from training. Duration: 0.863625 2023-10-02 05:30:04,629 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_47896_210_20508_1_1534492518_5958868_432-327451-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 05:30:06,138 WARNING [train.py:1204] (2/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_188-114464-0_sp1.1 from training. Duration: 0.7454375 2023-10-02 05:30:06,145 WARNING [train.py:1204] (2/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_798-106179-0_sp1.1 from training. Duration: 0.7545625 2023-10-02 05:30:07,497 WARNING [train.py:1204] (2/4) Exclude cut with ID _14368_210_4220_1_1530532714870_3595529_143-117421-0_sp1.1 from training. Duration: 0.52725 2023-10-02 05:30:07,542 WARNING [train.py:1204] (2/4) Exclude cut with ID _67499_210_17282_1_1535509805220_3692430_361-78786-0_sp1.1 from training. Duration: 0.963625 2023-10-02 05:30:09,460 WARNING [train.py:1204] (2/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_712-342563-0_sp1.1 from training. Duration: 0.8818125 2023-10-02 05:30:09,517 WARNING [train.py:1204] (2/4) Exclude cut with ID _9235_210_5219_1_1528891154253_8046120_387-159008-0 from training. Duration: 0.86 2023-10-02 05:30:12,210 WARNING [train.py:1204] (2/4) Exclude cut with ID _33640_210_2655_1_1533117605479_4855469_959-18820-0_sp1.1 from training. Duration: 0.963625 2023-10-02 05:30:13,494 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_12868_210_6753_1_1530702479_3610564_133-564-0_sp0.9 from training. Duration: 0.87775 2023-10-02 05:30:15,358 WARNING [train.py:1204] (2/4) Exclude cut with ID _23514_210_1389_1_1531832139785_3031439_12-29287-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 05:30:18,168 WARNING [train.py:1204] (2/4) Exclude cut with ID _23514_210_1389_1_1531832139785_3031439_489-185052-0_sp1.1 from training. Duration: 0.963625 2023-10-02 05:30:19,478 WARNING [train.py:1204] (2/4) Exclude cut with ID _5444_210_2151_1_1527418808464_5312210_634-194174-0 from training. Duration: 0.97 2023-10-02 05:30:19,504 WARNING [train.py:1204] (2/4) Exclude cut with ID _6275_210_2084_1_1528110062213_7669049_91-26919-0_sp1.1 from training. Duration: 0.8818125 2023-10-02 05:30:20,974 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_37351_210_7925_1_1533012931_3812113_516-82303-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 05:30:22,645 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.1.balancer1.prob, batch_count=767160.0, ans=0.125 2023-10-02 05:30:23,753 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.489e+02 1.866e+02 2.009e+02 2.229e+02 3.626e+02, threshold=4.019e+02, percent-clipped=0.0 2023-10-02 05:30:23,841 WARNING [train.py:1204] (2/4) Exclude cut with ID _41314_210_12651_1_1533459476543_3766460_80-74184-0_sp0.9 from training. Duration: 0.92225 2023-10-02 05:30:25,110 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_9460_210_4211_1_1528954587_10049667_495-202963-0_sp1.1 from training. Duration: 0.963625 2023-10-02 05:30:26,515 WARNING [train.py:1204] (2/4) Exclude cut with ID _14632_210_9988_1_1530691279392_3923200_332-105904-0_sp1.1 from training. Duration: 0.8181875 2023-10-02 05:30:26,531 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_40764_210_1925_1_1533882359_3752345_493-167886-0_sp1.1 from training. Duration: 0.92725 2023-10-02 05:30:28,569 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_196-289637-0 from training. Duration: 0.75 2023-10-02 05:30:28,667 WARNING [train.py:1204] (2/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_26-175553-0 from training. Duration: 0.61 2023-10-02 05:30:29,993 WARNING [train.py:1204] (2/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_890-5117-0 from training. Duration: 0.78 2023-10-02 05:30:30,253 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.balancer2.prob, batch_count=767226.6666666666, ans=0.125 2023-10-02 05:30:31,279 WARNING [train.py:1204] (2/4) Exclude cut with ID _41314_210_12651_1_1533459476543_3766460_206-306860-0_sp1.1 from training. Duration: 0.92725 2023-10-02 05:30:32,726 WARNING [train.py:1204] (2/4) Exclude cut with ID _25882_210_3036_1_1532775665815_6479510_570-79824-0_sp1.1 from training. Duration: 0.963625 2023-10-02 05:30:32,782 WARNING [train.py:1204] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_731-2428-0_sp1.1 from training. Duration: 0.7545625 2023-10-02 05:30:32,814 WARNING [train.py:1204] (2/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_138-154812-0_sp0.9 from training. Duration: 0.8666875 2023-10-02 05:30:32,938 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.feed_forward2.hidden_balancer.prob, batch_count=767226.6666666666, ans=0.125 2023-10-02 05:30:36,965 WARNING [train.py:1204] (2/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_178-39722-0_sp0.9 from training. Duration: 0.5444375 2023-10-02 05:30:37,021 WARNING [train.py:1204] (2/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_1285-256622-0_sp0.9 from training. Duration: 0.9333125 2023-10-02 05:30:41,813 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_41428_210_2161_1_1533774184_3992532_65-271323-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 05:30:43,186 WARNING [train.py:1204] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_308-161067-0 from training. Duration: 0.66 2023-10-02 05:30:43,191 WARNING [train.py:1204] (2/4) Exclude cut with ID _3067_210_2667_1_1526212974640_4503168_459-173867-0 from training. Duration: 0.88 2023-10-02 05:30:43,192 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_2221_210_1988_1_1525597051_4935541_415-296882-0_sp1.1 from training. Duration: 0.7 2023-10-02 05:30:46,471 WARNING [train.py:1204] (2/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_354-192266-0_sp1.1 from training. Duration: 0.936375 2023-10-02 05:30:47,831 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_258-310570-0_sp0.9 from training. Duration: 0.911125 2023-10-02 05:30:49,208 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_548-141039-0_sp1.1 from training. Duration: 0.963625 2023-10-02 05:30:50,609 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_15101_210_7927_1_1530786415_5033274_320-327527-0 from training. Duration: 0.96 2023-10-02 05:30:50,667 WARNING [train.py:1204] (2/4) Exclude cut with ID _2895_210_2477_1_1525956785794_4063750_289-11475-0_sp0.9 from training. Duration: 0.911125 2023-10-02 05:30:52,019 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7543_210_6753_1_1528543318_4552_1-85072-0_sp1.1 from training. Duration: 0.7545625 2023-10-02 05:30:53,370 WARNING [train.py:1204] (2/4) Exclude cut with ID _64639_210_5816_1_1535367619437_3528000_461-329156-0 from training. Duration: 0.96 2023-10-02 05:30:54,710 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_211-339622-0 from training. Duration: 0.45 2023-10-02 05:30:56,120 WARNING [train.py:1204] (2/4) Exclude cut with ID _20455_210_5219_1_1531565996775_8098100_402-112739-0_sp1.1 from training. Duration: 0.963625 2023-10-02 05:30:57,827 WARNING [train.py:1204] (2/4) Exclude cut with ID _53489_210_6228_1_1534298357169_3835970_256-115565-0_sp1.1 from training. Duration: 0.936375 2023-10-02 05:30:57,845 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_26326_210_7925_1_1533002493_3592406_360-305260-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 05:30:57,879 WARNING [train.py:1204] (2/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_18-204383-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 05:30:59,166 INFO [train.py:1046] (2/4) Epoch 22, batch 3550, loss[loss=0.1644, simple_loss=0.2458, pruned_loss=0.0415, over 24457.00 frames. ], tot_loss[loss=0.173, simple_loss=0.2493, pruned_loss=0.04835, over 4716199.30 frames. ], batch size: 66, lr: 4.65e-03, grad_scale: 8.0 2023-10-02 05:31:01,213 WARNING [train.py:1204] (2/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_306-232480-0_sp1.1 from training. Duration: 0.87275 2023-10-02 05:31:09,927 WARNING [train.py:1204] (2/4) Exclude cut with ID _41161_210_20034_1_1533431249657_3555314_116-299186-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 05:31:10,045 WARNING [train.py:1204] (2/4) Exclude cut with ID _13913_210_9994_1_1530579615187_3383897_473-277682-0_sp1.1 from training. Duration: 0.3545625 2023-10-02 05:31:14,608 WARNING [train.py:1204] (2/4) Exclude cut with ID _58895_210_8750_1_1535184081320_3649089_49-225702-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 05:31:14,676 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_3080_210_2071_1_1526086102_4365166_312-8726-0_sp1.1 from training. Duration: 0.82725 2023-10-02 05:31:17,389 WARNING [train.py:1204] (2/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_489-190175-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 05:31:17,458 WARNING [train.py:1204] (2/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_345-95717-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 05:31:18,774 WARNING [train.py:1204] (2/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_85-138257-0_sp1.1 from training. Duration: 0.6454375 2023-10-02 05:31:21,575 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7046_210_6753_1_1527895337_6920917_783-278109-0_sp1.1 from training. Duration: 0.7 2023-10-02 05:31:22,885 WARNING [train.py:1204] (2/4) Exclude cut with ID _3465_210_3525_1_1526175667060_3574049_450-203637-0_sp1.1 from training. Duration: 0.836375 2023-10-02 05:31:22,944 WARNING [train.py:1204] (2/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_416-132893-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 05:31:24,156 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_2655_210_2087_1_1525688959_3897400_437-337690-0_sp0.9 from training. Duration: 0.7 2023-10-02 05:31:24,232 WARNING [train.py:1204] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_700-183549-0_sp1.1 from training. Duration: 0.6181875 2023-10-02 05:31:30,375 WARNING [train.py:1204] (2/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_191-59784-0_sp1.1 from training. Duration: 0.8 2023-10-02 05:31:30,535 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.conv_module1.balancer2.min_abs, batch_count=767493.3333333334, ans=0.5 2023-10-02 05:31:31,766 WARNING [train.py:1204] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_218-295097-0_sp1.1 from training. Duration: 0.7 2023-10-02 05:31:31,885 WARNING [train.py:1204] (2/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_310-109461-0_sp1.1 from training. Duration: 0.72725 2023-10-02 05:31:31,890 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_33348_210_5632_1_1533086912_3324030_131-321149-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 05:31:33,178 WARNING [train.py:1204] (2/4) Exclude cut with ID _64135_210_2253_1_1535677267876_7608972_133-188266-0_sp1.1 from training. Duration: 0.736375 2023-10-02 05:31:33,204 WARNING [train.py:1204] (2/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_505-205270-0 from training. Duration: 0.38 2023-10-02 05:31:33,217 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_67223_210_4238_1_1535612120_2537431_102-88248-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 05:31:34,968 WARNING [train.py:1204] (2/4) Exclude cut with ID _54871_210_22356_1_1534665798253_3856359_60-235433-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 05:31:36,372 WARNING [train.py:1204] (2/4) Exclude cut with ID _5189_210_4557_1_1527248319428_3769229_199-53526-0_sp1.1 from training. Duration: 0.3818125 2023-10-02 05:31:42,765 WARNING [train.py:1204] (2/4) Exclude cut with ID _45329_210_7005_1_1534157951211_6774339_439-90979-0_sp1.1 from training. Duration: 0.963625 2023-10-02 05:31:42,840 WARNING [train.py:1204] (2/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_1162-132880-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 05:31:44,160 WARNING [train.py:1204] (2/4) Exclude cut with ID _56423_210_5854_1_1534896061049_3165969_220-22317-0_sp1.1 from training. Duration: 0.963625 2023-10-02 05:31:46,974 WARNING [train.py:1204] (2/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_909-64151-0 from training. Duration: 0.94 2023-10-02 05:31:47,040 WARNING [train.py:1204] (2/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_465-156500-0_sp0.9 from training. Duration: 0.8 2023-10-02 05:31:48,464 WARNING [train.py:1204] (2/4) Exclude cut with ID _44304_210_15374_1_1533639352765_4613129_302-22084-0 from training. Duration: 0.86 2023-10-02 05:31:48,507 WARNING [train.py:1204] (2/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_663-166973-0_sp1.1 from training. Duration: 0.72725 2023-10-02 05:31:49,892 WARNING [train.py:1204] (2/4) Exclude cut with ID _45329_210_7005_1_1534157951211_6774339_503-287883-0_sp0.9 from training. Duration: 0.97775 2023-10-02 05:31:49,914 WARNING [train.py:1204] (2/4) Exclude cut with ID _60057_210_10640_1_1534903065793_3333150_362-230377-0_sp0.9 from training. Duration: 0.9666875 2023-10-02 05:31:54,083 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_4183_210_2087_1_1526729100_1869838_143-280962-0 from training. Duration: 0.56 2023-10-02 05:31:54,849 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.2.encoder.layers.1.feed_forward3.out_whiten, num_groups=1, num_channels=384, metric=9.75 vs. limit=15.0 2023-10-02 05:31:55,394 WARNING [train.py:1204] (2/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_281-1313-0_sp1.1 from training. Duration: 0.936375 2023-10-02 05:31:59,528 WARNING [train.py:1204] (2/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_64-215118-0_sp1.1 from training. Duration: 0.936375 2023-10-02 05:32:00,881 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_2822_210_2715_1_1526112586_1999587_165-36656-0 from training. Duration: 0.89 2023-10-02 05:32:02,778 WARNING [train.py:1204] (2/4) Exclude cut with ID _22961_210_8254_1_1531976366672_4278180_402-208830-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 05:32:06,031 WARNING [train.py:1204] (2/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_98-193494-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 05:32:07,955 WARNING [train.py:1204] (2/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_383-285196-0 from training. Duration: 0.97 2023-10-02 05:32:10,961 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.conv_skip_rate, batch_count=767626.6666666666, ans=0.0 2023-10-02 05:32:13,848 INFO [train.py:1046] (2/4) Epoch 22, batch 3600, loss[loss=0.1658, simple_loss=0.2359, pruned_loss=0.04781, over 19362.00 frames. ], tot_loss[loss=0.1727, simple_loss=0.2492, pruned_loss=0.04813, over 4718326.65 frames. ], batch size: 42, lr: 4.65e-03, grad_scale: 16.0 2023-10-02 05:32:15,205 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6767_210_3273_1_1527855783_1718932_142-127132-0 from training. Duration: 0.95 2023-10-02 05:32:15,248 WARNING [train.py:1204] (2/4) Exclude cut with ID _39867_210_9826_1_1533553222030_9344259_1018-339635-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 05:32:15,307 WARNING [train.py:1204] (2/4) Exclude cut with ID _14603_210_4220_1_1530615688441_3097319_274-274798-0_sp1.1 from training. Duration: 0.8909375 2023-10-02 05:32:16,793 WARNING [train.py:1204] (2/4) Exclude cut with ID _2334_210_2667_1_1525608238880_4705129_376-263393-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 05:32:18,117 WARNING [train.py:1204] (2/4) Exclude cut with ID _29303_210_2575_1_1532392237303_7410649_363-344675-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 05:32:19,530 WARNING [train.py:1204] (2/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_58-68956-0_sp1.1 from training. Duration: 0.7545625 2023-10-02 05:32:22,357 WARNING [train.py:1204] (2/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_709-311571-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 05:32:23,771 WARNING [train.py:1204] (2/4) Exclude cut with ID _46067_210_4220_1_1534125571066_3307450_136-257407-0_sp1.1 from training. Duration: 0.963625 2023-10-02 05:32:25,114 WARNING [train.py:1204] (2/4) Exclude cut with ID _6493_210_1747_1_1528459210424_4012470_324-323854-0_sp1.1 from training. Duration: 0.836375 2023-10-02 05:32:25,178 WARNING [train.py:1204] (2/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_234-260682-0_sp1.1 from training. Duration: 0.763625 2023-10-02 05:32:25,231 WARNING [train.py:1204] (2/4) Exclude cut with ID _36634_210_1794_1_1533296352450_1399884_55-119451-0_sp1.1 from training. Duration: 0.963625 2023-10-02 05:32:26,551 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_40047_210_2290_1_1533290519_2374311_195-165693-0 from training. Duration: 0.89 2023-10-02 05:32:28,145 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_5522_210_3273_1_1527249606_3909096_366-172508-0_sp0.9 from training. Duration: 0.7333125 2023-10-02 05:32:29,511 WARNING [train.py:1204] (2/4) Exclude cut with ID _21878_210_3525_1_1531738772002_7264330_187-10504-0_sp1.1 from training. Duration: 0.963625 2023-10-02 05:32:31,036 WARNING [train.py:1204] (2/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_282-257791-0_sp1.1 from training. Duration: 0.7818125 2023-10-02 05:32:34,295 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_43831_210_2158_1_1533725502_5872791_467-288776-0_sp1.1 from training. Duration: 0.92725 2023-10-02 05:32:35,642 WARNING [train.py:1204] (2/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_138-154812-0_sp1.1 from training. Duration: 0.7090625 2023-10-02 05:32:35,687 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_1154-185930-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 05:32:35,716 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_2655_210_2087_1_1525688959_3897400_52-279899-0 from training. Duration: 0.69 2023-10-02 05:32:37,699 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_141-89113-0_sp1.1 from training. Duration: 0.7818125 2023-10-02 05:32:40,423 WARNING [train.py:1204] (2/4) Exclude cut with ID _55134_210_21982_1_1534590028377_3882170_297-47310-0_sp1.1 from training. Duration: 0.963625 2023-10-02 05:32:41,709 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_486-151131-0_sp1.1 from training. Duration: 0.77275 2023-10-02 05:32:43,734 WARNING [train.py:1204] (2/4) Exclude cut with ID _44778_210_2054_1_1533798127203_7443090_21-232980-0_sp1.1 from training. Duration: 0.936375 2023-10-02 05:32:45,345 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=767826.6666666666, ans=0.1 2023-10-02 05:32:46,517 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6165_210_3528_1_1527590982_4379841_338-106380-0_sp1.1 from training. Duration: 0.92725 2023-10-02 05:32:47,921 WARNING [train.py:1204] (2/4) Exclude cut with ID _55162_210_17071_1_1534408248583_3753480_355-257218-0_sp1.1 from training. Duration: 0.97275 2023-10-02 05:32:49,329 WARNING [train.py:1204] (2/4) Exclude cut with ID _2895_210_2477_1_1525956785794_4063750_289-11475-0 from training. Duration: 0.82 2023-10-02 05:32:52,054 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.566e+02 1.859e+02 2.035e+02 2.316e+02 3.119e+02, threshold=4.069e+02, percent-clipped=0.0 2023-10-02 05:32:54,972 WARNING [train.py:1204] (2/4) Exclude cut with ID _16756_210_4915_1_1531047803763_7104259_839-183431-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 05:32:56,430 WARNING [train.py:1204] (2/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_427-208901-0_sp0.9 from training. Duration: 0.7555625 2023-10-02 05:32:56,464 WARNING [train.py:1204] (2/4) Exclude cut with ID _36512_210_6987_1_1533033143998_3126069_181-306500-0 from training. Duration: 0.95 2023-10-02 05:33:00,580 WARNING [train.py:1204] (2/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_28-70987-0_sp1.1 from training. Duration: 0.7909375 2023-10-02 05:33:07,095 WARNING [train.py:1204] (2/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_590-58996-0_sp1.1 from training. Duration: 0.936375 2023-10-02 05:33:08,543 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_48730_210_5399_1_1533885886_2212769_108-47561-0_sp1.1 from training. Duration: 0.936375 2023-10-02 05:33:09,069 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.5.encoder.layers.0.nonlin_attention.whiten2, num_groups=1, num_channels=256, metric=3.56 vs. limit=15.0 2023-10-02 05:33:10,878 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.0.ff3_skip_rate, batch_count=767893.3333333334, ans=0.0 2023-10-02 05:33:11,391 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.2.encoder.layers.2.feed_forward2.out_whiten, num_groups=1, num_channels=384, metric=5.61 vs. limit=15.0 2023-10-02 05:33:12,240 WARNING [train.py:1204] (2/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_401-158239-0_sp0.9 from training. Duration: 0.788875 2023-10-02 05:33:12,253 WARNING [train.py:1204] (2/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_278-154492-0_sp1.1 from training. Duration: 0.6909375 2023-10-02 05:33:12,258 WARNING [train.py:1204] (2/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_137-116442-0 from training. Duration: 0.78 2023-10-02 05:33:12,458 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.conv_skip_rate, batch_count=767960.0, ans=0.0 2023-10-02 05:33:13,657 WARNING [train.py:1204] (2/4) Exclude cut with ID _3541_210_2477_1_1526475408709_3263973_189-317158-0 from training. Duration: 0.9 2023-10-02 05:33:15,634 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_47896_210_20508_1_1534492518_5958868_558-314572-0 from training. Duration: 0.97 2023-10-02 05:33:17,650 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.0.layers.1.conv_module1.whiten, num_groups=1, num_channels=192, metric=11.36 vs. limit=15.0 2023-10-02 05:33:18,018 WARNING [train.py:1204] (2/4) Exclude cut with ID _66070_210_7640_1_1535441393096_7805310_290-40284-0_sp1.1 from training. Duration: 0.97275 2023-10-02 05:33:18,064 WARNING [train.py:1204] (2/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_110-224688-0_sp1.1 from training. Duration: 0.8818125 2023-10-02 05:33:18,149 WARNING [train.py:1204] (2/4) Exclude cut with ID _7719_210_3525_1_1528456153163_4296960_88-250259-0 from training. Duration: 0.84 2023-10-02 05:33:19,430 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8453_210_4211_1_1528502940_8562730_1157-114102-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 05:33:19,462 WARNING [train.py:1204] (2/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_317-175062-0_sp1.1 from training. Duration: 0.7454375 2023-10-02 05:33:19,468 WARNING [train.py:1204] (2/4) Exclude cut with ID _29857_210_20034_1_1532395886753_3798160_254-347254-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 05:33:20,859 WARNING [train.py:1204] (2/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_28-70987-0 from training. Duration: 0.87 2023-10-02 05:33:23,758 WARNING [train.py:1204] (2/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_321-335475-0 from training. Duration: 0.45 2023-10-02 05:33:24,103 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.feed_forward1.out_proj.dropout_p, batch_count=767960.0, ans=0.1 2023-10-02 05:33:25,239 WARNING [train.py:1204] (2/4) Exclude cut with ID _40901_210_5665_1_1533363789958_3856130_356-107330-0_sp1.1 from training. Duration: 0.936375 2023-10-02 05:33:25,302 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_3780_210_2087_1_1526467389_2145572_176-107140-0 from training. Duration: 0.69 2023-10-02 05:33:28,010 INFO [train.py:1046] (2/4) Epoch 22, batch 3650, loss[loss=0.1726, simple_loss=0.2579, pruned_loss=0.04368, over 24641.00 frames. ], tot_loss[loss=0.174, simple_loss=0.2501, pruned_loss=0.04898, over 4720334.14 frames. ], batch size: 73, lr: 4.65e-03, grad_scale: 16.0 2023-10-02 05:33:31,572 WARNING [train.py:1204] (2/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_80-135200-0 from training. Duration: 0.79 2023-10-02 05:33:31,666 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_33348_210_5632_1_1533086912_3324030_85-28583-0_sp0.9 from training. Duration: 0.911125 2023-10-02 05:33:36,904 WARNING [train.py:1204] (2/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_29-293896-0 from training. Duration: 0.61 2023-10-02 05:33:37,694 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.2.encoder.layers.2.nonlin_attention.whiten1, num_groups=1, num_channels=288, metric=6.00 vs. limit=10.0 2023-10-02 05:33:38,282 WARNING [train.py:1204] (2/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_194-252800-0 from training. Duration: 0.97 2023-10-02 05:33:43,252 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_360-52446-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 05:33:43,254 WARNING [train.py:1204] (2/4) Exclude cut with ID _9389_210_1664_1_1529217337301_5289269_289-213417-0_sp0.9 from training. Duration: 0.988875 2023-10-02 05:33:44,772 WARNING [train.py:1204] (2/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_143-86344-0_sp0.9 from training. Duration: 0.8555625 2023-10-02 05:33:48,232 WARNING [train.py:1204] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_520-126729-0_sp0.9 from training. Duration: 0.77775 2023-10-02 05:33:48,264 WARNING [train.py:1204] (2/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_76-308663-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 05:33:49,596 WARNING [train.py:1204] (2/4) Exclude cut with ID _8021_210_7994_1_1528376451358_3653069_100-140231-0 from training. Duration: 0.67 2023-10-02 05:33:50,984 WARNING [train.py:1204] (2/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_387-131538-0_sp0.9 from training. Duration: 0.9 2023-10-02 05:33:51,011 WARNING [train.py:1204] (2/4) Exclude cut with ID _6275_210_2084_1_1528110062213_7669049_365-182054-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 05:33:51,068 WARNING [train.py:1204] (2/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_345-340690-0 from training. Duration: 0.93 2023-10-02 05:33:52,366 WARNING [train.py:1204] (2/4) Exclude cut with ID _5409_210_1388_1_1527292850189_5865191_178-127970-0_sp0.9 from training. Duration: 0.6333125 2023-10-02 05:33:52,408 WARNING [train.py:1204] (2/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_352-12734-0_sp1.1 from training. Duration: 0.92725 2023-10-02 05:33:52,410 WARNING [train.py:1204] (2/4) Exclude cut with ID _65134_210_14763_1_1535335393985_3505368_121-129373-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 05:33:55,161 WARNING [train.py:1204] (2/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_557-324258-0_sp1.1 from training. Duration: 0.736375 2023-10-02 05:33:57,947 WARNING [train.py:1204] (2/4) Exclude cut with ID _48678_210_5663_1_1533988830947_3769199_306-255886-0 from training. Duration: 0.77 2023-10-02 05:33:59,162 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7544_210_6753_1_1528587000_691468_87-178599-0 from training. Duration: 0.87 2023-10-02 05:34:00,497 WARNING [train.py:1204] (2/4) Exclude cut with ID _2941_210_2577_1_1526199673150_2489759_208-236402-0_sp1.1 from training. Duration: 0.963625 2023-10-02 05:34:01,858 WARNING [train.py:1204] (2/4) Exclude cut with ID _38670_210_15379_1_1533448810659_4391059_65-169397-0 from training. Duration: 0.97 2023-10-02 05:34:03,234 WARNING [train.py:1204] (2/4) Exclude cut with ID _30304_210_6319_1_1532476827261_7582060_306-328175-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 05:34:03,253 WARNING [train.py:1204] (2/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_212-149947-0_sp1.1 from training. Duration: 0.663625 2023-10-02 05:34:07,003 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.conv_skip_rate, batch_count=768160.0, ans=0.0 2023-10-02 05:34:09,504 WARNING [train.py:1204] (2/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_321-22821-0_sp1.1 from training. Duration: 0.8090625 2023-10-02 05:34:10,856 WARNING [train.py:1204] (2/4) Exclude cut with ID _44010_210_4525_1_1533884262964_4013620_483-69110-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 05:34:10,865 WARNING [train.py:1204] (2/4) Exclude cut with ID _14922_210_6228_1_1530707435273_3693310_38-187851-0_sp1.1 from training. Duration: 0.563625 2023-10-02 05:34:12,227 WARNING [train.py:1204] (2/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_129-337802-0_sp0.9 from training. Duration: 0.8 2023-10-02 05:34:14,150 WARNING [train.py:1204] (2/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_268-87909-0_sp1.1 from training. Duration: 0.9 2023-10-02 05:34:16,937 WARNING [train.py:1204] (2/4) Exclude cut with ID _21878_210_3525_1_1531738772002_7264330_559-184862-0_sp1.1 from training. Duration: 0.8545625 2023-10-02 05:34:17,247 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.out_combiner.scale_min, batch_count=768226.6666666666, ans=0.2 2023-10-02 05:34:19,101 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_46032_210_14656_1_1533781412_4507969_237-120766-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 05:34:19,835 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.2.encoder.layers.0.conv_module2.whiten, num_groups=1, num_channels=384, metric=13.60 vs. limit=15.0 2023-10-02 05:34:20,546 WARNING [train.py:1204] (2/4) Exclude cut with ID _28427_210_4220_1_1532581240967_3061069_330-170906-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 05:34:20,552 WARNING [train.py:1204] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_401-51578-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 05:34:22,999 WARNING [train.py:1204] (2/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_209-205365-0_sp1.1 from training. Duration: 0.7181875 2023-10-02 05:34:24,306 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_23801_210_16223_1_1532066392_3294657_57-242170-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 05:34:24,365 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_127-37371-0_sp1.1 from training. Duration: 0.936375 2023-10-02 05:34:24,493 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.feed_forward3.hidden_balancer.prob, batch_count=768226.6666666666, ans=0.125 2023-10-02 05:34:24,603 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.attention_skip_rate, batch_count=768226.6666666666, ans=0.0 2023-10-02 05:34:25,936 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.0.feed_forward1.out_proj.dropout_p, batch_count=768293.3333333334, ans=0.1 2023-10-02 05:34:30,005 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9171W0019-578905 from training. Duration: 0.896 2023-10-02 05:34:32,686 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_24605_210_1794_1_1532047863_4149755_349-49056-0_sp1.1 from training. Duration: 0.92725 2023-10-02 05:34:32,707 WARNING [train.py:1204] (2/4) Exclude cut with ID _12302_210_5219_1_1529924446466_8107280_897-21237-0_sp1.1 from training. Duration: 0.936375 2023-10-02 05:34:34,208 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.1.bypass.scale_min, batch_count=768293.3333333334, ans=0.2 2023-10-02 05:34:34,283 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.balancer2.prob, batch_count=768293.3333333334, ans=0.125 2023-10-02 05:34:35,319 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_9460_210_4211_1_1528954587_10049667_480-260269-0_sp0.9 from training. Duration: 0.62225 2023-10-02 05:34:35,374 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_348-152548-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 05:34:37,246 WARNING [train.py:1204] (2/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_301-330455-0_sp0.9 from training. Duration: 0.888875 2023-10-02 05:34:38,671 WARNING [train.py:1204] (2/4) Exclude cut with ID _61008_210_10633_1_1534852769198_6974370_345-91705-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 05:34:39,507 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.2.encoder.layers.0.self_attn_weights.whiten_keys, num_groups=4, num_channels=128, metric=4.60 vs. limit=6.0 2023-10-02 05:34:40,760 WARNING [train.py:1204] (2/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_304-273433-0 from training. Duration: 0.99 2023-10-02 05:34:40,763 WARNING [train.py:1204] (2/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_359-345972-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 05:34:42,044 INFO [train.py:1046] (2/4) Epoch 22, batch 3700, loss[loss=0.1633, simple_loss=0.2487, pruned_loss=0.039, over 24541.00 frames. ], tot_loss[loss=0.1738, simple_loss=0.2501, pruned_loss=0.04877, over 4727987.05 frames. ], batch size: 71, lr: 4.65e-03, grad_scale: 16.0 2023-10-02 05:34:42,181 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_2296_210_2087_1_1525503448_3397335_57-185430-0_sp1.1 from training. Duration: 0.5454375 2023-10-02 05:34:43,701 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.feed_forward1.hidden_balancer.prob, batch_count=768360.0, ans=0.125 2023-10-02 05:34:44,869 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_1256-120974-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 05:34:46,900 WARNING [train.py:1204] (2/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_372-30302-0_sp0.9 from training. Duration: 0.9555625 2023-10-02 05:34:47,901 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.1.encoder.layers.0.feed_forward2.out_whiten, num_groups=1, num_channels=256, metric=4.12 vs. limit=15.0 2023-10-02 05:34:49,696 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_42012_210_7927_1_1533439936_3871556_451-344262-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 05:34:49,697 WARNING [train.py:1204] (2/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_28-324729-0 from training. Duration: 0.94 2023-10-02 05:34:49,707 WARNING [train.py:1204] (2/4) Exclude cut with ID _64135_210_2253_1_1535677267876_7608972_313-18394-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 05:34:49,796 WARNING [train.py:1204] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_620-276793-0_sp0.9 from training. Duration: 0.6555625 2023-10-02 05:34:51,744 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7544_210_6753_1_1528591327_1471426_172-348777-0_sp0.9 from training. Duration: 0.7333125 2023-10-02 05:34:56,073 WARNING [train.py:1204] (2/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_332-221057-0_sp0.9 from training. Duration: 0.7555625 2023-10-02 05:34:57,600 WARNING [train.py:1204] (2/4) Exclude cut with ID _19023_210_4353_1_1531378892552_5820780_5-300035-0_sp1.1 from training. Duration: 0.92725 2023-10-02 05:34:57,636 WARNING [train.py:1204] (2/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_343-309023-0_sp1.1 from training. Duration: 0.963625 2023-10-02 05:34:59,166 WARNING [train.py:1204] (2/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_37-41919-0_sp1.1 from training. Duration: 0.7909375 2023-10-02 05:35:00,472 WARNING [train.py:1204] (2/4) Exclude cut with ID _3541_210_2477_1_1526475408709_3263973_41-182910-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 05:35:00,536 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_229-45300-0_sp0.9 from training. Duration: 0.6333125 2023-10-02 05:35:02,057 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.0.balancer1.prob, batch_count=768426.6666666666, ans=0.125 2023-10-02 05:35:03,269 WARNING [train.py:1204] (2/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_40-214716-0_sp1.1 from training. Duration: 0.963625 2023-10-02 05:35:04,644 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9309W0203-640330 from training. Duration: 0.896 2023-10-02 05:35:06,403 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.conv_skip_rate, batch_count=768426.6666666666, ans=0.0 2023-10-02 05:35:11,974 WARNING [train.py:1204] (2/4) Exclude cut with ID _60229_210_27767_1_1534838406158_3655350_264-279573-0_sp1.1 from training. Duration: 0.7545625 2023-10-02 05:35:13,962 WARNING [train.py:1204] (2/4) Exclude cut with ID _8394_210_6702_1_1528590504316_3540189_372-198878-0_sp0.9 from training. Duration: 0.6666875 2023-10-02 05:35:14,056 WARNING [train.py:1204] (2/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_359-118926-0_sp1.1 from training. Duration: 0.6545625 2023-10-02 05:35:14,076 WARNING [train.py:1204] (2/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_85-65056-0 from training. Duration: 0.96 2023-10-02 05:35:14,104 WARNING [train.py:1204] (2/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_89-242335-0_sp1.1 from training. Duration: 0.863625 2023-10-02 05:35:18,780 WARNING [train.py:1204] (2/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_128-49444-0_sp1.1 from training. Duration: 0.936375 2023-10-02 05:35:18,861 WARNING [train.py:1204] (2/4) Exclude cut with ID _30327_210_16049_1_1532484161295_3924169_401-70819-0 from training. Duration: 0.86 2023-10-02 05:35:19,063 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.feed_forward2.hidden_balancer.prob, batch_count=768493.3333333334, ans=0.125 2023-10-02 05:35:20,182 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_41822_210_13270_1_1533955976_4379284_260-256391-0_sp1.1 from training. Duration: 0.936375 2023-10-02 05:35:21,390 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.551e+02 1.818e+02 2.091e+02 2.478e+02 3.925e+02, threshold=4.183e+02, percent-clipped=0.0 2023-10-02 05:35:21,564 WARNING [train.py:1204] (2/4) Exclude cut with ID _67335_210_5219_1_1535525975652_4275229_599-247317-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 05:35:24,712 WARNING [train.py:1204] (2/4) Exclude cut with ID _29857_210_20034_1_1532395886753_3798160_209-239267-0_sp1.1 from training. Duration: 0.936375 2023-10-02 05:35:24,742 WARNING [train.py:1204] (2/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_46-207599-0_sp1.1 from training. Duration: 0.5818125 2023-10-02 05:35:24,949 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.nonlin_attention.balancer.prob, batch_count=768493.3333333334, ans=0.125 2023-10-02 05:35:26,367 INFO [scaling.py:1118] (2/4) WithLoss: name=encoder.encoders.5.encoder.layers.0.self_attn_weights, loss-sum=0.000e+00 2023-10-02 05:35:27,509 WARNING [train.py:1204] (2/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_912-282412-0_sp1.1 from training. Duration: 0.4454375 2023-10-02 05:35:31,736 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_4183_210_2087_1_1526729100_1869838_141-166600-0_sp1.1 from training. Duration: 0.863625 2023-10-02 05:35:31,741 WARNING [train.py:1204] (2/4) Exclude cut with ID _15037_210_2253_1_1530752294104_3267650_296-192597-0 from training. Duration: 0.81 2023-10-02 05:35:33,116 WARNING [train.py:1204] (2/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_571-220562-0_sp1.1 from training. Duration: 0.963625 2023-10-02 05:35:33,134 WARNING [train.py:1204] (2/4) Exclude cut with ID _5287_210_2084_1_1527418883040_3878670_12-221186-0 from training. Duration: 0.98 2023-10-02 05:35:35,048 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.5.encoder.layers.0.self_attn2.whiten, num_groups=1, num_channels=256, metric=11.86 vs. limit=22.5 2023-10-02 05:35:37,310 WARNING [train.py:1204] (2/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_149-85602-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 05:35:37,331 WARNING [train.py:1204] (2/4) Exclude cut with ID _9235_210_5219_1_1528891154253_8046120_1003-101638-0_sp1.1 from training. Duration: 0.663625 2023-10-02 05:35:39,929 WARNING [train.py:1204] (2/4) Exclude cut with ID _55844_210_2346_1_1534471212569_7625707_1099-33689-0_sp1.1 from training. Duration: 0.92725 2023-10-02 05:35:40,643 WARNING [train.py:1204] (2/4) Exclude cut with ID _7606_210_4915_1_1528894253277_5080690_301-10732-0 from training. Duration: 0.81 2023-10-02 05:35:43,418 WARNING [train.py:1204] (2/4) Exclude cut with ID _22826_210_9994_1_1531908201522_3279169_403-34698-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 05:35:43,429 WARNING [train.py:1204] (2/4) Exclude cut with ID _8480_210_1874_1_1528589391205_6728330_755-112625-0_sp0.9 from training. Duration: 0.82225 2023-10-02 05:35:43,446 WARNING [train.py:1204] (2/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_527-238990-0_sp1.1 from training. Duration: 0.8181875 2023-10-02 05:35:44,707 WARNING [train.py:1204] (2/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_426-7328-0_sp1.1 from training. Duration: 0.92725 2023-10-02 05:35:48,041 WARNING [train.py:1204] (2/4) Exclude cut with ID _3541_210_2477_1_1526475408709_3263973_189-317158-0_sp1.1 from training. Duration: 0.8181875 2023-10-02 05:35:49,401 WARNING [train.py:1204] (2/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_162-103620-0 from training. Duration: 0.93 2023-10-02 05:35:49,505 WARNING [train.py:1204] (2/4) Exclude cut with ID _22083_210_8254_1_1531702862439_3678970_49-234546-0 from training. Duration: 0.83 2023-10-02 05:35:50,795 WARNING [train.py:1204] (2/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_370-331600-0_sp1.1 from training. Duration: 0.8545625 2023-10-02 05:35:50,807 WARNING [train.py:1204] (2/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_166-59042-0_sp1.1 from training. Duration: 0.97275 2023-10-02 05:35:50,917 WARNING [train.py:1204] (2/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_138-332773-0_sp1.1 from training. Duration: 0.736375 2023-10-02 05:35:52,759 WARNING [train.py:1204] (2/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_90-30673-0_sp0.9 from training. Duration: 0.9333125 2023-10-02 05:35:55,518 WARNING [train.py:1204] (2/4) Exclude cut with ID _27967_210_12140_1_1532588391859_8419130_737-41898-0_sp1.1 from training. Duration: 0.936375 2023-10-02 05:35:55,662 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.1.feed_forward1.out_proj.dropout_p, batch_count=768693.3333333334, ans=0.1 2023-10-02 05:35:56,799 INFO [train.py:1046] (2/4) Epoch 22, batch 3750, loss[loss=0.2236, simple_loss=0.289, pruned_loss=0.07911, over 19435.00 frames. ], tot_loss[loss=0.1745, simple_loss=0.2507, pruned_loss=0.04912, over 4732460.69 frames. ], batch size: 388, lr: 4.65e-03, grad_scale: 16.0 2023-10-02 05:35:56,920 WARNING [train.py:1204] (2/4) Exclude cut with ID _8833_210_5219_1_1528592665210_7553059_329-125156-0_sp1.1 from training. Duration: 0.7454375 2023-10-02 05:35:58,327 WARNING [train.py:1204] (2/4) Exclude cut with ID _41817_210_18835_1_1533434477160_7423049_352-42537-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 05:36:01,011 WARNING [train.py:1204] (2/4) Exclude cut with ID _8365_210_2064_1_1528592311070_4677690_431-294102-0 from training. Duration: 0.89 2023-10-02 05:36:02,372 WARNING [train.py:1204] (2/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_454-314901-0_sp1.1 from training. Duration: 0.4181875 2023-10-02 05:36:02,584 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.conv_module1.balancer1.prob, batch_count=768693.3333333334, ans=0.125 2023-10-02 05:36:03,810 WARNING [train.py:1204] (2/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_80-135200-0_sp0.9 from training. Duration: 0.87775 2023-10-02 05:36:05,141 WARNING [train.py:1204] (2/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_181-78632-0 from training. Duration: 0.78 2023-10-02 05:36:05,201 WARNING [train.py:1204] (2/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_300-27235-0_sp0.9 from training. Duration: 0.9666875 2023-10-02 05:36:06,557 WARNING [train.py:1204] (2/4) Exclude cut with ID _47790_210_16187_1_1533862814066_4207519_96-177176-0_sp1.1 from training. Duration: 0.97275 2023-10-02 05:36:06,647 WARNING [train.py:1204] (2/4) Exclude cut with ID _35941_210_15379_1_1532995294515_4023089_246-348742-0_sp1.1 from training. Duration: 0.97275 2023-10-02 05:36:07,192 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.4.encoder.layers.1.nonlin_attention.whiten2, num_groups=1, num_channels=384, metric=3.52 vs. limit=15.0 2023-10-02 05:36:09,271 WARNING [train.py:1204] (2/4) Exclude cut with ID _38670_210_15379_1_1533448810659_4391059_65-169397-0_sp1.1 from training. Duration: 0.8818125 2023-10-02 05:36:12,262 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.balancer2.prob, batch_count=768760.0, ans=0.125 2023-10-02 05:36:13,877 WARNING [train.py:1204] (2/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_1003-99642-0_sp1.1 from training. Duration: 0.963625 2023-10-02 05:36:16,720 WARNING [train.py:1204] (2/4) Exclude cut with ID _40402_210_18219_1_1533522324115_2953150_320-14078-0_sp1.1 from training. Duration: 0.863625 2023-10-02 05:36:18,652 WARNING [train.py:1204] (2/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_367-61330-0_sp0.9 from training. Duration: 0.8333125 2023-10-02 05:36:20,693 WARNING [train.py:1204] (2/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_515-298514-0_sp1.1 from training. Duration: 0.92725 2023-10-02 05:36:25,175 WARNING [train.py:1204] (2/4) Exclude cut with ID _53731_210_7994_1_1534553979244_3757214_471-320815-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 05:36:25,229 WARNING [train.py:1204] (2/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_306-232480-0 from training. Duration: 0.96 2023-10-02 05:36:26,620 WARNING [train.py:1204] (2/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_478-162247-0_sp0.9 from training. Duration: 0.911125 2023-10-02 05:36:28,044 WARNING [train.py:1204] (2/4) Exclude cut with ID _56423_210_5854_1_1534896061049_3165969_345-284696-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 05:36:28,074 WARNING [train.py:1204] (2/4) Exclude cut with ID _44778_210_2054_1_1533798127203_7443090_600-122253-0_sp1.1 from training. Duration: 0.963625 2023-10-02 05:36:30,874 WARNING [train.py:1204] (2/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_199-295611-0 from training. Duration: 0.55 2023-10-02 05:36:31,753 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.2.encoder.layers.1.conv_module1.whiten, num_groups=1, num_channels=384, metric=10.17 vs. limit=15.0 2023-10-02 05:36:33,672 WARNING [train.py:1204] (2/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_734-129761-0 from training. Duration: 0.82 2023-10-02 05:36:34,987 WARNING [train.py:1204] (2/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_349-169257-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 05:36:36,337 WARNING [train.py:1204] (2/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_202-67017-0_sp0.9 from training. Duration: 0.911125 2023-10-02 05:36:39,033 WARNING [train.py:1204] (2/4) Exclude cut with ID _16908_210_5724_1_1531119157248_3772700_61-258978-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 05:36:43,278 WARNING [train.py:1204] (2/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_172-156931-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 05:36:44,202 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.feed_forward1.hidden_balancer.prob, batch_count=768893.3333333334, ans=0.125 2023-10-02 05:36:45,228 WARNING [train.py:1204] (2/4) Exclude cut with ID _4092_210_2151_1_1526643044449_3135759_356-278694-0_sp1.1 from training. Duration: 0.42725 2023-10-02 05:36:48,174 WARNING [train.py:1204] (2/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_116-332008-0 from training. Duration: 0.98 2023-10-02 05:36:50,275 WARNING [train.py:1204] (2/4) Exclude cut with ID _29003_210_19596_1_1532507391720_3894098_358-114789-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 05:36:53,722 WARNING [train.py:1204] (2/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_152-212533-0_sp1.1 from training. Duration: 0.8818125 2023-10-02 05:36:53,781 WARNING [train.py:1204] (2/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_267-214116-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 05:36:58,273 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7543_210_6753_1_1528541907_1399322_165-323619-0_sp0.9 from training. Duration: 0.7666875 2023-10-02 05:36:59,098 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.2.encoder.layers.0.conv_module2.whiten, num_groups=1, num_channels=384, metric=4.71 vs. limit=15.0 2023-10-02 05:37:02,333 WARNING [train.py:1204] (2/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_577-332600-0_sp0.9 from training. Duration: 0.6333125 2023-10-02 05:37:03,760 WARNING [train.py:1204] (2/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_342-100157-0_sp0.9 from training. Duration: 0.6 2023-10-02 05:37:05,189 WARNING [train.py:1204] (2/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_141-89727-0_sp1.1 from training. Duration: 0.6454375 2023-10-02 05:37:06,573 WARNING [train.py:1204] (2/4) Exclude cut with ID _3992_210_1747_1_1526644816031_3731060_107-251434-0_sp1.1 from training. Duration: 0.9 2023-10-02 05:37:09,328 WARNING [train.py:1204] (2/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_401-53423-0_sp1.1 from training. Duration: 0.5 2023-10-02 05:37:10,668 INFO [train.py:1046] (2/4) Epoch 22, batch 3800, loss[loss=0.1904, simple_loss=0.2726, pruned_loss=0.05412, over 24384.00 frames. ], tot_loss[loss=0.1738, simple_loss=0.25, pruned_loss=0.04885, over 4726220.70 frames. ], batch size: 77, lr: 4.65e-03, grad_scale: 8.0 2023-10-02 05:37:14,114 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.5.encoder.layers.1.self_attn2.whiten, num_groups=1, num_channels=256, metric=10.45 vs. limit=22.5 2023-10-02 05:37:16,911 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7762_210_1794_1_1528199508_4057408_422-227034-0_sp1.1 from training. Duration: 0.663625 2023-10-02 05:37:19,753 WARNING [train.py:1204] (2/4) Exclude cut with ID _4184_210_2550_1_1526730192013_2219380_102-250299-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 05:37:21,551 WARNING [train.py:1204] (2/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_253-308377-0_sp0.9 from training. Duration: 0.5444375 2023-10-02 05:37:22,893 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_271-297505-0 from training. Duration: 0.99 2023-10-02 05:37:24,295 WARNING [train.py:1204] (2/4) Exclude cut with ID _35941_210_15379_1_1532995294515_4023089_213-142973-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 05:37:24,434 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7544_210_6753_1_1528587722_3576686_162-63018-0_sp1.1 from training. Duration: 0.92725 2023-10-02 05:37:26,499 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_365-217119-0_sp0.9 from training. Duration: 0.7 2023-10-02 05:37:29,111 WARNING [train.py:1204] (2/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_662-275621-0_sp0.9 from training. Duration: 0.4666875 2023-10-02 05:37:29,113 WARNING [train.py:1204] (2/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_374-187555-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 05:37:29,190 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_289-180904-0_sp1.1 from training. Duration: 0.6909375 2023-10-02 05:37:30,597 WARNING [train.py:1204] (2/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_189-47206-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 05:37:30,623 WARNING [train.py:1204] (2/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_234-14174-0_sp1.1 from training. Duration: 0.7454375 2023-10-02 05:37:31,960 WARNING [train.py:1204] (2/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_576-76621-0_sp1.1 from training. Duration: 0.963625 2023-10-02 05:37:33,303 WARNING [train.py:1204] (2/4) Exclude cut with ID _13253_210_1925_1_1530757775819_3577873_253-246386-0 from training. Duration: 0.9 2023-10-02 05:37:36,057 WARNING [train.py:1204] (2/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_372-6869-0_sp1.1 from training. Duration: 0.4 2023-10-02 05:37:36,105 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_21434_210_4238_1_1531916339_1222252_22-54954-0_sp1.1 from training. Duration: 0.936375 2023-10-02 05:37:38,947 WARNING [train.py:1204] (2/4) Exclude cut with ID _29853_210_7497_1_1532505600851_6642110_492-170361-0_sp1.1 from training. Duration: 0.92725 2023-10-02 05:37:41,768 WARNING [train.py:1204] (2/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_505-97987-0_sp0.9 from training. Duration: 0.9555625 2023-10-02 05:37:41,811 WARNING [train.py:1204] (2/4) Exclude cut with ID _8365_210_2064_1_1528592311070_4677690_371-122295-0_sp0.9 from training. Duration: 0.611125 2023-10-02 05:37:43,258 WARNING [train.py:1204] (2/4) Exclude cut with ID _14268_210_9206_1_1530622771285_4714099_637-86630-0_sp1.1 from training. Duration: 0.463625 2023-10-02 05:37:43,269 WARNING [train.py:1204] (2/4) Exclude cut with ID _29857_210_20034_1_1532395886753_3798160_372-5678-0_sp1.1 from training. Duration: 0.963625 2023-10-02 05:37:44,887 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.0.layers.1.self_attn2.whiten, num_groups=1, num_channels=192, metric=10.54 vs. limit=22.5 2023-10-02 05:37:46,633 WARNING [train.py:1204] (2/4) Exclude cut with ID _52892_210_6721_1_1534378941259_4124129_90-165108-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 05:37:46,719 WARNING [train.py:1204] (2/4) Exclude cut with ID _27052_210_5986_1_1532329197187_3585009_143-339198-0_sp1.1 from training. Duration: 0.963625 2023-10-02 05:37:51,711 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.510e+02 1.862e+02 2.088e+02 2.377e+02 3.435e+02, threshold=4.176e+02, percent-clipped=0.0 2023-10-02 05:37:52,042 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.out_combiner.scale_min, batch_count=769160.0, ans=0.2 2023-10-02 05:37:53,130 WARNING [train.py:1204] (2/4) Exclude cut with ID _14268_210_9206_1_1530622771285_4714099_637-86630-0_sp0.9 from training. Duration: 0.5666875 2023-10-02 05:37:53,132 WARNING [train.py:1204] (2/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_401-255491-0 from training. Duration: 0.7 2023-10-02 05:37:53,268 WARNING [train.py:1204] (2/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_178-6946-0_sp1.1 from training. Duration: 0.97275 2023-10-02 05:38:00,998 WARNING [train.py:1204] (2/4) Exclude cut with ID _27485_210_4916_1_1532739191677_3668269_460-163885-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 05:38:04,719 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.0.layers.1.nonlin_attention.whiten1, num_groups=1, num_channels=144, metric=5.19 vs. limit=10.0 2023-10-02 05:38:05,227 WARNING [train.py:1204] (2/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_270-26053-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 05:38:06,604 WARNING [train.py:1204] (2/4) Exclude cut with ID _14170_210_5219_1_1530525949868_4469650_412-317077-0 from training. Duration: 0.75 2023-10-02 05:38:08,227 WARNING [train.py:1204] (2/4) Exclude cut with ID _5289_210_2084_1_1527678202736_3779299_142-103317-0 from training. Duration: 0.84 2023-10-02 05:38:08,271 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_150-143438-0_sp1.1 from training. Duration: 0.92725 2023-10-02 05:38:10,883 WARNING [train.py:1204] (2/4) Exclude cut with ID _27894_210_12392_1_1532415496078_6902439_668-269779-0_sp1.1 from training. Duration: 0.97275 2023-10-02 05:38:12,232 WARNING [train.py:1204] (2/4) Exclude cut with ID _18513_210_9206_1_1531618215223_3862620_56-222075-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 05:38:12,961 WARNING [train.py:1204] (2/4) Exclude cut with ID _4092_210_2151_1_1526643044449_3135759_356-278694-0 from training. Duration: 0.47 2023-10-02 05:38:17,029 WARNING [train.py:1204] (2/4) Exclude cut with ID _25808_210_13292_1_1532343514562_7366109_633-47524-0 from training. Duration: 0.99 2023-10-02 05:38:17,043 WARNING [train.py:1204] (2/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_872-264525-0 from training. Duration: 0.65 2023-10-02 05:38:17,072 WARNING [train.py:1204] (2/4) Exclude cut with ID _14632_210_9988_1_1530691279392_3923200_239-232545-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 05:38:20,652 WARNING [train.py:1204] (2/4) Exclude cut with ID _14349_210_8341_1_1530615710818_3442470_153-27432-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 05:38:27,278 INFO [train.py:1046] (2/4) Epoch 22, batch 3850, loss[loss=0.1547, simple_loss=0.2349, pruned_loss=0.03725, over 24464.00 frames. ], tot_loss[loss=0.1729, simple_loss=0.2484, pruned_loss=0.04864, over 4710090.96 frames. ], batch size: 63, lr: 4.65e-03, grad_scale: 8.0 2023-10-02 05:38:27,324 WARNING [train.py:1204] (2/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_37-41919-0_sp0.9 from training. Duration: 0.9666875 2023-10-02 05:38:27,391 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_196-289637-0_sp0.9 from training. Duration: 0.8333125 2023-10-02 05:38:31,601 WARNING [train.py:1204] (2/4) Exclude cut with ID _13678_210_7497_1_1530530069226_3463170_371-3221-0_sp1.1 from training. Duration: 0.8181875 2023-10-02 05:38:31,676 WARNING [train.py:1204] (2/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_557-324258-0 from training. Duration: 0.81 2023-10-02 05:38:33,062 WARNING [train.py:1204] (2/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_1012-279473-0_sp1.1 from training. Duration: 0.6090625 2023-10-02 05:38:33,150 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_66916_210_14656_1_1535725525_326566_7-92649-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 05:38:35,902 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_9460_210_4211_1_1528954587_10049667_480-260269-0_sp1.1 from training. Duration: 0.5090625 2023-10-02 05:38:38,661 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_40896_210_15710_1_1533433956_7100802_121-155001-0_sp1.1 from training. Duration: 0.92725 2023-10-02 05:38:40,038 WARNING [train.py:1204] (2/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_188-171673-0_sp1.1 from training. Duration: 0.52725 2023-10-02 05:38:41,394 WARNING [train.py:1204] (2/4) Exclude cut with ID _47790_210_16187_1_1533862814066_4207519_2-39653-0 from training. Duration: 0.98 2023-10-02 05:38:47,948 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_23850_210_4238_1_1532232597_1632433_112-229012-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 05:38:50,738 WARNING [train.py:1204] (2/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_626-79528-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 05:38:52,202 WARNING [train.py:1204] (2/4) Exclude cut with ID _30304_210_6319_1_1532476827261_7582060_856-90052-0_sp1.1 from training. Duration: 0.963625 2023-10-02 05:38:53,475 WARNING [train.py:1204] (2/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_122-109023-0_sp0.9 from training. Duration: 0.8444375 2023-10-02 05:38:54,957 WARNING [train.py:1204] (2/4) Exclude cut with ID _64414_210_17301_1_1535243252048_3757759_37-70328-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 05:38:55,018 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_20120_210_6753_1_1531643035_5482859_622-344907-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 05:38:55,067 WARNING [train.py:1204] (2/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_410-321956-0_sp1.1 from training. Duration: 0.92725 2023-10-02 05:38:55,078 WARNING [train.py:1204] (2/4) Exclude cut with ID _7562_210_2084_1_1528887767916_3946229_186-326422-0_sp0.9 from training. Duration: 0.9333125 2023-10-02 05:38:57,095 WARNING [train.py:1204] (2/4) Exclude cut with ID _37537_210_2253_1_1533171758568_3421949_127-285192-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 05:38:58,599 WARNING [train.py:1204] (2/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_723-228168-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 05:38:58,669 WARNING [train.py:1204] (2/4) Exclude cut with ID _13678_210_7497_1_1530530069226_3463170_426-246984-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 05:39:00,096 WARNING [train.py:1204] (2/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_399-156791-0_sp0.9 from training. Duration: 0.92225 2023-10-02 05:39:00,172 WARNING [train.py:1204] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_391-22148-0 from training. Duration: 0.77 2023-10-02 05:39:00,200 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_3475_210_2028_1_1526193578_3622569_64-297072-0 from training. Duration: 0.91 2023-10-02 05:39:01,525 WARNING [train.py:1204] (2/4) Exclude cut with ID _17875_210_2087_1_1531224012907_1821339_225-280015-0_sp1.1 from training. Duration: 0.963625 2023-10-02 05:39:01,552 WARNING [train.py:1204] (2/4) Exclude cut with ID _50655_210_18628_1_1534229942193_3772154_325-246169-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 05:39:04,323 WARNING [train.py:1204] (2/4) Exclude cut with ID _29529_210_7234_1_1532393961455_3977460_597-257348-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 05:39:04,358 WARNING [train.py:1204] (2/4) Exclude cut with ID _58895_210_8750_1_1535184081320_3649089_77-173485-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 05:39:05,611 WARNING [train.py:1204] (2/4) Exclude cut with ID _6270_210_2084_1_1527850911573_3856420_26-277343-0 from training. Duration: 0.82 2023-10-02 05:39:07,009 WARNING [train.py:1204] (2/4) Exclude cut with ID _14368_210_4220_1_1530532714870_3595529_144-3287-0 from training. Duration: 0.67 2023-10-02 05:39:09,834 WARNING [train.py:1204] (2/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_293-145278-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 05:39:11,672 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_40047_210_2290_1_1533290519_2374311_257-335162-0 from training. Duration: 0.95 2023-10-02 05:39:13,770 WARNING [train.py:1204] (2/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_619-344499-0_sp0.9 from training. Duration: 0.7 2023-10-02 05:39:19,394 WARNING [train.py:1204] (2/4) Exclude cut with ID _19504_210_9994_1_1531648863242_3325220_456-315379-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 05:39:20,835 WARNING [train.py:1204] (2/4) Exclude cut with ID _55857_210_2346_1_1534644007831_6898209_631-221692-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 05:39:21,214 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.out_combiner.scale_min, batch_count=769560.0, ans=0.2 2023-10-02 05:39:23,678 WARNING [train.py:1204] (2/4) Exclude cut with ID _64631_210_27767_1_1535349580068_4165160_324-3682-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 05:39:25,011 WARNING [train.py:1204] (2/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_317-211107-0 from training. Duration: 0.92 2023-10-02 05:39:27,022 WARNING [train.py:1204] (2/4) Exclude cut with ID _6789_210_1534_1_1527857937218_3118950_311-61262-0 from training. Duration: 0.65 2023-10-02 05:39:28,486 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.conv_module1.balancer1.prob, batch_count=769626.6666666666, ans=0.125 2023-10-02 05:39:29,683 WARNING [train.py:1204] (2/4) Exclude cut with ID _28323_210_16187_1_1532480395667_4100300_365-68882-0_sp1.1 from training. Duration: 0.92725 2023-10-02 05:39:31,048 WARNING [train.py:1204] (2/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_816-181825-0_sp1.1 from training. Duration: 0.92725 2023-10-02 05:39:32,513 WARNING [train.py:1204] (2/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_132-269633-0_sp1.1 from training. Duration: 0.6181875 2023-10-02 05:39:32,523 WARNING [train.py:1204] (2/4) Exclude cut with ID _44585_210_18628_1_1533605350387_4518199_280-73534-0_sp1.1 from training. Duration: 0.8090625 2023-10-02 05:39:33,876 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_68095_210_20185_1_1535718226_8888855_1009-164755-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 05:39:33,958 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_40223_210_6228_1_1533298404_4812267_400-103813-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 05:39:33,959 WARNING [train.py:1204] (2/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_491-261923-0_sp1.1 from training. Duration: 0.8909375 2023-10-02 05:39:33,964 WARNING [train.py:1204] (2/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_712-342563-0 from training. Duration: 0.97 2023-10-02 05:39:35,347 WARNING [train.py:1204] (2/4) Exclude cut with ID _41161_210_20034_1_1533431249657_3555314_175-190032-0_sp1.1 from training. Duration: 0.963625 2023-10-02 05:39:36,739 WARNING [train.py:1204] (2/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_788-298907-0 from training. Duration: 0.89 2023-10-02 05:39:36,763 WARNING [train.py:1204] (2/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_225-70961-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 05:39:36,774 WARNING [train.py:1204] (2/4) Exclude cut with ID _58583_210_5236_1_1534726835266_3574249_235-175994-0_sp1.1 from training. Duration: 0.92725 2023-10-02 05:39:39,454 WARNING [train.py:1204] (2/4) Exclude cut with ID _12302_210_5219_1_1529924446466_8107280_907-182949-0_sp1.1 from training. Duration: 0.836375 2023-10-02 05:39:40,606 INFO [train.py:1046] (2/4) Epoch 22, batch 3900, loss[loss=0.1753, simple_loss=0.2301, pruned_loss=0.0603, over 19439.00 frames. ], tot_loss[loss=0.1723, simple_loss=0.2477, pruned_loss=0.04847, over 4696476.16 frames. ], batch size: 388, lr: 4.65e-03, grad_scale: 8.0 2023-10-02 05:39:40,642 WARNING [train.py:1204] (2/4) Exclude cut with ID _54871_210_22356_1_1534665798253_3856359_94-193692-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 05:39:41,996 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_4528_210_3273_1_1527076654_3276777_342-168068-0_sp0.9 from training. Duration: 0.9666875 2023-10-02 05:39:42,052 WARNING [train.py:1204] (2/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_368-74239-0_sp1.1 from training. Duration: 0.92725 2023-10-02 05:39:42,053 WARNING [train.py:1204] (2/4) Exclude cut with ID _44831_210_13427_1_1533816011549_3689035_291-295928-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 05:39:42,111 WARNING [train.py:1204] (2/4) Exclude cut with ID _56406_210_9288_1_1534899618664_3496149_455-131997-0_sp1.1 from training. Duration: 0.97275 2023-10-02 05:39:43,977 WARNING [train.py:1204] (2/4) Exclude cut with ID _6473_210_1741_1_1527901307396_3815380_108-17335-0 from training. Duration: 0.65 2023-10-02 05:39:44,032 WARNING [train.py:1204] (2/4) Exclude cut with ID _14224_210_4381_1_1530529062037_4264010_286-135600-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 05:39:48,227 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_14056_210_4163_1_1530604483_7572827_928-321675-0_sp1.1 from training. Duration: 0.936375 2023-10-02 05:39:48,436 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.balancer1.prob, batch_count=769693.3333333334, ans=0.125 2023-10-02 05:39:49,658 WARNING [train.py:1204] (2/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_81-300212-0_sp1.1 from training. Duration: 0.5454375 2023-10-02 05:39:49,706 WARNING [train.py:1204] (2/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_29-280622-0_sp0.9 from training. Duration: 0.911125 2023-10-02 05:39:51,184 WARNING [train.py:1204] (2/4) Exclude cut with ID _61079_210_4525_1_1534921008198_4011419_228-176447-0_sp1.1 from training. Duration: 0.936375 2023-10-02 05:39:52,631 WARNING [train.py:1204] (2/4) Exclude cut with ID _8394_210_6702_1_1528590504316_3540189_372-198878-0_sp1.1 from training. Duration: 0.5454375 2023-10-02 05:39:53,826 WARNING [train.py:1204] (2/4) Exclude cut with ID _64521_210_14699_1_1535243427259_3815328_244-208725-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 05:39:55,327 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8275_210_4198_1_1528455320_3892571_491-17707-0_sp0.9 from training. Duration: 0.711125 2023-10-02 05:39:57,283 WARNING [train.py:1204] (2/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_375-245677-0 from training. Duration: 0.81 2023-10-02 05:39:57,290 WARNING [train.py:1204] (2/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_690-286055-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 05:39:59,298 WARNING [train.py:1204] (2/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_89-321955-0 from training. Duration: 0.8 2023-10-02 05:39:59,324 WARNING [train.py:1204] (2/4) Exclude cut with ID _18895_210_6228_1_1531357274234_3784930_213-98721-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 05:40:00,618 WARNING [train.py:1204] (2/4) Exclude cut with ID _19708_210_9206_1_1532136650733_4238364_347-65240-0 from training. Duration: 0.97 2023-10-02 05:40:02,120 WARNING [train.py:1204] (2/4) Exclude cut with ID _64135_210_2253_1_1535677267876_7608972_498-5039-0 from training. Duration: 0.78 2023-10-02 05:40:04,967 WARNING [train.py:1204] (2/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_146-276944-0_sp1.1 from training. Duration: 0.7545625 2023-10-02 05:40:06,300 WARNING [train.py:1204] (2/4) Exclude cut with ID _63171_210_15488_1_1535104792794_3295110_155-34488-0_sp1.1 from training. Duration: 0.97275 2023-10-02 05:40:06,313 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_5438_210_1794_1_1527336687_2600009_233-58072-0_sp0.9 from training. Duration: 0.8555625 2023-10-02 05:40:07,685 WARNING [train.py:1204] (2/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_63-101996-0_sp0.9 from training. Duration: 0.97775 2023-10-02 05:40:13,192 WARNING [train.py:1204] (2/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_271-269084-0_sp1.1 from training. Duration: 0.7545625 2023-10-02 05:40:14,634 WARNING [train.py:1204] (2/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_345-167986-0_sp1.1 from training. Duration: 0.763625 2023-10-02 05:40:19,273 WARNING [train.py:1204] (2/4) Exclude cut with ID _39290_210_4220_1_1533862778760_3075118_92-311600-0_sp1.1 from training. Duration: 0.8 2023-10-02 05:40:19,279 WARNING [train.py:1204] (2/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_209-65306-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 05:40:20,552 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.607e+02 1.829e+02 1.934e+02 2.273e+02 3.864e+02, threshold=3.868e+02, percent-clipped=0.0 2023-10-02 05:40:20,624 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_26433_210_12108_1_1532157158_4056480_317-335807-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 05:40:24,398 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.2.encoder.layers.1.feed_forward1.out_whiten, num_groups=1, num_channels=384, metric=8.68 vs. limit=15.0 2023-10-02 05:40:26,405 WARNING [train.py:1204] (2/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_238-94127-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 05:40:26,447 WARNING [train.py:1204] (2/4) Exclude cut with ID _41314_210_12651_1_1533459476543_3766460_207-132038-0_sp1.1 from training. Duration: 0.8545625 2023-10-02 05:40:33,958 WARNING [train.py:1204] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_167-137255-0_sp1.1 from training. Duration: 0.5818125 2023-10-02 05:40:34,056 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_2009_210_2071_1_1525481134_4588937_385-302100-0_sp0.9 from training. Duration: 0.9666875 2023-10-02 05:40:43,521 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_15517_210_6753_1_1530954008_3487336_253-110617-0_sp1.1 from training. Duration: 0.936375 2023-10-02 05:40:44,993 WARNING [train.py:1204] (2/4) Exclude cut with ID _39290_210_4220_1_1533862778760_3075118_92-311600-0_sp0.9 from training. Duration: 0.97775 2023-10-02 05:40:46,370 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_42-142473-0 from training. Duration: 0.72 2023-10-02 05:40:46,412 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_2296_210_2087_1_1525503448_3397335_57-185430-0 from training. Duration: 0.6 2023-10-02 05:40:48,125 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_11305_210_1794_1_1529750428_7178148_257-312870-0_sp0.9 from training. Duration: 0.97775 2023-10-02 05:40:48,834 WARNING [train.py:1204] (2/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_222-198127-0 from training. Duration: 0.84 2023-10-02 05:40:50,140 WARNING [train.py:1204] (2/4) Exclude cut with ID _65263_210_22356_1_1535763598691_3921037_423-202699-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 05:40:51,471 WARNING [train.py:1204] (2/4) Exclude cut with ID _15037_210_2253_1_1530752294104_3267650_13-130361-0 from training. Duration: 0.99 2023-10-02 05:40:52,373 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.0.layers.1.feed_forward2.out_whiten, num_groups=1, num_channels=192, metric=4.50 vs. limit=15.0 2023-10-02 05:40:54,152 INFO [train.py:1046] (2/4) Epoch 22, batch 3950, loss[loss=0.1667, simple_loss=0.239, pruned_loss=0.04723, over 23506.00 frames. ], tot_loss[loss=0.172, simple_loss=0.2479, pruned_loss=0.04805, over 4710375.80 frames. ], batch size: 134, lr: 4.64e-03, grad_scale: 8.0 2023-10-02 05:40:56,902 WARNING [train.py:1204] (2/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_628-198080-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 05:40:58,234 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_3171_210_2087_1_1526109084_2330007_37-64334-0 from training. Duration: 0.82 2023-10-02 05:40:58,289 WARNING [train.py:1204] (2/4) Exclude cut with ID _5287_210_2084_1_1527418883040_3878670_12-221186-0_sp1.1 from training. Duration: 0.8909375 2023-10-02 05:41:01,933 WARNING [train.py:1204] (2/4) Exclude cut with ID _6653_210_2629_1_1527919172441_6732679_1103-56445-0_sp1.1 from training. Duration: 0.663625 2023-10-02 05:41:03,558 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=770026.6666666666, ans=0.1 2023-10-02 05:41:04,714 WARNING [train.py:1204] (2/4) Exclude cut with ID _65263_210_22356_1_1535763598691_3921037_169-140068-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 05:41:09,235 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.balancer1.prob, batch_count=770093.3333333334, ans=0.125 2023-10-02 05:41:09,266 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.out_combiner.scale_min, batch_count=770093.3333333334, ans=0.2 2023-10-02 05:41:10,430 WARNING [train.py:1204] (2/4) Exclude cut with ID IC0671W0118-331961 from training. Duration: 0.896 2023-10-02 05:41:10,495 WARNING [train.py:1204] (2/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_402-102311-0_sp1.1 from training. Duration: 0.6090625 2023-10-02 05:41:10,523 WARNING [train.py:1204] (2/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_202-293523-0 from training. Duration: 0.72 2023-10-02 05:41:11,793 WARNING [train.py:1204] (2/4) Exclude cut with ID IC0679W0038-335846 from training. Duration: 0.768 2023-10-02 05:41:11,815 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_37357_210_17024_1_1533089504_2316121_79-198866-0_sp1.1 from training. Duration: 0.936375 2023-10-02 05:41:14,745 WARNING [train.py:1204] (2/4) Exclude cut with ID _7606_210_4915_1_1528894253277_5080690_292-271285-0_sp1.1 from training. Duration: 0.963625 2023-10-02 05:41:14,762 WARNING [train.py:1204] (2/4) Exclude cut with ID _3541_210_2477_1_1526475408709_3263973_377-296839-0_sp0.9 from training. Duration: 0.611125 2023-10-02 05:41:14,764 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_45886_210_17024_1_1533704917_2435333_58-131069-0_sp1.1 from training. Duration: 0.963625 2023-10-02 05:41:17,615 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6961_210_2087_1_1527937168_6815011_395-185060-0 from training. Duration: 0.95 2023-10-02 05:41:19,674 WARNING [train.py:1204] (2/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_304-266479-0_sp1.1 from training. Duration: 0.97275 2023-10-02 05:41:21,567 WARNING [train.py:1204] (2/4) Exclude cut with ID _3867_210_1883_1_1526641200575_4475830_319-349850-0_sp1.1 from training. Duration: 0.6090625 2023-10-02 05:41:21,585 WARNING [train.py:1204] (2/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_304-59227-0_sp0.9 from training. Duration: 0.8555625 2023-10-02 05:41:21,636 WARNING [train.py:1204] (2/4) Exclude cut with ID _8021_210_7994_1_1528376451358_3653069_100-140231-0_sp0.9 from training. Duration: 0.7444375 2023-10-02 05:41:22,955 WARNING [train.py:1204] (2/4) Exclude cut with ID _8833_210_5219_1_1528592665210_7553059_329-125156-0_sp0.9 from training. Duration: 0.911125 2023-10-02 05:41:27,532 INFO [scaling.py:1118] (2/4) WithLoss: name=encoder.encoders.4.encoder.layers.2.self_attn_weights, loss-sum=0.000e+00 2023-10-02 05:41:30,616 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.conv_module2.balancer2.min_abs, batch_count=770160.0, ans=0.5 2023-10-02 05:41:31,885 WARNING [train.py:1204] (2/4) Exclude cut with ID _14459_210_2228_1_1531443641946_7516700_792-287234-0_sp1.1 from training. Duration: 0.92725 2023-10-02 05:41:33,703 WARNING [train.py:1204] (2/4) Exclude cut with ID _19708_210_9206_1_1532136650733_4238364_347-65240-0_sp1.1 from training. Duration: 0.8818125 2023-10-02 05:41:34,037 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.conv_module1.balancer1.prob, batch_count=770160.0, ans=0.125 2023-10-02 05:41:38,017 WARNING [train.py:1204] (2/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_70-27732-0 from training. Duration: 0.94 2023-10-02 05:41:43,425 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_397-132565-0 from training. Duration: 0.99 2023-10-02 05:41:43,427 WARNING [train.py:1204] (2/4) Exclude cut with ID _49728_210_12651_1_1534041034294_6788150_624-195301-0 from training. Duration: 0.85 2023-10-02 05:41:43,460 WARNING [train.py:1204] (2/4) Exclude cut with ID _55429_210_10626_1_1534467588527_7174143_377-169391-0_sp1.1 from training. Duration: 0.87275 2023-10-02 05:41:44,897 WARNING [train.py:1204] (2/4) Exclude cut with ID _49376_210_5986_1_1533985278732_3370740_354-344695-0_sp1.1 from training. Duration: 0.9 2023-10-02 05:41:45,035 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.conv_module2.balancer1.prob, batch_count=770226.6666666666, ans=0.125 2023-10-02 05:41:45,130 INFO [scaling.py:1118] (2/4) WithLoss: name=encoder.encoders.4.encoder.layers.1.self_attn_weights, loss-sum=0.000e+00 2023-10-02 05:41:52,974 WARNING [train.py:1204] (2/4) Exclude cut with ID _4697_210_2069_1_1526815801953_3823098_276-96593-0_sp1.1 from training. Duration: 0.7 2023-10-02 05:41:52,990 WARNING [train.py:1204] (2/4) Exclude cut with ID _15012_210_1968_1_1530856820429_5095350_688-178849-0_sp0.9 from training. Duration: 0.711125 2023-10-02 05:41:54,471 WARNING [train.py:1204] (2/4) Exclude cut with ID _25565_210_12392_1_1532423009382_3952098_372-69023-0_sp1.1 from training. Duration: 0.936375 2023-10-02 05:41:54,495 WARNING [train.py:1204] (2/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_61-327062-0_sp1.1 from training. Duration: 0.736375 2023-10-02 05:41:54,531 WARNING [train.py:1204] (2/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_215-141059-0 from training. Duration: 0.62 2023-10-02 05:41:58,654 WARNING [train.py:1204] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_6-226750-0_sp1.1 from training. Duration: 0.8545625 2023-10-02 05:42:00,726 WARNING [train.py:1204] (2/4) Exclude cut with ID _14348_210_8341_1_1530599434996_3778640_240-232370-0_sp1.1 from training. Duration: 0.763625 2023-10-02 05:42:04,920 WARNING [train.py:1204] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_468-189058-0 from training. Duration: 0.96 2023-10-02 05:42:09,558 INFO [train.py:1046] (2/4) Epoch 22, batch 4000, loss[loss=0.2029, simple_loss=0.2809, pruned_loss=0.06245, over 24324.00 frames. ], tot_loss[loss=0.1727, simple_loss=0.2488, pruned_loss=0.04833, over 4711269.94 frames. ], batch size: 77, lr: 4.64e-03, grad_scale: 16.0 2023-10-02 05:42:12,110 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.0.layers.1.feed_forward2.out_whiten, num_groups=1, num_channels=192, metric=4.45 vs. limit=15.0 2023-10-02 05:42:14,067 WARNING [train.py:1204] (2/4) Exclude cut with ID _21987_210_5854_1_1531821500482_3637038_247-93340-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 05:42:20,894 WARNING [train.py:1204] (2/4) Exclude cut with ID _66868_210_9527_1_1535509287954_4561140_436-191602-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 05:42:26,064 WARNING [train.py:1204] (2/4) Exclude cut with ID _18895_210_6228_1_1531357274234_3784930_394-58203-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 05:42:26,121 WARNING [train.py:1204] (2/4) Exclude cut with ID _58605_210_12210_1_1534723195939_3904259_258-281498-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 05:42:27,375 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_33348_210_5632_1_1533086912_3324030_245-292752-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 05:42:27,396 WARNING [train.py:1204] (2/4) Exclude cut with ID _67831_210_12392_1_1535612464202_3092612_346-2148-0 from training. Duration: 0.93 2023-10-02 05:42:27,451 WARNING [train.py:1204] (2/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_369-222060-0_sp0.9 from training. Duration: 0.788875 2023-10-02 05:42:27,675 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.bypass_mid.scale_min, batch_count=770426.6666666666, ans=0.2 2023-10-02 05:42:28,789 WARNING [train.py:1204] (2/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_245-174553-0 from training. Duration: 0.88 2023-10-02 05:42:28,795 WARNING [train.py:1204] (2/4) Exclude cut with ID _3541_210_2477_1_1526475408709_3263973_231-104736-0_sp1.1 from training. Duration: 0.7090625 2023-10-02 05:42:28,809 WARNING [train.py:1204] (2/4) Exclude cut with ID _44585_210_18628_1_1533605350387_4518199_280-73534-0 from training. Duration: 0.89 2023-10-02 05:42:30,330 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=770426.6666666666, ans=0.1 2023-10-02 05:42:32,090 WARNING [train.py:1204] (2/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_22-201476-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 05:42:34,852 WARNING [train.py:1204] (2/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_325-62802-0_sp0.9 from training. Duration: 0.9666875 2023-10-02 05:42:34,864 WARNING [train.py:1204] (2/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_199-274062-0_sp1.1 from training. Duration: 0.97275 2023-10-02 05:42:34,867 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_8-319957-0_sp1.1 from training. Duration: 0.87275 2023-10-02 05:42:34,888 WARNING [train.py:1204] (2/4) Exclude cut with ID _29343_210_4381_1_1532602757752_3674109_224-349726-0_sp1.1 from training. Duration: 0.936375 2023-10-02 05:42:34,892 WARNING [train.py:1204] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_520-126729-0_sp1.1 from training. Duration: 0.636375 2023-10-02 05:42:36,368 WARNING [train.py:1204] (2/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_239-22076-0_sp1.1 from training. Duration: 0.77275 2023-10-02 05:42:36,611 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.conv_module1.balancer1.prob, batch_count=770426.6666666666, ans=0.125 2023-10-02 05:42:38,219 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9012W0011-501146 from training. Duration: 0.896 2023-10-02 05:42:38,301 WARNING [train.py:1204] (2/4) Exclude cut with ID _29857_210_20034_1_1532395886753_3798160_279-185801-0_sp1.1 from training. Duration: 0.82725 2023-10-02 05:42:39,600 WARNING [train.py:1204] (2/4) Exclude cut with ID _43552_210_8913_1_1533726031059_3523429_261-223933-0_sp1.1 from training. Duration: 0.92725 2023-10-02 05:42:41,042 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9029W0025-509426 from training. Duration: 0.896 2023-10-02 05:42:42,304 WARNING [train.py:1204] (2/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_337-181502-0_sp1.1 from training. Duration: 0.5909375 2023-10-02 05:42:42,323 WARNING [train.py:1204] (2/4) Exclude cut with ID _68635_210_9317_1_1535770813410_4315870_159-332599-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 05:42:43,733 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.conv_skip_rate, batch_count=770493.3333333334, ans=0.0 2023-10-02 05:42:47,668 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_161-276996-0 from training. Duration: 0.95 2023-10-02 05:42:48,375 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.2.encoder.layers.2.self_attn_weights.whiten_keys, num_groups=4, num_channels=128, metric=3.34 vs. limit=6.0 2023-10-02 05:42:48,839 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.409e+02 1.823e+02 2.125e+02 2.454e+02 3.151e+02, threshold=4.250e+02, percent-clipped=0.0 2023-10-02 05:42:48,985 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7088_210_1874_1_1527939776_1101113_127-68870-0_sp1.1 from training. Duration: 0.936375 2023-10-02 05:42:51,848 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.self_attn_weights.pos_emb_skip_rate, batch_count=770560.0, ans=0.0 2023-10-02 05:42:53,456 WARNING [train.py:1204] (2/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_322-217220-0_sp1.1 from training. Duration: 0.7818125 2023-10-02 05:42:54,729 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9072W0002-530636 from training. Duration: 0.768 2023-10-02 05:42:56,200 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_340-336408-0_sp1.1 from training. Duration: 0.8454375 2023-10-02 05:42:57,993 WARNING [train.py:1204] (2/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_395-37222-0 from training. Duration: 0.76 2023-10-02 05:42:57,998 WARNING [train.py:1204] (2/4) Exclude cut with ID _64099_210_17301_1_1535526977656_1578188_159-209156-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 05:42:59,372 WARNING [train.py:1204] (2/4) Exclude cut with ID _53950_210_5816_1_1534417264919_3862951_28-29681-0_sp1.1 from training. Duration: 0.92725 2023-10-02 05:42:59,449 WARNING [train.py:1204] (2/4) Exclude cut with ID _4681_210_3525_1_1527250689464_120889_15-126760-0_sp1.1 from training. Duration: 0.736375 2023-10-02 05:43:00,847 WARNING [train.py:1204] (2/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_242-3472-0_sp1.1 from training. Duration: 0.663625 2023-10-02 05:43:00,879 WARNING [train.py:1204] (2/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_436-186311-0_sp0.9 from training. Duration: 0.72225 2023-10-02 05:43:02,091 WARNING [train.py:1204] (2/4) Exclude cut with ID _3675_210_4557_1_1526639934408_4182089_452-172147-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 05:43:02,212 WARNING [train.py:1204] (2/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_80-147301-0 from training. Duration: 0.84 2023-10-02 05:43:02,241 WARNING [train.py:1204] (2/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_351-107588-0_sp1.1 from training. Duration: 0.92725 2023-10-02 05:43:04,033 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9111W0002-550781 from training. Duration: 0.896 2023-10-02 05:43:08,306 WARNING [train.py:1204] (2/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_465-156500-0_sp1.1 from training. Duration: 0.6545625 2023-10-02 05:43:12,924 WARNING [train.py:1204] (2/4) Exclude cut with ID _13938_210_9988_1_1530518718978_3693960_304-112784-0_sp1.1 from training. Duration: 0.4181875 2023-10-02 05:43:14,358 WARNING [train.py:1204] (2/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_202-67017-0_sp1.1 from training. Duration: 0.7454375 2023-10-02 05:43:14,398 WARNING [train.py:1204] (2/4) Exclude cut with ID _58414_210_8254_1_1535421609162_7151182_448-4358-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 05:43:14,692 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.feed_forward2.hidden_balancer.prob, batch_count=770626.6666666666, ans=0.125 2023-10-02 05:43:15,797 WARNING [train.py:1204] (2/4) Exclude cut with ID _66840_210_5219_1_1535452168426_4196349_203-243234-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 05:43:15,895 WARNING [train.py:1204] (2/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_201-213513-0_sp1.1 from training. Duration: 0.97275 2023-10-02 05:43:20,318 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6291_210_3273_1_1527683649_1078498_117-187560-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 05:43:22,901 INFO [train.py:1046] (2/4) Epoch 22, batch 4050, loss[loss=0.2236, simple_loss=0.2833, pruned_loss=0.082, over 19404.00 frames. ], tot_loss[loss=0.173, simple_loss=0.2491, pruned_loss=0.04847, over 4709383.70 frames. ], batch size: 388, lr: 4.64e-03, grad_scale: 16.0 2023-10-02 05:43:24,789 WARNING [train.py:1204] (2/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_135-174080-0_sp0.9 from training. Duration: 0.588875 2023-10-02 05:43:26,194 WARNING [train.py:1204] (2/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_45-265684-0 from training. Duration: 0.72 2023-10-02 05:43:26,963 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_435-313305-0_sp1.1 from training. Duration: 0.6454375 2023-10-02 05:43:28,208 WARNING [train.py:1204] (2/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_235-307337-0_sp1.1 from training. Duration: 0.8545625 2023-10-02 05:43:28,314 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7762_210_1794_1_1528199508_4057408_419-125009-0_sp0.9 from training. Duration: 0.92225 2023-10-02 05:43:29,681 WARNING [train.py:1204] (2/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_215-141059-0_sp1.1 from training. Duration: 0.563625 2023-10-02 05:43:29,761 WARNING [train.py:1204] (2/4) Exclude cut with ID _27013_210_1389_1_1532437173240_3252649_222-125573-0_sp1.1 from training. Duration: 0.8909375 2023-10-02 05:43:29,952 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.conv_module1.balancer1.prob, batch_count=770693.3333333334, ans=0.125 2023-10-02 05:43:33,941 WARNING [train.py:1204] (2/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_126-274694-0_sp1.1 from training. Duration: 0.8909375 2023-10-02 05:43:38,515 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_2592_210_2290_1_1525697025_4882925_157-70512-0_sp1.1 from training. Duration: 0.82725 2023-10-02 05:43:39,873 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_292-265928-0_sp0.9 from training. Duration: 0.5666875 2023-10-02 05:43:41,897 WARNING [train.py:1204] (2/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_1211-226066-0_sp1.1 from training. Duration: 0.6909375 2023-10-02 05:43:43,211 WARNING [train.py:1204] (2/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_394-265747-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 05:43:46,118 WARNING [train.py:1204] (2/4) Exclude cut with ID _48782_210_12658_1_1534467701544_2300422_239-67139-0_sp1.1 from training. Duration: 0.97275 2023-10-02 05:43:46,244 WARNING [train.py:1204] (2/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_329-113173-0_sp1.1 from training. Duration: 0.563625 2023-10-02 05:43:49,121 WARNING [train.py:1204] (2/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_81-75849-0_sp0.9 from training. Duration: 0.4333125 2023-10-02 05:43:50,507 WARNING [train.py:1204] (2/4) Exclude cut with ID _9235_210_5219_1_1528891154253_8046120_1003-101638-0 from training. Duration: 0.73 2023-10-02 05:43:51,806 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9309W0189-640316 from training. Duration: 0.64 2023-10-02 05:43:54,552 WARNING [train.py:1204] (2/4) Exclude cut with ID _45329_210_7005_1_1534157951211_6774339_503-287883-0_sp1.1 from training. Duration: 0.8 2023-10-02 05:44:02,475 WARNING [train.py:1204] (2/4) Exclude cut with ID _8833_210_5219_1_1528592665210_7553059_611-80313-0 from training. Duration: 0.64 2023-10-02 05:44:03,796 WARNING [train.py:1204] (2/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_335-106626-0_sp1.1 from training. Duration: 0.963625 2023-10-02 05:44:08,080 WARNING [train.py:1204] (2/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_237-44332-0_sp1.1 from training. Duration: 0.8545625 2023-10-02 05:44:11,519 WARNING [train.py:1204] (2/4) Exclude cut with ID _46952_210_5986_1_1534230010357_3107379_178-147341-0_sp1.1 from training. Duration: 0.97275 2023-10-02 05:44:11,569 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_13091_210_7925_1_1530609447_8987354_946-154426-0_sp1.1 from training. Duration: 0.936375 2023-10-02 05:44:12,876 WARNING [train.py:1204] (2/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_326-17061-0_sp1.1 from training. Duration: 0.8545625 2023-10-02 05:44:16,140 WARNING [train.py:1204] (2/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_38-169920-0_sp1.1 from training. Duration: 0.82725 2023-10-02 05:44:17,680 WARNING [train.py:1204] (2/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_283-220372-0 from training. Duration: 0.56 2023-10-02 05:44:17,691 WARNING [train.py:1204] (2/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_252-216674-0_sp1.1 from training. Duration: 0.4909375 2023-10-02 05:44:17,855 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.0.conv_module1.balancer2.min_positive, batch_count=770893.3333333334, ans=0.05 2023-10-02 05:44:19,120 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_202-91290-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 05:44:20,460 WARNING [train.py:1204] (2/4) Exclude cut with ID _41314_210_12651_1_1533459476543_3766460_80-74184-0 from training. Duration: 0.83 2023-10-02 05:44:24,607 WARNING [train.py:1204] (2/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_411-271358-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 05:44:29,723 INFO [scaling.py:1118] (2/4) WithLoss: name=encoder.encoders.2.encoder.layers.2.self_attn_weights, loss-sum=0.000e+00 2023-10-02 05:44:29,737 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.ff2_skip_rate, batch_count=770960.0, ans=0.0 2023-10-02 05:44:30,865 WARNING [train.py:1204] (2/4) Exclude cut with ID _68777_210_4353_1_1535810390731_3117904_129-166012-0 from training. Duration: 0.85 2023-10-02 05:44:32,799 WARNING [train.py:1204] (2/4) Exclude cut with ID _44982_210_1511_1_1534312820108_3726029_410-109759-0_sp1.1 from training. Duration: 0.963625 2023-10-02 05:44:32,804 WARNING [train.py:1204] (2/4) Exclude cut with ID _14224_210_4381_1_1530529062037_4264010_364-22817-0_sp1.1 from training. Duration: 0.6454375 2023-10-02 05:44:33,023 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.feed_forward1.hidden_balancer.prob, batch_count=770960.0, ans=0.125 2023-10-02 05:44:35,533 WARNING [train.py:1204] (2/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_89-242335-0 from training. Duration: 0.95 2023-10-02 05:44:35,540 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_12868_210_6753_1_1530702479_3610564_151-325626-0 from training. Duration: 0.97 2023-10-02 05:44:35,541 WARNING [train.py:1204] (2/4) Exclude cut with ID _6275_210_2084_1_1528110062213_7669049_786-264332-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 05:44:38,147 INFO [train.py:1046] (2/4) Epoch 22, batch 4100, loss[loss=0.2295, simple_loss=0.2925, pruned_loss=0.08329, over 19517.00 frames. ], tot_loss[loss=0.1751, simple_loss=0.2511, pruned_loss=0.04956, over 4697401.25 frames. ], batch size: 388, lr: 4.64e-03, grad_scale: 16.0 2023-10-02 05:44:38,241 WARNING [train.py:1204] (2/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_35-153886-0_sp1.1 from training. Duration: 0.763625 2023-10-02 05:44:39,632 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_24007_210_1794_1_1531918925_1663875_116-100916-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 05:44:39,646 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_162-262717-0_sp1.1 from training. Duration: 0.7545625 2023-10-02 05:44:41,301 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.balancer2.prob, batch_count=771026.6666666666, ans=0.125 2023-10-02 05:44:47,445 WARNING [train.py:1204] (2/4) Exclude cut with ID _8263_210_1664_1_1528438953494_294100_40-204731-0 from training. Duration: 0.5 2023-10-02 05:44:47,714 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.self_attn_weights.pos_emb_skip_rate, batch_count=771026.6666666666, ans=0.0 2023-10-02 05:44:48,866 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_12868_210_6753_1_1530702479_3610564_416-293746-0 from training. Duration: 0.9 2023-10-02 05:44:50,323 WARNING [train.py:1204] (2/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_605-219732-0 from training. Duration: 0.94 2023-10-02 05:44:51,755 WARNING [train.py:1204] (2/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_454-123002-0 from training. Duration: 0.69 2023-10-02 05:44:51,767 WARNING [train.py:1204] (2/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_25-304273-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 05:44:51,814 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6860_210_4852_1_1527917457_3833327_451-39702-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 05:44:53,105 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_304-16505-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 05:44:53,130 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6291_210_3273_1_1527683649_1078498_4-266348-0_sp1.1 from training. Duration: 0.7090625 2023-10-02 05:44:54,572 WARNING [train.py:1204] (2/4) Exclude cut with ID ID0149W0001-753701 from training. Duration: 0.989 2023-10-02 05:44:54,768 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.attention_skip_rate, batch_count=771093.3333333334, ans=0.0 2023-10-02 05:44:57,305 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_41251_210_15368_1_1533429767_4820695_394-55393-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 05:44:57,374 WARNING [train.py:1204] (2/4) Exclude cut with ID _8793_210_6310_1_1528542985139_2310009_121-179782-0_sp1.1 from training. Duration: 0.72725 2023-10-02 05:44:57,386 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_2729_210_3528_1_1525863505_3882214_421-73047-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 05:44:59,332 WARNING [train.py:1204] (2/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_278-154492-0_sp0.9 from training. Duration: 0.8444375 2023-10-02 05:44:59,853 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.4.encoder.layers.1.feed_forward1.out_whiten, num_groups=1, num_channels=384, metric=10.98 vs. limit=15.0 2023-10-02 05:45:00,716 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.feed_forward1.hidden_balancer.prob, batch_count=771093.3333333334, ans=0.125 2023-10-02 05:45:01,880 WARNING [train.py:1204] (2/4) Exclude cut with ID _14170_210_5219_1_1530525949868_4469650_412-317077-0_sp0.9 from training. Duration: 0.8333125 2023-10-02 05:45:03,870 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_727-138317-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 05:45:03,908 WARNING [train.py:1204] (2/4) Exclude cut with ID _25808_210_13292_1_1532343514562_7366109_345-335048-0_sp1.1 from training. Duration: 0.92725 2023-10-02 05:45:05,121 WARNING [train.py:1204] (2/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_506-23200-0 from training. Duration: 0.97 2023-10-02 05:45:05,208 WARNING [train.py:1204] (2/4) Exclude cut with ID _44718_210_5816_1_1534050051869_6469030_643-245187-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 05:45:05,211 WARNING [train.py:1204] (2/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_42-186162-0_sp1.1 from training. Duration: 0.836375 2023-10-02 05:45:05,223 WARNING [train.py:1204] (2/4) Exclude cut with ID _3231_210_2106_1_1526205864934_6784220_171-256004-0_sp1.1 from training. Duration: 0.7818125 2023-10-02 05:45:05,240 WARNING [train.py:1204] (2/4) Exclude cut with ID _25914_210_6400_1_1532141317058_644740_43-248478-0_sp1.1 from training. Duration: 0.936375 2023-10-02 05:45:06,617 WARNING [train.py:1204] (2/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_577-287150-0 from training. Duration: 0.82 2023-10-02 05:45:09,447 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_46015_210_7234_1_1533863319_3087837_376-276068-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 05:45:09,558 WARNING [train.py:1204] (2/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_51-200220-0 from training. Duration: 0.56 2023-10-02 05:45:10,935 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_452-96101-0_sp1.1 from training. Duration: 0.963625 2023-10-02 05:45:12,852 WARNING [train.py:1204] (2/4) Exclude cut with ID _13252_210_1925_1_1530671372047_3336909_263-168388-0_sp1.1 from training. Duration: 0.7818125 2023-10-02 05:45:12,853 WARNING [train.py:1204] (2/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_469-94673-0 from training. Duration: 0.79 2023-10-02 05:45:14,266 WARNING [train.py:1204] (2/4) Exclude cut with ID _14922_210_6228_1_1530707435273_3693310_303-228738-0_sp1.1 from training. Duration: 0.763625 2023-10-02 05:45:14,771 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.4.encoder.layers.2.nonlin_attention.whiten2, num_groups=1, num_channels=384, metric=3.49 vs. limit=15.0 2023-10-02 05:45:16,183 WARNING [train.py:1204] (2/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_262-250826-0_sp1.1 from training. Duration: 0.9 2023-10-02 05:45:17,487 WARNING [train.py:1204] (2/4) Exclude cut with ID _68777_210_4353_1_1535810390731_3117904_129-166012-0_sp1.1 from training. Duration: 0.77275 2023-10-02 05:45:18,693 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.490e+02 1.910e+02 2.108e+02 2.331e+02 3.295e+02, threshold=4.216e+02, percent-clipped=0.0 2023-10-02 05:45:18,835 WARNING [train.py:1204] (2/4) Exclude cut with ID _58895_210_8750_1_1535184081320_3649089_265-323810-0 from training. Duration: 0.96 2023-10-02 05:45:19,116 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.conv_module1.balancer1.prob, batch_count=771160.0, ans=0.125 2023-10-02 05:45:20,207 WARNING [train.py:1204] (2/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_120-76843-0_sp0.9 from training. Duration: 0.611125 2023-10-02 05:45:21,668 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7544_210_6753_1_1528587722_3576686_295-258479-0_sp0.9 from training. Duration: 0.9333125 2023-10-02 05:45:23,106 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7046_210_6753_1_1527895337_6920917_783-278109-0 from training. Duration: 0.77 2023-10-02 05:45:24,360 WARNING [train.py:1204] (2/4) Exclude cut with ID _42326_210_7770_1_1533814282531_3707269_113-308323-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 05:45:24,415 WARNING [train.py:1204] (2/4) Exclude cut with ID _7852_210_5219_1_1528286471316_8139400_242-105336-0_sp0.9 from training. Duration: 0.8 2023-10-02 05:45:27,261 WARNING [train.py:1204] (2/4) Exclude cut with ID _56235_210_10640_1_1534557335235_4161099_114-277905-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 05:45:30,539 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.0.bypass.scale_min, batch_count=771226.6666666666, ans=0.2 2023-10-02 05:45:34,619 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_13118_210_7925_1_1530755612_7608297_42-66683-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 05:45:37,544 WARNING [train.py:1204] (2/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_901-79438-0_sp1.1 from training. Duration: 0.8545625 2023-10-02 05:45:37,690 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.conv_module1.balancer2.prob, batch_count=771293.3333333334, ans=0.125 2023-10-02 05:45:39,378 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_30280_210_1881_1_1532872779_3660139_211-182194-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 05:45:45,657 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.feed_forward1.hidden_balancer.prob, batch_count=771293.3333333334, ans=0.125 2023-10-02 05:45:46,836 WARNING [train.py:1204] (2/4) Exclude cut with ID _14898_210_9206_1_1530766812377_4177190_621-45732-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 05:45:46,849 WARNING [train.py:1204] (2/4) Exclude cut with ID _35350_210_16040_1_1533040191183_4075299_224-133775-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 05:45:51,437 WARNING [train.py:1204] (2/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_147-293311-0_sp1.1 from training. Duration: 0.8545625 2023-10-02 05:45:51,572 WARNING [train.py:1204] (2/4) Exclude cut with ID _12302_210_5219_1_1529924446466_8107280_835-279754-0_sp1.1 from training. Duration: 0.8181875 2023-10-02 05:45:52,888 INFO [train.py:1046] (2/4) Epoch 22, batch 4150, loss[loss=0.1637, simple_loss=0.2381, pruned_loss=0.04468, over 18309.00 frames. ], tot_loss[loss=0.1745, simple_loss=0.2508, pruned_loss=0.04916, over 4702349.64 frames. ], batch size: 40, lr: 4.64e-03, grad_scale: 16.0 2023-10-02 05:45:53,203 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.conv_module2.balancer1.max_abs, batch_count=771360.0, ans=10.0 2023-10-02 05:45:54,433 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8453_210_4211_1_1528502940_8562730_347-330673-0_sp0.9 from training. Duration: 0.8 2023-10-02 05:45:55,795 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_213-103282-0_sp0.9 from training. Duration: 0.8666875 2023-10-02 05:45:55,869 WARNING [train.py:1204] (2/4) Exclude cut with ID _49728_210_12651_1_1534041034294_6788150_66-43496-0_sp1.1 from training. Duration: 0.97275 2023-10-02 05:45:55,871 WARNING [train.py:1204] (2/4) Exclude cut with ID _19023_210_4353_1_1531378892552_5820780_28-36615-0_sp1.1 from training. Duration: 0.8818125 2023-10-02 05:45:59,334 WARNING [train.py:1204] (2/4) Exclude cut with ID _14349_210_8341_1_1530615710818_3442470_115-285975-0 from training. Duration: 0.86 2023-10-02 05:45:59,371 WARNING [train.py:1204] (2/4) Exclude cut with ID _12734_210_8341_1_1530183614296_3891929_236-47464-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 05:46:00,464 WARNING [train.py:1204] (2/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_177-236809-0 from training. Duration: 0.49 2023-10-02 05:46:00,526 WARNING [train.py:1204] (2/4) Exclude cut with ID _13678_210_7497_1_1530530069226_3463170_371-3221-0 from training. Duration: 0.9 2023-10-02 05:46:00,539 WARNING [train.py:1204] (2/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_48-338976-0 from training. Duration: 0.89 2023-10-02 05:46:03,205 WARNING [train.py:1204] (2/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_1167-309744-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 05:46:06,144 WARNING [train.py:1204] (2/4) Exclude cut with ID _30327_210_16049_1_1532484161295_3924169_401-70819-0_sp0.9 from training. Duration: 0.9555625 2023-10-02 05:46:06,151 WARNING [train.py:1204] (2/4) Exclude cut with ID _55134_210_21982_1_1534590028377_3882170_346-91022-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 05:46:10,883 WARNING [train.py:1204] (2/4) Exclude cut with ID _14459_210_2228_1_1531443641946_7516700_174-118801-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 05:46:12,815 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_269-278006-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 05:46:14,103 WARNING [train.py:1204] (2/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_390-339137-0_sp0.9 from training. Duration: 0.62225 2023-10-02 05:46:14,249 WARNING [train.py:1204] (2/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_253-308377-0_sp1.1 from training. Duration: 0.4454375 2023-10-02 05:46:14,260 WARNING [train.py:1204] (2/4) Exclude cut with ID _23825_210_13449_1_1531915313526_3497150_240-271367-0_sp1.1 from training. Duration: 0.8818125 2023-10-02 05:46:16,246 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1015-343884-0_sp0.9 from training. Duration: 0.688875 2023-10-02 05:46:20,459 WARNING [train.py:1204] (2/4) Exclude cut with ID _23514_210_1389_1_1531832139785_3031439_335-48210-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 05:46:23,408 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_309-65177-0_sp1.1 from training. Duration: 0.8 2023-10-02 05:46:24,689 WARNING [train.py:1204] (2/4) Exclude cut with ID _14368_210_4220_1_1530532714870_3595529_231-137892-0 from training. Duration: 0.99 2023-10-02 05:46:26,094 WARNING [train.py:1204] (2/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_57-270037-0 from training. Duration: 0.77 2023-10-02 05:46:26,098 WARNING [train.py:1204] (2/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_465-214726-0_sp1.1 from training. Duration: 0.7818125 2023-10-02 05:46:28,897 WARNING [train.py:1204] (2/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_106-300985-0 from training. Duration: 0.9 2023-10-02 05:46:28,902 WARNING [train.py:1204] (2/4) Exclude cut with ID _14632_210_9988_1_1530691279392_3923200_161-129530-0_sp1.1 from training. Duration: 0.87275 2023-10-02 05:46:28,925 WARNING [train.py:1204] (2/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_514-88370-0_sp1.1 from training. Duration: 0.963625 2023-10-02 05:46:32,064 WARNING [train.py:1204] (2/4) Exclude cut with ID _56495_210_4234_1_1534748080334_4136219_305-334505-0_sp1.1 from training. Duration: 0.92725 2023-10-02 05:46:33,424 WARNING [train.py:1204] (2/4) Exclude cut with ID _42875_210_14856_1_1533704397487_4012433_61-264914-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 05:46:36,309 WARNING [train.py:1204] (2/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_91-148831-0 from training. Duration: 0.99 2023-10-02 05:46:37,931 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.conv_module1.balancer1.prob, batch_count=771560.0, ans=0.125 2023-10-02 05:46:39,546 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_435-313305-0_sp0.9 from training. Duration: 0.788875 2023-10-02 05:46:40,969 WARNING [train.py:1204] (2/4) Exclude cut with ID _16744_210_5986_1_1531724405384_3907567_242-111316-0_sp1.1 from training. Duration: 0.8454375 2023-10-02 05:46:41,042 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_257-20733-0 from training. Duration: 0.76 2023-10-02 05:46:42,357 WARNING [train.py:1204] (2/4) Exclude cut with ID _8064_210_5399_1_1528372299291_4282973_4-91037-0_sp1.1 from training. Duration: 0.8 2023-10-02 05:46:44,422 WARNING [train.py:1204] (2/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_3-276701-0 from training. Duration: 0.91 2023-10-02 05:46:46,299 WARNING [train.py:1204] (2/4) Exclude cut with ID _3992_210_1747_1_1526644816031_3731060_36-147855-0_sp1.1 from training. Duration: 0.7090625 2023-10-02 05:46:47,740 WARNING [train.py:1204] (2/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_1296-298671-0_sp1.1 from training. Duration: 0.963625 2023-10-02 05:46:48,210 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.5.encoder.layers.1.self_attn_weights.whiten_keys, num_groups=4, num_channels=128, metric=1.88 vs. limit=6.0 2023-10-02 05:46:49,140 WARNING [train.py:1204] (2/4) Exclude cut with ID _68965_210_1883_1_1535698962272_3497490_309-334593-0_sp1.1 from training. Duration: 0.92725 2023-10-02 05:46:50,575 WARNING [train.py:1204] (2/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_128-296581-0 from training. Duration: 0.74 2023-10-02 05:46:50,575 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_17613_210_2151_1_1531137569_2366160_170-299079-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 05:46:50,577 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6287_210_3273_1_1527509506_2653716_182-199887-0_sp1.1 from training. Duration: 0.463625 2023-10-02 05:46:51,366 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.feed_forward3.out_whiten.whitening_limit, batch_count=771560.0, ans=15.0 2023-10-02 05:46:51,990 WARNING [train.py:1204] (2/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_80-135200-0_sp1.1 from training. Duration: 0.7181875 2023-10-02 05:46:54,846 WARNING [train.py:1204] (2/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_34-183685-0 from training. Duration: 0.98 2023-10-02 05:46:56,125 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8278_210_2550_1_1528637099_4220413_418-210155-0_sp1.1 from training. Duration: 0.92725 2023-10-02 05:46:56,129 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8853_210_3273_1_1528633899_818968_51-187685-0_sp0.9 from training. Duration: 0.7666875 2023-10-02 05:46:56,149 WARNING [train.py:1204] (2/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_332-221057-0_sp1.1 from training. Duration: 0.6181875 2023-10-02 05:46:56,223 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_2221_210_1988_1_1525597051_4935541_415-296882-0 from training. Duration: 0.77 2023-10-02 05:46:56,242 WARNING [train.py:1204] (2/4) Exclude cut with ID _33640_210_2655_1_1533117605479_4855469_770-278185-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 05:46:56,267 WARNING [train.py:1204] (2/4) Exclude cut with ID _5287_210_2084_1_1527418883040_3878670_333-48508-0_sp1.1 from training. Duration: 0.5454375 2023-10-02 05:46:57,577 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_43802_210_2290_1_1533516203_8829504_142-276839-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 05:46:59,092 WARNING [train.py:1204] (2/4) Exclude cut with ID _28394_210_8913_1_1532430049518_3650118_126-139371-0_sp1.1 from training. Duration: 0.92725 2023-10-02 05:46:59,763 WARNING [train.py:1204] (2/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_76-203867-0 from training. Duration: 0.72 2023-10-02 05:47:00,870 WARNING [train.py:1204] (2/4) Exclude cut with ID _6194_210_1534_1_1527511826160_3941194_14-282876-0_sp0.9 from training. Duration: 0.788875 2023-10-02 05:47:06,423 WARNING [train.py:1204] (2/4) Exclude cut with ID _35355_210_16040_1_1533212988977_4016229_74-190076-0_sp1.1 from training. Duration: 0.72725 2023-10-02 05:47:06,565 WARNING [train.py:1204] (2/4) Exclude cut with ID _33920_210_4220_1_1533257851500_3084399_198-334063-0 from training. Duration: 0.94 2023-10-02 05:47:07,378 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.2.encoder.layers.2.nonlin_attention.whiten1, num_groups=1, num_channels=288, metric=4.96 vs. limit=10.0 2023-10-02 05:47:08,510 INFO [train.py:1046] (2/4) Epoch 22, batch 4200, loss[loss=0.1591, simple_loss=0.2367, pruned_loss=0.04074, over 24452.00 frames. ], tot_loss[loss=0.1738, simple_loss=0.2496, pruned_loss=0.049, over 4698549.53 frames. ], batch size: 58, lr: 4.64e-03, grad_scale: 16.0 2023-10-02 05:47:08,761 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.0.conv_module2.balancer1.prob, batch_count=771693.3333333334, ans=0.125 2023-10-02 05:47:09,912 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6291_210_3273_1_1527683649_1078498_4-266348-0_sp0.9 from training. Duration: 0.8666875 2023-10-02 05:47:12,596 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_23856_210_4238_1_1531994785_3087934_122-38075-0_sp1.1 from training. Duration: 0.936375 2023-10-02 05:47:12,693 WARNING [train.py:1204] (2/4) Exclude cut with ID _4697_210_2069_1_1526815801953_3823098_276-96593-0_sp0.9 from training. Duration: 0.8555625 2023-10-02 05:47:14,599 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_308-278734-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 05:47:14,601 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_5190_210_4557_1_1527245078_2965646_353-250603-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 05:47:17,694 WARNING [train.py:1204] (2/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_472-216663-0 from training. Duration: 0.92 2023-10-02 05:47:19,416 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.bypass_mid.scale_min, batch_count=771693.3333333334, ans=0.2 2023-10-02 05:47:20,571 WARNING [train.py:1204] (2/4) Exclude cut with ID _7537_210_2084_1_1528502395431_3924359_14-312906-0 from training. Duration: 0.84 2023-10-02 05:47:20,615 WARNING [train.py:1204] (2/4) Exclude cut with ID _29003_210_19596_1_1532507391720_3894098_559-236660-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 05:47:23,346 WARNING [train.py:1204] (2/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_312-204798-0_sp1.1 from training. Duration: 0.8454375 2023-10-02 05:47:26,020 WARNING [train.py:1204] (2/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_17-309070-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 05:47:27,555 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.bypass.skip_rate, batch_count=771760.0, ans=0.09899494936611666 2023-10-02 05:47:28,678 WARNING [train.py:1204] (2/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_344-152526-0_sp0.9 from training. Duration: 0.588875 2023-10-02 05:47:28,809 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_41452_210_7291_1_1533437687_3941281_405-11559-0_sp1.1 from training. Duration: 0.82725 2023-10-02 05:47:28,831 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_48730_210_5399_1_1533885886_2212769_157-124052-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 05:47:30,180 WARNING [train.py:1204] (2/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_415-78792-0 from training. Duration: 0.81 2023-10-02 05:47:30,184 WARNING [train.py:1204] (2/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_345-340690-0_sp1.1 from training. Duration: 0.8454375 2023-10-02 05:47:30,341 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.0.feed_forward3.hidden_balancer.prob, batch_count=771760.0, ans=0.125 2023-10-02 05:47:32,093 WARNING [train.py:1204] (2/4) Exclude cut with ID _23825_210_13449_1_1531915313526_3497150_124-8138-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 05:47:33,538 WARNING [train.py:1204] (2/4) Exclude cut with ID _49258_210_7994_1_1534122017863_3738861_194-24864-0_sp1.1 from training. Duration: 0.936375 2023-10-02 05:47:33,558 WARNING [train.py:1204] (2/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_369-222060-0_sp1.1 from training. Duration: 0.6454375 2023-10-02 05:47:34,904 WARNING [train.py:1204] (2/4) Exclude cut with ID _12369_210_4381_1_1530356078581_4388520_480-13146-0_sp0.9 from training. Duration: 0.9444375 2023-10-02 05:47:37,691 WARNING [train.py:1204] (2/4) Exclude cut with ID _13253_210_1925_1_1530757775819_3577873_191-144977-0 from training. Duration: 0.78 2023-10-02 05:47:39,625 WARNING [train.py:1204] (2/4) Exclude cut with ID _30304_210_6319_1_1532476827261_7582060_1128-140266-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 05:47:43,847 WARNING [train.py:1204] (2/4) Exclude cut with ID _8772_210_1850_1_1528635595972_3664011_247-336448-0_sp0.9 from training. Duration: 0.82225 2023-10-02 05:47:43,930 WARNING [train.py:1204] (2/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_318-59097-0_sp1.1 from training. Duration: 0.6909375 2023-10-02 05:47:46,626 WARNING [train.py:1204] (2/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_540-131164-0_sp1.1 from training. Duration: 0.836375 2023-10-02 05:47:46,723 WARNING [train.py:1204] (2/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_39-110836-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 05:47:50,517 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.548e+02 1.871e+02 2.019e+02 2.222e+02 3.341e+02, threshold=4.039e+02, percent-clipped=0.0 2023-10-02 05:47:50,607 WARNING [train.py:1204] (2/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_860-96198-0_sp1.1 from training. Duration: 0.663625 2023-10-02 05:47:50,622 WARNING [train.py:1204] (2/4) Exclude cut with ID _15133_210_6228_1_1530794512074_3080379_29-264458-0 from training. Duration: 0.58 2023-10-02 05:47:50,643 WARNING [train.py:1204] (2/4) Exclude cut with ID _55209_210_1531_1_1534675828996_4597160_797-348343-0_sp1.1 from training. Duration: 0.963625 2023-10-02 05:47:52,028 WARNING [train.py:1204] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_23-189234-0_sp1.1 from training. Duration: 0.7545625 2023-10-02 05:47:57,610 WARNING [train.py:1204] (2/4) Exclude cut with ID _5410_210_1388_1_1527397055795_1719020_39-112221-0_sp1.1 from training. Duration: 0.62725 2023-10-02 05:47:59,018 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_43802_210_2290_1_1533516203_8829504_100-348441-0_sp1.1 from training. Duration: 0.82725 2023-10-02 05:48:04,118 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.ff2_skip_rate, batch_count=771893.3333333334, ans=0.0 2023-10-02 05:48:04,165 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.conv_module1.balancer2.prob, batch_count=771893.3333333334, ans=0.125 2023-10-02 05:48:05,201 WARNING [train.py:1204] (2/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_143-86344-0_sp1.1 from training. Duration: 0.7 2023-10-02 05:48:07,921 WARNING [train.py:1204] (2/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_126-29104-0 from training. Duration: 0.85 2023-10-02 05:48:09,414 WARNING [train.py:1204] (2/4) Exclude cut with ID _39846_210_12651_1_1533603651992_1318030_138-31771-0_sp1.1 from training. Duration: 0.963625 2023-10-02 05:48:09,632 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.balancer_ff3.min_abs, batch_count=771960.0, ans=0.2 2023-10-02 05:48:14,124 WARNING [train.py:1204] (2/4) Exclude cut with ID _2220_210_1819_1_1525509109898_3658680_50-99837-0_sp1.1 from training. Duration: 0.4909375 2023-10-02 05:48:14,184 WARNING [train.py:1204] (2/4) Exclude cut with ID _6274_210_2084_1_1527937583837_7494020_836-210550-0_sp1.1 from training. Duration: 0.97275 2023-10-02 05:48:17,355 WARNING [train.py:1204] (2/4) Exclude cut with ID _28323_210_16187_1_1532480395667_4100300_306-74561-0 from training. Duration: 0.94 2023-10-02 05:48:20,811 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_177-58005-0_sp1.1 from training. Duration: 0.563625 2023-10-02 05:48:23,996 INFO [train.py:1046] (2/4) Epoch 22, batch 4250, loss[loss=0.1493, simple_loss=0.2004, pruned_loss=0.0491, over 19148.00 frames. ], tot_loss[loss=0.1723, simple_loss=0.2472, pruned_loss=0.04872, over 4688253.67 frames. ], batch size: 388, lr: 4.64e-03, grad_scale: 8.0 2023-10-02 05:48:25,538 WARNING [train.py:1204] (2/4) Exclude cut with ID _54871_210_22356_1_1534665798253_3856359_19-342632-0_sp1.1 from training. Duration: 0.82725 2023-10-02 05:48:26,829 WARNING [train.py:1204] (2/4) Exclude cut with ID _35355_210_16040_1_1533212988977_4016229_74-190076-0_sp0.9 from training. Duration: 0.888875 2023-10-02 05:48:29,496 WARNING [train.py:1204] (2/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_285-97786-0_sp1.1 from training. Duration: 0.97275 2023-10-02 05:48:33,040 WARNING [train.py:1204] (2/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_337-181502-0_sp0.9 from training. Duration: 0.72225 2023-10-02 05:48:33,079 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_353-182855-0 from training. Duration: 0.96 2023-10-02 05:48:34,504 WARNING [train.py:1204] (2/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_409-327730-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 05:48:37,190 WARNING [train.py:1204] (2/4) Exclude cut with ID _8030_210_7497_1_1528444886781_3531089_373-19403-0_sp1.1 from training. Duration: 0.97275 2023-10-02 05:48:39,927 WARNING [train.py:1204] (2/4) Exclude cut with ID _6789_210_1534_1_1527857937218_3118950_231-340237-0_sp1.1 from training. Duration: 0.936375 2023-10-02 05:48:43,425 WARNING [train.py:1204] (2/4) Exclude cut with ID _6493_210_1747_1_1528459210424_4012470_536-57832-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 05:48:44,767 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7046_210_6753_1_1527895337_6920917_756-116002-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 05:48:46,231 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_37152_210_9697_1_1533106573_3844188_29-239070-0_sp1.1 from training. Duration: 0.8909375 2023-10-02 05:48:46,233 WARNING [train.py:1204] (2/4) Exclude cut with ID _19023_210_4353_1_1531378892552_5820780_61-334539-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 05:48:47,616 WARNING [train.py:1204] (2/4) Exclude cut with ID _14170_210_5219_1_1530525949868_4469650_159-28488-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 05:48:48,985 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_40896_210_15710_1_1533433956_7100802_764-71272-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 05:48:50,340 WARNING [train.py:1204] (2/4) Exclude cut with ID _44902_210_16187_1_1533888006062_3792078_258-178274-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 05:48:52,197 WARNING [train.py:1204] (2/4) Exclude cut with ID _7537_210_2084_1_1528502395431_3924359_14-312906-0_sp1.1 from training. Duration: 0.763625 2023-10-02 05:48:54,240 WARNING [train.py:1204] (2/4) Exclude cut with ID _49643_210_7989_1_1534640244224_3939031_674-116639-0_sp1.1 from training. Duration: 0.963625 2023-10-02 05:48:55,576 WARNING [train.py:1204] (2/4) Exclude cut with ID _22826_210_9994_1_1531908201522_3279169_415-161-0 from training. Duration: 0.98 2023-10-02 05:48:59,608 WARNING [train.py:1204] (2/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_264-348896-0 from training. Duration: 0.85 2023-10-02 05:48:59,616 WARNING [train.py:1204] (2/4) Exclude cut with ID _51899_210_20034_1_1534129254466_3669220_233-161492-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 05:48:59,688 WARNING [train.py:1204] (2/4) Exclude cut with ID _31255_210_13427_1_1532865619495_3815143_233-200023-0_sp1.1 from training. Duration: 0.936375 2023-10-02 05:49:00,357 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_23850_210_4238_1_1532232597_1632433_100-229879-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 05:49:01,691 WARNING [train.py:1204] (2/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_375-245677-0_sp0.9 from training. Duration: 0.9 2023-10-02 05:49:01,694 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_40223_210_6228_1_1533298404_4812267_355-317415-0_sp1.1 from training. Duration: 0.97275 2023-10-02 05:49:01,731 WARNING [train.py:1204] (2/4) Exclude cut with ID _9679_210_7497_1_1529129212906_4363929_235-81488-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 05:49:04,754 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.bypass_mid.scale_min, batch_count=772160.0, ans=0.2 2023-10-02 05:49:05,938 WARNING [train.py:1204] (2/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_172-215932-0_sp1.1 from training. Duration: 0.6 2023-10-02 05:49:07,229 WARNING [train.py:1204] (2/4) Exclude cut with ID _12369_210_4381_1_1530356078581_4388520_64-298917-0_sp1.1 from training. Duration: 0.8 2023-10-02 05:49:10,159 WARNING [train.py:1204] (2/4) Exclude cut with ID _38020_210_5986_1_1533112252259_7006089_131-53533-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 05:49:11,461 WARNING [train.py:1204] (2/4) Exclude cut with ID _42334_210_7770_1_1534206610396_3605880_392-48154-0_sp1.1 from training. Duration: 0.963625 2023-10-02 05:49:12,809 WARNING [train.py:1204] (2/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_337-181502-0 from training. Duration: 0.65 2023-10-02 05:49:12,817 WARNING [train.py:1204] (2/4) Exclude cut with ID _6267_210_2084_1_1527764658526_3894730_375-219887-0_sp0.9 from training. Duration: 0.7444375 2023-10-02 05:49:14,896 WARNING [train.py:1204] (2/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_60-146189-0 from training. Duration: 0.98 2023-10-02 05:49:16,322 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_3133_210_2087_1_1526106446_2324690_243-143119-0_sp1.1 from training. Duration: 0.7 2023-10-02 05:49:17,631 WARNING [train.py:1204] (2/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_1267-272693-0_sp0.9 from training. Duration: 0.811125 2023-10-02 05:49:20,641 WARNING [train.py:1204] (2/4) Exclude cut with ID _69884_210_9771_1_1535846392818_4166169_83-182645-0_sp1.1 from training. Duration: 0.97275 2023-10-02 05:49:20,666 WARNING [train.py:1204] (2/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_403-292260-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 05:49:22,558 WARNING [train.py:1204] (2/4) Exclude cut with ID _55429_210_10626_1_1534467588527_7174143_377-169391-0 from training. Duration: 0.96 2023-10-02 05:49:22,864 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.ff3_skip_rate, batch_count=772293.3333333334, ans=0.0 2023-10-02 05:49:24,480 WARNING [train.py:1204] (2/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_494-199538-0_sp1.1 from training. Duration: 0.6818125 2023-10-02 05:49:24,534 WARNING [train.py:1204] (2/4) Exclude cut with ID _3067_210_2667_1_1526212974640_4503168_119-175776-0_sp1.1 from training. Duration: 0.67275 2023-10-02 05:49:28,703 WARNING [train.py:1204] (2/4) Exclude cut with ID _3231_210_2106_1_1526205864934_6784220_183-303839-0_sp1.1 from training. Duration: 0.97275 2023-10-02 05:49:31,382 WARNING [train.py:1204] (2/4) Exclude cut with ID _18513_210_9206_1_1531618215223_3862620_165-38958-0_sp1.1 from training. Duration: 0.963625 2023-10-02 05:49:31,494 WARNING [train.py:1204] (2/4) Exclude cut with ID _41314_210_12651_1_1533459476543_3766460_87-272068-0_sp1.1 from training. Duration: 0.7545625 2023-10-02 05:49:33,544 WARNING [train.py:1204] (2/4) Exclude cut with ID _3541_210_2477_1_1526475408709_3263973_353-95-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 05:49:34,804 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_2009_210_2071_1_1525481134_4588937_73-302813-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 05:49:36,210 WARNING [train.py:1204] (2/4) Exclude cut with ID _64220_210_9527_1_1535250151772_4875259_88-95011-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 05:49:36,262 WARNING [train.py:1204] (2/4) Exclude cut with ID _44902_210_16187_1_1533888006062_3792078_160-12107-0_sp1.1 from training. Duration: 0.92725 2023-10-02 05:49:36,268 WARNING [train.py:1204] (2/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_547-5424-0 from training. Duration: 0.94 2023-10-02 05:49:37,668 WARNING [train.py:1204] (2/4) Exclude cut with ID _21783_210_11357_1_1531742458730_3762679_268-85656-0_sp1.1 from training. Duration: 0.936375 2023-10-02 05:49:38,940 INFO [train.py:1046] (2/4) Epoch 22, batch 4300, loss[loss=0.1674, simple_loss=0.2591, pruned_loss=0.0378, over 24268.00 frames. ], tot_loss[loss=0.1723, simple_loss=0.2479, pruned_loss=0.04835, over 4702944.89 frames. ], batch size: 74, lr: 4.64e-03, grad_scale: 8.0 2023-10-02 05:49:43,605 WARNING [train.py:1204] (2/4) Exclude cut with ID _66210_210_12651_1_1535446812922_3365850_42-284985-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 05:49:44,924 WARNING [train.py:1204] (2/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_465-214726-0_sp0.9 from training. Duration: 0.9555625 2023-10-02 05:49:47,806 WARNING [train.py:1204] (2/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_50-95770-0_sp1.1 from training. Duration: 0.936375 2023-10-02 05:49:48,099 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.conv_module1.balancer2.prob, batch_count=772360.0, ans=0.125 2023-10-02 05:49:53,565 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.feed_forward1.out_proj.dropout_p, batch_count=772426.6666666666, ans=0.1 2023-10-02 05:49:56,766 WARNING [train.py:1204] (2/4) Exclude cut with ID _30800_210_22382_1_1532499347904_4515959_269-77567-0_sp1.1 from training. Duration: 0.963625 2023-10-02 05:49:56,768 WARNING [train.py:1204] (2/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_7-111954-0 from training. Duration: 0.56 2023-10-02 05:49:58,187 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_43802_210_2290_1_1533516203_8829504_85-195240-0_sp0.9 from training. Duration: 0.9666875 2023-10-02 05:49:59,575 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_2822_210_2715_1_1526112586_1999587_165-36656-0_sp0.9 from training. Duration: 0.988875 2023-10-02 05:49:59,590 WARNING [train.py:1204] (2/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_181-78632-0_sp1.1 from training. Duration: 0.7090625 2023-10-02 05:49:59,601 WARNING [train.py:1204] (2/4) Exclude cut with ID IC0671W0044-331887 from training. Duration: 0.896 2023-10-02 05:50:03,670 WARNING [train.py:1204] (2/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_266-137104-0_sp1.1 from training. Duration: 0.5909375 2023-10-02 05:50:05,044 WARNING [train.py:1204] (2/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_137-116442-0_sp0.9 from training. Duration: 0.8666875 2023-10-02 05:50:06,050 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.0.layers.0.nonlin_attention.whiten1, num_groups=1, num_channels=144, metric=9.41 vs. limit=10.0 2023-10-02 05:50:09,595 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_23801_210_16223_1_1532066392_3294657_570-5439-0 from training. Duration: 0.97 2023-10-02 05:50:09,606 WARNING [train.py:1204] (2/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_29-280622-0_sp1.1 from training. Duration: 0.7454375 2023-10-02 05:50:09,624 WARNING [train.py:1204] (2/4) Exclude cut with ID 3557-8342-0013-54691-0 from training. Duration: 0.92 2023-10-02 05:50:11,187 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.1.feed_forward1.out_proj.dropout_p, batch_count=772493.3333333334, ans=0.1 2023-10-02 05:50:12,414 WARNING [train.py:1204] (2/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_711-15688-0_sp1.1 from training. Duration: 0.3909375 2023-10-02 05:50:12,539 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_15126_210_6753_1_1530778677_2591185_120-277707-0_sp1.1 from training. Duration: 0.72725 2023-10-02 05:50:12,603 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder_embed.conv.2.prob, batch_count=772493.3333333334, ans=0.125 2023-10-02 05:50:14,032 WARNING [train.py:1204] (2/4) Exclude cut with ID _14970_210_15709_1_1530775547361_1966965_8-169924-0_sp1.1 from training. Duration: 0.836375 2023-10-02 05:50:14,033 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_4692_210_3528_1_1526899755_4323254_107-44401-0_sp0.9 from training. Duration: 0.9555625 2023-10-02 05:50:16,022 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_3475_210_2028_1_1526193578_3622569_265-276625-0_sp0.9 from training. Duration: 0.8444375 2023-10-02 05:50:18,716 WARNING [train.py:1204] (2/4) Exclude cut with ID _55153_210_2253_1_1534384786677_3162096_148-79022-0_sp1.1 from training. Duration: 0.87275 2023-10-02 05:50:18,800 WARNING [train.py:1204] (2/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_292-36336-0_sp1.1 from training. Duration: 0.92725 2023-10-02 05:50:19,955 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.628e+02 1.860e+02 2.141e+02 2.356e+02 3.803e+02, threshold=4.281e+02, percent-clipped=0.0 2023-10-02 05:50:20,037 WARNING [train.py:1204] (2/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_329-82235-0 from training. Duration: 0.82 2023-10-02 05:50:20,116 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_11305_210_1794_1_1529750428_7178148_257-312870-0 from training. Duration: 0.88 2023-10-02 05:50:21,557 WARNING [train.py:1204] (2/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_436-4493-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 05:50:24,759 WARNING [train.py:1204] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_321-54129-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 05:50:24,766 WARNING [train.py:1204] (2/4) Exclude cut with ID _5287_210_2084_1_1527418883040_3878670_333-48508-0_sp0.9 from training. Duration: 0.6666875 2023-10-02 05:50:24,778 WARNING [train.py:1204] (2/4) Exclude cut with ID _26528_210_9070_1_1532570410842_3851310_244-278158-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 05:50:24,814 WARNING [train.py:1204] (2/4) Exclude cut with ID _46952_210_5986_1_1534230010357_3107379_232-344503-0_sp1.1 from training. Duration: 0.87275 2023-10-02 05:50:24,823 WARNING [train.py:1204] (2/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_43-206757-0 from training. Duration: 0.75 2023-10-02 05:50:24,825 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_4183_210_2087_1_1526729100_1869838_141-166600-0 from training. Duration: 0.95 2023-10-02 05:50:26,728 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_296-287678-0 from training. Duration: 0.95 2023-10-02 05:50:28,027 WARNING [train.py:1204] (2/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_265-60343-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 05:50:28,063 WARNING [train.py:1204] (2/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_332-256168-0 from training. Duration: 0.75 2023-10-02 05:50:28,094 WARNING [train.py:1204] (2/4) Exclude cut with ID _13679_210_7497_1_1530534293662_3355098_411-186088-0 from training. Duration: 0.94 2023-10-02 05:50:32,239 WARNING [train.py:1204] (2/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_458-126779-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 05:50:33,641 WARNING [train.py:1204] (2/4) Exclude cut with ID IC0803W0380-397572 from training. Duration: 0.032 2023-10-02 05:50:33,686 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_278-272612-0_sp1.1 from training. Duration: 0.663625 2023-10-02 05:50:33,905 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.feed_forward1.out_proj.dropout_p, batch_count=772560.0, ans=0.1 2023-10-02 05:50:36,379 WARNING [train.py:1204] (2/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_491-263101-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 05:50:36,384 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_53555_210_5632_1_1534595220_7232297_334-336778-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 05:50:39,549 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_50-3289-0 from training. Duration: 0.91 2023-10-02 05:50:39,619 WARNING [train.py:1204] (2/4) Exclude cut with ID _8699_210_2069_1_1528630346831_3960409_155-39766-0_sp0.9 from training. Duration: 0.8666875 2023-10-02 05:50:39,623 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_502-82145-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 05:50:40,916 WARNING [train.py:1204] (2/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_550-43132-0_sp1.1 from training. Duration: 0.97275 2023-10-02 05:50:40,954 WARNING [train.py:1204] (2/4) Exclude cut with ID _14998_210_5219_1_1530705582080_7474970_446-112117-0_sp1.1 from training. Duration: 0.8181875 2023-10-02 05:50:41,002 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_3691_210_3528_1_1526468169_4062053_355-334432-0_sp1.1 from training. Duration: 0.7909375 2023-10-02 05:50:43,763 WARNING [train.py:1204] (2/4) Exclude cut with ID _58605_210_12210_1_1534723195939_3904259_239-94720-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 05:50:44,724 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.0.layers.0.feed_forward2.out_whiten, num_groups=1, num_channels=192, metric=5.59 vs. limit=15.0 2023-10-02 05:50:45,915 WARNING [train.py:1204] (2/4) Exclude cut with ID _49376_210_5986_1_1533985278732_3370740_391-344072-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 05:50:46,002 WARNING [train.py:1204] (2/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_558-322014-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 05:50:47,353 WARNING [train.py:1204] (2/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_256-32662-0_sp1.1 from training. Duration: 0.8181875 2023-10-02 05:50:48,880 INFO [scaling.py:1118] (2/4) WithLoss: name=encoder.encoders.0.layers.1.self_attn_weights, loss-sum=0.000e+00 2023-10-02 05:50:51,414 WARNING [train.py:1204] (2/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_572-185150-0 from training. Duration: 0.97 2023-10-02 05:50:52,855 INFO [train.py:1046] (2/4) Epoch 22, batch 4350, loss[loss=0.1887, simple_loss=0.2576, pruned_loss=0.05991, over 23733.00 frames. ], tot_loss[loss=0.1723, simple_loss=0.2484, pruned_loss=0.04813, over 4709911.73 frames. ], batch size: 149, lr: 4.64e-03, grad_scale: 8.0 2023-10-02 05:50:52,929 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_89-35748-0_sp0.9 from training. Duration: 0.87775 2023-10-02 05:50:58,113 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6291_210_3273_1_1527683649_1078498_88-66747-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 05:51:00,914 WARNING [train.py:1204] (2/4) Exclude cut with ID _19504_210_9994_1_1531648863242_3325220_357-237029-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 05:51:02,413 WARNING [train.py:1204] (2/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_302-2550-0_sp1.1 from training. Duration: 0.736375 2023-10-02 05:51:02,413 WARNING [train.py:1204] (2/4) Exclude cut with ID _9461_210_4557_1_1529060514726_3677451_59-194017-0_sp1.1 from training. Duration: 0.963625 2023-10-02 05:51:06,612 WARNING [train.py:1204] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_665-196155-0_sp1.1 from training. Duration: 0.6090625 2023-10-02 05:51:09,941 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_326-315007-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 05:51:11,409 WARNING [train.py:1204] (2/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_105-328174-0_sp1.1 from training. Duration: 0.6909375 2023-10-02 05:51:11,424 WARNING [train.py:1204] (2/4) Exclude cut with ID _56494_210_4220_1_1535284837625_3033169_106-263722-0_sp1.1 from training. Duration: 0.97275 2023-10-02 05:51:13,205 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=772760.0, ans=0.1 2023-10-02 05:51:15,022 WARNING [train.py:1204] (2/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_770-200430-0_sp0.9 from training. Duration: 0.988875 2023-10-02 05:51:17,563 WARNING [train.py:1204] (2/4) Exclude cut with ID _65640_210_6310_1_1535368201445_3173250_209-13008-0_sp1.1 from training. Duration: 0.9 2023-10-02 05:51:18,968 WARNING [train.py:1204] (2/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_212-149947-0_sp0.9 from training. Duration: 0.811125 2023-10-02 05:51:23,849 WARNING [train.py:1204] (2/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_188-171673-0 from training. Duration: 0.58 2023-10-02 05:51:25,228 WARNING [train.py:1204] (2/4) Exclude cut with ID _58895_210_8750_1_1535184081320_3649089_286-281724-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 05:51:25,299 WARNING [train.py:1204] (2/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_16-254909-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 05:51:31,343 WARNING [train.py:1204] (2/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_52-95655-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 05:51:34,127 WARNING [train.py:1204] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_554-45617-0 from training. Duration: 0.99 2023-10-02 05:51:35,710 WARNING [train.py:1204] (2/4) Exclude cut with ID _14683_210_5219_1_1530698352608_3974859_473-344611-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 05:51:37,161 WARNING [train.py:1204] (2/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_17-25045-0_sp0.9 from training. Duration: 0.6555625 2023-10-02 05:51:42,017 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9072W0003-530637 from training. Duration: 0.768 2023-10-02 05:51:43,405 WARNING [train.py:1204] (2/4) Exclude cut with ID _38670_210_15379_1_1533448810659_4391059_76-74025-0_sp1.1 from training. Duration: 0.936375 2023-10-02 05:51:43,456 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6291_210_3273_1_1527683649_1078498_74-292608-0_sp0.9 from training. Duration: 0.8 2023-10-02 05:51:44,858 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9084W0025-536862 from training. Duration: 0.896 2023-10-02 05:51:46,186 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9087W0021-538392 from training. Duration: 0.768 2023-10-02 05:51:46,200 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7543_210_6753_1_1528543323_645515_56-163234-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 05:51:46,229 WARNING [train.py:1204] (2/4) Exclude cut with ID _45560_210_21982_1_1533812603951_3584999_44-332643-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 05:51:46,292 WARNING [train.py:1204] (2/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_1285-256622-0_sp1.1 from training. Duration: 0.763625 2023-10-02 05:51:46,494 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.feed_forward3.hidden_balancer.prob, batch_count=772893.3333333334, ans=0.125 2023-10-02 05:51:48,257 WARNING [train.py:1204] (2/4) Exclude cut with ID _52781_210_16187_1_1534297729099_1118149_141-325632-0_sp1.1 from training. Duration: 0.936375 2023-10-02 05:51:49,631 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_40896_210_15710_1_1533433956_7100802_1051-137631-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 05:51:49,671 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_13091_210_7925_1_1530609447_8987354_233-18333-0_sp1.1 from training. Duration: 0.97275 2023-10-02 05:51:52,392 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_34008_210_10431_1_1533345424_8183247_662-317331-0 from training. Duration: 0.94 2023-10-02 05:51:52,406 WARNING [train.py:1204] (2/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_291-248920-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 05:51:52,408 WARNING [train.py:1204] (2/4) Exclude cut with ID _65263_210_22356_1_1535763598691_3921037_274-288481-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 05:51:54,288 WARNING [train.py:1204] (2/4) Exclude cut with ID _5189_210_4557_1_1527248319428_3769229_233-168080-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 05:51:54,353 WARNING [train.py:1204] (2/4) Exclude cut with ID _54871_210_22356_1_1534665798253_3856359_135-184281-0 from training. Duration: 0.99 2023-10-02 05:51:54,546 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.conv_module1.balancer2.prob, batch_count=772960.0, ans=0.125 2023-10-02 05:51:57,037 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9123W0012-555972 from training. Duration: 0.896 2023-10-02 05:51:57,041 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9123W0118-556077 from training. Duration: 0.896 2023-10-02 05:51:57,051 WARNING [train.py:1204] (2/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_406-275649-0 from training. Duration: 0.42 2023-10-02 05:52:00,516 WARNING [train.py:1204] (2/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_309-13684-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 05:52:00,535 WARNING [train.py:1204] (2/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_129-337802-0_sp1.1 from training. Duration: 0.6545625 2023-10-02 05:52:00,553 WARNING [train.py:1204] (2/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_649-266825-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 05:52:01,893 WARNING [train.py:1204] (2/4) Exclude cut with ID _40519_210_13794_1_1533715228655_7337390_895-224742-0_sp1.1 from training. Duration: 0.92725 2023-10-02 05:52:03,371 WARNING [train.py:1204] (2/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_234-14174-0 from training. Duration: 0.82 2023-10-02 05:52:06,148 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9156W0009-572502 from training. Duration: 0.768 2023-10-02 05:52:06,155 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_424-245529-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 05:52:08,847 INFO [train.py:1046] (2/4) Epoch 22, batch 4400, loss[loss=0.1796, simple_loss=0.2624, pruned_loss=0.04841, over 24364.00 frames. ], tot_loss[loss=0.1734, simple_loss=0.2496, pruned_loss=0.04854, over 4716546.07 frames. ], batch size: 77, lr: 4.64e-03, grad_scale: 16.0 2023-10-02 05:52:08,971 WARNING [train.py:1204] (2/4) Exclude cut with ID _26528_210_9070_1_1532570410842_3851310_232-10149-0_sp1.1 from training. Duration: 0.963625 2023-10-02 05:52:08,980 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_17292_210_6753_1_1531125388_2943734_152-30410-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 05:52:11,881 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.self_attn_weights.pos_emb_skip_rate, batch_count=773026.6666666666, ans=0.0 2023-10-02 05:52:13,015 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_35441_210_16554_1_1533347572_4092680_291-82075-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 05:52:14,882 WARNING [train.py:1204] (2/4) Exclude cut with ID _12369_210_4381_1_1530356078581_4388520_64-298917-0 from training. Duration: 0.88 2023-10-02 05:52:14,911 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_5618_210_1874_1_1527311008_5851_1-214501-0_sp0.9 from training. Duration: 0.7655625 2023-10-02 05:52:16,320 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_9460_210_4211_1_1528954587_10049667_572-37035-0 from training. Duration: 0.8 2023-10-02 05:52:16,357 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9193W0023-590322 from training. Duration: 0.896 2023-10-02 05:52:17,739 WARNING [train.py:1204] (2/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_206-10097-0_sp1.1 from training. Duration: 0.4454375 2023-10-02 05:52:17,751 WARNING [train.py:1204] (2/4) Exclude cut with ID _55134_210_21982_1_1534590028377_3882170_17-51731-0_sp1.1 from training. Duration: 0.963625 2023-10-02 05:52:21,099 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_14953_210_2151_1_1530701710_5616165_710-165400-0 from training. Duration: 0.87 2023-10-02 05:52:22,693 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.conv_module2.balancer2.min_abs, batch_count=773093.3333333334, ans=0.5 2023-10-02 05:52:24,002 WARNING [train.py:1204] (2/4) Exclude cut with ID _53203_210_2087_1_1534246218309_475590_61-85777-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 05:52:25,331 WARNING [train.py:1204] (2/4) Exclude cut with ID _55844_210_2346_1_1534471212569_7625707_1050-10453-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 05:52:25,340 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9218W0013-603297 from training. Duration: 0.896 2023-10-02 05:52:26,078 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.2.encoder.layers.0.whiten, num_groups=1, num_channels=384, metric=5.60 vs. limit=12.0 2023-10-02 05:52:26,817 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_20120_210_6753_1_1531643035_5482859_267-102818-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 05:52:26,818 WARNING [train.py:1204] (2/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_191-41023-0 from training. Duration: 0.71 2023-10-02 05:52:26,868 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9231W0202-610227 from training. Duration: 0.896 2023-10-02 05:52:29,774 WARNING [train.py:1204] (2/4) Exclude cut with ID _6537_210_1883_1_1527987559710_3597129_382-133062-0 from training. Duration: 0.68 2023-10-02 05:52:31,669 WARNING [train.py:1204] (2/4) Exclude cut with ID _28427_210_4220_1_1532581240967_3061069_43-84928-0 from training. Duration: 0.99 2023-10-02 05:52:31,692 WARNING [train.py:1204] (2/4) Exclude cut with ID _59256_210_5986_1_1535076085997_3589120_121-237386-0 from training. Duration: 0.96 2023-10-02 05:52:31,728 WARNING [train.py:1204] (2/4) Exclude cut with ID _8480_210_1874_1_1528589391205_6728330_137-256986-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 05:52:33,615 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_9417_210_1794_1_1529235261_4614131_479-346963-0_sp1.1 from training. Duration: 0.936375 2023-10-02 05:52:33,661 WARNING [train.py:1204] (2/4) Exclude cut with ID _26474_210_9570_1_1532520022699_3623380_267-7362-0_sp1.1 from training. Duration: 0.936375 2023-10-02 05:52:35,079 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.balancer1.prob, batch_count=773093.3333333334, ans=0.125 2023-10-02 05:52:36,314 WARNING [train.py:1204] (2/4) Exclude cut with ID _55328_210_27767_1_1534580973260_4088170_287-61272-0_sp1.1 from training. Duration: 0.97275 2023-10-02 05:52:36,447 WARNING [train.py:1204] (2/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_589-9070-0 from training. Duration: 0.71 2023-10-02 05:52:36,454 WARNING [train.py:1204] (2/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_148-158509-0_sp1.1 from training. Duration: 0.37275 2023-10-02 05:52:37,831 WARNING [train.py:1204] (2/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_164-184548-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 05:52:39,254 WARNING [train.py:1204] (2/4) Exclude cut with ID _14970_210_15709_1_1530775547361_1966965_6-23328-0_sp1.1 from training. Duration: 0.7818125 2023-10-02 05:52:39,260 WARNING [train.py:1204] (2/4) Exclude cut with ID _29303_210_2575_1_1532392237303_7410649_612-165069-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 05:52:40,646 WARNING [train.py:1204] (2/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_383-321224-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 05:52:42,077 WARNING [train.py:1204] (2/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_283-160525-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 05:52:42,088 WARNING [train.py:1204] (2/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_505-97987-0 from training. Duration: 0.86 2023-10-02 05:52:42,172 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9309W0190-640317 from training. Duration: 0.768 2023-10-02 05:52:44,947 WARNING [train.py:1204] (2/4) Exclude cut with ID _50093_210_4220_1_1534417141207_3432299_280-297390-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 05:52:50,691 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.479e+02 1.791e+02 2.025e+02 2.337e+02 3.385e+02, threshold=4.050e+02, percent-clipped=0.0 2023-10-02 05:52:52,758 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_17292_210_6753_1_1531125388_2943734_320-71385-0_sp1.1 from training. Duration: 0.97275 2023-10-02 05:52:55,562 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_213-103282-0 from training. Duration: 0.78 2023-10-02 05:52:59,613 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7615_210_4211_1_1528110965_4580160_72-235211-0_sp0.9 from training. Duration: 0.8444375 2023-10-02 05:53:01,109 WARNING [train.py:1204] (2/4) Exclude cut with ID _14867_210_1968_1_1530684179890_3846969_173-320977-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 05:53:03,157 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7046_210_6753_1_1527895337_6920917_783-278109-0_sp0.9 from training. Duration: 0.8555625 2023-10-02 05:53:05,078 WARNING [train.py:1204] (2/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_90-30673-0 from training. Duration: 0.84 2023-10-02 05:53:05,109 WARNING [train.py:1204] (2/4) Exclude cut with ID _22826_210_9994_1_1531908201522_3279169_415-161-0_sp1.1 from training. Duration: 0.8909375 2023-10-02 05:53:05,124 WARNING [train.py:1204] (2/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_86-66192-0_sp0.9 from training. Duration: 0.611125 2023-10-02 05:53:05,126 WARNING [train.py:1204] (2/4) Exclude cut with ID _4626_210_3532_1_1526817652717_3916859_446-206656-0_sp1.1 from training. Duration: 0.6909375 2023-10-02 05:53:06,644 WARNING [train.py:1204] (2/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_166-128941-0_sp0.9 from training. Duration: 0.6 2023-10-02 05:53:10,678 WARNING [train.py:1204] (2/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_345-167986-0 from training. Duration: 0.84 2023-10-02 05:53:13,452 WARNING [train.py:1204] (2/4) Exclude cut with ID _6275_210_2084_1_1528110062213_7669049_973-198085-0 from training. Duration: 0.75 2023-10-02 05:53:14,796 WARNING [train.py:1204] (2/4) Exclude cut with ID _60057_210_10640_1_1534903065793_3333150_171-181133-0 from training. Duration: 0.88 2023-10-02 05:53:14,809 WARNING [train.py:1204] (2/4) Exclude cut with ID _19504_210_9994_1_1531648863242_3325220_336-196513-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 05:53:14,825 WARNING [train.py:1204] (2/4) Exclude cut with ID _6493_210_1747_1_1528459210424_4012470_513-140967-0 from training. Duration: 0.79 2023-10-02 05:53:16,176 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6673_210_3273_1_1527767790_3173240_327-306185-0_sp0.9 from training. Duration: 0.92225 2023-10-02 05:53:22,086 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1032-236863-0_sp1.1 from training. Duration: 0.763625 2023-10-02 05:53:24,537 WARNING [train.py:1204] (2/4) Exclude cut with ID _14867_210_1968_1_1530684179890_3846969_413-29292-0 from training. Duration: 0.9 2023-10-02 05:53:25,754 INFO [train.py:1046] (2/4) Epoch 22, batch 4450, loss[loss=0.1809, simple_loss=0.2531, pruned_loss=0.05441, over 23412.00 frames. ], tot_loss[loss=0.1736, simple_loss=0.25, pruned_loss=0.04861, over 4727967.93 frames. ], batch size: 119, lr: 4.63e-03, grad_scale: 16.0 2023-10-02 05:53:29,281 WARNING [train.py:1204] (2/4) Exclude cut with ID _47251_210_18628_1_1533814063128_4137250_186-343322-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 05:53:30,762 WARNING [train.py:1204] (2/4) Exclude cut with ID _56235_210_10640_1_1534557335235_4161099_624-46091-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 05:53:31,416 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.1.encoder.layers.1.self_attn2.whiten, num_groups=1, num_channels=256, metric=8.75 vs. limit=22.5 2023-10-02 05:53:32,215 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_791-197891-0_sp0.9 from training. Duration: 0.9333125 2023-10-02 05:53:37,053 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_50050_210_16223_1_1535435796_7290138_786-288457-0_sp1.1 from training. Duration: 0.92725 2023-10-02 05:53:37,078 WARNING [train.py:1204] (2/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_213-227283-0_sp1.1 from training. Duration: 0.963625 2023-10-02 05:53:40,564 WARNING [train.py:1204] (2/4) Exclude cut with ID _42659_210_2070_1_1533607442765_3120380_58-341121-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 05:53:43,113 WARNING [train.py:1204] (2/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_506-23200-0_sp1.1 from training. Duration: 0.8818125 2023-10-02 05:53:43,343 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=773426.6666666666, ans=0.1 2023-10-02 05:53:44,559 WARNING [train.py:1204] (2/4) Exclude cut with ID _14170_210_5219_1_1530525949868_4469650_336-213530-0_sp1.1 from training. Duration: 0.8181875 2023-10-02 05:53:44,582 WARNING [train.py:1204] (2/4) Exclude cut with ID _64135_210_2253_1_1535677267876_7608972_180-5312-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 05:53:47,278 WARNING [train.py:1204] (2/4) Exclude cut with ID _12302_210_5219_1_1529924446466_8107280_835-279754-0 from training. Duration: 0.9 2023-10-02 05:53:47,280 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_20120_210_6753_1_1531643035_5482859_510-96996-0_sp1.1 from training. Duration: 0.7909375 2023-10-02 05:53:47,357 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_15517_210_6753_1_1530954008_3487336_216-187834-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 05:53:47,390 WARNING [train.py:1204] (2/4) Exclude cut with ID _19504_210_9994_1_1531648863242_3325220_414-113427-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 05:53:47,391 WARNING [train.py:1204] (2/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_264-108685-0_sp0.9 from training. Duration: 0.611125 2023-10-02 05:53:50,190 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_5438_210_1794_1_1527336687_2600009_104-288274-0_sp0.9 from training. Duration: 0.6444375 2023-10-02 05:53:57,486 WARNING [train.py:1204] (2/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_520-39496-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 05:53:57,549 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_37818_210_4238_1_1533620910_4720083_315-308565-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 05:53:58,896 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_2009_210_2071_1_1525481134_4588937_385-302100-0_sp1.1 from training. Duration: 0.7909375 2023-10-02 05:54:00,908 WARNING [train.py:1204] (2/4) Exclude cut with ID _58895_210_8750_1_1535184081320_3649089_277-53219-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 05:54:00,993 WARNING [train.py:1204] (2/4) Exclude cut with ID _26664_210_7770_1_1532258943280_3418270_121-111258-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 05:54:05,229 WARNING [train.py:1204] (2/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_99-167392-0_sp1.1 from training. Duration: 0.5545625 2023-10-02 05:54:07,203 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1113-7369-0 from training. Duration: 0.85 2023-10-02 05:54:07,217 WARNING [train.py:1204] (2/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_352-59958-0 from training. Duration: 0.86 2023-10-02 05:54:07,222 WARNING [train.py:1204] (2/4) Exclude cut with ID _14886_210_4220_1_1530702020355_3516190_159-73748-0_sp1.1 from training. Duration: 0.7545625 2023-10-02 05:54:10,007 WARNING [train.py:1204] (2/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_131-119569-0_sp1.1 from training. Duration: 0.92725 2023-10-02 05:54:11,383 WARNING [train.py:1204] (2/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_216-343007-0 from training. Duration: 0.66 2023-10-02 05:54:15,939 WARNING [train.py:1204] (2/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_113-92608-0_sp0.9 from training. Duration: 0.6 2023-10-02 05:54:18,722 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_40047_210_2290_1_1533290519_2374311_132-123271-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 05:54:18,773 WARNING [train.py:1204] (2/4) Exclude cut with ID _8793_210_6310_1_1528542985139_2310009_121-179782-0 from training. Duration: 0.8 2023-10-02 05:54:18,794 WARNING [train.py:1204] (2/4) Exclude cut with ID _43997_210_4525_1_1533538632555_5189329_283-218639-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 05:54:18,797 WARNING [train.py:1204] (2/4) Exclude cut with ID _49376_210_5986_1_1533985278732_3370740_297-78397-0_sp1.1 from training. Duration: 0.936375 2023-10-02 05:54:20,107 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_3475_210_2028_1_1526193578_3622569_377-166133-0_sp0.9 from training. Duration: 0.9555625 2023-10-02 05:54:20,114 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_520-213149-0_sp1.1 from training. Duration: 0.92725 2023-10-02 05:54:21,536 WARNING [train.py:1204] (2/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_567-144483-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 05:54:24,350 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_4183_210_2087_1_1526729100_1869838_143-280962-0_sp0.9 from training. Duration: 0.62225 2023-10-02 05:54:24,388 WARNING [train.py:1204] (2/4) Exclude cut with ID _49643_210_7989_1_1534640244224_3939031_611-185515-0 from training. Duration: 0.94 2023-10-02 05:54:26,252 WARNING [train.py:1204] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_88-341718-0_sp0.9 from training. Duration: 0.7333125 2023-10-02 05:54:27,735 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_51225_210_3601_1_1534400634_2327383_49-235441-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 05:54:29,105 WARNING [train.py:1204] (2/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_240-294400-0_sp1.1 from training. Duration: 0.936375 2023-10-02 05:54:30,539 WARNING [train.py:1204] (2/4) Exclude cut with ID _55429_210_10626_1_1534467588527_7174143_640-304825-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 05:54:30,564 WARNING [train.py:1204] (2/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_187-221566-0_sp1.1 from training. Duration: 0.3909375 2023-10-02 05:54:33,753 WARNING [train.py:1204] (2/4) Exclude cut with ID _7606_210_4915_1_1528894253277_5080690_127-177761-0_sp0.9 from training. Duration: 0.711125 2023-10-02 05:54:36,910 WARNING [train.py:1204] (2/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_430-116139-0 from training. Duration: 0.85 2023-10-02 05:54:37,240 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.conv_module1.balancer2.min_positive, batch_count=773626.6666666666, ans=0.05 2023-10-02 05:54:38,405 WARNING [train.py:1204] (2/4) Exclude cut with ID _3675_210_4557_1_1526639934408_4182089_456-18305-0_sp0.9 from training. Duration: 0.8444375 2023-10-02 05:54:41,146 INFO [train.py:1046] (2/4) Epoch 22, batch 4500, loss[loss=0.1604, simple_loss=0.2264, pruned_loss=0.04723, over 19199.00 frames. ], tot_loss[loss=0.1732, simple_loss=0.2498, pruned_loss=0.04834, over 4736326.64 frames. ], batch size: 42, lr: 4.63e-03, grad_scale: 16.0 2023-10-02 05:54:41,456 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.conv_module1.balancer2.prob, batch_count=773693.3333333334, ans=0.125 2023-10-02 05:54:42,584 WARNING [train.py:1204] (2/4) Exclude cut with ID _18513_210_9206_1_1531618215223_3862620_402-5957-0_sp1.1 from training. Duration: 0.9 2023-10-02 05:54:42,678 WARNING [train.py:1204] (2/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_465-156500-0 from training. Duration: 0.72 2023-10-02 05:54:42,680 WARNING [train.py:1204] (2/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_714-310538-0 from training. Duration: 0.86 2023-10-02 05:54:45,906 WARNING [train.py:1204] (2/4) Exclude cut with ID _39411_210_5986_1_1533538825030_3343649_335-301187-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 05:54:50,288 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_41750_210_1960_1_1533520678_3948042_241-158616-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 05:54:50,503 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.balancer1.prob, batch_count=773693.3333333334, ans=0.125 2023-10-02 05:54:51,621 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_46013_210_7234_1_1533776362_3744032_50-141535-0_sp1.1 from training. Duration: 0.9 2023-10-02 05:54:51,690 WARNING [train.py:1204] (2/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_43-206757-0_sp0.9 from training. Duration: 0.8333125 2023-10-02 05:54:53,040 WARNING [train.py:1204] (2/4) Exclude cut with ID _22961_210_8254_1_1531976366672_4278180_172-276296-0_sp1.1 from training. Duration: 0.97275 2023-10-02 05:54:54,308 WARNING [train.py:1204] (2/4) Exclude cut with ID _17956_210_4353_1_1531209568125_6979970_1305-301903-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 05:54:54,356 WARNING [train.py:1204] (2/4) Exclude cut with ID _28323_210_16187_1_1532480395667_4100300_604-44507-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 05:55:00,161 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.0.layers.1.nonlin_attention.whiten2, num_groups=1, num_channels=192, metric=6.00 vs. limit=15.0 2023-10-02 05:55:01,021 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.5.encoder.layers.1.feed_forward3.out_whiten, num_groups=1, num_channels=256, metric=11.44 vs. limit=15.0 2023-10-02 05:55:08,628 WARNING [train.py:1204] (2/4) Exclude cut with ID _23514_210_1389_1_1531832139785_3031439_88-337544-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 05:55:10,012 WARNING [train.py:1204] (2/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_932-3672-0_sp1.1 from training. Duration: 0.963625 2023-10-02 05:55:11,571 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.balancer1.prob, batch_count=773826.6666666666, ans=0.125 2023-10-02 05:55:12,770 WARNING [train.py:1204] (2/4) Exclude cut with ID _46502_210_13794_1_1533724452198_3657159_207-109991-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 05:55:12,824 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_216-73841-0_sp1.1 from training. Duration: 0.87275 2023-10-02 05:55:14,196 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8453_210_4211_1_1528502940_8562730_347-330673-0_sp1.1 from training. Duration: 0.6545625 2023-10-02 05:55:18,883 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_4183_210_2087_1_1526729100_1869838_143-280962-0_sp1.1 from training. Duration: 0.5090625 2023-10-02 05:55:22,909 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.497e+02 1.844e+02 2.070e+02 2.423e+02 3.586e+02, threshold=4.140e+02, percent-clipped=0.0 2023-10-02 05:55:24,377 WARNING [train.py:1204] (2/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_325-113798-0_sp1.1 from training. Duration: 0.77275 2023-10-02 05:55:27,840 WARNING [train.py:1204] (2/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_91-212486-0_sp0.9 from training. Duration: 0.7555625 2023-10-02 05:55:30,586 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8776_210_2040_1_1528638837_4088036_244-278763-0_sp1.1 from training. Duration: 0.8181875 2023-10-02 05:55:30,616 WARNING [train.py:1204] (2/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_106-286130-0 from training. Duration: 0.98 2023-10-02 05:55:32,008 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_2746_210_1988_1_1525872816_3795627_372-205861-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 05:55:32,058 WARNING [train.py:1204] (2/4) Exclude cut with ID _30800_210_22382_1_1532499347904_4515959_240-54646-0_sp1.1 from training. Duration: 0.936375 2023-10-02 05:55:33,981 WARNING [train.py:1204] (2/4) Exclude cut with ID _59500_210_8254_1_1534809610680_3736009_162-180695-0_sp1.1 from training. Duration: 0.936375 2023-10-02 05:55:34,007 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7544_210_6753_1_1528591327_1471426_146-93357-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 05:55:35,635 WARNING [train.py:1204] (2/4) Exclude cut with ID _42657_210_2070_1_1533520654016_3327008_405-87432-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 05:55:35,655 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_4528_210_3273_1_1527076654_3276777_209-219804-0 from training. Duration: 0.69 2023-10-02 05:55:35,656 WARNING [train.py:1204] (2/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_78-182912-0_sp0.9 from training. Duration: 0.6555625 2023-10-02 05:55:35,663 WARNING [train.py:1204] (2/4) Exclude cut with ID _31255_210_13427_1_1532865619495_3815143_322-114998-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 05:55:40,208 WARNING [train.py:1204] (2/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_848-280957-0_sp0.9 from training. Duration: 0.9666875 2023-10-02 05:55:40,235 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7696_210_2040_1_1528207167_3716099_117-125434-0_sp0.9 from training. Duration: 0.8444375 2023-10-02 05:55:42,988 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_1002-109192-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 05:55:43,239 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=773960.0, ans=0.1 2023-10-02 05:55:47,479 WARNING [train.py:1204] (2/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_138-64210-0_sp0.9 from training. Duration: 0.811125 2023-10-02 05:55:47,493 WARNING [train.py:1204] (2/4) Exclude cut with ID _25808_210_13292_1_1532343514562_7366109_633-47524-0_sp1.1 from training. Duration: 0.9 2023-10-02 05:55:48,849 WARNING [train.py:1204] (2/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_297-5233-0 from training. Duration: 0.95 2023-10-02 05:55:50,314 WARNING [train.py:1204] (2/4) Exclude cut with ID _8394_210_6702_1_1528590504316_3540189_335-274780-0 from training. Duration: 0.78 2023-10-02 05:55:50,318 WARNING [train.py:1204] (2/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_301-330455-0 from training. Duration: 0.8 2023-10-02 05:55:53,214 WARNING [train.py:1204] (2/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_383-205833-0 from training. Duration: 0.95 2023-10-02 05:55:55,881 INFO [train.py:1046] (2/4) Epoch 22, batch 4550, loss[loss=0.1765, simple_loss=0.2627, pruned_loss=0.04513, over 24695.00 frames. ], tot_loss[loss=0.1729, simple_loss=0.2494, pruned_loss=0.04815, over 4735338.68 frames. ], batch size: 73, lr: 4.63e-03, grad_scale: 16.0 2023-10-02 05:55:55,987 WARNING [train.py:1204] (2/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_48-182155-0 from training. Duration: 0.87 2023-10-02 05:55:57,356 WARNING [train.py:1204] (2/4) Exclude cut with ID _14832_210_8750_1_1530691351539_3171348_1-54315-0_sp1.1 from training. Duration: 0.7909375 2023-10-02 05:56:00,662 WARNING [train.py:1204] (2/4) Exclude cut with ID _44982_210_1511_1_1534312820108_3726029_413-100814-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 05:56:01,973 WARNING [train.py:1204] (2/4) Exclude cut with ID _3231_210_2106_1_1526205864934_6784220_104-261392-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 05:56:05,464 WARNING [train.py:1204] (2/4) Exclude cut with ID _24314_210_4915_1_1532135078116_7170179_1-13588-0_sp1.1 from training. Duration: 0.97275 2023-10-02 05:56:11,509 WARNING [train.py:1204] (2/4) Exclude cut with ID _44304_210_15374_1_1533639352765_4613129_141-8356-0_sp1.1 from training. Duration: 0.963625 2023-10-02 05:56:12,968 WARNING [train.py:1204] (2/4) Exclude cut with ID _37892_210_19596_1_1533198677897_3964058_374-335034-0_sp1.1 from training. Duration: 0.936375 2023-10-02 05:56:15,751 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_195-257512-0_sp0.9 from training. Duration: 0.7444375 2023-10-02 05:56:15,753 WARNING [train.py:1204] (2/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_1267-272693-0_sp1.1 from training. Duration: 0.663625 2023-10-02 05:56:15,754 WARNING [train.py:1204] (2/4) Exclude cut with ID _30196_210_1664_1_1533708496522_7074140_690-326155-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 05:56:17,232 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_68847_210_14656_1_1536070214_2586843_38-20717-0_sp1.1 from training. Duration: 0.97275 2023-10-02 05:56:17,264 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_99-251314-0_sp1.1 from training. Duration: 0.7909375 2023-10-02 05:56:20,104 WARNING [train.py:1204] (2/4) Exclude cut with ID _49376_210_5986_1_1533985278732_3370740_225-95630-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 05:56:20,463 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.conv_module2.balancer2.prob, batch_count=774093.3333333334, ans=0.125 2023-10-02 05:56:23,271 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_2655_210_2087_1_1525688959_3897400_202-147557-0 from training. Duration: 0.85 2023-10-02 05:56:23,331 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_5618_210_1874_1_1527311008_5851_1-214501-0_sp1.1 from training. Duration: 0.626375 2023-10-02 05:56:23,388 WARNING [train.py:1204] (2/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_225-43381-0_sp1.1 from training. Duration: 0.8545625 2023-10-02 05:56:24,759 WARNING [train.py:1204] (2/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_242-3472-0 from training. Duration: 0.73 2023-10-02 05:56:29,341 WARNING [train.py:1204] (2/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_339-104791-0 from training. Duration: 0.49 2023-10-02 05:56:29,413 WARNING [train.py:1204] (2/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_488-92261-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 05:56:32,199 WARNING [train.py:1204] (2/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_175-163894-0 from training. Duration: 0.74 2023-10-02 05:56:34,251 WARNING [train.py:1204] (2/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_41-234353-0_sp0.9 from training. Duration: 0.7555625 2023-10-02 05:56:35,750 WARNING [train.py:1204] (2/4) Exclude cut with ID _47251_210_18628_1_1533814063128_4137250_116-275927-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 05:56:35,775 WARNING [train.py:1204] (2/4) Exclude cut with ID _4697_210_2069_1_1526815801953_3823098_61-223324-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 05:56:35,786 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6961_210_2087_1_1527937168_6815011_562-165518-0_sp1.1 from training. Duration: 0.7 2023-10-02 05:56:38,503 WARNING [train.py:1204] (2/4) Exclude cut with ID _21989_210_5854_1_1531899214931_3271360_133-44759-0 from training. Duration: 0.77 2023-10-02 05:56:41,612 WARNING [train.py:1204] (2/4) Exclude cut with ID _23557_210_8750_1_1531890106370_6846255_77-284923-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 05:56:44,306 WARNING [train.py:1204] (2/4) Exclude cut with ID _13678_210_7497_1_1530530069226_3463170_464-296127-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 05:56:44,319 WARNING [train.py:1204] (2/4) Exclude cut with ID _22613_210_5219_1_1532253599686_4090170_393-36663-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 05:56:45,719 WARNING [train.py:1204] (2/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_235-262124-0_sp0.9 from training. Duration: 0.7444375 2023-10-02 05:56:47,150 WARNING [train.py:1204] (2/4) Exclude cut with ID _8480_210_1874_1_1528589391205_6728330_755-112625-0 from training. Duration: 0.74 2023-10-02 05:56:47,201 WARNING [train.py:1204] (2/4) Exclude cut with ID _6456_210_1534_1_1527681634212_3181049_215-309359-0 from training. Duration: 0.88 2023-10-02 05:56:47,223 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_3019_210_2028_1_1526044245_3264865_191-72235-0_sp1.1 from training. Duration: 0.8818125 2023-10-02 05:56:47,331 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder_embed.conv.5.prob, batch_count=774226.6666666666, ans=0.125 2023-10-02 05:56:48,632 WARNING [train.py:1204] (2/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_380-162648-0 from training. Duration: 0.94 2023-10-02 05:56:48,799 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_92-46009-0 from training. Duration: 0.54 2023-10-02 05:56:50,136 WARNING [train.py:1204] (2/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_53-300544-0_sp0.9 from training. Duration: 0.7444375 2023-10-02 05:56:52,281 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_68235_210_1925_1_1535863247_4484656_101-250943-0_sp1.1 from training. Duration: 0.97275 2023-10-02 05:56:52,297 WARNING [train.py:1204] (2/4) Exclude cut with ID _14970_210_15709_1_1530775547361_1966965_204-22182-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 05:56:53,595 WARNING [train.py:1204] (2/4) Exclude cut with ID _55153_210_2253_1_1534384786677_3162096_310-130092-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 05:56:53,612 WARNING [train.py:1204] (2/4) Exclude cut with ID _60113_210_16271_1_1535164212481_4386079_559-167191-0_sp1.1 from training. Duration: 0.8454375 2023-10-02 05:56:55,445 WARNING [train.py:1204] (2/4) Exclude cut with ID _14368_210_4220_1_1530532714870_3595529_93-58002-0_sp1.1 from training. Duration: 0.5181875 2023-10-02 05:56:56,680 WARNING [train.py:1204] (2/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_110-224688-0 from training. Duration: 0.97 2023-10-02 05:56:56,806 WARNING [train.py:1204] (2/4) Exclude cut with ID _40402_210_18219_1_1533522324115_2953150_178-178004-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 05:56:56,815 WARNING [train.py:1204] (2/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_321-335475-0_sp1.1 from training. Duration: 0.4090625 2023-10-02 05:56:58,126 WARNING [train.py:1204] (2/4) Exclude cut with ID _66863_210_18053_1_1535698758334_8590319_1092-271784-0 from training. Duration: 0.96 2023-10-02 05:56:58,132 WARNING [train.py:1204] (2/4) Exclude cut with ID _28317_210_12742_1_1532480302381_8510325_607-213951-0_sp1.1 from training. Duration: 0.763625 2023-10-02 05:56:58,148 WARNING [train.py:1204] (2/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_468-283113-0 from training. Duration: 0.96 2023-10-02 05:57:02,823 WARNING [train.py:1204] (2/4) Exclude cut with ID _14922_210_6228_1_1530707435273_3693310_13-165512-0_sp1.1 from training. Duration: 0.7454375 2023-10-02 05:57:02,835 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_69630_210_7234_1_1535788670_910989_56-169762-0_sp1.1 from training. Duration: 0.92725 2023-10-02 05:57:04,248 WARNING [train.py:1204] (2/4) Exclude cut with ID _67335_210_5219_1_1535525975652_4275229_121-80357-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 05:57:05,684 WARNING [train.py:1204] (2/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_126-12079-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 05:57:05,711 WARNING [train.py:1204] (2/4) Exclude cut with ID _5294_210_1747_1_1527249643928_3696179_342-270278-0_sp1.1 from training. Duration: 0.5 2023-10-02 05:57:07,666 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_30322_210_1925_1_1532843478_826817_35-328264-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 05:57:09,112 WARNING [train.py:1204] (2/4) Exclude cut with ID _8480_210_1874_1_1528589391205_6728330_755-112625-0_sp1.1 from training. Duration: 0.67275 2023-10-02 05:57:11,879 INFO [train.py:1046] (2/4) Epoch 22, batch 4600, loss[loss=0.1466, simple_loss=0.1943, pruned_loss=0.04945, over 19019.00 frames. ], tot_loss[loss=0.1719, simple_loss=0.2476, pruned_loss=0.04811, over 4705229.21 frames. ], batch size: 388, lr: 4.63e-03, grad_scale: 16.0 2023-10-02 05:57:11,927 WARNING [train.py:1204] (2/4) Exclude cut with ID _38670_210_15379_1_1533448810659_4391059_132-146412-0_sp1.1 from training. Duration: 0.97275 2023-10-02 05:57:12,012 WARNING [train.py:1204] (2/4) Exclude cut with ID _22020_210_2461_1_1531724164513_6326900_563-39691-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 05:57:13,659 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.balancer1.prob, batch_count=774360.0, ans=0.125 2023-10-02 05:57:14,813 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_9460_210_4211_1_1528954587_10049667_572-37035-0_sp1.1 from training. Duration: 0.72725 2023-10-02 05:57:14,826 WARNING [train.py:1204] (2/4) Exclude cut with ID _68777_210_4353_1_1535810390731_3117904_129-166012-0_sp0.9 from training. Duration: 0.9444375 2023-10-02 05:57:14,898 WARNING [train.py:1204] (2/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_487-91921-0_sp1.1 from training. Duration: 0.963625 2023-10-02 05:57:16,255 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_195-257512-0 from training. Duration: 0.67 2023-10-02 05:57:18,884 WARNING [train.py:1204] (2/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_188-260011-0_sp1.1 from training. Duration: 0.663625 2023-10-02 05:57:23,496 WARNING [train.py:1204] (2/4) Exclude cut with ID _8480_210_1874_1_1528589391205_6728330_309-139896-0_sp0.9 from training. Duration: 0.9 2023-10-02 05:57:24,914 WARNING [train.py:1204] (2/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_866-321248-0_sp1.1 from training. Duration: 0.963625 2023-10-02 05:57:29,651 WARNING [train.py:1204] (2/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_510-216513-0_sp1.1 from training. Duration: 0.97275 2023-10-02 05:57:30,665 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.0.layers.1.conv_module2.whiten, num_groups=1, num_channels=192, metric=8.63 vs. limit=15.0 2023-10-02 05:57:36,081 WARNING [train.py:1204] (2/4) Exclude cut with ID _6653_210_2629_1_1527919172441_6732679_758-143677-0 from training. Duration: 0.98 2023-10-02 05:57:36,282 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.bypass_mid.scale_min, batch_count=774426.6666666666, ans=0.2 2023-10-02 05:57:37,418 WARNING [train.py:1204] (2/4) Exclude cut with ID _55134_210_21982_1_1534590028377_3882170_50-214427-0_sp1.1 from training. Duration: 0.97275 2023-10-02 05:57:41,977 WARNING [train.py:1204] (2/4) Exclude cut with ID _31649_210_6228_1_1532584608063_4091078_482-301330-0_sp1.1 from training. Duration: 0.97275 2023-10-02 05:57:44,746 WARNING [train.py:1204] (2/4) Exclude cut with ID _35799_210_5854_1_1533263487887_3529071_490-326847-0_sp1.1 from training. Duration: 0.9 2023-10-02 05:57:44,754 WARNING [train.py:1204] (2/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_984-90716-0_sp1.1 from training. Duration: 0.963625 2023-10-02 05:57:46,512 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=774493.3333333334, ans=0.1 2023-10-02 05:57:49,205 WARNING [train.py:1204] (2/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_166-246557-0 from training. Duration: 0.64 2023-10-02 05:57:49,205 WARNING [train.py:1204] (2/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_51-200220-0_sp1.1 from training. Duration: 0.5090625 2023-10-02 05:57:49,419 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.conv_skip_rate, batch_count=774493.3333333334, ans=0.0 2023-10-02 05:57:50,511 WARNING [train.py:1204] (2/4) Exclude cut with ID _51382_210_23862_1_1534492801838_3650104_275-92796-0_sp1.1 from training. Duration: 0.936375 2023-10-02 05:57:53,313 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.416e+02 1.872e+02 2.041e+02 2.269e+02 3.286e+02, threshold=4.083e+02, percent-clipped=0.0 2023-10-02 05:57:53,938 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.4.encoder.layers.2.whiten, num_groups=1, num_channels=384, metric=3.40 vs. limit=12.0 2023-10-02 05:57:56,204 WARNING [train.py:1204] (2/4) Exclude cut with ID _29853_210_7497_1_1532505600851_6642110_461-142829-0_sp1.1 from training. Duration: 0.97275 2023-10-02 05:57:56,259 WARNING [train.py:1204] (2/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_788-298907-0_sp0.9 from training. Duration: 0.988875 2023-10-02 05:57:57,670 WARNING [train.py:1204] (2/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_739-2098-0_sp1.1 from training. Duration: 0.836375 2023-10-02 05:58:01,289 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7543_210_6753_1_1528541907_1399322_165-323619-0 from training. Duration: 0.69 2023-10-02 05:58:04,350 WARNING [train.py:1204] (2/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_401-255491-0_sp1.1 from training. Duration: 0.636375 2023-10-02 05:58:07,809 INFO [scaling.py:1118] (2/4) WithLoss: name=encoder.encoders.0.layers.1.self_attn_weights, loss-sum=0.000e+00 2023-10-02 05:58:09,012 WARNING [train.py:1204] (2/4) Exclude cut with ID _66873_210_5517_1_1535454136506_4293370_24-143283-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 05:58:10,467 WARNING [train.py:1204] (2/4) Exclude cut with ID _14459_210_2228_1_1531443641946_7516700_1221-280439-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 05:58:13,160 WARNING [train.py:1204] (2/4) Exclude cut with ID _59202_210_26984_1_1534816810612_3549174_469-213482-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 05:58:13,168 WARNING [train.py:1204] (2/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_148-158509-0_sp0.9 from training. Duration: 0.4555625 2023-10-02 05:58:13,202 WARNING [train.py:1204] (2/4) Exclude cut with ID _24314_210_4915_1_1532135078116_7170179_863-259095-0_sp1.1 from training. Duration: 0.97275 2023-10-02 05:58:13,266 WARNING [train.py:1204] (2/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_321-22821-0 from training. Duration: 0.89 2023-10-02 05:58:14,978 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_43831_210_2158_1_1533725502_5872791_55-163137-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 05:58:15,025 WARNING [train.py:1204] (2/4) Exclude cut with ID _27430_210_7770_1_1532604689375_1378590_26-25207-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 05:58:15,136 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_17613_210_2151_1_1531137569_2366160_223-316461-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 05:58:16,532 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_53679_210_4852_1_1534556726_4186141_541-213370-0_sp1.1 from training. Duration: 0.963625 2023-10-02 05:58:17,920 WARNING [train.py:1204] (2/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_369-295086-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 05:58:17,972 WARNING [train.py:1204] (2/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_91-267696-0 from training. Duration: 0.87 2023-10-02 05:58:18,013 WARNING [train.py:1204] (2/4) Exclude cut with ID _5294_210_1747_1_1527249643928_3696179_180-100713-0 from training. Duration: 0.81 2023-10-02 05:58:19,288 WARNING [train.py:1204] (2/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_32-335103-0 from training. Duration: 0.82 2023-10-02 05:58:19,294 WARNING [train.py:1204] (2/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_133-312727-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 05:58:20,998 WARNING [train.py:1204] (2/4) Exclude cut with ID _59288_210_5854_1_1535077757774_3412246_391-165934-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 05:58:21,052 WARNING [train.py:1204] (2/4) Exclude cut with ID _28323_210_16187_1_1532480395667_4100300_488-1281-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 05:58:22,309 WARNING [train.py:1204] (2/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_124-193207-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 05:58:26,299 INFO [train.py:1046] (2/4) Epoch 22, batch 4650, loss[loss=0.1546, simple_loss=0.2371, pruned_loss=0.036, over 24674.00 frames. ], tot_loss[loss=0.1711, simple_loss=0.2469, pruned_loss=0.04768, over 4700830.41 frames. ], batch size: 65, lr: 4.63e-03, grad_scale: 16.0 2023-10-02 05:58:28,330 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.5.encoder.layers.1.feed_forward3.out_whiten, num_groups=1, num_channels=256, metric=9.48 vs. limit=15.0 2023-10-02 05:58:31,100 WARNING [train.py:1204] (2/4) Exclude cut with ID _14087_210_2253_1_1530529224512_3014029_226-242140-0_sp0.9 from training. Duration: 0.9 2023-10-02 05:58:32,508 WARNING [train.py:1204] (2/4) Exclude cut with ID _29277_210_3279_1_1532581109293_3173069_392-3694-0_sp1.1 from training. Duration: 0.936375 2023-10-02 05:58:34,372 WARNING [train.py:1204] (2/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_563-329842-0_sp1.1 from training. Duration: 0.97275 2023-10-02 05:58:34,414 WARNING [train.py:1204] (2/4) Exclude cut with ID _58583_210_5236_1_1534726835266_3574249_391-87755-0_sp1.1 from training. Duration: 0.92725 2023-10-02 05:58:34,451 WARNING [train.py:1204] (2/4) Exclude cut with ID _28427_210_4220_1_1532581240967_3061069_281-237827-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 05:58:34,473 WARNING [train.py:1204] (2/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_44-147898-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 05:58:37,598 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_24007_210_1794_1_1531918925_1663875_125-320772-0_sp1.1 from training. Duration: 0.97275 2023-10-02 05:58:40,343 WARNING [train.py:1204] (2/4) Exclude cut with ID _14368_210_4220_1_1530532714870_3595529_143-117421-0 from training. Duration: 0.58 2023-10-02 05:58:44,528 WARNING [train.py:1204] (2/4) Exclude cut with ID _69584_210_5219_1_1535785133775_4044450_325-7656-0_sp0.9 from training. Duration: 0.9555625 2023-10-02 05:58:47,886 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_37351_210_7925_1_1533012931_3812113_265-85780-0 from training. Duration: 0.88 2023-10-02 05:58:47,906 WARNING [train.py:1204] (2/4) Exclude cut with ID _18516_210_9206_1_1531704653206_3692406_331-237896-0_sp1.1 from training. Duration: 0.936375 2023-10-02 05:58:49,275 WARNING [train.py:1204] (2/4) Exclude cut with ID _14629_210_15709_1_1530604298182_956899_6-232695-0 from training. Duration: 0.94 2023-10-02 05:58:49,295 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7544_210_6753_1_1528587000_691468_87-178599-0_sp1.1 from training. Duration: 0.7909375 2023-10-02 05:58:49,343 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_326-303945-0 from training. Duration: 0.99 2023-10-02 05:58:49,364 WARNING [train.py:1204] (2/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_503-30719-0 from training. Duration: 0.98 2023-10-02 05:58:49,370 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_11305_210_1794_1_1529750428_7178148_299-344640-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 05:58:50,604 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_53071_210_16554_1_1534490405_2954066_265-170472-0_sp1.1 from training. Duration: 0.7545625 2023-10-02 05:58:53,509 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_174-277364-0_sp1.1 from training. Duration: 0.6454375 2023-10-02 05:58:54,771 WARNING [train.py:1204] (2/4) Exclude cut with ID _58414_210_8254_1_1535421609162_7151182_193-151884-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 05:58:54,798 WARNING [train.py:1204] (2/4) Exclude cut with ID IC0651W0104-322063 from training. Duration: 0.768 2023-10-02 05:58:57,713 WARNING [train.py:1204] (2/4) Exclude cut with ID _21881_210_3525_1_1531825242757_3798380_139-305982-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 05:59:00,348 WARNING [train.py:1204] (2/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_79-200392-0 from training. Duration: 0.97 2023-10-02 05:59:02,407 WARNING [train.py:1204] (2/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_451-280894-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 05:59:02,415 WARNING [train.py:1204] (2/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_304-273433-0_sp1.1 from training. Duration: 0.9 2023-10-02 05:59:03,765 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_5959_210_1794_1_1527400400_3445726_412-44052-0 from training. Duration: 0.77 2023-10-02 05:59:05,103 WARNING [train.py:1204] (2/4) Exclude cut with ID _4681_210_3525_1_1527251040321_1235713_148-26159-0_sp1.1 from training. Duration: 0.963625 2023-10-02 05:59:08,555 WARNING [train.py:1204] (2/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_887-249555-0_sp0.9 from training. Duration: 0.8555625 2023-10-02 05:59:11,280 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_25236_210_2158_1_1532051968_280567_37-6860-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 05:59:14,379 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.0.attention_skip_rate, batch_count=774893.3333333334, ans=0.0 2023-10-02 05:59:15,606 WARNING [train.py:1204] (2/4) Exclude cut with ID _29277_210_3279_1_1532581109293_3173069_417-115527-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 05:59:19,052 WARNING [train.py:1204] (2/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_552-134748-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 05:59:20,265 WARNING [train.py:1204] (2/4) Exclude cut with ID _43176_210_8254_1_1533524417469_4174339_188-174143-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 05:59:20,308 WARNING [train.py:1204] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_127-119109-0_sp1.1 from training. Duration: 0.6545625 2023-10-02 05:59:23,090 WARNING [train.py:1204] (2/4) Exclude cut with ID _14922_210_6228_1_1530707435273_3693310_38-187851-0 from training. Duration: 0.62 2023-10-02 05:59:23,135 WARNING [train.py:1204] (2/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_588-125165-0 from training. Duration: 0.83 2023-10-02 05:59:23,195 WARNING [train.py:1204] (2/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_344-152526-0_sp1.1 from training. Duration: 0.4818125 2023-10-02 05:59:23,197 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7002_210_1794_1_1527935634_4034444_487-236501-0 from training. Duration: 0.66 2023-10-02 05:59:24,595 WARNING [train.py:1204] (2/4) Exclude cut with ID _23825_210_13449_1_1531915313526_3497150_379-16981-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 05:59:27,634 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.feed_forward2.hidden_balancer.prob, batch_count=774960.0, ans=0.125 2023-10-02 05:59:28,896 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.feed_forward1.hidden_balancer.prob, batch_count=774960.0, ans=0.125 2023-10-02 05:59:31,691 WARNING [train.py:1204] (2/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_297-5233-0_sp1.1 from training. Duration: 0.863625 2023-10-02 05:59:31,699 WARNING [train.py:1204] (2/4) Exclude cut with ID _41298_210_9019_1_1533430371450_4822143_178-283538-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 05:59:33,002 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7046_210_6753_1_1527895337_6920917_642-106799-0 from training. Duration: 0.92 2023-10-02 05:59:33,023 WARNING [train.py:1204] (2/4) Exclude cut with ID _6274_210_2084_1_1527937583837_7494020_532-291952-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 05:59:34,416 WARNING [train.py:1204] (2/4) Exclude cut with ID _47278_210_12041_1_1533902418349_3576110_260-270468-0_sp1.1 from training. Duration: 0.92725 2023-10-02 05:59:34,431 WARNING [train.py:1204] (2/4) Exclude cut with ID _14368_210_4220_1_1530532714870_3595529_144-3287-0_sp0.9 from training. Duration: 0.7444375 2023-10-02 05:59:35,790 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_162-262717-0_sp0.9 from training. Duration: 0.92225 2023-10-02 05:59:36,352 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.5.encoder.layers.0.nonlin_attention.whiten1, num_groups=1, num_channels=192, metric=4.35 vs. limit=10.0 2023-10-02 05:59:39,044 WARNING [train.py:1204] (2/4) Exclude cut with ID _3465_210_3525_1_1526175667060_3574049_62-335552-0_sp0.9 from training. Duration: 0.9666875 2023-10-02 05:59:39,056 WARNING [train.py:1204] (2/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_726-272852-0_sp1.1 from training. Duration: 0.92725 2023-10-02 05:59:40,742 INFO [train.py:1046] (2/4) Epoch 22, batch 4700, loss[loss=0.1682, simple_loss=0.2541, pruned_loss=0.04116, over 24602.00 frames. ], tot_loss[loss=0.1716, simple_loss=0.2479, pruned_loss=0.04765, over 4713243.19 frames. ], batch size: 68, lr: 4.63e-03, grad_scale: 16.0 2023-10-02 05:59:40,805 WARNING [train.py:1204] (2/4) Exclude cut with ID _14898_210_9206_1_1530766812377_4177190_26-135492-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 05:59:43,597 WARNING [train.py:1204] (2/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_200-95304-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 05:59:43,632 WARNING [train.py:1204] (2/4) Exclude cut with ID _45012_210_12651_1_1533608435136_5371380_240-37101-0_sp1.1 from training. Duration: 0.8454375 2023-10-02 05:59:43,640 WARNING [train.py:1204] (2/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_132-269633-0_sp0.9 from training. Duration: 0.7555625 2023-10-02 05:59:43,935 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.conv_skip_rate, batch_count=775026.6666666666, ans=0.0 2023-10-02 05:59:45,081 WARNING [train.py:1204] (2/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_907-151398-0_sp0.9 from training. Duration: 0.411125 2023-10-02 05:59:45,162 WARNING [train.py:1204] (2/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_45-265684-0_sp0.9 from training. Duration: 0.8 2023-10-02 05:59:46,530 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6291_210_3273_1_1527683649_1078498_74-292608-0 from training. Duration: 0.72 2023-10-02 05:59:52,722 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.conv_module2.balancer2.prob, batch_count=775026.6666666666, ans=0.125 2023-10-02 05:59:53,940 WARNING [train.py:1204] (2/4) Exclude cut with ID _59202_210_26984_1_1534816810612_3549174_221-84819-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 05:59:55,316 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_33696_210_3852_1_1533082780_2663375_181-325595-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 05:59:55,354 WARNING [train.py:1204] (2/4) Exclude cut with ID _47251_210_18628_1_1533814063128_4137250_351-154105-0_sp1.1 from training. Duration: 0.963625 2023-10-02 05:59:56,734 WARNING [train.py:1204] (2/4) Exclude cut with ID _35501_210_9288_1_1532998507986_3192189_351-334740-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 05:59:56,908 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.conv_module1.balancer2.prob, batch_count=775093.3333333334, ans=0.125 2023-10-02 05:59:58,101 WARNING [train.py:1204] (2/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_342-100157-0_sp1.1 from training. Duration: 0.4909375 2023-10-02 06:00:00,358 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.0.layers.1.nonlin_attention.whiten1, num_groups=1, num_channels=144, metric=6.15 vs. limit=10.0 2023-10-02 06:00:02,828 WARNING [train.py:1204] (2/4) Exclude cut with ID _6789_210_1534_1_1527857937218_3118950_262-48853-0 from training. Duration: 0.56 2023-10-02 06:00:02,853 WARNING [train.py:1204] (2/4) Exclude cut with ID _69884_210_9771_1_1535846392818_4166169_430-201020-0 from training. Duration: 0.97 2023-10-02 06:00:05,534 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_71-136518-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 06:00:06,947 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_28-291982-0_sp1.1 from training. Duration: 0.8818125 2023-10-02 06:00:06,971 WARNING [train.py:1204] (2/4) Exclude cut with ID _54218_210_12210_1_1534413646388_4409259_437-275326-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 06:00:11,065 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.feed_forward2.hidden_balancer.prob, batch_count=775160.0, ans=0.125 2023-10-02 06:00:12,190 WARNING [train.py:1204] (2/4) Exclude cut with ID _55134_210_21982_1_1534590028377_3882170_40-309139-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 06:00:14,307 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.2.encoder.layers.2.conv_module2.whiten, num_groups=1, num_channels=384, metric=5.35 vs. limit=15.0 2023-10-02 06:00:17,809 WARNING [train.py:1204] (2/4) Exclude cut with ID _32704_210_8954_1_1533017488840_6747370_266-329755-0_sp1.1 from training. Duration: 0.8090625 2023-10-02 06:00:19,226 WARNING [train.py:1204] (2/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_21-174514-0_sp1.1 from training. Duration: 0.3909375 2023-10-02 06:00:21,238 WARNING [train.py:1204] (2/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_409-207378-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 06:00:22,597 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.530e+02 1.844e+02 2.076e+02 2.631e+02 3.750e+02, threshold=4.153e+02, percent-clipped=0.0 2023-10-02 06:00:26,760 WARNING [train.py:1204] (2/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_132-269633-0 from training. Duration: 0.68 2023-10-02 06:00:26,860 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_2245_210_2715_1_1525511548_3540254_310-52483-0_sp0.9 from training. Duration: 0.97775 2023-10-02 06:00:28,429 INFO [scaling.py:1118] (2/4) WithLoss: name=encoder.encoders.3.encoder.layers.2.self_attn_weights, loss-sum=0.000e+00 2023-10-02 06:00:29,576 WARNING [train.py:1204] (2/4) Exclude cut with ID _27430_210_7770_1_1532604689375_1378590_47-77403-0_sp1.1 from training. Duration: 0.92725 2023-10-02 06:00:33,614 WARNING [train.py:1204] (2/4) Exclude cut with ID _7562_210_2084_1_1528887767916_3946229_186-326422-0 from training. Duration: 0.84 2023-10-02 06:00:35,574 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_14056_210_4163_1_1530604483_7572827_230-35993-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 06:00:37,196 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.balancer2.prob, batch_count=775226.6666666666, ans=0.125 2023-10-02 06:00:38,413 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_23854_210_1794_1_1531969696_1329157_57-123491-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 06:00:40,181 WARNING [train.py:1204] (2/4) Exclude cut with ID _14747_210_8341_1_1530685797463_3654089_147-131229-0 from training. Duration: 0.76 2023-10-02 06:00:42,023 WARNING [train.py:1204] (2/4) Exclude cut with ID _30191_210_1664_1_1533189915047_7196670_430-19539-0_sp1.1 from training. Duration: 0.92725 2023-10-02 06:00:42,037 WARNING [train.py:1204] (2/4) Exclude cut with ID _67725_210_27767_1_1535608756671_5105359_44-50762-0_sp1.1 from training. Duration: 0.963625 2023-10-02 06:00:46,259 WARNING [train.py:1204] (2/4) Exclude cut with ID _27485_210_4916_1_1532739191677_3668269_217-161755-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 06:00:46,304 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_5931_210_2040_1_1527428481_4547999_321-193614-0_sp1.1 from training. Duration: 0.6090625 2023-10-02 06:00:46,327 WARNING [train.py:1204] (2/4) Exclude cut with ID _6267_210_2084_1_1527764658526_3894730_375-219887-0 from training. Duration: 0.67 2023-10-02 06:00:47,598 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9087W0022-538393 from training. Duration: 0.768 2023-10-02 06:00:48,972 WARNING [train.py:1204] (2/4) Exclude cut with ID _68777_210_4353_1_1535810390731_3117904_83-196459-0_sp1.1 from training. Duration: 0.963625 2023-10-02 06:00:49,246 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.feed_forward3.hidden_balancer.prob, batch_count=775293.3333333334, ans=0.125 2023-10-02 06:00:50,516 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.balancer_ff3.min_abs, batch_count=775293.3333333334, ans=0.2 2023-10-02 06:00:52,166 WARNING [train.py:1204] (2/4) Exclude cut with ID _69884_210_9771_1_1535846392818_4166169_455-343812-0_sp1.1 from training. Duration: 0.92725 2023-10-02 06:00:52,167 WARNING [train.py:1204] (2/4) Exclude cut with ID _13916_210_9994_1_1530788341126_3763149_414-320174-0_sp1.1 from training. Duration: 0.92725 2023-10-02 06:00:52,170 WARNING [train.py:1204] (2/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_398-239819-0 from training. Duration: 0.99 2023-10-02 06:00:52,251 WARNING [train.py:1204] (2/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_199-232314-0_sp1.1 from training. Duration: 0.92725 2023-10-02 06:00:54,892 INFO [train.py:1046] (2/4) Epoch 22, batch 4750, loss[loss=0.1741, simple_loss=0.2582, pruned_loss=0.04503, over 24651.00 frames. ], tot_loss[loss=0.172, simple_loss=0.2485, pruned_loss=0.04774, over 4722718.19 frames. ], batch size: 65, lr: 4.63e-03, grad_scale: 16.0 2023-10-02 06:00:55,189 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=775360.0, ans=0.1 2023-10-02 06:00:56,446 WARNING [train.py:1204] (2/4) Exclude cut with ID _2334_210_2667_1_1525608238880_4705129_459-104997-0 from training. Duration: 0.95 2023-10-02 06:00:59,189 WARNING [train.py:1204] (2/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_441-209395-0_sp1.1 from training. Duration: 0.936375 2023-10-02 06:01:00,556 WARNING [train.py:1204] (2/4) Exclude cut with ID _27052_210_5986_1_1532329197187_3585009_320-80393-0_sp1.1 from training. Duration: 0.97275 2023-10-02 06:01:02,327 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.0.balancer2.prob, batch_count=775360.0, ans=0.125 2023-10-02 06:01:03,509 WARNING [train.py:1204] (2/4) Exclude cut with ID _21783_210_11357_1_1531742458730_3762679_89-217253-0_sp1.1 from training. Duration: 0.97275 2023-10-02 06:01:03,532 WARNING [train.py:1204] (2/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_453-282588-0_sp1.1 from training. Duration: 0.8909375 2023-10-02 06:01:05,141 WARNING [train.py:1204] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_88-341718-0 from training. Duration: 0.66 2023-10-02 06:01:05,175 WARNING [train.py:1204] (2/4) Exclude cut with ID _43552_210_8913_1_1533726031059_3523429_230-3521-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 06:01:09,656 WARNING [train.py:1204] (2/4) Exclude cut with ID _9679_210_7497_1_1529129212906_4363929_512-313889-0 from training. Duration: 0.81 2023-10-02 06:01:09,766 WARNING [train.py:1204] (2/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_194-252800-0_sp1.1 from training. Duration: 0.8818125 2023-10-02 06:01:09,783 WARNING [train.py:1204] (2/4) Exclude cut with ID _55209_210_1531_1_1534675828996_4597160_239-293541-0_sp1.1 from training. Duration: 0.963625 2023-10-02 06:01:10,000 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.feed_forward1.out_proj.dropout_p, batch_count=775426.6666666666, ans=0.1 2023-10-02 06:01:11,717 WARNING [train.py:1204] (2/4) Exclude cut with ID _35799_210_5854_1_1533263487887_3529071_15-325131-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 06:01:12,009 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.ff3_skip_rate, batch_count=775426.6666666666, ans=0.0 2023-10-02 06:01:17,885 WARNING [train.py:1204] (2/4) Exclude cut with ID _14100_210_8341_1_1530529320613_3678069_279-55760-0 from training. Duration: 0.93 2023-10-02 06:01:19,571 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.conv_module2.balancer1.prob, batch_count=775426.6666666666, ans=0.125 2023-10-02 06:01:20,759 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_489-230703-0_sp1.1 from training. Duration: 0.77275 2023-10-02 06:01:24,079 WARNING [train.py:1204] (2/4) Exclude cut with ID _6694_210_4599_1_1527852637490_3965529_144-185051-0 from training. Duration: 0.96 2023-10-02 06:01:24,142 WARNING [train.py:1204] (2/4) Exclude cut with ID _28461_210_2253_1_1532771775189_3041299_1-228155-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 06:01:26,895 WARNING [train.py:1204] (2/4) Exclude cut with ID _32704_210_8954_1_1533017488840_6747370_686-252323-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 06:01:26,897 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_40896_210_15710_1_1533433956_7100802_1075-294633-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 06:01:26,916 WARNING [train.py:1204] (2/4) Exclude cut with ID _22083_210_8254_1_1531702862439_3678970_30-92879-0_sp1.1 from training. Duration: 0.97275 2023-10-02 06:01:29,473 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9245W0121-616888 from training. Duration: 0.896 2023-10-02 06:01:29,476 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6786_210_1388_1_1527839699_4194495_378-46070-0 from training. Duration: 0.72 2023-10-02 06:01:32,301 WARNING [train.py:1204] (2/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_317-175062-0 from training. Duration: 0.82 2023-10-02 06:01:33,828 WARNING [train.py:1204] (2/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_238-89390-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 06:01:34,680 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.attention_skip_rate, batch_count=775493.3333333334, ans=0.0 2023-10-02 06:01:37,082 WARNING [train.py:1204] (2/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_524-310811-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 06:01:38,621 WARNING [train.py:1204] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_307-158462-0_sp0.9 from training. Duration: 0.7666875 2023-10-02 06:01:38,622 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9309W0191-640318 from training. Duration: 0.512 2023-10-02 06:01:38,626 WARNING [train.py:1204] (2/4) Exclude cut with ID _55153_210_2253_1_1534384786677_3162096_100-204617-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 06:01:40,152 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.bypass.skip_rate, batch_count=775560.0, ans=0.09899494936611666 2023-10-02 06:01:41,415 WARNING [train.py:1204] (2/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_112-149213-0_sp1.1 from training. Duration: 0.863625 2023-10-02 06:01:46,228 WARNING [train.py:1204] (2/4) Exclude cut with ID _67831_210_12392_1_1535612464202_3092612_346-2148-0_sp1.1 from training. Duration: 0.8454375 2023-10-02 06:01:48,872 WARNING [train.py:1204] (2/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_51-117576-0_sp0.9 from training. Duration: 0.4444375 2023-10-02 06:01:48,904 WARNING [train.py:1204] (2/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_300-27235-0 from training. Duration: 0.87 2023-10-02 06:01:50,234 WARNING [train.py:1204] (2/4) Exclude cut with ID _40399_210_4353_1_1533305658194_3279406_195-116825-0_sp1.1 from training. Duration: 0.97275 2023-10-02 06:01:50,263 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7002_210_1794_1_1527939683_5157286_163-87552-0_sp0.9 from training. Duration: 0.9555625 2023-10-02 06:01:52,319 WARNING [train.py:1204] (2/4) Exclude cut with ID _19514_210_9259_1_1531530071330_5084230_560-237597-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 06:01:52,387 WARNING [train.py:1204] (2/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_41-234353-0_sp1.1 from training. Duration: 0.6181875 2023-10-02 06:01:52,404 WARNING [train.py:1204] (2/4) Exclude cut with ID _6274_210_2084_1_1527937583837_7494020_769-168302-0 from training. Duration: 0.71 2023-10-02 06:01:55,100 WARNING [train.py:1204] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_574-45541-0 from training. Duration: 0.65 2023-10-02 06:01:57,975 WARNING [train.py:1204] (2/4) Exclude cut with ID _50310_210_15488_1_1534068043345_3641080_262-67120-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 06:01:59,406 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_53071_210_16554_1_1534493434_1276796_35-11927-0_sp1.1 from training. Duration: 0.963625 2023-10-02 06:01:59,407 WARNING [train.py:1204] (2/4) Exclude cut with ID _3992_210_1747_1_1526644816031_3731060_107-251434-0 from training. Duration: 0.99 2023-10-02 06:01:59,444 WARNING [train.py:1204] (2/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_412-54488-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 06:02:00,894 WARNING [train.py:1204] (2/4) Exclude cut with ID _18516_210_9206_1_1531704653206_3692406_77-258490-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 06:02:02,454 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_40047_210_2290_1_1533290519_2374311_195-165693-0_sp0.9 from training. Duration: 0.988875 2023-10-02 06:02:02,516 WARNING [train.py:1204] (2/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_296-36971-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 06:02:02,740 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.balancer1.prob, batch_count=775626.6666666666, ans=0.125 2023-10-02 06:02:03,833 WARNING [train.py:1204] (2/4) Exclude cut with ID _14970_210_15709_1_1530773543493_1715730_187-20259-0_sp1.1 from training. Duration: 0.8090625 2023-10-02 06:02:05,425 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.conv_skip_rate, batch_count=775626.6666666666, ans=0.0 2023-10-02 06:02:07,151 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_53373_210_5399_1_1534229471_4715695_135-305927-0_sp1.1 from training. Duration: 0.936375 2023-10-02 06:02:08,370 WARNING [train.py:1204] (2/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_60-18123-0 from training. Duration: 0.88 2023-10-02 06:02:08,446 WARNING [train.py:1204] (2/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_166-166990-0 from training. Duration: 0.96 2023-10-02 06:02:09,631 INFO [train.py:1046] (2/4) Epoch 22, batch 4800, loss[loss=0.1786, simple_loss=0.2561, pruned_loss=0.05051, over 23243.00 frames. ], tot_loss[loss=0.1735, simple_loss=0.25, pruned_loss=0.04851, over 4721935.96 frames. ], batch size: 93, lr: 4.63e-03, grad_scale: 32.0 2023-10-02 06:02:09,765 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_5438_210_1794_1_1527336687_2600009_233-58072-0 from training. Duration: 0.77 2023-10-02 06:02:12,632 WARNING [train.py:1204] (2/4) Exclude cut with ID _8833_210_5219_1_1528592665210_7553059_611-80313-0_sp0.9 from training. Duration: 0.711125 2023-10-02 06:02:14,507 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_53555_210_5632_1_1534595220_7232297_118-43239-0_sp1.1 from training. Duration: 0.936375 2023-10-02 06:02:15,932 WARNING [train.py:1204] (2/4) Exclude cut with ID _12369_210_4381_1_1530356078581_4388520_480-13146-0 from training. Duration: 0.85 2023-10-02 06:02:19,714 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.feed_forward1.hidden_balancer.prob, batch_count=775693.3333333334, ans=0.125 2023-10-02 06:02:20,910 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_892-28814-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 06:02:20,956 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_24899_210_16554_1_1532046422_3827462_160-21255-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 06:02:25,783 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_154-316450-0_sp1.1 from training. Duration: 0.6909375 2023-10-02 06:02:27,187 WARNING [train.py:1204] (2/4) Exclude cut with ID _30196_210_1664_1_1533708496522_7074140_1225-301232-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 06:02:27,212 WARNING [train.py:1204] (2/4) Exclude cut with ID _28783_210_7943_1_1532671164808_7155479_414-208100-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 06:02:27,269 WARNING [train.py:1204] (2/4) Exclude cut with ID _19023_210_4353_1_1531378892552_5820780_83-254794-0 from training. Duration: 0.86 2023-10-02 06:02:28,606 WARNING [train.py:1204] (2/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_591-83752-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 06:02:28,647 WARNING [train.py:1204] (2/4) Exclude cut with ID _29877_210_6228_1_1532397791106_5276149_266-62291-0_sp1.1 from training. Duration: 0.7909375 2023-10-02 06:02:29,927 WARNING [train.py:1204] (2/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_280-161633-0_sp1.1 from training. Duration: 0.663625 2023-10-02 06:02:35,355 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7543_210_6753_1_1528539468_333764_39-156634-0_sp1.1 from training. Duration: 0.97275 2023-10-02 06:02:36,748 WARNING [train.py:1204] (2/4) Exclude cut with ID _67499_210_17282_1_1535509805220_3692430_505-312248-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 06:02:36,780 WARNING [train.py:1204] (2/4) Exclude cut with ID _5294_210_1747_1_1527249643928_3696179_180-100713-0_sp0.9 from training. Duration: 0.9 2023-10-02 06:02:38,168 WARNING [train.py:1204] (2/4) Exclude cut with ID _26528_210_9070_1_1532570410842_3851310_508-148132-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 06:02:38,185 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_211-339622-0_sp1.1 from training. Duration: 0.4090625 2023-10-02 06:02:38,197 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_40896_210_15710_1_1533433956_7100802_339-138356-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 06:02:40,293 WARNING [train.py:1204] (2/4) Exclude cut with ID _27060_210_5986_1_1532674838922_3522549_211-19933-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 06:02:41,704 WARNING [train.py:1204] (2/4) Exclude cut with ID _60229_210_27767_1_1534838406158_3655350_457-117923-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 06:02:44,246 WARNING [train.py:1204] (2/4) Exclude cut with ID _55429_210_10626_1_1534467588527_7174143_460-141439-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 06:02:44,867 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.4.encoder.layers.0.self_attn_weights.whiten_keys, num_groups=4, num_channels=128, metric=1.96 vs. limit=6.0 2023-10-02 06:02:48,163 WARNING [train.py:1204] (2/4) Exclude cut with ID _59170_210_6721_1_1534983633960_4203335_263-10780-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 06:02:48,180 WARNING [train.py:1204] (2/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_872-264525-0_sp0.9 from training. Duration: 0.72225 2023-10-02 06:02:49,508 WARNING [train.py:1204] (2/4) Exclude cut with ID _7624_210_1531_1_1528109802198_4933101_418-349834-0_sp0.9 from training. Duration: 0.6666875 2023-10-02 06:02:52,261 WARNING [train.py:1204] (2/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_27-53998-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 06:02:52,393 WARNING [train.py:1204] (2/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_202-183922-0 from training. Duration: 0.85 2023-10-02 06:02:54,241 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.557e+02 1.869e+02 2.092e+02 2.385e+02 4.135e+02, threshold=4.184e+02, percent-clipped=0.0 2023-10-02 06:02:54,314 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7002_210_1794_1_1527939683_5157286_1-281264-0 from training. Duration: 0.44 2023-10-02 06:02:54,382 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_53753_210_7234_1_1534320003_3594148_493-9314-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 06:02:54,400 WARNING [train.py:1204] (2/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_217-95056-0_sp1.1 from training. Duration: 0.8909375 2023-10-02 06:02:55,699 WARNING [train.py:1204] (2/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_406-45086-0_sp1.1 from training. Duration: 0.72725 2023-10-02 06:02:55,705 WARNING [train.py:1204] (2/4) Exclude cut with ID _31649_210_6228_1_1532584608063_4091078_505-213821-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 06:02:55,712 WARNING [train.py:1204] (2/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_317-175062-0_sp0.9 from training. Duration: 0.911125 2023-10-02 06:02:57,081 WARNING [train.py:1204] (2/4) Exclude cut with ID _5444_210_2151_1_1527418808464_5312210_726-27165-0_sp1.1 from training. Duration: 0.6090625 2023-10-02 06:02:57,131 WARNING [train.py:1204] (2/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_494-170437-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 06:03:01,316 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_474-64054-0_sp1.1 from training. Duration: 0.936375 2023-10-02 06:03:04,301 WARNING [train.py:1204] (2/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_521-7556-0_sp1.1 from training. Duration: 0.97275 2023-10-02 06:03:04,414 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_269-348505-0_sp1.1 from training. Duration: 0.92725 2023-10-02 06:03:08,636 WARNING [train.py:1204] (2/4) Exclude cut with ID _21878_210_3525_1_1531738772002_7264330_559-184862-0 from training. Duration: 0.94 2023-10-02 06:03:10,014 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_68702_210_23689_1_1535799577_3420068_92-76040-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 06:03:10,049 WARNING [train.py:1204] (2/4) Exclude cut with ID _22826_210_9994_1_1531908201522_3279169_227-102052-0_sp1.1 from training. Duration: 0.97275 2023-10-02 06:03:10,083 WARNING [train.py:1204] (2/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_718-250907-0_sp0.9 from training. Duration: 0.7666875 2023-10-02 06:03:10,136 WARNING [train.py:1204] (2/4) Exclude cut with ID _14069_210_5632_1_1530512692033_4803390_352-143891-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 06:03:14,718 WARNING [train.py:1204] (2/4) Exclude cut with ID _6267_210_2084_1_1527764658526_3894730_65-106602-0_sp1.1 from training. Duration: 0.936375 2023-10-02 06:03:16,141 WARNING [train.py:1204] (2/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_202-183922-0_sp0.9 from training. Duration: 0.9444375 2023-10-02 06:03:16,148 WARNING [train.py:1204] (2/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_465-268827-0_sp1.1 from training. Duration: 0.97275 2023-10-02 06:03:16,213 WARNING [train.py:1204] (2/4) Exclude cut with ID _12369_210_4381_1_1530356078581_4388520_58-29064-0_sp1.1 from training. Duration: 0.9 2023-10-02 06:03:16,243 WARNING [train.py:1204] (2/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_1-99680-0_sp0.9 from training. Duration: 0.7444375 2023-10-02 06:03:18,265 WARNING [train.py:1204] (2/4) Exclude cut with ID _13253_210_1925_1_1530757775819_3577873_191-144977-0_sp1.1 from training. Duration: 0.7090625 2023-10-02 06:03:22,520 WARNING [train.py:1204] (2/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_276-112684-0_sp1.1 from training. Duration: 0.92725 2023-10-02 06:03:22,528 WARNING [train.py:1204] (2/4) Exclude cut with ID _64639_210_5816_1_1535367619437_3528000_358-411-0_sp1.1 from training. Duration: 0.97275 2023-10-02 06:03:22,540 WARNING [train.py:1204] (2/4) Exclude cut with ID _31649_210_6228_1_1532584608063_4091078_106-21199-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 06:03:22,758 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=775960.0, ans=0.1 2023-10-02 06:03:25,284 WARNING [train.py:1204] (2/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_901-79438-0 from training. Duration: 0.94 2023-10-02 06:03:26,632 INFO [train.py:1046] (2/4) Epoch 22, batch 4850, loss[loss=0.1713, simple_loss=0.2476, pruned_loss=0.04748, over 24490.00 frames. ], tot_loss[loss=0.174, simple_loss=0.2506, pruned_loss=0.04866, over 4700778.67 frames. ], batch size: 63, lr: 4.63e-03, grad_scale: 16.0 2023-10-02 06:03:26,789 WARNING [train.py:1204] (2/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_718-250907-0 from training. Duration: 0.69 2023-10-02 06:03:26,801 WARNING [train.py:1204] (2/4) Exclude cut with ID _5287_210_2084_1_1527418883040_3878670_363-194161-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 06:03:26,805 WARNING [train.py:1204] (2/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_586-321321-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 06:03:28,194 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_68900_210_2087_1_1535713157_3708529_164-292661-0_sp1.1 from training. Duration: 0.963625 2023-10-02 06:03:28,196 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_26433_210_12108_1_1532157158_4056480_114-144878-0_sp1.1 from training. Duration: 0.97275 2023-10-02 06:03:31,111 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_15517_210_6753_1_1530954008_3487336_96-341874-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 06:03:35,374 WARNING [train.py:1204] (2/4) Exclude cut with ID _14268_210_9206_1_1530622771285_4714099_637-86630-0 from training. Duration: 0.51 2023-10-02 06:03:38,038 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_28211_210_6388_1_1532861506_4523224_121-320092-0_sp1.1 from training. Duration: 0.92725 2023-10-02 06:03:42,687 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_141-89113-0_sp0.9 from training. Duration: 0.9555625 2023-10-02 06:03:42,875 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.1.conv_module1.balancer2.prob, batch_count=776093.3333333334, ans=0.125 2023-10-02 06:03:44,096 WARNING [train.py:1204] (2/4) Exclude cut with ID _6789_210_1534_1_1527857937218_3118950_311-61262-0_sp1.1 from training. Duration: 0.5909375 2023-10-02 06:03:44,137 WARNING [train.py:1204] (2/4) Exclude cut with ID _58480_210_12392_1_1534811121111_3646160_389-200461-0_sp1.1 from training. Duration: 0.97275 2023-10-02 06:03:48,330 WARNING [train.py:1204] (2/4) Exclude cut with ID _41326_210_5854_1_1533778076509_3492080_28-156553-0_sp1.1 from training. Duration: 0.92725 2023-10-02 06:03:50,218 WARNING [train.py:1204] (2/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_577-287150-0_sp1.1 from training. Duration: 0.7454375 2023-10-02 06:03:51,600 WARNING [train.py:1204] (2/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_40-129756-0_sp1.1 from training. Duration: 0.736375 2023-10-02 06:03:51,612 WARNING [train.py:1204] (2/4) Exclude cut with ID _60113_210_16271_1_1535164212481_4386079_490-280290-0 from training. Duration: 0.98 2023-10-02 06:03:56,460 WARNING [train.py:1204] (2/4) Exclude cut with ID _58059_210_5854_1_1534901471149_3164808_34-39905-0_sp1.1 from training. Duration: 0.963625 2023-10-02 06:03:57,957 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_791-197891-0_sp1.1 from training. Duration: 0.763625 2023-10-02 06:03:57,972 WARNING [train.py:1204] (2/4) Exclude cut with ID _6789_210_1534_1_1527857937218_3118950_262-48853-0_sp1.1 from training. Duration: 0.5090625 2023-10-02 06:03:58,183 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.feed_forward3.hidden_balancer.prob, batch_count=776160.0, ans=0.125 2023-10-02 06:03:59,277 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_2655_210_2087_1_1525688959_3897400_52-279899-0_sp0.9 from training. Duration: 0.7666875 2023-10-02 06:03:59,281 WARNING [train.py:1204] (2/4) Exclude cut with ID _14884_210_2252_1_1530707459689_5561250_114-201399-0 from training. Duration: 0.76 2023-10-02 06:03:59,538 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=776160.0, ans=0.1 2023-10-02 06:04:00,779 WARNING [train.py:1204] (2/4) Exclude cut with ID _19023_210_4353_1_1531378892552_5820780_83-254794-0_sp0.9 from training. Duration: 0.9555625 2023-10-02 06:04:00,794 WARNING [train.py:1204] (2/4) Exclude cut with ID _53489_210_6228_1_1534298357169_3835970_10-297919-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 06:04:04,995 WARNING [train.py:1204] (2/4) Exclude cut with ID _44585_210_18628_1_1533605350387_4518199_418-3055-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 06:04:05,008 WARNING [train.py:1204] (2/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_310-109461-0 from training. Duration: 0.8 2023-10-02 06:04:06,375 WARNING [train.py:1204] (2/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_235-262124-0 from training. Duration: 0.67 2023-10-02 06:04:07,784 WARNING [train.py:1204] (2/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_195-167011-0_sp1.1 from training. Duration: 0.6818125 2023-10-02 06:04:14,074 WARNING [train.py:1204] (2/4) Exclude cut with ID _7852_210_5219_1_1528286471316_8139400_188-55162-0_sp1.1 from training. Duration: 0.936375 2023-10-02 06:04:14,114 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_35361_210_4238_1_1533262810_7793476_675-87012-0 from training. Duration: 0.89 2023-10-02 06:04:15,512 WARNING [train.py:1204] (2/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_465-81427-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 06:04:15,519 WARNING [train.py:1204] (2/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_597-154640-0_sp1.1 from training. Duration: 0.8181875 2023-10-02 06:04:18,181 WARNING [train.py:1204] (2/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_186-309161-0_sp0.9 from training. Duration: 0.6 2023-10-02 06:04:20,167 WARNING [train.py:1204] (2/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_546-119312-0 from training. Duration: 0.99 2023-10-02 06:04:20,172 WARNING [train.py:1204] (2/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_330-240060-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 06:04:21,996 WARNING [train.py:1204] (2/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_477-255782-0 from training. Duration: 0.82 2023-10-02 06:04:22,015 WARNING [train.py:1204] (2/4) Exclude cut with ID _14970_210_15709_1_1530773543493_1715730_150-248939-0_sp1.1 from training. Duration: 0.97275 2023-10-02 06:04:23,368 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_556-206615-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 06:04:23,444 WARNING [train.py:1204] (2/4) Exclude cut with ID _3465_210_3525_1_1526175667060_3574049_308-231735-0 from training. Duration: 0.73 2023-10-02 06:04:30,947 WARNING [train.py:1204] (2/4) Exclude cut with ID _27485_210_4916_1_1532739191677_3668269_66-236571-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 06:04:37,755 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_2221_210_1988_1_1525597051_4935541_415-296882-0_sp0.9 from training. Duration: 0.8555625 2023-10-02 06:04:37,762 WARNING [train.py:1204] (2/4) Exclude cut with ID _64521_210_14699_1_1535243427259_3815328_231-78630-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 06:04:39,355 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.bypass.skip_rate, batch_count=776360.0, ans=0.07 2023-10-02 06:04:40,414 INFO [train.py:1046] (2/4) Epoch 22, batch 4900, loss[loss=0.1736, simple_loss=0.2238, pruned_loss=0.06169, over 19220.00 frames. ], tot_loss[loss=0.1725, simple_loss=0.2484, pruned_loss=0.04827, over 4699460.67 frames. ], batch size: 388, lr: 4.63e-03, grad_scale: 16.0 2023-10-02 06:04:42,000 WARNING [train.py:1204] (2/4) Exclude cut with ID _13942_210_9988_1_1530604906036_3733360_499-55676-0 from training. Duration: 0.62 2023-10-02 06:04:43,917 WARNING [train.py:1204] (2/4) Exclude cut with ID _49376_210_5986_1_1533985278732_3370740_301-154763-0_sp1.1 from training. Duration: 0.8909375 2023-10-02 06:04:49,263 WARNING [train.py:1204] (2/4) Exclude cut with ID _27485_210_4916_1_1532739191677_3668269_255-216698-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 06:04:49,496 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.ff3_skip_rate, batch_count=776360.0, ans=0.0 2023-10-02 06:04:51,205 WARNING [train.py:1204] (2/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_137-334143-0_sp1.1 from training. Duration: 0.97275 2023-10-02 06:04:51,230 WARNING [train.py:1204] (2/4) Exclude cut with ID _6456_210_1534_1_1527681634212_3181049_215-309359-0_sp1.1 from training. Duration: 0.8 2023-10-02 06:04:53,973 WARNING [train.py:1204] (2/4) Exclude cut with ID _65640_210_6310_1_1535368201445_3173250_209-13008-0 from training. Duration: 0.99 2023-10-02 06:04:54,284 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.nonlin_attention.balancer.prob, batch_count=776426.6666666666, ans=0.125 2023-10-02 06:04:59,494 WARNING [train.py:1204] (2/4) Exclude cut with ID _12369_210_4381_1_1530356078581_4388520_58-29064-0 from training. Duration: 0.99 2023-10-02 06:05:03,471 WARNING [train.py:1204] (2/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_410-287857-0 from training. Duration: 0.93 2023-10-02 06:05:03,545 WARNING [train.py:1204] (2/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_153-153537-0 from training. Duration: 0.91 2023-10-02 06:05:03,583 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7544_210_6753_1_1528587722_3576686_172-171750-0_sp0.9 from training. Duration: 0.72225 2023-10-02 06:05:03,609 WARNING [train.py:1204] (2/4) Exclude cut with ID _64521_210_14699_1_1535243427259_3815328_464-183853-0_sp1.1 from training. Duration: 0.97275 2023-10-02 06:05:03,800 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.feed_forward1.out_proj.dropout_p, batch_count=776426.6666666666, ans=0.1 2023-10-02 06:05:04,945 WARNING [train.py:1204] (2/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_385-210308-0_sp1.1 from training. Duration: 0.963625 2023-10-02 06:05:04,955 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_46013_210_7234_1_1533776362_3744032_354-261865-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 06:05:04,968 WARNING [train.py:1204] (2/4) Exclude cut with ID _3067_210_2667_1_1526212974640_4503168_250-285937-0_sp1.1 from training. Duration: 0.72725 2023-10-02 06:05:05,040 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_40036_210_13405_1_1533648647_1315388_48-146016-0 from training. Duration: 0.93 2023-10-02 06:05:07,840 WARNING [train.py:1204] (2/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_280-161633-0 from training. Duration: 0.73 2023-10-02 06:05:09,314 WARNING [train.py:1204] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_308-161067-0_sp0.9 from training. Duration: 0.7333125 2023-10-02 06:05:10,677 WARNING [train.py:1204] (2/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_68-212801-0_sp0.9 from training. Duration: 0.811125 2023-10-02 06:05:11,898 WARNING [train.py:1204] (2/4) Exclude cut with ID _6789_210_1534_1_1527857937218_3118950_311-61262-0_sp0.9 from training. Duration: 0.72225 2023-10-02 06:05:13,929 WARNING [train.py:1204] (2/4) Exclude cut with ID _14632_210_9988_1_1530691279392_3923200_102-204477-0_sp1.1 from training. Duration: 0.8818125 2023-10-02 06:05:15,224 WARNING [train.py:1204] (2/4) Exclude cut with ID _65242_210_18628_1_1535369393717_3823149_290-117517-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 06:05:16,649 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_69630_210_7234_1_1535788670_910989_35-337140-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 06:05:16,659 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_5618_210_1874_1_1527311008_5851_1-214501-0 from training. Duration: 0.689 2023-10-02 06:05:16,888 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.bypass.scale_min, batch_count=776493.3333333334, ans=0.2 2023-10-02 06:05:18,175 WARNING [train.py:1204] (2/4) Exclude cut with ID _8064_210_5399_1_1528372299291_4282973_168-50337-0_sp0.9 from training. Duration: 0.9333125 2023-10-02 06:05:18,254 WARNING [train.py:1204] (2/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_185-14438-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 06:05:18,267 WARNING [train.py:1204] (2/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_61-327062-0 from training. Duration: 0.81 2023-10-02 06:05:18,272 WARNING [train.py:1204] (2/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_98-741-0 from training. Duration: 0.8 2023-10-02 06:05:24,335 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.396e+02 1.841e+02 1.995e+02 2.206e+02 2.989e+02, threshold=3.989e+02, percent-clipped=0.0 2023-10-02 06:05:24,463 WARNING [train.py:1204] (2/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_312-204798-0 from training. Duration: 0.93 2023-10-02 06:05:26,433 WARNING [train.py:1204] (2/4) Exclude cut with ID _14998_210_5219_1_1530705582080_7474970_787-294416-0_sp0.9 from training. Duration: 0.911125 2023-10-02 06:05:26,508 WARNING [train.py:1204] (2/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_140-181176-0_sp1.1 from training. Duration: 0.836375 2023-10-02 06:05:27,832 WARNING [train.py:1204] (2/4) Exclude cut with ID _14687_210_5854_1_1530703838259_3664889_115-239046-0_sp1.1 from training. Duration: 0.6090625 2023-10-02 06:05:27,888 WARNING [train.py:1204] (2/4) Exclude cut with ID _56423_210_5854_1_1534896061049_3165969_370-230614-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 06:05:27,935 WARNING [train.py:1204] (2/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_406-275649-0_sp0.9 from training. Duration: 0.4666875 2023-10-02 06:05:29,920 WARNING [train.py:1204] (2/4) Exclude cut with ID _15150_210_8341_1_1530752411803_7256384_22-138274-0_sp1.1 from training. Duration: 0.87275 2023-10-02 06:05:29,962 WARNING [train.py:1204] (2/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_125-125818-0 from training. Duration: 0.96 2023-10-02 06:05:31,490 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_4399_210_1874_1_1526705485_2400757_220-47974-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 06:05:34,147 WARNING [train.py:1204] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_308-161067-0_sp1.1 from training. Duration: 0.6 2023-10-02 06:05:34,377 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.balancer1.prob, batch_count=776560.0, ans=0.125 2023-10-02 06:05:35,567 WARNING [train.py:1204] (2/4) Exclude cut with ID _53489_210_6228_1_1534298357169_3835970_102-57645-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 06:05:38,207 WARNING [train.py:1204] (2/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_178-177582-0 from training. Duration: 0.74 2023-10-02 06:05:38,268 WARNING [train.py:1204] (2/4) Exclude cut with ID _14349_210_8341_1_1530615710818_3442470_170-336541-0_sp1.1 from training. Duration: 0.7818125 2023-10-02 06:05:39,630 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_521-214276-0_sp0.9 from training. Duration: 0.488875 2023-10-02 06:05:39,672 WARNING [train.py:1204] (2/4) Exclude cut with ID _66070_210_7640_1_1535441393096_7805310_489-292631-0 from training. Duration: 0.79 2023-10-02 06:05:45,768 WARNING [train.py:1204] (2/4) Exclude cut with ID _14069_210_5632_1_1530512692033_4803390_438-340023-0_sp1.1 from training. Duration: 0.936375 2023-10-02 06:05:45,929 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.1.conv_skip_rate, batch_count=776626.6666666666, ans=0.0 2023-10-02 06:05:47,069 WARNING [train.py:1204] (2/4) Exclude cut with ID _7852_210_5219_1_1528286471316_8139400_242-105336-0_sp1.1 from training. Duration: 0.6545625 2023-10-02 06:05:48,547 WARNING [train.py:1204] (2/4) Exclude cut with ID _3867_210_1883_1_1526641200575_4475830_272-215563-0 from training. Duration: 0.57 2023-10-02 06:05:48,670 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.1.ff2_skip_rate, batch_count=776626.6666666666, ans=0.0 2023-10-02 06:05:49,872 WARNING [train.py:1204] (2/4) Exclude cut with ID _14368_210_4220_1_1530532714870_3595529_143-117421-0_sp0.9 from training. Duration: 0.6444375 2023-10-02 06:05:49,878 WARNING [train.py:1204] (2/4) Exclude cut with ID _9235_210_5219_1_1528891154253_8046120_30-326681-0_sp1.1 from training. Duration: 0.7545625 2023-10-02 06:05:51,974 WARNING [train.py:1204] (2/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_538-218448-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 06:05:54,874 WARNING [train.py:1204] (2/4) Exclude cut with ID _33640_210_2655_1_1533117605479_4855469_60-192970-0_sp1.1 from training. Duration: 0.963625 2023-10-02 06:05:56,717 INFO [train.py:1046] (2/4) Epoch 22, batch 4950, loss[loss=0.1846, simple_loss=0.2494, pruned_loss=0.05988, over 23534.00 frames. ], tot_loss[loss=0.1719, simple_loss=0.2473, pruned_loss=0.04821, over 4699923.19 frames. ], batch size: 256, lr: 4.62e-03, grad_scale: 16.0 2023-10-02 06:05:56,767 WARNING [train.py:1204] (2/4) Exclude cut with ID _65585_210_12210_1_1535450451584_3764098_112-294301-0_sp1.1 from training. Duration: 0.8 2023-10-02 06:05:56,799 WARNING [train.py:1204] (2/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_643-37412-0_sp1.1 from training. Duration: 0.936375 2023-10-02 06:05:56,815 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_9302_210_2550_1_1528889962_4655416_638-301620-0_sp1.1 from training. Duration: 0.42725 2023-10-02 06:05:58,131 WARNING [train.py:1204] (2/4) Exclude cut with ID _3867_210_1883_1_1526641200575_4475830_272-215563-0_sp1.1 from training. Duration: 0.5181875 2023-10-02 06:06:01,687 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_53679_210_4852_1_1534556726_4186141_358-154143-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 06:06:01,698 WARNING [train.py:1204] (2/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_841-341679-0_sp0.9 from training. Duration: 0.6444375 2023-10-02 06:06:03,315 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=776693.3333333334, ans=0.1 2023-10-02 06:06:04,400 WARNING [train.py:1204] (2/4) Exclude cut with ID _7160_210_1876_1_1527940923231_3703799_302-150728-0 from training. Duration: 0.78 2023-10-02 06:06:05,738 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_125-203105-0 from training. Duration: 0.8 2023-10-02 06:06:05,752 WARNING [train.py:1204] (2/4) Exclude cut with ID _5585_210_2667_1_1527418813324_8007340_89-332072-0_sp0.9 from training. Duration: 0.711125 2023-10-02 06:06:07,099 WARNING [train.py:1204] (2/4) Exclude cut with ID _3992_210_1747_1_1526644816031_3731060_36-147855-0 from training. Duration: 0.78 2023-10-02 06:06:07,124 WARNING [train.py:1204] (2/4) Exclude cut with ID _59256_210_5986_1_1535076085997_3589120_172-239045-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 06:06:07,132 WARNING [train.py:1204] (2/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_472-216663-0_sp1.1 from training. Duration: 0.836375 2023-10-02 06:06:07,182 WARNING [train.py:1204] (2/4) Exclude cut with ID _36512_210_6987_1_1533033143998_3126069_181-306500-0_sp1.1 from training. Duration: 0.863625 2023-10-02 06:06:07,204 WARNING [train.py:1204] (2/4) Exclude cut with ID _56524_210_9642_1_1534572011779_3619170_492-95008-0_sp1.1 from training. Duration: 0.97275 2023-10-02 06:06:10,024 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_33348_210_5632_1_1533086912_3324030_119-90326-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 06:06:10,066 WARNING [train.py:1204] (2/4) Exclude cut with ID _14368_210_4220_1_1530532714870_3595529_231-137892-0_sp1.1 from training. Duration: 0.9 2023-10-02 06:06:12,835 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8453_210_4211_1_1528502940_8562730_596-306820-0_sp1.1 from training. Duration: 0.8818125 2023-10-02 06:06:12,916 WARNING [train.py:1204] (2/4) Exclude cut with ID _27052_210_5986_1_1532329197187_3585009_50-242750-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 06:06:14,320 WARNING [train.py:1204] (2/4) Exclude cut with ID _13913_210_9994_1_1530579615187_3383897_358-98435-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 06:06:15,007 WARNING [train.py:1204] (2/4) Exclude cut with ID _27013_210_1389_1_1532437173240_3252649_399-229778-0_sp1.1 from training. Duration: 0.963625 2023-10-02 06:06:19,393 WARNING [train.py:1204] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_185-174805-0_sp0.9 from training. Duration: 0.7555625 2023-10-02 06:06:24,177 WARNING [train.py:1204] (2/4) Exclude cut with ID _65585_210_12210_1_1535450451584_3764098_196-221014-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 06:06:24,288 WARNING [train.py:1204] (2/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_202-293523-0_sp1.1 from training. Duration: 0.6545625 2023-10-02 06:06:27,544 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8275_210_4198_1_1528455320_3892571_213-127634-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 06:06:27,597 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_40896_210_15710_1_1533433956_7100802_921-231678-0_sp1.1 from training. Duration: 0.97275 2023-10-02 06:06:27,840 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.conv_module1.balancer2.prob, batch_count=776826.6666666666, ans=0.125 2023-10-02 06:06:28,943 WARNING [train.py:1204] (2/4) Exclude cut with ID _38020_210_5986_1_1533112252259_7006089_493-102745-0_sp1.1 from training. Duration: 0.8909375 2023-10-02 06:06:29,708 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.3.encoder.layers.0.nonlin_attention.whiten2, num_groups=1, num_channels=512, metric=8.16 vs. limit=15.0 2023-10-02 06:06:30,847 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6846_210_6753_1_1527850767_4295158_2-315073-0 from training. Duration: 0.88 2023-10-02 06:06:30,907 WARNING [train.py:1204] (2/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_249-281349-0 from training. Duration: 0.85 2023-10-02 06:06:32,266 WARNING [train.py:1204] (2/4) Exclude cut with ID _18513_210_9206_1_1531618215223_3862620_314-83429-0_sp1.1 from training. Duration: 0.97275 2023-10-02 06:06:34,961 WARNING [train.py:1204] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_215-202973-0_sp0.9 from training. Duration: 0.97775 2023-10-02 06:06:34,972 WARNING [train.py:1204] (2/4) Exclude cut with ID _7719_210_3525_1_1528456153163_4296960_88-250259-0_sp1.1 from training. Duration: 0.763625 2023-10-02 06:06:35,092 WARNING [train.py:1204] (2/4) Exclude cut with ID _7606_210_4915_1_1528894253277_5080690_301-10732-0_sp1.1 from training. Duration: 0.736375 2023-10-02 06:06:36,379 WARNING [train.py:1204] (2/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_818-11907-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 06:06:37,768 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_5959_210_1794_1_1527400400_3445726_412-44052-0_sp1.1 from training. Duration: 0.7 2023-10-02 06:06:39,169 WARNING [train.py:1204] (2/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_6-279195-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 06:06:39,340 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.conv_module2.balancer1.min_positive, batch_count=776826.6666666666, ans=0.025 2023-10-02 06:06:40,996 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.4.encoder.layers.2.whiten, num_groups=1, num_channels=384, metric=6.03 vs. limit=12.0 2023-10-02 06:06:41,810 WARNING [train.py:1204] (2/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_252-216674-0_sp0.9 from training. Duration: 0.6 2023-10-02 06:06:43,284 WARNING [train.py:1204] (2/4) Exclude cut with ID _14368_210_4220_1_1530532714870_3595529_144-3287-0_sp1.1 from training. Duration: 0.6090625 2023-10-02 06:06:45,208 WARNING [train.py:1204] (2/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_342-3879-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 06:06:46,514 WARNING [train.py:1204] (2/4) Exclude cut with ID _33640_210_2655_1_1533117605479_4855469_584-65630-0_sp1.1 from training. Duration: 0.97275 2023-10-02 06:06:47,834 WARNING [train.py:1204] (2/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_37-189262-0 from training. Duration: 0.98 2023-10-02 06:06:47,860 WARNING [train.py:1204] (2/4) Exclude cut with ID _9389_210_1664_1_1529217337301_5289269_379-128794-0_sp0.9 from training. Duration: 0.9444375 2023-10-02 06:06:49,266 WARNING [train.py:1204] (2/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_119-207162-0_sp1.1 from training. Duration: 0.6454375 2023-10-02 06:06:51,165 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.5.encoder.layers.1.whiten, num_groups=1, num_channels=256, metric=2.08 vs. limit=12.0 2023-10-02 06:06:52,225 WARNING [train.py:1204] (2/4) Exclude cut with ID _2941_210_2577_1_1526199673150_2489759_200-333643-0_sp1.1 from training. Duration: 0.936375 2023-10-02 06:06:53,646 WARNING [train.py:1204] (2/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_479-125611-0_sp1.1 from training. Duration: 0.87275 2023-10-02 06:06:53,662 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_3475_210_2028_1_1526193578_3622569_64-297072-0_sp1.1 from training. Duration: 0.82725 2023-10-02 06:06:54,978 WARNING [train.py:1204] (2/4) Exclude cut with ID _53565_210_10635_1_1534337218335_3820259_139-346291-0_sp1.1 from training. Duration: 0.97275 2023-10-02 06:06:55,031 WARNING [train.py:1204] (2/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_588-125165-0_sp1.1 from training. Duration: 0.7545625 2023-10-02 06:06:56,971 WARNING [train.py:1204] (2/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_938-205135-0_sp1.1 from training. Duration: 0.7909375 2023-10-02 06:06:58,313 WARNING [train.py:1204] (2/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_216-48707-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 06:06:59,651 WARNING [train.py:1204] (2/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_477-255782-0_sp1.1 from training. Duration: 0.7454375 2023-10-02 06:06:59,694 WARNING [train.py:1204] (2/4) Exclude cut with ID _16909_210_5724_1_1531292079912_4118919_583-92842-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 06:07:01,018 WARNING [train.py:1204] (2/4) Exclude cut with ID _60860_210_8254_1_1534896015875_3557169_245-256666-0 from training. Duration: 0.97 2023-10-02 06:07:04,559 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_45399_210_4852_1_1533624725_7579737_349-116173-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 06:07:04,855 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=776960.0, ans=0.1 2023-10-02 06:07:10,068 WARNING [train.py:1204] (2/4) Exclude cut with ID _8699_210_2069_1_1528630346831_3960409_155-39766-0 from training. Duration: 0.78 2023-10-02 06:07:10,077 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_86-16881-0_sp1.1 from training. Duration: 0.52725 2023-10-02 06:07:11,425 INFO [train.py:1046] (2/4) Epoch 22, batch 5000, loss[loss=0.1954, simple_loss=0.2674, pruned_loss=0.06164, over 23272.00 frames. ], tot_loss[loss=0.1713, simple_loss=0.2466, pruned_loss=0.048, over 4696716.57 frames. ], batch size: 105, lr: 4.62e-03, grad_scale: 16.0 2023-10-02 06:07:16,343 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_14056_210_4163_1_1530604483_7572827_191-159384-0_sp1.1 from training. Duration: 0.97275 2023-10-02 06:07:16,351 WARNING [train.py:1204] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_307-158462-0_sp1.1 from training. Duration: 0.62725 2023-10-02 06:07:19,138 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7544_210_6753_1_1528591327_1471426_33-346031-0 from training. Duration: 0.61 2023-10-02 06:07:19,227 WARNING [train.py:1204] (2/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_112-149213-0 from training. Duration: 0.95 2023-10-02 06:07:19,480 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.conv_skip_rate, batch_count=777026.6666666666, ans=0.0 2023-10-02 06:07:21,935 WARNING [train.py:1204] (2/4) Exclude cut with ID _55844_210_2346_1_1534471212569_7625707_925-191342-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 06:07:24,630 WARNING [train.py:1204] (2/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_87-278686-0 from training. Duration: 0.77 2023-10-02 06:07:24,664 WARNING [train.py:1204] (2/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_415-78792-0_sp1.1 from training. Duration: 0.736375 2023-10-02 06:07:24,673 WARNING [train.py:1204] (2/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_107-332398-0_sp0.9 from training. Duration: 0.8666875 2023-10-02 06:07:26,535 WARNING [train.py:1204] (2/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_72-251255-0 from training. Duration: 0.94 2023-10-02 06:07:27,802 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_431-227273-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 06:07:27,861 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7544_210_6753_1_1528587000_691468_87-178599-0_sp0.9 from training. Duration: 0.9666875 2023-10-02 06:07:27,951 WARNING [train.py:1204] (2/4) Exclude cut with ID _13916_210_9994_1_1530788341126_3763149_60-191556-0 from training. Duration: 0.61 2023-10-02 06:07:27,953 WARNING [train.py:1204] (2/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_746-164782-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 06:07:28,949 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.0.layers.0.feed_forward3.out_whiten, num_groups=1, num_channels=192, metric=12.14 vs. limit=15.0 2023-10-02 06:07:29,986 WARNING [train.py:1204] (2/4) Exclude cut with ID _33640_210_2655_1_1533117605479_4855469_596-21066-0_sp1.1 from training. Duration: 0.92725 2023-10-02 06:07:31,429 WARNING [train.py:1204] (2/4) Exclude cut with ID _40402_210_18219_1_1533522324115_2953150_320-14078-0 from training. Duration: 0.95 2023-10-02 06:07:31,452 WARNING [train.py:1204] (2/4) Exclude cut with ID _29877_210_6228_1_1532397791106_5276149_266-62291-0 from training. Duration: 0.87 2023-10-02 06:07:32,918 WARNING [train.py:1204] (2/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_176-56120-0_sp0.9 from training. Duration: 0.92225 2023-10-02 06:07:32,961 WARNING [train.py:1204] (2/4) Exclude cut with ID _6275_210_2084_1_1528110062213_7669049_91-26919-0 from training. Duration: 0.97 2023-10-02 06:07:32,968 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_224-322394-0_sp0.9 from training. Duration: 0.7333125 2023-10-02 06:07:33,005 WARNING [train.py:1204] (2/4) Exclude cut with ID _44982_210_1511_1_1534312820108_3726029_586-297059-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 06:07:34,322 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_196-289637-0_sp1.1 from training. Duration: 0.6818125 2023-10-02 06:07:34,323 WARNING [train.py:1204] (2/4) Exclude cut with ID _18513_210_9206_1_1531618215223_3862620_402-5957-0 from training. Duration: 0.99 2023-10-02 06:07:34,336 WARNING [train.py:1204] (2/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_81-300212-0 from training. Duration: 0.6 2023-10-02 06:07:36,325 WARNING [train.py:1204] (2/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_166-128941-0 from training. Duration: 0.54 2023-10-02 06:07:36,569 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.bypass.skip_rate, batch_count=777093.3333333334, ans=0.04949747468305833 2023-10-02 06:07:37,624 WARNING [train.py:1204] (2/4) Exclude cut with ID _58059_210_5854_1_1534901471149_3164808_57-309660-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 06:07:37,686 WARNING [train.py:1204] (2/4) Exclude cut with ID _60621_210_17071_1_1535013053530_3671739_73-155814-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 06:07:37,772 WARNING [train.py:1204] (2/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_726-270780-0 from training. Duration: 0.86 2023-10-02 06:07:37,792 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8853_210_3273_1_1528633899_818968_51-187685-0_sp1.1 from training. Duration: 0.62725 2023-10-02 06:07:39,416 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.out_combiner.scale_min, batch_count=777093.3333333334, ans=0.2 2023-10-02 06:07:40,509 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_315-212794-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 06:07:41,887 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_53373_210_5399_1_1534229471_4715695_314-122795-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 06:07:43,331 WARNING [train.py:1204] (2/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_187-221566-0_sp0.9 from training. Duration: 0.47775 2023-10-02 06:07:44,794 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_12868_210_6753_1_1530702479_3610564_133-564-0 from training. Duration: 0.79 2023-10-02 06:07:46,127 WARNING [train.py:1204] (2/4) Exclude cut with ID _55153_210_2253_1_1534384786677_3162096_255-228344-0_sp1.1 from training. Duration: 0.936375 2023-10-02 06:07:46,910 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.2.encoder.layers.0.self_attn_weights.whiten_keys, num_groups=4, num_channels=128, metric=4.56 vs. limit=6.0 2023-10-02 06:07:47,571 WARNING [train.py:1204] (2/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_383-165234-0_sp1.1 from training. Duration: 0.8545625 2023-10-02 06:07:52,319 WARNING [train.py:1204] (2/4) Exclude cut with ID IC0677W0056-334889 from training. Duration: 0.768 2023-10-02 06:07:53,868 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_14953_210_2151_1_1530701710_5616165_710-165400-0_sp0.9 from training. Duration: 0.9666875 2023-10-02 06:07:55,098 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.627e+02 1.890e+02 2.101e+02 2.537e+02 4.736e+02, threshold=4.203e+02, percent-clipped=2.0 2023-10-02 06:07:55,224 WARNING [train.py:1204] (2/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_355-236205-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 06:07:55,226 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_34450_210_9547_1_1533099443_4109050_503-246893-0_sp1.1 from training. Duration: 0.963625 2023-10-02 06:07:59,839 WARNING [train.py:1204] (2/4) Exclude cut with ID _43552_210_8913_1_1533726031059_3523429_25-120161-0 from training. Duration: 0.98 2023-10-02 06:07:59,870 WARNING [train.py:1204] (2/4) Exclude cut with ID _66112_210_10635_1_1535590236305_3801484_205-311930-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 06:07:59,903 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_48049_210_5632_1_1533864553_3519937_185-281218-0_sp1.1 from training. Duration: 0.92725 2023-10-02 06:08:01,194 WARNING [train.py:1204] (2/4) Exclude cut with ID _48782_210_12658_1_1534467701544_2300422_191-332415-0_sp1.1 from training. Duration: 0.97275 2023-10-02 06:08:01,443 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.feed_forward1.out_proj.dropout_p, batch_count=777226.6666666666, ans=0.1 2023-10-02 06:08:02,587 WARNING [train.py:1204] (2/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_907-151398-0_sp1.1 from training. Duration: 0.336375 2023-10-02 06:08:03,974 WARNING [train.py:1204] (2/4) Exclude cut with ID _67499_210_17282_1_1535509805220_3692430_20-179346-0_sp1.1 from training. Duration: 0.8818125 2023-10-02 06:08:07,200 WARNING [train.py:1204] (2/4) Exclude cut with ID _14998_210_5219_1_1530705582080_7474970_441-91466-0_sp1.1 from training. Duration: 0.8818125 2023-10-02 06:08:07,280 WARNING [train.py:1204] (2/4) Exclude cut with ID _46681_210_9642_1_1533893005571_7603359_786-164007-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 06:08:12,953 WARNING [train.py:1204] (2/4) Exclude cut with ID _14970_210_15709_1_1530773543493_1715730_187-20259-0 from training. Duration: 0.89 2023-10-02 06:08:14,557 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.ff2_skip_rate, batch_count=777293.3333333334, ans=0.0 2023-10-02 06:08:15,835 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.balancer1.prob, batch_count=777293.3333333334, ans=0.125 2023-10-02 06:08:16,973 WARNING [train.py:1204] (2/4) Exclude cut with ID _12734_210_8341_1_1530183614296_3891929_383-339075-0_sp1.1 from training. Duration: 0.963625 2023-10-02 06:08:25,964 INFO [train.py:1046] (2/4) Epoch 22, batch 5050, loss[loss=0.1888, simple_loss=0.2703, pruned_loss=0.05365, over 24564.00 frames. ], tot_loss[loss=0.1713, simple_loss=0.2474, pruned_loss=0.04754, over 4722042.00 frames. ], batch size: 71, lr: 4.62e-03, grad_scale: 8.0 2023-10-02 06:08:26,115 WARNING [train.py:1204] (2/4) Exclude cut with ID _37086_210_9038_1_1532999540891_7021250_834-322225-0_sp1.1 from training. Duration: 0.92725 2023-10-02 06:08:26,280 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.feed_forward1.out_proj.dropout_p, batch_count=777360.0, ans=0.1 2023-10-02 06:08:27,432 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_10987_210_4163_1_1529760555_4123902_56-290559-0_sp1.1 from training. Duration: 0.963625 2023-10-02 06:08:27,446 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_92-246270-0_sp0.9 from training. Duration: 0.9444375 2023-10-02 06:08:28,833 WARNING [train.py:1204] (2/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_56-13561-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 06:08:28,859 WARNING [train.py:1204] (2/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_393-57888-0_sp1.1 from training. Duration: 0.5818125 2023-10-02 06:08:28,880 WARNING [train.py:1204] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_168-32906-0_sp1.1 from training. Duration: 0.62725 2023-10-02 06:08:28,928 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_51225_210_3601_1_1534400634_2327383_127-207553-0_sp1.1 from training. Duration: 0.963625 2023-10-02 06:08:34,215 WARNING [train.py:1204] (2/4) Exclude cut with ID _58583_210_5236_1_1534726835266_3574249_494-203521-0_sp1.1 from training. Duration: 0.963625 2023-10-02 06:08:34,235 WARNING [train.py:1204] (2/4) Exclude cut with ID _49376_210_5986_1_1533985278732_3370740_354-344695-0 from training. Duration: 0.99 2023-10-02 06:08:35,582 WARNING [train.py:1204] (2/4) Exclude cut with ID _8021_210_7994_1_1528376451358_3653069_179-45206-0_sp1.1 from training. Duration: 0.7818125 2023-10-02 06:08:38,645 WARNING [train.py:1204] (2/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_742-154017-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 06:08:39,466 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.0.layers.1.conv_module1.whiten, num_groups=1, num_channels=192, metric=9.40 vs. limit=15.0 2023-10-02 06:08:40,095 WARNING [train.py:1204] (2/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_222-198127-0_sp1.1 from training. Duration: 0.763625 2023-10-02 06:08:40,135 WARNING [train.py:1204] (2/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_235-307337-0 from training. Duration: 0.94 2023-10-02 06:08:40,221 WARNING [train.py:1204] (2/4) Exclude cut with ID _8393_210_6702_1_1528505860249_3457328_453-176500-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 06:08:41,583 WARNING [train.py:1204] (2/4) Exclude cut with ID _53565_210_10635_1_1534337218335_3820259_102-179528-0_sp1.1 from training. Duration: 0.97275 2023-10-02 06:08:43,051 WARNING [train.py:1204] (2/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_43-206757-0_sp1.1 from training. Duration: 0.6818125 2023-10-02 06:08:44,453 WARNING [train.py:1204] (2/4) Exclude cut with ID _8394_210_6702_1_1528590504316_3540189_335-274780-0_sp1.1 from training. Duration: 0.7090625 2023-10-02 06:08:45,805 WARNING [train.py:1204] (2/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_128-296581-0_sp1.1 from training. Duration: 0.67275 2023-10-02 06:08:46,318 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.whiten.whitening_limit, batch_count=777426.6666666666, ans=12.0 2023-10-02 06:08:53,306 WARNING [train.py:1204] (2/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_111-112362-0 from training. Duration: 0.91 2023-10-02 06:08:54,620 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_5522_210_3273_1_1527249606_3909096_366-172508-0_sp1.1 from training. Duration: 0.6 2023-10-02 06:08:54,694 WARNING [train.py:1204] (2/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_47-167373-0_sp1.1 from training. Duration: 0.836375 2023-10-02 06:08:54,746 WARNING [train.py:1204] (2/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_41-234353-0 from training. Duration: 0.68 2023-10-02 06:08:56,086 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6291_210_3273_1_1527683649_1078498_82-221196-0_sp0.9 from training. Duration: 0.8444375 2023-10-02 06:08:57,614 WARNING [train.py:1204] (2/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_47-4814-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 06:08:57,636 WARNING [train.py:1204] (2/4) Exclude cut with ID _43541_210_10626_1_1533796233418_3635440_40-247993-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 06:08:57,717 WARNING [train.py:1204] (2/4) Exclude cut with ID _21989_210_5854_1_1531899214931_3271360_133-44759-0_sp0.9 from training. Duration: 0.8555625 2023-10-02 06:08:59,009 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_94-195801-0 from training. Duration: 0.94 2023-10-02 06:08:59,080 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_141-89113-0 from training. Duration: 0.86 2023-10-02 06:09:00,854 WARNING [train.py:1204] (2/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_683-284578-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 06:09:03,626 WARNING [train.py:1204] (2/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_6-214815-0_sp1.1 from training. Duration: 0.77275 2023-10-02 06:09:04,403 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.2.encoder.layers.2.feed_forward1.out_whiten, num_groups=1, num_channels=384, metric=8.53 vs. limit=15.0 2023-10-02 06:09:05,659 WARNING [train.py:1204] (2/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_556-212082-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 06:09:07,006 WARNING [train.py:1204] (2/4) Exclude cut with ID _44902_210_16187_1_1533888006062_3792078_149-67212-0 from training. Duration: 0.87 2023-10-02 06:09:09,092 WARNING [train.py:1204] (2/4) Exclude cut with ID _61148_210_6897_1_1535094022726_4217610_333-159440-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 06:09:11,831 WARNING [train.py:1204] (2/4) Exclude cut with ID _3120_210_3525_1_1526036143112_4072980_404-249994-0 from training. Duration: 0.99 2023-10-02 06:09:13,284 WARNING [train.py:1204] (2/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_234-260682-0_sp0.9 from training. Duration: 0.9333125 2023-10-02 06:09:13,318 WARNING [train.py:1204] (2/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_374-138971-0_sp1.1 from training. Duration: 0.8545625 2023-10-02 06:09:13,375 WARNING [train.py:1204] (2/4) Exclude cut with ID _53713_210_24116_1_1534314487762_7416751_743-302085-0_sp1.1 from training. Duration: 0.92725 2023-10-02 06:09:14,738 WARNING [train.py:1204] (2/4) Exclude cut with ID _18516_210_9206_1_1531704653206_3692406_198-334870-0_sp1.1 from training. Duration: 0.836375 2023-10-02 06:09:17,496 WARNING [train.py:1204] (2/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_395-73570-0_sp1.1 from training. Duration: 0.9 2023-10-02 06:09:18,865 WARNING [train.py:1204] (2/4) Exclude cut with ID _46683_210_9642_1_1533730913492_4591116_421-19547-0_sp1.1 from training. Duration: 0.936375 2023-10-02 06:09:19,650 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.1.encoder.layers.0.feed_forward3.out_whiten, num_groups=1, num_channels=256, metric=12.93 vs. limit=15.0 2023-10-02 06:09:20,284 WARNING [train.py:1204] (2/4) Exclude cut with ID _19896_210_1883_1_1531709022662_6649569_714-8668-0_sp1.1 from training. Duration: 0.963625 2023-10-02 06:09:20,331 WARNING [train.py:1204] (2/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_49-101072-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 06:09:20,337 WARNING [train.py:1204] (2/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_220-191080-0_sp1.1 from training. Duration: 0.8909375 2023-10-02 06:09:20,372 WARNING [train.py:1204] (2/4) Exclude cut with ID _3465_210_3525_1_1526175667060_3574049_62-335552-0 from training. Duration: 0.87 2023-10-02 06:09:21,078 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.3.encoder.layers.2.whiten, num_groups=1, num_channels=512, metric=4.61 vs. limit=12.0 2023-10-02 06:09:21,753 WARNING [train.py:1204] (2/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_111-112362-0_sp1.1 from training. Duration: 0.82725 2023-10-02 06:09:21,876 WARNING [train.py:1204] (2/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_318-59097-0_sp0.9 from training. Duration: 0.8444375 2023-10-02 06:09:26,384 WARNING [train.py:1204] (2/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_98-255710-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 06:09:26,398 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9042W0003-515609 from training. Duration: 0.896 2023-10-02 06:09:26,407 WARNING [train.py:1204] (2/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_158-250116-0_sp0.9 from training. Duration: 0.688875 2023-10-02 06:09:27,731 WARNING [train.py:1204] (2/4) Exclude cut with ID _26750_210_5665_1_1532226678442_4063230_7-346885-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 06:09:29,094 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_37839_210_7234_1_1533542462_3689303_357-227078-0_sp1.1 from training. Duration: 0.963625 2023-10-02 06:09:29,121 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9052W0119-520904 from training. Duration: 0.896 2023-10-02 06:09:29,918 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.2.encoder.layers.0.conv_module2.whiten, num_groups=1, num_channels=384, metric=5.88 vs. limit=15.0 2023-10-02 06:09:33,153 WARNING [train.py:1204] (2/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_249-281349-0_sp1.1 from training. Duration: 0.77275 2023-10-02 06:09:33,158 WARNING [train.py:1204] (2/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_147-293311-0 from training. Duration: 0.94 2023-10-02 06:09:33,159 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_158-21166-0_sp1.1 from training. Duration: 0.963625 2023-10-02 06:09:35,674 WARNING [train.py:1204] (2/4) Exclude cut with ID _14100_210_8341_1_1530529320613_3678069_207-332194-0_sp1.1 from training. Duration: 0.92725 2023-10-02 06:09:37,634 WARNING [train.py:1204] (2/4) Exclude cut with ID _9389_210_1664_1_1529217337301_5289269_324-211031-0_sp1.1 from training. Duration: 0.963625 2023-10-02 06:09:37,655 WARNING [train.py:1204] (2/4) Exclude cut with ID _14603_210_4220_1_1530615688441_3097319_133-158308-0 from training. Duration: 0.84 2023-10-02 06:09:39,102 WARNING [train.py:1204] (2/4) Exclude cut with ID _13252_210_1925_1_1530671372047_3336909_166-314607-0 from training. Duration: 0.54 2023-10-02 06:09:42,303 INFO [train.py:1046] (2/4) Epoch 22, batch 5100, loss[loss=0.1934, simple_loss=0.2727, pruned_loss=0.057, over 23872.00 frames. ], tot_loss[loss=0.1726, simple_loss=0.2488, pruned_loss=0.04818, over 4718821.91 frames. ], batch size: 86, lr: 4.62e-03, grad_scale: 8.0 2023-10-02 06:09:42,367 WARNING [train.py:1204] (2/4) Exclude cut with ID _32704_210_8954_1_1533017488840_6747370_649-79538-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 06:09:42,382 WARNING [train.py:1204] (2/4) Exclude cut with ID _28394_210_8913_1_1532430049518_3650118_343-115667-0_sp1.1 from training. Duration: 0.97275 2023-10-02 06:09:43,706 WARNING [train.py:1204] (2/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_87-278686-0_sp0.9 from training. Duration: 0.8555625 2023-10-02 06:09:46,512 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9109W0003-549749 from training. Duration: 0.768 2023-10-02 06:09:47,950 WARNING [train.py:1204] (2/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_126-29104-0_sp1.1 from training. Duration: 0.77275 2023-10-02 06:09:49,608 WARNING [train.py:1204] (2/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_325-113798-0 from training. Duration: 0.85 2023-10-02 06:09:50,839 WARNING [train.py:1204] (2/4) Exclude cut with ID _6694_210_4599_1_1527852637490_3965529_116-109398-0 from training. Duration: 0.73 2023-10-02 06:09:50,889 WARNING [train.py:1204] (2/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_46-57119-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 06:09:52,250 WARNING [train.py:1204] (2/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_546-119312-0_sp1.1 from training. Duration: 0.9 2023-10-02 06:09:55,405 WARNING [train.py:1204] (2/4) Exclude cut with ID _60057_210_10640_1_1534903065793_3333150_468-306939-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 06:09:56,731 WARNING [train.py:1204] (2/4) Exclude cut with ID _15150_210_8341_1_1530752411803_7256384_22-138274-0 from training. Duration: 0.96 2023-10-02 06:09:56,755 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_289-180904-0 from training. Duration: 0.76 2023-10-02 06:10:01,052 WARNING [train.py:1204] (2/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_64-133721-0_sp1.1 from training. Duration: 0.92725 2023-10-02 06:10:02,426 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_195-257512-0_sp1.1 from training. Duration: 0.6090625 2023-10-02 06:10:04,626 WARNING [train.py:1204] (2/4) Exclude cut with ID _66873_210_5517_1_1535454136506_4293370_704-101963-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 06:10:08,031 WARNING [train.py:1204] (2/4) Exclude cut with ID _4366_210_1388_1_1526792053633_3613819_231-211402-0 from training. Duration: 0.94 2023-10-02 06:10:08,098 WARNING [train.py:1204] (2/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_679-190385-0_sp1.1 from training. Duration: 0.97275 2023-10-02 06:10:08,284 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.feed_forward1.out_proj.dropout_p, batch_count=777760.0, ans=0.1 2023-10-02 06:10:10,690 WARNING [train.py:1204] (2/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_298-81459-0_sp1.1 from training. Duration: 0.963625 2023-10-02 06:10:10,709 WARNING [train.py:1204] (2/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_658-282610-0_sp0.9 from training. Duration: 0.588875 2023-10-02 06:10:13,477 WARNING [train.py:1204] (2/4) Exclude cut with ID _57572_210_5517_1_1534921219624_3809484_196-283734-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 06:10:15,369 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_53071_210_16554_1_1534490405_2954066_294-198554-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 06:10:15,382 WARNING [train.py:1204] (2/4) Exclude cut with ID _4866_210_2577_1_1527404412284_4364659_352-157930-0 from training. Duration: 0.81 2023-10-02 06:10:16,905 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9231W0113-610139 from training. Duration: 0.896 2023-10-02 06:10:18,325 WARNING [train.py:1204] (2/4) Exclude cut with ID _24271_210_12658_1_1531987270022_7190519_638-171906-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 06:10:18,378 WARNING [train.py:1204] (2/4) Exclude cut with ID _14170_210_5219_1_1530525949868_4469650_272-249985-0 from training. Duration: 0.67 2023-10-02 06:10:18,387 WARNING [train.py:1204] (2/4) Exclude cut with ID _64086_210_7994_1_1535270417019_7435949_449-31567-0 from training. Duration: 0.97 2023-10-02 06:10:18,524 INFO [scaling.py:1118] (2/4) WithLoss: name=encoder.encoders.0.layers.0.self_attn_weights, loss-sum=0.000e+00 2023-10-02 06:10:21,409 INFO [scaling.py:1118] (2/4) WithLoss: name=encoder.encoders.3.encoder.layers.1.self_attn_weights, loss-sum=0.000e+00 2023-10-02 06:10:22,532 WARNING [train.py:1204] (2/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_109-282568-0_sp1.1 from training. Duration: 0.97275 2023-10-02 06:10:25,539 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.conv_skip_rate, batch_count=777893.3333333334, ans=0.0 2023-10-02 06:10:26,630 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.564e+02 1.838e+02 2.102e+02 2.485e+02 3.822e+02, threshold=4.204e+02, percent-clipped=0.0 2023-10-02 06:10:31,389 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_121-45934-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 06:10:34,172 WARNING [train.py:1204] (2/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_314-126819-0 from training. Duration: 0.5 2023-10-02 06:10:34,201 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9309W0192-640319 from training. Duration: 0.512 2023-10-02 06:10:34,211 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9310W0003-640649 from training. Duration: 0.896 2023-10-02 06:10:37,407 WARNING [train.py:1204] (2/4) Exclude cut with ID _7160_210_1876_1_1527940923231_3703799_120-150943-0 from training. Duration: 0.81 2023-10-02 06:10:37,408 WARNING [train.py:1204] (2/4) Exclude cut with ID _40380_210_9059_1_1533380594759_4187380_6-33508-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 06:10:38,848 WARNING [train.py:1204] (2/4) Exclude cut with ID _59202_210_26984_1_1534816810612_3549174_137-318710-0 from training. Duration: 0.89 2023-10-02 06:10:42,314 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_435-313305-0 from training. Duration: 0.71 2023-10-02 06:10:43,789 WARNING [train.py:1204] (2/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_91-212486-0_sp1.1 from training. Duration: 0.6181875 2023-10-02 06:10:44,076 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=777960.0, ans=0.1 2023-10-02 06:10:45,769 WARNING [train.py:1204] (2/4) Exclude cut with ID _14603_210_4220_1_1530615688441_3097319_133-158308-0_sp1.1 from training. Duration: 0.763625 2023-10-02 06:10:47,167 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_99-251314-0 from training. Duration: 0.87 2023-10-02 06:10:49,744 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_365-217119-0_sp1.1 from training. Duration: 0.57275 2023-10-02 06:10:51,129 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_3475_210_2028_1_1526193578_3622569_377-166133-0 from training. Duration: 0.86 2023-10-02 06:10:55,494 WARNING [train.py:1204] (2/4) Exclude cut with ID _13916_210_9994_1_1530788341126_3763149_116-21034-0_sp1.1 from training. Duration: 0.87275 2023-10-02 06:10:55,512 WARNING [train.py:1204] (2/4) Exclude cut with ID _64220_210_9527_1_1535250151772_4875259_142-216831-0_sp1.1 from training. Duration: 0.936375 2023-10-02 06:10:55,512 WARNING [train.py:1204] (2/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_490-207861-0_sp1.1 from training. Duration: 0.8909375 2023-10-02 06:10:57,405 INFO [train.py:1046] (2/4) Epoch 22, batch 5150, loss[loss=0.1987, simple_loss=0.2735, pruned_loss=0.06196, over 23315.00 frames. ], tot_loss[loss=0.173, simple_loss=0.2491, pruned_loss=0.0484, over 4714391.18 frames. ], batch size: 105, lr: 4.62e-03, grad_scale: 8.0 2023-10-02 06:10:57,449 WARNING [train.py:1204] (2/4) Exclude cut with ID _41314_210_12651_1_1533459476543_3766460_87-272068-0_sp0.9 from training. Duration: 0.92225 2023-10-02 06:10:57,485 WARNING [train.py:1204] (2/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_18-46756-0_sp1.1 from training. Duration: 0.7181875 2023-10-02 06:10:57,549 WARNING [train.py:1204] (2/4) Exclude cut with ID _39411_210_5986_1_1533538825030_3343649_98-311335-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 06:10:58,898 WARNING [train.py:1204] (2/4) Exclude cut with ID _21878_210_3525_1_1531738772002_7264330_113-18163-0 from training. Duration: 0.89 2023-10-02 06:10:58,900 WARNING [train.py:1204] (2/4) Exclude cut with ID _6653_210_2629_1_1527919172441_6732679_1103-56445-0 from training. Duration: 0.73 2023-10-02 06:11:00,228 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_3474_210_2028_1_1526188513_4825246_145-309131-0 from training. Duration: 0.74 2023-10-02 06:11:00,261 WARNING [train.py:1204] (2/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_120-211630-0_sp1.1 from training. Duration: 0.836375 2023-10-02 06:11:00,272 WARNING [train.py:1204] (2/4) Exclude cut with ID _8030_210_7497_1_1528444886781_3531089_32-31837-0 from training. Duration: 0.97 2023-10-02 06:11:02,832 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_19949_210_2161_1_1531706337_928079_8-160956-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 06:11:02,880 WARNING [train.py:1204] (2/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_321-215742-0_sp1.1 from training. Duration: 0.4545625 2023-10-02 06:11:03,043 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.ff2_skip_rate, batch_count=778026.6666666666, ans=0.0 2023-10-02 06:11:04,275 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_583-124650-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 06:11:05,635 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_30434_210_13049_1_1532519384_981746_164-182755-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 06:11:11,116 WARNING [train.py:1204] (2/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_339-104791-0_sp1.1 from training. Duration: 0.4454375 2023-10-02 06:11:12,304 WARNING [train.py:1204] (2/4) Exclude cut with ID _15026_210_8750_1_1530777602316_3291850_211-195043-0 from training. Duration: 0.8 2023-10-02 06:11:13,729 WARNING [train.py:1204] (2/4) Exclude cut with ID _29877_210_6228_1_1532397791106_5276149_487-182647-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 06:11:13,775 WARNING [train.py:1204] (2/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_127-566-0_sp0.9 from training. Duration: 0.9333125 2023-10-02 06:11:13,975 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.feed_forward2.hidden_balancer.prob, batch_count=778093.3333333334, ans=0.125 2023-10-02 06:11:15,199 WARNING [train.py:1204] (2/4) Exclude cut with ID _8263_210_1664_1_1528439452891_970220_68-181805-0_sp0.9 from training. Duration: 0.711125 2023-10-02 06:11:15,201 WARNING [train.py:1204] (2/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_504-72513-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 06:11:15,217 WARNING [train.py:1204] (2/4) Exclude cut with ID _64639_210_5816_1_1535367619437_3528000_324-265233-0_sp1.1 from training. Duration: 0.963625 2023-10-02 06:11:15,276 WARNING [train.py:1204] (2/4) Exclude cut with ID _6456_210_1534_1_1527681634212_3181049_215-309359-0_sp0.9 from training. Duration: 0.97775 2023-10-02 06:11:16,575 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_115-233929-0_sp1.1 from training. Duration: 0.6545625 2023-10-02 06:11:16,623 WARNING [train.py:1204] (2/4) Exclude cut with ID _14087_210_2253_1_1530529224512_3014029_219-224445-0 from training. Duration: 0.84 2023-10-02 06:11:18,636 WARNING [train.py:1204] (2/4) Exclude cut with ID _14867_210_1968_1_1530684179890_3846969_413-29292-0_sp1.1 from training. Duration: 0.8181875 2023-10-02 06:11:19,907 WARNING [train.py:1204] (2/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_1012-279473-0_sp0.9 from training. Duration: 0.7444375 2023-10-02 06:11:21,285 WARNING [train.py:1204] (2/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_605-133395-0_sp0.9 from training. Duration: 0.7333125 2023-10-02 06:11:23,161 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.5.encoder.layers.0.self_attn_weights.whiten_keys, num_groups=4, num_channels=128, metric=1.83 vs. limit=6.0 2023-10-02 06:11:24,066 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_89-35748-0 from training. Duration: 0.79 2023-10-02 06:11:25,035 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.0.layers.0.self_attn2.whiten, num_groups=1, num_channels=192, metric=11.35 vs. limit=22.5 2023-10-02 06:11:25,437 WARNING [train.py:1204] (2/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_476-272757-0_sp1.1 from training. Duration: 0.8454375 2023-10-02 06:11:31,796 WARNING [train.py:1204] (2/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_449-15508-0_sp0.9 from training. Duration: 0.911125 2023-10-02 06:11:33,285 WARNING [train.py:1204] (2/4) Exclude cut with ID _29345_210_4381_1_1532689168263_3860169_198-174328-0 from training. Duration: 0.91 2023-10-02 06:11:36,101 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_15517_210_6753_1_1530952293_1664795_125-158157-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 06:11:42,624 WARNING [train.py:1204] (2/4) Exclude cut with ID _25565_210_12392_1_1532423009382_3952098_420-113180-0_sp1.1 from training. Duration: 0.963625 2023-10-02 06:11:43,998 WARNING [train.py:1204] (2/4) Exclude cut with ID _51471_210_1889_1_1534399391390_3094250_236-146773-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 06:11:46,855 WARNING [train.py:1204] (2/4) Exclude cut with ID _30039_210_13270_1_1532484612900_2725150_175-35012-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 06:11:48,203 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_23850_210_4238_1_1532232597_1632433_99-275843-0_sp1.1 from training. Duration: 0.936375 2023-10-02 06:11:51,550 WARNING [train.py:1204] (2/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_658-282610-0 from training. Duration: 0.53 2023-10-02 06:11:54,403 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_37818_210_4238_1_1533620910_4720083_234-65934-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 06:11:55,815 WARNING [train.py:1204] (2/4) Exclude cut with ID _3067_210_2667_1_1526212974640_4503168_459-173867-0_sp1.1 from training. Duration: 0.8 2023-10-02 06:11:55,839 WARNING [train.py:1204] (2/4) Exclude cut with ID _8696_210_1876_1_1528545632440_3345058_209-54069-0_sp0.9 from training. Duration: 0.7444375 2023-10-02 06:11:58,619 WARNING [train.py:1204] (2/4) Exclude cut with ID _62574_210_2655_1_1535180407058_3933249_420-236465-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 06:11:59,974 WARNING [train.py:1204] (2/4) Exclude cut with ID _46681_210_9642_1_1533893005571_7603359_333-4042-0_sp1.1 from training. Duration: 0.936375 2023-10-02 06:12:01,298 WARNING [train.py:1204] (2/4) Exclude cut with ID _3541_210_2477_1_1526475408709_3263973_377-296839-0 from training. Duration: 0.55 2023-10-02 06:12:04,770 WARNING [train.py:1204] (2/4) Exclude cut with ID _43997_210_4525_1_1533538632555_5189329_426-92599-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 06:12:06,190 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_43-71989-0_sp1.1 from training. Duration: 0.5181875 2023-10-02 06:12:07,822 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.conv_module1.balancer2.min_abs, batch_count=778293.3333333334, ans=0.5 2023-10-02 06:12:08,886 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_2009_210_2071_1_1525481134_4588937_140-345530-0_sp1.1 from training. Duration: 0.963625 2023-10-02 06:12:08,894 WARNING [train.py:1204] (2/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_138-332773-0_sp0.9 from training. Duration: 0.9 2023-10-02 06:12:10,301 WARNING [train.py:1204] (2/4) Exclude cut with ID _13942_210_9988_1_1530604906036_3733360_499-55676-0_sp0.9 from training. Duration: 0.688875 2023-10-02 06:12:10,325 WARNING [train.py:1204] (2/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_259-34038-0_sp1.1 from training. Duration: 0.67275 2023-10-02 06:12:10,341 WARNING [train.py:1204] (2/4) Exclude cut with ID _3120_210_3525_1_1526036143112_4072980_404-249994-0_sp1.1 from training. Duration: 0.9 2023-10-02 06:12:10,364 WARNING [train.py:1204] (2/4) Exclude cut with ID _40402_210_18219_1_1533522324115_2953150_222-108810-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 06:12:12,059 INFO [train.py:1046] (2/4) Epoch 22, batch 5200, loss[loss=0.1674, simple_loss=0.2395, pruned_loss=0.04762, over 23151.00 frames. ], tot_loss[loss=0.1734, simple_loss=0.25, pruned_loss=0.04835, over 4725284.43 frames. ], batch size: 51, lr: 4.62e-03, grad_scale: 16.0 2023-10-02 06:12:15,506 WARNING [train.py:1204] (2/4) Exclude cut with ID _22083_210_8254_1_1531702862439_3678970_49-234546-0_sp0.9 from training. Duration: 0.92225 2023-10-02 06:12:15,619 WARNING [train.py:1204] (2/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_199-295611-0_sp0.9 from training. Duration: 0.611125 2023-10-02 06:12:18,270 WARNING [train.py:1204] (2/4) Exclude cut with ID _43552_210_8913_1_1533726031059_3523429_43-192945-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 06:12:21,728 WARNING [train.py:1204] (2/4) Exclude cut with ID _14224_210_4381_1_1530529062037_4264010_364-22817-0 from training. Duration: 0.71 2023-10-02 06:12:23,108 WARNING [train.py:1204] (2/4) Exclude cut with ID _14922_210_6228_1_1530707435273_3693310_388-129552-0_sp1.1 from training. Duration: 0.87275 2023-10-02 06:12:24,401 WARNING [train.py:1204] (2/4) Exclude cut with ID _6537_210_1883_1_1527987559710_3597129_190-280227-0_sp1.1 from training. Duration: 0.97275 2023-10-02 06:12:27,116 WARNING [train.py:1204] (2/4) Exclude cut with ID _55844_210_2346_1_1534471212569_7625707_398-56897-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 06:12:27,201 WARNING [train.py:1204] (2/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_48-182155-0_sp1.1 from training. Duration: 0.7909375 2023-10-02 06:12:27,223 WARNING [train.py:1204] (2/4) Exclude cut with ID _29529_210_7234_1_1532393961455_3977460_582-18084-0_sp1.1 from training. Duration: 0.97275 2023-10-02 06:12:29,944 WARNING [train.py:1204] (2/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_912-282412-0 from training. Duration: 0.49 2023-10-02 06:12:32,759 WARNING [train.py:1204] (2/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_332-256168-0_sp0.9 from training. Duration: 0.8333125 2023-10-02 06:12:32,816 WARNING [train.py:1204] (2/4) Exclude cut with ID _64220_210_9527_1_1535250151772_4875259_55-250624-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 06:12:35,993 WARNING [train.py:1204] (2/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_264-108685-0 from training. Duration: 0.55 2023-10-02 06:12:37,431 WARNING [train.py:1204] (2/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_613-232856-0_sp0.9 from training. Duration: 0.6 2023-10-02 06:12:38,849 WARNING [train.py:1204] (2/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_68-212801-0_sp1.1 from training. Duration: 0.663625 2023-10-02 06:12:40,147 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6290_210_3273_1_1527595224_1282787_115-87095-0 from training. Duration: 0.55 2023-10-02 06:12:40,180 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_3080_210_2071_1_1526086102_4365166_312-8726-0 from training. Duration: 0.91 2023-10-02 06:12:42,870 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.1.encoder.layers.1.nonlin_attention.whiten1, num_groups=1, num_channels=192, metric=5.75 vs. limit=10.0 2023-10-02 06:12:43,454 WARNING [train.py:1204] (2/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_105-63309-0 from training. Duration: 0.98 2023-10-02 06:12:43,533 WARNING [train.py:1204] (2/4) Exclude cut with ID _2941_210_2577_1_1526199673150_2489759_201-160779-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 06:12:43,536 WARNING [train.py:1204] (2/4) Exclude cut with ID ID0406W0226-885434 from training. Duration: 0.997 2023-10-02 06:12:43,542 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_17292_210_6753_1_1531125388_2943734_353-328201-0_sp1.1 from training. Duration: 0.97275 2023-10-02 06:12:44,866 WARNING [train.py:1204] (2/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_572-96029-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 06:12:45,078 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.balancer1.prob, batch_count=778493.3333333334, ans=0.125 2023-10-02 06:12:46,854 WARNING [train.py:1204] (2/4) Exclude cut with ID _14970_210_15709_1_1530775547361_1966965_6-23328-0_sp0.9 from training. Duration: 0.9555625 2023-10-02 06:12:46,903 WARNING [train.py:1204] (2/4) Exclude cut with ID _45012_210_12651_1_1533608435136_5371380_240-37101-0 from training. Duration: 0.93 2023-10-02 06:12:48,263 WARNING [train.py:1204] (2/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_685-90183-0_sp1.1 from training. Duration: 0.936375 2023-10-02 06:12:51,486 WARNING [train.py:1204] (2/4) Exclude cut with ID _13252_210_1925_1_1530671372047_3336909_41-22245-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 06:12:52,971 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_15126_210_6753_1_1530778677_2591185_120-277707-0 from training. Duration: 0.8 2023-10-02 06:12:54,409 WARNING [train.py:1204] (2/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_310-26186-0 from training. Duration: 0.7 2023-10-02 06:12:54,428 WARNING [train.py:1204] (2/4) Exclude cut with ID _14998_210_5219_1_1530705582080_7474970_787-294416-0 from training. Duration: 0.82 2023-10-02 06:12:57,089 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.503e+02 1.869e+02 2.074e+02 2.412e+02 3.434e+02, threshold=4.148e+02, percent-clipped=0.0 2023-10-02 06:12:58,627 WARNING [train.py:1204] (2/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_795-33252-0 from training. Duration: 0.37 2023-10-02 06:12:58,662 WARNING [train.py:1204] (2/4) Exclude cut with ID _7537_210_2084_1_1528502395431_3924359_113-199933-0_sp0.9 from training. Duration: 0.6555625 2023-10-02 06:13:02,831 WARNING [train.py:1204] (2/4) Exclude cut with ID _3465_210_3525_1_1526175667060_3574049_308-231735-0_sp0.9 from training. Duration: 0.811125 2023-10-02 06:13:02,850 WARNING [train.py:1204] (2/4) Exclude cut with ID _19504_210_9994_1_1531648863242_3325220_354-296765-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 06:13:04,250 WARNING [train.py:1204] (2/4) Exclude cut with ID _5409_210_1388_1_1527292850189_5865191_178-127970-0 from training. Duration: 0.57 2023-10-02 06:13:06,197 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_1466-248056-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 06:13:06,223 WARNING [train.py:1204] (2/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_535-14410-0_sp0.9 from training. Duration: 0.52225 2023-10-02 06:13:06,226 WARNING [train.py:1204] (2/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_747-129941-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 06:13:07,627 WARNING [train.py:1204] (2/4) Exclude cut with ID _15012_210_1968_1_1530856820429_5095350_688-178849-0_sp1.1 from training. Duration: 0.5818125 2023-10-02 06:13:09,054 WARNING [train.py:1204] (2/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_356-45054-0_sp1.1 from training. Duration: 0.7818125 2023-10-02 06:13:10,822 WARNING [train.py:1204] (2/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_146-276944-0_sp0.9 from training. Duration: 0.92225 2023-10-02 06:13:14,857 WARNING [train.py:1204] (2/4) Exclude cut with ID _39204_210_4220_1_1533880547368_3191120_346-180306-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 06:13:16,241 WARNING [train.py:1204] (2/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_641-158017-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 06:13:16,242 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_19952_210_2161_1_1531875469_3851750_274-42137-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 06:13:19,896 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.nonlin_attention.balancer.prob, batch_count=778626.6666666666, ans=0.125 2023-10-02 06:13:19,939 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.conv_module1.balancer1.prob, batch_count=778626.6666666666, ans=0.125 2023-10-02 06:13:23,245 WARNING [train.py:1204] (2/4) Exclude cut with ID _60804_210_9642_1_1534831199859_7268299_87-327819-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 06:13:23,320 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_9302_210_2550_1_1528889962_4655416_638-301620-0 from training. Duration: 0.47 2023-10-02 06:13:23,496 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.feed_forward2.hidden_balancer.prob, batch_count=778626.6666666666, ans=0.125 2023-10-02 06:13:24,627 WARNING [train.py:1204] (2/4) Exclude cut with ID _44304_210_15374_1_1533639352765_4613129_302-22084-0_sp1.1 from training. Duration: 0.7818125 2023-10-02 06:13:24,645 WARNING [train.py:1204] (2/4) Exclude cut with ID _18513_210_9206_1_1531618215223_3862620_177-227063-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 06:13:25,952 WARNING [train.py:1204] (2/4) Exclude cut with ID _41326_210_5854_1_1533778076509_3492080_510-45444-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 06:13:25,996 WARNING [train.py:1204] (2/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_76-311963-0_sp1.1 from training. Duration: 0.636375 2023-10-02 06:13:27,241 INFO [train.py:1046] (2/4) Epoch 22, batch 5250, loss[loss=0.1687, simple_loss=0.2543, pruned_loss=0.04148, over 24641.00 frames. ], tot_loss[loss=0.1724, simple_loss=0.249, pruned_loss=0.04789, over 4721185.34 frames. ], batch size: 73, lr: 4.62e-03, grad_scale: 16.0 2023-10-02 06:13:27,387 WARNING [train.py:1204] (2/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_156-302675-0_sp0.9 from training. Duration: 0.611125 2023-10-02 06:13:31,415 WARNING [train.py:1204] (2/4) Exclude cut with ID _53489_210_6228_1_1534298357169_3835970_367-148411-0_sp1.1 from training. Duration: 0.936375 2023-10-02 06:13:34,218 WARNING [train.py:1204] (2/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_389-144525-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 06:13:35,565 WARNING [train.py:1204] (2/4) Exclude cut with ID _14069_210_5632_1_1530512692033_4803390_158-238427-0_sp1.1 from training. Duration: 0.963625 2023-10-02 06:13:35,655 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_486-151131-0_sp0.9 from training. Duration: 0.9444375 2023-10-02 06:13:41,845 WARNING [train.py:1204] (2/4) Exclude cut with ID _39846_210_12651_1_1533603651992_1318030_188-71831-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 06:13:43,403 WARNING [train.py:1204] (2/4) Exclude cut with ID _39411_210_5986_1_1533538825030_3343649_173-238248-0_sp1.1 from training. Duration: 0.7545625 2023-10-02 06:13:45,294 WARNING [train.py:1204] (2/4) Exclude cut with ID _20684_210_6917_1_1531567751646_4203930_469-49081-0_sp1.1 from training. Duration: 0.92725 2023-10-02 06:13:46,650 WARNING [train.py:1204] (2/4) Exclude cut with ID _8833_210_5219_1_1528592665210_7553059_611-80313-0_sp1.1 from training. Duration: 0.5818125 2023-10-02 06:13:49,427 WARNING [train.py:1204] (2/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_440-141811-0 from training. Duration: 0.95 2023-10-02 06:13:49,452 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_41211_210_2544_1_1533348859_3702910_236-223652-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 06:13:51,492 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_40222_210_6228_1_1533385298_4481939_220-163998-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 06:14:36,733 INFO [train.py:1046] (2/4) Epoch 22, batch 5300, loss[loss=0.1674, simple_loss=0.224, pruned_loss=0.05536, over 22752.00 frames. ], tot_loss[loss=0.172, simple_loss=0.2478, pruned_loss=0.04808, over 4705275.44 frames. ], batch size: 322, lr: 4.62e-03, grad_scale: 16.0 2023-10-02 06:14:51,275 WARNING [train.py:1204] (2/4) Exclude cut with ID _14629_210_15709_1_1530604298182_956899_6-232695-0_sp1.1 from training. Duration: 0.8545625 2023-10-02 06:14:51,294 WARNING [train.py:1204] (2/4) Exclude cut with ID _13921_210_9994_1_1530841782272_3503520_327-69677-0 from training. Duration: 0.9 2023-10-02 06:14:51,319 WARNING [train.py:1204] (2/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_372-298093-0 from training. Duration: 0.8 2023-10-02 06:14:51,333 WARNING [train.py:1204] (2/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_515-139111-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 06:14:51,559 WARNING [train.py:1204] (2/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_85-65056-0_sp1.1 from training. Duration: 0.87275 2023-10-02 06:14:51,603 WARNING [train.py:1204] (2/4) Exclude cut with ID _15113_210_8254_1_1530842438579_3716279_209-54533-0_sp1.1 from training. Duration: 0.87275 2023-10-02 06:14:51,653 WARNING [train.py:1204] (2/4) Exclude cut with ID _70447_210_12392_1_1535972499300_3160420_123-312838-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 06:14:51,680 WARNING [train.py:1204] (2/4) Exclude cut with ID _60113_210_16271_1_1535164212481_4386079_70-63596-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 06:14:51,708 WARNING [train.py:1204] (2/4) Exclude cut with ID _14632_210_9988_1_1530691279392_3923200_225-188509-0_sp1.1 from training. Duration: 0.963625 2023-10-02 06:14:51,763 WARNING [train.py:1204] (2/4) Exclude cut with ID _34734_210_4525_1_1533365977542_4448720_584-180532-0_sp1.1 from training. Duration: 0.97275 2023-10-02 06:14:51,775 WARNING [train.py:1204] (2/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_351-294650-0_sp1.1 from training. Duration: 0.57275 2023-10-02 06:14:52,099 WARNING [train.py:1204] (2/4) Exclude cut with ID _13252_210_1925_1_1530671372047_3336909_263-168388-0_sp0.9 from training. Duration: 0.9555625 2023-10-02 06:14:52,126 WARNING [train.py:1204] (2/4) Exclude cut with ID _22083_210_8254_1_1531702862439_3678970_201-85738-0 from training. Duration: 0.9 2023-10-02 06:14:52,214 WARNING [train.py:1204] (2/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_237-58433-0 from training. Duration: 0.74 2023-10-02 06:14:52,239 WARNING [train.py:1204] (2/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_262-250826-0 from training. Duration: 0.99 2023-10-02 06:14:52,339 WARNING [train.py:1204] (2/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_927-43827-0_sp0.9 from training. Duration: 0.82225 2023-10-02 06:14:52,370 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_2821_210_2715_1_1526108385_3763560_73-296673-0 from training. Duration: 0.83 2023-10-02 06:14:52,443 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6287_210_3273_1_1527509506_2653716_182-199887-0 from training. Duration: 0.51 2023-10-02 06:14:52,922 WARNING [train.py:1204] (2/4) Exclude cut with ID _70519_210_4163_1_1535961595994_7458230_899-80223-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 06:14:53,171 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_210-234949-0_sp1.1 from training. Duration: 0.97275 2023-10-02 06:14:53,208 WARNING [train.py:1204] (2/4) Exclude cut with ID _30196_210_1664_1_1533708496522_7074140_768-278176-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 06:14:53,285 WARNING [train.py:1204] (2/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_315-128283-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 06:14:53,363 WARNING [train.py:1204] (2/4) Exclude cut with ID _14886_210_4220_1_1530702020355_3516190_330-323941-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 06:14:53,613 WARNING [train.py:1204] (2/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_264-348896-0_sp1.1 from training. Duration: 0.77275 2023-10-02 06:14:53,634 WARNING [train.py:1204] (2/4) Exclude cut with ID _37086_210_9038_1_1532999540891_7021250_433-10563-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 06:14:53,669 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_53638_210_21581_1_1534297319_5808449_885-80172-0_sp1.1 from training. Duration: 0.87275 2023-10-02 06:14:53,754 WARNING [train.py:1204] (2/4) Exclude cut with ID _14623_210_1925_1_1530773354962_3632609_157-218771-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 06:14:53,764 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_1368-78165-0_sp1.1 from training. Duration: 0.97275 2023-10-02 06:14:53,768 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_309-65177-0_sp0.9 from training. Duration: 0.97775 2023-10-02 06:14:53,774 WARNING [train.py:1204] (2/4) Exclude cut with ID _21632_210_2253_1_1531994001765_3744140_291-138812-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 06:14:53,785 WARNING [train.py:1204] (2/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_669-93163-0_sp1.1 from training. Duration: 0.8909375 2023-10-02 06:14:54,322 WARNING [train.py:1204] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_292-215346-0 from training. Duration: 0.97 2023-10-02 06:14:54,346 WARNING [train.py:1204] (2/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_286-222073-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 06:14:54,627 WARNING [train.py:1204] (2/4) Exclude cut with ID _59256_210_5986_1_1535076085997_3589120_121-237386-0_sp1.1 from training. Duration: 0.87275 2023-10-02 06:14:54,640 WARNING [train.py:1204] (2/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_96-320736-0 from training. Duration: 0.86 2023-10-02 06:14:54,650 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_128-202104-0 from training. Duration: 0.63 2023-10-02 06:14:55,211 WARNING [train.py:1204] (2/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_242-3472-0_sp0.9 from training. Duration: 0.811125 2023-10-02 06:14:55,254 WARNING [train.py:1204] (2/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_154-74182-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 06:14:55,257 WARNING [train.py:1204] (2/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_187-221566-0 from training. Duration: 0.43 2023-10-02 06:14:55,415 WARNING [train.py:1204] (2/4) Exclude cut with ID _6194_210_1534_1_1527511826160_3941194_14-282876-0 from training. Duration: 0.71 2023-10-02 06:14:55,456 WARNING [train.py:1204] (2/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_660-211088-0_sp1.1 from training. Duration: 0.563625 2023-10-02 06:14:55,786 WARNING [train.py:1204] (2/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_526-62079-0_sp1.1 from training. Duration: 0.6090625 2023-10-02 06:14:55,936 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_92-246270-0_sp1.1 from training. Duration: 0.77275 2023-10-02 06:14:56,024 WARNING [train.py:1204] (2/4) Exclude cut with ID IC0287W0023-142350 from training. Duration: 0.956 2023-10-02 06:14:56,098 WARNING [train.py:1204] (2/4) Exclude cut with ID _58605_210_12210_1_1534723195939_3904259_48-269402-0 from training. Duration: 0.96 2023-10-02 06:14:56,113 WARNING [train.py:1204] (2/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_416-257730-0_sp0.9 from training. Duration: 0.72225 2023-10-02 06:14:56,183 WARNING [train.py:1204] (2/4) Exclude cut with ID _7877_210_6014_1_1528286358099_3750136_185-17516-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 06:14:56,282 WARNING [train.py:1204] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_23-189234-0 from training. Duration: 0.83 2023-10-02 06:14:56,297 WARNING [train.py:1204] (2/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_237-89684-0 from training. Duration: 0.96 2023-10-02 06:14:56,364 WARNING [train.py:1204] (2/4) Exclude cut with ID _13916_210_9994_1_1530788341126_3763149_116-21034-0 from training. Duration: 0.96 2023-10-02 06:14:56,506 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_3133_210_2087_1_1526106446_2324690_246-125000-0_sp1.1 from training. Duration: 0.563625 2023-10-02 06:14:59,132 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.2.encoder.layers.0.feed_forward1.out_whiten, num_groups=1, num_channels=384, metric=6.88 vs. limit=15.0 2023-10-02 06:15:03,812 INFO [train.py:1046] (2/4) Epoch 23, batch 0, loss[loss=0.1805, simple_loss=0.2521, pruned_loss=0.0544, over 23779.00 frames. ], tot_loss[loss=0.1805, simple_loss=0.2521, pruned_loss=0.0544, over 23779.00 frames. ], batch size: 179, lr: 4.51e-03, grad_scale: 32.0 2023-10-02 06:15:03,812 INFO [train.py:1069] (2/4) Computing validation loss 2023-10-02 06:15:16,807 INFO [train.py:1078] (2/4) Epoch 23, validation: loss=0.2993, simple_loss=0.2685, pruned_loss=0.165, over 1125622.00 frames. 2023-10-02 06:15:16,808 INFO [train.py:1079] (2/4) Maximum memory allocated so far is 20967MB 2023-10-02 06:15:18,981 WARNING [train.py:1204] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_402-253027-0 from training. Duration: 0.54 2023-10-02 06:15:19,604 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.4.encoder.layers.0.feed_forward3.out_whiten, num_groups=1, num_channels=384, metric=12.76 vs. limit=15.0 2023-10-02 06:15:20,367 WARNING [train.py:1204] (2/4) Exclude cut with ID _59288_210_5854_1_1535077757774_3412246_158-289345-0_sp1.1 from training. Duration: 0.92725 2023-10-02 06:15:20,629 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.nonlin_attention.balancer.prob, batch_count=779106.6666666666, ans=0.125 2023-10-02 06:15:21,774 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_28211_210_6388_1_1532861506_4523224_128-328162-0_sp1.1 from training. Duration: 0.8181875 2023-10-02 06:15:22,025 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.bypass.skip_rate, batch_count=779106.6666666666, ans=0.07 2023-10-02 06:15:24,095 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.0.layers.1.nonlin_attention.whiten2, num_groups=1, num_channels=192, metric=6.16 vs. limit=15.0 2023-10-02 06:15:26,190 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_788-278281-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 06:15:26,197 WARNING [train.py:1204] (2/4) Exclude cut with ID _49728_210_12651_1_1534041034294_6788150_624-195301-0_sp0.9 from training. Duration: 0.9444375 2023-10-02 06:15:27,918 WARNING [train.py:1204] (2/4) Exclude cut with ID _14860_210_4915_1_1530702020340_7343240_267-139775-0_sp1.1 from training. Duration: 0.963625 2023-10-02 06:15:27,997 WARNING [train.py:1204] (2/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_332-221057-0 from training. Duration: 0.68 2023-10-02 06:15:29,463 WARNING [train.py:1204] (2/4) Exclude cut with ID _40402_210_18219_1_1533522324115_2953150_242-76943-0 from training. Duration: 0.94 2023-10-02 06:15:32,328 WARNING [train.py:1204] (2/4) Exclude cut with ID _16747_210_5986_1_1531811544733_2749109_87-349587-0_sp1.1 from training. Duration: 0.963625 2023-10-02 06:15:33,659 WARNING [train.py:1204] (2/4) Exclude cut with ID _22982_210_1883_1_1531882790447_5449409_505-346452-0_sp1.1 from training. Duration: 0.963625 2023-10-02 06:15:36,380 WARNING [train.py:1204] (2/4) Exclude cut with ID _17956_210_4353_1_1531209568125_6979970_13-192107-0_sp1.1 from training. Duration: 0.963625 2023-10-02 06:15:36,425 WARNING [train.py:1204] (2/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_430-307666-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 06:15:37,813 WARNING [train.py:1204] (2/4) Exclude cut with ID _2895_210_2477_1_1525956785794_4063750_289-11475-0_sp1.1 from training. Duration: 0.7454375 2023-10-02 06:15:37,824 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_3910_210_1794_1_1526742306_334530_28-238909-0_sp0.9 from training. Duration: 0.9 2023-10-02 06:15:39,236 WARNING [train.py:1204] (2/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_18-130336-0 from training. Duration: 0.79 2023-10-02 06:15:41,840 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_548-294316-0_sp0.9 from training. Duration: 0.9 2023-10-02 06:15:42,017 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.conv_module2.balancer2.prob, batch_count=779173.3333333334, ans=0.125 2023-10-02 06:15:43,685 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.578e+02 1.858e+02 2.101e+02 2.344e+02 3.915e+02, threshold=4.201e+02, percent-clipped=0.0 2023-10-02 06:15:48,766 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_42-142473-0_sp1.1 from training. Duration: 0.6545625 2023-10-02 06:15:48,771 WARNING [train.py:1204] (2/4) Exclude cut with ID _59170_210_6721_1_1534983633960_4203335_326-48066-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 06:15:50,324 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_45886_210_17024_1_1533709215_471063_7-79165-0 from training. Duration: 0.98 2023-10-02 06:15:54,748 WARNING [train.py:1204] (2/4) Exclude cut with ID _44585_210_18628_1_1533605350387_4518199_280-73534-0_sp0.9 from training. Duration: 0.988875 2023-10-02 06:15:54,754 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_3475_210_2028_1_1526193578_3622569_265-276625-0_sp1.1 from training. Duration: 0.6909375 2023-10-02 06:15:56,203 WARNING [train.py:1204] (2/4) Exclude cut with ID _64631_210_27767_1_1535349580068_4165160_88-225232-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 06:16:00,375 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_205-265518-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 06:16:00,626 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.self_attn_weights.pos_emb_skip_rate, batch_count=779306.6666666666, ans=0.0 2023-10-02 06:16:03,841 WARNING [train.py:1204] (2/4) Exclude cut with ID _40402_210_18219_1_1533522324115_2953150_271-253734-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 06:16:10,526 WARNING [train.py:1204] (2/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_282-257791-0 from training. Duration: 0.86 2023-10-02 06:16:13,922 WARNING [train.py:1204] (2/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_356-45054-0 from training. Duration: 0.86 2023-10-02 06:16:13,967 WARNING [train.py:1204] (2/4) Exclude cut with ID _69584_210_5219_1_1535785133775_4044450_38-348116-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 06:16:13,969 WARNING [train.py:1204] (2/4) Exclude cut with ID _66763_210_17301_1_1535788667617_241750_21-328480-0_sp1.1 from training. Duration: 0.936375 2023-10-02 06:16:15,391 WARNING [train.py:1204] (2/4) Exclude cut with ID _26528_210_9070_1_1532570410842_3851310_497-97218-0_sp0.9 from training. Duration: 0.9555625 2023-10-02 06:16:17,360 WARNING [train.py:1204] (2/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_592-101075-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 06:16:17,499 WARNING [train.py:1204] (2/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_678-159307-0 from training. Duration: 0.85 2023-10-02 06:16:20,186 WARNING [train.py:1204] (2/4) Exclude cut with ID _28427_210_4220_1_1532581240967_3061069_166-319773-0_sp1.1 from training. Duration: 0.936375 2023-10-02 06:16:20,966 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.2.encoder.layers.1.feed_forward3.out_whiten, num_groups=1, num_channels=384, metric=8.81 vs. limit=15.0 2023-10-02 06:16:22,090 WARNING [train.py:1204] (2/4) Exclude cut with ID _29877_210_6228_1_1532397791106_5276149_606-224984-0_sp1.1 from training. Duration: 0.936375 2023-10-02 06:16:23,599 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_125-203105-0_sp1.1 from training. Duration: 0.72725 2023-10-02 06:16:27,630 WARNING [train.py:1204] (2/4) Exclude cut with ID IC0620W0113-306765 from training. Duration: 0.768 2023-10-02 06:16:29,030 WARNING [train.py:1204] (2/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_410-287857-0_sp1.1 from training. Duration: 0.8454375 2023-10-02 06:16:31,772 INFO [train.py:1046] (2/4) Epoch 23, batch 50, loss[loss=0.1679, simple_loss=0.2525, pruned_loss=0.0417, over 24071.00 frames. ], tot_loss[loss=0.1714, simple_loss=0.2479, pruned_loss=0.04744, over 1069932.93 frames. ], batch size: 80, lr: 4.51e-03, grad_scale: 32.0 2023-10-02 06:16:31,833 WARNING [train.py:1204] (2/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_78-95260-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 06:16:35,019 WARNING [train.py:1204] (2/4) Exclude cut with ID _58059_210_5854_1_1534901471149_3164808_6-73080-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 06:16:35,033 WARNING [train.py:1204] (2/4) Exclude cut with ID _5166_210_2084_1_1527073197160_4087993_331-117829-0 from training. Duration: 0.62 2023-10-02 06:16:35,100 WARNING [train.py:1204] (2/4) Exclude cut with ID _14880_210_8895_1_1530680426655_3795259_289-212817-0_sp0.9 from training. Duration: 0.7666875 2023-10-02 06:16:35,135 WARNING [train.py:1204] (2/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_620-144779-0_sp1.1 from training. Duration: 0.92725 2023-10-02 06:16:37,834 WARNING [train.py:1204] (2/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_561-288575-0_sp1.1 from training. Duration: 0.963625 2023-10-02 06:16:39,258 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_22657_210_6890_1_1532086002_1637216_122-188128-0_sp1.1 from training. Duration: 0.963625 2023-10-02 06:16:40,599 WARNING [train.py:1204] (2/4) Exclude cut with ID _16756_210_4915_1_1531047803763_7104259_604-86607-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 06:16:42,582 WARNING [train.py:1204] (2/4) Exclude cut with ID _15150_210_8341_1_1530752411803_7256384_469-239509-0 from training. Duration: 0.68 2023-10-02 06:16:43,901 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_10987_210_4163_1_1529760555_4123902_302-87926-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 06:16:45,567 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.nonlin_attention.balancer.prob, batch_count=779506.6666666666, ans=0.125 2023-10-02 06:16:50,676 WARNING [train.py:1204] (2/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_469-94673-0_sp0.9 from training. Duration: 0.87775 2023-10-02 06:16:52,171 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.feed_forward2.hidden_balancer.prob, batch_count=779506.6666666666, ans=0.125 2023-10-02 06:16:53,301 WARNING [train.py:1204] (2/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_798-106179-0 from training. Duration: 0.83 2023-10-02 06:16:54,742 WARNING [train.py:1204] (2/4) Exclude cut with ID _16744_210_5986_1_1531724405384_3907567_242-111316-0 from training. Duration: 0.93 2023-10-02 06:16:57,489 WARNING [train.py:1204] (2/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_938-205135-0_sp0.9 from training. Duration: 0.9666875 2023-10-02 06:16:58,812 WARNING [train.py:1204] (2/4) Exclude cut with ID _64135_210_2253_1_1535677267876_7608972_133-188266-0_sp0.9 from training. Duration: 0.9 2023-10-02 06:16:58,816 WARNING [train.py:1204] (2/4) Exclude cut with ID _44585_210_18628_1_1533605350387_4518199_623-346820-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 06:17:00,158 WARNING [train.py:1204] (2/4) Exclude cut with ID _42326_210_7770_1_1533814282531_3707269_454-266903-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 06:17:00,242 WARNING [train.py:1204] (2/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_5-55893-0_sp1.1 from training. Duration: 0.5 2023-10-02 06:17:01,517 WARNING [train.py:1204] (2/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_577-332600-0_sp1.1 from training. Duration: 0.5181875 2023-10-02 06:17:01,518 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_957-138882-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 06:17:07,205 WARNING [train.py:1204] (2/4) Exclude cut with ID _12556_210_8341_1_1530081024100_7327690_219-38004-0_sp1.1 from training. Duration: 0.97275 2023-10-02 06:17:07,306 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_397-147196-0_sp1.1 from training. Duration: 0.72725 2023-10-02 06:17:09,182 WARNING [train.py:1204] (2/4) Exclude cut with ID _3541_210_2477_1_1526475408709_3263973_231-104736-0_sp0.9 from training. Duration: 0.8666875 2023-10-02 06:17:09,273 WARNING [train.py:1204] (2/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_398-334558-0 from training. Duration: 0.96 2023-10-02 06:17:10,629 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_257-20733-0_sp1.1 from training. Duration: 0.6909375 2023-10-02 06:17:12,022 WARNING [train.py:1204] (2/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_146-282400-0_sp1.1 from training. Duration: 0.6454375 2023-10-02 06:17:12,025 WARNING [train.py:1204] (2/4) Exclude cut with ID _51899_210_20034_1_1534129254466_3669220_265-215428-0 from training. Duration: 0.98 2023-10-02 06:17:12,091 WARNING [train.py:1204] (2/4) Exclude cut with ID _34423_210_8750_1_1533430798456_3330199_219-106541-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 06:17:15,181 WARNING [train.py:1204] (2/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_458-67321-0 from training. Duration: 0.97 2023-10-02 06:17:22,668 WARNING [train.py:1204] (2/4) Exclude cut with ID _6653_210_2629_1_1527919172441_6732679_1051-182606-0_sp1.1 from training. Duration: 0.936375 2023-10-02 06:17:22,682 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_53555_210_5632_1_1534595220_7232297_603-4396-0_sp1.1 from training. Duration: 0.97275 2023-10-02 06:17:22,786 WARNING [train.py:1204] (2/4) Exclude cut with ID _54218_210_12210_1_1534413646388_4409259_159-144367-0_sp1.1 from training. Duration: 0.963625 2023-10-02 06:17:23,281 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.5.encoder.layers.0.self_attn1.whiten, num_groups=1, num_channels=256, metric=13.15 vs. limit=22.5 2023-10-02 06:17:24,177 WARNING [train.py:1204] (2/4) Exclude cut with ID _7606_210_4915_1_1528894253277_5080690_301-10732-0_sp0.9 from training. Duration: 0.9 2023-10-02 06:17:24,180 WARNING [train.py:1204] (2/4) Exclude cut with ID _15026_210_8750_1_1530777602316_3291850_211-195043-0_sp0.9 from training. Duration: 0.888875 2023-10-02 06:17:26,969 WARNING [train.py:1204] (2/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_146-282400-0 from training. Duration: 0.71 2023-10-02 06:17:28,183 WARNING [train.py:1204] (2/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_65-88242-0 from training. Duration: 0.56 2023-10-02 06:17:29,524 WARNING [train.py:1204] (2/4) Exclude cut with ID _55857_210_2346_1_1534644007831_6898209_80-107757-0_sp1.1 from training. Duration: 0.963625 2023-10-02 06:17:29,567 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_15126_210_6753_1_1530778677_2591185_120-277707-0_sp0.9 from training. Duration: 0.888875 2023-10-02 06:17:30,945 WARNING [train.py:1204] (2/4) Exclude cut with ID _44778_210_2054_1_1533798127203_7443090_1033-168593-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 06:17:32,191 WARNING [train.py:1204] (2/4) Exclude cut with ID _46683_210_9642_1_1533730913492_4591116_475-30711-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 06:17:32,225 WARNING [train.py:1204] (2/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_253-308377-0 from training. Duration: 0.49 2023-10-02 06:17:32,285 WARNING [train.py:1204] (2/4) Exclude cut with ID _4680_210_3525_1_1527249757953_893386_75-78979-0 from training. Duration: 0.91 2023-10-02 06:17:33,608 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_5522_210_3273_1_1527249606_3909096_318-203153-0_sp0.9 from training. Duration: 0.47775 2023-10-02 06:17:33,782 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.conv_skip_rate, batch_count=779706.6666666666, ans=0.0 2023-10-02 06:17:34,995 WARNING [train.py:1204] (2/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_550-170116-0_sp1.1 from training. Duration: 0.92725 2023-10-02 06:17:35,028 WARNING [train.py:1204] (2/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_277-210798-0_sp0.9 from training. Duration: 0.611125 2023-10-02 06:17:35,252 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.feed_forward3.hidden_balancer.prob, batch_count=779706.6666666666, ans=0.125 2023-10-02 06:17:36,298 WARNING [train.py:1204] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_620-276793-0 from training. Duration: 0.59 2023-10-02 06:17:36,308 WARNING [train.py:1204] (2/4) Exclude cut with ID _3067_210_2667_1_1526212974640_4503168_119-175776-0 from training. Duration: 0.74 2023-10-02 06:17:37,656 WARNING [train.py:1204] (2/4) Exclude cut with ID _55209_210_1531_1_1534675828996_4597160_681-195827-0_sp1.1 from training. Duration: 0.92725 2023-10-02 06:17:38,915 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_427-78097-0_sp1.1 from training. Duration: 0.72725 2023-10-02 06:17:40,923 WARNING [train.py:1204] (2/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_96-290039-0_sp0.9 from training. Duration: 0.62225 2023-10-02 06:17:40,944 WARNING [train.py:1204] (2/4) Exclude cut with ID _45329_210_7005_1_1534157951211_6774339_287-292174-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 06:17:43,607 WARNING [train.py:1204] (2/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_200-292228-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 06:17:45,360 INFO [train.py:1046] (2/4) Epoch 23, batch 100, loss[loss=0.1834, simple_loss=0.2675, pruned_loss=0.04964, over 24413.00 frames. ], tot_loss[loss=0.1731, simple_loss=0.251, pruned_loss=0.04762, over 1890875.78 frames. ], batch size: 77, lr: 4.51e-03, grad_scale: 32.0 2023-10-02 06:17:47,327 WARNING [train.py:1204] (2/4) Exclude cut with ID _69884_210_9771_1_1535846392818_4166169_291-194999-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 06:17:50,204 WARNING [train.py:1204] (2/4) Exclude cut with ID _58895_210_8750_1_1535184081320_3649089_265-323810-0_sp1.1 from training. Duration: 0.87275 2023-10-02 06:17:51,684 WARNING [train.py:1204] (2/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_58-68956-0 from training. Duration: 0.83 2023-10-02 06:17:51,689 WARNING [train.py:1204] (2/4) Exclude cut with ID _39867_210_9826_1_1533553222030_9344259_322-115153-0_sp1.1 from training. Duration: 0.963625 2023-10-02 06:17:52,129 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.conv_module1.balancer2.min_positive, batch_count=779773.3333333334, ans=0.05 2023-10-02 06:17:56,575 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_3371_210_1794_1_1526384462_5580777_396-339812-0_sp1.1 from training. Duration: 0.77275 2023-10-02 06:17:56,594 WARNING [train.py:1204] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_731-2428-0_sp0.9 from training. Duration: 0.92225 2023-10-02 06:17:56,627 WARNING [train.py:1204] (2/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_98-741-0_sp1.1 from training. Duration: 0.72725 2023-10-02 06:17:56,629 WARNING [train.py:1204] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_238-245655-0_sp0.9 from training. Duration: 0.9 2023-10-02 06:17:57,920 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6788_210_1388_1_1527898698_4562021_91-197981-0_sp0.9 from training. Duration: 0.92225 2023-10-02 06:17:59,318 WARNING [train.py:1204] (2/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_860-96198-0 from training. Duration: 0.73 2023-10-02 06:18:00,732 WARNING [train.py:1204] (2/4) Exclude cut with ID _14268_210_9206_1_1530622771285_4714099_331-105351-0_sp0.9 from training. Duration: 0.72225 2023-10-02 06:18:02,027 WARNING [train.py:1204] (2/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_204-137133-0_sp1.1 from training. Duration: 0.92725 2023-10-02 06:18:02,056 WARNING [train.py:1204] (2/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_5-46496-0_sp1.1 from training. Duration: 0.936375 2023-10-02 06:18:02,064 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_160-218721-0_sp1.1 from training. Duration: 0.87275 2023-10-02 06:18:03,618 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=779840.0, ans=0.1 2023-10-02 06:18:04,759 WARNING [train.py:1204] (2/4) Exclude cut with ID _45329_210_7005_1_1534157951211_6774339_503-287883-0 from training. Duration: 0.88 2023-10-02 06:18:06,235 WARNING [train.py:1204] (2/4) Exclude cut with ID _57572_210_5517_1_1534921219624_3809484_110-79134-0_sp1.1 from training. Duration: 0.92725 2023-10-02 06:18:07,574 WARNING [train.py:1204] (2/4) Exclude cut with ID _44585_210_18628_1_1533605350387_4518199_421-37641-0_sp1.1 from training. Duration: 0.936375 2023-10-02 06:18:08,844 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_42-142473-0_sp0.9 from training. Duration: 0.8 2023-10-02 06:18:10,324 WARNING [train.py:1204] (2/4) Exclude cut with ID _5166_210_2084_1_1527073197160_4087993_310-114452-0_sp0.9 from training. Duration: 0.6555625 2023-10-02 06:18:12,034 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.550e+02 1.840e+02 2.037e+02 2.251e+02 3.061e+02, threshold=4.074e+02, percent-clipped=0.0 2023-10-02 06:18:13,479 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9029W0120-509520 from training. Duration: 0.768 2023-10-02 06:18:13,493 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9029W0433-509820 from training. Duration: 0.896 2023-10-02 06:18:16,692 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_40222_210_6228_1_1533385298_4481939_163-334586-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 06:18:16,693 WARNING [train.py:1204] (2/4) Exclude cut with ID _48677_210_5663_1_1533902405919_3892989_408-40740-0_sp1.1 from training. Duration: 0.7545625 2023-10-02 06:18:20,117 WARNING [train.py:1204] (2/4) Exclude cut with ID _24314_210_4915_1_1532135078116_7170179_521-258868-0_sp0.9 from training. Duration: 0.87775 2023-10-02 06:18:21,377 WARNING [train.py:1204] (2/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_576-51198-0_sp1.1 from training. Duration: 0.92725 2023-10-02 06:18:22,764 WARNING [train.py:1204] (2/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_306-216871-0_sp1.1 from training. Duration: 0.97275 2023-10-02 06:18:26,350 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.1.encoder.layers.0.nonlin_attention.whiten2, num_groups=1, num_channels=256, metric=7.42 vs. limit=15.0 2023-10-02 06:18:28,612 WARNING [train.py:1204] (2/4) Exclude cut with ID _20684_210_6917_1_1531567751646_4203930_463-119982-0_sp1.1 from training. Duration: 0.97275 2023-10-02 06:18:28,689 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9084W0012-536850 from training. Duration: 0.64 2023-10-02 06:18:30,070 WARNING [train.py:1204] (2/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_847-41334-0_sp1.1 from training. Duration: 0.4 2023-10-02 06:18:32,932 WARNING [train.py:1204] (2/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_326-165900-0_sp1.1 from training. Duration: 0.62725 2023-10-02 06:18:34,331 WARNING [train.py:1204] (2/4) Exclude cut with ID _7537_210_2084_1_1528502395431_3924359_128-268029-0_sp1.1 from training. Duration: 0.8909375 2023-10-02 06:18:36,037 WARNING [train.py:1204] (2/4) Exclude cut with ID _67499_210_17282_1_1535509805220_3692430_333-155030-0_sp1.1 from training. Duration: 0.97275 2023-10-02 06:18:38,836 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_34008_210_10431_1_1533345424_8183247_588-257177-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 06:18:41,651 WARNING [train.py:1204] (2/4) Exclude cut with ID _25914_210_6400_1_1532141317058_644740_52-96808-0_sp1.1 from training. Duration: 0.82725 2023-10-02 06:18:43,456 WARNING [train.py:1204] (2/4) Exclude cut with ID _56494_210_4220_1_1535284837625_3033169_69-138150-0_sp1.1 from training. Duration: 0.8545625 2023-10-02 06:18:45,606 WARNING [train.py:1204] (2/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_19-143553-0_sp1.1 from training. Duration: 0.97275 2023-10-02 06:18:45,659 WARNING [train.py:1204] (2/4) Exclude cut with ID _21878_210_3525_1_1531738772002_7264330_582-130735-0_sp1.1 from training. Duration: 0.936375 2023-10-02 06:18:48,341 WARNING [train.py:1204] (2/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_943-201356-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 06:18:48,350 WARNING [train.py:1204] (2/4) Exclude cut with ID _3231_210_2106_1_1526205864934_6784220_422-229276-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 06:18:48,382 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_48049_210_5632_1_1533864553_3519937_73-198717-0_sp1.1 from training. Duration: 0.97275 2023-10-02 06:18:49,889 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.0.layers.1.conv_module2.whiten, num_groups=1, num_channels=192, metric=9.43 vs. limit=15.0 2023-10-02 06:18:50,320 WARNING [train.py:1204] (2/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_329-285801-0 from training. Duration: 0.91 2023-10-02 06:18:50,344 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9171W0114-579000 from training. Duration: 0.896 2023-10-02 06:18:51,538 WARNING [train.py:1204] (2/4) Exclude cut with ID _3120_210_3525_1_1526036143112_4072980_210-132675-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 06:18:51,614 WARNING [train.py:1204] (2/4) Exclude cut with ID _55857_210_2346_1_1534644007831_6898209_745-98391-0_sp0.9 from training. Duration: 0.8444375 2023-10-02 06:18:52,937 WARNING [train.py:1204] (2/4) Exclude cut with ID _5250_210_5219_1_1527249561806_8692110_615-322035-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 06:18:52,941 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_36756_210_7234_1_1533026991_966276_119-315767-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 06:18:52,965 WARNING [train.py:1204] (2/4) Exclude cut with ID _13916_210_9994_1_1530788341126_3763149_60-191556-0_sp1.1 from training. Duration: 0.5545625 2023-10-02 06:18:52,983 WARNING [train.py:1204] (2/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_177-236809-0_sp1.1 from training. Duration: 0.4454375 2023-10-02 06:18:54,327 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7002_210_1794_1_1527939683_5157286_184-229630-0_sp1.1 from training. Duration: 0.67275 2023-10-02 06:18:54,333 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_50428_210_5724_1_1534247532_4126418_464-18322-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 06:18:54,395 WARNING [train.py:1204] (2/4) Exclude cut with ID _35799_210_5854_1_1533263487887_3529071_466-131402-0_sp1.1 from training. Duration: 0.936375 2023-10-02 06:18:55,826 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_1259-132768-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 06:18:55,868 WARNING [train.py:1204] (2/4) Exclude cut with ID _6694_210_4599_1_1527852637490_3965529_144-185051-0_sp1.1 from training. Duration: 0.87275 2023-10-02 06:18:57,550 WARNING [train.py:1204] (2/4) Exclude cut with ID _64639_210_5816_1_1535367619437_3528000_59-133290-0_sp1.1 from training. Duration: 0.963625 2023-10-02 06:18:59,021 WARNING [train.py:1204] (2/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_223-49525-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 06:19:00,312 INFO [train.py:1046] (2/4) Epoch 23, batch 150, loss[loss=0.1799, simple_loss=0.2653, pruned_loss=0.04721, over 24004.00 frames. ], tot_loss[loss=0.1735, simple_loss=0.2512, pruned_loss=0.04789, over 2523919.66 frames. ], batch size: 80, lr: 4.51e-03, grad_scale: 32.0 2023-10-02 06:19:00,493 WARNING [train.py:1204] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_748-36150-0_sp1.1 from training. Duration: 0.82725 2023-10-02 06:19:00,494 WARNING [train.py:1204] (2/4) Exclude cut with ID _19896_210_1883_1_1531709022662_6649569_52-221627-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 06:19:00,646 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=780106.6666666666, ans=0.1 2023-10-02 06:19:01,939 WARNING [train.py:1204] (2/4) Exclude cut with ID _6789_210_1534_1_1527854471671_2870519_296-115766-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 06:19:04,693 WARNING [train.py:1204] (2/4) Exclude cut with ID _14624_210_1925_1_1530858690657_3642219_100-19871-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 06:19:04,733 WARNING [train.py:1204] (2/4) Exclude cut with ID _12555_210_8750_1_1530075594402_3780239_50-293675-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 06:19:07,640 WARNING [train.py:1204] (2/4) Exclude cut with ID _14880_210_8895_1_1530680426655_3795259_289-212817-0_sp1.1 from training. Duration: 0.62725 2023-10-02 06:19:08,973 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_14056_210_4163_1_1530604483_7572827_388-211626-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 06:19:12,073 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.balancer1.prob, batch_count=780106.6666666666, ans=0.125 2023-10-02 06:19:13,774 WARNING [train.py:1204] (2/4) Exclude cut with ID _5289_210_2084_1_1527678202736_3779299_392-261988-0 from training. Duration: 0.65 2023-10-02 06:19:15,092 WARNING [train.py:1204] (2/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_124-345840-0 from training. Duration: 0.52 2023-10-02 06:19:15,097 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_2655_210_2087_1_1525688959_3897400_437-337690-0 from training. Duration: 0.63 2023-10-02 06:19:18,366 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_3780_210_2087_1_1526467391_239068_13-274633-0_sp1.1 from training. Duration: 0.92725 2023-10-02 06:19:18,371 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_805-71497-0_sp1.1 from training. Duration: 0.6454375 2023-10-02 06:19:18,456 WARNING [train.py:1204] (2/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_434-83000-0_sp1.1 from training. Duration: 0.8909375 2023-10-02 06:19:19,880 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_17613_210_2151_1_1531137569_2366160_285-219531-0_sp1.1 from training. Duration: 0.97275 2023-10-02 06:19:21,179 WARNING [train.py:1204] (2/4) Exclude cut with ID _56108_210_16380_1_1534505446159_3887959_22-37858-0_sp1.1 from training. Duration: 0.936375 2023-10-02 06:19:21,203 WARNING [train.py:1204] (2/4) Exclude cut with ID _28427_210_4220_1_1532581240967_3061069_200-88444-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 06:19:21,273 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_43802_210_2290_1_1533516203_8829504_77-279042-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 06:19:23,114 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9309W0193-640320 from training. Duration: 0.896 2023-10-02 06:19:24,400 WARNING [train.py:1204] (2/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_516-324055-0_sp1.1 from training. Duration: 0.936375 2023-10-02 06:19:30,348 WARNING [train.py:1204] (2/4) Exclude cut with ID _40094_210_16380_1_1533294005773_3641159_10-261240-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 06:19:34,593 WARNING [train.py:1204] (2/4) Exclude cut with ID _6267_210_2084_1_1527764658526_3894730_375-219887-0_sp1.1 from training. Duration: 0.6090625 2023-10-02 06:19:34,684 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6788_210_1388_1_1527898698_4562021_180-75288-0 from training. Duration: 0.99 2023-10-02 06:19:37,360 WARNING [train.py:1204] (2/4) Exclude cut with ID _8030_210_7497_1_1528444886781_3531089_262-86750-0_sp1.1 from training. Duration: 0.736375 2023-10-02 06:19:37,381 WARNING [train.py:1204] (2/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_934-325498-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 06:19:37,418 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_456-22141-0_sp0.9 from training. Duration: 0.97775 2023-10-02 06:19:40,238 WARNING [train.py:1204] (2/4) Exclude cut with ID _49643_210_7989_1_1534640244224_3939031_598-161926-0_sp1.1 from training. Duration: 0.8181875 2023-10-02 06:19:42,843 WARNING [train.py:1204] (2/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_345-49721-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 06:19:42,943 WARNING [train.py:1204] (2/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_440-141811-0_sp1.1 from training. Duration: 0.863625 2023-10-02 06:19:44,334 WARNING [train.py:1204] (2/4) Exclude cut with ID _64048_210_10626_1_1535198382802_3835179_394-212120-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 06:19:44,383 WARNING [train.py:1204] (2/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_91-277684-0 from training. Duration: 0.74 2023-10-02 06:19:52,364 WARNING [train.py:1204] (2/4) Exclude cut with ID _23038_210_9570_1_1532001584158_3652419_191-346343-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 06:19:52,429 WARNING [train.py:1204] (2/4) Exclude cut with ID _15140_210_9288_1_1530791388450_4880149_351-347974-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 06:19:53,220 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.2.encoder.layers.1.conv_module1.whiten, num_groups=1, num_channels=384, metric=7.07 vs. limit=15.0 2023-10-02 06:19:53,752 WARNING [train.py:1204] (2/4) Exclude cut with ID _58605_210_12210_1_1534723195939_3904259_48-269402-0_sp1.1 from training. Duration: 0.87275 2023-10-02 06:19:53,758 WARNING [train.py:1204] (2/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_401-53423-0_sp0.9 from training. Duration: 0.611125 2023-10-02 06:19:55,821 WARNING [train.py:1204] (2/4) Exclude cut with ID _42659_210_2070_1_1533607442765_3120380_158-49004-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 06:19:58,004 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.2.encoder.layers.0.self_attn1.whiten, num_groups=1, num_channels=384, metric=21.02 vs. limit=22.5 2023-10-02 06:19:58,659 WARNING [train.py:1204] (2/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_81-75849-0_sp1.1 from training. Duration: 0.3545625 2023-10-02 06:20:01,419 WARNING [train.py:1204] (2/4) Exclude cut with ID _27052_210_5986_1_1532329197187_3585009_314-106499-0_sp1.1 from training. Duration: 0.836375 2023-10-02 06:20:04,259 WARNING [train.py:1204] (2/4) Exclude cut with ID _4626_210_3532_1_1526817652717_3916859_446-206656-0_sp0.9 from training. Duration: 0.8444375 2023-10-02 06:20:07,460 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_53378_210_17282_1_1534221071_5769509_158-114960-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 06:20:08,926 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_3171_210_2087_1_1526109084_2330007_37-64334-0_sp0.9 from training. Duration: 0.911125 2023-10-02 06:20:08,941 WARNING [train.py:1204] (2/4) Exclude cut with ID _5410_210_1388_1_1527397055795_1719020_39-112221-0 from training. Duration: 0.69 2023-10-02 06:20:08,957 WARNING [train.py:1204] (2/4) Exclude cut with ID _8064_210_5399_1_1528372299291_4282973_4-91037-0_sp0.9 from training. Duration: 0.97775 2023-10-02 06:20:08,970 WARNING [train.py:1204] (2/4) Exclude cut with ID ID0079W0443-717795 from training. Duration: 0.541 2023-10-02 06:20:11,802 WARNING [train.py:1204] (2/4) Exclude cut with ID _34734_210_4525_1_1533365977542_4448720_766-297928-0_sp1.1 from training. Duration: 0.936375 2023-10-02 06:20:14,527 INFO [train.py:1046] (2/4) Epoch 23, batch 200, loss[loss=0.1643, simple_loss=0.2466, pruned_loss=0.041, over 24497.00 frames. ], tot_loss[loss=0.1741, simple_loss=0.2522, pruned_loss=0.04797, over 2999576.01 frames. ], batch size: 63, lr: 4.51e-03, grad_scale: 32.0 2023-10-02 06:20:15,975 WARNING [train.py:1204] (2/4) Exclude cut with ID _51471_210_1889_1_1534399391390_3094250_310-337793-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 06:20:15,988 WARNING [train.py:1204] (2/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_678-159307-0_sp0.9 from training. Duration: 0.9444375 2023-10-02 06:20:17,483 WARNING [train.py:1204] (2/4) Exclude cut with ID _50093_210_4220_1_1534417141207_3432299_259-322252-0 from training. Duration: 0.92 2023-10-02 06:20:17,541 WARNING [train.py:1204] (2/4) Exclude cut with ID _47790_210_16187_1_1533862814066_4207519_67-103444-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 06:20:17,557 WARNING [train.py:1204] (2/4) Exclude cut with ID _40551_210_10635_1_1533776424632_3416179_311-247578-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 06:20:20,306 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_4633_210_2040_1_1526824163_4145343_416-76117-0 from training. Duration: 0.95 2023-10-02 06:20:22,174 WARNING [train.py:1204] (2/4) Exclude cut with ID _5166_210_2084_1_1527073197160_4087993_331-117829-0_sp0.9 from training. Duration: 0.688875 2023-10-02 06:20:23,507 WARNING [train.py:1204] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_299-247059-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 06:20:24,883 WARNING [train.py:1204] (2/4) Exclude cut with ID _64002_210_9570_1_1535198605157_38979_2-240845-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 06:20:30,899 WARNING [train.py:1204] (2/4) Exclude cut with ID _4681_210_3525_1_1527250689464_120889_15-126760-0_sp0.9 from training. Duration: 0.9 2023-10-02 06:20:30,931 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_21500_210_2054_1_1532149310_2716758_256-156611-0_sp1.1 from training. Duration: 0.936375 2023-10-02 06:20:30,954 WARNING [train.py:1204] (2/4) Exclude cut with ID _16908_210_5724_1_1531119157248_3772700_624-203729-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 06:20:40,955 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.489e+02 1.789e+02 2.004e+02 2.358e+02 3.840e+02, threshold=4.009e+02, percent-clipped=0.0 2023-10-02 06:20:49,382 WARNING [train.py:1204] (2/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_605-219732-0_sp1.1 from training. Duration: 0.8545625 2023-10-02 06:20:49,418 WARNING [train.py:1204] (2/4) Exclude cut with ID _44718_210_5816_1_1534050051869_6469030_380-167520-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 06:20:50,683 WARNING [train.py:1204] (2/4) Exclude cut with ID _19896_210_1883_1_1531709022662_6649569_94-229927-0_sp1.1 from training. Duration: 0.8818125 2023-10-02 06:20:52,048 WARNING [train.py:1204] (2/4) Exclude cut with ID _19514_210_9259_1_1531530071330_5084230_519-174300-0_sp1.1 from training. Duration: 0.963625 2023-10-02 06:20:52,124 WARNING [train.py:1204] (2/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_340-174176-0_sp0.9 from training. Duration: 0.5444375 2023-10-02 06:20:52,127 WARNING [train.py:1204] (2/4) Exclude cut with ID _8263_210_1664_1_1528439452891_970220_68-181805-0_sp1.1 from training. Duration: 0.5818125 2023-10-02 06:20:53,571 WARNING [train.py:1204] (2/4) Exclude cut with ID _3120_210_3525_1_1526036143112_4072980_348-257723-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 06:20:55,315 WARNING [train.py:1204] (2/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_453-133983-0_sp1.1 from training. Duration: 0.7090625 2023-10-02 06:20:57,155 WARNING [train.py:1204] (2/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_17-86060-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 06:20:57,185 WARNING [train.py:1204] (2/4) Exclude cut with ID _29877_210_6228_1_1532397791106_5276149_228-184657-0_sp1.1 from training. Duration: 0.97275 2023-10-02 06:20:58,576 WARNING [train.py:1204] (2/4) Exclude cut with ID _18516_210_9206_1_1531704653206_3692406_198-334870-0 from training. Duration: 0.92 2023-10-02 06:21:00,333 WARNING [train.py:1204] (2/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_340-174176-0_sp1.1 from training. Duration: 0.4454375 2023-10-02 06:21:00,492 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.attention_skip_rate, batch_count=780640.0, ans=0.0 2023-10-02 06:21:01,655 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_3133_210_2087_1_1526106446_2324690_14-139186-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 06:21:01,919 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=780640.0, ans=0.1 2023-10-02 06:21:05,935 WARNING [train.py:1204] (2/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_126-29104-0_sp0.9 from training. Duration: 0.9444375 2023-10-02 06:21:07,590 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.ff2_skip_rate, batch_count=780640.0, ans=0.0 2023-10-02 06:21:12,183 WARNING [train.py:1204] (2/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_84-10310-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 06:21:17,862 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_15517_210_6753_1_1530954008_3487336_79-273076-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 06:21:19,218 WARNING [train.py:1204] (2/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_371-18169-0_sp0.9 from training. Duration: 0.9555625 2023-10-02 06:21:20,757 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=780706.6666666666, ans=0.1 2023-10-02 06:21:24,695 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_68702_210_23689_1_1535799577_3420068_170-92788-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 06:21:26,213 WARNING [train.py:1204] (2/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_78-182912-0 from training. Duration: 0.59 2023-10-02 06:21:27,513 INFO [train.py:1046] (2/4) Epoch 23, batch 250, loss[loss=0.1791, simple_loss=0.256, pruned_loss=0.05109, over 23670.00 frames. ], tot_loss[loss=0.1736, simple_loss=0.2516, pruned_loss=0.04781, over 3375546.53 frames. ], batch size: 149, lr: 4.51e-03, grad_scale: 16.0 2023-10-02 06:21:27,605 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_3019_210_2028_1_1526044245_3264865_297-12384-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 06:21:27,610 WARNING [train.py:1204] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_147-111137-0_sp1.1 from training. Duration: 0.863625 2023-10-02 06:21:27,615 WARNING [train.py:1204] (2/4) Exclude cut with ID _28323_210_16187_1_1532480395667_4100300_117-8859-0_sp1.1 from training. Duration: 0.97275 2023-10-02 06:21:29,645 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7762_210_1794_1_1528199508_4057408_419-125009-0_sp1.1 from training. Duration: 0.7545625 2023-10-02 06:21:31,633 WARNING [train.py:1204] (2/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_268-87909-0 from training. Duration: 0.99 2023-10-02 06:21:32,740 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.4.encoder.layers.1.self_attn2.whiten, num_groups=1, num_channels=384, metric=19.06 vs. limit=22.5 2023-10-02 06:21:33,404 WARNING [train.py:1204] (2/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_118-311357-0_sp1.1 from training. Duration: 0.92725 2023-10-02 06:21:33,419 WARNING [train.py:1204] (2/4) Exclude cut with ID ID0377W0003-870225 from training. Duration: 0.852 2023-10-02 06:21:34,930 WARNING [train.py:1204] (2/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_56-5838-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 06:21:37,570 WARNING [train.py:1204] (2/4) Exclude cut with ID _14603_210_4220_1_1530615688441_3097319_90-31383-0_sp0.9 from training. Duration: 0.9666875 2023-10-02 06:21:38,905 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_31583_210_3852_1_1532595851_4024413_58-86657-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 06:21:38,949 WARNING [train.py:1204] (2/4) Exclude cut with ID _30304_210_6319_1_1532476827261_7582060_344-113875-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 06:21:41,747 WARNING [train.py:1204] (2/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_278-104676-0_sp1.1 from training. Duration: 0.8909375 2023-10-02 06:21:41,777 WARNING [train.py:1204] (2/4) Exclude cut with ID _55844_210_2346_1_1534471212569_7625707_764-2387-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 06:21:43,693 WARNING [train.py:1204] (2/4) Exclude cut with ID _27430_210_7770_1_1532604689375_1378590_157-14963-0_sp1.1 from training. Duration: 0.963625 2023-10-02 06:21:46,387 WARNING [train.py:1204] (2/4) Exclude cut with ID _7160_210_1876_1_1527940923231_3703799_282-322611-0_sp1.1 from training. Duration: 0.7909375 2023-10-02 06:21:53,574 WARNING [train.py:1204] (2/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_190-138640-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 06:21:56,433 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_33348_210_5632_1_1533086912_3324030_63-270922-0_sp1.1 from training. Duration: 0.97275 2023-10-02 06:21:56,458 WARNING [train.py:1204] (2/4) Exclude cut with ID _54871_210_22356_1_1534665798253_3856359_135-184281-0_sp1.1 from training. Duration: 0.9 2023-10-02 06:22:02,503 WARNING [train.py:1204] (2/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_277-210798-0_sp1.1 from training. Duration: 0.5 2023-10-02 06:22:03,794 WARNING [train.py:1204] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_402-253027-0_sp0.9 from training. Duration: 0.6 2023-10-02 06:22:03,861 WARNING [train.py:1204] (2/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_540-275685-0_sp0.9 from training. Duration: 0.988875 2023-10-02 06:22:03,908 WARNING [train.py:1204] (2/4) Exclude cut with ID _59493_210_17282_1_1535078419077_3225879_118-18960-0_sp1.1 from training. Duration: 0.936375 2023-10-02 06:22:05,218 WARNING [train.py:1204] (2/4) Exclude cut with ID _6194_210_1534_1_1527511826160_3941194_321-148641-0_sp1.1 from training. Duration: 0.5818125 2023-10-02 06:22:05,233 WARNING [train.py:1204] (2/4) Exclude cut with ID _61133_210_9969_1_1535196777227_3696409_333-270325-0_sp1.1 from training. Duration: 0.8454375 2023-10-02 06:22:05,277 WARNING [train.py:1204] (2/4) Exclude cut with ID _49376_210_5986_1_1533985278732_3370740_307-145221-0_sp1.1 from training. Duration: 0.936375 2023-10-02 06:22:07,774 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6846_210_6753_1_1527850767_4295158_2-315073-0_sp0.9 from training. Duration: 0.97775 2023-10-02 06:22:09,687 WARNING [train.py:1204] (2/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_17-25045-0 from training. Duration: 0.59 2023-10-02 06:22:09,705 WARNING [train.py:1204] (2/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_503-333510-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 06:22:12,386 WARNING [train.py:1204] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_238-245655-0_sp1.1 from training. Duration: 0.736375 2023-10-02 06:22:12,413 WARNING [train.py:1204] (2/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_329-82235-0_sp0.9 from training. Duration: 0.911125 2023-10-02 06:22:12,418 WARNING [train.py:1204] (2/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_325-113798-0_sp0.9 from training. Duration: 0.9444375 2023-10-02 06:22:13,806 WARNING [train.py:1204] (2/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_395-37222-0_sp0.9 from training. Duration: 0.8444375 2023-10-02 06:22:15,085 WARNING [train.py:1204] (2/4) Exclude cut with ID _64135_210_2253_1_1535677267876_7608972_498-5039-0_sp1.1 from training. Duration: 0.7090625 2023-10-02 06:22:15,101 WARNING [train.py:1204] (2/4) Exclude cut with ID _14922_210_6228_1_1530707435273_3693310_303-228738-0_sp0.9 from training. Duration: 0.9333125 2023-10-02 06:22:17,870 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_577-220899-0_sp1.1 from training. Duration: 0.92725 2023-10-02 06:22:19,166 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_3475_210_2028_1_1526193578_3622569_377-166133-0_sp1.1 from training. Duration: 0.7818125 2023-10-02 06:22:19,201 WARNING [train.py:1204] (2/4) Exclude cut with ID _49376_210_5986_1_1533985278732_3370740_272-313512-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 06:22:19,416 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.feed_forward1.hidden_balancer.prob, batch_count=780973.3333333334, ans=0.125 2023-10-02 06:22:20,923 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.conv_module2.balancer1.prob, batch_count=780973.3333333334, ans=0.125 2023-10-02 06:22:22,190 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7762_210_1794_1_1528199508_4057408_422-227034-0_sp0.9 from training. Duration: 0.811125 2023-10-02 06:22:24,967 WARNING [train.py:1204] (2/4) Exclude cut with ID _18895_210_6228_1_1531357274234_3784930_74-341428-0_sp1.1 from training. Duration: 0.92725 2023-10-02 06:22:28,197 WARNING [train.py:1204] (2/4) Exclude cut with ID _44902_210_16187_1_1533888006062_3792078_149-67212-0_sp1.1 from training. Duration: 0.7909375 2023-10-02 06:22:29,123 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.attention_skip_rate, batch_count=781040.0, ans=0.0 2023-10-02 06:22:33,291 WARNING [train.py:1204] (2/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_243-126020-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 06:22:34,750 WARNING [train.py:1204] (2/4) Exclude cut with ID _54218_210_12210_1_1534413646388_4409259_259-6561-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 06:22:37,642 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.0.ff3_skip_rate, batch_count=781040.0, ans=0.0 2023-10-02 06:22:38,789 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_41452_210_7291_1_1533437687_3941281_405-11559-0 from training. Duration: 0.91 2023-10-02 06:22:40,623 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_20120_210_6753_1_1531643035_5482859_759-225084-0_sp1.1 from training. Duration: 0.97275 2023-10-02 06:22:40,638 WARNING [train.py:1204] (2/4) Exclude cut with ID _14747_210_8341_1_1530685797463_3654089_147-131229-0_sp0.9 from training. Duration: 0.8444375 2023-10-02 06:22:42,056 WARNING [train.py:1204] (2/4) Exclude cut with ID _14867_210_1968_1_1530684179890_3846969_319-250868-0 from training. Duration: 0.99 2023-10-02 06:22:42,079 WARNING [train.py:1204] (2/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_580-93798-0_sp0.9 from training. Duration: 0.62225 2023-10-02 06:22:43,423 INFO [train.py:1046] (2/4) Epoch 23, batch 300, loss[loss=0.1851, simple_loss=0.2607, pruned_loss=0.05478, over 23414.00 frames. ], tot_loss[loss=0.1718, simple_loss=0.2484, pruned_loss=0.04761, over 3659640.91 frames. ], batch size: 106, lr: 4.51e-03, grad_scale: 16.0 2023-10-02 06:22:43,479 WARNING [train.py:1204] (2/4) Exclude cut with ID _9679_210_7497_1_1529129212906_4363929_512-313889-0_sp0.9 from training. Duration: 0.9 2023-10-02 06:22:43,489 WARNING [train.py:1204] (2/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_18-189198-0 from training. Duration: 0.9 2023-10-02 06:22:48,740 WARNING [train.py:1204] (2/4) Exclude cut with ID _49755_210_18053_1_1534581005846_3218709_137-64179-0_sp1.1 from training. Duration: 0.92725 2023-10-02 06:22:50,223 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_48049_210_5632_1_1533864553_3519937_306-283794-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 06:22:51,809 WARNING [train.py:1204] (2/4) Exclude cut with ID _43552_210_8913_1_1533726031059_3523429_25-120161-0_sp1.1 from training. Duration: 0.8909375 2023-10-02 06:22:53,158 WARNING [train.py:1204] (2/4) Exclude cut with ID _8833_210_5219_1_1528592665210_7553059_319-349181-0 from training. Duration: 0.99 2023-10-02 06:22:55,073 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_296-332325-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 06:22:56,363 WARNING [train.py:1204] (2/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_436-186311-0_sp1.1 from training. Duration: 0.5909375 2023-10-02 06:22:56,374 WARNING [train.py:1204] (2/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_348-127789-0 from training. Duration: 0.85 2023-10-02 06:22:56,387 WARNING [train.py:1204] (2/4) Exclude cut with ID _3992_210_1747_1_1526644816031_3731060_443-195183-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 06:23:00,393 WARNING [train.py:1204] (2/4) Exclude cut with ID _3465_210_3525_1_1526175667060_3574049_308-231735-0_sp1.1 from training. Duration: 0.663625 2023-10-02 06:23:05,154 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6786_210_1388_1_1527839699_4194495_378-46070-0_sp1.1 from training. Duration: 0.6545625 2023-10-02 06:23:05,207 WARNING [train.py:1204] (2/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_63-101996-0 from training. Duration: 0.88 2023-10-02 06:23:06,763 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_2541_210_1794_1_1525590269_3084765_156-173713-0 from training. Duration: 0.81 2023-10-02 06:23:08,132 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_217-239796-0_sp1.1 from training. Duration: 0.963625 2023-10-02 06:23:10,476 WARNING [train.py:1204] (2/4) Exclude cut with ID _21878_210_3525_1_1531738772002_7264330_530-329400-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 06:23:11,717 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.547e+02 1.930e+02 2.119e+02 2.410e+02 3.837e+02, threshold=4.239e+02, percent-clipped=0.0 2023-10-02 06:23:13,228 WARNING [train.py:1204] (2/4) Exclude cut with ID _21783_210_11357_1_1531742458730_3762679_150-62920-0_sp1.1 from training. Duration: 0.963625 2023-10-02 06:23:13,228 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_5522_210_3273_1_1527249606_3909096_366-172508-0 from training. Duration: 0.66 2023-10-02 06:23:13,230 WARNING [train.py:1204] (2/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_303-340446-0_sp1.1 from training. Duration: 0.8181875 2023-10-02 06:23:14,583 WARNING [train.py:1204] (2/4) Exclude cut with ID _8699_210_2069_1_1528630346831_3960409_160-24408-0_sp1.1 from training. Duration: 0.82725 2023-10-02 06:23:17,330 WARNING [train.py:1204] (2/4) Exclude cut with ID _41314_210_12651_1_1533459476543_3766460_299-187801-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 06:23:17,367 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_34886_210_10633_1_1533203882_1755661_92-345288-0_sp1.1 from training. Duration: 0.97275 2023-10-02 06:23:21,592 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_5438_210_1794_1_1527336687_2600009_104-288274-0_sp1.1 from training. Duration: 0.52725 2023-10-02 06:23:21,597 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_154-316450-0 from training. Duration: 0.76 2023-10-02 06:23:22,854 WARNING [train.py:1204] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_23-189234-0_sp0.9 from training. Duration: 0.92225 2023-10-02 06:23:26,151 WARNING [train.py:1204] (2/4) Exclude cut with ID _4779_210_2084_1_1527318022944_3914893_346-168820-0_sp1.1 from training. Duration: 0.963625 2023-10-02 06:23:27,489 WARNING [train.py:1204] (2/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_5-55893-0 from training. Duration: 0.55 2023-10-02 06:23:28,919 WARNING [train.py:1204] (2/4) Exclude cut with ID _20455_210_5219_1_1531565996775_8098100_180-24517-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 06:23:33,766 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_3133_210_2087_1_1526106446_2324690_243-143119-0_sp0.9 from training. Duration: 0.8555625 2023-10-02 06:23:37,057 WARNING [train.py:1204] (2/4) Exclude cut with ID _65585_210_12210_1_1535450451584_3764098_293-212045-0_sp1.1 from training. Duration: 0.92725 2023-10-02 06:23:37,061 WARNING [train.py:1204] (2/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_577-332600-0 from training. Duration: 0.57 2023-10-02 06:23:41,205 WARNING [train.py:1204] (2/4) Exclude cut with ID _30039_210_13270_1_1532484612900_2725150_104-289874-0_sp1.1 from training. Duration: 0.963625 2023-10-02 06:23:41,213 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_3172_210_2087_1_1526292929_3987865_197-97762-0_sp1.1 from training. Duration: 0.6818125 2023-10-02 06:23:42,918 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_256-150908-0_sp1.1 from training. Duration: 0.963625 2023-10-02 06:23:44,893 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_296-287678-0_sp1.1 from training. Duration: 0.863625 2023-10-02 06:23:44,920 WARNING [train.py:1204] (2/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_443-30034-0 from training. Duration: 0.64 2023-10-02 06:23:44,947 WARNING [train.py:1204] (2/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_494-199538-0_sp0.9 from training. Duration: 0.8333125 2023-10-02 06:23:46,358 WARNING [train.py:1204] (2/4) Exclude cut with ID _19708_210_9206_1_1532136650733_4238364_61-162325-0_sp1.1 from training. Duration: 0.97275 2023-10-02 06:23:47,720 WARNING [train.py:1204] (2/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_475-127455-0 from training. Duration: 0.7 2023-10-02 06:23:49,178 WARNING [train.py:1204] (2/4) Exclude cut with ID _48677_210_5663_1_1533902405919_3892989_188-11581-0_sp1.1 from training. Duration: 0.963625 2023-10-02 06:23:49,209 WARNING [train.py:1204] (2/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_136-8381-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 06:23:50,631 WARNING [train.py:1204] (2/4) Exclude cut with ID _55328_210_27767_1_1534580973260_4088170_270-229555-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 06:23:50,667 WARNING [train.py:1204] (2/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_372-266951-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 06:23:51,973 WARNING [train.py:1204] (2/4) Exclude cut with ID _47790_210_16187_1_1533862814066_4207519_14-242149-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 06:23:56,767 WARNING [train.py:1204] (2/4) Exclude cut with ID _7877_210_6014_1_1528286358099_3750136_159-76512-0_sp1.1 from training. Duration: 0.936375 2023-10-02 06:23:56,769 WARNING [train.py:1204] (2/4) Exclude cut with ID _8263_210_1664_1_1528438953494_294100_40-204731-0_sp1.1 from training. Duration: 0.4545625 2023-10-02 06:23:58,128 INFO [train.py:1046] (2/4) Epoch 23, batch 350, loss[loss=0.168, simple_loss=0.2449, pruned_loss=0.04551, over 22530.00 frames. ], tot_loss[loss=0.1709, simple_loss=0.2481, pruned_loss=0.04684, over 3901439.83 frames. ], batch size: 49, lr: 4.51e-03, grad_scale: 16.0 2023-10-02 06:23:59,583 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8278_210_2550_1_1528637099_4220413_694-238413-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 06:24:02,089 INFO [scaling.py:1022] (2/4) Whitening: name=encoder_embed.convnext.out_whiten, num_groups=1, num_channels=128, metric=4.69 vs. limit=5.0 2023-10-02 06:24:04,019 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.balancer2.prob, batch_count=781440.0, ans=0.125 2023-10-02 06:24:05,720 WARNING [train.py:1204] (2/4) Exclude cut with ID _47278_210_12041_1_1533902418349_3576110_421-172710-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 06:24:08,937 WARNING [train.py:1204] (2/4) Exclude cut with ID _14970_210_15709_1_1530773543493_1715730_215-282680-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 06:24:10,351 WARNING [train.py:1204] (2/4) Exclude cut with ID _64639_210_5816_1_1535367619437_3528000_306-149449-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 06:24:10,571 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.conv_skip_rate, batch_count=781440.0, ans=0.0 2023-10-02 06:24:13,286 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_28-291982-0 from training. Duration: 0.97 2023-10-02 06:24:15,812 WARNING [train.py:1204] (2/4) Exclude cut with ID _55134_210_21982_1_1534590028377_3882170_412-15870-0_sp1.1 from training. Duration: 0.936375 2023-10-02 06:24:15,845 WARNING [train.py:1204] (2/4) Exclude cut with ID _13684_210_7497_1_1530619354863_3598359_312-235606-0 from training. Duration: 0.6 2023-10-02 06:24:19,293 WARNING [train.py:1204] (2/4) Exclude cut with ID _37112_210_2575_1_1532997007673_7268610_347-238993-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 06:24:19,329 WARNING [train.py:1204] (2/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_121-284152-0 from training. Duration: 0.59 2023-10-02 06:24:20,700 WARNING [train.py:1204] (2/4) Exclude cut with ID _2941_210_2577_1_1526199673150_2489759_228-2961-0_sp1.1 from training. Duration: 0.97275 2023-10-02 06:24:22,283 WARNING [train.py:1204] (2/4) Exclude cut with ID _28323_210_16187_1_1532480395667_4100300_531-24157-0 from training. Duration: 0.98 2023-10-02 06:24:23,632 WARNING [train.py:1204] (2/4) Exclude cut with ID _32704_210_8954_1_1533017488840_6747370_266-329755-0_sp0.9 from training. Duration: 0.988875 2023-10-02 06:24:25,498 WARNING [train.py:1204] (2/4) Exclude cut with ID _6653_210_2629_1_1527919172441_6732679_1038-146421-0_sp1.1 from training. Duration: 0.97275 2023-10-02 06:24:26,791 WARNING [train.py:1204] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_391-22148-0_sp0.9 from training. Duration: 0.8555625 2023-10-02 06:24:26,899 WARNING [train.py:1204] (2/4) Exclude cut with ID _64521_210_14699_1_1535243427259_3815328_543-341117-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 06:24:28,134 WARNING [train.py:1204] (2/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_52-168712-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 06:24:28,184 WARNING [train.py:1204] (2/4) Exclude cut with ID _33920_210_4220_1_1533257851500_3084399_198-334063-0_sp1.1 from training. Duration: 0.8545625 2023-10-02 06:24:28,197 WARNING [train.py:1204] (2/4) Exclude cut with ID _50093_210_4220_1_1534417141207_3432299_69-111987-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 06:24:28,227 WARNING [train.py:1204] (2/4) Exclude cut with ID _14821_210_8750_1_1530680503991_6829359_189-297451-0_sp0.9 from training. Duration: 0.611125 2023-10-02 06:24:29,653 WARNING [train.py:1204] (2/4) Exclude cut with ID _8030_210_7497_1_1528444886781_3531089_262-86750-0_sp0.9 from training. Duration: 0.9 2023-10-02 06:24:29,658 WARNING [train.py:1204] (2/4) Exclude cut with ID _24612_210_8913_1_1531998054193_3806359_294-191196-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 06:24:33,998 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.1.balancer1.prob, batch_count=781573.3333333334, ans=0.125 2023-10-02 06:24:38,296 WARNING [train.py:1204] (2/4) Exclude cut with ID _16909_210_5724_1_1531292079912_4118919_707-342896-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 06:24:38,310 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_9460_210_4211_1_1528954587_10049667_572-37035-0_sp0.9 from training. Duration: 0.888875 2023-10-02 06:24:38,365 WARNING [train.py:1204] (2/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_762-109886-0_sp1.1 from training. Duration: 0.82725 2023-10-02 06:24:38,395 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_14953_210_2151_1_1530701710_5616165_519-326393-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 06:24:45,140 WARNING [train.py:1204] (2/4) Exclude cut with ID _25914_210_6400_1_1532141317058_644740_52-96808-0 from training. Duration: 0.91 2023-10-02 06:24:45,144 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_68095_210_20185_1_1535718226_8888855_1079-94525-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 06:24:46,850 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.0.conv_module2.balancer2.min_positive, batch_count=781640.0, ans=0.05 2023-10-02 06:24:49,527 WARNING [train.py:1204] (2/4) Exclude cut with ID _14860_210_4915_1_1530702020340_7343240_746-3647-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 06:24:49,528 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_70379_210_7925_1_1535941455_557455_49-101720-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 06:24:49,549 WARNING [train.py:1204] (2/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_8-307291-0_sp1.1 from training. Duration: 0.936375 2023-10-02 06:24:50,932 WARNING [train.py:1204] (2/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_117-290916-0 from training. Duration: 0.51 2023-10-02 06:24:52,304 WARNING [train.py:1204] (2/4) Exclude cut with ID _22020_210_2461_1_1531724164513_6326900_944-339840-0_sp1.1 from training. Duration: 0.963625 2023-10-02 06:24:53,649 WARNING [train.py:1204] (2/4) Exclude cut with ID IC0515W0043-254911 from training. Duration: 0.976 2023-10-02 06:24:53,753 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_3910_210_1794_1_1526742306_334530_28-238909-0 from training. Duration: 0.81 2023-10-02 06:24:55,035 WARNING [train.py:1204] (2/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_269-267275-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 06:24:55,319 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.attention_skip_rate, batch_count=781640.0, ans=0.0 2023-10-02 06:24:58,330 WARNING [train.py:1204] (2/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_72-251255-0_sp1.1 from training. Duration: 0.8545625 2023-10-02 06:24:58,344 WARNING [train.py:1204] (2/4) Exclude cut with ID _14886_210_4220_1_1530702020355_3516190_175-229073-0 from training. Duration: 0.45 2023-10-02 06:24:58,614 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.feed_forward2.hidden_balancer.prob, batch_count=781706.6666666666, ans=0.125 2023-10-02 06:25:00,062 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.attention_skip_rate, batch_count=781706.6666666666, ans=0.0 2023-10-02 06:25:01,160 WARNING [train.py:1204] (2/4) Exclude cut with ID _40519_210_13794_1_1533715228655_7337390_921-209452-0_sp1.1 from training. Duration: 0.963625 2023-10-02 06:25:02,677 WARNING [train.py:1204] (2/4) Exclude cut with ID _29857_210_20034_1_1532395886753_3798160_335-163562-0_sp0.9 from training. Duration: 0.9444375 2023-10-02 06:25:02,769 WARNING [train.py:1204] (2/4) Exclude cut with ID _58583_210_5236_1_1534726835266_3574249_384-58445-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 06:25:04,081 WARNING [train.py:1204] (2/4) Exclude cut with ID _44011_210_2046_1_1533722401691_3631310_96-315656-0_sp1.1 from training. Duration: 0.963625 2023-10-02 06:25:04,085 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_68484_210_1794_1_1535849804_7678097_549-170613-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 06:25:07,310 WARNING [train.py:1204] (2/4) Exclude cut with ID _37892_210_19596_1_1533198677897_3964058_214-148058-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 06:25:10,635 WARNING [train.py:1204] (2/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_415-78792-0_sp0.9 from training. Duration: 0.9 2023-10-02 06:25:12,281 WARNING [train.py:1204] (2/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_383-205833-0_sp1.1 from training. Duration: 0.863625 2023-10-02 06:25:12,367 WARNING [train.py:1204] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_167-137255-0 from training. Duration: 0.64 2023-10-02 06:25:12,375 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8275_210_4198_1_1528455320_3892571_484-154786-0_sp1.1 from training. Duration: 0.963625 2023-10-02 06:25:13,647 INFO [train.py:1046] (2/4) Epoch 23, batch 400, loss[loss=0.1708, simple_loss=0.2427, pruned_loss=0.04945, over 23600.00 frames. ], tot_loss[loss=0.1712, simple_loss=0.2476, pruned_loss=0.04741, over 4066555.89 frames. ], batch size: 256, lr: 4.51e-03, grad_scale: 16.0 2023-10-02 06:25:13,724 WARNING [train.py:1204] (2/4) Exclude cut with ID _40284_210_2084_1_1533513679814_3704279_396-319412-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 06:25:15,893 WARNING [train.py:1204] (2/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_280-120942-0_sp0.9 from training. Duration: 0.8555625 2023-10-02 06:25:15,935 WARNING [train.py:1204] (2/4) Exclude cut with ID _45630_210_10556_1_1533949174498_3902049_558-240889-0_sp1.1 from training. Duration: 0.92725 2023-10-02 06:25:17,547 WARNING [train.py:1204] (2/4) Exclude cut with ID _56235_210_10640_1_1534557335235_4161099_197-307458-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 06:25:18,930 WARNING [train.py:1204] (2/4) Exclude cut with ID _16909_210_5724_1_1531292079912_4118919_163-254843-0_sp1.1 from training. Duration: 0.92725 2023-10-02 06:25:20,164 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_103-84010-0 from training. Duration: 0.69 2023-10-02 06:25:21,520 WARNING [train.py:1204] (2/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_401-158239-0 from training. Duration: 0.71 2023-10-02 06:25:21,521 WARNING [train.py:1204] (2/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_899-233658-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 06:25:22,871 WARNING [train.py:1204] (2/4) Exclude cut with ID _23825_210_13449_1_1531915313526_3497150_240-271367-0 from training. Duration: 0.97 2023-10-02 06:25:24,910 WARNING [train.py:1204] (2/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_424-134619-0_sp1.1 from training. Duration: 0.92725 2023-10-02 06:25:26,542 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.balancer2.prob, batch_count=781773.3333333334, ans=0.125 2023-10-02 06:25:29,332 INFO [scaling.py:1118] (2/4) WithLoss: name=encoder.encoders.0.layers.1.self_attn_weights, loss-sum=0.000e+00 2023-10-02 06:25:30,466 WARNING [train.py:1204] (2/4) Exclude cut with ID _44831_210_13427_1_1533816011549_3689035_32-188147-0_sp1.1 from training. Duration: 0.87275 2023-10-02 06:25:30,469 WARNING [train.py:1204] (2/4) Exclude cut with ID _59170_210_6721_1_1534983633960_4203335_314-63879-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 06:25:30,491 WARNING [train.py:1204] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_602-275260-0 from training. Duration: 0.82 2023-10-02 06:25:30,518 WARNING [train.py:1204] (2/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_511-306767-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 06:25:30,558 WARNING [train.py:1204] (2/4) Exclude cut with ID _41817_210_18835_1_1533434477160_7423049_565-201775-0_sp1.1 from training. Duration: 0.92725 2023-10-02 06:25:31,810 WARNING [train.py:1204] (2/4) Exclude cut with ID _28317_210_12742_1_1532480302381_8510325_778-173151-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 06:25:31,861 WARNING [train.py:1204] (2/4) Exclude cut with ID _14998_210_5219_1_1530705582080_7474970_680-51001-0_sp1.1 from training. Duration: 0.963625 2023-10-02 06:25:32,098 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.bypass_mid.scale_min, batch_count=781840.0, ans=0.2 2023-10-02 06:25:34,627 WARNING [train.py:1204] (2/4) Exclude cut with ID IC0677W0059-334891 from training. Duration: 0.896 2023-10-02 06:25:35,959 WARNING [train.py:1204] (2/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_370-331600-0 from training. Duration: 0.94 2023-10-02 06:25:41,002 WARNING [train.py:1204] (2/4) Exclude cut with ID _66840_210_5219_1_1535452168426_4196349_446-240192-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 06:25:42,405 WARNING [train.py:1204] (2/4) Exclude cut with ID _65242_210_18628_1_1535369393717_3823149_38-6732-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 06:25:42,464 WARNING [train.py:1204] (2/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_302-2550-0 from training. Duration: 0.81 2023-10-02 06:25:43,896 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_43802_210_2290_1_1533516203_8829504_100-348441-0 from training. Duration: 0.91 2023-10-02 06:25:45,155 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.527e+02 1.817e+02 2.027e+02 2.516e+02 3.767e+02, threshold=4.054e+02, percent-clipped=0.0 2023-10-02 06:25:45,341 WARNING [train.py:1204] (2/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_329-285801-0_sp1.1 from training. Duration: 0.82725 2023-10-02 06:25:47,393 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_30167_210_3528_1_1532428012_4772375_343-334712-0_sp1.1 from training. Duration: 0.936375 2023-10-02 06:25:50,352 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.ff2_skip_rate, batch_count=781906.6666666666, ans=0.0 2023-10-02 06:25:50,628 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.4.encoder.layers.2.conv_module2.whiten, num_groups=1, num_channels=384, metric=4.01 vs. limit=15.0 2023-10-02 06:25:52,949 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.conv_module1.balancer1.prob, batch_count=781906.6666666666, ans=0.125 2023-10-02 06:25:54,062 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7543_210_6753_1_1528543318_4552_1-85072-0 from training. Duration: 0.83 2023-10-02 06:25:59,167 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7002_210_1794_1_1527935634_4034444_487-236501-0_sp1.1 from training. Duration: 0.6 2023-10-02 06:26:00,683 WARNING [train.py:1204] (2/4) Exclude cut with ID _7160_210_1876_1_1527940923231_3703799_282-322611-0 from training. Duration: 0.87 2023-10-02 06:26:02,196 WARNING [train.py:1204] (2/4) Exclude cut with ID _67335_210_5219_1_1535525975652_4275229_570-262983-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 06:26:03,545 WARNING [train.py:1204] (2/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_625-338546-0_sp1.1 from training. Duration: 0.8 2023-10-02 06:26:04,866 WARNING [train.py:1204] (2/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_176-56120-0 from training. Duration: 0.83 2023-10-02 06:26:08,282 WARNING [train.py:1204] (2/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_963-187109-0_sp1.1 from training. Duration: 0.9 2023-10-02 06:26:10,882 WARNING [train.py:1204] (2/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_121-284152-0_sp0.9 from training. Duration: 0.6555625 2023-10-02 06:26:10,986 WARNING [train.py:1204] (2/4) Exclude cut with ID _45021_210_9994_1_1533619751094_3330288_104-81711-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 06:26:14,444 WARNING [train.py:1204] (2/4) Exclude cut with ID _51382_210_23862_1_1534492801838_3650104_129-343914-0_sp1.1 from training. Duration: 0.936375 2023-10-02 06:26:15,739 WARNING [train.py:1204] (2/4) Exclude cut with ID _14970_210_15709_1_1530775547361_1966965_8-169924-0 from training. Duration: 0.92 2023-10-02 06:26:17,180 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_2295_210_2087_1_1525500993_2407863_234-83104-0_sp1.1 from training. Duration: 0.563625 2023-10-02 06:26:17,247 WARNING [train.py:1204] (2/4) Exclude cut with ID _14687_210_5854_1_1530703838259_3664889_115-239046-0 from training. Duration: 0.67 2023-10-02 06:26:20,321 WARNING [train.py:1204] (2/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_690-126306-0_sp1.1 from training. Duration: 0.7090625 2023-10-02 06:26:20,334 WARNING [train.py:1204] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_625-102955-0_sp0.9 from training. Duration: 0.8555625 2023-10-02 06:26:23,056 WARNING [train.py:1204] (2/4) Exclude cut with ID _9389_210_1664_1_1529217337301_5289269_289-213417-0 from training. Duration: 0.89 2023-10-02 06:26:23,194 WARNING [train.py:1204] (2/4) Exclude cut with ID _13253_210_1925_1_1530757775819_3577873_191-144977-0_sp0.9 from training. Duration: 0.8666875 2023-10-02 06:26:23,315 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.1.conv_module1.balancer1.prob, batch_count=782040.0, ans=0.125 2023-10-02 06:26:23,638 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.5.encoder.layers.0.nonlin_attention.whiten1, num_groups=1, num_channels=192, metric=3.85 vs. limit=10.0 2023-10-02 06:26:24,639 WARNING [train.py:1204] (2/4) Exclude cut with ID _21783_210_11357_1_1531742458730_3762679_325-155703-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 06:26:24,681 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_2295_210_2087_1_1525500993_2407863_116-238894-0_sp0.9 from training. Duration: 0.77775 2023-10-02 06:26:26,227 WARNING [train.py:1204] (2/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_94-126128-0 from training. Duration: 0.86 2023-10-02 06:26:26,243 WARNING [train.py:1204] (2/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_395-310220-0_sp0.9 from training. Duration: 0.988875 2023-10-02 06:26:27,617 WARNING [train.py:1204] (2/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_821-110852-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 06:26:27,682 WARNING [train.py:1204] (2/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_64-248359-0_sp1.1 from training. Duration: 0.62725 2023-10-02 06:26:27,692 WARNING [train.py:1204] (2/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_138-187031-0 from training. Duration: 0.99 2023-10-02 06:26:29,557 INFO [train.py:1046] (2/4) Epoch 23, batch 450, loss[loss=0.1768, simple_loss=0.2475, pruned_loss=0.05305, over 22706.00 frames. ], tot_loss[loss=0.1714, simple_loss=0.2477, pruned_loss=0.04751, over 4218243.17 frames. ], batch size: 322, lr: 4.51e-03, grad_scale: 8.0 2023-10-02 06:26:29,622 WARNING [train.py:1204] (2/4) Exclude cut with ID _4122_210_2084_1_1526713164417_4353280_528-206771-0_sp0.9 from training. Duration: 0.97775 2023-10-02 06:26:31,031 WARNING [train.py:1204] (2/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_844-96989-0_sp1.1 from training. Duration: 0.6454375 2023-10-02 06:26:32,433 WARNING [train.py:1204] (2/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_106-30782-0_sp1.1 from training. Duration: 0.72725 2023-10-02 06:26:35,432 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.self_attn_weights.pos_emb_skip_rate, batch_count=782106.6666666666, ans=0.0 2023-10-02 06:26:42,571 WARNING [train.py:1204] (2/4) Exclude cut with ID _14683_210_5219_1_1530698352608_3974859_281-250040-0_sp1.1 from training. Duration: 0.936375 2023-10-02 06:26:42,605 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_46013_210_7234_1_1533776362_3744032_463-52378-0_sp0.9 from training. Duration: 0.9555625 2023-10-02 06:26:46,089 WARNING [train.py:1204] (2/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_299-34413-0 from training. Duration: 0.65 2023-10-02 06:26:47,469 WARNING [train.py:1204] (2/4) Exclude cut with ID _35799_210_5854_1_1533263487887_3529071_490-326847-0 from training. Duration: 0.99 2023-10-02 06:26:49,359 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.5.encoder.layers.0.conv_module1.whiten, num_groups=1, num_channels=256, metric=7.73 vs. limit=15.0 2023-10-02 06:26:50,318 WARNING [train.py:1204] (2/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_589-9070-0_sp0.9 from training. Duration: 0.788875 2023-10-02 06:26:53,678 WARNING [train.py:1204] (2/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_884-29515-0_sp1.1 from training. Duration: 0.936375 2023-10-02 06:26:56,529 WARNING [train.py:1204] (2/4) Exclude cut with ID _15012_210_1968_1_1530856820429_5095350_474-119995-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 06:26:58,097 WARNING [train.py:1204] (2/4) Exclude cut with ID _65585_210_12210_1_1535450451584_3764098_211-97124-0_sp1.1 from training. Duration: 0.963625 2023-10-02 06:26:59,439 WARNING [train.py:1204] (2/4) Exclude cut with ID _33640_210_2655_1_1533117605479_4855469_666-224463-0_sp1.1 from training. Duration: 0.963625 2023-10-02 06:27:00,820 WARNING [train.py:1204] (2/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_35-153886-0 from training. Duration: 0.84 2023-10-02 06:27:02,172 WARNING [train.py:1204] (2/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_210-106047-0 from training. Duration: 0.97 2023-10-02 06:27:04,254 WARNING [train.py:1204] (2/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_479-125611-0 from training. Duration: 0.96 2023-10-02 06:27:04,274 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_9417_210_1794_1_1529235261_4614131_427-153963-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 06:27:05,629 WARNING [train.py:1204] (2/4) Exclude cut with ID _23818_210_6228_1_1531917063884_3676920_133-170571-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 06:27:05,694 WARNING [train.py:1204] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_602-275260-0_sp1.1 from training. Duration: 0.7454375 2023-10-02 06:27:08,428 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9029W0121-509521 from training. Duration: 0.896 2023-10-02 06:27:08,436 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9029W0434-509821 from training. Duration: 0.64 2023-10-02 06:27:10,376 WARNING [train.py:1204] (2/4) Exclude cut with ID _23038_210_9570_1_1532001584158_3652419_289-307034-0_sp1.1 from training. Duration: 0.936375 2023-10-02 06:27:11,563 WARNING [train.py:1204] (2/4) Exclude cut with ID _29345_210_4381_1_1532689168263_3860169_149-257268-0_sp0.9 from training. Duration: 0.9 2023-10-02 06:27:12,946 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_405-339986-0_sp1.1 from training. Duration: 0.463625 2023-10-02 06:27:15,827 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_2655_210_2087_1_1525688959_3897400_437-337690-0_sp1.1 from training. Duration: 0.57275 2023-10-02 06:27:15,852 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7543_210_6753_1_1528541907_1399322_165-323619-0_sp1.1 from training. Duration: 0.62725 2023-10-02 06:27:15,918 WARNING [train.py:1204] (2/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_508-335369-0_sp1.1 from training. Duration: 0.436375 2023-10-02 06:27:17,783 WARNING [train.py:1204] (2/4) Exclude cut with ID _7852_210_5219_1_1528286471316_8139400_358-287492-0 from training. Duration: 0.96 2023-10-02 06:27:20,444 WARNING [train.py:1204] (2/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_322-217220-0_sp0.9 from training. Duration: 0.9555625 2023-10-02 06:27:23,759 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_3171_210_2087_1_1526109084_2330007_66-260583-0_sp0.9 from training. Duration: 0.8 2023-10-02 06:27:23,787 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8558_210_1410_1_1528505225_7613406_131-329524-0_sp1.1 from training. Duration: 0.7181875 2023-10-02 06:27:25,141 WARNING [train.py:1204] (2/4) Exclude cut with ID _9235_210_5219_1_1528891154253_8046120_30-326681-0 from training. Duration: 0.83 2023-10-02 06:27:26,795 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.conv_skip_rate, batch_count=782306.6666666666, ans=0.0 2023-10-02 06:27:26,807 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.self_attn_weights.pos_emb_skip_rate, batch_count=782306.6666666666, ans=0.0 2023-10-02 06:27:27,991 WARNING [train.py:1204] (2/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_58-68956-0_sp0.9 from training. Duration: 0.92225 2023-10-02 06:27:28,051 WARNING [train.py:1204] (2/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_255-312654-0 from training. Duration: 0.91 2023-10-02 06:27:29,427 WARNING [train.py:1204] (2/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_99-167392-0 from training. Duration: 0.61 2023-10-02 06:27:30,826 WARNING [train.py:1204] (2/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_94-126128-0_sp0.9 from training. Duration: 0.9555625 2023-10-02 06:27:35,152 WARNING [train.py:1204] (2/4) Exclude cut with ID _68635_210_9317_1_1535770813410_4315870_187-192817-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 06:27:37,365 WARNING [train.py:1204] (2/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_277-204970-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 06:27:37,603 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.conv_module2.balancer2.prob, batch_count=782373.3333333334, ans=0.125 2023-10-02 06:27:38,807 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6163_210_6400_1_1527506746_4263068_283-137811-0_sp0.9 from training. Duration: 0.8555625 2023-10-02 06:27:38,834 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9144W0121-566446 from training. Duration: 0.896 2023-10-02 06:27:43,296 WARNING [train.py:1204] (2/4) Exclude cut with ID _49371_210_12071_1_1533952761021_3812190_351-315519-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 06:27:43,374 WARNING [train.py:1204] (2/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_115-326778-0_sp1.1 from training. Duration: 0.8454375 2023-10-02 06:27:44,764 INFO [train.py:1046] (2/4) Epoch 23, batch 500, loss[loss=0.1655, simple_loss=0.2401, pruned_loss=0.04548, over 18344.00 frames. ], tot_loss[loss=0.1719, simple_loss=0.248, pruned_loss=0.04786, over 4329783.25 frames. ], batch size: 39, lr: 4.50e-03, grad_scale: 8.0 2023-10-02 06:27:44,835 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_17292_210_6753_1_1531125388_2943734_436-198965-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 06:27:44,844 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9165W0117-576751 from training. Duration: 0.896 2023-10-02 06:27:46,246 WARNING [train.py:1204] (2/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_212-149947-0 from training. Duration: 0.73 2023-10-02 06:27:46,254 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_13757_210_3528_1_1530440682_4107239_65-167544-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 06:27:46,406 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.0.conv_skip_rate, batch_count=782440.0, ans=0.0 2023-10-02 06:27:50,747 WARNING [train.py:1204] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_700-183549-0_sp0.9 from training. Duration: 0.7555625 2023-10-02 06:27:54,384 WARNING [train.py:1204] (2/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_216-343007-0_sp0.9 from training. Duration: 0.7333125 2023-10-02 06:27:57,167 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7544_210_6753_1_1528587722_3576686_295-258479-0_sp1.1 from training. Duration: 0.763625 2023-10-02 06:27:58,584 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_365-287106-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 06:27:58,596 WARNING [train.py:1204] (2/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_300-3747-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 06:27:59,899 WARNING [train.py:1204] (2/4) Exclude cut with ID _14069_210_5632_1_1530512692033_4803390_487-303201-0_sp1.1 from training. Duration: 0.92725 2023-10-02 06:28:09,108 WARNING [train.py:1204] (2/4) Exclude cut with ID _14459_210_2228_1_1531443641946_7516700_668-123026-0_sp1.1 from training. Duration: 0.936375 2023-10-02 06:28:09,136 WARNING [train.py:1204] (2/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_108-53929-0_sp1.1 from training. Duration: 0.536375 2023-10-02 06:28:10,404 WARNING [train.py:1204] (2/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_266-137104-0_sp0.9 from training. Duration: 0.72225 2023-10-02 06:28:10,448 WARNING [train.py:1204] (2/4) Exclude cut with ID _12302_210_5219_1_1529924446466_8107280_902-168619-0_sp1.1 from training. Duration: 0.936375 2023-10-02 06:28:10,474 WARNING [train.py:1204] (2/4) Exclude cut with ID _59502_210_12210_1_1535107024698_1835139_146-162495-0 from training. Duration: 0.92 2023-10-02 06:28:11,711 WARNING [train.py:1204] (2/4) Exclude cut with ID _3465_210_3525_1_1526175667060_3574049_412-287526-0_sp1.1 from training. Duration: 0.6545625 2023-10-02 06:28:14,510 WARNING [train.py:1204] (2/4) Exclude cut with ID _7160_210_1876_1_1527940923231_3703799_120-150943-0_sp0.9 from training. Duration: 0.9 2023-10-02 06:28:14,572 WARNING [train.py:1204] (2/4) Exclude cut with ID _15037_210_2253_1_1530752294104_3267650_296-192597-0_sp1.1 from training. Duration: 0.736375 2023-10-02 06:28:14,590 WARNING [train.py:1204] (2/4) Exclude cut with ID _13252_210_1925_1_1530671372047_3336909_397-216497-0_sp1.1 from training. Duration: 0.87275 2023-10-02 06:28:15,793 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.499e+02 1.861e+02 2.114e+02 2.319e+02 3.215e+02, threshold=4.229e+02, percent-clipped=0.0 2023-10-02 06:28:15,871 WARNING [train.py:1204] (2/4) Exclude cut with ID _40094_210_16380_1_1533294005773_3641159_76-269320-0_sp1.1 from training. Duration: 0.936375 2023-10-02 06:28:15,933 WARNING [train.py:1204] (2/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_449-15508-0 from training. Duration: 0.82 2023-10-02 06:28:20,584 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9309W0194-640321 from training. Duration: 0.64 2023-10-02 06:28:23,864 WARNING [train.py:1204] (2/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_284-312178-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 06:28:23,970 WARNING [train.py:1204] (2/4) Exclude cut with ID _29853_210_7497_1_1532505600851_6642110_401-55937-0_sp1.1 from training. Duration: 0.92725 2023-10-02 06:28:25,345 WARNING [train.py:1204] (2/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_716-24918-0_sp1.1 from training. Duration: 0.92725 2023-10-02 06:28:25,378 WARNING [train.py:1204] (2/4) Exclude cut with ID _19637_210_4220_1_1531537167676_3205060_314-305638-0_sp1.1 from training. Duration: 0.92725 2023-10-02 06:28:27,250 WARNING [train.py:1204] (2/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_475-127455-0_sp1.1 from training. Duration: 0.636375 2023-10-02 06:28:28,720 WARNING [train.py:1204] (2/4) Exclude cut with ID _5444_210_2151_1_1527418808464_5312210_581-315411-0 from training. Duration: 0.62 2023-10-02 06:28:32,669 WARNING [train.py:1204] (2/4) Exclude cut with ID _44902_210_16187_1_1533888006062_3792078_149-67212-0_sp0.9 from training. Duration: 0.9666875 2023-10-02 06:28:32,744 WARNING [train.py:1204] (2/4) Exclude cut with ID _31602_210_5854_1_1532677021456_4458250_63-63253-0_sp1.1 from training. Duration: 0.963625 2023-10-02 06:28:37,009 WARNING [train.py:1204] (2/4) Exclude cut with ID _55209_210_1531_1_1534675828996_4597160_660-203992-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 06:28:40,352 WARNING [train.py:1204] (2/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_83-190624-0_sp1.1 from training. Duration: 0.92725 2023-10-02 06:28:44,589 WARNING [train.py:1204] (2/4) Exclude cut with ID _58605_210_12210_1_1534723195939_3904259_154-139635-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 06:28:44,760 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.ff3_skip_rate, batch_count=782706.6666666666, ans=0.0 2023-10-02 06:28:47,322 WARNING [train.py:1204] (2/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_164-224052-0 from training. Duration: 0.7 2023-10-02 06:28:47,332 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_1126-189071-0_sp1.1 from training. Duration: 0.963625 2023-10-02 06:28:48,606 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_15396_210_6890_1_1531308473_1938936_310-214789-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 06:28:50,721 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.balancer2.prob, batch_count=782706.6666666666, ans=0.125 2023-10-02 06:28:51,804 WARNING [train.py:1204] (2/4) Exclude cut with ID _8395_210_1531_1_1528597871523_4411579_431-132470-0 from training. Duration: 0.74 2023-10-02 06:28:53,122 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_3780_210_2087_1_1526467760_2130483_170-259044-0_sp0.9 from training. Duration: 0.77775 2023-10-02 06:28:55,072 WARNING [train.py:1204] (2/4) Exclude cut with ID _12733_210_8341_1_1530167424556_3646380_537-230640-0_sp1.1 from training. Duration: 0.963625 2023-10-02 06:28:59,824 INFO [train.py:1046] (2/4) Epoch 23, batch 550, loss[loss=0.1626, simple_loss=0.249, pruned_loss=0.03807, over 24416.00 frames. ], tot_loss[loss=0.1721, simple_loss=0.2485, pruned_loss=0.04783, over 4427112.63 frames. ], batch size: 69, lr: 4.50e-03, grad_scale: 8.0 2023-10-02 06:28:59,945 WARNING [train.py:1204] (2/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_159-281310-0 from training. Duration: 0.95 2023-10-02 06:29:03,969 WARNING [train.py:1204] (2/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_434-83000-0 from training. Duration: 0.98 2023-10-02 06:29:04,000 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_4405_210_1849_1_1526732205_3593939_493-32464-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 06:29:04,216 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.conv_module1.balancer2.prob, batch_count=782773.3333333334, ans=0.125 2023-10-02 06:29:05,289 WARNING [train.py:1204] (2/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_342-100157-0 from training. Duration: 0.54 2023-10-02 06:29:05,336 WARNING [train.py:1204] (2/4) Exclude cut with ID _44778_210_2054_1_1533798127203_7443090_765-250133-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 06:29:05,343 WARNING [train.py:1204] (2/4) Exclude cut with ID _38670_210_15379_1_1533448810659_4391059_81-312245-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 06:29:05,424 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_50050_210_16223_1_1535435796_7290138_1273-247611-0_sp1.1 from training. Duration: 0.936375 2023-10-02 06:29:05,485 WARNING [train.py:1204] (2/4) Exclude cut with ID _44831_210_13427_1_1533816011549_3689035_399-18591-0_sp1.1 from training. Duration: 0.936375 2023-10-02 06:29:05,503 WARNING [train.py:1204] (2/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_557-324258-0_sp0.9 from training. Duration: 0.9 2023-10-02 06:29:06,953 WARNING [train.py:1204] (2/4) Exclude cut with ID _3465_210_3525_1_1526175667060_3574049_62-335552-0_sp1.1 from training. Duration: 0.7909375 2023-10-02 06:29:10,963 WARNING [train.py:1204] (2/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_673-264125-0_sp1.1 from training. Duration: 0.963625 2023-10-02 06:29:12,904 WARNING [train.py:1204] (2/4) Exclude cut with ID _8030_210_7497_1_1528444886781_3531089_262-86750-0 from training. Duration: 0.81 2023-10-02 06:29:12,925 WARNING [train.py:1204] (2/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_444-28041-0_sp1.1 from training. Duration: 0.77275 2023-10-02 06:29:16,022 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_34008_210_10431_1_1533345424_8183247_516-14858-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 06:29:16,041 WARNING [train.py:1204] (2/4) Exclude cut with ID _32704_210_8954_1_1533017488840_6747370_54-248218-0_sp1.1 from training. Duration: 0.936375 2023-10-02 06:29:18,850 WARNING [train.py:1204] (2/4) Exclude cut with ID _13253_210_1925_1_1530757775819_3577873_300-119782-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 06:29:20,278 WARNING [train.py:1204] (2/4) Exclude cut with ID _39290_210_4220_1_1533862778760_3075118_21-240703-0_sp1.1 from training. Duration: 0.936375 2023-10-02 06:29:20,588 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.conv_skip_rate, batch_count=782840.0, ans=0.0 2023-10-02 06:29:23,710 WARNING [train.py:1204] (2/4) Exclude cut with ID 774-127930-0014-10412-0_sp1.1 from training. Duration: 0.95 2023-10-02 06:29:25,013 WARNING [train.py:1204] (2/4) Exclude cut with ID _67499_210_17282_1_1535509805220_3692430_20-179346-0 from training. Duration: 0.97 2023-10-02 06:29:26,442 WARNING [train.py:1204] (2/4) Exclude cut with ID _8365_210_2064_1_1528592311070_4677690_431-294102-0_sp0.9 from training. Duration: 0.988875 2023-10-02 06:29:33,055 WARNING [train.py:1204] (2/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_365-1492-0_sp1.1 from training. Duration: 0.82725 2023-10-02 06:29:34,353 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_105-114797-0_sp0.9 from training. Duration: 0.9333125 2023-10-02 06:29:35,888 WARNING [train.py:1204] (2/4) Exclude cut with ID _6274_210_2084_1_1527937583837_7494020_769-168302-0_sp0.9 from training. Duration: 0.788875 2023-10-02 06:29:37,233 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_40340_210_9024_1_1533433916_4132944_320-270277-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 06:29:37,238 WARNING [train.py:1204] (2/4) Exclude cut with ID ID0203W0003-781711 from training. Duration: 0.958 2023-10-02 06:29:37,516 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.nonlin_attention.balancer.prob, batch_count=782906.6666666666, ans=0.125 2023-10-02 06:29:38,678 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_30434_210_13049_1_1532519384_981746_52-12932-0_sp1.1 from training. Duration: 0.936375 2023-10-02 06:29:40,064 WARNING [train.py:1204] (2/4) Exclude cut with ID _8263_210_1664_1_1528438953494_294100_40-204731-0_sp0.9 from training. Duration: 0.5555625 2023-10-02 06:29:41,393 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1032-236863-0_sp0.9 from training. Duration: 0.9333125 2023-10-02 06:29:43,262 WARNING [train.py:1204] (2/4) Exclude cut with ID _8394_210_6702_1_1528590504316_3540189_335-274780-0_sp0.9 from training. Duration: 0.8666875 2023-10-02 06:29:43,270 WARNING [train.py:1204] (2/4) Exclude cut with ID _66873_210_5517_1_1535454136506_4293370_654-52713-0_sp1.1 from training. Duration: 0.7 2023-10-02 06:29:43,349 WARNING [train.py:1204] (2/4) Exclude cut with ID _5250_210_5219_1_1527249561806_8692110_742-267856-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 06:29:44,757 WARNING [train.py:1204] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_74-48717-0 from training. Duration: 0.58 2023-10-02 06:29:44,839 WARNING [train.py:1204] (2/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_18-46756-0 from training. Duration: 0.79 2023-10-02 06:29:46,197 WARNING [train.py:1204] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_612-56061-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 06:29:46,205 WARNING [train.py:1204] (2/4) Exclude cut with ID _56423_210_5854_1_1534896061049_3165969_412-2403-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 06:29:46,264 WARNING [train.py:1204] (2/4) Exclude cut with ID _62574_210_2655_1_1535180407058_3933249_120-196848-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 06:29:47,524 WARNING [train.py:1204] (2/4) Exclude cut with ID _40519_210_13794_1_1533715228655_7337390_916-51000-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 06:29:47,711 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.0.ff3_skip_rate, batch_count=782973.3333333334, ans=0.0 2023-10-02 06:29:48,239 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.2.encoder.layers.2.self_attn_weights.whiten_keys, num_groups=4, num_channels=128, metric=4.47 vs. limit=6.0 2023-10-02 06:29:50,354 WARNING [train.py:1204] (2/4) Exclude cut with ID _65263_210_22356_1_1535763598691_3921037_108-21902-0_sp1.1 from training. Duration: 0.9 2023-10-02 06:29:51,794 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.1.feed_forward1.out_proj.dropout_p, batch_count=782973.3333333334, ans=0.1 2023-10-02 06:29:53,468 WARNING [train.py:1204] (2/4) Exclude cut with ID _4122_210_2084_1_1526713164417_4353280_528-206771-0_sp1.1 from training. Duration: 0.8 2023-10-02 06:29:54,951 WARNING [train.py:1204] (2/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_43-325657-0_sp1.1 from training. Duration: 0.92725 2023-10-02 06:29:55,014 WARNING [train.py:1204] (2/4) Exclude cut with ID _44659_210_2721_1_1534036120061_6614860_314-82041-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 06:29:56,949 WARNING [train.py:1204] (2/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_81-300212-0_sp0.9 from training. Duration: 0.6666875 2023-10-02 06:29:57,218 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.conv_module2.balancer1.max_abs, batch_count=782973.3333333334, ans=10.0 2023-10-02 06:29:58,265 WARNING [train.py:1204] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_665-196155-0_sp0.9 from training. Duration: 0.7444375 2023-10-02 06:29:59,662 WARNING [train.py:1204] (2/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_140-336881-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 06:30:00,935 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6290_210_3273_1_1527595224_1282787_115-87095-0_sp0.9 from training. Duration: 0.611125 2023-10-02 06:30:02,783 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_53373_210_5399_1_1534229471_4715695_607-259685-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 06:30:04,151 WARNING [train.py:1204] (2/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_175-163894-0_sp1.1 from training. Duration: 0.67275 2023-10-02 06:30:04,167 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7002_210_1794_1_1527939683_5157286_1-281264-0_sp1.1 from training. Duration: 0.4 2023-10-02 06:30:05,859 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.conv_module1.balancer2.prob, batch_count=783040.0, ans=0.125 2023-10-02 06:30:11,231 WARNING [train.py:1204] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_605-257205-0 from training. Duration: 0.59 2023-10-02 06:30:13,932 INFO [train.py:1046] (2/4) Epoch 23, batch 600, loss[loss=0.1716, simple_loss=0.2423, pruned_loss=0.05041, over 23612.00 frames. ], tot_loss[loss=0.1729, simple_loss=0.2495, pruned_loss=0.04812, over 4494331.90 frames. ], batch size: 134, lr: 4.50e-03, grad_scale: 8.0 2023-10-02 06:30:14,055 WARNING [train.py:1204] (2/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_454-314901-0 from training. Duration: 0.46 2023-10-02 06:30:15,420 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_46032_210_14656_1_1533781412_4507969_287-130422-0_sp1.1 from training. Duration: 0.97275 2023-10-02 06:30:15,442 WARNING [train.py:1204] (2/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_186-309161-0_sp1.1 from training. Duration: 0.4909375 2023-10-02 06:30:17,284 WARNING [train.py:1204] (2/4) Exclude cut with ID _42659_210_2070_1_1533607442765_3120380_45-342307-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 06:30:24,226 WARNING [train.py:1204] (2/4) Exclude cut with ID _12555_210_8750_1_1530075594402_3780239_478-326818-0_sp1.1 from training. Duration: 0.963625 2023-10-02 06:30:25,638 WARNING [train.py:1204] (2/4) Exclude cut with ID _7606_210_4915_1_1528894253277_5080690_127-177761-0_sp1.1 from training. Duration: 0.5818125 2023-10-02 06:30:27,730 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6788_210_1388_1_1527898698_4562021_91-197981-0 from training. Duration: 0.83 2023-10-02 06:30:30,448 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8558_210_1410_1_1528505225_7613406_131-329524-0_sp0.9 from training. Duration: 0.87775 2023-10-02 06:30:31,901 WARNING [train.py:1204] (2/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_153-153537-0_sp1.1 from training. Duration: 0.82725 2023-10-02 06:30:32,460 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.4.encoder.layers.1.conv_module1.whiten, num_groups=1, num_channels=384, metric=4.95 vs. limit=15.0 2023-10-02 06:30:33,284 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_17292_210_6753_1_1531125388_2943734_285-349105-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 06:30:35,311 WARNING [train.py:1204] (2/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_37-41919-0 from training. Duration: 0.87 2023-10-02 06:30:35,330 WARNING [train.py:1204] (2/4) Exclude cut with ID _60860_210_8254_1_1534896015875_3557169_172-294852-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 06:30:38,309 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.0.ff3_skip_rate, batch_count=783173.3333333334, ans=0.0 2023-10-02 06:30:40,409 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.0.layers.0.conv_module2.whiten, num_groups=1, num_channels=192, metric=11.00 vs. limit=15.0 2023-10-02 06:30:40,980 WARNING [train.py:1204] (2/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_146-276944-0 from training. Duration: 0.83 2023-10-02 06:30:44,840 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.563e+02 1.803e+02 1.978e+02 2.195e+02 2.831e+02, threshold=3.957e+02, percent-clipped=0.0 2023-10-02 06:30:44,947 WARNING [train.py:1204] (2/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_1254-44082-0_sp1.1 from training. Duration: 0.936375 2023-10-02 06:30:44,958 WARNING [train.py:1204] (2/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_1115-180350-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 06:30:44,979 WARNING [train.py:1204] (2/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_60-146189-0_sp1.1 from training. Duration: 0.8909375 2023-10-02 06:30:45,283 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.feed_forward3.hidden_balancer.prob, batch_count=783240.0, ans=0.125 2023-10-02 06:30:51,223 WARNING [train.py:1204] (2/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_517-153649-0_sp1.1 from training. Duration: 0.92725 2023-10-02 06:30:51,241 WARNING [train.py:1204] (2/4) Exclude cut with ID _39571_210_13449_1_1533801584143_3651077_37-266257-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 06:30:52,513 WARNING [train.py:1204] (2/4) Exclude cut with ID _6653_210_2629_1_1527919172441_6732679_527-276118-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 06:30:59,174 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7696_210_2040_1_1528207167_3716099_117-125434-0_sp1.1 from training. Duration: 0.6909375 2023-10-02 06:30:59,346 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.1.feed_forward1.hidden_balancer.prob, batch_count=783306.6666666666, ans=0.125 2023-10-02 06:31:05,055 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_25767_210_2161_1_1532307483_3867748_478-156360-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 06:31:05,066 WARNING [train.py:1204] (2/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_177-49092-0_sp1.1 from training. Duration: 0.82725 2023-10-02 06:31:05,074 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_11305_210_1794_1_1529750428_7178148_680-317073-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 06:31:12,127 WARNING [train.py:1204] (2/4) Exclude cut with ID _53565_210_10635_1_1534337218335_3820259_287-18120-0 from training. Duration: 0.97 2023-10-02 06:31:16,326 WARNING [train.py:1204] (2/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_498-40153-0_sp1.1 from training. Duration: 0.57275 2023-10-02 06:31:16,344 WARNING [train.py:1204] (2/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_555-63066-0_sp1.1 from training. Duration: 0.963625 2023-10-02 06:31:21,103 WARNING [train.py:1204] (2/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_395-310220-0 from training. Duration: 0.89 2023-10-02 06:31:21,195 WARNING [train.py:1204] (2/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_937-208281-0_sp1.1 from training. Duration: 0.77275 2023-10-02 06:31:23,950 WARNING [train.py:1204] (2/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_120-211630-0 from training. Duration: 0.92 2023-10-02 06:31:25,363 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_454-276429-0_sp1.1 from training. Duration: 0.7909375 2023-10-02 06:31:25,403 WARNING [train.py:1204] (2/4) Exclude cut with ID _22826_210_9994_1_1531908201522_3279169_404-75889-0_sp1.1 from training. Duration: 0.8090625 2023-10-02 06:31:28,447 INFO [train.py:1046] (2/4) Epoch 23, batch 650, loss[loss=0.1792, simple_loss=0.2424, pruned_loss=0.05796, over 23716.00 frames. ], tot_loss[loss=0.1723, simple_loss=0.2487, pruned_loss=0.048, over 4534496.60 frames. ], batch size: 212, lr: 4.50e-03, grad_scale: 8.0 2023-10-02 06:31:29,963 WARNING [train.py:1204] (2/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_282-136641-0_sp0.9 from training. Duration: 0.4666875 2023-10-02 06:31:31,749 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_809-17156-0_sp1.1 from training. Duration: 0.663625 2023-10-02 06:31:33,720 WARNING [train.py:1204] (2/4) Exclude cut with ID _9235_210_5219_1_1528891154253_8046120_1003-101638-0_sp0.9 from training. Duration: 0.811125 2023-10-02 06:31:35,104 WARNING [train.py:1204] (2/4) Exclude cut with ID _22826_210_9994_1_1531908201522_3279169_404-75889-0_sp0.9 from training. Duration: 0.988875 2023-10-02 06:31:35,376 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.conv_module2.balancer2.prob, batch_count=783440.0, ans=0.125 2023-10-02 06:31:36,457 WARNING [train.py:1204] (2/4) Exclude cut with ID _59288_210_5854_1_1535077757774_3412246_570-295255-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 06:31:40,629 WARNING [train.py:1204] (2/4) Exclude cut with ID _3465_210_3525_1_1526175667060_3574049_412-287526-0 from training. Duration: 0.72 2023-10-02 06:31:40,692 WARNING [train.py:1204] (2/4) Exclude cut with ID _44010_210_4525_1_1533884262964_4013620_649-79655-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 06:31:44,872 WARNING [train.py:1204] (2/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_57-270037-0_sp0.9 from training. Duration: 0.8555625 2023-10-02 06:31:44,873 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_34815_210_6388_1_1533381320_3571903_302-255963-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 06:31:46,495 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_47449_210_5399_1_1534076445_4006929_573-301852-0_sp1.1 from training. Duration: 0.97275 2023-10-02 06:31:51,170 WARNING [train.py:1204] (2/4) Exclude cut with ID 6709-74022-0004-86860-0_sp1.1 from training. Duration: 0.9409375 2023-10-02 06:31:53,241 WARNING [train.py:1204] (2/4) Exclude cut with ID _40519_210_13794_1_1533715228655_7337390_111-71991-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 06:31:54,373 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_30167_210_3528_1_1532428012_4772375_169-214186-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 06:31:54,534 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.ff3_skip_rate, batch_count=783506.6666666666, ans=0.0 2023-10-02 06:31:57,249 WARNING [train.py:1204] (2/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_20-235313-0_sp1.1 from training. Duration: 0.963625 2023-10-02 06:31:57,291 WARNING [train.py:1204] (2/4) Exclude cut with ID _4681_210_3525_1_1527251040321_1235713_123-24428-0_sp0.9 from training. Duration: 0.5333125 2023-10-02 06:31:59,899 WARNING [train.py:1204] (2/4) Exclude cut with ID _31602_210_5854_1_1532677021456_4458250_444-198752-0_sp1.1 from training. Duration: 0.97275 2023-10-02 06:32:01,272 WARNING [train.py:1204] (2/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_9-279442-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 06:32:01,327 WARNING [train.py:1204] (2/4) Exclude cut with ID _6275_210_2084_1_1528110062213_7669049_973-198085-0_sp0.9 from training. Duration: 0.8333125 2023-10-02 06:32:03,273 WARNING [train.py:1204] (2/4) Exclude cut with ID _29853_210_7497_1_1532505600851_6642110_567-250221-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 06:32:04,688 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6545_210_2040_1_1527774670_4245258_407-48070-0_sp1.1 from training. Duration: 0.6909375 2023-10-02 06:32:06,068 WARNING [train.py:1204] (2/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_224-343295-0_sp0.9 from training. Duration: 0.611125 2023-10-02 06:32:07,417 WARNING [train.py:1204] (2/4) Exclude cut with ID IC0125W0011-61817 from training. Duration: 0.88 2023-10-02 06:32:07,433 WARNING [train.py:1204] (2/4) Exclude cut with ID _60113_210_16271_1_1535164212481_4386079_435-154371-0_sp1.1 from training. Duration: 0.97275 2023-10-02 06:32:07,453 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_25836_210_16554_1_1532138167_53197_3-9394-0_sp1.1 from training. Duration: 0.92725 2023-10-02 06:32:10,287 WARNING [train.py:1204] (2/4) Exclude cut with ID _29343_210_4381_1_1532602757752_3674109_57-55125-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 06:32:10,559 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.feed_forward3.hidden_balancer.prob, batch_count=783573.3333333334, ans=0.125 2023-10-02 06:32:11,480 WARNING [train.py:1204] (2/4) Exclude cut with ID _56235_210_10640_1_1534557335235_4161099_267-23560-0_sp1.1 from training. Duration: 0.936375 2023-10-02 06:32:11,504 WARNING [train.py:1204] (2/4) Exclude cut with ID _30196_210_1664_1_1533708496522_7074140_1150-278867-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 06:32:11,564 WARNING [train.py:1204] (2/4) Exclude cut with ID _6270_210_2084_1_1527850911573_3856420_26-277343-0_sp0.9 from training. Duration: 0.911125 2023-10-02 06:32:12,921 WARNING [train.py:1204] (2/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_325-62802-0 from training. Duration: 0.87 2023-10-02 06:32:12,985 WARNING [train.py:1204] (2/4) Exclude cut with ID _56524_210_9642_1_1534572011779_3619170_145-310842-0_sp1.1 from training. Duration: 0.9 2023-10-02 06:32:14,365 WARNING [train.py:1204] (2/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_860-96198-0_sp0.9 from training. Duration: 0.811125 2023-10-02 06:32:15,688 WARNING [train.py:1204] (2/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_17-25045-0_sp1.1 from training. Duration: 0.536375 2023-10-02 06:32:15,713 WARNING [train.py:1204] (2/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_349-69727-0_sp1.1 from training. Duration: 0.936375 2023-10-02 06:32:17,069 WARNING [train.py:1204] (2/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_580-93798-0_sp1.1 from training. Duration: 0.5090625 2023-10-02 06:32:18,412 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7762_210_1794_1_1528199508_4057408_419-125009-0 from training. Duration: 0.83 2023-10-02 06:32:20,432 WARNING [train.py:1204] (2/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_356-167816-0 from training. Duration: 0.72 2023-10-02 06:32:20,455 WARNING [train.py:1204] (2/4) Exclude cut with ID _29345_210_4381_1_1532689168263_3860169_225-97100-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 06:32:20,467 WARNING [train.py:1204] (2/4) Exclude cut with ID _14227_210_4381_1_1530701804286_3940259_374-194712-0_sp1.1 from training. Duration: 0.92725 2023-10-02 06:32:20,499 WARNING [train.py:1204] (2/4) Exclude cut with ID _40137_210_12210_1_1533290484046_3795180_249-187059-0_sp0.9 from training. Duration: 0.97775 2023-10-02 06:32:21,867 WARNING [train.py:1204] (2/4) Exclude cut with ID _22826_210_9994_1_1531908201522_3279169_389-317194-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 06:32:23,344 WARNING [train.py:1204] (2/4) Exclude cut with ID _27485_210_4916_1_1532739191677_3668269_239-122918-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 06:32:28,802 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_2245_210_2715_1_1525511548_3540254_128-117609-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 06:32:28,842 WARNING [train.py:1204] (2/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_160-276178-0_sp1.1 from training. Duration: 0.963625 2023-10-02 06:32:30,574 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_31920_210_10633_1_1532860009_1686094_1-279735-0_sp1.1 from training. Duration: 0.97275 2023-10-02 06:32:33,316 WARNING [train.py:1204] (2/4) Exclude cut with ID _51078_210_9070_1_1534161557424_3805250_325-262383-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 06:32:33,336 WARNING [train.py:1204] (2/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_643-52007-0_sp0.9 from training. Duration: 0.6333125 2023-10-02 06:32:33,391 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_51937_210_5014_1_1534573610_4022025_518-299248-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 06:32:40,949 WARNING [train.py:1204] (2/4) Exclude cut with ID _13252_210_1925_1_1530671372047_3336909_166-314607-0_sp1.1 from training. Duration: 0.4909375 2023-10-02 06:32:40,952 WARNING [train.py:1204] (2/4) Exclude cut with ID _44778_210_2054_1_1533798127203_7443090_1087-282939-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 06:32:42,107 INFO [train.py:1046] (2/4) Epoch 23, batch 700, loss[loss=0.1698, simple_loss=0.2394, pruned_loss=0.05007, over 23596.00 frames. ], tot_loss[loss=0.1706, simple_loss=0.2459, pruned_loss=0.04766, over 4543225.33 frames. ], batch size: 256, lr: 4.50e-03, grad_scale: 8.0 2023-10-02 06:32:42,165 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_23850_210_4238_1_1532232597_1632433_32-185135-0_sp1.1 from training. Duration: 0.7818125 2023-10-02 06:32:42,195 WARNING [train.py:1204] (2/4) Exclude cut with ID _59500_210_8254_1_1534809610680_3736009_145-89359-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 06:32:47,720 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_340-336408-0 from training. Duration: 0.93 2023-10-02 06:32:47,808 WARNING [train.py:1204] (2/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_9-21737-0 from training. Duration: 0.97 2023-10-02 06:32:50,119 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.feed_forward1.out_proj.dropout_p, batch_count=783773.3333333334, ans=0.1 2023-10-02 06:32:51,307 WARNING [train.py:1204] (2/4) Exclude cut with ID _2220_210_1819_1_1525509109898_3658680_50-99837-0 from training. Duration: 0.54 2023-10-02 06:32:52,721 WARNING [train.py:1204] (2/4) Exclude cut with ID _30304_210_6319_1_1532476827261_7582060_879-299350-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 06:32:52,851 WARNING [train.py:1204] (2/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_905-160074-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 06:32:53,044 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.bypass_mid.scale_min, batch_count=783773.3333333334, ans=0.2 2023-10-02 06:32:55,596 WARNING [train.py:1204] (2/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_395-73570-0 from training. Duration: 0.99 2023-10-02 06:32:59,806 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7264_210_4938_1_1527939911_8132095_800-91995-0_sp1.1 from training. Duration: 0.7818125 2023-10-02 06:33:01,813 WARNING [train.py:1204] (2/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_77-311404-0_sp1.1 from training. Duration: 0.8545625 2023-10-02 06:33:02,088 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.attention_skip_rate, batch_count=783840.0, ans=0.0 2023-10-02 06:33:03,195 WARNING [train.py:1204] (2/4) Exclude cut with ID _13679_210_7497_1_1530534293662_3355098_107-80772-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 06:33:04,561 WARNING [train.py:1204] (2/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_224-343295-0_sp1.1 from training. Duration: 0.5 2023-10-02 06:33:05,876 WARNING [train.py:1204] (2/4) Exclude cut with ID _54871_210_22356_1_1534665798253_3856359_156-253482-0_sp1.1 from training. Duration: 0.963625 2023-10-02 06:33:06,097 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.bypass.skip_rate, batch_count=783840.0, ans=0.04949747468305833 2023-10-02 06:33:07,950 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.conv_skip_rate, batch_count=783840.0, ans=0.0 2023-10-02 06:33:09,004 WARNING [train.py:1204] (2/4) Exclude cut with ID _49728_210_12651_1_1534041034294_6788150_309-125532-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 06:33:10,596 WARNING [train.py:1204] (2/4) Exclude cut with ID _13913_210_9994_1_1530579615187_3383897_473-277682-0_sp0.9 from training. Duration: 0.4333125 2023-10-02 06:33:10,607 WARNING [train.py:1204] (2/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_3-276701-0_sp1.1 from training. Duration: 0.82725 2023-10-02 06:33:12,028 WARNING [train.py:1204] (2/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_282-136641-0 from training. Duration: 0.42 2023-10-02 06:33:13,306 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.505e+02 1.806e+02 2.034e+02 2.307e+02 3.674e+02, threshold=4.068e+02, percent-clipped=0.0 2023-10-02 06:33:13,491 WARNING [train.py:1204] (2/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_127-566-0 from training. Duration: 0.84 2023-10-02 06:33:17,694 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6163_210_6400_1_1527506746_4263068_283-137811-0_sp1.1 from training. Duration: 0.7 2023-10-02 06:33:17,722 WARNING [train.py:1204] (2/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_34-183685-0_sp1.1 from training. Duration: 0.8909375 2023-10-02 06:33:19,831 WARNING [train.py:1204] (2/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_154-135696-0_sp0.9 from training. Duration: 0.888875 2023-10-02 06:33:23,276 WARNING [train.py:1204] (2/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_676-51014-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 06:33:23,348 WARNING [train.py:1204] (2/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_38-169920-0 from training. Duration: 0.91 2023-10-02 06:33:27,555 WARNING [train.py:1204] (2/4) Exclude cut with ID _6653_210_2629_1_1527919172441_6732679_747-114465-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 06:33:28,887 WARNING [train.py:1204] (2/4) Exclude cut with ID _8021_210_7994_1_1528376451358_3653069_100-140231-0_sp1.1 from training. Duration: 0.6090625 2023-10-02 06:33:28,912 WARNING [train.py:1204] (2/4) Exclude cut with ID _64135_210_2253_1_1535677267876_7608972_133-188266-0 from training. Duration: 0.81 2023-10-02 06:33:31,171 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.conv_skip_rate, batch_count=783973.3333333334, ans=0.0 2023-10-02 06:33:35,296 WARNING [train.py:1204] (2/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_714-310538-0_sp1.1 from training. Duration: 0.7818125 2023-10-02 06:33:35,386 WARNING [train.py:1204] (2/4) Exclude cut with ID _39867_210_9826_1_1533553222030_9344259_1134-288498-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 06:33:38,659 WARNING [train.py:1204] (2/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_211-237289-0_sp1.1 from training. Duration: 0.936375 2023-10-02 06:33:42,800 WARNING [train.py:1204] (2/4) Exclude cut with ID _59202_210_26984_1_1534816810612_3549174_137-318710-0_sp0.9 from training. Duration: 0.988875 2023-10-02 06:33:42,830 WARNING [train.py:1204] (2/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_86-66192-0 from training. Duration: 0.55 2023-10-02 06:33:46,913 WARNING [train.py:1204] (2/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_540-275685-0 from training. Duration: 0.89 2023-10-02 06:33:46,958 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8776_210_2040_1_1528638837_4088036_244-278763-0 from training. Duration: 0.9 2023-10-02 06:33:51,553 WARNING [train.py:1204] (2/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_9-219972-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 06:33:53,030 WARNING [train.py:1204] (2/4) Exclude cut with ID _44659_210_2721_1_1534036120061_6614860_656-283823-0_sp1.1 from training. Duration: 0.92725 2023-10-02 06:33:53,100 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_69630_210_7234_1_1535789601_661049_77-216385-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 06:33:53,558 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.5.encoder.layers.0.nonlin_attention.whiten2, num_groups=1, num_channels=256, metric=3.61 vs. limit=15.0 2023-10-02 06:33:55,804 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_19952_210_2161_1_1531875469_3851750_210-308919-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 06:33:55,811 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_4737_210_2040_1_1526998689_3293008_378-178650-0 from training. Duration: 0.95 2023-10-02 06:33:57,177 INFO [train.py:1046] (2/4) Epoch 23, batch 750, loss[loss=0.1499, simple_loss=0.2328, pruned_loss=0.0335, over 24645.00 frames. ], tot_loss[loss=0.1699, simple_loss=0.2454, pruned_loss=0.04717, over 4585043.12 frames. ], batch size: 65, lr: 4.50e-03, grad_scale: 8.0 2023-10-02 06:33:59,910 WARNING [train.py:1204] (2/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_224-343295-0 from training. Duration: 0.55 2023-10-02 06:33:59,941 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7002_210_1794_1_1527939683_5157286_184-229630-0 from training. Duration: 0.74 2023-10-02 06:34:01,235 WARNING [train.py:1204] (2/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_6-214815-0 from training. Duration: 0.85 2023-10-02 06:34:03,142 WARNING [train.py:1204] (2/4) Exclude cut with ID _8699_210_2069_1_1528630346831_3960409_160-24408-0 from training. Duration: 0.91 2023-10-02 06:34:03,183 WARNING [train.py:1204] (2/4) Exclude cut with ID _5585_210_2667_1_1527418813324_8007340_89-332072-0 from training. Duration: 0.64 2023-10-02 06:34:03,245 WARNING [train.py:1204] (2/4) Exclude cut with ID _5250_210_5219_1_1527249561806_8692110_528-259800-0_sp1.1 from training. Duration: 0.963625 2023-10-02 06:34:03,903 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.3.encoder.layers.0.self_attn1.whiten, num_groups=1, num_channels=512, metric=17.82 vs. limit=22.5 2023-10-02 06:34:04,559 WARNING [train.py:1204] (2/4) Exclude cut with ID _55383_210_10635_1_1534468426172_2231669_134-128246-0 from training. Duration: 0.89 2023-10-02 06:34:06,047 WARNING [train.py:1204] (2/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_400-338274-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 06:34:06,101 WARNING [train.py:1204] (2/4) Exclude cut with ID _14224_210_4381_1_1530529062037_4264010_364-22817-0_sp0.9 from training. Duration: 0.788875 2023-10-02 06:34:09,186 WARNING [train.py:1204] (2/4) Exclude cut with ID _42863_210_9070_1_1533902398788_3696920_331-291057-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 06:34:10,549 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_5436_210_1794_1_1527331317_2541494_229-211016-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 06:34:10,593 WARNING [train.py:1204] (2/4) Exclude cut with ID _6456_210_1534_1_1527681634212_3181049_345-252877-0_sp0.9 from training. Duration: 0.711125 2023-10-02 06:34:10,624 WARNING [train.py:1204] (2/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_96-81499-0_sp1.1 from training. Duration: 0.92725 2023-10-02 06:34:13,428 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_5575_210_1410_1_1527301305_2570170_342-265572-0_sp1.1 from training. Duration: 0.8545625 2023-10-02 06:34:14,806 WARNING [train.py:1204] (2/4) Exclude cut with ID _5444_210_2151_1_1527418808464_5312210_726-27165-0_sp0.9 from training. Duration: 0.7444375 2023-10-02 06:34:17,463 WARNING [train.py:1204] (2/4) Exclude cut with ID _22083_210_8254_1_1531702862439_3678970_304-327750-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 06:34:20,316 WARNING [train.py:1204] (2/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_245-226199-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 06:34:20,367 WARNING [train.py:1204] (2/4) Exclude cut with ID _42875_210_14856_1_1533704397487_4012433_102-199499-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 06:34:22,363 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_51225_210_3601_1_1534400634_2327383_55-301844-0 from training. Duration: 0.93 2023-10-02 06:34:22,955 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.4.encoder.layers.1.feed_forward3.out_whiten, num_groups=1, num_channels=384, metric=12.98 vs. limit=15.0 2023-10-02 06:34:23,633 WARNING [train.py:1204] (2/4) Exclude cut with ID _28317_210_12742_1_1532480302381_8510325_916-195848-0_sp1.1 from training. Duration: 0.836375 2023-10-02 06:34:23,699 WARNING [train.py:1204] (2/4) Exclude cut with ID _44718_210_5816_1_1534050051869_6469030_497-311999-0_sp1.1 from training. Duration: 0.936375 2023-10-02 06:34:25,086 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_34886_210_10633_1_1533203882_1755661_91-194765-0_sp1.1 from training. Duration: 0.936375 2023-10-02 06:34:28,440 WARNING [train.py:1204] (2/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_310-26186-0_sp0.9 from training. Duration: 0.77775 2023-10-02 06:34:29,812 WARNING [train.py:1204] (2/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_416-257730-0 from training. Duration: 0.65 2023-10-02 06:34:29,826 WARNING [train.py:1204] (2/4) Exclude cut with ID _9389_210_1664_1_1529217337301_5289269_209-154760-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 06:34:31,192 WARNING [train.py:1204] (2/4) Exclude cut with ID _4092_210_2151_1_1526643044449_3135759_227-213972-0 from training. Duration: 0.54 2023-10-02 06:34:31,221 WARNING [train.py:1204] (2/4) Exclude cut with ID IC0679W0044-335852 from training. Duration: 0.768 2023-10-02 06:34:31,271 WARNING [train.py:1204] (2/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_42-186162-0 from training. Duration: 0.92 2023-10-02 06:34:31,284 WARNING [train.py:1204] (2/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_302-2550-0_sp0.9 from training. Duration: 0.9 2023-10-02 06:34:31,297 WARNING [train.py:1204] (2/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_356-31216-0_sp0.9 from training. Duration: 0.8333125 2023-10-02 06:34:34,061 WARNING [train.py:1204] (2/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_125-322172-0_sp1.1 from training. Duration: 0.8454375 2023-10-02 06:34:39,456 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.4.encoder.layers.2.feed_forward1.out_whiten, num_groups=1, num_channels=384, metric=9.33 vs. limit=15.0 2023-10-02 06:34:40,341 WARNING [train.py:1204] (2/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_141-89727-0_sp0.9 from training. Duration: 0.788875 2023-10-02 06:34:40,365 WARNING [train.py:1204] (2/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_647-337784-0_sp1.1 from training. Duration: 0.97275 2023-10-02 06:34:40,381 WARNING [train.py:1204] (2/4) Exclude cut with ID _6493_210_1747_1_1528459210424_4012470_513-140967-0_sp1.1 from training. Duration: 0.7181875 2023-10-02 06:34:41,798 WARNING [train.py:1204] (2/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_87-266334-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 06:34:45,183 WARNING [train.py:1204] (2/4) Exclude cut with ID _26528_210_9070_1_1532570410842_3851310_526-261338-0_sp1.1 from training. Duration: 0.92725 2023-10-02 06:34:46,433 WARNING [train.py:1204] (2/4) Exclude cut with ID _14224_210_4381_1_1530529062037_4264010_380-343670-0 from training. Duration: 0.88 2023-10-02 06:34:46,495 WARNING [train.py:1204] (2/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_449-15508-0_sp1.1 from training. Duration: 0.7454375 2023-10-02 06:34:47,951 WARNING [train.py:1204] (2/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_372-6869-0_sp0.9 from training. Duration: 0.488875 2023-10-02 06:34:48,004 WARNING [train.py:1204] (2/4) Exclude cut with ID _26907_210_7234_1_1532221740694_3205263_337-2494-0_sp0.9 from training. Duration: 0.97775 2023-10-02 06:34:50,708 WARNING [train.py:1204] (2/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_10-100367-0_sp1.1 from training. Duration: 0.8909375 2023-10-02 06:34:50,752 WARNING [train.py:1204] (2/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_47-167373-0 from training. Duration: 0.92 2023-10-02 06:34:52,165 WARNING [train.py:1204] (2/4) Exclude cut with ID _28323_210_16187_1_1532480395667_4100300_628-282179-0_sp1.1 from training. Duration: 0.97275 2023-10-02 06:34:56,896 WARNING [train.py:1204] (2/4) Exclude cut with ID _20455_210_5219_1_1531565996775_8098100_213-280898-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 06:34:56,991 WARNING [train.py:1204] (2/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_383-316000-0_sp1.1 from training. Duration: 0.8181875 2023-10-02 06:34:58,231 WARNING [train.py:1204] (2/4) Exclude cut with ID _18513_210_9206_1_1531618215223_3862620_16-78180-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 06:34:59,715 WARNING [train.py:1204] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_330-327861-0_sp1.1 from training. Duration: 0.6090625 2023-10-02 06:35:01,216 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.0.nonlin_attention.balancer.prob, batch_count=784373.3333333334, ans=0.125 2023-10-02 06:35:03,749 WARNING [train.py:1204] (2/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_356-31216-0 from training. Duration: 0.75 2023-10-02 06:35:05,654 WARNING [train.py:1204] (2/4) Exclude cut with ID _25248_210_8750_1_1532062825941_5554180_255-148539-0_sp1.1 from training. Duration: 0.963625 2023-10-02 06:35:05,718 WARNING [train.py:1204] (2/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_269-154726-0_sp1.1 from training. Duration: 0.7818125 2023-10-02 06:35:06,290 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.4.encoder.layers.2.conv_module2.whiten, num_groups=1, num_channels=384, metric=3.09 vs. limit=15.0 2023-10-02 06:35:06,294 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.4.encoder.layers.2.self_attn2.whiten, num_groups=1, num_channels=384, metric=11.78 vs. limit=22.5 2023-10-02 06:35:08,613 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7002_210_1794_1_1527939683_5157286_163-87552-0_sp1.1 from training. Duration: 0.7818125 2023-10-02 06:35:08,641 WARNING [train.py:1204] (2/4) Exclude cut with ID _54871_210_22356_1_1534665798253_3856359_97-151896-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 06:35:11,275 INFO [train.py:1046] (2/4) Epoch 23, batch 800, loss[loss=0.1918, simple_loss=0.2624, pruned_loss=0.06057, over 23474.00 frames. ], tot_loss[loss=0.171, simple_loss=0.2466, pruned_loss=0.04769, over 4613378.78 frames. ], batch size: 285, lr: 4.50e-03, grad_scale: 16.0 2023-10-02 06:35:11,361 WARNING [train.py:1204] (2/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_207-7026-0_sp1.1 from training. Duration: 0.97275 2023-10-02 06:35:11,387 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_92-46009-0_sp0.9 from training. Duration: 0.6 2023-10-02 06:35:20,186 WARNING [train.py:1204] (2/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_122-178847-0_sp1.1 from training. Duration: 0.97275 2023-10-02 06:35:20,190 WARNING [train.py:1204] (2/4) Exclude cut with ID _44778_210_2054_1_1533798127203_7443090_94-348956-0_sp1.1 from training. Duration: 0.92725 2023-10-02 06:35:22,487 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.ff3_skip_rate, batch_count=784440.0, ans=0.0 2023-10-02 06:35:23,553 WARNING [train.py:1204] (2/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_1018-244089-0_sp1.1 from training. Duration: 0.963625 2023-10-02 06:35:23,571 WARNING [train.py:1204] (2/4) Exclude cut with ID _56235_210_10640_1_1534557335235_4161099_87-118423-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 06:35:24,343 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.3.encoder.layers.0.nonlin_attention.whiten1, num_groups=1, num_channels=384, metric=7.23 vs. limit=10.0 2023-10-02 06:35:24,941 WARNING [train.py:1204] (2/4) Exclude cut with ID _53713_210_24116_1_1534314487762_7416751_504-212292-0_sp1.1 from training. Duration: 0.92725 2023-10-02 06:35:24,963 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_446-150327-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 06:35:26,427 WARNING [train.py:1204] (2/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_682-245989-0_sp1.1 from training. Duration: 0.92725 2023-10-02 06:35:30,967 WARNING [train.py:1204] (2/4) Exclude cut with ID _35723_210_1794_1_1533088341043_628250_8-234524-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 06:35:31,037 WARNING [train.py:1204] (2/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_64-248359-0_sp0.9 from training. Duration: 0.7666875 2023-10-02 06:35:33,709 WARNING [train.py:1204] (2/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_49-97158-0 from training. Duration: 0.76 2023-10-02 06:35:33,765 WARNING [train.py:1204] (2/4) Exclude cut with ID _24314_210_4915_1_1532135078116_7170179_743-915-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 06:35:35,171 WARNING [train.py:1204] (2/4) Exclude cut with ID _61194_210_18628_1_1535007555628_3750909_353-323468-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 06:35:35,199 WARNING [train.py:1204] (2/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_387-131538-0_sp1.1 from training. Duration: 0.736375 2023-10-02 06:35:36,497 WARNING [train.py:1204] (2/4) Exclude cut with ID _44424_210_18163_1_1533623394848_1213110_79-187603-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 06:35:36,528 WARNING [train.py:1204] (2/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_86-19601-0 from training. Duration: 0.85 2023-10-02 06:35:36,557 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_50050_210_16223_1_1535435796_7290138_103-283529-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 06:35:36,594 WARNING [train.py:1204] (2/4) Exclude cut with ID _13913_210_9994_1_1530579615187_3383897_473-277682-0 from training. Duration: 0.39 2023-10-02 06:35:39,855 WARNING [train.py:1204] (2/4) Exclude cut with ID _42326_210_7770_1_1533814282531_3707269_368-68029-0_sp1.1 from training. Duration: 0.92725 2023-10-02 06:35:41,307 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_41428_210_2161_1_1533774184_3992532_446-339645-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 06:35:42,556 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.522e+02 1.768e+02 2.043e+02 2.413e+02 3.379e+02, threshold=4.086e+02, percent-clipped=0.0 2023-10-02 06:35:42,698 WARNING [train.py:1204] (2/4) Exclude cut with ID _69584_210_5219_1_1535785133775_4044450_325-7656-0_sp1.1 from training. Duration: 0.7818125 2023-10-02 06:35:42,843 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.conv_module1.balancer2.prob, batch_count=784573.3333333334, ans=0.125 2023-10-02 06:35:44,085 WARNING [train.py:1204] (2/4) Exclude cut with ID _21632_210_2253_1_1531994001765_3744140_134-184387-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 06:35:47,391 WARNING [train.py:1204] (2/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_550-184707-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 06:35:47,407 WARNING [train.py:1204] (2/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_106-341780-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 06:35:52,782 WARNING [train.py:1204] (2/4) Exclude cut with ID _45871_210_5665_1_1533864614245_3296060_253-132552-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 06:35:53,486 WARNING [train.py:1204] (2/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_395-310220-0_sp1.1 from training. Duration: 0.8090625 2023-10-02 06:35:53,510 WARNING [train.py:1204] (2/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_795-33252-0_sp0.9 from training. Duration: 0.411125 2023-10-02 06:35:54,664 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9002W0003-495962 from training. Duration: 0.946 2023-10-02 06:35:55,969 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_43-71989-0 from training. Duration: 0.57 2023-10-02 06:35:55,983 WARNING [train.py:1204] (2/4) Exclude cut with ID _8699_210_2069_1_1528630346831_3960409_155-39766-0_sp1.1 from training. Duration: 0.7090625 2023-10-02 06:35:55,998 WARNING [train.py:1204] (2/4) Exclude cut with ID _27894_210_12392_1_1532415496078_6902439_788-232684-0_sp1.1 from training. Duration: 0.97275 2023-10-02 06:35:57,883 WARNING [train.py:1204] (2/4) Exclude cut with ID _56494_210_4220_1_1535284837625_3033169_10-39296-0_sp1.1 from training. Duration: 0.92725 2023-10-02 06:35:57,909 WARNING [train.py:1204] (2/4) Exclude cut with ID _40901_210_5665_1_1533363789958_3856130_341-20819-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 06:36:03,503 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9029W0435-509822 from training. Duration: 0.896 2023-10-02 06:36:03,550 WARNING [train.py:1204] (2/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_51-117576-0 from training. Duration: 0.4 2023-10-02 06:36:03,689 WARNING [train.py:1204] (2/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_197-28164-0_sp1.1 from training. Duration: 0.72725 2023-10-02 06:36:05,129 WARNING [train.py:1204] (2/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_145-137790-0_sp1.1 from training. Duration: 0.7545625 2023-10-02 06:36:09,765 WARNING [train.py:1204] (2/4) Exclude cut with ID _47790_210_16187_1_1533862814066_4207519_2-39653-0_sp1.1 from training. Duration: 0.8909375 2023-10-02 06:36:12,478 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_3780_210_2087_1_1526467760_2130483_186-208240-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 06:36:13,903 WARNING [train.py:1204] (2/4) Exclude cut with ID _24055_210_12651_1_1532310907143_976979_92-59356-0 from training. Duration: 0.91 2023-10-02 06:36:13,966 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_3910_210_1794_1_1526742306_334530_28-238909-0_sp1.1 from training. Duration: 0.736375 2023-10-02 06:36:16,812 WARNING [train.py:1204] (2/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_269-154726-0 from training. Duration: 0.86 2023-10-02 06:36:22,815 WARNING [train.py:1204] (2/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_326-165900-0_sp0.9 from training. Duration: 0.7666875 2023-10-02 06:36:26,655 INFO [train.py:1046] (2/4) Epoch 23, batch 850, loss[loss=0.1732, simple_loss=0.2622, pruned_loss=0.04207, over 24642.00 frames. ], tot_loss[loss=0.1718, simple_loss=0.2478, pruned_loss=0.0479, over 4646575.68 frames. ], batch size: 68, lr: 4.50e-03, grad_scale: 16.0 2023-10-02 06:36:26,702 WARNING [train.py:1204] (2/4) Exclude cut with ID _5294_210_1747_1_1527249643928_3696179_470-276077-0_sp1.1 from training. Duration: 0.963625 2023-10-02 06:36:26,738 WARNING [train.py:1204] (2/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_372-131163-0 from training. Duration: 0.94 2023-10-02 06:36:26,776 WARNING [train.py:1204] (2/4) Exclude cut with ID _45297_210_2721_1_1533715351381_3264949_351-147136-0_sp1.1 from training. Duration: 0.7909375 2023-10-02 06:36:28,261 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_1072-12671-0_sp1.1 from training. Duration: 0.92725 2023-10-02 06:36:29,639 WARNING [train.py:1204] (2/4) Exclude cut with ID _15133_210_6228_1_1530794512074_3080379_54-198193-0 from training. Duration: 0.94 2023-10-02 06:36:29,675 WARNING [train.py:1204] (2/4) Exclude cut with ID _30196_210_1664_1_1533708496522_7074140_1194-270988-0_sp1.1 from training. Duration: 0.97275 2023-10-02 06:36:31,106 WARNING [train.py:1204] (2/4) Exclude cut with ID _50310_210_15488_1_1534068043345_3641080_289-193637-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 06:36:32,505 WARNING [train.py:1204] (2/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_313-263495-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 06:36:33,924 WARNING [train.py:1204] (2/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_18-130336-0_sp1.1 from training. Duration: 0.7181875 2023-10-02 06:36:35,375 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_47896_210_20508_1_1534492518_5958868_646-287209-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 06:36:35,700 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.bypass_mid.scale_min, batch_count=784773.3333333334, ans=0.2 2023-10-02 06:36:37,393 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_294-163793-0 from training. Duration: 0.82 2023-10-02 06:36:37,432 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_8-319957-0 from training. Duration: 0.96 2023-10-02 06:36:37,435 WARNING [train.py:1204] (2/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_1211-226066-0 from training. Duration: 0.76 2023-10-02 06:36:38,821 WARNING [train.py:1204] (2/4) Exclude cut with ID _4779_210_2084_1_1527318022944_3914893_500-285013-0_sp0.9 from training. Duration: 0.7666875 2023-10-02 06:36:40,180 WARNING [train.py:1204] (2/4) Exclude cut with ID _22826_210_9994_1_1531908201522_3279169_347-232268-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 06:36:41,650 WARNING [train.py:1204] (2/4) Exclude cut with ID _29877_210_6228_1_1532397791106_5276149_111-55056-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 06:36:41,673 WARNING [train.py:1204] (2/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_812-271646-0_sp1.1 from training. Duration: 0.92725 2023-10-02 06:36:41,712 WARNING [train.py:1204] (2/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_235-262124-0_sp1.1 from training. Duration: 0.6090625 2023-10-02 06:36:44,736 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.self_attn_weights.pos_emb_skip_rate, batch_count=784840.0, ans=0.0 2023-10-02 06:36:46,121 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.bypass.scale_min, batch_count=784840.0, ans=0.2 2023-10-02 06:36:47,004 WARNING [train.py:1204] (2/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_14-16530-0_sp1.1 from training. Duration: 0.97275 2023-10-02 06:36:47,038 WARNING [train.py:1204] (2/4) Exclude cut with ID _44718_210_5816_1_1534050051869_6469030_568-304512-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 06:36:47,064 WARNING [train.py:1204] (2/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_1033-274376-0 from training. Duration: 0.87 2023-10-02 06:36:51,698 WARNING [train.py:1204] (2/4) Exclude cut with ID _4681_210_3525_1_1527251040321_1235713_123-24428-0 from training. Duration: 0.48 2023-10-02 06:36:53,401 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.ff3_skip_rate, batch_count=784840.0, ans=0.0 2023-10-02 06:36:54,508 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_37818_210_4238_1_1533620910_4720083_251-288764-0_sp1.1 from training. Duration: 0.97275 2023-10-02 06:36:54,750 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.conv_module1.balancer1.min_positive, batch_count=784906.6666666666, ans=0.025 2023-10-02 06:36:55,889 WARNING [train.py:1204] (2/4) Exclude cut with ID _65585_210_12210_1_1535450451584_3764098_112-294301-0 from training. Duration: 0.88 2023-10-02 06:36:59,738 WARNING [train.py:1204] (2/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_329-113173-0 from training. Duration: 0.62 2023-10-02 06:37:01,103 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6767_210_3273_1_1527855783_1718932_173-256601-0 from training. Duration: 0.8 2023-10-02 06:37:02,528 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9266W0001-627677 from training. Duration: 0.896 2023-10-02 06:37:02,544 WARNING [train.py:1204] (2/4) Exclude cut with ID _49643_210_7989_1_1534640244224_3939031_611-185515-0_sp1.1 from training. Duration: 0.8545625 2023-10-02 06:37:02,551 WARNING [train.py:1204] (2/4) Exclude cut with ID _31649_210_6228_1_1532584608063_4091078_687-237771-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 06:37:03,806 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_12868_210_6753_1_1530702479_3610564_148-172861-0_sp0.9 from training. Duration: 0.5555625 2023-10-02 06:37:07,187 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_5129_210_6400_1_1527161005_3579054_107-69738-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 06:37:07,284 WARNING [train.py:1204] (2/4) Exclude cut with ID _40137_210_12210_1_1533290484046_3795180_36-301164-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 06:37:08,550 WARNING [train.py:1204] (2/4) Exclude cut with ID _7537_210_2084_1_1528502395431_3924359_113-199933-0 from training. Duration: 0.59 2023-10-02 06:37:11,422 WARNING [train.py:1204] (2/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_547-5424-0_sp1.1 from training. Duration: 0.8545625 2023-10-02 06:37:12,811 WARNING [train.py:1204] (2/4) Exclude cut with ID _43552_210_8913_1_1533726031059_3523429_147-284024-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 06:37:12,895 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_294-163793-0_sp1.1 from training. Duration: 0.7454375 2023-10-02 06:37:12,914 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_3780_210_2087_1_1526467760_2130483_170-259044-0_sp1.1 from training. Duration: 0.636375 2023-10-02 06:37:14,370 WARNING [train.py:1204] (2/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_726-270780-0_sp1.1 from training. Duration: 0.7818125 2023-10-02 06:37:15,711 WARNING [train.py:1204] (2/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_214-291369-0_sp1.1 from training. Duration: 0.57275 2023-10-02 06:37:15,764 WARNING [train.py:1204] (2/4) Exclude cut with ID _21881_210_3525_1_1531825242757_3798380_247-102591-0 from training. Duration: 0.98 2023-10-02 06:37:19,884 WARNING [train.py:1204] (2/4) Exclude cut with ID _8064_210_5399_1_1528372299291_4282973_18-174647-0_sp1.1 from training. Duration: 0.82725 2023-10-02 06:37:19,887 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_415-52611-0_sp1.1 from training. Duration: 0.936375 2023-10-02 06:37:19,952 WARNING [train.py:1204] (2/4) Exclude cut with ID _8394_210_6702_1_1528590504316_3540189_379-49457-0_sp0.9 from training. Duration: 0.9666875 2023-10-02 06:37:19,965 WARNING [train.py:1204] (2/4) Exclude cut with ID _64521_210_14699_1_1535243427259_3815328_368-195106-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 06:37:21,321 WARNING [train.py:1204] (2/4) Exclude cut with ID _28394_210_8913_1_1532430049518_3650118_51-334124-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 06:37:21,550 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.conv_module2.balancer2.prob, batch_count=784973.3333333334, ans=0.125 2023-10-02 06:37:24,581 WARNING [train.py:1204] (2/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_317-161769-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 06:37:28,001 WARNING [train.py:1204] (2/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_40-129756-0_sp0.9 from training. Duration: 0.9 2023-10-02 06:37:28,582 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.4.encoder.layers.2.conv_module2.whiten, num_groups=1, num_channels=384, metric=2.91 vs. limit=15.0 2023-10-02 06:37:29,313 WARNING [train.py:1204] (2/4) Exclude cut with ID _3067_210_2667_1_1526212974640_4503168_119-175776-0_sp0.9 from training. Duration: 0.82225 2023-10-02 06:37:29,366 WARNING [train.py:1204] (2/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_1004-304915-0_sp1.1 from training. Duration: 0.92725 2023-10-02 06:37:31,349 WARNING [train.py:1204] (2/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_359-118926-0_sp0.9 from training. Duration: 0.8 2023-10-02 06:37:37,510 WARNING [train.py:1204] (2/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_51-200220-0_sp0.9 from training. Duration: 0.62225 2023-10-02 06:37:38,890 WARNING [train.py:1204] (2/4) Exclude cut with ID _46683_210_9642_1_1533730913492_4591116_530-326202-0_sp1.1 from training. Duration: 0.936375 2023-10-02 06:37:40,265 WARNING [train.py:1204] (2/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_567-135114-0 from training. Duration: 0.87 2023-10-02 06:37:40,301 WARNING [train.py:1204] (2/4) Exclude cut with ID _66863_210_18053_1_1535698758334_8590319_1092-271784-0_sp1.1 from training. Duration: 0.87275 2023-10-02 06:37:40,311 WARNING [train.py:1204] (2/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_762-282614-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 06:37:41,622 INFO [train.py:1046] (2/4) Epoch 23, batch 900, loss[loss=0.1598, simple_loss=0.238, pruned_loss=0.04074, over 24469.00 frames. ], tot_loss[loss=0.1726, simple_loss=0.2485, pruned_loss=0.04834, over 4647175.93 frames. ], batch size: 63, lr: 4.50e-03, grad_scale: 16.0 2023-10-02 06:37:43,060 WARNING [train.py:1204] (2/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_280-120942-0 from training. Duration: 0.77 2023-10-02 06:37:48,681 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_31583_210_3852_1_1532595851_4024413_27-170931-0_sp1.1 from training. Duration: 0.963625 2023-10-02 06:37:51,458 WARNING [train.py:1204] (2/4) Exclude cut with ID _12369_210_4381_1_1530356078581_4388520_266-271628-0_sp1.1 from training. Duration: 0.92725 2023-10-02 06:37:51,499 WARNING [train.py:1204] (2/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_41-31157-0 from training. Duration: 0.57 2023-10-02 06:37:56,130 WARNING [train.py:1204] (2/4) Exclude cut with ID _21878_210_3525_1_1531738772002_7264330_113-18163-0_sp1.1 from training. Duration: 0.8090625 2023-10-02 06:37:57,448 WARNING [train.py:1204] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_147-111137-0 from training. Duration: 0.95 2023-10-02 06:37:58,844 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1083-220122-0_sp1.1 from training. Duration: 0.463625 2023-10-02 06:37:59,583 WARNING [train.py:1204] (2/4) Exclude cut with ID _14884_210_2252_1_1530707459689_5561250_591-173406-0_sp1.1 from training. Duration: 0.87275 2023-10-02 06:37:59,589 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_24677_210_1794_1_1532002693_4341296_149-303793-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 06:38:00,914 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_12868_210_6753_1_1530702479_3610564_133-564-0_sp1.1 from training. Duration: 0.7181875 2023-10-02 06:38:00,945 WARNING [train.py:1204] (2/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_625-338546-0_sp0.9 from training. Duration: 0.97775 2023-10-02 06:38:07,522 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.conv_module2.whiten.whitening_limit, batch_count=785173.3333333334, ans=15.0 2023-10-02 06:38:11,599 WARNING [train.py:1204] (2/4) Exclude cut with ID _25882_210_3036_1_1532775665815_6479510_624-320169-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 06:38:11,605 WARNING [train.py:1204] (2/4) Exclude cut with ID _20622_210_3852_1_1531622427080_1093839_127-325326-0_sp1.1 from training. Duration: 0.92725 2023-10-02 06:38:11,639 WARNING [train.py:1204] (2/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_464-33931-0_sp1.1 from training. Duration: 0.4909375 2023-10-02 06:38:12,830 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.553e+02 1.908e+02 2.046e+02 2.304e+02 2.973e+02, threshold=4.093e+02, percent-clipped=0.0 2023-10-02 06:38:15,681 WARNING [train.py:1204] (2/4) Exclude cut with ID _66112_210_10635_1_1535590236305_3801484_120-304588-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 06:38:17,309 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.1.balancer2.prob, batch_count=785240.0, ans=0.125 2023-10-02 06:38:19,798 WARNING [train.py:1204] (2/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_110-130961-0 from training. Duration: 0.47 2023-10-02 06:38:21,153 WARNING [train.py:1204] (2/4) Exclude cut with ID _16908_210_5724_1_1531119157248_3772700_64-54020-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 06:38:25,446 WARNING [train.py:1204] (2/4) Exclude cut with ID _4068_210_2629_1_1526697350395_1546259_224-288693-0_sp1.1 from training. Duration: 0.536375 2023-10-02 06:38:25,481 WARNING [train.py:1204] (2/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_191-41023-0_sp0.9 from training. Duration: 0.788875 2023-10-02 06:38:25,543 WARNING [train.py:1204] (2/4) Exclude cut with ID ID0205W0255-783002 from training. Duration: 0.958 2023-10-02 06:38:28,596 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_162-262717-0 from training. Duration: 0.83 2023-10-02 06:38:32,407 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_224-322394-0_sp1.1 from training. Duration: 0.6 2023-10-02 06:38:32,414 WARNING [train.py:1204] (2/4) Exclude cut with ID _4866_210_2577_1_1527404412284_4364659_133-1696-0_sp1.1 from training. Duration: 0.9 2023-10-02 06:38:32,462 WARNING [train.py:1204] (2/4) Exclude cut with ID _6275_210_2084_1_1528110062213_7669049_973-198085-0_sp1.1 from training. Duration: 0.6818125 2023-10-02 06:38:33,907 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.feed_forward3.hidden_balancer.prob, batch_count=785306.6666666666, ans=0.125 2023-10-02 06:38:38,623 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_47449_210_5399_1_1534076445_4006929_410-237806-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 06:38:38,636 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7543_210_6753_1_1528543318_4552_1-85072-0_sp0.9 from training. Duration: 0.92225 2023-10-02 06:38:41,907 WARNING [train.py:1204] (2/4) Exclude cut with ID _65263_210_22356_1_1535763598691_3921037_108-21902-0 from training. Duration: 0.99 2023-10-02 06:38:41,919 WARNING [train.py:1204] (2/4) Exclude cut with ID _65585_210_12210_1_1535450451584_3764098_309-174818-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 06:38:44,697 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_309-65177-0 from training. Duration: 0.88 2023-10-02 06:38:46,044 WARNING [train.py:1204] (2/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_363-185703-0_sp1.1 from training. Duration: 0.863625 2023-10-02 06:38:46,080 WARNING [train.py:1204] (2/4) Exclude cut with ID _42326_210_7770_1_1533814282531_3707269_442-238692-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 06:38:48,824 WARNING [train.py:1204] (2/4) Exclude cut with ID _69884_210_9771_1_1535846392818_4166169_226-254446-0_sp1.1 from training. Duration: 0.936375 2023-10-02 06:38:48,841 WARNING [train.py:1204] (2/4) Exclude cut with ID _29853_210_7497_1_1532505600851_6642110_287-299903-0_sp1.1 from training. Duration: 0.963625 2023-10-02 06:38:53,076 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1015-343884-0 from training. Duration: 0.62 2023-10-02 06:38:53,108 WARNING [train.py:1204] (2/4) Exclude cut with ID ID0305W0008-833402 from training. Duration: 0.91 2023-10-02 06:38:53,232 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.feed_forward2.hidden_balancer.prob, batch_count=785373.3333333334, ans=0.125 2023-10-02 06:38:55,736 INFO [train.py:1046] (2/4) Epoch 23, batch 950, loss[loss=0.1695, simple_loss=0.2577, pruned_loss=0.04066, over 24434.00 frames. ], tot_loss[loss=0.1731, simple_loss=0.2491, pruned_loss=0.04859, over 4658307.54 frames. ], batch size: 69, lr: 4.50e-03, grad_scale: 16.0 2023-10-02 06:38:55,835 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6673_210_3273_1_1527767790_3173240_351-167954-0_sp1.1 from training. Duration: 0.436375 2023-10-02 06:38:55,847 WARNING [train.py:1204] (2/4) Exclude cut with ID _6456_210_1534_1_1527681634212_3181049_345-252877-0 from training. Duration: 0.64 2023-10-02 06:38:57,293 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_13-205182-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 06:39:02,516 WARNING [train.py:1204] (2/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_427-208901-0 from training. Duration: 0.68 2023-10-02 06:39:02,742 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.conv_module1.balancer1.prob, batch_count=785440.0, ans=0.125 2023-10-02 06:39:05,383 WARNING [train.py:1204] (2/4) Exclude cut with ID _49039_210_6315_1_1533967537541_3846118_9-148921-0_sp1.1 from training. Duration: 0.97275 2023-10-02 06:39:07,420 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.conv_module2.balancer2.min_abs, batch_count=785440.0, ans=0.5 2023-10-02 06:39:09,803 WARNING [train.py:1204] (2/4) Exclude cut with ID _59493_210_17282_1_1535078419077_3225879_143-70317-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 06:39:09,830 WARNING [train.py:1204] (2/4) Exclude cut with ID _27967_210_12140_1_1532588391859_8419130_1056-236369-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 06:39:09,881 WARNING [train.py:1204] (2/4) Exclude cut with ID _6537_210_1883_1_1527987559710_3597129_382-133062-0_sp0.9 from training. Duration: 0.7555625 2023-10-02 06:39:13,071 WARNING [train.py:1204] (2/4) Exclude cut with ID ID0376W0255-869957 from training. Duration: 0.9510625 2023-10-02 06:39:15,846 WARNING [train.py:1204] (2/4) Exclude cut with ID _49371_210_12071_1_1533952761021_3812190_483-240691-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 06:39:17,189 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_12868_210_6753_1_1530701932_530275_18-76322-0_sp1.1 from training. Duration: 0.7909375 2023-10-02 06:39:17,254 WARNING [train.py:1204] (2/4) Exclude cut with ID _53950_210_5816_1_1534417264919_3862951_367-26344-0_sp1.1 from training. Duration: 0.97275 2023-10-02 06:39:17,281 WARNING [train.py:1204] (2/4) Exclude cut with ID _5444_210_2151_1_1527418808464_5312210_634-194174-0_sp1.1 from training. Duration: 0.8818125 2023-10-02 06:39:17,299 WARNING [train.py:1204] (2/4) Exclude cut with ID _56494_210_4220_1_1535284837625_3033169_69-138150-0 from training. Duration: 0.94 2023-10-02 06:39:18,612 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7543_210_6753_1_1528544010_1110402_25-183860-0_sp0.9 from training. Duration: 0.52225 2023-10-02 06:39:20,137 WARNING [train.py:1204] (2/4) Exclude cut with ID _40284_210_2084_1_1533513679814_3704279_287-191918-0_sp1.1 from training. Duration: 0.963625 2023-10-02 06:39:21,572 WARNING [train.py:1204] (2/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_291-270881-0 from training. Duration: 0.97 2023-10-02 06:39:22,874 WARNING [train.py:1204] (2/4) Exclude cut with ID _12555_210_8750_1_1530075594402_3780239_449-267986-0_sp0.9 from training. Duration: 0.92225 2023-10-02 06:39:25,644 WARNING [train.py:1204] (2/4) Exclude cut with ID _3992_210_1747_1_1526644816031_3731060_268-64520-0_sp1.1 from training. Duration: 0.963625 2023-10-02 06:39:25,650 WARNING [train.py:1204] (2/4) Exclude cut with ID _39411_210_5986_1_1533538825030_3343649_173-238248-0_sp0.9 from training. Duration: 0.92225 2023-10-02 06:39:25,672 WARNING [train.py:1204] (2/4) Exclude cut with ID _32704_210_8954_1_1533017488840_6747370_828-313097-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 06:39:28,786 WARNING [train.py:1204] (2/4) Exclude cut with ID _26907_210_7234_1_1532221740694_3205263_337-2494-0 from training. Duration: 0.88 2023-10-02 06:39:30,893 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_229-45300-0_sp1.1 from training. Duration: 0.5181875 2023-10-02 06:39:32,302 WARNING [train.py:1204] (2/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_300-27235-0_sp1.1 from training. Duration: 0.7909375 2023-10-02 06:39:33,746 WARNING [train.py:1204] (2/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_48-338976-0_sp1.1 from training. Duration: 0.8090625 2023-10-02 06:39:37,217 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_13091_210_7925_1_1530609447_8987354_792-195353-0_sp1.1 from training. Duration: 0.936375 2023-10-02 06:39:37,222 WARNING [train.py:1204] (2/4) Exclude cut with ID _56524_210_9642_1_1534572011779_3619170_311-54500-0_sp1.1 from training. Duration: 0.97275 2023-10-02 06:39:42,691 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7543_210_6753_1_1528544010_1110402_25-183860-0 from training. Duration: 0.47 2023-10-02 06:39:44,732 WARNING [train.py:1204] (2/4) Exclude cut with ID 2411-132532-0017-82279-0_sp1.1 from training. Duration: 0.9681875 2023-10-02 06:39:44,733 WARNING [train.py:1204] (2/4) Exclude cut with ID _14170_210_5219_1_1530525949868_4469650_272-249985-0_sp0.9 from training. Duration: 0.7444375 2023-10-02 06:39:44,805 WARNING [train.py:1204] (2/4) Exclude cut with ID _55644_210_11856_1_1534593599582_4103719_320-229006-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 06:39:46,162 WARNING [train.py:1204] (2/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_177-100554-0_sp1.1 from training. Duration: 0.963625 2023-10-02 06:39:46,165 WARNING [train.py:1204] (2/4) Exclude cut with ID _14998_210_5219_1_1530705582080_7474970_787-294416-0_sp1.1 from training. Duration: 0.7454375 2023-10-02 06:39:50,290 WARNING [train.py:1204] (2/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_318-59097-0 from training. Duration: 0.76 2023-10-02 06:39:50,354 WARNING [train.py:1204] (2/4) Exclude cut with ID _12369_210_4381_1_1530356078581_4388520_480-13146-0_sp1.1 from training. Duration: 0.77275 2023-10-02 06:39:51,821 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_469-338558-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 06:39:53,111 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_40896_210_15710_1_1533433956_7100802_367-126897-0_sp1.1 from training. Duration: 0.963625 2023-10-02 06:39:53,140 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6545_210_2040_1_1527774670_4245258_379-14182-0 from training. Duration: 0.8 2023-10-02 06:39:54,397 WARNING [train.py:1204] (2/4) Exclude cut with ID _21631_210_2253_1_1531907880778_3160339_197-79498-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 06:39:54,401 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_12868_210_6753_1_1530702479_3610564_416-293746-0_sp1.1 from training. Duration: 0.8181875 2023-10-02 06:39:54,455 WARNING [train.py:1204] (2/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_52-211683-0 from training. Duration: 0.9 2023-10-02 06:40:00,703 WARNING [train.py:1204] (2/4) Exclude cut with ID _48678_210_5663_1_1533988830947_3769199_306-255886-0_sp0.9 from training. Duration: 0.8555625 2023-10-02 06:40:02,261 WARNING [train.py:1204] (2/4) Exclude cut with ID _65134_210_14763_1_1535335393985_3505368_296-24733-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 06:40:06,448 WARNING [train.py:1204] (2/4) Exclude cut with ID _14898_210_9206_1_1530766812377_4177190_335-99364-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 06:40:07,829 WARNING [train.py:1204] (2/4) Exclude cut with ID _54871_210_22356_1_1534665798253_3856359_19-342632-0 from training. Duration: 0.91 2023-10-02 06:40:07,861 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_53071_210_16554_1_1534490405_2954066_265-170472-0 from training. Duration: 0.83 2023-10-02 06:40:10,913 INFO [train.py:1046] (2/4) Epoch 23, batch 1000, loss[loss=0.1571, simple_loss=0.2399, pruned_loss=0.03717, over 24665.00 frames. ], tot_loss[loss=0.1727, simple_loss=0.2482, pruned_loss=0.0486, over 4650528.55 frames. ], batch size: 65, lr: 4.49e-03, grad_scale: 16.0 2023-10-02 06:40:12,469 WARNING [train.py:1204] (2/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_189-298172-0_sp1.1 from training. Duration: 0.963625 2023-10-02 06:40:14,641 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7615_210_4211_1_1528110965_4580160_72-235211-0 from training. Duration: 0.76 2023-10-02 06:40:14,678 WARNING [train.py:1204] (2/4) Exclude cut with ID _47790_210_16187_1_1533862814066_4207519_155-90874-0_sp1.1 from training. Duration: 0.936375 2023-10-02 06:40:21,495 WARNING [train.py:1204] (2/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_164-216310-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 06:40:21,599 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_46013_210_7234_1_1533776362_3744032_463-52378-0 from training. Duration: 0.86 2023-10-02 06:40:21,603 WARNING [train.py:1204] (2/4) Exclude cut with ID _61133_210_9969_1_1535196777227_3696409_333-270325-0 from training. Duration: 0.93 2023-10-02 06:40:25,845 WARNING [train.py:1204] (2/4) Exclude cut with ID _5172_210_1883_1_1527246062567_3577219_18-101380-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 06:40:25,853 WARNING [train.py:1204] (2/4) Exclude cut with ID _30800_210_22382_1_1532499347904_4515959_577-236858-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 06:40:26,122 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=785840.0, ans=0.1 2023-10-02 06:40:27,317 WARNING [train.py:1204] (2/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_1006-177345-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 06:40:29,259 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6291_210_3273_1_1527683649_1078498_4-266348-0 from training. Duration: 0.78 2023-10-02 06:40:33,819 WARNING [train.py:1204] (2/4) Exclude cut with ID _55857_210_2346_1_1534644007831_6898209_745-98391-0 from training. Duration: 0.76 2023-10-02 06:40:35,303 WARNING [train.py:1204] (2/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_375-244976-0 from training. Duration: 0.75 2023-10-02 06:40:37,054 WARNING [train.py:1204] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_4-248404-0_sp1.1 from training. Duration: 0.97275 2023-10-02 06:40:37,177 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8453_210_4211_1_1528502940_8562730_596-306820-0 from training. Duration: 0.97 2023-10-02 06:40:37,468 INFO [scaling.py:1118] (2/4) WithLoss: name=encoder.encoders.3.encoder.layers.3.self_attn_weights, loss-sum=0.000e+00 2023-10-02 06:40:38,676 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_211-339622-0_sp0.9 from training. Duration: 0.5 2023-10-02 06:40:38,710 WARNING [train.py:1204] (2/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_393-57888-0 from training. Duration: 0.64 2023-10-02 06:40:40,047 WARNING [train.py:1204] (2/4) Exclude cut with ID _23825_210_13449_1_1531915313526_3497150_143-343575-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 06:40:41,543 WARNING [train.py:1204] (2/4) Exclude cut with ID _58605_210_12210_1_1534723195939_3904259_36-12256-0_sp1.1 from training. Duration: 0.936375 2023-10-02 06:40:42,775 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.478e+02 1.822e+02 2.055e+02 2.435e+02 3.236e+02, threshold=4.111e+02, percent-clipped=0.0 2023-10-02 06:40:50,278 WARNING [train.py:1204] (2/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_38-67795-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 06:40:51,667 WARNING [train.py:1204] (2/4) Exclude cut with ID _64631_210_27767_1_1535349580068_4165160_308-327832-0_sp1.1 from training. Duration: 0.92725 2023-10-02 06:40:51,737 WARNING [train.py:1204] (2/4) Exclude cut with ID _37086_210_9038_1_1532999540891_7021250_726-81176-0_sp1.1 from training. Duration: 0.936375 2023-10-02 06:40:53,272 WARNING [train.py:1204] (2/4) Exclude cut with ID _47790_210_16187_1_1533862814066_4207519_47-141521-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 06:40:53,277 WARNING [train.py:1204] (2/4) Exclude cut with ID _3067_210_2667_1_1526212974640_4503168_250-285937-0 from training. Duration: 0.8 2023-10-02 06:40:54,573 WARNING [train.py:1204] (2/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_239-24000-0_sp1.1 from training. Duration: 0.97275 2023-10-02 06:40:54,645 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_3371_210_1794_1_1526384462_5580777_396-339812-0_sp0.9 from training. Duration: 0.9444375 2023-10-02 06:40:55,965 WARNING [train.py:1204] (2/4) Exclude cut with ID _16909_210_5724_1_1531292079912_4118919_580-173454-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 06:40:56,017 WARNING [train.py:1204] (2/4) Exclude cut with ID IC0124W0002-61308 from training. Duration: 0.982 2023-10-02 06:40:58,787 WARNING [train.py:1204] (2/4) Exclude cut with ID _31602_210_5854_1_1532677021456_4458250_249-96604-0 from training. Duration: 0.89 2023-10-02 06:41:00,110 WARNING [train.py:1204] (2/4) Exclude cut with ID _69584_210_5219_1_1535785133775_4044450_325-7656-0 from training. Duration: 0.86 2023-10-02 06:41:01,588 WARNING [train.py:1204] (2/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_478-162247-0 from training. Duration: 0.82 2023-10-02 06:41:03,550 WARNING [train.py:1204] (2/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_686-54383-0_sp1.1 from training. Duration: 0.7909375 2023-10-02 06:41:09,777 WARNING [train.py:1204] (2/4) Exclude cut with ID _29303_210_2575_1_1532392237303_7410649_85-270092-0_sp1.1 from training. Duration: 0.936375 2023-10-02 06:41:09,915 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.feed_forward3.hidden_balancer.prob, batch_count=786040.0, ans=0.125 2023-10-02 06:41:11,093 WARNING [train.py:1204] (2/4) Exclude cut with ID _2322_210_2667_1_1525604548971_3656299_440-80494-0_sp1.1 from training. Duration: 0.8 2023-10-02 06:41:11,123 WARNING [train.py:1204] (2/4) Exclude cut with ID _47346_210_9476_1_1533815971553_3536160_201-40644-0_sp1.1 from training. Duration: 0.936375 2023-10-02 06:41:12,421 WARNING [train.py:1204] (2/4) Exclude cut with ID _56235_210_10640_1_1534557335235_4161099_343-139898-0_sp1.1 from training. Duration: 0.87275 2023-10-02 06:41:13,844 WARNING [train.py:1204] (2/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_1012-279473-0 from training. Duration: 0.67 2023-10-02 06:41:13,944 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7931_210_1794_1_1528594728_5564059_280-279560-0_sp1.1 from training. Duration: 0.8909375 2023-10-02 06:41:15,323 WARNING [train.py:1204] (2/4) Exclude cut with ID _8263_210_1664_1_1528439452891_970220_68-181805-0 from training. Duration: 0.64 2023-10-02 06:41:15,377 WARNING [train.py:1204] (2/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_605-133395-0 from training. Duration: 0.66 2023-10-02 06:41:17,384 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_43-171109-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 06:41:17,387 WARNING [train.py:1204] (2/4) Exclude cut with ID _40094_210_16380_1_1533294005773_3641159_107-280739-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 06:41:18,873 WARNING [train.py:1204] (2/4) Exclude cut with ID _14821_210_8750_1_1530680503991_6829359_2-227151-0_sp0.9 from training. Duration: 0.92225 2023-10-02 06:41:19,929 INFO [scaling.py:1022] (2/4) Whitening: name=encoder_embed.out_whiten, num_groups=1, num_channels=192, metric=7.17 vs. limit=8.0 2023-10-02 06:41:21,640 WARNING [train.py:1204] (2/4) Exclude cut with ID _14268_210_9206_1_1530622771285_4714099_331-105351-0_sp1.1 from training. Duration: 0.5909375 2023-10-02 06:41:24,239 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_26326_210_7925_1_1533002493_3592406_451-82741-0_sp1.1 from training. Duration: 0.97275 2023-10-02 06:41:25,580 INFO [train.py:1046] (2/4) Epoch 23, batch 1050, loss[loss=0.1812, simple_loss=0.2608, pruned_loss=0.0508, over 23366.00 frames. ], tot_loss[loss=0.1717, simple_loss=0.2474, pruned_loss=0.04803, over 4664531.68 frames. ], batch size: 93, lr: 4.49e-03, grad_scale: 16.0 2023-10-02 06:41:25,776 WARNING [train.py:1204] (2/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_191-59784-0_sp0.9 from training. Duration: 0.97775 2023-10-02 06:41:25,951 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.1.feed_forward1.out_proj.dropout_p, batch_count=786106.6666666666, ans=0.1 2023-10-02 06:41:27,122 WARNING [train.py:1204] (2/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_445-117398-0_sp1.1 from training. Duration: 0.8090625 2023-10-02 06:41:28,674 WARNING [train.py:1204] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_605-257205-0_sp0.9 from training. Duration: 0.6555625 2023-10-02 06:41:30,037 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_50050_210_16223_1_1535435796_7290138_592-258841-0_sp1.1 from training. Duration: 0.936375 2023-10-02 06:41:32,642 WARNING [train.py:1204] (2/4) Exclude cut with ID _14348_210_8341_1_1530599434996_3778640_240-232370-0_sp0.9 from training. Duration: 0.9333125 2023-10-02 06:41:32,937 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.feed_forward1.out_proj.dropout_p, batch_count=786106.6666666666, ans=0.1 2023-10-02 06:41:33,892 WARNING [train.py:1204] (2/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_239-316802-0_sp0.9 from training. Duration: 0.7666875 2023-10-02 06:41:35,713 WARNING [train.py:1204] (2/4) Exclude cut with ID _45560_210_21982_1_1533812603951_3584999_111-6376-0_sp1.1 from training. Duration: 0.763625 2023-10-02 06:41:37,244 WARNING [train.py:1204] (2/4) Exclude cut with ID _54189_210_16380_1_1534332670039_3911710_606-311481-0_sp1.1 from training. Duration: 0.963625 2023-10-02 06:41:38,559 WARNING [train.py:1204] (2/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_237-332027-0_sp1.1 from training. Duration: 0.863625 2023-10-02 06:41:38,595 WARNING [train.py:1204] (2/4) Exclude cut with ID _66070_210_7640_1_1535441393096_7805310_489-292631-0_sp0.9 from training. Duration: 0.87775 2023-10-02 06:41:39,927 WARNING [train.py:1204] (2/4) Exclude cut with ID _49728_210_12651_1_1534041034294_6788150_624-195301-0_sp1.1 from training. Duration: 0.77275 2023-10-02 06:41:39,971 WARNING [train.py:1204] (2/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_44-79427-0 from training. Duration: 0.98 2023-10-02 06:41:41,305 WARNING [train.py:1204] (2/4) Exclude cut with ID _35350_210_16040_1_1533040191183_4075299_490-198354-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 06:41:41,345 WARNING [train.py:1204] (2/4) Exclude cut with ID _5166_210_2084_1_1527073197160_4087993_309-222545-0 from training. Duration: 0.84 2023-10-02 06:41:44,459 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_13091_210_7925_1_1530609447_8987354_790-199469-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 06:41:44,465 WARNING [train.py:1204] (2/4) Exclude cut with ID _21632_210_2253_1_1531994001765_3744140_110-274654-0 from training. Duration: 0.82 2023-10-02 06:41:45,800 WARNING [train.py:1204] (2/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_498-40153-0_sp0.9 from training. Duration: 0.7 2023-10-02 06:41:50,431 WARNING [train.py:1204] (2/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_363-282909-0_sp1.1 from training. Duration: 0.936375 2023-10-02 06:41:51,828 WARNING [train.py:1204] (2/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_178-177582-0_sp0.9 from training. Duration: 0.82225 2023-10-02 06:41:51,839 WARNING [train.py:1204] (2/4) Exclude cut with ID _24612_210_8913_1_1531998054193_3806359_357-35487-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 06:41:54,591 WARNING [train.py:1204] (2/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_138-332773-0 from training. Duration: 0.81 2023-10-02 06:41:54,619 WARNING [train.py:1204] (2/4) Exclude cut with ID _7624_210_1531_1_1528109802198_4933101_418-349834-0 from training. Duration: 0.6 2023-10-02 06:41:54,649 WARNING [train.py:1204] (2/4) Exclude cut with ID _7537_210_2084_1_1528502395431_3924359_14-312906-0_sp0.9 from training. Duration: 0.9333125 2023-10-02 06:41:58,684 WARNING [train.py:1204] (2/4) Exclude cut with ID _60057_210_10640_1_1534903065793_3333150_362-230377-0 from training. Duration: 0.87 2023-10-02 06:41:59,659 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.1.encoder.layers.0.conv_module1.whiten, num_groups=1, num_channels=256, metric=9.34 vs. limit=15.0 2023-10-02 06:42:02,003 WARNING [train.py:1204] (2/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_138-64210-0 from training. Duration: 0.73 2023-10-02 06:42:03,863 WARNING [train.py:1204] (2/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_454-339504-0_sp1.1 from training. Duration: 0.97275 2023-10-02 06:42:06,233 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.balancer1.prob, batch_count=786240.0, ans=0.125 2023-10-02 06:42:07,451 WARNING [train.py:1204] (2/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_508-335369-0_sp0.9 from training. Duration: 0.5333125 2023-10-02 06:42:10,092 WARNING [train.py:1204] (2/4) Exclude cut with ID _5608_210_2106_1_1527415439089_6819768_909-82767-0_sp0.9 from training. Duration: 0.67775 2023-10-02 06:42:10,137 WARNING [train.py:1204] (2/4) Exclude cut with ID _6694_210_4599_1_1527852637490_3965529_99-11611-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 06:42:11,519 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_2655_210_2087_1_1525688959_3897400_52-279899-0_sp1.1 from training. Duration: 0.62725 2023-10-02 06:42:13,233 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.feed_forward3.hidden_balancer.prob, batch_count=786306.6666666666, ans=0.125 2023-10-02 06:42:15,834 WARNING [train.py:1204] (2/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_375-245677-0_sp1.1 from training. Duration: 0.736375 2023-10-02 06:42:18,122 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_23850_210_4238_1_1532232597_1632433_32-185135-0 from training. Duration: 0.86 2023-10-02 06:42:20,882 WARNING [train.py:1204] (2/4) Exclude cut with ID _15113_210_8254_1_1530842438579_3716279_209-54533-0 from training. Duration: 0.96 2023-10-02 06:42:20,928 WARNING [train.py:1204] (2/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_401-53423-0 from training. Duration: 0.55 2023-10-02 06:42:20,957 WARNING [train.py:1204] (2/4) Exclude cut with ID _14867_210_1968_1_1530684179890_3846969_319-250868-0_sp1.1 from training. Duration: 0.9 2023-10-02 06:42:20,971 WARNING [train.py:1204] (2/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_106-30782-0_sp0.9 from training. Duration: 0.888875 2023-10-02 06:42:22,349 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_5960_210_1794_1_1527403865_3990999_515-2613-0 from training. Duration: 0.88 2023-10-02 06:42:24,357 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.4.encoder.layers.0.self_attn2.whiten, num_groups=1, num_channels=384, metric=13.11 vs. limit=22.5 2023-10-02 06:42:27,917 WARNING [train.py:1204] (2/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_798-106179-0_sp0.9 from training. Duration: 0.92225 2023-10-02 06:42:29,424 WARNING [train.py:1204] (2/4) Exclude cut with ID _14227_210_4381_1_1530701804286_3940259_125-275642-0_sp1.1 from training. Duration: 0.9 2023-10-02 06:42:29,427 WARNING [train.py:1204] (2/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_79-200392-0_sp1.1 from training. Duration: 0.8818125 2023-10-02 06:42:30,747 WARNING [train.py:1204] (2/4) Exclude cut with ID _48836_210_12210_1_1534075848293_3904150_429-35475-0_sp0.9 from training. Duration: 0.988875 2023-10-02 06:42:30,774 WARNING [train.py:1204] (2/4) Exclude cut with ID _69884_210_9771_1_1535846392818_4166169_224-172517-0_sp1.1 from training. Duration: 0.97275 2023-10-02 06:42:34,257 WARNING [train.py:1204] (2/4) Exclude cut with ID _15113_210_8254_1_1530842438579_3716279_76-232507-0_sp1.1 from training. Duration: 0.97275 2023-10-02 06:42:34,260 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_135-5501-0 from training. Duration: 0.98 2023-10-02 06:42:37,538 WARNING [train.py:1204] (2/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_563-347832-0_sp0.9 from training. Duration: 0.988875 2023-10-02 06:42:37,548 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6786_210_1388_1_1527839699_4194495_273-149841-0 from training. Duration: 0.92 2023-10-02 06:42:37,576 WARNING [train.py:1204] (2/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_172-215932-0 from training. Duration: 0.66 2023-10-02 06:42:39,584 WARNING [train.py:1204] (2/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_939-132910-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 06:42:42,262 INFO [train.py:1046] (2/4) Epoch 23, batch 1100, loss[loss=0.1769, simple_loss=0.2528, pruned_loss=0.05048, over 23250.00 frames. ], tot_loss[loss=0.1714, simple_loss=0.247, pruned_loss=0.04789, over 4682712.78 frames. ], batch size: 105, lr: 4.49e-03, grad_scale: 16.0 2023-10-02 06:42:43,758 WARNING [train.py:1204] (2/4) Exclude cut with ID _49371_210_12071_1_1533952761021_3812190_16-246001-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 06:42:46,931 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.nonlin_attention.balancer.min_positive, batch_count=786440.0, ans=0.05 2023-10-02 06:42:48,040 WARNING [train.py:1204] (2/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_680-189165-0_sp1.1 from training. Duration: 0.936375 2023-10-02 06:42:51,473 WARNING [train.py:1204] (2/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_53-300544-0_sp1.1 from training. Duration: 0.6090625 2023-10-02 06:42:52,846 WARNING [train.py:1204] (2/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_540-275685-0_sp1.1 from training. Duration: 0.8090625 2023-10-02 06:42:52,876 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_38965_210_4852_1_1533174906_3484735_453-106579-0_sp1.1 from training. Duration: 0.92725 2023-10-02 06:42:52,926 WARNING [train.py:1204] (2/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_105-328174-0 from training. Duration: 0.76 2023-10-02 06:42:55,545 WARNING [train.py:1204] (2/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_279-126205-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 06:42:58,355 WARNING [train.py:1204] (2/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_5-55893-0_sp0.9 from training. Duration: 0.611125 2023-10-02 06:42:59,803 WARNING [train.py:1204] (2/4) Exclude cut with ID _13678_210_7497_1_1530530069226_3463170_141-10835-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 06:43:02,597 WARNING [train.py:1204] (2/4) Exclude cut with ID _22083_210_8254_1_1531702862439_3678970_201-85738-0_sp1.1 from training. Duration: 0.8181875 2023-10-02 06:43:02,631 WARNING [train.py:1204] (2/4) Exclude cut with ID _4697_210_2069_1_1526815801953_3823098_51-1049-0 from training. Duration: 0.75 2023-10-02 06:43:03,946 WARNING [train.py:1204] (2/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_206-10097-0_sp0.9 from training. Duration: 0.5444375 2023-10-02 06:43:05,427 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_4528_210_3273_1_1527076654_3276777_360-63661-0_sp1.1 from training. Duration: 0.92725 2023-10-02 06:43:05,435 WARNING [train.py:1204] (2/4) Exclude cut with ID _64086_210_7994_1_1535270417019_7435949_449-31567-0_sp1.1 from training. Duration: 0.8818125 2023-10-02 06:43:09,455 WARNING [train.py:1204] (2/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_245-174553-0_sp0.9 from training. Duration: 0.97775 2023-10-02 06:43:10,059 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.3.encoder.layers.3.self_attn_weights.whiten_keys, num_groups=8, num_channels=256, metric=3.93 vs. limit=6.0 2023-10-02 06:43:10,173 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.2.encoder.layers.1.nonlin_attention.whiten1, num_groups=1, num_channels=288, metric=5.26 vs. limit=10.0 2023-10-02 06:43:12,305 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6545_210_2040_1_1527774670_4245258_379-14182-0_sp1.1 from training. Duration: 0.72725 2023-10-02 06:43:13,580 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.528e+02 1.798e+02 1.908e+02 2.172e+02 3.443e+02, threshold=3.816e+02, percent-clipped=0.0 2023-10-02 06:43:15,173 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.0.feed_forward1.hidden_balancer.prob, batch_count=786573.3333333334, ans=0.125 2023-10-02 06:43:16,428 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_3133_210_2087_1_1526106446_2324690_256-206070-0_sp1.1 from training. Duration: 0.963625 2023-10-02 06:43:18,085 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.feed_forward1.out_proj.dropout_p, batch_count=786573.3333333334, ans=0.1 2023-10-02 06:43:19,256 WARNING [train.py:1204] (2/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_278-104676-0 from training. Duration: 0.98 2023-10-02 06:43:19,322 WARNING [train.py:1204] (2/4) Exclude cut with ID IC0671W0065-331908 from training. Duration: 0.896 2023-10-02 06:43:19,364 WARNING [train.py:1204] (2/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_244-129917-0_sp1.1 from training. Duration: 0.97275 2023-10-02 06:43:22,202 WARNING [train.py:1204] (2/4) Exclude cut with ID _7852_210_5219_1_1528286471316_8139400_1129-170857-0_sp1.1 from training. Duration: 0.97275 2023-10-02 06:43:22,276 WARNING [train.py:1204] (2/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_443-30034-0_sp0.9 from training. Duration: 0.711125 2023-10-02 06:43:24,094 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_31583_210_3852_1_1532595851_4024413_250-332999-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 06:43:25,557 WARNING [train.py:1204] (2/4) Exclude cut with ID _8772_210_1850_1_1528635595972_3664011_247-336448-0 from training. Duration: 0.74 2023-10-02 06:43:26,819 WARNING [train.py:1204] (2/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_567-135114-0_sp0.9 from training. Duration: 0.9666875 2023-10-02 06:43:26,823 WARNING [train.py:1204] (2/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_557-349534-0_sp1.1 from training. Duration: 0.82725 2023-10-02 06:43:26,835 WARNING [train.py:1204] (2/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_1341-26728-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 06:43:26,896 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_40222_210_6228_1_1533385298_4481939_375-328051-0_sp1.1 from training. Duration: 0.97275 2023-10-02 06:43:26,925 WARNING [train.py:1204] (2/4) Exclude cut with ID _4697_210_2069_1_1526815801953_3823098_276-96593-0 from training. Duration: 0.77 2023-10-02 06:43:28,454 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.bypass.skip_rate, batch_count=786640.0, ans=0.07 2023-10-02 06:43:32,803 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_53071_210_16554_1_1534490405_2954066_265-170472-0_sp0.9 from training. Duration: 0.92225 2023-10-02 06:43:32,837 WARNING [train.py:1204] (2/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_365-1492-0 from training. Duration: 0.91 2023-10-02 06:43:34,203 WARNING [train.py:1204] (2/4) Exclude cut with ID _66873_210_5517_1_1535454136506_4293370_654-52713-0_sp0.9 from training. Duration: 0.8555625 2023-10-02 06:43:35,079 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.1.encoder.layers.0.conv_module2.whiten, num_groups=1, num_channels=256, metric=4.99 vs. limit=15.0 2023-10-02 06:43:37,676 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.ff2_skip_rate, batch_count=786640.0, ans=0.0 2023-10-02 06:43:40,087 WARNING [train.py:1204] (2/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_113-92608-0_sp1.1 from training. Duration: 0.4909375 2023-10-02 06:43:42,637 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_46013_210_7234_1_1533776362_3744032_50-141535-0 from training. Duration: 0.99 2023-10-02 06:43:42,652 WARNING [train.py:1204] (2/4) Exclude cut with ID _13942_210_9988_1_1530604906036_3733360_499-55676-0_sp1.1 from training. Duration: 0.563625 2023-10-02 06:43:44,113 WARNING [train.py:1204] (2/4) Exclude cut with ID _30800_210_22382_1_1532499347904_4515959_375-65143-0_sp1.1 from training. Duration: 0.97275 2023-10-02 06:43:46,826 WARNING [train.py:1204] (2/4) Exclude cut with ID _16756_210_4915_1_1531047803763_7104259_199-325608-0_sp1.1 from training. Duration: 0.92725 2023-10-02 06:43:48,128 WARNING [train.py:1204] (2/4) Exclude cut with ID _31649_210_6228_1_1532584608063_4091078_443-124391-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 06:43:49,555 WARNING [train.py:1204] (2/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_146-218763-0 from training. Duration: 0.86 2023-10-02 06:43:49,588 WARNING [train.py:1204] (2/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_567-135114-0_sp1.1 from training. Duration: 0.7909375 2023-10-02 06:43:49,634 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_26326_210_7925_1_1533002493_3592406_462-288299-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 06:43:50,985 WARNING [train.py:1204] (2/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_322-217220-0 from training. Duration: 0.86 2023-10-02 06:43:51,026 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_238-256668-0_sp1.1 from training. Duration: 0.62725 2023-10-02 06:43:52,370 WARNING [train.py:1204] (2/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_371-18169-0 from training. Duration: 0.86 2023-10-02 06:43:52,472 WARNING [train.py:1204] (2/4) Exclude cut with ID _58480_210_12392_1_1534811121111_3646160_23-74512-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 06:43:52,476 WARNING [train.py:1204] (2/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_784-213099-0_sp1.1 from training. Duration: 0.8454375 2023-10-02 06:43:52,613 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.1.ff2_skip_rate, batch_count=786706.6666666666, ans=0.0 2023-10-02 06:43:54,587 WARNING [train.py:1204] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_625-102955-0_sp1.1 from training. Duration: 0.7 2023-10-02 06:43:57,346 INFO [train.py:1046] (2/4) Epoch 23, batch 1150, loss[loss=0.1746, simple_loss=0.2493, pruned_loss=0.05, over 23611.00 frames. ], tot_loss[loss=0.172, simple_loss=0.2481, pruned_loss=0.04794, over 4700484.81 frames. ], batch size: 256, lr: 4.49e-03, grad_scale: 16.0 2023-10-02 06:43:57,693 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.1.conv_skip_rate, batch_count=786773.3333333334, ans=0.0 2023-10-02 06:43:58,989 WARNING [train.py:1204] (2/4) Exclude cut with ID _49748_210_24116_1_1533986971737_3158910_176-174327-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 06:44:01,647 WARNING [train.py:1204] (2/4) Exclude cut with ID _50310_210_15488_1_1534068043345_3641080_432-96140-0_sp1.1 from training. Duration: 0.936375 2023-10-02 06:44:04,412 WARNING [train.py:1204] (2/4) Exclude cut with ID _40402_210_18219_1_1533522324115_2953150_76-14910-0_sp1.1 from training. Duration: 0.92725 2023-10-02 06:44:04,445 WARNING [train.py:1204] (2/4) Exclude cut with ID _21783_210_11357_1_1531742458730_3762679_40-19458-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 06:44:04,615 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.conv_module1.balancer2.prob, batch_count=786773.3333333334, ans=0.125 2023-10-02 06:44:05,813 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_14056_210_4163_1_1530604483_7572827_46-93547-0 from training. Duration: 0.95 2023-10-02 06:44:05,862 WARNING [train.py:1204] (2/4) Exclude cut with ID _21631_210_2253_1_1531907880778_3160339_321-35325-0_sp1.1 from training. Duration: 0.963625 2023-10-02 06:44:09,051 WARNING [train.py:1204] (2/4) Exclude cut with ID _4779_210_2084_1_1527318022944_3914893_500-285013-0 from training. Duration: 0.69 2023-10-02 06:44:09,803 WARNING [train.py:1204] (2/4) Exclude cut with ID _43552_210_8913_1_1533726031059_3523429_301-93324-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 06:44:09,809 WARNING [train.py:1204] (2/4) Exclude cut with ID _6473_210_1741_1_1527901307396_3815380_108-17335-0_sp1.1 from training. Duration: 0.5909375 2023-10-02 06:44:10,644 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.1.encoder.layers.1.nonlin_attention.whiten2, num_groups=1, num_channels=256, metric=6.57 vs. limit=15.0 2023-10-02 06:44:11,420 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.conv_module2.balancer2.min_abs, batch_count=786840.0, ans=0.5 2023-10-02 06:44:12,723 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.feed_forward2.hidden_balancer.prob, batch_count=786840.0, ans=0.125 2023-10-02 06:44:14,543 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.balancer2.prob, batch_count=786840.0, ans=0.125 2023-10-02 06:44:17,097 WARNING [train.py:1204] (2/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_206-10097-0 from training. Duration: 0.49 2023-10-02 06:44:19,942 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_36756_210_7234_1_1533026991_966276_103-77513-0_sp1.1 from training. Duration: 0.97275 2023-10-02 06:44:24,154 WARNING [train.py:1204] (2/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_662-18193-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 06:44:24,215 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_26433_210_12108_1_1532157158_4056480_439-52438-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 06:44:25,589 WARNING [train.py:1204] (2/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_464-33931-0 from training. Duration: 0.54 2023-10-02 06:44:25,603 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_3474_210_2028_1_1526188513_4825246_145-309131-0_sp0.9 from training. Duration: 0.82225 2023-10-02 06:44:25,613 WARNING [train.py:1204] (2/4) Exclude cut with ID _9679_210_7497_1_1529129212906_4363929_446-308321-0_sp1.1 from training. Duration: 0.963625 2023-10-02 06:44:29,060 WARNING [train.py:1204] (2/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_662-275621-0 from training. Duration: 0.42 2023-10-02 06:44:30,391 WARNING [train.py:1204] (2/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_344-337148-0_sp1.1 from training. Duration: 0.97275 2023-10-02 06:44:31,943 WARNING [train.py:1204] (2/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_759-179071-0_sp1.1 from training. Duration: 0.92725 2023-10-02 06:44:39,205 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.conv_module1.balancer1.min_positive, batch_count=786906.6666666666, ans=0.025 2023-10-02 06:44:39,230 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.ff2_skip_rate, batch_count=786906.6666666666, ans=0.0 2023-10-02 06:44:43,024 WARNING [train.py:1204] (2/4) Exclude cut with ID _28783_210_7943_1_1532671164808_7155479_293-12188-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 06:44:47,153 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_17613_210_2151_1_1531137569_2366160_235-320304-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 06:44:49,014 WARNING [train.py:1204] (2/4) Exclude cut with ID _14349_210_8341_1_1530615710818_3442470_170-336541-0 from training. Duration: 0.86 2023-10-02 06:44:49,068 WARNING [train.py:1204] (2/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_375-218677-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 06:44:50,351 WARNING [train.py:1204] (2/4) Exclude cut with ID _28317_210_12742_1_1532480302381_8510325_829-202956-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 06:44:54,625 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9029W0436-509823 from training. Duration: 0.768 2023-10-02 06:44:55,970 WARNING [train.py:1204] (2/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_231-112214-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 06:45:03,338 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9059W0470-524883 from training. Duration: 0.3415625 2023-10-02 06:45:06,464 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.feed_forward2.hidden_balancer.prob, batch_count=787040.0, ans=0.125 2023-10-02 06:45:07,606 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_43266_210_1794_1_1533521789_4497218_206-79512-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 06:45:09,033 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_294-163793-0_sp0.9 from training. Duration: 0.911125 2023-10-02 06:45:10,269 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6290_210_3273_1_1527595224_1282787_115-87095-0_sp1.1 from training. Duration: 0.5 2023-10-02 06:45:10,304 WARNING [train.py:1204] (2/4) Exclude cut with ID _14100_210_8341_1_1530529320613_3678069_279-55760-0_sp1.1 from training. Duration: 0.8454375 2023-10-02 06:45:12,374 INFO [train.py:1046] (2/4) Epoch 23, batch 1200, loss[loss=0.1809, simple_loss=0.2626, pruned_loss=0.04966, over 23623.00 frames. ], tot_loss[loss=0.1731, simple_loss=0.2496, pruned_loss=0.04829, over 4703672.72 frames. ], batch size: 85, lr: 4.49e-03, grad_scale: 32.0 2023-10-02 06:45:12,459 WARNING [train.py:1204] (2/4) Exclude cut with ID _14621_210_1925_1_1530686901185_3675420_419-234200-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 06:45:17,316 WARNING [train.py:1204] (2/4) Exclude cut with ID _60229_210_27767_1_1534838406158_3655350_353-213656-0_sp1.1 from training. Duration: 0.863625 2023-10-02 06:45:17,324 WARNING [train.py:1204] (2/4) Exclude cut with ID _48678_210_5663_1_1533988830947_3769199_306-255886-0_sp1.1 from training. Duration: 0.7 2023-10-02 06:45:19,110 WARNING [train.py:1204] (2/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_466-3492-0_sp1.1 from training. Duration: 0.97275 2023-10-02 06:45:19,110 WARNING [train.py:1204] (2/4) Exclude cut with ID _8755_210_1876_1_1528628559357_3258209_199-242167-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 06:45:19,158 WARNING [train.py:1204] (2/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_237-89684-0_sp1.1 from training. Duration: 0.87275 2023-10-02 06:45:21,812 WARNING [train.py:1204] (2/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_70-84121-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 06:45:23,126 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7544_210_6753_1_1528587722_3576686_172-171750-0_sp1.1 from training. Duration: 0.5909375 2023-10-02 06:45:24,532 WARNING [train.py:1204] (2/4) Exclude cut with ID _24271_210_12658_1_1531987270022_7190519_40-321822-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 06:45:24,549 WARNING [train.py:1204] (2/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_227-83612-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 06:45:27,458 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9156W0015-572508 from training. Duration: 0.896 2023-10-02 06:45:30,170 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_292-265928-0 from training. Duration: 0.51 2023-10-02 06:45:33,455 WARNING [train.py:1204] (2/4) Exclude cut with ID _14747_210_8341_1_1530685797463_3654089_147-131229-0_sp1.1 from training. Duration: 0.6909375 2023-10-02 06:45:36,183 WARNING [train.py:1204] (2/4) Exclude cut with ID _45297_210_2721_1_1533715351381_3264949_351-147136-0_sp0.9 from training. Duration: 0.9666875 2023-10-02 06:45:37,667 WARNING [train.py:1204] (2/4) Exclude cut with ID _5250_210_5219_1_1527249561806_8692110_1102-324568-0_sp1.1 from training. Duration: 0.97275 2023-10-02 06:45:39,080 WARNING [train.py:1204] (2/4) Exclude cut with ID _18516_210_9206_1_1531704653206_3692406_465-75446-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 06:45:39,289 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.bypass.skip_rate, batch_count=787173.3333333334, ans=0.07 2023-10-02 06:45:40,374 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9203W0126-595623 from training. Duration: 0.896 2023-10-02 06:45:40,688 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.balancer2.prob, batch_count=787240.0, ans=0.125 2023-10-02 06:45:42,410 WARNING [train.py:1204] (2/4) Exclude cut with ID _55383_210_10635_1_1534468426172_2231669_139-328410-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 06:45:43,664 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.418e+02 1.812e+02 2.009e+02 2.421e+02 3.393e+02, threshold=4.017e+02, percent-clipped=0.0 2023-10-02 06:45:49,889 WARNING [train.py:1204] (2/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_166-246557-0_sp0.9 from training. Duration: 0.711125 2023-10-02 06:45:49,893 WARNING [train.py:1204] (2/4) Exclude cut with ID _30039_210_13270_1_1532484612900_2725150_239-304872-0_sp1.1 from training. Duration: 0.963625 2023-10-02 06:45:49,924 WARNING [train.py:1204] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_6-226750-0 from training. Duration: 0.94 2023-10-02 06:45:51,787 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_135-5501-0_sp1.1 from training. Duration: 0.8909375 2023-10-02 06:45:54,666 WARNING [train.py:1204] (2/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_359-118926-0 from training. Duration: 0.72 2023-10-02 06:45:58,587 WARNING [train.py:1204] (2/4) Exclude cut with ID _8064_210_5399_1_1528372299291_4282973_4-91037-0 from training. Duration: 0.88 2023-10-02 06:45:58,613 WARNING [train.py:1204] (2/4) Exclude cut with ID _6653_210_2629_1_1527919172441_6732679_415-288492-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 06:45:58,819 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.feed_forward1.hidden_balancer.prob, batch_count=787306.6666666666, ans=0.125 2023-10-02 06:45:59,961 WARNING [train.py:1204] (2/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_544-287179-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 06:46:01,385 WARNING [train.py:1204] (2/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_122-32606-0_sp1.1 from training. Duration: 0.92725 2023-10-02 06:46:02,657 WARNING [train.py:1204] (2/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_121-284152-0_sp1.1 from training. Duration: 0.536375 2023-10-02 06:46:02,757 WARNING [train.py:1204] (2/4) Exclude cut with ID _44718_210_5816_1_1534050051869_6469030_350-299477-0_sp1.1 from training. Duration: 0.97275 2023-10-02 06:46:02,767 WARNING [train.py:1204] (2/4) Exclude cut with ID _6653_210_2629_1_1527919172441_6732679_1103-56445-0_sp0.9 from training. Duration: 0.811125 2023-10-02 06:46:04,053 WARNING [train.py:1204] (2/4) Exclude cut with ID _12369_210_4381_1_1530356078581_4388520_64-298917-0_sp0.9 from training. Duration: 0.97775 2023-10-02 06:46:05,371 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_2245_210_2715_1_1525511548_3540254_310-52483-0 from training. Duration: 0.88 2023-10-02 06:46:05,423 WARNING [train.py:1204] (2/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_401-158239-0_sp1.1 from training. Duration: 0.6454375 2023-10-02 06:46:07,422 WARNING [train.py:1204] (2/4) Exclude cut with ID _59502_210_12210_1_1535107024698_1835139_146-162495-0_sp1.1 from training. Duration: 0.836375 2023-10-02 06:46:07,439 WARNING [train.py:1204] (2/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_41-31157-0_sp0.9 from training. Duration: 0.6333125 2023-10-02 06:46:08,846 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_68717_210_3528_1_1535628683_1556759_95-36722-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 06:46:08,852 WARNING [train.py:1204] (2/4) Exclude cut with ID _30196_210_1664_1_1533708496522_7074140_654-162611-0_sp1.1 from training. Duration: 0.92725 2023-10-02 06:46:13,596 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_9302_210_2550_1_1528889962_4655416_638-301620-0_sp0.9 from training. Duration: 0.52225 2023-10-02 06:46:16,346 WARNING [train.py:1204] (2/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_478-162247-0_sp1.1 from training. Duration: 0.7454375 2023-10-02 06:46:19,148 WARNING [train.py:1204] (2/4) Exclude cut with ID _45297_210_2721_1_1533715351381_3264949_353-9541-0 from training. Duration: 0.97 2023-10-02 06:46:22,527 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9653W0190-675468 from training. Duration: 0.9935 2023-10-02 06:46:24,480 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_49730_210_13049_1_1534075294_1830676_214-84436-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 06:46:25,987 WARNING [train.py:1204] (2/4) Exclude cut with ID _50093_210_4220_1_1534417141207_3432299_259-322252-0_sp1.1 from training. Duration: 0.836375 2023-10-02 06:46:27,480 INFO [train.py:1046] (2/4) Epoch 23, batch 1250, loss[loss=0.2353, simple_loss=0.294, pruned_loss=0.08834, over 19206.00 frames. ], tot_loss[loss=0.1737, simple_loss=0.2505, pruned_loss=0.04843, over 4705024.18 frames. ], batch size: 389, lr: 4.49e-03, grad_scale: 32.0 2023-10-02 06:46:27,611 WARNING [train.py:1204] (2/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_599-50220-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 06:46:28,923 WARNING [train.py:1204] (2/4) Exclude cut with ID _21878_210_3525_1_1531738772002_7264330_174-109016-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 06:46:33,058 WARNING [train.py:1204] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_665-196155-0 from training. Duration: 0.67 2023-10-02 06:46:37,917 WARNING [train.py:1204] (2/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_656-188361-0_sp1.1 from training. Duration: 0.7909375 2023-10-02 06:46:37,986 WARNING [train.py:1204] (2/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_60-18123-0_sp1.1 from training. Duration: 0.8 2023-10-02 06:46:39,301 WARNING [train.py:1204] (2/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_535-14410-0 from training. Duration: 0.47 2023-10-02 06:46:39,525 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.conv_module2.balancer1.prob, batch_count=787440.0, ans=0.125 2023-10-02 06:46:40,728 WARNING [train.py:1204] (2/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_94-126128-0_sp1.1 from training. Duration: 0.7818125 2023-10-02 06:46:42,137 WARNING [train.py:1204] (2/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_356-31216-0_sp1.1 from training. Duration: 0.6818125 2023-10-02 06:46:42,428 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.feed_forward3.hidden_balancer.prob, batch_count=787506.6666666666, ans=0.125 2023-10-02 06:46:45,322 WARNING [train.py:1204] (2/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_443-30034-0_sp1.1 from training. Duration: 0.5818125 2023-10-02 06:46:45,387 WARNING [train.py:1204] (2/4) Exclude cut with ID _26907_210_7234_1_1532221740694_3205263_337-2494-0_sp1.1 from training. Duration: 0.8 2023-10-02 06:46:46,832 WARNING [train.py:1204] (2/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_49-97158-0_sp0.9 from training. Duration: 0.8444375 2023-10-02 06:46:46,852 WARNING [train.py:1204] (2/4) Exclude cut with ID _7877_210_6014_1_1528286358099_3750136_553-170128-0_sp1.1 from training. Duration: 0.963625 2023-10-02 06:46:48,502 WARNING [train.py:1204] (2/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_202-293523-0_sp0.9 from training. Duration: 0.8 2023-10-02 06:46:53,012 WARNING [train.py:1204] (2/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_188-171673-0_sp0.9 from training. Duration: 0.6444375 2023-10-02 06:46:53,038 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_120-41668-0_sp1.1 from training. Duration: 0.636375 2023-10-02 06:46:53,042 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_40223_210_6228_1_1533298404_4812267_613-78769-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 06:46:54,386 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_53861_210_5399_1_1534422156_4272077_150-336899-0_sp1.1 from training. Duration: 0.963625 2023-10-02 06:46:54,439 WARNING [train.py:1204] (2/4) Exclude cut with ID _14459_210_2228_1_1531443641946_7516700_1146-56843-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 06:46:57,789 WARNING [train.py:1204] (2/4) Exclude cut with ID _39867_210_9826_1_1533553222030_9344259_1194-299928-0_sp1.1 from training. Duration: 0.92725 2023-10-02 06:46:57,877 WARNING [train.py:1204] (2/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_214-291369-0_sp0.9 from training. Duration: 0.7 2023-10-02 06:47:02,289 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.nonlin_attention.balancer.max_positive, batch_count=787573.3333333334, ans=0.95 2023-10-02 06:47:03,430 WARNING [train.py:1204] (2/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_358-215558-0 from training. Duration: 0.93 2023-10-02 06:47:03,490 WARNING [train.py:1204] (2/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_577-287150-0_sp0.9 from training. Duration: 0.911125 2023-10-02 06:47:06,770 WARNING [train.py:1204] (2/4) Exclude cut with ID _60860_210_8254_1_1534896015875_3557169_229-187367-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 06:47:08,147 WARNING [train.py:1204] (2/4) Exclude cut with ID _6275_210_2084_1_1528110062213_7669049_857-298942-0 from training. Duration: 0.85 2023-10-02 06:47:09,436 WARNING [train.py:1204] (2/4) Exclude cut with ID _40137_210_12210_1_1533290484046_3795180_249-187059-0_sp1.1 from training. Duration: 0.8 2023-10-02 06:47:09,443 WARNING [train.py:1204] (2/4) Exclude cut with ID ID0171W0508-765603 from training. Duration: 0.541 2023-10-02 06:47:09,475 WARNING [train.py:1204] (2/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_521-219439-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 06:47:09,482 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_40223_210_6228_1_1533298404_4812267_741-302616-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 06:47:12,336 WARNING [train.py:1204] (2/4) Exclude cut with ID _65293_210_9070_1_1535545549765_4004210_438-154735-0_sp1.1 from training. Duration: 0.92725 2023-10-02 06:47:15,710 WARNING [train.py:1204] (2/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_564-132950-0_sp1.1 from training. Duration: 0.92725 2023-10-02 06:47:15,758 WARNING [train.py:1204] (2/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_291-270881-0_sp1.1 from training. Duration: 0.8818125 2023-10-02 06:47:17,269 WARNING [train.py:1204] (2/4) Exclude cut with ID _60229_210_27767_1_1534838406158_3655350_353-213656-0 from training. Duration: 0.95 2023-10-02 06:47:17,280 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_3133_210_2087_1_1526106446_2324690_243-143119-0 from training. Duration: 0.77 2023-10-02 06:47:17,306 WARNING [train.py:1204] (2/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_690-126306-0 from training. Duration: 0.78 2023-10-02 06:47:20,751 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.conv_module2.balancer1.prob, batch_count=787640.0, ans=0.125 2023-10-02 06:47:23,241 WARNING [train.py:1204] (2/4) Exclude cut with ID _30304_210_6319_1_1532476827261_7582060_347-95003-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 06:47:24,626 WARNING [train.py:1204] (2/4) Exclude cut with ID _2322_210_2667_1_1525604548971_3656299_440-80494-0 from training. Duration: 0.88 2023-10-02 06:47:24,659 WARNING [train.py:1204] (2/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_370-229434-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 06:47:28,753 WARNING [train.py:1204] (2/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_124-345840-0_sp1.1 from training. Duration: 0.47275 2023-10-02 06:47:28,762 WARNING [train.py:1204] (2/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_165-123581-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 06:47:32,075 WARNING [train.py:1204] (2/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_453-282588-0 from training. Duration: 0.98 2023-10-02 06:47:32,107 WARNING [train.py:1204] (2/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_299-34413-0_sp0.9 from training. Duration: 0.72225 2023-10-02 06:47:32,153 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_40036_210_13405_1_1533648647_1315388_48-146016-0_sp1.1 from training. Duration: 0.8454375 2023-10-02 06:47:32,183 WARNING [train.py:1204] (2/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_99-167392-0_sp0.9 from training. Duration: 0.67775 2023-10-02 06:47:32,235 WARNING [train.py:1204] (2/4) Exclude cut with ID _48677_210_5663_1_1533902405919_3892989_37-207114-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 06:47:34,918 WARNING [train.py:1204] (2/4) Exclude cut with ID _29857_210_20034_1_1532395886753_3798160_335-163562-0 from training. Duration: 0.85 2023-10-02 06:47:35,265 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=787706.6666666666, ans=0.1 2023-10-02 06:47:36,931 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_36492_210_7927_1_1533261987_4790834_703-343351-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 06:47:38,281 WARNING [train.py:1204] (2/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_80-147301-0_sp0.9 from training. Duration: 0.9333125 2023-10-02 06:47:39,639 WARNING [train.py:1204] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_218-295097-0_sp0.9 from training. Duration: 0.8555625 2023-10-02 06:47:42,157 INFO [train.py:1046] (2/4) Epoch 23, batch 1300, loss[loss=0.1602, simple_loss=0.2193, pruned_loss=0.05052, over 22833.00 frames. ], tot_loss[loss=0.174, simple_loss=0.2507, pruned_loss=0.04859, over 4704277.17 frames. ], batch size: 322, lr: 4.49e-03, grad_scale: 16.0 2023-10-02 06:47:43,562 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_2295_210_2087_1_1525500993_2407863_116-238894-0_sp1.1 from training. Duration: 0.636375 2023-10-02 06:47:44,988 WARNING [train.py:1204] (2/4) Exclude cut with ID _3231_210_2106_1_1526205864934_6784220_697-171068-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 06:47:45,022 WARNING [train.py:1204] (2/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_102-77064-0 from training. Duration: 0.93 2023-10-02 06:47:49,942 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.conv_module2.balancer1.prob, batch_count=787773.3333333334, ans=0.125 2023-10-02 06:47:51,097 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_13091_210_7925_1_1530609447_8987354_1186-261957-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 06:47:52,368 WARNING [train.py:1204] (2/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_91-277684-0_sp1.1 from training. Duration: 0.67275 2023-10-02 06:47:52,567 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.ff3_skip_rate, batch_count=787773.3333333334, ans=0.0 2023-10-02 06:47:53,741 WARNING [train.py:1204] (2/4) Exclude cut with ID _31088_210_16446_1_1532520012284_3440919_159-313751-0_sp1.1 from training. Duration: 0.936375 2023-10-02 06:47:55,839 WARNING [train.py:1204] (2/4) Exclude cut with ID _42326_210_7770_1_1533814282531_3707269_227-203721-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 06:47:57,231 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_3531_210_3528_1_1526294203_5216456_309-227302-0_sp1.1 from training. Duration: 0.62725 2023-10-02 06:47:57,284 WARNING [train.py:1204] (2/4) Exclude cut with ID _14087_210_2253_1_1530529224512_3014029_226-242140-0 from training. Duration: 0.81 2023-10-02 06:47:57,580 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.conv_module2.balancer1.prob, batch_count=787840.0, ans=0.125 2023-10-02 06:48:01,591 WARNING [train.py:1204] (2/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_122-109023-0_sp1.1 from training. Duration: 0.6909375 2023-10-02 06:48:02,936 WARNING [train.py:1204] (2/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_660-211088-0_sp0.9 from training. Duration: 0.688875 2023-10-02 06:48:03,038 WARNING [train.py:1204] (2/4) Exclude cut with ID _39290_210_4220_1_1533862778760_3075118_92-311600-0 from training. Duration: 0.88 2023-10-02 06:48:06,260 WARNING [train.py:1204] (2/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_390-339137-0_sp1.1 from training. Duration: 0.5090625 2023-10-02 06:48:09,612 WARNING [train.py:1204] (2/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_449-341946-0_sp1.1 from training. Duration: 0.963625 2023-10-02 06:48:10,955 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_68095_210_20185_1_1535718226_8888855_884-327007-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 06:48:12,364 WARNING [train.py:1204] (2/4) Exclude cut with ID _42326_210_7770_1_1533814282531_3707269_308-222998-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 06:48:12,464 WARNING [train.py:1204] (2/4) Exclude cut with ID _40399_210_4353_1_1533305658194_3279406_344-238708-0_sp1.1 from training. Duration: 0.963625 2023-10-02 06:48:13,837 WARNING [train.py:1204] (2/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_87-131427-0_sp1.1 from training. Duration: 0.7090625 2023-10-02 06:48:15,083 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.516e+02 1.873e+02 2.076e+02 2.335e+02 3.601e+02, threshold=4.153e+02, percent-clipped=0.0 2023-10-02 06:48:15,188 WARNING [train.py:1204] (2/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_110-130961-0_sp0.9 from training. Duration: 0.52225 2023-10-02 06:48:15,224 WARNING [train.py:1204] (2/4) Exclude cut with ID _55153_210_2253_1_1534384786677_3162096_148-79022-0 from training. Duration: 0.96 2023-10-02 06:48:22,004 WARNING [train.py:1204] (2/4) Exclude cut with ID _9679_210_7497_1_1529129212906_4363929_512-313889-0_sp1.1 from training. Duration: 0.736375 2023-10-02 06:48:23,174 WARNING [train.py:1204] (2/4) Exclude cut with ID _7970_210_1883_1_1528455616296_3472230_25-178407-0_sp1.1 from training. Duration: 0.6454375 2023-10-02 06:48:24,520 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_3133_210_2087_1_1526106446_2324690_246-125000-0 from training. Duration: 0.62 2023-10-02 06:48:25,897 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_15126_210_6753_1_1530778677_2591185_113-157651-0_sp0.9 from training. Duration: 0.7555625 2023-10-02 06:48:27,313 WARNING [train.py:1204] (2/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_105-63309-0_sp1.1 from training. Duration: 0.8909375 2023-10-02 06:48:28,910 WARNING [train.py:1204] (2/4) Exclude cut with ID _29877_210_6228_1_1532397791106_5276149_578-177271-0_sp1.1 from training. Duration: 0.936375 2023-10-02 06:48:30,246 WARNING [train.py:1204] (2/4) Exclude cut with ID _44831_210_13427_1_1533816011549_3689035_32-188147-0 from training. Duration: 0.96 2023-10-02 06:48:30,298 WARNING [train.py:1204] (2/4) Exclude cut with ID _12369_210_4381_1_1530356078581_4388520_300-254337-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 06:48:30,309 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_128-241008-0 from training. Duration: 0.94 2023-10-02 06:48:31,721 WARNING [train.py:1204] (2/4) Exclude cut with ID _21878_210_3525_1_1531738772002_7264330_53-279178-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 06:48:35,403 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_41452_210_7291_1_1533437687_3941281_273-334088-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 06:48:35,405 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_128-241008-0_sp1.1 from training. Duration: 0.8545625 2023-10-02 06:48:39,388 WARNING [train.py:1204] (2/4) Exclude cut with ID _8394_210_6702_1_1528590504316_3540189_379-49457-0 from training. Duration: 0.87 2023-10-02 06:48:41,394 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_105-114797-0 from training. Duration: 0.84 2023-10-02 06:48:41,477 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_14928_210_5632_1_1530773461_4714000_454-112198-0 from training. Duration: 0.66 2023-10-02 06:48:46,052 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_456-22141-0_sp1.1 from training. Duration: 0.8 2023-10-02 06:48:46,311 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.feed_forward3.hidden_balancer.prob, batch_count=788040.0, ans=0.125 2023-10-02 06:48:47,532 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_115-233929-0 from training. Duration: 0.72 2023-10-02 06:48:49,030 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_616-16795-0_sp1.1 from training. Duration: 0.963625 2023-10-02 06:48:56,498 WARNING [train.py:1204] (2/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_448-212135-0 from training. Duration: 0.91 2023-10-02 06:48:57,841 INFO [train.py:1046] (2/4) Epoch 23, batch 1350, loss[loss=0.1515, simple_loss=0.2081, pruned_loss=0.04744, over 22814.00 frames. ], tot_loss[loss=0.1731, simple_loss=0.2496, pruned_loss=0.04827, over 4697389.73 frames. ], batch size: 322, lr: 4.49e-03, grad_scale: 16.0 2023-10-02 06:48:59,249 WARNING [train.py:1204] (2/4) Exclude cut with ID _46683_210_9642_1_1533730913492_4591116_201-205677-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 06:49:00,606 WARNING [train.py:1204] (2/4) Exclude cut with ID _64086_210_7994_1_1535270417019_7435949_43-80483-0_sp1.1 from training. Duration: 0.97275 2023-10-02 06:49:04,153 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_240-134124-0_sp1.1 from training. Duration: 0.963625 2023-10-02 06:49:05,430 WARNING [train.py:1204] (2/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_907-120940-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 06:49:06,846 WARNING [train.py:1204] (2/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_848-280957-0_sp1.1 from training. Duration: 0.7909375 2023-10-02 06:49:08,818 WARNING [train.py:1204] (2/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_243-42116-0_sp0.9 from training. Duration: 0.811125 2023-10-02 06:49:11,560 WARNING [train.py:1204] (2/4) Exclude cut with ID _6694_210_4599_1_1527852637490_3965529_116-109398-0_sp0.9 from training. Duration: 0.811125 2023-10-02 06:49:14,348 WARNING [train.py:1204] (2/4) Exclude cut with ID _24314_210_4915_1_1532135078116_7170179_521-258868-0 from training. Duration: 0.79 2023-10-02 06:49:14,455 WARNING [train.py:1204] (2/4) Exclude cut with ID _2220_210_1819_1_1525509109898_3658680_50-99837-0_sp0.9 from training. Duration: 0.6 2023-10-02 06:49:15,660 WARNING [train.py:1204] (2/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_313-134930-0_sp1.1 from training. Duration: 0.8818125 2023-10-02 06:49:18,920 WARNING [train.py:1204] (2/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_125-144174-0 from training. Duration: 0.39 2023-10-02 06:49:18,978 WARNING [train.py:1204] (2/4) Exclude cut with ID _59288_210_5854_1_1535077757774_3412246_275-293897-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 06:49:20,403 WARNING [train.py:1204] (2/4) Exclude cut with ID _40519_210_13794_1_1533715228655_7337390_722-52880-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 06:49:20,417 WARNING [train.py:1204] (2/4) Exclude cut with ID _7537_210_2084_1_1528502395431_3924359_128-268029-0 from training. Duration: 0.98 2023-10-02 06:49:23,102 WARNING [train.py:1204] (2/4) Exclude cut with ID _8021_210_7994_1_1528376451358_3653069_179-45206-0 from training. Duration: 0.86 2023-10-02 06:49:25,214 WARNING [train.py:1204] (2/4) Exclude cut with ID _13252_210_1925_1_1530671372047_3336909_88-276668-0 from training. Duration: 0.99 2023-10-02 06:49:26,635 WARNING [train.py:1204] (2/4) Exclude cut with ID _22613_210_5219_1_1532253599686_4090170_290-212862-0_sp1.1 from training. Duration: 0.97275 2023-10-02 06:49:26,647 WARNING [train.py:1204] (2/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_465-214726-0 from training. Duration: 0.86 2023-10-02 06:49:27,234 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.5.encoder.layers.0.self_attn_weights.whiten_keys, num_groups=4, num_channels=128, metric=2.13 vs. limit=6.0 2023-10-02 06:49:37,028 WARNING [train.py:1204] (2/4) Exclude cut with ID _56480_210_4220_1_1534679942079_3075409_138-36031-0_sp1.1 from training. Duration: 0.97275 2023-10-02 06:49:46,411 WARNING [train.py:1204] (2/4) Exclude cut with ID _58605_210_12210_1_1534723195939_3904259_300-285878-0_sp1.1 from training. Duration: 0.97275 2023-10-02 06:49:46,442 WARNING [train.py:1204] (2/4) Exclude cut with ID _40901_210_5665_1_1533363789958_3856130_114-340229-0_sp1.1 from training. Duration: 0.936375 2023-10-02 06:49:47,754 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_489-230703-0 from training. Duration: 0.85 2023-10-02 06:49:51,024 WARNING [train.py:1204] (2/4) Exclude cut with ID _66873_210_5517_1_1535454136506_4293370_322-81243-0_sp1.1 from training. Duration: 0.936375 2023-10-02 06:49:52,400 WARNING [train.py:1204] (2/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_271-269084-0 from training. Duration: 0.83 2023-10-02 06:49:52,418 WARNING [train.py:1204] (2/4) Exclude cut with ID _13252_210_1925_1_1530671372047_3336909_166-314607-0_sp0.9 from training. Duration: 0.6 2023-10-02 06:49:53,815 WARNING [train.py:1204] (2/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_340-343302-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 06:49:55,798 WARNING [train.py:1204] (2/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_286-112015-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 06:49:58,401 WARNING [train.py:1204] (2/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_328-264776-0 from training. Duration: 0.67 2023-10-02 06:49:59,852 WARNING [train.py:1204] (2/4) Exclude cut with ID _4866_210_2577_1_1527404412284_4364659_256-306830-0_sp0.9 from training. Duration: 0.9666875 2023-10-02 06:50:04,055 WARNING [train.py:1204] (2/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_239-316802-0 from training. Duration: 0.69 2023-10-02 06:50:06,072 WARNING [train.py:1204] (2/4) Exclude cut with ID _29857_210_20034_1_1532395886753_3798160_279-185801-0 from training. Duration: 0.91 2023-10-02 06:50:13,336 INFO [train.py:1046] (2/4) Epoch 23, batch 1400, loss[loss=0.1836, simple_loss=0.2647, pruned_loss=0.05119, over 24133.00 frames. ], tot_loss[loss=0.1727, simple_loss=0.2493, pruned_loss=0.04807, over 4706864.92 frames. ], batch size: 86, lr: 4.49e-03, grad_scale: 16.0 2023-10-02 06:50:13,381 WARNING [train.py:1204] (2/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_656-188361-0 from training. Duration: 0.87 2023-10-02 06:50:14,773 WARNING [train.py:1204] (2/4) Exclude cut with ID _44718_210_5816_1_1534050051869_6469030_100-230287-0_sp1.1 from training. Duration: 0.936375 2023-10-02 06:50:16,269 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8275_210_4198_1_1528455320_3892571_276-215622-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 06:50:17,701 WARNING [train.py:1204] (2/4) Exclude cut with ID _30304_210_6319_1_1532476827261_7582060_698-309430-0_sp1.1 from training. Duration: 0.963625 2023-10-02 06:50:22,484 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_365-217119-0 from training. Duration: 0.63 2023-10-02 06:50:23,843 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7264_210_4938_1_1527939911_8132095_800-91995-0 from training. Duration: 0.86 2023-10-02 06:50:35,597 WARNING [train.py:1204] (2/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_49-97158-0_sp1.1 from training. Duration: 0.6909375 2023-10-02 06:50:37,025 WARNING [train.py:1204] (2/4) Exclude cut with ID _61132_210_9969_1_1535021792964_3755419_61-57720-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 06:50:39,157 WARNING [train.py:1204] (2/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_345-306832-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 06:50:39,179 WARNING [train.py:1204] (2/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_164-224052-0_sp0.9 from training. Duration: 0.77775 2023-10-02 06:50:39,383 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.conv_module1.balancer2.prob, batch_count=788506.6666666666, ans=0.125 2023-10-02 06:50:39,859 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.2.encoder.layers.0.feed_forward1.out_whiten, num_groups=1, num_channels=384, metric=7.24 vs. limit=15.0 2023-10-02 06:50:44,061 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_34886_210_10633_1_1533203882_1755661_76-156096-0_sp1.1 from training. Duration: 0.7909375 2023-10-02 06:50:44,142 WARNING [train.py:1204] (2/4) Exclude cut with ID 4133-6541-0027-40495-0_sp1.1 from training. Duration: 0.9681875 2023-10-02 06:50:46,772 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.533e+02 1.899e+02 2.083e+02 2.387e+02 3.639e+02, threshold=4.165e+02, percent-clipped=0.0 2023-10-02 06:50:53,175 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_514-211165-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 06:50:54,515 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_41211_210_2544_1_1533348859_3702910_97-212695-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 06:50:56,286 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.ff3_skip_rate, batch_count=788573.3333333334, ans=0.0 2023-10-02 06:50:56,313 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.balancer1.prob, batch_count=788573.3333333334, ans=0.125 2023-10-02 06:50:57,426 WARNING [train.py:1204] (2/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_406-45086-0 from training. Duration: 0.8 2023-10-02 06:50:58,813 WARNING [train.py:1204] (2/4) Exclude cut with ID _65263_210_22356_1_1535763598691_3921037_49-222917-0_sp1.1 from training. Duration: 0.836375 2023-10-02 06:50:58,880 WARNING [train.py:1204] (2/4) Exclude cut with ID _4866_210_2577_1_1527404412284_4364659_352-157930-0_sp1.1 from training. Duration: 0.736375 2023-10-02 06:51:00,180 WARNING [train.py:1204] (2/4) Exclude cut with ID _4366_210_1388_1_1526792053633_3613819_231-211402-0_sp1.1 from training. Duration: 0.8545625 2023-10-02 06:51:02,094 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_98-21576-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 06:51:02,168 WARNING [train.py:1204] (2/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_348-127789-0_sp0.9 from training. Duration: 0.9444375 2023-10-02 06:51:02,187 WARNING [train.py:1204] (2/4) Exclude cut with ID _45871_210_5665_1_1533864614245_3296060_98-329557-0_sp1.1 from training. Duration: 0.97275 2023-10-02 06:51:03,468 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6846_210_6753_1_1527850767_4295158_2-315073-0_sp1.1 from training. Duration: 0.8 2023-10-02 06:51:03,581 WARNING [train.py:1204] (2/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_145-137790-0 from training. Duration: 0.83 2023-10-02 06:51:04,854 WARNING [train.py:1204] (2/4) Exclude cut with ID _14603_210_4220_1_1530615688441_3097319_133-158308-0_sp0.9 from training. Duration: 0.9333125 2023-10-02 06:51:09,722 WARNING [train.py:1204] (2/4) Exclude cut with ID _14898_210_9206_1_1530766812377_4177190_320-11758-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 06:51:09,929 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.1.attention_skip_rate, batch_count=788640.0, ans=0.0 2023-10-02 06:51:13,079 WARNING [train.py:1204] (2/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_406-45086-0_sp0.9 from training. Duration: 0.888875 2023-10-02 06:51:13,640 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.4.encoder.layers.0.feed_forward2.out_whiten, num_groups=1, num_channels=384, metric=8.55 vs. limit=15.0 2023-10-02 06:51:18,766 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_5438_210_1794_1_1527336687_2600009_104-288274-0 from training. Duration: 0.58 2023-10-02 06:51:20,135 WARNING [train.py:1204] (2/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_108-53929-0_sp0.9 from training. Duration: 0.6555625 2023-10-02 06:51:21,512 WARNING [train.py:1204] (2/4) Exclude cut with ID _30800_210_22382_1_1532499347904_4515959_444-286390-0_sp1.1 from training. Duration: 0.963625 2023-10-02 06:51:24,792 WARNING [train.py:1204] (2/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_125-144174-0_sp0.9 from training. Duration: 0.4333125 2023-10-02 06:51:24,850 WARNING [train.py:1204] (2/4) Exclude cut with ID _55209_210_1531_1_1534675828996_4597160_118-345621-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 06:51:27,525 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_45886_210_17024_1_1533709215_471063_7-79165-0_sp1.1 from training. Duration: 0.8909375 2023-10-02 06:51:28,908 INFO [train.py:1046] (2/4) Epoch 23, batch 1450, loss[loss=0.1626, simple_loss=0.2567, pruned_loss=0.03425, over 24644.00 frames. ], tot_loss[loss=0.1717, simple_loss=0.2481, pruned_loss=0.04765, over 4705972.17 frames. ], batch size: 73, lr: 4.49e-03, grad_scale: 16.0 2023-10-02 06:51:29,039 WARNING [train.py:1204] (2/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_175-163894-0_sp0.9 from training. Duration: 0.82225 2023-10-02 06:51:30,422 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_50188_210_4198_1_1534503621_3646041_353-81682-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 06:51:32,121 WARNING [train.py:1204] (2/4) Exclude cut with ID _17875_210_2087_1_1531224012907_1821339_175-80565-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 06:51:32,131 WARNING [train.py:1204] (2/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_159-328157-0_sp1.1 from training. Duration: 0.47275 2023-10-02 06:51:36,425 WARNING [train.py:1204] (2/4) Exclude cut with ID _60057_210_10640_1_1534903065793_3333150_137-267579-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 06:51:38,370 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_92-46009-0_sp1.1 from training. Duration: 0.4909375 2023-10-02 06:51:39,793 WARNING [train.py:1204] (2/4) Exclude cut with ID _53203_210_2087_1_1534246218309_475590_48-68507-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 06:51:39,820 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_28211_210_6388_1_1532861506_4523224_128-328162-0 from training. Duration: 0.9 2023-10-02 06:51:41,158 WARNING [train.py:1204] (2/4) Exclude cut with ID _4697_210_2069_1_1526815801953_3823098_51-1049-0_sp1.1 from training. Duration: 0.6818125 2023-10-02 06:51:41,233 WARNING [train.py:1204] (2/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_383-316000-0 from training. Duration: 0.9 2023-10-02 06:51:42,589 WARNING [train.py:1204] (2/4) Exclude cut with ID _4779_210_2084_1_1527318022944_3914893_185-295153-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 06:51:43,063 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.5.encoder.layers.0.nonlin_attention.whiten2, num_groups=1, num_channels=256, metric=3.69 vs. limit=15.0 2023-10-02 06:51:44,342 WARNING [train.py:1204] (2/4) Exclude cut with ID _3231_210_2106_1_1526205864934_6784220_171-256004-0_sp0.9 from training. Duration: 0.9555625 2023-10-02 06:51:44,343 WARNING [train.py:1204] (2/4) Exclude cut with ID _41314_210_12651_1_1533459476543_3766460_87-272068-0 from training. Duration: 0.83 2023-10-02 06:51:44,432 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_164-152777-0_sp1.1 from training. Duration: 0.92725 2023-10-02 06:51:45,796 WARNING [train.py:1204] (2/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_18-130336-0_sp0.9 from training. Duration: 0.87775 2023-10-02 06:51:47,134 WARNING [train.py:1204] (2/4) Exclude cut with ID _14886_210_4220_1_1530702020355_3516190_175-229073-0_sp1.1 from training. Duration: 0.4090625 2023-10-02 06:51:47,156 WARNING [train.py:1204] (2/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_791-45149-0_sp0.9 from training. Duration: 0.9555625 2023-10-02 06:51:48,593 WARNING [train.py:1204] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_468-189058-0_sp1.1 from training. Duration: 0.87275 2023-10-02 06:51:50,028 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8278_210_2550_1_1528637099_4220413_32-142780-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 06:51:50,855 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.1.encoder.layers.0.self_attn1.whiten, num_groups=1, num_channels=256, metric=11.84 vs. limit=22.5 2023-10-02 06:51:51,544 WARNING [train.py:1204] (2/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_146-218763-0_sp0.9 from training. Duration: 0.9555625 2023-10-02 06:51:54,974 WARNING [train.py:1204] (2/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_348-127789-0_sp1.1 from training. Duration: 0.77275 2023-10-02 06:51:54,987 WARNING [train.py:1204] (2/4) Exclude cut with ID _47790_210_16187_1_1533862814066_4207519_288-285445-0_sp1.1 from training. Duration: 0.936375 2023-10-02 06:51:55,269 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.ff2_skip_rate, batch_count=788840.0, ans=0.0 2023-10-02 06:51:56,396 WARNING [train.py:1204] (2/4) Exclude cut with ID _14603_210_4220_1_1530615688441_3097319_308-181732-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 06:51:57,715 WARNING [train.py:1204] (2/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_895-117428-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 06:51:59,026 WARNING [train.py:1204] (2/4) Exclude cut with ID _9235_210_5219_1_1528891154253_8046120_387-159008-0_sp0.9 from training. Duration: 0.9555625 2023-10-02 06:51:59,045 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_37351_210_7925_1_1533012931_3812113_265-85780-0_sp0.9 from training. Duration: 0.97775 2023-10-02 06:51:59,058 WARNING [train.py:1204] (2/4) Exclude cut with ID _40551_210_10635_1_1533776424632_3416179_361-3964-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 06:52:00,380 WARNING [train.py:1204] (2/4) Exclude cut with ID _25882_210_3036_1_1532775665815_6479510_485-122742-0_sp1.1 from training. Duration: 0.97275 2023-10-02 06:52:03,719 WARNING [train.py:1204] (2/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_98-51676-0 from training. Duration: 0.73 2023-10-02 06:52:06,457 WARNING [train.py:1204] (2/4) Exclude cut with ID _30548_210_16040_1_1532608097130_3146969_371-280610-0_sp1.1 from training. Duration: 0.92725 2023-10-02 06:52:08,640 WARNING [train.py:1204] (2/4) Exclude cut with ID IC0671W0081-331924 from training. Duration: 0.896 2023-10-02 06:52:08,803 INFO [scaling.py:1118] (2/4) WithLoss: name=encoder.encoders.1.encoder.layers.0.self_attn_weights, loss-sum=0.000e+00 2023-10-02 06:52:10,023 WARNING [train.py:1204] (2/4) Exclude cut with ID _40284_210_2084_1_1533513679814_3704279_380-160775-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 06:52:11,527 WARNING [train.py:1204] (2/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_57-270037-0_sp1.1 from training. Duration: 0.7 2023-10-02 06:52:13,383 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_853-150237-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 06:52:14,794 WARNING [train.py:1204] (2/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_491-261923-0 from training. Duration: 0.98 2023-10-02 06:52:17,625 WARNING [train.py:1204] (2/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_836-244045-0_sp1.1 from training. Duration: 0.97275 2023-10-02 06:52:17,707 WARNING [train.py:1204] (2/4) Exclude cut with ID _14880_210_8895_1_1530680426655_3795259_289-212817-0 from training. Duration: 0.69 2023-10-02 06:52:19,105 WARNING [train.py:1204] (2/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_303-340446-0 from training. Duration: 0.9 2023-10-02 06:52:19,941 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.2.encoder.layers.0.self_attn2.whiten, num_groups=1, num_channels=384, metric=21.21 vs. limit=22.5 2023-10-02 06:52:20,592 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_37152_210_9697_1_1533106573_3844188_418-41736-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 06:52:25,128 WARNING [train.py:1204] (2/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_371-18169-0_sp1.1 from training. Duration: 0.7818125 2023-10-02 06:52:25,161 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_187-219984-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 06:52:26,563 WARNING [train.py:1204] (2/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_63-267585-0 from training. Duration: 0.9 2023-10-02 06:52:27,220 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.4.encoder.layers.0.self_attn_weights.whiten_keys, num_groups=4, num_channels=128, metric=1.92 vs. limit=6.0 2023-10-02 06:52:30,579 WARNING [train.py:1204] (2/4) Exclude cut with ID _27013_210_1389_1_1532437173240_3252649_222-125573-0 from training. Duration: 0.98 2023-10-02 06:52:30,643 WARNING [train.py:1204] (2/4) Exclude cut with ID _14821_210_8750_1_1530680503991_6829359_189-297451-0 from training. Duration: 0.55 2023-10-02 06:52:32,591 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_557-163391-0_sp1.1 from training. Duration: 0.97275 2023-10-02 06:52:33,807 WARNING [train.py:1204] (2/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_65-88242-0_sp1.1 from training. Duration: 0.5090625 2023-10-02 06:52:40,622 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.feed_forward1.hidden_balancer.prob, batch_count=789040.0, ans=0.125 2023-10-02 06:52:41,985 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.feed_forward1.out_proj.dropout_p, batch_count=789040.0, ans=0.1 2023-10-02 06:52:43,235 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.0.nonlin_attention.balancer.prob, batch_count=789106.6666666666, ans=0.125 2023-10-02 06:52:44,868 INFO [train.py:1046] (2/4) Epoch 23, batch 1500, loss[loss=0.1982, simple_loss=0.2618, pruned_loss=0.06733, over 22712.00 frames. ], tot_loss[loss=0.172, simple_loss=0.2484, pruned_loss=0.0478, over 4694408.02 frames. ], batch size: 322, lr: 4.49e-03, grad_scale: 16.0 2023-10-02 06:52:44,917 WARNING [train.py:1204] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_218-295097-0 from training. Duration: 0.77 2023-10-02 06:52:44,958 WARNING [train.py:1204] (2/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_686-247600-0_sp1.1 from training. Duration: 0.836375 2023-10-02 06:52:44,962 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_397-147196-0_sp0.9 from training. Duration: 0.888875 2023-10-02 06:52:46,330 WARNING [train.py:1204] (2/4) Exclude cut with ID _53950_210_5816_1_1534417264919_3862951_237-136449-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 06:52:47,626 WARNING [train.py:1204] (2/4) Exclude cut with ID _3120_210_3525_1_1526036143112_4072980_74-178400-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 06:52:47,700 WARNING [train.py:1204] (2/4) Exclude cut with ID _5166_210_2084_1_1527073197160_4087993_309-222545-0_sp0.9 from training. Duration: 0.9333125 2023-10-02 06:52:49,066 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7544_210_6753_1_1528587722_3576686_172-171750-0 from training. Duration: 0.65 2023-10-02 06:52:50,443 WARNING [train.py:1204] (2/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_3-237434-0_sp0.9 from training. Duration: 0.8444375 2023-10-02 06:52:50,491 WARNING [train.py:1204] (2/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_101-293367-0_sp0.9 from training. Duration: 0.7 2023-10-02 06:52:51,781 WARNING [train.py:1204] (2/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_818-213552-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 06:52:51,863 WARNING [train.py:1204] (2/4) Exclude cut with ID _14349_210_8341_1_1530615710818_3442470_115-285975-0_sp1.1 from training. Duration: 0.7818125 2023-10-02 06:52:53,268 WARNING [train.py:1204] (2/4) Exclude cut with ID _30800_210_22382_1_1532499347904_4515959_291-309749-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 06:52:55,133 WARNING [train.py:1204] (2/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_715-3462-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 06:52:59,343 WARNING [train.py:1204] (2/4) Exclude cut with ID _59502_210_12210_1_1535107024698_1835139_13-331827-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 06:52:59,359 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_4528_210_3273_1_1527076654_3276777_342-168068-0 from training. Duration: 0.87 2023-10-02 06:52:59,483 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.1.attention_skip_rate, batch_count=789173.3333333334, ans=0.0 2023-10-02 06:53:00,667 WARNING [train.py:1204] (2/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_215-141059-0_sp0.9 from training. Duration: 0.688875 2023-10-02 06:53:00,686 WARNING [train.py:1204] (2/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_106-286130-0_sp1.1 from training. Duration: 0.8909375 2023-10-02 06:53:02,473 WARNING [train.py:1204] (2/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_480-77655-0_sp1.1 from training. Duration: 0.97275 2023-10-02 06:53:03,988 WARNING [train.py:1204] (2/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_848-280957-0 from training. Duration: 0.87 2023-10-02 06:53:08,575 WARNING [train.py:1204] (2/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_122-109023-0 from training. Duration: 0.76 2023-10-02 06:53:09,961 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_11305_210_1794_1_1529750428_7178148_645-233480-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 06:53:10,020 WARNING [train.py:1204] (2/4) Exclude cut with ID _7562_210_2084_1_1528887767916_3946229_345-169156-0 from training. Duration: 0.58 2023-10-02 06:53:12,673 WARNING [train.py:1204] (2/4) Exclude cut with ID _7562_210_2084_1_1528887767916_3946229_345-169156-0_sp1.1 from training. Duration: 0.52725 2023-10-02 06:53:15,930 WARNING [train.py:1204] (2/4) Exclude cut with ID _13253_210_1925_1_1530757775819_3577873_253-246386-0_sp1.1 from training. Duration: 0.8181875 2023-10-02 06:53:17,156 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.616e+02 1.895e+02 2.081e+02 2.533e+02 4.073e+02, threshold=4.161e+02, percent-clipped=0.0 2023-10-02 06:53:17,281 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_136-264444-0_sp1.1 from training. Duration: 0.97275 2023-10-02 06:53:17,294 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_2168_210_2290_1_1525606920_1849400_136-222991-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 06:53:19,886 WARNING [train.py:1204] (2/4) Exclude cut with ID _7852_210_5219_1_1528286471316_8139400_242-105336-0 from training. Duration: 0.72 2023-10-02 06:53:19,935 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_5960_210_1794_1_1527403865_3990999_515-2613-0_sp1.1 from training. Duration: 0.8 2023-10-02 06:53:19,954 WARNING [train.py:1204] (2/4) Exclude cut with ID _39688_210_19596_1_1533371626725_4154159_551-167834-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 06:53:21,352 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_43802_210_2290_1_1533516203_8829504_85-195240-0 from training. Duration: 0.87 2023-10-02 06:53:21,430 WARNING [train.py:1204] (2/4) Exclude cut with ID _63171_210_15488_1_1535104792794_3295110_113-113534-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 06:53:21,626 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.nonlin_attention.balancer.prob, batch_count=789240.0, ans=0.125 2023-10-02 06:53:25,775 WARNING [train.py:1204] (2/4) Exclude cut with ID _8833_210_5219_1_1528592665210_7553059_319-349181-0_sp1.1 from training. Duration: 0.9 2023-10-02 06:53:25,776 WARNING [train.py:1204] (2/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_225-43381-0 from training. Duration: 0.94 2023-10-02 06:53:30,540 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.feed_forward3.hidden_balancer.prob, batch_count=789306.6666666666, ans=0.125 2023-10-02 06:53:33,059 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_5959_210_1794_1_1527400400_3445726_412-44052-0_sp0.9 from training. Duration: 0.8555625 2023-10-02 06:53:35,099 WARNING [train.py:1204] (2/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_356-167816-0_sp1.1 from training. Duration: 0.6545625 2023-10-02 06:53:35,827 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.2.encoder.layers.2.conv_module2.whiten, num_groups=1, num_channels=384, metric=5.08 vs. limit=15.0 2023-10-02 06:53:36,781 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.conv_skip_rate, batch_count=789306.6666666666, ans=0.0 2023-10-02 06:53:39,519 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9012W0112-501244 from training. Duration: 0.768 2023-10-02 06:53:39,569 WARNING [train.py:1204] (2/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_177-326952-0_sp1.1 from training. Duration: 0.92725 2023-10-02 06:53:39,572 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9016W0116-502789 from training. Duration: 0.896 2023-10-02 06:53:40,921 WARNING [train.py:1204] (2/4) Exclude cut with ID _56235_210_10640_1_1534557335235_4161099_557-204815-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 06:53:42,827 WARNING [train.py:1204] (2/4) Exclude cut with ID _55857_210_2346_1_1534644007831_6898209_279-229469-0_sp1.1 from training. Duration: 0.936375 2023-10-02 06:53:42,895 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9029W0375-509764 from training. Duration: 0.896 2023-10-02 06:53:43,153 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.conv_skip_rate, batch_count=789373.3333333334, ans=0.0 2023-10-02 06:53:44,277 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_2295_210_2087_1_1525500993_2407863_234-83104-0_sp0.9 from training. Duration: 0.688875 2023-10-02 06:53:47,069 WARNING [train.py:1204] (2/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_266-137104-0 from training. Duration: 0.65 2023-10-02 06:53:49,238 WARNING [train.py:1204] (2/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_289-163927-0_sp1.1 from training. Duration: 0.92725 2023-10-02 06:53:52,061 WARNING [train.py:1204] (2/4) Exclude cut with ID _12733_210_8341_1_1530167424556_3646380_86-225314-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 06:53:52,081 WARNING [train.py:1204] (2/4) Exclude cut with ID _7877_210_6014_1_1528286358099_3750136_618-284064-0_sp1.1 from training. Duration: 0.92725 2023-10-02 06:53:53,403 WARNING [train.py:1204] (2/4) Exclude cut with ID _55844_210_2346_1_1534471212569_7625707_1017-31469-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 06:53:53,450 WARNING [train.py:1204] (2/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_617-241446-0_sp1.1 from training. Duration: 0.92725 2023-10-02 06:53:54,723 WARNING [train.py:1204] (2/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_866-173440-0_sp1.1 from training. Duration: 0.8181875 2023-10-02 06:53:56,190 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_9460_210_4211_1_1528954587_10049667_480-260269-0 from training. Duration: 0.56 2023-10-02 06:53:56,235 WARNING [train.py:1204] (2/4) Exclude cut with ID _44659_210_2721_1_1534036120061_6614860_626-298884-0 from training. Duration: 0.97 2023-10-02 06:53:56,506 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.feed_forward1.out_proj.dropout_p, batch_count=789373.3333333334, ans=0.1 2023-10-02 06:53:57,625 WARNING [train.py:1204] (2/4) Exclude cut with ID _53565_210_10635_1_1534337218335_3820259_287-18120-0_sp1.1 from training. Duration: 0.8818125 2023-10-02 06:53:57,690 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_791-197891-0 from training. Duration: 0.84 2023-10-02 06:53:57,732 WARNING [train.py:1204] (2/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_527-238990-0 from training. Duration: 0.9 2023-10-02 06:53:59,021 INFO [train.py:1046] (2/4) Epoch 23, batch 1550, loss[loss=0.154, simple_loss=0.2363, pruned_loss=0.03588, over 24303.00 frames. ], tot_loss[loss=0.1722, simple_loss=0.2489, pruned_loss=0.04781, over 4697282.52 frames. ], batch size: 61, lr: 4.48e-03, grad_scale: 16.0 2023-10-02 06:54:01,078 WARNING [train.py:1204] (2/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_449-65969-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 06:54:02,481 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_553-341452-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 06:54:02,530 WARNING [train.py:1204] (2/4) Exclude cut with ID _16909_210_5724_1_1531292079912_4118919_300-47574-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 06:54:02,546 WARNING [train.py:1204] (2/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_70-27732-0_sp1.1 from training. Duration: 0.8545625 2023-10-02 06:54:02,626 WARNING [train.py:1204] (2/4) Exclude cut with ID _44718_210_5816_1_1534050051869_6469030_727-28074-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 06:54:03,947 WARNING [train.py:1204] (2/4) Exclude cut with ID _55857_210_2346_1_1534644007831_6898209_357-123664-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 06:54:04,221 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.feed_forward1.hidden_balancer.prob, batch_count=789440.0, ans=0.125 2023-10-02 06:54:07,270 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9123W0004-555964 from training. Duration: 0.896 2023-10-02 06:54:07,307 WARNING [train.py:1204] (2/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_445-145208-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 06:54:08,575 WARNING [train.py:1204] (2/4) Exclude cut with ID _3675_210_4557_1_1526639934408_4182089_456-18305-0_sp1.1 from training. Duration: 0.6909375 2023-10-02 06:54:08,656 WARNING [train.py:1204] (2/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_1-99680-0_sp1.1 from training. Duration: 0.6090625 2023-10-02 06:54:11,406 WARNING [train.py:1204] (2/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_146-282400-0_sp0.9 from training. Duration: 0.788875 2023-10-02 06:54:11,418 WARNING [train.py:1204] (2/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_844-96989-0 from training. Duration: 0.71 2023-10-02 06:54:12,860 WARNING [train.py:1204] (2/4) Exclude cut with ID _29853_210_7497_1_1532505600851_6642110_128-146364-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 06:54:12,882 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_224-322394-0 from training. Duration: 0.66 2023-10-02 06:54:14,861 WARNING [train.py:1204] (2/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_191-59784-0 from training. Duration: 0.88 2023-10-02 06:54:14,866 WARNING [train.py:1204] (2/4) Exclude cut with ID _12556_210_8341_1_1530081024100_7327690_725-193477-0 from training. Duration: 0.98 2023-10-02 06:54:15,017 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.attention_skip_rate, batch_count=789506.6666666666, ans=0.0 2023-10-02 06:54:16,682 WARNING [train.py:1204] (2/4) Exclude cut with ID _5166_210_2084_1_1527073197160_4087993_523-79838-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 06:54:16,775 WARNING [train.py:1204] (2/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_398-334558-0_sp1.1 from training. Duration: 0.87275 2023-10-02 06:54:20,720 WARNING [train.py:1204] (2/4) Exclude cut with ID _6456_210_1534_1_1527681634212_3181049_171-36697-0_sp1.1 from training. Duration: 0.936375 2023-10-02 06:54:22,122 WARNING [train.py:1204] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_700-183549-0 from training. Duration: 0.68 2023-10-02 06:54:22,124 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8853_210_3273_1_1528633899_818968_51-187685-0 from training. Duration: 0.69 2023-10-02 06:54:26,496 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.attention_skip_rate, batch_count=789506.6666666666, ans=0.0 2023-10-02 06:54:29,018 WARNING [train.py:1204] (2/4) Exclude cut with ID _7719_210_3525_1_1528456153163_4296960_333-110399-0_sp1.1 from training. Duration: 0.87275 2023-10-02 06:54:33,693 WARNING [train.py:1204] (2/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_1022-83740-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 06:54:33,710 WARNING [train.py:1204] (2/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_61-327062-0_sp0.9 from training. Duration: 0.9 2023-10-02 06:54:35,011 WARNING [train.py:1204] (2/4) Exclude cut with ID _47251_210_18628_1_1533814063128_4137250_214-88076-0_sp1.1 from training. Duration: 0.8 2023-10-02 06:54:35,081 WARNING [train.py:1204] (2/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_839-173017-0 from training. Duration: 0.41 2023-10-02 06:54:42,610 WARNING [train.py:1204] (2/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_299-34413-0_sp1.1 from training. Duration: 0.5909375 2023-10-02 06:54:43,993 WARNING [train.py:1204] (2/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_664-159457-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 06:54:47,322 WARNING [train.py:1204] (2/4) Exclude cut with ID _66873_210_5517_1_1535454136506_4293370_483-291760-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 06:54:49,383 WARNING [train.py:1204] (2/4) Exclude cut with ID _30800_210_22382_1_1532499347904_4515959_287-255474-0_sp1.1 from training. Duration: 0.92725 2023-10-02 06:54:50,753 WARNING [train.py:1204] (2/4) Exclude cut with ID _48677_210_5663_1_1533902405919_3892989_111-11439-0_sp1.1 from training. Duration: 0.87275 2023-10-02 06:54:50,768 WARNING [train.py:1204] (2/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_399-156791-0 from training. Duration: 0.83 2023-10-02 06:54:50,809 WARNING [train.py:1204] (2/4) Exclude cut with ID _3992_210_1747_1_1526644816031_3731060_36-147855-0_sp0.9 from training. Duration: 0.8666875 2023-10-02 06:54:51,041 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.conv_module2.balancer2.prob, batch_count=789640.0, ans=0.125 2023-10-02 06:54:52,299 WARNING [train.py:1204] (2/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_442-320317-0_sp1.1 from training. Duration: 0.7454375 2023-10-02 06:54:52,318 WARNING [train.py:1204] (2/4) Exclude cut with ID _55429_210_10626_1_1534467588527_7174143_953-80868-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 06:54:52,390 WARNING [train.py:1204] (2/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_159-328157-0_sp0.9 from training. Duration: 0.57775 2023-10-02 06:54:52,397 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9309W0197-640324 from training. Duration: 0.896 2023-10-02 06:54:54,010 WARNING [train.py:1204] (2/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_361-30822-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 06:55:00,896 WARNING [train.py:1204] (2/4) Exclude cut with ID _5250_210_5219_1_1527249561806_8692110_636-251213-0 from training. Duration: 0.94 2023-10-02 06:55:01,277 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.balancer1.prob, batch_count=789706.6666666666, ans=0.125 2023-10-02 06:55:03,867 WARNING [train.py:1204] (2/4) Exclude cut with ID _49728_210_12651_1_1534041034294_6788150_468-333827-0_sp1.1 from training. Duration: 0.963625 2023-10-02 06:55:05,298 WARNING [train.py:1204] (2/4) Exclude cut with ID _25882_210_3036_1_1532775665815_6479510_623-191759-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 06:55:07,460 WARNING [train.py:1204] (2/4) Exclude cut with ID _5189_210_4557_1_1527248319428_3769229_199-53526-0 from training. Duration: 0.42 2023-10-02 06:55:08,718 WARNING [train.py:1204] (2/4) Exclude cut with ID _6274_210_2084_1_1527937583837_7494020_32-205288-0_sp0.9 from training. Duration: 0.8666875 2023-10-02 06:55:08,812 WARNING [train.py:1204] (2/4) Exclude cut with ID _65263_210_22356_1_1535763598691_3921037_401-346646-0_sp1.1 from training. Duration: 0.963625 2023-10-02 06:55:08,816 WARNING [train.py:1204] (2/4) Exclude cut with ID _12555_210_8750_1_1530075594402_3780239_449-267986-0_sp1.1 from training. Duration: 0.7545625 2023-10-02 06:55:10,082 WARNING [train.py:1204] (2/4) Exclude cut with ID _69884_210_9771_1_1535846392818_4166169_430-201020-0_sp1.1 from training. Duration: 0.8818125 2023-10-02 06:55:10,167 WARNING [train.py:1204] (2/4) Exclude cut with ID _8394_210_6702_1_1528590504316_3540189_379-49457-0_sp1.1 from training. Duration: 0.7909375 2023-10-02 06:55:14,640 INFO [train.py:1046] (2/4) Epoch 23, batch 1600, loss[loss=0.1899, simple_loss=0.2714, pruned_loss=0.05418, over 24043.00 frames. ], tot_loss[loss=0.1739, simple_loss=0.2503, pruned_loss=0.04877, over 4698952.17 frames. ], batch size: 80, lr: 4.48e-03, grad_scale: 32.0 2023-10-02 06:55:14,731 WARNING [train.py:1204] (2/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_200-105218-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 06:55:14,786 WARNING [train.py:1204] (2/4) Exclude cut with ID _3867_210_1883_1_1526641200575_4475830_319-349850-0 from training. Duration: 0.67 2023-10-02 06:55:16,163 WARNING [train.py:1204] (2/4) Exclude cut with ID _28317_210_12742_1_1532480302381_8510325_916-195848-0 from training. Duration: 0.92 2023-10-02 06:55:17,545 WARNING [train.py:1204] (2/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_436-186311-0 from training. Duration: 0.65 2023-10-02 06:55:20,587 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_3910_210_1794_1_1526777667_3747260_302-180617-0_sp1.1 from training. Duration: 0.97275 2023-10-02 06:55:20,691 WARNING [train.py:1204] (2/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_102-92751-0 from training. Duration: 0.99 2023-10-02 06:55:21,994 WARNING [train.py:1204] (2/4) Exclude cut with ID _51899_210_20034_1_1534129254466_3669220_265-215428-0_sp1.1 from training. Duration: 0.8909375 2023-10-02 06:55:23,601 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.conv_module1.balancer1.prob, batch_count=789773.3333333334, ans=0.125 2023-10-02 06:55:24,779 WARNING [train.py:1204] (2/4) Exclude cut with ID _58583_210_5236_1_1534726835266_3574249_438-198993-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 06:55:29,167 WARNING [train.py:1204] (2/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_166-166990-0_sp1.1 from training. Duration: 0.87275 2023-10-02 06:55:32,014 WARNING [train.py:1204] (2/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_907-151398-0 from training. Duration: 0.37 2023-10-02 06:55:34,636 WARNING [train.py:1204] (2/4) Exclude cut with ID _38670_210_15379_1_1533448810659_4391059_452-256669-0_sp1.1 from training. Duration: 0.936375 2023-10-02 06:55:35,883 WARNING [train.py:1204] (2/4) Exclude cut with ID _48677_210_5663_1_1533902405919_3892989_408-40740-0 from training. Duration: 0.83 2023-10-02 06:55:35,940 WARNING [train.py:1204] (2/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_903-158756-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 06:55:35,984 WARNING [train.py:1204] (2/4) Exclude cut with ID _13252_210_1925_1_1530671372047_3336909_397-216497-0 from training. Duration: 0.96 2023-10-02 06:55:43,320 WARNING [train.py:1204] (2/4) Exclude cut with ID _55429_210_10626_1_1534467588527_7174143_97-335084-0 from training. Duration: 0.97 2023-10-02 06:55:46,669 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.548e+02 1.856e+02 2.043e+02 2.277e+02 4.874e+02, threshold=4.086e+02, percent-clipped=2.0 2023-10-02 06:55:51,032 WARNING [train.py:1204] (2/4) Exclude cut with ID _45329_210_7005_1_1534157951211_6774339_453-267730-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 06:55:51,108 WARNING [train.py:1204] (2/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_153-97441-0 from training. Duration: 0.56 2023-10-02 06:55:52,488 WARNING [train.py:1204] (2/4) Exclude cut with ID _40901_210_5665_1_1533363789958_3856130_312-329122-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 06:55:52,540 WARNING [train.py:1204] (2/4) Exclude cut with ID _30800_210_22382_1_1532499347904_4515959_97-126471-0_sp1.1 from training. Duration: 0.97275 2023-10-02 06:55:52,541 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_11305_210_1794_1_1529750428_7178148_257-312870-0_sp1.1 from training. Duration: 0.8 2023-10-02 06:55:56,032 WARNING [train.py:1204] (2/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_454-314901-0_sp0.9 from training. Duration: 0.511125 2023-10-02 06:55:56,256 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.feed_forward3.hidden_balancer.prob, batch_count=789906.6666666666, ans=0.125 2023-10-02 06:55:56,294 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.feed_forward1.out_proj.dropout_p, batch_count=789906.6666666666, ans=0.1 2023-10-02 06:56:00,167 WARNING [train.py:1204] (2/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_282-136641-0_sp1.1 from training. Duration: 0.3818125 2023-10-02 06:56:01,557 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_46013_210_7234_1_1533776362_3744032_493-160724-0_sp1.1 from training. Duration: 0.963625 2023-10-02 06:56:01,594 WARNING [train.py:1204] (2/4) Exclude cut with ID _67831_210_12392_1_1535612464202_3092612_259-33442-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 06:56:02,888 WARNING [train.py:1204] (2/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_289-62384-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 06:56:03,119 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.balancer2.prob, batch_count=789973.3333333334, ans=0.125 2023-10-02 06:56:04,215 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6545_210_2040_1_1527774670_4245258_379-14182-0_sp0.9 from training. Duration: 0.888875 2023-10-02 06:56:05,673 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_548-294316-0_sp1.1 from training. Duration: 0.736375 2023-10-02 06:56:07,044 WARNING [train.py:1204] (2/4) Exclude cut with ID _6275_210_2084_1_1528110062213_7669049_857-298942-0_sp1.1 from training. Duration: 0.77275 2023-10-02 06:56:08,911 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6673_210_3273_1_1527767790_3173240_327-306185-0_sp1.1 from training. Duration: 0.7545625 2023-10-02 06:56:12,439 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.conv_module1.balancer2.prob, batch_count=790040.0, ans=0.125 2023-10-02 06:56:14,986 WARNING [train.py:1204] (2/4) Exclude cut with ID _61008_210_10633_1_1534852769198_6974370_814-107647-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 06:56:16,824 WARNING [train.py:1204] (2/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_116-332008-0_sp1.1 from training. Duration: 0.8909375 2023-10-02 06:56:17,113 INFO [scaling.py:1118] (2/4) WithLoss: name=encoder.encoders.3.encoder.layers.1.self_attn_weights, loss-sum=0.000e+00 2023-10-02 06:56:18,299 WARNING [train.py:1204] (2/4) Exclude cut with ID _32704_210_8954_1_1533017488840_6747370_266-329755-0 from training. Duration: 0.89 2023-10-02 06:56:18,302 WARNING [train.py:1204] (2/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_271-269084-0_sp0.9 from training. Duration: 0.92225 2023-10-02 06:56:18,368 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_3171_210_2087_1_1526109084_2330007_66-260583-0 from training. Duration: 0.72 2023-10-02 06:56:20,010 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.0.conv_module2.balancer2.min_positive, batch_count=790040.0, ans=0.05 2023-10-02 06:56:24,047 WARNING [train.py:1204] (2/4) Exclude cut with ID _5250_210_5219_1_1527249561806_8692110_1326-276386-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 06:56:25,402 WARNING [train.py:1204] (2/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_372-30302-0_sp1.1 from training. Duration: 0.7818125 2023-10-02 06:56:25,451 WARNING [train.py:1204] (2/4) Exclude cut with ID _25882_210_3036_1_1532775665815_6479510_631-257029-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 06:56:27,186 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_3691_210_3528_1_1526468169_4062053_355-334432-0 from training. Duration: 0.87 2023-10-02 06:56:27,207 WARNING [train.py:1204] (2/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_839-173017-0_sp1.1 from training. Duration: 0.37275 2023-10-02 06:56:27,215 WARNING [train.py:1204] (2/4) Exclude cut with ID _14998_210_5219_1_1530705582080_7474970_441-91466-0 from training. Duration: 0.97 2023-10-02 06:56:27,253 WARNING [train.py:1204] (2/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_108-53929-0 from training. Duration: 0.59 2023-10-02 06:56:28,503 INFO [train.py:1046] (2/4) Epoch 23, batch 1650, loss[loss=0.1781, simple_loss=0.2563, pruned_loss=0.04992, over 23717.00 frames. ], tot_loss[loss=0.1746, simple_loss=0.2509, pruned_loss=0.04916, over 4703204.54 frames. ], batch size: 85, lr: 4.48e-03, grad_scale: 16.0 2023-10-02 06:56:31,482 WARNING [train.py:1204] (2/4) Exclude cut with ID _9389_210_1664_1_1529217337301_5289269_260-1942-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 06:56:31,546 WARNING [train.py:1204] (2/4) Exclude cut with ID _56423_210_5854_1_1534896061049_3165969_198-61547-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 06:56:31,574 WARNING [train.py:1204] (2/4) Exclude cut with ID _44778_210_2054_1_1533798127203_7443090_412-142122-0_sp1.1 from training. Duration: 0.92725 2023-10-02 06:56:31,602 WARNING [train.py:1204] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_88-341718-0_sp1.1 from training. Duration: 0.6 2023-10-02 06:56:34,283 WARNING [train.py:1204] (2/4) Exclude cut with ID _61079_210_4525_1_1534921008198_4011419_334-298697-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 06:56:35,648 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_5465_210_4211_1_1527227253_55357_8-2499-0_sp1.1 from training. Duration: 0.996375 2023-10-02 06:56:37,096 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_447-201565-0_sp1.1 from training. Duration: 0.97275 2023-10-02 06:56:37,862 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.self_attn_weights.pos_emb_skip_rate, batch_count=790106.6666666666, ans=0.0 2023-10-02 06:56:37,891 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.conv_module2.balancer1.prob, batch_count=790106.6666666666, ans=0.125 2023-10-02 06:56:38,869 WARNING [train.py:1204] (2/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_253-161789-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 06:56:38,872 WARNING [train.py:1204] (2/4) Exclude cut with ID _44304_210_15374_1_1533639352765_4613129_302-22084-0_sp0.9 from training. Duration: 0.9555625 2023-10-02 06:56:38,890 WARNING [train.py:1204] (2/4) Exclude cut with ID _26664_210_7770_1_1532258943280_3418270_187-80815-0_sp1.1 from training. Duration: 0.8454375 2023-10-02 06:56:40,313 WARNING [train.py:1204] (2/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_68-212801-0 from training. Duration: 0.73 2023-10-02 06:56:40,353 WARNING [train.py:1204] (2/4) Exclude cut with ID _29345_210_4381_1_1532689168263_3860169_149-257268-0 from training. Duration: 0.81 2023-10-02 06:56:46,556 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_213-103282-0_sp1.1 from training. Duration: 0.7090625 2023-10-02 06:56:47,966 WARNING [train.py:1204] (2/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_156-302675-0_sp1.1 from training. Duration: 0.5 2023-10-02 06:56:53,636 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.feed_forward2.hidden_balancer.prob, batch_count=790173.3333333334, ans=0.125 2023-10-02 06:57:00,255 WARNING [train.py:1204] (2/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_125-322172-0 from training. Duration: 0.93 2023-10-02 06:57:02,263 WARNING [train.py:1204] (2/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_493-60474-0_sp1.1 from training. Duration: 0.963625 2023-10-02 06:57:03,884 WARNING [train.py:1204] (2/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_156-302675-0 from training. Duration: 0.55 2023-10-02 06:57:05,397 WARNING [train.py:1204] (2/4) Exclude cut with ID _41314_210_12651_1_1533459476543_3766460_123-267862-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 06:57:06,915 WARNING [train.py:1204] (2/4) Exclude cut with ID _28427_210_4220_1_1532581240967_3061069_201-143747-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 06:57:08,294 WARNING [train.py:1204] (2/4) Exclude cut with ID _8772_210_1850_1_1528635595972_3664011_96-12756-0_sp1.1 from training. Duration: 0.8909375 2023-10-02 06:57:08,334 WARNING [train.py:1204] (2/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_1347-145675-0_sp1.1 from training. Duration: 0.936375 2023-10-02 06:57:10,855 WARNING [train.py:1204] (2/4) Exclude cut with ID _9235_210_5219_1_1528891154253_8046120_387-159008-0_sp1.1 from training. Duration: 0.7818125 2023-10-02 06:57:10,886 WARNING [train.py:1204] (2/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_400-201896-0_sp1.1 from training. Duration: 0.963625 2023-10-02 06:57:12,870 WARNING [train.py:1204] (2/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_326-174612-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 06:57:14,217 WARNING [train.py:1204] (2/4) Exclude cut with ID _55844_210_2346_1_1534471212569_7625707_390-116233-0_sp1.1 from training. Duration: 0.963625 2023-10-02 06:57:14,255 WARNING [train.py:1204] (2/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_419-317062-0_sp1.1 from training. Duration: 0.82725 2023-10-02 06:57:15,627 WARNING [train.py:1204] (2/4) Exclude cut with ID _29877_210_6228_1_1532397791106_5276149_266-62291-0_sp0.9 from training. Duration: 0.9666875 2023-10-02 06:57:15,693 WARNING [train.py:1204] (2/4) Exclude cut with ID _7857_210_1534_1_1528286515745_6831470_431-338528-0_sp1.1 from training. Duration: 0.92725 2023-10-02 06:57:15,763 WARNING [train.py:1204] (2/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_87-131427-0_sp0.9 from training. Duration: 0.8666875 2023-10-02 06:57:16,054 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.feed_forward2.hidden_balancer.prob, batch_count=790306.6666666666, ans=0.125 2023-10-02 06:57:19,155 WARNING [train.py:1204] (2/4) Exclude cut with ID _24055_210_12651_1_1532310907143_976979_92-59356-0_sp1.1 from training. Duration: 0.82725 2023-10-02 06:57:19,236 WARNING [train.py:1204] (2/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_96-290039-0 from training. Duration: 0.56 2023-10-02 06:57:21,952 WARNING [train.py:1204] (2/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_48-182155-0_sp0.9 from training. Duration: 0.9666875 2023-10-02 06:57:22,008 WARNING [train.py:1204] (2/4) Exclude cut with ID _4626_210_3532_1_1526817652717_3916859_446-206656-0 from training. Duration: 0.76 2023-10-02 06:57:23,259 WARNING [train.py:1204] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_168-32906-0 from training. Duration: 0.69 2023-10-02 06:57:24,564 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_3780_210_2087_1_1526467760_2130483_170-259044-0 from training. Duration: 0.7 2023-10-02 06:57:24,582 WARNING [train.py:1204] (2/4) Exclude cut with ID _65134_210_14763_1_1535335393985_3505368_153-216718-0_sp1.1 from training. Duration: 0.92725 2023-10-02 06:57:24,644 WARNING [train.py:1204] (2/4) Exclude cut with ID _64639_210_5816_1_1535367619437_3528000_461-329156-0_sp1.1 from training. Duration: 0.87275 2023-10-02 06:57:24,681 WARNING [train.py:1204] (2/4) Exclude cut with ID _21989_210_5854_1_1531899214931_3271360_126-347531-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 06:57:26,028 WARNING [train.py:1204] (2/4) Exclude cut with ID _7614_210_4915_1_1529326764092_4537359_96-162197-0_sp1.1 from training. Duration: 0.963625 2023-10-02 06:57:26,029 WARNING [train.py:1204] (2/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_1083-147536-0 from training. Duration: 0.97 2023-10-02 06:57:31,511 WARNING [train.py:1204] (2/4) Exclude cut with ID _64029_210_4381_1_1535178502772_3756459_275-16710-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 06:57:33,491 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7264_210_4938_1_1527939911_8132095_800-91995-0_sp0.9 from training. Duration: 0.9555625 2023-10-02 06:57:34,790 WARNING [train.py:1204] (2/4) Exclude cut with ID _3844_210_1390_1_1526727803582_2871209_50-193879-0_sp1.1 from training. Duration: 0.936375 2023-10-02 06:57:36,379 WARNING [train.py:1204] (2/4) Exclude cut with ID _46952_210_5986_1_1534230010357_3107379_232-344503-0 from training. Duration: 0.96 2023-10-02 06:57:40,615 WARNING [train.py:1204] (2/4) Exclude cut with ID _25808_210_13292_1_1532343514562_7366109_206-14076-0_sp1.1 from training. Duration: 0.936375 2023-10-02 06:57:40,616 WARNING [train.py:1204] (2/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_55-31904-0_sp0.9 from training. Duration: 0.97775 2023-10-02 06:57:40,634 WARNING [train.py:1204] (2/4) Exclude cut with ID _14632_210_9988_1_1530691279392_3923200_161-129530-0 from training. Duration: 0.96 2023-10-02 06:57:41,918 INFO [train.py:1046] (2/4) Epoch 23, batch 1700, loss[loss=0.1591, simple_loss=0.2442, pruned_loss=0.03699, over 24647.00 frames. ], tot_loss[loss=0.1739, simple_loss=0.2506, pruned_loss=0.04864, over 4704588.91 frames. ], batch size: 65, lr: 4.48e-03, grad_scale: 16.0 2023-10-02 06:57:42,654 WARNING [train.py:1204] (2/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_770-200430-0_sp1.1 from training. Duration: 0.8090625 2023-10-02 06:57:42,670 WARNING [train.py:1204] (2/4) Exclude cut with ID _6270_210_2084_1_1527850911573_3856420_26-277343-0_sp1.1 from training. Duration: 0.7454375 2023-10-02 06:57:42,677 WARNING [train.py:1204] (2/4) Exclude cut with ID _69884_210_9771_1_1535846392818_4166169_88-23776-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 06:57:45,919 WARNING [train.py:1204] (2/4) Exclude cut with ID _60860_210_8254_1_1534896015875_3557169_245-256666-0_sp1.1 from training. Duration: 0.8818125 2023-10-02 06:57:45,949 WARNING [train.py:1204] (2/4) Exclude cut with ID _5250_210_5219_1_1527249561806_8692110_636-251213-0_sp1.1 from training. Duration: 0.8545625 2023-10-02 06:57:47,233 WARNING [train.py:1204] (2/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_540-131164-0 from training. Duration: 0.92 2023-10-02 06:57:49,358 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.1.feed_forward2.hidden_balancer.prob, batch_count=790440.0, ans=0.125 2023-10-02 06:57:50,515 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_5931_210_2040_1_1527428481_4547999_321-193614-0_sp0.9 from training. Duration: 0.7444375 2023-10-02 06:57:58,647 WARNING [train.py:1204] (2/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_373-225403-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 06:58:00,119 WARNING [train.py:1204] (2/4) Exclude cut with ID _20622_210_3852_1_1531622427080_1093839_57-326027-0_sp1.1 from training. Duration: 0.97275 2023-10-02 06:58:04,449 WARNING [train.py:1204] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_605-257205-0_sp1.1 from training. Duration: 0.536375 2023-10-02 06:58:06,367 WARNING [train.py:1204] (2/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_442-320317-0_sp0.9 from training. Duration: 0.911125 2023-10-02 06:58:06,412 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_35361_210_4238_1_1533262810_7793476_675-87012-0_sp1.1 from training. Duration: 0.8090625 2023-10-02 06:58:06,439 WARNING [train.py:1204] (2/4) Exclude cut with ID _4184_210_2550_1_1526730192013_2219380_306-44736-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 06:58:09,215 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_124-113562-0 from training. Duration: 0.94 2023-10-02 06:58:10,477 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_2821_210_2715_1_1526108385_3763560_73-296673-0_sp0.9 from training. Duration: 0.92225 2023-10-02 06:58:10,491 WARNING [train.py:1204] (2/4) Exclude cut with ID _10301_210_5886_1_1529222275332_3910620_216-46959-0_sp1.1 from training. Duration: 0.92725 2023-10-02 06:58:12,422 WARNING [train.py:1204] (2/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_91-277684-0_sp0.9 from training. Duration: 0.82225 2023-10-02 06:58:13,608 WARNING [train.py:1204] (2/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_351-294650-0_sp0.9 from training. Duration: 0.7 2023-10-02 06:58:15,625 WARNING [train.py:1204] (2/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_209-205365-0 from training. Duration: 0.79 2023-10-02 06:58:15,863 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.ff2_skip_rate, batch_count=790573.3333333334, ans=0.0 2023-10-02 06:58:16,795 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.566e+02 1.870e+02 2.002e+02 2.327e+02 4.481e+02, threshold=4.004e+02, percent-clipped=3.0 2023-10-02 06:58:16,879 WARNING [train.py:1204] (2/4) Exclude cut with ID _14603_210_4220_1_1530615688441_3097319_97-325053-0 from training. Duration: 0.92 2023-10-02 06:58:18,264 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_49348_210_7927_1_1533954500_3691300_109-276430-0_sp1.1 from training. Duration: 0.92725 2023-10-02 06:58:20,363 WARNING [train.py:1204] (2/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_912-47473-0 from training. Duration: 0.68 2023-10-02 06:58:20,445 WARNING [train.py:1204] (2/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_96-320736-0_sp0.9 from training. Duration: 0.9555625 2023-10-02 06:58:30,006 WARNING [train.py:1204] (2/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_1252-235481-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 06:58:30,102 WARNING [train.py:1204] (2/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_813-261554-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 06:58:30,169 WARNING [train.py:1204] (2/4) Exclude cut with ID _21632_210_2253_1_1531994001765_3744140_110-274654-0_sp0.9 from training. Duration: 0.911125 2023-10-02 06:58:31,190 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.0.layers.1.nonlin_attention.whiten1, num_groups=1, num_channels=144, metric=5.49 vs. limit=10.0 2023-10-02 06:58:33,101 WARNING [train.py:1204] (2/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_381-317791-0_sp1.1 from training. Duration: 0.663625 2023-10-02 06:58:33,119 WARNING [train.py:1204] (2/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_51-117576-0_sp1.1 from training. Duration: 0.363625 2023-10-02 06:58:33,143 WARNING [train.py:1204] (2/4) Exclude cut with ID _42321_210_7770_1_1533468631352_4366895_104-215915-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 06:58:35,913 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_40896_210_15710_1_1533433956_7100802_216-22824-0_sp1.1 from training. Duration: 0.92725 2023-10-02 06:58:35,913 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_14831_210_7927_1_1530700200_5583610_181-27467-0 from training. Duration: 0.91 2023-10-02 06:58:37,801 WARNING [train.py:1204] (2/4) Exclude cut with ID _30304_210_6319_1_1532476827261_7582060_1024-265645-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 06:58:37,803 WARNING [train.py:1204] (2/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_412-233904-0_sp1.1 from training. Duration: 0.936375 2023-10-02 06:58:37,826 WARNING [train.py:1204] (2/4) Exclude cut with ID _30191_210_1664_1_1533189915047_7196670_911-223461-0_sp1.1 from training. Duration: 0.92725 2023-10-02 06:58:37,834 WARNING [train.py:1204] (2/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_558-148266-0_sp1.1 from training. Duration: 0.963625 2023-10-02 06:58:40,562 WARNING [train.py:1204] (2/4) Exclude cut with ID _58480_210_12392_1_1534811121111_3646160_212-164495-0_sp1.1 from training. Duration: 0.936375 2023-10-02 06:58:40,564 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_454-276429-0_sp0.9 from training. Duration: 0.9666875 2023-10-02 06:58:42,381 WARNING [train.py:1204] (2/4) Exclude cut with ID _24736_210_3279_1_1532332894218_2838700_73-197086-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 06:58:43,727 WARNING [train.py:1204] (2/4) Exclude cut with ID _5294_210_1747_1_1527249643928_3696179_529-242298-0_sp1.1 from training. Duration: 0.863625 2023-10-02 06:58:43,769 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_53555_210_5632_1_1534595220_7232297_345-171438-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 06:58:48,452 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_28211_210_6388_1_1532861506_4523224_22-25766-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 06:58:49,797 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7002_210_1794_1_1527939683_5157286_163-87552-0 from training. Duration: 0.86 2023-10-02 06:58:51,237 WARNING [train.py:1204] (2/4) Exclude cut with ID _56927_210_9969_1_1534848970840_3695229_409-225129-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 06:58:52,623 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_26192_210_7927_1_1532137269_1040115_21-219331-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 06:58:54,700 WARNING [train.py:1204] (2/4) Exclude cut with ID _15012_210_1968_1_1530856820429_5095350_662-210469-0 from training. Duration: 0.57 2023-10-02 06:58:57,203 INFO [train.py:1046] (2/4) Epoch 23, batch 1750, loss[loss=0.1717, simple_loss=0.2435, pruned_loss=0.04997, over 23765.00 frames. ], tot_loss[loss=0.1725, simple_loss=0.2484, pruned_loss=0.04825, over 4694863.89 frames. ], batch size: 179, lr: 4.48e-03, grad_scale: 16.0 2023-10-02 06:59:00,210 WARNING [train.py:1204] (2/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_582-191016-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 06:59:01,619 WARNING [train.py:1204] (2/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_270-210032-0_sp1.1 from training. Duration: 0.963625 2023-10-02 06:59:01,646 WARNING [train.py:1204] (2/4) Exclude cut with ID _6789_210_1534_1_1527857937218_3118950_262-48853-0_sp0.9 from training. Duration: 0.62225 2023-10-02 06:59:02,947 WARNING [train.py:1204] (2/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_847-41334-0 from training. Duration: 0.44 2023-10-02 06:59:02,986 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_40896_210_15710_1_1533433956_7100802_573-46532-0_sp1.1 from training. Duration: 0.92725 2023-10-02 06:59:04,589 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.conv_module2.balancer1.max_abs, batch_count=790773.3333333334, ans=10.0 2023-10-02 06:59:05,820 WARNING [train.py:1204] (2/4) Exclude cut with ID _21881_210_3525_1_1531825242757_3798380_247-102591-0_sp1.1 from training. Duration: 0.8909375 2023-10-02 06:59:05,840 WARNING [train.py:1204] (2/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_200-21395-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 06:59:08,690 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.bypass.scale_min, batch_count=790773.3333333334, ans=0.2 2023-10-02 06:59:12,248 WARNING [train.py:1204] (2/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_711-15688-0 from training. Duration: 0.43 2023-10-02 06:59:15,025 WARNING [train.py:1204] (2/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_53-75025-0_sp1.1 from training. Duration: 0.963625 2023-10-02 06:59:16,435 WARNING [train.py:1204] (2/4) Exclude cut with ID _6194_210_1534_1_1527511826160_3941194_321-148641-0 from training. Duration: 0.64 2023-10-02 06:59:16,474 WARNING [train.py:1204] (2/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_337-294431-0_sp1.1 from training. Duration: 0.936375 2023-10-02 06:59:16,688 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=790840.0, ans=0.1 2023-10-02 06:59:17,913 WARNING [train.py:1204] (2/4) Exclude cut with ID _21632_210_2253_1_1531994001765_3744140_110-274654-0_sp1.1 from training. Duration: 0.7454375 2023-10-02 06:59:19,828 WARNING [train.py:1204] (2/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_135-174080-0_sp1.1 from training. Duration: 0.4818125 2023-10-02 06:59:20,762 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.0.layers.0.feed_forward2.out_whiten, num_groups=1, num_channels=192, metric=6.60 vs. limit=15.0 2023-10-02 06:59:21,193 WARNING [train.py:1204] (2/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_21-174514-0 from training. Duration: 0.43 2023-10-02 06:59:22,637 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_38965_210_4852_1_1533174906_3484735_215-294589-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 06:59:23,961 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_548-294316-0 from training. Duration: 0.81 2023-10-02 06:59:31,379 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_103-84010-0_sp1.1 from training. Duration: 0.62725 2023-10-02 06:59:32,903 WARNING [train.py:1204] (2/4) Exclude cut with ID _18199_210_6109_1_1531475789864_4288790_295-198247-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 06:59:32,932 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_13757_210_3528_1_1530440682_4107239_103-313224-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 06:59:35,748 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_34886_210_10633_1_1533203882_1755661_67-294673-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 06:59:35,760 WARNING [train.py:1204] (2/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_350-101734-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 06:59:38,289 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_37357_210_17024_1_1533089504_2316121_204-153986-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 06:59:39,080 WARNING [train.py:1204] (2/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_673-115569-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 06:59:41,799 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_489-230703-0_sp0.9 from training. Duration: 0.9444375 2023-10-02 06:59:41,867 WARNING [train.py:1204] (2/4) Exclude cut with ID _5250_210_5219_1_1527249561806_8692110_575-102496-0_sp1.1 from training. Duration: 0.936375 2023-10-02 06:59:43,184 WARNING [train.py:1204] (2/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_669-93163-0 from training. Duration: 0.98 2023-10-02 06:59:45,753 WARNING [train.py:1204] (2/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_249-281349-0_sp0.9 from training. Duration: 0.9444375 2023-10-02 06:59:48,929 WARNING [train.py:1204] (2/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_106-30782-0 from training. Duration: 0.8 2023-10-02 06:59:49,014 WARNING [train.py:1204] (2/4) Exclude cut with ID _8772_210_1850_1_1528635595972_3664011_398-320049-0_sp1.1 from training. Duration: 0.8 2023-10-02 06:59:50,454 WARNING [train.py:1204] (2/4) Exclude cut with ID _44778_210_2054_1_1533798127203_7443090_516-151727-0_sp1.1 from training. Duration: 0.963625 2023-10-02 06:59:51,812 WARNING [train.py:1204] (2/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_325-62802-0_sp1.1 from training. Duration: 0.7909375 2023-10-02 06:59:56,370 WARNING [train.py:1204] (2/4) Exclude cut with ID _14368_210_4220_1_1530532714870_3595529_93-58002-0_sp0.9 from training. Duration: 0.6333125 2023-10-02 06:59:57,739 WARNING [train.py:1204] (2/4) Exclude cut with ID _4681_210_3525_1_1527251040321_1235713_123-24428-0_sp1.1 from training. Duration: 0.436375 2023-10-02 06:59:58,116 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=791040.0, ans=0.1 2023-10-02 06:59:59,127 WARNING [train.py:1204] (2/4) Exclude cut with ID _5250_210_5219_1_1527249561806_8692110_1185-56473-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 07:00:00,555 WARNING [train.py:1204] (2/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_96-244299-0_sp1.1 from training. Duration: 0.8 2023-10-02 07:00:00,815 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.feed_forward1.hidden_balancer.prob, batch_count=791040.0, ans=0.125 2023-10-02 07:00:04,796 WARNING [train.py:1204] (2/4) Exclude cut with ID _59500_210_8254_1_1534809610680_3736009_104-72815-0_sp1.1 from training. Duration: 0.963625 2023-10-02 07:00:06,202 WARNING [train.py:1204] (2/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_468-283113-0_sp1.1 from training. Duration: 0.87275 2023-10-02 07:00:07,615 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6239_210_4211_1_1527765584_4306698_528-102186-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 07:00:08,959 WARNING [train.py:1204] (2/4) Exclude cut with ID _45297_210_2721_1_1533715351381_3264949_351-147136-0 from training. Duration: 0.87 2023-10-02 07:00:08,964 WARNING [train.py:1204] (2/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_520-51744-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 07:00:09,068 WARNING [train.py:1204] (2/4) Exclude cut with ID _5166_210_2084_1_1527073197160_4087993_310-114452-0_sp1.1 from training. Duration: 0.536375 2023-10-02 07:00:09,072 WARNING [train.py:1204] (2/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_450-165710-0_sp1.1 from training. Duration: 0.92725 2023-10-02 07:00:09,086 WARNING [train.py:1204] (2/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_280-120942-0_sp1.1 from training. Duration: 0.7 2023-10-02 07:00:09,113 WARNING [train.py:1204] (2/4) Exclude cut with ID _65585_210_12210_1_1535450451584_3764098_112-294301-0_sp0.9 from training. Duration: 0.97775 2023-10-02 07:00:10,934 INFO [train.py:1046] (2/4) Epoch 23, batch 1800, loss[loss=0.1671, simple_loss=0.2526, pruned_loss=0.04074, over 24464.00 frames. ], tot_loss[loss=0.1718, simple_loss=0.2476, pruned_loss=0.04801, over 4704209.20 frames. ], batch size: 66, lr: 4.48e-03, grad_scale: 16.0 2023-10-02 07:00:10,999 WARNING [train.py:1204] (2/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_98-51676-0_sp0.9 from training. Duration: 0.811125 2023-10-02 07:00:13,803 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_33348_210_5632_1_1533086912_3324030_85-28583-0_sp1.1 from training. Duration: 0.7454375 2023-10-02 07:00:15,121 WARNING [train.py:1204] (2/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_336-208232-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 07:00:18,132 WARNING [train.py:1204] (2/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_339-104791-0_sp0.9 from training. Duration: 0.5444375 2023-10-02 07:00:19,486 WARNING [train.py:1204] (2/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_559-29840-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 07:00:22,298 WARNING [train.py:1204] (2/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_117-290916-0_sp0.9 from training. Duration: 0.5666875 2023-10-02 07:00:22,392 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_43802_210_2290_1_1533516203_8829504_185-112289-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 07:00:25,934 WARNING [train.py:1204] (2/4) Exclude cut with ID _59256_210_5986_1_1535076085997_3589120_155-298797-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 07:00:28,681 WARNING [train.py:1204] (2/4) Exclude cut with ID _44659_210_2721_1_1534036120061_6614860_246-206531-0_sp1.1 from training. Duration: 0.92725 2023-10-02 07:00:29,987 WARNING [train.py:1204] (2/4) Exclude cut with ID _23818_210_6228_1_1531917063884_3676920_469-151944-0_sp1.1 from training. Duration: 0.92725 2023-10-02 07:00:31,385 WARNING [train.py:1204] (2/4) Exclude cut with ID _13252_210_1925_1_1530671372047_3336909_88-276668-0_sp1.1 from training. Duration: 0.9 2023-10-02 07:00:32,871 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_15101_210_7927_1_1530786415_5033274_320-327527-0_sp1.1 from training. Duration: 0.87275 2023-10-02 07:00:32,880 WARNING [train.py:1204] (2/4) Exclude cut with ID _14998_210_5219_1_1530705582080_7474970_446-112117-0 from training. Duration: 0.9 2023-10-02 07:00:32,955 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_30280_210_1881_1_1532872779_3660139_400-254159-0_sp1.1 from training. Duration: 0.936375 2023-10-02 07:00:35,723 WARNING [train.py:1204] (2/4) Exclude cut with ID _40284_210_2084_1_1533513679814_3704279_158-345341-0_sp1.1 from training. Duration: 0.936375 2023-10-02 07:00:39,861 WARNING [train.py:1204] (2/4) Exclude cut with ID 3033-130750-0096-55598-0 from training. Duration: 0.83 2023-10-02 07:00:41,855 WARNING [train.py:1204] (2/4) Exclude cut with ID _9389_210_1664_1_1529217337301_5289269_379-128794-0 from training. Duration: 0.85 2023-10-02 07:00:43,884 WARNING [train.py:1204] (2/4) Exclude cut with ID _3675_210_4557_1_1526639934408_4182089_456-18305-0 from training. Duration: 0.76 2023-10-02 07:00:43,912 WARNING [train.py:1204] (2/4) Exclude cut with ID _29853_210_7497_1_1532505600851_6642110_571-249460-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 07:00:45,212 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.572e+02 1.867e+02 2.054e+02 2.379e+02 4.412e+02, threshold=4.108e+02, percent-clipped=1.0 2023-10-02 07:00:45,317 WARNING [train.py:1204] (2/4) Exclude cut with ID _4068_210_2629_1_1526697350395_1546259_230-325568-0_sp1.1 from training. Duration: 0.92725 2023-10-02 07:00:45,317 WARNING [train.py:1204] (2/4) Exclude cut with ID _34415_210_8750_1_1533085213375_3451116_213-267570-0_sp1.1 from training. Duration: 0.963625 2023-10-02 07:00:45,399 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6767_210_3273_1_1527855783_1718932_142-127132-0_sp1.1 from training. Duration: 0.863625 2023-10-02 07:00:46,978 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.0.conv_module1.balancer1.prob, batch_count=791240.0, ans=0.125 2023-10-02 07:00:52,817 WARNING [train.py:1204] (2/4) Exclude cut with ID IC0653W0001-322925 from training. Duration: 0.896 2023-10-02 07:00:54,204 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_4528_210_3273_1_1527076654_3276777_209-219804-0_sp1.1 from training. Duration: 0.62725 2023-10-02 07:00:57,429 WARNING [train.py:1204] (2/4) Exclude cut with ID _55844_210_2346_1_1534471212569_7625707_187-204971-0_sp1.1 from training. Duration: 0.936375 2023-10-02 07:00:58,852 WARNING [train.py:1204] (2/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_369-325184-0 from training. Duration: 0.99 2023-10-02 07:00:59,105 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.nonlin_attention.balancer.min_positive, batch_count=791306.6666666666, ans=0.05 2023-10-02 07:01:00,170 WARNING [train.py:1204] (2/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_791-45149-0 from training. Duration: 0.86 2023-10-02 07:01:00,220 WARNING [train.py:1204] (2/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_317-211107-0_sp1.1 from training. Duration: 0.836375 2023-10-02 07:01:01,590 WARNING [train.py:1204] (2/4) Exclude cut with ID _30800_210_22382_1_1532499347904_4515959_488-330913-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 07:01:01,681 WARNING [train.py:1204] (2/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_506-42614-0_sp1.1 from training. Duration: 0.7181875 2023-10-02 07:01:02,003 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.feed_forward2.hidden_balancer.prob, batch_count=791306.6666666666, ans=0.125 2023-10-02 07:01:04,567 WARNING [train.py:1204] (2/4) Exclude cut with ID _8696_210_1876_1_1528545632440_3345058_209-54069-0 from training. Duration: 0.67 2023-10-02 07:01:11,725 WARNING [train.py:1204] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_215-202973-0_sp1.1 from training. Duration: 0.8 2023-10-02 07:01:11,794 WARNING [train.py:1204] (2/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_321-215742-0 from training. Duration: 0.5 2023-10-02 07:01:11,843 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_40896_210_15710_1_1533433956_7100802_428-32114-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 07:01:11,855 WARNING [train.py:1204] (2/4) Exclude cut with ID _27013_210_1389_1_1532437173240_3252649_467-263151-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 07:01:13,105 WARNING [train.py:1204] (2/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_734-129761-0_sp0.9 from training. Duration: 0.911125 2023-10-02 07:01:13,158 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_521-214276-0 from training. Duration: 0.44 2023-10-02 07:01:15,822 WARNING [train.py:1204] (2/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_80-147301-0_sp1.1 from training. Duration: 0.763625 2023-10-02 07:01:15,824 WARNING [train.py:1204] (2/4) Exclude cut with ID _9679_210_7497_1_1529129212906_4363929_56-307796-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 07:01:20,592 WARNING [train.py:1204] (2/4) Exclude cut with ID _14832_210_8750_1_1530691351539_3171348_1-54315-0 from training. Duration: 0.87 2023-10-02 07:01:20,592 WARNING [train.py:1204] (2/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_558-155750-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 07:01:22,022 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_142-309370-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 07:01:22,049 WARNING [train.py:1204] (2/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_18-46756-0_sp0.9 from training. Duration: 0.87775 2023-10-02 07:01:23,452 WARNING [train.py:1204] (2/4) Exclude cut with ID _61148_210_6897_1_1535094022726_4217610_279-212159-0_sp1.1 from training. Duration: 0.936375 2023-10-02 07:01:24,777 INFO [train.py:1046] (2/4) Epoch 23, batch 1850, loss[loss=0.1526, simple_loss=0.2439, pruned_loss=0.03071, over 24488.00 frames. ], tot_loss[loss=0.1716, simple_loss=0.248, pruned_loss=0.04766, over 4708841.83 frames. ], batch size: 71, lr: 4.48e-03, grad_scale: 8.0 2023-10-02 07:01:24,851 WARNING [train.py:1204] (2/4) Exclude cut with ID _21987_210_5854_1_1531821500482_3637038_164-2641-0_sp1.1 from training. Duration: 0.936375 2023-10-02 07:01:24,889 WARNING [train.py:1204] (2/4) Exclude cut with ID _8064_210_5399_1_1528372299291_4282973_395-340852-0_sp1.1 from training. Duration: 0.8454375 2023-10-02 07:01:27,041 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1077-184261-0_sp1.1 from training. Duration: 0.963625 2023-10-02 07:01:27,055 WARNING [train.py:1204] (2/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_1176-204992-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 07:01:31,126 WARNING [train.py:1204] (2/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_563-347832-0_sp1.1 from training. Duration: 0.8090625 2023-10-02 07:01:32,492 WARNING [train.py:1204] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_380-204756-0_sp0.9 from training. Duration: 0.9555625 2023-10-02 07:01:39,606 WARNING [train.py:1204] (2/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_212-10501-0_sp1.1 from training. Duration: 0.8545625 2023-10-02 07:01:39,633 WARNING [train.py:1204] (2/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_445-117398-0 from training. Duration: 0.89 2023-10-02 07:01:42,441 WARNING [train.py:1204] (2/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_29-280622-0 from training. Duration: 0.82 2023-10-02 07:01:43,852 WARNING [train.py:1204] (2/4) Exclude cut with ID _26664_210_7770_1_1532258943280_3418270_187-80815-0 from training. Duration: 0.93 2023-10-02 07:01:47,210 WARNING [train.py:1204] (2/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_414-349640-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 07:01:47,226 WARNING [train.py:1204] (2/4) Exclude cut with ID _45021_210_9994_1_1533619751094_3330288_96-239010-0 from training. Duration: 0.91 2023-10-02 07:01:47,229 WARNING [train.py:1204] (2/4) Exclude cut with ID _5189_210_4557_1_1527248319428_3769229_199-53526-0_sp0.9 from training. Duration: 0.4666875 2023-10-02 07:01:47,421 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.conv_module1.balancer1.min_positive, batch_count=791506.6666666666, ans=0.025 2023-10-02 07:01:53,893 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.3.encoder.layers.1.feed_forward1.out_whiten, num_groups=1, num_channels=512, metric=12.12 vs. limit=15.0 2023-10-02 07:01:55,505 INFO [scaling.py:1022] (2/4) Whitening: name=encoder_embed.out_whiten, num_groups=1, num_channels=192, metric=6.90 vs. limit=8.0 2023-10-02 07:01:57,786 WARNING [train.py:1204] (2/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_63-101996-0_sp1.1 from training. Duration: 0.8 2023-10-02 07:01:59,256 WARNING [train.py:1204] (2/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_199-86432-0 from training. Duration: 0.98 2023-10-02 07:02:00,705 WARNING [train.py:1204] (2/4) Exclude cut with ID _36399_210_16049_1_1532998771377_3698333_207-259013-0_sp1.1 from training. Duration: 0.92725 2023-10-02 07:02:00,733 WARNING [train.py:1204] (2/4) Exclude cut with ID _62574_210_2655_1_1535180407058_3933249_567-111756-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 07:02:06,254 WARNING [train.py:1204] (2/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_436-318113-0 from training. Duration: 0.75 2023-10-02 07:02:06,296 WARNING [train.py:1204] (2/4) Exclude cut with ID _53379_210_20034_1_1534222027363_3854654_43-305214-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 07:02:06,312 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_15126_210_6753_1_1530778677_2591185_113-157651-0_sp1.1 from training. Duration: 0.6181875 2023-10-02 07:02:07,679 WARNING [train.py:1204] (2/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_146-218763-0_sp1.1 from training. Duration: 0.7818125 2023-10-02 07:02:10,495 WARNING [train.py:1204] (2/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_714-310538-0_sp0.9 from training. Duration: 0.9555625 2023-10-02 07:02:11,955 WARNING [train.py:1204] (2/4) Exclude cut with ID _24271_210_12658_1_1531987270022_7190519_709-16721-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 07:02:12,105 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.1.feed_forward1.out_proj.dropout_p, batch_count=791640.0, ans=0.1 2023-10-02 07:02:16,585 WARNING [train.py:1204] (2/4) Exclude cut with ID _5190_210_4557_1_1527244368799_373439_28-197030-0_sp0.9 from training. Duration: 0.8 2023-10-02 07:02:16,619 WARNING [train.py:1204] (2/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_267-197536-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 07:02:16,648 WARNING [train.py:1204] (2/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_658-282610-0_sp1.1 from training. Duration: 0.4818125 2023-10-02 07:02:16,662 WARNING [train.py:1204] (2/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_257-67050-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 07:02:19,350 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_14906_210_7927_1_1530754190_9408633_961-53402-0_sp1.1 from training. Duration: 0.936375 2023-10-02 07:02:20,705 WARNING [train.py:1204] (2/4) Exclude cut with ID _60113_210_16271_1_1535164212481_4386079_490-280290-0_sp1.1 from training. Duration: 0.8909375 2023-10-02 07:02:22,949 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.1.conv_module1.balancer1.prob, batch_count=791706.6666666666, ans=0.125 2023-10-02 07:02:24,220 WARNING [train.py:1204] (2/4) Exclude cut with ID _14100_210_8341_1_1530529320613_3678069_258-224599-0 from training. Duration: 0.77 2023-10-02 07:02:24,275 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6961_210_2087_1_1527937168_6815011_565-326898-0_sp1.1 from training. Duration: 0.936375 2023-10-02 07:02:27,652 WARNING [train.py:1204] (2/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_239-316802-0_sp1.1 from training. Duration: 0.62725 2023-10-02 07:02:27,715 WARNING [train.py:1204] (2/4) Exclude cut with ID _55857_210_2346_1_1534644007831_6898209_745-98391-0_sp1.1 from training. Duration: 0.6909375 2023-10-02 07:02:27,719 WARNING [train.py:1204] (2/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_154-135696-0 from training. Duration: 0.8 2023-10-02 07:02:28,965 WARNING [train.py:1204] (2/4) Exclude cut with ID _14368_210_4220_1_1530532714870_3595529_93-58002-0 from training. Duration: 0.57 2023-10-02 07:02:29,494 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.5.encoder.layers.0.feed_forward2.out_whiten, num_groups=1, num_channels=256, metric=7.00 vs. limit=15.0 2023-10-02 07:02:30,422 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9025W0005-507335 from training. Duration: 0.896 2023-10-02 07:02:31,792 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9029W0034-509435 from training. Duration: 0.64 2023-10-02 07:02:33,178 WARNING [train.py:1204] (2/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_76-203867-0_sp1.1 from training. Duration: 0.6545625 2023-10-02 07:02:34,480 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_46639_210_8341_1_1533726088_4495500_66-346325-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 07:02:34,487 WARNING [train.py:1204] (2/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_145-137790-0_sp0.9 from training. Duration: 0.92225 2023-10-02 07:02:34,510 WARNING [train.py:1204] (2/4) Exclude cut with ID _36522_210_8887_1_1533177030611_1772478_7-327299-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 07:02:35,804 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9042W0009-515615 from training. Duration: 0.896 2023-10-02 07:02:35,812 WARNING [train.py:1204] (2/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_39-248870-0_sp0.9 from training. Duration: 0.7666875 2023-10-02 07:02:35,853 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_35441_210_16554_1_1533347572_4092680_362-242912-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 07:02:37,208 WARNING [train.py:1204] (2/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_216-343007-0_sp1.1 from training. Duration: 0.6 2023-10-02 07:02:37,296 WARNING [train.py:1204] (2/4) Exclude cut with ID _3867_210_1883_1_1526641200575_4475830_272-215563-0_sp0.9 from training. Duration: 0.6333125 2023-10-02 07:02:38,710 INFO [train.py:1046] (2/4) Epoch 23, batch 1900, loss[loss=0.1565, simple_loss=0.2326, pruned_loss=0.04024, over 21476.00 frames. ], tot_loss[loss=0.172, simple_loss=0.2488, pruned_loss=0.04758, over 4707000.78 frames. ], batch size: 47, lr: 4.48e-03, grad_scale: 8.0 2023-10-02 07:02:38,766 WARNING [train.py:1204] (2/4) Exclude cut with ID _21783_210_11357_1_1531742458730_3762679_553-175603-0_sp1.1 from training. Duration: 0.92725 2023-10-02 07:02:38,777 WARNING [train.py:1204] (2/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_963-187109-0 from training. Duration: 0.99 2023-10-02 07:02:40,091 WARNING [train.py:1204] (2/4) Exclude cut with ID _44973_210_1511_1_1533794381163_3634770_495-211466-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 07:02:41,322 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9068W0002-529085 from training. Duration: 0.896 2023-10-02 07:02:41,343 WARNING [train.py:1204] (2/4) Exclude cut with ID _5294_210_1747_1_1527249643928_3696179_180-100713-0_sp1.1 from training. Duration: 0.736375 2023-10-02 07:02:41,601 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.out_combiner.scale_min, batch_count=791773.3333333334, ans=0.2 2023-10-02 07:02:42,662 WARNING [train.py:1204] (2/4) Exclude cut with ID _35799_210_5854_1_1533263487887_3529071_149-49689-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 07:02:48,796 WARNING [train.py:1204] (2/4) Exclude cut with ID _67725_210_27767_1_1535608756671_5105359_36-220794-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 07:02:50,313 WARNING [train.py:1204] (2/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_338-96520-0_sp1.1 from training. Duration: 0.8545625 2023-10-02 07:02:51,645 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9104W0004-547160 from training. Duration: 0.896 2023-10-02 07:02:51,720 WARNING [train.py:1204] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_330-327861-0 from training. Duration: 0.67 2023-10-02 07:02:53,713 WARNING [train.py:1204] (2/4) Exclude cut with ID _14087_210_2253_1_1530529224512_3014029_156-257607-0_sp0.9 from training. Duration: 0.92225 2023-10-02 07:02:55,062 WARNING [train.py:1204] (2/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_476-322050-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 07:02:55,070 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9118W0017-553385 from training. Duration: 0.896 2023-10-02 07:02:55,104 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9120W0028-554435 from training. Duration: 0.896 2023-10-02 07:02:58,029 WARNING [train.py:1204] (2/4) Exclude cut with ID _56524_210_9642_1_1534572011779_3619170_145-310842-0 from training. Duration: 0.99 2023-10-02 07:03:01,194 WARNING [train.py:1204] (2/4) Exclude cut with ID _51631_210_8254_1_1534129228420_4084794_270-222508-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 07:03:01,696 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.4.encoder.layers.1.whiten, num_groups=1, num_channels=384, metric=5.86 vs. limit=12.0 2023-10-02 07:03:05,274 WARNING [train.py:1204] (2/4) Exclude cut with ID _14268_210_9206_1_1530622771285_4714099_331-105351-0 from training. Duration: 0.65 2023-10-02 07:03:05,410 WARNING [train.py:1204] (2/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_40-129756-0 from training. Duration: 0.81 2023-10-02 07:03:12,693 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.attention_skip_rate, batch_count=791906.6666666666, ans=0.0 2023-10-02 07:03:13,810 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.532e+02 1.937e+02 2.220e+02 2.601e+02 3.701e+02, threshold=4.439e+02, percent-clipped=0.0 2023-10-02 07:03:13,984 WARNING [train.py:1204] (2/4) Exclude cut with ID _3541_210_2477_1_1526475408709_3263973_231-104736-0 from training. Duration: 0.78 2023-10-02 07:03:17,106 WARNING [train.py:1204] (2/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_214-291369-0 from training. Duration: 0.63 2023-10-02 07:03:17,257 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.attention_skip_rate, batch_count=791906.6666666666, ans=0.0 2023-10-02 07:03:18,829 WARNING [train.py:1204] (2/4) Exclude cut with ID _62574_210_2655_1_1535180407058_3933249_369-84347-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 07:03:20,153 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9210W0111-599240 from training. Duration: 0.896 2023-10-02 07:03:20,157 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_92-246270-0 from training. Duration: 0.85 2023-10-02 07:03:20,205 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_37152_210_9697_1_1533106573_3844188_29-239070-0 from training. Duration: 0.98 2023-10-02 07:03:21,517 WARNING [train.py:1204] (2/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_239-22076-0 from training. Duration: 0.85 2023-10-02 07:03:21,520 WARNING [train.py:1204] (2/4) Exclude cut with ID _65962_210_29927_1_1535436026519_4174930_368-44639-0_sp1.1 from training. Duration: 0.97275 2023-10-02 07:03:24,917 WARNING [train.py:1204] (2/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_152-212533-0 from training. Duration: 0.97 2023-10-02 07:03:29,324 WARNING [train.py:1204] (2/4) Exclude cut with ID _14603_210_4220_1_1530615688441_3097319_90-31383-0_sp1.1 from training. Duration: 0.7909375 2023-10-02 07:03:30,793 WARNING [train.py:1204] (2/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_196-152868-0_sp1.1 from training. Duration: 0.92725 2023-10-02 07:03:30,796 WARNING [train.py:1204] (2/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_140-181176-0 from training. Duration: 0.92 2023-10-02 07:03:32,678 WARNING [train.py:1204] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_168-32906-0_sp0.9 from training. Duration: 0.7666875 2023-10-02 07:03:35,604 WARNING [train.py:1204] (2/4) Exclude cut with ID _14922_210_6228_1_1530707435273_3693310_13-165512-0 from training. Duration: 0.82 2023-10-02 07:03:35,646 WARNING [train.py:1204] (2/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_381-317791-0_sp0.9 from training. Duration: 0.811125 2023-10-02 07:03:41,217 WARNING [train.py:1204] (2/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_63-267585-0_sp1.1 from training. Duration: 0.8181875 2023-10-02 07:03:41,221 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_326-303945-0_sp1.1 from training. Duration: 0.9 2023-10-02 07:03:42,605 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_194-143381-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 07:03:42,684 WARNING [train.py:1204] (2/4) Exclude cut with ID _39290_210_4220_1_1533862778760_3075118_194-16667-0_sp1.1 from training. Duration: 0.963625 2023-10-02 07:03:45,954 WARNING [train.py:1204] (2/4) Exclude cut with ID _15012_210_1968_1_1530856820429_5095350_662-210469-0_sp0.9 from training. Duration: 0.6333125 2023-10-02 07:03:45,991 WARNING [train.py:1204] (2/4) Exclude cut with ID _4697_210_2069_1_1526815801953_3823098_68-219807-0_sp1.1 from training. Duration: 0.42725 2023-10-02 07:03:47,077 WARNING [train.py:1204] (2/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_321-22821-0_sp0.9 from training. Duration: 0.988875 2023-10-02 07:03:51,087 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_14056_210_4163_1_1530604483_7572827_1037-45317-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 07:03:51,089 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_403-39619-0_sp1.1 from training. Duration: 0.763625 2023-10-02 07:03:52,940 INFO [train.py:1046] (2/4) Epoch 23, batch 1950, loss[loss=0.1547, simple_loss=0.2333, pruned_loss=0.03808, over 20168.00 frames. ], tot_loss[loss=0.1728, simple_loss=0.2495, pruned_loss=0.04801, over 4690723.91 frames. ], batch size: 44, lr: 4.48e-03, grad_scale: 8.0 2023-10-02 07:03:54,416 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_20120_210_6753_1_1531643035_5482859_510-96996-0_sp0.9 from training. Duration: 0.9666875 2023-10-02 07:03:54,419 WARNING [train.py:1204] (2/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_225-180155-0_sp1.1 from training. Duration: 0.92725 2023-10-02 07:03:54,458 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_278-272612-0_sp0.9 from training. Duration: 0.811125 2023-10-02 07:03:56,362 WARNING [train.py:1204] (2/4) Exclude cut with ID _53713_210_24116_1_1534314487762_7416751_691-70244-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 07:03:59,205 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6788_210_1388_1_1527898698_4562021_91-197981-0_sp1.1 from training. Duration: 0.7545625 2023-10-02 07:04:01,853 WARNING [train.py:1204] (2/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_60-18123-0_sp0.9 from training. Duration: 0.97775 2023-10-02 07:04:01,894 WARNING [train.py:1204] (2/4) Exclude cut with ID _35799_210_5854_1_1533263487887_3529071_5-214617-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 07:04:01,906 WARNING [train.py:1204] (2/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_890-5117-0_sp1.1 from training. Duration: 0.7090625 2023-10-02 07:04:03,883 WARNING [train.py:1204] (2/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_660-211088-0 from training. Duration: 0.62 2023-10-02 07:04:05,173 WARNING [train.py:1204] (2/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_314-126819-0_sp0.9 from training. Duration: 0.5555625 2023-10-02 07:04:05,195 WARNING [train.py:1204] (2/4) Exclude cut with ID _21632_210_2253_1_1531994001765_3744140_362-66225-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 07:04:05,280 WARNING [train.py:1204] (2/4) Exclude cut with ID _61118_210_9527_1_1534897668884_3752630_173-258555-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 07:04:09,362 WARNING [train.py:1204] (2/4) Exclude cut with ID _7719_210_3525_1_1528456153163_4296960_88-250259-0_sp0.9 from training. Duration: 0.9333125 2023-10-02 07:04:09,403 WARNING [train.py:1204] (2/4) Exclude cut with ID _15140_210_9288_1_1530791388450_4880149_539-153708-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 07:04:10,671 WARNING [train.py:1204] (2/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_253-42544-0_sp1.1 from training. Duration: 0.97275 2023-10-02 07:04:12,108 WARNING [train.py:1204] (2/4) Exclude cut with ID _33640_210_2655_1_1533117605479_4855469_479-296877-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 07:04:13,556 WARNING [train.py:1204] (2/4) Exclude cut with ID 3033-130750-0096-55598-0_sp1.1 from training. Duration: 0.7545625 2023-10-02 07:04:13,576 WARNING [train.py:1204] (2/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_191-41023-0_sp1.1 from training. Duration: 0.6454375 2023-10-02 07:04:13,592 WARNING [train.py:1204] (2/4) Exclude cut with ID _8365_210_2064_1_1528592311070_4677690_431-294102-0_sp1.1 from training. Duration: 0.8090625 2023-10-02 07:04:13,616 WARNING [train.py:1204] (2/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_893-296030-0_sp1.1 from training. Duration: 0.97275 2023-10-02 07:04:16,581 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.ff2_skip_rate, batch_count=792173.3333333334, ans=0.0 2023-10-02 07:04:18,508 WARNING [train.py:1204] (2/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_401-343820-0_sp1.1 from training. Duration: 0.97275 2023-10-02 07:04:21,022 WARNING [train.py:1204] (2/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_127-566-0_sp1.1 from training. Duration: 0.763625 2023-10-02 07:04:21,023 WARNING [train.py:1204] (2/4) Exclude cut with ID _14100_210_8341_1_1530529320613_3678069_126-272847-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 07:04:21,038 WARNING [train.py:1204] (2/4) Exclude cut with ID _15133_210_6228_1_1530794512074_3080379_29-264458-0_sp1.1 from training. Duration: 0.52725 2023-10-02 07:04:21,039 WARNING [train.py:1204] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_127-119109-0 from training. Duration: 0.72 2023-10-02 07:04:21,100 WARNING [train.py:1204] (2/4) Exclude cut with ID _6537_210_1883_1_1527987559710_3597129_382-133062-0_sp1.1 from training. Duration: 0.6181875 2023-10-02 07:04:21,110 WARNING [train.py:1204] (2/4) Exclude cut with ID _29529_210_7234_1_1532393961455_3977460_391-219682-0_sp1.1 from training. Duration: 0.936375 2023-10-02 07:04:22,688 WARNING [train.py:1204] (2/4) Exclude cut with ID _37537_210_2253_1_1533171758568_3421949_96-137189-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 07:04:22,976 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=792240.0, ans=0.1 2023-10-02 07:04:26,135 WARNING [train.py:1204] (2/4) Exclude cut with ID _58414_210_8254_1_1535421609162_7151182_15-276301-0_sp1.1 from training. Duration: 0.97275 2023-10-02 07:04:29,472 WARNING [train.py:1204] (2/4) Exclude cut with ID _59502_210_12210_1_1535107024698_1835139_110-289074-0_sp1.1 from training. Duration: 0.92725 2023-10-02 07:04:34,304 WARNING [train.py:1204] (2/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_367-61330-0_sp1.1 from training. Duration: 0.6818125 2023-10-02 07:04:37,081 WARNING [train.py:1204] (2/4) Exclude cut with ID _45297_210_2721_1_1533715351381_3264949_353-9541-0_sp1.1 from training. Duration: 0.8818125 2023-10-02 07:04:37,119 WARNING [train.py:1204] (2/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_85-138257-0_sp0.9 from training. Duration: 0.788875 2023-10-02 07:04:37,147 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_3787_210_2040_1_1526477544_5478586_603-120324-0 from training. Duration: 0.81 2023-10-02 07:04:37,190 WARNING [train.py:1204] (2/4) Exclude cut with ID _42863_210_9070_1_1533902398788_3696920_411-204131-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 07:04:38,019 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.2.encoder.layers.0.conv_module2.whiten, num_groups=1, num_channels=384, metric=5.97 vs. limit=15.0 2023-10-02 07:04:41,416 WARNING [train.py:1204] (2/4) Exclude cut with ID _30196_210_1664_1_1533708496522_7074140_822-289909-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 07:04:41,507 WARNING [train.py:1204] (2/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_32-335103-0_sp0.9 from training. Duration: 0.911125 2023-10-02 07:04:42,830 WARNING [train.py:1204] (2/4) Exclude cut with ID _6493_210_1747_1_1528459210424_4012470_513-140967-0_sp0.9 from training. Duration: 0.87775 2023-10-02 07:04:50,152 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_47896_210_20508_1_1534492518_5958868_439-257766-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 07:04:50,214 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_1294-63152-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 07:04:52,906 WARNING [train.py:1204] (2/4) Exclude cut with ID _59170_210_6721_1_1534983633960_4203335_319-166512-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 07:04:54,373 WARNING [train.py:1204] (2/4) Exclude cut with ID _3541_210_2477_1_1526475408709_3263973_146-3470-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 07:04:57,691 WARNING [train.py:1204] (2/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_678-159307-0_sp1.1 from training. Duration: 0.77275 2023-10-02 07:04:58,969 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_343-48822-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 07:04:59,034 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_4692_210_3528_1_1526899755_4323254_107-44401-0 from training. Duration: 0.86 2023-10-02 07:04:59,039 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6291_210_3273_1_1527683649_1078498_74-292608-0_sp1.1 from training. Duration: 0.6545625 2023-10-02 07:05:00,697 WARNING [train.py:1204] (2/4) Exclude cut with ID _55644_210_11856_1_1534593599582_4103719_293-248694-0_sp1.1 from training. Duration: 0.97275 2023-10-02 07:05:00,766 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_3172_210_2087_1_1526292929_3987865_197-97762-0 from training. Duration: 0.75 2023-10-02 07:05:03,727 WARNING [train.py:1204] (2/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_264-348896-0_sp0.9 from training. Duration: 0.9444375 2023-10-02 07:05:06,537 WARNING [train.py:1204] (2/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_209-205365-0_sp0.9 from training. Duration: 0.87775 2023-10-02 07:05:07,948 INFO [train.py:1046] (2/4) Epoch 23, batch 2000, loss[loss=0.1664, simple_loss=0.2437, pruned_loss=0.04454, over 23671.00 frames. ], tot_loss[loss=0.1737, simple_loss=0.2509, pruned_loss=0.04827, over 4690296.36 frames. ], batch size: 135, lr: 4.48e-03, grad_scale: 16.0 2023-10-02 07:05:08,011 WARNING [train.py:1204] (2/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_222-198127-0_sp0.9 from training. Duration: 0.9333125 2023-10-02 07:05:08,054 WARNING [train.py:1204] (2/4) Exclude cut with ID _57572_210_5517_1_1534921219624_3809484_52-43530-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 07:05:09,429 WARNING [train.py:1204] (2/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_96-320736-0_sp1.1 from training. Duration: 0.7818125 2023-10-02 07:05:09,571 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.0.bypass.scale_min, batch_count=792440.0, ans=0.2 2023-10-02 07:05:12,137 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_13091_210_7925_1_1530609447_8987354_1159-225508-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 07:05:14,898 WARNING [train.py:1204] (2/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_237-44332-0 from training. Duration: 0.94 2023-10-02 07:05:14,944 WARNING [train.py:1204] (2/4) Exclude cut with ID _7970_210_1883_1_1528455616296_3472230_25-178407-0_sp0.9 from training. Duration: 0.788875 2023-10-02 07:05:17,731 WARNING [train.py:1204] (2/4) Exclude cut with ID _29003_210_19596_1_1532507391720_3894098_357-257062-0_sp1.1 from training. Duration: 0.936375 2023-10-02 07:05:17,963 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.conv_module1.balancer1.prob, batch_count=792440.0, ans=0.125 2023-10-02 07:05:19,920 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_809-17156-0 from training. Duration: 0.73 2023-10-02 07:05:21,044 WARNING [train.py:1204] (2/4) Exclude cut with ID _7160_210_1876_1_1527940923231_3703799_302-150728-0_sp1.1 from training. Duration: 0.7090625 2023-10-02 07:05:21,061 WARNING [train.py:1204] (2/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_444-28041-0_sp0.9 from training. Duration: 0.9444375 2023-10-02 07:05:22,503 WARNING [train.py:1204] (2/4) Exclude cut with ID _52892_210_6721_1_1534378941259_4124129_278-251666-0_sp1.1 from training. Duration: 0.92725 2023-10-02 07:05:25,578 WARNING [train.py:1204] (2/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_115-326778-0 from training. Duration: 0.93 2023-10-02 07:05:27,449 WARNING [train.py:1204] (2/4) Exclude cut with ID _41391_210_7994_1_1533603771872_3638201_31-188961-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 07:05:30,817 WARNING [train.py:1204] (2/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_1073-6264-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 07:05:30,836 WARNING [train.py:1204] (2/4) Exclude cut with ID _66868_210_9527_1_1535509287954_4561140_435-259515-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 07:05:30,950 WARNING [train.py:1204] (2/4) Exclude cut with ID _14886_210_4220_1_1530702020355_3516190_159-73748-0 from training. Duration: 0.83 2023-10-02 07:05:30,991 WARNING [train.py:1204] (2/4) Exclude cut with ID _15150_210_8341_1_1530752411803_7256384_469-239509-0_sp1.1 from training. Duration: 0.6181875 2023-10-02 07:05:33,741 WARNING [train.py:1204] (2/4) Exclude cut with ID _8064_210_5399_1_1528372299291_4282973_395-340852-0 from training. Duration: 0.93 2023-10-02 07:05:33,745 WARNING [train.py:1204] (2/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_138-187031-0_sp1.1 from training. Duration: 0.9 2023-10-02 07:05:34,604 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.2.encoder.layers.0.nonlin_attention.whiten1, num_groups=1, num_channels=288, metric=7.01 vs. limit=10.0 2023-10-02 07:05:37,782 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_13757_210_3528_1_1530440682_4107239_48-106347-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 07:05:39,134 WARNING [train.py:1204] (2/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_164-224052-0_sp1.1 from training. Duration: 0.636375 2023-10-02 07:05:39,144 WARNING [train.py:1204] (2/4) Exclude cut with ID _59288_210_5854_1_1535077757774_3412246_408-322648-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 07:05:39,188 WARNING [train.py:1204] (2/4) Exclude cut with ID _30191_210_1664_1_1533189915047_7196670_827-36359-0_sp1.1 from training. Duration: 0.963625 2023-10-02 07:05:40,522 WARNING [train.py:1204] (2/4) Exclude cut with ID _4680_210_3525_1_1527249757953_893386_75-78979-0_sp1.1 from training. Duration: 0.82725 2023-10-02 07:05:40,571 WARNING [train.py:1204] (2/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_383-165234-0 from training. Duration: 0.94 2023-10-02 07:05:40,712 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.1.nonlin_attention.balancer.min_positive, batch_count=792573.3333333334, ans=0.05 2023-10-02 07:05:42,083 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.conv_skip_rate, batch_count=792573.3333333334, ans=0.0 2023-10-02 07:05:43,149 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.597e+02 1.945e+02 2.106e+02 2.300e+02 3.133e+02, threshold=4.213e+02, percent-clipped=0.0 2023-10-02 07:05:44,577 WARNING [train.py:1204] (2/4) Exclude cut with ID _19896_210_1883_1_1531709022662_6649569_94-229927-0 from training. Duration: 0.97 2023-10-02 07:05:44,585 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6788_210_1388_1_1527898698_4562021_180-75288-0_sp1.1 from training. Duration: 0.9 2023-10-02 07:05:44,593 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_26433_210_12108_1_1532157158_4056480_54-321349-0_sp1.1 from training. Duration: 0.97275 2023-10-02 07:05:50,316 WARNING [train.py:1204] (2/4) Exclude cut with ID _56108_210_16380_1_1534505446159_3887959_265-161493-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 07:05:50,413 WARNING [train.py:1204] (2/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_395-111602-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 07:05:50,419 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_154-316450-0_sp0.9 from training. Duration: 0.8444375 2023-10-02 07:05:50,610 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.conv_skip_rate, batch_count=792640.0, ans=0.0 2023-10-02 07:05:52,500 WARNING [train.py:1204] (2/4) Exclude cut with ID _43552_210_8913_1_1533726031059_3523429_381-116041-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 07:05:52,704 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.0.feed_forward1.out_proj.dropout_p, batch_count=792640.0, ans=0.1 2023-10-02 07:05:53,784 WARNING [train.py:1204] (2/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_65-104124-0_sp1.1 from training. Duration: 0.963625 2023-10-02 07:05:55,077 WARNING [train.py:1204] (2/4) Exclude cut with ID _6536_210_1883_1_1527850783339_3319069_169-23811-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 07:05:55,106 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_289-180904-0_sp0.9 from training. Duration: 0.8444375 2023-10-02 07:05:55,119 WARNING [train.py:1204] (2/4) Exclude cut with ID _56406_210_9288_1_1534899618664_3496149_226-242400-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 07:05:56,518 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_24899_210_16554_1_1532046422_3827462_276-216330-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 07:05:59,689 WARNING [train.py:1204] (2/4) Exclude cut with ID _29345_210_4381_1_1532689168263_3860169_198-174328-0_sp1.1 from training. Duration: 0.82725 2023-10-02 07:06:00,976 WARNING [train.py:1204] (2/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_338-96520-0 from training. Duration: 0.94 2023-10-02 07:06:03,384 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.conv_module2.balancer2.min_abs, batch_count=792640.0, ans=0.5 2023-10-02 07:06:04,609 WARNING [train.py:1204] (2/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_7-111954-0_sp1.1 from training. Duration: 0.5090625 2023-10-02 07:06:06,078 WARNING [train.py:1204] (2/4) Exclude cut with ID _36692_210_5665_1_1533608967420_398809_8-16261-0_sp1.1 from training. Duration: 0.97275 2023-10-02 07:06:10,242 WARNING [train.py:1204] (2/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_86-212657-0_sp1.1 from training. Duration: 0.97275 2023-10-02 07:06:10,249 WARNING [train.py:1204] (2/4) Exclude cut with ID _44659_210_2721_1_1534036120061_6614860_626-298884-0_sp1.1 from training. Duration: 0.8818125 2023-10-02 07:06:13,169 WARNING [train.py:1204] (2/4) Exclude cut with ID _49755_210_18053_1_1534581005846_3218709_120-257918-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 07:06:14,586 WARNING [train.py:1204] (2/4) Exclude cut with ID _21881_210_3525_1_1531825242757_3798380_400-113821-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 07:06:14,588 WARNING [train.py:1204] (2/4) Exclude cut with ID _21783_210_11357_1_1531742458730_3762679_387-236773-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 07:06:16,046 WARNING [train.py:1204] (2/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_613-232856-0_sp1.1 from training. Duration: 0.4909375 2023-10-02 07:06:17,335 WARNING [train.py:1204] (2/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_181-78632-0_sp0.9 from training. Duration: 0.8666875 2023-10-02 07:06:18,760 WARNING [train.py:1204] (2/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_175-17485-0_sp1.1 from training. Duration: 0.97275 2023-10-02 07:06:20,222 WARNING [train.py:1204] (2/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_1004-183422-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 07:06:21,943 INFO [train.py:1046] (2/4) Epoch 23, batch 2050, loss[loss=0.1573, simple_loss=0.2089, pruned_loss=0.05286, over 19388.00 frames. ], tot_loss[loss=0.1727, simple_loss=0.2493, pruned_loss=0.04807, over 4685692.79 frames. ], batch size: 388, lr: 4.48e-03, grad_scale: 16.0 2023-10-02 07:06:23,405 WARNING [train.py:1204] (2/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_32-65962-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 07:06:24,847 WARNING [train.py:1204] (2/4) Exclude cut with ID _15458_210_8341_1_1530858614263_3834410_40-298664-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 07:06:29,554 WARNING [train.py:1204] (2/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_380-261863-0_sp1.1 from training. Duration: 0.963625 2023-10-02 07:06:31,014 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_372-173012-0_sp1.1 from training. Duration: 0.863625 2023-10-02 07:06:32,282 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_57-82205-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 07:06:34,223 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_3691_210_3528_1_1526468169_4062053_355-334432-0_sp0.9 from training. Duration: 0.9666875 2023-10-02 07:06:35,609 WARNING [train.py:1204] (2/4) Exclude cut with ID _19023_210_4353_1_1531378892552_5820780_28-36615-0 from training. Duration: 0.97 2023-10-02 07:06:35,613 WARNING [train.py:1204] (2/4) Exclude cut with ID _29853_210_7497_1_1532505600851_6642110_286-251333-0_sp1.1 from training. Duration: 0.92725 2023-10-02 07:06:37,098 WARNING [train.py:1204] (2/4) Exclude cut with ID _31602_210_5854_1_1532677021456_4458250_518-80679-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 07:06:37,130 WARNING [train.py:1204] (2/4) Exclude cut with ID _14922_210_6228_1_1530707435273_3693310_38-187851-0_sp0.9 from training. Duration: 0.688875 2023-10-02 07:06:45,515 WARNING [train.py:1204] (2/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_39-248870-0_sp1.1 from training. Duration: 0.62725 2023-10-02 07:06:45,532 WARNING [train.py:1204] (2/4) Exclude cut with ID _16908_210_5724_1_1531119157248_3772700_154-31455-0_sp1.1 from training. Duration: 0.97275 2023-10-02 07:06:49,461 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8453_210_4211_1_1528502940_8562730_347-330673-0 from training. Duration: 0.72 2023-10-02 07:06:50,873 WARNING [train.py:1204] (2/4) Exclude cut with ID _30191_210_1664_1_1533189915047_7196670_674-204247-0_sp1.1 from training. Duration: 0.97275 2023-10-02 07:06:51,631 WARNING [train.py:1204] (2/4) Exclude cut with ID _8833_210_5219_1_1528592665210_7553059_329-125156-0 from training. Duration: 0.82 2023-10-02 07:06:51,661 WARNING [train.py:1204] (2/4) Exclude cut with ID _4779_210_2084_1_1527318022944_3914893_500-285013-0_sp1.1 from training. Duration: 0.62725 2023-10-02 07:06:55,623 WARNING [train.py:1204] (2/4) Exclude cut with ID _14421_210_8750_1_1530604795825_3542290_190-313578-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 07:06:58,828 WARNING [train.py:1204] (2/4) Exclude cut with ID _35799_210_5854_1_1533263487887_3529071_538-273574-0_sp1.1 from training. Duration: 0.936375 2023-10-02 07:06:59,593 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.1.encoder.layers.1.whiten, num_groups=1, num_channels=256, metric=5.27 vs. limit=12.0 2023-10-02 07:07:00,227 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_14928_210_5632_1_1530773461_4714000_454-112198-0_sp1.1 from training. Duration: 0.6 2023-10-02 07:07:01,490 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_468-143126-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 07:07:03,600 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_4528_210_3273_1_1527076654_3276777_258-121681-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 07:07:04,952 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_397-132565-0_sp1.1 from training. Duration: 0.9 2023-10-02 07:07:04,979 WARNING [train.py:1204] (2/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_454-123002-0_sp0.9 from training. Duration: 0.7666875 2023-10-02 07:07:05,275 INFO [scaling.py:1118] (2/4) WithLoss: name=encoder.encoders.3.encoder.layers.0.self_attn_weights, loss-sum=0.000e+00 2023-10-02 07:07:06,516 WARNING [train.py:1204] (2/4) Exclude cut with ID _27052_210_5986_1_1532329197187_3585009_297-86890-0_sp1.1 from training. Duration: 0.936375 2023-10-02 07:07:09,846 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_51225_210_3601_1_1534400634_2327383_55-301844-0_sp1.1 from training. Duration: 0.8454375 2023-10-02 07:07:11,291 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6786_210_1388_1_1527839699_4194495_378-46070-0_sp0.9 from training. Duration: 0.8 2023-10-02 07:07:11,381 WARNING [train.py:1204] (2/4) Exclude cut with ID _42334_210_7770_1_1534206610396_3605880_55-19758-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 07:07:15,650 WARNING [train.py:1204] (2/4) Exclude cut with ID _14884_210_2252_1_1530707459689_5561250_114-201399-0_sp0.9 from training. Duration: 0.8444375 2023-10-02 07:07:19,915 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_99-251314-0_sp0.9 from training. Duration: 0.9666875 2023-10-02 07:07:21,333 WARNING [train.py:1204] (2/4) Exclude cut with ID _26528_210_9070_1_1532570410842_3851310_497-97218-0 from training. Duration: 0.86 2023-10-02 07:07:21,562 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.balancer1.prob, batch_count=793040.0, ans=0.125 2023-10-02 07:07:27,564 WARNING [train.py:1204] (2/4) Exclude cut with ID _55328_210_27767_1_1534580973260_4088170_230-317241-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 07:07:28,886 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_125-203105-0_sp0.9 from training. Duration: 0.888875 2023-10-02 07:07:32,116 WARNING [train.py:1204] (2/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_55-31904-0_sp1.1 from training. Duration: 0.8 2023-10-02 07:07:33,567 WARNING [train.py:1204] (2/4) Exclude cut with ID _8064_210_5399_1_1528372299291_4282973_18-174647-0 from training. Duration: 0.91 2023-10-02 07:07:33,737 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.conv_module1.balancer2.min_positive, batch_count=793040.0, ans=0.05 2023-10-02 07:07:36,275 INFO [train.py:1046] (2/4) Epoch 23, batch 2100, loss[loss=0.1742, simple_loss=0.2425, pruned_loss=0.05298, over 23918.00 frames. ], tot_loss[loss=0.1713, simple_loss=0.2475, pruned_loss=0.0475, over 4688410.65 frames. ], batch size: 212, lr: 4.47e-03, grad_scale: 16.0 2023-10-02 07:07:38,156 WARNING [train.py:1204] (2/4) Exclude cut with ID IC0165W0005-81726 from training. Duration: 0.952 2023-10-02 07:07:38,156 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_31583_210_3852_1_1532595851_4024413_148-171847-0_sp1.1 from training. Duration: 0.963625 2023-10-02 07:07:38,201 WARNING [train.py:1204] (2/4) Exclude cut with ID _21989_210_5854_1_1531899214931_3271360_116-138769-0_sp1.1 from training. Duration: 0.936375 2023-10-02 07:07:38,867 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.3.encoder.layers.1.self_attn1.whiten, num_groups=1, num_channels=512, metric=15.65 vs. limit=22.5 2023-10-02 07:07:39,528 WARNING [train.py:1204] (2/4) Exclude cut with ID _66070_210_7640_1_1535441393096_7805310_489-292631-0_sp1.1 from training. Duration: 0.7181875 2023-10-02 07:07:39,631 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.1.feed_forward1.hidden_balancer.prob, batch_count=793106.6666666666, ans=0.125 2023-10-02 07:07:41,996 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_20120_210_6753_1_1531643035_5482859_202-27960-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 07:07:42,005 WARNING [train.py:1204] (2/4) Exclude cut with ID _5294_210_1747_1_1527249643928_3696179_342-270278-0 from training. Duration: 0.55 2023-10-02 07:07:42,033 WARNING [train.py:1204] (2/4) Exclude cut with ID _38323_210_12108_1_1533121233369_3303049_416-11919-0 from training. Duration: 0.96 2023-10-02 07:07:43,446 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6545_210_2040_1_1527774670_4245258_407-48070-0_sp0.9 from training. Duration: 0.8444375 2023-10-02 07:07:44,967 WARNING [train.py:1204] (2/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_503-30719-0_sp1.1 from training. Duration: 0.8909375 2023-10-02 07:07:45,039 WARNING [train.py:1204] (2/4) Exclude cut with ID _49643_210_7989_1_1534640244224_3939031_234-326497-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 07:07:47,858 WARNING [train.py:1204] (2/4) Exclude cut with ID _14170_210_5219_1_1530525949868_4469650_301-77873-0_sp1.1 from training. Duration: 0.963625 2023-10-02 07:07:49,151 WARNING [train.py:1204] (2/4) Exclude cut with ID _45297_210_2721_1_1533715351381_3264949_152-154697-0_sp1.1 from training. Duration: 0.97275 2023-10-02 07:07:49,165 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_3371_210_1794_1_1526384462_5580777_396-339812-0 from training. Duration: 0.85 2023-10-02 07:07:49,244 WARNING [train.py:1204] (2/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_399-156791-0_sp1.1 from training. Duration: 0.7545625 2023-10-02 07:07:49,278 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7696_210_2040_1_1528207167_3716099_117-125434-0 from training. Duration: 0.76 2023-10-02 07:07:49,283 WARNING [train.py:1204] (2/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_96-244299-0 from training. Duration: 0.88 2023-10-02 07:07:50,660 WARNING [train.py:1204] (2/4) Exclude cut with ID _37892_210_19596_1_1533198677897_3964058_519-343462-0_sp1.1 from training. Duration: 0.92725 2023-10-02 07:07:50,681 WARNING [train.py:1204] (2/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_202-183922-0_sp1.1 from training. Duration: 0.77275 2023-10-02 07:07:51,930 WARNING [train.py:1204] (2/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_453-133983-0 from training. Duration: 0.78 2023-10-02 07:07:51,967 WARNING [train.py:1204] (2/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_178-39722-0_sp1.1 from training. Duration: 0.4454375 2023-10-02 07:07:52,213 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.conv_module1.balancer2.min_positive, batch_count=793173.3333333334, ans=0.05 2023-10-02 07:07:56,792 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_15126_210_6753_1_1530778677_2591185_113-157651-0 from training. Duration: 0.68 2023-10-02 07:07:56,793 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_271-234962-0_sp1.1 from training. Duration: 0.7181875 2023-10-02 07:08:02,104 WARNING [train.py:1204] (2/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_195-64967-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 07:08:02,160 WARNING [train.py:1204] (2/4) Exclude cut with ID _43552_210_8913_1_1533726031059_3523429_365-215065-0_sp1.1 from training. Duration: 0.936375 2023-10-02 07:08:06,885 WARNING [train.py:1204] (2/4) Exclude cut with ID _31602_210_5854_1_1532677021456_4458250_249-96604-0_sp0.9 from training. Duration: 0.988875 2023-10-02 07:08:06,937 WARNING [train.py:1204] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_307-158462-0 from training. Duration: 0.69 2023-10-02 07:08:06,985 WARNING [train.py:1204] (2/4) Exclude cut with ID _30918_210_6400_1_1532754078600_3125920_407-223394-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 07:08:06,988 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6673_210_3273_1_1527767790_3173240_351-167954-0_sp0.9 from training. Duration: 0.5333125 2023-10-02 07:08:09,861 WARNING [train.py:1204] (2/4) Exclude cut with ID _4866_210_2577_1_1527404412284_4364659_256-306830-0 from training. Duration: 0.87 2023-10-02 07:08:09,898 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_17613_210_2151_1_1531137569_2366160_222-131118-0_sp1.1 from training. Duration: 0.963625 2023-10-02 07:08:09,919 WARNING [train.py:1204] (2/4) Exclude cut with ID _45560_210_21982_1_1533812603951_3584999_111-6376-0 from training. Duration: 0.84 2023-10-02 07:08:09,953 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_216-73841-0 from training. Duration: 0.96 2023-10-02 07:08:10,116 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.ff2_skip_rate, batch_count=793240.0, ans=0.0 2023-10-02 07:08:11,315 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_14058_210_4163_1_1530690953_6327180_49-210687-0 from training. Duration: 0.94 2023-10-02 07:08:11,454 WARNING [train.py:1204] (2/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_506-42614-0_sp0.9 from training. Duration: 0.87775 2023-10-02 07:08:12,099 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.3.encoder.layers.2.conv_module2.whiten, num_groups=1, num_channels=512, metric=4.23 vs. limit=15.0 2023-10-02 07:08:12,672 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.417e+02 1.829e+02 2.013e+02 2.459e+02 4.112e+02, threshold=4.025e+02, percent-clipped=0.0 2023-10-02 07:08:12,848 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_3133_210_2087_1_1526106446_2324690_25-332138-0_sp1.1 from training. Duration: 0.82725 2023-10-02 07:08:15,549 WARNING [train.py:1204] (2/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_589-9070-0_sp1.1 from training. Duration: 0.6454375 2023-10-02 07:08:16,965 WARNING [train.py:1204] (2/4) Exclude cut with ID _6194_210_1534_1_1527511826160_3941194_14-282876-0_sp1.1 from training. Duration: 0.6454375 2023-10-02 07:08:18,249 WARNING [train.py:1204] (2/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_1166-90654-0_sp1.1 from training. Duration: 0.92725 2023-10-02 07:08:19,665 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_14056_210_4163_1_1530604483_7572827_218-255858-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 07:08:19,666 WARNING [train.py:1204] (2/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_1285-256622-0 from training. Duration: 0.84 2023-10-02 07:08:19,685 WARNING [train.py:1204] (2/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_205-286261-0_sp1.1 from training. Duration: 0.963625 2023-10-02 07:08:19,706 WARNING [train.py:1204] (2/4) Exclude cut with ID _68777_210_4353_1_1535810390731_3117904_6-154119-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 07:08:21,014 WARNING [train.py:1204] (2/4) Exclude cut with ID _22982_210_1883_1_1531882790447_5449409_565-199393-0_sp1.1 from training. Duration: 0.92725 2023-10-02 07:08:21,030 WARNING [train.py:1204] (2/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_278-154492-0 from training. Duration: 0.76 2023-10-02 07:08:22,556 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder_embed.conv.5.prob, batch_count=793306.6666666666, ans=0.125 2023-10-02 07:08:23,744 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_426-313557-0 from training. Duration: 0.53 2023-10-02 07:08:24,505 WARNING [train.py:1204] (2/4) Exclude cut with ID _41817_210_18835_1_1533434477160_7423049_555-293448-0 from training. Duration: 0.8 2023-10-02 07:08:28,696 WARNING [train.py:1204] (2/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_32-335103-0_sp1.1 from training. Duration: 0.7454375 2023-10-02 07:08:32,098 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_2655_210_2087_1_1525688959_3897400_202-147557-0_sp1.1 from training. Duration: 0.77275 2023-10-02 07:08:32,128 WARNING [train.py:1204] (2/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_158-250116-0 from training. Duration: 0.62 2023-10-02 07:08:39,045 WARNING [train.py:1204] (2/4) Exclude cut with ID _55429_210_10626_1_1534467588527_7174143_528-88083-0_sp1.1 from training. Duration: 0.963625 2023-10-02 07:08:41,719 WARNING [train.py:1204] (2/4) Exclude cut with ID _8030_210_7497_1_1528444886781_3531089_32-31837-0_sp1.1 from training. Duration: 0.8818125 2023-10-02 07:08:42,958 WARNING [train.py:1204] (2/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_744-86168-0_sp1.1 from training. Duration: 0.97275 2023-10-02 07:08:42,966 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1113-7369-0_sp0.9 from training. Duration: 0.9444375 2023-10-02 07:08:42,992 WARNING [train.py:1204] (2/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_124-345840-0_sp0.9 from training. Duration: 0.57775 2023-10-02 07:08:43,040 WARNING [train.py:1204] (2/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_328-264776-0_sp0.9 from training. Duration: 0.7444375 2023-10-02 07:08:44,475 WARNING [train.py:1204] (2/4) Exclude cut with ID _18895_210_6228_1_1531357274234_3784930_469-166615-0_sp1.1 from training. Duration: 0.963625 2023-10-02 07:08:44,481 WARNING [train.py:1204] (2/4) Exclude cut with ID _8395_210_1531_1_1528597871523_4411579_431-132470-0_sp0.9 from training. Duration: 0.82225 2023-10-02 07:08:45,871 WARNING [train.py:1204] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_554-45617-0_sp1.1 from training. Duration: 0.9 2023-10-02 07:08:47,220 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_35361_210_4238_1_1533262810_7793476_524-188242-0_sp1.1 from training. Duration: 0.92725 2023-10-02 07:08:48,713 WARNING [train.py:1204] (2/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_938-205135-0 from training. Duration: 0.87 2023-10-02 07:08:50,104 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_5931_210_2040_1_1527428481_4547999_321-193614-0 from training. Duration: 0.67 2023-10-02 07:08:50,124 WARNING [train.py:1204] (2/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_445-269521-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 07:08:51,395 INFO [train.py:1046] (2/4) Epoch 23, batch 2150, loss[loss=0.1622, simple_loss=0.2511, pruned_loss=0.03664, over 24544.00 frames. ], tot_loss[loss=0.1702, simple_loss=0.2465, pruned_loss=0.04697, over 4699116.12 frames. ], batch size: 71, lr: 4.47e-03, grad_scale: 16.0 2023-10-02 07:08:53,027 WARNING [train.py:1204] (2/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_3-33116-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 07:08:53,029 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_4633_210_2040_1_1526824163_4145343_416-76117-0_sp1.1 from training. Duration: 0.863625 2023-10-02 07:08:53,059 WARNING [train.py:1204] (2/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_1033-274376-0_sp1.1 from training. Duration: 0.7909375 2023-10-02 07:08:54,414 WARNING [train.py:1204] (2/4) Exclude cut with ID _8772_210_1850_1_1528635595972_3664011_398-320049-0_sp0.9 from training. Duration: 0.97775 2023-10-02 07:08:56,083 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.conv_skip_rate, batch_count=793440.0, ans=0.0 2023-10-02 07:08:58,737 WARNING [train.py:1204] (2/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_321-215742-0_sp0.9 from training. Duration: 0.5555625 2023-10-02 07:09:01,855 WARNING [train.py:1204] (2/4) Exclude cut with ID _28394_210_8913_1_1532430049518_3650118_272-190950-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 07:09:01,960 WARNING [train.py:1204] (2/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_31-299371-0_sp1.1 from training. Duration: 0.92725 2023-10-02 07:09:02,230 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.ff3_skip_rate, batch_count=793440.0, ans=0.0 2023-10-02 07:09:05,304 WARNING [train.py:1204] (2/4) Exclude cut with ID _55383_210_10635_1_1534468426172_2231669_134-128246-0_sp0.9 from training. Duration: 0.988875 2023-10-02 07:09:05,314 WARNING [train.py:1204] (2/4) Exclude cut with ID _40519_210_13794_1_1533715228655_7337390_887-9547-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 07:09:06,620 WARNING [train.py:1204] (2/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_310-109461-0_sp0.9 from training. Duration: 0.888875 2023-10-02 07:09:11,858 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_2009_210_2071_1_1525481134_4588937_258-250196-0_sp1.1 from training. Duration: 0.92725 2023-10-02 07:09:11,911 WARNING [train.py:1204] (2/4) Exclude cut with ID _9235_210_5219_1_1528891154253_8046120_1186-72702-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 07:09:11,913 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_14831_210_7927_1_1530700200_5583610_181-27467-0_sp1.1 from training. Duration: 0.82725 2023-10-02 07:09:15,956 WARNING [train.py:1204] (2/4) Exclude cut with ID _40402_210_18219_1_1533522324115_2953150_278-132109-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 07:09:15,979 WARNING [train.py:1204] (2/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_261-297503-0 from training. Duration: 0.99 2023-10-02 07:09:20,181 WARNING [train.py:1204] (2/4) Exclude cut with ID _19896_210_1883_1_1531709022662_6649569_313-265903-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 07:09:20,268 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_105-114797-0_sp1.1 from training. Duration: 0.763625 2023-10-02 07:09:21,645 WARNING [train.py:1204] (2/4) Exclude cut with ID _53755_210_8254_1_1534577312941_4707360_9-65843-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 07:09:21,673 WARNING [train.py:1204] (2/4) Exclude cut with ID _18516_210_9206_1_1531704653206_3692406_361-224549-0_sp1.1 from training. Duration: 0.97275 2023-10-02 07:09:22,933 WARNING [train.py:1204] (2/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_403-310975-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 07:09:22,985 WARNING [train.py:1204] (2/4) Exclude cut with ID _5444_210_2151_1_1527418808464_5312210_581-315411-0_sp0.9 from training. Duration: 0.688875 2023-10-02 07:09:24,398 WARNING [train.py:1204] (2/4) Exclude cut with ID _23825_210_13449_1_1531915313526_3497150_464-154904-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 07:09:24,417 WARNING [train.py:1204] (2/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_937-208281-0_sp0.9 from training. Duration: 0.9444375 2023-10-02 07:09:24,493 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_37839_210_7234_1_1533542462_3689303_158-181212-0_sp1.1 from training. Duration: 0.963625 2023-10-02 07:09:25,794 WARNING [train.py:1204] (2/4) Exclude cut with ID _8772_210_1850_1_1528635595972_3664011_398-320049-0 from training. Duration: 0.88 2023-10-02 07:09:27,191 WARNING [train.py:1204] (2/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_718-250907-0_sp1.1 from training. Duration: 0.62725 2023-10-02 07:09:28,511 WARNING [train.py:1204] (2/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_641-126395-0_sp1.1 from training. Duration: 0.92725 2023-10-02 07:09:28,539 WARNING [train.py:1204] (2/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_166-241493-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 07:09:29,944 WARNING [train.py:1204] (2/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_526-62079-0_sp0.9 from training. Duration: 0.7444375 2023-10-02 07:09:31,298 WARNING [train.py:1204] (2/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_439-275083-0_sp1.1 from training. Duration: 0.936375 2023-10-02 07:09:34,157 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_66907_210_21581_1_1535615661_3590177_493-229032-0_sp1.1 from training. Duration: 0.92725 2023-10-02 07:09:34,842 WARNING [train.py:1204] (2/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_89-321955-0_sp1.1 from training. Duration: 0.72725 2023-10-02 07:09:34,953 WARNING [train.py:1204] (2/4) Exclude cut with ID _29857_210_20034_1_1532395886753_3798160_273-226195-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 07:09:34,960 WARNING [train.py:1204] (2/4) Exclude cut with ID _22826_210_9994_1_1531908201522_3279169_404-75889-0 from training. Duration: 0.89 2023-10-02 07:09:34,986 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_271-234962-0_sp0.9 from training. Duration: 0.87775 2023-10-02 07:09:38,280 WARNING [train.py:1204] (2/4) Exclude cut with ID _22961_210_8254_1_1531976366672_4278180_156-339685-0_sp1.1 from training. Duration: 0.97275 2023-10-02 07:09:38,544 INFO [scaling.py:1118] (2/4) WithLoss: name=encoder.encoders.4.encoder.layers.2.self_attn_weights, loss-sum=0.000e+00 2023-10-02 07:09:40,176 WARNING [train.py:1204] (2/4) Exclude cut with ID _37892_210_19596_1_1533198677897_3964058_425-231927-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 07:09:41,452 WARNING [train.py:1204] (2/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_102-288670-0_sp1.1 from training. Duration: 0.97275 2023-10-02 07:09:41,766 INFO [scaling.py:1118] (2/4) WithLoss: name=encoder.encoders.4.encoder.layers.0.self_attn_weights, loss-sum=0.000e+00 2023-10-02 07:09:42,694 WARNING [train.py:1204] (2/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_283-220372-0_sp1.1 from training. Duration: 0.5090625 2023-10-02 07:09:42,775 WARNING [train.py:1204] (2/4) Exclude cut with ID _44304_210_15374_1_1533639352765_4613129_86-1699-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 07:09:44,080 WARNING [train.py:1204] (2/4) Exclude cut with ID _2891_210_2550_1_1525874398521_4427710_327-121590-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 07:09:44,087 WARNING [train.py:1204] (2/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_369-222060-0 from training. Duration: 0.71 2023-10-02 07:09:45,600 WARNING [train.py:1204] (2/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_762-109886-0 from training. Duration: 0.91 2023-10-02 07:09:45,628 WARNING [train.py:1204] (2/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_477-255782-0_sp0.9 from training. Duration: 0.911125 2023-10-02 07:09:46,985 WARNING [train.py:1204] (2/4) Exclude cut with ID IC0669W0061-330906 from training. Duration: 0.768 2023-10-02 07:09:48,318 WARNING [train.py:1204] (2/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_347-213071-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 07:09:48,345 WARNING [train.py:1204] (2/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_507-303370-0_sp1.1 from training. Duration: 0.87275 2023-10-02 07:09:48,420 WARNING [train.py:1204] (2/4) Exclude cut with ID _15012_210_1968_1_1530856820429_5095350_688-178849-0 from training. Duration: 0.64 2023-10-02 07:09:48,432 WARNING [train.py:1204] (2/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_28-70987-0_sp0.9 from training. Duration: 0.9666875 2023-10-02 07:09:48,434 WARNING [train.py:1204] (2/4) Exclude cut with ID _5910_210_1389_1_1527337972454_5050100_555-159815-0 from training. Duration: 0.98 2023-10-02 07:09:48,463 WARNING [train.py:1204] (2/4) Exclude cut with ID IC0679W0064-335871 from training. Duration: 0.768 2023-10-02 07:09:48,463 WARNING [train.py:1204] (2/4) Exclude cut with ID IC0679W0079-335886 from training. Duration: 0.896 2023-10-02 07:09:48,495 WARNING [train.py:1204] (2/4) Exclude cut with ID _5608_210_2106_1_1527415439089_6819768_909-82767-0 from training. Duration: 0.61 2023-10-02 07:09:51,094 WARNING [train.py:1204] (2/4) Exclude cut with ID _23038_210_9570_1_1532001584158_3652419_411-323967-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 07:09:51,150 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_40896_210_15710_1_1533433956_7100802_350-231748-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 07:09:51,160 WARNING [train.py:1204] (2/4) Exclude cut with ID _5289_210_2084_1_1527678202736_3779299_142-103317-0_sp0.9 from training. Duration: 0.9333125 2023-10-02 07:09:51,240 WARNING [train.py:1204] (2/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_314-144055-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 07:09:52,625 WARNING [train.py:1204] (2/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_177-236809-0_sp0.9 from training. Duration: 0.5444375 2023-10-02 07:09:52,739 WARNING [train.py:1204] (2/4) Exclude cut with ID _23038_210_9570_1_1532001584158_3652419_82-334733-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 07:09:54,188 WARNING [train.py:1204] (2/4) Exclude cut with ID _43552_210_8913_1_1533726031059_3523429_225-22526-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 07:09:57,027 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.conv_module1.balancer1.prob, batch_count=793706.6666666666, ans=0.125 2023-10-02 07:10:01,040 WARNING [train.py:1204] (2/4) Exclude cut with ID _30327_210_16049_1_1532484161295_3924169_401-70819-0_sp1.1 from training. Duration: 0.7818125 2023-10-02 07:10:02,907 WARNING [train.py:1204] (2/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_538-32291-0 from training. Duration: 0.83 2023-10-02 07:10:03,123 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.out_combiner.scale_min, batch_count=793706.6666666666, ans=0.2 2023-10-02 07:10:06,052 INFO [train.py:1046] (2/4) Epoch 23, batch 2200, loss[loss=0.1681, simple_loss=0.2419, pruned_loss=0.0472, over 23550.00 frames. ], tot_loss[loss=0.1706, simple_loss=0.2471, pruned_loss=0.04701, over 4706665.35 frames. ], batch size: 256, lr: 4.47e-03, grad_scale: 16.0 2023-10-02 07:10:08,100 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_37357_210_17024_1_1533089504_2316121_258-211605-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 07:10:12,899 WARNING [train.py:1204] (2/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_550-335198-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 07:10:12,957 WARNING [train.py:1204] (2/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_98-741-0_sp0.9 from training. Duration: 0.888875 2023-10-02 07:10:14,278 WARNING [train.py:1204] (2/4) Exclude cut with ID _38323_210_12108_1_1533121233369_3303049_251-238988-0_sp1.1 from training. Duration: 0.92725 2023-10-02 07:10:14,361 WARNING [train.py:1204] (2/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_259-34038-0_sp0.9 from training. Duration: 0.82225 2023-10-02 07:10:17,152 WARNING [train.py:1204] (2/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_1060-180633-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 07:10:17,189 WARNING [train.py:1204] (2/4) Exclude cut with ID _8755_210_1876_1_1528628559357_3258209_211-37473-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 07:10:17,194 WARNING [train.py:1204] (2/4) Exclude cut with ID _4122_210_2084_1_1526713164417_4353280_528-206771-0 from training. Duration: 0.88 2023-10-02 07:10:17,436 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=793773.3333333334, ans=0.1 2023-10-02 07:10:22,925 WARNING [train.py:1204] (2/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_64-248359-0 from training. Duration: 0.69 2023-10-02 07:10:24,390 WARNING [train.py:1204] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_402-253027-0_sp1.1 from training. Duration: 0.4909375 2023-10-02 07:10:28,535 WARNING [train.py:1204] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_625-102955-0 from training. Duration: 0.77 2023-10-02 07:10:31,215 WARNING [train.py:1204] (2/4) Exclude cut with ID _14998_210_5219_1_1530705582080_7474970_611-219324-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 07:10:32,913 WARNING [train.py:1204] (2/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_718-68505-0_sp0.9 from training. Duration: 0.988875 2023-10-02 07:10:33,172 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.bypass.scale_min, batch_count=793840.0, ans=0.2 2023-10-02 07:10:34,330 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_353-182855-0_sp1.1 from training. Duration: 0.87275 2023-10-02 07:10:37,589 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_5960_210_1794_1_1527403865_3990999_515-2613-0_sp0.9 from training. Duration: 0.97775 2023-10-02 07:10:37,609 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_5575_210_1410_1_1527301305_2570170_342-265572-0 from training. Duration: 0.94 2023-10-02 07:10:42,795 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.541e+02 1.834e+02 1.960e+02 2.207e+02 4.164e+02, threshold=3.921e+02, percent-clipped=1.0 2023-10-02 07:10:42,885 WARNING [train.py:1204] (2/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_329-113173-0_sp0.9 from training. Duration: 0.688875 2023-10-02 07:10:44,177 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_15396_210_6890_1_1531308473_1938936_213-312506-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 07:10:44,234 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_521-214276-0_sp1.1 from training. Duration: 0.4 2023-10-02 07:10:48,384 WARNING [train.py:1204] (2/4) Exclude cut with ID _15026_210_8750_1_1530777602316_3291850_211-195043-0_sp1.1 from training. Duration: 0.72725 2023-10-02 07:10:49,783 WARNING [train.py:1204] (2/4) Exclude cut with ID _29343_210_4381_1_1532602757752_3674109_108-257494-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 07:10:51,331 WARNING [train.py:1204] (2/4) Exclude cut with ID _64414_210_17301_1_1535243252048_3757759_403-58151-0_sp1.1 from training. Duration: 0.963625 2023-10-02 07:10:52,726 WARNING [train.py:1204] (2/4) Exclude cut with ID _36512_210_6987_1_1533033143998_3126069_109-177787-0_sp1.1 from training. Duration: 0.92725 2023-10-02 07:10:54,155 WARNING [train.py:1204] (2/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_498-40153-0 from training. Duration: 0.63 2023-10-02 07:10:55,542 WARNING [train.py:1204] (2/4) Exclude cut with ID _56447_210_1389_1_1535074245910_3697189_161-334523-0_sp1.1 from training. Duration: 0.936375 2023-10-02 07:10:58,231 WARNING [train.py:1204] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_238-245655-0 from training. Duration: 0.81 2023-10-02 07:10:59,680 WARNING [train.py:1204] (2/4) Exclude cut with ID _64631_210_27767_1_1535349580068_4165160_108-43865-0_sp1.1 from training. Duration: 0.92725 2023-10-02 07:10:59,686 WARNING [train.py:1204] (2/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_243-42116-0_sp1.1 from training. Duration: 0.663625 2023-10-02 07:11:00,936 WARNING [train.py:1204] (2/4) Exclude cut with ID _66868_210_9527_1_1535509287954_4561140_594-132160-0_sp1.1 from training. Duration: 0.92725 2023-10-02 07:11:02,354 WARNING [train.py:1204] (2/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_48-338976-0_sp0.9 from training. Duration: 0.988875 2023-10-02 07:11:02,391 WARNING [train.py:1204] (2/4) Exclude cut with ID _5250_210_5219_1_1527249561806_8692110_137-100595-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 07:11:02,397 WARNING [train.py:1204] (2/4) Exclude cut with ID _49748_210_24116_1_1533986971737_3158910_12-271669-0_sp1.1 from training. Duration: 0.936375 2023-10-02 07:11:02,417 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_37357_210_17024_1_1533089504_2316121_206-80421-0_sp1.1 from training. Duration: 0.936375 2023-10-02 07:11:03,794 WARNING [train.py:1204] (2/4) Exclude cut with ID _7537_210_2084_1_1528502395431_3924359_113-199933-0_sp1.1 from training. Duration: 0.536375 2023-10-02 07:11:05,102 WARNING [train.py:1204] (2/4) Exclude cut with ID _55440_210_12651_1_1534496713364_2980520_31-59289-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 07:11:07,208 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1083-220122-0_sp0.9 from training. Duration: 0.5666875 2023-10-02 07:11:10,015 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_426-313557-0_sp1.1 from training. Duration: 0.4818125 2023-10-02 07:11:10,078 WARNING [train.py:1204] (2/4) Exclude cut with ID _35799_210_5854_1_1533263487887_3529071_324-94843-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 07:11:13,711 WARNING [train.py:1204] (2/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_264-108685-0_sp1.1 from training. Duration: 0.5 2023-10-02 07:11:14,918 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9012W0006-501141 from training. Duration: 0.896 2023-10-02 07:11:16,396 WARNING [train.py:1204] (2/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_890-5117-0_sp0.9 from training. Duration: 0.8666875 2023-10-02 07:11:16,435 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9023W0008-506301 from training. Duration: 0.896 2023-10-02 07:11:17,866 WARNING [train.py:1204] (2/4) Exclude cut with ID _13916_210_9994_1_1530788341126_3763149_60-191556-0_sp0.9 from training. Duration: 0.67775 2023-10-02 07:11:19,200 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9029W0331-509721 from training. Duration: 0.896 2023-10-02 07:11:20,451 INFO [train.py:1046] (2/4) Epoch 23, batch 2250, loss[loss=0.1804, simple_loss=0.2586, pruned_loss=0.05106, over 23253.00 frames. ], tot_loss[loss=0.1713, simple_loss=0.2481, pruned_loss=0.04724, over 4721933.21 frames. ], batch size: 93, lr: 4.47e-03, grad_scale: 16.0 2023-10-02 07:11:21,921 WARNING [train.py:1204] (2/4) Exclude cut with ID _35823_210_6917_1_1533187881914_7297830_567-108596-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 07:11:21,965 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7543_210_6753_1_1528544010_1110402_25-183860-0_sp1.1 from training. Duration: 0.42725 2023-10-02 07:11:23,318 WARNING [train.py:1204] (2/4) Exclude cut with ID _47278_210_12041_1_1533902418349_3576110_495-260035-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 07:11:24,719 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9051W0015-520281 from training. Duration: 0.9045625 2023-10-02 07:11:26,181 WARNING [train.py:1204] (2/4) Exclude cut with ID _14459_210_2228_1_1531443641946_7516700_707-66193-0_sp1.1 from training. Duration: 0.97275 2023-10-02 07:11:28,878 WARNING [train.py:1204] (2/4) Exclude cut with ID _14087_210_2253_1_1530529224512_3014029_219-224445-0_sp1.1 from training. Duration: 0.763625 2023-10-02 07:11:34,228 WARNING [train.py:1204] (2/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_345-167986-0_sp0.9 from training. Duration: 0.9333125 2023-10-02 07:11:34,348 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_34815_210_6388_1_1533381320_3571903_279-305565-0_sp1.1 from training. Duration: 0.836375 2023-10-02 07:11:37,590 WARNING [train.py:1204] (2/4) Exclude cut with ID _21987_210_5854_1_1531821500482_3637038_317-179212-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 07:11:39,475 WARNING [train.py:1204] (2/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_395-37222-0_sp1.1 from training. Duration: 0.6909375 2023-10-02 07:11:40,775 WARNING [train.py:1204] (2/4) Exclude cut with ID _8064_210_5399_1_1528372299291_4282973_168-50337-0_sp1.1 from training. Duration: 0.763625 2023-10-02 07:11:42,781 WARNING [train.py:1204] (2/4) Exclude cut with ID _60113_210_16271_1_1535164212481_4386079_559-167191-0 from training. Duration: 0.93 2023-10-02 07:11:42,792 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_47228_210_4983_1_1533780021_3672027_344-179642-0_sp1.1 from training. Duration: 0.92725 2023-10-02 07:11:42,827 WARNING [train.py:1204] (2/4) Exclude cut with ID _22961_210_8254_1_1531976366672_4278180_317-199546-0_sp0.9 from training. Duration: 0.9555625 2023-10-02 07:11:44,158 WARNING [train.py:1204] (2/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_141-89727-0 from training. Duration: 0.71 2023-10-02 07:11:44,223 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_78-116075-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 07:11:44,235 WARNING [train.py:1204] (2/4) Exclude cut with ID _64086_210_7994_1_1535270417019_7435949_52-235756-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 07:11:46,236 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7615_210_4211_1_1528110965_4580160_72-235211-0_sp1.1 from training. Duration: 0.6909375 2023-10-02 07:11:51,492 WARNING [train.py:1204] (2/4) Exclude cut with ID _14069_210_5632_1_1530512692033_4803390_287-27950-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 07:11:51,598 WARNING [train.py:1204] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_74-48717-0_sp0.9 from training. Duration: 0.6444375 2023-10-02 07:11:52,867 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7544_210_6753_1_1528591327_1471426_172-348777-0_sp1.1 from training. Duration: 0.6 2023-10-02 07:11:54,364 WARNING [train.py:1204] (2/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_220-191080-0 from training. Duration: 0.98 2023-10-02 07:11:55,708 WARNING [train.py:1204] (2/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_1081-137641-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 07:11:57,224 WARNING [train.py:1204] (2/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_278-235459-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 07:12:00,302 WARNING [train.py:1204] (2/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_1181-77470-0_sp1.1 from training. Duration: 0.936375 2023-10-02 07:12:02,983 WARNING [train.py:1204] (2/4) Exclude cut with ID _60860_210_8254_1_1534896015875_3557169_198-5531-0_sp1.1 from training. Duration: 0.936375 2023-10-02 07:12:04,310 WARNING [train.py:1204] (2/4) Exclude cut with ID _12369_210_4381_1_1530356078581_4388520_17-289781-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 07:12:04,318 WARNING [train.py:1204] (2/4) Exclude cut with ID _50310_210_15488_1_1534068043345_3641080_146-199752-0_sp1.1 from training. Duration: 0.92725 2023-10-02 07:12:07,695 WARNING [train.py:1204] (2/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_969-213354-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 07:12:07,951 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.conv_skip_rate, batch_count=794306.6666666666, ans=0.0 2023-10-02 07:12:09,071 WARNING [train.py:1204] (2/4) Exclude cut with ID _70519_210_4163_1_1535961595994_7458230_66-344156-0_sp1.1 from training. Duration: 0.963625 2023-10-02 07:12:13,792 WARNING [train.py:1204] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_380-204756-0_sp1.1 from training. Duration: 0.7818125 2023-10-02 07:12:17,102 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_128-202104-0_sp0.9 from training. Duration: 0.7 2023-10-02 07:12:22,659 WARNING [train.py:1204] (2/4) Exclude cut with ID _4697_210_2069_1_1526815801953_3823098_51-1049-0_sp0.9 from training. Duration: 0.8333125 2023-10-02 07:12:22,676 WARNING [train.py:1204] (2/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_86-66192-0_sp1.1 from training. Duration: 0.5 2023-10-02 07:12:22,743 WARNING [train.py:1204] (2/4) Exclude cut with ID _14069_210_5632_1_1530512692033_4803390_95-1896-0_sp1.1 from training. Duration: 0.8909375 2023-10-02 07:12:27,006 WARNING [train.py:1204] (2/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_29-293896-0_sp1.1 from training. Duration: 0.5545625 2023-10-02 07:12:29,707 WARNING [train.py:1204] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_167-137255-0_sp0.9 from training. Duration: 0.711125 2023-10-02 07:12:29,708 WARNING [train.py:1204] (2/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_217-95056-0 from training. Duration: 0.98 2023-10-02 07:12:29,732 WARNING [train.py:1204] (2/4) Exclude cut with ID _47278_210_12041_1_1533902418349_3576110_97-266746-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 07:12:31,076 WARNING [train.py:1204] (2/4) Exclude cut with ID _14224_210_4381_1_1530529062037_4264010_380-343670-0_sp1.1 from training. Duration: 0.8 2023-10-02 07:12:33,130 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.whiten.whitening_limit, batch_count=794440.0, ans=12.0 2023-10-02 07:12:33,764 INFO [train.py:1046] (2/4) Epoch 23, batch 2300, loss[loss=0.2369, simple_loss=0.2908, pruned_loss=0.09153, over 19699.00 frames. ], tot_loss[loss=0.1729, simple_loss=0.2496, pruned_loss=0.04806, over 4716592.31 frames. ], batch size: 390, lr: 4.47e-03, grad_scale: 16.0 2023-10-02 07:12:33,831 WARNING [train.py:1204] (2/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_107-332398-0 from training. Duration: 0.78 2023-10-02 07:12:36,601 WARNING [train.py:1204] (2/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_91-267696-0_sp1.1 from training. Duration: 0.7909375 2023-10-02 07:12:36,621 WARNING [train.py:1204] (2/4) Exclude cut with ID _41298_210_9019_1_1533430371450_4822143_498-186993-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 07:12:42,789 WARNING [train.py:1204] (2/4) Exclude cut with ID _17956_210_4353_1_1531209568125_6979970_54-77610-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 07:12:43,036 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.conv_module1.balancer1.prob, batch_count=794440.0, ans=0.125 2023-10-02 07:12:44,254 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_40047_210_2290_1_1533290519_2374311_257-335162-0_sp1.1 from training. Duration: 0.863625 2023-10-02 07:12:45,715 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9653W0193-675471 from training. Duration: 0.9656875 2023-10-02 07:12:47,553 WARNING [train.py:1204] (2/4) Exclude cut with ID _25248_210_8750_1_1532062825941_5554180_243-240536-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 07:12:47,806 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.balancer1.prob, batch_count=794506.6666666666, ans=0.125 2023-10-02 07:12:52,602 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.bypass.skip_rate, batch_count=794506.6666666666, ans=0.04949747468305833 2023-10-02 07:12:55,119 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_32360_210_4198_1_1532689160_3648436_60-51908-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 07:12:55,128 WARNING [train.py:1204] (2/4) Exclude cut with ID _4092_210_2151_1_1526643044449_3135759_227-213972-0_sp0.9 from training. Duration: 0.6 2023-10-02 07:12:55,166 WARNING [train.py:1204] (2/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_392-173500-0_sp1.1 from training. Duration: 0.97275 2023-10-02 07:12:56,471 WARNING [train.py:1204] (2/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_71-329150-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 07:12:56,474 WARNING [train.py:1204] (2/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_866-173440-0 from training. Duration: 0.9 2023-10-02 07:12:56,555 WARNING [train.py:1204] (2/4) Exclude cut with ID _14832_210_8750_1_1530691351539_3171348_1-54315-0_sp0.9 from training. Duration: 0.9666875 2023-10-02 07:12:59,314 WARNING [train.py:1204] (2/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_430-116139-0_sp1.1 from training. Duration: 0.77275 2023-10-02 07:13:00,584 WARNING [train.py:1204] (2/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_356-45054-0_sp0.9 from training. Duration: 0.9555625 2023-10-02 07:13:03,372 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_3531_210_3528_1_1526294203_5216456_309-227302-0_sp0.9 from training. Duration: 0.7666875 2023-10-02 07:13:04,984 WARNING [train.py:1204] (2/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_301-330455-0_sp1.1 from training. Duration: 0.72725 2023-10-02 07:13:07,794 WARNING [train.py:1204] (2/4) Exclude cut with ID _23557_210_8750_1_1531890106370_6846255_667-145945-0_sp1.1 from training. Duration: 0.92725 2023-10-02 07:13:09,520 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.542e+02 1.881e+02 2.032e+02 2.329e+02 3.115e+02, threshold=4.065e+02, percent-clipped=0.0 2023-10-02 07:13:14,187 WARNING [train.py:1204] (2/4) Exclude cut with ID _14860_210_4915_1_1530702020340_7343240_836-328489-0_sp1.1 from training. Duration: 0.8181875 2023-10-02 07:13:14,230 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_37152_210_9697_1_1533106573_3844188_416-333249-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 07:13:15,878 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.bypass.skip_rate, batch_count=794573.3333333334, ans=0.04949747468305833 2023-10-02 07:13:17,601 WARNING [train.py:1204] (2/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_538-32291-0_sp0.9 from training. Duration: 0.92225 2023-10-02 07:13:20,825 WARNING [train.py:1204] (2/4) Exclude cut with ID _32704_210_8954_1_1533017488840_6747370_965-176824-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 07:13:23,696 WARNING [train.py:1204] (2/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_86-19601-0_sp1.1 from training. Duration: 0.77275 2023-10-02 07:13:23,772 WARNING [train.py:1204] (2/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_436-318113-0_sp1.1 from training. Duration: 0.6818125 2023-10-02 07:13:25,327 WARNING [train.py:1204] (2/4) Exclude cut with ID _41817_210_18835_1_1533434477160_7423049_555-293448-0_sp0.9 from training. Duration: 0.888875 2023-10-02 07:13:25,337 WARNING [train.py:1204] (2/4) Exclude cut with ID _4697_210_2069_1_1526815801953_3823098_68-219807-0 from training. Duration: 0.47 2023-10-02 07:13:27,070 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.ff2_skip_rate, batch_count=794640.0, ans=0.0 2023-10-02 07:13:28,113 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7544_210_6753_1_1528591327_1471426_33-346031-0_sp1.1 from training. Duration: 0.5545625 2023-10-02 07:13:28,115 WARNING [train.py:1204] (2/4) Exclude cut with ID _49748_210_24116_1_1533986971737_3158910_263-52113-0_sp1.1 from training. Duration: 0.97275 2023-10-02 07:13:29,485 WARNING [train.py:1204] (2/4) Exclude cut with ID _64639_210_5816_1_1535367619437_3528000_340-24580-0_sp1.1 from training. Duration: 0.92725 2023-10-02 07:13:29,500 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_47228_210_4983_1_1533780021_3672027_479-114941-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 07:13:29,533 WARNING [train.py:1204] (2/4) Exclude cut with ID _44585_210_18628_1_1533605350387_4518199_506-119659-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 07:13:30,844 WARNING [train.py:1204] (2/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_314-126819-0_sp1.1 from training. Duration: 0.4545625 2023-10-02 07:13:30,847 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_2541_210_1794_1_1525590269_3084765_156-173713-0_sp0.9 from training. Duration: 0.9 2023-10-02 07:13:32,168 WARNING [train.py:1204] (2/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_108-255430-0 from training. Duration: 0.97 2023-10-02 07:13:32,176 WARNING [train.py:1204] (2/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_791-45149-0_sp1.1 from training. Duration: 0.7818125 2023-10-02 07:13:32,190 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_53378_210_17282_1_1534227054_1261425_10-118295-0_sp1.1 from training. Duration: 0.97275 2023-10-02 07:13:33,537 WARNING [train.py:1204] (2/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_148-158509-0 from training. Duration: 0.41 2023-10-02 07:13:33,847 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=794706.6666666666, ans=0.1 2023-10-02 07:13:37,944 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_34726_210_4525_1_1533019474_5030395_687-291305-0_sp1.1 from training. Duration: 0.936375 2023-10-02 07:13:40,073 WARNING [train.py:1204] (2/4) Exclude cut with ID _14860_210_4915_1_1530702020340_7343240_264-324537-0_sp1.1 from training. Duration: 0.963625 2023-10-02 07:13:45,365 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_41251_210_15368_1_1533429767_4820695_591-46386-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 07:13:47,194 WARNING [train.py:1204] (2/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_86-19601-0_sp0.9 from training. Duration: 0.9444375 2023-10-02 07:13:47,214 WARNING [train.py:1204] (2/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_401-255491-0_sp0.9 from training. Duration: 0.77775 2023-10-02 07:13:48,592 INFO [train.py:1046] (2/4) Epoch 23, batch 2350, loss[loss=0.169, simple_loss=0.2533, pruned_loss=0.04236, over 24457.00 frames. ], tot_loss[loss=0.1739, simple_loss=0.2506, pruned_loss=0.04863, over 4713522.57 frames. ], batch size: 63, lr: 4.47e-03, grad_scale: 16.0 2023-10-02 07:13:48,675 WARNING [train.py:1204] (2/4) Exclude cut with ID _8696_210_1876_1_1528545632440_3345058_209-54069-0_sp1.1 from training. Duration: 0.6090625 2023-10-02 07:13:48,692 WARNING [train.py:1204] (2/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_369-325184-0_sp1.1 from training. Duration: 0.9 2023-10-02 07:13:48,755 WARNING [train.py:1204] (2/4) Exclude cut with ID _14687_210_5854_1_1530703838259_3664889_115-239046-0_sp0.9 from training. Duration: 0.7444375 2023-10-02 07:13:50,525 WARNING [train.py:1204] (2/4) Exclude cut with ID _8064_210_5399_1_1528372299291_4282973_168-50337-0 from training. Duration: 0.84 2023-10-02 07:13:56,236 WARNING [train.py:1204] (2/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_909-64151-0_sp1.1 from training. Duration: 0.8545625 2023-10-02 07:13:56,246 WARNING [train.py:1204] (2/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_597-154640-0 from training. Duration: 0.9 2023-10-02 07:14:01,950 WARNING [train.py:1204] (2/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_126-274694-0 from training. Duration: 0.98 2023-10-02 07:14:04,655 WARNING [train.py:1204] (2/4) Exclude cut with ID _21783_210_11357_1_1531742458730_3762679_344-269786-0_sp1.1 from training. Duration: 0.97275 2023-10-02 07:14:08,754 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_37818_210_4238_1_1533620910_4720083_322-253154-0_sp1.1 from training. Duration: 0.92725 2023-10-02 07:14:08,757 WARNING [train.py:1204] (2/4) Exclude cut with ID _58895_210_8750_1_1535184081320_3649089_221-15823-0_sp1.1 from training. Duration: 0.92725 2023-10-02 07:14:08,773 WARNING [train.py:1204] (2/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_9-21737-0_sp1.1 from training. Duration: 0.8818125 2023-10-02 07:14:08,811 WARNING [train.py:1204] (2/4) Exclude cut with ID _14069_210_5632_1_1530512692033_4803390_522-341588-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 07:14:10,186 WARNING [train.py:1204] (2/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_363-185703-0 from training. Duration: 0.95 2023-10-02 07:14:11,683 WARNING [train.py:1204] (2/4) Exclude cut with ID _41817_210_18835_1_1533434477160_7423049_265-76216-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 07:14:17,194 WARNING [train.py:1204] (2/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_841-341679-0 from training. Duration: 0.58 2023-10-02 07:14:18,596 WARNING [train.py:1204] (2/4) Exclude cut with ID _55429_210_10626_1_1534467588527_7174143_97-335084-0_sp1.1 from training. Duration: 0.8818125 2023-10-02 07:14:21,337 WARNING [train.py:1204] (2/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_690-126306-0_sp0.9 from training. Duration: 0.8666875 2023-10-02 07:14:21,354 WARNING [train.py:1204] (2/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_102-92751-0_sp1.1 from training. Duration: 0.9 2023-10-02 07:14:24,690 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_3787_210_2040_1_1526477544_5478586_603-120324-0_sp1.1 from training. Duration: 0.736375 2023-10-02 07:14:26,161 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_33348_210_5632_1_1533086912_3324030_85-28583-0 from training. Duration: 0.82 2023-10-02 07:14:26,220 WARNING [train.py:1204] (2/4) Exclude cut with ID _60057_210_10640_1_1534903065793_3333150_362-230377-0_sp1.1 from training. Duration: 0.7909375 2023-10-02 07:14:26,363 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=794906.6666666666, ans=0.1 2023-10-02 07:14:27,602 WARNING [train.py:1204] (2/4) Exclude cut with ID _60113_210_16271_1_1535164212481_4386079_194-146486-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 07:14:27,618 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_53753_210_7234_1_1534320003_3594148_171-277668-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 07:14:27,651 WARNING [train.py:1204] (2/4) Exclude cut with ID _34423_210_8750_1_1533430798456_3330199_251-332788-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 07:14:30,491 WARNING [train.py:1204] (2/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_663-166973-0_sp0.9 from training. Duration: 0.888875 2023-10-02 07:14:30,709 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.feed_forward1.out_proj.dropout_p, batch_count=794906.6666666666, ans=0.1 2023-10-02 07:14:31,944 WARNING [train.py:1204] (2/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_927-43827-0 from training. Duration: 0.74 2023-10-02 07:14:33,277 WARNING [train.py:1204] (2/4) Exclude cut with ID _28323_210_16187_1_1532480395667_4100300_306-74561-0_sp1.1 from training. Duration: 0.8545625 2023-10-02 07:14:34,736 WARNING [train.py:1204] (2/4) Exclude cut with ID _15037_210_2253_1_1530752294104_3267650_139-337989-0_sp1.1 from training. Duration: 0.92725 2023-10-02 07:14:34,750 WARNING [train.py:1204] (2/4) Exclude cut with ID _49039_210_6315_1_1533967537541_3846118_460-263670-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 07:14:36,308 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.ff2_skip_rate, batch_count=794973.3333333334, ans=0.0 2023-10-02 07:14:37,528 WARNING [train.py:1204] (2/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_186-309161-0 from training. Duration: 0.54 2023-10-02 07:14:39,287 WARNING [train.py:1204] (2/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_87-278686-0_sp1.1 from training. Duration: 0.7 2023-10-02 07:14:43,926 WARNING [train.py:1204] (2/4) Exclude cut with ID _27052_210_5986_1_1532329197187_3585009_314-106499-0 from training. Duration: 0.92 2023-10-02 07:14:43,943 WARNING [train.py:1204] (2/4) Exclude cut with ID _3465_210_3525_1_1526175667060_3574049_412-287526-0_sp0.9 from training. Duration: 0.8 2023-10-02 07:14:47,471 WARNING [train.py:1204] (2/4) Exclude cut with ID _22961_210_8254_1_1531976366672_4278180_317-199546-0 from training. Duration: 0.86 2023-10-02 07:14:51,653 WARNING [train.py:1204] (2/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_381-317791-0 from training. Duration: 0.73 2023-10-02 07:14:52,807 WARNING [train.py:1204] (2/4) Exclude cut with ID _44010_210_4525_1_1533884262964_4013620_560-310411-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 07:14:52,812 WARNING [train.py:1204] (2/4) Exclude cut with ID _5294_210_1747_1_1527249643928_3696179_342-270278-0_sp0.9 from training. Duration: 0.611125 2023-10-02 07:14:52,831 WARNING [train.py:1204] (2/4) Exclude cut with ID ID0473W0002-918951 from training. Duration: 0.924 2023-10-02 07:14:52,848 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1083-220122-0 from training. Duration: 0.51 2023-10-02 07:14:54,811 WARNING [train.py:1204] (2/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_138-154812-0 from training. Duration: 0.78 2023-10-02 07:14:57,588 WARNING [train.py:1204] (2/4) Exclude cut with ID _44982_210_1511_1_1534312820108_3726029_331-19420-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 07:14:57,799 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.conv_module2.balancer2.prob, batch_count=795040.0, ans=0.125 2023-10-02 07:15:01,563 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_2655_210_2087_1_1525688959_3897400_202-147557-0_sp0.9 from training. Duration: 0.9444375 2023-10-02 07:15:02,810 INFO [train.py:1046] (2/4) Epoch 23, batch 2400, loss[loss=0.1813, simple_loss=0.2498, pruned_loss=0.05638, over 23634.00 frames. ], tot_loss[loss=0.1736, simple_loss=0.2501, pruned_loss=0.04857, over 4719691.41 frames. ], batch size: 149, lr: 4.47e-03, grad_scale: 32.0 2023-10-02 07:15:05,553 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_23850_210_4238_1_1532232597_1632433_32-185135-0_sp0.9 from training. Duration: 0.9555625 2023-10-02 07:15:07,004 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_14056_210_4163_1_1530604483_7572827_46-93547-0_sp1.1 from training. Duration: 0.863625 2023-10-02 07:15:08,995 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7931_210_1794_1_1528594728_5564059_280-279560-0 from training. Duration: 0.98 2023-10-02 07:15:09,035 WARNING [train.py:1204] (2/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_937-208281-0 from training. Duration: 0.85 2023-10-02 07:15:10,088 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.0.layers.0.self_attn_weights.whiten_keys, num_groups=4, num_channels=128, metric=2.98 vs. limit=6.0 2023-10-02 07:15:13,491 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.0.bypass_mid.scale_min, batch_count=795106.6666666666, ans=0.2 2023-10-02 07:15:17,421 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_14928_210_5632_1_1530773461_4714000_454-112198-0_sp0.9 from training. Duration: 0.7333125 2023-10-02 07:15:17,422 WARNING [train.py:1204] (2/4) Exclude cut with ID _32704_210_8954_1_1533017488840_6747370_944-153333-0_sp1.1 from training. Duration: 0.963625 2023-10-02 07:15:18,759 WARNING [train.py:1204] (2/4) Exclude cut with ID _14821_210_8750_1_1530680503991_6829359_2-227151-0 from training. Duration: 0.83 2023-10-02 07:15:19,536 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.1.encoder.layers.1.self_attn2.whiten, num_groups=1, num_channels=256, metric=11.51 vs. limit=22.5 2023-10-02 07:15:20,086 WARNING [train.py:1204] (2/4) Exclude cut with ID _28461_210_2253_1_1532771775189_3041299_96-152158-0_sp1.1 from training. Duration: 0.77275 2023-10-02 07:15:20,154 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7837_210_1792_1_1528453445_5841305_654-54195-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 07:15:21,477 WARNING [train.py:1204] (2/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_344-152526-0 from training. Duration: 0.53 2023-10-02 07:15:26,529 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.3.encoder.layers.2.nonlin_attention.whiten1, num_groups=1, num_channels=384, metric=4.23 vs. limit=10.0 2023-10-02 07:15:27,163 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_30167_210_3528_1_1532428012_4772375_480-142230-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 07:15:29,076 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_160-218721-0 from training. Duration: 0.96 2023-10-02 07:15:33,350 WARNING [train.py:1204] (2/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_65-88242-0_sp0.9 from training. Duration: 0.62225 2023-10-02 07:15:36,297 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_34886_210_10633_1_1533203882_1755661_76-156096-0 from training. Duration: 0.87 2023-10-02 07:15:37,841 WARNING [train.py:1204] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_188-38388-0_sp1.1 from training. Duration: 0.92725 2023-10-02 07:15:39,698 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.561e+02 1.839e+02 2.014e+02 2.319e+02 3.519e+02, threshold=4.028e+02, percent-clipped=0.0 2023-10-02 07:15:41,117 WARNING [train.py:1204] (2/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_81-289335-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 07:15:42,724 WARNING [train.py:1204] (2/4) Exclude cut with ID _59493_210_17282_1_1535078419077_3225879_110-8778-0_sp1.1 from training. Duration: 0.936375 2023-10-02 07:15:44,571 WARNING [train.py:1204] (2/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_188-260011-0 from training. Duration: 0.73 2023-10-02 07:15:44,612 WARNING [train.py:1204] (2/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_453-133983-0_sp0.9 from training. Duration: 0.8666875 2023-10-02 07:15:51,982 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8054_210_7927_1_1528627721_4984094_553-17379-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 07:15:54,704 WARNING [train.py:1204] (2/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_341-171072-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 07:15:57,466 WARNING [train.py:1204] (2/4) Exclude cut with ID _30800_210_22382_1_1532499347904_4515959_265-257286-0_sp1.1 from training. Duration: 0.963625 2023-10-02 07:15:59,414 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_403-39619-0_sp0.9 from training. Duration: 0.9333125 2023-10-02 07:15:59,418 WARNING [train.py:1204] (2/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_464-33931-0_sp0.9 from training. Duration: 0.6 2023-10-02 07:15:59,451 WARNING [train.py:1204] (2/4) Exclude cut with ID _60804_210_9642_1_1534831199859_7268299_632-143863-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 07:15:59,458 WARNING [train.py:1204] (2/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_431-310997-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 07:15:59,501 WARNING [train.py:1204] (2/4) Exclude cut with ID _31649_210_6228_1_1532584608063_4091078_114-263860-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 07:16:00,767 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6961_210_2087_1_1527937168_6815011_562-165518-0_sp0.9 from training. Duration: 0.8555625 2023-10-02 07:16:03,805 WARNING [train.py:1204] (2/4) Exclude cut with ID _27052_210_5986_1_1532329197187_3585009_338-325870-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 07:16:05,075 WARNING [train.py:1204] (2/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_469-94673-0_sp1.1 from training. Duration: 0.7181875 2023-10-02 07:16:05,108 WARNING [train.py:1204] (2/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_259-34038-0 from training. Duration: 0.74 2023-10-02 07:16:06,968 WARNING [train.py:1204] (2/4) Exclude cut with ID _14922_210_6228_1_1530707435273_3693310_388-129552-0 from training. Duration: 0.96 2023-10-02 07:16:08,430 WARNING [train.py:1204] (2/4) Exclude cut with ID _28394_210_8913_1_1532430049518_3650118_342-251455-0_sp1.1 from training. Duration: 0.936375 2023-10-02 07:16:08,459 WARNING [train.py:1204] (2/4) Exclude cut with ID _53731_210_7994_1_1534553979244_3757214_8-191390-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 07:16:09,063 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.4.encoder.layers.1.conv_module1.whiten, num_groups=1, num_channels=384, metric=3.28 vs. limit=15.0 2023-10-02 07:16:09,782 WARNING [train.py:1204] (2/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_613-232856-0 from training. Duration: 0.54 2023-10-02 07:16:09,844 WARNING [train.py:1204] (2/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_557-349534-0 from training. Duration: 0.91 2023-10-02 07:16:09,859 WARNING [train.py:1204] (2/4) Exclude cut with ID _14224_210_4381_1_1530529062037_4264010_379-184223-0 from training. Duration: 0.85 2023-10-02 07:16:09,870 WARNING [train.py:1204] (2/4) Exclude cut with ID IC0125W0001-61807 from training. Duration: 0.966 2023-10-02 07:16:11,779 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_229-45300-0 from training. Duration: 0.57 2023-10-02 07:16:13,161 WARNING [train.py:1204] (2/4) Exclude cut with ID _30304_210_6319_1_1532476827261_7582060_1069-24743-0_sp1.1 from training. Duration: 0.92725 2023-10-02 07:16:14,594 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_41452_210_7291_1_1533437687_3941281_323-172873-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 07:16:14,603 WARNING [train.py:1204] (2/4) Exclude cut with ID _37086_210_9038_1_1532999540891_7021250_802-226709-0_sp1.1 from training. Duration: 0.97275 2023-10-02 07:16:15,937 WARNING [train.py:1204] (2/4) Exclude cut with ID IC0144W0500-71767 from training. Duration: 0.931875 2023-10-02 07:16:17,510 WARNING [train.py:1204] (2/4) Exclude cut with ID _39846_210_12651_1_1533605310265_2115212_233-129699-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 07:16:18,737 INFO [train.py:1046] (2/4) Epoch 23, batch 2450, loss[loss=0.1684, simple_loss=0.2557, pruned_loss=0.04055, over 24641.00 frames. ], tot_loss[loss=0.1725, simple_loss=0.2494, pruned_loss=0.0478, over 4725179.86 frames. ], batch size: 68, lr: 4.47e-03, grad_scale: 32.0 2023-10-02 07:16:18,826 WARNING [train.py:1204] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_574-45541-0_sp0.9 from training. Duration: 0.72225 2023-10-02 07:16:22,127 WARNING [train.py:1204] (2/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_372-298093-0_sp1.1 from training. Duration: 0.72725 2023-10-02 07:16:22,147 WARNING [train.py:1204] (2/4) Exclude cut with ID _62574_210_2655_1_1535180407058_3933249_34-26343-0_sp1.1 from training. Duration: 0.97275 2023-10-02 07:16:22,347 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.bypass.scale_min, batch_count=795440.0, ans=0.2 2023-10-02 07:16:26,301 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_512-256238-0_sp1.1 from training. Duration: 0.963625 2023-10-02 07:16:26,306 WARNING [train.py:1204] (2/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_196-55502-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 07:16:27,655 WARNING [train.py:1204] (2/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_195-167011-0 from training. Duration: 0.75 2023-10-02 07:16:32,583 WARNING [train.py:1204] (2/4) Exclude cut with ID _13678_210_7497_1_1530530069226_3463170_345-230416-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 07:16:32,601 WARNING [train.py:1204] (2/4) Exclude cut with ID _51078_210_9070_1_1534161557424_3805250_225-327809-0_sp1.1 from training. Duration: 0.963625 2023-10-02 07:16:36,653 WARNING [train.py:1204] (2/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_18-189198-0_sp1.1 from training. Duration: 0.8181875 2023-10-02 07:16:36,670 WARNING [train.py:1204] (2/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_329-82235-0_sp1.1 from training. Duration: 0.7454375 2023-10-02 07:16:36,678 WARNING [train.py:1204] (2/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_726-270780-0_sp0.9 from training. Duration: 0.9555625 2023-10-02 07:16:37,932 WARNING [train.py:1204] (2/4) Exclude cut with ID _66873_210_5517_1_1535454136506_4293370_654-52713-0 from training. Duration: 0.77 2023-10-02 07:16:41,347 WARNING [train.py:1204] (2/4) Exclude cut with ID _20455_210_5219_1_1531565996775_8098100_191-16006-0_sp1.1 from training. Duration: 0.963625 2023-10-02 07:16:44,625 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_3171_210_2087_1_1526109084_2330007_66-260583-0_sp1.1 from training. Duration: 0.6545625 2023-10-02 07:16:44,670 WARNING [train.py:1204] (2/4) Exclude cut with ID _38670_210_15379_1_1533448810659_4391059_63-129006-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 07:16:47,597 WARNING [train.py:1204] (2/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_393-57888-0_sp0.9 from training. Duration: 0.711125 2023-10-02 07:16:47,624 WARNING [train.py:1204] (2/4) Exclude cut with ID _6275_210_2084_1_1528110062213_7669049_857-298942-0_sp0.9 from training. Duration: 0.9444375 2023-10-02 07:16:49,048 WARNING [train.py:1204] (2/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_239-22076-0_sp0.9 from training. Duration: 0.9444375 2023-10-02 07:16:49,086 WARNING [train.py:1204] (2/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_424-227947-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 07:16:51,257 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.nonlin_attention.balancer.prob, batch_count=795573.3333333334, ans=0.125 2023-10-02 07:16:52,236 WARNING [train.py:1204] (2/4) Exclude cut with ID _14069_210_5632_1_1530512692033_4803390_95-1896-0 from training. Duration: 0.98 2023-10-02 07:16:53,582 WARNING [train.py:1204] (2/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_96-244299-0_sp0.9 from training. Duration: 0.97775 2023-10-02 07:17:01,378 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.1.encoder.layers.0.whiten, num_groups=1, num_channels=256, metric=4.97 vs. limit=12.0 2023-10-02 07:17:01,385 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.1.encoder.layers.0.conv_module2.whiten, num_groups=1, num_channels=256, metric=4.84 vs. limit=15.0 2023-10-02 07:17:01,926 WARNING [train.py:1204] (2/4) Exclude cut with ID _46683_210_9642_1_1533730913492_4591116_562-319825-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 07:17:03,688 WARNING [train.py:1204] (2/4) Exclude cut with ID _64639_210_5816_1_1535367619437_3528000_284-173474-0_sp1.1 from training. Duration: 0.963625 2023-10-02 07:17:03,717 WARNING [train.py:1204] (2/4) Exclude cut with ID _29529_210_7234_1_1532393961455_3977460_497-100133-0_sp1.1 from training. Duration: 0.936375 2023-10-02 07:17:05,066 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_12868_210_6753_1_1530701932_530275_18-76322-0_sp0.9 from training. Duration: 0.9666875 2023-10-02 07:17:05,095 WARNING [train.py:1204] (2/4) Exclude cut with ID _50093_210_4220_1_1534417141207_3432299_116-303447-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 07:17:06,615 WARNING [train.py:1204] (2/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_44-79427-0_sp1.1 from training. Duration: 0.8909375 2023-10-02 07:17:06,676 WARNING [train.py:1204] (2/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_887-249555-0 from training. Duration: 0.77 2023-10-02 07:17:09,502 WARNING [train.py:1204] (2/4) Exclude cut with ID _28461_210_2253_1_1532771775189_3041299_96-152158-0_sp0.9 from training. Duration: 0.9444375 2023-10-02 07:17:09,818 INFO [scaling.py:1118] (2/4) WithLoss: name=encoder.encoders.5.encoder.layers.0.self_attn_weights, loss-sum=0.000e+00 2023-10-02 07:17:11,411 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_14058_210_4163_1_1530690953_6327180_49-210687-0_sp1.1 from training. Duration: 0.8545625 2023-10-02 07:17:14,227 WARNING [train.py:1204] (2/4) Exclude cut with ID _40399_210_4353_1_1533305658194_3279406_351-127903-0_sp1.1 from training. Duration: 0.97275 2023-10-02 07:17:14,233 WARNING [train.py:1204] (2/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_184-206939-0_sp1.1 from training. Duration: 0.936375 2023-10-02 07:17:17,814 WARNING [train.py:1204] (2/4) Exclude cut with ID _14100_210_8341_1_1530529320613_3678069_258-224599-0_sp1.1 from training. Duration: 0.7 2023-10-02 07:17:19,120 WARNING [train.py:1204] (2/4) Exclude cut with ID _6493_210_1747_1_1528459210424_4012470_324-323854-0 from training. Duration: 0.92 2023-10-02 07:17:19,216 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_12868_210_6753_1_1530702479_3610564_151-325626-0_sp1.1 from training. Duration: 0.8818125 2023-10-02 07:17:20,544 WARNING [train.py:1204] (2/4) Exclude cut with ID _14528_210_8254_1_1530669700514_4306939_106-43145-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 07:17:21,200 WARNING [train.py:1204] (2/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_686-54383-0 from training. Duration: 0.87 2023-10-02 07:17:22,305 WARNING [train.py:1204] (2/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_801-102362-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 07:17:23,666 WARNING [train.py:1204] (2/4) Exclude cut with ID _48677_210_5663_1_1533902405919_3892989_408-40740-0_sp0.9 from training. Duration: 0.92225 2023-10-02 07:17:25,227 WARNING [train.py:1204] (2/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_448-212135-0_sp1.1 from training. Duration: 0.82725 2023-10-02 07:17:27,847 WARNING [train.py:1204] (2/4) Exclude cut with ID _59170_210_6721_1_1534983633960_4203335_320-124047-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 07:17:27,893 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_4692_210_3528_1_1526899755_4323254_107-44401-0_sp1.1 from training. Duration: 0.7818125 2023-10-02 07:17:30,665 WARNING [train.py:1204] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_731-2428-0 from training. Duration: 0.83 2023-10-02 07:17:32,027 WARNING [train.py:1204] (2/4) Exclude cut with ID _29857_210_20034_1_1532395886753_3798160_335-163562-0_sp1.1 from training. Duration: 0.77275 2023-10-02 07:17:33,342 INFO [train.py:1046] (2/4) Epoch 23, batch 2500, loss[loss=0.1762, simple_loss=0.2606, pruned_loss=0.04585, over 24411.00 frames. ], tot_loss[loss=0.1717, simple_loss=0.248, pruned_loss=0.04769, over 4706525.71 frames. ], batch size: 77, lr: 4.47e-03, grad_scale: 16.0 2023-10-02 07:17:39,663 WARNING [train.py:1204] (2/4) Exclude cut with ID _14832_210_8750_1_1530691351539_3171348_171-275004-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 07:17:47,942 WARNING [train.py:1204] (2/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_105-328174-0_sp0.9 from training. Duration: 0.8444375 2023-10-02 07:17:49,323 WARNING [train.py:1204] (2/4) Exclude cut with ID _68965_210_1883_1_1535698962272_3497490_164-118421-0_sp1.1 from training. Duration: 0.936375 2023-10-02 07:17:50,672 WARNING [train.py:1204] (2/4) Exclude cut with ID _19896_210_1883_1_1531709022662_6649569_380-24170-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 07:17:50,677 WARNING [train.py:1204] (2/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_505-205270-0_sp1.1 from training. Duration: 0.3454375 2023-10-02 07:17:56,738 WARNING [train.py:1204] (2/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_427-208901-0_sp1.1 from training. Duration: 0.6181875 2023-10-02 07:17:58,080 WARNING [train.py:1204] (2/4) Exclude cut with ID _23818_210_6228_1_1531917063884_3676920_85-326916-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 07:17:58,157 WARNING [train.py:1204] (2/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_101-293367-0_sp1.1 from training. Duration: 0.57275 2023-10-02 07:17:58,164 WARNING [train.py:1204] (2/4) Exclude cut with ID _5409_210_1388_1_1527292850189_5865191_178-127970-0_sp1.1 from training. Duration: 0.5181875 2023-10-02 07:17:59,546 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_403-39619-0 from training. Duration: 0.84 2023-10-02 07:17:59,746 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.nonlin_attention.balancer.prob, batch_count=795840.0, ans=0.125 2023-10-02 07:18:00,898 WARNING [train.py:1204] (2/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_427-87841-0_sp1.1 from training. Duration: 0.963625 2023-10-02 07:18:02,271 WARNING [train.py:1204] (2/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_286-68181-0_sp1.1 from training. Duration: 0.97275 2023-10-02 07:18:03,607 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6673_210_3273_1_1527767790_3173240_327-306185-0 from training. Duration: 0.83 2023-10-02 07:18:03,623 WARNING [train.py:1204] (2/4) Exclude cut with ID _3675_210_4557_1_1526639934408_4182089_106-203713-0_sp1.1 from training. Duration: 0.963625 2023-10-02 07:18:03,677 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_53071_210_16554_1_1534493434_1276796_130-36076-0 from training. Duration: 0.95 2023-10-02 07:18:04,982 WARNING [train.py:1204] (2/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_586-93054-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 07:18:05,186 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.attention_skip_rate, batch_count=795906.6666666666, ans=0.0 2023-10-02 07:18:09,689 WARNING [train.py:1204] (2/4) Exclude cut with ID _23825_210_13449_1_1531915313526_3497150_262-10571-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 07:18:09,773 WARNING [train.py:1204] (2/4) Exclude cut with ID _28783_210_7943_1_1532671164808_7155479_432-67885-0_sp1.1 from training. Duration: 0.97275 2023-10-02 07:18:11,066 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.545e+02 1.892e+02 2.068e+02 2.319e+02 3.851e+02, threshold=4.136e+02, percent-clipped=0.0 2023-10-02 07:18:12,732 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6287_210_3273_1_1527509506_2653716_182-199887-0_sp0.9 from training. Duration: 0.5666875 2023-10-02 07:18:12,947 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.feed_forward1.hidden_balancer.prob, batch_count=795906.6666666666, ans=0.125 2023-10-02 07:18:14,659 WARNING [train.py:1204] (2/4) Exclude cut with ID _3231_210_2106_1_1526205864934_6784220_171-256004-0 from training. Duration: 0.86 2023-10-02 07:18:14,715 WARNING [train.py:1204] (2/4) Exclude cut with ID _69051_210_1531_1_1535885566901_3829538_332-92090-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 07:18:16,162 WARNING [train.py:1204] (2/4) Exclude cut with ID _3675_210_4557_1_1526639934408_4182089_104-324125-0_sp1.1 from training. Duration: 0.963625 2023-10-02 07:18:20,824 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_41764_210_2521_1_1533686110_4089313_501-2119-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 07:18:22,396 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_14056_210_4163_1_1530604483_7572827_56-100988-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 07:18:26,965 WARNING [train.py:1204] (2/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_398-239819-0_sp1.1 from training. Duration: 0.9 2023-10-02 07:18:32,573 WARNING [train.py:1204] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_74-48717-0_sp1.1 from training. Duration: 0.52725 2023-10-02 07:18:35,371 WARNING [train.py:1204] (2/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_177-49092-0 from training. Duration: 0.91 2023-10-02 07:18:35,387 WARNING [train.py:1204] (2/4) Exclude cut with ID _62337_210_12392_1_1535009273105_3412229_44-176353-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 07:18:35,404 WARNING [train.py:1204] (2/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_110-130961-0_sp1.1 from training. Duration: 0.42725 2023-10-02 07:18:36,756 WARNING [train.py:1204] (2/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_37-189262-0_sp1.1 from training. Duration: 0.8909375 2023-10-02 07:18:36,757 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_3172_210_2087_1_1526292929_3987865_197-97762-0_sp0.9 from training. Duration: 0.8333125 2023-10-02 07:18:38,168 WARNING [train.py:1204] (2/4) Exclude cut with ID IC0677W0065-334897 from training. Duration: 0.896 2023-10-02 07:18:38,168 WARNING [train.py:1204] (2/4) Exclude cut with ID IC0677W0080-334912 from training. Duration: 0.768 2023-10-02 07:18:38,173 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_120-41668-0 from training. Duration: 0.7 2023-10-02 07:18:40,980 WARNING [train.py:1204] (2/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_460-205287-0_sp1.1 from training. Duration: 0.963625 2023-10-02 07:18:42,891 WARNING [train.py:1204] (2/4) Exclude cut with ID _14632_210_9988_1_1530691279392_3923200_102-204477-0 from training. Duration: 0.97 2023-10-02 07:18:42,896 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_34815_210_6388_1_1533381320_3571903_279-305565-0 from training. Duration: 0.92 2023-10-02 07:18:44,167 WARNING [train.py:1204] (2/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_91-148831-0_sp1.1 from training. Duration: 0.9 2023-10-02 07:18:46,184 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6291_210_3273_1_1527683649_1078498_82-221196-0 from training. Duration: 0.76 2023-10-02 07:18:49,410 INFO [train.py:1046] (2/4) Epoch 23, batch 2550, loss[loss=0.1928, simple_loss=0.2614, pruned_loss=0.06207, over 23482.00 frames. ], tot_loss[loss=0.1719, simple_loss=0.2483, pruned_loss=0.04775, over 4710634.17 frames. ], batch size: 285, lr: 4.47e-03, grad_scale: 16.0 2023-10-02 07:18:50,809 WARNING [train.py:1204] (2/4) Exclude cut with ID _38020_210_5986_1_1533112252259_7006089_493-102745-0 from training. Duration: 0.98 2023-10-02 07:18:51,081 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.conv_module2.balancer1.min_positive, batch_count=796106.6666666666, ans=0.025 2023-10-02 07:18:51,165 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.conv_skip_rate, batch_count=796106.6666666666, ans=0.0 2023-10-02 07:18:51,173 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.conv_module1.balancer2.prob, batch_count=796106.6666666666, ans=0.125 2023-10-02 07:18:52,424 WARNING [train.py:1204] (2/4) Exclude cut with ID _61133_210_9969_1_1535196777227_3696409_40-93442-0_sp1.1 from training. Duration: 0.92725 2023-10-02 07:18:53,852 WARNING [train.py:1204] (2/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_45-258852-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 07:18:55,105 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_53071_210_16554_1_1534493434_1276796_130-36076-0_sp1.1 from training. Duration: 0.863625 2023-10-02 07:18:55,380 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.conv_module1.balancer2.prob, batch_count=796106.6666666666, ans=0.125 2023-10-02 07:18:57,274 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8694_210_6400_1_1529065754_3260756_332-13553-0_sp1.1 from training. Duration: 0.92725 2023-10-02 07:18:57,961 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.2.encoder.layers.1.self_attn2.whiten, num_groups=1, num_channels=384, metric=17.39 vs. limit=22.5 2023-10-02 07:18:58,407 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_405-339986-0 from training. Duration: 0.51 2023-10-02 07:18:58,454 WARNING [train.py:1204] (2/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_927-43827-0_sp1.1 from training. Duration: 0.67275 2023-10-02 07:19:02,632 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_397-147196-0 from training. Duration: 0.8 2023-10-02 07:19:04,084 WARNING [train.py:1204] (2/4) Exclude cut with ID 3557-8342-0013-54691-0_sp1.1 from training. Duration: 0.836375 2023-10-02 07:19:05,549 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_47438_210_5399_1_1533797471_4513356_626-127722-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 07:19:06,875 WARNING [train.py:1204] (2/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_256-323883-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 07:19:06,882 WARNING [train.py:1204] (2/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_839-173017-0_sp0.9 from training. Duration: 0.4555625 2023-10-02 07:19:06,906 WARNING [train.py:1204] (2/4) Exclude cut with ID _5289_210_2084_1_1527678202736_3779299_392-261988-0_sp1.1 from training. Duration: 0.5909375 2023-10-02 07:19:06,949 WARNING [train.py:1204] (2/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_119-224384-0_sp1.1 from training. Duration: 0.936375 2023-10-02 07:19:08,257 WARNING [train.py:1204] (2/4) Exclude cut with ID _57523_210_1856_1_1534766294545_3732989_262-267933-0_sp1.1 from training. Duration: 0.963625 2023-10-02 07:19:08,521 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.feed_forward2.hidden_balancer.prob, batch_count=796173.3333333334, ans=0.125 2023-10-02 07:19:10,873 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_35361_210_4238_1_1533262810_7793476_675-87012-0_sp0.9 from training. Duration: 0.988875 2023-10-02 07:19:10,898 WARNING [train.py:1204] (2/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_17-90541-0 from training. Duration: 0.94 2023-10-02 07:19:12,165 WARNING [train.py:1204] (2/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_535-14410-0_sp1.1 from training. Duration: 0.42725 2023-10-02 07:19:12,170 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_24673_210_6400_1_1531968633_778659_94-154511-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 07:19:12,173 WARNING [train.py:1204] (2/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_212-10501-0 from training. Duration: 0.94 2023-10-02 07:19:24,499 WARNING [train.py:1204] (2/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_383-285196-0_sp1.1 from training. Duration: 0.8818125 2023-10-02 07:19:30,580 WARNING [train.py:1204] (2/4) Exclude cut with ID _45560_210_21982_1_1533812603951_3584999_238-47020-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 07:19:30,589 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_21500_210_2054_1_1532149310_2716758_278-18053-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 07:19:30,603 WARNING [train.py:1204] (2/4) Exclude cut with ID _56235_210_10640_1_1534557335235_4161099_453-9626-0_sp1.1 from training. Duration: 0.936375 2023-10-02 07:19:30,688 WARNING [train.py:1204] (2/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_153-97441-0_sp1.1 from training. Duration: 0.5090625 2023-10-02 07:19:37,521 WARNING [train.py:1204] (2/4) Exclude cut with ID _67725_210_27767_1_1535608756671_5105359_431-120895-0_sp1.1 from training. Duration: 0.963625 2023-10-02 07:19:39,043 WARNING [train.py:1204] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_574-45541-0_sp1.1 from training. Duration: 0.5909375 2023-10-02 07:19:39,067 WARNING [train.py:1204] (2/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_162-103620-0_sp1.1 from training. Duration: 0.8454375 2023-10-02 07:19:39,080 WARNING [train.py:1204] (2/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_734-129761-0_sp1.1 from training. Duration: 0.7454375 2023-10-02 07:19:39,117 WARNING [train.py:1204] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_620-276793-0_sp1.1 from training. Duration: 0.536375 2023-10-02 07:19:40,460 WARNING [train.py:1204] (2/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_153-97441-0_sp0.9 from training. Duration: 0.62225 2023-10-02 07:19:43,663 WARNING [train.py:1204] (2/4) Exclude cut with ID _12733_210_8341_1_1530167424556_3646380_523-85621-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 07:19:43,674 WARNING [train.py:1204] (2/4) Exclude cut with ID _64048_210_10626_1_1535198382802_3835179_352-179834-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 07:19:50,225 WARNING [train.py:1204] (2/4) Exclude cut with ID _55162_210_17071_1_1534408248583_3753480_43-242364-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 07:19:50,241 WARNING [train.py:1204] (2/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_661-264857-0 from training. Duration: 0.93 2023-10-02 07:19:50,247 WARNING [train.py:1204] (2/4) Exclude cut with ID _6274_210_2084_1_1527937583837_7494020_325-237800-0_sp1.1 from training. Duration: 0.97275 2023-10-02 07:19:51,582 WARNING [train.py:1204] (2/4) Exclude cut with ID _55328_210_27767_1_1534580973260_4088170_385-171459-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 07:19:53,014 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_3780_210_2087_1_1526467389_2145572_176-107140-0_sp1.1 from training. Duration: 0.62725 2023-10-02 07:19:53,102 WARNING [train.py:1204] (2/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_375-244976-0_sp1.1 from training. Duration: 0.6818125 2023-10-02 07:19:55,770 WARNING [train.py:1204] (2/4) Exclude cut with ID _35941_210_15379_1_1532995294515_4023089_309-33162-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 07:20:00,729 WARNING [train.py:1204] (2/4) Exclude cut with ID _14459_210_2228_1_1531443641946_7516700_1185-22038-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 07:20:03,558 INFO [train.py:1046] (2/4) Epoch 23, batch 2600, loss[loss=0.1577, simple_loss=0.2394, pruned_loss=0.03795, over 24301.00 frames. ], tot_loss[loss=0.1718, simple_loss=0.2485, pruned_loss=0.04753, over 4709885.42 frames. ], batch size: 61, lr: 4.46e-03, grad_scale: 16.0 2023-10-02 07:20:03,611 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_1197-69460-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 07:20:06,370 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9012W0022-501157 from training. Duration: 0.896 2023-10-02 07:20:09,126 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9023W0009-506302 from training. Duration: 0.896 2023-10-02 07:20:09,140 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_2821_210_2715_1_1526108385_3763560_73-296673-0_sp1.1 from training. Duration: 0.7545625 2023-10-02 07:20:09,169 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9025W0113-507442 from training. Duration: 0.896 2023-10-02 07:20:09,241 WARNING [train.py:1204] (2/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_739-2098-0 from training. Duration: 0.92 2023-10-02 07:20:10,557 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9029W0332-509722 from training. Duration: 0.768 2023-10-02 07:20:12,049 WARNING [train.py:1204] (2/4) Exclude cut with ID _16908_210_5724_1_1531119157248_3772700_68-101953-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 07:20:12,071 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9042W0011-515617 from training. Duration: 0.768 2023-10-02 07:20:13,499 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_3019_210_2028_1_1526044245_3264865_191-72235-0 from training. Duration: 0.97 2023-10-02 07:20:15,327 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9052W0006-520792 from training. Duration: 0.896 2023-10-02 07:20:16,761 WARNING [train.py:1204] (2/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_245-174553-0_sp1.1 from training. Duration: 0.8 2023-10-02 07:20:17,530 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.2.encoder.layers.2.nonlin_attention.whiten1, num_groups=1, num_channels=288, metric=5.14 vs. limit=10.0 2023-10-02 07:20:20,811 WARNING [train.py:1204] (2/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_202-67017-0 from training. Duration: 0.82 2023-10-02 07:20:22,190 WARNING [train.py:1204] (2/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_304-59227-0 from training. Duration: 0.77 2023-10-02 07:20:22,288 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_3133_210_2087_1_1526106446_2324690_246-125000-0_sp0.9 from training. Duration: 0.688875 2023-10-02 07:20:22,321 WARNING [train.py:1204] (2/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_243-42116-0 from training. Duration: 0.73 2023-10-02 07:20:24,948 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9087W0121-538492 from training. Duration: 0.896 2023-10-02 07:20:24,973 WARNING [train.py:1204] (2/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_120-76843-0 from training. Duration: 0.55 2023-10-02 07:20:26,543 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.0.bypass.skip_rate, batch_count=796506.6666666666, ans=0.035 2023-10-02 07:20:27,916 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.feed_forward1.hidden_balancer.prob, batch_count=796506.6666666666, ans=0.125 2023-10-02 07:20:31,432 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.feed_forward1.out_proj.dropout_p, batch_count=796506.6666666666, ans=0.1 2023-10-02 07:20:31,860 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.3.encoder.layers.3.self_attn2.whiten, num_groups=1, num_channels=512, metric=11.09 vs. limit=22.5 2023-10-02 07:20:32,532 WARNING [train.py:1204] (2/4) Exclude cut with ID _7852_210_5219_1_1528286471316_8139400_850-304621-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 07:20:32,543 WARNING [train.py:1204] (2/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_154-285415-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 07:20:32,559 WARNING [train.py:1204] (2/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_538-278194-0_sp1.1 from training. Duration: 0.92725 2023-10-02 07:20:32,560 WARNING [train.py:1204] (2/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_119-207162-0 from training. Duration: 0.71 2023-10-02 07:20:35,259 WARNING [train.py:1204] (2/4) Exclude cut with ID _2334_210_2667_1_1525608238880_4705129_459-104997-0_sp1.1 from training. Duration: 0.863625 2023-10-02 07:20:40,783 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.0.layers.1.nonlin_attention.whiten2, num_groups=1, num_channels=192, metric=5.53 vs. limit=15.0 2023-10-02 07:20:41,115 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.479e+02 1.855e+02 2.030e+02 2.251e+02 3.622e+02, threshold=4.059e+02, percent-clipped=0.0 2023-10-02 07:20:41,259 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9147W0019-567847 from training. Duration: 0.768 2023-10-02 07:20:47,374 WARNING [train.py:1204] (2/4) Exclude cut with ID _69884_210_9771_1_1535846392818_4166169_228-189439-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 07:20:48,678 WARNING [train.py:1204] (2/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_196-77172-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 07:20:48,872 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.balancer2.prob, batch_count=796640.0, ans=0.125 2023-10-02 07:20:50,034 WARNING [train.py:1204] (2/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_625-338546-0 from training. Duration: 0.88 2023-10-02 07:20:50,076 WARNING [train.py:1204] (2/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_193-327630-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 07:20:50,088 WARNING [train.py:1204] (2/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_391-24006-0_sp1.1 from training. Duration: 0.92725 2023-10-02 07:20:52,168 WARNING [train.py:1204] (2/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_197-28164-0 from training. Duration: 0.8 2023-10-02 07:20:55,426 WARNING [train.py:1204] (2/4) Exclude cut with ID _8395_210_1531_1_1528597871523_4411579_431-132470-0_sp1.1 from training. Duration: 0.67275 2023-10-02 07:20:56,759 WARNING [train.py:1204] (2/4) Exclude cut with ID _35350_210_16040_1_1533040191183_4075299_465-248326-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 07:20:58,271 WARNING [train.py:1204] (2/4) Exclude cut with ID _16756_210_4915_1_1531047803763_7104259_101-141922-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 07:21:01,151 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9212W0005-600172 from training. Duration: 0.896 2023-10-02 07:21:01,186 WARNING [train.py:1204] (2/4) Exclude cut with ID _63186_210_2084_1_1535414479705_3908649_378-166820-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 07:21:02,544 WARNING [train.py:1204] (2/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_52-211683-0_sp1.1 from training. Duration: 0.8181875 2023-10-02 07:21:06,034 WARNING [train.py:1204] (2/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_1147-237729-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 07:21:06,105 WARNING [train.py:1204] (2/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_234-14174-0_sp0.9 from training. Duration: 0.911125 2023-10-02 07:21:06,117 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_454-276429-0 from training. Duration: 0.87 2023-10-02 07:21:07,444 WARNING [train.py:1204] (2/4) Exclude cut with ID _53713_210_24116_1_1534314487762_7416751_755-165313-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 07:21:08,997 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8278_210_2550_1_1528637099_4220413_436-317072-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 07:21:09,057 WARNING [train.py:1204] (2/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_28-324729-0_sp1.1 from training. Duration: 0.8545625 2023-10-02 07:21:09,252 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.nonlin_attention.balancer.prob, batch_count=796706.6666666666, ans=0.125 2023-10-02 07:21:14,646 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.1.feed_forward1.out_proj.dropout_p, batch_count=796706.6666666666, ans=0.1 2023-10-02 07:21:14,655 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.conv_module2.balancer2.min_positive, batch_count=796706.6666666666, ans=0.05 2023-10-02 07:21:15,034 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.4.encoder.layers.2.nonlin_attention.whiten1, num_groups=1, num_channels=288, metric=4.98 vs. limit=10.0 2023-10-02 07:21:15,772 WARNING [train.py:1204] (2/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_326-165900-0 from training. Duration: 0.69 2023-10-02 07:21:15,835 WARNING [train.py:1204] (2/4) Exclude cut with ID _53489_210_6228_1_1534298357169_3835970_96-30246-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 07:21:17,364 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8275_210_4198_1_1528455320_3892571_491-17707-0_sp1.1 from training. Duration: 0.5818125 2023-10-02 07:21:18,615 INFO [train.py:1046] (2/4) Epoch 23, batch 2650, loss[loss=0.1483, simple_loss=0.232, pruned_loss=0.0323, over 24508.00 frames. ], tot_loss[loss=0.1719, simple_loss=0.2486, pruned_loss=0.04765, over 4721021.53 frames. ], batch size: 63, lr: 4.46e-03, grad_scale: 16.0 2023-10-02 07:21:22,770 WARNING [train.py:1204] (2/4) Exclude cut with ID _14348_210_8341_1_1530599434996_3778640_240-232370-0 from training. Duration: 0.84 2023-10-02 07:21:22,787 WARNING [train.py:1204] (2/4) Exclude cut with ID _45012_210_12651_1_1533608435136_5371380_54-112623-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 07:21:24,153 WARNING [train.py:1204] (2/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_328-264776-0_sp1.1 from training. Duration: 0.6090625 2023-10-02 07:21:24,227 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9325W0001-648427 from training. Duration: 0.768 2023-10-02 07:21:24,236 WARNING [train.py:1204] (2/4) Exclude cut with ID _56480_210_4220_1_1534679942079_3075409_49-74717-0_sp1.1 from training. Duration: 0.936375 2023-10-02 07:21:26,985 WARNING [train.py:1204] (2/4) Exclude cut with ID _59500_210_8254_1_1534809610680_3736009_6-241869-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 07:21:28,426 WARNING [train.py:1204] (2/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_172-215932-0_sp0.9 from training. Duration: 0.7333125 2023-10-02 07:21:29,801 WARNING [train.py:1204] (2/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_17-90541-0_sp1.1 from training. Duration: 0.8545625 2023-10-02 07:21:32,434 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_43831_210_2158_1_1533725502_5872791_525-89447-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 07:21:32,501 WARNING [train.py:1204] (2/4) Exclude cut with ID _12302_210_5219_1_1529924446466_8107280_907-182949-0 from training. Duration: 0.92 2023-10-02 07:21:32,513 WARNING [train.py:1204] (2/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_402-102311-0_sp0.9 from training. Duration: 0.7444375 2023-10-02 07:21:32,550 WARNING [train.py:1204] (2/4) Exclude cut with ID _8021_210_7994_1_1528376451358_3653069_179-45206-0_sp0.9 from training. Duration: 0.9555625 2023-10-02 07:21:37,263 WARNING [train.py:1204] (2/4) Exclude cut with ID _40137_210_12210_1_1533290484046_3795180_249-187059-0 from training. Duration: 0.88 2023-10-02 07:21:38,620 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9653W0194-675472 from training. Duration: 0.9658125 2023-10-02 07:21:41,399 WARNING [train.py:1204] (2/4) Exclude cut with ID _51332_210_9988_1_1534244371836_3801270_336-255604-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 07:21:44,088 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_20120_210_6753_1_1531643035_5482859_510-96996-0 from training. Duration: 0.87 2023-10-02 07:21:44,101 WARNING [train.py:1204] (2/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_1167-20437-0_sp1.1 from training. Duration: 0.92725 2023-10-02 07:21:44,163 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_3475_210_2028_1_1526193578_3622569_265-276625-0 from training. Duration: 0.76 2023-10-02 07:21:48,307 WARNING [train.py:1204] (2/4) Exclude cut with ID _14860_210_4915_1_1530702020340_7343240_470-172418-0_sp1.1 from training. Duration: 0.97275 2023-10-02 07:21:48,323 WARNING [train.py:1204] (2/4) Exclude cut with ID _5444_210_2151_1_1527418808464_5312210_581-315411-0_sp1.1 from training. Duration: 0.563625 2023-10-02 07:21:48,339 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_34450_210_9547_1_1533099443_4109050_519-342439-0_sp1.1 from training. Duration: 0.97275 2023-10-02 07:21:48,374 WARNING [train.py:1204] (2/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_50-175961-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 07:21:55,006 WARNING [train.py:1204] (2/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_372-6869-0 from training. Duration: 0.44 2023-10-02 07:21:55,012 WARNING [train.py:1204] (2/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_3-237434-0 from training. Duration: 0.76 2023-10-02 07:21:56,428 WARNING [train.py:1204] (2/4) Exclude cut with ID _8772_210_1850_1_1528635595972_3664011_247-336448-0_sp1.1 from training. Duration: 0.67275 2023-10-02 07:22:01,897 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_427-78097-0 from training. Duration: 0.8 2023-10-02 07:22:01,919 WARNING [train.py:1204] (2/4) Exclude cut with ID _69884_210_9771_1_1535846392818_4166169_466-252727-0_sp1.1 from training. Duration: 0.97275 2023-10-02 07:22:03,311 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_15972_210_6400_1_1530877934_3405692_316-35478-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 07:22:03,349 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_809-17156-0_sp0.9 from training. Duration: 0.811125 2023-10-02 07:22:04,654 WARNING [train.py:1204] (2/4) Exclude cut with ID _57572_210_5517_1_1534921219624_3809484_594-263474-0_sp1.1 from training. Duration: 0.936375 2023-10-02 07:22:04,692 WARNING [train.py:1204] (2/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_190-228204-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 07:22:06,095 WARNING [train.py:1204] (2/4) Exclude cut with ID _19514_210_9259_1_1531530071330_5084230_117-21193-0_sp1.1 from training. Duration: 0.936375 2023-10-02 07:22:08,792 WARNING [train.py:1204] (2/4) Exclude cut with ID _69584_210_5219_1_1535785133775_4044450_190-21315-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 07:22:10,836 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_53071_210_16554_1_1534493434_1276796_184-302050-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 07:22:10,893 WARNING [train.py:1204] (2/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_120-76843-0_sp1.1 from training. Duration: 0.5 2023-10-02 07:22:12,181 WARNING [train.py:1204] (2/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_318-162017-0_sp1.1 from training. Duration: 0.82725 2023-10-02 07:22:13,530 WARNING [train.py:1204] (2/4) Exclude cut with ID _46067_210_4220_1_1534125571066_3307450_79-46864-0_sp1.1 from training. Duration: 0.92725 2023-10-02 07:22:14,901 WARNING [train.py:1204] (2/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_358-215558-0_sp1.1 from training. Duration: 0.8454375 2023-10-02 07:22:14,977 WARNING [train.py:1204] (2/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_621-66764-0_sp1.1 from training. Duration: 0.92725 2023-10-02 07:22:17,919 WARNING [train.py:1204] (2/4) Exclude cut with ID _42863_210_9070_1_1533902398788_3696920_338-156851-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 07:22:17,945 WARNING [train.py:1204] (2/4) Exclude cut with ID _5289_210_2084_1_1527678202736_3779299_392-261988-0_sp0.9 from training. Duration: 0.72225 2023-10-02 07:22:20,889 WARNING [train.py:1204] (2/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_409-84682-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 07:22:22,183 WARNING [train.py:1204] (2/4) Exclude cut with ID _9235_210_5219_1_1528891154253_8046120_30-326681-0_sp0.9 from training. Duration: 0.92225 2023-10-02 07:22:22,187 WARNING [train.py:1204] (2/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_389-184695-0_sp1.1 from training. Duration: 0.92725 2023-10-02 07:22:22,223 WARNING [train.py:1204] (2/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_442-320317-0 from training. Duration: 0.82 2023-10-02 07:22:24,458 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.0.conv_module1.balancer2.prob, batch_count=797040.0, ans=0.125 2023-10-02 07:22:28,093 WARNING [train.py:1204] (2/4) Exclude cut with ID _44778_210_2054_1_1533798127203_7443090_756-29911-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 07:22:29,426 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_401-112036-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 07:22:29,509 WARNING [train.py:1204] (2/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_181-308827-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 07:22:30,881 WARNING [train.py:1204] (2/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_332-258725-0_sp1.1 from training. Duration: 0.963625 2023-10-02 07:22:30,961 WARNING [train.py:1204] (2/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_188-260011-0_sp0.9 from training. Duration: 0.811125 2023-10-02 07:22:32,320 WARNING [train.py:1204] (2/4) Exclude cut with ID _49039_210_6315_1_1533967537541_3846118_99-302239-0_sp1.1 from training. Duration: 0.963625 2023-10-02 07:22:33,647 INFO [train.py:1046] (2/4) Epoch 23, batch 2700, loss[loss=0.192, simple_loss=0.2755, pruned_loss=0.05426, over 24457.00 frames. ], tot_loss[loss=0.173, simple_loss=0.2498, pruned_loss=0.04812, over 4718601.95 frames. ], batch size: 69, lr: 4.46e-03, grad_scale: 16.0 2023-10-02 07:22:35,174 WARNING [train.py:1204] (2/4) Exclude cut with ID _4122_210_2084_1_1526713164417_4353280_312-294828-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 07:22:35,186 WARNING [train.py:1204] (2/4) Exclude cut with ID _14922_210_6228_1_1530707435273_3693310_303-228738-0 from training. Duration: 0.84 2023-10-02 07:22:37,874 WARNING [train.py:1204] (2/4) Exclude cut with ID _14349_210_8341_1_1530615710818_3442470_115-285975-0_sp0.9 from training. Duration: 0.9555625 2023-10-02 07:22:39,339 WARNING [train.py:1204] (2/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_125-144174-0_sp1.1 from training. Duration: 0.3545625 2023-10-02 07:22:41,340 WARNING [train.py:1204] (2/4) Exclude cut with ID _53565_210_10635_1_1534337218335_3820259_351-54570-0_sp1.1 from training. Duration: 0.97275 2023-10-02 07:22:41,367 WARNING [train.py:1204] (2/4) Exclude cut with ID _28317_210_12742_1_1532480302381_8510325_410-40065-0_sp1.1 from training. Duration: 0.963625 2023-10-02 07:22:41,387 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_38965_210_4852_1_1533174906_3484735_167-339442-0_sp1.1 from training. Duration: 0.963625 2023-10-02 07:22:44,042 WARNING [train.py:1204] (2/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_265-164486-0_sp1.1 from training. Duration: 0.87275 2023-10-02 07:22:44,052 WARNING [train.py:1204] (2/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_725-182484-0_sp1.1 from training. Duration: 0.92725 2023-10-02 07:22:44,074 WARNING [train.py:1204] (2/4) Exclude cut with ID _28317_210_12742_1_1532480302381_8510325_607-213951-0_sp0.9 from training. Duration: 0.9333125 2023-10-02 07:22:44,096 WARNING [train.py:1204] (2/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_605-133395-0_sp1.1 from training. Duration: 0.6 2023-10-02 07:22:44,109 WARNING [train.py:1204] (2/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_351-294650-0 from training. Duration: 0.63 2023-10-02 07:22:44,171 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_14953_210_2151_1_1530701710_5616165_710-165400-0_sp1.1 from training. Duration: 0.7909375 2023-10-02 07:22:45,503 WARNING [train.py:1204] (2/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_237-58433-0_sp1.1 from training. Duration: 0.67275 2023-10-02 07:22:46,913 WARNING [train.py:1204] (2/4) Exclude cut with ID _5190_210_4557_1_1527244368799_373439_28-197030-0_sp1.1 from training. Duration: 0.6545625 2023-10-02 07:22:46,963 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_743-327651-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 07:22:50,931 WARNING [train.py:1204] (2/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_76-203867-0_sp0.9 from training. Duration: 0.8 2023-10-02 07:22:51,010 WARNING [train.py:1204] (2/4) Exclude cut with ID _14603_210_4220_1_1530615688441_3097319_274-274798-0 from training. Duration: 0.98 2023-10-02 07:22:52,335 WARNING [train.py:1204] (2/4) Exclude cut with ID _14087_210_2253_1_1530529224512_3014029_226-242140-0_sp1.1 from training. Duration: 0.736375 2023-10-02 07:22:59,565 WARNING [train.py:1204] (2/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_352-59958-0_sp1.1 from training. Duration: 0.7818125 2023-10-02 07:22:59,569 WARNING [train.py:1204] (2/4) Exclude cut with ID _39411_210_5986_1_1533538825030_3343649_191-251819-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 07:22:59,884 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.ff2_skip_rate, batch_count=797173.3333333334, ans=0.0 2023-10-02 07:23:01,836 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.2.encoder.layers.2.whiten, num_groups=1, num_channels=384, metric=3.80 vs. limit=12.0 2023-10-02 07:23:03,998 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7046_210_6753_1_1527895337_6920917_642-106799-0_sp1.1 from training. Duration: 0.836375 2023-10-02 07:23:04,014 WARNING [train.py:1204] (2/4) Exclude cut with ID _22826_210_9994_1_1531908201522_3279169_412-3442-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 07:23:04,040 WARNING [train.py:1204] (2/4) Exclude cut with ID _60057_210_10640_1_1534903065793_3333150_171-181133-0_sp0.9 from training. Duration: 0.97775 2023-10-02 07:23:05,351 WARNING [train.py:1204] (2/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_154-135696-0_sp1.1 from training. Duration: 0.72725 2023-10-02 07:23:08,235 WARNING [train.py:1204] (2/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_148-186805-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 07:23:09,733 WARNING [train.py:1204] (2/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_605-165641-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 07:23:09,740 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_174-277364-0_sp0.9 from training. Duration: 0.788875 2023-10-02 07:23:09,754 WARNING [train.py:1204] (2/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_89-321955-0_sp0.9 from training. Duration: 0.888875 2023-10-02 07:23:11,501 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.543e+02 1.838e+02 2.027e+02 2.219e+02 3.329e+02, threshold=4.053e+02, percent-clipped=0.0 2023-10-02 07:23:14,558 WARNING [train.py:1204] (2/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_438-105429-0_sp1.1 from training. Duration: 0.963625 2023-10-02 07:23:14,561 WARNING [train.py:1204] (2/4) Exclude cut with ID _14970_210_15709_1_1530773543493_1715730_187-20259-0_sp0.9 from training. Duration: 0.988875 2023-10-02 07:23:16,189 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=797240.0, ans=0.1 2023-10-02 07:23:20,434 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.conv_module1.balancer1.prob, batch_count=797306.6666666666, ans=0.125 2023-10-02 07:23:23,013 WARNING [train.py:1204] (2/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_385-193052-0_sp0.9 from training. Duration: 0.9555625 2023-10-02 07:23:23,068 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7113_210_1849_1_1527942685_3448768_194-311626-0_sp1.1 from training. Duration: 0.92725 2023-10-02 07:23:23,288 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.conv_module2.balancer2.min_abs, batch_count=797306.6666666666, ans=0.5 2023-10-02 07:23:23,578 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.5.encoder.layers.0.nonlin_attention.whiten1, num_groups=1, num_channels=192, metric=4.23 vs. limit=10.0 2023-10-02 07:23:27,957 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_3780_210_2087_1_1526467389_2145572_176-107140-0_sp0.9 from training. Duration: 0.7666875 2023-10-02 07:23:27,960 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_36756_210_7234_1_1533026991_966276_123-213457-0_sp1.1 from training. Duration: 0.936375 2023-10-02 07:23:30,786 WARNING [train.py:1204] (2/4) Exclude cut with ID _23557_210_8750_1_1531890106370_6846255_456-244861-0_sp1.1 from training. Duration: 0.963625 2023-10-02 07:23:32,179 WARNING [train.py:1204] (2/4) Exclude cut with ID _37086_210_9038_1_1532999540891_7021250_242-349170-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 07:23:32,243 WARNING [train.py:1204] (2/4) Exclude cut with ID _41298_210_9019_1_1533430371450_4822143_471-281351-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 07:23:33,673 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_53378_210_17282_1_1534221071_5769509_572-126176-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 07:23:33,900 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.balancer1.prob, batch_count=797373.3333333334, ans=0.125 2023-10-02 07:23:35,055 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_1507-263032-0_sp1.1 from training. Duration: 0.963625 2023-10-02 07:23:36,415 WARNING [train.py:1204] (2/4) Exclude cut with ID _13679_210_7497_1_1530534293662_3355098_411-186088-0_sp1.1 from training. Duration: 0.8545625 2023-10-02 07:23:39,297 WARNING [train.py:1204] (2/4) Exclude cut with ID _29345_210_4381_1_1532689168263_3860169_149-257268-0_sp1.1 from training. Duration: 0.736375 2023-10-02 07:23:39,398 WARNING [train.py:1204] (2/4) Exclude cut with ID _47790_210_16187_1_1533862814066_4207519_381-252014-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 07:23:39,404 WARNING [train.py:1204] (2/4) Exclude cut with ID _35799_210_5854_1_1533263487887_3529071_512-2717-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 07:23:43,880 WARNING [train.py:1204] (2/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_505-205270-0_sp0.9 from training. Duration: 0.42225 2023-10-02 07:23:43,989 WARNING [train.py:1204] (2/4) Exclude cut with ID _14998_210_5219_1_1530705582080_7474970_731-20117-0_sp1.1 from training. Duration: 0.936375 2023-10-02 07:23:46,705 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_37351_210_7925_1_1533012931_3812113_265-85780-0_sp1.1 from training. Duration: 0.8 2023-10-02 07:23:46,718 WARNING [train.py:1204] (2/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_313-134930-0 from training. Duration: 0.97 2023-10-02 07:23:48,021 INFO [train.py:1046] (2/4) Epoch 23, batch 2750, loss[loss=0.1551, simple_loss=0.2322, pruned_loss=0.03898, over 24619.00 frames. ], tot_loss[loss=0.1734, simple_loss=0.2501, pruned_loss=0.04839, over 4707460.01 frames. ], batch size: 60, lr: 4.46e-03, grad_scale: 16.0 2023-10-02 07:23:49,425 WARNING [train.py:1204] (2/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_374-138971-0 from training. Duration: 0.94 2023-10-02 07:23:49,466 WARNING [train.py:1204] (2/4) Exclude cut with ID _30304_210_6319_1_1532476827261_7582060_1227-287886-0_sp1.1 from training. Duration: 0.936375 2023-10-02 07:23:52,327 WARNING [train.py:1204] (2/4) Exclude cut with ID _59493_210_17282_1_1535078419077_3225879_332-217544-0_sp1.1 from training. Duration: 0.97275 2023-10-02 07:23:53,575 WARNING [train.py:1204] (2/4) Exclude cut with ID _23514_210_1389_1_1531832139785_3031439_361-119640-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 07:23:55,025 WARNING [train.py:1204] (2/4) Exclude cut with ID _69584_210_5219_1_1535785133775_4044450_142-118058-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 07:23:56,883 WARNING [train.py:1204] (2/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_178-177582-0_sp1.1 from training. Duration: 0.67275 2023-10-02 07:23:56,940 WARNING [train.py:1204] (2/4) Exclude cut with ID _25303_210_17519_1_1532050207576_3635970_41-195572-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 07:24:00,724 WARNING [train.py:1204] (2/4) Exclude cut with ID _55328_210_27767_1_1534580973260_4088170_329-341322-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 07:24:00,765 WARNING [train.py:1204] (2/4) Exclude cut with ID _5608_210_2106_1_1527415439089_6819768_909-82767-0_sp1.1 from training. Duration: 0.5545625 2023-10-02 07:24:00,797 WARNING [train.py:1204] (2/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_261-297503-0_sp1.1 from training. Duration: 0.9 2023-10-02 07:24:00,803 WARNING [train.py:1204] (2/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_459-291706-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 07:24:00,804 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7762_210_1794_1_1528199508_4057408_422-227034-0 from training. Duration: 0.73 2023-10-02 07:24:00,816 WARNING [train.py:1204] (2/4) Exclude cut with ID _3067_210_2667_1_1526212974640_4503168_250-285937-0_sp0.9 from training. Duration: 0.888875 2023-10-02 07:24:02,162 WARNING [train.py:1204] (2/4) Exclude cut with ID _66873_210_5517_1_1535454136506_4293370_332-253745-0_sp1.1 from training. Duration: 0.936375 2023-10-02 07:24:06,530 WARNING [train.py:1204] (2/4) Exclude cut with ID _13938_210_9988_1_1530518718978_3693960_304-112784-0 from training. Duration: 0.46 2023-10-02 07:24:09,196 WARNING [train.py:1204] (2/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_226-267957-0_sp1.1 from training. Duration: 0.92725 2023-10-02 07:24:09,242 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_41822_210_13270_1_1533955976_4379284_608-254567-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 07:24:09,307 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_124-113562-0_sp1.1 from training. Duration: 0.8545625 2023-10-02 07:24:09,335 WARNING [train.py:1204] (2/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_117-290916-0_sp1.1 from training. Duration: 0.463625 2023-10-02 07:24:10,597 WARNING [train.py:1204] (2/4) Exclude cut with ID _28427_210_4220_1_1532581240967_3061069_102-66533-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 07:24:12,634 WARNING [train.py:1204] (2/4) Exclude cut with ID _55383_210_10635_1_1534468426172_2231669_134-128246-0_sp1.1 from training. Duration: 0.8090625 2023-10-02 07:24:12,687 WARNING [train.py:1204] (2/4) Exclude cut with ID _45871_210_5665_1_1533864614245_3296060_49-308316-0_sp1.1 from training. Duration: 0.97275 2023-10-02 07:24:14,029 WARNING [train.py:1204] (2/4) Exclude cut with ID _57572_210_5517_1_1534921219624_3809484_176-199205-0_sp1.1 from training. Duration: 0.97275 2023-10-02 07:24:16,764 WARNING [train.py:1204] (2/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_45-265684-0_sp1.1 from training. Duration: 0.6545625 2023-10-02 07:24:16,782 WARNING [train.py:1204] (2/4) Exclude cut with ID _15150_210_8341_1_1530752411803_7256384_469-239509-0_sp0.9 from training. Duration: 0.7555625 2023-10-02 07:24:16,920 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.0.bypass.scale_min, batch_count=797573.3333333334, ans=0.2 2023-10-02 07:24:18,125 WARNING [train.py:1204] (2/4) Exclude cut with ID _28461_210_2253_1_1532771775189_3041299_136-270644-0_sp1.1 from training. Duration: 0.8454375 2023-10-02 07:24:19,445 WARNING [train.py:1204] (2/4) Exclude cut with ID _69584_210_5219_1_1535785133775_4044450_302-47302-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 07:24:20,807 WARNING [train.py:1204] (2/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_166-128941-0_sp1.1 from training. Duration: 0.4909375 2023-10-02 07:24:25,238 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.conv_module2.balancer1.prob, batch_count=797573.3333333334, ans=0.125 2023-10-02 07:24:26,485 WARNING [train.py:1204] (2/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_559-120504-0_sp1.1 from training. Duration: 0.97275 2023-10-02 07:24:30,379 WARNING [train.py:1204] (2/4) Exclude cut with ID _6456_210_1534_1_1527681634212_3181049_345-252877-0_sp1.1 from training. Duration: 0.5818125 2023-10-02 07:24:30,410 WARNING [train.py:1204] (2/4) Exclude cut with ID _28317_210_12742_1_1532480302381_8510325_902-233003-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 07:24:34,608 WARNING [train.py:1204] (2/4) Exclude cut with ID _36538_210_9994_1_1533204020363_3795620_204-261385-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 07:24:34,612 WARNING [train.py:1204] (2/4) Exclude cut with ID _14821_210_8750_1_1530680503991_6829359_189-297451-0_sp1.1 from training. Duration: 0.5 2023-10-02 07:24:36,014 WARNING [train.py:1204] (2/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_1211-226066-0_sp0.9 from training. Duration: 0.8444375 2023-10-02 07:24:40,311 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6786_210_1388_1_1527839699_4194495_273-149841-0_sp1.1 from training. Duration: 0.836375 2023-10-02 07:24:40,344 WARNING [train.py:1204] (2/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_380-162648-0_sp1.1 from training. Duration: 0.8545625 2023-10-02 07:24:40,345 WARNING [train.py:1204] (2/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_494-199538-0 from training. Duration: 0.75 2023-10-02 07:24:45,020 WARNING [train.py:1204] (2/4) Exclude cut with ID _49097_210_16049_1_1533952780207_4360979_276-87053-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 07:24:47,826 WARNING [train.py:1204] (2/4) Exclude cut with ID _5444_210_2151_1_1527418808464_5312210_726-27165-0 from training. Duration: 0.67 2023-10-02 07:24:51,978 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_128-202104-0_sp1.1 from training. Duration: 0.57275 2023-10-02 07:24:53,473 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_11349_210_6753_1_1529800861_1101207_26-65602-0_sp1.1 from training. Duration: 0.87275 2023-10-02 07:24:55,402 WARNING [train.py:1204] (2/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_159-328157-0 from training. Duration: 0.52 2023-10-02 07:24:55,484 WARNING [train.py:1204] (2/4) Exclude cut with ID _14069_210_5632_1_1530512692033_4803390_98-21934-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 07:24:56,971 WARNING [train.py:1204] (2/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_352-59958-0_sp0.9 from training. Duration: 0.9555625 2023-10-02 07:24:58,764 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6163_210_6400_1_1527506746_4263068_283-137811-0 from training. Duration: 0.77 2023-10-02 07:24:58,800 WARNING [train.py:1204] (2/4) Exclude cut with ID _21989_210_5854_1_1531899214931_3271360_133-44759-0_sp1.1 from training. Duration: 0.7 2023-10-02 07:25:01,970 WARNING [train.py:1204] (2/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_21-174514-0_sp0.9 from training. Duration: 0.47775 2023-10-02 07:25:02,008 WARNING [train.py:1204] (2/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_201-78811-0_sp1.1 from training. Duration: 0.963625 2023-10-02 07:25:03,233 INFO [train.py:1046] (2/4) Epoch 23, batch 2800, loss[loss=0.1775, simple_loss=0.2442, pruned_loss=0.05539, over 23796.00 frames. ], tot_loss[loss=0.1722, simple_loss=0.2486, pruned_loss=0.04796, over 4712430.26 frames. ], batch size: 179, lr: 4.46e-03, grad_scale: 32.0 2023-10-02 07:25:03,306 WARNING [train.py:1204] (2/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_537-164630-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 07:25:03,370 WARNING [train.py:1204] (2/4) Exclude cut with ID _14087_210_2253_1_1530529224512_3014029_156-257607-0 from training. Duration: 0.83 2023-10-02 07:25:03,381 WARNING [train.py:1204] (2/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_308-154450-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 07:25:04,676 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_45399_210_4852_1_1533624725_7579737_214-240574-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 07:25:04,802 WARNING [train.py:1204] (2/4) Exclude cut with ID _29303_210_2575_1_1532392237303_7410649_615-305517-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 07:25:06,105 WARNING [train.py:1204] (2/4) Exclude cut with ID IC0125W0002-61808 from training. Duration: 0.708 2023-10-02 07:25:06,106 WARNING [train.py:1204] (2/4) Exclude cut with ID IC0125W0017-61823 from training. Duration: 0.759 2023-10-02 07:25:08,899 WARNING [train.py:1204] (2/4) Exclude cut with ID _53219_210_4525_1_1534834836008_4064825_4-145291-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 07:25:10,276 WARNING [train.py:1204] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_185-174805-0_sp1.1 from training. Duration: 0.6181875 2023-10-02 07:25:10,298 WARNING [train.py:1204] (2/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_635-193046-0_sp1.1 from training. Duration: 0.92725 2023-10-02 07:25:14,820 WARNING [train.py:1204] (2/4) Exclude cut with ID _46067_210_4220_1_1534125571066_3307450_321-258866-0_sp1.1 from training. Duration: 0.97275 2023-10-02 07:25:16,316 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8558_210_1410_1_1528505225_7613406_131-329524-0 from training. Duration: 0.79 2023-10-02 07:25:16,555 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.ff2_skip_rate, batch_count=797840.0, ans=0.0 2023-10-02 07:25:17,785 WARNING [train.py:1204] (2/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_847-41334-0_sp0.9 from training. Duration: 0.488875 2023-10-02 07:25:17,871 WARNING [train.py:1204] (2/4) Exclude cut with ID _14227_210_4381_1_1530701804286_3940259_125-275642-0 from training. Duration: 0.99 2023-10-02 07:25:18,338 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.4.encoder.layers.2.nonlin_attention.whiten1, num_groups=1, num_channels=288, metric=5.39 vs. limit=10.0 2023-10-02 07:25:19,350 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.balancer1.prob, batch_count=797840.0, ans=0.125 2023-10-02 07:25:19,355 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=797840.0, ans=0.1 2023-10-02 07:25:20,583 WARNING [train.py:1204] (2/4) Exclude cut with ID _25248_210_8750_1_1532062825941_5554180_248-177255-0_sp1.1 from training. Duration: 0.963625 2023-10-02 07:25:20,636 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_40047_210_2290_1_1533290519_2374311_195-165693-0_sp1.1 from training. Duration: 0.8090625 2023-10-02 07:25:20,640 WARNING [train.py:1204] (2/4) Exclude cut with ID _23109_210_4381_1_1532150828777_901949_29-258672-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 07:25:25,398 WARNING [train.py:1204] (2/4) Exclude cut with ID _14821_210_8750_1_1530680503991_6829359_2-227151-0_sp1.1 from training. Duration: 0.7545625 2023-10-02 07:25:25,438 WARNING [train.py:1204] (2/4) Exclude cut with ID _49643_210_7989_1_1534640244224_3939031_565-53982-0_sp1.1 from training. Duration: 0.963625 2023-10-02 07:25:25,442 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7544_210_6753_1_1528591327_1471426_33-346031-0_sp0.9 from training. Duration: 0.67775 2023-10-02 07:25:25,534 WARNING [train.py:1204] (2/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_95-61782-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 07:25:36,082 WARNING [train.py:1204] (2/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_686-54383-0_sp0.9 from training. Duration: 0.9666875 2023-10-02 07:25:37,646 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_14056_210_4163_1_1530604483_7572827_505-276823-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 07:25:40,382 WARNING [train.py:1204] (2/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_745-128443-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 07:25:40,449 WARNING [train.py:1204] (2/4) Exclude cut with ID _3844_210_1390_1_1526727803582_2871209_20-251664-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 07:25:41,820 WARNING [train.py:1204] (2/4) Exclude cut with ID _36538_210_9994_1_1533204020363_3795620_487-164628-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 07:25:43,044 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.426e+02 1.852e+02 2.120e+02 2.350e+02 3.658e+02, threshold=4.240e+02, percent-clipped=0.0 2023-10-02 07:25:47,796 WARNING [train.py:1204] (2/4) Exclude cut with ID _5166_210_2084_1_1527073197160_4087993_309-222545-0_sp1.1 from training. Duration: 0.763625 2023-10-02 07:25:47,816 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_3531_210_3528_1_1526294203_5216456_309-227302-0 from training. Duration: 0.69 2023-10-02 07:25:49,158 WARNING [train.py:1204] (2/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_42-192460-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 07:25:50,492 WARNING [train.py:1204] (2/4) Exclude cut with ID _22083_210_8254_1_1531702862439_3678970_49-234546-0_sp1.1 from training. Duration: 0.7545625 2023-10-02 07:25:50,497 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_427-78097-0_sp0.9 from training. Duration: 0.888875 2023-10-02 07:25:54,702 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_43802_210_2290_1_1533516203_8829504_378-205254-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 07:25:54,936 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.conv_module2.balancer1.min_positive, batch_count=797973.3333333334, ans=0.025 2023-10-02 07:25:56,030 WARNING [train.py:1204] (2/4) Exclude cut with ID _45329_210_7005_1_1534157951211_6774339_672-314830-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 07:25:56,678 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.4.encoder.layers.1.nonlin_attention.whiten1, num_groups=1, num_channels=288, metric=4.42 vs. limit=10.0 2023-10-02 07:25:57,472 WARNING [train.py:1204] (2/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_90-30673-0_sp1.1 from training. Duration: 0.763625 2023-10-02 07:25:59,686 WARNING [train.py:1204] (2/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_563-329943-0_sp1.1 from training. Duration: 0.9 2023-10-02 07:25:59,701 WARNING [train.py:1204] (2/4) Exclude cut with ID _21632_210_2253_1_1531994001765_3744140_154-84090-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 07:25:59,702 WARNING [train.py:1204] (2/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_103-336064-0_sp0.9 from training. Duration: 0.5666875 2023-10-02 07:26:01,798 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_43-71989-0_sp0.9 from training. Duration: 0.6333125 2023-10-02 07:26:01,871 WARNING [train.py:1204] (2/4) Exclude cut with ID _3867_210_1883_1_1526641200575_4475830_319-349850-0_sp0.9 from training. Duration: 0.7444375 2023-10-02 07:26:03,726 WARNING [train.py:1204] (2/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_629-179454-0_sp1.1 from training. Duration: 0.963625 2023-10-02 07:26:03,733 WARNING [train.py:1204] (2/4) Exclude cut with ID _65263_210_22356_1_1535763598691_3921037_49-222917-0 from training. Duration: 0.92 2023-10-02 07:26:03,755 WARNING [train.py:1204] (2/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_602-331772-0_sp1.1 from training. Duration: 0.936375 2023-10-02 07:26:05,139 WARNING [train.py:1204] (2/4) Exclude cut with ID _58414_210_8254_1_1535421609162_7151182_292-134978-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 07:26:06,408 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_38793_210_2544_1_1533176114_3849320_200-299292-0_sp1.1 from training. Duration: 0.936375 2023-10-02 07:26:07,799 WARNING [train.py:1204] (2/4) Exclude cut with ID _14970_210_15709_1_1530775547361_1966965_6-23328-0 from training. Duration: 0.86 2023-10-02 07:26:07,893 WARNING [train.py:1204] (2/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_386-225511-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 07:26:07,904 WARNING [train.py:1204] (2/4) Exclude cut with ID _38323_210_12108_1_1533121233369_3303049_416-11919-0_sp1.1 from training. Duration: 0.87275 2023-10-02 07:26:09,194 WARNING [train.py:1204] (2/4) Exclude cut with ID _4866_210_2577_1_1527404412284_4364659_256-306830-0_sp1.1 from training. Duration: 0.7909375 2023-10-02 07:26:09,326 WARNING [train.py:1204] (2/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_53-300544-0 from training. Duration: 0.67 2023-10-02 07:26:15,626 WARNING [train.py:1204] (2/4) Exclude cut with ID _41314_210_12651_1_1533459476543_3766460_80-74184-0_sp1.1 from training. Duration: 0.7545625 2023-10-02 07:26:16,932 WARNING [train.py:1204] (2/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_332-256168-0_sp1.1 from training. Duration: 0.6818125 2023-10-02 07:26:16,977 WARNING [train.py:1204] (2/4) Exclude cut with ID _48836_210_12210_1_1534075848293_3904150_429-35475-0_sp1.1 from training. Duration: 0.8090625 2023-10-02 07:26:17,323 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.bypass.scale_min, batch_count=798106.6666666666, ans=0.2 2023-10-02 07:26:18,358 INFO [train.py:1046] (2/4) Epoch 23, batch 2850, loss[loss=0.1616, simple_loss=0.2332, pruned_loss=0.04502, over 23688.00 frames. ], tot_loss[loss=0.1712, simple_loss=0.2476, pruned_loss=0.04739, over 4716752.57 frames. ], batch size: 232, lr: 4.46e-03, grad_scale: 8.0 2023-10-02 07:26:18,415 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_144-135833-0_sp1.1 from training. Duration: 0.97275 2023-10-02 07:26:22,584 WARNING [train.py:1204] (2/4) Exclude cut with ID _14224_210_4381_1_1530529062037_4264010_380-343670-0_sp0.9 from training. Duration: 0.97775 2023-10-02 07:26:22,610 WARNING [train.py:1204] (2/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_213-265174-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 07:26:22,630 WARNING [train.py:1204] (2/4) Exclude cut with ID _20455_210_5219_1_1531565996775_8098100_86-314283-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 07:26:25,268 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_41750_210_1960_1_1533520678_3948042_68-223813-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 07:26:25,343 WARNING [train.py:1204] (2/4) Exclude cut with ID _55153_210_2253_1_1534384786677_3162096_24-198429-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 07:26:28,685 WARNING [train.py:1204] (2/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_119-207162-0_sp0.9 from training. Duration: 0.788875 2023-10-02 07:26:28,734 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_12868_210_6753_1_1530702479_3610564_148-172861-0 from training. Duration: 0.5 2023-10-02 07:26:35,003 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_5522_210_3273_1_1527249606_3909096_318-203153-0 from training. Duration: 0.43 2023-10-02 07:26:35,010 WARNING [train.py:1204] (2/4) Exclude cut with ID _14884_210_2252_1_1530707459689_5561250_288-189949-0_sp1.1 from training. Duration: 0.97275 2023-10-02 07:26:37,715 WARNING [train.py:1204] (2/4) Exclude cut with ID _5294_210_1747_1_1527249643928_3696179_529-242298-0 from training. Duration: 0.95 2023-10-02 07:26:37,775 WARNING [train.py:1204] (2/4) Exclude cut with ID _36538_210_9994_1_1533204020363_3795620_503-110289-0_sp1.1 from training. Duration: 0.936375 2023-10-02 07:26:40,700 WARNING [train.py:1204] (2/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_385-193052-0 from training. Duration: 0.86 2023-10-02 07:26:40,769 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_456-22141-0 from training. Duration: 0.88 2023-10-02 07:26:43,400 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_38965_210_4852_1_1533174906_3484735_284-117840-0_sp1.1 from training. Duration: 0.936375 2023-10-02 07:26:47,124 INFO [scaling.py:1118] (2/4) WithLoss: name=encoder.encoders.2.encoder.layers.0.self_attn_weights, loss-sum=0.000e+00 2023-10-02 07:26:55,192 WARNING [train.py:1204] (2/4) Exclude cut with ID _32704_210_8954_1_1533017488840_6747370_75-201625-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 07:26:56,624 WARNING [train.py:1204] (2/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_255-312654-0_sp1.1 from training. Duration: 0.82725 2023-10-02 07:26:56,633 WARNING [train.py:1204] (2/4) Exclude cut with ID _47251_210_18628_1_1533814063128_4137250_214-88076-0_sp0.9 from training. Duration: 0.97775 2023-10-02 07:26:57,961 WARNING [train.py:1204] (2/4) Exclude cut with ID _4092_210_2151_1_1526643044449_3135759_227-213972-0_sp1.1 from training. Duration: 0.4909375 2023-10-02 07:26:57,976 WARNING [train.py:1204] (2/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_912-47473-0_sp1.1 from training. Duration: 0.6181875 2023-10-02 07:26:58,003 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6767_210_3273_1_1527855783_1718932_173-256601-0_sp1.1 from training. Duration: 0.72725 2023-10-02 07:26:59,423 WARNING [train.py:1204] (2/4) Exclude cut with ID _6274_210_2084_1_1527937583837_7494020_32-205288-0_sp1.1 from training. Duration: 0.7090625 2023-10-02 07:26:59,454 WARNING [train.py:1204] (2/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_39-248870-0 from training. Duration: 0.69 2023-10-02 07:27:02,877 WARNING [train.py:1204] (2/4) Exclude cut with ID _3541_210_2477_1_1526475408709_3263973_377-296839-0_sp1.1 from training. Duration: 0.5 2023-10-02 07:27:04,633 WARNING [train.py:1204] (2/4) Exclude cut with ID _13679_210_7497_1_1530534293662_3355098_413-56154-0_sp1.1 from training. Duration: 0.963625 2023-10-02 07:27:04,703 WARNING [train.py:1204] (2/4) Exclude cut with ID _14970_210_15709_1_1530775547361_1966965_201-84544-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 07:27:06,035 WARNING [train.py:1204] (2/4) Exclude cut with ID _21783_210_11357_1_1531742458730_3762679_105-95775-0_sp1.1 from training. Duration: 0.936375 2023-10-02 07:27:07,493 WARNING [train.py:1204] (2/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_131-191669-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 07:27:08,798 WARNING [train.py:1204] (2/4) Exclude cut with ID _14860_210_4915_1_1530702020340_7343240_695-4323-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 07:27:08,892 WARNING [train.py:1204] (2/4) Exclude cut with ID _39867_210_9826_1_1533553222030_9344259_991-130565-0_sp1.1 from training. Duration: 0.97275 2023-10-02 07:27:11,560 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_50-3289-0_sp1.1 from training. Duration: 0.82725 2023-10-02 07:27:11,676 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_54-347670-0_sp1.1 from training. Duration: 0.92725 2023-10-02 07:27:13,099 WARNING [train.py:1204] (2/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_200-153445-0_sp1.1 from training. Duration: 0.936375 2023-10-02 07:27:14,392 WARNING [train.py:1204] (2/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_279-302911-0_sp1.1 from training. Duration: 0.97275 2023-10-02 07:27:17,129 WARNING [train.py:1204] (2/4) Exclude cut with ID _14886_210_4220_1_1530702020355_3516190_159-73748-0_sp0.9 from training. Duration: 0.92225 2023-10-02 07:27:20,747 WARNING [train.py:1204] (2/4) Exclude cut with ID _53379_210_20034_1_1534222027363_3854654_23-222715-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 07:27:20,910 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.conv_skip_rate, batch_count=798373.3333333334, ans=0.0 2023-10-02 07:27:22,025 WARNING [train.py:1204] (2/4) Exclude cut with ID _12555_210_8750_1_1530075594402_3780239_449-267986-0 from training. Duration: 0.83 2023-10-02 07:27:23,452 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7544_210_6753_1_1528587722_3576686_295-258479-0 from training. Duration: 0.84 2023-10-02 07:27:23,590 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7002_210_1794_1_1527935634_4034444_487-236501-0_sp0.9 from training. Duration: 0.7333125 2023-10-02 07:27:24,968 WARNING [train.py:1204] (2/4) Exclude cut with ID _21783_210_11357_1_1531742458730_3762679_244-19107-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 07:27:24,981 WARNING [train.py:1204] (2/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_390-339137-0 from training. Duration: 0.56 2023-10-02 07:27:25,036 WARNING [train.py:1204] (2/4) Exclude cut with ID _7562_210_2084_1_1528887767916_3946229_186-326422-0_sp1.1 from training. Duration: 0.763625 2023-10-02 07:27:26,319 WARNING [train.py:1204] (2/4) Exclude cut with ID _33640_210_2655_1_1533117605479_4855469_645-145235-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 07:27:26,343 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_25767_210_2161_1_1532307483_3867748_506-171777-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 07:27:26,373 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_5438_210_1794_1_1527336687_2600009_233-58072-0_sp1.1 from training. Duration: 0.7 2023-10-02 07:27:26,373 WARNING [train.py:1204] (2/4) Exclude cut with ID IC0669W0063-330908 from training. Duration: 0.896 2023-10-02 07:27:27,674 WARNING [train.py:1204] (2/4) Exclude cut with ID IC0671W0070-331913 from training. Duration: 0.896 2023-10-02 07:27:27,679 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_258-310570-0_sp1.1 from training. Duration: 0.7454375 2023-10-02 07:27:27,755 WARNING [train.py:1204] (2/4) Exclude cut with ID _9461_210_4557_1_1529060514726_3677451_162-177507-0_sp1.1 from training. Duration: 0.97275 2023-10-02 07:27:32,439 INFO [train.py:1046] (2/4) Epoch 23, batch 2900, loss[loss=0.1758, simple_loss=0.2588, pruned_loss=0.04636, over 23765.00 frames. ], tot_loss[loss=0.1712, simple_loss=0.2478, pruned_loss=0.04731, over 4723857.30 frames. ], batch size: 85, lr: 4.46e-03, grad_scale: 8.0 2023-10-02 07:27:34,607 WARNING [train.py:1204] (2/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_26-175553-0_sp0.9 from training. Duration: 0.67775 2023-10-02 07:27:34,624 WARNING [train.py:1204] (2/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_7-150481-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 07:27:34,689 WARNING [train.py:1204] (2/4) Exclude cut with ID _23557_210_8750_1_1531890106370_6846255_72-273262-0_sp1.1 from training. Duration: 0.963625 2023-10-02 07:27:36,072 WARNING [train.py:1204] (2/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_367-61330-0 from training. Duration: 0.75 2023-10-02 07:27:38,900 WARNING [train.py:1204] (2/4) Exclude cut with ID _39867_210_9826_1_1533553222030_9344259_1337-11141-0_sp1.1 from training. Duration: 0.936375 2023-10-02 07:27:40,213 WARNING [train.py:1204] (2/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_101-293367-0 from training. Duration: 0.63 2023-10-02 07:27:40,277 WARNING [train.py:1204] (2/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_10-100367-0 from training. Duration: 0.98 2023-10-02 07:27:42,889 WARNING [train.py:1204] (2/4) Exclude cut with ID _60057_210_10640_1_1534903065793_3333150_171-181133-0_sp1.1 from training. Duration: 0.8 2023-10-02 07:27:42,891 WARNING [train.py:1204] (2/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_844-96989-0_sp0.9 from training. Duration: 0.788875 2023-10-02 07:27:43,153 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.conv_module1.balancer1.prob, batch_count=798440.0, ans=0.125 2023-10-02 07:27:44,335 WARNING [train.py:1204] (2/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_301-144595-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 07:27:46,336 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.3.encoder.layers.3.whiten, num_groups=1, num_channels=512, metric=3.86 vs. limit=12.0 2023-10-02 07:27:47,194 WARNING [train.py:1204] (2/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_115-147343-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 07:27:49,169 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.4.encoder.layers.1.feed_forward3.out_whiten, num_groups=1, num_channels=384, metric=10.81 vs. limit=15.0 2023-10-02 07:27:51,822 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_3171_210_2087_1_1526109084_2330007_37-64334-0_sp1.1 from training. Duration: 0.7454375 2023-10-02 07:27:51,870 WARNING [train.py:1204] (2/4) Exclude cut with ID _19514_210_9259_1_1531530071330_5084230_769-272914-0_sp1.1 from training. Duration: 0.936375 2023-10-02 07:27:54,611 WARNING [train.py:1204] (2/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_76-311963-0_sp0.9 from training. Duration: 0.77775 2023-10-02 07:27:54,659 WARNING [train.py:1204] (2/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_526-62079-0 from training. Duration: 0.67 2023-10-02 07:27:54,700 WARNING [train.py:1204] (2/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_7-111954-0_sp0.9 from training. Duration: 0.62225 2023-10-02 07:27:56,100 WARNING [train.py:1204] (2/4) Exclude cut with ID _57997_210_4163_1_1534766425889_3961049_360-279140-0_sp1.1 from training. Duration: 0.97275 2023-10-02 07:27:58,870 WARNING [train.py:1204] (2/4) Exclude cut with ID _4866_210_2577_1_1527404412284_4364659_133-1696-0 from training. Duration: 0.99 2023-10-02 07:27:58,926 WARNING [train.py:1204] (2/4) Exclude cut with ID _5190_210_4557_1_1527244368799_373439_28-197030-0 from training. Duration: 0.72 2023-10-02 07:28:03,114 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_53-39003-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 07:28:03,117 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6673_210_3273_1_1527767790_3173240_351-167954-0 from training. Duration: 0.48 2023-10-02 07:28:03,133 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_2822_210_2715_1_1526112586_1999587_165-36656-0_sp1.1 from training. Duration: 0.8090625 2023-10-02 07:28:06,857 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_328-264892-0_sp1.1 from training. Duration: 0.92725 2023-10-02 07:28:06,862 WARNING [train.py:1204] (2/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_29-293896-0_sp0.9 from training. Duration: 0.67775 2023-10-02 07:28:08,948 WARNING [train.py:1204] (2/4) Exclude cut with ID _65263_210_22356_1_1535763598691_3921037_465-55448-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 07:28:10,340 WARNING [train.py:1204] (2/4) Exclude cut with ID _21989_210_5854_1_1531899214931_3271360_311-53106-0_sp1.1 from training. Duration: 0.97275 2023-10-02 07:28:13,038 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.489e+02 1.877e+02 2.119e+02 2.424e+02 3.198e+02, threshold=4.238e+02, percent-clipped=0.0 2023-10-02 07:28:13,147 WARNING [train.py:1204] (2/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_389-277915-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 07:28:14,671 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_69630_210_7234_1_1535791094_677837_63-277139-0_sp1.1 from training. Duration: 0.963625 2023-10-02 07:28:17,404 WARNING [train.py:1204] (2/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_85-138257-0 from training. Duration: 0.71 2023-10-02 07:28:17,419 WARNING [train.py:1204] (2/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_686-247600-0 from training. Duration: 0.92 2023-10-02 07:28:17,434 WARNING [train.py:1204] (2/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_108-255430-0_sp1.1 from training. Duration: 0.8818125 2023-10-02 07:28:21,058 WARNING [train.py:1204] (2/4) Exclude cut with ID _14884_210_2252_1_1530707459689_5561250_114-201399-0_sp1.1 from training. Duration: 0.6909375 2023-10-02 07:28:23,908 WARNING [train.py:1204] (2/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_506-42614-0 from training. Duration: 0.79 2023-10-02 07:28:24,018 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_43802_210_2290_1_1533516203_8829504_85-195240-0_sp1.1 from training. Duration: 0.7909375 2023-10-02 07:28:26,411 INFO [scaling.py:1022] (2/4) Whitening: name=encoder_embed.out_whiten, num_groups=1, num_channels=192, metric=6.94 vs. limit=8.0 2023-10-02 07:28:28,008 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_141-36243-0_sp1.1 from training. Duration: 0.97275 2023-10-02 07:28:31,505 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.4.encoder.layers.0.self_attn_weights.whiten_keys, num_groups=4, num_channels=128, metric=1.84 vs. limit=6.0 2023-10-02 07:28:37,242 WARNING [train.py:1204] (2/4) Exclude cut with ID _7852_210_5219_1_1528286471316_8139400_358-287492-0_sp1.1 from training. Duration: 0.87275 2023-10-02 07:28:37,273 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_805-71497-0_sp0.9 from training. Duration: 0.788875 2023-10-02 07:28:39,071 WARNING [train.py:1204] (2/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_178-39722-0 from training. Duration: 0.49 2023-10-02 07:28:42,178 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.conv_skip_rate, batch_count=798706.6666666666, ans=0.0 2023-10-02 07:28:43,321 WARNING [train.py:1204] (2/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_428-181925-0_sp1.1 from training. Duration: 0.963625 2023-10-02 07:28:43,325 WARNING [train.py:1204] (2/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_770-200430-0 from training. Duration: 0.89 2023-10-02 07:28:43,371 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_1104-271009-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 07:28:44,737 WARNING [train.py:1204] (2/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_128-296581-0_sp0.9 from training. Duration: 0.82225 2023-10-02 07:28:47,397 INFO [train.py:1046] (2/4) Epoch 23, batch 2950, loss[loss=0.1767, simple_loss=0.2551, pruned_loss=0.04912, over 23724.00 frames. ], tot_loss[loss=0.1717, simple_loss=0.2484, pruned_loss=0.04753, over 4729489.58 frames. ], batch size: 85, lr: 4.46e-03, grad_scale: 8.0 2023-10-02 07:28:50,270 WARNING [train.py:1204] (2/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_19-62511-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 07:28:50,487 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.bypass_mid.scale_min, batch_count=798773.3333333334, ans=0.2 2023-10-02 07:28:52,888 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_805-71497-0 from training. Duration: 0.71 2023-10-02 07:28:52,971 WARNING [train.py:1204] (2/4) Exclude cut with ID _44778_210_2054_1_1533798127203_7443090_573-152077-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 07:28:53,644 WARNING [train.py:1204] (2/4) Exclude cut with ID _42875_210_14856_1_1533704397487_4012433_393-177816-0_sp1.1 from training. Duration: 0.963625 2023-10-02 07:28:56,264 WARNING [train.py:1204] (2/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_278-35159-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 07:28:56,448 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.0.conv_module1.balancer1.prob, batch_count=798773.3333333334, ans=0.125 2023-10-02 07:28:57,623 WARNING [train.py:1204] (2/4) Exclude cut with ID _15037_210_2253_1_1530752294104_3267650_13-130361-0_sp1.1 from training. Duration: 0.9 2023-10-02 07:28:57,702 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_3019_210_2028_1_1526044245_3264865_127-135615-0 from training. Duration: 0.94 2023-10-02 07:28:59,071 WARNING [train.py:1204] (2/4) Exclude cut with ID _28317_210_12742_1_1532480302381_8510325_607-213951-0 from training. Duration: 0.84 2023-10-02 07:28:59,137 WARNING [train.py:1204] (2/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_436-318113-0_sp0.9 from training. Duration: 0.8333125 2023-10-02 07:28:59,138 WARNING [train.py:1204] (2/4) Exclude cut with ID _13938_210_9988_1_1530518718978_3693960_148-152225-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 07:29:03,430 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.self_attn_weights.pos_emb_skip_rate, batch_count=798840.0, ans=0.0 2023-10-02 07:29:05,019 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_2541_210_1794_1_1525590269_3084765_156-173713-0_sp1.1 from training. Duration: 0.736375 2023-10-02 07:29:06,422 WARNING [train.py:1204] (2/4) Exclude cut with ID _28323_210_16187_1_1532480395667_4100300_531-24157-0_sp1.1 from training. Duration: 0.8909375 2023-10-02 07:29:07,806 WARNING [train.py:1204] (2/4) Exclude cut with ID _29303_210_2575_1_1532392237303_7410649_382-217492-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 07:29:09,735 WARNING [train.py:1204] (2/4) Exclude cut with ID _58895_210_8750_1_1535184081320_3649089_270-158013-0_sp1.1 from training. Duration: 0.92725 2023-10-02 07:29:12,067 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.0.layers.1.nonlin_attention.whiten2, num_groups=1, num_channels=192, metric=6.27 vs. limit=15.0 2023-10-02 07:29:12,574 WARNING [train.py:1204] (2/4) Exclude cut with ID _29277_210_3279_1_1532581109293_3173069_311-206227-0_sp1.1 from training. Duration: 0.936375 2023-10-02 07:29:12,584 WARNING [train.py:1204] (2/4) Exclude cut with ID _7160_210_1876_1_1527940923231_3703799_282-322611-0_sp0.9 from training. Duration: 0.9666875 2023-10-02 07:29:14,004 WARNING [train.py:1204] (2/4) Exclude cut with ID _53219_210_4525_1_1534834836008_4064825_562-32295-0_sp1.1 from training. Duration: 0.963625 2023-10-02 07:29:15,374 WARNING [train.py:1204] (2/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_408-87330-0_sp1.1 from training. Duration: 0.963625 2023-10-02 07:29:15,388 WARNING [train.py:1204] (2/4) Exclude cut with ID _59202_210_26984_1_1534816810612_3549174_137-318710-0_sp1.1 from training. Duration: 0.8090625 2023-10-02 07:29:18,339 WARNING [train.py:1204] (2/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_372-30302-0 from training. Duration: 0.86 2023-10-02 07:29:22,335 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7543_210_6753_1_1528541907_1399322_178-56269-0_sp1.1 from training. Duration: 0.95275 2023-10-02 07:29:22,354 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9109W0012-549758 from training. Duration: 0.512 2023-10-02 07:29:23,776 WARNING [train.py:1204] (2/4) Exclude cut with ID _5410_210_1388_1_1527397055795_1719020_39-112221-0_sp0.9 from training. Duration: 0.7666875 2023-10-02 07:29:25,684 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9121W0012-554933 from training. Duration: 0.896 2023-10-02 07:29:27,055 WARNING [train.py:1204] (2/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_476-272757-0 from training. Duration: 0.93 2023-10-02 07:29:28,440 WARNING [train.py:1204] (2/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_1085-171718-0_sp1.1 from training. Duration: 0.92725 2023-10-02 07:29:28,497 WARNING [train.py:1204] (2/4) Exclude cut with ID _8480_210_1874_1_1528589391205_6728330_309-139896-0_sp1.1 from training. Duration: 0.736375 2023-10-02 07:29:28,497 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9130W0113-559703 from training. Duration: 0.896 2023-10-02 07:29:28,501 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_292-265928-0_sp1.1 from training. Duration: 0.463625 2023-10-02 07:29:31,342 WARNING [train.py:1204] (2/4) Exclude cut with ID _8772_210_1850_1_1528635595972_3664011_96-12756-0 from training. Duration: 0.98 2023-10-02 07:29:31,398 WARNING [train.py:1204] (2/4) Exclude cut with ID _12556_210_8341_1_1530081024100_7327690_725-193477-0_sp1.1 from training. Duration: 0.8909375 2023-10-02 07:29:32,752 WARNING [train.py:1204] (2/4) Exclude cut with ID _5289_210_2084_1_1527678202736_3779299_142-103317-0_sp1.1 from training. Duration: 0.763625 2023-10-02 07:29:32,944 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.1.feed_forward1.hidden_balancer.prob, batch_count=798973.3333333334, ans=0.125 2023-10-02 07:29:34,372 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.ff2_skip_rate, batch_count=798973.3333333334, ans=0.0 2023-10-02 07:29:35,487 WARNING [train.py:1204] (2/4) Exclude cut with ID _45012_210_12651_1_1533608435136_5371380_206-51075-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 07:29:35,594 WARNING [train.py:1204] (2/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_176-56120-0_sp1.1 from training. Duration: 0.7545625 2023-10-02 07:29:35,623 WARNING [train.py:1204] (2/4) Exclude cut with ID _40284_210_2084_1_1533513679814_3704279_220-201864-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 07:29:37,491 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9165W0004-576638 from training. Duration: 0.896 2023-10-02 07:29:37,521 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_5438_210_1794_1_1527336687_2600009_218-343336-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 07:29:37,548 WARNING [train.py:1204] (2/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_318-162017-0 from training. Duration: 0.91 2023-10-02 07:29:44,075 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_49544_210_1794_1_1534379770_5262083_483-349298-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 07:29:45,400 WARNING [train.py:1204] (2/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_197-28164-0_sp0.9 from training. Duration: 0.888875 2023-10-02 07:29:45,643 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.conv_skip_rate, batch_count=799040.0, ans=0.0 2023-10-02 07:29:46,736 WARNING [train.py:1204] (2/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_643-52007-0 from training. Duration: 0.57 2023-10-02 07:29:46,759 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_38965_210_4852_1_1533174906_3484735_403-332963-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 07:29:48,184 WARNING [train.py:1204] (2/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_340-174176-0 from training. Duration: 0.49 2023-10-02 07:29:50,979 WARNING [train.py:1204] (2/4) Exclude cut with ID _56708_210_13177_1_1535252351282_3422899_363-917-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 07:29:52,422 WARNING [train.py:1204] (2/4) Exclude cut with ID _19896_210_1883_1_1531709022662_6649569_525-159063-0_sp1.1 from training. Duration: 0.936375 2023-10-02 07:29:52,471 WARNING [train.py:1204] (2/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_430-116139-0_sp0.9 from training. Duration: 0.9444375 2023-10-02 07:29:52,798 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.bypass_mid.scale_min, batch_count=799040.0, ans=0.2 2023-10-02 07:29:53,888 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_53753_210_7234_1_1534320003_3594148_86-329250-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 07:29:53,900 WARNING [train.py:1204] (2/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_406-275649-0_sp1.1 from training. Duration: 0.3818125 2023-10-02 07:29:54,686 WARNING [train.py:1204] (2/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_1083-147536-0_sp1.1 from training. Duration: 0.8818125 2023-10-02 07:29:55,968 WARNING [train.py:1204] (2/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_54-311741-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 07:29:55,981 WARNING [train.py:1204] (2/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_237-58433-0_sp0.9 from training. Duration: 0.82225 2023-10-02 07:29:56,019 WARNING [train.py:1204] (2/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_98-51676-0_sp1.1 from training. Duration: 0.663625 2023-10-02 07:29:56,073 WARNING [train.py:1204] (2/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_386-270472-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 07:29:57,410 WARNING [train.py:1204] (2/4) Exclude cut with ID _19023_210_4353_1_1531378892552_5820780_83-254794-0_sp1.1 from training. Duration: 0.7818125 2023-10-02 07:29:58,714 WARNING [train.py:1204] (2/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_314-93656-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 07:29:58,727 WARNING [train.py:1204] (2/4) Exclude cut with ID _14170_210_5219_1_1530525949868_4469650_336-213530-0 from training. Duration: 0.9 2023-10-02 07:30:00,102 WARNING [train.py:1204] (2/4) Exclude cut with ID _36686_210_5665_1_1533778225913_3646410_65-221120-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 07:30:01,194 INFO [train.py:1046] (2/4) Epoch 23, batch 3000, loss[loss=0.2278, simple_loss=0.2891, pruned_loss=0.08325, over 19250.00 frames. ], tot_loss[loss=0.1727, simple_loss=0.2491, pruned_loss=0.04811, over 4716269.40 frames. ], batch size: 388, lr: 4.46e-03, grad_scale: 8.0 2023-10-02 07:30:01,194 INFO [train.py:1069] (2/4) Computing validation loss 2023-10-02 07:30:13,756 INFO [train.py:1078] (2/4) Epoch 23, validation: loss=0.3132, simple_loss=0.2719, pruned_loss=0.1772, over 1125622.00 frames. 2023-10-02 07:30:13,757 INFO [train.py:1079] (2/4) Maximum memory allocated so far is 20967MB 2023-10-02 07:30:15,258 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_26433_210_12108_1_1532157158_4056480_271-68178-0_sp1.1 from training. Duration: 0.92725 2023-10-02 07:30:16,526 WARNING [train.py:1204] (2/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_46-207599-0_sp0.9 from training. Duration: 0.711125 2023-10-02 07:30:18,614 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.3.encoder.layers.2.feed_forward1.out_whiten, num_groups=1, num_channels=512, metric=8.37 vs. limit=15.0 2023-10-02 07:30:19,307 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9303W0114-637133 from training. Duration: 0.896 2023-10-02 07:30:19,343 WARNING [train.py:1204] (2/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_490-207861-0 from training. Duration: 0.98 2023-10-02 07:30:22,630 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6767_210_3273_1_1527855783_1718932_173-256601-0_sp0.9 from training. Duration: 0.888875 2023-10-02 07:30:22,664 WARNING [train.py:1204] (2/4) Exclude cut with ID _13921_210_9994_1_1530841782272_3503520_327-69677-0_sp1.1 from training. Duration: 0.8181875 2023-10-02 07:30:23,950 WARNING [train.py:1204] (2/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_508-335369-0 from training. Duration: 0.48 2023-10-02 07:30:23,984 WARNING [train.py:1204] (2/4) Exclude cut with ID _64631_210_27767_1_1535349580068_4165160_24-46567-0_sp1.1 from training. Duration: 0.963625 2023-10-02 07:30:30,849 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_89-35748-0_sp1.1 from training. Duration: 0.7181875 2023-10-02 07:30:40,233 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_26433_210_12108_1_1532157158_4056480_578-262107-0_sp1.1 from training. Duration: 0.9 2023-10-02 07:30:45,776 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=799240.0, ans=0.1 2023-10-02 07:30:46,901 WARNING [train.py:1204] (2/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_129-337802-0 from training. Duration: 0.72 2023-10-02 07:30:49,628 WARNING [train.py:1204] (2/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_356-167816-0_sp0.9 from training. Duration: 0.8 2023-10-02 07:30:52,369 WARNING [train.py:1204] (2/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_137-116442-0_sp1.1 from training. Duration: 0.7090625 2023-10-02 07:30:52,411 WARNING [train.py:1204] (2/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_358-172940-0_sp1.1 from training. Duration: 0.963625 2023-10-02 07:30:53,101 WARNING [train.py:1204] (2/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_668-344147-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 07:30:54,302 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.545e+02 1.810e+02 1.967e+02 2.227e+02 3.390e+02, threshold=3.934e+02, percent-clipped=0.0 2023-10-02 07:30:54,446 WARNING [train.py:1204] (2/4) Exclude cut with ID _62337_210_12392_1_1535009273105_3412229_255-157667-0_sp1.1 from training. Duration: 0.936375 2023-10-02 07:30:55,813 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_486-151131-0 from training. Duration: 0.85 2023-10-02 07:30:55,971 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_53638_210_21581_1_1534297319_5808449_885-80172-0 from training. Duration: 0.96 2023-10-02 07:30:56,232 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.bypass.skip_rate, batch_count=799240.0, ans=0.07 2023-10-02 07:30:57,407 WARNING [train.py:1204] (2/4) Exclude cut with ID _50093_210_4220_1_1534417141207_3432299_105-183067-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 07:30:57,466 WARNING [train.py:1204] (2/4) Exclude cut with ID _7562_210_2084_1_1528887767916_3946229_345-169156-0_sp0.9 from training. Duration: 0.6444375 2023-10-02 07:30:59,036 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_4528_210_3273_1_1527076654_3276777_209-219804-0_sp0.9 from training. Duration: 0.7666875 2023-10-02 07:31:00,343 WARNING [train.py:1204] (2/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_35-153886-0_sp0.9 from training. Duration: 0.9333125 2023-10-02 07:31:00,417 WARNING [train.py:1204] (2/4) Exclude cut with ID _30800_210_22382_1_1532499347904_4515959_332-121925-0_sp1.1 from training. Duration: 0.92725 2023-10-02 07:31:00,419 WARNING [train.py:1204] (2/4) Exclude cut with ID _59202_210_26984_1_1534816810612_3549174_495-65515-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 07:31:03,453 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.feed_forward3.hidden_balancer.prob, batch_count=799306.6666666666, ans=0.125 2023-10-02 07:31:04,516 WARNING [train.py:1204] (2/4) Exclude cut with ID _6274_210_2084_1_1527937583837_7494020_769-168302-0_sp1.1 from training. Duration: 0.6454375 2023-10-02 07:31:04,547 WARNING [train.py:1204] (2/4) Exclude cut with ID _24271_210_12658_1_1531987270022_7190519_708-71345-0_sp1.1 from training. Duration: 0.936375 2023-10-02 07:31:04,547 WARNING [train.py:1204] (2/4) Exclude cut with ID 3033-130750-0096-55598-0_sp0.9 from training. Duration: 0.92225 2023-10-02 07:31:05,974 WARNING [train.py:1204] (2/4) Exclude cut with ID _14087_210_2253_1_1530529224512_3014029_219-224445-0_sp0.9 from training. Duration: 0.9333125 2023-10-02 07:31:08,664 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_174-277364-0 from training. Duration: 0.71 2023-10-02 07:31:10,057 WARNING [train.py:1204] (2/4) Exclude cut with ID _45021_210_9994_1_1533619751094_3330288_96-239010-0_sp1.1 from training. Duration: 0.82725 2023-10-02 07:31:10,315 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.bypass.skip_rate, batch_count=799306.6666666666, ans=0.04949747468305833 2023-10-02 07:31:11,721 WARNING [train.py:1204] (2/4) Exclude cut with ID _31602_210_5854_1_1532677021456_4458250_489-305116-0_sp1.1 from training. Duration: 0.97275 2023-10-02 07:31:11,753 WARNING [train.py:1204] (2/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_269-154726-0_sp0.9 from training. Duration: 0.9555625 2023-10-02 07:31:18,387 WARNING [train.py:1204] (2/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_126-54418-0_sp1.1 from training. Duration: 0.92725 2023-10-02 07:31:18,423 WARNING [train.py:1204] (2/4) Exclude cut with ID _42326_210_7770_1_1533814282531_3707269_182-224159-0_sp1.1 from training. Duration: 0.92725 2023-10-02 07:31:19,803 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7002_210_1794_1_1527939683_5157286_1-281264-0_sp0.9 from training. Duration: 0.488875 2023-10-02 07:31:21,019 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_86-16881-0 from training. Duration: 0.58 2023-10-02 07:31:21,053 WARNING [train.py:1204] (2/4) Exclude cut with ID _19514_210_9259_1_1531530071330_5084230_637-279094-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 07:31:21,097 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_26433_210_12108_1_1532157158_4056480_578-262107-0 from training. Duration: 0.99 2023-10-02 07:31:22,425 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6291_210_3273_1_1527683649_1078498_82-221196-0_sp1.1 from training. Duration: 0.6909375 2023-10-02 07:31:22,539 WARNING [train.py:1204] (2/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_402-102311-0 from training. Duration: 0.67 2023-10-02 07:31:25,761 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1015-343884-0_sp1.1 from training. Duration: 0.563625 2023-10-02 07:31:27,151 WARNING [train.py:1204] (2/4) Exclude cut with ID _13684_210_7497_1_1530619354863_3598359_312-235606-0_sp1.1 from training. Duration: 0.5454375 2023-10-02 07:31:28,384 INFO [train.py:1046] (2/4) Epoch 23, batch 3050, loss[loss=0.1622, simple_loss=0.2398, pruned_loss=0.0423, over 24612.00 frames. ], tot_loss[loss=0.1728, simple_loss=0.2496, pruned_loss=0.04805, over 4713324.89 frames. ], batch size: 60, lr: 4.46e-03, grad_scale: 8.0 2023-10-02 07:31:28,435 WARNING [train.py:1204] (2/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_55-31904-0 from training. Duration: 0.88 2023-10-02 07:31:28,504 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_3133_210_2087_1_1526106446_2324690_25-332138-0 from training. Duration: 0.91 2023-10-02 07:31:28,506 WARNING [train.py:1204] (2/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_912-282412-0_sp0.9 from training. Duration: 0.5444375 2023-10-02 07:31:29,807 WARNING [train.py:1204] (2/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_458-299435-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 07:31:31,183 WARNING [train.py:1204] (2/4) Exclude cut with ID _69584_210_5219_1_1535785133775_4044450_275-88403-0_sp1.1 from training. Duration: 0.92725 2023-10-02 07:31:31,198 WARNING [train.py:1204] (2/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_841-341679-0_sp1.1 from training. Duration: 0.52725 2023-10-02 07:31:31,211 WARNING [train.py:1204] (2/4) Exclude cut with ID _41391_210_7994_1_1533603771872_3638201_1-54521-0_sp1.1 from training. Duration: 0.97275 2023-10-02 07:31:32,577 WARNING [train.py:1204] (2/4) Exclude cut with ID _56235_210_10640_1_1534557335235_4161099_495-186609-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 07:31:34,090 WARNING [train.py:1204] (2/4) Exclude cut with ID _4681_210_3525_1_1527250689464_120889_15-126760-0 from training. Duration: 0.81 2023-10-02 07:31:34,406 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.conv_skip_rate, batch_count=799440.0, ans=0.0 2023-10-02 07:31:35,589 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_3019_210_2028_1_1526044245_3264865_127-135615-0_sp1.1 from training. Duration: 0.8545625 2023-10-02 07:31:37,116 WARNING [train.py:1204] (2/4) Exclude cut with ID _30191_210_1664_1_1533189915047_7196670_915-21561-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 07:31:37,160 WARNING [train.py:1204] (2/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_6-214815-0_sp0.9 from training. Duration: 0.9444375 2023-10-02 07:31:40,002 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_45399_210_4852_1_1533624725_7579737_190-137494-0_sp1.1 from training. Duration: 0.97275 2023-10-02 07:31:42,054 WARNING [train.py:1204] (2/4) Exclude cut with ID _41314_210_12651_1_1533459476543_3766460_207-132038-0 from training. Duration: 0.94 2023-10-02 07:31:44,149 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.1.encoder.layers.1.self_attn2.whiten, num_groups=1, num_channels=256, metric=9.37 vs. limit=22.5 2023-10-02 07:31:49,701 WARNING [train.py:1204] (2/4) Exclude cut with ID _14603_210_4220_1_1530615688441_3097319_90-31383-0 from training. Duration: 0.87 2023-10-02 07:31:49,740 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_15517_210_6753_1_1530952293_1664795_119-31001-0 from training. Duration: 0.94 2023-10-02 07:31:51,152 WARNING [train.py:1204] (2/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_684-85342-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 07:31:53,974 WARNING [train.py:1204] (2/4) Exclude cut with ID _14603_210_4220_1_1530615688441_3097319_97-325053-0_sp1.1 from training. Duration: 0.836375 2023-10-02 07:31:56,067 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.bypass.skip_rate, batch_count=799506.6666666666, ans=0.09899494936611666 2023-10-02 07:31:57,314 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_68702_210_23689_1_1535799577_3420068_168-25150-0_sp1.1 from training. Duration: 0.97275 2023-10-02 07:31:57,331 WARNING [train.py:1204] (2/4) Exclude cut with ID _65134_210_14763_1_1535335393985_3505368_111-27984-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 07:31:58,675 WARNING [train.py:1204] (2/4) Exclude cut with ID _48678_210_5663_1_1533988830947_3769199_276-226230-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 07:32:00,224 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.conv_module1.balancer2.prob, batch_count=799573.3333333334, ans=0.125 2023-10-02 07:32:01,484 WARNING [train.py:1204] (2/4) Exclude cut with ID _28427_210_4220_1_1532581240967_3061069_43-84928-0_sp1.1 from training. Duration: 0.9 2023-10-02 07:32:01,522 WARNING [train.py:1204] (2/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_158-250116-0_sp1.1 from training. Duration: 0.563625 2023-10-02 07:32:02,727 WARNING [train.py:1204] (2/4) Exclude cut with ID _66873_210_5517_1_1535454136506_4293370_279-332556-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 07:32:02,755 WARNING [train.py:1204] (2/4) Exclude cut with ID _42863_210_9070_1_1533902398788_3696920_488-230146-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 07:32:02,755 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_387-247159-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 07:32:02,838 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6729_210_2550_1_1528032481_1788725_261-42619-0_sp1.1 from training. Duration: 0.97275 2023-10-02 07:32:04,256 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.nonlin_attention.balancer.prob, batch_count=799573.3333333334, ans=0.125 2023-10-02 07:32:05,477 WARNING [train.py:1204] (2/4) Exclude cut with ID _19896_210_1883_1_1531709022662_6649569_831-44724-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 07:32:07,044 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.feed_forward1.hidden_balancer.prob, batch_count=799573.3333333334, ans=0.125 2023-10-02 07:32:08,143 WARNING [train.py:1204] (2/4) Exclude cut with ID _55153_210_2253_1_1534384786677_3162096_280-285944-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 07:32:08,177 WARNING [train.py:1204] (2/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_448-4048-0 from training. Duration: 0.96 2023-10-02 07:32:08,409 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.feed_forward1.out_proj.dropout_p, batch_count=799573.3333333334, ans=0.1 2023-10-02 07:32:09,556 WARNING [train.py:1204] (2/4) Exclude cut with ID _29678_210_7893_1_1532421042787_3906118_456-202548-0_sp1.1 from training. Duration: 0.97275 2023-10-02 07:32:09,568 WARNING [train.py:1204] (2/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_107-332398-0_sp1.1 from training. Duration: 0.7090625 2023-10-02 07:32:11,687 WARNING [train.py:1204] (2/4) Exclude cut with ID _13921_210_9994_1_1530838702800_3003429_46-149444-0_sp1.1 from training. Duration: 0.8545625 2023-10-02 07:32:12,957 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_257-20733-0_sp0.9 from training. Duration: 0.8444375 2023-10-02 07:32:14,257 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_53753_210_7234_1_1534320003_3594148_385-234809-0_sp1.1 from training. Duration: 0.963625 2023-10-02 07:32:14,328 WARNING [train.py:1204] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_292-215346-0_sp1.1 from training. Duration: 0.8818125 2023-10-02 07:32:20,665 WARNING [train.py:1204] (2/4) Exclude cut with ID _68965_210_1883_1_1535698962272_3497490_196-124268-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 07:32:20,697 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_23801_210_16223_1_1532066392_3294657_570-5439-0_sp1.1 from training. Duration: 0.8818125 2023-10-02 07:32:26,913 WARNING [train.py:1204] (2/4) Exclude cut with ID _5166_210_2084_1_1527073197160_4087993_349-6928-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 07:32:26,959 WARNING [train.py:1204] (2/4) Exclude cut with ID _60621_210_17071_1_1535013053530_3671739_226-255202-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 07:32:26,966 WARNING [train.py:1204] (2/4) Exclude cut with ID _41817_210_18835_1_1533434477160_7423049_448-130302-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 07:32:29,783 WARNING [train.py:1204] (2/4) Exclude cut with ID _6653_210_2629_1_1527919172441_6732679_758-143677-0_sp1.1 from training. Duration: 0.8909375 2023-10-02 07:32:29,831 WARNING [train.py:1204] (2/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_912-47473-0_sp0.9 from training. Duration: 0.7555625 2023-10-02 07:32:31,096 WARNING [train.py:1204] (2/4) Exclude cut with ID _6694_210_4599_1_1527852637490_3965529_356-169233-0_sp1.1 from training. Duration: 0.9 2023-10-02 07:32:32,415 WARNING [train.py:1204] (2/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_1-99680-0 from training. Duration: 0.67 2023-10-02 07:32:33,832 WARNING [train.py:1204] (2/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_199-86432-0_sp1.1 from training. Duration: 0.8909375 2023-10-02 07:32:33,850 WARNING [train.py:1204] (2/4) Exclude cut with ID _61148_210_6897_1_1535094022726_4217610_362-182430-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 07:32:33,944 WARNING [train.py:1204] (2/4) Exclude cut with ID _49376_210_5986_1_1533985278732_3370740_301-154763-0 from training. Duration: 0.98 2023-10-02 07:32:36,721 WARNING [train.py:1204] (2/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_572-185150-0_sp1.1 from training. Duration: 0.8818125 2023-10-02 07:32:40,996 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_47896_210_20508_1_1534492518_5958868_558-314572-0_sp1.1 from training. Duration: 0.8818125 2023-10-02 07:32:42,354 INFO [train.py:1046] (2/4) Epoch 23, batch 3100, loss[loss=0.1676, simple_loss=0.2418, pruned_loss=0.04671, over 23478.00 frames. ], tot_loss[loss=0.1725, simple_loss=0.2486, pruned_loss=0.04821, over 4707489.35 frames. ], batch size: 134, lr: 4.46e-03, grad_scale: 8.0 2023-10-02 07:32:42,426 WARNING [train.py:1204] (2/4) Exclude cut with ID _45560_210_21982_1_1533812603951_3584999_111-6376-0_sp0.9 from training. Duration: 0.9333125 2023-10-02 07:32:46,530 WARNING [train.py:1204] (2/4) Exclude cut with ID _14170_210_5219_1_1530525949868_4469650_412-317077-0_sp1.1 from training. Duration: 0.6818125 2023-10-02 07:32:47,878 WARNING [train.py:1204] (2/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_718-68505-0 from training. Duration: 0.89 2023-10-02 07:32:50,576 WARNING [train.py:1204] (2/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_77-311404-0 from training. Duration: 0.94 2023-10-02 07:32:51,982 WARNING [train.py:1204] (2/4) Exclude cut with ID _60229_210_27767_1_1534838406158_3655350_264-279573-0 from training. Duration: 0.83 2023-10-02 07:32:53,344 WARNING [train.py:1204] (2/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_106-300985-0_sp1.1 from training. Duration: 0.8181875 2023-10-02 07:32:57,303 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_4183_210_2087_1_1526729100_1869838_79-277189-0_sp1.1 from training. Duration: 0.963625 2023-10-02 07:32:57,312 WARNING [train.py:1204] (2/4) Exclude cut with ID _28427_210_4220_1_1532581240967_3061069_195-295268-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 07:32:58,824 WARNING [train.py:1204] (2/4) Exclude cut with ID _4092_210_2151_1_1526643044449_3135759_356-278694-0_sp0.9 from training. Duration: 0.52225 2023-10-02 07:33:00,458 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.ff2_skip_rate, batch_count=799840.0, ans=0.0 2023-10-02 07:33:02,959 WARNING [train.py:1204] (2/4) Exclude cut with ID _14459_210_2228_1_1531443641946_7516700_777-71446-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 07:33:05,964 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.feed_forward1.out_proj.dropout_p, batch_count=799840.0, ans=0.1 2023-10-02 07:33:07,146 WARNING [train.py:1204] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_520-126729-0 from training. Duration: 0.7 2023-10-02 07:33:12,788 WARNING [train.py:1204] (2/4) Exclude cut with ID _13684_210_7497_1_1530619354863_3598359_312-235606-0_sp0.9 from training. Duration: 0.6666875 2023-10-02 07:33:14,119 WARNING [train.py:1204] (2/4) Exclude cut with ID _37537_210_2253_1_1533171758568_3421949_283-349402-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 07:33:14,154 WARNING [train.py:1204] (2/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_29-117932-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 07:33:14,173 WARNING [train.py:1204] (2/4) Exclude cut with ID _29303_210_2575_1_1532392237303_7410649_721-283351-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 07:33:15,545 WARNING [train.py:1204] (2/4) Exclude cut with ID _14886_210_4220_1_1530702020355_3516190_175-229073-0_sp0.9 from training. Duration: 0.5 2023-10-02 07:33:17,419 WARNING [train.py:1204] (2/4) Exclude cut with ID _15458_210_8341_1_1530858614263_3834410_70-268830-0_sp1.1 from training. Duration: 0.97275 2023-10-02 07:33:19,214 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_2295_210_2087_1_1525500993_2407863_234-83104-0 from training. Duration: 0.62 2023-10-02 07:33:19,215 WARNING [train.py:1204] (2/4) Exclude cut with ID _22961_210_8254_1_1531976366672_4278180_317-199546-0_sp1.1 from training. Duration: 0.7818125 2023-10-02 07:33:19,337 WARNING [train.py:1204] (2/4) Exclude cut with ID _27894_210_12392_1_1532415496078_6902439_547-196022-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 07:33:20,779 WARNING [train.py:1204] (2/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_46-207599-0 from training. Duration: 0.64 2023-10-02 07:33:22,204 WARNING [train.py:1204] (2/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_426-326246-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 07:33:22,446 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.conv_skip_rate, batch_count=799906.6666666666, ans=0.0 2023-10-02 07:33:23,450 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.429e+02 1.736e+02 1.938e+02 2.240e+02 3.003e+02, threshold=3.876e+02, percent-clipped=0.0 2023-10-02 07:33:24,300 WARNING [train.py:1204] (2/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_445-117398-0_sp0.9 from training. Duration: 0.988875 2023-10-02 07:33:25,556 WARNING [train.py:1204] (2/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_563-347832-0 from training. Duration: 0.89 2023-10-02 07:33:26,901 WARNING [train.py:1204] (2/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_252-216674-0 from training. Duration: 0.54 2023-10-02 07:33:26,979 WARNING [train.py:1204] (2/4) Exclude cut with ID _59611_210_8895_1_1534741198240_3650361_187-212779-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 07:33:28,969 WARNING [train.py:1204] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_282-159367-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 07:33:30,292 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_68702_210_23689_1_1535799577_3420068_33-136883-0_sp1.1 from training. Duration: 0.936375 2023-10-02 07:33:30,305 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_47228_210_4983_1_1533780021_3672027_184-293706-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 07:33:30,340 WARNING [train.py:1204] (2/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_656-188361-0_sp0.9 from training. Duration: 0.9666875 2023-10-02 07:33:31,730 WARNING [train.py:1204] (2/4) Exclude cut with ID _4866_210_2577_1_1527404412284_4364659_352-157930-0_sp0.9 from training. Duration: 0.9 2023-10-02 07:33:31,731 WARNING [train.py:1204] (2/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_210-106047-0_sp1.1 from training. Duration: 0.8818125 2023-10-02 07:33:37,181 WARNING [train.py:1204] (2/4) Exclude cut with ID _9389_210_1664_1_1529217337301_5289269_289-213417-0_sp1.1 from training. Duration: 0.8090625 2023-10-02 07:33:37,224 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_3910_210_1794_1_1526777667_3747260_178-41186-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 07:33:37,228 WARNING [train.py:1204] (2/4) Exclude cut with ID _27052_210_5986_1_1532329197187_3585009_24-172263-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 07:33:37,231 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_5522_210_3273_1_1527249606_3909096_318-203153-0_sp1.1 from training. Duration: 0.3909375 2023-10-02 07:33:42,842 WARNING [train.py:1204] (2/4) Exclude cut with ID _27894_210_12392_1_1532415496078_6902439_222-51982-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 07:33:44,221 WARNING [train.py:1204] (2/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_87-131427-0 from training. Duration: 0.78 2023-10-02 07:33:45,607 WARNING [train.py:1204] (2/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_372-298093-0_sp0.9 from training. Duration: 0.888875 2023-10-02 07:33:46,923 WARNING [train.py:1204] (2/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_663-166973-0 from training. Duration: 0.8 2023-10-02 07:33:46,959 WARNING [train.py:1204] (2/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_429-177469-0_sp1.1 from training. Duration: 0.936375 2023-10-02 07:33:48,876 WARNING [train.py:1204] (2/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_724-172651-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 07:33:48,931 WARNING [train.py:1204] (2/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_103-336064-0 from training. Duration: 0.51 2023-10-02 07:33:55,957 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.4.encoder.layers.1.self_attn1.whiten, num_groups=1, num_channels=384, metric=12.32 vs. limit=22.5 2023-10-02 07:34:00,037 WARNING [train.py:1204] (2/4) Exclude cut with ID _48677_210_5663_1_1533902405919_3892989_111-11439-0 from training. Duration: 0.96 2023-10-02 07:34:01,927 INFO [train.py:1046] (2/4) Epoch 23, batch 3150, loss[loss=0.1494, simple_loss=0.2312, pruned_loss=0.03381, over 24611.00 frames. ], tot_loss[loss=0.1717, simple_loss=0.2476, pruned_loss=0.04797, over 4707142.56 frames. ], batch size: 60, lr: 4.45e-03, grad_scale: 8.0 2023-10-02 07:34:03,428 WARNING [train.py:1204] (2/4) Exclude cut with ID _8085_210_4557_1_1528455633493_3716100_95-103398-0_sp1.1 from training. Duration: 0.963625 2023-10-02 07:34:03,503 WARNING [train.py:1204] (2/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_1041-110837-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 07:34:04,920 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_50050_210_16223_1_1535435796_7290138_1155-121757-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 07:34:04,922 WARNING [train.py:1204] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_602-275260-0_sp0.9 from training. Duration: 0.911125 2023-10-02 07:34:06,333 WARNING [train.py:1204] (2/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_265-164486-0 from training. Duration: 0.96 2023-10-02 07:34:07,770 WARNING [train.py:1204] (2/4) Exclude cut with ID _46067_210_4220_1_1534125571066_3307450_142-229375-0_sp1.1 from training. Duration: 0.963625 2023-10-02 07:34:07,806 WARNING [train.py:1204] (2/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_310-26186-0_sp1.1 from training. Duration: 0.636375 2023-10-02 07:34:09,174 WARNING [train.py:1204] (2/4) Exclude cut with ID _28461_210_2253_1_1532771775189_3041299_136-270644-0 from training. Duration: 0.93 2023-10-02 07:34:11,867 WARNING [train.py:1204] (2/4) Exclude cut with ID _14860_210_4915_1_1530702020340_7343240_438-286372-0_sp1.1 from training. Duration: 0.936375 2023-10-02 07:34:13,302 WARNING [train.py:1204] (2/4) Exclude cut with ID IC0128W0004-63309 from training. Duration: 0.831 2023-10-02 07:34:14,800 WARNING [train.py:1204] (2/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_135-174080-0 from training. Duration: 0.53 2023-10-02 07:34:16,119 WARNING [train.py:1204] (2/4) Exclude cut with ID _41326_210_5854_1_1533778076509_3492080_277-59511-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 07:34:16,200 WARNING [train.py:1204] (2/4) Exclude cut with ID IC0144W0112-71379 from training. Duration: 0.992 2023-10-02 07:34:17,546 WARNING [train.py:1204] (2/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_711-15688-0_sp0.9 from training. Duration: 0.47775 2023-10-02 07:34:17,663 WARNING [train.py:1204] (2/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_237-332027-0 from training. Duration: 0.95 2023-10-02 07:34:19,586 WARNING [train.py:1204] (2/4) Exclude cut with ID _7970_210_1883_1_1528455616296_3472230_25-178407-0 from training. Duration: 0.71 2023-10-02 07:34:19,588 WARNING [train.py:1204] (2/4) Exclude cut with ID _8480_210_1874_1_1528589391205_6728330_309-139896-0 from training. Duration: 0.81 2023-10-02 07:34:21,086 WARNING [train.py:1204] (2/4) Exclude cut with ID _39571_210_13449_1_1533801584143_3651077_319-268877-0_sp1.1 from training. Duration: 0.936375 2023-10-02 07:34:21,090 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_43802_210_2290_1_1533516203_8829504_155-246880-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 07:34:21,191 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_566-122599-0_sp1.1 from training. Duration: 0.936375 2023-10-02 07:34:21,374 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.feed_forward3.hidden_balancer.prob, batch_count=800173.3333333334, ans=0.125 2023-10-02 07:34:24,342 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_12868_210_6753_1_1530701932_530275_18-76322-0 from training. Duration: 0.87 2023-10-02 07:34:25,712 WARNING [train.py:1204] (2/4) Exclude cut with ID _56494_210_4220_1_1535284837625_3033169_159-106035-0_sp1.1 from training. Duration: 0.963625 2023-10-02 07:34:26,996 WARNING [train.py:1204] (2/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_98-276013-0_sp1.1 from training. Duration: 0.963625 2023-10-02 07:34:27,055 WARNING [train.py:1204] (2/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_330-216124-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 07:34:28,933 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_177-58005-0_sp0.9 from training. Duration: 0.688875 2023-10-02 07:34:32,502 WARNING [train.py:1204] (2/4) Exclude cut with ID _56235_210_10640_1_1534557335235_4161099_343-139898-0 from training. Duration: 0.96 2023-10-02 07:34:32,978 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.5.encoder.layers.0.conv_module2.whiten, num_groups=1, num_channels=256, metric=3.99 vs. limit=15.0 2023-10-02 07:34:33,904 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6961_210_2087_1_1527937168_6815011_395-185060-0_sp1.1 from training. Duration: 0.863625 2023-10-02 07:34:36,476 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7002_210_1794_1_1527939683_5157286_184-229630-0_sp0.9 from training. Duration: 0.82225 2023-10-02 07:34:37,828 WARNING [train.py:1204] (2/4) Exclude cut with ID _59170_210_6721_1_1534983633960_4203335_153-18895-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 07:34:39,183 WARNING [train.py:1204] (2/4) Exclude cut with ID _49643_210_7989_1_1534640244224_3939031_598-161926-0 from training. Duration: 0.9 2023-10-02 07:34:41,939 WARNING [train.py:1204] (2/4) Exclude cut with ID _4068_210_2629_1_1526697350395_1546259_224-288693-0 from training. Duration: 0.59 2023-10-02 07:34:42,005 WARNING [train.py:1204] (2/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_718-68505-0_sp1.1 from training. Duration: 0.8090625 2023-10-02 07:34:43,299 WARNING [train.py:1204] (2/4) Exclude cut with ID _4068_210_2629_1_1526697350395_1546259_224-288693-0_sp0.9 from training. Duration: 0.6555625 2023-10-02 07:34:43,310 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_2296_210_2087_1_1525503448_3397335_57-185430-0_sp0.9 from training. Duration: 0.6666875 2023-10-02 07:34:43,366 WARNING [train.py:1204] (2/4) Exclude cut with ID _15458_210_8341_1_1530858614263_3834410_49-235472-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 07:34:43,369 WARNING [train.py:1204] (2/4) Exclude cut with ID _64135_210_2253_1_1535677267876_7608972_498-5039-0_sp0.9 from training. Duration: 0.8666875 2023-10-02 07:34:44,731 WARNING [train.py:1204] (2/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_283-220372-0_sp0.9 from training. Duration: 0.62225 2023-10-02 07:34:44,747 WARNING [train.py:1204] (2/4) Exclude cut with ID _6473_210_1741_1_1527901307396_3815380_108-17335-0_sp0.9 from training. Duration: 0.72225 2023-10-02 07:34:46,316 WARNING [train.py:1204] (2/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_113-92608-0 from training. Duration: 0.54 2023-10-02 07:34:46,362 WARNING [train.py:1204] (2/4) Exclude cut with ID _14170_210_5219_1_1530525949868_4469650_272-249985-0_sp1.1 from training. Duration: 0.6090625 2023-10-02 07:34:46,378 WARNING [train.py:1204] (2/4) Exclude cut with ID _50376_210_13401_1_1534031502101_4217740_518-187390-0_sp1.1 from training. Duration: 0.97275 2023-10-02 07:34:47,825 WARNING [train.py:1204] (2/4) Exclude cut with ID _59738_210_15379_1_1534849512562_3941470_165-248260-0_sp1.1 from training. Duration: 0.92725 2023-10-02 07:34:47,835 WARNING [train.py:1204] (2/4) Exclude cut with ID _23818_210_6228_1_1531917063884_3676920_400-343925-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 07:34:49,173 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6961_210_2087_1_1527937168_6815011_562-165518-0 from training. Duration: 0.77 2023-10-02 07:34:49,232 WARNING [train.py:1204] (2/4) Exclude cut with ID _24271_210_12658_1_1531987270022_7190519_289-207931-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 07:34:51,126 WARNING [train.py:1204] (2/4) Exclude cut with ID _14632_210_9988_1_1530691279392_3923200_332-105904-0 from training. Duration: 0.9 2023-10-02 07:34:51,139 WARNING [train.py:1204] (2/4) Exclude cut with ID _40402_210_18219_1_1533522324115_2953150_203-232438-0_sp1.1 from training. Duration: 0.97275 2023-10-02 07:34:52,540 WARNING [train.py:1204] (2/4) Exclude cut with ID _13252_210_1925_1_1530671372047_3336909_263-168388-0 from training. Duration: 0.86 2023-10-02 07:34:52,605 WARNING [train.py:1204] (2/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_174-205058-0 from training. Duration: 0.95 2023-10-02 07:34:54,536 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_94-195801-0_sp1.1 from training. Duration: 0.8545625 2023-10-02 07:34:55,818 WARNING [train.py:1204] (2/4) Exclude cut with ID _55153_210_2253_1_1534384786677_3162096_269-103204-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 07:34:56,078 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.feed_forward1.hidden_balancer.prob, batch_count=800306.6666666666, ans=0.125 2023-10-02 07:34:57,128 WARNING [train.py:1204] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_748-36150-0 from training. Duration: 0.91 2023-10-02 07:34:58,548 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_12868_210_6753_1_1530702479_3610564_148-172861-0_sp1.1 from training. Duration: 0.4545625 2023-10-02 07:34:58,591 WARNING [train.py:1204] (2/4) Exclude cut with ID _67499_210_17282_1_1535509805220_3692430_460-77730-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 07:34:58,865 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.out_combiner.scale_min, batch_count=800306.6666666666, ans=0.2 2023-10-02 07:35:02,159 WARNING [train.py:1204] (2/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_374-20753-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 07:35:02,277 WARNING [train.py:1204] (2/4) Exclude cut with ID _45329_210_7005_1_1534157951211_6774339_374-289983-0_sp1.1 from training. Duration: 0.97275 2023-10-02 07:35:02,307 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_34886_210_10633_1_1533203882_1755661_76-156096-0_sp0.9 from training. Duration: 0.9666875 2023-10-02 07:35:07,876 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_238-256668-0_sp0.9 from training. Duration: 0.7666875 2023-10-02 07:35:09,183 WARNING [train.py:1204] (2/4) Exclude cut with ID _46683_210_9642_1_1533730913492_4591116_489-146727-0_sp1.1 from training. Duration: 0.97275 2023-10-02 07:35:10,698 WARNING [train.py:1204] (2/4) Exclude cut with ID _13938_210_9988_1_1530518718978_3693960_304-112784-0_sp0.9 from training. Duration: 0.511125 2023-10-02 07:35:16,081 INFO [train.py:1046] (2/4) Epoch 23, batch 3200, loss[loss=0.1583, simple_loss=0.2393, pruned_loss=0.03864, over 24353.00 frames. ], tot_loss[loss=0.1711, simple_loss=0.2473, pruned_loss=0.04745, over 4714885.76 frames. ], batch size: 61, lr: 4.45e-03, grad_scale: 16.0 2023-10-02 07:35:16,182 WARNING [train.py:1204] (2/4) Exclude cut with ID _24271_210_12658_1_1531987270022_7190519_243-46549-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 07:35:16,187 WARNING [train.py:1204] (2/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_78-182912-0_sp1.1 from training. Duration: 0.536375 2023-10-02 07:35:20,278 WARNING [train.py:1204] (2/4) Exclude cut with ID _57572_210_5517_1_1534921219624_3809484_121-321334-0_sp1.1 from training. Duration: 0.97275 2023-10-02 07:35:22,371 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_40047_210_2290_1_1533290519_2374311_254-102149-0_sp1.1 from training. Duration: 0.963625 2023-10-02 07:35:22,372 WARNING [train.py:1204] (2/4) Exclude cut with ID _28461_210_2253_1_1532771775189_3041299_96-152158-0 from training. Duration: 0.85 2023-10-02 07:35:25,134 WARNING [train.py:1204] (2/4) Exclude cut with ID _29877_210_6228_1_1532397791106_5276149_161-102979-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 07:35:25,860 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.conv_module2.whiten.whitening_limit, batch_count=800440.0, ans=15.0 2023-10-02 07:35:29,509 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_3787_210_2040_1_1526477544_5478586_603-120324-0_sp0.9 from training. Duration: 0.9 2023-10-02 07:35:32,843 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_41750_210_1960_1_1533520678_3948042_247-213456-0_sp1.1 from training. Duration: 0.97275 2023-10-02 07:35:38,979 WARNING [train.py:1204] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_127-119109-0_sp0.9 from training. Duration: 0.8 2023-10-02 07:35:49,365 WARNING [train.py:1204] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_215-202973-0 from training. Duration: 0.88 2023-10-02 07:35:50,703 WARNING [train.py:1204] (2/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_17-320491-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 07:35:52,935 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.conv_module1.balancer1.prob, batch_count=800573.3333333334, ans=0.125 2023-10-02 07:35:54,058 WARNING [train.py:1204] (2/4) Exclude cut with ID _5166_210_2084_1_1527073197160_4087993_310-114452-0 from training. Duration: 0.59 2023-10-02 07:35:54,599 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.5.encoder.layers.0.feed_forward3.out_whiten, num_groups=1, num_channels=256, metric=11.07 vs. limit=15.0 2023-10-02 07:35:55,432 WARNING [train.py:1204] (2/4) Exclude cut with ID _15133_210_6228_1_1530794512074_3080379_29-264458-0_sp0.9 from training. Duration: 0.6444375 2023-10-02 07:35:56,707 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.549e+02 1.871e+02 2.114e+02 2.432e+02 3.533e+02, threshold=4.228e+02, percent-clipped=0.0 2023-10-02 07:35:58,273 WARNING [train.py:1204] (2/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_159-281310-0_sp1.1 from training. Duration: 0.863625 2023-10-02 07:35:58,282 WARNING [train.py:1204] (2/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_195-167011-0_sp0.9 from training. Duration: 0.8333125 2023-10-02 07:35:58,356 WARNING [train.py:1204] (2/4) Exclude cut with ID _14087_210_2253_1_1530529224512_3014029_156-257607-0_sp1.1 from training. Duration: 0.7545625 2023-10-02 07:35:59,007 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.2.encoder.layers.2.self_attn_weights.whiten_keys, num_groups=4, num_channels=128, metric=3.40 vs. limit=6.0 2023-10-02 07:36:04,069 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_2295_210_2087_1_1525500993_2407863_116-238894-0 from training. Duration: 0.7 2023-10-02 07:36:04,203 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.0.bypass.scale_min, batch_count=800640.0, ans=0.2 2023-10-02 07:36:05,436 WARNING [train.py:1204] (2/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_321-335475-0_sp0.9 from training. Duration: 0.5 2023-10-02 07:36:06,929 WARNING [train.py:1204] (2/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_188-114464-0 from training. Duration: 0.82 2023-10-02 07:36:10,370 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_2592_210_2290_1_1525697025_4882925_157-70512-0 from training. Duration: 0.91 2023-10-02 07:36:11,753 WARNING [train.py:1204] (2/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_138-64210-0_sp1.1 from training. Duration: 0.663625 2023-10-02 07:36:18,619 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_11305_210_1794_1_1529750428_7178148_651-95702-0_sp1.1 from training. Duration: 0.963625 2023-10-02 07:36:18,633 WARNING [train.py:1204] (2/4) Exclude cut with ID _14224_210_4381_1_1530529062037_4264010_379-184223-0_sp0.9 from training. Duration: 0.9444375 2023-10-02 07:36:18,661 WARNING [train.py:1204] (2/4) Exclude cut with ID _8394_210_6702_1_1528590504316_3540189_381-125325-0_sp1.1 from training. Duration: 0.963625 2023-10-02 07:36:18,710 WARNING [train.py:1204] (2/4) Exclude cut with ID IC0622W0021-307659 from training. Duration: 0.896 2023-10-02 07:36:18,713 WARNING [train.py:1204] (2/4) Exclude cut with ID _7624_210_1531_1_1528109802198_4933101_418-349834-0_sp1.1 from training. Duration: 0.5454375 2023-10-02 07:36:23,583 WARNING [train.py:1204] (2/4) Exclude cut with ID _49728_210_12651_1_1534041034294_6788150_607-3416-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 07:36:25,367 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8275_210_4198_1_1528455320_3892571_491-17707-0 from training. Duration: 0.64 2023-10-02 07:36:26,618 WARNING [train.py:1204] (2/4) Exclude cut with ID _5287_210_2084_1_1527418883040_3878670_333-48508-0 from training. Duration: 0.6 2023-10-02 07:36:26,701 WARNING [train.py:1204] (2/4) Exclude cut with ID _8365_210_2064_1_1528592311070_4677690_371-122295-0 from training. Duration: 0.55 2023-10-02 07:36:28,055 WARNING [train.py:1204] (2/4) Exclude cut with ID _14860_210_4915_1_1530702020340_7343240_836-328489-0 from training. Duration: 0.9 2023-10-02 07:36:29,458 WARNING [train.py:1204] (2/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_1033-274376-0_sp0.9 from training. Duration: 0.9666875 2023-10-02 07:36:30,762 INFO [train.py:1046] (2/4) Epoch 23, batch 3250, loss[loss=0.1758, simple_loss=0.244, pruned_loss=0.05378, over 23729.00 frames. ], tot_loss[loss=0.1716, simple_loss=0.2478, pruned_loss=0.04774, over 4717327.48 frames. ], batch size: 232, lr: 4.45e-03, grad_scale: 16.0 2023-10-02 07:36:32,248 WARNING [train.py:1204] (2/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_475-127455-0_sp0.9 from training. Duration: 0.77775 2023-10-02 07:36:32,254 WARNING [train.py:1204] (2/4) Exclude cut with ID IC0671W0071-331914 from training. Duration: 0.896 2023-10-02 07:36:32,273 WARNING [train.py:1204] (2/4) Exclude cut with ID _30327_210_16049_1_1532484161295_3924169_2-1523-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 07:36:32,275 WARNING [train.py:1204] (2/4) Exclude cut with ID _24314_210_4915_1_1532135078116_7170179_460-208279-0_sp1.1 from training. Duration: 0.97275 2023-10-02 07:36:35,333 WARNING [train.py:1204] (2/4) Exclude cut with ID IC0679W0052-335859 from training. Duration: 0.64 2023-10-02 07:36:35,650 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.conv_module1.balancer1.min_positive, batch_count=800773.3333333334, ans=0.025 2023-10-02 07:36:36,889 WARNING [train.py:1204] (2/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_3-237434-0_sp1.1 from training. Duration: 0.6909375 2023-10-02 07:36:40,300 WARNING [train.py:1204] (2/4) Exclude cut with ID _69884_210_9771_1_1535846392818_4166169_405-185283-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 07:36:40,451 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.1.ff3_skip_rate, batch_count=800773.3333333334, ans=0.0 2023-10-02 07:36:40,574 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.bypass_mid.scale_min, batch_count=800773.3333333334, ans=0.2 2023-10-02 07:36:47,421 WARNING [train.py:1204] (2/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_927-83684-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 07:36:47,438 WARNING [train.py:1204] (2/4) Exclude cut with ID _6694_210_4599_1_1527852637490_3965529_356-169233-0 from training. Duration: 0.99 2023-10-02 07:36:48,824 WARNING [train.py:1204] (2/4) Exclude cut with ID _19896_210_1883_1_1531709022662_6649569_562-233001-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 07:36:48,872 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_20120_210_6753_1_1531643035_5482859_354-78629-0_sp1.1 from training. Duration: 0.963625 2023-10-02 07:36:48,873 WARNING [train.py:1204] (2/4) Exclude cut with ID _60135_210_13427_1_1534935557922_3651319_96-168198-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 07:36:50,350 WARNING [train.py:1204] (2/4) Exclude cut with ID _31602_210_5854_1_1532677021456_4458250_249-96604-0_sp1.1 from training. Duration: 0.8090625 2023-10-02 07:36:51,793 WARNING [train.py:1204] (2/4) Exclude cut with ID _14100_210_8341_1_1530529320613_3678069_258-224599-0_sp0.9 from training. Duration: 0.8555625 2023-10-02 07:36:52,035 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.feed_forward2.hidden_balancer.prob, batch_count=800840.0, ans=0.125 2023-10-02 07:36:55,170 WARNING [train.py:1204] (2/4) Exclude cut with ID _19708_210_9206_1_1532136650733_4238364_215-160135-0_sp1.1 from training. Duration: 0.97275 2023-10-02 07:36:55,199 WARNING [train.py:1204] (2/4) Exclude cut with ID _21878_210_3525_1_1531738772002_7264330_113-18163-0_sp0.9 from training. Duration: 0.988875 2023-10-02 07:36:55,359 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.conv_module1.balancer2.prob, batch_count=800840.0, ans=0.125 2023-10-02 07:36:56,533 WARNING [train.py:1204] (2/4) Exclude cut with ID _64135_210_2253_1_1535677267876_7608972_552-337340-0_sp1.1 from training. Duration: 0.92725 2023-10-02 07:36:56,557 WARNING [train.py:1204] (2/4) Exclude cut with ID _67725_210_27767_1_1535608756671_5105359_10-12887-0_sp1.1 from training. Duration: 0.97275 2023-10-02 07:36:56,563 WARNING [train.py:1204] (2/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_401-284840-0_sp1.1 from training. Duration: 0.97275 2023-10-02 07:36:56,582 WARNING [train.py:1204] (2/4) Exclude cut with ID _44659_210_2721_1_1534036120061_6614860_483-141285-0_sp1.1 from training. Duration: 0.936375 2023-10-02 07:36:59,274 WARNING [train.py:1204] (2/4) Exclude cut with ID _9461_210_4557_1_1529060514726_3677451_358-332734-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 07:36:59,370 WARNING [train.py:1204] (2/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_788-298907-0_sp1.1 from training. Duration: 0.8090625 2023-10-02 07:37:02,143 WARNING [train.py:1204] (2/4) Exclude cut with ID _39411_210_5986_1_1533538825030_3343649_354-267196-0_sp1.1 from training. Duration: 0.92725 2023-10-02 07:37:02,164 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_14953_210_2151_1_1530701710_5616165_192-309792-0_sp1.1 from training. Duration: 0.97275 2023-10-02 07:37:04,387 WARNING [train.py:1204] (2/4) Exclude cut with ID _59170_210_6721_1_1534983633960_4203335_290-151255-0_sp1.1 from training. Duration: 0.92725 2023-10-02 07:37:04,417 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_5575_210_1410_1_1527301305_2570170_249-93769-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 07:37:05,528 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_46013_210_7234_1_1533776362_3744032_463-52378-0_sp1.1 from training. Duration: 0.7818125 2023-10-02 07:37:10,132 WARNING [train.py:1204] (2/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_580-93798-0 from training. Duration: 0.56 2023-10-02 07:37:10,188 WARNING [train.py:1204] (2/4) Exclude cut with ID _19514_210_9259_1_1531530071330_5084230_385-13762-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 07:37:10,193 WARNING [train.py:1204] (2/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_458-67321-0_sp1.1 from training. Duration: 0.8818125 2023-10-02 07:37:11,524 WARNING [train.py:1204] (2/4) Exclude cut with ID _41298_210_9019_1_1533430371450_4822143_188-154791-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 07:37:11,592 WARNING [train.py:1204] (2/4) Exclude cut with ID _15037_210_2253_1_1530752294104_3267650_296-192597-0_sp0.9 from training. Duration: 0.9 2023-10-02 07:37:11,714 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.0.conv_module2.balancer1.min_positive, batch_count=800906.6666666666, ans=0.025 2023-10-02 07:37:13,655 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.4.encoder.layers.1.whiten, num_groups=1, num_channels=384, metric=4.25 vs. limit=12.0 2023-10-02 07:37:17,169 WARNING [train.py:1204] (2/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_661-264857-0_sp1.1 from training. Duration: 0.8454375 2023-10-02 07:37:22,667 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.0.layers.1.conv_module1.whiten, num_groups=1, num_channels=192, metric=9.97 vs. limit=15.0 2023-10-02 07:37:25,167 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_34008_210_10431_1_1533345424_8183247_662-317331-0_sp1.1 from training. Duration: 0.8545625 2023-10-02 07:37:25,198 WARNING [train.py:1204] (2/4) Exclude cut with ID _46502_210_13794_1_1533724452198_3657159_241-278329-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 07:37:25,199 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_11349_210_6753_1_1529800861_1101207_26-65602-0 from training. Duration: 0.96 2023-10-02 07:37:25,202 WARNING [train.py:1204] (2/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_448-4048-0_sp1.1 from training. Duration: 0.87275 2023-10-02 07:37:25,215 WARNING [train.py:1204] (2/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_662-275621-0_sp1.1 from training. Duration: 0.3818125 2023-10-02 07:37:25,341 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.1.feed_forward1.out_proj.dropout_p, batch_count=800973.3333333334, ans=0.1 2023-10-02 07:37:26,559 WARNING [train.py:1204] (2/4) Exclude cut with ID _13938_210_9988_1_1530518718978_3693960_256-54454-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 07:37:29,131 WARNING [train.py:1204] (2/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_256-32662-0 from training. Duration: 0.9 2023-10-02 07:37:29,170 WARNING [train.py:1204] (2/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_326-17061-0 from training. Duration: 0.94 2023-10-02 07:37:30,554 WARNING [train.py:1204] (2/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_505-97987-0_sp1.1 from training. Duration: 0.7818125 2023-10-02 07:37:30,647 WARNING [train.py:1204] (2/4) Exclude cut with ID _66873_210_5517_1_1535454136506_4293370_316-294364-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 07:37:30,702 WARNING [train.py:1204] (2/4) Exclude cut with ID _19514_210_9259_1_1531530071330_5084230_762-231063-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 07:37:32,014 WARNING [train.py:1204] (2/4) Exclude cut with ID _4697_210_2069_1_1526815801953_3823098_68-219807-0_sp0.9 from training. Duration: 0.52225 2023-10-02 07:37:32,706 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.3.encoder.layers.0.self_attn_weights.whiten_keys, num_groups=8, num_channels=256, metric=5.13 vs. limit=6.0 2023-10-02 07:37:33,332 WARNING [train.py:1204] (2/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_479-158053-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 07:37:35,414 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.1.nonlin_attention.balancer.min_positive, batch_count=801040.0, ans=0.05 2023-10-02 07:37:37,920 WARNING [train.py:1204] (2/4) Exclude cut with ID _66112_210_10635_1_1535590236305_3801484_83-38181-0_sp1.1 from training. Duration: 0.936375 2023-10-02 07:37:37,950 WARNING [train.py:1204] (2/4) Exclude cut with ID _55857_210_2346_1_1534644007831_6898209_255-146346-0_sp1.1 from training. Duration: 0.963625 2023-10-02 07:37:39,451 WARNING [train.py:1204] (2/4) Exclude cut with ID _48836_210_12210_1_1534075848293_3904150_429-35475-0 from training. Duration: 0.89 2023-10-02 07:37:39,463 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_50154_210_13731_1_1534750862_1080961_119-187163-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 07:37:42,655 WARNING [train.py:1204] (2/4) Exclude cut with ID _8793_210_6310_1_1528542985139_2310009_121-179782-0_sp0.9 from training. Duration: 0.888875 2023-10-02 07:37:42,658 WARNING [train.py:1204] (2/4) Exclude cut with ID _14884_210_2252_1_1530707459689_5561250_591-173406-0 from training. Duration: 0.96 2023-10-02 07:37:45,276 INFO [train.py:1046] (2/4) Epoch 23, batch 3300, loss[loss=0.1455, simple_loss=0.222, pruned_loss=0.03455, over 24372.00 frames. ], tot_loss[loss=0.1716, simple_loss=0.2482, pruned_loss=0.04749, over 4716019.59 frames. ], batch size: 56, lr: 4.45e-03, grad_scale: 16.0 2023-10-02 07:37:45,421 WARNING [train.py:1204] (2/4) Exclude cut with ID _15133_210_6228_1_1530794512074_3080379_54-198193-0_sp1.1 from training. Duration: 0.8545625 2023-10-02 07:37:45,429 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6545_210_2040_1_1527774670_4245258_407-48070-0 from training. Duration: 0.76 2023-10-02 07:37:45,909 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.5.encoder.layers.0.conv_module2.whiten, num_groups=1, num_channels=256, metric=4.32 vs. limit=15.0 2023-10-02 07:37:48,078 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1032-236863-0 from training. Duration: 0.84 2023-10-02 07:37:48,155 WARNING [train.py:1204] (2/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_507-303370-0 from training. Duration: 0.96 2023-10-02 07:37:49,499 WARNING [train.py:1204] (2/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_164-282366-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 07:37:49,708 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.out_combiner.scale_min, batch_count=801106.6666666666, ans=0.2 2023-10-02 07:37:52,362 WARNING [train.py:1204] (2/4) Exclude cut with ID _39867_210_9826_1_1533553222030_9344259_312-158727-0_sp1.1 from training. Duration: 0.963625 2023-10-02 07:37:53,723 WARNING [train.py:1204] (2/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_174-205058-0_sp1.1 from training. Duration: 0.863625 2023-10-02 07:37:53,749 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_11305_210_1794_1_1529750428_7178148_166-237358-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 07:37:55,776 WARNING [train.py:1204] (2/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_26-175553-0_sp1.1 from training. Duration: 0.5545625 2023-10-02 07:37:55,800 WARNING [train.py:1204] (2/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_96-290039-0_sp1.1 from training. Duration: 0.5090625 2023-10-02 07:37:58,503 WARNING [train.py:1204] (2/4) Exclude cut with ID _53219_210_4525_1_1534834836008_4064825_261-101691-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 07:38:01,210 WARNING [train.py:1204] (2/4) Exclude cut with ID _24271_210_12658_1_1531987270022_7190519_96-293390-0_sp1.1 from training. Duration: 0.936375 2023-10-02 07:38:02,750 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_271-234962-0 from training. Duration: 0.79 2023-10-02 07:38:04,453 WARNING [train.py:1204] (2/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_136-30660-0_sp1.1 from training. Duration: 0.97275 2023-10-02 07:38:04,475 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_20120_210_6753_1_1531643035_5482859_625-281046-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 07:38:07,295 WARNING [train.py:1204] (2/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_333-168193-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 07:38:07,338 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9029W0269-509664 from training. Duration: 0.896 2023-10-02 07:38:08,776 WARNING [train.py:1204] (2/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_450-147393-0_sp1.1 from training. Duration: 0.92725 2023-10-02 07:38:08,830 WARNING [train.py:1204] (2/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_643-52007-0_sp1.1 from training. Duration: 0.5181875 2023-10-02 07:38:09,655 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.2.encoder.layers.0.nonlin_attention.whiten2, num_groups=1, num_channels=384, metric=10.78 vs. limit=15.0 2023-10-02 07:38:10,241 WARNING [train.py:1204] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_330-327861-0_sp0.9 from training. Duration: 0.7444375 2023-10-02 07:38:10,252 WARNING [train.py:1204] (2/4) Exclude cut with ID _47790_210_16187_1_1533862814066_4207519_260-317651-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 07:38:10,275 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9044W0010-516654 from training. Duration: 0.896 2023-10-02 07:38:14,364 WARNING [train.py:1204] (2/4) Exclude cut with ID _14087_210_2253_1_1530529224512_3014029_265-118378-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 07:38:14,370 WARNING [train.py:1204] (2/4) Exclude cut with ID _6194_210_1534_1_1527511826160_3941194_321-148641-0_sp0.9 from training. Duration: 0.711125 2023-10-02 07:38:15,864 WARNING [train.py:1204] (2/4) Exclude cut with ID _54871_210_22356_1_1534665798253_3856359_101-169176-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 07:38:17,669 WARNING [train.py:1204] (2/4) Exclude cut with ID 497-129325-0061-62254-0_sp1.1 from training. Duration: 0.97725 2023-10-02 07:38:19,043 WARNING [train.py:1204] (2/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_795-33252-0_sp1.1 from training. Duration: 0.336375 2023-10-02 07:38:19,060 WARNING [train.py:1204] (2/4) Exclude cut with ID _28317_210_12742_1_1532480302381_8510325_168-246176-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 07:38:20,476 WARNING [train.py:1204] (2/4) Exclude cut with ID _7160_210_1876_1_1527940923231_3703799_120-150943-0_sp1.1 from training. Duration: 0.736375 2023-10-02 07:38:20,612 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9084W0005-536844 from training. Duration: 0.768 2023-10-02 07:38:22,032 WARNING [train.py:1204] (2/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_234-260682-0 from training. Duration: 0.84 2023-10-02 07:38:22,249 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.conv_module1.balancer2.prob, batch_count=801240.0, ans=0.125 2023-10-02 07:38:23,443 WARNING [train.py:1204] (2/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_188-114464-0_sp0.9 from training. Duration: 0.911125 2023-10-02 07:38:24,713 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.528e+02 1.780e+02 1.961e+02 2.106e+02 2.870e+02, threshold=3.921e+02, percent-clipped=0.0 2023-10-02 07:38:26,682 WARNING [train.py:1204] (2/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_81-75849-0 from training. Duration: 0.39 2023-10-02 07:38:28,124 WARNING [train.py:1204] (2/4) Exclude cut with ID _14224_210_4381_1_1530529062037_4264010_379-184223-0_sp1.1 from training. Duration: 0.77275 2023-10-02 07:38:29,650 WARNING [train.py:1204] (2/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_454-123002-0_sp1.1 from training. Duration: 0.62725 2023-10-02 07:38:31,018 WARNING [train.py:1204] (2/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_887-249555-0_sp1.1 from training. Duration: 0.7 2023-10-02 07:38:33,743 WARNING [train.py:1204] (2/4) Exclude cut with ID _31088_210_16446_1_1532520012284_3440919_20-260902-0_sp1.1 from training. Duration: 0.97275 2023-10-02 07:38:33,792 WARNING [train.py:1204] (2/4) Exclude cut with ID _45329_210_7005_1_1534157951211_6774339_856-344574-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 07:38:33,794 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_36756_210_7234_1_1533026991_966276_4-164118-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 07:38:35,230 WARNING [train.py:1204] (2/4) Exclude cut with ID _8365_210_2064_1_1528592311070_4677690_371-122295-0_sp1.1 from training. Duration: 0.5 2023-10-02 07:38:37,260 WARNING [train.py:1204] (2/4) Exclude cut with ID _5910_210_1389_1_1527337972454_5050100_555-159815-0_sp1.1 from training. Duration: 0.8909375 2023-10-02 07:38:37,269 WARNING [train.py:1204] (2/4) Exclude cut with ID _37086_210_9038_1_1532999540891_7021250_469-147941-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 07:38:38,635 WARNING [train.py:1204] (2/4) Exclude cut with ID _3067_210_2667_1_1526212974640_4503168_459-173867-0_sp0.9 from training. Duration: 0.97775 2023-10-02 07:38:41,370 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9156W0006-572499 from training. Duration: 0.896 2023-10-02 07:38:41,458 WARNING [train.py:1204] (2/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_277-210798-0 from training. Duration: 0.55 2023-10-02 07:38:44,102 WARNING [train.py:1204] (2/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_103-336064-0_sp1.1 from training. Duration: 0.463625 2023-10-02 07:38:44,150 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_3414_210_1988_1_1526212718_4813195_331-290945-0_sp1.1 from training. Duration: 0.936375 2023-10-02 07:38:44,151 WARNING [train.py:1204] (2/4) Exclude cut with ID _67499_210_17282_1_1535509805220_3692430_365-29276-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 07:38:45,627 WARNING [train.py:1204] (2/4) Exclude cut with ID _14821_210_8750_1_1530680503991_6829359_69-250847-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 07:38:45,628 WARNING [train.py:1204] (2/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_1102-61301-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 07:38:47,066 WARNING [train.py:1204] (2/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_166-246557-0_sp1.1 from training. Duration: 0.5818125 2023-10-02 07:38:47,114 WARNING [train.py:1204] (2/4) Exclude cut with ID _15351_210_10626_1_1530856635653_3576210_214-129918-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 07:38:48,896 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_120-41668-0_sp0.9 from training. Duration: 0.77775 2023-10-02 07:38:48,975 WARNING [train.py:1204] (2/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_27-72278-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 07:38:50,862 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.3.encoder.layers.1.feed_forward2.out_whiten, num_groups=1, num_channels=512, metric=5.08 vs. limit=15.0 2023-10-02 07:38:51,442 WARNING [train.py:1204] (2/4) Exclude cut with ID _24314_210_4915_1_1532135078116_7170179_521-258868-0_sp1.1 from training. Duration: 0.7181875 2023-10-02 07:38:54,182 WARNING [train.py:1204] (2/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_143-86344-0 from training. Duration: 0.77 2023-10-02 07:38:56,055 WARNING [train.py:1204] (2/4) Exclude cut with ID _53489_210_6228_1_1534298357169_3835970_465-157433-0_sp1.1 from training. Duration: 0.92725 2023-10-02 07:38:56,143 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_248-319353-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 07:38:56,682 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.4.encoder.layers.2.feed_forward1.out_whiten, num_groups=1, num_channels=384, metric=10.65 vs. limit=15.0 2023-10-02 07:38:59,501 INFO [train.py:1046] (2/4) Epoch 23, batch 3350, loss[loss=0.1569, simple_loss=0.2398, pruned_loss=0.03704, over 24484.00 frames. ], tot_loss[loss=0.1721, simple_loss=0.2486, pruned_loss=0.04783, over 4717482.39 frames. ], batch size: 63, lr: 4.45e-03, grad_scale: 16.0 2023-10-02 07:38:59,554 WARNING [train.py:1204] (2/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_102-77064-0_sp1.1 from training. Duration: 0.8454375 2023-10-02 07:38:59,586 WARNING [train.py:1204] (2/4) Exclude cut with ID _9389_210_1664_1_1529217337301_5289269_379-128794-0_sp1.1 from training. Duration: 0.77275 2023-10-02 07:39:00,944 WARNING [train.py:1204] (2/4) Exclude cut with ID _3231_210_2106_1_1526205864934_6784220_236-260835-0_sp1.1 from training. Duration: 0.97275 2023-10-02 07:39:03,648 WARNING [train.py:1204] (2/4) Exclude cut with ID _24271_210_12658_1_1531987270022_7190519_184-342363-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 07:39:03,649 WARNING [train.py:1204] (2/4) Exclude cut with ID _27894_210_12392_1_1532415496078_6902439_67-221663-0_sp1.1 from training. Duration: 0.92725 2023-10-02 07:39:05,200 WARNING [train.py:1204] (2/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_304-59227-0_sp1.1 from training. Duration: 0.7 2023-10-02 07:39:05,304 WARNING [train.py:1204] (2/4) Exclude cut with ID _58605_210_12210_1_1534723195939_3904259_458-42232-0_sp1.1 from training. Duration: 0.92725 2023-10-02 07:39:06,619 WARNING [train.py:1204] (2/4) Exclude cut with ID _14922_210_6228_1_1530707435273_3693310_13-165512-0_sp0.9 from training. Duration: 0.911125 2023-10-02 07:39:10,004 WARNING [train.py:1204] (2/4) Exclude cut with ID _58602_210_2346_1_1534903210085_6908830_792-102241-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 07:39:11,195 WARNING [train.py:1204] (2/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_588-125165-0_sp0.9 from training. Duration: 0.92225 2023-10-02 07:39:12,551 WARNING [train.py:1204] (2/4) Exclude cut with ID _54218_210_12210_1_1534413646388_4409259_239-98429-0_sp1.1 from training. Duration: 0.97275 2023-10-02 07:39:12,599 WARNING [train.py:1204] (2/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_538-32291-0_sp1.1 from training. Duration: 0.7545625 2023-10-02 07:39:13,961 WARNING [train.py:1204] (2/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_419-317062-0 from training. Duration: 0.91 2023-10-02 07:39:16,810 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9309W0202-640329 from training. Duration: 0.896 2023-10-02 07:39:16,852 WARNING [train.py:1204] (2/4) Exclude cut with ID _3231_210_2106_1_1526205864934_6784220_419-313291-0_sp1.1 from training. Duration: 0.97275 2023-10-02 07:39:19,471 WARNING [train.py:1204] (2/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_619-344499-0 from training. Duration: 0.63 2023-10-02 07:39:19,483 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_278-272612-0 from training. Duration: 0.73 2023-10-02 07:39:20,848 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_103-84010-0_sp0.9 from training. Duration: 0.7666875 2023-10-02 07:39:20,869 WARNING [train.py:1204] (2/4) Exclude cut with ID _26528_210_9070_1_1532570410842_3851310_497-97218-0_sp1.1 from training. Duration: 0.7818125 2023-10-02 07:39:22,877 WARNING [train.py:1204] (2/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_262-84613-0_sp1.1 from training. Duration: 0.936375 2023-10-02 07:39:22,925 WARNING [train.py:1204] (2/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_387-131538-0 from training. Duration: 0.81 2023-10-02 07:39:23,161 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.conv_module2.balancer1.prob, batch_count=801506.6666666666, ans=0.125 2023-10-02 07:39:24,247 WARNING [train.py:1204] (2/4) Exclude cut with ID _8030_210_7497_1_1528444886781_3531089_224-300489-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 07:39:24,263 WARNING [train.py:1204] (2/4) Exclude cut with ID _14349_210_8341_1_1530615710818_3442470_170-336541-0_sp0.9 from training. Duration: 0.9555625 2023-10-02 07:39:26,340 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.2.encoder.layers.2.feed_forward3.out_whiten, num_groups=1, num_channels=384, metric=10.81 vs. limit=15.0 2023-10-02 07:39:27,013 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_47069_210_15368_1_1533781267_4431320_180-31746-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 07:39:29,045 WARNING [train.py:1204] (2/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_447-7041-0_sp1.1 from training. Duration: 0.92725 2023-10-02 07:39:29,073 WARNING [train.py:1204] (2/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_2-55283-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 07:39:29,139 WARNING [train.py:1204] (2/4) Exclude cut with ID _12555_210_8750_1_1530075594402_3780239_70-181911-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 07:39:32,010 WARNING [train.py:1204] (2/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_311-328022-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 07:39:33,479 WARNING [train.py:1204] (2/4) Exclude cut with ID _14528_210_8254_1_1530669700514_4306939_330-2987-0_sp1.1 from training. Duration: 0.92725 2023-10-02 07:39:34,803 WARNING [train.py:1204] (2/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_385-184858-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 07:39:37,532 WARNING [train.py:1204] (2/4) Exclude cut with ID _64414_210_17301_1_1535243252048_3757759_348-333360-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 07:39:39,551 WARNING [train.py:1204] (2/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_753-214751-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 07:39:42,260 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_17613_210_2151_1_1531137569_2366160_202-208013-0_sp1.1 from training. Duration: 0.92725 2023-10-02 07:39:42,275 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_43831_210_2158_1_1533725502_5872791_610-129300-0_sp1.1 from training. Duration: 0.936375 2023-10-02 07:39:44,921 WARNING [train.py:1204] (2/4) Exclude cut with ID _54218_210_12210_1_1534413646388_4409259_321-23852-0_sp1.1 from training. Duration: 0.936375 2023-10-02 07:39:47,602 WARNING [train.py:1204] (2/4) Exclude cut with ID _39411_210_5986_1_1533538825030_3343649_173-238248-0 from training. Duration: 0.83 2023-10-02 07:39:47,611 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_405-339986-0_sp0.9 from training. Duration: 0.5666875 2023-10-02 07:39:47,654 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_2009_210_2071_1_1525481134_4588937_385-302100-0 from training. Duration: 0.87 2023-10-02 07:39:49,026 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_161-276996-0_sp1.1 from training. Duration: 0.863625 2023-10-02 07:39:49,120 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_177-58005-0 from training. Duration: 0.62 2023-10-02 07:39:50,468 WARNING [train.py:1204] (2/4) Exclude cut with ID _64639_210_5816_1_1535367619437_3528000_146-343734-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 07:39:52,325 WARNING [train.py:1204] (2/4) Exclude cut with ID _3845_210_1390_1_1526731326141_3523610_72-61441-0_sp1.1 from training. Duration: 0.92725 2023-10-02 07:39:59,683 WARNING [train.py:1204] (2/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_991-125028-0_sp1.1 from training. Duration: 0.936375 2023-10-02 07:39:59,769 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7544_210_6753_1_1528591327_1471426_172-348777-0 from training. Duration: 0.66 2023-10-02 07:40:01,740 WARNING [train.py:1204] (2/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_416-257730-0_sp1.1 from training. Duration: 0.5909375 2023-10-02 07:40:03,175 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1113-7369-0_sp1.1 from training. Duration: 0.77275 2023-10-02 07:40:03,265 WARNING [train.py:1204] (2/4) Exclude cut with ID _40402_210_18219_1_1533522324115_2953150_242-76943-0_sp1.1 from training. Duration: 0.8545625 2023-10-02 07:40:08,922 WARNING [train.py:1204] (2/4) Exclude cut with ID _44718_210_5816_1_1534050051869_6469030_190-49648-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 07:40:09,712 WARNING [train.py:1204] (2/4) Exclude cut with ID _47251_210_18628_1_1533814063128_4137250_214-88076-0 from training. Duration: 0.88 2023-10-02 07:40:11,010 WARNING [train.py:1204] (2/4) Exclude cut with ID _5585_210_2667_1_1527418813324_8007340_89-332072-0_sp1.1 from training. Duration: 0.5818125 2023-10-02 07:40:11,055 WARNING [train.py:1204] (2/4) Exclude cut with ID _41817_210_18835_1_1533434477160_7423049_555-293448-0_sp1.1 from training. Duration: 0.72725 2023-10-02 07:40:12,499 WARNING [train.py:1204] (2/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_286-331314-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 07:40:12,552 WARNING [train.py:1204] (2/4) Exclude cut with ID _13921_210_9994_1_1530838702800_3003429_46-149444-0 from training. Duration: 0.94 2023-10-02 07:40:12,612 WARNING [train.py:1204] (2/4) Exclude cut with ID _44778_210_2054_1_1533798127203_7443090_768-43595-0_sp1.1 from training. Duration: 0.936375 2023-10-02 07:40:12,628 WARNING [train.py:1204] (2/4) Exclude cut with ID _35355_210_16040_1_1533212988977_4016229_74-190076-0 from training. Duration: 0.8 2023-10-02 07:40:13,872 INFO [train.py:1046] (2/4) Epoch 23, batch 3400, loss[loss=0.1731, simple_loss=0.2612, pruned_loss=0.04253, over 24461.00 frames. ], tot_loss[loss=0.173, simple_loss=0.2494, pruned_loss=0.04825, over 4717665.95 frames. ], batch size: 66, lr: 4.45e-03, grad_scale: 16.0 2023-10-02 07:40:15,252 WARNING [train.py:1204] (2/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_1007-316380-0_sp1.1 from training. Duration: 0.963625 2023-10-02 07:40:15,304 WARNING [train.py:1204] (2/4) Exclude cut with ID _23557_210_8750_1_1531890106370_6846255_150-283437-0_sp1.1 from training. Duration: 0.963625 2023-10-02 07:40:15,347 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_3474_210_2028_1_1526188513_4825246_145-309131-0_sp1.1 from training. Duration: 0.67275 2023-10-02 07:40:17,951 WARNING [train.py:1204] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_391-22148-0_sp1.1 from training. Duration: 0.7 2023-10-02 07:40:17,968 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_238-256668-0 from training. Duration: 0.69 2023-10-02 07:40:19,724 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.nonlin_attention.balancer.max_positive, batch_count=801773.3333333334, ans=0.95 2023-10-02 07:40:20,908 WARNING [train.py:1204] (2/4) Exclude cut with ID _8394_210_6702_1_1528590504316_3540189_372-198878-0 from training. Duration: 0.6 2023-10-02 07:40:20,919 WARNING [train.py:1204] (2/4) Exclude cut with ID ID0174W0006-766659 from training. Duration: 0.953 2023-10-02 07:40:20,934 WARNING [train.py:1204] (2/4) Exclude cut with ID _6274_210_2084_1_1527937583837_7494020_668-23019-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 07:40:23,020 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.balancer1.prob, batch_count=801773.3333333334, ans=0.125 2023-10-02 07:40:25,522 WARNING [train.py:1204] (2/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_322-74328-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 07:40:26,778 WARNING [train.py:1204] (2/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_872-264525-0_sp1.1 from training. Duration: 0.5909375 2023-10-02 07:40:26,844 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_53378_210_17282_1_1534221071_5769509_641-286201-0_sp1.1 from training. Duration: 0.97275 2023-10-02 07:40:28,253 WARNING [train.py:1204] (2/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_199-295611-0_sp1.1 from training. Duration: 0.5 2023-10-02 07:40:34,323 WARNING [train.py:1204] (2/4) Exclude cut with ID _5250_210_5219_1_1527249561806_8692110_1270-317971-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 07:40:34,444 WARNING [train.py:1204] (2/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_784-213099-0 from training. Duration: 0.93 2023-10-02 07:40:38,707 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_115-233929-0_sp0.9 from training. Duration: 0.8 2023-10-02 07:40:40,945 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.feed_forward1.out_proj.dropout_p, batch_count=801840.0, ans=0.1 2023-10-02 07:40:42,095 WARNING [train.py:1204] (2/4) Exclude cut with ID _54871_210_22356_1_1534665798253_3856359_256-223809-0_sp1.1 from training. Duration: 0.97275 2023-10-02 07:40:42,148 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_20120_210_6753_1_1531643035_5482859_285-87781-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 07:40:43,606 WARNING [train.py:1204] (2/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_619-344499-0_sp1.1 from training. Duration: 0.57275 2023-10-02 07:40:49,258 WARNING [train.py:1204] (2/4) Exclude cut with ID _60229_210_27767_1_1534838406158_3655350_264-279573-0_sp0.9 from training. Duration: 0.92225 2023-10-02 07:40:54,711 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.476e+02 1.893e+02 2.076e+02 2.373e+02 3.275e+02, threshold=4.152e+02, percent-clipped=0.0 2023-10-02 07:40:54,798 WARNING [train.py:1204] (2/4) Exclude cut with ID _7719_210_3525_1_1528456153163_4296960_333-110399-0 from training. Duration: 0.96 2023-10-02 07:41:00,794 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_50423_210_13270_1_1534128598_5021734_759-90833-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 07:41:00,849 WARNING [train.py:1204] (2/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_151-254798-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 07:41:02,784 WARNING [train.py:1204] (2/4) Exclude cut with ID _3465_210_3525_1_1526175667060_3574049_450-203637-0 from training. Duration: 0.92 2023-10-02 07:41:02,812 WARNING [train.py:1204] (2/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_673-36456-0_sp1.1 from training. Duration: 0.936375 2023-10-02 07:41:02,867 WARNING [train.py:1204] (2/4) Exclude cut with ID _36538_210_9994_1_1533204020363_3795620_131-8685-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 07:41:04,196 WARNING [train.py:1204] (2/4) Exclude cut with ID _12556_210_8341_1_1530081024100_7327690_713-55626-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 07:41:04,243 WARNING [train.py:1204] (2/4) Exclude cut with ID _2322_210_2667_1_1525604548971_3656299_440-80494-0_sp0.9 from training. Duration: 0.97775 2023-10-02 07:41:05,851 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=801973.3333333334, ans=0.1 2023-10-02 07:41:07,026 WARNING [train.py:1204] (2/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_210-325081-0_sp1.1 from training. Duration: 0.97275 2023-10-02 07:41:11,276 WARNING [train.py:1204] (2/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_375-244976-0_sp0.9 from training. Duration: 0.8333125 2023-10-02 07:41:11,284 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_15517_210_6753_1_1530952293_1664795_119-31001-0_sp1.1 from training. Duration: 0.8545625 2023-10-02 07:41:17,437 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_41452_210_7291_1_1533437687_3941281_319-42326-0_sp1.1 from training. Duration: 0.92725 2023-10-02 07:41:18,254 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.2.encoder.layers.0.self_attn2.whiten, num_groups=1, num_channels=384, metric=18.21 vs. limit=22.5 2023-10-02 07:41:18,892 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_372-173012-0 from training. Duration: 0.95 2023-10-02 07:41:20,578 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=802040.0, ans=0.1 2023-10-02 07:41:23,186 WARNING [train.py:1204] (2/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_41-31157-0_sp1.1 from training. Duration: 0.5181875 2023-10-02 07:41:27,210 INFO [train.py:1046] (2/4) Epoch 23, batch 3450, loss[loss=0.1676, simple_loss=0.2286, pruned_loss=0.05333, over 23704.00 frames. ], tot_loss[loss=0.1734, simple_loss=0.2495, pruned_loss=0.04861, over 4720149.21 frames. ], batch size: 232, lr: 4.45e-03, grad_scale: 8.0 2023-10-02 07:41:27,302 WARNING [train.py:1204] (2/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_91-212486-0 from training. Duration: 0.68 2023-10-02 07:41:30,867 WARNING [train.py:1204] (2/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_1267-272693-0 from training. Duration: 0.73 2023-10-02 07:41:30,893 WARNING [train.py:1204] (2/4) Exclude cut with ID _13938_210_9988_1_1530518718978_3693960_152-67046-0_sp1.1 from training. Duration: 0.936375 2023-10-02 07:41:32,354 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_4528_210_3273_1_1527076654_3276777_342-168068-0_sp1.1 from training. Duration: 0.7909375 2023-10-02 07:41:34,188 WARNING [train.py:1204] (2/4) Exclude cut with ID _7606_210_4915_1_1528894253277_5080690_127-177761-0 from training. Duration: 0.64 2023-10-02 07:41:34,270 WARNING [train.py:1204] (2/4) Exclude cut with ID _45329_210_7005_1_1534157951211_6774339_335-137972-0_sp1.1 from training. Duration: 0.92725 2023-10-02 07:41:38,429 WARNING [train.py:1204] (2/4) Exclude cut with ID _5166_210_2084_1_1527073197160_4087993_331-117829-0_sp1.1 from training. Duration: 0.563625 2023-10-02 07:41:42,980 WARNING [train.py:1204] (2/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_125-125818-0_sp1.1 from training. Duration: 0.87275 2023-10-02 07:41:45,653 WARNING [train.py:1204] (2/4) Exclude cut with ID _6789_210_1534_1_1527854471671_2870519_48-294897-0_sp1.1 from training. Duration: 0.963625 2023-10-02 07:41:45,728 WARNING [train.py:1204] (2/4) Exclude cut with ID _6694_210_4599_1_1527852637490_3965529_116-109398-0_sp1.1 from training. Duration: 0.663625 2023-10-02 07:41:45,739 WARNING [train.py:1204] (2/4) Exclude cut with ID _52892_210_6721_1_1534378941259_4124129_222-172685-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 07:41:47,224 WARNING [train.py:1204] (2/4) Exclude cut with ID _50655_210_18628_1_1534229942193_3772154_320-132275-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 07:41:50,214 INFO [scaling.py:1118] (2/4) WithLoss: name=encoder.encoders.0.layers.1.self_attn_weights, loss-sum=0.000e+00 2023-10-02 07:41:52,764 WARNING [train.py:1204] (2/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_76-311963-0 from training. Duration: 0.7 2023-10-02 07:41:58,975 WARNING [train.py:1204] (2/4) Exclude cut with ID _6274_210_2084_1_1527937583837_7494020_32-205288-0 from training. Duration: 0.78 2023-10-02 07:41:58,992 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_86-16881-0_sp0.9 from training. Duration: 0.6444375 2023-10-02 07:41:59,043 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_271-297505-0_sp1.1 from training. Duration: 0.9 2023-10-02 07:42:00,422 WARNING [train.py:1204] (2/4) Exclude cut with ID _57572_210_5517_1_1534921219624_3809484_187-295293-0_sp1.1 from training. Duration: 0.963625 2023-10-02 07:42:06,997 WARNING [train.py:1204] (2/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_563-329943-0 from training. Duration: 0.99 2023-10-02 07:42:07,067 WARNING [train.py:1204] (2/4) Exclude cut with ID _7160_210_1876_1_1527940923231_3703799_302-150728-0_sp0.9 from training. Duration: 0.8666875 2023-10-02 07:42:11,403 WARNING [train.py:1204] (2/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_926-7526-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 07:42:12,776 WARNING [train.py:1204] (2/4) Exclude cut with ID _62574_210_2655_1_1535180407058_3933249_467-22348-0_sp1.1 from training. Duration: 0.936375 2023-10-02 07:42:13,548 WARNING [train.py:1204] (2/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_280-161633-0_sp0.9 from training. Duration: 0.811125 2023-10-02 07:42:13,757 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.conv_module1.balancer2.prob, batch_count=802306.6666666666, ans=0.125 2023-10-02 07:42:14,757 WARNING [train.py:1204] (2/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_385-193052-0_sp1.1 from training. Duration: 0.7818125 2023-10-02 07:42:17,501 WARNING [train.py:1204] (2/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_444-28041-0 from training. Duration: 0.85 2023-10-02 07:42:17,506 WARNING [train.py:1204] (2/4) Exclude cut with ID _66070_210_7640_1_1535441393096_7805310_341-249237-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 07:42:17,586 WARNING [train.py:1204] (2/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_425-269826-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 07:42:20,384 WARNING [train.py:1204] (2/4) Exclude cut with ID _53950_210_5816_1_1534417264919_3862951_124-185597-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 07:42:22,640 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.1.encoder.layers.0.self_attn1.whiten, num_groups=1, num_channels=256, metric=12.45 vs. limit=22.5 2023-10-02 07:42:23,092 WARNING [train.py:1204] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_380-204756-0 from training. Duration: 0.86 2023-10-02 07:42:25,810 WARNING [train.py:1204] (2/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_282-257791-0_sp0.9 from training. Duration: 0.9555625 2023-10-02 07:42:30,086 WARNING [train.py:1204] (2/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_372-131163-0_sp1.1 from training. Duration: 0.8545625 2023-10-02 07:42:32,121 WARNING [train.py:1204] (2/4) Exclude cut with ID _28317_210_12742_1_1532480302381_8510325_447-233513-0_sp1.1 from training. Duration: 0.963625 2023-10-02 07:42:35,944 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_54-118333-0_sp1.1 from training. Duration: 0.97275 2023-10-02 07:42:40,102 WARNING [train.py:1204] (2/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_388-230239-0_sp1.1 from training. Duration: 0.963625 2023-10-02 07:42:40,113 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_40047_210_2290_1_1533290519_2374311_141-54639-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 07:42:40,178 WARNING [train.py:1204] (2/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_91-267696-0_sp0.9 from training. Duration: 0.9666875 2023-10-02 07:42:41,488 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_14056_210_4163_1_1530604483_7572827_532-137847-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 07:42:43,272 INFO [train.py:1046] (2/4) Epoch 23, batch 3500, loss[loss=0.1546, simple_loss=0.2, pruned_loss=0.05461, over 19355.00 frames. ], tot_loss[loss=0.1726, simple_loss=0.2483, pruned_loss=0.04845, over 4710267.02 frames. ], batch size: 388, lr: 4.45e-03, grad_scale: 8.0 2023-10-02 07:42:44,794 WARNING [train.py:1204] (2/4) Exclude cut with ID _8064_210_5399_1_1528372299291_4282973_155-283231-0_sp1.1 from training. Duration: 0.97275 2023-10-02 07:42:47,531 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_2245_210_2715_1_1525511548_3540254_310-52483-0_sp1.1 from training. Duration: 0.8 2023-10-02 07:42:48,844 WARNING [train.py:1204] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_185-174805-0 from training. Duration: 0.68 2023-10-02 07:42:48,981 WARNING [train.py:1204] (2/4) Exclude cut with ID _15012_210_1968_1_1530856820429_5095350_662-210469-0_sp1.1 from training. Duration: 0.5181875 2023-10-02 07:42:53,211 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_426-313557-0_sp0.9 from training. Duration: 0.588875 2023-10-02 07:42:57,248 WARNING [train.py:1204] (2/4) Exclude cut with ID _42326_210_7770_1_1533814282531_3707269_407-204343-0_sp1.1 from training. Duration: 0.97275 2023-10-02 07:42:57,256 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_258-310570-0 from training. Duration: 0.82 2023-10-02 07:43:01,442 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_4737_210_2040_1_1526998689_3293008_378-178650-0_sp1.1 from training. Duration: 0.863625 2023-10-02 07:43:01,524 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_47896_210_20508_1_1534492518_5958868_432-327451-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 07:43:03,253 WARNING [train.py:1204] (2/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_188-114464-0_sp1.1 from training. Duration: 0.7454375 2023-10-02 07:43:03,260 WARNING [train.py:1204] (2/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_798-106179-0_sp1.1 from training. Duration: 0.7545625 2023-10-02 07:43:03,287 WARNING [train.py:1204] (2/4) Exclude cut with ID _14368_210_4220_1_1530532714870_3595529_143-117421-0_sp1.1 from training. Duration: 0.52725 2023-10-02 07:43:03,325 WARNING [train.py:1204] (2/4) Exclude cut with ID _67499_210_17282_1_1535509805220_3692430_361-78786-0_sp1.1 from training. Duration: 0.963625 2023-10-02 07:43:03,352 WARNING [train.py:1204] (2/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_712-342563-0_sp1.1 from training. Duration: 0.8818125 2023-10-02 07:43:03,480 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.1.conv_skip_rate, batch_count=802506.6666666666, ans=0.0 2023-10-02 07:43:05,363 WARNING [train.py:1204] (2/4) Exclude cut with ID _9235_210_5219_1_1528891154253_8046120_387-159008-0 from training. Duration: 0.86 2023-10-02 07:43:06,724 WARNING [train.py:1204] (2/4) Exclude cut with ID _33640_210_2655_1_1533117605479_4855469_959-18820-0_sp1.1 from training. Duration: 0.963625 2023-10-02 07:43:06,760 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_12868_210_6753_1_1530702479_3610564_133-564-0_sp0.9 from training. Duration: 0.87775 2023-10-02 07:43:09,366 WARNING [train.py:1204] (2/4) Exclude cut with ID _23514_210_1389_1_1531832139785_3031439_12-29287-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 07:43:13,902 WARNING [train.py:1204] (2/4) Exclude cut with ID _23514_210_1389_1_1531832139785_3031439_489-185052-0_sp1.1 from training. Duration: 0.963625 2023-10-02 07:43:13,956 WARNING [train.py:1204] (2/4) Exclude cut with ID _5444_210_2151_1_1527418808464_5312210_634-194174-0 from training. Duration: 0.97 2023-10-02 07:43:13,993 WARNING [train.py:1204] (2/4) Exclude cut with ID _6275_210_2084_1_1528110062213_7669049_91-26919-0_sp1.1 from training. Duration: 0.8818125 2023-10-02 07:43:16,818 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_37351_210_7925_1_1533012931_3812113_516-82303-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 07:43:18,265 WARNING [train.py:1204] (2/4) Exclude cut with ID _41314_210_12651_1_1533459476543_3766460_80-74184-0_sp0.9 from training. Duration: 0.92225 2023-10-02 07:43:18,543 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.feed_forward1.hidden_balancer.prob, batch_count=802573.3333333334, ans=0.125 2023-10-02 07:43:19,603 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_9460_210_4211_1_1528954587_10049667_495-202963-0_sp1.1 from training. Duration: 0.963625 2023-10-02 07:43:19,708 WARNING [train.py:1204] (2/4) Exclude cut with ID _14632_210_9988_1_1530691279392_3923200_332-105904-0_sp1.1 from training. Duration: 0.8181875 2023-10-02 07:43:19,729 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_40764_210_1925_1_1533882359_3752345_493-167886-0_sp1.1 from training. Duration: 0.92725 2023-10-02 07:43:19,924 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.feed_forward1.out_proj.dropout_p, batch_count=802573.3333333334, ans=0.1 2023-10-02 07:43:21,208 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_196-289637-0 from training. Duration: 0.75 2023-10-02 07:43:22,560 WARNING [train.py:1204] (2/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_26-175553-0 from training. Duration: 0.61 2023-10-02 07:43:22,603 WARNING [train.py:1204] (2/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_890-5117-0 from training. Duration: 0.78 2023-10-02 07:43:22,895 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.self_attn_weights.pos_emb_skip_rate, batch_count=802573.3333333334, ans=0.0 2023-10-02 07:43:23,839 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.562e+02 1.940e+02 2.168e+02 2.529e+02 4.138e+02, threshold=4.336e+02, percent-clipped=0.0 2023-10-02 07:43:23,932 WARNING [train.py:1204] (2/4) Exclude cut with ID _41314_210_12651_1_1533459476543_3766460_206-306860-0_sp1.1 from training. Duration: 0.92725 2023-10-02 07:43:24,082 WARNING [train.py:1204] (2/4) Exclude cut with ID _25882_210_3036_1_1532775665815_6479510_570-79824-0_sp1.1 from training. Duration: 0.963625 2023-10-02 07:43:25,378 WARNING [train.py:1204] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_731-2428-0_sp1.1 from training. Duration: 0.7545625 2023-10-02 07:43:25,399 WARNING [train.py:1204] (2/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_138-154812-0_sp0.9 from training. Duration: 0.8666875 2023-10-02 07:43:28,723 WARNING [train.py:1204] (2/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_178-39722-0_sp0.9 from training. Duration: 0.5444375 2023-10-02 07:43:28,776 WARNING [train.py:1204] (2/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_1285-256622-0_sp0.9 from training. Duration: 0.9333125 2023-10-02 07:43:31,726 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=802640.0, ans=0.1 2023-10-02 07:43:33,391 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_41428_210_2161_1_1533774184_3992532_65-271323-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 07:43:34,862 WARNING [train.py:1204] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_308-161067-0 from training. Duration: 0.66 2023-10-02 07:43:34,868 WARNING [train.py:1204] (2/4) Exclude cut with ID _3067_210_2667_1_1526212974640_4503168_459-173867-0 from training. Duration: 0.88 2023-10-02 07:43:34,869 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_2221_210_1988_1_1525597051_4935541_415-296882-0_sp1.1 from training. Duration: 0.7 2023-10-02 07:43:37,643 WARNING [train.py:1204] (2/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_354-192266-0_sp1.1 from training. Duration: 0.936375 2023-10-02 07:43:39,014 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_258-310570-0_sp0.9 from training. Duration: 0.911125 2023-10-02 07:43:40,378 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_548-141039-0_sp1.1 from training. Duration: 0.963625 2023-10-02 07:43:43,824 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_15101_210_7927_1_1530786415_5033274_320-327527-0 from training. Duration: 0.96 2023-10-02 07:43:45,131 WARNING [train.py:1204] (2/4) Exclude cut with ID _2895_210_2477_1_1525956785794_4063750_289-11475-0_sp0.9 from training. Duration: 0.911125 2023-10-02 07:43:46,586 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7543_210_6753_1_1528543318_4552_1-85072-0_sp1.1 from training. Duration: 0.7545625 2023-10-02 07:43:47,952 WARNING [train.py:1204] (2/4) Exclude cut with ID _64639_210_5816_1_1535367619437_3528000_461-329156-0 from training. Duration: 0.96 2023-10-02 07:43:49,629 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.conv_module2.balancer1.prob, batch_count=802706.6666666666, ans=0.125 2023-10-02 07:43:50,707 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_211-339622-0 from training. Duration: 0.45 2023-10-02 07:43:51,970 WARNING [train.py:1204] (2/4) Exclude cut with ID _20455_210_5219_1_1531565996775_8098100_402-112739-0_sp1.1 from training. Duration: 0.963625 2023-10-02 07:43:53,430 WARNING [train.py:1204] (2/4) Exclude cut with ID _53489_210_6228_1_1534298357169_3835970_256-115565-0_sp1.1 from training. Duration: 0.936375 2023-10-02 07:43:53,447 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_26326_210_7925_1_1533002493_3592406_360-305260-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 07:43:53,489 WARNING [train.py:1204] (2/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_18-204383-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 07:43:56,088 INFO [train.py:1046] (2/4) Epoch 23, batch 3550, loss[loss=0.172, simple_loss=0.235, pruned_loss=0.05449, over 23456.00 frames. ], tot_loss[loss=0.172, simple_loss=0.248, pruned_loss=0.04795, over 4715993.10 frames. ], batch size: 285, lr: 4.45e-03, grad_scale: 8.0 2023-10-02 07:43:56,390 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.bypass.skip_rate, batch_count=802773.3333333334, ans=0.07 2023-10-02 07:43:57,727 WARNING [train.py:1204] (2/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_306-232480-0_sp1.1 from training. Duration: 0.87275 2023-10-02 07:44:01,496 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.balancer2.prob, batch_count=802773.3333333334, ans=0.125 2023-10-02 07:44:05,758 WARNING [train.py:1204] (2/4) Exclude cut with ID _41161_210_20034_1_1533431249657_3555314_116-299186-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 07:44:07,159 WARNING [train.py:1204] (2/4) Exclude cut with ID _13913_210_9994_1_1530579615187_3383897_473-277682-0_sp1.1 from training. Duration: 0.3545625 2023-10-02 07:44:10,069 WARNING [train.py:1204] (2/4) Exclude cut with ID _58895_210_8750_1_1535184081320_3649089_49-225702-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 07:44:11,405 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_3080_210_2071_1_1526086102_4365166_312-8726-0_sp1.1 from training. Duration: 0.82725 2023-10-02 07:44:12,781 WARNING [train.py:1204] (2/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_489-190175-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 07:44:12,873 WARNING [train.py:1204] (2/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_345-95717-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 07:44:14,111 WARNING [train.py:1204] (2/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_85-138257-0_sp1.1 from training. Duration: 0.6454375 2023-10-02 07:44:16,062 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7046_210_6753_1_1527895337_6920917_783-278109-0_sp1.1 from training. Duration: 0.7 2023-10-02 07:44:17,342 WARNING [train.py:1204] (2/4) Exclude cut with ID _3465_210_3525_1_1526175667060_3574049_450-203637-0_sp1.1 from training. Duration: 0.836375 2023-10-02 07:44:17,399 WARNING [train.py:1204] (2/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_416-132893-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 07:44:18,686 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_2655_210_2087_1_1525688959_3897400_437-337690-0_sp0.9 from training. Duration: 0.7 2023-10-02 07:44:20,092 WARNING [train.py:1204] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_700-183549-0_sp1.1 from training. Duration: 0.6181875 2023-10-02 07:44:25,473 WARNING [train.py:1204] (2/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_191-59784-0_sp1.1 from training. Duration: 0.8 2023-10-02 07:44:25,484 WARNING [train.py:1204] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_218-295097-0_sp1.1 from training. Duration: 0.7 2023-10-02 07:44:26,953 WARNING [train.py:1204] (2/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_310-109461-0_sp1.1 from training. Duration: 0.72725 2023-10-02 07:44:26,957 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_33348_210_5632_1_1533086912_3324030_131-321149-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 07:44:28,400 WARNING [train.py:1204] (2/4) Exclude cut with ID _64135_210_2253_1_1535677267876_7608972_133-188266-0_sp1.1 from training. Duration: 0.736375 2023-10-02 07:44:28,432 WARNING [train.py:1204] (2/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_505-205270-0 from training. Duration: 0.38 2023-10-02 07:44:28,444 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_67223_210_4238_1_1535612120_2537431_102-88248-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 07:44:30,462 WARNING [train.py:1204] (2/4) Exclude cut with ID _54871_210_22356_1_1534665798253_3856359_60-235433-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 07:44:31,867 WARNING [train.py:1204] (2/4) Exclude cut with ID _5189_210_4557_1_1527248319428_3769229_199-53526-0_sp1.1 from training. Duration: 0.3818125 2023-10-02 07:44:35,201 WARNING [train.py:1204] (2/4) Exclude cut with ID _45329_210_7005_1_1534157951211_6774339_439-90979-0_sp1.1 from training. Duration: 0.963625 2023-10-02 07:44:36,548 WARNING [train.py:1204] (2/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_1162-132880-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 07:44:37,931 WARNING [train.py:1204] (2/4) Exclude cut with ID _56423_210_5854_1_1534896061049_3165969_220-22317-0_sp1.1 from training. Duration: 0.963625 2023-10-02 07:44:39,466 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.1.bypass.scale_min, batch_count=802973.3333333334, ans=0.2 2023-10-02 07:44:40,692 WARNING [train.py:1204] (2/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_909-64151-0 from training. Duration: 0.94 2023-10-02 07:44:40,768 WARNING [train.py:1204] (2/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_465-156500-0_sp0.9 from training. Duration: 0.8 2023-10-02 07:44:42,039 WARNING [train.py:1204] (2/4) Exclude cut with ID _44304_210_15374_1_1533639352765_4613129_302-22084-0 from training. Duration: 0.86 2023-10-02 07:44:42,097 WARNING [train.py:1204] (2/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_663-166973-0_sp1.1 from training. Duration: 0.72725 2023-10-02 07:44:44,906 WARNING [train.py:1204] (2/4) Exclude cut with ID _45329_210_7005_1_1534157951211_6774339_503-287883-0_sp0.9 from training. Duration: 0.97775 2023-10-02 07:44:44,934 WARNING [train.py:1204] (2/4) Exclude cut with ID _60057_210_10640_1_1534903065793_3333150_362-230377-0_sp0.9 from training. Duration: 0.9666875 2023-10-02 07:44:48,279 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_4183_210_2087_1_1526729100_1869838_143-280962-0 from training. Duration: 0.56 2023-10-02 07:44:49,598 WARNING [train.py:1204] (2/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_281-1313-0_sp1.1 from training. Duration: 0.936375 2023-10-02 07:44:55,597 WARNING [train.py:1204] (2/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_64-215118-0_sp1.1 from training. Duration: 0.936375 2023-10-02 07:44:56,945 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_2822_210_2715_1_1526112586_1999587_165-36656-0 from training. Duration: 0.89 2023-10-02 07:44:57,004 WARNING [train.py:1204] (2/4) Exclude cut with ID _22961_210_8254_1_1531976366672_4278180_402-208830-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 07:44:57,235 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.feed_forward3.hidden_balancer.prob, batch_count=803040.0, ans=0.125 2023-10-02 07:45:01,528 WARNING [train.py:1204] (2/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_98-193494-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 07:45:01,781 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.conv_module1.balancer1.max_abs, batch_count=803040.0, ans=10.0 2023-10-02 07:45:02,908 WARNING [train.py:1204] (2/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_383-285196-0 from training. Duration: 0.97 2023-10-02 07:45:07,907 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6767_210_3273_1_1527855783_1718932_142-127132-0 from training. Duration: 0.95 2023-10-02 07:45:07,934 WARNING [train.py:1204] (2/4) Exclude cut with ID _39867_210_9826_1_1533553222030_9344259_1018-339635-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 07:45:09,215 WARNING [train.py:1204] (2/4) Exclude cut with ID _14603_210_4220_1_1530615688441_3097319_274-274798-0_sp1.1 from training. Duration: 0.8909375 2023-10-02 07:45:10,509 INFO [train.py:1046] (2/4) Epoch 23, batch 3600, loss[loss=0.1776, simple_loss=0.2485, pruned_loss=0.05337, over 23845.00 frames. ], tot_loss[loss=0.1718, simple_loss=0.2482, pruned_loss=0.04766, over 4717434.23 frames. ], batch size: 179, lr: 4.45e-03, grad_scale: 16.0 2023-10-02 07:45:10,622 WARNING [train.py:1204] (2/4) Exclude cut with ID _2334_210_2667_1_1525608238880_4705129_376-263393-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 07:45:10,660 WARNING [train.py:1204] (2/4) Exclude cut with ID _29303_210_2575_1_1532392237303_7410649_363-344675-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 07:45:12,108 WARNING [train.py:1204] (2/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_58-68956-0_sp1.1 from training. Duration: 0.7545625 2023-10-02 07:45:15,002 WARNING [train.py:1204] (2/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_709-311571-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 07:45:16,371 WARNING [train.py:1204] (2/4) Exclude cut with ID _46067_210_4220_1_1534125571066_3307450_136-257407-0_sp1.1 from training. Duration: 0.963625 2023-10-02 07:45:18,170 WARNING [train.py:1204] (2/4) Exclude cut with ID _6493_210_1747_1_1528459210424_4012470_324-323854-0_sp1.1 from training. Duration: 0.836375 2023-10-02 07:45:18,237 WARNING [train.py:1204] (2/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_234-260682-0_sp1.1 from training. Duration: 0.763625 2023-10-02 07:45:19,623 WARNING [train.py:1204] (2/4) Exclude cut with ID _36634_210_1794_1_1533296352450_1399884_55-119451-0_sp1.1 from training. Duration: 0.963625 2023-10-02 07:45:19,635 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_40047_210_2290_1_1533290519_2374311_195-165693-0 from training. Duration: 0.89 2023-10-02 07:45:23,614 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_5522_210_3273_1_1527249606_3909096_366-172508-0_sp0.9 from training. Duration: 0.7333125 2023-10-02 07:45:24,960 WARNING [train.py:1204] (2/4) Exclude cut with ID _21878_210_3525_1_1531738772002_7264330_187-10504-0_sp1.1 from training. Duration: 0.963625 2023-10-02 07:45:26,461 WARNING [train.py:1204] (2/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_282-257791-0_sp1.1 from training. Duration: 0.7818125 2023-10-02 07:45:28,701 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.2.encoder.layers.2.feed_forward2.out_whiten, num_groups=1, num_channels=384, metric=4.26 vs. limit=15.0 2023-10-02 07:45:29,899 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_43831_210_2158_1_1533725502_5872791_467-288776-0_sp1.1 from training. Duration: 0.92725 2023-10-02 07:45:31,301 WARNING [train.py:1204] (2/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_138-154812-0_sp1.1 from training. Duration: 0.7090625 2023-10-02 07:45:31,346 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_1154-185930-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 07:45:31,371 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_2655_210_2087_1_1525688959_3897400_52-279899-0 from training. Duration: 0.69 2023-10-02 07:45:33,305 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_141-89113-0_sp1.1 from training. Duration: 0.7818125 2023-10-02 07:45:36,521 WARNING [train.py:1204] (2/4) Exclude cut with ID _55134_210_21982_1_1534590028377_3882170_297-47310-0_sp1.1 from training. Duration: 0.963625 2023-10-02 07:45:36,594 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_486-151131-0_sp1.1 from training. Duration: 0.77275 2023-10-02 07:45:37,939 WARNING [train.py:1204] (2/4) Exclude cut with ID _44778_210_2054_1_1533798127203_7443090_21-232980-0_sp1.1 from training. Duration: 0.936375 2023-10-02 07:45:39,475 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6165_210_3528_1_1527590982_4379841_338-106380-0_sp1.1 from training. Duration: 0.92725 2023-10-02 07:45:40,832 WARNING [train.py:1204] (2/4) Exclude cut with ID _55162_210_17071_1_1534408248583_3753480_355-257218-0_sp1.1 from training. Duration: 0.97275 2023-10-02 07:45:42,096 WARNING [train.py:1204] (2/4) Exclude cut with ID _2895_210_2477_1_1525956785794_4063750_289-11475-0 from training. Duration: 0.82 2023-10-02 07:45:44,977 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.nonlin_attention.balancer.prob, batch_count=803240.0, ans=0.125 2023-10-02 07:45:45,089 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.ff2_skip_rate, batch_count=803240.0, ans=0.0 2023-10-02 07:45:50,344 WARNING [train.py:1204] (2/4) Exclude cut with ID _16756_210_4915_1_1531047803763_7104259_839-183431-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 07:45:52,129 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.380e+02 1.763e+02 1.994e+02 2.308e+02 3.230e+02, threshold=3.989e+02, percent-clipped=0.0 2023-10-02 07:45:52,277 WARNING [train.py:1204] (2/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_427-208901-0_sp0.9 from training. Duration: 0.7555625 2023-10-02 07:45:53,610 WARNING [train.py:1204] (2/4) Exclude cut with ID _36512_210_6987_1_1533033143998_3126069_181-306500-0 from training. Duration: 0.95 2023-10-02 07:45:57,704 WARNING [train.py:1204] (2/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_28-70987-0_sp1.1 from training. Duration: 0.7909375 2023-10-02 07:46:02,586 WARNING [train.py:1204] (2/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_590-58996-0_sp1.1 from training. Duration: 0.936375 2023-10-02 07:46:03,967 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_48730_210_5399_1_1533885886_2212769_108-47561-0_sp1.1 from training. Duration: 0.936375 2023-10-02 07:46:10,310 WARNING [train.py:1204] (2/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_401-158239-0_sp0.9 from training. Duration: 0.788875 2023-10-02 07:46:11,563 WARNING [train.py:1204] (2/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_278-154492-0_sp1.1 from training. Duration: 0.6909375 2023-10-02 07:46:11,576 WARNING [train.py:1204] (2/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_137-116442-0 from training. Duration: 0.78 2023-10-02 07:46:12,928 WARNING [train.py:1204] (2/4) Exclude cut with ID _3541_210_2477_1_1526475408709_3263973_189-317158-0 from training. Duration: 0.9 2023-10-02 07:46:13,009 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_47896_210_20508_1_1534492518_5958868_558-314572-0 from training. Duration: 0.97 2023-10-02 07:46:15,755 WARNING [train.py:1204] (2/4) Exclude cut with ID _66070_210_7640_1_1535441393096_7805310_290-40284-0_sp1.1 from training. Duration: 0.97275 2023-10-02 07:46:15,793 WARNING [train.py:1204] (2/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_110-224688-0_sp1.1 from training. Duration: 0.8818125 2023-10-02 07:46:17,174 WARNING [train.py:1204] (2/4) Exclude cut with ID _7719_210_3525_1_1528456153163_4296960_88-250259-0 from training. Duration: 0.84 2023-10-02 07:46:17,233 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8453_210_4211_1_1528502940_8562730_1157-114102-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 07:46:18,574 WARNING [train.py:1204] (2/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_317-175062-0_sp1.1 from training. Duration: 0.7454375 2023-10-02 07:46:18,579 WARNING [train.py:1204] (2/4) Exclude cut with ID _29857_210_20034_1_1532395886753_3798160_254-347254-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 07:46:20,035 WARNING [train.py:1204] (2/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_28-70987-0 from training. Duration: 0.87 2023-10-02 07:46:20,112 WARNING [train.py:1204] (2/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_321-335475-0 from training. Duration: 0.45 2023-10-02 07:46:24,919 INFO [train.py:1046] (2/4) Epoch 23, batch 3650, loss[loss=0.1557, simple_loss=0.2302, pruned_loss=0.04057, over 23393.00 frames. ], tot_loss[loss=0.1723, simple_loss=0.2489, pruned_loss=0.04781, over 4723643.13 frames. ], batch size: 119, lr: 4.45e-03, grad_scale: 16.0 2023-10-02 07:46:24,969 WARNING [train.py:1204] (2/4) Exclude cut with ID _40901_210_5665_1_1533363789958_3856130_356-107330-0_sp1.1 from training. Duration: 0.936375 2023-10-02 07:46:25,028 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_3780_210_2087_1_1526467389_2145572_176-107140-0 from training. Duration: 0.69 2023-10-02 07:46:27,908 WARNING [train.py:1204] (2/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_80-135200-0 from training. Duration: 0.79 2023-10-02 07:46:29,319 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_33348_210_5632_1_1533086912_3324030_85-28583-0_sp0.9 from training. Duration: 0.911125 2023-10-02 07:46:35,309 WARNING [train.py:1204] (2/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_29-293896-0 from training. Duration: 0.61 2023-10-02 07:46:37,365 WARNING [train.py:1204] (2/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_194-252800-0 from training. Duration: 0.97 2023-10-02 07:46:40,157 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_360-52446-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 07:46:40,159 WARNING [train.py:1204] (2/4) Exclude cut with ID _9389_210_1664_1_1529217337301_5289269_289-213417-0_sp0.9 from training. Duration: 0.988875 2023-10-02 07:46:42,042 WARNING [train.py:1204] (2/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_143-86344-0_sp0.9 from training. Duration: 0.8555625 2023-10-02 07:46:42,387 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.conv_skip_rate, batch_count=803506.6666666666, ans=0.0 2023-10-02 07:46:43,544 WARNING [train.py:1204] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_520-126729-0_sp0.9 from training. Duration: 0.77775 2023-10-02 07:46:43,571 WARNING [train.py:1204] (2/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_76-308663-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 07:46:43,631 WARNING [train.py:1204] (2/4) Exclude cut with ID _8021_210_7994_1_1528376451358_3653069_100-140231-0 from training. Duration: 0.67 2023-10-02 07:46:44,860 WARNING [train.py:1204] (2/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_387-131538-0_sp0.9 from training. Duration: 0.9 2023-10-02 07:46:46,209 WARNING [train.py:1204] (2/4) Exclude cut with ID _6275_210_2084_1_1528110062213_7669049_365-182054-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 07:46:46,255 WARNING [train.py:1204] (2/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_345-340690-0 from training. Duration: 0.93 2023-10-02 07:46:46,345 WARNING [train.py:1204] (2/4) Exclude cut with ID _5409_210_1388_1_1527292850189_5865191_178-127970-0_sp0.9 from training. Duration: 0.6333125 2023-10-02 07:46:47,653 WARNING [train.py:1204] (2/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_352-12734-0_sp1.1 from training. Duration: 0.92725 2023-10-02 07:46:47,657 WARNING [train.py:1204] (2/4) Exclude cut with ID _65134_210_14763_1_1535335393985_3505368_121-129373-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 07:46:50,338 WARNING [train.py:1204] (2/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_557-324258-0_sp1.1 from training. Duration: 0.736375 2023-10-02 07:46:52,408 WARNING [train.py:1204] (2/4) Exclude cut with ID _48678_210_5663_1_1533988830947_3769199_306-255886-0 from training. Duration: 0.77 2023-10-02 07:46:53,606 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7544_210_6753_1_1528587000_691468_87-178599-0 from training. Duration: 0.87 2023-10-02 07:46:53,658 WARNING [train.py:1204] (2/4) Exclude cut with ID _2941_210_2577_1_1526199673150_2489759_208-236402-0_sp1.1 from training. Duration: 0.963625 2023-10-02 07:46:56,418 WARNING [train.py:1204] (2/4) Exclude cut with ID _38670_210_15379_1_1533448810659_4391059_65-169397-0 from training. Duration: 0.97 2023-10-02 07:46:56,514 WARNING [train.py:1204] (2/4) Exclude cut with ID _30304_210_6319_1_1532476827261_7582060_306-328175-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 07:46:56,522 WARNING [train.py:1204] (2/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_212-149947-0_sp1.1 from training. Duration: 0.663625 2023-10-02 07:47:02,847 WARNING [train.py:1204] (2/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_321-22821-0_sp1.1 from training. Duration: 0.8090625 2023-10-02 07:47:04,889 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.nonlin_attention.balancer.prob, batch_count=803573.3333333334, ans=0.125 2023-10-02 07:47:06,142 WARNING [train.py:1204] (2/4) Exclude cut with ID _44010_210_4525_1_1533884262964_4013620_483-69110-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 07:47:06,158 WARNING [train.py:1204] (2/4) Exclude cut with ID _14922_210_6228_1_1530707435273_3693310_38-187851-0_sp1.1 from training. Duration: 0.563625 2023-10-02 07:47:06,306 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.0.feed_forward1.out_proj.dropout_p, batch_count=803573.3333333334, ans=0.1 2023-10-02 07:47:07,522 WARNING [train.py:1204] (2/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_129-337802-0_sp0.9 from training. Duration: 0.8 2023-10-02 07:47:07,590 WARNING [train.py:1204] (2/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_268-87909-0_sp1.1 from training. Duration: 0.9 2023-10-02 07:47:08,894 WARNING [train.py:1204] (2/4) Exclude cut with ID _21878_210_3525_1_1531738772002_7264330_559-184862-0_sp1.1 from training. Duration: 0.8545625 2023-10-02 07:47:12,053 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_46032_210_14656_1_1533781412_4507969_237-120766-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 07:47:14,651 WARNING [train.py:1204] (2/4) Exclude cut with ID _28427_210_4220_1_1532581240967_3061069_330-170906-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 07:47:14,656 WARNING [train.py:1204] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_401-51578-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 07:47:14,772 WARNING [train.py:1204] (2/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_209-205365-0_sp1.1 from training. Duration: 0.7181875 2023-10-02 07:47:16,000 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_23801_210_16223_1_1532066392_3294657_57-242170-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 07:47:17,375 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_127-37371-0_sp1.1 from training. Duration: 0.936375 2023-10-02 07:47:17,986 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.3.encoder.layers.2.whiten, num_groups=1, num_channels=512, metric=5.13 vs. limit=12.0 2023-10-02 07:47:20,260 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.feed_forward1.hidden_balancer.prob, batch_count=803640.0, ans=0.125 2023-10-02 07:47:24,797 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9171W0019-578905 from training. Duration: 0.896 2023-10-02 07:47:25,012 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.nonlin_attention.balancer.prob, batch_count=803706.6666666666, ans=0.125 2023-10-02 07:47:29,068 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_24605_210_1794_1_1532047863_4149755_349-49056-0_sp1.1 from training. Duration: 0.92725 2023-10-02 07:47:29,082 WARNING [train.py:1204] (2/4) Exclude cut with ID _12302_210_5219_1_1529924446466_8107280_897-21237-0_sp1.1 from training. Duration: 0.936375 2023-10-02 07:47:30,529 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_9460_210_4211_1_1528954587_10049667_480-260269-0_sp0.9 from training. Duration: 0.62225 2023-10-02 07:47:30,821 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.conv_module2.balancer1.prob, batch_count=803706.6666666666, ans=0.125 2023-10-02 07:47:31,899 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_348-152548-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 07:47:32,570 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.3.encoder.layers.3.conv_module1.whiten, num_groups=1, num_channels=512, metric=5.14 vs. limit=15.0 2023-10-02 07:47:33,263 WARNING [train.py:1204] (2/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_301-330455-0_sp0.9 from training. Duration: 0.888875 2023-10-02 07:47:35,258 WARNING [train.py:1204] (2/4) Exclude cut with ID _61008_210_10633_1_1534852769198_6974370_345-91705-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 07:47:36,591 WARNING [train.py:1204] (2/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_304-273433-0 from training. Duration: 0.99 2023-10-02 07:47:36,595 WARNING [train.py:1204] (2/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_359-345972-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 07:47:39,220 INFO [train.py:1046] (2/4) Epoch 23, batch 3700, loss[loss=0.1606, simple_loss=0.2459, pruned_loss=0.03764, over 24421.00 frames. ], tot_loss[loss=0.1728, simple_loss=0.2498, pruned_loss=0.04791, over 4736661.33 frames. ], batch size: 69, lr: 4.44e-03, grad_scale: 8.0 2023-10-02 07:47:39,294 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_2296_210_2087_1_1525503448_3397335_57-185430-0_sp1.1 from training. Duration: 0.5454375 2023-10-02 07:47:42,560 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_1256-120974-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 07:47:42,620 WARNING [train.py:1204] (2/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_372-30302-0_sp0.9 from training. Duration: 0.9555625 2023-10-02 07:47:45,347 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_42012_210_7927_1_1533439936_3871556_451-344262-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 07:47:45,347 WARNING [train.py:1204] (2/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_28-324729-0 from training. Duration: 0.94 2023-10-02 07:47:45,356 WARNING [train.py:1204] (2/4) Exclude cut with ID _64135_210_2253_1_1535677267876_7608972_313-18394-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 07:47:46,719 WARNING [train.py:1204] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_620-276793-0_sp0.9 from training. Duration: 0.6555625 2023-10-02 07:47:46,758 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7544_210_6753_1_1528591327_1471426_172-348777-0_sp0.9 from training. Duration: 0.7333125 2023-10-02 07:47:47,823 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.0.layers.1.self_attn_weights.whiten_keys, num_groups=4, num_channels=128, metric=2.41 vs. limit=6.0 2023-10-02 07:47:50,082 WARNING [train.py:1204] (2/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_332-221057-0_sp0.9 from training. Duration: 0.7555625 2023-10-02 07:47:52,819 WARNING [train.py:1204] (2/4) Exclude cut with ID _19023_210_4353_1_1531378892552_5820780_5-300035-0_sp1.1 from training. Duration: 0.92725 2023-10-02 07:47:52,855 WARNING [train.py:1204] (2/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_343-309023-0_sp1.1 from training. Duration: 0.963625 2023-10-02 07:47:54,245 WARNING [train.py:1204] (2/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_37-41919-0_sp1.1 from training. Duration: 0.7909375 2023-10-02 07:47:55,653 WARNING [train.py:1204] (2/4) Exclude cut with ID _3541_210_2477_1_1526475408709_3263973_41-182910-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 07:47:56,945 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_229-45300-0_sp0.9 from training. Duration: 0.6333125 2023-10-02 07:47:59,026 WARNING [train.py:1204] (2/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_40-214716-0_sp1.1 from training. Duration: 0.963625 2023-10-02 07:48:00,427 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9309W0203-640330 from training. Duration: 0.896 2023-10-02 07:48:08,031 WARNING [train.py:1204] (2/4) Exclude cut with ID _60229_210_27767_1_1534838406158_3655350_264-279573-0_sp1.1 from training. Duration: 0.7545625 2023-10-02 07:48:08,063 WARNING [train.py:1204] (2/4) Exclude cut with ID _8394_210_6702_1_1528590504316_3540189_372-198878-0_sp0.9 from training. Duration: 0.6666875 2023-10-02 07:48:10,610 WARNING [train.py:1204] (2/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_359-118926-0_sp1.1 from training. Duration: 0.6545625 2023-10-02 07:48:10,629 WARNING [train.py:1204] (2/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_85-65056-0 from training. Duration: 0.96 2023-10-02 07:48:11,991 WARNING [train.py:1204] (2/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_89-242335-0_sp1.1 from training. Duration: 0.863625 2023-10-02 07:48:15,426 WARNING [train.py:1204] (2/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_128-49444-0_sp1.1 from training. Duration: 0.936375 2023-10-02 07:48:16,715 WARNING [train.py:1204] (2/4) Exclude cut with ID _30327_210_16049_1_1532484161295_3924169_401-70819-0 from training. Duration: 0.86 2023-10-02 07:48:16,795 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_41822_210_13270_1_1533955976_4379284_260-256391-0_sp1.1 from training. Duration: 0.936375 2023-10-02 07:48:18,213 WARNING [train.py:1204] (2/4) Exclude cut with ID _67335_210_5219_1_1535525975652_4275229_599-247317-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 07:48:21,027 WARNING [train.py:1204] (2/4) Exclude cut with ID _29857_210_20034_1_1532395886753_3798160_209-239267-0_sp1.1 from training. Duration: 0.936375 2023-10-02 07:48:21,054 WARNING [train.py:1204] (2/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_46-207599-0_sp1.1 from training. Duration: 0.5818125 2023-10-02 07:48:22,657 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.421e+02 1.851e+02 2.016e+02 2.342e+02 4.002e+02, threshold=4.032e+02, percent-clipped=1.0 2023-10-02 07:48:24,200 WARNING [train.py:1204] (2/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_912-282412-0_sp1.1 from training. Duration: 0.4454375 2023-10-02 07:48:28,325 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_4183_210_2087_1_1526729100_1869838_141-166600-0_sp1.1 from training. Duration: 0.863625 2023-10-02 07:48:28,330 WARNING [train.py:1204] (2/4) Exclude cut with ID _15037_210_2253_1_1530752294104_3267650_296-192597-0 from training. Duration: 0.81 2023-10-02 07:48:29,149 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.2.encoder.layers.0.feed_forward1.out_whiten, num_groups=1, num_channels=384, metric=5.96 vs. limit=15.0 2023-10-02 07:48:30,404 WARNING [train.py:1204] (2/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_571-220562-0_sp1.1 from training. Duration: 0.963625 2023-10-02 07:48:30,428 WARNING [train.py:1204] (2/4) Exclude cut with ID _5287_210_2084_1_1527418883040_3878670_12-221186-0 from training. Duration: 0.98 2023-10-02 07:48:36,208 WARNING [train.py:1204] (2/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_149-85602-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 07:48:36,239 WARNING [train.py:1204] (2/4) Exclude cut with ID _9235_210_5219_1_1528891154253_8046120_1003-101638-0_sp1.1 from training. Duration: 0.663625 2023-10-02 07:48:39,668 WARNING [train.py:1204] (2/4) Exclude cut with ID _55844_210_2346_1_1534471212569_7625707_1099-33689-0_sp1.1 from training. Duration: 0.92725 2023-10-02 07:48:39,699 WARNING [train.py:1204] (2/4) Exclude cut with ID _7606_210_4915_1_1528894253277_5080690_301-10732-0 from training. Duration: 0.81 2023-10-02 07:48:42,365 WARNING [train.py:1204] (2/4) Exclude cut with ID _22826_210_9994_1_1531908201522_3279169_403-34698-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 07:48:42,371 WARNING [train.py:1204] (2/4) Exclude cut with ID _8480_210_1874_1_1528589391205_6728330_755-112625-0_sp0.9 from training. Duration: 0.82225 2023-10-02 07:48:42,385 WARNING [train.py:1204] (2/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_527-238990-0_sp1.1 from training. Duration: 0.8181875 2023-10-02 07:48:42,405 WARNING [train.py:1204] (2/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_426-7328-0_sp1.1 from training. Duration: 0.92725 2023-10-02 07:48:42,739 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.conv_skip_rate, batch_count=804040.0, ans=0.0 2023-10-02 07:48:45,365 WARNING [train.py:1204] (2/4) Exclude cut with ID _3541_210_2477_1_1526475408709_3263973_189-317158-0_sp1.1 from training. Duration: 0.8181875 2023-10-02 07:48:45,452 WARNING [train.py:1204] (2/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_162-103620-0 from training. Duration: 0.93 2023-10-02 07:48:47,292 WARNING [train.py:1204] (2/4) Exclude cut with ID _22083_210_8254_1_1531702862439_3678970_49-234546-0 from training. Duration: 0.83 2023-10-02 07:48:48,614 WARNING [train.py:1204] (2/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_370-331600-0_sp1.1 from training. Duration: 0.8545625 2023-10-02 07:48:48,642 WARNING [train.py:1204] (2/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_166-59042-0_sp1.1 from training. Duration: 0.97275 2023-10-02 07:48:50,014 WARNING [train.py:1204] (2/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_138-332773-0_sp1.1 from training. Duration: 0.736375 2023-10-02 07:48:51,415 WARNING [train.py:1204] (2/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_90-30673-0_sp0.9 from training. Duration: 0.9333125 2023-10-02 07:48:52,924 WARNING [train.py:1204] (2/4) Exclude cut with ID _27967_210_12140_1_1532588391859_8419130_737-41898-0_sp1.1 from training. Duration: 0.936375 2023-10-02 07:48:54,810 INFO [train.py:1046] (2/4) Epoch 23, batch 3750, loss[loss=0.1956, simple_loss=0.2595, pruned_loss=0.0658, over 23709.00 frames. ], tot_loss[loss=0.1743, simple_loss=0.2512, pruned_loss=0.04872, over 4728917.67 frames. ], batch size: 212, lr: 4.44e-03, grad_scale: 8.0 2023-10-02 07:48:54,873 WARNING [train.py:1204] (2/4) Exclude cut with ID _8833_210_5219_1_1528592665210_7553059_329-125156-0_sp1.1 from training. Duration: 0.7454375 2023-10-02 07:48:56,243 WARNING [train.py:1204] (2/4) Exclude cut with ID _41817_210_18835_1_1533434477160_7423049_352-42537-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 07:48:57,657 WARNING [train.py:1204] (2/4) Exclude cut with ID _8365_210_2064_1_1528592311070_4677690_431-294102-0 from training. Duration: 0.89 2023-10-02 07:48:59,546 WARNING [train.py:1204] (2/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_454-314901-0_sp1.1 from training. Duration: 0.4181875 2023-10-02 07:49:02,275 WARNING [train.py:1204] (2/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_80-135200-0_sp0.9 from training. Duration: 0.87775 2023-10-02 07:49:02,329 WARNING [train.py:1204] (2/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_181-78632-0 from training. Duration: 0.78 2023-10-02 07:49:02,473 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.balancer2.prob, batch_count=804106.6666666666, ans=0.125 2023-10-02 07:49:03,661 WARNING [train.py:1204] (2/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_300-27235-0_sp0.9 from training. Duration: 0.9666875 2023-10-02 07:49:05,015 WARNING [train.py:1204] (2/4) Exclude cut with ID _47790_210_16187_1_1533862814066_4207519_96-177176-0_sp1.1 from training. Duration: 0.97275 2023-10-02 07:49:06,398 WARNING [train.py:1204] (2/4) Exclude cut with ID _35941_210_15379_1_1532995294515_4023089_246-348742-0_sp1.1 from training. Duration: 0.97275 2023-10-02 07:49:06,711 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.conv_module2.balancer1.prob, batch_count=804106.6666666666, ans=0.125 2023-10-02 07:49:07,852 WARNING [train.py:1204] (2/4) Exclude cut with ID _38670_210_15379_1_1533448810659_4391059_65-169397-0_sp1.1 from training. Duration: 0.8818125 2023-10-02 07:49:11,383 WARNING [train.py:1204] (2/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_1003-99642-0_sp1.1 from training. Duration: 0.963625 2023-10-02 07:49:11,582 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.bypass.scale_min, batch_count=804173.3333333334, ans=0.2 2023-10-02 07:49:12,830 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.attention_skip_rate, batch_count=804173.3333333334, ans=0.0 2023-10-02 07:49:12,892 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.conv_module2.balancer2.prob, batch_count=804173.3333333334, ans=0.125 2023-10-02 07:49:14,045 WARNING [train.py:1204] (2/4) Exclude cut with ID _40402_210_18219_1_1533522324115_2953150_320-14078-0_sp1.1 from training. Duration: 0.863625 2023-10-02 07:49:15,471 WARNING [train.py:1204] (2/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_367-61330-0_sp0.9 from training. Duration: 0.8333125 2023-10-02 07:49:18,866 WARNING [train.py:1204] (2/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_515-298514-0_sp1.1 from training. Duration: 0.92725 2023-10-02 07:49:19,096 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.bypass_mid.scale_min, batch_count=804173.3333333334, ans=0.2 2023-10-02 07:49:21,623 WARNING [train.py:1204] (2/4) Exclude cut with ID _53731_210_7994_1_1534553979244_3757214_471-320815-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 07:49:22,998 WARNING [train.py:1204] (2/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_306-232480-0 from training. Duration: 0.96 2023-10-02 07:49:23,071 WARNING [train.py:1204] (2/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_478-162247-0_sp0.9 from training. Duration: 0.911125 2023-10-02 07:49:24,418 WARNING [train.py:1204] (2/4) Exclude cut with ID _56423_210_5854_1_1534896061049_3165969_345-284696-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 07:49:24,449 WARNING [train.py:1204] (2/4) Exclude cut with ID _44778_210_2054_1_1533798127203_7443090_600-122253-0_sp1.1 from training. Duration: 0.963625 2023-10-02 07:49:29,093 WARNING [train.py:1204] (2/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_199-295611-0 from training. Duration: 0.55 2023-10-02 07:49:32,445 WARNING [train.py:1204] (2/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_734-129761-0 from training. Duration: 0.82 2023-10-02 07:49:33,892 WARNING [train.py:1204] (2/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_349-169257-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 07:49:33,922 WARNING [train.py:1204] (2/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_202-67017-0_sp0.9 from training. Duration: 0.911125 2023-10-02 07:49:36,764 WARNING [train.py:1204] (2/4) Exclude cut with ID _16908_210_5724_1_1531119157248_3772700_61-258978-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 07:49:39,662 WARNING [train.py:1204] (2/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_172-156931-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 07:49:41,457 WARNING [train.py:1204] (2/4) Exclude cut with ID _4092_210_2151_1_1526643044449_3135759_356-278694-0_sp1.1 from training. Duration: 0.42725 2023-10-02 07:49:43,754 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.1.encoder.layers.0.nonlin_attention.whiten2, num_groups=1, num_channels=256, metric=5.72 vs. limit=15.0 2023-10-02 07:49:44,310 WARNING [train.py:1204] (2/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_116-332008-0 from training. Duration: 0.98 2023-10-02 07:49:48,865 WARNING [train.py:1204] (2/4) Exclude cut with ID _29003_210_19596_1_1532507391720_3894098_358-114789-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 07:49:51,543 WARNING [train.py:1204] (2/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_152-212533-0_sp1.1 from training. Duration: 0.8818125 2023-10-02 07:49:52,918 WARNING [train.py:1204] (2/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_267-214116-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 07:49:55,671 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7543_210_6753_1_1528541907_1399322_165-323619-0_sp0.9 from training. Duration: 0.7666875 2023-10-02 07:49:59,015 WARNING [train.py:1204] (2/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_577-332600-0_sp0.9 from training. Duration: 0.6333125 2023-10-02 07:50:00,737 WARNING [train.py:1204] (2/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_342-100157-0_sp0.9 from training. Duration: 0.6 2023-10-02 07:50:02,124 WARNING [train.py:1204] (2/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_141-89727-0_sp1.1 from training. Duration: 0.6454375 2023-10-02 07:50:03,495 WARNING [train.py:1204] (2/4) Exclude cut with ID _3992_210_1747_1_1526644816031_3731060_107-251434-0_sp1.1 from training. Duration: 0.9 2023-10-02 07:50:07,592 WARNING [train.py:1204] (2/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_401-53423-0_sp1.1 from training. Duration: 0.5 2023-10-02 07:50:08,912 INFO [train.py:1046] (2/4) Epoch 23, batch 3800, loss[loss=0.1607, simple_loss=0.2382, pruned_loss=0.04155, over 23394.00 frames. ], tot_loss[loss=0.1743, simple_loss=0.2511, pruned_loss=0.04871, over 4721368.65 frames. ], batch size: 93, lr: 4.44e-03, grad_scale: 8.0 2023-10-02 07:50:12,929 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.feed_forward2.hidden_balancer.prob, batch_count=804440.0, ans=0.125 2023-10-02 07:50:14,105 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7762_210_1794_1_1528199508_4057408_422-227034-0_sp1.1 from training. Duration: 0.663625 2023-10-02 07:50:17,178 WARNING [train.py:1204] (2/4) Exclude cut with ID _4184_210_2550_1_1526730192013_2219380_102-250299-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 07:50:18,516 WARNING [train.py:1204] (2/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_253-308377-0_sp0.9 from training. Duration: 0.5444375 2023-10-02 07:50:20,422 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_271-297505-0 from training. Duration: 0.99 2023-10-02 07:50:21,856 WARNING [train.py:1204] (2/4) Exclude cut with ID _35941_210_15379_1_1532995294515_4023089_213-142973-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 07:50:24,504 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7544_210_6753_1_1528587722_3576686_162-63018-0_sp1.1 from training. Duration: 0.92725 2023-10-02 07:50:24,591 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_365-217119-0_sp0.9 from training. Duration: 0.7 2023-10-02 07:50:27,924 WARNING [train.py:1204] (2/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_662-275621-0_sp0.9 from training. Duration: 0.4666875 2023-10-02 07:50:27,927 WARNING [train.py:1204] (2/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_374-187555-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 07:50:28,023 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_289-180904-0_sp1.1 from training. Duration: 0.6909375 2023-10-02 07:50:29,429 WARNING [train.py:1204] (2/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_189-47206-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 07:50:29,479 WARNING [train.py:1204] (2/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_234-14174-0_sp1.1 from training. Duration: 0.7454375 2023-10-02 07:50:30,867 WARNING [train.py:1204] (2/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_576-76621-0_sp1.1 from training. Duration: 0.963625 2023-10-02 07:50:32,774 WARNING [train.py:1204] (2/4) Exclude cut with ID _13253_210_1925_1_1530757775819_3577873_253-246386-0 from training. Duration: 0.9 2023-10-02 07:50:35,594 WARNING [train.py:1204] (2/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_372-6869-0_sp1.1 from training. Duration: 0.4 2023-10-02 07:50:36,947 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_21434_210_4238_1_1531916339_1222252_22-54954-0_sp1.1 from training. Duration: 0.936375 2023-10-02 07:50:38,432 WARNING [train.py:1204] (2/4) Exclude cut with ID _29853_210_7497_1_1532505600851_6642110_492-170361-0_sp1.1 from training. Duration: 0.92725 2023-10-02 07:50:41,182 WARNING [train.py:1204] (2/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_505-97987-0_sp0.9 from training. Duration: 0.9555625 2023-10-02 07:50:41,217 WARNING [train.py:1204] (2/4) Exclude cut with ID _8365_210_2064_1_1528592311070_4677690_371-122295-0_sp0.9 from training. Duration: 0.611125 2023-10-02 07:50:42,631 WARNING [train.py:1204] (2/4) Exclude cut with ID _14268_210_9206_1_1530622771285_4714099_637-86630-0_sp1.1 from training. Duration: 0.463625 2023-10-02 07:50:42,642 WARNING [train.py:1204] (2/4) Exclude cut with ID _29857_210_20034_1_1532395886753_3798160_372-5678-0_sp1.1 from training. Duration: 0.963625 2023-10-02 07:50:44,474 WARNING [train.py:1204] (2/4) Exclude cut with ID _52892_210_6721_1_1534378941259_4124129_90-165108-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 07:50:45,885 WARNING [train.py:1204] (2/4) Exclude cut with ID _27052_210_5986_1_1532329197187_3585009_143-339198-0_sp1.1 from training. Duration: 0.963625 2023-10-02 07:50:49,350 WARNING [train.py:1204] (2/4) Exclude cut with ID _14268_210_9206_1_1530622771285_4714099_637-86630-0_sp0.9 from training. Duration: 0.5666875 2023-10-02 07:50:49,352 WARNING [train.py:1204] (2/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_401-255491-0 from training. Duration: 0.7 2023-10-02 07:50:52,134 WARNING [train.py:1204] (2/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_178-6946-0_sp1.1 from training. Duration: 0.97275 2023-10-02 07:50:53,401 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.461e+02 1.883e+02 2.089e+02 2.519e+02 3.424e+02, threshold=4.178e+02, percent-clipped=0.0 2023-10-02 07:50:59,476 WARNING [train.py:1204] (2/4) Exclude cut with ID _27485_210_4916_1_1532739191677_3668269_460-163885-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 07:51:01,885 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.balancer2.prob, batch_count=804640.0, ans=0.125 2023-10-02 07:51:04,420 WARNING [train.py:1204] (2/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_270-26053-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 07:51:05,874 WARNING [train.py:1204] (2/4) Exclude cut with ID _14170_210_5219_1_1530525949868_4469650_412-317077-0 from training. Duration: 0.75 2023-10-02 07:51:08,547 WARNING [train.py:1204] (2/4) Exclude cut with ID _5289_210_2084_1_1527678202736_3779299_142-103317-0 from training. Duration: 0.84 2023-10-02 07:51:08,591 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_150-143438-0_sp1.1 from training. Duration: 0.92725 2023-10-02 07:51:10,061 WARNING [train.py:1204] (2/4) Exclude cut with ID _27894_210_12392_1_1532415496078_6902439_668-269779-0_sp1.1 from training. Duration: 0.97275 2023-10-02 07:51:10,143 WARNING [train.py:1204] (2/4) Exclude cut with ID _18513_210_9206_1_1531618215223_3862620_56-222075-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 07:51:11,504 WARNING [train.py:1204] (2/4) Exclude cut with ID _4092_210_2151_1_1526643044449_3135759_356-278694-0 from training. Duration: 0.47 2023-10-02 07:51:16,089 WARNING [train.py:1204] (2/4) Exclude cut with ID _25808_210_13292_1_1532343514562_7366109_633-47524-0 from training. Duration: 0.99 2023-10-02 07:51:16,101 WARNING [train.py:1204] (2/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_872-264525-0 from training. Duration: 0.65 2023-10-02 07:51:16,140 WARNING [train.py:1204] (2/4) Exclude cut with ID _14632_210_9988_1_1530691279392_3923200_239-232545-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 07:51:18,016 WARNING [train.py:1204] (2/4) Exclude cut with ID _14349_210_8341_1_1530615710818_3442470_153-27432-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 07:51:23,637 WARNING [train.py:1204] (2/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_37-41919-0_sp0.9 from training. Duration: 0.9666875 2023-10-02 07:51:23,695 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_196-289637-0_sp0.9 from training. Duration: 0.8333125 2023-10-02 07:51:25,052 INFO [train.py:1046] (2/4) Epoch 23, batch 3850, loss[loss=0.1611, simple_loss=0.2393, pruned_loss=0.04149, over 24677.00 frames. ], tot_loss[loss=0.1736, simple_loss=0.25, pruned_loss=0.04861, over 4708593.27 frames. ], batch size: 60, lr: 4.44e-03, grad_scale: 8.0 2023-10-02 07:51:25,940 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.1.encoder.layers.0.nonlin_attention.whiten1, num_groups=1, num_channels=192, metric=5.40 vs. limit=10.0 2023-10-02 07:51:28,512 WARNING [train.py:1204] (2/4) Exclude cut with ID _13678_210_7497_1_1530530069226_3463170_371-3221-0_sp1.1 from training. Duration: 0.8181875 2023-10-02 07:51:28,642 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.0.balancer_ff2.min_abs, batch_count=804773.3333333334, ans=0.1 2023-10-02 07:51:29,872 WARNING [train.py:1204] (2/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_557-324258-0 from training. Duration: 0.81 2023-10-02 07:51:31,667 WARNING [train.py:1204] (2/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_1012-279473-0_sp1.1 from training. Duration: 0.6090625 2023-10-02 07:51:31,741 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_66916_210_14656_1_1535725525_326566_7-92649-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 07:51:35,805 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_9460_210_4211_1_1528954587_10049667_480-260269-0_sp1.1 from training. Duration: 0.5090625 2023-10-02 07:51:37,327 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_40896_210_15710_1_1533433956_7100802_121-155001-0_sp1.1 from training. Duration: 0.92725 2023-10-02 07:51:40,298 WARNING [train.py:1204] (2/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_188-171673-0_sp1.1 from training. Duration: 0.52725 2023-10-02 07:51:41,699 WARNING [train.py:1204] (2/4) Exclude cut with ID _47790_210_16187_1_1533862814066_4207519_2-39653-0 from training. Duration: 0.98 2023-10-02 07:51:48,095 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_23850_210_4238_1_1532232597_1632433_112-229012-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 07:51:49,465 WARNING [train.py:1204] (2/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_626-79528-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 07:51:51,388 WARNING [train.py:1204] (2/4) Exclude cut with ID _30304_210_6319_1_1532476827261_7582060_856-90052-0_sp1.1 from training. Duration: 0.963625 2023-10-02 07:51:51,431 WARNING [train.py:1204] (2/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_122-109023-0_sp0.9 from training. Duration: 0.8444375 2023-10-02 07:51:55,649 WARNING [train.py:1204] (2/4) Exclude cut with ID _64414_210_17301_1_1535243252048_3757759_37-70328-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 07:51:57,008 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_20120_210_6753_1_1531643035_5482859_622-344907-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 07:51:57,073 WARNING [train.py:1204] (2/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_410-321956-0_sp1.1 from training. Duration: 0.92725 2023-10-02 07:51:57,085 WARNING [train.py:1204] (2/4) Exclude cut with ID _7562_210_2084_1_1528887767916_3946229_186-326422-0_sp0.9 from training. Duration: 0.9333125 2023-10-02 07:51:58,967 WARNING [train.py:1204] (2/4) Exclude cut with ID _37537_210_2253_1_1533171758568_3421949_127-285192-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 07:52:01,709 WARNING [train.py:1204] (2/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_723-228168-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 07:52:01,776 WARNING [train.py:1204] (2/4) Exclude cut with ID _13678_210_7497_1_1530530069226_3463170_426-246984-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 07:52:03,233 WARNING [train.py:1204] (2/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_399-156791-0_sp0.9 from training. Duration: 0.92225 2023-10-02 07:52:03,893 WARNING [train.py:1204] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_391-22148-0 from training. Duration: 0.77 2023-10-02 07:52:05,213 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_3475_210_2028_1_1526193578_3622569_64-297072-0 from training. Duration: 0.91 2023-10-02 07:52:05,279 WARNING [train.py:1204] (2/4) Exclude cut with ID _17875_210_2087_1_1531224012907_1821339_225-280015-0_sp1.1 from training. Duration: 0.963625 2023-10-02 07:52:06,518 WARNING [train.py:1204] (2/4) Exclude cut with ID _50655_210_18628_1_1534229942193_3772154_325-246169-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 07:52:09,284 WARNING [train.py:1204] (2/4) Exclude cut with ID _29529_210_7234_1_1532393961455_3977460_597-257348-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 07:52:10,632 WARNING [train.py:1204] (2/4) Exclude cut with ID _58895_210_8750_1_1535184081320_3649089_77-173485-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 07:52:10,667 WARNING [train.py:1204] (2/4) Exclude cut with ID _6270_210_2084_1_1527850911573_3856420_26-277343-0 from training. Duration: 0.82 2023-10-02 07:52:13,483 WARNING [train.py:1204] (2/4) Exclude cut with ID _14368_210_4220_1_1530532714870_3595529_144-3287-0 from training. Duration: 0.67 2023-10-02 07:52:14,915 WARNING [train.py:1204] (2/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_293-145278-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 07:52:16,375 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_40047_210_2290_1_1533290519_2374311_257-335162-0 from training. Duration: 0.95 2023-10-02 07:52:17,777 WARNING [train.py:1204] (2/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_619-344499-0_sp0.9 from training. Duration: 0.7 2023-10-02 07:52:23,257 WARNING [train.py:1204] (2/4) Exclude cut with ID _19504_210_9994_1_1531648863242_3325220_456-315379-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 07:52:24,705 WARNING [train.py:1204] (2/4) Exclude cut with ID _55857_210_2346_1_1534644007831_6898209_631-221692-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 07:52:28,826 WARNING [train.py:1204] (2/4) Exclude cut with ID _64631_210_27767_1_1535349580068_4165160_324-3682-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 07:52:28,850 WARNING [train.py:1204] (2/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_317-211107-0 from training. Duration: 0.92 2023-10-02 07:52:30,497 WARNING [train.py:1204] (2/4) Exclude cut with ID _6789_210_1534_1_1527857937218_3118950_311-61262-0 from training. Duration: 0.65 2023-10-02 07:52:33,707 WARNING [train.py:1204] (2/4) Exclude cut with ID _28323_210_16187_1_1532480395667_4100300_365-68882-0_sp1.1 from training. Duration: 0.92725 2023-10-02 07:52:33,736 WARNING [train.py:1204] (2/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_816-181825-0_sp1.1 from training. Duration: 0.92725 2023-10-02 07:52:38,185 WARNING [train.py:1204] (2/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_132-269633-0_sp1.1 from training. Duration: 0.6181875 2023-10-02 07:52:38,203 WARNING [train.py:1204] (2/4) Exclude cut with ID _44585_210_18628_1_1533605350387_4518199_280-73534-0_sp1.1 from training. Duration: 0.8090625 2023-10-02 07:52:38,252 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_68095_210_20185_1_1535718226_8888855_1009-164755-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 07:52:39,635 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_40223_210_6228_1_1533298404_4812267_400-103813-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 07:52:39,636 WARNING [train.py:1204] (2/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_491-261923-0_sp1.1 from training. Duration: 0.8909375 2023-10-02 07:52:39,641 WARNING [train.py:1204] (2/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_712-342563-0 from training. Duration: 0.97 2023-10-02 07:52:39,714 WARNING [train.py:1204] (2/4) Exclude cut with ID _41161_210_20034_1_1533431249657_3555314_175-190032-0_sp1.1 from training. Duration: 0.963625 2023-10-02 07:52:40,976 INFO [train.py:1046] (2/4) Epoch 23, batch 3900, loss[loss=0.1534, simple_loss=0.2039, pruned_loss=0.0515, over 19240.00 frames. ], tot_loss[loss=0.1732, simple_loss=0.2493, pruned_loss=0.04859, over 4699952.67 frames. ], batch size: 388, lr: 4.44e-03, grad_scale: 8.0 2023-10-02 07:52:41,134 WARNING [train.py:1204] (2/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_788-298907-0 from training. Duration: 0.89 2023-10-02 07:52:41,157 WARNING [train.py:1204] (2/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_225-70961-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 07:52:41,163 WARNING [train.py:1204] (2/4) Exclude cut with ID _58583_210_5236_1_1534726835266_3574249_235-175994-0_sp1.1 from training. Duration: 0.92725 2023-10-02 07:52:41,900 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.3.encoder.layers.0.self_attn1.whiten, num_groups=1, num_channels=512, metric=18.61 vs. limit=22.5 2023-10-02 07:52:42,585 WARNING [train.py:1204] (2/4) Exclude cut with ID _12302_210_5219_1_1529924446466_8107280_907-182949-0_sp1.1 from training. Duration: 0.836375 2023-10-02 07:52:42,622 WARNING [train.py:1204] (2/4) Exclude cut with ID _54871_210_22356_1_1534665798253_3856359_94-193692-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 07:52:43,975 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_4528_210_3273_1_1527076654_3276777_342-168068-0_sp0.9 from training. Duration: 0.9666875 2023-10-02 07:52:45,272 WARNING [train.py:1204] (2/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_368-74239-0_sp1.1 from training. Duration: 0.92725 2023-10-02 07:52:45,273 WARNING [train.py:1204] (2/4) Exclude cut with ID _44831_210_13427_1_1533816011549_3689035_291-295928-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 07:52:45,351 WARNING [train.py:1204] (2/4) Exclude cut with ID _56406_210_9288_1_1534899618664_3496149_455-131997-0_sp1.1 from training. Duration: 0.97275 2023-10-02 07:52:45,359 WARNING [train.py:1204] (2/4) Exclude cut with ID _6473_210_1741_1_1527901307396_3815380_108-17335-0 from training. Duration: 0.65 2023-10-02 07:52:46,627 WARNING [train.py:1204] (2/4) Exclude cut with ID _14224_210_4381_1_1530529062037_4264010_286-135600-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 07:52:53,369 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_14056_210_4163_1_1530604483_7572827_928-321675-0_sp1.1 from training. Duration: 0.936375 2023-10-02 07:52:53,448 WARNING [train.py:1204] (2/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_81-300212-0_sp1.1 from training. Duration: 0.5454375 2023-10-02 07:52:53,488 WARNING [train.py:1204] (2/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_29-280622-0_sp0.9 from training. Duration: 0.911125 2023-10-02 07:52:56,227 WARNING [train.py:1204] (2/4) Exclude cut with ID _61079_210_4525_1_1534921008198_4011419_228-176447-0_sp1.1 from training. Duration: 0.936375 2023-10-02 07:52:59,030 WARNING [train.py:1204] (2/4) Exclude cut with ID _8394_210_6702_1_1528590504316_3540189_372-198878-0_sp1.1 from training. Duration: 0.5454375 2023-10-02 07:52:59,041 WARNING [train.py:1204] (2/4) Exclude cut with ID _64521_210_14699_1_1535243427259_3815328_244-208725-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 07:53:00,421 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8275_210_4198_1_1528455320_3892571_491-17707-0_sp0.9 from training. Duration: 0.711125 2023-10-02 07:53:01,772 WARNING [train.py:1204] (2/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_375-245677-0 from training. Duration: 0.81 2023-10-02 07:53:01,778 WARNING [train.py:1204] (2/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_690-286055-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 07:53:01,907 WARNING [train.py:1204] (2/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_89-321955-0 from training. Duration: 0.8 2023-10-02 07:53:03,242 WARNING [train.py:1204] (2/4) Exclude cut with ID _18895_210_6228_1_1531357274234_3784930_213-98721-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 07:53:03,307 WARNING [train.py:1204] (2/4) Exclude cut with ID _19708_210_9206_1_1532136650733_4238364_347-65240-0 from training. Duration: 0.97 2023-10-02 07:53:04,644 WARNING [train.py:1204] (2/4) Exclude cut with ID _64135_210_2253_1_1535677267876_7608972_498-5039-0 from training. Duration: 0.78 2023-10-02 07:53:09,847 WARNING [train.py:1204] (2/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_146-276944-0_sp1.1 from training. Duration: 0.7545625 2023-10-02 07:53:11,222 WARNING [train.py:1204] (2/4) Exclude cut with ID _63171_210_15488_1_1535104792794_3295110_155-34488-0_sp1.1 from training. Duration: 0.97275 2023-10-02 07:53:11,234 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_5438_210_1794_1_1527336687_2600009_233-58072-0_sp0.9 from training. Duration: 0.8555625 2023-10-02 07:53:11,286 WARNING [train.py:1204] (2/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_63-101996-0_sp0.9 from training. Duration: 0.97775 2023-10-02 07:53:14,025 WARNING [train.py:1204] (2/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_271-269084-0_sp1.1 from training. Duration: 0.7545625 2023-10-02 07:53:16,838 WARNING [train.py:1204] (2/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_345-167986-0_sp1.1 from training. Duration: 0.763625 2023-10-02 07:53:18,864 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.2.encoder.layers.2.feed_forward1.out_whiten, num_groups=1, num_channels=384, metric=8.42 vs. limit=15.0 2023-10-02 07:53:19,568 WARNING [train.py:1204] (2/4) Exclude cut with ID _39290_210_4220_1_1533862778760_3075118_92-311600-0_sp1.1 from training. Duration: 0.8 2023-10-02 07:53:19,574 WARNING [train.py:1204] (2/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_209-65306-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 07:53:19,745 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.bypass.skip_rate, batch_count=805240.0, ans=0.07 2023-10-02 07:53:20,057 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.5.encoder.layers.0.nonlin_attention.whiten1, num_groups=1, num_channels=192, metric=4.36 vs. limit=10.0 2023-10-02 07:53:21,303 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_26433_210_12108_1_1532157158_4056480_317-335807-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 07:53:24,062 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.505e+02 1.850e+02 2.030e+02 2.348e+02 4.332e+02, threshold=4.060e+02, percent-clipped=1.0 2023-10-02 07:53:27,558 WARNING [train.py:1204] (2/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_238-94127-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 07:53:28,809 WARNING [train.py:1204] (2/4) Exclude cut with ID _41314_210_12651_1_1533459476543_3766460_207-132038-0_sp1.1 from training. Duration: 0.8545625 2023-10-02 07:53:34,851 WARNING [train.py:1204] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_167-137255-0_sp1.1 from training. Duration: 0.5818125 2023-10-02 07:53:36,263 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_2009_210_2071_1_1525481134_4588937_385-302100-0_sp0.9 from training. Duration: 0.9666875 2023-10-02 07:53:45,327 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_15517_210_6753_1_1530954008_3487336_253-110617-0_sp1.1 from training. Duration: 0.936375 2023-10-02 07:53:47,547 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.3.encoder.layers.0.self_attn1.whiten, num_groups=1, num_channels=512, metric=23.50 vs. limit=22.5 2023-10-02 07:53:48,235 WARNING [train.py:1204] (2/4) Exclude cut with ID _39290_210_4220_1_1533862778760_3075118_92-311600-0_sp0.9 from training. Duration: 0.97775 2023-10-02 07:53:48,275 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_42-142473-0 from training. Duration: 0.72 2023-10-02 07:53:49,600 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_2296_210_2087_1_1525503448_3397335_57-185430-0 from training. Duration: 0.6 2023-10-02 07:53:49,612 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_11305_210_1794_1_1529750428_7178148_257-312870-0_sp0.9 from training. Duration: 0.97775 2023-10-02 07:53:51,595 WARNING [train.py:1204] (2/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_222-198127-0 from training. Duration: 0.84 2023-10-02 07:53:51,676 WARNING [train.py:1204] (2/4) Exclude cut with ID _65263_210_22356_1_1535763598691_3921037_423-202699-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 07:53:52,953 WARNING [train.py:1204] (2/4) Exclude cut with ID _15037_210_2253_1_1530752294104_3267650_13-130361-0 from training. Duration: 0.99 2023-10-02 07:53:54,638 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.ff2_skip_rate, batch_count=805440.0, ans=0.0 2023-10-02 07:53:56,301 INFO [train.py:1046] (2/4) Epoch 23, batch 3950, loss[loss=0.1615, simple_loss=0.2362, pruned_loss=0.04337, over 23337.00 frames. ], tot_loss[loss=0.1717, simple_loss=0.2478, pruned_loss=0.04779, over 4701742.28 frames. ], batch size: 119, lr: 4.44e-03, grad_scale: 8.0 2023-10-02 07:53:57,953 WARNING [train.py:1204] (2/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_628-198080-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 07:53:58,176 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.balancer1.prob, batch_count=805440.0, ans=0.125 2023-10-02 07:53:59,322 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_3171_210_2087_1_1526109084_2330007_37-64334-0 from training. Duration: 0.82 2023-10-02 07:53:59,374 WARNING [train.py:1204] (2/4) Exclude cut with ID _5287_210_2084_1_1527418883040_3878670_12-221186-0_sp1.1 from training. Duration: 0.8909375 2023-10-02 07:54:02,079 WARNING [train.py:1204] (2/4) Exclude cut with ID _6653_210_2629_1_1527919172441_6732679_1103-56445-0_sp1.1 from training. Duration: 0.663625 2023-10-02 07:54:04,672 WARNING [train.py:1204] (2/4) Exclude cut with ID _65263_210_22356_1_1535763598691_3921037_169-140068-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 07:54:10,848 WARNING [train.py:1204] (2/4) Exclude cut with ID IC0671W0118-331961 from training. Duration: 0.896 2023-10-02 07:54:10,908 WARNING [train.py:1204] (2/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_402-102311-0_sp1.1 from training. Duration: 0.6090625 2023-10-02 07:54:12,178 WARNING [train.py:1204] (2/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_202-293523-0 from training. Duration: 0.72 2023-10-02 07:54:12,248 WARNING [train.py:1204] (2/4) Exclude cut with ID IC0679W0038-335846 from training. Duration: 0.768 2023-10-02 07:54:12,733 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.5.encoder.layers.0.self_attn2.whiten, num_groups=1, num_channels=256, metric=12.47 vs. limit=22.5 2023-10-02 07:54:13,535 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_37357_210_17024_1_1533089504_2316121_79-198866-0_sp1.1 from training. Duration: 0.936375 2023-10-02 07:54:16,261 WARNING [train.py:1204] (2/4) Exclude cut with ID _7606_210_4915_1_1528894253277_5080690_292-271285-0_sp1.1 from training. Duration: 0.963625 2023-10-02 07:54:16,271 WARNING [train.py:1204] (2/4) Exclude cut with ID _3541_210_2477_1_1526475408709_3263973_377-296839-0_sp0.9 from training. Duration: 0.611125 2023-10-02 07:54:16,281 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_45886_210_17024_1_1533704917_2435333_58-131069-0_sp1.1 from training. Duration: 0.963625 2023-10-02 07:54:19,146 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6961_210_2087_1_1527937168_6815011_395-185060-0 from training. Duration: 0.95 2023-10-02 07:54:22,481 WARNING [train.py:1204] (2/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_304-266479-0_sp1.1 from training. Duration: 0.97275 2023-10-02 07:54:23,757 WARNING [train.py:1204] (2/4) Exclude cut with ID _3867_210_1883_1_1526641200575_4475830_319-349850-0_sp1.1 from training. Duration: 0.6090625 2023-10-02 07:54:23,772 WARNING [train.py:1204] (2/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_304-59227-0_sp0.9 from training. Duration: 0.8555625 2023-10-02 07:54:25,195 WARNING [train.py:1204] (2/4) Exclude cut with ID _8021_210_7994_1_1528376451358_3653069_100-140231-0_sp0.9 from training. Duration: 0.7444375 2023-10-02 07:54:26,545 WARNING [train.py:1204] (2/4) Exclude cut with ID _8833_210_5219_1_1528592665210_7553059_329-125156-0_sp0.9 from training. Duration: 0.911125 2023-10-02 07:54:37,241 WARNING [train.py:1204] (2/4) Exclude cut with ID _14459_210_2228_1_1531443641946_7516700_792-287234-0_sp1.1 from training. Duration: 0.92725 2023-10-02 07:54:37,261 WARNING [train.py:1204] (2/4) Exclude cut with ID _19708_210_9206_1_1532136650733_4238364_347-65240-0_sp1.1 from training. Duration: 0.8818125 2023-10-02 07:54:40,725 WARNING [train.py:1204] (2/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_70-27732-0 from training. Duration: 0.94 2023-10-02 07:54:42,899 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=805640.0, ans=0.1 2023-10-02 07:54:46,599 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_397-132565-0 from training. Duration: 0.99 2023-10-02 07:54:46,602 WARNING [train.py:1204] (2/4) Exclude cut with ID _49728_210_12651_1_1534041034294_6788150_624-195301-0 from training. Duration: 0.85 2023-10-02 07:54:46,628 WARNING [train.py:1204] (2/4) Exclude cut with ID _55429_210_10626_1_1534467588527_7174143_377-169391-0_sp1.1 from training. Duration: 0.87275 2023-10-02 07:54:46,832 INFO [scaling.py:1118] (2/4) WithLoss: name=encoder.encoders.2.encoder.layers.2.self_attn_weights, loss-sum=0.000e+00 2023-10-02 07:54:47,994 WARNING [train.py:1204] (2/4) Exclude cut with ID _49376_210_5986_1_1533985278732_3370740_354-344695-0_sp1.1 from training. Duration: 0.9 2023-10-02 07:54:54,408 WARNING [train.py:1204] (2/4) Exclude cut with ID _4697_210_2069_1_1526815801953_3823098_276-96593-0_sp1.1 from training. Duration: 0.7 2023-10-02 07:54:54,425 WARNING [train.py:1204] (2/4) Exclude cut with ID _15012_210_1968_1_1530856820429_5095350_688-178849-0_sp0.9 from training. Duration: 0.711125 2023-10-02 07:54:55,768 WARNING [train.py:1204] (2/4) Exclude cut with ID _25565_210_12392_1_1532423009382_3952098_372-69023-0_sp1.1 from training. Duration: 0.936375 2023-10-02 07:54:55,795 WARNING [train.py:1204] (2/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_61-327062-0_sp1.1 from training. Duration: 0.736375 2023-10-02 07:54:55,828 WARNING [train.py:1204] (2/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_215-141059-0 from training. Duration: 0.62 2023-10-02 07:54:56,047 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.out_combiner.scale_min, batch_count=805706.6666666666, ans=0.2 2023-10-02 07:54:59,434 WARNING [train.py:1204] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_6-226750-0_sp1.1 from training. Duration: 0.8545625 2023-10-02 07:55:00,982 WARNING [train.py:1204] (2/4) Exclude cut with ID _14348_210_8341_1_1530599434996_3778640_240-232370-0_sp1.1 from training. Duration: 0.763625 2023-10-02 07:55:05,147 WARNING [train.py:1204] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_468-189058-0 from training. Duration: 0.96 2023-10-02 07:55:11,772 INFO [train.py:1046] (2/4) Epoch 23, batch 4000, loss[loss=0.1862, simple_loss=0.2511, pruned_loss=0.06059, over 23771.00 frames. ], tot_loss[loss=0.1721, simple_loss=0.2482, pruned_loss=0.04798, over 4698990.74 frames. ], batch size: 164, lr: 4.44e-03, grad_scale: 16.0 2023-10-02 07:55:13,255 WARNING [train.py:1204] (2/4) Exclude cut with ID _21987_210_5854_1_1531821500482_3637038_247-93340-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 07:55:19,294 WARNING [train.py:1204] (2/4) Exclude cut with ID _66868_210_9527_1_1535509287954_4561140_436-191602-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 07:55:19,550 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.ff2_skip_rate, batch_count=805773.3333333334, ans=0.0 2023-10-02 07:55:24,710 WARNING [train.py:1204] (2/4) Exclude cut with ID _18895_210_6228_1_1531357274234_3784930_394-58203-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 07:55:24,769 WARNING [train.py:1204] (2/4) Exclude cut with ID _58605_210_12210_1_1534723195939_3904259_258-281498-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 07:55:26,200 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_33348_210_5632_1_1533086912_3324030_245-292752-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 07:55:26,221 WARNING [train.py:1204] (2/4) Exclude cut with ID _67831_210_12392_1_1535612464202_3092612_346-2148-0 from training. Duration: 0.93 2023-10-02 07:55:28,138 WARNING [train.py:1204] (2/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_369-222060-0_sp0.9 from training. Duration: 0.788875 2023-10-02 07:55:29,473 WARNING [train.py:1204] (2/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_245-174553-0 from training. Duration: 0.88 2023-10-02 07:55:29,479 WARNING [train.py:1204] (2/4) Exclude cut with ID _3541_210_2477_1_1526475408709_3263973_231-104736-0_sp1.1 from training. Duration: 0.7090625 2023-10-02 07:55:29,484 WARNING [train.py:1204] (2/4) Exclude cut with ID _44585_210_18628_1_1533605350387_4518199_280-73534-0 from training. Duration: 0.89 2023-10-02 07:55:30,974 WARNING [train.py:1204] (2/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_22-201476-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 07:55:33,732 WARNING [train.py:1204] (2/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_325-62802-0_sp0.9 from training. Duration: 0.9666875 2023-10-02 07:55:33,744 WARNING [train.py:1204] (2/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_199-274062-0_sp1.1 from training. Duration: 0.97275 2023-10-02 07:55:33,747 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_8-319957-0_sp1.1 from training. Duration: 0.87275 2023-10-02 07:55:33,769 WARNING [train.py:1204] (2/4) Exclude cut with ID _29343_210_4381_1_1532602757752_3674109_224-349726-0_sp1.1 from training. Duration: 0.936375 2023-10-02 07:55:33,773 WARNING [train.py:1204] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_520-126729-0_sp1.1 from training. Duration: 0.636375 2023-10-02 07:55:35,179 WARNING [train.py:1204] (2/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_239-22076-0_sp1.1 from training. Duration: 0.77275 2023-10-02 07:55:36,518 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9012W0011-501146 from training. Duration: 0.896 2023-10-02 07:55:36,601 WARNING [train.py:1204] (2/4) Exclude cut with ID _29857_210_20034_1_1532395886753_3798160_279-185801-0_sp1.1 from training. Duration: 0.82725 2023-10-02 07:55:36,659 WARNING [train.py:1204] (2/4) Exclude cut with ID _43552_210_8913_1_1533726031059_3523429_261-223933-0_sp1.1 from training. Duration: 0.92725 2023-10-02 07:55:40,848 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9029W0025-509426 from training. Duration: 0.896 2023-10-02 07:55:41,986 WARNING [train.py:1204] (2/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_337-181502-0_sp1.1 from training. Duration: 0.5909375 2023-10-02 07:55:42,009 WARNING [train.py:1204] (2/4) Exclude cut with ID _68635_210_9317_1_1535770813410_4315870_159-332599-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 07:55:45,075 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.conv_module1.balancer2.min_positive, batch_count=805906.6666666666, ans=0.05 2023-10-02 07:55:46,237 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_161-276996-0 from training. Duration: 0.95 2023-10-02 07:55:47,578 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7088_210_1874_1_1527939776_1101113_127-68870-0_sp1.1 from training. Duration: 0.936375 2023-10-02 07:55:50,915 WARNING [train.py:1204] (2/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_322-217220-0_sp1.1 from training. Duration: 0.7818125 2023-10-02 07:55:50,971 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9072W0002-530636 from training. Duration: 0.768 2023-10-02 07:55:52,310 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_340-336408-0_sp1.1 from training. Duration: 0.8454375 2023-10-02 07:55:52,371 WARNING [train.py:1204] (2/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_395-37222-0 from training. Duration: 0.76 2023-10-02 07:55:52,374 WARNING [train.py:1204] (2/4) Exclude cut with ID _64099_210_17301_1_1535526977656_1578188_159-209156-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 07:55:53,758 WARNING [train.py:1204] (2/4) Exclude cut with ID _53950_210_5816_1_1534417264919_3862951_28-29681-0_sp1.1 from training. Duration: 0.92725 2023-10-02 07:55:55,525 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.588e+02 1.792e+02 1.999e+02 2.170e+02 4.369e+02, threshold=3.999e+02, percent-clipped=1.0 2023-10-02 07:55:55,609 WARNING [train.py:1204] (2/4) Exclude cut with ID _4681_210_3525_1_1527250689464_120889_15-126760-0_sp1.1 from training. Duration: 0.736375 2023-10-02 07:55:57,086 WARNING [train.py:1204] (2/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_242-3472-0_sp1.1 from training. Duration: 0.663625 2023-10-02 07:55:57,127 WARNING [train.py:1204] (2/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_436-186311-0_sp0.9 from training. Duration: 0.72225 2023-10-02 07:55:57,151 WARNING [train.py:1204] (2/4) Exclude cut with ID _3675_210_4557_1_1526639934408_4182089_452-172147-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 07:55:59,930 WARNING [train.py:1204] (2/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_80-147301-0 from training. Duration: 0.84 2023-10-02 07:55:59,951 WARNING [train.py:1204] (2/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_351-107588-0_sp1.1 from training. Duration: 0.92725 2023-10-02 07:56:00,210 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.conv_module2.balancer1.prob, batch_count=805973.3333333334, ans=0.125 2023-10-02 07:56:01,311 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9111W0002-550781 from training. Duration: 0.896 2023-10-02 07:56:01,523 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.balancer_ff2.min_abs, batch_count=805973.3333333334, ans=0.1 2023-10-02 07:56:01,872 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.5.encoder.layers.0.nonlin_attention.whiten1, num_groups=1, num_channels=192, metric=4.35 vs. limit=10.0 2023-10-02 07:56:03,441 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.3.encoder.layers.0.nonlin_attention.whiten2, num_groups=1, num_channels=512, metric=8.60 vs. limit=15.0 2023-10-02 07:56:06,768 WARNING [train.py:1204] (2/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_465-156500-0_sp1.1 from training. Duration: 0.6545625 2023-10-02 07:56:09,578 WARNING [train.py:1204] (2/4) Exclude cut with ID _13938_210_9988_1_1530518718978_3693960_304-112784-0_sp1.1 from training. Duration: 0.4181875 2023-10-02 07:56:11,611 WARNING [train.py:1204] (2/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_202-67017-0_sp1.1 from training. Duration: 0.7454375 2023-10-02 07:56:13,328 WARNING [train.py:1204] (2/4) Exclude cut with ID _58414_210_8254_1_1535421609162_7151182_448-4358-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 07:56:14,722 WARNING [train.py:1204] (2/4) Exclude cut with ID _66840_210_5219_1_1535452168426_4196349_203-243234-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 07:56:14,810 WARNING [train.py:1204] (2/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_201-213513-0_sp1.1 from training. Duration: 0.97275 2023-10-02 07:56:19,476 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6291_210_3273_1_1527683649_1078498_117-187560-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 07:56:20,916 WARNING [train.py:1204] (2/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_135-174080-0_sp0.9 from training. Duration: 0.588875 2023-10-02 07:56:20,950 WARNING [train.py:1204] (2/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_45-265684-0 from training. Duration: 0.72 2023-10-02 07:56:21,207 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.self_attn_weights.pos_emb_skip_rate, batch_count=806040.0, ans=0.0 2023-10-02 07:56:23,831 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_435-313305-0_sp1.1 from training. Duration: 0.6454375 2023-10-02 07:56:23,850 WARNING [train.py:1204] (2/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_235-307337-0_sp1.1 from training. Duration: 0.8545625 2023-10-02 07:56:25,082 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7762_210_1794_1_1528199508_4057408_419-125009-0_sp0.9 from training. Duration: 0.92225 2023-10-02 07:56:26,962 INFO [train.py:1046] (2/4) Epoch 23, batch 4050, loss[loss=0.1874, simple_loss=0.2547, pruned_loss=0.06006, over 23762.00 frames. ], tot_loss[loss=0.172, simple_loss=0.2484, pruned_loss=0.04778, over 4712426.28 frames. ], batch size: 212, lr: 4.44e-03, grad_scale: 16.0 2023-10-02 07:56:27,063 WARNING [train.py:1204] (2/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_215-141059-0_sp1.1 from training. Duration: 0.563625 2023-10-02 07:56:28,407 WARNING [train.py:1204] (2/4) Exclude cut with ID _27013_210_1389_1_1532437173240_3252649_222-125573-0_sp1.1 from training. Duration: 0.8909375 2023-10-02 07:56:31,245 WARNING [train.py:1204] (2/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_126-274694-0_sp1.1 from training. Duration: 0.8909375 2023-10-02 07:56:31,607 INFO [scaling.py:1118] (2/4) WithLoss: name=encoder.encoders.3.encoder.layers.1.self_attn_weights, loss-sum=0.000e+00 2023-10-02 07:56:32,761 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_2592_210_2290_1_1525697025_4882925_157-70512-0_sp1.1 from training. Duration: 0.82725 2023-10-02 07:56:34,083 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_292-265928-0_sp0.9 from training. Duration: 0.5666875 2023-10-02 07:56:36,897 WARNING [train.py:1204] (2/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_1211-226066-0_sp1.1 from training. Duration: 0.6909375 2023-10-02 07:56:36,967 WARNING [train.py:1204] (2/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_394-265747-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 07:56:42,507 WARNING [train.py:1204] (2/4) Exclude cut with ID _48782_210_12658_1_1534467701544_2300422_239-67139-0_sp1.1 from training. Duration: 0.97275 2023-10-02 07:56:44,476 WARNING [train.py:1204] (2/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_329-113173-0_sp1.1 from training. Duration: 0.563625 2023-10-02 07:56:47,947 WARNING [train.py:1204] (2/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_81-75849-0_sp0.9 from training. Duration: 0.4333125 2023-10-02 07:56:49,380 WARNING [train.py:1204] (2/4) Exclude cut with ID _9235_210_5219_1_1528891154253_8046120_1003-101638-0 from training. Duration: 0.73 2023-10-02 07:56:49,403 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9309W0189-640316 from training. Duration: 0.64 2023-10-02 07:56:52,704 WARNING [train.py:1204] (2/4) Exclude cut with ID _45329_210_7005_1_1534157951211_6774339_503-287883-0_sp1.1 from training. Duration: 0.8 2023-10-02 07:56:52,983 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.nonlin_attention.balancer.max_positive, batch_count=806173.3333333334, ans=0.95 2023-10-02 07:56:53,033 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.conv_module1.balancer2.prob, batch_count=806173.3333333334, ans=0.125 2023-10-02 07:56:54,716 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.4.encoder.layers.2.self_attn1.whiten, num_groups=1, num_channels=384, metric=11.82 vs. limit=22.5 2023-10-02 07:56:57,017 WARNING [train.py:1204] (2/4) Exclude cut with ID _8833_210_5219_1_1528592665210_7553059_611-80313-0 from training. Duration: 0.64 2023-10-02 07:56:58,824 WARNING [train.py:1204] (2/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_335-106626-0_sp1.1 from training. Duration: 0.963625 2023-10-02 07:57:01,597 WARNING [train.py:1204] (2/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_237-44332-0_sp1.1 from training. Duration: 0.8545625 2023-10-02 07:57:05,698 WARNING [train.py:1204] (2/4) Exclude cut with ID _46952_210_5986_1_1534230010357_3107379_178-147341-0_sp1.1 from training. Duration: 0.97275 2023-10-02 07:57:05,739 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_13091_210_7925_1_1530609447_8987354_946-154426-0_sp1.1 from training. Duration: 0.936375 2023-10-02 07:57:05,772 WARNING [train.py:1204] (2/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_326-17061-0_sp1.1 from training. Duration: 0.8545625 2023-10-02 07:57:07,224 WARNING [train.py:1204] (2/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_38-169920-0_sp1.1 from training. Duration: 0.82725 2023-10-02 07:57:07,449 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.feed_forward1.out_proj.dropout_p, batch_count=806240.0, ans=0.1 2023-10-02 07:57:08,920 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.nonlin_attention.balancer.prob, batch_count=806240.0, ans=0.125 2023-10-02 07:57:10,050 WARNING [train.py:1204] (2/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_283-220372-0 from training. Duration: 0.56 2023-10-02 07:57:10,058 WARNING [train.py:1204] (2/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_252-216674-0_sp1.1 from training. Duration: 0.4909375 2023-10-02 07:57:13,333 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_202-91290-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 07:57:14,493 WARNING [train.py:1204] (2/4) Exclude cut with ID _41314_210_12651_1_1533459476543_3766460_80-74184-0 from training. Duration: 0.83 2023-10-02 07:57:16,722 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.attention_skip_rate, batch_count=806306.6666666666, ans=0.0 2023-10-02 07:57:19,075 WARNING [train.py:1204] (2/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_411-271358-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 07:57:26,965 WARNING [train.py:1204] (2/4) Exclude cut with ID _68777_210_4353_1_1535810390731_3117904_129-166012-0 from training. Duration: 0.85 2023-10-02 07:57:27,049 WARNING [train.py:1204] (2/4) Exclude cut with ID _44982_210_1511_1_1534312820108_3726029_410-109759-0_sp1.1 from training. Duration: 0.963625 2023-10-02 07:57:27,054 WARNING [train.py:1204] (2/4) Exclude cut with ID _14224_210_4381_1_1530529062037_4264010_364-22817-0_sp1.1 from training. Duration: 0.6454375 2023-10-02 07:57:29,839 WARNING [train.py:1204] (2/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_89-242335-0 from training. Duration: 0.95 2023-10-02 07:57:29,846 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_12868_210_6753_1_1530702479_3610564_151-325626-0 from training. Duration: 0.97 2023-10-02 07:57:29,847 WARNING [train.py:1204] (2/4) Exclude cut with ID _6275_210_2084_1_1528110062213_7669049_786-264332-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 07:57:31,422 WARNING [train.py:1204] (2/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_35-153886-0_sp1.1 from training. Duration: 0.763625 2023-10-02 07:57:34,222 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_24007_210_1794_1_1531918925_1663875_116-100916-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 07:57:34,251 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_162-262717-0_sp1.1 from training. Duration: 0.7545625 2023-10-02 07:57:39,892 WARNING [train.py:1204] (2/4) Exclude cut with ID _8263_210_1664_1_1528438953494_294100_40-204731-0 from training. Duration: 0.5 2023-10-02 07:57:41,180 INFO [train.py:1046] (2/4) Epoch 23, batch 4100, loss[loss=0.1651, simple_loss=0.2369, pruned_loss=0.04667, over 20540.00 frames. ], tot_loss[loss=0.1731, simple_loss=0.2494, pruned_loss=0.0484, over 4705920.93 frames. ], batch size: 45, lr: 4.44e-03, grad_scale: 16.0 2023-10-02 07:57:41,440 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.1.balancer1.prob, batch_count=806440.0, ans=0.125 2023-10-02 07:57:42,570 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_12868_210_6753_1_1530702479_3610564_416-293746-0 from training. Duration: 0.9 2023-10-02 07:57:42,881 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.feed_forward1.hidden_balancer.prob, batch_count=806440.0, ans=0.125 2023-10-02 07:57:44,108 WARNING [train.py:1204] (2/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_605-219732-0 from training. Duration: 0.94 2023-10-02 07:57:45,938 WARNING [train.py:1204] (2/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_454-123002-0 from training. Duration: 0.69 2023-10-02 07:57:45,947 WARNING [train.py:1204] (2/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_25-304273-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 07:57:45,996 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6860_210_4852_1_1527917457_3833327_451-39702-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 07:57:46,027 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_304-16505-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 07:57:47,380 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6291_210_3273_1_1527683649_1078498_4-266348-0_sp1.1 from training. Duration: 0.7090625 2023-10-02 07:57:47,466 WARNING [train.py:1204] (2/4) Exclude cut with ID ID0149W0001-753701 from training. Duration: 0.989 2023-10-02 07:57:52,178 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_41251_210_15368_1_1533429767_4820695_394-55393-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 07:57:52,256 WARNING [train.py:1204] (2/4) Exclude cut with ID _8793_210_6310_1_1528542985139_2310009_121-179782-0_sp1.1 from training. Duration: 0.72725 2023-10-02 07:57:52,276 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_2729_210_3528_1_1525863505_3882214_421-73047-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 07:57:54,135 WARNING [train.py:1204] (2/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_278-154492-0_sp0.9 from training. Duration: 0.8444375 2023-10-02 07:57:54,425 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.conv_skip_rate, batch_count=806440.0, ans=0.0 2023-10-02 07:57:58,438 WARNING [train.py:1204] (2/4) Exclude cut with ID _14170_210_5219_1_1530525949868_4469650_412-317077-0_sp0.9 from training. Duration: 0.8333125 2023-10-02 07:58:00,223 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_727-138317-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 07:58:00,270 WARNING [train.py:1204] (2/4) Exclude cut with ID _25808_210_13292_1_1532343514562_7366109_345-335048-0_sp1.1 from training. Duration: 0.92725 2023-10-02 07:58:01,705 WARNING [train.py:1204] (2/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_506-23200-0 from training. Duration: 0.97 2023-10-02 07:58:01,790 WARNING [train.py:1204] (2/4) Exclude cut with ID _44718_210_5816_1_1534050051869_6469030_643-245187-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 07:58:03,054 WARNING [train.py:1204] (2/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_42-186162-0_sp1.1 from training. Duration: 0.836375 2023-10-02 07:58:03,067 WARNING [train.py:1204] (2/4) Exclude cut with ID _3231_210_2106_1_1526205864934_6784220_171-256004-0_sp1.1 from training. Duration: 0.7818125 2023-10-02 07:58:03,086 WARNING [train.py:1204] (2/4) Exclude cut with ID _25914_210_6400_1_1532141317058_644740_43-248478-0_sp1.1 from training. Duration: 0.936375 2023-10-02 07:58:03,133 WARNING [train.py:1204] (2/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_577-287150-0 from training. Duration: 0.82 2023-10-02 07:58:05,866 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_46015_210_7234_1_1533863319_3087837_376-276068-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 07:58:06,095 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.balancer1.prob, batch_count=806506.6666666666, ans=0.125 2023-10-02 07:58:07,216 WARNING [train.py:1204] (2/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_51-200220-0 from training. Duration: 0.56 2023-10-02 07:58:08,563 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_452-96101-0_sp1.1 from training. Duration: 0.963625 2023-10-02 07:58:11,346 WARNING [train.py:1204] (2/4) Exclude cut with ID _13252_210_1925_1_1530671372047_3336909_263-168388-0_sp1.1 from training. Duration: 0.7818125 2023-10-02 07:58:11,347 WARNING [train.py:1204] (2/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_469-94673-0 from training. Duration: 0.79 2023-10-02 07:58:11,484 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.0.feed_forward2.hidden_balancer.prob, batch_count=806573.3333333334, ans=0.125 2023-10-02 07:58:12,676 WARNING [train.py:1204] (2/4) Exclude cut with ID _14922_210_6228_1_1530707435273_3693310_303-228738-0_sp1.1 from training. Duration: 0.763625 2023-10-02 07:58:13,942 WARNING [train.py:1204] (2/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_262-250826-0_sp1.1 from training. Duration: 0.9 2023-10-02 07:58:13,969 WARNING [train.py:1204] (2/4) Exclude cut with ID _68777_210_4353_1_1535810390731_3117904_129-166012-0_sp1.1 from training. Duration: 0.77275 2023-10-02 07:58:16,813 WARNING [train.py:1204] (2/4) Exclude cut with ID _58895_210_8750_1_1535184081320_3649089_265-323810-0 from training. Duration: 0.96 2023-10-02 07:58:18,232 WARNING [train.py:1204] (2/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_120-76843-0_sp0.9 from training. Duration: 0.611125 2023-10-02 07:58:18,279 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7544_210_6753_1_1528587722_3576686_295-258479-0_sp0.9 from training. Duration: 0.9333125 2023-10-02 07:58:19,185 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.balancer1.prob, batch_count=806573.3333333334, ans=0.125 2023-10-02 07:58:20,127 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7046_210_6753_1_1527895337_6920917_783-278109-0 from training. Duration: 0.77 2023-10-02 07:58:20,185 WARNING [train.py:1204] (2/4) Exclude cut with ID _42326_210_7770_1_1533814282531_3707269_113-308323-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 07:58:20,449 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.ff3_skip_rate, batch_count=806573.3333333334, ans=0.0 2023-10-02 07:58:22,030 WARNING [train.py:1204] (2/4) Exclude cut with ID _7852_210_5219_1_1528286471316_8139400_242-105336-0_sp0.9 from training. Duration: 0.8 2023-10-02 07:58:25,104 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.523e+02 1.962e+02 2.255e+02 2.740e+02 4.048e+02, threshold=4.511e+02, percent-clipped=1.0 2023-10-02 07:58:25,240 WARNING [train.py:1204] (2/4) Exclude cut with ID _56235_210_10640_1_1534557335235_4161099_114-277905-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 07:58:29,380 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_13118_210_7925_1_1530755612_7608297_42-66683-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 07:58:32,952 WARNING [train.py:1204] (2/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_901-79438-0_sp1.1 from training. Duration: 0.8545625 2023-10-02 07:58:34,337 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_30280_210_1881_1_1532872779_3660139_211-182194-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 07:58:41,251 WARNING [train.py:1204] (2/4) Exclude cut with ID _14898_210_9206_1_1530766812377_4177190_621-45732-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 07:58:41,259 WARNING [train.py:1204] (2/4) Exclude cut with ID _35350_210_16040_1_1533040191183_4075299_224-133775-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 07:58:45,447 WARNING [train.py:1204] (2/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_147-293311-0_sp1.1 from training. Duration: 0.8545625 2023-10-02 07:58:46,817 WARNING [train.py:1204] (2/4) Exclude cut with ID _12302_210_5219_1_1529924446466_8107280_835-279754-0_sp1.1 from training. Duration: 0.8181875 2023-10-02 07:58:48,444 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.feed_forward1.hidden_balancer.prob, batch_count=806706.6666666666, ans=0.125 2023-10-02 07:58:51,475 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8453_210_4211_1_1528502940_8562730_347-330673-0_sp0.9 from training. Duration: 0.8 2023-10-02 07:58:53,417 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_213-103282-0_sp0.9 from training. Duration: 0.8666875 2023-10-02 07:58:54,798 WARNING [train.py:1204] (2/4) Exclude cut with ID _49728_210_12651_1_1534041034294_6788150_66-43496-0_sp1.1 from training. Duration: 0.97275 2023-10-02 07:58:54,806 WARNING [train.py:1204] (2/4) Exclude cut with ID _19023_210_4353_1_1531378892552_5820780_28-36615-0_sp1.1 from training. Duration: 0.8818125 2023-10-02 07:58:56,618 INFO [train.py:1046] (2/4) Epoch 23, batch 4150, loss[loss=0.1545, simple_loss=0.2373, pruned_loss=0.03582, over 20258.00 frames. ], tot_loss[loss=0.1734, simple_loss=0.2499, pruned_loss=0.04849, over 4695809.25 frames. ], batch size: 44, lr: 4.44e-03, grad_scale: 16.0 2023-10-02 07:58:57,934 WARNING [train.py:1204] (2/4) Exclude cut with ID _14349_210_8341_1_1530615710818_3442470_115-285975-0 from training. Duration: 0.86 2023-10-02 07:58:59,211 WARNING [train.py:1204] (2/4) Exclude cut with ID _12734_210_8341_1_1530183614296_3891929_236-47464-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 07:58:59,273 WARNING [train.py:1204] (2/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_177-236809-0 from training. Duration: 0.49 2023-10-02 07:58:59,330 WARNING [train.py:1204] (2/4) Exclude cut with ID _13678_210_7497_1_1530530069226_3463170_371-3221-0 from training. Duration: 0.9 2023-10-02 07:59:00,648 WARNING [train.py:1204] (2/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_48-338976-0 from training. Duration: 0.89 2023-10-02 07:59:02,079 WARNING [train.py:1204] (2/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_1167-309744-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 07:59:08,119 WARNING [train.py:1204] (2/4) Exclude cut with ID _30327_210_16049_1_1532484161295_3924169_401-70819-0_sp0.9 from training. Duration: 0.9555625 2023-10-02 07:59:08,127 WARNING [train.py:1204] (2/4) Exclude cut with ID _55134_210_21982_1_1534590028377_3882170_346-91022-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 07:59:12,245 WARNING [train.py:1204] (2/4) Exclude cut with ID _14459_210_2228_1_1531443641946_7516700_174-118801-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 07:59:12,319 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_269-278006-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 07:59:13,586 WARNING [train.py:1204] (2/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_390-339137-0_sp0.9 from training. Duration: 0.62225 2023-10-02 07:59:13,742 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.0.feed_forward1.out_proj.dropout_p, batch_count=806840.0, ans=0.1 2023-10-02 07:59:16,263 WARNING [train.py:1204] (2/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_253-308377-0_sp1.1 from training. Duration: 0.4454375 2023-10-02 07:59:16,274 WARNING [train.py:1204] (2/4) Exclude cut with ID _23825_210_13449_1_1531915313526_3497150_240-271367-0_sp1.1 from training. Duration: 0.8818125 2023-10-02 07:59:16,365 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1015-343884-0_sp0.9 from training. Duration: 0.688875 2023-10-02 07:59:21,742 WARNING [train.py:1204] (2/4) Exclude cut with ID _23514_210_1389_1_1531832139785_3031439_335-48210-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 07:59:23,231 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.0.conv_module1.balancer2.prob, batch_count=806906.6666666666, ans=0.125 2023-10-02 07:59:23,939 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.conv_skip_rate, batch_count=806906.6666666666, ans=0.0 2023-10-02 07:59:26,162 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_309-65177-0_sp1.1 from training. Duration: 0.8 2023-10-02 07:59:27,571 WARNING [train.py:1204] (2/4) Exclude cut with ID _14368_210_4220_1_1530532714870_3595529_231-137892-0 from training. Duration: 0.99 2023-10-02 07:59:29,621 WARNING [train.py:1204] (2/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_57-270037-0 from training. Duration: 0.77 2023-10-02 07:59:30,847 WARNING [train.py:1204] (2/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_465-214726-0_sp1.1 from training. Duration: 0.7818125 2023-10-02 07:59:30,930 WARNING [train.py:1204] (2/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_106-300985-0 from training. Duration: 0.9 2023-10-02 07:59:30,935 WARNING [train.py:1204] (2/4) Exclude cut with ID _14632_210_9988_1_1530691279392_3923200_161-129530-0_sp1.1 from training. Duration: 0.87275 2023-10-02 07:59:30,954 WARNING [train.py:1204] (2/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_514-88370-0_sp1.1 from training. Duration: 0.963625 2023-10-02 07:59:33,837 WARNING [train.py:1204] (2/4) Exclude cut with ID _56495_210_4234_1_1534748080334_4136219_305-334505-0_sp1.1 from training. Duration: 0.92725 2023-10-02 07:59:35,250 WARNING [train.py:1204] (2/4) Exclude cut with ID _42875_210_14856_1_1533704397487_4012433_61-264914-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 07:59:38,564 WARNING [train.py:1204] (2/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_91-148831-0 from training. Duration: 0.99 2023-10-02 07:59:41,434 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_435-313305-0_sp0.9 from training. Duration: 0.788875 2023-10-02 07:59:42,864 WARNING [train.py:1204] (2/4) Exclude cut with ID _16744_210_5986_1_1531724405384_3907567_242-111316-0_sp1.1 from training. Duration: 0.8454375 2023-10-02 07:59:44,289 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_257-20733-0 from training. Duration: 0.76 2023-10-02 07:59:44,332 WARNING [train.py:1204] (2/4) Exclude cut with ID _8064_210_5399_1_1528372299291_4282973_4-91037-0_sp1.1 from training. Duration: 0.8 2023-10-02 07:59:45,708 WARNING [train.py:1204] (2/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_3-276701-0 from training. Duration: 0.91 2023-10-02 07:59:47,073 WARNING [train.py:1204] (2/4) Exclude cut with ID _3992_210_1747_1_1526644816031_3731060_36-147855-0_sp1.1 from training. Duration: 0.7090625 2023-10-02 07:59:48,394 WARNING [train.py:1204] (2/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_1296-298671-0_sp1.1 from training. Duration: 0.963625 2023-10-02 07:59:49,776 WARNING [train.py:1204] (2/4) Exclude cut with ID _68965_210_1883_1_1535698962272_3497490_309-334593-0_sp1.1 from training. Duration: 0.92725 2023-10-02 07:59:52,543 WARNING [train.py:1204] (2/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_128-296581-0 from training. Duration: 0.74 2023-10-02 07:59:52,544 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_17613_210_2151_1_1531137569_2366160_170-299079-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 07:59:52,546 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6287_210_3273_1_1527509506_2653716_182-199887-0_sp1.1 from training. Duration: 0.463625 2023-10-02 07:59:53,897 WARNING [train.py:1204] (2/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_80-135200-0_sp1.1 from training. Duration: 0.7181875 2023-10-02 07:59:57,083 WARNING [train.py:1204] (2/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_34-183685-0 from training. Duration: 0.98 2023-10-02 07:59:58,449 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8278_210_2550_1_1528637099_4220413_418-210155-0_sp1.1 from training. Duration: 0.92725 2023-10-02 07:59:58,453 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8853_210_3273_1_1528633899_818968_51-187685-0_sp0.9 from training. Duration: 0.7666875 2023-10-02 07:59:58,473 WARNING [train.py:1204] (2/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_332-221057-0_sp1.1 from training. Duration: 0.6181875 2023-10-02 08:00:00,346 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_2221_210_1988_1_1525597051_4935541_415-296882-0 from training. Duration: 0.77 2023-10-02 08:00:00,375 WARNING [train.py:1204] (2/4) Exclude cut with ID _33640_210_2655_1_1533117605479_4855469_770-278185-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 08:00:00,409 WARNING [train.py:1204] (2/4) Exclude cut with ID _5287_210_2084_1_1527418883040_3878670_333-48508-0_sp1.1 from training. Duration: 0.5454375 2023-10-02 08:00:00,456 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_43802_210_2290_1_1533516203_8829504_142-276839-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 08:00:01,966 WARNING [train.py:1204] (2/4) Exclude cut with ID _28394_210_8913_1_1532430049518_3650118_126-139371-0_sp1.1 from training. Duration: 0.92725 2023-10-02 08:00:01,985 WARNING [train.py:1204] (2/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_76-203867-0 from training. Duration: 0.72 2023-10-02 08:00:03,363 WARNING [train.py:1204] (2/4) Exclude cut with ID _6194_210_1534_1_1527511826160_3941194_14-282876-0_sp0.9 from training. Duration: 0.788875 2023-10-02 08:00:08,193 WARNING [train.py:1204] (2/4) Exclude cut with ID _35355_210_16040_1_1533212988977_4016229_74-190076-0_sp1.1 from training. Duration: 0.72725 2023-10-02 08:00:09,585 INFO [train.py:1046] (2/4) Epoch 23, batch 4200, loss[loss=0.1695, simple_loss=0.2536, pruned_loss=0.04267, over 24452.00 frames. ], tot_loss[loss=0.1726, simple_loss=0.2494, pruned_loss=0.04787, over 4706680.13 frames. ], batch size: 69, lr: 4.44e-03, grad_scale: 16.0 2023-10-02 08:00:09,689 WARNING [train.py:1204] (2/4) Exclude cut with ID _33920_210_4220_1_1533257851500_3084399_198-334063-0 from training. Duration: 0.94 2023-10-02 08:00:12,402 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6291_210_3273_1_1527683649_1078498_4-266348-0_sp0.9 from training. Duration: 0.8666875 2023-10-02 08:00:13,702 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_23856_210_4238_1_1531994785_3087934_122-38075-0_sp1.1 from training. Duration: 0.936375 2023-10-02 08:00:13,785 WARNING [train.py:1204] (2/4) Exclude cut with ID _4697_210_2069_1_1526815801953_3823098_276-96593-0_sp0.9 from training. Duration: 0.8555625 2023-10-02 08:00:15,119 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_308-278734-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 08:00:15,120 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_5190_210_4557_1_1527245078_2965646_353-250603-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 08:00:18,044 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.feed_forward1.hidden_balancer.prob, batch_count=807106.6666666666, ans=0.125 2023-10-02 08:00:19,093 WARNING [train.py:1204] (2/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_472-216663-0 from training. Duration: 0.92 2023-10-02 08:00:21,937 WARNING [train.py:1204] (2/4) Exclude cut with ID _7537_210_2084_1_1528502395431_3924359_14-312906-0 from training. Duration: 0.84 2023-10-02 08:00:23,258 WARNING [train.py:1204] (2/4) Exclude cut with ID _29003_210_19596_1_1532507391720_3894098_559-236660-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 08:00:25,902 WARNING [train.py:1204] (2/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_312-204798-0_sp1.1 from training. Duration: 0.8454375 2023-10-02 08:00:27,691 WARNING [train.py:1204] (2/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_17-309070-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 08:00:31,485 WARNING [train.py:1204] (2/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_344-152526-0_sp0.9 from training. Duration: 0.588875 2023-10-02 08:00:31,604 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_41452_210_7291_1_1533437687_3941281_405-11559-0_sp1.1 from training. Duration: 0.82725 2023-10-02 08:00:32,900 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_48730_210_5399_1_1533885886_2212769_157-124052-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 08:00:32,952 WARNING [train.py:1204] (2/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_415-78792-0 from training. Duration: 0.81 2023-10-02 08:00:32,955 WARNING [train.py:1204] (2/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_345-340690-0_sp1.1 from training. Duration: 0.8454375 2023-10-02 08:00:34,296 WARNING [train.py:1204] (2/4) Exclude cut with ID _23825_210_13449_1_1531915313526_3497150_124-8138-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 08:00:35,599 WARNING [train.py:1204] (2/4) Exclude cut with ID _49258_210_7994_1_1534122017863_3738861_194-24864-0_sp1.1 from training. Duration: 0.936375 2023-10-02 08:00:35,616 WARNING [train.py:1204] (2/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_369-222060-0_sp1.1 from training. Duration: 0.6454375 2023-10-02 08:00:37,042 WARNING [train.py:1204] (2/4) Exclude cut with ID _12369_210_4381_1_1530356078581_4388520_480-13146-0_sp0.9 from training. Duration: 0.9444375 2023-10-02 08:00:39,017 WARNING [train.py:1204] (2/4) Exclude cut with ID _13253_210_1925_1_1530757775819_3577873_191-144977-0 from training. Duration: 0.78 2023-10-02 08:00:39,035 WARNING [train.py:1204] (2/4) Exclude cut with ID _30304_210_6319_1_1532476827261_7582060_1128-140266-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 08:00:40,662 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.ff2_skip_rate, batch_count=807240.0, ans=0.0 2023-10-02 08:00:43,072 WARNING [train.py:1204] (2/4) Exclude cut with ID _8772_210_1850_1_1528635595972_3664011_247-336448-0_sp0.9 from training. Duration: 0.82225 2023-10-02 08:00:44,399 WARNING [train.py:1204] (2/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_318-59097-0_sp1.1 from training. Duration: 0.6909375 2023-10-02 08:00:47,129 WARNING [train.py:1204] (2/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_540-131164-0_sp1.1 from training. Duration: 0.836375 2023-10-02 08:00:48,481 WARNING [train.py:1204] (2/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_39-110836-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 08:00:51,053 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.483e+02 1.783e+02 1.946e+02 2.154e+02 3.104e+02, threshold=3.892e+02, percent-clipped=0.0 2023-10-02 08:00:51,135 WARNING [train.py:1204] (2/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_860-96198-0_sp1.1 from training. Duration: 0.663625 2023-10-02 08:00:51,142 WARNING [train.py:1204] (2/4) Exclude cut with ID _15133_210_6228_1_1530794512074_3080379_29-264458-0 from training. Duration: 0.58 2023-10-02 08:00:51,159 WARNING [train.py:1204] (2/4) Exclude cut with ID _55209_210_1531_1_1534675828996_4597160_797-348343-0_sp1.1 from training. Duration: 0.963625 2023-10-02 08:00:52,590 WARNING [train.py:1204] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_23-189234-0_sp1.1 from training. Duration: 0.7545625 2023-10-02 08:00:57,331 WARNING [train.py:1204] (2/4) Exclude cut with ID _5410_210_1388_1_1527397055795_1719020_39-112221-0_sp1.1 from training. Duration: 0.62725 2023-10-02 08:00:58,671 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_43802_210_2290_1_1533516203_8829504_100-348441-0_sp1.1 from training. Duration: 0.82725 2023-10-02 08:01:05,194 WARNING [train.py:1204] (2/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_143-86344-0_sp1.1 from training. Duration: 0.7 2023-10-02 08:01:07,986 WARNING [train.py:1204] (2/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_126-29104-0 from training. Duration: 0.85 2023-10-02 08:01:11,260 WARNING [train.py:1204] (2/4) Exclude cut with ID _39846_210_12651_1_1533603651992_1318030_138-31771-0_sp1.1 from training. Duration: 0.963625 2023-10-02 08:01:14,174 WARNING [train.py:1204] (2/4) Exclude cut with ID _2220_210_1819_1_1525509109898_3658680_50-99837-0_sp1.1 from training. Duration: 0.4909375 2023-10-02 08:01:14,238 WARNING [train.py:1204] (2/4) Exclude cut with ID _6274_210_2084_1_1527937583837_7494020_836-210550-0_sp1.1 from training. Duration: 0.97275 2023-10-02 08:01:16,907 WARNING [train.py:1204] (2/4) Exclude cut with ID _28323_210_16187_1_1532480395667_4100300_306-74561-0 from training. Duration: 0.94 2023-10-02 08:01:22,250 INFO [train.py:1046] (2/4) Epoch 23, batch 4250, loss[loss=0.1651, simple_loss=0.2414, pruned_loss=0.04436, over 23425.00 frames. ], tot_loss[loss=0.1714, simple_loss=0.2485, pruned_loss=0.04713, over 4711013.67 frames. ], batch size: 105, lr: 4.43e-03, grad_scale: 16.0 2023-10-02 08:01:22,329 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_177-58005-0_sp1.1 from training. Duration: 0.563625 2023-10-02 08:01:25,146 WARNING [train.py:1204] (2/4) Exclude cut with ID _54871_210_22356_1_1534665798253_3856359_19-342632-0_sp1.1 from training. Duration: 0.82725 2023-10-02 08:01:25,166 WARNING [train.py:1204] (2/4) Exclude cut with ID _35355_210_16040_1_1533212988977_4016229_74-190076-0_sp0.9 from training. Duration: 0.888875 2023-10-02 08:01:28,536 WARNING [train.py:1204] (2/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_285-97786-0_sp1.1 from training. Duration: 0.97275 2023-10-02 08:01:33,578 WARNING [train.py:1204] (2/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_337-181502-0_sp0.9 from training. Duration: 0.72225 2023-10-02 08:01:33,617 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_353-182855-0 from training. Duration: 0.96 2023-10-02 08:01:33,640 WARNING [train.py:1204] (2/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_409-327730-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 08:01:36,427 WARNING [train.py:1204] (2/4) Exclude cut with ID _8030_210_7497_1_1528444886781_3531089_373-19403-0_sp1.1 from training. Duration: 0.97275 2023-10-02 08:01:40,982 WARNING [train.py:1204] (2/4) Exclude cut with ID _6789_210_1534_1_1527857937218_3118950_231-340237-0_sp1.1 from training. Duration: 0.936375 2023-10-02 08:01:45,106 WARNING [train.py:1204] (2/4) Exclude cut with ID _6493_210_1747_1_1528459210424_4012470_536-57832-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 08:01:46,502 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7046_210_6753_1_1527895337_6920917_756-116002-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 08:01:47,878 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_37152_210_9697_1_1533106573_3844188_29-239070-0_sp1.1 from training. Duration: 0.8909375 2023-10-02 08:01:47,881 WARNING [train.py:1204] (2/4) Exclude cut with ID _19023_210_4353_1_1531378892552_5820780_61-334539-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 08:01:49,293 WARNING [train.py:1204] (2/4) Exclude cut with ID _14170_210_5219_1_1530525949868_4469650_159-28488-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 08:01:50,590 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_40896_210_15710_1_1533433956_7100802_764-71272-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 08:01:50,668 WARNING [train.py:1204] (2/4) Exclude cut with ID _44902_210_16187_1_1533888006062_3792078_258-178274-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 08:01:50,899 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.conv_module2.balancer2.prob, batch_count=807573.3333333334, ans=0.125 2023-10-02 08:01:52,074 WARNING [train.py:1204] (2/4) Exclude cut with ID _7537_210_2084_1_1528502395431_3924359_14-312906-0_sp1.1 from training. Duration: 0.763625 2023-10-02 08:01:54,645 WARNING [train.py:1204] (2/4) Exclude cut with ID _49643_210_7989_1_1534640244224_3939031_674-116639-0_sp1.1 from training. Duration: 0.963625 2023-10-02 08:01:54,747 WARNING [train.py:1204] (2/4) Exclude cut with ID _22826_210_9994_1_1531908201522_3279169_415-161-0 from training. Duration: 0.98 2023-10-02 08:01:59,214 WARNING [train.py:1204] (2/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_264-348896-0 from training. Duration: 0.85 2023-10-02 08:01:59,228 WARNING [train.py:1204] (2/4) Exclude cut with ID _51899_210_20034_1_1534129254466_3669220_233-161492-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 08:01:59,290 WARNING [train.py:1204] (2/4) Exclude cut with ID _31255_210_13427_1_1532865619495_3815143_233-200023-0_sp1.1 from training. Duration: 0.936375 2023-10-02 08:02:00,548 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_23850_210_4238_1_1532232597_1632433_100-229879-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 08:02:00,634 WARNING [train.py:1204] (2/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_375-245677-0_sp0.9 from training. Duration: 0.9 2023-10-02 08:02:01,231 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_40223_210_6228_1_1533298404_4812267_355-317415-0_sp1.1 from training. Duration: 0.97275 2023-10-02 08:02:02,390 WARNING [train.py:1204] (2/4) Exclude cut with ID _9679_210_7497_1_1529129212906_4363929_235-81488-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 08:02:04,379 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.ff3_skip_rate, batch_count=807573.3333333334, ans=0.0 2023-10-02 08:02:05,663 WARNING [train.py:1204] (2/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_172-215932-0_sp1.1 from training. Duration: 0.6 2023-10-02 08:02:07,076 WARNING [train.py:1204] (2/4) Exclude cut with ID _12369_210_4381_1_1530356078581_4388520_64-298917-0_sp1.1 from training. Duration: 0.8 2023-10-02 08:02:07,808 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.2.encoder.layers.1.feed_forward2.out_whiten, num_groups=1, num_channels=384, metric=4.58 vs. limit=15.0 2023-10-02 08:02:13,006 WARNING [train.py:1204] (2/4) Exclude cut with ID _38020_210_5986_1_1533112252259_7006089_131-53533-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 08:02:14,370 WARNING [train.py:1204] (2/4) Exclude cut with ID _42334_210_7770_1_1534206610396_3605880_392-48154-0_sp1.1 from training. Duration: 0.963625 2023-10-02 08:02:14,436 WARNING [train.py:1204] (2/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_337-181502-0 from training. Duration: 0.65 2023-10-02 08:02:14,444 WARNING [train.py:1204] (2/4) Exclude cut with ID _6267_210_2084_1_1527764658526_3894730_375-219887-0_sp0.9 from training. Duration: 0.7444375 2023-10-02 08:02:15,759 WARNING [train.py:1204] (2/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_60-146189-0 from training. Duration: 0.98 2023-10-02 08:02:17,164 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_3133_210_2087_1_1526106446_2324690_243-143119-0_sp1.1 from training. Duration: 0.7 2023-10-02 08:02:18,532 WARNING [train.py:1204] (2/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_1267-272693-0_sp0.9 from training. Duration: 0.811125 2023-10-02 08:02:18,641 WARNING [train.py:1204] (2/4) Exclude cut with ID _69884_210_9771_1_1535846392818_4166169_83-182645-0_sp1.1 from training. Duration: 0.97275 2023-10-02 08:02:19,975 WARNING [train.py:1204] (2/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_403-292260-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 08:02:21,418 WARNING [train.py:1204] (2/4) Exclude cut with ID _55429_210_10626_1_1534467588527_7174143_377-169391-0 from training. Duration: 0.96 2023-10-02 08:02:22,747 WARNING [train.py:1204] (2/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_494-199538-0_sp1.1 from training. Duration: 0.6818125 2023-10-02 08:02:24,086 WARNING [train.py:1204] (2/4) Exclude cut with ID _3067_210_2667_1_1526212974640_4503168_119-175776-0_sp1.1 from training. Duration: 0.67275 2023-10-02 08:02:27,272 WARNING [train.py:1204] (2/4) Exclude cut with ID _3231_210_2106_1_1526205864934_6784220_183-303839-0_sp1.1 from training. Duration: 0.97275 2023-10-02 08:02:29,959 WARNING [train.py:1204] (2/4) Exclude cut with ID _18513_210_9206_1_1531618215223_3862620_165-38958-0_sp1.1 from training. Duration: 0.963625 2023-10-02 08:02:31,271 WARNING [train.py:1204] (2/4) Exclude cut with ID _41314_210_12651_1_1533459476543_3766460_87-272068-0_sp1.1 from training. Duration: 0.7545625 2023-10-02 08:02:31,351 WARNING [train.py:1204] (2/4) Exclude cut with ID _3541_210_2477_1_1526475408709_3263973_353-95-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 08:02:32,669 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_2009_210_2071_1_1525481134_4588937_73-302813-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 08:02:34,636 WARNING [train.py:1204] (2/4) Exclude cut with ID _64220_210_9527_1_1535250151772_4875259_88-95011-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 08:02:35,932 INFO [train.py:1046] (2/4) Epoch 23, batch 4300, loss[loss=0.1596, simple_loss=0.2432, pruned_loss=0.03798, over 24665.00 frames. ], tot_loss[loss=0.1709, simple_loss=0.2482, pruned_loss=0.04684, over 4708735.49 frames. ], batch size: 65, lr: 4.43e-03, grad_scale: 16.0 2023-10-02 08:02:35,999 WARNING [train.py:1204] (2/4) Exclude cut with ID _44902_210_16187_1_1533888006062_3792078_160-12107-0_sp1.1 from training. Duration: 0.92725 2023-10-02 08:02:36,005 WARNING [train.py:1204] (2/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_547-5424-0 from training. Duration: 0.94 2023-10-02 08:02:37,347 WARNING [train.py:1204] (2/4) Exclude cut with ID _21783_210_11357_1_1531742458730_3762679_268-85656-0_sp1.1 from training. Duration: 0.936375 2023-10-02 08:02:37,574 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.feed_forward1.hidden_balancer.prob, batch_count=807773.3333333334, ans=0.125 2023-10-02 08:02:39,213 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.1.feed_forward2.hidden_balancer.prob, batch_count=807773.3333333334, ans=0.125 2023-10-02 08:02:44,470 WARNING [train.py:1204] (2/4) Exclude cut with ID _66210_210_12651_1_1535446812922_3365850_42-284985-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 08:02:44,493 WARNING [train.py:1204] (2/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_465-214726-0_sp0.9 from training. Duration: 0.9555625 2023-10-02 08:02:47,234 WARNING [train.py:1204] (2/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_50-95770-0_sp1.1 from training. Duration: 0.936375 2023-10-02 08:02:54,065 WARNING [train.py:1204] (2/4) Exclude cut with ID _30800_210_22382_1_1532499347904_4515959_269-77567-0_sp1.1 from training. Duration: 0.963625 2023-10-02 08:02:54,067 WARNING [train.py:1204] (2/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_7-111954-0 from training. Duration: 0.56 2023-10-02 08:02:55,993 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_43802_210_2290_1_1533516203_8829504_85-195240-0_sp0.9 from training. Duration: 0.9666875 2023-10-02 08:02:57,432 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_2822_210_2715_1_1526112586_1999587_165-36656-0_sp0.9 from training. Duration: 0.988875 2023-10-02 08:02:58,042 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.3.encoder.layers.3.feed_forward1.out_whiten, num_groups=1, num_channels=512, metric=6.46 vs. limit=15.0 2023-10-02 08:02:58,817 WARNING [train.py:1204] (2/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_181-78632-0_sp1.1 from training. Duration: 0.7090625 2023-10-02 08:02:58,836 WARNING [train.py:1204] (2/4) Exclude cut with ID IC0671W0044-331887 from training. Duration: 0.896 2023-10-02 08:03:01,492 WARNING [train.py:1204] (2/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_266-137104-0_sp1.1 from training. Duration: 0.5909375 2023-10-02 08:03:02,865 WARNING [train.py:1204] (2/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_137-116442-0_sp0.9 from training. Duration: 0.8666875 2023-10-02 08:03:06,225 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_23801_210_16223_1_1532066392_3294657_570-5439-0 from training. Duration: 0.97 2023-10-02 08:03:06,236 WARNING [train.py:1204] (2/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_29-280622-0_sp1.1 from training. Duration: 0.7454375 2023-10-02 08:03:06,260 WARNING [train.py:1204] (2/4) Exclude cut with ID 3557-8342-0013-54691-0 from training. Duration: 0.92 2023-10-02 08:03:09,346 WARNING [train.py:1204] (2/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_711-15688-0_sp1.1 from training. Duration: 0.3909375 2023-10-02 08:03:09,475 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_15126_210_6753_1_1530778677_2591185_120-277707-0_sp1.1 from training. Duration: 0.72725 2023-10-02 08:03:09,914 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.5.encoder.layers.1.whiten, num_groups=1, num_channels=256, metric=1.95 vs. limit=12.0 2023-10-02 08:03:12,640 WARNING [train.py:1204] (2/4) Exclude cut with ID _14970_210_15709_1_1530775547361_1966965_8-169924-0_sp1.1 from training. Duration: 0.836375 2023-10-02 08:03:12,641 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_4692_210_3528_1_1526899755_4323254_107-44401-0_sp0.9 from training. Duration: 0.9555625 2023-10-02 08:03:12,720 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_3475_210_2028_1_1526193578_3622569_265-276625-0_sp0.9 from training. Duration: 0.8444375 2023-10-02 08:03:15,305 WARNING [train.py:1204] (2/4) Exclude cut with ID _55153_210_2253_1_1534384786677_3162096_148-79022-0_sp1.1 from training. Duration: 0.87275 2023-10-02 08:03:15,376 WARNING [train.py:1204] (2/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_292-36336-0_sp1.1 from training. Duration: 0.92725 2023-10-02 08:03:15,390 WARNING [train.py:1204] (2/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_329-82235-0 from training. Duration: 0.82 2023-10-02 08:03:17,962 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.537e+02 1.923e+02 2.251e+02 2.726e+02 4.370e+02, threshold=4.501e+02, percent-clipped=2.0 2023-10-02 08:03:18,041 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_11305_210_1794_1_1529750428_7178148_257-312870-0 from training. Duration: 0.88 2023-10-02 08:03:18,310 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.feed_forward2.hidden_balancer.prob, batch_count=807973.3333333334, ans=0.125 2023-10-02 08:03:19,433 WARNING [train.py:1204] (2/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_436-4493-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 08:03:20,884 WARNING [train.py:1204] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_321-54129-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 08:03:20,891 WARNING [train.py:1204] (2/4) Exclude cut with ID _5287_210_2084_1_1527418883040_3878670_333-48508-0_sp0.9 from training. Duration: 0.6666875 2023-10-02 08:03:20,906 WARNING [train.py:1204] (2/4) Exclude cut with ID _26528_210_9070_1_1532570410842_3851310_244-278158-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 08:03:22,213 WARNING [train.py:1204] (2/4) Exclude cut with ID _46952_210_5986_1_1534230010357_3107379_232-344503-0_sp1.1 from training. Duration: 0.87275 2023-10-02 08:03:22,229 WARNING [train.py:1204] (2/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_43-206757-0 from training. Duration: 0.75 2023-10-02 08:03:22,231 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_4183_210_2087_1_1526729100_1869838_141-166600-0 from training. Duration: 0.95 2023-10-02 08:03:22,284 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_296-287678-0 from training. Duration: 0.95 2023-10-02 08:03:23,594 WARNING [train.py:1204] (2/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_265-60343-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 08:03:23,637 WARNING [train.py:1204] (2/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_332-256168-0 from training. Duration: 0.75 2023-10-02 08:03:23,664 WARNING [train.py:1204] (2/4) Exclude cut with ID _13679_210_7497_1_1530534293662_3355098_411-186088-0 from training. Duration: 0.94 2023-10-02 08:03:27,291 WARNING [train.py:1204] (2/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_458-126779-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 08:03:28,631 WARNING [train.py:1204] (2/4) Exclude cut with ID IC0803W0380-397572 from training. Duration: 0.032 2023-10-02 08:03:28,806 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.ff2_skip_rate, batch_count=807973.3333333334, ans=0.0 2023-10-02 08:03:29,942 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_278-272612-0_sp1.1 from training. Duration: 0.663625 2023-10-02 08:03:31,123 WARNING [train.py:1204] (2/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_491-263101-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 08:03:31,135 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_53555_210_5632_1_1534595220_7232297_334-336778-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 08:03:31,367 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.self_attn_weights.pos_emb_skip_rate, batch_count=807973.3333333334, ans=0.0 2023-10-02 08:03:34,896 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_50-3289-0 from training. Duration: 0.91 2023-10-02 08:03:36,188 WARNING [train.py:1204] (2/4) Exclude cut with ID _8699_210_2069_1_1528630346831_3960409_155-39766-0_sp0.9 from training. Duration: 0.8666875 2023-10-02 08:03:36,192 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_502-82145-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 08:03:36,237 WARNING [train.py:1204] (2/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_550-43132-0_sp1.1 from training. Duration: 0.97275 2023-10-02 08:03:37,576 WARNING [train.py:1204] (2/4) Exclude cut with ID _14998_210_5219_1_1530705582080_7474970_446-112117-0_sp1.1 from training. Duration: 0.8181875 2023-10-02 08:03:37,638 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_3691_210_3528_1_1526468169_4062053_355-334432-0_sp1.1 from training. Duration: 0.7909375 2023-10-02 08:03:40,876 WARNING [train.py:1204] (2/4) Exclude cut with ID _58605_210_12210_1_1534723195939_3904259_239-94720-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 08:03:42,277 WARNING [train.py:1204] (2/4) Exclude cut with ID _49376_210_5986_1_1533985278732_3370740_391-344072-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 08:03:42,529 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=808040.0, ans=0.1 2023-10-02 08:03:43,602 WARNING [train.py:1204] (2/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_558-322014-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 08:03:43,637 WARNING [train.py:1204] (2/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_256-32662-0_sp1.1 from training. Duration: 0.8181875 2023-10-02 08:03:47,909 WARNING [train.py:1204] (2/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_572-185150-0 from training. Duration: 0.97 2023-10-02 08:03:47,936 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_89-35748-0_sp0.9 from training. Duration: 0.87775 2023-10-02 08:03:48,190 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.balancer1.prob, batch_count=808106.6666666666, ans=0.125 2023-10-02 08:03:49,202 INFO [train.py:1046] (2/4) Epoch 23, batch 4350, loss[loss=0.1783, simple_loss=0.2702, pruned_loss=0.04325, over 24434.00 frames. ], tot_loss[loss=0.1711, simple_loss=0.2485, pruned_loss=0.04685, over 4716156.05 frames. ], batch size: 69, lr: 4.43e-03, grad_scale: 16.0 2023-10-02 08:03:53,437 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6291_210_3273_1_1527683649_1078498_88-66747-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 08:03:56,573 WARNING [train.py:1204] (2/4) Exclude cut with ID _19504_210_9994_1_1531648863242_3325220_357-237029-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 08:03:59,480 WARNING [train.py:1204] (2/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_302-2550-0_sp1.1 from training. Duration: 0.736375 2023-10-02 08:03:59,480 WARNING [train.py:1204] (2/4) Exclude cut with ID _9461_210_4557_1_1529060514726_3677451_59-194017-0_sp1.1 from training. Duration: 0.963625 2023-10-02 08:04:01,028 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.conv_module2.balancer1.prob, batch_count=808106.6666666666, ans=0.125 2023-10-02 08:04:02,325 WARNING [train.py:1204] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_665-196155-0_sp1.1 from training. Duration: 0.6090625 2023-10-02 08:04:04,889 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_326-315007-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 08:04:07,490 WARNING [train.py:1204] (2/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_105-328174-0_sp1.1 from training. Duration: 0.6909375 2023-10-02 08:04:07,505 WARNING [train.py:1204] (2/4) Exclude cut with ID _56494_210_4220_1_1535284837625_3033169_106-263722-0_sp1.1 from training. Duration: 0.97275 2023-10-02 08:04:09,056 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.0.self_attn_weights.pos_emb_skip_rate, batch_count=808173.3333333334, ans=0.0 2023-10-02 08:04:10,275 WARNING [train.py:1204] (2/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_770-200430-0_sp0.9 from training. Duration: 0.988875 2023-10-02 08:04:11,612 WARNING [train.py:1204] (2/4) Exclude cut with ID _65640_210_6310_1_1535368201445_3173250_209-13008-0_sp1.1 from training. Duration: 0.9 2023-10-02 08:04:14,315 WARNING [train.py:1204] (2/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_212-149947-0_sp0.9 from training. Duration: 0.811125 2023-10-02 08:04:17,169 WARNING [train.py:1204] (2/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_188-171673-0 from training. Duration: 0.58 2023-10-02 08:04:18,399 WARNING [train.py:1204] (2/4) Exclude cut with ID _58895_210_8750_1_1535184081320_3649089_286-281724-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 08:04:19,769 WARNING [train.py:1204] (2/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_16-254909-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 08:04:22,746 WARNING [train.py:1204] (2/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_52-95655-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 08:04:26,052 WARNING [train.py:1204] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_554-45617-0 from training. Duration: 0.99 2023-10-02 08:04:28,798 WARNING [train.py:1204] (2/4) Exclude cut with ID _14683_210_5219_1_1530698352608_3974859_473-344611-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 08:04:28,995 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.conv_skip_rate, batch_count=808240.0, ans=0.0 2023-10-02 08:04:32,348 WARNING [train.py:1204] (2/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_17-25045-0_sp0.9 from training. Duration: 0.6555625 2023-10-02 08:04:37,102 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9072W0003-530637 from training. Duration: 0.768 2023-10-02 08:04:38,450 WARNING [train.py:1204] (2/4) Exclude cut with ID _38670_210_15379_1_1533448810659_4391059_76-74025-0_sp1.1 from training. Duration: 0.936375 2023-10-02 08:04:38,496 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6291_210_3273_1_1527683649_1078498_74-292608-0_sp0.9 from training. Duration: 0.8 2023-10-02 08:04:39,853 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9084W0025-536862 from training. Duration: 0.896 2023-10-02 08:04:41,190 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9087W0021-538392 from training. Duration: 0.768 2023-10-02 08:04:41,195 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7543_210_6753_1_1528543323_645515_56-163234-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 08:04:41,221 WARNING [train.py:1204] (2/4) Exclude cut with ID _45560_210_21982_1_1533812603951_3584999_44-332643-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 08:04:42,573 WARNING [train.py:1204] (2/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_1285-256622-0_sp1.1 from training. Duration: 0.763625 2023-10-02 08:04:42,626 WARNING [train.py:1204] (2/4) Exclude cut with ID _52781_210_16187_1_1534297729099_1118149_141-325632-0_sp1.1 from training. Duration: 0.936375 2023-10-02 08:04:43,969 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_40896_210_15710_1_1533433956_7100802_1051-137631-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 08:04:44,016 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_13091_210_7925_1_1530609447_8987354_233-18333-0_sp1.1 from training. Duration: 0.97275 2023-10-02 08:04:47,965 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_34008_210_10431_1_1533345424_8183247_662-317331-0 from training. Duration: 0.94 2023-10-02 08:04:47,971 WARNING [train.py:1204] (2/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_291-248920-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 08:04:47,973 WARNING [train.py:1204] (2/4) Exclude cut with ID _65263_210_22356_1_1535763598691_3921037_274-288481-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 08:04:48,000 WARNING [train.py:1204] (2/4) Exclude cut with ID _5189_210_4557_1_1527248319428_3769229_233-168080-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 08:04:49,348 WARNING [train.py:1204] (2/4) Exclude cut with ID _54871_210_22356_1_1534665798253_3856359_135-184281-0 from training. Duration: 0.99 2023-10-02 08:04:50,669 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9123W0012-555972 from training. Duration: 0.896 2023-10-02 08:04:50,680 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9123W0118-556077 from training. Duration: 0.896 2023-10-02 08:04:50,693 WARNING [train.py:1204] (2/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_406-275649-0 from training. Duration: 0.42 2023-10-02 08:04:54,585 WARNING [train.py:1204] (2/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_309-13684-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 08:04:54,610 WARNING [train.py:1204] (2/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_129-337802-0_sp1.1 from training. Duration: 0.6545625 2023-10-02 08:04:54,632 WARNING [train.py:1204] (2/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_649-266825-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 08:04:55,978 WARNING [train.py:1204] (2/4) Exclude cut with ID _40519_210_13794_1_1533715228655_7337390_895-224742-0_sp1.1 from training. Duration: 0.92725 2023-10-02 08:04:57,391 WARNING [train.py:1204] (2/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_234-14174-0 from training. Duration: 0.82 2023-10-02 08:05:00,593 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9156W0009-572502 from training. Duration: 0.768 2023-10-02 08:05:00,607 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_424-245529-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 08:05:01,929 INFO [train.py:1046] (2/4) Epoch 23, batch 4400, loss[loss=0.1742, simple_loss=0.2551, pruned_loss=0.0467, over 24437.00 frames. ], tot_loss[loss=0.1726, simple_loss=0.2496, pruned_loss=0.04775, over 4709873.24 frames. ], batch size: 69, lr: 4.43e-03, grad_scale: 32.0 2023-10-02 08:05:02,138 INFO [scaling.py:1118] (2/4) WithLoss: name=encoder.encoders.0.layers.0.self_attn_weights, loss-sum=0.000e+00 2023-10-02 08:05:04,581 WARNING [train.py:1204] (2/4) Exclude cut with ID _26528_210_9070_1_1532570410842_3851310_232-10149-0_sp1.1 from training. Duration: 0.963625 2023-10-02 08:05:04,589 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_17292_210_6753_1_1531125388_2943734_152-30410-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 08:05:06,598 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_35441_210_16554_1_1533347572_4092680_291-82075-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 08:05:09,251 WARNING [train.py:1204] (2/4) Exclude cut with ID _12369_210_4381_1_1530356078581_4388520_64-298917-0 from training. Duration: 0.88 2023-10-02 08:05:09,270 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_5618_210_1874_1_1527311008_5851_1-214501-0_sp0.9 from training. Duration: 0.7655625 2023-10-02 08:05:09,296 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_9460_210_4211_1_1528954587_10049667_572-37035-0 from training. Duration: 0.8 2023-10-02 08:05:09,325 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9193W0023-590322 from training. Duration: 0.896 2023-10-02 08:05:09,409 WARNING [train.py:1204] (2/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_206-10097-0_sp1.1 from training. Duration: 0.4454375 2023-10-02 08:05:09,420 WARNING [train.py:1204] (2/4) Exclude cut with ID _55134_210_21982_1_1534590028377_3882170_17-51731-0_sp1.1 from training. Duration: 0.963625 2023-10-02 08:05:11,286 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_14953_210_2151_1_1530701710_5616165_710-165400-0 from training. Duration: 0.87 2023-10-02 08:05:13,917 WARNING [train.py:1204] (2/4) Exclude cut with ID _53203_210_2087_1_1534246218309_475590_61-85777-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 08:05:15,228 WARNING [train.py:1204] (2/4) Exclude cut with ID _55844_210_2346_1_1534471212569_7625707_1050-10453-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 08:05:15,237 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9218W0013-603297 from training. Duration: 0.896 2023-10-02 08:05:19,151 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_20120_210_6753_1_1531643035_5482859_267-102818-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 08:05:19,151 WARNING [train.py:1204] (2/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_191-41023-0 from training. Duration: 0.71 2023-10-02 08:05:19,197 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9231W0202-610227 from training. Duration: 0.896 2023-10-02 08:05:21,965 WARNING [train.py:1204] (2/4) Exclude cut with ID _6537_210_1883_1_1527987559710_3597129_382-133062-0 from training. Duration: 0.68 2023-10-02 08:05:22,005 WARNING [train.py:1204] (2/4) Exclude cut with ID _28427_210_4220_1_1532581240967_3061069_43-84928-0 from training. Duration: 0.99 2023-10-02 08:05:22,028 WARNING [train.py:1204] (2/4) Exclude cut with ID _59256_210_5986_1_1535076085997_3589120_121-237386-0 from training. Duration: 0.96 2023-10-02 08:05:22,055 WARNING [train.py:1204] (2/4) Exclude cut with ID _8480_210_1874_1_1528589391205_6728330_137-256986-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 08:05:23,363 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_9417_210_1794_1_1529235261_4614131_479-346963-0_sp1.1 from training. Duration: 0.936375 2023-10-02 08:05:24,706 WARNING [train.py:1204] (2/4) Exclude cut with ID _26474_210_9570_1_1532520022699_3623380_267-7362-0_sp1.1 from training. Duration: 0.936375 2023-10-02 08:05:24,769 WARNING [train.py:1204] (2/4) Exclude cut with ID _55328_210_27767_1_1534580973260_4088170_287-61272-0_sp1.1 from training. Duration: 0.97275 2023-10-02 08:05:27,477 WARNING [train.py:1204] (2/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_589-9070-0 from training. Duration: 0.71 2023-10-02 08:05:27,491 WARNING [train.py:1204] (2/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_148-158509-0_sp1.1 from training. Duration: 0.37275 2023-10-02 08:05:27,550 WARNING [train.py:1204] (2/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_164-184548-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 08:05:28,912 WARNING [train.py:1204] (2/4) Exclude cut with ID _14970_210_15709_1_1530775547361_1966965_6-23328-0_sp1.1 from training. Duration: 0.7818125 2023-10-02 08:05:28,917 WARNING [train.py:1204] (2/4) Exclude cut with ID _29303_210_2575_1_1532392237303_7410649_612-165069-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 08:05:30,645 WARNING [train.py:1204] (2/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_383-321224-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 08:05:30,683 WARNING [train.py:1204] (2/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_283-160525-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 08:05:30,693 WARNING [train.py:1204] (2/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_505-97987-0 from training. Duration: 0.86 2023-10-02 08:05:32,000 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9309W0190-640317 from training. Duration: 0.768 2023-10-02 08:05:35,276 WARNING [train.py:1204] (2/4) Exclude cut with ID _50093_210_4220_1_1534417141207_3432299_280-297390-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 08:05:42,601 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_17292_210_6753_1_1531125388_2943734_320-71385-0_sp1.1 from training. Duration: 0.97275 2023-10-02 08:05:43,845 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.505e+02 1.961e+02 2.160e+02 2.507e+02 3.743e+02, threshold=4.320e+02, percent-clipped=0.0 2023-10-02 08:05:45,339 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_213-103282-0 from training. Duration: 0.78 2023-10-02 08:05:48,334 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.bypass.scale_min, batch_count=808640.0, ans=0.2 2023-10-02 08:05:49,462 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7615_210_4211_1_1528110965_4580160_72-235211-0_sp0.9 from training. Duration: 0.8444375 2023-10-02 08:05:52,199 WARNING [train.py:1204] (2/4) Exclude cut with ID _14867_210_1968_1_1530684179890_3846969_173-320977-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 08:05:54,840 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7046_210_6753_1_1527895337_6920917_783-278109-0_sp0.9 from training. Duration: 0.8555625 2023-10-02 08:05:54,888 WARNING [train.py:1204] (2/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_90-30673-0 from training. Duration: 0.84 2023-10-02 08:05:54,902 WARNING [train.py:1204] (2/4) Exclude cut with ID _22826_210_9994_1_1531908201522_3279169_415-161-0_sp1.1 from training. Duration: 0.8909375 2023-10-02 08:05:54,912 WARNING [train.py:1204] (2/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_86-66192-0_sp0.9 from training. Duration: 0.611125 2023-10-02 08:05:54,920 WARNING [train.py:1204] (2/4) Exclude cut with ID _4626_210_3532_1_1526817652717_3916859_446-206656-0_sp1.1 from training. Duration: 0.6909375 2023-10-02 08:05:56,274 WARNING [train.py:1204] (2/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_166-128941-0_sp0.9 from training. Duration: 0.6 2023-10-02 08:06:00,381 WARNING [train.py:1204] (2/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_345-167986-0 from training. Duration: 0.84 2023-10-02 08:06:04,170 WARNING [train.py:1204] (2/4) Exclude cut with ID _6275_210_2084_1_1528110062213_7669049_973-198085-0 from training. Duration: 0.75 2023-10-02 08:06:05,538 WARNING [train.py:1204] (2/4) Exclude cut with ID _60057_210_10640_1_1534903065793_3333150_171-181133-0 from training. Duration: 0.88 2023-10-02 08:06:05,551 WARNING [train.py:1204] (2/4) Exclude cut with ID _19504_210_9994_1_1531648863242_3325220_336-196513-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 08:06:05,558 WARNING [train.py:1204] (2/4) Exclude cut with ID _6493_210_1747_1_1528459210424_4012470_513-140967-0 from training. Duration: 0.79 2023-10-02 08:06:05,643 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6673_210_3273_1_1527767790_3173240_327-306185-0_sp0.9 from training. Duration: 0.92225 2023-10-02 08:06:11,488 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1032-236863-0_sp1.1 from training. Duration: 0.763625 2023-10-02 08:06:14,559 INFO [train.py:1046] (2/4) Epoch 23, batch 4450, loss[loss=0.187, simple_loss=0.2591, pruned_loss=0.05745, over 23425.00 frames. ], tot_loss[loss=0.1736, simple_loss=0.2503, pruned_loss=0.04843, over 4712982.42 frames. ], batch size: 105, lr: 4.43e-03, grad_scale: 32.0 2023-10-02 08:06:14,599 WARNING [train.py:1204] (2/4) Exclude cut with ID _14867_210_1968_1_1530684179890_3846969_413-29292-0 from training. Duration: 0.9 2023-10-02 08:06:17,510 WARNING [train.py:1204] (2/4) Exclude cut with ID _47251_210_18628_1_1533814063128_4137250_186-343322-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 08:06:20,331 WARNING [train.py:1204] (2/4) Exclude cut with ID _56235_210_10640_1_1534557335235_4161099_624-46091-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 08:06:20,360 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_791-197891-0_sp0.9 from training. Duration: 0.9333125 2023-10-02 08:06:25,938 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_50050_210_16223_1_1535435796_7290138_786-288457-0_sp1.1 from training. Duration: 0.92725 2023-10-02 08:06:25,966 WARNING [train.py:1204] (2/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_213-227283-0_sp1.1 from training. Duration: 0.963625 2023-10-02 08:06:28,730 WARNING [train.py:1204] (2/4) Exclude cut with ID _42659_210_2070_1_1533607442765_3120380_58-341121-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 08:06:31,327 WARNING [train.py:1204] (2/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_506-23200-0_sp1.1 from training. Duration: 0.8818125 2023-10-02 08:06:32,822 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.conv_skip_rate, batch_count=808840.0, ans=0.0 2023-10-02 08:06:34,594 WARNING [train.py:1204] (2/4) Exclude cut with ID _14170_210_5219_1_1530525949868_4469650_336-213530-0_sp1.1 from training. Duration: 0.8181875 2023-10-02 08:06:34,611 WARNING [train.py:1204] (2/4) Exclude cut with ID _64135_210_2253_1_1535677267876_7608972_180-5312-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 08:06:35,916 WARNING [train.py:1204] (2/4) Exclude cut with ID _12302_210_5219_1_1529924446466_8107280_835-279754-0 from training. Duration: 0.9 2023-10-02 08:06:35,917 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_20120_210_6753_1_1531643035_5482859_510-96996-0_sp1.1 from training. Duration: 0.7909375 2023-10-02 08:06:35,991 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_15517_210_6753_1_1530954008_3487336_216-187834-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 08:06:37,929 WARNING [train.py:1204] (2/4) Exclude cut with ID _19504_210_9994_1_1531648863242_3325220_414-113427-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 08:06:37,930 WARNING [train.py:1204] (2/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_264-108685-0_sp0.9 from training. Duration: 0.611125 2023-10-02 08:06:39,326 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_5438_210_1794_1_1527336687_2600009_104-288274-0_sp0.9 from training. Duration: 0.6444375 2023-10-02 08:06:45,081 WARNING [train.py:1204] (2/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_520-39496-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 08:06:45,127 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_37818_210_4238_1_1533620910_4720083_315-308565-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 08:06:46,544 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_2009_210_2071_1_1525481134_4588937_385-302100-0_sp1.1 from training. Duration: 0.7909375 2023-10-02 08:06:47,902 WARNING [train.py:1204] (2/4) Exclude cut with ID _58895_210_8750_1_1535184081320_3649089_277-53219-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 08:06:47,975 WARNING [train.py:1204] (2/4) Exclude cut with ID _26664_210_7770_1_1532258943280_3418270_121-111258-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 08:06:48,136 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.bypass.scale_min, batch_count=808906.6666666666, ans=0.2 2023-10-02 08:06:49,508 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=808906.6666666666, ans=0.1 2023-10-02 08:06:52,000 WARNING [train.py:1204] (2/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_99-167392-0_sp1.1 from training. Duration: 0.5545625 2023-10-02 08:06:52,212 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.feed_forward1.out_proj.dropout_p, batch_count=808906.6666666666, ans=0.1 2023-10-02 08:06:53,314 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1113-7369-0 from training. Duration: 0.85 2023-10-02 08:06:54,620 WARNING [train.py:1204] (2/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_352-59958-0 from training. Duration: 0.86 2023-10-02 08:06:54,625 WARNING [train.py:1204] (2/4) Exclude cut with ID _14886_210_4220_1_1530702020355_3516190_159-73748-0_sp1.1 from training. Duration: 0.7545625 2023-10-02 08:06:56,025 WARNING [train.py:1204] (2/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_131-119569-0_sp1.1 from training. Duration: 0.92725 2023-10-02 08:06:56,091 WARNING [train.py:1204] (2/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_216-343007-0 from training. Duration: 0.66 2023-10-02 08:06:59,029 WARNING [train.py:1204] (2/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_113-92608-0_sp0.9 from training. Duration: 0.6 2023-10-02 08:07:03,115 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_40047_210_2290_1_1533290519_2374311_132-123271-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 08:07:03,158 WARNING [train.py:1204] (2/4) Exclude cut with ID _8793_210_6310_1_1528542985139_2310009_121-179782-0 from training. Duration: 0.8 2023-10-02 08:07:03,179 WARNING [train.py:1204] (2/4) Exclude cut with ID _43997_210_4525_1_1533538632555_5189329_283-218639-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 08:07:03,182 WARNING [train.py:1204] (2/4) Exclude cut with ID _49376_210_5986_1_1533985278732_3370740_297-78397-0_sp1.1 from training. Duration: 0.936375 2023-10-02 08:07:03,195 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_3475_210_2028_1_1526193578_3622569_377-166133-0_sp0.9 from training. Duration: 0.9555625 2023-10-02 08:07:03,201 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_520-213149-0_sp1.1 from training. Duration: 0.92725 2023-10-02 08:07:08,012 WARNING [train.py:1204] (2/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_567-144483-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 08:07:10,680 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_4183_210_2087_1_1526729100_1869838_143-280962-0_sp0.9 from training. Duration: 0.62225 2023-10-02 08:07:10,731 WARNING [train.py:1204] (2/4) Exclude cut with ID _49643_210_7989_1_1534640244224_3939031_611-185515-0 from training. Duration: 0.94 2023-10-02 08:07:13,569 WARNING [train.py:1204] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_88-341718-0_sp0.9 from training. Duration: 0.7333125 2023-10-02 08:07:15,431 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_51225_210_3601_1_1534400634_2327383_49-235441-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 08:07:16,258 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.2.encoder.layers.1.self_attn_weights.whiten_keys, num_groups=4, num_channels=128, metric=3.32 vs. limit=6.0 2023-10-02 08:07:16,820 WARNING [train.py:1204] (2/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_240-294400-0_sp1.1 from training. Duration: 0.936375 2023-10-02 08:07:18,165 WARNING [train.py:1204] (2/4) Exclude cut with ID _55429_210_10626_1_1534467588527_7174143_640-304825-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 08:07:18,185 WARNING [train.py:1204] (2/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_187-221566-0_sp1.1 from training. Duration: 0.3909375 2023-10-02 08:07:19,602 WARNING [train.py:1204] (2/4) Exclude cut with ID _7606_210_4915_1_1528894253277_5080690_127-177761-0_sp0.9 from training. Duration: 0.711125 2023-10-02 08:07:22,436 WARNING [train.py:1204] (2/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_430-116139-0 from training. Duration: 0.85 2023-10-02 08:07:23,793 WARNING [train.py:1204] (2/4) Exclude cut with ID _3675_210_4557_1_1526639934408_4182089_456-18305-0_sp0.9 from training. Duration: 0.8444375 2023-10-02 08:07:26,430 INFO [train.py:1046] (2/4) Epoch 23, batch 4500, loss[loss=0.2115, simple_loss=0.2783, pruned_loss=0.07238, over 19384.00 frames. ], tot_loss[loss=0.1731, simple_loss=0.2501, pruned_loss=0.04807, over 4717006.05 frames. ], batch size: 388, lr: 4.43e-03, grad_scale: 32.0 2023-10-02 08:07:27,981 WARNING [train.py:1204] (2/4) Exclude cut with ID _18513_210_9206_1_1531618215223_3862620_402-5957-0_sp1.1 from training. Duration: 0.9 2023-10-02 08:07:29,244 WARNING [train.py:1204] (2/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_465-156500-0 from training. Duration: 0.72 2023-10-02 08:07:29,245 WARNING [train.py:1204] (2/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_714-310538-0 from training. Duration: 0.86 2023-10-02 08:07:30,780 WARNING [train.py:1204] (2/4) Exclude cut with ID _39411_210_5986_1_1533538825030_3343649_335-301187-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 08:07:35,506 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_41750_210_1960_1_1533520678_3948042_241-158616-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 08:07:37,298 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_46013_210_7234_1_1533776362_3744032_50-141535-0_sp1.1 from training. Duration: 0.9 2023-10-02 08:07:38,700 WARNING [train.py:1204] (2/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_43-206757-0_sp0.9 from training. Duration: 0.8333125 2023-10-02 08:07:38,757 WARNING [train.py:1204] (2/4) Exclude cut with ID _22961_210_8254_1_1531976366672_4278180_172-276296-0_sp1.1 from training. Duration: 0.97275 2023-10-02 08:07:40,084 WARNING [train.py:1204] (2/4) Exclude cut with ID _17956_210_4353_1_1531209568125_6979970_1305-301903-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 08:07:40,121 WARNING [train.py:1204] (2/4) Exclude cut with ID _28323_210_16187_1_1532480395667_4100300_604-44507-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 08:07:50,184 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder_embed.convnext.hidden_balancer.prob, batch_count=809173.3333333334, ans=0.125 2023-10-02 08:07:51,664 WARNING [train.py:1204] (2/4) Exclude cut with ID _23514_210_1389_1_1531832139785_3031439_88-337544-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 08:07:52,963 WARNING [train.py:1204] (2/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_932-3672-0_sp1.1 from training. Duration: 0.963625 2023-10-02 08:07:54,476 WARNING [train.py:1204] (2/4) Exclude cut with ID _46502_210_13794_1_1533724452198_3657159_207-109991-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 08:07:54,519 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_216-73841-0_sp1.1 from training. Duration: 0.87275 2023-10-02 08:07:55,898 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8453_210_4211_1_1528502940_8562730_347-330673-0_sp1.1 from training. Duration: 0.6545625 2023-10-02 08:08:01,396 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_4183_210_2087_1_1526729100_1869838_143-280962-0_sp1.1 from training. Duration: 0.5090625 2023-10-02 08:08:04,589 WARNING [train.py:1204] (2/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_325-113798-0_sp1.1 from training. Duration: 0.77275 2023-10-02 08:08:07,915 WARNING [train.py:1204] (2/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_91-212486-0_sp0.9 from training. Duration: 0.7555625 2023-10-02 08:08:09,149 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.526e+02 1.850e+02 2.045e+02 2.410e+02 3.743e+02, threshold=4.091e+02, percent-clipped=0.0 2023-10-02 08:08:10,666 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8776_210_2040_1_1528638837_4088036_244-278763-0_sp1.1 from training. Duration: 0.8181875 2023-10-02 08:08:10,713 WARNING [train.py:1204] (2/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_106-286130-0 from training. Duration: 0.98 2023-10-02 08:08:10,762 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_2746_210_1988_1_1525872816_3795627_372-205861-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 08:08:12,013 WARNING [train.py:1204] (2/4) Exclude cut with ID _30800_210_22382_1_1532499347904_4515959_240-54646-0_sp1.1 from training. Duration: 0.936375 2023-10-02 08:08:14,502 WARNING [train.py:1204] (2/4) Exclude cut with ID _59500_210_8254_1_1534809610680_3736009_162-180695-0_sp1.1 from training. Duration: 0.936375 2023-10-02 08:08:14,546 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7544_210_6753_1_1528591327_1471426_146-93357-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 08:08:15,039 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.4.encoder.layers.1.self_attn2.whiten, num_groups=1, num_channels=384, metric=11.84 vs. limit=22.5 2023-10-02 08:08:17,635 WARNING [train.py:1204] (2/4) Exclude cut with ID _42657_210_2070_1_1533520654016_3327008_405-87432-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 08:08:17,655 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_4528_210_3273_1_1527076654_3276777_209-219804-0 from training. Duration: 0.69 2023-10-02 08:08:17,656 WARNING [train.py:1204] (2/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_78-182912-0_sp0.9 from training. Duration: 0.6555625 2023-10-02 08:08:17,662 WARNING [train.py:1204] (2/4) Exclude cut with ID _31255_210_13427_1_1532865619495_3815143_322-114998-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 08:08:23,256 WARNING [train.py:1204] (2/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_848-280957-0_sp0.9 from training. Duration: 0.9666875 2023-10-02 08:08:24,605 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7696_210_2040_1_1528207167_3716099_117-125434-0_sp0.9 from training. Duration: 0.8444375 2023-10-02 08:08:26,070 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_1002-109192-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 08:08:28,098 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.3.encoder.layers.0.conv_module2.whiten, num_groups=1, num_channels=512, metric=2.85 vs. limit=15.0 2023-10-02 08:08:28,865 WARNING [train.py:1204] (2/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_138-64210-0_sp0.9 from training. Duration: 0.811125 2023-10-02 08:08:28,891 WARNING [train.py:1204] (2/4) Exclude cut with ID _25808_210_13292_1_1532343514562_7366109_633-47524-0_sp1.1 from training. Duration: 0.9 2023-10-02 08:08:31,440 WARNING [train.py:1204] (2/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_297-5233-0 from training. Duration: 0.95 2023-10-02 08:08:32,856 WARNING [train.py:1204] (2/4) Exclude cut with ID _8394_210_6702_1_1528590504316_3540189_335-274780-0 from training. Duration: 0.78 2023-10-02 08:08:32,866 WARNING [train.py:1204] (2/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_301-330455-0 from training. Duration: 0.8 2023-10-02 08:08:34,890 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.feed_forward3.hidden_balancer.prob, batch_count=809373.3333333334, ans=0.125 2023-10-02 08:08:34,938 INFO [scaling.py:1118] (2/4) WithLoss: name=encoder.encoders.4.encoder.layers.0.self_attn_weights, loss-sum=0.000e+00 2023-10-02 08:08:38,511 WARNING [train.py:1204] (2/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_383-205833-0 from training. Duration: 0.95 2023-10-02 08:08:38,818 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.bypass.scale_min, batch_count=809440.0, ans=0.2 2023-10-02 08:08:39,843 INFO [train.py:1046] (2/4) Epoch 23, batch 4550, loss[loss=0.1732, simple_loss=0.243, pruned_loss=0.05166, over 23647.00 frames. ], tot_loss[loss=0.1736, simple_loss=0.2498, pruned_loss=0.04866, over 4706608.09 frames. ], batch size: 149, lr: 4.43e-03, grad_scale: 32.0 2023-10-02 08:08:39,962 WARNING [train.py:1204] (2/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_48-182155-0 from training. Duration: 0.87 2023-10-02 08:08:41,329 WARNING [train.py:1204] (2/4) Exclude cut with ID _14832_210_8750_1_1530691351539_3171348_1-54315-0_sp1.1 from training. Duration: 0.7909375 2023-10-02 08:08:44,145 WARNING [train.py:1204] (2/4) Exclude cut with ID _44982_210_1511_1_1534312820108_3726029_413-100814-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 08:08:45,458 WARNING [train.py:1204] (2/4) Exclude cut with ID _3231_210_2106_1_1526205864934_6784220_104-261392-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 08:08:46,922 WARNING [train.py:1204] (2/4) Exclude cut with ID _24314_210_4915_1_1532135078116_7170179_1-13588-0_sp1.1 from training. Duration: 0.97275 2023-10-02 08:08:50,177 WARNING [train.py:1204] (2/4) Exclude cut with ID _44304_210_15374_1_1533639352765_4613129_141-8356-0_sp1.1 from training. Duration: 0.963625 2023-10-02 08:08:52,822 WARNING [train.py:1204] (2/4) Exclude cut with ID _37892_210_19596_1_1533198677897_3964058_374-335034-0_sp1.1 from training. Duration: 0.936375 2023-10-02 08:08:54,201 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_195-257512-0_sp0.9 from training. Duration: 0.7444375 2023-10-02 08:08:54,203 WARNING [train.py:1204] (2/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_1267-272693-0_sp1.1 from training. Duration: 0.663625 2023-10-02 08:08:54,203 WARNING [train.py:1204] (2/4) Exclude cut with ID _30196_210_1664_1_1533708496522_7074140_690-326155-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 08:08:57,015 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_68847_210_14656_1_1536070214_2586843_38-20717-0_sp1.1 from training. Duration: 0.97275 2023-10-02 08:08:58,362 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_99-251314-0_sp1.1 from training. Duration: 0.7909375 2023-10-02 08:08:59,878 WARNING [train.py:1204] (2/4) Exclude cut with ID _49376_210_5986_1_1533985278732_3370740_225-95630-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 08:09:03,088 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_2655_210_2087_1_1525688959_3897400_202-147557-0 from training. Duration: 0.85 2023-10-02 08:09:03,151 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_5618_210_1874_1_1527311008_5851_1-214501-0_sp1.1 from training. Duration: 0.626375 2023-10-02 08:09:03,340 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.conv_skip_rate, batch_count=809506.6666666666, ans=0.0 2023-10-02 08:09:03,379 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.conv_module2.balancer1.prob, batch_count=809506.6666666666, ans=0.125 2023-10-02 08:09:04,574 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.conv_module1.balancer1.min_positive, batch_count=809506.6666666666, ans=0.025 2023-10-02 08:09:06,246 WARNING [train.py:1204] (2/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_225-43381-0_sp1.1 from training. Duration: 0.8545625 2023-10-02 08:09:07,543 WARNING [train.py:1204] (2/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_242-3472-0 from training. Duration: 0.73 2023-10-02 08:09:10,843 WARNING [train.py:1204] (2/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_339-104791-0 from training. Duration: 0.49 2023-10-02 08:09:10,905 WARNING [train.py:1204] (2/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_488-92261-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 08:09:13,755 WARNING [train.py:1204] (2/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_175-163894-0 from training. Duration: 0.74 2023-10-02 08:09:15,155 WARNING [train.py:1204] (2/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_41-234353-0_sp0.9 from training. Duration: 0.7555625 2023-10-02 08:09:19,680 WARNING [train.py:1204] (2/4) Exclude cut with ID _47251_210_18628_1_1533814063128_4137250_116-275927-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 08:09:19,717 WARNING [train.py:1204] (2/4) Exclude cut with ID _4697_210_2069_1_1526815801953_3823098_61-223324-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 08:09:20,979 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6961_210_2087_1_1527937168_6815011_562-165518-0_sp1.1 from training. Duration: 0.7 2023-10-02 08:09:23,727 WARNING [train.py:1204] (2/4) Exclude cut with ID _21989_210_5854_1_1531899214931_3271360_133-44759-0 from training. Duration: 0.77 2023-10-02 08:09:25,124 WARNING [train.py:1204] (2/4) Exclude cut with ID _23557_210_8750_1_1531890106370_6846255_77-284923-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 08:09:25,327 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.feed_forward1.hidden_balancer.prob, batch_count=809640.0, ans=0.125 2023-10-02 08:09:27,930 WARNING [train.py:1204] (2/4) Exclude cut with ID _13678_210_7497_1_1530530069226_3463170_464-296127-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 08:09:27,943 WARNING [train.py:1204] (2/4) Exclude cut with ID _22613_210_5219_1_1532253599686_4090170_393-36663-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 08:09:29,368 WARNING [train.py:1204] (2/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_235-262124-0_sp0.9 from training. Duration: 0.7444375 2023-10-02 08:09:29,451 WARNING [train.py:1204] (2/4) Exclude cut with ID _8480_210_1874_1_1528589391205_6728330_755-112625-0 from training. Duration: 0.74 2023-10-02 08:09:30,770 WARNING [train.py:1204] (2/4) Exclude cut with ID _6456_210_1534_1_1527681634212_3181049_215-309359-0 from training. Duration: 0.88 2023-10-02 08:09:30,793 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_3019_210_2028_1_1526044245_3264865_191-72235-0_sp1.1 from training. Duration: 0.8818125 2023-10-02 08:09:32,533 WARNING [train.py:1204] (2/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_380-162648-0 from training. Duration: 0.94 2023-10-02 08:09:34,006 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_92-46009-0 from training. Duration: 0.54 2023-10-02 08:09:34,027 WARNING [train.py:1204] (2/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_53-300544-0_sp0.9 from training. Duration: 0.7444375 2023-10-02 08:09:35,538 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.1.conv_skip_rate, batch_count=809640.0, ans=0.0 2023-10-02 08:09:37,038 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_68235_210_1925_1_1535863247_4484656_101-250943-0_sp1.1 from training. Duration: 0.97275 2023-10-02 08:09:37,047 WARNING [train.py:1204] (2/4) Exclude cut with ID _14970_210_15709_1_1530775547361_1966965_204-22182-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 08:09:37,313 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.balancer2.prob, batch_count=809706.6666666666, ans=0.125 2023-10-02 08:09:38,369 WARNING [train.py:1204] (2/4) Exclude cut with ID _55153_210_2253_1_1534384786677_3162096_310-130092-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 08:09:38,379 WARNING [train.py:1204] (2/4) Exclude cut with ID _60113_210_16271_1_1535164212481_4386079_559-167191-0_sp1.1 from training. Duration: 0.8454375 2023-10-02 08:09:38,485 WARNING [train.py:1204] (2/4) Exclude cut with ID _14368_210_4220_1_1530532714870_3595529_93-58002-0_sp1.1 from training. Duration: 0.5181875 2023-10-02 08:09:39,774 WARNING [train.py:1204] (2/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_110-224688-0 from training. Duration: 0.97 2023-10-02 08:09:41,139 WARNING [train.py:1204] (2/4) Exclude cut with ID _40402_210_18219_1_1533522324115_2953150_178-178004-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 08:09:41,147 WARNING [train.py:1204] (2/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_321-335475-0_sp1.1 from training. Duration: 0.4090625 2023-10-02 08:09:41,203 WARNING [train.py:1204] (2/4) Exclude cut with ID _66863_210_18053_1_1535698758334_8590319_1092-271784-0 from training. Duration: 0.96 2023-10-02 08:09:41,211 WARNING [train.py:1204] (2/4) Exclude cut with ID _28317_210_12742_1_1532480302381_8510325_607-213951-0_sp1.1 from training. Duration: 0.763625 2023-10-02 08:09:41,219 WARNING [train.py:1204] (2/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_468-283113-0 from training. Duration: 0.96 2023-10-02 08:09:44,584 WARNING [train.py:1204] (2/4) Exclude cut with ID _14922_210_6228_1_1530707435273_3693310_13-165512-0_sp1.1 from training. Duration: 0.7454375 2023-10-02 08:09:44,601 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_69630_210_7234_1_1535788670_910989_56-169762-0_sp1.1 from training. Duration: 0.92725 2023-10-02 08:09:46,625 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.3.encoder.layers.1.conv_module2.whiten, num_groups=1, num_channels=512, metric=3.10 vs. limit=15.0 2023-10-02 08:09:47,765 WARNING [train.py:1204] (2/4) Exclude cut with ID _67335_210_5219_1_1535525975652_4275229_121-80357-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 08:09:47,804 WARNING [train.py:1204] (2/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_126-12079-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 08:09:49,242 WARNING [train.py:1204] (2/4) Exclude cut with ID _5294_210_1747_1_1527249643928_3696179_342-270278-0_sp1.1 from training. Duration: 0.5 2023-10-02 08:09:50,628 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_30322_210_1925_1_1532843478_826817_35-328264-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 08:09:51,961 WARNING [train.py:1204] (2/4) Exclude cut with ID _8480_210_1874_1_1528589391205_6728330_755-112625-0_sp1.1 from training. Duration: 0.67275 2023-10-02 08:09:53,330 INFO [train.py:1046] (2/4) Epoch 23, batch 4600, loss[loss=0.1634, simple_loss=0.2544, pruned_loss=0.03617, over 24321.00 frames. ], tot_loss[loss=0.1717, simple_loss=0.2474, pruned_loss=0.04801, over 4702402.23 frames. ], batch size: 74, lr: 4.43e-03, grad_scale: 32.0 2023-10-02 08:09:54,714 WARNING [train.py:1204] (2/4) Exclude cut with ID _38670_210_15379_1_1533448810659_4391059_132-146412-0_sp1.1 from training. Duration: 0.97275 2023-10-02 08:09:56,184 WARNING [train.py:1204] (2/4) Exclude cut with ID _22020_210_2461_1_1531724164513_6326900_563-39691-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 08:09:57,618 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_9460_210_4211_1_1528954587_10049667_572-37035-0_sp1.1 from training. Duration: 0.72725 2023-10-02 08:09:57,629 WARNING [train.py:1204] (2/4) Exclude cut with ID _68777_210_4353_1_1535810390731_3117904_129-166012-0_sp0.9 from training. Duration: 0.9444375 2023-10-02 08:09:59,009 WARNING [train.py:1204] (2/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_487-91921-0_sp1.1 from training. Duration: 0.963625 2023-10-02 08:09:59,251 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.feed_forward2.hidden_balancer.prob, batch_count=809773.3333333334, ans=0.125 2023-10-02 08:10:00,346 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_195-257512-0 from training. Duration: 0.67 2023-10-02 08:10:02,315 WARNING [train.py:1204] (2/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_188-260011-0_sp1.1 from training. Duration: 0.663625 2023-10-02 08:10:08,115 WARNING [train.py:1204] (2/4) Exclude cut with ID _8480_210_1874_1_1528589391205_6728330_309-139896-0_sp0.9 from training. Duration: 0.9 2023-10-02 08:10:08,173 WARNING [train.py:1204] (2/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_866-321248-0_sp1.1 from training. Duration: 0.963625 2023-10-02 08:10:09,715 WARNING [train.py:1204] (2/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_510-216513-0_sp1.1 from training. Duration: 0.97275 2023-10-02 08:10:12,750 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.conv_skip_rate, batch_count=809840.0, ans=0.0 2023-10-02 08:10:16,831 WARNING [train.py:1204] (2/4) Exclude cut with ID _6653_210_2629_1_1527919172441_6732679_758-143677-0 from training. Duration: 0.98 2023-10-02 08:10:16,910 WARNING [train.py:1204] (2/4) Exclude cut with ID _55134_210_21982_1_1534590028377_3882170_50-214427-0_sp1.1 from training. Duration: 0.97275 2023-10-02 08:10:20,127 WARNING [train.py:1204] (2/4) Exclude cut with ID _31649_210_6228_1_1532584608063_4091078_482-301330-0_sp1.1 from training. Duration: 0.97275 2023-10-02 08:10:20,300 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.conv_module1.balancer2.prob, batch_count=809840.0, ans=0.125 2023-10-02 08:10:22,835 WARNING [train.py:1204] (2/4) Exclude cut with ID _35799_210_5854_1_1533263487887_3529071_490-326847-0_sp1.1 from training. Duration: 0.9 2023-10-02 08:10:22,842 WARNING [train.py:1204] (2/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_984-90716-0_sp1.1 from training. Duration: 0.963625 2023-10-02 08:10:29,552 WARNING [train.py:1204] (2/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_166-246557-0 from training. Duration: 0.64 2023-10-02 08:10:29,552 WARNING [train.py:1204] (2/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_51-200220-0_sp1.1 from training. Duration: 0.5090625 2023-10-02 08:10:29,624 WARNING [train.py:1204] (2/4) Exclude cut with ID _51382_210_23862_1_1534492801838_3650104_275-92796-0_sp1.1 from training. Duration: 0.936375 2023-10-02 08:10:34,342 WARNING [train.py:1204] (2/4) Exclude cut with ID _29853_210_7497_1_1532505600851_6642110_461-142829-0_sp1.1 from training. Duration: 0.97275 2023-10-02 08:10:36,007 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.516e+02 1.884e+02 2.130e+02 2.615e+02 3.495e+02, threshold=4.259e+02, percent-clipped=0.0 2023-10-02 08:10:36,076 WARNING [train.py:1204] (2/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_788-298907-0_sp0.9 from training. Duration: 0.988875 2023-10-02 08:10:36,265 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.1.bypass_mid.scale_min, batch_count=809973.3333333334, ans=0.2 2023-10-02 08:10:37,440 WARNING [train.py:1204] (2/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_739-2098-0_sp1.1 from training. Duration: 0.836375 2023-10-02 08:10:41,445 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7543_210_6753_1_1528541907_1399322_165-323619-0 from training. Duration: 0.69 2023-10-02 08:10:42,768 WARNING [train.py:1204] (2/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_401-255491-0_sp1.1 from training. Duration: 0.636375 2023-10-02 08:10:43,096 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.feed_forward3.hidden_balancer.prob, batch_count=809973.3333333334, ans=0.125 2023-10-02 08:10:47,467 WARNING [train.py:1204] (2/4) Exclude cut with ID _66873_210_5517_1_1535454136506_4293370_24-143283-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 08:10:48,813 WARNING [train.py:1204] (2/4) Exclude cut with ID _14459_210_2228_1_1531443641946_7516700_1221-280439-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 08:10:49,744 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.0.layers.0.self_attn2.whiten, num_groups=1, num_channels=192, metric=11.24 vs. limit=22.5 2023-10-02 08:10:50,157 WARNING [train.py:1204] (2/4) Exclude cut with ID _59202_210_26984_1_1534816810612_3549174_469-213482-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 08:10:50,157 WARNING [train.py:1204] (2/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_148-158509-0_sp0.9 from training. Duration: 0.4555625 2023-10-02 08:10:50,194 WARNING [train.py:1204] (2/4) Exclude cut with ID _24314_210_4915_1_1532135078116_7170179_863-259095-0_sp1.1 from training. Duration: 0.97275 2023-10-02 08:10:52,076 WARNING [train.py:1204] (2/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_321-22821-0 from training. Duration: 0.89 2023-10-02 08:10:52,095 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_43831_210_2158_1_1533725502_5872791_55-163137-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 08:10:52,129 WARNING [train.py:1204] (2/4) Exclude cut with ID _27430_210_7770_1_1532604689375_1378590_26-25207-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 08:10:53,483 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_17613_210_2151_1_1531137569_2366160_223-316461-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 08:10:53,758 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.balancer2.prob, batch_count=810040.0, ans=0.125 2023-10-02 08:10:54,483 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.0.layers.0.self_attn_weights.whiten_keys, num_groups=4, num_channels=128, metric=2.98 vs. limit=6.0 2023-10-02 08:10:54,769 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_53679_210_4852_1_1534556726_4186141_541-213370-0_sp1.1 from training. Duration: 0.963625 2023-10-02 08:10:54,846 WARNING [train.py:1204] (2/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_369-295086-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 08:10:56,097 WARNING [train.py:1204] (2/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_91-267696-0 from training. Duration: 0.87 2023-10-02 08:10:56,151 WARNING [train.py:1204] (2/4) Exclude cut with ID _5294_210_1747_1_1527249643928_3696179_180-100713-0 from training. Duration: 0.81 2023-10-02 08:10:56,175 WARNING [train.py:1204] (2/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_32-335103-0 from training. Duration: 0.82 2023-10-02 08:10:57,489 WARNING [train.py:1204] (2/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_133-312727-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 08:10:58,808 WARNING [train.py:1204] (2/4) Exclude cut with ID _59288_210_5854_1_1535077757774_3412246_391-165934-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 08:11:00,100 WARNING [train.py:1204] (2/4) Exclude cut with ID _28323_210_16187_1_1532480395667_4100300_488-1281-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 08:11:00,172 WARNING [train.py:1204] (2/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_124-193207-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 08:11:05,881 INFO [train.py:1046] (2/4) Epoch 23, batch 4650, loss[loss=0.148, simple_loss=0.2231, pruned_loss=0.03645, over 24435.00 frames. ], tot_loss[loss=0.1709, simple_loss=0.2469, pruned_loss=0.0475, over 4715421.98 frames. ], batch size: 58, lr: 4.43e-03, grad_scale: 32.0 2023-10-02 08:11:07,872 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.0.ff2_skip_rate, batch_count=810106.6666666666, ans=0.0 2023-10-02 08:11:10,508 WARNING [train.py:1204] (2/4) Exclude cut with ID _14087_210_2253_1_1530529224512_3014029_226-242140-0_sp0.9 from training. Duration: 0.9 2023-10-02 08:11:12,056 WARNING [train.py:1204] (2/4) Exclude cut with ID _29277_210_3279_1_1532581109293_3173069_392-3694-0_sp1.1 from training. Duration: 0.936375 2023-10-02 08:11:13,330 WARNING [train.py:1204] (2/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_563-329842-0_sp1.1 from training. Duration: 0.97275 2023-10-02 08:11:13,375 WARNING [train.py:1204] (2/4) Exclude cut with ID _58583_210_5236_1_1534726835266_3574249_391-87755-0_sp1.1 from training. Duration: 0.92725 2023-10-02 08:11:13,398 WARNING [train.py:1204] (2/4) Exclude cut with ID _28427_210_4220_1_1532581240967_3061069_281-237827-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 08:11:13,417 WARNING [train.py:1204] (2/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_44-147898-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 08:11:14,784 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_24007_210_1794_1_1531918925_1663875_125-320772-0_sp1.1 from training. Duration: 0.97275 2023-10-02 08:11:18,712 WARNING [train.py:1204] (2/4) Exclude cut with ID _14368_210_4220_1_1530532714870_3595529_143-117421-0 from training. Duration: 0.58 2023-10-02 08:11:21,964 WARNING [train.py:1204] (2/4) Exclude cut with ID _69584_210_5219_1_1535785133775_4044450_325-7656-0_sp0.9 from training. Duration: 0.9555625 2023-10-02 08:11:23,367 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_37351_210_7925_1_1533012931_3812113_265-85780-0 from training. Duration: 0.88 2023-10-02 08:11:23,394 WARNING [train.py:1204] (2/4) Exclude cut with ID _18516_210_9206_1_1531704653206_3692406_331-237896-0_sp1.1 from training. Duration: 0.936375 2023-10-02 08:11:24,752 WARNING [train.py:1204] (2/4) Exclude cut with ID _14629_210_15709_1_1530604298182_956899_6-232695-0 from training. Duration: 0.94 2023-10-02 08:11:24,771 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7544_210_6753_1_1528587000_691468_87-178599-0_sp1.1 from training. Duration: 0.7909375 2023-10-02 08:11:26,156 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_326-303945-0 from training. Duration: 0.99 2023-10-02 08:11:26,175 WARNING [train.py:1204] (2/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_503-30719-0 from training. Duration: 0.98 2023-10-02 08:11:26,182 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_11305_210_1794_1_1529750428_7178148_299-344640-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 08:11:27,474 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_53071_210_16554_1_1534490405_2954066_265-170472-0_sp1.1 from training. Duration: 0.7545625 2023-10-02 08:11:31,510 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_174-277364-0_sp1.1 from training. Duration: 0.6454375 2023-10-02 08:11:31,604 WARNING [train.py:1204] (2/4) Exclude cut with ID _58414_210_8254_1_1535421609162_7151182_193-151884-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 08:11:31,626 WARNING [train.py:1204] (2/4) Exclude cut with ID IC0651W0104-322063 from training. Duration: 0.768 2023-10-02 08:11:34,903 WARNING [train.py:1204] (2/4) Exclude cut with ID _21881_210_3525_1_1531825242757_3798380_139-305982-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 08:11:36,754 WARNING [train.py:1204] (2/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_79-200392-0 from training. Duration: 0.97 2023-10-02 08:11:38,192 WARNING [train.py:1204] (2/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_451-280894-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 08:11:38,200 WARNING [train.py:1204] (2/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_304-273433-0_sp1.1 from training. Duration: 0.9 2023-10-02 08:11:39,524 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_5959_210_1794_1_1527400400_3445726_412-44052-0 from training. Duration: 0.77 2023-10-02 08:11:41,021 WARNING [train.py:1204] (2/4) Exclude cut with ID _4681_210_3525_1_1527251040321_1235713_148-26159-0_sp1.1 from training. Duration: 0.963625 2023-10-02 08:11:42,489 WARNING [train.py:1204] (2/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_887-249555-0_sp0.9 from training. Duration: 0.8555625 2023-10-02 08:11:46,418 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_25236_210_2158_1_1532051968_280567_37-6860-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 08:11:46,650 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.feed_forward1.hidden_balancer.prob, batch_count=810240.0, ans=0.125 2023-10-02 08:11:49,675 WARNING [train.py:1204] (2/4) Exclude cut with ID _29277_210_3279_1_1532581109293_3173069_417-115527-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 08:11:52,939 WARNING [train.py:1204] (2/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_552-134748-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 08:11:54,264 WARNING [train.py:1204] (2/4) Exclude cut with ID _43176_210_8254_1_1533524417469_4174339_188-174143-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 08:11:55,571 WARNING [train.py:1204] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_127-119109-0_sp1.1 from training. Duration: 0.6545625 2023-10-02 08:11:58,275 WARNING [train.py:1204] (2/4) Exclude cut with ID _14922_210_6228_1_1530707435273_3693310_38-187851-0 from training. Duration: 0.62 2023-10-02 08:11:58,315 WARNING [train.py:1204] (2/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_588-125165-0 from training. Duration: 0.83 2023-10-02 08:11:58,477 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.bypass_mid.scale_min, batch_count=810306.6666666666, ans=0.2 2023-10-02 08:11:59,600 WARNING [train.py:1204] (2/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_344-152526-0_sp1.1 from training. Duration: 0.4818125 2023-10-02 08:11:59,601 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7002_210_1794_1_1527935634_4034444_487-236501-0 from training. Duration: 0.66 2023-10-02 08:12:01,126 WARNING [train.py:1204] (2/4) Exclude cut with ID _23825_210_13449_1_1531915313526_3497150_379-16981-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 08:12:08,520 WARNING [train.py:1204] (2/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_297-5233-0_sp1.1 from training. Duration: 0.863625 2023-10-02 08:12:09,744 WARNING [train.py:1204] (2/4) Exclude cut with ID _41298_210_9019_1_1533430371450_4822143_178-283538-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 08:12:09,768 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7046_210_6753_1_1527895337_6920917_642-106799-0 from training. Duration: 0.92 2023-10-02 08:12:09,789 WARNING [train.py:1204] (2/4) Exclude cut with ID _6274_210_2084_1_1527937583837_7494020_532-291952-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 08:12:09,885 WARNING [train.py:1204] (2/4) Exclude cut with ID _47278_210_12041_1_1533902418349_3576110_260-270468-0_sp1.1 from training. Duration: 0.92725 2023-10-02 08:12:09,890 WARNING [train.py:1204] (2/4) Exclude cut with ID _14368_210_4220_1_1530532714870_3595529_144-3287-0_sp0.9 from training. Duration: 0.7444375 2023-10-02 08:12:10,087 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.feed_forward3.hidden_balancer.prob, batch_count=810373.3333333334, ans=0.125 2023-10-02 08:12:10,649 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.2.encoder.layers.2.nonlin_attention.whiten1, num_groups=1, num_channels=288, metric=5.44 vs. limit=10.0 2023-10-02 08:12:12,491 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_162-262717-0_sp0.9 from training. Duration: 0.92225 2023-10-02 08:12:12,782 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.nonlin_attention.balancer.prob, batch_count=810373.3333333334, ans=0.125 2023-10-02 08:12:15,206 WARNING [train.py:1204] (2/4) Exclude cut with ID _3465_210_3525_1_1526175667060_3574049_62-335552-0_sp0.9 from training. Duration: 0.9666875 2023-10-02 08:12:15,216 WARNING [train.py:1204] (2/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_726-272852-0_sp1.1 from training. Duration: 0.92725 2023-10-02 08:12:17,763 INFO [train.py:1046] (2/4) Epoch 23, batch 4700, loss[loss=0.178, simple_loss=0.2608, pruned_loss=0.04763, over 24599.00 frames. ], tot_loss[loss=0.1705, simple_loss=0.2466, pruned_loss=0.04719, over 4711209.72 frames. ], batch size: 68, lr: 4.43e-03, grad_scale: 16.0 2023-10-02 08:12:17,816 WARNING [train.py:1204] (2/4) Exclude cut with ID _14898_210_9206_1_1530766812377_4177190_26-135492-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 08:12:21,074 WARNING [train.py:1204] (2/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_200-95304-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 08:12:21,109 WARNING [train.py:1204] (2/4) Exclude cut with ID _45012_210_12651_1_1533608435136_5371380_240-37101-0_sp1.1 from training. Duration: 0.8454375 2023-10-02 08:12:21,117 WARNING [train.py:1204] (2/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_132-269633-0_sp0.9 from training. Duration: 0.7555625 2023-10-02 08:12:21,185 WARNING [train.py:1204] (2/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_907-151398-0_sp0.9 from training. Duration: 0.411125 2023-10-02 08:12:23,069 WARNING [train.py:1204] (2/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_45-265684-0_sp0.9 from training. Duration: 0.8 2023-10-02 08:12:24,438 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6291_210_3273_1_1527683649_1078498_74-292608-0 from training. Duration: 0.72 2023-10-02 08:12:31,235 WARNING [train.py:1204] (2/4) Exclude cut with ID _59202_210_26984_1_1534816810612_3549174_221-84819-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 08:12:32,588 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_33696_210_3852_1_1533082780_2663375_181-325595-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 08:12:32,633 WARNING [train.py:1204] (2/4) Exclude cut with ID _47251_210_18628_1_1533814063128_4137250_351-154105-0_sp1.1 from training. Duration: 0.963625 2023-10-02 08:12:34,508 WARNING [train.py:1204] (2/4) Exclude cut with ID _35501_210_9288_1_1532998507986_3192189_351-334740-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 08:12:36,442 WARNING [train.py:1204] (2/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_342-100157-0_sp1.1 from training. Duration: 0.4909375 2023-10-02 08:12:40,324 WARNING [train.py:1204] (2/4) Exclude cut with ID _6789_210_1534_1_1527857937218_3118950_262-48853-0 from training. Duration: 0.56 2023-10-02 08:12:40,515 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.feed_forward1.out_proj.dropout_p, batch_count=810506.6666666666, ans=0.1 2023-10-02 08:12:41,617 WARNING [train.py:1204] (2/4) Exclude cut with ID _69884_210_9771_1_1535846392818_4166169_430-201020-0 from training. Duration: 0.97 2023-10-02 08:12:43,172 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_71-136518-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 08:12:44,563 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_28-291982-0_sp1.1 from training. Duration: 0.8818125 2023-10-02 08:12:45,799 WARNING [train.py:1204] (2/4) Exclude cut with ID _54218_210_12210_1_1534413646388_4409259_437-275326-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 08:12:46,031 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=810573.3333333334, ans=0.1 2023-10-02 08:12:48,573 WARNING [train.py:1204] (2/4) Exclude cut with ID _55134_210_21982_1_1534590028377_3882170_40-309139-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 08:12:52,100 WARNING [train.py:1204] (2/4) Exclude cut with ID _32704_210_8954_1_1533017488840_6747370_266-329755-0_sp1.1 from training. Duration: 0.8090625 2023-10-02 08:12:53,443 WARNING [train.py:1204] (2/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_21-174514-0_sp1.1 from training. Duration: 0.3909375 2023-10-02 08:12:56,707 WARNING [train.py:1204] (2/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_409-207378-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 08:13:01,990 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.482e+02 1.784e+02 2.018e+02 2.230e+02 2.934e+02, threshold=4.036e+02, percent-clipped=0.0 2023-10-02 08:13:02,088 WARNING [train.py:1204] (2/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_132-269633-0 from training. Duration: 0.68 2023-10-02 08:13:03,438 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_2245_210_2715_1_1525511548_3540254_310-52483-0_sp0.9 from training. Duration: 0.97775 2023-10-02 08:13:06,664 WARNING [train.py:1204] (2/4) Exclude cut with ID _27430_210_7770_1_1532604689375_1378590_47-77403-0_sp1.1 from training. Duration: 0.92725 2023-10-02 08:13:09,600 WARNING [train.py:1204] (2/4) Exclude cut with ID _7562_210_2084_1_1528887767916_3946229_186-326422-0 from training. Duration: 0.84 2023-10-02 08:13:10,988 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_14056_210_4163_1_1530604483_7572827_230-35993-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 08:13:13,880 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_23854_210_1794_1_1531969696_1329157_57-123491-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 08:13:14,036 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.conv_module2.balancer2.prob, batch_count=810640.0, ans=0.125 2023-10-02 08:13:15,185 WARNING [train.py:1204] (2/4) Exclude cut with ID _14747_210_8341_1_1530685797463_3654089_147-131229-0 from training. Duration: 0.76 2023-10-02 08:13:16,589 WARNING [train.py:1204] (2/4) Exclude cut with ID _30191_210_1664_1_1533189915047_7196670_430-19539-0_sp1.1 from training. Duration: 0.92725 2023-10-02 08:13:16,608 WARNING [train.py:1204] (2/4) Exclude cut with ID _67725_210_27767_1_1535608756671_5105359_44-50762-0_sp1.1 from training. Duration: 0.963625 2023-10-02 08:13:19,267 WARNING [train.py:1204] (2/4) Exclude cut with ID _27485_210_4916_1_1532739191677_3668269_217-161755-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 08:13:19,306 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_5931_210_2040_1_1527428481_4547999_321-193614-0_sp1.1 from training. Duration: 0.6090625 2023-10-02 08:13:19,328 WARNING [train.py:1204] (2/4) Exclude cut with ID _6267_210_2084_1_1527764658526_3894730_375-219887-0 from training. Duration: 0.67 2023-10-02 08:13:21,251 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9087W0022-538393 from training. Duration: 0.768 2023-10-02 08:13:24,530 WARNING [train.py:1204] (2/4) Exclude cut with ID _68777_210_4353_1_1535810390731_3117904_83-196459-0_sp1.1 from training. Duration: 0.963625 2023-10-02 08:13:25,928 WARNING [train.py:1204] (2/4) Exclude cut with ID _69884_210_9771_1_1535846392818_4166169_455-343812-0_sp1.1 from training. Duration: 0.92725 2023-10-02 08:13:25,929 WARNING [train.py:1204] (2/4) Exclude cut with ID _13916_210_9994_1_1530788341126_3763149_414-320174-0_sp1.1 from training. Duration: 0.92725 2023-10-02 08:13:25,932 WARNING [train.py:1204] (2/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_398-239819-0 from training. Duration: 0.99 2023-10-02 08:13:27,204 WARNING [train.py:1204] (2/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_199-232314-0_sp1.1 from training. Duration: 0.92725 2023-10-02 08:13:29,077 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.5.encoder.layers.0.feed_forward2.out_whiten, num_groups=1, num_channels=256, metric=7.87 vs. limit=15.0 2023-10-02 08:13:29,939 WARNING [train.py:1204] (2/4) Exclude cut with ID _2334_210_2667_1_1525608238880_4705129_459-104997-0 from training. Duration: 0.95 2023-10-02 08:13:31,247 INFO [train.py:1046] (2/4) Epoch 23, batch 4750, loss[loss=0.1682, simple_loss=0.2431, pruned_loss=0.04671, over 23631.00 frames. ], tot_loss[loss=0.1719, simple_loss=0.2483, pruned_loss=0.04779, over 4709352.72 frames. ], batch size: 149, lr: 4.43e-03, grad_scale: 16.0 2023-10-02 08:13:32,611 WARNING [train.py:1204] (2/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_441-209395-0_sp1.1 from training. Duration: 0.936375 2023-10-02 08:13:34,537 WARNING [train.py:1204] (2/4) Exclude cut with ID _27052_210_5986_1_1532329197187_3585009_320-80393-0_sp1.1 from training. Duration: 0.97275 2023-10-02 08:13:36,195 INFO [scaling.py:1118] (2/4) WithLoss: name=encoder.encoders.5.encoder.layers.0.self_attn_weights, loss-sum=0.000e+00 2023-10-02 08:13:37,433 WARNING [train.py:1204] (2/4) Exclude cut with ID _21783_210_11357_1_1531742458730_3762679_89-217253-0_sp1.1 from training. Duration: 0.97275 2023-10-02 08:13:37,446 WARNING [train.py:1204] (2/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_453-282588-0_sp1.1 from training. Duration: 0.8909375 2023-10-02 08:13:39,429 WARNING [train.py:1204] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_88-341718-0 from training. Duration: 0.66 2023-10-02 08:13:39,475 WARNING [train.py:1204] (2/4) Exclude cut with ID _43552_210_8913_1_1533726031059_3523429_230-3521-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 08:13:42,197 WARNING [train.py:1204] (2/4) Exclude cut with ID _9679_210_7497_1_1529129212906_4363929_512-313889-0 from training. Duration: 0.81 2023-10-02 08:13:43,689 WARNING [train.py:1204] (2/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_194-252800-0_sp1.1 from training. Duration: 0.8818125 2023-10-02 08:13:43,713 WARNING [train.py:1204] (2/4) Exclude cut with ID _55209_210_1531_1_1534675828996_4597160_239-293541-0_sp1.1 from training. Duration: 0.963625 2023-10-02 08:13:44,914 WARNING [train.py:1204] (2/4) Exclude cut with ID _35799_210_5854_1_1533263487887_3529071_15-325131-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 08:13:47,884 INFO [scaling.py:1118] (2/4) WithLoss: name=encoder.encoders.5.encoder.layers.1.self_attn_weights, loss-sum=0.000e+00 2023-10-02 08:13:50,322 WARNING [train.py:1204] (2/4) Exclude cut with ID _14100_210_8341_1_1530529320613_3678069_279-55760-0 from training. Duration: 0.93 2023-10-02 08:13:53,785 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_489-230703-0_sp1.1 from training. Duration: 0.77275 2023-10-02 08:13:55,208 WARNING [train.py:1204] (2/4) Exclude cut with ID _6694_210_4599_1_1527852637490_3965529_144-185051-0 from training. Duration: 0.96 2023-10-02 08:13:56,521 WARNING [train.py:1204] (2/4) Exclude cut with ID _28461_210_2253_1_1532771775189_3041299_1-228155-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 08:13:56,730 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.conv_module2.balancer1.prob, batch_count=810840.0, ans=0.125 2023-10-02 08:13:59,236 WARNING [train.py:1204] (2/4) Exclude cut with ID _32704_210_8954_1_1533017488840_6747370_686-252323-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 08:13:59,238 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_40896_210_15710_1_1533433956_7100802_1075-294633-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 08:13:59,249 WARNING [train.py:1204] (2/4) Exclude cut with ID _22083_210_8254_1_1531702862439_3678970_30-92879-0_sp1.1 from training. Duration: 0.97275 2023-10-02 08:13:59,315 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9245W0121-616888 from training. Duration: 0.896 2023-10-02 08:14:00,580 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6786_210_1388_1_1527839699_4194495_378-46070-0 from training. Duration: 0.72 2023-10-02 08:14:05,429 WARNING [train.py:1204] (2/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_317-175062-0 from training. Duration: 0.82 2023-10-02 08:14:08,668 WARNING [train.py:1204] (2/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_238-89390-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 08:14:08,841 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.bypass_mid.scale_min, batch_count=810906.6666666666, ans=0.2 2023-10-02 08:14:10,048 WARNING [train.py:1204] (2/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_524-310811-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 08:14:10,324 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.conv_module2.balancer1.prob, batch_count=810906.6666666666, ans=0.125 2023-10-02 08:14:12,764 WARNING [train.py:1204] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_307-158462-0_sp0.9 from training. Duration: 0.7666875 2023-10-02 08:14:12,764 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9309W0191-640318 from training. Duration: 0.512 2023-10-02 08:14:12,768 WARNING [train.py:1204] (2/4) Exclude cut with ID _55153_210_2253_1_1534384786677_3162096_100-204617-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 08:14:13,024 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.bypass.skip_rate, batch_count=810906.6666666666, ans=0.07 2023-10-02 08:14:15,390 WARNING [train.py:1204] (2/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_112-149213-0_sp1.1 from training. Duration: 0.863625 2023-10-02 08:14:16,870 WARNING [train.py:1204] (2/4) Exclude cut with ID _67831_210_12392_1_1535612464202_3092612_346-2148-0_sp1.1 from training. Duration: 0.8454375 2023-10-02 08:14:19,553 WARNING [train.py:1204] (2/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_51-117576-0_sp0.9 from training. Duration: 0.4444375 2023-10-02 08:14:19,700 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.self_attn_weights.pos_emb_skip_rate, batch_count=810973.3333333334, ans=0.0 2023-10-02 08:14:19,762 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=810973.3333333334, ans=0.1 2023-10-02 08:14:20,870 WARNING [train.py:1204] (2/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_300-27235-0 from training. Duration: 0.87 2023-10-02 08:14:20,920 WARNING [train.py:1204] (2/4) Exclude cut with ID _40399_210_4353_1_1533305658194_3279406_195-116825-0_sp1.1 from training. Duration: 0.97275 2023-10-02 08:14:22,550 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7002_210_1794_1_1527939683_5157286_163-87552-0_sp0.9 from training. Duration: 0.9555625 2023-10-02 08:14:22,580 WARNING [train.py:1204] (2/4) Exclude cut with ID _19514_210_9259_1_1531530071330_5084230_560-237597-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 08:14:22,657 WARNING [train.py:1204] (2/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_41-234353-0_sp1.1 from training. Duration: 0.6181875 2023-10-02 08:14:23,881 WARNING [train.py:1204] (2/4) Exclude cut with ID _6274_210_2084_1_1527937583837_7494020_769-168302-0 from training. Duration: 0.71 2023-10-02 08:14:26,671 WARNING [train.py:1204] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_574-45541-0 from training. Duration: 0.65 2023-10-02 08:14:28,147 WARNING [train.py:1204] (2/4) Exclude cut with ID _50310_210_15488_1_1534068043345_3641080_262-67120-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 08:14:29,597 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_53071_210_16554_1_1534493434_1276796_35-11927-0_sp1.1 from training. Duration: 0.963625 2023-10-02 08:14:29,599 WARNING [train.py:1204] (2/4) Exclude cut with ID _3992_210_1747_1_1526644816031_3731060_107-251434-0 from training. Duration: 0.99 2023-10-02 08:14:29,641 WARNING [train.py:1204] (2/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_412-54488-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 08:14:30,962 WARNING [train.py:1204] (2/4) Exclude cut with ID _18516_210_9206_1_1531704653206_3692406_77-258490-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 08:14:32,973 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_40047_210_2290_1_1533290519_2374311_195-165693-0_sp0.9 from training. Duration: 0.988875 2023-10-02 08:14:34,189 WARNING [train.py:1204] (2/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_296-36971-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 08:14:35,452 WARNING [train.py:1204] (2/4) Exclude cut with ID _14970_210_15709_1_1530773543493_1715730_187-20259-0_sp1.1 from training. Duration: 0.8090625 2023-10-02 08:14:37,425 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.conv_module2.balancer2.prob, batch_count=811040.0, ans=0.125 2023-10-02 08:14:39,878 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_53373_210_5399_1_1534229471_4715695_135-305927-0_sp1.1 from training. Duration: 0.936375 2023-10-02 08:14:39,902 WARNING [train.py:1204] (2/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_60-18123-0 from training. Duration: 0.88 2023-10-02 08:14:41,205 WARNING [train.py:1204] (2/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_166-166990-0 from training. Duration: 0.96 2023-10-02 08:14:42,545 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_5438_210_1794_1_1527336687_2600009_233-58072-0 from training. Duration: 0.77 2023-10-02 08:14:42,759 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.balancer2.prob, batch_count=811106.6666666666, ans=0.125 2023-10-02 08:14:43,873 INFO [train.py:1046] (2/4) Epoch 23, batch 4800, loss[loss=0.1682, simple_loss=0.2538, pruned_loss=0.04128, over 24597.00 frames. ], tot_loss[loss=0.1729, simple_loss=0.2494, pruned_loss=0.04823, over 4715171.18 frames. ], batch size: 71, lr: 4.42e-03, grad_scale: 32.0 2023-10-02 08:14:45,326 WARNING [train.py:1204] (2/4) Exclude cut with ID _8833_210_5219_1_1528592665210_7553059_611-80313-0_sp0.9 from training. Duration: 0.711125 2023-10-02 08:14:45,346 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_53555_210_5632_1_1534595220_7232297_118-43239-0_sp1.1 from training. Duration: 0.936375 2023-10-02 08:14:46,697 WARNING [train.py:1204] (2/4) Exclude cut with ID _12369_210_4381_1_1530356078581_4388520_480-13146-0 from training. Duration: 0.85 2023-10-02 08:14:49,684 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=811106.6666666666, ans=0.1 2023-10-02 08:14:50,847 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_892-28814-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 08:14:52,636 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_24899_210_16554_1_1532046422_3827462_160-21255-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 08:14:58,702 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_154-316450-0_sp1.1 from training. Duration: 0.6909375 2023-10-02 08:15:00,154 WARNING [train.py:1204] (2/4) Exclude cut with ID _30196_210_1664_1_1533708496522_7074140_1225-301232-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 08:15:01,481 WARNING [train.py:1204] (2/4) Exclude cut with ID _28783_210_7943_1_1532671164808_7155479_414-208100-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 08:15:01,514 WARNING [train.py:1204] (2/4) Exclude cut with ID _19023_210_4353_1_1531378892552_5820780_83-254794-0 from training. Duration: 0.86 2023-10-02 08:15:01,572 WARNING [train.py:1204] (2/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_591-83752-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 08:15:01,610 WARNING [train.py:1204] (2/4) Exclude cut with ID _29877_210_6228_1_1532397791106_5276149_266-62291-0_sp1.1 from training. Duration: 0.7909375 2023-10-02 08:15:03,000 WARNING [train.py:1204] (2/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_280-161633-0_sp1.1 from training. Duration: 0.663625 2023-10-02 08:15:07,581 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7543_210_6753_1_1528539468_333764_39-156634-0_sp1.1 from training. Duration: 0.97275 2023-10-02 08:15:07,700 WARNING [train.py:1204] (2/4) Exclude cut with ID _67499_210_17282_1_1535509805220_3692430_505-312248-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 08:15:09,437 WARNING [train.py:1204] (2/4) Exclude cut with ID _5294_210_1747_1_1527249643928_3696179_180-100713-0_sp0.9 from training. Duration: 0.9 2023-10-02 08:15:10,797 WARNING [train.py:1204] (2/4) Exclude cut with ID _26528_210_9070_1_1532570410842_3851310_508-148132-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 08:15:10,807 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_211-339622-0_sp1.1 from training. Duration: 0.4090625 2023-10-02 08:15:10,819 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_40896_210_15710_1_1533433956_7100802_339-138356-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 08:15:10,894 WARNING [train.py:1204] (2/4) Exclude cut with ID _27060_210_5986_1_1532674838922_3522549_211-19933-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 08:15:12,233 WARNING [train.py:1204] (2/4) Exclude cut with ID _60229_210_27767_1_1534838406158_3655350_457-117923-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 08:15:14,916 WARNING [train.py:1204] (2/4) Exclude cut with ID _55429_210_10626_1_1534467588527_7174143_460-141439-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 08:15:16,292 WARNING [train.py:1204] (2/4) Exclude cut with ID _59170_210_6721_1_1534983633960_4203335_263-10780-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 08:15:16,303 WARNING [train.py:1204] (2/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_872-264525-0_sp0.9 from training. Duration: 0.72225 2023-10-02 08:15:17,691 WARNING [train.py:1204] (2/4) Exclude cut with ID _7624_210_1531_1_1528109802198_4933101_418-349834-0_sp0.9 from training. Duration: 0.6666875 2023-10-02 08:15:19,004 WARNING [train.py:1204] (2/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_27-53998-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 08:15:20,960 WARNING [train.py:1204] (2/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_202-183922-0 from training. Duration: 0.85 2023-10-02 08:15:20,972 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7002_210_1794_1_1527939683_5157286_1-281264-0 from training. Duration: 0.44 2023-10-02 08:15:22,288 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_53753_210_7234_1_1534320003_3594148_493-9314-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 08:15:22,307 WARNING [train.py:1204] (2/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_217-95056-0_sp1.1 from training. Duration: 0.8909375 2023-10-02 08:15:23,816 WARNING [train.py:1204] (2/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_406-45086-0_sp1.1 from training. Duration: 0.72725 2023-10-02 08:15:23,822 WARNING [train.py:1204] (2/4) Exclude cut with ID _31649_210_6228_1_1532584608063_4091078_505-213821-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 08:15:23,836 WARNING [train.py:1204] (2/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_317-175062-0_sp0.9 from training. Duration: 0.911125 2023-10-02 08:15:26,501 WARNING [train.py:1204] (2/4) Exclude cut with ID _5444_210_2151_1_1527418808464_5312210_726-27165-0_sp1.1 from training. Duration: 0.6090625 2023-10-02 08:15:26,534 WARNING [train.py:1204] (2/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_494-170437-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 08:15:28,382 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.460e+02 1.842e+02 1.977e+02 2.264e+02 3.634e+02, threshold=3.953e+02, percent-clipped=0.0 2023-10-02 08:15:31,191 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_474-64054-0_sp1.1 from training. Duration: 0.936375 2023-10-02 08:15:32,678 WARNING [train.py:1204] (2/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_521-7556-0_sp1.1 from training. Duration: 0.97275 2023-10-02 08:15:34,548 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_269-348505-0_sp1.1 from training. Duration: 0.92725 2023-10-02 08:15:34,798 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.ff2_skip_rate, batch_count=811306.6666666666, ans=0.0 2023-10-02 08:15:39,173 WARNING [train.py:1204] (2/4) Exclude cut with ID _21878_210_3525_1_1531738772002_7264330_559-184862-0 from training. Duration: 0.94 2023-10-02 08:15:39,204 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_68702_210_23689_1_1535799577_3420068_92-76040-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 08:15:40,425 WARNING [train.py:1204] (2/4) Exclude cut with ID _22826_210_9994_1_1531908201522_3279169_227-102052-0_sp1.1 from training. Duration: 0.97275 2023-10-02 08:15:40,461 WARNING [train.py:1204] (2/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_718-250907-0_sp0.9 from training. Duration: 0.7666875 2023-10-02 08:15:40,737 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=811306.6666666666, ans=0.1 2023-10-02 08:15:41,842 WARNING [train.py:1204] (2/4) Exclude cut with ID _14069_210_5632_1_1530512692033_4803390_352-143891-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 08:15:42,418 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.3.encoder.layers.3.conv_module2.whiten, num_groups=1, num_channels=512, metric=7.53 vs. limit=15.0 2023-10-02 08:15:44,730 WARNING [train.py:1204] (2/4) Exclude cut with ID _6267_210_2084_1_1527764658526_3894730_65-106602-0_sp1.1 from training. Duration: 0.936375 2023-10-02 08:15:46,027 WARNING [train.py:1204] (2/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_202-183922-0_sp0.9 from training. Duration: 0.9444375 2023-10-02 08:15:46,034 WARNING [train.py:1204] (2/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_465-268827-0_sp1.1 from training. Duration: 0.97275 2023-10-02 08:15:46,091 WARNING [train.py:1204] (2/4) Exclude cut with ID _12369_210_4381_1_1530356078581_4388520_58-29064-0_sp1.1 from training. Duration: 0.9 2023-10-02 08:15:46,125 WARNING [train.py:1204] (2/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_1-99680-0_sp0.9 from training. Duration: 0.7444375 2023-10-02 08:15:47,432 WARNING [train.py:1204] (2/4) Exclude cut with ID _13253_210_1925_1_1530757775819_3577873_191-144977-0_sp1.1 from training. Duration: 0.7090625 2023-10-02 08:15:47,616 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.conv_module1.balancer2.prob, batch_count=811373.3333333334, ans=0.125 2023-10-02 08:15:52,000 WARNING [train.py:1204] (2/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_276-112684-0_sp1.1 from training. Duration: 0.92725 2023-10-02 08:15:52,008 WARNING [train.py:1204] (2/4) Exclude cut with ID _64639_210_5816_1_1535367619437_3528000_358-411-0_sp1.1 from training. Duration: 0.97275 2023-10-02 08:15:52,014 WARNING [train.py:1204] (2/4) Exclude cut with ID _31649_210_6228_1_1532584608063_4091078_106-21199-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 08:15:53,367 WARNING [train.py:1204] (2/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_901-79438-0 from training. Duration: 0.94 2023-10-02 08:15:56,012 WARNING [train.py:1204] (2/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_718-250907-0 from training. Duration: 0.69 2023-10-02 08:15:56,025 WARNING [train.py:1204] (2/4) Exclude cut with ID _5287_210_2084_1_1527418883040_3878670_363-194161-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 08:15:56,028 WARNING [train.py:1204] (2/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_586-321321-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 08:15:56,094 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_68900_210_2087_1_1535713157_3708529_164-292661-0_sp1.1 from training. Duration: 0.963625 2023-10-02 08:15:56,095 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_26433_210_12108_1_1532157158_4056480_114-144878-0_sp1.1 from training. Duration: 0.97275 2023-10-02 08:15:57,956 INFO [train.py:1046] (2/4) Epoch 23, batch 4850, loss[loss=0.165, simple_loss=0.2563, pruned_loss=0.03688, over 24636.00 frames. ], tot_loss[loss=0.1736, simple_loss=0.2501, pruned_loss=0.0485, over 4713818.78 frames. ], batch size: 68, lr: 4.42e-03, grad_scale: 32.0 2023-10-02 08:15:59,350 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_15517_210_6753_1_1530954008_3487336_96-341874-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 08:16:07,952 WARNING [train.py:1204] (2/4) Exclude cut with ID _14268_210_9206_1_1530622771285_4714099_637-86630-0 from training. Duration: 0.51 2023-10-02 08:16:09,697 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_28211_210_6388_1_1532861506_4523224_121-320092-0_sp1.1 from training. Duration: 0.92725 2023-10-02 08:16:11,234 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.conv_skip_rate, batch_count=811506.6666666666, ans=0.0 2023-10-02 08:16:15,062 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_141-89113-0_sp0.9 from training. Duration: 0.9555625 2023-10-02 08:16:15,149 WARNING [train.py:1204] (2/4) Exclude cut with ID _6789_210_1534_1_1527857937218_3118950_311-61262-0_sp1.1 from training. Duration: 0.5909375 2023-10-02 08:16:16,410 WARNING [train.py:1204] (2/4) Exclude cut with ID _58480_210_12392_1_1534811121111_3646160_389-200461-0_sp1.1 from training. Duration: 0.97275 2023-10-02 08:16:19,215 WARNING [train.py:1204] (2/4) Exclude cut with ID _41326_210_5854_1_1533778076509_3492080_28-156553-0_sp1.1 from training. Duration: 0.92725 2023-10-02 08:16:20,544 WARNING [train.py:1204] (2/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_577-287150-0_sp1.1 from training. Duration: 0.7454375 2023-10-02 08:16:20,617 WARNING [train.py:1204] (2/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_40-129756-0_sp1.1 from training. Duration: 0.736375 2023-10-02 08:16:20,626 WARNING [train.py:1204] (2/4) Exclude cut with ID _60113_210_16271_1_1535164212481_4386079_490-280290-0 from training. Duration: 0.98 2023-10-02 08:16:26,576 WARNING [train.py:1204] (2/4) Exclude cut with ID _58059_210_5854_1_1534901471149_3164808_34-39905-0_sp1.1 from training. Duration: 0.963625 2023-10-02 08:16:28,010 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_791-197891-0_sp1.1 from training. Duration: 0.763625 2023-10-02 08:16:28,030 WARNING [train.py:1204] (2/4) Exclude cut with ID _6789_210_1534_1_1527857937218_3118950_262-48853-0_sp1.1 from training. Duration: 0.5090625 2023-10-02 08:16:28,085 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_2655_210_2087_1_1525688959_3897400_52-279899-0_sp0.9 from training. Duration: 0.7666875 2023-10-02 08:16:28,089 WARNING [train.py:1204] (2/4) Exclude cut with ID _14884_210_2252_1_1530707459689_5561250_114-201399-0 from training. Duration: 0.76 2023-10-02 08:16:32,809 WARNING [train.py:1204] (2/4) Exclude cut with ID _19023_210_4353_1_1531378892552_5820780_83-254794-0_sp0.9 from training. Duration: 0.9555625 2023-10-02 08:16:32,837 WARNING [train.py:1204] (2/4) Exclude cut with ID _53489_210_6228_1_1534298357169_3835970_10-297919-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 08:16:36,235 WARNING [train.py:1204] (2/4) Exclude cut with ID _44585_210_18628_1_1533605350387_4518199_418-3055-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 08:16:36,265 WARNING [train.py:1204] (2/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_310-109461-0 from training. Duration: 0.8 2023-10-02 08:16:37,484 WARNING [train.py:1204] (2/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_235-262124-0 from training. Duration: 0.67 2023-10-02 08:16:40,545 WARNING [train.py:1204] (2/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_195-167011-0_sp1.1 from training. Duration: 0.6818125 2023-10-02 08:16:47,431 WARNING [train.py:1204] (2/4) Exclude cut with ID _7852_210_5219_1_1528286471316_8139400_188-55162-0_sp1.1 from training. Duration: 0.936375 2023-10-02 08:16:47,473 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_35361_210_4238_1_1533262810_7793476_675-87012-0 from training. Duration: 0.89 2023-10-02 08:16:48,848 WARNING [train.py:1204] (2/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_465-81427-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 08:16:48,854 WARNING [train.py:1204] (2/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_597-154640-0_sp1.1 from training. Duration: 0.8181875 2023-10-02 08:16:50,193 WARNING [train.py:1204] (2/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_186-309161-0_sp0.9 from training. Duration: 0.6 2023-10-02 08:16:52,862 WARNING [train.py:1204] (2/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_546-119312-0 from training. Duration: 0.99 2023-10-02 08:16:52,866 WARNING [train.py:1204] (2/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_330-240060-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 08:16:52,946 WARNING [train.py:1204] (2/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_477-255782-0 from training. Duration: 0.82 2023-10-02 08:16:52,962 WARNING [train.py:1204] (2/4) Exclude cut with ID _14970_210_15709_1_1530773543493_1715730_150-248939-0_sp1.1 from training. Duration: 0.97275 2023-10-02 08:16:54,268 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_556-206615-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 08:16:54,323 WARNING [train.py:1204] (2/4) Exclude cut with ID _3465_210_3525_1_1526175667060_3574049_308-231735-0 from training. Duration: 0.73 2023-10-02 08:16:54,629 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.conv_module2.balancer2.min_positive, batch_count=811706.6666666666, ans=0.05 2023-10-02 08:17:02,237 WARNING [train.py:1204] (2/4) Exclude cut with ID _27485_210_4916_1_1532739191677_3668269_66-236571-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 08:17:03,762 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.feed_forward3.hidden_balancer.prob, batch_count=811706.6666666666, ans=0.125 2023-10-02 08:17:08,277 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_2221_210_1988_1_1525597051_4935541_415-296882-0_sp0.9 from training. Duration: 0.8555625 2023-10-02 08:17:08,290 WARNING [train.py:1204] (2/4) Exclude cut with ID _64521_210_14699_1_1535243427259_3815328_231-78630-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 08:17:08,510 INFO [scaling.py:1118] (2/4) WithLoss: name=encoder.encoders.3.encoder.layers.3.self_attn_weights, loss-sum=0.000e+00 2023-10-02 08:17:11,412 INFO [train.py:1046] (2/4) Epoch 23, batch 4900, loss[loss=0.1592, simple_loss=0.2243, pruned_loss=0.04699, over 23604.00 frames. ], tot_loss[loss=0.1724, simple_loss=0.2491, pruned_loss=0.04787, over 4714480.50 frames. ], batch size: 256, lr: 4.42e-03, grad_scale: 32.0 2023-10-02 08:17:14,419 WARNING [train.py:1204] (2/4) Exclude cut with ID _13942_210_9988_1_1530604906036_3733360_499-55676-0 from training. Duration: 0.62 2023-10-02 08:17:14,421 WARNING [train.py:1204] (2/4) Exclude cut with ID _49376_210_5986_1_1533985278732_3370740_301-154763-0_sp1.1 from training. Duration: 0.8909375 2023-10-02 08:17:18,652 WARNING [train.py:1204] (2/4) Exclude cut with ID _27485_210_4916_1_1532739191677_3668269_255-216698-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 08:17:19,368 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.2.encoder.layers.1.self_attn_weights.whiten_keys, num_groups=4, num_channels=128, metric=3.48 vs. limit=6.0 2023-10-02 08:17:19,914 WARNING [train.py:1204] (2/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_137-334143-0_sp1.1 from training. Duration: 0.97275 2023-10-02 08:17:21,150 WARNING [train.py:1204] (2/4) Exclude cut with ID _6456_210_1534_1_1527681634212_3181049_215-309359-0_sp1.1 from training. Duration: 0.8 2023-10-02 08:17:22,607 WARNING [train.py:1204] (2/4) Exclude cut with ID _65640_210_6310_1_1535368201445_3173250_209-13008-0 from training. Duration: 0.99 2023-10-02 08:17:22,804 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.feed_forward2.hidden_balancer.prob, batch_count=811773.3333333334, ans=0.125 2023-10-02 08:17:26,792 WARNING [train.py:1204] (2/4) Exclude cut with ID _12369_210_4381_1_1530356078581_4388520_58-29064-0 from training. Duration: 0.99 2023-10-02 08:17:31,349 WARNING [train.py:1204] (2/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_410-287857-0 from training. Duration: 0.93 2023-10-02 08:17:31,426 WARNING [train.py:1204] (2/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_153-153537-0 from training. Duration: 0.91 2023-10-02 08:17:33,414 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7544_210_6753_1_1528587722_3576686_172-171750-0_sp0.9 from training. Duration: 0.72225 2023-10-02 08:17:33,452 WARNING [train.py:1204] (2/4) Exclude cut with ID _64521_210_14699_1_1535243427259_3815328_464-183853-0_sp1.1 from training. Duration: 0.97275 2023-10-02 08:17:33,465 WARNING [train.py:1204] (2/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_385-210308-0_sp1.1 from training. Duration: 0.963625 2023-10-02 08:17:33,472 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_46013_210_7234_1_1533776362_3744032_354-261865-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 08:17:33,485 WARNING [train.py:1204] (2/4) Exclude cut with ID _3067_210_2667_1_1526212974640_4503168_250-285937-0_sp1.1 from training. Duration: 0.72725 2023-10-02 08:17:33,535 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_40036_210_13405_1_1533648647_1315388_48-146016-0 from training. Duration: 0.93 2023-10-02 08:17:35,500 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.conv_module2.balancer1.prob, batch_count=811840.0, ans=0.125 2023-10-02 08:17:38,374 WARNING [train.py:1204] (2/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_280-161633-0 from training. Duration: 0.73 2023-10-02 08:17:38,414 WARNING [train.py:1204] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_308-161067-0_sp0.9 from training. Duration: 0.7333125 2023-10-02 08:17:41,048 WARNING [train.py:1204] (2/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_68-212801-0_sp0.9 from training. Duration: 0.811125 2023-10-02 08:17:41,107 WARNING [train.py:1204] (2/4) Exclude cut with ID _6789_210_1534_1_1527857937218_3118950_311-61262-0_sp0.9 from training. Duration: 0.72225 2023-10-02 08:17:43,802 WARNING [train.py:1204] (2/4) Exclude cut with ID _14632_210_9988_1_1530691279392_3923200_102-204477-0_sp1.1 from training. Duration: 0.8818125 2023-10-02 08:17:43,861 WARNING [train.py:1204] (2/4) Exclude cut with ID _65242_210_18628_1_1535369393717_3823149_290-117517-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 08:17:45,272 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_69630_210_7234_1_1535788670_910989_35-337140-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 08:17:45,286 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_5618_210_1874_1_1527311008_5851_1-214501-0 from training. Duration: 0.689 2023-10-02 08:17:46,676 WARNING [train.py:1204] (2/4) Exclude cut with ID _8064_210_5399_1_1528372299291_4282973_168-50337-0_sp0.9 from training. Duration: 0.9333125 2023-10-02 08:17:46,748 WARNING [train.py:1204] (2/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_185-14438-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 08:17:48,014 WARNING [train.py:1204] (2/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_61-327062-0 from training. Duration: 0.81 2023-10-02 08:17:48,028 WARNING [train.py:1204] (2/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_98-741-0 from training. Duration: 0.8 2023-10-02 08:17:50,778 WARNING [train.py:1204] (2/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_312-204798-0 from training. Duration: 0.93 2023-10-02 08:17:52,242 WARNING [train.py:1204] (2/4) Exclude cut with ID _14998_210_5219_1_1530705582080_7474970_787-294416-0_sp0.9 from training. Duration: 0.911125 2023-10-02 08:17:52,496 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.conv_module1.balancer1.prob, batch_count=811906.6666666666, ans=0.125 2023-10-02 08:17:53,467 WARNING [train.py:1204] (2/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_140-181176-0_sp1.1 from training. Duration: 0.836375 2023-10-02 08:17:53,500 WARNING [train.py:1204] (2/4) Exclude cut with ID _14687_210_5854_1_1530703838259_3664889_115-239046-0_sp1.1 from training. Duration: 0.6090625 2023-10-02 08:17:53,538 WARNING [train.py:1204] (2/4) Exclude cut with ID _56423_210_5854_1_1534896061049_3165969_370-230614-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 08:17:54,801 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.538e+02 1.777e+02 1.982e+02 2.157e+02 3.751e+02, threshold=3.964e+02, percent-clipped=0.0 2023-10-02 08:17:54,863 WARNING [train.py:1204] (2/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_406-275649-0_sp0.9 from training. Duration: 0.4666875 2023-10-02 08:17:54,882 WARNING [train.py:1204] (2/4) Exclude cut with ID _15150_210_8341_1_1530752411803_7256384_22-138274-0_sp1.1 from training. Duration: 0.87275 2023-10-02 08:17:54,923 WARNING [train.py:1204] (2/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_125-125818-0 from training. Duration: 0.96 2023-10-02 08:17:58,002 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_4399_210_1874_1_1526705485_2400757_220-47974-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 08:18:00,880 WARNING [train.py:1204] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_308-161067-0_sp1.1 from training. Duration: 0.6 2023-10-02 08:18:02,243 WARNING [train.py:1204] (2/4) Exclude cut with ID _53489_210_6228_1_1534298357169_3835970_102-57645-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 08:18:06,123 WARNING [train.py:1204] (2/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_178-177582-0 from training. Duration: 0.74 2023-10-02 08:18:06,190 WARNING [train.py:1204] (2/4) Exclude cut with ID _14349_210_8341_1_1530615710818_3442470_170-336541-0_sp1.1 from training. Duration: 0.7818125 2023-10-02 08:18:07,553 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_521-214276-0_sp0.9 from training. Duration: 0.488875 2023-10-02 08:18:08,896 WARNING [train.py:1204] (2/4) Exclude cut with ID _66070_210_7640_1_1535441393096_7805310_489-292631-0 from training. Duration: 0.79 2023-10-02 08:18:13,158 WARNING [train.py:1204] (2/4) Exclude cut with ID _14069_210_5632_1_1530512692033_4803390_438-340023-0_sp1.1 from training. Duration: 0.936375 2023-10-02 08:18:14,478 WARNING [train.py:1204] (2/4) Exclude cut with ID _7852_210_5219_1_1528286471316_8139400_242-105336-0_sp1.1 from training. Duration: 0.6545625 2023-10-02 08:18:15,808 WARNING [train.py:1204] (2/4) Exclude cut with ID _3867_210_1883_1_1526641200575_4475830_272-215563-0 from training. Duration: 0.57 2023-10-02 08:18:15,823 WARNING [train.py:1204] (2/4) Exclude cut with ID _14368_210_4220_1_1530532714870_3595529_143-117421-0_sp0.9 from training. Duration: 0.6444375 2023-10-02 08:18:15,829 WARNING [train.py:1204] (2/4) Exclude cut with ID _9235_210_5219_1_1528891154253_8046120_30-326681-0_sp1.1 from training. Duration: 0.7545625 2023-10-02 08:18:17,246 WARNING [train.py:1204] (2/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_538-218448-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 08:18:21,293 WARNING [train.py:1204] (2/4) Exclude cut with ID _33640_210_2655_1_1533117605479_4855469_60-192970-0_sp1.1 from training. Duration: 0.963625 2023-10-02 08:18:21,306 WARNING [train.py:1204] (2/4) Exclude cut with ID _65585_210_12210_1_1535450451584_3764098_112-294301-0_sp1.1 from training. Duration: 0.8 2023-10-02 08:18:21,326 WARNING [train.py:1204] (2/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_643-37412-0_sp1.1 from training. Duration: 0.936375 2023-10-02 08:18:21,342 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_9302_210_2550_1_1528889962_4655416_638-301620-0_sp1.1 from training. Duration: 0.42725 2023-10-02 08:18:22,813 WARNING [train.py:1204] (2/4) Exclude cut with ID _3867_210_1883_1_1526641200575_4475830_272-215563-0_sp1.1 from training. Duration: 0.5181875 2023-10-02 08:18:24,008 INFO [train.py:1046] (2/4) Epoch 23, batch 4950, loss[loss=0.1833, simple_loss=0.2522, pruned_loss=0.05719, over 23710.00 frames. ], tot_loss[loss=0.1712, simple_loss=0.2476, pruned_loss=0.04742, over 4699714.32 frames. ], batch size: 164, lr: 4.42e-03, grad_scale: 32.0 2023-10-02 08:18:25,889 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_53679_210_4852_1_1534556726_4186141_358-154143-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 08:18:25,899 WARNING [train.py:1204] (2/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_841-341679-0_sp0.9 from training. Duration: 0.6444375 2023-10-02 08:18:28,760 WARNING [train.py:1204] (2/4) Exclude cut with ID _7160_210_1876_1_1527940923231_3703799_302-150728-0 from training. Duration: 0.78 2023-10-02 08:18:30,041 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_125-203105-0 from training. Duration: 0.8 2023-10-02 08:18:30,063 WARNING [train.py:1204] (2/4) Exclude cut with ID _5585_210_2667_1_1527418813324_8007340_89-332072-0_sp0.9 from training. Duration: 0.711125 2023-10-02 08:18:31,437 WARNING [train.py:1204] (2/4) Exclude cut with ID _3992_210_1747_1_1526644816031_3731060_36-147855-0 from training. Duration: 0.78 2023-10-02 08:18:31,463 WARNING [train.py:1204] (2/4) Exclude cut with ID _59256_210_5986_1_1535076085997_3589120_172-239045-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 08:18:31,471 WARNING [train.py:1204] (2/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_472-216663-0_sp1.1 from training. Duration: 0.836375 2023-10-02 08:18:31,524 WARNING [train.py:1204] (2/4) Exclude cut with ID _36512_210_6987_1_1533033143998_3126069_181-306500-0_sp1.1 from training. Duration: 0.863625 2023-10-02 08:18:33,175 WARNING [train.py:1204] (2/4) Exclude cut with ID _56524_210_9642_1_1534572011779_3619170_492-95008-0_sp1.1 from training. Duration: 0.97275 2023-10-02 08:18:35,929 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_33348_210_5632_1_1533086912_3324030_119-90326-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 08:18:35,962 WARNING [train.py:1204] (2/4) Exclude cut with ID _14368_210_4220_1_1530532714870_3595529_231-137892-0_sp1.1 from training. Duration: 0.9 2023-10-02 08:18:36,071 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8453_210_4211_1_1528502940_8562730_596-306820-0_sp1.1 from training. Duration: 0.8818125 2023-10-02 08:18:37,405 WARNING [train.py:1204] (2/4) Exclude cut with ID _27052_210_5986_1_1532329197187_3585009_50-242750-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 08:18:40,164 WARNING [train.py:1204] (2/4) Exclude cut with ID _13913_210_9994_1_1530579615187_3383897_358-98435-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 08:18:40,191 WARNING [train.py:1204] (2/4) Exclude cut with ID _27013_210_1389_1_1532437173240_3252649_399-229778-0_sp1.1 from training. Duration: 0.963625 2023-10-02 08:18:42,915 WARNING [train.py:1204] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_185-174805-0_sp0.9 from training. Duration: 0.7555625 2023-10-02 08:18:45,757 WARNING [train.py:1204] (2/4) Exclude cut with ID _65585_210_12210_1_1535450451584_3764098_196-221014-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 08:18:46,033 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.conv_module2.balancer2.prob, batch_count=812173.3333333334, ans=0.125 2023-10-02 08:18:47,288 WARNING [train.py:1204] (2/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_202-293523-0_sp1.1 from training. Duration: 0.6545625 2023-10-02 08:18:48,731 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8275_210_4198_1_1528455320_3892571_213-127634-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 08:18:50,020 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_40896_210_15710_1_1533433956_7100802_921-231678-0_sp1.1 from training. Duration: 0.97275 2023-10-02 08:18:50,216 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.conv_skip_rate, batch_count=812173.3333333334, ans=0.0 2023-10-02 08:18:51,327 WARNING [train.py:1204] (2/4) Exclude cut with ID _38020_210_5986_1_1533112252259_7006089_493-102745-0_sp1.1 from training. Duration: 0.8909375 2023-10-02 08:18:51,555 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.bypass_mid.scale_min, batch_count=812240.0, ans=0.2 2023-10-02 08:18:52,680 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6846_210_6753_1_1527850767_4295158_2-315073-0 from training. Duration: 0.88 2023-10-02 08:18:53,943 WARNING [train.py:1204] (2/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_249-281349-0 from training. Duration: 0.85 2023-10-02 08:18:55,019 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.0.layers.0.conv_module1.whiten, num_groups=1, num_channels=192, metric=10.99 vs. limit=15.0 2023-10-02 08:18:57,047 WARNING [train.py:1204] (2/4) Exclude cut with ID _18513_210_9206_1_1531618215223_3862620_314-83429-0_sp1.1 from training. Duration: 0.97275 2023-10-02 08:18:58,458 WARNING [train.py:1204] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_215-202973-0_sp0.9 from training. Duration: 0.97775 2023-10-02 08:18:59,769 WARNING [train.py:1204] (2/4) Exclude cut with ID _7719_210_3525_1_1528456153163_4296960_88-250259-0_sp1.1 from training. Duration: 0.763625 2023-10-02 08:18:59,866 WARNING [train.py:1204] (2/4) Exclude cut with ID _7606_210_4915_1_1528894253277_5080690_301-10732-0_sp1.1 from training. Duration: 0.736375 2023-10-02 08:18:59,882 WARNING [train.py:1204] (2/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_818-11907-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 08:19:01,760 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_5959_210_1794_1_1527400400_3445726_412-44052-0_sp1.1 from training. Duration: 0.7 2023-10-02 08:19:04,912 WARNING [train.py:1204] (2/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_6-279195-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 08:19:07,514 WARNING [train.py:1204] (2/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_252-216674-0_sp0.9 from training. Duration: 0.6 2023-10-02 08:19:10,334 WARNING [train.py:1204] (2/4) Exclude cut with ID _14368_210_4220_1_1530532714870_3595529_144-3287-0_sp1.1 from training. Duration: 0.6090625 2023-10-02 08:19:11,702 WARNING [train.py:1204] (2/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_342-3879-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 08:19:11,725 WARNING [train.py:1204] (2/4) Exclude cut with ID _33640_210_2655_1_1533117605479_4855469_584-65630-0_sp1.1 from training. Duration: 0.97275 2023-10-02 08:19:13,130 WARNING [train.py:1204] (2/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_37-189262-0 from training. Duration: 0.98 2023-10-02 08:19:13,156 WARNING [train.py:1204] (2/4) Exclude cut with ID _9389_210_1664_1_1529217337301_5289269_379-128794-0_sp0.9 from training. Duration: 0.9444375 2023-10-02 08:19:14,197 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.0.layers.0.self_attn_weights.whiten_keys, num_groups=4, num_channels=128, metric=2.74 vs. limit=6.0 2023-10-02 08:19:15,833 WARNING [train.py:1204] (2/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_119-207162-0_sp1.1 from training. Duration: 0.6454375 2023-10-02 08:19:18,546 WARNING [train.py:1204] (2/4) Exclude cut with ID _2941_210_2577_1_1526199673150_2489759_200-333643-0_sp1.1 from training. Duration: 0.936375 2023-10-02 08:19:19,937 WARNING [train.py:1204] (2/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_479-125611-0_sp1.1 from training. Duration: 0.87275 2023-10-02 08:19:19,953 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_3475_210_2028_1_1526193578_3622569_64-297072-0_sp1.1 from training. Duration: 0.82725 2023-10-02 08:19:19,990 WARNING [train.py:1204] (2/4) Exclude cut with ID _53565_210_10635_1_1534337218335_3820259_139-346291-0_sp1.1 from training. Duration: 0.97275 2023-10-02 08:19:20,020 WARNING [train.py:1204] (2/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_588-125165-0_sp1.1 from training. Duration: 0.7545625 2023-10-02 08:19:21,335 WARNING [train.py:1204] (2/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_938-205135-0_sp1.1 from training. Duration: 0.7909375 2023-10-02 08:19:22,590 WARNING [train.py:1204] (2/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_216-48707-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 08:19:22,649 WARNING [train.py:1204] (2/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_477-255782-0_sp1.1 from training. Duration: 0.7454375 2023-10-02 08:19:22,682 WARNING [train.py:1204] (2/4) Exclude cut with ID _16909_210_5724_1_1531292079912_4118919_583-92842-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 08:19:24,061 WARNING [train.py:1204] (2/4) Exclude cut with ID _60860_210_8254_1_1534896015875_3557169_245-256666-0 from training. Duration: 0.97 2023-10-02 08:19:30,935 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_45399_210_4852_1_1533624725_7579737_349-116173-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 08:19:34,926 WARNING [train.py:1204] (2/4) Exclude cut with ID _8699_210_2069_1_1528630346831_3960409_155-39766-0 from training. Duration: 0.78 2023-10-02 08:19:35,513 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_86-16881-0_sp1.1 from training. Duration: 0.52725 2023-10-02 08:19:36,741 INFO [train.py:1046] (2/4) Epoch 23, batch 5000, loss[loss=0.142, simple_loss=0.2164, pruned_loss=0.03376, over 22148.00 frames. ], tot_loss[loss=0.1703, simple_loss=0.247, pruned_loss=0.04684, over 4704800.30 frames. ], batch size: 48, lr: 4.42e-03, grad_scale: 32.0 2023-10-02 08:19:41,111 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_14056_210_4163_1_1530604483_7572827_191-159384-0_sp1.1 from training. Duration: 0.97275 2023-10-02 08:19:41,123 WARNING [train.py:1204] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_307-158462-0_sp1.1 from training. Duration: 0.62725 2023-10-02 08:19:43,864 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7544_210_6753_1_1528591327_1471426_33-346031-0 from training. Duration: 0.61 2023-10-02 08:19:43,953 WARNING [train.py:1204] (2/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_112-149213-0 from training. Duration: 0.95 2023-10-02 08:19:46,548 WARNING [train.py:1204] (2/4) Exclude cut with ID _55844_210_2346_1_1534471212569_7625707_925-191342-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 08:19:48,020 WARNING [train.py:1204] (2/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_87-278686-0 from training. Duration: 0.77 2023-10-02 08:19:48,045 WARNING [train.py:1204] (2/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_415-78792-0_sp1.1 from training. Duration: 0.736375 2023-10-02 08:19:48,054 WARNING [train.py:1204] (2/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_107-332398-0_sp0.9 from training. Duration: 0.8666875 2023-10-02 08:19:49,402 WARNING [train.py:1204] (2/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_72-251255-0 from training. Duration: 0.94 2023-10-02 08:19:50,688 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_431-227273-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 08:19:52,076 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7544_210_6753_1_1528587000_691468_87-178599-0_sp0.9 from training. Duration: 0.9666875 2023-10-02 08:19:53,410 WARNING [train.py:1204] (2/4) Exclude cut with ID _13916_210_9994_1_1530788341126_3763149_60-191556-0 from training. Duration: 0.61 2023-10-02 08:19:53,412 WARNING [train.py:1204] (2/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_746-164782-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 08:19:53,445 WARNING [train.py:1204] (2/4) Exclude cut with ID _33640_210_2655_1_1533117605479_4855469_596-21066-0_sp1.1 from training. Duration: 0.92725 2023-10-02 08:19:54,850 WARNING [train.py:1204] (2/4) Exclude cut with ID _40402_210_18219_1_1533522324115_2953150_320-14078-0 from training. Duration: 0.95 2023-10-02 08:19:56,118 WARNING [train.py:1204] (2/4) Exclude cut with ID _29877_210_6228_1_1532397791106_5276149_266-62291-0 from training. Duration: 0.87 2023-10-02 08:19:57,978 WARNING [train.py:1204] (2/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_176-56120-0_sp0.9 from training. Duration: 0.92225 2023-10-02 08:19:58,033 WARNING [train.py:1204] (2/4) Exclude cut with ID _6275_210_2084_1_1528110062213_7669049_91-26919-0 from training. Duration: 0.97 2023-10-02 08:19:58,040 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_224-322394-0_sp0.9 from training. Duration: 0.7333125 2023-10-02 08:19:58,158 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.0.conv_skip_rate, batch_count=812506.6666666666, ans=0.0 2023-10-02 08:19:58,617 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.4.encoder.layers.0.self_attn2.whiten, num_groups=1, num_channels=384, metric=13.53 vs. limit=22.5 2023-10-02 08:19:59,363 WARNING [train.py:1204] (2/4) Exclude cut with ID _44982_210_1511_1_1534312820108_3726029_586-297059-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 08:19:59,403 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_196-289637-0_sp1.1 from training. Duration: 0.6818125 2023-10-02 08:19:59,404 WARNING [train.py:1204] (2/4) Exclude cut with ID _18513_210_9206_1_1531618215223_3862620_402-5957-0 from training. Duration: 0.99 2023-10-02 08:19:59,409 WARNING [train.py:1204] (2/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_81-300212-0 from training. Duration: 0.6 2023-10-02 08:20:02,130 WARNING [train.py:1204] (2/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_166-128941-0 from training. Duration: 0.54 2023-10-02 08:20:02,144 WARNING [train.py:1204] (2/4) Exclude cut with ID _58059_210_5854_1_1534901471149_3164808_57-309660-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 08:20:02,201 WARNING [train.py:1204] (2/4) Exclude cut with ID _60621_210_17071_1_1535013053530_3671739_73-155814-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 08:20:03,659 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=812573.3333333334, ans=0.1 2023-10-02 08:20:05,336 WARNING [train.py:1204] (2/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_726-270780-0 from training. Duration: 0.86 2023-10-02 08:20:05,348 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8853_210_3273_1_1528633899_818968_51-187685-0_sp1.1 from training. Duration: 0.62725 2023-10-02 08:20:05,461 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_315-212794-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 08:20:06,857 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_53373_210_5399_1_1534229471_4715695_314-122795-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 08:20:08,806 WARNING [train.py:1204] (2/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_187-221566-0_sp0.9 from training. Duration: 0.47775 2023-10-02 08:20:09,030 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.bypass.skip_rate, batch_count=812573.3333333334, ans=0.07 2023-10-02 08:20:10,137 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_12868_210_6753_1_1530702479_3610564_133-564-0 from training. Duration: 0.79 2023-10-02 08:20:11,445 WARNING [train.py:1204] (2/4) Exclude cut with ID _55153_210_2253_1_1534384786677_3162096_255-228344-0_sp1.1 from training. Duration: 0.936375 2023-10-02 08:20:12,768 WARNING [train.py:1204] (2/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_383-165234-0_sp1.1 from training. Duration: 0.8545625 2023-10-02 08:20:15,594 WARNING [train.py:1204] (2/4) Exclude cut with ID IC0677W0056-334889 from training. Duration: 0.768 2023-10-02 08:20:19,580 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.539e+02 1.897e+02 2.081e+02 2.436e+02 3.526e+02, threshold=4.162e+02, percent-clipped=0.0 2023-10-02 08:20:19,656 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_14953_210_2151_1_1530701710_5616165_710-165400-0_sp0.9 from training. Duration: 0.9666875 2023-10-02 08:20:19,745 WARNING [train.py:1204] (2/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_355-236205-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 08:20:19,747 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_34450_210_9547_1_1533099443_4109050_503-246893-0_sp1.1 from training. Duration: 0.963625 2023-10-02 08:20:22,471 WARNING [train.py:1204] (2/4) Exclude cut with ID _43552_210_8913_1_1533726031059_3523429_25-120161-0 from training. Duration: 0.98 2023-10-02 08:20:22,501 WARNING [train.py:1204] (2/4) Exclude cut with ID _66112_210_10635_1_1535590236305_3801484_205-311930-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 08:20:23,938 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_48049_210_5632_1_1533864553_3519937_185-281218-0_sp1.1 from training. Duration: 0.92725 2023-10-02 08:20:23,976 WARNING [train.py:1204] (2/4) Exclude cut with ID _48782_210_12658_1_1534467701544_2300422_191-332415-0_sp1.1 from training. Duration: 0.97275 2023-10-02 08:20:27,124 WARNING [train.py:1204] (2/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_907-151398-0_sp1.1 from training. Duration: 0.336375 2023-10-02 08:20:27,160 WARNING [train.py:1204] (2/4) Exclude cut with ID _67499_210_17282_1_1535509805220_3692430_20-179346-0_sp1.1 from training. Duration: 0.8818125 2023-10-02 08:20:29,827 WARNING [train.py:1204] (2/4) Exclude cut with ID _14998_210_5219_1_1530705582080_7474970_441-91466-0_sp1.1 from training. Duration: 0.8818125 2023-10-02 08:20:29,901 WARNING [train.py:1204] (2/4) Exclude cut with ID _46681_210_9642_1_1533893005571_7603359_786-164007-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 08:20:37,367 WARNING [train.py:1204] (2/4) Exclude cut with ID _14970_210_15709_1_1530773543493_1715730_187-20259-0 from training. Duration: 0.89 2023-10-02 08:20:42,023 WARNING [train.py:1204] (2/4) Exclude cut with ID _12734_210_8341_1_1530183614296_3891929_383-339075-0_sp1.1 from training. Duration: 0.963625 2023-10-02 08:20:47,737 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.attention_skip_rate, batch_count=812773.3333333334, ans=0.0 2023-10-02 08:20:48,594 INFO [train.py:1046] (2/4) Epoch 23, batch 5050, loss[loss=0.21, simple_loss=0.2631, pruned_loss=0.07849, over 19353.00 frames. ], tot_loss[loss=0.1706, simple_loss=0.2475, pruned_loss=0.0468, over 4717251.05 frames. ], batch size: 388, lr: 4.42e-03, grad_scale: 32.0 2023-10-02 08:20:48,720 WARNING [train.py:1204] (2/4) Exclude cut with ID _37086_210_9038_1_1532999540891_7021250_834-322225-0_sp1.1 from training. Duration: 0.92725 2023-10-02 08:20:50,127 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_10987_210_4163_1_1529760555_4123902_56-290559-0_sp1.1 from training. Duration: 0.963625 2023-10-02 08:20:50,140 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_92-246270-0_sp0.9 from training. Duration: 0.9444375 2023-10-02 08:20:50,162 WARNING [train.py:1204] (2/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_56-13561-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 08:20:51,372 WARNING [train.py:1204] (2/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_393-57888-0_sp1.1 from training. Duration: 0.5818125 2023-10-02 08:20:51,391 WARNING [train.py:1204] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_168-32906-0_sp1.1 from training. Duration: 0.62725 2023-10-02 08:20:51,448 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_51225_210_3601_1_1534400634_2327383_127-207553-0_sp1.1 from training. Duration: 0.963625 2023-10-02 08:20:55,492 WARNING [train.py:1204] (2/4) Exclude cut with ID _58583_210_5236_1_1534726835266_3574249_494-203521-0_sp1.1 from training. Duration: 0.963625 2023-10-02 08:20:55,519 WARNING [train.py:1204] (2/4) Exclude cut with ID _49376_210_5986_1_1533985278732_3370740_354-344695-0 from training. Duration: 0.99 2023-10-02 08:20:57,241 WARNING [train.py:1204] (2/4) Exclude cut with ID _8021_210_7994_1_1528376451358_3653069_179-45206-0_sp1.1 from training. Duration: 0.7818125 2023-10-02 08:20:58,705 WARNING [train.py:1204] (2/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_742-154017-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 08:20:59,980 WARNING [train.py:1204] (2/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_222-198127-0_sp1.1 from training. Duration: 0.763625 2023-10-02 08:21:00,017 WARNING [train.py:1204] (2/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_235-307337-0 from training. Duration: 0.94 2023-10-02 08:21:01,324 WARNING [train.py:1204] (2/4) Exclude cut with ID _8393_210_6702_1_1528505860249_3457328_453-176500-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 08:21:01,358 WARNING [train.py:1204] (2/4) Exclude cut with ID _53565_210_10635_1_1534337218335_3820259_102-179528-0_sp1.1 from training. Duration: 0.97275 2023-10-02 08:21:02,752 WARNING [train.py:1204] (2/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_43-206757-0_sp1.1 from training. Duration: 0.6818125 2023-10-02 08:21:04,109 WARNING [train.py:1204] (2/4) Exclude cut with ID _8394_210_6702_1_1528590504316_3540189_335-274780-0_sp1.1 from training. Duration: 0.7090625 2023-10-02 08:21:04,153 WARNING [train.py:1204] (2/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_128-296581-0_sp1.1 from training. Duration: 0.67275 2023-10-02 08:21:08,258 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.feed_forward3.hidden_balancer.prob, batch_count=812840.0, ans=0.125 2023-10-02 08:21:14,628 WARNING [train.py:1204] (2/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_111-112362-0 from training. Duration: 0.91 2023-10-02 08:21:14,680 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_5522_210_3273_1_1527249606_3909096_366-172508-0_sp1.1 from training. Duration: 0.6 2023-10-02 08:21:15,977 WARNING [train.py:1204] (2/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_47-167373-0_sp1.1 from training. Duration: 0.836375 2023-10-02 08:21:16,017 WARNING [train.py:1204] (2/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_41-234353-0 from training. Duration: 0.68 2023-10-02 08:21:17,298 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6291_210_3273_1_1527683649_1078498_82-221196-0_sp0.9 from training. Duration: 0.8444375 2023-10-02 08:21:18,798 WARNING [train.py:1204] (2/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_47-4814-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 08:21:18,826 WARNING [train.py:1204] (2/4) Exclude cut with ID _43541_210_10626_1_1533796233418_3635440_40-247993-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 08:21:18,876 WARNING [train.py:1204] (2/4) Exclude cut with ID _21989_210_5854_1_1531899214931_3271360_133-44759-0_sp0.9 from training. Duration: 0.8555625 2023-10-02 08:21:18,879 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_94-195801-0 from training. Duration: 0.94 2023-10-02 08:21:20,268 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_141-89113-0 from training. Duration: 0.86 2023-10-02 08:21:21,719 WARNING [train.py:1204] (2/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_683-284578-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 08:21:23,067 WARNING [train.py:1204] (2/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_6-214815-0_sp1.1 from training. Duration: 0.77275 2023-10-02 08:21:27,934 WARNING [train.py:1204] (2/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_556-212082-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 08:21:29,095 WARNING [train.py:1204] (2/4) Exclude cut with ID _44902_210_16187_1_1533888006062_3792078_149-67212-0 from training. Duration: 0.87 2023-10-02 08:21:30,541 WARNING [train.py:1204] (2/4) Exclude cut with ID _61148_210_6897_1_1535094022726_4217610_333-159440-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 08:21:33,339 WARNING [train.py:1204] (2/4) Exclude cut with ID _3120_210_3525_1_1526036143112_4072980_404-249994-0 from training. Duration: 0.99 2023-10-02 08:21:34,619 WARNING [train.py:1204] (2/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_234-260682-0_sp0.9 from training. Duration: 0.9333125 2023-10-02 08:21:34,643 WARNING [train.py:1204] (2/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_374-138971-0_sp1.1 from training. Duration: 0.8545625 2023-10-02 08:21:34,701 WARNING [train.py:1204] (2/4) Exclude cut with ID _53713_210_24116_1_1534314487762_7416751_743-302085-0_sp1.1 from training. Duration: 0.92725 2023-10-02 08:21:36,117 WARNING [train.py:1204] (2/4) Exclude cut with ID _18516_210_9206_1_1531704653206_3692406_198-334870-0_sp1.1 from training. Duration: 0.836375 2023-10-02 08:21:37,870 WARNING [train.py:1204] (2/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_395-73570-0_sp1.1 from training. Duration: 0.9 2023-10-02 08:21:40,502 WARNING [train.py:1204] (2/4) Exclude cut with ID _46683_210_9642_1_1533730913492_4591116_421-19547-0_sp1.1 from training. Duration: 0.936375 2023-10-02 08:21:41,769 WARNING [train.py:1204] (2/4) Exclude cut with ID _19896_210_1883_1_1531709022662_6649569_714-8668-0_sp1.1 from training. Duration: 0.963625 2023-10-02 08:21:41,797 WARNING [train.py:1204] (2/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_49-101072-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 08:21:41,803 WARNING [train.py:1204] (2/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_220-191080-0_sp1.1 from training. Duration: 0.8909375 2023-10-02 08:21:41,837 WARNING [train.py:1204] (2/4) Exclude cut with ID _3465_210_3525_1_1526175667060_3574049_62-335552-0 from training. Duration: 0.87 2023-10-02 08:21:43,310 WARNING [train.py:1204] (2/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_111-112362-0_sp1.1 from training. Duration: 0.82725 2023-10-02 08:21:44,546 WARNING [train.py:1204] (2/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_318-59097-0_sp0.9 from training. Duration: 0.8444375 2023-10-02 08:21:46,523 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.4.encoder.layers.2.self_attn1.whiten, num_groups=1, num_channels=384, metric=14.73 vs. limit=22.5 2023-10-02 08:21:47,375 WARNING [train.py:1204] (2/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_98-255710-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 08:21:47,381 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9042W0003-515609 from training. Duration: 0.896 2023-10-02 08:21:47,389 WARNING [train.py:1204] (2/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_158-250116-0_sp0.9 from training. Duration: 0.688875 2023-10-02 08:21:48,711 WARNING [train.py:1204] (2/4) Exclude cut with ID _26750_210_5665_1_1532226678442_4063230_7-346885-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 08:21:49,933 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_37839_210_7234_1_1533542462_3689303_357-227078-0_sp1.1 from training. Duration: 0.963625 2023-10-02 08:21:49,965 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9052W0119-520904 from training. Duration: 0.896 2023-10-02 08:21:52,871 WARNING [train.py:1204] (2/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_249-281349-0_sp1.1 from training. Duration: 0.77275 2023-10-02 08:21:52,876 WARNING [train.py:1204] (2/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_147-293311-0 from training. Duration: 0.94 2023-10-02 08:21:52,876 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_158-21166-0_sp1.1 from training. Duration: 0.963625 2023-10-02 08:21:56,913 WARNING [train.py:1204] (2/4) Exclude cut with ID _14100_210_8341_1_1530529320613_3678069_207-332194-0_sp1.1 from training. Duration: 0.92725 2023-10-02 08:21:56,965 WARNING [train.py:1204] (2/4) Exclude cut with ID _9389_210_1664_1_1529217337301_5289269_324-211031-0_sp1.1 from training. Duration: 0.963625 2023-10-02 08:21:58,694 WARNING [train.py:1204] (2/4) Exclude cut with ID _14603_210_4220_1_1530615688441_3097319_133-158308-0 from training. Duration: 0.84 2023-10-02 08:21:58,800 WARNING [train.py:1204] (2/4) Exclude cut with ID _13252_210_1925_1_1530671372047_3336909_166-314607-0 from training. Duration: 0.54 2023-10-02 08:22:01,339 INFO [train.py:1046] (2/4) Epoch 23, batch 5100, loss[loss=0.1742, simple_loss=0.2452, pruned_loss=0.05161, over 23730.00 frames. ], tot_loss[loss=0.1708, simple_loss=0.2479, pruned_loss=0.04685, over 4702107.30 frames. ], batch size: 232, lr: 4.42e-03, grad_scale: 16.0 2023-10-02 08:22:01,342 WARNING [train.py:1204] (2/4) Exclude cut with ID _32704_210_8954_1_1533017488840_6747370_649-79538-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 08:22:01,351 WARNING [train.py:1204] (2/4) Exclude cut with ID _28394_210_8913_1_1532430049518_3650118_343-115667-0_sp1.1 from training. Duration: 0.97275 2023-10-02 08:22:01,395 WARNING [train.py:1204] (2/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_87-278686-0_sp0.9 from training. Duration: 0.8555625 2023-10-02 08:22:05,519 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9109W0003-549749 from training. Duration: 0.768 2023-10-02 08:22:07,223 WARNING [train.py:1204] (2/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_126-29104-0_sp1.1 from training. Duration: 0.77275 2023-10-02 08:22:10,409 WARNING [train.py:1204] (2/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_325-113798-0 from training. Duration: 0.85 2023-10-02 08:22:11,076 WARNING [train.py:1204] (2/4) Exclude cut with ID _6694_210_4599_1_1527852637490_3965529_116-109398-0 from training. Duration: 0.73 2023-10-02 08:22:12,396 WARNING [train.py:1204] (2/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_46-57119-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 08:22:13,691 WARNING [train.py:1204] (2/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_546-119312-0_sp1.1 from training. Duration: 0.9 2023-10-02 08:22:16,439 WARNING [train.py:1204] (2/4) Exclude cut with ID _60057_210_10640_1_1534903065793_3333150_468-306939-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 08:22:16,481 WARNING [train.py:1204] (2/4) Exclude cut with ID _15150_210_8341_1_1530752411803_7256384_22-138274-0 from training. Duration: 0.96 2023-10-02 08:22:16,503 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_289-180904-0 from training. Duration: 0.76 2023-10-02 08:22:18,682 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.2.encoder.layers.2.feed_forward3.out_whiten, num_groups=1, num_channels=384, metric=9.19 vs. limit=15.0 2023-10-02 08:22:20,684 WARNING [train.py:1204] (2/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_64-133721-0_sp1.1 from training. Duration: 0.92725 2023-10-02 08:22:20,722 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_195-257512-0_sp1.1 from training. Duration: 0.6090625 2023-10-02 08:22:24,606 WARNING [train.py:1204] (2/4) Exclude cut with ID _66873_210_5517_1_1535454136506_4293370_704-101963-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 08:22:27,883 WARNING [train.py:1204] (2/4) Exclude cut with ID _4366_210_1388_1_1526792053633_3613819_231-211402-0 from training. Duration: 0.94 2023-10-02 08:22:27,928 WARNING [train.py:1204] (2/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_679-190385-0_sp1.1 from training. Duration: 0.97275 2023-10-02 08:22:29,696 INFO [scaling.py:1118] (2/4) WithLoss: name=encoder.encoders.2.encoder.layers.1.self_attn_weights, loss-sum=0.000e+00 2023-10-02 08:22:30,625 WARNING [train.py:1204] (2/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_298-81459-0_sp1.1 from training. Duration: 0.963625 2023-10-02 08:22:30,641 WARNING [train.py:1204] (2/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_658-282610-0_sp0.9 from training. Duration: 0.588875 2023-10-02 08:22:32,156 WARNING [train.py:1204] (2/4) Exclude cut with ID _57572_210_5517_1_1534921219624_3809484_196-283734-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 08:22:33,450 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_53071_210_16554_1_1534490405_2954066_294-198554-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 08:22:33,455 WARNING [train.py:1204] (2/4) Exclude cut with ID _4866_210_2577_1_1527404412284_4364659_352-157930-0 from training. Duration: 0.81 2023-10-02 08:22:36,117 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9231W0113-610139 from training. Duration: 0.896 2023-10-02 08:22:36,173 WARNING [train.py:1204] (2/4) Exclude cut with ID _24271_210_12658_1_1531987270022_7190519_638-171906-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 08:22:36,206 WARNING [train.py:1204] (2/4) Exclude cut with ID _14170_210_5219_1_1530525949868_4469650_272-249985-0 from training. Duration: 0.67 2023-10-02 08:22:37,986 WARNING [train.py:1204] (2/4) Exclude cut with ID _64086_210_7994_1_1535270417019_7435949_449-31567-0 from training. Duration: 0.97 2023-10-02 08:22:38,335 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.nonlin_attention.balancer.prob, batch_count=813240.0, ans=0.125 2023-10-02 08:22:42,470 WARNING [train.py:1204] (2/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_109-282568-0_sp1.1 from training. Duration: 0.97275 2023-10-02 08:22:47,180 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.431e+02 1.831e+02 1.978e+02 2.257e+02 3.540e+02, threshold=3.956e+02, percent-clipped=0.0 2023-10-02 08:22:48,676 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_121-45934-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 08:22:51,645 WARNING [train.py:1204] (2/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_314-126819-0 from training. Duration: 0.5 2023-10-02 08:22:52,917 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9309W0192-640319 from training. Duration: 0.512 2023-10-02 08:22:52,931 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9310W0003-640649 from training. Duration: 0.896 2023-10-02 08:22:54,422 WARNING [train.py:1204] (2/4) Exclude cut with ID _7160_210_1876_1_1527940923231_3703799_120-150943-0 from training. Duration: 0.81 2023-10-02 08:22:54,423 WARNING [train.py:1204] (2/4) Exclude cut with ID _40380_210_9059_1_1533380594759_4187380_6-33508-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 08:22:55,817 WARNING [train.py:1204] (2/4) Exclude cut with ID _59202_210_26984_1_1534816810612_3549174_137-318710-0 from training. Duration: 0.89 2023-10-02 08:22:59,119 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_435-313305-0 from training. Duration: 0.71 2023-10-02 08:23:00,546 WARNING [train.py:1204] (2/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_91-212486-0_sp1.1 from training. Duration: 0.6181875 2023-10-02 08:23:03,223 WARNING [train.py:1204] (2/4) Exclude cut with ID _14603_210_4220_1_1530615688441_3097319_133-158308-0_sp1.1 from training. Duration: 0.763625 2023-10-02 08:23:04,639 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_99-251314-0 from training. Duration: 0.87 2023-10-02 08:23:07,335 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_365-217119-0_sp1.1 from training. Duration: 0.57275 2023-10-02 08:23:07,369 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_3475_210_2028_1_1526193578_3622569_377-166133-0 from training. Duration: 0.86 2023-10-02 08:23:13,177 WARNING [train.py:1204] (2/4) Exclude cut with ID _13916_210_9994_1_1530788341126_3763149_116-21034-0_sp1.1 from training. Duration: 0.87275 2023-10-02 08:23:14,575 WARNING [train.py:1204] (2/4) Exclude cut with ID _64220_210_9527_1_1535250151772_4875259_142-216831-0_sp1.1 from training. Duration: 0.936375 2023-10-02 08:23:14,575 WARNING [train.py:1204] (2/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_490-207861-0_sp1.1 from training. Duration: 0.8909375 2023-10-02 08:23:15,856 INFO [train.py:1046] (2/4) Epoch 23, batch 5150, loss[loss=0.1683, simple_loss=0.2526, pruned_loss=0.042, over 24488.00 frames. ], tot_loss[loss=0.171, simple_loss=0.2483, pruned_loss=0.04688, over 4699037.16 frames. ], batch size: 66, lr: 4.42e-03, grad_scale: 16.0 2023-10-02 08:23:15,899 WARNING [train.py:1204] (2/4) Exclude cut with ID _41314_210_12651_1_1533459476543_3766460_87-272068-0_sp0.9 from training. Duration: 0.92225 2023-10-02 08:23:15,933 WARNING [train.py:1204] (2/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_18-46756-0_sp1.1 from training. Duration: 0.7181875 2023-10-02 08:23:17,246 WARNING [train.py:1204] (2/4) Exclude cut with ID _39411_210_5986_1_1533538825030_3343649_98-311335-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 08:23:17,319 WARNING [train.py:1204] (2/4) Exclude cut with ID _21878_210_3525_1_1531738772002_7264330_113-18163-0 from training. Duration: 0.89 2023-10-02 08:23:17,321 WARNING [train.py:1204] (2/4) Exclude cut with ID _6653_210_2629_1_1527919172441_6732679_1103-56445-0 from training. Duration: 0.73 2023-10-02 08:23:17,357 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_3474_210_2028_1_1526188513_4825246_145-309131-0 from training. Duration: 0.74 2023-10-02 08:23:18,617 WARNING [train.py:1204] (2/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_120-211630-0_sp1.1 from training. Duration: 0.836375 2023-10-02 08:23:18,627 WARNING [train.py:1204] (2/4) Exclude cut with ID _8030_210_7497_1_1528444886781_3531089_32-31837-0 from training. Duration: 0.97 2023-10-02 08:23:20,092 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_19949_210_2161_1_1531706337_928079_8-160956-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 08:23:20,135 WARNING [train.py:1204] (2/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_321-215742-0_sp1.1 from training. Duration: 0.4545625 2023-10-02 08:23:21,559 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_583-124650-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 08:23:23,037 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.ff2_skip_rate, batch_count=813440.0, ans=0.0 2023-10-02 08:23:24,153 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_30434_210_13049_1_1532519384_981746_164-182755-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 08:23:27,580 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=813440.0, ans=0.1 2023-10-02 08:23:28,785 WARNING [train.py:1204] (2/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_339-104791-0_sp1.1 from training. Duration: 0.4454375 2023-10-02 08:23:28,804 WARNING [train.py:1204] (2/4) Exclude cut with ID _15026_210_8750_1_1530777602316_3291850_211-195043-0 from training. Duration: 0.8 2023-10-02 08:23:31,440 WARNING [train.py:1204] (2/4) Exclude cut with ID _29877_210_6228_1_1532397791106_5276149_487-182647-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 08:23:31,489 WARNING [train.py:1204] (2/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_127-566-0_sp0.9 from training. Duration: 0.9333125 2023-10-02 08:23:31,611 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.1.nonlin_attention.balancer.prob, batch_count=813506.6666666666, ans=0.125 2023-10-02 08:23:34,228 WARNING [train.py:1204] (2/4) Exclude cut with ID _8263_210_1664_1_1528439452891_970220_68-181805-0_sp0.9 from training. Duration: 0.711125 2023-10-02 08:23:34,229 WARNING [train.py:1204] (2/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_504-72513-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 08:23:34,237 WARNING [train.py:1204] (2/4) Exclude cut with ID _64639_210_5816_1_1535367619437_3528000_324-265233-0_sp1.1 from training. Duration: 0.963625 2023-10-02 08:23:35,582 WARNING [train.py:1204] (2/4) Exclude cut with ID _6456_210_1534_1_1527681634212_3181049_215-309359-0_sp0.9 from training. Duration: 0.97775 2023-10-02 08:23:35,586 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_115-233929-0_sp1.1 from training. Duration: 0.6545625 2023-10-02 08:23:35,619 WARNING [train.py:1204] (2/4) Exclude cut with ID _14087_210_2253_1_1530529224512_3014029_219-224445-0 from training. Duration: 0.84 2023-10-02 08:23:37,012 WARNING [train.py:1204] (2/4) Exclude cut with ID _14867_210_1968_1_1530684179890_3846969_413-29292-0_sp1.1 from training. Duration: 0.8181875 2023-10-02 08:23:38,404 WARNING [train.py:1204] (2/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_1012-279473-0_sp0.9 from training. Duration: 0.7444375 2023-10-02 08:23:39,864 WARNING [train.py:1204] (2/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_605-133395-0_sp0.9 from training. Duration: 0.7333125 2023-10-02 08:23:41,222 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_89-35748-0 from training. Duration: 0.79 2023-10-02 08:23:43,012 WARNING [train.py:1204] (2/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_476-272757-0_sp1.1 from training. Duration: 0.8454375 2023-10-02 08:23:49,477 WARNING [train.py:1204] (2/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_449-15508-0_sp0.9 from training. Duration: 0.911125 2023-10-02 08:23:52,152 WARNING [train.py:1204] (2/4) Exclude cut with ID _29345_210_4381_1_1532689168263_3860169_198-174328-0 from training. Duration: 0.91 2023-10-02 08:23:55,015 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_15517_210_6753_1_1530952293_1664795_125-158157-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 08:23:59,342 WARNING [train.py:1204] (2/4) Exclude cut with ID _25565_210_12392_1_1532423009382_3952098_420-113180-0_sp1.1 from training. Duration: 0.963625 2023-10-02 08:24:01,127 WARNING [train.py:1204] (2/4) Exclude cut with ID _51471_210_1889_1_1534399391390_3094250_236-146773-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 08:24:02,711 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.conv_skip_rate, batch_count=813640.0, ans=0.0 2023-10-02 08:24:04,099 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.conv_module2.balancer1.min_positive, batch_count=813640.0, ans=0.025 2023-10-02 08:24:05,328 WARNING [train.py:1204] (2/4) Exclude cut with ID _30039_210_13270_1_1532484612900_2725150_175-35012-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 08:24:06,611 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_23850_210_4238_1_1532232597_1632433_99-275843-0_sp1.1 from training. Duration: 0.936375 2023-10-02 08:24:08,098 WARNING [train.py:1204] (2/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_658-282610-0 from training. Duration: 0.53 2023-10-02 08:24:08,239 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.0.feed_forward1.out_proj.dropout_p, batch_count=813640.0, ans=0.1 2023-10-02 08:24:12,157 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_37818_210_4238_1_1533620910_4720083_234-65934-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 08:24:13,475 WARNING [train.py:1204] (2/4) Exclude cut with ID _3067_210_2667_1_1526212974640_4503168_459-173867-0_sp1.1 from training. Duration: 0.8 2023-10-02 08:24:13,501 WARNING [train.py:1204] (2/4) Exclude cut with ID _8696_210_1876_1_1528545632440_3345058_209-54069-0_sp0.9 from training. Duration: 0.7444375 2023-10-02 08:24:16,907 WARNING [train.py:1204] (2/4) Exclude cut with ID _62574_210_2655_1_1535180407058_3933249_420-236465-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 08:24:18,371 WARNING [train.py:1204] (2/4) Exclude cut with ID _46681_210_9642_1_1533893005571_7603359_333-4042-0_sp1.1 from training. Duration: 0.936375 2023-10-02 08:24:19,768 WARNING [train.py:1204] (2/4) Exclude cut with ID _3541_210_2477_1_1526475408709_3263973_377-296839-0 from training. Duration: 0.55 2023-10-02 08:24:21,397 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.out_combiner.scale_min, batch_count=813706.6666666666, ans=0.2 2023-10-02 08:24:23,902 WARNING [train.py:1204] (2/4) Exclude cut with ID _43997_210_4525_1_1533538632555_5189329_426-92599-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 08:24:25,268 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_43-71989-0_sp1.1 from training. Duration: 0.5181875 2023-10-02 08:24:26,640 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_2009_210_2071_1_1525481134_4588937_140-345530-0_sp1.1 from training. Duration: 0.963625 2023-10-02 08:24:26,647 WARNING [train.py:1204] (2/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_138-332773-0_sp0.9 from training. Duration: 0.9 2023-10-02 08:24:26,715 WARNING [train.py:1204] (2/4) Exclude cut with ID _13942_210_9988_1_1530604906036_3733360_499-55676-0_sp0.9 from training. Duration: 0.688875 2023-10-02 08:24:28,066 INFO [train.py:1046] (2/4) Epoch 23, batch 5200, loss[loss=0.1647, simple_loss=0.2568, pruned_loss=0.0363, over 24644.00 frames. ], tot_loss[loss=0.1714, simple_loss=0.2487, pruned_loss=0.04707, over 4709796.54 frames. ], batch size: 68, lr: 4.42e-03, grad_scale: 32.0 2023-10-02 08:24:28,119 WARNING [train.py:1204] (2/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_259-34038-0_sp1.1 from training. Duration: 0.67275 2023-10-02 08:24:28,128 WARNING [train.py:1204] (2/4) Exclude cut with ID _3120_210_3525_1_1526036143112_4072980_404-249994-0_sp1.1 from training. Duration: 0.9 2023-10-02 08:24:28,167 WARNING [train.py:1204] (2/4) Exclude cut with ID _40402_210_18219_1_1533522324115_2953150_222-108810-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 08:24:31,379 WARNING [train.py:1204] (2/4) Exclude cut with ID _22083_210_8254_1_1531702862439_3678970_49-234546-0_sp0.9 from training. Duration: 0.92225 2023-10-02 08:24:32,675 WARNING [train.py:1204] (2/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_199-295611-0_sp0.9 from training. Duration: 0.611125 2023-10-02 08:24:34,175 WARNING [train.py:1204] (2/4) Exclude cut with ID _43552_210_8913_1_1533726031059_3523429_43-192945-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 08:24:38,293 WARNING [train.py:1204] (2/4) Exclude cut with ID _14224_210_4381_1_1530529062037_4264010_364-22817-0 from training. Duration: 0.71 2023-10-02 08:24:39,691 WARNING [train.py:1204] (2/4) Exclude cut with ID _14922_210_6228_1_1530707435273_3693310_388-129552-0_sp1.1 from training. Duration: 0.87275 2023-10-02 08:24:39,740 WARNING [train.py:1204] (2/4) Exclude cut with ID _6537_210_1883_1_1527987559710_3597129_190-280227-0_sp1.1 from training. Duration: 0.97275 2023-10-02 08:24:41,240 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.conv_module1.balancer1.min_positive, batch_count=813840.0, ans=0.025 2023-10-02 08:24:42,406 WARNING [train.py:1204] (2/4) Exclude cut with ID _55844_210_2346_1_1534471212569_7625707_398-56897-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 08:24:42,479 WARNING [train.py:1204] (2/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_48-182155-0_sp1.1 from training. Duration: 0.7909375 2023-10-02 08:24:44,293 WARNING [train.py:1204] (2/4) Exclude cut with ID _29529_210_7234_1_1532393961455_3977460_582-18084-0_sp1.1 from training. Duration: 0.97275 2023-10-02 08:24:44,405 WARNING [train.py:1204] (2/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_912-282412-0 from training. Duration: 0.49 2023-10-02 08:24:47,127 WARNING [train.py:1204] (2/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_332-256168-0_sp0.9 from training. Duration: 0.8333125 2023-10-02 08:24:47,181 WARNING [train.py:1204] (2/4) Exclude cut with ID _64220_210_9527_1_1535250151772_4875259_55-250624-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 08:24:49,943 WARNING [train.py:1204] (2/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_264-108685-0 from training. Duration: 0.55 2023-10-02 08:24:52,585 WARNING [train.py:1204] (2/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_613-232856-0_sp0.9 from training. Duration: 0.6 2023-10-02 08:24:52,665 WARNING [train.py:1204] (2/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_68-212801-0_sp1.1 from training. Duration: 0.663625 2023-10-02 08:24:52,700 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6290_210_3273_1_1527595224_1282787_115-87095-0 from training. Duration: 0.55 2023-10-02 08:24:53,975 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_3080_210_2071_1_1526086102_4365166_312-8726-0 from training. Duration: 0.91 2023-10-02 08:24:56,680 WARNING [train.py:1204] (2/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_105-63309-0 from training. Duration: 0.98 2023-10-02 08:24:56,745 WARNING [train.py:1204] (2/4) Exclude cut with ID _2941_210_2577_1_1526199673150_2489759_201-160779-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 08:24:56,748 WARNING [train.py:1204] (2/4) Exclude cut with ID ID0406W0226-885434 from training. Duration: 0.997 2023-10-02 08:24:56,761 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_17292_210_6753_1_1531125388_2943734_353-328201-0_sp1.1 from training. Duration: 0.97275 2023-10-02 08:24:58,271 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.feed_forward2.hidden_balancer.prob, batch_count=813906.6666666666, ans=0.125 2023-10-02 08:24:59,402 WARNING [train.py:1204] (2/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_572-96029-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 08:24:59,438 WARNING [train.py:1204] (2/4) Exclude cut with ID _14970_210_15709_1_1530775547361_1966965_6-23328-0_sp0.9 from training. Duration: 0.9555625 2023-10-02 08:24:59,473 WARNING [train.py:1204] (2/4) Exclude cut with ID _45012_210_12651_1_1533608435136_5371380_240-37101-0 from training. Duration: 0.93 2023-10-02 08:25:01,199 WARNING [train.py:1204] (2/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_685-90183-0_sp1.1 from training. Duration: 0.936375 2023-10-02 08:25:02,755 WARNING [train.py:1204] (2/4) Exclude cut with ID _13252_210_1925_1_1530671372047_3336909_41-22245-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 08:25:06,777 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_15126_210_6753_1_1530778677_2591185_120-277707-0 from training. Duration: 0.8 2023-10-02 08:25:08,033 WARNING [train.py:1204] (2/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_310-26186-0 from training. Duration: 0.7 2023-10-02 08:25:08,049 WARNING [train.py:1204] (2/4) Exclude cut with ID _14998_210_5219_1_1530705582080_7474970_787-294416-0 from training. Duration: 0.82 2023-10-02 08:25:11,440 WARNING [train.py:1204] (2/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_795-33252-0 from training. Duration: 0.37 2023-10-02 08:25:11,470 WARNING [train.py:1204] (2/4) Exclude cut with ID _7537_210_2084_1_1528502395431_3924359_113-199933-0_sp0.9 from training. Duration: 0.6555625 2023-10-02 08:25:13,377 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.562e+02 1.905e+02 1.999e+02 2.249e+02 3.561e+02, threshold=3.998e+02, percent-clipped=0.0 2023-10-02 08:25:19,429 WARNING [train.py:1204] (2/4) Exclude cut with ID _3465_210_3525_1_1526175667060_3574049_308-231735-0_sp0.9 from training. Duration: 0.811125 2023-10-02 08:25:19,452 WARNING [train.py:1204] (2/4) Exclude cut with ID _19504_210_9994_1_1531648863242_3325220_354-296765-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 08:25:20,821 WARNING [train.py:1204] (2/4) Exclude cut with ID _5409_210_1388_1_1527292850189_5865191_178-127970-0 from training. Duration: 0.57 2023-10-02 08:25:22,132 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_1466-248056-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 08:25:22,149 WARNING [train.py:1204] (2/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_535-14410-0_sp0.9 from training. Duration: 0.52225 2023-10-02 08:25:22,152 WARNING [train.py:1204] (2/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_747-129941-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 08:25:22,195 WARNING [train.py:1204] (2/4) Exclude cut with ID _15012_210_1968_1_1530856820429_5095350_688-178849-0_sp1.1 from training. Duration: 0.5818125 2023-10-02 08:25:26,313 WARNING [train.py:1204] (2/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_356-45054-0_sp1.1 from training. Duration: 0.7818125 2023-10-02 08:25:26,397 WARNING [train.py:1204] (2/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_146-276944-0_sp0.9 from training. Duration: 0.92225 2023-10-02 08:25:30,477 WARNING [train.py:1204] (2/4) Exclude cut with ID _39204_210_4220_1_1533880547368_3191120_346-180306-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 08:25:32,225 WARNING [train.py:1204] (2/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_641-158017-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 08:25:32,225 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_19952_210_2161_1_1531875469_3851750_274-42137-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 08:25:36,364 WARNING [train.py:1204] (2/4) Exclude cut with ID _60804_210_9642_1_1534831199859_7268299_87-327819-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 08:25:37,678 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_9302_210_2550_1_1528889962_4655416_638-301620-0 from training. Duration: 0.47 2023-10-02 08:25:37,736 WARNING [train.py:1204] (2/4) Exclude cut with ID _44304_210_15374_1_1533639352765_4613129_302-22084-0_sp1.1 from training. Duration: 0.7818125 2023-10-02 08:25:37,745 WARNING [train.py:1204] (2/4) Exclude cut with ID _18513_210_9206_1_1531618215223_3862620_177-227063-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 08:25:39,159 WARNING [train.py:1204] (2/4) Exclude cut with ID _41326_210_5854_1_1533778076509_3492080_510-45444-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 08:25:39,214 WARNING [train.py:1204] (2/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_76-311963-0_sp1.1 from training. Duration: 0.636375 2023-10-02 08:25:40,414 INFO [train.py:1046] (2/4) Epoch 23, batch 5250, loss[loss=0.1756, simple_loss=0.2531, pruned_loss=0.04904, over 23298.00 frames. ], tot_loss[loss=0.1716, simple_loss=0.2483, pruned_loss=0.04743, over 4715478.16 frames. ], batch size: 119, lr: 4.42e-03, grad_scale: 32.0 2023-10-02 08:25:40,546 WARNING [train.py:1204] (2/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_156-302675-0_sp0.9 from training. Duration: 0.611125 2023-10-02 08:25:43,802 WARNING [train.py:1204] (2/4) Exclude cut with ID _53489_210_6228_1_1534298357169_3835970_367-148411-0_sp1.1 from training. Duration: 0.936375 2023-10-02 08:25:45,868 WARNING [train.py:1204] (2/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_389-144525-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 08:25:47,213 WARNING [train.py:1204] (2/4) Exclude cut with ID _14069_210_5632_1_1530512692033_4803390_158-238427-0_sp1.1 from training. Duration: 0.963625 2023-10-02 08:25:48,542 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_486-151131-0_sp0.9 from training. Duration: 0.9444375 2023-10-02 08:25:52,661 WARNING [train.py:1204] (2/4) Exclude cut with ID _39846_210_12651_1_1533603651992_1318030_188-71831-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 08:25:54,065 WARNING [train.py:1204] (2/4) Exclude cut with ID _39411_210_5986_1_1533538825030_3343649_173-238248-0_sp1.1 from training. Duration: 0.7545625 2023-10-02 08:25:55,514 WARNING [train.py:1204] (2/4) Exclude cut with ID _20684_210_6917_1_1531567751646_4203930_469-49081-0_sp1.1 from training. Duration: 0.92725 2023-10-02 08:25:56,839 WARNING [train.py:1204] (2/4) Exclude cut with ID _8833_210_5219_1_1528592665210_7553059_611-80313-0_sp1.1 from training. Duration: 0.5818125 2023-10-02 08:26:00,132 WARNING [train.py:1204] (2/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_440-141811-0 from training. Duration: 0.95 2023-10-02 08:26:00,154 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_41211_210_2544_1_1533348859_3702910_236-223652-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 08:26:00,231 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_40222_210_6228_1_1533385298_4481939_220-163998-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 08:26:01,631 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.1.attention_skip_rate, batch_count=814173.3333333334, ans=0.0 2023-10-02 08:26:06,688 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.0.conv_module1.balancer2.prob, batch_count=814173.3333333334, ans=0.125 2023-10-02 08:26:08,456 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.4.encoder.layers.2.nonlin_attention.whiten1, num_groups=1, num_channels=288, metric=4.35 vs. limit=10.0 2023-10-02 08:26:10,765 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.conv_module2.balancer2.prob, batch_count=814240.0, ans=0.125 2023-10-02 08:26:22,226 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.feed_forward3.hidden_balancer.prob, batch_count=814306.6666666666, ans=0.125 2023-10-02 08:26:34,913 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.1.encoder.layers.1.conv_module2.whiten, num_groups=1, num_channels=256, metric=10.01 vs. limit=15.0 2023-10-02 08:26:48,278 INFO [train.py:1046] (2/4) Epoch 23, batch 5300, loss[loss=0.192, simple_loss=0.26, pruned_loss=0.06201, over 23695.00 frames. ], tot_loss[loss=0.1707, simple_loss=0.2468, pruned_loss=0.04728, over 4688648.04 frames. ], batch size: 179, lr: 4.42e-03, grad_scale: 16.0 2023-10-02 08:26:53,804 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.feed_forward3.hidden_balancer.prob, batch_count=814440.0, ans=0.125 2023-10-02 08:26:55,134 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.balancer1.prob, batch_count=814440.0, ans=0.125 2023-10-02 08:27:02,690 WARNING [train.py:1204] (2/4) Exclude cut with ID _14629_210_15709_1_1530604298182_956899_6-232695-0_sp1.1 from training. Duration: 0.8545625 2023-10-02 08:27:02,710 WARNING [train.py:1204] (2/4) Exclude cut with ID _13921_210_9994_1_1530841782272_3503520_327-69677-0 from training. Duration: 0.9 2023-10-02 08:27:02,734 WARNING [train.py:1204] (2/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_372-298093-0 from training. Duration: 0.8 2023-10-02 08:27:02,746 WARNING [train.py:1204] (2/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_515-139111-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 08:27:02,965 WARNING [train.py:1204] (2/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_85-65056-0_sp1.1 from training. Duration: 0.87275 2023-10-02 08:27:03,004 WARNING [train.py:1204] (2/4) Exclude cut with ID _15113_210_8254_1_1530842438579_3716279_209-54533-0_sp1.1 from training. Duration: 0.87275 2023-10-02 08:27:03,047 WARNING [train.py:1204] (2/4) Exclude cut with ID _70447_210_12392_1_1535972499300_3160420_123-312838-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 08:27:03,073 WARNING [train.py:1204] (2/4) Exclude cut with ID _60113_210_16271_1_1535164212481_4386079_70-63596-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 08:27:03,101 WARNING [train.py:1204] (2/4) Exclude cut with ID _14632_210_9988_1_1530691279392_3923200_225-188509-0_sp1.1 from training. Duration: 0.963625 2023-10-02 08:27:03,146 WARNING [train.py:1204] (2/4) Exclude cut with ID _34734_210_4525_1_1533365977542_4448720_584-180532-0_sp1.1 from training. Duration: 0.97275 2023-10-02 08:27:03,159 WARNING [train.py:1204] (2/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_351-294650-0_sp1.1 from training. Duration: 0.57275 2023-10-02 08:27:03,780 WARNING [train.py:1204] (2/4) Exclude cut with ID _13252_210_1925_1_1530671372047_3336909_263-168388-0_sp0.9 from training. Duration: 0.9555625 2023-10-02 08:27:03,805 WARNING [train.py:1204] (2/4) Exclude cut with ID _22083_210_8254_1_1531702862439_3678970_201-85738-0 from training. Duration: 0.9 2023-10-02 08:27:03,889 WARNING [train.py:1204] (2/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_237-58433-0 from training. Duration: 0.74 2023-10-02 08:27:03,918 WARNING [train.py:1204] (2/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_262-250826-0 from training. Duration: 0.99 2023-10-02 08:27:04,014 WARNING [train.py:1204] (2/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_927-43827-0_sp0.9 from training. Duration: 0.82225 2023-10-02 08:27:04,042 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_2821_210_2715_1_1526108385_3763560_73-296673-0 from training. Duration: 0.83 2023-10-02 08:27:04,111 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6287_210_3273_1_1527509506_2653716_182-199887-0 from training. Duration: 0.51 2023-10-02 08:27:04,231 WARNING [train.py:1204] (2/4) Exclude cut with ID _70519_210_4163_1_1535961595994_7458230_899-80223-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 08:27:04,469 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_210-234949-0_sp1.1 from training. Duration: 0.97275 2023-10-02 08:27:04,504 WARNING [train.py:1204] (2/4) Exclude cut with ID _30196_210_1664_1_1533708496522_7074140_768-278176-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 08:27:04,578 WARNING [train.py:1204] (2/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_315-128283-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 08:27:04,655 WARNING [train.py:1204] (2/4) Exclude cut with ID _14886_210_4220_1_1530702020355_3516190_330-323941-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 08:27:04,896 WARNING [train.py:1204] (2/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_264-348896-0_sp1.1 from training. Duration: 0.77275 2023-10-02 08:27:04,916 WARNING [train.py:1204] (2/4) Exclude cut with ID _37086_210_9038_1_1532999540891_7021250_433-10563-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 08:27:04,949 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_53638_210_21581_1_1534297319_5808449_885-80172-0_sp1.1 from training. Duration: 0.87275 2023-10-02 08:27:05,035 WARNING [train.py:1204] (2/4) Exclude cut with ID _14623_210_1925_1_1530773354962_3632609_157-218771-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 08:27:05,046 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_1368-78165-0_sp1.1 from training. Duration: 0.97275 2023-10-02 08:27:05,050 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_309-65177-0_sp0.9 from training. Duration: 0.97775 2023-10-02 08:27:05,055 WARNING [train.py:1204] (2/4) Exclude cut with ID _21632_210_2253_1_1531994001765_3744140_291-138812-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 08:27:05,066 WARNING [train.py:1204] (2/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_669-93163-0_sp1.1 from training. Duration: 0.8909375 2023-10-02 08:27:05,947 WARNING [train.py:1204] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_292-215346-0 from training. Duration: 0.97 2023-10-02 08:27:05,970 WARNING [train.py:1204] (2/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_286-222073-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 08:27:06,238 WARNING [train.py:1204] (2/4) Exclude cut with ID _59256_210_5986_1_1535076085997_3589120_121-237386-0_sp1.1 from training. Duration: 0.87275 2023-10-02 08:27:06,251 WARNING [train.py:1204] (2/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_96-320736-0 from training. Duration: 0.86 2023-10-02 08:27:06,259 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_128-202104-0 from training. Duration: 0.63 2023-10-02 08:27:06,338 WARNING [train.py:1204] (2/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_242-3472-0_sp0.9 from training. Duration: 0.811125 2023-10-02 08:27:06,379 WARNING [train.py:1204] (2/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_154-74182-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 08:27:06,382 WARNING [train.py:1204] (2/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_187-221566-0 from training. Duration: 0.43 2023-10-02 08:27:06,532 WARNING [train.py:1204] (2/4) Exclude cut with ID _6194_210_1534_1_1527511826160_3941194_14-282876-0 from training. Duration: 0.71 2023-10-02 08:27:06,570 WARNING [train.py:1204] (2/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_660-211088-0_sp1.1 from training. Duration: 0.563625 2023-10-02 08:27:06,888 WARNING [train.py:1204] (2/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_526-62079-0_sp1.1 from training. Duration: 0.6090625 2023-10-02 08:27:07,028 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_92-246270-0_sp1.1 from training. Duration: 0.77275 2023-10-02 08:27:07,110 WARNING [train.py:1204] (2/4) Exclude cut with ID IC0287W0023-142350 from training. Duration: 0.956 2023-10-02 08:27:07,181 WARNING [train.py:1204] (2/4) Exclude cut with ID _58605_210_12210_1_1534723195939_3904259_48-269402-0 from training. Duration: 0.96 2023-10-02 08:27:07,194 WARNING [train.py:1204] (2/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_416-257730-0_sp0.9 from training. Duration: 0.72225 2023-10-02 08:27:07,263 WARNING [train.py:1204] (2/4) Exclude cut with ID _7877_210_6014_1_1528286358099_3750136_185-17516-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 08:27:07,359 WARNING [train.py:1204] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_23-189234-0 from training. Duration: 0.83 2023-10-02 08:27:07,375 WARNING [train.py:1204] (2/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_237-89684-0 from training. Duration: 0.96 2023-10-02 08:27:07,440 WARNING [train.py:1204] (2/4) Exclude cut with ID _13916_210_9994_1_1530788341126_3763149_116-21034-0 from training. Duration: 0.96 2023-10-02 08:27:07,589 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_3133_210_2087_1_1526106446_2324690_246-125000-0_sp1.1 from training. Duration: 0.563625 2023-10-02 08:27:14,099 INFO [train.py:1046] (2/4) Epoch 24, batch 0, loss[loss=0.1836, simple_loss=0.2587, pruned_loss=0.05428, over 23653.00 frames. ], tot_loss[loss=0.1836, simple_loss=0.2587, pruned_loss=0.05428, over 23653.00 frames. ], batch size: 149, lr: 4.32e-03, grad_scale: 32.0 2023-10-02 08:27:14,099 INFO [train.py:1069] (2/4) Computing validation loss 2023-10-02 08:27:27,306 INFO [train.py:1078] (2/4) Epoch 24, validation: loss=0.3245, simple_loss=0.2712, pruned_loss=0.1889, over 1125622.00 frames. 2023-10-02 08:27:27,307 INFO [train.py:1079] (2/4) Maximum memory allocated so far is 20967MB 2023-10-02 08:27:28,755 WARNING [train.py:1204] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_402-253027-0 from training. Duration: 0.54 2023-10-02 08:27:28,813 WARNING [train.py:1204] (2/4) Exclude cut with ID _59288_210_5854_1_1535077757774_3412246_158-289345-0_sp1.1 from training. Duration: 0.92725 2023-10-02 08:27:30,145 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_28211_210_6388_1_1532861506_4523224_128-328162-0_sp1.1 from training. Duration: 0.8181875 2023-10-02 08:27:34,390 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=814520.0, ans=0.1 2023-10-02 08:27:35,525 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_788-278281-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 08:27:35,532 WARNING [train.py:1204] (2/4) Exclude cut with ID _49728_210_12651_1_1534041034294_6788150_624-195301-0_sp0.9 from training. Duration: 0.9444375 2023-10-02 08:27:35,580 WARNING [train.py:1204] (2/4) Exclude cut with ID _14860_210_4915_1_1530702020340_7343240_267-139775-0_sp1.1 from training. Duration: 0.963625 2023-10-02 08:27:36,908 WARNING [train.py:1204] (2/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_332-221057-0 from training. Duration: 0.68 2023-10-02 08:27:40,205 WARNING [train.py:1204] (2/4) Exclude cut with ID _40402_210_18219_1_1533522324115_2953150_242-76943-0 from training. Duration: 0.94 2023-10-02 08:27:41,991 WARNING [train.py:1204] (2/4) Exclude cut with ID _16747_210_5986_1_1531811544733_2749109_87-349587-0_sp1.1 from training. Duration: 0.963625 2023-10-02 08:27:43,371 WARNING [train.py:1204] (2/4) Exclude cut with ID _22982_210_1883_1_1531882790447_5449409_505-346452-0_sp1.1 from training. Duration: 0.963625 2023-10-02 08:27:43,629 INFO [scaling.py:1118] (2/4) WithLoss: name=encoder.encoders.2.encoder.layers.2.self_attn_weights, loss-sum=0.000e+00 2023-10-02 08:27:47,553 WARNING [train.py:1204] (2/4) Exclude cut with ID _17956_210_4353_1_1531209568125_6979970_13-192107-0_sp1.1 from training. Duration: 0.963625 2023-10-02 08:27:47,591 WARNING [train.py:1204] (2/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_430-307666-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 08:27:48,911 WARNING [train.py:1204] (2/4) Exclude cut with ID _2895_210_2477_1_1525956785794_4063750_289-11475-0_sp1.1 from training. Duration: 0.7454375 2023-10-02 08:27:48,923 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_3910_210_1794_1_1526742306_334530_28-238909-0_sp0.9 from training. Duration: 0.9 2023-10-02 08:27:49,041 WARNING [train.py:1204] (2/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_18-130336-0 from training. Duration: 0.79 2023-10-02 08:27:51,221 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=814586.6666666666, ans=0.1 2023-10-02 08:27:52,399 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_548-294316-0_sp0.9 from training. Duration: 0.9 2023-10-02 08:27:56,503 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.669e+02 1.867e+02 2.102e+02 2.500e+02 3.375e+02, threshold=4.204e+02, percent-clipped=0.0 2023-10-02 08:27:58,581 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_42-142473-0_sp1.1 from training. Duration: 0.6545625 2023-10-02 08:28:00,114 WARNING [train.py:1204] (2/4) Exclude cut with ID _59170_210_6721_1_1534983633960_4203335_326-48066-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 08:28:01,543 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_45886_210_17024_1_1533709215_471063_7-79165-0 from training. Duration: 0.98 2023-10-02 08:28:04,238 WARNING [train.py:1204] (2/4) Exclude cut with ID _44585_210_18628_1_1533605350387_4518199_280-73534-0_sp0.9 from training. Duration: 0.988875 2023-10-02 08:28:04,252 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_3475_210_2028_1_1526193578_3622569_265-276625-0_sp1.1 from training. Duration: 0.6909375 2023-10-02 08:28:05,658 WARNING [train.py:1204] (2/4) Exclude cut with ID _64631_210_27767_1_1535349580068_4165160_88-225232-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 08:28:09,885 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_205-265518-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 08:28:13,055 WARNING [train.py:1204] (2/4) Exclude cut with ID _40402_210_18219_1_1533522324115_2953150_271-253734-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 08:28:18,350 WARNING [train.py:1204] (2/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_282-257791-0 from training. Duration: 0.86 2023-10-02 08:28:20,531 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.feed_forward1.out_proj.dropout_p, batch_count=814720.0, ans=0.1 2023-10-02 08:28:21,692 WARNING [train.py:1204] (2/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_356-45054-0 from training. Duration: 0.86 2023-10-02 08:28:21,727 WARNING [train.py:1204] (2/4) Exclude cut with ID _69584_210_5219_1_1535785133775_4044450_38-348116-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 08:28:21,729 WARNING [train.py:1204] (2/4) Exclude cut with ID _66763_210_17301_1_1535788667617_241750_21-328480-0_sp1.1 from training. Duration: 0.936375 2023-10-02 08:28:23,068 WARNING [train.py:1204] (2/4) Exclude cut with ID _26528_210_9070_1_1532570410842_3851310_497-97218-0_sp0.9 from training. Duration: 0.9555625 2023-10-02 08:28:24,406 WARNING [train.py:1204] (2/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_592-101075-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 08:28:27,554 WARNING [train.py:1204] (2/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_678-159307-0 from training. Duration: 0.85 2023-10-02 08:28:28,935 WARNING [train.py:1204] (2/4) Exclude cut with ID _28427_210_4220_1_1532581240967_3061069_166-319773-0_sp1.1 from training. Duration: 0.936375 2023-10-02 08:28:29,004 WARNING [train.py:1204] (2/4) Exclude cut with ID _29877_210_6228_1_1532397791106_5276149_606-224984-0_sp1.1 from training. Duration: 0.936375 2023-10-02 08:28:33,128 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_125-203105-0_sp1.1 from training. Duration: 0.72725 2023-10-02 08:28:35,914 WARNING [train.py:1204] (2/4) Exclude cut with ID IC0620W0113-306765 from training. Duration: 0.768 2023-10-02 08:28:37,242 WARNING [train.py:1204] (2/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_410-287857-0_sp1.1 from training. Duration: 0.8454375 2023-10-02 08:28:40,396 INFO [train.py:1046] (2/4) Epoch 24, batch 50, loss[loss=0.1662, simple_loss=0.2411, pruned_loss=0.04564, over 24446.00 frames. ], tot_loss[loss=0.1701, simple_loss=0.248, pruned_loss=0.04613, over 1064244.44 frames. ], batch size: 58, lr: 4.32e-03, grad_scale: 32.0 2023-10-02 08:28:40,461 WARNING [train.py:1204] (2/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_78-95260-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 08:28:41,872 WARNING [train.py:1204] (2/4) Exclude cut with ID _58059_210_5854_1_1534901471149_3164808_6-73080-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 08:28:41,893 WARNING [train.py:1204] (2/4) Exclude cut with ID _5166_210_2084_1_1527073197160_4087993_331-117829-0 from training. Duration: 0.62 2023-10-02 08:28:43,790 WARNING [train.py:1204] (2/4) Exclude cut with ID _14880_210_8895_1_1530680426655_3795259_289-212817-0_sp0.9 from training. Duration: 0.7666875 2023-10-02 08:28:43,833 WARNING [train.py:1204] (2/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_620-144779-0_sp1.1 from training. Duration: 0.92725 2023-10-02 08:28:45,169 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=814853.3333333334, ans=0.1 2023-10-02 08:28:46,259 WARNING [train.py:1204] (2/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_561-288575-0_sp1.1 from training. Duration: 0.963625 2023-10-02 08:28:47,619 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_22657_210_6890_1_1532086002_1637216_122-188128-0_sp1.1 from training. Duration: 0.963625 2023-10-02 08:28:49,059 WARNING [train.py:1204] (2/4) Exclude cut with ID _16756_210_4915_1_1531047803763_7104259_604-86607-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 08:28:53,645 WARNING [train.py:1204] (2/4) Exclude cut with ID _15150_210_8341_1_1530752411803_7256384_469-239509-0 from training. Duration: 0.68 2023-10-02 08:28:53,653 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_10987_210_4163_1_1529760555_4123902_302-87926-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 08:28:59,510 WARNING [train.py:1204] (2/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_469-94673-0_sp0.9 from training. Duration: 0.87775 2023-10-02 08:29:00,790 WARNING [train.py:1204] (2/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_798-106179-0 from training. Duration: 0.83 2023-10-02 08:29:03,460 WARNING [train.py:1204] (2/4) Exclude cut with ID _16744_210_5986_1_1531724405384_3907567_242-111316-0 from training. Duration: 0.93 2023-10-02 08:29:04,868 WARNING [train.py:1204] (2/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_938-205135-0_sp0.9 from training. Duration: 0.9666875 2023-10-02 08:29:04,959 WARNING [train.py:1204] (2/4) Exclude cut with ID _64135_210_2253_1_1535677267876_7608972_133-188266-0_sp0.9 from training. Duration: 0.9 2023-10-02 08:29:04,963 WARNING [train.py:1204] (2/4) Exclude cut with ID _44585_210_18628_1_1533605350387_4518199_623-346820-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 08:29:06,294 WARNING [train.py:1204] (2/4) Exclude cut with ID _42326_210_7770_1_1533814282531_3707269_454-266903-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 08:29:07,624 WARNING [train.py:1204] (2/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_5-55893-0_sp1.1 from training. Duration: 0.5 2023-10-02 08:29:07,654 WARNING [train.py:1204] (2/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_577-332600-0_sp1.1 from training. Duration: 0.5181875 2023-10-02 08:29:07,654 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_957-138882-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 08:29:15,426 WARNING [train.py:1204] (2/4) Exclude cut with ID _12556_210_8341_1_1530081024100_7327690_219-38004-0_sp1.1 from training. Duration: 0.97275 2023-10-02 08:29:16,805 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_397-147196-0_sp1.1 from training. Duration: 0.72725 2023-10-02 08:29:16,823 WARNING [train.py:1204] (2/4) Exclude cut with ID _3541_210_2477_1_1526475408709_3263973_231-104736-0_sp0.9 from training. Duration: 0.8666875 2023-10-02 08:29:16,896 WARNING [train.py:1204] (2/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_398-334558-0 from training. Duration: 0.96 2023-10-02 08:29:18,262 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_257-20733-0_sp1.1 from training. Duration: 0.6909375 2023-10-02 08:29:19,536 WARNING [train.py:1204] (2/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_146-282400-0_sp1.1 from training. Duration: 0.6454375 2023-10-02 08:29:19,540 WARNING [train.py:1204] (2/4) Exclude cut with ID _51899_210_20034_1_1534129254466_3669220_265-215428-0 from training. Duration: 0.98 2023-10-02 08:29:20,819 WARNING [train.py:1204] (2/4) Exclude cut with ID _34423_210_8750_1_1533430798456_3330199_219-106541-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 08:29:22,852 WARNING [train.py:1204] (2/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_458-67321-0 from training. Duration: 0.97 2023-10-02 08:29:28,986 WARNING [train.py:1204] (2/4) Exclude cut with ID _6653_210_2629_1_1527919172441_6732679_1051-182606-0_sp1.1 from training. Duration: 0.936375 2023-10-02 08:29:30,221 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_53555_210_5632_1_1534595220_7232297_603-4396-0_sp1.1 from training. Duration: 0.97275 2023-10-02 08:29:30,385 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.0.self_attn_weights.pos_emb_skip_rate, batch_count=815053.3333333334, ans=0.0 2023-10-02 08:29:31,534 WARNING [train.py:1204] (2/4) Exclude cut with ID _54218_210_12210_1_1534413646388_4409259_159-144367-0_sp1.1 from training. Duration: 0.963625 2023-10-02 08:29:32,985 WARNING [train.py:1204] (2/4) Exclude cut with ID _7606_210_4915_1_1528894253277_5080690_301-10732-0_sp0.9 from training. Duration: 0.9 2023-10-02 08:29:32,988 WARNING [train.py:1204] (2/4) Exclude cut with ID _15026_210_8750_1_1530777602316_3291850_211-195043-0_sp0.9 from training. Duration: 0.888875 2023-10-02 08:29:34,789 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.self_attn1.whiten.whitening_limit, batch_count=815053.3333333334, ans=22.5 2023-10-02 08:29:34,790 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.5.encoder.layers.1.self_attn1.whiten, num_groups=1, num_channels=256, metric=12.15 vs. limit=22.5 2023-10-02 08:29:35,703 WARNING [train.py:1204] (2/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_146-282400-0 from training. Duration: 0.71 2023-10-02 08:29:36,997 WARNING [train.py:1204] (2/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_65-88242-0 from training. Duration: 0.56 2023-10-02 08:29:38,429 WARNING [train.py:1204] (2/4) Exclude cut with ID _55857_210_2346_1_1534644007831_6898209_80-107757-0_sp1.1 from training. Duration: 0.963625 2023-10-02 08:29:38,472 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_15126_210_6753_1_1530778677_2591185_120-277707-0_sp0.9 from training. Duration: 0.888875 2023-10-02 08:29:39,879 WARNING [train.py:1204] (2/4) Exclude cut with ID _44778_210_2054_1_1533798127203_7443090_1033-168593-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 08:29:41,161 WARNING [train.py:1204] (2/4) Exclude cut with ID _46683_210_9642_1_1533730913492_4591116_475-30711-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 08:29:41,190 WARNING [train.py:1204] (2/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_253-308377-0 from training. Duration: 0.49 2023-10-02 08:29:41,241 WARNING [train.py:1204] (2/4) Exclude cut with ID _4680_210_3525_1_1527249757953_893386_75-78979-0 from training. Duration: 0.91 2023-10-02 08:29:42,676 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_5522_210_3273_1_1527249606_3909096_318-203153-0_sp0.9 from training. Duration: 0.47775 2023-10-02 08:29:46,243 WARNING [train.py:1204] (2/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_550-170116-0_sp1.1 from training. Duration: 0.92725 2023-10-02 08:29:46,269 WARNING [train.py:1204] (2/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_277-210798-0_sp0.9 from training. Duration: 0.611125 2023-10-02 08:29:46,342 WARNING [train.py:1204] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_620-276793-0 from training. Duration: 0.59 2023-10-02 08:29:47,596 WARNING [train.py:1204] (2/4) Exclude cut with ID _3067_210_2667_1_1526212974640_4503168_119-175776-0 from training. Duration: 0.74 2023-10-02 08:29:48,949 WARNING [train.py:1204] (2/4) Exclude cut with ID _55209_210_1531_1_1534675828996_4597160_681-195827-0_sp1.1 from training. Duration: 0.92725 2023-10-02 08:29:49,011 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_427-78097-0_sp1.1 from training. Duration: 0.72725 2023-10-02 08:29:50,412 WARNING [train.py:1204] (2/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_96-290039-0_sp0.9 from training. Duration: 0.62225 2023-10-02 08:29:51,638 WARNING [train.py:1204] (2/4) Exclude cut with ID _45329_210_7005_1_1534157951211_6774339_287-292174-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 08:29:53,008 INFO [train.py:1046] (2/4) Epoch 24, batch 100, loss[loss=0.1805, simple_loss=0.2658, pruned_loss=0.0476, over 24568.00 frames. ], tot_loss[loss=0.1721, simple_loss=0.2505, pruned_loss=0.04681, over 1878171.87 frames. ], batch size: 71, lr: 4.32e-03, grad_scale: 16.0 2023-10-02 08:29:53,156 WARNING [train.py:1204] (2/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_200-292228-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 08:29:56,314 WARNING [train.py:1204] (2/4) Exclude cut with ID _69884_210_9771_1_1535846392818_4166169_291-194999-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 08:29:59,561 WARNING [train.py:1204] (2/4) Exclude cut with ID _58895_210_8750_1_1535184081320_3649089_265-323810-0_sp1.1 from training. Duration: 0.87275 2023-10-02 08:30:02,112 WARNING [train.py:1204] (2/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_58-68956-0 from training. Duration: 0.83 2023-10-02 08:30:02,117 WARNING [train.py:1204] (2/4) Exclude cut with ID _39867_210_9826_1_1533553222030_9344259_322-115153-0_sp1.1 from training. Duration: 0.963625 2023-10-02 08:30:05,067 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_3371_210_1794_1_1526384462_5580777_396-339812-0_sp1.1 from training. Duration: 0.77275 2023-10-02 08:30:05,084 WARNING [train.py:1204] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_731-2428-0_sp0.9 from training. Duration: 0.92225 2023-10-02 08:30:06,400 WARNING [train.py:1204] (2/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_98-741-0_sp1.1 from training. Duration: 0.72725 2023-10-02 08:30:06,402 WARNING [train.py:1204] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_238-245655-0_sp0.9 from training. Duration: 0.9 2023-10-02 08:30:06,429 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6788_210_1388_1_1527898698_4562021_91-197981-0_sp0.9 from training. Duration: 0.92225 2023-10-02 08:30:07,773 WARNING [train.py:1204] (2/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_860-96198-0 from training. Duration: 0.73 2023-10-02 08:30:10,504 WARNING [train.py:1204] (2/4) Exclude cut with ID _14268_210_9206_1_1530622771285_4714099_331-105351-0_sp0.9 from training. Duration: 0.72225 2023-10-02 08:30:10,655 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.bypass_mid.scale_min, batch_count=815253.3333333334, ans=0.2 2023-10-02 08:30:11,869 WARNING [train.py:1204] (2/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_204-137133-0_sp1.1 from training. Duration: 0.92725 2023-10-02 08:30:11,887 WARNING [train.py:1204] (2/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_5-46496-0_sp1.1 from training. Duration: 0.936375 2023-10-02 08:30:11,895 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_160-218721-0_sp1.1 from training. Duration: 0.87275 2023-10-02 08:30:14,600 WARNING [train.py:1204] (2/4) Exclude cut with ID _45329_210_7005_1_1534157951211_6774339_503-287883-0 from training. Duration: 0.88 2023-10-02 08:30:15,962 WARNING [train.py:1204] (2/4) Exclude cut with ID _57572_210_5517_1_1534921219624_3809484_110-79134-0_sp1.1 from training. Duration: 0.92725 2023-10-02 08:30:17,832 WARNING [train.py:1204] (2/4) Exclude cut with ID _44585_210_18628_1_1533605350387_4518199_421-37641-0_sp1.1 from training. Duration: 0.936375 2023-10-02 08:30:17,917 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_42-142473-0_sp0.9 from training. Duration: 0.8 2023-10-02 08:30:19,806 WARNING [train.py:1204] (2/4) Exclude cut with ID _5166_210_2084_1_1527073197160_4087993_310-114452-0_sp0.9 from training. Duration: 0.6555625 2023-10-02 08:30:23,759 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.577e+02 1.848e+02 2.049e+02 2.242e+02 3.447e+02, threshold=4.098e+02, percent-clipped=0.0 2023-10-02 08:30:23,846 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9029W0120-509520 from training. Duration: 0.768 2023-10-02 08:30:23,860 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9029W0433-509820 from training. Duration: 0.896 2023-10-02 08:30:25,237 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_40222_210_6228_1_1533385298_4481939_163-334586-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 08:30:25,237 WARNING [train.py:1204] (2/4) Exclude cut with ID _48677_210_5663_1_1533902405919_3892989_408-40740-0_sp1.1 from training. Duration: 0.7545625 2023-10-02 08:30:28,484 WARNING [train.py:1204] (2/4) Exclude cut with ID _24314_210_4915_1_1532135078116_7170179_521-258868-0_sp0.9 from training. Duration: 0.87775 2023-10-02 08:30:30,385 WARNING [train.py:1204] (2/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_576-51198-0_sp1.1 from training. Duration: 0.92725 2023-10-02 08:30:30,490 WARNING [train.py:1204] (2/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_306-216871-0_sp1.1 from training. Duration: 0.97275 2023-10-02 08:30:34,788 WARNING [train.py:1204] (2/4) Exclude cut with ID _20684_210_6917_1_1531567751646_4203930_463-119982-0_sp1.1 from training. Duration: 0.97275 2023-10-02 08:30:36,182 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9084W0012-536850 from training. Duration: 0.64 2023-10-02 08:30:37,596 WARNING [train.py:1204] (2/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_847-41334-0_sp1.1 from training. Duration: 0.4 2023-10-02 08:30:40,254 WARNING [train.py:1204] (2/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_326-165900-0_sp1.1 from training. Duration: 0.62725 2023-10-02 08:30:40,427 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.conv_module2.balancer1.max_abs, batch_count=815386.6666666666, ans=10.0 2023-10-02 08:30:41,643 WARNING [train.py:1204] (2/4) Exclude cut with ID _7537_210_2084_1_1528502395431_3924359_128-268029-0_sp1.1 from training. Duration: 0.8909375 2023-10-02 08:30:43,069 WARNING [train.py:1204] (2/4) Exclude cut with ID _67499_210_17282_1_1535509805220_3692430_333-155030-0_sp1.1 from training. Duration: 0.97275 2023-10-02 08:30:43,264 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.conv_module1.balancer1.prob, batch_count=815386.6666666666, ans=0.125 2023-10-02 08:30:47,890 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_34008_210_10431_1_1533345424_8183247_588-257177-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 08:30:48,024 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.0.feed_forward2.hidden_balancer.prob, batch_count=815386.6666666666, ans=0.125 2023-10-02 08:30:50,615 WARNING [train.py:1204] (2/4) Exclude cut with ID _25914_210_6400_1_1532141317058_644740_52-96808-0_sp1.1 from training. Duration: 0.82725 2023-10-02 08:30:51,954 WARNING [train.py:1204] (2/4) Exclude cut with ID _56494_210_4220_1_1535284837625_3033169_69-138150-0_sp1.1 from training. Duration: 0.8545625 2023-10-02 08:30:54,847 WARNING [train.py:1204] (2/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_19-143553-0_sp1.1 from training. Duration: 0.97275 2023-10-02 08:30:54,893 WARNING [train.py:1204] (2/4) Exclude cut with ID _21878_210_3525_1_1531738772002_7264330_582-130735-0_sp1.1 from training. Duration: 0.936375 2023-10-02 08:30:57,888 WARNING [train.py:1204] (2/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_943-201356-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 08:30:57,905 WARNING [train.py:1204] (2/4) Exclude cut with ID _3231_210_2106_1_1526205864934_6784220_422-229276-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 08:30:57,928 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_48049_210_5632_1_1533864553_3519937_73-198717-0_sp1.1 from training. Duration: 0.97275 2023-10-02 08:30:57,980 WARNING [train.py:1204] (2/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_329-285801-0 from training. Duration: 0.91 2023-10-02 08:30:59,233 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9171W0114-579000 from training. Duration: 0.896 2023-10-02 08:30:59,257 WARNING [train.py:1204] (2/4) Exclude cut with ID _3120_210_3525_1_1526036143112_4072980_210-132675-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 08:31:00,583 WARNING [train.py:1204] (2/4) Exclude cut with ID _55857_210_2346_1_1534644007831_6898209_745-98391-0_sp0.9 from training. Duration: 0.8444375 2023-10-02 08:31:02,424 WARNING [train.py:1204] (2/4) Exclude cut with ID _5250_210_5219_1_1527249561806_8692110_615-322035-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 08:31:02,428 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_36756_210_7234_1_1533026991_966276_119-315767-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 08:31:02,444 WARNING [train.py:1204] (2/4) Exclude cut with ID _13916_210_9994_1_1530788341126_3763149_60-191556-0_sp1.1 from training. Duration: 0.5545625 2023-10-02 08:31:02,458 WARNING [train.py:1204] (2/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_177-236809-0_sp1.1 from training. Duration: 0.4454375 2023-10-02 08:31:02,498 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7002_210_1794_1_1527939683_5157286_184-229630-0_sp1.1 from training. Duration: 0.67275 2023-10-02 08:31:02,511 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_50428_210_5724_1_1534247532_4126418_464-18322-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 08:31:02,571 WARNING [train.py:1204] (2/4) Exclude cut with ID _35799_210_5854_1_1533263487887_3529071_466-131402-0_sp1.1 from training. Duration: 0.936375 2023-10-02 08:31:03,915 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_1259-132768-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 08:31:03,960 WARNING [train.py:1204] (2/4) Exclude cut with ID _6694_210_4599_1_1527852637490_3965529_144-185051-0_sp1.1 from training. Duration: 0.87275 2023-10-02 08:31:05,322 WARNING [train.py:1204] (2/4) Exclude cut with ID _64639_210_5816_1_1535367619437_3528000_59-133290-0_sp1.1 from training. Duration: 0.963625 2023-10-02 08:31:06,728 INFO [train.py:1046] (2/4) Epoch 24, batch 150, loss[loss=0.1628, simple_loss=0.232, pruned_loss=0.04678, over 23430.00 frames. ], tot_loss[loss=0.1717, simple_loss=0.2494, pruned_loss=0.04704, over 2516392.70 frames. ], batch size: 134, lr: 4.32e-03, grad_scale: 16.0 2023-10-02 08:31:08,131 WARNING [train.py:1204] (2/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_223-49525-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 08:31:10,894 WARNING [train.py:1204] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_748-36150-0_sp1.1 from training. Duration: 0.82725 2023-10-02 08:31:10,894 WARNING [train.py:1204] (2/4) Exclude cut with ID _19896_210_1883_1_1531709022662_6649569_52-221627-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 08:31:12,203 WARNING [train.py:1204] (2/4) Exclude cut with ID _6789_210_1534_1_1527854471671_2870519_296-115766-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 08:31:13,787 WARNING [train.py:1204] (2/4) Exclude cut with ID _14624_210_1925_1_1530858690657_3642219_100-19871-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 08:31:13,819 WARNING [train.py:1204] (2/4) Exclude cut with ID _12555_210_8750_1_1530075594402_3780239_50-293675-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 08:31:16,919 WARNING [train.py:1204] (2/4) Exclude cut with ID _14880_210_8895_1_1530680426655_3795259_289-212817-0_sp1.1 from training. Duration: 0.62725 2023-10-02 08:31:18,223 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_14056_210_4163_1_1530604483_7572827_388-211626-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 08:31:24,105 WARNING [train.py:1204] (2/4) Exclude cut with ID _5289_210_2084_1_1527678202736_3779299_392-261988-0 from training. Duration: 0.65 2023-10-02 08:31:24,119 WARNING [train.py:1204] (2/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_124-345840-0 from training. Duration: 0.52 2023-10-02 08:31:24,123 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_2655_210_2087_1_1525688959_3897400_437-337690-0 from training. Duration: 0.63 2023-10-02 08:31:26,865 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_3780_210_2087_1_1526467391_239068_13-274633-0_sp1.1 from training. Duration: 0.92725 2023-10-02 08:31:26,870 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_805-71497-0_sp1.1 from training. Duration: 0.6454375 2023-10-02 08:31:28,291 WARNING [train.py:1204] (2/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_434-83000-0_sp1.1 from training. Duration: 0.8909375 2023-10-02 08:31:30,120 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_17613_210_2151_1_1531137569_2366160_285-219531-0_sp1.1 from training. Duration: 0.97275 2023-10-02 08:31:30,125 WARNING [train.py:1204] (2/4) Exclude cut with ID _56108_210_16380_1_1534505446159_3887959_22-37858-0_sp1.1 from training. Duration: 0.936375 2023-10-02 08:31:30,155 WARNING [train.py:1204] (2/4) Exclude cut with ID _28427_210_4220_1_1532581240967_3061069_200-88444-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 08:31:31,555 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_43802_210_2290_1_1533516203_8829504_77-279042-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 08:31:31,651 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9309W0193-640320 from training. Duration: 0.896 2023-10-02 08:31:34,307 WARNING [train.py:1204] (2/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_516-324055-0_sp1.1 from training. Duration: 0.936375 2023-10-02 08:31:40,064 WARNING [train.py:1204] (2/4) Exclude cut with ID _40094_210_16380_1_1533294005773_3641159_10-261240-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 08:31:42,738 WARNING [train.py:1204] (2/4) Exclude cut with ID _6267_210_2084_1_1527764658526_3894730_375-219887-0_sp1.1 from training. Duration: 0.6090625 2023-10-02 08:31:44,117 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6788_210_1388_1_1527898698_4562021_180-75288-0 from training. Duration: 0.99 2023-10-02 08:31:44,296 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.nonlin_attention.balancer.prob, batch_count=815653.3333333334, ans=0.125 2023-10-02 08:31:48,698 WARNING [train.py:1204] (2/4) Exclude cut with ID _8030_210_7497_1_1528444886781_3531089_262-86750-0_sp1.1 from training. Duration: 0.736375 2023-10-02 08:31:48,716 WARNING [train.py:1204] (2/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_934-325498-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 08:31:49,920 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_456-22141-0_sp0.9 from training. Duration: 0.97775 2023-10-02 08:31:51,889 WARNING [train.py:1204] (2/4) Exclude cut with ID _49643_210_7989_1_1534640244224_3939031_598-161926-0_sp1.1 from training. Duration: 0.8181875 2023-10-02 08:31:52,636 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.1.encoder.layers.0.feed_forward2.out_whiten, num_groups=1, num_channels=256, metric=4.44 vs. limit=15.0 2023-10-02 08:31:53,253 WARNING [train.py:1204] (2/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_345-49721-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 08:31:53,424 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.conv_module1.balancer1.prob, batch_count=815720.0, ans=0.125 2023-10-02 08:31:53,536 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.self_attn_weights.pos_emb_skip_rate, batch_count=815720.0, ans=0.0 2023-10-02 08:31:54,591 WARNING [train.py:1204] (2/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_440-141811-0_sp1.1 from training. Duration: 0.863625 2023-10-02 08:31:55,995 WARNING [train.py:1204] (2/4) Exclude cut with ID _64048_210_10626_1_1535198382802_3835179_394-212120-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 08:31:56,057 WARNING [train.py:1204] (2/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_91-277684-0 from training. Duration: 0.74 2023-10-02 08:32:00,827 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.feed_forward1.out_whiten.whitening_limit, batch_count=815720.0, ans=15.0 2023-10-02 08:32:01,637 WARNING [train.py:1204] (2/4) Exclude cut with ID _23038_210_9570_1_1532001584158_3652419_191-346343-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 08:32:02,916 WARNING [train.py:1204] (2/4) Exclude cut with ID _15140_210_9288_1_1530791388450_4880149_351-347974-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 08:32:02,952 WARNING [train.py:1204] (2/4) Exclude cut with ID _58605_210_12210_1_1534723195939_3904259_48-269402-0_sp1.1 from training. Duration: 0.87275 2023-10-02 08:32:02,958 WARNING [train.py:1204] (2/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_401-53423-0_sp0.9 from training. Duration: 0.611125 2023-10-02 08:32:04,832 WARNING [train.py:1204] (2/4) Exclude cut with ID _42659_210_2070_1_1533607442765_3120380_158-49004-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 08:32:06,213 WARNING [train.py:1204] (2/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_81-75849-0_sp1.1 from training. Duration: 0.3545625 2023-10-02 08:32:08,606 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.self_attn_weights.whiten_keys.whitening_limit, batch_count=815786.6666666666, ans=6.0 2023-10-02 08:32:09,466 WARNING [train.py:1204] (2/4) Exclude cut with ID _27052_210_5986_1_1532329197187_3585009_314-106499-0_sp1.1 from training. Duration: 0.836375 2023-10-02 08:32:10,981 WARNING [train.py:1204] (2/4) Exclude cut with ID _4626_210_3532_1_1526817652717_3916859_446-206656-0_sp0.9 from training. Duration: 0.8444375 2023-10-02 08:32:11,220 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.feed_forward1.hidden_balancer.prob, batch_count=815786.6666666666, ans=0.125 2023-10-02 08:32:12,283 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_53378_210_17282_1_1534221071_5769509_158-114960-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 08:32:13,687 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_3171_210_2087_1_1526109084_2330007_37-64334-0_sp0.9 from training. Duration: 0.911125 2023-10-02 08:32:13,702 WARNING [train.py:1204] (2/4) Exclude cut with ID _5410_210_1388_1_1527397055795_1719020_39-112221-0 from training. Duration: 0.69 2023-10-02 08:32:13,719 WARNING [train.py:1204] (2/4) Exclude cut with ID _8064_210_5399_1_1528372299291_4282973_4-91037-0_sp0.9 from training. Duration: 0.97775 2023-10-02 08:32:13,732 WARNING [train.py:1204] (2/4) Exclude cut with ID ID0079W0443-717795 from training. Duration: 0.541 2023-10-02 08:32:13,990 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.bypass_mid.scale_min, batch_count=815786.6666666666, ans=0.2 2023-10-02 08:32:15,192 WARNING [train.py:1204] (2/4) Exclude cut with ID _34734_210_4525_1_1533365977542_4448720_766-297928-0_sp1.1 from training. Duration: 0.936375 2023-10-02 08:32:19,659 INFO [train.py:1046] (2/4) Epoch 24, batch 200, loss[loss=0.1889, simple_loss=0.2549, pruned_loss=0.06146, over 23351.00 frames. ], tot_loss[loss=0.174, simple_loss=0.2507, pruned_loss=0.04864, over 3002011.01 frames. ], batch size: 285, lr: 4.32e-03, grad_scale: 16.0 2023-10-02 08:32:19,749 WARNING [train.py:1204] (2/4) Exclude cut with ID _51471_210_1889_1_1534399391390_3094250_310-337793-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 08:32:19,760 WARNING [train.py:1204] (2/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_678-159307-0_sp0.9 from training. Duration: 0.9444375 2023-10-02 08:32:21,748 WARNING [train.py:1204] (2/4) Exclude cut with ID _50093_210_4220_1_1534417141207_3432299_259-322252-0 from training. Duration: 0.92 2023-10-02 08:32:23,068 WARNING [train.py:1204] (2/4) Exclude cut with ID _47790_210_16187_1_1533862814066_4207519_67-103444-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 08:32:23,085 WARNING [train.py:1204] (2/4) Exclude cut with ID _40551_210_10635_1_1533776424632_3416179_311-247578-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 08:32:25,904 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_4633_210_2040_1_1526824163_4145343_416-76117-0 from training. Duration: 0.95 2023-10-02 08:32:25,984 WARNING [train.py:1204] (2/4) Exclude cut with ID _5166_210_2084_1_1527073197160_4087993_331-117829-0_sp0.9 from training. Duration: 0.688875 2023-10-02 08:32:27,266 WARNING [train.py:1204] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_299-247059-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 08:32:28,502 WARNING [train.py:1204] (2/4) Exclude cut with ID _64002_210_9570_1_1535198605157_38979_2-240845-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 08:32:28,700 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.conv_module2.balancer1.prob, batch_count=815853.3333333334, ans=0.125 2023-10-02 08:32:29,149 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.2.encoder.layers.2.self_attn1.whiten, num_groups=1, num_channels=384, metric=17.66 vs. limit=22.5 2023-10-02 08:32:30,579 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.2.encoder.layers.2.conv_module2.whiten, num_groups=1, num_channels=384, metric=4.81 vs. limit=15.0 2023-10-02 08:32:33,105 WARNING [train.py:1204] (2/4) Exclude cut with ID _4681_210_3525_1_1527250689464_120889_15-126760-0_sp0.9 from training. Duration: 0.9 2023-10-02 08:32:34,357 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_21500_210_2054_1_1532149310_2716758_256-156611-0_sp1.1 from training. Duration: 0.936375 2023-10-02 08:32:34,378 WARNING [train.py:1204] (2/4) Exclude cut with ID _16908_210_5724_1_1531119157248_3772700_624-203729-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 08:32:44,033 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.1.encoder.layers.0.feed_forward3.out_whiten, num_groups=1, num_channels=256, metric=13.66 vs. limit=15.0 2023-10-02 08:32:49,984 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.435e+02 1.916e+02 2.110e+02 2.574e+02 4.556e+02, threshold=4.220e+02, percent-clipped=1.0 2023-10-02 08:32:54,801 WARNING [train.py:1204] (2/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_605-219732-0_sp1.1 from training. Duration: 0.8545625 2023-10-02 08:32:54,830 WARNING [train.py:1204] (2/4) Exclude cut with ID _44718_210_5816_1_1534050051869_6469030_380-167520-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 08:32:55,409 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.4.encoder.layers.1.nonlin_attention.whiten1, num_groups=1, num_channels=288, metric=4.13 vs. limit=10.0 2023-10-02 08:32:56,152 WARNING [train.py:1204] (2/4) Exclude cut with ID _19896_210_1883_1_1531709022662_6649569_94-229927-0_sp1.1 from training. Duration: 0.8818125 2023-10-02 08:32:56,195 WARNING [train.py:1204] (2/4) Exclude cut with ID _19514_210_9259_1_1531530071330_5084230_519-174300-0_sp1.1 from training. Duration: 0.963625 2023-10-02 08:32:57,558 WARNING [train.py:1204] (2/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_340-174176-0_sp0.9 from training. Duration: 0.5444375 2023-10-02 08:32:57,561 WARNING [train.py:1204] (2/4) Exclude cut with ID _8263_210_1664_1_1528439452891_970220_68-181805-0_sp1.1 from training. Duration: 0.5818125 2023-10-02 08:32:58,945 WARNING [train.py:1204] (2/4) Exclude cut with ID _3120_210_3525_1_1526036143112_4072980_348-257723-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 08:33:00,264 WARNING [train.py:1204] (2/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_453-133983-0_sp1.1 from training. Duration: 0.7090625 2023-10-02 08:33:00,322 WARNING [train.py:1204] (2/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_17-86060-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 08:33:00,341 WARNING [train.py:1204] (2/4) Exclude cut with ID _29877_210_6228_1_1532397791106_5276149_228-184657-0_sp1.1 from training. Duration: 0.97275 2023-10-02 08:33:02,836 WARNING [train.py:1204] (2/4) Exclude cut with ID _18516_210_9206_1_1531704653206_3692406_198-334870-0 from training. Duration: 0.92 2023-10-02 08:33:02,870 WARNING [train.py:1204] (2/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_340-174176-0_sp1.1 from training. Duration: 0.4454375 2023-10-02 08:33:04,192 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_3133_210_2087_1_1526106446_2324690_14-139186-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 08:33:08,969 WARNING [train.py:1204] (2/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_126-29104-0_sp0.9 from training. Duration: 0.9444375 2023-10-02 08:33:13,625 WARNING [train.py:1204] (2/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_84-10310-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 08:33:13,828 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.conv_module1.balancer1.prob, batch_count=816053.3333333334, ans=0.125 2023-10-02 08:33:22,065 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_15517_210_6753_1_1530954008_3487336_79-273076-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 08:33:22,119 WARNING [train.py:1204] (2/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_371-18169-0_sp0.9 from training. Duration: 0.9555625 2023-10-02 08:33:30,910 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_68702_210_23689_1_1535799577_3420068_170-92788-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 08:33:32,202 INFO [train.py:1046] (2/4) Epoch 24, batch 250, loss[loss=0.1753, simple_loss=0.2602, pruned_loss=0.04518, over 24456.00 frames. ], tot_loss[loss=0.1746, simple_loss=0.2511, pruned_loss=0.04908, over 3385055.98 frames. ], batch size: 69, lr: 4.32e-03, grad_scale: 16.0 2023-10-02 08:33:32,359 WARNING [train.py:1204] (2/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_78-182912-0 from training. Duration: 0.59 2023-10-02 08:33:33,734 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_3019_210_2028_1_1526044245_3264865_297-12384-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 08:33:33,739 WARNING [train.py:1204] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_147-111137-0_sp1.1 from training. Duration: 0.863625 2023-10-02 08:33:33,744 WARNING [train.py:1204] (2/4) Exclude cut with ID _28323_210_16187_1_1532480395667_4100300_117-8859-0_sp1.1 from training. Duration: 0.97275 2023-10-02 08:33:35,172 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7762_210_1794_1_1528199508_4057408_419-125009-0_sp1.1 from training. Duration: 0.7545625 2023-10-02 08:33:35,262 WARNING [train.py:1204] (2/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_268-87909-0 from training. Duration: 0.99 2023-10-02 08:33:36,583 WARNING [train.py:1204] (2/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_118-311357-0_sp1.1 from training. Duration: 0.92725 2023-10-02 08:33:36,595 WARNING [train.py:1204] (2/4) Exclude cut with ID ID0377W0003-870225 from training. Duration: 0.852 2023-10-02 08:33:38,014 WARNING [train.py:1204] (2/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_56-5838-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 08:33:38,154 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.conv_module2.balancer2.prob, batch_count=816186.6666666666, ans=0.125 2023-10-02 08:33:40,566 WARNING [train.py:1204] (2/4) Exclude cut with ID _14603_210_4220_1_1530615688441_3097319_90-31383-0_sp0.9 from training. Duration: 0.9666875 2023-10-02 08:33:42,315 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_31583_210_3852_1_1532595851_4024413_58-86657-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 08:33:42,354 WARNING [train.py:1204] (2/4) Exclude cut with ID _30304_210_6319_1_1532476827261_7582060_344-113875-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 08:33:45,626 WARNING [train.py:1204] (2/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_278-104676-0_sp1.1 from training. Duration: 0.8909375 2023-10-02 08:33:45,653 WARNING [train.py:1204] (2/4) Exclude cut with ID _55844_210_2346_1_1534471212569_7625707_764-2387-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 08:33:47,029 WARNING [train.py:1204] (2/4) Exclude cut with ID _27430_210_7770_1_1532604689375_1378590_157-14963-0_sp1.1 from training. Duration: 0.963625 2023-10-02 08:33:49,826 WARNING [train.py:1204] (2/4) Exclude cut with ID _7160_210_1876_1_1527940923231_3703799_282-322611-0_sp1.1 from training. Duration: 0.7909375 2023-10-02 08:33:53,391 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.attention_skip_rate, batch_count=816253.3333333334, ans=0.0 2023-10-02 08:33:58,422 WARNING [train.py:1204] (2/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_190-138640-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 08:34:01,687 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_33348_210_5632_1_1533086912_3324030_63-270922-0_sp1.1 from training. Duration: 0.97275 2023-10-02 08:34:01,714 WARNING [train.py:1204] (2/4) Exclude cut with ID _54871_210_22356_1_1534665798253_3856359_135-184281-0_sp1.1 from training. Duration: 0.9 2023-10-02 08:34:02,414 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.1.encoder.layers.0.nonlin_attention.whiten2, num_groups=1, num_channels=256, metric=7.48 vs. limit=15.0 2023-10-02 08:34:03,332 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=816320.0, ans=0.1 2023-10-02 08:34:08,611 WARNING [train.py:1204] (2/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_277-210798-0_sp1.1 from training. Duration: 0.5 2023-10-02 08:34:08,662 WARNING [train.py:1204] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_402-253027-0_sp0.9 from training. Duration: 0.6 2023-10-02 08:34:10,066 WARNING [train.py:1204] (2/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_540-275685-0_sp0.9 from training. Duration: 0.988875 2023-10-02 08:34:10,105 WARNING [train.py:1204] (2/4) Exclude cut with ID _59493_210_17282_1_1535078419077_3225879_118-18960-0_sp1.1 from training. Duration: 0.936375 2023-10-02 08:34:11,420 WARNING [train.py:1204] (2/4) Exclude cut with ID _6194_210_1534_1_1527511826160_3941194_321-148641-0_sp1.1 from training. Duration: 0.5818125 2023-10-02 08:34:11,430 WARNING [train.py:1204] (2/4) Exclude cut with ID _61133_210_9969_1_1535196777227_3696409_333-270325-0_sp1.1 from training. Duration: 0.8454375 2023-10-02 08:34:13,226 WARNING [train.py:1204] (2/4) Exclude cut with ID _49376_210_5986_1_1533985278732_3370740_307-145221-0_sp1.1 from training. Duration: 0.936375 2023-10-02 08:34:14,648 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6846_210_6753_1_1527850767_4295158_2-315073-0_sp0.9 from training. Duration: 0.97775 2023-10-02 08:34:17,918 WARNING [train.py:1204] (2/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_17-25045-0 from training. Duration: 0.59 2023-10-02 08:34:17,930 WARNING [train.py:1204] (2/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_503-333510-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 08:34:19,308 WARNING [train.py:1204] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_238-245655-0_sp1.1 from training. Duration: 0.736375 2023-10-02 08:34:19,330 WARNING [train.py:1204] (2/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_329-82235-0_sp0.9 from training. Duration: 0.911125 2023-10-02 08:34:19,335 WARNING [train.py:1204] (2/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_325-113798-0_sp0.9 from training. Duration: 0.9444375 2023-10-02 08:34:20,735 WARNING [train.py:1204] (2/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_395-37222-0_sp0.9 from training. Duration: 0.8444375 2023-10-02 08:34:22,130 WARNING [train.py:1204] (2/4) Exclude cut with ID _64135_210_2253_1_1535677267876_7608972_498-5039-0_sp1.1 from training. Duration: 0.7090625 2023-10-02 08:34:22,138 WARNING [train.py:1204] (2/4) Exclude cut with ID _14922_210_6228_1_1530707435273_3693310_303-228738-0_sp0.9 from training. Duration: 0.9333125 2023-10-02 08:34:23,491 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_577-220899-0_sp1.1 from training. Duration: 0.92725 2023-10-02 08:34:24,824 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_3475_210_2028_1_1526193578_3622569_377-166133-0_sp1.1 from training. Duration: 0.7818125 2023-10-02 08:34:24,861 WARNING [train.py:1204] (2/4) Exclude cut with ID _49376_210_5986_1_1533985278732_3370740_272-313512-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 08:34:27,778 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7762_210_1794_1_1528199508_4057408_422-227034-0_sp0.9 from training. Duration: 0.811125 2023-10-02 08:34:32,352 WARNING [train.py:1204] (2/4) Exclude cut with ID _18895_210_6228_1_1531357274234_3784930_74-341428-0_sp1.1 from training. Duration: 0.92725 2023-10-02 08:34:33,793 WARNING [train.py:1204] (2/4) Exclude cut with ID _44902_210_16187_1_1533888006062_3792078_149-67212-0_sp1.1 from training. Duration: 0.7909375 2023-10-02 08:34:33,992 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=816453.3333333334, ans=0.1 2023-10-02 08:34:39,344 WARNING [train.py:1204] (2/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_243-126020-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 08:34:41,227 WARNING [train.py:1204] (2/4) Exclude cut with ID _54218_210_12210_1_1534413646388_4409259_259-6561-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 08:34:44,426 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_41452_210_7291_1_1533437687_3941281_405-11559-0 from training. Duration: 0.91 2023-10-02 08:34:45,738 INFO [train.py:1046] (2/4) Epoch 24, batch 300, loss[loss=0.1782, simple_loss=0.2561, pruned_loss=0.05022, over 23345.00 frames. ], tot_loss[loss=0.1722, simple_loss=0.2479, pruned_loss=0.04828, over 3680131.24 frames. ], batch size: 93, lr: 4.32e-03, grad_scale: 16.0 2023-10-02 08:34:45,777 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_20120_210_6753_1_1531643035_5482859_759-225084-0_sp1.1 from training. Duration: 0.97275 2023-10-02 08:34:45,790 WARNING [train.py:1204] (2/4) Exclude cut with ID _14747_210_8341_1_1530685797463_3654089_147-131229-0_sp0.9 from training. Duration: 0.8444375 2023-10-02 08:34:46,512 WARNING [train.py:1204] (2/4) Exclude cut with ID _14867_210_1968_1_1530684179890_3846969_319-250868-0 from training. Duration: 0.99 2023-10-02 08:34:46,540 WARNING [train.py:1204] (2/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_580-93798-0_sp0.9 from training. Duration: 0.62225 2023-10-02 08:34:47,843 WARNING [train.py:1204] (2/4) Exclude cut with ID _9679_210_7497_1_1529129212906_4363929_512-313889-0_sp0.9 from training. Duration: 0.9 2023-10-02 08:34:47,852 WARNING [train.py:1204] (2/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_18-189198-0 from training. Duration: 0.9 2023-10-02 08:34:51,979 WARNING [train.py:1204] (2/4) Exclude cut with ID _49755_210_18053_1_1534581005846_3218709_137-64179-0_sp1.1 from training. Duration: 0.92725 2023-10-02 08:34:52,036 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_48049_210_5632_1_1533864553_3519937_306-283794-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 08:34:56,091 WARNING [train.py:1204] (2/4) Exclude cut with ID _43552_210_8913_1_1533726031059_3523429_25-120161-0_sp1.1 from training. Duration: 0.8909375 2023-10-02 08:34:57,364 WARNING [train.py:1204] (2/4) Exclude cut with ID _8833_210_5219_1_1528592665210_7553059_319-349181-0 from training. Duration: 0.99 2023-10-02 08:34:58,700 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_296-332325-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 08:34:58,789 WARNING [train.py:1204] (2/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_436-186311-0_sp1.1 from training. Duration: 0.5909375 2023-10-02 08:34:58,802 WARNING [train.py:1204] (2/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_348-127789-0 from training. Duration: 0.85 2023-10-02 08:35:00,625 WARNING [train.py:1204] (2/4) Exclude cut with ID _3992_210_1747_1_1526644816031_3731060_443-195183-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 08:35:03,544 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.bypass_mid.scale_min, batch_count=816586.6666666666, ans=0.2 2023-10-02 08:35:04,698 WARNING [train.py:1204] (2/4) Exclude cut with ID _3465_210_3525_1_1526175667060_3574049_308-231735-0_sp1.1 from training. Duration: 0.663625 2023-10-02 08:35:09,437 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6786_210_1388_1_1527839699_4194495_378-46070-0_sp1.1 from training. Duration: 0.6545625 2023-10-02 08:35:09,489 WARNING [train.py:1204] (2/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_63-101996-0 from training. Duration: 0.88 2023-10-02 08:35:09,720 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=816586.6666666666, ans=0.1 2023-10-02 08:35:11,143 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.bypass.skip_rate, batch_count=816586.6666666666, ans=0.04949747468305833 2023-10-02 08:35:14,025 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_2541_210_1794_1_1525590269_3084765_156-173713-0 from training. Duration: 0.81 2023-10-02 08:35:14,054 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_217-239796-0_sp1.1 from training. Duration: 0.963625 2023-10-02 08:35:14,301 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.conv_module2.balancer1.prob, batch_count=816653.3333333334, ans=0.125 2023-10-02 08:35:15,480 WARNING [train.py:1204] (2/4) Exclude cut with ID _21878_210_3525_1_1531738772002_7264330_530-329400-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 08:35:16,750 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.532e+02 1.883e+02 2.136e+02 2.437e+02 4.219e+02, threshold=4.271e+02, percent-clipped=0.0 2023-10-02 08:35:18,665 WARNING [train.py:1204] (2/4) Exclude cut with ID _21783_210_11357_1_1531742458730_3762679_150-62920-0_sp1.1 from training. Duration: 0.963625 2023-10-02 08:35:18,666 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_5522_210_3273_1_1527249606_3909096_366-172508-0 from training. Duration: 0.66 2023-10-02 08:35:18,668 WARNING [train.py:1204] (2/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_303-340446-0_sp1.1 from training. Duration: 0.8181875 2023-10-02 08:35:20,040 WARNING [train.py:1204] (2/4) Exclude cut with ID _8699_210_2069_1_1528630346831_3960409_160-24408-0_sp1.1 from training. Duration: 0.82725 2023-10-02 08:35:20,304 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.balancer1.prob, batch_count=816653.3333333334, ans=0.125 2023-10-02 08:35:22,728 WARNING [train.py:1204] (2/4) Exclude cut with ID _41314_210_12651_1_1533459476543_3766460_299-187801-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 08:35:22,767 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_34886_210_10633_1_1533203882_1755661_92-345288-0_sp1.1 from training. Duration: 0.97275 2023-10-02 08:35:25,662 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_5438_210_1794_1_1527336687_2600009_104-288274-0_sp1.1 from training. Duration: 0.52725 2023-10-02 08:35:25,665 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_154-316450-0 from training. Duration: 0.76 2023-10-02 08:35:25,843 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.ff3_skip_rate, batch_count=816653.3333333334, ans=0.0 2023-10-02 08:35:27,004 WARNING [train.py:1204] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_23-189234-0_sp0.9 from training. Duration: 0.92225 2023-10-02 08:35:30,288 WARNING [train.py:1204] (2/4) Exclude cut with ID _4779_210_2084_1_1527318022944_3914893_346-168820-0_sp1.1 from training. Duration: 0.963625 2023-10-02 08:35:31,662 WARNING [train.py:1204] (2/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_5-55893-0 from training. Duration: 0.55 2023-10-02 08:35:32,937 WARNING [train.py:1204] (2/4) Exclude cut with ID _20455_210_5219_1_1531565996775_8098100_180-24517-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 08:35:36,971 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_3133_210_2087_1_1526106446_2324690_243-143119-0_sp0.9 from training. Duration: 0.8555625 2023-10-02 08:35:37,794 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.2.encoder.layers.0.self_attn1.whiten, num_groups=1, num_channels=384, metric=20.00 vs. limit=22.5 2023-10-02 08:35:41,156 WARNING [train.py:1204] (2/4) Exclude cut with ID _65585_210_12210_1_1535450451584_3764098_293-212045-0_sp1.1 from training. Duration: 0.92725 2023-10-02 08:35:41,160 WARNING [train.py:1204] (2/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_577-332600-0 from training. Duration: 0.57 2023-10-02 08:35:41,483 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.feed_forward1.hidden_balancer.prob, batch_count=816720.0, ans=0.125 2023-10-02 08:35:43,137 WARNING [train.py:1204] (2/4) Exclude cut with ID _30039_210_13270_1_1532484612900_2725150_104-289874-0_sp1.1 from training. Duration: 0.963625 2023-10-02 08:35:43,150 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_3172_210_2087_1_1526292929_3987865_197-97762-0_sp1.1 from training. Duration: 0.6818125 2023-10-02 08:35:46,327 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_256-150908-0_sp1.1 from training. Duration: 0.963625 2023-10-02 08:35:48,172 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_296-287678-0_sp1.1 from training. Duration: 0.863625 2023-10-02 08:35:48,198 WARNING [train.py:1204] (2/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_443-30034-0 from training. Duration: 0.64 2023-10-02 08:35:48,242 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder_embed.dropout.p, batch_count=816786.6666666666, ans=0.1 2023-10-02 08:35:48,967 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.0.layers.1.conv_module2.whiten, num_groups=1, num_channels=192, metric=12.89 vs. limit=15.0 2023-10-02 08:35:49,356 WARNING [train.py:1204] (2/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_494-199538-0_sp0.9 from training. Duration: 0.8333125 2023-10-02 08:35:49,412 WARNING [train.py:1204] (2/4) Exclude cut with ID _19708_210_9206_1_1532136650733_4238364_61-162325-0_sp1.1 from training. Duration: 0.97275 2023-10-02 08:35:50,775 WARNING [train.py:1204] (2/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_475-127455-0 from training. Duration: 0.7 2023-10-02 08:35:50,885 WARNING [train.py:1204] (2/4) Exclude cut with ID _48677_210_5663_1_1533902405919_3892989_188-11581-0_sp1.1 from training. Duration: 0.963625 2023-10-02 08:35:52,153 WARNING [train.py:1204] (2/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_136-8381-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 08:35:53,545 WARNING [train.py:1204] (2/4) Exclude cut with ID _55328_210_27767_1_1534580973260_4088170_270-229555-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 08:35:53,579 WARNING [train.py:1204] (2/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_372-266951-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 08:35:55,027 WARNING [train.py:1204] (2/4) Exclude cut with ID _47790_210_16187_1_1533862814066_4207519_14-242149-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 08:35:55,883 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.2.encoder.layers.1.self_attn1.whiten, num_groups=1, num_channels=384, metric=18.15 vs. limit=22.5 2023-10-02 08:35:58,373 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.bypass.skip_rate, batch_count=816853.3333333334, ans=0.09899494936611666 2023-10-02 08:35:59,414 INFO [train.py:1046] (2/4) Epoch 24, batch 350, loss[loss=0.1543, simple_loss=0.2317, pruned_loss=0.03842, over 24304.00 frames. ], tot_loss[loss=0.1711, simple_loss=0.2466, pruned_loss=0.04776, over 3898320.24 frames. ], batch size: 56, lr: 4.31e-03, grad_scale: 16.0 2023-10-02 08:35:59,532 WARNING [train.py:1204] (2/4) Exclude cut with ID _7877_210_6014_1_1528286358099_3750136_159-76512-0_sp1.1 from training. Duration: 0.936375 2023-10-02 08:35:59,534 WARNING [train.py:1204] (2/4) Exclude cut with ID _8263_210_1664_1_1528438953494_294100_40-204731-0_sp1.1 from training. Duration: 0.4545625 2023-10-02 08:36:02,412 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8278_210_2550_1_1528637099_4220413_694-238413-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 08:36:04,372 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.4.encoder.layers.2.nonlin_attention.whiten2, num_groups=1, num_channels=384, metric=3.93 vs. limit=15.0 2023-10-02 08:36:07,766 WARNING [train.py:1204] (2/4) Exclude cut with ID _47278_210_12041_1_1533902418349_3576110_421-172710-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 08:36:10,449 WARNING [train.py:1204] (2/4) Exclude cut with ID _14970_210_15709_1_1530773543493_1715730_215-282680-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 08:36:10,477 WARNING [train.py:1204] (2/4) Exclude cut with ID _64639_210_5816_1_1535367619437_3528000_306-149449-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 08:36:13,753 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_28-291982-0 from training. Duration: 0.97 2023-10-02 08:36:15,702 WARNING [train.py:1204] (2/4) Exclude cut with ID _55134_210_21982_1_1534590028377_3882170_412-15870-0_sp1.1 from training. Duration: 0.936375 2023-10-02 08:36:15,738 WARNING [train.py:1204] (2/4) Exclude cut with ID _13684_210_7497_1_1530619354863_3598359_312-235606-0 from training. Duration: 0.6 2023-10-02 08:36:18,554 WARNING [train.py:1204] (2/4) Exclude cut with ID _37112_210_2575_1_1532997007673_7268610_347-238993-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 08:36:18,593 WARNING [train.py:1204] (2/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_121-284152-0 from training. Duration: 0.59 2023-10-02 08:36:19,832 WARNING [train.py:1204] (2/4) Exclude cut with ID _2941_210_2577_1_1526199673150_2489759_228-2961-0_sp1.1 from training. Duration: 0.97275 2023-10-02 08:36:21,278 WARNING [train.py:1204] (2/4) Exclude cut with ID _28323_210_16187_1_1532480395667_4100300_531-24157-0 from training. Duration: 0.98 2023-10-02 08:36:22,758 WARNING [train.py:1204] (2/4) Exclude cut with ID _32704_210_8954_1_1533017488840_6747370_266-329755-0_sp0.9 from training. Duration: 0.988875 2023-10-02 08:36:24,085 WARNING [train.py:1204] (2/4) Exclude cut with ID _6653_210_2629_1_1527919172441_6732679_1038-146421-0_sp1.1 from training. Duration: 0.97275 2023-10-02 08:36:25,445 WARNING [train.py:1204] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_391-22148-0_sp0.9 from training. Duration: 0.8555625 2023-10-02 08:36:26,814 WARNING [train.py:1204] (2/4) Exclude cut with ID _64521_210_14699_1_1535243427259_3815328_543-341117-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 08:36:26,825 WARNING [train.py:1204] (2/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_52-168712-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 08:36:28,705 WARNING [train.py:1204] (2/4) Exclude cut with ID _33920_210_4220_1_1533257851500_3084399_198-334063-0_sp1.1 from training. Duration: 0.8545625 2023-10-02 08:36:28,724 WARNING [train.py:1204] (2/4) Exclude cut with ID _50093_210_4220_1_1534417141207_3432299_69-111987-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 08:36:28,769 WARNING [train.py:1204] (2/4) Exclude cut with ID _14821_210_8750_1_1530680503991_6829359_189-297451-0_sp0.9 from training. Duration: 0.611125 2023-10-02 08:36:31,401 WARNING [train.py:1204] (2/4) Exclude cut with ID _8030_210_7497_1_1528444886781_3531089_262-86750-0_sp0.9 from training. Duration: 0.9 2023-10-02 08:36:31,407 WARNING [train.py:1204] (2/4) Exclude cut with ID _24612_210_8913_1_1531998054193_3806359_294-191196-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 08:36:38,177 WARNING [train.py:1204] (2/4) Exclude cut with ID _16909_210_5724_1_1531292079912_4118919_707-342896-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 08:36:38,191 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_9460_210_4211_1_1528954587_10049667_572-37035-0_sp0.9 from training. Duration: 0.888875 2023-10-02 08:36:38,233 WARNING [train.py:1204] (2/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_762-109886-0_sp1.1 from training. Duration: 0.82725 2023-10-02 08:36:38,255 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_14953_210_2151_1_1530701710_5616165_519-326393-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 08:36:41,191 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=817053.3333333334, ans=0.1 2023-10-02 08:36:44,306 WARNING [train.py:1204] (2/4) Exclude cut with ID _25914_210_6400_1_1532141317058_644740_52-96808-0 from training. Duration: 0.91 2023-10-02 08:36:44,309 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_68095_210_20185_1_1535718226_8888855_1079-94525-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 08:36:47,779 WARNING [train.py:1204] (2/4) Exclude cut with ID _14860_210_4915_1_1530702020340_7343240_746-3647-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 08:36:47,780 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_70379_210_7925_1_1535941455_557455_49-101720-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 08:36:49,134 WARNING [train.py:1204] (2/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_8-307291-0_sp1.1 from training. Duration: 0.936375 2023-10-02 08:36:49,231 WARNING [train.py:1204] (2/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_117-290916-0 from training. Duration: 0.51 2023-10-02 08:36:49,523 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.feed_forward1.out_proj.dropout_p, batch_count=817053.3333333334, ans=0.1 2023-10-02 08:36:50,746 WARNING [train.py:1204] (2/4) Exclude cut with ID _22020_210_2461_1_1531724164513_6326900_944-339840-0_sp1.1 from training. Duration: 0.963625 2023-10-02 08:36:52,087 WARNING [train.py:1204] (2/4) Exclude cut with ID IC0515W0043-254911 from training. Duration: 0.976 2023-10-02 08:36:53,331 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_3910_210_1794_1_1526742306_334530_28-238909-0 from training. Duration: 0.81 2023-10-02 08:36:54,661 WARNING [train.py:1204] (2/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_269-267275-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 08:36:57,812 WARNING [train.py:1204] (2/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_72-251255-0_sp1.1 from training. Duration: 0.8545625 2023-10-02 08:36:57,823 WARNING [train.py:1204] (2/4) Exclude cut with ID _14886_210_4220_1_1530702020355_3516190_175-229073-0 from training. Duration: 0.45 2023-10-02 08:37:00,440 WARNING [train.py:1204] (2/4) Exclude cut with ID _40519_210_13794_1_1533715228655_7337390_921-209452-0_sp1.1 from training. Duration: 0.963625 2023-10-02 08:37:00,621 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.1.conv_module1.balancer2.prob, batch_count=817120.0, ans=0.125 2023-10-02 08:37:03,137 WARNING [train.py:1204] (2/4) Exclude cut with ID _29857_210_20034_1_1532395886753_3798160_335-163562-0_sp0.9 from training. Duration: 0.9444375 2023-10-02 08:37:03,228 WARNING [train.py:1204] (2/4) Exclude cut with ID _58583_210_5236_1_1534726835266_3574249_384-58445-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 08:37:05,927 WARNING [train.py:1204] (2/4) Exclude cut with ID _44011_210_2046_1_1533722401691_3631310_96-315656-0_sp1.1 from training. Duration: 0.963625 2023-10-02 08:37:05,931 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_68484_210_1794_1_1535849804_7678097_549-170613-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 08:37:07,376 WARNING [train.py:1204] (2/4) Exclude cut with ID _37892_210_19596_1_1533198677897_3964058_214-148058-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 08:37:08,859 WARNING [train.py:1204] (2/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_415-78792-0_sp0.9 from training. Duration: 0.9 2023-10-02 08:37:10,243 WARNING [train.py:1204] (2/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_383-205833-0_sp1.1 from training. Duration: 0.863625 2023-10-02 08:37:11,549 INFO [train.py:1046] (2/4) Epoch 24, batch 400, loss[loss=0.1784, simple_loss=0.2624, pruned_loss=0.04724, over 24565.00 frames. ], tot_loss[loss=0.1705, simple_loss=0.246, pruned_loss=0.04751, over 4069215.62 frames. ], batch size: 71, lr: 4.31e-03, grad_scale: 32.0 2023-10-02 08:37:12,220 WARNING [train.py:1204] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_167-137255-0 from training. Duration: 0.64 2023-10-02 08:37:12,228 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8275_210_4198_1_1528455320_3892571_484-154786-0_sp1.1 from training. Duration: 0.963625 2023-10-02 08:37:12,283 WARNING [train.py:1204] (2/4) Exclude cut with ID _40284_210_2084_1_1533513679814_3704279_396-319412-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 08:37:15,552 WARNING [train.py:1204] (2/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_280-120942-0_sp0.9 from training. Duration: 0.8555625 2023-10-02 08:37:15,591 WARNING [train.py:1204] (2/4) Exclude cut with ID _45630_210_10556_1_1533949174498_3902049_558-240889-0_sp1.1 from training. Duration: 0.92725 2023-10-02 08:37:16,998 WARNING [train.py:1204] (2/4) Exclude cut with ID _56235_210_10640_1_1534557335235_4161099_197-307458-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 08:37:18,365 WARNING [train.py:1204] (2/4) Exclude cut with ID _16909_210_5724_1_1531292079912_4118919_163-254843-0_sp1.1 from training. Duration: 0.92725 2023-10-02 08:37:18,478 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_103-84010-0 from training. Duration: 0.69 2023-10-02 08:37:19,886 WARNING [train.py:1204] (2/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_401-158239-0 from training. Duration: 0.71 2023-10-02 08:37:19,887 WARNING [train.py:1204] (2/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_899-233658-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 08:37:21,317 WARNING [train.py:1204] (2/4) Exclude cut with ID _23825_210_13449_1_1531915313526_3497150_240-271367-0 from training. Duration: 0.97 2023-10-02 08:37:21,353 WARNING [train.py:1204] (2/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_424-134619-0_sp1.1 from training. Duration: 0.92725 2023-10-02 08:37:21,533 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.nonlin_attention.balancer.prob, batch_count=817186.6666666666, ans=0.125 2023-10-02 08:37:25,739 WARNING [train.py:1204] (2/4) Exclude cut with ID _44831_210_13427_1_1533816011549_3689035_32-188147-0_sp1.1 from training. Duration: 0.87275 2023-10-02 08:37:25,750 WARNING [train.py:1204] (2/4) Exclude cut with ID _59170_210_6721_1_1534983633960_4203335_314-63879-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 08:37:25,764 WARNING [train.py:1204] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_602-275260-0 from training. Duration: 0.82 2023-10-02 08:37:25,790 WARNING [train.py:1204] (2/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_511-306767-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 08:37:27,064 WARNING [train.py:1204] (2/4) Exclude cut with ID _41817_210_18835_1_1533434477160_7423049_565-201775-0_sp1.1 from training. Duration: 0.92725 2023-10-02 08:37:27,075 WARNING [train.py:1204] (2/4) Exclude cut with ID _28317_210_12742_1_1532480302381_8510325_778-173151-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 08:37:27,111 WARNING [train.py:1204] (2/4) Exclude cut with ID _14998_210_5219_1_1530705582080_7474970_680-51001-0_sp1.1 from training. Duration: 0.963625 2023-10-02 08:37:28,548 WARNING [train.py:1204] (2/4) Exclude cut with ID IC0677W0059-334891 from training. Duration: 0.896 2023-10-02 08:37:29,850 WARNING [train.py:1204] (2/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_370-331600-0 from training. Duration: 0.94 2023-10-02 08:37:35,129 WARNING [train.py:1204] (2/4) Exclude cut with ID _66840_210_5219_1_1535452168426_4196349_446-240192-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 08:37:36,649 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.attention_skip_rate, batch_count=817253.3333333334, ans=0.0 2023-10-02 08:37:38,377 WARNING [train.py:1204] (2/4) Exclude cut with ID _65242_210_18628_1_1535369393717_3823149_38-6732-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 08:37:38,431 WARNING [train.py:1204] (2/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_302-2550-0 from training. Duration: 0.81 2023-10-02 08:37:39,796 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_43802_210_2290_1_1533516203_8829504_100-348441-0 from training. Duration: 0.91 2023-10-02 08:37:42,921 WARNING [train.py:1204] (2/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_329-285801-0_sp1.1 from training. Duration: 0.82725 2023-10-02 08:37:44,456 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.427e+02 1.820e+02 2.038e+02 2.400e+02 4.228e+02, threshold=4.076e+02, percent-clipped=0.0 2023-10-02 08:37:47,633 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_30167_210_3528_1_1532428012_4772375_343-334712-0_sp1.1 from training. Duration: 0.936375 2023-10-02 08:37:53,209 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7543_210_6753_1_1528543318_4552_1-85072-0 from training. Duration: 0.83 2023-10-02 08:37:57,599 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7002_210_1794_1_1527935634_4034444_487-236501-0_sp1.1 from training. Duration: 0.6 2023-10-02 08:38:00,267 WARNING [train.py:1204] (2/4) Exclude cut with ID _7160_210_1876_1_1527940923231_3703799_282-322611-0 from training. Duration: 0.87 2023-10-02 08:38:01,675 WARNING [train.py:1204] (2/4) Exclude cut with ID _67335_210_5219_1_1535525975652_4275229_570-262983-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 08:38:01,815 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.conv_module2.balancer2.prob, batch_count=817386.6666666666, ans=0.125 2023-10-02 08:38:03,102 WARNING [train.py:1204] (2/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_625-338546-0_sp1.1 from training. Duration: 0.8 2023-10-02 08:38:04,404 WARNING [train.py:1204] (2/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_176-56120-0 from training. Duration: 0.83 2023-10-02 08:38:05,930 WARNING [train.py:1204] (2/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_963-187109-0_sp1.1 from training. Duration: 0.9 2023-10-02 08:38:06,135 INFO [scaling.py:1118] (2/4) WithLoss: name=encoder.encoders.2.encoder.layers.2.self_attn_weights, loss-sum=0.000e+00 2023-10-02 08:38:08,615 WARNING [train.py:1204] (2/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_121-284152-0_sp0.9 from training. Duration: 0.6555625 2023-10-02 08:38:09,960 WARNING [train.py:1204] (2/4) Exclude cut with ID _45021_210_9994_1_1533619751094_3330288_104-81711-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 08:38:13,068 WARNING [train.py:1204] (2/4) Exclude cut with ID _51382_210_23862_1_1534492801838_3650104_129-343914-0_sp1.1 from training. Duration: 0.936375 2023-10-02 08:38:13,100 WARNING [train.py:1204] (2/4) Exclude cut with ID _14970_210_15709_1_1530775547361_1966965_8-169924-0 from training. Duration: 0.92 2023-10-02 08:38:14,632 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_2295_210_2087_1_1525500993_2407863_234-83104-0_sp1.1 from training. Duration: 0.563625 2023-10-02 08:38:14,689 WARNING [train.py:1204] (2/4) Exclude cut with ID _14687_210_5854_1_1530703838259_3664889_115-239046-0 from training. Duration: 0.67 2023-10-02 08:38:17,971 WARNING [train.py:1204] (2/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_690-126306-0_sp1.1 from training. Duration: 0.7090625 2023-10-02 08:38:17,979 WARNING [train.py:1204] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_625-102955-0_sp0.9 from training. Duration: 0.8555625 2023-10-02 08:38:19,870 WARNING [train.py:1204] (2/4) Exclude cut with ID _9389_210_1664_1_1529217337301_5289269_289-213417-0 from training. Duration: 0.89 2023-10-02 08:38:22,536 WARNING [train.py:1204] (2/4) Exclude cut with ID _13253_210_1925_1_1530757775819_3577873_191-144977-0_sp0.9 from training. Duration: 0.8666875 2023-10-02 08:38:22,590 WARNING [train.py:1204] (2/4) Exclude cut with ID _21783_210_11357_1_1531742458730_3762679_325-155703-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 08:38:22,617 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_2295_210_2087_1_1525500993_2407863_116-238894-0_sp0.9 from training. Duration: 0.77775 2023-10-02 08:38:23,987 WARNING [train.py:1204] (2/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_94-126128-0 from training. Duration: 0.86 2023-10-02 08:38:24,010 WARNING [train.py:1204] (2/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_395-310220-0_sp0.9 from training. Duration: 0.988875 2023-10-02 08:38:24,078 WARNING [train.py:1204] (2/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_821-110852-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 08:38:25,676 INFO [train.py:1046] (2/4) Epoch 24, batch 450, loss[loss=0.179, simple_loss=0.2594, pruned_loss=0.04929, over 24120.00 frames. ], tot_loss[loss=0.1707, simple_loss=0.2471, pruned_loss=0.04719, over 4230358.31 frames. ], batch size: 80, lr: 4.31e-03, grad_scale: 16.0 2023-10-02 08:38:25,754 WARNING [train.py:1204] (2/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_64-248359-0_sp1.1 from training. Duration: 0.62725 2023-10-02 08:38:25,772 WARNING [train.py:1204] (2/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_138-187031-0 from training. Duration: 0.99 2023-10-02 08:38:25,799 WARNING [train.py:1204] (2/4) Exclude cut with ID _4122_210_2084_1_1526713164417_4353280_528-206771-0_sp0.9 from training. Duration: 0.97775 2023-10-02 08:38:27,193 WARNING [train.py:1204] (2/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_844-96989-0_sp1.1 from training. Duration: 0.6454375 2023-10-02 08:38:29,928 WARNING [train.py:1204] (2/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_106-30782-0_sp1.1 from training. Duration: 0.72725 2023-10-02 08:38:38,211 WARNING [train.py:1204] (2/4) Exclude cut with ID _14683_210_5219_1_1530698352608_3974859_281-250040-0_sp1.1 from training. Duration: 0.936375 2023-10-02 08:38:38,246 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_46013_210_7234_1_1533776362_3744032_463-52378-0_sp0.9 from training. Duration: 0.9555625 2023-10-02 08:38:38,367 WARNING [train.py:1204] (2/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_299-34413-0 from training. Duration: 0.65 2023-10-02 08:38:39,667 WARNING [train.py:1204] (2/4) Exclude cut with ID _35799_210_5854_1_1533263487887_3529071_490-326847-0 from training. Duration: 0.99 2023-10-02 08:38:39,819 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.1.feed_forward1.out_proj.dropout_p, batch_count=817586.6666666666, ans=0.1 2023-10-02 08:38:42,773 WARNING [train.py:1204] (2/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_589-9070-0_sp0.9 from training. Duration: 0.788875 2023-10-02 08:38:43,542 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.bypass_mid.scale_min, batch_count=817586.6666666666, ans=0.2 2023-10-02 08:38:45,950 WARNING [train.py:1204] (2/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_884-29515-0_sp1.1 from training. Duration: 0.936375 2023-10-02 08:38:47,343 WARNING [train.py:1204] (2/4) Exclude cut with ID _15012_210_1968_1_1530856820429_5095350_474-119995-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 08:38:47,945 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.2.encoder.layers.2.conv_module2.whiten, num_groups=1, num_channels=384, metric=6.42 vs. limit=15.0 2023-10-02 08:38:50,678 WARNING [train.py:1204] (2/4) Exclude cut with ID _65585_210_12210_1_1535450451584_3764098_211-97124-0_sp1.1 from training. Duration: 0.963625 2023-10-02 08:38:52,015 WARNING [train.py:1204] (2/4) Exclude cut with ID _33640_210_2655_1_1533117605479_4855469_666-224463-0_sp1.1 from training. Duration: 0.963625 2023-10-02 08:38:55,214 WARNING [train.py:1204] (2/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_35-153886-0 from training. Duration: 0.84 2023-10-02 08:38:55,248 WARNING [train.py:1204] (2/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_210-106047-0 from training. Duration: 0.97 2023-10-02 08:38:57,962 WARNING [train.py:1204] (2/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_479-125611-0 from training. Duration: 0.96 2023-10-02 08:38:57,992 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_9417_210_1794_1_1529235261_4614131_427-153963-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 08:38:59,350 WARNING [train.py:1204] (2/4) Exclude cut with ID _23818_210_6228_1_1531917063884_3676920_133-170571-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 08:38:59,413 WARNING [train.py:1204] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_602-275260-0_sp1.1 from training. Duration: 0.7454375 2023-10-02 08:39:02,035 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9029W0121-509521 from training. Duration: 0.896 2023-10-02 08:39:02,043 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9029W0434-509821 from training. Duration: 0.64 2023-10-02 08:39:02,067 WARNING [train.py:1204] (2/4) Exclude cut with ID _23038_210_9570_1_1532001584158_3652419_289-307034-0_sp1.1 from training. Duration: 0.936375 2023-10-02 08:39:03,480 WARNING [train.py:1204] (2/4) Exclude cut with ID _29345_210_4381_1_1532689168263_3860169_149-257268-0_sp0.9 from training. Duration: 0.9 2023-10-02 08:39:04,757 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_405-339986-0_sp1.1 from training. Duration: 0.463625 2023-10-02 08:39:07,439 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_2655_210_2087_1_1525688959_3897400_437-337690-0_sp1.1 from training. Duration: 0.57275 2023-10-02 08:39:07,463 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7543_210_6753_1_1528541907_1399322_165-323619-0_sp1.1 from training. Duration: 0.62725 2023-10-02 08:39:08,787 WARNING [train.py:1204] (2/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_508-335369-0_sp1.1 from training. Duration: 0.436375 2023-10-02 08:39:09,581 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.2.encoder.layers.0.self_attn1.whiten, num_groups=1, num_channels=384, metric=19.24 vs. limit=22.5 2023-10-02 08:39:10,103 WARNING [train.py:1204] (2/4) Exclude cut with ID _7852_210_5219_1_1528286471316_8139400_358-287492-0 from training. Duration: 0.96 2023-10-02 08:39:11,687 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.feed_forward2.hidden_balancer.prob, batch_count=817720.0, ans=0.125 2023-10-02 08:39:12,793 WARNING [train.py:1204] (2/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_322-217220-0_sp0.9 from training. Duration: 0.9555625 2023-10-02 08:39:15,378 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_3171_210_2087_1_1526109084_2330007_66-260583-0_sp0.9 from training. Duration: 0.8 2023-10-02 08:39:15,411 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8558_210_1410_1_1528505225_7613406_131-329524-0_sp1.1 from training. Duration: 0.7181875 2023-10-02 08:39:16,642 WARNING [train.py:1204] (2/4) Exclude cut with ID _9235_210_5219_1_1528891154253_8046120_30-326681-0 from training. Duration: 0.83 2023-10-02 08:39:19,997 WARNING [train.py:1204] (2/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_58-68956-0_sp0.9 from training. Duration: 0.92225 2023-10-02 08:39:20,066 WARNING [train.py:1204] (2/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_255-312654-0 from training. Duration: 0.91 2023-10-02 08:39:21,327 WARNING [train.py:1204] (2/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_99-167392-0 from training. Duration: 0.61 2023-10-02 08:39:23,123 WARNING [train.py:1204] (2/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_94-126128-0_sp0.9 from training. Duration: 0.9555625 2023-10-02 08:39:28,538 WARNING [train.py:1204] (2/4) Exclude cut with ID _68635_210_9317_1_1535770813410_4315870_187-192817-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 08:39:29,938 WARNING [train.py:1204] (2/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_277-204970-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 08:39:31,322 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6163_210_6400_1_1527506746_4263068_283-137811-0_sp0.9 from training. Duration: 0.8555625 2023-10-02 08:39:31,347 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9144W0121-566446 from training. Duration: 0.896 2023-10-02 08:39:31,539 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.conv_module1.balancer1.prob, batch_count=817786.6666666666, ans=0.125 2023-10-02 08:39:36,855 WARNING [train.py:1204] (2/4) Exclude cut with ID _49371_210_12071_1_1533952761021_3812190_351-315519-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 08:39:36,929 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder_embed.convnext.out_balancer.prob, batch_count=817853.3333333334, ans=0.125 2023-10-02 08:39:38,131 INFO [train.py:1046] (2/4) Epoch 24, batch 500, loss[loss=0.1666, simple_loss=0.2491, pruned_loss=0.04202, over 24114.00 frames. ], tot_loss[loss=0.1718, simple_loss=0.2482, pruned_loss=0.04774, over 4337351.16 frames. ], batch size: 80, lr: 4.31e-03, grad_scale: 16.0 2023-10-02 08:39:38,201 WARNING [train.py:1204] (2/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_115-326778-0_sp1.1 from training. Duration: 0.8454375 2023-10-02 08:39:39,537 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_17292_210_6753_1_1531125388_2943734_436-198965-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 08:39:39,553 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9165W0117-576751 from training. Duration: 0.896 2023-10-02 08:39:40,974 WARNING [train.py:1204] (2/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_212-149947-0 from training. Duration: 0.73 2023-10-02 08:39:40,981 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_13757_210_3528_1_1530440682_4107239_65-167544-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 08:39:43,873 WARNING [train.py:1204] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_700-183549-0_sp0.9 from training. Duration: 0.7555625 2023-10-02 08:39:48,763 WARNING [train.py:1204] (2/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_216-343007-0_sp0.9 from training. Duration: 0.7333125 2023-10-02 08:39:50,125 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7544_210_6753_1_1528587722_3576686_295-258479-0_sp1.1 from training. Duration: 0.763625 2023-10-02 08:39:51,616 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_365-287106-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 08:39:51,629 WARNING [train.py:1204] (2/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_300-3747-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 08:39:51,859 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.conv_module1.balancer1.prob, batch_count=817920.0, ans=0.125 2023-10-02 08:39:53,355 WARNING [train.py:1204] (2/4) Exclude cut with ID _14069_210_5632_1_1530512692033_4803390_487-303201-0_sp1.1 from training. Duration: 0.92725 2023-10-02 08:40:02,131 WARNING [train.py:1204] (2/4) Exclude cut with ID _14459_210_2228_1_1531443641946_7516700_668-123026-0_sp1.1 from training. Duration: 0.936375 2023-10-02 08:40:02,151 WARNING [train.py:1204] (2/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_108-53929-0_sp1.1 from training. Duration: 0.536375 2023-10-02 08:40:02,186 WARNING [train.py:1204] (2/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_266-137104-0_sp0.9 from training. Duration: 0.72225 2023-10-02 08:40:02,218 WARNING [train.py:1204] (2/4) Exclude cut with ID _12302_210_5219_1_1529924446466_8107280_902-168619-0_sp1.1 from training. Duration: 0.936375 2023-10-02 08:40:02,243 WARNING [train.py:1204] (2/4) Exclude cut with ID _59502_210_12210_1_1535107024698_1835139_146-162495-0 from training. Duration: 0.92 2023-10-02 08:40:03,520 WARNING [train.py:1204] (2/4) Exclude cut with ID _3465_210_3525_1_1526175667060_3574049_412-287526-0_sp1.1 from training. Duration: 0.6545625 2023-10-02 08:40:06,268 WARNING [train.py:1204] (2/4) Exclude cut with ID _7160_210_1876_1_1527940923231_3703799_120-150943-0_sp0.9 from training. Duration: 0.9 2023-10-02 08:40:07,559 WARNING [train.py:1204] (2/4) Exclude cut with ID _15037_210_2253_1_1530752294104_3267650_296-192597-0_sp1.1 from training. Duration: 0.736375 2023-10-02 08:40:07,584 WARNING [train.py:1204] (2/4) Exclude cut with ID _13252_210_1925_1_1530671372047_3336909_397-216497-0_sp1.1 from training. Duration: 0.87275 2023-10-02 08:40:07,603 WARNING [train.py:1204] (2/4) Exclude cut with ID _40094_210_16380_1_1533294005773_3641159_76-269320-0_sp1.1 from training. Duration: 0.936375 2023-10-02 08:40:08,904 WARNING [train.py:1204] (2/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_449-15508-0 from training. Duration: 0.82 2023-10-02 08:40:09,119 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.conv_skip_rate, batch_count=817986.6666666666, ans=0.0 2023-10-02 08:40:10,158 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.571e+02 1.880e+02 2.084e+02 2.383e+02 5.306e+02, threshold=4.168e+02, percent-clipped=1.0 2023-10-02 08:40:11,656 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9309W0194-640321 from training. Duration: 0.64 2023-10-02 08:40:14,360 WARNING [train.py:1204] (2/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_284-312178-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 08:40:15,671 WARNING [train.py:1204] (2/4) Exclude cut with ID _29853_210_7497_1_1532505600851_6642110_401-55937-0_sp1.1 from training. Duration: 0.92725 2023-10-02 08:40:16,976 WARNING [train.py:1204] (2/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_716-24918-0_sp1.1 from training. Duration: 0.92725 2023-10-02 08:40:18,775 WARNING [train.py:1204] (2/4) Exclude cut with ID _19637_210_4220_1_1531537167676_3205060_314-305638-0_sp1.1 from training. Duration: 0.92725 2023-10-02 08:40:18,814 WARNING [train.py:1204] (2/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_475-127455-0_sp1.1 from training. Duration: 0.636375 2023-10-02 08:40:22,112 WARNING [train.py:1204] (2/4) Exclude cut with ID _5444_210_2151_1_1527418808464_5312210_581-315411-0 from training. Duration: 0.62 2023-10-02 08:40:25,500 WARNING [train.py:1204] (2/4) Exclude cut with ID _44902_210_16187_1_1533888006062_3792078_149-67212-0_sp0.9 from training. Duration: 0.9666875 2023-10-02 08:40:26,879 WARNING [train.py:1204] (2/4) Exclude cut with ID _31602_210_5854_1_1532677021456_4458250_63-63253-0_sp1.1 from training. Duration: 0.963625 2023-10-02 08:40:31,499 WARNING [train.py:1204] (2/4) Exclude cut with ID _55209_210_1531_1_1534675828996_4597160_660-203992-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 08:40:32,951 WARNING [train.py:1204] (2/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_83-190624-0_sp1.1 from training. Duration: 0.92725 2023-10-02 08:40:35,122 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.2.encoder.layers.1.whiten, num_groups=1, num_channels=384, metric=4.69 vs. limit=12.0 2023-10-02 08:40:36,052 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.conv_skip_rate, batch_count=818120.0, ans=0.0 2023-10-02 08:40:38,563 WARNING [train.py:1204] (2/4) Exclude cut with ID _58605_210_12210_1_1534723195939_3904259_154-139635-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 08:40:41,391 WARNING [train.py:1204] (2/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_164-224052-0 from training. Duration: 0.7 2023-10-02 08:40:41,399 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_1126-189071-0_sp1.1 from training. Duration: 0.963625 2023-10-02 08:40:41,408 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_15396_210_6890_1_1531308473_1938936_310-214789-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 08:40:41,940 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.5.encoder.layers.0.feed_forward3.out_whiten, num_groups=1, num_channels=256, metric=9.04 vs. limit=15.0 2023-10-02 08:40:45,500 WARNING [train.py:1204] (2/4) Exclude cut with ID _8395_210_1531_1_1528597871523_4411579_431-132470-0 from training. Duration: 0.74 2023-10-02 08:40:45,559 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_3780_210_2087_1_1526467760_2130483_170-259044-0_sp0.9 from training. Duration: 0.77775 2023-10-02 08:40:45,645 WARNING [train.py:1204] (2/4) Exclude cut with ID _12733_210_8341_1_1530167424556_3646380_537-230640-0_sp1.1 from training. Duration: 0.963625 2023-10-02 08:40:47,570 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.4.encoder.layers.0.feed_forward3.out_whiten, num_groups=1, num_channels=384, metric=11.19 vs. limit=15.0 2023-10-02 08:40:50,728 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.conv_module1.whiten.whitening_limit, batch_count=818186.6666666666, ans=15.0 2023-10-02 08:40:51,338 INFO [train.py:1046] (2/4) Epoch 24, batch 550, loss[loss=0.1587, simple_loss=0.2345, pruned_loss=0.04148, over 24355.00 frames. ], tot_loss[loss=0.173, simple_loss=0.2496, pruned_loss=0.04821, over 4416554.79 frames. ], batch size: 61, lr: 4.31e-03, grad_scale: 16.0 2023-10-02 08:40:51,412 WARNING [train.py:1204] (2/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_159-281310-0 from training. Duration: 0.95 2023-10-02 08:40:52,786 WARNING [train.py:1204] (2/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_434-83000-0 from training. Duration: 0.98 2023-10-02 08:40:54,595 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_4405_210_1849_1_1526732205_3593939_493-32464-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 08:40:54,604 WARNING [train.py:1204] (2/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_342-100157-0 from training. Duration: 0.54 2023-10-02 08:40:56,255 WARNING [train.py:1204] (2/4) Exclude cut with ID _44778_210_2054_1_1533798127203_7443090_765-250133-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 08:40:56,259 WARNING [train.py:1204] (2/4) Exclude cut with ID _38670_210_15379_1_1533448810659_4391059_81-312245-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 08:40:56,337 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_50050_210_16223_1_1535435796_7290138_1273-247611-0_sp1.1 from training. Duration: 0.936375 2023-10-02 08:40:56,380 WARNING [train.py:1204] (2/4) Exclude cut with ID _44831_210_13427_1_1533816011549_3689035_399-18591-0_sp1.1 from training. Duration: 0.936375 2023-10-02 08:40:56,404 WARNING [train.py:1204] (2/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_557-324258-0_sp0.9 from training. Duration: 0.9 2023-10-02 08:40:58,322 WARNING [train.py:1204] (2/4) Exclude cut with ID _3465_210_3525_1_1526175667060_3574049_62-335552-0_sp1.1 from training. Duration: 0.7909375 2023-10-02 08:40:59,668 WARNING [train.py:1204] (2/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_673-264125-0_sp1.1 from training. Duration: 0.963625 2023-10-02 08:41:00,977 WARNING [train.py:1204] (2/4) Exclude cut with ID _8030_210_7497_1_1528444886781_3531089_262-86750-0 from training. Duration: 0.81 2023-10-02 08:41:00,997 WARNING [train.py:1204] (2/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_444-28041-0_sp1.1 from training. Duration: 0.77275 2023-10-02 08:41:06,399 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_34008_210_10431_1_1533345424_8183247_516-14858-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 08:41:06,411 WARNING [train.py:1204] (2/4) Exclude cut with ID _32704_210_8954_1_1533017488840_6747370_54-248218-0_sp1.1 from training. Duration: 0.936375 2023-10-02 08:41:07,892 WARNING [train.py:1204] (2/4) Exclude cut with ID _13253_210_1925_1_1530757775819_3577873_300-119782-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 08:41:07,955 WARNING [train.py:1204] (2/4) Exclude cut with ID _39290_210_4220_1_1533862778760_3075118_21-240703-0_sp1.1 from training. Duration: 0.936375 2023-10-02 08:41:12,554 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.4.encoder.layers.1.whiten, num_groups=1, num_channels=384, metric=4.90 vs. limit=12.0 2023-10-02 08:41:13,338 WARNING [train.py:1204] (2/4) Exclude cut with ID 774-127930-0014-10412-0_sp1.1 from training. Duration: 0.95 2023-10-02 08:41:14,659 WARNING [train.py:1204] (2/4) Exclude cut with ID _67499_210_17282_1_1535509805220_3692430_20-179346-0 from training. Duration: 0.97 2023-10-02 08:41:14,757 WARNING [train.py:1204] (2/4) Exclude cut with ID _8365_210_2064_1_1528592311070_4677690_431-294102-0_sp0.9 from training. Duration: 0.988875 2023-10-02 08:41:20,804 WARNING [train.py:1204] (2/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_365-1492-0_sp1.1 from training. Duration: 0.82725 2023-10-02 08:41:20,830 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_105-114797-0_sp0.9 from training. Duration: 0.9333125 2023-10-02 08:41:22,233 WARNING [train.py:1204] (2/4) Exclude cut with ID _6274_210_2084_1_1527937583837_7494020_769-168302-0_sp0.9 from training. Duration: 0.788875 2023-10-02 08:41:22,639 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.5.encoder.layers.1.feed_forward3.out_whiten, num_groups=1, num_channels=256, metric=10.34 vs. limit=15.0 2023-10-02 08:41:25,588 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_40340_210_9024_1_1533433916_4132944_320-270277-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 08:41:27,430 WARNING [train.py:1204] (2/4) Exclude cut with ID ID0203W0003-781711 from training. Duration: 0.958 2023-10-02 08:41:27,508 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_30434_210_13049_1_1532519384_981746_52-12932-0_sp1.1 from training. Duration: 0.936375 2023-10-02 08:41:28,930 WARNING [train.py:1204] (2/4) Exclude cut with ID _8263_210_1664_1_1528438953494_294100_40-204731-0_sp0.9 from training. Duration: 0.5555625 2023-10-02 08:41:31,645 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1032-236863-0_sp0.9 from training. Duration: 0.9333125 2023-10-02 08:41:31,696 WARNING [train.py:1204] (2/4) Exclude cut with ID _8394_210_6702_1_1528590504316_3540189_335-274780-0_sp0.9 from training. Duration: 0.8666875 2023-10-02 08:41:31,704 WARNING [train.py:1204] (2/4) Exclude cut with ID _66873_210_5517_1_1535454136506_4293370_654-52713-0_sp1.1 from training. Duration: 0.7 2023-10-02 08:41:33,099 WARNING [train.py:1204] (2/4) Exclude cut with ID _5250_210_5219_1_1527249561806_8692110_742-267856-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 08:41:34,559 WARNING [train.py:1204] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_74-48717-0 from training. Duration: 0.58 2023-10-02 08:41:35,927 WARNING [train.py:1204] (2/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_18-46756-0 from training. Duration: 0.79 2023-10-02 08:41:37,272 WARNING [train.py:1204] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_612-56061-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 08:41:37,289 WARNING [train.py:1204] (2/4) Exclude cut with ID _56423_210_5854_1_1534896061049_3165969_412-2403-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 08:41:38,578 WARNING [train.py:1204] (2/4) Exclude cut with ID _62574_210_2655_1_1535180407058_3933249_120-196848-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 08:41:38,578 WARNING [train.py:1204] (2/4) Exclude cut with ID _40519_210_13794_1_1533715228655_7337390_916-51000-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 08:41:42,617 WARNING [train.py:1204] (2/4) Exclude cut with ID _65263_210_22356_1_1535763598691_3921037_108-21902-0_sp1.1 from training. Duration: 0.9 2023-10-02 08:41:43,993 WARNING [train.py:1204] (2/4) Exclude cut with ID _4122_210_2084_1_1526713164417_4353280_528-206771-0_sp1.1 from training. Duration: 0.8 2023-10-02 08:41:45,441 WARNING [train.py:1204] (2/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_43-325657-0_sp1.1 from training. Duration: 0.92725 2023-10-02 08:41:46,894 WARNING [train.py:1204] (2/4) Exclude cut with ID _44659_210_2721_1_1534036120061_6614860_314-82041-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 08:41:48,227 WARNING [train.py:1204] (2/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_81-300212-0_sp0.9 from training. Duration: 0.6666875 2023-10-02 08:41:49,548 WARNING [train.py:1204] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_665-196155-0_sp0.9 from training. Duration: 0.7444375 2023-10-02 08:41:52,564 WARNING [train.py:1204] (2/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_140-336881-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 08:41:52,637 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6290_210_3273_1_1527595224_1282787_115-87095-0_sp0.9 from training. Duration: 0.611125 2023-10-02 08:41:52,674 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_53373_210_5399_1_1534229471_4715695_607-259685-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 08:41:54,077 WARNING [train.py:1204] (2/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_175-163894-0_sp1.1 from training. Duration: 0.67275 2023-10-02 08:41:54,093 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7002_210_1794_1_1527939683_5157286_1-281264-0_sp1.1 from training. Duration: 0.4 2023-10-02 08:42:00,554 WARNING [train.py:1204] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_605-257205-0 from training. Duration: 0.59 2023-10-02 08:42:03,208 WARNING [train.py:1204] (2/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_454-314901-0 from training. Duration: 0.46 2023-10-02 08:42:04,493 INFO [train.py:1046] (2/4) Epoch 24, batch 600, loss[loss=0.1789, simple_loss=0.2657, pruned_loss=0.04609, over 24673.00 frames. ], tot_loss[loss=0.1734, simple_loss=0.2502, pruned_loss=0.04829, over 4489298.49 frames. ], batch size: 73, lr: 4.31e-03, grad_scale: 16.0 2023-10-02 08:42:04,614 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_46032_210_14656_1_1533781412_4507969_287-130422-0_sp1.1 from training. Duration: 0.97275 2023-10-02 08:42:04,628 WARNING [train.py:1204] (2/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_186-309161-0_sp1.1 from training. Duration: 0.4909375 2023-10-02 08:42:04,657 WARNING [train.py:1204] (2/4) Exclude cut with ID _42659_210_2070_1_1533607442765_3120380_45-342307-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 08:42:10,160 WARNING [train.py:1204] (2/4) Exclude cut with ID _12555_210_8750_1_1530075594402_3780239_478-326818-0_sp1.1 from training. Duration: 0.963625 2023-10-02 08:42:10,298 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.0.conv_skip_rate, batch_count=818520.0, ans=0.0 2023-10-02 08:42:11,533 WARNING [train.py:1204] (2/4) Exclude cut with ID _7606_210_4915_1_1528894253277_5080690_127-177761-0_sp1.1 from training. Duration: 0.5818125 2023-10-02 08:42:12,876 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6788_210_1388_1_1527898698_4562021_91-197981-0 from training. Duration: 0.83 2023-10-02 08:42:13,097 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.conv_module2.balancer1.prob, batch_count=818520.0, ans=0.125 2023-10-02 08:42:14,223 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8558_210_1410_1_1528505225_7613406_131-329524-0_sp0.9 from training. Duration: 0.87775 2023-10-02 08:42:15,584 WARNING [train.py:1204] (2/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_153-153537-0_sp1.1 from training. Duration: 0.82725 2023-10-02 08:42:17,008 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_17292_210_6753_1_1531125388_2943734_285-349105-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 08:42:20,275 WARNING [train.py:1204] (2/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_37-41919-0 from training. Duration: 0.87 2023-10-02 08:42:20,300 WARNING [train.py:1204] (2/4) Exclude cut with ID _60860_210_8254_1_1534896015875_3557169_172-294852-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 08:42:28,520 WARNING [train.py:1204] (2/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_146-276944-0 from training. Duration: 0.83 2023-10-02 08:42:31,781 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.4.encoder.layers.0.self_attn2.whiten, num_groups=1, num_channels=384, metric=12.33 vs. limit=22.5 2023-10-02 08:42:32,659 WARNING [train.py:1204] (2/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_1254-44082-0_sp1.1 from training. Duration: 0.936375 2023-10-02 08:42:32,675 WARNING [train.py:1204] (2/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_1115-180350-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 08:42:32,706 WARNING [train.py:1204] (2/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_60-146189-0_sp1.1 from training. Duration: 0.8909375 2023-10-02 08:42:37,052 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.496e+02 1.807e+02 2.027e+02 2.301e+02 3.422e+02, threshold=4.053e+02, percent-clipped=0.0 2023-10-02 08:42:38,481 WARNING [train.py:1204] (2/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_517-153649-0_sp1.1 from training. Duration: 0.92725 2023-10-02 08:42:38,500 WARNING [train.py:1204] (2/4) Exclude cut with ID _39571_210_13449_1_1533801584143_3651077_37-266257-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 08:42:38,536 WARNING [train.py:1204] (2/4) Exclude cut with ID _6653_210_2629_1_1527919172441_6732679_527-276118-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 08:42:43,087 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.bypass.scale_min, batch_count=818653.3333333334, ans=0.2 2023-10-02 08:42:45,557 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7696_210_2040_1_1528207167_3716099_117-125434-0_sp1.1 from training. Duration: 0.6909375 2023-10-02 08:42:47,192 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.feed_forward3.hidden_balancer.prob, batch_count=818720.0, ans=0.125 2023-10-02 08:42:49,704 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_25767_210_2161_1_1532307483_3867748_478-156360-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 08:42:50,977 WARNING [train.py:1204] (2/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_177-49092-0_sp1.1 from training. Duration: 0.82725 2023-10-02 08:42:50,984 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_11305_210_1794_1_1529750428_7178148_680-317073-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 08:42:58,119 WARNING [train.py:1204] (2/4) Exclude cut with ID _53565_210_10635_1_1534337218335_3820259_287-18120-0 from training. Duration: 0.97 2023-10-02 08:43:03,942 WARNING [train.py:1204] (2/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_498-40153-0_sp1.1 from training. Duration: 0.57275 2023-10-02 08:43:03,957 WARNING [train.py:1204] (2/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_555-63066-0_sp1.1 from training. Duration: 0.963625 2023-10-02 08:43:06,825 WARNING [train.py:1204] (2/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_395-310220-0 from training. Duration: 0.89 2023-10-02 08:43:06,908 WARNING [train.py:1204] (2/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_937-208281-0_sp1.1 from training. Duration: 0.77275 2023-10-02 08:43:07,054 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.balancer2.prob, batch_count=818786.6666666666, ans=0.125 2023-10-02 08:43:09,602 WARNING [train.py:1204] (2/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_120-211630-0 from training. Duration: 0.92 2023-10-02 08:43:10,874 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_454-276429-0_sp1.1 from training. Duration: 0.7909375 2023-10-02 08:43:12,163 WARNING [train.py:1204] (2/4) Exclude cut with ID _22826_210_9994_1_1531908201522_3279169_404-75889-0_sp1.1 from training. Duration: 0.8090625 2023-10-02 08:43:17,583 INFO [train.py:1046] (2/4) Epoch 24, batch 650, loss[loss=0.1694, simple_loss=0.2417, pruned_loss=0.04856, over 23247.00 frames. ], tot_loss[loss=0.1727, simple_loss=0.2491, pruned_loss=0.0481, over 4552911.47 frames. ], batch size: 119, lr: 4.31e-03, grad_scale: 16.0 2023-10-02 08:43:17,692 WARNING [train.py:1204] (2/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_282-136641-0_sp0.9 from training. Duration: 0.4666875 2023-10-02 08:43:19,084 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_809-17156-0_sp1.1 from training. Duration: 0.663625 2023-10-02 08:43:21,705 WARNING [train.py:1204] (2/4) Exclude cut with ID _9235_210_5219_1_1528891154253_8046120_1003-101638-0_sp0.9 from training. Duration: 0.811125 2023-10-02 08:43:23,102 WARNING [train.py:1204] (2/4) Exclude cut with ID _22826_210_9994_1_1531908201522_3279169_404-75889-0_sp0.9 from training. Duration: 0.988875 2023-10-02 08:43:25,062 WARNING [train.py:1204] (2/4) Exclude cut with ID _59288_210_5854_1_1535077757774_3412246_570-295255-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 08:43:26,545 WARNING [train.py:1204] (2/4) Exclude cut with ID _3465_210_3525_1_1526175667060_3574049_412-287526-0 from training. Duration: 0.72 2023-10-02 08:43:27,076 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.5.encoder.layers.0.feed_forward1.out_whiten, num_groups=1, num_channels=256, metric=6.18 vs. limit=15.0 2023-10-02 08:43:27,983 WARNING [train.py:1204] (2/4) Exclude cut with ID _44010_210_4525_1_1533884262964_4013620_649-79655-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 08:43:33,056 WARNING [train.py:1204] (2/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_57-270037-0_sp0.9 from training. Duration: 0.8555625 2023-10-02 08:43:33,056 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_34815_210_6388_1_1533381320_3571903_302-255963-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 08:43:37,091 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_47449_210_5399_1_1534076445_4006929_573-301852-0_sp1.1 from training. Duration: 0.97275 2023-10-02 08:43:37,341 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.feed_forward3.hidden_balancer.prob, batch_count=818920.0, ans=0.125 2023-10-02 08:43:39,920 WARNING [train.py:1204] (2/4) Exclude cut with ID 6709-74022-0004-86860-0_sp1.1 from training. Duration: 0.9409375 2023-10-02 08:43:42,592 WARNING [train.py:1204] (2/4) Exclude cut with ID _40519_210_13794_1_1533715228655_7337390_111-71991-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 08:43:42,639 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_30167_210_3528_1_1532428012_4772375_169-214186-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 08:43:46,527 WARNING [train.py:1204] (2/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_20-235313-0_sp1.1 from training. Duration: 0.963625 2023-10-02 08:43:46,569 WARNING [train.py:1204] (2/4) Exclude cut with ID _4681_210_3525_1_1527251040321_1235713_123-24428-0_sp0.9 from training. Duration: 0.5333125 2023-10-02 08:43:48,133 WARNING [train.py:1204] (2/4) Exclude cut with ID _31602_210_5854_1_1532677021456_4458250_444-198752-0_sp1.1 from training. Duration: 0.97275 2023-10-02 08:43:49,466 WARNING [train.py:1204] (2/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_9-279442-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 08:43:50,858 WARNING [train.py:1204] (2/4) Exclude cut with ID _6275_210_2084_1_1528110062213_7669049_973-198085-0_sp0.9 from training. Duration: 0.8333125 2023-10-02 08:43:50,937 WARNING [train.py:1204] (2/4) Exclude cut with ID _29853_210_7497_1_1532505600851_6642110_567-250221-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 08:43:51,021 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6545_210_2040_1_1527774670_4245258_407-48070-0_sp1.1 from training. Duration: 0.6909375 2023-10-02 08:43:54,568 WARNING [train.py:1204] (2/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_224-343295-0_sp0.9 from training. Duration: 0.611125 2023-10-02 08:43:54,587 WARNING [train.py:1204] (2/4) Exclude cut with ID IC0125W0011-61817 from training. Duration: 0.88 2023-10-02 08:43:55,771 WARNING [train.py:1204] (2/4) Exclude cut with ID _60113_210_16271_1_1535164212481_4386079_435-154371-0_sp1.1 from training. Duration: 0.97275 2023-10-02 08:43:55,783 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_25836_210_16554_1_1532138167_53197_3-9394-0_sp1.1 from training. Duration: 0.92725 2023-10-02 08:43:57,356 WARNING [train.py:1204] (2/4) Exclude cut with ID _29343_210_4381_1_1532602757752_3674109_57-55125-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 08:43:59,040 WARNING [train.py:1204] (2/4) Exclude cut with ID _56235_210_10640_1_1534557335235_4161099_267-23560-0_sp1.1 from training. Duration: 0.936375 2023-10-02 08:44:00,377 WARNING [train.py:1204] (2/4) Exclude cut with ID _30196_210_1664_1_1533708496522_7074140_1150-278867-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 08:44:01,579 WARNING [train.py:1204] (2/4) Exclude cut with ID _6270_210_2084_1_1527850911573_3856420_26-277343-0_sp0.9 from training. Duration: 0.911125 2023-10-02 08:44:01,643 WARNING [train.py:1204] (2/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_325-62802-0 from training. Duration: 0.87 2023-10-02 08:44:03,628 WARNING [train.py:1204] (2/4) Exclude cut with ID _56524_210_9642_1_1534572011779_3619170_145-310842-0_sp1.1 from training. Duration: 0.9 2023-10-02 08:44:03,654 WARNING [train.py:1204] (2/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_860-96198-0_sp0.9 from training. Duration: 0.811125 2023-10-02 08:44:04,990 WARNING [train.py:1204] (2/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_17-25045-0_sp1.1 from training. Duration: 0.536375 2023-10-02 08:44:05,006 WARNING [train.py:1204] (2/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_349-69727-0_sp1.1 from training. Duration: 0.936375 2023-10-02 08:44:06,472 WARNING [train.py:1204] (2/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_580-93798-0_sp1.1 from training. Duration: 0.5090625 2023-10-02 08:44:06,570 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7762_210_1794_1_1528199508_4057408_419-125009-0 from training. Duration: 0.83 2023-10-02 08:44:07,905 WARNING [train.py:1204] (2/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_356-167816-0 from training. Duration: 0.72 2023-10-02 08:44:09,264 WARNING [train.py:1204] (2/4) Exclude cut with ID _29345_210_4381_1_1532689168263_3860169_225-97100-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 08:44:09,268 WARNING [train.py:1204] (2/4) Exclude cut with ID _14227_210_4381_1_1530701804286_3940259_374-194712-0_sp1.1 from training. Duration: 0.92725 2023-10-02 08:44:09,299 WARNING [train.py:1204] (2/4) Exclude cut with ID _40137_210_12210_1_1533290484046_3795180_249-187059-0_sp0.9 from training. Duration: 0.97775 2023-10-02 08:44:09,316 WARNING [train.py:1204] (2/4) Exclude cut with ID _22826_210_9994_1_1531908201522_3279169_389-317194-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 08:44:09,505 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.ff2_skip_rate, batch_count=819053.3333333334, ans=0.0 2023-10-02 08:44:11,907 WARNING [train.py:1204] (2/4) Exclude cut with ID _27485_210_4916_1_1532739191677_3668269_239-122918-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 08:44:17,298 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_2245_210_2715_1_1525511548_3540254_128-117609-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 08:44:17,339 WARNING [train.py:1204] (2/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_160-276178-0_sp1.1 from training. Duration: 0.963625 2023-10-02 08:44:18,679 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_31920_210_10633_1_1532860009_1686094_1-279735-0_sp1.1 from training. Duration: 0.97275 2023-10-02 08:44:22,017 WARNING [train.py:1204] (2/4) Exclude cut with ID _51078_210_9070_1_1534161557424_3805250_325-262383-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 08:44:22,049 WARNING [train.py:1204] (2/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_643-52007-0_sp0.9 from training. Duration: 0.6333125 2023-10-02 08:44:23,347 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_51937_210_5014_1_1534573610_4022025_518-299248-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 08:44:24,927 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.0.conv_module2.balancer2.prob, batch_count=819120.0, ans=0.125 2023-10-02 08:44:28,049 WARNING [train.py:1204] (2/4) Exclude cut with ID _13252_210_1925_1_1530671372047_3336909_166-314607-0_sp1.1 from training. Duration: 0.4909375 2023-10-02 08:44:28,053 WARNING [train.py:1204] (2/4) Exclude cut with ID _44778_210_2054_1_1533798127203_7443090_1087-282939-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 08:44:29,916 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_23850_210_4238_1_1532232597_1632433_32-185135-0_sp1.1 from training. Duration: 0.7818125 2023-10-02 08:44:29,938 WARNING [train.py:1204] (2/4) Exclude cut with ID _59500_210_8254_1_1534809610680_3736009_145-89359-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 08:44:31,240 INFO [train.py:1046] (2/4) Epoch 24, batch 700, loss[loss=0.1781, simple_loss=0.2624, pruned_loss=0.04693, over 24666.00 frames. ], tot_loss[loss=0.1719, simple_loss=0.2483, pruned_loss=0.0477, over 4581147.11 frames. ], batch size: 73, lr: 4.31e-03, grad_scale: 16.0 2023-10-02 08:44:34,616 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_340-336408-0 from training. Duration: 0.93 2023-10-02 08:44:34,688 WARNING [train.py:1204] (2/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_9-21737-0 from training. Duration: 0.97 2023-10-02 08:44:36,233 WARNING [train.py:1204] (2/4) Exclude cut with ID _2220_210_1819_1_1525509109898_3658680_50-99837-0 from training. Duration: 0.54 2023-10-02 08:44:36,281 WARNING [train.py:1204] (2/4) Exclude cut with ID _30304_210_6319_1_1532476827261_7582060_879-299350-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 08:44:38,900 WARNING [train.py:1204] (2/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_905-160074-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 08:44:40,250 WARNING [train.py:1204] (2/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_395-73570-0 from training. Duration: 0.99 2023-10-02 08:44:42,943 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7264_210_4938_1_1527939911_8132095_800-91995-0_sp1.1 from training. Duration: 0.7818125 2023-10-02 08:44:45,716 WARNING [train.py:1204] (2/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_77-311404-0_sp1.1 from training. Duration: 0.8545625 2023-10-02 08:44:47,591 WARNING [train.py:1204] (2/4) Exclude cut with ID _13679_210_7497_1_1530534293662_3355098_107-80772-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 08:44:48,729 WARNING [train.py:1204] (2/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_224-343295-0_sp1.1 from training. Duration: 0.5 2023-10-02 08:44:48,765 WARNING [train.py:1204] (2/4) Exclude cut with ID _54871_210_22356_1_1534665798253_3856359_156-253482-0_sp1.1 from training. Duration: 0.963625 2023-10-02 08:44:50,318 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.0.bypass.scale_min, batch_count=819253.3333333334, ans=0.2 2023-10-02 08:44:51,505 WARNING [train.py:1204] (2/4) Exclude cut with ID _49728_210_12651_1_1534041034294_6788150_309-125532-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 08:44:54,588 WARNING [train.py:1204] (2/4) Exclude cut with ID _13913_210_9994_1_1530579615187_3383897_473-277682-0_sp0.9 from training. Duration: 0.4333125 2023-10-02 08:44:54,592 WARNING [train.py:1204] (2/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_3-276701-0_sp1.1 from training. Duration: 0.82725 2023-10-02 08:44:56,363 WARNING [train.py:1204] (2/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_282-136641-0 from training. Duration: 0.42 2023-10-02 08:44:59,162 WARNING [train.py:1204] (2/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_127-566-0 from training. Duration: 0.84 2023-10-02 08:45:03,639 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.595e+02 1.854e+02 2.026e+02 2.248e+02 3.229e+02, threshold=4.051e+02, percent-clipped=0.0 2023-10-02 08:45:03,714 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6163_210_6400_1_1527506746_4263068_283-137811-0_sp1.1 from training. Duration: 0.7 2023-10-02 08:45:03,747 WARNING [train.py:1204] (2/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_34-183685-0_sp1.1 from training. Duration: 0.8909375 2023-10-02 08:45:05,143 WARNING [train.py:1204] (2/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_154-135696-0_sp0.9 from training. Duration: 0.888875 2023-10-02 08:45:09,279 WARNING [train.py:1204] (2/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_676-51014-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 08:45:09,329 WARNING [train.py:1204] (2/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_38-169920-0 from training. Duration: 0.91 2023-10-02 08:45:13,488 WARNING [train.py:1204] (2/4) Exclude cut with ID _6653_210_2629_1_1527919172441_6732679_747-114465-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 08:45:13,535 WARNING [train.py:1204] (2/4) Exclude cut with ID _8021_210_7994_1_1528376451358_3653069_100-140231-0_sp1.1 from training. Duration: 0.6090625 2023-10-02 08:45:13,553 WARNING [train.py:1204] (2/4) Exclude cut with ID _64135_210_2253_1_1535677267876_7608972_133-188266-0 from training. Duration: 0.81 2023-10-02 08:45:17,046 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=819386.6666666666, ans=0.1 2023-10-02 08:45:19,390 WARNING [train.py:1204] (2/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_714-310538-0_sp1.1 from training. Duration: 0.7818125 2023-10-02 08:45:20,749 WARNING [train.py:1204] (2/4) Exclude cut with ID _39867_210_9826_1_1533553222030_9344259_1134-288498-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 08:45:22,351 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.balancer2.prob, batch_count=819386.6666666666, ans=0.125 2023-10-02 08:45:22,982 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.1.encoder.layers.1.feed_forward3.out_whiten, num_groups=1, num_channels=256, metric=9.89 vs. limit=15.0 2023-10-02 08:45:23,946 WARNING [train.py:1204] (2/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_211-237289-0_sp1.1 from training. Duration: 0.936375 2023-10-02 08:45:31,294 WARNING [train.py:1204] (2/4) Exclude cut with ID _59202_210_26984_1_1534816810612_3549174_137-318710-0_sp0.9 from training. Duration: 0.988875 2023-10-02 08:45:31,323 WARNING [train.py:1204] (2/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_86-66192-0 from training. Duration: 0.55 2023-10-02 08:45:34,673 WARNING [train.py:1204] (2/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_540-275685-0 from training. Duration: 0.89 2023-10-02 08:45:34,789 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.0.nonlin_attention.balancer.prob, batch_count=819453.3333333334, ans=0.125 2023-10-02 08:45:35,281 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.2.encoder.layers.2.nonlin_attention.whiten1, num_groups=1, num_channels=288, metric=5.57 vs. limit=10.0 2023-10-02 08:45:35,967 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8776_210_2040_1_1528638837_4088036_244-278763-0 from training. Duration: 0.9 2023-10-02 08:45:37,387 WARNING [train.py:1204] (2/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_9-219972-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 08:45:38,788 WARNING [train.py:1204] (2/4) Exclude cut with ID _44659_210_2721_1_1534036120061_6614860_656-283823-0_sp1.1 from training. Duration: 0.92725 2023-10-02 08:45:38,852 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_69630_210_7234_1_1535789601_661049_77-216385-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 08:45:41,516 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_19952_210_2161_1_1531875469_3851750_210-308919-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 08:45:41,522 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_4737_210_2040_1_1526998689_3293008_378-178650-0 from training. Duration: 0.95 2023-10-02 08:45:44,208 INFO [train.py:1046] (2/4) Epoch 24, batch 750, loss[loss=0.165, simple_loss=0.2545, pruned_loss=0.03774, over 24587.00 frames. ], tot_loss[loss=0.1713, simple_loss=0.2474, pruned_loss=0.04763, over 4597522.37 frames. ], batch size: 73, lr: 4.31e-03, grad_scale: 16.0 2023-10-02 08:45:45,655 WARNING [train.py:1204] (2/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_224-343295-0 from training. Duration: 0.55 2023-10-02 08:45:47,506 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7002_210_1794_1_1527939683_5157286_184-229630-0 from training. Duration: 0.74 2023-10-02 08:45:47,541 WARNING [train.py:1204] (2/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_6-214815-0 from training. Duration: 0.85 2023-10-02 08:45:48,720 WARNING [train.py:1204] (2/4) Exclude cut with ID _8699_210_2069_1_1528630346831_3960409_160-24408-0 from training. Duration: 0.91 2023-10-02 08:45:48,758 WARNING [train.py:1204] (2/4) Exclude cut with ID _5585_210_2667_1_1527418813324_8007340_89-332072-0 from training. Duration: 0.64 2023-10-02 08:45:50,052 WARNING [train.py:1204] (2/4) Exclude cut with ID _5250_210_5219_1_1527249561806_8692110_528-259800-0_sp1.1 from training. Duration: 0.963625 2023-10-02 08:45:51,309 WARNING [train.py:1204] (2/4) Exclude cut with ID _55383_210_10635_1_1534468426172_2231669_134-128246-0 from training. Duration: 0.89 2023-10-02 08:45:52,667 WARNING [train.py:1204] (2/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_400-338274-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 08:45:52,716 WARNING [train.py:1204] (2/4) Exclude cut with ID _14224_210_4381_1_1530529062037_4264010_364-22817-0_sp0.9 from training. Duration: 0.788875 2023-10-02 08:45:52,899 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.balancer1.prob, batch_count=819520.0, ans=0.125 2023-10-02 08:45:54,096 WARNING [train.py:1204] (2/4) Exclude cut with ID _42863_210_9070_1_1533902398788_3696920_331-291057-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 08:45:56,803 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_5436_210_1794_1_1527331317_2541494_229-211016-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 08:45:56,843 WARNING [train.py:1204] (2/4) Exclude cut with ID _6456_210_1534_1_1527681634212_3181049_345-252877-0_sp0.9 from training. Duration: 0.711125 2023-10-02 08:45:56,873 WARNING [train.py:1204] (2/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_96-81499-0_sp1.1 from training. Duration: 0.92725 2023-10-02 08:46:00,056 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_5575_210_1410_1_1527301305_2570170_342-265572-0_sp1.1 from training. Duration: 0.8545625 2023-10-02 08:46:01,295 WARNING [train.py:1204] (2/4) Exclude cut with ID _5444_210_2151_1_1527418808464_5312210_726-27165-0_sp0.9 from training. Duration: 0.7444375 2023-10-02 08:46:03,403 WARNING [train.py:1204] (2/4) Exclude cut with ID _22083_210_8254_1_1531702862439_3678970_304-327750-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 08:46:06,630 WARNING [train.py:1204] (2/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_245-226199-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 08:46:07,776 WARNING [train.py:1204] (2/4) Exclude cut with ID _42875_210_14856_1_1533704397487_4012433_102-199499-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 08:46:07,830 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_51225_210_3601_1_1534400634_2327383_55-301844-0 from training. Duration: 0.93 2023-10-02 08:46:09,268 WARNING [train.py:1204] (2/4) Exclude cut with ID _28317_210_12742_1_1532480302381_8510325_916-195848-0_sp1.1 from training. Duration: 0.836375 2023-10-02 08:46:09,328 WARNING [train.py:1204] (2/4) Exclude cut with ID _44718_210_5816_1_1534050051869_6469030_497-311999-0_sp1.1 from training. Duration: 0.936375 2023-10-02 08:46:10,629 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_34886_210_10633_1_1533203882_1755661_91-194765-0_sp1.1 from training. Duration: 0.936375 2023-10-02 08:46:13,368 WARNING [train.py:1204] (2/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_310-26186-0_sp0.9 from training. Duration: 0.77775 2023-10-02 08:46:14,775 WARNING [train.py:1204] (2/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_416-257730-0 from training. Duration: 0.65 2023-10-02 08:46:14,782 WARNING [train.py:1204] (2/4) Exclude cut with ID _9389_210_1664_1_1529217337301_5289269_209-154760-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 08:46:16,058 WARNING [train.py:1204] (2/4) Exclude cut with ID _4092_210_2151_1_1526643044449_3135759_227-213972-0 from training. Duration: 0.54 2023-10-02 08:46:16,078 WARNING [train.py:1204] (2/4) Exclude cut with ID IC0679W0044-335852 from training. Duration: 0.768 2023-10-02 08:46:16,144 WARNING [train.py:1204] (2/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_42-186162-0 from training. Duration: 0.92 2023-10-02 08:46:17,518 WARNING [train.py:1204] (2/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_302-2550-0_sp0.9 from training. Duration: 0.9 2023-10-02 08:46:17,531 WARNING [train.py:1204] (2/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_356-31216-0_sp0.9 from training. Duration: 0.8333125 2023-10-02 08:46:18,877 WARNING [train.py:1204] (2/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_125-322172-0_sp1.1 from training. Duration: 0.8454375 2023-10-02 08:46:26,106 WARNING [train.py:1204] (2/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_141-89727-0_sp0.9 from training. Duration: 0.788875 2023-10-02 08:46:26,130 WARNING [train.py:1204] (2/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_647-337784-0_sp1.1 from training. Duration: 0.97275 2023-10-02 08:46:26,146 WARNING [train.py:1204] (2/4) Exclude cut with ID _6493_210_1747_1_1528459210424_4012470_513-140967-0_sp1.1 from training. Duration: 0.7181875 2023-10-02 08:46:27,680 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.out_combiner.scale_min, batch_count=819720.0, ans=0.2 2023-10-02 08:46:29,420 WARNING [train.py:1204] (2/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_87-266334-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 08:46:30,337 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.2.encoder.layers.0.self_attn1.whiten, num_groups=1, num_channels=384, metric=24.35 vs. limit=22.5 2023-10-02 08:46:30,871 WARNING [train.py:1204] (2/4) Exclude cut with ID _26528_210_9070_1_1532570410842_3851310_526-261338-0_sp1.1 from training. Duration: 0.92725 2023-10-02 08:46:30,920 WARNING [train.py:1204] (2/4) Exclude cut with ID _14224_210_4381_1_1530529062037_4264010_380-343670-0 from training. Duration: 0.88 2023-10-02 08:46:30,965 WARNING [train.py:1204] (2/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_449-15508-0_sp1.1 from training. Duration: 0.7454375 2023-10-02 08:46:32,731 WARNING [train.py:1204] (2/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_372-6869-0_sp0.9 from training. Duration: 0.488875 2023-10-02 08:46:34,420 WARNING [train.py:1204] (2/4) Exclude cut with ID _26907_210_7234_1_1532221740694_3205263_337-2494-0_sp0.9 from training. Duration: 0.97775 2023-10-02 08:46:36,020 WARNING [train.py:1204] (2/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_10-100367-0_sp1.1 from training. Duration: 0.8909375 2023-10-02 08:46:37,274 WARNING [train.py:1204] (2/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_47-167373-0 from training. Duration: 0.92 2023-10-02 08:46:37,332 WARNING [train.py:1204] (2/4) Exclude cut with ID _28323_210_16187_1_1532480395667_4100300_628-282179-0_sp1.1 from training. Duration: 0.97275 2023-10-02 08:46:37,958 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.3.encoder.layers.3.conv_module2.whiten, num_groups=1, num_channels=512, metric=5.90 vs. limit=15.0 2023-10-02 08:46:41,385 WARNING [train.py:1204] (2/4) Exclude cut with ID _20455_210_5219_1_1531565996775_8098100_213-280898-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 08:46:44,060 WARNING [train.py:1204] (2/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_383-316000-0_sp1.1 from training. Duration: 0.8181875 2023-10-02 08:46:44,104 WARNING [train.py:1204] (2/4) Exclude cut with ID _18513_210_9206_1_1531618215223_3862620_16-78180-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 08:46:46,912 WARNING [train.py:1204] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_330-327861-0_sp1.1 from training. Duration: 0.6090625 2023-10-02 08:46:49,701 WARNING [train.py:1204] (2/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_356-31216-0 from training. Duration: 0.75 2023-10-02 08:46:49,725 WARNING [train.py:1204] (2/4) Exclude cut with ID _25248_210_8750_1_1532062825941_5554180_255-148539-0_sp1.1 from training. Duration: 0.963625 2023-10-02 08:46:50,482 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.nonlin_attention.balancer.min_positive, batch_count=819786.6666666666, ans=0.05 2023-10-02 08:46:51,496 WARNING [train.py:1204] (2/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_269-154726-0_sp1.1 from training. Duration: 0.7818125 2023-10-02 08:46:53,022 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7002_210_1794_1_1527939683_5157286_163-87552-0_sp1.1 from training. Duration: 0.7818125 2023-10-02 08:46:53,054 WARNING [train.py:1204] (2/4) Exclude cut with ID _54871_210_22356_1_1534665798253_3856359_97-151896-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 08:46:55,697 WARNING [train.py:1204] (2/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_207-7026-0_sp1.1 from training. Duration: 0.97275 2023-10-02 08:46:55,722 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_92-46009-0_sp0.9 from training. Duration: 0.6 2023-10-02 08:46:56,922 INFO [train.py:1046] (2/4) Epoch 24, batch 800, loss[loss=0.1622, simple_loss=0.2523, pruned_loss=0.03606, over 24683.00 frames. ], tot_loss[loss=0.1715, simple_loss=0.2479, pruned_loss=0.04752, over 4603736.33 frames. ], batch size: 73, lr: 4.31e-03, grad_scale: 32.0 2023-10-02 08:47:03,985 WARNING [train.py:1204] (2/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_122-178847-0_sp1.1 from training. Duration: 0.97275 2023-10-02 08:47:03,989 WARNING [train.py:1204] (2/4) Exclude cut with ID _44778_210_2054_1_1533798127203_7443090_94-348956-0_sp1.1 from training. Duration: 0.92725 2023-10-02 08:47:05,398 WARNING [train.py:1204] (2/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_1018-244089-0_sp1.1 from training. Duration: 0.963625 2023-10-02 08:47:05,408 WARNING [train.py:1204] (2/4) Exclude cut with ID _56235_210_10640_1_1534557335235_4161099_87-118423-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 08:47:06,650 WARNING [train.py:1204] (2/4) Exclude cut with ID _53713_210_24116_1_1534314487762_7416751_504-212292-0_sp1.1 from training. Duration: 0.92725 2023-10-02 08:47:07,984 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_446-150327-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 08:47:08,094 WARNING [train.py:1204] (2/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_682-245989-0_sp1.1 from training. Duration: 0.92725 2023-10-02 08:47:08,355 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.feed_forward1.hidden_balancer.prob, batch_count=819853.3333333334, ans=0.125 2023-10-02 08:47:12,039 WARNING [train.py:1204] (2/4) Exclude cut with ID _35723_210_1794_1_1533088341043_628250_8-234524-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 08:47:13,320 WARNING [train.py:1204] (2/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_64-248359-0_sp0.9 from training. Duration: 0.7666875 2023-10-02 08:47:16,025 WARNING [train.py:1204] (2/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_49-97158-0 from training. Duration: 0.76 2023-10-02 08:47:17,396 WARNING [train.py:1204] (2/4) Exclude cut with ID _24314_210_4915_1_1532135078116_7170179_743-915-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 08:47:17,489 WARNING [train.py:1204] (2/4) Exclude cut with ID _61194_210_18628_1_1535007555628_3750909_353-323468-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 08:47:18,743 WARNING [train.py:1204] (2/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_387-131538-0_sp1.1 from training. Duration: 0.736375 2023-10-02 08:47:18,771 WARNING [train.py:1204] (2/4) Exclude cut with ID _44424_210_18163_1_1533623394848_1213110_79-187603-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 08:47:18,808 WARNING [train.py:1204] (2/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_86-19601-0 from training. Duration: 0.85 2023-10-02 08:47:18,834 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_50050_210_16223_1_1535435796_7290138_103-283529-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 08:47:18,862 WARNING [train.py:1204] (2/4) Exclude cut with ID _13913_210_9994_1_1530579615187_3383897_473-277682-0 from training. Duration: 0.39 2023-10-02 08:47:22,267 WARNING [train.py:1204] (2/4) Exclude cut with ID _42326_210_7770_1_1533814282531_3707269_368-68029-0_sp1.1 from training. Duration: 0.92725 2023-10-02 08:47:24,940 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_41428_210_2161_1_1533774184_3992532_446-339645-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 08:47:27,586 WARNING [train.py:1204] (2/4) Exclude cut with ID _69584_210_5219_1_1535785133775_4044450_325-7656-0_sp1.1 from training. Duration: 0.7818125 2023-10-02 08:47:27,594 WARNING [train.py:1204] (2/4) Exclude cut with ID _21632_210_2253_1_1531994001765_3744140_134-184387-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 08:47:30,993 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.485e+02 1.828e+02 1.971e+02 2.242e+02 3.409e+02, threshold=3.942e+02, percent-clipped=0.0 2023-10-02 08:47:31,112 WARNING [train.py:1204] (2/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_550-184707-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 08:47:31,133 WARNING [train.py:1204] (2/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_106-341780-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 08:47:37,511 WARNING [train.py:1204] (2/4) Exclude cut with ID _45871_210_5665_1_1533864614245_3296060_253-132552-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 08:47:38,837 WARNING [train.py:1204] (2/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_395-310220-0_sp1.1 from training. Duration: 0.8090625 2023-10-02 08:47:38,852 WARNING [train.py:1204] (2/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_795-33252-0_sp0.9 from training. Duration: 0.411125 2023-10-02 08:47:40,250 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9002W0003-495962 from training. Duration: 0.946 2023-10-02 08:47:40,276 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_43-71989-0 from training. Duration: 0.57 2023-10-02 08:47:40,287 WARNING [train.py:1204] (2/4) Exclude cut with ID _8699_210_2069_1_1528630346831_3960409_155-39766-0_sp1.1 from training. Duration: 0.7090625 2023-10-02 08:47:40,299 WARNING [train.py:1204] (2/4) Exclude cut with ID _27894_210_12392_1_1532415496078_6902439_788-232684-0_sp1.1 from training. Duration: 0.97275 2023-10-02 08:47:42,927 WARNING [train.py:1204] (2/4) Exclude cut with ID _56494_210_4220_1_1535284837625_3033169_10-39296-0_sp1.1 from training. Duration: 0.92725 2023-10-02 08:47:42,947 WARNING [train.py:1204] (2/4) Exclude cut with ID _40901_210_5665_1_1533363789958_3856130_341-20819-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 08:47:43,128 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.balancer1.prob, batch_count=820053.3333333334, ans=0.125 2023-10-02 08:47:43,237 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.conv_skip_rate, batch_count=820053.3333333334, ans=0.0 2023-10-02 08:47:45,868 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=820053.3333333334, ans=0.1 2023-10-02 08:47:47,014 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9029W0435-509822 from training. Duration: 0.896 2023-10-02 08:47:47,054 WARNING [train.py:1204] (2/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_51-117576-0 from training. Duration: 0.4 2023-10-02 08:47:48,414 WARNING [train.py:1204] (2/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_197-28164-0_sp1.1 from training. Duration: 0.72725 2023-10-02 08:47:49,803 WARNING [train.py:1204] (2/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_145-137790-0_sp1.1 from training. Duration: 0.7545625 2023-10-02 08:47:54,179 WARNING [train.py:1204] (2/4) Exclude cut with ID _47790_210_16187_1_1533862814066_4207519_2-39653-0_sp1.1 from training. Duration: 0.8909375 2023-10-02 08:47:55,706 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_3780_210_2087_1_1526467760_2130483_186-208240-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 08:47:57,136 WARNING [train.py:1204] (2/4) Exclude cut with ID _24055_210_12651_1_1532310907143_976979_92-59356-0 from training. Duration: 0.91 2023-10-02 08:47:57,995 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.1.encoder.layers.1.conv_module2.whiten, num_groups=1, num_channels=256, metric=9.76 vs. limit=15.0 2023-10-02 08:47:58,555 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_3910_210_1794_1_1526742306_334530_28-238909-0_sp1.1 from training. Duration: 0.736375 2023-10-02 08:48:01,818 WARNING [train.py:1204] (2/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_269-154726-0 from training. Duration: 0.86 2023-10-02 08:48:09,433 WARNING [train.py:1204] (2/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_326-165900-0_sp0.9 from training. Duration: 0.7666875 2023-10-02 08:48:09,593 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.0.attention_skip_rate, batch_count=820186.6666666666, ans=0.0 2023-10-02 08:48:10,747 INFO [train.py:1046] (2/4) Epoch 24, batch 850, loss[loss=0.1639, simple_loss=0.2412, pruned_loss=0.04328, over 23689.00 frames. ], tot_loss[loss=0.1716, simple_loss=0.2481, pruned_loss=0.04758, over 4632726.03 frames. ], batch size: 149, lr: 4.31e-03, grad_scale: 16.0 2023-10-02 08:48:11,062 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.bypass.skip_rate, batch_count=820186.6666666666, ans=0.09899494936611666 2023-10-02 08:48:11,604 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.2.encoder.layers.1.self_attn1.whiten, num_groups=1, num_channels=384, metric=17.29 vs. limit=22.5 2023-10-02 08:48:12,189 WARNING [train.py:1204] (2/4) Exclude cut with ID _5294_210_1747_1_1527249643928_3696179_470-276077-0_sp1.1 from training. Duration: 0.963625 2023-10-02 08:48:12,218 WARNING [train.py:1204] (2/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_372-131163-0 from training. Duration: 0.94 2023-10-02 08:48:13,588 WARNING [train.py:1204] (2/4) Exclude cut with ID _45297_210_2721_1_1533715351381_3264949_351-147136-0_sp1.1 from training. Duration: 0.7909375 2023-10-02 08:48:13,653 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_1072-12671-0_sp1.1 from training. Duration: 0.92725 2023-10-02 08:48:14,456 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.3.encoder.layers.0.self_attn1.whiten, num_groups=1, num_channels=512, metric=18.48 vs. limit=22.5 2023-10-02 08:48:15,100 WARNING [train.py:1204] (2/4) Exclude cut with ID _15133_210_6228_1_1530794512074_3080379_54-198193-0 from training. Duration: 0.94 2023-10-02 08:48:15,116 WARNING [train.py:1204] (2/4) Exclude cut with ID _30196_210_1664_1_1533708496522_7074140_1194-270988-0_sp1.1 from training. Duration: 0.97275 2023-10-02 08:48:16,527 WARNING [train.py:1204] (2/4) Exclude cut with ID _50310_210_15488_1_1534068043345_3641080_289-193637-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 08:48:17,875 WARNING [train.py:1204] (2/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_313-263495-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 08:48:19,164 WARNING [train.py:1204] (2/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_18-130336-0_sp1.1 from training. Duration: 0.7181875 2023-10-02 08:48:19,219 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_47896_210_20508_1_1534492518_5958868_646-287209-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 08:48:20,572 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_294-163793-0 from training. Duration: 0.82 2023-10-02 08:48:20,708 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.conv_module2.balancer1.prob, batch_count=820186.6666666666, ans=0.125 2023-10-02 08:48:21,985 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_8-319957-0 from training. Duration: 0.96 2023-10-02 08:48:21,989 WARNING [train.py:1204] (2/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_1211-226066-0 from training. Duration: 0.76 2023-10-02 08:48:23,378 WARNING [train.py:1204] (2/4) Exclude cut with ID _4779_210_2084_1_1527318022944_3914893_500-285013-0_sp0.9 from training. Duration: 0.7666875 2023-10-02 08:48:23,400 WARNING [train.py:1204] (2/4) Exclude cut with ID _22826_210_9994_1_1531908201522_3279169_347-232268-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 08:48:26,567 WARNING [train.py:1204] (2/4) Exclude cut with ID _29877_210_6228_1_1532397791106_5276149_111-55056-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 08:48:27,885 WARNING [train.py:1204] (2/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_812-271646-0_sp1.1 from training. Duration: 0.92725 2023-10-02 08:48:27,916 WARNING [train.py:1204] (2/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_235-262124-0_sp1.1 from training. Duration: 0.6090625 2023-10-02 08:48:30,770 WARNING [train.py:1204] (2/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_14-16530-0_sp1.1 from training. Duration: 0.97275 2023-10-02 08:48:32,149 WARNING [train.py:1204] (2/4) Exclude cut with ID _44718_210_5816_1_1534050051869_6469030_568-304512-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 08:48:32,180 WARNING [train.py:1204] (2/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_1033-274376-0 from training. Duration: 0.87 2023-10-02 08:48:37,237 WARNING [train.py:1204] (2/4) Exclude cut with ID _4681_210_3525_1_1527251040321_1235713_123-24428-0 from training. Duration: 0.48 2023-10-02 08:48:40,409 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_37818_210_4238_1_1533620910_4720083_251-288764-0_sp1.1 from training. Duration: 0.97275 2023-10-02 08:48:41,846 WARNING [train.py:1204] (2/4) Exclude cut with ID _65585_210_12210_1_1535450451584_3764098_112-294301-0 from training. Duration: 0.88 2023-10-02 08:48:42,133 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.nonlin_attention.balancer.prob, batch_count=820320.0, ans=0.125 2023-10-02 08:48:43,562 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.attention_skip_rate, batch_count=820320.0, ans=0.0 2023-10-02 08:48:44,682 WARNING [train.py:1204] (2/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_329-113173-0 from training. Duration: 0.62 2023-10-02 08:48:44,777 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6767_210_3273_1_1527855783_1718932_173-256601-0 from training. Duration: 0.8 2023-10-02 08:48:47,458 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9266W0001-627677 from training. Duration: 0.896 2023-10-02 08:48:47,468 WARNING [train.py:1204] (2/4) Exclude cut with ID _49643_210_7989_1_1534640244224_3939031_611-185515-0_sp1.1 from training. Duration: 0.8545625 2023-10-02 08:48:47,479 WARNING [train.py:1204] (2/4) Exclude cut with ID _31649_210_6228_1_1532584608063_4091078_687-237771-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 08:48:47,489 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_12868_210_6753_1_1530702479_3610564_148-172861-0_sp0.9 from training. Duration: 0.5555625 2023-10-02 08:48:50,098 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_5129_210_6400_1_1527161005_3579054_107-69738-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 08:48:51,487 WARNING [train.py:1204] (2/4) Exclude cut with ID _40137_210_12210_1_1533290484046_3795180_36-301164-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 08:48:51,516 WARNING [train.py:1204] (2/4) Exclude cut with ID _7537_210_2084_1_1528502395431_3924359_113-199933-0 from training. Duration: 0.59 2023-10-02 08:48:51,764 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.attention_skip_rate, batch_count=820320.0, ans=0.0 2023-10-02 08:48:54,086 WARNING [train.py:1204] (2/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_547-5424-0_sp1.1 from training. Duration: 0.8545625 2023-10-02 08:48:55,952 WARNING [train.py:1204] (2/4) Exclude cut with ID _43552_210_8913_1_1533726031059_3523429_147-284024-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 08:48:57,290 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_294-163793-0_sp1.1 from training. Duration: 0.7454375 2023-10-02 08:48:57,320 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_3780_210_2087_1_1526467760_2130483_170-259044-0_sp1.1 from training. Duration: 0.636375 2023-10-02 08:48:58,690 WARNING [train.py:1204] (2/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_726-270780-0_sp1.1 from training. Duration: 0.7818125 2023-10-02 08:49:00,072 WARNING [train.py:1204] (2/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_214-291369-0_sp1.1 from training. Duration: 0.57275 2023-10-02 08:49:00,111 WARNING [train.py:1204] (2/4) Exclude cut with ID _21881_210_3525_1_1531825242757_3798380_247-102591-0 from training. Duration: 0.98 2023-10-02 08:49:04,682 WARNING [train.py:1204] (2/4) Exclude cut with ID _8064_210_5399_1_1528372299291_4282973_18-174647-0_sp1.1 from training. Duration: 0.82725 2023-10-02 08:49:04,685 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_415-52611-0_sp1.1 from training. Duration: 0.936375 2023-10-02 08:49:04,741 WARNING [train.py:1204] (2/4) Exclude cut with ID _8394_210_6702_1_1528590504316_3540189_379-49457-0_sp0.9 from training. Duration: 0.9666875 2023-10-02 08:49:04,756 WARNING [train.py:1204] (2/4) Exclude cut with ID _64521_210_14699_1_1535243427259_3815328_368-195106-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 08:49:04,920 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.1.ff2_skip_rate, batch_count=820386.6666666666, ans=0.0 2023-10-02 08:49:06,664 WARNING [train.py:1204] (2/4) Exclude cut with ID _28394_210_8913_1_1532430049518_3650118_51-334124-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 08:49:10,023 WARNING [train.py:1204] (2/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_317-161769-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 08:49:11,254 WARNING [train.py:1204] (2/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_40-129756-0_sp0.9 from training. Duration: 0.9 2023-10-02 08:49:13,893 WARNING [train.py:1204] (2/4) Exclude cut with ID _3067_210_2667_1_1526212974640_4503168_119-175776-0_sp0.9 from training. Duration: 0.82225 2023-10-02 08:49:15,260 WARNING [train.py:1204] (2/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_1004-304915-0_sp1.1 from training. Duration: 0.92725 2023-10-02 08:49:15,306 WARNING [train.py:1204] (2/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_359-118926-0_sp0.9 from training. Duration: 0.8 2023-10-02 08:49:16,435 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.0.layers.0.self_attn1.whiten, num_groups=1, num_channels=192, metric=12.42 vs. limit=22.5 2023-10-02 08:49:20,887 WARNING [train.py:1204] (2/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_51-200220-0_sp0.9 from training. Duration: 0.62225 2023-10-02 08:49:22,171 WARNING [train.py:1204] (2/4) Exclude cut with ID _46683_210_9642_1_1533730913492_4591116_530-326202-0_sp1.1 from training. Duration: 0.936375 2023-10-02 08:49:22,221 WARNING [train.py:1204] (2/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_567-135114-0 from training. Duration: 0.87 2023-10-02 08:49:23,585 INFO [train.py:1046] (2/4) Epoch 24, batch 900, loss[loss=0.1539, simple_loss=0.2358, pruned_loss=0.03603, over 20993.00 frames. ], tot_loss[loss=0.172, simple_loss=0.2489, pruned_loss=0.04758, over 4651161.89 frames. ], batch size: 46, lr: 4.30e-03, grad_scale: 16.0 2023-10-02 08:49:23,634 WARNING [train.py:1204] (2/4) Exclude cut with ID _66863_210_18053_1_1535698758334_8590319_1092-271784-0_sp1.1 from training. Duration: 0.87275 2023-10-02 08:49:23,650 WARNING [train.py:1204] (2/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_762-282614-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 08:49:24,526 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.0.layers.1.self_attn2.whiten, num_groups=1, num_channels=192, metric=9.53 vs. limit=22.5 2023-10-02 08:49:25,639 WARNING [train.py:1204] (2/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_280-120942-0 from training. Duration: 0.77 2023-10-02 08:49:31,024 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_31583_210_3852_1_1532595851_4024413_27-170931-0_sp1.1 from training. Duration: 0.963625 2023-10-02 08:49:31,258 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.conv_module1.balancer1.prob, batch_count=820520.0, ans=0.125 2023-10-02 08:49:34,232 WARNING [train.py:1204] (2/4) Exclude cut with ID _12369_210_4381_1_1530356078581_4388520_266-271628-0_sp1.1 from training. Duration: 0.92725 2023-10-02 08:49:34,273 WARNING [train.py:1204] (2/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_41-31157-0 from training. Duration: 0.57 2023-10-02 08:49:34,539 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.conv_module2.balancer2.prob, batch_count=820520.0, ans=0.125 2023-10-02 08:49:37,450 WARNING [train.py:1204] (2/4) Exclude cut with ID _21878_210_3525_1_1531738772002_7264330_113-18163-0_sp1.1 from training. Duration: 0.8090625 2023-10-02 08:49:37,481 WARNING [train.py:1204] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_147-111137-0 from training. Duration: 0.95 2023-10-02 08:49:40,498 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1083-220122-0_sp1.1 from training. Duration: 0.463625 2023-10-02 08:49:41,814 WARNING [train.py:1204] (2/4) Exclude cut with ID _14884_210_2252_1_1530707459689_5561250_591-173406-0_sp1.1 from training. Duration: 0.87275 2023-10-02 08:49:41,819 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_24677_210_1794_1_1532002693_4341296_149-303793-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 08:49:41,867 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_12868_210_6753_1_1530702479_3610564_133-564-0_sp1.1 from training. Duration: 0.7181875 2023-10-02 08:49:41,889 WARNING [train.py:1204] (2/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_625-338546-0_sp0.9 from training. Duration: 0.97775 2023-10-02 08:49:48,709 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.balancer2.prob, batch_count=820586.6666666666, ans=0.125 2023-10-02 08:49:51,251 WARNING [train.py:1204] (2/4) Exclude cut with ID _25882_210_3036_1_1532775665815_6479510_624-320169-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 08:49:51,257 WARNING [train.py:1204] (2/4) Exclude cut with ID _20622_210_3852_1_1531622427080_1093839_127-325326-0_sp1.1 from training. Duration: 0.92725 2023-10-02 08:49:52,577 WARNING [train.py:1204] (2/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_464-33931-0_sp1.1 from training. Duration: 0.4909375 2023-10-02 08:49:55,325 WARNING [train.py:1204] (2/4) Exclude cut with ID _66112_210_10635_1_1535590236305_3801484_120-304588-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 08:49:58,565 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.500e+02 1.862e+02 2.065e+02 2.310e+02 4.708e+02, threshold=4.130e+02, percent-clipped=1.0 2023-10-02 08:49:59,994 WARNING [train.py:1204] (2/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_110-130961-0 from training. Duration: 0.47 2023-10-02 08:50:02,723 WARNING [train.py:1204] (2/4) Exclude cut with ID _16908_210_5724_1_1531119157248_3772700_64-54020-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 08:50:04,736 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.3.encoder.layers.2.feed_forward3.out_whiten, num_groups=1, num_channels=512, metric=7.05 vs. limit=15.0 2023-10-02 08:50:05,999 WARNING [train.py:1204] (2/4) Exclude cut with ID _4068_210_2629_1_1526697350395_1546259_224-288693-0_sp1.1 from training. Duration: 0.536375 2023-10-02 08:50:07,224 WARNING [train.py:1204] (2/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_191-41023-0_sp0.9 from training. Duration: 0.788875 2023-10-02 08:50:08,584 WARNING [train.py:1204] (2/4) Exclude cut with ID ID0205W0255-783002 from training. Duration: 0.958 2023-10-02 08:50:10,506 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_162-262717-0 from training. Duration: 0.83 2023-10-02 08:50:15,955 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_224-322394-0_sp1.1 from training. Duration: 0.6 2023-10-02 08:50:15,961 WARNING [train.py:1204] (2/4) Exclude cut with ID _4866_210_2577_1_1527404412284_4364659_133-1696-0_sp1.1 from training. Duration: 0.9 2023-10-02 08:50:16,013 WARNING [train.py:1204] (2/4) Exclude cut with ID _6275_210_2084_1_1528110062213_7669049_973-198085-0_sp1.1 from training. Duration: 0.6818125 2023-10-02 08:50:22,687 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_47449_210_5399_1_1534076445_4006929_410-237806-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 08:50:22,697 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7543_210_6753_1_1528543318_4552_1-85072-0_sp0.9 from training. Duration: 0.92225 2023-10-02 08:50:22,816 WARNING [train.py:1204] (2/4) Exclude cut with ID _65263_210_22356_1_1535763598691_3921037_108-21902-0 from training. Duration: 0.99 2023-10-02 08:50:22,820 WARNING [train.py:1204] (2/4) Exclude cut with ID _65585_210_12210_1_1535450451584_3764098_309-174818-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 08:50:25,593 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_309-65177-0 from training. Duration: 0.88 2023-10-02 08:50:26,931 WARNING [train.py:1204] (2/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_363-185703-0_sp1.1 from training. Duration: 0.863625 2023-10-02 08:50:26,962 WARNING [train.py:1204] (2/4) Exclude cut with ID _42326_210_7770_1_1533814282531_3707269_442-238692-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 08:50:30,052 WARNING [train.py:1204] (2/4) Exclude cut with ID _69884_210_9771_1_1535846392818_4166169_226-254446-0_sp1.1 from training. Duration: 0.936375 2023-10-02 08:50:30,073 WARNING [train.py:1204] (2/4) Exclude cut with ID _29853_210_7497_1_1532505600851_6642110_287-299903-0_sp1.1 from training. Duration: 0.963625 2023-10-02 08:50:34,544 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1015-343884-0 from training. Duration: 0.62 2023-10-02 08:50:34,578 WARNING [train.py:1204] (2/4) Exclude cut with ID ID0305W0008-833402 from training. Duration: 0.91 2023-10-02 08:50:35,953 INFO [train.py:1046] (2/4) Epoch 24, batch 950, loss[loss=0.1431, simple_loss=0.2212, pruned_loss=0.03256, over 16807.00 frames. ], tot_loss[loss=0.1728, simple_loss=0.2497, pruned_loss=0.04794, over 4663595.26 frames. ], batch size: 36, lr: 4.30e-03, grad_scale: 8.0 2023-10-02 08:50:36,016 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6673_210_3273_1_1527767790_3173240_351-167954-0_sp1.1 from training. Duration: 0.436375 2023-10-02 08:50:36,034 WARNING [train.py:1204] (2/4) Exclude cut with ID _6456_210_1534_1_1527681634212_3181049_345-252877-0 from training. Duration: 0.64 2023-10-02 08:50:37,455 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_13-205182-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 08:50:40,877 WARNING [train.py:1204] (2/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_427-208901-0 from training. Duration: 0.68 2023-10-02 08:50:46,091 WARNING [train.py:1204] (2/4) Exclude cut with ID _49039_210_6315_1_1533967537541_3846118_9-148921-0_sp1.1 from training. Duration: 0.97275 2023-10-02 08:50:48,872 WARNING [train.py:1204] (2/4) Exclude cut with ID _59493_210_17282_1_1535078419077_3225879_143-70317-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 08:50:48,899 WARNING [train.py:1204] (2/4) Exclude cut with ID _27967_210_12140_1_1532588391859_8419130_1056-236369-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 08:50:50,189 WARNING [train.py:1204] (2/4) Exclude cut with ID _6537_210_1883_1_1527987559710_3597129_382-133062-0_sp0.9 from training. Duration: 0.7555625 2023-10-02 08:50:51,594 WARNING [train.py:1204] (2/4) Exclude cut with ID ID0376W0255-869957 from training. Duration: 0.9510625 2023-10-02 08:50:55,674 WARNING [train.py:1204] (2/4) Exclude cut with ID _49371_210_12071_1_1533952761021_3812190_483-240691-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 08:50:56,936 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_12868_210_6753_1_1530701932_530275_18-76322-0_sp1.1 from training. Duration: 0.7909375 2023-10-02 08:50:56,994 WARNING [train.py:1204] (2/4) Exclude cut with ID _53950_210_5816_1_1534417264919_3862951_367-26344-0_sp1.1 from training. Duration: 0.97275 2023-10-02 08:50:57,020 WARNING [train.py:1204] (2/4) Exclude cut with ID _5444_210_2151_1_1527418808464_5312210_634-194174-0_sp1.1 from training. Duration: 0.8818125 2023-10-02 08:50:57,041 WARNING [train.py:1204] (2/4) Exclude cut with ID _56494_210_4220_1_1535284837625_3033169_69-138150-0 from training. Duration: 0.94 2023-10-02 08:50:58,894 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7543_210_6753_1_1528544010_1110402_25-183860-0_sp0.9 from training. Duration: 0.52225 2023-10-02 08:51:00,277 WARNING [train.py:1204] (2/4) Exclude cut with ID _40284_210_2084_1_1533513679814_3704279_287-191918-0_sp1.1 from training. Duration: 0.963625 2023-10-02 08:51:01,678 WARNING [train.py:1204] (2/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_291-270881-0 from training. Duration: 0.97 2023-10-02 08:51:01,715 WARNING [train.py:1204] (2/4) Exclude cut with ID _12555_210_8750_1_1530075594402_3780239_449-267986-0_sp0.9 from training. Duration: 0.92225 2023-10-02 08:51:04,125 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.3.encoder.layers.2.self_attn2.whiten, num_groups=1, num_channels=512, metric=12.00 vs. limit=22.5 2023-10-02 08:51:07,987 WARNING [train.py:1204] (2/4) Exclude cut with ID _3992_210_1747_1_1526644816031_3731060_268-64520-0_sp1.1 from training. Duration: 0.963625 2023-10-02 08:51:07,994 WARNING [train.py:1204] (2/4) Exclude cut with ID _39411_210_5986_1_1533538825030_3343649_173-238248-0_sp0.9 from training. Duration: 0.92225 2023-10-02 08:51:08,014 WARNING [train.py:1204] (2/4) Exclude cut with ID _32704_210_8954_1_1533017488840_6747370_828-313097-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 08:51:08,098 WARNING [train.py:1204] (2/4) Exclude cut with ID _26907_210_7234_1_1532221740694_3205263_337-2494-0 from training. Duration: 0.88 2023-10-02 08:51:09,487 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_229-45300-0_sp1.1 from training. Duration: 0.5181875 2023-10-02 08:51:10,930 WARNING [train.py:1204] (2/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_300-27235-0_sp1.1 from training. Duration: 0.7909375 2023-10-02 08:51:12,868 WARNING [train.py:1204] (2/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_48-338976-0_sp1.1 from training. Duration: 0.8090625 2023-10-02 08:51:12,978 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.0.balancer2.prob, batch_count=820986.6666666666, ans=0.125 2023-10-02 08:51:15,685 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_13091_210_7925_1_1530609447_8987354_792-195353-0_sp1.1 from training. Duration: 0.936375 2023-10-02 08:51:15,691 WARNING [train.py:1204] (2/4) Exclude cut with ID _56524_210_9642_1_1534572011779_3619170_311-54500-0_sp1.1 from training. Duration: 0.97275 2023-10-02 08:51:18,405 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7543_210_6753_1_1528544010_1110402_25-183860-0 from training. Duration: 0.47 2023-10-02 08:51:19,762 WARNING [train.py:1204] (2/4) Exclude cut with ID 2411-132532-0017-82279-0_sp1.1 from training. Duration: 0.9681875 2023-10-02 08:51:19,763 WARNING [train.py:1204] (2/4) Exclude cut with ID _14170_210_5219_1_1530525949868_4469650_272-249985-0_sp0.9 from training. Duration: 0.7444375 2023-10-02 08:51:21,054 WARNING [train.py:1204] (2/4) Exclude cut with ID _55644_210_11856_1_1534593599582_4103719_320-229006-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 08:51:22,324 WARNING [train.py:1204] (2/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_177-100554-0_sp1.1 from training. Duration: 0.963625 2023-10-02 08:51:22,326 WARNING [train.py:1204] (2/4) Exclude cut with ID _14998_210_5219_1_1530705582080_7474970_787-294416-0_sp1.1 from training. Duration: 0.7454375 2023-10-02 08:51:26,738 WARNING [train.py:1204] (2/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_318-59097-0 from training. Duration: 0.76 2023-10-02 08:51:28,009 WARNING [train.py:1204] (2/4) Exclude cut with ID _12369_210_4381_1_1530356078581_4388520_480-13146-0_sp1.1 from training. Duration: 0.77275 2023-10-02 08:51:29,476 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_469-338558-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 08:51:30,948 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_40896_210_15710_1_1533433956_7100802_367-126897-0_sp1.1 from training. Duration: 0.963625 2023-10-02 08:51:30,965 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6545_210_2040_1_1527774670_4245258_379-14182-0 from training. Duration: 0.8 2023-10-02 08:51:30,976 WARNING [train.py:1204] (2/4) Exclude cut with ID _21631_210_2253_1_1531907880778_3160339_197-79498-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 08:51:30,980 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_12868_210_6753_1_1530702479_3610564_416-293746-0_sp1.1 from training. Duration: 0.8181875 2023-10-02 08:51:31,022 WARNING [train.py:1204] (2/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_52-211683-0 from training. Duration: 0.9 2023-10-02 08:51:37,418 WARNING [train.py:1204] (2/4) Exclude cut with ID _48678_210_5663_1_1533988830947_3769199_306-255886-0_sp0.9 from training. Duration: 0.8555625 2023-10-02 08:51:38,806 WARNING [train.py:1204] (2/4) Exclude cut with ID _65134_210_14763_1_1535335393985_3505368_296-24733-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 08:51:43,670 WARNING [train.py:1204] (2/4) Exclude cut with ID _14898_210_9206_1_1530766812377_4177190_335-99364-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 08:51:45,001 WARNING [train.py:1204] (2/4) Exclude cut with ID _54871_210_22356_1_1534665798253_3856359_19-342632-0 from training. Duration: 0.91 2023-10-02 08:51:45,017 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_53071_210_16554_1_1534490405_2954066_265-170472-0 from training. Duration: 0.83 2023-10-02 08:51:47,786 WARNING [train.py:1204] (2/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_189-298172-0_sp1.1 from training. Duration: 0.963625 2023-10-02 08:51:48,488 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.3.encoder.layers.1.feed_forward2.out_whiten, num_groups=1, num_channels=512, metric=5.03 vs. limit=15.0 2023-10-02 08:51:49,059 INFO [train.py:1046] (2/4) Epoch 24, batch 1000, loss[loss=0.1681, simple_loss=0.2596, pruned_loss=0.03834, over 24657.00 frames. ], tot_loss[loss=0.1722, simple_loss=0.2487, pruned_loss=0.04781, over 4668514.17 frames. ], batch size: 68, lr: 4.30e-03, grad_scale: 8.0 2023-10-02 08:51:51,822 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7615_210_4211_1_1528110965_4580160_72-235211-0 from training. Duration: 0.76 2023-10-02 08:51:51,865 WARNING [train.py:1204] (2/4) Exclude cut with ID _47790_210_16187_1_1533862814066_4207519_155-90874-0_sp1.1 from training. Duration: 0.936375 2023-10-02 08:51:56,659 WARNING [train.py:1204] (2/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_164-216310-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 08:51:59,166 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_46013_210_7234_1_1533776362_3744032_463-52378-0 from training. Duration: 0.86 2023-10-02 08:51:59,170 WARNING [train.py:1204] (2/4) Exclude cut with ID _61133_210_9969_1_1535196777227_3696409_333-270325-0 from training. Duration: 0.93 2023-10-02 08:52:03,696 WARNING [train.py:1204] (2/4) Exclude cut with ID _5172_210_1883_1_1527246062567_3577219_18-101380-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 08:52:03,704 WARNING [train.py:1204] (2/4) Exclude cut with ID _30800_210_22382_1_1532499347904_4515959_577-236858-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 08:52:05,622 WARNING [train.py:1204] (2/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_1006-177345-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 08:52:08,448 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.feed_forward1.out_proj.dropout_p, batch_count=821253.3333333334, ans=0.1 2023-10-02 08:52:09,602 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6291_210_3273_1_1527683649_1078498_4-266348-0 from training. Duration: 0.78 2023-10-02 08:52:11,530 WARNING [train.py:1204] (2/4) Exclude cut with ID _55857_210_2346_1_1534644007831_6898209_745-98391-0 from training. Duration: 0.76 2023-10-02 08:52:12,830 WARNING [train.py:1204] (2/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_375-244976-0 from training. Duration: 0.75 2023-10-02 08:52:12,866 WARNING [train.py:1204] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_4-248404-0_sp1.1 from training. Duration: 0.97275 2023-10-02 08:52:14,223 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8453_210_4211_1_1528502940_8562730_596-306820-0 from training. Duration: 0.97 2023-10-02 08:52:15,610 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_211-339622-0_sp0.9 from training. Duration: 0.5 2023-10-02 08:52:17,018 WARNING [train.py:1204] (2/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_393-57888-0 from training. Duration: 0.64 2023-10-02 08:52:18,293 WARNING [train.py:1204] (2/4) Exclude cut with ID _23825_210_13449_1_1531915313526_3497150_143-343575-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 08:52:18,359 WARNING [train.py:1204] (2/4) Exclude cut with ID _58605_210_12210_1_1534723195939_3904259_36-12256-0_sp1.1 from training. Duration: 0.936375 2023-10-02 08:52:23,907 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.463e+02 1.855e+02 2.029e+02 2.321e+02 3.051e+02, threshold=4.057e+02, percent-clipped=0.0 2023-10-02 08:52:25,832 WARNING [train.py:1204] (2/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_38-67795-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 08:52:25,889 WARNING [train.py:1204] (2/4) Exclude cut with ID _64631_210_27767_1_1535349580068_4165160_308-327832-0_sp1.1 from training. Duration: 0.92725 2023-10-02 08:52:27,244 WARNING [train.py:1204] (2/4) Exclude cut with ID _37086_210_9038_1_1532999540891_7021250_726-81176-0_sp1.1 from training. Duration: 0.936375 2023-10-02 08:52:27,304 WARNING [train.py:1204] (2/4) Exclude cut with ID _47790_210_16187_1_1533862814066_4207519_47-141521-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 08:52:27,308 WARNING [train.py:1204] (2/4) Exclude cut with ID _3067_210_2667_1_1526212974640_4503168_250-285937-0 from training. Duration: 0.8 2023-10-02 08:52:27,331 WARNING [train.py:1204] (2/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_239-24000-0_sp1.1 from training. Duration: 0.97275 2023-10-02 08:52:28,712 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_3371_210_1794_1_1526384462_5580777_396-339812-0_sp0.9 from training. Duration: 0.9444375 2023-10-02 08:52:28,758 WARNING [train.py:1204] (2/4) Exclude cut with ID _16909_210_5724_1_1531292079912_4118919_580-173454-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 08:52:30,578 WARNING [train.py:1204] (2/4) Exclude cut with ID IC0124W0002-61308 from training. Duration: 0.982 2023-10-02 08:52:33,666 WARNING [train.py:1204] (2/4) Exclude cut with ID _31602_210_5854_1_1532677021456_4458250_249-96604-0 from training. Duration: 0.89 2023-10-02 08:52:35,159 WARNING [train.py:1204] (2/4) Exclude cut with ID _69584_210_5219_1_1535785133775_4044450_325-7656-0 from training. Duration: 0.86 2023-10-02 08:52:39,154 WARNING [train.py:1204] (2/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_478-162247-0 from training. Duration: 0.82 2023-10-02 08:52:40,573 WARNING [train.py:1204] (2/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_686-54383-0_sp1.1 from training. Duration: 0.7909375 2023-10-02 08:52:46,736 WARNING [train.py:1204] (2/4) Exclude cut with ID _29303_210_2575_1_1532392237303_7410649_85-270092-0_sp1.1 from training. Duration: 0.936375 2023-10-02 08:52:46,752 WARNING [train.py:1204] (2/4) Exclude cut with ID _2322_210_2667_1_1525604548971_3656299_440-80494-0_sp1.1 from training. Duration: 0.8 2023-10-02 08:52:48,063 WARNING [train.py:1204] (2/4) Exclude cut with ID _47346_210_9476_1_1533815971553_3536160_201-40644-0_sp1.1 from training. Duration: 0.936375 2023-10-02 08:52:48,155 WARNING [train.py:1204] (2/4) Exclude cut with ID _56235_210_10640_1_1534557335235_4161099_343-139898-0_sp1.1 from training. Duration: 0.87275 2023-10-02 08:52:50,827 WARNING [train.py:1204] (2/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_1012-279473-0 from training. Duration: 0.67 2023-10-02 08:52:50,914 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7931_210_1794_1_1528594728_5564059_280-279560-0_sp1.1 from training. Duration: 0.8909375 2023-10-02 08:52:50,950 WARNING [train.py:1204] (2/4) Exclude cut with ID _8263_210_1664_1_1528439452891_970220_68-181805-0 from training. Duration: 0.64 2023-10-02 08:52:52,255 WARNING [train.py:1204] (2/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_605-133395-0 from training. Duration: 0.66 2023-10-02 08:52:52,348 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_43-171109-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 08:52:52,351 WARNING [train.py:1204] (2/4) Exclude cut with ID _40094_210_16380_1_1533294005773_3641159_107-280739-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 08:52:54,995 WARNING [train.py:1204] (2/4) Exclude cut with ID _14821_210_8750_1_1530680503991_6829359_2-227151-0_sp0.9 from training. Duration: 0.92225 2023-10-02 08:52:58,257 WARNING [train.py:1204] (2/4) Exclude cut with ID _14268_210_9206_1_1530622771285_4714099_331-105351-0_sp1.1 from training. Duration: 0.5909375 2023-10-02 08:52:58,449 INFO [scaling.py:1118] (2/4) WithLoss: name=encoder.encoders.2.encoder.layers.0.self_attn_weights, loss-sum=0.000e+00 2023-10-02 08:52:59,686 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_26326_210_7925_1_1533002493_3592406_451-82741-0_sp1.1 from training. Duration: 0.97275 2023-10-02 08:53:02,888 INFO [train.py:1046] (2/4) Epoch 24, batch 1050, loss[loss=0.1815, simple_loss=0.2676, pruned_loss=0.04771, over 24659.00 frames. ], tot_loss[loss=0.1707, simple_loss=0.2473, pruned_loss=0.04703, over 4674098.15 frames. ], batch size: 68, lr: 4.30e-03, grad_scale: 8.0 2023-10-02 08:53:03,157 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.conv_skip_rate, batch_count=821520.0, ans=0.0 2023-10-02 08:53:04,268 WARNING [train.py:1204] (2/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_191-59784-0_sp0.9 from training. Duration: 0.97775 2023-10-02 08:53:04,360 WARNING [train.py:1204] (2/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_445-117398-0_sp1.1 from training. Duration: 0.8090625 2023-10-02 08:53:05,767 WARNING [train.py:1204] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_605-257205-0_sp0.9 from training. Duration: 0.6555625 2023-10-02 08:53:07,146 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_50050_210_16223_1_1535435796_7290138_592-258841-0_sp1.1 from training. Duration: 0.936375 2023-10-02 08:53:09,763 WARNING [train.py:1204] (2/4) Exclude cut with ID _14348_210_8341_1_1530599434996_3778640_240-232370-0_sp0.9 from training. Duration: 0.9333125 2023-10-02 08:53:09,941 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.self_attn_weights.pos_emb_skip_rate, batch_count=821520.0, ans=0.0 2023-10-02 08:53:11,184 WARNING [train.py:1204] (2/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_239-316802-0_sp0.9 from training. Duration: 0.7666875 2023-10-02 08:53:13,105 WARNING [train.py:1204] (2/4) Exclude cut with ID _45560_210_21982_1_1533812603951_3584999_111-6376-0_sp1.1 from training. Duration: 0.763625 2023-10-02 08:53:14,738 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.feed_forward1.out_proj.dropout_p, batch_count=821520.0, ans=0.1 2023-10-02 08:53:15,939 WARNING [train.py:1204] (2/4) Exclude cut with ID _54189_210_16380_1_1534332670039_3911710_606-311481-0_sp1.1 from training. Duration: 0.963625 2023-10-02 08:53:17,369 WARNING [train.py:1204] (2/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_237-332027-0_sp1.1 from training. Duration: 0.863625 2023-10-02 08:53:18,607 WARNING [train.py:1204] (2/4) Exclude cut with ID _66070_210_7640_1_1535441393096_7805310_489-292631-0_sp0.9 from training. Duration: 0.87775 2023-10-02 08:53:18,681 WARNING [train.py:1204] (2/4) Exclude cut with ID _49728_210_12651_1_1534041034294_6788150_624-195301-0_sp1.1 from training. Duration: 0.77275 2023-10-02 08:53:19,928 WARNING [train.py:1204] (2/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_44-79427-0 from training. Duration: 0.98 2023-10-02 08:53:19,989 WARNING [train.py:1204] (2/4) Exclude cut with ID _35350_210_16040_1_1533040191183_4075299_490-198354-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 08:53:20,021 WARNING [train.py:1204] (2/4) Exclude cut with ID _5166_210_2084_1_1527073197160_4087993_309-222545-0 from training. Duration: 0.84 2023-10-02 08:53:21,652 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_13091_210_7925_1_1530609447_8987354_790-199469-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 08:53:21,658 WARNING [train.py:1204] (2/4) Exclude cut with ID _21632_210_2253_1_1531994001765_3744140_110-274654-0 from training. Duration: 0.82 2023-10-02 08:53:21,663 WARNING [train.py:1204] (2/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_498-40153-0_sp0.9 from training. Duration: 0.7 2023-10-02 08:53:23,215 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.bypass_mid.scale_min, batch_count=821586.6666666666, ans=0.2 2023-10-02 08:53:30,044 WARNING [train.py:1204] (2/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_363-282909-0_sp1.1 from training. Duration: 0.936375 2023-10-02 08:53:31,909 WARNING [train.py:1204] (2/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_178-177582-0_sp0.9 from training. Duration: 0.82225 2023-10-02 08:53:31,926 WARNING [train.py:1204] (2/4) Exclude cut with ID _24612_210_8913_1_1531998054193_3806359_357-35487-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 08:53:33,551 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=821653.3333333334, ans=0.1 2023-10-02 08:53:35,240 WARNING [train.py:1204] (2/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_138-332773-0 from training. Duration: 0.81 2023-10-02 08:53:35,259 WARNING [train.py:1204] (2/4) Exclude cut with ID _7624_210_1531_1_1528109802198_4933101_418-349834-0 from training. Duration: 0.6 2023-10-02 08:53:35,287 WARNING [train.py:1204] (2/4) Exclude cut with ID _7537_210_2084_1_1528502395431_3924359_14-312906-0_sp0.9 from training. Duration: 0.9333125 2023-10-02 08:53:35,599 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.balancer2.prob, batch_count=821653.3333333334, ans=0.125 2023-10-02 08:53:36,756 WARNING [train.py:1204] (2/4) Exclude cut with ID _60057_210_10640_1_1534903065793_3333150_362-230377-0 from training. Duration: 0.87 2023-10-02 08:53:38,387 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.balancer2.prob, batch_count=821653.3333333334, ans=0.125 2023-10-02 08:53:39,497 WARNING [train.py:1204] (2/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_138-64210-0 from training. Duration: 0.73 2023-10-02 08:53:39,993 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.5.encoder.layers.0.self_attn2.whiten, num_groups=1, num_channels=256, metric=11.93 vs. limit=22.5 2023-10-02 08:53:41,229 WARNING [train.py:1204] (2/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_454-339504-0_sp1.1 from training. Duration: 0.97275 2023-10-02 08:53:45,356 WARNING [train.py:1204] (2/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_508-335369-0_sp0.9 from training. Duration: 0.5333125 2023-10-02 08:53:46,729 WARNING [train.py:1204] (2/4) Exclude cut with ID _5608_210_2106_1_1527415439089_6819768_909-82767-0_sp0.9 from training. Duration: 0.67775 2023-10-02 08:53:46,760 WARNING [train.py:1204] (2/4) Exclude cut with ID _6694_210_4599_1_1527852637490_3965529_99-11611-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 08:53:46,821 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_2655_210_2087_1_1525688959_3897400_52-279899-0_sp1.1 from training. Duration: 0.62725 2023-10-02 08:53:50,985 WARNING [train.py:1204] (2/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_375-245677-0_sp1.1 from training. Duration: 0.736375 2023-10-02 08:53:53,728 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_23850_210_4238_1_1532232597_1632433_32-185135-0 from training. Duration: 0.86 2023-10-02 08:53:56,502 WARNING [train.py:1204] (2/4) Exclude cut with ID _15113_210_8254_1_1530842438579_3716279_209-54533-0 from training. Duration: 0.96 2023-10-02 08:53:57,785 WARNING [train.py:1204] (2/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_401-53423-0 from training. Duration: 0.55 2023-10-02 08:53:58,404 WARNING [train.py:1204] (2/4) Exclude cut with ID _14867_210_1968_1_1530684179890_3846969_319-250868-0_sp1.1 from training. Duration: 0.9 2023-10-02 08:53:58,425 WARNING [train.py:1204] (2/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_106-30782-0_sp0.9 from training. Duration: 0.888875 2023-10-02 08:53:59,672 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_5960_210_1794_1_1527403865_3990999_515-2613-0 from training. Duration: 0.88 2023-10-02 08:54:04,420 WARNING [train.py:1204] (2/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_798-106179-0_sp0.9 from training. Duration: 0.92225 2023-10-02 08:54:04,555 WARNING [train.py:1204] (2/4) Exclude cut with ID _14227_210_4381_1_1530701804286_3940259_125-275642-0_sp1.1 from training. Duration: 0.9 2023-10-02 08:54:04,557 WARNING [train.py:1204] (2/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_79-200392-0_sp1.1 from training. Duration: 0.8818125 2023-10-02 08:54:06,322 WARNING [train.py:1204] (2/4) Exclude cut with ID _48836_210_12210_1_1534075848293_3904150_429-35475-0_sp0.9 from training. Duration: 0.988875 2023-10-02 08:54:06,338 WARNING [train.py:1204] (2/4) Exclude cut with ID _69884_210_9771_1_1535846392818_4166169_224-172517-0_sp1.1 from training. Duration: 0.97275 2023-10-02 08:54:10,438 WARNING [train.py:1204] (2/4) Exclude cut with ID _15113_210_8254_1_1530842438579_3716279_76-232507-0_sp1.1 from training. Duration: 0.97275 2023-10-02 08:54:10,449 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_135-5501-0 from training. Duration: 0.98 2023-10-02 08:54:12,276 WARNING [train.py:1204] (2/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_563-347832-0_sp0.9 from training. Duration: 0.988875 2023-10-02 08:54:12,288 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6786_210_1388_1_1527839699_4194495_273-149841-0 from training. Duration: 0.92 2023-10-02 08:54:12,314 WARNING [train.py:1204] (2/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_172-215932-0 from training. Duration: 0.66 2023-10-02 08:54:12,369 WARNING [train.py:1204] (2/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_939-132910-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 08:54:16,459 INFO [train.py:1046] (2/4) Epoch 24, batch 1100, loss[loss=0.1571, simple_loss=0.2225, pruned_loss=0.04583, over 22894.00 frames. ], tot_loss[loss=0.1704, simple_loss=0.247, pruned_loss=0.04692, over 4664643.13 frames. ], batch size: 322, lr: 4.30e-03, grad_scale: 8.0 2023-10-02 08:54:16,528 WARNING [train.py:1204] (2/4) Exclude cut with ID _49371_210_12071_1_1533952761021_3812190_16-246001-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 08:54:21,861 WARNING [train.py:1204] (2/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_680-189165-0_sp1.1 from training. Duration: 0.936375 2023-10-02 08:54:25,951 WARNING [train.py:1204] (2/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_53-300544-0_sp1.1 from training. Duration: 0.6090625 2023-10-02 08:54:27,332 WARNING [train.py:1204] (2/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_540-275685-0_sp1.1 from training. Duration: 0.8090625 2023-10-02 08:54:27,354 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_38965_210_4852_1_1533174906_3484735_453-106579-0_sp1.1 from training. Duration: 0.92725 2023-10-02 08:54:27,393 WARNING [train.py:1204] (2/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_105-328174-0 from training. Duration: 0.76 2023-10-02 08:54:29,309 WARNING [train.py:1204] (2/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_279-126205-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 08:54:32,144 WARNING [train.py:1204] (2/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_5-55893-0_sp0.9 from training. Duration: 0.611125 2023-10-02 08:54:34,083 WARNING [train.py:1204] (2/4) Exclude cut with ID _13678_210_7497_1_1530530069226_3463170_141-10835-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 08:54:35,502 WARNING [train.py:1204] (2/4) Exclude cut with ID _22083_210_8254_1_1531702862439_3678970_201-85738-0_sp1.1 from training. Duration: 0.8181875 2023-10-02 08:54:35,528 WARNING [train.py:1204] (2/4) Exclude cut with ID _4697_210_2069_1_1526815801953_3823098_51-1049-0 from training. Duration: 0.75 2023-10-02 08:54:37,458 WARNING [train.py:1204] (2/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_206-10097-0_sp0.9 from training. Duration: 0.5444375 2023-10-02 08:54:38,850 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_4528_210_3273_1_1527076654_3276777_360-63661-0_sp1.1 from training. Duration: 0.92725 2023-10-02 08:54:38,857 WARNING [train.py:1204] (2/4) Exclude cut with ID _64086_210_7994_1_1535270417019_7435949_449-31567-0_sp1.1 from training. Duration: 0.8818125 2023-10-02 08:54:38,985 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.0.conv_module2.balancer2.prob, batch_count=821920.0, ans=0.125 2023-10-02 08:54:40,369 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.bypass.scale_min, batch_count=821920.0, ans=0.2 2023-10-02 08:54:41,969 WARNING [train.py:1204] (2/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_245-174553-0_sp0.9 from training. Duration: 0.97775 2023-10-02 08:54:44,673 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6545_210_2040_1_1527774670_4245258_379-14182-0_sp1.1 from training. Duration: 0.72725 2023-10-02 08:54:48,666 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_3133_210_2087_1_1526106446_2324690_256-206070-0_sp1.1 from training. Duration: 0.963625 2023-10-02 08:54:51,383 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.541e+02 1.803e+02 1.981e+02 2.331e+02 3.257e+02, threshold=3.962e+02, percent-clipped=0.0 2023-10-02 08:54:51,465 WARNING [train.py:1204] (2/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_278-104676-0 from training. Duration: 0.98 2023-10-02 08:54:51,532 WARNING [train.py:1204] (2/4) Exclude cut with ID IC0671W0065-331908 from training. Duration: 0.896 2023-10-02 08:54:51,570 WARNING [train.py:1204] (2/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_244-129917-0_sp1.1 from training. Duration: 0.97275 2023-10-02 08:54:53,022 WARNING [train.py:1204] (2/4) Exclude cut with ID _7852_210_5219_1_1528286471316_8139400_1129-170857-0_sp1.1 from training. Duration: 0.97275 2023-10-02 08:54:54,344 WARNING [train.py:1204] (2/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_443-30034-0_sp0.9 from training. Duration: 0.711125 2023-10-02 08:54:54,378 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_31583_210_3852_1_1532595851_4024413_250-332999-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 08:54:56,321 WARNING [train.py:1204] (2/4) Exclude cut with ID _8772_210_1850_1_1528635595972_3664011_247-336448-0 from training. Duration: 0.74 2023-10-02 08:54:57,523 WARNING [train.py:1204] (2/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_567-135114-0_sp0.9 from training. Duration: 0.9666875 2023-10-02 08:54:57,526 WARNING [train.py:1204] (2/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_557-349534-0_sp1.1 from training. Duration: 0.82725 2023-10-02 08:54:57,537 WARNING [train.py:1204] (2/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_1341-26728-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 08:54:58,877 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_40222_210_6228_1_1533385298_4481939_375-328051-0_sp1.1 from training. Duration: 0.97275 2023-10-02 08:54:58,897 WARNING [train.py:1204] (2/4) Exclude cut with ID _4697_210_2069_1_1526815801953_3823098_276-96593-0 from training. Duration: 0.77 2023-10-02 08:55:05,065 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_53071_210_16554_1_1534490405_2954066_265-170472-0_sp0.9 from training. Duration: 0.92225 2023-10-02 08:55:06,323 WARNING [train.py:1204] (2/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_365-1492-0 from training. Duration: 0.91 2023-10-02 08:55:07,702 WARNING [train.py:1204] (2/4) Exclude cut with ID _66873_210_5517_1_1535454136506_4293370_654-52713-0_sp0.9 from training. Duration: 0.8555625 2023-10-02 08:55:14,223 WARNING [train.py:1204] (2/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_113-92608-0_sp1.1 from training. Duration: 0.4909375 2023-10-02 08:55:14,456 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=822120.0, ans=0.1 2023-10-02 08:55:16,939 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_46013_210_7234_1_1533776362_3744032_50-141535-0 from training. Duration: 0.99 2023-10-02 08:55:16,947 WARNING [train.py:1204] (2/4) Exclude cut with ID _13942_210_9988_1_1530604906036_3733360_499-55676-0_sp1.1 from training. Duration: 0.563625 2023-10-02 08:55:18,414 WARNING [train.py:1204] (2/4) Exclude cut with ID _30800_210_22382_1_1532499347904_4515959_375-65143-0_sp1.1 from training. Duration: 0.97275 2023-10-02 08:55:21,142 WARNING [train.py:1204] (2/4) Exclude cut with ID _16756_210_4915_1_1531047803763_7104259_199-325608-0_sp1.1 from training. Duration: 0.92725 2023-10-02 08:55:21,182 WARNING [train.py:1204] (2/4) Exclude cut with ID _31649_210_6228_1_1532584608063_4091078_443-124391-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 08:55:22,484 WARNING [train.py:1204] (2/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_146-218763-0 from training. Duration: 0.86 2023-10-02 08:55:22,540 WARNING [train.py:1204] (2/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_567-135114-0_sp1.1 from training. Duration: 0.7909375 2023-10-02 08:55:23,840 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_26326_210_7925_1_1533002493_3592406_462-288299-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 08:55:23,926 WARNING [train.py:1204] (2/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_322-217220-0 from training. Duration: 0.86 2023-10-02 08:55:25,209 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_238-256668-0_sp1.1 from training. Duration: 0.62725 2023-10-02 08:55:25,252 WARNING [train.py:1204] (2/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_371-18169-0 from training. Duration: 0.86 2023-10-02 08:55:27,134 WARNING [train.py:1204] (2/4) Exclude cut with ID _58480_210_12392_1_1534811121111_3646160_23-74512-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 08:55:27,145 WARNING [train.py:1204] (2/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_784-213099-0_sp1.1 from training. Duration: 0.8454375 2023-10-02 08:55:28,516 WARNING [train.py:1204] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_625-102955-0_sp1.1 from training. Duration: 0.7 2023-10-02 08:55:29,867 INFO [train.py:1046] (2/4) Epoch 24, batch 1150, loss[loss=0.1652, simple_loss=0.2479, pruned_loss=0.0413, over 24321.00 frames. ], tot_loss[loss=0.1711, simple_loss=0.2474, pruned_loss=0.04739, over 4662320.54 frames. ], batch size: 61, lr: 4.30e-03, grad_scale: 8.0 2023-10-02 08:55:31,869 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.4.encoder.layers.0.feed_forward1.out_whiten, num_groups=1, num_channels=384, metric=5.54 vs. limit=15.0 2023-10-02 08:55:33,891 WARNING [train.py:1204] (2/4) Exclude cut with ID _49748_210_24116_1_1533986971737_3158910_176-174327-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 08:55:35,286 WARNING [train.py:1204] (2/4) Exclude cut with ID _50310_210_15488_1_1534068043345_3641080_432-96140-0_sp1.1 from training. Duration: 0.936375 2023-10-02 08:55:37,227 WARNING [train.py:1204] (2/4) Exclude cut with ID _40402_210_18219_1_1533522324115_2953150_76-14910-0_sp1.1 from training. Duration: 0.92725 2023-10-02 08:55:37,248 WARNING [train.py:1204] (2/4) Exclude cut with ID _21783_210_11357_1_1531742458730_3762679_40-19458-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 08:55:37,279 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_14056_210_4163_1_1530604483_7572827_46-93547-0 from training. Duration: 0.95 2023-10-02 08:55:37,314 WARNING [train.py:1204] (2/4) Exclude cut with ID _21631_210_2253_1_1531907880778_3160339_321-35325-0_sp1.1 from training. Duration: 0.963625 2023-10-02 08:55:39,047 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.bypass.skip_rate, batch_count=822186.6666666666, ans=0.07 2023-10-02 08:55:42,107 WARNING [train.py:1204] (2/4) Exclude cut with ID _4779_210_2084_1_1527318022944_3914893_500-285013-0 from training. Duration: 0.69 2023-10-02 08:55:43,407 WARNING [train.py:1204] (2/4) Exclude cut with ID _43552_210_8913_1_1533726031059_3523429_301-93324-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 08:55:43,419 WARNING [train.py:1204] (2/4) Exclude cut with ID _6473_210_1741_1_1527901307396_3815380_108-17335-0_sp1.1 from training. Duration: 0.5909375 2023-10-02 08:55:45,611 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.feed_forward1.hidden_balancer.prob, batch_count=822253.3333333334, ans=0.125 2023-10-02 08:55:48,191 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.0.conv_module2.balancer2.prob, batch_count=822253.3333333334, ans=0.125 2023-10-02 08:55:50,747 WARNING [train.py:1204] (2/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_206-10097-0 from training. Duration: 0.49 2023-10-02 08:55:52,096 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_36756_210_7234_1_1533026991_966276_103-77513-0_sp1.1 from training. Duration: 0.97275 2023-10-02 08:55:54,932 WARNING [train.py:1204] (2/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_662-18193-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 08:55:56,224 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_26433_210_12108_1_1532157158_4056480_439-52438-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 08:55:56,276 WARNING [train.py:1204] (2/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_464-33931-0 from training. Duration: 0.54 2023-10-02 08:55:56,282 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_3474_210_2028_1_1526188513_4825246_145-309131-0_sp0.9 from training. Duration: 0.82225 2023-10-02 08:55:56,300 WARNING [train.py:1204] (2/4) Exclude cut with ID _9679_210_7497_1_1529129212906_4363929_446-308321-0_sp1.1 from training. Duration: 0.963625 2023-10-02 08:56:00,773 WARNING [train.py:1204] (2/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_662-275621-0 from training. Duration: 0.42 2023-10-02 08:56:02,103 WARNING [train.py:1204] (2/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_344-337148-0_sp1.1 from training. Duration: 0.97275 2023-10-02 08:56:03,480 WARNING [train.py:1204] (2/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_759-179071-0_sp1.1 from training. Duration: 0.92725 2023-10-02 08:56:14,129 WARNING [train.py:1204] (2/4) Exclude cut with ID _28783_210_7943_1_1532671164808_7155479_293-12188-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 08:56:20,127 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_17613_210_2151_1_1531137569_2366160_235-320304-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 08:56:20,148 WARNING [train.py:1204] (2/4) Exclude cut with ID _14349_210_8341_1_1530615710818_3442470_170-336541-0 from training. Duration: 0.86 2023-10-02 08:56:21,599 WARNING [train.py:1204] (2/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_375-218677-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 08:56:21,647 WARNING [train.py:1204] (2/4) Exclude cut with ID _28317_210_12742_1_1532480302381_8510325_829-202956-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 08:56:25,950 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9029W0436-509823 from training. Duration: 0.768 2023-10-02 08:56:28,557 WARNING [train.py:1204] (2/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_231-112214-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 08:56:36,016 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9059W0470-524883 from training. Duration: 0.3415625 2023-10-02 08:56:40,161 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_43266_210_1794_1_1533521789_4497218_206-79512-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 08:56:40,288 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_294-163793-0_sp0.9 from training. Duration: 0.911125 2023-10-02 08:56:40,313 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6290_210_3273_1_1527595224_1282787_115-87095-0_sp1.1 from training. Duration: 0.5 2023-10-02 08:56:41,623 WARNING [train.py:1204] (2/4) Exclude cut with ID _14100_210_8341_1_1530529320613_3678069_279-55760-0_sp1.1 from training. Duration: 0.8454375 2023-10-02 08:56:43,603 INFO [train.py:1046] (2/4) Epoch 24, batch 1200, loss[loss=0.1741, simple_loss=0.2467, pruned_loss=0.05082, over 23574.00 frames. ], tot_loss[loss=0.1722, simple_loss=0.2487, pruned_loss=0.04789, over 4682278.70 frames. ], batch size: 120, lr: 4.30e-03, grad_scale: 16.0 2023-10-02 08:56:43,717 WARNING [train.py:1204] (2/4) Exclude cut with ID _14621_210_1925_1_1530686901185_3675420_419-234200-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 08:56:49,900 WARNING [train.py:1204] (2/4) Exclude cut with ID _60229_210_27767_1_1534838406158_3655350_353-213656-0_sp1.1 from training. Duration: 0.863625 2023-10-02 08:56:49,908 WARNING [train.py:1204] (2/4) Exclude cut with ID _48678_210_5663_1_1533988830947_3769199_306-255886-0_sp1.1 from training. Duration: 0.7 2023-10-02 08:56:52,572 WARNING [train.py:1204] (2/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_466-3492-0_sp1.1 from training. Duration: 0.97275 2023-10-02 08:56:52,573 WARNING [train.py:1204] (2/4) Exclude cut with ID _8755_210_1876_1_1528628559357_3258209_199-242167-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 08:56:52,633 WARNING [train.py:1204] (2/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_237-89684-0_sp1.1 from training. Duration: 0.87275 2023-10-02 08:56:54,005 WARNING [train.py:1204] (2/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_70-84121-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 08:56:55,434 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7544_210_6753_1_1528587722_3576686_172-171750-0_sp1.1 from training. Duration: 0.5909375 2023-10-02 08:56:56,796 WARNING [train.py:1204] (2/4) Exclude cut with ID _24271_210_12658_1_1531987270022_7190519_40-321822-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 08:56:56,821 WARNING [train.py:1204] (2/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_227-83612-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 08:56:59,638 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9156W0015-572508 from training. Duration: 0.896 2023-10-02 08:57:02,821 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_292-265928-0 from training. Duration: 0.51 2023-10-02 08:57:02,994 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.0.bypass.skip_rate, batch_count=822586.6666666666, ans=0.035 2023-10-02 08:57:07,065 WARNING [train.py:1204] (2/4) Exclude cut with ID _14747_210_8341_1_1530685797463_3654089_147-131229-0_sp1.1 from training. Duration: 0.6909375 2023-10-02 08:57:09,678 WARNING [train.py:1204] (2/4) Exclude cut with ID _45297_210_2721_1_1533715351381_3264949_351-147136-0_sp0.9 from training. Duration: 0.9666875 2023-10-02 08:57:11,194 WARNING [train.py:1204] (2/4) Exclude cut with ID _5250_210_5219_1_1527249561806_8692110_1102-324568-0_sp1.1 from training. Duration: 0.97275 2023-10-02 08:57:12,722 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.out_combiner.scale_min, batch_count=822653.3333333334, ans=0.2 2023-10-02 08:57:13,899 WARNING [train.py:1204] (2/4) Exclude cut with ID _18516_210_9206_1_1531704653206_3692406_465-75446-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 08:57:13,901 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9203W0126-595623 from training. Duration: 0.896 2023-10-02 08:57:13,979 WARNING [train.py:1204] (2/4) Exclude cut with ID _55383_210_10635_1_1534468426172_2231669_139-328410-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 08:57:16,728 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.0.layers.0.whiten, num_groups=1, num_channels=192, metric=4.32 vs. limit=12.0 2023-10-02 08:57:19,061 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.560e+02 1.901e+02 2.150e+02 2.548e+02 4.088e+02, threshold=4.300e+02, percent-clipped=1.0 2023-10-02 08:57:22,022 WARNING [train.py:1204] (2/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_166-246557-0_sp0.9 from training. Duration: 0.711125 2023-10-02 08:57:22,026 WARNING [train.py:1204] (2/4) Exclude cut with ID _30039_210_13270_1_1532484612900_2725150_239-304872-0_sp1.1 from training. Duration: 0.963625 2023-10-02 08:57:23,291 WARNING [train.py:1204] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_6-226750-0 from training. Duration: 0.94 2023-10-02 08:57:23,383 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_135-5501-0_sp1.1 from training. Duration: 0.8909375 2023-10-02 08:57:27,483 WARNING [train.py:1204] (2/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_359-118926-0 from training. Duration: 0.72 2023-10-02 08:57:30,228 WARNING [train.py:1204] (2/4) Exclude cut with ID _8064_210_5399_1_1528372299291_4282973_4-91037-0 from training. Duration: 0.88 2023-10-02 08:57:31,477 WARNING [train.py:1204] (2/4) Exclude cut with ID _6653_210_2629_1_1527919172441_6732679_415-288492-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 08:57:31,548 WARNING [train.py:1204] (2/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_544-287179-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 08:57:32,283 WARNING [train.py:1204] (2/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_122-32606-0_sp1.1 from training. Duration: 0.92725 2023-10-02 08:57:33,548 WARNING [train.py:1204] (2/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_121-284152-0_sp1.1 from training. Duration: 0.536375 2023-10-02 08:57:33,641 WARNING [train.py:1204] (2/4) Exclude cut with ID _44718_210_5816_1_1534050051869_6469030_350-299477-0_sp1.1 from training. Duration: 0.97275 2023-10-02 08:57:34,949 WARNING [train.py:1204] (2/4) Exclude cut with ID _6653_210_2629_1_1527919172441_6732679_1103-56445-0_sp0.9 from training. Duration: 0.811125 2023-10-02 08:57:35,016 WARNING [train.py:1204] (2/4) Exclude cut with ID _12369_210_4381_1_1530356078581_4388520_64-298917-0_sp0.9 from training. Duration: 0.97775 2023-10-02 08:57:36,289 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_2245_210_2715_1_1525511548_3540254_310-52483-0 from training. Duration: 0.88 2023-10-02 08:57:36,339 WARNING [train.py:1204] (2/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_401-158239-0_sp1.1 from training. Duration: 0.6454375 2023-10-02 08:57:37,639 WARNING [train.py:1204] (2/4) Exclude cut with ID _59502_210_12210_1_1535107024698_1835139_146-162495-0_sp1.1 from training. Duration: 0.836375 2023-10-02 08:57:37,649 WARNING [train.py:1204] (2/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_41-31157-0_sp0.9 from training. Duration: 0.6333125 2023-10-02 08:57:40,242 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_68717_210_3528_1_1535628683_1556759_95-36722-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 08:57:40,251 WARNING [train.py:1204] (2/4) Exclude cut with ID _30196_210_1664_1_1533708496522_7074140_654-162611-0_sp1.1 from training. Duration: 0.92725 2023-10-02 08:57:43,699 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_9302_210_2550_1_1528889962_4655416_638-301620-0_sp0.9 from training. Duration: 0.52225 2023-10-02 08:57:46,388 WARNING [train.py:1204] (2/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_478-162247-0_sp1.1 from training. Duration: 0.7454375 2023-10-02 08:57:46,547 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.feed_forward3.hidden_balancer.prob, batch_count=822786.6666666666, ans=0.125 2023-10-02 08:57:49,751 WARNING [train.py:1204] (2/4) Exclude cut with ID _45297_210_2721_1_1533715351381_3264949_353-9541-0 from training. Duration: 0.97 2023-10-02 08:57:54,537 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9653W0190-675468 from training. Duration: 0.9935 2023-10-02 08:57:55,919 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_49730_210_13049_1_1534075294_1830676_214-84436-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 08:57:57,172 INFO [train.py:1046] (2/4) Epoch 24, batch 1250, loss[loss=0.2043, simple_loss=0.2679, pruned_loss=0.07035, over 22700.00 frames. ], tot_loss[loss=0.1728, simple_loss=0.2491, pruned_loss=0.04822, over 4696055.98 frames. ], batch size: 322, lr: 4.30e-03, grad_scale: 16.0 2023-10-02 08:57:57,332 WARNING [train.py:1204] (2/4) Exclude cut with ID _50093_210_4220_1_1534417141207_3432299_259-322252-0_sp1.1 from training. Duration: 0.836375 2023-10-02 08:57:59,223 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.4.encoder.layers.2.nonlin_attention.whiten1, num_groups=1, num_channels=288, metric=4.05 vs. limit=10.0 2023-10-02 08:58:00,006 WARNING [train.py:1204] (2/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_599-50220-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 08:58:00,109 WARNING [train.py:1204] (2/4) Exclude cut with ID _21878_210_3525_1_1531738772002_7264330_174-109016-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 08:58:03,398 WARNING [train.py:1204] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_665-196155-0 from training. Duration: 0.67 2023-10-02 08:58:06,207 WARNING [train.py:1204] (2/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_656-188361-0_sp1.1 from training. Duration: 0.7909375 2023-10-02 08:58:08,230 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.3.encoder.layers.2.self_attn1.whiten, num_groups=1, num_channels=512, metric=12.74 vs. limit=22.5 2023-10-02 08:58:08,926 WARNING [train.py:1204] (2/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_60-18123-0_sp1.1 from training. Duration: 0.8 2023-10-02 08:58:10,241 WARNING [train.py:1204] (2/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_535-14410-0 from training. Duration: 0.47 2023-10-02 08:58:11,708 WARNING [train.py:1204] (2/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_94-126128-0_sp1.1 from training. Duration: 0.7818125 2023-10-02 08:58:13,055 WARNING [train.py:1204] (2/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_356-31216-0_sp1.1 from training. Duration: 0.6818125 2023-10-02 08:58:16,123 INFO [scaling.py:1118] (2/4) WithLoss: name=encoder.encoders.4.encoder.layers.2.self_attn_weights, loss-sum=0.000e+00 2023-10-02 08:58:17,234 WARNING [train.py:1204] (2/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_443-30034-0_sp1.1 from training. Duration: 0.5818125 2023-10-02 08:58:17,297 WARNING [train.py:1204] (2/4) Exclude cut with ID _26907_210_7234_1_1532221740694_3205263_337-2494-0_sp1.1 from training. Duration: 0.8 2023-10-02 08:58:19,226 WARNING [train.py:1204] (2/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_49-97158-0_sp0.9 from training. Duration: 0.8444375 2023-10-02 08:58:19,236 WARNING [train.py:1204] (2/4) Exclude cut with ID _7877_210_6014_1_1528286358099_3750136_553-170128-0_sp1.1 from training. Duration: 0.963625 2023-10-02 08:58:22,429 WARNING [train.py:1204] (2/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_202-293523-0_sp0.9 from training. Duration: 0.8 2023-10-02 08:58:25,809 WARNING [train.py:1204] (2/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_188-171673-0_sp0.9 from training. Duration: 0.6444375 2023-10-02 08:58:27,100 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_120-41668-0_sp1.1 from training. Duration: 0.636375 2023-10-02 08:58:27,105 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_40223_210_6228_1_1533298404_4812267_613-78769-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 08:58:28,452 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_53861_210_5399_1_1534422156_4272077_150-336899-0_sp1.1 from training. Duration: 0.963625 2023-10-02 08:58:28,509 WARNING [train.py:1204] (2/4) Exclude cut with ID _14459_210_2228_1_1531443641946_7516700_1146-56843-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 08:58:31,478 WARNING [train.py:1204] (2/4) Exclude cut with ID _39867_210_9826_1_1533553222030_9344259_1194-299928-0_sp1.1 from training. Duration: 0.92725 2023-10-02 08:58:34,729 WARNING [train.py:1204] (2/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_214-291369-0_sp0.9 from training. Duration: 0.7 2023-10-02 08:58:37,553 WARNING [train.py:1204] (2/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_358-215558-0 from training. Duration: 0.93 2023-10-02 08:58:38,855 WARNING [train.py:1204] (2/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_577-287150-0_sp0.9 from training. Duration: 0.911125 2023-10-02 08:58:39,009 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.1.conv_module1.balancer2.prob, batch_count=822986.6666666666, ans=0.125 2023-10-02 08:58:40,360 WARNING [train.py:1204] (2/4) Exclude cut with ID _60860_210_8254_1_1534896015875_3557169_229-187367-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 08:58:41,713 WARNING [train.py:1204] (2/4) Exclude cut with ID _6275_210_2084_1_1528110062213_7669049_857-298942-0 from training. Duration: 0.85 2023-10-02 08:58:43,098 WARNING [train.py:1204] (2/4) Exclude cut with ID _40137_210_12210_1_1533290484046_3795180_249-187059-0_sp1.1 from training. Duration: 0.8 2023-10-02 08:58:43,111 WARNING [train.py:1204] (2/4) Exclude cut with ID ID0171W0508-765603 from training. Duration: 0.541 2023-10-02 08:58:43,134 WARNING [train.py:1204] (2/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_521-219439-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 08:58:44,375 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_40223_210_6228_1_1533298404_4812267_741-302616-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 08:58:46,087 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.ff2_skip_rate, batch_count=823053.3333333334, ans=0.0 2023-10-02 08:58:47,092 WARNING [train.py:1204] (2/4) Exclude cut with ID _65293_210_9070_1_1535545549765_4004210_438-154735-0_sp1.1 from training. Duration: 0.92725 2023-10-02 08:58:50,442 WARNING [train.py:1204] (2/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_564-132950-0_sp1.1 from training. Duration: 0.92725 2023-10-02 08:58:50,479 WARNING [train.py:1204] (2/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_291-270881-0_sp1.1 from training. Duration: 0.8818125 2023-10-02 08:58:50,581 WARNING [train.py:1204] (2/4) Exclude cut with ID _60229_210_27767_1_1534838406158_3655350_353-213656-0 from training. Duration: 0.95 2023-10-02 08:58:51,864 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_3133_210_2087_1_1526106446_2324690_243-143119-0 from training. Duration: 0.77 2023-10-02 08:58:51,886 WARNING [train.py:1204] (2/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_690-126306-0 from training. Duration: 0.78 2023-10-02 08:58:55,095 WARNING [train.py:1204] (2/4) Exclude cut with ID _30304_210_6319_1_1532476827261_7582060_347-95003-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 08:58:57,012 WARNING [train.py:1204] (2/4) Exclude cut with ID _2322_210_2667_1_1525604548971_3656299_440-80494-0 from training. Duration: 0.88 2023-10-02 08:58:57,035 WARNING [train.py:1204] (2/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_370-229434-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 08:58:59,762 WARNING [train.py:1204] (2/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_124-345840-0_sp1.1 from training. Duration: 0.47275 2023-10-02 08:59:00,287 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.4.encoder.layers.2.feed_forward3.out_whiten, num_groups=1, num_channels=384, metric=12.60 vs. limit=15.0 2023-10-02 08:59:01,138 WARNING [train.py:1204] (2/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_165-123581-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 08:59:02,498 WARNING [train.py:1204] (2/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_453-282588-0 from training. Duration: 0.98 2023-10-02 08:59:03,833 WARNING [train.py:1204] (2/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_299-34413-0_sp0.9 from training. Duration: 0.72225 2023-10-02 08:59:03,880 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_40036_210_13405_1_1533648647_1315388_48-146016-0_sp1.1 from training. Duration: 0.8454375 2023-10-02 08:59:03,900 WARNING [train.py:1204] (2/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_99-167392-0_sp0.9 from training. Duration: 0.67775 2023-10-02 08:59:05,611 WARNING [train.py:1204] (2/4) Exclude cut with ID _48677_210_5663_1_1533902405919_3892989_37-207114-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 08:59:06,978 WARNING [train.py:1204] (2/4) Exclude cut with ID _29857_210_20034_1_1532395886753_3798160_335-163562-0 from training. Duration: 0.85 2023-10-02 08:59:08,387 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_36492_210_7927_1_1533261987_4790834_703-343351-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 08:59:09,879 WARNING [train.py:1204] (2/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_80-147301-0_sp0.9 from training. Duration: 0.9333125 2023-10-02 08:59:11,135 INFO [train.py:1046] (2/4) Epoch 24, batch 1300, loss[loss=0.1572, simple_loss=0.2359, pruned_loss=0.03923, over 24478.00 frames. ], tot_loss[loss=0.1732, simple_loss=0.2497, pruned_loss=0.04836, over 4706545.37 frames. ], batch size: 63, lr: 4.30e-03, grad_scale: 16.0 2023-10-02 08:59:11,204 WARNING [train.py:1204] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_218-295097-0_sp0.9 from training. Duration: 0.8555625 2023-10-02 08:59:13,895 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_2295_210_2087_1_1525500993_2407863_116-238894-0_sp1.1 from training. Duration: 0.636375 2023-10-02 08:59:16,782 WARNING [train.py:1204] (2/4) Exclude cut with ID _3231_210_2106_1_1526205864934_6784220_697-171068-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 08:59:16,804 WARNING [train.py:1204] (2/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_102-77064-0 from training. Duration: 0.93 2023-10-02 08:59:18,938 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.2.encoder.layers.2.whiten, num_groups=1, num_channels=384, metric=3.83 vs. limit=12.0 2023-10-02 08:59:20,051 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_13091_210_7925_1_1530609447_8987354_1186-261957-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 08:59:22,751 WARNING [train.py:1204] (2/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_91-277684-0_sp1.1 from training. Duration: 0.67275 2023-10-02 08:59:22,844 WARNING [train.py:1204] (2/4) Exclude cut with ID _31088_210_16446_1_1532520012284_3440919_159-313751-0_sp1.1 from training. Duration: 0.936375 2023-10-02 08:59:24,689 WARNING [train.py:1204] (2/4) Exclude cut with ID _42326_210_7770_1_1533814282531_3707269_227-203721-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 08:59:26,040 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_3531_210_3528_1_1526294203_5216456_309-227302-0_sp1.1 from training. Duration: 0.62725 2023-10-02 08:59:27,510 WARNING [train.py:1204] (2/4) Exclude cut with ID _14087_210_2253_1_1530529224512_3014029_226-242140-0 from training. Duration: 0.81 2023-10-02 08:59:31,573 WARNING [train.py:1204] (2/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_122-109023-0_sp1.1 from training. Duration: 0.6909375 2023-10-02 08:59:32,879 WARNING [train.py:1204] (2/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_660-211088-0_sp0.9 from training. Duration: 0.688875 2023-10-02 08:59:32,975 WARNING [train.py:1204] (2/4) Exclude cut with ID _39290_210_4220_1_1533862778760_3075118_92-311600-0 from training. Duration: 0.88 2023-10-02 08:59:35,068 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.self_attn_weights.pos_emb_skip_rate, batch_count=823253.3333333334, ans=0.0 2023-10-02 08:59:36,263 WARNING [train.py:1204] (2/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_390-339137-0_sp1.1 from training. Duration: 0.5090625 2023-10-02 08:59:40,357 WARNING [train.py:1204] (2/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_449-341946-0_sp1.1 from training. Duration: 0.963625 2023-10-02 08:59:40,420 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_68095_210_20185_1_1535718226_8888855_884-327007-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 08:59:41,794 WARNING [train.py:1204] (2/4) Exclude cut with ID _42326_210_7770_1_1533814282531_3707269_308-222998-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 08:59:43,168 WARNING [train.py:1204] (2/4) Exclude cut with ID _40399_210_4353_1_1533305658194_3279406_344-238708-0_sp1.1 from training. Duration: 0.963625 2023-10-02 08:59:43,228 WARNING [train.py:1204] (2/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_87-131427-0_sp1.1 from training. Duration: 0.7090625 2023-10-02 08:59:44,635 WARNING [train.py:1204] (2/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_110-130961-0_sp0.9 from training. Duration: 0.52225 2023-10-02 08:59:44,680 WARNING [train.py:1204] (2/4) Exclude cut with ID _55153_210_2253_1_1534384786677_3162096_148-79022-0 from training. Duration: 0.96 2023-10-02 08:59:46,394 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.517e+02 1.906e+02 2.045e+02 2.364e+02 3.578e+02, threshold=4.090e+02, percent-clipped=0.0 2023-10-02 08:59:50,553 WARNING [train.py:1204] (2/4) Exclude cut with ID _9679_210_7497_1_1529129212906_4363929_512-313889-0_sp1.1 from training. Duration: 0.736375 2023-10-02 08:59:50,565 WARNING [train.py:1204] (2/4) Exclude cut with ID _7970_210_1883_1_1528455616296_3472230_25-178407-0_sp1.1 from training. Duration: 0.6454375 2023-10-02 08:59:52,071 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_3133_210_2087_1_1526106446_2324690_246-125000-0 from training. Duration: 0.62 2023-10-02 08:59:52,129 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_15126_210_6753_1_1530778677_2591185_113-157651-0_sp0.9 from training. Duration: 0.7555625 2023-10-02 08:59:54,114 WARNING [train.py:1204] (2/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_105-63309-0_sp1.1 from training. Duration: 0.8909375 2023-10-02 08:59:56,862 WARNING [train.py:1204] (2/4) Exclude cut with ID _29877_210_6228_1_1532397791106_5276149_578-177271-0_sp1.1 from training. Duration: 0.936375 2023-10-02 08:59:58,162 WARNING [train.py:1204] (2/4) Exclude cut with ID _44831_210_13427_1_1533816011549_3689035_32-188147-0 from training. Duration: 0.96 2023-10-02 08:59:58,224 WARNING [train.py:1204] (2/4) Exclude cut with ID _12369_210_4381_1_1530356078581_4388520_300-254337-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 08:59:58,236 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_128-241008-0 from training. Duration: 0.94 2023-10-02 09:00:00,738 WARNING [train.py:1204] (2/4) Exclude cut with ID _21878_210_3525_1_1531738772002_7264330_53-279178-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 09:00:05,223 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_41452_210_7291_1_1533437687_3941281_273-334088-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 09:00:05,226 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_128-241008-0_sp1.1 from training. Duration: 0.8545625 2023-10-02 09:00:09,375 WARNING [train.py:1204] (2/4) Exclude cut with ID _8394_210_6702_1_1528590504316_3540189_379-49457-0 from training. Duration: 0.87 2023-10-02 09:00:09,443 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_105-114797-0 from training. Duration: 0.84 2023-10-02 09:00:10,778 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_14928_210_5632_1_1530773461_4714000_454-112198-0 from training. Duration: 0.66 2023-10-02 09:00:15,423 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_456-22141-0_sp1.1 from training. Duration: 0.8 2023-10-02 09:00:16,838 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_115-233929-0 from training. Duration: 0.72 2023-10-02 09:00:19,501 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_616-16795-0_sp1.1 from training. Duration: 0.963625 2023-10-02 09:00:21,928 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.1.encoder.layers.0.conv_module1.whiten, num_groups=1, num_channels=256, metric=12.71 vs. limit=15.0 2023-10-02 09:00:23,805 INFO [train.py:1046] (2/4) Epoch 24, batch 1350, loss[loss=0.1581, simple_loss=0.2254, pruned_loss=0.0454, over 23584.00 frames. ], tot_loss[loss=0.1721, simple_loss=0.2488, pruned_loss=0.04775, over 4700890.52 frames. ], batch size: 256, lr: 4.30e-03, grad_scale: 16.0 2023-10-02 09:00:25,783 WARNING [train.py:1204] (2/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_448-212135-0 from training. Duration: 0.91 2023-10-02 09:00:25,990 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.nonlin_attention.balancer.prob, batch_count=823520.0, ans=0.125 2023-10-02 09:00:29,336 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.self_attn_weights.pos_emb_skip_rate, batch_count=823520.0, ans=0.0 2023-10-02 09:00:30,367 WARNING [train.py:1204] (2/4) Exclude cut with ID _46683_210_9642_1_1533730913492_4591116_201-205677-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 09:00:30,511 WARNING [train.py:1204] (2/4) Exclude cut with ID _64086_210_7994_1_1535270417019_7435949_43-80483-0_sp1.1 from training. Duration: 0.97275 2023-10-02 09:00:33,342 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_240-134124-0_sp1.1 from training. Duration: 0.963625 2023-10-02 09:00:34,642 WARNING [train.py:1204] (2/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_907-120940-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 09:00:35,357 WARNING [train.py:1204] (2/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_848-280957-0_sp1.1 from training. Duration: 0.7909375 2023-10-02 09:00:35,539 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.ff2_skip_rate, batch_count=823520.0, ans=0.0 2023-10-02 09:00:36,661 WARNING [train.py:1204] (2/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_243-42116-0_sp0.9 from training. Duration: 0.811125 2023-10-02 09:00:42,338 WARNING [train.py:1204] (2/4) Exclude cut with ID _6694_210_4599_1_1527852637490_3965529_116-109398-0_sp0.9 from training. Duration: 0.811125 2023-10-02 09:00:43,804 WARNING [train.py:1204] (2/4) Exclude cut with ID _24314_210_4915_1_1532135078116_7170179_521-258868-0 from training. Duration: 0.79 2023-10-02 09:00:45,143 WARNING [train.py:1204] (2/4) Exclude cut with ID _2220_210_1819_1_1525509109898_3658680_50-99837-0_sp0.9 from training. Duration: 0.6 2023-10-02 09:00:45,189 WARNING [train.py:1204] (2/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_313-134930-0_sp1.1 from training. Duration: 0.8818125 2023-10-02 09:00:47,978 WARNING [train.py:1204] (2/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_125-144174-0 from training. Duration: 0.39 2023-10-02 09:00:49,256 WARNING [train.py:1204] (2/4) Exclude cut with ID _59288_210_5854_1_1535077757774_3412246_275-293897-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 09:00:50,548 WARNING [train.py:1204] (2/4) Exclude cut with ID _40519_210_13794_1_1533715228655_7337390_722-52880-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 09:00:50,561 WARNING [train.py:1204] (2/4) Exclude cut with ID _7537_210_2084_1_1528502395431_3924359_128-268029-0 from training. Duration: 0.98 2023-10-02 09:00:51,989 WARNING [train.py:1204] (2/4) Exclude cut with ID _8021_210_7994_1_1528376451358_3653069_179-45206-0 from training. Duration: 0.86 2023-10-02 09:00:52,118 WARNING [train.py:1204] (2/4) Exclude cut with ID _13252_210_1925_1_1530671372047_3336909_88-276668-0 from training. Duration: 0.99 2023-10-02 09:00:55,232 WARNING [train.py:1204] (2/4) Exclude cut with ID _22613_210_5219_1_1532253599686_4090170_290-212862-0_sp1.1 from training. Duration: 0.97275 2023-10-02 09:00:55,237 WARNING [train.py:1204] (2/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_465-214726-0 from training. Duration: 0.86 2023-10-02 09:01:04,435 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.1.feed_forward1.out_proj.dropout_p, batch_count=823653.3333333334, ans=0.1 2023-10-02 09:01:06,119 WARNING [train.py:1204] (2/4) Exclude cut with ID _56480_210_4220_1_1534679942079_3075409_138-36031-0_sp1.1 from training. Duration: 0.97275 2023-10-02 09:01:10,539 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.nonlin_attention.balancer.prob, batch_count=823720.0, ans=0.125 2023-10-02 09:01:15,028 WARNING [train.py:1204] (2/4) Exclude cut with ID _58605_210_12210_1_1534723195939_3904259_300-285878-0_sp1.1 from training. Duration: 0.97275 2023-10-02 09:01:15,060 WARNING [train.py:1204] (2/4) Exclude cut with ID _40901_210_5665_1_1533363789958_3856130_114-340229-0_sp1.1 from training. Duration: 0.936375 2023-10-02 09:01:15,096 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_489-230703-0 from training. Duration: 0.85 2023-10-02 09:01:16,531 WARNING [train.py:1204] (2/4) Exclude cut with ID _66873_210_5517_1_1535454136506_4293370_322-81243-0_sp1.1 from training. Duration: 0.936375 2023-10-02 09:01:16,758 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.feed_forward1.hidden_balancer.prob, batch_count=823720.0, ans=0.125 2023-10-02 09:01:17,809 WARNING [train.py:1204] (2/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_271-269084-0 from training. Duration: 0.83 2023-10-02 09:01:17,820 WARNING [train.py:1204] (2/4) Exclude cut with ID _13252_210_1925_1_1530671372047_3336909_166-314607-0_sp0.9 from training. Duration: 0.6 2023-10-02 09:01:19,138 WARNING [train.py:1204] (2/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_340-343302-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 09:01:21,895 WARNING [train.py:1204] (2/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_286-112015-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 09:01:25,233 WARNING [train.py:1204] (2/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_328-264776-0 from training. Duration: 0.67 2023-10-02 09:01:27,140 WARNING [train.py:1204] (2/4) Exclude cut with ID _4866_210_2577_1_1527404412284_4364659_256-306830-0_sp0.9 from training. Duration: 0.9666875 2023-10-02 09:01:30,079 WARNING [train.py:1204] (2/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_239-316802-0 from training. Duration: 0.69 2023-10-02 09:01:30,319 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=823786.6666666666, ans=0.1 2023-10-02 09:01:32,735 WARNING [train.py:1204] (2/4) Exclude cut with ID _29857_210_20034_1_1532395886753_3798160_279-185801-0 from training. Duration: 0.91 2023-10-02 09:01:37,401 WARNING [train.py:1204] (2/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_656-188361-0 from training. Duration: 0.87 2023-10-02 09:01:37,725 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.balancer2.prob, batch_count=823853.3333333334, ans=0.125 2023-10-02 09:01:38,734 INFO [train.py:1046] (2/4) Epoch 24, batch 1400, loss[loss=0.1576, simple_loss=0.2338, pruned_loss=0.04067, over 21105.00 frames. ], tot_loss[loss=0.1711, simple_loss=0.2477, pruned_loss=0.04722, over 4706388.72 frames. ], batch size: 46, lr: 4.30e-03, grad_scale: 8.0 2023-10-02 09:01:38,864 WARNING [train.py:1204] (2/4) Exclude cut with ID _44718_210_5816_1_1534050051869_6469030_100-230287-0_sp1.1 from training. Duration: 0.936375 2023-10-02 09:01:42,185 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.conv_module1.balancer1.prob, batch_count=823853.3333333334, ans=0.125 2023-10-02 09:01:43,350 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8275_210_4198_1_1528455320_3892571_276-215622-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 09:01:43,405 WARNING [train.py:1204] (2/4) Exclude cut with ID _30304_210_6319_1_1532476827261_7582060_698-309430-0_sp1.1 from training. Duration: 0.963625 2023-10-02 09:01:48,743 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_365-217119-0 from training. Duration: 0.63 2023-10-02 09:01:50,034 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7264_210_4938_1_1527939911_8132095_800-91995-0 from training. Duration: 0.86 2023-10-02 09:01:58,142 WARNING [train.py:1204] (2/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_49-97158-0_sp1.1 from training. Duration: 0.6909375 2023-10-02 09:02:00,764 WARNING [train.py:1204] (2/4) Exclude cut with ID _61132_210_9969_1_1535021792964_3755419_61-57720-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 09:02:02,219 WARNING [train.py:1204] (2/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_345-306832-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 09:02:02,241 WARNING [train.py:1204] (2/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_164-224052-0_sp0.9 from training. Duration: 0.77775 2023-10-02 09:02:05,527 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_34886_210_10633_1_1533203882_1755661_76-156096-0_sp1.1 from training. Duration: 0.7909375 2023-10-02 09:02:06,885 WARNING [train.py:1204] (2/4) Exclude cut with ID 4133-6541-0027-40495-0_sp1.1 from training. Duration: 0.9681875 2023-10-02 09:02:15,579 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.509e+02 1.897e+02 2.132e+02 2.372e+02 2.905e+02, threshold=4.263e+02, percent-clipped=0.0 2023-10-02 09:02:15,728 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_514-211165-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 09:02:15,777 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_41211_210_2544_1_1533348859_3702910_97-212695-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 09:02:18,509 WARNING [train.py:1204] (2/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_406-45086-0 from training. Duration: 0.8 2023-10-02 09:02:20,466 WARNING [train.py:1204] (2/4) Exclude cut with ID _65263_210_22356_1_1535763598691_3921037_49-222917-0_sp1.1 from training. Duration: 0.836375 2023-10-02 09:02:21,740 WARNING [train.py:1204] (2/4) Exclude cut with ID _4866_210_2577_1_1527404412284_4364659_352-157930-0_sp1.1 from training. Duration: 0.736375 2023-10-02 09:02:23,019 WARNING [train.py:1204] (2/4) Exclude cut with ID _4366_210_1388_1_1526792053633_3613819_231-211402-0_sp1.1 from training. Duration: 0.8545625 2023-10-02 09:02:23,067 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_98-21576-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 09:02:24,431 WARNING [train.py:1204] (2/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_348-127789-0_sp0.9 from training. Duration: 0.9444375 2023-10-02 09:02:24,449 WARNING [train.py:1204] (2/4) Exclude cut with ID _45871_210_5665_1_1533864614245_3296060_98-329557-0_sp1.1 from training. Duration: 0.97275 2023-10-02 09:02:25,775 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6846_210_6753_1_1527850767_4295158_2-315073-0_sp1.1 from training. Duration: 0.8 2023-10-02 09:02:25,887 WARNING [train.py:1204] (2/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_145-137790-0 from training. Duration: 0.83 2023-10-02 09:02:25,909 WARNING [train.py:1204] (2/4) Exclude cut with ID _14603_210_4220_1_1530615688441_3097319_133-158308-0_sp0.9 from training. Duration: 0.9333125 2023-10-02 09:02:30,527 WARNING [train.py:1204] (2/4) Exclude cut with ID _14898_210_9206_1_1530766812377_4177190_320-11758-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 09:02:33,930 WARNING [train.py:1204] (2/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_406-45086-0_sp0.9 from training. Duration: 0.888875 2023-10-02 09:02:41,213 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_5438_210_1794_1_1527336687_2600009_104-288274-0 from training. Duration: 0.58 2023-10-02 09:02:42,605 WARNING [train.py:1204] (2/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_108-53929-0_sp0.9 from training. Duration: 0.6555625 2023-10-02 09:02:44,122 WARNING [train.py:1204] (2/4) Exclude cut with ID _30800_210_22382_1_1532499347904_4515959_444-286390-0_sp1.1 from training. Duration: 0.963625 2023-10-02 09:02:45,570 WARNING [train.py:1204] (2/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_125-144174-0_sp0.9 from training. Duration: 0.4333125 2023-10-02 09:02:46,915 WARNING [train.py:1204] (2/4) Exclude cut with ID _55209_210_1531_1_1534675828996_4597160_118-345621-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 09:02:48,720 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_45886_210_17024_1_1533709215_471063_7-79165-0_sp1.1 from training. Duration: 0.8909375 2023-10-02 09:02:51,612 WARNING [train.py:1204] (2/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_175-163894-0_sp0.9 from training. Duration: 0.82225 2023-10-02 09:02:52,958 INFO [train.py:1046] (2/4) Epoch 24, batch 1450, loss[loss=0.1708, simple_loss=0.255, pruned_loss=0.04336, over 24458.00 frames. ], tot_loss[loss=0.1707, simple_loss=0.2469, pruned_loss=0.04726, over 4696921.77 frames. ], batch size: 63, lr: 4.30e-03, grad_scale: 8.0 2023-10-02 09:02:53,054 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_50188_210_4198_1_1534503621_3646041_353-81682-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 09:02:53,060 WARNING [train.py:1204] (2/4) Exclude cut with ID _17875_210_2087_1_1531224012907_1821339_175-80565-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 09:02:53,068 WARNING [train.py:1204] (2/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_159-328157-0_sp1.1 from training. Duration: 0.47275 2023-10-02 09:02:57,769 WARNING [train.py:1204] (2/4) Exclude cut with ID _60057_210_10640_1_1534903065793_3333150_137-267579-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 09:02:59,221 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_92-46009-0_sp1.1 from training. Duration: 0.4909375 2023-10-02 09:03:00,593 WARNING [train.py:1204] (2/4) Exclude cut with ID _53203_210_2087_1_1534246218309_475590_48-68507-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 09:03:00,608 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_28211_210_6388_1_1532861506_4523224_128-328162-0 from training. Duration: 0.9 2023-10-02 09:03:01,986 WARNING [train.py:1204] (2/4) Exclude cut with ID _4697_210_2069_1_1526815801953_3823098_51-1049-0_sp1.1 from training. Duration: 0.6818125 2023-10-02 09:03:03,275 WARNING [train.py:1204] (2/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_383-316000-0 from training. Duration: 0.9 2023-10-02 09:03:03,320 WARNING [train.py:1204] (2/4) Exclude cut with ID _4779_210_2084_1_1527318022944_3914893_185-295153-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 09:03:05,104 WARNING [train.py:1204] (2/4) Exclude cut with ID _3231_210_2106_1_1526205864934_6784220_171-256004-0_sp0.9 from training. Duration: 0.9555625 2023-10-02 09:03:05,104 WARNING [train.py:1204] (2/4) Exclude cut with ID _41314_210_12651_1_1533459476543_3766460_87-272068-0 from training. Duration: 0.83 2023-10-02 09:03:06,507 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_164-152777-0_sp1.1 from training. Duration: 0.92725 2023-10-02 09:03:06,547 WARNING [train.py:1204] (2/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_18-130336-0_sp0.9 from training. Duration: 0.87775 2023-10-02 09:03:07,952 WARNING [train.py:1204] (2/4) Exclude cut with ID _14886_210_4220_1_1530702020355_3516190_175-229073-0_sp1.1 from training. Duration: 0.4090625 2023-10-02 09:03:07,971 WARNING [train.py:1204] (2/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_791-45149-0_sp0.9 from training. Duration: 0.9555625 2023-10-02 09:03:09,433 WARNING [train.py:1204] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_468-189058-0_sp1.1 from training. Duration: 0.87275 2023-10-02 09:03:11,997 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8278_210_2550_1_1528637099_4220413_32-142780-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 09:03:13,850 WARNING [train.py:1204] (2/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_146-218763-0_sp0.9 from training. Duration: 0.9555625 2023-10-02 09:03:15,543 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.nonlin_attention.balancer.max_positive, batch_count=824253.3333333334, ans=0.95 2023-10-02 09:03:16,715 WARNING [train.py:1204] (2/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_348-127789-0_sp1.1 from training. Duration: 0.77275 2023-10-02 09:03:16,724 WARNING [train.py:1204] (2/4) Exclude cut with ID _47790_210_16187_1_1533862814066_4207519_288-285445-0_sp1.1 from training. Duration: 0.936375 2023-10-02 09:03:18,515 WARNING [train.py:1204] (2/4) Exclude cut with ID _14603_210_4220_1_1530615688441_3097319_308-181732-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 09:03:18,547 WARNING [train.py:1204] (2/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_895-117428-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 09:03:21,224 WARNING [train.py:1204] (2/4) Exclude cut with ID _9235_210_5219_1_1528891154253_8046120_387-159008-0_sp0.9 from training. Duration: 0.9555625 2023-10-02 09:03:22,496 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_37351_210_7925_1_1533012931_3812113_265-85780-0_sp0.9 from training. Duration: 0.97775 2023-10-02 09:03:22,517 WARNING [train.py:1204] (2/4) Exclude cut with ID _40551_210_10635_1_1533776424632_3416179_361-3964-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 09:03:23,804 WARNING [train.py:1204] (2/4) Exclude cut with ID _25882_210_3036_1_1532775665815_6479510_485-122742-0_sp1.1 from training. Duration: 0.97275 2023-10-02 09:03:27,194 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.3.encoder.layers.2.nonlin_attention.whiten2, num_groups=1, num_channels=512, metric=6.42 vs. limit=15.0 2023-10-02 09:03:27,937 WARNING [train.py:1204] (2/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_98-51676-0 from training. Duration: 0.73 2023-10-02 09:03:31,400 WARNING [train.py:1204] (2/4) Exclude cut with ID _30548_210_16040_1_1532608097130_3146969_371-280610-0_sp1.1 from training. Duration: 0.92725 2023-10-02 09:03:34,162 WARNING [train.py:1204] (2/4) Exclude cut with ID IC0671W0081-331924 from training. Duration: 0.896 2023-10-02 09:03:35,540 WARNING [train.py:1204] (2/4) Exclude cut with ID _40284_210_2084_1_1533513679814_3704279_380-160775-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 09:03:36,912 WARNING [train.py:1204] (2/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_57-270037-0_sp1.1 from training. Duration: 0.7 2023-10-02 09:03:38,945 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_853-150237-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 09:03:40,068 WARNING [train.py:1204] (2/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_491-261923-0 from training. Duration: 0.98 2023-10-02 09:03:44,658 WARNING [train.py:1204] (2/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_836-244045-0_sp1.1 from training. Duration: 0.97275 2023-10-02 09:03:46,013 WARNING [train.py:1204] (2/4) Exclude cut with ID _14880_210_8895_1_1530680426655_3795259_289-212817-0 from training. Duration: 0.69 2023-10-02 09:03:47,321 WARNING [train.py:1204] (2/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_303-340446-0 from training. Duration: 0.9 2023-10-02 09:03:48,713 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_37152_210_9697_1_1533106573_3844188_418-41736-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 09:03:53,269 WARNING [train.py:1204] (2/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_371-18169-0_sp1.1 from training. Duration: 0.7818125 2023-10-02 09:03:54,551 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_187-219984-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 09:03:54,795 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.nonlin_attention.balancer.prob, batch_count=824453.3333333334, ans=0.125 2023-10-02 09:03:55,967 WARNING [train.py:1204] (2/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_63-267585-0 from training. Duration: 0.9 2023-10-02 09:03:57,415 WARNING [train.py:1204] (2/4) Exclude cut with ID _27013_210_1389_1_1532437173240_3252649_222-125573-0 from training. Duration: 0.98 2023-10-02 09:03:58,684 WARNING [train.py:1204] (2/4) Exclude cut with ID _14821_210_8750_1_1530680503991_6829359_189-297451-0 from training. Duration: 0.55 2023-10-02 09:04:00,064 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_557-163391-0_sp1.1 from training. Duration: 0.97275 2023-10-02 09:04:00,141 WARNING [train.py:1204] (2/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_65-88242-0_sp1.1 from training. Duration: 0.5090625 2023-10-02 09:04:06,023 INFO [train.py:1046] (2/4) Epoch 24, batch 1500, loss[loss=0.1756, simple_loss=0.2645, pruned_loss=0.04332, over 24617.00 frames. ], tot_loss[loss=0.1705, simple_loss=0.2473, pruned_loss=0.04682, over 4716925.32 frames. ], batch size: 71, lr: 4.29e-03, grad_scale: 8.0 2023-10-02 09:04:12,039 WARNING [train.py:1204] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_218-295097-0 from training. Duration: 0.77 2023-10-02 09:04:12,058 WARNING [train.py:1204] (2/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_686-247600-0_sp1.1 from training. Duration: 0.836375 2023-10-02 09:04:12,061 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_397-147196-0_sp0.9 from training. Duration: 0.888875 2023-10-02 09:04:13,418 WARNING [train.py:1204] (2/4) Exclude cut with ID _53950_210_5816_1_1534417264919_3862951_237-136449-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 09:04:13,489 WARNING [train.py:1204] (2/4) Exclude cut with ID _3120_210_3525_1_1526036143112_4072980_74-178400-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 09:04:16,100 WARNING [train.py:1204] (2/4) Exclude cut with ID _5166_210_2084_1_1527073197160_4087993_309-222545-0_sp0.9 from training. Duration: 0.9333125 2023-10-02 09:04:16,180 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7544_210_6753_1_1528587722_3576686_172-171750-0 from training. Duration: 0.65 2023-10-02 09:04:18,071 WARNING [train.py:1204] (2/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_3-237434-0_sp0.9 from training. Duration: 0.8444375 2023-10-02 09:04:18,108 WARNING [train.py:1204] (2/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_101-293367-0_sp0.9 from training. Duration: 0.7 2023-10-02 09:04:18,115 WARNING [train.py:1204] (2/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_818-213552-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 09:04:19,307 WARNING [train.py:1204] (2/4) Exclude cut with ID _14349_210_8341_1_1530615710818_3442470_115-285975-0_sp1.1 from training. Duration: 0.7818125 2023-10-02 09:04:20,753 WARNING [train.py:1204] (2/4) Exclude cut with ID _30800_210_22382_1_1532499347904_4515959_291-309749-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 09:04:22,234 WARNING [train.py:1204] (2/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_715-3462-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 09:04:29,487 WARNING [train.py:1204] (2/4) Exclude cut with ID _59502_210_12210_1_1535107024698_1835139_13-331827-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 09:04:29,496 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_4528_210_3273_1_1527076654_3276777_342-168068-0 from training. Duration: 0.87 2023-10-02 09:04:30,761 WARNING [train.py:1204] (2/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_215-141059-0_sp0.9 from training. Duration: 0.688875 2023-10-02 09:04:30,782 WARNING [train.py:1204] (2/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_106-286130-0_sp1.1 from training. Duration: 0.8909375 2023-10-02 09:04:30,972 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.ff2_skip_rate, batch_count=824586.6666666666, ans=0.0 2023-10-02 09:04:32,102 WARNING [train.py:1204] (2/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_480-77655-0_sp1.1 from training. Duration: 0.97275 2023-10-02 09:04:34,947 WARNING [train.py:1204] (2/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_848-280957-0 from training. Duration: 0.87 2023-10-02 09:04:35,158 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.conv_skip_rate, batch_count=824653.3333333334, ans=0.0 2023-10-02 09:04:38,248 WARNING [train.py:1204] (2/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_122-109023-0 from training. Duration: 0.76 2023-10-02 09:04:39,603 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_11305_210_1794_1_1529750428_7178148_645-233480-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 09:04:40,935 WARNING [train.py:1204] (2/4) Exclude cut with ID _7562_210_2084_1_1528887767916_3946229_345-169156-0 from training. Duration: 0.58 2023-10-02 09:04:42,142 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.491e+02 1.830e+02 2.016e+02 2.339e+02 3.423e+02, threshold=4.032e+02, percent-clipped=0.0 2023-10-02 09:04:42,915 WARNING [train.py:1204] (2/4) Exclude cut with ID _7562_210_2084_1_1528887767916_3946229_345-169156-0_sp1.1 from training. Duration: 0.52725 2023-10-02 09:04:44,129 WARNING [train.py:1204] (2/4) Exclude cut with ID _13253_210_1925_1_1530757775819_3577873_253-246386-0_sp1.1 from training. Duration: 0.8181875 2023-10-02 09:04:45,451 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_136-264444-0_sp1.1 from training. Duration: 0.97275 2023-10-02 09:04:45,465 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_2168_210_2290_1_1525606920_1849400_136-222991-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 09:04:48,127 WARNING [train.py:1204] (2/4) Exclude cut with ID _7852_210_5219_1_1528286471316_8139400_242-105336-0 from training. Duration: 0.72 2023-10-02 09:04:48,772 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.3.encoder.layers.2.whiten, num_groups=1, num_channels=512, metric=5.66 vs. limit=12.0 2023-10-02 09:04:49,570 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_5960_210_1794_1_1527403865_3990999_515-2613-0_sp1.1 from training. Duration: 0.8 2023-10-02 09:04:49,585 WARNING [train.py:1204] (2/4) Exclude cut with ID _39688_210_19596_1_1533371626725_4154159_551-167834-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 09:04:49,636 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_43802_210_2290_1_1533516203_8829504_85-195240-0 from training. Duration: 0.87 2023-10-02 09:04:49,696 WARNING [train.py:1204] (2/4) Exclude cut with ID _63171_210_15488_1_1535104792794_3295110_113-113534-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 09:04:50,249 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.4.encoder.layers.0.nonlin_attention.whiten2, num_groups=1, num_channels=384, metric=10.02 vs. limit=15.0 2023-10-02 09:04:54,210 WARNING [train.py:1204] (2/4) Exclude cut with ID _8833_210_5219_1_1528592665210_7553059_319-349181-0_sp1.1 from training. Duration: 0.9 2023-10-02 09:04:54,211 WARNING [train.py:1204] (2/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_225-43381-0 from training. Duration: 0.94 2023-10-02 09:04:58,929 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_5959_210_1794_1_1527400400_3445726_412-44052-0_sp0.9 from training. Duration: 0.8555625 2023-10-02 09:05:00,192 WARNING [train.py:1204] (2/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_356-167816-0_sp1.1 from training. Duration: 0.6545625 2023-10-02 09:05:04,275 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9012W0112-501244 from training. Duration: 0.768 2023-10-02 09:05:04,335 WARNING [train.py:1204] (2/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_177-326952-0_sp1.1 from training. Duration: 0.92725 2023-10-02 09:05:04,339 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9016W0116-502789 from training. Duration: 0.896 2023-10-02 09:05:06,120 WARNING [train.py:1204] (2/4) Exclude cut with ID _56235_210_10640_1_1534557335235_4161099_557-204815-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 09:05:07,499 WARNING [train.py:1204] (2/4) Exclude cut with ID _55857_210_2346_1_1534644007831_6898209_279-229469-0_sp1.1 from training. Duration: 0.936375 2023-10-02 09:05:07,580 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9029W0375-509764 from training. Duration: 0.896 2023-10-02 09:05:08,918 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_2295_210_2087_1_1525500993_2407863_234-83104-0_sp0.9 from training. Duration: 0.688875 2023-10-02 09:05:11,566 WARNING [train.py:1204] (2/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_266-137104-0 from training. Duration: 0.65 2023-10-02 09:05:13,496 WARNING [train.py:1204] (2/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_289-163927-0_sp1.1 from training. Duration: 0.92725 2023-10-02 09:05:16,206 WARNING [train.py:1204] (2/4) Exclude cut with ID _12733_210_8341_1_1530167424556_3646380_86-225314-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 09:05:16,219 WARNING [train.py:1204] (2/4) Exclude cut with ID _7877_210_6014_1_1528286358099_3750136_618-284064-0_sp1.1 from training. Duration: 0.92725 2023-10-02 09:05:18,159 WARNING [train.py:1204] (2/4) Exclude cut with ID _55844_210_2346_1_1534471212569_7625707_1017-31469-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 09:05:18,191 WARNING [train.py:1204] (2/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_617-241446-0_sp1.1 from training. Duration: 0.92725 2023-10-02 09:05:18,253 WARNING [train.py:1204] (2/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_866-173440-0_sp1.1 from training. Duration: 0.8181875 2023-10-02 09:05:19,558 INFO [train.py:1046] (2/4) Epoch 24, batch 1550, loss[loss=0.1727, simple_loss=0.2403, pruned_loss=0.05255, over 23359.00 frames. ], tot_loss[loss=0.1708, simple_loss=0.2477, pruned_loss=0.04697, over 4730640.84 frames. ], batch size: 119, lr: 4.29e-03, grad_scale: 8.0 2023-10-02 09:05:19,676 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_9460_210_4211_1_1528954587_10049667_480-260269-0 from training. Duration: 0.56 2023-10-02 09:05:20,952 WARNING [train.py:1204] (2/4) Exclude cut with ID _44659_210_2721_1_1534036120061_6614860_626-298884-0 from training. Duration: 0.97 2023-10-02 09:05:20,998 WARNING [train.py:1204] (2/4) Exclude cut with ID _53565_210_10635_1_1534337218335_3820259_287-18120-0_sp1.1 from training. Duration: 0.8818125 2023-10-02 09:05:22,274 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_791-197891-0 from training. Duration: 0.84 2023-10-02 09:05:22,329 WARNING [train.py:1204] (2/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_527-238990-0 from training. Duration: 0.9 2023-10-02 09:05:26,867 WARNING [train.py:1204] (2/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_449-65969-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 09:05:26,968 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_553-341452-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 09:05:28,345 WARNING [train.py:1204] (2/4) Exclude cut with ID _16909_210_5724_1_1531292079912_4118919_300-47574-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 09:05:28,358 WARNING [train.py:1204] (2/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_70-27732-0_sp1.1 from training. Duration: 0.8545625 2023-10-02 09:05:29,731 WARNING [train.py:1204] (2/4) Exclude cut with ID _44718_210_5816_1_1534050051869_6469030_727-28074-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 09:05:29,792 WARNING [train.py:1204] (2/4) Exclude cut with ID _55857_210_2346_1_1534644007831_6898209_357-123664-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 09:05:29,900 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder_embed.convnext.hidden_balancer.prob, batch_count=824853.3333333334, ans=0.125 2023-10-02 09:05:32,629 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9123W0004-555964 from training. Duration: 0.896 2023-10-02 09:05:32,664 WARNING [train.py:1204] (2/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_445-145208-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 09:05:33,880 WARNING [train.py:1204] (2/4) Exclude cut with ID _3675_210_4557_1_1526639934408_4182089_456-18305-0_sp1.1 from training. Duration: 0.6909375 2023-10-02 09:05:33,948 WARNING [train.py:1204] (2/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_1-99680-0_sp1.1 from training. Duration: 0.6090625 2023-10-02 09:05:34,675 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.3.encoder.layers.1.self_attn_weights.whiten_keys, num_groups=8, num_channels=256, metric=4.75 vs. limit=6.0 2023-10-02 09:05:37,018 WARNING [train.py:1204] (2/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_146-282400-0_sp0.9 from training. Duration: 0.788875 2023-10-02 09:05:37,032 WARNING [train.py:1204] (2/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_844-96989-0 from training. Duration: 0.71 2023-10-02 09:05:39,752 WARNING [train.py:1204] (2/4) Exclude cut with ID _29853_210_7497_1_1532505600851_6642110_128-146364-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 09:05:39,777 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_224-322394-0 from training. Duration: 0.66 2023-10-02 09:05:41,298 WARNING [train.py:1204] (2/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_191-59784-0 from training. Duration: 0.88 2023-10-02 09:05:41,304 WARNING [train.py:1204] (2/4) Exclude cut with ID _12556_210_8341_1_1530081024100_7327690_725-193477-0 from training. Duration: 0.98 2023-10-02 09:05:41,358 WARNING [train.py:1204] (2/4) Exclude cut with ID _5166_210_2084_1_1527073197160_4087993_523-79838-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 09:05:42,764 WARNING [train.py:1204] (2/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_398-334558-0_sp1.1 from training. Duration: 0.87275 2023-10-02 09:05:47,291 WARNING [train.py:1204] (2/4) Exclude cut with ID _6456_210_1534_1_1527681634212_3181049_171-36697-0_sp1.1 from training. Duration: 0.936375 2023-10-02 09:05:47,524 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.self_attn_weights.pos_emb_skip_rate, batch_count=824986.6666666666, ans=0.0 2023-10-02 09:05:50,092 WARNING [train.py:1204] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_700-183549-0 from training. Duration: 0.68 2023-10-02 09:05:50,093 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8853_210_3273_1_1528633899_818968_51-187685-0 from training. Duration: 0.69 2023-10-02 09:05:56,281 WARNING [train.py:1204] (2/4) Exclude cut with ID _7719_210_3525_1_1528456153163_4296960_333-110399-0_sp1.1 from training. Duration: 0.87275 2023-10-02 09:05:59,778 WARNING [train.py:1204] (2/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_1022-83740-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 09:06:01,204 WARNING [train.py:1204] (2/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_61-327062-0_sp0.9 from training. Duration: 0.9 2023-10-02 09:06:01,210 WARNING [train.py:1204] (2/4) Exclude cut with ID _47251_210_18628_1_1533814063128_4137250_214-88076-0_sp1.1 from training. Duration: 0.8 2023-10-02 09:06:01,270 WARNING [train.py:1204] (2/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_839-173017-0 from training. Duration: 0.41 2023-10-02 09:06:04,160 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.self_attn_weights.pos_emb_skip_rate, batch_count=825053.3333333334, ans=0.0 2023-10-02 09:06:07,238 WARNING [train.py:1204] (2/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_299-34413-0_sp1.1 from training. Duration: 0.5909375 2023-10-02 09:06:08,681 WARNING [train.py:1204] (2/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_664-159457-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 09:06:10,300 WARNING [train.py:1204] (2/4) Exclude cut with ID _66873_210_5517_1_1535454136506_4293370_483-291760-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 09:06:12,943 WARNING [train.py:1204] (2/4) Exclude cut with ID _30800_210_22382_1_1532499347904_4515959_287-255474-0_sp1.1 from training. Duration: 0.92725 2023-10-02 09:06:12,989 WARNING [train.py:1204] (2/4) Exclude cut with ID _48677_210_5663_1_1533902405919_3892989_111-11439-0_sp1.1 from training. Duration: 0.87275 2023-10-02 09:06:12,995 WARNING [train.py:1204] (2/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_399-156791-0 from training. Duration: 0.83 2023-10-02 09:06:13,036 WARNING [train.py:1204] (2/4) Exclude cut with ID _3992_210_1747_1_1526644816031_3731060_36-147855-0_sp0.9 from training. Duration: 0.8666875 2023-10-02 09:06:14,493 WARNING [train.py:1204] (2/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_442-320317-0_sp1.1 from training. Duration: 0.7454375 2023-10-02 09:06:15,889 WARNING [train.py:1204] (2/4) Exclude cut with ID _55429_210_10626_1_1534467588527_7174143_953-80868-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 09:06:17,142 WARNING [train.py:1204] (2/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_159-328157-0_sp0.9 from training. Duration: 0.57775 2023-10-02 09:06:17,148 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9309W0197-640324 from training. Duration: 0.896 2023-10-02 09:06:20,403 WARNING [train.py:1204] (2/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_361-30822-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 09:06:23,751 WARNING [train.py:1204] (2/4) Exclude cut with ID _5250_210_5219_1_1527249561806_8692110_636-251213-0 from training. Duration: 0.94 2023-10-02 09:06:28,507 WARNING [train.py:1204] (2/4) Exclude cut with ID _49728_210_12651_1_1534041034294_6788150_468-333827-0_sp1.1 from training. Duration: 0.963625 2023-10-02 09:06:29,776 WARNING [train.py:1204] (2/4) Exclude cut with ID _25882_210_3036_1_1532775665815_6479510_623-191759-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 09:06:30,929 WARNING [train.py:1204] (2/4) Exclude cut with ID _5189_210_4557_1_1527248319428_3769229_199-53526-0 from training. Duration: 0.42 2023-10-02 09:06:32,347 WARNING [train.py:1204] (2/4) Exclude cut with ID _6274_210_2084_1_1527937583837_7494020_32-205288-0_sp0.9 from training. Duration: 0.8666875 2023-10-02 09:06:32,433 WARNING [train.py:1204] (2/4) Exclude cut with ID _65263_210_22356_1_1535763598691_3921037_401-346646-0_sp1.1 from training. Duration: 0.963625 2023-10-02 09:06:32,437 WARNING [train.py:1204] (2/4) Exclude cut with ID _12555_210_8750_1_1530075594402_3780239_449-267986-0_sp1.1 from training. Duration: 0.7545625 2023-10-02 09:06:32,455 WARNING [train.py:1204] (2/4) Exclude cut with ID _69884_210_9771_1_1535846392818_4166169_430-201020-0_sp1.1 from training. Duration: 0.8818125 2023-10-02 09:06:33,690 INFO [train.py:1046] (2/4) Epoch 24, batch 1600, loss[loss=0.1846, simple_loss=0.2549, pruned_loss=0.05714, over 22655.00 frames. ], tot_loss[loss=0.1714, simple_loss=0.2483, pruned_loss=0.0472, over 4723015.20 frames. ], batch size: 322, lr: 4.29e-03, grad_scale: 16.0 2023-10-02 09:06:35,056 WARNING [train.py:1204] (2/4) Exclude cut with ID _8394_210_6702_1_1528590504316_3540189_379-49457-0_sp1.1 from training. Duration: 0.7909375 2023-10-02 09:06:38,329 WARNING [train.py:1204] (2/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_200-105218-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 09:06:39,658 WARNING [train.py:1204] (2/4) Exclude cut with ID _3867_210_1883_1_1526641200575_4475830_319-349850-0 from training. Duration: 0.67 2023-10-02 09:06:40,925 WARNING [train.py:1204] (2/4) Exclude cut with ID _28317_210_12742_1_1532480302381_8510325_916-195848-0 from training. Duration: 0.92 2023-10-02 09:06:42,311 WARNING [train.py:1204] (2/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_436-186311-0 from training. Duration: 0.65 2023-10-02 09:06:43,714 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_3910_210_1794_1_1526777667_3747260_302-180617-0_sp1.1 from training. Duration: 0.97275 2023-10-02 09:06:45,222 WARNING [train.py:1204] (2/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_102-92751-0 from training. Duration: 0.99 2023-10-02 09:06:45,295 WARNING [train.py:1204] (2/4) Exclude cut with ID _51899_210_20034_1_1534129254466_3669220_265-215428-0_sp1.1 from training. Duration: 0.8909375 2023-10-02 09:06:46,760 WARNING [train.py:1204] (2/4) Exclude cut with ID _58583_210_5236_1_1534726835266_3574249_438-198993-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 09:06:53,512 WARNING [train.py:1204] (2/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_166-166990-0_sp1.1 from training. Duration: 0.87275 2023-10-02 09:06:55,707 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.1.encoder.layers.1.conv_module2.whiten, num_groups=1, num_channels=256, metric=12.21 vs. limit=15.0 2023-10-02 09:06:56,878 WARNING [train.py:1204] (2/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_907-151398-0 from training. Duration: 0.37 2023-10-02 09:06:57,539 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.2.encoder.layers.1.feed_forward3.out_whiten, num_groups=1, num_channels=384, metric=8.23 vs. limit=15.0 2023-10-02 09:06:59,628 WARNING [train.py:1204] (2/4) Exclude cut with ID _38670_210_15379_1_1533448810659_4391059_452-256669-0_sp1.1 from training. Duration: 0.936375 2023-10-02 09:07:00,914 WARNING [train.py:1204] (2/4) Exclude cut with ID _48677_210_5663_1_1533902405919_3892989_408-40740-0 from training. Duration: 0.83 2023-10-02 09:07:00,964 WARNING [train.py:1204] (2/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_903-158756-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 09:07:01,010 WARNING [train.py:1204] (2/4) Exclude cut with ID _13252_210_1925_1_1530671372047_3336909_397-216497-0 from training. Duration: 0.96 2023-10-02 09:07:06,788 WARNING [train.py:1204] (2/4) Exclude cut with ID _55429_210_10626_1_1534467588527_7174143_97-335084-0 from training. Duration: 0.97 2023-10-02 09:07:11,392 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.582e+02 1.850e+02 2.023e+02 2.271e+02 3.157e+02, threshold=4.047e+02, percent-clipped=0.0 2023-10-02 09:07:12,939 WARNING [train.py:1204] (2/4) Exclude cut with ID _45329_210_7005_1_1534157951211_6774339_453-267730-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 09:07:13,005 WARNING [train.py:1204] (2/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_153-97441-0 from training. Duration: 0.56 2023-10-02 09:07:14,194 WARNING [train.py:1204] (2/4) Exclude cut with ID _40901_210_5665_1_1533363789958_3856130_312-329122-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 09:07:14,245 WARNING [train.py:1204] (2/4) Exclude cut with ID _30800_210_22382_1_1532499347904_4515959_97-126471-0_sp1.1 from training. Duration: 0.97275 2023-10-02 09:07:14,245 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_11305_210_1794_1_1529750428_7178148_257-312870-0_sp1.1 from training. Duration: 0.8 2023-10-02 09:07:17,039 WARNING [train.py:1204] (2/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_454-314901-0_sp0.9 from training. Duration: 0.511125 2023-10-02 09:07:21,729 WARNING [train.py:1204] (2/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_282-136641-0_sp1.1 from training. Duration: 0.3818125 2023-10-02 09:07:25,019 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_46013_210_7234_1_1533776362_3744032_493-160724-0_sp1.1 from training. Duration: 0.963625 2023-10-02 09:07:26,325 WARNING [train.py:1204] (2/4) Exclude cut with ID _67831_210_12392_1_1535612464202_3092612_259-33442-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 09:07:26,376 WARNING [train.py:1204] (2/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_289-62384-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 09:07:28,259 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6545_210_2040_1_1527774670_4245258_379-14182-0_sp0.9 from training. Duration: 0.888875 2023-10-02 09:07:29,821 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_548-294316-0_sp1.1 from training. Duration: 0.736375 2023-10-02 09:07:31,197 WARNING [train.py:1204] (2/4) Exclude cut with ID _6275_210_2084_1_1528110062213_7669049_857-298942-0_sp1.1 from training. Duration: 0.77275 2023-10-02 09:07:32,731 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6673_210_3273_1_1527767790_3173240_327-306185-0_sp1.1 from training. Duration: 0.7545625 2023-10-02 09:07:39,550 WARNING [train.py:1204] (2/4) Exclude cut with ID _61008_210_10633_1_1534852769198_6974370_814-107647-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 09:07:39,614 WARNING [train.py:1204] (2/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_116-332008-0_sp1.1 from training. Duration: 0.8909375 2023-10-02 09:07:40,986 WARNING [train.py:1204] (2/4) Exclude cut with ID _32704_210_8954_1_1533017488840_6747370_266-329755-0 from training. Duration: 0.89 2023-10-02 09:07:40,989 WARNING [train.py:1204] (2/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_271-269084-0_sp0.9 from training. Duration: 0.92225 2023-10-02 09:07:42,829 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_3171_210_2087_1_1526109084_2330007_66-260583-0 from training. Duration: 0.72 2023-10-02 09:07:48,250 INFO [train.py:1046] (2/4) Epoch 24, batch 1650, loss[loss=0.1613, simple_loss=0.2497, pruned_loss=0.0365, over 24487.00 frames. ], tot_loss[loss=0.1719, simple_loss=0.2489, pruned_loss=0.04746, over 4720786.70 frames. ], batch size: 69, lr: 4.29e-03, grad_scale: 16.0 2023-10-02 09:07:48,316 WARNING [train.py:1204] (2/4) Exclude cut with ID _5250_210_5219_1_1527249561806_8692110_1326-276386-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 09:07:48,531 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.bypass.skip_rate, batch_count=825520.0, ans=0.07 2023-10-02 09:07:49,696 WARNING [train.py:1204] (2/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_372-30302-0_sp1.1 from training. Duration: 0.7818125 2023-10-02 09:07:51,092 WARNING [train.py:1204] (2/4) Exclude cut with ID _25882_210_3036_1_1532775665815_6479510_631-257029-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 09:07:51,102 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_3691_210_3528_1_1526468169_4062053_355-334432-0 from training. Duration: 0.87 2023-10-02 09:07:51,109 WARNING [train.py:1204] (2/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_839-173017-0_sp1.1 from training. Duration: 0.37275 2023-10-02 09:07:51,116 WARNING [train.py:1204] (2/4) Exclude cut with ID _14998_210_5219_1_1530705582080_7474970_441-91466-0 from training. Duration: 0.97 2023-10-02 09:07:51,150 WARNING [train.py:1204] (2/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_108-53929-0 from training. Duration: 0.59 2023-10-02 09:07:57,471 WARNING [train.py:1204] (2/4) Exclude cut with ID _9389_210_1664_1_1529217337301_5289269_260-1942-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 09:07:57,529 WARNING [train.py:1204] (2/4) Exclude cut with ID _56423_210_5854_1_1534896061049_3165969_198-61547-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 09:07:57,554 WARNING [train.py:1204] (2/4) Exclude cut with ID _44778_210_2054_1_1533798127203_7443090_412-142122-0_sp1.1 from training. Duration: 0.92725 2023-10-02 09:07:59,422 WARNING [train.py:1204] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_88-341718-0_sp1.1 from training. Duration: 0.6 2023-10-02 09:07:59,570 WARNING [train.py:1204] (2/4) Exclude cut with ID _61079_210_4525_1_1534921008198_4011419_334-298697-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 09:08:01,202 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.bypass_mid.scale_min, batch_count=825520.0, ans=0.2 2023-10-02 09:08:02,365 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_5465_210_4211_1_1527227253_55357_8-2499-0_sp1.1 from training. Duration: 0.996375 2023-10-02 09:08:05,008 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_447-201565-0_sp1.1 from training. Duration: 0.97275 2023-10-02 09:08:05,014 WARNING [train.py:1204] (2/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_253-161789-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 09:08:05,018 WARNING [train.py:1204] (2/4) Exclude cut with ID _44304_210_15374_1_1533639352765_4613129_302-22084-0_sp0.9 from training. Duration: 0.9555625 2023-10-02 09:08:05,029 WARNING [train.py:1204] (2/4) Exclude cut with ID _26664_210_7770_1_1532258943280_3418270_187-80815-0_sp1.1 from training. Duration: 0.8454375 2023-10-02 09:08:05,106 WARNING [train.py:1204] (2/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_68-212801-0 from training. Duration: 0.73 2023-10-02 09:08:05,146 WARNING [train.py:1204] (2/4) Exclude cut with ID _29345_210_4381_1_1532689168263_3860169_149-257268-0 from training. Duration: 0.81 2023-10-02 09:08:11,888 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_213-103282-0_sp1.1 from training. Duration: 0.7090625 2023-10-02 09:08:15,118 WARNING [train.py:1204] (2/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_156-302675-0_sp1.1 from training. Duration: 0.5 2023-10-02 09:08:22,200 WARNING [train.py:1204] (2/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_125-322172-0 from training. Duration: 0.93 2023-10-02 09:08:24,001 WARNING [train.py:1204] (2/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_493-60474-0_sp1.1 from training. Duration: 0.963625 2023-10-02 09:08:25,365 WARNING [train.py:1204] (2/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_156-302675-0 from training. Duration: 0.55 2023-10-02 09:08:28,830 WARNING [train.py:1204] (2/4) Exclude cut with ID _41314_210_12651_1_1533459476543_3766460_123-267862-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 09:08:32,080 WARNING [train.py:1204] (2/4) Exclude cut with ID _28427_210_4220_1_1532581240967_3061069_201-143747-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 09:08:32,117 WARNING [train.py:1204] (2/4) Exclude cut with ID _8772_210_1850_1_1528635595972_3664011_96-12756-0_sp1.1 from training. Duration: 0.8909375 2023-10-02 09:08:33,452 WARNING [train.py:1204] (2/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_1347-145675-0_sp1.1 from training. Duration: 0.936375 2023-10-02 09:08:33,553 WARNING [train.py:1204] (2/4) Exclude cut with ID _9235_210_5219_1_1528891154253_8046120_387-159008-0_sp1.1 from training. Duration: 0.7818125 2023-10-02 09:08:34,732 WARNING [train.py:1204] (2/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_400-201896-0_sp1.1 from training. Duration: 0.963625 2023-10-02 09:08:37,509 WARNING [train.py:1204] (2/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_326-174612-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 09:08:37,541 WARNING [train.py:1204] (2/4) Exclude cut with ID _55844_210_2346_1_1534471212569_7625707_390-116233-0_sp1.1 from training. Duration: 0.963625 2023-10-02 09:08:37,574 WARNING [train.py:1204] (2/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_419-317062-0_sp1.1 from training. Duration: 0.82725 2023-10-02 09:08:38,883 WARNING [train.py:1204] (2/4) Exclude cut with ID _29877_210_6228_1_1532397791106_5276149_266-62291-0_sp0.9 from training. Duration: 0.9666875 2023-10-02 09:08:40,127 WARNING [train.py:1204] (2/4) Exclude cut with ID _7857_210_1534_1_1528286515745_6831470_431-338528-0_sp1.1 from training. Duration: 0.92725 2023-10-02 09:08:40,196 WARNING [train.py:1204] (2/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_87-131427-0_sp0.9 from training. Duration: 0.8666875 2023-10-02 09:08:40,908 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.3.encoder.layers.2.conv_module2.whiten, num_groups=1, num_channels=512, metric=3.66 vs. limit=15.0 2023-10-02 09:08:43,526 WARNING [train.py:1204] (2/4) Exclude cut with ID _24055_210_12651_1_1532310907143_976979_92-59356-0_sp1.1 from training. Duration: 0.82725 2023-10-02 09:08:44,860 WARNING [train.py:1204] (2/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_96-290039-0 from training. Duration: 0.56 2023-10-02 09:08:46,277 WARNING [train.py:1204] (2/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_48-182155-0_sp0.9 from training. Duration: 0.9666875 2023-10-02 09:08:46,309 WARNING [train.py:1204] (2/4) Exclude cut with ID _4626_210_3532_1_1526817652717_3916859_446-206656-0 from training. Duration: 0.76 2023-10-02 09:08:47,737 WARNING [train.py:1204] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_168-32906-0 from training. Duration: 0.69 2023-10-02 09:08:47,751 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_3780_210_2087_1_1526467760_2130483_170-259044-0 from training. Duration: 0.7 2023-10-02 09:08:47,770 WARNING [train.py:1204] (2/4) Exclude cut with ID _65134_210_14763_1_1535335393985_3505368_153-216718-0_sp1.1 from training. Duration: 0.92725 2023-10-02 09:08:49,163 WARNING [train.py:1204] (2/4) Exclude cut with ID _64639_210_5816_1_1535367619437_3528000_461-329156-0_sp1.1 from training. Duration: 0.87275 2023-10-02 09:08:49,206 WARNING [train.py:1204] (2/4) Exclude cut with ID _21989_210_5854_1_1531899214931_3271360_126-347531-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 09:08:49,249 WARNING [train.py:1204] (2/4) Exclude cut with ID _7614_210_4915_1_1529326764092_4537359_96-162197-0_sp1.1 from training. Duration: 0.963625 2023-10-02 09:08:49,250 WARNING [train.py:1204] (2/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_1083-147536-0 from training. Duration: 0.97 2023-10-02 09:08:54,686 WARNING [train.py:1204] (2/4) Exclude cut with ID _64029_210_4381_1_1535178502772_3756459_275-16710-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 09:08:56,512 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7264_210_4938_1_1527939911_8132095_800-91995-0_sp0.9 from training. Duration: 0.9555625 2023-10-02 09:08:56,545 WARNING [train.py:1204] (2/4) Exclude cut with ID _3844_210_1390_1_1526727803582_2871209_50-193879-0_sp1.1 from training. Duration: 0.936375 2023-10-02 09:08:57,955 WARNING [train.py:1204] (2/4) Exclude cut with ID _46952_210_5986_1_1534230010357_3107379_232-344503-0 from training. Duration: 0.96 2023-10-02 09:09:02,448 INFO [train.py:1046] (2/4) Epoch 24, batch 1700, loss[loss=0.173, simple_loss=0.257, pruned_loss=0.04454, over 24481.00 frames. ], tot_loss[loss=0.1719, simple_loss=0.2491, pruned_loss=0.04739, over 4725434.44 frames. ], batch size: 69, lr: 4.29e-03, grad_scale: 16.0 2023-10-02 09:09:03,887 WARNING [train.py:1204] (2/4) Exclude cut with ID _25808_210_13292_1_1532343514562_7366109_206-14076-0_sp1.1 from training. Duration: 0.936375 2023-10-02 09:09:03,889 WARNING [train.py:1204] (2/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_55-31904-0_sp0.9 from training. Duration: 0.97775 2023-10-02 09:09:03,921 WARNING [train.py:1204] (2/4) Exclude cut with ID _14632_210_9988_1_1530691279392_3923200_161-129530-0 from training. Duration: 0.96 2023-10-02 09:09:05,244 WARNING [train.py:1204] (2/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_770-200430-0_sp1.1 from training. Duration: 0.8090625 2023-10-02 09:09:05,254 WARNING [train.py:1204] (2/4) Exclude cut with ID _6270_210_2084_1_1527850911573_3856420_26-277343-0_sp1.1 from training. Duration: 0.7454375 2023-10-02 09:09:05,255 WARNING [train.py:1204] (2/4) Exclude cut with ID _69884_210_9771_1_1535846392818_4166169_88-23776-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 09:09:06,693 WARNING [train.py:1204] (2/4) Exclude cut with ID _60860_210_8254_1_1534896015875_3557169_245-256666-0_sp1.1 from training. Duration: 0.8818125 2023-10-02 09:09:06,725 WARNING [train.py:1204] (2/4) Exclude cut with ID _5250_210_5219_1_1527249561806_8692110_636-251213-0_sp1.1 from training. Duration: 0.8545625 2023-10-02 09:09:08,077 WARNING [train.py:1204] (2/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_540-131164-0 from training. Duration: 0.92 2023-10-02 09:09:09,559 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_5931_210_2040_1_1527428481_4547999_321-193614-0_sp0.9 from training. Duration: 0.7444375 2023-10-02 09:09:18,253 WARNING [train.py:1204] (2/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_373-225403-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 09:09:19,675 WARNING [train.py:1204] (2/4) Exclude cut with ID _20622_210_3852_1_1531622427080_1093839_57-326027-0_sp1.1 from training. Duration: 0.97275 2023-10-02 09:09:21,287 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.0.feed_forward1.hidden_balancer.prob, batch_count=825920.0, ans=0.125 2023-10-02 09:09:24,586 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.ff3_skip_rate, batch_count=825920.0, ans=0.0 2023-10-02 09:09:25,501 WARNING [train.py:1204] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_605-257205-0_sp1.1 from training. Duration: 0.536375 2023-10-02 09:09:25,520 WARNING [train.py:1204] (2/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_442-320317-0_sp0.9 from training. Duration: 0.911125 2023-10-02 09:09:25,576 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_35361_210_4238_1_1533262810_7793476_675-87012-0_sp1.1 from training. Duration: 0.8090625 2023-10-02 09:09:26,821 WARNING [train.py:1204] (2/4) Exclude cut with ID _4184_210_2550_1_1526730192013_2219380_306-44736-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 09:09:29,983 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_124-113562-0 from training. Duration: 0.94 2023-10-02 09:09:31,448 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_2821_210_2715_1_1526108385_3763560_73-296673-0_sp0.9 from training. Duration: 0.92225 2023-10-02 09:09:31,464 WARNING [train.py:1204] (2/4) Exclude cut with ID _10301_210_5886_1_1529222275332_3910620_216-46959-0_sp1.1 from training. Duration: 0.92725 2023-10-02 09:09:34,141 WARNING [train.py:1204] (2/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_91-277684-0_sp0.9 from training. Duration: 0.82225 2023-10-02 09:09:35,452 WARNING [train.py:1204] (2/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_351-294650-0_sp0.9 from training. Duration: 0.7 2023-10-02 09:09:38,221 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.587e+02 1.847e+02 2.054e+02 2.399e+02 3.587e+02, threshold=4.108e+02, percent-clipped=0.0 2023-10-02 09:09:38,293 WARNING [train.py:1204] (2/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_209-205365-0 from training. Duration: 0.79 2023-10-02 09:09:38,347 WARNING [train.py:1204] (2/4) Exclude cut with ID _14603_210_4220_1_1530615688441_3097319_97-325053-0 from training. Duration: 0.92 2023-10-02 09:09:38,587 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.bypass_mid.scale_min, batch_count=825986.6666666666, ans=0.2 2023-10-02 09:09:39,742 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_49348_210_7927_1_1533954500_3691300_109-276430-0_sp1.1 from training. Duration: 0.92725 2023-10-02 09:09:41,166 WARNING [train.py:1204] (2/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_912-47473-0 from training. Duration: 0.68 2023-10-02 09:09:41,236 WARNING [train.py:1204] (2/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_96-320736-0_sp0.9 from training. Duration: 0.9555625 2023-10-02 09:09:48,715 WARNING [train.py:1204] (2/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_1252-235481-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 09:09:50,151 WARNING [train.py:1204] (2/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_813-261554-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 09:09:50,210 WARNING [train.py:1204] (2/4) Exclude cut with ID _21632_210_2253_1_1531994001765_3744140_110-274654-0_sp0.9 from training. Duration: 0.911125 2023-10-02 09:09:52,898 WARNING [train.py:1204] (2/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_381-317791-0_sp1.1 from training. Duration: 0.663625 2023-10-02 09:09:52,917 WARNING [train.py:1204] (2/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_51-117576-0_sp1.1 from training. Duration: 0.363625 2023-10-02 09:09:52,943 WARNING [train.py:1204] (2/4) Exclude cut with ID _42321_210_7770_1_1533468631352_4366895_104-215915-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 09:09:54,689 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.conv_module2.balancer2.min_positive, batch_count=826053.3333333334, ans=0.05 2023-10-02 09:09:56,293 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_40896_210_15710_1_1533433956_7100802_216-22824-0_sp1.1 from training. Duration: 0.92725 2023-10-02 09:09:56,294 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_14831_210_7927_1_1530700200_5583610_181-27467-0 from training. Duration: 0.91 2023-10-02 09:09:58,149 WARNING [train.py:1204] (2/4) Exclude cut with ID _30304_210_6319_1_1532476827261_7582060_1024-265645-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 09:09:58,151 WARNING [train.py:1204] (2/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_412-233904-0_sp1.1 from training. Duration: 0.936375 2023-10-02 09:09:58,166 WARNING [train.py:1204] (2/4) Exclude cut with ID _30191_210_1664_1_1533189915047_7196670_911-223461-0_sp1.1 from training. Duration: 0.92725 2023-10-02 09:09:58,183 WARNING [train.py:1204] (2/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_558-148266-0_sp1.1 from training. Duration: 0.963625 2023-10-02 09:10:00,221 WARNING [train.py:1204] (2/4) Exclude cut with ID _58480_210_12392_1_1534811121111_3646160_212-164495-0_sp1.1 from training. Duration: 0.936375 2023-10-02 09:10:00,222 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_454-276429-0_sp0.9 from training. Duration: 0.9666875 2023-10-02 09:10:01,658 WARNING [train.py:1204] (2/4) Exclude cut with ID _24736_210_3279_1_1532332894218_2838700_73-197086-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 09:10:01,727 WARNING [train.py:1204] (2/4) Exclude cut with ID _5294_210_1747_1_1527249643928_3696179_529-242298-0_sp1.1 from training. Duration: 0.863625 2023-10-02 09:10:03,055 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_53555_210_5632_1_1534595220_7232297_345-171438-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 09:10:05,753 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_28211_210_6388_1_1532861506_4523224_22-25766-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 09:10:07,115 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7002_210_1794_1_1527939683_5157286_163-87552-0 from training. Duration: 0.86 2023-10-02 09:10:08,612 WARNING [train.py:1204] (2/4) Exclude cut with ID _56927_210_9969_1_1534848970840_3695229_409-225129-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 09:10:09,947 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_26192_210_7927_1_1532137269_1040115_21-219331-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 09:10:11,376 WARNING [train.py:1204] (2/4) Exclude cut with ID _15012_210_1968_1_1530856820429_5095350_662-210469-0 from training. Duration: 0.57 2023-10-02 09:10:15,714 INFO [train.py:1046] (2/4) Epoch 24, batch 1750, loss[loss=0.1638, simple_loss=0.2337, pruned_loss=0.04696, over 23911.00 frames. ], tot_loss[loss=0.1714, simple_loss=0.2483, pruned_loss=0.04725, over 4719118.76 frames. ], batch size: 195, lr: 4.29e-03, grad_scale: 16.0 2023-10-02 09:10:17,168 WARNING [train.py:1204] (2/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_582-191016-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 09:10:19,902 WARNING [train.py:1204] (2/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_270-210032-0_sp1.1 from training. Duration: 0.963625 2023-10-02 09:10:19,944 WARNING [train.py:1204] (2/4) Exclude cut with ID _6789_210_1534_1_1527857937218_3118950_262-48853-0_sp0.9 from training. Duration: 0.62225 2023-10-02 09:10:20,150 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.conv_module2.balancer2.prob, batch_count=826186.6666666666, ans=0.125 2023-10-02 09:10:20,792 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.0.layers.1.nonlin_attention.whiten1, num_groups=1, num_channels=144, metric=5.34 vs. limit=10.0 2023-10-02 09:10:21,290 WARNING [train.py:1204] (2/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_847-41334-0 from training. Duration: 0.44 2023-10-02 09:10:21,314 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_40896_210_15710_1_1533433956_7100802_573-46532-0_sp1.1 from training. Duration: 0.92725 2023-10-02 09:10:22,748 WARNING [train.py:1204] (2/4) Exclude cut with ID _21881_210_3525_1_1531825242757_3798380_247-102591-0_sp1.1 from training. Duration: 0.8909375 2023-10-02 09:10:24,057 WARNING [train.py:1204] (2/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_200-21395-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 09:10:27,355 WARNING [train.py:1204] (2/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_711-15688-0 from training. Duration: 0.43 2023-10-02 09:10:30,441 WARNING [train.py:1204] (2/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_53-75025-0_sp1.1 from training. Duration: 0.963625 2023-10-02 09:10:33,219 WARNING [train.py:1204] (2/4) Exclude cut with ID _6194_210_1534_1_1527511826160_3941194_321-148641-0 from training. Duration: 0.64 2023-10-02 09:10:33,241 WARNING [train.py:1204] (2/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_337-294431-0_sp1.1 from training. Duration: 0.936375 2023-10-02 09:10:34,613 WARNING [train.py:1204] (2/4) Exclude cut with ID _21632_210_2253_1_1531994001765_3744140_110-274654-0_sp1.1 from training. Duration: 0.7454375 2023-10-02 09:10:34,837 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=826253.3333333334, ans=0.1 2023-10-02 09:10:37,440 WARNING [train.py:1204] (2/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_135-174080-0_sp1.1 from training. Duration: 0.4818125 2023-10-02 09:10:38,912 WARNING [train.py:1204] (2/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_21-174514-0 from training. Duration: 0.43 2023-10-02 09:10:41,595 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_38965_210_4852_1_1533174906_3484735_215-294589-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 09:10:41,628 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_548-294316-0 from training. Duration: 0.81 2023-10-02 09:10:49,433 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_103-84010-0_sp1.1 from training. Duration: 0.62725 2023-10-02 09:10:52,154 WARNING [train.py:1204] (2/4) Exclude cut with ID _18199_210_6109_1_1531475789864_4288790_295-198247-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 09:10:52,186 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_13757_210_3528_1_1530440682_4107239_103-313224-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 09:10:55,007 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_34886_210_10633_1_1533203882_1755661_67-294673-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 09:10:55,027 WARNING [train.py:1204] (2/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_350-101734-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 09:10:57,174 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_37357_210_17024_1_1533089504_2316121_204-153986-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 09:10:58,517 WARNING [train.py:1204] (2/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_673-115569-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 09:10:58,671 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.0.bypass.scale_min, batch_count=826386.6666666666, ans=0.2 2023-10-02 09:10:58,711 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.attention_skip_rate, batch_count=826386.6666666666, ans=0.0 2023-10-02 09:11:01,726 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_489-230703-0_sp0.9 from training. Duration: 0.9444375 2023-10-02 09:11:03,078 WARNING [train.py:1204] (2/4) Exclude cut with ID _5250_210_5219_1_1527249561806_8692110_575-102496-0_sp1.1 from training. Duration: 0.936375 2023-10-02 09:11:04,318 WARNING [train.py:1204] (2/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_669-93163-0 from training. Duration: 0.98 2023-10-02 09:11:05,736 WARNING [train.py:1204] (2/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_249-281349-0_sp0.9 from training. Duration: 0.9444375 2023-10-02 09:11:07,134 WARNING [train.py:1204] (2/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_106-30782-0 from training. Duration: 0.8 2023-10-02 09:11:08,480 WARNING [train.py:1204] (2/4) Exclude cut with ID _8772_210_1850_1_1528635595972_3664011_398-320049-0_sp1.1 from training. Duration: 0.8 2023-10-02 09:11:09,887 WARNING [train.py:1204] (2/4) Exclude cut with ID _44778_210_2054_1_1533798127203_7443090_516-151727-0_sp1.1 from training. Duration: 0.963625 2023-10-02 09:11:11,203 WARNING [train.py:1204] (2/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_325-62802-0_sp1.1 from training. Duration: 0.7909375 2023-10-02 09:11:14,008 WARNING [train.py:1204] (2/4) Exclude cut with ID _14368_210_4220_1_1530532714870_3595529_93-58002-0_sp0.9 from training. Duration: 0.6333125 2023-10-02 09:11:15,341 WARNING [train.py:1204] (2/4) Exclude cut with ID _4681_210_3525_1_1527251040321_1235713_123-24428-0_sp1.1 from training. Duration: 0.436375 2023-10-02 09:11:15,401 WARNING [train.py:1204] (2/4) Exclude cut with ID _5250_210_5219_1_1527249561806_8692110_1185-56473-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 09:11:18,684 WARNING [train.py:1204] (2/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_96-244299-0_sp1.1 from training. Duration: 0.8 2023-10-02 09:11:21,477 WARNING [train.py:1204] (2/4) Exclude cut with ID _59500_210_8254_1_1534809610680_3736009_104-72815-0_sp1.1 from training. Duration: 0.963625 2023-10-02 09:11:22,912 WARNING [train.py:1204] (2/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_468-283113-0_sp1.1 from training. Duration: 0.87275 2023-10-02 09:11:24,768 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6239_210_4211_1_1527765584_4306698_528-102186-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 09:11:26,101 WARNING [train.py:1204] (2/4) Exclude cut with ID _45297_210_2721_1_1533715351381_3264949_351-147136-0 from training. Duration: 0.87 2023-10-02 09:11:26,105 WARNING [train.py:1204] (2/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_520-51744-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 09:11:26,205 WARNING [train.py:1204] (2/4) Exclude cut with ID _5166_210_2084_1_1527073197160_4087993_310-114452-0_sp1.1 from training. Duration: 0.536375 2023-10-02 09:11:27,438 WARNING [train.py:1204] (2/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_450-165710-0_sp1.1 from training. Duration: 0.92725 2023-10-02 09:11:27,456 WARNING [train.py:1204] (2/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_280-120942-0_sp1.1 from training. Duration: 0.7 2023-10-02 09:11:27,491 WARNING [train.py:1204] (2/4) Exclude cut with ID _65585_210_12210_1_1535450451584_3764098_112-294301-0_sp0.9 from training. Duration: 0.97775 2023-10-02 09:11:29,221 INFO [train.py:1046] (2/4) Epoch 24, batch 1800, loss[loss=0.177, simple_loss=0.2424, pruned_loss=0.0558, over 23806.00 frames. ], tot_loss[loss=0.1698, simple_loss=0.2465, pruned_loss=0.0466, over 4707821.13 frames. ], batch size: 164, lr: 4.29e-03, grad_scale: 16.0 2023-10-02 09:11:29,312 WARNING [train.py:1204] (2/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_98-51676-0_sp0.9 from training. Duration: 0.811125 2023-10-02 09:11:30,757 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_33348_210_5632_1_1533086912_3324030_85-28583-0_sp1.1 from training. Duration: 0.7454375 2023-10-02 09:11:32,099 WARNING [train.py:1204] (2/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_336-208232-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 09:11:33,561 WARNING [train.py:1204] (2/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_339-104791-0_sp0.9 from training. Duration: 0.5444375 2023-10-02 09:11:33,689 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.1.feed_forward2.hidden_balancer.prob, batch_count=826520.0, ans=0.125 2023-10-02 09:11:36,351 WARNING [train.py:1204] (2/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_559-29840-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 09:11:40,540 WARNING [train.py:1204] (2/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_117-290916-0_sp0.9 from training. Duration: 0.5666875 2023-10-02 09:11:40,622 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_43802_210_2290_1_1533516203_8829504_185-112289-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 09:11:42,049 WARNING [train.py:1204] (2/4) Exclude cut with ID _59256_210_5986_1_1535076085997_3589120_155-298797-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 09:11:43,582 WARNING [train.py:1204] (2/4) Exclude cut with ID _44659_210_2721_1_1534036120061_6614860_246-206531-0_sp1.1 from training. Duration: 0.92725 2023-10-02 09:11:43,625 WARNING [train.py:1204] (2/4) Exclude cut with ID _23818_210_6228_1_1531917063884_3676920_469-151944-0_sp1.1 from training. Duration: 0.92725 2023-10-02 09:11:45,004 WARNING [train.py:1204] (2/4) Exclude cut with ID _13252_210_1925_1_1530671372047_3336909_88-276668-0_sp1.1 from training. Duration: 0.9 2023-10-02 09:11:46,908 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_15101_210_7927_1_1530786415_5033274_320-327527-0_sp1.1 from training. Duration: 0.87275 2023-10-02 09:11:46,916 WARNING [train.py:1204] (2/4) Exclude cut with ID _14998_210_5219_1_1530705582080_7474970_446-112117-0 from training. Duration: 0.9 2023-10-02 09:11:48,319 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_30280_210_1881_1_1532872779_3660139_400-254159-0_sp1.1 from training. Duration: 0.936375 2023-10-02 09:11:52,523 WARNING [train.py:1204] (2/4) Exclude cut with ID _40284_210_2084_1_1533513679814_3704279_158-345341-0_sp1.1 from training. Duration: 0.936375 2023-10-02 09:11:55,975 WARNING [train.py:1204] (2/4) Exclude cut with ID 3033-130750-0096-55598-0 from training. Duration: 0.83 2023-10-02 09:11:59,284 WARNING [train.py:1204] (2/4) Exclude cut with ID _9389_210_1664_1_1529217337301_5289269_379-128794-0 from training. Duration: 0.85 2023-10-02 09:11:59,308 WARNING [train.py:1204] (2/4) Exclude cut with ID _3675_210_4557_1_1526639934408_4182089_456-18305-0 from training. Duration: 0.76 2023-10-02 09:11:59,333 WARNING [train.py:1204] (2/4) Exclude cut with ID _29853_210_7497_1_1532505600851_6642110_571-249460-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 09:12:00,605 WARNING [train.py:1204] (2/4) Exclude cut with ID _4068_210_2629_1_1526697350395_1546259_230-325568-0_sp1.1 from training. Duration: 0.92725 2023-10-02 09:12:00,605 WARNING [train.py:1204] (2/4) Exclude cut with ID _34415_210_8750_1_1533085213375_3451116_213-267570-0_sp1.1 from training. Duration: 0.963625 2023-10-02 09:12:04,663 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6767_210_3273_1_1527855783_1718932_142-127132-0_sp1.1 from training. Duration: 0.863625 2023-10-02 09:12:04,865 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.conv_module2.balancer2.prob, batch_count=826653.3333333334, ans=0.125 2023-10-02 09:12:07,605 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.0.conv_module2.balancer2.prob, batch_count=826653.3333333334, ans=0.125 2023-10-02 09:12:08,721 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.580e+02 1.863e+02 2.128e+02 2.344e+02 3.480e+02, threshold=4.257e+02, percent-clipped=0.0 2023-10-02 09:12:11,505 WARNING [train.py:1204] (2/4) Exclude cut with ID IC0653W0001-322925 from training. Duration: 0.896 2023-10-02 09:12:11,598 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_4528_210_3273_1_1527076654_3276777_209-219804-0_sp1.1 from training. Duration: 0.62725 2023-10-02 09:12:12,904 WARNING [train.py:1204] (2/4) Exclude cut with ID _55844_210_2346_1_1534471212569_7625707_187-204971-0_sp1.1 from training. Duration: 0.936375 2023-10-02 09:12:14,257 WARNING [train.py:1204] (2/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_369-325184-0 from training. Duration: 0.99 2023-10-02 09:12:14,294 WARNING [train.py:1204] (2/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_791-45149-0 from training. Duration: 0.86 2023-10-02 09:12:14,332 WARNING [train.py:1204] (2/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_317-211107-0_sp1.1 from training. Duration: 0.836375 2023-10-02 09:12:16,953 WARNING [train.py:1204] (2/4) Exclude cut with ID _30800_210_22382_1_1532499347904_4515959_488-330913-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 09:12:17,040 WARNING [train.py:1204] (2/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_506-42614-0_sp1.1 from training. Duration: 0.7181875 2023-10-02 09:12:17,164 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.0.conv_module1.balancer1.max_abs, batch_count=826720.0, ans=10.0 2023-10-02 09:12:20,238 WARNING [train.py:1204] (2/4) Exclude cut with ID _8696_210_1876_1_1528545632440_3345058_209-54069-0 from training. Duration: 0.67 2023-10-02 09:12:27,933 WARNING [train.py:1204] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_215-202973-0_sp1.1 from training. Duration: 0.8 2023-10-02 09:12:29,921 WARNING [train.py:1204] (2/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_321-215742-0 from training. Duration: 0.5 2023-10-02 09:12:29,976 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_40896_210_15710_1_1533433956_7100802_428-32114-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 09:12:30,043 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder_embed.convnext.layerdrop_rate, batch_count=826786.6666666666, ans=0.015 2023-10-02 09:12:31,199 WARNING [train.py:1204] (2/4) Exclude cut with ID _27013_210_1389_1_1532437173240_3252649_467-263151-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 09:12:31,252 WARNING [train.py:1204] (2/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_734-129761-0_sp0.9 from training. Duration: 0.911125 2023-10-02 09:12:31,413 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.attention_skip_rate, batch_count=826786.6666666666, ans=0.0 2023-10-02 09:12:32,665 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_521-214276-0 from training. Duration: 0.44 2023-10-02 09:12:34,184 WARNING [train.py:1204] (2/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_80-147301-0_sp1.1 from training. Duration: 0.763625 2023-10-02 09:12:34,186 WARNING [train.py:1204] (2/4) Exclude cut with ID _9679_210_7497_1_1529129212906_4363929_56-307796-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 09:12:36,914 WARNING [train.py:1204] (2/4) Exclude cut with ID _14832_210_8750_1_1530691351539_3171348_1-54315-0 from training. Duration: 0.87 2023-10-02 09:12:36,914 WARNING [train.py:1204] (2/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_558-155750-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 09:12:39,601 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_142-309370-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 09:12:39,626 WARNING [train.py:1204] (2/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_18-46756-0_sp0.9 from training. Duration: 0.87775 2023-10-02 09:12:39,640 WARNING [train.py:1204] (2/4) Exclude cut with ID _61148_210_6897_1_1535094022726_4217610_279-212159-0_sp1.1 from training. Duration: 0.936375 2023-10-02 09:12:42,252 WARNING [train.py:1204] (2/4) Exclude cut with ID _21987_210_5854_1_1531821500482_3637038_164-2641-0_sp1.1 from training. Duration: 0.936375 2023-10-02 09:12:42,300 WARNING [train.py:1204] (2/4) Exclude cut with ID _8064_210_5399_1_1528372299291_4282973_395-340852-0_sp1.1 from training. Duration: 0.8454375 2023-10-02 09:12:44,913 INFO [train.py:1046] (2/4) Epoch 24, batch 1850, loss[loss=0.1825, simple_loss=0.2674, pruned_loss=0.04885, over 24084.00 frames. ], tot_loss[loss=0.171, simple_loss=0.2479, pruned_loss=0.04711, over 4701478.44 frames. ], batch size: 80, lr: 4.29e-03, grad_scale: 16.0 2023-10-02 09:12:45,004 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1077-184261-0_sp1.1 from training. Duration: 0.963625 2023-10-02 09:12:45,014 WARNING [train.py:1204] (2/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_1176-204992-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 09:12:47,753 WARNING [train.py:1204] (2/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_563-347832-0_sp1.1 from training. Duration: 0.8090625 2023-10-02 09:12:47,814 WARNING [train.py:1204] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_380-204756-0_sp0.9 from training. Duration: 0.9555625 2023-10-02 09:12:52,771 WARNING [train.py:1204] (2/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_212-10501-0_sp1.1 from training. Duration: 0.8545625 2023-10-02 09:12:54,120 WARNING [train.py:1204] (2/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_445-117398-0 from training. Duration: 0.89 2023-10-02 09:12:56,196 WARNING [train.py:1204] (2/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_29-280622-0 from training. Duration: 0.82 2023-10-02 09:12:56,326 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.0.conv_module1.balancer1.min_positive, batch_count=826853.3333333334, ans=0.025 2023-10-02 09:12:59,347 WARNING [train.py:1204] (2/4) Exclude cut with ID _26664_210_7770_1_1532258943280_3418270_187-80815-0 from training. Duration: 0.93 2023-10-02 09:13:02,732 WARNING [train.py:1204] (2/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_414-349640-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 09:13:04,091 WARNING [train.py:1204] (2/4) Exclude cut with ID _45021_210_9994_1_1533619751094_3330288_96-239010-0 from training. Duration: 0.91 2023-10-02 09:13:04,094 WARNING [train.py:1204] (2/4) Exclude cut with ID _5189_210_4557_1_1527248319428_3769229_199-53526-0_sp0.9 from training. Duration: 0.4666875 2023-10-02 09:13:05,795 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.attention_skip_rate, batch_count=826920.0, ans=0.0 2023-10-02 09:13:09,757 WARNING [train.py:1204] (2/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_63-101996-0_sp1.1 from training. Duration: 0.8 2023-10-02 09:13:12,318 WARNING [train.py:1204] (2/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_199-86432-0 from training. Duration: 0.98 2023-10-02 09:13:15,067 WARNING [train.py:1204] (2/4) Exclude cut with ID _36399_210_16049_1_1532998771377_3698333_207-259013-0_sp1.1 from training. Duration: 0.92725 2023-10-02 09:13:15,104 WARNING [train.py:1204] (2/4) Exclude cut with ID _62574_210_2655_1_1535180407058_3933249_567-111756-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 09:13:18,361 WARNING [train.py:1204] (2/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_436-318113-0 from training. Duration: 0.75 2023-10-02 09:13:19,732 WARNING [train.py:1204] (2/4) Exclude cut with ID _53379_210_20034_1_1534222027363_3854654_43-305214-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 09:13:19,754 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_15126_210_6753_1_1530778677_2591185_113-157651-0_sp1.1 from training. Duration: 0.6181875 2023-10-02 09:13:21,111 WARNING [train.py:1204] (2/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_146-218763-0_sp1.1 from training. Duration: 0.7818125 2023-10-02 09:13:23,956 WARNING [train.py:1204] (2/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_714-310538-0_sp0.9 from training. Duration: 0.9555625 2023-10-02 09:13:26,032 WARNING [train.py:1204] (2/4) Exclude cut with ID _24271_210_12658_1_1531987270022_7190519_709-16721-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 09:13:29,249 WARNING [train.py:1204] (2/4) Exclude cut with ID _5190_210_4557_1_1527244368799_373439_28-197030-0_sp0.9 from training. Duration: 0.8 2023-10-02 09:13:29,933 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.3.encoder.layers.0.nonlin_attention.whiten2, num_groups=1, num_channels=512, metric=8.33 vs. limit=15.0 2023-10-02 09:13:31,226 WARNING [train.py:1204] (2/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_267-197536-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 09:13:32,321 WARNING [train.py:1204] (2/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_658-282610-0_sp1.1 from training. Duration: 0.4818125 2023-10-02 09:13:32,329 WARNING [train.py:1204] (2/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_257-67050-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 09:13:32,457 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_14906_210_7927_1_1530754190_9408633_961-53402-0_sp1.1 from training. Duration: 0.936375 2023-10-02 09:13:33,833 WARNING [train.py:1204] (2/4) Exclude cut with ID _60113_210_16271_1_1535164212481_4386079_490-280290-0_sp1.1 from training. Duration: 0.8909375 2023-10-02 09:13:36,658 WARNING [train.py:1204] (2/4) Exclude cut with ID _14100_210_8341_1_1530529320613_3678069_258-224599-0 from training. Duration: 0.77 2023-10-02 09:13:37,935 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6961_210_2087_1_1527937168_6815011_565-326898-0_sp1.1 from training. Duration: 0.936375 2023-10-02 09:13:40,729 WARNING [train.py:1204] (2/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_239-316802-0_sp1.1 from training. Duration: 0.62725 2023-10-02 09:13:42,180 WARNING [train.py:1204] (2/4) Exclude cut with ID _55857_210_2346_1_1534644007831_6898209_745-98391-0_sp1.1 from training. Duration: 0.6909375 2023-10-02 09:13:42,183 WARNING [train.py:1204] (2/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_154-135696-0 from training. Duration: 0.8 2023-10-02 09:13:42,185 WARNING [train.py:1204] (2/4) Exclude cut with ID _14368_210_4220_1_1530532714870_3595529_93-58002-0 from training. Duration: 0.57 2023-10-02 09:13:44,770 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9025W0005-507335 from training. Duration: 0.896 2023-10-02 09:13:44,838 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9029W0034-509435 from training. Duration: 0.64 2023-10-02 09:13:47,504 WARNING [train.py:1204] (2/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_76-203867-0_sp1.1 from training. Duration: 0.6545625 2023-10-02 09:13:47,514 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_46639_210_8341_1_1533726088_4495500_66-346325-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 09:13:47,520 WARNING [train.py:1204] (2/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_145-137790-0_sp0.9 from training. Duration: 0.92225 2023-10-02 09:13:47,542 WARNING [train.py:1204] (2/4) Exclude cut with ID _36522_210_8887_1_1533177030611_1772478_7-327299-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 09:13:48,868 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9042W0009-515615 from training. Duration: 0.896 2023-10-02 09:13:48,877 WARNING [train.py:1204] (2/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_39-248870-0_sp0.9 from training. Duration: 0.7666875 2023-10-02 09:13:50,218 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_35441_210_16554_1_1533347572_4092680_362-242912-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 09:13:52,192 WARNING [train.py:1204] (2/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_216-343007-0_sp1.1 from training. Duration: 0.6 2023-10-02 09:13:52,285 WARNING [train.py:1204] (2/4) Exclude cut with ID _3867_210_1883_1_1526641200575_4475830_272-215563-0_sp0.9 from training. Duration: 0.6333125 2023-10-02 09:13:53,627 WARNING [train.py:1204] (2/4) Exclude cut with ID _21783_210_11357_1_1531742458730_3762679_553-175603-0_sp1.1 from training. Duration: 0.92725 2023-10-02 09:13:53,645 WARNING [train.py:1204] (2/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_963-187109-0 from training. Duration: 0.99 2023-10-02 09:13:55,075 WARNING [train.py:1204] (2/4) Exclude cut with ID _44973_210_1511_1_1533794381163_3634770_495-211466-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 09:13:55,086 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9068W0002-529085 from training. Duration: 0.896 2023-10-02 09:13:56,397 WARNING [train.py:1204] (2/4) Exclude cut with ID _5294_210_1747_1_1527249643928_3696179_180-100713-0_sp1.1 from training. Duration: 0.736375 2023-10-02 09:13:56,468 WARNING [train.py:1204] (2/4) Exclude cut with ID _35799_210_5854_1_1533263487887_3529071_149-49689-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 09:13:59,743 INFO [train.py:1046] (2/4) Epoch 24, batch 1900, loss[loss=0.1732, simple_loss=0.2434, pruned_loss=0.0515, over 23487.00 frames. ], tot_loss[loss=0.1716, simple_loss=0.2486, pruned_loss=0.04728, over 4689652.40 frames. ], batch size: 285, lr: 4.29e-03, grad_scale: 16.0 2023-10-02 09:14:03,078 WARNING [train.py:1204] (2/4) Exclude cut with ID _67725_210_27767_1_1535608756671_5105359_36-220794-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 09:14:04,566 WARNING [train.py:1204] (2/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_338-96520-0_sp1.1 from training. Duration: 0.8545625 2023-10-02 09:14:05,915 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9104W0004-547160 from training. Duration: 0.896 2023-10-02 09:14:07,359 WARNING [train.py:1204] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_330-327861-0 from training. Duration: 0.67 2023-10-02 09:14:08,736 WARNING [train.py:1204] (2/4) Exclude cut with ID _14087_210_2253_1_1530529224512_3014029_156-257607-0_sp0.9 from training. Duration: 0.92225 2023-10-02 09:14:10,126 WARNING [train.py:1204] (2/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_476-322050-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 09:14:10,135 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9118W0017-553385 from training. Duration: 0.896 2023-10-02 09:14:10,167 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9120W0028-554435 from training. Duration: 0.896 2023-10-02 09:14:14,184 WARNING [train.py:1204] (2/4) Exclude cut with ID _56524_210_9642_1_1534572011779_3619170_145-310842-0 from training. Duration: 0.99 2023-10-02 09:14:15,568 WARNING [train.py:1204] (2/4) Exclude cut with ID _51631_210_8254_1_1534129228420_4084794_270-222508-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 09:14:18,459 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.feed_forward3.hidden_balancer.prob, batch_count=827253.3333333334, ans=0.125 2023-10-02 09:14:20,865 WARNING [train.py:1204] (2/4) Exclude cut with ID _14268_210_9206_1_1530622771285_4714099_331-105351-0 from training. Duration: 0.65 2023-10-02 09:14:21,215 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.bypass.skip_rate, batch_count=827253.3333333334, ans=0.04949747468305833 2023-10-02 09:14:22,277 WARNING [train.py:1204] (2/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_40-129756-0 from training. Duration: 0.81 2023-10-02 09:14:25,833 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.ff2_skip_rate, batch_count=827253.3333333334, ans=0.0 2023-10-02 09:14:30,496 WARNING [train.py:1204] (2/4) Exclude cut with ID _3541_210_2477_1_1526475408709_3263973_231-104736-0 from training. Duration: 0.78 2023-10-02 09:14:33,895 WARNING [train.py:1204] (2/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_214-291369-0 from training. Duration: 0.63 2023-10-02 09:14:34,097 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=827320.0, ans=0.1 2023-10-02 09:14:35,146 WARNING [train.py:1204] (2/4) Exclude cut with ID _62574_210_2655_1_1535180407058_3933249_369-84347-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 09:14:35,198 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9210W0111-599240 from training. Duration: 0.896 2023-10-02 09:14:35,202 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_92-246270-0 from training. Duration: 0.85 2023-10-02 09:14:35,225 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_37152_210_9697_1_1533106573_3844188_29-239070-0 from training. Duration: 0.98 2023-10-02 09:14:35,273 WARNING [train.py:1204] (2/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_239-22076-0 from training. Duration: 0.85 2023-10-02 09:14:35,276 WARNING [train.py:1204] (2/4) Exclude cut with ID _65962_210_29927_1_1535436026519_4174930_368-44639-0_sp1.1 from training. Duration: 0.97275 2023-10-02 09:14:36,439 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.482e+02 1.881e+02 2.012e+02 2.248e+02 2.968e+02, threshold=4.023e+02, percent-clipped=0.0 2023-10-02 09:14:40,509 WARNING [train.py:1204] (2/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_152-212533-0 from training. Duration: 0.97 2023-10-02 09:14:41,956 WARNING [train.py:1204] (2/4) Exclude cut with ID _14603_210_4220_1_1530615688441_3097319_90-31383-0_sp1.1 from training. Duration: 0.7909375 2023-10-02 09:14:44,840 WARNING [train.py:1204] (2/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_196-152868-0_sp1.1 from training. Duration: 0.92725 2023-10-02 09:14:44,842 WARNING [train.py:1204] (2/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_140-181176-0 from training. Duration: 0.92 2023-10-02 09:14:45,496 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.2.encoder.layers.2.self_attn1.whiten, num_groups=1, num_channels=384, metric=18.13 vs. limit=22.5 2023-10-02 09:14:47,684 WARNING [train.py:1204] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_168-32906-0_sp0.9 from training. Duration: 0.7666875 2023-10-02 09:14:47,943 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.ff2_skip_rate, batch_count=827386.6666666666, ans=0.0 2023-10-02 09:14:49,153 WARNING [train.py:1204] (2/4) Exclude cut with ID _14922_210_6228_1_1530707435273_3693310_13-165512-0 from training. Duration: 0.82 2023-10-02 09:14:49,377 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.conv_module1.balancer2.prob, batch_count=827386.6666666666, ans=0.125 2023-10-02 09:14:50,462 WARNING [train.py:1204] (2/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_381-317791-0_sp0.9 from training. Duration: 0.811125 2023-10-02 09:14:56,369 WARNING [train.py:1204] (2/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_63-267585-0_sp1.1 from training. Duration: 0.8181875 2023-10-02 09:14:56,371 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_326-303945-0_sp1.1 from training. Duration: 0.9 2023-10-02 09:14:56,388 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_194-143381-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 09:14:56,520 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.conv_module2.balancer1.max_abs, batch_count=827453.3333333334, ans=10.0 2023-10-02 09:14:57,785 WARNING [train.py:1204] (2/4) Exclude cut with ID _39290_210_4220_1_1533862778760_3075118_194-16667-0_sp1.1 from training. Duration: 0.963625 2023-10-02 09:14:59,732 WARNING [train.py:1204] (2/4) Exclude cut with ID _15012_210_1968_1_1530856820429_5095350_662-210469-0_sp0.9 from training. Duration: 0.6333125 2023-10-02 09:14:59,761 WARNING [train.py:1204] (2/4) Exclude cut with ID _4697_210_2069_1_1526815801953_3823098_68-219807-0_sp1.1 from training. Duration: 0.42725 2023-10-02 09:15:01,146 WARNING [train.py:1204] (2/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_321-22821-0_sp0.9 from training. Duration: 0.988875 2023-10-02 09:15:03,108 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_14056_210_4163_1_1530604483_7572827_1037-45317-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 09:15:03,110 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_403-39619-0_sp1.1 from training. Duration: 0.763625 2023-10-02 09:15:03,899 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.1.encoder.layers.0.conv_module1.whiten, num_groups=1, num_channels=256, metric=8.74 vs. limit=15.0 2023-10-02 09:15:05,824 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_20120_210_6753_1_1531643035_5482859_510-96996-0_sp0.9 from training. Duration: 0.9666875 2023-10-02 09:15:05,827 WARNING [train.py:1204] (2/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_225-180155-0_sp1.1 from training. Duration: 0.92725 2023-10-02 09:15:07,175 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_278-272612-0_sp0.9 from training. Duration: 0.811125 2023-10-02 09:15:08,719 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.bypass.skip_rate, batch_count=827453.3333333334, ans=0.07 2023-10-02 09:15:09,843 WARNING [train.py:1204] (2/4) Exclude cut with ID _53713_210_24116_1_1534314487762_7416751_691-70244-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 09:15:12,546 INFO [train.py:1046] (2/4) Epoch 24, batch 1950, loss[loss=0.1719, simple_loss=0.2467, pruned_loss=0.04856, over 23464.00 frames. ], tot_loss[loss=0.1724, simple_loss=0.2494, pruned_loss=0.04775, over 4697381.91 frames. ], batch size: 93, lr: 4.29e-03, grad_scale: 16.0 2023-10-02 09:15:12,686 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6788_210_1388_1_1527898698_4562021_91-197981-0_sp1.1 from training. Duration: 0.7545625 2023-10-02 09:15:14,152 WARNING [train.py:1204] (2/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_60-18123-0_sp0.9 from training. Duration: 0.97775 2023-10-02 09:15:14,194 WARNING [train.py:1204] (2/4) Exclude cut with ID _35799_210_5854_1_1533263487887_3529071_5-214617-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 09:15:14,199 WARNING [train.py:1204] (2/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_890-5117-0_sp1.1 from training. Duration: 0.7090625 2023-10-02 09:15:18,266 WARNING [train.py:1204] (2/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_660-211088-0 from training. Duration: 0.62 2023-10-02 09:15:18,318 WARNING [train.py:1204] (2/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_314-126819-0_sp0.9 from training. Duration: 0.5555625 2023-10-02 09:15:18,349 WARNING [train.py:1204] (2/4) Exclude cut with ID _21632_210_2253_1_1531994001765_3744140_362-66225-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 09:15:20,748 WARNING [train.py:1204] (2/4) Exclude cut with ID _61118_210_9527_1_1534897668884_3752630_173-258555-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 09:15:23,646 WARNING [train.py:1204] (2/4) Exclude cut with ID _7719_210_3525_1_1528456153163_4296960_88-250259-0_sp0.9 from training. Duration: 0.9333125 2023-10-02 09:15:23,670 WARNING [train.py:1204] (2/4) Exclude cut with ID _15140_210_9288_1_1530791388450_4880149_539-153708-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 09:15:23,691 WARNING [train.py:1204] (2/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_253-42544-0_sp1.1 from training. Duration: 0.97275 2023-10-02 09:15:26,583 WARNING [train.py:1204] (2/4) Exclude cut with ID _33640_210_2655_1_1533117605479_4855469_479-296877-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 09:15:29,779 WARNING [train.py:1204] (2/4) Exclude cut with ID 3033-130750-0096-55598-0_sp1.1 from training. Duration: 0.7545625 2023-10-02 09:15:29,789 WARNING [train.py:1204] (2/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_191-41023-0_sp1.1 from training. Duration: 0.6454375 2023-10-02 09:15:29,811 WARNING [train.py:1204] (2/4) Exclude cut with ID _8365_210_2064_1_1528592311070_4677690_431-294102-0_sp1.1 from training. Duration: 0.8090625 2023-10-02 09:15:31,114 WARNING [train.py:1204] (2/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_893-296030-0_sp1.1 from training. Duration: 0.97275 2023-10-02 09:15:34,576 WARNING [train.py:1204] (2/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_401-343820-0_sp1.1 from training. Duration: 0.97275 2023-10-02 09:15:38,568 WARNING [train.py:1204] (2/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_127-566-0_sp1.1 from training. Duration: 0.763625 2023-10-02 09:15:38,570 WARNING [train.py:1204] (2/4) Exclude cut with ID _14100_210_8341_1_1530529320613_3678069_126-272847-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 09:15:38,590 WARNING [train.py:1204] (2/4) Exclude cut with ID _15133_210_6228_1_1530794512074_3080379_29-264458-0_sp1.1 from training. Duration: 0.52725 2023-10-02 09:15:38,590 WARNING [train.py:1204] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_127-119109-0 from training. Duration: 0.72 2023-10-02 09:15:39,871 WARNING [train.py:1204] (2/4) Exclude cut with ID _6537_210_1883_1_1527987559710_3597129_382-133062-0_sp1.1 from training. Duration: 0.6181875 2023-10-02 09:15:39,881 WARNING [train.py:1204] (2/4) Exclude cut with ID _29529_210_7234_1_1532393961455_3977460_391-219682-0_sp1.1 from training. Duration: 0.936375 2023-10-02 09:15:39,935 WARNING [train.py:1204] (2/4) Exclude cut with ID _37537_210_2253_1_1533171758568_3421949_96-137189-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 09:15:44,118 WARNING [train.py:1204] (2/4) Exclude cut with ID _58414_210_8254_1_1535421609162_7151182_15-276301-0_sp1.1 from training. Duration: 0.97275 2023-10-02 09:15:46,950 WARNING [train.py:1204] (2/4) Exclude cut with ID _59502_210_12210_1_1535107024698_1835139_110-289074-0_sp1.1 from training. Duration: 0.92725 2023-10-02 09:15:49,906 WARNING [train.py:1204] (2/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_367-61330-0_sp1.1 from training. Duration: 0.6818125 2023-10-02 09:15:50,092 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.nonlin_attention.balancer.max_positive, batch_count=827653.3333333334, ans=0.95 2023-10-02 09:15:54,038 WARNING [train.py:1204] (2/4) Exclude cut with ID _45297_210_2721_1_1533715351381_3264949_353-9541-0_sp1.1 from training. Duration: 0.8818125 2023-10-02 09:15:54,077 WARNING [train.py:1204] (2/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_85-138257-0_sp0.9 from training. Duration: 0.788875 2023-10-02 09:15:55,350 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_3787_210_2040_1_1526477544_5478586_603-120324-0 from training. Duration: 0.81 2023-10-02 09:15:55,399 WARNING [train.py:1204] (2/4) Exclude cut with ID _42863_210_9070_1_1533902398788_3696920_411-204131-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 09:15:58,318 WARNING [train.py:1204] (2/4) Exclude cut with ID _30196_210_1664_1_1533708496522_7074140_822-289909-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 09:16:00,200 WARNING [train.py:1204] (2/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_32-335103-0_sp0.9 from training. Duration: 0.911125 2023-10-02 09:16:00,238 WARNING [train.py:1204] (2/4) Exclude cut with ID _6493_210_1747_1_1528459210424_4012470_513-140967-0_sp0.9 from training. Duration: 0.87775 2023-10-02 09:16:02,077 INFO [scaling.py:1118] (2/4) WithLoss: name=encoder.encoders.5.encoder.layers.1.self_attn_weights, loss-sum=0.000e+00 2023-10-02 09:16:06,596 INFO [scaling.py:1118] (2/4) WithLoss: name=encoder.encoders.2.encoder.layers.2.self_attn_weights, loss-sum=0.000e+00 2023-10-02 09:16:09,533 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_47896_210_20508_1_1534492518_5958868_439-257766-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 09:16:09,588 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_1294-63152-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 09:16:11,687 WARNING [train.py:1204] (2/4) Exclude cut with ID _59170_210_6721_1_1534983633960_4203335_319-166512-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 09:16:14,514 WARNING [train.py:1204] (2/4) Exclude cut with ID _3541_210_2477_1_1526475408709_3263973_146-3470-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 09:16:14,746 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.feed_forward1.out_proj.dropout_p, batch_count=827786.6666666666, ans=0.1 2023-10-02 09:16:15,934 WARNING [train.py:1204] (2/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_678-159307-0_sp1.1 from training. Duration: 0.77275 2023-10-02 09:16:16,626 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.2.encoder.layers.2.conv_module2.whiten, num_groups=1, num_channels=384, metric=6.29 vs. limit=15.0 2023-10-02 09:16:17,197 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_343-48822-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 09:16:17,252 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_4692_210_3528_1_1526899755_4323254_107-44401-0 from training. Duration: 0.86 2023-10-02 09:16:17,260 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6291_210_3273_1_1527683649_1078498_74-292608-0_sp1.1 from training. Duration: 0.6545625 2023-10-02 09:16:17,509 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.feed_forward3.hidden_balancer.prob, batch_count=827786.6666666666, ans=0.125 2023-10-02 09:16:18,638 WARNING [train.py:1204] (2/4) Exclude cut with ID _55644_210_11856_1_1534593599582_4103719_293-248694-0_sp1.1 from training. Duration: 0.97275 2023-10-02 09:16:18,849 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.bypass.skip_rate, batch_count=827786.6666666666, ans=0.09899494936611666 2023-10-02 09:16:19,983 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_3172_210_2087_1_1526292929_3987865_197-97762-0 from training. Duration: 0.75 2023-10-02 09:16:22,662 WARNING [train.py:1204] (2/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_264-348896-0_sp0.9 from training. Duration: 0.9444375 2023-10-02 09:16:25,458 WARNING [train.py:1204] (2/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_209-205365-0_sp0.9 from training. Duration: 0.87775 2023-10-02 09:16:26,780 INFO [train.py:1046] (2/4) Epoch 24, batch 2000, loss[loss=0.1811, simple_loss=0.2718, pruned_loss=0.04525, over 24605.00 frames. ], tot_loss[loss=0.1724, simple_loss=0.2495, pruned_loss=0.04768, over 4697406.42 frames. ], batch size: 71, lr: 4.29e-03, grad_scale: 32.0 2023-10-02 09:16:26,846 WARNING [train.py:1204] (2/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_222-198127-0_sp0.9 from training. Duration: 0.9333125 2023-10-02 09:16:26,886 WARNING [train.py:1204] (2/4) Exclude cut with ID _57572_210_5517_1_1534921219624_3809484_52-43530-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 09:16:28,976 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.self_attn_weights.pos_emb_skip_rate, batch_count=827853.3333333334, ans=0.0 2023-10-02 09:16:30,123 WARNING [train.py:1204] (2/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_96-320736-0_sp1.1 from training. Duration: 0.7818125 2023-10-02 09:16:30,258 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_13091_210_7925_1_1530609447_8987354_1159-225508-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 09:16:33,033 WARNING [train.py:1204] (2/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_237-44332-0 from training. Duration: 0.94 2023-10-02 09:16:34,308 WARNING [train.py:1204] (2/4) Exclude cut with ID _7970_210_1883_1_1528455616296_3472230_25-178407-0_sp0.9 from training. Duration: 0.788875 2023-10-02 09:16:38,100 WARNING [train.py:1204] (2/4) Exclude cut with ID _29003_210_19596_1_1532507391720_3894098_357-257062-0_sp1.1 from training. Duration: 0.936375 2023-10-02 09:16:38,317 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.ff2_skip_rate, batch_count=827853.3333333334, ans=0.0 2023-10-02 09:16:40,934 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_809-17156-0 from training. Duration: 0.73 2023-10-02 09:16:42,995 WARNING [train.py:1204] (2/4) Exclude cut with ID _7160_210_1876_1_1527940923231_3703799_302-150728-0_sp1.1 from training. Duration: 0.7090625 2023-10-02 09:16:44,156 WARNING [train.py:1204] (2/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_444-28041-0_sp0.9 from training. Duration: 0.9444375 2023-10-02 09:16:45,642 WARNING [train.py:1204] (2/4) Exclude cut with ID _52892_210_6721_1_1534378941259_4124129_278-251666-0_sp1.1 from training. Duration: 0.92725 2023-10-02 09:16:47,039 WARNING [train.py:1204] (2/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_115-326778-0 from training. Duration: 0.93 2023-10-02 09:16:47,145 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.1.ff2_skip_rate, batch_count=827920.0, ans=0.0 2023-10-02 09:16:49,732 WARNING [train.py:1204] (2/4) Exclude cut with ID _41391_210_7994_1_1533603771872_3638201_31-188961-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 09:16:51,130 WARNING [train.py:1204] (2/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_1073-6264-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 09:16:51,156 WARNING [train.py:1204] (2/4) Exclude cut with ID _66868_210_9527_1_1535509287954_4561140_435-259515-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 09:16:53,841 WARNING [train.py:1204] (2/4) Exclude cut with ID _14886_210_4220_1_1530702020355_3516190_159-73748-0 from training. Duration: 0.83 2023-10-02 09:16:53,873 WARNING [train.py:1204] (2/4) Exclude cut with ID _15150_210_8341_1_1530752411803_7256384_469-239509-0_sp1.1 from training. Duration: 0.6181875 2023-10-02 09:16:55,332 WARNING [train.py:1204] (2/4) Exclude cut with ID _8064_210_5399_1_1528372299291_4282973_395-340852-0 from training. Duration: 0.93 2023-10-02 09:16:55,335 WARNING [train.py:1204] (2/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_138-187031-0_sp1.1 from training. Duration: 0.9 2023-10-02 09:16:58,077 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_13757_210_3528_1_1530440682_4107239_48-106347-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 09:16:58,149 WARNING [train.py:1204] (2/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_164-224052-0_sp1.1 from training. Duration: 0.636375 2023-10-02 09:16:58,153 WARNING [train.py:1204] (2/4) Exclude cut with ID _59288_210_5854_1_1535077757774_3412246_408-322648-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 09:16:59,705 WARNING [train.py:1204] (2/4) Exclude cut with ID _30191_210_1664_1_1533189915047_7196670_827-36359-0_sp1.1 from training. Duration: 0.963625 2023-10-02 09:17:01,060 WARNING [train.py:1204] (2/4) Exclude cut with ID _4680_210_3525_1_1527249757953_893386_75-78979-0_sp1.1 from training. Duration: 0.82725 2023-10-02 09:17:01,120 WARNING [train.py:1204] (2/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_383-165234-0 from training. Duration: 0.94 2023-10-02 09:17:04,253 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.524e+02 1.913e+02 2.238e+02 2.677e+02 4.135e+02, threshold=4.476e+02, percent-clipped=1.0 2023-10-02 09:17:05,718 WARNING [train.py:1204] (2/4) Exclude cut with ID _19896_210_1883_1_1531709022662_6649569_94-229927-0 from training. Duration: 0.97 2023-10-02 09:17:05,730 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6788_210_1388_1_1527898698_4562021_180-75288-0_sp1.1 from training. Duration: 0.9 2023-10-02 09:17:05,739 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_26433_210_12108_1_1532157158_4056480_54-321349-0_sp1.1 from training. Duration: 0.97275 2023-10-02 09:17:12,188 WARNING [train.py:1204] (2/4) Exclude cut with ID _56108_210_16380_1_1534505446159_3887959_265-161493-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 09:17:13,606 WARNING [train.py:1204] (2/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_395-111602-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 09:17:14,247 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_154-316450-0_sp0.9 from training. Duration: 0.8444375 2023-10-02 09:17:14,311 WARNING [train.py:1204] (2/4) Exclude cut with ID _43552_210_8913_1_1533726031059_3523429_381-116041-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 09:17:15,666 WARNING [train.py:1204] (2/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_65-104124-0_sp1.1 from training. Duration: 0.963625 2023-10-02 09:17:17,062 WARNING [train.py:1204] (2/4) Exclude cut with ID _6536_210_1883_1_1527850783339_3319069_169-23811-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 09:17:17,102 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_289-180904-0_sp0.9 from training. Duration: 0.8444375 2023-10-02 09:17:17,112 WARNING [train.py:1204] (2/4) Exclude cut with ID _56406_210_9288_1_1534899618664_3496149_226-242400-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 09:17:18,512 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_24899_210_16554_1_1532046422_3827462_276-216330-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 09:17:20,037 WARNING [train.py:1204] (2/4) Exclude cut with ID _29345_210_4381_1_1532689168263_3860169_198-174328-0_sp1.1 from training. Duration: 0.82725 2023-10-02 09:17:21,303 WARNING [train.py:1204] (2/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_338-96520-0 from training. Duration: 0.94 2023-10-02 09:17:24,180 WARNING [train.py:1204] (2/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_7-111954-0_sp1.1 from training. Duration: 0.5090625 2023-10-02 09:17:26,819 WARNING [train.py:1204] (2/4) Exclude cut with ID _36692_210_5665_1_1533608967420_398809_8-16261-0_sp1.1 from training. Duration: 0.97275 2023-10-02 09:17:28,425 WARNING [train.py:1204] (2/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_86-212657-0_sp1.1 from training. Duration: 0.97275 2023-10-02 09:17:28,432 WARNING [train.py:1204] (2/4) Exclude cut with ID _44659_210_2721_1_1534036120061_6614860_626-298884-0_sp1.1 from training. Duration: 0.8818125 2023-10-02 09:17:33,748 WARNING [train.py:1204] (2/4) Exclude cut with ID _49755_210_18053_1_1534581005846_3218709_120-257918-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 09:17:35,701 WARNING [train.py:1204] (2/4) Exclude cut with ID _21881_210_3525_1_1531825242757_3798380_400-113821-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 09:17:35,703 WARNING [train.py:1204] (2/4) Exclude cut with ID _21783_210_11357_1_1531742458730_3762679_387-236773-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 09:17:37,029 WARNING [train.py:1204] (2/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_613-232856-0_sp1.1 from training. Duration: 0.4909375 2023-10-02 09:17:37,057 WARNING [train.py:1204] (2/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_181-78632-0_sp0.9 from training. Duration: 0.8666875 2023-10-02 09:17:38,907 WARNING [train.py:1204] (2/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_175-17485-0_sp1.1 from training. Duration: 0.97275 2023-10-02 09:17:40,286 WARNING [train.py:1204] (2/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_1004-183422-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 09:17:42,157 INFO [train.py:1046] (2/4) Epoch 24, batch 2050, loss[loss=0.1524, simple_loss=0.2329, pruned_loss=0.0359, over 24545.00 frames. ], tot_loss[loss=0.1717, simple_loss=0.2481, pruned_loss=0.04768, over 4689395.28 frames. ], batch size: 60, lr: 4.28e-03, grad_scale: 32.0 2023-10-02 09:17:45,390 WARNING [train.py:1204] (2/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_32-65962-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 09:17:46,685 WARNING [train.py:1204] (2/4) Exclude cut with ID _15458_210_8341_1_1530858614263_3834410_40-298664-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 09:17:48,416 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.feed_forward2.hidden_balancer.prob, batch_count=828186.6666666666, ans=0.125 2023-10-02 09:17:52,307 WARNING [train.py:1204] (2/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_380-261863-0_sp1.1 from training. Duration: 0.963625 2023-10-02 09:17:52,574 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.1.ff3_skip_rate, batch_count=828186.6666666666, ans=0.0 2023-10-02 09:17:53,789 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_372-173012-0_sp1.1 from training. Duration: 0.863625 2023-10-02 09:17:53,845 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_57-82205-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 09:17:55,182 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_3691_210_3528_1_1526468169_4062053_355-334432-0_sp0.9 from training. Duration: 0.9666875 2023-10-02 09:17:55,416 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.conv_module2.balancer2.prob, batch_count=828253.3333333334, ans=0.125 2023-10-02 09:17:57,868 WARNING [train.py:1204] (2/4) Exclude cut with ID _19023_210_4353_1_1531378892552_5820780_28-36615-0 from training. Duration: 0.97 2023-10-02 09:17:57,874 WARNING [train.py:1204] (2/4) Exclude cut with ID _29853_210_7497_1_1532505600851_6642110_286-251333-0_sp1.1 from training. Duration: 0.92725 2023-10-02 09:17:59,222 WARNING [train.py:1204] (2/4) Exclude cut with ID _31602_210_5854_1_1532677021456_4458250_518-80679-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 09:17:59,265 WARNING [train.py:1204] (2/4) Exclude cut with ID _14922_210_6228_1_1530707435273_3693310_38-187851-0_sp0.9 from training. Duration: 0.688875 2023-10-02 09:17:59,467 INFO [scaling.py:1118] (2/4) WithLoss: name=encoder.encoders.2.encoder.layers.2.self_attn_weights, loss-sum=0.000e+00 2023-10-02 09:18:08,118 WARNING [train.py:1204] (2/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_39-248870-0_sp1.1 from training. Duration: 0.62725 2023-10-02 09:18:08,140 WARNING [train.py:1204] (2/4) Exclude cut with ID _16908_210_5724_1_1531119157248_3772700_154-31455-0_sp1.1 from training. Duration: 0.97275 2023-10-02 09:18:10,792 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8453_210_4211_1_1528502940_8562730_347-330673-0 from training. Duration: 0.72 2023-10-02 09:18:12,628 WARNING [train.py:1204] (2/4) Exclude cut with ID _30191_210_1664_1_1533189915047_7196670_674-204247-0_sp1.1 from training. Duration: 0.97275 2023-10-02 09:18:14,587 WARNING [train.py:1204] (2/4) Exclude cut with ID _8833_210_5219_1_1528592665210_7553059_329-125156-0 from training. Duration: 0.82 2023-10-02 09:18:15,966 WARNING [train.py:1204] (2/4) Exclude cut with ID _4779_210_2084_1_1527318022944_3914893_500-285013-0_sp1.1 from training. Duration: 0.62725 2023-10-02 09:18:18,594 WARNING [train.py:1204] (2/4) Exclude cut with ID _14421_210_8750_1_1530604795825_3542290_190-313578-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 09:18:20,040 WARNING [train.py:1204] (2/4) Exclude cut with ID _35799_210_5854_1_1533263487887_3529071_538-273574-0_sp1.1 from training. Duration: 0.936375 2023-10-02 09:18:21,488 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_14928_210_5632_1_1530773461_4714000_454-112198-0_sp1.1 from training. Duration: 0.6 2023-10-02 09:18:21,524 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_468-143126-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 09:18:24,265 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_4528_210_3273_1_1527076654_3276777_258-121681-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 09:18:25,617 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_397-132565-0_sp1.1 from training. Duration: 0.9 2023-10-02 09:18:25,640 WARNING [train.py:1204] (2/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_454-123002-0_sp0.9 from training. Duration: 0.7666875 2023-10-02 09:18:28,288 WARNING [train.py:1204] (2/4) Exclude cut with ID _27052_210_5986_1_1532329197187_3585009_297-86890-0_sp1.1 from training. Duration: 0.936375 2023-10-02 09:18:29,667 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_51225_210_3601_1_1534400634_2327383_55-301844-0_sp1.1 from training. Duration: 0.8454375 2023-10-02 09:18:31,125 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6786_210_1388_1_1527839699_4194495_378-46070-0_sp0.9 from training. Duration: 0.8 2023-10-02 09:18:32,528 WARNING [train.py:1204] (2/4) Exclude cut with ID _42334_210_7770_1_1534206610396_3605880_55-19758-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 09:18:35,165 WARNING [train.py:1204] (2/4) Exclude cut with ID _14884_210_2252_1_1530707459689_5561250_114-201399-0_sp0.9 from training. Duration: 0.8444375 2023-10-02 09:18:35,311 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.1.feed_forward1.out_proj.dropout_p, batch_count=828386.6666666666, ans=0.1 2023-10-02 09:18:39,945 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_99-251314-0_sp0.9 from training. Duration: 0.9666875 2023-10-02 09:18:40,111 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.feed_forward2.hidden_balancer.prob, batch_count=828453.3333333334, ans=0.125 2023-10-02 09:18:41,316 WARNING [train.py:1204] (2/4) Exclude cut with ID _26528_210_9070_1_1532570410842_3851310_497-97218-0 from training. Duration: 0.86 2023-10-02 09:18:44,962 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.conv_module2.balancer1.min_positive, batch_count=828453.3333333334, ans=0.025 2023-10-02 09:18:45,461 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.2.encoder.layers.2.conv_module1.whiten, num_groups=1, num_channels=384, metric=5.45 vs. limit=15.0 2023-10-02 09:18:46,208 WARNING [train.py:1204] (2/4) Exclude cut with ID _55328_210_27767_1_1534580973260_4088170_230-317241-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 09:18:47,504 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_125-203105-0_sp0.9 from training. Duration: 0.888875 2023-10-02 09:18:49,214 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.attention_skip_rate, batch_count=828453.3333333334, ans=0.0 2023-10-02 09:18:50,354 WARNING [train.py:1204] (2/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_55-31904-0_sp1.1 from training. Duration: 0.8 2023-10-02 09:18:51,825 WARNING [train.py:1204] (2/4) Exclude cut with ID _8064_210_5399_1_1528372299291_4282973_18-174647-0 from training. Duration: 0.91 2023-10-02 09:18:54,450 INFO [train.py:1046] (2/4) Epoch 24, batch 2100, loss[loss=0.1818, simple_loss=0.2624, pruned_loss=0.05061, over 24073.00 frames. ], tot_loss[loss=0.1707, simple_loss=0.2464, pruned_loss=0.04745, over 4686253.83 frames. ], batch size: 80, lr: 4.28e-03, grad_scale: 16.0 2023-10-02 09:18:54,683 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.1.balancer2.prob, batch_count=828520.0, ans=0.125 2023-10-02 09:18:55,904 WARNING [train.py:1204] (2/4) Exclude cut with ID IC0165W0005-81726 from training. Duration: 0.952 2023-10-02 09:18:55,904 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_31583_210_3852_1_1532595851_4024413_148-171847-0_sp1.1 from training. Duration: 0.963625 2023-10-02 09:18:57,204 WARNING [train.py:1204] (2/4) Exclude cut with ID _21989_210_5854_1_1531899214931_3271360_116-138769-0_sp1.1 from training. Duration: 0.936375 2023-10-02 09:18:57,260 WARNING [train.py:1204] (2/4) Exclude cut with ID _66070_210_7640_1_1535441393096_7805310_489-292631-0_sp1.1 from training. Duration: 0.7181875 2023-10-02 09:18:57,337 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_20120_210_6753_1_1531643035_5482859_202-27960-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 09:18:58,643 WARNING [train.py:1204] (2/4) Exclude cut with ID _5294_210_1747_1_1527249643928_3696179_342-270278-0 from training. Duration: 0.55 2023-10-02 09:18:58,678 WARNING [train.py:1204] (2/4) Exclude cut with ID _38323_210_12108_1_1533121233369_3303049_416-11919-0 from training. Duration: 0.96 2023-10-02 09:19:01,356 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6545_210_2040_1_1527774670_4245258_407-48070-0_sp0.9 from training. Duration: 0.8444375 2023-10-02 09:19:02,870 WARNING [train.py:1204] (2/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_503-30719-0_sp1.1 from training. Duration: 0.8909375 2023-10-02 09:19:04,199 WARNING [train.py:1204] (2/4) Exclude cut with ID _49643_210_7989_1_1534640244224_3939031_234-326497-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 09:19:04,362 WARNING [train.py:1204] (2/4) Exclude cut with ID _14170_210_5219_1_1530525949868_4469650_301-77873-0_sp1.1 from training. Duration: 0.963625 2023-10-02 09:19:06,978 WARNING [train.py:1204] (2/4) Exclude cut with ID _45297_210_2721_1_1533715351381_3264949_152-154697-0_sp1.1 from training. Duration: 0.97275 2023-10-02 09:19:06,985 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_3371_210_1794_1_1526384462_5580777_396-339812-0 from training. Duration: 0.85 2023-10-02 09:19:08,831 WARNING [train.py:1204] (2/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_399-156791-0_sp1.1 from training. Duration: 0.7545625 2023-10-02 09:19:08,879 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7696_210_2040_1_1528207167_3716099_117-125434-0 from training. Duration: 0.76 2023-10-02 09:19:08,883 WARNING [train.py:1204] (2/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_96-244299-0 from training. Duration: 0.88 2023-10-02 09:19:10,679 WARNING [train.py:1204] (2/4) Exclude cut with ID _37892_210_19596_1_1533198677897_3964058_519-343462-0_sp1.1 from training. Duration: 0.92725 2023-10-02 09:19:10,702 WARNING [train.py:1204] (2/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_202-183922-0_sp1.1 from training. Duration: 0.77275 2023-10-02 09:19:10,714 WARNING [train.py:1204] (2/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_453-133983-0 from training. Duration: 0.78 2023-10-02 09:19:12,575 WARNING [train.py:1204] (2/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_178-39722-0_sp1.1 from training. Duration: 0.4454375 2023-10-02 09:19:17,561 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_15126_210_6753_1_1530778677_2591185_113-157651-0 from training. Duration: 0.68 2023-10-02 09:19:17,562 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_271-234962-0_sp1.1 from training. Duration: 0.7181875 2023-10-02 09:19:21,638 WARNING [train.py:1204] (2/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_195-64967-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 09:19:21,670 WARNING [train.py:1204] (2/4) Exclude cut with ID _43552_210_8913_1_1533726031059_3523429_365-215065-0_sp1.1 from training. Duration: 0.936375 2023-10-02 09:19:25,781 WARNING [train.py:1204] (2/4) Exclude cut with ID _31602_210_5854_1_1532677021456_4458250_249-96604-0_sp0.9 from training. Duration: 0.988875 2023-10-02 09:19:25,825 WARNING [train.py:1204] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_307-158462-0 from training. Duration: 0.69 2023-10-02 09:19:25,876 WARNING [train.py:1204] (2/4) Exclude cut with ID _30918_210_6400_1_1532754078600_3125920_407-223394-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 09:19:25,879 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6673_210_3273_1_1527767790_3173240_351-167954-0_sp0.9 from training. Duration: 0.5333125 2023-10-02 09:19:27,347 WARNING [train.py:1204] (2/4) Exclude cut with ID _4866_210_2577_1_1527404412284_4364659_256-306830-0 from training. Duration: 0.87 2023-10-02 09:19:28,595 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_17613_210_2151_1_1531137569_2366160_222-131118-0_sp1.1 from training. Duration: 0.963625 2023-10-02 09:19:28,615 WARNING [train.py:1204] (2/4) Exclude cut with ID _45560_210_21982_1_1533812603951_3584999_111-6376-0 from training. Duration: 0.84 2023-10-02 09:19:28,642 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_216-73841-0 from training. Duration: 0.96 2023-10-02 09:19:29,915 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_14058_210_4163_1_1530690953_6327180_49-210687-0 from training. Duration: 0.94 2023-10-02 09:19:32,603 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.606e+02 1.934e+02 2.200e+02 2.587e+02 4.169e+02, threshold=4.400e+02, percent-clipped=0.0 2023-10-02 09:19:32,698 WARNING [train.py:1204] (2/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_506-42614-0_sp0.9 from training. Duration: 0.87775 2023-10-02 09:19:33,994 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_3133_210_2087_1_1526106446_2324690_25-332138-0_sp1.1 from training. Duration: 0.82725 2023-10-02 09:19:34,251 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.feed_forward1.out_proj.dropout_p, batch_count=828653.3333333334, ans=0.1 2023-10-02 09:19:35,398 WARNING [train.py:1204] (2/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_589-9070-0_sp1.1 from training. Duration: 0.6454375 2023-10-02 09:19:35,602 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.balancer1.prob, batch_count=828653.3333333334, ans=0.125 2023-10-02 09:19:36,880 WARNING [train.py:1204] (2/4) Exclude cut with ID _6194_210_1534_1_1527511826160_3941194_14-282876-0_sp1.1 from training. Duration: 0.6454375 2023-10-02 09:19:38,742 WARNING [train.py:1204] (2/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_1166-90654-0_sp1.1 from training. Duration: 0.92725 2023-10-02 09:19:40,918 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_14056_210_4163_1_1530604483_7572827_218-255858-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 09:19:40,920 WARNING [train.py:1204] (2/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_1285-256622-0 from training. Duration: 0.84 2023-10-02 09:19:40,939 WARNING [train.py:1204] (2/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_205-286261-0_sp1.1 from training. Duration: 0.963625 2023-10-02 09:19:41,939 WARNING [train.py:1204] (2/4) Exclude cut with ID _68777_210_4353_1_1535810390731_3117904_6-154119-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 09:19:41,994 WARNING [train.py:1204] (2/4) Exclude cut with ID _22982_210_1883_1_1531882790447_5449409_565-199393-0_sp1.1 from training. Duration: 0.92725 2023-10-02 09:19:42,003 WARNING [train.py:1204] (2/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_278-154492-0 from training. Duration: 0.76 2023-10-02 09:19:45,185 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_426-313557-0 from training. Duration: 0.53 2023-10-02 09:19:45,249 WARNING [train.py:1204] (2/4) Exclude cut with ID _41817_210_18835_1_1533434477160_7423049_555-293448-0 from training. Duration: 0.8 2023-10-02 09:19:49,842 WARNING [train.py:1204] (2/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_32-335103-0_sp1.1 from training. Duration: 0.7454375 2023-10-02 09:19:51,836 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.4.encoder.layers.0.whiten, num_groups=1, num_channels=384, metric=5.12 vs. limit=12.0 2023-10-02 09:19:52,583 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_2655_210_2087_1_1525688959_3897400_202-147557-0_sp1.1 from training. Duration: 0.77275 2023-10-02 09:19:52,613 WARNING [train.py:1204] (2/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_158-250116-0 from training. Duration: 0.62 2023-10-02 09:19:56,712 WARNING [train.py:1204] (2/4) Exclude cut with ID _55429_210_10626_1_1534467588527_7174143_528-88083-0_sp1.1 from training. Duration: 0.963625 2023-10-02 09:19:59,478 WARNING [train.py:1204] (2/4) Exclude cut with ID _8030_210_7497_1_1528444886781_3531089_32-31837-0_sp1.1 from training. Duration: 0.8818125 2023-10-02 09:19:59,514 WARNING [train.py:1204] (2/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_744-86168-0_sp1.1 from training. Duration: 0.97275 2023-10-02 09:19:59,525 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1113-7369-0_sp0.9 from training. Duration: 0.9444375 2023-10-02 09:20:00,817 WARNING [train.py:1204] (2/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_124-345840-0_sp0.9 from training. Duration: 0.57775 2023-10-02 09:20:00,876 WARNING [train.py:1204] (2/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_328-264776-0_sp0.9 from training. Duration: 0.7444375 2023-10-02 09:20:02,269 WARNING [train.py:1204] (2/4) Exclude cut with ID _18895_210_6228_1_1531357274234_3784930_469-166615-0_sp1.1 from training. Duration: 0.963625 2023-10-02 09:20:02,276 WARNING [train.py:1204] (2/4) Exclude cut with ID _8395_210_1531_1_1528597871523_4411579_431-132470-0_sp0.9 from training. Duration: 0.82225 2023-10-02 09:20:03,655 WARNING [train.py:1204] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_554-45617-0_sp1.1 from training. Duration: 0.9 2023-10-02 09:20:03,690 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_35361_210_4238_1_1533262810_7793476_524-188242-0_sp1.1 from training. Duration: 0.92725 2023-10-02 09:20:06,525 WARNING [train.py:1204] (2/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_938-205135-0 from training. Duration: 0.87 2023-10-02 09:20:08,320 INFO [train.py:1046] (2/4) Epoch 24, batch 2150, loss[loss=0.1792, simple_loss=0.2516, pruned_loss=0.05339, over 23812.00 frames. ], tot_loss[loss=0.17, simple_loss=0.2457, pruned_loss=0.0471, over 4693722.33 frames. ], batch size: 195, lr: 4.28e-03, grad_scale: 8.0 2023-10-02 09:20:08,444 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_5931_210_2040_1_1527428481_4547999_321-193614-0 from training. Duration: 0.67 2023-10-02 09:20:08,455 WARNING [train.py:1204] (2/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_445-269521-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 09:20:11,736 WARNING [train.py:1204] (2/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_3-33116-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 09:20:11,738 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_4633_210_2040_1_1526824163_4145343_416-76117-0_sp1.1 from training. Duration: 0.863625 2023-10-02 09:20:11,776 WARNING [train.py:1204] (2/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_1033-274376-0_sp1.1 from training. Duration: 0.7909375 2023-10-02 09:20:11,814 WARNING [train.py:1204] (2/4) Exclude cut with ID _8772_210_1850_1_1528635595972_3664011_398-320049-0_sp0.9 from training. Duration: 0.97775 2023-10-02 09:20:17,609 WARNING [train.py:1204] (2/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_321-215742-0_sp0.9 from training. Duration: 0.5555625 2023-10-02 09:20:18,929 WARNING [train.py:1204] (2/4) Exclude cut with ID _28394_210_8913_1_1532430049518_3650118_272-190950-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 09:20:20,655 WARNING [train.py:1204] (2/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_31-299371-0_sp1.1 from training. Duration: 0.92725 2023-10-02 09:20:22,461 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.5.encoder.layers.1.self_attn_weights.whiten_keys, num_groups=4, num_channels=128, metric=1.91 vs. limit=6.0 2023-10-02 09:20:23,379 WARNING [train.py:1204] (2/4) Exclude cut with ID _55383_210_10635_1_1534468426172_2231669_134-128246-0_sp0.9 from training. Duration: 0.988875 2023-10-02 09:20:23,382 WARNING [train.py:1204] (2/4) Exclude cut with ID _40519_210_13794_1_1533715228655_7337390_887-9547-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 09:20:23,409 WARNING [train.py:1204] (2/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_310-109461-0_sp0.9 from training. Duration: 0.888875 2023-10-02 09:20:24,886 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_2009_210_2071_1_1525481134_4588937_258-250196-0_sp1.1 from training. Duration: 0.92725 2023-10-02 09:20:24,934 WARNING [train.py:1204] (2/4) Exclude cut with ID _9235_210_5219_1_1528891154253_8046120_1186-72702-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 09:20:26,212 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_14831_210_7927_1_1530700200_5583610_181-27467-0_sp1.1 from training. Duration: 0.82725 2023-10-02 09:20:28,974 WARNING [train.py:1204] (2/4) Exclude cut with ID _40402_210_18219_1_1533522324115_2953150_278-132109-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 09:20:30,329 WARNING [train.py:1204] (2/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_261-297503-0 from training. Duration: 0.99 2023-10-02 09:20:34,663 WARNING [train.py:1204] (2/4) Exclude cut with ID _19896_210_1883_1_1531709022662_6649569_313-265903-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 09:20:36,070 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_105-114797-0_sp1.1 from training. Duration: 0.763625 2023-10-02 09:20:37,369 WARNING [train.py:1204] (2/4) Exclude cut with ID _53755_210_8254_1_1534577312941_4707360_9-65843-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 09:20:37,397 WARNING [train.py:1204] (2/4) Exclude cut with ID _18516_210_9206_1_1531704653206_3692406_361-224549-0_sp1.1 from training. Duration: 0.97275 2023-10-02 09:20:37,421 WARNING [train.py:1204] (2/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_403-310975-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 09:20:39,159 WARNING [train.py:1204] (2/4) Exclude cut with ID _5444_210_2151_1_1527418808464_5312210_581-315411-0_sp0.9 from training. Duration: 0.688875 2023-10-02 09:20:40,499 WARNING [train.py:1204] (2/4) Exclude cut with ID _23825_210_13449_1_1531915313526_3497150_464-154904-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 09:20:40,509 WARNING [train.py:1204] (2/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_937-208281-0_sp0.9 from training. Duration: 0.9444375 2023-10-02 09:20:40,578 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_37839_210_7234_1_1533542462_3689303_158-181212-0_sp1.1 from training. Duration: 0.963625 2023-10-02 09:20:42,620 WARNING [train.py:1204] (2/4) Exclude cut with ID _8772_210_1850_1_1528635595972_3664011_398-320049-0 from training. Duration: 0.88 2023-10-02 09:20:42,722 WARNING [train.py:1204] (2/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_718-250907-0_sp1.1 from training. Duration: 0.62725 2023-10-02 09:20:44,110 WARNING [train.py:1204] (2/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_641-126395-0_sp1.1 from training. Duration: 0.92725 2023-10-02 09:20:44,148 WARNING [train.py:1204] (2/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_166-241493-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 09:20:44,389 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.ff3_skip_rate, batch_count=828986.6666666666, ans=0.0 2023-10-02 09:20:45,498 WARNING [train.py:1204] (2/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_526-62079-0_sp0.9 from training. Duration: 0.7444375 2023-10-02 09:20:47,465 WARNING [train.py:1204] (2/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_439-275083-0_sp1.1 from training. Duration: 0.936375 2023-10-02 09:20:50,178 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_66907_210_21581_1_1535615661_3590177_493-229032-0_sp1.1 from training. Duration: 0.92725 2023-10-02 09:20:52,049 WARNING [train.py:1204] (2/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_89-321955-0_sp1.1 from training. Duration: 0.72725 2023-10-02 09:20:52,153 WARNING [train.py:1204] (2/4) Exclude cut with ID _29857_210_20034_1_1532395886753_3798160_273-226195-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 09:20:52,159 WARNING [train.py:1204] (2/4) Exclude cut with ID _22826_210_9994_1_1531908201522_3279169_404-75889-0 from training. Duration: 0.89 2023-10-02 09:20:52,183 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_271-234962-0_sp0.9 from training. Duration: 0.87775 2023-10-02 09:20:54,917 WARNING [train.py:1204] (2/4) Exclude cut with ID _22961_210_8254_1_1531976366672_4278180_156-339685-0_sp1.1 from training. Duration: 0.97275 2023-10-02 09:20:56,323 WARNING [train.py:1204] (2/4) Exclude cut with ID _37892_210_19596_1_1533198677897_3964058_425-231927-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 09:20:57,613 WARNING [train.py:1204] (2/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_102-288670-0_sp1.1 from training. Duration: 0.97275 2023-10-02 09:20:57,864 INFO [scaling.py:1118] (2/4) WithLoss: name=encoder.encoders.2.encoder.layers.2.self_attn_weights, loss-sum=0.000e+00 2023-10-02 09:20:57,936 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.feed_forward2.hidden_balancer.prob, batch_count=829053.3333333334, ans=0.125 2023-10-02 09:20:58,944 WARNING [train.py:1204] (2/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_283-220372-0_sp1.1 from training. Duration: 0.5090625 2023-10-02 09:20:59,779 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.0.layers.1.conv_module2.whiten, num_groups=1, num_channels=192, metric=9.69 vs. limit=15.0 2023-10-02 09:21:00,311 WARNING [train.py:1204] (2/4) Exclude cut with ID _44304_210_15374_1_1533639352765_4613129_86-1699-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 09:21:01,689 WARNING [train.py:1204] (2/4) Exclude cut with ID _2891_210_2550_1_1525874398521_4427710_327-121590-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 09:21:01,695 WARNING [train.py:1204] (2/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_369-222060-0 from training. Duration: 0.71 2023-10-02 09:21:03,114 WARNING [train.py:1204] (2/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_762-109886-0 from training. Duration: 0.91 2023-10-02 09:21:03,141 WARNING [train.py:1204] (2/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_477-255782-0_sp0.9 from training. Duration: 0.911125 2023-10-02 09:21:04,531 WARNING [train.py:1204] (2/4) Exclude cut with ID IC0669W0061-330906 from training. Duration: 0.768 2023-10-02 09:21:04,591 WARNING [train.py:1204] (2/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_347-213071-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 09:21:05,869 WARNING [train.py:1204] (2/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_507-303370-0_sp1.1 from training. Duration: 0.87275 2023-10-02 09:21:07,246 WARNING [train.py:1204] (2/4) Exclude cut with ID _15012_210_1968_1_1530856820429_5095350_688-178849-0 from training. Duration: 0.64 2023-10-02 09:21:07,251 WARNING [train.py:1204] (2/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_28-70987-0_sp0.9 from training. Duration: 0.9666875 2023-10-02 09:21:07,253 WARNING [train.py:1204] (2/4) Exclude cut with ID _5910_210_1389_1_1527337972454_5050100_555-159815-0 from training. Duration: 0.98 2023-10-02 09:21:07,280 WARNING [train.py:1204] (2/4) Exclude cut with ID IC0679W0064-335871 from training. Duration: 0.768 2023-10-02 09:21:07,280 WARNING [train.py:1204] (2/4) Exclude cut with ID IC0679W0079-335886 from training. Duration: 0.896 2023-10-02 09:21:07,322 WARNING [train.py:1204] (2/4) Exclude cut with ID _5608_210_2106_1_1527415439089_6819768_909-82767-0 from training. Duration: 0.61 2023-10-02 09:21:08,785 WARNING [train.py:1204] (2/4) Exclude cut with ID _23038_210_9570_1_1532001584158_3652419_411-323967-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 09:21:10,108 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_40896_210_15710_1_1533433956_7100802_350-231748-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 09:21:10,117 WARNING [train.py:1204] (2/4) Exclude cut with ID _5289_210_2084_1_1527678202736_3779299_142-103317-0_sp0.9 from training. Duration: 0.9333125 2023-10-02 09:21:11,839 WARNING [train.py:1204] (2/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_314-144055-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 09:21:13,195 WARNING [train.py:1204] (2/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_177-236809-0_sp0.9 from training. Duration: 0.5444375 2023-10-02 09:21:14,966 WARNING [train.py:1204] (2/4) Exclude cut with ID _23038_210_9570_1_1532001584158_3652419_82-334733-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 09:21:14,974 WARNING [train.py:1204] (2/4) Exclude cut with ID _43552_210_8913_1_1533726031059_3523429_225-22526-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 09:21:21,242 WARNING [train.py:1204] (2/4) Exclude cut with ID _30327_210_16049_1_1532484161295_3924169_401-70819-0_sp1.1 from training. Duration: 0.7818125 2023-10-02 09:21:23,007 INFO [train.py:1046] (2/4) Epoch 24, batch 2200, loss[loss=0.1504, simple_loss=0.2316, pruned_loss=0.03456, over 24503.00 frames. ], tot_loss[loss=0.1693, simple_loss=0.2455, pruned_loss=0.04658, over 4692041.10 frames. ], batch size: 63, lr: 4.28e-03, grad_scale: 8.0 2023-10-02 09:21:23,058 WARNING [train.py:1204] (2/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_538-32291-0 from training. Duration: 0.83 2023-10-02 09:21:27,114 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_37357_210_17024_1_1533089504_2316121_258-211605-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 09:21:28,592 WARNING [train.py:1204] (2/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_550-335198-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 09:21:29,904 WARNING [train.py:1204] (2/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_98-741-0_sp0.9 from training. Duration: 0.888875 2023-10-02 09:21:29,960 WARNING [train.py:1204] (2/4) Exclude cut with ID _38323_210_12108_1_1533121233369_3303049_251-238988-0_sp1.1 from training. Duration: 0.92725 2023-10-02 09:21:31,314 WARNING [train.py:1204] (2/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_259-34038-0_sp0.9 from training. Duration: 0.82225 2023-10-02 09:21:31,491 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.balancer1.prob, batch_count=829186.6666666666, ans=0.125 2023-10-02 09:21:32,793 WARNING [train.py:1204] (2/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_1060-180633-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 09:21:34,070 WARNING [train.py:1204] (2/4) Exclude cut with ID _8755_210_1876_1_1528628559357_3258209_211-37473-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 09:21:34,082 WARNING [train.py:1204] (2/4) Exclude cut with ID _4122_210_2084_1_1526713164417_4353280_528-206771-0 from training. Duration: 0.88 2023-10-02 09:21:38,291 WARNING [train.py:1204] (2/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_64-248359-0 from training. Duration: 0.69 2023-10-02 09:21:40,123 WARNING [train.py:1204] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_402-253027-0_sp1.1 from training. Duration: 0.4909375 2023-10-02 09:21:44,323 WARNING [train.py:1204] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_625-102955-0 from training. Duration: 0.77 2023-10-02 09:21:47,063 WARNING [train.py:1204] (2/4) Exclude cut with ID _14998_210_5219_1_1530705582080_7474970_611-219324-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 09:21:48,896 WARNING [train.py:1204] (2/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_718-68505-0_sp0.9 from training. Duration: 0.988875 2023-10-02 09:21:48,942 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_353-182855-0_sp1.1 from training. Duration: 0.87275 2023-10-02 09:21:52,252 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_5960_210_1794_1_1527403865_3990999_515-2613-0_sp0.9 from training. Duration: 0.97775 2023-10-02 09:21:52,279 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_5575_210_1410_1_1527301305_2570170_342-265572-0 from training. Duration: 0.94 2023-10-02 09:21:56,650 WARNING [train.py:1204] (2/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_329-113173-0_sp0.9 from training. Duration: 0.688875 2023-10-02 09:21:58,003 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_15396_210_6890_1_1531308473_1938936_213-312506-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 09:21:59,270 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_521-214276-0_sp1.1 from training. Duration: 0.4 2023-10-02 09:21:59,477 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.bypass.skip_rate, batch_count=829320.0, ans=0.09899494936611666 2023-10-02 09:22:01,830 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.459e+02 1.810e+02 2.099e+02 2.472e+02 3.900e+02, threshold=4.197e+02, percent-clipped=0.0 2023-10-02 09:22:03,534 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=829320.0, ans=0.1 2023-10-02 09:22:03,538 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.conv_module2.balancer1.prob, batch_count=829320.0, ans=0.125 2023-10-02 09:22:04,767 WARNING [train.py:1204] (2/4) Exclude cut with ID _15026_210_8750_1_1530777602316_3291850_211-195043-0_sp1.1 from training. Duration: 0.72725 2023-10-02 09:22:06,214 WARNING [train.py:1204] (2/4) Exclude cut with ID _29343_210_4381_1_1532602757752_3674109_108-257494-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 09:22:06,427 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.feed_forward1.hidden_balancer.prob, batch_count=829386.6666666666, ans=0.125 2023-10-02 09:22:07,609 WARNING [train.py:1204] (2/4) Exclude cut with ID _64414_210_17301_1_1535243252048_3757759_403-58151-0_sp1.1 from training. Duration: 0.963625 2023-10-02 09:22:10,250 WARNING [train.py:1204] (2/4) Exclude cut with ID _36512_210_6987_1_1533033143998_3126069_109-177787-0_sp1.1 from training. Duration: 0.92725 2023-10-02 09:22:12,257 WARNING [train.py:1204] (2/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_498-40153-0 from training. Duration: 0.63 2023-10-02 09:22:13,578 WARNING [train.py:1204] (2/4) Exclude cut with ID _56447_210_1389_1_1535074245910_3697189_161-334523-0_sp1.1 from training. Duration: 0.936375 2023-10-02 09:22:15,088 WARNING [train.py:1204] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_238-245655-0 from training. Duration: 0.81 2023-10-02 09:22:16,547 WARNING [train.py:1204] (2/4) Exclude cut with ID _64631_210_27767_1_1535349580068_4165160_108-43865-0_sp1.1 from training. Duration: 0.92725 2023-10-02 09:22:16,554 WARNING [train.py:1204] (2/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_243-42116-0_sp1.1 from training. Duration: 0.663625 2023-10-02 09:22:16,599 WARNING [train.py:1204] (2/4) Exclude cut with ID _66868_210_9527_1_1535509287954_4561140_594-132160-0_sp1.1 from training. Duration: 0.92725 2023-10-02 09:22:18,020 WARNING [train.py:1204] (2/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_48-338976-0_sp0.9 from training. Duration: 0.988875 2023-10-02 09:22:19,337 WARNING [train.py:1204] (2/4) Exclude cut with ID _5250_210_5219_1_1527249561806_8692110_137-100595-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 09:22:19,343 WARNING [train.py:1204] (2/4) Exclude cut with ID _49748_210_24116_1_1533986971737_3158910_12-271669-0_sp1.1 from training. Duration: 0.936375 2023-10-02 09:22:19,372 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_37357_210_17024_1_1533089504_2316121_206-80421-0_sp1.1 from training. Duration: 0.936375 2023-10-02 09:22:20,781 WARNING [train.py:1204] (2/4) Exclude cut with ID _7537_210_2084_1_1528502395431_3924359_113-199933-0_sp1.1 from training. Duration: 0.536375 2023-10-02 09:22:20,828 WARNING [train.py:1204] (2/4) Exclude cut with ID _55440_210_12651_1_1534496713364_2980520_31-59289-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 09:22:22,829 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1083-220122-0_sp0.9 from training. Duration: 0.5666875 2023-10-02 09:22:26,007 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_426-313557-0_sp1.1 from training. Duration: 0.4818125 2023-10-02 09:22:26,058 WARNING [train.py:1204] (2/4) Exclude cut with ID _35799_210_5854_1_1533263487887_3529071_324-94843-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 09:22:28,626 WARNING [train.py:1204] (2/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_264-108685-0_sp1.1 from training. Duration: 0.5 2023-10-02 09:22:30,021 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9012W0006-501141 from training. Duration: 0.896 2023-10-02 09:22:31,506 WARNING [train.py:1204] (2/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_890-5117-0_sp0.9 from training. Duration: 0.8666875 2023-10-02 09:22:32,747 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9023W0008-506301 from training. Duration: 0.896 2023-10-02 09:22:34,059 WARNING [train.py:1204] (2/4) Exclude cut with ID _13916_210_9994_1_1530788341126_3763149_60-191556-0_sp0.9 from training. Duration: 0.67775 2023-10-02 09:22:34,105 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9029W0331-509721 from training. Duration: 0.896 2023-10-02 09:22:35,405 INFO [train.py:1046] (2/4) Epoch 24, batch 2250, loss[loss=0.1663, simple_loss=0.2499, pruned_loss=0.04139, over 23950.00 frames. ], tot_loss[loss=0.1705, simple_loss=0.2467, pruned_loss=0.04717, over 4685247.99 frames. ], batch size: 86, lr: 4.28e-03, grad_scale: 8.0 2023-10-02 09:22:35,496 WARNING [train.py:1204] (2/4) Exclude cut with ID _35823_210_6917_1_1533187881914_7297830_567-108596-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 09:22:35,542 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7543_210_6753_1_1528544010_1110402_25-183860-0_sp1.1 from training. Duration: 0.42725 2023-10-02 09:22:36,962 WARNING [train.py:1204] (2/4) Exclude cut with ID _47278_210_12041_1_1533902418349_3576110_495-260035-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 09:22:38,348 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9051W0015-520281 from training. Duration: 0.9045625 2023-10-02 09:22:40,295 WARNING [train.py:1204] (2/4) Exclude cut with ID _14459_210_2228_1_1531443641946_7516700_707-66193-0_sp1.1 from training. Duration: 0.97275 2023-10-02 09:22:40,634 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=829520.0, ans=0.1 2023-10-02 09:22:42,325 WARNING [train.py:1204] (2/4) Exclude cut with ID _14087_210_2253_1_1530529224512_3014029_219-224445-0_sp1.1 from training. Duration: 0.763625 2023-10-02 09:22:47,840 WARNING [train.py:1204] (2/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_345-167986-0_sp0.9 from training. Duration: 0.9333125 2023-10-02 09:22:49,646 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_34815_210_6388_1_1533381320_3571903_279-305565-0_sp1.1 from training. Duration: 0.836375 2023-10-02 09:22:51,116 WARNING [train.py:1204] (2/4) Exclude cut with ID _21987_210_5854_1_1531821500482_3637038_317-179212-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 09:22:51,170 WARNING [train.py:1204] (2/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_395-37222-0_sp1.1 from training. Duration: 0.6909375 2023-10-02 09:22:52,460 WARNING [train.py:1204] (2/4) Exclude cut with ID _8064_210_5399_1_1528372299291_4282973_168-50337-0_sp1.1 from training. Duration: 0.763625 2023-10-02 09:22:55,799 WARNING [train.py:1204] (2/4) Exclude cut with ID _60113_210_16271_1_1535164212481_4386079_559-167191-0 from training. Duration: 0.93 2023-10-02 09:22:55,809 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_47228_210_4983_1_1533780021_3672027_344-179642-0_sp1.1 from training. Duration: 0.92725 2023-10-02 09:22:55,832 WARNING [train.py:1204] (2/4) Exclude cut with ID _22961_210_8254_1_1531976366672_4278180_317-199546-0_sp0.9 from training. Duration: 0.9555625 2023-10-02 09:22:57,592 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.5.encoder.layers.0.feed_forward2.out_whiten, num_groups=1, num_channels=256, metric=11.08 vs. limit=15.0 2023-10-02 09:22:58,436 WARNING [train.py:1204] (2/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_141-89727-0 from training. Duration: 0.71 2023-10-02 09:22:59,805 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_78-116075-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 09:23:01,078 WARNING [train.py:1204] (2/4) Exclude cut with ID _64086_210_7994_1_1535270417019_7435949_52-235756-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 09:23:02,496 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7615_210_4211_1_1528110965_4580160_72-235211-0_sp1.1 from training. Duration: 0.6909375 2023-10-02 09:23:08,112 WARNING [train.py:1204] (2/4) Exclude cut with ID _14069_210_5632_1_1530512692033_4803390_287-27950-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 09:23:08,201 WARNING [train.py:1204] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_74-48717-0_sp0.9 from training. Duration: 0.6444375 2023-10-02 09:23:08,222 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7544_210_6753_1_1528591327_1471426_172-348777-0_sp1.1 from training. Duration: 0.6 2023-10-02 09:23:09,670 WARNING [train.py:1204] (2/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_220-191080-0 from training. Duration: 0.98 2023-10-02 09:23:11,006 WARNING [train.py:1204] (2/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_1081-137641-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 09:23:14,203 WARNING [train.py:1204] (2/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_278-235459-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 09:23:16,996 WARNING [train.py:1204] (2/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_1181-77470-0_sp1.1 from training. Duration: 0.936375 2023-10-02 09:23:17,224 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=829653.3333333334, ans=0.1 2023-10-02 09:23:18,337 WARNING [train.py:1204] (2/4) Exclude cut with ID _60860_210_8254_1_1534896015875_3557169_198-5531-0_sp1.1 from training. Duration: 0.936375 2023-10-02 09:23:19,679 WARNING [train.py:1204] (2/4) Exclude cut with ID _12369_210_4381_1_1530356078581_4388520_17-289781-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 09:23:19,693 WARNING [train.py:1204] (2/4) Exclude cut with ID _50310_210_15488_1_1534068043345_3641080_146-199752-0_sp1.1 from training. Duration: 0.92725 2023-10-02 09:23:23,108 WARNING [train.py:1204] (2/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_969-213354-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 09:23:23,190 WARNING [train.py:1204] (2/4) Exclude cut with ID _70519_210_4163_1_1535961595994_7458230_66-344156-0_sp1.1 from training. Duration: 0.963625 2023-10-02 09:23:27,669 WARNING [train.py:1204] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_380-204756-0_sp1.1 from training. Duration: 0.7818125 2023-10-02 09:23:28,540 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.2.encoder.layers.0.self_attn2.whiten, num_groups=1, num_channels=384, metric=19.12 vs. limit=22.5 2023-10-02 09:23:29,128 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_128-202104-0_sp0.9 from training. Duration: 0.7 2023-10-02 09:23:31,963 WARNING [train.py:1204] (2/4) Exclude cut with ID _4697_210_2069_1_1526815801953_3823098_51-1049-0_sp0.9 from training. Duration: 0.8333125 2023-10-02 09:23:31,986 WARNING [train.py:1204] (2/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_86-66192-0_sp1.1 from training. Duration: 0.5 2023-10-02 09:23:33,247 WARNING [train.py:1204] (2/4) Exclude cut with ID _14069_210_5632_1_1530512692033_4803390_95-1896-0_sp1.1 from training. Duration: 0.8909375 2023-10-02 09:23:38,625 WARNING [train.py:1204] (2/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_29-293896-0_sp1.1 from training. Duration: 0.5545625 2023-10-02 09:23:40,619 WARNING [train.py:1204] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_167-137255-0_sp0.9 from training. Duration: 0.711125 2023-10-02 09:23:40,619 WARNING [train.py:1204] (2/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_217-95056-0 from training. Duration: 0.98 2023-10-02 09:23:40,635 WARNING [train.py:1204] (2/4) Exclude cut with ID _47278_210_12041_1_1533902418349_3576110_97-266746-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 09:23:41,295 WARNING [train.py:1204] (2/4) Exclude cut with ID _14224_210_4381_1_1530529062037_4264010_380-343670-0_sp1.1 from training. Duration: 0.8 2023-10-02 09:23:42,498 WARNING [train.py:1204] (2/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_107-332398-0 from training. Duration: 0.78 2023-10-02 09:23:46,443 WARNING [train.py:1204] (2/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_91-267696-0_sp1.1 from training. Duration: 0.7909375 2023-10-02 09:23:46,478 WARNING [train.py:1204] (2/4) Exclude cut with ID _41298_210_9019_1_1533430371450_4822143_498-186993-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 09:23:49,569 INFO [train.py:1046] (2/4) Epoch 24, batch 2300, loss[loss=0.2231, simple_loss=0.2832, pruned_loss=0.08153, over 19606.00 frames. ], tot_loss[loss=0.172, simple_loss=0.2485, pruned_loss=0.04775, over 4697426.48 frames. ], batch size: 390, lr: 4.28e-03, grad_scale: 8.0 2023-10-02 09:23:51,213 WARNING [train.py:1204] (2/4) Exclude cut with ID _17956_210_4353_1_1531209568125_6979970_54-77610-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 09:23:52,613 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_40047_210_2290_1_1533290519_2374311_257-335162-0_sp1.1 from training. Duration: 0.863625 2023-10-02 09:23:53,997 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9653W0193-675471 from training. Duration: 0.9656875 2023-10-02 09:23:54,129 WARNING [train.py:1204] (2/4) Exclude cut with ID _25248_210_8750_1_1532062825941_5554180_243-240536-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 09:24:02,963 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_32360_210_4198_1_1532689160_3648436_60-51908-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 09:24:02,971 WARNING [train.py:1204] (2/4) Exclude cut with ID _4092_210_2151_1_1526643044449_3135759_227-213972-0_sp0.9 from training. Duration: 0.6 2023-10-02 09:24:03,009 WARNING [train.py:1204] (2/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_392-173500-0_sp1.1 from training. Duration: 0.97275 2023-10-02 09:24:03,053 WARNING [train.py:1204] (2/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_71-329150-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 09:24:03,055 WARNING [train.py:1204] (2/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_866-173440-0 from training. Duration: 0.9 2023-10-02 09:24:04,424 WARNING [train.py:1204] (2/4) Exclude cut with ID _14832_210_8750_1_1530691351539_3171348_1-54315-0_sp0.9 from training. Duration: 0.9666875 2023-10-02 09:24:07,281 WARNING [train.py:1204] (2/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_430-116139-0_sp1.1 from training. Duration: 0.77275 2023-10-02 09:24:07,315 WARNING [train.py:1204] (2/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_356-45054-0_sp0.9 from training. Duration: 0.9555625 2023-10-02 09:24:07,978 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.3.encoder.layers.0.self_attn_weights.whiten_keys, num_groups=8, num_channels=256, metric=4.90 vs. limit=6.0 2023-10-02 09:24:11,104 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_3531_210_3528_1_1526294203_5216456_309-227302-0_sp0.9 from training. Duration: 0.7666875 2023-10-02 09:24:13,862 WARNING [train.py:1204] (2/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_301-330455-0_sp1.1 from training. Duration: 0.72725 2023-10-02 09:24:15,339 WARNING [train.py:1204] (2/4) Exclude cut with ID _23557_210_8750_1_1531890106370_6846255_667-145945-0_sp1.1 from training. Duration: 0.92725 2023-10-02 09:24:21,347 WARNING [train.py:1204] (2/4) Exclude cut with ID _14860_210_4915_1_1530702020340_7343240_836-328489-0_sp1.1 from training. Duration: 0.8181875 2023-10-02 09:24:21,583 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.self_attn_weights.pos_emb_skip_rate, batch_count=829986.6666666666, ans=0.0 2023-10-02 09:24:22,592 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_37152_210_9697_1_1533106573_3844188_416-333249-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 09:24:25,298 WARNING [train.py:1204] (2/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_538-32291-0_sp0.9 from training. Duration: 0.92225 2023-10-02 09:24:28,610 WARNING [train.py:1204] (2/4) Exclude cut with ID _32704_210_8954_1_1533017488840_6747370_965-176824-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 09:24:29,935 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.478e+02 1.846e+02 2.063e+02 2.350e+02 3.320e+02, threshold=4.127e+02, percent-clipped=0.0 2023-10-02 09:24:31,447 WARNING [train.py:1204] (2/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_86-19601-0_sp1.1 from training. Duration: 0.77275 2023-10-02 09:24:32,794 WARNING [train.py:1204] (2/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_436-318113-0_sp1.1 from training. Duration: 0.6818125 2023-10-02 09:24:32,822 WARNING [train.py:1204] (2/4) Exclude cut with ID _41817_210_18835_1_1533434477160_7423049_555-293448-0_sp0.9 from training. Duration: 0.888875 2023-10-02 09:24:32,840 WARNING [train.py:1204] (2/4) Exclude cut with ID _4697_210_2069_1_1526815801953_3823098_68-219807-0 from training. Duration: 0.47 2023-10-02 09:24:35,735 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7544_210_6753_1_1528591327_1471426_33-346031-0_sp1.1 from training. Duration: 0.5545625 2023-10-02 09:24:35,737 WARNING [train.py:1204] (2/4) Exclude cut with ID _49748_210_24116_1_1533986971737_3158910_263-52113-0_sp1.1 from training. Duration: 0.97275 2023-10-02 09:24:35,782 WARNING [train.py:1204] (2/4) Exclude cut with ID _64639_210_5816_1_1535367619437_3528000_340-24580-0_sp1.1 from training. Duration: 0.92725 2023-10-02 09:24:35,799 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_47228_210_4983_1_1533780021_3672027_479-114941-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 09:24:36,494 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.3.encoder.layers.0.nonlin_attention.whiten2, num_groups=1, num_channels=512, metric=8.73 vs. limit=15.0 2023-10-02 09:24:37,082 WARNING [train.py:1204] (2/4) Exclude cut with ID _44585_210_18628_1_1533605350387_4518199_506-119659-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 09:24:37,156 WARNING [train.py:1204] (2/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_314-126819-0_sp1.1 from training. Duration: 0.4545625 2023-10-02 09:24:37,159 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_2541_210_1794_1_1525590269_3084765_156-173713-0_sp0.9 from training. Duration: 0.9 2023-10-02 09:24:38,502 WARNING [train.py:1204] (2/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_108-255430-0 from training. Duration: 0.97 2023-10-02 09:24:38,517 WARNING [train.py:1204] (2/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_791-45149-0_sp1.1 from training. Duration: 0.7818125 2023-10-02 09:24:38,522 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_53378_210_17282_1_1534227054_1261425_10-118295-0_sp1.1 from training. Duration: 0.97275 2023-10-02 09:24:40,399 WARNING [train.py:1204] (2/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_148-158509-0 from training. Duration: 0.41 2023-10-02 09:24:40,592 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.feed_forward3.hidden_balancer.prob, batch_count=830053.3333333334, ans=0.125 2023-10-02 09:24:46,306 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_34726_210_4525_1_1533019474_5030395_687-291305-0_sp1.1 from training. Duration: 0.936375 2023-10-02 09:24:50,934 WARNING [train.py:1204] (2/4) Exclude cut with ID _14860_210_4915_1_1530702020340_7343240_264-324537-0_sp1.1 from training. Duration: 0.963625 2023-10-02 09:24:53,563 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_41251_210_15368_1_1533429767_4820695_591-46386-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 09:24:53,586 WARNING [train.py:1204] (2/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_86-19601-0_sp0.9 from training. Duration: 0.9444375 2023-10-02 09:24:55,331 WARNING [train.py:1204] (2/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_401-255491-0_sp0.9 from training. Duration: 0.77775 2023-10-02 09:24:55,531 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.self_attn_weights.pos_emb_skip_rate, batch_count=830120.0, ans=0.0 2023-10-02 09:24:56,765 WARNING [train.py:1204] (2/4) Exclude cut with ID _8696_210_1876_1_1528545632440_3345058_209-54069-0_sp1.1 from training. Duration: 0.6090625 2023-10-02 09:24:56,773 WARNING [train.py:1204] (2/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_369-325184-0_sp1.1 from training. Duration: 0.9 2023-10-02 09:24:56,836 WARNING [train.py:1204] (2/4) Exclude cut with ID _14687_210_5854_1_1530703838259_3664889_115-239046-0_sp0.9 from training. Duration: 0.7444375 2023-10-02 09:24:58,207 WARNING [train.py:1204] (2/4) Exclude cut with ID _8064_210_5399_1_1528372299291_4282973_168-50337-0 from training. Duration: 0.84 2023-10-02 09:25:03,534 INFO [train.py:1046] (2/4) Epoch 24, batch 2350, loss[loss=0.173, simple_loss=0.2421, pruned_loss=0.05199, over 23546.00 frames. ], tot_loss[loss=0.172, simple_loss=0.2487, pruned_loss=0.04762, over 4709226.31 frames. ], batch size: 256, lr: 4.28e-03, grad_scale: 8.0 2023-10-02 09:25:03,652 WARNING [train.py:1204] (2/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_909-64151-0_sp1.1 from training. Duration: 0.8545625 2023-10-02 09:25:03,668 WARNING [train.py:1204] (2/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_597-154640-0 from training. Duration: 0.9 2023-10-02 09:25:06,544 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.ff3_skip_rate, batch_count=830186.6666666666, ans=0.0 2023-10-02 09:25:07,290 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.0.layers.0.self_attn_weights.whiten_keys, num_groups=4, num_channels=128, metric=3.09 vs. limit=6.0 2023-10-02 09:25:09,086 WARNING [train.py:1204] (2/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_126-274694-0 from training. Duration: 0.98 2023-10-02 09:25:12,483 WARNING [train.py:1204] (2/4) Exclude cut with ID _21783_210_11357_1_1531742458730_3762679_344-269786-0_sp1.1 from training. Duration: 0.97275 2023-10-02 09:25:15,616 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_37818_210_4238_1_1533620910_4720083_322-253154-0_sp1.1 from training. Duration: 0.92725 2023-10-02 09:25:15,619 WARNING [train.py:1204] (2/4) Exclude cut with ID _58895_210_8750_1_1535184081320_3649089_221-15823-0_sp1.1 from training. Duration: 0.92725 2023-10-02 09:25:15,628 WARNING [train.py:1204] (2/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_9-21737-0_sp1.1 from training. Duration: 0.8818125 2023-10-02 09:25:16,914 WARNING [train.py:1204] (2/4) Exclude cut with ID _14069_210_5632_1_1530512692033_4803390_522-341588-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 09:25:18,281 WARNING [train.py:1204] (2/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_363-185703-0 from training. Duration: 0.95 2023-10-02 09:25:22,972 WARNING [train.py:1204] (2/4) Exclude cut with ID _41817_210_18835_1_1533434477160_7423049_265-76216-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 09:25:29,099 WARNING [train.py:1204] (2/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_841-341679-0 from training. Duration: 0.58 2023-10-02 09:25:29,201 WARNING [train.py:1204] (2/4) Exclude cut with ID _55429_210_10626_1_1534467588527_7174143_97-335084-0_sp1.1 from training. Duration: 0.8818125 2023-10-02 09:25:33,319 WARNING [train.py:1204] (2/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_690-126306-0_sp0.9 from training. Duration: 0.8666875 2023-10-02 09:25:34,594 WARNING [train.py:1204] (2/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_102-92751-0_sp1.1 from training. Duration: 0.9 2023-10-02 09:25:35,990 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_3787_210_2040_1_1526477544_5478586_603-120324-0_sp1.1 from training. Duration: 0.736375 2023-10-02 09:25:37,401 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_33348_210_5632_1_1533086912_3324030_85-28583-0 from training. Duration: 0.82 2023-10-02 09:25:38,738 WARNING [train.py:1204] (2/4) Exclude cut with ID _60057_210_10640_1_1534903065793_3333150_362-230377-0_sp1.1 from training. Duration: 0.7909375 2023-10-02 09:25:40,158 WARNING [train.py:1204] (2/4) Exclude cut with ID _60113_210_16271_1_1535164212481_4386079_194-146486-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 09:25:40,177 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_53753_210_7234_1_1534320003_3594148_171-277668-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 09:25:40,210 WARNING [train.py:1204] (2/4) Exclude cut with ID _34423_210_8750_1_1533430798456_3330199_251-332788-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 09:25:41,163 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.0.layers.0.whiten, num_groups=1, num_channels=192, metric=4.33 vs. limit=12.0 2023-10-02 09:25:43,222 WARNING [train.py:1204] (2/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_663-166973-0_sp0.9 from training. Duration: 0.888875 2023-10-02 09:25:43,964 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.2.encoder.layers.0.whiten, num_groups=1, num_channels=384, metric=5.32 vs. limit=12.0 2023-10-02 09:25:47,599 WARNING [train.py:1204] (2/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_927-43827-0 from training. Duration: 0.74 2023-10-02 09:25:48,292 WARNING [train.py:1204] (2/4) Exclude cut with ID _28323_210_16187_1_1532480395667_4100300_306-74561-0_sp1.1 from training. Duration: 0.8545625 2023-10-02 09:25:50,843 WARNING [train.py:1204] (2/4) Exclude cut with ID _15037_210_2253_1_1530752294104_3267650_139-337989-0_sp1.1 from training. Duration: 0.92725 2023-10-02 09:25:50,870 WARNING [train.py:1204] (2/4) Exclude cut with ID _49039_210_6315_1_1533967537541_3846118_460-263670-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 09:25:51,483 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.4.encoder.layers.1.conv_module1.whiten, num_groups=1, num_channels=384, metric=3.15 vs. limit=15.0 2023-10-02 09:25:52,256 WARNING [train.py:1204] (2/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_186-309161-0 from training. Duration: 0.54 2023-10-02 09:25:53,577 WARNING [train.py:1204] (2/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_87-278686-0_sp1.1 from training. Duration: 0.7 2023-10-02 09:25:54,968 WARNING [train.py:1204] (2/4) Exclude cut with ID _27052_210_5986_1_1532329197187_3585009_314-106499-0 from training. Duration: 0.92 2023-10-02 09:25:54,990 WARNING [train.py:1204] (2/4) Exclude cut with ID _3465_210_3525_1_1526175667060_3574049_412-287526-0_sp0.9 from training. Duration: 0.8 2023-10-02 09:26:01,360 WARNING [train.py:1204] (2/4) Exclude cut with ID _22961_210_8254_1_1531976366672_4278180_317-199546-0 from training. Duration: 0.86 2023-10-02 09:26:06,895 WARNING [train.py:1204] (2/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_381-317791-0 from training. Duration: 0.73 2023-10-02 09:26:08,286 WARNING [train.py:1204] (2/4) Exclude cut with ID _44010_210_4525_1_1533884262964_4013620_560-310411-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 09:26:08,292 WARNING [train.py:1204] (2/4) Exclude cut with ID _5294_210_1747_1_1527249643928_3696179_342-270278-0_sp0.9 from training. Duration: 0.611125 2023-10-02 09:26:08,325 WARNING [train.py:1204] (2/4) Exclude cut with ID ID0473W0002-918951 from training. Duration: 0.924 2023-10-02 09:26:08,341 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1083-220122-0 from training. Duration: 0.51 2023-10-02 09:26:11,228 WARNING [train.py:1204] (2/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_138-154812-0 from training. Duration: 0.78 2023-10-02 09:26:14,050 WARNING [train.py:1204] (2/4) Exclude cut with ID _44982_210_1511_1_1534312820108_3726029_331-19420-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 09:26:17,296 INFO [train.py:1046] (2/4) Epoch 24, batch 2400, loss[loss=0.1734, simple_loss=0.251, pruned_loss=0.04786, over 24053.00 frames. ], tot_loss[loss=0.1713, simple_loss=0.2484, pruned_loss=0.04708, over 4715601.23 frames. ], batch size: 80, lr: 4.28e-03, grad_scale: 16.0 2023-10-02 09:26:17,400 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_2655_210_2087_1_1525688959_3897400_202-147557-0_sp0.9 from training. Duration: 0.9444375 2023-10-02 09:26:20,249 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_23850_210_4238_1_1532232597_1632433_32-185135-0_sp0.9 from training. Duration: 0.9555625 2023-10-02 09:26:22,160 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_14056_210_4163_1_1530604483_7572827_46-93547-0_sp1.1 from training. Duration: 0.863625 2023-10-02 09:26:23,444 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7931_210_1794_1_1528594728_5564059_280-279560-0 from training. Duration: 0.98 2023-10-02 09:26:23,474 WARNING [train.py:1204] (2/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_937-208281-0 from training. Duration: 0.85 2023-10-02 09:26:28,286 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_14928_210_5632_1_1530773461_4714000_454-112198-0_sp0.9 from training. Duration: 0.7333125 2023-10-02 09:26:28,288 WARNING [train.py:1204] (2/4) Exclude cut with ID _32704_210_8954_1_1533017488840_6747370_944-153333-0_sp1.1 from training. Duration: 0.963625 2023-10-02 09:26:28,498 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.0.conv_module1.balancer2.prob, batch_count=830520.0, ans=0.125 2023-10-02 09:26:30,942 WARNING [train.py:1204] (2/4) Exclude cut with ID _14821_210_8750_1_1530680503991_6829359_2-227151-0 from training. Duration: 0.83 2023-10-02 09:26:30,970 WARNING [train.py:1204] (2/4) Exclude cut with ID _28461_210_2253_1_1532771775189_3041299_96-152158-0_sp1.1 from training. Duration: 0.77275 2023-10-02 09:26:31,148 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.conv_module2.balancer2.prob, batch_count=830586.6666666666, ans=0.125 2023-10-02 09:26:32,281 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7837_210_1792_1_1528453445_5841305_654-54195-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 09:26:32,321 WARNING [train.py:1204] (2/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_344-152526-0 from training. Duration: 0.53 2023-10-02 09:26:38,310 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_30167_210_3528_1_1532428012_4772375_480-142230-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 09:26:41,014 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_160-218721-0 from training. Duration: 0.96 2023-10-02 09:26:46,991 WARNING [train.py:1204] (2/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_65-88242-0_sp0.9 from training. Duration: 0.62225 2023-10-02 09:26:50,374 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_34886_210_10633_1_1533203882_1755661_76-156096-0 from training. Duration: 0.87 2023-10-02 09:26:50,610 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=830653.3333333334, ans=0.1 2023-10-02 09:26:51,843 WARNING [train.py:1204] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_188-38388-0_sp1.1 from training. Duration: 0.92725 2023-10-02 09:26:53,194 WARNING [train.py:1204] (2/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_81-289335-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 09:26:56,613 WARNING [train.py:1204] (2/4) Exclude cut with ID _59493_210_17282_1_1535078419077_3225879_110-8778-0_sp1.1 from training. Duration: 0.936375 2023-10-02 09:26:57,838 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.522e+02 1.801e+02 2.069e+02 2.462e+02 3.779e+02, threshold=4.137e+02, percent-clipped=0.0 2023-10-02 09:26:57,932 WARNING [train.py:1204] (2/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_188-260011-0 from training. Duration: 0.73 2023-10-02 09:26:57,974 WARNING [train.py:1204] (2/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_453-133983-0_sp0.9 from training. Duration: 0.8666875 2023-10-02 09:27:05,451 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8054_210_7927_1_1528627721_4984094_553-17379-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 09:27:06,924 WARNING [train.py:1204] (2/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_341-171072-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 09:27:08,531 WARNING [train.py:1204] (2/4) Exclude cut with ID _30800_210_22382_1_1532499347904_4515959_265-257286-0_sp1.1 from training. Duration: 0.963625 2023-10-02 09:27:08,623 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_403-39619-0_sp0.9 from training. Duration: 0.9333125 2023-10-02 09:27:10,509 WARNING [train.py:1204] (2/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_464-33931-0_sp0.9 from training. Duration: 0.6 2023-10-02 09:27:10,545 WARNING [train.py:1204] (2/4) Exclude cut with ID _60804_210_9642_1_1534831199859_7268299_632-143863-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 09:27:10,552 WARNING [train.py:1204] (2/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_431-310997-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 09:27:11,837 WARNING [train.py:1204] (2/4) Exclude cut with ID _31649_210_6228_1_1532584608063_4091078_114-263860-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 09:27:11,849 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6961_210_2087_1_1527937168_6815011_562-165518-0_sp0.9 from training. Duration: 0.8555625 2023-10-02 09:27:15,822 WARNING [train.py:1204] (2/4) Exclude cut with ID _27052_210_5986_1_1532329197187_3585009_338-325870-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 09:27:17,648 WARNING [train.py:1204] (2/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_469-94673-0_sp1.1 from training. Duration: 0.7181875 2023-10-02 09:27:17,667 WARNING [train.py:1204] (2/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_259-34038-0 from training. Duration: 0.74 2023-10-02 09:27:18,981 WARNING [train.py:1204] (2/4) Exclude cut with ID _14922_210_6228_1_1530707435273_3693310_388-129552-0 from training. Duration: 0.96 2023-10-02 09:27:20,437 WARNING [train.py:1204] (2/4) Exclude cut with ID _28394_210_8913_1_1532430049518_3650118_342-251455-0_sp1.1 from training. Duration: 0.936375 2023-10-02 09:27:20,464 WARNING [train.py:1204] (2/4) Exclude cut with ID _53731_210_7994_1_1534553979244_3757214_8-191390-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 09:27:21,819 WARNING [train.py:1204] (2/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_613-232856-0 from training. Duration: 0.54 2023-10-02 09:27:21,880 WARNING [train.py:1204] (2/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_557-349534-0 from training. Duration: 0.91 2023-10-02 09:27:21,888 WARNING [train.py:1204] (2/4) Exclude cut with ID _14224_210_4381_1_1530529062037_4264010_379-184223-0 from training. Duration: 0.85 2023-10-02 09:27:21,891 WARNING [train.py:1204] (2/4) Exclude cut with ID IC0125W0001-61807 from training. Duration: 0.966 2023-10-02 09:27:21,974 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_229-45300-0 from training. Duration: 0.57 2023-10-02 09:27:23,486 WARNING [train.py:1204] (2/4) Exclude cut with ID _30304_210_6319_1_1532476827261_7582060_1069-24743-0_sp1.1 from training. Duration: 0.92725 2023-10-02 09:27:25,547 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_41452_210_7291_1_1533437687_3941281_323-172873-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 09:27:25,563 WARNING [train.py:1204] (2/4) Exclude cut with ID _37086_210_9038_1_1532999540891_7021250_802-226709-0_sp1.1 from training. Duration: 0.97275 2023-10-02 09:27:26,915 WARNING [train.py:1204] (2/4) Exclude cut with ID IC0144W0500-71767 from training. Duration: 0.931875 2023-10-02 09:27:28,287 WARNING [train.py:1204] (2/4) Exclude cut with ID _39846_210_12651_1_1533605310265_2115212_233-129699-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 09:27:28,356 WARNING [train.py:1204] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_574-45541-0_sp0.9 from training. Duration: 0.72225 2023-10-02 09:27:31,263 WARNING [train.py:1204] (2/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_372-298093-0_sp1.1 from training. Duration: 0.72725 2023-10-02 09:27:31,274 WARNING [train.py:1204] (2/4) Exclude cut with ID _62574_210_2655_1_1535180407058_3933249_34-26343-0_sp1.1 from training. Duration: 0.97275 2023-10-02 09:27:32,551 INFO [train.py:1046] (2/4) Epoch 24, batch 2450, loss[loss=0.1403, simple_loss=0.1854, pruned_loss=0.04766, over 19357.00 frames. ], tot_loss[loss=0.1698, simple_loss=0.2458, pruned_loss=0.04693, over 4693253.27 frames. ], batch size: 388, lr: 4.28e-03, grad_scale: 16.0 2023-10-02 09:27:32,865 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.ff2_skip_rate, batch_count=830853.3333333334, ans=0.0 2023-10-02 09:27:34,246 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.conv_module2.balancer2.prob, batch_count=830853.3333333334, ans=0.125 2023-10-02 09:27:35,820 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_512-256238-0_sp1.1 from training. Duration: 0.963625 2023-10-02 09:27:35,826 WARNING [train.py:1204] (2/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_196-55502-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 09:27:37,246 WARNING [train.py:1204] (2/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_195-167011-0 from training. Duration: 0.75 2023-10-02 09:27:40,089 WARNING [train.py:1204] (2/4) Exclude cut with ID _13678_210_7497_1_1530530069226_3463170_345-230416-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 09:27:40,105 WARNING [train.py:1204] (2/4) Exclude cut with ID _51078_210_9070_1_1534161557424_3805250_225-327809-0_sp1.1 from training. Duration: 0.963625 2023-10-02 09:27:43,445 WARNING [train.py:1204] (2/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_18-189198-0_sp1.1 from training. Duration: 0.8181875 2023-10-02 09:27:45,151 WARNING [train.py:1204] (2/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_329-82235-0_sp1.1 from training. Duration: 0.7454375 2023-10-02 09:27:45,159 WARNING [train.py:1204] (2/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_726-270780-0_sp0.9 from training. Duration: 0.9555625 2023-10-02 09:27:45,196 WARNING [train.py:1204] (2/4) Exclude cut with ID _66873_210_5517_1_1535454136506_4293370_654-52713-0 from training. Duration: 0.77 2023-10-02 09:27:48,155 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=830920.0, ans=0.1 2023-10-02 09:27:49,432 WARNING [train.py:1204] (2/4) Exclude cut with ID _20455_210_5219_1_1531565996775_8098100_191-16006-0_sp1.1 from training. Duration: 0.963625 2023-10-02 09:27:50,799 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_3171_210_2087_1_1526109084_2330007_66-260583-0_sp1.1 from training. Duration: 0.6545625 2023-10-02 09:27:50,849 WARNING [train.py:1204] (2/4) Exclude cut with ID _38670_210_15379_1_1533448810659_4391059_63-129006-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 09:27:53,358 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.2.encoder.layers.1.nonlin_attention.whiten1, num_groups=1, num_channels=288, metric=5.29 vs. limit=10.0 2023-10-02 09:27:55,372 WARNING [train.py:1204] (2/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_393-57888-0_sp0.9 from training. Duration: 0.711125 2023-10-02 09:27:55,404 WARNING [train.py:1204] (2/4) Exclude cut with ID _6275_210_2084_1_1528110062213_7669049_857-298942-0_sp0.9 from training. Duration: 0.9444375 2023-10-02 09:27:56,730 WARNING [train.py:1204] (2/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_239-22076-0_sp0.9 from training. Duration: 0.9444375 2023-10-02 09:27:56,752 WARNING [train.py:1204] (2/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_424-227947-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 09:27:59,445 WARNING [train.py:1204] (2/4) Exclude cut with ID _14069_210_5632_1_1530512692033_4803390_95-1896-0 from training. Duration: 0.98 2023-10-02 09:27:59,518 WARNING [train.py:1204] (2/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_96-244299-0_sp0.9 from training. Duration: 0.97775 2023-10-02 09:28:08,303 WARNING [train.py:1204] (2/4) Exclude cut with ID _46683_210_9642_1_1533730913492_4591116_562-319825-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 09:28:10,180 WARNING [train.py:1204] (2/4) Exclude cut with ID _64639_210_5816_1_1535367619437_3528000_284-173474-0_sp1.1 from training. Duration: 0.963625 2023-10-02 09:28:11,606 WARNING [train.py:1204] (2/4) Exclude cut with ID _29529_210_7234_1_1532393961455_3977460_497-100133-0_sp1.1 from training. Duration: 0.936375 2023-10-02 09:28:12,897 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_12868_210_6753_1_1530701932_530275_18-76322-0_sp0.9 from training. Duration: 0.9666875 2023-10-02 09:28:12,952 WARNING [train.py:1204] (2/4) Exclude cut with ID _50093_210_4220_1_1534417141207_3432299_116-303447-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 09:28:14,368 WARNING [train.py:1204] (2/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_44-79427-0_sp1.1 from training. Duration: 0.8909375 2023-10-02 09:28:15,704 WARNING [train.py:1204] (2/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_887-249555-0 from training. Duration: 0.77 2023-10-02 09:28:19,101 WARNING [train.py:1204] (2/4) Exclude cut with ID _28461_210_2253_1_1532771775189_3041299_96-152158-0_sp0.9 from training. Duration: 0.9444375 2023-10-02 09:28:19,150 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_14058_210_4163_1_1530690953_6327180_49-210687-0_sp1.1 from training. Duration: 0.8545625 2023-10-02 09:28:21,871 WARNING [train.py:1204] (2/4) Exclude cut with ID _40399_210_4353_1_1533305658194_3279406_351-127903-0_sp1.1 from training. Duration: 0.97275 2023-10-02 09:28:21,876 WARNING [train.py:1204] (2/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_184-206939-0_sp1.1 from training. Duration: 0.936375 2023-10-02 09:28:25,425 WARNING [train.py:1204] (2/4) Exclude cut with ID _14100_210_8341_1_1530529320613_3678069_258-224599-0_sp1.1 from training. Duration: 0.7 2023-10-02 09:28:26,718 WARNING [train.py:1204] (2/4) Exclude cut with ID _6493_210_1747_1_1528459210424_4012470_324-323854-0 from training. Duration: 0.92 2023-10-02 09:28:26,812 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_12868_210_6753_1_1530702479_3610564_151-325626-0_sp1.1 from training. Duration: 0.8818125 2023-10-02 09:28:28,138 WARNING [train.py:1204] (2/4) Exclude cut with ID _14528_210_8254_1_1530669700514_4306939_106-43145-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 09:28:28,151 WARNING [train.py:1204] (2/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_686-54383-0 from training. Duration: 0.87 2023-10-02 09:28:29,409 WARNING [train.py:1204] (2/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_801-102362-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 09:28:29,477 WARNING [train.py:1204] (2/4) Exclude cut with ID _48677_210_5663_1_1533902405919_3892989_408-40740-0_sp0.9 from training. Duration: 0.92225 2023-10-02 09:28:32,328 WARNING [train.py:1204] (2/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_448-212135-0_sp1.1 from training. Duration: 0.82725 2023-10-02 09:28:35,035 WARNING [train.py:1204] (2/4) Exclude cut with ID _59170_210_6721_1_1534983633960_4203335_320-124047-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 09:28:35,076 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_4692_210_3528_1_1526899755_4323254_107-44401-0_sp1.1 from training. Duration: 0.7818125 2023-10-02 09:28:41,368 WARNING [train.py:1204] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_731-2428-0 from training. Duration: 0.83 2023-10-02 09:28:41,457 WARNING [train.py:1204] (2/4) Exclude cut with ID _29857_210_20034_1_1532395886753_3798160_335-163562-0_sp1.1 from training. Duration: 0.77275 2023-10-02 09:28:46,888 INFO [train.py:1046] (2/4) Epoch 24, batch 2500, loss[loss=0.1879, simple_loss=0.2716, pruned_loss=0.05211, over 24062.00 frames. ], tot_loss[loss=0.169, simple_loss=0.245, pruned_loss=0.04649, over 4697193.84 frames. ], batch size: 80, lr: 4.28e-03, grad_scale: 16.0 2023-10-02 09:28:48,722 WARNING [train.py:1204] (2/4) Exclude cut with ID _14832_210_8750_1_1530691351539_3171348_171-275004-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 09:28:55,789 WARNING [train.py:1204] (2/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_105-328174-0_sp0.9 from training. Duration: 0.8444375 2023-10-02 09:28:57,767 WARNING [train.py:1204] (2/4) Exclude cut with ID _68965_210_1883_1_1535698962272_3497490_164-118421-0_sp1.1 from training. Duration: 0.936375 2023-10-02 09:28:59,067 WARNING [train.py:1204] (2/4) Exclude cut with ID _19896_210_1883_1_1531709022662_6649569_380-24170-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 09:28:59,079 WARNING [train.py:1204] (2/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_505-205270-0_sp1.1 from training. Duration: 0.3454375 2023-10-02 09:29:04,650 WARNING [train.py:1204] (2/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_427-208901-0_sp1.1 from training. Duration: 0.6181875 2023-10-02 09:29:04,696 WARNING [train.py:1204] (2/4) Exclude cut with ID _23818_210_6228_1_1531917063884_3676920_85-326916-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 09:29:06,105 WARNING [train.py:1204] (2/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_101-293367-0_sp1.1 from training. Duration: 0.57275 2023-10-02 09:29:06,112 WARNING [train.py:1204] (2/4) Exclude cut with ID _5409_210_1388_1_1527292850189_5865191_178-127970-0_sp1.1 from training. Duration: 0.5181875 2023-10-02 09:29:06,160 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_403-39619-0 from training. Duration: 0.84 2023-10-02 09:29:07,601 WARNING [train.py:1204] (2/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_427-87841-0_sp1.1 from training. Duration: 0.963625 2023-10-02 09:29:08,816 WARNING [train.py:1204] (2/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_286-68181-0_sp1.1 from training. Duration: 0.97275 2023-10-02 09:29:10,297 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6673_210_3273_1_1527767790_3173240_327-306185-0 from training. Duration: 0.83 2023-10-02 09:29:10,321 WARNING [train.py:1204] (2/4) Exclude cut with ID _3675_210_4557_1_1526639934408_4182089_106-203713-0_sp1.1 from training. Duration: 0.963625 2023-10-02 09:29:12,143 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_53071_210_16554_1_1534493434_1276796_130-36076-0 from training. Duration: 0.95 2023-10-02 09:29:12,165 WARNING [train.py:1204] (2/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_586-93054-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 09:29:16,863 WARNING [train.py:1204] (2/4) Exclude cut with ID _23825_210_13449_1_1531915313526_3497150_262-10571-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 09:29:18,163 WARNING [train.py:1204] (2/4) Exclude cut with ID _28783_210_7943_1_1532671164808_7155479_432-67885-0_sp1.1 from training. Duration: 0.97275 2023-10-02 09:29:20,123 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6287_210_3273_1_1527509506_2653716_182-199887-0_sp0.9 from training. Duration: 0.5666875 2023-10-02 09:29:21,529 WARNING [train.py:1204] (2/4) Exclude cut with ID _3231_210_2106_1_1526205864934_6784220_171-256004-0 from training. Duration: 0.86 2023-10-02 09:29:22,788 WARNING [train.py:1204] (2/4) Exclude cut with ID _69051_210_1531_1_1535885566901_3829538_332-92090-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 09:29:25,470 WARNING [train.py:1204] (2/4) Exclude cut with ID _3675_210_4557_1_1526639934408_4182089_104-324125-0_sp1.1 from training. Duration: 0.963625 2023-10-02 09:29:28,277 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.531e+02 1.796e+02 1.940e+02 2.141e+02 3.270e+02, threshold=3.880e+02, percent-clipped=0.0 2023-10-02 09:29:29,726 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_41764_210_2521_1_1533686110_4089313_501-2119-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 09:29:33,199 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_14056_210_4163_1_1530604483_7572827_56-100988-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 09:29:35,472 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.1.encoder.layers.0.nonlin_attention.whiten2, num_groups=1, num_channels=256, metric=5.40 vs. limit=15.0 2023-10-02 09:29:36,004 WARNING [train.py:1204] (2/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_398-239819-0_sp1.1 from training. Duration: 0.9 2023-10-02 09:29:40,210 WARNING [train.py:1204] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_74-48717-0_sp1.1 from training. Duration: 0.52725 2023-10-02 09:29:41,629 WARNING [train.py:1204] (2/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_177-49092-0 from training. Duration: 0.91 2023-10-02 09:29:42,941 WARNING [train.py:1204] (2/4) Exclude cut with ID _62337_210_12392_1_1535009273105_3412229_44-176353-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 09:29:42,957 WARNING [train.py:1204] (2/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_110-130961-0_sp1.1 from training. Duration: 0.42725 2023-10-02 09:29:43,282 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.conv_module2.balancer1.prob, batch_count=831386.6666666666, ans=0.125 2023-10-02 09:29:44,978 WARNING [train.py:1204] (2/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_37-189262-0_sp1.1 from training. Duration: 0.8909375 2023-10-02 09:29:44,979 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_3172_210_2087_1_1526292929_3987865_197-97762-0_sp0.9 from training. Duration: 0.8333125 2023-10-02 09:29:47,489 WARNING [train.py:1204] (2/4) Exclude cut with ID IC0677W0065-334897 from training. Duration: 0.896 2023-10-02 09:29:47,490 WARNING [train.py:1204] (2/4) Exclude cut with ID IC0677W0080-334912 from training. Duration: 0.768 2023-10-02 09:29:47,494 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_120-41668-0 from training. Duration: 0.7 2023-10-02 09:29:50,787 WARNING [train.py:1204] (2/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_460-205287-0_sp1.1 from training. Duration: 0.963625 2023-10-02 09:29:53,387 WARNING [train.py:1204] (2/4) Exclude cut with ID _14632_210_9988_1_1530691279392_3923200_102-204477-0 from training. Duration: 0.97 2023-10-02 09:29:53,392 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_34815_210_6388_1_1533381320_3571903_279-305565-0 from training. Duration: 0.92 2023-10-02 09:29:53,458 WARNING [train.py:1204] (2/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_91-148831-0_sp1.1 from training. Duration: 0.9 2023-10-02 09:29:53,510 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6291_210_3273_1_1527683649_1078498_82-221196-0 from training. Duration: 0.76 2023-10-02 09:29:57,628 WARNING [train.py:1204] (2/4) Exclude cut with ID _38020_210_5986_1_1533112252259_7006089_493-102745-0 from training. Duration: 0.98 2023-10-02 09:30:00,295 INFO [train.py:1046] (2/4) Epoch 24, batch 2550, loss[loss=0.1832, simple_loss=0.2502, pruned_loss=0.05807, over 22782.00 frames. ], tot_loss[loss=0.1701, simple_loss=0.2461, pruned_loss=0.04701, over 4696185.49 frames. ], batch size: 322, lr: 4.28e-03, grad_scale: 8.0 2023-10-02 09:30:00,358 WARNING [train.py:1204] (2/4) Exclude cut with ID _61133_210_9969_1_1535196777227_3696409_40-93442-0_sp1.1 from training. Duration: 0.92725 2023-10-02 09:30:01,860 WARNING [train.py:1204] (2/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_45-258852-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 09:30:01,895 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_53071_210_16554_1_1534493434_1276796_130-36076-0_sp1.1 from training. Duration: 0.863625 2023-10-02 09:30:05,156 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8694_210_6400_1_1529065754_3260756_332-13553-0_sp1.1 from training. Duration: 0.92725 2023-10-02 09:30:06,470 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_405-339986-0 from training. Duration: 0.51 2023-10-02 09:30:06,526 WARNING [train.py:1204] (2/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_927-43827-0_sp1.1 from training. Duration: 0.67275 2023-10-02 09:30:10,493 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_397-147196-0 from training. Duration: 0.8 2023-10-02 09:30:10,594 WARNING [train.py:1204] (2/4) Exclude cut with ID 3557-8342-0013-54691-0_sp1.1 from training. Duration: 0.836375 2023-10-02 09:30:13,471 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_47438_210_5399_1_1533797471_4513356_626-127722-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 09:30:16,683 WARNING [train.py:1204] (2/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_256-323883-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 09:30:16,688 WARNING [train.py:1204] (2/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_839-173017-0_sp0.9 from training. Duration: 0.4555625 2023-10-02 09:30:18,583 WARNING [train.py:1204] (2/4) Exclude cut with ID _5289_210_2084_1_1527678202736_3779299_392-261988-0_sp1.1 from training. Duration: 0.5909375 2023-10-02 09:30:18,633 WARNING [train.py:1204] (2/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_119-224384-0_sp1.1 from training. Duration: 0.936375 2023-10-02 09:30:20,470 WARNING [train.py:1204] (2/4) Exclude cut with ID _57523_210_1856_1_1534766294545_3732989_262-267933-0_sp1.1 from training. Duration: 0.963625 2023-10-02 09:30:21,904 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_35361_210_4238_1_1533262810_7793476_675-87012-0_sp0.9 from training. Duration: 0.988875 2023-10-02 09:30:21,920 WARNING [train.py:1204] (2/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_17-90541-0 from training. Duration: 0.94 2023-10-02 09:30:23,125 WARNING [train.py:1204] (2/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_535-14410-0_sp1.1 from training. Duration: 0.42725 2023-10-02 09:30:23,130 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_24673_210_6400_1_1531968633_778659_94-154511-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 09:30:23,134 WARNING [train.py:1204] (2/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_212-10501-0 from training. Duration: 0.94 2023-10-02 09:30:24,760 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.feed_forward2.hidden_balancer.prob, batch_count=831586.6666666666, ans=0.125 2023-10-02 09:30:27,430 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.out_combiner.scale_min, batch_count=831586.6666666666, ans=0.2 2023-10-02 09:30:34,178 WARNING [train.py:1204] (2/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_383-285196-0_sp1.1 from training. Duration: 0.8818125 2023-10-02 09:30:39,029 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.bypass_mid.scale_min, batch_count=831653.3333333334, ans=0.2 2023-10-02 09:30:40,063 WARNING [train.py:1204] (2/4) Exclude cut with ID _45560_210_21982_1_1533812603951_3584999_238-47020-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 09:30:40,073 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_21500_210_2054_1_1532149310_2716758_278-18053-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 09:30:40,080 WARNING [train.py:1204] (2/4) Exclude cut with ID _56235_210_10640_1_1534557335235_4161099_453-9626-0_sp1.1 from training. Duration: 0.936375 2023-10-02 09:30:41,431 WARNING [train.py:1204] (2/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_153-97441-0_sp1.1 from training. Duration: 0.5090625 2023-10-02 09:30:46,369 WARNING [train.py:1204] (2/4) Exclude cut with ID _67725_210_27767_1_1535608756671_5105359_431-120895-0_sp1.1 from training. Duration: 0.963625 2023-10-02 09:30:49,616 WARNING [train.py:1204] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_574-45541-0_sp1.1 from training. Duration: 0.5909375 2023-10-02 09:30:49,633 WARNING [train.py:1204] (2/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_162-103620-0_sp1.1 from training. Duration: 0.8454375 2023-10-02 09:30:49,653 WARNING [train.py:1204] (2/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_734-129761-0_sp1.1 from training. Duration: 0.7454375 2023-10-02 09:30:51,562 WARNING [train.py:1204] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_620-276793-0_sp1.1 from training. Duration: 0.536375 2023-10-02 09:30:51,602 WARNING [train.py:1204] (2/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_153-97441-0_sp0.9 from training. Duration: 0.62225 2023-10-02 09:30:54,595 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.conv_module1.balancer1.prob, batch_count=831720.0, ans=0.125 2023-10-02 09:30:55,706 WARNING [train.py:1204] (2/4) Exclude cut with ID _12733_210_8341_1_1530167424556_3646380_523-85621-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 09:30:55,726 WARNING [train.py:1204] (2/4) Exclude cut with ID _64048_210_10626_1_1535198382802_3835179_352-179834-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 09:30:58,687 WARNING [train.py:1204] (2/4) Exclude cut with ID _55162_210_17071_1_1534408248583_3753480_43-242364-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 09:30:58,701 WARNING [train.py:1204] (2/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_661-264857-0 from training. Duration: 0.93 2023-10-02 09:30:58,707 WARNING [train.py:1204] (2/4) Exclude cut with ID _6274_210_2084_1_1527937583837_7494020_325-237800-0_sp1.1 from training. Duration: 0.97275 2023-10-02 09:30:58,750 WARNING [train.py:1204] (2/4) Exclude cut with ID _55328_210_27767_1_1534580973260_4088170_385-171459-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 09:31:00,103 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_3780_210_2087_1_1526467389_2145572_176-107140-0_sp1.1 from training. Duration: 0.62725 2023-10-02 09:31:02,658 WARNING [train.py:1204] (2/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_375-244976-0_sp1.1 from training. Duration: 0.6818125 2023-10-02 09:31:04,018 WARNING [train.py:1204] (2/4) Exclude cut with ID _35941_210_15379_1_1532995294515_4023089_309-33162-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 09:31:10,152 WARNING [train.py:1204] (2/4) Exclude cut with ID _14459_210_2228_1_1531443641946_7516700_1185-22038-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 09:31:11,556 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_1197-69460-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 09:31:13,112 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9012W0022-501157 from training. Duration: 0.896 2023-10-02 09:31:14,772 INFO [train.py:1046] (2/4) Epoch 24, batch 2600, loss[loss=0.184, simple_loss=0.2685, pruned_loss=0.04976, over 24371.00 frames. ], tot_loss[loss=0.1705, simple_loss=0.247, pruned_loss=0.04706, over 4710458.07 frames. ], batch size: 77, lr: 4.28e-03, grad_scale: 8.0 2023-10-02 09:31:16,200 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9023W0009-506302 from training. Duration: 0.896 2023-10-02 09:31:16,223 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_2821_210_2715_1_1526108385_3763560_73-296673-0_sp1.1 from training. Duration: 0.7545625 2023-10-02 09:31:16,265 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9025W0113-507442 from training. Duration: 0.896 2023-10-02 09:31:18,071 WARNING [train.py:1204] (2/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_739-2098-0 from training. Duration: 0.92 2023-10-02 09:31:18,267 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.balancer2.prob, batch_count=831853.3333333334, ans=0.125 2023-10-02 09:31:19,332 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9029W0332-509722 from training. Duration: 0.768 2023-10-02 09:31:20,851 WARNING [train.py:1204] (2/4) Exclude cut with ID _16908_210_5724_1_1531119157248_3772700_68-101953-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 09:31:20,874 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9042W0011-515617 from training. Duration: 0.768 2023-10-02 09:31:22,721 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_3019_210_2028_1_1526044245_3264865_191-72235-0 from training. Duration: 0.97 2023-10-02 09:31:24,064 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9052W0006-520792 from training. Duration: 0.896 2023-10-02 09:31:24,333 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.feed_forward2.hidden_balancer.prob, batch_count=831853.3333333334, ans=0.125 2023-10-02 09:31:26,713 WARNING [train.py:1204] (2/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_245-174553-0_sp1.1 from training. Duration: 0.8 2023-10-02 09:31:28,076 WARNING [train.py:1204] (2/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_202-67017-0 from training. Duration: 0.82 2023-10-02 09:31:29,572 WARNING [train.py:1204] (2/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_304-59227-0 from training. Duration: 0.77 2023-10-02 09:31:30,976 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_3133_210_2087_1_1526106446_2324690_246-125000-0_sp0.9 from training. Duration: 0.688875 2023-10-02 09:31:31,178 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.feed_forward2.hidden_balancer.prob, batch_count=831920.0, ans=0.125 2023-10-02 09:31:32,318 WARNING [train.py:1204] (2/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_243-42116-0 from training. Duration: 0.73 2023-10-02 09:31:33,787 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9087W0121-538492 from training. Duration: 0.896 2023-10-02 09:31:33,801 WARNING [train.py:1204] (2/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_120-76843-0 from training. Duration: 0.55 2023-10-02 09:31:42,559 WARNING [train.py:1204] (2/4) Exclude cut with ID _7852_210_5219_1_1528286471316_8139400_850-304621-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 09:31:42,571 WARNING [train.py:1204] (2/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_154-285415-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 09:31:42,582 WARNING [train.py:1204] (2/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_538-278194-0_sp1.1 from training. Duration: 0.92725 2023-10-02 09:31:42,583 WARNING [train.py:1204] (2/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_119-207162-0 from training. Duration: 0.71 2023-10-02 09:31:45,834 WARNING [train.py:1204] (2/4) Exclude cut with ID _2334_210_2667_1_1525608238880_4705129_459-104997-0_sp1.1 from training. Duration: 0.863625 2023-10-02 09:31:46,069 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.balancer1.prob, batch_count=831986.6666666666, ans=0.125 2023-10-02 09:31:51,625 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9147W0019-567847 from training. Duration: 0.768 2023-10-02 09:31:55,112 WARNING [train.py:1204] (2/4) Exclude cut with ID _69884_210_9771_1_1535846392818_4166169_228-189439-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 09:31:55,137 WARNING [train.py:1204] (2/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_196-77172-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 09:31:56,008 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.feed_forward2.hidden_balancer.prob, batch_count=831986.6666666666, ans=0.125 2023-10-02 09:31:56,938 WARNING [train.py:1204] (2/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_625-338546-0 from training. Duration: 0.88 2023-10-02 09:31:58,176 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.554e+02 1.850e+02 2.063e+02 2.376e+02 4.529e+02, threshold=4.127e+02, percent-clipped=3.0 2023-10-02 09:31:58,287 WARNING [train.py:1204] (2/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_193-327630-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 09:31:58,289 WARNING [train.py:1204] (2/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_391-24006-0_sp1.1 from training. Duration: 0.92725 2023-10-02 09:31:59,598 WARNING [train.py:1204] (2/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_197-28164-0 from training. Duration: 0.8 2023-10-02 09:32:01,028 WARNING [train.py:1204] (2/4) Exclude cut with ID _8395_210_1531_1_1528597871523_4411579_431-132470-0_sp1.1 from training. Duration: 0.67275 2023-10-02 09:32:02,343 WARNING [train.py:1204] (2/4) Exclude cut with ID _35350_210_16040_1_1533040191183_4075299_465-248326-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 09:32:03,820 WARNING [train.py:1204] (2/4) Exclude cut with ID _16756_210_4915_1_1531047803763_7104259_101-141922-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 09:32:06,617 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9212W0005-600172 from training. Duration: 0.896 2023-10-02 09:32:06,646 WARNING [train.py:1204] (2/4) Exclude cut with ID _63186_210_2084_1_1535414479705_3908649_378-166820-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 09:32:06,666 WARNING [train.py:1204] (2/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_52-211683-0_sp1.1 from training. Duration: 0.8181875 2023-10-02 09:32:12,610 WARNING [train.py:1204] (2/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_1147-237729-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 09:32:13,984 WARNING [train.py:1204] (2/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_234-14174-0_sp0.9 from training. Duration: 0.911125 2023-10-02 09:32:13,996 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_454-276429-0 from training. Duration: 0.87 2023-10-02 09:32:15,821 WARNING [train.py:1204] (2/4) Exclude cut with ID _53713_210_24116_1_1534314487762_7416751_755-165313-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 09:32:17,262 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8278_210_2550_1_1528637099_4220413_436-317072-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 09:32:18,510 WARNING [train.py:1204] (2/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_28-324729-0_sp1.1 from training. Duration: 0.8545625 2023-10-02 09:32:24,484 WARNING [train.py:1204] (2/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_326-165900-0 from training. Duration: 0.69 2023-10-02 09:32:24,541 WARNING [train.py:1204] (2/4) Exclude cut with ID _53489_210_6228_1_1534298357169_3835970_96-30246-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 09:32:27,728 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8275_210_4198_1_1528455320_3892571_491-17707-0_sp1.1 from training. Duration: 0.5818125 2023-10-02 09:32:29,178 INFO [train.py:1046] (2/4) Epoch 24, batch 2650, loss[loss=0.1778, simple_loss=0.2477, pruned_loss=0.05393, over 23638.00 frames. ], tot_loss[loss=0.1706, simple_loss=0.2472, pruned_loss=0.04698, over 4712084.17 frames. ], batch size: 256, lr: 4.27e-03, grad_scale: 8.0 2023-10-02 09:32:30,704 WARNING [train.py:1204] (2/4) Exclude cut with ID _14348_210_8341_1_1530599434996_3778640_240-232370-0 from training. Duration: 0.84 2023-10-02 09:32:30,713 WARNING [train.py:1204] (2/4) Exclude cut with ID _45012_210_12651_1_1533608435136_5371380_54-112623-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 09:32:33,360 WARNING [train.py:1204] (2/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_328-264776-0_sp1.1 from training. Duration: 0.6090625 2023-10-02 09:32:34,648 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9325W0001-648427 from training. Duration: 0.768 2023-10-02 09:32:34,668 WARNING [train.py:1204] (2/4) Exclude cut with ID _56480_210_4220_1_1534679942079_3075409_49-74717-0_sp1.1 from training. Duration: 0.936375 2023-10-02 09:32:38,652 WARNING [train.py:1204] (2/4) Exclude cut with ID _59500_210_8254_1_1534809610680_3736009_6-241869-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 09:32:40,037 WARNING [train.py:1204] (2/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_172-215932-0_sp0.9 from training. Duration: 0.7333125 2023-10-02 09:32:41,430 WARNING [train.py:1204] (2/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_17-90541-0_sp1.1 from training. Duration: 0.8545625 2023-10-02 09:32:42,957 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_43831_210_2158_1_1533725502_5872791_525-89447-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 09:32:44,429 WARNING [train.py:1204] (2/4) Exclude cut with ID _12302_210_5219_1_1529924446466_8107280_907-182949-0 from training. Duration: 0.92 2023-10-02 09:32:44,452 WARNING [train.py:1204] (2/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_402-102311-0_sp0.9 from training. Duration: 0.7444375 2023-10-02 09:32:44,487 WARNING [train.py:1204] (2/4) Exclude cut with ID _8021_210_7994_1_1528376451358_3653069_179-45206-0_sp0.9 from training. Duration: 0.9555625 2023-10-02 09:32:47,841 WARNING [train.py:1204] (2/4) Exclude cut with ID _40137_210_12210_1_1533290484046_3795180_249-187059-0 from training. Duration: 0.88 2023-10-02 09:32:48,045 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.ff3_skip_rate, batch_count=832253.3333333334, ans=0.0 2023-10-02 09:32:49,764 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9653W0194-675472 from training. Duration: 0.9658125 2023-10-02 09:32:51,159 WARNING [train.py:1204] (2/4) Exclude cut with ID _51332_210_9988_1_1534244371836_3801270_336-255604-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 09:32:54,505 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_20120_210_6753_1_1531643035_5482859_510-96996-0 from training. Duration: 0.87 2023-10-02 09:32:55,830 WARNING [train.py:1204] (2/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_1167-20437-0_sp1.1 from training. Duration: 0.92725 2023-10-02 09:32:55,888 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_3475_210_2028_1_1526193578_3622569_265-276625-0 from training. Duration: 0.76 2023-10-02 09:33:01,728 WARNING [train.py:1204] (2/4) Exclude cut with ID _14860_210_4915_1_1530702020340_7343240_470-172418-0_sp1.1 from training. Duration: 0.97275 2023-10-02 09:33:01,737 WARNING [train.py:1204] (2/4) Exclude cut with ID _5444_210_2151_1_1527418808464_5312210_581-315411-0_sp1.1 from training. Duration: 0.563625 2023-10-02 09:33:01,764 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_34450_210_9547_1_1533099443_4109050_519-342439-0_sp1.1 from training. Duration: 0.97275 2023-10-02 09:33:01,793 WARNING [train.py:1204] (2/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_50-175961-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 09:33:07,125 WARNING [train.py:1204] (2/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_372-6869-0 from training. Duration: 0.44 2023-10-02 09:33:07,131 WARNING [train.py:1204] (2/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_3-237434-0 from training. Duration: 0.76 2023-10-02 09:33:08,673 WARNING [train.py:1204] (2/4) Exclude cut with ID _8772_210_1850_1_1528635595972_3664011_247-336448-0_sp1.1 from training. Duration: 0.67275 2023-10-02 09:33:10,194 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.ff3_skip_rate, batch_count=832320.0, ans=0.0 2023-10-02 09:33:11,493 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_427-78097-0 from training. Duration: 0.8 2023-10-02 09:33:11,516 WARNING [train.py:1204] (2/4) Exclude cut with ID _69884_210_9771_1_1535846392818_4166169_466-252727-0_sp1.1 from training. Duration: 0.97275 2023-10-02 09:33:14,248 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_15972_210_6400_1_1530877934_3405692_316-35478-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 09:33:14,281 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_809-17156-0_sp0.9 from training. Duration: 0.811125 2023-10-02 09:33:14,324 WARNING [train.py:1204] (2/4) Exclude cut with ID _57572_210_5517_1_1534921219624_3809484_594-263474-0_sp1.1 from training. Duration: 0.936375 2023-10-02 09:33:15,674 WARNING [train.py:1204] (2/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_190-228204-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 09:33:17,020 WARNING [train.py:1204] (2/4) Exclude cut with ID _19514_210_9259_1_1531530071330_5084230_117-21193-0_sp1.1 from training. Duration: 0.936375 2023-10-02 09:33:18,995 WARNING [train.py:1204] (2/4) Exclude cut with ID _69584_210_5219_1_1535785133775_4044450_190-21315-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 09:33:20,415 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_53071_210_16554_1_1534493434_1276796_184-302050-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 09:33:20,466 WARNING [train.py:1204] (2/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_120-76843-0_sp1.1 from training. Duration: 0.5 2023-10-02 09:33:22,340 WARNING [train.py:1204] (2/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_318-162017-0_sp1.1 from training. Duration: 0.82725 2023-10-02 09:33:23,665 WARNING [train.py:1204] (2/4) Exclude cut with ID _46067_210_4220_1_1534125571066_3307450_79-46864-0_sp1.1 from training. Duration: 0.92725 2023-10-02 09:33:25,013 WARNING [train.py:1204] (2/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_358-215558-0_sp1.1 from training. Duration: 0.8454375 2023-10-02 09:33:25,084 WARNING [train.py:1204] (2/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_621-66764-0_sp1.1 from training. Duration: 0.92725 2023-10-02 09:33:26,897 WARNING [train.py:1204] (2/4) Exclude cut with ID _42863_210_9070_1_1533902398788_3696920_338-156851-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 09:33:28,192 WARNING [train.py:1204] (2/4) Exclude cut with ID _5289_210_2084_1_1527678202736_3779299_392-261988-0_sp0.9 from training. Duration: 0.72225 2023-10-02 09:33:31,417 WARNING [train.py:1204] (2/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_409-84682-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 09:33:31,499 WARNING [train.py:1204] (2/4) Exclude cut with ID _9235_210_5219_1_1528891154253_8046120_30-326681-0_sp0.9 from training. Duration: 0.92225 2023-10-02 09:33:31,504 WARNING [train.py:1204] (2/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_389-184695-0_sp1.1 from training. Duration: 0.92725 2023-10-02 09:33:31,727 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.conv_module2.balancer2.prob, batch_count=832453.3333333334, ans=0.125 2023-10-02 09:33:32,546 WARNING [train.py:1204] (2/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_442-320317-0 from training. Duration: 0.82 2023-10-02 09:33:35,258 WARNING [train.py:1204] (2/4) Exclude cut with ID _44778_210_2054_1_1533798127203_7443090_756-29911-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 09:33:36,649 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_401-112036-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 09:33:36,729 WARNING [train.py:1204] (2/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_181-308827-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 09:33:38,116 WARNING [train.py:1204] (2/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_332-258725-0_sp1.1 from training. Duration: 0.963625 2023-10-02 09:33:38,183 WARNING [train.py:1204] (2/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_188-260011-0_sp0.9 from training. Duration: 0.811125 2023-10-02 09:33:39,542 WARNING [train.py:1204] (2/4) Exclude cut with ID _49039_210_6315_1_1533967537541_3846118_99-302239-0_sp1.1 from training. Duration: 0.963625 2023-10-02 09:33:42,195 INFO [train.py:1046] (2/4) Epoch 24, batch 2700, loss[loss=0.1765, simple_loss=0.2467, pruned_loss=0.05316, over 23505.00 frames. ], tot_loss[loss=0.1722, simple_loss=0.2489, pruned_loss=0.0477, over 4711450.06 frames. ], batch size: 256, lr: 4.27e-03, grad_scale: 8.0 2023-10-02 09:33:42,298 WARNING [train.py:1204] (2/4) Exclude cut with ID _4122_210_2084_1_1526713164417_4353280_312-294828-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 09:33:42,303 WARNING [train.py:1204] (2/4) Exclude cut with ID _14922_210_6228_1_1530707435273_3693310_303-228738-0 from training. Duration: 0.84 2023-10-02 09:33:45,025 WARNING [train.py:1204] (2/4) Exclude cut with ID _14349_210_8341_1_1530615710818_3442470_115-285975-0_sp0.9 from training. Duration: 0.9555625 2023-10-02 09:33:48,181 WARNING [train.py:1204] (2/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_125-144174-0_sp1.1 from training. Duration: 0.3545625 2023-10-02 09:33:50,910 WARNING [train.py:1204] (2/4) Exclude cut with ID _53565_210_10635_1_1534337218335_3820259_351-54570-0_sp1.1 from training. Duration: 0.97275 2023-10-02 09:33:50,929 WARNING [train.py:1204] (2/4) Exclude cut with ID _28317_210_12742_1_1532480302381_8510325_410-40065-0_sp1.1 from training. Duration: 0.963625 2023-10-02 09:33:50,964 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_38965_210_4852_1_1533174906_3484735_167-339442-0_sp1.1 from training. Duration: 0.963625 2023-10-02 09:33:51,039 WARNING [train.py:1204] (2/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_265-164486-0_sp1.1 from training. Duration: 0.87275 2023-10-02 09:33:51,048 WARNING [train.py:1204] (2/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_725-182484-0_sp1.1 from training. Duration: 0.92725 2023-10-02 09:33:51,070 WARNING [train.py:1204] (2/4) Exclude cut with ID _28317_210_12742_1_1532480302381_8510325_607-213951-0_sp0.9 from training. Duration: 0.9333125 2023-10-02 09:33:52,717 WARNING [train.py:1204] (2/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_605-133395-0_sp1.1 from training. Duration: 0.6 2023-10-02 09:33:52,732 WARNING [train.py:1204] (2/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_351-294650-0 from training. Duration: 0.63 2023-10-02 09:33:52,802 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_14953_210_2151_1_1530701710_5616165_710-165400-0_sp1.1 from training. Duration: 0.7909375 2023-10-02 09:33:54,245 WARNING [train.py:1204] (2/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_237-58433-0_sp1.1 from training. Duration: 0.67275 2023-10-02 09:33:56,156 WARNING [train.py:1204] (2/4) Exclude cut with ID _5190_210_4557_1_1527244368799_373439_28-197030-0_sp1.1 from training. Duration: 0.6545625 2023-10-02 09:33:57,511 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_743-327651-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 09:34:00,246 WARNING [train.py:1204] (2/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_76-203867-0_sp0.9 from training. Duration: 0.8 2023-10-02 09:34:02,095 WARNING [train.py:1204] (2/4) Exclude cut with ID _14603_210_4220_1_1530615688441_3097319_274-274798-0 from training. Duration: 0.98 2023-10-02 09:34:02,126 WARNING [train.py:1204] (2/4) Exclude cut with ID _14087_210_2253_1_1530529224512_3014029_226-242140-0_sp1.1 from training. Duration: 0.736375 2023-10-02 09:34:06,284 WARNING [train.py:1204] (2/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_352-59958-0_sp1.1 from training. Duration: 0.7818125 2023-10-02 09:34:06,288 WARNING [train.py:1204] (2/4) Exclude cut with ID _39411_210_5986_1_1533538825030_3343649_191-251819-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 09:34:11,729 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7046_210_6753_1_1527895337_6920917_642-106799-0_sp1.1 from training. Duration: 0.836375 2023-10-02 09:34:11,736 WARNING [train.py:1204] (2/4) Exclude cut with ID _22826_210_9994_1_1531908201522_3279169_412-3442-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 09:34:11,753 WARNING [train.py:1204] (2/4) Exclude cut with ID _60057_210_10640_1_1534903065793_3333150_171-181133-0_sp0.9 from training. Duration: 0.97775 2023-10-02 09:34:13,069 WARNING [train.py:1204] (2/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_154-135696-0_sp1.1 from training. Duration: 0.72725 2023-10-02 09:34:17,184 WARNING [train.py:1204] (2/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_148-186805-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 09:34:18,721 WARNING [train.py:1204] (2/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_605-165641-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 09:34:18,733 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_174-277364-0_sp0.9 from training. Duration: 0.788875 2023-10-02 09:34:18,752 WARNING [train.py:1204] (2/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_89-321955-0_sp0.9 from training. Duration: 0.888875 2023-10-02 09:34:23,499 WARNING [train.py:1204] (2/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_438-105429-0_sp1.1 from training. Duration: 0.963625 2023-10-02 09:34:23,502 WARNING [train.py:1204] (2/4) Exclude cut with ID _14970_210_15709_1_1530773543493_1715730_187-20259-0_sp0.9 from training. Duration: 0.988875 2023-10-02 09:34:24,653 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.552e+02 1.849e+02 2.017e+02 2.176e+02 3.250e+02, threshold=4.034e+02, percent-clipped=0.0 2023-10-02 09:34:31,336 WARNING [train.py:1204] (2/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_385-193052-0_sp0.9 from training. Duration: 0.9555625 2023-10-02 09:34:31,385 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7113_210_1849_1_1527942685_3448768_194-311626-0_sp1.1 from training. Duration: 0.92725 2023-10-02 09:34:35,340 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_3780_210_2087_1_1526467389_2145572_176-107140-0_sp0.9 from training. Duration: 0.7666875 2023-10-02 09:34:35,342 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_36756_210_7234_1_1533026991_966276_123-213457-0_sp1.1 from training. Duration: 0.936375 2023-10-02 09:34:36,827 WARNING [train.py:1204] (2/4) Exclude cut with ID _23557_210_8750_1_1531890106370_6846255_456-244861-0_sp1.1 from training. Duration: 0.963625 2023-10-02 09:34:38,201 WARNING [train.py:1204] (2/4) Exclude cut with ID _37086_210_9038_1_1532999540891_7021250_242-349170-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 09:34:38,448 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.conv_module2.balancer1.prob, batch_count=832720.0, ans=0.125 2023-10-02 09:34:39,582 WARNING [train.py:1204] (2/4) Exclude cut with ID _41298_210_9019_1_1533430371450_4822143_471-281351-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 09:34:41,072 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_53378_210_17282_1_1534221071_5769509_572-126176-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 09:34:41,174 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_1507-263032-0_sp1.1 from training. Duration: 0.963625 2023-10-02 09:34:41,201 WARNING [train.py:1204] (2/4) Exclude cut with ID _13679_210_7497_1_1530534293662_3355098_411-186088-0_sp1.1 from training. Duration: 0.8545625 2023-10-02 09:34:43,899 WARNING [train.py:1204] (2/4) Exclude cut with ID _29345_210_4381_1_1532689168263_3860169_149-257268-0_sp1.1 from training. Duration: 0.736375 2023-10-02 09:34:43,998 WARNING [train.py:1204] (2/4) Exclude cut with ID _47790_210_16187_1_1533862814066_4207519_381-252014-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 09:34:45,750 WARNING [train.py:1204] (2/4) Exclude cut with ID _35799_210_5854_1_1533263487887_3529071_512-2717-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 09:34:49,650 WARNING [train.py:1204] (2/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_505-205270-0_sp0.9 from training. Duration: 0.42225 2023-10-02 09:34:49,725 WARNING [train.py:1204] (2/4) Exclude cut with ID _14998_210_5219_1_1530705582080_7474970_731-20117-0_sp1.1 from training. Duration: 0.936375 2023-10-02 09:34:51,093 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_37351_210_7925_1_1533012931_3812113_265-85780-0_sp1.1 from training. Duration: 0.8 2023-10-02 09:34:51,099 WARNING [train.py:1204] (2/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_313-134930-0 from training. Duration: 0.97 2023-10-02 09:34:54,476 WARNING [train.py:1204] (2/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_374-138971-0 from training. Duration: 0.94 2023-10-02 09:34:54,510 WARNING [train.py:1204] (2/4) Exclude cut with ID _30304_210_6319_1_1532476827261_7582060_1227-287886-0_sp1.1 from training. Duration: 0.936375 2023-10-02 09:34:56,392 INFO [train.py:1046] (2/4) Epoch 24, batch 2750, loss[loss=0.1621, simple_loss=0.2449, pruned_loss=0.03965, over 24610.00 frames. ], tot_loss[loss=0.1721, simple_loss=0.2484, pruned_loss=0.04786, over 4694387.78 frames. ], batch size: 65, lr: 4.27e-03, grad_scale: 8.0 2023-10-02 09:34:56,507 WARNING [train.py:1204] (2/4) Exclude cut with ID _59493_210_17282_1_1535078419077_3225879_332-217544-0_sp1.1 from training. Duration: 0.97275 2023-10-02 09:34:57,157 WARNING [train.py:1204] (2/4) Exclude cut with ID _23514_210_1389_1_1531832139785_3031439_361-119640-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 09:34:59,698 WARNING [train.py:1204] (2/4) Exclude cut with ID _69584_210_5219_1_1535785133775_4044450_142-118058-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 09:34:59,729 WARNING [train.py:1204] (2/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_178-177582-0_sp1.1 from training. Duration: 0.67275 2023-10-02 09:35:01,053 WARNING [train.py:1204] (2/4) Exclude cut with ID _25303_210_17519_1_1532050207576_3635970_41-195572-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 09:35:02,440 WARNING [train.py:1204] (2/4) Exclude cut with ID _55328_210_27767_1_1534580973260_4088170_329-341322-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 09:35:03,730 WARNING [train.py:1204] (2/4) Exclude cut with ID _5608_210_2106_1_1527415439089_6819768_909-82767-0_sp1.1 from training. Duration: 0.5545625 2023-10-02 09:35:03,769 WARNING [train.py:1204] (2/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_261-297503-0_sp1.1 from training. Duration: 0.9 2023-10-02 09:35:03,781 WARNING [train.py:1204] (2/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_459-291706-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 09:35:03,782 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7762_210_1794_1_1528199508_4057408_422-227034-0 from training. Duration: 0.73 2023-10-02 09:35:03,791 WARNING [train.py:1204] (2/4) Exclude cut with ID _3067_210_2667_1_1526212974640_4503168_250-285937-0_sp0.9 from training. Duration: 0.888875 2023-10-02 09:35:03,810 WARNING [train.py:1204] (2/4) Exclude cut with ID _66873_210_5517_1_1535454136506_4293370_332-253745-0_sp1.1 from training. Duration: 0.936375 2023-10-02 09:35:10,682 WARNING [train.py:1204] (2/4) Exclude cut with ID _13938_210_9988_1_1530518718978_3693960_304-112784-0 from training. Duration: 0.46 2023-10-02 09:35:12,045 WARNING [train.py:1204] (2/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_226-267957-0_sp1.1 from training. Duration: 0.92725 2023-10-02 09:35:13,396 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_41822_210_13270_1_1533955976_4379284_608-254567-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 09:35:13,450 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_124-113562-0_sp1.1 from training. Duration: 0.8545625 2023-10-02 09:35:13,484 WARNING [train.py:1204] (2/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_117-290916-0_sp1.1 from training. Duration: 0.463625 2023-10-02 09:35:15,366 WARNING [train.py:1204] (2/4) Exclude cut with ID _28427_210_4220_1_1532581240967_3061069_102-66533-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 09:35:15,443 WARNING [train.py:1204] (2/4) Exclude cut with ID _55383_210_10635_1_1534468426172_2231669_134-128246-0_sp1.1 from training. Duration: 0.8090625 2023-10-02 09:35:16,849 WARNING [train.py:1204] (2/4) Exclude cut with ID _45871_210_5665_1_1533864614245_3296060_49-308316-0_sp1.1 from training. Duration: 0.97275 2023-10-02 09:35:16,893 WARNING [train.py:1204] (2/4) Exclude cut with ID _57572_210_5517_1_1534921219624_3809484_176-199205-0_sp1.1 from training. Duration: 0.97275 2023-10-02 09:35:17,173 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.balancer2.prob, batch_count=832920.0, ans=0.125 2023-10-02 09:35:19,884 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.conv_skip_rate, batch_count=832920.0, ans=0.0 2023-10-02 09:35:20,880 WARNING [train.py:1204] (2/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_45-265684-0_sp1.1 from training. Duration: 0.6545625 2023-10-02 09:35:20,906 WARNING [train.py:1204] (2/4) Exclude cut with ID _15150_210_8341_1_1530752411803_7256384_469-239509-0_sp0.9 from training. Duration: 0.7555625 2023-10-02 09:35:20,953 WARNING [train.py:1204] (2/4) Exclude cut with ID _28461_210_2253_1_1532771775189_3041299_136-270644-0_sp1.1 from training. Duration: 0.8454375 2023-10-02 09:35:22,683 WARNING [train.py:1204] (2/4) Exclude cut with ID _69584_210_5219_1_1535785133775_4044450_302-47302-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 09:35:24,062 WARNING [train.py:1204] (2/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_166-128941-0_sp1.1 from training. Duration: 0.4909375 2023-10-02 09:35:28,817 WARNING [train.py:1204] (2/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_559-120504-0_sp1.1 from training. Duration: 0.97275 2023-10-02 09:35:31,467 WARNING [train.py:1204] (2/4) Exclude cut with ID _6456_210_1534_1_1527681634212_3181049_345-252877-0_sp1.1 from training. Duration: 0.5818125 2023-10-02 09:35:31,498 WARNING [train.py:1204] (2/4) Exclude cut with ID _28317_210_12742_1_1532480302381_8510325_902-233003-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 09:35:35,654 WARNING [train.py:1204] (2/4) Exclude cut with ID _36538_210_9994_1_1533204020363_3795620_204-261385-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 09:35:35,660 WARNING [train.py:1204] (2/4) Exclude cut with ID _14821_210_8750_1_1530680503991_6829359_189-297451-0_sp1.1 from training. Duration: 0.5 2023-10-02 09:35:35,705 WARNING [train.py:1204] (2/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_1211-226066-0_sp0.9 from training. Duration: 0.8444375 2023-10-02 09:35:42,642 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6786_210_1388_1_1527839699_4194495_273-149841-0_sp1.1 from training. Duration: 0.836375 2023-10-02 09:35:44,379 WARNING [train.py:1204] (2/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_380-162648-0_sp1.1 from training. Duration: 0.8545625 2023-10-02 09:35:44,379 WARNING [train.py:1204] (2/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_494-199538-0 from training. Duration: 0.75 2023-10-02 09:35:48,504 WARNING [train.py:1204] (2/4) Exclude cut with ID _49097_210_16049_1_1533952780207_4360979_276-87053-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 09:35:50,030 WARNING [train.py:1204] (2/4) Exclude cut with ID _5444_210_2151_1_1527418808464_5312210_726-27165-0 from training. Duration: 0.67 2023-10-02 09:35:54,832 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_128-202104-0_sp1.1 from training. Duration: 0.57275 2023-10-02 09:35:58,134 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_11349_210_6753_1_1529800861_1101207_26-65602-0_sp1.1 from training. Duration: 0.87275 2023-10-02 09:35:58,157 WARNING [train.py:1204] (2/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_159-328157-0 from training. Duration: 0.52 2023-10-02 09:35:59,515 WARNING [train.py:1204] (2/4) Exclude cut with ID _14069_210_5632_1_1530512692033_4803390_98-21934-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 09:36:02,235 WARNING [train.py:1204] (2/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_352-59958-0_sp0.9 from training. Duration: 0.9555625 2023-10-02 09:36:02,276 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6163_210_6400_1_1527506746_4263068_283-137811-0 from training. Duration: 0.77 2023-10-02 09:36:02,443 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.balancer2.prob, batch_count=833120.0, ans=0.125 2023-10-02 09:36:03,643 WARNING [train.py:1204] (2/4) Exclude cut with ID _21989_210_5854_1_1531899214931_3271360_133-44759-0_sp1.1 from training. Duration: 0.7 2023-10-02 09:36:05,217 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.conv_module2.balancer1.prob, batch_count=833120.0, ans=0.125 2023-10-02 09:36:07,695 WARNING [train.py:1204] (2/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_21-174514-0_sp0.9 from training. Duration: 0.47775 2023-10-02 09:36:07,732 WARNING [train.py:1204] (2/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_201-78811-0_sp1.1 from training. Duration: 0.963625 2023-10-02 09:36:09,148 INFO [train.py:1046] (2/4) Epoch 24, batch 2800, loss[loss=0.1478, simple_loss=0.232, pruned_loss=0.03185, over 24352.00 frames. ], tot_loss[loss=0.1707, simple_loss=0.2469, pruned_loss=0.0473, over 4704889.17 frames. ], batch size: 61, lr: 4.27e-03, grad_scale: 16.0 2023-10-02 09:36:09,191 WARNING [train.py:1204] (2/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_537-164630-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 09:36:09,270 WARNING [train.py:1204] (2/4) Exclude cut with ID _14087_210_2253_1_1530529224512_3014029_156-257607-0 from training. Duration: 0.83 2023-10-02 09:36:09,288 WARNING [train.py:1204] (2/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_308-154450-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 09:36:09,322 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_45399_210_4852_1_1533624725_7579737_214-240574-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 09:36:12,028 WARNING [train.py:1204] (2/4) Exclude cut with ID _29303_210_2575_1_1532392237303_7410649_615-305517-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 09:36:12,074 WARNING [train.py:1204] (2/4) Exclude cut with ID IC0125W0002-61808 from training. Duration: 0.708 2023-10-02 09:36:12,074 WARNING [train.py:1204] (2/4) Exclude cut with ID IC0125W0017-61823 from training. Duration: 0.759 2023-10-02 09:36:14,062 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.3.encoder.layers.1.nonlin_attention.whiten1, num_groups=1, num_channels=384, metric=5.01 vs. limit=10.0 2023-10-02 09:36:16,136 WARNING [train.py:1204] (2/4) Exclude cut with ID _53219_210_4525_1_1534834836008_4064825_4-145291-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 09:36:17,513 WARNING [train.py:1204] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_185-174805-0_sp1.1 from training. Duration: 0.6181875 2023-10-02 09:36:17,527 WARNING [train.py:1204] (2/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_635-193046-0_sp1.1 from training. Duration: 0.92725 2023-10-02 09:36:22,153 WARNING [train.py:1204] (2/4) Exclude cut with ID _46067_210_4220_1_1534125571066_3307450_321-258866-0_sp1.1 from training. Duration: 0.97275 2023-10-02 09:36:23,579 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8558_210_1410_1_1528505225_7613406_131-329524-0 from training. Duration: 0.79 2023-10-02 09:36:25,573 WARNING [train.py:1204] (2/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_847-41334-0_sp0.9 from training. Duration: 0.488875 2023-10-02 09:36:26,909 WARNING [train.py:1204] (2/4) Exclude cut with ID _14227_210_4381_1_1530701804286_3940259_125-275642-0 from training. Duration: 0.99 2023-10-02 09:36:28,779 WARNING [train.py:1204] (2/4) Exclude cut with ID _25248_210_8750_1_1532062825941_5554180_248-177255-0_sp1.1 from training. Duration: 0.963625 2023-10-02 09:36:28,835 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_40047_210_2290_1_1533290519_2374311_195-165693-0_sp1.1 from training. Duration: 0.8090625 2023-10-02 09:36:30,107 WARNING [train.py:1204] (2/4) Exclude cut with ID _23109_210_4381_1_1532150828777_901949_29-258672-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 09:36:34,766 WARNING [train.py:1204] (2/4) Exclude cut with ID _14821_210_8750_1_1530680503991_6829359_2-227151-0_sp1.1 from training. Duration: 0.7545625 2023-10-02 09:36:34,802 WARNING [train.py:1204] (2/4) Exclude cut with ID _49643_210_7989_1_1534640244224_3939031_565-53982-0_sp1.1 from training. Duration: 0.963625 2023-10-02 09:36:34,805 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7544_210_6753_1_1528591327_1471426_33-346031-0_sp0.9 from training. Duration: 0.67775 2023-10-02 09:36:34,881 WARNING [train.py:1204] (2/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_95-61782-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 09:36:43,364 WARNING [train.py:1204] (2/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_686-54383-0_sp0.9 from training. Duration: 0.9666875 2023-10-02 09:36:44,745 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_14056_210_4163_1_1530604483_7572827_505-276823-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 09:36:47,466 WARNING [train.py:1204] (2/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_745-128443-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 09:36:47,518 WARNING [train.py:1204] (2/4) Exclude cut with ID _3844_210_1390_1_1526727803582_2871209_20-251664-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 09:36:49,397 WARNING [train.py:1204] (2/4) Exclude cut with ID _36538_210_9994_1_1533204020363_3795620_487-164628-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 09:36:52,042 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.465e+02 1.978e+02 2.120e+02 2.446e+02 3.436e+02, threshold=4.241e+02, percent-clipped=0.0 2023-10-02 09:36:52,219 WARNING [train.py:1204] (2/4) Exclude cut with ID _5166_210_2084_1_1527073197160_4087993_309-222545-0_sp1.1 from training. Duration: 0.763625 2023-10-02 09:36:52,243 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_3531_210_3528_1_1526294203_5216456_309-227302-0 from training. Duration: 0.69 2023-10-02 09:36:53,555 WARNING [train.py:1204] (2/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_42-192460-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 09:36:54,940 WARNING [train.py:1204] (2/4) Exclude cut with ID _22083_210_8254_1_1531702862439_3678970_49-234546-0_sp1.1 from training. Duration: 0.7545625 2023-10-02 09:36:54,944 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_427-78097-0_sp0.9 from training. Duration: 0.888875 2023-10-02 09:36:59,455 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_43802_210_2290_1_1533516203_8829504_378-205254-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 09:36:59,504 WARNING [train.py:1204] (2/4) Exclude cut with ID _45329_210_7005_1_1534157951211_6774339_672-314830-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 09:37:04,258 WARNING [train.py:1204] (2/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_90-30673-0_sp1.1 from training. Duration: 0.763625 2023-10-02 09:37:06,806 WARNING [train.py:1204] (2/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_563-329943-0_sp1.1 from training. Duration: 0.9 2023-10-02 09:37:06,821 WARNING [train.py:1204] (2/4) Exclude cut with ID _21632_210_2253_1_1531994001765_3744140_154-84090-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 09:37:06,822 WARNING [train.py:1204] (2/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_103-336064-0_sp0.9 from training. Duration: 0.5666875 2023-10-02 09:37:06,886 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_43-71989-0_sp0.9 from training. Duration: 0.6333125 2023-10-02 09:37:08,244 WARNING [train.py:1204] (2/4) Exclude cut with ID _3867_210_1883_1_1526641200575_4475830_319-349850-0_sp0.9 from training. Duration: 0.7444375 2023-10-02 09:37:08,313 WARNING [train.py:1204] (2/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_629-179454-0_sp1.1 from training. Duration: 0.963625 2023-10-02 09:37:08,321 WARNING [train.py:1204] (2/4) Exclude cut with ID _65263_210_22356_1_1535763598691_3921037_49-222917-0 from training. Duration: 0.92 2023-10-02 09:37:08,352 WARNING [train.py:1204] (2/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_602-331772-0_sp1.1 from training. Duration: 0.936375 2023-10-02 09:37:09,806 WARNING [train.py:1204] (2/4) Exclude cut with ID _58414_210_8254_1_1535421609162_7151182_292-134978-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 09:37:09,847 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_38793_210_2544_1_1533176114_3849320_200-299292-0_sp1.1 from training. Duration: 0.936375 2023-10-02 09:37:11,407 WARNING [train.py:1204] (2/4) Exclude cut with ID _14970_210_15709_1_1530775547361_1966965_6-23328-0 from training. Duration: 0.86 2023-10-02 09:37:12,854 WARNING [train.py:1204] (2/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_386-225511-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 09:37:12,865 WARNING [train.py:1204] (2/4) Exclude cut with ID _38323_210_12108_1_1533121233369_3303049_416-11919-0_sp1.1 from training. Duration: 0.87275 2023-10-02 09:37:12,920 WARNING [train.py:1204] (2/4) Exclude cut with ID _4866_210_2577_1_1527404412284_4364659_256-306830-0_sp1.1 from training. Duration: 0.7909375 2023-10-02 09:37:16,860 WARNING [train.py:1204] (2/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_53-300544-0 from training. Duration: 0.67 2023-10-02 09:37:21,631 WARNING [train.py:1204] (2/4) Exclude cut with ID _41314_210_12651_1_1533459476543_3766460_80-74184-0_sp1.1 from training. Duration: 0.7545625 2023-10-02 09:37:21,646 WARNING [train.py:1204] (2/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_332-256168-0_sp1.1 from training. Duration: 0.6818125 2023-10-02 09:37:22,922 INFO [train.py:1046] (2/4) Epoch 24, batch 2850, loss[loss=0.1674, simple_loss=0.2481, pruned_loss=0.04332, over 24476.00 frames. ], tot_loss[loss=0.17, simple_loss=0.2462, pruned_loss=0.04692, over 4700577.21 frames. ], batch size: 66, lr: 4.27e-03, grad_scale: 8.0 2023-10-02 09:37:22,984 WARNING [train.py:1204] (2/4) Exclude cut with ID _48836_210_12210_1_1534075848293_3904150_429-35475-0_sp1.1 from training. Duration: 0.8090625 2023-10-02 09:37:23,134 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_144-135833-0_sp1.1 from training. Duration: 0.97275 2023-10-02 09:37:27,340 WARNING [train.py:1204] (2/4) Exclude cut with ID _14224_210_4381_1_1530529062037_4264010_380-343670-0_sp0.9 from training. Duration: 0.97775 2023-10-02 09:37:27,368 WARNING [train.py:1204] (2/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_213-265174-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 09:37:27,397 WARNING [train.py:1204] (2/4) Exclude cut with ID _20455_210_5219_1_1531565996775_8098100_86-314283-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 09:37:29,481 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_41750_210_1960_1_1533520678_3948042_68-223813-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 09:37:31,413 WARNING [train.py:1204] (2/4) Exclude cut with ID _55153_210_2253_1_1534384786677_3162096_24-198429-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 09:37:32,799 WARNING [train.py:1204] (2/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_119-207162-0_sp0.9 from training. Duration: 0.788875 2023-10-02 09:37:34,170 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_12868_210_6753_1_1530702479_3610564_148-172861-0 from training. Duration: 0.5 2023-10-02 09:37:39,628 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_5522_210_3273_1_1527249606_3909096_318-203153-0 from training. Duration: 0.43 2023-10-02 09:37:39,634 WARNING [train.py:1204] (2/4) Exclude cut with ID _14884_210_2252_1_1530707459689_5561250_288-189949-0_sp1.1 from training. Duration: 0.97275 2023-10-02 09:37:41,087 WARNING [train.py:1204] (2/4) Exclude cut with ID _5294_210_1747_1_1527249643928_3696179_529-242298-0 from training. Duration: 0.95 2023-10-02 09:37:41,160 WARNING [train.py:1204] (2/4) Exclude cut with ID _36538_210_9994_1_1533204020363_3795620_503-110289-0_sp1.1 from training. Duration: 0.936375 2023-10-02 09:37:41,388 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.bypass_mid.scale_min, batch_count=833586.6666666666, ans=0.2 2023-10-02 09:37:43,761 WARNING [train.py:1204] (2/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_385-193052-0 from training. Duration: 0.86 2023-10-02 09:37:45,115 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_456-22141-0 from training. Duration: 0.88 2023-10-02 09:37:46,473 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_38965_210_4852_1_1533174906_3484735_284-117840-0_sp1.1 from training. Duration: 0.936375 2023-10-02 09:37:49,797 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.conv_module1.balancer2.min_positive, batch_count=833586.6666666666, ans=0.05 2023-10-02 09:37:55,348 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.feed_forward2.hidden_balancer.prob, batch_count=833653.3333333334, ans=0.125 2023-10-02 09:37:57,213 WARNING [train.py:1204] (2/4) Exclude cut with ID _32704_210_8954_1_1533017488840_6747370_75-201625-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 09:37:58,584 WARNING [train.py:1204] (2/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_255-312654-0_sp1.1 from training. Duration: 0.82725 2023-10-02 09:37:58,593 WARNING [train.py:1204] (2/4) Exclude cut with ID _47251_210_18628_1_1533814063128_4137250_214-88076-0_sp0.9 from training. Duration: 0.97775 2023-10-02 09:37:58,765 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.attention_skip_rate, batch_count=833653.3333333334, ans=0.0 2023-10-02 09:38:01,446 WARNING [train.py:1204] (2/4) Exclude cut with ID _4092_210_2151_1_1526643044449_3135759_227-213972-0_sp1.1 from training. Duration: 0.4909375 2023-10-02 09:38:01,469 WARNING [train.py:1204] (2/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_912-47473-0_sp1.1 from training. Duration: 0.6181875 2023-10-02 09:38:01,488 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6767_210_3273_1_1527855783_1718932_173-256601-0_sp1.1 from training. Duration: 0.72725 2023-10-02 09:38:04,591 WARNING [train.py:1204] (2/4) Exclude cut with ID _6274_210_2084_1_1527937583837_7494020_32-205288-0_sp1.1 from training. Duration: 0.7090625 2023-10-02 09:38:04,614 WARNING [train.py:1204] (2/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_39-248870-0 from training. Duration: 0.69 2023-10-02 09:38:06,088 WARNING [train.py:1204] (2/4) Exclude cut with ID _3541_210_2477_1_1526475408709_3263973_377-296839-0_sp1.1 from training. Duration: 0.5 2023-10-02 09:38:07,396 WARNING [train.py:1204] (2/4) Exclude cut with ID _13679_210_7497_1_1530534293662_3355098_413-56154-0_sp1.1 from training. Duration: 0.963625 2023-10-02 09:38:08,810 WARNING [train.py:1204] (2/4) Exclude cut with ID _14970_210_15709_1_1530775547361_1966965_201-84544-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 09:38:08,865 WARNING [train.py:1204] (2/4) Exclude cut with ID _21783_210_11357_1_1531742458730_3762679_105-95775-0_sp1.1 from training. Duration: 0.936375 2023-10-02 09:38:09,055 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.balancer1.prob, batch_count=833720.0, ans=0.125 2023-10-02 09:38:11,051 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.2.encoder.layers.1.conv_module2.whiten, num_groups=1, num_channels=384, metric=4.12 vs. limit=15.0 2023-10-02 09:38:11,632 WARNING [train.py:1204] (2/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_131-191669-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 09:38:11,662 WARNING [train.py:1204] (2/4) Exclude cut with ID _14860_210_4915_1_1530702020340_7343240_695-4323-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 09:38:12,986 WARNING [train.py:1204] (2/4) Exclude cut with ID _39867_210_9826_1_1533553222030_9344259_991-130565-0_sp1.1 from training. Duration: 0.97275 2023-10-02 09:38:14,431 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_50-3289-0_sp1.1 from training. Duration: 0.82725 2023-10-02 09:38:15,804 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_54-347670-0_sp1.1 from training. Duration: 0.92725 2023-10-02 09:38:15,866 WARNING [train.py:1204] (2/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_200-153445-0_sp1.1 from training. Duration: 0.936375 2023-10-02 09:38:17,195 WARNING [train.py:1204] (2/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_279-302911-0_sp1.1 from training. Duration: 0.97275 2023-10-02 09:38:20,418 WARNING [train.py:1204] (2/4) Exclude cut with ID _14886_210_4220_1_1530702020355_3516190_159-73748-0_sp0.9 from training. Duration: 0.92225 2023-10-02 09:38:23,288 WARNING [train.py:1204] (2/4) Exclude cut with ID _53379_210_20034_1_1534222027363_3854654_23-222715-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 09:38:25,801 WARNING [train.py:1204] (2/4) Exclude cut with ID _12555_210_8750_1_1530075594402_3780239_449-267986-0 from training. Duration: 0.83 2023-10-02 09:38:25,818 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7544_210_6753_1_1528587722_3576686_295-258479-0 from training. Duration: 0.84 2023-10-02 09:38:25,970 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.feed_forward1.hidden_balancer.prob, batch_count=833786.6666666666, ans=0.125 2023-10-02 09:38:28,502 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7002_210_1794_1_1527935634_4034444_487-236501-0_sp0.9 from training. Duration: 0.7333125 2023-10-02 09:38:28,551 WARNING [train.py:1204] (2/4) Exclude cut with ID _21783_210_11357_1_1531742458730_3762679_244-19107-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 09:38:28,570 WARNING [train.py:1204] (2/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_390-339137-0 from training. Duration: 0.56 2023-10-02 09:38:30,461 WARNING [train.py:1204] (2/4) Exclude cut with ID _7562_210_2084_1_1528887767916_3946229_186-326422-0_sp1.1 from training. Duration: 0.763625 2023-10-02 09:38:30,519 WARNING [train.py:1204] (2/4) Exclude cut with ID _33640_210_2655_1_1533117605479_4855469_645-145235-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 09:38:32,449 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_25767_210_2161_1_1532307483_3867748_506-171777-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 09:38:32,477 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_5438_210_1794_1_1527336687_2600009_233-58072-0_sp1.1 from training. Duration: 0.7 2023-10-02 09:38:32,478 WARNING [train.py:1204] (2/4) Exclude cut with ID IC0669W0063-330908 from training. Duration: 0.896 2023-10-02 09:38:32,520 WARNING [train.py:1204] (2/4) Exclude cut with ID IC0671W0070-331913 from training. Duration: 0.896 2023-10-02 09:38:32,523 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_258-310570-0_sp1.1 from training. Duration: 0.7454375 2023-10-02 09:38:33,935 WARNING [train.py:1204] (2/4) Exclude cut with ID _9461_210_4557_1_1529060514726_3677451_162-177507-0_sp1.1 from training. Duration: 0.97275 2023-10-02 09:38:37,054 INFO [train.py:1046] (2/4) Epoch 24, batch 2900, loss[loss=0.1761, simple_loss=0.2619, pruned_loss=0.04519, over 24666.00 frames. ], tot_loss[loss=0.17, simple_loss=0.2461, pruned_loss=0.04701, over 4703933.78 frames. ], batch size: 73, lr: 4.27e-03, grad_scale: 8.0 2023-10-02 09:38:37,233 WARNING [train.py:1204] (2/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_26-175553-0_sp0.9 from training. Duration: 0.67775 2023-10-02 09:38:37,419 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.ff3_skip_rate, batch_count=833853.3333333334, ans=0.0 2023-10-02 09:38:38,533 WARNING [train.py:1204] (2/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_7-150481-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 09:38:38,601 WARNING [train.py:1204] (2/4) Exclude cut with ID _23557_210_8750_1_1531890106370_6846255_72-273262-0_sp1.1 from training. Duration: 0.963625 2023-10-02 09:38:38,714 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.1.self_attn_weights.pos_emb_skip_rate, batch_count=833853.3333333334, ans=0.0 2023-10-02 09:38:39,950 WARNING [train.py:1204] (2/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_367-61330-0 from training. Duration: 0.75 2023-10-02 09:38:44,158 WARNING [train.py:1204] (2/4) Exclude cut with ID _39867_210_9826_1_1533553222030_9344259_1337-11141-0_sp1.1 from training. Duration: 0.936375 2023-10-02 09:38:44,185 WARNING [train.py:1204] (2/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_101-293367-0 from training. Duration: 0.63 2023-10-02 09:38:45,510 WARNING [train.py:1204] (2/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_10-100367-0 from training. Duration: 0.98 2023-10-02 09:38:46,992 WARNING [train.py:1204] (2/4) Exclude cut with ID _60057_210_10640_1_1534903065793_3333150_171-181133-0_sp1.1 from training. Duration: 0.8 2023-10-02 09:38:46,994 WARNING [train.py:1204] (2/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_844-96989-0_sp0.9 from training. Duration: 0.788875 2023-10-02 09:38:49,791 WARNING [train.py:1204] (2/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_301-144595-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 09:38:49,881 WARNING [train.py:1204] (2/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_115-147343-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 09:38:54,590 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_3171_210_2087_1_1526109084_2330007_37-64334-0_sp1.1 from training. Duration: 0.7454375 2023-10-02 09:38:54,644 WARNING [train.py:1204] (2/4) Exclude cut with ID _19514_210_9259_1_1531530071330_5084230_769-272914-0_sp1.1 from training. Duration: 0.936375 2023-10-02 09:38:56,182 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.feed_forward3.hidden_balancer.prob, batch_count=833920.0, ans=0.125 2023-10-02 09:38:57,326 WARNING [train.py:1204] (2/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_76-311963-0_sp0.9 from training. Duration: 0.77775 2023-10-02 09:38:58,580 WARNING [train.py:1204] (2/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_526-62079-0 from training. Duration: 0.67 2023-10-02 09:38:58,645 WARNING [train.py:1204] (2/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_7-111954-0_sp0.9 from training. Duration: 0.62225 2023-10-02 09:39:00,094 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.1.conv_skip_rate, batch_count=833920.0, ans=0.0 2023-10-02 09:39:01,699 WARNING [train.py:1204] (2/4) Exclude cut with ID _57997_210_4163_1_1534766425889_3961049_360-279140-0_sp1.1 from training. Duration: 0.97275 2023-10-02 09:39:03,149 WARNING [train.py:1204] (2/4) Exclude cut with ID _4866_210_2577_1_1527404412284_4364659_133-1696-0 from training. Duration: 0.99 2023-10-02 09:39:05,077 WARNING [train.py:1204] (2/4) Exclude cut with ID _5190_210_4557_1_1527244368799_373439_28-197030-0 from training. Duration: 0.72 2023-10-02 09:39:07,182 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.1.encoder.layers.0.feed_forward2.out_whiten, num_groups=1, num_channels=256, metric=4.07 vs. limit=15.0 2023-10-02 09:39:07,662 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_53-39003-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 09:39:07,665 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6673_210_3273_1_1527767790_3173240_351-167954-0 from training. Duration: 0.48 2023-10-02 09:39:07,679 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_2822_210_2715_1_1526112586_1999587_165-36656-0_sp1.1 from training. Duration: 0.8090625 2023-10-02 09:39:09,666 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_328-264892-0_sp1.1 from training. Duration: 0.92725 2023-10-02 09:39:09,670 WARNING [train.py:1204] (2/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_29-293896-0_sp0.9 from training. Duration: 0.67775 2023-10-02 09:39:12,536 WARNING [train.py:1204] (2/4) Exclude cut with ID _65263_210_22356_1_1535763598691_3921037_465-55448-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 09:39:13,739 WARNING [train.py:1204] (2/4) Exclude cut with ID _21989_210_5854_1_1531899214931_3271360_311-53106-0_sp1.1 from training. Duration: 0.97275 2023-10-02 09:39:16,671 WARNING [train.py:1204] (2/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_389-277915-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 09:39:17,442 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.1.encoder.layers.1.feed_forward2.out_whiten, num_groups=1, num_channels=256, metric=3.96 vs. limit=15.0 2023-10-02 09:39:19,360 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_69630_210_7234_1_1535791094_677837_63-277139-0_sp1.1 from training. Duration: 0.963625 2023-10-02 09:39:20,634 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.491e+02 1.897e+02 2.159e+02 2.476e+02 3.741e+02, threshold=4.318e+02, percent-clipped=0.0 2023-10-02 09:39:22,082 WARNING [train.py:1204] (2/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_85-138257-0 from training. Duration: 0.71 2023-10-02 09:39:22,115 WARNING [train.py:1204] (2/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_686-247600-0 from training. Duration: 0.92 2023-10-02 09:39:22,120 WARNING [train.py:1204] (2/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_108-255430-0_sp1.1 from training. Duration: 0.8818125 2023-10-02 09:39:26,883 WARNING [train.py:1204] (2/4) Exclude cut with ID _14884_210_2252_1_1530707459689_5561250_114-201399-0_sp1.1 from training. Duration: 0.6909375 2023-10-02 09:39:28,325 WARNING [train.py:1204] (2/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_506-42614-0 from training. Duration: 0.79 2023-10-02 09:39:28,532 INFO [scaling.py:1118] (2/4) WithLoss: name=encoder.encoders.2.encoder.layers.0.self_attn_weights, loss-sum=0.000e+00 2023-10-02 09:39:29,729 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_43802_210_2290_1_1533516203_8829504_85-195240-0_sp1.1 from training. Duration: 0.7909375 2023-10-02 09:39:29,840 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.1.self_attn_weights.pos_emb_skip_rate, batch_count=834053.3333333334, ans=0.0 2023-10-02 09:39:34,428 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_141-36243-0_sp1.1 from training. Duration: 0.97275 2023-10-02 09:39:43,657 WARNING [train.py:1204] (2/4) Exclude cut with ID _7852_210_5219_1_1528286471316_8139400_358-287492-0_sp1.1 from training. Duration: 0.87275 2023-10-02 09:39:44,922 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_805-71497-0_sp0.9 from training. Duration: 0.788875 2023-10-02 09:39:46,322 WARNING [train.py:1204] (2/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_178-39722-0 from training. Duration: 0.49 2023-10-02 09:39:46,644 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=834120.0, ans=0.1 2023-10-02 09:39:47,848 WARNING [train.py:1204] (2/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_428-181925-0_sp1.1 from training. Duration: 0.963625 2023-10-02 09:39:47,852 WARNING [train.py:1204] (2/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_770-200430-0 from training. Duration: 0.89 2023-10-02 09:39:47,894 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_1104-271009-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 09:39:49,202 WARNING [train.py:1204] (2/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_128-296581-0_sp0.9 from training. Duration: 0.82225 2023-10-02 09:39:50,568 INFO [train.py:1046] (2/4) Epoch 24, batch 2950, loss[loss=0.1705, simple_loss=0.2495, pruned_loss=0.04578, over 24284.00 frames. ], tot_loss[loss=0.1698, simple_loss=0.2463, pruned_loss=0.04669, over 4705437.06 frames. ], batch size: 61, lr: 4.27e-03, grad_scale: 8.0 2023-10-02 09:39:53,838 WARNING [train.py:1204] (2/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_19-62511-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 09:39:54,688 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.0.layers.1.self_attn2.whiten, num_groups=1, num_channels=192, metric=10.07 vs. limit=22.5 2023-10-02 09:39:56,471 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_805-71497-0 from training. Duration: 0.71 2023-10-02 09:39:56,540 WARNING [train.py:1204] (2/4) Exclude cut with ID _44778_210_2054_1_1533798127203_7443090_573-152077-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 09:39:56,545 WARNING [train.py:1204] (2/4) Exclude cut with ID _42875_210_14856_1_1533704397487_4012433_393-177816-0_sp1.1 from training. Duration: 0.963625 2023-10-02 09:39:59,294 WARNING [train.py:1204] (2/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_278-35159-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 09:40:00,704 WARNING [train.py:1204] (2/4) Exclude cut with ID _15037_210_2253_1_1530752294104_3267650_13-130361-0_sp1.1 from training. Duration: 0.9 2023-10-02 09:40:00,796 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_3019_210_2028_1_1526044245_3264865_127-135615-0 from training. Duration: 0.94 2023-10-02 09:40:02,175 WARNING [train.py:1204] (2/4) Exclude cut with ID _28317_210_12742_1_1532480302381_8510325_607-213951-0 from training. Duration: 0.84 2023-10-02 09:40:02,243 WARNING [train.py:1204] (2/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_436-318113-0_sp0.9 from training. Duration: 0.8333125 2023-10-02 09:40:02,244 WARNING [train.py:1204] (2/4) Exclude cut with ID _13938_210_9988_1_1530518718978_3693960_148-152225-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 09:40:05,607 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.balancer2.prob, batch_count=834253.3333333334, ans=0.125 2023-10-02 09:40:05,909 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.4.encoder.layers.1.whiten, num_groups=1, num_channels=384, metric=5.79 vs. limit=12.0 2023-10-02 09:40:06,874 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=834253.3333333334, ans=0.1 2023-10-02 09:40:10,017 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_2541_210_1794_1_1525590269_3084765_156-173713-0_sp1.1 from training. Duration: 0.736375 2023-10-02 09:40:12,675 WARNING [train.py:1204] (2/4) Exclude cut with ID _28323_210_16187_1_1532480395667_4100300_531-24157-0_sp1.1 from training. Duration: 0.8909375 2023-10-02 09:40:12,798 WARNING [train.py:1204] (2/4) Exclude cut with ID _29303_210_2575_1_1532392237303_7410649_382-217492-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 09:40:14,174 WARNING [train.py:1204] (2/4) Exclude cut with ID _58895_210_8750_1_1535184081320_3649089_270-158013-0_sp1.1 from training. Duration: 0.92725 2023-10-02 09:40:16,080 WARNING [train.py:1204] (2/4) Exclude cut with ID _29277_210_3279_1_1532581109293_3173069_311-206227-0_sp1.1 from training. Duration: 0.936375 2023-10-02 09:40:16,091 WARNING [train.py:1204] (2/4) Exclude cut with ID _7160_210_1876_1_1527940923231_3703799_282-322611-0_sp0.9 from training. Duration: 0.9666875 2023-10-02 09:40:17,382 WARNING [train.py:1204] (2/4) Exclude cut with ID _53219_210_4525_1_1534834836008_4064825_562-32295-0_sp1.1 from training. Duration: 0.963625 2023-10-02 09:40:18,956 WARNING [train.py:1204] (2/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_408-87330-0_sp1.1 from training. Duration: 0.963625 2023-10-02 09:40:18,963 WARNING [train.py:1204] (2/4) Exclude cut with ID _59202_210_26984_1_1534816810612_3549174_137-318710-0_sp1.1 from training. Duration: 0.8090625 2023-10-02 09:40:23,533 WARNING [train.py:1204] (2/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_372-30302-0 from training. Duration: 0.86 2023-10-02 09:40:25,218 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.1.feed_forward1.out_proj.dropout_p, batch_count=834320.0, ans=0.1 2023-10-02 09:40:26,699 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.self_attn_weights.pos_emb_skip_rate, batch_count=834320.0, ans=0.0 2023-10-02 09:40:27,803 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7543_210_6753_1_1528541907_1399322_178-56269-0_sp1.1 from training. Duration: 0.95275 2023-10-02 09:40:27,828 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9109W0012-549758 from training. Duration: 0.512 2023-10-02 09:40:27,968 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.0.bypass.scale_min, batch_count=834320.0, ans=0.2 2023-10-02 09:40:29,131 WARNING [train.py:1204] (2/4) Exclude cut with ID _5410_210_1388_1_1527397055795_1719020_39-112221-0_sp0.9 from training. Duration: 0.7666875 2023-10-02 09:40:30,627 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9121W0012-554933 from training. Duration: 0.896 2023-10-02 09:40:30,719 WARNING [train.py:1204] (2/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_476-272757-0 from training. Duration: 0.93 2023-10-02 09:40:30,739 WARNING [train.py:1204] (2/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_1085-171718-0_sp1.1 from training. Duration: 0.92725 2023-10-02 09:40:32,542 WARNING [train.py:1204] (2/4) Exclude cut with ID _8480_210_1874_1_1528589391205_6728330_309-139896-0_sp1.1 from training. Duration: 0.736375 2023-10-02 09:40:32,542 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9130W0113-559703 from training. Duration: 0.896 2023-10-02 09:40:32,546 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_292-265928-0_sp1.1 from training. Duration: 0.463625 2023-10-02 09:40:37,254 WARNING [train.py:1204] (2/4) Exclude cut with ID _8772_210_1850_1_1528635595972_3664011_96-12756-0 from training. Duration: 0.98 2023-10-02 09:40:37,331 WARNING [train.py:1204] (2/4) Exclude cut with ID _12556_210_8341_1_1530081024100_7327690_725-193477-0_sp1.1 from training. Duration: 0.8909375 2023-10-02 09:40:38,425 WARNING [train.py:1204] (2/4) Exclude cut with ID _5289_210_2084_1_1527678202736_3779299_142-103317-0_sp1.1 from training. Duration: 0.763625 2023-10-02 09:40:39,905 WARNING [train.py:1204] (2/4) Exclude cut with ID _45012_210_12651_1_1533608435136_5371380_206-51075-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 09:40:41,256 WARNING [train.py:1204] (2/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_176-56120-0_sp1.1 from training. Duration: 0.7545625 2023-10-02 09:40:41,476 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.bypass_mid.scale_min, batch_count=834386.6666666666, ans=0.2 2023-10-02 09:40:42,562 WARNING [train.py:1204] (2/4) Exclude cut with ID _40284_210_2084_1_1533513679814_3704279_220-201864-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 09:40:42,591 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9165W0004-576638 from training. Duration: 0.896 2023-10-02 09:40:42,628 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_5438_210_1794_1_1527336687_2600009_218-343336-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 09:40:42,657 WARNING [train.py:1204] (2/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_318-162017-0 from training. Duration: 0.91 2023-10-02 09:40:49,793 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_49544_210_1794_1_1534379770_5262083_483-349298-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 09:40:51,147 WARNING [train.py:1204] (2/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_197-28164-0_sp0.9 from training. Duration: 0.888875 2023-10-02 09:40:52,485 WARNING [train.py:1204] (2/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_643-52007-0 from training. Duration: 0.57 2023-10-02 09:40:52,512 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_38965_210_4852_1_1533174906_3484735_403-332963-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 09:40:52,628 WARNING [train.py:1204] (2/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_340-174176-0 from training. Duration: 0.49 2023-10-02 09:40:54,245 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.conv_module1.balancer2.min_positive, batch_count=834453.3333333334, ans=0.05 2023-10-02 09:40:55,981 WARNING [train.py:1204] (2/4) Exclude cut with ID _56708_210_13177_1_1535252351282_3422899_363-917-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 09:40:57,352 WARNING [train.py:1204] (2/4) Exclude cut with ID _19896_210_1883_1_1531709022662_6649569_525-159063-0_sp1.1 from training. Duration: 0.936375 2023-10-02 09:40:58,669 WARNING [train.py:1204] (2/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_430-116139-0_sp0.9 from training. Duration: 0.9444375 2023-10-02 09:40:58,782 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_53753_210_7234_1_1534320003_3594148_86-329250-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 09:40:58,791 WARNING [train.py:1204] (2/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_406-275649-0_sp1.1 from training. Duration: 0.3818125 2023-10-02 09:40:59,556 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.1.encoder.layers.1.nonlin_attention.whiten1, num_groups=1, num_channels=192, metric=6.47 vs. limit=10.0 2023-10-02 09:41:00,147 WARNING [train.py:1204] (2/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_1083-147536-0_sp1.1 from training. Duration: 0.8818125 2023-10-02 09:41:01,512 WARNING [train.py:1204] (2/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_54-311741-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 09:41:01,519 WARNING [train.py:1204] (2/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_237-58433-0_sp0.9 from training. Duration: 0.82225 2023-10-02 09:41:01,561 WARNING [train.py:1204] (2/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_98-51676-0_sp1.1 from training. Duration: 0.663625 2023-10-02 09:41:02,878 WARNING [train.py:1204] (2/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_386-270472-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 09:41:04,745 INFO [train.py:1046] (2/4) Epoch 24, batch 3000, loss[loss=0.1882, simple_loss=0.2598, pruned_loss=0.05827, over 23705.00 frames. ], tot_loss[loss=0.171, simple_loss=0.2475, pruned_loss=0.04722, over 4696194.74 frames. ], batch size: 212, lr: 4.27e-03, grad_scale: 8.0 2023-10-02 09:41:04,745 INFO [train.py:1069] (2/4) Computing validation loss 2023-10-02 09:41:17,330 INFO [train.py:1078] (2/4) Epoch 24, validation: loss=0.349, simple_loss=0.2892, pruned_loss=0.2044, over 1125622.00 frames. 2023-10-02 09:41:17,331 INFO [train.py:1079] (2/4) Maximum memory allocated so far is 20967MB 2023-10-02 09:41:17,414 WARNING [train.py:1204] (2/4) Exclude cut with ID _19023_210_4353_1_1531378892552_5820780_83-254794-0_sp1.1 from training. Duration: 0.7818125 2023-10-02 09:41:19,398 WARNING [train.py:1204] (2/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_314-93656-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 09:41:19,411 WARNING [train.py:1204] (2/4) Exclude cut with ID _14170_210_5219_1_1530525949868_4469650_336-213530-0 from training. Duration: 0.9 2023-10-02 09:41:20,892 WARNING [train.py:1204] (2/4) Exclude cut with ID _36686_210_5665_1_1533778225913_3646410_65-221120-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 09:41:22,934 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_26433_210_12108_1_1532157158_4056480_271-68178-0_sp1.1 from training. Duration: 0.92725 2023-10-02 09:41:24,298 WARNING [train.py:1204] (2/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_46-207599-0_sp0.9 from training. Duration: 0.711125 2023-10-02 09:41:25,781 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9303W0114-637133 from training. Duration: 0.896 2023-10-02 09:41:25,814 WARNING [train.py:1204] (2/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_490-207861-0 from training. Duration: 0.98 2023-10-02 09:41:29,751 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6767_210_3273_1_1527855783_1718932_173-256601-0_sp0.9 from training. Duration: 0.888875 2023-10-02 09:41:29,787 WARNING [train.py:1204] (2/4) Exclude cut with ID _13921_210_9994_1_1530841782272_3503520_327-69677-0_sp1.1 from training. Duration: 0.8181875 2023-10-02 09:41:29,839 WARNING [train.py:1204] (2/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_508-335369-0 from training. Duration: 0.48 2023-10-02 09:41:31,091 WARNING [train.py:1204] (2/4) Exclude cut with ID _64631_210_27767_1_1535349580068_4165160_24-46567-0_sp1.1 from training. Duration: 0.963625 2023-10-02 09:41:33,458 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.4.encoder.layers.1.whiten, num_groups=1, num_channels=384, metric=4.30 vs. limit=12.0 2023-10-02 09:41:37,126 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_89-35748-0_sp1.1 from training. Duration: 0.7181875 2023-10-02 09:41:44,680 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_26433_210_12108_1_1532157158_4056480_578-262107-0_sp1.1 from training. Duration: 0.9 2023-10-02 09:41:53,088 WARNING [train.py:1204] (2/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_129-337802-0 from training. Duration: 0.72 2023-10-02 09:41:53,184 WARNING [train.py:1204] (2/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_356-167816-0_sp0.9 from training. Duration: 0.8 2023-10-02 09:41:55,904 WARNING [train.py:1204] (2/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_137-116442-0_sp1.1 from training. Duration: 0.7090625 2023-10-02 09:41:55,961 WARNING [train.py:1204] (2/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_358-172940-0_sp1.1 from training. Duration: 0.963625 2023-10-02 09:41:55,980 WARNING [train.py:1204] (2/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_668-344147-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 09:41:57,340 WARNING [train.py:1204] (2/4) Exclude cut with ID _62337_210_12392_1_1535009273105_3412229_255-157667-0_sp1.1 from training. Duration: 0.936375 2023-10-02 09:41:57,343 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_486-151131-0 from training. Duration: 0.85 2023-10-02 09:42:00,533 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_53638_210_21581_1_1534297319_5808449_885-80172-0 from training. Duration: 0.96 2023-10-02 09:42:01,884 WARNING [train.py:1204] (2/4) Exclude cut with ID _50093_210_4220_1_1534417141207_3432299_105-183067-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 09:42:03,110 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.482e+02 1.840e+02 2.041e+02 2.384e+02 3.232e+02, threshold=4.082e+02, percent-clipped=0.0 2023-10-02 09:42:03,199 WARNING [train.py:1204] (2/4) Exclude cut with ID _7562_210_2084_1_1528887767916_3946229_345-169156-0_sp0.9 from training. Duration: 0.6444375 2023-10-02 09:42:05,793 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_4528_210_3273_1_1527076654_3276777_209-219804-0_sp0.9 from training. Duration: 0.7666875 2023-10-02 09:42:05,865 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder_embed.conv.2.prob, batch_count=834720.0, ans=0.125 2023-10-02 09:42:07,121 WARNING [train.py:1204] (2/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_35-153886-0_sp0.9 from training. Duration: 0.9333125 2023-10-02 09:42:07,171 WARNING [train.py:1204] (2/4) Exclude cut with ID _30800_210_22382_1_1532499347904_4515959_332-121925-0_sp1.1 from training. Duration: 0.92725 2023-10-02 09:42:07,173 WARNING [train.py:1204] (2/4) Exclude cut with ID _59202_210_26984_1_1534816810612_3549174_495-65515-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 09:42:07,391 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.conv_skip_rate, batch_count=834720.0, ans=0.0 2023-10-02 09:42:10,536 WARNING [train.py:1204] (2/4) Exclude cut with ID _6274_210_2084_1_1527937583837_7494020_769-168302-0_sp1.1 from training. Duration: 0.6454375 2023-10-02 09:42:11,872 WARNING [train.py:1204] (2/4) Exclude cut with ID _24271_210_12658_1_1531987270022_7190519_708-71345-0_sp1.1 from training. Duration: 0.936375 2023-10-02 09:42:11,873 WARNING [train.py:1204] (2/4) Exclude cut with ID 3033-130750-0096-55598-0_sp0.9 from training. Duration: 0.92225 2023-10-02 09:42:12,602 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.3.encoder.layers.0.whiten, num_groups=1, num_channels=512, metric=5.88 vs. limit=12.0 2023-10-02 09:42:13,308 WARNING [train.py:1204] (2/4) Exclude cut with ID _14087_210_2253_1_1530529224512_3014029_219-224445-0_sp0.9 from training. Duration: 0.9333125 2023-10-02 09:42:16,010 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_174-277364-0 from training. Duration: 0.71 2023-10-02 09:42:16,085 WARNING [train.py:1204] (2/4) Exclude cut with ID _45021_210_9994_1_1533619751094_3330288_96-239010-0_sp1.1 from training. Duration: 0.82725 2023-10-02 09:42:16,108 WARNING [train.py:1204] (2/4) Exclude cut with ID _31602_210_5854_1_1532677021456_4458250_489-305116-0_sp1.1 from training. Duration: 0.97275 2023-10-02 09:42:17,395 WARNING [train.py:1204] (2/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_269-154726-0_sp0.9 from training. Duration: 0.9555625 2023-10-02 09:42:19,871 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.0.layers.0.whiten, num_groups=1, num_channels=192, metric=4.33 vs. limit=12.0 2023-10-02 09:42:21,621 WARNING [train.py:1204] (2/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_126-54418-0_sp1.1 from training. Duration: 0.92725 2023-10-02 09:42:23,348 WARNING [train.py:1204] (2/4) Exclude cut with ID _42326_210_7770_1_1533814282531_3707269_182-224159-0_sp1.1 from training. Duration: 0.92725 2023-10-02 09:42:23,428 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7002_210_1794_1_1527939683_5157286_1-281264-0_sp0.9 from training. Duration: 0.488875 2023-10-02 09:42:23,458 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_86-16881-0 from training. Duration: 0.58 2023-10-02 09:42:24,823 WARNING [train.py:1204] (2/4) Exclude cut with ID _19514_210_9259_1_1531530071330_5084230_637-279094-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 09:42:24,847 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_26433_210_12108_1_1532157158_4056480_578-262107-0 from training. Duration: 0.99 2023-10-02 09:42:24,890 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6291_210_3273_1_1527683649_1078498_82-221196-0_sp1.1 from training. Duration: 0.6909375 2023-10-02 09:42:27,029 WARNING [train.py:1204] (2/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_402-102311-0 from training. Duration: 0.67 2023-10-02 09:42:29,809 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1015-343884-0_sp1.1 from training. Duration: 0.563625 2023-10-02 09:42:31,132 WARNING [train.py:1204] (2/4) Exclude cut with ID _13684_210_7497_1_1530619354863_3598359_312-235606-0_sp1.1 from training. Duration: 0.5454375 2023-10-02 09:42:31,155 WARNING [train.py:1204] (2/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_55-31904-0 from training. Duration: 0.88 2023-10-02 09:42:31,251 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_3133_210_2087_1_1526106446_2324690_25-332138-0 from training. Duration: 0.91 2023-10-02 09:42:31,253 WARNING [train.py:1204] (2/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_912-282412-0_sp0.9 from training. Duration: 0.5444375 2023-10-02 09:42:32,492 INFO [train.py:1046] (2/4) Epoch 24, batch 3050, loss[loss=0.1583, simple_loss=0.2426, pruned_loss=0.03696, over 24476.00 frames. ], tot_loss[loss=0.1716, simple_loss=0.2484, pruned_loss=0.04735, over 4695495.83 frames. ], batch size: 63, lr: 4.27e-03, grad_scale: 8.0 2023-10-02 09:42:32,600 WARNING [train.py:1204] (2/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_458-299435-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 09:42:34,421 WARNING [train.py:1204] (2/4) Exclude cut with ID _69584_210_5219_1_1535785133775_4044450_275-88403-0_sp1.1 from training. Duration: 0.92725 2023-10-02 09:42:34,435 WARNING [train.py:1204] (2/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_841-341679-0_sp1.1 from training. Duration: 0.52725 2023-10-02 09:42:34,441 WARNING [train.py:1204] (2/4) Exclude cut with ID _41391_210_7994_1_1533603771872_3638201_1-54521-0_sp1.1 from training. Duration: 0.97275 2023-10-02 09:42:35,668 WARNING [train.py:1204] (2/4) Exclude cut with ID _56235_210_10640_1_1534557335235_4161099_495-186609-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 09:42:38,426 WARNING [train.py:1204] (2/4) Exclude cut with ID _4681_210_3525_1_1527250689464_120889_15-126760-0 from training. Duration: 0.81 2023-10-02 09:42:41,667 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_3019_210_2028_1_1526044245_3264865_127-135615-0_sp1.1 from training. Duration: 0.8545625 2023-10-02 09:42:43,086 WARNING [train.py:1204] (2/4) Exclude cut with ID _30191_210_1664_1_1533189915047_7196670_915-21561-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 09:42:43,117 WARNING [train.py:1204] (2/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_6-214815-0_sp0.9 from training. Duration: 0.9444375 2023-10-02 09:42:47,164 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_45399_210_4852_1_1533624725_7579737_190-137494-0_sp1.1 from training. Duration: 0.97275 2023-10-02 09:42:50,112 WARNING [train.py:1204] (2/4) Exclude cut with ID _41314_210_12651_1_1533459476543_3766460_207-132038-0 from training. Duration: 0.94 2023-10-02 09:42:56,603 WARNING [train.py:1204] (2/4) Exclude cut with ID _14603_210_4220_1_1530615688441_3097319_90-31383-0 from training. Duration: 0.87 2023-10-02 09:42:56,632 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_15517_210_6753_1_1530952293_1664795_119-31001-0 from training. Duration: 0.94 2023-10-02 09:42:56,661 WARNING [train.py:1204] (2/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_684-85342-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 09:42:59,398 WARNING [train.py:1204] (2/4) Exclude cut with ID _14603_210_4220_1_1530615688441_3097319_97-325053-0_sp1.1 from training. Duration: 0.836375 2023-10-02 09:43:03,922 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_68702_210_23689_1_1535799577_3420068_168-25150-0_sp1.1 from training. Duration: 0.97275 2023-10-02 09:43:03,936 WARNING [train.py:1204] (2/4) Exclude cut with ID _65134_210_14763_1_1535335393985_3505368_111-27984-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 09:43:03,984 WARNING [train.py:1204] (2/4) Exclude cut with ID _48678_210_5663_1_1533988830947_3769199_276-226230-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 09:43:06,841 WARNING [train.py:1204] (2/4) Exclude cut with ID _28427_210_4220_1_1532581240967_3061069_43-84928-0_sp1.1 from training. Duration: 0.9 2023-10-02 09:43:08,097 WARNING [train.py:1204] (2/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_158-250116-0_sp1.1 from training. Duration: 0.563625 2023-10-02 09:43:08,126 WARNING [train.py:1204] (2/4) Exclude cut with ID _66873_210_5517_1_1535454136506_4293370_279-332556-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 09:43:08,157 WARNING [train.py:1204] (2/4) Exclude cut with ID _42863_210_9070_1_1533902398788_3696920_488-230146-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 09:43:08,157 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_387-247159-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 09:43:09,602 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6729_210_2550_1_1528032481_1788725_261-42619-0_sp1.1 from training. Duration: 0.97275 2023-10-02 09:43:11,639 WARNING [train.py:1204] (2/4) Exclude cut with ID _19896_210_1883_1_1531709022662_6649569_831-44724-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 09:43:12,156 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.4.encoder.layers.2.feed_forward2.out_whiten, num_groups=1, num_channels=384, metric=10.22 vs. limit=15.0 2023-10-02 09:43:14,172 WARNING [train.py:1204] (2/4) Exclude cut with ID _55153_210_2253_1_1534384786677_3162096_280-285944-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 09:43:14,344 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.conv_skip_rate, batch_count=834986.6666666666, ans=0.0 2023-10-02 09:43:15,590 WARNING [train.py:1204] (2/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_448-4048-0 from training. Duration: 0.96 2023-10-02 09:43:15,642 WARNING [train.py:1204] (2/4) Exclude cut with ID _29678_210_7893_1_1532421042787_3906118_456-202548-0_sp1.1 from training. Duration: 0.97275 2023-10-02 09:43:15,654 WARNING [train.py:1204] (2/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_107-332398-0_sp1.1 from training. Duration: 0.7090625 2023-10-02 09:43:18,661 WARNING [train.py:1204] (2/4) Exclude cut with ID _13921_210_9994_1_1530838702800_3003429_46-149444-0_sp1.1 from training. Duration: 0.8545625 2023-10-02 09:43:19,964 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_257-20733-0_sp0.9 from training. Duration: 0.8444375 2023-10-02 09:43:20,021 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_53753_210_7234_1_1534320003_3594148_385-234809-0_sp1.1 from training. Duration: 0.963625 2023-10-02 09:43:21,304 WARNING [train.py:1204] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_292-215346-0_sp1.1 from training. Duration: 0.8818125 2023-10-02 09:43:21,433 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder_embed.conv.2.prob, batch_count=835053.3333333334, ans=0.125 2023-10-02 09:43:26,257 WARNING [train.py:1204] (2/4) Exclude cut with ID _68965_210_1883_1_1535698962272_3497490_196-124268-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 09:43:26,303 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_23801_210_16223_1_1532066392_3294657_570-5439-0_sp1.1 from training. Duration: 0.8818125 2023-10-02 09:43:29,657 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.0.conv_module1.balancer2.prob, batch_count=835053.3333333334, ans=0.125 2023-10-02 09:43:34,125 WARNING [train.py:1204] (2/4) Exclude cut with ID _5166_210_2084_1_1527073197160_4087993_349-6928-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 09:43:34,166 WARNING [train.py:1204] (2/4) Exclude cut with ID _60621_210_17071_1_1535013053530_3671739_226-255202-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 09:43:34,176 WARNING [train.py:1204] (2/4) Exclude cut with ID _41817_210_18835_1_1533434477160_7423049_448-130302-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 09:43:35,598 WARNING [train.py:1204] (2/4) Exclude cut with ID _6653_210_2629_1_1527919172441_6732679_758-143677-0_sp1.1 from training. Duration: 0.8909375 2023-10-02 09:43:37,024 WARNING [train.py:1204] (2/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_912-47473-0_sp0.9 from training. Duration: 0.7555625 2023-10-02 09:43:37,052 WARNING [train.py:1204] (2/4) Exclude cut with ID _6694_210_4599_1_1527852637490_3965529_356-169233-0_sp1.1 from training. Duration: 0.9 2023-10-02 09:43:38,383 WARNING [train.py:1204] (2/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_1-99680-0 from training. Duration: 0.67 2023-10-02 09:43:39,773 WARNING [train.py:1204] (2/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_199-86432-0_sp1.1 from training. Duration: 0.8909375 2023-10-02 09:43:39,800 WARNING [train.py:1204] (2/4) Exclude cut with ID _61148_210_6897_1_1535094022726_4217610_362-182430-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 09:43:41,125 WARNING [train.py:1204] (2/4) Exclude cut with ID _49376_210_5986_1_1533985278732_3370740_301-154763-0 from training. Duration: 0.98 2023-10-02 09:43:42,554 WARNING [train.py:1204] (2/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_572-185150-0_sp1.1 from training. Duration: 0.8818125 2023-10-02 09:43:42,802 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.feed_forward1.out_proj.dropout_p, batch_count=835120.0, ans=0.1 2023-10-02 09:43:47,134 INFO [train.py:1046] (2/4) Epoch 24, batch 3100, loss[loss=0.1487, simple_loss=0.225, pruned_loss=0.03622, over 24321.00 frames. ], tot_loss[loss=0.1714, simple_loss=0.2484, pruned_loss=0.04723, over 4697497.63 frames. ], batch size: 56, lr: 4.27e-03, grad_scale: 8.0 2023-10-02 09:43:47,277 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_47896_210_20508_1_1534492518_5958868_558-314572-0_sp1.1 from training. Duration: 0.8818125 2023-10-02 09:43:48,727 WARNING [train.py:1204] (2/4) Exclude cut with ID _45560_210_21982_1_1533812603951_3584999_111-6376-0_sp0.9 from training. Duration: 0.9333125 2023-10-02 09:43:51,524 WARNING [train.py:1204] (2/4) Exclude cut with ID _14170_210_5219_1_1530525949868_4469650_412-317077-0_sp1.1 from training. Duration: 0.6818125 2023-10-02 09:43:52,933 WARNING [train.py:1204] (2/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_718-68505-0 from training. Duration: 0.89 2023-10-02 09:43:56,095 WARNING [train.py:1204] (2/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_77-311404-0 from training. Duration: 0.94 2023-10-02 09:43:56,164 WARNING [train.py:1204] (2/4) Exclude cut with ID _60229_210_27767_1_1534838406158_3655350_264-279573-0 from training. Duration: 0.83 2023-10-02 09:43:58,206 WARNING [train.py:1204] (2/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_106-300985-0_sp1.1 from training. Duration: 0.8181875 2023-10-02 09:44:00,971 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_4183_210_2087_1_1526729100_1869838_79-277189-0_sp1.1 from training. Duration: 0.963625 2023-10-02 09:44:02,284 WARNING [train.py:1204] (2/4) Exclude cut with ID _28427_210_4220_1_1532581240967_3061069_195-295268-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 09:44:04,203 WARNING [train.py:1204] (2/4) Exclude cut with ID _4092_210_2151_1_1526643044449_3135759_356-278694-0_sp0.9 from training. Duration: 0.52225 2023-10-02 09:44:08,291 WARNING [train.py:1204] (2/4) Exclude cut with ID _14459_210_2228_1_1531443641946_7516700_777-71446-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 09:44:12,392 WARNING [train.py:1204] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_520-126729-0 from training. Duration: 0.7 2023-10-02 09:44:16,437 WARNING [train.py:1204] (2/4) Exclude cut with ID _13684_210_7497_1_1530619354863_3598359_312-235606-0_sp0.9 from training. Duration: 0.6666875 2023-10-02 09:44:17,091 WARNING [train.py:1204] (2/4) Exclude cut with ID _37537_210_2253_1_1533171758568_3421949_283-349402-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 09:44:18,232 WARNING [train.py:1204] (2/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_29-117932-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 09:44:18,259 WARNING [train.py:1204] (2/4) Exclude cut with ID _29303_210_2575_1_1532392237303_7410649_721-283351-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 09:44:19,617 WARNING [train.py:1204] (2/4) Exclude cut with ID _14886_210_4220_1_1530702020355_3516190_175-229073-0_sp0.9 from training. Duration: 0.5 2023-10-02 09:44:21,035 WARNING [train.py:1204] (2/4) Exclude cut with ID _15458_210_8341_1_1530858614263_3834410_70-268830-0_sp1.1 from training. Duration: 0.97275 2023-10-02 09:44:21,066 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_2295_210_2087_1_1525500993_2407863_234-83104-0 from training. Duration: 0.62 2023-10-02 09:44:21,067 WARNING [train.py:1204] (2/4) Exclude cut with ID _22961_210_8254_1_1531976366672_4278180_317-199546-0_sp1.1 from training. Duration: 0.7818125 2023-10-02 09:44:22,500 WARNING [train.py:1204] (2/4) Exclude cut with ID _27894_210_12392_1_1532415496078_6902439_547-196022-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 09:44:24,367 WARNING [train.py:1204] (2/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_46-207599-0 from training. Duration: 0.64 2023-10-02 09:44:27,030 WARNING [train.py:1204] (2/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_426-326246-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 09:44:29,100 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.conv_module1.balancer2.prob, batch_count=835320.0, ans=0.125 2023-10-02 09:44:30,209 WARNING [train.py:1204] (2/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_445-117398-0_sp0.9 from training. Duration: 0.988875 2023-10-02 09:44:30,274 WARNING [train.py:1204] (2/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_563-347832-0 from training. Duration: 0.89 2023-10-02 09:44:31,465 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.559e+02 1.869e+02 2.084e+02 2.364e+02 3.517e+02, threshold=4.169e+02, percent-clipped=0.0 2023-10-02 09:44:31,603 WARNING [train.py:1204] (2/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_252-216674-0 from training. Duration: 0.54 2023-10-02 09:44:32,967 WARNING [train.py:1204] (2/4) Exclude cut with ID _59611_210_8895_1_1534741198240_3650361_187-212779-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 09:44:33,030 WARNING [train.py:1204] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_282-159367-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 09:44:36,166 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_68702_210_23689_1_1535799577_3420068_33-136883-0_sp1.1 from training. Duration: 0.936375 2023-10-02 09:44:36,185 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_47228_210_4983_1_1533780021_3672027_184-293706-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 09:44:36,210 WARNING [train.py:1204] (2/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_656-188361-0_sp0.9 from training. Duration: 0.9666875 2023-10-02 09:44:37,619 WARNING [train.py:1204] (2/4) Exclude cut with ID _4866_210_2577_1_1527404412284_4364659_352-157930-0_sp0.9 from training. Duration: 0.9 2023-10-02 09:44:37,620 WARNING [train.py:1204] (2/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_210-106047-0_sp1.1 from training. Duration: 0.8818125 2023-10-02 09:44:39,069 WARNING [train.py:1204] (2/4) Exclude cut with ID _9389_210_1664_1_1529217337301_5289269_289-213417-0_sp1.1 from training. Duration: 0.8090625 2023-10-02 09:44:39,100 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_3910_210_1794_1_1526777667_3747260_178-41186-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 09:44:39,104 WARNING [train.py:1204] (2/4) Exclude cut with ID _27052_210_5986_1_1532329197187_3585009_24-172263-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 09:44:39,107 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_5522_210_3273_1_1527249606_3909096_318-203153-0_sp1.1 from training. Duration: 0.3909375 2023-10-02 09:44:44,399 WARNING [train.py:1204] (2/4) Exclude cut with ID _27894_210_12392_1_1532415496078_6902439_222-51982-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 09:44:45,692 WARNING [train.py:1204] (2/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_87-131427-0 from training. Duration: 0.78 2023-10-02 09:44:45,966 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.feed_forward1.out_proj.dropout_p, batch_count=835453.3333333334, ans=0.1 2023-10-02 09:44:48,352 WARNING [train.py:1204] (2/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_372-298093-0_sp0.9 from training. Duration: 0.888875 2023-10-02 09:44:49,007 WARNING [train.py:1204] (2/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_663-166973-0 from training. Duration: 0.8 2023-10-02 09:44:50,279 WARNING [train.py:1204] (2/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_429-177469-0_sp1.1 from training. Duration: 0.936375 2023-10-02 09:44:50,319 WARNING [train.py:1204] (2/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_724-172651-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 09:44:50,353 WARNING [train.py:1204] (2/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_103-336064-0 from training. Duration: 0.51 2023-10-02 09:45:00,578 INFO [train.py:1046] (2/4) Epoch 24, batch 3150, loss[loss=0.1687, simple_loss=0.2297, pruned_loss=0.05381, over 23741.00 frames. ], tot_loss[loss=0.1702, simple_loss=0.2472, pruned_loss=0.04662, over 4699207.51 frames. ], batch size: 232, lr: 4.27e-03, grad_scale: 8.0 2023-10-02 09:45:00,614 WARNING [train.py:1204] (2/4) Exclude cut with ID _48677_210_5663_1_1533902405919_3892989_111-11439-0 from training. Duration: 0.96 2023-10-02 09:45:04,012 WARNING [train.py:1204] (2/4) Exclude cut with ID _8085_210_4557_1_1528455633493_3716100_95-103398-0_sp1.1 from training. Duration: 0.963625 2023-10-02 09:45:05,324 WARNING [train.py:1204] (2/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_1041-110837-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 09:45:06,765 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_50050_210_16223_1_1535435796_7290138_1155-121757-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 09:45:06,767 WARNING [train.py:1204] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_602-275260-0_sp0.9 from training. Duration: 0.911125 2023-10-02 09:45:06,831 WARNING [train.py:1204] (2/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_265-164486-0 from training. Duration: 0.96 2023-10-02 09:45:07,044 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.feed_forward1.out_proj.dropout_p, batch_count=835520.0, ans=0.1 2023-10-02 09:45:08,184 WARNING [train.py:1204] (2/4) Exclude cut with ID _46067_210_4220_1_1534125571066_3307450_142-229375-0_sp1.1 from training. Duration: 0.963625 2023-10-02 09:45:08,210 WARNING [train.py:1204] (2/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_310-26186-0_sp1.1 from training. Duration: 0.636375 2023-10-02 09:45:08,301 WARNING [train.py:1204] (2/4) Exclude cut with ID _28461_210_2253_1_1532771775189_3041299_136-270644-0 from training. Duration: 0.93 2023-10-02 09:45:10,963 WARNING [train.py:1204] (2/4) Exclude cut with ID _14860_210_4915_1_1530702020340_7343240_438-286372-0_sp1.1 from training. Duration: 0.936375 2023-10-02 09:45:13,662 WARNING [train.py:1204] (2/4) Exclude cut with ID IC0128W0004-63309 from training. Duration: 0.831 2023-10-02 09:45:15,078 WARNING [train.py:1204] (2/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_135-174080-0 from training. Duration: 0.53 2023-10-02 09:45:15,110 WARNING [train.py:1204] (2/4) Exclude cut with ID _41326_210_5854_1_1533778076509_3492080_277-59511-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 09:45:16,474 WARNING [train.py:1204] (2/4) Exclude cut with ID IC0144W0112-71379 from training. Duration: 0.992 2023-10-02 09:45:17,860 WARNING [train.py:1204] (2/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_711-15688-0_sp0.9 from training. Duration: 0.47775 2023-10-02 09:45:18,117 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.nonlin_attention.balancer.prob, batch_count=835586.6666666666, ans=0.125 2023-10-02 09:45:19,315 WARNING [train.py:1204] (2/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_237-332027-0 from training. Duration: 0.95 2023-10-02 09:45:20,007 WARNING [train.py:1204] (2/4) Exclude cut with ID _7970_210_1883_1_1528455616296_3472230_25-178407-0 from training. Duration: 0.71 2023-10-02 09:45:20,009 WARNING [train.py:1204] (2/4) Exclude cut with ID _8480_210_1874_1_1528589391205_6728330_309-139896-0 from training. Duration: 0.81 2023-10-02 09:45:20,034 WARNING [train.py:1204] (2/4) Exclude cut with ID _39571_210_13449_1_1533801584143_3651077_319-268877-0_sp1.1 from training. Duration: 0.936375 2023-10-02 09:45:20,038 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_43802_210_2290_1_1533516203_8829504_155-246880-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 09:45:21,270 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_566-122599-0_sp1.1 from training. Duration: 0.936375 2023-10-02 09:45:23,986 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_12868_210_6753_1_1530701932_530275_18-76322-0 from training. Duration: 0.87 2023-10-02 09:45:25,412 WARNING [train.py:1204] (2/4) Exclude cut with ID _56494_210_4220_1_1535284837625_3033169_159-106035-0_sp1.1 from training. Duration: 0.963625 2023-10-02 09:45:25,454 WARNING [train.py:1204] (2/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_98-276013-0_sp1.1 from training. Duration: 0.963625 2023-10-02 09:45:26,307 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.0.layers.0.nonlin_attention.whiten1, num_groups=1, num_channels=144, metric=9.07 vs. limit=10.0 2023-10-02 09:45:27,293 WARNING [train.py:1204] (2/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_330-216124-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 09:45:28,709 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_177-58005-0_sp0.9 from training. Duration: 0.688875 2023-10-02 09:45:31,518 WARNING [train.py:1204] (2/4) Exclude cut with ID _56235_210_10640_1_1534557335235_4161099_343-139898-0 from training. Duration: 0.96 2023-10-02 09:45:33,422 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6961_210_2087_1_1527937168_6815011_395-185060-0_sp1.1 from training. Duration: 0.863625 2023-10-02 09:45:36,402 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.ff3_skip_rate, batch_count=835653.3333333334, ans=0.0 2023-10-02 09:45:37,382 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7002_210_1794_1_1527939683_5157286_184-229630-0_sp0.9 from training. Duration: 0.82225 2023-10-02 09:45:37,429 WARNING [train.py:1204] (2/4) Exclude cut with ID _59170_210_6721_1_1534983633960_4203335_153-18895-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 09:45:37,471 WARNING [train.py:1204] (2/4) Exclude cut with ID _49643_210_7989_1_1534640244224_3939031_598-161926-0 from training. Duration: 0.9 2023-10-02 09:45:40,200 WARNING [train.py:1204] (2/4) Exclude cut with ID _4068_210_2629_1_1526697350395_1546259_224-288693-0 from training. Duration: 0.59 2023-10-02 09:45:41,508 WARNING [train.py:1204] (2/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_718-68505-0_sp1.1 from training. Duration: 0.8090625 2023-10-02 09:45:41,567 WARNING [train.py:1204] (2/4) Exclude cut with ID _4068_210_2629_1_1526697350395_1546259_224-288693-0_sp0.9 from training. Duration: 0.6555625 2023-10-02 09:45:41,788 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.feed_forward1.hidden_balancer.prob, batch_count=835653.3333333334, ans=0.125 2023-10-02 09:45:42,422 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.1.encoder.layers.0.nonlin_attention.whiten2, num_groups=1, num_channels=256, metric=5.71 vs. limit=15.0 2023-10-02 09:45:42,818 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_2296_210_2087_1_1525503448_3397335_57-185430-0_sp0.9 from training. Duration: 0.6666875 2023-10-02 09:45:42,889 WARNING [train.py:1204] (2/4) Exclude cut with ID _15458_210_8341_1_1530858614263_3834410_49-235472-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 09:45:44,131 WARNING [train.py:1204] (2/4) Exclude cut with ID _64135_210_2253_1_1535677267876_7608972_498-5039-0_sp0.9 from training. Duration: 0.8666875 2023-10-02 09:45:45,526 WARNING [train.py:1204] (2/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_283-220372-0_sp0.9 from training. Duration: 0.62225 2023-10-02 09:45:45,542 WARNING [train.py:1204] (2/4) Exclude cut with ID _6473_210_1741_1_1527901307396_3815380_108-17335-0_sp0.9 from training. Duration: 0.72225 2023-10-02 09:45:45,615 WARNING [train.py:1204] (2/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_113-92608-0 from training. Duration: 0.54 2023-10-02 09:45:45,661 WARNING [train.py:1204] (2/4) Exclude cut with ID _14170_210_5219_1_1530525949868_4469650_272-249985-0_sp1.1 from training. Duration: 0.6090625 2023-10-02 09:45:46,957 WARNING [train.py:1204] (2/4) Exclude cut with ID _50376_210_13401_1_1534031502101_4217740_518-187390-0_sp1.1 from training. Duration: 0.97275 2023-10-02 09:45:48,305 WARNING [train.py:1204] (2/4) Exclude cut with ID _59738_210_15379_1_1534849512562_3941470_165-248260-0_sp1.1 from training. Duration: 0.92725 2023-10-02 09:45:48,315 WARNING [train.py:1204] (2/4) Exclude cut with ID _23818_210_6228_1_1531917063884_3676920_400-343925-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 09:45:48,389 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6961_210_2087_1_1527937168_6815011_562-165518-0 from training. Duration: 0.77 2023-10-02 09:45:48,436 WARNING [train.py:1204] (2/4) Exclude cut with ID _24271_210_12658_1_1531987270022_7190519_289-207931-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 09:45:48,725 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.balancer1.prob, batch_count=835720.0, ans=0.125 2023-10-02 09:45:49,894 WARNING [train.py:1204] (2/4) Exclude cut with ID _14632_210_9988_1_1530691279392_3923200_332-105904-0 from training. Duration: 0.9 2023-10-02 09:45:49,911 WARNING [train.py:1204] (2/4) Exclude cut with ID _40402_210_18219_1_1533522324115_2953150_203-232438-0_sp1.1 from training. Duration: 0.97275 2023-10-02 09:45:50,166 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.bypass_mid.scale_min, batch_count=835720.0, ans=0.2 2023-10-02 09:45:51,922 WARNING [train.py:1204] (2/4) Exclude cut with ID _13252_210_1925_1_1530671372047_3336909_263-168388-0 from training. Duration: 0.86 2023-10-02 09:45:53,250 WARNING [train.py:1204] (2/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_174-205058-0 from training. Duration: 0.95 2023-10-02 09:45:54,610 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_94-195801-0_sp1.1 from training. Duration: 0.8545625 2023-10-02 09:45:54,632 WARNING [train.py:1204] (2/4) Exclude cut with ID _55153_210_2253_1_1534384786677_3162096_269-103204-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 09:45:56,517 WARNING [train.py:1204] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_748-36150-0 from training. Duration: 0.91 2023-10-02 09:45:57,963 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_12868_210_6753_1_1530702479_3610564_148-172861-0_sp1.1 from training. Duration: 0.4545625 2023-10-02 09:45:58,009 WARNING [train.py:1204] (2/4) Exclude cut with ID _67499_210_17282_1_1535509805220_3692430_460-77730-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 09:46:01,714 WARNING [train.py:1204] (2/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_374-20753-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 09:46:01,980 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.1.nonlin_attention.balancer.prob, batch_count=835786.6666666666, ans=0.125 2023-10-02 09:46:03,284 WARNING [train.py:1204] (2/4) Exclude cut with ID _45329_210_7005_1_1534157951211_6774339_374-289983-0_sp1.1 from training. Duration: 0.97275 2023-10-02 09:46:03,330 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_34886_210_10633_1_1533203882_1755661_76-156096-0_sp0.9 from training. Duration: 0.9666875 2023-10-02 09:46:07,383 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_238-256668-0_sp0.9 from training. Duration: 0.7666875 2023-10-02 09:46:07,442 WARNING [train.py:1204] (2/4) Exclude cut with ID _46683_210_9642_1_1533730913492_4591116_489-146727-0_sp1.1 from training. Duration: 0.97275 2023-10-02 09:46:10,112 WARNING [train.py:1204] (2/4) Exclude cut with ID _13938_210_9988_1_1530518718978_3693960_304-112784-0_sp0.9 from training. Duration: 0.511125 2023-10-02 09:46:12,532 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.1.encoder.layers.0.feed_forward1.out_whiten, num_groups=1, num_channels=256, metric=5.66 vs. limit=15.0 2023-10-02 09:46:14,237 INFO [train.py:1046] (2/4) Epoch 24, batch 3200, loss[loss=0.1753, simple_loss=0.249, pruned_loss=0.05079, over 23583.00 frames. ], tot_loss[loss=0.1696, simple_loss=0.2465, pruned_loss=0.04634, over 4700300.13 frames. ], batch size: 256, lr: 4.27e-03, grad_scale: 16.0 2023-10-02 09:46:14,345 WARNING [train.py:1204] (2/4) Exclude cut with ID _24271_210_12658_1_1531987270022_7190519_243-46549-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 09:46:14,349 WARNING [train.py:1204] (2/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_78-182912-0_sp1.1 from training. Duration: 0.536375 2023-10-02 09:46:18,322 WARNING [train.py:1204] (2/4) Exclude cut with ID _57572_210_5517_1_1534921219624_3809484_121-321334-0_sp1.1 from training. Duration: 0.97275 2023-10-02 09:46:18,426 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_40047_210_2290_1_1533290519_2374311_254-102149-0_sp1.1 from training. Duration: 0.963625 2023-10-02 09:46:18,426 WARNING [train.py:1204] (2/4) Exclude cut with ID _28461_210_2253_1_1532771775189_3041299_96-152158-0 from training. Duration: 0.85 2023-10-02 09:46:23,314 WARNING [train.py:1204] (2/4) Exclude cut with ID _29877_210_6228_1_1532397791106_5276149_161-102979-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 09:46:26,200 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_3787_210_2040_1_1526477544_5478586_603-120324-0_sp0.9 from training. Duration: 0.9 2023-10-02 09:46:27,976 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.conv_module1.balancer2.prob, batch_count=835920.0, ans=0.125 2023-10-02 09:46:29,670 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_41750_210_1960_1_1533520678_3948042_247-213456-0_sp1.1 from training. Duration: 0.97275 2023-10-02 09:46:38,358 WARNING [train.py:1204] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_127-119109-0_sp0.9 from training. Duration: 0.8 2023-10-02 09:46:46,901 WARNING [train.py:1204] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_215-202973-0 from training. Duration: 0.88 2023-10-02 09:46:49,540 WARNING [train.py:1204] (2/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_17-320491-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 09:46:52,962 WARNING [train.py:1204] (2/4) Exclude cut with ID _5166_210_2084_1_1527073197160_4087993_310-114452-0 from training. Duration: 0.59 2023-10-02 09:46:54,318 WARNING [train.py:1204] (2/4) Exclude cut with ID _15133_210_6228_1_1530794512074_3080379_29-264458-0_sp0.9 from training. Duration: 0.6444375 2023-10-02 09:46:54,616 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.attention_skip_rate, batch_count=835986.6666666666, ans=0.0 2023-10-02 09:46:57,673 WARNING [train.py:1204] (2/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_159-281310-0_sp1.1 from training. Duration: 0.863625 2023-10-02 09:46:57,689 WARNING [train.py:1204] (2/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_195-167011-0_sp0.9 from training. Duration: 0.8333125 2023-10-02 09:46:57,746 WARNING [train.py:1204] (2/4) Exclude cut with ID _14087_210_2253_1_1530529224512_3014029_156-257607-0_sp1.1 from training. Duration: 0.7545625 2023-10-02 09:46:58,937 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.633e+02 1.959e+02 2.375e+02 3.040e+02 4.854e+02, threshold=4.749e+02, percent-clipped=3.0 2023-10-02 09:47:01,038 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_2295_210_2087_1_1525500993_2407863_116-238894-0 from training. Duration: 0.7 2023-10-02 09:47:03,584 WARNING [train.py:1204] (2/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_321-335475-0_sp0.9 from training. Duration: 0.5 2023-10-02 09:47:05,444 WARNING [train.py:1204] (2/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_188-114464-0 from training. Duration: 0.82 2023-10-02 09:47:08,231 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_2592_210_2290_1_1525697025_4882925_157-70512-0 from training. Duration: 0.91 2023-10-02 09:47:09,677 WARNING [train.py:1204] (2/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_138-64210-0_sp1.1 from training. Duration: 0.663625 2023-10-02 09:47:17,640 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_11305_210_1794_1_1529750428_7178148_651-95702-0_sp1.1 from training. Duration: 0.963625 2023-10-02 09:47:17,653 WARNING [train.py:1204] (2/4) Exclude cut with ID _14224_210_4381_1_1530529062037_4264010_379-184223-0_sp0.9 from training. Duration: 0.9444375 2023-10-02 09:47:17,677 WARNING [train.py:1204] (2/4) Exclude cut with ID _8394_210_6702_1_1528590504316_3540189_381-125325-0_sp1.1 from training. Duration: 0.963625 2023-10-02 09:47:17,724 WARNING [train.py:1204] (2/4) Exclude cut with ID IC0622W0021-307659 from training. Duration: 0.896 2023-10-02 09:47:17,726 WARNING [train.py:1204] (2/4) Exclude cut with ID _7624_210_1531_1_1528109802198_4933101_418-349834-0_sp1.1 from training. Duration: 0.5454375 2023-10-02 09:47:21,855 WARNING [train.py:1204] (2/4) Exclude cut with ID _49728_210_12651_1_1534041034294_6788150_607-3416-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 09:47:23,884 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8275_210_4198_1_1528455320_3892571_491-17707-0 from training. Duration: 0.64 2023-10-02 09:47:23,940 WARNING [train.py:1204] (2/4) Exclude cut with ID _5287_210_2084_1_1527418883040_3878670_333-48508-0 from training. Duration: 0.6 2023-10-02 09:47:24,994 WARNING [train.py:1204] (2/4) Exclude cut with ID _8365_210_2064_1_1528592311070_4677690_371-122295-0 from training. Duration: 0.55 2023-10-02 09:47:26,412 WARNING [train.py:1204] (2/4) Exclude cut with ID _14860_210_4915_1_1530702020340_7343240_836-328489-0 from training. Duration: 0.9 2023-10-02 09:47:28,238 INFO [train.py:1046] (2/4) Epoch 24, batch 3250, loss[loss=0.2116, simple_loss=0.2647, pruned_loss=0.07927, over 19288.00 frames. ], tot_loss[loss=0.1696, simple_loss=0.2465, pruned_loss=0.04639, over 4693995.94 frames. ], batch size: 388, lr: 4.26e-03, grad_scale: 16.0 2023-10-02 09:47:29,466 WARNING [train.py:1204] (2/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_1033-274376-0_sp0.9 from training. Duration: 0.9666875 2023-10-02 09:47:30,904 WARNING [train.py:1204] (2/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_475-127455-0_sp0.9 from training. Duration: 0.77775 2023-10-02 09:47:30,910 WARNING [train.py:1204] (2/4) Exclude cut with ID IC0671W0071-331914 from training. Duration: 0.896 2023-10-02 09:47:30,936 WARNING [train.py:1204] (2/4) Exclude cut with ID _30327_210_16049_1_1532484161295_3924169_2-1523-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 09:47:30,938 WARNING [train.py:1204] (2/4) Exclude cut with ID _24314_210_4915_1_1532135078116_7170179_460-208279-0_sp1.1 from training. Duration: 0.97275 2023-10-02 09:47:32,790 WARNING [train.py:1204] (2/4) Exclude cut with ID IC0679W0052-335859 from training. Duration: 0.64 2023-10-02 09:47:37,478 WARNING [train.py:1204] (2/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_3-237434-0_sp1.1 from training. Duration: 0.6909375 2023-10-02 09:47:38,100 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.3.encoder.layers.1.conv_module2.whiten, num_groups=1, num_channels=512, metric=4.49 vs. limit=15.0 2023-10-02 09:47:40,245 WARNING [train.py:1204] (2/4) Exclude cut with ID _69884_210_9771_1_1535846392818_4166169_405-185283-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 09:47:47,143 WARNING [train.py:1204] (2/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_927-83684-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 09:47:47,158 WARNING [train.py:1204] (2/4) Exclude cut with ID _6694_210_4599_1_1527852637490_3965529_356-169233-0 from training. Duration: 0.99 2023-10-02 09:47:47,406 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.conv_module2.balancer2.prob, batch_count=836253.3333333334, ans=0.125 2023-10-02 09:47:48,490 WARNING [train.py:1204] (2/4) Exclude cut with ID _19896_210_1883_1_1531709022662_6649569_562-233001-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 09:47:48,535 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_20120_210_6753_1_1531643035_5482859_354-78629-0_sp1.1 from training. Duration: 0.963625 2023-10-02 09:47:48,536 WARNING [train.py:1204] (2/4) Exclude cut with ID _60135_210_13427_1_1534935557922_3651319_96-168198-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 09:47:51,189 WARNING [train.py:1204] (2/4) Exclude cut with ID _31602_210_5854_1_1532677021456_4458250_249-96604-0_sp1.1 from training. Duration: 0.8090625 2023-10-02 09:47:51,210 WARNING [train.py:1204] (2/4) Exclude cut with ID _14100_210_8341_1_1530529320613_3678069_258-224599-0_sp0.9 from training. Duration: 0.8555625 2023-10-02 09:47:53,207 WARNING [train.py:1204] (2/4) Exclude cut with ID _19708_210_9206_1_1532136650733_4238364_215-160135-0_sp1.1 from training. Duration: 0.97275 2023-10-02 09:47:53,233 WARNING [train.py:1204] (2/4) Exclude cut with ID _21878_210_3525_1_1531738772002_7264330_113-18163-0_sp0.9 from training. Duration: 0.988875 2023-10-02 09:47:54,326 WARNING [train.py:1204] (2/4) Exclude cut with ID _64135_210_2253_1_1535677267876_7608972_552-337340-0_sp1.1 from training. Duration: 0.92725 2023-10-02 09:47:54,360 WARNING [train.py:1204] (2/4) Exclude cut with ID _67725_210_27767_1_1535608756671_5105359_10-12887-0_sp1.1 from training. Duration: 0.97275 2023-10-02 09:47:54,373 WARNING [train.py:1204] (2/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_401-284840-0_sp1.1 from training. Duration: 0.97275 2023-10-02 09:47:54,404 WARNING [train.py:1204] (2/4) Exclude cut with ID _44659_210_2721_1_1534036120061_6614860_483-141285-0_sp1.1 from training. Duration: 0.936375 2023-10-02 09:47:57,206 WARNING [train.py:1204] (2/4) Exclude cut with ID _9461_210_4557_1_1529060514726_3677451_358-332734-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 09:47:59,129 WARNING [train.py:1204] (2/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_788-298907-0_sp1.1 from training. Duration: 0.8090625 2023-10-02 09:48:01,760 WARNING [train.py:1204] (2/4) Exclude cut with ID _39411_210_5986_1_1533538825030_3343649_354-267196-0_sp1.1 from training. Duration: 0.92725 2023-10-02 09:48:01,780 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_14953_210_2151_1_1530701710_5616165_192-309792-0_sp1.1 from training. Duration: 0.97275 2023-10-02 09:48:03,161 WARNING [train.py:1204] (2/4) Exclude cut with ID _59170_210_6721_1_1534983633960_4203335_290-151255-0_sp1.1 from training. Duration: 0.92725 2023-10-02 09:48:03,185 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_5575_210_1410_1_1527301305_2570170_249-93769-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 09:48:03,204 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_46013_210_7234_1_1533776362_3744032_463-52378-0_sp1.1 from training. Duration: 0.7818125 2023-10-02 09:48:08,226 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.feed_forward1.hidden_balancer.prob, batch_count=836320.0, ans=0.125 2023-10-02 09:48:09,257 WARNING [train.py:1204] (2/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_580-93798-0 from training. Duration: 0.56 2023-10-02 09:48:09,308 WARNING [train.py:1204] (2/4) Exclude cut with ID _19514_210_9259_1_1531530071330_5084230_385-13762-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 09:48:09,313 WARNING [train.py:1204] (2/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_458-67321-0_sp1.1 from training. Duration: 0.8818125 2023-10-02 09:48:10,704 WARNING [train.py:1204] (2/4) Exclude cut with ID _41298_210_9019_1_1533430371450_4822143_188-154791-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 09:48:10,778 WARNING [train.py:1204] (2/4) Exclude cut with ID _15037_210_2253_1_1530752294104_3267650_296-192597-0_sp0.9 from training. Duration: 0.9 2023-10-02 09:48:15,140 WARNING [train.py:1204] (2/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_661-264857-0_sp1.1 from training. Duration: 0.8454375 2023-10-02 09:48:21,482 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_34008_210_10431_1_1533345424_8183247_662-317331-0_sp1.1 from training. Duration: 0.8545625 2023-10-02 09:48:21,518 WARNING [train.py:1204] (2/4) Exclude cut with ID _46502_210_13794_1_1533724452198_3657159_241-278329-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 09:48:21,518 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_11349_210_6753_1_1529800861_1101207_26-65602-0 from training. Duration: 0.96 2023-10-02 09:48:21,522 WARNING [train.py:1204] (2/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_448-4048-0_sp1.1 from training. Duration: 0.87275 2023-10-02 09:48:21,529 WARNING [train.py:1204] (2/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_662-275621-0_sp1.1 from training. Duration: 0.3818125 2023-10-02 09:48:21,576 WARNING [train.py:1204] (2/4) Exclude cut with ID _13938_210_9988_1_1530518718978_3693960_256-54454-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 09:48:25,263 WARNING [train.py:1204] (2/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_256-32662-0 from training. Duration: 0.9 2023-10-02 09:48:25,298 WARNING [train.py:1204] (2/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_326-17061-0 from training. Duration: 0.94 2023-10-02 09:48:26,599 WARNING [train.py:1204] (2/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_505-97987-0_sp1.1 from training. Duration: 0.7818125 2023-10-02 09:48:26,693 WARNING [train.py:1204] (2/4) Exclude cut with ID _66873_210_5517_1_1535454136506_4293370_316-294364-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 09:48:28,642 WARNING [train.py:1204] (2/4) Exclude cut with ID _19514_210_9259_1_1531530071330_5084230_762-231063-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 09:48:29,389 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.3.encoder.layers.0.feed_forward1.out_whiten, num_groups=1, num_channels=512, metric=9.09 vs. limit=15.0 2023-10-02 09:48:29,955 WARNING [train.py:1204] (2/4) Exclude cut with ID _4697_210_2069_1_1526815801953_3823098_68-219807-0_sp0.9 from training. Duration: 0.52225 2023-10-02 09:48:29,989 WARNING [train.py:1204] (2/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_479-158053-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 09:48:33,165 WARNING [train.py:1204] (2/4) Exclude cut with ID _66112_210_10635_1_1535590236305_3801484_83-38181-0_sp1.1 from training. Duration: 0.936375 2023-10-02 09:48:33,178 WARNING [train.py:1204] (2/4) Exclude cut with ID _55857_210_2346_1_1534644007831_6898209_255-146346-0_sp1.1 from training. Duration: 0.963625 2023-10-02 09:48:34,552 WARNING [train.py:1204] (2/4) Exclude cut with ID _48836_210_12210_1_1534075848293_3904150_429-35475-0 from training. Duration: 0.89 2023-10-02 09:48:34,563 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_50154_210_13731_1_1534750862_1080961_119-187163-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 09:48:39,128 WARNING [train.py:1204] (2/4) Exclude cut with ID _8793_210_6310_1_1528542985139_2310009_121-179782-0_sp0.9 from training. Duration: 0.888875 2023-10-02 09:48:39,131 WARNING [train.py:1204] (2/4) Exclude cut with ID _14884_210_2252_1_1530707459689_5561250_591-173406-0 from training. Duration: 0.96 2023-10-02 09:48:41,736 INFO [train.py:1046] (2/4) Epoch 24, batch 3300, loss[loss=0.1749, simple_loss=0.2449, pruned_loss=0.05249, over 23551.00 frames. ], tot_loss[loss=0.1705, simple_loss=0.2474, pruned_loss=0.04687, over 4697881.61 frames. ], batch size: 256, lr: 4.26e-03, grad_scale: 16.0 2023-10-02 09:48:41,826 WARNING [train.py:1204] (2/4) Exclude cut with ID _15133_210_6228_1_1530794512074_3080379_54-198193-0_sp1.1 from training. Duration: 0.8545625 2023-10-02 09:48:41,835 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6545_210_2040_1_1527774670_4245258_407-48070-0 from training. Duration: 0.76 2023-10-02 09:48:43,249 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1032-236863-0 from training. Duration: 0.84 2023-10-02 09:48:43,449 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=836520.0, ans=0.1 2023-10-02 09:48:44,609 WARNING [train.py:1204] (2/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_507-303370-0 from training. Duration: 0.96 2023-10-02 09:48:44,639 WARNING [train.py:1204] (2/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_164-282366-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 09:48:47,964 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.4.encoder.layers.2.nonlin_attention.whiten1, num_groups=1, num_channels=288, metric=4.33 vs. limit=10.0 2023-10-02 09:48:48,755 WARNING [train.py:1204] (2/4) Exclude cut with ID _39867_210_9826_1_1533553222030_9344259_312-158727-0_sp1.1 from training. Duration: 0.963625 2023-10-02 09:48:48,843 WARNING [train.py:1204] (2/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_174-205058-0_sp1.1 from training. Duration: 0.863625 2023-10-02 09:48:48,864 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_11305_210_1794_1_1529750428_7178148_166-237358-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 09:48:50,266 WARNING [train.py:1204] (2/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_26-175553-0_sp1.1 from training. Duration: 0.5545625 2023-10-02 09:48:52,116 WARNING [train.py:1204] (2/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_96-290039-0_sp1.1 from training. Duration: 0.5090625 2023-10-02 09:48:54,853 WARNING [train.py:1204] (2/4) Exclude cut with ID _53219_210_4525_1_1534834836008_4064825_261-101691-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 09:48:57,576 WARNING [train.py:1204] (2/4) Exclude cut with ID _24271_210_12658_1_1531987270022_7190519_96-293390-0_sp1.1 from training. Duration: 0.936375 2023-10-02 09:49:01,079 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_271-234962-0 from training. Duration: 0.79 2023-10-02 09:49:02,370 WARNING [train.py:1204] (2/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_136-30660-0_sp1.1 from training. Duration: 0.97275 2023-10-02 09:49:02,386 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_20120_210_6753_1_1531643035_5482859_625-281046-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 09:49:04,382 WARNING [train.py:1204] (2/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_333-168193-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 09:49:04,434 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9029W0269-509664 from training. Duration: 0.896 2023-10-02 09:49:05,731 WARNING [train.py:1204] (2/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_450-147393-0_sp1.1 from training. Duration: 0.92725 2023-10-02 09:49:05,780 WARNING [train.py:1204] (2/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_643-52007-0_sp1.1 from training. Duration: 0.5181875 2023-10-02 09:49:07,108 WARNING [train.py:1204] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_330-327861-0_sp0.9 from training. Duration: 0.7444375 2023-10-02 09:49:07,116 WARNING [train.py:1204] (2/4) Exclude cut with ID _47790_210_16187_1_1533862814066_4207519_260-317651-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 09:49:07,139 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9044W0010-516654 from training. Duration: 0.896 2023-10-02 09:49:07,332 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.1.bypass.scale_min, batch_count=836586.6666666666, ans=0.2 2023-10-02 09:49:11,785 WARNING [train.py:1204] (2/4) Exclude cut with ID _14087_210_2253_1_1530529224512_3014029_265-118378-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 09:49:11,791 WARNING [train.py:1204] (2/4) Exclude cut with ID _6194_210_1534_1_1527511826160_3941194_321-148641-0_sp0.9 from training. Duration: 0.711125 2023-10-02 09:49:12,913 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.0.layers.0.nonlin_attention.whiten1, num_groups=1, num_channels=144, metric=9.30 vs. limit=10.0 2023-10-02 09:49:13,248 WARNING [train.py:1204] (2/4) Exclude cut with ID _54871_210_22356_1_1534665798253_3856359_101-169176-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 09:49:13,251 WARNING [train.py:1204] (2/4) Exclude cut with ID 497-129325-0061-62254-0_sp1.1 from training. Duration: 0.97725 2023-10-02 09:49:13,341 WARNING [train.py:1204] (2/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_795-33252-0_sp1.1 from training. Duration: 0.336375 2023-10-02 09:49:13,353 WARNING [train.py:1204] (2/4) Exclude cut with ID _28317_210_12742_1_1532480302381_8510325_168-246176-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 09:49:13,591 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.nonlin_attention.balancer.prob, batch_count=836653.3333333334, ans=0.125 2023-10-02 09:49:14,837 WARNING [train.py:1204] (2/4) Exclude cut with ID _7160_210_1876_1_1527940923231_3703799_120-150943-0_sp1.1 from training. Duration: 0.736375 2023-10-02 09:49:17,539 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9084W0005-536844 from training. Duration: 0.768 2023-10-02 09:49:18,927 WARNING [train.py:1204] (2/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_234-260682-0 from training. Duration: 0.84 2023-10-02 09:49:18,967 WARNING [train.py:1204] (2/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_188-114464-0_sp0.9 from training. Duration: 0.911125 2023-10-02 09:49:21,990 WARNING [train.py:1204] (2/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_81-75849-0 from training. Duration: 0.39 2023-10-02 09:49:26,018 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.458e+02 1.884e+02 2.119e+02 2.530e+02 4.061e+02, threshold=4.238e+02, percent-clipped=0.0 2023-10-02 09:49:26,098 WARNING [train.py:1204] (2/4) Exclude cut with ID _14224_210_4381_1_1530529062037_4264010_379-184223-0_sp1.1 from training. Duration: 0.77275 2023-10-02 09:49:27,646 WARNING [train.py:1204] (2/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_454-123002-0_sp1.1 from training. Duration: 0.62725 2023-10-02 09:49:27,691 WARNING [train.py:1204] (2/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_887-249555-0_sp1.1 from training. Duration: 0.7 2023-10-02 09:49:30,545 WARNING [train.py:1204] (2/4) Exclude cut with ID _31088_210_16446_1_1532520012284_3440919_20-260902-0_sp1.1 from training. Duration: 0.97275 2023-10-02 09:49:32,465 WARNING [train.py:1204] (2/4) Exclude cut with ID _45329_210_7005_1_1534157951211_6774339_856-344574-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 09:49:32,467 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_36756_210_7234_1_1533026991_966276_4-164118-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 09:49:32,483 WARNING [train.py:1204] (2/4) Exclude cut with ID _8365_210_2064_1_1528592311070_4677690_371-122295-0_sp1.1 from training. Duration: 0.5 2023-10-02 09:49:34,249 WARNING [train.py:1204] (2/4) Exclude cut with ID _5910_210_1389_1_1527337972454_5050100_555-159815-0_sp1.1 from training. Duration: 0.8909375 2023-10-02 09:49:34,258 WARNING [train.py:1204] (2/4) Exclude cut with ID _37086_210_9038_1_1532999540891_7021250_469-147941-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 09:49:35,648 WARNING [train.py:1204] (2/4) Exclude cut with ID _3067_210_2667_1_1526212974640_4503168_459-173867-0_sp0.9 from training. Duration: 0.97775 2023-10-02 09:49:35,760 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9156W0006-572499 from training. Duration: 0.896 2023-10-02 09:49:35,838 WARNING [train.py:1204] (2/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_277-210798-0 from training. Duration: 0.55 2023-10-02 09:49:38,643 WARNING [train.py:1204] (2/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_103-336064-0_sp1.1 from training. Duration: 0.463625 2023-10-02 09:49:38,697 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_3414_210_1988_1_1526212718_4813195_331-290945-0_sp1.1 from training. Duration: 0.936375 2023-10-02 09:49:38,698 WARNING [train.py:1204] (2/4) Exclude cut with ID _67499_210_17282_1_1535509805220_3692430_365-29276-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 09:49:40,647 WARNING [train.py:1204] (2/4) Exclude cut with ID _14821_210_8750_1_1530680503991_6829359_69-250847-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 09:49:40,649 WARNING [train.py:1204] (2/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_1102-61301-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 09:49:42,099 WARNING [train.py:1204] (2/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_166-246557-0_sp1.1 from training. Duration: 0.5818125 2023-10-02 09:49:42,149 WARNING [train.py:1204] (2/4) Exclude cut with ID _15351_210_10626_1_1530856635653_3576210_214-129918-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 09:49:42,164 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_120-41668-0_sp0.9 from training. Duration: 0.77775 2023-10-02 09:49:42,233 WARNING [train.py:1204] (2/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_27-72278-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 09:49:44,803 WARNING [train.py:1204] (2/4) Exclude cut with ID _24314_210_4915_1_1532135078116_7170179_521-258868-0_sp1.1 from training. Duration: 0.7181875 2023-10-02 09:49:47,526 WARNING [train.py:1204] (2/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_143-86344-0 from training. Duration: 0.77 2023-10-02 09:49:47,562 WARNING [train.py:1204] (2/4) Exclude cut with ID _53489_210_6228_1_1534298357169_3835970_465-157433-0_sp1.1 from training. Duration: 0.92725 2023-10-02 09:49:48,720 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_248-319353-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 09:49:50,743 WARNING [train.py:1204] (2/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_102-77064-0_sp1.1 from training. Duration: 0.8454375 2023-10-02 09:49:50,774 WARNING [train.py:1204] (2/4) Exclude cut with ID _9389_210_1664_1_1529217337301_5289269_379-128794-0_sp1.1 from training. Duration: 0.77275 2023-10-02 09:49:51,999 WARNING [train.py:1204] (2/4) Exclude cut with ID _3231_210_2106_1_1526205864934_6784220_236-260835-0_sp1.1 from training. Duration: 0.97275 2023-10-02 09:49:54,659 WARNING [train.py:1204] (2/4) Exclude cut with ID _24271_210_12658_1_1531987270022_7190519_184-342363-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 09:49:54,660 WARNING [train.py:1204] (2/4) Exclude cut with ID _27894_210_12392_1_1532415496078_6902439_67-221663-0_sp1.1 from training. Duration: 0.92725 2023-10-02 09:49:55,963 INFO [train.py:1046] (2/4) Epoch 24, batch 3350, loss[loss=0.2094, simple_loss=0.273, pruned_loss=0.07288, over 22688.00 frames. ], tot_loss[loss=0.1722, simple_loss=0.2493, pruned_loss=0.04754, over 4699478.29 frames. ], batch size: 322, lr: 4.26e-03, grad_scale: 16.0 2023-10-02 09:49:58,653 WARNING [train.py:1204] (2/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_304-59227-0_sp1.1 from training. Duration: 0.7 2023-10-02 09:50:01,978 WARNING [train.py:1204] (2/4) Exclude cut with ID _58605_210_12210_1_1534723195939_3904259_458-42232-0_sp1.1 from training. Duration: 0.92725 2023-10-02 09:50:02,053 WARNING [train.py:1204] (2/4) Exclude cut with ID _14922_210_6228_1_1530707435273_3693310_13-165512-0_sp0.9 from training. Duration: 0.911125 2023-10-02 09:50:06,743 WARNING [train.py:1204] (2/4) Exclude cut with ID _58602_210_2346_1_1534903210085_6908830_792-102241-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 09:50:08,163 WARNING [train.py:1204] (2/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_588-125165-0_sp0.9 from training. Duration: 0.92225 2023-10-02 09:50:09,441 WARNING [train.py:1204] (2/4) Exclude cut with ID _54218_210_12210_1_1534413646388_4409259_239-98429-0_sp1.1 from training. Duration: 0.97275 2023-10-02 09:50:09,497 WARNING [train.py:1204] (2/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_538-32291-0_sp1.1 from training. Duration: 0.7545625 2023-10-02 09:50:10,911 WARNING [train.py:1204] (2/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_419-317062-0 from training. Duration: 0.91 2023-10-02 09:50:11,066 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.conv_module1.balancer2.prob, batch_count=836920.0, ans=0.125 2023-10-02 09:50:12,275 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9309W0202-640329 from training. Duration: 0.896 2023-10-02 09:50:14,027 WARNING [train.py:1204] (2/4) Exclude cut with ID _3231_210_2106_1_1526205864934_6784220_419-313291-0_sp1.1 from training. Duration: 0.97275 2023-10-02 09:50:16,694 WARNING [train.py:1204] (2/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_619-344499-0 from training. Duration: 0.63 2023-10-02 09:50:16,705 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_278-272612-0 from training. Duration: 0.73 2023-10-02 09:50:18,181 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_103-84010-0_sp0.9 from training. Duration: 0.7666875 2023-10-02 09:50:18,200 WARNING [train.py:1204] (2/4) Exclude cut with ID _26528_210_9070_1_1532570410842_3851310_497-97218-0_sp1.1 from training. Duration: 0.7818125 2023-10-02 09:50:18,292 WARNING [train.py:1204] (2/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_262-84613-0_sp1.1 from training. Duration: 0.936375 2023-10-02 09:50:19,577 WARNING [train.py:1204] (2/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_387-131538-0 from training. Duration: 0.81 2023-10-02 09:50:19,603 WARNING [train.py:1204] (2/4) Exclude cut with ID _8030_210_7497_1_1528444886781_3531089_224-300489-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 09:50:19,622 WARNING [train.py:1204] (2/4) Exclude cut with ID _14349_210_8341_1_1530615710818_3442470_170-336541-0_sp0.9 from training. Duration: 0.9555625 2023-10-02 09:50:21,013 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_47069_210_15368_1_1533781267_4431320_180-31746-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 09:50:22,395 WARNING [train.py:1204] (2/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_447-7041-0_sp1.1 from training. Duration: 0.92725 2023-10-02 09:50:22,425 WARNING [train.py:1204] (2/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_2-55283-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 09:50:23,773 WARNING [train.py:1204] (2/4) Exclude cut with ID _12555_210_8750_1_1530075594402_3780239_70-181911-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 09:50:27,003 WARNING [train.py:1204] (2/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_311-328022-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 09:50:29,637 WARNING [train.py:1204] (2/4) Exclude cut with ID _14528_210_8254_1_1530669700514_4306939_330-2987-0_sp1.1 from training. Duration: 0.92725 2023-10-02 09:50:30,978 WARNING [train.py:1204] (2/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_385-184858-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 09:50:34,406 WARNING [train.py:1204] (2/4) Exclude cut with ID _64414_210_17301_1_1535243252048_3757759_348-333360-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 09:50:36,337 WARNING [train.py:1204] (2/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_753-214751-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 09:50:36,608 INFO [scaling.py:1118] (2/4) WithLoss: name=encoder.encoders.3.encoder.layers.1.self_attn_weights, loss-sum=0.000e+00 2023-10-02 09:50:37,769 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_17613_210_2151_1_1531137569_2366160_202-208013-0_sp1.1 from training. Duration: 0.92725 2023-10-02 09:50:39,012 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_43831_210_2158_1_1533725502_5872791_610-129300-0_sp1.1 from training. Duration: 0.936375 2023-10-02 09:50:42,122 WARNING [train.py:1204] (2/4) Exclude cut with ID _54218_210_12210_1_1534413646388_4409259_321-23852-0_sp1.1 from training. Duration: 0.936375 2023-10-02 09:50:43,605 WARNING [train.py:1204] (2/4) Exclude cut with ID _39411_210_5986_1_1533538825030_3343649_173-238248-0 from training. Duration: 0.83 2023-10-02 09:50:43,612 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_405-339986-0_sp0.9 from training. Duration: 0.5666875 2023-10-02 09:50:43,641 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_2009_210_2071_1_1525481134_4588937_385-302100-0 from training. Duration: 0.87 2023-10-02 09:50:43,670 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_161-276996-0_sp1.1 from training. Duration: 0.863625 2023-10-02 09:50:45,030 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_177-58005-0 from training. Duration: 0.62 2023-10-02 09:50:46,386 WARNING [train.py:1204] (2/4) Exclude cut with ID _64639_210_5816_1_1535367619437_3528000_146-343734-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 09:50:49,060 WARNING [train.py:1204] (2/4) Exclude cut with ID _3845_210_1390_1_1526731326141_3523610_72-61441-0_sp1.1 from training. Duration: 0.92725 2023-10-02 09:50:53,309 WARNING [train.py:1204] (2/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_991-125028-0_sp1.1 from training. Duration: 0.936375 2023-10-02 09:50:54,606 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7544_210_6753_1_1528591327_1471426_172-348777-0 from training. Duration: 0.66 2023-10-02 09:50:54,658 WARNING [train.py:1204] (2/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_416-257730-0_sp1.1 from training. Duration: 0.5909375 2023-10-02 09:50:55,973 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1113-7369-0_sp1.1 from training. Duration: 0.77275 2023-10-02 09:50:57,990 WARNING [train.py:1204] (2/4) Exclude cut with ID _40402_210_18219_1_1533522324115_2953150_242-76943-0_sp1.1 from training. Duration: 0.8545625 2023-10-02 09:51:04,624 WARNING [train.py:1204] (2/4) Exclude cut with ID _44718_210_5816_1_1534050051869_6469030_190-49648-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 09:51:06,594 WARNING [train.py:1204] (2/4) Exclude cut with ID _47251_210_18628_1_1533814063128_4137250_214-88076-0 from training. Duration: 0.88 2023-10-02 09:51:06,627 WARNING [train.py:1204] (2/4) Exclude cut with ID _5585_210_2667_1_1527418813324_8007340_89-332072-0_sp1.1 from training. Duration: 0.5818125 2023-10-02 09:51:08,343 WARNING [train.py:1204] (2/4) Exclude cut with ID _41817_210_18835_1_1533434477160_7423049_555-293448-0_sp1.1 from training. Duration: 0.72725 2023-10-02 09:51:09,644 INFO [train.py:1046] (2/4) Epoch 24, batch 3400, loss[loss=0.1701, simple_loss=0.2376, pruned_loss=0.05134, over 23607.00 frames. ], tot_loss[loss=0.1723, simple_loss=0.2496, pruned_loss=0.04751, over 4713050.68 frames. ], batch size: 256, lr: 4.26e-03, grad_scale: 16.0 2023-10-02 09:51:09,743 WARNING [train.py:1204] (2/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_286-331314-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 09:51:11,134 WARNING [train.py:1204] (2/4) Exclude cut with ID _13921_210_9994_1_1530838702800_3003429_46-149444-0 from training. Duration: 0.94 2023-10-02 09:51:11,196 WARNING [train.py:1204] (2/4) Exclude cut with ID _44778_210_2054_1_1533798127203_7443090_768-43595-0_sp1.1 from training. Duration: 0.936375 2023-10-02 09:51:12,345 WARNING [train.py:1204] (2/4) Exclude cut with ID _35355_210_16040_1_1533212988977_4016229_74-190076-0 from training. Duration: 0.8 2023-10-02 09:51:12,444 WARNING [train.py:1204] (2/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_1007-316380-0_sp1.1 from training. Duration: 0.963625 2023-10-02 09:51:13,797 WARNING [train.py:1204] (2/4) Exclude cut with ID _23557_210_8750_1_1531890106370_6846255_150-283437-0_sp1.1 from training. Duration: 0.963625 2023-10-02 09:51:13,846 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_3474_210_2028_1_1526188513_4825246_145-309131-0_sp1.1 from training. Duration: 0.67275 2023-10-02 09:51:15,795 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.conv_module2.balancer2.min_positive, batch_count=837186.6666666666, ans=0.05 2023-10-02 09:51:16,913 WARNING [train.py:1204] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_391-22148-0_sp1.1 from training. Duration: 0.7 2023-10-02 09:51:16,936 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_238-256668-0 from training. Duration: 0.69 2023-10-02 09:51:19,778 WARNING [train.py:1204] (2/4) Exclude cut with ID _8394_210_6702_1_1528590504316_3540189_372-198878-0 from training. Duration: 0.6 2023-10-02 09:51:19,796 WARNING [train.py:1204] (2/4) Exclude cut with ID ID0174W0006-766659 from training. Duration: 0.953 2023-10-02 09:51:19,812 WARNING [train.py:1204] (2/4) Exclude cut with ID _6274_210_2084_1_1527937583837_7494020_668-23019-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 09:51:22,642 WARNING [train.py:1204] (2/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_322-74328-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 09:51:22,648 WARNING [train.py:1204] (2/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_872-264525-0_sp1.1 from training. Duration: 0.5909375 2023-10-02 09:51:23,881 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_53378_210_17282_1_1534221071_5769509_641-286201-0_sp1.1 from training. Duration: 0.97275 2023-10-02 09:51:25,229 WARNING [train.py:1204] (2/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_199-295611-0_sp1.1 from training. Duration: 0.5 2023-10-02 09:51:29,711 WARNING [train.py:1204] (2/4) Exclude cut with ID _5250_210_5219_1_1527249561806_8692110_1270-317971-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 09:51:31,061 WARNING [train.py:1204] (2/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_784-213099-0 from training. Duration: 0.93 2023-10-02 09:51:33,070 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.4.encoder.layers.0.nonlin_attention.whiten1, num_groups=1, num_channels=288, metric=7.92 vs. limit=10.0 2023-10-02 09:51:35,588 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_115-233929-0_sp0.9 from training. Duration: 0.8 2023-10-02 09:51:36,985 WARNING [train.py:1204] (2/4) Exclude cut with ID _54871_210_22356_1_1534665798253_3856359_256-223809-0_sp1.1 from training. Duration: 0.97275 2023-10-02 09:51:37,018 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_20120_210_6753_1_1531643035_5482859_285-87781-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 09:51:40,213 WARNING [train.py:1204] (2/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_619-344499-0_sp1.1 from training. Duration: 0.57275 2023-10-02 09:51:45,059 WARNING [train.py:1204] (2/4) Exclude cut with ID _60229_210_27767_1_1534838406158_3655350_264-279573-0_sp0.9 from training. Duration: 0.92225 2023-10-02 09:51:47,865 WARNING [train.py:1204] (2/4) Exclude cut with ID _7719_210_3525_1_1528456153163_4296960_333-110399-0 from training. Duration: 0.96 2023-10-02 09:51:53,136 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.545e+02 1.845e+02 2.030e+02 2.228e+02 3.234e+02, threshold=4.061e+02, percent-clipped=0.0 2023-10-02 09:51:53,221 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_50423_210_13270_1_1534128598_5021734_759-90833-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 09:51:53,267 WARNING [train.py:1204] (2/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_151-254798-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 09:51:55,130 WARNING [train.py:1204] (2/4) Exclude cut with ID _3465_210_3525_1_1526175667060_3574049_450-203637-0 from training. Duration: 0.92 2023-10-02 09:51:55,161 WARNING [train.py:1204] (2/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_673-36456-0_sp1.1 from training. Duration: 0.936375 2023-10-02 09:51:56,293 WARNING [train.py:1204] (2/4) Exclude cut with ID _36538_210_9994_1_1533204020363_3795620_131-8685-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 09:51:56,332 WARNING [train.py:1204] (2/4) Exclude cut with ID _12556_210_8341_1_1530081024100_7327690_713-55626-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 09:51:56,375 WARNING [train.py:1204] (2/4) Exclude cut with ID _2322_210_2667_1_1525604548971_3656299_440-80494-0_sp0.9 from training. Duration: 0.97775 2023-10-02 09:51:58,015 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.balancer2.prob, batch_count=837386.6666666666, ans=0.125 2023-10-02 09:51:59,262 WARNING [train.py:1204] (2/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_210-325081-0_sp1.1 from training. Duration: 0.97275 2023-10-02 09:51:59,496 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.ff2_skip_rate, batch_count=837386.6666666666, ans=0.0 2023-10-02 09:52:03,206 WARNING [train.py:1204] (2/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_375-244976-0_sp0.9 from training. Duration: 0.8333125 2023-10-02 09:52:03,212 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_15517_210_6753_1_1530952293_1664795_119-31001-0_sp1.1 from training. Duration: 0.8545625 2023-10-02 09:52:07,972 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_41452_210_7291_1_1533437687_3941281_319-42326-0_sp1.1 from training. Duration: 0.92725 2023-10-02 09:52:08,173 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.feed_forward1.out_proj.dropout_p, batch_count=837453.3333333334, ans=0.1 2023-10-02 09:52:11,297 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_372-173012-0 from training. Duration: 0.95 2023-10-02 09:52:15,920 WARNING [train.py:1204] (2/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_41-31157-0_sp1.1 from training. Duration: 0.5181875 2023-10-02 09:52:21,291 WARNING [train.py:1204] (2/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_91-212486-0 from training. Duration: 0.68 2023-10-02 09:52:22,558 INFO [train.py:1046] (2/4) Epoch 24, batch 3450, loss[loss=0.1425, simple_loss=0.2214, pruned_loss=0.03178, over 24573.00 frames. ], tot_loss[loss=0.1729, simple_loss=0.2501, pruned_loss=0.04787, over 4706839.71 frames. ], batch size: 60, lr: 4.26e-03, grad_scale: 16.0 2023-10-02 09:52:25,396 WARNING [train.py:1204] (2/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_1267-272693-0 from training. Duration: 0.73 2023-10-02 09:52:25,433 WARNING [train.py:1204] (2/4) Exclude cut with ID _13938_210_9988_1_1530518718978_3693960_152-67046-0_sp1.1 from training. Duration: 0.936375 2023-10-02 09:52:28,148 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_4528_210_3273_1_1527076654_3276777_342-168068-0_sp1.1 from training. Duration: 0.7909375 2023-10-02 09:52:28,163 WARNING [train.py:1204] (2/4) Exclude cut with ID _7606_210_4915_1_1528894253277_5080690_127-177761-0 from training. Duration: 0.64 2023-10-02 09:52:29,914 WARNING [train.py:1204] (2/4) Exclude cut with ID _45329_210_7005_1_1534157951211_6774339_335-137972-0_sp1.1 from training. Duration: 0.92725 2023-10-02 09:52:32,752 WARNING [train.py:1204] (2/4) Exclude cut with ID _5166_210_2084_1_1527073197160_4087993_331-117829-0_sp1.1 from training. Duration: 0.563625 2023-10-02 09:52:36,854 WARNING [train.py:1204] (2/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_125-125818-0_sp1.1 from training. Duration: 0.87275 2023-10-02 09:52:36,905 WARNING [train.py:1204] (2/4) Exclude cut with ID _6789_210_1534_1_1527854471671_2870519_48-294897-0_sp1.1 from training. Duration: 0.963625 2023-10-02 09:52:38,734 WARNING [train.py:1204] (2/4) Exclude cut with ID _6694_210_4599_1_1527852637490_3965529_116-109398-0_sp1.1 from training. Duration: 0.663625 2023-10-02 09:52:38,750 WARNING [train.py:1204] (2/4) Exclude cut with ID _52892_210_6721_1_1534378941259_4124129_222-172685-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 09:52:42,009 WARNING [train.py:1204] (2/4) Exclude cut with ID _50655_210_18628_1_1534229942193_3772154_320-132275-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 09:52:46,224 WARNING [train.py:1204] (2/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_76-311963-0 from training. Duration: 0.7 2023-10-02 09:52:52,373 WARNING [train.py:1204] (2/4) Exclude cut with ID _6274_210_2084_1_1527937583837_7494020_32-205288-0 from training. Duration: 0.78 2023-10-02 09:52:52,395 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_86-16881-0_sp0.9 from training. Duration: 0.6444375 2023-10-02 09:52:52,432 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_271-297505-0_sp1.1 from training. Duration: 0.9 2023-10-02 09:52:52,603 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.bypass.skip_rate, batch_count=837653.3333333334, ans=0.07 2023-10-02 09:52:55,116 WARNING [train.py:1204] (2/4) Exclude cut with ID _57572_210_5517_1_1534921219624_3809484_187-295293-0_sp1.1 from training. Duration: 0.963625 2023-10-02 09:53:00,000 WARNING [train.py:1204] (2/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_563-329943-0 from training. Duration: 0.99 2023-10-02 09:53:01,275 WARNING [train.py:1204] (2/4) Exclude cut with ID _7160_210_1876_1_1527940923231_3703799_302-150728-0_sp0.9 from training. Duration: 0.8666875 2023-10-02 09:53:05,887 WARNING [train.py:1204] (2/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_926-7526-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 09:53:07,215 WARNING [train.py:1204] (2/4) Exclude cut with ID _62574_210_2655_1_1535180407058_3933249_467-22348-0_sp1.1 from training. Duration: 0.936375 2023-10-02 09:53:08,749 WARNING [train.py:1204] (2/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_280-161633-0_sp0.9 from training. Duration: 0.811125 2023-10-02 09:53:08,848 WARNING [train.py:1204] (2/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_385-193052-0_sp1.1 from training. Duration: 0.7818125 2023-10-02 09:53:11,476 WARNING [train.py:1204] (2/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_444-28041-0 from training. Duration: 0.85 2023-10-02 09:53:11,480 WARNING [train.py:1204] (2/4) Exclude cut with ID _66070_210_7640_1_1535441393096_7805310_341-249237-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 09:53:11,569 WARNING [train.py:1204] (2/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_425-269826-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 09:53:14,793 WARNING [train.py:1204] (2/4) Exclude cut with ID _53950_210_5816_1_1534417264919_3862951_124-185597-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 09:53:17,546 WARNING [train.py:1204] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_380-204756-0 from training. Duration: 0.86 2023-10-02 09:53:21,005 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=837786.6666666666, ans=0.1 2023-10-02 09:53:22,176 WARNING [train.py:1204] (2/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_282-257791-0_sp0.9 from training. Duration: 0.9555625 2023-10-02 09:53:26,361 WARNING [train.py:1204] (2/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_372-131163-0_sp1.1 from training. Duration: 0.8545625 2023-10-02 09:53:27,729 WARNING [train.py:1204] (2/4) Exclude cut with ID _28317_210_12742_1_1532480302381_8510325_447-233513-0_sp1.1 from training. Duration: 0.963625 2023-10-02 09:53:30,521 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_54-118333-0_sp1.1 from training. Duration: 0.97275 2023-10-02 09:53:33,888 WARNING [train.py:1204] (2/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_388-230239-0_sp1.1 from training. Duration: 0.963625 2023-10-02 09:53:33,900 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_40047_210_2290_1_1533290519_2374311_141-54639-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 09:53:35,330 WARNING [train.py:1204] (2/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_91-267696-0_sp0.9 from training. Duration: 0.9666875 2023-10-02 09:53:35,355 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_14056_210_4163_1_1530604483_7572827_532-137847-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 09:53:36,703 INFO [train.py:1046] (2/4) Epoch 24, batch 3500, loss[loss=0.1765, simple_loss=0.256, pruned_loss=0.04851, over 23943.00 frames. ], tot_loss[loss=0.1713, simple_loss=0.248, pruned_loss=0.04724, over 4698494.14 frames. ], batch size: 86, lr: 4.26e-03, grad_scale: 16.0 2023-10-02 09:53:39,334 WARNING [train.py:1204] (2/4) Exclude cut with ID _8064_210_5399_1_1528372299291_4282973_155-283231-0_sp1.1 from training. Duration: 0.97275 2023-10-02 09:53:42,091 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_2245_210_2715_1_1525511548_3540254_310-52483-0_sp1.1 from training. Duration: 0.8 2023-10-02 09:53:42,145 WARNING [train.py:1204] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_185-174805-0 from training. Duration: 0.68 2023-10-02 09:53:44,065 WARNING [train.py:1204] (2/4) Exclude cut with ID _15012_210_1968_1_1530856820429_5095350_662-210469-0_sp1.1 from training. Duration: 0.5181875 2023-10-02 09:53:46,726 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_426-313557-0_sp0.9 from training. Duration: 0.588875 2023-10-02 09:53:49,376 WARNING [train.py:1204] (2/4) Exclude cut with ID _42326_210_7770_1_1533814282531_3707269_407-204343-0_sp1.1 from training. Duration: 0.97275 2023-10-02 09:53:49,383 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_258-310570-0 from training. Duration: 0.82 2023-10-02 09:53:55,375 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_4737_210_2040_1_1526998689_3293008_378-178650-0_sp1.1 from training. Duration: 0.863625 2023-10-02 09:53:55,457 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_47896_210_20508_1_1534492518_5958868_432-327451-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 09:53:56,787 WARNING [train.py:1204] (2/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_188-114464-0_sp1.1 from training. Duration: 0.7454375 2023-10-02 09:53:56,794 WARNING [train.py:1204] (2/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_798-106179-0_sp1.1 from training. Duration: 0.7545625 2023-10-02 09:53:58,072 WARNING [train.py:1204] (2/4) Exclude cut with ID _14368_210_4220_1_1530532714870_3595529_143-117421-0_sp1.1 from training. Duration: 0.52725 2023-10-02 09:53:58,111 WARNING [train.py:1204] (2/4) Exclude cut with ID _67499_210_17282_1_1535509805220_3692430_361-78786-0_sp1.1 from training. Duration: 0.963625 2023-10-02 09:53:58,151 WARNING [train.py:1204] (2/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_712-342563-0_sp1.1 from training. Duration: 0.8818125 2023-10-02 09:53:59,434 WARNING [train.py:1204] (2/4) Exclude cut with ID _9235_210_5219_1_1528891154253_8046120_387-159008-0 from training. Duration: 0.86 2023-10-02 09:54:03,099 WARNING [train.py:1204] (2/4) Exclude cut with ID _33640_210_2655_1_1533117605479_4855469_959-18820-0_sp1.1 from training. Duration: 0.963625 2023-10-02 09:54:03,138 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_12868_210_6753_1_1530702479_3610564_133-564-0_sp0.9 from training. Duration: 0.87775 2023-10-02 09:54:03,236 WARNING [train.py:1204] (2/4) Exclude cut with ID _23514_210_1389_1_1531832139785_3031439_12-29287-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 09:54:08,735 WARNING [train.py:1204] (2/4) Exclude cut with ID _23514_210_1389_1_1531832139785_3031439_489-185052-0_sp1.1 from training. Duration: 0.963625 2023-10-02 09:54:09,133 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.balancer1.prob, batch_count=837986.6666666666, ans=0.125 2023-10-02 09:54:10,065 WARNING [train.py:1204] (2/4) Exclude cut with ID _5444_210_2151_1_1527418808464_5312210_634-194174-0 from training. Duration: 0.97 2023-10-02 09:54:10,083 WARNING [train.py:1204] (2/4) Exclude cut with ID _6275_210_2084_1_1528110062213_7669049_91-26919-0_sp1.1 from training. Duration: 0.8818125 2023-10-02 09:54:13,402 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_37351_210_7925_1_1533012931_3812113_516-82303-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 09:54:14,760 WARNING [train.py:1204] (2/4) Exclude cut with ID _41314_210_12651_1_1533459476543_3766460_80-74184-0_sp0.9 from training. Duration: 0.92225 2023-10-02 09:54:16,191 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_9460_210_4211_1_1528954587_10049667_495-202963-0_sp1.1 from training. Duration: 0.963625 2023-10-02 09:54:17,582 WARNING [train.py:1204] (2/4) Exclude cut with ID _14632_210_9988_1_1530691279392_3923200_332-105904-0_sp1.1 from training. Duration: 0.8181875 2023-10-02 09:54:18,909 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_40764_210_1925_1_1533882359_3752345_493-167886-0_sp1.1 from training. Duration: 0.92725 2023-10-02 09:54:20,197 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.588e+02 1.879e+02 2.062e+02 2.374e+02 3.315e+02, threshold=4.124e+02, percent-clipped=0.0 2023-10-02 09:54:20,322 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_196-289637-0 from training. Duration: 0.75 2023-10-02 09:54:21,592 WARNING [train.py:1204] (2/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_26-175553-0 from training. Duration: 0.61 2023-10-02 09:54:21,639 WARNING [train.py:1204] (2/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_890-5117-0 from training. Duration: 0.78 2023-10-02 09:54:21,688 WARNING [train.py:1204] (2/4) Exclude cut with ID _41314_210_12651_1_1533459476543_3766460_206-306860-0_sp1.1 from training. Duration: 0.92725 2023-10-02 09:54:23,207 WARNING [train.py:1204] (2/4) Exclude cut with ID _25882_210_3036_1_1532775665815_6479510_570-79824-0_sp1.1 from training. Duration: 0.963625 2023-10-02 09:54:23,276 WARNING [train.py:1204] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_731-2428-0_sp1.1 from training. Duration: 0.7545625 2023-10-02 09:54:23,297 WARNING [train.py:1204] (2/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_138-154812-0_sp0.9 from training. Duration: 0.8666875 2023-10-02 09:54:27,841 WARNING [train.py:1204] (2/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_178-39722-0_sp0.9 from training. Duration: 0.5444375 2023-10-02 09:54:29,026 WARNING [train.py:1204] (2/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_1285-256622-0_sp0.9 from training. Duration: 0.9333125 2023-10-02 09:54:33,815 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_41428_210_2161_1_1533774184_3992532_65-271323-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 09:54:33,897 WARNING [train.py:1204] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_308-161067-0 from training. Duration: 0.66 2023-10-02 09:54:35,246 WARNING [train.py:1204] (2/4) Exclude cut with ID _3067_210_2667_1_1526212974640_4503168_459-173867-0 from training. Duration: 0.88 2023-10-02 09:54:35,247 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_2221_210_1988_1_1525597051_4935541_415-296882-0_sp1.1 from training. Duration: 0.7 2023-10-02 09:54:37,986 WARNING [train.py:1204] (2/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_354-192266-0_sp1.1 from training. Duration: 0.936375 2023-10-02 09:54:39,294 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_258-310570-0_sp0.9 from training. Duration: 0.911125 2023-10-02 09:54:40,644 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_548-141039-0_sp1.1 from training. Duration: 0.963625 2023-10-02 09:54:42,100 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_15101_210_7927_1_1530786415_5033274_320-327527-0 from training. Duration: 0.96 2023-10-02 09:54:44,168 WARNING [train.py:1204] (2/4) Exclude cut with ID _2895_210_2477_1_1525956785794_4063750_289-11475-0_sp0.9 from training. Duration: 0.911125 2023-10-02 09:54:44,459 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.conv_module2.balancer2.prob, batch_count=838120.0, ans=0.125 2023-10-02 09:54:45,450 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7543_210_6753_1_1528543318_4552_1-85072-0_sp1.1 from training. Duration: 0.7545625 2023-10-02 09:54:45,654 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.feed_forward1.hidden_balancer.prob, batch_count=838120.0, ans=0.125 2023-10-02 09:54:46,760 WARNING [train.py:1204] (2/4) Exclude cut with ID _64639_210_5816_1_1535367619437_3528000_461-329156-0 from training. Duration: 0.96 2023-10-02 09:54:46,982 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.conv_module2.balancer2.prob, batch_count=838120.0, ans=0.125 2023-10-02 09:54:48,263 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_211-339622-0 from training. Duration: 0.45 2023-10-02 09:54:48,989 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.2.encoder.layers.0.nonlin_attention.whiten1, num_groups=1, num_channels=288, metric=8.12 vs. limit=10.0 2023-10-02 09:54:49,498 INFO [train.py:1046] (2/4) Epoch 24, batch 3550, loss[loss=0.1797, simple_loss=0.2429, pruned_loss=0.05821, over 23684.00 frames. ], tot_loss[loss=0.1703, simple_loss=0.2466, pruned_loss=0.04696, over 4708216.26 frames. ], batch size: 232, lr: 4.26e-03, grad_scale: 16.0 2023-10-02 09:54:50,873 WARNING [train.py:1204] (2/4) Exclude cut with ID _20455_210_5219_1_1531565996775_8098100_402-112739-0_sp1.1 from training. Duration: 0.963625 2023-10-02 09:54:50,965 WARNING [train.py:1204] (2/4) Exclude cut with ID _53489_210_6228_1_1534298357169_3835970_256-115565-0_sp1.1 from training. Duration: 0.936375 2023-10-02 09:54:50,980 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_26326_210_7925_1_1533002493_3592406_360-305260-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 09:54:52,315 WARNING [train.py:1204] (2/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_18-204383-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 09:54:53,829 WARNING [train.py:1204] (2/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_306-232480-0_sp1.1 from training. Duration: 0.87275 2023-10-02 09:55:04,840 WARNING [train.py:1204] (2/4) Exclude cut with ID _41161_210_20034_1_1533431249657_3555314_116-299186-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 09:55:06,305 WARNING [train.py:1204] (2/4) Exclude cut with ID _13913_210_9994_1_1530579615187_3383897_473-277682-0_sp1.1 from training. Duration: 0.3545625 2023-10-02 09:55:08,997 WARNING [train.py:1204] (2/4) Exclude cut with ID _58895_210_8750_1_1535184081320_3649089_49-225702-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 09:55:10,315 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_3080_210_2071_1_1526086102_4365166_312-8726-0_sp1.1 from training. Duration: 0.82725 2023-10-02 09:55:12,091 WARNING [train.py:1204] (2/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_489-190175-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 09:55:12,166 WARNING [train.py:1204] (2/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_345-95717-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 09:55:13,459 WARNING [train.py:1204] (2/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_85-138257-0_sp1.1 from training. Duration: 0.6454375 2023-10-02 09:55:14,949 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7046_210_6753_1_1527895337_6920917_783-278109-0_sp1.1 from training. Duration: 0.7 2023-10-02 09:55:14,982 WARNING [train.py:1204] (2/4) Exclude cut with ID _3465_210_3525_1_1526175667060_3574049_450-203637-0_sp1.1 from training. Duration: 0.836375 2023-10-02 09:55:16,355 WARNING [train.py:1204] (2/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_416-132893-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 09:55:16,382 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_2655_210_2087_1_1525688959_3897400_437-337690-0_sp0.9 from training. Duration: 0.7 2023-10-02 09:55:16,449 WARNING [train.py:1204] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_700-183549-0_sp1.1 from training. Duration: 0.6181875 2023-10-02 09:55:21,970 WARNING [train.py:1204] (2/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_191-59784-0_sp1.1 from training. Duration: 0.8 2023-10-02 09:55:21,982 WARNING [train.py:1204] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_218-295097-0_sp1.1 from training. Duration: 0.7 2023-10-02 09:55:22,093 WARNING [train.py:1204] (2/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_310-109461-0_sp1.1 from training. Duration: 0.72725 2023-10-02 09:55:22,097 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_33348_210_5632_1_1533086912_3324030_131-321149-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 09:55:23,289 WARNING [train.py:1204] (2/4) Exclude cut with ID _64135_210_2253_1_1535677267876_7608972_133-188266-0_sp1.1 from training. Duration: 0.736375 2023-10-02 09:55:24,532 WARNING [train.py:1204] (2/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_505-205270-0 from training. Duration: 0.38 2023-10-02 09:55:24,543 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_67223_210_4238_1_1535612120_2537431_102-88248-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 09:55:26,596 WARNING [train.py:1204] (2/4) Exclude cut with ID _54871_210_22356_1_1534665798253_3856359_60-235433-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 09:55:26,674 WARNING [train.py:1204] (2/4) Exclude cut with ID _5189_210_4557_1_1527248319428_3769229_199-53526-0_sp1.1 from training. Duration: 0.3818125 2023-10-02 09:55:30,971 WARNING [train.py:1204] (2/4) Exclude cut with ID _45329_210_7005_1_1534157951211_6774339_439-90979-0_sp1.1 from training. Duration: 0.963625 2023-10-02 09:55:32,760 WARNING [train.py:1204] (2/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_1162-132880-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 09:55:32,839 WARNING [train.py:1204] (2/4) Exclude cut with ID _56423_210_5854_1_1534896061049_3165969_220-22317-0_sp1.1 from training. Duration: 0.963625 2023-10-02 09:55:35,933 WARNING [train.py:1204] (2/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_909-64151-0 from training. Duration: 0.94 2023-10-02 09:55:37,385 WARNING [train.py:1204] (2/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_465-156500-0_sp0.9 from training. Duration: 0.8 2023-10-02 09:55:37,472 WARNING [train.py:1204] (2/4) Exclude cut with ID _44304_210_15374_1_1533639352765_4613129_302-22084-0 from training. Duration: 0.86 2023-10-02 09:55:39,342 WARNING [train.py:1204] (2/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_663-166973-0_sp1.1 from training. Duration: 0.72725 2023-10-02 09:55:42,007 WARNING [train.py:1204] (2/4) Exclude cut with ID _45329_210_7005_1_1534157951211_6774339_503-287883-0_sp0.9 from training. Duration: 0.97775 2023-10-02 09:55:43,269 WARNING [train.py:1204] (2/4) Exclude cut with ID _60057_210_10640_1_1534903065793_3333150_362-230377-0_sp0.9 from training. Duration: 0.9666875 2023-10-02 09:55:44,816 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_4183_210_2087_1_1526729100_1869838_143-280962-0 from training. Duration: 0.56 2023-10-02 09:55:46,185 WARNING [train.py:1204] (2/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_281-1313-0_sp1.1 from training. Duration: 0.936375 2023-10-02 09:55:49,154 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.feed_forward3.hidden_balancer.prob, batch_count=838453.3333333334, ans=0.125 2023-10-02 09:55:51,143 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.3.encoder.layers.1.nonlin_attention.whiten1, num_groups=1, num_channels=384, metric=5.55 vs. limit=10.0 2023-10-02 09:55:51,784 WARNING [train.py:1204] (2/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_64-215118-0_sp1.1 from training. Duration: 0.936375 2023-10-02 09:55:53,212 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_2822_210_2715_1_1526112586_1999587_165-36656-0 from training. Duration: 0.89 2023-10-02 09:55:54,531 WARNING [train.py:1204] (2/4) Exclude cut with ID _22961_210_8254_1_1531976366672_4278180_402-208830-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 09:55:56,272 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=838453.3333333334, ans=0.1 2023-10-02 09:55:57,937 WARNING [train.py:1204] (2/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_98-193494-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 09:55:59,344 WARNING [train.py:1204] (2/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_383-285196-0 from training. Duration: 0.97 2023-10-02 09:56:04,248 INFO [train.py:1046] (2/4) Epoch 24, batch 3600, loss[loss=0.1594, simple_loss=0.2507, pruned_loss=0.03403, over 24408.00 frames. ], tot_loss[loss=0.1699, simple_loss=0.2465, pruned_loss=0.04662, over 4706253.19 frames. ], batch size: 69, lr: 4.26e-03, grad_scale: 32.0 2023-10-02 09:56:04,337 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6767_210_3273_1_1527855783_1718932_142-127132-0 from training. Duration: 0.95 2023-10-02 09:56:04,374 WARNING [train.py:1204] (2/4) Exclude cut with ID _39867_210_9826_1_1533553222030_9344259_1018-339635-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 09:56:05,676 WARNING [train.py:1204] (2/4) Exclude cut with ID _14603_210_4220_1_1530615688441_3097319_274-274798-0_sp1.1 from training. Duration: 0.8909375 2023-10-02 09:56:05,777 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder_embed.convnext.out_balancer.prob, batch_count=838520.0, ans=0.125 2023-10-02 09:56:07,524 WARNING [train.py:1204] (2/4) Exclude cut with ID _2334_210_2667_1_1525608238880_4705129_376-263393-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 09:56:07,567 WARNING [train.py:1204] (2/4) Exclude cut with ID _29303_210_2575_1_1532392237303_7410649_363-344675-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 09:56:07,744 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.self_attn_weights.pos_emb_skip_rate, batch_count=838520.0, ans=0.0 2023-10-02 09:56:09,026 WARNING [train.py:1204] (2/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_58-68956-0_sp1.1 from training. Duration: 0.7545625 2023-10-02 09:56:13,640 WARNING [train.py:1204] (2/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_709-311571-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 09:56:16,286 WARNING [train.py:1204] (2/4) Exclude cut with ID _46067_210_4220_1_1534125571066_3307450_136-257407-0_sp1.1 from training. Duration: 0.963625 2023-10-02 09:56:16,369 WARNING [train.py:1204] (2/4) Exclude cut with ID _6493_210_1747_1_1528459210424_4012470_324-323854-0_sp1.1 from training. Duration: 0.836375 2023-10-02 09:56:17,665 WARNING [train.py:1204] (2/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_234-260682-0_sp1.1 from training. Duration: 0.763625 2023-10-02 09:56:17,729 WARNING [train.py:1204] (2/4) Exclude cut with ID _36634_210_1794_1_1533296352450_1399884_55-119451-0_sp1.1 from training. Duration: 0.963625 2023-10-02 09:56:17,732 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_40047_210_2290_1_1533290519_2374311_195-165693-0 from training. Duration: 0.89 2023-10-02 09:56:20,727 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_5522_210_3273_1_1527249606_3909096_366-172508-0_sp0.9 from training. Duration: 0.7333125 2023-10-02 09:56:22,090 WARNING [train.py:1204] (2/4) Exclude cut with ID _21878_210_3525_1_1531738772002_7264330_187-10504-0_sp1.1 from training. Duration: 0.963625 2023-10-02 09:56:24,790 WARNING [train.py:1204] (2/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_282-257791-0_sp1.1 from training. Duration: 0.7818125 2023-10-02 09:56:26,281 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_43831_210_2158_1_1533725502_5872791_467-288776-0_sp1.1 from training. Duration: 0.92725 2023-10-02 09:56:27,713 WARNING [train.py:1204] (2/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_138-154812-0_sp1.1 from training. Duration: 0.7090625 2023-10-02 09:56:29,594 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_1154-185930-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 09:56:29,612 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_2655_210_2087_1_1525688959_3897400_52-279899-0 from training. Duration: 0.69 2023-10-02 09:56:29,683 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_141-89113-0_sp1.1 from training. Duration: 0.7818125 2023-10-02 09:56:32,351 WARNING [train.py:1204] (2/4) Exclude cut with ID _55134_210_21982_1_1534590028377_3882170_297-47310-0_sp1.1 from training. Duration: 0.963625 2023-10-02 09:56:34,270 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_486-151131-0_sp1.1 from training. Duration: 0.77275 2023-10-02 09:56:36,187 WARNING [train.py:1204] (2/4) Exclude cut with ID _44778_210_2054_1_1533798127203_7443090_21-232980-0_sp1.1 from training. Duration: 0.936375 2023-10-02 09:56:39,191 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6165_210_3528_1_1527590982_4379841_338-106380-0_sp1.1 from training. Duration: 0.92725 2023-10-02 09:56:39,265 WARNING [train.py:1204] (2/4) Exclude cut with ID _55162_210_17071_1_1534408248583_3753480_355-257218-0_sp1.1 from training. Duration: 0.97275 2023-10-02 09:56:40,630 WARNING [train.py:1204] (2/4) Exclude cut with ID _2895_210_2477_1_1525956785794_4063750_289-11475-0 from training. Duration: 0.82 2023-10-02 09:56:46,560 WARNING [train.py:1204] (2/4) Exclude cut with ID _16756_210_4915_1_1531047803763_7104259_839-183431-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 09:56:47,806 WARNING [train.py:1204] (2/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_427-208901-0_sp0.9 from training. Duration: 0.7555625 2023-10-02 09:56:49,064 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.434e+02 1.867e+02 2.105e+02 2.448e+02 3.419e+02, threshold=4.210e+02, percent-clipped=0.0 2023-10-02 09:56:49,134 WARNING [train.py:1204] (2/4) Exclude cut with ID _36512_210_6987_1_1533033143998_3126069_181-306500-0 from training. Duration: 0.95 2023-10-02 09:56:53,293 WARNING [train.py:1204] (2/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_28-70987-0_sp1.1 from training. Duration: 0.7909375 2023-10-02 09:56:57,500 WARNING [train.py:1204] (2/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_590-58996-0_sp1.1 from training. Duration: 0.936375 2023-10-02 09:57:00,665 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_48730_210_5399_1_1533885886_2212769_108-47561-0_sp1.1 from training. Duration: 0.936375 2023-10-02 09:57:05,350 WARNING [train.py:1204] (2/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_401-158239-0_sp0.9 from training. Duration: 0.788875 2023-10-02 09:57:05,370 WARNING [train.py:1204] (2/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_278-154492-0_sp1.1 from training. Duration: 0.6909375 2023-10-02 09:57:05,376 WARNING [train.py:1204] (2/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_137-116442-0 from training. Duration: 0.78 2023-10-02 09:57:07,111 WARNING [train.py:1204] (2/4) Exclude cut with ID _3541_210_2477_1_1526475408709_3263973_189-317158-0 from training. Duration: 0.9 2023-10-02 09:57:08,473 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_47896_210_20508_1_1534492518_5958868_558-314572-0 from training. Duration: 0.97 2023-10-02 09:57:09,983 WARNING [train.py:1204] (2/4) Exclude cut with ID _66070_210_7640_1_1535441393096_7805310_290-40284-0_sp1.1 from training. Duration: 0.97275 2023-10-02 09:57:10,015 WARNING [train.py:1204] (2/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_110-224688-0_sp1.1 from training. Duration: 0.8818125 2023-10-02 09:57:11,755 WARNING [train.py:1204] (2/4) Exclude cut with ID _7719_210_3525_1_1528456153163_4296960_88-250259-0 from training. Duration: 0.84 2023-10-02 09:57:13,040 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8453_210_4211_1_1528502940_8562730_1157-114102-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 09:57:13,081 WARNING [train.py:1204] (2/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_317-175062-0_sp1.1 from training. Duration: 0.7454375 2023-10-02 09:57:13,086 WARNING [train.py:1204] (2/4) Exclude cut with ID _29857_210_20034_1_1532395886753_3798160_254-347254-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 09:57:14,447 WARNING [train.py:1204] (2/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_28-70987-0 from training. Duration: 0.87 2023-10-02 09:57:15,805 WARNING [train.py:1204] (2/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_321-335475-0 from training. Duration: 0.45 2023-10-02 09:57:16,054 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.1.bypass.skip_rate, batch_count=838786.6666666666, ans=0.035 2023-10-02 09:57:17,305 WARNING [train.py:1204] (2/4) Exclude cut with ID _40901_210_5665_1_1533363789958_3856130_356-107330-0_sp1.1 from training. Duration: 0.936375 2023-10-02 09:57:18,583 INFO [train.py:1046] (2/4) Epoch 24, batch 3650, loss[loss=0.1942, simple_loss=0.274, pruned_loss=0.05719, over 23945.00 frames. ], tot_loss[loss=0.1706, simple_loss=0.2479, pruned_loss=0.04668, over 4721348.86 frames. ], batch size: 86, lr: 4.26e-03, grad_scale: 32.0 2023-10-02 09:57:18,654 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_3780_210_2087_1_1526467389_2145572_176-107140-0 from training. Duration: 0.69 2023-10-02 09:57:22,858 WARNING [train.py:1204] (2/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_80-135200-0 from training. Duration: 0.79 2023-10-02 09:57:23,607 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.self_attn1.whiten.whitening_limit, batch_count=838853.3333333334, ans=22.5 2023-10-02 09:57:24,337 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_33348_210_5632_1_1533086912_3324030_85-28583-0_sp0.9 from training. Duration: 0.911125 2023-10-02 09:57:27,069 WARNING [train.py:1204] (2/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_29-293896-0 from training. Duration: 0.61 2023-10-02 09:57:30,338 WARNING [train.py:1204] (2/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_194-252800-0 from training. Duration: 0.97 2023-10-02 09:57:33,568 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_360-52446-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 09:57:33,570 WARNING [train.py:1204] (2/4) Exclude cut with ID _9389_210_1664_1_1529217337301_5289269_289-213417-0_sp0.9 from training. Duration: 0.988875 2023-10-02 09:57:34,867 WARNING [train.py:1204] (2/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_143-86344-0_sp0.9 from training. Duration: 0.8555625 2023-10-02 09:57:36,858 WARNING [train.py:1204] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_520-126729-0_sp0.9 from training. Duration: 0.77775 2023-10-02 09:57:36,886 WARNING [train.py:1204] (2/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_76-308663-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 09:57:36,940 WARNING [train.py:1204] (2/4) Exclude cut with ID _8021_210_7994_1_1528376451358_3653069_100-140231-0 from training. Duration: 0.67 2023-10-02 09:57:36,998 WARNING [train.py:1204] (2/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_387-131538-0_sp0.9 from training. Duration: 0.9 2023-10-02 09:57:38,335 WARNING [train.py:1204] (2/4) Exclude cut with ID _6275_210_2084_1_1528110062213_7669049_365-182054-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 09:57:38,376 WARNING [train.py:1204] (2/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_345-340690-0 from training. Duration: 0.93 2023-10-02 09:57:40,187 WARNING [train.py:1204] (2/4) Exclude cut with ID _5409_210_1388_1_1527292850189_5865191_178-127970-0_sp0.9 from training. Duration: 0.6333125 2023-10-02 09:57:40,242 WARNING [train.py:1204] (2/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_352-12734-0_sp1.1 from training. Duration: 0.92725 2023-10-02 09:57:40,245 WARNING [train.py:1204] (2/4) Exclude cut with ID _65134_210_14763_1_1535335393985_3505368_121-129373-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 09:57:42,945 WARNING [train.py:1204] (2/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_557-324258-0_sp1.1 from training. Duration: 0.736375 2023-10-02 09:57:45,681 WARNING [train.py:1204] (2/4) Exclude cut with ID _48678_210_5663_1_1533988830947_3769199_306-255886-0 from training. Duration: 0.77 2023-10-02 09:57:45,761 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7544_210_6753_1_1528587000_691468_87-178599-0 from training. Duration: 0.87 2023-10-02 09:57:45,811 WARNING [train.py:1204] (2/4) Exclude cut with ID _2941_210_2577_1_1526199673150_2489759_208-236402-0_sp1.1 from training. Duration: 0.963625 2023-10-02 09:57:48,562 WARNING [train.py:1204] (2/4) Exclude cut with ID _38670_210_15379_1_1533448810659_4391059_65-169397-0 from training. Duration: 0.97 2023-10-02 09:57:50,070 WARNING [train.py:1204] (2/4) Exclude cut with ID _30304_210_6319_1_1532476827261_7582060_306-328175-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 09:57:50,081 WARNING [train.py:1204] (2/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_212-149947-0_sp1.1 from training. Duration: 0.663625 2023-10-02 09:57:51,986 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.4.encoder.layers.2.self_attn_weights.whiten_keys, num_groups=4, num_channels=128, metric=2.16 vs. limit=6.0 2023-10-02 09:57:54,179 WARNING [train.py:1204] (2/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_321-22821-0_sp1.1 from training. Duration: 0.8090625 2023-10-02 09:57:55,590 WARNING [train.py:1204] (2/4) Exclude cut with ID _44010_210_4525_1_1533884262964_4013620_483-69110-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 09:57:56,843 WARNING [train.py:1204] (2/4) Exclude cut with ID _14922_210_6228_1_1530707435273_3693310_38-187851-0_sp1.1 from training. Duration: 0.563625 2023-10-02 09:57:56,929 WARNING [train.py:1204] (2/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_129-337802-0_sp0.9 from training. Duration: 0.8 2023-10-02 09:57:58,818 WARNING [train.py:1204] (2/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_268-87909-0_sp1.1 from training. Duration: 0.9 2023-10-02 09:58:00,837 WARNING [train.py:1204] (2/4) Exclude cut with ID _21878_210_3525_1_1531738772002_7264330_559-184862-0_sp1.1 from training. Duration: 0.8545625 2023-10-02 09:58:01,072 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.ff3_skip_rate, batch_count=838986.6666666666, ans=0.0 2023-10-02 09:58:04,764 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_46032_210_14656_1_1533781412_4507969_237-120766-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 09:58:05,465 WARNING [train.py:1204] (2/4) Exclude cut with ID _28427_210_4220_1_1532581240967_3061069_330-170906-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 09:58:05,470 WARNING [train.py:1204] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_401-51578-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 09:58:08,475 WARNING [train.py:1204] (2/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_209-205365-0_sp1.1 from training. Duration: 0.7181875 2023-10-02 09:58:09,826 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_23801_210_16223_1_1532066392_3294657_57-242170-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 09:58:09,885 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_127-37371-0_sp1.1 from training. Duration: 0.936375 2023-10-02 09:58:14,324 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.nonlin_attention.balancer.min_positive, batch_count=839053.3333333334, ans=0.05 2023-10-02 09:58:16,836 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9171W0019-578905 from training. Duration: 0.896 2023-10-02 09:58:19,568 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_24605_210_1794_1_1532047863_4149755_349-49056-0_sp1.1 from training. Duration: 0.92725 2023-10-02 09:58:19,579 WARNING [train.py:1204] (2/4) Exclude cut with ID _12302_210_5219_1_1529924446466_8107280_897-21237-0_sp1.1 from training. Duration: 0.936375 2023-10-02 09:58:20,957 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_9460_210_4211_1_1528954587_10049667_480-260269-0_sp0.9 from training. Duration: 0.62225 2023-10-02 09:58:22,364 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_348-152548-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 09:58:23,680 WARNING [train.py:1204] (2/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_301-330455-0_sp0.9 from training. Duration: 0.888875 2023-10-02 09:58:25,038 WARNING [train.py:1204] (2/4) Exclude cut with ID _61008_210_10633_1_1534852769198_6974370_345-91705-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 09:58:25,155 WARNING [train.py:1204] (2/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_304-273433-0 from training. Duration: 0.99 2023-10-02 09:58:25,158 WARNING [train.py:1204] (2/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_359-345972-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 09:58:29,143 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_2296_210_2087_1_1525503448_3397335_57-185430-0_sp1.1 from training. Duration: 0.5454375 2023-10-02 09:58:30,557 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_1256-120974-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 09:58:31,835 INFO [train.py:1046] (2/4) Epoch 24, batch 3700, loss[loss=0.172, simple_loss=0.2424, pruned_loss=0.05084, over 23690.00 frames. ], tot_loss[loss=0.1713, simple_loss=0.2484, pruned_loss=0.04703, over 4710255.43 frames. ], batch size: 149, lr: 4.26e-03, grad_scale: 32.0 2023-10-02 09:58:33,322 WARNING [train.py:1204] (2/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_372-30302-0_sp0.9 from training. Duration: 0.9555625 2023-10-02 09:58:35,250 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_42012_210_7927_1_1533439936_3871556_451-344262-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 09:58:35,251 WARNING [train.py:1204] (2/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_28-324729-0 from training. Duration: 0.94 2023-10-02 09:58:35,261 WARNING [train.py:1204] (2/4) Exclude cut with ID _64135_210_2253_1_1535677267876_7608972_313-18394-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 09:58:36,563 WARNING [train.py:1204] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_620-276793-0_sp0.9 from training. Duration: 0.6555625 2023-10-02 09:58:36,603 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7544_210_6753_1_1528591327_1471426_172-348777-0_sp0.9 from training. Duration: 0.7333125 2023-10-02 09:58:41,296 WARNING [train.py:1204] (2/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_332-221057-0_sp0.9 from training. Duration: 0.7555625 2023-10-02 09:58:44,092 WARNING [train.py:1204] (2/4) Exclude cut with ID _19023_210_4353_1_1531378892552_5820780_5-300035-0_sp1.1 from training. Duration: 0.92725 2023-10-02 09:58:44,314 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.attention_skip_rate, batch_count=839186.6666666666, ans=0.0 2023-10-02 09:58:45,434 WARNING [train.py:1204] (2/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_343-309023-0_sp1.1 from training. Duration: 0.963625 2023-10-02 09:58:45,518 WARNING [train.py:1204] (2/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_37-41919-0_sp1.1 from training. Duration: 0.7909375 2023-10-02 09:58:45,550 WARNING [train.py:1204] (2/4) Exclude cut with ID _3541_210_2477_1_1526475408709_3263973_41-182910-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 09:58:46,217 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.2.encoder.layers.2.feed_forward1.out_whiten, num_groups=1, num_channels=384, metric=8.25 vs. limit=15.0 2023-10-02 09:58:46,929 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_229-45300-0_sp0.9 from training. Duration: 0.6333125 2023-10-02 09:58:48,354 WARNING [train.py:1204] (2/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_40-214716-0_sp1.1 from training. Duration: 0.963625 2023-10-02 09:58:49,748 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9309W0203-640330 from training. Duration: 0.896 2023-10-02 09:58:56,784 WARNING [train.py:1204] (2/4) Exclude cut with ID _60229_210_27767_1_1534838406158_3655350_264-279573-0_sp1.1 from training. Duration: 0.7545625 2023-10-02 09:58:56,831 WARNING [train.py:1204] (2/4) Exclude cut with ID _8394_210_6702_1_1528590504316_3540189_372-198878-0_sp0.9 from training. Duration: 0.6666875 2023-10-02 09:58:57,087 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.bypass_mid.scale_min, batch_count=839253.3333333334, ans=0.2 2023-10-02 09:58:58,202 WARNING [train.py:1204] (2/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_359-118926-0_sp1.1 from training. Duration: 0.6545625 2023-10-02 09:58:58,214 WARNING [train.py:1204] (2/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_85-65056-0 from training. Duration: 0.96 2023-10-02 09:58:58,239 WARNING [train.py:1204] (2/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_89-242335-0_sp1.1 from training. Duration: 0.863625 2023-10-02 09:58:58,399 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=839253.3333333334, ans=0.1 2023-10-02 09:59:02,276 WARNING [train.py:1204] (2/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_128-49444-0_sp1.1 from training. Duration: 0.936375 2023-10-02 09:59:03,614 WARNING [train.py:1204] (2/4) Exclude cut with ID _30327_210_16049_1_1532484161295_3924169_401-70819-0 from training. Duration: 0.86 2023-10-02 09:59:05,501 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_41822_210_13270_1_1533955976_4379284_260-256391-0_sp1.1 from training. Duration: 0.936375 2023-10-02 09:59:06,966 WARNING [train.py:1204] (2/4) Exclude cut with ID _67335_210_5219_1_1535525975652_4275229_599-247317-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 09:59:10,899 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.2.encoder.layers.2.self_attn2.whiten, num_groups=1, num_channels=384, metric=26.05 vs. limit=22.5 2023-10-02 09:59:11,318 WARNING [train.py:1204] (2/4) Exclude cut with ID _29857_210_20034_1_1532395886753_3798160_209-239267-0_sp1.1 from training. Duration: 0.936375 2023-10-02 09:59:11,345 WARNING [train.py:1204] (2/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_46-207599-0_sp1.1 from training. Duration: 0.5818125 2023-10-02 09:59:13,443 WARNING [train.py:1204] (2/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_912-282412-0_sp1.1 from training. Duration: 0.4454375 2023-10-02 09:59:15,692 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.2.encoder.layers.0.self_attn1.whiten, num_groups=1, num_channels=384, metric=22.39 vs. limit=22.5 2023-10-02 09:59:16,260 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_4183_210_2087_1_1526729100_1869838_141-166600-0_sp1.1 from training. Duration: 0.863625 2023-10-02 09:59:16,264 WARNING [train.py:1204] (2/4) Exclude cut with ID _15037_210_2253_1_1530752294104_3267650_296-192597-0 from training. Duration: 0.81 2023-10-02 09:59:17,439 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.524e+02 1.801e+02 2.013e+02 2.164e+02 3.512e+02, threshold=4.027e+02, percent-clipped=0.0 2023-10-02 09:59:17,565 WARNING [train.py:1204] (2/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_571-220562-0_sp1.1 from training. Duration: 0.963625 2023-10-02 09:59:18,786 WARNING [train.py:1204] (2/4) Exclude cut with ID _5287_210_2084_1_1527418883040_3878670_12-221186-0 from training. Duration: 0.98 2023-10-02 09:59:23,007 WARNING [train.py:1204] (2/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_149-85602-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 09:59:23,028 WARNING [train.py:1204] (2/4) Exclude cut with ID _9235_210_5219_1_1528891154253_8046120_1003-101638-0_sp1.1 from training. Duration: 0.663625 2023-10-02 09:59:25,796 WARNING [train.py:1204] (2/4) Exclude cut with ID _55844_210_2346_1_1534471212569_7625707_1099-33689-0_sp1.1 from training. Duration: 0.92725 2023-10-02 09:59:25,858 WARNING [train.py:1204] (2/4) Exclude cut with ID _7606_210_4915_1_1528894253277_5080690_301-10732-0 from training. Duration: 0.81 2023-10-02 09:59:27,151 WARNING [train.py:1204] (2/4) Exclude cut with ID _22826_210_9994_1_1531908201522_3279169_403-34698-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 09:59:28,431 WARNING [train.py:1204] (2/4) Exclude cut with ID _8480_210_1874_1_1528589391205_6728330_755-112625-0_sp0.9 from training. Duration: 0.82225 2023-10-02 09:59:28,443 WARNING [train.py:1204] (2/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_527-238990-0_sp1.1 from training. Duration: 0.8181875 2023-10-02 09:59:28,471 WARNING [train.py:1204] (2/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_426-7328-0_sp1.1 from training. Duration: 0.92725 2023-10-02 09:59:31,369 WARNING [train.py:1204] (2/4) Exclude cut with ID _3541_210_2477_1_1526475408709_3263973_189-317158-0_sp1.1 from training. Duration: 0.8181875 2023-10-02 09:59:32,699 WARNING [train.py:1204] (2/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_162-103620-0 from training. Duration: 0.93 2023-10-02 09:59:32,806 WARNING [train.py:1204] (2/4) Exclude cut with ID _22083_210_8254_1_1531702862439_3678970_49-234546-0 from training. Duration: 0.83 2023-10-02 09:59:34,477 WARNING [train.py:1204] (2/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_370-331600-0_sp1.1 from training. Duration: 0.8545625 2023-10-02 09:59:34,489 WARNING [train.py:1204] (2/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_166-59042-0_sp1.1 from training. Duration: 0.97275 2023-10-02 09:59:35,911 WARNING [train.py:1204] (2/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_138-332773-0_sp1.1 from training. Duration: 0.736375 2023-10-02 09:59:37,750 WARNING [train.py:1204] (2/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_90-30673-0_sp0.9 from training. Duration: 0.9333125 2023-10-02 09:59:40,483 WARNING [train.py:1204] (2/4) Exclude cut with ID _27967_210_12140_1_1532588391859_8419130_737-41898-0_sp1.1 from training. Duration: 0.936375 2023-10-02 09:59:40,597 WARNING [train.py:1204] (2/4) Exclude cut with ID _8833_210_5219_1_1528592665210_7553059_329-125156-0_sp1.1 from training. Duration: 0.7454375 2023-10-02 09:59:42,508 WARNING [train.py:1204] (2/4) Exclude cut with ID _41817_210_18835_1_1533434477160_7423049_352-42537-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 09:59:45,147 INFO [train.py:1046] (2/4) Epoch 24, batch 3750, loss[loss=0.1714, simple_loss=0.252, pruned_loss=0.0454, over 23367.00 frames. ], tot_loss[loss=0.1718, simple_loss=0.2493, pruned_loss=0.04716, over 4715190.33 frames. ], batch size: 119, lr: 4.26e-03, grad_scale: 16.0 2023-10-02 09:59:45,231 WARNING [train.py:1204] (2/4) Exclude cut with ID _8365_210_2064_1_1528592311070_4677690_431-294102-0 from training. Duration: 0.89 2023-10-02 09:59:46,507 WARNING [train.py:1204] (2/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_454-314901-0_sp1.1 from training. Duration: 0.4181875 2023-10-02 09:59:47,880 WARNING [train.py:1204] (2/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_80-135200-0_sp0.9 from training. Duration: 0.87775 2023-10-02 09:59:49,347 WARNING [train.py:1204] (2/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_181-78632-0 from training. Duration: 0.78 2023-10-02 09:59:50,747 WARNING [train.py:1204] (2/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_300-27235-0_sp0.9 from training. Duration: 0.9666875 2023-10-02 09:59:50,817 WARNING [train.py:1204] (2/4) Exclude cut with ID _47790_210_16187_1_1533862814066_4207519_96-177176-0_sp1.1 from training. Duration: 0.97275 2023-10-02 09:59:52,200 WARNING [train.py:1204] (2/4) Exclude cut with ID _35941_210_15379_1_1532995294515_4023089_246-348742-0_sp1.1 from training. Duration: 0.97275 2023-10-02 09:59:52,277 WARNING [train.py:1204] (2/4) Exclude cut with ID _38670_210_15379_1_1533448810659_4391059_65-169397-0_sp1.1 from training. Duration: 0.8818125 2023-10-02 09:59:56,404 WARNING [train.py:1204] (2/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_1003-99642-0_sp1.1 from training. Duration: 0.963625 2023-10-02 10:00:00,353 WARNING [train.py:1204] (2/4) Exclude cut with ID _40402_210_18219_1_1533522324115_2953150_320-14078-0_sp1.1 from training. Duration: 0.863625 2023-10-02 10:00:00,448 WARNING [train.py:1204] (2/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_367-61330-0_sp0.9 from training. Duration: 0.8333125 2023-10-02 10:00:01,834 WARNING [train.py:1204] (2/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_515-298514-0_sp1.1 from training. Duration: 0.92725 2023-10-02 10:00:04,999 WARNING [train.py:1204] (2/4) Exclude cut with ID _53731_210_7994_1_1534553979244_3757214_471-320815-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 10:00:06,687 WARNING [train.py:1204] (2/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_306-232480-0 from training. Duration: 0.96 2023-10-02 10:00:07,970 WARNING [train.py:1204] (2/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_478-162247-0_sp0.9 from training. Duration: 0.911125 2023-10-02 10:00:09,361 WARNING [train.py:1204] (2/4) Exclude cut with ID _56423_210_5854_1_1534896061049_3165969_345-284696-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 10:00:09,388 WARNING [train.py:1204] (2/4) Exclude cut with ID _44778_210_2054_1_1533798127203_7443090_600-122253-0_sp1.1 from training. Duration: 0.963625 2023-10-02 10:00:09,638 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=839586.6666666666, ans=0.1 2023-10-02 10:00:14,543 WARNING [train.py:1204] (2/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_199-295611-0 from training. Duration: 0.55 2023-10-02 10:00:16,530 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.4.encoder.layers.1.nonlin_attention.whiten1, num_groups=1, num_channels=288, metric=3.82 vs. limit=10.0 2023-10-02 10:00:17,244 WARNING [train.py:1204] (2/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_734-129761-0 from training. Duration: 0.82 2023-10-02 10:00:18,804 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.balancer2.prob, batch_count=839653.3333333334, ans=0.125 2023-10-02 10:00:19,948 WARNING [train.py:1204] (2/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_349-169257-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 10:00:19,983 WARNING [train.py:1204] (2/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_202-67017-0_sp0.9 from training. Duration: 0.911125 2023-10-02 10:00:22,706 WARNING [train.py:1204] (2/4) Exclude cut with ID _16908_210_5724_1_1531119157248_3772700_61-258978-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 10:00:25,594 WARNING [train.py:1204] (2/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_172-156931-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 10:00:25,688 WARNING [train.py:1204] (2/4) Exclude cut with ID _4092_210_2151_1_1526643044449_3135759_356-278694-0_sp1.1 from training. Duration: 0.42725 2023-10-02 10:00:29,768 WARNING [train.py:1204] (2/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_116-332008-0 from training. Duration: 0.98 2023-10-02 10:00:31,207 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.balancer1.prob, batch_count=839720.0, ans=0.125 2023-10-02 10:00:33,800 WARNING [train.py:1204] (2/4) Exclude cut with ID _29003_210_19596_1_1532507391720_3894098_358-114789-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 10:00:37,079 WARNING [train.py:1204] (2/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_152-212533-0_sp1.1 from training. Duration: 0.8818125 2023-10-02 10:00:37,157 WARNING [train.py:1204] (2/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_267-214116-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 10:00:41,691 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7543_210_6753_1_1528541907_1399322_165-323619-0_sp0.9 from training. Duration: 0.7666875 2023-10-02 10:00:45,108 WARNING [train.py:1204] (2/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_577-332600-0_sp0.9 from training. Duration: 0.6333125 2023-10-02 10:00:47,756 WARNING [train.py:1204] (2/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_342-100157-0_sp0.9 from training. Duration: 0.6 2023-10-02 10:00:49,277 WARNING [train.py:1204] (2/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_141-89727-0_sp1.1 from training. Duration: 0.6454375 2023-10-02 10:00:50,769 WARNING [train.py:1204] (2/4) Exclude cut with ID _3992_210_1747_1_1526644816031_3731060_107-251434-0_sp1.1 from training. Duration: 0.9 2023-10-02 10:00:53,509 WARNING [train.py:1204] (2/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_401-53423-0_sp1.1 from training. Duration: 0.5 2023-10-02 10:00:57,801 INFO [train.py:1046] (2/4) Epoch 24, batch 3800, loss[loss=0.1772, simple_loss=0.2677, pruned_loss=0.04334, over 24653.00 frames. ], tot_loss[loss=0.1714, simple_loss=0.2487, pruned_loss=0.04708, over 4707107.67 frames. ], batch size: 73, lr: 4.25e-03, grad_scale: 16.0 2023-10-02 10:01:01,911 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7762_210_1794_1_1528199508_4057408_422-227034-0_sp1.1 from training. Duration: 0.663625 2023-10-02 10:01:03,528 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.balancer1.prob, batch_count=839853.3333333334, ans=0.125 2023-10-02 10:01:04,739 WARNING [train.py:1204] (2/4) Exclude cut with ID _4184_210_2550_1_1526730192013_2219380_102-250299-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 10:01:06,079 WARNING [train.py:1204] (2/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_253-308377-0_sp0.9 from training. Duration: 0.5444375 2023-10-02 10:01:07,367 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_271-297505-0 from training. Duration: 0.99 2023-10-02 10:01:08,783 WARNING [train.py:1204] (2/4) Exclude cut with ID _35941_210_15379_1_1532995294515_4023089_213-142973-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 10:01:10,696 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7544_210_6753_1_1528587722_3576686_162-63018-0_sp1.1 from training. Duration: 0.92725 2023-10-02 10:01:12,678 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_365-217119-0_sp0.9 from training. Duration: 0.7 2023-10-02 10:01:14,194 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.balancer1.prob, batch_count=839920.0, ans=0.125 2023-10-02 10:01:15,350 WARNING [train.py:1204] (2/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_662-275621-0_sp0.9 from training. Duration: 0.4666875 2023-10-02 10:01:15,352 WARNING [train.py:1204] (2/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_374-187555-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 10:01:16,068 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_289-180904-0_sp1.1 from training. Duration: 0.6909375 2023-10-02 10:01:17,430 WARNING [train.py:1204] (2/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_189-47206-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 10:01:18,871 WARNING [train.py:1204] (2/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_234-14174-0_sp1.1 from training. Duration: 0.7454375 2023-10-02 10:01:18,908 WARNING [train.py:1204] (2/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_576-76621-0_sp1.1 from training. Duration: 0.963625 2023-10-02 10:01:20,274 WARNING [train.py:1204] (2/4) Exclude cut with ID _13253_210_1925_1_1530757775819_3577873_253-246386-0 from training. Duration: 0.9 2023-10-02 10:01:21,895 WARNING [train.py:1204] (2/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_372-6869-0_sp1.1 from training. Duration: 0.4 2023-10-02 10:01:21,945 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_21434_210_4238_1_1531916339_1222252_22-54954-0_sp1.1 from training. Duration: 0.936375 2023-10-02 10:01:24,664 WARNING [train.py:1204] (2/4) Exclude cut with ID _29853_210_7497_1_1532505600851_6642110_492-170361-0_sp1.1 from training. Duration: 0.92725 2023-10-02 10:01:26,113 WARNING [train.py:1204] (2/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_505-97987-0_sp0.9 from training. Duration: 0.9555625 2023-10-02 10:01:27,413 WARNING [train.py:1204] (2/4) Exclude cut with ID _8365_210_2064_1_1528592311070_4677690_371-122295-0_sp0.9 from training. Duration: 0.611125 2023-10-02 10:01:28,941 WARNING [train.py:1204] (2/4) Exclude cut with ID _14268_210_9206_1_1530622771285_4714099_637-86630-0_sp1.1 from training. Duration: 0.463625 2023-10-02 10:01:30,592 WARNING [train.py:1204] (2/4) Exclude cut with ID _29857_210_20034_1_1532395886753_3798160_372-5678-0_sp1.1 from training. Duration: 0.963625 2023-10-02 10:01:31,826 WARNING [train.py:1204] (2/4) Exclude cut with ID _52892_210_6721_1_1534378941259_4124129_90-165108-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 10:01:32,099 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.conv_module2.balancer1.prob, batch_count=839986.6666666666, ans=0.125 2023-10-02 10:01:33,231 WARNING [train.py:1204] (2/4) Exclude cut with ID _27052_210_5986_1_1532329197187_3585009_143-339198-0_sp1.1 from training. Duration: 0.963625 2023-10-02 10:01:36,164 WARNING [train.py:1204] (2/4) Exclude cut with ID _14268_210_9206_1_1530622771285_4714099_637-86630-0_sp0.9 from training. Duration: 0.5666875 2023-10-02 10:01:36,166 WARNING [train.py:1204] (2/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_401-255491-0 from training. Duration: 0.7 2023-10-02 10:01:36,466 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.conv_module1.balancer2.min_abs, batch_count=839986.6666666666, ans=0.5 2023-10-02 10:01:36,492 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.nonlin_attention.balancer.prob, batch_count=839986.6666666666, ans=0.125 2023-10-02 10:01:37,968 WARNING [train.py:1204] (2/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_178-6946-0_sp1.1 from training. Duration: 0.97275 2023-10-02 10:01:42,834 WARNING [train.py:1204] (2/4) Exclude cut with ID _27485_210_4916_1_1532739191677_3668269_460-163885-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 10:01:44,019 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.530e+02 1.979e+02 2.395e+02 2.879e+02 4.810e+02, threshold=4.790e+02, percent-clipped=5.0 2023-10-02 10:01:48,634 WARNING [train.py:1204] (2/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_270-26053-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 10:01:50,071 WARNING [train.py:1204] (2/4) Exclude cut with ID _14170_210_5219_1_1530525949868_4469650_412-317077-0 from training. Duration: 0.75 2023-10-02 10:01:51,577 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.self_attn_weights.pos_emb_skip_rate, batch_count=840053.3333333334, ans=0.0 2023-10-02 10:01:52,873 WARNING [train.py:1204] (2/4) Exclude cut with ID _5289_210_2084_1_1527678202736_3779299_142-103317-0 from training. Duration: 0.84 2023-10-02 10:01:54,230 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_150-143438-0_sp1.1 from training. Duration: 0.92725 2023-10-02 10:01:54,363 WARNING [train.py:1204] (2/4) Exclude cut with ID _27894_210_12392_1_1532415496078_6902439_668-269779-0_sp1.1 from training. Duration: 0.97275 2023-10-02 10:01:54,573 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.bypass.skip_rate, batch_count=840053.3333333334, ans=0.09899494936611666 2023-10-02 10:01:55,616 WARNING [train.py:1204] (2/4) Exclude cut with ID _18513_210_9206_1_1531618215223_3862620_56-222075-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 10:01:56,984 WARNING [train.py:1204] (2/4) Exclude cut with ID _4092_210_2151_1_1526643044449_3135759_356-278694-0 from training. Duration: 0.47 2023-10-02 10:01:59,854 WARNING [train.py:1204] (2/4) Exclude cut with ID _25808_210_13292_1_1532343514562_7366109_633-47524-0 from training. Duration: 0.99 2023-10-02 10:01:59,865 WARNING [train.py:1204] (2/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_872-264525-0 from training. Duration: 0.65 2023-10-02 10:01:59,897 WARNING [train.py:1204] (2/4) Exclude cut with ID _14632_210_9988_1_1530691279392_3923200_239-232545-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 10:02:01,164 WARNING [train.py:1204] (2/4) Exclude cut with ID _14349_210_8341_1_1530615710818_3442470_153-27432-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 10:02:06,781 WARNING [train.py:1204] (2/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_37-41919-0_sp0.9 from training. Duration: 0.9666875 2023-10-02 10:02:09,327 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_196-289637-0_sp0.9 from training. Duration: 0.8333125 2023-10-02 10:02:11,992 INFO [train.py:1046] (2/4) Epoch 24, batch 3850, loss[loss=0.1616, simple_loss=0.2066, pruned_loss=0.05832, over 18911.00 frames. ], tot_loss[loss=0.171, simple_loss=0.2475, pruned_loss=0.04727, over 4705423.85 frames. ], batch size: 388, lr: 4.25e-03, grad_scale: 16.0 2023-10-02 10:02:12,756 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.3.encoder.layers.3.conv_module1.whiten, num_groups=1, num_channels=512, metric=5.20 vs. limit=15.0 2023-10-02 10:02:13,521 WARNING [train.py:1204] (2/4) Exclude cut with ID _13678_210_7497_1_1530530069226_3463170_371-3221-0_sp1.1 from training. Duration: 0.8181875 2023-10-02 10:02:13,593 WARNING [train.py:1204] (2/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_557-324258-0 from training. Duration: 0.81 2023-10-02 10:02:14,916 WARNING [train.py:1204] (2/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_1012-279473-0_sp1.1 from training. Duration: 0.6090625 2023-10-02 10:02:16,843 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_66916_210_14656_1_1535725525_326566_7-92649-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 10:02:18,526 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_9460_210_4211_1_1528954587_10049667_480-260269-0_sp1.1 from training. Duration: 0.5090625 2023-10-02 10:02:21,205 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_40896_210_15710_1_1533433956_7100802_121-155001-0_sp1.1 from training. Duration: 0.92725 2023-10-02 10:02:23,890 WARNING [train.py:1204] (2/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_188-171673-0_sp1.1 from training. Duration: 0.52725 2023-10-02 10:02:23,963 WARNING [train.py:1204] (2/4) Exclude cut with ID _47790_210_16187_1_1533862814066_4207519_2-39653-0 from training. Duration: 0.98 2023-10-02 10:02:29,477 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_23850_210_4238_1_1532232597_1632433_112-229012-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 10:02:30,907 WARNING [train.py:1204] (2/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_626-79528-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 10:02:32,206 WARNING [train.py:1204] (2/4) Exclude cut with ID _30304_210_6319_1_1532476827261_7582060_856-90052-0_sp1.1 from training. Duration: 0.963625 2023-10-02 10:02:33,441 WARNING [train.py:1204] (2/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_122-109023-0_sp0.9 from training. Duration: 0.8444375 2023-10-02 10:02:36,538 WARNING [train.py:1204] (2/4) Exclude cut with ID _64414_210_17301_1_1535243252048_3757759_37-70328-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 10:02:38,273 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_20120_210_6753_1_1531643035_5482859_622-344907-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 10:02:38,332 WARNING [train.py:1204] (2/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_410-321956-0_sp1.1 from training. Duration: 0.92725 2023-10-02 10:02:38,349 WARNING [train.py:1204] (2/4) Exclude cut with ID _7562_210_2084_1_1528887767916_3946229_186-326422-0_sp0.9 from training. Duration: 0.9333125 2023-10-02 10:02:40,228 WARNING [train.py:1204] (2/4) Exclude cut with ID _37537_210_2253_1_1533171758568_3421949_127-285192-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 10:02:43,179 WARNING [train.py:1204] (2/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_723-228168-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 10:02:44,950 WARNING [train.py:1204] (2/4) Exclude cut with ID _13678_210_7497_1_1530530069226_3463170_426-246984-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 10:02:44,964 WARNING [train.py:1204] (2/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_399-156791-0_sp0.9 from training. Duration: 0.92225 2023-10-02 10:02:45,034 WARNING [train.py:1204] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_391-22148-0 from training. Duration: 0.77 2023-10-02 10:02:45,057 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_3475_210_2028_1_1526193578_3622569_64-297072-0 from training. Duration: 0.91 2023-10-02 10:02:46,373 WARNING [train.py:1204] (2/4) Exclude cut with ID _17875_210_2087_1_1531224012907_1821339_225-280015-0_sp1.1 from training. Duration: 0.963625 2023-10-02 10:02:46,405 WARNING [train.py:1204] (2/4) Exclude cut with ID _50655_210_18628_1_1534229942193_3772154_325-246169-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 10:02:47,878 WARNING [train.py:1204] (2/4) Exclude cut with ID _29529_210_7234_1_1532393961455_3977460_597-257348-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 10:02:47,919 WARNING [train.py:1204] (2/4) Exclude cut with ID _58895_210_8750_1_1535184081320_3649089_77-173485-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 10:02:48,165 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.balancer2.prob, batch_count=840320.0, ans=0.125 2023-10-02 10:02:49,316 WARNING [train.py:1204] (2/4) Exclude cut with ID _6270_210_2084_1_1527850911573_3856420_26-277343-0 from training. Duration: 0.82 2023-10-02 10:02:51,960 WARNING [train.py:1204] (2/4) Exclude cut with ID _14368_210_4220_1_1530532714870_3595529_144-3287-0 from training. Duration: 0.67 2023-10-02 10:02:52,199 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.feed_forward1.out_proj.dropout_p, batch_count=840320.0, ans=0.1 2023-10-02 10:02:53,294 WARNING [train.py:1204] (2/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_293-145278-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 10:02:54,602 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_40047_210_2290_1_1533290519_2374311_257-335162-0 from training. Duration: 0.95 2023-10-02 10:02:56,019 WARNING [train.py:1204] (2/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_619-344499-0_sp0.9 from training. Duration: 0.7 2023-10-02 10:03:00,412 WARNING [train.py:1204] (2/4) Exclude cut with ID _19504_210_9994_1_1531648863242_3325220_456-315379-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 10:03:01,861 WARNING [train.py:1204] (2/4) Exclude cut with ID _55857_210_2346_1_1534644007831_6898209_631-221692-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 10:03:03,509 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.nonlin_attention.balancer.prob, batch_count=840386.6666666666, ans=0.125 2023-10-02 10:03:06,115 WARNING [train.py:1204] (2/4) Exclude cut with ID _64631_210_27767_1_1535349580068_4165160_324-3682-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 10:03:06,135 WARNING [train.py:1204] (2/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_317-211107-0 from training. Duration: 0.92 2023-10-02 10:03:09,841 WARNING [train.py:1204] (2/4) Exclude cut with ID _6789_210_1534_1_1527857937218_3118950_311-61262-0 from training. Duration: 0.65 2023-10-02 10:03:13,052 WARNING [train.py:1204] (2/4) Exclude cut with ID _28323_210_16187_1_1532480395667_4100300_365-68882-0_sp1.1 from training. Duration: 0.92725 2023-10-02 10:03:13,089 WARNING [train.py:1204] (2/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_816-181825-0_sp1.1 from training. Duration: 0.92725 2023-10-02 10:03:16,251 WARNING [train.py:1204] (2/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_132-269633-0_sp1.1 from training. Duration: 0.6181875 2023-10-02 10:03:16,267 WARNING [train.py:1204] (2/4) Exclude cut with ID _44585_210_18628_1_1533605350387_4518199_280-73534-0_sp1.1 from training. Duration: 0.8090625 2023-10-02 10:03:17,567 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_68095_210_20185_1_1535718226_8888855_1009-164755-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 10:03:18,864 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_40223_210_6228_1_1533298404_4812267_400-103813-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 10:03:18,864 WARNING [train.py:1204] (2/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_491-261923-0_sp1.1 from training. Duration: 0.8909375 2023-10-02 10:03:18,869 WARNING [train.py:1204] (2/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_712-342563-0 from training. Duration: 0.97 2023-10-02 10:03:20,203 WARNING [train.py:1204] (2/4) Exclude cut with ID _41161_210_20034_1_1533431249657_3555314_175-190032-0_sp1.1 from training. Duration: 0.963625 2023-10-02 10:03:21,622 WARNING [train.py:1204] (2/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_788-298907-0 from training. Duration: 0.89 2023-10-02 10:03:21,639 WARNING [train.py:1204] (2/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_225-70961-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 10:03:21,652 WARNING [train.py:1204] (2/4) Exclude cut with ID _58583_210_5236_1_1534726835266_3574249_235-175994-0_sp1.1 from training. Duration: 0.92725 2023-10-02 10:03:24,392 WARNING [train.py:1204] (2/4) Exclude cut with ID _12302_210_5219_1_1529924446466_8107280_907-182949-0_sp1.1 from training. Duration: 0.836375 2023-10-02 10:03:24,431 WARNING [train.py:1204] (2/4) Exclude cut with ID _54871_210_22356_1_1534665798253_3856359_94-193692-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 10:03:25,747 INFO [train.py:1046] (2/4) Epoch 24, batch 3900, loss[loss=0.1847, simple_loss=0.2566, pruned_loss=0.05641, over 23216.00 frames. ], tot_loss[loss=0.1701, simple_loss=0.2465, pruned_loss=0.04684, over 4698418.51 frames. ], batch size: 119, lr: 4.25e-03, grad_scale: 16.0 2023-10-02 10:03:25,862 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_4528_210_3273_1_1527076654_3276777_342-168068-0_sp0.9 from training. Duration: 0.9666875 2023-10-02 10:03:27,159 WARNING [train.py:1204] (2/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_368-74239-0_sp1.1 from training. Duration: 0.92725 2023-10-02 10:03:27,161 WARNING [train.py:1204] (2/4) Exclude cut with ID _44831_210_13427_1_1533816011549_3689035_291-295928-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 10:03:27,236 WARNING [train.py:1204] (2/4) Exclude cut with ID _56406_210_9288_1_1534899618664_3496149_455-131997-0_sp1.1 from training. Duration: 0.97275 2023-10-02 10:03:27,243 WARNING [train.py:1204] (2/4) Exclude cut with ID _6473_210_1741_1_1527901307396_3815380_108-17335-0 from training. Duration: 0.65 2023-10-02 10:03:28,560 WARNING [train.py:1204] (2/4) Exclude cut with ID _14224_210_4381_1_1530529062037_4264010_286-135600-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 10:03:32,597 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_14056_210_4163_1_1530604483_7572827_928-321675-0_sp1.1 from training. Duration: 0.936375 2023-10-02 10:03:32,674 WARNING [train.py:1204] (2/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_81-300212-0_sp1.1 from training. Duration: 0.5454375 2023-10-02 10:03:32,711 WARNING [train.py:1204] (2/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_29-280622-0_sp0.9 from training. Duration: 0.911125 2023-10-02 10:03:35,287 WARNING [train.py:1204] (2/4) Exclude cut with ID _61079_210_4525_1_1534921008198_4011419_228-176447-0_sp1.1 from training. Duration: 0.936375 2023-10-02 10:03:36,739 WARNING [train.py:1204] (2/4) Exclude cut with ID _8394_210_6702_1_1528590504316_3540189_372-198878-0_sp1.1 from training. Duration: 0.5454375 2023-10-02 10:03:38,021 WARNING [train.py:1204] (2/4) Exclude cut with ID _64521_210_14699_1_1535243427259_3815328_244-208725-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 10:03:39,869 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8275_210_4198_1_1528455320_3892571_491-17707-0_sp0.9 from training. Duration: 0.711125 2023-10-02 10:03:41,272 WARNING [train.py:1204] (2/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_375-245677-0 from training. Duration: 0.81 2023-10-02 10:03:41,287 WARNING [train.py:1204] (2/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_690-286055-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 10:03:43,289 WARNING [train.py:1204] (2/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_89-321955-0 from training. Duration: 0.8 2023-10-02 10:03:44,533 WARNING [train.py:1204] (2/4) Exclude cut with ID _18895_210_6228_1_1531357274234_3784930_213-98721-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 10:03:44,609 WARNING [train.py:1204] (2/4) Exclude cut with ID _19708_210_9206_1_1532136650733_4238364_347-65240-0 from training. Duration: 0.97 2023-10-02 10:03:46,638 WARNING [train.py:1204] (2/4) Exclude cut with ID _64135_210_2253_1_1535677267876_7608972_498-5039-0 from training. Duration: 0.78 2023-10-02 10:03:52,075 WARNING [train.py:1204] (2/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_146-276944-0_sp1.1 from training. Duration: 0.7545625 2023-10-02 10:03:53,359 WARNING [train.py:1204] (2/4) Exclude cut with ID _63171_210_15488_1_1535104792794_3295110_155-34488-0_sp1.1 from training. Duration: 0.97275 2023-10-02 10:03:53,381 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_5438_210_1794_1_1527336687_2600009_233-58072-0_sp0.9 from training. Duration: 0.8555625 2023-10-02 10:03:53,424 WARNING [train.py:1204] (2/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_63-101996-0_sp0.9 from training. Duration: 0.97775 2023-10-02 10:03:56,288 WARNING [train.py:1204] (2/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_271-269084-0_sp1.1 from training. Duration: 0.7545625 2023-10-02 10:03:57,710 WARNING [train.py:1204] (2/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_345-167986-0_sp1.1 from training. Duration: 0.763625 2023-10-02 10:04:00,373 WARNING [train.py:1204] (2/4) Exclude cut with ID _39290_210_4220_1_1533862778760_3075118_92-311600-0_sp1.1 from training. Duration: 0.8 2023-10-02 10:04:00,388 WARNING [train.py:1204] (2/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_209-65306-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 10:04:00,480 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_26433_210_12108_1_1532157158_4056480_317-335807-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 10:04:05,774 WARNING [train.py:1204] (2/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_238-94127-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 10:04:05,831 WARNING [train.py:1204] (2/4) Exclude cut with ID _41314_210_12651_1_1533459476543_3766460_207-132038-0_sp1.1 from training. Duration: 0.8545625 2023-10-02 10:04:09,706 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.589e+02 1.982e+02 2.222e+02 2.624e+02 4.261e+02, threshold=4.444e+02, percent-clipped=0.0 2023-10-02 10:04:14,586 WARNING [train.py:1204] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_167-137255-0_sp1.1 from training. Duration: 0.5818125 2023-10-02 10:04:16,477 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_2009_210_2071_1_1525481134_4588937_385-302100-0_sp0.9 from training. Duration: 0.9666875 2023-10-02 10:04:25,532 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_15517_210_6753_1_1530954008_3487336_253-110617-0_sp1.1 from training. Duration: 0.936375 2023-10-02 10:04:25,808 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.conv_module2.balancer1.prob, batch_count=840786.6666666666, ans=0.125 2023-10-02 10:04:28,209 WARNING [train.py:1204] (2/4) Exclude cut with ID _39290_210_4220_1_1533862778760_3075118_92-311600-0_sp0.9 from training. Duration: 0.97775 2023-10-02 10:04:29,523 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_42-142473-0 from training. Duration: 0.72 2023-10-02 10:04:29,559 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_2296_210_2087_1_1525503448_3397335_57-185430-0 from training. Duration: 0.6 2023-10-02 10:04:29,578 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_11305_210_1794_1_1529750428_7178148_257-312870-0_sp0.9 from training. Duration: 0.97775 2023-10-02 10:04:31,003 WARNING [train.py:1204] (2/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_222-198127-0 from training. Duration: 0.84 2023-10-02 10:04:31,209 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.balancer1.prob, batch_count=840786.6666666666, ans=0.125 2023-10-02 10:04:32,398 WARNING [train.py:1204] (2/4) Exclude cut with ID _65263_210_22356_1_1535763598691_3921037_423-202699-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 10:04:32,450 WARNING [train.py:1204] (2/4) Exclude cut with ID _15037_210_2253_1_1530752294104_3267650_13-130361-0 from training. Duration: 0.99 2023-10-02 10:04:38,030 INFO [train.py:1046] (2/4) Epoch 24, batch 3950, loss[loss=0.1826, simple_loss=0.2544, pruned_loss=0.05537, over 23766.00 frames. ], tot_loss[loss=0.1704, simple_loss=0.2469, pruned_loss=0.04699, over 4703662.72 frames. ], batch size: 232, lr: 4.25e-03, grad_scale: 16.0 2023-10-02 10:04:40,684 WARNING [train.py:1204] (2/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_628-198080-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 10:04:42,711 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_3171_210_2087_1_1526109084_2330007_37-64334-0 from training. Duration: 0.82 2023-10-02 10:04:44,097 WARNING [train.py:1204] (2/4) Exclude cut with ID _5287_210_2084_1_1527418883040_3878670_12-221186-0_sp1.1 from training. Duration: 0.8909375 2023-10-02 10:04:46,836 WARNING [train.py:1204] (2/4) Exclude cut with ID _6653_210_2629_1_1527919172441_6732679_1103-56445-0_sp1.1 from training. Duration: 0.663625 2023-10-02 10:04:46,949 WARNING [train.py:1204] (2/4) Exclude cut with ID _65263_210_22356_1_1535763598691_3921037_169-140068-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 10:04:53,163 WARNING [train.py:1204] (2/4) Exclude cut with ID IC0671W0118-331961 from training. Duration: 0.896 2023-10-02 10:04:53,226 WARNING [train.py:1204] (2/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_402-102311-0_sp1.1 from training. Duration: 0.6090625 2023-10-02 10:04:53,245 WARNING [train.py:1204] (2/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_202-293523-0 from training. Duration: 0.72 2023-10-02 10:04:55,008 WARNING [train.py:1204] (2/4) Exclude cut with ID IC0679W0038-335846 from training. Duration: 0.768 2023-10-02 10:04:55,037 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_37357_210_17024_1_1533089504_2316121_79-198866-0_sp1.1 from training. Duration: 0.936375 2023-10-02 10:04:57,740 WARNING [train.py:1204] (2/4) Exclude cut with ID _7606_210_4915_1_1528894253277_5080690_292-271285-0_sp1.1 from training. Duration: 0.963625 2023-10-02 10:04:57,750 WARNING [train.py:1204] (2/4) Exclude cut with ID _3541_210_2477_1_1526475408709_3263973_377-296839-0_sp0.9 from training. Duration: 0.611125 2023-10-02 10:04:57,753 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_45886_210_17024_1_1533704917_2435333_58-131069-0_sp1.1 from training. Duration: 0.963625 2023-10-02 10:05:00,520 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6961_210_2087_1_1527937168_6815011_395-185060-0 from training. Duration: 0.95 2023-10-02 10:05:03,301 WARNING [train.py:1204] (2/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_304-266479-0_sp1.1 from training. Duration: 0.97275 2023-10-02 10:05:03,758 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.5.encoder.layers.1.self_attn_weights.whiten_keys, num_groups=4, num_channels=128, metric=1.96 vs. limit=6.0 2023-10-02 10:05:04,679 WARNING [train.py:1204] (2/4) Exclude cut with ID _3867_210_1883_1_1526641200575_4475830_319-349850-0_sp1.1 from training. Duration: 0.6090625 2023-10-02 10:05:04,688 WARNING [train.py:1204] (2/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_304-59227-0_sp0.9 from training. Duration: 0.8555625 2023-10-02 10:05:04,763 WARNING [train.py:1204] (2/4) Exclude cut with ID _8021_210_7994_1_1528376451358_3653069_100-140231-0_sp0.9 from training. Duration: 0.7444375 2023-10-02 10:05:06,056 WARNING [train.py:1204] (2/4) Exclude cut with ID _8833_210_5219_1_1528592665210_7553059_329-125156-0_sp0.9 from training. Duration: 0.911125 2023-10-02 10:05:11,797 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.feed_forward1.out_proj.dropout_p, batch_count=840986.6666666666, ans=0.1 2023-10-02 10:05:16,692 WARNING [train.py:1204] (2/4) Exclude cut with ID _14459_210_2228_1_1531443641946_7516700_792-287234-0_sp1.1 from training. Duration: 0.92725 2023-10-02 10:05:16,711 WARNING [train.py:1204] (2/4) Exclude cut with ID _19708_210_9206_1_1532136650733_4238364_347-65240-0_sp1.1 from training. Duration: 0.8818125 2023-10-02 10:05:21,412 WARNING [train.py:1204] (2/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_70-27732-0 from training. Duration: 0.94 2023-10-02 10:05:27,293 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_397-132565-0 from training. Duration: 0.99 2023-10-02 10:05:27,295 WARNING [train.py:1204] (2/4) Exclude cut with ID _49728_210_12651_1_1534041034294_6788150_624-195301-0 from training. Duration: 0.85 2023-10-02 10:05:27,336 WARNING [train.py:1204] (2/4) Exclude cut with ID _55429_210_10626_1_1534467588527_7174143_377-169391-0_sp1.1 from training. Duration: 0.87275 2023-10-02 10:05:28,970 INFO [scaling.py:1118] (2/4) WithLoss: name=encoder.encoders.3.encoder.layers.3.self_attn_weights, loss-sum=0.000e+00 2023-10-02 10:05:29,981 WARNING [train.py:1204] (2/4) Exclude cut with ID _49376_210_5986_1_1533985278732_3370740_354-344695-0_sp1.1 from training. Duration: 0.9 2023-10-02 10:05:30,203 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.conv_skip_rate, batch_count=841053.3333333334, ans=0.0 2023-10-02 10:05:35,644 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.balancer1.prob, batch_count=841120.0, ans=0.125 2023-10-02 10:05:36,830 WARNING [train.py:1204] (2/4) Exclude cut with ID _4697_210_2069_1_1526815801953_3823098_276-96593-0_sp1.1 from training. Duration: 0.7 2023-10-02 10:05:36,838 WARNING [train.py:1204] (2/4) Exclude cut with ID _15012_210_1968_1_1530856820429_5095350_688-178849-0_sp0.9 from training. Duration: 0.711125 2023-10-02 10:05:36,873 WARNING [train.py:1204] (2/4) Exclude cut with ID _25565_210_12392_1_1532423009382_3952098_372-69023-0_sp1.1 from training. Duration: 0.936375 2023-10-02 10:05:36,914 WARNING [train.py:1204] (2/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_61-327062-0_sp1.1 from training. Duration: 0.736375 2023-10-02 10:05:38,220 WARNING [train.py:1204] (2/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_215-141059-0 from training. Duration: 0.62 2023-10-02 10:05:41,110 WARNING [train.py:1204] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_6-226750-0_sp1.1 from training. Duration: 0.8545625 2023-10-02 10:05:42,381 WARNING [train.py:1204] (2/4) Exclude cut with ID _14348_210_8341_1_1530599434996_3778640_240-232370-0_sp1.1 from training. Duration: 0.763625 2023-10-02 10:05:46,929 WARNING [train.py:1204] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_468-189058-0 from training. Duration: 0.96 2023-10-02 10:05:51,369 INFO [train.py:1046] (2/4) Epoch 24, batch 4000, loss[loss=0.1527, simple_loss=0.2379, pruned_loss=0.03379, over 24685.00 frames. ], tot_loss[loss=0.1702, simple_loss=0.2468, pruned_loss=0.04676, over 4719038.57 frames. ], batch size: 65, lr: 4.25e-03, grad_scale: 16.0 2023-10-02 10:05:51,710 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.feed_forward1.out_proj.dropout_p, batch_count=841186.6666666666, ans=0.1 2023-10-02 10:05:57,296 WARNING [train.py:1204] (2/4) Exclude cut with ID _21987_210_5854_1_1531821500482_3637038_247-93340-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 10:06:02,810 WARNING [train.py:1204] (2/4) Exclude cut with ID _66868_210_9527_1_1535509287954_4561140_436-191602-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 10:06:08,274 WARNING [train.py:1204] (2/4) Exclude cut with ID _18895_210_6228_1_1531357274234_3784930_394-58203-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 10:06:08,326 WARNING [train.py:1204] (2/4) Exclude cut with ID _58605_210_12210_1_1534723195939_3904259_258-281498-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 10:06:09,659 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_33348_210_5632_1_1533086912_3324030_245-292752-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 10:06:09,677 WARNING [train.py:1204] (2/4) Exclude cut with ID _67831_210_12392_1_1535612464202_3092612_346-2148-0 from training. Duration: 0.93 2023-10-02 10:06:09,739 WARNING [train.py:1204] (2/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_369-222060-0_sp0.9 from training. Duration: 0.788875 2023-10-02 10:06:11,024 WARNING [train.py:1204] (2/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_245-174553-0 from training. Duration: 0.88 2023-10-02 10:06:11,029 WARNING [train.py:1204] (2/4) Exclude cut with ID _3541_210_2477_1_1526475408709_3263973_231-104736-0_sp1.1 from training. Duration: 0.7090625 2023-10-02 10:06:11,035 WARNING [train.py:1204] (2/4) Exclude cut with ID _44585_210_18628_1_1533605350387_4518199_280-73534-0 from training. Duration: 0.89 2023-10-02 10:06:13,755 WARNING [train.py:1204] (2/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_22-201476-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 10:06:17,077 WARNING [train.py:1204] (2/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_325-62802-0_sp0.9 from training. Duration: 0.9666875 2023-10-02 10:06:17,718 WARNING [train.py:1204] (2/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_199-274062-0_sp1.1 from training. Duration: 0.97275 2023-10-02 10:06:17,722 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_8-319957-0_sp1.1 from training. Duration: 0.87275 2023-10-02 10:06:17,751 WARNING [train.py:1204] (2/4) Exclude cut with ID _29343_210_4381_1_1532602757752_3674109_224-349726-0_sp1.1 from training. Duration: 0.936375 2023-10-02 10:06:17,755 WARNING [train.py:1204] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_520-126729-0_sp1.1 from training. Duration: 0.636375 2023-10-02 10:06:19,204 WARNING [train.py:1204] (2/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_239-22076-0_sp1.1 from training. Duration: 0.77275 2023-10-02 10:06:21,211 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9012W0011-501146 from training. Duration: 0.896 2023-10-02 10:06:22,494 WARNING [train.py:1204] (2/4) Exclude cut with ID _29857_210_20034_1_1532395886753_3798160_279-185801-0_sp1.1 from training. Duration: 0.82725 2023-10-02 10:06:22,580 WARNING [train.py:1204] (2/4) Exclude cut with ID _43552_210_8913_1_1533726031059_3523429_261-223933-0_sp1.1 from training. Duration: 0.92725 2023-10-02 10:06:25,517 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9029W0025-509426 from training. Duration: 0.896 2023-10-02 10:06:27,286 WARNING [train.py:1204] (2/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_337-181502-0_sp1.1 from training. Duration: 0.5909375 2023-10-02 10:06:27,308 WARNING [train.py:1204] (2/4) Exclude cut with ID _68635_210_9317_1_1535770813410_4315870_159-332599-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 10:06:28,968 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.conv_skip_rate, batch_count=841320.0, ans=0.0 2023-10-02 10:06:30,249 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.bypass_mid.scale_min, batch_count=841320.0, ans=0.2 2023-10-02 10:06:32,693 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_161-276996-0 from training. Duration: 0.95 2023-10-02 10:06:32,748 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7088_210_1874_1_1527939776_1101113_127-68870-0_sp1.1 from training. Duration: 0.936375 2023-10-02 10:06:35,358 WARNING [train.py:1204] (2/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_322-217220-0_sp1.1 from training. Duration: 0.7818125 2023-10-02 10:06:35,410 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9072W0002-530636 from training. Duration: 0.768 2023-10-02 10:06:36,873 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_340-336408-0_sp1.1 from training. Duration: 0.8454375 2023-10-02 10:06:38,065 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.448e+02 1.830e+02 2.096e+02 2.397e+02 3.466e+02, threshold=4.191e+02, percent-clipped=0.0 2023-10-02 10:06:38,162 WARNING [train.py:1204] (2/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_395-37222-0 from training. Duration: 0.76 2023-10-02 10:06:38,165 WARNING [train.py:1204] (2/4) Exclude cut with ID _64099_210_17301_1_1535526977656_1578188_159-209156-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 10:06:39,569 WARNING [train.py:1204] (2/4) Exclude cut with ID _53950_210_5816_1_1534417264919_3862951_28-29681-0_sp1.1 from training. Duration: 0.92725 2023-10-02 10:06:39,652 WARNING [train.py:1204] (2/4) Exclude cut with ID _4681_210_3525_1_1527250689464_120889_15-126760-0_sp1.1 from training. Duration: 0.736375 2023-10-02 10:06:40,997 WARNING [train.py:1204] (2/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_242-3472-0_sp1.1 from training. Duration: 0.663625 2023-10-02 10:06:42,387 WARNING [train.py:1204] (2/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_436-186311-0_sp0.9 from training. Duration: 0.72225 2023-10-02 10:06:42,409 WARNING [train.py:1204] (2/4) Exclude cut with ID _3675_210_4557_1_1526639934408_4182089_452-172147-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 10:06:43,853 WARNING [train.py:1204] (2/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_80-147301-0 from training. Duration: 0.84 2023-10-02 10:06:45,535 WARNING [train.py:1204] (2/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_351-107588-0_sp1.1 from training. Duration: 0.92725 2023-10-02 10:06:46,899 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9111W0002-550781 from training. Duration: 0.896 2023-10-02 10:06:52,097 WARNING [train.py:1204] (2/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_465-156500-0_sp1.1 from training. Duration: 0.6545625 2023-10-02 10:06:54,809 WARNING [train.py:1204] (2/4) Exclude cut with ID _13938_210_9988_1_1530518718978_3693960_304-112784-0_sp1.1 from training. Duration: 0.4181875 2023-10-02 10:06:54,950 WARNING [train.py:1204] (2/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_202-67017-0_sp1.1 from training. Duration: 0.7454375 2023-10-02 10:06:56,599 WARNING [train.py:1204] (2/4) Exclude cut with ID _58414_210_8254_1_1535421609162_7151182_448-4358-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 10:06:57,916 WARNING [train.py:1204] (2/4) Exclude cut with ID _66840_210_5219_1_1535452168426_4196349_203-243234-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 10:06:57,983 WARNING [train.py:1204] (2/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_201-213513-0_sp1.1 from training. Duration: 0.97275 2023-10-02 10:07:02,289 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.balancer1.prob, batch_count=841453.3333333334, ans=0.125 2023-10-02 10:07:03,525 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6291_210_3273_1_1527683649_1078498_117-187560-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 10:07:04,743 INFO [train.py:1046] (2/4) Epoch 24, batch 4050, loss[loss=0.1756, simple_loss=0.2477, pruned_loss=0.05173, over 23859.00 frames. ], tot_loss[loss=0.1704, simple_loss=0.2472, pruned_loss=0.04685, over 4718784.91 frames. ], batch size: 195, lr: 4.25e-03, grad_scale: 16.0 2023-10-02 10:07:04,868 WARNING [train.py:1204] (2/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_135-174080-0_sp0.9 from training. Duration: 0.588875 2023-10-02 10:07:06,160 WARNING [train.py:1204] (2/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_45-265684-0 from training. Duration: 0.72 2023-10-02 10:07:07,613 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_435-313305-0_sp1.1 from training. Duration: 0.6454375 2023-10-02 10:07:07,633 WARNING [train.py:1204] (2/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_235-307337-0_sp1.1 from training. Duration: 0.8545625 2023-10-02 10:07:08,330 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.1.encoder.layers.1.feed_forward3.out_whiten, num_groups=1, num_channels=256, metric=8.51 vs. limit=15.0 2023-10-02 10:07:08,903 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7762_210_1794_1_1528199508_4057408_419-125009-0_sp0.9 from training. Duration: 0.92225 2023-10-02 10:07:10,251 WARNING [train.py:1204] (2/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_215-141059-0_sp1.1 from training. Duration: 0.563625 2023-10-02 10:07:11,582 WARNING [train.py:1204] (2/4) Exclude cut with ID _27013_210_1389_1_1532437173240_3252649_222-125573-0_sp1.1 from training. Duration: 0.8909375 2023-10-02 10:07:14,323 WARNING [train.py:1204] (2/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_126-274694-0_sp1.1 from training. Duration: 0.8909375 2023-10-02 10:07:18,239 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_2592_210_2290_1_1525697025_4882925_157-70512-0_sp1.1 from training. Duration: 0.82725 2023-10-02 10:07:19,656 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_292-265928-0_sp0.9 from training. Duration: 0.5666875 2023-10-02 10:07:21,524 WARNING [train.py:1204] (2/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_1211-226066-0_sp1.1 from training. Duration: 0.6909375 2023-10-02 10:07:21,577 WARNING [train.py:1204] (2/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_394-265747-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 10:07:24,408 WARNING [train.py:1204] (2/4) Exclude cut with ID _48782_210_12658_1_1534467701544_2300422_239-67139-0_sp1.1 from training. Duration: 0.97275 2023-10-02 10:07:25,832 WARNING [train.py:1204] (2/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_329-113173-0_sp1.1 from training. Duration: 0.563625 2023-10-02 10:07:29,004 WARNING [train.py:1204] (2/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_81-75849-0_sp0.9 from training. Duration: 0.4333125 2023-10-02 10:07:31,590 WARNING [train.py:1204] (2/4) Exclude cut with ID _9235_210_5219_1_1528891154253_8046120_1003-101638-0 from training. Duration: 0.73 2023-10-02 10:07:31,626 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9309W0189-640316 from training. Duration: 0.64 2023-10-02 10:07:34,316 WARNING [train.py:1204] (2/4) Exclude cut with ID _45329_210_7005_1_1534157951211_6774339_503-287883-0_sp1.1 from training. Duration: 0.8 2023-10-02 10:07:39,167 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.4.encoder.layers.1.self_attn1.whiten, num_groups=1, num_channels=384, metric=11.42 vs. limit=22.5 2023-10-02 10:07:39,928 WARNING [train.py:1204] (2/4) Exclude cut with ID _8833_210_5219_1_1528592665210_7553059_611-80313-0 from training. Duration: 0.64 2023-10-02 10:07:39,999 WARNING [train.py:1204] (2/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_335-106626-0_sp1.1 from training. Duration: 0.963625 2023-10-02 10:07:44,050 WARNING [train.py:1204] (2/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_237-44332-0_sp1.1 from training. Duration: 0.8545625 2023-10-02 10:07:46,006 WARNING [train.py:1204] (2/4) Exclude cut with ID _46952_210_5986_1_1534230010357_3107379_178-147341-0_sp1.1 from training. Duration: 0.97275 2023-10-02 10:07:46,663 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_13091_210_7925_1_1530609447_8987354_946-154426-0_sp1.1 from training. Duration: 0.936375 2023-10-02 10:07:47,847 WARNING [train.py:1204] (2/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_326-17061-0_sp1.1 from training. Duration: 0.8545625 2023-10-02 10:07:51,125 WARNING [train.py:1204] (2/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_38-169920-0_sp1.1 from training. Duration: 0.82725 2023-10-02 10:07:53,942 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.bypass_mid.scale_min, batch_count=841720.0, ans=0.2 2023-10-02 10:07:54,520 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.2.encoder.layers.1.nonlin_attention.whiten2, num_groups=1, num_channels=384, metric=7.35 vs. limit=15.0 2023-10-02 10:07:55,067 WARNING [train.py:1204] (2/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_283-220372-0 from training. Duration: 0.56 2023-10-02 10:07:55,082 WARNING [train.py:1204] (2/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_252-216674-0_sp1.1 from training. Duration: 0.4909375 2023-10-02 10:07:57,895 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_202-91290-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 10:07:58,009 WARNING [train.py:1204] (2/4) Exclude cut with ID _41314_210_12651_1_1533459476543_3766460_80-74184-0 from training. Duration: 0.83 2023-10-02 10:08:02,784 WARNING [train.py:1204] (2/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_411-271358-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 10:08:07,145 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.conv_module2.balancer1.prob, batch_count=841786.6666666666, ans=0.125 2023-10-02 10:08:08,236 WARNING [train.py:1204] (2/4) Exclude cut with ID _68777_210_4353_1_1535810390731_3117904_129-166012-0 from training. Duration: 0.85 2023-10-02 10:08:08,442 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.feed_forward1.out_proj.dropout_p, batch_count=841786.6666666666, ans=0.1 2023-10-02 10:08:09,584 WARNING [train.py:1204] (2/4) Exclude cut with ID _44982_210_1511_1_1534312820108_3726029_410-109759-0_sp1.1 from training. Duration: 0.963625 2023-10-02 10:08:09,588 WARNING [train.py:1204] (2/4) Exclude cut with ID _14224_210_4381_1_1530529062037_4264010_364-22817-0_sp1.1 from training. Duration: 0.6454375 2023-10-02 10:08:12,260 WARNING [train.py:1204] (2/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_89-242335-0 from training. Duration: 0.95 2023-10-02 10:08:12,269 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_12868_210_6753_1_1530702479_3610564_151-325626-0 from training. Duration: 0.97 2023-10-02 10:08:12,270 WARNING [train.py:1204] (2/4) Exclude cut with ID _6275_210_2084_1_1528110062213_7669049_786-264332-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 10:08:13,072 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.2.encoder.layers.0.nonlin_attention.whiten2, num_groups=1, num_channels=384, metric=8.13 vs. limit=15.0 2023-10-02 10:08:15,479 WARNING [train.py:1204] (2/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_35-153886-0_sp1.1 from training. Duration: 0.763625 2023-10-02 10:08:16,862 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_24007_210_1794_1_1531918925_1663875_116-100916-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 10:08:16,883 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_162-262717-0_sp1.1 from training. Duration: 0.7545625 2023-10-02 10:08:17,408 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.5.encoder.layers.0.conv_module1.whiten, num_groups=1, num_channels=256, metric=7.67 vs. limit=15.0 2023-10-02 10:08:18,181 INFO [train.py:1046] (2/4) Epoch 24, batch 4100, loss[loss=0.1394, simple_loss=0.218, pruned_loss=0.0304, over 24300.00 frames. ], tot_loss[loss=0.1714, simple_loss=0.248, pruned_loss=0.04741, over 4713825.06 frames. ], batch size: 56, lr: 4.25e-03, grad_scale: 8.0 2023-10-02 10:08:21,166 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.1.encoder.layers.0.feed_forward2.out_whiten, num_groups=1, num_channels=256, metric=4.34 vs. limit=15.0 2023-10-02 10:08:23,064 WARNING [train.py:1204] (2/4) Exclude cut with ID _8263_210_1664_1_1528438953494_294100_40-204731-0 from training. Duration: 0.5 2023-10-02 10:08:24,298 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_12868_210_6753_1_1530702479_3610564_416-293746-0 from training. Duration: 0.9 2023-10-02 10:08:25,710 WARNING [train.py:1204] (2/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_605-219732-0 from training. Duration: 0.94 2023-10-02 10:08:27,102 WARNING [train.py:1204] (2/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_454-123002-0 from training. Duration: 0.69 2023-10-02 10:08:27,107 WARNING [train.py:1204] (2/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_25-304273-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 10:08:27,999 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.0.layers.1.self_attn1.whiten, num_groups=1, num_channels=192, metric=11.50 vs. limit=22.5 2023-10-02 10:08:28,369 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6860_210_4852_1_1527917457_3833327_451-39702-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 10:08:28,394 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_304-16505-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 10:08:29,940 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6291_210_3273_1_1527683649_1078498_4-266348-0_sp1.1 from training. Duration: 0.7090625 2023-10-02 10:08:30,012 WARNING [train.py:1204] (2/4) Exclude cut with ID ID0149W0001-753701 from training. Duration: 0.989 2023-10-02 10:08:31,327 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_41251_210_15368_1_1533429767_4820695_394-55393-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 10:08:32,804 WARNING [train.py:1204] (2/4) Exclude cut with ID _8793_210_6310_1_1528542985139_2310009_121-179782-0_sp1.1 from training. Duration: 0.72725 2023-10-02 10:08:32,837 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_2729_210_3528_1_1525863505_3882214_421-73047-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 10:08:34,570 WARNING [train.py:1204] (2/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_278-154492-0_sp0.9 from training. Duration: 0.8444375 2023-10-02 10:08:39,858 WARNING [train.py:1204] (2/4) Exclude cut with ID _14170_210_5219_1_1530525949868_4469650_412-317077-0_sp0.9 from training. Duration: 0.8333125 2023-10-02 10:08:39,948 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_727-138317-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 10:08:41,389 WARNING [train.py:1204] (2/4) Exclude cut with ID _25808_210_13292_1_1532343514562_7366109_345-335048-0_sp1.1 from training. Duration: 0.92725 2023-10-02 10:08:41,411 WARNING [train.py:1204] (2/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_506-23200-0 from training. Duration: 0.97 2023-10-02 10:08:42,633 WARNING [train.py:1204] (2/4) Exclude cut with ID _44718_210_5816_1_1534050051869_6469030_643-245187-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 10:08:42,637 WARNING [train.py:1204] (2/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_42-186162-0_sp1.1 from training. Duration: 0.836375 2023-10-02 10:08:42,657 WARNING [train.py:1204] (2/4) Exclude cut with ID _3231_210_2106_1_1526205864934_6784220_171-256004-0_sp1.1 from training. Duration: 0.7818125 2023-10-02 10:08:42,681 WARNING [train.py:1204] (2/4) Exclude cut with ID _25914_210_6400_1_1532141317058_644740_43-248478-0_sp1.1 from training. Duration: 0.936375 2023-10-02 10:08:44,040 WARNING [train.py:1204] (2/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_577-287150-0 from training. Duration: 0.82 2023-10-02 10:08:47,314 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_46015_210_7234_1_1533863319_3087837_376-276068-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 10:08:48,623 WARNING [train.py:1204] (2/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_51-200220-0 from training. Duration: 0.56 2023-10-02 10:08:50,562 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_452-96101-0_sp1.1 from training. Duration: 0.963625 2023-10-02 10:08:51,424 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.conv_module2.balancer2.prob, batch_count=841986.6666666666, ans=0.125 2023-10-02 10:08:52,316 WARNING [train.py:1204] (2/4) Exclude cut with ID _13252_210_1925_1_1530671372047_3336909_263-168388-0_sp1.1 from training. Duration: 0.7818125 2023-10-02 10:08:52,317 WARNING [train.py:1204] (2/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_469-94673-0 from training. Duration: 0.79 2023-10-02 10:08:53,752 WARNING [train.py:1204] (2/4) Exclude cut with ID _14922_210_6228_1_1530707435273_3693310_303-228738-0_sp1.1 from training. Duration: 0.763625 2023-10-02 10:08:53,797 WARNING [train.py:1204] (2/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_262-250826-0_sp1.1 from training. Duration: 0.9 2023-10-02 10:08:53,820 WARNING [train.py:1204] (2/4) Exclude cut with ID _68777_210_4353_1_1535810390731_3117904_129-166012-0_sp1.1 from training. Duration: 0.77275 2023-10-02 10:08:55,162 WARNING [train.py:1204] (2/4) Exclude cut with ID _58895_210_8750_1_1535184081320_3649089_265-323810-0 from training. Duration: 0.96 2023-10-02 10:08:57,842 WARNING [train.py:1204] (2/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_120-76843-0_sp0.9 from training. Duration: 0.611125 2023-10-02 10:08:57,904 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7544_210_6753_1_1528587722_3576686_295-258479-0_sp0.9 from training. Duration: 0.9333125 2023-10-02 10:08:59,410 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7046_210_6753_1_1527895337_6920917_783-278109-0 from training. Duration: 0.77 2023-10-02 10:09:00,691 WARNING [train.py:1204] (2/4) Exclude cut with ID _42326_210_7770_1_1533814282531_3707269_113-308323-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 10:09:02,089 WARNING [train.py:1204] (2/4) Exclude cut with ID _7852_210_5219_1_1528286471316_8139400_242-105336-0_sp0.9 from training. Duration: 0.8 2023-10-02 10:09:03,461 WARNING [train.py:1204] (2/4) Exclude cut with ID _56235_210_10640_1_1534557335235_4161099_114-277905-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 10:09:06,740 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.453e+02 1.905e+02 2.091e+02 2.337e+02 3.438e+02, threshold=4.183e+02, percent-clipped=0.0 2023-10-02 10:09:09,478 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_13118_210_7925_1_1530755612_7608297_42-66683-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 10:09:10,982 WARNING [train.py:1204] (2/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_901-79438-0_sp1.1 from training. Duration: 0.8545625 2023-10-02 10:09:11,169 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=842053.3333333334, ans=0.1 2023-10-02 10:09:12,242 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_30280_210_1881_1_1532872779_3660139_211-182194-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 10:09:15,233 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.bypass_mid.scale_min, batch_count=842120.0, ans=0.2 2023-10-02 10:09:22,963 WARNING [train.py:1204] (2/4) Exclude cut with ID _14898_210_9206_1_1530766812377_4177190_621-45732-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 10:09:22,971 WARNING [train.py:1204] (2/4) Exclude cut with ID _35350_210_16040_1_1533040191183_4075299_224-133775-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 10:09:24,445 WARNING [train.py:1204] (2/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_147-293311-0_sp1.1 from training. Duration: 0.8545625 2023-10-02 10:09:24,571 WARNING [train.py:1204] (2/4) Exclude cut with ID _12302_210_5219_1_1529924446466_8107280_835-279754-0_sp1.1 from training. Duration: 0.8181875 2023-10-02 10:09:30,156 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8453_210_4211_1_1528502940_8562730_347-330673-0_sp0.9 from training. Duration: 0.8 2023-10-02 10:09:30,240 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_213-103282-0_sp0.9 from training. Duration: 0.8666875 2023-10-02 10:09:31,400 INFO [train.py:1046] (2/4) Epoch 24, batch 4150, loss[loss=0.1756, simple_loss=0.2314, pruned_loss=0.05988, over 19995.00 frames. ], tot_loss[loss=0.1715, simple_loss=0.2479, pruned_loss=0.04756, over 4705121.86 frames. ], batch size: 388, lr: 4.25e-03, grad_scale: 8.0 2023-10-02 10:09:31,488 WARNING [train.py:1204] (2/4) Exclude cut with ID _49728_210_12651_1_1534041034294_6788150_66-43496-0_sp1.1 from training. Duration: 0.97275 2023-10-02 10:09:31,490 WARNING [train.py:1204] (2/4) Exclude cut with ID _19023_210_4353_1_1531378892552_5820780_28-36615-0_sp1.1 from training. Duration: 0.8818125 2023-10-02 10:09:32,960 WARNING [train.py:1204] (2/4) Exclude cut with ID _14349_210_8341_1_1530615710818_3442470_115-285975-0 from training. Duration: 0.86 2023-10-02 10:09:33,168 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.feed_forward2.hidden_balancer.prob, batch_count=842186.6666666666, ans=0.125 2023-10-02 10:09:34,827 WARNING [train.py:1204] (2/4) Exclude cut with ID _12734_210_8341_1_1530183614296_3891929_236-47464-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 10:09:34,882 WARNING [train.py:1204] (2/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_177-236809-0 from training. Duration: 0.49 2023-10-02 10:09:36,169 WARNING [train.py:1204] (2/4) Exclude cut with ID _13678_210_7497_1_1530530069226_3463170_371-3221-0 from training. Duration: 0.9 2023-10-02 10:09:36,195 WARNING [train.py:1204] (2/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_48-338976-0 from training. Duration: 0.89 2023-10-02 10:09:37,601 WARNING [train.py:1204] (2/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_1167-309744-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 10:09:41,552 WARNING [train.py:1204] (2/4) Exclude cut with ID _30327_210_16049_1_1532484161295_3924169_401-70819-0_sp0.9 from training. Duration: 0.9555625 2023-10-02 10:09:41,567 WARNING [train.py:1204] (2/4) Exclude cut with ID _55134_210_21982_1_1534590028377_3882170_346-91022-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 10:09:43,086 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.feed_forward1.hidden_balancer.prob, batch_count=842186.6666666666, ans=0.125 2023-10-02 10:09:44,358 WARNING [train.py:1204] (2/4) Exclude cut with ID _14459_210_2228_1_1531443641946_7516700_174-118801-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 10:09:46,213 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_269-278006-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 10:09:48,103 WARNING [train.py:1204] (2/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_390-339137-0_sp0.9 from training. Duration: 0.62225 2023-10-02 10:09:50,092 WARNING [train.py:1204] (2/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_253-308377-0_sp1.1 from training. Duration: 0.4454375 2023-10-02 10:09:50,103 WARNING [train.py:1204] (2/4) Exclude cut with ID _23825_210_13449_1_1531915313526_3497150_240-271367-0_sp1.1 from training. Duration: 0.8818125 2023-10-02 10:09:51,515 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1015-343884-0_sp0.9 from training. Duration: 0.688875 2023-10-02 10:09:51,718 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.conv_module1.balancer1.prob, batch_count=842253.3333333334, ans=0.125 2023-10-02 10:09:55,564 WARNING [train.py:1204] (2/4) Exclude cut with ID _23514_210_1389_1_1531832139785_3031439_335-48210-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 10:10:00,975 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_309-65177-0_sp1.1 from training. Duration: 0.8 2023-10-02 10:10:02,303 WARNING [train.py:1204] (2/4) Exclude cut with ID _14368_210_4220_1_1530532714870_3595529_231-137892-0 from training. Duration: 0.99 2023-10-02 10:10:05,020 WARNING [train.py:1204] (2/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_57-270037-0 from training. Duration: 0.77 2023-10-02 10:10:05,029 WARNING [train.py:1204] (2/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_465-214726-0_sp1.1 from training. Duration: 0.7818125 2023-10-02 10:10:05,183 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.feed_forward3.hidden_balancer.prob, batch_count=842320.0, ans=0.125 2023-10-02 10:10:06,431 WARNING [train.py:1204] (2/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_106-300985-0 from training. Duration: 0.9 2023-10-02 10:10:06,442 WARNING [train.py:1204] (2/4) Exclude cut with ID _14632_210_9988_1_1530691279392_3923200_161-129530-0_sp1.1 from training. Duration: 0.87275 2023-10-02 10:10:06,453 WARNING [train.py:1204] (2/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_514-88370-0_sp1.1 from training. Duration: 0.963625 2023-10-02 10:10:09,490 WARNING [train.py:1204] (2/4) Exclude cut with ID _56495_210_4234_1_1534748080334_4136219_305-334505-0_sp1.1 from training. Duration: 0.92725 2023-10-02 10:10:10,873 WARNING [train.py:1204] (2/4) Exclude cut with ID _42875_210_14856_1_1533704397487_4012433_61-264914-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 10:10:14,993 WARNING [train.py:1204] (2/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_91-148831-0 from training. Duration: 0.99 2023-10-02 10:10:17,764 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_435-313305-0_sp0.9 from training. Duration: 0.788875 2023-10-02 10:10:20,989 WARNING [train.py:1204] (2/4) Exclude cut with ID _16744_210_5986_1_1531724405384_3907567_242-111316-0_sp1.1 from training. Duration: 0.8454375 2023-10-02 10:10:21,071 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_257-20733-0 from training. Duration: 0.76 2023-10-02 10:10:23,071 WARNING [train.py:1204] (2/4) Exclude cut with ID _8064_210_5399_1_1528372299291_4282973_4-91037-0_sp1.1 from training. Duration: 0.8 2023-10-02 10:10:24,977 WARNING [train.py:1204] (2/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_3-276701-0 from training. Duration: 0.91 2023-10-02 10:10:26,379 WARNING [train.py:1204] (2/4) Exclude cut with ID _3992_210_1747_1_1526644816031_3731060_36-147855-0_sp1.1 from training. Duration: 0.7090625 2023-10-02 10:10:26,458 WARNING [train.py:1204] (2/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_1296-298671-0_sp1.1 from training. Duration: 0.963625 2023-10-02 10:10:27,873 WARNING [train.py:1204] (2/4) Exclude cut with ID _68965_210_1883_1_1535698962272_3497490_309-334593-0_sp1.1 from training. Duration: 0.92725 2023-10-02 10:10:29,360 WARNING [train.py:1204] (2/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_128-296581-0 from training. Duration: 0.74 2023-10-02 10:10:29,360 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_17613_210_2151_1_1531137569_2366160_170-299079-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 10:10:29,369 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6287_210_3273_1_1527509506_2653716_182-199887-0_sp1.1 from training. Duration: 0.463625 2023-10-02 10:10:30,921 WARNING [train.py:1204] (2/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_80-135200-0_sp1.1 from training. Duration: 0.7181875 2023-10-02 10:10:33,548 WARNING [train.py:1204] (2/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_34-183685-0 from training. Duration: 0.98 2023-10-02 10:10:33,576 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8278_210_2550_1_1528637099_4220413_418-210155-0_sp1.1 from training. Duration: 0.92725 2023-10-02 10:10:34,817 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8853_210_3273_1_1528633899_818968_51-187685-0_sp0.9 from training. Duration: 0.7666875 2023-10-02 10:10:34,838 WARNING [train.py:1204] (2/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_332-221057-0_sp1.1 from training. Duration: 0.6181875 2023-10-02 10:10:34,917 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_2221_210_1988_1_1525597051_4935541_415-296882-0 from training. Duration: 0.77 2023-10-02 10:10:34,946 WARNING [train.py:1204] (2/4) Exclude cut with ID _33640_210_2655_1_1533117605479_4855469_770-278185-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 10:10:35,262 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.ff2_skip_rate, batch_count=842453.3333333334, ans=0.0 2023-10-02 10:10:35,264 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.conv_module2.balancer1.max_abs, batch_count=842453.3333333334, ans=10.0 2023-10-02 10:10:36,315 WARNING [train.py:1204] (2/4) Exclude cut with ID _5287_210_2084_1_1527418883040_3878670_333-48508-0_sp1.1 from training. Duration: 0.5454375 2023-10-02 10:10:36,378 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_43802_210_2290_1_1533516203_8829504_142-276839-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 10:10:39,111 WARNING [train.py:1204] (2/4) Exclude cut with ID _28394_210_8913_1_1532430049518_3650118_126-139371-0_sp1.1 from training. Duration: 0.92725 2023-10-02 10:10:39,133 WARNING [train.py:1204] (2/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_76-203867-0 from training. Duration: 0.72 2023-10-02 10:10:40,488 WARNING [train.py:1204] (2/4) Exclude cut with ID _6194_210_1534_1_1527511826160_3941194_14-282876-0_sp0.9 from training. Duration: 0.788875 2023-10-02 10:10:45,058 INFO [train.py:1046] (2/4) Epoch 24, batch 4200, loss[loss=0.1846, simple_loss=0.27, pruned_loss=0.04959, over 24062.00 frames. ], tot_loss[loss=0.171, simple_loss=0.2474, pruned_loss=0.04731, over 4720724.88 frames. ], batch size: 86, lr: 4.25e-03, grad_scale: 8.0 2023-10-02 10:10:45,147 WARNING [train.py:1204] (2/4) Exclude cut with ID _35355_210_16040_1_1533212988977_4016229_74-190076-0_sp1.1 from training. Duration: 0.72725 2023-10-02 10:10:47,900 WARNING [train.py:1204] (2/4) Exclude cut with ID _33920_210_4220_1_1533257851500_3084399_198-334063-0 from training. Duration: 0.94 2023-10-02 10:10:49,334 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6291_210_3273_1_1527683649_1078498_4-266348-0_sp0.9 from training. Duration: 0.8666875 2023-10-02 10:10:51,282 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_23856_210_4238_1_1531994785_3087934_122-38075-0_sp1.1 from training. Duration: 0.936375 2023-10-02 10:10:52,810 WARNING [train.py:1204] (2/4) Exclude cut with ID _4697_210_2069_1_1526815801953_3823098_276-96593-0_sp0.9 from training. Duration: 0.8555625 2023-10-02 10:10:54,091 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_308-278734-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 10:10:54,093 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_5190_210_4557_1_1527245078_2965646_353-250603-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 10:10:56,092 WARNING [train.py:1204] (2/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_472-216663-0 from training. Duration: 0.92 2023-10-02 10:10:57,546 WARNING [train.py:1204] (2/4) Exclude cut with ID _7537_210_2084_1_1528502395431_3924359_14-312906-0 from training. Duration: 0.84 2023-10-02 10:10:58,879 WARNING [train.py:1204] (2/4) Exclude cut with ID _29003_210_19596_1_1532507391720_3894098_559-236660-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 10:11:01,501 WARNING [train.py:1204] (2/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_312-204798-0_sp1.1 from training. Duration: 0.8454375 2023-10-02 10:11:04,276 WARNING [train.py:1204] (2/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_17-309070-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 10:11:07,150 WARNING [train.py:1204] (2/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_344-152526-0_sp0.9 from training. Duration: 0.588875 2023-10-02 10:11:08,455 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_41452_210_7291_1_1533437687_3941281_405-11559-0_sp1.1 from training. Duration: 0.82725 2023-10-02 10:11:09,738 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_48730_210_5399_1_1533885886_2212769_157-124052-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 10:11:09,788 WARNING [train.py:1204] (2/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_415-78792-0 from training. Duration: 0.81 2023-10-02 10:11:09,792 WARNING [train.py:1204] (2/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_345-340690-0_sp1.1 from training. Duration: 0.8454375 2023-10-02 10:11:11,145 WARNING [train.py:1204] (2/4) Exclude cut with ID _23825_210_13449_1_1531915313526_3497150_124-8138-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 10:11:12,453 WARNING [train.py:1204] (2/4) Exclude cut with ID _49258_210_7994_1_1534122017863_3738861_194-24864-0_sp1.1 from training. Duration: 0.936375 2023-10-02 10:11:12,471 WARNING [train.py:1204] (2/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_369-222060-0_sp1.1 from training. Duration: 0.6454375 2023-10-02 10:11:13,815 WARNING [train.py:1204] (2/4) Exclude cut with ID _12369_210_4381_1_1530356078581_4388520_480-13146-0_sp0.9 from training. Duration: 0.9444375 2023-10-02 10:11:15,905 WARNING [train.py:1204] (2/4) Exclude cut with ID _13253_210_1925_1_1530757775819_3577873_191-144977-0 from training. Duration: 0.78 2023-10-02 10:11:17,526 WARNING [train.py:1204] (2/4) Exclude cut with ID _30304_210_6319_1_1532476827261_7582060_1128-140266-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 10:11:21,046 WARNING [train.py:1204] (2/4) Exclude cut with ID _8772_210_1850_1_1528635595972_3664011_247-336448-0_sp0.9 from training. Duration: 0.82225 2023-10-02 10:11:21,128 WARNING [train.py:1204] (2/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_318-59097-0_sp1.1 from training. Duration: 0.6909375 2023-10-02 10:11:24,370 WARNING [train.py:1204] (2/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_540-131164-0_sp1.1 from training. Duration: 0.836375 2023-10-02 10:11:24,468 WARNING [train.py:1204] (2/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_39-110836-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 10:11:27,574 WARNING [train.py:1204] (2/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_860-96198-0_sp1.1 from training. Duration: 0.663625 2023-10-02 10:11:27,581 WARNING [train.py:1204] (2/4) Exclude cut with ID _15133_210_6228_1_1530794512074_3080379_29-264458-0 from training. Duration: 0.58 2023-10-02 10:11:27,599 WARNING [train.py:1204] (2/4) Exclude cut with ID _55209_210_1531_1_1534675828996_4597160_797-348343-0_sp1.1 from training. Duration: 0.963625 2023-10-02 10:11:27,844 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.feed_forward3.hidden_balancer.prob, batch_count=842653.3333333334, ans=0.125 2023-10-02 10:11:28,983 WARNING [train.py:1204] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_23-189234-0_sp1.1 from training. Duration: 0.7545625 2023-10-02 10:11:33,437 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.conv_module2.balancer1.prob, batch_count=842720.0, ans=0.125 2023-10-02 10:11:34,513 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.468e+02 1.872e+02 2.085e+02 2.351e+02 3.667e+02, threshold=4.171e+02, percent-clipped=0.0 2023-10-02 10:11:34,579 WARNING [train.py:1204] (2/4) Exclude cut with ID _5410_210_1388_1_1527397055795_1719020_39-112221-0_sp1.1 from training. Duration: 0.62725 2023-10-02 10:11:36,127 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_43802_210_2290_1_1533516203_8829504_100-348441-0_sp1.1 from training. Duration: 0.82725 2023-10-02 10:11:40,386 WARNING [train.py:1204] (2/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_143-86344-0_sp1.1 from training. Duration: 0.7 2023-10-02 10:11:43,023 WARNING [train.py:1204] (2/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_126-29104-0 from training. Duration: 0.85 2023-10-02 10:11:46,279 WARNING [train.py:1204] (2/4) Exclude cut with ID _39846_210_12651_1_1533603651992_1318030_138-31771-0_sp1.1 from training. Duration: 0.963625 2023-10-02 10:11:46,531 INFO [scaling.py:1118] (2/4) WithLoss: name=encoder.encoders.4.encoder.layers.2.self_attn_weights, loss-sum=0.000e+00 2023-10-02 10:11:50,984 WARNING [train.py:1204] (2/4) Exclude cut with ID _2220_210_1819_1_1525509109898_3658680_50-99837-0_sp1.1 from training. Duration: 0.4909375 2023-10-02 10:11:52,935 WARNING [train.py:1204] (2/4) Exclude cut with ID _6274_210_2084_1_1527937583837_7494020_836-210550-0_sp1.1 from training. Duration: 0.97275 2023-10-02 10:11:54,322 WARNING [train.py:1204] (2/4) Exclude cut with ID _28323_210_16187_1_1532480395667_4100300_306-74561-0 from training. Duration: 0.94 2023-10-02 10:11:54,574 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.1.conv_module2.balancer2.prob, batch_count=842786.6666666666, ans=0.125 2023-10-02 10:11:54,575 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.1.feed_forward1.out_proj.dropout_p, batch_count=842786.6666666666, ans=0.1 2023-10-02 10:12:00,371 INFO [train.py:1046] (2/4) Epoch 24, batch 4250, loss[loss=0.1651, simple_loss=0.2575, pruned_loss=0.03629, over 24543.00 frames. ], tot_loss[loss=0.1704, simple_loss=0.247, pruned_loss=0.04686, over 4725063.28 frames. ], batch size: 71, lr: 4.25e-03, grad_scale: 8.0 2023-10-02 10:12:00,460 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_177-58005-0_sp1.1 from training. Duration: 0.563625 2023-10-02 10:12:04,591 WARNING [train.py:1204] (2/4) Exclude cut with ID _54871_210_22356_1_1534665798253_3856359_19-342632-0_sp1.1 from training. Duration: 0.82725 2023-10-02 10:12:04,613 WARNING [train.py:1204] (2/4) Exclude cut with ID _35355_210_16040_1_1533212988977_4016229_74-190076-0_sp0.9 from training. Duration: 0.888875 2023-10-02 10:12:07,337 WARNING [train.py:1204] (2/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_285-97786-0_sp1.1 from training. Duration: 0.97275 2023-10-02 10:12:10,330 WARNING [train.py:1204] (2/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_337-181502-0_sp0.9 from training. Duration: 0.72225 2023-10-02 10:12:11,670 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_353-182855-0 from training. Duration: 0.96 2023-10-02 10:12:11,707 WARNING [train.py:1204] (2/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_409-327730-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 10:12:14,355 WARNING [train.py:1204] (2/4) Exclude cut with ID _8030_210_7497_1_1528444886781_3531089_373-19403-0_sp1.1 from training. Duration: 0.97275 2023-10-02 10:12:17,044 WARNING [train.py:1204] (2/4) Exclude cut with ID _6789_210_1534_1_1527857937218_3118950_231-340237-0_sp1.1 from training. Duration: 0.936375 2023-10-02 10:12:23,200 WARNING [train.py:1204] (2/4) Exclude cut with ID _6493_210_1747_1_1528459210424_4012470_536-57832-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 10:12:23,213 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7046_210_6753_1_1527895337_6920917_756-116002-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 10:12:24,638 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_37152_210_9697_1_1533106573_3844188_29-239070-0_sp1.1 from training. Duration: 0.8909375 2023-10-02 10:12:24,641 WARNING [train.py:1204] (2/4) Exclude cut with ID _19023_210_4353_1_1531378892552_5820780_61-334539-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 10:12:27,280 WARNING [train.py:1204] (2/4) Exclude cut with ID _14170_210_5219_1_1530525949868_4469650_159-28488-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 10:12:29,264 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_40896_210_15710_1_1533433956_7100802_764-71272-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 10:12:30,636 WARNING [train.py:1204] (2/4) Exclude cut with ID _44902_210_16187_1_1533888006062_3792078_258-178274-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 10:12:33,362 WARNING [train.py:1204] (2/4) Exclude cut with ID _7537_210_2084_1_1528502395431_3924359_14-312906-0_sp1.1 from training. Duration: 0.763625 2023-10-02 10:12:33,462 WARNING [train.py:1204] (2/4) Exclude cut with ID _49643_210_7989_1_1534640244224_3939031_674-116639-0_sp1.1 from training. Duration: 0.963625 2023-10-02 10:12:34,859 WARNING [train.py:1204] (2/4) Exclude cut with ID _22826_210_9994_1_1531908201522_3279169_415-161-0 from training. Duration: 0.98 2023-10-02 10:12:37,777 WARNING [train.py:1204] (2/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_264-348896-0 from training. Duration: 0.85 2023-10-02 10:12:37,791 WARNING [train.py:1204] (2/4) Exclude cut with ID _51899_210_20034_1_1534129254466_3669220_233-161492-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 10:12:39,089 WARNING [train.py:1204] (2/4) Exclude cut with ID _31255_210_13427_1_1532865619495_3815143_233-200023-0_sp1.1 from training. Duration: 0.936375 2023-10-02 10:12:39,105 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_23850_210_4238_1_1532232597_1632433_100-229879-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 10:12:40,434 WARNING [train.py:1204] (2/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_375-245677-0_sp0.9 from training. Duration: 0.9 2023-10-02 10:12:40,438 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_40223_210_6228_1_1533298404_4812267_355-317415-0_sp1.1 from training. Duration: 0.97275 2023-10-02 10:12:41,804 WARNING [train.py:1204] (2/4) Exclude cut with ID _9679_210_7497_1_1529129212906_4363929_235-81488-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 10:12:43,340 WARNING [train.py:1204] (2/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_172-215932-0_sp1.1 from training. Duration: 0.6 2023-10-02 10:12:43,981 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.3.encoder.layers.2.feed_forward3.out_whiten, num_groups=1, num_channels=512, metric=7.83 vs. limit=15.0 2023-10-02 10:12:44,690 WARNING [train.py:1204] (2/4) Exclude cut with ID _12369_210_4381_1_1530356078581_4388520_64-298917-0_sp1.1 from training. Duration: 0.8 2023-10-02 10:12:48,768 WARNING [train.py:1204] (2/4) Exclude cut with ID _38020_210_5986_1_1533112252259_7006089_131-53533-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 10:12:52,495 WARNING [train.py:1204] (2/4) Exclude cut with ID _42334_210_7770_1_1534206610396_3605880_392-48154-0_sp1.1 from training. Duration: 0.963625 2023-10-02 10:12:52,541 WARNING [train.py:1204] (2/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_337-181502-0 from training. Duration: 0.65 2023-10-02 10:12:52,555 WARNING [train.py:1204] (2/4) Exclude cut with ID _6267_210_2084_1_1527764658526_3894730_375-219887-0_sp0.9 from training. Duration: 0.7444375 2023-10-02 10:12:54,338 WARNING [train.py:1204] (2/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_60-146189-0 from training. Duration: 0.98 2023-10-02 10:12:55,632 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_3133_210_2087_1_1526106446_2324690_243-143119-0_sp1.1 from training. Duration: 0.7 2023-10-02 10:12:55,744 WARNING [train.py:1204] (2/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_1267-272693-0_sp0.9 from training. Duration: 0.811125 2023-10-02 10:12:57,169 WARNING [train.py:1204] (2/4) Exclude cut with ID _69884_210_9771_1_1535846392818_4166169_83-182645-0_sp1.1 from training. Duration: 0.97275 2023-10-02 10:12:58,440 WARNING [train.py:1204] (2/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_403-292260-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 10:13:00,527 WARNING [train.py:1204] (2/4) Exclude cut with ID _55429_210_10626_1_1534467588527_7174143_377-169391-0 from training. Duration: 0.96 2023-10-02 10:13:02,420 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.4.encoder.layers.0.conv_module1.whiten, num_groups=1, num_channels=384, metric=4.58 vs. limit=15.0 2023-10-02 10:13:03,137 WARNING [train.py:1204] (2/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_494-199538-0_sp1.1 from training. Duration: 0.6818125 2023-10-02 10:13:03,189 WARNING [train.py:1204] (2/4) Exclude cut with ID _3067_210_2667_1_1526212974640_4503168_119-175776-0_sp1.1 from training. Duration: 0.67275 2023-10-02 10:13:06,060 WARNING [train.py:1204] (2/4) Exclude cut with ID _3231_210_2106_1_1526205864934_6784220_183-303839-0_sp1.1 from training. Duration: 0.97275 2023-10-02 10:13:10,169 WARNING [train.py:1204] (2/4) Exclude cut with ID _18513_210_9206_1_1531618215223_3862620_165-38958-0_sp1.1 from training. Duration: 0.963625 2023-10-02 10:13:10,258 WARNING [train.py:1204] (2/4) Exclude cut with ID _41314_210_12651_1_1533459476543_3766460_87-272068-0_sp1.1 from training. Duration: 0.7545625 2023-10-02 10:13:11,584 WARNING [train.py:1204] (2/4) Exclude cut with ID _3541_210_2477_1_1526475408709_3263973_353-95-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 10:13:12,891 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_2009_210_2071_1_1525481134_4588937_73-302813-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 10:13:14,168 INFO [train.py:1046] (2/4) Epoch 24, batch 4300, loss[loss=0.1672, simple_loss=0.2369, pruned_loss=0.04876, over 23319.00 frames. ], tot_loss[loss=0.1695, simple_loss=0.2465, pruned_loss=0.04626, over 4727649.83 frames. ], batch size: 119, lr: 4.25e-03, grad_scale: 8.0 2023-10-02 10:13:14,267 WARNING [train.py:1204] (2/4) Exclude cut with ID _64220_210_9527_1_1535250151772_4875259_88-95011-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 10:13:14,311 WARNING [train.py:1204] (2/4) Exclude cut with ID _44902_210_16187_1_1533888006062_3792078_160-12107-0_sp1.1 from training. Duration: 0.92725 2023-10-02 10:13:14,324 WARNING [train.py:1204] (2/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_547-5424-0 from training. Duration: 0.94 2023-10-02 10:13:14,484 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.bypass.scale_min, batch_count=843186.6666666666, ans=0.2 2023-10-02 10:13:15,765 WARNING [train.py:1204] (2/4) Exclude cut with ID _21783_210_11357_1_1531742458730_3762679_268-85656-0_sp1.1 from training. Duration: 0.936375 2023-10-02 10:13:20,696 WARNING [train.py:1204] (2/4) Exclude cut with ID _66210_210_12651_1_1535446812922_3365850_42-284985-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 10:13:20,724 WARNING [train.py:1204] (2/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_465-214726-0_sp0.9 from training. Duration: 0.9555625 2023-10-02 10:13:26,808 WARNING [train.py:1204] (2/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_50-95770-0_sp1.1 from training. Duration: 0.936375 2023-10-02 10:13:27,417 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.4.encoder.layers.2.conv_module1.whiten, num_groups=1, num_channels=384, metric=2.57 vs. limit=15.0 2023-10-02 10:13:31,249 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.bypass_mid.scale_min, batch_count=843253.3333333334, ans=0.2 2023-10-02 10:13:34,238 WARNING [train.py:1204] (2/4) Exclude cut with ID _30800_210_22382_1_1532499347904_4515959_269-77567-0_sp1.1 from training. Duration: 0.963625 2023-10-02 10:13:34,240 WARNING [train.py:1204] (2/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_7-111954-0 from training. Duration: 0.56 2023-10-02 10:13:34,331 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_43802_210_2290_1_1533516203_8829504_85-195240-0_sp0.9 from training. Duration: 0.9666875 2023-10-02 10:13:37,134 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_2822_210_2715_1_1526112586_1999587_165-36656-0_sp0.9 from training. Duration: 0.988875 2023-10-02 10:13:37,155 WARNING [train.py:1204] (2/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_181-78632-0_sp1.1 from training. Duration: 0.7090625 2023-10-02 10:13:38,415 WARNING [train.py:1204] (2/4) Exclude cut with ID IC0671W0044-331887 from training. Duration: 0.896 2023-10-02 10:13:41,107 WARNING [train.py:1204] (2/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_266-137104-0_sp1.1 from training. Duration: 0.5909375 2023-10-02 10:13:42,566 WARNING [train.py:1204] (2/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_137-116442-0_sp0.9 from training. Duration: 0.8666875 2023-10-02 10:13:44,084 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_23801_210_16223_1_1532066392_3294657_570-5439-0 from training. Duration: 0.97 2023-10-02 10:13:44,104 WARNING [train.py:1204] (2/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_29-280622-0_sp1.1 from training. Duration: 0.7454375 2023-10-02 10:13:45,402 WARNING [train.py:1204] (2/4) Exclude cut with ID 3557-8342-0013-54691-0 from training. Duration: 0.92 2023-10-02 10:13:46,824 WARNING [train.py:1204] (2/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_711-15688-0_sp1.1 from training. Duration: 0.3909375 2023-10-02 10:13:49,937 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_15126_210_6753_1_1530778677_2591185_120-277707-0_sp1.1 from training. Duration: 0.72725 2023-10-02 10:13:53,304 WARNING [train.py:1204] (2/4) Exclude cut with ID _14970_210_15709_1_1530775547361_1966965_8-169924-0_sp1.1 from training. Duration: 0.836375 2023-10-02 10:13:53,305 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_4692_210_3528_1_1526899755_4323254_107-44401-0_sp0.9 from training. Duration: 0.9555625 2023-10-02 10:13:53,580 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.ff3_skip_rate, batch_count=843320.0, ans=0.0 2023-10-02 10:13:54,641 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_3475_210_2028_1_1526193578_3622569_265-276625-0_sp0.9 from training. Duration: 0.8444375 2023-10-02 10:13:56,041 WARNING [train.py:1204] (2/4) Exclude cut with ID _55153_210_2253_1_1534384786677_3162096_148-79022-0_sp1.1 from training. Duration: 0.87275 2023-10-02 10:13:56,119 WARNING [train.py:1204] (2/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_292-36336-0_sp1.1 from training. Duration: 0.92725 2023-10-02 10:13:56,127 WARNING [train.py:1204] (2/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_329-82235-0 from training. Duration: 0.82 2023-10-02 10:13:57,998 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_11305_210_1794_1_1529750428_7178148_257-312870-0 from training. Duration: 0.88 2023-10-02 10:14:00,619 WARNING [train.py:1204] (2/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_436-4493-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 10:14:02,517 WARNING [train.py:1204] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_321-54129-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 10:14:02,523 WARNING [train.py:1204] (2/4) Exclude cut with ID _5287_210_2084_1_1527418883040_3878670_333-48508-0_sp0.9 from training. Duration: 0.6666875 2023-10-02 10:14:02,536 WARNING [train.py:1204] (2/4) Exclude cut with ID _26528_210_9070_1_1532570410842_3851310_244-278158-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 10:14:02,570 WARNING [train.py:1204] (2/4) Exclude cut with ID _46952_210_5986_1_1534230010357_3107379_232-344503-0_sp1.1 from training. Duration: 0.87275 2023-10-02 10:14:03,757 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.484e+02 1.778e+02 1.987e+02 2.231e+02 3.215e+02, threshold=3.973e+02, percent-clipped=0.0 2023-10-02 10:14:03,836 WARNING [train.py:1204] (2/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_43-206757-0 from training. Duration: 0.75 2023-10-02 10:14:03,839 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_4183_210_2087_1_1526729100_1869838_141-166600-0 from training. Duration: 0.95 2023-10-02 10:14:03,901 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_296-287678-0 from training. Duration: 0.95 2023-10-02 10:14:05,382 WARNING [train.py:1204] (2/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_265-60343-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 10:14:05,407 WARNING [train.py:1204] (2/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_332-256168-0 from training. Duration: 0.75 2023-10-02 10:14:06,728 WARNING [train.py:1204] (2/4) Exclude cut with ID _13679_210_7497_1_1530534293662_3355098_411-186088-0 from training. Duration: 0.94 2023-10-02 10:14:09,355 WARNING [train.py:1204] (2/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_458-126779-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 10:14:10,051 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.3.encoder.layers.3.feed_forward1.out_whiten, num_groups=1, num_channels=512, metric=6.50 vs. limit=15.0 2023-10-02 10:14:10,835 WARNING [train.py:1204] (2/4) Exclude cut with ID IC0803W0380-397572 from training. Duration: 0.032 2023-10-02 10:14:10,883 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_278-272612-0_sp1.1 from training. Duration: 0.663625 2023-10-02 10:14:12,323 WARNING [train.py:1204] (2/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_491-263101-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 10:14:12,334 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_53555_210_5632_1_1534595220_7232297_334-336778-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 10:14:14,902 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_50-3289-0 from training. Duration: 0.91 2023-10-02 10:14:14,963 WARNING [train.py:1204] (2/4) Exclude cut with ID _8699_210_2069_1_1528630346831_3960409_155-39766-0_sp0.9 from training. Duration: 0.8666875 2023-10-02 10:14:16,214 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_502-82145-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 10:14:16,260 WARNING [train.py:1204] (2/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_550-43132-0_sp1.1 from training. Duration: 0.97275 2023-10-02 10:14:16,295 WARNING [train.py:1204] (2/4) Exclude cut with ID _14998_210_5219_1_1530705582080_7474970_446-112117-0_sp1.1 from training. Duration: 0.8181875 2023-10-02 10:14:17,625 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_3691_210_3528_1_1526468169_4062053_355-334432-0_sp1.1 from training. Duration: 0.7909375 2023-10-02 10:14:22,468 WARNING [train.py:1204] (2/4) Exclude cut with ID _58605_210_12210_1_1534723195939_3904259_239-94720-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 10:14:23,326 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.whiten.whitening_limit, batch_count=843453.3333333334, ans=12.0 2023-10-02 10:14:23,926 WARNING [train.py:1204] (2/4) Exclude cut with ID _49376_210_5986_1_1533985278732_3370740_391-344072-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 10:14:25,284 WARNING [train.py:1204] (2/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_558-322014-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 10:14:25,322 WARNING [train.py:1204] (2/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_256-32662-0_sp1.1 from training. Duration: 0.8181875 2023-10-02 10:14:28,389 INFO [train.py:1046] (2/4) Epoch 24, batch 4350, loss[loss=0.1571, simple_loss=0.2375, pruned_loss=0.03836, over 24570.00 frames. ], tot_loss[loss=0.1706, simple_loss=0.2478, pruned_loss=0.04667, over 4722161.44 frames. ], batch size: 60, lr: 4.25e-03, grad_scale: 4.0 2023-10-02 10:14:31,810 WARNING [train.py:1204] (2/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_572-185150-0 from training. Duration: 0.97 2023-10-02 10:14:32,921 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_89-35748-0_sp0.9 from training. Duration: 0.87775 2023-10-02 10:14:37,255 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6291_210_3273_1_1527683649_1078498_88-66747-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 10:14:38,722 WARNING [train.py:1204] (2/4) Exclude cut with ID _19504_210_9994_1_1531648863242_3325220_357-237029-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 10:14:40,253 WARNING [train.py:1204] (2/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_302-2550-0_sp1.1 from training. Duration: 0.736375 2023-10-02 10:14:40,253 WARNING [train.py:1204] (2/4) Exclude cut with ID _9461_210_4557_1_1529060514726_3677451_59-194017-0_sp1.1 from training. Duration: 0.963625 2023-10-02 10:14:45,614 WARNING [train.py:1204] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_665-196155-0_sp1.1 from training. Duration: 0.6090625 2023-10-02 10:14:47,224 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.conv_module1.balancer1.prob, batch_count=843586.6666666666, ans=0.125 2023-10-02 10:14:48,302 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_326-315007-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 10:14:49,768 WARNING [train.py:1204] (2/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_105-328174-0_sp1.1 from training. Duration: 0.6909375 2023-10-02 10:14:49,775 WARNING [train.py:1204] (2/4) Exclude cut with ID _56494_210_4220_1_1535284837625_3033169_106-263722-0_sp1.1 from training. Duration: 0.97275 2023-10-02 10:14:51,868 WARNING [train.py:1204] (2/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_770-200430-0_sp0.9 from training. Duration: 0.988875 2023-10-02 10:14:53,848 WARNING [train.py:1204] (2/4) Exclude cut with ID _65640_210_6310_1_1535368201445_3173250_209-13008-0_sp1.1 from training. Duration: 0.9 2023-10-02 10:14:55,237 WARNING [train.py:1204] (2/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_212-149947-0_sp0.9 from training. Duration: 0.811125 2023-10-02 10:15:01,771 WARNING [train.py:1204] (2/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_188-171673-0 from training. Duration: 0.58 2023-10-02 10:15:01,845 WARNING [train.py:1204] (2/4) Exclude cut with ID _58895_210_8750_1_1535184081320_3649089_286-281724-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 10:15:03,216 WARNING [train.py:1204] (2/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_16-254909-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 10:15:07,417 WARNING [train.py:1204] (2/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_52-95655-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 10:15:08,849 WARNING [train.py:1204] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_554-45617-0 from training. Duration: 0.99 2023-10-02 10:15:11,540 WARNING [train.py:1204] (2/4) Exclude cut with ID _14683_210_5219_1_1530698352608_3974859_473-344611-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 10:15:12,924 WARNING [train.py:1204] (2/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_17-25045-0_sp0.9 from training. Duration: 0.6555625 2023-10-02 10:15:17,136 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9072W0003-530637 from training. Duration: 0.768 2023-10-02 10:15:18,583 WARNING [train.py:1204] (2/4) Exclude cut with ID _38670_210_15379_1_1533448810659_4391059_76-74025-0_sp1.1 from training. Duration: 0.936375 2023-10-02 10:15:18,618 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6291_210_3273_1_1527683649_1078498_74-292608-0_sp0.9 from training. Duration: 0.8 2023-10-02 10:15:20,343 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9084W0025-536862 from training. Duration: 0.896 2023-10-02 10:15:20,404 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9087W0021-538392 from training. Duration: 0.768 2023-10-02 10:15:20,409 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7543_210_6753_1_1528543323_645515_56-163234-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 10:15:21,671 WARNING [train.py:1204] (2/4) Exclude cut with ID _45560_210_21982_1_1533812603951_3584999_44-332643-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 10:15:21,741 WARNING [train.py:1204] (2/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_1285-256622-0_sp1.1 from training. Duration: 0.763625 2023-10-02 10:15:23,016 WARNING [train.py:1204] (2/4) Exclude cut with ID _52781_210_16187_1_1534297729099_1118149_141-325632-0_sp1.1 from training. Duration: 0.936375 2023-10-02 10:15:23,078 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_40896_210_15710_1_1533433956_7100802_1051-137631-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 10:15:23,113 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_13091_210_7925_1_1530609447_8987354_233-18333-0_sp1.1 from training. Duration: 0.97275 2023-10-02 10:15:25,921 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_34008_210_10431_1_1533345424_8183247_662-317331-0 from training. Duration: 0.94 2023-10-02 10:15:25,927 WARNING [train.py:1204] (2/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_291-248920-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 10:15:25,928 WARNING [train.py:1204] (2/4) Exclude cut with ID _65263_210_22356_1_1535763598691_3921037_274-288481-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 10:15:27,785 WARNING [train.py:1204] (2/4) Exclude cut with ID _5189_210_4557_1_1527248319428_3769229_233-168080-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 10:15:27,842 WARNING [train.py:1204] (2/4) Exclude cut with ID _54871_210_22356_1_1534665798253_3856359_135-184281-0 from training. Duration: 0.99 2023-10-02 10:15:28,579 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9123W0012-555972 from training. Duration: 0.896 2023-10-02 10:15:28,583 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9123W0118-556077 from training. Duration: 0.896 2023-10-02 10:15:29,890 WARNING [train.py:1204] (2/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_406-275649-0 from training. Duration: 0.42 2023-10-02 10:15:32,607 WARNING [train.py:1204] (2/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_309-13684-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 10:15:32,636 WARNING [train.py:1204] (2/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_129-337802-0_sp1.1 from training. Duration: 0.6545625 2023-10-02 10:15:33,957 WARNING [train.py:1204] (2/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_649-266825-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 10:15:34,014 WARNING [train.py:1204] (2/4) Exclude cut with ID _40519_210_13794_1_1533715228655_7337390_895-224742-0_sp1.1 from training. Duration: 0.92725 2023-10-02 10:15:36,675 WARNING [train.py:1204] (2/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_234-14174-0 from training. Duration: 0.82 2023-10-02 10:15:38,102 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9156W0009-572502 from training. Duration: 0.768 2023-10-02 10:15:39,297 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_424-245529-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 10:15:41,939 INFO [train.py:1046] (2/4) Epoch 24, batch 4400, loss[loss=0.1499, simple_loss=0.2329, pruned_loss=0.03351, over 24510.00 frames. ], tot_loss[loss=0.1715, simple_loss=0.2489, pruned_loss=0.04707, over 4715556.08 frames. ], batch size: 63, lr: 4.24e-03, grad_scale: 8.0 2023-10-02 10:15:42,013 WARNING [train.py:1204] (2/4) Exclude cut with ID _26528_210_9070_1_1532570410842_3851310_232-10149-0_sp1.1 from training. Duration: 0.963625 2023-10-02 10:15:42,021 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_17292_210_6753_1_1531125388_2943734_152-30410-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 10:15:43,503 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_35441_210_16554_1_1533347572_4092680_291-82075-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 10:15:44,956 WARNING [train.py:1204] (2/4) Exclude cut with ID _12369_210_4381_1_1530356078581_4388520_64-298917-0 from training. Duration: 0.88 2023-10-02 10:15:44,973 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_5618_210_1874_1_1527311008_5851_1-214501-0_sp0.9 from training. Duration: 0.7655625 2023-10-02 10:15:46,285 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_9460_210_4211_1_1528954587_10049667_572-37035-0 from training. Duration: 0.8 2023-10-02 10:15:46,304 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9193W0023-590322 from training. Duration: 0.896 2023-10-02 10:15:48,187 WARNING [train.py:1204] (2/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_206-10097-0_sp1.1 from training. Duration: 0.4454375 2023-10-02 10:15:48,206 WARNING [train.py:1204] (2/4) Exclude cut with ID _55134_210_21982_1_1534590028377_3882170_17-51731-0_sp1.1 from training. Duration: 0.963625 2023-10-02 10:15:51,450 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_14953_210_2151_1_1530701710_5616165_710-165400-0 from training. Duration: 0.87 2023-10-02 10:15:54,121 WARNING [train.py:1204] (2/4) Exclude cut with ID _53203_210_2087_1_1534246218309_475590_61-85777-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 10:15:55,545 WARNING [train.py:1204] (2/4) Exclude cut with ID _55844_210_2346_1_1534471212569_7625707_1050-10453-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 10:15:55,561 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9218W0013-603297 from training. Duration: 0.896 2023-10-02 10:15:59,618 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_20120_210_6753_1_1531643035_5482859_267-102818-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 10:15:59,619 WARNING [train.py:1204] (2/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_191-41023-0 from training. Duration: 0.71 2023-10-02 10:16:00,904 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9231W0202-610227 from training. Duration: 0.896 2023-10-02 10:16:03,011 WARNING [train.py:1204] (2/4) Exclude cut with ID _6537_210_1883_1_1527987559710_3597129_382-133062-0 from training. Duration: 0.68 2023-10-02 10:16:04,327 WARNING [train.py:1204] (2/4) Exclude cut with ID _28427_210_4220_1_1532581240967_3061069_43-84928-0 from training. Duration: 0.99 2023-10-02 10:16:04,359 WARNING [train.py:1204] (2/4) Exclude cut with ID _59256_210_5986_1_1535076085997_3589120_121-237386-0 from training. Duration: 0.96 2023-10-02 10:16:05,653 WARNING [train.py:1204] (2/4) Exclude cut with ID _8480_210_1874_1_1528589391205_6728330_137-256986-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 10:16:07,024 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_9417_210_1794_1_1529235261_4614131_479-346963-0_sp1.1 from training. Duration: 0.936375 2023-10-02 10:16:07,057 WARNING [train.py:1204] (2/4) Exclude cut with ID _26474_210_9570_1_1532520022699_3623380_267-7362-0_sp1.1 from training. Duration: 0.936375 2023-10-02 10:16:07,122 WARNING [train.py:1204] (2/4) Exclude cut with ID _55328_210_27767_1_1534580973260_4088170_287-61272-0_sp1.1 from training. Duration: 0.97275 2023-10-02 10:16:08,533 WARNING [train.py:1204] (2/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_589-9070-0 from training. Duration: 0.71 2023-10-02 10:16:08,548 WARNING [train.py:1204] (2/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_148-158509-0_sp1.1 from training. Duration: 0.37275 2023-10-02 10:16:09,929 WARNING [train.py:1204] (2/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_164-184548-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 10:16:12,632 WARNING [train.py:1204] (2/4) Exclude cut with ID _14970_210_15709_1_1530775547361_1966965_6-23328-0_sp1.1 from training. Duration: 0.7818125 2023-10-02 10:16:12,638 WARNING [train.py:1204] (2/4) Exclude cut with ID _29303_210_2575_1_1532392237303_7410649_612-165069-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 10:16:14,117 WARNING [train.py:1204] (2/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_383-321224-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 10:16:15,491 WARNING [train.py:1204] (2/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_283-160525-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 10:16:15,517 WARNING [train.py:1204] (2/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_505-97987-0 from training. Duration: 0.86 2023-10-02 10:16:16,934 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9309W0190-640317 from training. Duration: 0.768 2023-10-02 10:16:19,694 WARNING [train.py:1204] (2/4) Exclude cut with ID _50093_210_4220_1_1534417141207_3432299_280-297390-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 10:16:26,367 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_17292_210_6753_1_1531125388_2943734_320-71385-0_sp1.1 from training. Duration: 0.97275 2023-10-02 10:16:29,007 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_213-103282-0 from training. Duration: 0.78 2023-10-02 10:16:30,545 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7615_210_4211_1_1528110965_4580160_72-235211-0_sp0.9 from training. Duration: 0.8444375 2023-10-02 10:16:32,290 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.578e+02 1.852e+02 2.093e+02 2.430e+02 3.908e+02, threshold=4.185e+02, percent-clipped=0.0 2023-10-02 10:16:33,770 WARNING [train.py:1204] (2/4) Exclude cut with ID _14867_210_1968_1_1530684179890_3846969_173-320977-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 10:16:37,013 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7046_210_6753_1_1527895337_6920917_783-278109-0_sp0.9 from training. Duration: 0.8555625 2023-10-02 10:16:38,254 WARNING [train.py:1204] (2/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_90-30673-0 from training. Duration: 0.84 2023-10-02 10:16:38,267 WARNING [train.py:1204] (2/4) Exclude cut with ID _22826_210_9994_1_1531908201522_3279169_415-161-0_sp1.1 from training. Duration: 0.8909375 2023-10-02 10:16:38,277 WARNING [train.py:1204] (2/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_86-66192-0_sp0.9 from training. Duration: 0.611125 2023-10-02 10:16:38,279 WARNING [train.py:1204] (2/4) Exclude cut with ID _4626_210_3532_1_1526817652717_3916859_446-206656-0_sp1.1 from training. Duration: 0.6909375 2023-10-02 10:16:38,334 WARNING [train.py:1204] (2/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_166-128941-0_sp0.9 from training. Duration: 0.6 2023-10-02 10:16:42,511 WARNING [train.py:1204] (2/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_345-167986-0 from training. Duration: 0.84 2023-10-02 10:16:46,414 WARNING [train.py:1204] (2/4) Exclude cut with ID _6275_210_2084_1_1528110062213_7669049_973-198085-0 from training. Duration: 0.75 2023-10-02 10:16:46,492 WARNING [train.py:1204] (2/4) Exclude cut with ID _60057_210_10640_1_1534903065793_3333150_171-181133-0 from training. Duration: 0.88 2023-10-02 10:16:46,512 WARNING [train.py:1204] (2/4) Exclude cut with ID _19504_210_9994_1_1531648863242_3325220_336-196513-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 10:16:47,848 WARNING [train.py:1204] (2/4) Exclude cut with ID _6493_210_1747_1_1528459210424_4012470_513-140967-0 from training. Duration: 0.79 2023-10-02 10:16:47,936 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6673_210_3273_1_1527767790_3173240_327-306185-0_sp0.9 from training. Duration: 0.92225 2023-10-02 10:16:48,111 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.conv_module1.balancer1.prob, batch_count=844120.0, ans=0.125 2023-10-02 10:16:50,918 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.conv_module1.balancer2.prob, batch_count=844120.0, ans=0.125 2023-10-02 10:16:52,027 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1032-236863-0_sp1.1 from training. Duration: 0.763625 2023-10-02 10:16:55,143 INFO [train.py:1046] (2/4) Epoch 24, batch 4450, loss[loss=0.1664, simple_loss=0.2413, pruned_loss=0.04572, over 23651.00 frames. ], tot_loss[loss=0.1719, simple_loss=0.2496, pruned_loss=0.04709, over 4722515.79 frames. ], batch size: 149, lr: 4.24e-03, grad_scale: 8.0 2023-10-02 10:16:55,240 WARNING [train.py:1204] (2/4) Exclude cut with ID _14867_210_1968_1_1530684179890_3846969_413-29292-0 from training. Duration: 0.9 2023-10-02 10:17:00,025 WARNING [train.py:1204] (2/4) Exclude cut with ID _47251_210_18628_1_1533814063128_4137250_186-343322-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 10:17:00,276 INFO [scaling.py:1118] (2/4) WithLoss: name=encoder.encoders.1.encoder.layers.0.self_attn_weights, loss-sum=0.000e+00 2023-10-02 10:17:01,394 WARNING [train.py:1204] (2/4) Exclude cut with ID _56235_210_10640_1_1534557335235_4161099_624-46091-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 10:17:02,744 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_791-197891-0_sp0.9 from training. Duration: 0.9333125 2023-10-02 10:17:10,708 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_50050_210_16223_1_1535435796_7290138_786-288457-0_sp1.1 from training. Duration: 0.92725 2023-10-02 10:17:10,735 WARNING [train.py:1204] (2/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_213-227283-0_sp1.1 from training. Duration: 0.963625 2023-10-02 10:17:14,858 WARNING [train.py:1204] (2/4) Exclude cut with ID _42659_210_2070_1_1533607442765_3120380_58-341121-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 10:17:16,255 WARNING [train.py:1204] (2/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_506-23200-0_sp1.1 from training. Duration: 0.8818125 2023-10-02 10:17:17,744 WARNING [train.py:1204] (2/4) Exclude cut with ID _14170_210_5219_1_1530525949868_4469650_336-213530-0_sp1.1 from training. Duration: 0.8181875 2023-10-02 10:17:17,772 WARNING [train.py:1204] (2/4) Exclude cut with ID _64135_210_2253_1_1535677267876_7608972_180-5312-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 10:17:19,159 WARNING [train.py:1204] (2/4) Exclude cut with ID _12302_210_5219_1_1529924446466_8107280_835-279754-0 from training. Duration: 0.9 2023-10-02 10:17:19,160 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_20120_210_6753_1_1531643035_5482859_510-96996-0_sp1.1 from training. Duration: 0.7909375 2023-10-02 10:17:20,382 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_15517_210_6753_1_1530954008_3487336_216-187834-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 10:17:20,406 WARNING [train.py:1204] (2/4) Exclude cut with ID _19504_210_9994_1_1531648863242_3325220_414-113427-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 10:17:20,407 WARNING [train.py:1204] (2/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_264-108685-0_sp0.9 from training. Duration: 0.611125 2023-10-02 10:17:23,270 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_5438_210_1794_1_1527336687_2600009_104-288274-0_sp0.9 from training. Duration: 0.6444375 2023-10-02 10:17:27,703 WARNING [train.py:1204] (2/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_520-39496-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 10:17:29,107 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_37818_210_4238_1_1533620910_4720083_315-308565-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 10:17:31,019 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_2009_210_2071_1_1525481134_4588937_385-302100-0_sp1.1 from training. Duration: 0.7909375 2023-10-02 10:17:31,060 WARNING [train.py:1204] (2/4) Exclude cut with ID _58895_210_8750_1_1535184081320_3649089_277-53219-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 10:17:31,149 WARNING [train.py:1204] (2/4) Exclude cut with ID _26664_210_7770_1_1532258943280_3418270_121-111258-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 10:17:35,897 WARNING [train.py:1204] (2/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_99-167392-0_sp1.1 from training. Duration: 0.5545625 2023-10-02 10:17:37,216 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1113-7369-0 from training. Duration: 0.85 2023-10-02 10:17:38,874 WARNING [train.py:1204] (2/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_352-59958-0 from training. Duration: 0.86 2023-10-02 10:17:38,892 WARNING [train.py:1204] (2/4) Exclude cut with ID _14886_210_4220_1_1530702020355_3516190_159-73748-0_sp1.1 from training. Duration: 0.7545625 2023-10-02 10:17:40,437 WARNING [train.py:1204] (2/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_131-119569-0_sp1.1 from training. Duration: 0.92725 2023-10-02 10:17:41,729 WARNING [train.py:1204] (2/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_216-343007-0 from training. Duration: 0.66 2023-10-02 10:17:44,511 WARNING [train.py:1204] (2/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_113-92608-0_sp0.9 from training. Duration: 0.6 2023-10-02 10:17:47,429 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_40047_210_2290_1_1533290519_2374311_132-123271-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 10:17:48,754 WARNING [train.py:1204] (2/4) Exclude cut with ID _8793_210_6310_1_1528542985139_2310009_121-179782-0 from training. Duration: 0.8 2023-10-02 10:17:48,776 WARNING [train.py:1204] (2/4) Exclude cut with ID _43997_210_4525_1_1533538632555_5189329_283-218639-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 10:17:48,779 WARNING [train.py:1204] (2/4) Exclude cut with ID _49376_210_5986_1_1533985278732_3370740_297-78397-0_sp1.1 from training. Duration: 0.936375 2023-10-02 10:17:48,795 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_3475_210_2028_1_1526193578_3622569_377-166133-0_sp0.9 from training. Duration: 0.9555625 2023-10-02 10:17:48,809 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_520-213149-0_sp1.1 from training. Duration: 0.92725 2023-10-02 10:17:51,411 WARNING [train.py:1204] (2/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_567-144483-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 10:17:54,310 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_4183_210_2087_1_1526729100_1869838_143-280962-0_sp0.9 from training. Duration: 0.62225 2023-10-02 10:17:54,345 WARNING [train.py:1204] (2/4) Exclude cut with ID _49643_210_7989_1_1534640244224_3939031_611-185515-0 from training. Duration: 0.94 2023-10-02 10:17:57,085 WARNING [train.py:1204] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_88-341718-0_sp0.9 from training. Duration: 0.7333125 2023-10-02 10:17:59,124 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_51225_210_3601_1_1534400634_2327383_49-235441-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 10:18:01,020 WARNING [train.py:1204] (2/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_240-294400-0_sp1.1 from training. Duration: 0.936375 2023-10-02 10:18:02,402 WARNING [train.py:1204] (2/4) Exclude cut with ID _55429_210_10626_1_1534467588527_7174143_640-304825-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 10:18:02,430 WARNING [train.py:1204] (2/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_187-221566-0_sp1.1 from training. Duration: 0.3909375 2023-10-02 10:18:02,636 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.balancer1.prob, batch_count=844453.3333333334, ans=0.125 2023-10-02 10:18:05,731 WARNING [train.py:1204] (2/4) Exclude cut with ID _7606_210_4915_1_1528894253277_5080690_127-177761-0_sp0.9 from training. Duration: 0.711125 2023-10-02 10:18:09,740 INFO [train.py:1046] (2/4) Epoch 24, batch 4500, loss[loss=0.169, simple_loss=0.2553, pruned_loss=0.04135, over 24653.00 frames. ], tot_loss[loss=0.1716, simple_loss=0.2497, pruned_loss=0.04676, over 4726006.48 frames. ], batch size: 73, lr: 4.24e-03, grad_scale: 8.0 2023-10-02 10:18:09,799 WARNING [train.py:1204] (2/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_430-116139-0 from training. Duration: 0.85 2023-10-02 10:18:11,193 WARNING [train.py:1204] (2/4) Exclude cut with ID _3675_210_4557_1_1526639934408_4182089_456-18305-0_sp0.9 from training. Duration: 0.8444375 2023-10-02 10:18:15,752 WARNING [train.py:1204] (2/4) Exclude cut with ID _18513_210_9206_1_1531618215223_3862620_402-5957-0_sp1.1 from training. Duration: 0.9 2023-10-02 10:18:17,190 WARNING [train.py:1204] (2/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_465-156500-0 from training. Duration: 0.72 2023-10-02 10:18:17,191 WARNING [train.py:1204] (2/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_714-310538-0 from training. Duration: 0.86 2023-10-02 10:18:18,539 WARNING [train.py:1204] (2/4) Exclude cut with ID _39411_210_5986_1_1533538825030_3343649_335-301187-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 10:18:23,883 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_41750_210_1960_1_1533520678_3948042_241-158616-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 10:18:23,941 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_46013_210_7234_1_1533776362_3744032_50-141535-0_sp1.1 from training. Duration: 0.9 2023-10-02 10:18:25,262 WARNING [train.py:1204] (2/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_43-206757-0_sp0.9 from training. Duration: 0.8333125 2023-10-02 10:18:25,325 WARNING [train.py:1204] (2/4) Exclude cut with ID _22961_210_8254_1_1531976366672_4278180_172-276296-0_sp1.1 from training. Duration: 0.97275 2023-10-02 10:18:25,349 WARNING [train.py:1204] (2/4) Exclude cut with ID _17956_210_4353_1_1531209568125_6979970_1305-301903-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 10:18:26,754 WARNING [train.py:1204] (2/4) Exclude cut with ID _28323_210_16187_1_1532480395667_4100300_604-44507-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 10:18:36,812 WARNING [train.py:1204] (2/4) Exclude cut with ID _23514_210_1389_1_1531832139785_3031439_88-337544-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 10:18:38,135 WARNING [train.py:1204] (2/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_932-3672-0_sp1.1 from training. Duration: 0.963625 2023-10-02 10:18:40,873 WARNING [train.py:1204] (2/4) Exclude cut with ID _46502_210_13794_1_1533724452198_3657159_207-109991-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 10:18:42,163 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_216-73841-0_sp1.1 from training. Duration: 0.87275 2023-10-02 10:18:43,493 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8453_210_4211_1_1528502940_8562730_347-330673-0_sp1.1 from training. Duration: 0.6545625 2023-10-02 10:18:48,284 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_4183_210_2087_1_1526729100_1869838_143-280962-0_sp1.1 from training. Duration: 0.5090625 2023-10-02 10:18:51,086 WARNING [train.py:1204] (2/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_325-113798-0_sp1.1 from training. Duration: 0.77275 2023-10-02 10:18:55,175 WARNING [train.py:1204] (2/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_91-212486-0_sp0.9 from training. Duration: 0.7555625 2023-10-02 10:18:56,749 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8776_210_2040_1_1528638837_4088036_244-278763-0_sp1.1 from training. Duration: 0.8181875 2023-10-02 10:18:57,997 WARNING [train.py:1204] (2/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_106-286130-0 from training. Duration: 0.98 2023-10-02 10:18:59,243 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.429e+02 1.850e+02 2.033e+02 2.343e+02 3.798e+02, threshold=4.065e+02, percent-clipped=0.0 2023-10-02 10:18:59,346 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_2746_210_1988_1_1525872816_3795627_372-205861-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 10:18:59,378 WARNING [train.py:1204] (2/4) Exclude cut with ID _30800_210_22382_1_1532499347904_4515959_240-54646-0_sp1.1 from training. Duration: 0.936375 2023-10-02 10:19:02,994 WARNING [train.py:1204] (2/4) Exclude cut with ID _59500_210_8254_1_1534809610680_3736009_162-180695-0_sp1.1 from training. Duration: 0.936375 2023-10-02 10:19:03,021 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7544_210_6753_1_1528591327_1471426_146-93357-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 10:19:04,351 WARNING [train.py:1204] (2/4) Exclude cut with ID _42657_210_2070_1_1533520654016_3327008_405-87432-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 10:19:04,376 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_4528_210_3273_1_1527076654_3276777_209-219804-0 from training. Duration: 0.69 2023-10-02 10:19:04,377 WARNING [train.py:1204] (2/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_78-182912-0_sp0.9 from training. Duration: 0.6555625 2023-10-02 10:19:04,383 WARNING [train.py:1204] (2/4) Exclude cut with ID _31255_210_13427_1_1532865619495_3815143_322-114998-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 10:19:08,652 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=844786.6666666666, ans=0.1 2023-10-02 10:19:09,785 WARNING [train.py:1204] (2/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_848-280957-0_sp0.9 from training. Duration: 0.9666875 2023-10-02 10:19:09,811 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7696_210_2040_1_1528207167_3716099_117-125434-0_sp0.9 from training. Duration: 0.8444375 2023-10-02 10:19:12,555 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_1002-109192-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 10:19:15,770 WARNING [train.py:1204] (2/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_138-64210-0_sp0.9 from training. Duration: 0.811125 2023-10-02 10:19:15,790 WARNING [train.py:1204] (2/4) Exclude cut with ID _25808_210_13292_1_1532343514562_7366109_633-47524-0_sp1.1 from training. Duration: 0.9 2023-10-02 10:19:18,359 WARNING [train.py:1204] (2/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_297-5233-0 from training. Duration: 0.95 2023-10-02 10:19:18,468 WARNING [train.py:1204] (2/4) Exclude cut with ID _8394_210_6702_1_1528590504316_3540189_335-274780-0 from training. Duration: 0.78 2023-10-02 10:19:18,472 WARNING [train.py:1204] (2/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_301-330455-0 from training. Duration: 0.8 2023-10-02 10:19:21,278 WARNING [train.py:1204] (2/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_383-205833-0 from training. Duration: 0.95 2023-10-02 10:19:22,498 INFO [train.py:1046] (2/4) Epoch 24, batch 4550, loss[loss=0.1606, simple_loss=0.232, pruned_loss=0.0446, over 23609.00 frames. ], tot_loss[loss=0.1706, simple_loss=0.2483, pruned_loss=0.04641, over 4710038.95 frames. ], batch size: 149, lr: 4.24e-03, grad_scale: 8.0 2023-10-02 10:19:23,968 WARNING [train.py:1204] (2/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_48-182155-0 from training. Duration: 0.87 2023-10-02 10:19:24,154 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.conv_skip_rate, batch_count=844853.3333333334, ans=0.0 2023-10-02 10:19:25,280 WARNING [train.py:1204] (2/4) Exclude cut with ID _14832_210_8750_1_1530691351539_3171348_1-54315-0_sp1.1 from training. Duration: 0.7909375 2023-10-02 10:19:28,018 WARNING [train.py:1204] (2/4) Exclude cut with ID _44982_210_1511_1_1534312820108_3726029_413-100814-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 10:19:29,969 WARNING [train.py:1204] (2/4) Exclude cut with ID _3231_210_2106_1_1526205864934_6784220_104-261392-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 10:19:33,378 WARNING [train.py:1204] (2/4) Exclude cut with ID _24314_210_4915_1_1532135078116_7170179_1-13588-0_sp1.1 from training. Duration: 0.97275 2023-10-02 10:19:36,210 WARNING [train.py:1204] (2/4) Exclude cut with ID _44304_210_15374_1_1533639352765_4613129_141-8356-0_sp1.1 from training. Duration: 0.963625 2023-10-02 10:19:37,579 WARNING [train.py:1204] (2/4) Exclude cut with ID _37892_210_19596_1_1533198677897_3964058_374-335034-0_sp1.1 from training. Duration: 0.936375 2023-10-02 10:19:40,250 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_195-257512-0_sp0.9 from training. Duration: 0.7444375 2023-10-02 10:19:40,252 WARNING [train.py:1204] (2/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_1267-272693-0_sp1.1 from training. Duration: 0.663625 2023-10-02 10:19:40,253 WARNING [train.py:1204] (2/4) Exclude cut with ID _30196_210_1664_1_1533708496522_7074140_690-326155-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 10:19:41,790 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.conv_module2.balancer1.max_abs, batch_count=844920.0, ans=10.0 2023-10-02 10:19:43,527 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_68847_210_14656_1_1536070214_2586843_38-20717-0_sp1.1 from training. Duration: 0.97275 2023-10-02 10:19:44,703 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_99-251314-0_sp1.1 from training. Duration: 0.7909375 2023-10-02 10:19:46,200 WARNING [train.py:1204] (2/4) Exclude cut with ID _49376_210_5986_1_1533985278732_3370740_225-95630-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 10:19:47,677 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_2655_210_2087_1_1525688959_3897400_202-147557-0 from training. Duration: 0.85 2023-10-02 10:19:49,042 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_5618_210_1874_1_1527311008_5851_1-214501-0_sp1.1 from training. Duration: 0.626375 2023-10-02 10:19:50,394 WARNING [train.py:1204] (2/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_225-43381-0_sp1.1 from training. Duration: 0.8545625 2023-10-02 10:19:51,799 WARNING [train.py:1204] (2/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_242-3472-0 from training. Duration: 0.73 2023-10-02 10:19:54,433 WARNING [train.py:1204] (2/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_339-104791-0 from training. Duration: 0.49 2023-10-02 10:19:54,490 WARNING [train.py:1204] (2/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_488-92261-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 10:19:57,751 WARNING [train.py:1204] (2/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_175-163894-0 from training. Duration: 0.74 2023-10-02 10:19:59,170 WARNING [train.py:1204] (2/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_41-234353-0_sp0.9 from training. Duration: 0.7555625 2023-10-02 10:20:02,456 WARNING [train.py:1204] (2/4) Exclude cut with ID _47251_210_18628_1_1533814063128_4137250_116-275927-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 10:20:02,501 WARNING [train.py:1204] (2/4) Exclude cut with ID _4697_210_2069_1_1526815801953_3823098_61-223324-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 10:20:02,512 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6961_210_2087_1_1527937168_6815011_562-165518-0_sp1.1 from training. Duration: 0.7 2023-10-02 10:20:03,924 WARNING [train.py:1204] (2/4) Exclude cut with ID _21989_210_5854_1_1531899214931_3271360_133-44759-0 from training. Duration: 0.77 2023-10-02 10:20:06,722 WARNING [train.py:1204] (2/4) Exclude cut with ID _23557_210_8750_1_1531890106370_6846255_77-284923-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 10:20:08,146 WARNING [train.py:1204] (2/4) Exclude cut with ID _13678_210_7497_1_1530530069226_3463170_464-296127-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 10:20:08,165 WARNING [train.py:1204] (2/4) Exclude cut with ID _22613_210_5219_1_1532253599686_4090170_393-36663-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 10:20:09,603 WARNING [train.py:1204] (2/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_235-262124-0_sp0.9 from training. Duration: 0.7444375 2023-10-02 10:20:10,972 WARNING [train.py:1204] (2/4) Exclude cut with ID _8480_210_1874_1_1528589391205_6728330_755-112625-0 from training. Duration: 0.74 2023-10-02 10:20:11,031 WARNING [train.py:1204] (2/4) Exclude cut with ID _6456_210_1534_1_1527681634212_3181049_215-309359-0 from training. Duration: 0.88 2023-10-02 10:20:11,048 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_3019_210_2028_1_1526044245_3264865_191-72235-0_sp1.1 from training. Duration: 0.8818125 2023-10-02 10:20:12,418 WARNING [train.py:1204] (2/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_380-162648-0 from training. Duration: 0.94 2023-10-02 10:20:15,707 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_92-46009-0 from training. Duration: 0.54 2023-10-02 10:20:15,735 WARNING [train.py:1204] (2/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_53-300544-0_sp0.9 from training. Duration: 0.7444375 2023-10-02 10:20:18,437 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_68235_210_1925_1_1535863247_4484656_101-250943-0_sp1.1 from training. Duration: 0.97275 2023-10-02 10:20:18,446 WARNING [train.py:1204] (2/4) Exclude cut with ID _14970_210_15709_1_1530775547361_1966965_204-22182-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 10:20:18,538 WARNING [train.py:1204] (2/4) Exclude cut with ID _55153_210_2253_1_1534384786677_3162096_310-130092-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 10:20:19,795 WARNING [train.py:1204] (2/4) Exclude cut with ID _60113_210_16271_1_1535164212481_4386079_559-167191-0_sp1.1 from training. Duration: 0.8454375 2023-10-02 10:20:19,912 WARNING [train.py:1204] (2/4) Exclude cut with ID _14368_210_4220_1_1530532714870_3595529_93-58002-0_sp1.1 from training. Duration: 0.5181875 2023-10-02 10:20:21,269 WARNING [train.py:1204] (2/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_110-224688-0 from training. Duration: 0.97 2023-10-02 10:20:23,764 WARNING [train.py:1204] (2/4) Exclude cut with ID _40402_210_18219_1_1533522324115_2953150_178-178004-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 10:20:23,774 WARNING [train.py:1204] (2/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_321-335475-0_sp1.1 from training. Duration: 0.4090625 2023-10-02 10:20:23,829 WARNING [train.py:1204] (2/4) Exclude cut with ID _66863_210_18053_1_1535698758334_8590319_1092-271784-0 from training. Duration: 0.96 2023-10-02 10:20:23,844 WARNING [train.py:1204] (2/4) Exclude cut with ID _28317_210_12742_1_1532480302381_8510325_607-213951-0_sp1.1 from training. Duration: 0.763625 2023-10-02 10:20:23,853 WARNING [train.py:1204] (2/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_468-283113-0 from training. Duration: 0.96 2023-10-02 10:20:27,233 WARNING [train.py:1204] (2/4) Exclude cut with ID _14922_210_6228_1_1530707435273_3693310_13-165512-0_sp1.1 from training. Duration: 0.7454375 2023-10-02 10:20:27,251 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_69630_210_7234_1_1535788670_910989_56-169762-0_sp1.1 from training. Duration: 0.92725 2023-10-02 10:20:30,581 WARNING [train.py:1204] (2/4) Exclude cut with ID _67335_210_5219_1_1535525975652_4275229_121-80357-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 10:20:30,623 WARNING [train.py:1204] (2/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_126-12079-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 10:20:30,659 WARNING [train.py:1204] (2/4) Exclude cut with ID _5294_210_1747_1_1527249643928_3696179_342-270278-0_sp1.1 from training. Duration: 0.5 2023-10-02 10:20:32,067 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_30322_210_1925_1_1532843478_826817_35-328264-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 10:20:34,689 WARNING [train.py:1204] (2/4) Exclude cut with ID _8480_210_1874_1_1528589391205_6728330_755-112625-0_sp1.1 from training. Duration: 0.67275 2023-10-02 10:20:36,059 INFO [train.py:1046] (2/4) Epoch 24, batch 4600, loss[loss=0.1501, simple_loss=0.2296, pruned_loss=0.03531, over 24438.00 frames. ], tot_loss[loss=0.1696, simple_loss=0.2469, pruned_loss=0.04618, over 4700666.07 frames. ], batch size: 58, lr: 4.24e-03, grad_scale: 8.0 2023-10-02 10:20:37,446 WARNING [train.py:1204] (2/4) Exclude cut with ID _38670_210_15379_1_1533448810659_4391059_132-146412-0_sp1.1 from training. Duration: 0.97275 2023-10-02 10:20:38,849 WARNING [train.py:1204] (2/4) Exclude cut with ID _22020_210_2461_1_1531724164513_6326900_563-39691-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 10:20:41,602 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_9460_210_4211_1_1528954587_10049667_572-37035-0_sp1.1 from training. Duration: 0.72725 2023-10-02 10:20:41,621 WARNING [train.py:1204] (2/4) Exclude cut with ID _68777_210_4353_1_1535810390731_3117904_129-166012-0_sp0.9 from training. Duration: 0.9444375 2023-10-02 10:20:43,523 WARNING [train.py:1204] (2/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_487-91921-0_sp1.1 from training. Duration: 0.963625 2023-10-02 10:20:44,838 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_195-257512-0 from training. Duration: 0.67 2023-10-02 10:20:44,948 WARNING [train.py:1204] (2/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_188-260011-0_sp1.1 from training. Duration: 0.663625 2023-10-02 10:20:47,060 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.2.encoder.layers.2.whiten, num_groups=1, num_channels=384, metric=7.42 vs. limit=12.0 2023-10-02 10:20:47,808 WARNING [train.py:1204] (2/4) Exclude cut with ID _8480_210_1874_1_1528589391205_6728330_309-139896-0_sp0.9 from training. Duration: 0.9 2023-10-02 10:20:49,201 WARNING [train.py:1204] (2/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_866-321248-0_sp1.1 from training. Duration: 0.963625 2023-10-02 10:20:49,419 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.conv_module1.balancer1.prob, batch_count=845253.3333333334, ans=0.125 2023-10-02 10:20:53,079 WARNING [train.py:1204] (2/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_510-216513-0_sp1.1 from training. Duration: 0.97275 2023-10-02 10:20:57,462 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.ff2_skip_rate, batch_count=845253.3333333334, ans=0.0 2023-10-02 10:20:59,164 WARNING [train.py:1204] (2/4) Exclude cut with ID _6653_210_2629_1_1527919172441_6732679_758-143677-0 from training. Duration: 0.98 2023-10-02 10:20:59,235 WARNING [train.py:1204] (2/4) Exclude cut with ID _55134_210_21982_1_1534590028377_3882170_50-214427-0_sp1.1 from training. Duration: 0.97275 2023-10-02 10:21:02,535 WARNING [train.py:1204] (2/4) Exclude cut with ID _31649_210_6228_1_1532584608063_4091078_482-301330-0_sp1.1 from training. Duration: 0.97275 2023-10-02 10:21:03,955 WARNING [train.py:1204] (2/4) Exclude cut with ID _35799_210_5854_1_1533263487887_3529071_490-326847-0_sp1.1 from training. Duration: 0.9 2023-10-02 10:21:03,963 WARNING [train.py:1204] (2/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_984-90716-0_sp1.1 from training. Duration: 0.963625 2023-10-02 10:21:11,264 WARNING [train.py:1204] (2/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_166-246557-0 from training. Duration: 0.64 2023-10-02 10:21:11,264 WARNING [train.py:1204] (2/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_51-200220-0_sp1.1 from training. Duration: 0.5090625 2023-10-02 10:21:11,337 WARNING [train.py:1204] (2/4) Exclude cut with ID _51382_210_23862_1_1534492801838_3650104_275-92796-0_sp1.1 from training. Duration: 0.936375 2023-10-02 10:21:14,308 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.attention_skip_rate, batch_count=845320.0, ans=0.0 2023-10-02 10:21:16,880 WARNING [train.py:1204] (2/4) Exclude cut with ID _29853_210_7497_1_1532505600851_6642110_461-142829-0_sp1.1 from training. Duration: 0.97275 2023-10-02 10:21:18,174 WARNING [train.py:1204] (2/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_788-298907-0_sp0.9 from training. Duration: 0.988875 2023-10-02 10:21:19,562 WARNING [train.py:1204] (2/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_739-2098-0_sp1.1 from training. Duration: 0.836375 2023-10-02 10:21:22,404 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7543_210_6753_1_1528541907_1399322_165-323619-0 from training. Duration: 0.69 2023-10-02 10:21:23,831 WARNING [train.py:1204] (2/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_401-255491-0_sp1.1 from training. Duration: 0.636375 2023-10-02 10:21:25,025 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.554e+02 1.808e+02 1.985e+02 2.322e+02 3.064e+02, threshold=3.969e+02, percent-clipped=0.0 2023-10-02 10:21:29,894 WARNING [train.py:1204] (2/4) Exclude cut with ID _66873_210_5517_1_1535454136506_4293370_24-143283-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 10:21:29,978 WARNING [train.py:1204] (2/4) Exclude cut with ID _14459_210_2228_1_1531443641946_7516700_1221-280439-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 10:21:32,102 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.0.feed_forward3.hidden_balancer.prob, batch_count=845386.6666666666, ans=0.125 2023-10-02 10:21:33,757 WARNING [train.py:1204] (2/4) Exclude cut with ID _59202_210_26984_1_1534816810612_3549174_469-213482-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 10:21:33,758 WARNING [train.py:1204] (2/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_148-158509-0_sp0.9 from training. Duration: 0.4555625 2023-10-02 10:21:33,794 WARNING [train.py:1204] (2/4) Exclude cut with ID _24314_210_4915_1_1532135078116_7170179_863-259095-0_sp1.1 from training. Duration: 0.97275 2023-10-02 10:21:35,123 WARNING [train.py:1204] (2/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_321-22821-0 from training. Duration: 0.89 2023-10-02 10:21:35,153 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_43831_210_2158_1_1533725502_5872791_55-163137-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 10:21:36,475 WARNING [train.py:1204] (2/4) Exclude cut with ID _27430_210_7770_1_1532604689375_1378590_26-25207-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 10:21:36,584 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_17613_210_2151_1_1531137569_2366160_223-316461-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 10:21:37,920 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_53679_210_4852_1_1534556726_4186141_541-213370-0_sp1.1 from training. Duration: 0.963625 2023-10-02 10:21:37,995 WARNING [train.py:1204] (2/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_369-295086-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 10:21:39,397 WARNING [train.py:1204] (2/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_91-267696-0 from training. Duration: 0.87 2023-10-02 10:21:39,446 WARNING [train.py:1204] (2/4) Exclude cut with ID _5294_210_1747_1_1527249643928_3696179_180-100713-0 from training. Duration: 0.81 2023-10-02 10:21:40,704 WARNING [train.py:1204] (2/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_32-335103-0 from training. Duration: 0.82 2023-10-02 10:21:40,709 WARNING [train.py:1204] (2/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_133-312727-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 10:21:42,700 WARNING [train.py:1204] (2/4) Exclude cut with ID _59288_210_5854_1_1535077757774_3412246_391-165934-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 10:21:43,960 WARNING [train.py:1204] (2/4) Exclude cut with ID _28323_210_16187_1_1532480395667_4100300_488-1281-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 10:21:45,313 WARNING [train.py:1204] (2/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_124-193207-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 10:21:46,878 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.conv_module1.balancer1.prob, batch_count=845453.3333333334, ans=0.125 2023-10-02 10:21:46,900 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.attention_skip_rate, batch_count=845453.3333333334, ans=0.0 2023-10-02 10:21:49,426 INFO [train.py:1046] (2/4) Epoch 24, batch 4650, loss[loss=0.1539, simple_loss=0.2404, pruned_loss=0.03369, over 24665.00 frames. ], tot_loss[loss=0.1691, simple_loss=0.2461, pruned_loss=0.04605, over 4694432.67 frames. ], batch size: 65, lr: 4.24e-03, grad_scale: 8.0 2023-10-02 10:21:49,665 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.0.nonlin_attention.balancer.prob, batch_count=845520.0, ans=0.125 2023-10-02 10:21:52,370 WARNING [train.py:1204] (2/4) Exclude cut with ID _14087_210_2253_1_1530529224512_3014029_226-242140-0_sp0.9 from training. Duration: 0.9 2023-10-02 10:21:52,532 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.1.feed_forward1.out_proj.dropout_p, batch_count=845520.0, ans=0.1 2023-10-02 10:21:55,274 WARNING [train.py:1204] (2/4) Exclude cut with ID _29277_210_3279_1_1532581109293_3173069_392-3694-0_sp1.1 from training. Duration: 0.936375 2023-10-02 10:21:55,300 WARNING [train.py:1204] (2/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_563-329842-0_sp1.1 from training. Duration: 0.97275 2023-10-02 10:21:55,343 WARNING [train.py:1204] (2/4) Exclude cut with ID _58583_210_5236_1_1534726835266_3574249_391-87755-0_sp1.1 from training. Duration: 0.92725 2023-10-02 10:21:56,649 WARNING [train.py:1204] (2/4) Exclude cut with ID _28427_210_4220_1_1532581240967_3061069_281-237827-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 10:21:56,670 WARNING [train.py:1204] (2/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_44-147898-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 10:21:58,074 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_24007_210_1794_1_1531918925_1663875_125-320772-0_sp1.1 from training. Duration: 0.97275 2023-10-02 10:21:59,602 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.feed_forward1.hidden_balancer.prob, batch_count=845520.0, ans=0.125 2023-10-02 10:22:01,283 WARNING [train.py:1204] (2/4) Exclude cut with ID _14368_210_4220_1_1530532714870_3595529_143-117421-0 from training. Duration: 0.58 2023-10-02 10:22:06,316 WARNING [train.py:1204] (2/4) Exclude cut with ID _69584_210_5219_1_1535785133775_4044450_325-7656-0_sp0.9 from training. Duration: 0.9555625 2023-10-02 10:22:07,860 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.conv_skip_rate, batch_count=845586.6666666666, ans=0.0 2023-10-02 10:22:09,010 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_37351_210_7925_1_1533012931_3812113_265-85780-0 from training. Duration: 0.88 2023-10-02 10:22:09,035 WARNING [train.py:1204] (2/4) Exclude cut with ID _18516_210_9206_1_1531704653206_3692406_331-237896-0_sp1.1 from training. Duration: 0.936375 2023-10-02 10:22:10,341 WARNING [train.py:1204] (2/4) Exclude cut with ID _14629_210_15709_1_1530604298182_956899_6-232695-0 from training. Duration: 0.94 2023-10-02 10:22:10,368 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7544_210_6753_1_1528587000_691468_87-178599-0_sp1.1 from training. Duration: 0.7909375 2023-10-02 10:22:11,809 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_326-303945-0 from training. Duration: 0.99 2023-10-02 10:22:11,823 WARNING [train.py:1204] (2/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_503-30719-0 from training. Duration: 0.98 2023-10-02 10:22:11,830 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_11305_210_1794_1_1529750428_7178148_299-344640-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 10:22:11,885 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_53071_210_16554_1_1534490405_2954066_265-170472-0_sp1.1 from training. Duration: 0.7545625 2023-10-02 10:22:15,248 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_174-277364-0_sp1.1 from training. Duration: 0.6454375 2023-10-02 10:22:17,819 WARNING [train.py:1204] (2/4) Exclude cut with ID _58414_210_8254_1_1535421609162_7151182_193-151884-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 10:22:17,842 WARNING [train.py:1204] (2/4) Exclude cut with ID IC0651W0104-322063 from training. Duration: 0.768 2023-10-02 10:22:21,886 WARNING [train.py:1204] (2/4) Exclude cut with ID _21881_210_3525_1_1531825242757_3798380_139-305982-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 10:22:21,977 WARNING [train.py:1204] (2/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_79-200392-0 from training. Duration: 0.97 2023-10-02 10:22:26,125 WARNING [train.py:1204] (2/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_451-280894-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 10:22:26,134 WARNING [train.py:1204] (2/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_304-273433-0_sp1.1 from training. Duration: 0.9 2023-10-02 10:22:26,200 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_5959_210_1794_1_1527400400_3445726_412-44052-0 from training. Duration: 0.77 2023-10-02 10:22:26,487 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.feed_forward1.hidden_balancer.prob, batch_count=845653.3333333334, ans=0.125 2023-10-02 10:22:27,593 WARNING [train.py:1204] (2/4) Exclude cut with ID _4681_210_3525_1_1527251040321_1235713_148-26159-0_sp1.1 from training. Duration: 0.963625 2023-10-02 10:22:30,330 WARNING [train.py:1204] (2/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_887-249555-0_sp0.9 from training. Duration: 0.8555625 2023-10-02 10:22:31,803 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_25236_210_2158_1_1532051968_280567_37-6860-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 10:22:38,539 WARNING [train.py:1204] (2/4) Exclude cut with ID _29277_210_3279_1_1532581109293_3173069_417-115527-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 10:22:38,786 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.feed_forward2.hidden_balancer.prob, batch_count=845720.0, ans=0.125 2023-10-02 10:22:41,250 WARNING [train.py:1204] (2/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_552-134748-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 10:22:41,320 WARNING [train.py:1204] (2/4) Exclude cut with ID _43176_210_8254_1_1533524417469_4174339_188-174143-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 10:22:42,592 WARNING [train.py:1204] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_127-119109-0_sp1.1 from training. Duration: 0.6545625 2023-10-02 10:22:44,562 WARNING [train.py:1204] (2/4) Exclude cut with ID _14922_210_6228_1_1530707435273_3693310_38-187851-0 from training. Duration: 0.62 2023-10-02 10:22:45,797 WARNING [train.py:1204] (2/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_588-125165-0 from training. Duration: 0.83 2023-10-02 10:22:45,871 WARNING [train.py:1204] (2/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_344-152526-0_sp1.1 from training. Duration: 0.4818125 2023-10-02 10:22:45,872 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7002_210_1794_1_1527935634_4034444_487-236501-0 from training. Duration: 0.66 2023-10-02 10:22:47,356 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.1.conv_module1.balancer1.prob, batch_count=845786.6666666666, ans=0.125 2023-10-02 10:22:48,519 WARNING [train.py:1204] (2/4) Exclude cut with ID _23825_210_13449_1_1531915313526_3497150_379-16981-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 10:22:55,400 WARNING [train.py:1204] (2/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_297-5233-0_sp1.1 from training. Duration: 0.863625 2023-10-02 10:22:55,413 WARNING [train.py:1204] (2/4) Exclude cut with ID _41298_210_9019_1_1533430371450_4822143_178-283538-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 10:22:55,443 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7046_210_6753_1_1527895337_6920917_642-106799-0 from training. Duration: 0.92 2023-10-02 10:22:55,456 WARNING [train.py:1204] (2/4) Exclude cut with ID _6274_210_2084_1_1527937583837_7494020_532-291952-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 10:22:56,929 WARNING [train.py:1204] (2/4) Exclude cut with ID _47278_210_12041_1_1533902418349_3576110_260-270468-0_sp1.1 from training. Duration: 0.92725 2023-10-02 10:22:56,935 WARNING [train.py:1204] (2/4) Exclude cut with ID _14368_210_4220_1_1530532714870_3595529_144-3287-0_sp0.9 from training. Duration: 0.7444375 2023-10-02 10:22:57,188 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.feed_forward2.hidden_balancer.prob, batch_count=845786.6666666666, ans=0.125 2023-10-02 10:22:59,645 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_162-262717-0_sp0.9 from training. Duration: 0.92225 2023-10-02 10:23:01,057 WARNING [train.py:1204] (2/4) Exclude cut with ID _3465_210_3525_1_1526175667060_3574049_62-335552-0_sp0.9 from training. Duration: 0.9666875 2023-10-02 10:23:01,062 WARNING [train.py:1204] (2/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_726-272852-0_sp1.1 from training. Duration: 0.92725 2023-10-02 10:23:01,256 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.feed_forward1.out_proj.dropout_p, batch_count=845853.3333333334, ans=0.1 2023-10-02 10:23:02,337 INFO [train.py:1046] (2/4) Epoch 24, batch 4700, loss[loss=0.1752, simple_loss=0.2662, pruned_loss=0.04206, over 24423.00 frames. ], tot_loss[loss=0.1694, simple_loss=0.2467, pruned_loss=0.04605, over 4716577.78 frames. ], batch size: 69, lr: 4.24e-03, grad_scale: 8.0 2023-10-02 10:23:02,455 WARNING [train.py:1204] (2/4) Exclude cut with ID _14898_210_9206_1_1530766812377_4177190_26-135492-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 10:23:09,023 WARNING [train.py:1204] (2/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_200-95304-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 10:23:09,061 WARNING [train.py:1204] (2/4) Exclude cut with ID _45012_210_12651_1_1533608435136_5371380_240-37101-0_sp1.1 from training. Duration: 0.8454375 2023-10-02 10:23:09,069 WARNING [train.py:1204] (2/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_132-269633-0_sp0.9 from training. Duration: 0.7555625 2023-10-02 10:23:10,426 WARNING [train.py:1204] (2/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_907-151398-0_sp0.9 from training. Duration: 0.411125 2023-10-02 10:23:10,510 WARNING [train.py:1204] (2/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_45-265684-0_sp0.9 from training. Duration: 0.8 2023-10-02 10:23:10,697 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.feed_forward1.hidden_balancer.prob, batch_count=845853.3333333334, ans=0.125 2023-10-02 10:23:12,254 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6291_210_3273_1_1527683649_1078498_74-292608-0 from training. Duration: 0.72 2023-10-02 10:23:16,690 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.out_combiner.scale_min, batch_count=845920.0, ans=0.2 2023-10-02 10:23:20,976 WARNING [train.py:1204] (2/4) Exclude cut with ID _59202_210_26984_1_1534816810612_3549174_221-84819-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 10:23:22,299 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_33696_210_3852_1_1533082780_2663375_181-325595-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 10:23:22,341 WARNING [train.py:1204] (2/4) Exclude cut with ID _47251_210_18628_1_1533814063128_4137250_351-154105-0_sp1.1 from training. Duration: 0.963625 2023-10-02 10:23:23,663 WARNING [train.py:1204] (2/4) Exclude cut with ID _35501_210_9288_1_1532998507986_3192189_351-334740-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 10:23:25,081 WARNING [train.py:1204] (2/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_342-100157-0_sp1.1 from training. Duration: 0.4909375 2023-10-02 10:23:29,282 WARNING [train.py:1204] (2/4) Exclude cut with ID _6789_210_1534_1_1527857937218_3118950_262-48853-0 from training. Duration: 0.56 2023-10-02 10:23:29,327 WARNING [train.py:1204] (2/4) Exclude cut with ID _69884_210_9771_1_1535846392818_4166169_430-201020-0 from training. Duration: 0.97 2023-10-02 10:23:30,688 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_71-136518-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 10:23:32,092 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_28-291982-0_sp1.1 from training. Duration: 0.8818125 2023-10-02 10:23:33,301 WARNING [train.py:1204] (2/4) Exclude cut with ID _54218_210_12210_1_1534413646388_4409259_437-275326-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 10:23:36,578 WARNING [train.py:1204] (2/4) Exclude cut with ID _55134_210_21982_1_1534590028377_3882170_40-309139-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 10:23:42,521 WARNING [train.py:1204] (2/4) Exclude cut with ID _32704_210_8954_1_1533017488840_6747370_266-329755-0_sp1.1 from training. Duration: 0.8090625 2023-10-02 10:23:42,600 WARNING [train.py:1204] (2/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_21-174514-0_sp1.1 from training. Duration: 0.3909375 2023-10-02 10:23:42,775 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.bypass.scale_min, batch_count=845986.6666666666, ans=0.2 2023-10-02 10:23:44,408 WARNING [train.py:1204] (2/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_409-207378-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 10:23:50,372 WARNING [train.py:1204] (2/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_132-269633-0 from training. Duration: 0.68 2023-10-02 10:23:51,738 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_2245_210_2715_1_1525511548_3540254_310-52483-0_sp0.9 from training. Duration: 0.97775 2023-10-02 10:23:53,102 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.546e+02 1.791e+02 2.008e+02 2.250e+02 3.556e+02, threshold=4.016e+02, percent-clipped=0.0 2023-10-02 10:23:53,288 WARNING [train.py:1204] (2/4) Exclude cut with ID _27430_210_7770_1_1532604689375_1378590_47-77403-0_sp1.1 from training. Duration: 0.92725 2023-10-02 10:23:57,222 WARNING [train.py:1204] (2/4) Exclude cut with ID _7562_210_2084_1_1528887767916_3946229_186-326422-0 from training. Duration: 0.84 2023-10-02 10:23:57,422 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.1.conv_module1.balancer1.prob, batch_count=846053.3333333334, ans=0.125 2023-10-02 10:23:58,563 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_14056_210_4163_1_1530604483_7572827_230-35993-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 10:24:02,681 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_23854_210_1794_1_1531969696_1329157_57-123491-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 10:24:02,722 WARNING [train.py:1204] (2/4) Exclude cut with ID _14747_210_8341_1_1530685797463_3654089_147-131229-0 from training. Duration: 0.76 2023-10-02 10:24:04,104 WARNING [train.py:1204] (2/4) Exclude cut with ID _30191_210_1664_1_1533189915047_7196670_430-19539-0_sp1.1 from training. Duration: 0.92725 2023-10-02 10:24:04,123 WARNING [train.py:1204] (2/4) Exclude cut with ID _67725_210_27767_1_1535608756671_5105359_44-50762-0_sp1.1 from training. Duration: 0.963625 2023-10-02 10:24:04,276 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.balancer2.prob, batch_count=846120.0, ans=0.125 2023-10-02 10:24:07,576 WARNING [train.py:1204] (2/4) Exclude cut with ID _27485_210_4916_1_1532739191677_3668269_217-161755-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 10:24:08,276 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.1.encoder.layers.1.feed_forward2.out_whiten, num_groups=1, num_channels=256, metric=4.96 vs. limit=15.0 2023-10-02 10:24:08,916 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_5931_210_2040_1_1527428481_4547999_321-193614-0_sp1.1 from training. Duration: 0.6090625 2023-10-02 10:24:08,931 WARNING [train.py:1204] (2/4) Exclude cut with ID _6267_210_2084_1_1527764658526_3894730_375-219887-0 from training. Duration: 0.67 2023-10-02 10:24:09,009 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9087W0022-538393 from training. Duration: 0.768 2023-10-02 10:24:10,419 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.feed_forward1.hidden_balancer.prob, batch_count=846120.0, ans=0.125 2023-10-02 10:24:12,182 WARNING [train.py:1204] (2/4) Exclude cut with ID _68777_210_4353_1_1535810390731_3117904_83-196459-0_sp1.1 from training. Duration: 0.963625 2023-10-02 10:24:13,495 WARNING [train.py:1204] (2/4) Exclude cut with ID _69884_210_9771_1_1535846392818_4166169_455-343812-0_sp1.1 from training. Duration: 0.92725 2023-10-02 10:24:13,496 WARNING [train.py:1204] (2/4) Exclude cut with ID _13916_210_9994_1_1530788341126_3763149_414-320174-0_sp1.1 from training. Duration: 0.92725 2023-10-02 10:24:13,499 WARNING [train.py:1204] (2/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_398-239819-0 from training. Duration: 0.99 2023-10-02 10:24:16,535 INFO [train.py:1046] (2/4) Epoch 24, batch 4750, loss[loss=0.1942, simple_loss=0.2614, pruned_loss=0.06353, over 23608.00 frames. ], tot_loss[loss=0.1702, simple_loss=0.2472, pruned_loss=0.04663, over 4723264.02 frames. ], batch size: 256, lr: 4.24e-03, grad_scale: 8.0 2023-10-02 10:24:16,577 WARNING [train.py:1204] (2/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_199-232314-0_sp1.1 from training. Duration: 0.92725 2023-10-02 10:24:20,062 WARNING [train.py:1204] (2/4) Exclude cut with ID _2334_210_2667_1_1525608238880_4705129_459-104997-0 from training. Duration: 0.95 2023-10-02 10:24:22,723 WARNING [train.py:1204] (2/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_441-209395-0_sp1.1 from training. Duration: 0.936375 2023-10-02 10:24:24,104 WARNING [train.py:1204] (2/4) Exclude cut with ID _27052_210_5986_1_1532329197187_3585009_320-80393-0_sp1.1 from training. Duration: 0.97275 2023-10-02 10:24:25,710 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.bypass.scale_min, batch_count=846186.6666666666, ans=0.2 2023-10-02 10:24:28,160 WARNING [train.py:1204] (2/4) Exclude cut with ID _21783_210_11357_1_1531742458730_3762679_89-217253-0_sp1.1 from training. Duration: 0.97275 2023-10-02 10:24:29,529 WARNING [train.py:1204] (2/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_453-282588-0_sp1.1 from training. Duration: 0.8909375 2023-10-02 10:24:29,653 WARNING [train.py:1204] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_88-341718-0 from training. Duration: 0.66 2023-10-02 10:24:30,954 WARNING [train.py:1204] (2/4) Exclude cut with ID _43552_210_8913_1_1533726031059_3523429_230-3521-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 10:24:35,068 WARNING [train.py:1204] (2/4) Exclude cut with ID _9679_210_7497_1_1529129212906_4363929_512-313889-0 from training. Duration: 0.81 2023-10-02 10:24:35,347 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=846253.3333333334, ans=0.1 2023-10-02 10:24:36,351 WARNING [train.py:1204] (2/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_194-252800-0_sp1.1 from training. Duration: 0.8818125 2023-10-02 10:24:36,380 WARNING [train.py:1204] (2/4) Exclude cut with ID _55209_210_1531_1_1534675828996_4597160_239-293541-0_sp1.1 from training. Duration: 0.963625 2023-10-02 10:24:36,415 WARNING [train.py:1204] (2/4) Exclude cut with ID _35799_210_5854_1_1533263487887_3529071_15-325131-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 10:24:40,061 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.conv_skip_rate, batch_count=846253.3333333334, ans=0.0 2023-10-02 10:24:42,604 WARNING [train.py:1204] (2/4) Exclude cut with ID _14100_210_8341_1_1530529320613_3678069_279-55760-0 from training. Duration: 0.93 2023-10-02 10:24:47,201 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_489-230703-0_sp1.1 from training. Duration: 0.77275 2023-10-02 10:24:50,454 WARNING [train.py:1204] (2/4) Exclude cut with ID _6694_210_4599_1_1527852637490_3965529_144-185051-0 from training. Duration: 0.96 2023-10-02 10:24:50,517 WARNING [train.py:1204] (2/4) Exclude cut with ID _28461_210_2253_1_1532771775189_3041299_1-228155-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 10:24:52,683 WARNING [train.py:1204] (2/4) Exclude cut with ID _32704_210_8954_1_1533017488840_6747370_686-252323-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 10:24:52,685 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_40896_210_15710_1_1533433956_7100802_1075-294633-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 10:24:52,695 WARNING [train.py:1204] (2/4) Exclude cut with ID _22083_210_8254_1_1531702862439_3678970_30-92879-0_sp1.1 from training. Duration: 0.97275 2023-10-02 10:24:53,810 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9245W0121-616888 from training. Duration: 0.896 2023-10-02 10:24:53,812 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6786_210_1388_1_1527839699_4194495_378-46070-0 from training. Duration: 0.72 2023-10-02 10:24:55,367 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.conv_module2.balancer1.prob, batch_count=846320.0, ans=0.125 2023-10-02 10:24:59,367 WARNING [train.py:1204] (2/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_317-175062-0 from training. Duration: 0.82 2023-10-02 10:25:00,833 WARNING [train.py:1204] (2/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_238-89390-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 10:25:00,995 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.bypass_mid.scale_min, batch_count=846386.6666666666, ans=0.2 2023-10-02 10:25:03,452 WARNING [train.py:1204] (2/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_524-310811-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 10:25:03,620 WARNING [train.py:1204] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_307-158462-0_sp0.9 from training. Duration: 0.7666875 2023-10-02 10:25:03,620 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9309W0191-640318 from training. Duration: 0.512 2023-10-02 10:25:04,944 WARNING [train.py:1204] (2/4) Exclude cut with ID _55153_210_2253_1_1534384786677_3162096_100-204617-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 10:25:06,503 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=846386.6666666666, ans=0.1 2023-10-02 10:25:08,051 WARNING [train.py:1204] (2/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_112-149213-0_sp1.1 from training. Duration: 0.863625 2023-10-02 10:25:08,291 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.conv_module1.balancer2.prob, batch_count=846386.6666666666, ans=0.125 2023-10-02 10:25:09,589 WARNING [train.py:1204] (2/4) Exclude cut with ID _67831_210_12392_1_1535612464202_3092612_346-2148-0_sp1.1 from training. Duration: 0.8454375 2023-10-02 10:25:10,322 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.1.encoder.layers.0.feed_forward1.out_whiten, num_groups=1, num_channels=256, metric=5.14 vs. limit=15.0 2023-10-02 10:25:10,869 WARNING [train.py:1204] (2/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_51-117576-0_sp0.9 from training. Duration: 0.4444375 2023-10-02 10:25:12,855 WARNING [train.py:1204] (2/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_300-27235-0 from training. Duration: 0.87 2023-10-02 10:25:12,906 WARNING [train.py:1204] (2/4) Exclude cut with ID _40399_210_4353_1_1533305658194_3279406_195-116825-0_sp1.1 from training. Duration: 0.97275 2023-10-02 10:25:14,207 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7002_210_1794_1_1527939683_5157286_163-87552-0_sp0.9 from training. Duration: 0.9555625 2023-10-02 10:25:14,236 WARNING [train.py:1204] (2/4) Exclude cut with ID _19514_210_9259_1_1531530071330_5084230_560-237597-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 10:25:14,311 WARNING [train.py:1204] (2/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_41-234353-0_sp1.1 from training. Duration: 0.6181875 2023-10-02 10:25:15,617 WARNING [train.py:1204] (2/4) Exclude cut with ID _6274_210_2084_1_1527937583837_7494020_769-168302-0 from training. Duration: 0.71 2023-10-02 10:25:17,041 WARNING [train.py:1204] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_574-45541-0 from training. Duration: 0.65 2023-10-02 10:25:21,032 WARNING [train.py:1204] (2/4) Exclude cut with ID _50310_210_15488_1_1534068043345_3641080_262-67120-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 10:25:22,542 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_53071_210_16554_1_1534493434_1276796_35-11927-0_sp1.1 from training. Duration: 0.963625 2023-10-02 10:25:22,543 WARNING [train.py:1204] (2/4) Exclude cut with ID _3992_210_1747_1_1526644816031_3731060_107-251434-0 from training. Duration: 0.99 2023-10-02 10:25:23,866 WARNING [train.py:1204] (2/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_412-54488-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 10:25:25,194 WARNING [train.py:1204] (2/4) Exclude cut with ID _18516_210_9206_1_1531704653206_3692406_77-258490-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 10:25:26,676 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_40047_210_2290_1_1533290519_2374311_195-165693-0_sp0.9 from training. Duration: 0.988875 2023-10-02 10:25:27,983 WARNING [train.py:1204] (2/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_296-36971-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 10:25:28,057 WARNING [train.py:1204] (2/4) Exclude cut with ID _14970_210_15709_1_1530773543493_1715730_187-20259-0_sp1.1 from training. Duration: 0.8090625 2023-10-02 10:25:30,621 INFO [train.py:1046] (2/4) Epoch 24, batch 4800, loss[loss=0.1697, simple_loss=0.2446, pruned_loss=0.04742, over 23666.00 frames. ], tot_loss[loss=0.1712, simple_loss=0.2483, pruned_loss=0.047, over 4713662.13 frames. ], batch size: 120, lr: 4.24e-03, grad_scale: 16.0 2023-10-02 10:25:32,092 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_53373_210_5399_1_1534229471_4715695_135-305927-0_sp1.1 from training. Duration: 0.936375 2023-10-02 10:25:32,123 WARNING [train.py:1204] (2/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_60-18123-0 from training. Duration: 0.88 2023-10-02 10:25:32,189 WARNING [train.py:1204] (2/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_166-166990-0 from training. Duration: 0.96 2023-10-02 10:25:34,820 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_5438_210_1794_1_1527336687_2600009_233-58072-0 from training. Duration: 0.77 2023-10-02 10:25:36,219 WARNING [train.py:1204] (2/4) Exclude cut with ID _8833_210_5219_1_1528592665210_7553059_611-80313-0_sp0.9 from training. Duration: 0.711125 2023-10-02 10:25:36,246 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_53555_210_5632_1_1534595220_7232297_118-43239-0_sp1.1 from training. Duration: 0.936375 2023-10-02 10:25:37,584 WARNING [train.py:1204] (2/4) Exclude cut with ID _12369_210_4381_1_1530356078581_4388520_480-13146-0 from training. Duration: 0.85 2023-10-02 10:25:42,812 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_892-28814-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 10:25:42,853 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_24899_210_16554_1_1532046422_3827462_160-21255-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 10:25:48,679 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_154-316450-0_sp1.1 from training. Duration: 0.6909375 2023-10-02 10:25:50,082 WARNING [train.py:1204] (2/4) Exclude cut with ID _30196_210_1664_1_1533708496522_7074140_1225-301232-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 10:25:50,111 WARNING [train.py:1204] (2/4) Exclude cut with ID _28783_210_7943_1_1532671164808_7155479_414-208100-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 10:25:50,151 WARNING [train.py:1204] (2/4) Exclude cut with ID _19023_210_4353_1_1531378892552_5820780_83-254794-0 from training. Duration: 0.86 2023-10-02 10:25:50,763 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.3.encoder.layers.3.feed_forward3.out_whiten, num_groups=1, num_channels=512, metric=13.37 vs. limit=15.0 2023-10-02 10:25:51,994 WARNING [train.py:1204] (2/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_591-83752-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 10:25:52,026 WARNING [train.py:1204] (2/4) Exclude cut with ID _29877_210_6228_1_1532397791106_5276149_266-62291-0_sp1.1 from training. Duration: 0.7909375 2023-10-02 10:25:53,520 WARNING [train.py:1204] (2/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_280-161633-0_sp1.1 from training. Duration: 0.663625 2023-10-02 10:25:57,620 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7543_210_6753_1_1528539468_333764_39-156634-0_sp1.1 from training. Duration: 0.97275 2023-10-02 10:25:58,984 WARNING [train.py:1204] (2/4) Exclude cut with ID _67499_210_17282_1_1535509805220_3692430_505-312248-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 10:25:59,013 WARNING [train.py:1204] (2/4) Exclude cut with ID _5294_210_1747_1_1527249643928_3696179_180-100713-0_sp0.9 from training. Duration: 0.9 2023-10-02 10:26:00,279 WARNING [train.py:1204] (2/4) Exclude cut with ID _26528_210_9070_1_1532570410842_3851310_508-148132-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 10:26:00,289 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_211-339622-0_sp1.1 from training. Duration: 0.4090625 2023-10-02 10:26:00,307 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_40896_210_15710_1_1533433956_7100802_339-138356-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 10:26:01,599 WARNING [train.py:1204] (2/4) Exclude cut with ID _27060_210_5986_1_1532674838922_3522549_211-19933-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 10:26:04,556 WARNING [train.py:1204] (2/4) Exclude cut with ID _60229_210_27767_1_1534838406158_3655350_457-117923-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 10:26:05,986 WARNING [train.py:1204] (2/4) Exclude cut with ID _55429_210_10626_1_1534467588527_7174143_460-141439-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 10:26:07,409 WARNING [train.py:1204] (2/4) Exclude cut with ID _59170_210_6721_1_1534983633960_4203335_263-10780-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 10:26:07,432 WARNING [train.py:1204] (2/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_872-264525-0_sp0.9 from training. Duration: 0.72225 2023-10-02 10:26:09,536 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.nonlin_attention.balancer.prob, batch_count=846653.3333333334, ans=0.125 2023-10-02 10:26:10,656 WARNING [train.py:1204] (2/4) Exclude cut with ID _7624_210_1531_1_1528109802198_4933101_418-349834-0_sp0.9 from training. Duration: 0.6666875 2023-10-02 10:26:10,760 WARNING [train.py:1204] (2/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_27-53998-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 10:26:12,222 WARNING [train.py:1204] (2/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_202-183922-0 from training. Duration: 0.85 2023-10-02 10:26:12,233 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7002_210_1794_1_1527939683_5157286_1-281264-0 from training. Duration: 0.44 2023-10-02 10:26:14,032 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_53753_210_7234_1_1534320003_3594148_493-9314-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 10:26:14,044 WARNING [train.py:1204] (2/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_217-95056-0_sp1.1 from training. Duration: 0.8909375 2023-10-02 10:26:14,092 WARNING [train.py:1204] (2/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_406-45086-0_sp1.1 from training. Duration: 0.72725 2023-10-02 10:26:14,097 WARNING [train.py:1204] (2/4) Exclude cut with ID _31649_210_6228_1_1532584608063_4091078_505-213821-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 10:26:14,105 WARNING [train.py:1204] (2/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_317-175062-0_sp0.9 from training. Duration: 0.911125 2023-10-02 10:26:17,363 WARNING [train.py:1204] (2/4) Exclude cut with ID _5444_210_2151_1_1527418808464_5312210_726-27165-0_sp1.1 from training. Duration: 0.6090625 2023-10-02 10:26:17,394 WARNING [train.py:1204] (2/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_494-170437-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 10:26:21,637 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_474-64054-0_sp1.1 from training. Duration: 0.936375 2023-10-02 10:26:23,282 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.537e+02 1.885e+02 2.115e+02 2.514e+02 3.907e+02, threshold=4.229e+02, percent-clipped=0.0 2023-10-02 10:26:23,576 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=846720.0, ans=0.1 2023-10-02 10:26:24,776 WARNING [train.py:1204] (2/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_521-7556-0_sp1.1 from training. Duration: 0.97275 2023-10-02 10:26:24,944 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.conv_skip_rate, batch_count=846720.0, ans=0.0 2023-10-02 10:26:27,502 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_269-348505-0_sp1.1 from training. Duration: 0.92725 2023-10-02 10:26:29,307 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.feed_forward1.out_proj.dropout_p, batch_count=846786.6666666666, ans=0.1 2023-10-02 10:26:30,479 WARNING [train.py:1204] (2/4) Exclude cut with ID _21878_210_3525_1_1531738772002_7264330_559-184862-0 from training. Duration: 0.94 2023-10-02 10:26:30,505 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_68702_210_23689_1_1535799577_3420068_92-76040-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 10:26:31,753 WARNING [train.py:1204] (2/4) Exclude cut with ID _22826_210_9994_1_1531908201522_3279169_227-102052-0_sp1.1 from training. Duration: 0.97275 2023-10-02 10:26:31,805 WARNING [train.py:1204] (2/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_718-250907-0_sp0.9 from training. Duration: 0.7666875 2023-10-02 10:26:33,135 WARNING [train.py:1204] (2/4) Exclude cut with ID _14069_210_5632_1_1530512692033_4803390_352-143891-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 10:26:35,868 WARNING [train.py:1204] (2/4) Exclude cut with ID _6267_210_2084_1_1527764658526_3894730_65-106602-0_sp1.1 from training. Duration: 0.936375 2023-10-02 10:26:37,622 WARNING [train.py:1204] (2/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_202-183922-0_sp0.9 from training. Duration: 0.9444375 2023-10-02 10:26:37,629 WARNING [train.py:1204] (2/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_465-268827-0_sp1.1 from training. Duration: 0.97275 2023-10-02 10:26:38,995 WARNING [train.py:1204] (2/4) Exclude cut with ID _12369_210_4381_1_1530356078581_4388520_58-29064-0_sp1.1 from training. Duration: 0.9 2023-10-02 10:26:39,040 WARNING [train.py:1204] (2/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_1-99680-0_sp0.9 from training. Duration: 0.7444375 2023-10-02 10:26:40,377 WARNING [train.py:1204] (2/4) Exclude cut with ID _13253_210_1925_1_1530757775819_3577873_191-144977-0_sp1.1 from training. Duration: 0.7090625 2023-10-02 10:26:43,816 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=846853.3333333334, ans=0.1 2023-10-02 10:26:44,815 INFO [train.py:1046] (2/4) Epoch 24, batch 4850, loss[loss=0.167, simple_loss=0.2357, pruned_loss=0.04914, over 20438.00 frames. ], tot_loss[loss=0.1714, simple_loss=0.2487, pruned_loss=0.04708, over 4704523.30 frames. ], batch size: 44, lr: 4.24e-03, grad_scale: 8.0 2023-10-02 10:26:44,879 WARNING [train.py:1204] (2/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_276-112684-0_sp1.1 from training. Duration: 0.92725 2023-10-02 10:26:44,887 WARNING [train.py:1204] (2/4) Exclude cut with ID _64639_210_5816_1_1535367619437_3528000_358-411-0_sp1.1 from training. Duration: 0.97275 2023-10-02 10:26:44,894 WARNING [train.py:1204] (2/4) Exclude cut with ID _31649_210_6228_1_1532584608063_4091078_106-21199-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 10:26:46,276 WARNING [train.py:1204] (2/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_901-79438-0 from training. Duration: 0.94 2023-10-02 10:26:48,969 WARNING [train.py:1204] (2/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_718-250907-0 from training. Duration: 0.69 2023-10-02 10:26:48,974 WARNING [train.py:1204] (2/4) Exclude cut with ID _5287_210_2084_1_1527418883040_3878670_363-194161-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 10:26:48,978 WARNING [train.py:1204] (2/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_586-321321-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 10:26:50,959 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_68900_210_2087_1_1535713157_3708529_164-292661-0_sp1.1 from training. Duration: 0.963625 2023-10-02 10:26:50,960 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_26433_210_12108_1_1532157158_4056480_114-144878-0_sp1.1 from training. Duration: 0.97275 2023-10-02 10:26:53,750 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_15517_210_6753_1_1530954008_3487336_96-341874-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 10:26:58,466 WARNING [train.py:1204] (2/4) Exclude cut with ID _14268_210_9206_1_1530622771285_4714099_637-86630-0 from training. Duration: 0.51 2023-10-02 10:26:58,647 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.conv_skip_rate, batch_count=846920.0, ans=0.0 2023-10-02 10:26:59,854 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_28211_210_6388_1_1532861506_4523224_121-320092-0_sp1.1 from training. Duration: 0.92725 2023-10-02 10:27:04,257 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_141-89113-0_sp0.9 from training. Duration: 0.9555625 2023-10-02 10:27:04,360 WARNING [train.py:1204] (2/4) Exclude cut with ID _6789_210_1534_1_1527857937218_3118950_311-61262-0_sp1.1 from training. Duration: 0.5909375 2023-10-02 10:27:05,599 WARNING [train.py:1204] (2/4) Exclude cut with ID _58480_210_12392_1_1534811121111_3646160_389-200461-0_sp1.1 from training. Duration: 0.97275 2023-10-02 10:27:08,414 WARNING [train.py:1204] (2/4) Exclude cut with ID _41326_210_5854_1_1533778076509_3492080_28-156553-0_sp1.1 from training. Duration: 0.92725 2023-10-02 10:27:09,746 WARNING [train.py:1204] (2/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_577-287150-0_sp1.1 from training. Duration: 0.7454375 2023-10-02 10:27:11,764 WARNING [train.py:1204] (2/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_40-129756-0_sp1.1 from training. Duration: 0.736375 2023-10-02 10:27:11,774 WARNING [train.py:1204] (2/4) Exclude cut with ID _60113_210_16271_1_1535164212481_4386079_490-280290-0 from training. Duration: 0.98 2023-10-02 10:27:13,683 WARNING [train.py:1204] (2/4) Exclude cut with ID _58059_210_5854_1_1534901471149_3164808_34-39905-0_sp1.1 from training. Duration: 0.963625 2023-10-02 10:27:17,632 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_791-197891-0_sp1.1 from training. Duration: 0.763625 2023-10-02 10:27:17,646 WARNING [train.py:1204] (2/4) Exclude cut with ID _6789_210_1534_1_1527857937218_3118950_262-48853-0_sp1.1 from training. Duration: 0.5090625 2023-10-02 10:27:18,344 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_2655_210_2087_1_1525688959_3897400_52-279899-0_sp0.9 from training. Duration: 0.7666875 2023-10-02 10:27:18,348 WARNING [train.py:1204] (2/4) Exclude cut with ID _14884_210_2252_1_1530707459689_5561250_114-201399-0 from training. Duration: 0.76 2023-10-02 10:27:20,945 WARNING [train.py:1204] (2/4) Exclude cut with ID _19023_210_4353_1_1531378892552_5820780_83-254794-0_sp0.9 from training. Duration: 0.9555625 2023-10-02 10:27:20,961 WARNING [train.py:1204] (2/4) Exclude cut with ID _53489_210_6228_1_1534298357169_3835970_10-297919-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 10:27:25,494 WARNING [train.py:1204] (2/4) Exclude cut with ID _44585_210_18628_1_1533605350387_4518199_418-3055-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 10:27:25,513 WARNING [train.py:1204] (2/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_310-109461-0 from training. Duration: 0.8 2023-10-02 10:27:25,553 WARNING [train.py:1204] (2/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_235-262124-0 from training. Duration: 0.67 2023-10-02 10:27:28,174 WARNING [train.py:1204] (2/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_195-167011-0_sp1.1 from training. Duration: 0.6818125 2023-10-02 10:27:36,339 WARNING [train.py:1204] (2/4) Exclude cut with ID _7852_210_5219_1_1528286471316_8139400_188-55162-0_sp1.1 from training. Duration: 0.936375 2023-10-02 10:27:36,379 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_35361_210_4238_1_1533262810_7793476_675-87012-0 from training. Duration: 0.89 2023-10-02 10:27:36,455 WARNING [train.py:1204] (2/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_465-81427-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 10:27:36,468 WARNING [train.py:1204] (2/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_597-154640-0_sp1.1 from training. Duration: 0.8181875 2023-10-02 10:27:37,910 WARNING [train.py:1204] (2/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_186-309161-0_sp0.9 from training. Duration: 0.6 2023-10-02 10:27:39,395 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.attention_skip_rate, batch_count=847053.3333333334, ans=0.0 2023-10-02 10:27:40,563 WARNING [train.py:1204] (2/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_546-119312-0 from training. Duration: 0.99 2023-10-02 10:27:40,567 WARNING [train.py:1204] (2/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_330-240060-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 10:27:40,639 WARNING [train.py:1204] (2/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_477-255782-0 from training. Duration: 0.82 2023-10-02 10:27:40,662 WARNING [train.py:1204] (2/4) Exclude cut with ID _14970_210_15709_1_1530773543493_1715730_150-248939-0_sp1.1 from training. Duration: 0.97275 2023-10-02 10:27:42,417 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_556-206615-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 10:27:44,250 WARNING [train.py:1204] (2/4) Exclude cut with ID _3465_210_3525_1_1526175667060_3574049_308-231735-0 from training. Duration: 0.73 2023-10-02 10:27:50,422 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.attention_skip_rate, batch_count=847120.0, ans=0.0 2023-10-02 10:27:51,587 WARNING [train.py:1204] (2/4) Exclude cut with ID _27485_210_4916_1_1532739191677_3668269_66-236571-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 10:27:51,818 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.conv_module2.balancer1.prob, batch_count=847120.0, ans=0.125 2023-10-02 10:27:56,348 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_2221_210_1988_1_1525597051_4935541_415-296882-0_sp0.9 from training. Duration: 0.8555625 2023-10-02 10:27:56,356 WARNING [train.py:1204] (2/4) Exclude cut with ID _64521_210_14699_1_1535243427259_3815328_231-78630-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 10:27:59,045 INFO [train.py:1046] (2/4) Epoch 24, batch 4900, loss[loss=0.1869, simple_loss=0.2686, pruned_loss=0.05254, over 24355.00 frames. ], tot_loss[loss=0.171, simple_loss=0.248, pruned_loss=0.04697, over 4710440.98 frames. ], batch size: 77, lr: 4.24e-03, grad_scale: 8.0 2023-10-02 10:28:00,668 WARNING [train.py:1204] (2/4) Exclude cut with ID _13942_210_9988_1_1530604906036_3733360_499-55676-0 from training. Duration: 0.62 2023-10-02 10:28:00,670 WARNING [train.py:1204] (2/4) Exclude cut with ID _49376_210_5986_1_1533985278732_3370740_301-154763-0_sp1.1 from training. Duration: 0.8909375 2023-10-02 10:28:04,864 WARNING [train.py:1204] (2/4) Exclude cut with ID _27485_210_4916_1_1532739191677_3668269_255-216698-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 10:28:06,833 WARNING [train.py:1204] (2/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_137-334143-0_sp1.1 from training. Duration: 0.97275 2023-10-02 10:28:06,852 WARNING [train.py:1204] (2/4) Exclude cut with ID _6456_210_1534_1_1527681634212_3181049_215-309359-0_sp1.1 from training. Duration: 0.8 2023-10-02 10:28:08,366 WARNING [train.py:1204] (2/4) Exclude cut with ID _65640_210_6310_1_1535368201445_3173250_209-13008-0 from training. Duration: 0.99 2023-10-02 10:28:12,976 WARNING [train.py:1204] (2/4) Exclude cut with ID _12369_210_4381_1_1530356078581_4388520_58-29064-0 from training. Duration: 0.99 2023-10-02 10:28:17,420 WARNING [train.py:1204] (2/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_410-287857-0 from training. Duration: 0.93 2023-10-02 10:28:17,685 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.bypass_mid.scale_min, batch_count=847253.3333333334, ans=0.2 2023-10-02 10:28:18,790 WARNING [train.py:1204] (2/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_153-153537-0 from training. Duration: 0.91 2023-10-02 10:28:18,839 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7544_210_6753_1_1528587722_3576686_172-171750-0_sp0.9 from training. Duration: 0.72225 2023-10-02 10:28:18,891 WARNING [train.py:1204] (2/4) Exclude cut with ID _64521_210_14699_1_1535243427259_3815328_464-183853-0_sp1.1 from training. Duration: 0.97275 2023-10-02 10:28:18,908 WARNING [train.py:1204] (2/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_385-210308-0_sp1.1 from training. Duration: 0.963625 2023-10-02 10:28:18,916 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_46013_210_7234_1_1533776362_3744032_354-261865-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 10:28:18,922 WARNING [train.py:1204] (2/4) Exclude cut with ID _3067_210_2667_1_1526212974640_4503168_250-285937-0_sp1.1 from training. Duration: 0.72725 2023-10-02 10:28:20,220 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_40036_210_13405_1_1533648647_1315388_48-146016-0 from training. Duration: 0.93 2023-10-02 10:28:24,267 WARNING [train.py:1204] (2/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_280-161633-0 from training. Duration: 0.73 2023-10-02 10:28:24,309 WARNING [train.py:1204] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_308-161067-0_sp0.9 from training. Duration: 0.7333125 2023-10-02 10:28:25,677 WARNING [train.py:1204] (2/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_68-212801-0_sp0.9 from training. Duration: 0.811125 2023-10-02 10:28:27,524 WARNING [train.py:1204] (2/4) Exclude cut with ID _6789_210_1534_1_1527857937218_3118950_311-61262-0_sp0.9 from training. Duration: 0.72225 2023-10-02 10:28:29,240 WARNING [train.py:1204] (2/4) Exclude cut with ID _14632_210_9988_1_1530691279392_3923200_102-204477-0_sp1.1 from training. Duration: 0.8818125 2023-10-02 10:28:29,432 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.bypass.scale_min, batch_count=847320.0, ans=0.2 2023-10-02 10:28:30,606 WARNING [train.py:1204] (2/4) Exclude cut with ID _65242_210_18628_1_1535369393717_3823149_290-117517-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 10:28:31,999 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_69630_210_7234_1_1535788670_910989_35-337140-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 10:28:32,007 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_5618_210_1874_1_1527311008_5851_1-214501-0 from training. Duration: 0.689 2023-10-02 10:28:33,838 WARNING [train.py:1204] (2/4) Exclude cut with ID _8064_210_5399_1_1528372299291_4282973_168-50337-0_sp0.9 from training. Duration: 0.9333125 2023-10-02 10:28:33,915 WARNING [train.py:1204] (2/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_185-14438-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 10:28:33,930 WARNING [train.py:1204] (2/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_61-327062-0 from training. Duration: 0.81 2023-10-02 10:28:35,181 WARNING [train.py:1204] (2/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_98-741-0 from training. Duration: 0.8 2023-10-02 10:28:35,559 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.bypass.scale_min, batch_count=847320.0, ans=0.2 2023-10-02 10:28:36,793 WARNING [train.py:1204] (2/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_312-204798-0 from training. Duration: 0.93 2023-10-02 10:28:39,430 WARNING [train.py:1204] (2/4) Exclude cut with ID _14998_210_5219_1_1530705582080_7474970_787-294416-0_sp0.9 from training. Duration: 0.911125 2023-10-02 10:28:40,781 WARNING [train.py:1204] (2/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_140-181176-0_sp1.1 from training. Duration: 0.836375 2023-10-02 10:28:40,806 WARNING [train.py:1204] (2/4) Exclude cut with ID _14687_210_5854_1_1530703838259_3664889_115-239046-0_sp1.1 from training. Duration: 0.6090625 2023-10-02 10:28:42,599 WARNING [train.py:1204] (2/4) Exclude cut with ID _56423_210_5854_1_1534896061049_3165969_370-230614-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 10:28:42,643 WARNING [train.py:1204] (2/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_406-275649-0_sp0.9 from training. Duration: 0.4666875 2023-10-02 10:28:42,659 WARNING [train.py:1204] (2/4) Exclude cut with ID _15150_210_8341_1_1530752411803_7256384_22-138274-0_sp1.1 from training. Duration: 0.87275 2023-10-02 10:28:44,007 WARNING [train.py:1204] (2/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_125-125818-0 from training. Duration: 0.96 2023-10-02 10:28:47,173 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_4399_210_1874_1_1526705485_2400757_220-47974-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 10:28:47,425 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.bypass.scale_min, batch_count=847386.6666666666, ans=0.2 2023-10-02 10:28:48,558 WARNING [train.py:1204] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_308-161067-0_sp1.1 from training. Duration: 0.6 2023-10-02 10:28:51,077 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.516e+02 1.854e+02 2.024e+02 2.236e+02 3.450e+02, threshold=4.049e+02, percent-clipped=0.0 2023-10-02 10:28:51,188 WARNING [train.py:1204] (2/4) Exclude cut with ID _53489_210_6228_1_1534298357169_3835970_102-57645-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 10:28:55,336 WARNING [train.py:1204] (2/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_178-177582-0 from training. Duration: 0.74 2023-10-02 10:28:56,639 WARNING [train.py:1204] (2/4) Exclude cut with ID _14349_210_8341_1_1530615710818_3442470_170-336541-0_sp1.1 from training. Duration: 0.7818125 2023-10-02 10:28:56,703 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_521-214276-0_sp0.9 from training. Duration: 0.488875 2023-10-02 10:28:56,910 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.conv_skip_rate, batch_count=847453.3333333334, ans=0.0 2023-10-02 10:28:57,972 WARNING [train.py:1204] (2/4) Exclude cut with ID _66070_210_7640_1_1535441393096_7805310_489-292631-0 from training. Duration: 0.79 2023-10-02 10:29:03,665 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.2.encoder.layers.1.whiten, num_groups=1, num_channels=384, metric=4.65 vs. limit=12.0 2023-10-02 10:29:05,571 WARNING [train.py:1204] (2/4) Exclude cut with ID _14069_210_5632_1_1530512692033_4803390_438-340023-0_sp1.1 from training. Duration: 0.936375 2023-10-02 10:29:05,654 WARNING [train.py:1204] (2/4) Exclude cut with ID _7852_210_5219_1_1528286471316_8139400_242-105336-0_sp1.1 from training. Duration: 0.6545625 2023-10-02 10:29:05,926 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.out_combiner.scale_min, batch_count=847453.3333333334, ans=0.2 2023-10-02 10:29:07,523 WARNING [train.py:1204] (2/4) Exclude cut with ID _3867_210_1883_1_1526641200575_4475830_272-215563-0 from training. Duration: 0.57 2023-10-02 10:29:07,530 WARNING [train.py:1204] (2/4) Exclude cut with ID _14368_210_4220_1_1530532714870_3595529_143-117421-0_sp0.9 from training. Duration: 0.6444375 2023-10-02 10:29:07,536 WARNING [train.py:1204] (2/4) Exclude cut with ID _9235_210_5219_1_1528891154253_8046120_30-326681-0_sp1.1 from training. Duration: 0.7545625 2023-10-02 10:29:08,855 WARNING [train.py:1204] (2/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_538-218448-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 10:29:11,574 WARNING [train.py:1204] (2/4) Exclude cut with ID _33640_210_2655_1_1533117605479_4855469_60-192970-0_sp1.1 from training. Duration: 0.963625 2023-10-02 10:29:11,581 WARNING [train.py:1204] (2/4) Exclude cut with ID _65585_210_12210_1_1535450451584_3764098_112-294301-0_sp1.1 from training. Duration: 0.8 2023-10-02 10:29:11,707 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.0.nonlin_attention.balancer.prob, batch_count=847520.0, ans=0.125 2023-10-02 10:29:12,837 INFO [train.py:1046] (2/4) Epoch 24, batch 4950, loss[loss=0.152, simple_loss=0.2191, pruned_loss=0.04239, over 22755.00 frames. ], tot_loss[loss=0.17, simple_loss=0.2468, pruned_loss=0.04665, over 4705898.15 frames. ], batch size: 322, lr: 4.24e-03, grad_scale: 8.0 2023-10-02 10:29:12,888 WARNING [train.py:1204] (2/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_643-37412-0_sp1.1 from training. Duration: 0.936375 2023-10-02 10:29:12,912 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_9302_210_2550_1_1528889962_4655416_638-301620-0_sp1.1 from training. Duration: 0.42725 2023-10-02 10:29:13,019 WARNING [train.py:1204] (2/4) Exclude cut with ID _3867_210_1883_1_1526641200575_4475830_272-215563-0_sp1.1 from training. Duration: 0.5181875 2023-10-02 10:29:17,072 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_53679_210_4852_1_1534556726_4186141_358-154143-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 10:29:17,084 WARNING [train.py:1204] (2/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_841-341679-0_sp0.9 from training. Duration: 0.6444375 2023-10-02 10:29:19,613 WARNING [train.py:1204] (2/4) Exclude cut with ID _7160_210_1876_1_1527940923231_3703799_302-150728-0 from training. Duration: 0.78 2023-10-02 10:29:20,798 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_125-203105-0 from training. Duration: 0.8 2023-10-02 10:29:20,822 WARNING [train.py:1204] (2/4) Exclude cut with ID _5585_210_2667_1_1527418813324_8007340_89-332072-0_sp0.9 from training. Duration: 0.711125 2023-10-02 10:29:22,170 WARNING [train.py:1204] (2/4) Exclude cut with ID _3992_210_1747_1_1526644816031_3731060_36-147855-0 from training. Duration: 0.78 2023-10-02 10:29:22,195 WARNING [train.py:1204] (2/4) Exclude cut with ID _59256_210_5986_1_1535076085997_3589120_172-239045-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 10:29:23,470 WARNING [train.py:1204] (2/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_472-216663-0_sp1.1 from training. Duration: 0.836375 2023-10-02 10:29:23,527 WARNING [train.py:1204] (2/4) Exclude cut with ID _36512_210_6987_1_1533033143998_3126069_181-306500-0_sp1.1 from training. Duration: 0.863625 2023-10-02 10:29:23,549 WARNING [train.py:1204] (2/4) Exclude cut with ID _56524_210_9642_1_1534572011779_3619170_492-95008-0_sp1.1 from training. Duration: 0.97275 2023-10-02 10:29:26,288 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_33348_210_5632_1_1533086912_3324030_119-90326-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 10:29:26,330 WARNING [train.py:1204] (2/4) Exclude cut with ID _14368_210_4220_1_1530532714870_3595529_231-137892-0_sp1.1 from training. Duration: 0.9 2023-10-02 10:29:27,701 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8453_210_4211_1_1528502940_8562730_596-306820-0_sp1.1 from training. Duration: 0.8818125 2023-10-02 10:29:27,779 WARNING [train.py:1204] (2/4) Exclude cut with ID _27052_210_5986_1_1532329197187_3585009_50-242750-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 10:29:31,107 WARNING [train.py:1204] (2/4) Exclude cut with ID _13913_210_9994_1_1530579615187_3383897_358-98435-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 10:29:31,126 WARNING [train.py:1204] (2/4) Exclude cut with ID _27013_210_1389_1_1532437173240_3252649_399-229778-0_sp1.1 from training. Duration: 0.963625 2023-10-02 10:29:33,927 WARNING [train.py:1204] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_185-174805-0_sp0.9 from training. Duration: 0.7555625 2023-10-02 10:29:38,444 WARNING [train.py:1204] (2/4) Exclude cut with ID _65585_210_12210_1_1535450451584_3764098_196-221014-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 10:29:39,801 WARNING [train.py:1204] (2/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_202-293523-0_sp1.1 from training. Duration: 0.6545625 2023-10-02 10:29:42,478 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8275_210_4198_1_1528455320_3892571_213-127634-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 10:29:42,530 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_40896_210_15710_1_1533433956_7100802_921-231678-0_sp1.1 from training. Duration: 0.97275 2023-10-02 10:29:45,679 WARNING [train.py:1204] (2/4) Exclude cut with ID _38020_210_5986_1_1533112252259_7006089_493-102745-0_sp1.1 from training. Duration: 0.8909375 2023-10-02 10:29:45,955 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.bypass.skip_rate, batch_count=847653.3333333334, ans=0.09899494936611666 2023-10-02 10:29:47,119 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6846_210_6753_1_1527850767_4295158_2-315073-0 from training. Duration: 0.88 2023-10-02 10:29:47,189 WARNING [train.py:1204] (2/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_249-281349-0 from training. Duration: 0.85 2023-10-02 10:29:47,365 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.conv_module2.balancer1.prob, batch_count=847653.3333333334, ans=0.125 2023-10-02 10:29:50,628 WARNING [train.py:1204] (2/4) Exclude cut with ID _18513_210_9206_1_1531618215223_3862620_314-83429-0_sp1.1 from training. Duration: 0.97275 2023-10-02 10:29:53,315 WARNING [train.py:1204] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_215-202973-0_sp0.9 from training. Duration: 0.97775 2023-10-02 10:29:53,325 WARNING [train.py:1204] (2/4) Exclude cut with ID _7719_210_3525_1_1528456153163_4296960_88-250259-0_sp1.1 from training. Duration: 0.763625 2023-10-02 10:29:53,435 WARNING [train.py:1204] (2/4) Exclude cut with ID _7606_210_4915_1_1528894253277_5080690_301-10732-0_sp1.1 from training. Duration: 0.736375 2023-10-02 10:29:53,444 WARNING [train.py:1204] (2/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_818-11907-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 10:29:53,681 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.balancer2.prob, batch_count=847653.3333333334, ans=0.125 2023-10-02 10:29:54,852 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_5959_210_1794_1_1527400400_3445726_412-44052-0_sp1.1 from training. Duration: 0.7 2023-10-02 10:29:57,625 WARNING [train.py:1204] (2/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_6-279195-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 10:29:59,065 WARNING [train.py:1204] (2/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_252-216674-0_sp0.9 from training. Duration: 0.6 2023-10-02 10:29:59,199 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.0.conv_module2.balancer2.prob, batch_count=847720.0, ans=0.125 2023-10-02 10:30:02,150 WARNING [train.py:1204] (2/4) Exclude cut with ID _14368_210_4220_1_1530532714870_3595529_144-3287-0_sp1.1 from training. Duration: 0.6090625 2023-10-02 10:30:04,827 WARNING [train.py:1204] (2/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_342-3879-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 10:30:04,863 WARNING [train.py:1204] (2/4) Exclude cut with ID _33640_210_2655_1_1533117605479_4855469_584-65630-0_sp1.1 from training. Duration: 0.97275 2023-10-02 10:30:06,203 WARNING [train.py:1204] (2/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_37-189262-0 from training. Duration: 0.98 2023-10-02 10:30:06,239 WARNING [train.py:1204] (2/4) Exclude cut with ID _9389_210_1664_1_1529217337301_5289269_379-128794-0_sp0.9 from training. Duration: 0.9444375 2023-10-02 10:30:07,692 WARNING [train.py:1204] (2/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_119-207162-0_sp1.1 from training. Duration: 0.6454375 2023-10-02 10:30:09,354 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.conv_module2.balancer2.prob, batch_count=847720.0, ans=0.125 2023-10-02 10:30:11,118 WARNING [train.py:1204] (2/4) Exclude cut with ID _2941_210_2577_1_1526199673150_2489759_200-333643-0_sp1.1 from training. Duration: 0.936375 2023-10-02 10:30:11,197 WARNING [train.py:1204] (2/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_479-125611-0_sp1.1 from training. Duration: 0.87275 2023-10-02 10:30:11,404 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.conv_skip_rate, batch_count=847786.6666666666, ans=0.0 2023-10-02 10:30:12,470 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_3475_210_2028_1_1526193578_3622569_64-297072-0_sp1.1 from training. Duration: 0.82725 2023-10-02 10:30:12,512 WARNING [train.py:1204] (2/4) Exclude cut with ID _53565_210_10635_1_1534337218335_3820259_139-346291-0_sp1.1 from training. Duration: 0.97275 2023-10-02 10:30:13,794 WARNING [train.py:1204] (2/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_588-125165-0_sp1.1 from training. Duration: 0.7545625 2023-10-02 10:30:13,860 WARNING [train.py:1204] (2/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_938-205135-0_sp1.1 from training. Duration: 0.7909375 2023-10-02 10:30:16,937 WARNING [train.py:1204] (2/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_216-48707-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 10:30:16,992 WARNING [train.py:1204] (2/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_477-255782-0_sp1.1 from training. Duration: 0.7454375 2023-10-02 10:30:17,019 WARNING [train.py:1204] (2/4) Exclude cut with ID _16909_210_5724_1_1531292079912_4118919_583-92842-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 10:30:18,397 WARNING [train.py:1204] (2/4) Exclude cut with ID _60860_210_8254_1_1534896015875_3557169_245-256666-0 from training. Duration: 0.97 2023-10-02 10:30:23,203 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.balancer2.prob, batch_count=847786.6666666666, ans=0.125 2023-10-02 10:30:24,347 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_45399_210_4852_1_1533624725_7579737_349-116173-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 10:30:27,212 INFO [train.py:1046] (2/4) Epoch 24, batch 5000, loss[loss=0.17, simple_loss=0.2491, pruned_loss=0.0454, over 24464.00 frames. ], tot_loss[loss=0.1691, simple_loss=0.2461, pruned_loss=0.04606, over 4716791.75 frames. ], batch size: 63, lr: 4.23e-03, grad_scale: 8.0 2023-10-02 10:30:27,342 WARNING [train.py:1204] (2/4) Exclude cut with ID _8699_210_2069_1_1528630346831_3960409_155-39766-0 from training. Duration: 0.78 2023-10-02 10:30:27,351 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_86-16881-0_sp1.1 from training. Duration: 0.52725 2023-10-02 10:30:35,836 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_14056_210_4163_1_1530604483_7572827_191-159384-0_sp1.1 from training. Duration: 0.97275 2023-10-02 10:30:35,843 WARNING [train.py:1204] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_307-158462-0_sp1.1 from training. Duration: 0.62725 2023-10-02 10:30:35,940 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7544_210_6753_1_1528591327_1471426_33-346031-0 from training. Duration: 0.61 2023-10-02 10:30:37,322 WARNING [train.py:1204] (2/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_112-149213-0 from training. Duration: 0.95 2023-10-02 10:30:40,154 WARNING [train.py:1204] (2/4) Exclude cut with ID _55844_210_2346_1_1534471212569_7625707_925-191342-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 10:30:40,299 WARNING [train.py:1204] (2/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_87-278686-0 from training. Duration: 0.77 2023-10-02 10:30:40,321 WARNING [train.py:1204] (2/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_415-78792-0_sp1.1 from training. Duration: 0.736375 2023-10-02 10:30:40,331 WARNING [train.py:1204] (2/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_107-332398-0_sp0.9 from training. Duration: 0.8666875 2023-10-02 10:30:42,195 WARNING [train.py:1204] (2/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_72-251255-0 from training. Duration: 0.94 2023-10-02 10:30:43,440 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_431-227273-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 10:30:43,494 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7544_210_6753_1_1528587000_691468_87-178599-0_sp0.9 from training. Duration: 0.9666875 2023-10-02 10:30:44,835 WARNING [train.py:1204] (2/4) Exclude cut with ID _13916_210_9994_1_1530788341126_3763149_60-191556-0 from training. Duration: 0.61 2023-10-02 10:30:44,837 WARNING [train.py:1204] (2/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_746-164782-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 10:30:44,874 WARNING [train.py:1204] (2/4) Exclude cut with ID _33640_210_2655_1_1533117605479_4855469_596-21066-0_sp1.1 from training. Duration: 0.92725 2023-10-02 10:30:46,292 WARNING [train.py:1204] (2/4) Exclude cut with ID _40402_210_18219_1_1533522324115_2953150_320-14078-0 from training. Duration: 0.95 2023-10-02 10:30:46,322 WARNING [train.py:1204] (2/4) Exclude cut with ID _29877_210_6228_1_1532397791106_5276149_266-62291-0 from training. Duration: 0.87 2023-10-02 10:30:46,481 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.conv_module1.balancer2.prob, batch_count=847920.0, ans=0.125 2023-10-02 10:30:48,282 WARNING [train.py:1204] (2/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_176-56120-0_sp0.9 from training. Duration: 0.92225 2023-10-02 10:30:48,323 WARNING [train.py:1204] (2/4) Exclude cut with ID _6275_210_2084_1_1528110062213_7669049_91-26919-0 from training. Duration: 0.97 2023-10-02 10:30:48,330 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_224-322394-0_sp0.9 from training. Duration: 0.7333125 2023-10-02 10:30:48,377 WARNING [train.py:1204] (2/4) Exclude cut with ID _44982_210_1511_1_1534312820108_3726029_586-297059-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 10:30:49,700 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_196-289637-0_sp1.1 from training. Duration: 0.6818125 2023-10-02 10:30:49,701 WARNING [train.py:1204] (2/4) Exclude cut with ID _18513_210_9206_1_1531618215223_3862620_402-5957-0 from training. Duration: 0.99 2023-10-02 10:30:49,707 WARNING [train.py:1204] (2/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_81-300212-0 from training. Duration: 0.6 2023-10-02 10:30:51,093 WARNING [train.py:1204] (2/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_166-128941-0 from training. Duration: 0.54 2023-10-02 10:30:51,105 WARNING [train.py:1204] (2/4) Exclude cut with ID _58059_210_5854_1_1534901471149_3164808_57-309660-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 10:30:51,815 WARNING [train.py:1204] (2/4) Exclude cut with ID _60621_210_17071_1_1535013053530_3671739_73-155814-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 10:30:53,168 WARNING [train.py:1204] (2/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_726-270780-0 from training. Duration: 0.86 2023-10-02 10:30:53,186 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8853_210_3273_1_1528633899_818968_51-187685-0_sp1.1 from training. Duration: 0.62725 2023-10-02 10:30:55,803 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_315-212794-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 10:30:55,877 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_53373_210_5399_1_1534229471_4715695_314-122795-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 10:30:56,109 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.conv_module2.balancer1.prob, batch_count=847986.6666666666, ans=0.125 2023-10-02 10:30:56,115 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.nonlin_attention.balancer.prob, batch_count=847986.6666666666, ans=0.125 2023-10-02 10:30:56,520 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.4.encoder.layers.0.conv_module2.whiten, num_groups=1, num_channels=384, metric=9.13 vs. limit=15.0 2023-10-02 10:30:57,304 WARNING [train.py:1204] (2/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_187-221566-0_sp0.9 from training. Duration: 0.47775 2023-10-02 10:30:59,388 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.conv_module1.balancer1.prob, batch_count=847986.6666666666, ans=0.125 2023-10-02 10:31:00,807 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_12868_210_6753_1_1530702479_3610564_133-564-0 from training. Duration: 0.79 2023-10-02 10:31:02,107 WARNING [train.py:1204] (2/4) Exclude cut with ID _55153_210_2253_1_1534384786677_3162096_255-228344-0_sp1.1 from training. Duration: 0.936375 2023-10-02 10:31:02,354 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.conv_module1.balancer1.prob, batch_count=847986.6666666666, ans=0.125 2023-10-02 10:31:03,465 WARNING [train.py:1204] (2/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_383-165234-0_sp1.1 from training. Duration: 0.8545625 2023-10-02 10:31:06,217 WARNING [train.py:1204] (2/4) Exclude cut with ID IC0677W0056-334889 from training. Duration: 0.768 2023-10-02 10:31:09,470 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_14953_210_2151_1_1530701710_5616165_710-165400-0_sp0.9 from training. Duration: 0.9666875 2023-10-02 10:31:10,849 WARNING [train.py:1204] (2/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_355-236205-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 10:31:10,851 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_34450_210_9547_1_1533099443_4109050_503-246893-0_sp1.1 from training. Duration: 0.963625 2023-10-02 10:31:11,550 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.3.encoder.layers.1.self_attn1.whiten, num_groups=1, num_channels=512, metric=14.56 vs. limit=22.5 2023-10-02 10:31:13,609 WARNING [train.py:1204] (2/4) Exclude cut with ID _43552_210_8913_1_1533726031059_3523429_25-120161-0 from training. Duration: 0.98 2023-10-02 10:31:13,636 WARNING [train.py:1204] (2/4) Exclude cut with ID _66112_210_10635_1_1535590236305_3801484_205-311930-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 10:31:14,968 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_48049_210_5632_1_1533864553_3519937_185-281218-0_sp1.1 from training. Duration: 0.92725 2023-10-02 10:31:15,006 WARNING [train.py:1204] (2/4) Exclude cut with ID _48782_210_12658_1_1534467701544_2300422_191-332415-0_sp1.1 from training. Duration: 0.97275 2023-10-02 10:31:16,476 WARNING [train.py:1204] (2/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_907-151398-0_sp1.1 from training. Duration: 0.336375 2023-10-02 10:31:18,393 WARNING [train.py:1204] (2/4) Exclude cut with ID _67499_210_17282_1_1535509805220_3692430_20-179346-0_sp1.1 from training. Duration: 0.8818125 2023-10-02 10:31:19,761 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.507e+02 1.837e+02 2.022e+02 2.310e+02 4.101e+02, threshold=4.045e+02, percent-clipped=1.0 2023-10-02 10:31:19,845 WARNING [train.py:1204] (2/4) Exclude cut with ID _14998_210_5219_1_1530705582080_7474970_441-91466-0_sp1.1 from training. Duration: 0.8818125 2023-10-02 10:31:19,917 WARNING [train.py:1204] (2/4) Exclude cut with ID _46681_210_9642_1_1533893005571_7603359_786-164007-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 10:31:24,567 WARNING [train.py:1204] (2/4) Exclude cut with ID _14970_210_15709_1_1530773543493_1715730_187-20259-0 from training. Duration: 0.89 2023-10-02 10:31:28,660 WARNING [train.py:1204] (2/4) Exclude cut with ID _12734_210_8341_1_1530183614296_3891929_383-339075-0_sp1.1 from training. Duration: 0.963625 2023-10-02 10:31:31,262 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.2.encoder.layers.0.feed_forward1.out_whiten, num_groups=1, num_channels=384, metric=6.06 vs. limit=15.0 2023-10-02 10:31:37,231 WARNING [train.py:1204] (2/4) Exclude cut with ID _37086_210_9038_1_1532999540891_7021250_834-322225-0_sp1.1 from training. Duration: 0.92725 2023-10-02 10:31:40,472 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_10987_210_4163_1_1529760555_4123902_56-290559-0_sp1.1 from training. Duration: 0.963625 2023-10-02 10:31:40,479 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_92-246270-0_sp0.9 from training. Duration: 0.9444375 2023-10-02 10:31:40,513 WARNING [train.py:1204] (2/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_56-13561-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 10:31:41,900 INFO [train.py:1046] (2/4) Epoch 24, batch 5050, loss[loss=0.171, simple_loss=0.2535, pruned_loss=0.0442, over 24461.00 frames. ], tot_loss[loss=0.1703, simple_loss=0.2476, pruned_loss=0.04646, over 4719960.33 frames. ], batch size: 69, lr: 4.23e-03, grad_scale: 8.0 2023-10-02 10:31:41,948 WARNING [train.py:1204] (2/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_393-57888-0_sp1.1 from training. Duration: 0.5818125 2023-10-02 10:31:41,969 WARNING [train.py:1204] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_168-32906-0_sp1.1 from training. Duration: 0.62725 2023-10-02 10:31:42,020 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_51225_210_3601_1_1534400634_2327383_127-207553-0_sp1.1 from training. Duration: 0.963625 2023-10-02 10:31:46,120 WARNING [train.py:1204] (2/4) Exclude cut with ID _58583_210_5236_1_1534726835266_3574249_494-203521-0_sp1.1 from training. Duration: 0.963625 2023-10-02 10:31:46,148 WARNING [train.py:1204] (2/4) Exclude cut with ID _49376_210_5986_1_1533985278732_3370740_354-344695-0 from training. Duration: 0.99 2023-10-02 10:31:48,151 WARNING [train.py:1204] (2/4) Exclude cut with ID _8021_210_7994_1_1528376451358_3653069_179-45206-0_sp1.1 from training. Duration: 0.7818125 2023-10-02 10:31:49,572 WARNING [train.py:1204] (2/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_742-154017-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 10:31:51,007 WARNING [train.py:1204] (2/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_222-198127-0_sp1.1 from training. Duration: 0.763625 2023-10-02 10:31:51,068 WARNING [train.py:1204] (2/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_235-307337-0 from training. Duration: 0.94 2023-10-02 10:31:52,571 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.ff2_skip_rate, batch_count=848186.6666666666, ans=0.0 2023-10-02 10:31:53,674 WARNING [train.py:1204] (2/4) Exclude cut with ID _8393_210_6702_1_1528505860249_3457328_453-176500-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 10:31:53,718 WARNING [train.py:1204] (2/4) Exclude cut with ID _53565_210_10635_1_1534337218335_3820259_102-179528-0_sp1.1 from training. Duration: 0.97275 2023-10-02 10:31:55,601 WARNING [train.py:1204] (2/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_43-206757-0_sp1.1 from training. Duration: 0.6818125 2023-10-02 10:31:56,944 WARNING [train.py:1204] (2/4) Exclude cut with ID _8394_210_6702_1_1528590504316_3540189_335-274780-0_sp1.1 from training. Duration: 0.7090625 2023-10-02 10:31:58,295 WARNING [train.py:1204] (2/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_128-296581-0_sp1.1 from training. Duration: 0.67275 2023-10-02 10:32:07,107 WARNING [train.py:1204] (2/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_111-112362-0 from training. Duration: 0.91 2023-10-02 10:32:07,156 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_5522_210_3273_1_1527249606_3909096_366-172508-0_sp1.1 from training. Duration: 0.6 2023-10-02 10:32:08,518 WARNING [train.py:1204] (2/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_47-167373-0_sp1.1 from training. Duration: 0.836375 2023-10-02 10:32:08,549 WARNING [train.py:1204] (2/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_41-234353-0 from training. Duration: 0.68 2023-10-02 10:32:08,573 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6291_210_3273_1_1527683649_1078498_82-221196-0_sp0.9 from training. Duration: 0.8444375 2023-10-02 10:32:10,454 WARNING [train.py:1204] (2/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_47-4814-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 10:32:10,488 WARNING [train.py:1204] (2/4) Exclude cut with ID _43541_210_10626_1_1533796233418_3635440_40-247993-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 10:32:11,886 WARNING [train.py:1204] (2/4) Exclude cut with ID _21989_210_5854_1_1531899214931_3271360_133-44759-0_sp0.9 from training. Duration: 0.8555625 2023-10-02 10:32:11,899 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_94-195801-0 from training. Duration: 0.94 2023-10-02 10:32:13,226 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_141-89113-0 from training. Duration: 0.86 2023-10-02 10:32:14,513 WARNING [train.py:1204] (2/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_683-284578-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 10:32:17,139 WARNING [train.py:1204] (2/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_6-214815-0_sp1.1 from training. Duration: 0.77275 2023-10-02 10:32:20,498 WARNING [train.py:1204] (2/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_556-212082-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 10:32:20,539 WARNING [train.py:1204] (2/4) Exclude cut with ID _44902_210_16187_1_1533888006062_3792078_149-67212-0 from training. Duration: 0.87 2023-10-02 10:32:21,935 WARNING [train.py:1204] (2/4) Exclude cut with ID _61148_210_6897_1_1535094022726_4217610_333-159440-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 10:32:23,419 WARNING [train.py:1204] (2/4) Exclude cut with ID _3120_210_3525_1_1526036143112_4072980_404-249994-0 from training. Duration: 0.99 2023-10-02 10:32:25,470 WARNING [train.py:1204] (2/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_234-260682-0_sp0.9 from training. Duration: 0.9333125 2023-10-02 10:32:26,814 WARNING [train.py:1204] (2/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_374-138971-0_sp1.1 from training. Duration: 0.8545625 2023-10-02 10:32:26,880 WARNING [train.py:1204] (2/4) Exclude cut with ID _53713_210_24116_1_1534314487762_7416751_743-302085-0_sp1.1 from training. Duration: 0.92725 2023-10-02 10:32:27,739 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.1.encoder.layers.1.conv_module1.whiten, num_groups=1, num_channels=256, metric=10.35 vs. limit=15.0 2023-10-02 10:32:28,260 WARNING [train.py:1204] (2/4) Exclude cut with ID _18516_210_9206_1_1531704653206_3692406_198-334870-0_sp1.1 from training. Duration: 0.836375 2023-10-02 10:32:29,741 WARNING [train.py:1204] (2/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_395-73570-0_sp1.1 from training. Duration: 0.9 2023-10-02 10:32:31,035 WARNING [train.py:1204] (2/4) Exclude cut with ID _46683_210_9642_1_1533730913492_4591116_421-19547-0_sp1.1 from training. Duration: 0.936375 2023-10-02 10:32:32,242 WARNING [train.py:1204] (2/4) Exclude cut with ID _19896_210_1883_1_1531709022662_6649569_714-8668-0_sp1.1 from training. Duration: 0.963625 2023-10-02 10:32:32,263 WARNING [train.py:1204] (2/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_49-101072-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 10:32:32,283 WARNING [train.py:1204] (2/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_220-191080-0_sp1.1 from training. Duration: 0.8909375 2023-10-02 10:32:32,328 WARNING [train.py:1204] (2/4) Exclude cut with ID _3465_210_3525_1_1526175667060_3574049_62-335552-0 from training. Duration: 0.87 2023-10-02 10:32:34,151 WARNING [train.py:1204] (2/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_111-112362-0_sp1.1 from training. Duration: 0.82725 2023-10-02 10:32:34,390 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.balancer_na.min_abs, batch_count=848386.6666666666, ans=0.02 2023-10-02 10:32:35,474 WARNING [train.py:1204] (2/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_318-59097-0_sp0.9 from training. Duration: 0.8444375 2023-10-02 10:32:38,169 WARNING [train.py:1204] (2/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_98-255710-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 10:32:38,175 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9042W0003-515609 from training. Duration: 0.896 2023-10-02 10:32:38,183 WARNING [train.py:1204] (2/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_158-250116-0_sp0.9 from training. Duration: 0.688875 2023-10-02 10:32:39,643 WARNING [train.py:1204] (2/4) Exclude cut with ID _26750_210_5665_1_1532226678442_4063230_7-346885-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 10:32:39,699 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_37839_210_7234_1_1533542462_3689303_357-227078-0_sp1.1 from training. Duration: 0.963625 2023-10-02 10:32:41,446 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9052W0119-520904 from training. Duration: 0.896 2023-10-02 10:32:45,385 WARNING [train.py:1204] (2/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_249-281349-0_sp1.1 from training. Duration: 0.77275 2023-10-02 10:32:45,397 WARNING [train.py:1204] (2/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_147-293311-0 from training. Duration: 0.94 2023-10-02 10:32:45,397 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_158-21166-0_sp1.1 from training. Duration: 0.963625 2023-10-02 10:32:49,669 WARNING [train.py:1204] (2/4) Exclude cut with ID _14100_210_8341_1_1530529320613_3678069_207-332194-0_sp1.1 from training. Duration: 0.92725 2023-10-02 10:32:49,708 WARNING [train.py:1204] (2/4) Exclude cut with ID _9389_210_1664_1_1529217337301_5289269_324-211031-0_sp1.1 from training. Duration: 0.963625 2023-10-02 10:32:49,719 WARNING [train.py:1204] (2/4) Exclude cut with ID _14603_210_4220_1_1530615688441_3097319_133-158308-0 from training. Duration: 0.84 2023-10-02 10:32:49,937 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.conv_module2.balancer2.prob, batch_count=848453.3333333334, ans=0.125 2023-10-02 10:32:51,748 WARNING [train.py:1204] (2/4) Exclude cut with ID _13252_210_1925_1_1530671372047_3336909_166-314607-0 from training. Duration: 0.54 2023-10-02 10:32:53,204 WARNING [train.py:1204] (2/4) Exclude cut with ID _32704_210_8954_1_1533017488840_6747370_649-79538-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 10:32:53,214 WARNING [train.py:1204] (2/4) Exclude cut with ID _28394_210_8913_1_1532430049518_3650118_343-115667-0_sp1.1 from training. Duration: 0.97275 2023-10-02 10:32:54,511 WARNING [train.py:1204] (2/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_87-278686-0_sp0.9 from training. Duration: 0.8555625 2023-10-02 10:32:56,185 INFO [train.py:1046] (2/4) Epoch 24, batch 5100, loss[loss=0.1837, simple_loss=0.2593, pruned_loss=0.05404, over 23580.00 frames. ], tot_loss[loss=0.1711, simple_loss=0.2485, pruned_loss=0.0469, over 4717727.53 frames. ], batch size: 256, lr: 4.23e-03, grad_scale: 8.0 2023-10-02 10:32:56,325 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9109W0003-549749 from training. Duration: 0.768 2023-10-02 10:32:58,970 WARNING [train.py:1204] (2/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_126-29104-0_sp1.1 from training. Duration: 0.77275 2023-10-02 10:33:03,009 WARNING [train.py:1204] (2/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_325-113798-0 from training. Duration: 0.85 2023-10-02 10:33:03,078 WARNING [train.py:1204] (2/4) Exclude cut with ID _6694_210_4599_1_1527852637490_3965529_116-109398-0 from training. Duration: 0.73 2023-10-02 10:33:04,374 WARNING [train.py:1204] (2/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_46-57119-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 10:33:06,271 WARNING [train.py:1204] (2/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_546-119312-0_sp1.1 from training. Duration: 0.9 2023-10-02 10:33:08,960 WARNING [train.py:1204] (2/4) Exclude cut with ID _60057_210_10640_1_1534903065793_3333150_468-306939-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 10:33:08,995 WARNING [train.py:1204] (2/4) Exclude cut with ID _15150_210_8341_1_1530752411803_7256384_22-138274-0 from training. Duration: 0.96 2023-10-02 10:33:09,020 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_289-180904-0 from training. Duration: 0.76 2023-10-02 10:33:10,678 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.conv_module1.balancer1.prob, batch_count=848586.6666666666, ans=0.125 2023-10-02 10:33:13,984 WARNING [train.py:1204] (2/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_64-133721-0_sp1.1 from training. Duration: 0.92725 2023-10-02 10:33:15,270 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_195-257512-0_sp1.1 from training. Duration: 0.6090625 2023-10-02 10:33:20,658 WARNING [train.py:1204] (2/4) Exclude cut with ID _66873_210_5517_1_1535454136506_4293370_704-101963-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 10:33:22,175 WARNING [train.py:1204] (2/4) Exclude cut with ID _4366_210_1388_1_1526792053633_3613819_231-211402-0 from training. Duration: 0.94 2023-10-02 10:33:22,231 WARNING [train.py:1204] (2/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_679-190385-0_sp1.1 from training. Duration: 0.97275 2023-10-02 10:33:25,513 WARNING [train.py:1204] (2/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_298-81459-0_sp1.1 from training. Duration: 0.963625 2023-10-02 10:33:25,530 WARNING [train.py:1204] (2/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_658-282610-0_sp0.9 from training. Duration: 0.588875 2023-10-02 10:33:28,773 WARNING [train.py:1204] (2/4) Exclude cut with ID _57572_210_5517_1_1534921219624_3809484_196-283734-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 10:33:28,853 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_53071_210_16554_1_1534490405_2954066_294-198554-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 10:33:28,865 WARNING [train.py:1204] (2/4) Exclude cut with ID _4866_210_2577_1_1527404412284_4364659_352-157930-0 from training. Duration: 0.81 2023-10-02 10:33:31,543 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9231W0113-610139 from training. Duration: 0.896 2023-10-02 10:33:32,879 WARNING [train.py:1204] (2/4) Exclude cut with ID _24271_210_12658_1_1531987270022_7190519_638-171906-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 10:33:32,907 WARNING [train.py:1204] (2/4) Exclude cut with ID _14170_210_5219_1_1530525949868_4469650_272-249985-0 from training. Duration: 0.67 2023-10-02 10:33:32,916 WARNING [train.py:1204] (2/4) Exclude cut with ID _64086_210_7994_1_1535270417019_7435949_449-31567-0 from training. Duration: 0.97 2023-10-02 10:33:33,600 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.3.encoder.layers.1.feed_forward3.out_whiten, num_groups=1, num_channels=512, metric=10.19 vs. limit=15.0 2023-10-02 10:33:36,860 WARNING [train.py:1204] (2/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_109-282568-0_sp1.1 from training. Duration: 0.97275 2023-10-02 10:33:39,028 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.ff3_skip_rate, batch_count=848720.0, ans=0.0 2023-10-02 10:33:45,717 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_121-45934-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 10:33:46,936 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.615e+02 1.871e+02 2.071e+02 2.316e+02 3.219e+02, threshold=4.141e+02, percent-clipped=0.0 2023-10-02 10:33:48,965 WARNING [train.py:1204] (2/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_314-126819-0 from training. Duration: 0.5 2023-10-02 10:33:49,006 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9309W0192-640319 from training. Duration: 0.512 2023-10-02 10:33:49,013 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9310W0003-640649 from training. Duration: 0.896 2023-10-02 10:33:51,696 WARNING [train.py:1204] (2/4) Exclude cut with ID _7160_210_1876_1_1527940923231_3703799_120-150943-0 from training. Duration: 0.81 2023-10-02 10:33:51,698 WARNING [train.py:1204] (2/4) Exclude cut with ID _40380_210_9059_1_1533380594759_4187380_6-33508-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 10:33:54,584 WARNING [train.py:1204] (2/4) Exclude cut with ID _59202_210_26984_1_1534816810612_3549174_137-318710-0 from training. Duration: 0.89 2023-10-02 10:33:57,960 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_435-313305-0 from training. Duration: 0.71 2023-10-02 10:33:58,483 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.5.encoder.layers.1.whiten, num_groups=1, num_channels=256, metric=2.01 vs. limit=12.0 2023-10-02 10:33:59,365 WARNING [train.py:1204] (2/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_91-212486-0_sp1.1 from training. Duration: 0.6181875 2023-10-02 10:34:00,718 WARNING [train.py:1204] (2/4) Exclude cut with ID _14603_210_4220_1_1530615688441_3097319_133-158308-0_sp1.1 from training. Duration: 0.763625 2023-10-02 10:34:03,901 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_99-251314-0 from training. Duration: 0.87 2023-10-02 10:34:05,268 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_365-217119-0_sp1.1 from training. Duration: 0.57275 2023-10-02 10:34:05,302 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_3475_210_2028_1_1526193578_3622569_377-166133-0 from training. Duration: 0.86 2023-10-02 10:34:09,206 INFO [train.py:1046] (2/4) Epoch 24, batch 5150, loss[loss=0.1629, simple_loss=0.2447, pruned_loss=0.04058, over 24647.00 frames. ], tot_loss[loss=0.1713, simple_loss=0.249, pruned_loss=0.04678, over 4731497.01 frames. ], batch size: 68, lr: 4.23e-03, grad_scale: 4.0 2023-10-02 10:34:11,383 WARNING [train.py:1204] (2/4) Exclude cut with ID _13916_210_9994_1_1530788341126_3763149_116-21034-0_sp1.1 from training. Duration: 0.87275 2023-10-02 10:34:11,393 WARNING [train.py:1204] (2/4) Exclude cut with ID _64220_210_9527_1_1535250151772_4875259_142-216831-0_sp1.1 from training. Duration: 0.936375 2023-10-02 10:34:11,394 WARNING [train.py:1204] (2/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_490-207861-0_sp1.1 from training. Duration: 0.8909375 2023-10-02 10:34:11,456 WARNING [train.py:1204] (2/4) Exclude cut with ID _41314_210_12651_1_1533459476543_3766460_87-272068-0_sp0.9 from training. Duration: 0.92225 2023-10-02 10:34:12,826 WARNING [train.py:1204] (2/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_18-46756-0_sp1.1 from training. Duration: 0.7181875 2023-10-02 10:34:12,893 WARNING [train.py:1204] (2/4) Exclude cut with ID _39411_210_5986_1_1533538825030_3343649_98-311335-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 10:34:13,067 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.feed_forward3.hidden_balancer.prob, batch_count=848853.3333333334, ans=0.125 2023-10-02 10:34:14,282 WARNING [train.py:1204] (2/4) Exclude cut with ID _21878_210_3525_1_1531738772002_7264330_113-18163-0 from training. Duration: 0.89 2023-10-02 10:34:14,284 WARNING [train.py:1204] (2/4) Exclude cut with ID _6653_210_2629_1_1527919172441_6732679_1103-56445-0 from training. Duration: 0.73 2023-10-02 10:34:14,328 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_3474_210_2028_1_1526188513_4825246_145-309131-0 from training. Duration: 0.74 2023-10-02 10:34:15,554 WARNING [train.py:1204] (2/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_120-211630-0_sp1.1 from training. Duration: 0.836375 2023-10-02 10:34:15,564 WARNING [train.py:1204] (2/4) Exclude cut with ID _8030_210_7497_1_1528444886781_3531089_32-31837-0 from training. Duration: 0.97 2023-10-02 10:34:18,189 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_19949_210_2161_1_1531706337_928079_8-160956-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 10:34:18,241 WARNING [train.py:1204] (2/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_321-215742-0_sp1.1 from training. Duration: 0.4545625 2023-10-02 10:34:18,690 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.5.encoder.layers.1.nonlin_attention.whiten2, num_groups=1, num_channels=256, metric=4.42 vs. limit=15.0 2023-10-02 10:34:20,209 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_583-124650-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 10:34:21,714 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_30434_210_13049_1_1532519384_981746_164-182755-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 10:34:25,129 WARNING [train.py:1204] (2/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_339-104791-0_sp1.1 from training. Duration: 0.4454375 2023-10-02 10:34:25,147 WARNING [train.py:1204] (2/4) Exclude cut with ID _15026_210_8750_1_1530777602316_3291850_211-195043-0 from training. Duration: 0.8 2023-10-02 10:34:25,890 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.3.encoder.layers.0.conv_module2.whiten, num_groups=1, num_channels=512, metric=3.27 vs. limit=15.0 2023-10-02 10:34:27,716 WARNING [train.py:1204] (2/4) Exclude cut with ID _29877_210_6228_1_1532397791106_5276149_487-182647-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 10:34:27,761 WARNING [train.py:1204] (2/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_127-566-0_sp0.9 from training. Duration: 0.9333125 2023-10-02 10:34:29,168 WARNING [train.py:1204] (2/4) Exclude cut with ID _8263_210_1664_1_1528439452891_970220_68-181805-0_sp0.9 from training. Duration: 0.711125 2023-10-02 10:34:29,179 WARNING [train.py:1204] (2/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_504-72513-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 10:34:29,197 WARNING [train.py:1204] (2/4) Exclude cut with ID _64639_210_5816_1_1535367619437_3528000_324-265233-0_sp1.1 from training. Duration: 0.963625 2023-10-02 10:34:30,505 WARNING [train.py:1204] (2/4) Exclude cut with ID _6456_210_1534_1_1527681634212_3181049_215-309359-0_sp0.9 from training. Duration: 0.97775 2023-10-02 10:34:30,509 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_115-233929-0_sp1.1 from training. Duration: 0.6545625 2023-10-02 10:34:30,556 WARNING [train.py:1204] (2/4) Exclude cut with ID _14087_210_2253_1_1530529224512_3014029_219-224445-0 from training. Duration: 0.84 2023-10-02 10:34:33,572 WARNING [train.py:1204] (2/4) Exclude cut with ID _14867_210_1968_1_1530684179890_3846969_413-29292-0_sp1.1 from training. Duration: 0.8181875 2023-10-02 10:34:34,879 WARNING [train.py:1204] (2/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_1012-279473-0_sp0.9 from training. Duration: 0.7444375 2023-10-02 10:34:36,287 WARNING [train.py:1204] (2/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_605-133395-0_sp0.9 from training. Duration: 0.7333125 2023-10-02 10:34:36,540 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=848920.0, ans=0.1 2023-10-02 10:34:38,912 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_89-35748-0 from training. Duration: 0.79 2023-10-02 10:34:40,375 WARNING [train.py:1204] (2/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_476-272757-0_sp1.1 from training. Duration: 0.8454375 2023-10-02 10:34:44,624 WARNING [train.py:1204] (2/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_449-15508-0_sp0.9 from training. Duration: 0.911125 2023-10-02 10:34:47,667 WARNING [train.py:1204] (2/4) Exclude cut with ID _29345_210_4381_1_1532689168263_3860169_198-174328-0 from training. Duration: 0.91 2023-10-02 10:34:51,065 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_15517_210_6753_1_1530952293_1664795_125-158157-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 10:34:54,066 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.nonlin_attention.balancer.prob, batch_count=849053.3333333334, ans=0.125 2023-10-02 10:34:54,640 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.3.encoder.layers.0.conv_module1.whiten, num_groups=1, num_channels=512, metric=3.66 vs. limit=15.0 2023-10-02 10:34:57,069 WARNING [train.py:1204] (2/4) Exclude cut with ID _25565_210_12392_1_1532423009382_3952098_420-113180-0_sp1.1 from training. Duration: 0.963625 2023-10-02 10:34:58,412 WARNING [train.py:1204] (2/4) Exclude cut with ID _51471_210_1889_1_1534399391390_3094250_236-146773-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 10:35:01,317 WARNING [train.py:1204] (2/4) Exclude cut with ID _30039_210_13270_1_1532484612900_2725150_175-35012-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 10:35:02,626 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_23850_210_4238_1_1532232597_1632433_99-275843-0_sp1.1 from training. Duration: 0.936375 2023-10-02 10:35:02,892 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.ff2_skip_rate, batch_count=849053.3333333334, ans=0.0 2023-10-02 10:35:04,670 WARNING [train.py:1204] (2/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_658-282610-0 from training. Duration: 0.53 2023-10-02 10:35:07,410 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_37818_210_4238_1_1533620910_4720083_234-65934-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 10:35:07,473 WARNING [train.py:1204] (2/4) Exclude cut with ID _3067_210_2667_1_1526212974640_4503168_459-173867-0_sp1.1 from training. Duration: 0.8 2023-10-02 10:35:07,491 WARNING [train.py:1204] (2/4) Exclude cut with ID _8696_210_1876_1_1528545632440_3345058_209-54069-0_sp0.9 from training. Duration: 0.7444375 2023-10-02 10:35:10,244 WARNING [train.py:1204] (2/4) Exclude cut with ID _62574_210_2655_1_1535180407058_3933249_420-236465-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 10:35:10,404 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=849120.0, ans=0.1 2023-10-02 10:35:11,619 WARNING [train.py:1204] (2/4) Exclude cut with ID _46681_210_9642_1_1533893005571_7603359_333-4042-0_sp1.1 from training. Duration: 0.936375 2023-10-02 10:35:12,941 WARNING [train.py:1204] (2/4) Exclude cut with ID _3541_210_2477_1_1526475408709_3263973_377-296839-0 from training. Duration: 0.55 2023-10-02 10:35:17,575 WARNING [train.py:1204] (2/4) Exclude cut with ID _43997_210_4525_1_1533538632555_5189329_426-92599-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 10:35:19,393 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_43-71989-0_sp1.1 from training. Duration: 0.5181875 2023-10-02 10:35:20,791 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_2009_210_2071_1_1525481134_4588937_140-345530-0_sp1.1 from training. Duration: 0.963625 2023-10-02 10:35:20,799 WARNING [train.py:1204] (2/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_138-332773-0_sp0.9 from training. Duration: 0.9 2023-10-02 10:35:20,967 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.conv_module1.balancer2.prob, batch_count=849120.0, ans=0.125 2023-10-02 10:35:22,145 WARNING [train.py:1204] (2/4) Exclude cut with ID _13942_210_9988_1_1530604906036_3733360_499-55676-0_sp0.9 from training. Duration: 0.688875 2023-10-02 10:35:22,174 WARNING [train.py:1204] (2/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_259-34038-0_sp1.1 from training. Duration: 0.67275 2023-10-02 10:35:23,936 INFO [train.py:1046] (2/4) Epoch 24, batch 5200, loss[loss=0.1815, simple_loss=0.254, pruned_loss=0.05451, over 23501.00 frames. ], tot_loss[loss=0.1724, simple_loss=0.2499, pruned_loss=0.04742, over 4728023.19 frames. ], batch size: 120, lr: 4.23e-03, grad_scale: 8.0 2023-10-02 10:35:23,974 WARNING [train.py:1204] (2/4) Exclude cut with ID _3120_210_3525_1_1526036143112_4072980_404-249994-0_sp1.1 from training. Duration: 0.9 2023-10-02 10:35:24,008 WARNING [train.py:1204] (2/4) Exclude cut with ID _40402_210_18219_1_1533522324115_2953150_222-108810-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 10:35:28,030 WARNING [train.py:1204] (2/4) Exclude cut with ID _22083_210_8254_1_1531702862439_3678970_49-234546-0_sp0.9 from training. Duration: 0.92225 2023-10-02 10:35:29,450 WARNING [train.py:1204] (2/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_199-295611-0_sp0.9 from training. Duration: 0.611125 2023-10-02 10:35:30,369 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.1.encoder.layers.0.nonlin_attention.whiten2, num_groups=1, num_channels=256, metric=5.83 vs. limit=15.0 2023-10-02 10:35:30,886 WARNING [train.py:1204] (2/4) Exclude cut with ID _43552_210_8913_1_1533726031059_3523429_43-192945-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 10:35:31,162 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.out_combiner.scale_min, batch_count=849186.6666666666, ans=0.2 2023-10-02 10:35:31,220 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.conv_skip_rate, batch_count=849186.6666666666, ans=0.0 2023-10-02 10:35:33,265 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=849186.6666666666, ans=0.1 2023-10-02 10:35:34,398 WARNING [train.py:1204] (2/4) Exclude cut with ID _14224_210_4381_1_1530529062037_4264010_364-22817-0 from training. Duration: 0.71 2023-10-02 10:35:34,484 WARNING [train.py:1204] (2/4) Exclude cut with ID _14922_210_6228_1_1530707435273_3693310_388-129552-0_sp1.1 from training. Duration: 0.87275 2023-10-02 10:35:37,085 WARNING [train.py:1204] (2/4) Exclude cut with ID _6537_210_1883_1_1527987559710_3597129_190-280227-0_sp1.1 from training. Duration: 0.97275 2023-10-02 10:35:38,559 WARNING [train.py:1204] (2/4) Exclude cut with ID _55844_210_2346_1_1534471212569_7625707_398-56897-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 10:35:39,931 WARNING [train.py:1204] (2/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_48-182155-0_sp1.1 from training. Duration: 0.7909375 2023-10-02 10:35:39,950 WARNING [train.py:1204] (2/4) Exclude cut with ID _29529_210_7234_1_1532393961455_3977460_582-18084-0_sp1.1 from training. Duration: 0.97275 2023-10-02 10:35:40,151 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.conv_module1.balancer1.prob, batch_count=849253.3333333334, ans=0.125 2023-10-02 10:35:41,355 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.0.balancer2.prob, batch_count=849253.3333333334, ans=0.125 2023-10-02 10:35:42,556 WARNING [train.py:1204] (2/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_912-282412-0 from training. Duration: 0.49 2023-10-02 10:35:45,262 WARNING [train.py:1204] (2/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_332-256168-0_sp0.9 from training. Duration: 0.8333125 2023-10-02 10:35:45,303 WARNING [train.py:1204] (2/4) Exclude cut with ID _64220_210_9527_1_1535250151772_4875259_55-250624-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 10:35:47,270 WARNING [train.py:1204] (2/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_264-108685-0 from training. Duration: 0.55 2023-10-02 10:35:50,013 WARNING [train.py:1204] (2/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_613-232856-0_sp0.9 from training. Duration: 0.6 2023-10-02 10:35:50,097 WARNING [train.py:1204] (2/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_68-212801-0_sp1.1 from training. Duration: 0.663625 2023-10-02 10:35:50,344 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.bypass.skip_rate, batch_count=849253.3333333334, ans=0.04949747468305833 2023-10-02 10:35:51,940 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6290_210_3273_1_1527595224_1282787_115-87095-0 from training. Duration: 0.55 2023-10-02 10:35:53,258 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_3080_210_2071_1_1526086102_4365166_312-8726-0 from training. Duration: 0.91 2023-10-02 10:35:56,715 WARNING [train.py:1204] (2/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_105-63309-0 from training. Duration: 0.98 2023-10-02 10:35:58,075 WARNING [train.py:1204] (2/4) Exclude cut with ID _2941_210_2577_1_1526199673150_2489759_201-160779-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 10:35:58,079 WARNING [train.py:1204] (2/4) Exclude cut with ID ID0406W0226-885434 from training. Duration: 0.997 2023-10-02 10:35:58,092 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_17292_210_6753_1_1531125388_2943734_353-328201-0_sp1.1 from training. Duration: 0.97275 2023-10-02 10:35:59,458 WARNING [train.py:1204] (2/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_572-96029-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 10:35:59,494 WARNING [train.py:1204] (2/4) Exclude cut with ID _14970_210_15709_1_1530775547361_1966965_6-23328-0_sp0.9 from training. Duration: 0.9555625 2023-10-02 10:35:59,539 WARNING [train.py:1204] (2/4) Exclude cut with ID _45012_210_12651_1_1533608435136_5371380_240-37101-0 from training. Duration: 0.93 2023-10-02 10:35:59,592 WARNING [train.py:1204] (2/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_685-90183-0_sp1.1 from training. Duration: 0.936375 2023-10-02 10:36:02,288 WARNING [train.py:1204] (2/4) Exclude cut with ID _13252_210_1925_1_1530671372047_3336909_41-22245-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 10:36:02,507 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.ff2_skip_rate, batch_count=849320.0, ans=0.0 2023-10-02 10:36:05,540 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_15126_210_6753_1_1530778677_2591185_120-277707-0 from training. Duration: 0.8 2023-10-02 10:36:05,576 WARNING [train.py:1204] (2/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_310-26186-0 from training. Duration: 0.7 2023-10-02 10:36:05,592 WARNING [train.py:1204] (2/4) Exclude cut with ID _14998_210_5219_1_1530705582080_7474970_787-294416-0 from training. Duration: 0.82 2023-10-02 10:36:09,711 WARNING [train.py:1204] (2/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_795-33252-0 from training. Duration: 0.37 2023-10-02 10:36:09,831 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.0.feed_forward1.out_proj.dropout_p, batch_count=849386.6666666666, ans=0.1 2023-10-02 10:36:10,987 WARNING [train.py:1204] (2/4) Exclude cut with ID _7537_210_2084_1_1528502395431_3924359_113-199933-0_sp0.9 from training. Duration: 0.6555625 2023-10-02 10:36:11,166 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.0.conv_module2.balancer2.prob, batch_count=849386.6666666666, ans=0.125 2023-10-02 10:36:16,806 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.497e+02 1.871e+02 2.065e+02 2.333e+02 3.353e+02, threshold=4.130e+02, percent-clipped=0.0 2023-10-02 10:36:16,920 WARNING [train.py:1204] (2/4) Exclude cut with ID _3465_210_3525_1_1526175667060_3574049_308-231735-0_sp0.9 from training. Duration: 0.811125 2023-10-02 10:36:16,942 WARNING [train.py:1204] (2/4) Exclude cut with ID _19504_210_9994_1_1531648863242_3325220_354-296765-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 10:36:18,318 WARNING [train.py:1204] (2/4) Exclude cut with ID _5409_210_1388_1_1527292850189_5865191_178-127970-0 from training. Duration: 0.57 2023-10-02 10:36:19,765 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_1466-248056-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 10:36:19,791 WARNING [train.py:1204] (2/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_535-14410-0_sp0.9 from training. Duration: 0.52225 2023-10-02 10:36:19,794 WARNING [train.py:1204] (2/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_747-129941-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 10:36:19,827 WARNING [train.py:1204] (2/4) Exclude cut with ID _15012_210_1968_1_1530856820429_5095350_688-178849-0_sp1.1 from training. Duration: 0.5818125 2023-10-02 10:36:24,460 WARNING [train.py:1204] (2/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_356-45054-0_sp1.1 from training. Duration: 0.7818125 2023-10-02 10:36:24,542 WARNING [train.py:1204] (2/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_146-276944-0_sp0.9 from training. Duration: 0.92225 2023-10-02 10:36:27,912 WARNING [train.py:1204] (2/4) Exclude cut with ID _39204_210_4220_1_1533880547368_3191120_346-180306-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 10:36:28,011 WARNING [train.py:1204] (2/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_641-158017-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 10:36:28,011 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_19952_210_2161_1_1531875469_3851750_274-42137-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 10:36:33,411 WARNING [train.py:1204] (2/4) Exclude cut with ID _60804_210_9642_1_1534831199859_7268299_87-327819-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 10:36:35,179 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_9302_210_2550_1_1528889962_4655416_638-301620-0 from training. Duration: 0.47 2023-10-02 10:36:35,249 WARNING [train.py:1204] (2/4) Exclude cut with ID _44304_210_15374_1_1533639352765_4613129_302-22084-0_sp1.1 from training. Duration: 0.7818125 2023-10-02 10:36:36,502 WARNING [train.py:1204] (2/4) Exclude cut with ID _18513_210_9206_1_1531618215223_3862620_177-227063-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 10:36:36,607 WARNING [train.py:1204] (2/4) Exclude cut with ID _41326_210_5854_1_1533778076509_3492080_510-45444-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 10:36:37,887 INFO [train.py:1046] (2/4) Epoch 24, batch 5250, loss[loss=0.1776, simple_loss=0.2632, pruned_loss=0.04604, over 24447.00 frames. ], tot_loss[loss=0.1715, simple_loss=0.2485, pruned_loss=0.04724, over 4727283.48 frames. ], batch size: 69, lr: 4.23e-03, grad_scale: 8.0 2023-10-02 10:36:37,943 WARNING [train.py:1204] (2/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_76-311963-0_sp1.1 from training. Duration: 0.636375 2023-10-02 10:36:39,385 WARNING [train.py:1204] (2/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_156-302675-0_sp0.9 from training. Duration: 0.611125 2023-10-02 10:36:40,819 WARNING [train.py:1204] (2/4) Exclude cut with ID _53489_210_6228_1_1534298357169_3835970_367-148411-0_sp1.1 from training. Duration: 0.936375 2023-10-02 10:36:42,265 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder_embed.convnext.hidden_balancer.prob, batch_count=849520.0, ans=0.125 2023-10-02 10:36:43,510 WARNING [train.py:1204] (2/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_389-144525-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 10:36:44,845 WARNING [train.py:1204] (2/4) Exclude cut with ID _14069_210_5632_1_1530512692033_4803390_158-238427-0_sp1.1 from training. Duration: 0.963625 2023-10-02 10:36:46,855 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_486-151131-0_sp0.9 from training. Duration: 0.9444375 2023-10-02 10:36:53,085 WARNING [train.py:1204] (2/4) Exclude cut with ID _39846_210_12651_1_1533603651992_1318030_188-71831-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 10:36:54,506 WARNING [train.py:1204] (2/4) Exclude cut with ID _39411_210_5986_1_1533538825030_3343649_173-238248-0_sp1.1 from training. Duration: 0.7545625 2023-10-02 10:36:54,782 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.feed_forward3.hidden_balancer.prob, batch_count=849586.6666666666, ans=0.125 2023-10-02 10:36:55,952 WARNING [train.py:1204] (2/4) Exclude cut with ID _20684_210_6917_1_1531567751646_4203930_469-49081-0_sp1.1 from training. Duration: 0.92725 2023-10-02 10:36:57,901 WARNING [train.py:1204] (2/4) Exclude cut with ID _8833_210_5219_1_1528592665210_7553059_611-80313-0_sp1.1 from training. Duration: 0.5818125 2023-10-02 10:36:59,367 WARNING [train.py:1204] (2/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_440-141811-0 from training. Duration: 0.95 2023-10-02 10:37:00,769 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_41211_210_2544_1_1533348859_3702910_236-223652-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 10:37:02,144 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_40222_210_6228_1_1533385298_4481939_220-163998-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 10:37:10,573 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.feed_forward1.out_proj.dropout_p, batch_count=849653.3333333334, ans=0.1 2023-10-02 10:37:27,053 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.feed_forward1.hidden_balancer.prob, batch_count=849720.0, ans=0.125 2023-10-02 10:37:46,412 INFO [train.py:1046] (2/4) Epoch 24, batch 5300, loss[loss=0.1656, simple_loss=0.2504, pruned_loss=0.04041, over 24464.00 frames. ], tot_loss[loss=0.1703, simple_loss=0.2465, pruned_loss=0.04701, over 4709490.24 frames. ], batch size: 69, lr: 4.23e-03, grad_scale: 8.0 2023-10-02 10:37:58,573 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.3.encoder.layers.3.nonlin_attention.whiten2, num_groups=1, num_channels=512, metric=11.88 vs. limit=15.0 2023-10-02 10:38:01,354 WARNING [train.py:1204] (2/4) Exclude cut with ID _14629_210_15709_1_1530604298182_956899_6-232695-0_sp1.1 from training. Duration: 0.8545625 2023-10-02 10:38:01,374 WARNING [train.py:1204] (2/4) Exclude cut with ID _13921_210_9994_1_1530841782272_3503520_327-69677-0 from training. Duration: 0.9 2023-10-02 10:38:01,398 WARNING [train.py:1204] (2/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_372-298093-0 from training. Duration: 0.8 2023-10-02 10:38:01,411 WARNING [train.py:1204] (2/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_515-139111-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 10:38:01,971 WARNING [train.py:1204] (2/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_85-65056-0_sp1.1 from training. Duration: 0.87275 2023-10-02 10:38:02,015 WARNING [train.py:1204] (2/4) Exclude cut with ID _15113_210_8254_1_1530842438579_3716279_209-54533-0_sp1.1 from training. Duration: 0.87275 2023-10-02 10:38:02,061 WARNING [train.py:1204] (2/4) Exclude cut with ID _70447_210_12392_1_1535972499300_3160420_123-312838-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 10:38:02,088 WARNING [train.py:1204] (2/4) Exclude cut with ID _60113_210_16271_1_1535164212481_4386079_70-63596-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 10:38:02,116 WARNING [train.py:1204] (2/4) Exclude cut with ID _14632_210_9988_1_1530691279392_3923200_225-188509-0_sp1.1 from training. Duration: 0.963625 2023-10-02 10:38:02,165 WARNING [train.py:1204] (2/4) Exclude cut with ID _34734_210_4525_1_1533365977542_4448720_584-180532-0_sp1.1 from training. Duration: 0.97275 2023-10-02 10:38:02,180 WARNING [train.py:1204] (2/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_351-294650-0_sp1.1 from training. Duration: 0.57275 2023-10-02 10:38:02,516 WARNING [train.py:1204] (2/4) Exclude cut with ID _13252_210_1925_1_1530671372047_3336909_263-168388-0_sp0.9 from training. Duration: 0.9555625 2023-10-02 10:38:02,543 WARNING [train.py:1204] (2/4) Exclude cut with ID _22083_210_8254_1_1531702862439_3678970_201-85738-0 from training. Duration: 0.9 2023-10-02 10:38:02,632 WARNING [train.py:1204] (2/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_237-58433-0 from training. Duration: 0.74 2023-10-02 10:38:02,658 WARNING [train.py:1204] (2/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_262-250826-0 from training. Duration: 0.99 2023-10-02 10:38:02,755 WARNING [train.py:1204] (2/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_927-43827-0_sp0.9 from training. Duration: 0.82225 2023-10-02 10:38:02,784 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_2821_210_2715_1_1526108385_3763560_73-296673-0 from training. Duration: 0.83 2023-10-02 10:38:02,856 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6287_210_3273_1_1527509506_2653716_182-199887-0 from training. Duration: 0.51 2023-10-02 10:38:02,982 WARNING [train.py:1204] (2/4) Exclude cut with ID _70519_210_4163_1_1535961595994_7458230_899-80223-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 10:38:03,224 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_210-234949-0_sp1.1 from training. Duration: 0.97275 2023-10-02 10:38:03,260 WARNING [train.py:1204] (2/4) Exclude cut with ID _30196_210_1664_1_1533708496522_7074140_768-278176-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 10:38:03,339 WARNING [train.py:1204] (2/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_315-128283-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 10:38:03,419 WARNING [train.py:1204] (2/4) Exclude cut with ID _14886_210_4220_1_1530702020355_3516190_330-323941-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 10:38:04,052 WARNING [train.py:1204] (2/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_264-348896-0_sp1.1 from training. Duration: 0.77275 2023-10-02 10:38:04,073 WARNING [train.py:1204] (2/4) Exclude cut with ID _37086_210_9038_1_1532999540891_7021250_433-10563-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 10:38:04,112 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_53638_210_21581_1_1534297319_5808449_885-80172-0_sp1.1 from training. Duration: 0.87275 2023-10-02 10:38:04,202 WARNING [train.py:1204] (2/4) Exclude cut with ID _14623_210_1925_1_1530773354962_3632609_157-218771-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 10:38:04,212 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_1368-78165-0_sp1.1 from training. Duration: 0.97275 2023-10-02 10:38:04,216 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_309-65177-0_sp0.9 from training. Duration: 0.97775 2023-10-02 10:38:04,222 WARNING [train.py:1204] (2/4) Exclude cut with ID _21632_210_2253_1_1531994001765_3744140_291-138812-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 10:38:04,234 WARNING [train.py:1204] (2/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_669-93163-0_sp1.1 from training. Duration: 0.8909375 2023-10-02 10:38:04,780 WARNING [train.py:1204] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_292-215346-0 from training. Duration: 0.97 2023-10-02 10:38:04,804 WARNING [train.py:1204] (2/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_286-222073-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 10:38:05,081 WARNING [train.py:1204] (2/4) Exclude cut with ID _59256_210_5986_1_1535076085997_3589120_121-237386-0_sp1.1 from training. Duration: 0.87275 2023-10-02 10:38:05,094 WARNING [train.py:1204] (2/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_96-320736-0 from training. Duration: 0.86 2023-10-02 10:38:05,103 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_128-202104-0 from training. Duration: 0.63 2023-10-02 10:38:05,183 WARNING [train.py:1204] (2/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_242-3472-0_sp0.9 from training. Duration: 0.811125 2023-10-02 10:38:05,224 WARNING [train.py:1204] (2/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_154-74182-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 10:38:05,228 WARNING [train.py:1204] (2/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_187-221566-0 from training. Duration: 0.43 2023-10-02 10:38:05,382 WARNING [train.py:1204] (2/4) Exclude cut with ID _6194_210_1534_1_1527511826160_3941194_14-282876-0 from training. Duration: 0.71 2023-10-02 10:38:05,421 WARNING [train.py:1204] (2/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_660-211088-0_sp1.1 from training. Duration: 0.563625 2023-10-02 10:38:05,744 WARNING [train.py:1204] (2/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_526-62079-0_sp1.1 from training. Duration: 0.6090625 2023-10-02 10:38:06,363 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_92-246270-0_sp1.1 from training. Duration: 0.77275 2023-10-02 10:38:06,451 WARNING [train.py:1204] (2/4) Exclude cut with ID IC0287W0023-142350 from training. Duration: 0.956 2023-10-02 10:38:06,527 WARNING [train.py:1204] (2/4) Exclude cut with ID _58605_210_12210_1_1534723195939_3904259_48-269402-0 from training. Duration: 0.96 2023-10-02 10:38:06,541 WARNING [train.py:1204] (2/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_416-257730-0_sp0.9 from training. Duration: 0.72225 2023-10-02 10:38:06,614 WARNING [train.py:1204] (2/4) Exclude cut with ID _7877_210_6014_1_1528286358099_3750136_185-17516-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 10:38:06,712 WARNING [train.py:1204] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_23-189234-0 from training. Duration: 0.83 2023-10-02 10:38:06,728 WARNING [train.py:1204] (2/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_237-89684-0 from training. Duration: 0.96 2023-10-02 10:38:06,794 WARNING [train.py:1204] (2/4) Exclude cut with ID _13916_210_9994_1_1530788341126_3763149_116-21034-0 from training. Duration: 0.96 2023-10-02 10:38:06,952 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_3133_210_2087_1_1526106446_2324690_246-125000-0_sp1.1 from training. Duration: 0.563625 2023-10-02 10:38:13,255 INFO [train.py:1046] (2/4) Epoch 25, batch 0, loss[loss=0.1469, simple_loss=0.2274, pruned_loss=0.03317, over 24619.00 frames. ], tot_loss[loss=0.1469, simple_loss=0.2274, pruned_loss=0.03317, over 24619.00 frames. ], batch size: 60, lr: 4.14e-03, grad_scale: 16.0 2023-10-02 10:38:13,255 INFO [train.py:1069] (2/4) Computing validation loss 2023-10-02 10:38:25,659 INFO [train.py:1078] (2/4) Epoch 25, validation: loss=0.3293, simple_loss=0.2723, pruned_loss=0.1931, over 1125622.00 frames. 2023-10-02 10:38:25,660 INFO [train.py:1079] (2/4) Maximum memory allocated so far is 20967MB 2023-10-02 10:38:29,047 WARNING [train.py:1204] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_402-253027-0 from training. Duration: 0.54 2023-10-02 10:38:29,106 WARNING [train.py:1204] (2/4) Exclude cut with ID _59288_210_5854_1_1535077757774_3412246_158-289345-0_sp1.1 from training. Duration: 0.92725 2023-10-02 10:38:30,479 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_28211_210_6388_1_1532861506_4523224_128-328162-0_sp1.1 from training. Duration: 0.8181875 2023-10-02 10:38:35,042 INFO [scaling.py:1118] (2/4) WithLoss: name=encoder.encoders.3.encoder.layers.3.self_attn_weights, loss-sum=0.000e+00 2023-10-02 10:38:36,104 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_788-278281-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 10:38:36,120 WARNING [train.py:1204] (2/4) Exclude cut with ID _49728_210_12651_1_1534041034294_6788150_624-195301-0_sp0.9 from training. Duration: 0.9444375 2023-10-02 10:38:36,151 WARNING [train.py:1204] (2/4) Exclude cut with ID _14860_210_4915_1_1530702020340_7343240_267-139775-0_sp1.1 from training. Duration: 0.963625 2023-10-02 10:38:37,570 WARNING [train.py:1204] (2/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_332-221057-0 from training. Duration: 0.68 2023-10-02 10:38:38,840 WARNING [train.py:1204] (2/4) Exclude cut with ID _40402_210_18219_1_1533522324115_2953150_242-76943-0 from training. Duration: 0.94 2023-10-02 10:38:40,376 WARNING [train.py:1204] (2/4) Exclude cut with ID _16747_210_5986_1_1531811544733_2749109_87-349587-0_sp1.1 from training. Duration: 0.963625 2023-10-02 10:38:41,616 WARNING [train.py:1204] (2/4) Exclude cut with ID _22982_210_1883_1_1531882790447_5449409_505-346452-0_sp1.1 from training. Duration: 0.963625 2023-10-02 10:38:44,399 WARNING [train.py:1204] (2/4) Exclude cut with ID _17956_210_4353_1_1531209568125_6979970_13-192107-0_sp1.1 from training. Duration: 0.963625 2023-10-02 10:38:45,800 WARNING [train.py:1204] (2/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_430-307666-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 10:38:45,853 WARNING [train.py:1204] (2/4) Exclude cut with ID _2895_210_2477_1_1525956785794_4063750_289-11475-0_sp1.1 from training. Duration: 0.7454375 2023-10-02 10:38:47,039 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_3910_210_1794_1_1526742306_334530_28-238909-0_sp0.9 from training. Duration: 0.9 2023-10-02 10:38:48,591 WARNING [train.py:1204] (2/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_18-130336-0 from training. Duration: 0.79 2023-10-02 10:38:49,950 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_548-294316-0_sp0.9 from training. Duration: 0.9 2023-10-02 10:38:59,700 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.581e+02 1.885e+02 2.204e+02 2.603e+02 4.904e+02, threshold=4.408e+02, percent-clipped=3.0 2023-10-02 10:38:59,790 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_42-142473-0_sp1.1 from training. Duration: 0.6545625 2023-10-02 10:38:59,796 WARNING [train.py:1204] (2/4) Exclude cut with ID _59170_210_6721_1_1534983633960_4203335_326-48066-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 10:39:01,218 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_45886_210_17024_1_1533709215_471063_7-79165-0 from training. Duration: 0.98 2023-10-02 10:39:05,890 WARNING [train.py:1204] (2/4) Exclude cut with ID _44585_210_18628_1_1533605350387_4518199_280-73534-0_sp0.9 from training. Duration: 0.988875 2023-10-02 10:39:05,903 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_3475_210_2028_1_1526193578_3622569_265-276625-0_sp1.1 from training. Duration: 0.6909375 2023-10-02 10:39:08,608 WARNING [train.py:1204] (2/4) Exclude cut with ID _64631_210_27767_1_1535349580068_4165160_88-225232-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 10:39:11,367 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_205-265518-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 10:39:15,579 WARNING [train.py:1204] (2/4) Exclude cut with ID _40402_210_18219_1_1533522324115_2953150_271-253734-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 10:39:17,209 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.conv_module2.balancer1.prob, batch_count=850140.0, ans=0.125 2023-10-02 10:39:21,063 WARNING [train.py:1204] (2/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_282-257791-0 from training. Duration: 0.86 2023-10-02 10:39:22,557 WARNING [train.py:1204] (2/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_356-45054-0 from training. Duration: 0.86 2023-10-02 10:39:22,589 WARNING [train.py:1204] (2/4) Exclude cut with ID _69584_210_5219_1_1535785133775_4044450_38-348116-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 10:39:22,597 WARNING [train.py:1204] (2/4) Exclude cut with ID _66763_210_17301_1_1535788667617_241750_21-328480-0_sp1.1 from training. Duration: 0.936375 2023-10-02 10:39:22,776 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.ff2_skip_rate, batch_count=850206.6666666666, ans=0.0 2023-10-02 10:39:23,880 WARNING [train.py:1204] (2/4) Exclude cut with ID _26528_210_9070_1_1532570410842_3851310_497-97218-0_sp0.9 from training. Duration: 0.9555625 2023-10-02 10:39:23,930 WARNING [train.py:1204] (2/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_592-101075-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 10:39:25,432 WARNING [train.py:1204] (2/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_678-159307-0 from training. Duration: 0.85 2023-10-02 10:39:27,383 WARNING [train.py:1204] (2/4) Exclude cut with ID _28427_210_4220_1_1532581240967_3061069_166-319773-0_sp1.1 from training. Duration: 0.936375 2023-10-02 10:39:29,235 WARNING [train.py:1204] (2/4) Exclude cut with ID _29877_210_6228_1_1532397791106_5276149_606-224984-0_sp1.1 from training. Duration: 0.936375 2023-10-02 10:39:34,827 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_125-203105-0_sp1.1 from training. Duration: 0.72725 2023-10-02 10:39:37,612 WARNING [train.py:1204] (2/4) Exclude cut with ID IC0620W0113-306765 from training. Duration: 0.768 2023-10-02 10:39:38,880 INFO [train.py:1046] (2/4) Epoch 25, batch 50, loss[loss=0.1575, simple_loss=0.2286, pruned_loss=0.04319, over 23485.00 frames. ], tot_loss[loss=0.1685, simple_loss=0.2482, pruned_loss=0.04437, over 1078238.12 frames. ], batch size: 134, lr: 4.14e-03, grad_scale: 16.0 2023-10-02 10:39:38,991 WARNING [train.py:1204] (2/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_410-287857-0_sp1.1 from training. Duration: 0.8454375 2023-10-02 10:39:40,996 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.3.encoder.layers.1.nonlin_attention.whiten2, num_groups=1, num_channels=512, metric=8.41 vs. limit=15.0 2023-10-02 10:39:43,122 WARNING [train.py:1204] (2/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_78-95260-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 10:39:44,545 WARNING [train.py:1204] (2/4) Exclude cut with ID _58059_210_5854_1_1534901471149_3164808_6-73080-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 10:39:44,558 WARNING [train.py:1204] (2/4) Exclude cut with ID _5166_210_2084_1_1527073197160_4087993_331-117829-0 from training. Duration: 0.62 2023-10-02 10:39:45,861 WARNING [train.py:1204] (2/4) Exclude cut with ID _14880_210_8895_1_1530680426655_3795259_289-212817-0_sp0.9 from training. Duration: 0.7666875 2023-10-02 10:39:47,210 WARNING [train.py:1204] (2/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_620-144779-0_sp1.1 from training. Duration: 0.92725 2023-10-02 10:39:48,569 WARNING [train.py:1204] (2/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_561-288575-0_sp1.1 from training. Duration: 0.963625 2023-10-02 10:39:48,790 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.conv_module1.balancer1.prob, batch_count=850273.3333333334, ans=0.125 2023-10-02 10:39:49,990 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_22657_210_6890_1_1532086002_1637216_122-188128-0_sp1.1 from training. Duration: 0.963625 2023-10-02 10:39:51,305 WARNING [train.py:1204] (2/4) Exclude cut with ID _16756_210_4915_1_1531047803763_7104259_604-86607-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 10:39:54,196 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.0.bypass_mid.scale_min, batch_count=850340.0, ans=0.2 2023-10-02 10:39:55,606 WARNING [train.py:1204] (2/4) Exclude cut with ID _15150_210_8341_1_1530752411803_7256384_469-239509-0 from training. Duration: 0.68 2023-10-02 10:39:55,614 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_10987_210_4163_1_1529760555_4123902_302-87926-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 10:40:03,473 WARNING [train.py:1204] (2/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_469-94673-0_sp0.9 from training. Duration: 0.87775 2023-10-02 10:40:05,294 WARNING [train.py:1204] (2/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_798-106179-0 from training. Duration: 0.83 2023-10-02 10:40:07,201 WARNING [train.py:1204] (2/4) Exclude cut with ID _16744_210_5986_1_1531724405384_3907567_242-111316-0 from training. Duration: 0.93 2023-10-02 10:40:08,667 WARNING [train.py:1204] (2/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_938-205135-0_sp0.9 from training. Duration: 0.9666875 2023-10-02 10:40:10,045 WARNING [train.py:1204] (2/4) Exclude cut with ID _64135_210_2253_1_1535677267876_7608972_133-188266-0_sp0.9 from training. Duration: 0.9 2023-10-02 10:40:10,050 WARNING [train.py:1204] (2/4) Exclude cut with ID _44585_210_18628_1_1533605350387_4518199_623-346820-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 10:40:11,342 WARNING [train.py:1204] (2/4) Exclude cut with ID _42326_210_7770_1_1533814282531_3707269_454-266903-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 10:40:12,703 WARNING [train.py:1204] (2/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_5-55893-0_sp1.1 from training. Duration: 0.5 2023-10-02 10:40:14,057 WARNING [train.py:1204] (2/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_577-332600-0_sp1.1 from training. Duration: 0.5181875 2023-10-02 10:40:14,058 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_957-138882-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 10:40:19,865 WARNING [train.py:1204] (2/4) Exclude cut with ID _12556_210_8341_1_1530081024100_7327690_219-38004-0_sp1.1 from training. Duration: 0.97275 2023-10-02 10:40:21,247 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_397-147196-0_sp1.1 from training. Duration: 0.72725 2023-10-02 10:40:21,265 WARNING [train.py:1204] (2/4) Exclude cut with ID _3541_210_2477_1_1526475408709_3263973_231-104736-0_sp0.9 from training. Duration: 0.8666875 2023-10-02 10:40:22,628 WARNING [train.py:1204] (2/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_398-334558-0 from training. Duration: 0.96 2023-10-02 10:40:25,312 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_257-20733-0_sp1.1 from training. Duration: 0.6909375 2023-10-02 10:40:26,754 WARNING [train.py:1204] (2/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_146-282400-0_sp1.1 from training. Duration: 0.6454375 2023-10-02 10:40:26,758 WARNING [train.py:1204] (2/4) Exclude cut with ID _51899_210_20034_1_1534129254466_3669220_265-215428-0 from training. Duration: 0.98 2023-10-02 10:40:26,830 WARNING [train.py:1204] (2/4) Exclude cut with ID _34423_210_8750_1_1533430798456_3330199_219-106541-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 10:40:29,570 WARNING [train.py:1204] (2/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_458-67321-0 from training. Duration: 0.97 2023-10-02 10:40:37,826 WARNING [train.py:1204] (2/4) Exclude cut with ID _6653_210_2629_1_1527919172441_6732679_1051-182606-0_sp1.1 from training. Duration: 0.936375 2023-10-02 10:40:37,855 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_53555_210_5632_1_1534595220_7232297_603-4396-0_sp1.1 from training. Duration: 0.97275 2023-10-02 10:40:39,152 WARNING [train.py:1204] (2/4) Exclude cut with ID _54218_210_12210_1_1534413646388_4409259_159-144367-0_sp1.1 from training. Duration: 0.963625 2023-10-02 10:40:40,576 WARNING [train.py:1204] (2/4) Exclude cut with ID _7606_210_4915_1_1528894253277_5080690_301-10732-0_sp0.9 from training. Duration: 0.9 2023-10-02 10:40:40,578 WARNING [train.py:1204] (2/4) Exclude cut with ID _15026_210_8750_1_1530777602316_3291850_211-195043-0_sp0.9 from training. Duration: 0.888875 2023-10-02 10:40:42,628 WARNING [train.py:1204] (2/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_146-282400-0 from training. Duration: 0.71 2023-10-02 10:40:44,009 WARNING [train.py:1204] (2/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_65-88242-0 from training. Duration: 0.56 2023-10-02 10:40:44,102 WARNING [train.py:1204] (2/4) Exclude cut with ID _55857_210_2346_1_1534644007831_6898209_80-107757-0_sp1.1 from training. Duration: 0.963625 2023-10-02 10:40:45,304 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_15126_210_6753_1_1530778677_2591185_120-277707-0_sp0.9 from training. Duration: 0.888875 2023-10-02 10:40:46,756 WARNING [train.py:1204] (2/4) Exclude cut with ID _44778_210_2054_1_1533798127203_7443090_1033-168593-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 10:40:47,914 WARNING [train.py:1204] (2/4) Exclude cut with ID _46683_210_9642_1_1533730913492_4591116_475-30711-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 10:40:47,936 WARNING [train.py:1204] (2/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_253-308377-0 from training. Duration: 0.49 2023-10-02 10:40:49,364 WARNING [train.py:1204] (2/4) Exclude cut with ID _4680_210_3525_1_1527249757953_893386_75-78979-0 from training. Duration: 0.91 2023-10-02 10:40:49,937 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.3.encoder.layers.3.nonlin_attention.whiten1, num_groups=1, num_channels=384, metric=5.75 vs. limit=10.0 2023-10-02 10:40:50,573 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_5522_210_3273_1_1527249606_3909096_318-203153-0_sp0.9 from training. Duration: 0.47775 2023-10-02 10:40:50,664 WARNING [train.py:1204] (2/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_550-170116-0_sp1.1 from training. Duration: 0.92725 2023-10-02 10:40:50,695 WARNING [train.py:1204] (2/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_277-210798-0_sp0.9 from training. Duration: 0.611125 2023-10-02 10:40:51,996 INFO [train.py:1046] (2/4) Epoch 25, batch 100, loss[loss=0.1815, simple_loss=0.2631, pruned_loss=0.04998, over 23949.00 frames. ], tot_loss[loss=0.168, simple_loss=0.2463, pruned_loss=0.04485, over 1878229.63 frames. ], batch size: 86, lr: 4.14e-03, grad_scale: 16.0 2023-10-02 10:40:52,094 WARNING [train.py:1204] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_620-276793-0 from training. Duration: 0.59 2023-10-02 10:40:52,111 WARNING [train.py:1204] (2/4) Exclude cut with ID _3067_210_2667_1_1526212974640_4503168_119-175776-0 from training. Duration: 0.74 2023-10-02 10:40:53,380 WARNING [train.py:1204] (2/4) Exclude cut with ID _55209_210_1531_1_1534675828996_4597160_681-195827-0_sp1.1 from training. Duration: 0.92725 2023-10-02 10:40:53,434 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_427-78097-0_sp1.1 from training. Duration: 0.72725 2023-10-02 10:40:54,931 WARNING [train.py:1204] (2/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_96-290039-0_sp0.9 from training. Duration: 0.62225 2023-10-02 10:40:54,939 WARNING [train.py:1204] (2/4) Exclude cut with ID _45329_210_7005_1_1534157951211_6774339_287-292174-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 10:40:57,573 WARNING [train.py:1204] (2/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_200-292228-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 10:41:00,712 WARNING [train.py:1204] (2/4) Exclude cut with ID _69884_210_9771_1_1535846392818_4166169_291-194999-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 10:41:04,052 WARNING [train.py:1204] (2/4) Exclude cut with ID _58895_210_8750_1_1535184081320_3649089_265-323810-0_sp1.1 from training. Duration: 0.87275 2023-10-02 10:41:05,497 WARNING [train.py:1204] (2/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_58-68956-0 from training. Duration: 0.83 2023-10-02 10:41:05,505 WARNING [train.py:1204] (2/4) Exclude cut with ID _39867_210_9826_1_1533553222030_9344259_322-115153-0_sp1.1 from training. Duration: 0.963625 2023-10-02 10:41:10,254 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_3371_210_1794_1_1526384462_5580777_396-339812-0_sp1.1 from training. Duration: 0.77275 2023-10-02 10:41:10,264 WARNING [train.py:1204] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_731-2428-0_sp0.9 from training. Duration: 0.92225 2023-10-02 10:41:11,432 WARNING [train.py:1204] (2/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_98-741-0_sp1.1 from training. Duration: 0.72725 2023-10-02 10:41:11,434 WARNING [train.py:1204] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_238-245655-0_sp0.9 from training. Duration: 0.9 2023-10-02 10:41:11,466 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6788_210_1388_1_1527898698_4562021_91-197981-0_sp0.9 from training. Duration: 0.92225 2023-10-02 10:41:11,568 WARNING [train.py:1204] (2/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_860-96198-0 from training. Duration: 0.73 2023-10-02 10:41:14,719 WARNING [train.py:1204] (2/4) Exclude cut with ID _14268_210_9206_1_1530622771285_4714099_331-105351-0_sp0.9 from training. Duration: 0.72225 2023-10-02 10:41:14,742 WARNING [train.py:1204] (2/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_204-137133-0_sp1.1 from training. Duration: 0.92725 2023-10-02 10:41:14,767 WARNING [train.py:1204] (2/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_5-46496-0_sp1.1 from training. Duration: 0.936375 2023-10-02 10:41:14,774 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_160-218721-0_sp1.1 from training. Duration: 0.87275 2023-10-02 10:41:18,896 WARNING [train.py:1204] (2/4) Exclude cut with ID _45329_210_7005_1_1534157951211_6774339_503-287883-0 from training. Duration: 0.88 2023-10-02 10:41:20,229 WARNING [train.py:1204] (2/4) Exclude cut with ID _57572_210_5517_1_1534921219624_3809484_110-79134-0_sp1.1 from training. Duration: 0.92725 2023-10-02 10:41:21,675 WARNING [train.py:1204] (2/4) Exclude cut with ID _44585_210_18628_1_1533605350387_4518199_421-37641-0_sp1.1 from training. Duration: 0.936375 2023-10-02 10:41:22,994 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_42-142473-0_sp0.9 from training. Duration: 0.8 2023-10-02 10:41:24,461 WARNING [train.py:1204] (2/4) Exclude cut with ID _5166_210_2084_1_1527073197160_4087993_310-114452-0_sp0.9 from training. Duration: 0.6555625 2023-10-02 10:41:25,652 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.442e+02 1.842e+02 2.089e+02 2.326e+02 3.490e+02, threshold=4.178e+02, percent-clipped=0.0 2023-10-02 10:41:28,550 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9029W0120-509520 from training. Duration: 0.768 2023-10-02 10:41:28,563 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9029W0433-509820 from training. Duration: 0.896 2023-10-02 10:41:30,028 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_40222_210_6228_1_1533385298_4481939_163-334586-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 10:41:30,028 WARNING [train.py:1204] (2/4) Exclude cut with ID _48677_210_5663_1_1533902405919_3892989_408-40740-0_sp1.1 from training. Duration: 0.7545625 2023-10-02 10:41:34,633 WARNING [train.py:1204] (2/4) Exclude cut with ID _24314_210_4915_1_1532135078116_7170179_521-258868-0_sp0.9 from training. Duration: 0.87775 2023-10-02 10:41:36,605 WARNING [train.py:1204] (2/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_576-51198-0_sp1.1 from training. Duration: 0.92725 2023-10-02 10:41:37,882 WARNING [train.py:1204] (2/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_306-216871-0_sp1.1 from training. Duration: 0.97275 2023-10-02 10:41:38,604 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.3.encoder.layers.1.whiten, num_groups=1, num_channels=512, metric=4.84 vs. limit=12.0 2023-10-02 10:41:43,764 WARNING [train.py:1204] (2/4) Exclude cut with ID _20684_210_6917_1_1531567751646_4203930_463-119982-0_sp1.1 from training. Duration: 0.97275 2023-10-02 10:41:43,834 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9084W0012-536850 from training. Duration: 0.64 2023-10-02 10:41:47,053 WARNING [train.py:1204] (2/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_847-41334-0_sp1.1 from training. Duration: 0.4 2023-10-02 10:41:50,020 WARNING [train.py:1204] (2/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_326-165900-0_sp1.1 from training. Duration: 0.62725 2023-10-02 10:41:51,425 WARNING [train.py:1204] (2/4) Exclude cut with ID _7537_210_2084_1_1528502395431_3924359_128-268029-0_sp1.1 from training. Duration: 0.8909375 2023-10-02 10:41:54,137 WARNING [train.py:1204] (2/4) Exclude cut with ID _67499_210_17282_1_1535509805220_3692430_333-155030-0_sp1.1 from training. Duration: 0.97275 2023-10-02 10:41:55,517 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_34008_210_10431_1_1533345424_8183247_588-257177-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 10:41:58,240 WARNING [train.py:1204] (2/4) Exclude cut with ID _25914_210_6400_1_1532141317058_644740_52-96808-0_sp1.1 from training. Duration: 0.82725 2023-10-02 10:41:59,803 WARNING [train.py:1204] (2/4) Exclude cut with ID _56494_210_4220_1_1535284837625_3033169_69-138150-0_sp1.1 from training. Duration: 0.8545625 2023-10-02 10:42:01,216 WARNING [train.py:1204] (2/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_19-143553-0_sp1.1 from training. Duration: 0.97275 2023-10-02 10:42:02,617 WARNING [train.py:1204] (2/4) Exclude cut with ID _21878_210_3525_1_1531738772002_7264330_582-130735-0_sp1.1 from training. Duration: 0.936375 2023-10-02 10:42:02,724 WARNING [train.py:1204] (2/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_943-201356-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 10:42:02,733 WARNING [train.py:1204] (2/4) Exclude cut with ID _3231_210_2106_1_1526205864934_6784220_422-229276-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 10:42:02,748 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_48049_210_5632_1_1533864553_3519937_73-198717-0_sp1.1 from training. Duration: 0.97275 2023-10-02 10:42:04,585 WARNING [train.py:1204] (2/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_329-285801-0 from training. Duration: 0.91 2023-10-02 10:42:04,612 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9171W0114-579000 from training. Duration: 0.896 2023-10-02 10:42:04,633 WARNING [train.py:1204] (2/4) Exclude cut with ID _3120_210_3525_1_1526036143112_4072980_210-132675-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 10:42:06,355 INFO [train.py:1046] (2/4) Epoch 25, batch 150, loss[loss=0.1779, simple_loss=0.2563, pruned_loss=0.04974, over 23485.00 frames. ], tot_loss[loss=0.1698, simple_loss=0.2477, pruned_loss=0.04599, over 2512569.43 frames. ], batch size: 93, lr: 4.14e-03, grad_scale: 16.0 2023-10-02 10:42:07,653 WARNING [train.py:1204] (2/4) Exclude cut with ID _55857_210_2346_1_1534644007831_6898209_745-98391-0_sp0.9 from training. Duration: 0.8444375 2023-10-02 10:42:07,725 WARNING [train.py:1204] (2/4) Exclude cut with ID _5250_210_5219_1_1527249561806_8692110_615-322035-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 10:42:07,729 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_36756_210_7234_1_1533026991_966276_119-315767-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 10:42:07,743 WARNING [train.py:1204] (2/4) Exclude cut with ID _13916_210_9994_1_1530788341126_3763149_60-191556-0_sp1.1 from training. Duration: 0.5545625 2023-10-02 10:42:09,014 WARNING [train.py:1204] (2/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_177-236809-0_sp1.1 from training. Duration: 0.4454375 2023-10-02 10:42:09,048 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7002_210_1794_1_1527939683_5157286_184-229630-0_sp1.1 from training. Duration: 0.67275 2023-10-02 10:42:09,054 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_50428_210_5724_1_1534247532_4126418_464-18322-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 10:42:09,132 WARNING [train.py:1204] (2/4) Exclude cut with ID _35799_210_5854_1_1533263487887_3529071_466-131402-0_sp1.1 from training. Duration: 0.936375 2023-10-02 10:42:10,416 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_1259-132768-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 10:42:11,165 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.2.encoder.layers.1.nonlin_attention.whiten1, num_groups=1, num_channels=288, metric=5.64 vs. limit=10.0 2023-10-02 10:42:11,929 WARNING [train.py:1204] (2/4) Exclude cut with ID _6694_210_4599_1_1527852637490_3965529_144-185051-0_sp1.1 from training. Duration: 0.87275 2023-10-02 10:42:11,973 WARNING [train.py:1204] (2/4) Exclude cut with ID _64639_210_5816_1_1535367619437_3528000_59-133290-0_sp1.1 from training. Duration: 0.963625 2023-10-02 10:42:15,227 WARNING [train.py:1204] (2/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_223-49525-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 10:42:18,403 WARNING [train.py:1204] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_748-36150-0_sp1.1 from training. Duration: 0.82725 2023-10-02 10:42:18,403 WARNING [train.py:1204] (2/4) Exclude cut with ID _19896_210_1883_1_1531709022662_6649569_52-221627-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 10:42:18,467 WARNING [train.py:1204] (2/4) Exclude cut with ID _6789_210_1534_1_1527854471671_2870519_296-115766-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 10:42:21,130 WARNING [train.py:1204] (2/4) Exclude cut with ID _14624_210_1925_1_1530858690657_3642219_100-19871-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 10:42:21,171 WARNING [train.py:1204] (2/4) Exclude cut with ID _12555_210_8750_1_1530075594402_3780239_50-293675-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 10:42:23,810 WARNING [train.py:1204] (2/4) Exclude cut with ID _14880_210_8895_1_1530680426655_3795259_289-212817-0_sp1.1 from training. Duration: 0.62725 2023-10-02 10:42:23,855 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_14056_210_4163_1_1530604483_7572827_388-211626-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 10:42:26,814 WARNING [train.py:1204] (2/4) Exclude cut with ID _5289_210_2084_1_1527678202736_3779299_392-261988-0 from training. Duration: 0.65 2023-10-02 10:42:26,821 WARNING [train.py:1204] (2/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_124-345840-0 from training. Duration: 0.52 2023-10-02 10:42:26,826 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_2655_210_2087_1_1525688959_3897400_437-337690-0 from training. Duration: 0.63 2023-10-02 10:42:28,351 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_3780_210_2087_1_1526467391_239068_13-274633-0_sp1.1 from training. Duration: 0.92725 2023-10-02 10:42:28,355 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_805-71497-0_sp1.1 from training. Duration: 0.6454375 2023-10-02 10:42:29,769 WARNING [train.py:1204] (2/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_434-83000-0_sp1.1 from training. Duration: 0.8909375 2023-10-02 10:42:31,583 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_17613_210_2151_1_1531137569_2366160_285-219531-0_sp1.1 from training. Duration: 0.97275 2023-10-02 10:42:31,588 WARNING [train.py:1204] (2/4) Exclude cut with ID _56108_210_16380_1_1534505446159_3887959_22-37858-0_sp1.1 from training. Duration: 0.936375 2023-10-02 10:42:31,626 WARNING [train.py:1204] (2/4) Exclude cut with ID _28427_210_4220_1_1532581240967_3061069_200-88444-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 10:42:33,023 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_43802_210_2290_1_1533516203_8829504_77-279042-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 10:42:33,135 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9309W0193-640320 from training. Duration: 0.896 2023-10-02 10:42:34,932 WARNING [train.py:1204] (2/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_516-324055-0_sp1.1 from training. Duration: 0.936375 2023-10-02 10:42:35,208 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.bypass.scale_min, batch_count=851073.3333333334, ans=0.2 2023-10-02 10:42:36,492 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.feed_forward1.hidden_balancer.prob, batch_count=851073.3333333334, ans=0.125 2023-10-02 10:42:40,230 WARNING [train.py:1204] (2/4) Exclude cut with ID _40094_210_16380_1_1533294005773_3641159_10-261240-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 10:42:42,767 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.1.conv_skip_rate, batch_count=851073.3333333334, ans=0.0 2023-10-02 10:42:46,567 WARNING [train.py:1204] (2/4) Exclude cut with ID _6267_210_2084_1_1527764658526_3894730_375-219887-0_sp1.1 from training. Duration: 0.6090625 2023-10-02 10:42:46,790 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.ff3_skip_rate, batch_count=851073.3333333334, ans=0.0 2023-10-02 10:42:47,937 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6788_210_1388_1_1527898698_4562021_180-75288-0 from training. Duration: 0.99 2023-10-02 10:42:50,828 WARNING [train.py:1204] (2/4) Exclude cut with ID _8030_210_7497_1_1528444886781_3531089_262-86750-0_sp1.1 from training. Duration: 0.736375 2023-10-02 10:42:50,842 WARNING [train.py:1204] (2/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_934-325498-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 10:42:52,140 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_456-22141-0_sp0.9 from training. Duration: 0.97775 2023-10-02 10:42:53,633 WARNING [train.py:1204] (2/4) Exclude cut with ID _49643_210_7989_1_1534640244224_3939031_598-161926-0_sp1.1 from training. Duration: 0.8181875 2023-10-02 10:42:54,907 WARNING [train.py:1204] (2/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_345-49721-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 10:42:56,275 WARNING [train.py:1204] (2/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_440-141811-0_sp1.1 from training. Duration: 0.863625 2023-10-02 10:42:56,577 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.feed_forward2.hidden_balancer.prob, batch_count=851140.0, ans=0.125 2023-10-02 10:42:57,629 WARNING [train.py:1204] (2/4) Exclude cut with ID _64048_210_10626_1_1535198382802_3835179_394-212120-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 10:42:58,984 WARNING [train.py:1204] (2/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_91-277684-0 from training. Duration: 0.74 2023-10-02 10:43:00,720 INFO [scaling.py:1118] (2/4) WithLoss: name=encoder.encoders.3.encoder.layers.1.self_attn_weights, loss-sum=0.000e+00 2023-10-02 10:43:02,414 WARNING [train.py:1204] (2/4) Exclude cut with ID _23038_210_9570_1_1532001584158_3652419_191-346343-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 10:43:03,744 WARNING [train.py:1204] (2/4) Exclude cut with ID _15140_210_9288_1_1530791388450_4880149_351-347974-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 10:43:03,779 WARNING [train.py:1204] (2/4) Exclude cut with ID _58605_210_12210_1_1534723195939_3904259_48-269402-0_sp1.1 from training. Duration: 0.87275 2023-10-02 10:43:03,786 WARNING [train.py:1204] (2/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_401-53423-0_sp0.9 from training. Duration: 0.611125 2023-10-02 10:43:05,166 WARNING [train.py:1204] (2/4) Exclude cut with ID _42659_210_2070_1_1533607442765_3120380_158-49004-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 10:43:08,340 WARNING [train.py:1204] (2/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_81-75849-0_sp1.1 from training. Duration: 0.3545625 2023-10-02 10:43:09,789 WARNING [train.py:1204] (2/4) Exclude cut with ID _27052_210_5986_1_1532329197187_3585009_314-106499-0_sp1.1 from training. Duration: 0.836375 2023-10-02 10:43:12,356 WARNING [train.py:1204] (2/4) Exclude cut with ID _4626_210_3532_1_1526817652717_3916859_446-206656-0_sp0.9 from training. Duration: 0.8444375 2023-10-02 10:43:13,672 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_53378_210_17282_1_1534221071_5769509_158-114960-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 10:43:15,524 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_3171_210_2087_1_1526109084_2330007_37-64334-0_sp0.9 from training. Duration: 0.911125 2023-10-02 10:43:16,839 WARNING [train.py:1204] (2/4) Exclude cut with ID _5410_210_1388_1_1527397055795_1719020_39-112221-0 from training. Duration: 0.69 2023-10-02 10:43:16,856 WARNING [train.py:1204] (2/4) Exclude cut with ID _8064_210_5399_1_1528372299291_4282973_4-91037-0_sp0.9 from training. Duration: 0.97775 2023-10-02 10:43:16,880 WARNING [train.py:1204] (2/4) Exclude cut with ID ID0079W0443-717795 from training. Duration: 0.541 2023-10-02 10:43:17,543 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.3.encoder.layers.3.nonlin_attention.whiten2, num_groups=1, num_channels=512, metric=4.89 vs. limit=15.0 2023-10-02 10:43:20,121 INFO [train.py:1046] (2/4) Epoch 25, batch 200, loss[loss=0.1895, simple_loss=0.2622, pruned_loss=0.05833, over 23837.00 frames. ], tot_loss[loss=0.1715, simple_loss=0.2492, pruned_loss=0.04695, over 2993344.33 frames. ], batch size: 212, lr: 4.14e-03, grad_scale: 16.0 2023-10-02 10:43:20,233 WARNING [train.py:1204] (2/4) Exclude cut with ID _34734_210_4525_1_1533365977542_4448720_766-297928-0_sp1.1 from training. Duration: 0.936375 2023-10-02 10:43:22,970 WARNING [train.py:1204] (2/4) Exclude cut with ID _51471_210_1889_1_1534399391390_3094250_310-337793-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 10:43:22,990 WARNING [train.py:1204] (2/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_678-159307-0_sp0.9 from training. Duration: 0.9444375 2023-10-02 10:43:25,752 WARNING [train.py:1204] (2/4) Exclude cut with ID _50093_210_4220_1_1534417141207_3432299_259-322252-0 from training. Duration: 0.92 2023-10-02 10:43:25,807 WARNING [train.py:1204] (2/4) Exclude cut with ID _47790_210_16187_1_1533862814066_4207519_67-103444-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 10:43:27,084 WARNING [train.py:1204] (2/4) Exclude cut with ID _40551_210_10635_1_1533776424632_3416179_311-247578-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 10:43:29,870 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_4633_210_2040_1_1526824163_4145343_416-76117-0 from training. Duration: 0.95 2023-10-02 10:43:31,727 WARNING [train.py:1204] (2/4) Exclude cut with ID _5166_210_2084_1_1527073197160_4087993_331-117829-0_sp0.9 from training. Duration: 0.688875 2023-10-02 10:43:33,145 WARNING [train.py:1204] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_299-247059-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 10:43:34,507 WARNING [train.py:1204] (2/4) Exclude cut with ID _64002_210_9570_1_1535198605157_38979_2-240845-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 10:43:39,135 WARNING [train.py:1204] (2/4) Exclude cut with ID _4681_210_3525_1_1527250689464_120889_15-126760-0_sp0.9 from training. Duration: 0.9 2023-10-02 10:43:39,146 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_21500_210_2054_1_1532149310_2716758_256-156611-0_sp1.1 from training. Duration: 0.936375 2023-10-02 10:43:39,166 WARNING [train.py:1204] (2/4) Exclude cut with ID _16908_210_5724_1_1531119157248_3772700_624-203729-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 10:43:54,038 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.0.layers.1.self_attn_weights.whiten_keys, num_groups=4, num_channels=128, metric=2.37 vs. limit=6.0 2023-10-02 10:43:54,483 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.636e+02 1.978e+02 2.260e+02 2.565e+02 3.626e+02, threshold=4.520e+02, percent-clipped=0.0 2023-10-02 10:43:56,024 WARNING [train.py:1204] (2/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_605-219732-0_sp1.1 from training. Duration: 0.8545625 2023-10-02 10:43:57,274 WARNING [train.py:1204] (2/4) Exclude cut with ID _44718_210_5816_1_1534050051869_6469030_380-167520-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 10:43:57,342 WARNING [train.py:1204] (2/4) Exclude cut with ID _19896_210_1883_1_1531709022662_6649569_94-229927-0_sp1.1 from training. Duration: 0.8818125 2023-10-02 10:43:58,651 WARNING [train.py:1204] (2/4) Exclude cut with ID _19514_210_9259_1_1531530071330_5084230_519-174300-0_sp1.1 from training. Duration: 0.963625 2023-10-02 10:43:58,710 WARNING [train.py:1204] (2/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_340-174176-0_sp0.9 from training. Duration: 0.5444375 2023-10-02 10:43:58,714 WARNING [train.py:1204] (2/4) Exclude cut with ID _8263_210_1664_1_1528439452891_970220_68-181805-0_sp1.1 from training. Duration: 0.5818125 2023-10-02 10:44:02,006 WARNING [train.py:1204] (2/4) Exclude cut with ID _3120_210_3525_1_1526036143112_4072980_348-257723-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 10:44:02,079 WARNING [train.py:1204] (2/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_453-133983-0_sp1.1 from training. Duration: 0.7090625 2023-10-02 10:44:03,395 WARNING [train.py:1204] (2/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_17-86060-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 10:44:04,699 WARNING [train.py:1204] (2/4) Exclude cut with ID _29877_210_6228_1_1532397791106_5276149_228-184657-0_sp1.1 from training. Duration: 0.97275 2023-10-02 10:44:04,786 WARNING [train.py:1204] (2/4) Exclude cut with ID _18516_210_9206_1_1531704653206_3692406_198-334870-0 from training. Duration: 0.92 2023-10-02 10:44:04,827 WARNING [train.py:1204] (2/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_340-174176-0_sp1.1 from training. Duration: 0.4454375 2023-10-02 10:44:04,838 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_3133_210_2087_1_1526106446_2324690_14-139186-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 10:44:10,755 WARNING [train.py:1204] (2/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_126-29104-0_sp0.9 from training. Duration: 0.9444375 2023-10-02 10:44:14,223 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.2.encoder.layers.1.whiten, num_groups=1, num_channels=384, metric=5.37 vs. limit=12.0 2023-10-02 10:44:15,081 WARNING [train.py:1204] (2/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_84-10310-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 10:44:15,287 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.bypass_mid.scale_min, batch_count=851473.3333333334, ans=0.2 2023-10-02 10:44:16,629 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.self_attn_weights.pos_emb_skip_rate, batch_count=851473.3333333334, ans=0.0 2023-10-02 10:44:19,718 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.bypass.scale_min, batch_count=851540.0, ans=0.2 2023-10-02 10:44:24,270 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_15517_210_6753_1_1530954008_3487336_79-273076-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 10:44:25,625 WARNING [train.py:1204] (2/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_371-18169-0_sp0.9 from training. Duration: 0.9555625 2023-10-02 10:44:31,134 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_68702_210_23689_1_1535799577_3420068_170-92788-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 10:44:31,434 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.feed_forward2.hidden_balancer.prob, batch_count=851540.0, ans=0.125 2023-10-02 10:44:32,023 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.1.encoder.layers.0.nonlin_attention.whiten2, num_groups=1, num_channels=256, metric=5.88 vs. limit=15.0 2023-10-02 10:44:33,167 WARNING [train.py:1204] (2/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_78-182912-0 from training. Duration: 0.59 2023-10-02 10:44:33,219 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_3019_210_2028_1_1526044245_3264865_297-12384-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 10:44:33,224 WARNING [train.py:1204] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_147-111137-0_sp1.1 from training. Duration: 0.863625 2023-10-02 10:44:33,228 WARNING [train.py:1204] (2/4) Exclude cut with ID _28323_210_16187_1_1532480395667_4100300_117-8859-0_sp1.1 from training. Duration: 0.97275 2023-10-02 10:44:34,607 INFO [train.py:1046] (2/4) Epoch 25, batch 250, loss[loss=0.1818, simple_loss=0.2449, pruned_loss=0.05935, over 23844.00 frames. ], tot_loss[loss=0.171, simple_loss=0.2487, pruned_loss=0.04667, over 3392567.59 frames. ], batch size: 195, lr: 4.14e-03, grad_scale: 16.0 2023-10-02 10:44:34,686 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7762_210_1794_1_1528199508_4057408_419-125009-0_sp1.1 from training. Duration: 0.7545625 2023-10-02 10:44:36,089 WARNING [train.py:1204] (2/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_268-87909-0 from training. Duration: 0.99 2023-10-02 10:44:37,480 WARNING [train.py:1204] (2/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_118-311357-0_sp1.1 from training. Duration: 0.92725 2023-10-02 10:44:37,501 WARNING [train.py:1204] (2/4) Exclude cut with ID ID0377W0003-870225 from training. Duration: 0.852 2023-10-02 10:44:38,954 WARNING [train.py:1204] (2/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_56-5838-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 10:44:40,844 WARNING [train.py:1204] (2/4) Exclude cut with ID _14603_210_4220_1_1530615688441_3097319_90-31383-0_sp0.9 from training. Duration: 0.9666875 2023-10-02 10:44:40,936 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_31583_210_3852_1_1532595851_4024413_58-86657-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 10:44:42,279 WARNING [train.py:1204] (2/4) Exclude cut with ID _30304_210_6319_1_1532476827261_7582060_344-113875-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 10:44:42,459 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.conv_module1.balancer1.prob, batch_count=851606.6666666666, ans=0.125 2023-10-02 10:44:43,630 WARNING [train.py:1204] (2/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_278-104676-0_sp1.1 from training. Duration: 0.8909375 2023-10-02 10:44:44,801 WARNING [train.py:1204] (2/4) Exclude cut with ID _55844_210_2346_1_1534471212569_7625707_764-2387-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 10:44:46,148 WARNING [train.py:1204] (2/4) Exclude cut with ID _27430_210_7770_1_1532604689375_1378590_157-14963-0_sp1.1 from training. Duration: 0.963625 2023-10-02 10:44:48,934 WARNING [train.py:1204] (2/4) Exclude cut with ID _7160_210_1876_1_1527940923231_3703799_282-322611-0_sp1.1 from training. Duration: 0.7909375 2023-10-02 10:45:01,073 WARNING [train.py:1204] (2/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_190-138640-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 10:45:04,306 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_33348_210_5632_1_1533086912_3324030_63-270922-0_sp1.1 from training. Duration: 0.97275 2023-10-02 10:45:04,335 WARNING [train.py:1204] (2/4) Exclude cut with ID _54871_210_22356_1_1534665798253_3856359_135-184281-0_sp1.1 from training. Duration: 0.9 2023-10-02 10:45:05,822 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=851740.0, ans=0.1 2023-10-02 10:45:09,678 WARNING [train.py:1204] (2/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_277-210798-0_sp1.1 from training. Duration: 0.5 2023-10-02 10:45:09,733 WARNING [train.py:1204] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_402-253027-0_sp0.9 from training. Duration: 0.6 2023-10-02 10:45:11,631 WARNING [train.py:1204] (2/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_540-275685-0_sp0.9 from training. Duration: 0.988875 2023-10-02 10:45:11,676 WARNING [train.py:1204] (2/4) Exclude cut with ID _59493_210_17282_1_1535078419077_3225879_118-18960-0_sp1.1 from training. Duration: 0.936375 2023-10-02 10:45:13,050 WARNING [train.py:1204] (2/4) Exclude cut with ID _6194_210_1534_1_1527511826160_3941194_321-148641-0_sp1.1 from training. Duration: 0.5818125 2023-10-02 10:45:14,386 WARNING [train.py:1204] (2/4) Exclude cut with ID _61133_210_9969_1_1535196777227_3696409_333-270325-0_sp1.1 from training. Duration: 0.8454375 2023-10-02 10:45:15,740 WARNING [train.py:1204] (2/4) Exclude cut with ID _49376_210_5986_1_1533985278732_3370740_307-145221-0_sp1.1 from training. Duration: 0.936375 2023-10-02 10:45:17,185 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6846_210_6753_1_1527850767_4295158_2-315073-0_sp0.9 from training. Duration: 0.97775 2023-10-02 10:45:19,942 WARNING [train.py:1204] (2/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_17-25045-0 from training. Duration: 0.59 2023-10-02 10:45:19,954 WARNING [train.py:1204] (2/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_503-333510-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 10:45:21,397 WARNING [train.py:1204] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_238-245655-0_sp1.1 from training. Duration: 0.736375 2023-10-02 10:45:21,426 WARNING [train.py:1204] (2/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_329-82235-0_sp0.9 from training. Duration: 0.911125 2023-10-02 10:45:21,436 WARNING [train.py:1204] (2/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_325-113798-0_sp0.9 from training. Duration: 0.9444375 2023-10-02 10:45:23,364 WARNING [train.py:1204] (2/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_395-37222-0_sp0.9 from training. Duration: 0.8444375 2023-10-02 10:45:23,438 WARNING [train.py:1204] (2/4) Exclude cut with ID _64135_210_2253_1_1535677267876_7608972_498-5039-0_sp1.1 from training. Duration: 0.7090625 2023-10-02 10:45:23,446 WARNING [train.py:1204] (2/4) Exclude cut with ID _14922_210_6228_1_1530707435273_3693310_303-228738-0_sp0.9 from training. Duration: 0.9333125 2023-10-02 10:45:26,045 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_577-220899-0_sp1.1 from training. Duration: 0.92725 2023-10-02 10:45:27,870 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_3475_210_2028_1_1526193578_3622569_377-166133-0_sp1.1 from training. Duration: 0.7818125 2023-10-02 10:45:29,258 WARNING [train.py:1204] (2/4) Exclude cut with ID _49376_210_5986_1_1533985278732_3370740_272-313512-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 10:45:29,599 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.ff2_skip_rate, batch_count=851806.6666666666, ans=0.0 2023-10-02 10:45:32,068 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7762_210_1794_1_1528199508_4057408_422-227034-0_sp0.9 from training. Duration: 0.811125 2023-10-02 10:45:36,547 WARNING [train.py:1204] (2/4) Exclude cut with ID _18895_210_6228_1_1531357274234_3784930_74-341428-0_sp1.1 from training. Duration: 0.92725 2023-10-02 10:45:38,098 WARNING [train.py:1204] (2/4) Exclude cut with ID _44902_210_16187_1_1533888006062_3792078_149-67212-0_sp1.1 from training. Duration: 0.7909375 2023-10-02 10:45:39,691 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.feed_forward1.hidden_balancer.prob, batch_count=851873.3333333334, ans=0.125 2023-10-02 10:45:40,966 WARNING [train.py:1204] (2/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_243-126020-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 10:45:42,769 WARNING [train.py:1204] (2/4) Exclude cut with ID _54218_210_12210_1_1534413646388_4409259_259-6561-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 10:45:45,737 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_41452_210_7291_1_1533437687_3941281_405-11559-0 from training. Duration: 0.91 2023-10-02 10:45:47,092 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_20120_210_6753_1_1531643035_5482859_759-225084-0_sp1.1 from training. Duration: 0.97275 2023-10-02 10:45:47,104 WARNING [train.py:1204] (2/4) Exclude cut with ID _14747_210_8341_1_1530685797463_3654089_147-131229-0_sp0.9 from training. Duration: 0.8444375 2023-10-02 10:45:48,260 INFO [train.py:1046] (2/4) Epoch 25, batch 300, loss[loss=0.1563, simple_loss=0.2048, pruned_loss=0.05385, over 19275.00 frames. ], tot_loss[loss=0.1697, simple_loss=0.2469, pruned_loss=0.0462, over 3672419.41 frames. ], batch size: 389, lr: 4.14e-03, grad_scale: 16.0 2023-10-02 10:45:49,657 WARNING [train.py:1204] (2/4) Exclude cut with ID _14867_210_1968_1_1530684179890_3846969_319-250868-0 from training. Duration: 0.99 2023-10-02 10:45:49,676 WARNING [train.py:1204] (2/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_580-93798-0_sp0.9 from training. Duration: 0.62225 2023-10-02 10:45:51,696 WARNING [train.py:1204] (2/4) Exclude cut with ID _9679_210_7497_1_1529129212906_4363929_512-313889-0_sp0.9 from training. Duration: 0.9 2023-10-02 10:45:51,711 WARNING [train.py:1204] (2/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_18-189198-0 from training. Duration: 0.9 2023-10-02 10:45:52,202 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.5.encoder.layers.1.conv_module1.whiten, num_groups=1, num_channels=256, metric=2.95 vs. limit=15.0 2023-10-02 10:45:56,214 WARNING [train.py:1204] (2/4) Exclude cut with ID _49755_210_18053_1_1534581005846_3218709_137-64179-0_sp1.1 from training. Duration: 0.92725 2023-10-02 10:45:56,276 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_48049_210_5632_1_1533864553_3519937_306-283794-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 10:46:00,427 WARNING [train.py:1204] (2/4) Exclude cut with ID _43552_210_8913_1_1533726031059_3523429_25-120161-0_sp1.1 from training. Duration: 0.8909375 2023-10-02 10:46:01,999 WARNING [train.py:1204] (2/4) Exclude cut with ID _8833_210_5219_1_1528592665210_7553059_319-349181-0 from training. Duration: 0.99 2023-10-02 10:46:02,514 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.4.encoder.layers.2.nonlin_attention.whiten2, num_groups=1, num_channels=384, metric=3.52 vs. limit=15.0 2023-10-02 10:46:02,871 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.0.layers.1.self_attn_weights.whiten_keys, num_groups=4, num_channels=128, metric=2.39 vs. limit=6.0 2023-10-02 10:46:03,435 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_296-332325-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 10:46:04,789 WARNING [train.py:1204] (2/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_436-186311-0_sp1.1 from training. Duration: 0.5909375 2023-10-02 10:46:04,802 WARNING [train.py:1204] (2/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_348-127789-0 from training. Duration: 0.85 2023-10-02 10:46:04,808 WARNING [train.py:1204] (2/4) Exclude cut with ID _3992_210_1747_1_1526644816031_3731060_443-195183-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 10:46:05,088 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.conv_skip_rate, batch_count=852006.6666666666, ans=0.0 2023-10-02 10:46:08,153 WARNING [train.py:1204] (2/4) Exclude cut with ID _3465_210_3525_1_1526175667060_3574049_308-231735-0_sp1.1 from training. Duration: 0.663625 2023-10-02 10:46:12,967 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6786_210_1388_1_1527839699_4194495_378-46070-0_sp1.1 from training. Duration: 0.6545625 2023-10-02 10:46:13,008 WARNING [train.py:1204] (2/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_63-101996-0 from training. Duration: 0.88 2023-10-02 10:46:15,721 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_2541_210_1794_1_1525590269_3084765_156-173713-0 from training. Duration: 0.81 2023-10-02 10:46:17,079 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_217-239796-0_sp1.1 from training. Duration: 0.963625 2023-10-02 10:46:19,612 WARNING [train.py:1204] (2/4) Exclude cut with ID _21878_210_3525_1_1531738772002_7264330_530-329400-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 10:46:22,341 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.571e+02 1.821e+02 1.986e+02 2.165e+02 3.006e+02, threshold=3.972e+02, percent-clipped=0.0 2023-10-02 10:46:23,058 WARNING [train.py:1204] (2/4) Exclude cut with ID _21783_210_11357_1_1531742458730_3762679_150-62920-0_sp1.1 from training. Duration: 0.963625 2023-10-02 10:46:23,059 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_5522_210_3273_1_1527249606_3909096_366-172508-0 from training. Duration: 0.66 2023-10-02 10:46:23,061 WARNING [train.py:1204] (2/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_303-340446-0_sp1.1 from training. Duration: 0.8181875 2023-10-02 10:46:25,721 WARNING [train.py:1204] (2/4) Exclude cut with ID _8699_210_2069_1_1528630346831_3960409_160-24408-0_sp1.1 from training. Duration: 0.82725 2023-10-02 10:46:27,327 WARNING [train.py:1204] (2/4) Exclude cut with ID _41314_210_12651_1_1533459476543_3766460_299-187801-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 10:46:29,179 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_34886_210_10633_1_1533203882_1755661_92-345288-0_sp1.1 from training. Duration: 0.97275 2023-10-02 10:46:30,858 INFO [scaling.py:1118] (2/4) WithLoss: name=encoder.encoders.1.encoder.layers.0.self_attn_weights, loss-sum=0.000e+00 2023-10-02 10:46:32,030 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_5438_210_1794_1_1527336687_2600009_104-288274-0_sp1.1 from training. Duration: 0.52725 2023-10-02 10:46:32,034 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_154-316450-0 from training. Duration: 0.76 2023-10-02 10:46:33,241 WARNING [train.py:1204] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_23-189234-0_sp0.9 from training. Duration: 0.92225 2023-10-02 10:46:34,668 WARNING [train.py:1204] (2/4) Exclude cut with ID _4779_210_2084_1_1527318022944_3914893_346-168820-0_sp1.1 from training. Duration: 0.963625 2023-10-02 10:46:36,118 WARNING [train.py:1204] (2/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_5-55893-0 from training. Duration: 0.55 2023-10-02 10:46:37,458 WARNING [train.py:1204] (2/4) Exclude cut with ID _20455_210_5219_1_1531565996775_8098100_180-24517-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 10:46:41,381 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.1.ff2_skip_rate, batch_count=852140.0, ans=0.0 2023-10-02 10:46:42,538 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_3133_210_2087_1_1526106446_2324690_243-143119-0_sp0.9 from training. Duration: 0.8555625 2023-10-02 10:46:45,297 WARNING [train.py:1204] (2/4) Exclude cut with ID _65585_210_12210_1_1535450451584_3764098_293-212045-0_sp1.1 from training. Duration: 0.92725 2023-10-02 10:46:45,306 WARNING [train.py:1204] (2/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_577-332600-0 from training. Duration: 0.57 2023-10-02 10:46:46,929 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.conv_skip_rate, batch_count=852206.6666666666, ans=0.0 2023-10-02 10:46:49,368 WARNING [train.py:1204] (2/4) Exclude cut with ID _30039_210_13270_1_1532484612900_2725150_104-289874-0_sp1.1 from training. Duration: 0.963625 2023-10-02 10:46:49,374 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_3172_210_2087_1_1526292929_3987865_197-97762-0_sp1.1 from training. Duration: 0.6818125 2023-10-02 10:46:50,912 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_256-150908-0_sp1.1 from training. Duration: 0.963625 2023-10-02 10:46:52,265 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_296-287678-0_sp1.1 from training. Duration: 0.863625 2023-10-02 10:46:54,108 WARNING [train.py:1204] (2/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_443-30034-0 from training. Duration: 0.64 2023-10-02 10:46:54,138 WARNING [train.py:1204] (2/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_494-199538-0_sp0.9 from training. Duration: 0.8333125 2023-10-02 10:46:54,191 WARNING [train.py:1204] (2/4) Exclude cut with ID _19708_210_9206_1_1532136650733_4238364_61-162325-0_sp1.1 from training. Duration: 0.97275 2023-10-02 10:46:56,844 WARNING [train.py:1204] (2/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_475-127455-0 from training. Duration: 0.7 2023-10-02 10:46:56,967 WARNING [train.py:1204] (2/4) Exclude cut with ID _48677_210_5663_1_1533902405919_3892989_188-11581-0_sp1.1 from training. Duration: 0.963625 2023-10-02 10:46:58,268 WARNING [train.py:1204] (2/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_136-8381-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 10:46:58,589 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.out_combiner.scale_min, batch_count=852206.6666666666, ans=0.2 2023-10-02 10:46:59,682 WARNING [train.py:1204] (2/4) Exclude cut with ID _55328_210_27767_1_1534580973260_4088170_270-229555-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 10:46:59,716 WARNING [train.py:1204] (2/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_372-266951-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 10:46:59,781 WARNING [train.py:1204] (2/4) Exclude cut with ID _47790_210_16187_1_1533862814066_4207519_14-242149-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 10:47:02,999 INFO [train.py:1046] (2/4) Epoch 25, batch 350, loss[loss=0.1743, simple_loss=0.2617, pruned_loss=0.04339, over 23819.00 frames. ], tot_loss[loss=0.1684, simple_loss=0.2454, pruned_loss=0.04568, over 3904467.80 frames. ], batch size: 85, lr: 4.14e-03, grad_scale: 16.0 2023-10-02 10:47:05,788 WARNING [train.py:1204] (2/4) Exclude cut with ID _7877_210_6014_1_1528286358099_3750136_159-76512-0_sp1.1 from training. Duration: 0.936375 2023-10-02 10:47:05,790 WARNING [train.py:1204] (2/4) Exclude cut with ID _8263_210_1664_1_1528438953494_294100_40-204731-0_sp1.1 from training. Duration: 0.4545625 2023-10-02 10:47:06,144 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.balancer2.prob, batch_count=852273.3333333334, ans=0.125 2023-10-02 10:47:07,782 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8278_210_2550_1_1528637099_4220413_694-238413-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 10:47:13,672 WARNING [train.py:1204] (2/4) Exclude cut with ID _47278_210_12041_1_1533902418349_3576110_421-172710-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 10:47:15,237 WARNING [train.py:1204] (2/4) Exclude cut with ID _14970_210_15709_1_1530773543493_1715730_215-282680-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 10:47:15,431 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.feed_forward1.out_proj.dropout_p, batch_count=852273.3333333334, ans=0.1 2023-10-02 10:47:16,576 WARNING [train.py:1204] (2/4) Exclude cut with ID _64639_210_5816_1_1535367619437_3528000_306-149449-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 10:47:19,358 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_28-291982-0 from training. Duration: 0.97 2023-10-02 10:47:20,779 WARNING [train.py:1204] (2/4) Exclude cut with ID _55134_210_21982_1_1534590028377_3882170_412-15870-0_sp1.1 from training. Duration: 0.936375 2023-10-02 10:47:20,819 WARNING [train.py:1204] (2/4) Exclude cut with ID _13684_210_7497_1_1530619354863_3598359_312-235606-0 from training. Duration: 0.6 2023-10-02 10:47:23,938 WARNING [train.py:1204] (2/4) Exclude cut with ID _37112_210_2575_1_1532997007673_7268610_347-238993-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 10:47:25,254 WARNING [train.py:1204] (2/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_121-284152-0 from training. Duration: 0.59 2023-10-02 10:47:25,790 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.5.encoder.layers.0.self_attn2.whiten, num_groups=1, num_channels=256, metric=12.98 vs. limit=22.5 2023-10-02 10:47:26,602 WARNING [train.py:1204] (2/4) Exclude cut with ID _2941_210_2577_1_1526199673150_2489759_228-2961-0_sp1.1 from training. Duration: 0.97275 2023-10-02 10:47:28,127 WARNING [train.py:1204] (2/4) Exclude cut with ID _28323_210_16187_1_1532480395667_4100300_531-24157-0 from training. Duration: 0.98 2023-10-02 10:47:29,565 WARNING [train.py:1204] (2/4) Exclude cut with ID _32704_210_8954_1_1533017488840_6747370_266-329755-0_sp0.9 from training. Duration: 0.988875 2023-10-02 10:47:32,806 WARNING [train.py:1204] (2/4) Exclude cut with ID _6653_210_2629_1_1527919172441_6732679_1038-146421-0_sp1.1 from training. Duration: 0.97275 2023-10-02 10:47:32,877 WARNING [train.py:1204] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_391-22148-0_sp0.9 from training. Duration: 0.8555625 2023-10-02 10:47:33,377 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.4.encoder.layers.2.nonlin_attention.whiten2, num_groups=1, num_channels=384, metric=3.42 vs. limit=15.0 2023-10-02 10:47:34,110 WARNING [train.py:1204] (2/4) Exclude cut with ID _64521_210_14699_1_1535243427259_3815328_543-341117-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 10:47:34,121 WARNING [train.py:1204] (2/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_52-168712-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 10:47:34,159 WARNING [train.py:1204] (2/4) Exclude cut with ID _33920_210_4220_1_1533257851500_3084399_198-334063-0_sp1.1 from training. Duration: 0.8545625 2023-10-02 10:47:34,174 WARNING [train.py:1204] (2/4) Exclude cut with ID _50093_210_4220_1_1534417141207_3432299_69-111987-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 10:47:34,427 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.conv_module2.balancer1.prob, batch_count=852406.6666666666, ans=0.125 2023-10-02 10:47:35,426 WARNING [train.py:1204] (2/4) Exclude cut with ID _14821_210_8750_1_1530680503991_6829359_189-297451-0_sp0.9 from training. Duration: 0.611125 2023-10-02 10:47:36,835 WARNING [train.py:1204] (2/4) Exclude cut with ID _8030_210_7497_1_1528444886781_3531089_262-86750-0_sp0.9 from training. Duration: 0.9 2023-10-02 10:47:36,840 WARNING [train.py:1204] (2/4) Exclude cut with ID _24612_210_8913_1_1531998054193_3806359_294-191196-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 10:47:44,655 WARNING [train.py:1204] (2/4) Exclude cut with ID _16909_210_5724_1_1531292079912_4118919_707-342896-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 10:47:44,670 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_9460_210_4211_1_1528954587_10049667_572-37035-0_sp0.9 from training. Duration: 0.888875 2023-10-02 10:47:46,008 WARNING [train.py:1204] (2/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_762-109886-0_sp1.1 from training. Duration: 0.82725 2023-10-02 10:47:46,043 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_14953_210_2151_1_1530701710_5616165_519-326393-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 10:47:50,076 WARNING [train.py:1204] (2/4) Exclude cut with ID _25914_210_6400_1_1532141317058_644740_52-96808-0 from training. Duration: 0.91 2023-10-02 10:47:50,086 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_68095_210_20185_1_1535718226_8888855_1079-94525-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 10:47:53,487 WARNING [train.py:1204] (2/4) Exclude cut with ID _14860_210_4915_1_1530702020340_7343240_746-3647-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 10:47:53,488 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_70379_210_7925_1_1535941455_557455_49-101720-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 10:47:54,775 WARNING [train.py:1204] (2/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_8-307291-0_sp1.1 from training. Duration: 0.936375 2023-10-02 10:47:56,181 WARNING [train.py:1204] (2/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_117-290916-0 from training. Duration: 0.51 2023-10-02 10:47:57,475 WARNING [train.py:1204] (2/4) Exclude cut with ID _22020_210_2461_1_1531724164513_6326900_944-339840-0_sp1.1 from training. Duration: 0.963625 2023-10-02 10:47:58,836 WARNING [train.py:1204] (2/4) Exclude cut with ID IC0515W0043-254911 from training. Duration: 0.976 2023-10-02 10:48:00,208 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_3910_210_1794_1_1526742306_334530_28-238909-0 from training. Duration: 0.81 2023-10-02 10:48:00,221 WARNING [train.py:1204] (2/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_269-267275-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 10:48:03,525 WARNING [train.py:1204] (2/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_72-251255-0_sp1.1 from training. Duration: 0.8545625 2023-10-02 10:48:03,536 WARNING [train.py:1204] (2/4) Exclude cut with ID _14886_210_4220_1_1530702020355_3516190_175-229073-0 from training. Duration: 0.45 2023-10-02 10:48:06,226 WARNING [train.py:1204] (2/4) Exclude cut with ID _40519_210_13794_1_1533715228655_7337390_921-209452-0_sp1.1 from training. Duration: 0.963625 2023-10-02 10:48:09,590 WARNING [train.py:1204] (2/4) Exclude cut with ID _29857_210_20034_1_1532395886753_3798160_335-163562-0_sp0.9 from training. Duration: 0.9444375 2023-10-02 10:48:10,980 WARNING [train.py:1204] (2/4) Exclude cut with ID _58583_210_5236_1_1534726835266_3574249_384-58445-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 10:48:12,334 WARNING [train.py:1204] (2/4) Exclude cut with ID _44011_210_2046_1_1533722401691_3631310_96-315656-0_sp1.1 from training. Duration: 0.963625 2023-10-02 10:48:12,339 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_68484_210_1794_1_1535849804_7678097_549-170613-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 10:48:13,867 WARNING [train.py:1204] (2/4) Exclude cut with ID _37892_210_19596_1_1533198677897_3964058_214-148058-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 10:48:16,570 INFO [train.py:1046] (2/4) Epoch 25, batch 400, loss[loss=0.1767, simple_loss=0.2653, pruned_loss=0.04408, over 24338.00 frames. ], tot_loss[loss=0.1681, simple_loss=0.2449, pruned_loss=0.04563, over 4084494.20 frames. ], batch size: 74, lr: 4.14e-03, grad_scale: 32.0 2023-10-02 10:48:16,658 WARNING [train.py:1204] (2/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_415-78792-0_sp0.9 from training. Duration: 0.9 2023-10-02 10:48:18,070 WARNING [train.py:1204] (2/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_383-205833-0_sp1.1 from training. Duration: 0.863625 2023-10-02 10:48:19,328 WARNING [train.py:1204] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_167-137255-0 from training. Duration: 0.64 2023-10-02 10:48:19,343 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8275_210_4198_1_1528455320_3892571_484-154786-0_sp1.1 from training. Duration: 0.963625 2023-10-02 10:48:20,652 WARNING [train.py:1204] (2/4) Exclude cut with ID _40284_210_2084_1_1533513679814_3704279_396-319412-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 10:48:22,657 WARNING [train.py:1204] (2/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_280-120942-0_sp0.9 from training. Duration: 0.8555625 2023-10-02 10:48:22,703 WARNING [train.py:1204] (2/4) Exclude cut with ID _45630_210_10556_1_1533949174498_3902049_558-240889-0_sp1.1 from training. Duration: 0.92725 2023-10-02 10:48:25,255 WARNING [train.py:1204] (2/4) Exclude cut with ID _56235_210_10640_1_1534557335235_4161099_197-307458-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 10:48:28,061 WARNING [train.py:1204] (2/4) Exclude cut with ID _16909_210_5724_1_1531292079912_4118919_163-254843-0_sp1.1 from training. Duration: 0.92725 2023-10-02 10:48:28,199 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_103-84010-0 from training. Duration: 0.69 2023-10-02 10:48:30,199 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.whiten.whitening_limit, batch_count=852673.3333333334, ans=12.0 2023-10-02 10:48:31,428 WARNING [train.py:1204] (2/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_401-158239-0 from training. Duration: 0.71 2023-10-02 10:48:31,429 WARNING [train.py:1204] (2/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_899-233658-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 10:48:32,739 WARNING [train.py:1204] (2/4) Exclude cut with ID _23825_210_13449_1_1531915313526_3497150_240-271367-0 from training. Duration: 0.97 2023-10-02 10:48:33,976 WARNING [train.py:1204] (2/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_424-134619-0_sp1.1 from training. Duration: 0.92725 2023-10-02 10:48:36,830 WARNING [train.py:1204] (2/4) Exclude cut with ID _44831_210_13427_1_1533816011549_3689035_32-188147-0_sp1.1 from training. Duration: 0.87275 2023-10-02 10:48:36,834 WARNING [train.py:1204] (2/4) Exclude cut with ID _59170_210_6721_1_1534983633960_4203335_314-63879-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 10:48:36,856 WARNING [train.py:1204] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_602-275260-0 from training. Duration: 0.82 2023-10-02 10:48:36,886 WARNING [train.py:1204] (2/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_511-306767-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 10:48:36,919 WARNING [train.py:1204] (2/4) Exclude cut with ID _41817_210_18835_1_1533434477160_7423049_565-201775-0_sp1.1 from training. Duration: 0.92725 2023-10-02 10:48:38,232 WARNING [train.py:1204] (2/4) Exclude cut with ID _28317_210_12742_1_1532480302381_8510325_778-173151-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 10:48:38,262 WARNING [train.py:1204] (2/4) Exclude cut with ID _14998_210_5219_1_1530705582080_7474970_680-51001-0_sp1.1 from training. Duration: 0.963625 2023-10-02 10:48:41,425 WARNING [train.py:1204] (2/4) Exclude cut with ID IC0677W0059-334891 from training. Duration: 0.896 2023-10-02 10:48:41,471 WARNING [train.py:1204] (2/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_370-331600-0 from training. Duration: 0.94 2023-10-02 10:48:47,318 WARNING [train.py:1204] (2/4) Exclude cut with ID _66840_210_5219_1_1535452168426_4196349_446-240192-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 10:48:47,409 WARNING [train.py:1204] (2/4) Exclude cut with ID _65242_210_18628_1_1535369393717_3823149_38-6732-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 10:48:48,662 WARNING [train.py:1204] (2/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_302-2550-0 from training. Duration: 0.81 2023-10-02 10:48:49,990 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.470e+02 1.835e+02 2.039e+02 2.384e+02 3.954e+02, threshold=4.078e+02, percent-clipped=0.0 2023-10-02 10:48:50,138 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_43802_210_2290_1_1533516203_8829504_100-348441-0 from training. Duration: 0.91 2023-10-02 10:48:50,336 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.conv_module2.balancer2.min_abs, batch_count=852740.0, ans=0.5 2023-10-02 10:48:53,360 WARNING [train.py:1204] (2/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_329-285801-0_sp1.1 from training. Duration: 0.82725 2023-10-02 10:48:56,105 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_30167_210_3528_1_1532428012_4772375_343-334712-0_sp1.1 from training. Duration: 0.936375 2023-10-02 10:48:57,659 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.bypass_mid.scale_min, batch_count=852740.0, ans=0.2 2023-10-02 10:49:01,622 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7543_210_6753_1_1528543318_4552_1-85072-0 from training. Duration: 0.83 2023-10-02 10:49:01,869 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.feed_forward1.out_proj.dropout_p, batch_count=852806.6666666666, ans=0.1 2023-10-02 10:49:04,816 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7002_210_1794_1_1527935634_4034444_487-236501-0_sp1.1 from training. Duration: 0.6 2023-10-02 10:49:04,923 WARNING [train.py:1204] (2/4) Exclude cut with ID _7160_210_1876_1_1527940923231_3703799_282-322611-0 from training. Duration: 0.87 2023-10-02 10:49:06,461 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=852806.6666666666, ans=0.1 2023-10-02 10:49:07,733 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.1.feed_forward1.out_proj.dropout_p, batch_count=852806.6666666666, ans=0.1 2023-10-02 10:49:08,837 WARNING [train.py:1204] (2/4) Exclude cut with ID _67335_210_5219_1_1535525975652_4275229_570-262983-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 10:49:10,149 WARNING [train.py:1204] (2/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_625-338546-0_sp1.1 from training. Duration: 0.8 2023-10-02 10:49:10,178 WARNING [train.py:1204] (2/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_176-56120-0 from training. Duration: 0.83 2023-10-02 10:49:10,390 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.bypass.skip_rate, batch_count=852806.6666666666, ans=0.09899494936611666 2023-10-02 10:49:13,625 WARNING [train.py:1204] (2/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_963-187109-0_sp1.1 from training. Duration: 0.9 2023-10-02 10:49:13,814 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=852873.3333333334, ans=0.1 2023-10-02 10:49:16,406 WARNING [train.py:1204] (2/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_121-284152-0_sp0.9 from training. Duration: 0.6555625 2023-10-02 10:49:17,793 WARNING [train.py:1204] (2/4) Exclude cut with ID _45021_210_9994_1_1533619751094_3330288_104-81711-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 10:49:18,013 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.feed_forward3.hidden_balancer.prob, batch_count=852873.3333333334, ans=0.125 2023-10-02 10:49:18,086 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=852873.3333333334, ans=0.1 2023-10-02 10:49:19,225 WARNING [train.py:1204] (2/4) Exclude cut with ID _51382_210_23862_1_1534492801838_3650104_129-343914-0_sp1.1 from training. Duration: 0.936375 2023-10-02 10:49:19,259 WARNING [train.py:1204] (2/4) Exclude cut with ID _14970_210_15709_1_1530775547361_1966965_8-169924-0 from training. Duration: 0.92 2023-10-02 10:49:20,057 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.bypass.scale_min, batch_count=852873.3333333334, ans=0.2 2023-10-02 10:49:21,170 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_2295_210_2087_1_1525500993_2407863_234-83104-0_sp1.1 from training. Duration: 0.563625 2023-10-02 10:49:22,535 WARNING [train.py:1204] (2/4) Exclude cut with ID _14687_210_5854_1_1530703838259_3664889_115-239046-0 from training. Duration: 0.67 2023-10-02 10:49:26,591 WARNING [train.py:1204] (2/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_690-126306-0_sp1.1 from training. Duration: 0.7090625 2023-10-02 10:49:26,606 WARNING [train.py:1204] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_625-102955-0_sp0.9 from training. Duration: 0.8555625 2023-10-02 10:49:28,078 WARNING [train.py:1204] (2/4) Exclude cut with ID _9389_210_1664_1_1529217337301_5289269_289-213417-0 from training. Duration: 0.89 2023-10-02 10:49:29,376 INFO [train.py:1046] (2/4) Epoch 25, batch 450, loss[loss=0.1677, simple_loss=0.2559, pruned_loss=0.03972, over 24015.00 frames. ], tot_loss[loss=0.1694, simple_loss=0.2459, pruned_loss=0.04643, over 4217021.85 frames. ], batch size: 80, lr: 4.14e-03, grad_scale: 32.0 2023-10-02 10:49:30,819 WARNING [train.py:1204] (2/4) Exclude cut with ID _13253_210_1925_1_1530757775819_3577873_191-144977-0_sp0.9 from training. Duration: 0.8666875 2023-10-02 10:49:30,883 WARNING [train.py:1204] (2/4) Exclude cut with ID _21783_210_11357_1_1531742458730_3762679_325-155703-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 10:49:30,932 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_2295_210_2087_1_1525500993_2407863_116-238894-0_sp0.9 from training. Duration: 0.77775 2023-10-02 10:49:32,243 WARNING [train.py:1204] (2/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_94-126128-0 from training. Duration: 0.86 2023-10-02 10:49:33,666 WARNING [train.py:1204] (2/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_395-310220-0_sp0.9 from training. Duration: 0.988875 2023-10-02 10:49:35,064 WARNING [train.py:1204] (2/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_821-110852-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 10:49:36,404 WARNING [train.py:1204] (2/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_64-248359-0_sp1.1 from training. Duration: 0.62725 2023-10-02 10:49:36,425 WARNING [train.py:1204] (2/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_138-187031-0 from training. Duration: 0.99 2023-10-02 10:49:36,462 WARNING [train.py:1204] (2/4) Exclude cut with ID _4122_210_2084_1_1526713164417_4353280_528-206771-0_sp0.9 from training. Duration: 0.97775 2023-10-02 10:49:38,224 WARNING [train.py:1204] (2/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_844-96989-0_sp1.1 from training. Duration: 0.6454375 2023-10-02 10:49:39,714 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.conv_skip_rate, batch_count=852940.0, ans=0.0 2023-10-02 10:49:40,980 WARNING [train.py:1204] (2/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_106-30782-0_sp1.1 from training. Duration: 0.72725 2023-10-02 10:49:41,341 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.conv_module1.balancer2.min_abs, batch_count=852940.0, ans=0.5 2023-10-02 10:49:47,318 WARNING [train.py:1204] (2/4) Exclude cut with ID _14683_210_5219_1_1530698352608_3974859_281-250040-0_sp1.1 from training. Duration: 0.936375 2023-10-02 10:49:48,575 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_46013_210_7234_1_1533776362_3744032_463-52378-0_sp0.9 from training. Duration: 0.9555625 2023-10-02 10:49:50,586 WARNING [train.py:1204] (2/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_299-34413-0 from training. Duration: 0.65 2023-10-02 10:49:50,642 WARNING [train.py:1204] (2/4) Exclude cut with ID _35799_210_5854_1_1533263487887_3529071_490-326847-0 from training. Duration: 0.99 2023-10-02 10:49:54,533 WARNING [train.py:1204] (2/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_589-9070-0_sp0.9 from training. Duration: 0.788875 2023-10-02 10:49:57,140 WARNING [train.py:1204] (2/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_884-29515-0_sp1.1 from training. Duration: 0.936375 2023-10-02 10:49:59,871 WARNING [train.py:1204] (2/4) Exclude cut with ID _15012_210_1968_1_1530856820429_5095350_474-119995-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 10:50:02,687 WARNING [train.py:1204] (2/4) Exclude cut with ID _65585_210_12210_1_1535450451584_3764098_211-97124-0_sp1.1 from training. Duration: 0.963625 2023-10-02 10:50:04,010 WARNING [train.py:1204] (2/4) Exclude cut with ID _33640_210_2655_1_1533117605479_4855469_666-224463-0_sp1.1 from training. Duration: 0.963625 2023-10-02 10:50:06,862 WARNING [train.py:1204] (2/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_35-153886-0 from training. Duration: 0.84 2023-10-02 10:50:08,191 WARNING [train.py:1204] (2/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_210-106047-0 from training. Duration: 0.97 2023-10-02 10:50:10,062 WARNING [train.py:1204] (2/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_479-125611-0 from training. Duration: 0.96 2023-10-02 10:50:10,096 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_9417_210_1794_1_1529235261_4614131_427-153963-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 10:50:11,426 WARNING [train.py:1204] (2/4) Exclude cut with ID _23818_210_6228_1_1531917063884_3676920_133-170571-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 10:50:11,476 WARNING [train.py:1204] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_602-275260-0_sp1.1 from training. Duration: 0.7454375 2023-10-02 10:50:12,932 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9029W0121-509521 from training. Duration: 0.896 2023-10-02 10:50:14,608 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9029W0434-509821 from training. Duration: 0.64 2023-10-02 10:50:14,624 WARNING [train.py:1204] (2/4) Exclude cut with ID _23038_210_9570_1_1532001584158_3652419_289-307034-0_sp1.1 from training. Duration: 0.936375 2023-10-02 10:50:15,960 WARNING [train.py:1204] (2/4) Exclude cut with ID _29345_210_4381_1_1532689168263_3860169_149-257268-0_sp0.9 from training. Duration: 0.9 2023-10-02 10:50:16,031 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_405-339986-0_sp1.1 from training. Duration: 0.463625 2023-10-02 10:50:20,691 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_2655_210_2087_1_1525688959_3897400_437-337690-0_sp1.1 from training. Duration: 0.57275 2023-10-02 10:50:21,943 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7543_210_6753_1_1528541907_1399322_165-323619-0_sp1.1 from training. Duration: 0.62725 2023-10-02 10:50:22,005 WARNING [train.py:1204] (2/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_508-335369-0_sp1.1 from training. Duration: 0.436375 2023-10-02 10:50:23,794 WARNING [train.py:1204] (2/4) Exclude cut with ID _7852_210_5219_1_1528286471316_8139400_358-287492-0 from training. Duration: 0.96 2023-10-02 10:50:23,945 WARNING [train.py:1204] (2/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_322-217220-0_sp0.9 from training. Duration: 0.9555625 2023-10-02 10:50:25,348 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_3171_210_2087_1_1526109084_2330007_66-260583-0_sp0.9 from training. Duration: 0.8 2023-10-02 10:50:26,712 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8558_210_1410_1_1528505225_7613406_131-329524-0_sp1.1 from training. Duration: 0.7181875 2023-10-02 10:50:26,939 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.attention_skip_rate, batch_count=853206.6666666666, ans=0.0 2023-10-02 10:50:28,037 WARNING [train.py:1204] (2/4) Exclude cut with ID _9235_210_5219_1_1528891154253_8046120_30-326681-0 from training. Duration: 0.83 2023-10-02 10:50:31,878 WARNING [train.py:1204] (2/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_58-68956-0_sp0.9 from training. Duration: 0.92225 2023-10-02 10:50:31,937 WARNING [train.py:1204] (2/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_255-312654-0 from training. Duration: 0.91 2023-10-02 10:50:33,242 WARNING [train.py:1204] (2/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_99-167392-0 from training. Duration: 0.61 2023-10-02 10:50:33,327 WARNING [train.py:1204] (2/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_94-126128-0_sp0.9 from training. Duration: 0.9555625 2023-10-02 10:50:37,499 WARNING [train.py:1204] (2/4) Exclude cut with ID _68635_210_9317_1_1535770813410_4315870_187-192817-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 10:50:39,368 WARNING [train.py:1204] (2/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_277-204970-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 10:50:40,673 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6163_210_6400_1_1527506746_4263068_283-137811-0_sp0.9 from training. Duration: 0.8555625 2023-10-02 10:50:40,698 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9144W0121-566446 from training. Duration: 0.896 2023-10-02 10:50:41,949 INFO [train.py:1046] (2/4) Epoch 25, batch 500, loss[loss=0.1682, simple_loss=0.2442, pruned_loss=0.0461, over 24619.00 frames. ], tot_loss[loss=0.1703, simple_loss=0.2469, pruned_loss=0.04687, over 4327435.03 frames. ], batch size: 60, lr: 4.13e-03, grad_scale: 16.0 2023-10-02 10:50:45,292 WARNING [train.py:1204] (2/4) Exclude cut with ID _49371_210_12071_1_1533952761021_3812190_351-315519-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 10:50:45,567 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.feed_forward1.hidden_balancer.prob, batch_count=853273.3333333334, ans=0.125 2023-10-02 10:50:47,080 WARNING [train.py:1204] (2/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_115-326778-0_sp1.1 from training. Duration: 0.8454375 2023-10-02 10:50:47,126 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_17292_210_6753_1_1531125388_2943734_436-198965-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 10:50:47,142 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9165W0117-576751 from training. Duration: 0.896 2023-10-02 10:50:49,872 WARNING [train.py:1204] (2/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_212-149947-0 from training. Duration: 0.73 2023-10-02 10:50:49,889 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_13757_210_3528_1_1530440682_4107239_65-167544-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 10:50:51,364 WARNING [train.py:1204] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_700-183549-0_sp0.9 from training. Duration: 0.7555625 2023-10-02 10:50:58,673 WARNING [train.py:1204] (2/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_216-343007-0_sp0.9 from training. Duration: 0.7333125 2023-10-02 10:51:00,251 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7544_210_6753_1_1528587722_3576686_295-258479-0_sp1.1 from training. Duration: 0.763625 2023-10-02 10:51:01,736 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_365-287106-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 10:51:01,754 WARNING [train.py:1204] (2/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_300-3747-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 10:51:03,139 WARNING [train.py:1204] (2/4) Exclude cut with ID _14069_210_5632_1_1530512692033_4803390_487-303201-0_sp1.1 from training. Duration: 0.92725 2023-10-02 10:51:13,193 WARNING [train.py:1204] (2/4) Exclude cut with ID _14459_210_2228_1_1531443641946_7516700_668-123026-0_sp1.1 from training. Duration: 0.936375 2023-10-02 10:51:13,228 WARNING [train.py:1204] (2/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_108-53929-0_sp1.1 from training. Duration: 0.536375 2023-10-02 10:51:14,511 WARNING [train.py:1204] (2/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_266-137104-0_sp0.9 from training. Duration: 0.72225 2023-10-02 10:51:14,546 WARNING [train.py:1204] (2/4) Exclude cut with ID _12302_210_5219_1_1529924446466_8107280_902-168619-0_sp1.1 from training. Duration: 0.936375 2023-10-02 10:51:14,573 WARNING [train.py:1204] (2/4) Exclude cut with ID _59502_210_12210_1_1535107024698_1835139_146-162495-0 from training. Duration: 0.92 2023-10-02 10:51:14,595 WARNING [train.py:1204] (2/4) Exclude cut with ID _3465_210_3525_1_1526175667060_3574049_412-287526-0_sp1.1 from training. Duration: 0.6545625 2023-10-02 10:51:19,256 WARNING [train.py:1204] (2/4) Exclude cut with ID _7160_210_1876_1_1527940923231_3703799_120-150943-0_sp0.9 from training. Duration: 0.9 2023-10-02 10:51:19,316 WARNING [train.py:1204] (2/4) Exclude cut with ID _15037_210_2253_1_1530752294104_3267650_296-192597-0_sp1.1 from training. Duration: 0.736375 2023-10-02 10:51:19,337 WARNING [train.py:1204] (2/4) Exclude cut with ID _13252_210_1925_1_1530671372047_3336909_397-216497-0_sp1.1 from training. Duration: 0.87275 2023-10-02 10:51:20,411 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.608e+02 1.879e+02 2.028e+02 2.264e+02 3.460e+02, threshold=4.055e+02, percent-clipped=0.0 2023-10-02 10:51:20,479 WARNING [train.py:1204] (2/4) Exclude cut with ID _40094_210_16380_1_1533294005773_3641159_76-269320-0_sp1.1 from training. Duration: 0.936375 2023-10-02 10:51:20,543 WARNING [train.py:1204] (2/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_449-15508-0 from training. Duration: 0.82 2023-10-02 10:51:25,066 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9309W0194-640321 from training. Duration: 0.64 2023-10-02 10:51:27,057 WARNING [train.py:1204] (2/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_284-312178-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 10:51:28,252 WARNING [train.py:1204] (2/4) Exclude cut with ID _29853_210_7497_1_1532505600851_6642110_401-55937-0_sp1.1 from training. Duration: 0.92725 2023-10-02 10:51:29,601 WARNING [train.py:1204] (2/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_716-24918-0_sp1.1 from training. Duration: 0.92725 2023-10-02 10:51:29,618 WARNING [train.py:1204] (2/4) Exclude cut with ID _19637_210_4220_1_1531537167676_3205060_314-305638-0_sp1.1 from training. Duration: 0.92725 2023-10-02 10:51:30,972 WARNING [train.py:1204] (2/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_475-127455-0_sp1.1 from training. Duration: 0.636375 2023-10-02 10:51:32,367 WARNING [train.py:1204] (2/4) Exclude cut with ID _5444_210_2151_1_1527418808464_5312210_581-315411-0 from training. Duration: 0.62 2023-10-02 10:51:36,432 WARNING [train.py:1204] (2/4) Exclude cut with ID _44902_210_16187_1_1533888006062_3792078_149-67212-0_sp0.9 from training. Duration: 0.9666875 2023-10-02 10:51:36,510 WARNING [train.py:1204] (2/4) Exclude cut with ID _31602_210_5854_1_1532677021456_4458250_63-63253-0_sp1.1 from training. Duration: 0.963625 2023-10-02 10:51:39,418 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.bypass.skip_rate, batch_count=853473.3333333334, ans=0.04949747468305833 2023-10-02 10:51:40,655 WARNING [train.py:1204] (2/4) Exclude cut with ID _55209_210_1531_1_1534675828996_4597160_660-203992-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 10:51:43,484 WARNING [train.py:1204] (2/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_83-190624-0_sp1.1 from training. Duration: 0.92725 2023-10-02 10:51:48,163 WARNING [train.py:1204] (2/4) Exclude cut with ID _58605_210_12210_1_1534723195939_3904259_154-139635-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 10:51:53,260 WARNING [train.py:1204] (2/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_164-224052-0 from training. Duration: 0.7 2023-10-02 10:51:54,584 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_1126-189071-0_sp1.1 from training. Duration: 0.963625 2023-10-02 10:51:54,602 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_15396_210_6890_1_1531308473_1938936_310-214789-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 10:51:56,236 WARNING [train.py:1204] (2/4) Exclude cut with ID _8395_210_1531_1_1528597871523_4411579_431-132470-0 from training. Duration: 0.74 2023-10-02 10:51:57,515 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_3780_210_2087_1_1526467760_2130483_170-259044-0_sp0.9 from training. Duration: 0.77775 2023-10-02 10:51:57,628 WARNING [train.py:1204] (2/4) Exclude cut with ID _12733_210_8341_1_1530167424556_3646380_537-230640-0_sp1.1 from training. Duration: 0.963625 2023-10-02 10:51:58,889 INFO [train.py:1046] (2/4) Epoch 25, batch 550, loss[loss=0.1656, simple_loss=0.2508, pruned_loss=0.04026, over 24574.00 frames. ], tot_loss[loss=0.1709, simple_loss=0.2476, pruned_loss=0.04706, over 4425520.07 frames. ], batch size: 71, lr: 4.13e-03, grad_scale: 16.0 2023-10-02 10:52:03,411 WARNING [train.py:1204] (2/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_159-281310-0 from training. Duration: 0.95 2023-10-02 10:52:06,072 WARNING [train.py:1204] (2/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_434-83000-0 from training. Duration: 0.98 2023-10-02 10:52:06,090 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_4405_210_1849_1_1526732205_3593939_493-32464-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 10:52:07,536 WARNING [train.py:1204] (2/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_342-100157-0 from training. Duration: 0.54 2023-10-02 10:52:07,594 WARNING [train.py:1204] (2/4) Exclude cut with ID _44778_210_2054_1_1533798127203_7443090_765-250133-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 10:52:07,598 WARNING [train.py:1204] (2/4) Exclude cut with ID _38670_210_15379_1_1533448810659_4391059_81-312245-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 10:52:08,934 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_50050_210_16223_1_1535435796_7290138_1273-247611-0_sp1.1 from training. Duration: 0.936375 2023-10-02 10:52:08,999 WARNING [train.py:1204] (2/4) Exclude cut with ID _44831_210_13427_1_1533816011549_3689035_399-18591-0_sp1.1 from training. Duration: 0.936375 2023-10-02 10:52:09,024 WARNING [train.py:1204] (2/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_557-324258-0_sp0.9 from training. Duration: 0.9 2023-10-02 10:52:10,405 WARNING [train.py:1204] (2/4) Exclude cut with ID _3465_210_3525_1_1526175667060_3574049_62-335552-0_sp1.1 from training. Duration: 0.7909375 2023-10-02 10:52:10,626 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.conv_module1.balancer2.prob, batch_count=853606.6666666666, ans=0.125 2023-10-02 10:52:13,166 WARNING [train.py:1204] (2/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_673-264125-0_sp1.1 from training. Duration: 0.963625 2023-10-02 10:52:13,254 WARNING [train.py:1204] (2/4) Exclude cut with ID _8030_210_7497_1_1528444886781_3531089_262-86750-0 from training. Duration: 0.81 2023-10-02 10:52:13,281 WARNING [train.py:1204] (2/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_444-28041-0_sp1.1 from training. Duration: 0.77275 2023-10-02 10:52:19,201 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_34008_210_10431_1_1533345424_8183247_516-14858-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 10:52:19,214 WARNING [train.py:1204] (2/4) Exclude cut with ID _32704_210_8954_1_1533017488840_6747370_54-248218-0_sp1.1 from training. Duration: 0.936375 2023-10-02 10:52:20,765 WARNING [train.py:1204] (2/4) Exclude cut with ID _13253_210_1925_1_1530757775819_3577873_300-119782-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 10:52:22,653 WARNING [train.py:1204] (2/4) Exclude cut with ID _39290_210_4220_1_1533862778760_3075118_21-240703-0_sp1.1 from training. Duration: 0.936375 2023-10-02 10:52:25,933 WARNING [train.py:1204] (2/4) Exclude cut with ID 774-127930-0014-10412-0_sp1.1 from training. Duration: 0.95 2023-10-02 10:52:25,993 WARNING [train.py:1204] (2/4) Exclude cut with ID _67499_210_17282_1_1535509805220_3692430_20-179346-0 from training. Duration: 0.97 2023-10-02 10:52:26,151 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.1.balancer2.prob, batch_count=853673.3333333334, ans=0.125 2023-10-02 10:52:28,651 WARNING [train.py:1204] (2/4) Exclude cut with ID _8365_210_2064_1_1528592311070_4677690_431-294102-0_sp0.9 from training. Duration: 0.988875 2023-10-02 10:52:34,525 WARNING [train.py:1204] (2/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_365-1492-0_sp1.1 from training. Duration: 0.82725 2023-10-02 10:52:34,550 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_105-114797-0_sp0.9 from training. Duration: 0.9333125 2023-10-02 10:52:34,659 WARNING [train.py:1204] (2/4) Exclude cut with ID _6274_210_2084_1_1527937583837_7494020_769-168302-0_sp0.9 from training. Duration: 0.788875 2023-10-02 10:52:37,466 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_40340_210_9024_1_1533433916_4132944_320-270277-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 10:52:37,471 WARNING [train.py:1204] (2/4) Exclude cut with ID ID0203W0003-781711 from training. Duration: 0.958 2023-10-02 10:52:38,801 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_30434_210_13049_1_1532519384_981746_52-12932-0_sp1.1 from training. Duration: 0.936375 2023-10-02 10:52:40,138 WARNING [train.py:1204] (2/4) Exclude cut with ID _8263_210_1664_1_1528438953494_294100_40-204731-0_sp0.9 from training. Duration: 0.5555625 2023-10-02 10:52:42,748 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1032-236863-0_sp0.9 from training. Duration: 0.9333125 2023-10-02 10:52:42,803 WARNING [train.py:1204] (2/4) Exclude cut with ID _8394_210_6702_1_1528590504316_3540189_335-274780-0_sp0.9 from training. Duration: 0.8666875 2023-10-02 10:52:42,810 WARNING [train.py:1204] (2/4) Exclude cut with ID _66873_210_5517_1_1535454136506_4293370_654-52713-0_sp1.1 from training. Duration: 0.7 2023-10-02 10:52:44,168 WARNING [train.py:1204] (2/4) Exclude cut with ID _5250_210_5219_1_1527249561806_8692110_742-267856-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 10:52:45,544 WARNING [train.py:1204] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_74-48717-0 from training. Duration: 0.58 2023-10-02 10:52:47,400 WARNING [train.py:1204] (2/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_18-46756-0 from training. Duration: 0.79 2023-10-02 10:52:48,855 WARNING [train.py:1204] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_612-56061-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 10:52:48,872 WARNING [train.py:1204] (2/4) Exclude cut with ID _56423_210_5854_1_1534896061049_3165969_412-2403-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 10:52:50,177 WARNING [train.py:1204] (2/4) Exclude cut with ID _62574_210_2655_1_1535180407058_3933249_120-196848-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 10:52:50,177 WARNING [train.py:1204] (2/4) Exclude cut with ID _40519_210_13794_1_1533715228655_7337390_916-51000-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 10:52:53,510 WARNING [train.py:1204] (2/4) Exclude cut with ID _65263_210_22356_1_1535763598691_3921037_108-21902-0_sp1.1 from training. Duration: 0.9 2023-10-02 10:52:53,593 WARNING [train.py:1204] (2/4) Exclude cut with ID _4122_210_2084_1_1526713164417_4353280_528-206771-0_sp1.1 from training. Duration: 0.8 2023-10-02 10:52:57,032 WARNING [train.py:1204] (2/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_43-325657-0_sp1.1 from training. Duration: 0.92725 2023-10-02 10:52:57,717 WARNING [train.py:1204] (2/4) Exclude cut with ID _44659_210_2721_1_1534036120061_6614860_314-82041-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 10:52:59,002 WARNING [train.py:1204] (2/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_81-300212-0_sp0.9 from training. Duration: 0.6666875 2023-10-02 10:52:59,092 WARNING [train.py:1204] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_665-196155-0_sp0.9 from training. Duration: 0.7444375 2023-10-02 10:53:01,810 WARNING [train.py:1204] (2/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_140-336881-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 10:53:01,884 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6290_210_3273_1_1527595224_1282787_115-87095-0_sp0.9 from training. Duration: 0.611125 2023-10-02 10:53:03,162 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_53373_210_5399_1_1534229471_4715695_607-259685-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 10:53:04,540 WARNING [train.py:1204] (2/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_175-163894-0_sp1.1 from training. Duration: 0.67275 2023-10-02 10:53:04,556 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7002_210_1794_1_1527939683_5157286_1-281264-0_sp1.1 from training. Duration: 0.4 2023-10-02 10:53:09,972 WARNING [train.py:1204] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_605-257205-0 from training. Duration: 0.59 2023-10-02 10:53:11,521 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.feed_forward3.hidden_balancer.prob, batch_count=853940.0, ans=0.125 2023-10-02 10:53:12,551 INFO [train.py:1046] (2/4) Epoch 25, batch 600, loss[loss=0.1703, simple_loss=0.2508, pruned_loss=0.04487, over 23985.00 frames. ], tot_loss[loss=0.1718, simple_loss=0.2483, pruned_loss=0.04765, over 4476218.52 frames. ], batch size: 80, lr: 4.13e-03, grad_scale: 16.0 2023-10-02 10:53:12,909 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.conv_module2.balancer1.prob, batch_count=853940.0, ans=0.125 2023-10-02 10:53:13,999 WARNING [train.py:1204] (2/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_454-314901-0 from training. Duration: 0.46 2023-10-02 10:53:15,330 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_46032_210_14656_1_1533781412_4507969_287-130422-0_sp1.1 from training. Duration: 0.97275 2023-10-02 10:53:15,346 WARNING [train.py:1204] (2/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_186-309161-0_sp1.1 from training. Duration: 0.4909375 2023-10-02 10:53:15,387 WARNING [train.py:1204] (2/4) Exclude cut with ID _42659_210_2070_1_1533607442765_3120380_45-342307-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 10:53:21,553 WARNING [train.py:1204] (2/4) Exclude cut with ID _12555_210_8750_1_1530075594402_3780239_478-326818-0_sp1.1 from training. Duration: 0.963625 2023-10-02 10:53:23,388 WARNING [train.py:1204] (2/4) Exclude cut with ID _7606_210_4915_1_1528894253277_5080690_127-177761-0_sp1.1 from training. Duration: 0.5818125 2023-10-02 10:53:25,184 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6788_210_1388_1_1527898698_4562021_91-197981-0 from training. Duration: 0.83 2023-10-02 10:53:28,350 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8558_210_1410_1_1528505225_7613406_131-329524-0_sp0.9 from training. Duration: 0.87775 2023-10-02 10:53:29,093 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.2.encoder.layers.0.feed_forward3.out_whiten, num_groups=1, num_channels=384, metric=10.98 vs. limit=15.0 2023-10-02 10:53:29,851 WARNING [train.py:1204] (2/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_153-153537-0_sp1.1 from training. Duration: 0.82725 2023-10-02 10:53:31,208 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_17292_210_6753_1_1531125388_2943734_285-349105-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 10:53:33,952 WARNING [train.py:1204] (2/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_37-41919-0 from training. Duration: 0.87 2023-10-02 10:53:33,972 WARNING [train.py:1204] (2/4) Exclude cut with ID _60860_210_8254_1_1534896015875_3557169_172-294852-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 10:53:39,577 WARNING [train.py:1204] (2/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_146-276944-0 from training. Duration: 0.83 2023-10-02 10:53:42,249 WARNING [train.py:1204] (2/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_1254-44082-0_sp1.1 from training. Duration: 0.936375 2023-10-02 10:53:42,278 WARNING [train.py:1204] (2/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_1115-180350-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 10:53:42,299 WARNING [train.py:1204] (2/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_60-146189-0_sp1.1 from training. Duration: 0.8909375 2023-10-02 10:53:48,071 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.607e+02 1.854e+02 2.038e+02 2.245e+02 2.978e+02, threshold=4.076e+02, percent-clipped=0.0 2023-10-02 10:53:48,218 WARNING [train.py:1204] (2/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_517-153649-0_sp1.1 from training. Duration: 0.92725 2023-10-02 10:53:48,229 WARNING [train.py:1204] (2/4) Exclude cut with ID _39571_210_13449_1_1533801584143_3651077_37-266257-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 10:53:49,676 WARNING [train.py:1204] (2/4) Exclude cut with ID _6653_210_2629_1_1527919172441_6732679_527-276118-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 10:53:53,424 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.1.encoder.layers.1.conv_module1.whiten, num_groups=1, num_channels=256, metric=11.43 vs. limit=15.0 2023-10-02 10:53:56,303 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7696_210_2040_1_1528207167_3716099_117-125434-0_sp1.1 from training. Duration: 0.6909375 2023-10-02 10:53:58,087 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.feed_forward1.hidden_balancer.prob, batch_count=854140.0, ans=0.125 2023-10-02 10:53:59,197 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_25767_210_2161_1_1532307483_3867748_478-156360-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 10:53:59,207 WARNING [train.py:1204] (2/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_177-49092-0_sp1.1 from training. Duration: 0.82725 2023-10-02 10:53:59,213 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_11305_210_1794_1_1529750428_7178148_680-317073-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 10:54:06,019 WARNING [train.py:1204] (2/4) Exclude cut with ID _53565_210_10635_1_1534337218335_3820259_287-18120-0 from training. Duration: 0.97 2023-10-02 10:54:12,958 WARNING [train.py:1204] (2/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_498-40153-0_sp1.1 from training. Duration: 0.57275 2023-10-02 10:54:12,974 WARNING [train.py:1204] (2/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_555-63066-0_sp1.1 from training. Duration: 0.963625 2023-10-02 10:54:15,687 WARNING [train.py:1204] (2/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_395-310220-0 from training. Duration: 0.89 2023-10-02 10:54:15,765 WARNING [train.py:1204] (2/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_937-208281-0_sp1.1 from training. Duration: 0.77275 2023-10-02 10:54:17,835 WARNING [train.py:1204] (2/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_120-211630-0 from training. Duration: 0.92 2023-10-02 10:54:18,587 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.2.encoder.layers.2.whiten, num_groups=1, num_channels=384, metric=3.68 vs. limit=12.0 2023-10-02 10:54:19,177 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_454-276429-0_sp1.1 from training. Duration: 0.7909375 2023-10-02 10:54:19,233 WARNING [train.py:1204] (2/4) Exclude cut with ID _22826_210_9994_1_1531908201522_3279169_404-75889-0_sp1.1 from training. Duration: 0.8090625 2023-10-02 10:54:25,950 WARNING [train.py:1204] (2/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_282-136641-0_sp0.9 from training. Duration: 0.4666875 2023-10-02 10:54:27,165 INFO [train.py:1046] (2/4) Epoch 25, batch 650, loss[loss=0.1784, simple_loss=0.2484, pruned_loss=0.05416, over 23772.00 frames. ], tot_loss[loss=0.1708, simple_loss=0.2468, pruned_loss=0.04742, over 4516127.97 frames. ], batch size: 179, lr: 4.13e-03, grad_scale: 16.0 2023-10-02 10:54:27,270 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_809-17156-0_sp1.1 from training. Duration: 0.663625 2023-10-02 10:54:29,979 WARNING [train.py:1204] (2/4) Exclude cut with ID _9235_210_5219_1_1528891154253_8046120_1003-101638-0_sp0.9 from training. Duration: 0.811125 2023-10-02 10:54:31,374 WARNING [train.py:1204] (2/4) Exclude cut with ID _22826_210_9994_1_1531908201522_3279169_404-75889-0_sp0.9 from training. Duration: 0.988875 2023-10-02 10:54:32,730 WARNING [train.py:1204] (2/4) Exclude cut with ID _59288_210_5854_1_1535077757774_3412246_570-295255-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 10:54:35,455 WARNING [train.py:1204] (2/4) Exclude cut with ID _3465_210_3525_1_1526175667060_3574049_412-287526-0 from training. Duration: 0.72 2023-10-02 10:54:36,767 WARNING [train.py:1204] (2/4) Exclude cut with ID _44010_210_4525_1_1533884262964_4013620_649-79655-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 10:54:37,467 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.3.encoder.layers.3.self_attn1.whiten, num_groups=1, num_channels=512, metric=13.81 vs. limit=22.5 2023-10-02 10:54:42,221 WARNING [train.py:1204] (2/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_57-270037-0_sp0.9 from training. Duration: 0.8555625 2023-10-02 10:54:42,221 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_34815_210_6388_1_1533381320_3571903_302-255963-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 10:54:45,032 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_47449_210_5399_1_1534076445_4006929_573-301852-0_sp1.1 from training. Duration: 0.97275 2023-10-02 10:54:46,697 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.conv_skip_rate, batch_count=854340.0, ans=0.0 2023-10-02 10:54:47,899 WARNING [train.py:1204] (2/4) Exclude cut with ID 6709-74022-0004-86860-0_sp1.1 from training. Duration: 0.9409375 2023-10-02 10:54:51,118 WARNING [train.py:1204] (2/4) Exclude cut with ID _40519_210_13794_1_1533715228655_7337390_111-71991-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 10:54:51,155 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_30167_210_3528_1_1532428012_4772375_169-214186-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 10:54:55,197 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.conv_module1.balancer2.prob, batch_count=854406.6666666666, ans=0.125 2023-10-02 10:54:56,215 WARNING [train.py:1204] (2/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_20-235313-0_sp1.1 from training. Duration: 0.963625 2023-10-02 10:54:56,272 WARNING [train.py:1204] (2/4) Exclude cut with ID _4681_210_3525_1_1527251040321_1235713_123-24428-0_sp0.9 from training. Duration: 0.5333125 2023-10-02 10:54:59,087 WARNING [train.py:1204] (2/4) Exclude cut with ID _31602_210_5854_1_1532677021456_4458250_444-198752-0_sp1.1 from training. Duration: 0.97275 2023-10-02 10:54:59,144 WARNING [train.py:1204] (2/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_9-279442-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 10:55:00,482 WARNING [train.py:1204] (2/4) Exclude cut with ID _6275_210_2084_1_1528110062213_7669049_973-198085-0_sp0.9 from training. Duration: 0.8333125 2023-10-02 10:55:01,801 WARNING [train.py:1204] (2/4) Exclude cut with ID _29853_210_7497_1_1532505600851_6642110_567-250221-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 10:55:01,909 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6545_210_2040_1_1527774670_4245258_407-48070-0_sp1.1 from training. Duration: 0.6909375 2023-10-02 10:55:03,471 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.conv_module2.balancer2.prob, batch_count=854406.6666666666, ans=0.125 2023-10-02 10:55:04,659 WARNING [train.py:1204] (2/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_224-343295-0_sp0.9 from training. Duration: 0.611125 2023-10-02 10:55:04,678 WARNING [train.py:1204] (2/4) Exclude cut with ID IC0125W0011-61817 from training. Duration: 0.88 2023-10-02 10:55:05,904 WARNING [train.py:1204] (2/4) Exclude cut with ID _60113_210_16271_1_1535164212481_4386079_435-154371-0_sp1.1 from training. Duration: 0.97275 2023-10-02 10:55:05,917 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_25836_210_16554_1_1532138167_53197_3-9394-0_sp1.1 from training. Duration: 0.92725 2023-10-02 10:55:07,396 WARNING [train.py:1204] (2/4) Exclude cut with ID _29343_210_4381_1_1532602757752_3674109_57-55125-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 10:55:08,866 WARNING [train.py:1204] (2/4) Exclude cut with ID _56235_210_10640_1_1534557335235_4161099_267-23560-0_sp1.1 from training. Duration: 0.936375 2023-10-02 10:55:10,196 WARNING [train.py:1204] (2/4) Exclude cut with ID _30196_210_1664_1_1533708496522_7074140_1150-278867-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 10:55:10,256 WARNING [train.py:1204] (2/4) Exclude cut with ID _6270_210_2084_1_1527850911573_3856420_26-277343-0_sp0.9 from training. Duration: 0.911125 2023-10-02 10:55:11,642 WARNING [train.py:1204] (2/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_325-62802-0 from training. Duration: 0.87 2023-10-02 10:55:11,715 WARNING [train.py:1204] (2/4) Exclude cut with ID _56524_210_9642_1_1534572011779_3619170_145-310842-0_sp1.1 from training. Duration: 0.9 2023-10-02 10:55:11,744 WARNING [train.py:1204] (2/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_860-96198-0_sp0.9 from training. Duration: 0.811125 2023-10-02 10:55:13,264 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.feed_forward2.hidden_balancer.prob, batch_count=854473.3333333334, ans=0.125 2023-10-02 10:55:14,221 WARNING [train.py:1204] (2/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_17-25045-0_sp1.1 from training. Duration: 0.536375 2023-10-02 10:55:14,245 WARNING [train.py:1204] (2/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_349-69727-0_sp1.1 from training. Duration: 0.936375 2023-10-02 10:55:15,673 WARNING [train.py:1204] (2/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_580-93798-0_sp1.1 from training. Duration: 0.5090625 2023-10-02 10:55:15,783 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7762_210_1794_1_1528199508_4057408_419-125009-0 from training. Duration: 0.83 2023-10-02 10:55:19,032 WARNING [train.py:1204] (2/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_356-167816-0 from training. Duration: 0.72 2023-10-02 10:55:19,059 WARNING [train.py:1204] (2/4) Exclude cut with ID _29345_210_4381_1_1532689168263_3860169_225-97100-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 10:55:19,062 WARNING [train.py:1204] (2/4) Exclude cut with ID _14227_210_4381_1_1530701804286_3940259_374-194712-0_sp1.1 from training. Duration: 0.92725 2023-10-02 10:55:19,091 WARNING [train.py:1204] (2/4) Exclude cut with ID _40137_210_12210_1_1533290484046_3795180_249-187059-0_sp0.9 from training. Duration: 0.97775 2023-10-02 10:55:20,428 WARNING [train.py:1204] (2/4) Exclude cut with ID _22826_210_9994_1_1531908201522_3279169_389-317194-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 10:55:20,553 WARNING [train.py:1204] (2/4) Exclude cut with ID _27485_210_4916_1_1532739191677_3668269_239-122918-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 10:55:21,193 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.feed_forward3.out_whiten.whitening_limit, batch_count=854473.3333333334, ans=15.0 2023-10-02 10:55:27,351 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_2245_210_2715_1_1525511548_3540254_128-117609-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 10:55:27,381 WARNING [train.py:1204] (2/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_160-276178-0_sp1.1 from training. Duration: 0.963625 2023-10-02 10:55:30,089 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_31920_210_10633_1_1532860009_1686094_1-279735-0_sp1.1 from training. Duration: 0.97275 2023-10-02 10:55:32,758 WARNING [train.py:1204] (2/4) Exclude cut with ID _51078_210_9070_1_1534161557424_3805250_325-262383-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 10:55:32,778 WARNING [train.py:1204] (2/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_643-52007-0_sp0.9 from training. Duration: 0.6333125 2023-10-02 10:55:34,214 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_51937_210_5014_1_1534573610_4022025_518-299248-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 10:55:38,450 WARNING [train.py:1204] (2/4) Exclude cut with ID _13252_210_1925_1_1530671372047_3336909_166-314607-0_sp1.1 from training. Duration: 0.4909375 2023-10-02 10:55:38,454 WARNING [train.py:1204] (2/4) Exclude cut with ID _44778_210_2054_1_1533798127203_7443090_1087-282939-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 10:55:39,744 INFO [train.py:1046] (2/4) Epoch 25, batch 700, loss[loss=0.1415, simple_loss=0.2215, pruned_loss=0.03074, over 24397.00 frames. ], tot_loss[loss=0.1701, simple_loss=0.2463, pruned_loss=0.04691, over 4564468.31 frames. ], batch size: 58, lr: 4.13e-03, grad_scale: 16.0 2023-10-02 10:55:39,801 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_23850_210_4238_1_1532232597_1632433_32-185135-0_sp1.1 from training. Duration: 0.7818125 2023-10-02 10:55:39,831 WARNING [train.py:1204] (2/4) Exclude cut with ID _59500_210_8254_1_1534809610680_3736009_145-89359-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 10:55:45,052 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_340-336408-0 from training. Duration: 0.93 2023-10-02 10:55:46,483 WARNING [train.py:1204] (2/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_9-21737-0 from training. Duration: 0.97 2023-10-02 10:55:47,901 WARNING [train.py:1204] (2/4) Exclude cut with ID _2220_210_1819_1_1525509109898_3658680_50-99837-0 from training. Duration: 0.54 2023-10-02 10:55:47,957 WARNING [train.py:1204] (2/4) Exclude cut with ID _30304_210_6319_1_1532476827261_7582060_879-299350-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 10:55:49,963 WARNING [train.py:1204] (2/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_905-160074-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 10:55:52,842 WARNING [train.py:1204] (2/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_395-73570-0 from training. Duration: 0.99 2023-10-02 10:55:54,727 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder_embed.convnext.out_balancer.prob, batch_count=854673.3333333334, ans=0.125 2023-10-02 10:55:57,141 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7264_210_4938_1_1527939911_8132095_800-91995-0_sp1.1 from training. Duration: 0.7818125 2023-10-02 10:55:57,344 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.feed_forward3.hidden_balancer.prob, batch_count=854673.3333333334, ans=0.125 2023-10-02 10:56:01,085 WARNING [train.py:1204] (2/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_77-311404-0_sp1.1 from training. Duration: 0.8545625 2023-10-02 10:56:01,190 WARNING [train.py:1204] (2/4) Exclude cut with ID _13679_210_7497_1_1530534293662_3355098_107-80772-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 10:56:02,501 WARNING [train.py:1204] (2/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_224-343295-0_sp1.1 from training. Duration: 0.5 2023-10-02 10:56:02,749 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.conv_skip_rate, batch_count=854673.3333333334, ans=0.0 2023-10-02 10:56:03,786 WARNING [train.py:1204] (2/4) Exclude cut with ID _54871_210_22356_1_1534665798253_3856359_156-253482-0_sp1.1 from training. Duration: 0.963625 2023-10-02 10:56:05,220 WARNING [train.py:1204] (2/4) Exclude cut with ID _49728_210_12651_1_1534041034294_6788150_309-125532-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 10:56:07,945 WARNING [train.py:1204] (2/4) Exclude cut with ID _13913_210_9994_1_1530579615187_3383897_473-277682-0_sp0.9 from training. Duration: 0.4333125 2023-10-02 10:56:07,949 WARNING [train.py:1204] (2/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_3-276701-0_sp1.1 from training. Duration: 0.82725 2023-10-02 10:56:09,349 WARNING [train.py:1204] (2/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_282-136641-0 from training. Duration: 0.42 2023-10-02 10:56:12,016 WARNING [train.py:1204] (2/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_127-566-0 from training. Duration: 0.84 2023-10-02 10:56:14,703 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.603e+02 1.821e+02 2.032e+02 2.235e+02 2.949e+02, threshold=4.064e+02, percent-clipped=0.0 2023-10-02 10:56:16,243 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6163_210_6400_1_1527506746_4263068_283-137811-0_sp1.1 from training. Duration: 0.7 2023-10-02 10:56:16,279 WARNING [train.py:1204] (2/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_34-183685-0_sp1.1 from training. Duration: 0.8909375 2023-10-02 10:56:19,440 WARNING [train.py:1204] (2/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_154-135696-0_sp0.9 from training. Duration: 0.888875 2023-10-02 10:56:22,329 WARNING [train.py:1204] (2/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_676-51014-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 10:56:23,670 WARNING [train.py:1204] (2/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_38-169920-0 from training. Duration: 0.91 2023-10-02 10:56:27,300 WARNING [train.py:1204] (2/4) Exclude cut with ID _6653_210_2629_1_1527919172441_6732679_747-114465-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 10:56:29,265 WARNING [train.py:1204] (2/4) Exclude cut with ID _8021_210_7994_1_1528376451358_3653069_100-140231-0_sp1.1 from training. Duration: 0.6090625 2023-10-02 10:56:29,282 WARNING [train.py:1204] (2/4) Exclude cut with ID _64135_210_2253_1_1535677267876_7608972_133-188266-0 from training. Duration: 0.81 2023-10-02 10:56:32,159 WARNING [train.py:1204] (2/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_714-310538-0_sp1.1 from training. Duration: 0.7818125 2023-10-02 10:56:33,547 WARNING [train.py:1204] (2/4) Exclude cut with ID _39867_210_9826_1_1533553222030_9344259_1134-288498-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 10:56:35,023 WARNING [train.py:1204] (2/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_211-237289-0_sp1.1 from training. Duration: 0.936375 2023-10-02 10:56:40,578 WARNING [train.py:1204] (2/4) Exclude cut with ID _59202_210_26984_1_1534816810612_3549174_137-318710-0_sp0.9 from training. Duration: 0.988875 2023-10-02 10:56:40,610 WARNING [train.py:1204] (2/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_86-66192-0 from training. Duration: 0.55 2023-10-02 10:56:43,404 WARNING [train.py:1204] (2/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_540-275685-0 from training. Duration: 0.89 2023-10-02 10:56:44,690 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8776_210_2040_1_1528638837_4088036_244-278763-0 from training. Duration: 0.9 2023-10-02 10:56:46,141 WARNING [train.py:1204] (2/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_9-219972-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 10:56:46,932 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.2.encoder.layers.0.self_attn_weights.whiten_keys, num_groups=4, num_channels=128, metric=4.02 vs. limit=6.0 2023-10-02 10:56:48,052 WARNING [train.py:1204] (2/4) Exclude cut with ID _44659_210_2721_1_1534036120061_6614860_656-283823-0_sp1.1 from training. Duration: 0.92725 2023-10-02 10:56:48,134 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_69630_210_7234_1_1535789601_661049_77-216385-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 10:56:50,638 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_19952_210_2161_1_1531875469_3851750_210-308919-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 10:56:50,643 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_4737_210_2040_1_1526998689_3293008_378-178650-0 from training. Duration: 0.95 2023-10-02 10:56:53,752 INFO [train.py:1046] (2/4) Epoch 25, batch 750, loss[loss=0.1643, simple_loss=0.2237, pruned_loss=0.05246, over 19417.00 frames. ], tot_loss[loss=0.1691, simple_loss=0.2451, pruned_loss=0.04659, over 4586107.05 frames. ], batch size: 388, lr: 4.13e-03, grad_scale: 16.0 2023-10-02 10:56:55,126 WARNING [train.py:1204] (2/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_224-343295-0 from training. Duration: 0.55 2023-10-02 10:56:55,142 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7002_210_1794_1_1527939683_5157286_184-229630-0 from training. Duration: 0.74 2023-10-02 10:56:56,881 WARNING [train.py:1204] (2/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_6-214815-0 from training. Duration: 0.85 2023-10-02 10:56:57,137 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.bypass.skip_rate, batch_count=854940.0, ans=0.09899494936611666 2023-10-02 10:56:57,146 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.attention_skip_rate, batch_count=854940.0, ans=0.0 2023-10-02 10:56:58,200 WARNING [train.py:1204] (2/4) Exclude cut with ID _8699_210_2069_1_1528630346831_3960409_160-24408-0 from training. Duration: 0.91 2023-10-02 10:56:58,236 WARNING [train.py:1204] (2/4) Exclude cut with ID _5585_210_2667_1_1527418813324_8007340_89-332072-0 from training. Duration: 0.64 2023-10-02 10:56:59,579 WARNING [train.py:1204] (2/4) Exclude cut with ID _5250_210_5219_1_1527249561806_8692110_528-259800-0_sp1.1 from training. Duration: 0.963625 2023-10-02 10:56:59,660 WARNING [train.py:1204] (2/4) Exclude cut with ID _55383_210_10635_1_1534468426172_2231669_134-128246-0 from training. Duration: 0.89 2023-10-02 10:57:01,021 WARNING [train.py:1204] (2/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_400-338274-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 10:57:01,080 WARNING [train.py:1204] (2/4) Exclude cut with ID _14224_210_4381_1_1530529062037_4264010_364-22817-0_sp0.9 from training. Duration: 0.788875 2023-10-02 10:57:03,808 WARNING [train.py:1204] (2/4) Exclude cut with ID _42863_210_9070_1_1533902398788_3696920_331-291057-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 10:57:05,133 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_5436_210_1794_1_1527331317_2541494_229-211016-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 10:57:06,479 WARNING [train.py:1204] (2/4) Exclude cut with ID _6456_210_1534_1_1527681634212_3181049_345-252877-0_sp0.9 from training. Duration: 0.711125 2023-10-02 10:57:06,501 WARNING [train.py:1204] (2/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_96-81499-0_sp1.1 from training. Duration: 0.92725 2023-10-02 10:57:09,287 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_5575_210_1410_1_1527301305_2570170_342-265572-0_sp1.1 from training. Duration: 0.8545625 2023-10-02 10:57:09,363 WARNING [train.py:1204] (2/4) Exclude cut with ID _5444_210_2151_1_1527418808464_5312210_726-27165-0_sp0.9 from training. Duration: 0.7444375 2023-10-02 10:57:10,774 WARNING [train.py:1204] (2/4) Exclude cut with ID _22083_210_8254_1_1531702862439_3678970_304-327750-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 10:57:11,005 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.conv_skip_rate, batch_count=855006.6666666666, ans=0.0 2023-10-02 10:57:13,582 WARNING [train.py:1204] (2/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_245-226199-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 10:57:13,612 WARNING [train.py:1204] (2/4) Exclude cut with ID _42875_210_14856_1_1533704397487_4012433_102-199499-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 10:57:13,676 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_51225_210_3601_1_1534400634_2327383_55-301844-0 from training. Duration: 0.93 2023-10-02 10:57:14,970 WARNING [train.py:1204] (2/4) Exclude cut with ID _28317_210_12742_1_1532480302381_8510325_916-195848-0_sp1.1 from training. Duration: 0.836375 2023-10-02 10:57:16,428 WARNING [train.py:1204] (2/4) Exclude cut with ID _44718_210_5816_1_1534050051869_6469030_497-311999-0_sp1.1 from training. Duration: 0.936375 2023-10-02 10:57:17,867 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_34886_210_10633_1_1533203882_1755661_91-194765-0_sp1.1 from training. Duration: 0.936375 2023-10-02 10:57:18,528 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.3.encoder.layers.1.self_attn2.whiten, num_groups=1, num_channels=512, metric=14.44 vs. limit=22.5 2023-10-02 10:57:20,971 WARNING [train.py:1204] (2/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_310-26186-0_sp0.9 from training. Duration: 0.77775 2023-10-02 10:57:22,893 WARNING [train.py:1204] (2/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_416-257730-0 from training. Duration: 0.65 2023-10-02 10:57:22,907 WARNING [train.py:1204] (2/4) Exclude cut with ID _9389_210_1664_1_1529217337301_5289269_209-154760-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 10:57:24,691 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.feed_forward3.hidden_balancer.prob, batch_count=855073.3333333334, ans=0.125 2023-10-02 10:57:25,891 WARNING [train.py:1204] (2/4) Exclude cut with ID _4092_210_2151_1_1526643044449_3135759_227-213972-0 from training. Duration: 0.54 2023-10-02 10:57:25,924 WARNING [train.py:1204] (2/4) Exclude cut with ID IC0679W0044-335852 from training. Duration: 0.768 2023-10-02 10:57:27,264 WARNING [train.py:1204] (2/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_42-186162-0 from training. Duration: 0.92 2023-10-02 10:57:27,277 WARNING [train.py:1204] (2/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_302-2550-0_sp0.9 from training. Duration: 0.9 2023-10-02 10:57:27,291 WARNING [train.py:1204] (2/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_356-31216-0_sp0.9 from training. Duration: 0.8333125 2023-10-02 10:57:29,298 WARNING [train.py:1204] (2/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_125-322172-0_sp1.1 from training. Duration: 0.8454375 2023-10-02 10:57:35,549 WARNING [train.py:1204] (2/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_141-89727-0_sp0.9 from training. Duration: 0.788875 2023-10-02 10:57:35,568 WARNING [train.py:1204] (2/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_647-337784-0_sp1.1 from training. Duration: 0.97275 2023-10-02 10:57:35,577 WARNING [train.py:1204] (2/4) Exclude cut with ID _6493_210_1747_1_1528459210424_4012470_513-140967-0_sp1.1 from training. Duration: 0.7181875 2023-10-02 10:57:38,238 WARNING [train.py:1204] (2/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_87-266334-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 10:57:38,350 WARNING [train.py:1204] (2/4) Exclude cut with ID _26528_210_9070_1_1532570410842_3851310_526-261338-0_sp1.1 from training. Duration: 0.92725 2023-10-02 10:57:39,637 WARNING [train.py:1204] (2/4) Exclude cut with ID _14224_210_4381_1_1530529062037_4264010_380-343670-0 from training. Duration: 0.88 2023-10-02 10:57:39,705 WARNING [train.py:1204] (2/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_449-15508-0_sp1.1 from training. Duration: 0.7454375 2023-10-02 10:57:42,262 WARNING [train.py:1204] (2/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_372-6869-0_sp0.9 from training. Duration: 0.488875 2023-10-02 10:57:43,640 WARNING [train.py:1204] (2/4) Exclude cut with ID _26907_210_7234_1_1532221740694_3205263_337-2494-0_sp0.9 from training. Duration: 0.97775 2023-10-02 10:57:45,011 WARNING [train.py:1204] (2/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_10-100367-0_sp1.1 from training. Duration: 0.8909375 2023-10-02 10:57:45,039 WARNING [train.py:1204] (2/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_47-167373-0 from training. Duration: 0.92 2023-10-02 10:57:46,321 WARNING [train.py:1204] (2/4) Exclude cut with ID _28323_210_16187_1_1532480395667_4100300_628-282179-0_sp1.1 from training. Duration: 0.97275 2023-10-02 10:57:46,621 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.bypass_mid.scale_min, batch_count=855140.0, ans=0.2 2023-10-02 10:57:52,140 WARNING [train.py:1204] (2/4) Exclude cut with ID _20455_210_5219_1_1531565996775_8098100_213-280898-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 10:57:53,878 WARNING [train.py:1204] (2/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_383-316000-0_sp1.1 from training. Duration: 0.8181875 2023-10-02 10:57:53,907 WARNING [train.py:1204] (2/4) Exclude cut with ID _18513_210_9206_1_1531618215223_3862620_16-78180-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 10:57:55,429 WARNING [train.py:1204] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_330-327861-0_sp1.1 from training. Duration: 0.6090625 2023-10-02 10:58:00,124 WARNING [train.py:1204] (2/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_356-31216-0 from training. Duration: 0.75 2023-10-02 10:58:00,151 WARNING [train.py:1204] (2/4) Exclude cut with ID _25248_210_8750_1_1532062825941_5554180_255-148539-0_sp1.1 from training. Duration: 0.963625 2023-10-02 10:58:00,199 WARNING [train.py:1204] (2/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_269-154726-0_sp1.1 from training. Duration: 0.7818125 2023-10-02 10:58:03,562 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7002_210_1794_1_1527939683_5157286_163-87552-0_sp1.1 from training. Duration: 0.7818125 2023-10-02 10:58:04,892 WARNING [train.py:1204] (2/4) Exclude cut with ID _54871_210_22356_1_1534665798253_3856359_97-151896-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 10:58:06,424 WARNING [train.py:1204] (2/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_207-7026-0_sp1.1 from training. Duration: 0.97275 2023-10-02 10:58:07,676 INFO [train.py:1046] (2/4) Epoch 25, batch 800, loss[loss=0.217, simple_loss=0.2726, pruned_loss=0.08067, over 19473.00 frames. ], tot_loss[loss=0.17, simple_loss=0.2463, pruned_loss=0.04681, over 4613870.34 frames. ], batch size: 388, lr: 4.13e-03, grad_scale: 32.0 2023-10-02 10:58:07,729 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_92-46009-0_sp0.9 from training. Duration: 0.6 2023-10-02 10:58:09,269 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.bypass_mid.scale_min, batch_count=855273.3333333334, ans=0.2 2023-10-02 10:58:09,367 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.feed_forward3.hidden_balancer.prob, batch_count=855273.3333333334, ans=0.125 2023-10-02 10:58:14,505 WARNING [train.py:1204] (2/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_122-178847-0_sp1.1 from training. Duration: 0.97275 2023-10-02 10:58:14,510 WARNING [train.py:1204] (2/4) Exclude cut with ID _44778_210_2054_1_1533798127203_7443090_94-348956-0_sp1.1 from training. Duration: 0.92725 2023-10-02 10:58:17,152 WARNING [train.py:1204] (2/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_1018-244089-0_sp1.1 from training. Duration: 0.963625 2023-10-02 10:58:17,171 WARNING [train.py:1204] (2/4) Exclude cut with ID _56235_210_10640_1_1534557335235_4161099_87-118423-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 10:58:18,550 WARNING [train.py:1204] (2/4) Exclude cut with ID _53713_210_24116_1_1534314487762_7416751_504-212292-0_sp1.1 from training. Duration: 0.92725 2023-10-02 10:58:18,571 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_446-150327-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 10:58:18,692 WARNING [train.py:1204] (2/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_682-245989-0_sp1.1 from training. Duration: 0.92725 2023-10-02 10:58:23,300 WARNING [train.py:1204] (2/4) Exclude cut with ID _35723_210_1794_1_1533088341043_628250_8-234524-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 10:58:25,133 WARNING [train.py:1204] (2/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_64-248359-0_sp0.9 from training. Duration: 0.7666875 2023-10-02 10:58:26,664 WARNING [train.py:1204] (2/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_49-97158-0 from training. Duration: 0.76 2023-10-02 10:58:28,551 WARNING [train.py:1204] (2/4) Exclude cut with ID _24314_210_4915_1_1532135078116_7170179_743-915-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 10:58:29,898 WARNING [train.py:1204] (2/4) Exclude cut with ID _61194_210_18628_1_1535007555628_3750909_353-323468-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 10:58:29,919 WARNING [train.py:1204] (2/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_387-131538-0_sp1.1 from training. Duration: 0.736375 2023-10-02 10:58:29,938 WARNING [train.py:1204] (2/4) Exclude cut with ID _44424_210_18163_1_1533623394848_1213110_79-187603-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 10:58:29,983 WARNING [train.py:1204] (2/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_86-19601-0 from training. Duration: 0.85 2023-10-02 10:58:31,279 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_50050_210_16223_1_1535435796_7290138_103-283529-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 10:58:31,327 WARNING [train.py:1204] (2/4) Exclude cut with ID _13913_210_9994_1_1530579615187_3383897_473-277682-0 from training. Duration: 0.39 2023-10-02 10:58:34,651 WARNING [train.py:1204] (2/4) Exclude cut with ID _42326_210_7770_1_1533814282531_3707269_368-68029-0_sp1.1 from training. Duration: 0.92725 2023-10-02 10:58:36,026 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_41428_210_2161_1_1533774184_3992532_446-339645-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 10:58:37,532 WARNING [train.py:1204] (2/4) Exclude cut with ID _69584_210_5219_1_1535785133775_4044450_325-7656-0_sp1.1 from training. Duration: 0.7818125 2023-10-02 10:58:37,540 WARNING [train.py:1204] (2/4) Exclude cut with ID _21632_210_2253_1_1531994001765_3744140_134-184387-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 10:58:40,340 WARNING [train.py:1204] (2/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_550-184707-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 10:58:41,564 WARNING [train.py:1204] (2/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_106-341780-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 10:58:44,117 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.429e+02 1.817e+02 2.016e+02 2.246e+02 3.441e+02, threshold=4.032e+02, percent-clipped=0.0 2023-10-02 10:58:44,230 WARNING [train.py:1204] (2/4) Exclude cut with ID _45871_210_5665_1_1533864614245_3296060_253-132552-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 10:58:44,278 WARNING [train.py:1204] (2/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_395-310220-0_sp1.1 from training. Duration: 0.8090625 2023-10-02 10:58:44,300 WARNING [train.py:1204] (2/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_795-33252-0_sp0.9 from training. Duration: 0.411125 2023-10-02 10:58:47,024 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9002W0003-495962 from training. Duration: 0.946 2023-10-02 10:58:47,052 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_43-71989-0 from training. Duration: 0.57 2023-10-02 10:58:47,064 WARNING [train.py:1204] (2/4) Exclude cut with ID _8699_210_2069_1_1528630346831_3960409_155-39766-0_sp1.1 from training. Duration: 0.7090625 2023-10-02 10:58:47,082 WARNING [train.py:1204] (2/4) Exclude cut with ID _27894_210_12392_1_1532415496078_6902439_788-232684-0_sp1.1 from training. Duration: 0.97275 2023-10-02 10:58:47,259 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.ff2_skip_rate, batch_count=855406.6666666666, ans=0.0 2023-10-02 10:58:49,746 WARNING [train.py:1204] (2/4) Exclude cut with ID _56494_210_4220_1_1535284837625_3033169_10-39296-0_sp1.1 from training. Duration: 0.92725 2023-10-02 10:58:49,769 WARNING [train.py:1204] (2/4) Exclude cut with ID _40901_210_5665_1_1533363789958_3856130_341-20819-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 10:58:50,038 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.nonlin_attention.balancer.prob, batch_count=855473.3333333334, ans=0.125 2023-10-02 10:58:50,095 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.balancer2.prob, batch_count=855473.3333333334, ans=0.125 2023-10-02 10:58:53,266 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9029W0435-509822 from training. Duration: 0.896 2023-10-02 10:58:53,306 WARNING [train.py:1204] (2/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_51-117576-0 from training. Duration: 0.4 2023-10-02 10:58:54,676 WARNING [train.py:1204] (2/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_197-28164-0_sp1.1 from training. Duration: 0.72725 2023-10-02 10:58:56,708 WARNING [train.py:1204] (2/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_145-137790-0_sp1.1 from training. Duration: 0.7545625 2023-10-02 10:59:00,703 WARNING [train.py:1204] (2/4) Exclude cut with ID _47790_210_16187_1_1533862814066_4207519_2-39653-0_sp1.1 from training. Duration: 0.8909375 2023-10-02 10:59:05,509 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.ff3_skip_rate, batch_count=855540.0, ans=0.0 2023-10-02 10:59:06,637 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_3780_210_2087_1_1526467760_2130483_186-208240-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 10:59:06,715 WARNING [train.py:1204] (2/4) Exclude cut with ID _24055_210_12651_1_1532310907143_976979_92-59356-0 from training. Duration: 0.91 2023-10-02 10:59:06,763 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_3910_210_1794_1_1526742306_334530_28-238909-0_sp1.1 from training. Duration: 0.736375 2023-10-02 10:59:10,739 WARNING [train.py:1204] (2/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_269-154726-0 from training. Duration: 0.86 2023-10-02 10:59:12,328 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.conv_module2.balancer2.prob, batch_count=855540.0, ans=0.125 2023-10-02 10:59:12,768 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.3.encoder.layers.1.feed_forward2.out_whiten, num_groups=1, num_channels=512, metric=6.29 vs. limit=15.0 2023-10-02 10:59:14,145 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.4.encoder.layers.2.whiten, num_groups=1, num_channels=384, metric=2.77 vs. limit=12.0 2023-10-02 10:59:17,733 WARNING [train.py:1204] (2/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_326-165900-0_sp0.9 from training. Duration: 0.7666875 2023-10-02 10:59:20,394 INFO [train.py:1046] (2/4) Epoch 25, batch 850, loss[loss=0.1746, simple_loss=0.2559, pruned_loss=0.04669, over 24024.00 frames. ], tot_loss[loss=0.17, simple_loss=0.2469, pruned_loss=0.04657, over 4652226.57 frames. ], batch size: 86, lr: 4.13e-03, grad_scale: 16.0 2023-10-02 10:59:21,814 WARNING [train.py:1204] (2/4) Exclude cut with ID _5294_210_1747_1_1527249643928_3696179_470-276077-0_sp1.1 from training. Duration: 0.963625 2023-10-02 10:59:21,853 WARNING [train.py:1204] (2/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_372-131163-0 from training. Duration: 0.94 2023-10-02 10:59:23,122 WARNING [train.py:1204] (2/4) Exclude cut with ID _45297_210_2721_1_1533715351381_3264949_351-147136-0_sp1.1 from training. Duration: 0.7909375 2023-10-02 10:59:23,198 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_1072-12671-0_sp1.1 from training. Duration: 0.92725 2023-10-02 10:59:24,692 WARNING [train.py:1204] (2/4) Exclude cut with ID _15133_210_6228_1_1530794512074_3080379_54-198193-0 from training. Duration: 0.94 2023-10-02 10:59:24,722 WARNING [train.py:1204] (2/4) Exclude cut with ID _30196_210_1664_1_1533708496522_7074140_1194-270988-0_sp1.1 from training. Duration: 0.97275 2023-10-02 10:59:26,677 WARNING [train.py:1204] (2/4) Exclude cut with ID _50310_210_15488_1_1534068043345_3641080_289-193637-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 10:59:28,033 WARNING [train.py:1204] (2/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_313-263495-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 10:59:29,414 WARNING [train.py:1204] (2/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_18-130336-0_sp1.1 from training. Duration: 0.7181875 2023-10-02 10:59:31,406 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_47896_210_20508_1_1534492518_5958868_646-287209-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 10:59:32,822 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_294-163793-0 from training. Duration: 0.82 2023-10-02 10:59:32,852 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_8-319957-0 from training. Duration: 0.96 2023-10-02 10:59:32,857 WARNING [train.py:1204] (2/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_1211-226066-0 from training. Duration: 0.76 2023-10-02 10:59:34,718 WARNING [train.py:1204] (2/4) Exclude cut with ID _4779_210_2084_1_1527318022944_3914893_500-285013-0_sp0.9 from training. Duration: 0.7666875 2023-10-02 10:59:34,735 WARNING [train.py:1204] (2/4) Exclude cut with ID _22826_210_9994_1_1531908201522_3279169_347-232268-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 10:59:35,036 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.bypass.scale_min, batch_count=855673.3333333334, ans=0.2 2023-10-02 10:59:36,211 WARNING [train.py:1204] (2/4) Exclude cut with ID _29877_210_6228_1_1532397791106_5276149_111-55056-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 10:59:36,241 WARNING [train.py:1204] (2/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_812-271646-0_sp1.1 from training. Duration: 0.92725 2023-10-02 10:59:36,267 WARNING [train.py:1204] (2/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_235-262124-0_sp1.1 from training. Duration: 0.6090625 2023-10-02 10:59:40,383 WARNING [train.py:1204] (2/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_14-16530-0_sp1.1 from training. Duration: 0.97275 2023-10-02 10:59:40,411 WARNING [train.py:1204] (2/4) Exclude cut with ID _44718_210_5816_1_1534050051869_6469030_568-304512-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 10:59:41,690 WARNING [train.py:1204] (2/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_1033-274376-0 from training. Duration: 0.87 2023-10-02 10:59:44,305 WARNING [train.py:1204] (2/4) Exclude cut with ID _4681_210_3525_1_1527251040321_1235713_123-24428-0 from training. Duration: 0.48 2023-10-02 10:59:48,339 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_37818_210_4238_1_1533620910_4720083_251-288764-0_sp1.1 from training. Duration: 0.97275 2023-10-02 10:59:48,423 WARNING [train.py:1204] (2/4) Exclude cut with ID _65585_210_12210_1_1535450451584_3764098_112-294301-0 from training. Duration: 0.88 2023-10-02 10:59:48,689 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.attention_skip_rate, batch_count=855740.0, ans=0.0 2023-10-02 10:59:49,971 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.conv_skip_rate, batch_count=855740.0, ans=0.0 2023-10-02 10:59:51,168 WARNING [train.py:1204] (2/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_329-113173-0 from training. Duration: 0.62 2023-10-02 10:59:52,584 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6767_210_3273_1_1527855783_1718932_173-256601-0 from training. Duration: 0.8 2023-10-02 10:59:54,678 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9266W0001-627677 from training. Duration: 0.896 2023-10-02 10:59:54,688 WARNING [train.py:1204] (2/4) Exclude cut with ID _49643_210_7989_1_1534640244224_3939031_611-185515-0_sp1.1 from training. Duration: 0.8545625 2023-10-02 10:59:54,691 WARNING [train.py:1204] (2/4) Exclude cut with ID _31649_210_6228_1_1532584608063_4091078_687-237771-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 10:59:54,700 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_12868_210_6753_1_1530702479_3610564_148-172861-0_sp0.9 from training. Duration: 0.5555625 2023-10-02 10:59:57,256 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_5129_210_6400_1_1527161005_3579054_107-69738-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 10:59:59,240 WARNING [train.py:1204] (2/4) Exclude cut with ID _40137_210_12210_1_1533290484046_3795180_36-301164-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 10:59:59,268 WARNING [train.py:1204] (2/4) Exclude cut with ID _7537_210_2084_1_1528502395431_3924359_113-199933-0 from training. Duration: 0.59 2023-10-02 11:00:01,166 WARNING [train.py:1204] (2/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_547-5424-0_sp1.1 from training. Duration: 0.8545625 2023-10-02 11:00:01,224 WARNING [train.py:1204] (2/4) Exclude cut with ID _43552_210_8913_1_1533726031059_3523429_147-284024-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 11:00:02,498 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_294-163793-0_sp1.1 from training. Duration: 0.7454375 2023-10-02 11:00:02,523 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_3780_210_2087_1_1526467760_2130483_170-259044-0_sp1.1 from training. Duration: 0.636375 2023-10-02 11:00:05,425 WARNING [train.py:1204] (2/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_726-270780-0_sp1.1 from training. Duration: 0.7818125 2023-10-02 11:00:06,777 WARNING [train.py:1204] (2/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_214-291369-0_sp1.1 from training. Duration: 0.57275 2023-10-02 11:00:08,107 WARNING [train.py:1204] (2/4) Exclude cut with ID _21881_210_3525_1_1531825242757_3798380_247-102591-0 from training. Duration: 0.98 2023-10-02 11:00:12,299 WARNING [train.py:1204] (2/4) Exclude cut with ID _8064_210_5399_1_1528372299291_4282973_18-174647-0_sp1.1 from training. Duration: 0.82725 2023-10-02 11:00:12,310 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_415-52611-0_sp1.1 from training. Duration: 0.936375 2023-10-02 11:00:13,671 WARNING [train.py:1204] (2/4) Exclude cut with ID _8394_210_6702_1_1528590504316_3540189_379-49457-0_sp0.9 from training. Duration: 0.9666875 2023-10-02 11:00:13,685 WARNING [train.py:1204] (2/4) Exclude cut with ID _64521_210_14699_1_1535243427259_3815328_368-195106-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 11:00:13,948 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.feed_forward2.hidden_balancer.prob, batch_count=855806.6666666666, ans=0.125 2023-10-02 11:00:15,044 WARNING [train.py:1204] (2/4) Exclude cut with ID _28394_210_8913_1_1532430049518_3650118_51-334124-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 11:00:17,774 WARNING [train.py:1204] (2/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_317-161769-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 11:00:19,131 WARNING [train.py:1204] (2/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_40-129756-0_sp0.9 from training. Duration: 0.9 2023-10-02 11:00:19,448 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.conv_module2.balancer1.prob, batch_count=855873.3333333334, ans=0.125 2023-10-02 11:00:20,450 WARNING [train.py:1204] (2/4) Exclude cut with ID _3067_210_2667_1_1526212974640_4503168_119-175776-0_sp0.9 from training. Duration: 0.82225 2023-10-02 11:00:20,503 WARNING [train.py:1204] (2/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_1004-304915-0_sp1.1 from training. Duration: 0.92725 2023-10-02 11:00:20,557 WARNING [train.py:1204] (2/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_359-118926-0_sp0.9 from training. Duration: 0.8 2023-10-02 11:00:26,622 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.self_attn_weights.pos_emb_skip_rate, batch_count=855873.3333333334, ans=0.0 2023-10-02 11:00:28,527 WARNING [train.py:1204] (2/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_51-200220-0_sp0.9 from training. Duration: 0.62225 2023-10-02 11:00:28,599 WARNING [train.py:1204] (2/4) Exclude cut with ID _46683_210_9642_1_1533730913492_4591116_530-326202-0_sp1.1 from training. Duration: 0.936375 2023-10-02 11:00:30,396 WARNING [train.py:1204] (2/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_567-135114-0 from training. Duration: 0.87 2023-10-02 11:00:30,426 WARNING [train.py:1204] (2/4) Exclude cut with ID _66863_210_18053_1_1535698758334_8590319_1092-271784-0_sp1.1 from training. Duration: 0.87275 2023-10-02 11:00:32,237 WARNING [train.py:1204] (2/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_762-282614-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 11:00:33,656 WARNING [train.py:1204] (2/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_280-120942-0 from training. Duration: 0.77 2023-10-02 11:00:34,956 INFO [train.py:1046] (2/4) Epoch 25, batch 900, loss[loss=0.1665, simple_loss=0.2626, pruned_loss=0.03522, over 24305.00 frames. ], tot_loss[loss=0.1712, simple_loss=0.2481, pruned_loss=0.04714, over 4666795.26 frames. ], batch size: 74, lr: 4.13e-03, grad_scale: 16.0 2023-10-02 11:00:40,442 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_31583_210_3852_1_1532595851_4024413_27-170931-0_sp1.1 from training. Duration: 0.963625 2023-10-02 11:00:41,906 WARNING [train.py:1204] (2/4) Exclude cut with ID _12369_210_4381_1_1530356078581_4388520_266-271628-0_sp1.1 from training. Duration: 0.92725 2023-10-02 11:00:42,361 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.5.encoder.layers.1.feed_forward3.out_whiten, num_groups=1, num_channels=256, metric=11.54 vs. limit=15.0 2023-10-02 11:00:43,203 WARNING [train.py:1204] (2/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_41-31157-0 from training. Duration: 0.57 2023-10-02 11:00:44,636 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.0.conv_module1.balancer1.max_abs, batch_count=855940.0, ans=10.0 2023-10-02 11:00:46,018 WARNING [train.py:1204] (2/4) Exclude cut with ID _21878_210_3525_1_1531738772002_7264330_113-18163-0_sp1.1 from training. Duration: 0.8090625 2023-10-02 11:00:46,048 WARNING [train.py:1204] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_147-111137-0 from training. Duration: 0.95 2023-10-02 11:00:47,624 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1083-220122-0_sp1.1 from training. Duration: 0.463625 2023-10-02 11:00:47,713 WARNING [train.py:1204] (2/4) Exclude cut with ID _14884_210_2252_1_1530707459689_5561250_591-173406-0_sp1.1 from training. Duration: 0.87275 2023-10-02 11:00:47,719 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_24677_210_1794_1_1532002693_4341296_149-303793-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 11:00:48,972 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_12868_210_6753_1_1530702479_3610564_133-564-0_sp1.1 from training. Duration: 0.7181875 2023-10-02 11:00:48,994 WARNING [train.py:1204] (2/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_625-338546-0_sp0.9 from training. Duration: 0.97775 2023-10-02 11:00:57,874 WARNING [train.py:1204] (2/4) Exclude cut with ID _25882_210_3036_1_1532775665815_6479510_624-320169-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 11:00:57,881 WARNING [train.py:1204] (2/4) Exclude cut with ID _20622_210_3852_1_1531622427080_1093839_127-325326-0_sp1.1 from training. Duration: 0.92725 2023-10-02 11:00:57,910 WARNING [train.py:1204] (2/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_464-33931-0_sp1.1 from training. Duration: 0.4909375 2023-10-02 11:01:01,620 WARNING [train.py:1204] (2/4) Exclude cut with ID _66112_210_10635_1_1535590236305_3801484_120-304588-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 11:01:05,714 WARNING [train.py:1204] (2/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_110-130961-0 from training. Duration: 0.47 2023-10-02 11:01:07,113 WARNING [train.py:1204] (2/4) Exclude cut with ID _16908_210_5724_1_1531119157248_3772700_64-54020-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 11:01:10,092 WARNING [train.py:1204] (2/4) Exclude cut with ID _4068_210_2629_1_1526697350395_1546259_224-288693-0_sp1.1 from training. Duration: 0.536375 2023-10-02 11:01:11,294 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.621e+02 1.867e+02 2.045e+02 2.308e+02 5.112e+02, threshold=4.090e+02, percent-clipped=1.0 2023-10-02 11:01:11,416 WARNING [train.py:1204] (2/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_191-41023-0_sp0.9 from training. Duration: 0.788875 2023-10-02 11:01:11,482 WARNING [train.py:1204] (2/4) Exclude cut with ID ID0205W0255-783002 from training. Duration: 0.958 2023-10-02 11:01:12,756 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_162-262717-0 from training. Duration: 0.83 2023-10-02 11:01:18,272 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_224-322394-0_sp1.1 from training. Duration: 0.6 2023-10-02 11:01:18,278 WARNING [train.py:1204] (2/4) Exclude cut with ID _4866_210_2577_1_1527404412284_4364659_133-1696-0_sp1.1 from training. Duration: 0.9 2023-10-02 11:01:18,332 WARNING [train.py:1204] (2/4) Exclude cut with ID _6275_210_2084_1_1528110062213_7669049_973-198085-0_sp1.1 from training. Duration: 0.6818125 2023-10-02 11:01:18,586 INFO [scaling.py:1118] (2/4) WithLoss: name=encoder.encoders.3.encoder.layers.1.self_attn_weights, loss-sum=0.000e+00 2023-10-02 11:01:25,513 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_47449_210_5399_1_1534076445_4006929_410-237806-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 11:01:25,530 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7543_210_6753_1_1528543318_4552_1-85072-0_sp0.9 from training. Duration: 0.92225 2023-10-02 11:01:27,051 WARNING [train.py:1204] (2/4) Exclude cut with ID _65263_210_22356_1_1535763598691_3921037_108-21902-0 from training. Duration: 0.99 2023-10-02 11:01:27,062 WARNING [train.py:1204] (2/4) Exclude cut with ID _65585_210_12210_1_1535450451584_3764098_309-174818-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 11:01:29,063 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_309-65177-0 from training. Duration: 0.88 2023-10-02 11:01:33,505 WARNING [train.py:1204] (2/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_363-185703-0_sp1.1 from training. Duration: 0.863625 2023-10-02 11:01:33,539 WARNING [train.py:1204] (2/4) Exclude cut with ID _42326_210_7770_1_1533814282531_3707269_442-238692-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 11:01:34,941 WARNING [train.py:1204] (2/4) Exclude cut with ID _69884_210_9771_1_1535846392818_4166169_226-254446-0_sp1.1 from training. Duration: 0.936375 2023-10-02 11:01:34,958 WARNING [train.py:1204] (2/4) Exclude cut with ID _29853_210_7497_1_1532505600851_6642110_287-299903-0_sp1.1 from training. Duration: 0.963625 2023-10-02 11:01:39,118 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1015-343884-0 from training. Duration: 0.62 2023-10-02 11:01:39,149 WARNING [train.py:1204] (2/4) Exclude cut with ID ID0305W0008-833402 from training. Duration: 0.91 2023-10-02 11:01:40,650 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6673_210_3273_1_1527767790_3173240_351-167954-0_sp1.1 from training. Duration: 0.436375 2023-10-02 11:01:40,662 WARNING [train.py:1204] (2/4) Exclude cut with ID _6456_210_1534_1_1527681634212_3181049_345-252877-0 from training. Duration: 0.64 2023-10-02 11:01:40,958 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=856206.6666666666, ans=0.1 2023-10-02 11:01:42,165 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_13-205182-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 11:01:44,922 WARNING [train.py:1204] (2/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_427-208901-0 from training. Duration: 0.68 2023-10-02 11:01:47,440 INFO [train.py:1046] (2/4) Epoch 25, batch 950, loss[loss=0.2065, simple_loss=0.2602, pruned_loss=0.07637, over 19448.00 frames. ], tot_loss[loss=0.172, simple_loss=0.2484, pruned_loss=0.04779, over 4649753.86 frames. ], batch size: 388, lr: 4.13e-03, grad_scale: 16.0 2023-10-02 11:01:48,881 WARNING [train.py:1204] (2/4) Exclude cut with ID _49039_210_6315_1_1533967537541_3846118_9-148921-0_sp1.1 from training. Duration: 0.97275 2023-10-02 11:01:52,197 WARNING [train.py:1204] (2/4) Exclude cut with ID _59493_210_17282_1_1535078419077_3225879_143-70317-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 11:01:53,466 WARNING [train.py:1204] (2/4) Exclude cut with ID _27967_210_12140_1_1532588391859_8419130_1056-236369-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 11:01:53,518 WARNING [train.py:1204] (2/4) Exclude cut with ID _6537_210_1883_1_1527987559710_3597129_382-133062-0_sp0.9 from training. Duration: 0.7555625 2023-10-02 11:01:53,783 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.conv_module1.balancer1.prob, batch_count=856273.3333333334, ans=0.125 2023-10-02 11:01:56,256 WARNING [train.py:1204] (2/4) Exclude cut with ID ID0376W0255-869957 from training. Duration: 0.9510625 2023-10-02 11:02:01,700 WARNING [train.py:1204] (2/4) Exclude cut with ID _49371_210_12071_1_1533952761021_3812190_483-240691-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 11:02:01,964 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.nonlin_attention.balancer.prob, batch_count=856340.0, ans=0.125 2023-10-02 11:02:03,028 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_12868_210_6753_1_1530701932_530275_18-76322-0_sp1.1 from training. Duration: 0.7909375 2023-10-02 11:02:03,091 WARNING [train.py:1204] (2/4) Exclude cut with ID _53950_210_5816_1_1534417264919_3862951_367-26344-0_sp1.1 from training. Duration: 0.97275 2023-10-02 11:02:04,351 WARNING [train.py:1204] (2/4) Exclude cut with ID _5444_210_2151_1_1527418808464_5312210_634-194174-0_sp1.1 from training. Duration: 0.8818125 2023-10-02 11:02:04,368 WARNING [train.py:1204] (2/4) Exclude cut with ID _56494_210_4220_1_1535284837625_3033169_69-138150-0 from training. Duration: 0.94 2023-10-02 11:02:05,794 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7543_210_6753_1_1528544010_1110402_25-183860-0_sp0.9 from training. Duration: 0.52225 2023-10-02 11:02:07,169 WARNING [train.py:1204] (2/4) Exclude cut with ID _40284_210_2084_1_1533513679814_3704279_287-191918-0_sp1.1 from training. Duration: 0.963625 2023-10-02 11:02:08,713 WARNING [train.py:1204] (2/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_291-270881-0 from training. Duration: 0.97 2023-10-02 11:02:08,756 WARNING [train.py:1204] (2/4) Exclude cut with ID _12555_210_8750_1_1530075594402_3780239_449-267986-0_sp0.9 from training. Duration: 0.92225 2023-10-02 11:02:14,122 WARNING [train.py:1204] (2/4) Exclude cut with ID _3992_210_1747_1_1526644816031_3731060_268-64520-0_sp1.1 from training. Duration: 0.963625 2023-10-02 11:02:14,134 WARNING [train.py:1204] (2/4) Exclude cut with ID _39411_210_5986_1_1533538825030_3343649_173-238248-0_sp0.9 from training. Duration: 0.92225 2023-10-02 11:02:15,471 WARNING [train.py:1204] (2/4) Exclude cut with ID _32704_210_8954_1_1533017488840_6747370_828-313097-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 11:02:16,841 WARNING [train.py:1204] (2/4) Exclude cut with ID _26907_210_7234_1_1532221740694_3205263_337-2494-0 from training. Duration: 0.88 2023-10-02 11:02:18,287 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_229-45300-0_sp1.1 from training. Duration: 0.5181875 2023-10-02 11:02:19,703 WARNING [train.py:1204] (2/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_300-27235-0_sp1.1 from training. Duration: 0.7909375 2023-10-02 11:02:21,061 WARNING [train.py:1204] (2/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_48-338976-0_sp1.1 from training. Duration: 0.8090625 2023-10-02 11:02:23,954 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.conv_skip_rate, batch_count=856406.6666666666, ans=0.0 2023-10-02 11:02:25,926 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.ff2_skip_rate, batch_count=856406.6666666666, ans=0.0 2023-10-02 11:02:26,815 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_13091_210_7925_1_1530609447_8987354_792-195353-0_sp1.1 from training. Duration: 0.936375 2023-10-02 11:02:26,821 WARNING [train.py:1204] (2/4) Exclude cut with ID _56524_210_9642_1_1534572011779_3619170_311-54500-0_sp1.1 from training. Duration: 0.97275 2023-10-02 11:02:29,259 INFO [scaling.py:1022] (2/4) Whitening: name=encoder_embed.convnext.out_whiten, num_groups=1, num_channels=128, metric=4.59 vs. limit=5.0 2023-10-02 11:02:30,979 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7543_210_6753_1_1528544010_1110402_25-183860-0 from training. Duration: 0.47 2023-10-02 11:02:31,129 WARNING [train.py:1204] (2/4) Exclude cut with ID 2411-132532-0017-82279-0_sp1.1 from training. Duration: 0.9681875 2023-10-02 11:02:31,129 WARNING [train.py:1204] (2/4) Exclude cut with ID _14170_210_5219_1_1530525949868_4469650_272-249985-0_sp0.9 from training. Duration: 0.7444375 2023-10-02 11:02:31,186 WARNING [train.py:1204] (2/4) Exclude cut with ID _55644_210_11856_1_1534593599582_4103719_320-229006-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 11:02:33,019 WARNING [train.py:1204] (2/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_177-100554-0_sp1.1 from training. Duration: 0.963625 2023-10-02 11:02:33,020 WARNING [train.py:1204] (2/4) Exclude cut with ID _14998_210_5219_1_1530705582080_7474970_787-294416-0_sp1.1 from training. Duration: 0.7454375 2023-10-02 11:02:37,687 WARNING [train.py:1204] (2/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_318-59097-0 from training. Duration: 0.76 2023-10-02 11:02:37,803 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.0.conv_module2.balancer1.prob, batch_count=856473.3333333334, ans=0.125 2023-10-02 11:02:38,972 WARNING [train.py:1204] (2/4) Exclude cut with ID _12369_210_4381_1_1530356078581_4388520_480-13146-0_sp1.1 from training. Duration: 0.77275 2023-10-02 11:02:42,878 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_469-338558-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 11:02:42,927 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_40896_210_15710_1_1533433956_7100802_367-126897-0_sp1.1 from training. Duration: 0.963625 2023-10-02 11:02:42,950 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6545_210_2040_1_1527774670_4245258_379-14182-0 from training. Duration: 0.8 2023-10-02 11:02:44,286 WARNING [train.py:1204] (2/4) Exclude cut with ID _21631_210_2253_1_1531907880778_3160339_197-79498-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 11:02:44,290 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_12868_210_6753_1_1530702479_3610564_416-293746-0_sp1.1 from training. Duration: 0.8181875 2023-10-02 11:02:44,341 WARNING [train.py:1204] (2/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_52-211683-0 from training. Duration: 0.9 2023-10-02 11:02:47,268 WARNING [train.py:1204] (2/4) Exclude cut with ID _48678_210_5663_1_1533988830947_3769199_306-255886-0_sp0.9 from training. Duration: 0.8555625 2023-10-02 11:02:47,475 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.feed_forward1.hidden_balancer.prob, batch_count=856540.0, ans=0.125 2023-10-02 11:02:50,048 WARNING [train.py:1204] (2/4) Exclude cut with ID _65134_210_14763_1_1535335393985_3505368_296-24733-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 11:02:54,226 WARNING [train.py:1204] (2/4) Exclude cut with ID _14898_210_9206_1_1530766812377_4177190_335-99364-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 11:02:55,559 WARNING [train.py:1204] (2/4) Exclude cut with ID _54871_210_22356_1_1534665798253_3856359_19-342632-0 from training. Duration: 0.91 2023-10-02 11:02:55,575 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_53071_210_16554_1_1534490405_2954066_265-170472-0 from training. Duration: 0.83 2023-10-02 11:03:00,015 INFO [train.py:1046] (2/4) Epoch 25, batch 1000, loss[loss=0.1694, simple_loss=0.2336, pruned_loss=0.05261, over 23659.00 frames. ], tot_loss[loss=0.1709, simple_loss=0.2476, pruned_loss=0.04708, over 4671248.63 frames. ], batch size: 256, lr: 4.13e-03, grad_scale: 16.0 2023-10-02 11:03:01,421 WARNING [train.py:1204] (2/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_189-298172-0_sp1.1 from training. Duration: 0.963625 2023-10-02 11:03:06,128 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7615_210_4211_1_1528110965_4580160_72-235211-0 from training. Duration: 0.76 2023-10-02 11:03:06,174 WARNING [train.py:1204] (2/4) Exclude cut with ID _47790_210_16187_1_1533862814066_4207519_155-90874-0_sp1.1 from training. Duration: 0.936375 2023-10-02 11:03:10,595 WARNING [train.py:1204] (2/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_164-216310-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 11:03:10,696 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_46013_210_7234_1_1533776362_3744032_463-52378-0 from training. Duration: 0.86 2023-10-02 11:03:10,699 WARNING [train.py:1204] (2/4) Exclude cut with ID _61133_210_9969_1_1535196777227_3696409_333-270325-0 from training. Duration: 0.93 2023-10-02 11:03:13,693 WARNING [train.py:1204] (2/4) Exclude cut with ID _5172_210_1883_1_1527246062567_3577219_18-101380-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 11:03:13,701 WARNING [train.py:1204] (2/4) Exclude cut with ID _30800_210_22382_1_1532499347904_4515959_577-236858-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 11:03:16,386 WARNING [train.py:1204] (2/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_1006-177345-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 11:03:17,963 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6291_210_3273_1_1527683649_1078498_4-266348-0 from training. Duration: 0.78 2023-10-02 11:03:22,005 WARNING [train.py:1204] (2/4) Exclude cut with ID _55857_210_2346_1_1534644007831_6898209_745-98391-0 from training. Duration: 0.76 2023-10-02 11:03:23,368 WARNING [train.py:1204] (2/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_375-244976-0 from training. Duration: 0.75 2023-10-02 11:03:24,701 WARNING [train.py:1204] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_4-248404-0_sp1.1 from training. Duration: 0.97275 2023-10-02 11:03:25,424 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8453_210_4211_1_1528502940_8562730_596-306820-0 from training. Duration: 0.97 2023-10-02 11:03:25,832 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.5.encoder.layers.1.conv_module1.whiten, num_groups=1, num_channels=256, metric=2.99 vs. limit=15.0 2023-10-02 11:03:26,882 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.nonlin_attention.balancer.prob, batch_count=856673.3333333334, ans=0.125 2023-10-02 11:03:27,986 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_211-339622-0_sp0.9 from training. Duration: 0.5 2023-10-02 11:03:28,014 WARNING [train.py:1204] (2/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_393-57888-0 from training. Duration: 0.64 2023-10-02 11:03:29,507 WARNING [train.py:1204] (2/4) Exclude cut with ID _23825_210_13449_1_1531915313526_3497150_143-343575-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 11:03:29,576 WARNING [train.py:1204] (2/4) Exclude cut with ID _58605_210_12210_1_1534723195939_3904259_36-12256-0_sp1.1 from training. Duration: 0.936375 2023-10-02 11:03:36,797 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.3.encoder.layers.2.nonlin_attention.whiten1, num_groups=1, num_channels=384, metric=4.36 vs. limit=10.0 2023-10-02 11:03:37,776 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.528e+02 1.816e+02 2.002e+02 2.184e+02 3.024e+02, threshold=4.004e+02, percent-clipped=0.0 2023-10-02 11:03:37,938 WARNING [train.py:1204] (2/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_38-67795-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 11:03:39,253 WARNING [train.py:1204] (2/4) Exclude cut with ID _64631_210_27767_1_1535349580068_4165160_308-327832-0_sp1.1 from training. Duration: 0.92725 2023-10-02 11:03:39,310 WARNING [train.py:1204] (2/4) Exclude cut with ID _37086_210_9038_1_1532999540891_7021250_726-81176-0_sp1.1 from training. Duration: 0.936375 2023-10-02 11:03:40,609 WARNING [train.py:1204] (2/4) Exclude cut with ID _47790_210_16187_1_1533862814066_4207519_47-141521-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 11:03:40,613 WARNING [train.py:1204] (2/4) Exclude cut with ID _3067_210_2667_1_1526212974640_4503168_250-285937-0 from training. Duration: 0.8 2023-10-02 11:03:40,648 WARNING [train.py:1204] (2/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_239-24000-0_sp1.1 from training. Duration: 0.97275 2023-10-02 11:03:41,383 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.3.encoder.layers.1.nonlin_attention.whiten1, num_groups=1, num_channels=384, metric=6.07 vs. limit=10.0 2023-10-02 11:03:42,052 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_3371_210_1794_1_1526384462_5580777_396-339812-0_sp0.9 from training. Duration: 0.9444375 2023-10-02 11:03:42,101 WARNING [train.py:1204] (2/4) Exclude cut with ID _16909_210_5724_1_1531292079912_4118919_580-173454-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 11:03:43,442 WARNING [train.py:1204] (2/4) Exclude cut with ID IC0124W0002-61308 from training. Duration: 0.982 2023-10-02 11:03:46,180 WARNING [train.py:1204] (2/4) Exclude cut with ID _31602_210_5854_1_1532677021456_4458250_249-96604-0 from training. Duration: 0.89 2023-10-02 11:03:46,263 WARNING [train.py:1204] (2/4) Exclude cut with ID _69584_210_5219_1_1535785133775_4044450_325-7656-0 from training. Duration: 0.86 2023-10-02 11:03:47,721 WARNING [train.py:1204] (2/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_478-162247-0 from training. Duration: 0.82 2023-10-02 11:03:47,902 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.feed_forward3.hidden_balancer.prob, batch_count=856806.6666666666, ans=0.125 2023-10-02 11:03:50,399 WARNING [train.py:1204] (2/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_686-54383-0_sp1.1 from training. Duration: 0.7909375 2023-10-02 11:03:54,666 WARNING [train.py:1204] (2/4) Exclude cut with ID _29303_210_2575_1_1532392237303_7410649_85-270092-0_sp1.1 from training. Duration: 0.936375 2023-10-02 11:03:54,683 WARNING [train.py:1204] (2/4) Exclude cut with ID _2322_210_2667_1_1525604548971_3656299_440-80494-0_sp1.1 from training. Duration: 0.8 2023-10-02 11:03:56,552 WARNING [train.py:1204] (2/4) Exclude cut with ID _47346_210_9476_1_1533815971553_3536160_201-40644-0_sp1.1 from training. Duration: 0.936375 2023-10-02 11:03:56,746 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=856806.6666666666, ans=0.1 2023-10-02 11:03:57,975 WARNING [train.py:1204] (2/4) Exclude cut with ID _56235_210_10640_1_1534557335235_4161099_343-139898-0_sp1.1 from training. Duration: 0.87275 2023-10-02 11:03:58,165 INFO [scaling.py:1118] (2/4) WithLoss: name=encoder.encoders.1.encoder.layers.0.self_attn_weights, loss-sum=0.000e+00 2023-10-02 11:04:00,599 WARNING [train.py:1204] (2/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_1012-279473-0 from training. Duration: 0.67 2023-10-02 11:04:02,555 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7931_210_1794_1_1528594728_5564059_280-279560-0_sp1.1 from training. Duration: 0.8909375 2023-10-02 11:04:02,609 WARNING [train.py:1204] (2/4) Exclude cut with ID _8263_210_1664_1_1528439452891_970220_68-181805-0 from training. Duration: 0.64 2023-10-02 11:04:02,757 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.1.conv_skip_rate, batch_count=856873.3333333334, ans=0.0 2023-10-02 11:04:03,942 WARNING [train.py:1204] (2/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_605-133395-0 from training. Duration: 0.66 2023-10-02 11:04:05,342 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_43-171109-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 11:04:05,345 WARNING [train.py:1204] (2/4) Exclude cut with ID _40094_210_16380_1_1533294005773_3641159_107-280739-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 11:04:06,737 WARNING [train.py:1204] (2/4) Exclude cut with ID _14821_210_8750_1_1530680503991_6829359_2-227151-0_sp0.9 from training. Duration: 0.92225 2023-10-02 11:04:10,092 WARNING [train.py:1204] (2/4) Exclude cut with ID _14268_210_9206_1_1530622771285_4714099_331-105351-0_sp1.1 from training. Duration: 0.5909375 2023-10-02 11:04:10,267 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.attention_skip_rate, batch_count=856873.3333333334, ans=0.0 2023-10-02 11:04:11,423 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_26326_210_7925_1_1533002493_3592406_451-82741-0_sp1.1 from training. Duration: 0.97275 2023-10-02 11:04:14,016 INFO [train.py:1046] (2/4) Epoch 25, batch 1050, loss[loss=0.1739, simple_loss=0.2464, pruned_loss=0.05073, over 23650.00 frames. ], tot_loss[loss=0.1694, simple_loss=0.2463, pruned_loss=0.04628, over 4692196.47 frames. ], batch size: 120, lr: 4.13e-03, grad_scale: 16.0 2023-10-02 11:04:15,642 WARNING [train.py:1204] (2/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_191-59784-0_sp0.9 from training. Duration: 0.97775 2023-10-02 11:04:17,038 WARNING [train.py:1204] (2/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_445-117398-0_sp1.1 from training. Duration: 0.8090625 2023-10-02 11:04:19,659 WARNING [train.py:1204] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_605-257205-0_sp0.9 from training. Duration: 0.6555625 2023-10-02 11:04:19,721 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_50050_210_16223_1_1535435796_7290138_592-258841-0_sp1.1 from training. Duration: 0.936375 2023-10-02 11:04:21,219 WARNING [train.py:1204] (2/4) Exclude cut with ID _14348_210_8341_1_1530599434996_3778640_240-232370-0_sp0.9 from training. Duration: 0.9333125 2023-10-02 11:04:22,790 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.out_combiner.scale_min, batch_count=856940.0, ans=0.2 2023-10-02 11:04:23,912 WARNING [train.py:1204] (2/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_239-316802-0_sp0.9 from training. Duration: 0.7666875 2023-10-02 11:04:25,296 WARNING [train.py:1204] (2/4) Exclude cut with ID _45560_210_21982_1_1533812603951_3584999_111-6376-0_sp1.1 from training. Duration: 0.763625 2023-10-02 11:04:28,522 WARNING [train.py:1204] (2/4) Exclude cut with ID _54189_210_16380_1_1534332670039_3911710_606-311481-0_sp1.1 from training. Duration: 0.963625 2023-10-02 11:04:28,590 WARNING [train.py:1204] (2/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_237-332027-0_sp1.1 from training. Duration: 0.863625 2023-10-02 11:04:28,614 WARNING [train.py:1204] (2/4) Exclude cut with ID _66070_210_7640_1_1535441393096_7805310_489-292631-0_sp0.9 from training. Duration: 0.87775 2023-10-02 11:04:29,976 WARNING [train.py:1204] (2/4) Exclude cut with ID _49728_210_12651_1_1534041034294_6788150_624-195301-0_sp1.1 from training. Duration: 0.77275 2023-10-02 11:04:30,659 WARNING [train.py:1204] (2/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_44-79427-0 from training. Duration: 0.98 2023-10-02 11:04:31,891 WARNING [train.py:1204] (2/4) Exclude cut with ID _35350_210_16040_1_1533040191183_4075299_490-198354-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 11:04:33,207 WARNING [train.py:1204] (2/4) Exclude cut with ID _5166_210_2084_1_1527073197160_4087993_309-222545-0 from training. Duration: 0.84 2023-10-02 11:04:35,234 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_13091_210_7925_1_1530609447_8987354_790-199469-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 11:04:35,245 WARNING [train.py:1204] (2/4) Exclude cut with ID _21632_210_2253_1_1531994001765_3744140_110-274654-0 from training. Duration: 0.82 2023-10-02 11:04:35,250 WARNING [train.py:1204] (2/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_498-40153-0_sp0.9 from training. Duration: 0.7 2023-10-02 11:04:40,839 WARNING [train.py:1204] (2/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_363-282909-0_sp1.1 from training. Duration: 0.936375 2023-10-02 11:04:42,656 WARNING [train.py:1204] (2/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_178-177582-0_sp0.9 from training. Duration: 0.82225 2023-10-02 11:04:42,673 WARNING [train.py:1204] (2/4) Exclude cut with ID _24612_210_8913_1_1531998054193_3806359_357-35487-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 11:04:45,423 WARNING [train.py:1204] (2/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_138-332773-0 from training. Duration: 0.81 2023-10-02 11:04:45,445 WARNING [train.py:1204] (2/4) Exclude cut with ID _7624_210_1531_1_1528109802198_4933101_418-349834-0 from training. Duration: 0.6 2023-10-02 11:04:46,731 WARNING [train.py:1204] (2/4) Exclude cut with ID _7537_210_2084_1_1528502395431_3924359_14-312906-0_sp0.9 from training. Duration: 0.9333125 2023-10-02 11:04:48,308 WARNING [train.py:1204] (2/4) Exclude cut with ID _60057_210_10640_1_1534903065793_3333150_362-230377-0 from training. Duration: 0.87 2023-10-02 11:04:51,120 WARNING [train.py:1204] (2/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_138-64210-0 from training. Duration: 0.73 2023-10-02 11:04:51,187 WARNING [train.py:1204] (2/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_454-339504-0_sp1.1 from training. Duration: 0.97275 2023-10-02 11:04:51,359 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.attention_skip_rate, batch_count=857073.3333333334, ans=0.0 2023-10-02 11:04:53,989 WARNING [train.py:1204] (2/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_508-335369-0_sp0.9 from training. Duration: 0.5333125 2023-10-02 11:04:55,325 WARNING [train.py:1204] (2/4) Exclude cut with ID _5608_210_2106_1_1527415439089_6819768_909-82767-0_sp0.9 from training. Duration: 0.67775 2023-10-02 11:04:55,354 WARNING [train.py:1204] (2/4) Exclude cut with ID _6694_210_4599_1_1527852637490_3965529_99-11611-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 11:04:56,679 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_2655_210_2087_1_1525688959_3897400_52-279899-0_sp1.1 from training. Duration: 0.62725 2023-10-02 11:05:00,656 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.bypass_mid.scale_min, batch_count=857140.0, ans=0.2 2023-10-02 11:05:02,371 WARNING [train.py:1204] (2/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_375-245677-0_sp1.1 from training. Duration: 0.736375 2023-10-02 11:05:02,502 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.out_combiner.scale_min, batch_count=857140.0, ans=0.2 2023-10-02 11:05:06,507 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_23850_210_4238_1_1532232597_1632433_32-185135-0 from training. Duration: 0.86 2023-10-02 11:05:07,915 WARNING [train.py:1204] (2/4) Exclude cut with ID _15113_210_8254_1_1530842438579_3716279_209-54533-0 from training. Duration: 0.96 2023-10-02 11:05:09,176 WARNING [train.py:1204] (2/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_401-53423-0 from training. Duration: 0.55 2023-10-02 11:05:09,225 WARNING [train.py:1204] (2/4) Exclude cut with ID _14867_210_1968_1_1530684179890_3846969_319-250868-0_sp1.1 from training. Duration: 0.9 2023-10-02 11:05:09,239 WARNING [train.py:1204] (2/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_106-30782-0_sp0.9 from training. Duration: 0.888875 2023-10-02 11:05:11,860 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_5960_210_1794_1_1527403865_3990999_515-2613-0 from training. Duration: 0.88 2023-10-02 11:05:15,328 WARNING [train.py:1204] (2/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_798-106179-0_sp0.9 from training. Duration: 0.92225 2023-10-02 11:05:16,725 WARNING [train.py:1204] (2/4) Exclude cut with ID _14227_210_4381_1_1530701804286_3940259_125-275642-0_sp1.1 from training. Duration: 0.9 2023-10-02 11:05:16,727 WARNING [train.py:1204] (2/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_79-200392-0_sp1.1 from training. Duration: 0.8818125 2023-10-02 11:05:16,772 WARNING [train.py:1204] (2/4) Exclude cut with ID _48836_210_12210_1_1534075848293_3904150_429-35475-0_sp0.9 from training. Duration: 0.988875 2023-10-02 11:05:18,060 WARNING [train.py:1204] (2/4) Exclude cut with ID _69884_210_9771_1_1535846392818_4166169_224-172517-0_sp1.1 from training. Duration: 0.97275 2023-10-02 11:05:20,827 WARNING [train.py:1204] (2/4) Exclude cut with ID _15113_210_8254_1_1530842438579_3716279_76-232507-0_sp1.1 from training. Duration: 0.97275 2023-10-02 11:05:20,830 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_135-5501-0 from training. Duration: 0.98 2023-10-02 11:05:22,240 WARNING [train.py:1204] (2/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_563-347832-0_sp0.9 from training. Duration: 0.988875 2023-10-02 11:05:22,256 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6786_210_1388_1_1527839699_4194495_273-149841-0 from training. Duration: 0.92 2023-10-02 11:05:22,273 WARNING [train.py:1204] (2/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_172-215932-0 from training. Duration: 0.66 2023-10-02 11:05:23,605 WARNING [train.py:1204] (2/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_939-132910-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 11:05:26,430 WARNING [train.py:1204] (2/4) Exclude cut with ID _49371_210_12071_1_1533952761021_3812190_16-246001-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 11:05:28,140 INFO [train.py:1046] (2/4) Epoch 25, batch 1100, loss[loss=0.1806, simple_loss=0.2646, pruned_loss=0.04826, over 24101.00 frames. ], tot_loss[loss=0.1687, simple_loss=0.2459, pruned_loss=0.04573, over 4693348.63 frames. ], batch size: 80, lr: 4.12e-03, grad_scale: 16.0 2023-10-02 11:05:32,794 WARNING [train.py:1204] (2/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_680-189165-0_sp1.1 from training. Duration: 0.936375 2023-10-02 11:05:36,936 WARNING [train.py:1204] (2/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_53-300544-0_sp1.1 from training. Duration: 0.6090625 2023-10-02 11:05:38,362 WARNING [train.py:1204] (2/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_540-275685-0_sp1.1 from training. Duration: 0.8090625 2023-10-02 11:05:38,379 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_38965_210_4852_1_1533174906_3484735_453-106579-0_sp1.1 from training. Duration: 0.92725 2023-10-02 11:05:39,752 WARNING [train.py:1204] (2/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_105-328174-0 from training. Duration: 0.76 2023-10-02 11:05:41,357 WARNING [train.py:1204] (2/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_279-126205-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 11:05:42,873 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.1.conv_module2.balancer2.prob, batch_count=857340.0, ans=0.125 2023-10-02 11:05:44,043 WARNING [train.py:1204] (2/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_5-55893-0_sp0.9 from training. Duration: 0.611125 2023-10-02 11:05:47,480 WARNING [train.py:1204] (2/4) Exclude cut with ID _13678_210_7497_1_1530530069226_3463170_141-10835-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 11:05:50,299 WARNING [train.py:1204] (2/4) Exclude cut with ID _22083_210_8254_1_1531702862439_3678970_201-85738-0_sp1.1 from training. Duration: 0.8181875 2023-10-02 11:05:50,318 WARNING [train.py:1204] (2/4) Exclude cut with ID _4697_210_2069_1_1526815801953_3823098_51-1049-0 from training. Duration: 0.75 2023-10-02 11:05:51,660 WARNING [train.py:1204] (2/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_206-10097-0_sp0.9 from training. Duration: 0.5444375 2023-10-02 11:05:52,992 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_4528_210_3273_1_1527076654_3276777_360-63661-0_sp1.1 from training. Duration: 0.92725 2023-10-02 11:05:52,999 WARNING [train.py:1204] (2/4) Exclude cut with ID _64086_210_7994_1_1535270417019_7435949_449-31567-0_sp1.1 from training. Duration: 0.8818125 2023-10-02 11:05:53,618 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.4.encoder.layers.2.conv_module2.whiten, num_groups=1, num_channels=384, metric=2.84 vs. limit=15.0 2023-10-02 11:05:55,778 WARNING [train.py:1204] (2/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_245-174553-0_sp0.9 from training. Duration: 0.97775 2023-10-02 11:05:57,320 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6545_210_2040_1_1527774670_4245258_379-14182-0_sp1.1 from training. Duration: 0.72725 2023-10-02 11:06:01,890 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_3133_210_2087_1_1526106446_2324690_256-206070-0_sp1.1 from training. Duration: 0.963625 2023-10-02 11:06:04,995 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.521e+02 1.795e+02 1.916e+02 2.111e+02 3.206e+02, threshold=3.831e+02, percent-clipped=0.0 2023-10-02 11:06:05,769 WARNING [train.py:1204] (2/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_278-104676-0 from training. Duration: 0.98 2023-10-02 11:06:05,845 WARNING [train.py:1204] (2/4) Exclude cut with ID IC0671W0065-331908 from training. Duration: 0.896 2023-10-02 11:06:07,116 WARNING [train.py:1204] (2/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_244-129917-0_sp1.1 from training. Duration: 0.97275 2023-10-02 11:06:07,374 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.attention_skip_rate, batch_count=857406.6666666666, ans=0.0 2023-10-02 11:06:08,614 WARNING [train.py:1204] (2/4) Exclude cut with ID _7852_210_5219_1_1528286471316_8139400_1129-170857-0_sp1.1 from training. Duration: 0.97275 2023-10-02 11:06:09,911 WARNING [train.py:1204] (2/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_443-30034-0_sp0.9 from training. Duration: 0.711125 2023-10-02 11:06:09,953 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_31583_210_3852_1_1532595851_4024413_250-332999-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 11:06:11,368 WARNING [train.py:1204] (2/4) Exclude cut with ID _8772_210_1850_1_1528635595972_3664011_247-336448-0 from training. Duration: 0.74 2023-10-02 11:06:11,419 WARNING [train.py:1204] (2/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_567-135114-0_sp0.9 from training. Duration: 0.9666875 2023-10-02 11:06:12,583 WARNING [train.py:1204] (2/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_557-349534-0_sp1.1 from training. Duration: 0.82725 2023-10-02 11:06:12,595 WARNING [train.py:1204] (2/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_1341-26728-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 11:06:12,637 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_40222_210_6228_1_1533385298_4481939_375-328051-0_sp1.1 from training. Duration: 0.97275 2023-10-02 11:06:12,678 WARNING [train.py:1204] (2/4) Exclude cut with ID _4697_210_2069_1_1526815801953_3823098_276-96593-0 from training. Duration: 0.77 2023-10-02 11:06:20,087 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_53071_210_16554_1_1534490405_2954066_265-170472-0_sp0.9 from training. Duration: 0.92225 2023-10-02 11:06:21,282 WARNING [train.py:1204] (2/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_365-1492-0 from training. Duration: 0.91 2023-10-02 11:06:22,807 WARNING [train.py:1204] (2/4) Exclude cut with ID _66873_210_5517_1_1535454136506_4293370_654-52713-0_sp0.9 from training. Duration: 0.8555625 2023-10-02 11:06:28,215 WARNING [train.py:1204] (2/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_113-92608-0_sp1.1 from training. Duration: 0.4909375 2023-10-02 11:06:29,844 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=857540.0, ans=0.1 2023-10-02 11:06:31,035 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_46013_210_7234_1_1533776362_3744032_50-141535-0 from training. Duration: 0.99 2023-10-02 11:06:31,054 WARNING [train.py:1204] (2/4) Exclude cut with ID _13942_210_9988_1_1530604906036_3733360_499-55676-0_sp1.1 from training. Duration: 0.563625 2023-10-02 11:06:32,961 WARNING [train.py:1204] (2/4) Exclude cut with ID _30800_210_22382_1_1532499347904_4515959_375-65143-0_sp1.1 from training. Duration: 0.97275 2023-10-02 11:06:35,643 WARNING [train.py:1204] (2/4) Exclude cut with ID _16756_210_4915_1_1531047803763_7104259_199-325608-0_sp1.1 from training. Duration: 0.92725 2023-10-02 11:06:37,044 WARNING [train.py:1204] (2/4) Exclude cut with ID _31649_210_6228_1_1532584608063_4091078_443-124391-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 11:06:38,754 WARNING [train.py:1204] (2/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_146-218763-0 from training. Duration: 0.86 2023-10-02 11:06:38,796 WARNING [train.py:1204] (2/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_567-135114-0_sp1.1 from training. Duration: 0.7909375 2023-10-02 11:06:38,826 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_26326_210_7925_1_1533002493_3592406_462-288299-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 11:06:40,561 WARNING [train.py:1204] (2/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_322-217220-0 from training. Duration: 0.86 2023-10-02 11:06:40,584 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_238-256668-0_sp1.1 from training. Duration: 0.62725 2023-10-02 11:06:41,862 INFO [train.py:1046] (2/4) Epoch 25, batch 1150, loss[loss=0.1698, simple_loss=0.2524, pruned_loss=0.04357, over 24665.00 frames. ], tot_loss[loss=0.169, simple_loss=0.2467, pruned_loss=0.04571, over 4700857.06 frames. ], batch size: 68, lr: 4.12e-03, grad_scale: 16.0 2023-10-02 11:06:41,865 WARNING [train.py:1204] (2/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_371-18169-0 from training. Duration: 0.86 2023-10-02 11:06:41,972 WARNING [train.py:1204] (2/4) Exclude cut with ID _58480_210_12392_1_1534811121111_3646160_23-74512-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 11:06:43,299 WARNING [train.py:1204] (2/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_784-213099-0_sp1.1 from training. Duration: 0.8454375 2023-10-02 11:06:44,545 WARNING [train.py:1204] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_625-102955-0_sp1.1 from training. Duration: 0.7 2023-10-02 11:06:44,711 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.0.bypass_mid.scale_min, batch_count=857606.6666666666, ans=0.2 2023-10-02 11:06:48,747 WARNING [train.py:1204] (2/4) Exclude cut with ID _49748_210_24116_1_1533986971737_3158910_176-174327-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 11:06:50,145 WARNING [train.py:1204] (2/4) Exclude cut with ID _50310_210_15488_1_1534068043345_3641080_432-96140-0_sp1.1 from training. Duration: 0.936375 2023-10-02 11:06:53,376 WARNING [train.py:1204] (2/4) Exclude cut with ID _40402_210_18219_1_1533522324115_2953150_76-14910-0_sp1.1 from training. Duration: 0.92725 2023-10-02 11:06:53,412 WARNING [train.py:1204] (2/4) Exclude cut with ID _21783_210_11357_1_1531742458730_3762679_40-19458-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 11:06:53,442 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_14056_210_4163_1_1530604483_7572827_46-93547-0 from training. Duration: 0.95 2023-10-02 11:06:54,785 WARNING [train.py:1204] (2/4) Exclude cut with ID _21631_210_2253_1_1531907880778_3160339_321-35325-0_sp1.1 from training. Duration: 0.963625 2023-10-02 11:06:56,147 WARNING [train.py:1204] (2/4) Exclude cut with ID _4779_210_2084_1_1527318022944_3914893_500-285013-0 from training. Duration: 0.69 2023-10-02 11:06:57,631 WARNING [train.py:1204] (2/4) Exclude cut with ID _43552_210_8913_1_1533726031059_3523429_301-93324-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 11:06:57,636 WARNING [train.py:1204] (2/4) Exclude cut with ID _6473_210_1741_1_1527901307396_3815380_108-17335-0_sp1.1 from training. Duration: 0.5909375 2023-10-02 11:07:03,638 WARNING [train.py:1204] (2/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_206-10097-0 from training. Duration: 0.49 2023-10-02 11:07:06,229 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_36756_210_7234_1_1533026991_966276_103-77513-0_sp1.1 from training. Duration: 0.97275 2023-10-02 11:07:09,417 WARNING [train.py:1204] (2/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_662-18193-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 11:07:09,483 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_26433_210_12108_1_1532157158_4056480_439-52438-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 11:07:09,531 WARNING [train.py:1204] (2/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_464-33931-0 from training. Duration: 0.54 2023-10-02 11:07:09,538 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_3474_210_2028_1_1526188513_4825246_145-309131-0_sp0.9 from training. Duration: 0.82225 2023-10-02 11:07:09,549 WARNING [train.py:1204] (2/4) Exclude cut with ID _9679_210_7497_1_1529129212906_4363929_446-308321-0_sp1.1 from training. Duration: 0.963625 2023-10-02 11:07:15,636 WARNING [train.py:1204] (2/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_662-275621-0 from training. Duration: 0.42 2023-10-02 11:07:16,919 WARNING [train.py:1204] (2/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_344-337148-0_sp1.1 from training. Duration: 0.97275 2023-10-02 11:07:17,027 WARNING [train.py:1204] (2/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_759-179071-0_sp1.1 from training. Duration: 0.92725 2023-10-02 11:07:27,680 WARNING [train.py:1204] (2/4) Exclude cut with ID _28783_210_7943_1_1532671164808_7155479_293-12188-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 11:07:31,906 INFO [scaling.py:1118] (2/4) WithLoss: name=encoder.encoders.4.encoder.layers.0.self_attn_weights, loss-sum=0.000e+00 2023-10-02 11:07:34,994 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_17613_210_2151_1_1531137569_2366160_235-320304-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 11:07:35,020 WARNING [train.py:1204] (2/4) Exclude cut with ID _14349_210_8341_1_1530615710818_3442470_170-336541-0 from training. Duration: 0.86 2023-10-02 11:07:35,078 WARNING [train.py:1204] (2/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_375-218677-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 11:07:36,399 WARNING [train.py:1204] (2/4) Exclude cut with ID _28317_210_12742_1_1532480302381_8510325_829-202956-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 11:07:41,043 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9029W0436-509823 from training. Duration: 0.768 2023-10-02 11:07:44,165 WARNING [train.py:1204] (2/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_231-112214-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 11:07:49,874 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9059W0470-524883 from training. Duration: 0.3415625 2023-10-02 11:07:54,104 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_43266_210_1794_1_1533521789_4497218_206-79512-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 11:07:55,409 INFO [train.py:1046] (2/4) Epoch 25, batch 1200, loss[loss=0.187, simple_loss=0.2619, pruned_loss=0.05611, over 23276.00 frames. ], tot_loss[loss=0.1698, simple_loss=0.2474, pruned_loss=0.04611, over 4705590.00 frames. ], batch size: 105, lr: 4.12e-03, grad_scale: 32.0 2023-10-02 11:07:55,434 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_294-163793-0_sp0.9 from training. Duration: 0.911125 2023-10-02 11:07:55,467 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6290_210_3273_1_1527595224_1282787_115-87095-0_sp1.1 from training. Duration: 0.5 2023-10-02 11:07:56,864 WARNING [train.py:1204] (2/4) Exclude cut with ID _14100_210_8341_1_1530529320613_3678069_279-55760-0_sp1.1 from training. Duration: 0.8454375 2023-10-02 11:08:00,228 WARNING [train.py:1204] (2/4) Exclude cut with ID _14621_210_1925_1_1530686901185_3675420_419-234200-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 11:08:03,224 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=857940.0, ans=0.1 2023-10-02 11:08:04,711 WARNING [train.py:1204] (2/4) Exclude cut with ID _60229_210_27767_1_1534838406158_3655350_353-213656-0_sp1.1 from training. Duration: 0.863625 2023-10-02 11:08:04,720 WARNING [train.py:1204] (2/4) Exclude cut with ID _48678_210_5663_1_1533988830947_3769199_306-255886-0_sp1.1 from training. Duration: 0.7 2023-10-02 11:08:06,039 WARNING [train.py:1204] (2/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_466-3492-0_sp1.1 from training. Duration: 0.97275 2023-10-02 11:08:06,040 WARNING [train.py:1204] (2/4) Exclude cut with ID _8755_210_1876_1_1528628559357_3258209_199-242167-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 11:08:06,114 WARNING [train.py:1204] (2/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_237-89684-0_sp1.1 from training. Duration: 0.87275 2023-10-02 11:08:07,541 WARNING [train.py:1204] (2/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_70-84121-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 11:08:10,114 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7544_210_6753_1_1528587722_3576686_172-171750-0_sp1.1 from training. Duration: 0.5909375 2023-10-02 11:08:12,162 WARNING [train.py:1204] (2/4) Exclude cut with ID _24271_210_12658_1_1531987270022_7190519_40-321822-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 11:08:12,179 WARNING [train.py:1204] (2/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_227-83612-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 11:08:14,722 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.3.encoder.layers.2.self_attn_weights.whiten_keys, num_groups=8, num_channels=256, metric=4.17 vs. limit=6.0 2023-10-02 11:08:15,372 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9156W0015-572508 from training. Duration: 0.896 2023-10-02 11:08:16,849 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_292-265928-0 from training. Duration: 0.51 2023-10-02 11:08:19,749 WARNING [train.py:1204] (2/4) Exclude cut with ID _14747_210_8341_1_1530685797463_3654089_147-131229-0_sp1.1 from training. Duration: 0.6909375 2023-10-02 11:08:21,191 WARNING [train.py:1204] (2/4) Exclude cut with ID _45297_210_2721_1_1533715351381_3264949_351-147136-0_sp0.9 from training. Duration: 0.9666875 2023-10-02 11:08:22,594 WARNING [train.py:1204] (2/4) Exclude cut with ID _5250_210_5219_1_1527249561806_8692110_1102-324568-0_sp1.1 from training. Duration: 0.97275 2023-10-02 11:08:24,107 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.conv_module2.balancer1.prob, batch_count=858073.3333333334, ans=0.125 2023-10-02 11:08:25,458 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.feed_forward3.hidden_balancer.prob, batch_count=858073.3333333334, ans=0.125 2023-10-02 11:08:26,431 WARNING [train.py:1204] (2/4) Exclude cut with ID _18516_210_9206_1_1531704653206_3692406_465-75446-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 11:08:26,432 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9203W0126-595623 from training. Duration: 0.896 2023-10-02 11:08:27,861 WARNING [train.py:1204] (2/4) Exclude cut with ID _55383_210_10635_1_1534468426172_2231669_139-328410-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 11:08:30,130 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.conv_skip_rate, batch_count=858073.3333333334, ans=0.0 2023-10-02 11:08:32,517 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.564e+02 1.868e+02 2.056e+02 2.360e+02 3.745e+02, threshold=4.112e+02, percent-clipped=0.0 2023-10-02 11:08:34,920 INFO [scaling.py:1118] (2/4) WithLoss: name=encoder.encoders.4.encoder.layers.0.self_attn_weights, loss-sum=0.000e+00 2023-10-02 11:08:36,014 WARNING [train.py:1204] (2/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_166-246557-0_sp0.9 from training. Duration: 0.711125 2023-10-02 11:08:36,018 WARNING [train.py:1204] (2/4) Exclude cut with ID _30039_210_13270_1_1532484612900_2725150_239-304872-0_sp1.1 from training. Duration: 0.963625 2023-10-02 11:08:36,054 WARNING [train.py:1204] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_6-226750-0 from training. Duration: 0.94 2023-10-02 11:08:36,329 INFO [scaling.py:1118] (2/4) WithLoss: name=encoder.encoders.4.encoder.layers.0.self_attn_weights, loss-sum=0.000e+00 2023-10-02 11:08:37,432 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_135-5501-0_sp1.1 from training. Duration: 0.8909375 2023-10-02 11:08:40,245 WARNING [train.py:1204] (2/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_359-118926-0 from training. Duration: 0.72 2023-10-02 11:08:45,116 WARNING [train.py:1204] (2/4) Exclude cut with ID _8064_210_5399_1_1528372299291_4282973_4-91037-0 from training. Duration: 0.88 2023-10-02 11:08:45,139 WARNING [train.py:1204] (2/4) Exclude cut with ID _6653_210_2629_1_1527919172441_6732679_415-288492-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 11:08:46,590 WARNING [train.py:1204] (2/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_544-287179-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 11:08:46,689 WARNING [train.py:1204] (2/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_122-32606-0_sp1.1 from training. Duration: 0.92725 2023-10-02 11:08:48,138 WARNING [train.py:1204] (2/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_121-284152-0_sp1.1 from training. Duration: 0.536375 2023-10-02 11:08:48,231 WARNING [train.py:1204] (2/4) Exclude cut with ID _44718_210_5816_1_1534050051869_6469030_350-299477-0_sp1.1 from training. Duration: 0.97275 2023-10-02 11:08:49,502 WARNING [train.py:1204] (2/4) Exclude cut with ID _6653_210_2629_1_1527919172441_6732679_1103-56445-0_sp0.9 from training. Duration: 0.811125 2023-10-02 11:08:49,586 WARNING [train.py:1204] (2/4) Exclude cut with ID _12369_210_4381_1_1530356078581_4388520_64-298917-0_sp0.9 from training. Duration: 0.97775 2023-10-02 11:08:50,926 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_2245_210_2715_1_1525511548_3540254_310-52483-0 from training. Duration: 0.88 2023-10-02 11:08:52,283 WARNING [train.py:1204] (2/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_401-158239-0_sp1.1 from training. Duration: 0.6454375 2023-10-02 11:08:52,325 WARNING [train.py:1204] (2/4) Exclude cut with ID _59502_210_12210_1_1535107024698_1835139_146-162495-0_sp1.1 from training. Duration: 0.836375 2023-10-02 11:08:52,340 WARNING [train.py:1204] (2/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_41-31157-0_sp0.9 from training. Duration: 0.6333125 2023-10-02 11:08:52,472 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=858140.0, ans=0.1 2023-10-02 11:08:54,998 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_68717_210_3528_1_1535628683_1556759_95-36722-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 11:08:55,005 WARNING [train.py:1204] (2/4) Exclude cut with ID _30196_210_1664_1_1533708496522_7074140_654-162611-0_sp1.1 from training. Duration: 0.92725 2023-10-02 11:08:56,648 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.1.feed_forward1.out_proj.dropout_p, batch_count=858206.6666666666, ans=0.1 2023-10-02 11:08:57,839 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_9302_210_2550_1_1528889962_4655416_638-301620-0_sp0.9 from training. Duration: 0.52225 2023-10-02 11:08:59,800 WARNING [train.py:1204] (2/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_478-162247-0_sp1.1 from training. Duration: 0.7454375 2023-10-02 11:09:03,676 WARNING [train.py:1204] (2/4) Exclude cut with ID _45297_210_2721_1_1533715351381_3264949_353-9541-0 from training. Duration: 0.97 2023-10-02 11:09:05,752 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.conv_module2.balancer2.prob, batch_count=858206.6666666666, ans=0.125 2023-10-02 11:09:06,857 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9653W0190-675468 from training. Duration: 0.9935 2023-10-02 11:09:08,206 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_49730_210_13049_1_1534075294_1830676_214-84436-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 11:09:09,458 INFO [train.py:1046] (2/4) Epoch 25, batch 1250, loss[loss=0.1521, simple_loss=0.2307, pruned_loss=0.03674, over 24618.00 frames. ], tot_loss[loss=0.1703, simple_loss=0.2476, pruned_loss=0.04647, over 4697919.88 frames. ], batch size: 60, lr: 4.12e-03, grad_scale: 16.0 2023-10-02 11:09:10,925 WARNING [train.py:1204] (2/4) Exclude cut with ID _50093_210_4220_1_1534417141207_3432299_259-322252-0_sp1.1 from training. Duration: 0.836375 2023-10-02 11:09:11,661 WARNING [train.py:1204] (2/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_599-50220-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 11:09:12,898 WARNING [train.py:1204] (2/4) Exclude cut with ID _21878_210_3525_1_1531738772002_7264330_174-109016-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 11:09:16,252 WARNING [train.py:1204] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_665-196155-0 from training. Duration: 0.67 2023-10-02 11:09:16,520 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.conv_module2.balancer2.prob, batch_count=858273.3333333334, ans=0.125 2023-10-02 11:09:20,353 WARNING [train.py:1204] (2/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_656-188361-0_sp1.1 from training. Duration: 0.7909375 2023-10-02 11:09:21,657 WARNING [train.py:1204] (2/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_60-18123-0_sp1.1 from training. Duration: 0.8 2023-10-02 11:09:21,720 WARNING [train.py:1204] (2/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_535-14410-0 from training. Duration: 0.47 2023-10-02 11:09:24,333 WARNING [train.py:1204] (2/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_94-126128-0_sp1.1 from training. Duration: 0.7818125 2023-10-02 11:09:25,080 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.3.encoder.layers.2.self_attn_weights.whiten_keys, num_groups=8, num_channels=256, metric=4.09 vs. limit=6.0 2023-10-02 11:09:25,783 WARNING [train.py:1204] (2/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_356-31216-0_sp1.1 from training. Duration: 0.6818125 2023-10-02 11:09:28,757 WARNING [train.py:1204] (2/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_443-30034-0_sp1.1 from training. Duration: 0.5818125 2023-10-02 11:09:30,247 WARNING [train.py:1204] (2/4) Exclude cut with ID _26907_210_7234_1_1532221740694_3205263_337-2494-0_sp1.1 from training. Duration: 0.8 2023-10-02 11:09:31,620 WARNING [train.py:1204] (2/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_49-97158-0_sp0.9 from training. Duration: 0.8444375 2023-10-02 11:09:31,630 WARNING [train.py:1204] (2/4) Exclude cut with ID _7877_210_6014_1_1528286358099_3750136_553-170128-0_sp1.1 from training. Duration: 0.963625 2023-10-02 11:09:34,808 WARNING [train.py:1204] (2/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_202-293523-0_sp0.9 from training. Duration: 0.8 2023-10-02 11:09:38,053 WARNING [train.py:1204] (2/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_188-171673-0_sp0.9 from training. Duration: 0.6444375 2023-10-02 11:09:39,367 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_120-41668-0_sp1.1 from training. Duration: 0.636375 2023-10-02 11:09:39,372 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_40223_210_6228_1_1533298404_4812267_613-78769-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 11:09:40,779 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_53861_210_5399_1_1534422156_4272077_150-336899-0_sp1.1 from training. Duration: 0.963625 2023-10-02 11:09:40,841 WARNING [train.py:1204] (2/4) Exclude cut with ID _14459_210_2228_1_1531443641946_7516700_1146-56843-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 11:09:44,125 WARNING [train.py:1204] (2/4) Exclude cut with ID _39867_210_9826_1_1533553222030_9344259_1194-299928-0_sp1.1 from training. Duration: 0.92725 2023-10-02 11:09:44,325 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.balancer2.prob, batch_count=858406.6666666666, ans=0.125 2023-10-02 11:09:45,477 WARNING [train.py:1204] (2/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_214-291369-0_sp0.9 from training. Duration: 0.7 2023-10-02 11:09:48,773 WARNING [train.py:1204] (2/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_358-215558-0 from training. Duration: 0.93 2023-10-02 11:09:48,961 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=858406.6666666666, ans=0.1 2023-10-02 11:09:50,122 WARNING [train.py:1204] (2/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_577-287150-0_sp0.9 from training. Duration: 0.911125 2023-10-02 11:09:50,361 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.attention_skip_rate, batch_count=858406.6666666666, ans=0.0 2023-10-02 11:09:53,046 WARNING [train.py:1204] (2/4) Exclude cut with ID _60860_210_8254_1_1534896015875_3557169_229-187367-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 11:09:54,400 WARNING [train.py:1204] (2/4) Exclude cut with ID _6275_210_2084_1_1528110062213_7669049_857-298942-0 from training. Duration: 0.85 2023-10-02 11:09:55,724 WARNING [train.py:1204] (2/4) Exclude cut with ID _40137_210_12210_1_1533290484046_3795180_249-187059-0_sp1.1 from training. Duration: 0.8 2023-10-02 11:09:55,732 WARNING [train.py:1204] (2/4) Exclude cut with ID ID0171W0508-765603 from training. Duration: 0.541 2023-10-02 11:09:55,761 WARNING [train.py:1204] (2/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_521-219439-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 11:09:55,767 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_40223_210_6228_1_1533298404_4812267_741-302616-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 11:09:58,625 WARNING [train.py:1204] (2/4) Exclude cut with ID _65293_210_9070_1_1535545549765_4004210_438-154735-0_sp1.1 from training. Duration: 0.92725 2023-10-02 11:10:01,365 WARNING [train.py:1204] (2/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_564-132950-0_sp1.1 from training. Duration: 0.92725 2023-10-02 11:10:02,719 WARNING [train.py:1204] (2/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_291-270881-0_sp1.1 from training. Duration: 0.8818125 2023-10-02 11:10:04,517 WARNING [train.py:1204] (2/4) Exclude cut with ID _60229_210_27767_1_1534838406158_3655350_353-213656-0 from training. Duration: 0.95 2023-10-02 11:10:04,533 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_3133_210_2087_1_1526106446_2324690_243-143119-0 from training. Duration: 0.77 2023-10-02 11:10:05,771 WARNING [train.py:1204] (2/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_690-126306-0 from training. Duration: 0.78 2023-10-02 11:10:08,645 WARNING [train.py:1204] (2/4) Exclude cut with ID _30304_210_6319_1_1532476827261_7582060_347-95003-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 11:10:10,570 WARNING [train.py:1204] (2/4) Exclude cut with ID _2322_210_2667_1_1525604548971_3656299_440-80494-0 from training. Duration: 0.88 2023-10-02 11:10:10,590 WARNING [train.py:1204] (2/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_370-229434-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 11:10:12,073 WARNING [train.py:1204] (2/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_124-345840-0_sp1.1 from training. Duration: 0.47275 2023-10-02 11:10:13,288 WARNING [train.py:1204] (2/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_165-123581-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 11:10:14,621 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.2.encoder.layers.0.feed_forward3.out_whiten, num_groups=1, num_channels=384, metric=11.20 vs. limit=15.0 2023-10-02 11:10:15,125 WARNING [train.py:1204] (2/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_453-282588-0 from training. Duration: 0.98 2023-10-02 11:10:15,157 WARNING [train.py:1204] (2/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_299-34413-0_sp0.9 from training. Duration: 0.72225 2023-10-02 11:10:16,033 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.1.encoder.layers.1.nonlin_attention.whiten2, num_groups=1, num_channels=256, metric=7.07 vs. limit=15.0 2023-10-02 11:10:16,494 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_40036_210_13405_1_1533648647_1315388_48-146016-0_sp1.1 from training. Duration: 0.8454375 2023-10-02 11:10:16,514 WARNING [train.py:1204] (2/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_99-167392-0_sp0.9 from training. Duration: 0.67775 2023-10-02 11:10:16,569 WARNING [train.py:1204] (2/4) Exclude cut with ID _48677_210_5663_1_1533902405919_3892989_37-207114-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 11:10:18,007 WARNING [train.py:1204] (2/4) Exclude cut with ID _29857_210_20034_1_1532395886753_3798160_335-163562-0 from training. Duration: 0.85 2023-10-02 11:10:18,124 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.0.ff2_skip_rate, batch_count=858540.0, ans=0.0 2023-10-02 11:10:21,318 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_36492_210_7927_1_1533261987_4790834_703-343351-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 11:10:22,695 WARNING [train.py:1204] (2/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_80-147301-0_sp0.9 from training. Duration: 0.9333125 2023-10-02 11:10:23,946 INFO [train.py:1046] (2/4) Epoch 25, batch 1300, loss[loss=0.1534, simple_loss=0.2317, pruned_loss=0.03751, over 24566.00 frames. ], tot_loss[loss=0.1702, simple_loss=0.2475, pruned_loss=0.04646, over 4699084.03 frames. ], batch size: 60, lr: 4.12e-03, grad_scale: 8.0 2023-10-02 11:10:24,014 WARNING [train.py:1204] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_218-295097-0_sp0.9 from training. Duration: 0.8555625 2023-10-02 11:10:25,360 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_2295_210_2087_1_1525500993_2407863_116-238894-0_sp1.1 from training. Duration: 0.636375 2023-10-02 11:10:28,213 WARNING [train.py:1204] (2/4) Exclude cut with ID _3231_210_2106_1_1526205864934_6784220_697-171068-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 11:10:29,562 WARNING [train.py:1204] (2/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_102-77064-0 from training. Duration: 0.93 2023-10-02 11:10:32,374 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_13091_210_7925_1_1530609447_8987354_1186-261957-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 11:10:35,588 WARNING [train.py:1204] (2/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_91-277684-0_sp1.1 from training. Duration: 0.67275 2023-10-02 11:10:37,222 WARNING [train.py:1204] (2/4) Exclude cut with ID _31088_210_16446_1_1532520012284_3440919_159-313751-0_sp1.1 from training. Duration: 0.936375 2023-10-02 11:10:39,001 WARNING [train.py:1204] (2/4) Exclude cut with ID _42326_210_7770_1_1533814282531_3707269_227-203721-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 11:10:40,371 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_3531_210_3528_1_1526294203_5216456_309-227302-0_sp1.1 from training. Duration: 0.62725 2023-10-02 11:10:40,414 WARNING [train.py:1204] (2/4) Exclude cut with ID _14087_210_2253_1_1530529224512_3014029_226-242140-0 from training. Duration: 0.81 2023-10-02 11:10:44,608 WARNING [train.py:1204] (2/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_122-109023-0_sp1.1 from training. Duration: 0.6909375 2023-10-02 11:10:44,695 WARNING [train.py:1204] (2/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_660-211088-0_sp0.9 from training. Duration: 0.688875 2023-10-02 11:10:46,691 WARNING [train.py:1204] (2/4) Exclude cut with ID _39290_210_4220_1_1533862778760_3075118_92-311600-0 from training. Duration: 0.88 2023-10-02 11:10:46,800 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.1.ff3_skip_rate, batch_count=858673.3333333334, ans=0.0 2023-10-02 11:10:49,530 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.conv_module2.balancer1.prob, batch_count=858673.3333333334, ans=0.125 2023-10-02 11:10:51,346 WARNING [train.py:1204] (2/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_390-339137-0_sp1.1 from training. Duration: 0.5090625 2023-10-02 11:10:54,150 WARNING [train.py:1204] (2/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_449-341946-0_sp1.1 from training. Duration: 0.963625 2023-10-02 11:10:55,522 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_68095_210_20185_1_1535718226_8888855_884-327007-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 11:10:56,958 WARNING [train.py:1204] (2/4) Exclude cut with ID _42326_210_7770_1_1533814282531_3707269_308-222998-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 11:10:58,277 WARNING [train.py:1204] (2/4) Exclude cut with ID _40399_210_4353_1_1533305658194_3279406_344-238708-0_sp1.1 from training. Duration: 0.963625 2023-10-02 11:10:58,339 WARNING [train.py:1204] (2/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_87-131427-0_sp1.1 from training. Duration: 0.7090625 2023-10-02 11:10:59,730 WARNING [train.py:1204] (2/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_110-130961-0_sp0.9 from training. Duration: 0.52225 2023-10-02 11:10:59,763 WARNING [train.py:1204] (2/4) Exclude cut with ID _55153_210_2253_1_1534384786677_3162096_148-79022-0 from training. Duration: 0.96 2023-10-02 11:11:03,927 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.517e+02 1.851e+02 2.097e+02 2.493e+02 3.634e+02, threshold=4.193e+02, percent-clipped=0.0 2023-10-02 11:11:05,317 WARNING [train.py:1204] (2/4) Exclude cut with ID _9679_210_7497_1_1529129212906_4363929_512-313889-0_sp1.1 from training. Duration: 0.736375 2023-10-02 11:11:05,328 WARNING [train.py:1204] (2/4) Exclude cut with ID _7970_210_1883_1_1528455616296_3472230_25-178407-0_sp1.1 from training. Duration: 0.6454375 2023-10-02 11:11:07,081 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_3133_210_2087_1_1526106446_2324690_246-125000-0 from training. Duration: 0.62 2023-10-02 11:11:07,144 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_15126_210_6753_1_1530778677_2591185_113-157651-0_sp0.9 from training. Duration: 0.7555625 2023-10-02 11:11:09,114 WARNING [train.py:1204] (2/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_105-63309-0_sp1.1 from training. Duration: 0.8909375 2023-10-02 11:11:11,822 WARNING [train.py:1204] (2/4) Exclude cut with ID _29877_210_6228_1_1532397791106_5276149_578-177271-0_sp1.1 from training. Duration: 0.936375 2023-10-02 11:11:13,023 WARNING [train.py:1204] (2/4) Exclude cut with ID _44831_210_13427_1_1533816011549_3689035_32-188147-0 from training. Duration: 0.96 2023-10-02 11:11:14,410 WARNING [train.py:1204] (2/4) Exclude cut with ID _12369_210_4381_1_1530356078581_4388520_300-254337-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 11:11:14,422 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_128-241008-0 from training. Duration: 0.94 2023-10-02 11:11:15,788 WARNING [train.py:1204] (2/4) Exclude cut with ID _21878_210_3525_1_1531738772002_7264330_53-279178-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 11:11:19,226 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_41452_210_7291_1_1533437687_3941281_273-334088-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 11:11:20,091 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.0.layers.1.self_attn2.whiten, num_groups=1, num_channels=192, metric=9.49 vs. limit=22.5 2023-10-02 11:11:20,517 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_128-241008-0_sp1.1 from training. Duration: 0.8545625 2023-10-02 11:11:21,969 WARNING [train.py:1204] (2/4) Exclude cut with ID _8394_210_6702_1_1528590504316_3540189_379-49457-0 from training. Duration: 0.87 2023-10-02 11:11:22,036 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_105-114797-0 from training. Duration: 0.84 2023-10-02 11:11:23,403 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_14928_210_5632_1_1530773461_4714000_454-112198-0 from training. Duration: 0.66 2023-10-02 11:11:27,465 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_456-22141-0_sp1.1 from training. Duration: 0.8 2023-10-02 11:11:30,141 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_115-233929-0 from training. Duration: 0.72 2023-10-02 11:11:31,589 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_616-16795-0_sp1.1 from training. Duration: 0.963625 2023-10-02 11:11:37,869 INFO [train.py:1046] (2/4) Epoch 25, batch 1350, loss[loss=0.179, simple_loss=0.2593, pruned_loss=0.04941, over 24004.00 frames. ], tot_loss[loss=0.1704, simple_loss=0.2475, pruned_loss=0.04664, over 4691073.57 frames. ], batch size: 80, lr: 4.12e-03, grad_scale: 8.0 2023-10-02 11:11:40,566 WARNING [train.py:1204] (2/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_448-212135-0 from training. Duration: 0.91 2023-10-02 11:11:42,075 WARNING [train.py:1204] (2/4) Exclude cut with ID _46683_210_9642_1_1533730913492_4591116_201-205677-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 11:11:43,551 WARNING [train.py:1204] (2/4) Exclude cut with ID _64086_210_7994_1_1535270417019_7435949_43-80483-0_sp1.1 from training. Duration: 0.97275 2023-10-02 11:11:48,508 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_240-134124-0_sp1.1 from training. Duration: 0.963625 2023-10-02 11:11:48,543 WARNING [train.py:1204] (2/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_907-120940-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 11:11:51,120 WARNING [train.py:1204] (2/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_848-280957-0_sp1.1 from training. Duration: 0.7909375 2023-10-02 11:11:51,163 WARNING [train.py:1204] (2/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_243-42116-0_sp0.9 from training. Duration: 0.811125 2023-10-02 11:11:51,390 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.balancer1.prob, batch_count=859006.6666666666, ans=0.125 2023-10-02 11:11:55,350 WARNING [train.py:1204] (2/4) Exclude cut with ID _6694_210_4599_1_1527852637490_3965529_116-109398-0_sp0.9 from training. Duration: 0.811125 2023-10-02 11:11:56,750 WARNING [train.py:1204] (2/4) Exclude cut with ID _24314_210_4915_1_1532135078116_7170179_521-258868-0 from training. Duration: 0.79 2023-10-02 11:11:58,193 WARNING [train.py:1204] (2/4) Exclude cut with ID _2220_210_1819_1_1525509109898_3658680_50-99837-0_sp0.9 from training. Duration: 0.6 2023-10-02 11:11:59,426 WARNING [train.py:1204] (2/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_313-134930-0_sp1.1 from training. Duration: 0.8818125 2023-10-02 11:12:01,044 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.ff2_skip_rate, batch_count=859006.6666666666, ans=0.0 2023-10-02 11:12:02,221 WARNING [train.py:1204] (2/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_125-144174-0 from training. Duration: 0.39 2023-10-02 11:12:02,295 WARNING [train.py:1204] (2/4) Exclude cut with ID _59288_210_5854_1_1535077757774_3412246_275-293897-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 11:12:03,639 WARNING [train.py:1204] (2/4) Exclude cut with ID _40519_210_13794_1_1533715228655_7337390_722-52880-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 11:12:03,656 WARNING [train.py:1204] (2/4) Exclude cut with ID _7537_210_2084_1_1528502395431_3924359_128-268029-0 from training. Duration: 0.98 2023-10-02 11:12:05,002 WARNING [train.py:1204] (2/4) Exclude cut with ID _8021_210_7994_1_1528376451358_3653069_179-45206-0 from training. Duration: 0.86 2023-10-02 11:12:08,268 WARNING [train.py:1204] (2/4) Exclude cut with ID _13252_210_1925_1_1530671372047_3336909_88-276668-0 from training. Duration: 0.99 2023-10-02 11:12:08,370 WARNING [train.py:1204] (2/4) Exclude cut with ID _22613_210_5219_1_1532253599686_4090170_290-212862-0_sp1.1 from training. Duration: 0.97275 2023-10-02 11:12:08,495 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=859073.3333333334, ans=0.1 2023-10-02 11:12:10,073 WARNING [train.py:1204] (2/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_465-214726-0 from training. Duration: 0.86 2023-10-02 11:12:13,761 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.nonlin_attention.whiten2.whitening_limit, batch_count=859073.3333333334, ans=15.0 2023-10-02 11:12:19,388 WARNING [train.py:1204] (2/4) Exclude cut with ID _56480_210_4220_1_1534679942079_3075409_138-36031-0_sp1.1 from training. Duration: 0.97275 2023-10-02 11:12:26,399 WARNING [train.py:1204] (2/4) Exclude cut with ID _58605_210_12210_1_1534723195939_3904259_300-285878-0_sp1.1 from training. Duration: 0.97275 2023-10-02 11:12:26,423 WARNING [train.py:1204] (2/4) Exclude cut with ID _40901_210_5665_1_1533363789958_3856130_114-340229-0_sp1.1 from training. Duration: 0.936375 2023-10-02 11:12:26,461 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_489-230703-0 from training. Duration: 0.85 2023-10-02 11:12:30,576 WARNING [train.py:1204] (2/4) Exclude cut with ID _66873_210_5517_1_1535454136506_4293370_322-81243-0_sp1.1 from training. Duration: 0.936375 2023-10-02 11:12:30,662 WARNING [train.py:1204] (2/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_271-269084-0 from training. Duration: 0.83 2023-10-02 11:12:30,672 WARNING [train.py:1204] (2/4) Exclude cut with ID _13252_210_1925_1_1530671372047_3336909_166-314607-0_sp0.9 from training. Duration: 0.6 2023-10-02 11:12:30,730 WARNING [train.py:1204] (2/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_340-343302-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 11:12:33,898 WARNING [train.py:1204] (2/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_286-112015-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 11:12:37,172 WARNING [train.py:1204] (2/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_328-264776-0 from training. Duration: 0.67 2023-10-02 11:12:38,568 WARNING [train.py:1204] (2/4) Exclude cut with ID _4866_210_2577_1_1527404412284_4364659_256-306830-0_sp0.9 from training. Duration: 0.9666875 2023-10-02 11:12:42,088 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.bypass_mid.scale_min, batch_count=859206.6666666666, ans=0.2 2023-10-02 11:12:43,226 WARNING [train.py:1204] (2/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_239-316802-0 from training. Duration: 0.69 2023-10-02 11:12:44,631 WARNING [train.py:1204] (2/4) Exclude cut with ID _29857_210_20034_1_1532395886753_3798160_279-185801-0 from training. Duration: 0.91 2023-10-02 11:12:51,747 INFO [train.py:1046] (2/4) Epoch 25, batch 1400, loss[loss=0.1523, simple_loss=0.2372, pruned_loss=0.03373, over 24687.00 frames. ], tot_loss[loss=0.1698, simple_loss=0.2468, pruned_loss=0.04642, over 4708274.15 frames. ], batch size: 65, lr: 4.12e-03, grad_scale: 8.0 2023-10-02 11:12:53,120 WARNING [train.py:1204] (2/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_656-188361-0 from training. Duration: 0.87 2023-10-02 11:12:54,536 WARNING [train.py:1204] (2/4) Exclude cut with ID _44718_210_5816_1_1534050051869_6469030_100-230287-0_sp1.1 from training. Duration: 0.936375 2023-10-02 11:12:54,756 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.balancer1.prob, batch_count=859273.3333333334, ans=0.125 2023-10-02 11:12:57,364 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8275_210_4198_1_1528455320_3892571_276-215622-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 11:12:57,412 WARNING [train.py:1204] (2/4) Exclude cut with ID _30304_210_6319_1_1532476827261_7582060_698-309430-0_sp1.1 from training. Duration: 0.963625 2023-10-02 11:12:59,277 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.bypass_mid.scale_min, batch_count=859273.3333333334, ans=0.2 2023-10-02 11:13:00,453 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_365-217119-0 from training. Duration: 0.63 2023-10-02 11:13:01,831 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7264_210_4938_1_1527939911_8132095_800-91995-0 from training. Duration: 0.86 2023-10-02 11:13:09,334 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.balancer1.prob, batch_count=859340.0, ans=0.125 2023-10-02 11:13:10,531 WARNING [train.py:1204] (2/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_49-97158-0_sp1.1 from training. Duration: 0.6909375 2023-10-02 11:13:11,932 WARNING [train.py:1204] (2/4) Exclude cut with ID _61132_210_9969_1_1535021792964_3755419_61-57720-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 11:13:13,975 WARNING [train.py:1204] (2/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_345-306832-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 11:13:15,309 WARNING [train.py:1204] (2/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_164-224052-0_sp0.9 from training. Duration: 0.77775 2023-10-02 11:13:19,769 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_34886_210_10633_1_1533203882_1755661_76-156096-0_sp1.1 from training. Duration: 0.7909375 2023-10-02 11:13:19,850 WARNING [train.py:1204] (2/4) Exclude cut with ID 4133-6541-0027-40495-0_sp1.1 from training. Duration: 0.9681875 2023-10-02 11:13:29,592 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_514-211165-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 11:13:29,855 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.conv_skip_rate, batch_count=859406.6666666666, ans=0.0 2023-10-02 11:13:30,869 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.613e+02 1.837e+02 2.107e+02 2.511e+02 3.281e+02, threshold=4.214e+02, percent-clipped=0.0 2023-10-02 11:13:30,961 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_41211_210_2544_1_1533348859_3702910_97-212695-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 11:13:35,480 WARNING [train.py:1204] (2/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_406-45086-0 from training. Duration: 0.8 2023-10-02 11:13:37,044 WARNING [train.py:1204] (2/4) Exclude cut with ID _65263_210_22356_1_1535763598691_3921037_49-222917-0_sp1.1 from training. Duration: 0.836375 2023-10-02 11:13:38,734 WARNING [train.py:1204] (2/4) Exclude cut with ID _4866_210_2577_1_1527404412284_4364659_352-157930-0_sp1.1 from training. Duration: 0.736375 2023-10-02 11:13:38,803 WARNING [train.py:1204] (2/4) Exclude cut with ID _4366_210_1388_1_1526792053633_3613819_231-211402-0_sp1.1 from training. Duration: 0.8545625 2023-10-02 11:13:39,056 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.1.conv_module2.balancer1.min_positive, batch_count=859473.3333333334, ans=0.025 2023-10-02 11:13:40,233 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_98-21576-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 11:13:40,319 WARNING [train.py:1204] (2/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_348-127789-0_sp0.9 from training. Duration: 0.9444375 2023-10-02 11:13:41,604 WARNING [train.py:1204] (2/4) Exclude cut with ID _45871_210_5665_1_1533864614245_3296060_98-329557-0_sp1.1 from training. Duration: 0.97275 2023-10-02 11:13:41,658 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6846_210_6753_1_1527850767_4295158_2-315073-0_sp1.1 from training. Duration: 0.8 2023-10-02 11:13:43,031 WARNING [train.py:1204] (2/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_145-137790-0 from training. Duration: 0.83 2023-10-02 11:13:43,051 WARNING [train.py:1204] (2/4) Exclude cut with ID _14603_210_4220_1_1530615688441_3097319_133-158308-0_sp0.9 from training. Duration: 0.9333125 2023-10-02 11:13:47,910 WARNING [train.py:1204] (2/4) Exclude cut with ID _14898_210_9206_1_1530766812377_4177190_320-11758-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 11:13:50,798 WARNING [train.py:1204] (2/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_406-45086-0_sp0.9 from training. Duration: 0.888875 2023-10-02 11:13:58,124 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_5438_210_1794_1_1527336687_2600009_104-288274-0 from training. Duration: 0.58 2023-10-02 11:13:59,532 WARNING [train.py:1204] (2/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_108-53929-0_sp0.9 from training. Duration: 0.6555625 2023-10-02 11:14:00,873 WARNING [train.py:1204] (2/4) Exclude cut with ID _30800_210_22382_1_1532499347904_4515959_444-286390-0_sp1.1 from training. Duration: 0.963625 2023-10-02 11:14:02,296 WARNING [train.py:1204] (2/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_125-144174-0_sp0.9 from training. Duration: 0.4333125 2023-10-02 11:14:02,359 WARNING [train.py:1204] (2/4) Exclude cut with ID _55209_210_1531_1_1534675828996_4597160_118-345621-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 11:14:04,952 INFO [train.py:1046] (2/4) Epoch 25, batch 1450, loss[loss=0.1753, simple_loss=0.2447, pruned_loss=0.05298, over 23833.00 frames. ], tot_loss[loss=0.1692, simple_loss=0.2464, pruned_loss=0.04597, over 4712525.93 frames. ], batch size: 212, lr: 4.12e-03, grad_scale: 8.0 2023-10-02 11:14:05,063 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_45886_210_17024_1_1533709215_471063_7-79165-0_sp1.1 from training. Duration: 0.8909375 2023-10-02 11:14:07,345 INFO [scaling.py:1118] (2/4) WithLoss: name=encoder.encoders.3.encoder.layers.3.self_attn_weights, loss-sum=0.000e+00 2023-10-02 11:14:07,381 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.conv_skip_rate, batch_count=859606.6666666666, ans=0.0 2023-10-02 11:14:08,499 WARNING [train.py:1204] (2/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_175-163894-0_sp0.9 from training. Duration: 0.82225 2023-10-02 11:14:10,904 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.3.encoder.layers.1.conv_module1.whiten, num_groups=1, num_channels=512, metric=3.13 vs. limit=15.0 2023-10-02 11:14:11,581 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_50188_210_4198_1_1534503621_3646041_353-81682-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 11:14:11,586 WARNING [train.py:1204] (2/4) Exclude cut with ID _17875_210_2087_1_1531224012907_1821339_175-80565-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 11:14:11,604 WARNING [train.py:1204] (2/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_159-328157-0_sp1.1 from training. Duration: 0.47275 2023-10-02 11:14:14,420 WARNING [train.py:1204] (2/4) Exclude cut with ID _60057_210_10640_1_1534903065793_3333150_137-267579-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 11:14:14,476 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_92-46009-0_sp1.1 from training. Duration: 0.4909375 2023-10-02 11:14:17,246 WARNING [train.py:1204] (2/4) Exclude cut with ID _53203_210_2087_1_1534246218309_475590_48-68507-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 11:14:17,272 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_28211_210_6388_1_1532861506_4523224_128-328162-0 from training. Duration: 0.9 2023-10-02 11:14:17,337 WARNING [train.py:1204] (2/4) Exclude cut with ID _4697_210_2069_1_1526815801953_3823098_51-1049-0_sp1.1 from training. Duration: 0.6818125 2023-10-02 11:14:19,290 WARNING [train.py:1204] (2/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_383-316000-0 from training. Duration: 0.9 2023-10-02 11:14:19,357 WARNING [train.py:1204] (2/4) Exclude cut with ID _4779_210_2084_1_1527318022944_3914893_185-295153-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 11:14:20,714 WARNING [train.py:1204] (2/4) Exclude cut with ID _3231_210_2106_1_1526205864934_6784220_171-256004-0_sp0.9 from training. Duration: 0.9555625 2023-10-02 11:14:20,714 WARNING [train.py:1204] (2/4) Exclude cut with ID _41314_210_12651_1_1533459476543_3766460_87-272068-0 from training. Duration: 0.83 2023-10-02 11:14:22,660 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_164-152777-0_sp1.1 from training. Duration: 0.92725 2023-10-02 11:14:22,727 WARNING [train.py:1204] (2/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_18-130336-0_sp0.9 from training. Duration: 0.87775 2023-10-02 11:14:24,059 WARNING [train.py:1204] (2/4) Exclude cut with ID _14886_210_4220_1_1530702020355_3516190_175-229073-0_sp1.1 from training. Duration: 0.4090625 2023-10-02 11:14:25,293 WARNING [train.py:1204] (2/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_791-45149-0_sp0.9 from training. Duration: 0.9555625 2023-10-02 11:14:25,396 WARNING [train.py:1204] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_468-189058-0_sp1.1 from training. Duration: 0.87275 2023-10-02 11:14:26,801 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8278_210_2550_1_1528637099_4220413_32-142780-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 11:14:30,786 WARNING [train.py:1204] (2/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_146-218763-0_sp0.9 from training. Duration: 0.9555625 2023-10-02 11:14:32,176 WARNING [train.py:1204] (2/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_348-127789-0_sp1.1 from training. Duration: 0.77275 2023-10-02 11:14:32,183 WARNING [train.py:1204] (2/4) Exclude cut with ID _47790_210_16187_1_1533862814066_4207519_288-285445-0_sp1.1 from training. Duration: 0.936375 2023-10-02 11:14:35,372 WARNING [train.py:1204] (2/4) Exclude cut with ID _14603_210_4220_1_1530615688441_3097319_308-181732-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 11:14:36,686 WARNING [train.py:1204] (2/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_895-117428-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 11:14:38,000 WARNING [train.py:1204] (2/4) Exclude cut with ID _9235_210_5219_1_1528891154253_8046120_387-159008-0_sp0.9 from training. Duration: 0.9555625 2023-10-02 11:14:38,018 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_37351_210_7925_1_1533012931_3812113_265-85780-0_sp0.9 from training. Duration: 0.97775 2023-10-02 11:14:39,756 WARNING [train.py:1204] (2/4) Exclude cut with ID _40551_210_10635_1_1533776424632_3416179_361-3964-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 11:14:39,793 WARNING [train.py:1204] (2/4) Exclude cut with ID _25882_210_3036_1_1532775665815_6479510_485-122742-0_sp1.1 from training. Duration: 0.97275 2023-10-02 11:14:43,969 WARNING [train.py:1204] (2/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_98-51676-0 from training. Duration: 0.73 2023-10-02 11:14:45,394 WARNING [train.py:1204] (2/4) Exclude cut with ID _30548_210_16040_1_1532608097130_3146969_371-280610-0_sp1.1 from training. Duration: 0.92725 2023-10-02 11:14:49,367 WARNING [train.py:1204] (2/4) Exclude cut with ID IC0671W0081-331924 from training. Duration: 0.896 2023-10-02 11:14:50,770 WARNING [train.py:1204] (2/4) Exclude cut with ID _40284_210_2084_1_1533513679814_3704279_380-160775-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 11:14:52,822 WARNING [train.py:1204] (2/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_57-270037-0_sp1.1 from training. Duration: 0.7 2023-10-02 11:14:52,917 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_853-150237-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 11:14:54,775 WARNING [train.py:1204] (2/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_491-261923-0 from training. Duration: 0.98 2023-10-02 11:14:58,904 WARNING [train.py:1204] (2/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_836-244045-0_sp1.1 from training. Duration: 0.97275 2023-10-02 11:15:00,217 WARNING [train.py:1204] (2/4) Exclude cut with ID _14880_210_8895_1_1530680426655_3795259_289-212817-0 from training. Duration: 0.69 2023-10-02 11:15:01,591 WARNING [train.py:1204] (2/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_303-340446-0 from training. Duration: 0.9 2023-10-02 11:15:03,023 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_37152_210_9697_1_1533106573_3844188_418-41736-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 11:15:04,701 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.bypass_mid.scale_min, batch_count=859873.3333333334, ans=0.2 2023-10-02 11:15:07,622 WARNING [train.py:1204] (2/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_371-18169-0_sp1.1 from training. Duration: 0.7818125 2023-10-02 11:15:07,657 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_187-219984-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 11:15:10,240 WARNING [train.py:1204] (2/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_63-267585-0 from training. Duration: 0.9 2023-10-02 11:15:12,216 WARNING [train.py:1204] (2/4) Exclude cut with ID _27013_210_1389_1_1532437173240_3252649_222-125573-0 from training. Duration: 0.98 2023-10-02 11:15:12,264 WARNING [train.py:1204] (2/4) Exclude cut with ID _14821_210_8750_1_1530680503991_6829359_189-297451-0 from training. Duration: 0.55 2023-10-02 11:15:13,630 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_557-163391-0_sp1.1 from training. Duration: 0.97275 2023-10-02 11:15:15,052 WARNING [train.py:1204] (2/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_65-88242-0_sp1.1 from training. Duration: 0.5090625 2023-10-02 11:15:19,143 INFO [train.py:1046] (2/4) Epoch 25, batch 1500, loss[loss=0.2095, simple_loss=0.2695, pruned_loss=0.07475, over 19737.00 frames. ], tot_loss[loss=0.1696, simple_loss=0.2473, pruned_loss=0.04597, over 4724756.71 frames. ], batch size: 388, lr: 4.12e-03, grad_scale: 8.0 2023-10-02 11:15:26,045 WARNING [train.py:1204] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_218-295097-0 from training. Duration: 0.77 2023-10-02 11:15:26,067 WARNING [train.py:1204] (2/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_686-247600-0_sp1.1 from training. Duration: 0.836375 2023-10-02 11:15:26,070 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_397-147196-0_sp0.9 from training. Duration: 0.888875 2023-10-02 11:15:26,143 WARNING [train.py:1204] (2/4) Exclude cut with ID _53950_210_5816_1_1534417264919_3862951_237-136449-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 11:15:27,400 WARNING [train.py:1204] (2/4) Exclude cut with ID _3120_210_3525_1_1526036143112_4072980_74-178400-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 11:15:28,761 WARNING [train.py:1204] (2/4) Exclude cut with ID _5166_210_2084_1_1527073197160_4087993_309-222545-0_sp0.9 from training. Duration: 0.9333125 2023-10-02 11:15:30,046 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7544_210_6753_1_1528587722_3576686_172-171750-0 from training. Duration: 0.65 2023-10-02 11:15:32,869 WARNING [train.py:1204] (2/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_3-237434-0_sp0.9 from training. Duration: 0.8444375 2023-10-02 11:15:32,908 WARNING [train.py:1204] (2/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_101-293367-0_sp0.9 from training. Duration: 0.7 2023-10-02 11:15:32,923 WARNING [train.py:1204] (2/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_818-213552-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 11:15:32,995 WARNING [train.py:1204] (2/4) Exclude cut with ID _14349_210_8341_1_1530615710818_3442470_115-285975-0_sp1.1 from training. Duration: 0.7818125 2023-10-02 11:15:35,684 WARNING [train.py:1204] (2/4) Exclude cut with ID _30800_210_22382_1_1532499347904_4515959_291-309749-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 11:15:35,766 WARNING [train.py:1204] (2/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_715-3462-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 11:15:41,705 WARNING [train.py:1204] (2/4) Exclude cut with ID _59502_210_12210_1_1535107024698_1835139_13-331827-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 11:15:41,714 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_4528_210_3273_1_1527076654_3276777_342-168068-0 from training. Duration: 0.87 2023-10-02 11:15:41,777 WARNING [train.py:1204] (2/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_215-141059-0_sp0.9 from training. Duration: 0.688875 2023-10-02 11:15:43,457 WARNING [train.py:1204] (2/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_106-286130-0_sp1.1 from training. Duration: 0.8909375 2023-10-02 11:15:44,897 WARNING [train.py:1204] (2/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_480-77655-0_sp1.1 from training. Duration: 0.97275 2023-10-02 11:15:47,665 WARNING [train.py:1204] (2/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_848-280957-0 from training. Duration: 0.87 2023-10-02 11:15:50,568 WARNING [train.py:1204] (2/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_122-109023-0 from training. Duration: 0.76 2023-10-02 11:15:51,949 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_11305_210_1794_1_1529750428_7178148_645-233480-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 11:15:53,673 WARNING [train.py:1204] (2/4) Exclude cut with ID _7562_210_2084_1_1528887767916_3946229_345-169156-0 from training. Duration: 0.58 2023-10-02 11:15:54,802 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.4.encoder.layers.0.self_attn1.whiten, num_groups=1, num_channels=384, metric=14.12 vs. limit=22.5 2023-10-02 11:15:56,799 WARNING [train.py:1204] (2/4) Exclude cut with ID _7562_210_2084_1_1528887767916_3946229_345-169156-0_sp1.1 from training. Duration: 0.52725 2023-10-02 11:15:58,154 WARNING [train.py:1204] (2/4) Exclude cut with ID _13253_210_1925_1_1530757775819_3577873_253-246386-0_sp1.1 from training. Duration: 0.8181875 2023-10-02 11:15:58,433 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.bypass.skip_rate, batch_count=860073.3333333334, ans=0.07 2023-10-02 11:15:59,380 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.537e+02 1.797e+02 1.991e+02 2.155e+02 3.181e+02, threshold=3.982e+02, percent-clipped=0.0 2023-10-02 11:15:59,470 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_136-264444-0_sp1.1 from training. Duration: 0.97275 2023-10-02 11:15:59,490 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_2168_210_2290_1_1525606920_1849400_136-222991-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 11:16:00,849 WARNING [train.py:1204] (2/4) Exclude cut with ID _7852_210_5219_1_1528286471316_8139400_242-105336-0 from training. Duration: 0.72 2023-10-02 11:16:00,922 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_5960_210_1794_1_1527403865_3990999_515-2613-0_sp1.1 from training. Duration: 0.8 2023-10-02 11:16:02,105 WARNING [train.py:1204] (2/4) Exclude cut with ID _39688_210_19596_1_1533371626725_4154159_551-167834-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 11:16:02,167 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_43802_210_2290_1_1533516203_8829504_85-195240-0 from training. Duration: 0.87 2023-10-02 11:16:03,555 WARNING [train.py:1204] (2/4) Exclude cut with ID _63171_210_15488_1_1535104792794_3295110_113-113534-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 11:16:07,652 WARNING [train.py:1204] (2/4) Exclude cut with ID _8833_210_5219_1_1528592665210_7553059_319-349181-0_sp1.1 from training. Duration: 0.9 2023-10-02 11:16:07,653 WARNING [train.py:1204] (2/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_225-43381-0 from training. Duration: 0.94 2023-10-02 11:16:13,730 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.ff2_skip_rate, batch_count=860140.0, ans=0.0 2023-10-02 11:16:14,853 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_5959_210_1794_1_1527400400_3445726_412-44052-0_sp0.9 from training. Duration: 0.8555625 2023-10-02 11:16:16,208 WARNING [train.py:1204] (2/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_356-167816-0_sp1.1 from training. Duration: 0.6545625 2023-10-02 11:16:16,359 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.1.feed_forward1.out_proj.dropout_p, batch_count=860206.6666666666, ans=0.1 2023-10-02 11:16:20,867 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9012W0112-501244 from training. Duration: 0.768 2023-10-02 11:16:20,919 WARNING [train.py:1204] (2/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_177-326952-0_sp1.1 from training. Duration: 0.92725 2023-10-02 11:16:20,922 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9016W0116-502789 from training. Duration: 0.896 2023-10-02 11:16:23,603 WARNING [train.py:1204] (2/4) Exclude cut with ID _56235_210_10640_1_1534557335235_4161099_557-204815-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 11:16:23,685 WARNING [train.py:1204] (2/4) Exclude cut with ID _55857_210_2346_1_1534644007831_6898209_279-229469-0_sp1.1 from training. Duration: 0.936375 2023-10-02 11:16:23,748 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9029W0375-509764 from training. Duration: 0.896 2023-10-02 11:16:25,743 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_2295_210_2087_1_1525500993_2407863_234-83104-0_sp0.9 from training. Duration: 0.688875 2023-10-02 11:16:29,030 WARNING [train.py:1204] (2/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_266-137104-0 from training. Duration: 0.65 2023-10-02 11:16:30,425 WARNING [train.py:1204] (2/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_289-163927-0_sp1.1 from training. Duration: 0.92725 2023-10-02 11:16:31,943 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.balancer1.prob, batch_count=860273.3333333334, ans=0.125 2023-10-02 11:16:32,936 INFO [train.py:1046] (2/4) Epoch 25, batch 1550, loss[loss=0.1634, simple_loss=0.2462, pruned_loss=0.0403, over 24651.00 frames. ], tot_loss[loss=0.1696, simple_loss=0.2475, pruned_loss=0.04587, over 4735216.10 frames. ], batch size: 65, lr: 4.12e-03, grad_scale: 8.0 2023-10-02 11:16:34,388 WARNING [train.py:1204] (2/4) Exclude cut with ID _12733_210_8341_1_1530167424556_3646380_86-225314-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 11:16:34,410 WARNING [train.py:1204] (2/4) Exclude cut with ID _7877_210_6014_1_1528286358099_3750136_618-284064-0_sp1.1 from training. Duration: 0.92725 2023-10-02 11:16:34,460 WARNING [train.py:1204] (2/4) Exclude cut with ID _55844_210_2346_1_1534471212569_7625707_1017-31469-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 11:16:34,490 WARNING [train.py:1204] (2/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_617-241446-0_sp1.1 from training. Duration: 0.92725 2023-10-02 11:16:35,853 WARNING [train.py:1204] (2/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_866-173440-0_sp1.1 from training. Duration: 0.8181875 2023-10-02 11:16:38,547 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_9460_210_4211_1_1528954587_10049667_480-260269-0 from training. Duration: 0.56 2023-10-02 11:16:38,584 WARNING [train.py:1204] (2/4) Exclude cut with ID _44659_210_2721_1_1534036120061_6614860_626-298884-0 from training. Duration: 0.97 2023-10-02 11:16:38,617 WARNING [train.py:1204] (2/4) Exclude cut with ID _53565_210_10635_1_1534337218335_3820259_287-18120-0_sp1.1 from training. Duration: 0.8818125 2023-10-02 11:16:40,048 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_791-197891-0 from training. Duration: 0.84 2023-10-02 11:16:40,099 WARNING [train.py:1204] (2/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_527-238990-0 from training. Duration: 0.9 2023-10-02 11:16:42,714 WARNING [train.py:1204] (2/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_449-65969-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 11:16:44,214 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_553-341452-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 11:16:44,255 WARNING [train.py:1204] (2/4) Exclude cut with ID _16909_210_5724_1_1531292079912_4118919_300-47574-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 11:16:45,520 WARNING [train.py:1204] (2/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_70-27732-0_sp1.1 from training. Duration: 0.8545625 2023-10-02 11:16:45,621 WARNING [train.py:1204] (2/4) Exclude cut with ID _44718_210_5816_1_1534050051869_6469030_727-28074-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 11:16:47,488 WARNING [train.py:1204] (2/4) Exclude cut with ID _55857_210_2346_1_1534644007831_6898209_357-123664-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 11:16:50,772 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9123W0004-555964 from training. Duration: 0.896 2023-10-02 11:16:50,792 WARNING [train.py:1204] (2/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_445-145208-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 11:16:50,829 WARNING [train.py:1204] (2/4) Exclude cut with ID _3675_210_4557_1_1526639934408_4182089_456-18305-0_sp1.1 from training. Duration: 0.6909375 2023-10-02 11:16:52,188 WARNING [train.py:1204] (2/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_1-99680-0_sp1.1 from training. Duration: 0.6090625 2023-10-02 11:16:53,608 WARNING [train.py:1204] (2/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_146-282400-0_sp0.9 from training. Duration: 0.788875 2023-10-02 11:16:53,630 WARNING [train.py:1204] (2/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_844-96989-0 from training. Duration: 0.71 2023-10-02 11:16:55,064 WARNING [train.py:1204] (2/4) Exclude cut with ID _29853_210_7497_1_1532505600851_6642110_128-146364-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 11:16:56,919 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_224-322394-0 from training. Duration: 0.66 2023-10-02 11:16:57,016 WARNING [train.py:1204] (2/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_191-59784-0 from training. Duration: 0.88 2023-10-02 11:16:58,261 WARNING [train.py:1204] (2/4) Exclude cut with ID _12556_210_8341_1_1530081024100_7327690_725-193477-0 from training. Duration: 0.98 2023-10-02 11:16:58,297 WARNING [train.py:1204] (2/4) Exclude cut with ID _5166_210_2084_1_1527073197160_4087993_523-79838-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 11:16:59,045 WARNING [train.py:1204] (2/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_398-334558-0_sp1.1 from training. Duration: 0.87275 2023-10-02 11:16:59,196 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.balancer1.prob, batch_count=860340.0, ans=0.125 2023-10-02 11:17:01,731 WARNING [train.py:1204] (2/4) Exclude cut with ID _6456_210_1534_1_1527681634212_3181049_171-36697-0_sp1.1 from training. Duration: 0.936375 2023-10-02 11:17:03,193 WARNING [train.py:1204] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_700-183549-0 from training. Duration: 0.68 2023-10-02 11:17:03,195 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8853_210_3273_1_1528633899_818968_51-187685-0 from training. Duration: 0.69 2023-10-02 11:17:07,452 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.bypass_mid.scale_min, batch_count=860406.6666666666, ans=0.2 2023-10-02 11:17:10,210 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.conv_module1.balancer2.prob, batch_count=860406.6666666666, ans=0.125 2023-10-02 11:17:11,235 WARNING [train.py:1204] (2/4) Exclude cut with ID _7719_210_3525_1_1528456153163_4296960_333-110399-0_sp1.1 from training. Duration: 0.87275 2023-10-02 11:17:15,310 WARNING [train.py:1204] (2/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_1022-83740-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 11:17:15,340 WARNING [train.py:1204] (2/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_61-327062-0_sp0.9 from training. Duration: 0.9 2023-10-02 11:17:15,346 WARNING [train.py:1204] (2/4) Exclude cut with ID _47251_210_18628_1_1533814063128_4137250_214-88076-0_sp1.1 from training. Duration: 0.8 2023-10-02 11:17:16,671 WARNING [train.py:1204] (2/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_839-173017-0 from training. Duration: 0.41 2023-10-02 11:17:22,642 WARNING [train.py:1204] (2/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_299-34413-0_sp1.1 from training. Duration: 0.5909375 2023-10-02 11:17:22,747 WARNING [train.py:1204] (2/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_664-159457-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 11:17:22,968 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=860473.3333333334, ans=0.1 2023-10-02 11:17:26,115 WARNING [train.py:1204] (2/4) Exclude cut with ID _66873_210_5517_1_1535454136506_4293370_483-291760-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 11:17:29,432 WARNING [train.py:1204] (2/4) Exclude cut with ID _30800_210_22382_1_1532499347904_4515959_287-255474-0_sp1.1 from training. Duration: 0.92725 2023-10-02 11:17:29,481 WARNING [train.py:1204] (2/4) Exclude cut with ID _48677_210_5663_1_1533902405919_3892989_111-11439-0_sp1.1 from training. Duration: 0.87275 2023-10-02 11:17:29,495 WARNING [train.py:1204] (2/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_399-156791-0 from training. Duration: 0.83 2023-10-02 11:17:30,737 WARNING [train.py:1204] (2/4) Exclude cut with ID _3992_210_1747_1_1526644816031_3731060_36-147855-0_sp0.9 from training. Duration: 0.8666875 2023-10-02 11:17:31,029 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.feed_forward3.hidden_balancer.prob, batch_count=860540.0, ans=0.125 2023-10-02 11:17:32,201 WARNING [train.py:1204] (2/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_442-320317-0_sp1.1 from training. Duration: 0.7454375 2023-10-02 11:17:32,227 WARNING [train.py:1204] (2/4) Exclude cut with ID _55429_210_10626_1_1534467588527_7174143_953-80868-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 11:17:32,287 WARNING [train.py:1204] (2/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_159-328157-0_sp0.9 from training. Duration: 0.57775 2023-10-02 11:17:32,300 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9309W0197-640324 from training. Duration: 0.896 2023-10-02 11:17:35,000 WARNING [train.py:1204] (2/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_361-30822-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 11:17:37,979 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.feed_forward2.hidden_balancer.prob, batch_count=860540.0, ans=0.125 2023-10-02 11:17:40,400 WARNING [train.py:1204] (2/4) Exclude cut with ID _5250_210_5219_1_1527249561806_8692110_636-251213-0 from training. Duration: 0.94 2023-10-02 11:17:43,331 WARNING [train.py:1204] (2/4) Exclude cut with ID _49728_210_12651_1_1534041034294_6788150_468-333827-0_sp1.1 from training. Duration: 0.963625 2023-10-02 11:17:45,898 INFO [train.py:1046] (2/4) Epoch 25, batch 1600, loss[loss=0.1554, simple_loss=0.2387, pruned_loss=0.03603, over 24340.00 frames. ], tot_loss[loss=0.1707, simple_loss=0.2484, pruned_loss=0.04648, over 4725633.11 frames. ], batch size: 61, lr: 4.12e-03, grad_scale: 16.0 2023-10-02 11:17:45,975 WARNING [train.py:1204] (2/4) Exclude cut with ID _25882_210_3036_1_1532775665815_6479510_623-191759-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 11:17:46,032 WARNING [train.py:1204] (2/4) Exclude cut with ID _5189_210_4557_1_1527248319428_3769229_199-53526-0 from training. Duration: 0.42 2023-10-02 11:17:46,202 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.self_attn_weights.pos_emb_skip_rate, batch_count=860606.6666666666, ans=0.0 2023-10-02 11:17:47,382 WARNING [train.py:1204] (2/4) Exclude cut with ID _6274_210_2084_1_1527937583837_7494020_32-205288-0_sp0.9 from training. Duration: 0.8666875 2023-10-02 11:17:49,338 WARNING [train.py:1204] (2/4) Exclude cut with ID _65263_210_22356_1_1535763598691_3921037_401-346646-0_sp1.1 from training. Duration: 0.963625 2023-10-02 11:17:49,350 WARNING [train.py:1204] (2/4) Exclude cut with ID _12555_210_8750_1_1530075594402_3780239_449-267986-0_sp1.1 from training. Duration: 0.7545625 2023-10-02 11:17:49,370 WARNING [train.py:1204] (2/4) Exclude cut with ID _69884_210_9771_1_1535846392818_4166169_430-201020-0_sp1.1 from training. Duration: 0.8818125 2023-10-02 11:17:51,151 WARNING [train.py:1204] (2/4) Exclude cut with ID _8394_210_6702_1_1528590504316_3540189_379-49457-0_sp1.1 from training. Duration: 0.7909375 2023-10-02 11:17:53,871 WARNING [train.py:1204] (2/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_200-105218-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 11:17:55,781 WARNING [train.py:1204] (2/4) Exclude cut with ID _3867_210_1883_1_1526641200575_4475830_319-349850-0 from training. Duration: 0.67 2023-10-02 11:17:55,866 WARNING [train.py:1204] (2/4) Exclude cut with ID _28317_210_12742_1_1532480302381_8510325_916-195848-0 from training. Duration: 0.92 2023-10-02 11:17:58,468 WARNING [train.py:1204] (2/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_436-186311-0 from training. Duration: 0.65 2023-10-02 11:18:00,478 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_3910_210_1794_1_1526777667_3747260_302-180617-0_sp1.1 from training. Duration: 0.97275 2023-10-02 11:18:01,861 WARNING [train.py:1204] (2/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_102-92751-0 from training. Duration: 0.99 2023-10-02 11:18:03,141 WARNING [train.py:1204] (2/4) Exclude cut with ID _51899_210_20034_1_1534129254466_3669220_265-215428-0_sp1.1 from training. Duration: 0.8909375 2023-10-02 11:18:05,822 WARNING [train.py:1204] (2/4) Exclude cut with ID _58583_210_5236_1_1534726835266_3574249_438-198993-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 11:18:08,641 WARNING [train.py:1204] (2/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_166-166990-0_sp1.1 from training. Duration: 0.87275 2023-10-02 11:18:10,276 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.conv_module1.balancer2.prob, batch_count=860673.3333333334, ans=0.125 2023-10-02 11:18:11,380 WARNING [train.py:1204] (2/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_907-151398-0 from training. Duration: 0.37 2023-10-02 11:18:11,656 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.ff2_skip_rate, batch_count=860673.3333333334, ans=0.0 2023-10-02 11:18:14,257 WARNING [train.py:1204] (2/4) Exclude cut with ID _38670_210_15379_1_1533448810659_4391059_452-256669-0_sp1.1 from training. Duration: 0.936375 2023-10-02 11:18:14,290 WARNING [train.py:1204] (2/4) Exclude cut with ID _48677_210_5663_1_1533902405919_3892989_408-40740-0 from training. Duration: 0.83 2023-10-02 11:18:15,561 WARNING [train.py:1204] (2/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_903-158756-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 11:18:15,608 WARNING [train.py:1204] (2/4) Exclude cut with ID _13252_210_1925_1_1530671372047_3336909_397-216497-0 from training. Duration: 0.96 2023-10-02 11:18:17,858 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.1.encoder.layers.0.feed_forward3.out_whiten, num_groups=1, num_channels=256, metric=11.16 vs. limit=15.0 2023-10-02 11:18:19,108 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.bypass.skip_rate, batch_count=860740.0, ans=0.09899494936611666 2023-10-02 11:18:20,319 WARNING [train.py:1204] (2/4) Exclude cut with ID _55429_210_10626_1_1534467588527_7174143_97-335084-0 from training. Duration: 0.97 2023-10-02 11:18:26,761 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.576e+02 1.917e+02 2.071e+02 2.381e+02 2.970e+02, threshold=4.142e+02, percent-clipped=0.0 2023-10-02 11:18:28,195 WARNING [train.py:1204] (2/4) Exclude cut with ID _45329_210_7005_1_1534157951211_6774339_453-267730-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 11:18:28,906 WARNING [train.py:1204] (2/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_153-97441-0 from training. Duration: 0.56 2023-10-02 11:18:28,965 WARNING [train.py:1204] (2/4) Exclude cut with ID _40901_210_5665_1_1533363789958_3856130_312-329122-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 11:18:29,002 WARNING [train.py:1204] (2/4) Exclude cut with ID _30800_210_22382_1_1532499347904_4515959_97-126471-0_sp1.1 from training. Duration: 0.97275 2023-10-02 11:18:29,002 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_11305_210_1794_1_1529750428_7178148_257-312870-0_sp1.1 from training. Duration: 0.8 2023-10-02 11:18:30,550 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.balancer2.prob, batch_count=860806.6666666666, ans=0.125 2023-10-02 11:18:33,005 WARNING [train.py:1204] (2/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_454-314901-0_sp0.9 from training. Duration: 0.511125 2023-10-02 11:18:37,063 WARNING [train.py:1204] (2/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_282-136641-0_sp1.1 from training. Duration: 0.3818125 2023-10-02 11:18:39,783 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_46013_210_7234_1_1533776362_3744032_493-160724-0_sp1.1 from training. Duration: 0.963625 2023-10-02 11:18:39,816 WARNING [train.py:1204] (2/4) Exclude cut with ID _67831_210_12392_1_1535612464202_3092612_259-33442-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 11:18:39,867 WARNING [train.py:1204] (2/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_289-62384-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 11:18:41,154 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6545_210_2040_1_1527774670_4245258_379-14182-0_sp0.9 from training. Duration: 0.888875 2023-10-02 11:18:42,562 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_548-294316-0_sp1.1 from training. Duration: 0.736375 2023-10-02 11:18:44,035 WARNING [train.py:1204] (2/4) Exclude cut with ID _6275_210_2084_1_1528110062213_7669049_857-298942-0_sp1.1 from training. Duration: 0.77275 2023-10-02 11:18:44,148 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6673_210_3273_1_1527767790_3173240_327-306185-0_sp1.1 from training. Duration: 0.7545625 2023-10-02 11:18:50,119 WARNING [train.py:1204] (2/4) Exclude cut with ID _61008_210_10633_1_1534852769198_6974370_814-107647-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 11:18:51,824 WARNING [train.py:1204] (2/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_116-332008-0_sp1.1 from training. Duration: 0.8909375 2023-10-02 11:18:54,940 WARNING [train.py:1204] (2/4) Exclude cut with ID _32704_210_8954_1_1533017488840_6747370_266-329755-0 from training. Duration: 0.89 2023-10-02 11:18:54,949 WARNING [train.py:1204] (2/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_271-269084-0_sp0.9 from training. Duration: 0.92225 2023-10-02 11:18:57,488 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_3171_210_2087_1_1526109084_2330007_66-260583-0 from training. Duration: 0.72 2023-10-02 11:19:00,261 INFO [train.py:1046] (2/4) Epoch 25, batch 1650, loss[loss=0.1587, simple_loss=0.2346, pruned_loss=0.04136, over 24333.00 frames. ], tot_loss[loss=0.1716, simple_loss=0.2493, pruned_loss=0.0469, over 4726973.86 frames. ], batch size: 56, lr: 4.12e-03, grad_scale: 16.0 2023-10-02 11:19:01,041 WARNING [train.py:1204] (2/4) Exclude cut with ID _5250_210_5219_1_1527249561806_8692110_1326-276386-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 11:19:01,194 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.conv_module2.balancer1.prob, batch_count=860940.0, ans=0.125 2023-10-02 11:19:02,418 WARNING [train.py:1204] (2/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_372-30302-0_sp1.1 from training. Duration: 0.7818125 2023-10-02 11:19:03,791 WARNING [train.py:1204] (2/4) Exclude cut with ID _25882_210_3036_1_1532775665815_6479510_631-257029-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 11:19:03,807 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_3691_210_3528_1_1526468169_4062053_355-334432-0 from training. Duration: 0.87 2023-10-02 11:19:03,815 WARNING [train.py:1204] (2/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_839-173017-0_sp1.1 from training. Duration: 0.37275 2023-10-02 11:19:03,823 WARNING [train.py:1204] (2/4) Exclude cut with ID _14998_210_5219_1_1530705582080_7474970_441-91466-0 from training. Duration: 0.97 2023-10-02 11:19:03,862 WARNING [train.py:1204] (2/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_108-53929-0 from training. Duration: 0.59 2023-10-02 11:19:07,818 WARNING [train.py:1204] (2/4) Exclude cut with ID _9389_210_1664_1_1529217337301_5289269_260-1942-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 11:19:07,876 WARNING [train.py:1204] (2/4) Exclude cut with ID _56423_210_5854_1_1534896061049_3165969_198-61547-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 11:19:07,905 WARNING [train.py:1204] (2/4) Exclude cut with ID _44778_210_2054_1_1533798127203_7443090_412-142122-0_sp1.1 from training. Duration: 0.92725 2023-10-02 11:19:07,922 WARNING [train.py:1204] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_88-341718-0_sp1.1 from training. Duration: 0.6 2023-10-02 11:19:10,667 WARNING [train.py:1204] (2/4) Exclude cut with ID _61079_210_4525_1_1534921008198_4011419_334-298697-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 11:19:13,353 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_5465_210_4211_1_1527227253_55357_8-2499-0_sp1.1 from training. Duration: 0.996375 2023-10-02 11:19:14,804 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_447-201565-0_sp1.1 from training. Duration: 0.97275 2023-10-02 11:19:14,816 WARNING [train.py:1204] (2/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_253-161789-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 11:19:14,819 WARNING [train.py:1204] (2/4) Exclude cut with ID _44304_210_15374_1_1533639352765_4613129_302-22084-0_sp0.9 from training. Duration: 0.9555625 2023-10-02 11:19:16,150 WARNING [train.py:1204] (2/4) Exclude cut with ID _26664_210_7770_1_1532258943280_3418270_187-80815-0_sp1.1 from training. Duration: 0.8454375 2023-10-02 11:19:16,229 WARNING [train.py:1204] (2/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_68-212801-0 from training. Duration: 0.73 2023-10-02 11:19:17,471 WARNING [train.py:1204] (2/4) Exclude cut with ID _29345_210_4381_1_1532689168263_3860169_149-257268-0 from training. Duration: 0.81 2023-10-02 11:19:25,710 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_213-103282-0_sp1.1 from training. Duration: 0.7090625 2023-10-02 11:19:27,082 WARNING [train.py:1204] (2/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_156-302675-0_sp1.1 from training. Duration: 0.5 2023-10-02 11:19:33,432 WARNING [train.py:1204] (2/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_125-322172-0 from training. Duration: 0.93 2023-10-02 11:19:33,492 WARNING [train.py:1204] (2/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_493-60474-0_sp1.1 from training. Duration: 0.963625 2023-10-02 11:19:33,711 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.attention_skip_rate, batch_count=861073.3333333334, ans=0.0 2023-10-02 11:19:36,177 WARNING [train.py:1204] (2/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_156-302675-0 from training. Duration: 0.55 2023-10-02 11:19:37,763 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=861073.3333333334, ans=0.1 2023-10-02 11:19:38,882 WARNING [train.py:1204] (2/4) Exclude cut with ID _41314_210_12651_1_1533459476543_3766460_123-267862-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 11:19:41,706 WARNING [train.py:1204] (2/4) Exclude cut with ID _28427_210_4220_1_1532581240967_3061069_201-143747-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 11:19:42,973 WARNING [train.py:1204] (2/4) Exclude cut with ID _8772_210_1850_1_1528635595972_3664011_96-12756-0_sp1.1 from training. Duration: 0.8909375 2023-10-02 11:19:43,007 WARNING [train.py:1204] (2/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_1347-145675-0_sp1.1 from training. Duration: 0.936375 2023-10-02 11:19:43,168 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.bypass.scale_min, batch_count=861140.0, ans=0.2 2023-10-02 11:19:45,579 WARNING [train.py:1204] (2/4) Exclude cut with ID _9235_210_5219_1_1528891154253_8046120_387-159008-0_sp1.1 from training. Duration: 0.7818125 2023-10-02 11:19:45,605 WARNING [train.py:1204] (2/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_400-201896-0_sp1.1 from training. Duration: 0.963625 2023-10-02 11:19:48,472 WARNING [train.py:1204] (2/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_326-174612-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 11:19:49,823 WARNING [train.py:1204] (2/4) Exclude cut with ID _55844_210_2346_1_1534471212569_7625707_390-116233-0_sp1.1 from training. Duration: 0.963625 2023-10-02 11:19:49,849 WARNING [train.py:1204] (2/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_419-317062-0_sp1.1 from training. Duration: 0.82725 2023-10-02 11:19:49,894 WARNING [train.py:1204] (2/4) Exclude cut with ID _29877_210_6228_1_1532397791106_5276149_266-62291-0_sp0.9 from training. Duration: 0.9666875 2023-10-02 11:19:51,759 WARNING [train.py:1204] (2/4) Exclude cut with ID _7857_210_1534_1_1528286515745_6831470_431-338528-0_sp1.1 from training. Duration: 0.92725 2023-10-02 11:19:51,826 WARNING [train.py:1204] (2/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_87-131427-0_sp0.9 from training. Duration: 0.8666875 2023-10-02 11:19:55,242 WARNING [train.py:1204] (2/4) Exclude cut with ID _24055_210_12651_1_1532310907143_976979_92-59356-0_sp1.1 from training. Duration: 0.82725 2023-10-02 11:19:57,222 WARNING [train.py:1204] (2/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_96-290039-0 from training. Duration: 0.56 2023-10-02 11:19:58,642 WARNING [train.py:1204] (2/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_48-182155-0_sp0.9 from training. Duration: 0.9666875 2023-10-02 11:19:58,789 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.0.conv_module1.balancer2.prob, batch_count=861206.6666666666, ans=0.125 2023-10-02 11:19:59,946 WARNING [train.py:1204] (2/4) Exclude cut with ID _4626_210_3532_1_1526817652717_3916859_446-206656-0 from training. Duration: 0.76 2023-10-02 11:20:00,033 WARNING [train.py:1204] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_168-32906-0 from training. Duration: 0.69 2023-10-02 11:20:00,054 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_3780_210_2087_1_1526467760_2130483_170-259044-0 from training. Duration: 0.7 2023-10-02 11:20:01,771 WARNING [train.py:1204] (2/4) Exclude cut with ID _65134_210_14763_1_1535335393985_3505368_153-216718-0_sp1.1 from training. Duration: 0.92725 2023-10-02 11:20:01,832 WARNING [train.py:1204] (2/4) Exclude cut with ID _64639_210_5816_1_1535367619437_3528000_461-329156-0_sp1.1 from training. Duration: 0.87275 2023-10-02 11:20:01,873 WARNING [train.py:1204] (2/4) Exclude cut with ID _21989_210_5854_1_1531899214931_3271360_126-347531-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 11:20:01,909 WARNING [train.py:1204] (2/4) Exclude cut with ID _7614_210_4915_1_1529326764092_4537359_96-162197-0_sp1.1 from training. Duration: 0.963625 2023-10-02 11:20:01,910 WARNING [train.py:1204] (2/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_1083-147536-0 from training. Duration: 0.97 2023-10-02 11:20:06,056 WARNING [train.py:1204] (2/4) Exclude cut with ID _64029_210_4381_1_1535178502772_3756459_275-16710-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 11:20:07,426 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7264_210_4938_1_1527939911_8132095_800-91995-0_sp0.9 from training. Duration: 0.9555625 2023-10-02 11:20:07,465 WARNING [train.py:1204] (2/4) Exclude cut with ID _3844_210_1390_1_1526727803582_2871209_50-193879-0_sp1.1 from training. Duration: 0.936375 2023-10-02 11:20:08,222 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.1.encoder.layers.0.self_attn2.whiten, num_groups=1, num_channels=256, metric=12.50 vs. limit=22.5 2023-10-02 11:20:10,207 WARNING [train.py:1204] (2/4) Exclude cut with ID _46952_210_5986_1_1534230010357_3107379_232-344503-0 from training. Duration: 0.96 2023-10-02 11:20:13,176 INFO [scaling.py:1118] (2/4) WithLoss: name=encoder.encoders.5.encoder.layers.1.self_attn_weights, loss-sum=0.000e+00 2023-10-02 11:20:14,107 INFO [train.py:1046] (2/4) Epoch 25, batch 1700, loss[loss=0.1562, simple_loss=0.2422, pruned_loss=0.0351, over 24663.00 frames. ], tot_loss[loss=0.1711, simple_loss=0.2486, pruned_loss=0.04681, over 4725197.49 frames. ], batch size: 65, lr: 4.12e-03, grad_scale: 16.0 2023-10-02 11:20:15,579 WARNING [train.py:1204] (2/4) Exclude cut with ID _25808_210_13292_1_1532343514562_7366109_206-14076-0_sp1.1 from training. Duration: 0.936375 2023-10-02 11:20:15,580 WARNING [train.py:1204] (2/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_55-31904-0_sp0.9 from training. Duration: 0.97775 2023-10-02 11:20:15,606 WARNING [train.py:1204] (2/4) Exclude cut with ID _14632_210_9988_1_1530691279392_3923200_161-129530-0 from training. Duration: 0.96 2023-10-02 11:20:16,910 WARNING [train.py:1204] (2/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_770-200430-0_sp1.1 from training. Duration: 0.8090625 2023-10-02 11:20:16,920 WARNING [train.py:1204] (2/4) Exclude cut with ID _6270_210_2084_1_1527850911573_3856420_26-277343-0_sp1.1 from training. Duration: 0.7454375 2023-10-02 11:20:16,921 WARNING [train.py:1204] (2/4) Exclude cut with ID _69884_210_9771_1_1535846392818_4166169_88-23776-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 11:20:18,456 WARNING [train.py:1204] (2/4) Exclude cut with ID _60860_210_8254_1_1534896015875_3557169_245-256666-0_sp1.1 from training. Duration: 0.8818125 2023-10-02 11:20:18,502 WARNING [train.py:1204] (2/4) Exclude cut with ID _5250_210_5219_1_1527249561806_8692110_636-251213-0_sp1.1 from training. Duration: 0.8545625 2023-10-02 11:20:18,516 WARNING [train.py:1204] (2/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_540-131164-0 from training. Duration: 0.92 2023-10-02 11:20:21,370 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_5931_210_2040_1_1527428481_4547999_321-193614-0_sp0.9 from training. Duration: 0.7444375 2023-10-02 11:20:29,460 WARNING [train.py:1204] (2/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_373-225403-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 11:20:30,233 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.balancer2.prob, batch_count=861340.0, ans=0.125 2023-10-02 11:20:32,750 WARNING [train.py:1204] (2/4) Exclude cut with ID _20622_210_3852_1_1531622427080_1093839_57-326027-0_sp1.1 from training. Duration: 0.97275 2023-10-02 11:20:34,429 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.conv_skip_rate, batch_count=861340.0, ans=0.0 2023-10-02 11:20:35,705 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.conv_module1.balancer1.prob, batch_count=861340.0, ans=0.125 2023-10-02 11:20:38,197 WARNING [train.py:1204] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_605-257205-0_sp1.1 from training. Duration: 0.536375 2023-10-02 11:20:38,224 WARNING [train.py:1204] (2/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_442-320317-0_sp0.9 from training. Duration: 0.911125 2023-10-02 11:20:39,074 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.0.layers.1.whiten, num_groups=1, num_channels=192, metric=4.15 vs. limit=12.0 2023-10-02 11:20:39,484 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_35361_210_4238_1_1533262810_7793476_675-87012-0_sp1.1 from training. Duration: 0.8090625 2023-10-02 11:20:39,509 WARNING [train.py:1204] (2/4) Exclude cut with ID _4184_210_2550_1_1526730192013_2219380_306-44736-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 11:20:41,052 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_124-113562-0 from training. Duration: 0.94 2023-10-02 11:20:43,663 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_2821_210_2715_1_1526108385_3763560_73-296673-0_sp0.9 from training. Duration: 0.92225 2023-10-02 11:20:43,679 WARNING [train.py:1204] (2/4) Exclude cut with ID _10301_210_5886_1_1529222275332_3910620_216-46959-0_sp1.1 from training. Duration: 0.92725 2023-10-02 11:20:45,087 WARNING [train.py:1204] (2/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_91-277684-0_sp0.9 from training. Duration: 0.82225 2023-10-02 11:20:46,440 WARNING [train.py:1204] (2/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_351-294650-0_sp0.9 from training. Duration: 0.7 2023-10-02 11:20:47,875 WARNING [train.py:1204] (2/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_209-205365-0 from training. Duration: 0.79 2023-10-02 11:20:47,938 WARNING [train.py:1204] (2/4) Exclude cut with ID _14603_210_4220_1_1530615688441_3097319_97-325053-0 from training. Duration: 0.92 2023-10-02 11:20:49,370 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_49348_210_7927_1_1533954500_3691300_109-276430-0_sp1.1 from training. Duration: 0.92725 2023-10-02 11:20:49,529 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=861406.6666666666, ans=0.1 2023-10-02 11:20:49,642 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.feed_forward1.hidden_balancer.prob, batch_count=861406.6666666666, ans=0.125 2023-10-02 11:20:51,192 WARNING [train.py:1204] (2/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_912-47473-0 from training. Duration: 0.68 2023-10-02 11:20:51,280 WARNING [train.py:1204] (2/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_96-320736-0_sp0.9 from training. Duration: 0.9555625 2023-10-02 11:20:55,050 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.536e+02 1.911e+02 2.075e+02 2.352e+02 2.964e+02, threshold=4.149e+02, percent-clipped=0.0 2023-10-02 11:20:58,127 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.bypass_mid.scale_min, batch_count=861473.3333333334, ans=0.2 2023-10-02 11:20:59,374 WARNING [train.py:1204] (2/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_1252-235481-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 11:21:01,191 WARNING [train.py:1204] (2/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_813-261554-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 11:21:01,264 WARNING [train.py:1204] (2/4) Exclude cut with ID _21632_210_2253_1_1531994001765_3744140_110-274654-0_sp0.9 from training. Duration: 0.911125 2023-10-02 11:21:02,783 WARNING [train.py:1204] (2/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_381-317791-0_sp1.1 from training. Duration: 0.663625 2023-10-02 11:21:02,797 WARNING [train.py:1204] (2/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_51-117576-0_sp1.1 from training. Duration: 0.363625 2023-10-02 11:21:02,820 WARNING [train.py:1204] (2/4) Exclude cut with ID _42321_210_7770_1_1533468631352_4366895_104-215915-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 11:21:05,464 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_40896_210_15710_1_1533433956_7100802_216-22824-0_sp1.1 from training. Duration: 0.92725 2023-10-02 11:21:05,465 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_14831_210_7927_1_1530700200_5583610_181-27467-0 from training. Duration: 0.91 2023-10-02 11:21:06,724 WARNING [train.py:1204] (2/4) Exclude cut with ID _30304_210_6319_1_1532476827261_7582060_1024-265645-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 11:21:06,726 WARNING [train.py:1204] (2/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_412-233904-0_sp1.1 from training. Duration: 0.936375 2023-10-02 11:21:06,752 WARNING [train.py:1204] (2/4) Exclude cut with ID _30191_210_1664_1_1533189915047_7196670_911-223461-0_sp1.1 from training. Duration: 0.92725 2023-10-02 11:21:06,760 WARNING [train.py:1204] (2/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_558-148266-0_sp1.1 from training. Duration: 0.963625 2023-10-02 11:21:09,541 WARNING [train.py:1204] (2/4) Exclude cut with ID _58480_210_12392_1_1534811121111_3646160_212-164495-0_sp1.1 from training. Duration: 0.936375 2023-10-02 11:21:09,542 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_454-276429-0_sp0.9 from training. Duration: 0.9666875 2023-10-02 11:21:10,981 WARNING [train.py:1204] (2/4) Exclude cut with ID _24736_210_3279_1_1532332894218_2838700_73-197086-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 11:21:11,042 WARNING [train.py:1204] (2/4) Exclude cut with ID _5294_210_1747_1_1527249643928_3696179_529-242298-0_sp1.1 from training. Duration: 0.863625 2023-10-02 11:21:12,277 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_53555_210_5632_1_1534595220_7232297_345-171438-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 11:21:15,064 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_28211_210_6388_1_1532861506_4523224_22-25766-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 11:21:15,144 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7002_210_1794_1_1527939683_5157286_163-87552-0 from training. Duration: 0.86 2023-10-02 11:21:18,338 WARNING [train.py:1204] (2/4) Exclude cut with ID _56927_210_9969_1_1534848970840_3695229_409-225129-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 11:21:19,597 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_26192_210_7927_1_1532137269_1040115_21-219331-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 11:21:22,866 WARNING [train.py:1204] (2/4) Exclude cut with ID _15012_210_1968_1_1530856820429_5095350_662-210469-0 from training. Duration: 0.57 2023-10-02 11:21:28,804 INFO [train.py:1046] (2/4) Epoch 25, batch 1750, loss[loss=0.154, simple_loss=0.2434, pruned_loss=0.03227, over 24661.00 frames. ], tot_loss[loss=0.17, simple_loss=0.2475, pruned_loss=0.04626, over 4726285.19 frames. ], batch size: 65, lr: 4.11e-03, grad_scale: 16.0 2023-10-02 11:21:28,866 WARNING [train.py:1204] (2/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_582-191016-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 11:21:30,298 WARNING [train.py:1204] (2/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_270-210032-0_sp1.1 from training. Duration: 0.963625 2023-10-02 11:21:32,254 WARNING [train.py:1204] (2/4) Exclude cut with ID _6789_210_1534_1_1527857937218_3118950_262-48853-0_sp0.9 from training. Duration: 0.62225 2023-10-02 11:21:32,309 WARNING [train.py:1204] (2/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_847-41334-0 from training. Duration: 0.44 2023-10-02 11:21:32,335 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_40896_210_15710_1_1533433956_7100802_573-46532-0_sp1.1 from training. Duration: 0.92725 2023-10-02 11:21:35,060 WARNING [train.py:1204] (2/4) Exclude cut with ID _21881_210_3525_1_1531825242757_3798380_247-102591-0_sp1.1 from training. Duration: 0.8909375 2023-10-02 11:21:35,093 WARNING [train.py:1204] (2/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_200-21395-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 11:21:39,208 WARNING [train.py:1204] (2/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_711-15688-0 from training. Duration: 0.43 2023-10-02 11:21:40,622 WARNING [train.py:1204] (2/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_53-75025-0_sp1.1 from training. Duration: 0.963625 2023-10-02 11:21:43,293 WARNING [train.py:1204] (2/4) Exclude cut with ID _6194_210_1534_1_1527511826160_3941194_321-148641-0 from training. Duration: 0.64 2023-10-02 11:21:43,316 WARNING [train.py:1204] (2/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_337-294431-0_sp1.1 from training. Duration: 0.936375 2023-10-02 11:21:43,541 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.nonlin_attention.balancer.prob, batch_count=861673.3333333334, ans=0.125 2023-10-02 11:21:44,740 WARNING [train.py:1204] (2/4) Exclude cut with ID _21632_210_2253_1_1531994001765_3744140_110-274654-0_sp1.1 from training. Duration: 0.7454375 2023-10-02 11:21:47,486 WARNING [train.py:1204] (2/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_135-174080-0_sp1.1 from training. Duration: 0.4818125 2023-10-02 11:21:47,691 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.nonlin_attention.balancer.min_positive, batch_count=861673.3333333334, ans=0.05 2023-10-02 11:21:49,460 WARNING [train.py:1204] (2/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_21-174514-0 from training. Duration: 0.43 2023-10-02 11:21:50,879 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_38965_210_4852_1_1533174906_3484735_215-294589-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 11:21:50,909 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_548-294316-0 from training. Duration: 0.81 2023-10-02 11:21:59,722 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_103-84010-0_sp1.1 from training. Duration: 0.62725 2023-10-02 11:22:02,407 WARNING [train.py:1204] (2/4) Exclude cut with ID _18199_210_6109_1_1531475789864_4288790_295-198247-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 11:22:02,434 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_13757_210_3528_1_1530440682_4107239_103-313224-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 11:22:03,987 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.conv_skip_rate, batch_count=861740.0, ans=0.0 2023-10-02 11:22:06,658 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_34886_210_10633_1_1533203882_1755661_67-294673-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 11:22:07,998 WARNING [train.py:1204] (2/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_350-101734-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 11:22:08,857 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.2.encoder.layers.0.self_attn_weights.whiten_keys, num_groups=4, num_channels=128, metric=4.60 vs. limit=6.0 2023-10-02 11:22:09,374 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_37357_210_17024_1_1533089504_2316121_204-153986-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 11:22:10,809 WARNING [train.py:1204] (2/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_673-115569-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 11:22:13,553 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_489-230703-0_sp0.9 from training. Duration: 0.9444375 2023-10-02 11:22:13,606 WARNING [train.py:1204] (2/4) Exclude cut with ID _5250_210_5219_1_1527249561806_8692110_575-102496-0_sp1.1 from training. Duration: 0.936375 2023-10-02 11:22:14,945 WARNING [train.py:1204] (2/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_669-93163-0 from training. Duration: 0.98 2023-10-02 11:22:16,410 WARNING [train.py:1204] (2/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_249-281349-0_sp0.9 from training. Duration: 0.9444375 2023-10-02 11:22:17,987 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.feed_forward1.hidden_balancer.prob, batch_count=861806.6666666666, ans=0.125 2023-10-02 11:22:19,298 WARNING [train.py:1204] (2/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_106-30782-0 from training. Duration: 0.8 2023-10-02 11:22:21,306 WARNING [train.py:1204] (2/4) Exclude cut with ID _8772_210_1850_1_1528635595972_3664011_398-320049-0_sp1.1 from training. Duration: 0.8 2023-10-02 11:22:23,027 WARNING [train.py:1204] (2/4) Exclude cut with ID _44778_210_2054_1_1533798127203_7443090_516-151727-0_sp1.1 from training. Duration: 0.963625 2023-10-02 11:22:23,087 WARNING [train.py:1204] (2/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_325-62802-0_sp1.1 from training. Duration: 0.7909375 2023-10-02 11:22:23,279 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.bypass.scale_min, batch_count=861806.6666666666, ans=0.2 2023-10-02 11:22:25,936 WARNING [train.py:1204] (2/4) Exclude cut with ID _14368_210_4220_1_1530532714870_3595529_93-58002-0_sp0.9 from training. Duration: 0.6333125 2023-10-02 11:22:25,985 WARNING [train.py:1204] (2/4) Exclude cut with ID _4681_210_3525_1_1527251040321_1235713_123-24428-0_sp1.1 from training. Duration: 0.436375 2023-10-02 11:22:27,288 WARNING [train.py:1204] (2/4) Exclude cut with ID _5250_210_5219_1_1527249561806_8692110_1185-56473-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 11:22:29,208 WARNING [train.py:1204] (2/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_96-244299-0_sp1.1 from training. Duration: 0.8 2023-10-02 11:22:33,582 WARNING [train.py:1204] (2/4) Exclude cut with ID _59500_210_8254_1_1534809610680_3736009_104-72815-0_sp1.1 from training. Duration: 0.963625 2023-10-02 11:22:35,035 WARNING [train.py:1204] (2/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_468-283113-0_sp1.1 from training. Duration: 0.87275 2023-10-02 11:22:36,421 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6239_210_4211_1_1527765584_4306698_528-102186-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 11:22:39,033 WARNING [train.py:1204] (2/4) Exclude cut with ID _45297_210_2721_1_1533715351381_3264949_351-147136-0 from training. Duration: 0.87 2023-10-02 11:22:39,038 WARNING [train.py:1204] (2/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_520-51744-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 11:22:40,342 WARNING [train.py:1204] (2/4) Exclude cut with ID _5166_210_2084_1_1527073197160_4087993_310-114452-0_sp1.1 from training. Duration: 0.536375 2023-10-02 11:22:40,345 WARNING [train.py:1204] (2/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_450-165710-0_sp1.1 from training. Duration: 0.92725 2023-10-02 11:22:40,357 WARNING [train.py:1204] (2/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_280-120942-0_sp1.1 from training. Duration: 0.7 2023-10-02 11:22:40,384 WARNING [train.py:1204] (2/4) Exclude cut with ID _65585_210_12210_1_1535450451584_3764098_112-294301-0_sp0.9 from training. Duration: 0.97775 2023-10-02 11:22:40,632 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=861873.3333333334, ans=0.1 2023-10-02 11:22:41,688 WARNING [train.py:1204] (2/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_98-51676-0_sp0.9 from training. Duration: 0.811125 2023-10-02 11:22:42,935 INFO [train.py:1046] (2/4) Epoch 25, batch 1800, loss[loss=0.1691, simple_loss=0.2412, pruned_loss=0.04849, over 23844.00 frames. ], tot_loss[loss=0.1692, simple_loss=0.2465, pruned_loss=0.04602, over 4716002.83 frames. ], batch size: 195, lr: 4.11e-03, grad_scale: 16.0 2023-10-02 11:22:44,321 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_33348_210_5632_1_1533086912_3324030_85-28583-0_sp1.1 from training. Duration: 0.7454375 2023-10-02 11:22:44,401 WARNING [train.py:1204] (2/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_336-208232-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 11:22:46,017 WARNING [train.py:1204] (2/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_339-104791-0_sp0.9 from training. Duration: 0.5444375 2023-10-02 11:22:47,738 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.conv_skip_rate, batch_count=861940.0, ans=0.0 2023-10-02 11:22:50,125 WARNING [train.py:1204] (2/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_559-29840-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 11:22:53,455 WARNING [train.py:1204] (2/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_117-290916-0_sp0.9 from training. Duration: 0.5666875 2023-10-02 11:22:53,539 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_43802_210_2290_1_1533516203_8829504_185-112289-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 11:22:58,122 WARNING [train.py:1204] (2/4) Exclude cut with ID _59256_210_5986_1_1535076085997_3589120_155-298797-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 11:22:58,734 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.4.encoder.layers.2.conv_module1.whiten, num_groups=1, num_channels=384, metric=3.10 vs. limit=15.0 2023-10-02 11:23:01,281 WARNING [train.py:1204] (2/4) Exclude cut with ID _44659_210_2721_1_1534036120061_6614860_246-206531-0_sp1.1 from training. Duration: 0.92725 2023-10-02 11:23:01,328 WARNING [train.py:1204] (2/4) Exclude cut with ID _23818_210_6228_1_1531917063884_3676920_469-151944-0_sp1.1 from training. Duration: 0.92725 2023-10-02 11:23:02,717 WARNING [train.py:1204] (2/4) Exclude cut with ID _13252_210_1925_1_1530671372047_3336909_88-276668-0_sp1.1 from training. Duration: 0.9 2023-10-02 11:23:03,598 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.0.bypass.skip_rate, batch_count=862006.6666666666, ans=0.035 2023-10-02 11:23:04,770 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_15101_210_7927_1_1530786415_5033274_320-327527-0_sp1.1 from training. Duration: 0.87275 2023-10-02 11:23:04,779 WARNING [train.py:1204] (2/4) Exclude cut with ID _14998_210_5219_1_1530705582080_7474970_446-112117-0 from training. Duration: 0.9 2023-10-02 11:23:04,848 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_30280_210_1881_1_1532872779_3660139_400-254159-0_sp1.1 from training. Duration: 0.936375 2023-10-02 11:23:05,088 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.conv_module2.balancer2.prob, batch_count=862006.6666666666, ans=0.125 2023-10-02 11:23:05,720 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.1.encoder.layers.1.nonlin_attention.whiten1, num_groups=1, num_channels=192, metric=5.82 vs. limit=10.0 2023-10-02 11:23:08,898 WARNING [train.py:1204] (2/4) Exclude cut with ID _40284_210_2084_1_1533513679814_3704279_158-345341-0_sp1.1 from training. Duration: 0.936375 2023-10-02 11:23:11,626 WARNING [train.py:1204] (2/4) Exclude cut with ID 3033-130750-0096-55598-0 from training. Duration: 0.83 2023-10-02 11:23:14,337 WARNING [train.py:1204] (2/4) Exclude cut with ID _9389_210_1664_1_1529217337301_5289269_379-128794-0 from training. Duration: 0.85 2023-10-02 11:23:14,353 WARNING [train.py:1204] (2/4) Exclude cut with ID _3675_210_4557_1_1526639934408_4182089_456-18305-0 from training. Duration: 0.76 2023-10-02 11:23:14,383 WARNING [train.py:1204] (2/4) Exclude cut with ID _29853_210_7497_1_1532505600851_6642110_571-249460-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 11:23:16,981 WARNING [train.py:1204] (2/4) Exclude cut with ID _4068_210_2629_1_1526697350395_1546259_230-325568-0_sp1.1 from training. Duration: 0.92725 2023-10-02 11:23:16,982 WARNING [train.py:1204] (2/4) Exclude cut with ID _34415_210_8750_1_1533085213375_3451116_213-267570-0_sp1.1 from training. Duration: 0.963625 2023-10-02 11:23:18,285 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6767_210_3273_1_1527855783_1718932_142-127132-0_sp1.1 from training. Duration: 0.863625 2023-10-02 11:23:22,272 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.585e+02 1.934e+02 2.294e+02 2.756e+02 4.950e+02, threshold=4.588e+02, percent-clipped=2.0 2023-10-02 11:23:23,756 WARNING [train.py:1204] (2/4) Exclude cut with ID IC0653W0001-322925 from training. Duration: 0.896 2023-10-02 11:23:25,695 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_4528_210_3273_1_1527076654_3276777_209-219804-0_sp1.1 from training. Duration: 0.62725 2023-10-02 11:23:27,100 WARNING [train.py:1204] (2/4) Exclude cut with ID _55844_210_2346_1_1534471212569_7625707_187-204971-0_sp1.1 from training. Duration: 0.936375 2023-10-02 11:23:29,280 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.conv_module2.balancer2.prob, batch_count=862140.0, ans=0.125 2023-10-02 11:23:30,440 WARNING [train.py:1204] (2/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_369-325184-0 from training. Duration: 0.99 2023-10-02 11:23:30,465 WARNING [train.py:1204] (2/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_791-45149-0 from training. Duration: 0.86 2023-10-02 11:23:30,504 WARNING [train.py:1204] (2/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_317-211107-0_sp1.1 from training. Duration: 0.836375 2023-10-02 11:23:31,857 WARNING [train.py:1204] (2/4) Exclude cut with ID _30800_210_22382_1_1532499347904_4515959_488-330913-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 11:23:33,974 WARNING [train.py:1204] (2/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_506-42614-0_sp1.1 from training. Duration: 0.7181875 2023-10-02 11:23:38,140 WARNING [train.py:1204] (2/4) Exclude cut with ID _8696_210_1876_1_1528545632440_3345058_209-54069-0 from training. Duration: 0.67 2023-10-02 11:23:44,892 WARNING [train.py:1204] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_215-202973-0_sp1.1 from training. Duration: 0.8 2023-10-02 11:23:44,960 WARNING [train.py:1204] (2/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_321-215742-0 from training. Duration: 0.5 2023-10-02 11:23:45,001 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_40896_210_15710_1_1533433956_7100802_428-32114-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 11:23:45,015 WARNING [train.py:1204] (2/4) Exclude cut with ID _27013_210_1389_1_1532437173240_3252649_467-263151-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 11:23:46,318 WARNING [train.py:1204] (2/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_734-129761-0_sp0.9 from training. Duration: 0.911125 2023-10-02 11:23:46,368 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_521-214276-0 from training. Duration: 0.44 2023-10-02 11:23:49,205 WARNING [train.py:1204] (2/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_80-147301-0_sp1.1 from training. Duration: 0.763625 2023-10-02 11:23:49,207 WARNING [train.py:1204] (2/4) Exclude cut with ID _9679_210_7497_1_1529129212906_4363929_56-307796-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 11:23:49,364 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.1.ff2_skip_rate, batch_count=862206.6666666666, ans=0.0 2023-10-02 11:23:51,778 WARNING [train.py:1204] (2/4) Exclude cut with ID _14832_210_8750_1_1530691351539_3171348_1-54315-0 from training. Duration: 0.87 2023-10-02 11:23:51,778 WARNING [train.py:1204] (2/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_558-155750-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 11:23:54,449 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_142-309370-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 11:23:54,490 WARNING [train.py:1204] (2/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_18-46756-0_sp0.9 from training. Duration: 0.87775 2023-10-02 11:23:54,500 WARNING [train.py:1204] (2/4) Exclude cut with ID _61148_210_6897_1_1535094022726_4217610_279-212159-0_sp1.1 from training. Duration: 0.936375 2023-10-02 11:23:56,317 INFO [train.py:1046] (2/4) Epoch 25, batch 1850, loss[loss=0.1502, simple_loss=0.2264, pruned_loss=0.03702, over 24450.00 frames. ], tot_loss[loss=0.1701, simple_loss=0.247, pruned_loss=0.04659, over 4700737.71 frames. ], batch size: 58, lr: 4.11e-03, grad_scale: 16.0 2023-10-02 11:23:56,417 WARNING [train.py:1204] (2/4) Exclude cut with ID _21987_210_5854_1_1531821500482_3637038_164-2641-0_sp1.1 from training. Duration: 0.936375 2023-10-02 11:23:56,452 WARNING [train.py:1204] (2/4) Exclude cut with ID _8064_210_5399_1_1528372299291_4282973_395-340852-0_sp1.1 from training. Duration: 0.8454375 2023-10-02 11:23:59,638 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1077-184261-0_sp1.1 from training. Duration: 0.963625 2023-10-02 11:23:59,653 WARNING [train.py:1204] (2/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_1176-204992-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 11:24:00,317 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.3.encoder.layers.2.self_attn1.whiten, num_groups=1, num_channels=512, metric=13.54 vs. limit=22.5 2023-10-02 11:24:01,136 WARNING [train.py:1204] (2/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_563-347832-0_sp1.1 from training. Duration: 0.8090625 2023-10-02 11:24:02,509 WARNING [train.py:1204] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_380-204756-0_sp0.9 from training. Duration: 0.9555625 2023-10-02 11:24:10,209 WARNING [train.py:1204] (2/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_212-10501-0_sp1.1 from training. Duration: 0.8545625 2023-10-02 11:24:10,243 WARNING [train.py:1204] (2/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_445-117398-0 from training. Duration: 0.89 2023-10-02 11:24:14,393 WARNING [train.py:1204] (2/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_29-280622-0 from training. Duration: 0.82 2023-10-02 11:24:15,788 WARNING [train.py:1204] (2/4) Exclude cut with ID _26664_210_7770_1_1532258943280_3418270_187-80815-0 from training. Duration: 0.93 2023-10-02 11:24:18,654 WARNING [train.py:1204] (2/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_414-349640-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 11:24:18,678 WARNING [train.py:1204] (2/4) Exclude cut with ID _45021_210_9994_1_1533619751094_3330288_96-239010-0 from training. Duration: 0.91 2023-10-02 11:24:18,680 WARNING [train.py:1204] (2/4) Exclude cut with ID _5189_210_4557_1_1527248319428_3769229_199-53526-0_sp0.9 from training. Duration: 0.4666875 2023-10-02 11:24:30,573 WARNING [train.py:1204] (2/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_63-101996-0_sp1.1 from training. Duration: 0.8 2023-10-02 11:24:31,984 WARNING [train.py:1204] (2/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_199-86432-0 from training. Duration: 0.98 2023-10-02 11:24:34,715 WARNING [train.py:1204] (2/4) Exclude cut with ID _36399_210_16049_1_1532998771377_3698333_207-259013-0_sp1.1 from training. Duration: 0.92725 2023-10-02 11:24:34,751 WARNING [train.py:1204] (2/4) Exclude cut with ID _62574_210_2655_1_1535180407058_3933249_567-111756-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 11:24:40,108 WARNING [train.py:1204] (2/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_436-318113-0 from training. Duration: 0.75 2023-10-02 11:24:40,157 WARNING [train.py:1204] (2/4) Exclude cut with ID _53379_210_20034_1_1534222027363_3854654_43-305214-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 11:24:40,170 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_15126_210_6753_1_1530778677_2591185_113-157651-0_sp1.1 from training. Duration: 0.6181875 2023-10-02 11:24:42,758 WARNING [train.py:1204] (2/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_146-218763-0_sp1.1 from training. Duration: 0.7818125 2023-10-02 11:24:44,296 WARNING [train.py:1204] (2/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_714-310538-0_sp0.9 from training. Duration: 0.9555625 2023-10-02 11:24:45,693 WARNING [train.py:1204] (2/4) Exclude cut with ID _24271_210_12658_1_1531987270022_7190519_709-16721-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 11:24:48,573 WARNING [train.py:1204] (2/4) Exclude cut with ID _5190_210_4557_1_1527244368799_373439_28-197030-0_sp0.9 from training. Duration: 0.8 2023-10-02 11:24:49,943 WARNING [train.py:1204] (2/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_267-197536-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 11:24:49,977 WARNING [train.py:1204] (2/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_658-282610-0_sp1.1 from training. Duration: 0.4818125 2023-10-02 11:24:51,331 WARNING [train.py:1204] (2/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_257-67050-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 11:24:52,697 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_14906_210_7927_1_1530754190_9408633_961-53402-0_sp1.1 from training. Duration: 0.936375 2023-10-02 11:24:54,046 WARNING [train.py:1204] (2/4) Exclude cut with ID _60113_210_16271_1_1535164212481_4386079_490-280290-0_sp1.1 from training. Duration: 0.8909375 2023-10-02 11:24:57,314 WARNING [train.py:1204] (2/4) Exclude cut with ID _14100_210_8341_1_1530529320613_3678069_258-224599-0 from training. Duration: 0.77 2023-10-02 11:24:57,365 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6961_210_2087_1_1527937168_6815011_565-326898-0_sp1.1 from training. Duration: 0.936375 2023-10-02 11:25:01,348 WARNING [train.py:1204] (2/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_239-316802-0_sp1.1 from training. Duration: 0.62725 2023-10-02 11:25:03,217 WARNING [train.py:1204] (2/4) Exclude cut with ID _55857_210_2346_1_1534644007831_6898209_745-98391-0_sp1.1 from training. Duration: 0.6909375 2023-10-02 11:25:03,221 WARNING [train.py:1204] (2/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_154-135696-0 from training. Duration: 0.8 2023-10-02 11:25:03,224 WARNING [train.py:1204] (2/4) Exclude cut with ID _14368_210_4220_1_1530532714870_3595529_93-58002-0 from training. Duration: 0.57 2023-10-02 11:25:04,545 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9025W0005-507335 from training. Duration: 0.896 2023-10-02 11:25:05,931 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9029W0034-509435 from training. Duration: 0.64 2023-10-02 11:25:08,072 WARNING [train.py:1204] (2/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_76-203867-0_sp1.1 from training. Duration: 0.6545625 2023-10-02 11:25:08,088 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_46639_210_8341_1_1533726088_4495500_66-346325-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 11:25:08,094 WARNING [train.py:1204] (2/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_145-137790-0_sp0.9 from training. Duration: 0.92225 2023-10-02 11:25:09,249 WARNING [train.py:1204] (2/4) Exclude cut with ID _36522_210_8887_1_1533177030611_1772478_7-327299-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 11:25:09,316 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9042W0009-515615 from training. Duration: 0.896 2023-10-02 11:25:09,324 WARNING [train.py:1204] (2/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_39-248870-0_sp0.9 from training. Duration: 0.7666875 2023-10-02 11:25:09,374 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_35441_210_16554_1_1533347572_4092680_362-242912-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 11:25:10,613 INFO [train.py:1046] (2/4) Epoch 25, batch 1900, loss[loss=0.1825, simple_loss=0.2623, pruned_loss=0.05136, over 24000.00 frames. ], tot_loss[loss=0.1699, simple_loss=0.2472, pruned_loss=0.04632, over 4712284.45 frames. ], batch size: 86, lr: 4.11e-03, grad_scale: 16.0 2023-10-02 11:25:10,711 WARNING [train.py:1204] (2/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_216-343007-0_sp1.1 from training. Duration: 0.6 2023-10-02 11:25:12,113 WARNING [train.py:1204] (2/4) Exclude cut with ID _3867_210_1883_1_1526641200575_4475830_272-215563-0_sp0.9 from training. Duration: 0.6333125 2023-10-02 11:25:14,017 WARNING [train.py:1204] (2/4) Exclude cut with ID _21783_210_11357_1_1531742458730_3762679_553-175603-0_sp1.1 from training. Duration: 0.92725 2023-10-02 11:25:14,033 WARNING [train.py:1204] (2/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_963-187109-0 from training. Duration: 0.99 2023-10-02 11:25:16,766 WARNING [train.py:1204] (2/4) Exclude cut with ID _44973_210_1511_1_1533794381163_3634770_495-211466-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 11:25:16,777 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9068W0002-529085 from training. Duration: 0.896 2023-10-02 11:25:16,799 WARNING [train.py:1204] (2/4) Exclude cut with ID _5294_210_1747_1_1527249643928_3696179_180-100713-0_sp1.1 from training. Duration: 0.736375 2023-10-02 11:25:18,120 WARNING [train.py:1204] (2/4) Exclude cut with ID _35799_210_5854_1_1533263487887_3529071_149-49689-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 11:25:23,748 WARNING [train.py:1204] (2/4) Exclude cut with ID _67725_210_27767_1_1535608756671_5105359_36-220794-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 11:25:23,898 WARNING [train.py:1204] (2/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_338-96520-0_sp1.1 from training. Duration: 0.8545625 2023-10-02 11:25:25,294 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9104W0004-547160 from training. Duration: 0.896 2023-10-02 11:25:26,684 WARNING [train.py:1204] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_330-327861-0 from training. Duration: 0.67 2023-10-02 11:25:28,107 WARNING [train.py:1204] (2/4) Exclude cut with ID _14087_210_2253_1_1530529224512_3014029_156-257607-0_sp0.9 from training. Duration: 0.92225 2023-10-02 11:25:30,022 WARNING [train.py:1204] (2/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_476-322050-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 11:25:30,036 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9118W0017-553385 from training. Duration: 0.896 2023-10-02 11:25:30,062 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9120W0028-554435 from training. Duration: 0.896 2023-10-02 11:25:33,233 WARNING [train.py:1204] (2/4) Exclude cut with ID _56524_210_9642_1_1534572011779_3619170_145-310842-0 from training. Duration: 0.99 2023-10-02 11:25:35,930 WARNING [train.py:1204] (2/4) Exclude cut with ID _51631_210_8254_1_1534129228420_4084794_270-222508-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 11:25:39,864 WARNING [train.py:1204] (2/4) Exclude cut with ID _14268_210_9206_1_1530622771285_4714099_331-105351-0 from training. Duration: 0.65 2023-10-02 11:25:41,922 WARNING [train.py:1204] (2/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_40-129756-0 from training. Duration: 0.81 2023-10-02 11:25:45,798 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.ff3_skip_rate, batch_count=862740.0, ans=0.0 2023-10-02 11:25:49,727 WARNING [train.py:1204] (2/4) Exclude cut with ID _3541_210_2477_1_1526475408709_3263973_231-104736-0 from training. Duration: 0.78 2023-10-02 11:25:50,896 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.623e+02 2.019e+02 2.414e+02 2.839e+02 5.766e+02, threshold=4.828e+02, percent-clipped=1.0 2023-10-02 11:25:52,540 WARNING [train.py:1204] (2/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_214-291369-0 from training. Duration: 0.63 2023-10-02 11:25:52,546 WARNING [train.py:1204] (2/4) Exclude cut with ID _62574_210_2655_1_1535180407058_3933249_369-84347-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 11:25:53,848 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9210W0111-599240 from training. Duration: 0.896 2023-10-02 11:25:53,852 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_92-246270-0 from training. Duration: 0.85 2023-10-02 11:25:53,876 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_37152_210_9697_1_1533106573_3844188_29-239070-0 from training. Duration: 0.98 2023-10-02 11:25:53,927 WARNING [train.py:1204] (2/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_239-22076-0 from training. Duration: 0.85 2023-10-02 11:25:53,929 WARNING [train.py:1204] (2/4) Exclude cut with ID _65962_210_29927_1_1535436026519_4174930_368-44639-0_sp1.1 from training. Duration: 0.97275 2023-10-02 11:25:56,802 WARNING [train.py:1204] (2/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_152-212533-0 from training. Duration: 0.97 2023-10-02 11:25:57,027 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.feed_forward2.hidden_balancer.prob, batch_count=862806.6666666666, ans=0.125 2023-10-02 11:26:00,006 WARNING [train.py:1204] (2/4) Exclude cut with ID _14603_210_4220_1_1530615688441_3097319_90-31383-0_sp1.1 from training. Duration: 0.7909375 2023-10-02 11:26:01,546 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.feed_forward1.out_proj.dropout_p, batch_count=862806.6666666666, ans=0.1 2023-10-02 11:26:04,385 WARNING [train.py:1204] (2/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_196-152868-0_sp1.1 from training. Duration: 0.92725 2023-10-02 11:26:04,387 WARNING [train.py:1204] (2/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_140-181176-0 from training. Duration: 0.92 2023-10-02 11:26:05,816 WARNING [train.py:1204] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_168-32906-0_sp0.9 from training. Duration: 0.7666875 2023-10-02 11:26:08,700 WARNING [train.py:1204] (2/4) Exclude cut with ID _14922_210_6228_1_1530707435273_3693310_13-165512-0 from training. Duration: 0.82 2023-10-02 11:26:08,742 WARNING [train.py:1204] (2/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_381-317791-0_sp0.9 from training. Duration: 0.811125 2023-10-02 11:26:14,811 WARNING [train.py:1204] (2/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_63-267585-0_sp1.1 from training. Duration: 0.8181875 2023-10-02 11:26:14,813 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_326-303945-0_sp1.1 from training. Duration: 0.9 2023-10-02 11:26:14,825 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_194-143381-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 11:26:17,589 WARNING [train.py:1204] (2/4) Exclude cut with ID _39290_210_4220_1_1533862778760_3075118_194-16667-0_sp1.1 from training. Duration: 0.963625 2023-10-02 11:26:19,002 WARNING [train.py:1204] (2/4) Exclude cut with ID _15012_210_1968_1_1530856820429_5095350_662-210469-0_sp0.9 from training. Duration: 0.6333125 2023-10-02 11:26:19,026 WARNING [train.py:1204] (2/4) Exclude cut with ID _4697_210_2069_1_1526815801953_3823098_68-219807-0_sp1.1 from training. Duration: 0.42725 2023-10-02 11:26:19,185 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.ff3_skip_rate, batch_count=862873.3333333334, ans=0.0 2023-10-02 11:26:20,405 WARNING [train.py:1204] (2/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_321-22821-0_sp0.9 from training. Duration: 0.988875 2023-10-02 11:26:23,150 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_14056_210_4163_1_1530604483_7572827_1037-45317-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 11:26:23,152 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_403-39619-0_sp1.1 from training. Duration: 0.763625 2023-10-02 11:26:23,405 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.ff3_skip_rate, batch_count=862940.0, ans=0.0 2023-10-02 11:26:24,530 INFO [train.py:1046] (2/4) Epoch 25, batch 1950, loss[loss=0.1782, simple_loss=0.2623, pruned_loss=0.04708, over 24036.00 frames. ], tot_loss[loss=0.1699, simple_loss=0.2473, pruned_loss=0.04623, over 4722540.17 frames. ], batch size: 80, lr: 4.11e-03, grad_scale: 16.0 2023-10-02 11:26:25,932 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_20120_210_6753_1_1531643035_5482859_510-96996-0_sp0.9 from training. Duration: 0.9666875 2023-10-02 11:26:25,934 WARNING [train.py:1204] (2/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_225-180155-0_sp1.1 from training. Duration: 0.92725 2023-10-02 11:26:25,992 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_278-272612-0_sp0.9 from training. Duration: 0.811125 2023-10-02 11:26:27,356 WARNING [train.py:1204] (2/4) Exclude cut with ID _53713_210_24116_1_1534314487762_7416751_691-70244-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 11:26:29,894 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.3.encoder.layers.0.conv_module1.whiten, num_groups=1, num_channels=512, metric=2.97 vs. limit=15.0 2023-10-02 11:26:30,552 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6788_210_1388_1_1527898698_4562021_91-197981-0_sp1.1 from training. Duration: 0.7545625 2023-10-02 11:26:32,120 WARNING [train.py:1204] (2/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_60-18123-0_sp0.9 from training. Duration: 0.97775 2023-10-02 11:26:33,519 WARNING [train.py:1204] (2/4) Exclude cut with ID _35799_210_5854_1_1533263487887_3529071_5-214617-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 11:26:33,524 WARNING [train.py:1204] (2/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_890-5117-0_sp1.1 from training. Duration: 0.7090625 2023-10-02 11:26:34,908 WARNING [train.py:1204] (2/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_660-211088-0 from training. Duration: 0.62 2023-10-02 11:26:36,694 WARNING [train.py:1204] (2/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_314-126819-0_sp0.9 from training. Duration: 0.5555625 2023-10-02 11:26:36,715 WARNING [train.py:1204] (2/4) Exclude cut with ID _21632_210_2253_1_1531994001765_3744140_362-66225-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 11:26:38,129 WARNING [train.py:1204] (2/4) Exclude cut with ID _61118_210_9527_1_1534897668884_3752630_173-258555-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 11:26:39,550 WARNING [train.py:1204] (2/4) Exclude cut with ID _7719_210_3525_1_1528456153163_4296960_88-250259-0_sp0.9 from training. Duration: 0.9333125 2023-10-02 11:26:40,856 WARNING [train.py:1204] (2/4) Exclude cut with ID _15140_210_9288_1_1530791388450_4880149_539-153708-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 11:26:42,620 WARNING [train.py:1204] (2/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_253-42544-0_sp1.1 from training. Duration: 0.97275 2023-10-02 11:26:44,010 WARNING [train.py:1204] (2/4) Exclude cut with ID _33640_210_2655_1_1533117605479_4855469_479-296877-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 11:26:45,617 WARNING [train.py:1204] (2/4) Exclude cut with ID 3033-130750-0096-55598-0_sp1.1 from training. Duration: 0.7545625 2023-10-02 11:26:47,435 WARNING [train.py:1204] (2/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_191-41023-0_sp1.1 from training. Duration: 0.6454375 2023-10-02 11:26:47,458 WARNING [train.py:1204] (2/4) Exclude cut with ID _8365_210_2064_1_1528592311070_4677690_431-294102-0_sp1.1 from training. Duration: 0.8090625 2023-10-02 11:26:47,487 WARNING [train.py:1204] (2/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_893-296030-0_sp1.1 from training. Duration: 0.97275 2023-10-02 11:26:50,250 WARNING [train.py:1204] (2/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_401-343820-0_sp1.1 from training. Duration: 0.97275 2023-10-02 11:26:54,335 WARNING [train.py:1204] (2/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_127-566-0_sp1.1 from training. Duration: 0.763625 2023-10-02 11:26:54,336 WARNING [train.py:1204] (2/4) Exclude cut with ID _14100_210_8341_1_1530529320613_3678069_126-272847-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 11:26:54,358 WARNING [train.py:1204] (2/4) Exclude cut with ID _15133_210_6228_1_1530794512074_3080379_29-264458-0_sp1.1 from training. Duration: 0.52725 2023-10-02 11:26:54,359 WARNING [train.py:1204] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_127-119109-0 from training. Duration: 0.72 2023-10-02 11:26:54,415 WARNING [train.py:1204] (2/4) Exclude cut with ID _6537_210_1883_1_1527987559710_3597129_382-133062-0_sp1.1 from training. Duration: 0.6181875 2023-10-02 11:26:54,431 WARNING [train.py:1204] (2/4) Exclude cut with ID _29529_210_7234_1_1532393961455_3977460_391-219682-0_sp1.1 from training. Duration: 0.936375 2023-10-02 11:26:55,756 WARNING [train.py:1204] (2/4) Exclude cut with ID _37537_210_2253_1_1533171758568_3421949_96-137189-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 11:26:57,750 WARNING [train.py:1204] (2/4) Exclude cut with ID _58414_210_8254_1_1535421609162_7151182_15-276301-0_sp1.1 from training. Duration: 0.97275 2023-10-02 11:27:01,771 WARNING [train.py:1204] (2/4) Exclude cut with ID _59502_210_12210_1_1535107024698_1835139_110-289074-0_sp1.1 from training. Duration: 0.92725 2023-10-02 11:27:04,566 WARNING [train.py:1204] (2/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_367-61330-0_sp1.1 from training. Duration: 0.6818125 2023-10-02 11:27:09,231 WARNING [train.py:1204] (2/4) Exclude cut with ID _45297_210_2721_1_1533715351381_3264949_353-9541-0_sp1.1 from training. Duration: 0.8818125 2023-10-02 11:27:09,266 WARNING [train.py:1204] (2/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_85-138257-0_sp0.9 from training. Duration: 0.788875 2023-10-02 11:27:09,299 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_3787_210_2040_1_1526477544_5478586_603-120324-0 from training. Duration: 0.81 2023-10-02 11:27:10,578 WARNING [train.py:1204] (2/4) Exclude cut with ID _42863_210_9070_1_1533902398788_3696920_411-204131-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 11:27:13,931 WARNING [train.py:1204] (2/4) Exclude cut with ID _30196_210_1664_1_1533708496522_7074140_822-289909-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 11:27:15,263 WARNING [train.py:1204] (2/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_32-335103-0_sp0.9 from training. Duration: 0.911125 2023-10-02 11:27:15,310 WARNING [train.py:1204] (2/4) Exclude cut with ID _6493_210_1747_1_1528459210424_4012470_513-140967-0_sp0.9 from training. Duration: 0.87775 2023-10-02 11:27:22,956 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_47896_210_20508_1_1534492518_5958868_439-257766-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 11:27:24,301 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_1294-63152-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 11:27:26,162 WARNING [train.py:1204] (2/4) Exclude cut with ID _59170_210_6721_1_1534983633960_4203335_319-166512-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 11:27:28,868 WARNING [train.py:1204] (2/4) Exclude cut with ID _3541_210_2477_1_1526475408709_3263973_146-3470-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 11:27:31,597 WARNING [train.py:1204] (2/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_678-159307-0_sp1.1 from training. Duration: 0.77275 2023-10-02 11:27:31,632 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_343-48822-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 11:27:32,919 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_4692_210_3528_1_1526899755_4323254_107-44401-0 from training. Duration: 0.86 2023-10-02 11:27:32,924 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6291_210_3273_1_1527683649_1078498_74-292608-0_sp1.1 from training. Duration: 0.6545625 2023-10-02 11:27:32,998 WARNING [train.py:1204] (2/4) Exclude cut with ID _55644_210_11856_1_1534593599582_4103719_293-248694-0_sp1.1 from training. Duration: 0.97275 2023-10-02 11:27:33,889 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.1.encoder.layers.0.whiten, num_groups=1, num_channels=256, metric=5.25 vs. limit=12.0 2023-10-02 11:27:34,332 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_3172_210_2087_1_1526292929_3987865_197-97762-0 from training. Duration: 0.75 2023-10-02 11:27:36,514 WARNING [train.py:1204] (2/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_264-348896-0_sp0.9 from training. Duration: 0.9444375 2023-10-02 11:27:36,637 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.0.ff2_skip_rate, batch_count=863206.6666666666, ans=0.0 2023-10-02 11:27:39,095 INFO [train.py:1046] (2/4) Epoch 25, batch 2000, loss[loss=0.1933, simple_loss=0.2535, pruned_loss=0.06659, over 19490.00 frames. ], tot_loss[loss=0.1707, simple_loss=0.2481, pruned_loss=0.04664, over 4718499.78 frames. ], batch size: 388, lr: 4.11e-03, grad_scale: 32.0 2023-10-02 11:27:40,548 WARNING [train.py:1204] (2/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_209-205365-0_sp0.9 from training. Duration: 0.87775 2023-10-02 11:27:41,863 WARNING [train.py:1204] (2/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_222-198127-0_sp0.9 from training. Duration: 0.9333125 2023-10-02 11:27:41,898 WARNING [train.py:1204] (2/4) Exclude cut with ID _57572_210_5517_1_1534921219624_3809484_52-43530-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 11:27:43,836 WARNING [train.py:1204] (2/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_96-320736-0_sp1.1 from training. Duration: 0.7818125 2023-10-02 11:27:45,251 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_13091_210_7925_1_1530609447_8987354_1159-225508-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 11:27:47,981 WARNING [train.py:1204] (2/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_237-44332-0 from training. Duration: 0.94 2023-10-02 11:27:49,110 WARNING [train.py:1204] (2/4) Exclude cut with ID _7970_210_1883_1_1528455616296_3472230_25-178407-0_sp0.9 from training. Duration: 0.788875 2023-10-02 11:27:52,488 WARNING [train.py:1204] (2/4) Exclude cut with ID _29003_210_19596_1_1532507391720_3894098_357-257062-0_sp1.1 from training. Duration: 0.936375 2023-10-02 11:27:53,884 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_809-17156-0 from training. Duration: 0.73 2023-10-02 11:27:55,624 WARNING [train.py:1204] (2/4) Exclude cut with ID _7160_210_1876_1_1527940923231_3703799_302-150728-0_sp1.1 from training. Duration: 0.7090625 2023-10-02 11:27:55,632 WARNING [train.py:1204] (2/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_444-28041-0_sp0.9 from training. Duration: 0.9444375 2023-10-02 11:27:58,397 WARNING [train.py:1204] (2/4) Exclude cut with ID _52892_210_6721_1_1534378941259_4124129_278-251666-0_sp1.1 from training. Duration: 0.92725 2023-10-02 11:27:58,495 WARNING [train.py:1204] (2/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_115-326778-0 from training. Duration: 0.93 2023-10-02 11:27:59,888 WARNING [train.py:1204] (2/4) Exclude cut with ID _41391_210_7994_1_1533603771872_3638201_31-188961-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 11:28:01,336 WARNING [train.py:1204] (2/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_1073-6264-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 11:28:02,689 WARNING [train.py:1204] (2/4) Exclude cut with ID _66868_210_9527_1_1535509287954_4561140_435-259515-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 11:28:05,329 WARNING [train.py:1204] (2/4) Exclude cut with ID _14886_210_4220_1_1530702020355_3516190_159-73748-0 from training. Duration: 0.83 2023-10-02 11:28:05,361 WARNING [train.py:1204] (2/4) Exclude cut with ID _15150_210_8341_1_1530752411803_7256384_469-239509-0_sp1.1 from training. Duration: 0.6181875 2023-10-02 11:28:07,215 WARNING [train.py:1204] (2/4) Exclude cut with ID _8064_210_5399_1_1528372299291_4282973_395-340852-0 from training. Duration: 0.93 2023-10-02 11:28:07,218 WARNING [train.py:1204] (2/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_138-187031-0_sp1.1 from training. Duration: 0.9 2023-10-02 11:28:10,120 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_13757_210_3528_1_1530440682_4107239_48-106347-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 11:28:10,212 WARNING [train.py:1204] (2/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_164-224052-0_sp1.1 from training. Duration: 0.636375 2023-10-02 11:28:10,218 WARNING [train.py:1204] (2/4) Exclude cut with ID _59288_210_5854_1_1535077757774_3412246_408-322648-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 11:28:11,529 WARNING [train.py:1204] (2/4) Exclude cut with ID _30191_210_1664_1_1533189915047_7196670_827-36359-0_sp1.1 from training. Duration: 0.963625 2023-10-02 11:28:11,610 WARNING [train.py:1204] (2/4) Exclude cut with ID _4680_210_3525_1_1527249757953_893386_75-78979-0_sp1.1 from training. Duration: 0.82725 2023-10-02 11:28:12,971 WARNING [train.py:1204] (2/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_383-165234-0 from training. Duration: 0.94 2023-10-02 11:28:14,983 WARNING [train.py:1204] (2/4) Exclude cut with ID _19896_210_1883_1_1531709022662_6649569_94-229927-0 from training. Duration: 0.97 2023-10-02 11:28:14,990 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6788_210_1388_1_1527898698_4562021_180-75288-0_sp1.1 from training. Duration: 0.9 2023-10-02 11:28:15,005 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_26433_210_12108_1_1532157158_4056480_54-321349-0_sp1.1 from training. Duration: 0.97275 2023-10-02 11:28:19,120 WARNING [train.py:1204] (2/4) Exclude cut with ID _56108_210_16380_1_1534505446159_3887959_265-161493-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 11:28:20,913 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.611e+02 1.859e+02 2.034e+02 2.311e+02 3.192e+02, threshold=4.068e+02, percent-clipped=0.0 2023-10-02 11:28:20,999 WARNING [train.py:1204] (2/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_395-111602-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 11:28:21,012 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_154-316450-0_sp0.9 from training. Duration: 0.8444375 2023-10-02 11:28:21,068 WARNING [train.py:1204] (2/4) Exclude cut with ID _43552_210_8913_1_1533726031059_3523429_381-116041-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 11:28:25,555 WARNING [train.py:1204] (2/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_65-104124-0_sp1.1 from training. Duration: 0.963625 2023-10-02 11:28:25,613 WARNING [train.py:1204] (2/4) Exclude cut with ID _6536_210_1883_1_1527850783339_3319069_169-23811-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 11:28:25,643 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_289-180904-0_sp0.9 from training. Duration: 0.8444375 2023-10-02 11:28:25,654 WARNING [train.py:1204] (2/4) Exclude cut with ID _56406_210_9288_1_1534899618664_3496149_226-242400-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 11:28:27,008 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_24899_210_16554_1_1532046422_3827462_276-216330-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 11:28:31,187 WARNING [train.py:1204] (2/4) Exclude cut with ID _29345_210_4381_1_1532689168263_3860169_198-174328-0_sp1.1 from training. Duration: 0.82725 2023-10-02 11:28:32,472 WARNING [train.py:1204] (2/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_338-96520-0 from training. Duration: 0.94 2023-10-02 11:28:35,931 WARNING [train.py:1204] (2/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_7-111954-0_sp1.1 from training. Duration: 0.5090625 2023-10-02 11:28:36,018 WARNING [train.py:1204] (2/4) Exclude cut with ID _36692_210_5665_1_1533608967420_398809_8-16261-0_sp1.1 from training. Duration: 0.97275 2023-10-02 11:28:38,835 WARNING [train.py:1204] (2/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_86-212657-0_sp1.1 from training. Duration: 0.97275 2023-10-02 11:28:38,842 WARNING [train.py:1204] (2/4) Exclude cut with ID _44659_210_2721_1_1534036120061_6614860_626-298884-0_sp1.1 from training. Duration: 0.8818125 2023-10-02 11:28:44,585 WARNING [train.py:1204] (2/4) Exclude cut with ID _49755_210_18053_1_1534581005846_3218709_120-257918-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 11:28:45,951 WARNING [train.py:1204] (2/4) Exclude cut with ID _21881_210_3525_1_1531825242757_3798380_400-113821-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 11:28:45,953 WARNING [train.py:1204] (2/4) Exclude cut with ID _21783_210_11357_1_1531742458730_3762679_387-236773-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 11:28:46,242 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.balancer1.prob, batch_count=863540.0, ans=0.125 2023-10-02 11:28:47,327 WARNING [train.py:1204] (2/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_613-232856-0_sp1.1 from training. Duration: 0.4909375 2023-10-02 11:28:47,354 WARNING [train.py:1204] (2/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_181-78632-0_sp0.9 from training. Duration: 0.8666875 2023-10-02 11:28:50,079 WARNING [train.py:1204] (2/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_175-17485-0_sp1.1 from training. Duration: 0.97275 2023-10-02 11:28:50,165 WARNING [train.py:1204] (2/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_1004-183422-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 11:28:53,312 INFO [train.py:1046] (2/4) Epoch 25, batch 2050, loss[loss=0.1624, simple_loss=0.2277, pruned_loss=0.04856, over 23668.00 frames. ], tot_loss[loss=0.1697, simple_loss=0.2469, pruned_loss=0.04625, over 4718104.43 frames. ], batch size: 232, lr: 4.11e-03, grad_scale: 16.0 2023-10-02 11:28:53,464 WARNING [train.py:1204] (2/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_32-65962-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 11:28:55,370 WARNING [train.py:1204] (2/4) Exclude cut with ID _15458_210_8341_1_1530858614263_3834410_40-298664-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 11:28:58,370 WARNING [train.py:1204] (2/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_380-261863-0_sp1.1 from training. Duration: 0.963625 2023-10-02 11:28:59,808 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_372-173012-0_sp1.1 from training. Duration: 0.863625 2023-10-02 11:29:01,083 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_57-82205-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 11:29:01,149 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_3691_210_3528_1_1526468169_4062053_355-334432-0_sp0.9 from training. Duration: 0.9666875 2023-10-02 11:29:04,314 WARNING [train.py:1204] (2/4) Exclude cut with ID _19023_210_4353_1_1531378892552_5820780_28-36615-0 from training. Duration: 0.97 2023-10-02 11:29:04,319 WARNING [train.py:1204] (2/4) Exclude cut with ID _29853_210_7497_1_1532505600851_6642110_286-251333-0_sp1.1 from training. Duration: 0.92725 2023-10-02 11:29:05,780 WARNING [train.py:1204] (2/4) Exclude cut with ID _31602_210_5854_1_1532677021456_4458250_518-80679-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 11:29:05,834 WARNING [train.py:1204] (2/4) Exclude cut with ID _14922_210_6228_1_1530707435273_3693310_38-187851-0_sp0.9 from training. Duration: 0.688875 2023-10-02 11:29:16,073 WARNING [train.py:1204] (2/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_39-248870-0_sp1.1 from training. Duration: 0.62725 2023-10-02 11:29:16,100 WARNING [train.py:1204] (2/4) Exclude cut with ID _16908_210_5724_1_1531119157248_3772700_154-31455-0_sp1.1 from training. Duration: 0.97275 2023-10-02 11:29:17,397 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8453_210_4211_1_1528502940_8562730_347-330673-0 from training. Duration: 0.72 2023-10-02 11:29:20,098 WARNING [train.py:1204] (2/4) Exclude cut with ID _30191_210_1664_1_1533189915047_7196670_674-204247-0_sp1.1 from training. Duration: 0.97275 2023-10-02 11:29:21,935 WARNING [train.py:1204] (2/4) Exclude cut with ID _8833_210_5219_1_1528592665210_7553059_329-125156-0 from training. Duration: 0.82 2023-10-02 11:29:21,973 WARNING [train.py:1204] (2/4) Exclude cut with ID _4779_210_2084_1_1527318022944_3914893_500-285013-0_sp1.1 from training. Duration: 0.62725 2023-10-02 11:29:24,793 WARNING [train.py:1204] (2/4) Exclude cut with ID _14421_210_8750_1_1530604795825_3542290_190-313578-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 11:29:29,372 WARNING [train.py:1204] (2/4) Exclude cut with ID _35799_210_5854_1_1533263487887_3529071_538-273574-0_sp1.1 from training. Duration: 0.936375 2023-10-02 11:29:29,453 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_14928_210_5632_1_1530773461_4714000_454-112198-0_sp1.1 from training. Duration: 0.6 2023-10-02 11:29:30,812 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_468-143126-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 11:29:32,252 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_4528_210_3273_1_1527076654_3276777_258-121681-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 11:29:33,547 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_397-132565-0_sp1.1 from training. Duration: 0.9 2023-10-02 11:29:33,564 WARNING [train.py:1204] (2/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_454-123002-0_sp0.9 from training. Duration: 0.7666875 2023-10-02 11:29:38,422 WARNING [train.py:1204] (2/4) Exclude cut with ID _27052_210_5986_1_1532329197187_3585009_297-86890-0_sp1.1 from training. Duration: 0.936375 2023-10-02 11:29:38,562 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_51225_210_3601_1_1534400634_2327383_55-301844-0_sp1.1 from training. Duration: 0.8454375 2023-10-02 11:29:39,989 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6786_210_1388_1_1527839699_4194495_378-46070-0_sp0.9 from training. Duration: 0.8 2023-10-02 11:29:41,396 WARNING [train.py:1204] (2/4) Exclude cut with ID _42334_210_7770_1_1534206610396_3605880_55-19758-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 11:29:45,992 WARNING [train.py:1204] (2/4) Exclude cut with ID _14884_210_2252_1_1530707459689_5561250_114-201399-0_sp0.9 from training. Duration: 0.8444375 2023-10-02 11:29:51,265 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_99-251314-0_sp0.9 from training. Duration: 0.9666875 2023-10-02 11:29:51,342 WARNING [train.py:1204] (2/4) Exclude cut with ID _26528_210_9070_1_1532570410842_3851310_497-97218-0 from training. Duration: 0.86 2023-10-02 11:29:57,209 WARNING [train.py:1204] (2/4) Exclude cut with ID _55328_210_27767_1_1534580973260_4088170_230-317241-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 11:29:57,280 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_125-203105-0_sp0.9 from training. Duration: 0.888875 2023-10-02 11:29:58,592 WARNING [train.py:1204] (2/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_55-31904-0_sp1.1 from training. Duration: 0.8 2023-10-02 11:30:00,521 WARNING [train.py:1204] (2/4) Exclude cut with ID _8064_210_5399_1_1528372299291_4282973_18-174647-0 from training. Duration: 0.91 2023-10-02 11:30:05,783 WARNING [train.py:1204] (2/4) Exclude cut with ID IC0165W0005-81726 from training. Duration: 0.952 2023-10-02 11:30:05,783 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_31583_210_3852_1_1532595851_4024413_148-171847-0_sp1.1 from training. Duration: 0.963625 2023-10-02 11:30:05,857 WARNING [train.py:1204] (2/4) Exclude cut with ID _21989_210_5854_1_1531899214931_3271360_116-138769-0_sp1.1 from training. Duration: 0.936375 2023-10-02 11:30:07,103 INFO [train.py:1046] (2/4) Epoch 25, batch 2100, loss[loss=0.1582, simple_loss=0.2231, pruned_loss=0.04666, over 23649.00 frames. ], tot_loss[loss=0.1687, simple_loss=0.2456, pruned_loss=0.04586, over 4712931.76 frames. ], batch size: 232, lr: 4.11e-03, grad_scale: 16.0 2023-10-02 11:30:07,145 WARNING [train.py:1204] (2/4) Exclude cut with ID _66070_210_7640_1_1535441393096_7805310_489-292631-0_sp1.1 from training. Duration: 0.7181875 2023-10-02 11:30:07,234 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_20120_210_6753_1_1531643035_5482859_202-27960-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 11:30:08,518 WARNING [train.py:1204] (2/4) Exclude cut with ID _5294_210_1747_1_1527249643928_3696179_342-270278-0 from training. Duration: 0.55 2023-10-02 11:30:08,560 WARNING [train.py:1204] (2/4) Exclude cut with ID _38323_210_12108_1_1533121233369_3303049_416-11919-0 from training. Duration: 0.96 2023-10-02 11:30:11,792 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6545_210_2040_1_1527774670_4245258_407-48070-0_sp0.9 from training. Duration: 0.8444375 2023-10-02 11:30:14,899 WARNING [train.py:1204] (2/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_503-30719-0_sp1.1 from training. Duration: 0.8909375 2023-10-02 11:30:14,952 WARNING [train.py:1204] (2/4) Exclude cut with ID _49643_210_7989_1_1534640244224_3939031_234-326497-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 11:30:16,384 WARNING [train.py:1204] (2/4) Exclude cut with ID _14170_210_5219_1_1530525949868_4469650_301-77873-0_sp1.1 from training. Duration: 0.963625 2023-10-02 11:30:17,880 WARNING [train.py:1204] (2/4) Exclude cut with ID _45297_210_2721_1_1533715351381_3264949_152-154697-0_sp1.1 from training. Duration: 0.97275 2023-10-02 11:30:17,894 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_3371_210_1794_1_1526384462_5580777_396-339812-0 from training. Duration: 0.85 2023-10-02 11:30:19,308 WARNING [train.py:1204] (2/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_399-156791-0_sp1.1 from training. Duration: 0.7545625 2023-10-02 11:30:20,760 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7696_210_2040_1_1528207167_3716099_117-125434-0 from training. Duration: 0.76 2023-10-02 11:30:20,765 WARNING [train.py:1204] (2/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_96-244299-0 from training. Duration: 0.88 2023-10-02 11:30:22,797 WARNING [train.py:1204] (2/4) Exclude cut with ID _37892_210_19596_1_1533198677897_3964058_519-343462-0_sp1.1 from training. Duration: 0.92725 2023-10-02 11:30:22,834 WARNING [train.py:1204] (2/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_202-183922-0_sp1.1 from training. Duration: 0.77275 2023-10-02 11:30:22,845 WARNING [train.py:1204] (2/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_453-133983-0 from training. Duration: 0.78 2023-10-02 11:30:22,992 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.feed_forward1.out_proj.dropout_p, batch_count=864006.6666666666, ans=0.1 2023-10-02 11:30:24,201 WARNING [train.py:1204] (2/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_178-39722-0_sp1.1 from training. Duration: 0.4454375 2023-10-02 11:30:28,232 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_15126_210_6753_1_1530778677_2591185_113-157651-0 from training. Duration: 0.68 2023-10-02 11:30:28,233 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_271-234962-0_sp1.1 from training. Duration: 0.7181875 2023-10-02 11:30:31,555 WARNING [train.py:1204] (2/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_195-64967-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 11:30:32,862 WARNING [train.py:1204] (2/4) Exclude cut with ID _43552_210_8913_1_1533726031059_3523429_365-215065-0_sp1.1 from training. Duration: 0.936375 2023-10-02 11:30:35,647 WARNING [train.py:1204] (2/4) Exclude cut with ID _31602_210_5854_1_1532677021456_4458250_249-96604-0_sp0.9 from training. Duration: 0.988875 2023-10-02 11:30:36,940 WARNING [train.py:1204] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_307-158462-0 from training. Duration: 0.69 2023-10-02 11:30:36,988 WARNING [train.py:1204] (2/4) Exclude cut with ID _30918_210_6400_1_1532754078600_3125920_407-223394-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 11:30:36,991 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6673_210_3273_1_1527767790_3173240_351-167954-0_sp0.9 from training. Duration: 0.5333125 2023-10-02 11:30:38,452 WARNING [train.py:1204] (2/4) Exclude cut with ID _4866_210_2577_1_1527404412284_4364659_256-306830-0 from training. Duration: 0.87 2023-10-02 11:30:40,328 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_17613_210_2151_1_1531137569_2366160_222-131118-0_sp1.1 from training. Duration: 0.963625 2023-10-02 11:30:41,666 WARNING [train.py:1204] (2/4) Exclude cut with ID _45560_210_21982_1_1533812603951_3584999_111-6376-0 from training. Duration: 0.84 2023-10-02 11:30:41,694 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_216-73841-0 from training. Duration: 0.96 2023-10-02 11:30:43,046 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_14058_210_4163_1_1530690953_6327180_49-210687-0 from training. Duration: 0.94 2023-10-02 11:30:44,462 WARNING [train.py:1204] (2/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_506-42614-0_sp0.9 from training. Duration: 0.87775 2023-10-02 11:30:45,832 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_3133_210_2087_1_1526106446_2324690_25-332138-0_sp1.1 from training. Duration: 0.82725 2023-10-02 11:30:48,951 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.624e+02 1.897e+02 2.155e+02 2.514e+02 3.500e+02, threshold=4.310e+02, percent-clipped=0.0 2023-10-02 11:30:49,070 WARNING [train.py:1204] (2/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_589-9070-0_sp1.1 from training. Duration: 0.6454375 2023-10-02 11:30:51,664 WARNING [train.py:1204] (2/4) Exclude cut with ID _6194_210_1534_1_1527511826160_3941194_14-282876-0_sp1.1 from training. Duration: 0.6454375 2023-10-02 11:30:51,938 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.ff2_skip_rate, batch_count=864140.0, ans=0.0 2023-10-02 11:30:53,030 WARNING [train.py:1204] (2/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_1166-90654-0_sp1.1 from training. Duration: 0.92725 2023-10-02 11:30:53,124 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_14056_210_4163_1_1530604483_7572827_218-255858-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 11:30:53,126 WARNING [train.py:1204] (2/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_1285-256622-0 from training. Duration: 0.84 2023-10-02 11:30:53,140 WARNING [train.py:1204] (2/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_205-286261-0_sp1.1 from training. Duration: 0.963625 2023-10-02 11:30:53,154 WARNING [train.py:1204] (2/4) Exclude cut with ID _68777_210_4353_1_1535810390731_3117904_6-154119-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 11:30:55,072 WARNING [train.py:1204] (2/4) Exclude cut with ID _22982_210_1883_1_1531882790447_5449409_565-199393-0_sp1.1 from training. Duration: 0.92725 2023-10-02 11:30:55,081 WARNING [train.py:1204] (2/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_278-154492-0 from training. Duration: 0.76 2023-10-02 11:30:55,203 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_426-313557-0 from training. Duration: 0.53 2023-10-02 11:30:56,563 WARNING [train.py:1204] (2/4) Exclude cut with ID _41817_210_18835_1_1533434477160_7423049_555-293448-0 from training. Duration: 0.8 2023-10-02 11:30:59,007 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.0.layers.1.feed_forward2.out_whiten, num_groups=1, num_channels=192, metric=3.98 vs. limit=15.0 2023-10-02 11:30:59,440 WARNING [train.py:1204] (2/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_32-335103-0_sp1.1 from training. Duration: 0.7454375 2023-10-02 11:31:02,136 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_2655_210_2087_1_1525688959_3897400_202-147557-0_sp1.1 from training. Duration: 0.77275 2023-10-02 11:31:03,419 WARNING [train.py:1204] (2/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_158-250116-0 from training. Duration: 0.62 2023-10-02 11:31:08,724 WARNING [train.py:1204] (2/4) Exclude cut with ID _55429_210_10626_1_1534467588527_7174143_528-88083-0_sp1.1 from training. Duration: 0.963625 2023-10-02 11:31:10,145 WARNING [train.py:1204] (2/4) Exclude cut with ID _8030_210_7497_1_1528444886781_3531089_32-31837-0_sp1.1 from training. Duration: 0.8818125 2023-10-02 11:31:11,515 WARNING [train.py:1204] (2/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_744-86168-0_sp1.1 from training. Duration: 0.97275 2023-10-02 11:31:11,531 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1113-7369-0_sp0.9 from training. Duration: 0.9444375 2023-10-02 11:31:11,557 WARNING [train.py:1204] (2/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_124-345840-0_sp0.9 from training. Duration: 0.57775 2023-10-02 11:31:11,600 WARNING [train.py:1204] (2/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_328-264776-0_sp0.9 from training. Duration: 0.7444375 2023-10-02 11:31:14,141 WARNING [train.py:1204] (2/4) Exclude cut with ID _18895_210_6228_1_1531357274234_3784930_469-166615-0_sp1.1 from training. Duration: 0.963625 2023-10-02 11:31:14,147 WARNING [train.py:1204] (2/4) Exclude cut with ID _8395_210_1531_1_1528597871523_4411579_431-132470-0_sp0.9 from training. Duration: 0.82225 2023-10-02 11:31:14,223 WARNING [train.py:1204] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_554-45617-0_sp1.1 from training. Duration: 0.9 2023-10-02 11:31:14,242 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_35361_210_4238_1_1533262810_7793476_524-188242-0_sp1.1 from training. Duration: 0.92725 2023-10-02 11:31:17,312 WARNING [train.py:1204] (2/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_938-205135-0 from training. Duration: 0.87 2023-10-02 11:31:17,611 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.conv_skip_rate, batch_count=864206.6666666666, ans=0.0 2023-10-02 11:31:18,653 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_5931_210_2040_1_1527428481_4547999_321-193614-0 from training. Duration: 0.67 2023-10-02 11:31:18,664 WARNING [train.py:1204] (2/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_445-269521-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 11:31:21,323 INFO [train.py:1046] (2/4) Epoch 25, batch 2150, loss[loss=0.1466, simple_loss=0.1946, pruned_loss=0.04931, over 19311.00 frames. ], tot_loss[loss=0.1678, simple_loss=0.2442, pruned_loss=0.04567, over 4701793.96 frames. ], batch size: 390, lr: 4.11e-03, grad_scale: 16.0 2023-10-02 11:31:21,410 WARNING [train.py:1204] (2/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_3-33116-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 11:31:21,411 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_4633_210_2040_1_1526824163_4145343_416-76117-0_sp1.1 from training. Duration: 0.863625 2023-10-02 11:31:23,081 WARNING [train.py:1204] (2/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_1033-274376-0_sp1.1 from training. Duration: 0.7909375 2023-10-02 11:31:23,128 WARNING [train.py:1204] (2/4) Exclude cut with ID _8772_210_1850_1_1528635595972_3664011_398-320049-0_sp0.9 from training. Duration: 0.97775 2023-10-02 11:31:24,711 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=864273.3333333334, ans=0.1 2023-10-02 11:31:24,798 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.balancer1.prob, batch_count=864273.3333333334, ans=0.125 2023-10-02 11:31:28,715 WARNING [train.py:1204] (2/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_321-215742-0_sp0.9 from training. Duration: 0.5555625 2023-10-02 11:31:30,158 WARNING [train.py:1204] (2/4) Exclude cut with ID _28394_210_8913_1_1532430049518_3650118_272-190950-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 11:31:31,517 WARNING [train.py:1204] (2/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_31-299371-0_sp1.1 from training. Duration: 0.92725 2023-10-02 11:31:31,644 WARNING [train.py:1204] (2/4) Exclude cut with ID _55383_210_10635_1_1534468426172_2231669_134-128246-0_sp0.9 from training. Duration: 0.988875 2023-10-02 11:31:31,647 WARNING [train.py:1204] (2/4) Exclude cut with ID _40519_210_13794_1_1533715228655_7337390_887-9547-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 11:31:31,674 WARNING [train.py:1204] (2/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_310-109461-0_sp0.9 from training. Duration: 0.888875 2023-10-02 11:31:31,793 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.0.feed_forward1.hidden_balancer.prob, batch_count=864273.3333333334, ans=0.125 2023-10-02 11:31:34,590 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_2009_210_2071_1_1525481134_4588937_258-250196-0_sp1.1 from training. Duration: 0.92725 2023-10-02 11:31:36,296 WARNING [train.py:1204] (2/4) Exclude cut with ID _9235_210_5219_1_1528891154253_8046120_1186-72702-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 11:31:36,298 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_14831_210_7927_1_1530700200_5583610_181-27467-0_sp1.1 from training. Duration: 0.82725 2023-10-02 11:31:40,839 WARNING [train.py:1204] (2/4) Exclude cut with ID _40402_210_18219_1_1533522324115_2953150_278-132109-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 11:31:40,864 WARNING [train.py:1204] (2/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_261-297503-0 from training. Duration: 0.99 2023-10-02 11:31:42,450 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=864340.0, ans=0.1 2023-10-02 11:31:44,555 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.feed_forward1.out_proj.dropout_p, batch_count=864340.0, ans=0.1 2023-10-02 11:31:45,503 WARNING [train.py:1204] (2/4) Exclude cut with ID _19896_210_1883_1_1531709022662_6649569_313-265903-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 11:31:46,878 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_105-114797-0_sp1.1 from training. Duration: 0.763625 2023-10-02 11:31:46,956 WARNING [train.py:1204] (2/4) Exclude cut with ID _53755_210_8254_1_1534577312941_4707360_9-65843-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 11:31:48,290 WARNING [train.py:1204] (2/4) Exclude cut with ID _18516_210_9206_1_1531704653206_3692406_361-224549-0_sp1.1 from training. Duration: 0.97275 2023-10-02 11:31:48,326 WARNING [train.py:1204] (2/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_403-310975-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 11:31:48,367 WARNING [train.py:1204] (2/4) Exclude cut with ID _5444_210_2151_1_1527418808464_5312210_581-315411-0_sp0.9 from training. Duration: 0.688875 2023-10-02 11:31:49,616 WARNING [train.py:1204] (2/4) Exclude cut with ID _23825_210_13449_1_1531915313526_3497150_464-154904-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 11:31:49,628 WARNING [train.py:1204] (2/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_937-208281-0_sp0.9 from training. Duration: 0.9444375 2023-10-02 11:31:50,947 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_37839_210_7234_1_1533542462_3689303_158-181212-0_sp1.1 from training. Duration: 0.963625 2023-10-02 11:31:52,431 WARNING [train.py:1204] (2/4) Exclude cut with ID _8772_210_1850_1_1528635595972_3664011_398-320049-0 from training. Duration: 0.88 2023-10-02 11:31:54,460 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.ff2_skip_rate, batch_count=864406.6666666666, ans=0.0 2023-10-02 11:31:55,549 WARNING [train.py:1204] (2/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_718-250907-0_sp1.1 from training. Duration: 0.62725 2023-10-02 11:31:55,640 WARNING [train.py:1204] (2/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_641-126395-0_sp1.1 from training. Duration: 0.92725 2023-10-02 11:31:55,665 WARNING [train.py:1204] (2/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_166-241493-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 11:31:55,808 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.conv_skip_rate, batch_count=864406.6666666666, ans=0.0 2023-10-02 11:31:58,404 WARNING [train.py:1204] (2/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_526-62079-0_sp0.9 from training. Duration: 0.7444375 2023-10-02 11:31:59,727 WARNING [train.py:1204] (2/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_439-275083-0_sp1.1 from training. Duration: 0.936375 2023-10-02 11:31:59,980 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=864406.6666666666, ans=0.1 2023-10-02 11:32:01,211 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_66907_210_21581_1_1535615661_3590177_493-229032-0_sp1.1 from training. Duration: 0.92725 2023-10-02 11:32:02,665 WARNING [train.py:1204] (2/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_89-321955-0_sp1.1 from training. Duration: 0.72725 2023-10-02 11:32:02,768 WARNING [train.py:1204] (2/4) Exclude cut with ID _29857_210_20034_1_1532395886753_3798160_273-226195-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 11:32:02,774 WARNING [train.py:1204] (2/4) Exclude cut with ID _22826_210_9994_1_1531908201522_3279169_404-75889-0 from training. Duration: 0.89 2023-10-02 11:32:04,109 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_271-234962-0_sp0.9 from training. Duration: 0.87775 2023-10-02 11:32:05,584 WARNING [train.py:1204] (2/4) Exclude cut with ID _22961_210_8254_1_1531976366672_4278180_156-339685-0_sp1.1 from training. Duration: 0.97275 2023-10-02 11:32:06,969 WARNING [train.py:1204] (2/4) Exclude cut with ID _37892_210_19596_1_1533198677897_3964058_425-231927-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 11:32:08,289 WARNING [train.py:1204] (2/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_102-288670-0_sp1.1 from training. Duration: 0.97275 2023-10-02 11:32:09,619 WARNING [train.py:1204] (2/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_283-220372-0_sp1.1 from training. Duration: 0.5090625 2023-10-02 11:32:09,703 WARNING [train.py:1204] (2/4) Exclude cut with ID _44304_210_15374_1_1533639352765_4613129_86-1699-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 11:32:09,928 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.out_combiner.scale_min, batch_count=864473.3333333334, ans=0.2 2023-10-02 11:32:11,532 WARNING [train.py:1204] (2/4) Exclude cut with ID _2891_210_2550_1_1525874398521_4427710_327-121590-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 11:32:11,538 WARNING [train.py:1204] (2/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_369-222060-0 from training. Duration: 0.71 2023-10-02 11:32:14,633 WARNING [train.py:1204] (2/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_762-109886-0 from training. Duration: 0.91 2023-10-02 11:32:14,660 WARNING [train.py:1204] (2/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_477-255782-0_sp0.9 from training. Duration: 0.911125 2023-10-02 11:32:14,684 WARNING [train.py:1204] (2/4) Exclude cut with ID IC0669W0061-330906 from training. Duration: 0.768 2023-10-02 11:32:14,727 WARNING [train.py:1204] (2/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_347-213071-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 11:32:16,697 WARNING [train.py:1204] (2/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_507-303370-0_sp1.1 from training. Duration: 0.87275 2023-10-02 11:32:16,767 WARNING [train.py:1204] (2/4) Exclude cut with ID _15012_210_1968_1_1530856820429_5095350_688-178849-0 from training. Duration: 0.64 2023-10-02 11:32:16,772 WARNING [train.py:1204] (2/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_28-70987-0_sp0.9 from training. Duration: 0.9666875 2023-10-02 11:32:16,781 WARNING [train.py:1204] (2/4) Exclude cut with ID _5910_210_1389_1_1527337972454_5050100_555-159815-0 from training. Duration: 0.98 2023-10-02 11:32:18,095 WARNING [train.py:1204] (2/4) Exclude cut with ID IC0679W0064-335871 from training. Duration: 0.768 2023-10-02 11:32:18,095 WARNING [train.py:1204] (2/4) Exclude cut with ID IC0679W0079-335886 from training. Duration: 0.896 2023-10-02 11:32:18,136 WARNING [train.py:1204] (2/4) Exclude cut with ID _5608_210_2106_1_1527415439089_6819768_909-82767-0 from training. Duration: 0.61 2023-10-02 11:32:19,458 WARNING [train.py:1204] (2/4) Exclude cut with ID _23038_210_9570_1_1532001584158_3652419_411-323967-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 11:32:19,510 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_40896_210_15710_1_1533433956_7100802_350-231748-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 11:32:19,529 WARNING [train.py:1204] (2/4) Exclude cut with ID _5289_210_2084_1_1527678202736_3779299_142-103317-0_sp0.9 from training. Duration: 0.9333125 2023-10-02 11:32:20,828 WARNING [train.py:1204] (2/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_314-144055-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 11:32:22,132 WARNING [train.py:1204] (2/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_177-236809-0_sp0.9 from training. Duration: 0.5444375 2023-10-02 11:32:23,531 WARNING [train.py:1204] (2/4) Exclude cut with ID _23038_210_9570_1_1532001584158_3652419_82-334733-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 11:32:23,545 WARNING [train.py:1204] (2/4) Exclude cut with ID _43552_210_8913_1_1533726031059_3523429_225-22526-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 11:32:33,693 WARNING [train.py:1204] (2/4) Exclude cut with ID _30327_210_16049_1_1532484161295_3924169_401-70819-0_sp1.1 from training. Duration: 0.7818125 2023-10-02 11:32:33,735 WARNING [train.py:1204] (2/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_538-32291-0 from training. Duration: 0.83 2023-10-02 11:32:33,953 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=864606.6666666666, ans=0.1 2023-10-02 11:32:34,870 INFO [train.py:1046] (2/4) Epoch 25, batch 2200, loss[loss=0.185, simple_loss=0.2531, pruned_loss=0.05848, over 23775.00 frames. ], tot_loss[loss=0.168, simple_loss=0.2445, pruned_loss=0.0458, over 4706654.43 frames. ], batch size: 195, lr: 4.11e-03, grad_scale: 16.0 2023-10-02 11:32:36,454 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_37357_210_17024_1_1533089504_2316121_258-211605-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 11:32:36,688 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.conv_module2.balancer1.prob, batch_count=864606.6666666666, ans=0.125 2023-10-02 11:32:40,603 WARNING [train.py:1204] (2/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_550-335198-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 11:32:42,468 WARNING [train.py:1204] (2/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_98-741-0_sp0.9 from training. Duration: 0.888875 2023-10-02 11:32:42,518 WARNING [train.py:1204] (2/4) Exclude cut with ID _38323_210_12108_1_1533121233369_3303049_251-238988-0_sp1.1 from training. Duration: 0.92725 2023-10-02 11:32:43,894 WARNING [train.py:1204] (2/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_259-34038-0_sp0.9 from training. Duration: 0.82225 2023-10-02 11:32:45,889 WARNING [train.py:1204] (2/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_1060-180633-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 11:32:45,920 WARNING [train.py:1204] (2/4) Exclude cut with ID _8755_210_1876_1_1528628559357_3258209_211-37473-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 11:32:47,221 WARNING [train.py:1204] (2/4) Exclude cut with ID _4122_210_2084_1_1526713164417_4353280_528-206771-0 from training. Duration: 0.88 2023-10-02 11:32:52,512 WARNING [train.py:1204] (2/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_64-248359-0 from training. Duration: 0.69 2023-10-02 11:32:52,674 WARNING [train.py:1204] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_402-253027-0_sp1.1 from training. Duration: 0.4909375 2023-10-02 11:32:57,416 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.feed_forward2.hidden_balancer.prob, batch_count=864673.3333333334, ans=0.125 2023-10-02 11:32:58,613 WARNING [train.py:1204] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_625-102955-0 from training. Duration: 0.77 2023-10-02 11:32:58,867 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.bypass_mid.scale_min, batch_count=864673.3333333334, ans=0.2 2023-10-02 11:33:02,564 WARNING [train.py:1204] (2/4) Exclude cut with ID _14998_210_5219_1_1530705582080_7474970_611-219324-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 11:33:02,648 WARNING [train.py:1204] (2/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_718-68505-0_sp0.9 from training. Duration: 0.988875 2023-10-02 11:33:03,998 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_353-182855-0_sp1.1 from training. Duration: 0.87275 2023-10-02 11:33:06,956 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_5960_210_1794_1_1527403865_3990999_515-2613-0_sp0.9 from training. Duration: 0.97775 2023-10-02 11:33:06,983 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_5575_210_1410_1_1527301305_2570170_342-265572-0 from training. Duration: 0.94 2023-10-02 11:33:08,598 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.bypass.skip_rate, batch_count=864740.0, ans=0.04949747468305833 2023-10-02 11:33:11,197 WARNING [train.py:1204] (2/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_329-113173-0_sp0.9 from training. Duration: 0.688875 2023-10-02 11:33:13,081 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_15396_210_6890_1_1531308473_1938936_213-312506-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 11:33:14,427 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_521-214276-0_sp1.1 from training. Duration: 0.4 2023-10-02 11:33:16,292 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.455e+02 1.821e+02 1.981e+02 2.213e+02 3.022e+02, threshold=3.961e+02, percent-clipped=0.0 2023-10-02 11:33:16,463 WARNING [train.py:1204] (2/4) Exclude cut with ID _15026_210_8750_1_1530777602316_3291850_211-195043-0_sp1.1 from training. Duration: 0.72725 2023-10-02 11:33:18,426 WARNING [train.py:1204] (2/4) Exclude cut with ID _29343_210_4381_1_1532602757752_3674109_108-257494-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 11:33:19,590 WARNING [train.py:1204] (2/4) Exclude cut with ID _64414_210_17301_1_1535243252048_3757759_403-58151-0_sp1.1 from training. Duration: 0.963625 2023-10-02 11:33:20,986 WARNING [train.py:1204] (2/4) Exclude cut with ID _36512_210_6987_1_1533033143998_3126069_109-177787-0_sp1.1 from training. Duration: 0.92725 2023-10-02 11:33:22,306 WARNING [train.py:1204] (2/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_498-40153-0 from training. Duration: 0.63 2023-10-02 11:33:23,665 WARNING [train.py:1204] (2/4) Exclude cut with ID _56447_210_1389_1_1535074245910_3697189_161-334523-0_sp1.1 from training. Duration: 0.936375 2023-10-02 11:33:25,563 WARNING [train.py:1204] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_238-245655-0 from training. Duration: 0.81 2023-10-02 11:33:27,038 WARNING [train.py:1204] (2/4) Exclude cut with ID _64631_210_27767_1_1535349580068_4165160_108-43865-0_sp1.1 from training. Duration: 0.92725 2023-10-02 11:33:27,042 WARNING [train.py:1204] (2/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_243-42116-0_sp1.1 from training. Duration: 0.663625 2023-10-02 11:33:28,291 WARNING [train.py:1204] (2/4) Exclude cut with ID _66868_210_9527_1_1535509287954_4561140_594-132160-0_sp1.1 from training. Duration: 0.92725 2023-10-02 11:33:29,568 WARNING [train.py:1204] (2/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_48-338976-0_sp0.9 from training. Duration: 0.988875 2023-10-02 11:33:29,610 WARNING [train.py:1204] (2/4) Exclude cut with ID _5250_210_5219_1_1527249561806_8692110_137-100595-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 11:33:29,615 WARNING [train.py:1204] (2/4) Exclude cut with ID _49748_210_24116_1_1533986971737_3158910_12-271669-0_sp1.1 from training. Duration: 0.936375 2023-10-02 11:33:29,648 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_37357_210_17024_1_1533089504_2316121_206-80421-0_sp1.1 from training. Duration: 0.936375 2023-10-02 11:33:31,398 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.4.encoder.layers.2.feed_forward3.out_whiten, num_groups=1, num_channels=384, metric=12.75 vs. limit=15.0 2023-10-02 11:33:32,216 WARNING [train.py:1204] (2/4) Exclude cut with ID _7537_210_2084_1_1528502395431_3924359_113-199933-0_sp1.1 from training. Duration: 0.536375 2023-10-02 11:33:32,268 WARNING [train.py:1204] (2/4) Exclude cut with ID _55440_210_12651_1_1534496713364_2980520_31-59289-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 11:33:34,804 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1083-220122-0_sp0.9 from training. Duration: 0.5666875 2023-10-02 11:33:37,626 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_426-313557-0_sp1.1 from training. Duration: 0.4818125 2023-10-02 11:33:37,684 WARNING [train.py:1204] (2/4) Exclude cut with ID _35799_210_5854_1_1533263487887_3529071_324-94843-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 11:33:41,611 WARNING [train.py:1204] (2/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_264-108685-0_sp1.1 from training. Duration: 0.5 2023-10-02 11:33:41,697 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9012W0006-501141 from training. Duration: 0.896 2023-10-02 11:33:41,829 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.conv_module1.balancer2.prob, batch_count=864873.3333333334, ans=0.125 2023-10-02 11:33:45,558 WARNING [train.py:1204] (2/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_890-5117-0_sp0.9 from training. Duration: 0.8666875 2023-10-02 11:33:45,607 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9023W0008-506301 from training. Duration: 0.896 2023-10-02 11:33:45,679 WARNING [train.py:1204] (2/4) Exclude cut with ID _13916_210_9994_1_1530788341126_3763149_60-191556-0_sp0.9 from training. Duration: 0.67775 2023-10-02 11:33:47,102 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9029W0331-509721 from training. Duration: 0.896 2023-10-02 11:33:48,991 INFO [train.py:1046] (2/4) Epoch 25, batch 2250, loss[loss=0.1595, simple_loss=0.2451, pruned_loss=0.03697, over 24356.00 frames. ], tot_loss[loss=0.1685, simple_loss=0.2455, pruned_loss=0.04578, over 4711308.93 frames. ], batch size: 61, lr: 4.11e-03, grad_scale: 16.0 2023-10-02 11:33:49,080 WARNING [train.py:1204] (2/4) Exclude cut with ID _35823_210_6917_1_1533187881914_7297830_567-108596-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 11:33:49,127 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7543_210_6753_1_1528544010_1110402_25-183860-0_sp1.1 from training. Duration: 0.42725 2023-10-02 11:33:50,619 WARNING [train.py:1204] (2/4) Exclude cut with ID _47278_210_12041_1_1533902418349_3576110_495-260035-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 11:33:52,020 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9051W0015-520281 from training. Duration: 0.9045625 2023-10-02 11:33:53,411 WARNING [train.py:1204] (2/4) Exclude cut with ID _14459_210_2228_1_1531443641946_7516700_707-66193-0_sp1.1 from training. Duration: 0.97275 2023-10-02 11:33:56,639 WARNING [train.py:1204] (2/4) Exclude cut with ID _14087_210_2253_1_1530529224512_3014029_219-224445-0_sp1.1 from training. Duration: 0.763625 2023-10-02 11:34:02,122 WARNING [train.py:1204] (2/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_345-167986-0_sp0.9 from training. Duration: 0.9333125 2023-10-02 11:34:04,819 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_34815_210_6388_1_1533381320_3571903_279-305565-0_sp1.1 from training. Duration: 0.836375 2023-10-02 11:34:06,396 WARNING [train.py:1204] (2/4) Exclude cut with ID _21987_210_5854_1_1531821500482_3637038_317-179212-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 11:34:06,451 WARNING [train.py:1204] (2/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_395-37222-0_sp1.1 from training. Duration: 0.6909375 2023-10-02 11:34:07,746 WARNING [train.py:1204] (2/4) Exclude cut with ID _8064_210_5399_1_1528372299291_4282973_168-50337-0_sp1.1 from training. Duration: 0.763625 2023-10-02 11:34:10,488 WARNING [train.py:1204] (2/4) Exclude cut with ID _60113_210_16271_1_1535164212481_4386079_559-167191-0 from training. Duration: 0.93 2023-10-02 11:34:10,498 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_47228_210_4983_1_1533780021_3672027_344-179642-0_sp1.1 from training. Duration: 0.92725 2023-10-02 11:34:10,529 WARNING [train.py:1204] (2/4) Exclude cut with ID _22961_210_8254_1_1531976366672_4278180_317-199546-0_sp0.9 from training. Duration: 0.9555625 2023-10-02 11:34:12,044 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.ff3_skip_rate, batch_count=865006.6666666666, ans=0.0 2023-10-02 11:34:13,195 WARNING [train.py:1204] (2/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_141-89727-0 from training. Duration: 0.71 2023-10-02 11:34:13,264 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_78-116075-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 11:34:13,411 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.conv_skip_rate, batch_count=865006.6666666666, ans=0.0 2023-10-02 11:34:15,158 WARNING [train.py:1204] (2/4) Exclude cut with ID _64086_210_7994_1_1535270417019_7435949_52-235756-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 11:34:17,847 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7615_210_4211_1_1528110965_4580160_72-235211-0_sp1.1 from training. Duration: 0.6909375 2023-10-02 11:34:23,125 WARNING [train.py:1204] (2/4) Exclude cut with ID _14069_210_5632_1_1530512692033_4803390_287-27950-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 11:34:24,532 WARNING [train.py:1204] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_74-48717-0_sp0.9 from training. Duration: 0.6444375 2023-10-02 11:34:24,563 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7544_210_6753_1_1528591327_1471426_172-348777-0_sp1.1 from training. Duration: 0.6 2023-10-02 11:34:25,979 WARNING [train.py:1204] (2/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_220-191080-0 from training. Duration: 0.98 2023-10-02 11:34:26,054 WARNING [train.py:1204] (2/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_1081-137641-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 11:34:28,833 WARNING [train.py:1204] (2/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_278-235459-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 11:34:33,516 WARNING [train.py:1204] (2/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_1181-77470-0_sp1.1 from training. Duration: 0.936375 2023-10-02 11:34:34,944 WARNING [train.py:1204] (2/4) Exclude cut with ID _60860_210_8254_1_1534896015875_3557169_198-5531-0_sp1.1 from training. Duration: 0.936375 2023-10-02 11:34:36,327 WARNING [train.py:1204] (2/4) Exclude cut with ID _12369_210_4381_1_1530356078581_4388520_17-289781-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 11:34:36,341 WARNING [train.py:1204] (2/4) Exclude cut with ID _50310_210_15488_1_1534068043345_3641080_146-199752-0_sp1.1 from training. Duration: 0.92725 2023-10-02 11:34:37,786 WARNING [train.py:1204] (2/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_969-213354-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 11:34:40,458 WARNING [train.py:1204] (2/4) Exclude cut with ID _70519_210_4163_1_1535961595994_7458230_66-344156-0_sp1.1 from training. Duration: 0.963625 2023-10-02 11:34:44,625 WARNING [train.py:1204] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_380-204756-0_sp1.1 from training. Duration: 0.7818125 2023-10-02 11:34:46,097 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_128-202104-0_sp0.9 from training. Duration: 0.7 2023-10-02 11:34:51,158 WARNING [train.py:1204] (2/4) Exclude cut with ID _4697_210_2069_1_1526815801953_3823098_51-1049-0_sp0.9 from training. Duration: 0.8333125 2023-10-02 11:34:51,184 WARNING [train.py:1204] (2/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_86-66192-0_sp1.1 from training. Duration: 0.5 2023-10-02 11:34:51,252 WARNING [train.py:1204] (2/4) Exclude cut with ID _14069_210_5632_1_1530512692033_4803390_95-1896-0_sp1.1 from training. Duration: 0.8909375 2023-10-02 11:34:57,206 WARNING [train.py:1204] (2/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_29-293896-0_sp1.1 from training. Duration: 0.5545625 2023-10-02 11:35:00,552 WARNING [train.py:1204] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_167-137255-0_sp0.9 from training. Duration: 0.711125 2023-10-02 11:35:00,553 WARNING [train.py:1204] (2/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_217-95056-0 from training. Duration: 0.98 2023-10-02 11:35:00,567 WARNING [train.py:1204] (2/4) Exclude cut with ID _47278_210_12041_1_1533902418349_3576110_97-266746-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 11:35:01,978 WARNING [train.py:1204] (2/4) Exclude cut with ID _14224_210_4381_1_1530529062037_4264010_380-343670-0_sp1.1 from training. Duration: 0.8 2023-10-02 11:35:03,267 INFO [train.py:1046] (2/4) Epoch 25, batch 2300, loss[loss=0.1751, simple_loss=0.2588, pruned_loss=0.0457, over 24417.00 frames. ], tot_loss[loss=0.1701, simple_loss=0.2472, pruned_loss=0.04648, over 4703662.99 frames. ], batch size: 77, lr: 4.11e-03, grad_scale: 16.0 2023-10-02 11:35:05,859 WARNING [train.py:1204] (2/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_107-332398-0 from training. Duration: 0.78 2023-10-02 11:35:07,445 WARNING [train.py:1204] (2/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_91-267696-0_sp1.1 from training. Duration: 0.7909375 2023-10-02 11:35:08,617 WARNING [train.py:1204] (2/4) Exclude cut with ID _41298_210_9019_1_1533430371450_4822143_498-186993-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 11:35:12,995 WARNING [train.py:1204] (2/4) Exclude cut with ID _17956_210_4353_1_1531209568125_6979970_54-77610-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 11:35:13,032 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_40047_210_2290_1_1533290519_2374311_257-335162-0_sp1.1 from training. Duration: 0.863625 2023-10-02 11:35:16,055 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9653W0193-675471 from training. Duration: 0.9656875 2023-10-02 11:35:17,373 WARNING [train.py:1204] (2/4) Exclude cut with ID _25248_210_8750_1_1532062825941_5554180_243-240536-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 11:35:20,311 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.2.encoder.layers.1.nonlin_attention.whiten2, num_groups=1, num_channels=384, metric=6.61 vs. limit=15.0 2023-10-02 11:35:23,009 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_32360_210_4198_1_1532689160_3648436_60-51908-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 11:35:24,109 WARNING [train.py:1204] (2/4) Exclude cut with ID _4092_210_2151_1_1526643044449_3135759_227-213972-0_sp0.9 from training. Duration: 0.6 2023-10-02 11:35:24,153 WARNING [train.py:1204] (2/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_392-173500-0_sp1.1 from training. Duration: 0.97275 2023-10-02 11:35:24,183 WARNING [train.py:1204] (2/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_71-329150-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 11:35:24,185 WARNING [train.py:1204] (2/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_866-173440-0 from training. Duration: 0.9 2023-10-02 11:35:26,102 WARNING [train.py:1204] (2/4) Exclude cut with ID _14832_210_8750_1_1530691351539_3171348_1-54315-0_sp0.9 from training. Duration: 0.9666875 2023-10-02 11:35:27,488 WARNING [train.py:1204] (2/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_430-116139-0_sp1.1 from training. Duration: 0.77275 2023-10-02 11:35:27,529 WARNING [train.py:1204] (2/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_356-45054-0_sp0.9 from training. Duration: 0.9555625 2023-10-02 11:35:30,636 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_3531_210_3528_1_1526294203_5216456_309-227302-0_sp0.9 from training. Duration: 0.7666875 2023-10-02 11:35:34,732 WARNING [train.py:1204] (2/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_301-330455-0_sp1.1 from training. Duration: 0.72725 2023-10-02 11:35:36,252 WARNING [train.py:1204] (2/4) Exclude cut with ID _23557_210_8750_1_1531890106370_6846255_667-145945-0_sp1.1 from training. Duration: 0.92725 2023-10-02 11:35:40,339 WARNING [train.py:1204] (2/4) Exclude cut with ID _14860_210_4915_1_1530702020340_7343240_836-328489-0_sp1.1 from training. Duration: 0.8181875 2023-10-02 11:35:40,388 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_37152_210_9697_1_1533106573_3844188_416-333249-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 11:35:43,092 WARNING [train.py:1204] (2/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_538-32291-0_sp0.9 from training. Duration: 0.92225 2023-10-02 11:35:44,372 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.490e+02 1.801e+02 2.122e+02 2.550e+02 3.360e+02, threshold=4.244e+02, percent-clipped=0.0 2023-10-02 11:35:45,724 WARNING [train.py:1204] (2/4) Exclude cut with ID _32704_210_8954_1_1533017488840_6747370_965-176824-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 11:35:46,059 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.balancer1.prob, batch_count=865473.3333333334, ans=0.125 2023-10-02 11:35:49,134 WARNING [train.py:1204] (2/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_86-19601-0_sp1.1 from training. Duration: 0.77275 2023-10-02 11:35:49,207 WARNING [train.py:1204] (2/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_436-318113-0_sp1.1 from training. Duration: 0.6818125 2023-10-02 11:35:49,235 WARNING [train.py:1204] (2/4) Exclude cut with ID _41817_210_18835_1_1533434477160_7423049_555-293448-0_sp0.9 from training. Duration: 0.888875 2023-10-02 11:35:49,247 WARNING [train.py:1204] (2/4) Exclude cut with ID _4697_210_2069_1_1526815801953_3823098_68-219807-0 from training. Duration: 0.47 2023-10-02 11:35:53,332 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.feed_forward3.hidden_balancer.prob, batch_count=865473.3333333334, ans=0.125 2023-10-02 11:35:54,463 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7544_210_6753_1_1528591327_1471426_33-346031-0_sp1.1 from training. Duration: 0.5545625 2023-10-02 11:35:54,465 WARNING [train.py:1204] (2/4) Exclude cut with ID _49748_210_24116_1_1533986971737_3158910_263-52113-0_sp1.1 from training. Duration: 0.97275 2023-10-02 11:35:55,830 WARNING [train.py:1204] (2/4) Exclude cut with ID _64639_210_5816_1_1535367619437_3528000_340-24580-0_sp1.1 from training. Duration: 0.92725 2023-10-02 11:35:55,844 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_47228_210_4983_1_1533780021_3672027_479-114941-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 11:35:55,873 WARNING [train.py:1204] (2/4) Exclude cut with ID _44585_210_18628_1_1533605350387_4518199_506-119659-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 11:35:57,346 WARNING [train.py:1204] (2/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_314-126819-0_sp1.1 from training. Duration: 0.4545625 2023-10-02 11:35:59,051 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_2541_210_1794_1_1525590269_3084765_156-173713-0_sp0.9 from training. Duration: 0.9 2023-10-02 11:35:59,110 WARNING [train.py:1204] (2/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_108-255430-0 from training. Duration: 0.97 2023-10-02 11:35:59,118 WARNING [train.py:1204] (2/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_791-45149-0_sp1.1 from training. Duration: 0.7818125 2023-10-02 11:35:59,124 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_53378_210_17282_1_1534227054_1261425_10-118295-0_sp1.1 from training. Duration: 0.97275 2023-10-02 11:36:00,509 WARNING [train.py:1204] (2/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_148-158509-0 from training. Duration: 0.41 2023-10-02 11:36:00,848 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.bypass.scale_min, batch_count=865473.3333333334, ans=0.2 2023-10-02 11:36:03,594 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_34726_210_4525_1_1533019474_5030395_687-291305-0_sp1.1 from training. Duration: 0.936375 2023-10-02 11:36:07,503 WARNING [train.py:1204] (2/4) Exclude cut with ID _14860_210_4915_1_1530702020340_7343240_264-324537-0_sp1.1 from training. Duration: 0.963625 2023-10-02 11:36:11,447 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_41251_210_15368_1_1533429767_4820695_591-46386-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 11:36:12,827 WARNING [train.py:1204] (2/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_86-19601-0_sp0.9 from training. Duration: 0.9444375 2023-10-02 11:36:12,849 WARNING [train.py:1204] (2/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_401-255491-0_sp0.9 from training. Duration: 0.77775 2023-10-02 11:36:14,272 WARNING [train.py:1204] (2/4) Exclude cut with ID _8696_210_1876_1_1528545632440_3345058_209-54069-0_sp1.1 from training. Duration: 0.6090625 2023-10-02 11:36:14,281 WARNING [train.py:1204] (2/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_369-325184-0_sp1.1 from training. Duration: 0.9 2023-10-02 11:36:14,338 WARNING [train.py:1204] (2/4) Exclude cut with ID _14687_210_5854_1_1530703838259_3664889_115-239046-0_sp0.9 from training. Duration: 0.7444375 2023-10-02 11:36:15,648 WARNING [train.py:1204] (2/4) Exclude cut with ID _8064_210_5399_1_1528372299291_4282973_168-50337-0 from training. Duration: 0.84 2023-10-02 11:36:17,544 INFO [train.py:1046] (2/4) Epoch 25, batch 2350, loss[loss=0.1732, simple_loss=0.2609, pruned_loss=0.04274, over 24629.00 frames. ], tot_loss[loss=0.1704, simple_loss=0.2474, pruned_loss=0.0467, over 4698291.69 frames. ], batch size: 68, lr: 4.11e-03, grad_scale: 16.0 2023-10-02 11:36:21,730 WARNING [train.py:1204] (2/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_909-64151-0_sp1.1 from training. Duration: 0.8545625 2023-10-02 11:36:21,747 WARNING [train.py:1204] (2/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_597-154640-0 from training. Duration: 0.9 2023-10-02 11:36:21,929 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.ff3_skip_rate, batch_count=865606.6666666666, ans=0.0 2023-10-02 11:36:28,783 WARNING [train.py:1204] (2/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_126-274694-0 from training. Duration: 0.98 2023-10-02 11:36:31,485 WARNING [train.py:1204] (2/4) Exclude cut with ID _21783_210_11357_1_1531742458730_3762679_344-269786-0_sp1.1 from training. Duration: 0.97275 2023-10-02 11:36:31,724 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=865673.3333333334, ans=0.1 2023-10-02 11:36:32,980 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_37818_210_4238_1_1533620910_4720083_322-253154-0_sp1.1 from training. Duration: 0.92725 2023-10-02 11:36:32,990 WARNING [train.py:1204] (2/4) Exclude cut with ID _58895_210_8750_1_1535184081320_3649089_221-15823-0_sp1.1 from training. Duration: 0.92725 2023-10-02 11:36:33,231 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.ff2_skip_rate, batch_count=865673.3333333334, ans=0.0 2023-10-02 11:36:34,356 WARNING [train.py:1204] (2/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_9-21737-0_sp1.1 from training. Duration: 0.8818125 2023-10-02 11:36:34,395 WARNING [train.py:1204] (2/4) Exclude cut with ID _14069_210_5632_1_1530512692033_4803390_522-341588-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 11:36:35,756 WARNING [train.py:1204] (2/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_363-185703-0 from training. Duration: 0.95 2023-10-02 11:36:38,590 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.nonlin_attention.balancer.max_positive, batch_count=865673.3333333334, ans=0.95 2023-10-02 11:36:39,770 WARNING [train.py:1204] (2/4) Exclude cut with ID _41817_210_18835_1_1533434477160_7423049_265-76216-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 11:36:41,330 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.feed_forward3.hidden_balancer.prob, batch_count=865673.3333333334, ans=0.125 2023-10-02 11:36:45,277 WARNING [train.py:1204] (2/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_841-341679-0 from training. Duration: 0.58 2023-10-02 11:36:46,653 WARNING [train.py:1204] (2/4) Exclude cut with ID _55429_210_10626_1_1534467588527_7174143_97-335084-0_sp1.1 from training. Duration: 0.8818125 2023-10-02 11:36:49,918 WARNING [train.py:1204] (2/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_690-126306-0_sp0.9 from training. Duration: 0.8666875 2023-10-02 11:36:49,927 WARNING [train.py:1204] (2/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_102-92751-0_sp1.1 from training. Duration: 0.9 2023-10-02 11:36:52,653 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_3787_210_2040_1_1526477544_5478586_603-120324-0_sp1.1 from training. Duration: 0.736375 2023-10-02 11:36:54,670 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_33348_210_5632_1_1533086912_3324030_85-28583-0 from training. Duration: 0.82 2023-10-02 11:36:55,984 WARNING [train.py:1204] (2/4) Exclude cut with ID _60057_210_10640_1_1534903065793_3333150_362-230377-0_sp1.1 from training. Duration: 0.7909375 2023-10-02 11:36:57,421 WARNING [train.py:1204] (2/4) Exclude cut with ID _60113_210_16271_1_1535164212481_4386079_194-146486-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 11:36:57,441 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_53753_210_7234_1_1534320003_3594148_171-277668-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 11:36:57,470 WARNING [train.py:1204] (2/4) Exclude cut with ID _34423_210_8750_1_1533430798456_3330199_251-332788-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 11:37:02,059 WARNING [train.py:1204] (2/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_663-166973-0_sp0.9 from training. Duration: 0.888875 2023-10-02 11:37:02,286 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.conv_module2.balancer2.prob, batch_count=865806.6666666666, ans=0.125 2023-10-02 11:37:04,850 WARNING [train.py:1204] (2/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_927-43827-0 from training. Duration: 0.74 2023-10-02 11:37:04,904 WARNING [train.py:1204] (2/4) Exclude cut with ID _28323_210_16187_1_1532480395667_4100300_306-74561-0_sp1.1 from training. Duration: 0.8545625 2023-10-02 11:37:06,390 WARNING [train.py:1204] (2/4) Exclude cut with ID _15037_210_2253_1_1530752294104_3267650_139-337989-0_sp1.1 from training. Duration: 0.92725 2023-10-02 11:37:06,415 WARNING [train.py:1204] (2/4) Exclude cut with ID _49039_210_6315_1_1533967537541_3846118_460-263670-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 11:37:09,071 WARNING [train.py:1204] (2/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_186-309161-0 from training. Duration: 0.54 2023-10-02 11:37:09,147 WARNING [train.py:1204] (2/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_87-278686-0_sp1.1 from training. Duration: 0.7 2023-10-02 11:37:13,245 WARNING [train.py:1204] (2/4) Exclude cut with ID _27052_210_5986_1_1532329197187_3585009_314-106499-0 from training. Duration: 0.92 2023-10-02 11:37:13,274 WARNING [train.py:1204] (2/4) Exclude cut with ID _3465_210_3525_1_1526175667060_3574049_412-287526-0_sp0.9 from training. Duration: 0.8 2023-10-02 11:37:17,371 WARNING [train.py:1204] (2/4) Exclude cut with ID _22961_210_8254_1_1531976366672_4278180_317-199546-0 from training. Duration: 0.86 2023-10-02 11:37:20,227 WARNING [train.py:1204] (2/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_381-317791-0 from training. Duration: 0.73 2023-10-02 11:37:20,273 WARNING [train.py:1204] (2/4) Exclude cut with ID _44010_210_4525_1_1533884262964_4013620_560-310411-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 11:37:22,158 WARNING [train.py:1204] (2/4) Exclude cut with ID _5294_210_1747_1_1527249643928_3696179_342-270278-0_sp0.9 from training. Duration: 0.611125 2023-10-02 11:37:22,177 WARNING [train.py:1204] (2/4) Exclude cut with ID ID0473W0002-918951 from training. Duration: 0.924 2023-10-02 11:37:22,194 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1083-220122-0 from training. Duration: 0.51 2023-10-02 11:37:24,900 WARNING [train.py:1204] (2/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_138-154812-0 from training. Duration: 0.78 2023-10-02 11:37:29,573 WARNING [train.py:1204] (2/4) Exclude cut with ID _44982_210_1511_1_1534312820108_3726029_331-19420-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 11:37:31,493 INFO [train.py:1046] (2/4) Epoch 25, batch 2400, loss[loss=0.1678, simple_loss=0.2315, pruned_loss=0.05201, over 23615.00 frames. ], tot_loss[loss=0.1702, simple_loss=0.2473, pruned_loss=0.04655, over 4702485.10 frames. ], batch size: 232, lr: 4.10e-03, grad_scale: 32.0 2023-10-02 11:37:34,257 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_2655_210_2087_1_1525688959_3897400_202-147557-0_sp0.9 from training. Duration: 0.9444375 2023-10-02 11:37:38,450 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_23850_210_4238_1_1532232597_1632433_32-185135-0_sp0.9 from training. Duration: 0.9555625 2023-10-02 11:37:38,557 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_14056_210_4163_1_1530604483_7572827_46-93547-0_sp1.1 from training. Duration: 0.863625 2023-10-02 11:37:38,599 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7931_210_1794_1_1528594728_5564059_280-279560-0 from training. Duration: 0.98 2023-10-02 11:37:39,906 WARNING [train.py:1204] (2/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_937-208281-0 from training. Duration: 0.85 2023-10-02 11:37:40,128 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.1.feed_forward1.out_proj.dropout_p, batch_count=865940.0, ans=0.1 2023-10-02 11:37:41,622 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.attention_skip_rate, batch_count=865940.0, ans=0.0 2023-10-02 11:37:45,405 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_14928_210_5632_1_1530773461_4714000_454-112198-0_sp0.9 from training. Duration: 0.7333125 2023-10-02 11:37:45,406 WARNING [train.py:1204] (2/4) Exclude cut with ID _32704_210_8954_1_1533017488840_6747370_944-153333-0_sp1.1 from training. Duration: 0.963625 2023-10-02 11:37:46,856 WARNING [train.py:1204] (2/4) Exclude cut with ID _14821_210_8750_1_1530680503991_6829359_2-227151-0 from training. Duration: 0.83 2023-10-02 11:37:46,880 WARNING [train.py:1204] (2/4) Exclude cut with ID _28461_210_2253_1_1532771775189_3041299_96-152158-0_sp1.1 from training. Duration: 0.77275 2023-10-02 11:37:48,112 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7837_210_1792_1_1528453445_5841305_654-54195-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 11:37:49,488 WARNING [train.py:1204] (2/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_344-152526-0 from training. Duration: 0.53 2023-10-02 11:37:54,235 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_30167_210_3528_1_1532428012_4772375_480-142230-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 11:37:56,232 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_160-218721-0 from training. Duration: 0.96 2023-10-02 11:38:02,669 WARNING [train.py:1204] (2/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_65-88242-0_sp0.9 from training. Duration: 0.62225 2023-10-02 11:38:02,874 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.conv_module2.balancer1.prob, batch_count=866073.3333333334, ans=0.125 2023-10-02 11:38:05,436 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_34886_210_10633_1_1533203882_1755661_76-156096-0 from training. Duration: 0.87 2023-10-02 11:38:07,034 WARNING [train.py:1204] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_188-38388-0_sp1.1 from training. Duration: 0.92725 2023-10-02 11:38:09,833 WARNING [train.py:1204] (2/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_81-289335-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 11:38:12,533 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.315e+02 1.806e+02 1.972e+02 2.221e+02 3.865e+02, threshold=3.945e+02, percent-clipped=0.0 2023-10-02 11:38:13,941 WARNING [train.py:1204] (2/4) Exclude cut with ID _59493_210_17282_1_1535078419077_3225879_110-8778-0_sp1.1 from training. Duration: 0.936375 2023-10-02 11:38:15,283 WARNING [train.py:1204] (2/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_188-260011-0 from training. Duration: 0.73 2023-10-02 11:38:15,316 WARNING [train.py:1204] (2/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_453-133983-0_sp0.9 from training. Duration: 0.8666875 2023-10-02 11:38:15,466 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.1.self_attn_weights.pos_emb_skip_rate, batch_count=866140.0, ans=0.0 2023-10-02 11:38:21,534 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8054_210_7927_1_1528627721_4984094_553-17379-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 11:38:24,193 WARNING [train.py:1204] (2/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_341-171072-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 11:38:24,419 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.conv_skip_rate, batch_count=866140.0, ans=0.0 2023-10-02 11:38:25,633 WARNING [train.py:1204] (2/4) Exclude cut with ID _30800_210_22382_1_1532499347904_4515959_265-257286-0_sp1.1 from training. Duration: 0.963625 2023-10-02 11:38:27,437 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_403-39619-0_sp0.9 from training. Duration: 0.9333125 2023-10-02 11:38:27,442 WARNING [train.py:1204] (2/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_464-33931-0_sp0.9 from training. Duration: 0.6 2023-10-02 11:38:27,470 WARNING [train.py:1204] (2/4) Exclude cut with ID _60804_210_9642_1_1534831199859_7268299_632-143863-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 11:38:27,477 WARNING [train.py:1204] (2/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_431-310997-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 11:38:29,410 WARNING [train.py:1204] (2/4) Exclude cut with ID _31649_210_6228_1_1532584608063_4091078_114-263860-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 11:38:29,429 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6961_210_2087_1_1527937168_6815011_562-165518-0_sp0.9 from training. Duration: 0.8555625 2023-10-02 11:38:33,979 WARNING [train.py:1204] (2/4) Exclude cut with ID _27052_210_5986_1_1532329197187_3585009_338-325870-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 11:38:35,328 WARNING [train.py:1204] (2/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_469-94673-0_sp1.1 from training. Duration: 0.7181875 2023-10-02 11:38:35,359 WARNING [train.py:1204] (2/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_259-34038-0 from training. Duration: 0.74 2023-10-02 11:38:36,674 WARNING [train.py:1204] (2/4) Exclude cut with ID _14922_210_6228_1_1530707435273_3693310_388-129552-0 from training. Duration: 0.96 2023-10-02 11:38:39,343 WARNING [train.py:1204] (2/4) Exclude cut with ID _28394_210_8913_1_1532430049518_3650118_342-251455-0_sp1.1 from training. Duration: 0.936375 2023-10-02 11:38:39,374 WARNING [train.py:1204] (2/4) Exclude cut with ID _53731_210_7994_1_1534553979244_3757214_8-191390-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 11:38:39,407 WARNING [train.py:1204] (2/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_613-232856-0 from training. Duration: 0.54 2023-10-02 11:38:40,807 WARNING [train.py:1204] (2/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_557-349534-0 from training. Duration: 0.91 2023-10-02 11:38:40,815 WARNING [train.py:1204] (2/4) Exclude cut with ID _14224_210_4381_1_1530529062037_4264010_379-184223-0 from training. Duration: 0.85 2023-10-02 11:38:40,825 WARNING [train.py:1204] (2/4) Exclude cut with ID IC0125W0001-61807 from training. Duration: 0.966 2023-10-02 11:38:40,958 INFO [scaling.py:1118] (2/4) WithLoss: name=encoder.encoders.0.layers.0.self_attn_weights, loss-sum=0.000e+00 2023-10-02 11:38:42,295 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_229-45300-0 from training. Duration: 0.57 2023-10-02 11:38:43,599 WARNING [train.py:1204] (2/4) Exclude cut with ID _30304_210_6319_1_1532476827261_7582060_1069-24743-0_sp1.1 from training. Duration: 0.92725 2023-10-02 11:38:44,983 INFO [train.py:1046] (2/4) Epoch 25, batch 2450, loss[loss=0.1621, simple_loss=0.2453, pruned_loss=0.03944, over 24400.00 frames. ], tot_loss[loss=0.1692, simple_loss=0.2464, pruned_loss=0.04603, over 4710537.36 frames. ], batch size: 77, lr: 4.10e-03, grad_scale: 32.0 2023-10-02 11:38:45,076 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_41452_210_7291_1_1533437687_3941281_323-172873-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 11:38:45,085 WARNING [train.py:1204] (2/4) Exclude cut with ID _37086_210_9038_1_1532999540891_7021250_802-226709-0_sp1.1 from training. Duration: 0.97275 2023-10-02 11:38:46,499 WARNING [train.py:1204] (2/4) Exclude cut with ID IC0144W0500-71767 from training. Duration: 0.931875 2023-10-02 11:38:46,561 WARNING [train.py:1204] (2/4) Exclude cut with ID _39846_210_12651_1_1533605310265_2115212_233-129699-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 11:38:47,912 WARNING [train.py:1204] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_574-45541-0_sp0.9 from training. Duration: 0.72225 2023-10-02 11:38:49,485 WARNING [train.py:1204] (2/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_372-298093-0_sp1.1 from training. Duration: 0.72725 2023-10-02 11:38:51,330 WARNING [train.py:1204] (2/4) Exclude cut with ID _62574_210_2655_1_1535180407058_3933249_34-26343-0_sp1.1 from training. Duration: 0.97275 2023-10-02 11:38:54,032 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_512-256238-0_sp1.1 from training. Duration: 0.963625 2023-10-02 11:38:54,038 WARNING [train.py:1204] (2/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_196-55502-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 11:38:54,132 WARNING [train.py:1204] (2/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_195-167011-0 from training. Duration: 0.75 2023-10-02 11:39:00,531 WARNING [train.py:1204] (2/4) Exclude cut with ID _13678_210_7497_1_1530530069226_3463170_345-230416-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 11:39:00,549 WARNING [train.py:1204] (2/4) Exclude cut with ID _51078_210_9070_1_1534161557424_3805250_225-327809-0_sp1.1 from training. Duration: 0.963625 2023-10-02 11:39:03,340 WARNING [train.py:1204] (2/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_18-189198-0_sp1.1 from training. Duration: 0.8181875 2023-10-02 11:39:05,186 WARNING [train.py:1204] (2/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_329-82235-0_sp1.1 from training. Duration: 0.7454375 2023-10-02 11:39:05,201 WARNING [train.py:1204] (2/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_726-270780-0_sp0.9 from training. Duration: 0.9555625 2023-10-02 11:39:05,238 WARNING [train.py:1204] (2/4) Exclude cut with ID _66873_210_5517_1_1535454136506_4293370_654-52713-0 from training. Duration: 0.77 2023-10-02 11:39:09,390 WARNING [train.py:1204] (2/4) Exclude cut with ID _20455_210_5219_1_1531565996775_8098100_191-16006-0_sp1.1 from training. Duration: 0.963625 2023-10-02 11:39:12,212 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_3171_210_2087_1_1526109084_2330007_66-260583-0_sp1.1 from training. Duration: 0.6545625 2023-10-02 11:39:12,272 WARNING [train.py:1204] (2/4) Exclude cut with ID _38670_210_15379_1_1533448810659_4391059_63-129006-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 11:39:16,305 WARNING [train.py:1204] (2/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_393-57888-0_sp0.9 from training. Duration: 0.711125 2023-10-02 11:39:16,346 WARNING [train.py:1204] (2/4) Exclude cut with ID _6275_210_2084_1_1528110062213_7669049_857-298942-0_sp0.9 from training. Duration: 0.9444375 2023-10-02 11:39:17,943 WARNING [train.py:1204] (2/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_239-22076-0_sp0.9 from training. Duration: 0.9444375 2023-10-02 11:39:17,967 WARNING [train.py:1204] (2/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_424-227947-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 11:39:18,163 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.feed_forward2.hidden_balancer.prob, batch_count=866406.6666666666, ans=0.125 2023-10-02 11:39:20,638 WARNING [train.py:1204] (2/4) Exclude cut with ID _14069_210_5632_1_1530512692033_4803390_95-1896-0 from training. Duration: 0.98 2023-10-02 11:39:20,712 WARNING [train.py:1204] (2/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_96-244299-0_sp0.9 from training. Duration: 0.97775 2023-10-02 11:39:26,258 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.3.encoder.layers.1.whiten, num_groups=1, num_channels=512, metric=5.46 vs. limit=12.0 2023-10-02 11:39:27,633 WARNING [train.py:1204] (2/4) Exclude cut with ID _46683_210_9642_1_1533730913492_4591116_562-319825-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 11:39:28,973 WARNING [train.py:1204] (2/4) Exclude cut with ID _64639_210_5816_1_1535367619437_3528000_284-173474-0_sp1.1 from training. Duration: 0.963625 2023-10-02 11:39:30,943 WARNING [train.py:1204] (2/4) Exclude cut with ID _29529_210_7234_1_1532393961455_3977460_497-100133-0_sp1.1 from training. Duration: 0.936375 2023-10-02 11:39:30,969 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_12868_210_6753_1_1530701932_530275_18-76322-0_sp0.9 from training. Duration: 0.9666875 2023-10-02 11:39:31,001 WARNING [train.py:1204] (2/4) Exclude cut with ID _50093_210_4220_1_1534417141207_3432299_116-303447-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 11:39:32,311 WARNING [train.py:1204] (2/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_44-79427-0_sp1.1 from training. Duration: 0.8909375 2023-10-02 11:39:32,359 WARNING [train.py:1204] (2/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_887-249555-0 from training. Duration: 0.77 2023-10-02 11:39:36,902 WARNING [train.py:1204] (2/4) Exclude cut with ID _28461_210_2253_1_1532771775189_3041299_96-152158-0_sp0.9 from training. Duration: 0.9444375 2023-10-02 11:39:36,959 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_14058_210_4163_1_1530690953_6327180_49-210687-0_sp1.1 from training. Duration: 0.8545625 2023-10-02 11:39:39,845 WARNING [train.py:1204] (2/4) Exclude cut with ID _40399_210_4353_1_1533305658194_3279406_351-127903-0_sp1.1 from training. Duration: 0.97275 2023-10-02 11:39:39,850 WARNING [train.py:1204] (2/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_184-206939-0_sp1.1 from training. Duration: 0.936375 2023-10-02 11:39:43,934 WARNING [train.py:1204] (2/4) Exclude cut with ID _14100_210_8341_1_1530529320613_3678069_258-224599-0_sp1.1 from training. Duration: 0.7 2023-10-02 11:39:43,962 WARNING [train.py:1204] (2/4) Exclude cut with ID _6493_210_1747_1_1528459210424_4012470_324-323854-0 from training. Duration: 0.92 2023-10-02 11:39:45,344 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_12868_210_6753_1_1530702479_3610564_151-325626-0_sp1.1 from training. Duration: 0.8818125 2023-10-02 11:39:46,690 WARNING [train.py:1204] (2/4) Exclude cut with ID _14528_210_8254_1_1530669700514_4306939_106-43145-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 11:39:46,715 WARNING [train.py:1204] (2/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_686-54383-0 from training. Duration: 0.87 2023-10-02 11:39:46,762 WARNING [train.py:1204] (2/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_801-102362-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 11:39:48,214 WARNING [train.py:1204] (2/4) Exclude cut with ID _48677_210_5663_1_1533902405919_3892989_408-40740-0_sp0.9 from training. Duration: 0.92225 2023-10-02 11:39:51,623 WARNING [train.py:1204] (2/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_448-212135-0_sp1.1 from training. Duration: 0.82725 2023-10-02 11:39:52,984 WARNING [train.py:1204] (2/4) Exclude cut with ID _59170_210_6721_1_1534983633960_4203335_320-124047-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 11:39:53,016 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_4692_210_3528_1_1526899755_4323254_107-44401-0_sp1.1 from training. Duration: 0.7818125 2023-10-02 11:39:58,780 WARNING [train.py:1204] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_731-2428-0 from training. Duration: 0.83 2023-10-02 11:39:58,874 WARNING [train.py:1204] (2/4) Exclude cut with ID _29857_210_20034_1_1532395886753_3798160_335-163562-0_sp1.1 from training. Duration: 0.77275 2023-10-02 11:40:00,114 INFO [train.py:1046] (2/4) Epoch 25, batch 2500, loss[loss=0.1869, simple_loss=0.2646, pruned_loss=0.05454, over 24347.00 frames. ], tot_loss[loss=0.1685, simple_loss=0.2455, pruned_loss=0.04571, over 4705243.63 frames. ], batch size: 77, lr: 4.10e-03, grad_scale: 32.0 2023-10-02 11:40:05,418 WARNING [train.py:1204] (2/4) Exclude cut with ID _14832_210_8750_1_1530691351539_3171348_171-275004-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 11:40:14,988 WARNING [train.py:1204] (2/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_105-328174-0_sp0.9 from training. Duration: 0.8444375 2023-10-02 11:40:15,013 WARNING [train.py:1204] (2/4) Exclude cut with ID _68965_210_1883_1_1535698962272_3497490_164-118421-0_sp1.1 from training. Duration: 0.936375 2023-10-02 11:40:16,368 WARNING [train.py:1204] (2/4) Exclude cut with ID _19896_210_1883_1_1531709022662_6649569_380-24170-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 11:40:16,373 WARNING [train.py:1204] (2/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_505-205270-0_sp1.1 from training. Duration: 0.3454375 2023-10-02 11:40:23,722 WARNING [train.py:1204] (2/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_427-208901-0_sp1.1 from training. Duration: 0.6181875 2023-10-02 11:40:23,758 WARNING [train.py:1204] (2/4) Exclude cut with ID _23818_210_6228_1_1531917063884_3676920_85-326916-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 11:40:23,934 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.bypass.skip_rate, batch_count=866673.3333333334, ans=0.04949747468305833 2023-10-02 11:40:25,260 WARNING [train.py:1204] (2/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_101-293367-0_sp1.1 from training. Duration: 0.57275 2023-10-02 11:40:25,268 WARNING [train.py:1204] (2/4) Exclude cut with ID _5409_210_1388_1_1527292850189_5865191_178-127970-0_sp1.1 from training. Duration: 0.5181875 2023-10-02 11:40:25,317 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_403-39619-0 from training. Duration: 0.84 2023-10-02 11:40:25,433 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.0.feed_forward1.out_proj.dropout_p, batch_count=866673.3333333334, ans=0.1 2023-10-02 11:40:27,957 WARNING [train.py:1204] (2/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_427-87841-0_sp1.1 from training. Duration: 0.963625 2023-10-02 11:40:28,000 WARNING [train.py:1204] (2/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_286-68181-0_sp1.1 from training. Duration: 0.97275 2023-10-02 11:40:28,041 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6673_210_3273_1_1527767790_3173240_327-306185-0 from training. Duration: 0.83 2023-10-02 11:40:28,055 WARNING [train.py:1204] (2/4) Exclude cut with ID _3675_210_4557_1_1526639934408_4182089_106-203713-0_sp1.1 from training. Duration: 0.963625 2023-10-02 11:40:28,231 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.bypass.scale_min, batch_count=866740.0, ans=0.2 2023-10-02 11:40:29,912 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_53071_210_16554_1_1534493434_1276796_130-36076-0 from training. Duration: 0.95 2023-10-02 11:40:29,944 WARNING [train.py:1204] (2/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_586-93054-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 11:40:34,091 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.1.encoder.layers.1.nonlin_attention.whiten1, num_groups=1, num_channels=192, metric=5.84 vs. limit=10.0 2023-10-02 11:40:35,001 WARNING [train.py:1204] (2/4) Exclude cut with ID _23825_210_13449_1_1531915313526_3497150_262-10571-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 11:40:36,370 WARNING [train.py:1204] (2/4) Exclude cut with ID _28783_210_7943_1_1532671164808_7155479_432-67885-0_sp1.1 from training. Duration: 0.97275 2023-10-02 11:40:39,112 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6287_210_3273_1_1527509506_2653716_182-199887-0_sp0.9 from training. Duration: 0.5666875 2023-10-02 11:40:39,158 WARNING [train.py:1204] (2/4) Exclude cut with ID _3231_210_2106_1_1526205864934_6784220_171-256004-0 from training. Duration: 0.86 2023-10-02 11:40:39,205 WARNING [train.py:1204] (2/4) Exclude cut with ID _69051_210_1531_1_1535885566901_3829538_332-92090-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 11:40:41,814 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.456e+02 1.832e+02 2.049e+02 2.331e+02 3.606e+02, threshold=4.098e+02, percent-clipped=0.0 2023-10-02 11:40:41,936 WARNING [train.py:1204] (2/4) Exclude cut with ID _3675_210_4557_1_1526639934408_4182089_104-324125-0_sp1.1 from training. Duration: 0.963625 2023-10-02 11:40:44,733 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_41764_210_2521_1_1533686110_4089313_501-2119-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 11:40:47,579 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_14056_210_4163_1_1530604483_7572827_56-100988-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 11:40:50,285 WARNING [train.py:1204] (2/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_398-239819-0_sp1.1 from training. Duration: 0.9 2023-10-02 11:40:55,519 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.2.encoder.layers.1.feed_forward2.out_whiten, num_groups=1, num_channels=384, metric=4.02 vs. limit=15.0 2023-10-02 11:40:56,060 WARNING [train.py:1204] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_74-48717-0_sp1.1 from training. Duration: 0.52725 2023-10-02 11:40:57,740 WARNING [train.py:1204] (2/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_177-49092-0 from training. Duration: 0.91 2023-10-02 11:40:59,066 WARNING [train.py:1204] (2/4) Exclude cut with ID _62337_210_12392_1_1535009273105_3412229_44-176353-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 11:40:59,082 WARNING [train.py:1204] (2/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_110-130961-0_sp1.1 from training. Duration: 0.42725 2023-10-02 11:41:00,569 WARNING [train.py:1204] (2/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_37-189262-0_sp1.1 from training. Duration: 0.8909375 2023-10-02 11:41:00,569 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_3172_210_2087_1_1526292929_3987865_197-97762-0_sp0.9 from training. Duration: 0.8333125 2023-10-02 11:41:02,370 WARNING [train.py:1204] (2/4) Exclude cut with ID IC0677W0065-334897 from training. Duration: 0.896 2023-10-02 11:41:02,371 WARNING [train.py:1204] (2/4) Exclude cut with ID IC0677W0080-334912 from training. Duration: 0.768 2023-10-02 11:41:02,375 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_120-41668-0 from training. Duration: 0.7 2023-10-02 11:41:04,480 WARNING [train.py:1204] (2/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_460-205287-0_sp1.1 from training. Duration: 0.963625 2023-10-02 11:41:05,887 WARNING [train.py:1204] (2/4) Exclude cut with ID _14632_210_9988_1_1530691279392_3923200_102-204477-0 from training. Duration: 0.97 2023-10-02 11:41:05,892 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_34815_210_6388_1_1533381320_3571903_279-305565-0 from training. Duration: 0.92 2023-10-02 11:41:07,222 WARNING [train.py:1204] (2/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_91-148831-0_sp1.1 from training. Duration: 0.9 2023-10-02 11:41:08,538 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6291_210_3273_1_1527683649_1078498_82-221196-0 from training. Duration: 0.76 2023-10-02 11:41:11,324 WARNING [train.py:1204] (2/4) Exclude cut with ID _38020_210_5986_1_1533112252259_7006089_493-102745-0 from training. Duration: 0.98 2023-10-02 11:41:12,756 WARNING [train.py:1204] (2/4) Exclude cut with ID _61133_210_9969_1_1535196777227_3696409_40-93442-0_sp1.1 from training. Duration: 0.92725 2023-10-02 11:41:14,051 INFO [train.py:1046] (2/4) Epoch 25, batch 2550, loss[loss=0.1925, simple_loss=0.2681, pruned_loss=0.05846, over 23476.00 frames. ], tot_loss[loss=0.1683, simple_loss=0.2455, pruned_loss=0.04556, over 4705537.73 frames. ], batch size: 106, lr: 4.10e-03, grad_scale: 32.0 2023-10-02 11:41:15,497 WARNING [train.py:1204] (2/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_45-258852-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 11:41:15,524 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_53071_210_16554_1_1534493434_1276796_130-36076-0_sp1.1 from training. Duration: 0.863625 2023-10-02 11:41:16,923 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8694_210_6400_1_1529065754_3260756_332-13553-0_sp1.1 from training. Duration: 0.92725 2023-10-02 11:41:18,237 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_405-339986-0 from training. Duration: 0.51 2023-10-02 11:41:20,011 WARNING [train.py:1204] (2/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_927-43827-0_sp1.1 from training. Duration: 0.67275 2023-10-02 11:41:22,776 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_397-147196-0 from training. Duration: 0.8 2023-10-02 11:41:23,197 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.5.encoder.layers.1.self_attn2.whiten, num_groups=1, num_channels=256, metric=9.87 vs. limit=22.5 2023-10-02 11:41:24,109 WARNING [train.py:1204] (2/4) Exclude cut with ID 3557-8342-0013-54691-0_sp1.1 from training. Duration: 0.836375 2023-10-02 11:41:26,784 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_47438_210_5399_1_1533797471_4513356_626-127722-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 11:41:28,219 WARNING [train.py:1204] (2/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_256-323883-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 11:41:28,231 WARNING [train.py:1204] (2/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_839-173017-0_sp0.9 from training. Duration: 0.4555625 2023-10-02 11:41:29,533 WARNING [train.py:1204] (2/4) Exclude cut with ID _5289_210_2084_1_1527678202736_3779299_392-261988-0_sp1.1 from training. Duration: 0.5909375 2023-10-02 11:41:29,573 WARNING [train.py:1204] (2/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_119-224384-0_sp1.1 from training. Duration: 0.936375 2023-10-02 11:41:29,611 WARNING [train.py:1204] (2/4) Exclude cut with ID _57523_210_1856_1_1534766294545_3732989_262-267933-0_sp1.1 from training. Duration: 0.963625 2023-10-02 11:41:33,406 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_35361_210_4238_1_1533262810_7793476_675-87012-0_sp0.9 from training. Duration: 0.988875 2023-10-02 11:41:33,422 WARNING [train.py:1204] (2/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_17-90541-0 from training. Duration: 0.94 2023-10-02 11:41:33,454 WARNING [train.py:1204] (2/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_535-14410-0_sp1.1 from training. Duration: 0.42725 2023-10-02 11:41:33,459 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_24673_210_6400_1_1531968633_778659_94-154511-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 11:41:33,462 WARNING [train.py:1204] (2/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_212-10501-0 from training. Duration: 0.94 2023-10-02 11:41:34,335 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.0.self_attn2.whiten.whitening_limit, batch_count=867006.6666666666, ans=22.5 2023-10-02 11:41:44,652 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.ff2_skip_rate, batch_count=867073.3333333334, ans=0.0 2023-10-02 11:41:44,658 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.conv_skip_rate, batch_count=867073.3333333334, ans=0.0 2023-10-02 11:41:47,453 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.bypass_mid.scale_min, batch_count=867073.3333333334, ans=0.2 2023-10-02 11:41:48,428 WARNING [train.py:1204] (2/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_383-285196-0_sp1.1 from training. Duration: 0.8818125 2023-10-02 11:41:54,417 WARNING [train.py:1204] (2/4) Exclude cut with ID _45560_210_21982_1_1533812603951_3584999_238-47020-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 11:41:54,433 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_21500_210_2054_1_1532149310_2716758_278-18053-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 11:41:54,447 WARNING [train.py:1204] (2/4) Exclude cut with ID _56235_210_10640_1_1534557335235_4161099_453-9626-0_sp1.1 from training. Duration: 0.936375 2023-10-02 11:41:54,583 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.0.balancer1.prob, batch_count=867073.3333333334, ans=0.125 2023-10-02 11:41:55,982 WARNING [train.py:1204] (2/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_153-97441-0_sp1.1 from training. Duration: 0.5090625 2023-10-02 11:41:58,953 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.conv_skip_rate, batch_count=867140.0, ans=0.0 2023-10-02 11:42:03,102 WARNING [train.py:1204] (2/4) Exclude cut with ID _67725_210_27767_1_1535608756671_5105359_431-120895-0_sp1.1 from training. Duration: 0.963625 2023-10-02 11:42:06,285 WARNING [train.py:1204] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_574-45541-0_sp1.1 from training. Duration: 0.5909375 2023-10-02 11:42:06,300 WARNING [train.py:1204] (2/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_162-103620-0_sp1.1 from training. Duration: 0.8454375 2023-10-02 11:42:06,311 WARNING [train.py:1204] (2/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_734-129761-0_sp1.1 from training. Duration: 0.7454375 2023-10-02 11:42:06,347 WARNING [train.py:1204] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_620-276793-0_sp1.1 from training. Duration: 0.536375 2023-10-02 11:42:08,264 WARNING [train.py:1204] (2/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_153-97441-0_sp0.9 from training. Duration: 0.62225 2023-10-02 11:42:08,469 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.balancer2.prob, batch_count=867140.0, ans=0.125 2023-10-02 11:42:11,003 WARNING [train.py:1204] (2/4) Exclude cut with ID _12733_210_8341_1_1530167424556_3646380_523-85621-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 11:42:11,024 WARNING [train.py:1204] (2/4) Exclude cut with ID _64048_210_10626_1_1535198382802_3835179_352-179834-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 11:42:16,422 WARNING [train.py:1204] (2/4) Exclude cut with ID _55162_210_17071_1_1534408248583_3753480_43-242364-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 11:42:16,439 WARNING [train.py:1204] (2/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_661-264857-0 from training. Duration: 0.93 2023-10-02 11:42:16,445 WARNING [train.py:1204] (2/4) Exclude cut with ID _6274_210_2084_1_1527937583837_7494020_325-237800-0_sp1.1 from training. Duration: 0.97275 2023-10-02 11:42:16,488 WARNING [train.py:1204] (2/4) Exclude cut with ID _55328_210_27767_1_1534580973260_4088170_385-171459-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 11:42:17,867 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_3780_210_2087_1_1526467389_2145572_176-107140-0_sp1.1 from training. Duration: 0.62725 2023-10-02 11:42:19,204 WARNING [train.py:1204] (2/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_375-244976-0_sp1.1 from training. Duration: 0.6818125 2023-10-02 11:42:20,699 WARNING [train.py:1204] (2/4) Exclude cut with ID _35941_210_15379_1_1532995294515_4023089_309-33162-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 11:42:26,619 INFO [train.py:1046] (2/4) Epoch 25, batch 2600, loss[loss=0.167, simple_loss=0.2552, pruned_loss=0.0394, over 24443.00 frames. ], tot_loss[loss=0.1683, simple_loss=0.2458, pruned_loss=0.04542, over 4724701.15 frames. ], batch size: 69, lr: 4.10e-03, grad_scale: 16.0 2023-10-02 11:42:26,720 WARNING [train.py:1204] (2/4) Exclude cut with ID _14459_210_2228_1_1531443641946_7516700_1185-22038-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 11:42:28,218 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_1197-69460-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 11:42:30,982 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9012W0022-501157 from training. Duration: 0.896 2023-10-02 11:42:32,467 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9023W0009-506302 from training. Duration: 0.896 2023-10-02 11:42:32,488 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_2821_210_2715_1_1526108385_3763560_73-296673-0_sp1.1 from training. Duration: 0.7545625 2023-10-02 11:42:33,775 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9025W0113-507442 from training. Duration: 0.896 2023-10-02 11:42:33,848 WARNING [train.py:1204] (2/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_739-2098-0 from training. Duration: 0.92 2023-10-02 11:42:33,865 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9029W0332-509722 from training. Duration: 0.768 2023-10-02 11:42:37,672 WARNING [train.py:1204] (2/4) Exclude cut with ID _16908_210_5724_1_1531119157248_3772700_68-101953-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 11:42:38,981 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9042W0011-515617 from training. Duration: 0.768 2023-10-02 11:42:40,886 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_3019_210_2028_1_1526044245_3264865_191-72235-0 from training. Duration: 0.97 2023-10-02 11:42:42,222 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9052W0006-520792 from training. Duration: 0.896 2023-10-02 11:42:43,605 WARNING [train.py:1204] (2/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_245-174553-0_sp1.1 from training. Duration: 0.8 2023-10-02 11:42:45,087 WARNING [train.py:1204] (2/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_202-67017-0 from training. Duration: 0.82 2023-10-02 11:42:46,494 WARNING [train.py:1204] (2/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_304-59227-0 from training. Duration: 0.77 2023-10-02 11:42:47,864 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_3133_210_2087_1_1526106446_2324690_246-125000-0_sp0.9 from training. Duration: 0.688875 2023-10-02 11:42:47,902 WARNING [train.py:1204] (2/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_243-42116-0 from training. Duration: 0.73 2023-10-02 11:42:49,758 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.3.encoder.layers.3.feed_forward1.out_whiten, num_groups=1, num_channels=512, metric=9.35 vs. limit=15.0 2023-10-02 11:42:50,516 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9087W0121-538492 from training. Duration: 0.896 2023-10-02 11:42:50,539 WARNING [train.py:1204] (2/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_120-76843-0 from training. Duration: 0.55 2023-10-02 11:42:56,678 WARNING [train.py:1204] (2/4) Exclude cut with ID _7852_210_5219_1_1528286471316_8139400_850-304621-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 11:42:56,695 WARNING [train.py:1204] (2/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_154-285415-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 11:42:57,953 WARNING [train.py:1204] (2/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_538-278194-0_sp1.1 from training. Duration: 0.92725 2023-10-02 11:42:57,954 WARNING [train.py:1204] (2/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_119-207162-0 from training. Duration: 0.71 2023-10-02 11:43:00,689 WARNING [train.py:1204] (2/4) Exclude cut with ID _2334_210_2667_1_1525608238880_4705129_459-104997-0_sp1.1 from training. Duration: 0.863625 2023-10-02 11:43:05,615 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9147W0019-567847 from training. Duration: 0.768 2023-10-02 11:43:10,118 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.653e+02 1.853e+02 2.071e+02 2.375e+02 3.577e+02, threshold=4.142e+02, percent-clipped=0.0 2023-10-02 11:43:12,064 WARNING [train.py:1204] (2/4) Exclude cut with ID _69884_210_9771_1_1535846392818_4166169_228-189439-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 11:43:12,097 WARNING [train.py:1204] (2/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_196-77172-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 11:43:12,164 WARNING [train.py:1204] (2/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_625-338546-0 from training. Duration: 0.88 2023-10-02 11:43:13,498 WARNING [train.py:1204] (2/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_193-327630-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 11:43:13,507 WARNING [train.py:1204] (2/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_391-24006-0_sp1.1 from training. Duration: 0.92725 2023-10-02 11:43:14,941 WARNING [train.py:1204] (2/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_197-28164-0 from training. Duration: 0.8 2023-10-02 11:43:17,655 WARNING [train.py:1204] (2/4) Exclude cut with ID _8395_210_1531_1_1528597871523_4411579_431-132470-0_sp1.1 from training. Duration: 0.67275 2023-10-02 11:43:17,679 WARNING [train.py:1204] (2/4) Exclude cut with ID _35350_210_16040_1_1533040191183_4075299_465-248326-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 11:43:20,309 WARNING [train.py:1204] (2/4) Exclude cut with ID _16756_210_4915_1_1531047803763_7104259_101-141922-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 11:43:23,073 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9212W0005-600172 from training. Duration: 0.896 2023-10-02 11:43:24,337 WARNING [train.py:1204] (2/4) Exclude cut with ID _63186_210_2084_1_1535414479705_3908649_378-166820-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 11:43:24,357 WARNING [train.py:1204] (2/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_52-211683-0_sp1.1 from training. Duration: 0.8181875 2023-10-02 11:43:29,050 WARNING [train.py:1204] (2/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_1147-237729-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 11:43:30,338 WARNING [train.py:1204] (2/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_234-14174-0_sp0.9 from training. Duration: 0.911125 2023-10-02 11:43:30,358 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_454-276429-0 from training. Duration: 0.87 2023-10-02 11:43:31,663 WARNING [train.py:1204] (2/4) Exclude cut with ID _53713_210_24116_1_1534314487762_7416751_755-165313-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 11:43:31,791 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8278_210_2550_1_1528637099_4220413_436-317072-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 11:43:33,189 WARNING [train.py:1204] (2/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_28-324729-0_sp1.1 from training. Duration: 0.8545625 2023-10-02 11:43:35,409 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.conv_module2.balancer1.prob, batch_count=867540.0, ans=0.125 2023-10-02 11:43:38,490 WARNING [train.py:1204] (2/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_326-165900-0 from training. Duration: 0.69 2023-10-02 11:43:39,752 WARNING [train.py:1204] (2/4) Exclude cut with ID _53489_210_6228_1_1534298357169_3835970_96-30246-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 11:43:40,524 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8275_210_4198_1_1528455320_3892571_491-17707-0_sp1.1 from training. Duration: 0.5818125 2023-10-02 11:43:41,628 INFO [train.py:1046] (2/4) Epoch 25, batch 2650, loss[loss=0.1779, simple_loss=0.249, pruned_loss=0.05345, over 23862.00 frames. ], tot_loss[loss=0.1688, simple_loss=0.2465, pruned_loss=0.04553, over 4723649.95 frames. ], batch size: 195, lr: 4.10e-03, grad_scale: 16.0 2023-10-02 11:43:44,734 WARNING [train.py:1204] (2/4) Exclude cut with ID _14348_210_8341_1_1530599434996_3778640_240-232370-0 from training. Duration: 0.84 2023-10-02 11:43:44,744 WARNING [train.py:1204] (2/4) Exclude cut with ID _45012_210_12651_1_1533608435136_5371380_54-112623-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 11:43:46,059 WARNING [train.py:1204] (2/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_328-264776-0_sp1.1 from training. Duration: 0.6090625 2023-10-02 11:43:47,253 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9325W0001-648427 from training. Duration: 0.768 2023-10-02 11:43:47,261 WARNING [train.py:1204] (2/4) Exclude cut with ID _56480_210_4220_1_1534679942079_3075409_49-74717-0_sp1.1 from training. Duration: 0.936375 2023-10-02 11:43:51,498 WARNING [train.py:1204] (2/4) Exclude cut with ID _59500_210_8254_1_1534809610680_3736009_6-241869-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 11:43:51,648 WARNING [train.py:1204] (2/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_172-215932-0_sp0.9 from training. Duration: 0.7333125 2023-10-02 11:43:53,008 WARNING [train.py:1204] (2/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_17-90541-0_sp1.1 from training. Duration: 0.8545625 2023-10-02 11:43:53,176 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.bypass.scale_min, batch_count=867606.6666666666, ans=0.2 2023-10-02 11:43:55,697 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_43831_210_2158_1_1533725502_5872791_525-89447-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 11:43:55,760 WARNING [train.py:1204] (2/4) Exclude cut with ID _12302_210_5219_1_1529924446466_8107280_907-182949-0 from training. Duration: 0.92 2023-10-02 11:43:55,772 WARNING [train.py:1204] (2/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_402-102311-0_sp0.9 from training. Duration: 0.7444375 2023-10-02 11:43:55,817 WARNING [train.py:1204] (2/4) Exclude cut with ID _8021_210_7994_1_1528376451358_3653069_179-45206-0_sp0.9 from training. Duration: 0.9555625 2023-10-02 11:43:59,116 WARNING [train.py:1204] (2/4) Exclude cut with ID _40137_210_12210_1_1533290484046_3795180_249-187059-0 from training. Duration: 0.88 2023-10-02 11:43:59,207 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9653W0194-675472 from training. Duration: 0.9658125 2023-10-02 11:44:02,457 WARNING [train.py:1204] (2/4) Exclude cut with ID _51332_210_9988_1_1534244371836_3801270_336-255604-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 11:44:03,820 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_20120_210_6753_1_1531643035_5482859_510-96996-0 from training. Duration: 0.87 2023-10-02 11:44:03,837 WARNING [train.py:1204] (2/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_1167-20437-0_sp1.1 from training. Duration: 0.92725 2023-10-02 11:44:05,069 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_3475_210_2028_1_1526193578_3622569_265-276625-0 from training. Duration: 0.76 2023-10-02 11:44:11,498 WARNING [train.py:1204] (2/4) Exclude cut with ID _14860_210_4915_1_1530702020340_7343240_470-172418-0_sp1.1 from training. Duration: 0.97275 2023-10-02 11:44:11,506 WARNING [train.py:1204] (2/4) Exclude cut with ID _5444_210_2151_1_1527418808464_5312210_581-315411-0_sp1.1 from training. Duration: 0.563625 2023-10-02 11:44:11,534 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_34450_210_9547_1_1533099443_4109050_519-342439-0_sp1.1 from training. Duration: 0.97275 2023-10-02 11:44:12,879 WARNING [train.py:1204] (2/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_50-175961-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 11:44:15,729 WARNING [train.py:1204] (2/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_372-6869-0 from training. Duration: 0.44 2023-10-02 11:44:16,960 WARNING [train.py:1204] (2/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_3-237434-0 from training. Duration: 0.76 2023-10-02 11:44:18,675 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.balancer1.prob, batch_count=867740.0, ans=0.125 2023-10-02 11:44:19,786 WARNING [train.py:1204] (2/4) Exclude cut with ID _8772_210_1850_1_1528635595972_3664011_247-336448-0_sp1.1 from training. Duration: 0.67275 2023-10-02 11:44:23,889 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_427-78097-0 from training. Duration: 0.8 2023-10-02 11:44:23,912 WARNING [train.py:1204] (2/4) Exclude cut with ID _69884_210_9771_1_1535846392818_4166169_466-252727-0_sp1.1 from training. Duration: 0.97275 2023-10-02 11:44:25,243 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_15972_210_6400_1_1530877934_3405692_316-35478-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 11:44:25,268 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_809-17156-0_sp0.9 from training. Duration: 0.811125 2023-10-02 11:44:25,306 WARNING [train.py:1204] (2/4) Exclude cut with ID _57572_210_5517_1_1534921219624_3809484_594-263474-0_sp1.1 from training. Duration: 0.936375 2023-10-02 11:44:25,532 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=867806.6666666666, ans=0.1 2023-10-02 11:44:26,591 WARNING [train.py:1204] (2/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_190-228204-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 11:44:26,721 WARNING [train.py:1204] (2/4) Exclude cut with ID _19514_210_9259_1_1531530071330_5084230_117-21193-0_sp1.1 from training. Duration: 0.936375 2023-10-02 11:44:28,832 WARNING [train.py:1204] (2/4) Exclude cut with ID _69584_210_5219_1_1535785133775_4044450_190-21315-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 11:44:30,239 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_53071_210_16554_1_1534493434_1276796_184-302050-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 11:44:30,285 WARNING [train.py:1204] (2/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_120-76843-0_sp1.1 from training. Duration: 0.5 2023-10-02 11:44:31,707 WARNING [train.py:1204] (2/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_318-162017-0_sp1.1 from training. Duration: 0.82725 2023-10-02 11:44:31,804 WARNING [train.py:1204] (2/4) Exclude cut with ID _46067_210_4220_1_1534125571066_3307450_79-46864-0_sp1.1 from training. Duration: 0.92725 2023-10-02 11:44:33,123 WARNING [train.py:1204] (2/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_358-215558-0_sp1.1 from training. Duration: 0.8454375 2023-10-02 11:44:35,109 WARNING [train.py:1204] (2/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_621-66764-0_sp1.1 from training. Duration: 0.92725 2023-10-02 11:44:36,480 WARNING [train.py:1204] (2/4) Exclude cut with ID _42863_210_9070_1_1533902398788_3696920_338-156851-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 11:44:36,524 WARNING [train.py:1204] (2/4) Exclude cut with ID _5289_210_2084_1_1527678202736_3779299_392-261988-0_sp0.9 from training. Duration: 0.72225 2023-10-02 11:44:39,950 WARNING [train.py:1204] (2/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_409-84682-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 11:44:41,925 WARNING [train.py:1204] (2/4) Exclude cut with ID _9235_210_5219_1_1528891154253_8046120_30-326681-0_sp0.9 from training. Duration: 0.92225 2023-10-02 11:44:41,930 WARNING [train.py:1204] (2/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_389-184695-0_sp1.1 from training. Duration: 0.92725 2023-10-02 11:44:41,966 WARNING [train.py:1204] (2/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_442-320317-0 from training. Duration: 0.82 2023-10-02 11:44:46,043 WARNING [train.py:1204] (2/4) Exclude cut with ID _44778_210_2054_1_1533798127203_7443090_756-29911-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 11:44:47,370 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_401-112036-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 11:44:48,721 WARNING [train.py:1204] (2/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_181-308827-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 11:44:48,789 WARNING [train.py:1204] (2/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_332-258725-0_sp1.1 from training. Duration: 0.963625 2023-10-02 11:44:50,124 WARNING [train.py:1204] (2/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_188-260011-0_sp0.9 from training. Duration: 0.811125 2023-10-02 11:44:51,468 WARNING [train.py:1204] (2/4) Exclude cut with ID _49039_210_6315_1_1533967537541_3846118_99-302239-0_sp1.1 from training. Duration: 0.963625 2023-10-02 11:44:54,018 WARNING [train.py:1204] (2/4) Exclude cut with ID _4122_210_2084_1_1526713164417_4353280_312-294828-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 11:44:54,024 WARNING [train.py:1204] (2/4) Exclude cut with ID _14922_210_6228_1_1530707435273_3693310_303-228738-0 from training. Duration: 0.84 2023-10-02 11:44:54,176 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder_embed.convnext.layerdrop_rate, batch_count=867940.0, ans=0.015 2023-10-02 11:44:55,440 INFO [train.py:1046] (2/4) Epoch 25, batch 2700, loss[loss=0.1659, simple_loss=0.2319, pruned_loss=0.04994, over 23528.00 frames. ], tot_loss[loss=0.1711, simple_loss=0.2483, pruned_loss=0.04698, over 4697904.85 frames. ], batch size: 256, lr: 4.10e-03, grad_scale: 16.0 2023-10-02 11:44:56,903 WARNING [train.py:1204] (2/4) Exclude cut with ID _14349_210_8341_1_1530615710818_3442470_115-285975-0_sp0.9 from training. Duration: 0.9555625 2023-10-02 11:44:57,080 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.1.conv_module2.balancer1.prob, batch_count=867940.0, ans=0.125 2023-10-02 11:44:58,255 WARNING [train.py:1204] (2/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_125-144174-0_sp1.1 from training. Duration: 0.3545625 2023-10-02 11:45:00,049 WARNING [train.py:1204] (2/4) Exclude cut with ID _53565_210_10635_1_1534337218335_3820259_351-54570-0_sp1.1 from training. Duration: 0.97275 2023-10-02 11:45:01,354 WARNING [train.py:1204] (2/4) Exclude cut with ID _28317_210_12742_1_1532480302381_8510325_410-40065-0_sp1.1 from training. Duration: 0.963625 2023-10-02 11:45:01,382 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_38965_210_4852_1_1533174906_3484735_167-339442-0_sp1.1 from training. Duration: 0.963625 2023-10-02 11:45:02,819 WARNING [train.py:1204] (2/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_265-164486-0_sp1.1 from training. Duration: 0.87275 2023-10-02 11:45:02,829 WARNING [train.py:1204] (2/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_725-182484-0_sp1.1 from training. Duration: 0.92725 2023-10-02 11:45:02,854 WARNING [train.py:1204] (2/4) Exclude cut with ID _28317_210_12742_1_1532480302381_8510325_607-213951-0_sp0.9 from training. Duration: 0.9333125 2023-10-02 11:45:02,870 WARNING [train.py:1204] (2/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_605-133395-0_sp1.1 from training. Duration: 0.6 2023-10-02 11:45:02,901 WARNING [train.py:1204] (2/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_351-294650-0 from training. Duration: 0.63 2023-10-02 11:45:03,090 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.conv_module2.balancer2.prob, batch_count=867940.0, ans=0.125 2023-10-02 11:45:04,288 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_14953_210_2151_1_1530701710_5616165_710-165400-0_sp1.1 from training. Duration: 0.7909375 2023-10-02 11:45:05,669 WARNING [train.py:1204] (2/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_237-58433-0_sp1.1 from training. Duration: 0.67275 2023-10-02 11:45:08,025 WARNING [train.py:1204] (2/4) Exclude cut with ID _5190_210_4557_1_1527244368799_373439_28-197030-0_sp1.1 from training. Duration: 0.6545625 2023-10-02 11:45:08,082 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_743-327651-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 11:45:12,210 WARNING [train.py:1204] (2/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_76-203867-0_sp0.9 from training. Duration: 0.8 2023-10-02 11:45:14,250 WARNING [train.py:1204] (2/4) Exclude cut with ID _14603_210_4220_1_1530615688441_3097319_274-274798-0 from training. Duration: 0.98 2023-10-02 11:45:14,308 WARNING [train.py:1204] (2/4) Exclude cut with ID _14087_210_2253_1_1530529224512_3014029_226-242140-0_sp1.1 from training. Duration: 0.736375 2023-10-02 11:45:18,489 WARNING [train.py:1204] (2/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_352-59958-0_sp1.1 from training. Duration: 0.7818125 2023-10-02 11:45:18,493 WARNING [train.py:1204] (2/4) Exclude cut with ID _39411_210_5986_1_1533538825030_3343649_191-251819-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 11:45:22,594 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7046_210_6753_1_1527895337_6920917_642-106799-0_sp1.1 from training. Duration: 0.836375 2023-10-02 11:45:22,601 WARNING [train.py:1204] (2/4) Exclude cut with ID _22826_210_9994_1_1531908201522_3279169_412-3442-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 11:45:23,883 WARNING [train.py:1204] (2/4) Exclude cut with ID _60057_210_10640_1_1534903065793_3333150_171-181133-0_sp0.9 from training. Duration: 0.97775 2023-10-02 11:45:23,892 WARNING [train.py:1204] (2/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_154-135696-0_sp1.1 from training. Duration: 0.72725 2023-10-02 11:45:26,801 WARNING [train.py:1204] (2/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_148-186805-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 11:45:28,379 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.conv_module2.balancer1.prob, batch_count=868073.3333333334, ans=0.125 2023-10-02 11:45:29,900 WARNING [train.py:1204] (2/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_605-165641-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 11:45:29,907 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_174-277364-0_sp0.9 from training. Duration: 0.788875 2023-10-02 11:45:29,923 WARNING [train.py:1204] (2/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_89-321955-0_sp0.9 from training. Duration: 0.888875 2023-10-02 11:45:34,045 WARNING [train.py:1204] (2/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_438-105429-0_sp1.1 from training. Duration: 0.963625 2023-10-02 11:45:34,048 WARNING [train.py:1204] (2/4) Exclude cut with ID _14970_210_15709_1_1530773543493_1715730_187-20259-0_sp0.9 from training. Duration: 0.988875 2023-10-02 11:45:39,106 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.430e+02 1.840e+02 2.089e+02 2.422e+02 3.725e+02, threshold=4.178e+02, percent-clipped=0.0 2023-10-02 11:45:44,338 WARNING [train.py:1204] (2/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_385-193052-0_sp0.9 from training. Duration: 0.9555625 2023-10-02 11:45:44,385 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7113_210_1849_1_1527942685_3448768_194-311626-0_sp1.1 from training. Duration: 0.92725 2023-10-02 11:45:47,834 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_3780_210_2087_1_1526467389_2145572_176-107140-0_sp0.9 from training. Duration: 0.7666875 2023-10-02 11:45:47,836 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_36756_210_7234_1_1533026991_966276_123-213457-0_sp1.1 from training. Duration: 0.936375 2023-10-02 11:45:48,091 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.feed_forward3.hidden_balancer.prob, batch_count=868140.0, ans=0.125 2023-10-02 11:45:51,948 WARNING [train.py:1204] (2/4) Exclude cut with ID _23557_210_8750_1_1531890106370_6846255_456-244861-0_sp1.1 from training. Duration: 0.963625 2023-10-02 11:45:52,017 WARNING [train.py:1204] (2/4) Exclude cut with ID _37086_210_9038_1_1532999540891_7021250_242-349170-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 11:45:54,567 WARNING [train.py:1204] (2/4) Exclude cut with ID _41298_210_9019_1_1533430371450_4822143_471-281351-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 11:45:54,667 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_53378_210_17282_1_1534221071_5769509_572-126176-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 11:45:56,029 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_1507-263032-0_sp1.1 from training. Duration: 0.963625 2023-10-02 11:45:56,064 WARNING [train.py:1204] (2/4) Exclude cut with ID _13679_210_7497_1_1530534293662_3355098_411-186088-0_sp1.1 from training. Duration: 0.8545625 2023-10-02 11:45:58,847 WARNING [train.py:1204] (2/4) Exclude cut with ID _29345_210_4381_1_1532689168263_3860169_149-257268-0_sp1.1 from training. Duration: 0.736375 2023-10-02 11:46:00,204 WARNING [train.py:1204] (2/4) Exclude cut with ID _47790_210_16187_1_1533862814066_4207519_381-252014-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 11:46:00,216 WARNING [train.py:1204] (2/4) Exclude cut with ID _35799_210_5854_1_1533263487887_3529071_512-2717-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 11:46:03,336 WARNING [train.py:1204] (2/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_505-205270-0_sp0.9 from training. Duration: 0.42225 2023-10-02 11:46:04,693 WARNING [train.py:1204] (2/4) Exclude cut with ID _14998_210_5219_1_1530705582080_7474970_731-20117-0_sp1.1 from training. Duration: 0.936375 2023-10-02 11:46:06,512 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_37351_210_7925_1_1533012931_3812113_265-85780-0_sp1.1 from training. Duration: 0.8 2023-10-02 11:46:06,528 WARNING [train.py:1204] (2/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_313-134930-0 from training. Duration: 0.97 2023-10-02 11:46:06,998 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.5.encoder.layers.0.feed_forward3.out_whiten, num_groups=1, num_channels=256, metric=15.66 vs. limit=15.0 2023-10-02 11:46:09,183 INFO [train.py:1046] (2/4) Epoch 25, batch 2750, loss[loss=0.1536, simple_loss=0.2151, pruned_loss=0.04606, over 23408.00 frames. ], tot_loss[loss=0.1708, simple_loss=0.2478, pruned_loss=0.04696, over 4705465.58 frames. ], batch size: 285, lr: 4.10e-03, grad_scale: 16.0 2023-10-02 11:46:09,272 WARNING [train.py:1204] (2/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_374-138971-0 from training. Duration: 0.94 2023-10-02 11:46:09,318 WARNING [train.py:1204] (2/4) Exclude cut with ID _30304_210_6319_1_1532476827261_7582060_1227-287886-0_sp1.1 from training. Duration: 0.936375 2023-10-02 11:46:12,570 WARNING [train.py:1204] (2/4) Exclude cut with ID _59493_210_17282_1_1535078419077_3225879_332-217544-0_sp1.1 from training. Duration: 0.97275 2023-10-02 11:46:13,970 WARNING [train.py:1204] (2/4) Exclude cut with ID _23514_210_1389_1_1531832139785_3031439_361-119640-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 11:46:15,448 WARNING [train.py:1204] (2/4) Exclude cut with ID _69584_210_5219_1_1535785133775_4044450_142-118058-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 11:46:15,467 WARNING [train.py:1204] (2/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_178-177582-0_sp1.1 from training. Duration: 0.67275 2023-10-02 11:46:16,725 WARNING [train.py:1204] (2/4) Exclude cut with ID _25303_210_17519_1_1532050207576_3635970_41-195572-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 11:46:19,997 WARNING [train.py:1204] (2/4) Exclude cut with ID _55328_210_27767_1_1534580973260_4088170_329-341322-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 11:46:21,293 WARNING [train.py:1204] (2/4) Exclude cut with ID _5608_210_2106_1_1527415439089_6819768_909-82767-0_sp1.1 from training. Duration: 0.5545625 2023-10-02 11:46:21,340 WARNING [train.py:1204] (2/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_261-297503-0_sp1.1 from training. Duration: 0.9 2023-10-02 11:46:21,346 WARNING [train.py:1204] (2/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_459-291706-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 11:46:21,346 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7762_210_1794_1_1528199508_4057408_422-227034-0 from training. Duration: 0.73 2023-10-02 11:46:21,355 WARNING [train.py:1204] (2/4) Exclude cut with ID _3067_210_2667_1_1526212974640_4503168_250-285937-0_sp0.9 from training. Duration: 0.888875 2023-10-02 11:46:21,368 WARNING [train.py:1204] (2/4) Exclude cut with ID _66873_210_5517_1_1535454136506_4293370_332-253745-0_sp1.1 from training. Duration: 0.936375 2023-10-02 11:46:26,982 WARNING [train.py:1204] (2/4) Exclude cut with ID _13938_210_9988_1_1530518718978_3693960_304-112784-0 from training. Duration: 0.46 2023-10-02 11:46:27,868 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.1.encoder.layers.1.conv_module1.whiten, num_groups=1, num_channels=256, metric=11.64 vs. limit=15.0 2023-10-02 11:46:28,300 WARNING [train.py:1204] (2/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_226-267957-0_sp1.1 from training. Duration: 0.92725 2023-10-02 11:46:28,345 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_41822_210_13270_1_1533955976_4379284_608-254567-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 11:46:29,705 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_124-113562-0_sp1.1 from training. Duration: 0.8545625 2023-10-02 11:46:29,734 WARNING [train.py:1204] (2/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_117-290916-0_sp1.1 from training. Duration: 0.463625 2023-10-02 11:46:31,093 WARNING [train.py:1204] (2/4) Exclude cut with ID _28427_210_4220_1_1532581240967_3061069_102-66533-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 11:46:33,031 WARNING [train.py:1204] (2/4) Exclude cut with ID _55383_210_10635_1_1534468426172_2231669_134-128246-0_sp1.1 from training. Duration: 0.8090625 2023-10-02 11:46:34,347 WARNING [train.py:1204] (2/4) Exclude cut with ID _45871_210_5665_1_1533864614245_3296060_49-308316-0_sp1.1 from training. Duration: 0.97275 2023-10-02 11:46:35,690 WARNING [train.py:1204] (2/4) Exclude cut with ID _57572_210_5517_1_1534921219624_3809484_176-199205-0_sp1.1 from training. Duration: 0.97275 2023-10-02 11:46:38,504 WARNING [train.py:1204] (2/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_45-265684-0_sp1.1 from training. Duration: 0.6545625 2023-10-02 11:46:40,277 WARNING [train.py:1204] (2/4) Exclude cut with ID _15150_210_8341_1_1530752411803_7256384_469-239509-0_sp0.9 from training. Duration: 0.7555625 2023-10-02 11:46:40,316 WARNING [train.py:1204] (2/4) Exclude cut with ID _28461_210_2253_1_1532771775189_3041299_136-270644-0_sp1.1 from training. Duration: 0.8454375 2023-10-02 11:46:42,214 WARNING [train.py:1204] (2/4) Exclude cut with ID _69584_210_5219_1_1535785133775_4044450_302-47302-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 11:46:43,640 WARNING [train.py:1204] (2/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_166-128941-0_sp1.1 from training. Duration: 0.4909375 2023-10-02 11:46:49,371 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.conv_module2.balancer1.prob, batch_count=868406.6666666666, ans=0.125 2023-10-02 11:46:50,587 WARNING [train.py:1204] (2/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_559-120504-0_sp1.1 from training. Duration: 0.97275 2023-10-02 11:46:52,549 WARNING [train.py:1204] (2/4) Exclude cut with ID _6456_210_1534_1_1527681634212_3181049_345-252877-0_sp1.1 from training. Duration: 0.5818125 2023-10-02 11:46:53,931 WARNING [train.py:1204] (2/4) Exclude cut with ID _28317_210_12742_1_1532480302381_8510325_902-233003-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 11:46:55,678 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.conv_module2.balancer1.prob, batch_count=868473.3333333334, ans=0.125 2023-10-02 11:46:56,700 WARNING [train.py:1204] (2/4) Exclude cut with ID _36538_210_9994_1_1533204020363_3795620_204-261385-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 11:46:56,704 WARNING [train.py:1204] (2/4) Exclude cut with ID _14821_210_8750_1_1530680503991_6829359_189-297451-0_sp1.1 from training. Duration: 0.5 2023-10-02 11:46:56,739 WARNING [train.py:1204] (2/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_1211-226066-0_sp0.9 from training. Duration: 0.8444375 2023-10-02 11:47:02,312 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6786_210_1388_1_1527839699_4194495_273-149841-0_sp1.1 from training. Duration: 0.836375 2023-10-02 11:47:02,347 WARNING [train.py:1204] (2/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_380-162648-0_sp1.1 from training. Duration: 0.8545625 2023-10-02 11:47:02,347 WARNING [train.py:1204] (2/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_494-199538-0 from training. Duration: 0.75 2023-10-02 11:47:06,916 WARNING [train.py:1204] (2/4) Exclude cut with ID _49097_210_16049_1_1533952780207_4360979_276-87053-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 11:47:08,917 WARNING [train.py:1204] (2/4) Exclude cut with ID _5444_210_2151_1_1527418808464_5312210_726-27165-0 from training. Duration: 0.67 2023-10-02 11:47:15,029 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_128-202104-0_sp1.1 from training. Duration: 0.57275 2023-10-02 11:47:17,807 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_11349_210_6753_1_1529800861_1101207_26-65602-0_sp1.1 from training. Duration: 0.87275 2023-10-02 11:47:17,829 WARNING [train.py:1204] (2/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_159-328157-0 from training. Duration: 0.52 2023-10-02 11:47:17,907 WARNING [train.py:1204] (2/4) Exclude cut with ID _14069_210_5632_1_1530512692033_4803390_98-21934-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 11:47:19,304 WARNING [train.py:1204] (2/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_352-59958-0_sp0.9 from training. Duration: 0.9555625 2023-10-02 11:47:19,347 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6163_210_6400_1_1527506746_4263068_283-137811-0 from training. Duration: 0.77 2023-10-02 11:47:19,373 WARNING [train.py:1204] (2/4) Exclude cut with ID _21989_210_5854_1_1531899214931_3271360_133-44759-0_sp1.1 from training. Duration: 0.7 2023-10-02 11:47:22,709 WARNING [train.py:1204] (2/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_21-174514-0_sp0.9 from training. Duration: 0.47775 2023-10-02 11:47:22,735 WARNING [train.py:1204] (2/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_201-78811-0_sp1.1 from training. Duration: 0.963625 2023-10-02 11:47:22,774 WARNING [train.py:1204] (2/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_537-164630-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 11:47:24,079 INFO [train.py:1046] (2/4) Epoch 25, batch 2800, loss[loss=0.1617, simple_loss=0.2491, pruned_loss=0.03714, over 24685.00 frames. ], tot_loss[loss=0.1704, simple_loss=0.2475, pruned_loss=0.04667, over 4714277.41 frames. ], batch size: 73, lr: 4.10e-03, grad_scale: 32.0 2023-10-02 11:47:24,185 WARNING [train.py:1204] (2/4) Exclude cut with ID _14087_210_2253_1_1530529224512_3014029_156-257607-0 from training. Duration: 0.83 2023-10-02 11:47:24,203 WARNING [train.py:1204] (2/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_308-154450-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 11:47:24,234 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_45399_210_4852_1_1533624725_7579737_214-240574-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 11:47:25,606 WARNING [train.py:1204] (2/4) Exclude cut with ID _29303_210_2575_1_1532392237303_7410649_615-305517-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 11:47:25,647 WARNING [train.py:1204] (2/4) Exclude cut with ID IC0125W0002-61808 from training. Duration: 0.708 2023-10-02 11:47:25,647 WARNING [train.py:1204] (2/4) Exclude cut with ID IC0125W0017-61823 from training. Duration: 0.759 2023-10-02 11:47:29,775 WARNING [train.py:1204] (2/4) Exclude cut with ID _53219_210_4525_1_1534834836008_4064825_4-145291-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 11:47:32,716 WARNING [train.py:1204] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_185-174805-0_sp1.1 from training. Duration: 0.6181875 2023-10-02 11:47:32,738 WARNING [train.py:1204] (2/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_635-193046-0_sp1.1 from training. Duration: 0.92725 2023-10-02 11:47:34,282 WARNING [train.py:1204] (2/4) Exclude cut with ID _46067_210_4220_1_1534125571066_3307450_321-258866-0_sp1.1 from training. Duration: 0.97275 2023-10-02 11:47:37,482 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8558_210_1410_1_1528505225_7613406_131-329524-0 from training. Duration: 0.79 2023-10-02 11:47:40,152 WARNING [train.py:1204] (2/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_847-41334-0_sp0.9 from training. Duration: 0.488875 2023-10-02 11:47:41,533 WARNING [train.py:1204] (2/4) Exclude cut with ID _14227_210_4381_1_1530701804286_3940259_125-275642-0 from training. Duration: 0.99 2023-10-02 11:47:42,233 WARNING [train.py:1204] (2/4) Exclude cut with ID _25248_210_8750_1_1532062825941_5554180_248-177255-0_sp1.1 from training. Duration: 0.963625 2023-10-02 11:47:43,350 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_40047_210_2290_1_1533290519_2374311_195-165693-0_sp1.1 from training. Duration: 0.8090625 2023-10-02 11:47:43,355 WARNING [train.py:1204] (2/4) Exclude cut with ID _23109_210_4381_1_1532150828777_901949_29-258672-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 11:47:47,486 WARNING [train.py:1204] (2/4) Exclude cut with ID _14821_210_8750_1_1530680503991_6829359_2-227151-0_sp1.1 from training. Duration: 0.7545625 2023-10-02 11:47:47,527 WARNING [train.py:1204] (2/4) Exclude cut with ID _49643_210_7989_1_1534640244224_3939031_565-53982-0_sp1.1 from training. Duration: 0.963625 2023-10-02 11:47:47,531 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7544_210_6753_1_1528591327_1471426_33-346031-0_sp0.9 from training. Duration: 0.67775 2023-10-02 11:47:48,808 WARNING [train.py:1204] (2/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_95-61782-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 11:47:56,290 WARNING [train.py:1204] (2/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_686-54383-0_sp0.9 from training. Duration: 0.9666875 2023-10-02 11:47:57,648 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_14056_210_4163_1_1530604483_7572827_505-276823-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 11:47:59,146 WARNING [train.py:1204] (2/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_745-128443-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 11:48:00,861 WARNING [train.py:1204] (2/4) Exclude cut with ID _3844_210_1390_1_1526727803582_2871209_20-251664-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 11:48:02,192 WARNING [train.py:1204] (2/4) Exclude cut with ID _36538_210_9994_1_1533204020363_3795620_487-164628-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 11:48:04,899 WARNING [train.py:1204] (2/4) Exclude cut with ID _5166_210_2084_1_1527073197160_4087993_309-222545-0_sp1.1 from training. Duration: 0.763625 2023-10-02 11:48:04,912 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_3531_210_3528_1_1526294203_5216456_309-227302-0 from training. Duration: 0.69 2023-10-02 11:48:06,344 WARNING [train.py:1204] (2/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_42-192460-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 11:48:08,148 WARNING [train.py:1204] (2/4) Exclude cut with ID _22083_210_8254_1_1531702862439_3678970_49-234546-0_sp1.1 from training. Duration: 0.7545625 2023-10-02 11:48:08,155 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_427-78097-0_sp0.9 from training. Duration: 0.888875 2023-10-02 11:48:09,358 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.592e+02 1.857e+02 2.100e+02 2.533e+02 3.672e+02, threshold=4.201e+02, percent-clipped=0.0 2023-10-02 11:48:11,677 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.ff3_skip_rate, batch_count=868806.6666666666, ans=0.0 2023-10-02 11:48:12,565 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_43802_210_2290_1_1533516203_8829504_378-205254-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 11:48:12,614 WARNING [train.py:1204] (2/4) Exclude cut with ID _45329_210_7005_1_1534157951211_6774339_672-314830-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 11:48:15,529 WARNING [train.py:1204] (2/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_90-30673-0_sp1.1 from training. Duration: 0.763625 2023-10-02 11:48:16,968 WARNING [train.py:1204] (2/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_563-329943-0_sp1.1 from training. Duration: 0.9 2023-10-02 11:48:18,284 WARNING [train.py:1204] (2/4) Exclude cut with ID _21632_210_2253_1_1531994001765_3744140_154-84090-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 11:48:18,285 WARNING [train.py:1204] (2/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_103-336064-0_sp0.9 from training. Duration: 0.5666875 2023-10-02 11:48:19,625 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_43-71989-0_sp0.9 from training. Duration: 0.6333125 2023-10-02 11:48:19,688 WARNING [train.py:1204] (2/4) Exclude cut with ID _3867_210_1883_1_1526641200575_4475830_319-349850-0_sp0.9 from training. Duration: 0.7444375 2023-10-02 11:48:21,033 WARNING [train.py:1204] (2/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_629-179454-0_sp1.1 from training. Duration: 0.963625 2023-10-02 11:48:21,041 WARNING [train.py:1204] (2/4) Exclude cut with ID _65263_210_22356_1_1535763598691_3921037_49-222917-0 from training. Duration: 0.92 2023-10-02 11:48:21,070 WARNING [train.py:1204] (2/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_602-331772-0_sp1.1 from training. Duration: 0.936375 2023-10-02 11:48:21,259 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.conv_module1.balancer2.min_abs, batch_count=868873.3333333334, ans=0.5 2023-10-02 11:48:21,271 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.feed_forward1.out_proj.dropout_p, batch_count=868873.3333333334, ans=0.1 2023-10-02 11:48:22,477 WARNING [train.py:1204] (2/4) Exclude cut with ID _58414_210_8254_1_1535421609162_7151182_292-134978-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 11:48:22,525 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_38793_210_2544_1_1533176114_3849320_200-299292-0_sp1.1 from training. Duration: 0.936375 2023-10-02 11:48:23,928 WARNING [train.py:1204] (2/4) Exclude cut with ID _14970_210_15709_1_1530775547361_1966965_6-23328-0 from training. Duration: 0.86 2023-10-02 11:48:25,863 WARNING [train.py:1204] (2/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_386-225511-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 11:48:25,877 WARNING [train.py:1204] (2/4) Exclude cut with ID _38323_210_12108_1_1533121233369_3303049_416-11919-0_sp1.1 from training. Duration: 0.87275 2023-10-02 11:48:25,915 WARNING [train.py:1204] (2/4) Exclude cut with ID _4866_210_2577_1_1527404412284_4364659_256-306830-0_sp1.1 from training. Duration: 0.7909375 2023-10-02 11:48:27,323 WARNING [train.py:1204] (2/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_53-300544-0 from training. Duration: 0.67 2023-10-02 11:48:29,504 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.1.conv_module1.balancer1.prob, batch_count=868873.3333333334, ans=0.125 2023-10-02 11:48:33,440 WARNING [train.py:1204] (2/4) Exclude cut with ID _41314_210_12651_1_1533459476543_3766460_80-74184-0_sp1.1 from training. Duration: 0.7545625 2023-10-02 11:48:33,456 WARNING [train.py:1204] (2/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_332-256168-0_sp1.1 from training. Duration: 0.6818125 2023-10-02 11:48:34,128 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.2.encoder.layers.1.feed_forward1.out_whiten, num_groups=1, num_channels=384, metric=12.72 vs. limit=15.0 2023-10-02 11:48:35,320 WARNING [train.py:1204] (2/4) Exclude cut with ID _48836_210_12210_1_1534075848293_3904150_429-35475-0_sp1.1 from training. Duration: 0.8090625 2023-10-02 11:48:36,748 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_144-135833-0_sp1.1 from training. Duration: 0.97275 2023-10-02 11:48:38,043 INFO [train.py:1046] (2/4) Epoch 25, batch 2850, loss[loss=0.1723, simple_loss=0.2402, pruned_loss=0.05214, over 23675.00 frames. ], tot_loss[loss=0.1694, simple_loss=0.2461, pruned_loss=0.0464, over 4691079.04 frames. ], batch size: 232, lr: 4.10e-03, grad_scale: 8.0 2023-10-02 11:48:41,086 WARNING [train.py:1204] (2/4) Exclude cut with ID _14224_210_4381_1_1530529062037_4264010_380-343670-0_sp0.9 from training. Duration: 0.97775 2023-10-02 11:48:41,109 WARNING [train.py:1204] (2/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_213-265174-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 11:48:42,811 WARNING [train.py:1204] (2/4) Exclude cut with ID _20455_210_5219_1_1531565996775_8098100_86-314283-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 11:48:44,348 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_41750_210_1960_1_1533520678_3948042_68-223813-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 11:48:44,564 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.self_attn_weights.pos_emb_skip_rate, batch_count=868940.0, ans=0.0 2023-10-02 11:48:45,796 WARNING [train.py:1204] (2/4) Exclude cut with ID _55153_210_2253_1_1534384786677_3162096_24-198429-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 11:48:47,128 WARNING [train.py:1204] (2/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_119-207162-0_sp0.9 from training. Duration: 0.788875 2023-10-02 11:48:47,182 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_12868_210_6753_1_1530702479_3610564_148-172861-0 from training. Duration: 0.5 2023-10-02 11:48:48,020 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.1.encoder.layers.0.self_attn2.whiten, num_groups=1, num_channels=256, metric=10.61 vs. limit=22.5 2023-10-02 11:48:49,236 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.3.encoder.layers.2.nonlin_attention.whiten2, num_groups=1, num_channels=512, metric=6.15 vs. limit=15.0 2023-10-02 11:48:52,766 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_5522_210_3273_1_1527249606_3909096_318-203153-0 from training. Duration: 0.43 2023-10-02 11:48:52,779 WARNING [train.py:1204] (2/4) Exclude cut with ID _14884_210_2252_1_1530707459689_5561250_288-189949-0_sp1.1 from training. Duration: 0.97275 2023-10-02 11:48:54,183 WARNING [train.py:1204] (2/4) Exclude cut with ID _5294_210_1747_1_1527249643928_3696179_529-242298-0 from training. Duration: 0.95 2023-10-02 11:48:55,501 WARNING [train.py:1204] (2/4) Exclude cut with ID _36538_210_9994_1_1533204020363_3795620_503-110289-0_sp1.1 from training. Duration: 0.936375 2023-10-02 11:48:58,783 WARNING [train.py:1204] (2/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_385-193052-0 from training. Duration: 0.86 2023-10-02 11:48:58,843 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_456-22141-0 from training. Duration: 0.88 2023-10-02 11:49:01,520 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_38965_210_4852_1_1533174906_3484735_284-117840-0_sp1.1 from training. Duration: 0.936375 2023-10-02 11:49:11,713 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.2.encoder.layers.1.feed_forward3.out_whiten, num_groups=1, num_channels=384, metric=11.33 vs. limit=15.0 2023-10-02 11:49:14,574 WARNING [train.py:1204] (2/4) Exclude cut with ID _32704_210_8954_1_1533017488840_6747370_75-201625-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 11:49:15,947 WARNING [train.py:1204] (2/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_255-312654-0_sp1.1 from training. Duration: 0.82725 2023-10-02 11:49:15,956 WARNING [train.py:1204] (2/4) Exclude cut with ID _47251_210_18628_1_1533814063128_4137250_214-88076-0_sp0.9 from training. Duration: 0.97775 2023-10-02 11:49:17,349 WARNING [train.py:1204] (2/4) Exclude cut with ID _4092_210_2151_1_1526643044449_3135759_227-213972-0_sp1.1 from training. Duration: 0.4909375 2023-10-02 11:49:17,364 WARNING [train.py:1204] (2/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_912-47473-0_sp1.1 from training. Duration: 0.6181875 2023-10-02 11:49:17,383 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6767_210_3273_1_1527855783_1718932_173-256601-0_sp1.1 from training. Duration: 0.72725 2023-10-02 11:49:20,008 WARNING [train.py:1204] (2/4) Exclude cut with ID _6274_210_2084_1_1527937583837_7494020_32-205288-0_sp1.1 from training. Duration: 0.7090625 2023-10-02 11:49:20,039 WARNING [train.py:1204] (2/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_39-248870-0 from training. Duration: 0.69 2023-10-02 11:49:21,381 WARNING [train.py:1204] (2/4) Exclude cut with ID _3541_210_2477_1_1526475408709_3263973_377-296839-0_sp1.1 from training. Duration: 0.5 2023-10-02 11:49:21,396 WARNING [train.py:1204] (2/4) Exclude cut with ID _13679_210_7497_1_1530534293662_3355098_413-56154-0_sp1.1 from training. Duration: 0.963625 2023-10-02 11:49:22,694 WARNING [train.py:1204] (2/4) Exclude cut with ID _14970_210_15709_1_1530775547361_1966965_201-84544-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 11:49:22,743 WARNING [train.py:1204] (2/4) Exclude cut with ID _21783_210_11357_1_1531742458730_3762679_105-95775-0_sp1.1 from training. Duration: 0.936375 2023-10-02 11:49:25,534 WARNING [train.py:1204] (2/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_131-191669-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 11:49:25,568 WARNING [train.py:1204] (2/4) Exclude cut with ID _14860_210_4915_1_1530702020340_7343240_695-4323-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 11:49:25,659 WARNING [train.py:1204] (2/4) Exclude cut with ID _39867_210_9826_1_1533553222030_9344259_991-130565-0_sp1.1 from training. Duration: 0.97275 2023-10-02 11:49:27,246 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_50-3289-0_sp1.1 from training. Duration: 0.82725 2023-10-02 11:49:29,209 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_54-347670-0_sp1.1 from training. Duration: 0.92725 2023-10-02 11:49:29,612 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.5.encoder.layers.1.feed_forward2.out_whiten, num_groups=1, num_channels=256, metric=10.12 vs. limit=15.0 2023-10-02 11:49:30,632 WARNING [train.py:1204] (2/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_200-153445-0_sp1.1 from training. Duration: 0.936375 2023-10-02 11:49:30,700 WARNING [train.py:1204] (2/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_279-302911-0_sp1.1 from training. Duration: 0.97275 2023-10-02 11:49:32,112 WARNING [train.py:1204] (2/4) Exclude cut with ID _14886_210_4220_1_1530702020355_3516190_159-73748-0_sp0.9 from training. Duration: 0.92225 2023-10-02 11:49:38,374 WARNING [train.py:1204] (2/4) Exclude cut with ID _53379_210_20034_1_1534222027363_3854654_23-222715-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 11:49:39,829 WARNING [train.py:1204] (2/4) Exclude cut with ID _12555_210_8750_1_1530075594402_3780239_449-267986-0 from training. Duration: 0.83 2023-10-02 11:49:39,847 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7544_210_6753_1_1528587722_3576686_295-258479-0 from training. Duration: 0.84 2023-10-02 11:49:41,270 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7002_210_1794_1_1527935634_4034444_487-236501-0_sp0.9 from training. Duration: 0.7333125 2023-10-02 11:49:43,191 WARNING [train.py:1204] (2/4) Exclude cut with ID _21783_210_11357_1_1531742458730_3762679_244-19107-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 11:49:43,205 WARNING [train.py:1204] (2/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_390-339137-0 from training. Duration: 0.56 2023-10-02 11:49:44,494 WARNING [train.py:1204] (2/4) Exclude cut with ID _7562_210_2084_1_1528887767916_3946229_186-326422-0_sp1.1 from training. Duration: 0.763625 2023-10-02 11:49:44,563 WARNING [train.py:1204] (2/4) Exclude cut with ID _33640_210_2655_1_1533117605479_4855469_645-145235-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 11:49:44,578 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_25767_210_2161_1_1532307483_3867748_506-171777-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 11:49:44,596 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_5438_210_1794_1_1527336687_2600009_233-58072-0_sp1.1 from training. Duration: 0.7 2023-10-02 11:49:44,597 WARNING [train.py:1204] (2/4) Exclude cut with ID IC0669W0063-330908 from training. Duration: 0.896 2023-10-02 11:49:45,962 WARNING [train.py:1204] (2/4) Exclude cut with ID IC0671W0070-331913 from training. Duration: 0.896 2023-10-02 11:49:45,965 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_258-310570-0_sp1.1 from training. Duration: 0.7454375 2023-10-02 11:49:46,018 WARNING [train.py:1204] (2/4) Exclude cut with ID _9461_210_4557_1_1529060514726_3677451_162-177507-0_sp1.1 from training. Duration: 0.97275 2023-10-02 11:49:51,409 WARNING [train.py:1204] (2/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_26-175553-0_sp0.9 from training. Duration: 0.67775 2023-10-02 11:49:51,424 WARNING [train.py:1204] (2/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_7-150481-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 11:49:52,692 INFO [train.py:1046] (2/4) Epoch 25, batch 2900, loss[loss=0.2123, simple_loss=0.267, pruned_loss=0.07882, over 19399.00 frames. ], tot_loss[loss=0.1693, simple_loss=0.2458, pruned_loss=0.04643, over 4696104.21 frames. ], batch size: 388, lr: 4.10e-03, grad_scale: 8.0 2023-10-02 11:49:52,755 WARNING [train.py:1204] (2/4) Exclude cut with ID _23557_210_8750_1_1531890106370_6846255_72-273262-0_sp1.1 from training. Duration: 0.963625 2023-10-02 11:49:52,851 WARNING [train.py:1204] (2/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_367-61330-0 from training. Duration: 0.75 2023-10-02 11:49:57,094 WARNING [train.py:1204] (2/4) Exclude cut with ID _39867_210_9826_1_1533553222030_9344259_1337-11141-0_sp1.1 from training. Duration: 0.936375 2023-10-02 11:49:57,115 WARNING [train.py:1204] (2/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_101-293367-0 from training. Duration: 0.63 2023-10-02 11:49:58,452 WARNING [train.py:1204] (2/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_10-100367-0 from training. Duration: 0.98 2023-10-02 11:49:59,785 WARNING [train.py:1204] (2/4) Exclude cut with ID _60057_210_10640_1_1534903065793_3333150_171-181133-0_sp1.1 from training. Duration: 0.8 2023-10-02 11:49:59,787 WARNING [train.py:1204] (2/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_844-96989-0_sp0.9 from training. Duration: 0.788875 2023-10-02 11:50:01,752 WARNING [train.py:1204] (2/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_301-144595-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 11:50:04,978 WARNING [train.py:1204] (2/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_115-147343-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 11:50:08,099 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_3171_210_2087_1_1526109084_2330007_37-64334-0_sp1.1 from training. Duration: 0.7454375 2023-10-02 11:50:08,156 WARNING [train.py:1204] (2/4) Exclude cut with ID _19514_210_9259_1_1531530071330_5084230_769-272914-0_sp1.1 from training. Duration: 0.936375 2023-10-02 11:50:11,398 WARNING [train.py:1204] (2/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_76-311963-0_sp0.9 from training. Duration: 0.77775 2023-10-02 11:50:12,715 WARNING [train.py:1204] (2/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_526-62079-0 from training. Duration: 0.67 2023-10-02 11:50:12,762 WARNING [train.py:1204] (2/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_7-111954-0_sp0.9 from training. Duration: 0.62225 2023-10-02 11:50:14,267 WARNING [train.py:1204] (2/4) Exclude cut with ID _57997_210_4163_1_1534766425889_3961049_360-279140-0_sp1.1 from training. Duration: 0.97275 2023-10-02 11:50:16,181 WARNING [train.py:1204] (2/4) Exclude cut with ID _4866_210_2577_1_1527404412284_4364659_133-1696-0 from training. Duration: 0.99 2023-10-02 11:50:17,457 WARNING [train.py:1204] (2/4) Exclude cut with ID _5190_210_4557_1_1527244368799_373439_28-197030-0 from training. Duration: 0.72 2023-10-02 11:50:18,932 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_53-39003-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 11:50:18,935 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6673_210_3273_1_1527767790_3173240_351-167954-0 from training. Duration: 0.48 2023-10-02 11:50:20,216 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_2822_210_2715_1_1526112586_1999587_165-36656-0_sp1.1 from training. Duration: 0.8090625 2023-10-02 11:50:21,553 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_328-264892-0_sp1.1 from training. Duration: 0.92725 2023-10-02 11:50:21,558 WARNING [train.py:1204] (2/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_29-293896-0_sp0.9 from training. Duration: 0.67775 2023-10-02 11:50:23,036 WARNING [train.py:1204] (2/4) Exclude cut with ID _65263_210_22356_1_1535763598691_3921037_465-55448-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 11:50:23,111 WARNING [train.py:1204] (2/4) Exclude cut with ID _21989_210_5854_1_1531899214931_3271360_311-53106-0_sp1.1 from training. Duration: 0.97275 2023-10-02 11:50:25,910 WARNING [train.py:1204] (2/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_389-277915-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 11:50:27,395 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_69630_210_7234_1_1535791094_677837_63-277139-0_sp1.1 from training. Duration: 0.963625 2023-10-02 11:50:28,753 WARNING [train.py:1204] (2/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_85-138257-0 from training. Duration: 0.71 2023-10-02 11:50:30,538 WARNING [train.py:1204] (2/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_686-247600-0 from training. Duration: 0.92 2023-10-02 11:50:30,543 WARNING [train.py:1204] (2/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_108-255430-0_sp1.1 from training. Duration: 0.8818125 2023-10-02 11:50:33,879 WARNING [train.py:1204] (2/4) Exclude cut with ID _14884_210_2252_1_1530707459689_5561250_114-201399-0_sp1.1 from training. Duration: 0.6909375 2023-10-02 11:50:36,680 WARNING [train.py:1204] (2/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_506-42614-0 from training. Duration: 0.79 2023-10-02 11:50:37,963 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_43802_210_2290_1_1533516203_8829504_85-195240-0_sp1.1 from training. Duration: 0.7909375 2023-10-02 11:50:40,587 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.525e+02 1.913e+02 2.084e+02 2.277e+02 3.213e+02, threshold=4.167e+02, percent-clipped=0.0 2023-10-02 11:50:42,140 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.balancer2.prob, batch_count=869473.3333333334, ans=0.125 2023-10-02 11:50:43,980 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_141-36243-0_sp1.1 from training. Duration: 0.97275 2023-10-02 11:50:50,791 WARNING [train.py:1204] (2/4) Exclude cut with ID _7852_210_5219_1_1528286471316_8139400_358-287492-0_sp1.1 from training. Duration: 0.87275 2023-10-02 11:50:50,808 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_805-71497-0_sp0.9 from training. Duration: 0.788875 2023-10-02 11:50:52,193 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.nonlin_attention.balancer.prob, batch_count=869540.0, ans=0.125 2023-10-02 11:50:53,296 WARNING [train.py:1204] (2/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_178-39722-0 from training. Duration: 0.49 2023-10-02 11:50:56,052 WARNING [train.py:1204] (2/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_428-181925-0_sp1.1 from training. Duration: 0.963625 2023-10-02 11:50:56,065 WARNING [train.py:1204] (2/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_770-200430-0 from training. Duration: 0.89 2023-10-02 11:50:56,106 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_1104-271009-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 11:50:56,150 WARNING [train.py:1204] (2/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_128-296581-0_sp0.9 from training. Duration: 0.82225 2023-10-02 11:50:56,292 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.0.feed_forward3.hidden_balancer.prob, batch_count=869540.0, ans=0.125 2023-10-02 11:51:03,946 WARNING [train.py:1204] (2/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_19-62511-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 11:51:05,784 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_805-71497-0 from training. Duration: 0.71 2023-10-02 11:51:07,021 INFO [train.py:1046] (2/4) Epoch 25, batch 2950, loss[loss=0.1802, simple_loss=0.2512, pruned_loss=0.0546, over 23486.00 frames. ], tot_loss[loss=0.1696, simple_loss=0.2466, pruned_loss=0.04629, over 4705196.74 frames. ], batch size: 285, lr: 4.10e-03, grad_scale: 4.0 2023-10-02 11:51:07,102 WARNING [train.py:1204] (2/4) Exclude cut with ID _44778_210_2054_1_1533798127203_7443090_573-152077-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 11:51:07,107 WARNING [train.py:1204] (2/4) Exclude cut with ID _42875_210_14856_1_1533704397487_4012433_393-177816-0_sp1.1 from training. Duration: 0.963625 2023-10-02 11:51:08,412 WARNING [train.py:1204] (2/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_278-35159-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 11:51:08,506 WARNING [train.py:1204] (2/4) Exclude cut with ID _15037_210_2253_1_1530752294104_3267650_13-130361-0_sp1.1 from training. Duration: 0.9 2023-10-02 11:51:11,210 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_3019_210_2028_1_1526044245_3264865_127-135615-0 from training. Duration: 0.94 2023-10-02 11:51:11,266 WARNING [train.py:1204] (2/4) Exclude cut with ID _28317_210_12742_1_1532480302381_8510325_607-213951-0 from training. Duration: 0.84 2023-10-02 11:51:12,561 WARNING [train.py:1204] (2/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_436-318113-0_sp0.9 from training. Duration: 0.8333125 2023-10-02 11:51:12,562 WARNING [train.py:1204] (2/4) Exclude cut with ID _13938_210_9988_1_1530518718978_3693960_148-152225-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 11:51:18,426 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_2541_210_1794_1_1525590269_3084765_156-173713-0_sp1.1 from training. Duration: 0.736375 2023-10-02 11:51:21,108 WARNING [train.py:1204] (2/4) Exclude cut with ID _28323_210_16187_1_1532480395667_4100300_531-24157-0_sp1.1 from training. Duration: 0.8909375 2023-10-02 11:51:22,480 WARNING [train.py:1204] (2/4) Exclude cut with ID _29303_210_2575_1_1532392237303_7410649_382-217492-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 11:51:22,511 WARNING [train.py:1204] (2/4) Exclude cut with ID _58895_210_8750_1_1535184081320_3649089_270-158013-0_sp1.1 from training. Duration: 0.92725 2023-10-02 11:51:25,266 WARNING [train.py:1204] (2/4) Exclude cut with ID _29277_210_3279_1_1532581109293_3173069_311-206227-0_sp1.1 from training. Duration: 0.936375 2023-10-02 11:51:25,279 WARNING [train.py:1204] (2/4) Exclude cut with ID _7160_210_1876_1_1527940923231_3703799_282-322611-0_sp0.9 from training. Duration: 0.9666875 2023-10-02 11:51:26,892 WARNING [train.py:1204] (2/4) Exclude cut with ID _53219_210_4525_1_1534834836008_4064825_562-32295-0_sp1.1 from training. Duration: 0.963625 2023-10-02 11:51:28,149 WARNING [train.py:1204] (2/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_408-87330-0_sp1.1 from training. Duration: 0.963625 2023-10-02 11:51:28,156 WARNING [train.py:1204] (2/4) Exclude cut with ID _59202_210_26984_1_1534816810612_3549174_137-318710-0_sp1.1 from training. Duration: 0.8090625 2023-10-02 11:51:28,438 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.ff2_skip_rate, batch_count=869673.3333333334, ans=0.0 2023-10-02 11:51:31,370 WARNING [train.py:1204] (2/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_372-30302-0 from training. Duration: 0.86 2023-10-02 11:51:37,281 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7543_210_6753_1_1528541907_1399322_178-56269-0_sp1.1 from training. Duration: 0.95275 2023-10-02 11:51:37,299 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9109W0012-549758 from training. Duration: 0.512 2023-10-02 11:51:37,360 WARNING [train.py:1204] (2/4) Exclude cut with ID _5410_210_1388_1_1527397055795_1719020_39-112221-0_sp0.9 from training. Duration: 0.7666875 2023-10-02 11:51:39,340 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9121W0012-554933 from training. Duration: 0.896 2023-10-02 11:51:40,703 WARNING [train.py:1204] (2/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_476-272757-0 from training. Duration: 0.93 2023-10-02 11:51:40,725 WARNING [train.py:1204] (2/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_1085-171718-0_sp1.1 from training. Duration: 0.92725 2023-10-02 11:51:42,083 WARNING [train.py:1204] (2/4) Exclude cut with ID _8480_210_1874_1_1528589391205_6728330_309-139896-0_sp1.1 from training. Duration: 0.736375 2023-10-02 11:51:42,084 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9130W0113-559703 from training. Duration: 0.896 2023-10-02 11:51:42,088 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_292-265928-0_sp1.1 from training. Duration: 0.463625 2023-10-02 11:51:42,340 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.balancer2.prob, batch_count=869740.0, ans=0.125 2023-10-02 11:51:45,307 WARNING [train.py:1204] (2/4) Exclude cut with ID _8772_210_1850_1_1528635595972_3664011_96-12756-0 from training. Duration: 0.98 2023-10-02 11:51:46,645 WARNING [train.py:1204] (2/4) Exclude cut with ID _12556_210_8341_1_1530081024100_7327690_725-193477-0_sp1.1 from training. Duration: 0.8909375 2023-10-02 11:51:46,690 WARNING [train.py:1204] (2/4) Exclude cut with ID _5289_210_2084_1_1527678202736_3779299_142-103317-0_sp1.1 from training. Duration: 0.763625 2023-10-02 11:51:48,549 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.5.encoder.layers.0.conv_module2.whiten, num_groups=1, num_channels=256, metric=4.62 vs. limit=15.0 2023-10-02 11:51:49,377 WARNING [train.py:1204] (2/4) Exclude cut with ID _45012_210_12651_1_1533608435136_5371380_206-51075-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 11:51:49,482 WARNING [train.py:1204] (2/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_176-56120-0_sp1.1 from training. Duration: 0.7545625 2023-10-02 11:51:50,797 WARNING [train.py:1204] (2/4) Exclude cut with ID _40284_210_2084_1_1533513679814_3704279_220-201864-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 11:51:50,834 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9165W0004-576638 from training. Duration: 0.896 2023-10-02 11:51:52,129 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_5438_210_1794_1_1527336687_2600009_218-343336-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 11:51:52,163 WARNING [train.py:1204] (2/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_318-162017-0 from training. Duration: 0.91 2023-10-02 11:51:57,589 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_49544_210_1794_1_1534379770_5262083_483-349298-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 11:51:58,931 WARNING [train.py:1204] (2/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_197-28164-0_sp0.9 from training. Duration: 0.888875 2023-10-02 11:52:00,262 WARNING [train.py:1204] (2/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_643-52007-0 from training. Duration: 0.57 2023-10-02 11:52:00,279 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_38965_210_4852_1_1533174906_3484735_403-332963-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 11:52:01,678 WARNING [train.py:1204] (2/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_340-174176-0 from training. Duration: 0.49 2023-10-02 11:52:04,515 WARNING [train.py:1204] (2/4) Exclude cut with ID _56708_210_13177_1_1535252351282_3422899_363-917-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 11:52:07,842 WARNING [train.py:1204] (2/4) Exclude cut with ID _19896_210_1883_1_1531709022662_6649569_525-159063-0_sp1.1 from training. Duration: 0.936375 2023-10-02 11:52:07,892 WARNING [train.py:1204] (2/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_430-116139-0_sp0.9 from training. Duration: 0.9444375 2023-10-02 11:52:09,245 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_53753_210_7234_1_1534320003_3594148_86-329250-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 11:52:09,254 WARNING [train.py:1204] (2/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_406-275649-0_sp1.1 from training. Duration: 0.3818125 2023-10-02 11:52:12,498 WARNING [train.py:1204] (2/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_1083-147536-0_sp1.1 from training. Duration: 0.8818125 2023-10-02 11:52:12,568 WARNING [train.py:1204] (2/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_54-311741-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 11:52:12,572 WARNING [train.py:1204] (2/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_237-58433-0_sp0.9 from training. Duration: 0.82225 2023-10-02 11:52:14,423 WARNING [train.py:1204] (2/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_98-51676-0_sp1.1 from training. Duration: 0.663625 2023-10-02 11:52:15,722 WARNING [train.py:1204] (2/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_386-270472-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 11:52:15,807 WARNING [train.py:1204] (2/4) Exclude cut with ID _19023_210_4353_1_1531378892552_5820780_83-254794-0_sp1.1 from training. Duration: 0.7818125 2023-10-02 11:52:17,792 WARNING [train.py:1204] (2/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_314-93656-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 11:52:17,810 WARNING [train.py:1204] (2/4) Exclude cut with ID _14170_210_5219_1_1530525949868_4469650_336-213530-0 from training. Duration: 0.9 2023-10-02 11:52:19,296 WARNING [train.py:1204] (2/4) Exclude cut with ID _36686_210_5665_1_1533778225913_3646410_65-221120-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 11:52:20,700 INFO [train.py:1046] (2/4) Epoch 25, batch 3000, loss[loss=0.2246, simple_loss=0.2845, pruned_loss=0.08236, over 19249.00 frames. ], tot_loss[loss=0.1705, simple_loss=0.2473, pruned_loss=0.04685, over 4696644.18 frames. ], batch size: 388, lr: 4.09e-03, grad_scale: 8.0 2023-10-02 11:52:20,700 INFO [train.py:1069] (2/4) Computing validation loss 2023-10-02 11:52:34,336 INFO [train.py:1078] (2/4) Epoch 25, validation: loss=0.328, simple_loss=0.2751, pruned_loss=0.1905, over 1125622.00 frames. 2023-10-02 11:52:34,337 INFO [train.py:1079] (2/4) Maximum memory allocated so far is 20967MB 2023-10-02 11:52:34,466 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_26433_210_12108_1_1532157158_4056480_271-68178-0_sp1.1 from training. Duration: 0.92725 2023-10-02 11:52:35,773 WARNING [train.py:1204] (2/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_46-207599-0_sp0.9 from training. Duration: 0.711125 2023-10-02 11:52:38,926 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9303W0114-637133 from training. Duration: 0.896 2023-10-02 11:52:38,969 WARNING [train.py:1204] (2/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_490-207861-0 from training. Duration: 0.98 2023-10-02 11:52:40,423 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6767_210_3273_1_1527855783_1718932_173-256601-0_sp0.9 from training. Duration: 0.888875 2023-10-02 11:52:42,169 WARNING [train.py:1204] (2/4) Exclude cut with ID _13921_210_9994_1_1530841782272_3503520_327-69677-0_sp1.1 from training. Duration: 0.8181875 2023-10-02 11:52:42,231 WARNING [train.py:1204] (2/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_508-335369-0 from training. Duration: 0.48 2023-10-02 11:52:42,260 WARNING [train.py:1204] (2/4) Exclude cut with ID _64631_210_27767_1_1535349580068_4165160_24-46567-0_sp1.1 from training. Duration: 0.963625 2023-10-02 11:52:47,590 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.conv_skip_rate, batch_count=869940.0, ans=0.0 2023-10-02 11:52:49,996 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_89-35748-0_sp1.1 from training. Duration: 0.7181875 2023-10-02 11:52:57,089 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_26433_210_12108_1_1532157158_4056480_578-262107-0_sp1.1 from training. Duration: 0.9 2023-10-02 11:53:00,199 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.1.bypass.scale_min, batch_count=870006.6666666666, ans=0.2 2023-10-02 11:53:02,766 WARNING [train.py:1204] (2/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_129-337802-0 from training. Duration: 0.72 2023-10-02 11:53:04,221 WARNING [train.py:1204] (2/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_356-167816-0_sp0.9 from training. Duration: 0.8 2023-10-02 11:53:07,415 WARNING [train.py:1204] (2/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_137-116442-0_sp1.1 from training. Duration: 0.7090625 2023-10-02 11:53:08,743 WARNING [train.py:1204] (2/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_358-172940-0_sp1.1 from training. Duration: 0.963625 2023-10-02 11:53:08,778 WARNING [train.py:1204] (2/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_668-344147-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 11:53:11,448 WARNING [train.py:1204] (2/4) Exclude cut with ID _62337_210_12392_1_1535009273105_3412229_255-157667-0_sp1.1 from training. Duration: 0.936375 2023-10-02 11:53:11,451 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_486-151131-0 from training. Duration: 0.85 2023-10-02 11:53:11,671 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.0.ff3_skip_rate, batch_count=870073.3333333334, ans=0.0 2023-10-02 11:53:11,759 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.attention_skip_rate, batch_count=870073.3333333334, ans=0.0 2023-10-02 11:53:13,463 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_53638_210_21581_1_1534297319_5808449_885-80172-0 from training. Duration: 0.96 2023-10-02 11:53:15,445 WARNING [train.py:1204] (2/4) Exclude cut with ID _50093_210_4220_1_1534417141207_3432299_105-183067-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 11:53:15,500 WARNING [train.py:1204] (2/4) Exclude cut with ID _7562_210_2084_1_1528887767916_3946229_345-169156-0_sp0.9 from training. Duration: 0.6444375 2023-10-02 11:53:16,959 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_4528_210_3273_1_1527076654_3276777_209-219804-0_sp0.9 from training. Duration: 0.7666875 2023-10-02 11:53:18,285 WARNING [train.py:1204] (2/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_35-153886-0_sp0.9 from training. Duration: 0.9333125 2023-10-02 11:53:18,339 WARNING [train.py:1204] (2/4) Exclude cut with ID _30800_210_22382_1_1532499347904_4515959_332-121925-0_sp1.1 from training. Duration: 0.92725 2023-10-02 11:53:18,340 WARNING [train.py:1204] (2/4) Exclude cut with ID _59202_210_26984_1_1534816810612_3549174_495-65515-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 11:53:21,282 WARNING [train.py:1204] (2/4) Exclude cut with ID _6274_210_2084_1_1527937583837_7494020_769-168302-0_sp1.1 from training. Duration: 0.6454375 2023-10-02 11:53:21,326 WARNING [train.py:1204] (2/4) Exclude cut with ID _24271_210_12658_1_1531987270022_7190519_708-71345-0_sp1.1 from training. Duration: 0.936375 2023-10-02 11:53:21,326 WARNING [train.py:1204] (2/4) Exclude cut with ID 3033-130750-0096-55598-0_sp0.9 from training. Duration: 0.92225 2023-10-02 11:53:23,882 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.496e+02 1.799e+02 2.052e+02 2.353e+02 3.864e+02, threshold=4.104e+02, percent-clipped=0.0 2023-10-02 11:53:24,048 WARNING [train.py:1204] (2/4) Exclude cut with ID _14087_210_2253_1_1530529224512_3014029_219-224445-0_sp0.9 from training. Duration: 0.9333125 2023-10-02 11:53:27,955 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_174-277364-0 from training. Duration: 0.71 2023-10-02 11:53:29,340 WARNING [train.py:1204] (2/4) Exclude cut with ID _45021_210_9994_1_1533619751094_3330288_96-239010-0_sp1.1 from training. Duration: 0.82725 2023-10-02 11:53:29,361 WARNING [train.py:1204] (2/4) Exclude cut with ID _31602_210_5854_1_1532677021456_4458250_489-305116-0_sp1.1 from training. Duration: 0.97275 2023-10-02 11:53:29,378 WARNING [train.py:1204] (2/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_269-154726-0_sp0.9 from training. Duration: 0.9555625 2023-10-02 11:53:31,321 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.5.encoder.layers.1.self_attn2.whiten, num_groups=1, num_channels=256, metric=11.28 vs. limit=22.5 2023-10-02 11:53:33,527 WARNING [train.py:1204] (2/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_126-54418-0_sp1.1 from training. Duration: 0.92725 2023-10-02 11:53:33,551 WARNING [train.py:1204] (2/4) Exclude cut with ID _42326_210_7770_1_1533814282531_3707269_182-224159-0_sp1.1 from training. Duration: 0.92725 2023-10-02 11:53:33,650 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7002_210_1794_1_1527939683_5157286_1-281264-0_sp0.9 from training. Duration: 0.488875 2023-10-02 11:53:33,683 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_86-16881-0 from training. Duration: 0.58 2023-10-02 11:53:33,931 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.ff2_skip_rate, batch_count=870206.6666666666, ans=0.0 2023-10-02 11:53:35,045 WARNING [train.py:1204] (2/4) Exclude cut with ID _19514_210_9259_1_1531530071330_5084230_637-279094-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 11:53:35,075 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_26433_210_12108_1_1532157158_4056480_578-262107-0 from training. Duration: 0.99 2023-10-02 11:53:35,117 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6291_210_3273_1_1527683649_1078498_82-221196-0_sp1.1 from training. Duration: 0.6909375 2023-10-02 11:53:37,113 WARNING [train.py:1204] (2/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_402-102311-0 from training. Duration: 0.67 2023-10-02 11:53:39,833 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1015-343884-0_sp1.1 from training. Duration: 0.563625 2023-10-02 11:53:41,205 WARNING [train.py:1204] (2/4) Exclude cut with ID _13684_210_7497_1_1530619354863_3598359_312-235606-0_sp1.1 from training. Duration: 0.5454375 2023-10-02 11:53:41,234 WARNING [train.py:1204] (2/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_55-31904-0 from training. Duration: 0.88 2023-10-02 11:53:43,131 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_3133_210_2087_1_1526106446_2324690_25-332138-0 from training. Duration: 0.91 2023-10-02 11:53:43,133 WARNING [train.py:1204] (2/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_912-282412-0_sp0.9 from training. Duration: 0.5444375 2023-10-02 11:53:45,168 WARNING [train.py:1204] (2/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_458-299435-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 11:53:46,352 WARNING [train.py:1204] (2/4) Exclude cut with ID _69584_210_5219_1_1535785133775_4044450_275-88403-0_sp1.1 from training. Duration: 0.92725 2023-10-02 11:53:46,368 WARNING [train.py:1204] (2/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_841-341679-0_sp1.1 from training. Duration: 0.52725 2023-10-02 11:53:46,375 WARNING [train.py:1204] (2/4) Exclude cut with ID _41391_210_7994_1_1533603771872_3638201_1-54521-0_sp1.1 from training. Duration: 0.97275 2023-10-02 11:53:46,616 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.bypass_mid.scale_min, batch_count=870206.6666666666, ans=0.2 2023-10-02 11:53:47,670 WARNING [train.py:1204] (2/4) Exclude cut with ID _56235_210_10640_1_1534557335235_4161099_495-186609-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 11:53:47,980 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=870273.3333333334, ans=0.1 2023-10-02 11:53:48,280 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.5.encoder.layers.0.nonlin_attention.whiten2, num_groups=1, num_channels=256, metric=3.49 vs. limit=15.0 2023-10-02 11:53:49,640 INFO [train.py:1046] (2/4) Epoch 25, batch 3050, loss[loss=0.166, simple_loss=0.2418, pruned_loss=0.04516, over 23250.00 frames. ], tot_loss[loss=0.1712, simple_loss=0.2481, pruned_loss=0.04719, over 4700012.21 frames. ], batch size: 105, lr: 4.09e-03, grad_scale: 8.0 2023-10-02 11:53:49,749 WARNING [train.py:1204] (2/4) Exclude cut with ID _4681_210_3525_1_1527250689464_120889_15-126760-0 from training. Duration: 0.81 2023-10-02 11:53:51,265 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_3019_210_2028_1_1526044245_3264865_127-135615-0_sp1.1 from training. Duration: 0.8545625 2023-10-02 11:53:53,907 WARNING [train.py:1204] (2/4) Exclude cut with ID _30191_210_1664_1_1533189915047_7196670_915-21561-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 11:53:53,955 WARNING [train.py:1204] (2/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_6-214815-0_sp0.9 from training. Duration: 0.9444375 2023-10-02 11:53:56,693 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_45399_210_4852_1_1533624725_7579737_190-137494-0_sp1.1 from training. Duration: 0.97275 2023-10-02 11:53:59,509 WARNING [train.py:1204] (2/4) Exclude cut with ID _41314_210_12651_1_1533459476543_3766460_207-132038-0 from training. Duration: 0.94 2023-10-02 11:54:01,351 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.ff3_skip_rate, batch_count=870273.3333333334, ans=0.0 2023-10-02 11:54:02,768 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.feed_forward3.hidden_balancer.prob, batch_count=870340.0, ans=0.125 2023-10-02 11:54:04,387 WARNING [train.py:1204] (2/4) Exclude cut with ID _14603_210_4220_1_1530615688441_3097319_90-31383-0 from training. Duration: 0.87 2023-10-02 11:54:05,757 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_15517_210_6753_1_1530952293_1664795_119-31001-0 from training. Duration: 0.94 2023-10-02 11:54:07,090 WARNING [train.py:1204] (2/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_684-85342-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 11:54:09,956 WARNING [train.py:1204] (2/4) Exclude cut with ID _14603_210_4220_1_1530615688441_3097319_97-325053-0_sp1.1 from training. Duration: 0.836375 2023-10-02 11:54:14,078 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_68702_210_23689_1_1535799577_3420068_168-25150-0_sp1.1 from training. Duration: 0.97275 2023-10-02 11:54:14,085 WARNING [train.py:1204] (2/4) Exclude cut with ID _65134_210_14763_1_1535335393985_3505368_111-27984-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 11:54:14,266 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.conv_module2.balancer1.prob, batch_count=870340.0, ans=0.125 2023-10-02 11:54:15,952 WARNING [train.py:1204] (2/4) Exclude cut with ID _48678_210_5663_1_1533988830947_3769199_276-226230-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 11:54:19,220 WARNING [train.py:1204] (2/4) Exclude cut with ID _28427_210_4220_1_1532581240967_3061069_43-84928-0_sp1.1 from training. Duration: 0.9 2023-10-02 11:54:19,262 WARNING [train.py:1204] (2/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_158-250116-0_sp1.1 from training. Duration: 0.563625 2023-10-02 11:54:19,285 WARNING [train.py:1204] (2/4) Exclude cut with ID _66873_210_5517_1_1535454136506_4293370_279-332556-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 11:54:19,322 WARNING [train.py:1204] (2/4) Exclude cut with ID _42863_210_9070_1_1533902398788_3696920_488-230146-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 11:54:19,322 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_387-247159-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 11:54:20,603 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6729_210_2550_1_1528032481_1788725_261-42619-0_sp1.1 from training. Duration: 0.97275 2023-10-02 11:54:23,967 WARNING [train.py:1204] (2/4) Exclude cut with ID _19896_210_1883_1_1531709022662_6649569_831-44724-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 11:54:24,277 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.nonlin_attention.balancer.prob, batch_count=870406.6666666666, ans=0.125 2023-10-02 11:54:26,691 WARNING [train.py:1204] (2/4) Exclude cut with ID _55153_210_2253_1_1534384786677_3162096_280-285944-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 11:54:26,721 WARNING [train.py:1204] (2/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_448-4048-0 from training. Duration: 0.96 2023-10-02 11:54:28,120 WARNING [train.py:1204] (2/4) Exclude cut with ID _29678_210_7893_1_1532421042787_3906118_456-202548-0_sp1.1 from training. Duration: 0.97275 2023-10-02 11:54:28,142 WARNING [train.py:1204] (2/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_107-332398-0_sp1.1 from training. Duration: 0.7090625 2023-10-02 11:54:30,989 WARNING [train.py:1204] (2/4) Exclude cut with ID _13921_210_9994_1_1530838702800_3003429_46-149444-0_sp1.1 from training. Duration: 0.8545625 2023-10-02 11:54:32,291 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_257-20733-0_sp0.9 from training. Duration: 0.8444375 2023-10-02 11:54:32,355 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_53753_210_7234_1_1534320003_3594148_385-234809-0_sp1.1 from training. Duration: 0.963625 2023-10-02 11:54:32,490 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.0.bypass.skip_rate, batch_count=870473.3333333334, ans=0.035 2023-10-02 11:54:33,652 WARNING [train.py:1204] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_292-215346-0_sp1.1 from training. Duration: 0.8818125 2023-10-02 11:54:38,279 WARNING [train.py:1204] (2/4) Exclude cut with ID _68965_210_1883_1_1535698962272_3497490_196-124268-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 11:54:39,621 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_23801_210_16223_1_1532066392_3294657_570-5439-0_sp1.1 from training. Duration: 0.8818125 2023-10-02 11:54:39,958 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.conv_skip_rate, batch_count=870473.3333333334, ans=0.0 2023-10-02 11:54:42,484 WARNING [train.py:1204] (2/4) Exclude cut with ID _5166_210_2084_1_1527073197160_4087993_349-6928-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 11:54:43,788 WARNING [train.py:1204] (2/4) Exclude cut with ID _60621_210_17071_1_1535013053530_3671739_226-255202-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 11:54:43,792 WARNING [train.py:1204] (2/4) Exclude cut with ID _41817_210_18835_1_1533434477160_7423049_448-130302-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 11:54:45,613 WARNING [train.py:1204] (2/4) Exclude cut with ID _6653_210_2629_1_1527919172441_6732679_758-143677-0_sp1.1 from training. Duration: 0.8909375 2023-10-02 11:54:46,882 WARNING [train.py:1204] (2/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_912-47473-0_sp0.9 from training. Duration: 0.7555625 2023-10-02 11:54:48,150 WARNING [train.py:1204] (2/4) Exclude cut with ID _6694_210_4599_1_1527852637490_3965529_356-169233-0_sp1.1 from training. Duration: 0.9 2023-10-02 11:54:48,886 WARNING [train.py:1204] (2/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_1-99680-0 from training. Duration: 0.67 2023-10-02 11:54:50,107 WARNING [train.py:1204] (2/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_199-86432-0_sp1.1 from training. Duration: 0.8909375 2023-10-02 11:54:50,130 WARNING [train.py:1204] (2/4) Exclude cut with ID _61148_210_6897_1_1535094022726_4217610_362-182430-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 11:54:51,538 WARNING [train.py:1204] (2/4) Exclude cut with ID _49376_210_5986_1_1533985278732_3370740_301-154763-0 from training. Duration: 0.98 2023-10-02 11:54:52,958 WARNING [train.py:1204] (2/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_572-185150-0_sp1.1 from training. Duration: 0.8818125 2023-10-02 11:54:57,823 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_47896_210_20508_1_1534492518_5958868_558-314572-0_sp1.1 from training. Duration: 0.8818125 2023-10-02 11:54:59,196 WARNING [train.py:1204] (2/4) Exclude cut with ID _45560_210_21982_1_1533812603951_3584999_111-6376-0_sp0.9 from training. Duration: 0.9333125 2023-10-02 11:55:00,563 WARNING [train.py:1204] (2/4) Exclude cut with ID _14170_210_5219_1_1530525949868_4469650_412-317077-0_sp1.1 from training. Duration: 0.6818125 2023-10-02 11:55:01,943 WARNING [train.py:1204] (2/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_718-68505-0 from training. Duration: 0.89 2023-10-02 11:55:03,276 INFO [train.py:1046] (2/4) Epoch 25, batch 3100, loss[loss=0.1443, simple_loss=0.2264, pruned_loss=0.03112, over 24663.00 frames. ], tot_loss[loss=0.1706, simple_loss=0.2475, pruned_loss=0.04686, over 4709036.48 frames. ], batch size: 65, lr: 4.09e-03, grad_scale: 8.0 2023-10-02 11:55:06,533 WARNING [train.py:1204] (2/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_77-311404-0 from training. Duration: 0.94 2023-10-02 11:55:07,080 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.4.encoder.layers.0.whiten, num_groups=1, num_channels=384, metric=5.70 vs. limit=12.0 2023-10-02 11:55:07,834 WARNING [train.py:1204] (2/4) Exclude cut with ID _60229_210_27767_1_1534838406158_3655350_264-279573-0 from training. Duration: 0.83 2023-10-02 11:55:09,292 WARNING [train.py:1204] (2/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_106-300985-0_sp1.1 from training. Duration: 0.8181875 2023-10-02 11:55:12,077 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_4183_210_2087_1_1526729100_1869838_79-277189-0_sp1.1 from training. Duration: 0.963625 2023-10-02 11:55:12,086 WARNING [train.py:1204] (2/4) Exclude cut with ID _28427_210_4220_1_1532581240967_3061069_195-295268-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 11:55:15,246 WARNING [train.py:1204] (2/4) Exclude cut with ID _4092_210_2151_1_1526643044449_3135759_356-278694-0_sp0.9 from training. Duration: 0.52225 2023-10-02 11:55:20,094 WARNING [train.py:1204] (2/4) Exclude cut with ID _14459_210_2228_1_1531443641946_7516700_777-71446-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 11:55:24,303 WARNING [train.py:1204] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_520-126729-0 from training. Duration: 0.7 2023-10-02 11:55:28,084 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.bypass.scale_min, batch_count=870673.3333333334, ans=0.2 2023-10-02 11:55:29,134 WARNING [train.py:1204] (2/4) Exclude cut with ID _13684_210_7497_1_1530619354863_3598359_312-235606-0_sp0.9 from training. Duration: 0.6666875 2023-10-02 11:55:29,181 WARNING [train.py:1204] (2/4) Exclude cut with ID _37537_210_2253_1_1533171758568_3421949_283-349402-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 11:55:30,406 WARNING [train.py:1204] (2/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_29-117932-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 11:55:30,438 WARNING [train.py:1204] (2/4) Exclude cut with ID _29303_210_2575_1_1532392237303_7410649_721-283351-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 11:55:31,760 WARNING [train.py:1204] (2/4) Exclude cut with ID _14886_210_4220_1_1530702020355_3516190_175-229073-0_sp0.9 from training. Duration: 0.5 2023-10-02 11:55:32,657 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.1.encoder.layers.0.feed_forward2.out_whiten, num_groups=1, num_channels=256, metric=4.27 vs. limit=15.0 2023-10-02 11:55:33,213 WARNING [train.py:1204] (2/4) Exclude cut with ID _15458_210_8341_1_1530858614263_3834410_70-268830-0_sp1.1 from training. Duration: 0.97275 2023-10-02 11:55:33,417 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.self_attn_weights.pos_emb_skip_rate, batch_count=870740.0, ans=0.0 2023-10-02 11:55:34,476 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_2295_210_2087_1_1525500993_2407863_234-83104-0 from training. Duration: 0.62 2023-10-02 11:55:34,477 WARNING [train.py:1204] (2/4) Exclude cut with ID _22961_210_8254_1_1531976366672_4278180_317-199546-0_sp1.1 from training. Duration: 0.7818125 2023-10-02 11:55:34,684 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.1.self_attn_weights.pos_emb_skip_rate, batch_count=870740.0, ans=0.0 2023-10-02 11:55:35,874 WARNING [train.py:1204] (2/4) Exclude cut with ID _27894_210_12392_1_1532415496078_6902439_547-196022-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 11:55:37,783 WARNING [train.py:1204] (2/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_46-207599-0 from training. Duration: 0.64 2023-10-02 11:55:39,161 WARNING [train.py:1204] (2/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_426-326246-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 11:55:41,939 WARNING [train.py:1204] (2/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_445-117398-0_sp0.9 from training. Duration: 0.988875 2023-10-02 11:55:41,989 WARNING [train.py:1204] (2/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_563-347832-0 from training. Duration: 0.89 2023-10-02 11:55:43,334 WARNING [train.py:1204] (2/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_252-216674-0 from training. Duration: 0.54 2023-10-02 11:55:44,698 WARNING [train.py:1204] (2/4) Exclude cut with ID _59611_210_8895_1_1534741198240_3650361_187-212779-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 11:55:45,926 WARNING [train.py:1204] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_282-159367-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 11:55:49,343 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_68702_210_23689_1_1535799577_3420068_33-136883-0_sp1.1 from training. Duration: 0.936375 2023-10-02 11:55:49,355 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_47228_210_4983_1_1533780021_3672027_184-293706-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 11:55:49,378 WARNING [train.py:1204] (2/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_656-188361-0_sp0.9 from training. Duration: 0.9666875 2023-10-02 11:55:50,748 WARNING [train.py:1204] (2/4) Exclude cut with ID _4866_210_2577_1_1527404412284_4364659_352-157930-0_sp0.9 from training. Duration: 0.9 2023-10-02 11:55:50,749 WARNING [train.py:1204] (2/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_210-106047-0_sp1.1 from training. Duration: 0.8818125 2023-10-02 11:55:52,041 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.486e+02 1.832e+02 1.979e+02 2.196e+02 3.160e+02, threshold=3.959e+02, percent-clipped=0.0 2023-10-02 11:55:53,856 WARNING [train.py:1204] (2/4) Exclude cut with ID _9389_210_1664_1_1529217337301_5289269_289-213417-0_sp1.1 from training. Duration: 0.8090625 2023-10-02 11:55:53,898 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_3910_210_1794_1_1526777667_3747260_178-41186-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 11:55:53,902 WARNING [train.py:1204] (2/4) Exclude cut with ID _27052_210_5986_1_1532329197187_3585009_24-172263-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 11:55:53,906 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_5522_210_3273_1_1527249606_3909096_318-203153-0_sp1.1 from training. Duration: 0.3909375 2023-10-02 11:55:56,962 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.conv_module2.balancer1.prob, batch_count=870806.6666666666, ans=0.125 2023-10-02 11:55:58,042 WARNING [train.py:1204] (2/4) Exclude cut with ID _27894_210_12392_1_1532415496078_6902439_222-51982-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 11:55:59,836 WARNING [train.py:1204] (2/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_87-131427-0 from training. Duration: 0.78 2023-10-02 11:56:01,226 WARNING [train.py:1204] (2/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_372-298093-0_sp0.9 from training. Duration: 0.888875 2023-10-02 11:56:02,558 WARNING [train.py:1204] (2/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_663-166973-0 from training. Duration: 0.8 2023-10-02 11:56:02,593 WARNING [train.py:1204] (2/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_429-177469-0_sp1.1 from training. Duration: 0.936375 2023-10-02 11:56:02,627 WARNING [train.py:1204] (2/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_724-172651-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 11:56:03,964 WARNING [train.py:1204] (2/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_103-336064-0 from training. Duration: 0.51 2023-10-02 11:56:16,765 INFO [train.py:1046] (2/4) Epoch 25, batch 3150, loss[loss=0.1749, simple_loss=0.2559, pruned_loss=0.04696, over 24047.00 frames. ], tot_loss[loss=0.17, simple_loss=0.2468, pruned_loss=0.0466, over 4715314.22 frames. ], batch size: 80, lr: 4.09e-03, grad_scale: 8.0 2023-10-02 11:56:16,839 WARNING [train.py:1204] (2/4) Exclude cut with ID _48677_210_5663_1_1533902405919_3892989_111-11439-0 from training. Duration: 0.96 2023-10-02 11:56:18,331 WARNING [train.py:1204] (2/4) Exclude cut with ID _8085_210_4557_1_1528455633493_3716100_95-103398-0_sp1.1 from training. Duration: 0.963625 2023-10-02 11:56:18,390 WARNING [train.py:1204] (2/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_1041-110837-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 11:56:19,816 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_50050_210_16223_1_1535435796_7290138_1155-121757-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 11:56:19,818 WARNING [train.py:1204] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_602-275260-0_sp0.9 from training. Duration: 0.911125 2023-10-02 11:56:21,864 WARNING [train.py:1204] (2/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_265-164486-0 from training. Duration: 0.96 2023-10-02 11:56:23,340 WARNING [train.py:1204] (2/4) Exclude cut with ID _46067_210_4220_1_1534125571066_3307450_142-229375-0_sp1.1 from training. Duration: 0.963625 2023-10-02 11:56:23,378 WARNING [train.py:1204] (2/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_310-26186-0_sp1.1 from training. Duration: 0.636375 2023-10-02 11:56:25,369 WARNING [train.py:1204] (2/4) Exclude cut with ID _28461_210_2253_1_1532771775189_3041299_136-270644-0 from training. Duration: 0.93 2023-10-02 11:56:26,611 WARNING [train.py:1204] (2/4) Exclude cut with ID _14860_210_4915_1_1530702020340_7343240_438-286372-0_sp1.1 from training. Duration: 0.936375 2023-10-02 11:56:28,056 WARNING [train.py:1204] (2/4) Exclude cut with ID IC0128W0004-63309 from training. Duration: 0.831 2023-10-02 11:56:29,422 WARNING [train.py:1204] (2/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_135-174080-0 from training. Duration: 0.53 2023-10-02 11:56:31,292 WARNING [train.py:1204] (2/4) Exclude cut with ID _41326_210_5854_1_1533778076509_3492080_277-59511-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 11:56:32,516 WARNING [train.py:1204] (2/4) Exclude cut with ID IC0144W0112-71379 from training. Duration: 0.992 2023-10-02 11:56:32,590 WARNING [train.py:1204] (2/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_711-15688-0_sp0.9 from training. Duration: 0.47775 2023-10-02 11:56:35,429 WARNING [train.py:1204] (2/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_237-332027-0 from training. Duration: 0.95 2023-10-02 11:56:36,607 WARNING [train.py:1204] (2/4) Exclude cut with ID _7970_210_1883_1_1528455616296_3472230_25-178407-0 from training. Duration: 0.71 2023-10-02 11:56:36,609 WARNING [train.py:1204] (2/4) Exclude cut with ID _8480_210_1874_1_1528589391205_6728330_309-139896-0 from training. Duration: 0.81 2023-10-02 11:56:36,633 WARNING [train.py:1204] (2/4) Exclude cut with ID _39571_210_13449_1_1533801584143_3651077_319-268877-0_sp1.1 from training. Duration: 0.936375 2023-10-02 11:56:36,643 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_43802_210_2290_1_1533516203_8829504_155-246880-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 11:56:37,953 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_566-122599-0_sp1.1 from training. Duration: 0.936375 2023-10-02 11:56:38,215 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.conv_module2.balancer2.prob, batch_count=871006.6666666666, ans=0.125 2023-10-02 11:56:39,879 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_12868_210_6753_1_1530701932_530275_18-76322-0 from training. Duration: 0.87 2023-10-02 11:56:41,336 WARNING [train.py:1204] (2/4) Exclude cut with ID _56494_210_4220_1_1535284837625_3033169_159-106035-0_sp1.1 from training. Duration: 0.963625 2023-10-02 11:56:41,370 WARNING [train.py:1204] (2/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_98-276013-0_sp1.1 from training. Duration: 0.963625 2023-10-02 11:56:42,710 WARNING [train.py:1204] (2/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_330-216124-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 11:56:44,193 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_177-58005-0_sp0.9 from training. Duration: 0.688875 2023-10-02 11:56:44,344 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=871006.6666666666, ans=0.1 2023-10-02 11:56:48,180 WARNING [train.py:1204] (2/4) Exclude cut with ID _56235_210_10640_1_1534557335235_4161099_343-139898-0 from training. Duration: 0.96 2023-10-02 11:56:48,437 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.conv_module2.balancer2.prob, batch_count=871073.3333333334, ans=0.125 2023-10-02 11:56:49,472 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6961_210_2087_1_1527937168_6815011_395-185060-0_sp1.1 from training. Duration: 0.863625 2023-10-02 11:56:51,359 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7002_210_1794_1_1527939683_5157286_184-229630-0_sp0.9 from training. Duration: 0.82225 2023-10-02 11:56:52,626 WARNING [train.py:1204] (2/4) Exclude cut with ID _59170_210_6721_1_1534983633960_4203335_153-18895-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 11:56:52,664 WARNING [train.py:1204] (2/4) Exclude cut with ID _49643_210_7989_1_1534640244224_3939031_598-161926-0 from training. Duration: 0.9 2023-10-02 11:56:55,934 WARNING [train.py:1204] (2/4) Exclude cut with ID _4068_210_2629_1_1526697350395_1546259_224-288693-0 from training. Duration: 0.59 2023-10-02 11:56:55,989 WARNING [train.py:1204] (2/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_718-68505-0_sp1.1 from training. Duration: 0.8090625 2023-10-02 11:56:57,278 WARNING [train.py:1204] (2/4) Exclude cut with ID _4068_210_2629_1_1526697350395_1546259_224-288693-0_sp0.9 from training. Duration: 0.6555625 2023-10-02 11:56:57,290 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_2296_210_2087_1_1525503448_3397335_57-185430-0_sp0.9 from training. Duration: 0.6666875 2023-10-02 11:56:57,474 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.feed_forward1.hidden_balancer.prob, batch_count=871073.3333333334, ans=0.125 2023-10-02 11:56:58,540 WARNING [train.py:1204] (2/4) Exclude cut with ID _15458_210_8341_1_1530858614263_3834410_49-235472-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 11:56:58,543 WARNING [train.py:1204] (2/4) Exclude cut with ID _64135_210_2253_1_1535677267876_7608972_498-5039-0_sp0.9 from training. Duration: 0.8666875 2023-10-02 11:56:58,644 WARNING [train.py:1204] (2/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_283-220372-0_sp0.9 from training. Duration: 0.62225 2023-10-02 11:56:58,652 WARNING [train.py:1204] (2/4) Exclude cut with ID _6473_210_1741_1_1527901307396_3815380_108-17335-0_sp0.9 from training. Duration: 0.72225 2023-10-02 11:57:00,042 WARNING [train.py:1204] (2/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_113-92608-0 from training. Duration: 0.54 2023-10-02 11:57:00,091 WARNING [train.py:1204] (2/4) Exclude cut with ID _14170_210_5219_1_1530525949868_4469650_272-249985-0_sp1.1 from training. Duration: 0.6090625 2023-10-02 11:57:00,100 WARNING [train.py:1204] (2/4) Exclude cut with ID _50376_210_13401_1_1534031502101_4217740_518-187390-0_sp1.1 from training. Duration: 0.97275 2023-10-02 11:57:03,336 WARNING [train.py:1204] (2/4) Exclude cut with ID _59738_210_15379_1_1534849512562_3941470_165-248260-0_sp1.1 from training. Duration: 0.92725 2023-10-02 11:57:03,347 WARNING [train.py:1204] (2/4) Exclude cut with ID _23818_210_6228_1_1531917063884_3676920_400-343925-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 11:57:03,411 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6961_210_2087_1_1527937168_6815011_562-165518-0 from training. Duration: 0.77 2023-10-02 11:57:04,771 WARNING [train.py:1204] (2/4) Exclude cut with ID _24271_210_12658_1_1531987270022_7190519_289-207931-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 11:57:06,100 WARNING [train.py:1204] (2/4) Exclude cut with ID _14632_210_9988_1_1530691279392_3923200_332-105904-0 from training. Duration: 0.9 2023-10-02 11:57:06,110 WARNING [train.py:1204] (2/4) Exclude cut with ID _40402_210_18219_1_1533522324115_2953150_203-232438-0_sp1.1 from training. Duration: 0.97275 2023-10-02 11:57:07,505 WARNING [train.py:1204] (2/4) Exclude cut with ID _13252_210_1925_1_1530671372047_3336909_263-168388-0 from training. Duration: 0.86 2023-10-02 11:57:09,207 WARNING [train.py:1204] (2/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_174-205058-0 from training. Duration: 0.95 2023-10-02 11:57:10,574 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_94-195801-0_sp1.1 from training. Duration: 0.8545625 2023-10-02 11:57:10,596 WARNING [train.py:1204] (2/4) Exclude cut with ID _55153_210_2253_1_1534384786677_3162096_269-103204-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 11:57:11,994 WARNING [train.py:1204] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_748-36150-0 from training. Duration: 0.91 2023-10-02 11:57:13,340 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_12868_210_6753_1_1530702479_3610564_148-172861-0_sp1.1 from training. Duration: 0.4545625 2023-10-02 11:57:13,393 WARNING [train.py:1204] (2/4) Exclude cut with ID _67499_210_17282_1_1535509805220_3692430_460-77730-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 11:57:16,113 WARNING [train.py:1204] (2/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_374-20753-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 11:57:18,788 WARNING [train.py:1204] (2/4) Exclude cut with ID _45329_210_7005_1_1534157951211_6774339_374-289983-0_sp1.1 from training. Duration: 0.97275 2023-10-02 11:57:18,825 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_34886_210_10633_1_1533203882_1755661_76-156096-0_sp0.9 from training. Duration: 0.9666875 2023-10-02 11:57:24,106 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_238-256668-0_sp0.9 from training. Duration: 0.7666875 2023-10-02 11:57:24,158 WARNING [train.py:1204] (2/4) Exclude cut with ID _46683_210_9642_1_1533730913492_4591116_489-146727-0_sp1.1 from training. Duration: 0.97275 2023-10-02 11:57:26,916 WARNING [train.py:1204] (2/4) Exclude cut with ID _13938_210_9988_1_1530518718978_3693960_304-112784-0_sp0.9 from training. Duration: 0.511125 2023-10-02 11:57:30,982 INFO [train.py:1046] (2/4) Epoch 25, batch 3200, loss[loss=0.159, simple_loss=0.2329, pruned_loss=0.04249, over 23788.00 frames. ], tot_loss[loss=0.1692, simple_loss=0.246, pruned_loss=0.04622, over 4718430.01 frames. ], batch size: 212, lr: 4.09e-03, grad_scale: 16.0 2023-10-02 11:57:33,059 WARNING [train.py:1204] (2/4) Exclude cut with ID _24271_210_12658_1_1531987270022_7190519_243-46549-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 11:57:33,071 WARNING [train.py:1204] (2/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_78-182912-0_sp1.1 from training. Duration: 0.536375 2023-10-02 11:57:37,116 WARNING [train.py:1204] (2/4) Exclude cut with ID _57572_210_5517_1_1534921219624_3809484_121-321334-0_sp1.1 from training. Duration: 0.97275 2023-10-02 11:57:37,207 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_40047_210_2290_1_1533290519_2374311_254-102149-0_sp1.1 from training. Duration: 0.963625 2023-10-02 11:57:37,208 WARNING [train.py:1204] (2/4) Exclude cut with ID _28461_210_2253_1_1532771775189_3041299_96-152158-0 from training. Duration: 0.85 2023-10-02 11:57:38,783 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.conv_skip_rate, batch_count=871273.3333333334, ans=0.0 2023-10-02 11:57:40,470 WARNING [train.py:1204] (2/4) Exclude cut with ID _29877_210_6228_1_1532397791106_5276149_161-102979-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 11:57:43,362 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_3787_210_2040_1_1526477544_5478586_603-120324-0_sp0.9 from training. Duration: 0.9 2023-10-02 11:57:46,060 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_41750_210_1960_1_1533520678_3948042_247-213456-0_sp1.1 from training. Duration: 0.97275 2023-10-02 11:57:54,985 WARNING [train.py:1204] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_127-119109-0_sp0.9 from training. Duration: 0.8 2023-10-02 11:58:03,642 WARNING [train.py:1204] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_215-202973-0 from training. Duration: 0.88 2023-10-02 11:58:03,704 WARNING [train.py:1204] (2/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_17-320491-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 11:58:06,434 WARNING [train.py:1204] (2/4) Exclude cut with ID _5166_210_2084_1_1527073197160_4087993_310-114452-0 from training. Duration: 0.59 2023-10-02 11:58:08,305 WARNING [train.py:1204] (2/4) Exclude cut with ID _15133_210_6228_1_1530794512074_3080379_29-264458-0_sp0.9 from training. Duration: 0.6444375 2023-10-02 11:58:12,341 WARNING [train.py:1204] (2/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_159-281310-0_sp1.1 from training. Duration: 0.863625 2023-10-02 11:58:12,356 WARNING [train.py:1204] (2/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_195-167011-0_sp0.9 from training. Duration: 0.8333125 2023-10-02 11:58:13,663 WARNING [train.py:1204] (2/4) Exclude cut with ID _14087_210_2253_1_1530529224512_3014029_156-257607-0_sp1.1 from training. Duration: 0.7545625 2023-10-02 11:58:17,746 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_2295_210_2087_1_1525500993_2407863_116-238894-0 from training. Duration: 0.7 2023-10-02 11:58:18,965 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.546e+02 1.831e+02 2.000e+02 2.221e+02 3.044e+02, threshold=4.001e+02, percent-clipped=0.0 2023-10-02 11:58:19,075 WARNING [train.py:1204] (2/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_321-335475-0_sp0.9 from training. Duration: 0.5 2023-10-02 11:58:20,594 WARNING [train.py:1204] (2/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_188-114464-0 from training. Duration: 0.82 2023-10-02 11:58:21,149 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.4.encoder.layers.1.conv_module1.whiten, num_groups=1, num_channels=384, metric=3.04 vs. limit=15.0 2023-10-02 11:58:23,966 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_2592_210_2290_1_1525697025_4882925_157-70512-0 from training. Duration: 0.91 2023-10-02 11:58:25,390 WARNING [train.py:1204] (2/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_138-64210-0_sp1.1 from training. Duration: 0.663625 2023-10-02 11:58:30,284 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_11305_210_1794_1_1529750428_7178148_651-95702-0_sp1.1 from training. Duration: 0.963625 2023-10-02 11:58:30,304 WARNING [train.py:1204] (2/4) Exclude cut with ID _14224_210_4381_1_1530529062037_4264010_379-184223-0_sp0.9 from training. Duration: 0.9444375 2023-10-02 11:58:30,328 WARNING [train.py:1204] (2/4) Exclude cut with ID _8394_210_6702_1_1528590504316_3540189_381-125325-0_sp1.1 from training. Duration: 0.963625 2023-10-02 11:58:31,648 WARNING [train.py:1204] (2/4) Exclude cut with ID IC0622W0021-307659 from training. Duration: 0.896 2023-10-02 11:58:31,650 WARNING [train.py:1204] (2/4) Exclude cut with ID _7624_210_1531_1_1528109802198_4933101_418-349834-0_sp1.1 from training. Duration: 0.5454375 2023-10-02 11:58:34,921 WARNING [train.py:1204] (2/4) Exclude cut with ID _49728_210_12651_1_1534041034294_6788150_607-3416-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 11:58:35,023 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8275_210_4198_1_1528455320_3892571_491-17707-0 from training. Duration: 0.64 2023-10-02 11:58:36,406 WARNING [train.py:1204] (2/4) Exclude cut with ID _5287_210_2084_1_1527418883040_3878670_333-48508-0 from training. Duration: 0.6 2023-10-02 11:58:36,484 WARNING [train.py:1204] (2/4) Exclude cut with ID _8365_210_2064_1_1528592311070_4677690_371-122295-0 from training. Duration: 0.55 2023-10-02 11:58:36,693 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.bypass.scale_min, batch_count=871540.0, ans=0.2 2023-10-02 11:58:37,805 WARNING [train.py:1204] (2/4) Exclude cut with ID _14860_210_4915_1_1530702020340_7343240_836-328489-0 from training. Duration: 0.9 2023-10-02 11:58:39,815 WARNING [train.py:1204] (2/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_1033-274376-0_sp0.9 from training. Duration: 0.9666875 2023-10-02 11:58:42,356 WARNING [train.py:1204] (2/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_475-127455-0_sp0.9 from training. Duration: 0.77775 2023-10-02 11:58:42,363 WARNING [train.py:1204] (2/4) Exclude cut with ID IC0671W0071-331914 from training. Duration: 0.896 2023-10-02 11:58:42,382 WARNING [train.py:1204] (2/4) Exclude cut with ID _30327_210_16049_1_1532484161295_3924169_2-1523-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 11:58:42,384 WARNING [train.py:1204] (2/4) Exclude cut with ID _24314_210_4915_1_1532135078116_7170179_460-208279-0_sp1.1 from training. Duration: 0.97275 2023-10-02 11:58:42,492 WARNING [train.py:1204] (2/4) Exclude cut with ID IC0679W0052-335859 from training. Duration: 0.64 2023-10-02 11:58:45,077 INFO [train.py:1046] (2/4) Epoch 25, batch 3250, loss[loss=0.1687, simple_loss=0.2553, pruned_loss=0.04108, over 24537.00 frames. ], tot_loss[loss=0.169, simple_loss=0.2463, pruned_loss=0.04583, over 4729111.19 frames. ], batch size: 71, lr: 4.09e-03, grad_scale: 8.0 2023-10-02 11:58:48,181 WARNING [train.py:1204] (2/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_3-237434-0_sp1.1 from training. Duration: 0.6909375 2023-10-02 11:58:49,576 WARNING [train.py:1204] (2/4) Exclude cut with ID _69884_210_9771_1_1535846392818_4166169_405-185283-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 11:58:52,578 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.attention_skip_rate, batch_count=871606.6666666666, ans=0.0 2023-10-02 11:58:58,720 WARNING [train.py:1204] (2/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_927-83684-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 11:58:58,728 WARNING [train.py:1204] (2/4) Exclude cut with ID _6694_210_4599_1_1527852637490_3965529_356-169233-0 from training. Duration: 0.99 2023-10-02 11:59:00,031 WARNING [train.py:1204] (2/4) Exclude cut with ID _19896_210_1883_1_1531709022662_6649569_562-233001-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 11:59:00,072 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_20120_210_6753_1_1531643035_5482859_354-78629-0_sp1.1 from training. Duration: 0.963625 2023-10-02 11:59:00,073 WARNING [train.py:1204] (2/4) Exclude cut with ID _60135_210_13427_1_1534935557922_3651319_96-168198-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 11:59:00,182 WARNING [train.py:1204] (2/4) Exclude cut with ID _31602_210_5854_1_1532677021456_4458250_249-96604-0_sp1.1 from training. Duration: 0.8090625 2023-10-02 11:59:01,563 WARNING [train.py:1204] (2/4) Exclude cut with ID _14100_210_8341_1_1530529320613_3678069_258-224599-0_sp0.9 from training. Duration: 0.8555625 2023-10-02 11:59:04,305 WARNING [train.py:1204] (2/4) Exclude cut with ID _19708_210_9206_1_1532136650733_4238364_215-160135-0_sp1.1 from training. Duration: 0.97275 2023-10-02 11:59:04,333 WARNING [train.py:1204] (2/4) Exclude cut with ID _21878_210_3525_1_1531738772002_7264330_113-18163-0_sp0.9 from training. Duration: 0.988875 2023-10-02 11:59:04,379 WARNING [train.py:1204] (2/4) Exclude cut with ID _64135_210_2253_1_1535677267876_7608972_552-337340-0_sp1.1 from training. Duration: 0.92725 2023-10-02 11:59:05,610 WARNING [train.py:1204] (2/4) Exclude cut with ID _67725_210_27767_1_1535608756671_5105359_10-12887-0_sp1.1 from training. Duration: 0.97275 2023-10-02 11:59:05,615 WARNING [train.py:1204] (2/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_401-284840-0_sp1.1 from training. Duration: 0.97275 2023-10-02 11:59:05,646 WARNING [train.py:1204] (2/4) Exclude cut with ID _44659_210_2721_1_1534036120061_6614860_483-141285-0_sp1.1 from training. Duration: 0.936375 2023-10-02 11:59:08,879 WARNING [train.py:1204] (2/4) Exclude cut with ID _9461_210_4557_1_1529060514726_3677451_358-332734-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 11:59:10,333 WARNING [train.py:1204] (2/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_788-298907-0_sp1.1 from training. Duration: 0.8090625 2023-10-02 11:59:11,756 WARNING [train.py:1204] (2/4) Exclude cut with ID _39411_210_5986_1_1533538825030_3343649_354-267196-0_sp1.1 from training. Duration: 0.92725 2023-10-02 11:59:13,108 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_14953_210_2151_1_1530701710_5616165_192-309792-0_sp1.1 from training. Duration: 0.97275 2023-10-02 11:59:14,498 WARNING [train.py:1204] (2/4) Exclude cut with ID _59170_210_6721_1_1534983633960_4203335_290-151255-0_sp1.1 from training. Duration: 0.92725 2023-10-02 11:59:14,535 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_5575_210_1410_1_1527301305_2570170_249-93769-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 11:59:14,550 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_46013_210_7234_1_1533776362_3744032_463-52378-0_sp1.1 from training. Duration: 0.7818125 2023-10-02 11:59:18,829 WARNING [train.py:1204] (2/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_580-93798-0 from training. Duration: 0.56 2023-10-02 11:59:18,889 WARNING [train.py:1204] (2/4) Exclude cut with ID _19514_210_9259_1_1531530071330_5084230_385-13762-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 11:59:18,894 WARNING [train.py:1204] (2/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_458-67321-0_sp1.1 from training. Duration: 0.8818125 2023-10-02 11:59:21,955 WARNING [train.py:1204] (2/4) Exclude cut with ID _41298_210_9019_1_1533430371450_4822143_188-154791-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 11:59:23,261 WARNING [train.py:1204] (2/4) Exclude cut with ID _15037_210_2253_1_1530752294104_3267650_296-192597-0_sp0.9 from training. Duration: 0.9 2023-10-02 11:59:27,924 WARNING [train.py:1204] (2/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_661-264857-0_sp1.1 from training. Duration: 0.8454375 2023-10-02 11:59:33,866 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_34008_210_10431_1_1533345424_8183247_662-317331-0_sp1.1 from training. Duration: 0.8545625 2023-10-02 11:59:33,889 WARNING [train.py:1204] (2/4) Exclude cut with ID _46502_210_13794_1_1533724452198_3657159_241-278329-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 11:59:33,890 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_11349_210_6753_1_1529800861_1101207_26-65602-0 from training. Duration: 0.96 2023-10-02 11:59:33,894 WARNING [train.py:1204] (2/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_448-4048-0_sp1.1 from training. Duration: 0.87275 2023-10-02 11:59:33,900 WARNING [train.py:1204] (2/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_662-275621-0_sp1.1 from training. Duration: 0.3818125 2023-10-02 11:59:35,837 WARNING [train.py:1204] (2/4) Exclude cut with ID _13938_210_9988_1_1530518718978_3693960_256-54454-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 11:59:39,787 WARNING [train.py:1204] (2/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_256-32662-0 from training. Duration: 0.9 2023-10-02 11:59:39,841 WARNING [train.py:1204] (2/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_326-17061-0 from training. Duration: 0.94 2023-10-02 11:59:41,118 WARNING [train.py:1204] (2/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_505-97987-0_sp1.1 from training. Duration: 0.7818125 2023-10-02 11:59:42,556 WARNING [train.py:1204] (2/4) Exclude cut with ID _66873_210_5517_1_1535454136506_4293370_316-294364-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 11:59:42,618 WARNING [train.py:1204] (2/4) Exclude cut with ID _19514_210_9259_1_1531530071330_5084230_762-231063-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 11:59:42,679 WARNING [train.py:1204] (2/4) Exclude cut with ID _4697_210_2069_1_1526815801953_3823098_68-219807-0_sp0.9 from training. Duration: 0.52225 2023-10-02 11:59:44,000 WARNING [train.py:1204] (2/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_479-158053-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 11:59:46,923 WARNING [train.py:1204] (2/4) Exclude cut with ID _66112_210_10635_1_1535590236305_3801484_83-38181-0_sp1.1 from training. Duration: 0.936375 2023-10-02 11:59:46,944 WARNING [train.py:1204] (2/4) Exclude cut with ID _55857_210_2346_1_1534644007831_6898209_255-146346-0_sp1.1 from training. Duration: 0.963625 2023-10-02 11:59:48,679 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.balancer1.prob, batch_count=871873.3333333334, ans=0.125 2023-10-02 11:59:50,110 WARNING [train.py:1204] (2/4) Exclude cut with ID _48836_210_12210_1_1534075848293_3904150_429-35475-0 from training. Duration: 0.89 2023-10-02 11:59:50,121 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_50154_210_13731_1_1534750862_1080961_119-187163-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 11:59:51,569 WARNING [train.py:1204] (2/4) Exclude cut with ID _8793_210_6310_1_1528542985139_2310009_121-179782-0_sp0.9 from training. Duration: 0.888875 2023-10-02 11:59:51,572 WARNING [train.py:1204] (2/4) Exclude cut with ID _14884_210_2252_1_1530707459689_5561250_591-173406-0 from training. Duration: 0.96 2023-10-02 11:59:53,191 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.feed_forward1.out_proj.dropout_p, batch_count=871873.3333333334, ans=0.1 2023-10-02 11:59:54,346 WARNING [train.py:1204] (2/4) Exclude cut with ID _15133_210_6228_1_1530794512074_3080379_54-198193-0_sp1.1 from training. Duration: 0.8545625 2023-10-02 11:59:55,576 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6545_210_2040_1_1527774670_4245258_407-48070-0 from training. Duration: 0.76 2023-10-02 11:59:55,846 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.nonlin_attention.balancer.prob, batch_count=871873.3333333334, ans=0.125 2023-10-02 11:59:58,139 INFO [train.py:1046] (2/4) Epoch 25, batch 3300, loss[loss=0.1681, simple_loss=0.2426, pruned_loss=0.04685, over 23825.00 frames. ], tot_loss[loss=0.1696, simple_loss=0.2474, pruned_loss=0.0459, over 4734570.40 frames. ], batch size: 195, lr: 4.09e-03, grad_scale: 8.0 2023-10-02 11:59:58,208 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1032-236863-0 from training. Duration: 0.84 2023-10-02 11:59:58,272 WARNING [train.py:1204] (2/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_507-303370-0 from training. Duration: 0.96 2023-10-02 11:59:58,299 WARNING [train.py:1204] (2/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_164-282366-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 12:00:02,912 WARNING [train.py:1204] (2/4) Exclude cut with ID _39867_210_9826_1_1533553222030_9344259_312-158727-0_sp1.1 from training. Duration: 0.963625 2023-10-02 12:00:03,003 WARNING [train.py:1204] (2/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_174-205058-0_sp1.1 from training. Duration: 0.863625 2023-10-02 12:00:03,023 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_11305_210_1794_1_1529750428_7178148_166-237358-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 12:00:06,094 WARNING [train.py:1204] (2/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_26-175553-0_sp1.1 from training. Duration: 0.5545625 2023-10-02 12:00:06,111 WARNING [train.py:1204] (2/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_96-290039-0_sp1.1 from training. Duration: 0.5090625 2023-10-02 12:00:06,283 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.feed_forward1.hidden_balancer.prob, batch_count=871940.0, ans=0.125 2023-10-02 12:00:07,855 WARNING [train.py:1204] (2/4) Exclude cut with ID _53219_210_4525_1_1534834836008_4064825_261-101691-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 12:00:10,609 WARNING [train.py:1204] (2/4) Exclude cut with ID _24271_210_12658_1_1531987270022_7190519_96-293390-0_sp1.1 from training. Duration: 0.936375 2023-10-02 12:00:10,903 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.nonlin_attention.balancer.prob, batch_count=871940.0, ans=0.125 2023-10-02 12:00:15,101 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_271-234962-0 from training. Duration: 0.79 2023-10-02 12:00:15,194 WARNING [train.py:1204] (2/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_136-30660-0_sp1.1 from training. Duration: 0.97275 2023-10-02 12:00:15,212 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_20120_210_6753_1_1531643035_5482859_625-281046-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 12:00:16,516 WARNING [train.py:1204] (2/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_333-168193-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 12:00:16,567 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9029W0269-509664 from training. Duration: 0.896 2023-10-02 12:00:18,577 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.3.encoder.layers.3.whiten, num_groups=1, num_channels=512, metric=4.18 vs. limit=12.0 2023-10-02 12:00:19,240 WARNING [train.py:1204] (2/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_450-147393-0_sp1.1 from training. Duration: 0.92725 2023-10-02 12:00:19,291 WARNING [train.py:1204] (2/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_643-52007-0_sp1.1 from training. Duration: 0.5181875 2023-10-02 12:00:19,356 WARNING [train.py:1204] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_330-327861-0_sp0.9 from training. Duration: 0.7444375 2023-10-02 12:00:19,364 WARNING [train.py:1204] (2/4) Exclude cut with ID _47790_210_16187_1_1533862814066_4207519_260-317651-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 12:00:21,041 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9044W0010-516654 from training. Duration: 0.896 2023-10-02 12:00:25,162 WARNING [train.py:1204] (2/4) Exclude cut with ID _14087_210_2253_1_1530529224512_3014029_265-118378-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 12:00:25,169 WARNING [train.py:1204] (2/4) Exclude cut with ID _6194_210_1534_1_1527511826160_3941194_321-148641-0_sp0.9 from training. Duration: 0.711125 2023-10-02 12:00:25,309 WARNING [train.py:1204] (2/4) Exclude cut with ID _54871_210_22356_1_1534665798253_3856359_101-169176-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 12:00:25,311 WARNING [train.py:1204] (2/4) Exclude cut with ID 497-129325-0061-62254-0_sp1.1 from training. Duration: 0.97725 2023-10-02 12:00:27,916 WARNING [train.py:1204] (2/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_795-33252-0_sp1.1 from training. Duration: 0.336375 2023-10-02 12:00:27,937 WARNING [train.py:1204] (2/4) Exclude cut with ID _28317_210_12742_1_1532480302381_8510325_168-246176-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 12:00:29,303 WARNING [train.py:1204] (2/4) Exclude cut with ID _7160_210_1876_1_1527940923231_3703799_120-150943-0_sp1.1 from training. Duration: 0.736375 2023-10-02 12:00:29,507 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.bypass.skip_rate, batch_count=872073.3333333334, ans=0.07 2023-10-02 12:00:32,476 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9084W0005-536844 from training. Duration: 0.768 2023-10-02 12:00:35,773 WARNING [train.py:1204] (2/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_234-260682-0 from training. Duration: 0.84 2023-10-02 12:00:35,808 WARNING [train.py:1204] (2/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_188-114464-0_sp0.9 from training. Duration: 0.911125 2023-10-02 12:00:38,689 WARNING [train.py:1204] (2/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_81-75849-0 from training. Duration: 0.39 2023-10-02 12:00:40,113 WARNING [train.py:1204] (2/4) Exclude cut with ID _14224_210_4381_1_1530529062037_4264010_379-184223-0_sp1.1 from training. Duration: 0.77275 2023-10-02 12:00:43,355 WARNING [train.py:1204] (2/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_454-123002-0_sp1.1 from training. Duration: 0.62725 2023-10-02 12:00:43,401 WARNING [train.py:1204] (2/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_887-249555-0_sp1.1 from training. Duration: 0.7 2023-10-02 12:00:46,282 WARNING [train.py:1204] (2/4) Exclude cut with ID _31088_210_16446_1_1532520012284_3440919_20-260902-0_sp1.1 from training. Duration: 0.97275 2023-10-02 12:00:46,329 WARNING [train.py:1204] (2/4) Exclude cut with ID _45329_210_7005_1_1534157951211_6774339_856-344574-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 12:00:46,331 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_36756_210_7234_1_1533026991_966276_4-164118-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 12:00:46,344 WARNING [train.py:1204] (2/4) Exclude cut with ID _8365_210_2064_1_1528592311070_4677690_371-122295-0_sp1.1 from training. Duration: 0.5 2023-10-02 12:00:48,872 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.577e+02 1.878e+02 2.117e+02 2.413e+02 3.290e+02, threshold=4.234e+02, percent-clipped=0.0 2023-10-02 12:00:49,023 WARNING [train.py:1204] (2/4) Exclude cut with ID _5910_210_1389_1_1527337972454_5050100_555-159815-0_sp1.1 from training. Duration: 0.8909375 2023-10-02 12:00:49,032 WARNING [train.py:1204] (2/4) Exclude cut with ID _37086_210_9038_1_1532999540891_7021250_469-147941-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 12:00:50,357 WARNING [train.py:1204] (2/4) Exclude cut with ID _3067_210_2667_1_1526212974640_4503168_459-173867-0_sp0.9 from training. Duration: 0.97775 2023-10-02 12:00:50,455 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9156W0006-572499 from training. Duration: 0.896 2023-10-02 12:00:51,987 WARNING [train.py:1204] (2/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_277-210798-0 from training. Duration: 0.55 2023-10-02 12:00:54,706 WARNING [train.py:1204] (2/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_103-336064-0_sp1.1 from training. Duration: 0.463625 2023-10-02 12:00:54,750 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_3414_210_1988_1_1526212718_4813195_331-290945-0_sp1.1 from training. Duration: 0.936375 2023-10-02 12:00:54,751 WARNING [train.py:1204] (2/4) Exclude cut with ID _67499_210_17282_1_1535509805220_3692430_365-29276-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 12:00:54,886 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.1.conv_module2.balancer2.prob, batch_count=872140.0, ans=0.125 2023-10-02 12:00:58,035 WARNING [train.py:1204] (2/4) Exclude cut with ID _14821_210_8750_1_1530680503991_6829359_69-250847-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 12:00:58,045 WARNING [train.py:1204] (2/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_1102-61301-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 12:00:59,449 WARNING [train.py:1204] (2/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_166-246557-0_sp1.1 from training. Duration: 0.5818125 2023-10-02 12:00:59,513 WARNING [train.py:1204] (2/4) Exclude cut with ID _15351_210_10626_1_1530856635653_3576210_214-129918-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 12:00:59,528 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_120-41668-0_sp0.9 from training. Duration: 0.77775 2023-10-02 12:01:00,836 WARNING [train.py:1204] (2/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_27-72278-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 12:01:02,766 WARNING [train.py:1204] (2/4) Exclude cut with ID _24314_210_4915_1_1532135078116_7170179_521-258868-0_sp1.1 from training. Duration: 0.7181875 2023-10-02 12:01:05,994 WARNING [train.py:1204] (2/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_143-86344-0 from training. Duration: 0.77 2023-10-02 12:01:06,038 WARNING [train.py:1204] (2/4) Exclude cut with ID _53489_210_6228_1_1534298357169_3835970_465-157433-0_sp1.1 from training. Duration: 0.92725 2023-10-02 12:01:07,427 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_248-319353-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 12:01:08,813 WARNING [train.py:1204] (2/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_102-77064-0_sp1.1 from training. Duration: 0.8454375 2023-10-02 12:01:08,852 WARNING [train.py:1204] (2/4) Exclude cut with ID _9389_210_1664_1_1529217337301_5289269_379-128794-0_sp1.1 from training. Duration: 0.77275 2023-10-02 12:01:10,254 WARNING [train.py:1204] (2/4) Exclude cut with ID _3231_210_2106_1_1526205864934_6784220_236-260835-0_sp1.1 from training. Duration: 0.97275 2023-10-02 12:01:11,679 WARNING [train.py:1204] (2/4) Exclude cut with ID _24271_210_12658_1_1531987270022_7190519_184-342363-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 12:01:11,680 WARNING [train.py:1204] (2/4) Exclude cut with ID _27894_210_12392_1_1532415496078_6902439_67-221663-0_sp1.1 from training. Duration: 0.92725 2023-10-02 12:01:13,358 INFO [train.py:1046] (2/4) Epoch 25, batch 3350, loss[loss=0.1594, simple_loss=0.2365, pruned_loss=0.04115, over 23283.00 frames. ], tot_loss[loss=0.1705, simple_loss=0.2482, pruned_loss=0.04644, over 4731815.37 frames. ], batch size: 119, lr: 4.09e-03, grad_scale: 8.0 2023-10-02 12:01:16,058 WARNING [train.py:1204] (2/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_304-59227-0_sp1.1 from training. Duration: 0.7 2023-10-02 12:01:17,555 WARNING [train.py:1204] (2/4) Exclude cut with ID _58605_210_12210_1_1534723195939_3904259_458-42232-0_sp1.1 from training. Duration: 0.92725 2023-10-02 12:01:17,629 WARNING [train.py:1204] (2/4) Exclude cut with ID _14922_210_6228_1_1530707435273_3693310_13-165512-0_sp0.9 from training. Duration: 0.911125 2023-10-02 12:01:19,022 WARNING [train.py:1204] (2/4) Exclude cut with ID _58602_210_2346_1_1534903210085_6908830_792-102241-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 12:01:21,713 WARNING [train.py:1204] (2/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_588-125165-0_sp0.9 from training. Duration: 0.92225 2023-10-02 12:01:21,983 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.balancer1.prob, batch_count=872273.3333333334, ans=0.125 2023-10-02 12:01:23,119 WARNING [train.py:1204] (2/4) Exclude cut with ID _54218_210_12210_1_1534413646388_4409259_239-98429-0_sp1.1 from training. Duration: 0.97275 2023-10-02 12:01:24,933 WARNING [train.py:1204] (2/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_538-32291-0_sp1.1 from training. Duration: 0.7545625 2023-10-02 12:01:26,312 WARNING [train.py:1204] (2/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_419-317062-0 from training. Duration: 0.91 2023-10-02 12:01:27,694 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9309W0202-640329 from training. Duration: 0.896 2023-10-02 12:01:27,718 WARNING [train.py:1204] (2/4) Exclude cut with ID _3231_210_2106_1_1526205864934_6784220_419-313291-0_sp1.1 from training. Duration: 0.97275 2023-10-02 12:01:31,802 WARNING [train.py:1204] (2/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_619-344499-0 from training. Duration: 0.63 2023-10-02 12:01:31,821 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_278-272612-0 from training. Duration: 0.73 2023-10-02 12:01:33,715 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_103-84010-0_sp0.9 from training. Duration: 0.7666875 2023-10-02 12:01:33,735 WARNING [train.py:1204] (2/4) Exclude cut with ID _26528_210_9070_1_1532570410842_3851310_497-97218-0_sp1.1 from training. Duration: 0.7818125 2023-10-02 12:01:33,833 WARNING [train.py:1204] (2/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_262-84613-0_sp1.1 from training. Duration: 0.936375 2023-10-02 12:01:34,900 WARNING [train.py:1204] (2/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_387-131538-0 from training. Duration: 0.81 2023-10-02 12:01:34,936 WARNING [train.py:1204] (2/4) Exclude cut with ID _8030_210_7497_1_1528444886781_3531089_224-300489-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 12:01:34,959 WARNING [train.py:1204] (2/4) Exclude cut with ID _14349_210_8341_1_1530615710818_3442470_170-336541-0_sp0.9 from training. Duration: 0.9555625 2023-10-02 12:01:36,371 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_47069_210_15368_1_1533781267_4431320_180-31746-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 12:01:38,222 WARNING [train.py:1204] (2/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_447-7041-0_sp1.1 from training. Duration: 0.92725 2023-10-02 12:01:39,501 WARNING [train.py:1204] (2/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_2-55283-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 12:01:39,550 WARNING [train.py:1204] (2/4) Exclude cut with ID _12555_210_8750_1_1530075594402_3780239_70-181911-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 12:01:42,383 WARNING [train.py:1204] (2/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_311-328022-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 12:01:45,580 WARNING [train.py:1204] (2/4) Exclude cut with ID _14528_210_8254_1_1530669700514_4306939_330-2987-0_sp1.1 from training. Duration: 0.92725 2023-10-02 12:01:45,631 WARNING [train.py:1204] (2/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_385-184858-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 12:01:49,695 WARNING [train.py:1204] (2/4) Exclude cut with ID _64414_210_17301_1_1535243252048_3757759_348-333360-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 12:01:50,995 WARNING [train.py:1204] (2/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_753-214751-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 12:01:52,402 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_17613_210_2151_1_1531137569_2366160_202-208013-0_sp1.1 from training. Duration: 0.92725 2023-10-02 12:01:52,410 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_43831_210_2158_1_1533725502_5872791_610-129300-0_sp1.1 from training. Duration: 0.936375 2023-10-02 12:01:55,535 WARNING [train.py:1204] (2/4) Exclude cut with ID _54218_210_12210_1_1534413646388_4409259_321-23852-0_sp1.1 from training. Duration: 0.936375 2023-10-02 12:01:58,239 WARNING [train.py:1204] (2/4) Exclude cut with ID _39411_210_5986_1_1533538825030_3343649_173-238248-0 from training. Duration: 0.83 2023-10-02 12:01:58,246 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_405-339986-0_sp0.9 from training. Duration: 0.5666875 2023-10-02 12:01:58,278 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_2009_210_2071_1_1525481134_4588937_385-302100-0 from training. Duration: 0.87 2023-10-02 12:01:58,319 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_161-276996-0_sp1.1 from training. Duration: 0.863625 2023-10-02 12:01:59,647 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_177-58005-0 from training. Duration: 0.62 2023-10-02 12:02:02,192 WARNING [train.py:1204] (2/4) Exclude cut with ID _64639_210_5816_1_1535367619437_3528000_146-343734-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 12:02:02,297 WARNING [train.py:1204] (2/4) Exclude cut with ID _3845_210_1390_1_1526731326141_3523610_72-61441-0_sp1.1 from training. Duration: 0.92725 2023-10-02 12:02:09,782 WARNING [train.py:1204] (2/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_991-125028-0_sp1.1 from training. Duration: 0.936375 2023-10-02 12:02:09,840 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7544_210_6753_1_1528591327_1471426_172-348777-0 from training. Duration: 0.66 2023-10-02 12:02:11,697 WARNING [train.py:1204] (2/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_416-257730-0_sp1.1 from training. Duration: 0.5909375 2023-10-02 12:02:12,998 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1113-7369-0_sp1.1 from training. Duration: 0.77275 2023-10-02 12:02:14,327 WARNING [train.py:1204] (2/4) Exclude cut with ID _40402_210_18219_1_1533522324115_2953150_242-76943-0_sp1.1 from training. Duration: 0.8545625 2023-10-02 12:02:19,092 WARNING [train.py:1204] (2/4) Exclude cut with ID _44718_210_5816_1_1534050051869_6469030_190-49648-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 12:02:20,512 WARNING [train.py:1204] (2/4) Exclude cut with ID _47251_210_18628_1_1533814063128_4137250_214-88076-0 from training. Duration: 0.88 2023-10-02 12:02:20,539 WARNING [train.py:1204] (2/4) Exclude cut with ID _5585_210_2667_1_1527418813324_8007340_89-332072-0_sp1.1 from training. Duration: 0.5818125 2023-10-02 12:02:21,891 WARNING [train.py:1204] (2/4) Exclude cut with ID _41817_210_18835_1_1533434477160_7423049_555-293448-0_sp1.1 from training. Duration: 0.72725 2023-10-02 12:02:24,652 WARNING [train.py:1204] (2/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_286-331314-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 12:02:24,704 WARNING [train.py:1204] (2/4) Exclude cut with ID _13921_210_9994_1_1530838702800_3003429_46-149444-0 from training. Duration: 0.94 2023-10-02 12:02:25,918 INFO [train.py:1046] (2/4) Epoch 25, batch 3400, loss[loss=0.1784, simple_loss=0.2658, pruned_loss=0.04549, over 24291.00 frames. ], tot_loss[loss=0.1712, simple_loss=0.2487, pruned_loss=0.04684, over 4740613.11 frames. ], batch size: 74, lr: 4.09e-03, grad_scale: 8.0 2023-10-02 12:02:26,002 WARNING [train.py:1204] (2/4) Exclude cut with ID _44778_210_2054_1_1533798127203_7443090_768-43595-0_sp1.1 from training. Duration: 0.936375 2023-10-02 12:02:26,018 WARNING [train.py:1204] (2/4) Exclude cut with ID _35355_210_16040_1_1533212988977_4016229_74-190076-0 from training. Duration: 0.8 2023-10-02 12:02:27,439 WARNING [train.py:1204] (2/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_1007-316380-0_sp1.1 from training. Duration: 0.963625 2023-10-02 12:02:27,618 INFO [scaling.py:1118] (2/4) WithLoss: name=encoder.encoders.1.encoder.layers.0.self_attn_weights, loss-sum=0.000e+00 2023-10-02 12:02:29,401 WARNING [train.py:1204] (2/4) Exclude cut with ID _23557_210_8750_1_1531890106370_6846255_150-283437-0_sp1.1 from training. Duration: 0.963625 2023-10-02 12:02:30,220 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.2.encoder.layers.0.self_attn1.whiten, num_groups=1, num_channels=384, metric=19.70 vs. limit=22.5 2023-10-02 12:02:30,789 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_3474_210_2028_1_1526188513_4825246_145-309131-0_sp1.1 from training. Duration: 0.67275 2023-10-02 12:02:30,880 WARNING [train.py:1204] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_391-22148-0_sp1.1 from training. Duration: 0.7 2023-10-02 12:02:32,162 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_238-256668-0 from training. Duration: 0.69 2023-10-02 12:02:35,112 INFO [scaling.py:1118] (2/4) WithLoss: name=encoder.encoders.1.encoder.layers.0.self_attn_weights, loss-sum=0.000e+00 2023-10-02 12:02:36,220 WARNING [train.py:1204] (2/4) Exclude cut with ID _8394_210_6702_1_1528590504316_3540189_372-198878-0 from training. Duration: 0.6 2023-10-02 12:02:36,238 WARNING [train.py:1204] (2/4) Exclude cut with ID ID0174W0006-766659 from training. Duration: 0.953 2023-10-02 12:02:36,259 WARNING [train.py:1204] (2/4) Exclude cut with ID _6274_210_2084_1_1527937583837_7494020_668-23019-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 12:02:40,998 WARNING [train.py:1204] (2/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_322-74328-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 12:02:41,005 WARNING [train.py:1204] (2/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_872-264525-0_sp1.1 from training. Duration: 0.5909375 2023-10-02 12:02:41,054 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_53378_210_17282_1_1534221071_5769509_641-286201-0_sp1.1 from training. Duration: 0.97275 2023-10-02 12:02:42,407 WARNING [train.py:1204] (2/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_199-295611-0_sp1.1 from training. Duration: 0.5 2023-10-02 12:02:47,302 WARNING [train.py:1204] (2/4) Exclude cut with ID _5250_210_5219_1_1527249561806_8692110_1270-317971-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 12:02:48,687 WARNING [train.py:1204] (2/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_784-213099-0 from training. Duration: 0.93 2023-10-02 12:02:52,995 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_115-233929-0_sp0.9 from training. Duration: 0.8 2023-10-02 12:02:55,643 WARNING [train.py:1204] (2/4) Exclude cut with ID _54871_210_22356_1_1534665798253_3856359_256-223809-0_sp1.1 from training. Duration: 0.97275 2023-10-02 12:02:55,700 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_20120_210_6753_1_1531643035_5482859_285-87781-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 12:02:57,009 WARNING [train.py:1204] (2/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_619-344499-0_sp1.1 from training. Duration: 0.57275 2023-10-02 12:03:04,264 WARNING [train.py:1204] (2/4) Exclude cut with ID _60229_210_27767_1_1534838406158_3655350_264-279573-0_sp0.9 from training. Duration: 0.92225 2023-10-02 12:03:08,892 WARNING [train.py:1204] (2/4) Exclude cut with ID _7719_210_3525_1_1528456153163_4296960_333-110399-0 from training. Duration: 0.96 2023-10-02 12:03:12,843 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_50423_210_13270_1_1534128598_5021734_759-90833-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 12:03:12,904 WARNING [train.py:1204] (2/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_151-254798-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 12:03:12,947 WARNING [train.py:1204] (2/4) Exclude cut with ID _3465_210_3525_1_1526175667060_3574049_450-203637-0 from training. Duration: 0.92 2023-10-02 12:03:14,247 WARNING [train.py:1204] (2/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_673-36456-0_sp1.1 from training. Duration: 0.936375 2023-10-02 12:03:15,533 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.553e+02 1.863e+02 2.090e+02 2.369e+02 3.462e+02, threshold=4.180e+02, percent-clipped=0.0 2023-10-02 12:03:15,609 WARNING [train.py:1204] (2/4) Exclude cut with ID _36538_210_9994_1_1533204020363_3795620_131-8685-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 12:03:17,473 WARNING [train.py:1204] (2/4) Exclude cut with ID _12556_210_8341_1_1530081024100_7327690_713-55626-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 12:03:17,513 WARNING [train.py:1204] (2/4) Exclude cut with ID _2322_210_2667_1_1525604548971_3656299_440-80494-0_sp0.9 from training. Duration: 0.97775 2023-10-02 12:03:17,763 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=872806.6666666666, ans=0.1 2023-10-02 12:03:20,729 WARNING [train.py:1204] (2/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_210-325081-0_sp1.1 from training. Duration: 0.97275 2023-10-02 12:03:23,553 WARNING [train.py:1204] (2/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_375-244976-0_sp0.9 from training. Duration: 0.8333125 2023-10-02 12:03:23,559 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_15517_210_6753_1_1530952293_1664795_119-31001-0_sp1.1 from training. Duration: 0.8545625 2023-10-02 12:03:26,674 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.balancer1.prob, batch_count=872873.3333333334, ans=0.125 2023-10-02 12:03:27,834 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_41452_210_7291_1_1533437687_3941281_319-42326-0_sp1.1 from training. Duration: 0.92725 2023-10-02 12:03:29,258 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_372-173012-0 from training. Duration: 0.95 2023-10-02 12:03:33,965 WARNING [train.py:1204] (2/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_41-31157-0_sp1.1 from training. Duration: 0.5181875 2023-10-02 12:03:38,543 WARNING [train.py:1204] (2/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_91-212486-0 from training. Duration: 0.68 2023-10-02 12:03:39,773 INFO [train.py:1046] (2/4) Epoch 25, batch 3450, loss[loss=0.1728, simple_loss=0.2457, pruned_loss=0.04993, over 23709.00 frames. ], tot_loss[loss=0.1702, simple_loss=0.2474, pruned_loss=0.04649, over 4744170.84 frames. ], batch size: 149, lr: 4.09e-03, grad_scale: 8.0 2023-10-02 12:03:42,565 WARNING [train.py:1204] (2/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_1267-272693-0 from training. Duration: 0.73 2023-10-02 12:03:43,937 WARNING [train.py:1204] (2/4) Exclude cut with ID _13938_210_9988_1_1530518718978_3693960_152-67046-0_sp1.1 from training. Duration: 0.936375 2023-10-02 12:03:45,308 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_4528_210_3273_1_1527076654_3276777_342-168068-0_sp1.1 from training. Duration: 0.7909375 2023-10-02 12:03:45,316 WARNING [train.py:1204] (2/4) Exclude cut with ID _7606_210_4915_1_1528894253277_5080690_127-177761-0 from training. Duration: 0.64 2023-10-02 12:03:45,385 WARNING [train.py:1204] (2/4) Exclude cut with ID _45329_210_7005_1_1534157951211_6774339_335-137972-0_sp1.1 from training. Duration: 0.92725 2023-10-02 12:03:48,642 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.balancer1.prob, batch_count=872940.0, ans=0.125 2023-10-02 12:03:50,297 WARNING [train.py:1204] (2/4) Exclude cut with ID _5166_210_2084_1_1527073197160_4087993_331-117829-0_sp1.1 from training. Duration: 0.563625 2023-10-02 12:03:55,734 WARNING [train.py:1204] (2/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_125-125818-0_sp1.1 from training. Duration: 0.87275 2023-10-02 12:03:55,785 WARNING [train.py:1204] (2/4) Exclude cut with ID _6789_210_1534_1_1527854471671_2870519_48-294897-0_sp1.1 from training. Duration: 0.963625 2023-10-02 12:03:57,145 WARNING [train.py:1204] (2/4) Exclude cut with ID _6694_210_4599_1_1527852637490_3965529_116-109398-0_sp1.1 from training. Duration: 0.663625 2023-10-02 12:03:57,155 WARNING [train.py:1204] (2/4) Exclude cut with ID _52892_210_6721_1_1534378941259_4124129_222-172685-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 12:04:00,007 WARNING [train.py:1204] (2/4) Exclude cut with ID _50655_210_18628_1_1534229942193_3772154_320-132275-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 12:04:04,806 WARNING [train.py:1204] (2/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_76-311963-0 from training. Duration: 0.7 2023-10-02 12:04:09,074 WARNING [train.py:1204] (2/4) Exclude cut with ID _6274_210_2084_1_1527937583837_7494020_32-205288-0 from training. Duration: 0.78 2023-10-02 12:04:09,101 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_86-16881-0_sp0.9 from training. Duration: 0.6444375 2023-10-02 12:04:09,890 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.feed_forward1.out_proj.dropout_p, batch_count=873073.3333333334, ans=0.1 2023-10-02 12:04:11,018 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_271-297505-0_sp1.1 from training. Duration: 0.9 2023-10-02 12:04:11,117 WARNING [train.py:1204] (2/4) Exclude cut with ID _57572_210_5517_1_1534921219624_3809484_187-295293-0_sp1.1 from training. Duration: 0.963625 2023-10-02 12:04:16,522 WARNING [train.py:1204] (2/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_563-329943-0 from training. Duration: 0.99 2023-10-02 12:04:18,397 WARNING [train.py:1204] (2/4) Exclude cut with ID _7160_210_1876_1_1527940923231_3703799_302-150728-0_sp0.9 from training. Duration: 0.8666875 2023-10-02 12:04:21,229 WARNING [train.py:1204] (2/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_926-7526-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 12:04:21,263 WARNING [train.py:1204] (2/4) Exclude cut with ID _62574_210_2655_1_1535180407058_3933249_467-22348-0_sp1.1 from training. Duration: 0.936375 2023-10-02 12:04:23,958 WARNING [train.py:1204] (2/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_280-161633-0_sp0.9 from training. Duration: 0.811125 2023-10-02 12:04:25,366 WARNING [train.py:1204] (2/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_385-193052-0_sp1.1 from training. Duration: 0.7818125 2023-10-02 12:04:26,742 WARNING [train.py:1204] (2/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_444-28041-0 from training. Duration: 0.85 2023-10-02 12:04:26,753 WARNING [train.py:1204] (2/4) Exclude cut with ID _66070_210_7640_1_1535441393096_7805310_341-249237-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 12:04:28,128 WARNING [train.py:1204] (2/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_425-269826-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 12:04:29,710 WARNING [train.py:1204] (2/4) Exclude cut with ID _53950_210_5816_1_1534417264919_3862951_124-185597-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 12:04:31,800 WARNING [train.py:1204] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_380-204756-0 from training. Duration: 0.86 2023-10-02 12:04:34,557 WARNING [train.py:1204] (2/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_282-257791-0_sp0.9 from training. Duration: 0.9555625 2023-10-02 12:04:40,474 WARNING [train.py:1204] (2/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_372-131163-0_sp1.1 from training. Duration: 0.8545625 2023-10-02 12:04:40,567 WARNING [train.py:1204] (2/4) Exclude cut with ID _28317_210_12742_1_1532480302381_8510325_447-233513-0_sp1.1 from training. Duration: 0.963625 2023-10-02 12:04:43,239 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_54-118333-0_sp1.1 from training. Duration: 0.97275 2023-10-02 12:04:47,795 WARNING [train.py:1204] (2/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_388-230239-0_sp1.1 from training. Duration: 0.963625 2023-10-02 12:04:49,626 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_40047_210_2290_1_1533290519_2374311_141-54639-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 12:04:49,688 WARNING [train.py:1204] (2/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_91-267696-0_sp0.9 from training. Duration: 0.9666875 2023-10-02 12:04:50,989 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_14056_210_4163_1_1530604483_7572827_532-137847-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 12:04:53,916 INFO [train.py:1046] (2/4) Epoch 25, batch 3500, loss[loss=0.1451, simple_loss=0.2255, pruned_loss=0.03238, over 21445.00 frames. ], tot_loss[loss=0.1687, simple_loss=0.2456, pruned_loss=0.0459, over 4731196.95 frames. ], batch size: 47, lr: 4.09e-03, grad_scale: 8.0 2023-10-02 12:04:53,974 WARNING [train.py:1204] (2/4) Exclude cut with ID _8064_210_5399_1_1528372299291_4282973_155-283231-0_sp1.1 from training. Duration: 0.97275 2023-10-02 12:04:57,294 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_2245_210_2715_1_1525511548_3540254_310-52483-0_sp1.1 from training. Duration: 0.8 2023-10-02 12:04:58,573 WARNING [train.py:1204] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_185-174805-0 from training. Duration: 0.68 2023-10-02 12:04:58,787 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.feed_forward1.out_proj.dropout_p, batch_count=873273.3333333334, ans=0.1 2023-10-02 12:05:01,278 WARNING [train.py:1204] (2/4) Exclude cut with ID _15012_210_1968_1_1530856820429_5095350_662-210469-0_sp1.1 from training. Duration: 0.5181875 2023-10-02 12:05:04,044 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_426-313557-0_sp0.9 from training. Duration: 0.588875 2023-10-02 12:05:07,111 WARNING [train.py:1204] (2/4) Exclude cut with ID _42326_210_7770_1_1533814282531_3707269_407-204343-0_sp1.1 from training. Duration: 0.97275 2023-10-02 12:05:07,119 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_258-310570-0 from training. Duration: 0.82 2023-10-02 12:05:11,226 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_4737_210_2040_1_1526998689_3293008_378-178650-0_sp1.1 from training. Duration: 0.863625 2023-10-02 12:05:12,690 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_47896_210_20508_1_1534492518_5958868_432-327451-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 12:05:14,615 WARNING [train.py:1204] (2/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_188-114464-0_sp1.1 from training. Duration: 0.7454375 2023-10-02 12:05:14,622 WARNING [train.py:1204] (2/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_798-106179-0_sp1.1 from training. Duration: 0.7545625 2023-10-02 12:05:14,660 WARNING [train.py:1204] (2/4) Exclude cut with ID _14368_210_4220_1_1530532714870_3595529_143-117421-0_sp1.1 from training. Duration: 0.52725 2023-10-02 12:05:14,700 WARNING [train.py:1204] (2/4) Exclude cut with ID _67499_210_17282_1_1535509805220_3692430_361-78786-0_sp1.1 from training. Duration: 0.963625 2023-10-02 12:05:14,732 WARNING [train.py:1204] (2/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_712-342563-0_sp1.1 from training. Duration: 0.8818125 2023-10-02 12:05:16,099 WARNING [train.py:1204] (2/4) Exclude cut with ID _9235_210_5219_1_1528891154253_8046120_387-159008-0 from training. Duration: 0.86 2023-10-02 12:05:19,307 WARNING [train.py:1204] (2/4) Exclude cut with ID _33640_210_2655_1_1533117605479_4855469_959-18820-0_sp1.1 from training. Duration: 0.963625 2023-10-02 12:05:19,335 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_12868_210_6753_1_1530702479_3610564_133-564-0_sp0.9 from training. Duration: 0.87775 2023-10-02 12:05:21,302 WARNING [train.py:1204] (2/4) Exclude cut with ID _23514_210_1389_1_1531832139785_3031439_12-29287-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 12:05:24,182 WARNING [train.py:1204] (2/4) Exclude cut with ID _23514_210_1389_1_1531832139785_3031439_489-185052-0_sp1.1 from training. Duration: 0.963625 2023-10-02 12:05:25,612 WARNING [train.py:1204] (2/4) Exclude cut with ID _5444_210_2151_1_1527418808464_5312210_634-194174-0 from training. Duration: 0.97 2023-10-02 12:05:25,636 WARNING [train.py:1204] (2/4) Exclude cut with ID _6275_210_2084_1_1528110062213_7669049_91-26919-0_sp1.1 from training. Duration: 0.8818125 2023-10-02 12:05:30,009 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_37351_210_7925_1_1533012931_3812113_516-82303-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 12:05:30,110 WARNING [train.py:1204] (2/4) Exclude cut with ID _41314_210_12651_1_1533459476543_3766460_80-74184-0_sp0.9 from training. Duration: 0.92225 2023-10-02 12:05:31,508 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_9460_210_4211_1_1528954587_10049667_495-202963-0_sp1.1 from training. Duration: 0.963625 2023-10-02 12:05:34,062 WARNING [train.py:1204] (2/4) Exclude cut with ID _14632_210_9988_1_1530691279392_3923200_332-105904-0_sp1.1 from training. Duration: 0.8181875 2023-10-02 12:05:34,085 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_40764_210_1925_1_1533882359_3752345_493-167886-0_sp1.1 from training. Duration: 0.92725 2023-10-02 12:05:36,808 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_196-289637-0 from training. Duration: 0.75 2023-10-02 12:05:36,884 WARNING [train.py:1204] (2/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_26-175553-0 from training. Duration: 0.61 2023-10-02 12:05:38,208 WARNING [train.py:1204] (2/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_890-5117-0 from training. Duration: 0.78 2023-10-02 12:05:38,259 WARNING [train.py:1204] (2/4) Exclude cut with ID _41314_210_12651_1_1533459476543_3766460_206-306860-0_sp1.1 from training. Duration: 0.92725 2023-10-02 12:05:38,513 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.nonlin_attention.balancer.prob, batch_count=873473.3333333334, ans=0.125 2023-10-02 12:05:39,752 WARNING [train.py:1204] (2/4) Exclude cut with ID _25882_210_3036_1_1532775665815_6479510_570-79824-0_sp1.1 from training. Duration: 0.963625 2023-10-02 12:05:39,821 WARNING [train.py:1204] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_731-2428-0_sp1.1 from training. Duration: 0.7545625 2023-10-02 12:05:41,147 WARNING [train.py:1204] (2/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_138-154812-0_sp0.9 from training. Duration: 0.8666875 2023-10-02 12:05:42,635 WARNING [train.py:1204] (2/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_178-39722-0_sp0.9 from training. Duration: 0.5444375 2023-10-02 12:05:43,823 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.471e+02 1.790e+02 1.964e+02 2.126e+02 3.238e+02, threshold=3.929e+02, percent-clipped=0.0 2023-10-02 12:05:43,934 WARNING [train.py:1204] (2/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_1285-256622-0_sp0.9 from training. Duration: 0.9333125 2023-10-02 12:05:50,525 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_41428_210_2161_1_1533774184_3992532_65-271323-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 12:05:51,907 WARNING [train.py:1204] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_308-161067-0 from training. Duration: 0.66 2023-10-02 12:05:51,912 WARNING [train.py:1204] (2/4) Exclude cut with ID _3067_210_2667_1_1526212974640_4503168_459-173867-0 from training. Duration: 0.88 2023-10-02 12:05:51,913 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_2221_210_1988_1_1525597051_4935541_415-296882-0_sp1.1 from training. Duration: 0.7 2023-10-02 12:05:53,336 WARNING [train.py:1204] (2/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_354-192266-0_sp1.1 from training. Duration: 0.936375 2023-10-02 12:05:53,408 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_258-310570-0_sp0.9 from training. Duration: 0.911125 2023-10-02 12:05:54,345 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.0.layers.0.conv_module2.whiten, num_groups=1, num_channels=192, metric=8.44 vs. limit=15.0 2023-10-02 12:05:54,826 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_548-141039-0_sp1.1 from training. Duration: 0.963625 2023-10-02 12:05:58,132 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_15101_210_7927_1_1530786415_5033274_320-327527-0 from training. Duration: 0.96 2023-10-02 12:05:58,192 WARNING [train.py:1204] (2/4) Exclude cut with ID _2895_210_2477_1_1525956785794_4063750_289-11475-0_sp0.9 from training. Duration: 0.911125 2023-10-02 12:05:59,571 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7543_210_6753_1_1528543318_4552_1-85072-0_sp1.1 from training. Duration: 0.7545625 2023-10-02 12:06:00,913 WARNING [train.py:1204] (2/4) Exclude cut with ID _64639_210_5816_1_1535367619437_3528000_461-329156-0 from training. Duration: 0.96 2023-10-02 12:06:02,308 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_211-339622-0 from training. Duration: 0.45 2023-10-02 12:06:03,726 WARNING [train.py:1204] (2/4) Exclude cut with ID _20455_210_5219_1_1531565996775_8098100_402-112739-0_sp1.1 from training. Duration: 0.963625 2023-10-02 12:06:05,180 WARNING [train.py:1204] (2/4) Exclude cut with ID _53489_210_6228_1_1534298357169_3835970_256-115565-0_sp1.1 from training. Duration: 0.936375 2023-10-02 12:06:06,402 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_26326_210_7925_1_1533002493_3592406_360-305260-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 12:06:06,425 WARNING [train.py:1204] (2/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_18-204383-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 12:06:07,623 INFO [train.py:1046] (2/4) Epoch 25, batch 3550, loss[loss=0.1689, simple_loss=0.2353, pruned_loss=0.05123, over 23441.00 frames. ], tot_loss[loss=0.1679, simple_loss=0.2448, pruned_loss=0.0455, over 4733620.85 frames. ], batch size: 285, lr: 4.09e-03, grad_scale: 8.0 2023-10-02 12:06:10,304 WARNING [train.py:1204] (2/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_306-232480-0_sp1.1 from training. Duration: 0.87275 2023-10-02 12:06:14,037 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.conv_module1.balancer1.max_abs, batch_count=873606.6666666666, ans=10.0 2023-10-02 12:06:17,980 WARNING [train.py:1204] (2/4) Exclude cut with ID _41161_210_20034_1_1533431249657_3555314_116-299186-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 12:06:19,840 WARNING [train.py:1204] (2/4) Exclude cut with ID _13913_210_9994_1_1530579615187_3383897_473-277682-0_sp1.1 from training. Duration: 0.3545625 2023-10-02 12:06:23,842 WARNING [train.py:1204] (2/4) Exclude cut with ID _58895_210_8750_1_1535184081320_3649089_49-225702-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 12:06:23,912 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_3080_210_2071_1_1526086102_4365166_312-8726-0_sp1.1 from training. Duration: 0.82725 2023-10-02 12:06:24,022 WARNING [train.py:1204] (2/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_489-190175-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 12:06:25,292 WARNING [train.py:1204] (2/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_345-95717-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 12:06:25,304 WARNING [train.py:1204] (2/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_85-138257-0_sp1.1 from training. Duration: 0.6454375 2023-10-02 12:06:27,431 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7046_210_6753_1_1527895337_6920917_783-278109-0_sp1.1 from training. Duration: 0.7 2023-10-02 12:06:28,754 WARNING [train.py:1204] (2/4) Exclude cut with ID _3465_210_3525_1_1526175667060_3574049_450-203637-0_sp1.1 from training. Duration: 0.836375 2023-10-02 12:06:28,798 WARNING [train.py:1204] (2/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_416-132893-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 12:06:28,815 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_2655_210_2087_1_1525688959_3897400_437-337690-0_sp0.9 from training. Duration: 0.7 2023-10-02 12:06:30,224 WARNING [train.py:1204] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_700-183549-0_sp1.1 from training. Duration: 0.6181875 2023-10-02 12:06:34,397 WARNING [train.py:1204] (2/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_191-59784-0_sp1.1 from training. Duration: 0.8 2023-10-02 12:06:34,417 WARNING [train.py:1204] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_218-295097-0_sp1.1 from training. Duration: 0.7 2023-10-02 12:06:37,129 WARNING [train.py:1204] (2/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_310-109461-0_sp1.1 from training. Duration: 0.72725 2023-10-02 12:06:37,133 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_33348_210_5632_1_1533086912_3324030_131-321149-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 12:06:37,184 WARNING [train.py:1204] (2/4) Exclude cut with ID _64135_210_2253_1_1535677267876_7608972_133-188266-0_sp1.1 from training. Duration: 0.736375 2023-10-02 12:06:37,208 WARNING [train.py:1204] (2/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_505-205270-0 from training. Duration: 0.38 2023-10-02 12:06:37,226 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_67223_210_4238_1_1535612120_2537431_102-88248-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 12:06:37,448 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.nonlin_attention.balancer.prob, batch_count=873740.0, ans=0.125 2023-10-02 12:06:39,856 WARNING [train.py:1204] (2/4) Exclude cut with ID _54871_210_22356_1_1534665798253_3856359_60-235433-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 12:06:41,343 WARNING [train.py:1204] (2/4) Exclude cut with ID _5189_210_4557_1_1527248319428_3769229_199-53526-0_sp1.1 from training. Duration: 0.3818125 2023-10-02 12:06:44,887 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.conv_skip_rate, batch_count=873740.0, ans=0.0 2023-10-02 12:06:46,068 WARNING [train.py:1204] (2/4) Exclude cut with ID _45329_210_7005_1_1534157951211_6774339_439-90979-0_sp1.1 from training. Duration: 0.963625 2023-10-02 12:06:48,047 WARNING [train.py:1204] (2/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_1162-132880-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 12:06:49,402 WARNING [train.py:1204] (2/4) Exclude cut with ID _56423_210_5854_1_1534896061049_3165969_220-22317-0_sp1.1 from training. Duration: 0.963625 2023-10-02 12:06:50,763 WARNING [train.py:1204] (2/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_909-64151-0 from training. Duration: 0.94 2023-10-02 12:06:52,606 WARNING [train.py:1204] (2/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_465-156500-0_sp0.9 from training. Duration: 0.8 2023-10-02 12:06:54,014 WARNING [train.py:1204] (2/4) Exclude cut with ID _44304_210_15374_1_1533639352765_4613129_302-22084-0 from training. Duration: 0.86 2023-10-02 12:06:54,073 WARNING [train.py:1204] (2/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_663-166973-0_sp1.1 from training. Duration: 0.72725 2023-10-02 12:06:55,967 WARNING [train.py:1204] (2/4) Exclude cut with ID _45329_210_7005_1_1534157951211_6774339_503-287883-0_sp0.9 from training. Duration: 0.97775 2023-10-02 12:06:55,980 WARNING [train.py:1204] (2/4) Exclude cut with ID _60057_210_10640_1_1534903065793_3333150_362-230377-0_sp0.9 from training. Duration: 0.9666875 2023-10-02 12:06:58,700 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_4183_210_2087_1_1526729100_1869838_143-280962-0 from training. Duration: 0.56 2023-10-02 12:07:00,068 WARNING [train.py:1204] (2/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_281-1313-0_sp1.1 from training. Duration: 0.936375 2023-10-02 12:07:00,614 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.4.encoder.layers.2.nonlin_attention.whiten1, num_groups=1, num_channels=288, metric=4.07 vs. limit=10.0 2023-10-02 12:07:04,359 WARNING [train.py:1204] (2/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_64-215118-0_sp1.1 from training. Duration: 0.936375 2023-10-02 12:07:05,681 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_2822_210_2715_1_1526112586_1999587_165-36656-0 from training. Duration: 0.89 2023-10-02 12:07:07,002 WARNING [train.py:1204] (2/4) Exclude cut with ID _22961_210_8254_1_1531976366672_4278180_402-208830-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 12:07:11,247 WARNING [train.py:1204] (2/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_98-193494-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 12:07:11,340 WARNING [train.py:1204] (2/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_383-285196-0 from training. Duration: 0.97 2023-10-02 12:07:12,021 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.3.encoder.layers.2.nonlin_attention.whiten2, num_groups=1, num_channels=512, metric=5.53 vs. limit=15.0 2023-10-02 12:07:18,697 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6767_210_3273_1_1527855783_1718932_142-127132-0 from training. Duration: 0.95 2023-10-02 12:07:18,726 WARNING [train.py:1204] (2/4) Exclude cut with ID _39867_210_9826_1_1533553222030_9344259_1018-339635-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 12:07:18,789 WARNING [train.py:1204] (2/4) Exclude cut with ID _14603_210_4220_1_1530615688441_3097319_274-274798-0_sp1.1 from training. Duration: 0.8909375 2023-10-02 12:07:21,697 INFO [train.py:1046] (2/4) Epoch 25, batch 3600, loss[loss=0.1542, simple_loss=0.2279, pruned_loss=0.04023, over 24460.00 frames. ], tot_loss[loss=0.1679, simple_loss=0.245, pruned_loss=0.04539, over 4733822.09 frames. ], batch size: 58, lr: 4.09e-03, grad_scale: 16.0 2023-10-02 12:07:21,760 WARNING [train.py:1204] (2/4) Exclude cut with ID _2334_210_2667_1_1525608238880_4705129_376-263393-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 12:07:21,808 WARNING [train.py:1204] (2/4) Exclude cut with ID _29303_210_2575_1_1532392237303_7410649_363-344675-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 12:07:23,228 WARNING [train.py:1204] (2/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_58-68956-0_sp1.1 from training. Duration: 0.7545625 2023-10-02 12:07:28,318 WARNING [train.py:1204] (2/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_709-311571-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 12:07:31,002 WARNING [train.py:1204] (2/4) Exclude cut with ID _46067_210_4220_1_1534125571066_3307450_136-257407-0_sp1.1 from training. Duration: 0.963625 2023-10-02 12:07:31,074 WARNING [train.py:1204] (2/4) Exclude cut with ID _6493_210_1747_1_1528459210424_4012470_324-323854-0_sp1.1 from training. Duration: 0.836375 2023-10-02 12:07:31,126 WARNING [train.py:1204] (2/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_234-260682-0_sp1.1 from training. Duration: 0.763625 2023-10-02 12:07:32,536 WARNING [train.py:1204] (2/4) Exclude cut with ID _36634_210_1794_1_1533296352450_1399884_55-119451-0_sp1.1 from training. Duration: 0.963625 2023-10-02 12:07:32,540 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_40047_210_2290_1_1533290519_2374311_195-165693-0 from training. Duration: 0.89 2023-10-02 12:07:36,595 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_5522_210_3273_1_1527249606_3909096_366-172508-0_sp0.9 from training. Duration: 0.7333125 2023-10-02 12:07:36,681 WARNING [train.py:1204] (2/4) Exclude cut with ID _21878_210_3525_1_1531738772002_7264330_187-10504-0_sp1.1 from training. Duration: 0.963625 2023-10-02 12:07:39,421 WARNING [train.py:1204] (2/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_282-257791-0_sp1.1 from training. Duration: 0.7818125 2023-10-02 12:07:42,123 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_43831_210_2158_1_1533725502_5872791_467-288776-0_sp1.1 from training. Duration: 0.92725 2023-10-02 12:07:42,438 INFO [scaling.py:1118] (2/4) WithLoss: name=encoder.encoders.5.encoder.layers.0.self_attn_weights, loss-sum=0.000e+00 2023-10-02 12:07:43,428 WARNING [train.py:1204] (2/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_138-154812-0_sp1.1 from training. Duration: 0.7090625 2023-10-02 12:07:43,470 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_1154-185930-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 12:07:43,500 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_2655_210_2087_1_1525688959_3897400_52-279899-0 from training. Duration: 0.69 2023-10-02 12:07:43,545 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_141-89113-0_sp1.1 from training. Duration: 0.7818125 2023-10-02 12:07:45,022 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder_embed.conv.8.prob, batch_count=874006.6666666666, ans=0.125 2023-10-02 12:07:47,422 WARNING [train.py:1204] (2/4) Exclude cut with ID _55134_210_21982_1_1534590028377_3882170_297-47310-0_sp1.1 from training. Duration: 0.963625 2023-10-02 12:07:48,728 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_486-151131-0_sp1.1 from training. Duration: 0.77275 2023-10-02 12:07:50,118 WARNING [train.py:1204] (2/4) Exclude cut with ID _44778_210_2054_1_1533798127203_7443090_21-232980-0_sp1.1 from training. Duration: 0.936375 2023-10-02 12:07:50,365 INFO [scaling.py:1118] (2/4) WithLoss: name=encoder.encoders.2.encoder.layers.1.self_attn_weights, loss-sum=0.000e+00 2023-10-02 12:07:51,756 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.conv_module2.balancer2.prob, batch_count=874073.3333333334, ans=0.125 2023-10-02 12:07:52,809 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6165_210_3528_1_1527590982_4379841_338-106380-0_sp1.1 from training. Duration: 0.92725 2023-10-02 12:07:52,873 WARNING [train.py:1204] (2/4) Exclude cut with ID _55162_210_17071_1_1534408248583_3753480_355-257218-0_sp1.1 from training. Duration: 0.97275 2023-10-02 12:07:54,710 WARNING [train.py:1204] (2/4) Exclude cut with ID _2895_210_2477_1_1525956785794_4063750_289-11475-0 from training. Duration: 0.82 2023-10-02 12:07:55,016 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.bypass.skip_rate, batch_count=874073.3333333334, ans=0.09899494936611666 2023-10-02 12:08:02,177 WARNING [train.py:1204] (2/4) Exclude cut with ID _16756_210_4915_1_1531047803763_7104259_839-183431-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 12:08:03,527 WARNING [train.py:1204] (2/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_427-208901-0_sp0.9 from training. Duration: 0.7555625 2023-10-02 12:08:03,571 WARNING [train.py:1204] (2/4) Exclude cut with ID _36512_210_6987_1_1533033143998_3126069_181-306500-0 from training. Duration: 0.95 2023-10-02 12:08:07,826 WARNING [train.py:1204] (2/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_28-70987-0_sp1.1 from training. Duration: 0.7909375 2023-10-02 12:08:08,144 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.conv_skip_rate, batch_count=874140.0, ans=0.0 2023-10-02 12:08:11,621 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.508e+02 1.785e+02 1.913e+02 2.150e+02 3.207e+02, threshold=3.826e+02, percent-clipped=0.0 2023-10-02 12:08:11,756 WARNING [train.py:1204] (2/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_590-58996-0_sp1.1 from training. Duration: 0.936375 2023-10-02 12:08:13,230 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_48730_210_5399_1_1533885886_2212769_108-47561-0_sp1.1 from training. Duration: 0.936375 2023-10-02 12:08:13,440 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.conv_module2.balancer2.min_positive, batch_count=874140.0, ans=0.05 2023-10-02 12:08:20,926 WARNING [train.py:1204] (2/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_401-158239-0_sp0.9 from training. Duration: 0.788875 2023-10-02 12:08:20,949 WARNING [train.py:1204] (2/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_278-154492-0_sp1.1 from training. Duration: 0.6909375 2023-10-02 12:08:20,955 WARNING [train.py:1204] (2/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_137-116442-0 from training. Duration: 0.78 2023-10-02 12:08:22,589 WARNING [train.py:1204] (2/4) Exclude cut with ID _3541_210_2477_1_1526475408709_3263973_189-317158-0 from training. Duration: 0.9 2023-10-02 12:08:23,944 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_47896_210_20508_1_1534492518_5958868_558-314572-0 from training. Duration: 0.97 2023-10-02 12:08:27,137 WARNING [train.py:1204] (2/4) Exclude cut with ID _66070_210_7640_1_1535441393096_7805310_290-40284-0_sp1.1 from training. Duration: 0.97275 2023-10-02 12:08:27,172 WARNING [train.py:1204] (2/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_110-224688-0_sp1.1 from training. Duration: 0.8818125 2023-10-02 12:08:27,267 WARNING [train.py:1204] (2/4) Exclude cut with ID _7719_210_3525_1_1528456153163_4296960_88-250259-0 from training. Duration: 0.84 2023-10-02 12:08:29,149 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8453_210_4211_1_1528502940_8562730_1157-114102-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 12:08:29,188 WARNING [train.py:1204] (2/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_317-175062-0_sp1.1 from training. Duration: 0.7454375 2023-10-02 12:08:29,194 WARNING [train.py:1204] (2/4) Exclude cut with ID _29857_210_20034_1_1532395886753_3798160_254-347254-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 12:08:30,506 WARNING [train.py:1204] (2/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_28-70987-0 from training. Duration: 0.87 2023-10-02 12:08:30,563 WARNING [train.py:1204] (2/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_321-335475-0 from training. Duration: 0.45 2023-10-02 12:08:34,507 WARNING [train.py:1204] (2/4) Exclude cut with ID _40901_210_5665_1_1533363789958_3856130_356-107330-0_sp1.1 from training. Duration: 0.936375 2023-10-02 12:08:35,745 INFO [train.py:1046] (2/4) Epoch 25, batch 3650, loss[loss=0.166, simple_loss=0.2467, pruned_loss=0.04265, over 23320.00 frames. ], tot_loss[loss=0.1688, simple_loss=0.2459, pruned_loss=0.04586, over 4734376.96 frames. ], batch size: 119, lr: 4.08e-03, grad_scale: 16.0 2023-10-02 12:08:35,814 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_3780_210_2087_1_1526467389_2145572_176-107140-0 from training. Duration: 0.69 2023-10-02 12:08:37,403 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.ff3_skip_rate, batch_count=874273.3333333334, ans=0.0 2023-10-02 12:08:39,854 WARNING [train.py:1204] (2/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_80-135200-0 from training. Duration: 0.79 2023-10-02 12:08:42,504 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_33348_210_5632_1_1533086912_3324030_85-28583-0_sp0.9 from training. Duration: 0.911125 2023-10-02 12:08:45,367 WARNING [train.py:1204] (2/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_29-293896-0 from training. Duration: 0.61 2023-10-02 12:08:46,790 WARNING [train.py:1204] (2/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_194-252800-0 from training. Duration: 0.97 2023-10-02 12:08:51,841 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.5.encoder.layers.1.feed_forward1.out_whiten, num_groups=1, num_channels=256, metric=9.14 vs. limit=15.0 2023-10-02 12:08:52,722 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_360-52446-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 12:08:52,724 WARNING [train.py:1204] (2/4) Exclude cut with ID _9389_210_1664_1_1529217337301_5289269_289-213417-0_sp0.9 from training. Duration: 0.988875 2023-10-02 12:08:52,768 WARNING [train.py:1204] (2/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_143-86344-0_sp0.9 from training. Duration: 0.8555625 2023-10-02 12:08:52,925 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.balancer2.prob, batch_count=874340.0, ans=0.125 2023-10-02 12:08:55,445 WARNING [train.py:1204] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_520-126729-0_sp0.9 from training. Duration: 0.77775 2023-10-02 12:08:55,471 WARNING [train.py:1204] (2/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_76-308663-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 12:08:56,858 WARNING [train.py:1204] (2/4) Exclude cut with ID _8021_210_7994_1_1528376451358_3653069_100-140231-0 from training. Duration: 0.67 2023-10-02 12:08:58,325 WARNING [train.py:1204] (2/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_387-131538-0_sp0.9 from training. Duration: 0.9 2023-10-02 12:08:58,348 WARNING [train.py:1204] (2/4) Exclude cut with ID _6275_210_2084_1_1528110062213_7669049_365-182054-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 12:08:58,389 WARNING [train.py:1204] (2/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_345-340690-0 from training. Duration: 0.93 2023-10-02 12:09:00,421 WARNING [train.py:1204] (2/4) Exclude cut with ID _5409_210_1388_1_1527292850189_5865191_178-127970-0_sp0.9 from training. Duration: 0.6333125 2023-10-02 12:09:00,459 WARNING [train.py:1204] (2/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_352-12734-0_sp1.1 from training. Duration: 0.92725 2023-10-02 12:09:00,470 WARNING [train.py:1204] (2/4) Exclude cut with ID _65134_210_14763_1_1535335393985_3505368_121-129373-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 12:09:03,530 WARNING [train.py:1204] (2/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_557-324258-0_sp1.1 from training. Duration: 0.736375 2023-10-02 12:09:06,362 WARNING [train.py:1204] (2/4) Exclude cut with ID _48678_210_5663_1_1533988830947_3769199_306-255886-0 from training. Duration: 0.77 2023-10-02 12:09:07,691 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7544_210_6753_1_1528587000_691468_87-178599-0 from training. Duration: 0.87 2023-10-02 12:09:09,090 WARNING [train.py:1204] (2/4) Exclude cut with ID _2941_210_2577_1_1526199673150_2489759_208-236402-0_sp1.1 from training. Duration: 0.963625 2023-10-02 12:09:10,615 WARNING [train.py:1204] (2/4) Exclude cut with ID _38670_210_15379_1_1533448810659_4391059_65-169397-0 from training. Duration: 0.97 2023-10-02 12:09:10,880 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.ff3_skip_rate, batch_count=874406.6666666666, ans=0.0 2023-10-02 12:09:11,958 WARNING [train.py:1204] (2/4) Exclude cut with ID _30304_210_6319_1_1532476827261_7582060_306-328175-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 12:09:13,346 WARNING [train.py:1204] (2/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_212-149947-0_sp1.1 from training. Duration: 0.663625 2023-10-02 12:09:17,615 WARNING [train.py:1204] (2/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_321-22821-0_sp1.1 from training. Duration: 0.8090625 2023-10-02 12:09:18,917 WARNING [train.py:1204] (2/4) Exclude cut with ID _44010_210_4525_1_1533884262964_4013620_483-69110-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 12:09:18,926 WARNING [train.py:1204] (2/4) Exclude cut with ID _14922_210_6228_1_1530707435273_3693310_38-187851-0_sp1.1 from training. Duration: 0.563625 2023-10-02 12:09:20,986 WARNING [train.py:1204] (2/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_129-337802-0_sp0.9 from training. Duration: 0.8 2023-10-02 12:09:21,062 WARNING [train.py:1204] (2/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_268-87909-0_sp1.1 from training. Duration: 0.9 2023-10-02 12:09:24,254 WARNING [train.py:1204] (2/4) Exclude cut with ID _21878_210_3525_1_1531738772002_7264330_559-184862-0_sp1.1 from training. Duration: 0.8545625 2023-10-02 12:09:24,508 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.bypass.scale_min, batch_count=874473.3333333334, ans=0.2 2023-10-02 12:09:26,914 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_46032_210_14656_1_1533781412_4507969_237-120766-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 12:09:28,237 WARNING [train.py:1204] (2/4) Exclude cut with ID _28427_210_4220_1_1532581240967_3061069_330-170906-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 12:09:28,243 WARNING [train.py:1204] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_401-51578-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 12:09:29,669 WARNING [train.py:1204] (2/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_209-205365-0_sp1.1 from training. Duration: 0.7181875 2023-10-02 12:09:31,030 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_23801_210_16223_1_1532066392_3294657_57-242170-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 12:09:31,082 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_127-37371-0_sp1.1 from training. Duration: 0.936375 2023-10-02 12:09:36,606 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.feed_forward3.hidden_balancer.prob, batch_count=874540.0, ans=0.125 2023-10-02 12:09:37,765 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9171W0019-578905 from training. Duration: 0.896 2023-10-02 12:09:39,339 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_24605_210_1794_1_1532047863_4149755_349-49056-0_sp1.1 from training. Duration: 0.92725 2023-10-02 12:09:39,349 WARNING [train.py:1204] (2/4) Exclude cut with ID _12302_210_5219_1_1529924446466_8107280_897-21237-0_sp1.1 from training. Duration: 0.936375 2023-10-02 12:09:40,677 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_9460_210_4211_1_1528954587_10049667_480-260269-0_sp0.9 from training. Duration: 0.62225 2023-10-02 12:09:40,735 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_348-152548-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 12:09:42,039 WARNING [train.py:1204] (2/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_301-330455-0_sp0.9 from training. Duration: 0.888875 2023-10-02 12:09:43,427 WARNING [train.py:1204] (2/4) Exclude cut with ID _61008_210_10633_1_1534852769198_6974370_345-91705-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 12:09:44,749 WARNING [train.py:1204] (2/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_304-273433-0 from training. Duration: 0.99 2023-10-02 12:09:44,752 WARNING [train.py:1204] (2/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_359-345972-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 12:09:45,048 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.ff3_skip_rate, batch_count=874540.0, ans=0.0 2023-10-02 12:09:45,062 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.bypass_mid.scale_min, batch_count=874540.0, ans=0.2 2023-10-02 12:09:46,176 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_2296_210_2087_1_1525503448_3397335_57-185430-0_sp1.1 from training. Duration: 0.5454375 2023-10-02 12:09:49,164 INFO [train.py:1046] (2/4) Epoch 25, batch 3700, loss[loss=0.1551, simple_loss=0.2338, pruned_loss=0.03819, over 21610.00 frames. ], tot_loss[loss=0.1694, simple_loss=0.2465, pruned_loss=0.04617, over 4735133.45 frames. ], batch size: 47, lr: 4.08e-03, grad_scale: 16.0 2023-10-02 12:09:49,230 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_1256-120974-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 12:09:49,298 WARNING [train.py:1204] (2/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_372-30302-0_sp0.9 from training. Duration: 0.9555625 2023-10-02 12:09:52,512 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_42012_210_7927_1_1533439936_3871556_451-344262-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 12:09:52,512 WARNING [train.py:1204] (2/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_28-324729-0 from training. Duration: 0.94 2023-10-02 12:09:52,520 WARNING [train.py:1204] (2/4) Exclude cut with ID _64135_210_2253_1_1535677267876_7608972_313-18394-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 12:09:52,603 WARNING [train.py:1204] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_620-276793-0_sp0.9 from training. Duration: 0.6555625 2023-10-02 12:09:52,628 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7544_210_6753_1_1528591327_1471426_172-348777-0_sp0.9 from training. Duration: 0.7333125 2023-10-02 12:09:54,245 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.conv_module2.balancer2.prob, batch_count=874606.6666666666, ans=0.125 2023-10-02 12:09:55,367 WARNING [train.py:1204] (2/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_332-221057-0_sp0.9 from training. Duration: 0.7555625 2023-10-02 12:09:59,854 WARNING [train.py:1204] (2/4) Exclude cut with ID _19023_210_4353_1_1531378892552_5820780_5-300035-0_sp1.1 from training. Duration: 0.92725 2023-10-02 12:09:59,911 WARNING [train.py:1204] (2/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_343-309023-0_sp1.1 from training. Duration: 0.963625 2023-10-02 12:09:59,983 WARNING [train.py:1204] (2/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_37-41919-0_sp1.1 from training. Duration: 0.7909375 2023-10-02 12:10:01,371 WARNING [train.py:1204] (2/4) Exclude cut with ID _3541_210_2477_1_1526475408709_3263973_41-182910-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 12:10:01,436 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_229-45300-0_sp0.9 from training. Duration: 0.6333125 2023-10-02 12:10:05,169 WARNING [train.py:1204] (2/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_40-214716-0_sp1.1 from training. Duration: 0.963625 2023-10-02 12:10:06,567 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9309W0203-640330 from training. Duration: 0.896 2023-10-02 12:10:14,661 WARNING [train.py:1204] (2/4) Exclude cut with ID _60229_210_27767_1_1534838406158_3655350_264-279573-0_sp1.1 from training. Duration: 0.7545625 2023-10-02 12:10:14,705 WARNING [train.py:1204] (2/4) Exclude cut with ID _8394_210_6702_1_1528590504316_3540189_372-198878-0_sp0.9 from training. Duration: 0.6666875 2023-10-02 12:10:16,125 WARNING [train.py:1204] (2/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_359-118926-0_sp1.1 from training. Duration: 0.6545625 2023-10-02 12:10:16,136 WARNING [train.py:1204] (2/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_85-65056-0 from training. Duration: 0.96 2023-10-02 12:10:16,158 WARNING [train.py:1204] (2/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_89-242335-0_sp1.1 from training. Duration: 0.863625 2023-10-02 12:10:19,537 WARNING [train.py:1204] (2/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_128-49444-0_sp1.1 from training. Duration: 0.936375 2023-10-02 12:10:20,931 WARNING [train.py:1204] (2/4) Exclude cut with ID _30327_210_16049_1_1532484161295_3924169_401-70819-0 from training. Duration: 0.86 2023-10-02 12:10:22,312 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_41822_210_13270_1_1533955976_4379284_260-256391-0_sp1.1 from training. Duration: 0.936375 2023-10-02 12:10:24,256 WARNING [train.py:1204] (2/4) Exclude cut with ID _67335_210_5219_1_1535525975652_4275229_599-247317-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 12:10:26,885 WARNING [train.py:1204] (2/4) Exclude cut with ID _29857_210_20034_1_1532395886753_3798160_209-239267-0_sp1.1 from training. Duration: 0.936375 2023-10-02 12:10:26,905 WARNING [train.py:1204] (2/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_46-207599-0_sp1.1 from training. Duration: 0.5818125 2023-10-02 12:10:29,498 WARNING [train.py:1204] (2/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_912-282412-0_sp1.1 from training. Duration: 0.4454375 2023-10-02 12:10:32,729 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_4183_210_2087_1_1526729100_1869838_141-166600-0_sp1.1 from training. Duration: 0.863625 2023-10-02 12:10:32,733 WARNING [train.py:1204] (2/4) Exclude cut with ID _15037_210_2253_1_1530752294104_3267650_296-192597-0 from training. Duration: 0.81 2023-10-02 12:10:34,059 WARNING [train.py:1204] (2/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_571-220562-0_sp1.1 from training. Duration: 0.963625 2023-10-02 12:10:34,085 WARNING [train.py:1204] (2/4) Exclude cut with ID _5287_210_2084_1_1527418883040_3878670_12-221186-0 from training. Duration: 0.98 2023-10-02 12:10:36,293 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.balancer1.prob, batch_count=874806.6666666666, ans=0.125 2023-10-02 12:10:38,840 WARNING [train.py:1204] (2/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_149-85602-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 12:10:38,871 WARNING [train.py:1204] (2/4) Exclude cut with ID _9235_210_5219_1_1528891154253_8046120_1003-101638-0_sp1.1 from training. Duration: 0.663625 2023-10-02 12:10:40,221 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.605e+02 1.959e+02 2.210e+02 2.685e+02 4.346e+02, threshold=4.420e+02, percent-clipped=1.0 2023-10-02 12:10:41,764 WARNING [train.py:1204] (2/4) Exclude cut with ID _55844_210_2346_1_1534471212569_7625707_1099-33689-0_sp1.1 from training. Duration: 0.92725 2023-10-02 12:10:43,088 WARNING [train.py:1204] (2/4) Exclude cut with ID _7606_210_4915_1_1528894253277_5080690_301-10732-0 from training. Duration: 0.81 2023-10-02 12:10:45,788 WARNING [train.py:1204] (2/4) Exclude cut with ID _22826_210_9994_1_1531908201522_3279169_403-34698-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 12:10:45,803 WARNING [train.py:1204] (2/4) Exclude cut with ID _8480_210_1874_1_1528589391205_6728330_755-112625-0_sp0.9 from training. Duration: 0.82225 2023-10-02 12:10:45,820 WARNING [train.py:1204] (2/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_527-238990-0_sp1.1 from training. Duration: 0.8181875 2023-10-02 12:10:45,855 WARNING [train.py:1204] (2/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_426-7328-0_sp1.1 from training. Duration: 0.92725 2023-10-02 12:10:50,392 WARNING [train.py:1204] (2/4) Exclude cut with ID _3541_210_2477_1_1526475408709_3263973_189-317158-0_sp1.1 from training. Duration: 0.8181875 2023-10-02 12:10:50,448 WARNING [train.py:1204] (2/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_162-103620-0 from training. Duration: 0.93 2023-10-02 12:10:51,378 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.0.layers.1.feed_forward1.out_whiten, num_groups=1, num_channels=192, metric=4.77 vs. limit=15.0 2023-10-02 12:10:51,907 WARNING [train.py:1204] (2/4) Exclude cut with ID _22083_210_8254_1_1531702862439_3678970_49-234546-0 from training. Duration: 0.83 2023-10-02 12:10:53,221 WARNING [train.py:1204] (2/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_370-331600-0_sp1.1 from training. Duration: 0.8545625 2023-10-02 12:10:53,234 WARNING [train.py:1204] (2/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_166-59042-0_sp1.1 from training. Duration: 0.97275 2023-10-02 12:10:54,598 WARNING [train.py:1204] (2/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_138-332773-0_sp1.1 from training. Duration: 0.736375 2023-10-02 12:10:54,668 WARNING [train.py:1204] (2/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_90-30673-0_sp0.9 from training. Duration: 0.9333125 2023-10-02 12:10:57,988 WARNING [train.py:1204] (2/4) Exclude cut with ID _27967_210_12140_1_1532588391859_8419130_737-41898-0_sp1.1 from training. Duration: 0.936375 2023-10-02 12:10:59,370 WARNING [train.py:1204] (2/4) Exclude cut with ID _8833_210_5219_1_1528592665210_7553059_329-125156-0_sp1.1 from training. Duration: 0.7454375 2023-10-02 12:11:00,858 WARNING [train.py:1204] (2/4) Exclude cut with ID _41817_210_18835_1_1533434477160_7423049_352-42537-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 12:11:04,088 INFO [train.py:1046] (2/4) Epoch 25, batch 3750, loss[loss=0.1811, simple_loss=0.247, pruned_loss=0.05763, over 22670.00 frames. ], tot_loss[loss=0.1699, simple_loss=0.2476, pruned_loss=0.0461, over 4746513.79 frames. ], batch size: 322, lr: 4.08e-03, grad_scale: 16.0 2023-10-02 12:11:04,159 WARNING [train.py:1204] (2/4) Exclude cut with ID _8365_210_2064_1_1528592311070_4677690_431-294102-0 from training. Duration: 0.89 2023-10-02 12:11:05,531 WARNING [train.py:1204] (2/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_454-314901-0_sp1.1 from training. Duration: 0.4181875 2023-10-02 12:11:07,012 WARNING [train.py:1204] (2/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_80-135200-0_sp0.9 from training. Duration: 0.87775 2023-10-02 12:11:08,720 WARNING [train.py:1204] (2/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_181-78632-0 from training. Duration: 0.78 2023-10-02 12:11:08,766 WARNING [train.py:1204] (2/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_300-27235-0_sp0.9 from training. Duration: 0.9666875 2023-10-02 12:11:10,112 WARNING [train.py:1204] (2/4) Exclude cut with ID _47790_210_16187_1_1533862814066_4207519_96-177176-0_sp1.1 from training. Duration: 0.97275 2023-10-02 12:11:11,583 WARNING [train.py:1204] (2/4) Exclude cut with ID _35941_210_15379_1_1532995294515_4023089_246-348742-0_sp1.1 from training. Duration: 0.97275 2023-10-02 12:11:11,738 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=874940.0, ans=0.1 2023-10-02 12:11:12,924 WARNING [train.py:1204] (2/4) Exclude cut with ID _38670_210_15379_1_1533448810659_4391059_65-169397-0_sp1.1 from training. Duration: 0.8818125 2023-10-02 12:11:15,700 WARNING [train.py:1204] (2/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_1003-99642-0_sp1.1 from training. Duration: 0.963625 2023-10-02 12:11:18,871 WARNING [train.py:1204] (2/4) Exclude cut with ID _40402_210_18219_1_1533522324115_2953150_320-14078-0_sp1.1 from training. Duration: 0.863625 2023-10-02 12:11:20,271 WARNING [train.py:1204] (2/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_367-61330-0_sp0.9 from training. Duration: 0.8333125 2023-10-02 12:11:21,673 WARNING [train.py:1204] (2/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_515-298514-0_sp1.1 from training. Duration: 0.92725 2023-10-02 12:11:24,374 WARNING [train.py:1204] (2/4) Exclude cut with ID _53731_210_7994_1_1534553979244_3757214_471-320815-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 12:11:24,431 WARNING [train.py:1204] (2/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_306-232480-0 from training. Duration: 0.96 2023-10-02 12:11:25,693 WARNING [train.py:1204] (2/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_478-162247-0_sp0.9 from training. Duration: 0.911125 2023-10-02 12:11:25,801 WARNING [train.py:1204] (2/4) Exclude cut with ID _56423_210_5854_1_1534896061049_3165969_345-284696-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 12:11:27,136 WARNING [train.py:1204] (2/4) Exclude cut with ID _44778_210_2054_1_1533798127203_7443090_600-122253-0_sp1.1 from training. Duration: 0.963625 2023-10-02 12:11:27,971 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.nonlin_attention.balancer.prob, batch_count=875006.6666666666, ans=0.125 2023-10-02 12:11:30,392 WARNING [train.py:1204] (2/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_199-295611-0 from training. Duration: 0.55 2023-10-02 12:11:33,221 WARNING [train.py:1204] (2/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_734-129761-0 from training. Duration: 0.82 2023-10-02 12:11:34,588 WARNING [train.py:1204] (2/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_349-169257-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 12:11:34,815 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.conv_module1.balancer1.prob, batch_count=875073.3333333334, ans=0.125 2023-10-02 12:11:35,815 WARNING [train.py:1204] (2/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_202-67017-0_sp0.9 from training. Duration: 0.911125 2023-10-02 12:11:38,489 WARNING [train.py:1204] (2/4) Exclude cut with ID _16908_210_5724_1_1531119157248_3772700_61-258978-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 12:11:44,566 WARNING [train.py:1204] (2/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_172-156931-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 12:11:44,668 WARNING [train.py:1204] (2/4) Exclude cut with ID _4092_210_2151_1_1526643044449_3135759_356-278694-0_sp1.1 from training. Duration: 0.42725 2023-10-02 12:11:49,484 WARNING [train.py:1204] (2/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_116-332008-0 from training. Duration: 0.98 2023-10-02 12:11:52,138 WARNING [train.py:1204] (2/4) Exclude cut with ID _29003_210_19596_1_1532507391720_3894098_358-114789-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 12:11:52,387 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.conv_module1.balancer1.prob, batch_count=875140.0, ans=0.125 2023-10-02 12:11:54,858 WARNING [train.py:1204] (2/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_152-212533-0_sp1.1 from training. Duration: 0.8818125 2023-10-02 12:11:56,329 WARNING [train.py:1204] (2/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_267-214116-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 12:11:59,495 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7543_210_6753_1_1528541907_1399322_165-323619-0_sp0.9 from training. Duration: 0.7666875 2023-10-02 12:12:05,571 WARNING [train.py:1204] (2/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_577-332600-0_sp0.9 from training. Duration: 0.6333125 2023-10-02 12:12:06,933 WARNING [train.py:1204] (2/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_342-100157-0_sp0.9 from training. Duration: 0.6 2023-10-02 12:12:08,400 WARNING [train.py:1204] (2/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_141-89727-0_sp1.1 from training. Duration: 0.6454375 2023-10-02 12:12:09,769 WARNING [train.py:1204] (2/4) Exclude cut with ID _3992_210_1747_1_1526644816031_3731060_107-251434-0_sp1.1 from training. Duration: 0.9 2023-10-02 12:12:11,312 WARNING [train.py:1204] (2/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_401-53423-0_sp1.1 from training. Duration: 0.5 2023-10-02 12:12:17,199 INFO [train.py:1046] (2/4) Epoch 25, batch 3800, loss[loss=0.1589, simple_loss=0.2294, pruned_loss=0.04418, over 23515.00 frames. ], tot_loss[loss=0.1695, simple_loss=0.2471, pruned_loss=0.04599, over 4737687.17 frames. ], batch size: 134, lr: 4.08e-03, grad_scale: 16.0 2023-10-02 12:12:19,115 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7762_210_1794_1_1528199508_4057408_422-227034-0_sp1.1 from training. Duration: 0.663625 2023-10-02 12:12:23,266 WARNING [train.py:1204] (2/4) Exclude cut with ID _4184_210_2550_1_1526730192013_2219380_102-250299-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 12:12:24,598 WARNING [train.py:1204] (2/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_253-308377-0_sp0.9 from training. Duration: 0.5444375 2023-10-02 12:12:25,896 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_271-297505-0 from training. Duration: 0.99 2023-10-02 12:12:25,999 WARNING [train.py:1204] (2/4) Exclude cut with ID _35941_210_15379_1_1532995294515_4023089_213-142973-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 12:12:29,136 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7544_210_6753_1_1528587722_3576686_162-63018-0_sp1.1 from training. Duration: 0.92725 2023-10-02 12:12:29,219 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_365-217119-0_sp0.9 from training. Duration: 0.7 2023-10-02 12:12:31,989 WARNING [train.py:1204] (2/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_662-275621-0_sp0.9 from training. Duration: 0.4666875 2023-10-02 12:12:31,991 WARNING [train.py:1204] (2/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_374-187555-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 12:12:33,935 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_289-180904-0_sp1.1 from training. Duration: 0.6909375 2023-10-02 12:12:34,173 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.conv_skip_rate, batch_count=875340.0, ans=0.0 2023-10-02 12:12:35,331 WARNING [train.py:1204] (2/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_189-47206-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 12:12:35,357 WARNING [train.py:1204] (2/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_234-14174-0_sp1.1 from training. Duration: 0.7454375 2023-10-02 12:12:35,381 WARNING [train.py:1204] (2/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_576-76621-0_sp1.1 from training. Duration: 0.963625 2023-10-02 12:12:36,157 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.3.encoder.layers.1.feed_forward2.out_whiten, num_groups=1, num_channels=512, metric=5.49 vs. limit=15.0 2023-10-02 12:12:36,865 WARNING [train.py:1204] (2/4) Exclude cut with ID _13253_210_1925_1_1530757775819_3577873_253-246386-0 from training. Duration: 0.9 2023-10-02 12:12:39,598 WARNING [train.py:1204] (2/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_372-6869-0_sp1.1 from training. Duration: 0.4 2023-10-02 12:12:41,022 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_21434_210_4238_1_1531916339_1222252_22-54954-0_sp1.1 from training. Duration: 0.936375 2023-10-02 12:12:42,458 WARNING [train.py:1204] (2/4) Exclude cut with ID _29853_210_7497_1_1532505600851_6642110_492-170361-0_sp1.1 from training. Duration: 0.92725 2023-10-02 12:12:43,840 WARNING [train.py:1204] (2/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_505-97987-0_sp0.9 from training. Duration: 0.9555625 2023-10-02 12:12:43,883 WARNING [train.py:1204] (2/4) Exclude cut with ID _8365_210_2064_1_1528592311070_4677690_371-122295-0_sp0.9 from training. Duration: 0.611125 2023-10-02 12:12:46,060 INFO [scaling.py:1118] (2/4) WithLoss: name=encoder.encoders.1.encoder.layers.1.self_attn_weights, loss-sum=0.000e+00 2023-10-02 12:12:47,218 WARNING [train.py:1204] (2/4) Exclude cut with ID _14268_210_9206_1_1530622771285_4714099_637-86630-0_sp1.1 from training. Duration: 0.463625 2023-10-02 12:12:47,247 WARNING [train.py:1204] (2/4) Exclude cut with ID _29857_210_20034_1_1532395886753_3798160_372-5678-0_sp1.1 from training. Duration: 0.963625 2023-10-02 12:12:50,488 WARNING [train.py:1204] (2/4) Exclude cut with ID _52892_210_6721_1_1534378941259_4124129_90-165108-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 12:12:51,813 WARNING [train.py:1204] (2/4) Exclude cut with ID _27052_210_5986_1_1532329197187_3585009_143-339198-0_sp1.1 from training. Duration: 0.963625 2023-10-02 12:12:55,997 WARNING [train.py:1204] (2/4) Exclude cut with ID _14268_210_9206_1_1530622771285_4714099_637-86630-0_sp0.9 from training. Duration: 0.5666875 2023-10-02 12:12:55,999 WARNING [train.py:1204] (2/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_401-255491-0 from training. Duration: 0.7 2023-10-02 12:12:57,037 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.0.layers.0.feed_forward1.out_whiten, num_groups=1, num_channels=192, metric=8.72 vs. limit=15.0 2023-10-02 12:12:58,609 WARNING [train.py:1204] (2/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_178-6946-0_sp1.1 from training. Duration: 0.97275 2023-10-02 12:13:05,354 WARNING [train.py:1204] (2/4) Exclude cut with ID _27485_210_4916_1_1532739191677_3668269_460-163885-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 12:13:08,021 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.583e+02 1.870e+02 2.050e+02 2.289e+02 3.047e+02, threshold=4.100e+02, percent-clipped=0.0 2023-10-02 12:13:10,867 WARNING [train.py:1204] (2/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_270-26053-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 12:13:12,316 WARNING [train.py:1204] (2/4) Exclude cut with ID _14170_210_5219_1_1530525949868_4469650_412-317077-0 from training. Duration: 0.75 2023-10-02 12:13:13,794 WARNING [train.py:1204] (2/4) Exclude cut with ID _5289_210_2084_1_1527678202736_3779299_142-103317-0 from training. Duration: 0.84 2023-10-02 12:13:13,840 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_150-143438-0_sp1.1 from training. Duration: 0.92725 2023-10-02 12:13:15,412 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=875540.0, ans=0.1 2023-10-02 12:13:16,669 WARNING [train.py:1204] (2/4) Exclude cut with ID _27894_210_12392_1_1532415496078_6902439_668-269779-0_sp1.1 from training. Duration: 0.97275 2023-10-02 12:13:18,560 WARNING [train.py:1204] (2/4) Exclude cut with ID _18513_210_9206_1_1531618215223_3862620_56-222075-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 12:13:20,327 WARNING [train.py:1204] (2/4) Exclude cut with ID _4092_210_2151_1_1526643044449_3135759_356-278694-0 from training. Duration: 0.47 2023-10-02 12:13:23,130 WARNING [train.py:1204] (2/4) Exclude cut with ID _25808_210_13292_1_1532343514562_7366109_633-47524-0 from training. Duration: 0.99 2023-10-02 12:13:23,140 WARNING [train.py:1204] (2/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_872-264525-0 from training. Duration: 0.65 2023-10-02 12:13:23,173 WARNING [train.py:1204] (2/4) Exclude cut with ID _14632_210_9988_1_1530691279392_3923200_239-232545-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 12:13:24,663 WARNING [train.py:1204] (2/4) Exclude cut with ID _14349_210_8341_1_1530615710818_3442470_153-27432-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 12:13:27,650 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.ff2_skip_rate, batch_count=875540.0, ans=0.0 2023-10-02 12:13:28,913 WARNING [train.py:1204] (2/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_37-41919-0_sp0.9 from training. Duration: 0.9666875 2023-10-02 12:13:30,164 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_196-289637-0_sp0.9 from training. Duration: 0.8333125 2023-10-02 12:13:31,975 INFO [train.py:1046] (2/4) Epoch 25, batch 3850, loss[loss=0.1631, simple_loss=0.2509, pruned_loss=0.03764, over 24628.00 frames. ], tot_loss[loss=0.1681, simple_loss=0.2456, pruned_loss=0.04531, over 4738954.02 frames. ], batch size: 68, lr: 4.08e-03, grad_scale: 16.0 2023-10-02 12:13:34,945 WARNING [train.py:1204] (2/4) Exclude cut with ID _13678_210_7497_1_1530530069226_3463170_371-3221-0_sp1.1 from training. Duration: 0.8181875 2023-10-02 12:13:36,900 WARNING [train.py:1204] (2/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_557-324258-0 from training. Duration: 0.81 2023-10-02 12:13:38,254 WARNING [train.py:1204] (2/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_1012-279473-0_sp1.1 from training. Duration: 0.6090625 2023-10-02 12:13:38,338 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_66916_210_14656_1_1535725525_326566_7-92649-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 12:13:41,053 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_9460_210_4211_1_1528954587_10049667_480-260269-0_sp1.1 from training. Duration: 0.5090625 2023-10-02 12:13:42,440 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_40896_210_15710_1_1533433956_7100802_121-155001-0_sp1.1 from training. Duration: 0.92725 2023-10-02 12:13:45,093 WARNING [train.py:1204] (2/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_188-171673-0_sp1.1 from training. Duration: 0.52725 2023-10-02 12:13:47,060 WARNING [train.py:1204] (2/4) Exclude cut with ID _47790_210_16187_1_1533862814066_4207519_2-39653-0 from training. Duration: 0.98 2023-10-02 12:13:52,537 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_23850_210_4238_1_1532232597_1632433_112-229012-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 12:13:52,660 WARNING [train.py:1204] (2/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_626-79528-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 12:13:55,370 WARNING [train.py:1204] (2/4) Exclude cut with ID _30304_210_6319_1_1532476827261_7582060_856-90052-0_sp1.1 from training. Duration: 0.963625 2023-10-02 12:13:55,533 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=875673.3333333334, ans=0.1 2023-10-02 12:13:56,783 WARNING [train.py:1204] (2/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_122-109023-0_sp0.9 from training. Duration: 0.8444375 2023-10-02 12:13:57,048 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.ff2_skip_rate, batch_count=875673.3333333334, ans=0.0 2023-10-02 12:13:59,510 WARNING [train.py:1204] (2/4) Exclude cut with ID _64414_210_17301_1_1535243252048_3757759_37-70328-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 12:14:00,220 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_20120_210_6753_1_1531643035_5482859_622-344907-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 12:14:01,295 WARNING [train.py:1204] (2/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_410-321956-0_sp1.1 from training. Duration: 0.92725 2023-10-02 12:14:02,570 WARNING [train.py:1204] (2/4) Exclude cut with ID _7562_210_2084_1_1528887767916_3946229_186-326422-0_sp0.9 from training. Duration: 0.9333125 2023-10-02 12:14:03,947 WARNING [train.py:1204] (2/4) Exclude cut with ID _37537_210_2253_1_1533171758568_3421949_127-285192-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 12:14:05,287 WARNING [train.py:1204] (2/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_723-228168-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 12:14:07,143 WARNING [train.py:1204] (2/4) Exclude cut with ID _13678_210_7497_1_1530530069226_3463170_426-246984-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 12:14:07,157 WARNING [train.py:1204] (2/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_399-156791-0_sp0.9 from training. Duration: 0.92225 2023-10-02 12:14:08,499 WARNING [train.py:1204] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_391-22148-0 from training. Duration: 0.77 2023-10-02 12:14:08,532 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_3475_210_2028_1_1526193578_3622569_64-297072-0 from training. Duration: 0.91 2023-10-02 12:14:09,864 WARNING [train.py:1204] (2/4) Exclude cut with ID _17875_210_2087_1_1531224012907_1821339_225-280015-0_sp1.1 from training. Duration: 0.963625 2023-10-02 12:14:09,883 WARNING [train.py:1204] (2/4) Exclude cut with ID _50655_210_18628_1_1534229942193_3772154_325-246169-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 12:14:11,509 WARNING [train.py:1204] (2/4) Exclude cut with ID _29529_210_7234_1_1532393961455_3977460_597-257348-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 12:14:11,543 WARNING [train.py:1204] (2/4) Exclude cut with ID _58895_210_8750_1_1535184081320_3649089_77-173485-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 12:14:11,568 WARNING [train.py:1204] (2/4) Exclude cut with ID _6270_210_2084_1_1527850911573_3856420_26-277343-0 from training. Duration: 0.82 2023-10-02 12:14:14,837 WARNING [train.py:1204] (2/4) Exclude cut with ID _14368_210_4220_1_1530532714870_3595529_144-3287-0 from training. Duration: 0.67 2023-10-02 12:14:14,974 WARNING [train.py:1204] (2/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_293-145278-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 12:14:15,468 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.4.encoder.layers.2.whiten, num_groups=1, num_channels=384, metric=4.31 vs. limit=12.0 2023-10-02 12:14:19,390 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_40047_210_2290_1_1533290519_2374311_257-335162-0 from training. Duration: 0.95 2023-10-02 12:14:20,756 WARNING [train.py:1204] (2/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_619-344499-0_sp0.9 from training. Duration: 0.7 2023-10-02 12:14:22,736 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.feed_forward1.out_proj.dropout_p, batch_count=875806.6666666666, ans=0.1 2023-10-02 12:14:23,896 WARNING [train.py:1204] (2/4) Exclude cut with ID _19504_210_9994_1_1531648863242_3325220_456-315379-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 12:14:25,292 WARNING [train.py:1204] (2/4) Exclude cut with ID _55857_210_2346_1_1534644007831_6898209_631-221692-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 12:14:29,175 WARNING [train.py:1204] (2/4) Exclude cut with ID _64631_210_27767_1_1535349580068_4165160_324-3682-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 12:14:29,211 WARNING [train.py:1204] (2/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_317-211107-0 from training. Duration: 0.92 2023-10-02 12:14:32,269 WARNING [train.py:1204] (2/4) Exclude cut with ID _6789_210_1534_1_1527857937218_3118950_311-61262-0 from training. Duration: 0.65 2023-10-02 12:14:34,913 WARNING [train.py:1204] (2/4) Exclude cut with ID _28323_210_16187_1_1532480395667_4100300_365-68882-0_sp1.1 from training. Duration: 0.92725 2023-10-02 12:14:34,947 WARNING [train.py:1204] (2/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_816-181825-0_sp1.1 from training. Duration: 0.92725 2023-10-02 12:14:37,098 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.nonlin_attention.balancer.prob, batch_count=875873.3333333334, ans=0.125 2023-10-02 12:14:38,231 WARNING [train.py:1204] (2/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_132-269633-0_sp1.1 from training. Duration: 0.6181875 2023-10-02 12:14:38,242 WARNING [train.py:1204] (2/4) Exclude cut with ID _44585_210_18628_1_1533605350387_4518199_280-73534-0_sp1.1 from training. Duration: 0.8090625 2023-10-02 12:14:38,289 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_68095_210_20185_1_1535718226_8888855_1009-164755-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 12:14:39,687 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_40223_210_6228_1_1533298404_4812267_400-103813-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 12:14:39,688 WARNING [train.py:1204] (2/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_491-261923-0_sp1.1 from training. Duration: 0.8909375 2023-10-02 12:14:39,700 WARNING [train.py:1204] (2/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_712-342563-0 from training. Duration: 0.97 2023-10-02 12:14:39,980 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.conv_module2.balancer2.prob, batch_count=875873.3333333334, ans=0.125 2023-10-02 12:14:41,195 WARNING [train.py:1204] (2/4) Exclude cut with ID _41161_210_20034_1_1533431249657_3555314_175-190032-0_sp1.1 from training. Duration: 0.963625 2023-10-02 12:14:42,575 WARNING [train.py:1204] (2/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_788-298907-0 from training. Duration: 0.89 2023-10-02 12:14:42,599 WARNING [train.py:1204] (2/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_225-70961-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 12:14:42,607 WARNING [train.py:1204] (2/4) Exclude cut with ID _58583_210_5236_1_1534726835266_3574249_235-175994-0_sp1.1 from training. Duration: 0.92725 2023-10-02 12:14:44,596 WARNING [train.py:1204] (2/4) Exclude cut with ID _12302_210_5219_1_1529924446466_8107280_907-182949-0_sp1.1 from training. Duration: 0.836375 2023-10-02 12:14:45,830 INFO [train.py:1046] (2/4) Epoch 25, batch 3900, loss[loss=0.1636, simple_loss=0.2541, pruned_loss=0.03655, over 24633.00 frames. ], tot_loss[loss=0.1675, simple_loss=0.2445, pruned_loss=0.04528, over 4710038.34 frames. ], batch size: 73, lr: 4.08e-03, grad_scale: 16.0 2023-10-02 12:14:45,882 WARNING [train.py:1204] (2/4) Exclude cut with ID _54871_210_22356_1_1534665798253_3856359_94-193692-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 12:14:47,352 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_4528_210_3273_1_1527076654_3276777_342-168068-0_sp0.9 from training. Duration: 0.9666875 2023-10-02 12:14:47,406 WARNING [train.py:1204] (2/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_368-74239-0_sp1.1 from training. Duration: 0.92725 2023-10-02 12:14:48,686 WARNING [train.py:1204] (2/4) Exclude cut with ID _44831_210_13427_1_1533816011549_3689035_291-295928-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 12:14:48,765 WARNING [train.py:1204] (2/4) Exclude cut with ID _56406_210_9288_1_1534899618664_3496149_455-131997-0_sp1.1 from training. Duration: 0.97275 2023-10-02 12:14:49,973 WARNING [train.py:1204] (2/4) Exclude cut with ID _6473_210_1741_1_1527901307396_3815380_108-17335-0 from training. Duration: 0.65 2023-10-02 12:14:50,016 WARNING [train.py:1204] (2/4) Exclude cut with ID _14224_210_4381_1_1530529062037_4264010_286-135600-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 12:14:53,400 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_14056_210_4163_1_1530604483_7572827_928-321675-0_sp1.1 from training. Duration: 0.936375 2023-10-02 12:14:53,630 INFO [scaling.py:1118] (2/4) WithLoss: name=encoder.encoders.1.encoder.layers.1.self_attn_weights, loss-sum=0.000e+00 2023-10-02 12:14:54,925 WARNING [train.py:1204] (2/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_81-300212-0_sp1.1 from training. Duration: 0.5454375 2023-10-02 12:14:55,143 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.bypass.skip_rate, batch_count=875940.0, ans=0.07 2023-10-02 12:14:56,151 WARNING [train.py:1204] (2/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_29-280622-0_sp0.9 from training. Duration: 0.911125 2023-10-02 12:14:57,454 WARNING [train.py:1204] (2/4) Exclude cut with ID _61079_210_4525_1_1534921008198_4011419_228-176447-0_sp1.1 from training. Duration: 0.936375 2023-10-02 12:15:00,540 WARNING [train.py:1204] (2/4) Exclude cut with ID _8394_210_6702_1_1528590504316_3540189_372-198878-0_sp1.1 from training. Duration: 0.5454375 2023-10-02 12:15:00,550 WARNING [train.py:1204] (2/4) Exclude cut with ID _64521_210_14699_1_1535243427259_3815328_244-208725-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 12:15:00,673 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8275_210_4198_1_1528455320_3892571_491-17707-0_sp0.9 from training. Duration: 0.711125 2023-10-02 12:15:00,828 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.0.conv_skip_rate, batch_count=876006.6666666666, ans=0.0 2023-10-02 12:15:02,009 WARNING [train.py:1204] (2/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_375-245677-0 from training. Duration: 0.81 2023-10-02 12:15:02,024 WARNING [train.py:1204] (2/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_690-286055-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 12:15:04,004 WARNING [train.py:1204] (2/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_89-321955-0 from training. Duration: 0.8 2023-10-02 12:15:04,047 WARNING [train.py:1204] (2/4) Exclude cut with ID _18895_210_6228_1_1531357274234_3784930_213-98721-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 12:15:04,229 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.conv_module2.balancer1.prob, batch_count=876006.6666666666, ans=0.125 2023-10-02 12:15:05,392 WARNING [train.py:1204] (2/4) Exclude cut with ID _19708_210_9206_1_1532136650733_4238364_347-65240-0 from training. Duration: 0.97 2023-10-02 12:15:07,294 WARNING [train.py:1204] (2/4) Exclude cut with ID _64135_210_2253_1_1535677267876_7608972_498-5039-0 from training. Duration: 0.78 2023-10-02 12:15:10,052 WARNING [train.py:1204] (2/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_146-276944-0_sp1.1 from training. Duration: 0.7545625 2023-10-02 12:15:11,373 WARNING [train.py:1204] (2/4) Exclude cut with ID _63171_210_15488_1_1535104792794_3295110_155-34488-0_sp1.1 from training. Duration: 0.97275 2023-10-02 12:15:11,392 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_5438_210_1794_1_1527336687_2600009_233-58072-0_sp0.9 from training. Duration: 0.8555625 2023-10-02 12:15:11,439 WARNING [train.py:1204] (2/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_63-101996-0_sp0.9 from training. Duration: 0.97775 2023-10-02 12:15:17,213 WARNING [train.py:1204] (2/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_271-269084-0_sp1.1 from training. Duration: 0.7545625 2023-10-02 12:15:19,062 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.4.encoder.layers.1.self_attn2.whiten, num_groups=1, num_channels=384, metric=11.48 vs. limit=22.5 2023-10-02 12:15:19,486 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.0.layers.0.feed_forward2.out_whiten, num_groups=1, num_channels=192, metric=5.89 vs. limit=15.0 2023-10-02 12:15:19,804 WARNING [train.py:1204] (2/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_345-167986-0_sp1.1 from training. Duration: 0.763625 2023-10-02 12:15:21,258 WARNING [train.py:1204] (2/4) Exclude cut with ID _39290_210_4220_1_1533862778760_3075118_92-311600-0_sp1.1 from training. Duration: 0.8 2023-10-02 12:15:22,539 WARNING [train.py:1204] (2/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_209-65306-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 12:15:22,623 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_26433_210_12108_1_1532157158_4056480_317-335807-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 12:15:30,111 WARNING [train.py:1204] (2/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_238-94127-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 12:15:30,152 WARNING [train.py:1204] (2/4) Exclude cut with ID _41314_210_12651_1_1533459476543_3766460_207-132038-0_sp1.1 from training. Duration: 0.8545625 2023-10-02 12:15:37,488 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.494e+02 1.823e+02 1.992e+02 2.120e+02 3.702e+02, threshold=3.983e+02, percent-clipped=0.0 2023-10-02 12:15:37,593 WARNING [train.py:1204] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_167-137255-0_sp1.1 from training. Duration: 0.5818125 2023-10-02 12:15:38,972 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_2009_210_2071_1_1525481134_4588937_385-302100-0_sp0.9 from training. Duration: 0.9666875 2023-10-02 12:15:46,594 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_15517_210_6753_1_1530954008_3487336_253-110617-0_sp1.1 from training. Duration: 0.936375 2023-10-02 12:15:51,050 WARNING [train.py:1204] (2/4) Exclude cut with ID _39290_210_4220_1_1533862778760_3075118_92-311600-0_sp0.9 from training. Duration: 0.97775 2023-10-02 12:15:51,091 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_42-142473-0 from training. Duration: 0.72 2023-10-02 12:15:51,124 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_2296_210_2087_1_1525503448_3397335_57-185430-0 from training. Duration: 0.6 2023-10-02 12:15:51,142 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_11305_210_1794_1_1529750428_7178148_257-312870-0_sp0.9 from training. Duration: 0.97775 2023-10-02 12:15:52,608 WARNING [train.py:1204] (2/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_222-198127-0 from training. Duration: 0.84 2023-10-02 12:15:54,029 WARNING [train.py:1204] (2/4) Exclude cut with ID _65263_210_22356_1_1535763598691_3921037_423-202699-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 12:15:55,965 WARNING [train.py:1204] (2/4) Exclude cut with ID _15037_210_2253_1_1530752294104_3267650_13-130361-0 from training. Duration: 0.99 2023-10-02 12:15:59,866 INFO [train.py:1046] (2/4) Epoch 25, batch 3950, loss[loss=0.1805, simple_loss=0.2592, pruned_loss=0.05086, over 23766.00 frames. ], tot_loss[loss=0.1676, simple_loss=0.2442, pruned_loss=0.04546, over 4699560.48 frames. ], batch size: 85, lr: 4.08e-03, grad_scale: 8.0 2023-10-02 12:16:02,703 WARNING [train.py:1204] (2/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_628-198080-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 12:16:04,451 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_3171_210_2087_1_1526109084_2330007_37-64334-0 from training. Duration: 0.82 2023-10-02 12:16:04,506 WARNING [train.py:1204] (2/4) Exclude cut with ID _5287_210_2084_1_1527418883040_3878670_12-221186-0_sp1.1 from training. Duration: 0.8909375 2023-10-02 12:16:07,278 WARNING [train.py:1204] (2/4) Exclude cut with ID _6653_210_2629_1_1527919172441_6732679_1103-56445-0_sp1.1 from training. Duration: 0.663625 2023-10-02 12:16:08,635 WARNING [train.py:1204] (2/4) Exclude cut with ID _65263_210_22356_1_1535763598691_3921037_169-140068-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 12:16:12,877 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.feed_forward1.out_proj.dropout_p, batch_count=876340.0, ans=0.1 2023-10-02 12:16:13,505 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.1.encoder.layers.0.feed_forward2.out_whiten, num_groups=1, num_channels=256, metric=4.60 vs. limit=15.0 2023-10-02 12:16:14,633 WARNING [train.py:1204] (2/4) Exclude cut with ID IC0671W0118-331961 from training. Duration: 0.896 2023-10-02 12:16:15,976 WARNING [train.py:1204] (2/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_402-102311-0_sp1.1 from training. Duration: 0.6090625 2023-10-02 12:16:15,994 WARNING [train.py:1204] (2/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_202-293523-0 from training. Duration: 0.72 2023-10-02 12:16:16,061 WARNING [train.py:1204] (2/4) Exclude cut with ID IC0679W0038-335846 from training. Duration: 0.768 2023-10-02 12:16:16,091 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_37357_210_17024_1_1533089504_2316121_79-198866-0_sp1.1 from training. Duration: 0.936375 2023-10-02 12:16:18,962 WARNING [train.py:1204] (2/4) Exclude cut with ID _7606_210_4915_1_1528894253277_5080690_292-271285-0_sp1.1 from training. Duration: 0.963625 2023-10-02 12:16:18,972 WARNING [train.py:1204] (2/4) Exclude cut with ID _3541_210_2477_1_1526475408709_3263973_377-296839-0_sp0.9 from training. Duration: 0.611125 2023-10-02 12:16:18,974 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_45886_210_17024_1_1533704917_2435333_58-131069-0_sp1.1 from training. Duration: 0.963625 2023-10-02 12:16:23,523 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6961_210_2087_1_1527937168_6815011_395-185060-0 from training. Duration: 0.95 2023-10-02 12:16:26,732 WARNING [train.py:1204] (2/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_304-266479-0_sp1.1 from training. Duration: 0.97275 2023-10-02 12:16:26,783 WARNING [train.py:1204] (2/4) Exclude cut with ID _3867_210_1883_1_1526641200575_4475830_319-349850-0_sp1.1 from training. Duration: 0.6090625 2023-10-02 12:16:26,790 WARNING [train.py:1204] (2/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_304-59227-0_sp0.9 from training. Duration: 0.8555625 2023-10-02 12:16:28,269 WARNING [train.py:1204] (2/4) Exclude cut with ID _8021_210_7994_1_1528376451358_3653069_100-140231-0_sp0.9 from training. Duration: 0.7444375 2023-10-02 12:16:28,327 WARNING [train.py:1204] (2/4) Exclude cut with ID _8833_210_5219_1_1528592665210_7553059_329-125156-0_sp0.9 from training. Duration: 0.911125 2023-10-02 12:16:34,947 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.bypass.scale_min, batch_count=876406.6666666666, ans=0.2 2023-10-02 12:16:37,511 WARNING [train.py:1204] (2/4) Exclude cut with ID _14459_210_2228_1_1531443641946_7516700_792-287234-0_sp1.1 from training. Duration: 0.92725 2023-10-02 12:16:37,527 WARNING [train.py:1204] (2/4) Exclude cut with ID _19708_210_9206_1_1532136650733_4238364_347-65240-0_sp1.1 from training. Duration: 0.8818125 2023-10-02 12:16:41,649 WARNING [train.py:1204] (2/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_70-27732-0 from training. Duration: 0.94 2023-10-02 12:16:46,525 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.conv_module2.balancer1.min_positive, batch_count=876473.3333333334, ans=0.025 2023-10-02 12:16:47,727 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_397-132565-0 from training. Duration: 0.99 2023-10-02 12:16:47,730 WARNING [train.py:1204] (2/4) Exclude cut with ID _49728_210_12651_1_1534041034294_6788150_624-195301-0 from training. Duration: 0.85 2023-10-02 12:16:47,756 WARNING [train.py:1204] (2/4) Exclude cut with ID _55429_210_10626_1_1534467588527_7174143_377-169391-0_sp1.1 from training. Duration: 0.87275 2023-10-02 12:16:49,149 WARNING [train.py:1204] (2/4) Exclude cut with ID _49376_210_5986_1_1533985278732_3370740_354-344695-0_sp1.1 from training. Duration: 0.9 2023-10-02 12:16:55,196 INFO [scaling.py:1022] (2/4) Whitening: name=encoder_embed.convnext.out_whiten, num_groups=1, num_channels=128, metric=4.46 vs. limit=5.0 2023-10-02 12:16:55,990 WARNING [train.py:1204] (2/4) Exclude cut with ID _4697_210_2069_1_1526815801953_3823098_276-96593-0_sp1.1 from training. Duration: 0.7 2023-10-02 12:16:57,234 WARNING [train.py:1204] (2/4) Exclude cut with ID _15012_210_1968_1_1530856820429_5095350_688-178849-0_sp0.9 from training. Duration: 0.711125 2023-10-02 12:16:57,287 WARNING [train.py:1204] (2/4) Exclude cut with ID _25565_210_12392_1_1532423009382_3952098_372-69023-0_sp1.1 from training. Duration: 0.936375 2023-10-02 12:16:57,314 WARNING [train.py:1204] (2/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_61-327062-0_sp1.1 from training. Duration: 0.736375 2023-10-02 12:16:57,348 WARNING [train.py:1204] (2/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_215-141059-0 from training. Duration: 0.62 2023-10-02 12:17:02,734 WARNING [train.py:1204] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_6-226750-0_sp1.1 from training. Duration: 0.8545625 2023-10-02 12:17:04,486 WARNING [train.py:1204] (2/4) Exclude cut with ID _14348_210_8341_1_1530599434996_3778640_240-232370-0_sp1.1 from training. Duration: 0.763625 2023-10-02 12:17:08,499 WARNING [train.py:1204] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_468-189058-0 from training. Duration: 0.96 2023-10-02 12:17:13,733 INFO [train.py:1046] (2/4) Epoch 25, batch 4000, loss[loss=0.1591, simple_loss=0.2412, pruned_loss=0.03847, over 24636.00 frames. ], tot_loss[loss=0.1686, simple_loss=0.2455, pruned_loss=0.04584, over 4708475.10 frames. ], batch size: 65, lr: 4.08e-03, grad_scale: 16.0 2023-10-02 12:17:18,136 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.ff2_skip_rate, batch_count=876606.6666666666, ans=0.0 2023-10-02 12:17:19,589 WARNING [train.py:1204] (2/4) Exclude cut with ID _21987_210_5854_1_1531821500482_3637038_247-93340-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 12:17:27,202 WARNING [train.py:1204] (2/4) Exclude cut with ID _66868_210_9527_1_1535509287954_4561140_436-191602-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 12:17:30,619 WARNING [train.py:1204] (2/4) Exclude cut with ID _18895_210_6228_1_1531357274234_3784930_394-58203-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 12:17:30,656 WARNING [train.py:1204] (2/4) Exclude cut with ID _58605_210_12210_1_1534723195939_3904259_258-281498-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 12:17:32,002 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_33348_210_5632_1_1533086912_3324030_245-292752-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 12:17:32,027 WARNING [train.py:1204] (2/4) Exclude cut with ID _67831_210_12392_1_1535612464202_3092612_346-2148-0 from training. Duration: 0.93 2023-10-02 12:17:33,400 WARNING [train.py:1204] (2/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_369-222060-0_sp0.9 from training. Duration: 0.788875 2023-10-02 12:17:33,465 WARNING [train.py:1204] (2/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_245-174553-0 from training. Duration: 0.88 2023-10-02 12:17:34,696 WARNING [train.py:1204] (2/4) Exclude cut with ID _3541_210_2477_1_1526475408709_3263973_231-104736-0_sp1.1 from training. Duration: 0.7090625 2023-10-02 12:17:34,702 WARNING [train.py:1204] (2/4) Exclude cut with ID _44585_210_18628_1_1533605350387_4518199_280-73534-0 from training. Duration: 0.89 2023-10-02 12:17:34,865 WARNING [train.py:1204] (2/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_22-201476-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 12:17:37,036 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.bypass.skip_rate, batch_count=876673.3333333334, ans=0.09899494936611666 2023-10-02 12:17:37,939 WARNING [train.py:1204] (2/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_325-62802-0_sp0.9 from training. Duration: 0.9666875 2023-10-02 12:17:37,957 WARNING [train.py:1204] (2/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_199-274062-0_sp1.1 from training. Duration: 0.97275 2023-10-02 12:17:37,960 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_8-319957-0_sp1.1 from training. Duration: 0.87275 2023-10-02 12:17:39,316 WARNING [train.py:1204] (2/4) Exclude cut with ID _29343_210_4381_1_1532602757752_3674109_224-349726-0_sp1.1 from training. Duration: 0.936375 2023-10-02 12:17:39,320 WARNING [train.py:1204] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_520-126729-0_sp1.1 from training. Duration: 0.636375 2023-10-02 12:17:40,755 WARNING [train.py:1204] (2/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_239-22076-0_sp1.1 from training. Duration: 0.77275 2023-10-02 12:17:42,100 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9012W0011-501146 from training. Duration: 0.896 2023-10-02 12:17:42,189 WARNING [train.py:1204] (2/4) Exclude cut with ID _29857_210_20034_1_1532395886753_3798160_279-185801-0_sp1.1 from training. Duration: 0.82725 2023-10-02 12:17:43,562 WARNING [train.py:1204] (2/4) Exclude cut with ID _43552_210_8913_1_1533726031059_3523429_261-223933-0_sp1.1 from training. Duration: 0.92725 2023-10-02 12:17:44,959 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9029W0025-509426 from training. Duration: 0.896 2023-10-02 12:17:45,019 WARNING [train.py:1204] (2/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_337-181502-0_sp1.1 from training. Duration: 0.5909375 2023-10-02 12:17:46,313 WARNING [train.py:1204] (2/4) Exclude cut with ID _68635_210_9317_1_1535770813410_4315870_159-332599-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 12:17:47,072 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.3.encoder.layers.0.nonlin_attention.whiten2, num_groups=1, num_channels=512, metric=7.97 vs. limit=15.0 2023-10-02 12:17:52,528 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_161-276996-0 from training. Duration: 0.95 2023-10-02 12:17:52,579 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7088_210_1874_1_1527939776_1101113_127-68870-0_sp1.1 from training. Duration: 0.936375 2023-10-02 12:17:55,474 WARNING [train.py:1204] (2/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_322-217220-0_sp1.1 from training. Duration: 0.7818125 2023-10-02 12:17:57,281 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9072W0002-530636 from training. Duration: 0.768 2023-10-02 12:17:57,383 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_340-336408-0_sp1.1 from training. Duration: 0.8454375 2023-10-02 12:17:58,722 WARNING [train.py:1204] (2/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_395-37222-0 from training. Duration: 0.76 2023-10-02 12:17:58,725 WARNING [train.py:1204] (2/4) Exclude cut with ID _64099_210_17301_1_1535526977656_1578188_159-209156-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 12:18:00,095 WARNING [train.py:1204] (2/4) Exclude cut with ID _53950_210_5816_1_1534417264919_3862951_28-29681-0_sp1.1 from training. Duration: 0.92725 2023-10-02 12:18:01,455 WARNING [train.py:1204] (2/4) Exclude cut with ID _4681_210_3525_1_1527250689464_120889_15-126760-0_sp1.1 from training. Duration: 0.736375 2023-10-02 12:18:02,779 WARNING [train.py:1204] (2/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_242-3472-0_sp1.1 from training. Duration: 0.663625 2023-10-02 12:18:02,824 WARNING [train.py:1204] (2/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_436-186311-0_sp0.9 from training. Duration: 0.72225 2023-10-02 12:18:04,218 WARNING [train.py:1204] (2/4) Exclude cut with ID _3675_210_4557_1_1526639934408_4182089_452-172147-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 12:18:05,456 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.518e+02 1.899e+02 2.057e+02 2.287e+02 3.112e+02, threshold=4.114e+02, percent-clipped=0.0 2023-10-02 12:18:07,378 WARNING [train.py:1204] (2/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_80-147301-0 from training. Duration: 0.84 2023-10-02 12:18:07,401 WARNING [train.py:1204] (2/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_351-107588-0_sp1.1 from training. Duration: 0.92725 2023-10-02 12:18:10,007 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9111W0002-550781 from training. Duration: 0.896 2023-10-02 12:18:10,221 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.0.ff2_skip_rate, batch_count=876806.6666666666, ans=0.0 2023-10-02 12:18:14,113 WARNING [train.py:1204] (2/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_465-156500-0_sp1.1 from training. Duration: 0.6545625 2023-10-02 12:18:16,878 WARNING [train.py:1204] (2/4) Exclude cut with ID _13938_210_9988_1_1530518718978_3693960_304-112784-0_sp1.1 from training. Duration: 0.4181875 2023-10-02 12:18:19,865 WARNING [train.py:1204] (2/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_202-67017-0_sp1.1 from training. Duration: 0.7454375 2023-10-02 12:18:19,906 WARNING [train.py:1204] (2/4) Exclude cut with ID _58414_210_8254_1_1535421609162_7151182_448-4358-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 12:18:19,986 WARNING [train.py:1204] (2/4) Exclude cut with ID _66840_210_5219_1_1535452168426_4196349_203-243234-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 12:18:21,415 WARNING [train.py:1204] (2/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_201-213513-0_sp1.1 from training. Duration: 0.97275 2023-10-02 12:18:24,404 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.conv_module1.balancer2.min_abs, batch_count=876873.3333333334, ans=0.5 2023-10-02 12:18:25,562 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6291_210_3273_1_1527683649_1078498_117-187560-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 12:18:26,800 INFO [train.py:1046] (2/4) Epoch 25, batch 4050, loss[loss=0.1788, simple_loss=0.2512, pruned_loss=0.05321, over 23460.00 frames. ], tot_loss[loss=0.1688, simple_loss=0.2459, pruned_loss=0.04587, over 4715369.43 frames. ], batch size: 285, lr: 4.08e-03, grad_scale: 16.0 2023-10-02 12:18:28,866 WARNING [train.py:1204] (2/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_135-174080-0_sp0.9 from training. Duration: 0.588875 2023-10-02 12:18:30,243 WARNING [train.py:1204] (2/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_45-265684-0 from training. Duration: 0.72 2023-10-02 12:18:31,624 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_435-313305-0_sp1.1 from training. Duration: 0.6454375 2023-10-02 12:18:31,651 WARNING [train.py:1204] (2/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_235-307337-0_sp1.1 from training. Duration: 0.8545625 2023-10-02 12:18:32,980 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7762_210_1794_1_1528199508_4057408_419-125009-0_sp0.9 from training. Duration: 0.92225 2023-10-02 12:18:34,354 WARNING [train.py:1204] (2/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_215-141059-0_sp1.1 from training. Duration: 0.563625 2023-10-02 12:18:35,616 WARNING [train.py:1204] (2/4) Exclude cut with ID _27013_210_1389_1_1532437173240_3252649_222-125573-0_sp1.1 from training. Duration: 0.8909375 2023-10-02 12:18:40,443 WARNING [train.py:1204] (2/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_126-274694-0_sp1.1 from training. Duration: 0.8909375 2023-10-02 12:18:41,896 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_2592_210_2290_1_1525697025_4882925_157-70512-0_sp1.1 from training. Duration: 0.82725 2023-10-02 12:18:43,237 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_292-265928-0_sp0.9 from training. Duration: 0.5666875 2023-10-02 12:18:45,881 WARNING [train.py:1204] (2/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_1211-226066-0_sp1.1 from training. Duration: 0.6909375 2023-10-02 12:18:46,080 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.attention_skip_rate, batch_count=877006.6666666666, ans=0.0 2023-10-02 12:18:47,211 WARNING [train.py:1204] (2/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_394-265747-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 12:18:51,948 WARNING [train.py:1204] (2/4) Exclude cut with ID _48782_210_12658_1_1534467701544_2300422_239-67139-0_sp1.1 from training. Duration: 0.97275 2023-10-02 12:18:53,332 WARNING [train.py:1204] (2/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_329-113173-0_sp1.1 from training. Duration: 0.563625 2023-10-02 12:18:54,824 WARNING [train.py:1204] (2/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_81-75849-0_sp0.9 from training. Duration: 0.4333125 2023-10-02 12:18:54,878 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder_embed.conv.5.prob, batch_count=877073.3333333334, ans=0.125 2023-10-02 12:18:56,422 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.attention_skip_rate, batch_count=877073.3333333334, ans=0.0 2023-10-02 12:18:57,563 WARNING [train.py:1204] (2/4) Exclude cut with ID _9235_210_5219_1_1528891154253_8046120_1003-101638-0 from training. Duration: 0.73 2023-10-02 12:18:57,595 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9309W0189-640316 from training. Duration: 0.64 2023-10-02 12:18:59,142 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.self_attn_weights.pos_emb_skip_rate, batch_count=877073.3333333334, ans=0.0 2023-10-02 12:19:00,699 WARNING [train.py:1204] (2/4) Exclude cut with ID _45329_210_7005_1_1534157951211_6774339_503-287883-0_sp1.1 from training. Duration: 0.8 2023-10-02 12:19:08,055 WARNING [train.py:1204] (2/4) Exclude cut with ID _8833_210_5219_1_1528592665210_7553059_611-80313-0 from training. Duration: 0.64 2023-10-02 12:19:08,127 WARNING [train.py:1204] (2/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_335-106626-0_sp1.1 from training. Duration: 0.963625 2023-10-02 12:19:10,950 WARNING [train.py:1204] (2/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_237-44332-0_sp1.1 from training. Duration: 0.8545625 2023-10-02 12:19:14,412 WARNING [train.py:1204] (2/4) Exclude cut with ID _46952_210_5986_1_1534230010357_3107379_178-147341-0_sp1.1 from training. Duration: 0.97275 2023-10-02 12:19:14,918 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.4.encoder.layers.1.self_attn_weights.whiten_keys, num_groups=4, num_channels=128, metric=2.41 vs. limit=6.0 2023-10-02 12:19:15,671 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_13091_210_7925_1_1530609447_8987354_946-154426-0_sp1.1 from training. Duration: 0.936375 2023-10-02 12:19:15,695 WARNING [train.py:1204] (2/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_326-17061-0_sp1.1 from training. Duration: 0.8545625 2023-10-02 12:19:17,090 WARNING [train.py:1204] (2/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_38-169920-0_sp1.1 from training. Duration: 0.82725 2023-10-02 12:19:20,302 WARNING [train.py:1204] (2/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_283-220372-0 from training. Duration: 0.56 2023-10-02 12:19:20,317 WARNING [train.py:1204] (2/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_252-216674-0_sp1.1 from training. Duration: 0.4909375 2023-10-02 12:19:21,706 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_202-91290-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 12:19:24,332 WARNING [train.py:1204] (2/4) Exclude cut with ID _41314_210_12651_1_1533459476543_3766460_80-74184-0 from training. Duration: 0.83 2023-10-02 12:19:27,135 WARNING [train.py:1204] (2/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_411-271358-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 12:19:34,426 WARNING [train.py:1204] (2/4) Exclude cut with ID _68777_210_4353_1_1535810390731_3117904_129-166012-0 from training. Duration: 0.85 2023-10-02 12:19:36,292 WARNING [train.py:1204] (2/4) Exclude cut with ID _44982_210_1511_1_1534312820108_3726029_410-109759-0_sp1.1 from training. Duration: 0.963625 2023-10-02 12:19:36,305 WARNING [train.py:1204] (2/4) Exclude cut with ID _14224_210_4381_1_1530529062037_4264010_364-22817-0_sp1.1 from training. Duration: 0.6454375 2023-10-02 12:19:39,116 WARNING [train.py:1204] (2/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_89-242335-0 from training. Duration: 0.95 2023-10-02 12:19:39,123 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_12868_210_6753_1_1530702479_3610564_151-325626-0 from training. Duration: 0.97 2023-10-02 12:19:39,124 WARNING [train.py:1204] (2/4) Exclude cut with ID _6275_210_2084_1_1528110062213_7669049_786-264332-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 12:19:40,393 INFO [train.py:1046] (2/4) Epoch 25, batch 4100, loss[loss=0.1589, simple_loss=0.2396, pruned_loss=0.0391, over 24488.00 frames. ], tot_loss[loss=0.1692, simple_loss=0.2465, pruned_loss=0.04597, over 4720460.82 frames. ], batch size: 63, lr: 4.08e-03, grad_scale: 8.0 2023-10-02 12:19:40,573 WARNING [train.py:1204] (2/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_35-153886-0_sp1.1 from training. Duration: 0.763625 2023-10-02 12:19:42,470 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_24007_210_1794_1_1531918925_1663875_116-100916-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 12:19:42,497 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_162-262717-0_sp1.1 from training. Duration: 0.7545625 2023-10-02 12:19:48,064 WARNING [train.py:1204] (2/4) Exclude cut with ID _8263_210_1664_1_1528438953494_294100_40-204731-0 from training. Duration: 0.5 2023-10-02 12:19:49,557 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_12868_210_6753_1_1530702479_3610564_416-293746-0 from training. Duration: 0.9 2023-10-02 12:19:51,477 WARNING [train.py:1204] (2/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_605-219732-0 from training. Duration: 0.94 2023-10-02 12:19:54,168 WARNING [train.py:1204] (2/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_454-123002-0 from training. Duration: 0.69 2023-10-02 12:19:54,174 WARNING [train.py:1204] (2/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_25-304273-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 12:19:55,558 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6860_210_4852_1_1527917457_3833327_451-39702-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 12:19:55,583 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_304-16505-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 12:19:55,595 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6291_210_3273_1_1527683649_1078498_4-266348-0_sp1.1 from training. Duration: 0.7090625 2023-10-02 12:19:55,677 WARNING [train.py:1204] (2/4) Exclude cut with ID ID0149W0001-753701 from training. Duration: 0.989 2023-10-02 12:19:57,309 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.conv_module2.balancer2.prob, batch_count=877340.0, ans=0.125 2023-10-02 12:19:58,460 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_41251_210_15368_1_1533429767_4820695_394-55393-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 12:19:59,821 WARNING [train.py:1204] (2/4) Exclude cut with ID _8793_210_6310_1_1528542985139_2310009_121-179782-0_sp1.1 from training. Duration: 0.72725 2023-10-02 12:19:59,841 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_2729_210_3528_1_1525863505_3882214_421-73047-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 12:19:59,897 WARNING [train.py:1204] (2/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_278-154492-0_sp0.9 from training. Duration: 0.8444375 2023-10-02 12:20:04,529 WARNING [train.py:1204] (2/4) Exclude cut with ID _14170_210_5219_1_1530525949868_4469650_412-317077-0_sp0.9 from training. Duration: 0.8333125 2023-10-02 12:20:05,846 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_727-138317-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 12:20:05,895 WARNING [train.py:1204] (2/4) Exclude cut with ID _25808_210_13292_1_1532343514562_7366109_345-335048-0_sp1.1 from training. Duration: 0.92725 2023-10-02 12:20:05,916 WARNING [train.py:1204] (2/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_506-23200-0 from training. Duration: 0.97 2023-10-02 12:20:07,608 WARNING [train.py:1204] (2/4) Exclude cut with ID _44718_210_5816_1_1534050051869_6469030_643-245187-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 12:20:07,613 WARNING [train.py:1204] (2/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_42-186162-0_sp1.1 from training. Duration: 0.836375 2023-10-02 12:20:07,625 WARNING [train.py:1204] (2/4) Exclude cut with ID _3231_210_2106_1_1526205864934_6784220_171-256004-0_sp1.1 from training. Duration: 0.7818125 2023-10-02 12:20:07,655 WARNING [train.py:1204] (2/4) Exclude cut with ID _25914_210_6400_1_1532141317058_644740_43-248478-0_sp1.1 from training. Duration: 0.936375 2023-10-02 12:20:07,900 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.bypass_mid.scale_min, batch_count=877340.0, ans=0.2 2023-10-02 12:20:09,101 WARNING [train.py:1204] (2/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_577-287150-0 from training. Duration: 0.82 2023-10-02 12:20:13,545 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_46015_210_7234_1_1533863319_3087837_376-276068-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 12:20:14,907 WARNING [train.py:1204] (2/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_51-200220-0 from training. Duration: 0.56 2023-10-02 12:20:16,221 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_452-96101-0_sp1.1 from training. Duration: 0.963625 2023-10-02 12:20:17,616 WARNING [train.py:1204] (2/4) Exclude cut with ID _13252_210_1925_1_1530671372047_3336909_263-168388-0_sp1.1 from training. Duration: 0.7818125 2023-10-02 12:20:17,617 WARNING [train.py:1204] (2/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_469-94673-0 from training. Duration: 0.79 2023-10-02 12:20:17,877 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.ff2_skip_rate, batch_count=877406.6666666666, ans=0.0 2023-10-02 12:20:18,950 WARNING [train.py:1204] (2/4) Exclude cut with ID _14922_210_6228_1_1530707435273_3693310_303-228738-0_sp1.1 from training. Duration: 0.763625 2023-10-02 12:20:18,991 WARNING [train.py:1204] (2/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_262-250826-0_sp1.1 from training. Duration: 0.9 2023-10-02 12:20:20,655 WARNING [train.py:1204] (2/4) Exclude cut with ID _68777_210_4353_1_1535810390731_3117904_129-166012-0_sp1.1 from training. Duration: 0.77275 2023-10-02 12:20:22,137 WARNING [train.py:1204] (2/4) Exclude cut with ID _58895_210_8750_1_1535184081320_3649089_265-323810-0 from training. Duration: 0.96 2023-10-02 12:20:23,535 WARNING [train.py:1204] (2/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_120-76843-0_sp0.9 from training. Duration: 0.611125 2023-10-02 12:20:23,586 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7544_210_6753_1_1528587722_3576686_295-258479-0_sp0.9 from training. Duration: 0.9333125 2023-10-02 12:20:25,045 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7046_210_6753_1_1527895337_6920917_783-278109-0 from training. Duration: 0.77 2023-10-02 12:20:26,239 WARNING [train.py:1204] (2/4) Exclude cut with ID _42326_210_7770_1_1533814282531_3707269_113-308323-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 12:20:26,277 WARNING [train.py:1204] (2/4) Exclude cut with ID _7852_210_5219_1_1528286471316_8139400_242-105336-0_sp0.9 from training. Duration: 0.8 2023-10-02 12:20:30,121 WARNING [train.py:1204] (2/4) Exclude cut with ID _56235_210_10640_1_1534557335235_4161099_114-277905-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 12:20:33,235 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.518e+02 1.821e+02 2.033e+02 2.340e+02 3.969e+02, threshold=4.066e+02, percent-clipped=0.0 2023-10-02 12:20:35,444 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.2.encoder.layers.2.self_attn_weights.whiten_keys, num_groups=4, num_channels=128, metric=3.22 vs. limit=6.0 2023-10-02 12:20:36,098 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_13118_210_7925_1_1530755612_7608297_42-66683-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 12:20:38,878 WARNING [train.py:1204] (2/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_901-79438-0_sp1.1 from training. Duration: 0.8545625 2023-10-02 12:20:38,923 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_30280_210_1881_1_1532872779_3660139_211-182194-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 12:20:45,721 WARNING [train.py:1204] (2/4) Exclude cut with ID _14898_210_9206_1_1530766812377_4177190_621-45732-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 12:20:45,728 WARNING [train.py:1204] (2/4) Exclude cut with ID _35350_210_16040_1_1533040191183_4075299_224-133775-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 12:20:48,640 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.balancer1.prob, batch_count=877540.0, ans=0.125 2023-10-02 12:20:50,387 WARNING [train.py:1204] (2/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_147-293311-0_sp1.1 from training. Duration: 0.8545625 2023-10-02 12:20:53,194 WARNING [train.py:1204] (2/4) Exclude cut with ID _12302_210_5219_1_1529924446466_8107280_835-279754-0_sp1.1 from training. Duration: 0.8181875 2023-10-02 12:20:54,537 INFO [train.py:1046] (2/4) Epoch 25, batch 4150, loss[loss=0.1419, simple_loss=0.2217, pruned_loss=0.03111, over 24330.00 frames. ], tot_loss[loss=0.1694, simple_loss=0.2466, pruned_loss=0.04613, over 4714813.30 frames. ], batch size: 56, lr: 4.08e-03, grad_scale: 8.0 2023-10-02 12:20:57,384 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8453_210_4211_1_1528502940_8562730_347-330673-0_sp0.9 from training. Duration: 0.8 2023-10-02 12:20:57,464 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_213-103282-0_sp0.9 from training. Duration: 0.8666875 2023-10-02 12:20:58,785 WARNING [train.py:1204] (2/4) Exclude cut with ID _49728_210_12651_1_1534041034294_6788150_66-43496-0_sp1.1 from training. Duration: 0.97275 2023-10-02 12:20:58,787 WARNING [train.py:1204] (2/4) Exclude cut with ID _19023_210_4353_1_1531378892552_5820780_28-36615-0_sp1.1 from training. Duration: 0.8818125 2023-10-02 12:21:02,048 WARNING [train.py:1204] (2/4) Exclude cut with ID _14349_210_8341_1_1530615710818_3442470_115-285975-0 from training. Duration: 0.86 2023-10-02 12:21:02,068 WARNING [train.py:1204] (2/4) Exclude cut with ID _12734_210_8341_1_1530183614296_3891929_236-47464-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 12:21:03,423 WARNING [train.py:1204] (2/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_177-236809-0 from training. Duration: 0.49 2023-10-02 12:21:04,723 WARNING [train.py:1204] (2/4) Exclude cut with ID _13678_210_7497_1_1530530069226_3463170_371-3221-0 from training. Duration: 0.9 2023-10-02 12:21:04,737 WARNING [train.py:1204] (2/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_48-338976-0 from training. Duration: 0.89 2023-10-02 12:21:06,074 WARNING [train.py:1204] (2/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_1167-309744-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 12:21:10,266 WARNING [train.py:1204] (2/4) Exclude cut with ID _30327_210_16049_1_1532484161295_3924169_401-70819-0_sp0.9 from training. Duration: 0.9555625 2023-10-02 12:21:10,275 WARNING [train.py:1204] (2/4) Exclude cut with ID _55134_210_21982_1_1534590028377_3882170_346-91022-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 12:21:14,073 WARNING [train.py:1204] (2/4) Exclude cut with ID _14459_210_2228_1_1531443641946_7516700_174-118801-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 12:21:14,151 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_269-278006-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 12:21:14,325 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.nonlin_attention.balancer.prob, batch_count=877673.3333333334, ans=0.125 2023-10-02 12:21:15,491 WARNING [train.py:1204] (2/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_390-339137-0_sp0.9 from training. Duration: 0.62225 2023-10-02 12:21:18,218 WARNING [train.py:1204] (2/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_253-308377-0_sp1.1 from training. Duration: 0.4454375 2023-10-02 12:21:18,229 WARNING [train.py:1204] (2/4) Exclude cut with ID _23825_210_13449_1_1531915313526_3497150_240-271367-0_sp1.1 from training. Duration: 0.8818125 2023-10-02 12:21:19,559 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1015-343884-0_sp0.9 from training. Duration: 0.688875 2023-10-02 12:21:24,257 WARNING [train.py:1204] (2/4) Exclude cut with ID _23514_210_1389_1_1531832139785_3031439_335-48210-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 12:21:27,248 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.conv_module2.balancer1.min_positive, batch_count=877740.0, ans=0.025 2023-10-02 12:21:28,475 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_309-65177-0_sp1.1 from training. Duration: 0.8 2023-10-02 12:21:28,545 WARNING [train.py:1204] (2/4) Exclude cut with ID _14368_210_4220_1_1530532714870_3595529_231-137892-0 from training. Duration: 0.99 2023-10-02 12:21:29,952 WARNING [train.py:1204] (2/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_57-270037-0 from training. Duration: 0.77 2023-10-02 12:21:29,956 WARNING [train.py:1204] (2/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_465-214726-0_sp1.1 from training. Duration: 0.7818125 2023-10-02 12:21:31,298 WARNING [train.py:1204] (2/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_106-300985-0 from training. Duration: 0.9 2023-10-02 12:21:31,304 WARNING [train.py:1204] (2/4) Exclude cut with ID _14632_210_9988_1_1530691279392_3923200_161-129530-0_sp1.1 from training. Duration: 0.87275 2023-10-02 12:21:31,316 WARNING [train.py:1204] (2/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_514-88370-0_sp1.1 from training. Duration: 0.963625 2023-10-02 12:21:34,646 WARNING [train.py:1204] (2/4) Exclude cut with ID _56495_210_4234_1_1534748080334_4136219_305-334505-0_sp1.1 from training. Duration: 0.92725 2023-10-02 12:21:35,963 WARNING [train.py:1204] (2/4) Exclude cut with ID _42875_210_14856_1_1533704397487_4012433_61-264914-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 12:21:38,829 WARNING [train.py:1204] (2/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_91-148831-0 from training. Duration: 0.99 2023-10-02 12:21:41,521 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.3.encoder.layers.3.self_attn1.whiten, num_groups=1, num_channels=512, metric=16.24 vs. limit=22.5 2023-10-02 12:21:42,143 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_435-313305-0_sp0.9 from training. Duration: 0.788875 2023-10-02 12:21:43,607 WARNING [train.py:1204] (2/4) Exclude cut with ID _16744_210_5986_1_1531724405384_3907567_242-111316-0_sp1.1 from training. Duration: 0.8454375 2023-10-02 12:21:43,679 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_257-20733-0 from training. Duration: 0.76 2023-10-02 12:21:45,586 WARNING [train.py:1204] (2/4) Exclude cut with ID _8064_210_5399_1_1528372299291_4282973_4-91037-0_sp1.1 from training. Duration: 0.8 2023-10-02 12:21:45,678 WARNING [train.py:1204] (2/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_3-276701-0 from training. Duration: 0.91 2023-10-02 12:21:48,455 WARNING [train.py:1204] (2/4) Exclude cut with ID _3992_210_1747_1_1526644816031_3731060_36-147855-0_sp1.1 from training. Duration: 0.7090625 2023-10-02 12:21:49,835 WARNING [train.py:1204] (2/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_1296-298671-0_sp1.1 from training. Duration: 0.963625 2023-10-02 12:21:51,603 WARNING [train.py:1204] (2/4) Exclude cut with ID _68965_210_1883_1_1535698962272_3497490_309-334593-0_sp1.1 from training. Duration: 0.92725 2023-10-02 12:21:52,980 WARNING [train.py:1204] (2/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_128-296581-0 from training. Duration: 0.74 2023-10-02 12:21:52,980 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_17613_210_2151_1_1531137569_2366160_170-299079-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 12:21:52,983 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6287_210_3273_1_1527509506_2653716_182-199887-0_sp1.1 from training. Duration: 0.463625 2023-10-02 12:21:53,127 WARNING [train.py:1204] (2/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_80-135200-0_sp1.1 from training. Duration: 0.7181875 2023-10-02 12:21:55,946 WARNING [train.py:1204] (2/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_34-183685-0 from training. Duration: 0.98 2023-10-02 12:21:55,969 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8278_210_2550_1_1528637099_4220413_418-210155-0_sp1.1 from training. Duration: 0.92725 2023-10-02 12:21:55,973 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8853_210_3273_1_1528633899_818968_51-187685-0_sp0.9 from training. Duration: 0.7666875 2023-10-02 12:21:55,993 WARNING [train.py:1204] (2/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_332-221057-0_sp1.1 from training. Duration: 0.6181875 2023-10-02 12:21:56,063 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_2221_210_1988_1_1525597051_4935541_415-296882-0 from training. Duration: 0.77 2023-10-02 12:21:57,363 WARNING [train.py:1204] (2/4) Exclude cut with ID _33640_210_2655_1_1533117605479_4855469_770-278185-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 12:21:57,405 WARNING [train.py:1204] (2/4) Exclude cut with ID _5287_210_2084_1_1527418883040_3878670_333-48508-0_sp1.1 from training. Duration: 0.5454375 2023-10-02 12:21:57,642 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=877873.3333333334, ans=0.1 2023-10-02 12:21:59,274 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_43802_210_2290_1_1533516203_8829504_142-276839-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 12:22:00,665 WARNING [train.py:1204] (2/4) Exclude cut with ID _28394_210_8913_1_1532430049518_3650118_126-139371-0_sp1.1 from training. Duration: 0.92725 2023-10-02 12:22:01,895 WARNING [train.py:1204] (2/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_76-203867-0 from training. Duration: 0.72 2023-10-02 12:22:01,930 WARNING [train.py:1204] (2/4) Exclude cut with ID _6194_210_1534_1_1527511826160_3941194_14-282876-0_sp0.9 from training. Duration: 0.788875 2023-10-02 12:22:02,097 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.0.feed_forward1.out_proj.dropout_p, batch_count=877873.3333333334, ans=0.1 2023-10-02 12:22:06,111 WARNING [train.py:1204] (2/4) Exclude cut with ID _35355_210_16040_1_1533212988977_4016229_74-190076-0_sp1.1 from training. Duration: 0.72725 2023-10-02 12:22:08,733 INFO [train.py:1046] (2/4) Epoch 25, batch 4200, loss[loss=0.1656, simple_loss=0.2487, pruned_loss=0.04123, over 24606.00 frames. ], tot_loss[loss=0.1691, simple_loss=0.2456, pruned_loss=0.04627, over 4715427.94 frames. ], batch size: 68, lr: 4.08e-03, grad_scale: 8.0 2023-10-02 12:22:08,821 WARNING [train.py:1204] (2/4) Exclude cut with ID _33920_210_4220_1_1533257851500_3084399_198-334063-0 from training. Duration: 0.94 2023-10-02 12:22:10,880 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6291_210_3273_1_1527683649_1078498_4-266348-0_sp0.9 from training. Duration: 0.8666875 2023-10-02 12:22:10,966 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder_embed.conv.8.prob, batch_count=877940.0, ans=0.125 2023-10-02 12:22:12,160 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_23856_210_4238_1_1531994785_3087934_122-38075-0_sp1.1 from training. Duration: 0.936375 2023-10-02 12:22:13,581 WARNING [train.py:1204] (2/4) Exclude cut with ID _4697_210_2069_1_1526815801953_3823098_276-96593-0_sp0.9 from training. Duration: 0.8555625 2023-10-02 12:22:15,437 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_308-278734-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 12:22:15,439 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_5190_210_4557_1_1527245078_2965646_353-250603-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 12:22:18,657 WARNING [train.py:1204] (2/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_472-216663-0 from training. Duration: 0.92 2023-10-02 12:22:18,870 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.conv_module2.balancer1.prob, batch_count=877940.0, ans=0.125 2023-10-02 12:22:19,750 INFO [scaling.py:1022] (2/4) Whitening: name=encoder_embed.convnext.out_whiten, num_groups=1, num_channels=128, metric=4.54 vs. limit=5.0 2023-10-02 12:22:21,401 WARNING [train.py:1204] (2/4) Exclude cut with ID _7537_210_2084_1_1528502395431_3924359_14-312906-0 from training. Duration: 0.84 2023-10-02 12:22:22,694 WARNING [train.py:1204] (2/4) Exclude cut with ID _29003_210_19596_1_1532507391720_3894098_559-236660-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 12:22:24,112 WARNING [train.py:1204] (2/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_312-204798-0_sp1.1 from training. Duration: 0.8454375 2023-10-02 12:22:28,155 WARNING [train.py:1204] (2/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_17-309070-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 12:22:29,872 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.conv_module1.balancer2.min_positive, batch_count=878006.6666666666, ans=0.05 2023-10-02 12:22:30,951 WARNING [train.py:1204] (2/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_344-152526-0_sp0.9 from training. Duration: 0.588875 2023-10-02 12:22:32,436 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_41452_210_7291_1_1533437687_3941281_405-11559-0_sp1.1 from training. Duration: 0.82725 2023-10-02 12:22:34,363 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_48730_210_5399_1_1533885886_2212769_157-124052-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 12:22:34,412 WARNING [train.py:1204] (2/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_415-78792-0 from training. Duration: 0.81 2023-10-02 12:22:34,416 WARNING [train.py:1204] (2/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_345-340690-0_sp1.1 from training. Duration: 0.8454375 2023-10-02 12:22:35,856 WARNING [train.py:1204] (2/4) Exclude cut with ID _23825_210_13449_1_1531915313526_3497150_124-8138-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 12:22:35,897 WARNING [train.py:1204] (2/4) Exclude cut with ID _49258_210_7994_1_1534122017863_3738861_194-24864-0_sp1.1 from training. Duration: 0.936375 2023-10-02 12:22:37,231 WARNING [train.py:1204] (2/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_369-222060-0_sp1.1 from training. Duration: 0.6454375 2023-10-02 12:22:38,564 WARNING [train.py:1204] (2/4) Exclude cut with ID _12369_210_4381_1_1530356078581_4388520_480-13146-0_sp0.9 from training. Duration: 0.9444375 2023-10-02 12:22:39,990 WARNING [train.py:1204] (2/4) Exclude cut with ID _13253_210_1925_1_1530757775819_3577873_191-144977-0 from training. Duration: 0.78 2023-10-02 12:22:40,015 WARNING [train.py:1204] (2/4) Exclude cut with ID _30304_210_6319_1_1532476827261_7582060_1128-140266-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 12:22:44,108 WARNING [train.py:1204] (2/4) Exclude cut with ID _8772_210_1850_1_1528635595972_3664011_247-336448-0_sp0.9 from training. Duration: 0.82225 2023-10-02 12:22:44,205 WARNING [train.py:1204] (2/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_318-59097-0_sp1.1 from training. Duration: 0.6909375 2023-10-02 12:22:45,736 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.0.layers.1.self_attn1.whiten, num_groups=1, num_channels=192, metric=10.90 vs. limit=22.5 2023-10-02 12:22:46,239 WARNING [train.py:1204] (2/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_540-131164-0_sp1.1 from training. Duration: 0.836375 2023-10-02 12:22:49,291 WARNING [train.py:1204] (2/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_39-110836-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 12:22:51,244 WARNING [train.py:1204] (2/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_860-96198-0_sp1.1 from training. Duration: 0.663625 2023-10-02 12:22:51,256 WARNING [train.py:1204] (2/4) Exclude cut with ID _15133_210_6228_1_1530794512074_3080379_29-264458-0 from training. Duration: 0.58 2023-10-02 12:22:52,415 WARNING [train.py:1204] (2/4) Exclude cut with ID _55209_210_1531_1_1534675828996_4597160_797-348343-0_sp1.1 from training. Duration: 0.963625 2023-10-02 12:22:52,518 WARNING [train.py:1204] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_23-189234-0_sp1.1 from training. Duration: 0.7545625 2023-10-02 12:22:57,845 WARNING [train.py:1204] (2/4) Exclude cut with ID _5410_210_1388_1_1527397055795_1719020_39-112221-0_sp1.1 from training. Duration: 0.62725 2023-10-02 12:22:59,269 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_43802_210_2290_1_1533516203_8829504_100-348441-0_sp1.1 from training. Duration: 0.82725 2023-10-02 12:23:01,935 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.559e+02 1.878e+02 2.060e+02 2.306e+02 3.122e+02, threshold=4.120e+02, percent-clipped=0.0 2023-10-02 12:23:05,172 WARNING [train.py:1204] (2/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_143-86344-0_sp1.1 from training. Duration: 0.7 2023-10-02 12:23:07,899 WARNING [train.py:1204] (2/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_126-29104-0 from training. Duration: 0.85 2023-10-02 12:23:09,435 WARNING [train.py:1204] (2/4) Exclude cut with ID _39846_210_12651_1_1533603651992_1318030_138-31771-0_sp1.1 from training. Duration: 0.963625 2023-10-02 12:23:15,211 WARNING [train.py:1204] (2/4) Exclude cut with ID _2220_210_1819_1_1525509109898_3658680_50-99837-0_sp1.1 from training. Duration: 0.4909375 2023-10-02 12:23:15,273 WARNING [train.py:1204] (2/4) Exclude cut with ID _6274_210_2084_1_1527937583837_7494020_836-210550-0_sp1.1 from training. Duration: 0.97275 2023-10-02 12:23:18,585 WARNING [train.py:1204] (2/4) Exclude cut with ID _28323_210_16187_1_1532480395667_4100300_306-74561-0 from training. Duration: 0.94 2023-10-02 12:23:22,996 INFO [train.py:1046] (2/4) Epoch 25, batch 4250, loss[loss=0.1557, simple_loss=0.2261, pruned_loss=0.0427, over 24281.00 frames. ], tot_loss[loss=0.1682, simple_loss=0.2447, pruned_loss=0.0459, over 4728042.77 frames. ], batch size: 56, lr: 4.08e-03, grad_scale: 8.0 2023-10-02 12:23:23,146 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_177-58005-0_sp1.1 from training. Duration: 0.563625 2023-10-02 12:23:27,275 WARNING [train.py:1204] (2/4) Exclude cut with ID _54871_210_22356_1_1534665798253_3856359_19-342632-0_sp1.1 from training. Duration: 0.82725 2023-10-02 12:23:27,297 WARNING [train.py:1204] (2/4) Exclude cut with ID _35355_210_16040_1_1533212988977_4016229_74-190076-0_sp0.9 from training. Duration: 0.888875 2023-10-02 12:23:30,106 WARNING [train.py:1204] (2/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_285-97786-0_sp1.1 from training. Duration: 0.97275 2023-10-02 12:23:34,749 WARNING [train.py:1204] (2/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_337-181502-0_sp0.9 from training. Duration: 0.72225 2023-10-02 12:23:34,784 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_353-182855-0 from training. Duration: 0.96 2023-10-02 12:23:34,827 WARNING [train.py:1204] (2/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_409-327730-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 12:23:36,473 WARNING [train.py:1204] (2/4) Exclude cut with ID _8030_210_7497_1_1528444886781_3531089_373-19403-0_sp1.1 from training. Duration: 0.97275 2023-10-02 12:23:39,188 WARNING [train.py:1204] (2/4) Exclude cut with ID _6789_210_1534_1_1527857937218_3118950_231-340237-0_sp1.1 from training. Duration: 0.936375 2023-10-02 12:23:43,715 WARNING [train.py:1204] (2/4) Exclude cut with ID _6493_210_1747_1_1528459210424_4012470_536-57832-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 12:23:43,727 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7046_210_6753_1_1527895337_6920917_756-116002-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 12:23:45,134 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_37152_210_9697_1_1533106573_3844188_29-239070-0_sp1.1 from training. Duration: 0.8909375 2023-10-02 12:23:45,136 WARNING [train.py:1204] (2/4) Exclude cut with ID _19023_210_4353_1_1531378892552_5820780_61-334539-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 12:23:47,054 WARNING [train.py:1204] (2/4) Exclude cut with ID _14170_210_5219_1_1530525949868_4469650_159-28488-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 12:23:47,132 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_40896_210_15710_1_1533433956_7100802_764-71272-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 12:23:47,255 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.attention_skip_rate, batch_count=878340.0, ans=0.0 2023-10-02 12:23:48,560 WARNING [train.py:1204] (2/4) Exclude cut with ID _44902_210_16187_1_1533888006062_3792078_258-178274-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 12:23:51,798 WARNING [train.py:1204] (2/4) Exclude cut with ID _7537_210_2084_1_1528502395431_3924359_14-312906-0_sp1.1 from training. Duration: 0.763625 2023-10-02 12:23:53,042 WARNING [train.py:1204] (2/4) Exclude cut with ID _49643_210_7989_1_1534640244224_3939031_674-116639-0_sp1.1 from training. Duration: 0.963625 2023-10-02 12:23:54,402 WARNING [train.py:1204] (2/4) Exclude cut with ID _22826_210_9994_1_1531908201522_3279169_415-161-0 from training. Duration: 0.98 2023-10-02 12:23:57,273 WARNING [train.py:1204] (2/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_264-348896-0 from training. Duration: 0.85 2023-10-02 12:23:58,521 WARNING [train.py:1204] (2/4) Exclude cut with ID _51899_210_20034_1_1534129254466_3669220_233-161492-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 12:23:58,571 WARNING [train.py:1204] (2/4) Exclude cut with ID _31255_210_13427_1_1532865619495_3815143_233-200023-0_sp1.1 from training. Duration: 0.936375 2023-10-02 12:23:58,594 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_23850_210_4238_1_1532232597_1632433_100-229879-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 12:23:58,674 WARNING [train.py:1204] (2/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_375-245677-0_sp0.9 from training. Duration: 0.9 2023-10-02 12:23:58,679 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_40223_210_6228_1_1533298404_4812267_355-317415-0_sp1.1 from training. Duration: 0.97275 2023-10-02 12:24:00,492 WARNING [train.py:1204] (2/4) Exclude cut with ID _9679_210_7497_1_1529129212906_4363929_235-81488-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 12:24:04,537 WARNING [train.py:1204] (2/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_172-215932-0_sp1.1 from training. Duration: 0.6 2023-10-02 12:24:04,608 WARNING [train.py:1204] (2/4) Exclude cut with ID _12369_210_4381_1_1530356078581_4388520_64-298917-0_sp1.1 from training. Duration: 0.8 2023-10-02 12:24:08,667 WARNING [train.py:1204] (2/4) Exclude cut with ID _38020_210_5986_1_1533112252259_7006089_131-53533-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 12:24:10,053 WARNING [train.py:1204] (2/4) Exclude cut with ID _42334_210_7770_1_1534206610396_3605880_392-48154-0_sp1.1 from training. Duration: 0.963625 2023-10-02 12:24:11,396 WARNING [train.py:1204] (2/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_337-181502-0 from training. Duration: 0.65 2023-10-02 12:24:11,408 WARNING [train.py:1204] (2/4) Exclude cut with ID _6267_210_2084_1_1527764658526_3894730_375-219887-0_sp0.9 from training. Duration: 0.7444375 2023-10-02 12:24:13,282 WARNING [train.py:1204] (2/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_60-146189-0 from training. Duration: 0.98 2023-10-02 12:24:13,371 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_3133_210_2087_1_1526106446_2324690_243-143119-0_sp1.1 from training. Duration: 0.7 2023-10-02 12:24:15,932 WARNING [train.py:1204] (2/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_1267-272693-0_sp0.9 from training. Duration: 0.811125 2023-10-02 12:24:16,793 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.1.encoder.layers.1.self_attn1.whiten, num_groups=1, num_channels=256, metric=13.03 vs. limit=22.5 2023-10-02 12:24:17,903 WARNING [train.py:1204] (2/4) Exclude cut with ID _69884_210_9771_1_1535846392818_4166169_83-182645-0_sp1.1 from training. Duration: 0.97275 2023-10-02 12:24:17,919 WARNING [train.py:1204] (2/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_403-292260-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 12:24:21,169 WARNING [train.py:1204] (2/4) Exclude cut with ID _55429_210_10626_1_1534467588527_7174143_377-169391-0 from training. Duration: 0.96 2023-10-02 12:24:23,971 WARNING [train.py:1204] (2/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_494-199538-0_sp1.1 from training. Duration: 0.6818125 2023-10-02 12:24:24,025 WARNING [train.py:1204] (2/4) Exclude cut with ID _3067_210_2667_1_1526212974640_4503168_119-175776-0_sp1.1 from training. Duration: 0.67275 2023-10-02 12:24:26,064 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.5.encoder.layers.1.feed_forward2.out_whiten, num_groups=1, num_channels=256, metric=9.97 vs. limit=15.0 2023-10-02 12:24:28,269 WARNING [train.py:1204] (2/4) Exclude cut with ID _3231_210_2106_1_1526205864934_6784220_183-303839-0_sp1.1 from training. Duration: 0.97275 2023-10-02 12:24:31,533 WARNING [train.py:1204] (2/4) Exclude cut with ID _18513_210_9206_1_1531618215223_3862620_165-38958-0_sp1.1 from training. Duration: 0.963625 2023-10-02 12:24:32,947 WARNING [train.py:1204] (2/4) Exclude cut with ID _41314_210_12651_1_1533459476543_3766460_87-272068-0_sp1.1 from training. Duration: 0.7545625 2023-10-02 12:24:34,301 WARNING [train.py:1204] (2/4) Exclude cut with ID _3541_210_2477_1_1526475408709_3263973_353-95-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 12:24:35,744 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_2009_210_2071_1_1525481134_4588937_73-302813-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 12:24:37,122 INFO [train.py:1046] (2/4) Epoch 25, batch 4300, loss[loss=0.1885, simple_loss=0.2697, pruned_loss=0.05366, over 24126.00 frames. ], tot_loss[loss=0.1682, simple_loss=0.2446, pruned_loss=0.04585, over 4715303.58 frames. ], batch size: 80, lr: 4.07e-03, grad_scale: 8.0 2023-10-02 12:24:37,207 WARNING [train.py:1204] (2/4) Exclude cut with ID _64220_210_9527_1_1535250151772_4875259_88-95011-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 12:24:38,559 WARNING [train.py:1204] (2/4) Exclude cut with ID _44902_210_16187_1_1533888006062_3792078_160-12107-0_sp1.1 from training. Duration: 0.92725 2023-10-02 12:24:38,574 WARNING [train.py:1204] (2/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_547-5424-0 from training. Duration: 0.94 2023-10-02 12:24:40,031 WARNING [train.py:1204] (2/4) Exclude cut with ID _21783_210_11357_1_1531742458730_3762679_268-85656-0_sp1.1 from training. Duration: 0.936375 2023-10-02 12:24:41,577 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.bypass.scale_min, batch_count=878606.6666666666, ans=0.2 2023-10-02 12:24:45,975 WARNING [train.py:1204] (2/4) Exclude cut with ID _66210_210_12651_1_1535446812922_3365850_42-284985-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 12:24:46,005 WARNING [train.py:1204] (2/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_465-214726-0_sp0.9 from training. Duration: 0.9555625 2023-10-02 12:24:48,952 WARNING [train.py:1204] (2/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_50-95770-0_sp1.1 from training. Duration: 0.936375 2023-10-02 12:24:50,870 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.conv_module1.balancer1.prob, batch_count=878673.3333333334, ans=0.125 2023-10-02 12:24:57,178 WARNING [train.py:1204] (2/4) Exclude cut with ID _30800_210_22382_1_1532499347904_4515959_269-77567-0_sp1.1 from training. Duration: 0.963625 2023-10-02 12:24:57,180 WARNING [train.py:1204] (2/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_7-111954-0 from training. Duration: 0.56 2023-10-02 12:24:58,551 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_43802_210_2290_1_1533516203_8829504_85-195240-0_sp0.9 from training. Duration: 0.9666875 2023-10-02 12:24:59,970 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_2822_210_2715_1_1526112586_1999587_165-36656-0_sp0.9 from training. Duration: 0.988875 2023-10-02 12:25:01,262 WARNING [train.py:1204] (2/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_181-78632-0_sp1.1 from training. Duration: 0.7090625 2023-10-02 12:25:01,273 WARNING [train.py:1204] (2/4) Exclude cut with ID IC0671W0044-331887 from training. Duration: 0.896 2023-10-02 12:25:03,247 WARNING [train.py:1204] (2/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_266-137104-0_sp1.1 from training. Duration: 0.5909375 2023-10-02 12:25:04,664 WARNING [train.py:1204] (2/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_137-116442-0_sp0.9 from training. Duration: 0.8666875 2023-10-02 12:25:06,298 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.conv_module1.balancer2.prob, batch_count=878740.0, ans=0.125 2023-10-02 12:25:07,543 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_23801_210_16223_1_1532066392_3294657_570-5439-0 from training. Duration: 0.97 2023-10-02 12:25:07,555 WARNING [train.py:1204] (2/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_29-280622-0_sp1.1 from training. Duration: 0.7454375 2023-10-02 12:25:07,575 WARNING [train.py:1204] (2/4) Exclude cut with ID 3557-8342-0013-54691-0 from training. Duration: 0.92 2023-10-02 12:25:10,219 WARNING [train.py:1204] (2/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_711-15688-0_sp1.1 from training. Duration: 0.3909375 2023-10-02 12:25:11,563 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_15126_210_6753_1_1530778677_2591185_120-277707-0_sp1.1 from training. Duration: 0.72725 2023-10-02 12:25:14,775 WARNING [train.py:1204] (2/4) Exclude cut with ID _14970_210_15709_1_1530775547361_1966965_8-169924-0_sp1.1 from training. Duration: 0.836375 2023-10-02 12:25:14,777 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_4692_210_3528_1_1526899755_4323254_107-44401-0_sp0.9 from training. Duration: 0.9555625 2023-10-02 12:25:14,858 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_3475_210_2028_1_1526193578_3622569_265-276625-0_sp0.9 from training. Duration: 0.8444375 2023-10-02 12:25:16,275 WARNING [train.py:1204] (2/4) Exclude cut with ID _55153_210_2253_1_1534384786677_3162096_148-79022-0_sp1.1 from training. Duration: 0.87275 2023-10-02 12:25:16,351 WARNING [train.py:1204] (2/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_292-36336-0_sp1.1 from training. Duration: 0.92725 2023-10-02 12:25:17,647 WARNING [train.py:1204] (2/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_329-82235-0 from training. Duration: 0.82 2023-10-02 12:25:17,727 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_11305_210_1794_1_1529750428_7178148_257-312870-0 from training. Duration: 0.88 2023-10-02 12:25:19,820 WARNING [train.py:1204] (2/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_436-4493-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 12:25:22,559 WARNING [train.py:1204] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_321-54129-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 12:25:22,566 WARNING [train.py:1204] (2/4) Exclude cut with ID _5287_210_2084_1_1527418883040_3878670_333-48508-0_sp0.9 from training. Duration: 0.6666875 2023-10-02 12:25:22,585 WARNING [train.py:1204] (2/4) Exclude cut with ID _26528_210_9070_1_1532570410842_3851310_244-278158-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 12:25:23,903 WARNING [train.py:1204] (2/4) Exclude cut with ID _46952_210_5986_1_1534230010357_3107379_232-344503-0_sp1.1 from training. Duration: 0.87275 2023-10-02 12:25:23,926 WARNING [train.py:1204] (2/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_43-206757-0 from training. Duration: 0.75 2023-10-02 12:25:23,928 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_4183_210_2087_1_1526729100_1869838_141-166600-0 from training. Duration: 0.95 2023-10-02 12:25:25,263 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_296-287678-0 from training. Duration: 0.95 2023-10-02 12:25:25,332 WARNING [train.py:1204] (2/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_265-60343-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 12:25:27,126 WARNING [train.py:1204] (2/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_332-256168-0 from training. Duration: 0.75 2023-10-02 12:25:27,167 WARNING [train.py:1204] (2/4) Exclude cut with ID _13679_210_7497_1_1530534293662_3355098_411-186088-0 from training. Duration: 0.94 2023-10-02 12:25:30,145 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.1.ff2_skip_rate, batch_count=878806.6666666666, ans=0.0 2023-10-02 12:25:30,219 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.bypass_mid.scale_min, batch_count=878806.6666666666, ans=0.2 2023-10-02 12:25:30,781 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.2.encoder.layers.0.whiten, num_groups=1, num_channels=384, metric=5.32 vs. limit=12.0 2023-10-02 12:25:31,176 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.485e+02 1.823e+02 1.995e+02 2.293e+02 3.439e+02, threshold=3.990e+02, percent-clipped=0.0 2023-10-02 12:25:31,322 WARNING [train.py:1204] (2/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_458-126779-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 12:25:33,192 WARNING [train.py:1204] (2/4) Exclude cut with ID IC0803W0380-397572 from training. Duration: 0.032 2023-10-02 12:25:34,532 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_278-272612-0_sp1.1 from training. Duration: 0.663625 2023-10-02 12:25:36,022 WARNING [train.py:1204] (2/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_491-263101-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 12:25:36,027 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_53555_210_5632_1_1534595220_7232297_334-336778-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 12:25:38,637 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_50-3289-0 from training. Duration: 0.91 2023-10-02 12:25:38,700 WARNING [train.py:1204] (2/4) Exclude cut with ID _8699_210_2069_1_1528630346831_3960409_155-39766-0_sp0.9 from training. Duration: 0.8666875 2023-10-02 12:25:38,703 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_502-82145-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 12:25:40,013 WARNING [train.py:1204] (2/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_550-43132-0_sp1.1 from training. Duration: 0.97275 2023-10-02 12:25:40,051 WARNING [train.py:1204] (2/4) Exclude cut with ID _14998_210_5219_1_1530705582080_7474970_446-112117-0_sp1.1 from training. Duration: 0.8181875 2023-10-02 12:25:40,099 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_3691_210_3528_1_1526468169_4062053_355-334432-0_sp1.1 from training. Duration: 0.7909375 2023-10-02 12:25:42,880 WARNING [train.py:1204] (2/4) Exclude cut with ID _58605_210_12210_1_1534723195939_3904259_239-94720-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 12:25:46,162 WARNING [train.py:1204] (2/4) Exclude cut with ID _49376_210_5986_1_1533985278732_3370740_391-344072-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 12:25:47,542 WARNING [train.py:1204] (2/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_558-322014-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 12:25:47,578 WARNING [train.py:1204] (2/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_256-32662-0_sp1.1 from training. Duration: 0.8181875 2023-10-02 12:25:51,629 INFO [train.py:1046] (2/4) Epoch 25, batch 4350, loss[loss=0.1937, simple_loss=0.2784, pruned_loss=0.05449, over 23724.00 frames. ], tot_loss[loss=0.1695, simple_loss=0.2463, pruned_loss=0.04638, over 4723252.93 frames. ], batch size: 85, lr: 4.07e-03, grad_scale: 8.0 2023-10-02 12:25:53,571 WARNING [train.py:1204] (2/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_572-185150-0 from training. Duration: 0.97 2023-10-02 12:25:53,598 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_89-35748-0_sp0.9 from training. Duration: 0.87775 2023-10-02 12:25:57,796 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6291_210_3273_1_1527683649_1078498_88-66747-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 12:26:01,156 WARNING [train.py:1204] (2/4) Exclude cut with ID _19504_210_9994_1_1531648863242_3325220_357-237029-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 12:26:04,551 WARNING [train.py:1204] (2/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_302-2550-0_sp1.1 from training. Duration: 0.736375 2023-10-02 12:26:04,552 WARNING [train.py:1204] (2/4) Exclude cut with ID _9461_210_4557_1_1529060514726_3677451_59-194017-0_sp1.1 from training. Duration: 0.963625 2023-10-02 12:26:07,489 WARNING [train.py:1204] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_665-196155-0_sp1.1 from training. Duration: 0.6090625 2023-10-02 12:26:08,344 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.1.encoder.layers.0.nonlin_attention.whiten2, num_groups=1, num_channels=256, metric=5.46 vs. limit=15.0 2023-10-02 12:26:09,062 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.0.feed_forward1.out_proj.dropout_p, batch_count=879006.6666666666, ans=0.1 2023-10-02 12:26:11,568 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_326-315007-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 12:26:14,318 WARNING [train.py:1204] (2/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_105-328174-0_sp1.1 from training. Duration: 0.6909375 2023-10-02 12:26:14,326 WARNING [train.py:1204] (2/4) Exclude cut with ID _56494_210_4220_1_1535284837625_3033169_106-263722-0_sp1.1 from training. Duration: 0.97275 2023-10-02 12:26:14,630 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.balancer2.prob, batch_count=879006.6666666666, ans=0.125 2023-10-02 12:26:17,619 WARNING [train.py:1204] (2/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_770-200430-0_sp0.9 from training. Duration: 0.988875 2023-10-02 12:26:19,007 WARNING [train.py:1204] (2/4) Exclude cut with ID _65640_210_6310_1_1535368201445_3173250_209-13008-0_sp1.1 from training. Duration: 0.9 2023-10-02 12:26:20,306 WARNING [train.py:1204] (2/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_212-149947-0_sp0.9 from training. Duration: 0.811125 2023-10-02 12:26:20,451 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.ff3_skip_rate, batch_count=879073.3333333334, ans=0.0 2023-10-02 12:26:26,456 WARNING [train.py:1204] (2/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_188-171673-0 from training. Duration: 0.58 2023-10-02 12:26:27,777 WARNING [train.py:1204] (2/4) Exclude cut with ID _58895_210_8750_1_1535184081320_3649089_286-281724-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 12:26:27,956 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.conv_module1.balancer2.prob, batch_count=879073.3333333334, ans=0.125 2023-10-02 12:26:29,128 WARNING [train.py:1204] (2/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_16-254909-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 12:26:33,147 WARNING [train.py:1204] (2/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_52-95655-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 12:26:35,900 WARNING [train.py:1204] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_554-45617-0 from training. Duration: 0.99 2023-10-02 12:26:38,572 WARNING [train.py:1204] (2/4) Exclude cut with ID _14683_210_5219_1_1530698352608_3974859_473-344611-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 12:26:38,813 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.bypass_mid.scale_min, batch_count=879140.0, ans=0.2 2023-10-02 12:26:39,987 WARNING [train.py:1204] (2/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_17-25045-0_sp0.9 from training. Duration: 0.6555625 2023-10-02 12:26:42,879 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9072W0003-530637 from training. Duration: 0.768 2023-10-02 12:26:45,534 WARNING [train.py:1204] (2/4) Exclude cut with ID _38670_210_15379_1_1533448810659_4391059_76-74025-0_sp1.1 from training. Duration: 0.936375 2023-10-02 12:26:46,844 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6291_210_3273_1_1527683649_1078498_74-292608-0_sp0.9 from training. Duration: 0.8 2023-10-02 12:26:46,917 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9084W0025-536862 from training. Duration: 0.896 2023-10-02 12:26:48,955 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9087W0021-538392 from training. Duration: 0.768 2023-10-02 12:26:48,960 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7543_210_6753_1_1528543323_645515_56-163234-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 12:26:48,980 WARNING [train.py:1204] (2/4) Exclude cut with ID _45560_210_21982_1_1533812603951_3584999_44-332643-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 12:26:49,068 WARNING [train.py:1204] (2/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_1285-256622-0_sp1.1 from training. Duration: 0.763625 2023-10-02 12:26:50,270 WARNING [train.py:1204] (2/4) Exclude cut with ID _52781_210_16187_1_1534297729099_1118149_141-325632-0_sp1.1 from training. Duration: 0.936375 2023-10-02 12:26:51,664 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_40896_210_15710_1_1533433956_7100802_1051-137631-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 12:26:51,704 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_13091_210_7925_1_1530609447_8987354_233-18333-0_sp1.1 from training. Duration: 0.97275 2023-10-02 12:26:53,345 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.conv_skip_rate, batch_count=879206.6666666666, ans=0.0 2023-10-02 12:26:54,464 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_34008_210_10431_1_1533345424_8183247_662-317331-0 from training. Duration: 0.94 2023-10-02 12:26:54,470 WARNING [train.py:1204] (2/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_291-248920-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 12:26:54,472 WARNING [train.py:1204] (2/4) Exclude cut with ID _65263_210_22356_1_1535763598691_3921037_274-288481-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 12:26:54,500 WARNING [train.py:1204] (2/4) Exclude cut with ID _5189_210_4557_1_1527248319428_3769229_233-168080-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 12:26:54,560 WARNING [train.py:1204] (2/4) Exclude cut with ID _54871_210_22356_1_1534665798253_3856359_135-184281-0 from training. Duration: 0.99 2023-10-02 12:26:55,959 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9123W0012-555972 from training. Duration: 0.896 2023-10-02 12:26:55,963 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9123W0118-556077 from training. Duration: 0.896 2023-10-02 12:26:55,971 WARNING [train.py:1204] (2/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_406-275649-0 from training. Duration: 0.42 2023-10-02 12:26:58,031 WARNING [train.py:1204] (2/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_309-13684-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 12:26:59,389 WARNING [train.py:1204] (2/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_129-337802-0_sp1.1 from training. Duration: 0.6545625 2023-10-02 12:26:59,417 WARNING [train.py:1204] (2/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_649-266825-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 12:27:00,736 WARNING [train.py:1204] (2/4) Exclude cut with ID _40519_210_13794_1_1533715228655_7337390_895-224742-0_sp1.1 from training. Duration: 0.92725 2023-10-02 12:27:02,704 WARNING [train.py:1204] (2/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_234-14174-0 from training. Duration: 0.82 2023-10-02 12:27:04,574 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.3.encoder.layers.3.self_attn1.whiten, num_groups=1, num_channels=512, metric=12.81 vs. limit=22.5 2023-10-02 12:27:05,397 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9156W0009-572502 from training. Duration: 0.768 2023-10-02 12:27:05,404 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_424-245529-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 12:27:06,762 INFO [train.py:1046] (2/4) Epoch 25, batch 4400, loss[loss=0.1466, simple_loss=0.2233, pruned_loss=0.03496, over 18084.00 frames. ], tot_loss[loss=0.1697, simple_loss=0.2468, pruned_loss=0.0463, over 4721385.02 frames. ], batch size: 39, lr: 4.07e-03, grad_scale: 16.0 2023-10-02 12:27:09,581 WARNING [train.py:1204] (2/4) Exclude cut with ID _26528_210_9070_1_1532570410842_3851310_232-10149-0_sp1.1 from training. Duration: 0.963625 2023-10-02 12:27:09,589 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_17292_210_6753_1_1531125388_2943734_152-30410-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 12:27:11,017 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_35441_210_16554_1_1533347572_4092680_291-82075-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 12:27:13,662 WARNING [train.py:1204] (2/4) Exclude cut with ID _12369_210_4381_1_1530356078581_4388520_64-298917-0 from training. Duration: 0.88 2023-10-02 12:27:13,687 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_5618_210_1874_1_1527311008_5851_1-214501-0_sp0.9 from training. Duration: 0.7655625 2023-10-02 12:27:13,935 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.self_attn_weights.pos_emb_skip_rate, batch_count=879273.3333333334, ans=0.0 2023-10-02 12:27:15,085 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_9460_210_4211_1_1528954587_10049667_572-37035-0 from training. Duration: 0.8 2023-10-02 12:27:15,104 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9193W0023-590322 from training. Duration: 0.896 2023-10-02 12:27:15,198 WARNING [train.py:1204] (2/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_206-10097-0_sp1.1 from training. Duration: 0.4454375 2023-10-02 12:27:15,212 WARNING [train.py:1204] (2/4) Exclude cut with ID _55134_210_21982_1_1534590028377_3882170_17-51731-0_sp1.1 from training. Duration: 0.963625 2023-10-02 12:27:17,884 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_14953_210_2151_1_1530701710_5616165_710-165400-0 from training. Duration: 0.87 2023-10-02 12:27:18,010 WARNING [train.py:1204] (2/4) Exclude cut with ID _53203_210_2087_1_1534246218309_475590_61-85777-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 12:27:18,166 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.balancer2.prob, batch_count=879273.3333333334, ans=0.125 2023-10-02 12:27:19,453 WARNING [train.py:1204] (2/4) Exclude cut with ID _55844_210_2346_1_1534471212569_7625707_1050-10453-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 12:27:21,335 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9218W0013-603297 from training. Duration: 0.896 2023-10-02 12:27:22,629 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_20120_210_6753_1_1531643035_5482859_267-102818-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 12:27:22,630 WARNING [train.py:1204] (2/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_191-41023-0 from training. Duration: 0.71 2023-10-02 12:27:22,672 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9231W0202-610227 from training. Duration: 0.896 2023-10-02 12:27:25,332 WARNING [train.py:1204] (2/4) Exclude cut with ID _6537_210_1883_1_1527987559710_3597129_382-133062-0 from training. Duration: 0.68 2023-10-02 12:27:26,180 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.1.encoder.layers.1.self_attn_weights.whiten_keys, num_groups=4, num_channels=128, metric=2.66 vs. limit=6.0 2023-10-02 12:27:26,663 WARNING [train.py:1204] (2/4) Exclude cut with ID _28427_210_4220_1_1532581240967_3061069_43-84928-0 from training. Duration: 0.99 2023-10-02 12:27:26,702 WARNING [train.py:1204] (2/4) Exclude cut with ID _59256_210_5986_1_1535076085997_3589120_121-237386-0 from training. Duration: 0.96 2023-10-02 12:27:28,507 WARNING [train.py:1204] (2/4) Exclude cut with ID _8480_210_1874_1_1528589391205_6728330_137-256986-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 12:27:30,475 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_9417_210_1794_1_1529235261_4614131_479-346963-0_sp1.1 from training. Duration: 0.936375 2023-10-02 12:27:30,510 WARNING [train.py:1204] (2/4) Exclude cut with ID _26474_210_9570_1_1532520022699_3623380_267-7362-0_sp1.1 from training. Duration: 0.936375 2023-10-02 12:27:31,795 WARNING [train.py:1204] (2/4) Exclude cut with ID _55328_210_27767_1_1534580973260_4088170_287-61272-0_sp1.1 from training. Duration: 0.97275 2023-10-02 12:27:33,136 WARNING [train.py:1204] (2/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_589-9070-0 from training. Duration: 0.71 2023-10-02 12:27:34,448 WARNING [train.py:1204] (2/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_148-158509-0_sp1.1 from training. Duration: 0.37275 2023-10-02 12:27:34,522 WARNING [train.py:1204] (2/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_164-184548-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 12:27:37,223 WARNING [train.py:1204] (2/4) Exclude cut with ID _14970_210_15709_1_1530775547361_1966965_6-23328-0_sp1.1 from training. Duration: 0.7818125 2023-10-02 12:27:37,228 WARNING [train.py:1204] (2/4) Exclude cut with ID _29303_210_2575_1_1532392237303_7410649_612-165069-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 12:27:38,602 WARNING [train.py:1204] (2/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_383-321224-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 12:27:39,908 WARNING [train.py:1204] (2/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_283-160525-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 12:27:39,920 WARNING [train.py:1204] (2/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_505-97987-0 from training. Duration: 0.86 2023-10-02 12:27:41,275 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9309W0190-640317 from training. Duration: 0.768 2023-10-02 12:27:44,102 WARNING [train.py:1204] (2/4) Exclude cut with ID _50093_210_4220_1_1534417141207_3432299_280-297390-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 12:27:46,528 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.1.encoder.layers.0.feed_forward1.out_whiten, num_groups=1, num_channels=256, metric=5.38 vs. limit=15.0 2023-10-02 12:27:48,608 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.feed_forward1.out_proj.dropout_p, batch_count=879473.3333333334, ans=0.1 2023-10-02 12:27:51,131 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_17292_210_6753_1_1531125388_2943734_320-71385-0_sp1.1 from training. Duration: 0.97275 2023-10-02 12:27:52,602 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_213-103282-0 from training. Duration: 0.78 2023-10-02 12:27:55,979 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7615_210_4211_1_1528110965_4580160_72-235211-0_sp0.9 from training. Duration: 0.8444375 2023-10-02 12:27:56,179 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.balancer2.prob, batch_count=879473.3333333334, ans=0.125 2023-10-02 12:27:57,443 WARNING [train.py:1204] (2/4) Exclude cut with ID _14867_210_1968_1_1530684179890_3846969_173-320977-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 12:28:00,646 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.592e+02 1.859e+02 2.058e+02 2.291e+02 3.192e+02, threshold=4.115e+02, percent-clipped=0.0 2023-10-02 12:28:02,686 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7046_210_6753_1_1527895337_6920917_783-278109-0_sp0.9 from training. Duration: 0.8555625 2023-10-02 12:28:04,022 WARNING [train.py:1204] (2/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_90-30673-0 from training. Duration: 0.84 2023-10-02 12:28:04,038 WARNING [train.py:1204] (2/4) Exclude cut with ID _22826_210_9994_1_1531908201522_3279169_415-161-0_sp1.1 from training. Duration: 0.8909375 2023-10-02 12:28:04,057 WARNING [train.py:1204] (2/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_86-66192-0_sp0.9 from training. Duration: 0.611125 2023-10-02 12:28:04,060 WARNING [train.py:1204] (2/4) Exclude cut with ID _4626_210_3532_1_1526817652717_3916859_446-206656-0_sp1.1 from training. Duration: 0.6909375 2023-10-02 12:28:04,119 WARNING [train.py:1204] (2/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_166-128941-0_sp0.9 from training. Duration: 0.6 2023-10-02 12:28:08,976 WARNING [train.py:1204] (2/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_345-167986-0 from training. Duration: 0.84 2023-10-02 12:28:11,728 WARNING [train.py:1204] (2/4) Exclude cut with ID _6275_210_2084_1_1528110062213_7669049_973-198085-0 from training. Duration: 0.75 2023-10-02 12:28:13,081 WARNING [train.py:1204] (2/4) Exclude cut with ID _60057_210_10640_1_1534903065793_3333150_171-181133-0 from training. Duration: 0.88 2023-10-02 12:28:13,094 WARNING [train.py:1204] (2/4) Exclude cut with ID _19504_210_9994_1_1531648863242_3325220_336-196513-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 12:28:13,101 WARNING [train.py:1204] (2/4) Exclude cut with ID _6493_210_1747_1_1528459210424_4012470_513-140967-0 from training. Duration: 0.79 2023-10-02 12:28:13,182 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6673_210_3273_1_1527767790_3173240_327-306185-0_sp0.9 from training. Duration: 0.92225 2023-10-02 12:28:15,954 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1032-236863-0_sp1.1 from training. Duration: 0.763625 2023-10-02 12:28:18,548 WARNING [train.py:1204] (2/4) Exclude cut with ID _14867_210_1968_1_1530684179890_3846969_413-29292-0 from training. Duration: 0.9 2023-10-02 12:28:18,730 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.conv_module1.balancer2.prob, batch_count=879606.6666666666, ans=0.125 2023-10-02 12:28:19,786 INFO [train.py:1046] (2/4) Epoch 25, batch 4450, loss[loss=0.1921, simple_loss=0.2662, pruned_loss=0.05902, over 22839.00 frames. ], tot_loss[loss=0.1706, simple_loss=0.248, pruned_loss=0.04659, over 4707944.80 frames. ], batch size: 322, lr: 4.07e-03, grad_scale: 8.0 2023-10-02 12:28:22,729 WARNING [train.py:1204] (2/4) Exclude cut with ID _47251_210_18628_1_1533814063128_4137250_186-343322-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 12:28:26,008 WARNING [train.py:1204] (2/4) Exclude cut with ID _56235_210_10640_1_1534557335235_4161099_624-46091-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 12:28:26,039 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_791-197891-0_sp0.9 from training. Duration: 0.9333125 2023-10-02 12:28:26,742 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.1.encoder.layers.1.nonlin_attention.whiten1, num_groups=1, num_channels=192, metric=5.72 vs. limit=10.0 2023-10-02 12:28:34,661 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_50050_210_16223_1_1535435796_7290138_786-288457-0_sp1.1 from training. Duration: 0.92725 2023-10-02 12:28:34,684 WARNING [train.py:1204] (2/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_213-227283-0_sp1.1 from training. Duration: 0.963625 2023-10-02 12:28:37,357 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.3.encoder.layers.2.self_attn1.whiten, num_groups=1, num_channels=512, metric=13.50 vs. limit=22.5 2023-10-02 12:28:38,565 WARNING [train.py:1204] (2/4) Exclude cut with ID _42659_210_2070_1_1533607442765_3120380_58-341121-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 12:28:40,086 WARNING [train.py:1204] (2/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_506-23200-0_sp1.1 from training. Duration: 0.8818125 2023-10-02 12:28:40,226 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.0.conv_module1.balancer1.prob, batch_count=879673.3333333334, ans=0.125 2023-10-02 12:28:42,211 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.3.encoder.layers.2.conv_module2.whiten, num_groups=1, num_channels=512, metric=10.51 vs. limit=15.0 2023-10-02 12:28:42,824 WARNING [train.py:1204] (2/4) Exclude cut with ID _14170_210_5219_1_1530525949868_4469650_336-213530-0_sp1.1 from training. Duration: 0.8181875 2023-10-02 12:28:42,849 WARNING [train.py:1204] (2/4) Exclude cut with ID _64135_210_2253_1_1535677267876_7608972_180-5312-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 12:28:42,927 WARNING [train.py:1204] (2/4) Exclude cut with ID _12302_210_5219_1_1529924446466_8107280_835-279754-0 from training. Duration: 0.9 2023-10-02 12:28:42,929 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_20120_210_6753_1_1531643035_5482859_510-96996-0_sp1.1 from training. Duration: 0.7909375 2023-10-02 12:28:44,320 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_15517_210_6753_1_1530954008_3487336_216-187834-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 12:28:44,373 WARNING [train.py:1204] (2/4) Exclude cut with ID _19504_210_9994_1_1531648863242_3325220_414-113427-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 12:28:44,374 WARNING [train.py:1204] (2/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_264-108685-0_sp0.9 from training. Duration: 0.611125 2023-10-02 12:28:47,163 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_5438_210_1794_1_1527336687_2600009_104-288274-0_sp0.9 from training. Duration: 0.6444375 2023-10-02 12:28:51,278 WARNING [train.py:1204] (2/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_520-39496-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 12:28:51,333 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_37818_210_4238_1_1533620910_4720083_315-308565-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 12:28:53,381 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_2009_210_2071_1_1525481134_4588937_385-302100-0_sp1.1 from training. Duration: 0.7909375 2023-10-02 12:28:54,805 WARNING [train.py:1204] (2/4) Exclude cut with ID _58895_210_8750_1_1535184081320_3649089_277-53219-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 12:28:55,011 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.self_attn_weights.pos_emb_skip_rate, batch_count=879740.0, ans=0.0 2023-10-02 12:28:56,137 WARNING [train.py:1204] (2/4) Exclude cut with ID _26664_210_7770_1_1532258943280_3418270_121-111258-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 12:29:01,465 WARNING [train.py:1204] (2/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_99-167392-0_sp1.1 from training. Duration: 0.5545625 2023-10-02 12:29:02,804 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1113-7369-0 from training. Duration: 0.85 2023-10-02 12:29:02,826 WARNING [train.py:1204] (2/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_352-59958-0 from training. Duration: 0.86 2023-10-02 12:29:02,830 WARNING [train.py:1204] (2/4) Exclude cut with ID _14886_210_4220_1_1530702020355_3516190_159-73748-0_sp1.1 from training. Duration: 0.7545625 2023-10-02 12:29:06,143 WARNING [train.py:1204] (2/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_131-119569-0_sp1.1 from training. Duration: 0.92725 2023-10-02 12:29:06,210 WARNING [train.py:1204] (2/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_216-343007-0 from training. Duration: 0.66 2023-10-02 12:29:09,581 WARNING [train.py:1204] (2/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_113-92608-0_sp0.9 from training. Duration: 0.6 2023-10-02 12:29:12,284 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_40047_210_2290_1_1533290519_2374311_132-123271-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 12:29:12,442 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.bypass_mid.scale_min, batch_count=879806.6666666666, ans=0.2 2023-10-02 12:29:13,627 WARNING [train.py:1204] (2/4) Exclude cut with ID _8793_210_6310_1_1528542985139_2310009_121-179782-0 from training. Duration: 0.8 2023-10-02 12:29:13,648 WARNING [train.py:1204] (2/4) Exclude cut with ID _43997_210_4525_1_1533538632555_5189329_283-218639-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 12:29:13,651 WARNING [train.py:1204] (2/4) Exclude cut with ID _49376_210_5986_1_1533985278732_3370740_297-78397-0_sp1.1 from training. Duration: 0.936375 2023-10-02 12:29:14,830 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_3475_210_2028_1_1526193578_3622569_377-166133-0_sp0.9 from training. Duration: 0.9555625 2023-10-02 12:29:14,837 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_520-213149-0_sp1.1 from training. Duration: 0.92725 2023-10-02 12:29:15,551 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.2.encoder.layers.2.nonlin_attention.whiten1, num_groups=1, num_channels=288, metric=4.62 vs. limit=10.0 2023-10-02 12:29:16,330 WARNING [train.py:1204] (2/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_567-144483-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 12:29:19,115 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_4183_210_2087_1_1526729100_1869838_143-280962-0_sp0.9 from training. Duration: 0.62225 2023-10-02 12:29:19,166 WARNING [train.py:1204] (2/4) Exclude cut with ID _49643_210_7989_1_1534640244224_3939031_611-185515-0 from training. Duration: 0.94 2023-10-02 12:29:19,468 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.nonlin_attention.balancer.min_positive, batch_count=879873.3333333334, ans=0.05 2023-10-02 12:29:20,559 WARNING [train.py:1204] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_88-341718-0_sp0.9 from training. Duration: 0.7333125 2023-10-02 12:29:22,727 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.1.encoder.layers.1.feed_forward1.out_whiten, num_groups=1, num_channels=256, metric=5.78 vs. limit=15.0 2023-10-02 12:29:23,839 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_51225_210_3601_1_1534400634_2327383_49-235441-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 12:29:25,024 WARNING [train.py:1204] (2/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_240-294400-0_sp1.1 from training. Duration: 0.936375 2023-10-02 12:29:25,155 WARNING [train.py:1204] (2/4) Exclude cut with ID _55429_210_10626_1_1534467588527_7174143_640-304825-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 12:29:25,175 WARNING [train.py:1204] (2/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_187-221566-0_sp1.1 from training. Duration: 0.3909375 2023-10-02 12:29:27,999 WARNING [train.py:1204] (2/4) Exclude cut with ID _7606_210_4915_1_1528894253277_5080690_127-177761-0_sp0.9 from training. Duration: 0.711125 2023-10-02 12:29:30,713 WARNING [train.py:1204] (2/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_430-116139-0 from training. Duration: 0.85 2023-10-02 12:29:33,405 INFO [train.py:1046] (2/4) Epoch 25, batch 4500, loss[loss=0.1757, simple_loss=0.2576, pruned_loss=0.04693, over 23704.00 frames. ], tot_loss[loss=0.1711, simple_loss=0.2484, pruned_loss=0.04689, over 4705383.88 frames. ], batch size: 85, lr: 4.07e-03, grad_scale: 8.0 2023-10-02 12:29:33,471 WARNING [train.py:1204] (2/4) Exclude cut with ID _3675_210_4557_1_1526639934408_4182089_456-18305-0_sp0.9 from training. Duration: 0.8444375 2023-10-02 12:29:36,845 WARNING [train.py:1204] (2/4) Exclude cut with ID _18513_210_9206_1_1531618215223_3862620_402-5957-0_sp1.1 from training. Duration: 0.9 2023-10-02 12:29:39,567 WARNING [train.py:1204] (2/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_465-156500-0 from training. Duration: 0.72 2023-10-02 12:29:39,568 WARNING [train.py:1204] (2/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_714-310538-0 from training. Duration: 0.86 2023-10-02 12:29:39,696 WARNING [train.py:1204] (2/4) Exclude cut with ID _39411_210_5986_1_1533538825030_3343649_335-301187-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 12:29:44,567 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_41750_210_1960_1_1533520678_3948042_241-158616-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 12:29:45,834 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_46013_210_7234_1_1533776362_3744032_50-141535-0_sp1.1 from training. Duration: 0.9 2023-10-02 12:29:49,581 WARNING [train.py:1204] (2/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_43-206757-0_sp0.9 from training. Duration: 0.8333125 2023-10-02 12:29:50,888 WARNING [train.py:1204] (2/4) Exclude cut with ID _22961_210_8254_1_1531976366672_4278180_172-276296-0_sp1.1 from training. Duration: 0.97275 2023-10-02 12:29:50,916 WARNING [train.py:1204] (2/4) Exclude cut with ID _17956_210_4353_1_1531209568125_6979970_1305-301903-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 12:29:50,948 WARNING [train.py:1204] (2/4) Exclude cut with ID _28323_210_16187_1_1532480395667_4100300_604-44507-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 12:30:00,145 WARNING [train.py:1204] (2/4) Exclude cut with ID _23514_210_1389_1_1531832139785_3031439_88-337544-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 12:30:01,403 WARNING [train.py:1204] (2/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_932-3672-0_sp1.1 from training. Duration: 0.963625 2023-10-02 12:30:02,836 WARNING [train.py:1204] (2/4) Exclude cut with ID _46502_210_13794_1_1533724452198_3657159_207-109991-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 12:30:02,977 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=880006.6666666666, ans=0.1 2023-10-02 12:30:04,592 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_216-73841-0_sp1.1 from training. Duration: 0.87275 2023-10-02 12:30:06,062 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8453_210_4211_1_1528502940_8562730_347-330673-0_sp1.1 from training. Duration: 0.6545625 2023-10-02 12:30:12,755 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_4183_210_2087_1_1526729100_1869838_143-280962-0_sp1.1 from training. Duration: 0.5090625 2023-10-02 12:30:16,799 WARNING [train.py:1204] (2/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_325-113798-0_sp1.1 from training. Duration: 0.77275 2023-10-02 12:30:21,049 WARNING [train.py:1204] (2/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_91-212486-0_sp0.9 from training. Duration: 0.7555625 2023-10-02 12:30:22,695 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8776_210_2040_1_1528638837_4088036_244-278763-0_sp1.1 from training. Duration: 0.8181875 2023-10-02 12:30:22,734 WARNING [train.py:1204] (2/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_106-286130-0 from training. Duration: 0.98 2023-10-02 12:30:24,057 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_2746_210_1988_1_1525872816_3795627_372-205861-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 12:30:25,357 WARNING [train.py:1204] (2/4) Exclude cut with ID _30800_210_22382_1_1532499347904_4515959_240-54646-0_sp1.1 from training. Duration: 0.936375 2023-10-02 12:30:28,647 WARNING [train.py:1204] (2/4) Exclude cut with ID _59500_210_8254_1_1534809610680_3736009_162-180695-0_sp1.1 from training. Duration: 0.936375 2023-10-02 12:30:28,680 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7544_210_6753_1_1528591327_1471426_146-93357-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 12:30:30,113 WARNING [train.py:1204] (2/4) Exclude cut with ID _42657_210_2070_1_1533520654016_3327008_405-87432-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 12:30:31,327 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.569e+02 1.866e+02 2.072e+02 2.376e+02 3.335e+02, threshold=4.143e+02, percent-clipped=0.0 2023-10-02 12:30:31,410 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_4528_210_3273_1_1527076654_3276777_209-219804-0 from training. Duration: 0.69 2023-10-02 12:30:31,410 WARNING [train.py:1204] (2/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_78-182912-0_sp0.9 from training. Duration: 0.6555625 2023-10-02 12:30:31,417 WARNING [train.py:1204] (2/4) Exclude cut with ID _31255_210_13427_1_1532865619495_3815143_322-114998-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 12:30:36,116 WARNING [train.py:1204] (2/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_848-280957-0_sp0.9 from training. Duration: 0.9666875 2023-10-02 12:30:36,144 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7696_210_2040_1_1528207167_3716099_117-125434-0_sp0.9 from training. Duration: 0.8444375 2023-10-02 12:30:37,537 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_1002-109192-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 12:30:38,389 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.1.encoder.layers.0.conv_module1.whiten, num_groups=1, num_channels=256, metric=14.85 vs. limit=15.0 2023-10-02 12:30:40,824 WARNING [train.py:1204] (2/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_138-64210-0_sp0.9 from training. Duration: 0.811125 2023-10-02 12:30:41,556 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.2.encoder.layers.1.self_attn1.whiten, num_groups=1, num_channels=384, metric=15.00 vs. limit=22.5 2023-10-02 12:30:42,135 WARNING [train.py:1204] (2/4) Exclude cut with ID _25808_210_13292_1_1532343514562_7366109_633-47524-0_sp1.1 from training. Duration: 0.9 2023-10-02 12:30:42,257 WARNING [train.py:1204] (2/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_297-5233-0 from training. Duration: 0.95 2023-10-02 12:30:45,407 WARNING [train.py:1204] (2/4) Exclude cut with ID _8394_210_6702_1_1528590504316_3540189_335-274780-0 from training. Duration: 0.78 2023-10-02 12:30:45,419 WARNING [train.py:1204] (2/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_301-330455-0 from training. Duration: 0.8 2023-10-02 12:30:47,369 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.5.encoder.layers.1.feed_forward1.out_whiten, num_groups=1, num_channels=256, metric=8.48 vs. limit=15.0 2023-10-02 12:30:48,312 WARNING [train.py:1204] (2/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_383-205833-0 from training. Duration: 0.95 2023-10-02 12:30:50,984 INFO [train.py:1046] (2/4) Epoch 25, batch 4550, loss[loss=0.1656, simple_loss=0.2512, pruned_loss=0.03996, over 24466.00 frames. ], tot_loss[loss=0.1701, simple_loss=0.2467, pruned_loss=0.04671, over 4705538.97 frames. ], batch size: 66, lr: 4.07e-03, grad_scale: 8.0 2023-10-02 12:30:51,052 WARNING [train.py:1204] (2/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_48-182155-0 from training. Duration: 0.87 2023-10-02 12:30:52,479 WARNING [train.py:1204] (2/4) Exclude cut with ID _14832_210_8750_1_1530691351539_3171348_1-54315-0_sp1.1 from training. Duration: 0.7909375 2023-10-02 12:30:55,868 WARNING [train.py:1204] (2/4) Exclude cut with ID _44982_210_1511_1_1534312820108_3726029_413-100814-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 12:30:56,902 WARNING [train.py:1204] (2/4) Exclude cut with ID _3231_210_2106_1_1526205864934_6784220_104-261392-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 12:30:59,686 WARNING [train.py:1204] (2/4) Exclude cut with ID _24314_210_4915_1_1532135078116_7170179_1-13588-0_sp1.1 from training. Duration: 0.97275 2023-10-02 12:31:01,237 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.1.conv_module1.balancer2.prob, batch_count=880273.3333333334, ans=0.125 2023-10-02 12:31:03,987 WARNING [train.py:1204] (2/4) Exclude cut with ID _44304_210_15374_1_1533639352765_4613129_141-8356-0_sp1.1 from training. Duration: 0.963625 2023-10-02 12:31:05,542 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.conv_skip_rate, batch_count=880340.0, ans=0.0 2023-10-02 12:31:06,644 WARNING [train.py:1204] (2/4) Exclude cut with ID _37892_210_19596_1_1533198677897_3964058_374-335034-0_sp1.1 from training. Duration: 0.936375 2023-10-02 12:31:06,782 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_195-257512-0_sp0.9 from training. Duration: 0.7444375 2023-10-02 12:31:06,784 WARNING [train.py:1204] (2/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_1267-272693-0_sp1.1 from training. Duration: 0.663625 2023-10-02 12:31:06,785 WARNING [train.py:1204] (2/4) Exclude cut with ID _30196_210_1664_1_1533708496522_7074140_690-326155-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 12:31:06,916 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.0.conv_module1.balancer2.prob, batch_count=880340.0, ans=0.125 2023-10-02 12:31:10,216 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_68847_210_14656_1_1536070214_2586843_38-20717-0_sp1.1 from training. Duration: 0.97275 2023-10-02 12:31:11,478 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_99-251314-0_sp1.1 from training. Duration: 0.7909375 2023-10-02 12:31:15,337 WARNING [train.py:1204] (2/4) Exclude cut with ID _49376_210_5986_1_1533985278732_3370740_225-95630-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 12:31:15,595 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.balancer1.prob, batch_count=880340.0, ans=0.125 2023-10-02 12:31:16,888 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_2655_210_2087_1_1525688959_3897400_202-147557-0 from training. Duration: 0.85 2023-10-02 12:31:18,282 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_5618_210_1874_1_1527311008_5851_1-214501-0_sp1.1 from training. Duration: 0.626375 2023-10-02 12:31:18,354 WARNING [train.py:1204] (2/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_225-43381-0_sp1.1 from training. Duration: 0.8545625 2023-10-02 12:31:19,728 WARNING [train.py:1204] (2/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_242-3472-0 from training. Duration: 0.73 2023-10-02 12:31:21,419 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.conv_module1.balancer1.prob, batch_count=880406.6666666666, ans=0.125 2023-10-02 12:31:22,563 WARNING [train.py:1204] (2/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_339-104791-0 from training. Duration: 0.49 2023-10-02 12:31:22,635 WARNING [train.py:1204] (2/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_488-92261-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 12:31:27,350 WARNING [train.py:1204] (2/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_175-163894-0 from training. Duration: 0.74 2023-10-02 12:31:29,970 WARNING [train.py:1204] (2/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_41-234353-0_sp0.9 from training. Duration: 0.7555625 2023-10-02 12:31:31,560 WARNING [train.py:1204] (2/4) Exclude cut with ID _47251_210_18628_1_1533814063128_4137250_116-275927-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 12:31:32,849 WARNING [train.py:1204] (2/4) Exclude cut with ID _4697_210_2069_1_1526815801953_3823098_61-223324-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 12:31:32,870 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6961_210_2087_1_1527937168_6815011_562-165518-0_sp1.1 from training. Duration: 0.7 2023-10-02 12:31:34,364 WARNING [train.py:1204] (2/4) Exclude cut with ID _21989_210_5854_1_1531899214931_3271360_133-44759-0 from training. Duration: 0.77 2023-10-02 12:31:37,108 WARNING [train.py:1204] (2/4) Exclude cut with ID _23557_210_8750_1_1531890106370_6846255_77-284923-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 12:31:37,355 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.conv_module1.balancer2.prob, batch_count=880473.3333333334, ans=0.125 2023-10-02 12:31:38,642 WARNING [train.py:1204] (2/4) Exclude cut with ID _13678_210_7497_1_1530530069226_3463170_464-296127-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 12:31:38,655 WARNING [train.py:1204] (2/4) Exclude cut with ID _22613_210_5219_1_1532253599686_4090170_393-36663-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 12:31:38,849 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.feed_forward1.hidden_balancer.prob, batch_count=880473.3333333334, ans=0.125 2023-10-02 12:31:40,520 WARNING [train.py:1204] (2/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_235-262124-0_sp0.9 from training. Duration: 0.7444375 2023-10-02 12:31:42,457 WARNING [train.py:1204] (2/4) Exclude cut with ID _8480_210_1874_1_1528589391205_6728330_755-112625-0 from training. Duration: 0.74 2023-10-02 12:31:43,759 WARNING [train.py:1204] (2/4) Exclude cut with ID _6456_210_1534_1_1527681634212_3181049_215-309359-0 from training. Duration: 0.88 2023-10-02 12:31:43,783 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_3019_210_2028_1_1526044245_3264865_191-72235-0_sp1.1 from training. Duration: 0.8818125 2023-10-02 12:31:45,675 WARNING [train.py:1204] (2/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_380-162648-0 from training. Duration: 0.94 2023-10-02 12:31:48,456 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_92-46009-0 from training. Duration: 0.54 2023-10-02 12:31:48,473 WARNING [train.py:1204] (2/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_53-300544-0_sp0.9 from training. Duration: 0.7444375 2023-10-02 12:31:49,844 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_68235_210_1925_1_1535863247_4484656_101-250943-0_sp1.1 from training. Duration: 0.97275 2023-10-02 12:31:49,853 WARNING [train.py:1204] (2/4) Exclude cut with ID _14970_210_15709_1_1530775547361_1966965_204-22182-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 12:31:51,215 WARNING [train.py:1204] (2/4) Exclude cut with ID _55153_210_2253_1_1534384786677_3162096_310-130092-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 12:31:51,226 WARNING [train.py:1204] (2/4) Exclude cut with ID _60113_210_16271_1_1535164212481_4386079_559-167191-0_sp1.1 from training. Duration: 0.8454375 2023-10-02 12:31:51,392 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.ff2_skip_rate, batch_count=880540.0, ans=0.0 2023-10-02 12:31:53,785 WARNING [train.py:1204] (2/4) Exclude cut with ID _14368_210_4220_1_1530532714870_3595529_93-58002-0_sp1.1 from training. Duration: 0.5181875 2023-10-02 12:31:53,858 WARNING [train.py:1204] (2/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_110-224688-0 from training. Duration: 0.97 2023-10-02 12:31:56,612 WARNING [train.py:1204] (2/4) Exclude cut with ID _40402_210_18219_1_1533522324115_2953150_178-178004-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 12:31:56,626 WARNING [train.py:1204] (2/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_321-335475-0_sp1.1 from training. Duration: 0.4090625 2023-10-02 12:31:58,292 WARNING [train.py:1204] (2/4) Exclude cut with ID _66863_210_18053_1_1535698758334_8590319_1092-271784-0 from training. Duration: 0.96 2023-10-02 12:31:58,306 WARNING [train.py:1204] (2/4) Exclude cut with ID _28317_210_12742_1_1532480302381_8510325_607-213951-0_sp1.1 from training. Duration: 0.763625 2023-10-02 12:31:58,315 WARNING [train.py:1204] (2/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_468-283113-0 from training. Duration: 0.96 2023-10-02 12:31:59,866 WARNING [train.py:1204] (2/4) Exclude cut with ID _14922_210_6228_1_1530707435273_3693310_13-165512-0_sp1.1 from training. Duration: 0.7454375 2023-10-02 12:32:01,316 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_69630_210_7234_1_1535788670_910989_56-169762-0_sp1.1 from training. Duration: 0.92725 2023-10-02 12:32:02,793 WARNING [train.py:1204] (2/4) Exclude cut with ID _67335_210_5219_1_1535525975652_4275229_121-80357-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 12:32:02,837 WARNING [train.py:1204] (2/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_126-12079-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 12:32:04,140 WARNING [train.py:1204] (2/4) Exclude cut with ID _5294_210_1747_1_1527249643928_3696179_342-270278-0_sp1.1 from training. Duration: 0.5 2023-10-02 12:32:05,470 INFO [train.py:1046] (2/4) Epoch 25, batch 4600, loss[loss=0.1738, simple_loss=0.2591, pruned_loss=0.04421, over 24635.00 frames. ], tot_loss[loss=0.1688, simple_loss=0.2454, pruned_loss=0.04604, over 4712048.03 frames. ], batch size: 73, lr: 4.07e-03, grad_scale: 8.0 2023-10-02 12:32:05,525 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_30322_210_1925_1_1532843478_826817_35-328264-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 12:32:06,985 WARNING [train.py:1204] (2/4) Exclude cut with ID _8480_210_1874_1_1528589391205_6728330_755-112625-0_sp1.1 from training. Duration: 0.67275 2023-10-02 12:32:08,452 WARNING [train.py:1204] (2/4) Exclude cut with ID _38670_210_15379_1_1533448810659_4391059_132-146412-0_sp1.1 from training. Duration: 0.97275 2023-10-02 12:32:09,859 WARNING [train.py:1204] (2/4) Exclude cut with ID _22020_210_2461_1_1531724164513_6326900_563-39691-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 12:32:11,777 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_9460_210_4211_1_1528954587_10049667_572-37035-0_sp1.1 from training. Duration: 0.72725 2023-10-02 12:32:11,792 WARNING [train.py:1204] (2/4) Exclude cut with ID _68777_210_4353_1_1535810390731_3117904_129-166012-0_sp0.9 from training. Duration: 0.9444375 2023-10-02 12:32:12,034 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.conv_skip_rate, batch_count=880606.6666666666, ans=0.0 2023-10-02 12:32:13,675 WARNING [train.py:1204] (2/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_487-91921-0_sp1.1 from training. Duration: 0.963625 2023-10-02 12:32:15,099 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_195-257512-0 from training. Duration: 0.67 2023-10-02 12:32:16,941 WARNING [train.py:1204] (2/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_188-260011-0_sp1.1 from training. Duration: 0.663625 2023-10-02 12:32:19,670 WARNING [train.py:1204] (2/4) Exclude cut with ID _8480_210_1874_1_1528589391205_6728330_309-139896-0_sp0.9 from training. Duration: 0.9 2023-10-02 12:32:21,023 WARNING [train.py:1204] (2/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_866-321248-0_sp1.1 from training. Duration: 0.963625 2023-10-02 12:32:23,847 WARNING [train.py:1204] (2/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_510-216513-0_sp1.1 from training. Duration: 0.97275 2023-10-02 12:32:30,004 WARNING [train.py:1204] (2/4) Exclude cut with ID _6653_210_2629_1_1527919172441_6732679_758-143677-0 from training. Duration: 0.98 2023-10-02 12:32:31,323 WARNING [train.py:1204] (2/4) Exclude cut with ID _55134_210_21982_1_1534590028377_3882170_50-214427-0_sp1.1 from training. Duration: 0.97275 2023-10-02 12:32:32,820 WARNING [train.py:1204] (2/4) Exclude cut with ID _31649_210_6228_1_1532584608063_4091078_482-301330-0_sp1.1 from training. Duration: 0.97275 2023-10-02 12:32:33,036 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.balancer2.prob, batch_count=880673.3333333334, ans=0.125 2023-10-02 12:32:37,009 WARNING [train.py:1204] (2/4) Exclude cut with ID _35799_210_5854_1_1533263487887_3529071_490-326847-0_sp1.1 from training. Duration: 0.9 2023-10-02 12:32:37,017 WARNING [train.py:1204] (2/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_984-90716-0_sp1.1 from training. Duration: 0.963625 2023-10-02 12:32:40,446 WARNING [train.py:1204] (2/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_166-246557-0 from training. Duration: 0.64 2023-10-02 12:32:40,446 WARNING [train.py:1204] (2/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_51-200220-0_sp1.1 from training. Duration: 0.5090625 2023-10-02 12:32:41,842 WARNING [train.py:1204] (2/4) Exclude cut with ID _51382_210_23862_1_1534492801838_3650104_275-92796-0_sp1.1 from training. Duration: 0.936375 2023-10-02 12:32:43,802 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.4.encoder.layers.1.nonlin_attention.whiten2, num_groups=1, num_channels=384, metric=3.28 vs. limit=15.0 2023-10-02 12:32:46,585 WARNING [train.py:1204] (2/4) Exclude cut with ID _29853_210_7497_1_1532505600851_6642110_461-142829-0_sp1.1 from training. Duration: 0.97275 2023-10-02 12:32:47,986 WARNING [train.py:1204] (2/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_788-298907-0_sp0.9 from training. Duration: 0.988875 2023-10-02 12:32:49,351 WARNING [train.py:1204] (2/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_739-2098-0_sp1.1 from training. Duration: 0.836375 2023-10-02 12:32:52,164 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7543_210_6753_1_1528541907_1399322_165-323619-0 from training. Duration: 0.69 2023-10-02 12:32:53,383 WARNING [train.py:1204] (2/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_401-255491-0_sp1.1 from training. Duration: 0.636375 2023-10-02 12:32:57,976 WARNING [train.py:1204] (2/4) Exclude cut with ID _66873_210_5517_1_1535454136506_4293370_24-143283-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 12:32:58,069 WARNING [train.py:1204] (2/4) Exclude cut with ID _14459_210_2228_1_1531443641946_7516700_1221-280439-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 12:33:00,730 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.568e+02 1.837e+02 2.008e+02 2.202e+02 3.381e+02, threshold=4.017e+02, percent-clipped=0.0 2023-10-02 12:33:00,821 WARNING [train.py:1204] (2/4) Exclude cut with ID _59202_210_26984_1_1534816810612_3549174_469-213482-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 12:33:00,822 WARNING [train.py:1204] (2/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_148-158509-0_sp0.9 from training. Duration: 0.4555625 2023-10-02 12:33:00,848 WARNING [train.py:1204] (2/4) Exclude cut with ID _24314_210_4915_1_1532135078116_7170179_863-259095-0_sp1.1 from training. Duration: 0.97275 2023-10-02 12:33:02,121 WARNING [train.py:1204] (2/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_321-22821-0 from training. Duration: 0.89 2023-10-02 12:33:02,162 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_43831_210_2158_1_1533725502_5872791_55-163137-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 12:33:02,198 WARNING [train.py:1204] (2/4) Exclude cut with ID _27430_210_7770_1_1532604689375_1378590_26-25207-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 12:33:03,515 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_17613_210_2151_1_1531137569_2366160_223-316461-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 12:33:03,589 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_53679_210_4852_1_1534556726_4186141_541-213370-0_sp1.1 from training. Duration: 0.963625 2023-10-02 12:33:05,048 WARNING [train.py:1204] (2/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_369-295086-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 12:33:05,110 WARNING [train.py:1204] (2/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_91-267696-0 from training. Duration: 0.87 2023-10-02 12:33:06,417 WARNING [train.py:1204] (2/4) Exclude cut with ID _5294_210_1747_1_1527249643928_3696179_180-100713-0 from training. Duration: 0.81 2023-10-02 12:33:06,455 WARNING [train.py:1204] (2/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_32-335103-0 from training. Duration: 0.82 2023-10-02 12:33:06,460 WARNING [train.py:1204] (2/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_133-312727-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 12:33:08,348 WARNING [train.py:1204] (2/4) Exclude cut with ID _59288_210_5854_1_1535077757774_3412246_391-165934-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 12:33:08,396 WARNING [train.py:1204] (2/4) Exclude cut with ID _28323_210_16187_1_1532480395667_4100300_488-1281-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 12:33:09,762 WARNING [train.py:1204] (2/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_124-193207-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 12:33:10,147 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.self_attn_weights.pos_emb_skip_rate, batch_count=880873.3333333334, ans=0.0 2023-10-02 12:33:19,054 WARNING [train.py:1204] (2/4) Exclude cut with ID _14087_210_2253_1_1530529224512_3014029_226-242140-0_sp0.9 from training. Duration: 0.9 2023-10-02 12:33:20,345 INFO [train.py:1046] (2/4) Epoch 25, batch 4650, loss[loss=0.1509, simple_loss=0.2309, pruned_loss=0.03545, over 24606.00 frames. ], tot_loss[loss=0.169, simple_loss=0.2451, pruned_loss=0.04641, over 4695926.12 frames. ], batch size: 60, lr: 4.07e-03, grad_scale: 8.0 2023-10-02 12:33:21,802 WARNING [train.py:1204] (2/4) Exclude cut with ID _29277_210_3279_1_1532581109293_3173069_392-3694-0_sp1.1 from training. Duration: 0.936375 2023-10-02 12:33:23,103 WARNING [train.py:1204] (2/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_563-329842-0_sp1.1 from training. Duration: 0.97275 2023-10-02 12:33:23,746 WARNING [train.py:1204] (2/4) Exclude cut with ID _58583_210_5236_1_1534726835266_3574249_391-87755-0_sp1.1 from training. Duration: 0.92725 2023-10-02 12:33:23,763 WARNING [train.py:1204] (2/4) Exclude cut with ID _28427_210_4220_1_1532581240967_3061069_281-237827-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 12:33:23,777 WARNING [train.py:1204] (2/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_44-147898-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 12:33:24,855 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_24007_210_1794_1_1531918925_1663875_125-320772-0_sp1.1 from training. Duration: 0.97275 2023-10-02 12:33:27,659 WARNING [train.py:1204] (2/4) Exclude cut with ID _14368_210_4220_1_1530532714870_3595529_143-117421-0 from training. Duration: 0.58 2023-10-02 12:33:30,535 WARNING [train.py:1204] (2/4) Exclude cut with ID _69584_210_5219_1_1535785133775_4044450_325-7656-0_sp0.9 from training. Duration: 0.9555625 2023-10-02 12:33:31,950 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_37351_210_7925_1_1533012931_3812113_265-85780-0 from training. Duration: 0.88 2023-10-02 12:33:31,976 WARNING [train.py:1204] (2/4) Exclude cut with ID _18516_210_9206_1_1531704653206_3692406_331-237896-0_sp1.1 from training. Duration: 0.936375 2023-10-02 12:33:33,362 WARNING [train.py:1204] (2/4) Exclude cut with ID _14629_210_15709_1_1530604298182_956899_6-232695-0 from training. Duration: 0.94 2023-10-02 12:33:33,416 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7544_210_6753_1_1528587000_691468_87-178599-0_sp1.1 from training. Duration: 0.7909375 2023-10-02 12:33:33,481 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_326-303945-0 from training. Duration: 0.99 2023-10-02 12:33:33,722 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.conv_module2.balancer1.max_abs, batch_count=881006.6666666666, ans=10.0 2023-10-02 12:33:35,178 WARNING [train.py:1204] (2/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_503-30719-0 from training. Duration: 0.98 2023-10-02 12:33:35,186 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_11305_210_1794_1_1529750428_7178148_299-344640-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 12:33:35,248 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_53071_210_16554_1_1534490405_2954066_265-170472-0_sp1.1 from training. Duration: 0.7545625 2023-10-02 12:33:36,835 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.conv_module1.balancer2.prob, batch_count=881006.6666666666, ans=0.125 2023-10-02 12:33:38,002 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_174-277364-0_sp1.1 from training. Duration: 0.6454375 2023-10-02 12:33:39,447 WARNING [train.py:1204] (2/4) Exclude cut with ID _58414_210_8254_1_1535421609162_7151182_193-151884-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 12:33:39,473 WARNING [train.py:1204] (2/4) Exclude cut with ID IC0651W0104-322063 from training. Duration: 0.768 2023-10-02 12:33:40,965 WARNING [train.py:1204] (2/4) Exclude cut with ID _21881_210_3525_1_1531825242757_3798380_139-305982-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 12:33:42,785 WARNING [train.py:1204] (2/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_79-200392-0 from training. Duration: 0.97 2023-10-02 12:33:45,676 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.0.balancer2.prob, batch_count=881006.6666666666, ans=0.125 2023-10-02 12:33:47,357 WARNING [train.py:1204] (2/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_451-280894-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 12:33:47,365 WARNING [train.py:1204] (2/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_304-273433-0_sp1.1 from training. Duration: 0.9 2023-10-02 12:33:48,673 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_5959_210_1794_1_1527400400_3445726_412-44052-0 from training. Duration: 0.77 2023-10-02 12:33:48,791 WARNING [train.py:1204] (2/4) Exclude cut with ID _4681_210_3525_1_1527251040321_1235713_148-26159-0_sp1.1 from training. Duration: 0.963625 2023-10-02 12:33:52,885 WARNING [train.py:1204] (2/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_887-249555-0_sp0.9 from training. Duration: 0.8555625 2023-10-02 12:33:53,085 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.bypass.scale_min, batch_count=881073.3333333334, ans=0.2 2023-10-02 12:33:57,527 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_25236_210_2158_1_1532051968_280567_37-6860-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 12:34:03,058 WARNING [train.py:1204] (2/4) Exclude cut with ID _29277_210_3279_1_1532581109293_3173069_417-115527-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 12:34:04,992 WARNING [train.py:1204] (2/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_552-134748-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 12:34:06,334 WARNING [train.py:1204] (2/4) Exclude cut with ID _43176_210_8254_1_1533524417469_4174339_188-174143-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 12:34:06,377 WARNING [train.py:1204] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_127-119109-0_sp1.1 from training. Duration: 0.6545625 2023-10-02 12:34:07,799 WARNING [train.py:1204] (2/4) Exclude cut with ID _14922_210_6228_1_1530707435273_3693310_38-187851-0 from training. Duration: 0.62 2023-10-02 12:34:09,082 WARNING [train.py:1204] (2/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_588-125165-0 from training. Duration: 0.83 2023-10-02 12:34:10,418 WARNING [train.py:1204] (2/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_344-152526-0_sp1.1 from training. Duration: 0.4818125 2023-10-02 12:34:10,419 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7002_210_1794_1_1527935634_4034444_487-236501-0 from training. Duration: 0.66 2023-10-02 12:34:11,854 WARNING [train.py:1204] (2/4) Exclude cut with ID _23825_210_13449_1_1531915313526_3497150_379-16981-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 12:34:13,840 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.3.encoder.layers.2.self_attn2.whiten, num_groups=1, num_channels=512, metric=11.08 vs. limit=22.5 2023-10-02 12:34:16,523 WARNING [train.py:1204] (2/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_297-5233-0_sp1.1 from training. Duration: 0.863625 2023-10-02 12:34:16,527 WARNING [train.py:1204] (2/4) Exclude cut with ID _41298_210_9019_1_1533430371450_4822143_178-283538-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 12:34:16,562 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7046_210_6753_1_1527895337_6920917_642-106799-0 from training. Duration: 0.92 2023-10-02 12:34:16,575 WARNING [train.py:1204] (2/4) Exclude cut with ID _6274_210_2084_1_1527937583837_7494020_532-291952-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 12:34:18,384 WARNING [train.py:1204] (2/4) Exclude cut with ID _47278_210_12041_1_1533902418349_3576110_260-270468-0_sp1.1 from training. Duration: 0.92725 2023-10-02 12:34:18,389 WARNING [train.py:1204] (2/4) Exclude cut with ID _14368_210_4220_1_1530532714870_3595529_144-3287-0_sp0.9 from training. Duration: 0.7444375 2023-10-02 12:34:19,819 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_162-262717-0_sp0.9 from training. Duration: 0.92225 2023-10-02 12:34:22,485 WARNING [train.py:1204] (2/4) Exclude cut with ID _3465_210_3525_1_1526175667060_3574049_62-335552-0_sp0.9 from training. Duration: 0.9666875 2023-10-02 12:34:22,490 WARNING [train.py:1204] (2/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_726-272852-0_sp1.1 from training. Duration: 0.92725 2023-10-02 12:34:23,827 WARNING [train.py:1204] (2/4) Exclude cut with ID _14898_210_9206_1_1530766812377_4177190_26-135492-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 12:34:26,957 WARNING [train.py:1204] (2/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_200-95304-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 12:34:27,130 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=881206.6666666666, ans=0.1 2023-10-02 12:34:27,180 INFO [scaling.py:1118] (2/4) WithLoss: name=encoder.encoders.2.encoder.layers.2.self_attn_weights, loss-sum=0.000e+00 2023-10-02 12:34:27,511 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.4.encoder.layers.1.self_attn2.whiten, num_groups=1, num_channels=384, metric=11.49 vs. limit=22.5 2023-10-02 12:34:28,307 WARNING [train.py:1204] (2/4) Exclude cut with ID _45012_210_12651_1_1533608435136_5371380_240-37101-0_sp1.1 from training. Duration: 0.8454375 2023-10-02 12:34:28,324 WARNING [train.py:1204] (2/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_132-269633-0_sp0.9 from training. Duration: 0.7555625 2023-10-02 12:34:28,863 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.5.encoder.layers.0.conv_module2.whiten, num_groups=1, num_channels=256, metric=4.35 vs. limit=15.0 2023-10-02 12:34:29,723 WARNING [train.py:1204] (2/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_907-151398-0_sp0.9 from training. Duration: 0.411125 2023-10-02 12:34:31,106 WARNING [train.py:1204] (2/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_45-265684-0_sp0.9 from training. Duration: 0.8 2023-10-02 12:34:32,444 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6291_210_3273_1_1527683649_1078498_74-292608-0 from training. Duration: 0.72 2023-10-02 12:34:33,835 INFO [train.py:1046] (2/4) Epoch 25, batch 4700, loss[loss=0.1679, simple_loss=0.255, pruned_loss=0.04039, over 24324.00 frames. ], tot_loss[loss=0.1688, simple_loss=0.2453, pruned_loss=0.04615, over 4714002.10 frames. ], batch size: 74, lr: 4.07e-03, grad_scale: 8.0 2023-10-02 12:34:39,917 WARNING [train.py:1204] (2/4) Exclude cut with ID _59202_210_26984_1_1534816810612_3549174_221-84819-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 12:34:39,987 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_33696_210_3852_1_1533082780_2663375_181-325595-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 12:34:40,018 WARNING [train.py:1204] (2/4) Exclude cut with ID _47251_210_18628_1_1533814063128_4137250_351-154105-0_sp1.1 from training. Duration: 0.963625 2023-10-02 12:34:40,201 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=881273.3333333334, ans=0.1 2023-10-02 12:34:41,530 WARNING [train.py:1204] (2/4) Exclude cut with ID _35501_210_9288_1_1532998507986_3192189_351-334740-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 12:34:42,901 WARNING [train.py:1204] (2/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_342-100157-0_sp1.1 from training. Duration: 0.4909375 2023-10-02 12:34:46,432 WARNING [train.py:1204] (2/4) Exclude cut with ID _6789_210_1534_1_1527857937218_3118950_262-48853-0 from training. Duration: 0.56 2023-10-02 12:34:46,475 WARNING [train.py:1204] (2/4) Exclude cut with ID _69884_210_9771_1_1535846392818_4166169_430-201020-0 from training. Duration: 0.97 2023-10-02 12:34:49,337 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_71-136518-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 12:34:50,690 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_28-291982-0_sp1.1 from training. Duration: 0.8818125 2023-10-02 12:34:50,730 WARNING [train.py:1204] (2/4) Exclude cut with ID _54218_210_12210_1_1534413646388_4409259_437-275326-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 12:34:54,696 WARNING [train.py:1204] (2/4) Exclude cut with ID _55134_210_21982_1_1534590028377_3882170_40-309139-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 12:34:56,581 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder_embed.conv.8.prob, batch_count=881340.0, ans=0.125 2023-10-02 12:34:59,419 WARNING [train.py:1204] (2/4) Exclude cut with ID _32704_210_8954_1_1533017488840_6747370_266-329755-0_sp1.1 from training. Duration: 0.8090625 2023-10-02 12:35:00,786 WARNING [train.py:1204] (2/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_21-174514-0_sp1.1 from training. Duration: 0.3909375 2023-10-02 12:35:03,463 WARNING [train.py:1204] (2/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_409-207378-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 12:35:09,627 WARNING [train.py:1204] (2/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_132-269633-0 from training. Duration: 0.68 2023-10-02 12:35:11,013 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_2245_210_2715_1_1525511548_3540254_310-52483-0_sp0.9 from training. Duration: 0.97775 2023-10-02 12:35:13,646 WARNING [train.py:1204] (2/4) Exclude cut with ID _27430_210_7770_1_1532604689375_1378590_47-77403-0_sp1.1 from training. Duration: 0.92725 2023-10-02 12:35:17,412 WARNING [train.py:1204] (2/4) Exclude cut with ID _7562_210_2084_1_1528887767916_3946229_186-326422-0 from training. Duration: 0.84 2023-10-02 12:35:18,812 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_14056_210_4163_1_1530604483_7572827_230-35993-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 12:35:22,975 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_23854_210_1794_1_1531969696_1329157_57-123491-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 12:35:24,230 WARNING [train.py:1204] (2/4) Exclude cut with ID _14747_210_8341_1_1530685797463_3654089_147-131229-0 from training. Duration: 0.76 2023-10-02 12:35:27,118 WARNING [train.py:1204] (2/4) Exclude cut with ID _30191_210_1664_1_1533189915047_7196670_430-19539-0_sp1.1 from training. Duration: 0.92725 2023-10-02 12:35:27,142 WARNING [train.py:1204] (2/4) Exclude cut with ID _67725_210_27767_1_1535608756671_5105359_44-50762-0_sp1.1 from training. Duration: 0.963625 2023-10-02 12:35:28,469 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.585e+02 1.857e+02 2.002e+02 2.228e+02 2.822e+02, threshold=4.004e+02, percent-clipped=0.0 2023-10-02 12:35:30,481 WARNING [train.py:1204] (2/4) Exclude cut with ID _27485_210_4916_1_1532739191677_3668269_217-161755-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 12:35:30,530 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_5931_210_2040_1_1527428481_4547999_321-193614-0_sp1.1 from training. Duration: 0.6090625 2023-10-02 12:35:30,546 WARNING [train.py:1204] (2/4) Exclude cut with ID _6267_210_2084_1_1527764658526_3894730_375-219887-0 from training. Duration: 0.67 2023-10-02 12:35:32,030 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9087W0022-538393 from training. Duration: 0.768 2023-10-02 12:35:33,539 WARNING [train.py:1204] (2/4) Exclude cut with ID _68777_210_4353_1_1535810390731_3117904_83-196459-0_sp1.1 from training. Duration: 0.963625 2023-10-02 12:35:33,887 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.feed_forward2.hidden_balancer.prob, batch_count=881540.0, ans=0.125 2023-10-02 12:35:34,889 WARNING [train.py:1204] (2/4) Exclude cut with ID _69884_210_9771_1_1535846392818_4166169_455-343812-0_sp1.1 from training. Duration: 0.92725 2023-10-02 12:35:34,889 WARNING [train.py:1204] (2/4) Exclude cut with ID _13916_210_9994_1_1530788341126_3763149_414-320174-0_sp1.1 from training. Duration: 0.92725 2023-10-02 12:35:34,893 WARNING [train.py:1204] (2/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_398-239819-0 from training. Duration: 0.99 2023-10-02 12:35:34,995 WARNING [train.py:1204] (2/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_199-232314-0_sp1.1 from training. Duration: 0.92725 2023-10-02 12:35:37,746 WARNING [train.py:1204] (2/4) Exclude cut with ID _2334_210_2667_1_1525608238880_4705129_459-104997-0 from training. Duration: 0.95 2023-10-02 12:35:42,142 WARNING [train.py:1204] (2/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_441-209395-0_sp1.1 from training. Duration: 0.936375 2023-10-02 12:35:43,515 WARNING [train.py:1204] (2/4) Exclude cut with ID _27052_210_5986_1_1532329197187_3585009_320-80393-0_sp1.1 from training. Duration: 0.97275 2023-10-02 12:35:44,082 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.3.encoder.layers.1.whiten, num_groups=1, num_channels=512, metric=5.43 vs. limit=12.0 2023-10-02 12:35:46,544 WARNING [train.py:1204] (2/4) Exclude cut with ID _21783_210_11357_1_1531742458730_3762679_89-217253-0_sp1.1 from training. Duration: 0.97275 2023-10-02 12:35:48,461 INFO [train.py:1046] (2/4) Epoch 25, batch 4750, loss[loss=0.1507, simple_loss=0.2272, pruned_loss=0.03707, over 24299.00 frames. ], tot_loss[loss=0.1693, simple_loss=0.2461, pruned_loss=0.04621, over 4720326.48 frames. ], batch size: 56, lr: 4.07e-03, grad_scale: 8.0 2023-10-02 12:35:48,522 WARNING [train.py:1204] (2/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_453-282588-0_sp1.1 from training. Duration: 0.8909375 2023-10-02 12:35:50,418 WARNING [train.py:1204] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_88-341718-0 from training. Duration: 0.66 2023-10-02 12:35:51,752 WARNING [train.py:1204] (2/4) Exclude cut with ID _43552_210_8913_1_1533726031059_3523429_230-3521-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 12:35:54,463 WARNING [train.py:1204] (2/4) Exclude cut with ID _9679_210_7497_1_1529129212906_4363929_512-313889-0 from training. Duration: 0.81 2023-10-02 12:35:57,160 WARNING [train.py:1204] (2/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_194-252800-0_sp1.1 from training. Duration: 0.8818125 2023-10-02 12:35:57,178 WARNING [train.py:1204] (2/4) Exclude cut with ID _55209_210_1531_1_1534675828996_4597160_239-293541-0_sp1.1 from training. Duration: 0.963625 2023-10-02 12:35:57,345 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=881606.6666666666, ans=0.1 2023-10-02 12:35:58,532 WARNING [train.py:1204] (2/4) Exclude cut with ID _35799_210_5854_1_1533263487887_3529071_15-325131-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 12:36:04,402 WARNING [train.py:1204] (2/4) Exclude cut with ID _14100_210_8341_1_1530529320613_3678069_279-55760-0 from training. Duration: 0.93 2023-10-02 12:36:08,576 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_489-230703-0_sp1.1 from training. Duration: 0.77275 2023-10-02 12:36:09,997 WARNING [train.py:1204] (2/4) Exclude cut with ID _6694_210_4599_1_1527852637490_3965529_144-185051-0 from training. Duration: 0.96 2023-10-02 12:36:10,145 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.attention_skip_rate, batch_count=881673.3333333334, ans=0.0 2023-10-02 12:36:11,322 WARNING [train.py:1204] (2/4) Exclude cut with ID _28461_210_2253_1_1532771775189_3041299_1-228155-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 12:36:14,780 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.attention_skip_rate, batch_count=881673.3333333334, ans=0.0 2023-10-02 12:36:15,956 WARNING [train.py:1204] (2/4) Exclude cut with ID _32704_210_8954_1_1533017488840_6747370_686-252323-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 12:36:15,958 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_40896_210_15710_1_1533433956_7100802_1075-294633-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 12:36:15,969 WARNING [train.py:1204] (2/4) Exclude cut with ID _22083_210_8254_1_1531702862439_3678970_30-92879-0_sp1.1 from training. Duration: 0.97275 2023-10-02 12:36:17,333 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9245W0121-616888 from training. Duration: 0.896 2023-10-02 12:36:17,336 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6786_210_1388_1_1527839699_4194495_378-46070-0 from training. Duration: 0.72 2023-10-02 12:36:22,050 WARNING [train.py:1204] (2/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_317-175062-0 from training. Duration: 0.82 2023-10-02 12:36:25,331 WARNING [train.py:1204] (2/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_238-89390-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 12:36:26,728 WARNING [train.py:1204] (2/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_524-310811-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 12:36:29,296 WARNING [train.py:1204] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_307-158462-0_sp0.9 from training. Duration: 0.7666875 2023-10-02 12:36:29,297 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9309W0191-640318 from training. Duration: 0.512 2023-10-02 12:36:29,301 WARNING [train.py:1204] (2/4) Exclude cut with ID _55153_210_2253_1_1534384786677_3162096_100-204617-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 12:36:30,771 WARNING [train.py:1204] (2/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_112-149213-0_sp1.1 from training. Duration: 0.863625 2023-10-02 12:36:34,967 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.2.encoder.layers.0.self_attn1.whiten, num_groups=1, num_channels=384, metric=19.87 vs. limit=22.5 2023-10-02 12:36:35,362 WARNING [train.py:1204] (2/4) Exclude cut with ID _67831_210_12392_1_1535612464202_3092612_346-2148-0_sp1.1 from training. Duration: 0.8454375 2023-10-02 12:36:38,045 WARNING [train.py:1204] (2/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_51-117576-0_sp0.9 from training. Duration: 0.4444375 2023-10-02 12:36:38,075 WARNING [train.py:1204] (2/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_300-27235-0 from training. Duration: 0.87 2023-10-02 12:36:39,438 WARNING [train.py:1204] (2/4) Exclude cut with ID _40399_210_4353_1_1533305658194_3279406_195-116825-0_sp1.1 from training. Duration: 0.97275 2023-10-02 12:36:39,461 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7002_210_1794_1_1527939683_5157286_163-87552-0_sp0.9 from training. Duration: 0.9555625 2023-10-02 12:36:39,494 WARNING [train.py:1204] (2/4) Exclude cut with ID _19514_210_9259_1_1531530071330_5084230_560-237597-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 12:36:40,878 WARNING [train.py:1204] (2/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_41-234353-0_sp1.1 from training. Duration: 0.6181875 2023-10-02 12:36:42,189 WARNING [train.py:1204] (2/4) Exclude cut with ID _6274_210_2084_1_1527937583837_7494020_769-168302-0 from training. Duration: 0.71 2023-10-02 12:36:43,688 WARNING [train.py:1204] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_574-45541-0 from training. Duration: 0.65 2023-10-02 12:36:46,373 WARNING [train.py:1204] (2/4) Exclude cut with ID _50310_210_15488_1_1534068043345_3641080_262-67120-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 12:36:48,945 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.4.encoder.layers.1.feed_forward3.out_whiten, num_groups=1, num_channels=384, metric=11.98 vs. limit=15.0 2023-10-02 12:36:49,761 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_53071_210_16554_1_1534493434_1276796_35-11927-0_sp1.1 from training. Duration: 0.963625 2023-10-02 12:36:49,763 WARNING [train.py:1204] (2/4) Exclude cut with ID _3992_210_1747_1_1526644816031_3731060_107-251434-0 from training. Duration: 0.99 2023-10-02 12:36:51,133 WARNING [train.py:1204] (2/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_412-54488-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 12:36:51,415 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.conv_skip_rate, batch_count=881873.3333333334, ans=0.0 2023-10-02 12:36:52,466 WARNING [train.py:1204] (2/4) Exclude cut with ID _18516_210_9206_1_1531704653206_3692406_77-258490-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 12:36:53,883 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_40047_210_2290_1_1533290519_2374311_195-165693-0_sp0.9 from training. Duration: 0.988875 2023-10-02 12:36:55,206 WARNING [train.py:1204] (2/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_296-36971-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 12:36:55,267 WARNING [train.py:1204] (2/4) Exclude cut with ID _14970_210_15709_1_1530773543493_1715730_187-20259-0_sp1.1 from training. Duration: 0.8090625 2023-10-02 12:36:59,851 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_53373_210_5399_1_1534229471_4715695_135-305927-0_sp1.1 from training. Duration: 0.936375 2023-10-02 12:36:59,888 WARNING [train.py:1204] (2/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_60-18123-0 from training. Duration: 0.88 2023-10-02 12:37:01,143 INFO [train.py:1046] (2/4) Epoch 25, batch 4800, loss[loss=0.167, simple_loss=0.2503, pruned_loss=0.04188, over 24551.00 frames. ], tot_loss[loss=0.1695, simple_loss=0.2466, pruned_loss=0.04624, over 4727635.42 frames. ], batch size: 71, lr: 4.07e-03, grad_scale: 16.0 2023-10-02 12:37:01,226 WARNING [train.py:1204] (2/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_166-166990-0 from training. Duration: 0.96 2023-10-02 12:37:02,589 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_5438_210_1794_1_1527336687_2600009_233-58072-0 from training. Duration: 0.77 2023-10-02 12:37:03,981 WARNING [train.py:1204] (2/4) Exclude cut with ID _8833_210_5219_1_1528592665210_7553059_611-80313-0_sp0.9 from training. Duration: 0.711125 2023-10-02 12:37:05,370 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_53555_210_5632_1_1534595220_7232297_118-43239-0_sp1.1 from training. Duration: 0.936375 2023-10-02 12:37:05,448 WARNING [train.py:1204] (2/4) Exclude cut with ID _12369_210_4381_1_1530356078581_4388520_480-13146-0 from training. Duration: 0.85 2023-10-02 12:37:11,361 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_892-28814-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 12:37:11,406 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_24899_210_16554_1_1532046422_3827462_160-21255-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 12:37:16,946 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_154-316450-0_sp1.1 from training. Duration: 0.6909375 2023-10-02 12:37:18,240 WARNING [train.py:1204] (2/4) Exclude cut with ID _30196_210_1664_1_1533708496522_7074140_1225-301232-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 12:37:18,267 WARNING [train.py:1204] (2/4) Exclude cut with ID _28783_210_7943_1_1532671164808_7155479_414-208100-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 12:37:18,311 WARNING [train.py:1204] (2/4) Exclude cut with ID _19023_210_4353_1_1531378892552_5820780_83-254794-0 from training. Duration: 0.86 2023-10-02 12:37:19,761 WARNING [train.py:1204] (2/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_591-83752-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 12:37:19,801 WARNING [train.py:1204] (2/4) Exclude cut with ID _29877_210_6228_1_1532397791106_5276149_266-62291-0_sp1.1 from training. Duration: 0.7909375 2023-10-02 12:37:21,784 WARNING [train.py:1204] (2/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_280-161633-0_sp1.1 from training. Duration: 0.663625 2023-10-02 12:37:23,458 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.attention_skip_rate, batch_count=882006.6666666666, ans=0.0 2023-10-02 12:37:25,148 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.3.encoder.layers.2.conv_module2.whiten, num_groups=1, num_channels=512, metric=8.85 vs. limit=15.0 2023-10-02 12:37:25,958 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7543_210_6753_1_1528539468_333764_39-156634-0_sp1.1 from training. Duration: 0.97275 2023-10-02 12:37:27,884 WARNING [train.py:1204] (2/4) Exclude cut with ID _67499_210_17282_1_1535509805220_3692430_505-312248-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 12:37:27,906 WARNING [train.py:1204] (2/4) Exclude cut with ID _5294_210_1747_1_1527249643928_3696179_180-100713-0_sp0.9 from training. Duration: 0.9 2023-10-02 12:37:29,913 WARNING [train.py:1204] (2/4) Exclude cut with ID _26528_210_9070_1_1532570410842_3851310_508-148132-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 12:37:29,922 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_211-339622-0_sp1.1 from training. Duration: 0.4090625 2023-10-02 12:37:29,934 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_40896_210_15710_1_1533433956_7100802_339-138356-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 12:37:31,395 WARNING [train.py:1204] (2/4) Exclude cut with ID _27060_210_5986_1_1532674838922_3522549_211-19933-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 12:37:32,849 WARNING [train.py:1204] (2/4) Exclude cut with ID _60229_210_27767_1_1534838406158_3655350_457-117923-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 12:37:35,458 WARNING [train.py:1204] (2/4) Exclude cut with ID _55429_210_10626_1_1534467588527_7174143_460-141439-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 12:37:36,887 WARNING [train.py:1204] (2/4) Exclude cut with ID _59170_210_6721_1_1534983633960_4203335_263-10780-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 12:37:36,906 WARNING [train.py:1204] (2/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_872-264525-0_sp0.9 from training. Duration: 0.72225 2023-10-02 12:37:36,975 WARNING [train.py:1204] (2/4) Exclude cut with ID _7624_210_1531_1_1528109802198_4933101_418-349834-0_sp0.9 from training. Duration: 0.6666875 2023-10-02 12:37:37,768 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.conv_module1.balancer1.prob, batch_count=882073.3333333334, ans=0.125 2023-10-02 12:37:37,792 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.nonlin_attention.balancer.prob, batch_count=882073.3333333334, ans=0.125 2023-10-02 12:37:38,821 WARNING [train.py:1204] (2/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_27-53998-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 12:37:41,500 WARNING [train.py:1204] (2/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_202-183922-0 from training. Duration: 0.85 2023-10-02 12:37:41,519 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7002_210_1794_1_1527939683_5157286_1-281264-0 from training. Duration: 0.44 2023-10-02 12:37:42,899 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_53753_210_7234_1_1534320003_3594148_493-9314-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 12:37:42,917 WARNING [train.py:1204] (2/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_217-95056-0_sp1.1 from training. Duration: 0.8909375 2023-10-02 12:37:44,254 WARNING [train.py:1204] (2/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_406-45086-0_sp1.1 from training. Duration: 0.72725 2023-10-02 12:37:44,261 WARNING [train.py:1204] (2/4) Exclude cut with ID _31649_210_6228_1_1532584608063_4091078_505-213821-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 12:37:44,269 WARNING [train.py:1204] (2/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_317-175062-0_sp0.9 from training. Duration: 0.911125 2023-10-02 12:37:44,413 WARNING [train.py:1204] (2/4) Exclude cut with ID _5444_210_2151_1_1527418808464_5312210_726-27165-0_sp1.1 from training. Duration: 0.6090625 2023-10-02 12:37:45,744 WARNING [train.py:1204] (2/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_494-170437-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 12:37:50,498 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_474-64054-0_sp1.1 from training. Duration: 0.936375 2023-10-02 12:37:53,225 WARNING [train.py:1204] (2/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_521-7556-0_sp1.1 from training. Duration: 0.97275 2023-10-02 12:37:54,589 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_269-348505-0_sp1.1 from training. Duration: 0.92725 2023-10-02 12:37:55,796 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.524e+02 1.900e+02 2.147e+02 2.539e+02 4.141e+02, threshold=4.294e+02, percent-clipped=1.0 2023-10-02 12:37:59,284 WARNING [train.py:1204] (2/4) Exclude cut with ID _21878_210_3525_1_1531738772002_7264330_559-184862-0 from training. Duration: 0.94 2023-10-02 12:37:59,335 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_68702_210_23689_1_1535799577_3420068_92-76040-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 12:38:00,685 WARNING [train.py:1204] (2/4) Exclude cut with ID _22826_210_9994_1_1531908201522_3279169_227-102052-0_sp1.1 from training. Duration: 0.97275 2023-10-02 12:38:00,721 WARNING [train.py:1204] (2/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_718-250907-0_sp0.9 from training. Duration: 0.7666875 2023-10-02 12:38:00,775 WARNING [train.py:1204] (2/4) Exclude cut with ID _14069_210_5632_1_1530512692033_4803390_352-143891-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 12:38:05,283 WARNING [train.py:1204] (2/4) Exclude cut with ID _6267_210_2084_1_1527764658526_3894730_65-106602-0_sp1.1 from training. Duration: 0.936375 2023-10-02 12:38:06,680 WARNING [train.py:1204] (2/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_202-183922-0_sp0.9 from training. Duration: 0.9444375 2023-10-02 12:38:06,691 WARNING [train.py:1204] (2/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_465-268827-0_sp1.1 from training. Duration: 0.97275 2023-10-02 12:38:07,986 WARNING [train.py:1204] (2/4) Exclude cut with ID _12369_210_4381_1_1530356078581_4388520_58-29064-0_sp1.1 from training. Duration: 0.9 2023-10-02 12:38:08,026 WARNING [train.py:1204] (2/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_1-99680-0_sp0.9 from training. Duration: 0.7444375 2023-10-02 12:38:09,344 WARNING [train.py:1204] (2/4) Exclude cut with ID _13253_210_1925_1_1530757775819_3577873_191-144977-0_sp1.1 from training. Duration: 0.7090625 2023-10-02 12:38:09,537 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.1.conv_skip_rate, batch_count=882206.6666666666, ans=0.0 2023-10-02 12:38:12,714 WARNING [train.py:1204] (2/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_276-112684-0_sp1.1 from training. Duration: 0.92725 2023-10-02 12:38:12,722 WARNING [train.py:1204] (2/4) Exclude cut with ID _64639_210_5816_1_1535367619437_3528000_358-411-0_sp1.1 from training. Duration: 0.97275 2023-10-02 12:38:12,729 WARNING [train.py:1204] (2/4) Exclude cut with ID _31649_210_6228_1_1532584608063_4091078_106-21199-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 12:38:14,132 WARNING [train.py:1204] (2/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_901-79438-0 from training. Duration: 0.94 2023-10-02 12:38:15,489 INFO [train.py:1046] (2/4) Epoch 25, batch 4850, loss[loss=0.1509, simple_loss=0.2411, pruned_loss=0.0304, over 24312.00 frames. ], tot_loss[loss=0.1695, simple_loss=0.2467, pruned_loss=0.04618, over 4732223.01 frames. ], batch size: 74, lr: 4.07e-03, grad_scale: 16.0 2023-10-02 12:38:15,565 WARNING [train.py:1204] (2/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_718-250907-0 from training. Duration: 0.69 2023-10-02 12:38:15,571 WARNING [train.py:1204] (2/4) Exclude cut with ID _5287_210_2084_1_1527418883040_3878670_363-194161-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 12:38:15,574 WARNING [train.py:1204] (2/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_586-321321-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 12:38:16,918 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_68900_210_2087_1_1535713157_3708529_164-292661-0_sp1.1 from training. Duration: 0.963625 2023-10-02 12:38:16,919 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_26433_210_12108_1_1532157158_4056480_114-144878-0_sp1.1 from training. Duration: 0.97275 2023-10-02 12:38:19,920 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_15517_210_6753_1_1530954008_3487336_96-341874-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 12:38:27,115 WARNING [train.py:1204] (2/4) Exclude cut with ID _14268_210_9206_1_1530622771285_4714099_637-86630-0 from training. Duration: 0.51 2023-10-02 12:38:27,202 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_28211_210_6388_1_1532861506_4523224_121-320092-0_sp1.1 from training. Duration: 0.92725 2023-10-02 12:38:34,422 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_141-89113-0_sp0.9 from training. Duration: 0.9555625 2023-10-02 12:38:34,504 WARNING [train.py:1204] (2/4) Exclude cut with ID _6789_210_1534_1_1527857937218_3118950_311-61262-0_sp1.1 from training. Duration: 0.5909375 2023-10-02 12:38:34,532 WARNING [train.py:1204] (2/4) Exclude cut with ID _58480_210_12392_1_1534811121111_3646160_389-200461-0_sp1.1 from training. Duration: 0.97275 2023-10-02 12:38:37,367 WARNING [train.py:1204] (2/4) Exclude cut with ID _41326_210_5854_1_1533778076509_3492080_28-156553-0_sp1.1 from training. Duration: 0.92725 2023-10-02 12:38:38,835 WARNING [train.py:1204] (2/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_577-287150-0_sp1.1 from training. Duration: 0.7454375 2023-10-02 12:38:38,929 WARNING [train.py:1204] (2/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_40-129756-0_sp1.1 from training. Duration: 0.736375 2023-10-02 12:38:38,938 WARNING [train.py:1204] (2/4) Exclude cut with ID _60113_210_16271_1_1535164212481_4386079_490-280290-0 from training. Duration: 0.98 2023-10-02 12:38:42,322 WARNING [train.py:1204] (2/4) Exclude cut with ID _58059_210_5854_1_1534901471149_3164808_34-39905-0_sp1.1 from training. Duration: 0.963625 2023-10-02 12:38:44,939 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_791-197891-0_sp1.1 from training. Duration: 0.763625 2023-10-02 12:38:44,958 WARNING [train.py:1204] (2/4) Exclude cut with ID _6789_210_1534_1_1527857937218_3118950_262-48853-0_sp1.1 from training. Duration: 0.5090625 2023-10-02 12:38:46,261 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_2655_210_2087_1_1525688959_3897400_52-279899-0_sp0.9 from training. Duration: 0.7666875 2023-10-02 12:38:46,264 WARNING [train.py:1204] (2/4) Exclude cut with ID _14884_210_2252_1_1530707459689_5561250_114-201399-0 from training. Duration: 0.76 2023-10-02 12:38:48,998 WARNING [train.py:1204] (2/4) Exclude cut with ID _19023_210_4353_1_1531378892552_5820780_83-254794-0_sp0.9 from training. Duration: 0.9555625 2023-10-02 12:38:49,012 WARNING [train.py:1204] (2/4) Exclude cut with ID _53489_210_6228_1_1534298357169_3835970_10-297919-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 12:38:53,642 WARNING [train.py:1204] (2/4) Exclude cut with ID _44585_210_18628_1_1533605350387_4518199_418-3055-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 12:38:53,657 WARNING [train.py:1204] (2/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_310-109461-0 from training. Duration: 0.8 2023-10-02 12:38:53,718 WARNING [train.py:1204] (2/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_235-262124-0 from training. Duration: 0.67 2023-10-02 12:38:53,925 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.balancer1.prob, batch_count=882406.6666666666, ans=0.125 2023-10-02 12:38:55,094 WARNING [train.py:1204] (2/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_195-167011-0_sp1.1 from training. Duration: 0.6818125 2023-10-02 12:39:02,552 WARNING [train.py:1204] (2/4) Exclude cut with ID _7852_210_5219_1_1528286471316_8139400_188-55162-0_sp1.1 from training. Duration: 0.936375 2023-10-02 12:39:02,608 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_35361_210_4238_1_1533262810_7793476_675-87012-0 from training. Duration: 0.89 2023-10-02 12:39:03,910 WARNING [train.py:1204] (2/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_465-81427-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 12:39:03,917 WARNING [train.py:1204] (2/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_597-154640-0_sp1.1 from training. Duration: 0.8181875 2023-10-02 12:39:04,075 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.nonlin_attention.balancer.prob, batch_count=882473.3333333334, ans=0.125 2023-10-02 12:39:05,401 WARNING [train.py:1204] (2/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_186-309161-0_sp0.9 from training. Duration: 0.6 2023-10-02 12:39:05,964 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.3.encoder.layers.2.self_attn2.whiten, num_groups=1, num_channels=512, metric=14.31 vs. limit=22.5 2023-10-02 12:39:06,774 WARNING [train.py:1204] (2/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_546-119312-0 from training. Duration: 0.99 2023-10-02 12:39:06,778 WARNING [train.py:1204] (2/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_330-240060-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 12:39:08,122 WARNING [train.py:1204] (2/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_477-255782-0 from training. Duration: 0.82 2023-10-02 12:39:08,138 WARNING [train.py:1204] (2/4) Exclude cut with ID _14970_210_15709_1_1530773543493_1715730_150-248939-0_sp1.1 from training. Duration: 0.97275 2023-10-02 12:39:09,546 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_556-206615-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 12:39:09,620 WARNING [train.py:1204] (2/4) Exclude cut with ID _3465_210_3525_1_1526175667060_3574049_308-231735-0 from training. Duration: 0.73 2023-10-02 12:39:18,220 WARNING [train.py:1204] (2/4) Exclude cut with ID _27485_210_4916_1_1532739191677_3668269_66-236571-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 12:39:18,382 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=882540.0, ans=0.1 2023-10-02 12:39:24,101 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_2221_210_1988_1_1525597051_4935541_415-296882-0_sp0.9 from training. Duration: 0.8555625 2023-10-02 12:39:24,108 WARNING [train.py:1204] (2/4) Exclude cut with ID _64521_210_14699_1_1535243427259_3815328_231-78630-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 12:39:28,605 INFO [train.py:1046] (2/4) Epoch 25, batch 4900, loss[loss=0.1614, simple_loss=0.221, pruned_loss=0.05092, over 23480.00 frames. ], tot_loss[loss=0.1686, simple_loss=0.2455, pruned_loss=0.04586, over 4713949.72 frames. ], batch size: 285, lr: 4.07e-03, grad_scale: 16.0 2023-10-02 12:39:28,693 WARNING [train.py:1204] (2/4) Exclude cut with ID _13942_210_9988_1_1530604906036_3733360_499-55676-0 from training. Duration: 0.62 2023-10-02 12:39:28,695 WARNING [train.py:1204] (2/4) Exclude cut with ID _49376_210_5986_1_1533985278732_3370740_301-154763-0_sp1.1 from training. Duration: 0.8909375 2023-10-02 12:39:33,405 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.conv_skip_rate, batch_count=882606.6666666666, ans=0.0 2023-10-02 12:39:34,633 WARNING [train.py:1204] (2/4) Exclude cut with ID _27485_210_4916_1_1532739191677_3668269_255-216698-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 12:39:34,966 INFO [scaling.py:1118] (2/4) WithLoss: name=encoder.encoders.4.encoder.layers.0.self_attn_weights, loss-sum=0.000e+00 2023-10-02 12:39:36,014 WARNING [train.py:1204] (2/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_137-334143-0_sp1.1 from training. Duration: 0.97275 2023-10-02 12:39:36,040 WARNING [train.py:1204] (2/4) Exclude cut with ID _6456_210_1534_1_1527681634212_3181049_215-309359-0_sp1.1 from training. Duration: 0.8 2023-10-02 12:39:38,834 WARNING [train.py:1204] (2/4) Exclude cut with ID _65640_210_6310_1_1535368201445_3173250_209-13008-0 from training. Duration: 0.99 2023-10-02 12:39:41,054 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.2.encoder.layers.0.feed_forward3.out_whiten, num_groups=1, num_channels=384, metric=12.59 vs. limit=15.0 2023-10-02 12:39:43,602 WARNING [train.py:1204] (2/4) Exclude cut with ID _12369_210_4381_1_1530356078581_4388520_58-29064-0 from training. Duration: 0.99 2023-10-02 12:39:47,826 WARNING [train.py:1204] (2/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_410-287857-0 from training. Duration: 0.93 2023-10-02 12:39:47,903 WARNING [train.py:1204] (2/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_153-153537-0 from training. Duration: 0.91 2023-10-02 12:39:48,047 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.bypass.scale_min, batch_count=882673.3333333334, ans=0.2 2023-10-02 12:39:48,048 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.balancer2.prob, batch_count=882673.3333333334, ans=0.125 2023-10-02 12:39:49,679 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7544_210_6753_1_1528587722_3576686_172-171750-0_sp0.9 from training. Duration: 0.72225 2023-10-02 12:39:49,708 WARNING [train.py:1204] (2/4) Exclude cut with ID _64521_210_14699_1_1535243427259_3815328_464-183853-0_sp1.1 from training. Duration: 0.97275 2023-10-02 12:39:49,730 WARNING [train.py:1204] (2/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_385-210308-0_sp1.1 from training. Duration: 0.963625 2023-10-02 12:39:49,737 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_46013_210_7234_1_1533776362_3744032_354-261865-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 12:39:49,750 WARNING [train.py:1204] (2/4) Exclude cut with ID _3067_210_2667_1_1526212974640_4503168_250-285937-0_sp1.1 from training. Duration: 0.72725 2023-10-02 12:39:51,139 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_40036_210_13405_1_1533648647_1315388_48-146016-0 from training. Duration: 0.93 2023-10-02 12:39:53,940 WARNING [train.py:1204] (2/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_280-161633-0 from training. Duration: 0.73 2023-10-02 12:39:54,060 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.0.ff2_skip_rate, batch_count=882673.3333333334, ans=0.0 2023-10-02 12:39:55,307 WARNING [train.py:1204] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_308-161067-0_sp0.9 from training. Duration: 0.7333125 2023-10-02 12:39:56,615 WARNING [train.py:1204] (2/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_68-212801-0_sp0.9 from training. Duration: 0.811125 2023-10-02 12:39:57,942 WARNING [train.py:1204] (2/4) Exclude cut with ID _6789_210_1534_1_1527857937218_3118950_311-61262-0_sp0.9 from training. Duration: 0.72225 2023-10-02 12:40:01,226 WARNING [train.py:1204] (2/4) Exclude cut with ID _14632_210_9988_1_1530691279392_3923200_102-204477-0_sp1.1 from training. Duration: 0.8818125 2023-10-02 12:40:01,279 WARNING [train.py:1204] (2/4) Exclude cut with ID _65242_210_18628_1_1535369393717_3823149_290-117517-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 12:40:04,432 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_69630_210_7234_1_1535788670_910989_35-337140-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 12:40:04,440 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_5618_210_1874_1_1527311008_5851_1-214501-0 from training. Duration: 0.689 2023-10-02 12:40:04,556 WARNING [train.py:1204] (2/4) Exclude cut with ID _8064_210_5399_1_1528372299291_4282973_168-50337-0_sp0.9 from training. Duration: 0.9333125 2023-10-02 12:40:05,905 WARNING [train.py:1204] (2/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_185-14438-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 12:40:05,926 WARNING [train.py:1204] (2/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_61-327062-0 from training. Duration: 0.81 2023-10-02 12:40:05,937 WARNING [train.py:1204] (2/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_98-741-0 from training. Duration: 0.8 2023-10-02 12:40:09,098 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.bypass_mid.scale_min, batch_count=882740.0, ans=0.2 2023-10-02 12:40:10,148 WARNING [train.py:1204] (2/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_312-204798-0 from training. Duration: 0.93 2023-10-02 12:40:11,422 WARNING [train.py:1204] (2/4) Exclude cut with ID _14998_210_5219_1_1530705582080_7474970_787-294416-0_sp0.9 from training. Duration: 0.911125 2023-10-02 12:40:11,661 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.ff3_skip_rate, batch_count=882806.6666666666, ans=0.0 2023-10-02 12:40:12,775 WARNING [train.py:1204] (2/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_140-181176-0_sp1.1 from training. Duration: 0.836375 2023-10-02 12:40:14,105 WARNING [train.py:1204] (2/4) Exclude cut with ID _14687_210_5854_1_1530703838259_3664889_115-239046-0_sp1.1 from training. Duration: 0.6090625 2023-10-02 12:40:14,167 WARNING [train.py:1204] (2/4) Exclude cut with ID _56423_210_5854_1_1534896061049_3165969_370-230614-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 12:40:14,806 WARNING [train.py:1204] (2/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_406-275649-0_sp0.9 from training. Duration: 0.4666875 2023-10-02 12:40:14,816 WARNING [train.py:1204] (2/4) Exclude cut with ID _15150_210_8341_1_1530752411803_7256384_22-138274-0_sp1.1 from training. Duration: 0.87275 2023-10-02 12:40:14,860 WARNING [train.py:1204] (2/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_125-125818-0 from training. Duration: 0.96 2023-10-02 12:40:17,463 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_4399_210_1874_1_1526705485_2400757_220-47974-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 12:40:18,904 WARNING [train.py:1204] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_308-161067-0_sp1.1 from training. Duration: 0.6 2023-10-02 12:40:22,267 WARNING [train.py:1204] (2/4) Exclude cut with ID _53489_210_6228_1_1534298357169_3835970_102-57645-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 12:40:23,574 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.598e+02 1.865e+02 2.106e+02 2.440e+02 3.328e+02, threshold=4.212e+02, percent-clipped=0.0 2023-10-02 12:40:25,094 WARNING [train.py:1204] (2/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_178-177582-0 from training. Duration: 0.74 2023-10-02 12:40:26,459 WARNING [train.py:1204] (2/4) Exclude cut with ID _14349_210_8341_1_1530615710818_3442470_170-336541-0_sp1.1 from training. Duration: 0.7818125 2023-10-02 12:40:27,820 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_521-214276-0_sp0.9 from training. Duration: 0.488875 2023-10-02 12:40:27,873 WARNING [train.py:1204] (2/4) Exclude cut with ID _66070_210_7640_1_1535441393096_7805310_489-292631-0 from training. Duration: 0.79 2023-10-02 12:40:33,336 WARNING [train.py:1204] (2/4) Exclude cut with ID _14069_210_5632_1_1530512692033_4803390_438-340023-0_sp1.1 from training. Duration: 0.936375 2023-10-02 12:40:35,950 WARNING [train.py:1204] (2/4) Exclude cut with ID _7852_210_5219_1_1528286471316_8139400_242-105336-0_sp1.1 from training. Duration: 0.6545625 2023-10-02 12:40:37,245 WARNING [train.py:1204] (2/4) Exclude cut with ID _3867_210_1883_1_1526641200575_4475830_272-215563-0 from training. Duration: 0.57 2023-10-02 12:40:37,252 WARNING [train.py:1204] (2/4) Exclude cut with ID _14368_210_4220_1_1530532714870_3595529_143-117421-0_sp0.9 from training. Duration: 0.6444375 2023-10-02 12:40:37,259 WARNING [train.py:1204] (2/4) Exclude cut with ID _9235_210_5219_1_1528891154253_8046120_30-326681-0_sp1.1 from training. Duration: 0.7545625 2023-10-02 12:40:37,488 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.feed_forward1.out_proj.dropout_p, batch_count=882873.3333333334, ans=0.1 2023-10-02 12:40:38,680 WARNING [train.py:1204] (2/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_538-218448-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 12:40:39,215 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.5.encoder.layers.0.conv_module1.whiten, num_groups=1, num_channels=256, metric=8.19 vs. limit=15.0 2023-10-02 12:40:42,661 INFO [train.py:1046] (2/4) Epoch 25, batch 4950, loss[loss=0.1459, simple_loss=0.2331, pruned_loss=0.02933, over 24512.00 frames. ], tot_loss[loss=0.168, simple_loss=0.2452, pruned_loss=0.04542, over 4712533.10 frames. ], batch size: 66, lr: 4.06e-03, grad_scale: 16.0 2023-10-02 12:40:42,807 WARNING [train.py:1204] (2/4) Exclude cut with ID _33640_210_2655_1_1533117605479_4855469_60-192970-0_sp1.1 from training. Duration: 0.963625 2023-10-02 12:40:42,813 WARNING [train.py:1204] (2/4) Exclude cut with ID _65585_210_12210_1_1535450451584_3764098_112-294301-0_sp1.1 from training. Duration: 0.8 2023-10-02 12:40:44,139 WARNING [train.py:1204] (2/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_643-37412-0_sp1.1 from training. Duration: 0.936375 2023-10-02 12:40:44,162 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_9302_210_2550_1_1528889962_4655416_638-301620-0_sp1.1 from training. Duration: 0.42725 2023-10-02 12:40:45,537 WARNING [train.py:1204] (2/4) Exclude cut with ID _3867_210_1883_1_1526641200575_4475830_272-215563-0_sp1.1 from training. Duration: 0.5181875 2023-10-02 12:40:47,437 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_53679_210_4852_1_1534556726_4186141_358-154143-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 12:40:47,448 WARNING [train.py:1204] (2/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_841-341679-0_sp0.9 from training. Duration: 0.6444375 2023-10-02 12:40:50,436 WARNING [train.py:1204] (2/4) Exclude cut with ID _7160_210_1876_1_1527940923231_3703799_302-150728-0 from training. Duration: 0.78 2023-10-02 12:40:50,464 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_125-203105-0 from training. Duration: 0.8 2023-10-02 12:40:50,478 WARNING [train.py:1204] (2/4) Exclude cut with ID _5585_210_2667_1_1527418813324_8007340_89-332072-0_sp0.9 from training. Duration: 0.711125 2023-10-02 12:40:53,666 WARNING [train.py:1204] (2/4) Exclude cut with ID _3992_210_1747_1_1526644816031_3731060_36-147855-0 from training. Duration: 0.78 2023-10-02 12:40:53,684 WARNING [train.py:1204] (2/4) Exclude cut with ID _59256_210_5986_1_1535076085997_3589120_172-239045-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 12:40:53,692 WARNING [train.py:1204] (2/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_472-216663-0_sp1.1 from training. Duration: 0.836375 2023-10-02 12:40:53,748 WARNING [train.py:1204] (2/4) Exclude cut with ID _36512_210_6987_1_1533033143998_3126069_181-306500-0_sp1.1 from training. Duration: 0.863625 2023-10-02 12:40:53,762 WARNING [train.py:1204] (2/4) Exclude cut with ID _56524_210_9642_1_1534572011779_3619170_492-95008-0_sp1.1 from training. Duration: 0.97275 2023-10-02 12:40:55,178 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_33348_210_5632_1_1533086912_3324030_119-90326-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 12:40:55,219 WARNING [train.py:1204] (2/4) Exclude cut with ID _14368_210_4220_1_1530532714870_3595529_231-137892-0_sp1.1 from training. Duration: 0.9 2023-10-02 12:40:57,735 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8453_210_4211_1_1528502940_8562730_596-306820-0_sp1.1 from training. Duration: 0.8818125 2023-10-02 12:40:59,055 WARNING [train.py:1204] (2/4) Exclude cut with ID _27052_210_5986_1_1532329197187_3585009_50-242750-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 12:40:59,198 WARNING [train.py:1204] (2/4) Exclude cut with ID _13913_210_9994_1_1530579615187_3383897_358-98435-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 12:40:59,220 WARNING [train.py:1204] (2/4) Exclude cut with ID _27013_210_1389_1_1532437173240_3252649_399-229778-0_sp1.1 from training. Duration: 0.963625 2023-10-02 12:41:01,968 WARNING [train.py:1204] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_185-174805-0_sp0.9 from training. Duration: 0.7555625 2023-10-02 12:41:07,883 WARNING [train.py:1204] (2/4) Exclude cut with ID _65585_210_12210_1_1535450451584_3764098_196-221014-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 12:41:09,209 WARNING [train.py:1204] (2/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_202-293523-0_sp1.1 from training. Duration: 0.6545625 2023-10-02 12:41:11,887 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8275_210_4198_1_1528455320_3892571_213-127634-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 12:41:11,935 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_40896_210_15710_1_1533433956_7100802_921-231678-0_sp1.1 from training. Duration: 0.97275 2023-10-02 12:41:14,007 WARNING [train.py:1204] (2/4) Exclude cut with ID _38020_210_5986_1_1533112252259_7006089_493-102745-0_sp1.1 from training. Duration: 0.8909375 2023-10-02 12:41:15,319 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6846_210_6753_1_1527850767_4295158_2-315073-0 from training. Duration: 0.88 2023-10-02 12:41:16,734 WARNING [train.py:1204] (2/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_249-281349-0 from training. Duration: 0.85 2023-10-02 12:41:18,188 WARNING [train.py:1204] (2/4) Exclude cut with ID _18513_210_9206_1_1531618215223_3862620_314-83429-0_sp1.1 from training. Duration: 0.97275 2023-10-02 12:41:20,166 WARNING [train.py:1204] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_215-202973-0_sp0.9 from training. Duration: 0.97775 2023-10-02 12:41:20,176 WARNING [train.py:1204] (2/4) Exclude cut with ID _7719_210_3525_1_1528456153163_4296960_88-250259-0_sp1.1 from training. Duration: 0.763625 2023-10-02 12:41:21,537 WARNING [train.py:1204] (2/4) Exclude cut with ID _7606_210_4915_1_1528894253277_5080690_301-10732-0_sp1.1 from training. Duration: 0.736375 2023-10-02 12:41:22,818 WARNING [train.py:1204] (2/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_818-11907-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 12:41:22,886 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_5959_210_1794_1_1527400400_3445726_412-44052-0_sp1.1 from training. Duration: 0.7 2023-10-02 12:41:24,282 WARNING [train.py:1204] (2/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_6-279195-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 12:41:25,650 WARNING [train.py:1204] (2/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_252-216674-0_sp0.9 from training. Duration: 0.6 2023-10-02 12:41:28,443 WARNING [train.py:1204] (2/4) Exclude cut with ID _14368_210_4220_1_1530532714870_3595529_144-3287-0_sp1.1 from training. Duration: 0.6090625 2023-10-02 12:41:29,842 WARNING [train.py:1204] (2/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_342-3879-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 12:41:29,881 WARNING [train.py:1204] (2/4) Exclude cut with ID _33640_210_2655_1_1533117605479_4855469_584-65630-0_sp1.1 from training. Duration: 0.97275 2023-10-02 12:41:31,230 WARNING [train.py:1204] (2/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_37-189262-0 from training. Duration: 0.98 2023-10-02 12:41:31,267 WARNING [train.py:1204] (2/4) Exclude cut with ID _9389_210_1664_1_1529217337301_5289269_379-128794-0_sp0.9 from training. Duration: 0.9444375 2023-10-02 12:41:33,147 WARNING [train.py:1204] (2/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_119-207162-0_sp1.1 from training. Duration: 0.6454375 2023-10-02 12:41:37,851 WARNING [train.py:1204] (2/4) Exclude cut with ID _2941_210_2577_1_1526199673150_2489759_200-333643-0_sp1.1 from training. Duration: 0.936375 2023-10-02 12:41:39,226 WARNING [train.py:1204] (2/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_479-125611-0_sp1.1 from training. Duration: 0.87275 2023-10-02 12:41:39,243 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_3475_210_2028_1_1526193578_3622569_64-297072-0_sp1.1 from training. Duration: 0.82725 2023-10-02 12:41:39,273 WARNING [train.py:1204] (2/4) Exclude cut with ID _53565_210_10635_1_1534337218335_3820259_139-346291-0_sp1.1 from training. Duration: 0.97275 2023-10-02 12:41:40,573 WARNING [train.py:1204] (2/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_588-125165-0_sp1.1 from training. Duration: 0.7545625 2023-10-02 12:41:41,956 WARNING [train.py:1204] (2/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_938-205135-0_sp1.1 from training. Duration: 0.7909375 2023-10-02 12:41:45,095 WARNING [train.py:1204] (2/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_216-48707-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 12:41:45,148 WARNING [train.py:1204] (2/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_477-255782-0_sp1.1 from training. Duration: 0.7454375 2023-10-02 12:41:45,190 WARNING [train.py:1204] (2/4) Exclude cut with ID _16909_210_5724_1_1531292079912_4118919_583-92842-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 12:41:47,839 WARNING [train.py:1204] (2/4) Exclude cut with ID _60860_210_8254_1_1534896015875_3557169_245-256666-0 from training. Duration: 0.97 2023-10-02 12:41:52,015 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_45399_210_4852_1_1533624725_7579737_349-116173-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 12:41:56,713 INFO [train.py:1046] (2/4) Epoch 25, batch 5000, loss[loss=0.185, simple_loss=0.2651, pruned_loss=0.05244, over 23987.00 frames. ], tot_loss[loss=0.1676, simple_loss=0.2445, pruned_loss=0.04536, over 4705994.56 frames. ], batch size: 86, lr: 4.06e-03, grad_scale: 16.0 2023-10-02 12:41:58,129 WARNING [train.py:1204] (2/4) Exclude cut with ID _8699_210_2069_1_1528630346831_3960409_155-39766-0 from training. Duration: 0.78 2023-10-02 12:41:58,138 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_86-16881-0_sp1.1 from training. Duration: 0.52725 2023-10-02 12:42:03,535 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_14056_210_4163_1_1530604483_7572827_191-159384-0_sp1.1 from training. Duration: 0.97275 2023-10-02 12:42:03,548 WARNING [train.py:1204] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_307-158462-0_sp1.1 from training. Duration: 0.62725 2023-10-02 12:42:04,891 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7544_210_6753_1_1528591327_1471426_33-346031-0 from training. Duration: 0.61 2023-10-02 12:42:06,623 WARNING [train.py:1204] (2/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_112-149213-0 from training. Duration: 0.95 2023-10-02 12:42:06,764 WARNING [train.py:1204] (2/4) Exclude cut with ID _55844_210_2346_1_1534471212569_7625707_925-191342-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 12:42:10,029 WARNING [train.py:1204] (2/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_87-278686-0 from training. Duration: 0.77 2023-10-02 12:42:10,047 WARNING [train.py:1204] (2/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_415-78792-0_sp1.1 from training. Duration: 0.736375 2023-10-02 12:42:10,056 WARNING [train.py:1204] (2/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_107-332398-0_sp0.9 from training. Duration: 0.8666875 2023-10-02 12:42:11,459 WARNING [train.py:1204] (2/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_72-251255-0 from training. Duration: 0.94 2023-10-02 12:42:11,484 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_431-227273-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 12:42:11,549 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7544_210_6753_1_1528587000_691468_87-178599-0_sp0.9 from training. Duration: 0.9666875 2023-10-02 12:42:12,721 WARNING [train.py:1204] (2/4) Exclude cut with ID _13916_210_9994_1_1530788341126_3763149_60-191556-0 from training. Duration: 0.61 2023-10-02 12:42:12,723 WARNING [train.py:1204] (2/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_746-164782-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 12:42:12,859 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.attention_skip_rate, batch_count=883340.0, ans=0.0 2023-10-02 12:42:14,013 WARNING [train.py:1204] (2/4) Exclude cut with ID _33640_210_2655_1_1533117605479_4855469_596-21066-0_sp1.1 from training. Duration: 0.92725 2023-10-02 12:42:15,921 WARNING [train.py:1204] (2/4) Exclude cut with ID _40402_210_18219_1_1533522324115_2953150_320-14078-0 from training. Duration: 0.95 2023-10-02 12:42:17,268 WARNING [train.py:1204] (2/4) Exclude cut with ID _29877_210_6228_1_1532397791106_5276149_266-62291-0 from training. Duration: 0.87 2023-10-02 12:42:17,345 WARNING [train.py:1204] (2/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_176-56120-0_sp0.9 from training. Duration: 0.92225 2023-10-02 12:42:18,605 WARNING [train.py:1204] (2/4) Exclude cut with ID _6275_210_2084_1_1528110062213_7669049_91-26919-0 from training. Duration: 0.97 2023-10-02 12:42:18,612 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_224-322394-0_sp0.9 from training. Duration: 0.7333125 2023-10-02 12:42:18,651 WARNING [train.py:1204] (2/4) Exclude cut with ID _44982_210_1511_1_1534312820108_3726029_586-297059-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 12:42:20,050 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_196-289637-0_sp1.1 from training. Duration: 0.6818125 2023-10-02 12:42:20,051 WARNING [train.py:1204] (2/4) Exclude cut with ID _18513_210_9206_1_1531618215223_3862620_402-5957-0 from training. Duration: 0.99 2023-10-02 12:42:20,057 WARNING [train.py:1204] (2/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_81-300212-0 from training. Duration: 0.6 2023-10-02 12:42:20,193 WARNING [train.py:1204] (2/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_166-128941-0 from training. Duration: 0.54 2023-10-02 12:42:21,500 WARNING [train.py:1204] (2/4) Exclude cut with ID _58059_210_5854_1_1534901471149_3164808_57-309660-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 12:42:21,574 WARNING [train.py:1204] (2/4) Exclude cut with ID _60621_210_17071_1_1535013053530_3671739_73-155814-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 12:42:24,232 WARNING [train.py:1204] (2/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_726-270780-0 from training. Duration: 0.86 2023-10-02 12:42:24,243 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8853_210_3273_1_1528633899_818968_51-187685-0_sp1.1 from training. Duration: 0.62725 2023-10-02 12:42:24,470 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.balancer2.prob, batch_count=883406.6666666666, ans=0.125 2023-10-02 12:42:24,509 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.nonlin_attention.balancer.prob, batch_count=883406.6666666666, ans=0.125 2023-10-02 12:42:26,278 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_315-212794-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 12:42:26,348 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_53373_210_5399_1_1534229471_4715695_314-122795-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 12:42:28,963 WARNING [train.py:1204] (2/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_187-221566-0_sp0.9 from training. Duration: 0.47775 2023-10-02 12:42:30,330 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_12868_210_6753_1_1530702479_3610564_133-564-0 from training. Duration: 0.79 2023-10-02 12:42:30,371 WARNING [train.py:1204] (2/4) Exclude cut with ID _55153_210_2253_1_1534384786677_3162096_255-228344-0_sp1.1 from training. Duration: 0.936375 2023-10-02 12:42:31,769 WARNING [train.py:1204] (2/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_383-165234-0_sp1.1 from training. Duration: 0.8545625 2023-10-02 12:42:31,974 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.conv_module2.balancer2.min_abs, batch_count=883406.6666666666, ans=0.5 2023-10-02 12:42:35,741 WARNING [train.py:1204] (2/4) Exclude cut with ID IC0677W0056-334889 from training. Duration: 0.768 2023-10-02 12:42:39,125 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_14953_210_2151_1_1530701710_5616165_710-165400-0_sp0.9 from training. Duration: 0.9666875 2023-10-02 12:42:39,219 WARNING [train.py:1204] (2/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_355-236205-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 12:42:39,221 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_34450_210_9547_1_1533099443_4109050_503-246893-0_sp1.1 from training. Duration: 0.963625 2023-10-02 12:42:39,419 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.conv_skip_rate, batch_count=883473.3333333334, ans=0.0 2023-10-02 12:42:43,401 WARNING [train.py:1204] (2/4) Exclude cut with ID _43552_210_8913_1_1533726031059_3523429_25-120161-0 from training. Duration: 0.98 2023-10-02 12:42:43,441 WARNING [train.py:1204] (2/4) Exclude cut with ID _66112_210_10635_1_1535590236305_3801484_205-311930-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 12:42:43,474 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_48049_210_5632_1_1533864553_3519937_185-281218-0_sp1.1 from training. Duration: 0.92725 2023-10-02 12:42:45,318 WARNING [train.py:1204] (2/4) Exclude cut with ID _48782_210_12658_1_1534467701544_2300422_191-332415-0_sp1.1 from training. Duration: 0.97275 2023-10-02 12:42:47,932 WARNING [train.py:1204] (2/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_907-151398-0_sp1.1 from training. Duration: 0.336375 2023-10-02 12:42:47,983 WARNING [train.py:1204] (2/4) Exclude cut with ID _67499_210_17282_1_1535509805220_3692430_20-179346-0_sp1.1 from training. Duration: 0.8818125 2023-10-02 12:42:50,602 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.491e+02 1.815e+02 1.960e+02 2.170e+02 3.682e+02, threshold=3.920e+02, percent-clipped=0.0 2023-10-02 12:42:50,704 WARNING [train.py:1204] (2/4) Exclude cut with ID _14998_210_5219_1_1530705582080_7474970_441-91466-0_sp1.1 from training. Duration: 0.8818125 2023-10-02 12:42:50,782 WARNING [train.py:1204] (2/4) Exclude cut with ID _46681_210_9642_1_1533893005571_7603359_786-164007-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 12:42:56,789 WARNING [train.py:1204] (2/4) Exclude cut with ID _14970_210_15709_1_1530773543493_1715730_187-20259-0 from training. Duration: 0.89 2023-10-02 12:43:02,367 WARNING [train.py:1204] (2/4) Exclude cut with ID _12734_210_8341_1_1530183614296_3891929_383-339075-0_sp1.1 from training. Duration: 0.963625 2023-10-02 12:43:04,012 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.bypass_mid.scale_min, batch_count=883540.0, ans=0.2 2023-10-02 12:43:09,567 INFO [train.py:1046] (2/4) Epoch 25, batch 5050, loss[loss=0.1813, simple_loss=0.2524, pruned_loss=0.05506, over 23800.00 frames. ], tot_loss[loss=0.1677, simple_loss=0.2449, pruned_loss=0.04525, over 4716951.90 frames. ], batch size: 195, lr: 4.06e-03, grad_scale: 16.0 2023-10-02 12:43:11,535 WARNING [train.py:1204] (2/4) Exclude cut with ID _37086_210_9038_1_1532999540891_7021250_834-322225-0_sp1.1 from training. Duration: 0.92725 2023-10-02 12:43:11,634 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_10987_210_4163_1_1529760555_4123902_56-290559-0_sp1.1 from training. Duration: 0.963625 2023-10-02 12:43:11,642 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_92-246270-0_sp0.9 from training. Duration: 0.9444375 2023-10-02 12:43:13,004 WARNING [train.py:1204] (2/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_56-13561-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 12:43:13,045 WARNING [train.py:1204] (2/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_393-57888-0_sp1.1 from training. Duration: 0.5818125 2023-10-02 12:43:14,154 WARNING [train.py:1204] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_168-32906-0_sp1.1 from training. Duration: 0.62725 2023-10-02 12:43:14,198 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_51225_210_3601_1_1534400634_2327383_127-207553-0_sp1.1 from training. Duration: 0.963625 2023-10-02 12:43:18,862 WARNING [train.py:1204] (2/4) Exclude cut with ID _58583_210_5236_1_1534726835266_3574249_494-203521-0_sp1.1 from training. Duration: 0.963625 2023-10-02 12:43:18,874 WARNING [train.py:1204] (2/4) Exclude cut with ID _49376_210_5986_1_1533985278732_3370740_354-344695-0 from training. Duration: 0.99 2023-10-02 12:43:18,962 WARNING [train.py:1204] (2/4) Exclude cut with ID _8021_210_7994_1_1528376451358_3653069_179-45206-0_sp1.1 from training. Duration: 0.7818125 2023-10-02 12:43:21,690 WARNING [train.py:1204] (2/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_742-154017-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 12:43:23,083 WARNING [train.py:1204] (2/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_222-198127-0_sp1.1 from training. Duration: 0.763625 2023-10-02 12:43:23,133 WARNING [train.py:1204] (2/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_235-307337-0 from training. Duration: 0.94 2023-10-02 12:43:24,539 WARNING [train.py:1204] (2/4) Exclude cut with ID _8393_210_6702_1_1528505860249_3457328_453-176500-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 12:43:24,582 WARNING [train.py:1204] (2/4) Exclude cut with ID _53565_210_10635_1_1534337218335_3820259_102-179528-0_sp1.1 from training. Duration: 0.97275 2023-10-02 12:43:27,339 WARNING [train.py:1204] (2/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_43-206757-0_sp1.1 from training. Duration: 0.6818125 2023-10-02 12:43:29,113 WARNING [train.py:1204] (2/4) Exclude cut with ID _8394_210_6702_1_1528590504316_3540189_335-274780-0_sp1.1 from training. Duration: 0.7090625 2023-10-02 12:43:29,148 WARNING [train.py:1204] (2/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_128-296581-0_sp1.1 from training. Duration: 0.67275 2023-10-02 12:43:30,774 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.conv_module2.balancer1.prob, batch_count=883673.3333333334, ans=0.125 2023-10-02 12:43:38,631 WARNING [train.py:1204] (2/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_111-112362-0 from training. Duration: 0.91 2023-10-02 12:43:38,683 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_5522_210_3273_1_1527249606_3909096_366-172508-0_sp1.1 from training. Duration: 0.6 2023-10-02 12:43:40,512 WARNING [train.py:1204] (2/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_47-167373-0_sp1.1 from training. Duration: 0.836375 2023-10-02 12:43:40,546 WARNING [train.py:1204] (2/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_41-234353-0 from training. Duration: 0.68 2023-10-02 12:43:42,362 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6291_210_3273_1_1527683649_1078498_82-221196-0_sp0.9 from training. Duration: 0.8444375 2023-10-02 12:43:43,809 WARNING [train.py:1204] (2/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_47-4814-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 12:43:43,833 WARNING [train.py:1204] (2/4) Exclude cut with ID _43541_210_10626_1_1533796233418_3635440_40-247993-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 12:43:45,175 WARNING [train.py:1204] (2/4) Exclude cut with ID _21989_210_5854_1_1531899214931_3271360_133-44759-0_sp0.9 from training. Duration: 0.8555625 2023-10-02 12:43:45,178 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_94-195801-0 from training. Duration: 0.94 2023-10-02 12:43:45,263 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_141-89113-0 from training. Duration: 0.86 2023-10-02 12:43:45,504 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.conv_module2.balancer1.prob, batch_count=883740.0, ans=0.125 2023-10-02 12:43:45,505 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=883740.0, ans=0.1 2023-10-02 12:43:46,605 WARNING [train.py:1204] (2/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_683-284578-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 12:43:48,213 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.ff2_skip_rate, batch_count=883740.0, ans=0.0 2023-10-02 12:43:49,308 WARNING [train.py:1204] (2/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_6-214815-0_sp1.1 from training. Duration: 0.77275 2023-10-02 12:43:51,234 WARNING [train.py:1204] (2/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_556-212082-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 12:43:51,269 WARNING [train.py:1204] (2/4) Exclude cut with ID _44902_210_16187_1_1533888006062_3792078_149-67212-0 from training. Duration: 0.87 2023-10-02 12:43:53,975 WARNING [train.py:1204] (2/4) Exclude cut with ID _61148_210_6897_1_1535094022726_4217610_333-159440-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 12:43:55,415 WARNING [train.py:1204] (2/4) Exclude cut with ID _3120_210_3525_1_1526036143112_4072980_404-249994-0 from training. Duration: 0.99 2023-10-02 12:43:56,686 WARNING [train.py:1204] (2/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_234-260682-0_sp0.9 from training. Duration: 0.9333125 2023-10-02 12:43:57,993 WARNING [train.py:1204] (2/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_374-138971-0_sp1.1 from training. Duration: 0.8545625 2023-10-02 12:43:58,061 WARNING [train.py:1204] (2/4) Exclude cut with ID _53713_210_24116_1_1534314487762_7416751_743-302085-0_sp1.1 from training. Duration: 0.92725 2023-10-02 12:43:58,129 WARNING [train.py:1204] (2/4) Exclude cut with ID _18516_210_9206_1_1531704653206_3692406_198-334870-0_sp1.1 from training. Duration: 0.836375 2023-10-02 12:44:01,374 WARNING [train.py:1204] (2/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_395-73570-0_sp1.1 from training. Duration: 0.9 2023-10-02 12:44:02,855 WARNING [train.py:1204] (2/4) Exclude cut with ID _46683_210_9642_1_1533730913492_4591116_421-19547-0_sp1.1 from training. Duration: 0.936375 2023-10-02 12:44:04,216 WARNING [train.py:1204] (2/4) Exclude cut with ID _19896_210_1883_1_1531709022662_6649569_714-8668-0_sp1.1 from training. Duration: 0.963625 2023-10-02 12:44:04,246 WARNING [train.py:1204] (2/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_49-101072-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 12:44:04,253 WARNING [train.py:1204] (2/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_220-191080-0_sp1.1 from training. Duration: 0.8909375 2023-10-02 12:44:04,280 WARNING [train.py:1204] (2/4) Exclude cut with ID _3465_210_3525_1_1526175667060_3574049_62-335552-0 from training. Duration: 0.87 2023-10-02 12:44:04,565 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.out_combiner.scale_min, batch_count=883806.6666666666, ans=0.2 2023-10-02 12:44:05,637 WARNING [train.py:1204] (2/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_111-112362-0_sp1.1 from training. Duration: 0.82725 2023-10-02 12:44:06,962 WARNING [train.py:1204] (2/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_318-59097-0_sp0.9 from training. Duration: 0.8444375 2023-10-02 12:44:08,632 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.feed_forward3.hidden_balancer.prob, batch_count=883873.3333333334, ans=0.125 2023-10-02 12:44:10,352 WARNING [train.py:1204] (2/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_98-255710-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 12:44:11,627 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9042W0003-515609 from training. Duration: 0.896 2023-10-02 12:44:11,645 WARNING [train.py:1204] (2/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_158-250116-0_sp0.9 from training. Duration: 0.688875 2023-10-02 12:44:13,035 WARNING [train.py:1204] (2/4) Exclude cut with ID _26750_210_5665_1_1532226678442_4063230_7-346885-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 12:44:13,205 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.nonlin_attention.balancer.prob, batch_count=883873.3333333334, ans=0.125 2023-10-02 12:44:13,772 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.2.encoder.layers.2.nonlin_attention.whiten2, num_groups=1, num_channels=384, metric=5.76 vs. limit=15.0 2023-10-02 12:44:14,969 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_37839_210_7234_1_1533542462_3689303_357-227078-0_sp1.1 from training. Duration: 0.963625 2023-10-02 12:44:15,006 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9052W0119-520904 from training. Duration: 0.896 2023-10-02 12:44:17,765 WARNING [train.py:1204] (2/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_249-281349-0_sp1.1 from training. Duration: 0.77275 2023-10-02 12:44:17,770 WARNING [train.py:1204] (2/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_147-293311-0 from training. Duration: 0.94 2023-10-02 12:44:17,771 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_158-21166-0_sp1.1 from training. Duration: 0.963625 2023-10-02 12:44:22,361 WARNING [train.py:1204] (2/4) Exclude cut with ID _14100_210_8341_1_1530529320613_3678069_207-332194-0_sp1.1 from training. Duration: 0.92725 2023-10-02 12:44:22,404 WARNING [train.py:1204] (2/4) Exclude cut with ID _9389_210_1664_1_1529217337301_5289269_324-211031-0_sp1.1 from training. Duration: 0.963625 2023-10-02 12:44:22,424 WARNING [train.py:1204] (2/4) Exclude cut with ID _14603_210_4220_1_1530615688441_3097319_133-158308-0 from training. Duration: 0.84 2023-10-02 12:44:23,757 INFO [train.py:1046] (2/4) Epoch 25, batch 5100, loss[loss=0.1566, simple_loss=0.2312, pruned_loss=0.04099, over 23681.00 frames. ], tot_loss[loss=0.1684, simple_loss=0.2459, pruned_loss=0.04544, over 4707439.21 frames. ], batch size: 149, lr: 4.06e-03, grad_scale: 16.0 2023-10-02 12:44:23,891 WARNING [train.py:1204] (2/4) Exclude cut with ID _13252_210_1925_1_1530671372047_3336909_166-314607-0 from training. Duration: 0.54 2023-10-02 12:44:24,048 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=883940.0, ans=0.1 2023-10-02 12:44:25,376 WARNING [train.py:1204] (2/4) Exclude cut with ID _32704_210_8954_1_1533017488840_6747370_649-79538-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 12:44:26,560 WARNING [train.py:1204] (2/4) Exclude cut with ID _28394_210_8913_1_1532430049518_3650118_343-115667-0_sp1.1 from training. Duration: 0.97275 2023-10-02 12:44:26,602 WARNING [train.py:1204] (2/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_87-278686-0_sp0.9 from training. Duration: 0.8555625 2023-10-02 12:44:29,488 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9109W0003-549749 from training. Duration: 0.768 2023-10-02 12:44:31,366 WARNING [train.py:1204] (2/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_126-29104-0_sp1.1 from training. Duration: 0.77275 2023-10-02 12:44:34,086 WARNING [train.py:1204] (2/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_325-113798-0 from training. Duration: 0.85 2023-10-02 12:44:35,426 WARNING [train.py:1204] (2/4) Exclude cut with ID _6694_210_4599_1_1527852637490_3965529_116-109398-0 from training. Duration: 0.73 2023-10-02 12:44:35,478 WARNING [train.py:1204] (2/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_46-57119-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 12:44:37,029 WARNING [train.py:1204] (2/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_546-119312-0_sp1.1 from training. Duration: 0.9 2023-10-02 12:44:38,421 WARNING [train.py:1204] (2/4) Exclude cut with ID _60057_210_10640_1_1534903065793_3333150_468-306939-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 12:44:39,857 WARNING [train.py:1204] (2/4) Exclude cut with ID _15150_210_8341_1_1530752411803_7256384_22-138274-0 from training. Duration: 0.96 2023-10-02 12:44:41,634 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_289-180904-0 from training. Duration: 0.76 2023-10-02 12:44:45,681 WARNING [train.py:1204] (2/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_64-133721-0_sp1.1 from training. Duration: 0.92725 2023-10-02 12:44:45,735 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_195-257512-0_sp1.1 from training. Duration: 0.6090625 2023-10-02 12:44:49,259 WARNING [train.py:1204] (2/4) Exclude cut with ID _66873_210_5517_1_1535454136506_4293370_704-101963-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 12:44:50,782 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.conv_skip_rate, batch_count=884006.6666666666, ans=0.0 2023-10-02 12:44:52,550 WARNING [train.py:1204] (2/4) Exclude cut with ID _4366_210_1388_1_1526792053633_3613819_231-211402-0 from training. Duration: 0.94 2023-10-02 12:44:53,683 WARNING [train.py:1204] (2/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_679-190385-0_sp1.1 from training. Duration: 0.97275 2023-10-02 12:44:56,475 WARNING [train.py:1204] (2/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_298-81459-0_sp1.1 from training. Duration: 0.963625 2023-10-02 12:44:56,491 WARNING [train.py:1204] (2/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_658-282610-0_sp0.9 from training. Duration: 0.588875 2023-10-02 12:44:59,198 WARNING [train.py:1204] (2/4) Exclude cut with ID _57572_210_5517_1_1534921219624_3809484_196-283734-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 12:44:59,274 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_53071_210_16554_1_1534490405_2954066_294-198554-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 12:44:59,279 WARNING [train.py:1204] (2/4) Exclude cut with ID _4866_210_2577_1_1527404412284_4364659_352-157930-0 from training. Duration: 0.81 2023-10-02 12:45:01,293 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9231W0113-610139 from training. Duration: 0.896 2023-10-02 12:45:02,565 WARNING [train.py:1204] (2/4) Exclude cut with ID _24271_210_12658_1_1531987270022_7190519_638-171906-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 12:45:02,601 WARNING [train.py:1204] (2/4) Exclude cut with ID _14170_210_5219_1_1530525949868_4469650_272-249985-0 from training. Duration: 0.67 2023-10-02 12:45:02,612 WARNING [train.py:1204] (2/4) Exclude cut with ID _64086_210_7994_1_1535270417019_7435949_449-31567-0 from training. Duration: 0.97 2023-10-02 12:45:05,517 WARNING [train.py:1204] (2/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_109-282568-0_sp1.1 from training. Duration: 0.97275 2023-10-02 12:45:05,658 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.0.conv_module1.balancer1.prob, batch_count=884073.3333333334, ans=0.125 2023-10-02 12:45:09,751 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.feed_forward2.hidden_balancer.prob, batch_count=884140.0, ans=0.125 2023-10-02 12:45:12,936 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_121-45934-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 12:45:15,000 WARNING [train.py:1204] (2/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_314-126819-0 from training. Duration: 0.5 2023-10-02 12:45:15,026 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9309W0192-640319 from training. Duration: 0.512 2023-10-02 12:45:15,039 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9310W0003-640649 from training. Duration: 0.896 2023-10-02 12:45:17,609 WARNING [train.py:1204] (2/4) Exclude cut with ID _7160_210_1876_1_1527940923231_3703799_120-150943-0 from training. Duration: 0.81 2023-10-02 12:45:17,611 WARNING [train.py:1204] (2/4) Exclude cut with ID _40380_210_9059_1_1533380594759_4187380_6-33508-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 12:45:17,877 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.1.feed_forward1.out_proj.dropout_p, batch_count=884140.0, ans=0.1 2023-10-02 12:45:19,063 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.574e+02 1.925e+02 2.148e+02 2.500e+02 3.994e+02, threshold=4.296e+02, percent-clipped=1.0 2023-10-02 12:45:19,195 WARNING [train.py:1204] (2/4) Exclude cut with ID _59202_210_26984_1_1534816810612_3549174_137-318710-0 from training. Duration: 0.89 2023-10-02 12:45:23,764 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_435-313305-0 from training. Duration: 0.71 2023-10-02 12:45:26,327 WARNING [train.py:1204] (2/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_91-212486-0_sp1.1 from training. Duration: 0.6181875 2023-10-02 12:45:27,637 WARNING [train.py:1204] (2/4) Exclude cut with ID _14603_210_4220_1_1530615688441_3097319_133-158308-0_sp1.1 from training. Duration: 0.763625 2023-10-02 12:45:30,344 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_99-251314-0 from training. Duration: 0.87 2023-10-02 12:45:32,260 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_365-217119-0_sp1.1 from training. Duration: 0.57275 2023-10-02 12:45:33,572 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_3475_210_2028_1_1526193578_3622569_377-166133-0 from training. Duration: 0.86 2023-10-02 12:45:37,764 INFO [train.py:1046] (2/4) Epoch 25, batch 5150, loss[loss=0.2151, simple_loss=0.274, pruned_loss=0.07811, over 19219.00 frames. ], tot_loss[loss=0.1692, simple_loss=0.2468, pruned_loss=0.04582, over 4702172.98 frames. ], batch size: 388, lr: 4.06e-03, grad_scale: 16.0 2023-10-02 12:45:37,909 WARNING [train.py:1204] (2/4) Exclude cut with ID _13916_210_9994_1_1530788341126_3763149_116-21034-0_sp1.1 from training. Duration: 0.87275 2023-10-02 12:45:37,918 WARNING [train.py:1204] (2/4) Exclude cut with ID _64220_210_9527_1_1535250151772_4875259_142-216831-0_sp1.1 from training. Duration: 0.936375 2023-10-02 12:45:37,919 WARNING [train.py:1204] (2/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_490-207861-0_sp1.1 from training. Duration: 0.8909375 2023-10-02 12:45:39,226 WARNING [train.py:1204] (2/4) Exclude cut with ID _41314_210_12651_1_1533459476543_3766460_87-272068-0_sp0.9 from training. Duration: 0.92225 2023-10-02 12:45:39,257 WARNING [train.py:1204] (2/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_18-46756-0_sp1.1 from training. Duration: 0.7181875 2023-10-02 12:45:39,324 WARNING [train.py:1204] (2/4) Exclude cut with ID _39411_210_5986_1_1533538825030_3343649_98-311335-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 12:45:40,634 WARNING [train.py:1204] (2/4) Exclude cut with ID _21878_210_3525_1_1531738772002_7264330_113-18163-0 from training. Duration: 0.89 2023-10-02 12:45:40,636 WARNING [train.py:1204] (2/4) Exclude cut with ID _6653_210_2629_1_1527919172441_6732679_1103-56445-0 from training. Duration: 0.73 2023-10-02 12:45:41,932 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_3474_210_2028_1_1526188513_4825246_145-309131-0 from training. Duration: 0.74 2023-10-02 12:45:41,957 WARNING [train.py:1204] (2/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_120-211630-0_sp1.1 from training. Duration: 0.836375 2023-10-02 12:45:42,112 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.ff3_skip_rate, batch_count=884273.3333333334, ans=0.0 2023-10-02 12:45:43,838 WARNING [train.py:1204] (2/4) Exclude cut with ID _8030_210_7497_1_1528444886781_3531089_32-31837-0 from training. Duration: 0.97 2023-10-02 12:45:45,174 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_19949_210_2161_1_1531706337_928079_8-160956-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 12:45:46,921 WARNING [train.py:1204] (2/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_321-215742-0_sp1.1 from training. Duration: 0.4545625 2023-10-02 12:45:48,267 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_583-124650-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 12:45:48,376 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_30434_210_13049_1_1532519384_981746_164-182755-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 12:45:51,274 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.conv_module2.balancer1.prob, batch_count=884340.0, ans=0.125 2023-10-02 12:45:53,809 WARNING [train.py:1204] (2/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_339-104791-0_sp1.1 from training. Duration: 0.4454375 2023-10-02 12:45:53,826 WARNING [train.py:1204] (2/4) Exclude cut with ID _15026_210_8750_1_1530777602316_3291850_211-195043-0 from training. Duration: 0.8 2023-10-02 12:45:54,559 WARNING [train.py:1204] (2/4) Exclude cut with ID _29877_210_6228_1_1532397791106_5276149_487-182647-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 12:45:55,767 WARNING [train.py:1204] (2/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_127-566-0_sp0.9 from training. Duration: 0.9333125 2023-10-02 12:45:58,604 WARNING [train.py:1204] (2/4) Exclude cut with ID _8263_210_1664_1_1528439452891_970220_68-181805-0_sp0.9 from training. Duration: 0.711125 2023-10-02 12:45:58,605 WARNING [train.py:1204] (2/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_504-72513-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 12:45:58,615 WARNING [train.py:1204] (2/4) Exclude cut with ID _64639_210_5816_1_1535367619437_3528000_324-265233-0_sp1.1 from training. Duration: 0.963625 2023-10-02 12:45:58,679 WARNING [train.py:1204] (2/4) Exclude cut with ID _6456_210_1534_1_1527681634212_3181049_215-309359-0_sp0.9 from training. Duration: 0.97775 2023-10-02 12:45:58,684 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_115-233929-0_sp1.1 from training. Duration: 0.6545625 2023-10-02 12:45:59,963 WARNING [train.py:1204] (2/4) Exclude cut with ID _14087_210_2253_1_1530529224512_3014029_219-224445-0 from training. Duration: 0.84 2023-10-02 12:46:00,263 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.balancer1.prob, batch_count=884340.0, ans=0.125 2023-10-02 12:46:01,461 WARNING [train.py:1204] (2/4) Exclude cut with ID _14867_210_1968_1_1530684179890_3846969_413-29292-0_sp1.1 from training. Duration: 0.8181875 2023-10-02 12:46:01,505 WARNING [train.py:1204] (2/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_1012-279473-0_sp0.9 from training. Duration: 0.7444375 2023-10-02 12:46:04,795 WARNING [train.py:1204] (2/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_605-133395-0_sp0.9 from training. Duration: 0.7333125 2023-10-02 12:46:07,492 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_89-35748-0 from training. Duration: 0.79 2023-10-02 12:46:08,893 WARNING [train.py:1204] (2/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_476-272757-0_sp1.1 from training. Duration: 0.8454375 2023-10-02 12:46:11,936 WARNING [train.py:1204] (2/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_449-15508-0_sp0.9 from training. Duration: 0.911125 2023-10-02 12:46:15,024 WARNING [train.py:1204] (2/4) Exclude cut with ID _29345_210_4381_1_1532689168263_3860169_198-174328-0 from training. Duration: 0.91 2023-10-02 12:46:18,202 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_15517_210_6753_1_1530952293_1664795_125-158157-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 12:46:24,922 WARNING [train.py:1204] (2/4) Exclude cut with ID _25565_210_12392_1_1532423009382_3952098_420-113180-0_sp1.1 from training. Duration: 0.963625 2023-10-02 12:46:25,007 WARNING [train.py:1204] (2/4) Exclude cut with ID _51471_210_1889_1_1534399391390_3094250_236-146773-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 12:46:29,510 WARNING [train.py:1204] (2/4) Exclude cut with ID _30039_210_13270_1_1532484612900_2725150_175-35012-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 12:46:29,549 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_23850_210_4238_1_1532232597_1632433_99-275843-0_sp1.1 from training. Duration: 0.936375 2023-10-02 12:46:31,009 WARNING [train.py:1204] (2/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_658-282610-0 from training. Duration: 0.53 2023-10-02 12:46:35,567 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_37818_210_4238_1_1533620910_4720083_234-65934-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 12:46:36,923 WARNING [train.py:1204] (2/4) Exclude cut with ID _3067_210_2667_1_1526212974640_4503168_459-173867-0_sp1.1 from training. Duration: 0.8 2023-10-02 12:46:36,941 WARNING [train.py:1204] (2/4) Exclude cut with ID _8696_210_1876_1_1528545632440_3345058_209-54069-0_sp0.9 from training. Duration: 0.7444375 2023-10-02 12:46:40,961 WARNING [train.py:1204] (2/4) Exclude cut with ID _62574_210_2655_1_1535180407058_3933249_420-236465-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 12:46:41,417 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.5.encoder.layers.1.conv_module1.whiten, num_groups=1, num_channels=256, metric=4.23 vs. limit=15.0 2023-10-02 12:46:42,297 WARNING [train.py:1204] (2/4) Exclude cut with ID _46681_210_9642_1_1533893005571_7603359_333-4042-0_sp1.1 from training. Duration: 0.936375 2023-10-02 12:46:43,660 WARNING [train.py:1204] (2/4) Exclude cut with ID _3541_210_2477_1_1526475408709_3263973_377-296839-0 from training. Duration: 0.55 2023-10-02 12:46:45,223 WARNING [train.py:1204] (2/4) Exclude cut with ID _43997_210_4525_1_1533538632555_5189329_426-92599-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 12:46:48,826 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_43-71989-0_sp1.1 from training. Duration: 0.5181875 2023-10-02 12:46:51,464 INFO [train.py:1046] (2/4) Epoch 25, batch 5200, loss[loss=0.1801, simple_loss=0.2698, pruned_loss=0.04524, over 24614.00 frames. ], tot_loss[loss=0.1699, simple_loss=0.2474, pruned_loss=0.04622, over 4705619.66 frames. ], batch size: 68, lr: 4.06e-03, grad_scale: 32.0 2023-10-02 12:46:51,570 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_2009_210_2071_1_1525481134_4588937_140-345530-0_sp1.1 from training. Duration: 0.963625 2023-10-02 12:46:51,577 WARNING [train.py:1204] (2/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_138-332773-0_sp0.9 from training. Duration: 0.9 2023-10-02 12:46:51,832 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.ff2_skip_rate, batch_count=884606.6666666666, ans=0.0 2023-10-02 12:46:52,870 WARNING [train.py:1204] (2/4) Exclude cut with ID _13942_210_9988_1_1530604906036_3733360_499-55676-0_sp0.9 from training. Duration: 0.688875 2023-10-02 12:46:52,894 WARNING [train.py:1204] (2/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_259-34038-0_sp1.1 from training. Duration: 0.67275 2023-10-02 12:46:52,910 WARNING [train.py:1204] (2/4) Exclude cut with ID _3120_210_3525_1_1526036143112_4072980_404-249994-0_sp1.1 from training. Duration: 0.9 2023-10-02 12:46:54,442 WARNING [train.py:1204] (2/4) Exclude cut with ID _40402_210_18219_1_1533522324115_2953150_222-108810-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 12:46:57,314 WARNING [train.py:1204] (2/4) Exclude cut with ID _22083_210_8254_1_1531702862439_3678970_49-234546-0_sp0.9 from training. Duration: 0.92225 2023-10-02 12:46:59,349 WARNING [train.py:1204] (2/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_199-295611-0_sp0.9 from training. Duration: 0.611125 2023-10-02 12:47:00,119 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.3.encoder.layers.1.nonlin_attention.whiten2, num_groups=1, num_channels=512, metric=8.17 vs. limit=15.0 2023-10-02 12:47:00,798 WARNING [train.py:1204] (2/4) Exclude cut with ID _43552_210_8913_1_1533726031059_3523429_43-192945-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 12:47:03,667 WARNING [train.py:1204] (2/4) Exclude cut with ID _14224_210_4381_1_1530529062037_4264010_364-22817-0 from training. Duration: 0.71 2023-10-02 12:47:05,022 WARNING [train.py:1204] (2/4) Exclude cut with ID _14922_210_6228_1_1530707435273_3693310_388-129552-0_sp1.1 from training. Duration: 0.87275 2023-10-02 12:47:05,080 WARNING [train.py:1204] (2/4) Exclude cut with ID _6537_210_1883_1_1527987559710_3597129_190-280227-0_sp1.1 from training. Duration: 0.97275 2023-10-02 12:47:08,495 WARNING [train.py:1204] (2/4) Exclude cut with ID _55844_210_2346_1_1534471212569_7625707_398-56897-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 12:47:09,879 WARNING [train.py:1204] (2/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_48-182155-0_sp1.1 from training. Duration: 0.7909375 2023-10-02 12:47:09,907 WARNING [train.py:1204] (2/4) Exclude cut with ID _29529_210_7234_1_1532393961455_3977460_582-18084-0_sp1.1 from training. Duration: 0.97275 2023-10-02 12:47:11,303 WARNING [train.py:1204] (2/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_912-282412-0 from training. Duration: 0.49 2023-10-02 12:47:12,711 WARNING [train.py:1204] (2/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_332-256168-0_sp0.9 from training. Duration: 0.8333125 2023-10-02 12:47:14,079 WARNING [train.py:1204] (2/4) Exclude cut with ID _64220_210_9527_1_1535250151772_4875259_55-250624-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 12:47:17,199 WARNING [train.py:1204] (2/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_264-108685-0 from training. Duration: 0.55 2023-10-02 12:47:20,316 WARNING [train.py:1204] (2/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_613-232856-0_sp0.9 from training. Duration: 0.6 2023-10-02 12:47:21,726 WARNING [train.py:1204] (2/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_68-212801-0_sp1.1 from training. Duration: 0.663625 2023-10-02 12:47:22,015 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.feed_forward3.hidden_balancer.prob, batch_count=884740.0, ans=0.125 2023-10-02 12:47:23,159 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6290_210_3273_1_1527595224_1282787_115-87095-0 from training. Duration: 0.55 2023-10-02 12:47:23,192 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_3080_210_2071_1_1526086102_4365166_312-8726-0 from training. Duration: 0.91 2023-10-02 12:47:24,661 WARNING [train.py:1204] (2/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_105-63309-0 from training. Duration: 0.98 2023-10-02 12:47:26,069 WARNING [train.py:1204] (2/4) Exclude cut with ID _2941_210_2577_1_1526199673150_2489759_201-160779-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 12:47:26,072 WARNING [train.py:1204] (2/4) Exclude cut with ID ID0406W0226-885434 from training. Duration: 0.997 2023-10-02 12:47:26,078 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_17292_210_6753_1_1531125388_2943734_353-328201-0_sp1.1 from training. Duration: 0.97275 2023-10-02 12:47:29,169 WARNING [train.py:1204] (2/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_572-96029-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 12:47:29,200 WARNING [train.py:1204] (2/4) Exclude cut with ID _14970_210_15709_1_1530775547361_1966965_6-23328-0_sp0.9 from training. Duration: 0.9555625 2023-10-02 12:47:29,257 WARNING [train.py:1204] (2/4) Exclude cut with ID _45012_210_12651_1_1533608435136_5371380_240-37101-0 from training. Duration: 0.93 2023-10-02 12:47:30,544 WARNING [train.py:1204] (2/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_685-90183-0_sp1.1 from training. Duration: 0.936375 2023-10-02 12:47:31,963 WARNING [train.py:1204] (2/4) Exclude cut with ID _13252_210_1925_1_1530671372047_3336909_41-22245-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 12:47:34,745 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_15126_210_6753_1_1530778677_2591185_120-277707-0 from training. Duration: 0.8 2023-10-02 12:47:34,774 WARNING [train.py:1204] (2/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_310-26186-0 from training. Duration: 0.7 2023-10-02 12:47:34,798 WARNING [train.py:1204] (2/4) Exclude cut with ID _14998_210_5219_1_1530705582080_7474970_787-294416-0 from training. Duration: 0.82 2023-10-02 12:47:40,245 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.0.layers.1.nonlin_attention.whiten2, num_groups=1, num_channels=192, metric=5.82 vs. limit=15.0 2023-10-02 12:47:40,723 WARNING [train.py:1204] (2/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_795-33252-0 from training. Duration: 0.37 2023-10-02 12:47:40,763 WARNING [train.py:1204] (2/4) Exclude cut with ID _7537_210_2084_1_1528502395431_3924359_113-199933-0_sp0.9 from training. Duration: 0.6555625 2023-10-02 12:47:44,925 WARNING [train.py:1204] (2/4) Exclude cut with ID _3465_210_3525_1_1526175667060_3574049_308-231735-0_sp0.9 from training. Duration: 0.811125 2023-10-02 12:47:46,248 WARNING [train.py:1204] (2/4) Exclude cut with ID _19504_210_9994_1_1531648863242_3325220_354-296765-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 12:47:48,191 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.453e+02 1.909e+02 2.152e+02 2.481e+02 3.751e+02, threshold=4.304e+02, percent-clipped=0.0 2023-10-02 12:47:48,272 WARNING [train.py:1204] (2/4) Exclude cut with ID _5409_210_1388_1_1527292850189_5865191_178-127970-0 from training. Duration: 0.57 2023-10-02 12:47:48,315 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_1466-248056-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 12:47:48,331 WARNING [train.py:1204] (2/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_535-14410-0_sp0.9 from training. Duration: 0.52225 2023-10-02 12:47:48,334 WARNING [train.py:1204] (2/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_747-129941-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 12:47:50,052 WARNING [train.py:1204] (2/4) Exclude cut with ID _15012_210_1968_1_1530856820429_5095350_688-178849-0_sp1.1 from training. Duration: 0.5818125 2023-10-02 12:47:51,591 WARNING [train.py:1204] (2/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_356-45054-0_sp1.1 from training. Duration: 0.7818125 2023-10-02 12:47:52,692 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.0.layers.0.self_attn_weights.whiten_keys, num_groups=4, num_channels=128, metric=2.84 vs. limit=6.0 2023-10-02 12:47:53,027 WARNING [train.py:1204] (2/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_146-276944-0_sp0.9 from training. Duration: 0.92225 2023-10-02 12:47:55,657 WARNING [train.py:1204] (2/4) Exclude cut with ID _39204_210_4220_1_1533880547368_3191120_346-180306-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 12:47:56,517 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.0.layers.1.conv_module1.whiten, num_groups=1, num_channels=192, metric=7.58 vs. limit=15.0 2023-10-02 12:47:57,650 WARNING [train.py:1204] (2/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_641-158017-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 12:47:57,651 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_19952_210_2161_1_1531875469_3851750_274-42137-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 12:47:59,219 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.attention_skip_rate, batch_count=884873.3333333334, ans=0.0 2023-10-02 12:48:00,539 WARNING [train.py:1204] (2/4) Exclude cut with ID _60804_210_9642_1_1534831199859_7268299_87-327819-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 12:48:02,009 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_9302_210_2550_1_1528889962_4655416_638-301620-0 from training. Duration: 0.47 2023-10-02 12:48:03,342 WARNING [train.py:1204] (2/4) Exclude cut with ID _44304_210_15374_1_1533639352765_4613129_302-22084-0_sp1.1 from training. Duration: 0.7818125 2023-10-02 12:48:04,532 WARNING [train.py:1204] (2/4) Exclude cut with ID _18513_210_9206_1_1531618215223_3862620_177-227063-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 12:48:06,303 INFO [train.py:1046] (2/4) Epoch 25, batch 5250, loss[loss=0.1713, simple_loss=0.2373, pruned_loss=0.0526, over 23750.00 frames. ], tot_loss[loss=0.1696, simple_loss=0.2469, pruned_loss=0.04613, over 4709038.47 frames. ], batch size: 212, lr: 4.06e-03, grad_scale: 16.0 2023-10-02 12:48:06,376 WARNING [train.py:1204] (2/4) Exclude cut with ID _41326_210_5854_1_1533778076509_3492080_510-45444-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 12:48:06,423 WARNING [train.py:1204] (2/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_76-311963-0_sp1.1 from training. Duration: 0.636375 2023-10-02 12:48:07,781 WARNING [train.py:1204] (2/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_156-302675-0_sp0.9 from training. Duration: 0.611125 2023-10-02 12:48:09,220 WARNING [train.py:1204] (2/4) Exclude cut with ID _53489_210_6228_1_1534298357169_3835970_367-148411-0_sp1.1 from training. Duration: 0.936375 2023-10-02 12:48:13,235 WARNING [train.py:1204] (2/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_389-144525-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 12:48:13,285 WARNING [train.py:1204] (2/4) Exclude cut with ID _14069_210_5632_1_1530512692033_4803390_158-238427-0_sp1.1 from training. Duration: 0.963625 2023-10-02 12:48:14,614 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_486-151131-0_sp0.9 from training. Duration: 0.9444375 2023-10-02 12:48:19,331 WARNING [train.py:1204] (2/4) Exclude cut with ID _39846_210_12651_1_1533603651992_1318030_188-71831-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 12:48:22,052 WARNING [train.py:1204] (2/4) Exclude cut with ID _39411_210_5986_1_1533538825030_3343649_173-238248-0_sp1.1 from training. Duration: 0.7545625 2023-10-02 12:48:23,479 WARNING [train.py:1204] (2/4) Exclude cut with ID _20684_210_6917_1_1531567751646_4203930_469-49081-0_sp1.1 from training. Duration: 0.92725 2023-10-02 12:48:23,778 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.balancer2.prob, batch_count=885006.6666666666, ans=0.125 2023-10-02 12:48:24,294 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.2.encoder.layers.2.self_attn1.whiten, num_groups=1, num_channels=384, metric=20.61 vs. limit=22.5 2023-10-02 12:48:24,853 WARNING [train.py:1204] (2/4) Exclude cut with ID _8833_210_5219_1_1528592665210_7553059_611-80313-0_sp1.1 from training. Duration: 0.5818125 2023-10-02 12:48:26,936 WARNING [train.py:1204] (2/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_440-141811-0 from training. Duration: 0.95 2023-10-02 12:48:26,952 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_41211_210_2544_1_1533348859_3702910_236-223652-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 12:48:27,032 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_40222_210_6228_1_1533385298_4481939_220-163998-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 12:48:48,410 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.bypass.scale_min, batch_count=885140.0, ans=0.2 2023-10-02 12:49:09,724 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.conv_module2.balancer1.prob, batch_count=885206.6666666666, ans=0.125 2023-10-02 12:49:14,851 INFO [train.py:1046] (2/4) Epoch 25, batch 5300, loss[loss=0.1629, simple_loss=0.2375, pruned_loss=0.04418, over 24593.00 frames. ], tot_loss[loss=0.1681, simple_loss=0.2449, pruned_loss=0.04564, over 4708260.45 frames. ], batch size: 60, lr: 4.06e-03, grad_scale: 16.0 2023-10-02 12:49:15,348 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.5.encoder.layers.0.self_attn_weights.whiten_keys, num_groups=4, num_channels=128, metric=2.13 vs. limit=6.0 2023-10-02 12:49:17,680 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.feed_forward1.hidden_balancer.prob, batch_count=885273.3333333334, ans=0.125 2023-10-02 12:49:19,405 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.3.encoder.layers.3.whiten, num_groups=1, num_channels=512, metric=6.67 vs. limit=12.0 2023-10-02 12:49:29,201 WARNING [train.py:1204] (2/4) Exclude cut with ID _14629_210_15709_1_1530604298182_956899_6-232695-0_sp1.1 from training. Duration: 0.8545625 2023-10-02 12:49:29,223 WARNING [train.py:1204] (2/4) Exclude cut with ID _13921_210_9994_1_1530841782272_3503520_327-69677-0 from training. Duration: 0.9 2023-10-02 12:49:29,247 WARNING [train.py:1204] (2/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_372-298093-0 from training. Duration: 0.8 2023-10-02 12:49:29,260 WARNING [train.py:1204] (2/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_515-139111-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 12:49:29,506 WARNING [train.py:1204] (2/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_85-65056-0_sp1.1 from training. Duration: 0.87275 2023-10-02 12:49:29,548 WARNING [train.py:1204] (2/4) Exclude cut with ID _15113_210_8254_1_1530842438579_3716279_209-54533-0_sp1.1 from training. Duration: 0.87275 2023-10-02 12:49:29,593 WARNING [train.py:1204] (2/4) Exclude cut with ID _70447_210_12392_1_1535972499300_3160420_123-312838-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 12:49:29,619 WARNING [train.py:1204] (2/4) Exclude cut with ID _60113_210_16271_1_1535164212481_4386079_70-63596-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 12:49:29,646 WARNING [train.py:1204] (2/4) Exclude cut with ID _14632_210_9988_1_1530691279392_3923200_225-188509-0_sp1.1 from training. Duration: 0.963625 2023-10-02 12:49:29,693 WARNING [train.py:1204] (2/4) Exclude cut with ID _34734_210_4525_1_1533365977542_4448720_584-180532-0_sp1.1 from training. Duration: 0.97275 2023-10-02 12:49:29,708 WARNING [train.py:1204] (2/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_351-294650-0_sp1.1 from training. Duration: 0.57275 2023-10-02 12:49:30,021 WARNING [train.py:1204] (2/4) Exclude cut with ID _13252_210_1925_1_1530671372047_3336909_263-168388-0_sp0.9 from training. Duration: 0.9555625 2023-10-02 12:49:30,047 WARNING [train.py:1204] (2/4) Exclude cut with ID _22083_210_8254_1_1531702862439_3678970_201-85738-0 from training. Duration: 0.9 2023-10-02 12:49:30,134 WARNING [train.py:1204] (2/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_237-58433-0 from training. Duration: 0.74 2023-10-02 12:49:30,160 WARNING [train.py:1204] (2/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_262-250826-0 from training. Duration: 0.99 2023-10-02 12:49:30,260 WARNING [train.py:1204] (2/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_927-43827-0_sp0.9 from training. Duration: 0.82225 2023-10-02 12:49:30,290 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_2821_210_2715_1_1526108385_3763560_73-296673-0 from training. Duration: 0.83 2023-10-02 12:49:30,362 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6287_210_3273_1_1527509506_2653716_182-199887-0 from training. Duration: 0.51 2023-10-02 12:49:30,484 WARNING [train.py:1204] (2/4) Exclude cut with ID _70519_210_4163_1_1535961595994_7458230_899-80223-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 12:49:31,086 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_210-234949-0_sp1.1 from training. Duration: 0.97275 2023-10-02 12:49:31,121 WARNING [train.py:1204] (2/4) Exclude cut with ID _30196_210_1664_1_1533708496522_7074140_768-278176-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 12:49:31,196 WARNING [train.py:1204] (2/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_315-128283-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 12:49:31,277 WARNING [train.py:1204] (2/4) Exclude cut with ID _14886_210_4220_1_1530702020355_3516190_330-323941-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 12:49:31,531 WARNING [train.py:1204] (2/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_264-348896-0_sp1.1 from training. Duration: 0.77275 2023-10-02 12:49:31,552 WARNING [train.py:1204] (2/4) Exclude cut with ID _37086_210_9038_1_1532999540891_7021250_433-10563-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 12:49:31,588 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_53638_210_21581_1_1534297319_5808449_885-80172-0_sp1.1 from training. Duration: 0.87275 2023-10-02 12:49:31,680 WARNING [train.py:1204] (2/4) Exclude cut with ID _14623_210_1925_1_1530773354962_3632609_157-218771-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 12:49:31,691 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_1368-78165-0_sp1.1 from training. Duration: 0.97275 2023-10-02 12:49:31,695 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_309-65177-0_sp0.9 from training. Duration: 0.97775 2023-10-02 12:49:31,701 WARNING [train.py:1204] (2/4) Exclude cut with ID _21632_210_2253_1_1531994001765_3744140_291-138812-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 12:49:31,713 WARNING [train.py:1204] (2/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_669-93163-0_sp1.1 from training. Duration: 0.8909375 2023-10-02 12:49:32,258 WARNING [train.py:1204] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_292-215346-0 from training. Duration: 0.97 2023-10-02 12:49:32,282 WARNING [train.py:1204] (2/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_286-222073-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 12:49:32,577 WARNING [train.py:1204] (2/4) Exclude cut with ID _59256_210_5986_1_1535076085997_3589120_121-237386-0_sp1.1 from training. Duration: 0.87275 2023-10-02 12:49:32,590 WARNING [train.py:1204] (2/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_96-320736-0 from training. Duration: 0.86 2023-10-02 12:49:32,600 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_128-202104-0 from training. Duration: 0.63 2023-10-02 12:49:32,687 WARNING [train.py:1204] (2/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_242-3472-0_sp0.9 from training. Duration: 0.811125 2023-10-02 12:49:32,736 WARNING [train.py:1204] (2/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_154-74182-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 12:49:32,740 WARNING [train.py:1204] (2/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_187-221566-0 from training. Duration: 0.43 2023-10-02 12:49:32,914 WARNING [train.py:1204] (2/4) Exclude cut with ID _6194_210_1534_1_1527511826160_3941194_14-282876-0 from training. Duration: 0.71 2023-10-02 12:49:32,954 WARNING [train.py:1204] (2/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_660-211088-0_sp1.1 from training. Duration: 0.563625 2023-10-02 12:49:33,741 WARNING [train.py:1204] (2/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_526-62079-0_sp1.1 from training. Duration: 0.6090625 2023-10-02 12:49:33,900 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_92-246270-0_sp1.1 from training. Duration: 0.77275 2023-10-02 12:49:33,990 WARNING [train.py:1204] (2/4) Exclude cut with ID IC0287W0023-142350 from training. Duration: 0.956 2023-10-02 12:49:34,072 WARNING [train.py:1204] (2/4) Exclude cut with ID _58605_210_12210_1_1534723195939_3904259_48-269402-0 from training. Duration: 0.96 2023-10-02 12:49:34,087 WARNING [train.py:1204] (2/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_416-257730-0_sp0.9 from training. Duration: 0.72225 2023-10-02 12:49:34,159 WARNING [train.py:1204] (2/4) Exclude cut with ID _7877_210_6014_1_1528286358099_3750136_185-17516-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 12:49:34,258 WARNING [train.py:1204] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_23-189234-0 from training. Duration: 0.83 2023-10-02 12:49:34,275 WARNING [train.py:1204] (2/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_237-89684-0 from training. Duration: 0.96 2023-10-02 12:49:34,340 WARNING [train.py:1204] (2/4) Exclude cut with ID _13916_210_9994_1_1530788341126_3763149_116-21034-0 from training. Duration: 0.96 2023-10-02 12:49:34,500 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_3133_210_2087_1_1526106446_2324690_246-125000-0_sp1.1 from training. Duration: 0.563625 2023-10-02 12:49:40,830 INFO [train.py:1046] (2/4) Epoch 26, batch 0, loss[loss=0.1448, simple_loss=0.2176, pruned_loss=0.03598, over 21895.00 frames. ], tot_loss[loss=0.1448, simple_loss=0.2176, pruned_loss=0.03598, over 21895.00 frames. ], batch size: 48, lr: 3.98e-03, grad_scale: 32.0 2023-10-02 12:49:40,830 INFO [train.py:1069] (2/4) Computing validation loss 2023-10-02 12:49:53,971 INFO [train.py:1078] (2/4) Epoch 26, validation: loss=0.3276, simple_loss=0.28, pruned_loss=0.1876, over 1125622.00 frames. 2023-10-02 12:49:53,971 INFO [train.py:1079] (2/4) Maximum memory allocated so far is 20967MB 2023-10-02 12:49:57,339 WARNING [train.py:1204] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_402-253027-0 from training. Duration: 0.54 2023-10-02 12:49:58,654 WARNING [train.py:1204] (2/4) Exclude cut with ID _59288_210_5854_1_1535077757774_3412246_158-289345-0_sp1.1 from training. Duration: 0.92725 2023-10-02 12:50:00,021 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_28211_210_6388_1_1532861506_4523224_128-328162-0_sp1.1 from training. Duration: 0.8181875 2023-10-02 12:50:06,253 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_788-278281-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 12:50:06,268 WARNING [train.py:1204] (2/4) Exclude cut with ID _49728_210_12651_1_1534041034294_6788150_624-195301-0_sp0.9 from training. Duration: 0.9444375 2023-10-02 12:50:06,475 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.balancer2.prob, batch_count=885353.3333333334, ans=0.125 2023-10-02 12:50:07,581 WARNING [train.py:1204] (2/4) Exclude cut with ID _14860_210_4915_1_1530702020340_7343240_267-139775-0_sp1.1 from training. Duration: 0.963625 2023-10-02 12:50:07,661 WARNING [train.py:1204] (2/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_332-221057-0 from training. Duration: 0.68 2023-10-02 12:50:09,182 WARNING [train.py:1204] (2/4) Exclude cut with ID _40402_210_18219_1_1533522324115_2953150_242-76943-0 from training. Duration: 0.94 2023-10-02 12:50:10,642 WARNING [train.py:1204] (2/4) Exclude cut with ID _16747_210_5986_1_1531811544733_2749109_87-349587-0_sp1.1 from training. Duration: 0.963625 2023-10-02 12:50:11,999 WARNING [train.py:1204] (2/4) Exclude cut with ID _22982_210_1883_1_1531882790447_5449409_505-346452-0_sp1.1 from training. Duration: 0.963625 2023-10-02 12:50:12,215 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=885420.0, ans=0.1 2023-10-02 12:50:12,312 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.feed_forward1.out_proj.dropout_p, batch_count=885420.0, ans=0.1 2023-10-02 12:50:16,205 WARNING [train.py:1204] (2/4) Exclude cut with ID _17956_210_4353_1_1531209568125_6979970_13-192107-0_sp1.1 from training. Duration: 0.963625 2023-10-02 12:50:16,241 WARNING [train.py:1204] (2/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_430-307666-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 12:50:17,480 WARNING [train.py:1204] (2/4) Exclude cut with ID _2895_210_2477_1_1525956785794_4063750_289-11475-0_sp1.1 from training. Duration: 0.7454375 2023-10-02 12:50:17,492 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_3910_210_1794_1_1526742306_334530_28-238909-0_sp0.9 from training. Duration: 0.9 2023-10-02 12:50:17,736 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.conv_skip_rate, batch_count=885420.0, ans=0.0 2023-10-02 12:50:19,048 WARNING [train.py:1204] (2/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_18-130336-0 from training. Duration: 0.79 2023-10-02 12:50:20,432 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_548-294316-0_sp0.9 from training. Duration: 0.9 2023-10-02 12:50:27,400 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_42-142473-0_sp1.1 from training. Duration: 0.6545625 2023-10-02 12:50:28,730 WARNING [train.py:1204] (2/4) Exclude cut with ID _59170_210_6721_1_1534983633960_4203335_326-48066-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 12:50:31,794 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_45886_210_17024_1_1533709215_471063_7-79165-0 from training. Duration: 0.98 2023-10-02 12:50:33,076 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.601e+02 1.877e+02 2.109e+02 2.369e+02 3.021e+02, threshold=4.218e+02, percent-clipped=0.0 2023-10-02 12:50:35,993 WARNING [train.py:1204] (2/4) Exclude cut with ID _44585_210_18628_1_1533605350387_4518199_280-73534-0_sp0.9 from training. Duration: 0.988875 2023-10-02 12:50:35,999 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_3475_210_2028_1_1526193578_3622569_265-276625-0_sp1.1 from training. Duration: 0.6909375 2023-10-02 12:50:38,711 WARNING [train.py:1204] (2/4) Exclude cut with ID _64631_210_27767_1_1535349580068_4165160_88-225232-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 12:50:42,144 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_205-265518-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 12:50:45,130 WARNING [train.py:1204] (2/4) Exclude cut with ID _40402_210_18219_1_1533522324115_2953150_271-253734-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 12:50:50,718 WARNING [train.py:1204] (2/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_282-257791-0 from training. Duration: 0.86 2023-10-02 12:50:54,143 WARNING [train.py:1204] (2/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_356-45054-0 from training. Duration: 0.86 2023-10-02 12:50:54,173 WARNING [train.py:1204] (2/4) Exclude cut with ID _69584_210_5219_1_1535785133775_4044450_38-348116-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 12:50:54,174 WARNING [train.py:1204] (2/4) Exclude cut with ID _66763_210_17301_1_1535788667617_241750_21-328480-0_sp1.1 from training. Duration: 0.936375 2023-10-02 12:50:55,580 WARNING [train.py:1204] (2/4) Exclude cut with ID _26528_210_9070_1_1532570410842_3851310_497-97218-0_sp0.9 from training. Duration: 0.9555625 2023-10-02 12:50:55,628 WARNING [train.py:1204] (2/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_592-101075-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 12:50:58,747 WARNING [train.py:1204] (2/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_678-159307-0 from training. Duration: 0.85 2023-10-02 12:51:00,153 WARNING [train.py:1204] (2/4) Exclude cut with ID _28427_210_4220_1_1532581240967_3061069_166-319773-0_sp1.1 from training. Duration: 0.936375 2023-10-02 12:51:00,232 WARNING [train.py:1204] (2/4) Exclude cut with ID _29877_210_6228_1_1532397791106_5276149_606-224984-0_sp1.1 from training. Duration: 0.936375 2023-10-02 12:51:02,939 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_125-203105-0_sp1.1 from training. Duration: 0.72725 2023-10-02 12:51:06,220 WARNING [train.py:1204] (2/4) Exclude cut with ID IC0620W0113-306765 from training. Duration: 0.768 2023-10-02 12:51:07,644 WARNING [train.py:1204] (2/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_410-287857-0_sp1.1 from training. Duration: 0.8454375 2023-10-02 12:51:08,978 INFO [train.py:1046] (2/4) Epoch 26, batch 50, loss[loss=0.1892, simple_loss=0.2698, pruned_loss=0.05432, over 24018.00 frames. ], tot_loss[loss=0.1711, simple_loss=0.2504, pruned_loss=0.04589, over 1078055.78 frames. ], batch size: 86, lr: 3.98e-03, grad_scale: 16.0 2023-10-02 12:51:12,336 WARNING [train.py:1204] (2/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_78-95260-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 12:51:12,599 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.0.ff3_skip_rate, batch_count=885686.6666666666, ans=0.0 2023-10-02 12:51:13,732 WARNING [train.py:1204] (2/4) Exclude cut with ID _58059_210_5854_1_1534901471149_3164808_6-73080-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 12:51:13,745 WARNING [train.py:1204] (2/4) Exclude cut with ID _5166_210_2084_1_1527073197160_4087993_331-117829-0 from training. Duration: 0.62 2023-10-02 12:51:15,066 WARNING [train.py:1204] (2/4) Exclude cut with ID _14880_210_8895_1_1530680426655_3795259_289-212817-0_sp0.9 from training. Duration: 0.7666875 2023-10-02 12:51:15,099 WARNING [train.py:1204] (2/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_620-144779-0_sp1.1 from training. Duration: 0.92725 2023-10-02 12:51:16,500 WARNING [train.py:1204] (2/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_561-288575-0_sp1.1 from training. Duration: 0.963625 2023-10-02 12:51:17,823 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_22657_210_6890_1_1532086002_1637216_122-188128-0_sp1.1 from training. Duration: 0.963625 2023-10-02 12:51:19,271 WARNING [train.py:1204] (2/4) Exclude cut with ID _16756_210_4915_1_1531047803763_7104259_604-86607-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 12:51:19,416 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.1.feed_forward3.hidden_balancer.prob, batch_count=885686.6666666666, ans=0.125 2023-10-02 12:51:21,957 WARNING [train.py:1204] (2/4) Exclude cut with ID _15150_210_8341_1_1530752411803_7256384_469-239509-0 from training. Duration: 0.68 2023-10-02 12:51:21,965 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_10987_210_4163_1_1529760555_4123902_302-87926-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 12:51:22,143 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.feed_forward3.hidden_balancer.prob, batch_count=885753.3333333334, ans=0.125 2023-10-02 12:51:28,589 WARNING [train.py:1204] (2/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_469-94673-0_sp0.9 from training. Duration: 0.87775 2023-10-02 12:51:31,356 WARNING [train.py:1204] (2/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_798-106179-0 from training. Duration: 0.83 2023-10-02 12:51:32,692 WARNING [train.py:1204] (2/4) Exclude cut with ID _16744_210_5986_1_1531724405384_3907567_242-111316-0 from training. Duration: 0.93 2023-10-02 12:51:34,051 WARNING [train.py:1204] (2/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_938-205135-0_sp0.9 from training. Duration: 0.9666875 2023-10-02 12:51:36,018 WARNING [train.py:1204] (2/4) Exclude cut with ID _64135_210_2253_1_1535677267876_7608972_133-188266-0_sp0.9 from training. Duration: 0.9 2023-10-02 12:51:36,022 WARNING [train.py:1204] (2/4) Exclude cut with ID _44585_210_18628_1_1533605350387_4518199_623-346820-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 12:51:37,453 WARNING [train.py:1204] (2/4) Exclude cut with ID _42326_210_7770_1_1533814282531_3707269_454-266903-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 12:51:37,535 WARNING [train.py:1204] (2/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_5-55893-0_sp1.1 from training. Duration: 0.5 2023-10-02 12:51:38,922 WARNING [train.py:1204] (2/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_577-332600-0_sp1.1 from training. Duration: 0.5181875 2023-10-02 12:51:38,922 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_957-138882-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 12:51:46,317 WARNING [train.py:1204] (2/4) Exclude cut with ID _12556_210_8341_1_1530081024100_7327690_219-38004-0_sp1.1 from training. Duration: 0.97275 2023-10-02 12:51:47,622 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_397-147196-0_sp1.1 from training. Duration: 0.72725 2023-10-02 12:51:47,631 WARNING [train.py:1204] (2/4) Exclude cut with ID _3541_210_2477_1_1526475408709_3263973_231-104736-0_sp0.9 from training. Duration: 0.8666875 2023-10-02 12:51:49,001 WARNING [train.py:1204] (2/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_398-334558-0 from training. Duration: 0.96 2023-10-02 12:51:51,674 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_257-20733-0_sp1.1 from training. Duration: 0.6909375 2023-10-02 12:51:52,989 WARNING [train.py:1204] (2/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_146-282400-0_sp1.1 from training. Duration: 0.6454375 2023-10-02 12:51:52,993 WARNING [train.py:1204] (2/4) Exclude cut with ID _51899_210_20034_1_1534129254466_3669220_265-215428-0 from training. Duration: 0.98 2023-10-02 12:51:53,068 WARNING [train.py:1204] (2/4) Exclude cut with ID _34423_210_8750_1_1533430798456_3330199_219-106541-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 12:51:54,526 WARNING [train.py:1204] (2/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_458-67321-0 from training. Duration: 0.97 2023-10-02 12:52:02,734 WARNING [train.py:1204] (2/4) Exclude cut with ID _6653_210_2629_1_1527919172441_6732679_1051-182606-0_sp1.1 from training. Duration: 0.936375 2023-10-02 12:52:02,756 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_53555_210_5632_1_1534595220_7232297_603-4396-0_sp1.1 from training. Duration: 0.97275 2023-10-02 12:52:04,225 WARNING [train.py:1204] (2/4) Exclude cut with ID _54218_210_12210_1_1534413646388_4409259_159-144367-0_sp1.1 from training. Duration: 0.963625 2023-10-02 12:52:04,326 WARNING [train.py:1204] (2/4) Exclude cut with ID _7606_210_4915_1_1528894253277_5080690_301-10732-0_sp0.9 from training. Duration: 0.9 2023-10-02 12:52:04,330 WARNING [train.py:1204] (2/4) Exclude cut with ID _15026_210_8750_1_1530777602316_3291850_211-195043-0_sp0.9 from training. Duration: 0.888875 2023-10-02 12:52:07,618 WARNING [train.py:1204] (2/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_146-282400-0 from training. Duration: 0.71 2023-10-02 12:52:07,653 WARNING [train.py:1204] (2/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_65-88242-0 from training. Duration: 0.56 2023-10-02 12:52:09,117 WARNING [train.py:1204] (2/4) Exclude cut with ID _55857_210_2346_1_1534644007831_6898209_80-107757-0_sp1.1 from training. Duration: 0.963625 2023-10-02 12:52:09,147 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_15126_210_6753_1_1530778677_2591185_120-277707-0_sp0.9 from training. Duration: 0.888875 2023-10-02 12:52:09,251 WARNING [train.py:1204] (2/4) Exclude cut with ID _44778_210_2054_1_1533798127203_7443090_1033-168593-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 12:52:10,548 WARNING [train.py:1204] (2/4) Exclude cut with ID _46683_210_9642_1_1533730913492_4591116_475-30711-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 12:52:10,570 WARNING [train.py:1204] (2/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_253-308377-0 from training. Duration: 0.49 2023-10-02 12:52:10,629 WARNING [train.py:1204] (2/4) Exclude cut with ID _4680_210_3525_1_1527249757953_893386_75-78979-0 from training. Duration: 0.91 2023-10-02 12:52:12,494 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_5522_210_3273_1_1527249606_3909096_318-203153-0_sp0.9 from training. Duration: 0.47775 2023-10-02 12:52:13,100 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.3.encoder.layers.3.feed_forward1.out_whiten, num_groups=1, num_channels=512, metric=6.50 vs. limit=15.0 2023-10-02 12:52:15,052 WARNING [train.py:1204] (2/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_550-170116-0_sp1.1 from training. Duration: 0.92725 2023-10-02 12:52:15,089 WARNING [train.py:1204] (2/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_277-210798-0_sp0.9 from training. Duration: 0.611125 2023-10-02 12:52:16,439 WARNING [train.py:1204] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_620-276793-0 from training. Duration: 0.59 2023-10-02 12:52:16,457 WARNING [train.py:1204] (2/4) Exclude cut with ID _3067_210_2667_1_1526212974640_4503168_119-175776-0 from training. Duration: 0.74 2023-10-02 12:52:17,892 WARNING [train.py:1204] (2/4) Exclude cut with ID _55209_210_1531_1_1534675828996_4597160_681-195827-0_sp1.1 from training. Duration: 0.92725 2023-10-02 12:52:17,964 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_427-78097-0_sp1.1 from training. Duration: 0.72725 2023-10-02 12:52:19,392 WARNING [train.py:1204] (2/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_96-290039-0_sp0.9 from training. Duration: 0.62225 2023-10-02 12:52:20,795 WARNING [train.py:1204] (2/4) Exclude cut with ID _45329_210_7005_1_1534157951211_6774339_287-292174-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 12:52:22,289 WARNING [train.py:1204] (2/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_200-292228-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 12:52:23,549 INFO [train.py:1046] (2/4) Epoch 26, batch 100, loss[loss=0.1649, simple_loss=0.2436, pruned_loss=0.04309, over 24692.00 frames. ], tot_loss[loss=0.172, simple_loss=0.251, pruned_loss=0.04643, over 1890375.91 frames. ], batch size: 65, lr: 3.98e-03, grad_scale: 16.0 2023-10-02 12:52:23,940 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.ff3_skip_rate, batch_count=886020.0, ans=0.0 2023-10-02 12:52:26,493 WARNING [train.py:1204] (2/4) Exclude cut with ID _69884_210_9771_1_1535846392818_4166169_291-194999-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 12:52:29,734 WARNING [train.py:1204] (2/4) Exclude cut with ID _58895_210_8750_1_1535184081320_3649089_265-323810-0_sp1.1 from training. Duration: 0.87275 2023-10-02 12:52:32,441 WARNING [train.py:1204] (2/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_58-68956-0 from training. Duration: 0.83 2023-10-02 12:52:32,446 WARNING [train.py:1204] (2/4) Exclude cut with ID _39867_210_9826_1_1533553222030_9344259_322-115153-0_sp1.1 from training. Duration: 0.963625 2023-10-02 12:52:35,230 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_3371_210_1794_1_1526384462_5580777_396-339812-0_sp1.1 from training. Duration: 0.77275 2023-10-02 12:52:36,540 WARNING [train.py:1204] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_731-2428-0_sp0.9 from training. Duration: 0.92225 2023-10-02 12:52:36,575 WARNING [train.py:1204] (2/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_98-741-0_sp1.1 from training. Duration: 0.72725 2023-10-02 12:52:36,576 WARNING [train.py:1204] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_238-245655-0_sp0.9 from training. Duration: 0.9 2023-10-02 12:52:36,610 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6788_210_1388_1_1527898698_4562021_91-197981-0_sp0.9 from training. Duration: 0.92225 2023-10-02 12:52:38,148 WARNING [train.py:1204] (2/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_860-96198-0 from training. Duration: 0.73 2023-10-02 12:52:39,628 WARNING [train.py:1204] (2/4) Exclude cut with ID _14268_210_9206_1_1530622771285_4714099_331-105351-0_sp0.9 from training. Duration: 0.72225 2023-10-02 12:52:39,657 WARNING [train.py:1204] (2/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_204-137133-0_sp1.1 from training. Duration: 0.92725 2023-10-02 12:52:41,509 WARNING [train.py:1204] (2/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_5-46496-0_sp1.1 from training. Duration: 0.936375 2023-10-02 12:52:41,523 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_160-218721-0_sp1.1 from training. Duration: 0.87275 2023-10-02 12:52:44,473 WARNING [train.py:1204] (2/4) Exclude cut with ID _45329_210_7005_1_1534157951211_6774339_503-287883-0 from training. Duration: 0.88 2023-10-02 12:52:46,444 WARNING [train.py:1204] (2/4) Exclude cut with ID _57572_210_5517_1_1534921219624_3809484_110-79134-0_sp1.1 from training. Duration: 0.92725 2023-10-02 12:52:47,810 WARNING [train.py:1204] (2/4) Exclude cut with ID _44585_210_18628_1_1533605350387_4518199_421-37641-0_sp1.1 from training. Duration: 0.936375 2023-10-02 12:52:49,112 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_42-142473-0_sp0.9 from training. Duration: 0.8 2023-10-02 12:52:50,532 WARNING [train.py:1204] (2/4) Exclude cut with ID _5166_210_2084_1_1527073197160_4087993_310-114452-0_sp0.9 from training. Duration: 0.6555625 2023-10-02 12:52:50,767 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.feed_forward3.hidden_balancer.prob, batch_count=886086.6666666666, ans=0.125 2023-10-02 12:52:53,614 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.feed_forward3.hidden_balancer.prob, batch_count=886153.3333333334, ans=0.125 2023-10-02 12:52:54,705 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9029W0120-509520 from training. Duration: 0.768 2023-10-02 12:52:54,728 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9029W0433-509820 from training. Duration: 0.896 2023-10-02 12:52:56,145 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_40222_210_6228_1_1533385298_4481939_163-334586-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 12:52:56,145 WARNING [train.py:1204] (2/4) Exclude cut with ID _48677_210_5663_1_1533902405919_3892989_408-40740-0_sp1.1 from training. Duration: 0.7545625 2023-10-02 12:52:56,421 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.out_combiner.scale_min, batch_count=886153.3333333334, ans=0.2 2023-10-02 12:52:58,987 WARNING [train.py:1204] (2/4) Exclude cut with ID _24314_210_4915_1_1532135078116_7170179_521-258868-0_sp0.9 from training. Duration: 0.87775 2023-10-02 12:53:00,113 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.5.encoder.layers.0.self_attn_weights.whiten_keys, num_groups=4, num_channels=128, metric=1.88 vs. limit=6.0 2023-10-02 12:53:02,716 WARNING [train.py:1204] (2/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_576-51198-0_sp1.1 from training. Duration: 0.92725 2023-10-02 12:53:02,971 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.conv_module2.balancer1.prob, batch_count=886153.3333333334, ans=0.125 2023-10-02 12:53:04,020 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.556e+02 1.882e+02 2.043e+02 2.257e+02 3.571e+02, threshold=4.085e+02, percent-clipped=0.0 2023-10-02 12:53:04,147 WARNING [train.py:1204] (2/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_306-216871-0_sp1.1 from training. Duration: 0.97275 2023-10-02 12:53:08,507 WARNING [train.py:1204] (2/4) Exclude cut with ID _20684_210_6917_1_1531567751646_4203930_463-119982-0_sp1.1 from training. Duration: 0.97275 2023-10-02 12:53:10,025 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9084W0012-536850 from training. Duration: 0.64 2023-10-02 12:53:11,827 WARNING [train.py:1204] (2/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_847-41334-0_sp1.1 from training. Duration: 0.4 2023-10-02 12:53:15,966 WARNING [train.py:1204] (2/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_326-165900-0_sp1.1 from training. Duration: 0.62725 2023-10-02 12:53:16,046 WARNING [train.py:1204] (2/4) Exclude cut with ID _7537_210_2084_1_1528502395431_3924359_128-268029-0_sp1.1 from training. Duration: 0.8909375 2023-10-02 12:53:17,669 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.bypass_mid.scale_min, batch_count=886220.0, ans=0.2 2023-10-02 12:53:19,212 WARNING [train.py:1204] (2/4) Exclude cut with ID _67499_210_17282_1_1535509805220_3692430_333-155030-0_sp1.1 from training. Duration: 0.97275 2023-10-02 12:53:22,041 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_34008_210_10431_1_1533345424_8183247_588-257177-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 12:53:24,878 WARNING [train.py:1204] (2/4) Exclude cut with ID _25914_210_6400_1_1532141317058_644740_52-96808-0_sp1.1 from training. Duration: 0.82725 2023-10-02 12:53:26,195 WARNING [train.py:1204] (2/4) Exclude cut with ID _56494_210_4220_1_1535284837625_3033169_69-138150-0_sp1.1 from training. Duration: 0.8545625 2023-10-02 12:53:30,203 WARNING [train.py:1204] (2/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_19-143553-0_sp1.1 from training. Duration: 0.97275 2023-10-02 12:53:30,269 WARNING [train.py:1204] (2/4) Exclude cut with ID _21878_210_3525_1_1531738772002_7264330_582-130735-0_sp1.1 from training. Duration: 0.936375 2023-10-02 12:53:31,715 WARNING [train.py:1204] (2/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_943-201356-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 12:53:31,733 WARNING [train.py:1204] (2/4) Exclude cut with ID _3231_210_2106_1_1526205864934_6784220_422-229276-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 12:53:31,749 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_48049_210_5632_1_1533864553_3519937_73-198717-0_sp1.1 from training. Duration: 0.97275 2023-10-02 12:53:33,080 WARNING [train.py:1204] (2/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_329-285801-0 from training. Duration: 0.91 2023-10-02 12:53:33,104 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9171W0114-579000 from training. Duration: 0.896 2023-10-02 12:53:33,114 WARNING [train.py:1204] (2/4) Exclude cut with ID _3120_210_3525_1_1526036143112_4072980_210-132675-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 12:53:35,124 WARNING [train.py:1204] (2/4) Exclude cut with ID _55857_210_2346_1_1534644007831_6898209_745-98391-0_sp0.9 from training. Duration: 0.8444375 2023-10-02 12:53:35,185 WARNING [train.py:1204] (2/4) Exclude cut with ID _5250_210_5219_1_1527249561806_8692110_615-322035-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 12:53:35,189 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_36756_210_7234_1_1533026991_966276_119-315767-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 12:53:35,212 WARNING [train.py:1204] (2/4) Exclude cut with ID _13916_210_9994_1_1530788341126_3763149_60-191556-0_sp1.1 from training. Duration: 0.5545625 2023-10-02 12:53:35,233 WARNING [train.py:1204] (2/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_177-236809-0_sp1.1 from training. Duration: 0.4454375 2023-10-02 12:53:36,544 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7002_210_1794_1_1527939683_5157286_184-229630-0_sp1.1 from training. Duration: 0.67275 2023-10-02 12:53:36,557 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_50428_210_5724_1_1534247532_4126418_464-18322-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 12:53:36,615 WARNING [train.py:1204] (2/4) Exclude cut with ID _35799_210_5854_1_1533263487887_3529071_466-131402-0_sp1.1 from training. Duration: 0.936375 2023-10-02 12:53:37,390 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.1.encoder.layers.1.nonlin_attention.whiten1, num_groups=1, num_channels=192, metric=5.75 vs. limit=10.0 2023-10-02 12:53:37,861 INFO [train.py:1046] (2/4) Epoch 26, batch 150, loss[loss=0.1651, simple_loss=0.2427, pruned_loss=0.04374, over 23572.00 frames. ], tot_loss[loss=0.1712, simple_loss=0.2498, pruned_loss=0.04627, over 2512580.27 frames. ], batch size: 135, lr: 3.98e-03, grad_scale: 16.0 2023-10-02 12:53:39,202 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_1259-132768-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 12:53:39,257 WARNING [train.py:1204] (2/4) Exclude cut with ID _6694_210_4599_1_1527852637490_3965529_144-185051-0_sp1.1 from training. Duration: 0.87275 2023-10-02 12:53:39,295 WARNING [train.py:1204] (2/4) Exclude cut with ID _64639_210_5816_1_1535367619437_3528000_59-133290-0_sp1.1 from training. Duration: 0.963625 2023-10-02 12:53:42,400 WARNING [train.py:1204] (2/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_223-49525-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 12:53:45,177 WARNING [train.py:1204] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_748-36150-0_sp1.1 from training. Duration: 0.82725 2023-10-02 12:53:45,178 WARNING [train.py:1204] (2/4) Exclude cut with ID _19896_210_1883_1_1531709022662_6649569_52-221627-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 12:53:46,471 WARNING [train.py:1204] (2/4) Exclude cut with ID _6789_210_1534_1_1527854471671_2870519_296-115766-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 12:53:49,231 WARNING [train.py:1204] (2/4) Exclude cut with ID _14624_210_1925_1_1530858690657_3642219_100-19871-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 12:53:49,278 WARNING [train.py:1204] (2/4) Exclude cut with ID _12555_210_8750_1_1530075594402_3780239_50-293675-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 12:53:49,491 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.conv_skip_rate, batch_count=886353.3333333334, ans=0.0 2023-10-02 12:53:52,387 WARNING [train.py:1204] (2/4) Exclude cut with ID _14880_210_8895_1_1530680426655_3795259_289-212817-0_sp1.1 from training. Duration: 0.62725 2023-10-02 12:53:53,676 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_14056_210_4163_1_1530604483_7572827_388-211626-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 12:53:53,884 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.feed_forward1.out_proj.dropout_p, batch_count=886420.0, ans=0.1 2023-10-02 12:53:57,794 WARNING [train.py:1204] (2/4) Exclude cut with ID _5289_210_2084_1_1527678202736_3779299_392-261988-0 from training. Duration: 0.65 2023-10-02 12:53:57,801 WARNING [train.py:1204] (2/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_124-345840-0 from training. Duration: 0.52 2023-10-02 12:53:57,806 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_2655_210_2087_1_1525688959_3897400_437-337690-0 from training. Duration: 0.63 2023-10-02 12:54:00,558 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_3780_210_2087_1_1526467391_239068_13-274633-0_sp1.1 from training. Duration: 0.92725 2023-10-02 12:54:00,563 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_805-71497-0_sp1.1 from training. Duration: 0.6454375 2023-10-02 12:54:00,642 WARNING [train.py:1204] (2/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_434-83000-0_sp1.1 from training. Duration: 0.8909375 2023-10-02 12:54:01,989 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_17613_210_2151_1_1531137569_2366160_285-219531-0_sp1.1 from training. Duration: 0.97275 2023-10-02 12:54:01,993 WARNING [train.py:1204] (2/4) Exclude cut with ID _56108_210_16380_1_1534505446159_3887959_22-37858-0_sp1.1 from training. Duration: 0.936375 2023-10-02 12:54:02,017 WARNING [train.py:1204] (2/4) Exclude cut with ID _28427_210_4220_1_1532581240967_3061069_200-88444-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 12:54:02,075 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_43802_210_2290_1_1533516203_8829504_77-279042-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 12:54:03,553 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9309W0193-640320 from training. Duration: 0.896 2023-10-02 12:54:05,454 WARNING [train.py:1204] (2/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_516-324055-0_sp1.1 from training. Duration: 0.936375 2023-10-02 12:54:12,395 WARNING [train.py:1204] (2/4) Exclude cut with ID _40094_210_16380_1_1533294005773_3641159_10-261240-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 12:54:15,191 WARNING [train.py:1204] (2/4) Exclude cut with ID _6267_210_2084_1_1527764658526_3894730_375-219887-0_sp1.1 from training. Duration: 0.6090625 2023-10-02 12:54:16,590 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6788_210_1388_1_1527898698_4562021_180-75288-0 from training. Duration: 0.99 2023-10-02 12:54:19,317 WARNING [train.py:1204] (2/4) Exclude cut with ID _8030_210_7497_1_1528444886781_3531089_262-86750-0_sp1.1 from training. Duration: 0.736375 2023-10-02 12:54:19,338 WARNING [train.py:1204] (2/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_934-325498-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 12:54:19,368 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_456-22141-0_sp0.9 from training. Duration: 0.97775 2023-10-02 12:54:22,054 WARNING [train.py:1204] (2/4) Exclude cut with ID _49643_210_7989_1_1534640244224_3939031_598-161926-0_sp1.1 from training. Duration: 0.8181875 2023-10-02 12:54:23,419 WARNING [train.py:1204] (2/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_345-49721-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 12:54:23,490 WARNING [train.py:1204] (2/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_440-141811-0_sp1.1 from training. Duration: 0.863625 2023-10-02 12:54:25,354 WARNING [train.py:1204] (2/4) Exclude cut with ID _64048_210_10626_1_1535198382802_3835179_394-212120-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 12:54:26,771 WARNING [train.py:1204] (2/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_91-277684-0 from training. Duration: 0.74 2023-10-02 12:54:29,783 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.conv_module2.balancer1.prob, batch_count=886553.3333333334, ans=0.125 2023-10-02 12:54:30,840 WARNING [train.py:1204] (2/4) Exclude cut with ID _23038_210_9570_1_1532001584158_3652419_191-346343-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 12:54:30,899 WARNING [train.py:1204] (2/4) Exclude cut with ID _15140_210_9288_1_1530791388450_4880149_351-347974-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 12:54:30,938 WARNING [train.py:1204] (2/4) Exclude cut with ID _58605_210_12210_1_1534723195939_3904259_48-269402-0_sp1.1 from training. Duration: 0.87275 2023-10-02 12:54:32,251 WARNING [train.py:1204] (2/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_401-53423-0_sp0.9 from training. Duration: 0.611125 2023-10-02 12:54:34,129 WARNING [train.py:1204] (2/4) Exclude cut with ID _42659_210_2070_1_1533607442765_3120380_158-49004-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 12:54:37,324 WARNING [train.py:1204] (2/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_81-75849-0_sp1.1 from training. Duration: 0.3545625 2023-10-02 12:54:37,944 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.5.encoder.layers.0.self_attn2.whiten, num_groups=1, num_channels=256, metric=11.16 vs. limit=22.5 2023-10-02 12:54:38,717 WARNING [train.py:1204] (2/4) Exclude cut with ID _27052_210_5986_1_1532329197187_3585009_314-106499-0_sp1.1 from training. Duration: 0.836375 2023-10-02 12:54:40,099 WARNING [train.py:1204] (2/4) Exclude cut with ID _4626_210_3532_1_1526817652717_3916859_446-206656-0_sp0.9 from training. Duration: 0.8444375 2023-10-02 12:54:40,184 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_53378_210_17282_1_1534221071_5769509_158-114960-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 12:54:43,479 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_3171_210_2087_1_1526109084_2330007_37-64334-0_sp0.9 from training. Duration: 0.911125 2023-10-02 12:54:43,501 WARNING [train.py:1204] (2/4) Exclude cut with ID _5410_210_1388_1_1527397055795_1719020_39-112221-0 from training. Duration: 0.69 2023-10-02 12:54:44,758 WARNING [train.py:1204] (2/4) Exclude cut with ID _8064_210_5399_1_1528372299291_4282973_4-91037-0_sp0.9 from training. Duration: 0.97775 2023-10-02 12:54:44,773 WARNING [train.py:1204] (2/4) Exclude cut with ID ID0079W0443-717795 from training. Duration: 0.541 2023-10-02 12:54:44,948 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.1.conv_module1.balancer1.prob, batch_count=886620.0, ans=0.125 2023-10-02 12:54:49,232 WARNING [train.py:1204] (2/4) Exclude cut with ID _34734_210_4525_1_1533365977542_4448720_766-297928-0_sp1.1 from training. Duration: 0.936375 2023-10-02 12:54:50,660 INFO [train.py:1046] (2/4) Epoch 26, batch 200, loss[loss=0.1646, simple_loss=0.2408, pruned_loss=0.04422, over 23534.00 frames. ], tot_loss[loss=0.1724, simple_loss=0.2501, pruned_loss=0.04737, over 2997543.71 frames. ], batch size: 134, lr: 3.98e-03, grad_scale: 16.0 2023-10-02 12:54:52,158 WARNING [train.py:1204] (2/4) Exclude cut with ID _51471_210_1889_1_1534399391390_3094250_310-337793-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 12:54:52,171 WARNING [train.py:1204] (2/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_678-159307-0_sp0.9 from training. Duration: 0.9444375 2023-10-02 12:54:55,369 WARNING [train.py:1204] (2/4) Exclude cut with ID _50093_210_4220_1_1534417141207_3432299_259-322252-0 from training. Duration: 0.92 2023-10-02 12:54:55,433 WARNING [train.py:1204] (2/4) Exclude cut with ID _47790_210_16187_1_1533862814066_4207519_67-103444-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 12:54:56,678 WARNING [train.py:1204] (2/4) Exclude cut with ID _40551_210_10635_1_1533776424632_3416179_311-247578-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 12:54:58,177 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_4633_210_2040_1_1526824163_4145343_416-76117-0 from training. Duration: 0.95 2023-10-02 12:55:00,830 WARNING [train.py:1204] (2/4) Exclude cut with ID _5166_210_2084_1_1527073197160_4087993_331-117829-0_sp0.9 from training. Duration: 0.688875 2023-10-02 12:55:03,534 WARNING [train.py:1204] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_299-247059-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 12:55:03,614 WARNING [train.py:1204] (2/4) Exclude cut with ID _64002_210_9570_1_1535198605157_38979_2-240845-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 12:55:08,362 WARNING [train.py:1204] (2/4) Exclude cut with ID _4681_210_3525_1_1527250689464_120889_15-126760-0_sp0.9 from training. Duration: 0.9 2023-10-02 12:55:08,373 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_21500_210_2054_1_1532149310_2716758_256-156611-0_sp1.1 from training. Duration: 0.936375 2023-10-02 12:55:10,103 WARNING [train.py:1204] (2/4) Exclude cut with ID _16908_210_5724_1_1531119157248_3772700_624-203729-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 12:55:13,570 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.2.encoder.layers.2.nonlin_attention.whiten1, num_groups=1, num_channels=288, metric=4.69 vs. limit=10.0 2023-10-02 12:55:26,933 WARNING [train.py:1204] (2/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_605-219732-0_sp1.1 from training. Duration: 0.8545625 2023-10-02 12:55:26,972 WARNING [train.py:1204] (2/4) Exclude cut with ID _44718_210_5816_1_1534050051869_6469030_380-167520-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 12:55:28,864 WARNING [train.py:1204] (2/4) Exclude cut with ID _19896_210_1883_1_1531709022662_6649569_94-229927-0_sp1.1 from training. Duration: 0.8818125 2023-10-02 12:55:30,114 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.515e+02 1.774e+02 1.970e+02 2.261e+02 2.926e+02, threshold=3.941e+02, percent-clipped=0.0 2023-10-02 12:55:30,178 WARNING [train.py:1204] (2/4) Exclude cut with ID _19514_210_9259_1_1531530071330_5084230_519-174300-0_sp1.1 from training. Duration: 0.963625 2023-10-02 12:55:31,478 WARNING [train.py:1204] (2/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_340-174176-0_sp0.9 from training. Duration: 0.5444375 2023-10-02 12:55:31,482 WARNING [train.py:1204] (2/4) Exclude cut with ID _8263_210_1664_1_1528439452891_970220_68-181805-0_sp1.1 from training. Duration: 0.5818125 2023-10-02 12:55:32,871 WARNING [train.py:1204] (2/4) Exclude cut with ID _3120_210_3525_1_1526036143112_4072980_348-257723-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 12:55:34,214 WARNING [train.py:1204] (2/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_453-133983-0_sp1.1 from training. Duration: 0.7090625 2023-10-02 12:55:34,266 WARNING [train.py:1204] (2/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_17-86060-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 12:55:34,295 WARNING [train.py:1204] (2/4) Exclude cut with ID _29877_210_6228_1_1532397791106_5276149_228-184657-0_sp1.1 from training. Duration: 0.97275 2023-10-02 12:55:36,361 WARNING [train.py:1204] (2/4) Exclude cut with ID _18516_210_9206_1_1531704653206_3692406_198-334870-0 from training. Duration: 0.92 2023-10-02 12:55:37,592 WARNING [train.py:1204] (2/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_340-174176-0_sp1.1 from training. Duration: 0.4454375 2023-10-02 12:55:37,604 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_3133_210_2087_1_1526106446_2324690_14-139186-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 12:55:40,525 WARNING [train.py:1204] (2/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_126-29104-0_sp0.9 from training. Duration: 0.9444375 2023-10-02 12:55:45,616 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.5.encoder.layers.0.nonlin_attention.whiten2, num_groups=1, num_channels=256, metric=3.46 vs. limit=15.0 2023-10-02 12:55:47,798 WARNING [train.py:1204] (2/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_84-10310-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 12:55:51,415 INFO [scaling.py:1118] (2/4) WithLoss: name=encoder.encoders.1.encoder.layers.0.self_attn_weights, loss-sum=0.000e+00 2023-10-02 12:55:55,412 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_15517_210_6753_1_1530954008_3487336_79-273076-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 12:55:55,453 WARNING [train.py:1204] (2/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_371-18169-0_sp0.9 from training. Duration: 0.9555625 2023-10-02 12:56:01,681 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_68702_210_23689_1_1535799577_3420068_170-92788-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 12:56:01,874 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.bypass.skip_rate, batch_count=886953.3333333334, ans=0.07 2023-10-02 12:56:03,133 WARNING [train.py:1204] (2/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_78-182912-0 from training. Duration: 0.59 2023-10-02 12:56:04,293 INFO [train.py:1046] (2/4) Epoch 26, batch 250, loss[loss=0.1867, simple_loss=0.247, pruned_loss=0.06315, over 19549.00 frames. ], tot_loss[loss=0.1714, simple_loss=0.2494, pruned_loss=0.04677, over 3378354.57 frames. ], batch size: 389, lr: 3.98e-03, grad_scale: 16.0 2023-10-02 12:56:04,370 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_3019_210_2028_1_1526044245_3264865_297-12384-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 12:56:04,375 WARNING [train.py:1204] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_147-111137-0_sp1.1 from training. Duration: 0.863625 2023-10-02 12:56:04,385 WARNING [train.py:1204] (2/4) Exclude cut with ID _28323_210_16187_1_1532480395667_4100300_117-8859-0_sp1.1 from training. Duration: 0.97275 2023-10-02 12:56:04,569 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.feed_forward3.hidden_balancer.prob, batch_count=887020.0, ans=0.125 2023-10-02 12:56:05,691 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7762_210_1794_1_1528199508_4057408_419-125009-0_sp1.1 from training. Duration: 0.7545625 2023-10-02 12:56:05,783 WARNING [train.py:1204] (2/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_268-87909-0 from training. Duration: 0.99 2023-10-02 12:56:07,173 WARNING [train.py:1204] (2/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_118-311357-0_sp1.1 from training. Duration: 0.92725 2023-10-02 12:56:09,079 WARNING [train.py:1204] (2/4) Exclude cut with ID ID0377W0003-870225 from training. Duration: 0.852 2023-10-02 12:56:10,519 WARNING [train.py:1204] (2/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_56-5838-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 12:56:10,612 WARNING [train.py:1204] (2/4) Exclude cut with ID _14603_210_4220_1_1530615688441_3097319_90-31383-0_sp0.9 from training. Duration: 0.9666875 2023-10-02 12:56:11,934 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_31583_210_3852_1_1532595851_4024413_58-86657-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 12:56:11,976 WARNING [train.py:1204] (2/4) Exclude cut with ID _30304_210_6319_1_1532476827261_7582060_344-113875-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 12:56:15,218 WARNING [train.py:1204] (2/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_278-104676-0_sp1.1 from training. Duration: 0.8909375 2023-10-02 12:56:15,239 WARNING [train.py:1204] (2/4) Exclude cut with ID _55844_210_2346_1_1534471212569_7625707_764-2387-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 12:56:16,625 WARNING [train.py:1204] (2/4) Exclude cut with ID _27430_210_7770_1_1532604689375_1378590_157-14963-0_sp1.1 from training. Duration: 0.963625 2023-10-02 12:56:21,094 WARNING [train.py:1204] (2/4) Exclude cut with ID _7160_210_1876_1_1527940923231_3703799_282-322611-0_sp1.1 from training. Duration: 0.7909375 2023-10-02 12:56:30,907 WARNING [train.py:1204] (2/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_190-138640-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 12:56:32,327 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_33348_210_5632_1_1533086912_3324030_63-270922-0_sp1.1 from training. Duration: 0.97275 2023-10-02 12:56:32,350 WARNING [train.py:1204] (2/4) Exclude cut with ID _54871_210_22356_1_1534665798253_3856359_135-184281-0_sp1.1 from training. Duration: 0.9 2023-10-02 12:56:40,086 WARNING [train.py:1204] (2/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_277-210798-0_sp1.1 from training. Duration: 0.5 2023-10-02 12:56:41,506 WARNING [train.py:1204] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_402-253027-0_sp0.9 from training. Duration: 0.6 2023-10-02 12:56:41,576 WARNING [train.py:1204] (2/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_540-275685-0_sp0.9 from training. Duration: 0.988875 2023-10-02 12:56:41,756 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.conv_module1.balancer2.min_abs, batch_count=887153.3333333334, ans=0.5 2023-10-02 12:56:42,907 WARNING [train.py:1204] (2/4) Exclude cut with ID _59493_210_17282_1_1535078419077_3225879_118-18960-0_sp1.1 from training. Duration: 0.936375 2023-10-02 12:56:42,988 WARNING [train.py:1204] (2/4) Exclude cut with ID _6194_210_1534_1_1527511826160_3941194_321-148641-0_sp1.1 from training. Duration: 0.5818125 2023-10-02 12:56:42,995 WARNING [train.py:1204] (2/4) Exclude cut with ID _61133_210_9969_1_1535196777227_3696409_333-270325-0_sp1.1 from training. Duration: 0.8454375 2023-10-02 12:56:44,870 WARNING [train.py:1204] (2/4) Exclude cut with ID _49376_210_5986_1_1533985278732_3370740_307-145221-0_sp1.1 from training. Duration: 0.936375 2023-10-02 12:56:47,616 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6846_210_6753_1_1527850767_4295158_2-315073-0_sp0.9 from training. Duration: 0.97775 2023-10-02 12:56:49,041 WARNING [train.py:1204] (2/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_17-25045-0 from training. Duration: 0.59 2023-10-02 12:56:50,865 WARNING [train.py:1204] (2/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_503-333510-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 12:56:50,983 WARNING [train.py:1204] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_238-245655-0_sp1.1 from training. Duration: 0.736375 2023-10-02 12:56:51,144 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.feed_forward1.out_proj.dropout_p, batch_count=887220.0, ans=0.1 2023-10-02 12:56:52,345 WARNING [train.py:1204] (2/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_329-82235-0_sp0.9 from training. Duration: 0.911125 2023-10-02 12:56:52,350 WARNING [train.py:1204] (2/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_325-113798-0_sp0.9 from training. Duration: 0.9444375 2023-10-02 12:56:53,743 WARNING [train.py:1204] (2/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_395-37222-0_sp0.9 from training. Duration: 0.8444375 2023-10-02 12:56:53,839 WARNING [train.py:1204] (2/4) Exclude cut with ID _64135_210_2253_1_1535677267876_7608972_498-5039-0_sp1.1 from training. Duration: 0.7090625 2023-10-02 12:56:55,108 WARNING [train.py:1204] (2/4) Exclude cut with ID _14922_210_6228_1_1530707435273_3693310_303-228738-0_sp0.9 from training. Duration: 0.9333125 2023-10-02 12:56:56,570 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_577-220899-0_sp1.1 from training. Duration: 0.92725 2023-10-02 12:56:57,979 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_3475_210_2028_1_1526193578_3622569_377-166133-0_sp1.1 from training. Duration: 0.7818125 2023-10-02 12:56:59,267 WARNING [train.py:1204] (2/4) Exclude cut with ID _49376_210_5986_1_1533985278732_3370740_272-313512-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 12:57:01,704 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.0.layers.1.self_attn_weights.whiten_keys, num_groups=4, num_channels=128, metric=2.27 vs. limit=6.0 2023-10-02 12:57:02,015 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7762_210_1794_1_1528199508_4057408_422-227034-0_sp0.9 from training. Duration: 0.811125 2023-10-02 12:57:05,432 WARNING [train.py:1204] (2/4) Exclude cut with ID _18895_210_6228_1_1531357274234_3784930_74-341428-0_sp1.1 from training. Duration: 0.92725 2023-10-02 12:57:10,095 WARNING [train.py:1204] (2/4) Exclude cut with ID _44902_210_16187_1_1533888006062_3792078_149-67212-0_sp1.1 from training. Duration: 0.7909375 2023-10-02 12:57:13,010 WARNING [train.py:1204] (2/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_243-126020-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 12:57:14,390 WARNING [train.py:1204] (2/4) Exclude cut with ID _54218_210_12210_1_1534413646388_4409259_259-6561-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 12:57:15,882 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_41452_210_7291_1_1533437687_3941281_405-11559-0 from training. Duration: 0.91 2023-10-02 12:57:17,722 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_20120_210_6753_1_1531643035_5482859_759-225084-0_sp1.1 from training. Duration: 0.97275 2023-10-02 12:57:17,734 WARNING [train.py:1204] (2/4) Exclude cut with ID _14747_210_8341_1_1530685797463_3654089_147-131229-0_sp0.9 from training. Duration: 0.8444375 2023-10-02 12:57:18,989 INFO [train.py:1046] (2/4) Epoch 26, batch 300, loss[loss=0.147, simple_loss=0.2276, pruned_loss=0.03319, over 24570.00 frames. ], tot_loss[loss=0.1693, simple_loss=0.2464, pruned_loss=0.04607, over 3665752.32 frames. ], batch size: 60, lr: 3.97e-03, grad_scale: 16.0 2023-10-02 12:57:19,111 WARNING [train.py:1204] (2/4) Exclude cut with ID _14867_210_1968_1_1530684179890_3846969_319-250868-0 from training. Duration: 0.99 2023-10-02 12:57:19,135 WARNING [train.py:1204] (2/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_580-93798-0_sp0.9 from training. Duration: 0.62225 2023-10-02 12:57:22,306 WARNING [train.py:1204] (2/4) Exclude cut with ID _9679_210_7497_1_1529129212906_4363929_512-313889-0_sp0.9 from training. Duration: 0.9 2023-10-02 12:57:22,315 WARNING [train.py:1204] (2/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_18-189198-0 from training. Duration: 0.9 2023-10-02 12:57:27,803 WARNING [train.py:1204] (2/4) Exclude cut with ID _49755_210_18053_1_1534581005846_3218709_137-64179-0_sp1.1 from training. Duration: 0.92725 2023-10-02 12:57:29,202 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_48049_210_5632_1_1533864553_3519937_306-283794-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 12:57:32,223 WARNING [train.py:1204] (2/4) Exclude cut with ID _43552_210_8913_1_1533726031059_3523429_25-120161-0_sp1.1 from training. Duration: 0.8909375 2023-10-02 12:57:32,283 WARNING [train.py:1204] (2/4) Exclude cut with ID _8833_210_5219_1_1528592665210_7553059_319-349181-0 from training. Duration: 0.99 2023-10-02 12:57:34,148 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_296-332325-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 12:57:35,532 WARNING [train.py:1204] (2/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_436-186311-0_sp1.1 from training. Duration: 0.5909375 2023-10-02 12:57:35,550 WARNING [train.py:1204] (2/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_348-127789-0 from training. Duration: 0.85 2023-10-02 12:57:35,555 WARNING [train.py:1204] (2/4) Exclude cut with ID _3992_210_1747_1_1526644816031_3731060_443-195183-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 12:57:40,110 WARNING [train.py:1204] (2/4) Exclude cut with ID _3465_210_3525_1_1526175667060_3574049_308-231735-0_sp1.1 from training. Duration: 0.663625 2023-10-02 12:57:44,232 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6786_210_1388_1_1527839699_4194495_378-46070-0_sp1.1 from training. Duration: 0.6545625 2023-10-02 12:57:44,279 WARNING [train.py:1204] (2/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_63-101996-0 from training. Duration: 0.88 2023-10-02 12:57:49,001 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_2541_210_1794_1_1525590269_3084765_156-173713-0 from training. Duration: 0.81 2023-10-02 12:57:49,051 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_217-239796-0_sp1.1 from training. Duration: 0.963625 2023-10-02 12:57:51,787 WARNING [train.py:1204] (2/4) Exclude cut with ID _21878_210_3525_1_1531738772002_7264330_530-329400-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 12:57:53,653 WARNING [train.py:1204] (2/4) Exclude cut with ID _21783_210_11357_1_1531742458730_3762679_150-62920-0_sp1.1 from training. Duration: 0.963625 2023-10-02 12:57:53,653 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_5522_210_3273_1_1527249606_3909096_366-172508-0 from training. Duration: 0.66 2023-10-02 12:57:53,656 WARNING [train.py:1204] (2/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_303-340446-0_sp1.1 from training. Duration: 0.8181875 2023-10-02 12:57:55,094 WARNING [train.py:1204] (2/4) Exclude cut with ID _8699_210_2069_1_1528630346831_3960409_160-24408-0_sp1.1 from training. Duration: 0.82725 2023-10-02 12:57:56,510 WARNING [train.py:1204] (2/4) Exclude cut with ID _41314_210_12651_1_1533459476543_3766460_299-187801-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 12:57:57,867 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_34886_210_10633_1_1533203882_1755661_92-345288-0_sp1.1 from training. Duration: 0.97275 2023-10-02 12:57:59,150 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.591e+02 1.798e+02 1.983e+02 2.227e+02 3.244e+02, threshold=3.967e+02, percent-clipped=0.0 2023-10-02 12:58:01,960 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_5438_210_1794_1_1527336687_2600009_104-288274-0_sp1.1 from training. Duration: 0.52725 2023-10-02 12:58:01,964 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_154-316450-0 from training. Duration: 0.76 2023-10-02 12:58:02,034 WARNING [train.py:1204] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_23-189234-0_sp0.9 from training. Duration: 0.92225 2023-10-02 12:58:05,361 WARNING [train.py:1204] (2/4) Exclude cut with ID _4779_210_2084_1_1527318022944_3914893_346-168820-0_sp1.1 from training. Duration: 0.963625 2023-10-02 12:58:06,727 WARNING [train.py:1204] (2/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_5-55893-0 from training. Duration: 0.55 2023-10-02 12:58:06,823 WARNING [train.py:1204] (2/4) Exclude cut with ID _20455_210_5219_1_1531565996775_8098100_180-24517-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 12:58:11,558 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_3133_210_2087_1_1526106446_2324690_243-143119-0_sp0.9 from training. Duration: 0.8555625 2023-10-02 12:58:14,302 WARNING [train.py:1204] (2/4) Exclude cut with ID _65585_210_12210_1_1535450451584_3764098_293-212045-0_sp1.1 from training. Duration: 0.92725 2023-10-02 12:58:14,313 WARNING [train.py:1204] (2/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_577-332600-0 from training. Duration: 0.57 2023-10-02 12:58:18,962 WARNING [train.py:1204] (2/4) Exclude cut with ID _30039_210_13270_1_1532484612900_2725150_104-289874-0_sp1.1 from training. Duration: 0.963625 2023-10-02 12:58:18,977 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_3172_210_2087_1_1526292929_3987865_197-97762-0_sp1.1 from training. Duration: 0.6818125 2023-10-02 12:58:20,612 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=887620.0, ans=0.1 2023-10-02 12:58:21,796 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_256-150908-0_sp1.1 from training. Duration: 0.963625 2023-10-02 12:58:21,885 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_296-287678-0_sp1.1 from training. Duration: 0.863625 2023-10-02 12:58:23,191 WARNING [train.py:1204] (2/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_443-30034-0 from training. Duration: 0.64 2023-10-02 12:58:23,223 WARNING [train.py:1204] (2/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_494-199538-0_sp0.9 from training. Duration: 0.8333125 2023-10-02 12:58:25,123 WARNING [train.py:1204] (2/4) Exclude cut with ID _19708_210_9206_1_1532136650733_4238364_61-162325-0_sp1.1 from training. Duration: 0.97275 2023-10-02 12:58:26,495 WARNING [train.py:1204] (2/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_475-127455-0 from training. Duration: 0.7 2023-10-02 12:58:26,617 WARNING [train.py:1204] (2/4) Exclude cut with ID _48677_210_5663_1_1533902405919_3892989_188-11581-0_sp1.1 from training. Duration: 0.963625 2023-10-02 12:58:26,638 WARNING [train.py:1204] (2/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_136-8381-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 12:58:28,044 WARNING [train.py:1204] (2/4) Exclude cut with ID _55328_210_27767_1_1534580973260_4088170_270-229555-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 12:58:29,339 WARNING [train.py:1204] (2/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_372-266951-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 12:58:30,681 WARNING [train.py:1204] (2/4) Exclude cut with ID _47790_210_16187_1_1533862814066_4207519_14-242149-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 12:58:31,334 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.3.encoder.layers.2.conv_module1.whiten, num_groups=1, num_channels=512, metric=6.32 vs. limit=15.0 2023-10-02 12:58:33,303 INFO [train.py:1046] (2/4) Epoch 26, batch 350, loss[loss=0.1795, simple_loss=0.265, pruned_loss=0.04703, over 24381.00 frames. ], tot_loss[loss=0.1681, simple_loss=0.2445, pruned_loss=0.04588, over 3895469.52 frames. ], batch size: 77, lr: 3.97e-03, grad_scale: 16.0 2023-10-02 12:58:35,307 WARNING [train.py:1204] (2/4) Exclude cut with ID _7877_210_6014_1_1528286358099_3750136_159-76512-0_sp1.1 from training. Duration: 0.936375 2023-10-02 12:58:35,310 WARNING [train.py:1204] (2/4) Exclude cut with ID _8263_210_1664_1_1528438953494_294100_40-204731-0_sp1.1 from training. Duration: 0.4545625 2023-10-02 12:58:38,203 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8278_210_2550_1_1528637099_4220413_694-238413-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 12:58:43,088 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.bypass.scale_min, batch_count=887686.6666666666, ans=0.2 2023-10-02 12:58:44,259 WARNING [train.py:1204] (2/4) Exclude cut with ID _47278_210_12041_1_1533902418349_3576110_421-172710-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 12:58:47,139 WARNING [train.py:1204] (2/4) Exclude cut with ID _14970_210_15709_1_1530773543493_1715730_215-282680-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 12:58:47,175 WARNING [train.py:1204] (2/4) Exclude cut with ID _64639_210_5816_1_1535367619437_3528000_306-149449-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 12:58:50,497 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_28-291982-0 from training. Duration: 0.97 2023-10-02 12:58:51,861 WARNING [train.py:1204] (2/4) Exclude cut with ID _55134_210_21982_1_1534590028377_3882170_412-15870-0_sp1.1 from training. Duration: 0.936375 2023-10-02 12:58:51,895 WARNING [train.py:1204] (2/4) Exclude cut with ID _13684_210_7497_1_1530619354863_3598359_312-235606-0 from training. Duration: 0.6 2023-10-02 12:58:53,514 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.balancer2.prob, batch_count=887753.3333333334, ans=0.125 2023-10-02 12:58:54,553 WARNING [train.py:1204] (2/4) Exclude cut with ID _37112_210_2575_1_1532997007673_7268610_347-238993-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 12:58:54,587 WARNING [train.py:1204] (2/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_121-284152-0 from training. Duration: 0.59 2023-10-02 12:58:54,661 WARNING [train.py:1204] (2/4) Exclude cut with ID _2941_210_2577_1_1526199673150_2489759_228-2961-0_sp1.1 from training. Duration: 0.97275 2023-10-02 12:58:57,971 WARNING [train.py:1204] (2/4) Exclude cut with ID _28323_210_16187_1_1532480395667_4100300_531-24157-0 from training. Duration: 0.98 2023-10-02 12:58:59,390 WARNING [train.py:1204] (2/4) Exclude cut with ID _32704_210_8954_1_1533017488840_6747370_266-329755-0_sp0.9 from training. Duration: 0.988875 2023-10-02 12:59:00,801 WARNING [train.py:1204] (2/4) Exclude cut with ID _6653_210_2629_1_1527919172441_6732679_1038-146421-0_sp1.1 from training. Duration: 0.97275 2023-10-02 12:59:00,872 WARNING [train.py:1204] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_391-22148-0_sp0.9 from training. Duration: 0.8555625 2023-10-02 12:59:02,327 WARNING [train.py:1204] (2/4) Exclude cut with ID _64521_210_14699_1_1535243427259_3815328_543-341117-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 12:59:03,658 WARNING [train.py:1204] (2/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_52-168712-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 12:59:03,705 WARNING [train.py:1204] (2/4) Exclude cut with ID _33920_210_4220_1_1533257851500_3084399_198-334063-0_sp1.1 from training. Duration: 0.8545625 2023-10-02 12:59:03,726 WARNING [train.py:1204] (2/4) Exclude cut with ID _50093_210_4220_1_1534417141207_3432299_69-111987-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 12:59:04,939 WARNING [train.py:1204] (2/4) Exclude cut with ID _14821_210_8750_1_1530680503991_6829359_189-297451-0_sp0.9 from training. Duration: 0.611125 2023-10-02 12:59:06,726 WARNING [train.py:1204] (2/4) Exclude cut with ID _8030_210_7497_1_1528444886781_3531089_262-86750-0_sp0.9 from training. Duration: 0.9 2023-10-02 12:59:06,731 WARNING [train.py:1204] (2/4) Exclude cut with ID _24612_210_8913_1_1531998054193_3806359_294-191196-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 12:59:12,853 WARNING [train.py:1204] (2/4) Exclude cut with ID _16909_210_5724_1_1531292079912_4118919_707-342896-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 12:59:12,884 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_9460_210_4211_1_1528954587_10049667_572-37035-0_sp0.9 from training. Duration: 0.888875 2023-10-02 12:59:12,932 WARNING [train.py:1204] (2/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_762-109886-0_sp1.1 from training. Duration: 0.82725 2023-10-02 12:59:13,445 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.4.encoder.layers.1.self_attn2.whiten, num_groups=1, num_channels=384, metric=12.85 vs. limit=22.5 2023-10-02 12:59:14,313 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_14953_210_2151_1_1530701710_5616165_519-326393-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 12:59:20,278 WARNING [train.py:1204] (2/4) Exclude cut with ID _25914_210_6400_1_1532141317058_644740_52-96808-0 from training. Duration: 0.91 2023-10-02 12:59:20,282 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_68095_210_20185_1_1535718226_8888855_1079-94525-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 12:59:24,475 WARNING [train.py:1204] (2/4) Exclude cut with ID _14860_210_4915_1_1530702020340_7343240_746-3647-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 12:59:24,476 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_70379_210_7925_1_1535941455_557455_49-101720-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 12:59:24,489 WARNING [train.py:1204] (2/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_8-307291-0_sp1.1 from training. Duration: 0.936375 2023-10-02 12:59:25,946 WARNING [train.py:1204] (2/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_117-290916-0 from training. Duration: 0.51 2023-10-02 12:59:28,013 WARNING [train.py:1204] (2/4) Exclude cut with ID _22020_210_2461_1_1531724164513_6326900_944-339840-0_sp1.1 from training. Duration: 0.963625 2023-10-02 12:59:29,361 WARNING [train.py:1204] (2/4) Exclude cut with ID IC0515W0043-254911 from training. Duration: 0.976 2023-10-02 12:59:30,690 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_3910_210_1794_1_1526742306_334530_28-238909-0 from training. Duration: 0.81 2023-10-02 12:59:30,704 WARNING [train.py:1204] (2/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_269-267275-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 12:59:33,568 WARNING [train.py:1204] (2/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_72-251255-0_sp1.1 from training. Duration: 0.8545625 2023-10-02 12:59:33,589 WARNING [train.py:1204] (2/4) Exclude cut with ID _14886_210_4220_1_1530702020355_3516190_175-229073-0 from training. Duration: 0.45 2023-10-02 12:59:35,047 WARNING [train.py:1204] (2/4) Exclude cut with ID _40519_210_13794_1_1533715228655_7337390_921-209452-0_sp1.1 from training. Duration: 0.963625 2023-10-02 12:59:40,050 WARNING [train.py:1204] (2/4) Exclude cut with ID _29857_210_20034_1_1532395886753_3798160_335-163562-0_sp0.9 from training. Duration: 0.9444375 2023-10-02 12:59:41,425 WARNING [train.py:1204] (2/4) Exclude cut with ID _58583_210_5236_1_1534726835266_3574249_384-58445-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 12:59:42,919 WARNING [train.py:1204] (2/4) Exclude cut with ID _44011_210_2046_1_1533722401691_3631310_96-315656-0_sp1.1 from training. Duration: 0.963625 2023-10-02 12:59:42,924 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_68484_210_1794_1_1535849804_7678097_549-170613-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 12:59:44,671 WARNING [train.py:1204] (2/4) Exclude cut with ID _37892_210_19596_1_1533198677897_3964058_214-148058-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 12:59:47,998 WARNING [train.py:1204] (2/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_415-78792-0_sp0.9 from training. Duration: 0.9 2023-10-02 12:59:49,264 INFO [train.py:1046] (2/4) Epoch 26, batch 400, loss[loss=0.1424, simple_loss=0.2201, pruned_loss=0.03238, over 24334.00 frames. ], tot_loss[loss=0.1681, simple_loss=0.2445, pruned_loss=0.04584, over 4080323.68 frames. ], batch size: 56, lr: 3.97e-03, grad_scale: 32.0 2023-10-02 12:59:50,636 WARNING [train.py:1204] (2/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_383-205833-0_sp1.1 from training. Duration: 0.863625 2023-10-02 12:59:50,815 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.1.conv_module2.balancer2.prob, batch_count=888020.0, ans=0.125 2023-10-02 12:59:51,984 WARNING [train.py:1204] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_167-137255-0 from training. Duration: 0.64 2023-10-02 12:59:51,992 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8275_210_4198_1_1528455320_3892571_484-154786-0_sp1.1 from training. Duration: 0.963625 2023-10-02 12:59:53,400 WARNING [train.py:1204] (2/4) Exclude cut with ID _40284_210_2084_1_1533513679814_3704279_396-319412-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 12:59:54,880 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.bypass.skip_rate, batch_count=888020.0, ans=0.04949747468305833 2023-10-02 12:59:56,007 WARNING [train.py:1204] (2/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_280-120942-0_sp0.9 from training. Duration: 0.8555625 2023-10-02 12:59:56,059 WARNING [train.py:1204] (2/4) Exclude cut with ID _45630_210_10556_1_1533949174498_3902049_558-240889-0_sp1.1 from training. Duration: 0.92725 2023-10-02 12:59:58,804 WARNING [train.py:1204] (2/4) Exclude cut with ID _56235_210_10640_1_1534557335235_4161099_197-307458-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 13:00:00,151 WARNING [train.py:1204] (2/4) Exclude cut with ID _16909_210_5724_1_1531292079912_4118919_163-254843-0_sp1.1 from training. Duration: 0.92725 2023-10-02 13:00:02,175 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_103-84010-0 from training. Duration: 0.69 2023-10-02 13:00:03,597 WARNING [train.py:1204] (2/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_401-158239-0 from training. Duration: 0.71 2023-10-02 13:00:03,598 WARNING [train.py:1204] (2/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_899-233658-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 13:00:03,695 WARNING [train.py:1204] (2/4) Exclude cut with ID _23825_210_13449_1_1531915313526_3497150_240-271367-0 from training. Duration: 0.97 2023-10-02 13:00:05,104 WARNING [train.py:1204] (2/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_424-134619-0_sp1.1 from training. Duration: 0.92725 2023-10-02 13:00:06,741 INFO [scaling.py:1118] (2/4) WithLoss: name=encoder.encoders.4.encoder.layers.1.self_attn_weights, loss-sum=0.000e+00 2023-10-02 13:00:09,716 WARNING [train.py:1204] (2/4) Exclude cut with ID _44831_210_13427_1_1533816011549_3689035_32-188147-0_sp1.1 from training. Duration: 0.87275 2023-10-02 13:00:09,719 WARNING [train.py:1204] (2/4) Exclude cut with ID _59170_210_6721_1_1534983633960_4203335_314-63879-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 13:00:09,732 WARNING [train.py:1204] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_602-275260-0 from training. Duration: 0.82 2023-10-02 13:00:11,058 WARNING [train.py:1204] (2/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_511-306767-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 13:00:11,105 WARNING [train.py:1204] (2/4) Exclude cut with ID _41817_210_18835_1_1533434477160_7423049_565-201775-0_sp1.1 from training. Duration: 0.92725 2023-10-02 13:00:11,129 WARNING [train.py:1204] (2/4) Exclude cut with ID _28317_210_12742_1_1532480302381_8510325_778-173151-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 13:00:12,598 WARNING [train.py:1204] (2/4) Exclude cut with ID _14998_210_5219_1_1530705582080_7474970_680-51001-0_sp1.1 from training. Duration: 0.963625 2023-10-02 13:00:15,732 WARNING [train.py:1204] (2/4) Exclude cut with ID IC0677W0059-334891 from training. Duration: 0.896 2023-10-02 13:00:15,767 WARNING [train.py:1204] (2/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_370-331600-0 from training. Duration: 0.94 2023-10-02 13:00:20,555 WARNING [train.py:1204] (2/4) Exclude cut with ID _66840_210_5219_1_1535452168426_4196349_446-240192-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 13:00:20,767 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.feed_forward1.out_proj.dropout_p, batch_count=888153.3333333334, ans=0.1 2023-10-02 13:00:21,970 WARNING [train.py:1204] (2/4) Exclude cut with ID _65242_210_18628_1_1535369393717_3823149_38-6732-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 13:00:23,184 WARNING [train.py:1204] (2/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_302-2550-0 from training. Duration: 0.81 2023-10-02 13:00:23,268 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_43802_210_2290_1_1533516203_8829504_100-348441-0 from training. Duration: 0.91 2023-10-02 13:00:26,101 WARNING [train.py:1204] (2/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_329-285801-0_sp1.1 from training. Duration: 0.82725 2023-10-02 13:00:28,742 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.453e+02 1.840e+02 2.026e+02 2.260e+02 3.455e+02, threshold=4.051e+02, percent-clipped=0.0 2023-10-02 13:00:28,872 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_30167_210_3528_1_1532428012_4772375_343-334712-0_sp1.1 from training. Duration: 0.936375 2023-10-02 13:00:31,871 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.ff3_skip_rate, batch_count=888220.0, ans=0.0 2023-10-02 13:00:31,902 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.bypass_mid.scale_min, batch_count=888220.0, ans=0.2 2023-10-02 13:00:35,736 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7543_210_6753_1_1528543318_4552_1-85072-0 from training. Duration: 0.83 2023-10-02 13:00:37,754 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7002_210_1794_1_1527935634_4034444_487-236501-0_sp1.1 from training. Duration: 0.6 2023-10-02 13:00:39,077 WARNING [train.py:1204] (2/4) Exclude cut with ID _7160_210_1876_1_1527940923231_3703799_282-322611-0 from training. Duration: 0.87 2023-10-02 13:00:42,200 WARNING [train.py:1204] (2/4) Exclude cut with ID _67335_210_5219_1_1535525975652_4275229_570-262983-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 13:00:43,592 WARNING [train.py:1204] (2/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_625-338546-0_sp1.1 from training. Duration: 0.8 2023-10-02 13:00:43,615 WARNING [train.py:1204] (2/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_176-56120-0 from training. Duration: 0.83 2023-10-02 13:00:46,870 WARNING [train.py:1204] (2/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_963-187109-0_sp1.1 from training. Duration: 0.9 2023-10-02 13:00:48,737 WARNING [train.py:1204] (2/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_121-284152-0_sp0.9 from training. Duration: 0.6555625 2023-10-02 13:00:51,306 WARNING [train.py:1204] (2/4) Exclude cut with ID _45021_210_9994_1_1533619751094_3330288_104-81711-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 13:00:54,037 WARNING [train.py:1204] (2/4) Exclude cut with ID _51382_210_23862_1_1534492801838_3650104_129-343914-0_sp1.1 from training. Duration: 0.936375 2023-10-02 13:00:54,070 WARNING [train.py:1204] (2/4) Exclude cut with ID _14970_210_15709_1_1530775547361_1966965_8-169924-0 from training. Duration: 0.92 2023-10-02 13:00:55,525 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_2295_210_2087_1_1525500993_2407863_234-83104-0_sp1.1 from training. Duration: 0.563625 2023-10-02 13:00:55,589 WARNING [train.py:1204] (2/4) Exclude cut with ID _14687_210_5854_1_1530703838259_3664889_115-239046-0 from training. Duration: 0.67 2023-10-02 13:00:58,235 WARNING [train.py:1204] (2/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_690-126306-0_sp1.1 from training. Duration: 0.7090625 2023-10-02 13:00:58,251 WARNING [train.py:1204] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_625-102955-0_sp0.9 from training. Duration: 0.8555625 2023-10-02 13:01:00,867 WARNING [train.py:1204] (2/4) Exclude cut with ID _9389_210_1664_1_1529217337301_5289269_289-213417-0 from training. Duration: 0.89 2023-10-02 13:01:00,997 WARNING [train.py:1204] (2/4) Exclude cut with ID _13253_210_1925_1_1530757775819_3577873_191-144977-0_sp0.9 from training. Duration: 0.8666875 2023-10-02 13:01:02,153 INFO [train.py:1046] (2/4) Epoch 26, batch 450, loss[loss=0.2015, simple_loss=0.2615, pruned_loss=0.07078, over 23560.00 frames. ], tot_loss[loss=0.1677, simple_loss=0.2445, pruned_loss=0.04549, over 4239754.27 frames. ], batch size: 256, lr: 3.97e-03, grad_scale: 32.0 2023-10-02 13:01:02,261 WARNING [train.py:1204] (2/4) Exclude cut with ID _21783_210_11357_1_1531742458730_3762679_325-155703-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 13:01:02,291 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_2295_210_2087_1_1525500993_2407863_116-238894-0_sp0.9 from training. Duration: 0.77775 2023-10-02 13:01:03,713 WARNING [train.py:1204] (2/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_94-126128-0 from training. Duration: 0.86 2023-10-02 13:01:05,499 WARNING [train.py:1204] (2/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_395-310220-0_sp0.9 from training. Duration: 0.988875 2023-10-02 13:01:05,590 WARNING [train.py:1204] (2/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_821-110852-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 13:01:07,002 WARNING [train.py:1204] (2/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_64-248359-0_sp1.1 from training. Duration: 0.62725 2023-10-02 13:01:07,012 WARNING [train.py:1204] (2/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_138-187031-0 from training. Duration: 0.99 2023-10-02 13:01:07,050 WARNING [train.py:1204] (2/4) Exclude cut with ID _4122_210_2084_1_1526713164417_4353280_528-206771-0_sp0.9 from training. Duration: 0.97775 2023-10-02 13:01:08,919 WARNING [train.py:1204] (2/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_844-96989-0_sp1.1 from training. Duration: 0.6454375 2023-10-02 13:01:10,451 WARNING [train.py:1204] (2/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_106-30782-0_sp1.1 from training. Duration: 0.72725 2023-10-02 13:01:22,495 WARNING [train.py:1204] (2/4) Exclude cut with ID _14683_210_5219_1_1530698352608_3974859_281-250040-0_sp1.1 from training. Duration: 0.936375 2023-10-02 13:01:22,530 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_46013_210_7234_1_1533776362_3744032_463-52378-0_sp0.9 from training. Duration: 0.9555625 2023-10-02 13:01:25,222 WARNING [train.py:1204] (2/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_299-34413-0 from training. Duration: 0.65 2023-10-02 13:01:25,281 WARNING [train.py:1204] (2/4) Exclude cut with ID _35799_210_5854_1_1533263487887_3529071_490-326847-0 from training. Duration: 0.99 2023-10-02 13:01:28,096 WARNING [train.py:1204] (2/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_589-9070-0_sp0.9 from training. Duration: 0.788875 2023-10-02 13:01:29,501 WARNING [train.py:1204] (2/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_884-29515-0_sp1.1 from training. Duration: 0.936375 2023-10-02 13:01:30,943 WARNING [train.py:1204] (2/4) Exclude cut with ID _15012_210_1968_1_1530856820429_5095350_474-119995-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 13:01:33,727 WARNING [train.py:1204] (2/4) Exclude cut with ID _65585_210_12210_1_1535450451584_3764098_211-97124-0_sp1.1 from training. Duration: 0.963625 2023-10-02 13:01:35,474 WARNING [train.py:1204] (2/4) Exclude cut with ID _33640_210_2655_1_1533117605479_4855469_666-224463-0_sp1.1 from training. Duration: 0.963625 2023-10-02 13:01:38,113 WARNING [train.py:1204] (2/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_35-153886-0 from training. Duration: 0.84 2023-10-02 13:01:38,165 WARNING [train.py:1204] (2/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_210-106047-0 from training. Duration: 0.97 2023-10-02 13:01:39,550 WARNING [train.py:1204] (2/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_479-125611-0 from training. Duration: 0.96 2023-10-02 13:01:39,580 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_9417_210_1794_1_1529235261_4614131_427-153963-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 13:01:41,293 WARNING [train.py:1204] (2/4) Exclude cut with ID _23818_210_6228_1_1531917063884_3676920_133-170571-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 13:01:42,610 WARNING [train.py:1204] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_602-275260-0_sp1.1 from training. Duration: 0.7454375 2023-10-02 13:01:43,937 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9029W0121-509521 from training. Duration: 0.896 2023-10-02 13:01:43,947 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9029W0434-509821 from training. Duration: 0.64 2023-10-02 13:01:43,984 WARNING [train.py:1204] (2/4) Exclude cut with ID _23038_210_9570_1_1532001584158_3652419_289-307034-0_sp1.1 from training. Duration: 0.936375 2023-10-02 13:01:45,277 WARNING [train.py:1204] (2/4) Exclude cut with ID _29345_210_4381_1_1532689168263_3860169_149-257268-0_sp0.9 from training. Duration: 0.9 2023-10-02 13:01:47,035 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_405-339986-0_sp1.1 from training. Duration: 0.463625 2023-10-02 13:01:50,308 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_2655_210_2087_1_1525688959_3897400_437-337690-0_sp1.1 from training. Duration: 0.57275 2023-10-02 13:01:50,333 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7543_210_6753_1_1528541907_1399322_165-323619-0_sp1.1 from training. Duration: 0.62725 2023-10-02 13:01:50,382 WARNING [train.py:1204] (2/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_508-335369-0_sp1.1 from training. Duration: 0.436375 2023-10-02 13:01:51,739 WARNING [train.py:1204] (2/4) Exclude cut with ID _7852_210_5219_1_1528286471316_8139400_358-287492-0 from training. Duration: 0.96 2023-10-02 13:01:54,480 WARNING [train.py:1204] (2/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_322-217220-0_sp0.9 from training. Duration: 0.9555625 2023-10-02 13:01:55,891 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_3171_210_2087_1_1526109084_2330007_66-260583-0_sp0.9 from training. Duration: 0.8 2023-10-02 13:01:55,926 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8558_210_1410_1_1528505225_7613406_131-329524-0_sp1.1 from training. Duration: 0.7181875 2023-10-02 13:01:57,268 WARNING [train.py:1204] (2/4) Exclude cut with ID _9235_210_5219_1_1528891154253_8046120_30-326681-0 from training. Duration: 0.83 2023-10-02 13:02:01,382 WARNING [train.py:1204] (2/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_58-68956-0_sp0.9 from training. Duration: 0.92225 2023-10-02 13:02:02,746 WARNING [train.py:1204] (2/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_255-312654-0 from training. Duration: 0.91 2023-10-02 13:02:04,033 WARNING [train.py:1204] (2/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_99-167392-0 from training. Duration: 0.61 2023-10-02 13:02:04,121 WARNING [train.py:1204] (2/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_94-126128-0_sp0.9 from training. Duration: 0.9555625 2023-10-02 13:02:04,363 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.balancer2.prob, batch_count=888620.0, ans=0.125 2023-10-02 13:02:08,594 WARNING [train.py:1204] (2/4) Exclude cut with ID _68635_210_9317_1_1535770813410_4315870_187-192817-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 13:02:10,591 WARNING [train.py:1204] (2/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_277-204970-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 13:02:10,697 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6163_210_6400_1_1527506746_4263068_283-137811-0_sp0.9 from training. Duration: 0.8555625 2023-10-02 13:02:10,729 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9144W0121-566446 from training. Duration: 0.896 2023-10-02 13:02:13,455 WARNING [train.py:1204] (2/4) Exclude cut with ID _49371_210_12071_1_1533952761021_3812190_351-315519-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 13:02:15,175 WARNING [train.py:1204] (2/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_115-326778-0_sp1.1 from training. Duration: 0.8454375 2023-10-02 13:02:16,859 INFO [train.py:1046] (2/4) Epoch 26, batch 500, loss[loss=0.223, simple_loss=0.2822, pruned_loss=0.08193, over 19325.00 frames. ], tot_loss[loss=0.1679, simple_loss=0.2448, pruned_loss=0.04545, over 4344200.23 frames. ], batch size: 388, lr: 3.97e-03, grad_scale: 32.0 2023-10-02 13:02:16,953 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_17292_210_6753_1_1531125388_2943734_436-198965-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 13:02:16,963 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9165W0117-576751 from training. Duration: 0.896 2023-10-02 13:02:18,294 WARNING [train.py:1204] (2/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_212-149947-0 from training. Duration: 0.73 2023-10-02 13:02:18,309 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_13757_210_3528_1_1530440682_4107239_65-167544-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 13:02:21,140 WARNING [train.py:1204] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_700-183549-0_sp0.9 from training. Duration: 0.7555625 2023-10-02 13:02:26,424 WARNING [train.py:1204] (2/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_216-343007-0_sp0.9 from training. Duration: 0.7333125 2023-10-02 13:02:27,946 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7544_210_6753_1_1528587722_3576686_295-258479-0_sp1.1 from training. Duration: 0.763625 2023-10-02 13:02:29,336 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_365-287106-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 13:02:29,348 WARNING [train.py:1204] (2/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_300-3747-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 13:02:30,707 WARNING [train.py:1204] (2/4) Exclude cut with ID _14069_210_5632_1_1530512692033_4803390_487-303201-0_sp1.1 from training. Duration: 0.92725 2023-10-02 13:02:38,253 WARNING [train.py:1204] (2/4) Exclude cut with ID _14459_210_2228_1_1531443641946_7516700_668-123026-0_sp1.1 from training. Duration: 0.936375 2023-10-02 13:02:38,278 WARNING [train.py:1204] (2/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_108-53929-0_sp1.1 from training. Duration: 0.536375 2023-10-02 13:02:38,318 WARNING [train.py:1204] (2/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_266-137104-0_sp0.9 from training. Duration: 0.72225 2023-10-02 13:02:38,345 WARNING [train.py:1204] (2/4) Exclude cut with ID _12302_210_5219_1_1529924446466_8107280_902-168619-0_sp1.1 from training. Duration: 0.936375 2023-10-02 13:02:38,589 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.bypass.scale_min, batch_count=888753.3333333334, ans=0.2 2023-10-02 13:02:40,277 WARNING [train.py:1204] (2/4) Exclude cut with ID _59502_210_12210_1_1535107024698_1835139_146-162495-0 from training. Duration: 0.92 2023-10-02 13:02:40,291 WARNING [train.py:1204] (2/4) Exclude cut with ID _3465_210_3525_1_1526175667060_3574049_412-287526-0_sp1.1 from training. Duration: 0.6545625 2023-10-02 13:02:41,662 WARNING [train.py:1204] (2/4) Exclude cut with ID _7160_210_1876_1_1527940923231_3703799_120-150943-0_sp0.9 from training. Duration: 0.9 2023-10-02 13:02:43,435 WARNING [train.py:1204] (2/4) Exclude cut with ID _15037_210_2253_1_1530752294104_3267650_296-192597-0_sp1.1 from training. Duration: 0.736375 2023-10-02 13:02:44,687 WARNING [train.py:1204] (2/4) Exclude cut with ID _13252_210_1925_1_1530671372047_3336909_397-216497-0_sp1.1 from training. Duration: 0.87275 2023-10-02 13:02:44,701 WARNING [train.py:1204] (2/4) Exclude cut with ID _40094_210_16380_1_1533294005773_3641159_76-269320-0_sp1.1 from training. Duration: 0.936375 2023-10-02 13:02:44,756 WARNING [train.py:1204] (2/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_449-15508-0 from training. Duration: 0.82 2023-10-02 13:02:48,469 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.feed_forward1.hidden_balancer.prob, batch_count=888820.0, ans=0.125 2023-10-02 13:02:49,400 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9309W0194-640321 from training. Duration: 0.64 2023-10-02 13:02:51,149 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.attention_skip_rate, batch_count=888820.0, ans=0.0 2023-10-02 13:02:52,238 WARNING [train.py:1204] (2/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_284-312178-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 13:02:53,741 WARNING [train.py:1204] (2/4) Exclude cut with ID _29853_210_7497_1_1532505600851_6642110_401-55937-0_sp1.1 from training. Duration: 0.92725 2023-10-02 13:02:55,071 WARNING [train.py:1204] (2/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_716-24918-0_sp1.1 from training. Duration: 0.92725 2023-10-02 13:02:55,091 WARNING [train.py:1204] (2/4) Exclude cut with ID _19637_210_4220_1_1531537167676_3205060_314-305638-0_sp1.1 from training. Duration: 0.92725 2023-10-02 13:02:56,400 WARNING [train.py:1204] (2/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_475-127455-0_sp1.1 from training. Duration: 0.636375 2023-10-02 13:02:57,649 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.502e+02 1.792e+02 1.964e+02 2.180e+02 2.813e+02, threshold=3.927e+02, percent-clipped=0.0 2023-10-02 13:02:57,798 WARNING [train.py:1204] (2/4) Exclude cut with ID _5444_210_2151_1_1527418808464_5312210_581-315411-0 from training. Duration: 0.62 2023-10-02 13:03:00,547 WARNING [train.py:1204] (2/4) Exclude cut with ID _44902_210_16187_1_1533888006062_3792078_149-67212-0_sp0.9 from training. Duration: 0.9666875 2023-10-02 13:03:00,622 WARNING [train.py:1204] (2/4) Exclude cut with ID _31602_210_5854_1_1532677021456_4458250_63-63253-0_sp1.1 from training. Duration: 0.963625 2023-10-02 13:03:05,195 WARNING [train.py:1204] (2/4) Exclude cut with ID _55209_210_1531_1_1534675828996_4597160_660-203992-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 13:03:08,017 WARNING [train.py:1204] (2/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_83-190624-0_sp1.1 from training. Duration: 0.92725 2023-10-02 13:03:16,980 WARNING [train.py:1204] (2/4) Exclude cut with ID _58605_210_12210_1_1534723195939_3904259_154-139635-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 13:03:21,068 WARNING [train.py:1204] (2/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_164-224052-0 from training. Duration: 0.7 2023-10-02 13:03:21,078 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_1126-189071-0_sp1.1 from training. Duration: 0.963625 2023-10-02 13:03:21,089 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_15396_210_6890_1_1531308473_1938936_310-214789-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 13:03:23,160 WARNING [train.py:1204] (2/4) Exclude cut with ID _8395_210_1531_1_1528597871523_4411579_431-132470-0 from training. Duration: 0.74 2023-10-02 13:03:23,215 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_3780_210_2087_1_1526467760_2130483_170-259044-0_sp0.9 from training. Duration: 0.77775 2023-10-02 13:03:25,931 WARNING [train.py:1204] (2/4) Exclude cut with ID _12733_210_8341_1_1530167424556_3646380_537-230640-0_sp1.1 from training. Duration: 0.963625 2023-10-02 13:03:30,035 INFO [train.py:1046] (2/4) Epoch 26, batch 550, loss[loss=0.1514, simple_loss=0.2341, pruned_loss=0.03433, over 24329.00 frames. ], tot_loss[loss=0.1679, simple_loss=0.2452, pruned_loss=0.04531, over 4429577.55 frames. ], batch size: 61, lr: 3.97e-03, grad_scale: 8.0 2023-10-02 13:03:30,085 WARNING [train.py:1204] (2/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_159-281310-0 from training. Duration: 0.95 2023-10-02 13:03:31,484 WARNING [train.py:1204] (2/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_434-83000-0 from training. Duration: 0.98 2023-10-02 13:03:31,520 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_4405_210_1849_1_1526732205_3593939_493-32464-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 13:03:31,529 WARNING [train.py:1204] (2/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_342-100157-0 from training. Duration: 0.54 2023-10-02 13:03:32,746 WARNING [train.py:1204] (2/4) Exclude cut with ID _44778_210_2054_1_1533798127203_7443090_765-250133-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 13:03:32,750 WARNING [train.py:1204] (2/4) Exclude cut with ID _38670_210_15379_1_1533448810659_4391059_81-312245-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 13:03:34,095 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_50050_210_16223_1_1535435796_7290138_1273-247611-0_sp1.1 from training. Duration: 0.936375 2023-10-02 13:03:35,414 WARNING [train.py:1204] (2/4) Exclude cut with ID _44831_210_13427_1_1533816011549_3689035_399-18591-0_sp1.1 from training. Duration: 0.936375 2023-10-02 13:03:35,430 WARNING [train.py:1204] (2/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_557-324258-0_sp0.9 from training. Duration: 0.9 2023-10-02 13:03:35,516 WARNING [train.py:1204] (2/4) Exclude cut with ID _3465_210_3525_1_1526175667060_3574049_62-335552-0_sp1.1 from training. Duration: 0.7909375 2023-10-02 13:03:38,785 WARNING [train.py:1204] (2/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_673-264125-0_sp1.1 from training. Duration: 0.963625 2023-10-02 13:03:38,867 WARNING [train.py:1204] (2/4) Exclude cut with ID _8030_210_7497_1_1528444886781_3531089_262-86750-0 from training. Duration: 0.81 2023-10-02 13:03:38,887 WARNING [train.py:1204] (2/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_444-28041-0_sp1.1 from training. Duration: 0.77275 2023-10-02 13:03:44,360 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_34008_210_10431_1_1533345424_8183247_516-14858-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 13:03:44,373 WARNING [train.py:1204] (2/4) Exclude cut with ID _32704_210_8954_1_1533017488840_6747370_54-248218-0_sp1.1 from training. Duration: 0.936375 2023-10-02 13:03:47,709 WARNING [train.py:1204] (2/4) Exclude cut with ID _13253_210_1925_1_1530757775819_3577873_300-119782-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 13:03:47,779 WARNING [train.py:1204] (2/4) Exclude cut with ID _39290_210_4220_1_1533862778760_3075118_21-240703-0_sp1.1 from training. Duration: 0.936375 2023-10-02 13:03:52,053 WARNING [train.py:1204] (2/4) Exclude cut with ID 774-127930-0014-10412-0_sp1.1 from training. Duration: 0.95 2023-10-02 13:03:52,797 WARNING [train.py:1204] (2/4) Exclude cut with ID _67499_210_17282_1_1535509805220_3692430_20-179346-0 from training. Duration: 0.97 2023-10-02 13:03:54,586 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.4.encoder.layers.2.nonlin_attention.whiten1, num_groups=1, num_channels=288, metric=4.78 vs. limit=10.0 2023-10-02 13:03:55,419 WARNING [train.py:1204] (2/4) Exclude cut with ID _8365_210_2064_1_1528592311070_4677690_431-294102-0_sp0.9 from training. Duration: 0.988875 2023-10-02 13:03:59,717 WARNING [train.py:1204] (2/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_365-1492-0_sp1.1 from training. Duration: 0.82725 2023-10-02 13:03:59,756 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_105-114797-0_sp0.9 from training. Duration: 0.9333125 2023-10-02 13:04:02,476 WARNING [train.py:1204] (2/4) Exclude cut with ID _6274_210_2084_1_1527937583837_7494020_769-168302-0_sp0.9 from training. Duration: 0.788875 2023-10-02 13:04:03,960 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_40340_210_9024_1_1533433916_4132944_320-270277-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 13:04:03,965 WARNING [train.py:1204] (2/4) Exclude cut with ID ID0203W0003-781711 from training. Duration: 0.958 2023-10-02 13:04:05,261 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_30434_210_13049_1_1532519384_981746_52-12932-0_sp1.1 from training. Duration: 0.936375 2023-10-02 13:04:06,645 WARNING [train.py:1204] (2/4) Exclude cut with ID _8263_210_1664_1_1528438953494_294100_40-204731-0_sp0.9 from training. Duration: 0.5555625 2023-10-02 13:04:08,106 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1032-236863-0_sp0.9 from training. Duration: 0.9333125 2023-10-02 13:04:10,011 WARNING [train.py:1204] (2/4) Exclude cut with ID _8394_210_6702_1_1528590504316_3540189_335-274780-0_sp0.9 from training. Duration: 0.8666875 2023-10-02 13:04:10,018 WARNING [train.py:1204] (2/4) Exclude cut with ID _66873_210_5517_1_1535454136506_4293370_654-52713-0_sp1.1 from training. Duration: 0.7 2023-10-02 13:04:10,230 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.conv_module1.balancer2.prob, batch_count=889153.3333333334, ans=0.125 2023-10-02 13:04:11,294 WARNING [train.py:1204] (2/4) Exclude cut with ID _5250_210_5219_1_1527249561806_8692110_742-267856-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 13:04:12,660 WARNING [train.py:1204] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_74-48717-0 from training. Duration: 0.58 2023-10-02 13:04:14,040 WARNING [train.py:1204] (2/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_18-46756-0 from training. Duration: 0.79 2023-10-02 13:04:15,899 WARNING [train.py:1204] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_612-56061-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 13:04:15,907 WARNING [train.py:1204] (2/4) Exclude cut with ID _56423_210_5854_1_1534896061049_3165969_412-2403-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 13:04:15,964 WARNING [train.py:1204] (2/4) Exclude cut with ID _62574_210_2655_1_1535180407058_3933249_120-196848-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 13:04:15,965 WARNING [train.py:1204] (2/4) Exclude cut with ID _40519_210_13794_1_1533715228655_7337390_916-51000-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 13:04:19,289 WARNING [train.py:1204] (2/4) Exclude cut with ID _65263_210_22356_1_1535763598691_3921037_108-21902-0_sp1.1 from training. Duration: 0.9 2023-10-02 13:04:19,375 WARNING [train.py:1204] (2/4) Exclude cut with ID _4122_210_2084_1_1526713164417_4353280_528-206771-0_sp1.1 from training. Duration: 0.8 2023-10-02 13:04:21,997 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.3.encoder.layers.3.conv_module2.whiten, num_groups=1, num_channels=512, metric=4.51 vs. limit=15.0 2023-10-02 13:04:22,694 WARNING [train.py:1204] (2/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_43-325657-0_sp1.1 from training. Duration: 0.92725 2023-10-02 13:04:24,134 WARNING [train.py:1204] (2/4) Exclude cut with ID _44659_210_2721_1_1534036120061_6614860_314-82041-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 13:04:24,195 WARNING [train.py:1204] (2/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_81-300212-0_sp0.9 from training. Duration: 0.6666875 2023-10-02 13:04:25,477 WARNING [train.py:1204] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_665-196155-0_sp0.9 from training. Duration: 0.7444375 2023-10-02 13:04:26,886 WARNING [train.py:1204] (2/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_140-336881-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 13:04:28,221 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6290_210_3273_1_1527595224_1282787_115-87095-0_sp0.9 from training. Duration: 0.611125 2023-10-02 13:04:28,286 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_53373_210_5399_1_1534229471_4715695_607-259685-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 13:04:29,659 WARNING [train.py:1204] (2/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_175-163894-0_sp1.1 from training. Duration: 0.67275 2023-10-02 13:04:29,677 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7002_210_1794_1_1527939683_5157286_1-281264-0_sp1.1 from training. Duration: 0.4 2023-10-02 13:04:35,225 WARNING [train.py:1204] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_605-257205-0 from training. Duration: 0.59 2023-10-02 13:04:39,645 WARNING [train.py:1204] (2/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_454-314901-0 from training. Duration: 0.46 2023-10-02 13:04:40,226 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.5.encoder.layers.0.self_attn2.whiten, num_groups=1, num_channels=256, metric=12.49 vs. limit=22.5 2023-10-02 13:04:41,506 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_46032_210_14656_1_1533781412_4507969_287-130422-0_sp1.1 from training. Duration: 0.97275 2023-10-02 13:04:41,532 WARNING [train.py:1204] (2/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_186-309161-0_sp1.1 from training. Duration: 0.4909375 2023-10-02 13:04:41,586 WARNING [train.py:1204] (2/4) Exclude cut with ID _42659_210_2070_1_1533607442765_3120380_45-342307-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 13:04:43,323 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.conv_skip_rate, batch_count=889353.3333333334, ans=0.0 2023-10-02 13:04:44,869 INFO [train.py:1046] (2/4) Epoch 26, batch 600, loss[loss=0.1653, simple_loss=0.2472, pruned_loss=0.04169, over 24647.00 frames. ], tot_loss[loss=0.1688, simple_loss=0.246, pruned_loss=0.04579, over 4496954.21 frames. ], batch size: 65, lr: 3.97e-03, grad_scale: 8.0 2023-10-02 13:04:50,738 WARNING [train.py:1204] (2/4) Exclude cut with ID _12555_210_8750_1_1530075594402_3780239_478-326818-0_sp1.1 from training. Duration: 0.963625 2023-10-02 13:04:54,081 WARNING [train.py:1204] (2/4) Exclude cut with ID _7606_210_4915_1_1528894253277_5080690_127-177761-0_sp1.1 from training. Duration: 0.5818125 2023-10-02 13:04:54,191 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6788_210_1388_1_1527898698_4562021_91-197981-0 from training. Duration: 0.83 2023-10-02 13:04:55,653 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8558_210_1410_1_1528505225_7613406_131-329524-0_sp0.9 from training. Duration: 0.87775 2023-10-02 13:04:57,063 WARNING [train.py:1204] (2/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_153-153537-0_sp1.1 from training. Duration: 0.82725 2023-10-02 13:04:58,464 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_17292_210_6753_1_1531125388_2943734_285-349105-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 13:05:01,267 WARNING [train.py:1204] (2/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_37-41919-0 from training. Duration: 0.87 2023-10-02 13:05:02,572 WARNING [train.py:1204] (2/4) Exclude cut with ID _60860_210_8254_1_1534896015875_3557169_172-294852-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 13:05:08,135 WARNING [train.py:1204] (2/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_146-276944-0 from training. Duration: 0.83 2023-10-02 13:05:08,431 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.balancer2.prob, batch_count=889420.0, ans=0.125 2023-10-02 13:05:10,941 WARNING [train.py:1204] (2/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_1254-44082-0_sp1.1 from training. Duration: 0.936375 2023-10-02 13:05:10,953 WARNING [train.py:1204] (2/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_1115-180350-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 13:05:10,979 WARNING [train.py:1204] (2/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_60-146189-0_sp1.1 from training. Duration: 0.8909375 2023-10-02 13:05:17,110 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.feed_forward2.hidden_balancer.prob, batch_count=889486.6666666666, ans=0.125 2023-10-02 13:05:18,246 WARNING [train.py:1204] (2/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_517-153649-0_sp1.1 from training. Duration: 0.92725 2023-10-02 13:05:18,258 WARNING [train.py:1204] (2/4) Exclude cut with ID _39571_210_13449_1_1533801584143_3651077_37-266257-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 13:05:18,306 WARNING [train.py:1204] (2/4) Exclude cut with ID _6653_210_2629_1_1527919172441_6732679_527-276118-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 13:05:25,052 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7696_210_2040_1_1528207167_3716099_117-125434-0_sp1.1 from training. Duration: 0.6909375 2023-10-02 13:05:28,841 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.438e+02 1.874e+02 2.049e+02 2.312e+02 3.828e+02, threshold=4.099e+02, percent-clipped=0.0 2023-10-02 13:05:29,001 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_25767_210_2161_1_1532307483_3867748_478-156360-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 13:05:29,012 WARNING [train.py:1204] (2/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_177-49092-0_sp1.1 from training. Duration: 0.82725 2023-10-02 13:05:29,020 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_11305_210_1794_1_1529750428_7178148_680-317073-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 13:05:34,968 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.bypass.skip_rate, batch_count=889553.3333333334, ans=0.07 2023-10-02 13:05:37,394 WARNING [train.py:1204] (2/4) Exclude cut with ID _53565_210_10635_1_1534337218335_3820259_287-18120-0 from training. Duration: 0.97 2023-10-02 13:05:39,591 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.1.encoder.layers.1.nonlin_attention.whiten2, num_groups=1, num_channels=256, metric=6.45 vs. limit=15.0 2023-10-02 13:05:42,851 WARNING [train.py:1204] (2/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_498-40153-0_sp1.1 from training. Duration: 0.57275 2023-10-02 13:05:42,876 WARNING [train.py:1204] (2/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_555-63066-0_sp1.1 from training. Duration: 0.963625 2023-10-02 13:05:46,497 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.conv_module1.balancer1.prob, batch_count=889620.0, ans=0.125 2023-10-02 13:05:47,630 WARNING [train.py:1204] (2/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_395-310220-0 from training. Duration: 0.89 2023-10-02 13:05:47,715 WARNING [train.py:1204] (2/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_937-208281-0_sp1.1 from training. Duration: 0.77275 2023-10-02 13:05:50,407 WARNING [train.py:1204] (2/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_120-211630-0 from training. Duration: 0.92 2023-10-02 13:05:52,313 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_454-276429-0_sp1.1 from training. Duration: 0.7909375 2023-10-02 13:05:52,355 WARNING [train.py:1204] (2/4) Exclude cut with ID _22826_210_9994_1_1531908201522_3279169_404-75889-0_sp1.1 from training. Duration: 0.8090625 2023-10-02 13:05:58,314 INFO [train.py:1046] (2/4) Epoch 26, batch 650, loss[loss=0.1735, simple_loss=0.2468, pruned_loss=0.05006, over 23597.00 frames. ], tot_loss[loss=0.1681, simple_loss=0.2454, pruned_loss=0.04541, over 4551825.47 frames. ], batch size: 135, lr: 3.97e-03, grad_scale: 8.0 2023-10-02 13:05:58,392 WARNING [train.py:1204] (2/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_282-136641-0_sp0.9 from training. Duration: 0.4666875 2023-10-02 13:05:59,721 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_809-17156-0_sp1.1 from training. Duration: 0.663625 2023-10-02 13:06:01,182 WARNING [train.py:1204] (2/4) Exclude cut with ID _9235_210_5219_1_1528891154253_8046120_1003-101638-0_sp0.9 from training. Duration: 0.811125 2023-10-02 13:06:02,583 WARNING [train.py:1204] (2/4) Exclude cut with ID _22826_210_9994_1_1531908201522_3279169_404-75889-0_sp0.9 from training. Duration: 0.988875 2023-10-02 13:06:05,472 WARNING [train.py:1204] (2/4) Exclude cut with ID _59288_210_5854_1_1535077757774_3412246_570-295255-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 13:06:06,877 WARNING [train.py:1204] (2/4) Exclude cut with ID _3465_210_3525_1_1526175667060_3574049_412-287526-0 from training. Duration: 0.72 2023-10-02 13:06:08,260 WARNING [train.py:1204] (2/4) Exclude cut with ID _44010_210_4525_1_1533884262964_4013620_649-79655-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 13:06:12,523 WARNING [train.py:1204] (2/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_57-270037-0_sp0.9 from training. Duration: 0.8555625 2023-10-02 13:06:12,524 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_34815_210_6388_1_1533381320_3571903_302-255963-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 13:06:12,764 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.feed_forward1.out_proj.dropout_p, batch_count=889753.3333333334, ans=0.1 2023-10-02 13:06:17,067 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_47449_210_5399_1_1534076445_4006929_573-301852-0_sp1.1 from training. Duration: 0.97275 2023-10-02 13:06:20,911 WARNING [train.py:1204] (2/4) Exclude cut with ID 6709-74022-0004-86860-0_sp1.1 from training. Duration: 0.9409375 2023-10-02 13:06:21,032 WARNING [train.py:1204] (2/4) Exclude cut with ID _40519_210_13794_1_1533715228655_7337390_111-71991-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 13:06:22,353 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_30167_210_3528_1_1532428012_4772375_169-214186-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 13:06:23,977 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.conv_skip_rate, batch_count=889753.3333333334, ans=0.0 2023-10-02 13:06:25,105 WARNING [train.py:1204] (2/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_20-235313-0_sp1.1 from training. Duration: 0.963625 2023-10-02 13:06:25,301 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=889753.3333333334, ans=0.1 2023-10-02 13:06:26,433 WARNING [train.py:1204] (2/4) Exclude cut with ID _4681_210_3525_1_1527251040321_1235713_123-24428-0_sp0.9 from training. Duration: 0.5333125 2023-10-02 13:06:27,375 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.feed_forward1.hidden_balancer.prob, batch_count=889820.0, ans=0.125 2023-10-02 13:06:28,378 WARNING [train.py:1204] (2/4) Exclude cut with ID _31602_210_5854_1_1532677021456_4458250_444-198752-0_sp1.1 from training. Duration: 0.97275 2023-10-02 13:06:28,427 WARNING [train.py:1204] (2/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_9-279442-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 13:06:29,768 WARNING [train.py:1204] (2/4) Exclude cut with ID _6275_210_2084_1_1528110062213_7669049_973-198085-0_sp0.9 from training. Duration: 0.8333125 2023-10-02 13:06:31,154 WARNING [train.py:1204] (2/4) Exclude cut with ID _29853_210_7497_1_1532505600851_6642110_567-250221-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 13:06:32,557 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6545_210_2040_1_1527774670_4245258_407-48070-0_sp1.1 from training. Duration: 0.6909375 2023-10-02 13:06:33,938 WARNING [train.py:1204] (2/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_224-343295-0_sp0.9 from training. Duration: 0.611125 2023-10-02 13:06:35,190 WARNING [train.py:1204] (2/4) Exclude cut with ID IC0125W0011-61817 from training. Duration: 0.88 2023-10-02 13:06:35,198 WARNING [train.py:1204] (2/4) Exclude cut with ID _60113_210_16271_1_1535164212481_4386079_435-154371-0_sp1.1 from training. Duration: 0.97275 2023-10-02 13:06:35,210 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_25836_210_16554_1_1532138167_53197_3-9394-0_sp1.1 from training. Duration: 0.92725 2023-10-02 13:06:38,095 WARNING [train.py:1204] (2/4) Exclude cut with ID _29343_210_4381_1_1532602757752_3674109_57-55125-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 13:06:39,428 WARNING [train.py:1204] (2/4) Exclude cut with ID _56235_210_10640_1_1534557335235_4161099_267-23560-0_sp1.1 from training. Duration: 0.936375 2023-10-02 13:06:39,453 WARNING [train.py:1204] (2/4) Exclude cut with ID _30196_210_1664_1_1533708496522_7074140_1150-278867-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 13:06:39,517 WARNING [train.py:1204] (2/4) Exclude cut with ID _6270_210_2084_1_1527850911573_3856420_26-277343-0_sp0.9 from training. Duration: 0.911125 2023-10-02 13:06:40,850 WARNING [train.py:1204] (2/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_325-62802-0 from training. Duration: 0.87 2023-10-02 13:06:40,930 WARNING [train.py:1204] (2/4) Exclude cut with ID _56524_210_9642_1_1534572011779_3619170_145-310842-0_sp1.1 from training. Duration: 0.9 2023-10-02 13:06:42,377 WARNING [train.py:1204] (2/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_860-96198-0_sp0.9 from training. Duration: 0.811125 2023-10-02 13:06:43,754 WARNING [train.py:1204] (2/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_17-25045-0_sp1.1 from training. Duration: 0.536375 2023-10-02 13:06:43,781 WARNING [train.py:1204] (2/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_349-69727-0_sp1.1 from training. Duration: 0.936375 2023-10-02 13:06:45,029 WARNING [train.py:1204] (2/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_580-93798-0_sp1.1 from training. Duration: 0.5090625 2023-10-02 13:06:45,124 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7762_210_1794_1_1528199508_4057408_419-125009-0 from training. Duration: 0.83 2023-10-02 13:06:47,155 WARNING [train.py:1204] (2/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_356-167816-0 from training. Duration: 0.72 2023-10-02 13:06:47,177 WARNING [train.py:1204] (2/4) Exclude cut with ID _29345_210_4381_1_1532689168263_3860169_225-97100-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 13:06:47,181 WARNING [train.py:1204] (2/4) Exclude cut with ID _14227_210_4381_1_1530701804286_3940259_374-194712-0_sp1.1 from training. Duration: 0.92725 2023-10-02 13:06:47,221 WARNING [train.py:1204] (2/4) Exclude cut with ID _40137_210_12210_1_1533290484046_3795180_249-187059-0_sp0.9 from training. Duration: 0.97775 2023-10-02 13:06:47,243 WARNING [train.py:1204] (2/4) Exclude cut with ID _22826_210_9994_1_1531908201522_3279169_389-317194-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 13:06:49,899 WARNING [train.py:1204] (2/4) Exclude cut with ID _27485_210_4916_1_1532739191677_3668269_239-122918-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 13:06:55,496 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_2245_210_2715_1_1525511548_3540254_128-117609-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 13:06:55,525 WARNING [train.py:1204] (2/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_160-276178-0_sp1.1 from training. Duration: 0.963625 2023-10-02 13:06:57,343 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_31920_210_10633_1_1532860009_1686094_1-279735-0_sp1.1 from training. Duration: 0.97275 2023-10-02 13:07:00,169 WARNING [train.py:1204] (2/4) Exclude cut with ID _51078_210_9070_1_1534161557424_3805250_325-262383-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 13:07:00,199 WARNING [train.py:1204] (2/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_643-52007-0_sp0.9 from training. Duration: 0.6333125 2023-10-02 13:07:00,251 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_51937_210_5014_1_1534573610_4022025_518-299248-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 13:07:07,213 WARNING [train.py:1204] (2/4) Exclude cut with ID _13252_210_1925_1_1530671372047_3336909_166-314607-0_sp1.1 from training. Duration: 0.4909375 2023-10-02 13:07:08,462 WARNING [train.py:1204] (2/4) Exclude cut with ID _44778_210_2054_1_1533798127203_7443090_1087-282939-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 13:07:08,503 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_23850_210_4238_1_1532232597_1632433_32-185135-0_sp1.1 from training. Duration: 0.7818125 2023-10-02 13:07:08,531 WARNING [train.py:1204] (2/4) Exclude cut with ID _59500_210_8254_1_1534809610680_3736009_145-89359-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 13:07:08,577 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder_embed.dropout.p, batch_count=889953.3333333334, ans=0.1 2023-10-02 13:07:11,144 INFO [train.py:1046] (2/4) Epoch 26, batch 700, loss[loss=0.1776, simple_loss=0.2454, pruned_loss=0.05491, over 23447.00 frames. ], tot_loss[loss=0.1678, simple_loss=0.2445, pruned_loss=0.04552, over 4591275.73 frames. ], batch size: 285, lr: 3.97e-03, grad_scale: 8.0 2023-10-02 13:07:12,729 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.1.ff3_skip_rate, batch_count=890020.0, ans=0.0 2023-10-02 13:07:14,496 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_340-336408-0 from training. Duration: 0.93 2023-10-02 13:07:17,011 WARNING [train.py:1204] (2/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_9-21737-0 from training. Duration: 0.97 2023-10-02 13:07:17,357 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.balancer1.prob, batch_count=890020.0, ans=0.125 2023-10-02 13:07:19,086 WARNING [train.py:1204] (2/4) Exclude cut with ID _2220_210_1819_1_1525509109898_3658680_50-99837-0 from training. Duration: 0.54 2023-10-02 13:07:19,151 WARNING [train.py:1204] (2/4) Exclude cut with ID _30304_210_6319_1_1532476827261_7582060_879-299350-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 13:07:19,421 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.bypass.scale_min, batch_count=890020.0, ans=0.2 2023-10-02 13:07:20,916 WARNING [train.py:1204] (2/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_905-160074-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 13:07:22,342 WARNING [train.py:1204] (2/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_395-73570-0 from training. Duration: 0.99 2023-10-02 13:07:26,507 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7264_210_4938_1_1527939911_8132095_800-91995-0_sp1.1 from training. Duration: 0.7818125 2023-10-02 13:07:28,391 WARNING [train.py:1204] (2/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_77-311404-0_sp1.1 from training. Duration: 0.8545625 2023-10-02 13:07:30,978 WARNING [train.py:1204] (2/4) Exclude cut with ID _13679_210_7497_1_1530534293662_3355098_107-80772-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 13:07:31,085 WARNING [train.py:1204] (2/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_224-343295-0_sp1.1 from training. Duration: 0.5 2023-10-02 13:07:31,112 WARNING [train.py:1204] (2/4) Exclude cut with ID _54871_210_22356_1_1534665798253_3856359_156-253482-0_sp1.1 from training. Duration: 0.963625 2023-10-02 13:07:33,817 WARNING [train.py:1204] (2/4) Exclude cut with ID _49728_210_12651_1_1534041034294_6788150_309-125532-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 13:07:37,909 WARNING [train.py:1204] (2/4) Exclude cut with ID _13913_210_9994_1_1530579615187_3383897_473-277682-0_sp0.9 from training. Duration: 0.4333125 2023-10-02 13:07:37,913 WARNING [train.py:1204] (2/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_3-276701-0_sp1.1 from training. Duration: 0.82725 2023-10-02 13:07:39,314 WARNING [train.py:1204] (2/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_282-136641-0 from training. Duration: 0.42 2023-10-02 13:07:42,163 WARNING [train.py:1204] (2/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_127-566-0 from training. Duration: 0.84 2023-10-02 13:07:42,388 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=890153.3333333334, ans=0.1 2023-10-02 13:07:45,404 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6163_210_6400_1_1527506746_4263068_283-137811-0_sp1.1 from training. Duration: 0.7 2023-10-02 13:07:45,431 WARNING [train.py:1204] (2/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_34-183685-0_sp1.1 from training. Duration: 0.8909375 2023-10-02 13:07:47,303 WARNING [train.py:1204] (2/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_154-135696-0_sp0.9 from training. Duration: 0.888875 2023-10-02 13:07:51,767 WARNING [train.py:1204] (2/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_676-51014-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 13:07:51,828 WARNING [train.py:1204] (2/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_38-169920-0 from training. Duration: 0.91 2023-10-02 13:07:56,015 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.567e+02 1.890e+02 2.209e+02 2.852e+02 4.841e+02, threshold=4.419e+02, percent-clipped=5.0 2023-10-02 13:07:56,142 WARNING [train.py:1204] (2/4) Exclude cut with ID _6653_210_2629_1_1527919172441_6732679_747-114465-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 13:07:57,398 WARNING [train.py:1204] (2/4) Exclude cut with ID _8021_210_7994_1_1528376451358_3653069_100-140231-0_sp1.1 from training. Duration: 0.6090625 2023-10-02 13:07:57,414 WARNING [train.py:1204] (2/4) Exclude cut with ID _64135_210_2253_1_1535677267876_7608972_133-188266-0 from training. Duration: 0.81 2023-10-02 13:08:03,254 WARNING [train.py:1204] (2/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_714-310538-0_sp1.1 from training. Duration: 0.7818125 2023-10-02 13:08:04,626 WARNING [train.py:1204] (2/4) Exclude cut with ID _39867_210_9826_1_1533553222030_9344259_1134-288498-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 13:08:07,407 WARNING [train.py:1204] (2/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_211-237289-0_sp1.1 from training. Duration: 0.936375 2023-10-02 13:08:11,842 WARNING [train.py:1204] (2/4) Exclude cut with ID _59202_210_26984_1_1534816810612_3549174_137-318710-0_sp0.9 from training. Duration: 0.988875 2023-10-02 13:08:13,109 WARNING [train.py:1204] (2/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_86-66192-0 from training. Duration: 0.55 2023-10-02 13:08:16,494 WARNING [train.py:1204] (2/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_540-275685-0 from training. Duration: 0.89 2023-10-02 13:08:16,526 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8776_210_2040_1_1528638837_4088036_244-278763-0 from training. Duration: 0.9 2023-10-02 13:08:18,606 WARNING [train.py:1204] (2/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_9-219972-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 13:08:18,781 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.0.bypass.scale_min, batch_count=890286.6666666666, ans=0.2 2023-10-02 13:08:19,976 WARNING [train.py:1204] (2/4) Exclude cut with ID _44659_210_2721_1_1534036120061_6614860_656-283823-0_sp1.1 from training. Duration: 0.92725 2023-10-02 13:08:20,045 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_69630_210_7234_1_1535789601_661049_77-216385-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 13:08:22,706 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_19952_210_2161_1_1531875469_3851750_210-308919-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 13:08:22,712 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_4737_210_2040_1_1526998689_3293008_378-178650-0 from training. Duration: 0.95 2023-10-02 13:08:25,902 INFO [train.py:1046] (2/4) Epoch 26, batch 750, loss[loss=0.1795, simple_loss=0.2591, pruned_loss=0.04993, over 23611.00 frames. ], tot_loss[loss=0.1677, simple_loss=0.245, pruned_loss=0.04522, over 4625213.96 frames. ], batch size: 85, lr: 3.97e-03, grad_scale: 8.0 2023-10-02 13:08:27,261 WARNING [train.py:1204] (2/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_224-343295-0 from training. Duration: 0.55 2023-10-02 13:08:28,487 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7002_210_1794_1_1527939683_5157286_184-229630-0 from training. Duration: 0.74 2023-10-02 13:08:28,511 WARNING [train.py:1204] (2/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_6-214815-0 from training. Duration: 0.85 2023-10-02 13:08:28,597 WARNING [train.py:1204] (2/4) Exclude cut with ID _8699_210_2069_1_1528630346831_3960409_160-24408-0 from training. Duration: 0.91 2023-10-02 13:08:29,920 WARNING [train.py:1204] (2/4) Exclude cut with ID _5585_210_2667_1_1527418813324_8007340_89-332072-0 from training. Duration: 0.64 2023-10-02 13:08:29,973 WARNING [train.py:1204] (2/4) Exclude cut with ID _5250_210_5219_1_1527249561806_8692110_528-259800-0_sp1.1 from training. Duration: 0.963625 2023-10-02 13:08:31,312 WARNING [train.py:1204] (2/4) Exclude cut with ID _55383_210_10635_1_1534468426172_2231669_134-128246-0 from training. Duration: 0.89 2023-10-02 13:08:33,265 WARNING [train.py:1204] (2/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_400-338274-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 13:08:33,337 WARNING [train.py:1204] (2/4) Exclude cut with ID _14224_210_4381_1_1530529062037_4264010_364-22817-0_sp0.9 from training. Duration: 0.788875 2023-10-02 13:08:34,746 WARNING [train.py:1204] (2/4) Exclude cut with ID _42863_210_9070_1_1533902398788_3696920_331-291057-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 13:08:36,355 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.conv_module2.balancer2.prob, batch_count=890353.3333333334, ans=0.125 2023-10-02 13:08:37,532 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_5436_210_1794_1_1527331317_2541494_229-211016-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 13:08:37,577 WARNING [train.py:1204] (2/4) Exclude cut with ID _6456_210_1534_1_1527681634212_3181049_345-252877-0_sp0.9 from training. Duration: 0.711125 2023-10-02 13:08:37,599 WARNING [train.py:1204] (2/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_96-81499-0_sp1.1 from training. Duration: 0.92725 2023-10-02 13:08:39,077 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_5575_210_1410_1_1527301305_2570170_342-265572-0_sp1.1 from training. Duration: 0.8545625 2023-10-02 13:08:40,925 WARNING [train.py:1204] (2/4) Exclude cut with ID _5444_210_2151_1_1527418808464_5312210_726-27165-0_sp0.9 from training. Duration: 0.7444375 2023-10-02 13:08:43,580 WARNING [train.py:1204] (2/4) Exclude cut with ID _22083_210_8254_1_1531702862439_3678970_304-327750-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 13:08:45,217 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.feed_forward1.hidden_balancer.prob, batch_count=890420.0, ans=0.125 2023-10-02 13:08:46,232 WARNING [train.py:1204] (2/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_245-226199-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 13:08:46,271 WARNING [train.py:1204] (2/4) Exclude cut with ID _42875_210_14856_1_1533704397487_4012433_102-199499-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 13:08:46,334 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_51225_210_3601_1_1534400634_2327383_55-301844-0 from training. Duration: 0.93 2023-10-02 13:08:49,539 WARNING [train.py:1204] (2/4) Exclude cut with ID _28317_210_12742_1_1532480302381_8510325_916-195848-0_sp1.1 from training. Duration: 0.836375 2023-10-02 13:08:49,599 WARNING [train.py:1204] (2/4) Exclude cut with ID _44718_210_5816_1_1534050051869_6469030_497-311999-0_sp1.1 from training. Duration: 0.936375 2023-10-02 13:08:50,932 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_34886_210_10633_1_1533203882_1755661_91-194765-0_sp1.1 from training. Duration: 0.936375 2023-10-02 13:08:52,294 WARNING [train.py:1204] (2/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_310-26186-0_sp0.9 from training. Duration: 0.77775 2023-10-02 13:08:53,616 WARNING [train.py:1204] (2/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_416-257730-0 from training. Duration: 0.65 2023-10-02 13:08:53,621 WARNING [train.py:1204] (2/4) Exclude cut with ID _9389_210_1664_1_1529217337301_5289269_209-154760-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 13:08:55,091 WARNING [train.py:1204] (2/4) Exclude cut with ID _4092_210_2151_1_1526643044449_3135759_227-213972-0 from training. Duration: 0.54 2023-10-02 13:08:55,114 WARNING [train.py:1204] (2/4) Exclude cut with ID IC0679W0044-335852 from training. Duration: 0.768 2023-10-02 13:08:56,900 WARNING [train.py:1204] (2/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_42-186162-0 from training. Duration: 0.92 2023-10-02 13:08:56,911 WARNING [train.py:1204] (2/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_302-2550-0_sp0.9 from training. Duration: 0.9 2023-10-02 13:08:56,947 WARNING [train.py:1204] (2/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_356-31216-0_sp0.9 from training. Duration: 0.8333125 2023-10-02 13:08:59,657 WARNING [train.py:1204] (2/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_125-322172-0_sp1.1 from training. Duration: 0.8454375 2023-10-02 13:09:01,265 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.nonlin_attention.balancer.prob, batch_count=890486.6666666666, ans=0.125 2023-10-02 13:09:07,161 WARNING [train.py:1204] (2/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_141-89727-0_sp0.9 from training. Duration: 0.788875 2023-10-02 13:09:07,185 WARNING [train.py:1204] (2/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_647-337784-0_sp1.1 from training. Duration: 0.97275 2023-10-02 13:09:07,195 WARNING [train.py:1204] (2/4) Exclude cut with ID _6493_210_1747_1_1528459210424_4012470_513-140967-0_sp1.1 from training. Duration: 0.7181875 2023-10-02 13:09:08,743 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.1.feed_forward3.hidden_balancer.prob, batch_count=890553.3333333334, ans=0.125 2023-10-02 13:09:09,992 WARNING [train.py:1204] (2/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_87-266334-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 13:09:10,096 WARNING [train.py:1204] (2/4) Exclude cut with ID _26528_210_9070_1_1532570410842_3851310_526-261338-0_sp1.1 from training. Duration: 0.92725 2023-10-02 13:09:10,140 WARNING [train.py:1204] (2/4) Exclude cut with ID _14224_210_4381_1_1530529062037_4264010_380-343670-0 from training. Duration: 0.88 2023-10-02 13:09:10,191 WARNING [train.py:1204] (2/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_449-15508-0_sp1.1 from training. Duration: 0.7454375 2023-10-02 13:09:11,594 WARNING [train.py:1204] (2/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_372-6869-0_sp0.9 from training. Duration: 0.488875 2023-10-02 13:09:11,652 WARNING [train.py:1204] (2/4) Exclude cut with ID _26907_210_7234_1_1532221740694_3205263_337-2494-0_sp0.9 from training. Duration: 0.97775 2023-10-02 13:09:15,000 WARNING [train.py:1204] (2/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_10-100367-0_sp1.1 from training. Duration: 0.8909375 2023-10-02 13:09:15,033 WARNING [train.py:1204] (2/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_47-167373-0 from training. Duration: 0.92 2023-10-02 13:09:16,798 WARNING [train.py:1204] (2/4) Exclude cut with ID _28323_210_16187_1_1532480395667_4100300_628-282179-0_sp1.1 from training. Duration: 0.97275 2023-10-02 13:09:19,676 WARNING [train.py:1204] (2/4) Exclude cut with ID _20455_210_5219_1_1531565996775_8098100_213-280898-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 13:09:19,925 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.self_attn_weights.pos_emb_skip_rate, batch_count=890553.3333333334, ans=0.0 2023-10-02 13:09:22,261 WARNING [train.py:1204] (2/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_383-316000-0_sp1.1 from training. Duration: 0.8181875 2023-10-02 13:09:22,298 WARNING [train.py:1204] (2/4) Exclude cut with ID _18513_210_9206_1_1531618215223_3862620_16-78180-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 13:09:23,628 WARNING [train.py:1204] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_330-327861-0_sp1.1 from training. Duration: 0.6090625 2023-10-02 13:09:23,797 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.conv_module2.balancer1.prob, batch_count=890620.0, ans=0.125 2023-10-02 13:09:27,982 WARNING [train.py:1204] (2/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_356-31216-0 from training. Duration: 0.75 2023-10-02 13:09:28,002 WARNING [train.py:1204] (2/4) Exclude cut with ID _25248_210_8750_1_1532062825941_5554180_255-148539-0_sp1.1 from training. Duration: 0.963625 2023-10-02 13:09:28,058 WARNING [train.py:1204] (2/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_269-154726-0_sp1.1 from training. Duration: 0.7818125 2023-10-02 13:09:30,882 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7002_210_1794_1_1527939683_5157286_163-87552-0_sp1.1 from training. Duration: 0.7818125 2023-10-02 13:09:30,908 WARNING [train.py:1204] (2/4) Exclude cut with ID _54871_210_22356_1_1534665798253_3856359_97-151896-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 13:09:34,207 WARNING [train.py:1204] (2/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_207-7026-0_sp1.1 from training. Duration: 0.97275 2023-10-02 13:09:34,226 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_92-46009-0_sp0.9 from training. Duration: 0.6 2023-10-02 13:09:39,949 INFO [train.py:1046] (2/4) Epoch 26, batch 800, loss[loss=0.1699, simple_loss=0.2509, pruned_loss=0.04448, over 24000.00 frames. ], tot_loss[loss=0.1684, simple_loss=0.2454, pruned_loss=0.04569, over 4644736.98 frames. ], batch size: 80, lr: 3.97e-03, grad_scale: 16.0 2023-10-02 13:09:43,450 WARNING [train.py:1204] (2/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_122-178847-0_sp1.1 from training. Duration: 0.97275 2023-10-02 13:09:43,453 WARNING [train.py:1204] (2/4) Exclude cut with ID _44778_210_2054_1_1533798127203_7443090_94-348956-0_sp1.1 from training. Duration: 0.92725 2023-10-02 13:09:45,274 WARNING [train.py:1204] (2/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_1018-244089-0_sp1.1 from training. Duration: 0.963625 2023-10-02 13:09:45,294 WARNING [train.py:1204] (2/4) Exclude cut with ID _56235_210_10640_1_1534557335235_4161099_87-118423-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 13:09:48,024 WARNING [train.py:1204] (2/4) Exclude cut with ID _53713_210_24116_1_1534314487762_7416751_504-212292-0_sp1.1 from training. Duration: 0.92725 2023-10-02 13:09:48,061 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_446-150327-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 13:09:49,495 WARNING [train.py:1204] (2/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_682-245989-0_sp1.1 from training. Duration: 0.92725 2023-10-02 13:09:53,559 WARNING [train.py:1204] (2/4) Exclude cut with ID _35723_210_1794_1_1533088341043_628250_8-234524-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 13:09:53,624 WARNING [train.py:1204] (2/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_64-248359-0_sp0.9 from training. Duration: 0.7666875 2023-10-02 13:09:56,280 WARNING [train.py:1204] (2/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_49-97158-0 from training. Duration: 0.76 2023-10-02 13:09:57,562 WARNING [train.py:1204] (2/4) Exclude cut with ID _24314_210_4915_1_1532135078116_7170179_743-915-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 13:09:58,931 WARNING [train.py:1204] (2/4) Exclude cut with ID _61194_210_18628_1_1535007555628_3750909_353-323468-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 13:09:58,957 WARNING [train.py:1204] (2/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_387-131538-0_sp1.1 from training. Duration: 0.736375 2023-10-02 13:09:59,529 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.3.encoder.layers.2.self_attn2.whiten, num_groups=1, num_channels=512, metric=13.22 vs. limit=22.5 2023-10-02 13:10:00,245 WARNING [train.py:1204] (2/4) Exclude cut with ID _44424_210_18163_1_1533623394848_1213110_79-187603-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 13:10:00,292 WARNING [train.py:1204] (2/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_86-19601-0 from training. Duration: 0.85 2023-10-02 13:10:00,839 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.3.encoder.layers.3.conv_module1.whiten, num_groups=1, num_channels=512, metric=4.80 vs. limit=15.0 2023-10-02 13:10:01,947 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_50050_210_16223_1_1535435796_7290138_103-283529-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 13:10:01,985 WARNING [train.py:1204] (2/4) Exclude cut with ID _13913_210_9994_1_1530579615187_3383897_473-277682-0 from training. Duration: 0.39 2023-10-02 13:10:03,419 WARNING [train.py:1204] (2/4) Exclude cut with ID _42326_210_7770_1_1533814282531_3707269_368-68029-0_sp1.1 from training. Duration: 0.92725 2023-10-02 13:10:05,278 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_41428_210_2161_1_1533774184_3992532_446-339645-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 13:10:06,677 WARNING [train.py:1204] (2/4) Exclude cut with ID _69584_210_5219_1_1535785133775_4044450_325-7656-0_sp1.1 from training. Duration: 0.7818125 2023-10-02 13:10:06,695 WARNING [train.py:1204] (2/4) Exclude cut with ID _21632_210_2253_1_1531994001765_3744140_134-184387-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 13:10:10,145 WARNING [train.py:1204] (2/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_550-184707-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 13:10:10,161 WARNING [train.py:1204] (2/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_106-341780-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 13:10:10,455 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.conv_module2.balancer1.prob, batch_count=890820.0, ans=0.125 2023-10-02 13:10:12,823 WARNING [train.py:1204] (2/4) Exclude cut with ID _45871_210_5665_1_1533864614245_3296060_253-132552-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 13:10:14,603 WARNING [train.py:1204] (2/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_395-310220-0_sp1.1 from training. Duration: 0.8090625 2023-10-02 13:10:14,635 WARNING [train.py:1204] (2/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_795-33252-0_sp0.9 from training. Duration: 0.411125 2023-10-02 13:10:17,304 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9002W0003-495962 from training. Duration: 0.946 2023-10-02 13:10:17,340 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_43-71989-0 from training. Duration: 0.57 2023-10-02 13:10:17,358 WARNING [train.py:1204] (2/4) Exclude cut with ID _8699_210_2069_1_1528630346831_3960409_155-39766-0_sp1.1 from training. Duration: 0.7090625 2023-10-02 13:10:18,687 WARNING [train.py:1204] (2/4) Exclude cut with ID _27894_210_12392_1_1532415496078_6902439_788-232684-0_sp1.1 from training. Duration: 0.97275 2023-10-02 13:10:20,125 WARNING [train.py:1204] (2/4) Exclude cut with ID _56494_210_4220_1_1535284837625_3033169_10-39296-0_sp1.1 from training. Duration: 0.92725 2023-10-02 13:10:20,143 WARNING [train.py:1204] (2/4) Exclude cut with ID _40901_210_5665_1_1533363789958_3856130_341-20819-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 13:10:24,219 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.564e+02 1.808e+02 1.916e+02 2.137e+02 2.899e+02, threshold=3.832e+02, percent-clipped=0.0 2023-10-02 13:10:25,729 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9029W0435-509822 from training. Duration: 0.896 2023-10-02 13:10:25,772 WARNING [train.py:1204] (2/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_51-117576-0 from training. Duration: 0.4 2023-10-02 13:10:27,135 WARNING [train.py:1204] (2/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_197-28164-0_sp1.1 from training. Duration: 0.72725 2023-10-02 13:10:29,834 WARNING [train.py:1204] (2/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_145-137790-0_sp1.1 from training. Duration: 0.7545625 2023-10-02 13:10:32,662 WARNING [train.py:1204] (2/4) Exclude cut with ID _47790_210_16187_1_1533862814066_4207519_2-39653-0_sp1.1 from training. Duration: 0.8909375 2023-10-02 13:10:37,666 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_3780_210_2087_1_1526467760_2130483_186-208240-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 13:10:38,279 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.4.encoder.layers.2.nonlin_attention.whiten1, num_groups=1, num_channels=288, metric=4.32 vs. limit=10.0 2023-10-02 13:10:39,038 WARNING [train.py:1204] (2/4) Exclude cut with ID _24055_210_12651_1_1532310907143_976979_92-59356-0 from training. Duration: 0.91 2023-10-02 13:10:39,087 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_3910_210_1794_1_1526742306_334530_28-238909-0_sp1.1 from training. Duration: 0.736375 2023-10-02 13:10:41,898 WARNING [train.py:1204] (2/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_269-154726-0 from training. Duration: 0.86 2023-10-02 13:10:47,324 WARNING [train.py:1204] (2/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_326-165900-0_sp0.9 from training. Duration: 0.7666875 2023-10-02 13:10:50,523 WARNING [train.py:1204] (2/4) Exclude cut with ID _5294_210_1747_1_1527249643928_3696179_470-276077-0_sp1.1 from training. Duration: 0.963625 2023-10-02 13:10:50,551 WARNING [train.py:1204] (2/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_372-131163-0 from training. Duration: 0.94 2023-10-02 13:10:51,896 WARNING [train.py:1204] (2/4) Exclude cut with ID _45297_210_2721_1_1533715351381_3264949_351-147136-0_sp1.1 from training. Duration: 0.7909375 2023-10-02 13:10:53,186 INFO [train.py:1046] (2/4) Epoch 26, batch 850, loss[loss=0.1639, simple_loss=0.238, pruned_loss=0.04491, over 23548.00 frames. ], tot_loss[loss=0.1682, simple_loss=0.2462, pruned_loss=0.04511, over 4678410.58 frames. ], batch size: 120, lr: 3.97e-03, grad_scale: 16.0 2023-10-02 13:10:53,300 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_1072-12671-0_sp1.1 from training. Duration: 0.92725 2023-10-02 13:10:54,554 WARNING [train.py:1204] (2/4) Exclude cut with ID _15133_210_6228_1_1530794512074_3080379_54-198193-0 from training. Duration: 0.94 2023-10-02 13:10:54,569 WARNING [train.py:1204] (2/4) Exclude cut with ID _30196_210_1664_1_1533708496522_7074140_1194-270988-0_sp1.1 from training. Duration: 0.97275 2023-10-02 13:10:55,911 WARNING [train.py:1204] (2/4) Exclude cut with ID _50310_210_15488_1_1534068043345_3641080_289-193637-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 13:10:57,317 WARNING [train.py:1204] (2/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_313-263495-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 13:10:58,658 WARNING [train.py:1204] (2/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_18-130336-0_sp1.1 from training. Duration: 0.7181875 2023-10-02 13:11:00,179 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_47896_210_20508_1_1534492518_5958868_646-287209-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 13:11:01,452 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_294-163793-0 from training. Duration: 0.82 2023-10-02 13:11:02,763 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_8-319957-0 from training. Duration: 0.96 2023-10-02 13:11:02,768 WARNING [train.py:1204] (2/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_1211-226066-0 from training. Duration: 0.76 2023-10-02 13:11:04,613 WARNING [train.py:1204] (2/4) Exclude cut with ID _4779_210_2084_1_1527318022944_3914893_500-285013-0_sp0.9 from training. Duration: 0.7666875 2023-10-02 13:11:04,641 WARNING [train.py:1204] (2/4) Exclude cut with ID _22826_210_9994_1_1531908201522_3279169_347-232268-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 13:11:07,386 WARNING [train.py:1204] (2/4) Exclude cut with ID _29877_210_6228_1_1532397791106_5276149_111-55056-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 13:11:07,421 WARNING [train.py:1204] (2/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_812-271646-0_sp1.1 from training. Duration: 0.92725 2023-10-02 13:11:08,835 WARNING [train.py:1204] (2/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_235-262124-0_sp1.1 from training. Duration: 0.6090625 2023-10-02 13:11:12,346 WARNING [train.py:1204] (2/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_14-16530-0_sp1.1 from training. Duration: 0.97275 2023-10-02 13:11:12,378 WARNING [train.py:1204] (2/4) Exclude cut with ID _44718_210_5816_1_1534050051869_6469030_568-304512-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 13:11:12,397 WARNING [train.py:1204] (2/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_1033-274376-0 from training. Duration: 0.87 2023-10-02 13:11:15,310 WARNING [train.py:1204] (2/4) Exclude cut with ID _4681_210_3525_1_1527251040321_1235713_123-24428-0 from training. Duration: 0.48 2023-10-02 13:11:18,051 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_37818_210_4238_1_1533620910_4720083_251-288764-0_sp1.1 from training. Duration: 0.97275 2023-10-02 13:11:19,299 WARNING [train.py:1204] (2/4) Exclude cut with ID _65585_210_12210_1_1535450451584_3764098_112-294301-0 from training. Duration: 0.88 2023-10-02 13:11:22,572 WARNING [train.py:1204] (2/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_329-113173-0 from training. Duration: 0.62 2023-10-02 13:11:24,032 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6767_210_3273_1_1527855783_1718932_173-256601-0 from training. Duration: 0.8 2023-10-02 13:11:25,480 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9266W0001-627677 from training. Duration: 0.896 2023-10-02 13:11:25,491 WARNING [train.py:1204] (2/4) Exclude cut with ID _49643_210_7989_1_1534640244224_3939031_611-185515-0_sp1.1 from training. Duration: 0.8545625 2023-10-02 13:11:25,503 WARNING [train.py:1204] (2/4) Exclude cut with ID _31649_210_6228_1_1532584608063_4091078_687-237771-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 13:11:25,513 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_12868_210_6753_1_1530702479_3610564_148-172861-0_sp0.9 from training. Duration: 0.5555625 2023-10-02 13:11:28,237 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_5129_210_6400_1_1527161005_3579054_107-69738-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 13:11:31,021 WARNING [train.py:1204] (2/4) Exclude cut with ID _40137_210_12210_1_1533290484046_3795180_36-301164-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 13:11:32,391 WARNING [train.py:1204] (2/4) Exclude cut with ID _7537_210_2084_1_1528502395431_3924359_113-199933-0 from training. Duration: 0.59 2023-10-02 13:11:34,307 WARNING [train.py:1204] (2/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_547-5424-0_sp1.1 from training. Duration: 0.8545625 2023-10-02 13:11:34,357 WARNING [train.py:1204] (2/4) Exclude cut with ID _43552_210_8913_1_1533726031059_3523429_147-284024-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 13:11:36,977 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_294-163793-0_sp1.1 from training. Duration: 0.7454375 2023-10-02 13:11:37,009 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_3780_210_2087_1_1526467760_2130483_170-259044-0_sp1.1 from training. Duration: 0.636375 2023-10-02 13:11:38,898 WARNING [train.py:1204] (2/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_726-270780-0_sp1.1 from training. Duration: 0.7818125 2023-10-02 13:11:40,248 WARNING [train.py:1204] (2/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_214-291369-0_sp1.1 from training. Duration: 0.57275 2023-10-02 13:11:40,307 WARNING [train.py:1204] (2/4) Exclude cut with ID _21881_210_3525_1_1531825242757_3798380_247-102591-0 from training. Duration: 0.98 2023-10-02 13:11:46,312 WARNING [train.py:1204] (2/4) Exclude cut with ID _8064_210_5399_1_1528372299291_4282973_18-174647-0_sp1.1 from training. Duration: 0.82725 2023-10-02 13:11:46,315 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_415-52611-0_sp1.1 from training. Duration: 0.936375 2023-10-02 13:11:47,623 WARNING [train.py:1204] (2/4) Exclude cut with ID _8394_210_6702_1_1528590504316_3540189_379-49457-0_sp0.9 from training. Duration: 0.9666875 2023-10-02 13:11:47,636 WARNING [train.py:1204] (2/4) Exclude cut with ID _64521_210_14699_1_1535243427259_3815328_368-195106-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 13:11:47,706 WARNING [train.py:1204] (2/4) Exclude cut with ID _28394_210_8913_1_1532430049518_3650118_51-334124-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 13:11:49,104 WARNING [train.py:1204] (2/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_317-161769-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 13:11:51,897 WARNING [train.py:1204] (2/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_40-129756-0_sp0.9 from training. Duration: 0.9 2023-10-02 13:11:53,341 WARNING [train.py:1204] (2/4) Exclude cut with ID _3067_210_2667_1_1526212974640_4503168_119-175776-0_sp0.9 from training. Duration: 0.82225 2023-10-02 13:11:53,391 WARNING [train.py:1204] (2/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_1004-304915-0_sp1.1 from training. Duration: 0.92725 2023-10-02 13:11:55,330 WARNING [train.py:1204] (2/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_359-118926-0_sp0.9 from training. Duration: 0.8 2023-10-02 13:11:55,905 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.5.encoder.layers.0.nonlin_attention.whiten2, num_groups=1, num_channels=256, metric=3.60 vs. limit=15.0 2023-10-02 13:11:58,375 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=891286.6666666666, ans=0.1 2023-10-02 13:11:59,599 WARNING [train.py:1204] (2/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_51-200220-0_sp0.9 from training. Duration: 0.62225 2023-10-02 13:12:00,904 WARNING [train.py:1204] (2/4) Exclude cut with ID _46683_210_9642_1_1533730913492_4591116_530-326202-0_sp1.1 from training. Duration: 0.936375 2023-10-02 13:12:02,369 WARNING [train.py:1204] (2/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_567-135114-0 from training. Duration: 0.87 2023-10-02 13:12:02,402 WARNING [train.py:1204] (2/4) Exclude cut with ID _66863_210_18053_1_1535698758334_8590319_1092-271784-0_sp1.1 from training. Duration: 0.87275 2023-10-02 13:12:02,411 WARNING [train.py:1204] (2/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_762-282614-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 13:12:05,809 WARNING [train.py:1204] (2/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_280-120942-0 from training. Duration: 0.77 2023-10-02 13:12:07,820 INFO [train.py:1046] (2/4) Epoch 26, batch 900, loss[loss=0.1662, simple_loss=0.2481, pruned_loss=0.04209, over 24114.00 frames. ], tot_loss[loss=0.1685, simple_loss=0.2468, pruned_loss=0.04507, over 4688876.87 frames. ], batch size: 80, lr: 3.97e-03, grad_scale: 16.0 2023-10-02 13:12:10,628 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_31583_210_3852_1_1532595851_4024413_27-170931-0_sp1.1 from training. Duration: 0.963625 2023-10-02 13:12:13,805 WARNING [train.py:1204] (2/4) Exclude cut with ID _12369_210_4381_1_1530356078581_4388520_266-271628-0_sp1.1 from training. Duration: 0.92725 2023-10-02 13:12:13,851 WARNING [train.py:1204] (2/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_41-31157-0 from training. Duration: 0.57 2023-10-02 13:12:16,724 WARNING [train.py:1204] (2/4) Exclude cut with ID _21878_210_3525_1_1531738772002_7264330_113-18163-0_sp1.1 from training. Duration: 0.8090625 2023-10-02 13:12:16,760 WARNING [train.py:1204] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_147-111137-0 from training. Duration: 0.95 2023-10-02 13:12:18,114 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1083-220122-0_sp1.1 from training. Duration: 0.463625 2023-10-02 13:12:19,659 WARNING [train.py:1204] (2/4) Exclude cut with ID _14884_210_2252_1_1530707459689_5561250_591-173406-0_sp1.1 from training. Duration: 0.87275 2023-10-02 13:12:19,664 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_24677_210_1794_1_1532002693_4341296_149-303793-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 13:12:20,971 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_12868_210_6753_1_1530702479_3610564_133-564-0_sp1.1 from training. Duration: 0.7181875 2023-10-02 13:12:21,006 WARNING [train.py:1204] (2/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_625-338546-0_sp0.9 from training. Duration: 0.97775 2023-10-02 13:12:29,865 WARNING [train.py:1204] (2/4) Exclude cut with ID _25882_210_3036_1_1532775665815_6479510_624-320169-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 13:12:29,879 WARNING [train.py:1204] (2/4) Exclude cut with ID _20622_210_3852_1_1531622427080_1093839_127-325326-0_sp1.1 from training. Duration: 0.92725 2023-10-02 13:12:31,208 WARNING [train.py:1204] (2/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_464-33931-0_sp1.1 from training. Duration: 0.4909375 2023-10-02 13:12:32,807 WARNING [train.py:1204] (2/4) Exclude cut with ID _66112_210_10635_1_1535590236305_3801484_120-304588-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 13:12:39,225 WARNING [train.py:1204] (2/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_110-130961-0 from training. Duration: 0.47 2023-10-02 13:12:39,404 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.conv_module2.balancer2.prob, batch_count=891486.6666666666, ans=0.125 2023-10-02 13:12:41,793 WARNING [train.py:1204] (2/4) Exclude cut with ID _16908_210_5724_1_1531119157248_3772700_64-54020-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 13:12:46,520 WARNING [train.py:1204] (2/4) Exclude cut with ID _4068_210_2629_1_1526697350395_1546259_224-288693-0_sp1.1 from training. Duration: 0.536375 2023-10-02 13:12:46,564 WARNING [train.py:1204] (2/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_191-41023-0_sp0.9 from training. Duration: 0.788875 2023-10-02 13:12:46,623 WARNING [train.py:1204] (2/4) Exclude cut with ID ID0205W0255-783002 from training. Duration: 0.958 2023-10-02 13:12:47,958 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_162-262717-0 from training. Duration: 0.83 2023-10-02 13:12:51,897 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.473e+02 1.855e+02 2.022e+02 2.290e+02 3.129e+02, threshold=4.044e+02, percent-clipped=0.0 2023-10-02 13:12:54,502 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_224-322394-0_sp1.1 from training. Duration: 0.6 2023-10-02 13:12:54,508 WARNING [train.py:1204] (2/4) Exclude cut with ID _4866_210_2577_1_1527404412284_4364659_133-1696-0_sp1.1 from training. Duration: 0.9 2023-10-02 13:12:54,567 WARNING [train.py:1204] (2/4) Exclude cut with ID _6275_210_2084_1_1528110062213_7669049_973-198085-0_sp1.1 from training. Duration: 0.6818125 2023-10-02 13:12:56,817 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.conv_module1.balancer1.prob, batch_count=891553.3333333334, ans=0.125 2023-10-02 13:12:58,044 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.0.conv_module2.balancer1.prob, batch_count=891553.3333333334, ans=0.125 2023-10-02 13:13:00,670 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_47449_210_5399_1_1534076445_4006929_410-237806-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 13:13:00,687 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7543_210_6753_1_1528543318_4552_1-85072-0_sp0.9 from training. Duration: 0.92225 2023-10-02 13:13:00,824 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.bypass.skip_rate, batch_count=891553.3333333334, ans=0.04949747468305833 2023-10-02 13:13:03,485 WARNING [train.py:1204] (2/4) Exclude cut with ID _65263_210_22356_1_1535763598691_3921037_108-21902-0 from training. Duration: 0.99 2023-10-02 13:13:03,490 WARNING [train.py:1204] (2/4) Exclude cut with ID _65585_210_12210_1_1535450451584_3764098_309-174818-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 13:13:03,648 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_309-65177-0 from training. Duration: 0.88 2023-10-02 13:13:05,344 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.attention_skip_rate, batch_count=891620.0, ans=0.0 2023-10-02 13:13:06,867 WARNING [train.py:1204] (2/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_363-185703-0_sp1.1 from training. Duration: 0.863625 2023-10-02 13:13:06,901 WARNING [train.py:1204] (2/4) Exclude cut with ID _42326_210_7770_1_1533814282531_3707269_442-238692-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 13:13:08,280 WARNING [train.py:1204] (2/4) Exclude cut with ID _69884_210_9771_1_1535846392818_4166169_226-254446-0_sp1.1 from training. Duration: 0.936375 2023-10-02 13:13:08,306 WARNING [train.py:1204] (2/4) Exclude cut with ID _29853_210_7497_1_1532505600851_6642110_287-299903-0_sp1.1 from training. Duration: 0.963625 2023-10-02 13:13:12,704 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1015-343884-0 from training. Duration: 0.62 2023-10-02 13:13:13,403 WARNING [train.py:1204] (2/4) Exclude cut with ID ID0305W0008-833402 from training. Duration: 0.91 2023-10-02 13:13:13,683 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.conv_skip_rate, batch_count=891620.0, ans=0.0 2023-10-02 13:13:16,074 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6673_210_3273_1_1527767790_3173240_351-167954-0_sp1.1 from training. Duration: 0.436375 2023-10-02 13:13:16,094 WARNING [train.py:1204] (2/4) Exclude cut with ID _6456_210_1534_1_1527681634212_3181049_345-252877-0 from training. Duration: 0.64 2023-10-02 13:13:18,758 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_13-205182-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 13:13:20,308 WARNING [train.py:1204] (2/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_427-208901-0 from training. Duration: 0.68 2023-10-02 13:13:21,664 INFO [train.py:1046] (2/4) Epoch 26, batch 950, loss[loss=0.1547, simple_loss=0.2276, pruned_loss=0.04094, over 23703.00 frames. ], tot_loss[loss=0.1693, simple_loss=0.2475, pruned_loss=0.04559, over 4682453.58 frames. ], batch size: 135, lr: 3.96e-03, grad_scale: 16.0 2023-10-02 13:13:26,446 WARNING [train.py:1204] (2/4) Exclude cut with ID _49039_210_6315_1_1533967537541_3846118_9-148921-0_sp1.1 from training. Duration: 0.97275 2023-10-02 13:13:29,210 WARNING [train.py:1204] (2/4) Exclude cut with ID _59493_210_17282_1_1535078419077_3225879_143-70317-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 13:13:29,238 WARNING [train.py:1204] (2/4) Exclude cut with ID _27967_210_12140_1_1532588391859_8419130_1056-236369-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 13:13:30,514 WARNING [train.py:1204] (2/4) Exclude cut with ID _6537_210_1883_1_1527987559710_3597129_382-133062-0_sp0.9 from training. Duration: 0.7555625 2023-10-02 13:13:33,400 WARNING [train.py:1204] (2/4) Exclude cut with ID ID0376W0255-869957 from training. Duration: 0.9510625 2023-10-02 13:13:36,152 WARNING [train.py:1204] (2/4) Exclude cut with ID _49371_210_12071_1_1533952761021_3812190_483-240691-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 13:13:36,233 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_12868_210_6753_1_1530701932_530275_18-76322-0_sp1.1 from training. Duration: 0.7909375 2023-10-02 13:13:37,686 WARNING [train.py:1204] (2/4) Exclude cut with ID _53950_210_5816_1_1534417264919_3862951_367-26344-0_sp1.1 from training. Duration: 0.97275 2023-10-02 13:13:37,726 WARNING [train.py:1204] (2/4) Exclude cut with ID _5444_210_2151_1_1527418808464_5312210_634-194174-0_sp1.1 from training. Duration: 0.8818125 2023-10-02 13:13:37,751 WARNING [train.py:1204] (2/4) Exclude cut with ID _56494_210_4220_1_1535284837625_3033169_69-138150-0 from training. Duration: 0.94 2023-10-02 13:13:37,898 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.attention_skip_rate, batch_count=891753.3333333334, ans=0.0 2023-10-02 13:13:39,529 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7543_210_6753_1_1528544010_1110402_25-183860-0_sp0.9 from training. Duration: 0.52225 2023-10-02 13:13:40,143 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.3.encoder.layers.3.self_attn2.whiten, num_groups=1, num_channels=512, metric=12.30 vs. limit=22.5 2023-10-02 13:13:42,222 WARNING [train.py:1204] (2/4) Exclude cut with ID _40284_210_2084_1_1533513679814_3704279_287-191918-0_sp1.1 from training. Duration: 0.963625 2023-10-02 13:13:42,334 WARNING [train.py:1204] (2/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_291-270881-0 from training. Duration: 0.97 2023-10-02 13:13:44,169 WARNING [train.py:1204] (2/4) Exclude cut with ID _12555_210_8750_1_1530075594402_3780239_449-267986-0_sp0.9 from training. Duration: 0.92225 2023-10-02 13:13:45,710 WARNING [train.py:1204] (2/4) Exclude cut with ID _3992_210_1747_1_1526644816031_3731060_268-64520-0_sp1.1 from training. Duration: 0.963625 2023-10-02 13:13:45,716 WARNING [train.py:1204] (2/4) Exclude cut with ID _39411_210_5986_1_1533538825030_3343649_173-238248-0_sp0.9 from training. Duration: 0.92225 2023-10-02 13:13:45,735 WARNING [train.py:1204] (2/4) Exclude cut with ID _32704_210_8954_1_1533017488840_6747370_828-313097-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 13:13:47,064 WARNING [train.py:1204] (2/4) Exclude cut with ID _26907_210_7234_1_1532221740694_3205263_337-2494-0 from training. Duration: 0.88 2023-10-02 13:13:49,932 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_229-45300-0_sp1.1 from training. Duration: 0.5181875 2023-10-02 13:13:51,316 WARNING [train.py:1204] (2/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_300-27235-0_sp1.1 from training. Duration: 0.7909375 2023-10-02 13:13:52,702 WARNING [train.py:1204] (2/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_48-338976-0_sp1.1 from training. Duration: 0.8090625 2023-10-02 13:13:57,335 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_13091_210_7925_1_1530609447_8987354_792-195353-0_sp1.1 from training. Duration: 0.936375 2023-10-02 13:13:58,607 WARNING [train.py:1204] (2/4) Exclude cut with ID _56524_210_9642_1_1534572011779_3619170_311-54500-0_sp1.1 from training. Duration: 0.97275 2023-10-02 13:14:02,577 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7543_210_6753_1_1528544010_1110402_25-183860-0 from training. Duration: 0.47 2023-10-02 13:14:04,015 WARNING [train.py:1204] (2/4) Exclude cut with ID 2411-132532-0017-82279-0_sp1.1 from training. Duration: 0.9681875 2023-10-02 13:14:04,015 WARNING [train.py:1204] (2/4) Exclude cut with ID _14170_210_5219_1_1530525949868_4469650_272-249985-0_sp0.9 from training. Duration: 0.7444375 2023-10-02 13:14:04,078 WARNING [train.py:1204] (2/4) Exclude cut with ID _55644_210_11856_1_1534593599582_4103719_320-229006-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 13:14:04,128 WARNING [train.py:1204] (2/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_177-100554-0_sp1.1 from training. Duration: 0.963625 2023-10-02 13:14:04,130 WARNING [train.py:1204] (2/4) Exclude cut with ID _14998_210_5219_1_1530705582080_7474970_787-294416-0_sp1.1 from training. Duration: 0.7454375 2023-10-02 13:14:08,294 WARNING [train.py:1204] (2/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_318-59097-0 from training. Duration: 0.76 2023-10-02 13:14:10,090 WARNING [train.py:1204] (2/4) Exclude cut with ID _12369_210_4381_1_1530356078581_4388520_480-13146-0_sp1.1 from training. Duration: 0.77275 2023-10-02 13:14:14,600 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_469-338558-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 13:14:15,251 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_40896_210_15710_1_1533433956_7100802_367-126897-0_sp1.1 from training. Duration: 0.963625 2023-10-02 13:14:15,274 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6545_210_2040_1_1527774670_4245258_379-14182-0 from training. Duration: 0.8 2023-10-02 13:14:15,286 WARNING [train.py:1204] (2/4) Exclude cut with ID _21631_210_2253_1_1531907880778_3160339_197-79498-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 13:14:15,290 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_12868_210_6753_1_1530702479_3610564_416-293746-0_sp1.1 from training. Duration: 0.8181875 2023-10-02 13:14:16,472 WARNING [train.py:1204] (2/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_52-211683-0 from training. Duration: 0.9 2023-10-02 13:14:19,350 WARNING [train.py:1204] (2/4) Exclude cut with ID _48678_210_5663_1_1533988830947_3769199_306-255886-0_sp0.9 from training. Duration: 0.8555625 2023-10-02 13:14:20,748 WARNING [train.py:1204] (2/4) Exclude cut with ID _65134_210_14763_1_1535335393985_3505368_296-24733-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 13:14:26,526 WARNING [train.py:1204] (2/4) Exclude cut with ID _14898_210_9206_1_1530766812377_4177190_335-99364-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 13:14:29,090 WARNING [train.py:1204] (2/4) Exclude cut with ID _54871_210_22356_1_1534665798253_3856359_19-342632-0 from training. Duration: 0.91 2023-10-02 13:14:29,123 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_53071_210_16554_1_1534490405_2954066_265-170472-0 from training. Duration: 0.83 2023-10-02 13:14:29,340 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.bypass_mid.scale_min, batch_count=891953.3333333334, ans=0.2 2023-10-02 13:14:32,225 WARNING [train.py:1204] (2/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_189-298172-0_sp1.1 from training. Duration: 0.963625 2023-10-02 13:14:34,912 INFO [train.py:1046] (2/4) Epoch 26, batch 1000, loss[loss=0.1516, simple_loss=0.2303, pruned_loss=0.03641, over 24336.00 frames. ], tot_loss[loss=0.1688, simple_loss=0.2464, pruned_loss=0.04559, over 4689007.14 frames. ], batch size: 61, lr: 3.96e-03, grad_scale: 16.0 2023-10-02 13:14:35,064 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7615_210_4211_1_1528110965_4580160_72-235211-0 from training. Duration: 0.76 2023-10-02 13:14:36,417 WARNING [train.py:1204] (2/4) Exclude cut with ID _47790_210_16187_1_1533862814066_4207519_155-90874-0_sp1.1 from training. Duration: 0.936375 2023-10-02 13:14:38,331 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.4.encoder.layers.2.whiten, num_groups=1, num_channels=384, metric=2.70 vs. limit=12.0 2023-10-02 13:14:39,271 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.attention_skip_rate, batch_count=892020.0, ans=0.0 2023-10-02 13:14:41,215 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.conv_module1.balancer2.prob, batch_count=892020.0, ans=0.125 2023-10-02 13:14:42,435 WARNING [train.py:1204] (2/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_164-216310-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 13:14:43,778 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_46013_210_7234_1_1533776362_3744032_463-52378-0 from training. Duration: 0.86 2023-10-02 13:14:43,782 WARNING [train.py:1204] (2/4) Exclude cut with ID _61133_210_9969_1_1535196777227_3696409_333-270325-0 from training. Duration: 0.93 2023-10-02 13:14:45,981 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=892020.0, ans=0.1 2023-10-02 13:14:49,304 WARNING [train.py:1204] (2/4) Exclude cut with ID _5172_210_1883_1_1527246062567_3577219_18-101380-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 13:14:49,312 WARNING [train.py:1204] (2/4) Exclude cut with ID _30800_210_22382_1_1532499347904_4515959_577-236858-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 13:14:50,685 WARNING [train.py:1204] (2/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_1006-177345-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 13:14:54,799 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6291_210_3273_1_1527683649_1078498_4-266348-0 from training. Duration: 0.78 2023-10-02 13:14:58,148 WARNING [train.py:1204] (2/4) Exclude cut with ID _55857_210_2346_1_1534644007831_6898209_745-98391-0 from training. Duration: 0.76 2023-10-02 13:14:59,615 WARNING [train.py:1204] (2/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_375-244976-0 from training. Duration: 0.75 2023-10-02 13:14:59,656 WARNING [train.py:1204] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_4-248404-0_sp1.1 from training. Duration: 0.97275 2023-10-02 13:15:02,332 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8453_210_4211_1_1528502940_8562730_596-306820-0 from training. Duration: 0.97 2023-10-02 13:15:03,718 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_211-339622-0_sp0.9 from training. Duration: 0.5 2023-10-02 13:15:03,756 WARNING [train.py:1204] (2/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_393-57888-0 from training. Duration: 0.64 2023-10-02 13:15:04,505 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.2.encoder.layers.2.whiten, num_groups=1, num_channels=384, metric=3.46 vs. limit=12.0 2023-10-02 13:15:05,150 WARNING [train.py:1204] (2/4) Exclude cut with ID _23825_210_13449_1_1531915313526_3497150_143-343575-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 13:15:06,432 WARNING [train.py:1204] (2/4) Exclude cut with ID _58605_210_12210_1_1534723195939_3904259_36-12256-0_sp1.1 from training. Duration: 0.936375 2023-10-02 13:15:12,525 WARNING [train.py:1204] (2/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_38-67795-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 13:15:13,829 WARNING [train.py:1204] (2/4) Exclude cut with ID _64631_210_27767_1_1535349580068_4165160_308-327832-0_sp1.1 from training. Duration: 0.92725 2023-10-02 13:15:15,110 WARNING [train.py:1204] (2/4) Exclude cut with ID _37086_210_9038_1_1532999540891_7021250_726-81176-0_sp1.1 from training. Duration: 0.936375 2023-10-02 13:15:15,169 WARNING [train.py:1204] (2/4) Exclude cut with ID _47790_210_16187_1_1533862814066_4207519_47-141521-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 13:15:15,173 WARNING [train.py:1204] (2/4) Exclude cut with ID _3067_210_2667_1_1526212974640_4503168_250-285937-0 from training. Duration: 0.8 2023-10-02 13:15:15,202 WARNING [train.py:1204] (2/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_239-24000-0_sp1.1 from training. Duration: 0.97275 2023-10-02 13:15:17,107 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_3371_210_1794_1_1526384462_5580777_396-339812-0_sp0.9 from training. Duration: 0.9444375 2023-10-02 13:15:18,388 WARNING [train.py:1204] (2/4) Exclude cut with ID _16909_210_5724_1_1531292079912_4118919_580-173454-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 13:15:18,694 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=892220.0, ans=0.1 2023-10-02 13:15:19,620 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.636e+02 1.933e+02 2.169e+02 2.725e+02 4.611e+02, threshold=4.339e+02, percent-clipped=3.0 2023-10-02 13:15:19,709 WARNING [train.py:1204] (2/4) Exclude cut with ID IC0124W0002-61308 from training. Duration: 0.982 2023-10-02 13:15:23,085 WARNING [train.py:1204] (2/4) Exclude cut with ID _31602_210_5854_1_1532677021456_4458250_249-96604-0 from training. Duration: 0.89 2023-10-02 13:15:24,451 WARNING [train.py:1204] (2/4) Exclude cut with ID _69584_210_5219_1_1535785133775_4044450_325-7656-0 from training. Duration: 0.86 2023-10-02 13:15:24,688 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.1.nonlin_attention.balancer.prob, batch_count=892220.0, ans=0.125 2023-10-02 13:15:25,830 WARNING [train.py:1204] (2/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_478-162247-0 from training. Duration: 0.82 2023-10-02 13:15:25,986 WARNING [train.py:1204] (2/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_686-54383-0_sp1.1 from training. Duration: 0.7909375 2023-10-02 13:15:28,313 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.4.encoder.layers.2.nonlin_attention.whiten2, num_groups=1, num_channels=384, metric=3.80 vs. limit=15.0 2023-10-02 13:15:34,574 WARNING [train.py:1204] (2/4) Exclude cut with ID _29303_210_2575_1_1532392237303_7410649_85-270092-0_sp1.1 from training. Duration: 0.936375 2023-10-02 13:15:34,592 WARNING [train.py:1204] (2/4) Exclude cut with ID _2322_210_2667_1_1525604548971_3656299_440-80494-0_sp1.1 from training. Duration: 0.8 2023-10-02 13:15:34,631 WARNING [train.py:1204] (2/4) Exclude cut with ID _47346_210_9476_1_1533815971553_3536160_201-40644-0_sp1.1 from training. Duration: 0.936375 2023-10-02 13:15:35,982 WARNING [train.py:1204] (2/4) Exclude cut with ID _56235_210_10640_1_1534557335235_4161099_343-139898-0_sp1.1 from training. Duration: 0.87275 2023-10-02 13:15:37,413 WARNING [train.py:1204] (2/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_1012-279473-0 from training. Duration: 0.67 2023-10-02 13:15:38,835 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7931_210_1794_1_1528594728_5564059_280-279560-0_sp1.1 from training. Duration: 0.8909375 2023-10-02 13:15:38,883 WARNING [train.py:1204] (2/4) Exclude cut with ID _8263_210_1664_1_1528439452891_970220_68-181805-0 from training. Duration: 0.64 2023-10-02 13:15:40,213 WARNING [train.py:1204] (2/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_605-133395-0 from training. Duration: 0.66 2023-10-02 13:15:41,684 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_43-171109-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 13:15:41,687 WARNING [train.py:1204] (2/4) Exclude cut with ID _40094_210_16380_1_1533294005773_3641159_107-280739-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 13:15:43,033 WARNING [train.py:1204] (2/4) Exclude cut with ID _14821_210_8750_1_1530680503991_6829359_2-227151-0_sp0.9 from training. Duration: 0.92225 2023-10-02 13:15:43,564 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.5.encoder.layers.0.conv_module2.whiten, num_groups=1, num_channels=256, metric=4.70 vs. limit=15.0 2023-10-02 13:15:46,210 WARNING [train.py:1204] (2/4) Exclude cut with ID _14268_210_9206_1_1530622771285_4714099_331-105351-0_sp1.1 from training. Duration: 0.5909375 2023-10-02 13:15:46,323 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_26326_210_7925_1_1533002493_3592406_451-82741-0_sp1.1 from training. Duration: 0.97275 2023-10-02 13:15:50,025 INFO [train.py:1046] (2/4) Epoch 26, batch 1050, loss[loss=0.184, simple_loss=0.2603, pruned_loss=0.05383, over 23775.00 frames. ], tot_loss[loss=0.168, simple_loss=0.2453, pruned_loss=0.04538, over 4697142.19 frames. ], batch size: 85, lr: 3.96e-03, grad_scale: 8.0 2023-10-02 13:15:51,458 WARNING [train.py:1204] (2/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_191-59784-0_sp0.9 from training. Duration: 0.97775 2023-10-02 13:15:51,565 WARNING [train.py:1204] (2/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_445-117398-0_sp1.1 from training. Duration: 0.8090625 2023-10-02 13:15:54,315 WARNING [train.py:1204] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_605-257205-0_sp0.9 from training. Duration: 0.6555625 2023-10-02 13:15:54,390 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_50050_210_16223_1_1535435796_7290138_592-258841-0_sp1.1 from training. Duration: 0.936375 2023-10-02 13:15:55,849 WARNING [train.py:1204] (2/4) Exclude cut with ID _14348_210_8341_1_1530599434996_3778640_240-232370-0_sp0.9 from training. Duration: 0.9333125 2023-10-02 13:15:58,975 WARNING [train.py:1204] (2/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_239-316802-0_sp0.9 from training. Duration: 0.7666875 2023-10-02 13:16:00,548 WARNING [train.py:1204] (2/4) Exclude cut with ID _45560_210_21982_1_1533812603951_3584999_111-6376-0_sp1.1 from training. Duration: 0.763625 2023-10-02 13:16:00,796 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.ff2_skip_rate, batch_count=892353.3333333334, ans=0.0 2023-10-02 13:16:03,241 WARNING [train.py:1204] (2/4) Exclude cut with ID _54189_210_16380_1_1534332670039_3911710_606-311481-0_sp1.1 from training. Duration: 0.963625 2023-10-02 13:16:04,658 WARNING [train.py:1204] (2/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_237-332027-0_sp1.1 from training. Duration: 0.863625 2023-10-02 13:16:04,702 WARNING [train.py:1204] (2/4) Exclude cut with ID _66070_210_7640_1_1535441393096_7805310_489-292631-0_sp0.9 from training. Duration: 0.87775 2023-10-02 13:16:04,777 WARNING [train.py:1204] (2/4) Exclude cut with ID _49728_210_12651_1_1534041034294_6788150_624-195301-0_sp1.1 from training. Duration: 0.77275 2023-10-02 13:16:04,948 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.conv_skip_rate, batch_count=892420.0, ans=0.0 2023-10-02 13:16:04,965 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.ff3_skip_rate, batch_count=892420.0, ans=0.0 2023-10-02 13:16:06,066 WARNING [train.py:1204] (2/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_44-79427-0 from training. Duration: 0.98 2023-10-02 13:16:06,129 WARNING [train.py:1204] (2/4) Exclude cut with ID _35350_210_16040_1_1533040191183_4075299_490-198354-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 13:16:06,162 WARNING [train.py:1204] (2/4) Exclude cut with ID _5166_210_2084_1_1527073197160_4087993_309-222545-0 from training. Duration: 0.84 2023-10-02 13:16:10,059 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_13091_210_7925_1_1530609447_8987354_790-199469-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 13:16:10,065 WARNING [train.py:1204] (2/4) Exclude cut with ID _21632_210_2253_1_1531994001765_3744140_110-274654-0 from training. Duration: 0.82 2023-10-02 13:16:10,070 WARNING [train.py:1204] (2/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_498-40153-0_sp0.9 from training. Duration: 0.7 2023-10-02 13:16:10,389 INFO [scaling.py:1118] (2/4) WithLoss: name=encoder.encoders.5.encoder.layers.0.self_attn_weights, loss-sum=0.000e+00 2023-10-02 13:16:16,356 WARNING [train.py:1204] (2/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_363-282909-0_sp1.1 from training. Duration: 0.936375 2023-10-02 13:16:16,405 WARNING [train.py:1204] (2/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_178-177582-0_sp0.9 from training. Duration: 0.82225 2023-10-02 13:16:17,070 WARNING [train.py:1204] (2/4) Exclude cut with ID _24612_210_8913_1_1531998054193_3806359_357-35487-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 13:16:20,987 WARNING [train.py:1204] (2/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_138-332773-0 from training. Duration: 0.81 2023-10-02 13:16:21,012 WARNING [train.py:1204] (2/4) Exclude cut with ID _7624_210_1531_1_1528109802198_4933101_418-349834-0 from training. Duration: 0.6 2023-10-02 13:16:21,042 WARNING [train.py:1204] (2/4) Exclude cut with ID _7537_210_2084_1_1528502395431_3924359_14-312906-0_sp0.9 from training. Duration: 0.9333125 2023-10-02 13:16:25,239 WARNING [train.py:1204] (2/4) Exclude cut with ID _60057_210_10640_1_1534903065793_3333150_362-230377-0 from training. Duration: 0.87 2023-10-02 13:16:28,039 WARNING [train.py:1204] (2/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_138-64210-0 from training. Duration: 0.73 2023-10-02 13:16:29,359 WARNING [train.py:1204] (2/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_454-339504-0_sp1.1 from training. Duration: 0.97275 2023-10-02 13:16:32,651 WARNING [train.py:1204] (2/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_508-335369-0_sp0.9 from training. Duration: 0.5333125 2023-10-02 13:16:35,372 WARNING [train.py:1204] (2/4) Exclude cut with ID _5608_210_2106_1_1527415439089_6819768_909-82767-0_sp0.9 from training. Duration: 0.67775 2023-10-02 13:16:35,407 WARNING [train.py:1204] (2/4) Exclude cut with ID _6694_210_4599_1_1527852637490_3965529_99-11611-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 13:16:36,922 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_2655_210_2087_1_1525688959_3897400_52-279899-0_sp1.1 from training. Duration: 0.62725 2023-10-02 13:16:40,984 WARNING [train.py:1204] (2/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_375-245677-0_sp1.1 from training. Duration: 0.736375 2023-10-02 13:16:43,807 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_23850_210_4238_1_1532232597_1632433_32-185135-0 from training. Duration: 0.86 2023-10-02 13:16:45,277 WARNING [train.py:1204] (2/4) Exclude cut with ID _15113_210_8254_1_1530842438579_3716279_209-54533-0 from training. Duration: 0.96 2023-10-02 13:16:45,313 WARNING [train.py:1204] (2/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_401-53423-0 from training. Duration: 0.55 2023-10-02 13:16:47,156 WARNING [train.py:1204] (2/4) Exclude cut with ID _14867_210_1968_1_1530684179890_3846969_319-250868-0_sp1.1 from training. Duration: 0.9 2023-10-02 13:16:47,177 WARNING [train.py:1204] (2/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_106-30782-0_sp0.9 from training. Duration: 0.888875 2023-10-02 13:16:49,175 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_5960_210_1794_1_1527403865_3990999_515-2613-0 from training. Duration: 0.88 2023-10-02 13:16:52,520 WARNING [train.py:1204] (2/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_798-106179-0_sp0.9 from training. Duration: 0.92225 2023-10-02 13:16:55,149 WARNING [train.py:1204] (2/4) Exclude cut with ID _14227_210_4381_1_1530701804286_3940259_125-275642-0_sp1.1 from training. Duration: 0.9 2023-10-02 13:16:55,152 WARNING [train.py:1204] (2/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_79-200392-0_sp1.1 from training. Duration: 0.8818125 2023-10-02 13:16:55,193 WARNING [train.py:1204] (2/4) Exclude cut with ID _48836_210_12210_1_1534075848293_3904150_429-35475-0_sp0.9 from training. Duration: 0.988875 2023-10-02 13:16:56,481 WARNING [train.py:1204] (2/4) Exclude cut with ID _69884_210_9771_1_1535846392818_4166169_224-172517-0_sp1.1 from training. Duration: 0.97275 2023-10-02 13:17:00,597 WARNING [train.py:1204] (2/4) Exclude cut with ID _15113_210_8254_1_1530842438579_3716279_76-232507-0_sp1.1 from training. Duration: 0.97275 2023-10-02 13:17:00,600 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_135-5501-0 from training. Duration: 0.98 2023-10-02 13:17:00,731 WARNING [train.py:1204] (2/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_563-347832-0_sp0.9 from training. Duration: 0.988875 2023-10-02 13:17:02,010 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6786_210_1388_1_1527839699_4194495_273-149841-0 from training. Duration: 0.92 2023-10-02 13:17:02,035 WARNING [train.py:1204] (2/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_172-215932-0 from training. Duration: 0.66 2023-10-02 13:17:04,020 INFO [train.py:1046] (2/4) Epoch 26, batch 1100, loss[loss=0.1748, simple_loss=0.2584, pruned_loss=0.04559, over 23986.00 frames. ], tot_loss[loss=0.1676, simple_loss=0.2443, pruned_loss=0.04541, over 4690847.24 frames. ], batch size: 86, lr: 3.96e-03, grad_scale: 8.0 2023-10-02 13:17:04,071 WARNING [train.py:1204] (2/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_939-132910-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 13:17:06,833 WARNING [train.py:1204] (2/4) Exclude cut with ID _49371_210_12071_1_1533952761021_3812190_16-246001-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 13:17:08,395 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.conv_module2.balancer2.prob, batch_count=892686.6666666666, ans=0.125 2023-10-02 13:17:11,069 WARNING [train.py:1204] (2/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_680-189165-0_sp1.1 from training. Duration: 0.936375 2023-10-02 13:17:15,142 WARNING [train.py:1204] (2/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_53-300544-0_sp1.1 from training. Duration: 0.6090625 2023-10-02 13:17:16,491 WARNING [train.py:1204] (2/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_540-275685-0_sp1.1 from training. Duration: 0.8090625 2023-10-02 13:17:16,507 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_38965_210_4852_1_1533174906_3484735_453-106579-0_sp1.1 from training. Duration: 0.92725 2023-10-02 13:17:16,550 WARNING [train.py:1204] (2/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_105-328174-0 from training. Duration: 0.76 2023-10-02 13:17:17,962 WARNING [train.py:1204] (2/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_279-126205-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 13:17:21,340 WARNING [train.py:1204] (2/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_5-55893-0_sp0.9 from training. Duration: 0.611125 2023-10-02 13:17:22,731 WARNING [train.py:1204] (2/4) Exclude cut with ID _13678_210_7497_1_1530530069226_3463170_141-10835-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 13:17:24,199 WARNING [train.py:1204] (2/4) Exclude cut with ID _22083_210_8254_1_1531702862439_3678970_201-85738-0_sp1.1 from training. Duration: 0.8181875 2023-10-02 13:17:24,219 WARNING [train.py:1204] (2/4) Exclude cut with ID _4697_210_2069_1_1526815801953_3823098_51-1049-0 from training. Duration: 0.75 2023-10-02 13:17:25,567 WARNING [train.py:1204] (2/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_206-10097-0_sp0.9 from training. Duration: 0.5444375 2023-10-02 13:17:26,990 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_4528_210_3273_1_1527076654_3276777_360-63661-0_sp1.1 from training. Duration: 0.92725 2023-10-02 13:17:26,998 WARNING [train.py:1204] (2/4) Exclude cut with ID _64086_210_7994_1_1535270417019_7435949_449-31567-0_sp1.1 from training. Duration: 0.8818125 2023-10-02 13:17:29,880 WARNING [train.py:1204] (2/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_245-174553-0_sp0.9 from training. Duration: 0.97775 2023-10-02 13:17:31,192 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6545_210_2040_1_1527774670_4245258_379-14182-0_sp1.1 from training. Duration: 0.72725 2023-10-02 13:17:35,901 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_3133_210_2087_1_1526106446_2324690_256-206070-0_sp1.1 from training. Duration: 0.963625 2023-10-02 13:17:39,802 WARNING [train.py:1204] (2/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_278-104676-0 from training. Duration: 0.98 2023-10-02 13:17:39,865 WARNING [train.py:1204] (2/4) Exclude cut with ID IC0671W0065-331908 from training. Duration: 0.896 2023-10-02 13:17:40,058 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.conv_module2.balancer1.prob, batch_count=892820.0, ans=0.125 2023-10-02 13:17:41,134 WARNING [train.py:1204] (2/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_244-129917-0_sp1.1 from training. Duration: 0.97275 2023-10-02 13:17:42,565 WARNING [train.py:1204] (2/4) Exclude cut with ID _7852_210_5219_1_1528286471316_8139400_1129-170857-0_sp1.1 from training. Duration: 0.97275 2023-10-02 13:17:43,979 WARNING [train.py:1204] (2/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_443-30034-0_sp0.9 from training. Duration: 0.711125 2023-10-02 13:17:44,015 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_31583_210_3852_1_1532595851_4024413_250-332999-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 13:17:45,516 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.feed_forward3.hidden_balancer.prob, batch_count=892886.6666666666, ans=0.125 2023-10-02 13:17:46,774 WARNING [train.py:1204] (2/4) Exclude cut with ID _8772_210_1850_1_1528635595972_3664011_247-336448-0 from training. Duration: 0.74 2023-10-02 13:17:48,055 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.505e+02 1.826e+02 2.020e+02 2.449e+02 3.878e+02, threshold=4.041e+02, percent-clipped=0.0 2023-10-02 13:17:48,147 WARNING [train.py:1204] (2/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_567-135114-0_sp0.9 from training. Duration: 0.9666875 2023-10-02 13:17:48,151 WARNING [train.py:1204] (2/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_557-349534-0_sp1.1 from training. Duration: 0.82725 2023-10-02 13:17:48,170 WARNING [train.py:1204] (2/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_1341-26728-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 13:17:50,044 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_40222_210_6228_1_1533385298_4481939_375-328051-0_sp1.1 from training. Duration: 0.97275 2023-10-02 13:17:50,081 WARNING [train.py:1204] (2/4) Exclude cut with ID _4697_210_2069_1_1526815801953_3823098_276-96593-0 from training. Duration: 0.77 2023-10-02 13:17:53,693 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.self_attn_weights.pos_emb_skip_rate, batch_count=892886.6666666666, ans=0.0 2023-10-02 13:17:54,878 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_53071_210_16554_1_1534490405_2954066_265-170472-0_sp0.9 from training. Duration: 0.92225 2023-10-02 13:17:54,894 WARNING [train.py:1204] (2/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_365-1492-0 from training. Duration: 0.91 2023-10-02 13:17:57,521 WARNING [train.py:1204] (2/4) Exclude cut with ID _66873_210_5517_1_1535454136506_4293370_654-52713-0_sp0.9 from training. Duration: 0.8555625 2023-10-02 13:17:59,178 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.conv_skip_rate, batch_count=892886.6666666666, ans=0.0 2023-10-02 13:18:01,671 WARNING [train.py:1204] (2/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_113-92608-0_sp1.1 from training. Duration: 0.4909375 2023-10-02 13:18:01,952 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.feed_forward3.hidden_balancer.prob, batch_count=892953.3333333334, ans=0.125 2023-10-02 13:18:03,832 INFO [scaling.py:1118] (2/4) WithLoss: name=encoder.encoders.3.encoder.layers.1.self_attn_weights, loss-sum=0.000e+00 2023-10-02 13:18:05,021 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_46013_210_7234_1_1533776362_3744032_50-141535-0 from training. Duration: 0.99 2023-10-02 13:18:05,251 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.nonlin_attention.balancer.max_positive, batch_count=892953.3333333334, ans=0.95 2023-10-02 13:18:06,296 WARNING [train.py:1204] (2/4) Exclude cut with ID _13942_210_9988_1_1530604906036_3733360_499-55676-0_sp1.1 from training. Duration: 0.563625 2023-10-02 13:18:07,708 WARNING [train.py:1204] (2/4) Exclude cut with ID _30800_210_22382_1_1532499347904_4515959_375-65143-0_sp1.1 from training. Duration: 0.97275 2023-10-02 13:18:09,121 WARNING [train.py:1204] (2/4) Exclude cut with ID _16756_210_4915_1_1531047803763_7104259_199-325608-0_sp1.1 from training. Duration: 0.92725 2023-10-02 13:18:10,469 WARNING [train.py:1204] (2/4) Exclude cut with ID _31649_210_6228_1_1532584608063_4091078_443-124391-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 13:18:10,570 WARNING [train.py:1204] (2/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_146-218763-0 from training. Duration: 0.86 2023-10-02 13:18:11,835 WARNING [train.py:1204] (2/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_567-135114-0_sp1.1 from training. Duration: 0.7909375 2023-10-02 13:18:11,875 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_26326_210_7925_1_1533002493_3592406_462-288299-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 13:18:13,297 WARNING [train.py:1204] (2/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_322-217220-0 from training. Duration: 0.86 2023-10-02 13:18:13,339 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_238-256668-0_sp1.1 from training. Duration: 0.62725 2023-10-02 13:18:13,377 WARNING [train.py:1204] (2/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_371-18169-0 from training. Duration: 0.86 2023-10-02 13:18:13,648 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.self_attn_weights.pos_emb_skip_rate, batch_count=892953.3333333334, ans=0.0 2023-10-02 13:18:14,782 WARNING [train.py:1204] (2/4) Exclude cut with ID _58480_210_12392_1_1534811121111_3646160_23-74512-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 13:18:14,793 WARNING [train.py:1204] (2/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_784-213099-0_sp1.1 from training. Duration: 0.8454375 2023-10-02 13:18:16,033 INFO [train.py:1046] (2/4) Epoch 26, batch 1150, loss[loss=0.184, simple_loss=0.2539, pruned_loss=0.05702, over 23719.00 frames. ], tot_loss[loss=0.1679, simple_loss=0.2445, pruned_loss=0.04566, over 4698484.30 frames. ], batch size: 232, lr: 3.96e-03, grad_scale: 8.0 2023-10-02 13:18:16,116 WARNING [train.py:1204] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_625-102955-0_sp1.1 from training. Duration: 0.7 2023-10-02 13:18:16,581 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.5.encoder.layers.1.conv_module2.whiten, num_groups=1, num_channels=256, metric=5.68 vs. limit=15.0 2023-10-02 13:18:20,909 WARNING [train.py:1204] (2/4) Exclude cut with ID _49748_210_24116_1_1533986971737_3158910_176-174327-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 13:18:24,215 WARNING [train.py:1204] (2/4) Exclude cut with ID _50310_210_15488_1_1534068043345_3641080_432-96140-0_sp1.1 from training. Duration: 0.936375 2023-10-02 13:18:25,546 WARNING [train.py:1204] (2/4) Exclude cut with ID _40402_210_18219_1_1533522324115_2953150_76-14910-0_sp1.1 from training. Duration: 0.92725 2023-10-02 13:18:26,814 WARNING [train.py:1204] (2/4) Exclude cut with ID _21783_210_11357_1_1531742458730_3762679_40-19458-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 13:18:26,855 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_14056_210_4163_1_1530604483_7572827_46-93547-0 from training. Duration: 0.95 2023-10-02 13:18:26,883 WARNING [train.py:1204] (2/4) Exclude cut with ID _21631_210_2253_1_1531907880778_3160339_321-35325-0_sp1.1 from training. Duration: 0.963625 2023-10-02 13:18:29,751 WARNING [train.py:1204] (2/4) Exclude cut with ID _4779_210_2084_1_1527318022944_3914893_500-285013-0 from training. Duration: 0.69 2023-10-02 13:18:31,134 WARNING [train.py:1204] (2/4) Exclude cut with ID _43552_210_8913_1_1533726031059_3523429_301-93324-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 13:18:31,139 WARNING [train.py:1204] (2/4) Exclude cut with ID _6473_210_1741_1_1527901307396_3815380_108-17335-0_sp1.1 from training. Duration: 0.5909375 2023-10-02 13:18:37,161 WARNING [train.py:1204] (2/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_206-10097-0 from training. Duration: 0.49 2023-10-02 13:18:39,915 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_36756_210_7234_1_1533026991_966276_103-77513-0_sp1.1 from training. Duration: 0.97275 2023-10-02 13:18:43,919 WARNING [train.py:1204] (2/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_662-18193-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 13:18:45,391 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_26433_210_12108_1_1532157158_4056480_439-52438-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 13:18:45,437 WARNING [train.py:1204] (2/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_464-33931-0 from training. Duration: 0.54 2023-10-02 13:18:45,443 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_3474_210_2028_1_1526188513_4825246_145-309131-0_sp0.9 from training. Duration: 0.82225 2023-10-02 13:18:45,460 WARNING [train.py:1204] (2/4) Exclude cut with ID _9679_210_7497_1_1529129212906_4363929_446-308321-0_sp1.1 from training. Duration: 0.963625 2023-10-02 13:18:48,361 WARNING [train.py:1204] (2/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_662-275621-0 from training. Duration: 0.42 2023-10-02 13:18:49,746 WARNING [train.py:1204] (2/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_344-337148-0_sp1.1 from training. Duration: 0.97275 2023-10-02 13:18:51,681 WARNING [train.py:1204] (2/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_759-179071-0_sp1.1 from training. Duration: 0.92725 2023-10-02 13:19:03,164 WARNING [train.py:1204] (2/4) Exclude cut with ID _28783_210_7943_1_1532671164808_7155479_293-12188-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 13:19:03,445 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.feed_forward1.out_proj.dropout_p, batch_count=893220.0, ans=0.1 2023-10-02 13:19:06,745 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.bypass.skip_rate, batch_count=893220.0, ans=0.07 2023-10-02 13:19:10,550 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_17613_210_2151_1_1531137569_2366160_235-320304-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 13:19:10,565 WARNING [train.py:1204] (2/4) Exclude cut with ID _14349_210_8341_1_1530615710818_3442470_170-336541-0 from training. Duration: 0.86 2023-10-02 13:19:10,628 WARNING [train.py:1204] (2/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_375-218677-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 13:19:12,084 WARNING [train.py:1204] (2/4) Exclude cut with ID _28317_210_12742_1_1532480302381_8510325_829-202956-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 13:19:16,318 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9029W0436-509823 from training. Duration: 0.768 2023-10-02 13:19:17,700 WARNING [train.py:1204] (2/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_231-112214-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 13:19:24,271 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9059W0470-524883 from training. Duration: 0.3415625 2023-10-02 13:19:24,470 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.feed_forward3.hidden_balancer.prob, batch_count=893286.6666666666, ans=0.125 2023-10-02 13:19:28,338 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_43266_210_1794_1_1533521789_4497218_206-79512-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 13:19:28,412 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_294-163793-0_sp0.9 from training. Duration: 0.911125 2023-10-02 13:19:29,670 INFO [train.py:1046] (2/4) Epoch 26, batch 1200, loss[loss=0.1484, simple_loss=0.2255, pruned_loss=0.03565, over 23505.00 frames. ], tot_loss[loss=0.1678, simple_loss=0.2447, pruned_loss=0.04546, over 4708865.12 frames. ], batch size: 134, lr: 3.96e-03, grad_scale: 16.0 2023-10-02 13:19:29,723 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6290_210_3273_1_1527595224_1282787_115-87095-0_sp1.1 from training. Duration: 0.5 2023-10-02 13:19:29,781 WARNING [train.py:1204] (2/4) Exclude cut with ID _14100_210_8341_1_1530529320613_3678069_279-55760-0_sp1.1 from training. Duration: 0.8454375 2023-10-02 13:19:34,544 WARNING [train.py:1204] (2/4) Exclude cut with ID _14621_210_1925_1_1530686901185_3675420_419-234200-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 13:19:39,235 WARNING [train.py:1204] (2/4) Exclude cut with ID _60229_210_27767_1_1534838406158_3655350_353-213656-0_sp1.1 from training. Duration: 0.863625 2023-10-02 13:19:39,246 WARNING [train.py:1204] (2/4) Exclude cut with ID _48678_210_5663_1_1533988830947_3769199_306-255886-0_sp1.1 from training. Duration: 0.7 2023-10-02 13:19:41,910 WARNING [train.py:1204] (2/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_466-3492-0_sp1.1 from training. Duration: 0.97275 2023-10-02 13:19:41,911 WARNING [train.py:1204] (2/4) Exclude cut with ID _8755_210_1876_1_1528628559357_3258209_199-242167-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 13:19:41,975 WARNING [train.py:1204] (2/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_237-89684-0_sp1.1 from training. Duration: 0.87275 2023-10-02 13:19:44,661 WARNING [train.py:1204] (2/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_70-84121-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 13:19:46,122 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7544_210_6753_1_1528587722_3576686_172-171750-0_sp1.1 from training. Duration: 0.5909375 2023-10-02 13:19:46,216 WARNING [train.py:1204] (2/4) Exclude cut with ID _24271_210_12658_1_1531987270022_7190519_40-321822-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 13:19:46,233 WARNING [train.py:1204] (2/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_227-83612-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 13:19:48,924 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9156W0015-572508 from training. Duration: 0.896 2023-10-02 13:19:50,407 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_292-265928-0 from training. Duration: 0.51 2023-10-02 13:19:53,689 WARNING [train.py:1204] (2/4) Exclude cut with ID _14747_210_8341_1_1530685797463_3654089_147-131229-0_sp1.1 from training. Duration: 0.6909375 2023-10-02 13:19:56,931 WARNING [train.py:1204] (2/4) Exclude cut with ID _45297_210_2721_1_1533715351381_3264949_351-147136-0_sp0.9 from training. Duration: 0.9666875 2023-10-02 13:19:58,388 WARNING [train.py:1204] (2/4) Exclude cut with ID _5250_210_5219_1_1527249561806_8692110_1102-324568-0_sp1.1 from training. Duration: 0.97275 2023-10-02 13:20:01,065 WARNING [train.py:1204] (2/4) Exclude cut with ID _18516_210_9206_1_1531704653206_3692406_465-75446-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 13:20:01,067 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9203W0126-595623 from training. Duration: 0.896 2023-10-02 13:20:01,159 WARNING [train.py:1204] (2/4) Exclude cut with ID _55383_210_10635_1_1534468426172_2231669_139-328410-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 13:20:01,946 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.2.encoder.layers.1.conv_module1.whiten, num_groups=1, num_channels=384, metric=11.82 vs. limit=15.0 2023-10-02 13:20:08,924 WARNING [train.py:1204] (2/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_166-246557-0_sp0.9 from training. Duration: 0.711125 2023-10-02 13:20:08,928 WARNING [train.py:1204] (2/4) Exclude cut with ID _30039_210_13270_1_1532484612900_2725150_239-304872-0_sp1.1 from training. Duration: 0.963625 2023-10-02 13:20:10,307 WARNING [train.py:1204] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_6-226750-0 from training. Duration: 0.94 2023-10-02 13:20:10,382 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_135-5501-0_sp1.1 from training. Duration: 0.8909375 2023-10-02 13:20:10,567 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.self_attn_weights.pos_emb_skip_rate, batch_count=893486.6666666666, ans=0.0 2023-10-02 13:20:14,422 WARNING [train.py:1204] (2/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_359-118926-0 from training. Duration: 0.72 2023-10-02 13:20:15,603 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.558e+02 1.901e+02 2.098e+02 2.367e+02 3.990e+02, threshold=4.196e+02, percent-clipped=0.0 2023-10-02 13:20:18,484 WARNING [train.py:1204] (2/4) Exclude cut with ID _8064_210_5399_1_1528372299291_4282973_4-91037-0 from training. Duration: 0.88 2023-10-02 13:20:18,508 WARNING [train.py:1204] (2/4) Exclude cut with ID _6653_210_2629_1_1527919172441_6732679_415-288492-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 13:20:19,852 WARNING [train.py:1204] (2/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_544-287179-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 13:20:21,625 WARNING [train.py:1204] (2/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_122-32606-0_sp1.1 from training. Duration: 0.92725 2023-10-02 13:20:21,675 WARNING [train.py:1204] (2/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_121-284152-0_sp1.1 from training. Duration: 0.536375 2023-10-02 13:20:21,760 WARNING [train.py:1204] (2/4) Exclude cut with ID _44718_210_5816_1_1534050051869_6469030_350-299477-0_sp1.1 from training. Duration: 0.97275 2023-10-02 13:20:21,778 WARNING [train.py:1204] (2/4) Exclude cut with ID _6653_210_2629_1_1527919172441_6732679_1103-56445-0_sp0.9 from training. Duration: 0.811125 2023-10-02 13:20:23,146 WARNING [train.py:1204] (2/4) Exclude cut with ID _12369_210_4381_1_1530356078581_4388520_64-298917-0_sp0.9 from training. Duration: 0.97775 2023-10-02 13:20:23,198 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_2245_210_2715_1_1525511548_3540254_310-52483-0 from training. Duration: 0.88 2023-10-02 13:20:23,241 WARNING [train.py:1204] (2/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_401-158239-0_sp1.1 from training. Duration: 0.6454375 2023-10-02 13:20:23,288 WARNING [train.py:1204] (2/4) Exclude cut with ID _59502_210_12210_1_1535107024698_1835139_146-162495-0_sp1.1 from training. Duration: 0.836375 2023-10-02 13:20:23,297 WARNING [train.py:1204] (2/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_41-31157-0_sp0.9 from training. Duration: 0.6333125 2023-10-02 13:20:26,589 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_68717_210_3528_1_1535628683_1556759_95-36722-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 13:20:26,596 WARNING [train.py:1204] (2/4) Exclude cut with ID _30196_210_1664_1_1533708496522_7074140_654-162611-0_sp1.1 from training. Duration: 0.92725 2023-10-02 13:20:30,486 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_9302_210_2550_1_1528889962_4655416_638-301620-0_sp0.9 from training. Duration: 0.52225 2023-10-02 13:20:31,911 WARNING [train.py:1204] (2/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_478-162247-0_sp1.1 from training. Duration: 0.7454375 2023-10-02 13:20:33,934 WARNING [train.py:1204] (2/4) Exclude cut with ID _45297_210_2721_1_1533715351381_3264949_353-9541-0 from training. Duration: 0.97 2023-10-02 13:20:34,854 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.0.layers.1.conv_module2.whiten, num_groups=1, num_channels=192, metric=9.20 vs. limit=15.0 2023-10-02 13:20:38,042 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9653W0190-675468 from training. Duration: 0.9935 2023-10-02 13:20:39,538 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_49730_210_13049_1_1534075294_1830676_214-84436-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 13:20:43,355 INFO [train.py:1046] (2/4) Epoch 26, batch 1250, loss[loss=0.1772, simple_loss=0.251, pruned_loss=0.05166, over 23710.00 frames. ], tot_loss[loss=0.1688, simple_loss=0.2461, pruned_loss=0.0458, over 4710581.47 frames. ], batch size: 232, lr: 3.96e-03, grad_scale: 16.0 2023-10-02 13:20:43,396 WARNING [train.py:1204] (2/4) Exclude cut with ID _50093_210_4220_1_1534417141207_3432299_259-322252-0_sp1.1 from training. Duration: 0.836375 2023-10-02 13:20:43,484 WARNING [train.py:1204] (2/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_599-50220-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 13:20:44,900 WARNING [train.py:1204] (2/4) Exclude cut with ID _21878_210_3525_1_1531738772002_7264330_174-109016-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 13:20:47,580 WARNING [train.py:1204] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_665-196155-0 from training. Duration: 0.67 2023-10-02 13:20:52,209 WARNING [train.py:1204] (2/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_656-188361-0_sp1.1 from training. Duration: 0.7909375 2023-10-02 13:20:53,525 WARNING [train.py:1204] (2/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_60-18123-0_sp1.1 from training. Duration: 0.8 2023-10-02 13:20:53,581 WARNING [train.py:1204] (2/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_535-14410-0 from training. Duration: 0.47 2023-10-02 13:20:55,585 WARNING [train.py:1204] (2/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_94-126128-0_sp1.1 from training. Duration: 0.7818125 2023-10-02 13:20:56,996 WARNING [train.py:1204] (2/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_356-31216-0_sp1.1 from training. Duration: 0.6818125 2023-10-02 13:20:59,424 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.0.layers.1.feed_forward2.out_whiten, num_groups=1, num_channels=192, metric=6.70 vs. limit=15.0 2023-10-02 13:20:59,946 WARNING [train.py:1204] (2/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_443-30034-0_sp1.1 from training. Duration: 0.5818125 2023-10-02 13:21:01,314 WARNING [train.py:1204] (2/4) Exclude cut with ID _26907_210_7234_1_1532221740694_3205263_337-2494-0_sp1.1 from training. Duration: 0.8 2023-10-02 13:21:02,632 WARNING [train.py:1204] (2/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_49-97158-0_sp0.9 from training. Duration: 0.8444375 2023-10-02 13:21:02,649 WARNING [train.py:1204] (2/4) Exclude cut with ID _7877_210_6014_1_1528286358099_3750136_553-170128-0_sp1.1 from training. Duration: 0.963625 2023-10-02 13:21:05,824 WARNING [train.py:1204] (2/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_202-293523-0_sp0.9 from training. Duration: 0.8 2023-10-02 13:21:06,036 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.conv_module1.balancer1.min_positive, batch_count=893753.3333333334, ans=0.025 2023-10-02 13:21:11,881 WARNING [train.py:1204] (2/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_188-171673-0_sp0.9 from training. Duration: 0.6444375 2023-10-02 13:21:11,897 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_120-41668-0_sp1.1 from training. Duration: 0.636375 2023-10-02 13:21:11,901 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_40223_210_6228_1_1533298404_4812267_613-78769-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 13:21:13,311 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_53861_210_5399_1_1534422156_4272077_150-336899-0_sp1.1 from training. Duration: 0.963625 2023-10-02 13:21:14,633 WARNING [train.py:1204] (2/4) Exclude cut with ID _14459_210_2228_1_1531443641946_7516700_1146-56843-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 13:21:16,169 WARNING [train.py:1204] (2/4) Exclude cut with ID _39867_210_9826_1_1533553222030_9344259_1194-299928-0_sp1.1 from training. Duration: 0.92725 2023-10-02 13:21:17,562 WARNING [train.py:1204] (2/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_214-291369-0_sp0.9 from training. Duration: 0.7 2023-10-02 13:21:21,594 WARNING [train.py:1204] (2/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_358-215558-0 from training. Duration: 0.93 2023-10-02 13:21:21,651 WARNING [train.py:1204] (2/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_577-287150-0_sp0.9 from training. Duration: 0.911125 2023-10-02 13:21:24,858 WARNING [train.py:1204] (2/4) Exclude cut with ID _60860_210_8254_1_1534896015875_3557169_229-187367-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 13:21:26,328 WARNING [train.py:1204] (2/4) Exclude cut with ID _6275_210_2084_1_1528110062213_7669049_857-298942-0 from training. Duration: 0.85 2023-10-02 13:21:26,379 WARNING [train.py:1204] (2/4) Exclude cut with ID _40137_210_12210_1_1533290484046_3795180_249-187059-0_sp1.1 from training. Duration: 0.8 2023-10-02 13:21:26,386 WARNING [train.py:1204] (2/4) Exclude cut with ID ID0171W0508-765603 from training. Duration: 0.541 2023-10-02 13:21:28,028 WARNING [train.py:1204] (2/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_521-219439-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 13:21:28,034 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_40223_210_6228_1_1533298404_4812267_741-302616-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 13:21:30,945 WARNING [train.py:1204] (2/4) Exclude cut with ID _65293_210_9070_1_1535545549765_4004210_438-154735-0_sp1.1 from training. Duration: 0.92725 2023-10-02 13:21:33,861 WARNING [train.py:1204] (2/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_564-132950-0_sp1.1 from training. Duration: 0.92725 2023-10-02 13:21:33,899 WARNING [train.py:1204] (2/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_291-270881-0_sp1.1 from training. Duration: 0.8818125 2023-10-02 13:21:35,624 WARNING [train.py:1204] (2/4) Exclude cut with ID _60229_210_27767_1_1534838406158_3655350_353-213656-0 from training. Duration: 0.95 2023-10-02 13:21:35,636 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_3133_210_2087_1_1526106446_2324690_243-143119-0 from training. Duration: 0.77 2023-10-02 13:21:36,983 WARNING [train.py:1204] (2/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_690-126306-0 from training. Duration: 0.78 2023-10-02 13:21:40,339 WARNING [train.py:1204] (2/4) Exclude cut with ID _30304_210_6319_1_1532476827261_7582060_347-95003-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 13:21:41,784 WARNING [train.py:1204] (2/4) Exclude cut with ID _2322_210_2667_1_1525604548971_3656299_440-80494-0 from training. Duration: 0.88 2023-10-02 13:21:41,805 WARNING [train.py:1204] (2/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_370-229434-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 13:21:43,173 WARNING [train.py:1204] (2/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_124-345840-0_sp1.1 from training. Duration: 0.47275 2023-10-02 13:21:43,683 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.5.encoder.layers.0.nonlin_attention.whiten2, num_groups=1, num_channels=256, metric=3.52 vs. limit=15.0 2023-10-02 13:21:44,479 WARNING [train.py:1204] (2/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_165-123581-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 13:21:45,928 WARNING [train.py:1204] (2/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_453-282588-0 from training. Duration: 0.98 2023-10-02 13:21:45,950 WARNING [train.py:1204] (2/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_299-34413-0_sp0.9 from training. Duration: 0.72225 2023-10-02 13:21:46,197 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.feed_forward1.hidden_balancer.prob, batch_count=893953.3333333334, ans=0.125 2023-10-02 13:21:47,233 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_40036_210_13405_1_1533648647_1315388_48-146016-0_sp1.1 from training. Duration: 0.8454375 2023-10-02 13:21:47,261 WARNING [train.py:1204] (2/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_99-167392-0_sp0.9 from training. Duration: 0.67775 2023-10-02 13:21:48,616 WARNING [train.py:1204] (2/4) Exclude cut with ID _48677_210_5663_1_1533902405919_3892989_37-207114-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 13:21:51,843 WARNING [train.py:1204] (2/4) Exclude cut with ID _29857_210_20034_1_1532395886753_3798160_335-163562-0 from training. Duration: 0.85 2023-10-02 13:21:53,282 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_36492_210_7927_1_1533261987_4790834_703-343351-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 13:21:55,838 WARNING [train.py:1204] (2/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_80-147301-0_sp0.9 from training. Duration: 0.9333125 2023-10-02 13:21:55,925 WARNING [train.py:1204] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_218-295097-0_sp0.9 from training. Duration: 0.8555625 2023-10-02 13:21:57,240 INFO [train.py:1046] (2/4) Epoch 26, batch 1300, loss[loss=0.1545, simple_loss=0.2259, pruned_loss=0.04158, over 21216.00 frames. ], tot_loss[loss=0.1681, simple_loss=0.2458, pruned_loss=0.04514, over 4729856.54 frames. ], batch size: 46, lr: 3.96e-03, grad_scale: 16.0 2023-10-02 13:21:58,694 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_2295_210_2087_1_1525500993_2407863_116-238894-0_sp1.1 from training. Duration: 0.636375 2023-10-02 13:22:02,030 WARNING [train.py:1204] (2/4) Exclude cut with ID _3231_210_2106_1_1526205864934_6784220_697-171068-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 13:22:02,057 WARNING [train.py:1204] (2/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_102-77064-0 from training. Duration: 0.93 2023-10-02 13:22:06,662 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_13091_210_7925_1_1530609447_8987354_1186-261957-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 13:22:08,005 WARNING [train.py:1204] (2/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_91-277684-0_sp1.1 from training. Duration: 0.67275 2023-10-02 13:22:09,339 WARNING [train.py:1204] (2/4) Exclude cut with ID _31088_210_16446_1_1532520012284_3440919_159-313751-0_sp1.1 from training. Duration: 0.936375 2023-10-02 13:22:09,630 INFO [scaling.py:1118] (2/4) WithLoss: name=encoder.encoders.1.encoder.layers.0.self_attn_weights, loss-sum=0.000e+00 2023-10-02 13:22:11,315 WARNING [train.py:1204] (2/4) Exclude cut with ID _42326_210_7770_1_1533814282531_3707269_227-203721-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 13:22:12,806 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_3531_210_3528_1_1526294203_5216456_309-227302-0_sp1.1 from training. Duration: 0.62725 2023-10-02 13:22:14,154 WARNING [train.py:1204] (2/4) Exclude cut with ID _14087_210_2253_1_1530529224512_3014029_226-242140-0 from training. Duration: 0.81 2023-10-02 13:22:18,265 WARNING [train.py:1204] (2/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_122-109023-0_sp1.1 from training. Duration: 0.6909375 2023-10-02 13:22:19,682 WARNING [train.py:1204] (2/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_660-211088-0_sp0.9 from training. Duration: 0.688875 2023-10-02 13:22:21,601 WARNING [train.py:1204] (2/4) Exclude cut with ID _39290_210_4220_1_1533862778760_3075118_92-311600-0 from training. Duration: 0.88 2023-10-02 13:22:23,068 WARNING [train.py:1204] (2/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_390-339137-0_sp1.1 from training. Duration: 0.5090625 2023-10-02 13:22:25,901 WARNING [train.py:1204] (2/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_449-341946-0_sp1.1 from training. Duration: 0.963625 2023-10-02 13:22:27,271 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_68095_210_20185_1_1535718226_8888855_884-327007-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 13:22:28,629 WARNING [train.py:1204] (2/4) Exclude cut with ID _42326_210_7770_1_1533814282531_3707269_308-222998-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 13:22:30,041 WARNING [train.py:1204] (2/4) Exclude cut with ID _40399_210_4353_1_1533305658194_3279406_344-238708-0_sp1.1 from training. Duration: 0.963625 2023-10-02 13:22:30,107 WARNING [train.py:1204] (2/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_87-131427-0_sp1.1 from training. Duration: 0.7090625 2023-10-02 13:22:31,435 WARNING [train.py:1204] (2/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_110-130961-0_sp0.9 from training. Duration: 0.52225 2023-10-02 13:22:31,475 WARNING [train.py:1204] (2/4) Exclude cut with ID _55153_210_2253_1_1534384786677_3162096_148-79022-0 from training. Duration: 0.96 2023-10-02 13:22:37,976 WARNING [train.py:1204] (2/4) Exclude cut with ID _9679_210_7497_1_1529129212906_4363929_512-313889-0_sp1.1 from training. Duration: 0.736375 2023-10-02 13:22:37,987 WARNING [train.py:1204] (2/4) Exclude cut with ID _7970_210_1883_1_1528455616296_3472230_25-178407-0_sp1.1 from training. Duration: 0.6454375 2023-10-02 13:22:40,675 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_3133_210_2087_1_1526106446_2324690_246-125000-0 from training. Duration: 0.62 2023-10-02 13:22:42,556 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_15126_210_6753_1_1530778677_2591185_113-157651-0_sp0.9 from training. Duration: 0.7555625 2023-10-02 13:22:43,835 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.564e+02 1.906e+02 2.137e+02 2.529e+02 3.286e+02, threshold=4.274e+02, percent-clipped=0.0 2023-10-02 13:22:43,927 WARNING [train.py:1204] (2/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_105-63309-0_sp1.1 from training. Duration: 0.8909375 2023-10-02 13:22:44,590 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.3.encoder.layers.0.feed_forward3.out_whiten, num_groups=1, num_channels=512, metric=13.47 vs. limit=15.0 2023-10-02 13:22:45,332 WARNING [train.py:1204] (2/4) Exclude cut with ID _29877_210_6228_1_1532397791106_5276149_578-177271-0_sp1.1 from training. Duration: 0.936375 2023-10-02 13:22:45,385 WARNING [train.py:1204] (2/4) Exclude cut with ID _44831_210_13427_1_1533816011549_3689035_32-188147-0 from training. Duration: 0.96 2023-10-02 13:22:46,734 WARNING [train.py:1204] (2/4) Exclude cut with ID _12369_210_4381_1_1530356078581_4388520_300-254337-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 13:22:46,747 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_128-241008-0 from training. Duration: 0.94 2023-10-02 13:22:48,227 WARNING [train.py:1204] (2/4) Exclude cut with ID _21878_210_3525_1_1531738772002_7264330_53-279178-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 13:22:51,460 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_41452_210_7291_1_1533437687_3941281_273-334088-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 13:22:51,463 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_128-241008-0_sp1.1 from training. Duration: 0.8545625 2023-10-02 13:22:55,636 WARNING [train.py:1204] (2/4) Exclude cut with ID _8394_210_6702_1_1528590504316_3540189_379-49457-0 from training. Duration: 0.87 2023-10-02 13:22:55,700 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_105-114797-0 from training. Duration: 0.84 2023-10-02 13:22:57,036 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_14928_210_5632_1_1530773461_4714000_454-112198-0 from training. Duration: 0.66 2023-10-02 13:23:02,770 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_456-22141-0_sp1.1 from training. Duration: 0.8 2023-10-02 13:23:04,890 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_115-233929-0 from training. Duration: 0.72 2023-10-02 13:23:06,279 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_616-16795-0_sp1.1 from training. Duration: 0.963625 2023-10-02 13:23:11,703 INFO [train.py:1046] (2/4) Epoch 26, batch 1350, loss[loss=0.1639, simple_loss=0.2201, pruned_loss=0.05385, over 19489.00 frames. ], tot_loss[loss=0.1673, simple_loss=0.2454, pruned_loss=0.04457, over 4742193.14 frames. ], batch size: 388, lr: 3.96e-03, grad_scale: 16.0 2023-10-02 13:23:11,866 WARNING [train.py:1204] (2/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_448-212135-0 from training. Duration: 0.91 2023-10-02 13:23:15,198 WARNING [train.py:1204] (2/4) Exclude cut with ID _46683_210_9642_1_1533730913492_4591116_201-205677-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 13:23:19,082 WARNING [train.py:1204] (2/4) Exclude cut with ID _64086_210_7994_1_1535270417019_7435949_43-80483-0_sp1.1 from training. Duration: 0.97275 2023-10-02 13:23:22,280 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_240-134124-0_sp1.1 from training. Duration: 0.963625 2023-10-02 13:23:23,558 WARNING [train.py:1204] (2/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_907-120940-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 13:23:24,935 WARNING [train.py:1204] (2/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_848-280957-0_sp1.1 from training. Duration: 0.7909375 2023-10-02 13:23:24,976 WARNING [train.py:1204] (2/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_243-42116-0_sp0.9 from training. Duration: 0.811125 2023-10-02 13:23:25,217 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.bypass.scale_min, batch_count=894420.0, ans=0.2 2023-10-02 13:23:27,837 WARNING [train.py:1204] (2/4) Exclude cut with ID _6694_210_4599_1_1527852637490_3965529_116-109398-0_sp0.9 from training. Duration: 0.811125 2023-10-02 13:23:30,521 WARNING [train.py:1204] (2/4) Exclude cut with ID _24314_210_4915_1_1532135078116_7170179_521-258868-0 from training. Duration: 0.79 2023-10-02 13:23:30,617 WARNING [train.py:1204] (2/4) Exclude cut with ID _2220_210_1819_1_1525509109898_3658680_50-99837-0_sp0.9 from training. Duration: 0.6 2023-10-02 13:23:31,954 WARNING [train.py:1204] (2/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_313-134930-0_sp1.1 from training. Duration: 0.8818125 2023-10-02 13:23:33,833 WARNING [train.py:1204] (2/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_125-144174-0 from training. Duration: 0.39 2023-10-02 13:23:35,590 WARNING [train.py:1204] (2/4) Exclude cut with ID _59288_210_5854_1_1535077757774_3412246_275-293897-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 13:23:35,810 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.balancer1.prob, batch_count=894420.0, ans=0.125 2023-10-02 13:23:36,974 WARNING [train.py:1204] (2/4) Exclude cut with ID _40519_210_13794_1_1533715228655_7337390_722-52880-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 13:23:36,994 WARNING [train.py:1204] (2/4) Exclude cut with ID _7537_210_2084_1_1528502395431_3924359_128-268029-0 from training. Duration: 0.98 2023-10-02 13:23:38,504 WARNING [train.py:1204] (2/4) Exclude cut with ID _8021_210_7994_1_1528376451358_3653069_179-45206-0 from training. Duration: 0.86 2023-10-02 13:23:38,721 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.feed_forward2.hidden_balancer.prob, batch_count=894420.0, ans=0.125 2023-10-02 13:23:39,857 WARNING [train.py:1204] (2/4) Exclude cut with ID _13252_210_1925_1_1530671372047_3336909_88-276668-0 from training. Duration: 0.99 2023-10-02 13:23:43,039 WARNING [train.py:1204] (2/4) Exclude cut with ID _22613_210_5219_1_1532253599686_4090170_290-212862-0_sp1.1 from training. Duration: 0.97275 2023-10-02 13:23:43,044 WARNING [train.py:1204] (2/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_465-214726-0 from training. Duration: 0.86 2023-10-02 13:23:52,179 WARNING [train.py:1204] (2/4) Exclude cut with ID _56480_210_4220_1_1534679942079_3075409_138-36031-0_sp1.1 from training. Duration: 0.97275 2023-10-02 13:23:59,565 WARNING [train.py:1204] (2/4) Exclude cut with ID _58605_210_12210_1_1534723195939_3904259_300-285878-0_sp1.1 from training. Duration: 0.97275 2023-10-02 13:23:59,605 WARNING [train.py:1204] (2/4) Exclude cut with ID _40901_210_5665_1_1533363789958_3856130_114-340229-0_sp1.1 from training. Duration: 0.936375 2023-10-02 13:24:01,438 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_489-230703-0 from training. Duration: 0.85 2023-10-02 13:24:05,450 WARNING [train.py:1204] (2/4) Exclude cut with ID _66873_210_5517_1_1535454136506_4293370_322-81243-0_sp1.1 from training. Duration: 0.936375 2023-10-02 13:24:05,537 WARNING [train.py:1204] (2/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_271-269084-0 from training. Duration: 0.83 2023-10-02 13:24:05,547 WARNING [train.py:1204] (2/4) Exclude cut with ID _13252_210_1925_1_1530671372047_3336909_166-314607-0_sp0.9 from training. Duration: 0.6 2023-10-02 13:24:06,829 WARNING [train.py:1204] (2/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_340-343302-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 13:24:08,330 WARNING [train.py:1204] (2/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_286-112015-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 13:24:11,322 WARNING [train.py:1204] (2/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_328-264776-0 from training. Duration: 0.67 2023-10-02 13:24:14,274 WARNING [train.py:1204] (2/4) Exclude cut with ID _4866_210_2577_1_1527404412284_4364659_256-306830-0_sp0.9 from training. Duration: 0.9666875 2023-10-02 13:24:15,698 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.0.conv_module2.balancer1.prob, batch_count=894620.0, ans=0.125 2023-10-02 13:24:19,081 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.2.encoder.layers.0.self_attn_weights.whiten_keys, num_groups=4, num_channels=128, metric=4.52 vs. limit=6.0 2023-10-02 13:24:21,206 WARNING [train.py:1204] (2/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_239-316802-0 from training. Duration: 0.69 2023-10-02 13:24:22,658 WARNING [train.py:1204] (2/4) Exclude cut with ID _29857_210_20034_1_1532395886753_3798160_279-185801-0 from training. Duration: 0.91 2023-10-02 13:24:25,931 INFO [train.py:1046] (2/4) Epoch 26, batch 1400, loss[loss=0.139, simple_loss=0.2177, pruned_loss=0.03017, over 21540.00 frames. ], tot_loss[loss=0.1666, simple_loss=0.2436, pruned_loss=0.04477, over 4712121.52 frames. ], batch size: 47, lr: 3.96e-03, grad_scale: 16.0 2023-10-02 13:24:26,112 WARNING [train.py:1204] (2/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_656-188361-0 from training. Duration: 0.87 2023-10-02 13:24:27,481 WARNING [train.py:1204] (2/4) Exclude cut with ID _44718_210_5816_1_1534050051869_6469030_100-230287-0_sp1.1 from training. Duration: 0.936375 2023-10-02 13:24:31,842 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8275_210_4198_1_1528455320_3892571_276-215622-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 13:24:31,883 WARNING [train.py:1204] (2/4) Exclude cut with ID _30304_210_6319_1_1532476827261_7582060_698-309430-0_sp1.1 from training. Duration: 0.963625 2023-10-02 13:24:37,706 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_365-217119-0 from training. Duration: 0.63 2023-10-02 13:24:39,057 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7264_210_4938_1_1527939911_8132095_800-91995-0 from training. Duration: 0.86 2023-10-02 13:24:49,168 WARNING [train.py:1204] (2/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_49-97158-0_sp1.1 from training. Duration: 0.6909375 2023-10-02 13:24:49,454 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.feed_forward1.out_proj.dropout_p, batch_count=894753.3333333334, ans=0.1 2023-10-02 13:24:51,886 WARNING [train.py:1204] (2/4) Exclude cut with ID _61132_210_9969_1_1535021792964_3755419_61-57720-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 13:24:53,303 WARNING [train.py:1204] (2/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_345-306832-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 13:24:53,329 WARNING [train.py:1204] (2/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_164-224052-0_sp0.9 from training. Duration: 0.77775 2023-10-02 13:24:55,968 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.3.encoder.layers.2.feed_forward2.out_whiten, num_groups=1, num_channels=512, metric=9.07 vs. limit=15.0 2023-10-02 13:24:58,061 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_34886_210_10633_1_1533203882_1755661_76-156096-0_sp1.1 from training. Duration: 0.7909375 2023-10-02 13:24:58,150 WARNING [train.py:1204] (2/4) Exclude cut with ID 4133-6541-0027-40495-0_sp1.1 from training. Duration: 0.9681875 2023-10-02 13:24:58,902 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.2.encoder.layers.2.feed_forward1.out_whiten, num_groups=1, num_channels=384, metric=6.68 vs. limit=15.0 2023-10-02 13:25:08,795 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_514-211165-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 13:25:10,237 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_41211_210_2544_1_1533348859_3702910_97-212695-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 13:25:11,512 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.499e+02 1.828e+02 2.043e+02 2.426e+02 3.539e+02, threshold=4.086e+02, percent-clipped=0.0 2023-10-02 13:25:12,960 WARNING [train.py:1204] (2/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_406-45086-0 from training. Duration: 0.8 2023-10-02 13:25:13,024 WARNING [train.py:1204] (2/4) Exclude cut with ID _65263_210_22356_1_1535763598691_3921037_49-222917-0_sp1.1 from training. Duration: 0.836375 2023-10-02 13:25:14,379 WARNING [train.py:1204] (2/4) Exclude cut with ID _4866_210_2577_1_1527404412284_4364659_352-157930-0_sp1.1 from training. Duration: 0.736375 2023-10-02 13:25:14,450 WARNING [train.py:1204] (2/4) Exclude cut with ID _4366_210_1388_1_1526792053633_3613819_231-211402-0_sp1.1 from training. Duration: 0.8545625 2023-10-02 13:25:14,705 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.feed_forward3.hidden_balancer.prob, batch_count=894886.6666666666, ans=0.125 2023-10-02 13:25:16,204 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_98-21576-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 13:25:17,511 WARNING [train.py:1204] (2/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_348-127789-0_sp0.9 from training. Duration: 0.9444375 2023-10-02 13:25:17,541 WARNING [train.py:1204] (2/4) Exclude cut with ID _45871_210_5665_1_1533864614245_3296060_98-329557-0_sp1.1 from training. Duration: 0.97275 2023-10-02 13:25:18,883 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6846_210_6753_1_1527850767_4295158_2-315073-0_sp1.1 from training. Duration: 0.8 2023-10-02 13:25:20,251 WARNING [train.py:1204] (2/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_145-137790-0 from training. Duration: 0.83 2023-10-02 13:25:20,273 WARNING [train.py:1204] (2/4) Exclude cut with ID _14603_210_4220_1_1530615688441_3097319_133-158308-0_sp0.9 from training. Duration: 0.9333125 2023-10-02 13:25:23,095 WARNING [train.py:1204] (2/4) Exclude cut with ID _14898_210_9206_1_1530766812377_4177190_320-11758-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 13:25:23,230 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.0.ff3_skip_rate, batch_count=894953.3333333334, ans=0.0 2023-10-02 13:25:27,602 WARNING [train.py:1204] (2/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_406-45086-0_sp0.9 from training. Duration: 0.888875 2023-10-02 13:25:27,892 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=894953.3333333334, ans=0.1 2023-10-02 13:25:35,168 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_5438_210_1794_1_1527336687_2600009_104-288274-0 from training. Duration: 0.58 2023-10-02 13:25:36,488 WARNING [train.py:1204] (2/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_108-53929-0_sp0.9 from training. Duration: 0.6555625 2023-10-02 13:25:36,554 WARNING [train.py:1204] (2/4) Exclude cut with ID _30800_210_22382_1_1532499347904_4515959_444-286390-0_sp1.1 from training. Duration: 0.963625 2023-10-02 13:25:37,968 WARNING [train.py:1204] (2/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_125-144174-0_sp0.9 from training. Duration: 0.4333125 2023-10-02 13:25:38,040 WARNING [train.py:1204] (2/4) Exclude cut with ID _55209_210_1531_1_1534675828996_4597160_118-345621-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 13:25:39,369 INFO [train.py:1046] (2/4) Epoch 26, batch 1450, loss[loss=0.1779, simple_loss=0.2639, pruned_loss=0.046, over 24441.00 frames. ], tot_loss[loss=0.1664, simple_loss=0.2438, pruned_loss=0.04454, over 4721784.77 frames. ], batch size: 69, lr: 3.96e-03, grad_scale: 16.0 2023-10-02 13:25:39,485 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_45886_210_17024_1_1533709215_471063_7-79165-0_sp1.1 from training. Duration: 0.8909375 2023-10-02 13:25:41,243 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.5.encoder.layers.1.feed_forward3.out_whiten, num_groups=1, num_channels=256, metric=9.53 vs. limit=15.0 2023-10-02 13:25:43,488 WARNING [train.py:1204] (2/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_175-163894-0_sp0.9 from training. Duration: 0.82225 2023-10-02 13:25:45,401 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_50188_210_4198_1_1534503621_3646041_353-81682-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 13:25:45,406 WARNING [train.py:1204] (2/4) Exclude cut with ID _17875_210_2087_1_1531224012907_1821339_175-80565-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 13:25:45,423 WARNING [train.py:1204] (2/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_159-328157-0_sp1.1 from training. Duration: 0.47275 2023-10-02 13:25:49,587 WARNING [train.py:1204] (2/4) Exclude cut with ID _60057_210_10640_1_1534903065793_3333150_137-267579-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 13:25:49,635 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_92-46009-0_sp1.1 from training. Duration: 0.4909375 2023-10-02 13:25:50,921 WARNING [train.py:1204] (2/4) Exclude cut with ID _53203_210_2087_1_1534246218309_475590_48-68507-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 13:25:50,943 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_28211_210_6388_1_1532861506_4523224_128-328162-0 from training. Duration: 0.9 2023-10-02 13:25:51,004 WARNING [train.py:1204] (2/4) Exclude cut with ID _4697_210_2069_1_1526815801953_3823098_51-1049-0_sp1.1 from training. Duration: 0.6818125 2023-10-02 13:25:52,882 WARNING [train.py:1204] (2/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_383-316000-0 from training. Duration: 0.9 2023-10-02 13:25:52,931 WARNING [train.py:1204] (2/4) Exclude cut with ID _4779_210_2084_1_1527318022944_3914893_185-295153-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 13:25:54,270 WARNING [train.py:1204] (2/4) Exclude cut with ID _3231_210_2106_1_1526205864934_6784220_171-256004-0_sp0.9 from training. Duration: 0.9555625 2023-10-02 13:25:54,271 WARNING [train.py:1204] (2/4) Exclude cut with ID _41314_210_12651_1_1533459476543_3766460_87-272068-0 from training. Duration: 0.83 2023-10-02 13:25:55,689 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_164-152777-0_sp1.1 from training. Duration: 0.92725 2023-10-02 13:25:55,740 WARNING [train.py:1204] (2/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_18-130336-0_sp0.9 from training. Duration: 0.87775 2023-10-02 13:25:57,070 WARNING [train.py:1204] (2/4) Exclude cut with ID _14886_210_4220_1_1530702020355_3516190_175-229073-0_sp1.1 from training. Duration: 0.4090625 2023-10-02 13:25:57,083 WARNING [train.py:1204] (2/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_791-45149-0_sp0.9 from training. Duration: 0.9555625 2023-10-02 13:25:58,374 WARNING [train.py:1204] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_468-189058-0_sp1.1 from training. Duration: 0.87275 2023-10-02 13:25:59,708 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8278_210_2550_1_1528637099_4220413_32-142780-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 13:26:02,867 WARNING [train.py:1204] (2/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_146-218763-0_sp0.9 from training. Duration: 0.9555625 2023-10-02 13:26:07,542 WARNING [train.py:1204] (2/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_348-127789-0_sp1.1 from training. Duration: 0.77275 2023-10-02 13:26:07,555 WARNING [train.py:1204] (2/4) Exclude cut with ID _47790_210_16187_1_1533862814066_4207519_288-285445-0_sp1.1 from training. Duration: 0.936375 2023-10-02 13:26:09,041 WARNING [train.py:1204] (2/4) Exclude cut with ID _14603_210_4220_1_1530615688441_3097319_308-181732-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 13:26:09,069 WARNING [train.py:1204] (2/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_895-117428-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 13:26:10,596 WARNING [train.py:1204] (2/4) Exclude cut with ID _9235_210_5219_1_1528891154253_8046120_387-159008-0_sp0.9 from training. Duration: 0.9555625 2023-10-02 13:26:10,608 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_37351_210_7925_1_1533012931_3812113_265-85780-0_sp0.9 from training. Duration: 0.97775 2023-10-02 13:26:11,886 WARNING [train.py:1204] (2/4) Exclude cut with ID _40551_210_10635_1_1533776424632_3416179_361-3964-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 13:26:11,926 WARNING [train.py:1204] (2/4) Exclude cut with ID _25882_210_3036_1_1532775665815_6479510_485-122742-0_sp1.1 from training. Duration: 0.97275 2023-10-02 13:26:16,520 WARNING [train.py:1204] (2/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_98-51676-0 from training. Duration: 0.73 2023-10-02 13:26:17,984 WARNING [train.py:1204] (2/4) Exclude cut with ID _30548_210_16040_1_1532608097130_3146969_371-280610-0_sp1.1 from training. Duration: 0.92725 2023-10-02 13:26:20,755 WARNING [train.py:1204] (2/4) Exclude cut with ID IC0671W0081-331924 from training. Duration: 0.896 2023-10-02 13:26:21,386 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.3.encoder.layers.3.nonlin_attention.whiten1, num_groups=1, num_channels=384, metric=5.58 vs. limit=10.0 2023-10-02 13:26:22,562 WARNING [train.py:1204] (2/4) Exclude cut with ID _40284_210_2084_1_1533513679814_3704279_380-160775-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 13:26:25,407 WARNING [train.py:1204] (2/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_57-270037-0_sp1.1 from training. Duration: 0.7 2023-10-02 13:26:26,067 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.3.encoder.layers.1.self_attn_weights.whiten_keys, num_groups=8, num_channels=256, metric=4.58 vs. limit=6.0 2023-10-02 13:26:26,798 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_853-150237-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 13:26:28,201 WARNING [train.py:1204] (2/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_491-261923-0 from training. Duration: 0.98 2023-10-02 13:26:29,730 WARNING [train.py:1204] (2/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_836-244045-0_sp1.1 from training. Duration: 0.97275 2023-10-02 13:26:31,039 WARNING [train.py:1204] (2/4) Exclude cut with ID _14880_210_8895_1_1530680426655_3795259_289-212817-0 from training. Duration: 0.69 2023-10-02 13:26:31,791 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.2.encoder.layers.0.self_attn2.whiten, num_groups=1, num_channels=384, metric=20.07 vs. limit=22.5 2023-10-02 13:26:32,434 WARNING [train.py:1204] (2/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_303-340446-0 from training. Duration: 0.9 2023-10-02 13:26:35,728 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_37152_210_9697_1_1533106573_3844188_418-41736-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 13:26:37,894 WARNING [train.py:1204] (2/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_371-18169-0_sp1.1 from training. Duration: 0.7818125 2023-10-02 13:26:37,928 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_187-219984-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 13:26:40,643 WARNING [train.py:1204] (2/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_63-267585-0 from training. Duration: 0.9 2023-10-02 13:26:42,079 WARNING [train.py:1204] (2/4) Exclude cut with ID _27013_210_1389_1_1532437173240_3252649_222-125573-0 from training. Duration: 0.98 2023-10-02 13:26:42,633 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.3.encoder.layers.3.whiten, num_groups=1, num_channels=512, metric=4.56 vs. limit=12.0 2023-10-02 13:26:43,833 WARNING [train.py:1204] (2/4) Exclude cut with ID _14821_210_8750_1_1530680503991_6829359_189-297451-0 from training. Duration: 0.55 2023-10-02 13:26:45,267 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_557-163391-0_sp1.1 from training. Duration: 0.97275 2023-10-02 13:26:46,672 WARNING [train.py:1204] (2/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_65-88242-0_sp1.1 from training. Duration: 0.5090625 2023-10-02 13:26:53,477 INFO [train.py:1046] (2/4) Epoch 26, batch 1500, loss[loss=0.172, simple_loss=0.2451, pruned_loss=0.04944, over 23657.00 frames. ], tot_loss[loss=0.167, simple_loss=0.2445, pruned_loss=0.04478, over 4721850.13 frames. ], batch size: 149, lr: 3.96e-03, grad_scale: 16.0 2023-10-02 13:26:55,602 WARNING [train.py:1204] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_218-295097-0 from training. Duration: 0.77 2023-10-02 13:26:56,859 WARNING [train.py:1204] (2/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_686-247600-0_sp1.1 from training. Duration: 0.836375 2023-10-02 13:26:56,863 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_397-147196-0_sp0.9 from training. Duration: 0.888875 2023-10-02 13:26:58,223 WARNING [train.py:1204] (2/4) Exclude cut with ID _53950_210_5816_1_1534417264919_3862951_237-136449-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 13:26:59,533 WARNING [train.py:1204] (2/4) Exclude cut with ID _3120_210_3525_1_1526036143112_4072980_74-178400-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 13:26:59,610 WARNING [train.py:1204] (2/4) Exclude cut with ID _5166_210_2084_1_1527073197160_4087993_309-222545-0_sp0.9 from training. Duration: 0.9333125 2023-10-02 13:27:00,902 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7544_210_6753_1_1528587722_3576686_172-171750-0 from training. Duration: 0.65 2023-10-02 13:27:01,006 WARNING [train.py:1204] (2/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_3-237434-0_sp0.9 from training. Duration: 0.8444375 2023-10-02 13:27:02,316 WARNING [train.py:1204] (2/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_101-293367-0_sp0.9 from training. Duration: 0.7 2023-10-02 13:27:02,323 WARNING [train.py:1204] (2/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_818-213552-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 13:27:04,191 WARNING [train.py:1204] (2/4) Exclude cut with ID _14349_210_8341_1_1530615710818_3442470_115-285975-0_sp1.1 from training. Duration: 0.7818125 2023-10-02 13:27:06,910 WARNING [train.py:1204] (2/4) Exclude cut with ID _30800_210_22382_1_1532499347904_4515959_291-309749-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 13:27:08,616 WARNING [train.py:1204] (2/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_715-3462-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 13:27:11,564 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.0.feed_forward1.out_proj.dropout_p, batch_count=895420.0, ans=0.1 2023-10-02 13:27:14,090 WARNING [train.py:1204] (2/4) Exclude cut with ID _59502_210_12210_1_1535107024698_1835139_13-331827-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 13:27:14,101 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_4528_210_3273_1_1527076654_3276777_342-168068-0 from training. Duration: 0.87 2023-10-02 13:27:14,179 WARNING [train.py:1204] (2/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_215-141059-0_sp0.9 from training. Duration: 0.688875 2023-10-02 13:27:14,201 WARNING [train.py:1204] (2/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_106-286130-0_sp1.1 from training. Duration: 0.8909375 2023-10-02 13:27:16,033 WARNING [train.py:1204] (2/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_480-77655-0_sp1.1 from training. Duration: 0.97275 2023-10-02 13:27:18,875 WARNING [train.py:1204] (2/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_848-280957-0 from training. Duration: 0.87 2023-10-02 13:27:20,460 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.bypass.skip_rate, batch_count=895420.0, ans=0.07 2023-10-02 13:27:21,693 WARNING [train.py:1204] (2/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_122-109023-0 from training. Duration: 0.76 2023-10-02 13:27:24,736 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_11305_210_1794_1_1529750428_7178148_645-233480-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 13:27:24,800 WARNING [train.py:1204] (2/4) Exclude cut with ID _7562_210_2084_1_1528887767916_3946229_345-169156-0 from training. Duration: 0.58 2023-10-02 13:27:27,601 WARNING [train.py:1204] (2/4) Exclude cut with ID _7562_210_2084_1_1528887767916_3946229_345-169156-0_sp1.1 from training. Duration: 0.52725 2023-10-02 13:27:28,999 WARNING [train.py:1204] (2/4) Exclude cut with ID _13253_210_1925_1_1530757775819_3577873_253-246386-0_sp1.1 from training. Duration: 0.8181875 2023-10-02 13:27:29,054 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_136-264444-0_sp1.1 from training. Duration: 0.97275 2023-10-02 13:27:29,075 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_2168_210_2290_1_1525606920_1849400_136-222991-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 13:27:30,383 WARNING [train.py:1204] (2/4) Exclude cut with ID _7852_210_5219_1_1528286471316_8139400_242-105336-0 from training. Duration: 0.72 2023-10-02 13:27:30,436 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_5960_210_1794_1_1527403865_3990999_515-2613-0_sp1.1 from training. Duration: 0.8 2023-10-02 13:27:30,450 WARNING [train.py:1204] (2/4) Exclude cut with ID _39688_210_19596_1_1533371626725_4154159_551-167834-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 13:27:31,793 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_43802_210_2290_1_1533516203_8829504_85-195240-0 from training. Duration: 0.87 2023-10-02 13:27:31,839 WARNING [train.py:1204] (2/4) Exclude cut with ID _63171_210_15488_1_1535104792794_3295110_113-113534-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 13:27:36,389 WARNING [train.py:1204] (2/4) Exclude cut with ID _8833_210_5219_1_1528592665210_7553059_319-349181-0_sp1.1 from training. Duration: 0.9 2023-10-02 13:27:36,389 WARNING [train.py:1204] (2/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_225-43381-0 from training. Duration: 0.94 2023-10-02 13:27:39,720 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.517e+02 1.823e+02 1.989e+02 2.187e+02 2.730e+02, threshold=3.978e+02, percent-clipped=0.0 2023-10-02 13:27:41,249 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_5959_210_1794_1_1527400400_3445726_412-44052-0_sp0.9 from training. Duration: 0.8555625 2023-10-02 13:27:44,032 WARNING [train.py:1204] (2/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_356-167816-0_sp1.1 from training. Duration: 0.6545625 2023-10-02 13:27:48,601 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9012W0112-501244 from training. Duration: 0.768 2023-10-02 13:27:48,657 WARNING [train.py:1204] (2/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_177-326952-0_sp1.1 from training. Duration: 0.92725 2023-10-02 13:27:48,661 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9016W0116-502789 from training. Duration: 0.896 2023-10-02 13:27:50,088 WARNING [train.py:1204] (2/4) Exclude cut with ID _56235_210_10640_1_1534557335235_4161099_557-204815-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 13:27:51,448 WARNING [train.py:1204] (2/4) Exclude cut with ID _55857_210_2346_1_1534644007831_6898209_279-229469-0_sp1.1 from training. Duration: 0.936375 2023-10-02 13:27:52,790 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9029W0375-509764 from training. Duration: 0.896 2023-10-02 13:27:54,126 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_2295_210_2087_1_1525500993_2407863_234-83104-0_sp0.9 from training. Duration: 0.688875 2023-10-02 13:27:54,602 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.5.encoder.layers.0.self_attn1.whiten, num_groups=1, num_channels=256, metric=12.37 vs. limit=22.5 2023-10-02 13:27:57,545 WARNING [train.py:1204] (2/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_266-137104-0 from training. Duration: 0.65 2023-10-02 13:27:58,892 WARNING [train.py:1204] (2/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_289-163927-0_sp1.1 from training. Duration: 0.92725 2023-10-02 13:28:00,468 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.1.bypass.scale_min, batch_count=895620.0, ans=0.2 2023-10-02 13:28:01,751 WARNING [train.py:1204] (2/4) Exclude cut with ID _12733_210_8341_1_1530167424556_3646380_86-225314-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 13:28:01,772 WARNING [train.py:1204] (2/4) Exclude cut with ID _7877_210_6014_1_1528286358099_3750136_618-284064-0_sp1.1 from training. Duration: 0.92725 2023-10-02 13:28:03,059 WARNING [train.py:1204] (2/4) Exclude cut with ID _55844_210_2346_1_1534471212569_7625707_1017-31469-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 13:28:03,102 WARNING [train.py:1204] (2/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_617-241446-0_sp1.1 from training. Duration: 0.92725 2023-10-02 13:28:03,816 WARNING [train.py:1204] (2/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_866-173440-0_sp1.1 from training. Duration: 0.8181875 2023-10-02 13:28:05,148 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.conv_module1.balancer2.prob, batch_count=895620.0, ans=0.125 2023-10-02 13:28:06,300 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_9460_210_4211_1_1528954587_10049667_480-260269-0 from training. Duration: 0.56 2023-10-02 13:28:06,336 WARNING [train.py:1204] (2/4) Exclude cut with ID _44659_210_2721_1_1534036120061_6614860_626-298884-0 from training. Duration: 0.97 2023-10-02 13:28:07,716 INFO [train.py:1046] (2/4) Epoch 26, batch 1550, loss[loss=0.1677, simple_loss=0.2604, pruned_loss=0.03752, over 24563.00 frames. ], tot_loss[loss=0.1687, simple_loss=0.246, pruned_loss=0.04574, over 4707530.43 frames. ], batch size: 71, lr: 3.96e-03, grad_scale: 16.0 2023-10-02 13:28:07,755 WARNING [train.py:1204] (2/4) Exclude cut with ID _53565_210_10635_1_1534337218335_3820259_287-18120-0_sp1.1 from training. Duration: 0.8818125 2023-10-02 13:28:07,820 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_791-197891-0 from training. Duration: 0.84 2023-10-02 13:28:09,150 WARNING [train.py:1204] (2/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_527-238990-0 from training. Duration: 0.9 2023-10-02 13:28:10,607 WARNING [train.py:1204] (2/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_449-65969-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 13:28:12,430 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_553-341452-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 13:28:12,476 WARNING [train.py:1204] (2/4) Exclude cut with ID _16909_210_5724_1_1531292079912_4118919_300-47574-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 13:28:12,494 WARNING [train.py:1204] (2/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_70-27732-0_sp1.1 from training. Duration: 0.8545625 2023-10-02 13:28:14,018 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.feed_forward3.hidden_balancer.prob, batch_count=895686.6666666666, ans=0.125 2023-10-02 13:28:15,122 WARNING [train.py:1204] (2/4) Exclude cut with ID _44718_210_5816_1_1534050051869_6469030_727-28074-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 13:28:16,334 WARNING [train.py:1204] (2/4) Exclude cut with ID _55857_210_2346_1_1534644007831_6898209_357-123664-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 13:28:19,622 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9123W0004-555964 from training. Duration: 0.896 2023-10-02 13:28:19,642 WARNING [train.py:1204] (2/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_445-145208-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 13:28:19,681 WARNING [train.py:1204] (2/4) Exclude cut with ID _3675_210_4557_1_1526639934408_4182089_456-18305-0_sp1.1 from training. Duration: 0.6909375 2023-10-02 13:28:19,733 WARNING [train.py:1204] (2/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_1-99680-0_sp1.1 from training. Duration: 0.6090625 2023-10-02 13:28:22,522 WARNING [train.py:1204] (2/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_146-282400-0_sp0.9 from training. Duration: 0.788875 2023-10-02 13:28:22,534 WARNING [train.py:1204] (2/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_844-96989-0 from training. Duration: 0.71 2023-10-02 13:28:22,780 INFO [scaling.py:1118] (2/4) WithLoss: name=encoder.encoders.3.encoder.layers.2.self_attn_weights, loss-sum=0.000e+00 2023-10-02 13:28:25,127 WARNING [train.py:1204] (2/4) Exclude cut with ID _29853_210_7497_1_1532505600851_6642110_128-146364-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 13:28:25,157 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_224-322394-0 from training. Duration: 0.66 2023-10-02 13:28:25,250 WARNING [train.py:1204] (2/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_191-59784-0 from training. Duration: 0.88 2023-10-02 13:28:25,255 WARNING [train.py:1204] (2/4) Exclude cut with ID _12556_210_8341_1_1530081024100_7327690_725-193477-0 from training. Duration: 0.98 2023-10-02 13:28:26,624 WARNING [train.py:1204] (2/4) Exclude cut with ID _5166_210_2084_1_1527073197160_4087993_523-79838-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 13:28:28,384 WARNING [train.py:1204] (2/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_398-334558-0_sp1.1 from training. Duration: 0.87275 2023-10-02 13:28:32,536 WARNING [train.py:1204] (2/4) Exclude cut with ID _6456_210_1534_1_1527681634212_3181049_171-36697-0_sp1.1 from training. Duration: 0.936375 2023-10-02 13:28:32,896 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=895753.3333333334, ans=0.1 2023-10-02 13:28:33,943 WARNING [train.py:1204] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_700-183549-0 from training. Duration: 0.68 2023-10-02 13:28:33,945 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8853_210_3273_1_1528633899_818968_51-187685-0 from training. Duration: 0.69 2023-10-02 13:28:41,323 WARNING [train.py:1204] (2/4) Exclude cut with ID _7719_210_3525_1_1528456153163_4296960_333-110399-0_sp1.1 from training. Duration: 0.87275 2023-10-02 13:28:44,217 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=895820.0, ans=0.1 2023-10-02 13:28:45,924 WARNING [train.py:1204] (2/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_1022-83740-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 13:28:45,945 WARNING [train.py:1204] (2/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_61-327062-0_sp0.9 from training. Duration: 0.9 2023-10-02 13:28:45,953 WARNING [train.py:1204] (2/4) Exclude cut with ID _47251_210_18628_1_1533814063128_4137250_214-88076-0_sp1.1 from training. Duration: 0.8 2023-10-02 13:28:47,258 WARNING [train.py:1204] (2/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_839-173017-0 from training. Duration: 0.41 2023-10-02 13:28:52,773 WARNING [train.py:1204] (2/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_299-34413-0_sp1.1 from training. Duration: 0.5909375 2023-10-02 13:28:54,183 WARNING [train.py:1204] (2/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_664-159457-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 13:28:56,800 WARNING [train.py:1204] (2/4) Exclude cut with ID _66873_210_5517_1_1535454136506_4293370_483-291760-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 13:28:59,951 WARNING [train.py:1204] (2/4) Exclude cut with ID _30800_210_22382_1_1532499347904_4515959_287-255474-0_sp1.1 from training. Duration: 0.92725 2023-10-02 13:29:01,326 WARNING [train.py:1204] (2/4) Exclude cut with ID _48677_210_5663_1_1533902405919_3892989_111-11439-0_sp1.1 from training. Duration: 0.87275 2023-10-02 13:29:01,333 WARNING [train.py:1204] (2/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_399-156791-0 from training. Duration: 0.83 2023-10-02 13:29:01,365 WARNING [train.py:1204] (2/4) Exclude cut with ID _3992_210_1747_1_1526644816031_3731060_36-147855-0_sp0.9 from training. Duration: 0.8666875 2023-10-02 13:29:02,802 WARNING [train.py:1204] (2/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_442-320317-0_sp1.1 from training. Duration: 0.7454375 2023-10-02 13:29:02,819 WARNING [train.py:1204] (2/4) Exclude cut with ID _55429_210_10626_1_1534467588527_7174143_953-80868-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 13:29:02,885 WARNING [train.py:1204] (2/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_159-328157-0_sp0.9 from training. Duration: 0.57775 2023-10-02 13:29:02,891 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9309W0197-640324 from training. Duration: 0.896 2023-10-02 13:29:05,574 WARNING [train.py:1204] (2/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_361-30822-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 13:29:11,464 WARNING [train.py:1204] (2/4) Exclude cut with ID _5250_210_5219_1_1527249561806_8692110_636-251213-0 from training. Duration: 0.94 2023-10-02 13:29:16,003 WARNING [train.py:1204] (2/4) Exclude cut with ID _49728_210_12651_1_1534041034294_6788150_468-333827-0_sp1.1 from training. Duration: 0.963625 2023-10-02 13:29:17,342 WARNING [train.py:1204] (2/4) Exclude cut with ID _25882_210_3036_1_1532775665815_6479510_623-191759-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 13:29:17,402 WARNING [train.py:1204] (2/4) Exclude cut with ID _5189_210_4557_1_1527248319428_3769229_199-53526-0 from training. Duration: 0.42 2023-10-02 13:29:21,072 INFO [train.py:1046] (2/4) Epoch 26, batch 1600, loss[loss=0.1623, simple_loss=0.2321, pruned_loss=0.04626, over 19037.00 frames. ], tot_loss[loss=0.1693, simple_loss=0.2467, pruned_loss=0.0459, over 4711422.43 frames. ], batch size: 41, lr: 3.96e-03, grad_scale: 32.0 2023-10-02 13:29:21,150 WARNING [train.py:1204] (2/4) Exclude cut with ID _6274_210_2084_1_1527937583837_7494020_32-205288-0_sp0.9 from training. Duration: 0.8666875 2023-10-02 13:29:22,545 WARNING [train.py:1204] (2/4) Exclude cut with ID _65263_210_22356_1_1535763598691_3921037_401-346646-0_sp1.1 from training. Duration: 0.963625 2023-10-02 13:29:22,550 WARNING [train.py:1204] (2/4) Exclude cut with ID _12555_210_8750_1_1530075594402_3780239_449-267986-0_sp1.1 from training. Duration: 0.7545625 2023-10-02 13:29:22,569 WARNING [train.py:1204] (2/4) Exclude cut with ID _69884_210_9771_1_1535846392818_4166169_430-201020-0_sp1.1 from training. Duration: 0.8818125 2023-10-02 13:29:25,173 WARNING [train.py:1204] (2/4) Exclude cut with ID _8394_210_6702_1_1528590504316_3540189_379-49457-0_sp1.1 from training. Duration: 0.7909375 2023-10-02 13:29:28,113 WARNING [train.py:1204] (2/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_200-105218-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 13:29:28,177 WARNING [train.py:1204] (2/4) Exclude cut with ID _3867_210_1883_1_1526641200575_4475830_319-349850-0 from training. Duration: 0.67 2023-10-02 13:29:29,553 WARNING [train.py:1204] (2/4) Exclude cut with ID _28317_210_12742_1_1532480302381_8510325_916-195848-0 from training. Duration: 0.92 2023-10-02 13:29:29,672 WARNING [train.py:1204] (2/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_436-186311-0 from training. Duration: 0.65 2023-10-02 13:29:31,616 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_3910_210_1794_1_1526777667_3747260_302-180617-0_sp1.1 from training. Duration: 0.97275 2023-10-02 13:29:31,853 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.bypass.scale_min, batch_count=896020.0, ans=0.2 2023-10-02 13:29:34,258 WARNING [train.py:1204] (2/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_102-92751-0 from training. Duration: 0.99 2023-10-02 13:29:34,311 WARNING [train.py:1204] (2/4) Exclude cut with ID _51899_210_20034_1_1534129254466_3669220_265-215428-0_sp1.1 from training. Duration: 0.8909375 2023-10-02 13:29:36,957 WARNING [train.py:1204] (2/4) Exclude cut with ID _58583_210_5236_1_1534726835266_3574249_438-198993-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 13:29:41,801 WARNING [train.py:1204] (2/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_166-166990-0_sp1.1 from training. Duration: 0.87275 2023-10-02 13:29:41,948 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=896086.6666666666, ans=0.1 2023-10-02 13:29:44,498 WARNING [train.py:1204] (2/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_907-151398-0 from training. Duration: 0.37 2023-10-02 13:29:45,958 WARNING [train.py:1204] (2/4) Exclude cut with ID _38670_210_15379_1_1533448810659_4391059_452-256669-0_sp1.1 from training. Duration: 0.936375 2023-10-02 13:29:45,992 WARNING [train.py:1204] (2/4) Exclude cut with ID _48677_210_5663_1_1533902405919_3892989_408-40740-0 from training. Duration: 0.83 2023-10-02 13:29:47,402 WARNING [train.py:1204] (2/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_903-158756-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 13:29:48,906 WARNING [train.py:1204] (2/4) Exclude cut with ID _13252_210_1925_1_1530671372047_3336909_397-216497-0 from training. Duration: 0.96 2023-10-02 13:29:51,002 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.bypass.scale_min, batch_count=896153.3333333334, ans=0.2 2023-10-02 13:29:54,036 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.feed_forward1.hidden_balancer.prob, batch_count=896153.3333333334, ans=0.125 2023-10-02 13:29:55,197 WARNING [train.py:1204] (2/4) Exclude cut with ID _55429_210_10626_1_1534467588527_7174143_97-335084-0 from training. Duration: 0.97 2023-10-02 13:30:02,524 WARNING [train.py:1204] (2/4) Exclude cut with ID _45329_210_7005_1_1534157951211_6774339_453-267730-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 13:30:02,597 WARNING [train.py:1204] (2/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_153-97441-0 from training. Duration: 0.56 2023-10-02 13:30:03,229 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.4.encoder.layers.0.feed_forward3.out_whiten, num_groups=1, num_channels=384, metric=12.39 vs. limit=15.0 2023-10-02 13:30:04,018 WARNING [train.py:1204] (2/4) Exclude cut with ID _40901_210_5665_1_1533363789958_3856130_312-329122-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 13:30:04,058 WARNING [train.py:1204] (2/4) Exclude cut with ID _30800_210_22382_1_1532499347904_4515959_97-126471-0_sp1.1 from training. Duration: 0.97275 2023-10-02 13:30:04,058 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_11305_210_1794_1_1529750428_7178148_257-312870-0_sp1.1 from training. Duration: 0.8 2023-10-02 13:30:06,790 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.616e+02 1.834e+02 2.040e+02 2.278e+02 3.104e+02, threshold=4.080e+02, percent-clipped=0.0 2023-10-02 13:30:06,874 WARNING [train.py:1204] (2/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_454-314901-0_sp0.9 from training. Duration: 0.511125 2023-10-02 13:30:10,037 WARNING [train.py:1204] (2/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_282-136641-0_sp1.1 from training. Duration: 0.3818125 2023-10-02 13:30:12,767 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_46013_210_7234_1_1533776362_3744032_493-160724-0_sp1.1 from training. Duration: 0.963625 2023-10-02 13:30:12,808 WARNING [train.py:1204] (2/4) Exclude cut with ID _67831_210_12392_1_1535612464202_3092612_259-33442-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 13:30:14,107 WARNING [train.py:1204] (2/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_289-62384-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 13:30:14,151 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6545_210_2040_1_1527774670_4245258_379-14182-0_sp0.9 from training. Duration: 0.888875 2023-10-02 13:30:15,460 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_548-294316-0_sp1.1 from training. Duration: 0.736375 2023-10-02 13:30:18,148 WARNING [train.py:1204] (2/4) Exclude cut with ID _6275_210_2084_1_1528110062213_7669049_857-298942-0_sp1.1 from training. Duration: 0.77275 2023-10-02 13:30:20,000 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6673_210_3273_1_1527767790_3173240_327-306185-0_sp1.1 from training. Duration: 0.7545625 2023-10-02 13:30:25,980 WARNING [train.py:1204] (2/4) Exclude cut with ID _61008_210_10633_1_1534852769198_6974370_814-107647-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 13:30:26,052 WARNING [train.py:1204] (2/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_116-332008-0_sp1.1 from training. Duration: 0.8909375 2023-10-02 13:30:27,575 WARNING [train.py:1204] (2/4) Exclude cut with ID _32704_210_8954_1_1533017488840_6747370_266-329755-0 from training. Duration: 0.89 2023-10-02 13:30:27,578 WARNING [train.py:1204] (2/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_271-269084-0_sp0.9 from training. Duration: 0.92225 2023-10-02 13:30:27,827 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.feed_forward1.out_proj.dropout_p, batch_count=896286.6666666666, ans=0.1 2023-10-02 13:30:30,272 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_3171_210_2087_1_1526109084_2330007_66-260583-0 from training. Duration: 0.72 2023-10-02 13:30:34,790 INFO [train.py:1046] (2/4) Epoch 26, batch 1650, loss[loss=0.2333, simple_loss=0.2955, pruned_loss=0.08553, over 19587.00 frames. ], tot_loss[loss=0.1703, simple_loss=0.2472, pruned_loss=0.04672, over 4685285.79 frames. ], batch size: 388, lr: 3.95e-03, grad_scale: 32.0 2023-10-02 13:30:36,253 WARNING [train.py:1204] (2/4) Exclude cut with ID _5250_210_5219_1_1527249561806_8692110_1326-276386-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 13:30:37,541 WARNING [train.py:1204] (2/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_372-30302-0_sp1.1 from training. Duration: 0.7818125 2023-10-02 13:30:37,605 WARNING [train.py:1204] (2/4) Exclude cut with ID _25882_210_3036_1_1532775665815_6479510_631-257029-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 13:30:37,613 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_3691_210_3528_1_1526468169_4062053_355-334432-0 from training. Duration: 0.87 2023-10-02 13:30:38,933 WARNING [train.py:1204] (2/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_839-173017-0_sp1.1 from training. Duration: 0.37275 2023-10-02 13:30:38,950 WARNING [train.py:1204] (2/4) Exclude cut with ID _14998_210_5219_1_1530705582080_7474970_441-91466-0 from training. Duration: 0.97 2023-10-02 13:30:38,988 WARNING [train.py:1204] (2/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_108-53929-0 from training. Duration: 0.59 2023-10-02 13:30:42,589 WARNING [train.py:1204] (2/4) Exclude cut with ID _9389_210_1664_1_1529217337301_5289269_260-1942-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 13:30:43,814 WARNING [train.py:1204] (2/4) Exclude cut with ID _56423_210_5854_1_1534896061049_3165969_198-61547-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 13:30:43,853 WARNING [train.py:1204] (2/4) Exclude cut with ID _44778_210_2054_1_1533798127203_7443090_412-142122-0_sp1.1 from training. Duration: 0.92725 2023-10-02 13:30:43,877 WARNING [train.py:1204] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_88-341718-0_sp1.1 from training. Duration: 0.6 2023-10-02 13:30:45,375 WARNING [train.py:1204] (2/4) Exclude cut with ID _61079_210_4525_1_1534921008198_4011419_334-298697-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 13:30:46,749 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_5465_210_4211_1_1527227253_55357_8-2499-0_sp1.1 from training. Duration: 0.996375 2023-10-02 13:30:48,409 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.conv_skip_rate, batch_count=896420.0, ans=0.0 2023-10-02 13:30:49,516 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_447-201565-0_sp1.1 from training. Duration: 0.97275 2023-10-02 13:30:51,240 WARNING [train.py:1204] (2/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_253-161789-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 13:30:51,243 WARNING [train.py:1204] (2/4) Exclude cut with ID _44304_210_15374_1_1533639352765_4613129_302-22084-0_sp0.9 from training. Duration: 0.9555625 2023-10-02 13:30:51,259 WARNING [train.py:1204] (2/4) Exclude cut with ID _26664_210_7770_1_1532258943280_3418270_187-80815-0_sp1.1 from training. Duration: 0.8454375 2023-10-02 13:30:52,618 WARNING [train.py:1204] (2/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_68-212801-0 from training. Duration: 0.73 2023-10-02 13:30:52,658 WARNING [train.py:1204] (2/4) Exclude cut with ID _29345_210_4381_1_1532689168263_3860169_149-257268-0 from training. Duration: 0.81 2023-10-02 13:30:58,685 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_213-103282-0_sp1.1 from training. Duration: 0.7090625 2023-10-02 13:31:00,060 WARNING [train.py:1204] (2/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_156-302675-0_sp1.1 from training. Duration: 0.5 2023-10-02 13:31:07,564 WARNING [train.py:1204] (2/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_125-322172-0 from training. Duration: 0.93 2023-10-02 13:31:07,628 WARNING [train.py:1204] (2/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_493-60474-0_sp1.1 from training. Duration: 0.963625 2023-10-02 13:31:10,821 WARNING [train.py:1204] (2/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_156-302675-0 from training. Duration: 0.55 2023-10-02 13:31:12,347 WARNING [train.py:1204] (2/4) Exclude cut with ID _41314_210_12651_1_1533459476543_3766460_123-267862-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 13:31:16,358 WARNING [train.py:1204] (2/4) Exclude cut with ID _28427_210_4220_1_1532581240967_3061069_201-143747-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 13:31:16,402 WARNING [train.py:1204] (2/4) Exclude cut with ID _8772_210_1850_1_1528635595972_3664011_96-12756-0_sp1.1 from training. Duration: 0.8909375 2023-10-02 13:31:16,437 WARNING [train.py:1204] (2/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_1347-145675-0_sp1.1 from training. Duration: 0.936375 2023-10-02 13:31:19,670 WARNING [train.py:1204] (2/4) Exclude cut with ID _9235_210_5219_1_1528891154253_8046120_387-159008-0_sp1.1 from training. Duration: 0.7818125 2023-10-02 13:31:19,696 WARNING [train.py:1204] (2/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_400-201896-0_sp1.1 from training. Duration: 0.963625 2023-10-02 13:31:22,401 WARNING [train.py:1204] (2/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_326-174612-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 13:31:23,698 WARNING [train.py:1204] (2/4) Exclude cut with ID _55844_210_2346_1_1534471212569_7625707_390-116233-0_sp1.1 from training. Duration: 0.963625 2023-10-02 13:31:23,722 WARNING [train.py:1204] (2/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_419-317062-0_sp1.1 from training. Duration: 0.82725 2023-10-02 13:31:23,772 WARNING [train.py:1204] (2/4) Exclude cut with ID _29877_210_6228_1_1532397791106_5276149_266-62291-0_sp0.9 from training. Duration: 0.9666875 2023-10-02 13:31:25,135 WARNING [train.py:1204] (2/4) Exclude cut with ID _7857_210_1534_1_1528286515745_6831470_431-338528-0_sp1.1 from training. Duration: 0.92725 2023-10-02 13:31:26,518 WARNING [train.py:1204] (2/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_87-131427-0_sp0.9 from training. Duration: 0.8666875 2023-10-02 13:31:27,939 WARNING [train.py:1204] (2/4) Exclude cut with ID _24055_210_12651_1_1532310907143_976979_92-59356-0_sp1.1 from training. Duration: 0.82725 2023-10-02 13:31:29,796 WARNING [train.py:1204] (2/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_96-290039-0 from training. Duration: 0.56 2023-10-02 13:31:29,908 WARNING [train.py:1204] (2/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_48-182155-0_sp0.9 from training. Duration: 0.9666875 2023-10-02 13:31:29,942 WARNING [train.py:1204] (2/4) Exclude cut with ID _4626_210_3532_1_1526817652717_3916859_446-206656-0 from training. Duration: 0.76 2023-10-02 13:31:33,283 WARNING [train.py:1204] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_168-32906-0 from training. Duration: 0.69 2023-10-02 13:31:33,301 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_3780_210_2087_1_1526467760_2130483_170-259044-0 from training. Duration: 0.7 2023-10-02 13:31:33,326 WARNING [train.py:1204] (2/4) Exclude cut with ID _65134_210_14763_1_1535335393985_3505368_153-216718-0_sp1.1 from training. Duration: 0.92725 2023-10-02 13:31:34,651 WARNING [train.py:1204] (2/4) Exclude cut with ID _64639_210_5816_1_1535367619437_3528000_461-329156-0_sp1.1 from training. Duration: 0.87275 2023-10-02 13:31:35,937 WARNING [train.py:1204] (2/4) Exclude cut with ID _21989_210_5854_1_1531899214931_3271360_126-347531-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 13:31:37,278 WARNING [train.py:1204] (2/4) Exclude cut with ID _7614_210_4915_1_1529326764092_4537359_96-162197-0_sp1.1 from training. Duration: 0.963625 2023-10-02 13:31:37,279 WARNING [train.py:1204] (2/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_1083-147536-0 from training. Duration: 0.97 2023-10-02 13:31:40,722 WARNING [train.py:1204] (2/4) Exclude cut with ID _64029_210_4381_1_1535178502772_3756459_275-16710-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 13:31:42,084 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7264_210_4938_1_1527939911_8132095_800-91995-0_sp0.9 from training. Duration: 0.9555625 2023-10-02 13:31:42,125 WARNING [train.py:1204] (2/4) Exclude cut with ID _3844_210_1390_1_1526727803582_2871209_50-193879-0_sp1.1 from training. Duration: 0.936375 2023-10-02 13:31:44,722 WARNING [train.py:1204] (2/4) Exclude cut with ID _46952_210_5986_1_1534230010357_3107379_232-344503-0 from training. Duration: 0.96 2023-10-02 13:31:48,793 INFO [train.py:1046] (2/4) Epoch 26, batch 1700, loss[loss=0.1651, simple_loss=0.2171, pruned_loss=0.05649, over 19202.00 frames. ], tot_loss[loss=0.1702, simple_loss=0.2471, pruned_loss=0.0467, over 4685930.04 frames. ], batch size: 389, lr: 3.95e-03, grad_scale: 32.0 2023-10-02 13:31:50,176 WARNING [train.py:1204] (2/4) Exclude cut with ID _25808_210_13292_1_1532343514562_7366109_206-14076-0_sp1.1 from training. Duration: 0.936375 2023-10-02 13:31:50,177 WARNING [train.py:1204] (2/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_55-31904-0_sp0.9 from training. Duration: 0.97775 2023-10-02 13:31:50,209 WARNING [train.py:1204] (2/4) Exclude cut with ID _14632_210_9988_1_1530691279392_3923200_161-129530-0 from training. Duration: 0.96 2023-10-02 13:31:50,260 WARNING [train.py:1204] (2/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_770-200430-0_sp1.1 from training. Duration: 0.8090625 2023-10-02 13:31:50,269 WARNING [train.py:1204] (2/4) Exclude cut with ID _6270_210_2084_1_1527850911573_3856420_26-277343-0_sp1.1 from training. Duration: 0.7454375 2023-10-02 13:31:50,270 WARNING [train.py:1204] (2/4) Exclude cut with ID _69884_210_9771_1_1535846392818_4166169_88-23776-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 13:31:53,562 WARNING [train.py:1204] (2/4) Exclude cut with ID _60860_210_8254_1_1534896015875_3557169_245-256666-0_sp1.1 from training. Duration: 0.8818125 2023-10-02 13:31:53,588 WARNING [train.py:1204] (2/4) Exclude cut with ID _5250_210_5219_1_1527249561806_8692110_636-251213-0_sp1.1 from training. Duration: 0.8545625 2023-10-02 13:31:54,846 WARNING [train.py:1204] (2/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_540-131164-0 from training. Duration: 0.92 2023-10-02 13:31:56,322 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_5931_210_2040_1_1527428481_4547999_321-193614-0_sp0.9 from training. Duration: 0.7444375 2023-10-02 13:32:05,901 WARNING [train.py:1204] (2/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_373-225403-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 13:32:07,312 WARNING [train.py:1204] (2/4) Exclude cut with ID _20622_210_3852_1_1531622427080_1093839_57-326027-0_sp1.1 from training. Duration: 0.97275 2023-10-02 13:32:11,979 WARNING [train.py:1204] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_605-257205-0_sp1.1 from training. Duration: 0.536375 2023-10-02 13:32:11,998 WARNING [train.py:1204] (2/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_442-320317-0_sp0.9 from training. Duration: 0.911125 2023-10-02 13:32:13,366 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_35361_210_4238_1_1533262810_7793476_675-87012-0_sp1.1 from training. Duration: 0.8090625 2023-10-02 13:32:13,394 WARNING [train.py:1204] (2/4) Exclude cut with ID _4184_210_2550_1_1526730192013_2219380_306-44736-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 13:32:14,828 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_124-113562-0 from training. Duration: 0.94 2023-10-02 13:32:17,644 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_2821_210_2715_1_1526108385_3763560_73-296673-0_sp0.9 from training. Duration: 0.92225 2023-10-02 13:32:18,924 WARNING [train.py:1204] (2/4) Exclude cut with ID _10301_210_5886_1_1529222275332_3910620_216-46959-0_sp1.1 from training. Duration: 0.92725 2023-10-02 13:32:20,335 WARNING [train.py:1204] (2/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_91-277684-0_sp0.9 from training. Duration: 0.82225 2023-10-02 13:32:22,224 WARNING [train.py:1204] (2/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_351-294650-0_sp0.9 from training. Duration: 0.7 2023-10-02 13:32:25,035 WARNING [train.py:1204] (2/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_209-205365-0 from training. Duration: 0.79 2023-10-02 13:32:25,081 WARNING [train.py:1204] (2/4) Exclude cut with ID _14603_210_4220_1_1530615688441_3097319_97-325053-0 from training. Duration: 0.92 2023-10-02 13:32:25,187 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_49348_210_7927_1_1533954500_3691300_109-276430-0_sp1.1 from training. Duration: 0.92725 2023-10-02 13:32:25,326 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.self_attn_weights.pos_emb_skip_rate, batch_count=896820.0, ans=0.0 2023-10-02 13:32:27,966 WARNING [train.py:1204] (2/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_912-47473-0 from training. Duration: 0.68 2023-10-02 13:32:28,045 WARNING [train.py:1204] (2/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_96-320736-0_sp0.9 from training. Duration: 0.9555625 2023-10-02 13:32:35,978 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.510e+02 1.840e+02 2.040e+02 2.267e+02 3.457e+02, threshold=4.081e+02, percent-clipped=0.0 2023-10-02 13:32:36,348 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.conv_module1.balancer1.min_positive, batch_count=896886.6666666666, ans=0.025 2023-10-02 13:32:37,470 WARNING [train.py:1204] (2/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_1252-235481-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 13:32:37,766 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.conv_module1.balancer2.prob, batch_count=896886.6666666666, ans=0.125 2023-10-02 13:32:40,178 WARNING [train.py:1204] (2/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_813-261554-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 13:32:40,250 WARNING [train.py:1204] (2/4) Exclude cut with ID _21632_210_2253_1_1531994001765_3744140_110-274654-0_sp0.9 from training. Duration: 0.911125 2023-10-02 13:32:43,031 WARNING [train.py:1204] (2/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_381-317791-0_sp1.1 from training. Duration: 0.663625 2023-10-02 13:32:43,052 WARNING [train.py:1204] (2/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_51-117576-0_sp1.1 from training. Duration: 0.363625 2023-10-02 13:32:43,083 WARNING [train.py:1204] (2/4) Exclude cut with ID _42321_210_7770_1_1533468631352_4366895_104-215915-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 13:32:46,433 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_40896_210_15710_1_1533433956_7100802_216-22824-0_sp1.1 from training. Duration: 0.92725 2023-10-02 13:32:46,434 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_14831_210_7927_1_1530700200_5583610_181-27467-0 from training. Duration: 0.91 2023-10-02 13:32:46,479 WARNING [train.py:1204] (2/4) Exclude cut with ID _30304_210_6319_1_1532476827261_7582060_1024-265645-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 13:32:46,481 WARNING [train.py:1204] (2/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_412-233904-0_sp1.1 from training. Duration: 0.936375 2023-10-02 13:32:46,502 WARNING [train.py:1204] (2/4) Exclude cut with ID _30191_210_1664_1_1533189915047_7196670_911-223461-0_sp1.1 from training. Duration: 0.92725 2023-10-02 13:32:46,510 WARNING [train.py:1204] (2/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_558-148266-0_sp1.1 from training. Duration: 0.963625 2023-10-02 13:32:49,298 WARNING [train.py:1204] (2/4) Exclude cut with ID _58480_210_12392_1_1534811121111_3646160_212-164495-0_sp1.1 from training. Duration: 0.936375 2023-10-02 13:32:49,300 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_454-276429-0_sp0.9 from training. Duration: 0.9666875 2023-10-02 13:32:49,379 WARNING [train.py:1204] (2/4) Exclude cut with ID _24736_210_3279_1_1532332894218_2838700_73-197086-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 13:32:50,721 WARNING [train.py:1204] (2/4) Exclude cut with ID _5294_210_1747_1_1527249643928_3696179_529-242298-0_sp1.1 from training. Duration: 0.863625 2023-10-02 13:32:50,754 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_53555_210_5632_1_1534595220_7232297_345-171438-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 13:32:54,352 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.attention_skip_rate, batch_count=896953.3333333334, ans=0.0 2023-10-02 13:32:55,334 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_28211_210_6388_1_1532861506_4523224_22-25766-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 13:32:55,425 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7002_210_1794_1_1527939683_5157286_163-87552-0 from training. Duration: 0.86 2023-10-02 13:32:58,090 WARNING [train.py:1204] (2/4) Exclude cut with ID _56927_210_9969_1_1534848970840_3695229_409-225129-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 13:32:58,574 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.4.encoder.layers.2.feed_forward3.out_whiten, num_groups=1, num_channels=384, metric=13.81 vs. limit=15.0 2023-10-02 13:32:59,512 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_26192_210_7927_1_1532137269_1040115_21-219331-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 13:33:00,889 WARNING [train.py:1204] (2/4) Exclude cut with ID _15012_210_1968_1_1530856820429_5095350_662-210469-0 from training. Duration: 0.57 2023-10-02 13:33:04,817 INFO [train.py:1046] (2/4) Epoch 26, batch 1750, loss[loss=0.1689, simple_loss=0.2263, pruned_loss=0.05572, over 22640.00 frames. ], tot_loss[loss=0.1693, simple_loss=0.2464, pruned_loss=0.04606, over 4699838.90 frames. ], batch size: 322, lr: 3.95e-03, grad_scale: 16.0 2023-10-02 13:33:06,380 WARNING [train.py:1204] (2/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_582-191016-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 13:33:08,958 WARNING [train.py:1204] (2/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_270-210032-0_sp1.1 from training. Duration: 0.963625 2023-10-02 13:33:08,985 WARNING [train.py:1204] (2/4) Exclude cut with ID _6789_210_1534_1_1527857937218_3118950_262-48853-0_sp0.9 from training. Duration: 0.62225 2023-10-02 13:33:10,375 WARNING [train.py:1204] (2/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_847-41334-0 from training. Duration: 0.44 2023-10-02 13:33:10,405 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_40896_210_15710_1_1533433956_7100802_573-46532-0_sp1.1 from training. Duration: 0.92725 2023-10-02 13:33:13,740 WARNING [train.py:1204] (2/4) Exclude cut with ID _21881_210_3525_1_1531825242757_3798380_247-102591-0_sp1.1 from training. Duration: 0.8909375 2023-10-02 13:33:13,760 WARNING [train.py:1204] (2/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_200-21395-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 13:33:17,803 WARNING [train.py:1204] (2/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_711-15688-0 from training. Duration: 0.43 2023-10-02 13:33:19,279 WARNING [train.py:1204] (2/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_53-75025-0_sp1.1 from training. Duration: 0.963625 2023-10-02 13:33:19,433 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.1.feed_forward1.out_proj.dropout_p, batch_count=897086.6666666666, ans=0.1 2023-10-02 13:33:22,487 WARNING [train.py:1204] (2/4) Exclude cut with ID _6194_210_1534_1_1527511826160_3941194_321-148641-0 from training. Duration: 0.64 2023-10-02 13:33:22,519 WARNING [train.py:1204] (2/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_337-294431-0_sp1.1 from training. Duration: 0.936375 2023-10-02 13:33:22,638 WARNING [train.py:1204] (2/4) Exclude cut with ID _21632_210_2253_1_1531994001765_3744140_110-274654-0_sp1.1 from training. Duration: 0.7454375 2023-10-02 13:33:24,696 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.2.encoder.layers.2.conv_module1.whiten, num_groups=1, num_channels=384, metric=4.96 vs. limit=15.0 2023-10-02 13:33:25,388 WARNING [train.py:1204] (2/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_135-174080-0_sp1.1 from training. Duration: 0.4818125 2023-10-02 13:33:26,783 WARNING [train.py:1204] (2/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_21-174514-0 from training. Duration: 0.43 2023-10-02 13:33:29,502 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_38965_210_4852_1_1533174906_3484735_215-294589-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 13:33:29,525 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_548-294316-0 from training. Duration: 0.81 2023-10-02 13:33:38,741 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_103-84010-0_sp1.1 from training. Duration: 0.62725 2023-10-02 13:33:40,216 WARNING [train.py:1204] (2/4) Exclude cut with ID _18199_210_6109_1_1531475789864_4288790_295-198247-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 13:33:40,246 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_13757_210_3528_1_1530440682_4107239_103-313224-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 13:33:43,625 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_34886_210_10633_1_1533203882_1755661_67-294673-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 13:33:43,642 WARNING [train.py:1204] (2/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_350-101734-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 13:33:45,062 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_37357_210_17024_1_1533089504_2316121_204-153986-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 13:33:46,486 WARNING [train.py:1204] (2/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_673-115569-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 13:33:49,117 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_489-230703-0_sp0.9 from training. Duration: 0.9444375 2023-10-02 13:33:49,188 WARNING [train.py:1204] (2/4) Exclude cut with ID _5250_210_5219_1_1527249561806_8692110_575-102496-0_sp1.1 from training. Duration: 0.936375 2023-10-02 13:33:52,182 WARNING [train.py:1204] (2/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_669-93163-0 from training. Duration: 0.98 2023-10-02 13:33:52,310 WARNING [train.py:1204] (2/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_249-281349-0_sp0.9 from training. Duration: 0.9444375 2023-10-02 13:33:55,009 WARNING [train.py:1204] (2/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_106-30782-0 from training. Duration: 0.8 2023-10-02 13:33:56,327 WARNING [train.py:1204] (2/4) Exclude cut with ID _8772_210_1850_1_1528635595972_3664011_398-320049-0_sp1.1 from training. Duration: 0.8 2023-10-02 13:33:57,728 WARNING [train.py:1204] (2/4) Exclude cut with ID _44778_210_2054_1_1533798127203_7443090_516-151727-0_sp1.1 from training. Duration: 0.963625 2023-10-02 13:33:59,027 WARNING [train.py:1204] (2/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_325-62802-0_sp1.1 from training. Duration: 0.7909375 2023-10-02 13:34:03,144 WARNING [train.py:1204] (2/4) Exclude cut with ID _14368_210_4220_1_1530532714870_3595529_93-58002-0_sp0.9 from training. Duration: 0.6333125 2023-10-02 13:34:05,057 WARNING [train.py:1204] (2/4) Exclude cut with ID _4681_210_3525_1_1527251040321_1235713_123-24428-0_sp1.1 from training. Duration: 0.436375 2023-10-02 13:34:06,380 WARNING [train.py:1204] (2/4) Exclude cut with ID _5250_210_5219_1_1527249561806_8692110_1185-56473-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 13:34:06,511 WARNING [train.py:1204] (2/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_96-244299-0_sp1.1 from training. Duration: 0.8 2023-10-02 13:34:10,698 WARNING [train.py:1204] (2/4) Exclude cut with ID _59500_210_8254_1_1534809610680_3736009_104-72815-0_sp1.1 from training. Duration: 0.963625 2023-10-02 13:34:12,194 WARNING [train.py:1204] (2/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_468-283113-0_sp1.1 from training. Duration: 0.87275 2023-10-02 13:34:15,730 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6239_210_4211_1_1527765584_4306698_528-102186-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 13:34:17,029 WARNING [train.py:1204] (2/4) Exclude cut with ID _45297_210_2721_1_1533715351381_3264949_351-147136-0 from training. Duration: 0.87 2023-10-02 13:34:17,033 WARNING [train.py:1204] (2/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_520-51744-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 13:34:18,359 INFO [train.py:1046] (2/4) Epoch 26, batch 1800, loss[loss=0.159, simple_loss=0.2432, pruned_loss=0.03745, over 24530.00 frames. ], tot_loss[loss=0.1689, simple_loss=0.246, pruned_loss=0.04594, over 4705496.42 frames. ], batch size: 63, lr: 3.95e-03, grad_scale: 16.0 2023-10-02 13:34:18,407 WARNING [train.py:1204] (2/4) Exclude cut with ID _5166_210_2084_1_1527073197160_4087993_310-114452-0_sp1.1 from training. Duration: 0.536375 2023-10-02 13:34:18,411 WARNING [train.py:1204] (2/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_450-165710-0_sp1.1 from training. Duration: 0.92725 2023-10-02 13:34:18,421 WARNING [train.py:1204] (2/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_280-120942-0_sp1.1 from training. Duration: 0.7 2023-10-02 13:34:18,442 WARNING [train.py:1204] (2/4) Exclude cut with ID _65585_210_12210_1_1535450451584_3764098_112-294301-0_sp0.9 from training. Duration: 0.97775 2023-10-02 13:34:18,510 WARNING [train.py:1204] (2/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_98-51676-0_sp0.9 from training. Duration: 0.811125 2023-10-02 13:34:23,048 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_33348_210_5632_1_1533086912_3324030_85-28583-0_sp1.1 from training. Duration: 0.7454375 2023-10-02 13:34:23,130 WARNING [train.py:1204] (2/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_336-208232-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 13:34:25,828 WARNING [train.py:1204] (2/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_339-104791-0_sp0.9 from training. Duration: 0.5444375 2023-10-02 13:34:28,607 WARNING [train.py:1204] (2/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_559-29840-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 13:34:30,028 WARNING [train.py:1204] (2/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_117-290916-0_sp0.9 from training. Duration: 0.5666875 2023-10-02 13:34:31,352 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_43802_210_2290_1_1533516203_8829504_185-112289-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 13:34:34,146 WARNING [train.py:1204] (2/4) Exclude cut with ID _59256_210_5986_1_1535076085997_3589120_155-298797-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 13:34:35,932 WARNING [train.py:1204] (2/4) Exclude cut with ID _44659_210_2721_1_1534036120061_6614860_246-206531-0_sp1.1 from training. Duration: 0.92725 2023-10-02 13:34:35,977 WARNING [train.py:1204] (2/4) Exclude cut with ID _23818_210_6228_1_1531917063884_3676920_469-151944-0_sp1.1 from training. Duration: 0.92725 2023-10-02 13:34:37,759 WARNING [train.py:1204] (2/4) Exclude cut with ID _13252_210_1925_1_1530671372047_3336909_88-276668-0_sp1.1 from training. Duration: 0.9 2023-10-02 13:34:39,303 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_15101_210_7927_1_1530786415_5033274_320-327527-0_sp1.1 from training. Duration: 0.87275 2023-10-02 13:34:39,317 WARNING [train.py:1204] (2/4) Exclude cut with ID _14998_210_5219_1_1530705582080_7474970_446-112117-0 from training. Duration: 0.9 2023-10-02 13:34:39,367 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_30280_210_1881_1_1532872779_3660139_400-254159-0_sp1.1 from training. Duration: 0.936375 2023-10-02 13:34:43,335 WARNING [train.py:1204] (2/4) Exclude cut with ID _40284_210_2084_1_1533513679814_3704279_158-345341-0_sp1.1 from training. Duration: 0.936375 2023-10-02 13:34:44,891 WARNING [train.py:1204] (2/4) Exclude cut with ID 3033-130750-0096-55598-0 from training. Duration: 0.83 2023-10-02 13:34:47,042 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.self_attn_weights.pos_emb_skip_rate, batch_count=897486.6666666666, ans=0.0 2023-10-02 13:34:48,168 WARNING [train.py:1204] (2/4) Exclude cut with ID _9389_210_1664_1_1529217337301_5289269_379-128794-0 from training. Duration: 0.85 2023-10-02 13:34:48,185 WARNING [train.py:1204] (2/4) Exclude cut with ID _3675_210_4557_1_1526639934408_4182089_456-18305-0 from training. Duration: 0.76 2023-10-02 13:34:48,216 WARNING [train.py:1204] (2/4) Exclude cut with ID _29853_210_7497_1_1532505600851_6642110_571-249460-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 13:34:49,675 WARNING [train.py:1204] (2/4) Exclude cut with ID _4068_210_2629_1_1526697350395_1546259_230-325568-0_sp1.1 from training. Duration: 0.92725 2023-10-02 13:34:49,676 WARNING [train.py:1204] (2/4) Exclude cut with ID _34415_210_8750_1_1533085213375_3451116_213-267570-0_sp1.1 from training. Duration: 0.963625 2023-10-02 13:34:51,582 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6767_210_3273_1_1527855783_1718932_142-127132-0_sp1.1 from training. Duration: 0.863625 2023-10-02 13:34:57,183 WARNING [train.py:1204] (2/4) Exclude cut with ID IC0653W0001-322925 from training. Duration: 0.896 2023-10-02 13:34:58,553 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_4528_210_3273_1_1527076654_3276777_209-219804-0_sp1.1 from training. Duration: 0.62725 2023-10-02 13:34:59,902 WARNING [train.py:1204] (2/4) Exclude cut with ID _55844_210_2346_1_1534471212569_7625707_187-204971-0_sp1.1 from training. Duration: 0.936375 2023-10-02 13:35:01,312 WARNING [train.py:1204] (2/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_369-325184-0 from training. Duration: 0.99 2023-10-02 13:35:01,347 WARNING [train.py:1204] (2/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_791-45149-0 from training. Duration: 0.86 2023-10-02 13:35:03,340 WARNING [train.py:1204] (2/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_317-211107-0_sp1.1 from training. Duration: 0.836375 2023-10-02 13:35:04,566 WARNING [train.py:1204] (2/4) Exclude cut with ID _30800_210_22382_1_1532499347904_4515959_488-330913-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 13:35:05,754 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.541e+02 1.971e+02 2.205e+02 2.598e+02 3.859e+02, threshold=4.411e+02, percent-clipped=0.0 2023-10-02 13:35:05,858 WARNING [train.py:1204] (2/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_506-42614-0_sp1.1 from training. Duration: 0.7181875 2023-10-02 13:35:06,218 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.self_attn_weights.pos_emb_skip_rate, batch_count=897553.3333333334, ans=0.0 2023-10-02 13:35:10,684 WARNING [train.py:1204] (2/4) Exclude cut with ID _8696_210_1876_1_1528545632440_3345058_209-54069-0 from training. Duration: 0.67 2023-10-02 13:35:15,376 WARNING [train.py:1204] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_215-202973-0_sp1.1 from training. Duration: 0.8 2023-10-02 13:35:16,745 WARNING [train.py:1204] (2/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_321-215742-0 from training. Duration: 0.5 2023-10-02 13:35:16,790 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_40896_210_15710_1_1533433956_7100802_428-32114-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 13:35:16,797 WARNING [train.py:1204] (2/4) Exclude cut with ID _27013_210_1389_1_1532437173240_3252649_467-263151-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 13:35:18,126 WARNING [train.py:1204] (2/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_734-129761-0_sp0.9 from training. Duration: 0.911125 2023-10-02 13:35:18,168 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_521-214276-0 from training. Duration: 0.44 2023-10-02 13:35:20,199 WARNING [train.py:1204] (2/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_80-147301-0_sp1.1 from training. Duration: 0.763625 2023-10-02 13:35:20,201 WARNING [train.py:1204] (2/4) Exclude cut with ID _9679_210_7497_1_1529129212906_4363929_56-307796-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 13:35:21,737 WARNING [train.py:1204] (2/4) Exclude cut with ID _14832_210_8750_1_1530691351539_3171348_1-54315-0 from training. Duration: 0.87 2023-10-02 13:35:21,737 WARNING [train.py:1204] (2/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_558-155750-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 13:35:25,728 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_142-309370-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 13:35:25,753 WARNING [train.py:1204] (2/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_18-46756-0_sp0.9 from training. Duration: 0.87775 2023-10-02 13:35:25,768 WARNING [train.py:1204] (2/4) Exclude cut with ID _61148_210_6897_1_1535094022726_4217610_279-212159-0_sp1.1 from training. Duration: 0.936375 2023-10-02 13:35:27,090 WARNING [train.py:1204] (2/4) Exclude cut with ID _21987_210_5854_1_1531821500482_3637038_164-2641-0_sp1.1 from training. Duration: 0.936375 2023-10-02 13:35:27,124 WARNING [train.py:1204] (2/4) Exclude cut with ID _8064_210_5399_1_1528372299291_4282973_395-340852-0_sp1.1 from training. Duration: 0.8454375 2023-10-02 13:35:29,799 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1077-184261-0_sp1.1 from training. Duration: 0.963625 2023-10-02 13:35:29,808 WARNING [train.py:1204] (2/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_1176-204992-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 13:35:33,122 INFO [train.py:1046] (2/4) Epoch 26, batch 1850, loss[loss=0.17, simple_loss=0.2579, pruned_loss=0.04109, over 24447.00 frames. ], tot_loss[loss=0.169, simple_loss=0.2459, pruned_loss=0.04608, over 4695949.86 frames. ], batch size: 66, lr: 3.95e-03, grad_scale: 16.0 2023-10-02 13:35:33,240 WARNING [train.py:1204] (2/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_563-347832-0_sp1.1 from training. Duration: 0.8090625 2023-10-02 13:35:34,639 WARNING [train.py:1204] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_380-204756-0_sp0.9 from training. Duration: 0.9555625 2023-10-02 13:35:40,685 WARNING [train.py:1204] (2/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_212-10501-0_sp1.1 from training. Duration: 0.8545625 2023-10-02 13:35:40,715 WARNING [train.py:1204] (2/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_445-117398-0 from training. Duration: 0.89 2023-10-02 13:35:45,232 WARNING [train.py:1204] (2/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_29-280622-0 from training. Duration: 0.82 2023-10-02 13:35:47,971 WARNING [train.py:1204] (2/4) Exclude cut with ID _26664_210_7770_1_1532258943280_3418270_187-80815-0 from training. Duration: 0.93 2023-10-02 13:35:50,841 WARNING [train.py:1204] (2/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_414-349640-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 13:35:50,864 WARNING [train.py:1204] (2/4) Exclude cut with ID _45021_210_9994_1_1533619751094_3330288_96-239010-0 from training. Duration: 0.91 2023-10-02 13:35:50,866 WARNING [train.py:1204] (2/4) Exclude cut with ID _5189_210_4557_1_1527248319428_3769229_199-53526-0_sp0.9 from training. Duration: 0.4666875 2023-10-02 13:35:51,051 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.conv_skip_rate, batch_count=897753.3333333334, ans=0.0 2023-10-02 13:35:53,185 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=897753.3333333334, ans=0.1 2023-10-02 13:36:01,127 WARNING [train.py:1204] (2/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_63-101996-0_sp1.1 from training. Duration: 0.8 2023-10-02 13:36:01,433 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.conv_module1.balancer2.min_positive, batch_count=897820.0, ans=0.05 2023-10-02 13:36:02,574 WARNING [train.py:1204] (2/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_199-86432-0 from training. Duration: 0.98 2023-10-02 13:36:05,829 WARNING [train.py:1204] (2/4) Exclude cut with ID _36399_210_16049_1_1532998771377_3698333_207-259013-0_sp1.1 from training. Duration: 0.92725 2023-10-02 13:36:05,868 WARNING [train.py:1204] (2/4) Exclude cut with ID _62574_210_2655_1_1535180407058_3933249_567-111756-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 13:36:10,547 WARNING [train.py:1204] (2/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_436-318113-0 from training. Duration: 0.75 2023-10-02 13:36:10,597 WARNING [train.py:1204] (2/4) Exclude cut with ID _53379_210_20034_1_1534222027363_3854654_43-305214-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 13:36:11,846 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_15126_210_6753_1_1530778677_2591185_113-157651-0_sp1.1 from training. Duration: 0.6181875 2023-10-02 13:36:13,673 WARNING [train.py:1204] (2/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_146-218763-0_sp1.1 from training. Duration: 0.7818125 2023-10-02 13:36:15,230 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.feed_forward2.hidden_balancer.prob, batch_count=897820.0, ans=0.125 2023-10-02 13:36:17,642 WARNING [train.py:1204] (2/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_714-310538-0_sp0.9 from training. Duration: 0.9555625 2023-10-02 13:36:19,113 WARNING [train.py:1204] (2/4) Exclude cut with ID _24271_210_12658_1_1531987270022_7190519_709-16721-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 13:36:20,610 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.1.balancer2.prob, batch_count=897886.6666666666, ans=0.125 2023-10-02 13:36:21,852 WARNING [train.py:1204] (2/4) Exclude cut with ID _5190_210_4557_1_1527244368799_373439_28-197030-0_sp0.9 from training. Duration: 0.8 2023-10-02 13:36:21,886 WARNING [train.py:1204] (2/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_267-197536-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 13:36:22,007 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.0.ff3_skip_rate, batch_count=897886.6666666666, ans=0.0 2023-10-02 13:36:23,287 WARNING [train.py:1204] (2/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_658-282610-0_sp1.1 from training. Duration: 0.4818125 2023-10-02 13:36:23,296 WARNING [train.py:1204] (2/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_257-67050-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 13:36:24,731 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_14906_210_7927_1_1530754190_9408633_961-53402-0_sp1.1 from training. Duration: 0.936375 2023-10-02 13:36:26,126 WARNING [train.py:1204] (2/4) Exclude cut with ID _60113_210_16271_1_1535164212481_4386079_490-280290-0_sp1.1 from training. Duration: 0.8909375 2023-10-02 13:36:29,521 WARNING [train.py:1204] (2/4) Exclude cut with ID _14100_210_8341_1_1530529320613_3678069_258-224599-0 from training. Duration: 0.77 2023-10-02 13:36:30,830 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6961_210_2087_1_1527937168_6815011_565-326898-0_sp1.1 from training. Duration: 0.936375 2023-10-02 13:36:33,484 WARNING [train.py:1204] (2/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_239-316802-0_sp1.1 from training. Duration: 0.62725 2023-10-02 13:36:34,835 WARNING [train.py:1204] (2/4) Exclude cut with ID _55857_210_2346_1_1534644007831_6898209_745-98391-0_sp1.1 from training. Duration: 0.6909375 2023-10-02 13:36:34,839 WARNING [train.py:1204] (2/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_154-135696-0 from training. Duration: 0.8 2023-10-02 13:36:34,841 WARNING [train.py:1204] (2/4) Exclude cut with ID _14368_210_4220_1_1530532714870_3595529_93-58002-0 from training. Duration: 0.57 2023-10-02 13:36:36,297 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9025W0005-507335 from training. Duration: 0.896 2023-10-02 13:36:36,372 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9029W0034-509435 from training. Duration: 0.64 2023-10-02 13:36:36,519 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.ff3_skip_rate, batch_count=897953.3333333334, ans=0.0 2023-10-02 13:36:38,324 WARNING [train.py:1204] (2/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_76-203867-0_sp1.1 from training. Duration: 0.6545625 2023-10-02 13:36:38,340 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_46639_210_8341_1_1533726088_4495500_66-346325-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 13:36:39,588 WARNING [train.py:1204] (2/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_145-137790-0_sp0.9 from training. Duration: 0.92225 2023-10-02 13:36:39,609 WARNING [train.py:1204] (2/4) Exclude cut with ID _36522_210_8887_1_1533177030611_1772478_7-327299-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 13:36:40,947 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9042W0009-515615 from training. Duration: 0.896 2023-10-02 13:36:40,961 WARNING [train.py:1204] (2/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_39-248870-0_sp0.9 from training. Duration: 0.7666875 2023-10-02 13:36:41,000 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_35441_210_16554_1_1533347572_4092680_362-242912-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 13:36:42,342 WARNING [train.py:1204] (2/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_216-343007-0_sp1.1 from training. Duration: 0.6 2023-10-02 13:36:44,169 WARNING [train.py:1204] (2/4) Exclude cut with ID _3867_210_1883_1_1526641200575_4475830_272-215563-0_sp0.9 from training. Duration: 0.6333125 2023-10-02 13:36:44,826 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.4.encoder.layers.0.feed_forward2.out_whiten, num_groups=1, num_channels=384, metric=6.27 vs. limit=15.0 2023-10-02 13:36:44,998 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.2.encoder.layers.1.nonlin_attention.whiten1, num_groups=1, num_channels=288, metric=5.23 vs. limit=10.0 2023-10-02 13:36:45,638 WARNING [train.py:1204] (2/4) Exclude cut with ID _21783_210_11357_1_1531742458730_3762679_553-175603-0_sp1.1 from training. Duration: 0.92725 2023-10-02 13:36:45,647 WARNING [train.py:1204] (2/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_963-187109-0 from training. Duration: 0.99 2023-10-02 13:36:47,269 INFO [train.py:1046] (2/4) Epoch 26, batch 1900, loss[loss=0.1846, simple_loss=0.2546, pruned_loss=0.05729, over 23805.00 frames. ], tot_loss[loss=0.1695, simple_loss=0.2467, pruned_loss=0.04611, over 4702360.29 frames. ], batch size: 212, lr: 3.95e-03, grad_scale: 16.0 2023-10-02 13:36:49,918 WARNING [train.py:1204] (2/4) Exclude cut with ID _44973_210_1511_1_1533794381163_3634770_495-211466-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 13:36:49,939 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9068W0002-529085 from training. Duration: 0.896 2023-10-02 13:36:49,959 WARNING [train.py:1204] (2/4) Exclude cut with ID _5294_210_1747_1_1527249643928_3696179_180-100713-0_sp1.1 from training. Duration: 0.736375 2023-10-02 13:36:51,367 WARNING [train.py:1204] (2/4) Exclude cut with ID _35799_210_5854_1_1533263487887_3529071_149-49689-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 13:36:56,803 WARNING [train.py:1204] (2/4) Exclude cut with ID _67725_210_27767_1_1535608756671_5105359_36-220794-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 13:36:58,219 WARNING [train.py:1204] (2/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_338-96520-0_sp1.1 from training. Duration: 0.8545625 2023-10-02 13:36:58,286 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9104W0004-547160 from training. Duration: 0.896 2023-10-02 13:37:00,256 WARNING [train.py:1204] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_330-327861-0 from training. Duration: 0.67 2023-10-02 13:37:00,348 WARNING [train.py:1204] (2/4) Exclude cut with ID _14087_210_2253_1_1530529224512_3014029_156-257607-0_sp0.9 from training. Duration: 0.92225 2023-10-02 13:37:00,399 WARNING [train.py:1204] (2/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_476-322050-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 13:37:00,406 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9118W0017-553385 from training. Duration: 0.896 2023-10-02 13:37:01,764 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9120W0028-554435 from training. Duration: 0.896 2023-10-02 13:37:04,605 WARNING [train.py:1204] (2/4) Exclude cut with ID _56524_210_9642_1_1534572011779_3619170_145-310842-0 from training. Duration: 0.99 2023-10-02 13:37:05,991 WARNING [train.py:1204] (2/4) Exclude cut with ID _51631_210_8254_1_1534129228420_4084794_270-222508-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 13:37:10,576 WARNING [train.py:1204] (2/4) Exclude cut with ID _14268_210_9206_1_1530622771285_4714099_331-105351-0 from training. Duration: 0.65 2023-10-02 13:37:10,844 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.attention_skip_rate, batch_count=898086.6666666666, ans=0.0 2023-10-02 13:37:12,124 WARNING [train.py:1204] (2/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_40-129756-0 from training. Duration: 0.81 2023-10-02 13:37:24,264 WARNING [train.py:1204] (2/4) Exclude cut with ID _3541_210_2477_1_1526475408709_3263973_231-104736-0 from training. Duration: 0.78 2023-10-02 13:37:25,775 WARNING [train.py:1204] (2/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_214-291369-0 from training. Duration: 0.63 2023-10-02 13:37:25,781 WARNING [train.py:1204] (2/4) Exclude cut with ID _62574_210_2655_1_1535180407058_3933249_369-84347-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 13:37:25,958 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.balancer2.prob, batch_count=898153.3333333334, ans=0.125 2023-10-02 13:37:27,149 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9210W0111-599240 from training. Duration: 0.896 2023-10-02 13:37:27,153 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_92-246270-0 from training. Duration: 0.85 2023-10-02 13:37:28,511 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_37152_210_9697_1_1533106573_3844188_29-239070-0 from training. Duration: 0.98 2023-10-02 13:37:28,577 WARNING [train.py:1204] (2/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_239-22076-0 from training. Duration: 0.85 2023-10-02 13:37:28,580 WARNING [train.py:1204] (2/4) Exclude cut with ID _65962_210_29927_1_1535436026519_4174930_368-44639-0_sp1.1 from training. Duration: 0.97275 2023-10-02 13:37:34,435 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.470e+02 1.789e+02 2.022e+02 2.205e+02 2.844e+02, threshold=4.043e+02, percent-clipped=0.0 2023-10-02 13:37:34,511 WARNING [train.py:1204] (2/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_152-212533-0 from training. Duration: 0.97 2023-10-02 13:37:37,327 WARNING [train.py:1204] (2/4) Exclude cut with ID _14603_210_4220_1_1530615688441_3097319_90-31383-0_sp1.1 from training. Duration: 0.7909375 2023-10-02 13:37:40,119 WARNING [train.py:1204] (2/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_196-152868-0_sp1.1 from training. Duration: 0.92725 2023-10-02 13:37:40,121 WARNING [train.py:1204] (2/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_140-181176-0 from training. Duration: 0.92 2023-10-02 13:37:42,835 WARNING [train.py:1204] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_168-32906-0_sp0.9 from training. Duration: 0.7666875 2023-10-02 13:37:47,527 WARNING [train.py:1204] (2/4) Exclude cut with ID _14922_210_6228_1_1530707435273_3693310_13-165512-0 from training. Duration: 0.82 2023-10-02 13:37:47,556 WARNING [train.py:1204] (2/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_381-317791-0_sp0.9 from training. Duration: 0.811125 2023-10-02 13:37:51,174 INFO [scaling.py:1118] (2/4) WithLoss: name=encoder.encoders.1.encoder.layers.1.self_attn_weights, loss-sum=0.000e+00 2023-10-02 13:37:52,433 WARNING [train.py:1204] (2/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_63-267585-0_sp1.1 from training. Duration: 0.8181875 2023-10-02 13:37:52,435 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_326-303945-0_sp1.1 from training. Duration: 0.9 2023-10-02 13:37:52,446 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_194-143381-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 13:37:54,334 WARNING [train.py:1204] (2/4) Exclude cut with ID _39290_210_4220_1_1533862778760_3075118_194-16667-0_sp1.1 from training. Duration: 0.963625 2023-10-02 13:37:54,431 WARNING [train.py:1204] (2/4) Exclude cut with ID _15012_210_1968_1_1530856820429_5095350_662-210469-0_sp0.9 from training. Duration: 0.6333125 2023-10-02 13:37:55,643 WARNING [train.py:1204] (2/4) Exclude cut with ID _4697_210_2069_1_1526815801953_3823098_68-219807-0_sp1.1 from training. Duration: 0.42725 2023-10-02 13:37:55,701 WARNING [train.py:1204] (2/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_321-22821-0_sp0.9 from training. Duration: 0.988875 2023-10-02 13:37:59,731 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_14056_210_4163_1_1530604483_7572827_1037-45317-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 13:37:59,732 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_403-39619-0_sp1.1 from training. Duration: 0.763625 2023-10-02 13:38:01,077 INFO [train.py:1046] (2/4) Epoch 26, batch 1950, loss[loss=0.1863, simple_loss=0.2611, pruned_loss=0.05571, over 23618.00 frames. ], tot_loss[loss=0.1698, simple_loss=0.2475, pruned_loss=0.04604, over 4714534.37 frames. ], batch size: 256, lr: 3.95e-03, grad_scale: 16.0 2023-10-02 13:38:01,168 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_20120_210_6753_1_1531643035_5482859_510-96996-0_sp0.9 from training. Duration: 0.9666875 2023-10-02 13:38:01,170 WARNING [train.py:1204] (2/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_225-180155-0_sp1.1 from training. Duration: 0.92725 2023-10-02 13:38:02,550 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_278-272612-0_sp0.9 from training. Duration: 0.811125 2023-10-02 13:38:02,632 WARNING [train.py:1204] (2/4) Exclude cut with ID _53713_210_24116_1_1534314487762_7416751_691-70244-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 13:38:05,903 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6788_210_1388_1_1527898698_4562021_91-197981-0_sp1.1 from training. Duration: 0.7545625 2023-10-02 13:38:08,729 WARNING [train.py:1204] (2/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_60-18123-0_sp0.9 from training. Duration: 0.97775 2023-10-02 13:38:08,775 WARNING [train.py:1204] (2/4) Exclude cut with ID _35799_210_5854_1_1533263487887_3529071_5-214617-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 13:38:10,088 WARNING [train.py:1204] (2/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_890-5117-0_sp1.1 from training. Duration: 0.7090625 2023-10-02 13:38:12,765 WARNING [train.py:1204] (2/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_660-211088-0 from training. Duration: 0.62 2023-10-02 13:38:12,812 WARNING [train.py:1204] (2/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_314-126819-0_sp0.9 from training. Duration: 0.5555625 2023-10-02 13:38:12,846 WARNING [train.py:1204] (2/4) Exclude cut with ID _21632_210_2253_1_1531994001765_3744140_362-66225-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 13:38:14,702 WARNING [train.py:1204] (2/4) Exclude cut with ID _61118_210_9527_1_1534897668884_3752630_173-258555-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 13:38:17,399 WARNING [train.py:1204] (2/4) Exclude cut with ID _7719_210_3525_1_1528456153163_4296960_88-250259-0_sp0.9 from training. Duration: 0.9333125 2023-10-02 13:38:17,423 WARNING [train.py:1204] (2/4) Exclude cut with ID _15140_210_9288_1_1530791388450_4880149_539-153708-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 13:38:17,450 WARNING [train.py:1204] (2/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_253-42544-0_sp1.1 from training. Duration: 0.97275 2023-10-02 13:38:19,553 WARNING [train.py:1204] (2/4) Exclude cut with ID _33640_210_2655_1_1533117605479_4855469_479-296877-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 13:38:22,352 WARNING [train.py:1204] (2/4) Exclude cut with ID 3033-130750-0096-55598-0_sp1.1 from training. Duration: 0.7545625 2023-10-02 13:38:22,373 WARNING [train.py:1204] (2/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_191-41023-0_sp1.1 from training. Duration: 0.6454375 2023-10-02 13:38:23,570 WARNING [train.py:1204] (2/4) Exclude cut with ID _8365_210_2064_1_1528592311070_4677690_431-294102-0_sp1.1 from training. Duration: 0.8090625 2023-10-02 13:38:24,220 WARNING [train.py:1204] (2/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_893-296030-0_sp1.1 from training. Duration: 0.97275 2023-10-02 13:38:26,767 WARNING [train.py:1204] (2/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_401-343820-0_sp1.1 from training. Duration: 0.97275 2023-10-02 13:38:30,188 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.3.encoder.layers.1.whiten, num_groups=1, num_channels=512, metric=5.26 vs. limit=12.0 2023-10-02 13:38:30,922 WARNING [train.py:1204] (2/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_127-566-0_sp1.1 from training. Duration: 0.763625 2023-10-02 13:38:30,923 WARNING [train.py:1204] (2/4) Exclude cut with ID _14100_210_8341_1_1530529320613_3678069_126-272847-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 13:38:30,943 WARNING [train.py:1204] (2/4) Exclude cut with ID _15133_210_6228_1_1530794512074_3080379_29-264458-0_sp1.1 from training. Duration: 0.52725 2023-10-02 13:38:30,944 WARNING [train.py:1204] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_127-119109-0 from training. Duration: 0.72 2023-10-02 13:38:31,002 WARNING [train.py:1204] (2/4) Exclude cut with ID _6537_210_1883_1_1527987559710_3597129_382-133062-0_sp1.1 from training. Duration: 0.6181875 2023-10-02 13:38:31,012 WARNING [train.py:1204] (2/4) Exclude cut with ID _29529_210_7234_1_1532393961455_3977460_391-219682-0_sp1.1 from training. Duration: 0.936375 2023-10-02 13:38:32,407 WARNING [train.py:1204] (2/4) Exclude cut with ID _37537_210_2253_1_1533171758568_3421949_96-137189-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 13:38:35,126 WARNING [train.py:1204] (2/4) Exclude cut with ID _58414_210_8254_1_1535421609162_7151182_15-276301-0_sp1.1 from training. Duration: 0.97275 2023-10-02 13:38:39,612 WARNING [train.py:1204] (2/4) Exclude cut with ID _59502_210_12210_1_1535107024698_1835139_110-289074-0_sp1.1 from training. Duration: 0.92725 2023-10-02 13:38:42,382 WARNING [train.py:1204] (2/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_367-61330-0_sp1.1 from training. Duration: 0.6818125 2023-10-02 13:38:45,187 WARNING [train.py:1204] (2/4) Exclude cut with ID _45297_210_2721_1_1533715351381_3264949_353-9541-0_sp1.1 from training. Duration: 0.8818125 2023-10-02 13:38:45,229 WARNING [train.py:1204] (2/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_85-138257-0_sp0.9 from training. Duration: 0.788875 2023-10-02 13:38:46,942 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_3787_210_2040_1_1526477544_5478586_603-120324-0 from training. Duration: 0.81 2023-10-02 13:38:47,000 WARNING [train.py:1204] (2/4) Exclude cut with ID _42863_210_9070_1_1533902398788_3696920_411-204131-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 13:38:48,592 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.1.conv_skip_rate, batch_count=898553.3333333334, ans=0.0 2023-10-02 13:38:50,421 WARNING [train.py:1204] (2/4) Exclude cut with ID _30196_210_1664_1_1533708496522_7074140_822-289909-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 13:38:51,872 WARNING [train.py:1204] (2/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_32-335103-0_sp0.9 from training. Duration: 0.911125 2023-10-02 13:38:51,919 WARNING [train.py:1204] (2/4) Exclude cut with ID _6493_210_1747_1_1528459210424_4012470_513-140967-0_sp0.9 from training. Duration: 0.87775 2023-10-02 13:39:00,524 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_47896_210_20508_1_1534492518_5958868_439-257766-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 13:39:00,585 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_1294-63152-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 13:39:03,416 WARNING [train.py:1204] (2/4) Exclude cut with ID _59170_210_6721_1_1534983633960_4203335_319-166512-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 13:39:04,990 WARNING [train.py:1204] (2/4) Exclude cut with ID _3541_210_2477_1_1526475408709_3263973_146-3470-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 13:39:08,113 WARNING [train.py:1204] (2/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_678-159307-0_sp1.1 from training. Duration: 0.77275 2023-10-02 13:39:09,377 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_343-48822-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 13:39:09,440 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_4692_210_3528_1_1526899755_4323254_107-44401-0 from training. Duration: 0.86 2023-10-02 13:39:09,445 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6291_210_3273_1_1527683649_1078498_74-292608-0_sp1.1 from training. Duration: 0.6545625 2023-10-02 13:39:11,160 WARNING [train.py:1204] (2/4) Exclude cut with ID _55644_210_11856_1_1534593599582_4103719_293-248694-0_sp1.1 from training. Duration: 0.97275 2023-10-02 13:39:12,522 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_3172_210_2087_1_1526292929_3987865_197-97762-0 from training. Duration: 0.75 2023-10-02 13:39:15,103 INFO [train.py:1046] (2/4) Epoch 26, batch 2000, loss[loss=0.162, simple_loss=0.2573, pruned_loss=0.03338, over 24334.00 frames. ], tot_loss[loss=0.1702, simple_loss=0.2481, pruned_loss=0.04613, over 4714488.39 frames. ], batch size: 74, lr: 3.95e-03, grad_scale: 32.0 2023-10-02 13:39:15,185 WARNING [train.py:1204] (2/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_264-348896-0_sp0.9 from training. Duration: 0.9444375 2023-10-02 13:39:16,629 WARNING [train.py:1204] (2/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_209-205365-0_sp0.9 from training. Duration: 0.87775 2023-10-02 13:39:18,451 WARNING [train.py:1204] (2/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_222-198127-0_sp0.9 from training. Duration: 0.9333125 2023-10-02 13:39:18,500 WARNING [train.py:1204] (2/4) Exclude cut with ID _57572_210_5517_1_1534921219624_3809484_52-43530-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 13:39:20,385 WARNING [train.py:1204] (2/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_96-320736-0_sp1.1 from training. Duration: 0.7818125 2023-10-02 13:39:23,068 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_13091_210_7925_1_1530609447_8987354_1159-225508-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 13:39:26,305 WARNING [train.py:1204] (2/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_237-44332-0 from training. Duration: 0.94 2023-10-02 13:39:27,616 WARNING [train.py:1204] (2/4) Exclude cut with ID _7970_210_1883_1_1528455616296_3472230_25-178407-0_sp0.9 from training. Duration: 0.788875 2023-10-02 13:39:27,956 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.feed_forward1.hidden_balancer.prob, batch_count=898686.6666666666, ans=0.125 2023-10-02 13:39:29,083 WARNING [train.py:1204] (2/4) Exclude cut with ID _29003_210_19596_1_1532507391720_3894098_357-257062-0_sp1.1 from training. Duration: 0.936375 2023-10-02 13:39:30,563 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_809-17156-0 from training. Duration: 0.73 2023-10-02 13:39:31,981 WARNING [train.py:1204] (2/4) Exclude cut with ID _7160_210_1876_1_1527940923231_3703799_302-150728-0_sp1.1 from training. Duration: 0.7090625 2023-10-02 13:39:31,998 WARNING [train.py:1204] (2/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_444-28041-0_sp0.9 from training. Duration: 0.9444375 2023-10-02 13:39:33,398 WARNING [train.py:1204] (2/4) Exclude cut with ID _52892_210_6721_1_1534378941259_4124129_278-251666-0_sp1.1 from training. Duration: 0.92725 2023-10-02 13:39:35,183 WARNING [train.py:1204] (2/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_115-326778-0 from training. Duration: 0.93 2023-10-02 13:39:36,338 WARNING [train.py:1204] (2/4) Exclude cut with ID _41391_210_7994_1_1533603771872_3638201_31-188961-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 13:39:39,018 WARNING [train.py:1204] (2/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_1073-6264-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 13:39:39,038 WARNING [train.py:1204] (2/4) Exclude cut with ID _66868_210_9527_1_1535509287954_4561140_435-259515-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 13:39:39,131 WARNING [train.py:1204] (2/4) Exclude cut with ID _14886_210_4220_1_1530702020355_3516190_159-73748-0 from training. Duration: 0.83 2023-10-02 13:39:40,397 WARNING [train.py:1204] (2/4) Exclude cut with ID _15150_210_8341_1_1530752411803_7256384_469-239509-0_sp1.1 from training. Duration: 0.6181875 2023-10-02 13:39:41,797 WARNING [train.py:1204] (2/4) Exclude cut with ID _8064_210_5399_1_1528372299291_4282973_395-340852-0 from training. Duration: 0.93 2023-10-02 13:39:41,801 WARNING [train.py:1204] (2/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_138-187031-0_sp1.1 from training. Duration: 0.9 2023-10-02 13:39:45,938 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_13757_210_3528_1_1530440682_4107239_48-106347-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 13:39:45,992 WARNING [train.py:1204] (2/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_164-224052-0_sp1.1 from training. Duration: 0.636375 2023-10-02 13:39:47,626 WARNING [train.py:1204] (2/4) Exclude cut with ID _59288_210_5854_1_1535077757774_3412246_408-322648-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 13:39:47,675 WARNING [train.py:1204] (2/4) Exclude cut with ID _30191_210_1664_1_1533189915047_7196670_827-36359-0_sp1.1 from training. Duration: 0.963625 2023-10-02 13:39:49,070 WARNING [train.py:1204] (2/4) Exclude cut with ID _4680_210_3525_1_1527249757953_893386_75-78979-0_sp1.1 from training. Duration: 0.82725 2023-10-02 13:39:49,121 WARNING [train.py:1204] (2/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_383-165234-0 from training. Duration: 0.94 2023-10-02 13:39:53,709 WARNING [train.py:1204] (2/4) Exclude cut with ID _19896_210_1883_1_1531709022662_6649569_94-229927-0 from training. Duration: 0.97 2023-10-02 13:39:53,715 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6788_210_1388_1_1527898698_4562021_180-75288-0_sp1.1 from training. Duration: 0.9 2023-10-02 13:39:53,724 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_26433_210_12108_1_1532157158_4056480_54-321349-0_sp1.1 from training. Duration: 0.97275 2023-10-02 13:39:59,629 WARNING [train.py:1204] (2/4) Exclude cut with ID _56108_210_16380_1_1534505446159_3887959_265-161493-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 13:40:02,297 WARNING [train.py:1204] (2/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_395-111602-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 13:40:02,303 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_154-316450-0_sp0.9 from training. Duration: 0.8444375 2023-10-02 13:40:02,373 WARNING [train.py:1204] (2/4) Exclude cut with ID _43552_210_8913_1_1533726031059_3523429_381-116041-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 13:40:02,602 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.feed_forward1.hidden_balancer.prob, batch_count=898886.6666666666, ans=0.125 2023-10-02 13:40:03,567 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.624e+02 1.946e+02 2.166e+02 2.855e+02 3.639e+02, threshold=4.333e+02, percent-clipped=0.0 2023-10-02 13:40:03,745 WARNING [train.py:1204] (2/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_65-104124-0_sp1.1 from training. Duration: 0.963625 2023-10-02 13:40:03,790 WARNING [train.py:1204] (2/4) Exclude cut with ID _6536_210_1883_1_1527850783339_3319069_169-23811-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 13:40:05,068 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_289-180904-0_sp0.9 from training. Duration: 0.8444375 2023-10-02 13:40:05,086 WARNING [train.py:1204] (2/4) Exclude cut with ID _56406_210_9288_1_1534899618664_3496149_226-242400-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 13:40:05,327 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=898886.6666666666, ans=0.1 2023-10-02 13:40:06,467 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_24899_210_16554_1_1532046422_3827462_276-216330-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 13:40:08,501 WARNING [train.py:1204] (2/4) Exclude cut with ID _29345_210_4381_1_1532689168263_3860169_198-174328-0_sp1.1 from training. Duration: 0.82725 2023-10-02 13:40:09,939 WARNING [train.py:1204] (2/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_338-96520-0 from training. Duration: 0.94 2023-10-02 13:40:12,801 WARNING [train.py:1204] (2/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_7-111954-0_sp1.1 from training. Duration: 0.5090625 2023-10-02 13:40:14,245 WARNING [train.py:1204] (2/4) Exclude cut with ID _36692_210_5665_1_1533608967420_398809_8-16261-0_sp1.1 from training. Duration: 0.97275 2023-10-02 13:40:14,492 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.attention_skip_rate, batch_count=898953.3333333334, ans=0.0 2023-10-02 13:40:16,864 WARNING [train.py:1204] (2/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_86-212657-0_sp1.1 from training. Duration: 0.97275 2023-10-02 13:40:16,871 WARNING [train.py:1204] (2/4) Exclude cut with ID _44659_210_2721_1_1534036120061_6614860_626-298884-0_sp1.1 from training. Duration: 0.8818125 2023-10-02 13:40:20,758 WARNING [train.py:1204] (2/4) Exclude cut with ID _49755_210_18053_1_1534581005846_3218709_120-257918-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 13:40:23,542 WARNING [train.py:1204] (2/4) Exclude cut with ID _21881_210_3525_1_1531825242757_3798380_400-113821-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 13:40:23,545 WARNING [train.py:1204] (2/4) Exclude cut with ID _21783_210_11357_1_1531742458730_3762679_387-236773-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 13:40:25,489 WARNING [train.py:1204] (2/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_613-232856-0_sp1.1 from training. Duration: 0.4909375 2023-10-02 13:40:25,514 WARNING [train.py:1204] (2/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_181-78632-0_sp0.9 from training. Duration: 0.8666875 2023-10-02 13:40:29,531 INFO [train.py:1046] (2/4) Epoch 26, batch 2050, loss[loss=0.1458, simple_loss=0.2316, pruned_loss=0.03001, over 24314.00 frames. ], tot_loss[loss=0.17, simple_loss=0.2474, pruned_loss=0.04626, over 4695461.77 frames. ], batch size: 61, lr: 3.95e-03, grad_scale: 8.0 2023-10-02 13:40:29,585 WARNING [train.py:1204] (2/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_175-17485-0_sp1.1 from training. Duration: 0.97275 2023-10-02 13:40:30,926 WARNING [train.py:1204] (2/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_1004-183422-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 13:40:32,462 WARNING [train.py:1204] (2/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_32-65962-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 13:40:33,830 WARNING [train.py:1204] (2/4) Exclude cut with ID _15458_210_8341_1_1530858614263_3834410_40-298664-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 13:40:36,577 WARNING [train.py:1204] (2/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_380-261863-0_sp1.1 from training. Duration: 0.963625 2023-10-02 13:40:38,586 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.bypass.scale_min, batch_count=899020.0, ans=0.2 2023-10-02 13:40:39,731 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_372-173012-0_sp1.1 from training. Duration: 0.863625 2023-10-02 13:40:41,114 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_57-82205-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 13:40:42,409 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_3691_210_3528_1_1526468169_4062053_355-334432-0_sp0.9 from training. Duration: 0.9666875 2023-10-02 13:40:43,835 WARNING [train.py:1204] (2/4) Exclude cut with ID _19023_210_4353_1_1531378892552_5820780_28-36615-0 from training. Duration: 0.97 2023-10-02 13:40:43,840 WARNING [train.py:1204] (2/4) Exclude cut with ID _29853_210_7497_1_1532505600851_6642110_286-251333-0_sp1.1 from training. Duration: 0.92725 2023-10-02 13:40:45,196 WARNING [train.py:1204] (2/4) Exclude cut with ID _31602_210_5854_1_1532677021456_4458250_518-80679-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 13:40:45,440 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.bypass.scale_min, batch_count=899086.6666666666, ans=0.2 2023-10-02 13:40:46,473 WARNING [train.py:1204] (2/4) Exclude cut with ID _14922_210_6228_1_1530707435273_3693310_38-187851-0_sp0.9 from training. Duration: 0.688875 2023-10-02 13:40:56,187 WARNING [train.py:1204] (2/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_39-248870-0_sp1.1 from training. Duration: 0.62725 2023-10-02 13:40:56,205 WARNING [train.py:1204] (2/4) Exclude cut with ID _16908_210_5724_1_1531119157248_3772700_154-31455-0_sp1.1 from training. Duration: 0.97275 2023-10-02 13:40:57,988 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8453_210_4211_1_1528502940_8562730_347-330673-0 from training. Duration: 0.72 2023-10-02 13:41:00,736 WARNING [train.py:1204] (2/4) Exclude cut with ID _30191_210_1664_1_1533189915047_7196670_674-204247-0_sp1.1 from training. Duration: 0.97275 2023-10-02 13:41:02,173 WARNING [train.py:1204] (2/4) Exclude cut with ID _8833_210_5219_1_1528592665210_7553059_329-125156-0 from training. Duration: 0.82 2023-10-02 13:41:02,213 WARNING [train.py:1204] (2/4) Exclude cut with ID _4779_210_2084_1_1527318022944_3914893_500-285013-0_sp1.1 from training. Duration: 0.62725 2023-10-02 13:41:04,939 WARNING [train.py:1204] (2/4) Exclude cut with ID _14421_210_8750_1_1530604795825_3542290_190-313578-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 13:41:06,400 WARNING [train.py:1204] (2/4) Exclude cut with ID _35799_210_5854_1_1533263487887_3529071_538-273574-0_sp1.1 from training. Duration: 0.936375 2023-10-02 13:41:06,484 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_14928_210_5632_1_1530773461_4714000_454-112198-0_sp1.1 from training. Duration: 0.6 2023-10-02 13:41:07,744 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_468-143126-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 13:41:10,392 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_4528_210_3273_1_1527076654_3276777_258-121681-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 13:41:10,485 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_397-132565-0_sp1.1 from training. Duration: 0.9 2023-10-02 13:41:12,273 WARNING [train.py:1204] (2/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_454-123002-0_sp0.9 from training. Duration: 0.7666875 2023-10-02 13:41:15,111 WARNING [train.py:1204] (2/4) Exclude cut with ID _27052_210_5986_1_1532329197187_3585009_297-86890-0_sp1.1 from training. Duration: 0.936375 2023-10-02 13:41:17,776 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_51225_210_3601_1_1534400634_2327383_55-301844-0_sp1.1 from training. Duration: 0.8454375 2023-10-02 13:41:19,673 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6786_210_1388_1_1527839699_4194495_378-46070-0_sp0.9 from training. Duration: 0.8 2023-10-02 13:41:21,089 WARNING [train.py:1204] (2/4) Exclude cut with ID _42334_210_7770_1_1534206610396_3605880_55-19758-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 13:41:25,735 WARNING [train.py:1204] (2/4) Exclude cut with ID _14884_210_2252_1_1530707459689_5561250_114-201399-0_sp0.9 from training. Duration: 0.8444375 2023-10-02 13:41:30,593 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_99-251314-0_sp0.9 from training. Duration: 0.9666875 2023-10-02 13:41:30,841 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.ff2_skip_rate, batch_count=899286.6666666666, ans=0.0 2023-10-02 13:41:31,943 WARNING [train.py:1204] (2/4) Exclude cut with ID _26528_210_9070_1_1532570410842_3851310_497-97218-0 from training. Duration: 0.86 2023-10-02 13:41:36,266 WARNING [train.py:1204] (2/4) Exclude cut with ID _55328_210_27767_1_1534580973260_4088170_230-317241-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 13:41:37,574 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_125-203105-0_sp0.9 from training. Duration: 0.888875 2023-10-02 13:41:37,782 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.feed_forward1.out_proj.dropout_p, batch_count=899286.6666666666, ans=0.1 2023-10-02 13:41:38,986 WARNING [train.py:1204] (2/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_55-31904-0_sp1.1 from training. Duration: 0.8 2023-10-02 13:41:40,533 WARNING [train.py:1204] (2/4) Exclude cut with ID _8064_210_5399_1_1528372299291_4282973_18-174647-0 from training. Duration: 0.91 2023-10-02 13:41:43,097 INFO [train.py:1046] (2/4) Epoch 26, batch 2100, loss[loss=0.1565, simple_loss=0.2287, pruned_loss=0.04215, over 23418.00 frames. ], tot_loss[loss=0.1687, simple_loss=0.2457, pruned_loss=0.04587, over 4700573.06 frames. ], batch size: 105, lr: 3.95e-03, grad_scale: 8.0 2023-10-02 13:41:45,035 WARNING [train.py:1204] (2/4) Exclude cut with ID IC0165W0005-81726 from training. Duration: 0.952 2023-10-02 13:41:45,036 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_31583_210_3852_1_1532595851_4024413_148-171847-0_sp1.1 from training. Duration: 0.963625 2023-10-02 13:41:45,093 WARNING [train.py:1204] (2/4) Exclude cut with ID _21989_210_5854_1_1531899214931_3271360_116-138769-0_sp1.1 from training. Duration: 0.936375 2023-10-02 13:41:46,408 WARNING [train.py:1204] (2/4) Exclude cut with ID _66070_210_7640_1_1535441393096_7805310_489-292631-0_sp1.1 from training. Duration: 0.7181875 2023-10-02 13:41:46,485 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_20120_210_6753_1_1531643035_5482859_202-27960-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 13:41:46,492 WARNING [train.py:1204] (2/4) Exclude cut with ID _5294_210_1747_1_1527249643928_3696179_342-270278-0 from training. Duration: 0.55 2023-10-02 13:41:47,823 WARNING [train.py:1204] (2/4) Exclude cut with ID _38323_210_12108_1_1533121233369_3303049_416-11919-0 from training. Duration: 0.96 2023-10-02 13:41:47,950 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6545_210_2040_1_1527774670_4245258_407-48070-0_sp0.9 from training. Duration: 0.8444375 2023-10-02 13:41:51,146 WARNING [train.py:1204] (2/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_503-30719-0_sp1.1 from training. Duration: 0.8909375 2023-10-02 13:41:52,900 WARNING [train.py:1204] (2/4) Exclude cut with ID _49643_210_7989_1_1534640244224_3939031_234-326497-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 13:41:54,269 WARNING [train.py:1204] (2/4) Exclude cut with ID _14170_210_5219_1_1530525949868_4469650_301-77873-0_sp1.1 from training. Duration: 0.963625 2023-10-02 13:41:55,677 WARNING [train.py:1204] (2/4) Exclude cut with ID _45297_210_2721_1_1533715351381_3264949_152-154697-0_sp1.1 from training. Duration: 0.97275 2023-10-02 13:41:55,684 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_3371_210_1794_1_1526384462_5580777_396-339812-0 from training. Duration: 0.85 2023-10-02 13:41:56,540 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.1.encoder.layers.1.self_attn_weights.whiten_keys, num_groups=4, num_channels=128, metric=2.65 vs. limit=6.0 2023-10-02 13:41:57,008 WARNING [train.py:1204] (2/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_399-156791-0_sp1.1 from training. Duration: 0.7545625 2023-10-02 13:41:57,056 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7696_210_2040_1_1528207167_3716099_117-125434-0 from training. Duration: 0.76 2023-10-02 13:41:57,060 WARNING [train.py:1204] (2/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_96-244299-0 from training. Duration: 0.88 2023-10-02 13:41:59,849 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.0.layers.1.conv_module1.whiten, num_groups=1, num_channels=192, metric=8.34 vs. limit=15.0 2023-10-02 13:42:00,232 WARNING [train.py:1204] (2/4) Exclude cut with ID _37892_210_19596_1_1533198677897_3964058_519-343462-0_sp1.1 from training. Duration: 0.92725 2023-10-02 13:42:00,260 WARNING [train.py:1204] (2/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_202-183922-0_sp1.1 from training. Duration: 0.77275 2023-10-02 13:42:00,266 WARNING [train.py:1204] (2/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_453-133983-0 from training. Duration: 0.78 2023-10-02 13:42:00,529 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.balancer2.prob, batch_count=899420.0, ans=0.125 2023-10-02 13:42:01,637 WARNING [train.py:1204] (2/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_178-39722-0_sp1.1 from training. Duration: 0.4454375 2023-10-02 13:42:07,223 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_15126_210_6753_1_1530778677_2591185_113-157651-0 from training. Duration: 0.68 2023-10-02 13:42:07,224 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_271-234962-0_sp1.1 from training. Duration: 0.7181875 2023-10-02 13:42:08,679 WARNING [train.py:1204] (2/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_195-64967-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 13:42:09,971 WARNING [train.py:1204] (2/4) Exclude cut with ID _43552_210_8913_1_1533726031059_3523429_365-215065-0_sp1.1 from training. Duration: 0.936375 2023-10-02 13:42:14,415 WARNING [train.py:1204] (2/4) Exclude cut with ID _31602_210_5854_1_1532677021456_4458250_249-96604-0_sp0.9 from training. Duration: 0.988875 2023-10-02 13:42:14,465 WARNING [train.py:1204] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_307-158462-0 from training. Duration: 0.69 2023-10-02 13:42:14,507 WARNING [train.py:1204] (2/4) Exclude cut with ID _30918_210_6400_1_1532754078600_3125920_407-223394-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 13:42:14,509 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6673_210_3273_1_1527767790_3173240_351-167954-0_sp0.9 from training. Duration: 0.5333125 2023-10-02 13:42:16,136 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.bypass.skip_rate, batch_count=899486.6666666666, ans=0.09899494936611666 2023-10-02 13:42:17,195 WARNING [train.py:1204] (2/4) Exclude cut with ID _4866_210_2577_1_1527404412284_4364659_256-306830-0 from training. Duration: 0.87 2023-10-02 13:42:17,247 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_17613_210_2151_1_1531137569_2366160_222-131118-0_sp1.1 from training. Duration: 0.963625 2023-10-02 13:42:17,259 WARNING [train.py:1204] (2/4) Exclude cut with ID _45560_210_21982_1_1533812603951_3584999_111-6376-0 from training. Duration: 0.84 2023-10-02 13:42:17,290 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_216-73841-0 from training. Duration: 0.96 2023-10-02 13:42:18,577 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_14058_210_4163_1_1530690953_6327180_49-210687-0 from training. Duration: 0.94 2023-10-02 13:42:21,258 WARNING [train.py:1204] (2/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_506-42614-0_sp0.9 from training. Duration: 0.87775 2023-10-02 13:42:23,376 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_3133_210_2087_1_1526106446_2324690_25-332138-0_sp1.1 from training. Duration: 0.82725 2023-10-02 13:42:26,616 WARNING [train.py:1204] (2/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_589-9070-0_sp1.1 from training. Duration: 0.6454375 2023-10-02 13:42:26,727 WARNING [train.py:1204] (2/4) Exclude cut with ID _6194_210_1534_1_1527511826160_3941194_14-282876-0_sp1.1 from training. Duration: 0.6454375 2023-10-02 13:42:28,086 WARNING [train.py:1204] (2/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_1166-90654-0_sp1.1 from training. Duration: 0.92725 2023-10-02 13:42:29,473 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_14056_210_4163_1_1530604483_7572827_218-255858-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 13:42:29,475 WARNING [train.py:1204] (2/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_1285-256622-0 from training. Duration: 0.84 2023-10-02 13:42:29,492 WARNING [train.py:1204] (2/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_205-286261-0_sp1.1 from training. Duration: 0.963625 2023-10-02 13:42:29,514 WARNING [train.py:1204] (2/4) Exclude cut with ID _68777_210_4353_1_1535810390731_3117904_6-154119-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 13:42:31,200 WARNING [train.py:1204] (2/4) Exclude cut with ID _22982_210_1883_1_1531882790447_5449409_565-199393-0_sp1.1 from training. Duration: 0.92725 2023-10-02 13:42:31,209 WARNING [train.py:1204] (2/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_278-154492-0 from training. Duration: 0.76 2023-10-02 13:42:32,635 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_426-313557-0 from training. Duration: 0.53 2023-10-02 13:42:33,880 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.611e+02 1.840e+02 2.077e+02 2.431e+02 3.502e+02, threshold=4.154e+02, percent-clipped=0.0 2023-10-02 13:42:33,952 WARNING [train.py:1204] (2/4) Exclude cut with ID _41817_210_18835_1_1533434477160_7423049_555-293448-0 from training. Duration: 0.8 2023-10-02 13:42:36,777 WARNING [train.py:1204] (2/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_32-335103-0_sp1.1 from training. Duration: 0.7454375 2023-10-02 13:42:39,582 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_2655_210_2087_1_1525688959_3897400_202-147557-0_sp1.1 from training. Duration: 0.77275 2023-10-02 13:42:39,613 WARNING [train.py:1204] (2/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_158-250116-0 from training. Duration: 0.62 2023-10-02 13:42:44,368 WARNING [train.py:1204] (2/4) Exclude cut with ID _55429_210_10626_1_1534467588527_7174143_528-88083-0_sp1.1 from training. Duration: 0.963625 2023-10-02 13:42:47,071 WARNING [train.py:1204] (2/4) Exclude cut with ID _8030_210_7497_1_1528444886781_3531089_32-31837-0_sp1.1 from training. Duration: 0.8818125 2023-10-02 13:42:48,346 WARNING [train.py:1204] (2/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_744-86168-0_sp1.1 from training. Duration: 0.97275 2023-10-02 13:42:48,354 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1113-7369-0_sp0.9 from training. Duration: 0.9444375 2023-10-02 13:42:48,377 WARNING [train.py:1204] (2/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_124-345840-0_sp0.9 from training. Duration: 0.57775 2023-10-02 13:42:48,416 WARNING [train.py:1204] (2/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_328-264776-0_sp0.9 from training. Duration: 0.7444375 2023-10-02 13:42:49,878 WARNING [train.py:1204] (2/4) Exclude cut with ID _18895_210_6228_1_1531357274234_3784930_469-166615-0_sp1.1 from training. Duration: 0.963625 2023-10-02 13:42:49,884 WARNING [train.py:1204] (2/4) Exclude cut with ID _8395_210_1531_1_1528597871523_4411579_431-132470-0_sp0.9 from training. Duration: 0.82225 2023-10-02 13:42:51,204 WARNING [train.py:1204] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_554-45617-0_sp1.1 from training. Duration: 0.9 2023-10-02 13:42:51,233 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_35361_210_4238_1_1533262810_7793476_524-188242-0_sp1.1 from training. Duration: 0.92725 2023-10-02 13:42:52,698 WARNING [train.py:1204] (2/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_938-205135-0 from training. Duration: 0.87 2023-10-02 13:42:54,677 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_5931_210_2040_1_1527428481_4547999_321-193614-0 from training. Duration: 0.67 2023-10-02 13:42:54,696 WARNING [train.py:1204] (2/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_445-269521-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 13:42:56,245 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.ff2_skip_rate, batch_count=899686.6666666666, ans=0.0 2023-10-02 13:42:57,298 INFO [train.py:1046] (2/4) Epoch 26, batch 2150, loss[loss=0.1675, simple_loss=0.2394, pruned_loss=0.04784, over 23525.00 frames. ], tot_loss[loss=0.1684, simple_loss=0.2455, pruned_loss=0.04563, over 4700743.97 frames. ], batch size: 134, lr: 3.95e-03, grad_scale: 8.0 2023-10-02 13:42:58,729 WARNING [train.py:1204] (2/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_3-33116-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 13:42:58,730 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_4633_210_2040_1_1526824163_4145343_416-76117-0_sp1.1 from training. Duration: 0.863625 2023-10-02 13:42:58,758 WARNING [train.py:1204] (2/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_1033-274376-0_sp1.1 from training. Duration: 0.7909375 2023-10-02 13:42:58,798 WARNING [train.py:1204] (2/4) Exclude cut with ID _8772_210_1850_1_1528635595972_3664011_398-320049-0_sp0.9 from training. Duration: 0.97775 2023-10-02 13:43:03,676 WARNING [train.py:1204] (2/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_321-215742-0_sp0.9 from training. Duration: 0.5555625 2023-10-02 13:43:05,106 WARNING [train.py:1204] (2/4) Exclude cut with ID _28394_210_8913_1_1532430049518_3650118_272-190950-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 13:43:07,707 WARNING [train.py:1204] (2/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_31-299371-0_sp1.1 from training. Duration: 0.92725 2023-10-02 13:43:09,069 WARNING [train.py:1204] (2/4) Exclude cut with ID _55383_210_10635_1_1534468426172_2231669_134-128246-0_sp0.9 from training. Duration: 0.988875 2023-10-02 13:43:09,072 WARNING [train.py:1204] (2/4) Exclude cut with ID _40519_210_13794_1_1533715228655_7337390_887-9547-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 13:43:11,007 WARNING [train.py:1204] (2/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_310-109461-0_sp0.9 from training. Duration: 0.888875 2023-10-02 13:43:12,485 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_2009_210_2071_1_1525481134_4588937_258-250196-0_sp1.1 from training. Duration: 0.92725 2023-10-02 13:43:13,856 WARNING [train.py:1204] (2/4) Exclude cut with ID _9235_210_5219_1_1528891154253_8046120_1186-72702-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 13:43:13,858 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_14831_210_7927_1_1530700200_5583610_181-27467-0_sp1.1 from training. Duration: 0.82725 2023-10-02 13:43:18,031 WARNING [train.py:1204] (2/4) Exclude cut with ID _40402_210_18219_1_1533522324115_2953150_278-132109-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 13:43:18,055 WARNING [train.py:1204] (2/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_261-297503-0 from training. Duration: 0.99 2023-10-02 13:43:23,447 WARNING [train.py:1204] (2/4) Exclude cut with ID _19896_210_1883_1_1531709022662_6649569_313-265903-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 13:43:25,257 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_105-114797-0_sp1.1 from training. Duration: 0.763625 2023-10-02 13:43:25,342 WARNING [train.py:1204] (2/4) Exclude cut with ID _53755_210_8254_1_1534577312941_4707360_9-65843-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 13:43:25,524 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.self_attn_weights.pos_emb_skip_rate, batch_count=899820.0, ans=0.0 2023-10-02 13:43:26,689 WARNING [train.py:1204] (2/4) Exclude cut with ID _18516_210_9206_1_1531704653206_3692406_361-224549-0_sp1.1 from training. Duration: 0.97275 2023-10-02 13:43:26,708 WARNING [train.py:1204] (2/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_403-310975-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 13:43:26,738 WARNING [train.py:1204] (2/4) Exclude cut with ID _5444_210_2151_1_1527418808464_5312210_581-315411-0_sp0.9 from training. Duration: 0.688875 2023-10-02 13:43:26,789 WARNING [train.py:1204] (2/4) Exclude cut with ID _23825_210_13449_1_1531915313526_3497150_464-154904-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 13:43:28,742 WARNING [train.py:1204] (2/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_937-208281-0_sp0.9 from training. Duration: 0.9444375 2023-10-02 13:43:28,824 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_37839_210_7234_1_1533542462_3689303_158-181212-0_sp1.1 from training. Duration: 0.963625 2023-10-02 13:43:30,179 WARNING [train.py:1204] (2/4) Exclude cut with ID _8772_210_1850_1_1528635595972_3664011_398-320049-0 from training. Duration: 0.88 2023-10-02 13:43:31,995 WARNING [train.py:1204] (2/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_718-250907-0_sp1.1 from training. Duration: 0.62725 2023-10-02 13:43:32,076 WARNING [train.py:1204] (2/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_641-126395-0_sp1.1 from training. Duration: 0.92725 2023-10-02 13:43:33,318 WARNING [train.py:1204] (2/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_166-241493-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 13:43:35,893 WARNING [train.py:1204] (2/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_526-62079-0_sp0.9 from training. Duration: 0.7444375 2023-10-02 13:43:35,987 WARNING [train.py:1204] (2/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_439-275083-0_sp1.1 from training. Duration: 0.936375 2023-10-02 13:43:37,594 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.0.ff3_skip_rate, batch_count=899820.0, ans=0.0 2023-10-02 13:43:38,678 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_66907_210_21581_1_1535615661_3590177_493-229032-0_sp1.1 from training. Duration: 0.92725 2023-10-02 13:43:38,712 WARNING [train.py:1204] (2/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_89-321955-0_sp1.1 from training. Duration: 0.72725 2023-10-02 13:43:39,364 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.4.encoder.layers.1.nonlin_attention.whiten2, num_groups=1, num_channels=384, metric=3.11 vs. limit=15.0 2023-10-02 13:43:40,152 WARNING [train.py:1204] (2/4) Exclude cut with ID _29857_210_20034_1_1532395886753_3798160_273-226195-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 13:43:40,165 WARNING [train.py:1204] (2/4) Exclude cut with ID _22826_210_9994_1_1531908201522_3279169_404-75889-0 from training. Duration: 0.89 2023-10-02 13:43:41,565 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_271-234962-0_sp0.9 from training. Duration: 0.87775 2023-10-02 13:43:44,966 WARNING [train.py:1204] (2/4) Exclude cut with ID _22961_210_8254_1_1531976366672_4278180_156-339685-0_sp1.1 from training. Duration: 0.97275 2023-10-02 13:43:45,012 WARNING [train.py:1204] (2/4) Exclude cut with ID _37892_210_19596_1_1533198677897_3964058_425-231927-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 13:43:45,225 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.ff3_skip_rate, batch_count=899886.6666666666, ans=0.0 2023-10-02 13:43:46,392 WARNING [train.py:1204] (2/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_102-288670-0_sp1.1 from training. Duration: 0.97275 2023-10-02 13:43:48,387 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.1.encoder.layers.1.nonlin_attention.whiten2, num_groups=1, num_channels=256, metric=6.65 vs. limit=15.0 2023-10-02 13:43:49,010 WARNING [train.py:1204] (2/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_283-220372-0_sp1.1 from training. Duration: 0.5090625 2023-10-02 13:43:49,079 WARNING [train.py:1204] (2/4) Exclude cut with ID _44304_210_15374_1_1533639352765_4613129_86-1699-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 13:43:49,159 WARNING [train.py:1204] (2/4) Exclude cut with ID _2891_210_2550_1_1525874398521_4427710_327-121590-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 13:43:49,165 WARNING [train.py:1204] (2/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_369-222060-0 from training. Duration: 0.71 2023-10-02 13:43:50,603 WARNING [train.py:1204] (2/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_762-109886-0 from training. Duration: 0.91 2023-10-02 13:43:51,904 WARNING [train.py:1204] (2/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_477-255782-0_sp0.9 from training. Duration: 0.911125 2023-10-02 13:43:51,938 WARNING [train.py:1204] (2/4) Exclude cut with ID IC0669W0061-330906 from training. Duration: 0.768 2023-10-02 13:43:53,271 WARNING [train.py:1204] (2/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_347-213071-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 13:43:53,314 WARNING [train.py:1204] (2/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_507-303370-0_sp1.1 from training. Duration: 0.87275 2023-10-02 13:43:55,183 WARNING [train.py:1204] (2/4) Exclude cut with ID _15012_210_1968_1_1530856820429_5095350_688-178849-0 from training. Duration: 0.64 2023-10-02 13:43:55,188 WARNING [train.py:1204] (2/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_28-70987-0_sp0.9 from training. Duration: 0.9666875 2023-10-02 13:43:55,198 WARNING [train.py:1204] (2/4) Exclude cut with ID _5910_210_1389_1_1527337972454_5050100_555-159815-0 from training. Duration: 0.98 2023-10-02 13:43:55,223 WARNING [train.py:1204] (2/4) Exclude cut with ID IC0679W0064-335871 from training. Duration: 0.768 2023-10-02 13:43:55,223 WARNING [train.py:1204] (2/4) Exclude cut with ID IC0679W0079-335886 from training. Duration: 0.896 2023-10-02 13:43:56,496 WARNING [train.py:1204] (2/4) Exclude cut with ID _5608_210_2106_1_1527415439089_6819768_909-82767-0 from training. Duration: 0.61 2023-10-02 13:43:56,621 WARNING [train.py:1204] (2/4) Exclude cut with ID _23038_210_9570_1_1532001584158_3652419_411-323967-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 13:43:59,262 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_40896_210_15710_1_1533433956_7100802_350-231748-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 13:43:59,272 WARNING [train.py:1204] (2/4) Exclude cut with ID _5289_210_2084_1_1527678202736_3779299_142-103317-0_sp0.9 from training. Duration: 0.9333125 2023-10-02 13:43:59,356 WARNING [train.py:1204] (2/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_314-144055-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 13:44:01,328 WARNING [train.py:1204] (2/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_177-236809-0_sp0.9 from training. Duration: 0.5444375 2023-10-02 13:44:02,639 WARNING [train.py:1204] (2/4) Exclude cut with ID _23038_210_9570_1_1532001584158_3652419_82-334733-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 13:44:02,646 WARNING [train.py:1204] (2/4) Exclude cut with ID _43552_210_8913_1_1533726031059_3523429_225-22526-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 13:44:10,183 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.1.bypass.skip_rate, batch_count=900020.0, ans=0.035 2023-10-02 13:44:11,255 INFO [train.py:1046] (2/4) Epoch 26, batch 2200, loss[loss=0.1691, simple_loss=0.2515, pruned_loss=0.04342, over 24485.00 frames. ], tot_loss[loss=0.1683, simple_loss=0.2458, pruned_loss=0.04542, over 4708897.08 frames. ], batch size: 66, lr: 3.95e-03, grad_scale: 8.0 2023-10-02 13:44:11,354 WARNING [train.py:1204] (2/4) Exclude cut with ID _30327_210_16049_1_1532484161295_3924169_401-70819-0_sp1.1 from training. Duration: 0.7818125 2023-10-02 13:44:12,642 WARNING [train.py:1204] (2/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_538-32291-0 from training. Duration: 0.83 2023-10-02 13:44:17,317 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_37357_210_17024_1_1533089504_2316121_258-211605-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 13:44:19,690 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.0.layers.1.feed_forward2.out_whiten, num_groups=1, num_channels=192, metric=4.10 vs. limit=15.0 2023-10-02 13:44:21,502 WARNING [train.py:1204] (2/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_550-335198-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 13:44:22,775 WARNING [train.py:1204] (2/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_98-741-0_sp0.9 from training. Duration: 0.888875 2023-10-02 13:44:22,830 WARNING [train.py:1204] (2/4) Exclude cut with ID _38323_210_12108_1_1533121233369_3303049_251-238988-0_sp1.1 from training. Duration: 0.92725 2023-10-02 13:44:24,184 WARNING [train.py:1204] (2/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_259-34038-0_sp0.9 from training. Duration: 0.82225 2023-10-02 13:44:26,127 WARNING [train.py:1204] (2/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_1060-180633-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 13:44:26,174 WARNING [train.py:1204] (2/4) Exclude cut with ID _8755_210_1876_1_1528628559357_3258209_211-37473-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 13:44:26,180 WARNING [train.py:1204] (2/4) Exclude cut with ID _4122_210_2084_1_1526713164417_4353280_528-206771-0 from training. Duration: 0.88 2023-10-02 13:44:29,257 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.conv_skip_rate, batch_count=900086.6666666666, ans=0.0 2023-10-02 13:44:30,404 WARNING [train.py:1204] (2/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_64-248359-0 from training. Duration: 0.69 2023-10-02 13:44:32,360 WARNING [train.py:1204] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_402-253027-0_sp1.1 from training. Duration: 0.4909375 2023-10-02 13:44:38,224 WARNING [train.py:1204] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_625-102955-0 from training. Duration: 0.77 2023-10-02 13:44:39,821 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.feed_forward2.hidden_balancer.prob, batch_count=900153.3333333334, ans=0.125 2023-10-02 13:44:41,158 WARNING [train.py:1204] (2/4) Exclude cut with ID _14998_210_5219_1_1530705582080_7474970_611-219324-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 13:44:41,242 WARNING [train.py:1204] (2/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_718-68505-0_sp0.9 from training. Duration: 0.988875 2023-10-02 13:44:42,550 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_353-182855-0_sp1.1 from training. Duration: 0.87275 2023-10-02 13:44:45,363 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_5960_210_1794_1_1527403865_3990999_515-2613-0_sp0.9 from training. Duration: 0.97775 2023-10-02 13:44:45,389 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_5575_210_1410_1_1527301305_2570170_342-265572-0 from training. Duration: 0.94 2023-10-02 13:44:50,027 WARNING [train.py:1204] (2/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_329-113173-0_sp0.9 from training. Duration: 0.688875 2023-10-02 13:44:51,343 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_15396_210_6890_1_1531308473_1938936_213-312506-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 13:44:51,391 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_521-214276-0_sp1.1 from training. Duration: 0.4 2023-10-02 13:44:57,280 WARNING [train.py:1204] (2/4) Exclude cut with ID _15026_210_8750_1_1530777602316_3291850_211-195043-0_sp1.1 from training. Duration: 0.72725 2023-10-02 13:44:57,391 WARNING [train.py:1204] (2/4) Exclude cut with ID _29343_210_4381_1_1532602757752_3674109_108-257494-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 13:44:58,913 WARNING [train.py:1204] (2/4) Exclude cut with ID _64414_210_17301_1_1535243252048_3757759_403-58151-0_sp1.1 from training. Duration: 0.963625 2023-10-02 13:45:00,297 WARNING [train.py:1204] (2/4) Exclude cut with ID _36512_210_6987_1_1533033143998_3126069_109-177787-0_sp1.1 from training. Duration: 0.92725 2023-10-02 13:45:01,645 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.512e+02 1.763e+02 1.852e+02 2.075e+02 2.576e+02, threshold=3.704e+02, percent-clipped=0.0 2023-10-02 13:45:01,770 WARNING [train.py:1204] (2/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_498-40153-0 from training. Duration: 0.63 2023-10-02 13:45:03,714 WARNING [train.py:1204] (2/4) Exclude cut with ID _56447_210_1389_1_1535074245910_3697189_161-334523-0_sp1.1 from training. Duration: 0.936375 2023-10-02 13:45:05,017 WARNING [train.py:1204] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_238-245655-0 from training. Duration: 0.81 2023-10-02 13:45:08,224 WARNING [train.py:1204] (2/4) Exclude cut with ID _64631_210_27767_1_1535349580068_4165160_108-43865-0_sp1.1 from training. Duration: 0.92725 2023-10-02 13:45:08,229 WARNING [train.py:1204] (2/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_243-42116-0_sp1.1 from training. Duration: 0.663625 2023-10-02 13:45:08,261 WARNING [train.py:1204] (2/4) Exclude cut with ID _66868_210_9527_1_1535509287954_4561140_594-132160-0_sp1.1 from training. Duration: 0.92725 2023-10-02 13:45:11,012 WARNING [train.py:1204] (2/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_48-338976-0_sp0.9 from training. Duration: 0.988875 2023-10-02 13:45:11,044 WARNING [train.py:1204] (2/4) Exclude cut with ID _5250_210_5219_1_1527249561806_8692110_137-100595-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 13:45:11,050 WARNING [train.py:1204] (2/4) Exclude cut with ID _49748_210_24116_1_1533986971737_3158910_12-271669-0_sp1.1 from training. Duration: 0.936375 2023-10-02 13:45:11,074 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_37357_210_17024_1_1533089504_2316121_206-80421-0_sp1.1 from training. Duration: 0.936375 2023-10-02 13:45:12,428 WARNING [train.py:1204] (2/4) Exclude cut with ID _7537_210_2084_1_1528502395431_3924359_113-199933-0_sp1.1 from training. Duration: 0.536375 2023-10-02 13:45:12,465 WARNING [train.py:1204] (2/4) Exclude cut with ID _55440_210_12651_1_1534496713364_2980520_31-59289-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 13:45:15,182 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1083-220122-0_sp0.9 from training. Duration: 0.5666875 2023-10-02 13:45:16,576 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_426-313557-0_sp1.1 from training. Duration: 0.4818125 2023-10-02 13:45:17,936 WARNING [train.py:1204] (2/4) Exclude cut with ID _35799_210_5854_1_1533263487887_3529071_324-94843-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 13:45:19,755 WARNING [train.py:1204] (2/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_264-108685-0_sp1.1 from training. Duration: 0.5 2023-10-02 13:45:19,847 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9012W0006-501141 from training. Duration: 0.896 2023-10-02 13:45:20,072 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.attention_skip_rate, batch_count=900286.6666666666, ans=0.0 2023-10-02 13:45:22,550 WARNING [train.py:1204] (2/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_890-5117-0_sp0.9 from training. Duration: 0.8666875 2023-10-02 13:45:22,593 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9023W0008-506301 from training. Duration: 0.896 2023-10-02 13:45:22,667 WARNING [train.py:1204] (2/4) Exclude cut with ID _13916_210_9994_1_1530788341126_3763149_60-191556-0_sp0.9 from training. Duration: 0.67775 2023-10-02 13:45:24,540 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9029W0331-509721 from training. Duration: 0.896 2023-10-02 13:45:25,721 INFO [train.py:1046] (2/4) Epoch 26, batch 2250, loss[loss=0.1728, simple_loss=0.2434, pruned_loss=0.05108, over 23864.00 frames. ], tot_loss[loss=0.1686, simple_loss=0.2464, pruned_loss=0.0454, over 4720447.07 frames. ], batch size: 179, lr: 3.95e-03, grad_scale: 8.0 2023-10-02 13:45:25,834 WARNING [train.py:1204] (2/4) Exclude cut with ID _35823_210_6917_1_1533187881914_7297830_567-108596-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 13:45:27,118 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7543_210_6753_1_1528544010_1110402_25-183860-0_sp1.1 from training. Duration: 0.42725 2023-10-02 13:45:27,229 WARNING [train.py:1204] (2/4) Exclude cut with ID _47278_210_12041_1_1533902418349_3576110_495-260035-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 13:45:28,616 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9051W0015-520281 from training. Duration: 0.9045625 2023-10-02 13:45:29,984 WARNING [train.py:1204] (2/4) Exclude cut with ID _14459_210_2228_1_1531443641946_7516700_707-66193-0_sp1.1 from training. Duration: 0.97275 2023-10-02 13:45:31,922 WARNING [train.py:1204] (2/4) Exclude cut with ID _14087_210_2253_1_1530529224512_3014029_219-224445-0_sp1.1 from training. Duration: 0.763625 2023-10-02 13:45:32,149 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=900353.3333333334, ans=0.1 2023-10-02 13:45:36,735 WARNING [train.py:1204] (2/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_345-167986-0_sp0.9 from training. Duration: 0.9333125 2023-10-02 13:45:37,854 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_34815_210_6388_1_1533381320_3571903_279-305565-0_sp1.1 from training. Duration: 0.836375 2023-10-02 13:45:40,675 WARNING [train.py:1204] (2/4) Exclude cut with ID _21987_210_5854_1_1531821500482_3637038_317-179212-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 13:45:41,992 WARNING [train.py:1204] (2/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_395-37222-0_sp1.1 from training. Duration: 0.6909375 2023-10-02 13:45:42,060 WARNING [train.py:1204] (2/4) Exclude cut with ID _8064_210_5399_1_1528372299291_4282973_168-50337-0_sp1.1 from training. Duration: 0.763625 2023-10-02 13:45:46,040 WARNING [train.py:1204] (2/4) Exclude cut with ID _60113_210_16271_1_1535164212481_4386079_559-167191-0 from training. Duration: 0.93 2023-10-02 13:45:46,050 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_47228_210_4983_1_1533780021_3672027_344-179642-0_sp1.1 from training. Duration: 0.92725 2023-10-02 13:45:46,090 WARNING [train.py:1204] (2/4) Exclude cut with ID _22961_210_8254_1_1531976366672_4278180_317-199546-0_sp0.9 from training. Duration: 0.9555625 2023-10-02 13:45:47,553 WARNING [train.py:1204] (2/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_141-89727-0 from training. Duration: 0.71 2023-10-02 13:45:49,377 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_78-116075-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 13:45:49,390 WARNING [train.py:1204] (2/4) Exclude cut with ID _64086_210_7994_1_1535270417019_7435949_52-235756-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 13:45:50,860 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7615_210_4211_1_1528110965_4580160_72-235211-0_sp1.1 from training. Duration: 0.6909375 2023-10-02 13:45:55,512 WARNING [train.py:1204] (2/4) Exclude cut with ID _14069_210_5632_1_1530512692033_4803390_287-27950-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 13:45:55,609 WARNING [train.py:1204] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_74-48717-0_sp0.9 from training. Duration: 0.6444375 2023-10-02 13:45:56,994 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7544_210_6753_1_1528591327_1471426_172-348777-0_sp1.1 from training. Duration: 0.6 2023-10-02 13:45:58,367 WARNING [train.py:1204] (2/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_220-191080-0 from training. Duration: 0.98 2023-10-02 13:45:59,778 WARNING [train.py:1204] (2/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_1081-137641-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 13:46:01,607 WARNING [train.py:1204] (2/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_278-235459-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 13:46:06,353 WARNING [train.py:1204] (2/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_1181-77470-0_sp1.1 from training. Duration: 0.936375 2023-10-02 13:46:06,474 WARNING [train.py:1204] (2/4) Exclude cut with ID _60860_210_8254_1_1534896015875_3557169_198-5531-0_sp1.1 from training. Duration: 0.936375 2023-10-02 13:46:08,045 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.conv_module1.balancer1.prob, batch_count=900486.6666666666, ans=0.125 2023-10-02 13:46:09,126 WARNING [train.py:1204] (2/4) Exclude cut with ID _12369_210_4381_1_1530356078581_4388520_17-289781-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 13:46:09,131 WARNING [train.py:1204] (2/4) Exclude cut with ID _50310_210_15488_1_1534068043345_3641080_146-199752-0_sp1.1 from training. Duration: 0.92725 2023-10-02 13:46:10,712 WARNING [train.py:1204] (2/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_969-213354-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 13:46:11,997 WARNING [train.py:1204] (2/4) Exclude cut with ID _70519_210_4163_1_1535961595994_7458230_66-344156-0_sp1.1 from training. Duration: 0.963625 2023-10-02 13:46:16,241 WARNING [train.py:1204] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_380-204756-0_sp1.1 from training. Duration: 0.7818125 2023-10-02 13:46:16,375 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.conv_module1.balancer1.prob, batch_count=900553.3333333334, ans=0.125 2023-10-02 13:46:19,040 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_128-202104-0_sp0.9 from training. Duration: 0.7 2023-10-02 13:46:22,490 WARNING [train.py:1204] (2/4) Exclude cut with ID _4697_210_2069_1_1526815801953_3823098_51-1049-0_sp0.9 from training. Duration: 0.8333125 2023-10-02 13:46:23,775 WARNING [train.py:1204] (2/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_86-66192-0_sp1.1 from training. Duration: 0.5 2023-10-02 13:46:23,835 WARNING [train.py:1204] (2/4) Exclude cut with ID _14069_210_5632_1_1530512692033_4803390_95-1896-0_sp1.1 from training. Duration: 0.8909375 2023-10-02 13:46:28,552 WARNING [train.py:1204] (2/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_29-293896-0_sp1.1 from training. Duration: 0.5545625 2023-10-02 13:46:28,698 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=900620.0, ans=0.1 2023-10-02 13:46:31,147 WARNING [train.py:1204] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_167-137255-0_sp0.9 from training. Duration: 0.711125 2023-10-02 13:46:31,148 WARNING [train.py:1204] (2/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_217-95056-0 from training. Duration: 0.98 2023-10-02 13:46:31,175 WARNING [train.py:1204] (2/4) Exclude cut with ID _47278_210_12041_1_1533902418349_3576110_97-266746-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 13:46:32,441 WARNING [train.py:1204] (2/4) Exclude cut with ID _14224_210_4381_1_1530529062037_4264010_380-343670-0_sp1.1 from training. Duration: 0.8 2023-10-02 13:46:35,195 WARNING [train.py:1204] (2/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_107-332398-0 from training. Duration: 0.78 2023-10-02 13:46:38,420 WARNING [train.py:1204] (2/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_91-267696-0_sp1.1 from training. Duration: 0.7909375 2023-10-02 13:46:38,439 WARNING [train.py:1204] (2/4) Exclude cut with ID _41298_210_9019_1_1533430371450_4822143_498-186993-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 13:46:39,667 INFO [train.py:1046] (2/4) Epoch 26, batch 2300, loss[loss=0.1829, simple_loss=0.2651, pruned_loss=0.05032, over 23450.00 frames. ], tot_loss[loss=0.1693, simple_loss=0.247, pruned_loss=0.04578, over 4719796.33 frames. ], batch size: 93, lr: 3.94e-03, grad_scale: 8.0 2023-10-02 13:46:44,134 WARNING [train.py:1204] (2/4) Exclude cut with ID _17956_210_4353_1_1531209568125_6979970_54-77610-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 13:46:44,169 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_40047_210_2290_1_1533290519_2374311_257-335162-0_sp1.1 from training. Duration: 0.863625 2023-10-02 13:46:46,953 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9653W0193-675471 from training. Duration: 0.9656875 2023-10-02 13:46:49,599 WARNING [train.py:1204] (2/4) Exclude cut with ID _25248_210_8750_1_1532062825941_5554180_243-240536-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 13:46:57,643 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_32360_210_4198_1_1532689160_3648436_60-51908-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 13:46:57,652 WARNING [train.py:1204] (2/4) Exclude cut with ID _4092_210_2151_1_1526643044449_3135759_227-213972-0_sp0.9 from training. Duration: 0.6 2023-10-02 13:46:57,685 WARNING [train.py:1204] (2/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_392-173500-0_sp1.1 from training. Duration: 0.97275 2023-10-02 13:46:57,712 WARNING [train.py:1204] (2/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_71-329150-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 13:46:57,720 WARNING [train.py:1204] (2/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_866-173440-0 from training. Duration: 0.9 2023-10-02 13:46:59,027 WARNING [train.py:1204] (2/4) Exclude cut with ID _14832_210_8750_1_1530691351539_3171348_1-54315-0_sp0.9 from training. Duration: 0.9666875 2023-10-02 13:47:01,060 WARNING [train.py:1204] (2/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_430-116139-0_sp1.1 from training. Duration: 0.77275 2023-10-02 13:47:01,094 WARNING [train.py:1204] (2/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_356-45054-0_sp0.9 from training. Duration: 0.9555625 2023-10-02 13:47:05,334 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_3531_210_3528_1_1526294203_5216456_309-227302-0_sp0.9 from training. Duration: 0.7666875 2023-10-02 13:47:08,008 WARNING [train.py:1204] (2/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_301-330455-0_sp1.1 from training. Duration: 0.72725 2023-10-02 13:47:09,835 WARNING [train.py:1204] (2/4) Exclude cut with ID _23557_210_8750_1_1531890106370_6846255_667-145945-0_sp1.1 from training. Duration: 0.92725 2023-10-02 13:47:11,365 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.self_attn_weights.pos_emb_skip_rate, batch_count=900820.0, ans=0.0 2023-10-02 13:47:15,244 WARNING [train.py:1204] (2/4) Exclude cut with ID _14860_210_4915_1_1530702020340_7343240_836-328489-0_sp1.1 from training. Duration: 0.8181875 2023-10-02 13:47:15,292 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_37152_210_9697_1_1533106573_3844188_416-333249-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 13:47:18,155 WARNING [train.py:1204] (2/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_538-32291-0_sp0.9 from training. Duration: 0.92225 2023-10-02 13:47:20,877 WARNING [train.py:1204] (2/4) Exclude cut with ID _32704_210_8954_1_1533017488840_6747370_965-176824-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 13:47:23,133 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.bypass.skip_rate, batch_count=900886.6666666666, ans=0.04949747468305833 2023-10-02 13:47:24,296 WARNING [train.py:1204] (2/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_86-19601-0_sp1.1 from training. Duration: 0.77275 2023-10-02 13:47:24,362 WARNING [train.py:1204] (2/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_436-318113-0_sp1.1 from training. Duration: 0.6818125 2023-10-02 13:47:26,275 WARNING [train.py:1204] (2/4) Exclude cut with ID _41817_210_18835_1_1533434477160_7423049_555-293448-0_sp0.9 from training. Duration: 0.888875 2023-10-02 13:47:26,292 WARNING [train.py:1204] (2/4) Exclude cut with ID _4697_210_2069_1_1526815801953_3823098_68-219807-0 from training. Duration: 0.47 2023-10-02 13:47:30,977 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.568e+02 2.002e+02 2.209e+02 2.529e+02 4.134e+02, threshold=4.417e+02, percent-clipped=1.0 2023-10-02 13:47:32,423 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7544_210_6753_1_1528591327_1471426_33-346031-0_sp1.1 from training. Duration: 0.5545625 2023-10-02 13:47:32,425 WARNING [train.py:1204] (2/4) Exclude cut with ID _49748_210_24116_1_1533986971737_3158910_263-52113-0_sp1.1 from training. Duration: 0.97275 2023-10-02 13:47:32,479 WARNING [train.py:1204] (2/4) Exclude cut with ID _64639_210_5816_1_1535367619437_3528000_340-24580-0_sp1.1 from training. Duration: 0.92725 2023-10-02 13:47:32,494 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_47228_210_4983_1_1533780021_3672027_479-114941-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 13:47:32,527 WARNING [train.py:1204] (2/4) Exclude cut with ID _44585_210_18628_1_1533605350387_4518199_506-119659-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 13:47:33,906 WARNING [train.py:1204] (2/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_314-126819-0_sp1.1 from training. Duration: 0.4545625 2023-10-02 13:47:33,909 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_2541_210_1794_1_1525590269_3084765_156-173713-0_sp0.9 from training. Duration: 0.9 2023-10-02 13:47:33,958 WARNING [train.py:1204] (2/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_108-255430-0 from training. Duration: 0.97 2023-10-02 13:47:33,972 WARNING [train.py:1204] (2/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_791-45149-0_sp1.1 from training. Duration: 0.7818125 2023-10-02 13:47:34,203 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.balancer1.prob, batch_count=900886.6666666666, ans=0.125 2023-10-02 13:47:35,265 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_53378_210_17282_1_1534227054_1261425_10-118295-0_sp1.1 from training. Duration: 0.97275 2023-10-02 13:47:35,323 WARNING [train.py:1204] (2/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_148-158509-0 from training. Duration: 0.41 2023-10-02 13:47:41,530 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_34726_210_4525_1_1533019474_5030395_687-291305-0_sp1.1 from training. Duration: 0.936375 2023-10-02 13:47:45,550 WARNING [train.py:1204] (2/4) Exclude cut with ID _14860_210_4915_1_1530702020340_7343240_264-324537-0_sp1.1 from training. Duration: 0.963625 2023-10-02 13:47:48,323 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_41251_210_15368_1_1533429767_4820695_591-46386-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 13:47:48,348 WARNING [train.py:1204] (2/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_86-19601-0_sp0.9 from training. Duration: 0.9444375 2023-10-02 13:47:48,374 WARNING [train.py:1204] (2/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_401-255491-0_sp0.9 from training. Duration: 0.77775 2023-10-02 13:47:49,797 WARNING [train.py:1204] (2/4) Exclude cut with ID _8696_210_1876_1_1528545632440_3345058_209-54069-0_sp1.1 from training. Duration: 0.6090625 2023-10-02 13:47:49,812 WARNING [train.py:1204] (2/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_369-325184-0_sp1.1 from training. Duration: 0.9 2023-10-02 13:47:51,507 WARNING [train.py:1204] (2/4) Exclude cut with ID _14687_210_5854_1_1530703838259_3664889_115-239046-0_sp0.9 from training. Duration: 0.7444375 2023-10-02 13:47:51,541 WARNING [train.py:1204] (2/4) Exclude cut with ID _8064_210_5399_1_1528372299291_4282973_168-50337-0 from training. Duration: 0.84 2023-10-02 13:47:54,265 INFO [train.py:1046] (2/4) Epoch 26, batch 2350, loss[loss=0.2239, simple_loss=0.2917, pruned_loss=0.07801, over 19763.00 frames. ], tot_loss[loss=0.17, simple_loss=0.2478, pruned_loss=0.04608, over 4726514.87 frames. ], batch size: 388, lr: 3.94e-03, grad_scale: 8.0 2023-10-02 13:47:57,615 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder_embed.convnext.layerdrop_rate, batch_count=901020.0, ans=0.015 2023-10-02 13:47:58,900 WARNING [train.py:1204] (2/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_909-64151-0_sp1.1 from training. Duration: 0.8545625 2023-10-02 13:47:58,918 WARNING [train.py:1204] (2/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_597-154640-0 from training. Duration: 0.9 2023-10-02 13:48:03,679 WARNING [train.py:1204] (2/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_126-274694-0 from training. Duration: 0.98 2023-10-02 13:48:03,882 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.bypass.skip_rate, batch_count=901020.0, ans=0.09899494936611666 2023-10-02 13:48:05,165 WARNING [train.py:1204] (2/4) Exclude cut with ID _21783_210_11357_1_1531742458730_3762679_344-269786-0_sp1.1 from training. Duration: 0.97275 2023-10-02 13:48:10,973 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_37818_210_4238_1_1533620910_4720083_322-253154-0_sp1.1 from training. Duration: 0.92725 2023-10-02 13:48:10,976 WARNING [train.py:1204] (2/4) Exclude cut with ID _58895_210_8750_1_1535184081320_3649089_221-15823-0_sp1.1 from training. Duration: 0.92725 2023-10-02 13:48:10,986 WARNING [train.py:1204] (2/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_9-21737-0_sp1.1 from training. Duration: 0.8818125 2023-10-02 13:48:11,030 WARNING [train.py:1204] (2/4) Exclude cut with ID _14069_210_5632_1_1530512692033_4803390_522-341588-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 13:48:11,118 WARNING [train.py:1204] (2/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_363-185703-0 from training. Duration: 0.95 2023-10-02 13:48:15,180 WARNING [train.py:1204] (2/4) Exclude cut with ID _41817_210_18835_1_1533434477160_7423049_265-76216-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 13:48:20,639 WARNING [train.py:1204] (2/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_841-341679-0 from training. Duration: 0.58 2023-10-02 13:48:22,108 WARNING [train.py:1204] (2/4) Exclude cut with ID _55429_210_10626_1_1534467588527_7174143_97-335084-0_sp1.1 from training. Duration: 0.8818125 2023-10-02 13:48:25,493 WARNING [train.py:1204] (2/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_690-126306-0_sp0.9 from training. Duration: 0.8666875 2023-10-02 13:48:26,813 WARNING [train.py:1204] (2/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_102-92751-0_sp1.1 from training. Duration: 0.9 2023-10-02 13:48:29,729 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_3787_210_2040_1_1526477544_5478586_603-120324-0_sp1.1 from training. Duration: 0.736375 2023-10-02 13:48:29,839 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_33348_210_5632_1_1533086912_3324030_85-28583-0 from training. Duration: 0.82 2023-10-02 13:48:31,186 WARNING [train.py:1204] (2/4) Exclude cut with ID _60057_210_10640_1_1534903065793_3333150_362-230377-0_sp1.1 from training. Duration: 0.7909375 2023-10-02 13:48:33,171 WARNING [train.py:1204] (2/4) Exclude cut with ID _60113_210_16271_1_1535164212481_4386079_194-146486-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 13:48:33,183 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_53753_210_7234_1_1534320003_3594148_171-277668-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 13:48:35,123 WARNING [train.py:1204] (2/4) Exclude cut with ID _34423_210_8750_1_1533430798456_3330199_251-332788-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 13:48:37,816 WARNING [train.py:1204] (2/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_663-166973-0_sp0.9 from training. Duration: 0.888875 2023-10-02 13:48:40,587 WARNING [train.py:1204] (2/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_927-43827-0 from training. Duration: 0.74 2023-10-02 13:48:40,641 WARNING [train.py:1204] (2/4) Exclude cut with ID _28323_210_16187_1_1532480395667_4100300_306-74561-0_sp1.1 from training. Duration: 0.8545625 2023-10-02 13:48:43,818 WARNING [train.py:1204] (2/4) Exclude cut with ID _15037_210_2253_1_1530752294104_3267650_139-337989-0_sp1.1 from training. Duration: 0.92725 2023-10-02 13:48:45,167 WARNING [train.py:1204] (2/4) Exclude cut with ID _49039_210_6315_1_1533967537541_3846118_460-263670-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 13:48:46,649 WARNING [train.py:1204] (2/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_186-309161-0 from training. Duration: 0.54 2023-10-02 13:48:47,950 WARNING [train.py:1204] (2/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_87-278686-0_sp1.1 from training. Duration: 0.7 2023-10-02 13:48:49,415 WARNING [train.py:1204] (2/4) Exclude cut with ID _27052_210_5986_1_1532329197187_3585009_314-106499-0 from training. Duration: 0.92 2023-10-02 13:48:49,431 WARNING [train.py:1204] (2/4) Exclude cut with ID _3465_210_3525_1_1526175667060_3574049_412-287526-0_sp0.9 from training. Duration: 0.8 2023-10-02 13:48:53,640 WARNING [train.py:1204] (2/4) Exclude cut with ID _22961_210_8254_1_1531976366672_4278180_317-199546-0 from training. Duration: 0.86 2023-10-02 13:48:58,176 WARNING [train.py:1204] (2/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_381-317791-0 from training. Duration: 0.73 2023-10-02 13:48:58,228 WARNING [train.py:1204] (2/4) Exclude cut with ID _44010_210_4525_1_1533884262964_4013620_560-310411-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 13:48:58,242 WARNING [train.py:1204] (2/4) Exclude cut with ID _5294_210_1747_1_1527249643928_3696179_342-270278-0_sp0.9 from training. Duration: 0.611125 2023-10-02 13:48:59,533 WARNING [train.py:1204] (2/4) Exclude cut with ID ID0473W0002-918951 from training. Duration: 0.924 2023-10-02 13:48:59,558 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1083-220122-0 from training. Duration: 0.51 2023-10-02 13:49:02,252 WARNING [train.py:1204] (2/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_138-154812-0 from training. Duration: 0.78 2023-10-02 13:49:05,818 WARNING [train.py:1204] (2/4) Exclude cut with ID _44982_210_1511_1_1534312820108_3726029_331-19420-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 13:49:08,596 INFO [train.py:1046] (2/4) Epoch 26, batch 2400, loss[loss=0.1612, simple_loss=0.2513, pruned_loss=0.03553, over 24077.00 frames. ], tot_loss[loss=0.1696, simple_loss=0.2474, pruned_loss=0.04596, over 4732316.45 frames. ], batch size: 80, lr: 3.94e-03, grad_scale: 16.0 2023-10-02 13:49:10,066 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_2655_210_2087_1_1525688959_3897400_202-147557-0_sp0.9 from training. Duration: 0.9444375 2023-10-02 13:49:14,511 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_23850_210_4238_1_1532232597_1632433_32-185135-0_sp0.9 from training. Duration: 0.9555625 2023-10-02 13:49:15,927 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_14056_210_4163_1_1530604483_7572827_46-93547-0_sp1.1 from training. Duration: 0.863625 2023-10-02 13:49:15,978 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7931_210_1794_1_1528594728_5564059_280-279560-0 from training. Duration: 0.98 2023-10-02 13:49:17,332 WARNING [train.py:1204] (2/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_937-208281-0 from training. Duration: 0.85 2023-10-02 13:49:23,016 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_14928_210_5632_1_1530773461_4714000_454-112198-0_sp0.9 from training. Duration: 0.7333125 2023-10-02 13:49:23,017 WARNING [train.py:1204] (2/4) Exclude cut with ID _32704_210_8954_1_1533017488840_6747370_944-153333-0_sp1.1 from training. Duration: 0.963625 2023-10-02 13:49:25,685 WARNING [train.py:1204] (2/4) Exclude cut with ID _14821_210_8750_1_1530680503991_6829359_2-227151-0 from training. Duration: 0.83 2023-10-02 13:49:25,709 WARNING [train.py:1204] (2/4) Exclude cut with ID _28461_210_2253_1_1532771775189_3041299_96-152158-0_sp1.1 from training. Duration: 0.77275 2023-10-02 13:49:27,112 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7837_210_1792_1_1528453445_5841305_654-54195-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 13:49:27,145 WARNING [train.py:1204] (2/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_344-152526-0 from training. Duration: 0.53 2023-10-02 13:49:33,096 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_30167_210_3528_1_1532428012_4772375_480-142230-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 13:49:34,548 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_160-218721-0 from training. Duration: 0.96 2023-10-02 13:49:40,617 WARNING [train.py:1204] (2/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_65-88242-0_sp0.9 from training. Duration: 0.62225 2023-10-02 13:49:43,369 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_34886_210_10633_1_1533203882_1755661_76-156096-0 from training. Duration: 0.87 2023-10-02 13:49:46,769 WARNING [train.py:1204] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_188-38388-0_sp1.1 from training. Duration: 0.92725 2023-10-02 13:49:46,995 INFO [scaling.py:1118] (2/4) WithLoss: name=encoder.encoders.3.encoder.layers.0.self_attn_weights, loss-sum=0.000e+00 2023-10-02 13:49:49,346 WARNING [train.py:1204] (2/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_81-289335-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 13:49:52,260 WARNING [train.py:1204] (2/4) Exclude cut with ID _59493_210_17282_1_1535078419077_3225879_110-8778-0_sp1.1 from training. Duration: 0.936375 2023-10-02 13:49:52,305 WARNING [train.py:1204] (2/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_188-260011-0 from training. Duration: 0.73 2023-10-02 13:49:52,344 WARNING [train.py:1204] (2/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_453-133983-0_sp0.9 from training. Duration: 0.8666875 2023-10-02 13:49:58,273 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.553e+02 1.892e+02 2.173e+02 2.694e+02 4.951e+02, threshold=4.347e+02, percent-clipped=1.0 2023-10-02 13:50:00,904 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8054_210_7927_1_1528627721_4984094_553-17379-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 13:50:02,337 WARNING [train.py:1204] (2/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_341-171072-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 13:50:03,892 WARNING [train.py:1204] (2/4) Exclude cut with ID _30800_210_22382_1_1532499347904_4515959_265-257286-0_sp1.1 from training. Duration: 0.963625 2023-10-02 13:50:05,322 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_403-39619-0_sp0.9 from training. Duration: 0.9333125 2023-10-02 13:50:05,327 WARNING [train.py:1204] (2/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_464-33931-0_sp0.9 from training. Duration: 0.6 2023-10-02 13:50:05,355 WARNING [train.py:1204] (2/4) Exclude cut with ID _60804_210_9642_1_1534831199859_7268299_632-143863-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 13:50:05,362 WARNING [train.py:1204] (2/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_431-310997-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 13:50:05,413 WARNING [train.py:1204] (2/4) Exclude cut with ID _31649_210_6228_1_1532584608063_4091078_114-263860-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 13:50:05,425 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6961_210_2087_1_1527937168_6815011_562-165518-0_sp0.9 from training. Duration: 0.8555625 2023-10-02 13:50:07,519 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.feed_forward1.out_proj.dropout_p, batch_count=901620.0, ans=0.1 2023-10-02 13:50:11,931 WARNING [train.py:1204] (2/4) Exclude cut with ID _27052_210_5986_1_1532329197187_3585009_338-325870-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 13:50:11,976 WARNING [train.py:1204] (2/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_469-94673-0_sp1.1 from training. Duration: 0.7181875 2023-10-02 13:50:11,994 WARNING [train.py:1204] (2/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_259-34038-0 from training. Duration: 0.74 2023-10-02 13:50:13,272 WARNING [train.py:1204] (2/4) Exclude cut with ID _14922_210_6228_1_1530707435273_3693310_388-129552-0 from training. Duration: 0.96 2023-10-02 13:50:15,191 WARNING [train.py:1204] (2/4) Exclude cut with ID _28394_210_8913_1_1532430049518_3650118_342-251455-0_sp1.1 from training. Duration: 0.936375 2023-10-02 13:50:15,208 WARNING [train.py:1204] (2/4) Exclude cut with ID _53731_210_7994_1_1534553979244_3757214_8-191390-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 13:50:16,488 WARNING [train.py:1204] (2/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_613-232856-0 from training. Duration: 0.54 2023-10-02 13:50:16,555 WARNING [train.py:1204] (2/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_557-349534-0 from training. Duration: 0.91 2023-10-02 13:50:16,570 WARNING [train.py:1204] (2/4) Exclude cut with ID _14224_210_4381_1_1530529062037_4264010_379-184223-0 from training. Duration: 0.85 2023-10-02 13:50:16,573 WARNING [train.py:1204] (2/4) Exclude cut with ID IC0125W0001-61807 from training. Duration: 0.966 2023-10-02 13:50:17,982 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_229-45300-0 from training. Duration: 0.57 2023-10-02 13:50:19,369 WARNING [train.py:1204] (2/4) Exclude cut with ID _30304_210_6319_1_1532476827261_7582060_1069-24743-0_sp1.1 from training. Duration: 0.92725 2023-10-02 13:50:20,685 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_41452_210_7291_1_1533437687_3941281_323-172873-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 13:50:20,695 WARNING [train.py:1204] (2/4) Exclude cut with ID _37086_210_9038_1_1532999540891_7021250_802-226709-0_sp1.1 from training. Duration: 0.97275 2023-10-02 13:50:22,018 INFO [train.py:1046] (2/4) Epoch 26, batch 2450, loss[loss=0.1714, simple_loss=0.2549, pruned_loss=0.04395, over 24467.00 frames. ], tot_loss[loss=0.1688, simple_loss=0.2462, pruned_loss=0.04568, over 4724178.71 frames. ], batch size: 69, lr: 3.94e-03, grad_scale: 16.0 2023-10-02 13:50:22,092 WARNING [train.py:1204] (2/4) Exclude cut with ID IC0144W0500-71767 from training. Duration: 0.931875 2023-10-02 13:50:23,431 WARNING [train.py:1204] (2/4) Exclude cut with ID _39846_210_12651_1_1533605310265_2115212_233-129699-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 13:50:23,502 WARNING [train.py:1204] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_574-45541-0_sp0.9 from training. Duration: 0.72225 2023-10-02 13:50:26,762 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.attention_skip_rate, batch_count=901686.6666666666, ans=0.0 2023-10-02 13:50:28,091 WARNING [train.py:1204] (2/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_372-298093-0_sp1.1 from training. Duration: 0.72725 2023-10-02 13:50:28,104 WARNING [train.py:1204] (2/4) Exclude cut with ID _62574_210_2655_1_1535180407058_3933249_34-26343-0_sp1.1 from training. Duration: 0.97275 2023-10-02 13:50:31,096 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=901686.6666666666, ans=0.1 2023-10-02 13:50:32,190 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_512-256238-0_sp1.1 from training. Duration: 0.963625 2023-10-02 13:50:32,196 WARNING [train.py:1204] (2/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_196-55502-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 13:50:33,519 WARNING [train.py:1204] (2/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_195-167011-0 from training. Duration: 0.75 2023-10-02 13:50:38,165 WARNING [train.py:1204] (2/4) Exclude cut with ID _13678_210_7497_1_1530530069226_3463170_345-230416-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 13:50:38,176 WARNING [train.py:1204] (2/4) Exclude cut with ID _51078_210_9070_1_1534161557424_3805250_225-327809-0_sp1.1 from training. Duration: 0.963625 2023-10-02 13:50:42,710 WARNING [train.py:1204] (2/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_18-189198-0_sp1.1 from training. Duration: 0.8181875 2023-10-02 13:50:42,729 WARNING [train.py:1204] (2/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_329-82235-0_sp1.1 from training. Duration: 0.7454375 2023-10-02 13:50:42,738 WARNING [train.py:1204] (2/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_726-270780-0_sp0.9 from training. Duration: 0.9555625 2023-10-02 13:50:42,779 WARNING [train.py:1204] (2/4) Exclude cut with ID _66873_210_5517_1_1535454136506_4293370_654-52713-0 from training. Duration: 0.77 2023-10-02 13:50:43,075 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=901753.3333333334, ans=0.1 2023-10-02 13:50:48,688 WARNING [train.py:1204] (2/4) Exclude cut with ID _20455_210_5219_1_1531565996775_8098100_191-16006-0_sp1.1 from training. Duration: 0.963625 2023-10-02 13:50:50,180 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_3171_210_2087_1_1526109084_2330007_66-260583-0_sp1.1 from training. Duration: 0.6545625 2023-10-02 13:50:50,236 WARNING [train.py:1204] (2/4) Exclude cut with ID _38670_210_15379_1_1533448810659_4391059_63-129006-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 13:50:54,395 WARNING [train.py:1204] (2/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_393-57888-0_sp0.9 from training. Duration: 0.711125 2023-10-02 13:50:54,435 WARNING [train.py:1204] (2/4) Exclude cut with ID _6275_210_2084_1_1528110062213_7669049_857-298942-0_sp0.9 from training. Duration: 0.9444375 2023-10-02 13:50:54,618 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.0.conv_module2.balancer2.prob, batch_count=901820.0, ans=0.125 2023-10-02 13:50:55,701 WARNING [train.py:1204] (2/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_239-22076-0_sp0.9 from training. Duration: 0.9444375 2023-10-02 13:50:55,723 WARNING [train.py:1204] (2/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_424-227947-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 13:50:58,936 WARNING [train.py:1204] (2/4) Exclude cut with ID _14069_210_5632_1_1530512692033_4803390_95-1896-0 from training. Duration: 0.98 2023-10-02 13:50:59,014 WARNING [train.py:1204] (2/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_96-244299-0_sp0.9 from training. Duration: 0.97775 2023-10-02 13:51:05,907 WARNING [train.py:1204] (2/4) Exclude cut with ID _46683_210_9642_1_1533730913492_4591116_562-319825-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 13:51:07,283 WARNING [train.py:1204] (2/4) Exclude cut with ID _64639_210_5816_1_1535367619437_3528000_284-173474-0_sp1.1 from training. Duration: 0.963625 2023-10-02 13:51:07,310 WARNING [train.py:1204] (2/4) Exclude cut with ID _29529_210_7234_1_1532393961455_3977460_497-100133-0_sp1.1 from training. Duration: 0.936375 2023-10-02 13:51:07,343 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_12868_210_6753_1_1530701932_530275_18-76322-0_sp0.9 from training. Duration: 0.9666875 2023-10-02 13:51:09,038 WARNING [train.py:1204] (2/4) Exclude cut with ID _50093_210_4220_1_1534417141207_3432299_116-303447-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 13:51:10,907 WARNING [train.py:1204] (2/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_44-79427-0_sp1.1 from training. Duration: 0.8909375 2023-10-02 13:51:12,200 WARNING [train.py:1204] (2/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_887-249555-0 from training. Duration: 0.77 2023-10-02 13:51:13,693 WARNING [train.py:1204] (2/4) Exclude cut with ID _28461_210_2253_1_1532771775189_3041299_96-152158-0_sp0.9 from training. Duration: 0.9444375 2023-10-02 13:51:13,739 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_14058_210_4163_1_1530690953_6327180_49-210687-0_sp1.1 from training. Duration: 0.8545625 2023-10-02 13:51:16,542 WARNING [train.py:1204] (2/4) Exclude cut with ID _40399_210_4353_1_1533305658194_3279406_351-127903-0_sp1.1 from training. Duration: 0.97275 2023-10-02 13:51:16,555 WARNING [train.py:1204] (2/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_184-206939-0_sp1.1 from training. Duration: 0.936375 2023-10-02 13:51:21,058 WARNING [train.py:1204] (2/4) Exclude cut with ID _14100_210_8341_1_1530529320613_3678069_258-224599-0_sp1.1 from training. Duration: 0.7 2023-10-02 13:51:21,080 WARNING [train.py:1204] (2/4) Exclude cut with ID _6493_210_1747_1_1528459210424_4012470_324-323854-0 from training. Duration: 0.92 2023-10-02 13:51:22,427 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_12868_210_6753_1_1530702479_3610564_151-325626-0_sp1.1 from training. Duration: 0.8818125 2023-10-02 13:51:22,498 WARNING [train.py:1204] (2/4) Exclude cut with ID _14528_210_8254_1_1530669700514_4306939_106-43145-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 13:51:23,788 WARNING [train.py:1204] (2/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_686-54383-0 from training. Duration: 0.87 2023-10-02 13:51:23,835 WARNING [train.py:1204] (2/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_801-102362-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 13:51:25,651 WARNING [train.py:1204] (2/4) Exclude cut with ID _48677_210_5663_1_1533902405919_3892989_408-40740-0_sp0.9 from training. Duration: 0.92225 2023-10-02 13:51:29,675 WARNING [train.py:1204] (2/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_448-212135-0_sp1.1 from training. Duration: 0.82725 2023-10-02 13:51:32,471 WARNING [train.py:1204] (2/4) Exclude cut with ID _59170_210_6721_1_1534983633960_4203335_320-124047-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 13:51:32,524 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_4692_210_3528_1_1526899755_4323254_107-44401-0_sp1.1 from training. Duration: 0.7818125 2023-10-02 13:51:34,008 WARNING [train.py:1204] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_731-2428-0 from training. Duration: 0.83 2023-10-02 13:51:35,273 INFO [train.py:1046] (2/4) Epoch 26, batch 2500, loss[loss=0.1695, simple_loss=0.2471, pruned_loss=0.04598, over 23699.00 frames. ], tot_loss[loss=0.1681, simple_loss=0.2458, pruned_loss=0.04521, over 4712388.57 frames. ], batch size: 135, lr: 3.94e-03, grad_scale: 16.0 2023-10-02 13:51:35,409 WARNING [train.py:1204] (2/4) Exclude cut with ID _29857_210_20034_1_1532395886753_3798160_335-163562-0_sp1.1 from training. Duration: 0.77275 2023-10-02 13:51:36,222 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.1.encoder.layers.1.self_attn2.whiten, num_groups=1, num_channels=256, metric=10.39 vs. limit=22.5 2023-10-02 13:51:40,733 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.feed_forward1.hidden_balancer.prob, batch_count=902020.0, ans=0.125 2023-10-02 13:51:41,872 WARNING [train.py:1204] (2/4) Exclude cut with ID _14832_210_8750_1_1530691351539_3171348_171-275004-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 13:51:49,323 WARNING [train.py:1204] (2/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_105-328174-0_sp0.9 from training. Duration: 0.8444375 2023-10-02 13:51:49,354 WARNING [train.py:1204] (2/4) Exclude cut with ID _68965_210_1883_1_1535698962272_3497490_164-118421-0_sp1.1 from training. Duration: 0.936375 2023-10-02 13:51:50,739 WARNING [train.py:1204] (2/4) Exclude cut with ID _19896_210_1883_1_1531709022662_6649569_380-24170-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 13:51:50,752 WARNING [train.py:1204] (2/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_505-205270-0_sp1.1 from training. Duration: 0.3454375 2023-10-02 13:51:57,107 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.ff2_skip_rate, batch_count=902086.6666666666, ans=0.0 2023-10-02 13:51:57,159 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.feed_forward2.hidden_balancer.prob, batch_count=902086.6666666666, ans=0.125 2023-10-02 13:51:58,281 WARNING [train.py:1204] (2/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_427-208901-0_sp1.1 from training. Duration: 0.6181875 2023-10-02 13:51:58,319 WARNING [train.py:1204] (2/4) Exclude cut with ID _23818_210_6228_1_1531917063884_3676920_85-326916-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 13:51:59,718 WARNING [train.py:1204] (2/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_101-293367-0_sp1.1 from training. Duration: 0.57275 2023-10-02 13:51:59,733 WARNING [train.py:1204] (2/4) Exclude cut with ID _5409_210_1388_1_1527292850189_5865191_178-127970-0_sp1.1 from training. Duration: 0.5181875 2023-10-02 13:51:59,778 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_403-39619-0 from training. Duration: 0.84 2023-10-02 13:52:01,177 WARNING [train.py:1204] (2/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_427-87841-0_sp1.1 from training. Duration: 0.963625 2023-10-02 13:52:02,482 WARNING [train.py:1204] (2/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_286-68181-0_sp1.1 from training. Duration: 0.97275 2023-10-02 13:52:02,524 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6673_210_3273_1_1527767790_3173240_327-306185-0 from training. Duration: 0.83 2023-10-02 13:52:02,550 WARNING [train.py:1204] (2/4) Exclude cut with ID _3675_210_4557_1_1526639934408_4182089_106-203713-0_sp1.1 from training. Duration: 0.963625 2023-10-02 13:52:04,445 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_53071_210_16554_1_1534493434_1276796_130-36076-0 from training. Duration: 0.95 2023-10-02 13:52:04,479 WARNING [train.py:1204] (2/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_586-93054-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 13:52:04,775 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.conv_skip_rate, batch_count=902153.3333333334, ans=0.0 2023-10-02 13:52:08,502 WARNING [train.py:1204] (2/4) Exclude cut with ID _23825_210_13449_1_1531915313526_3497150_262-10571-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 13:52:08,567 WARNING [train.py:1204] (2/4) Exclude cut with ID _28783_210_7943_1_1532671164808_7155479_432-67885-0_sp1.1 from training. Duration: 0.97275 2023-10-02 13:52:11,867 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6287_210_3273_1_1527509506_2653716_182-199887-0_sp0.9 from training. Duration: 0.5666875 2023-10-02 13:52:13,227 WARNING [train.py:1204] (2/4) Exclude cut with ID _3231_210_2106_1_1526205864934_6784220_171-256004-0 from training. Duration: 0.86 2023-10-02 13:52:15,149 WARNING [train.py:1204] (2/4) Exclude cut with ID _69051_210_1531_1_1535885566901_3829538_332-92090-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 13:52:16,394 WARNING [train.py:1204] (2/4) Exclude cut with ID _3675_210_4557_1_1526639934408_4182089_104-324125-0_sp1.1 from training. Duration: 0.963625 2023-10-02 13:52:20,519 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_41764_210_2521_1_1533686110_4089313_501-2119-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 13:52:20,771 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.ff2_skip_rate, batch_count=902220.0, ans=0.0 2023-10-02 13:52:24,632 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_14056_210_4163_1_1530604483_7572827_56-100988-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 13:52:25,910 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.572e+02 1.808e+02 1.922e+02 2.142e+02 2.952e+02, threshold=3.844e+02, percent-clipped=0.0 2023-10-02 13:52:26,075 WARNING [train.py:1204] (2/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_398-239819-0_sp1.1 from training. Duration: 0.9 2023-10-02 13:52:26,263 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=902220.0, ans=0.1 2023-10-02 13:52:27,728 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.1.ff3_skip_rate, batch_count=902220.0, ans=0.0 2023-10-02 13:52:31,082 WARNING [train.py:1204] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_74-48717-0_sp1.1 from training. Duration: 0.52725 2023-10-02 13:52:35,421 WARNING [train.py:1204] (2/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_177-49092-0 from training. Duration: 0.91 2023-10-02 13:52:35,439 WARNING [train.py:1204] (2/4) Exclude cut with ID _62337_210_12392_1_1535009273105_3412229_44-176353-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 13:52:35,447 WARNING [train.py:1204] (2/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_110-130961-0_sp1.1 from training. Duration: 0.42725 2023-10-02 13:52:36,938 WARNING [train.py:1204] (2/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_37-189262-0_sp1.1 from training. Duration: 0.8909375 2023-10-02 13:52:36,938 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_3172_210_2087_1_1526292929_3987865_197-97762-0_sp0.9 from training. Duration: 0.8333125 2023-10-02 13:52:38,282 WARNING [train.py:1204] (2/4) Exclude cut with ID IC0677W0065-334897 from training. Duration: 0.896 2023-10-02 13:52:38,282 WARNING [train.py:1204] (2/4) Exclude cut with ID IC0677W0080-334912 from training. Duration: 0.768 2023-10-02 13:52:38,287 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_120-41668-0 from training. Duration: 0.7 2023-10-02 13:52:39,755 WARNING [train.py:1204] (2/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_460-205287-0_sp1.1 from training. Duration: 0.963625 2023-10-02 13:52:42,917 WARNING [train.py:1204] (2/4) Exclude cut with ID _14632_210_9988_1_1530691279392_3923200_102-204477-0 from training. Duration: 0.97 2023-10-02 13:52:42,922 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_34815_210_6388_1_1533381320_3571903_279-305565-0 from training. Duration: 0.92 2023-10-02 13:52:42,996 WARNING [train.py:1204] (2/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_91-148831-0_sp1.1 from training. Duration: 0.9 2023-10-02 13:52:44,318 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6291_210_3273_1_1527683649_1078498_82-221196-0 from training. Duration: 0.76 2023-10-02 13:52:47,699 WARNING [train.py:1204] (2/4) Exclude cut with ID _38020_210_5986_1_1533112252259_7006089_493-102745-0 from training. Duration: 0.98 2023-10-02 13:52:50,315 INFO [train.py:1046] (2/4) Epoch 26, batch 2550, loss[loss=0.1711, simple_loss=0.2622, pruned_loss=0.04003, over 24631.00 frames. ], tot_loss[loss=0.1675, simple_loss=0.2454, pruned_loss=0.04484, over 4713213.90 frames. ], batch size: 68, lr: 3.94e-03, grad_scale: 16.0 2023-10-02 13:52:50,414 WARNING [train.py:1204] (2/4) Exclude cut with ID _61133_210_9969_1_1535196777227_3696409_40-93442-0_sp1.1 from training. Duration: 0.92725 2023-10-02 13:52:53,039 WARNING [train.py:1204] (2/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_45-258852-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 13:52:53,073 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_53071_210_16554_1_1534493434_1276796_130-36076-0_sp1.1 from training. Duration: 0.863625 2023-10-02 13:52:55,905 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8694_210_6400_1_1529065754_3260756_332-13553-0_sp1.1 from training. Duration: 0.92725 2023-10-02 13:52:59,025 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_405-339986-0 from training. Duration: 0.51 2023-10-02 13:52:59,076 WARNING [train.py:1204] (2/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_927-43827-0_sp1.1 from training. Duration: 0.67275 2023-10-02 13:53:01,927 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_397-147196-0 from training. Duration: 0.8 2023-10-02 13:53:03,355 WARNING [train.py:1204] (2/4) Exclude cut with ID 3557-8342-0013-54691-0_sp1.1 from training. Duration: 0.836375 2023-10-02 13:53:06,068 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_47438_210_5399_1_1533797471_4513356_626-127722-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 13:53:07,998 WARNING [train.py:1204] (2/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_256-323883-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 13:53:08,003 WARNING [train.py:1204] (2/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_839-173017-0_sp0.9 from training. Duration: 0.4555625 2023-10-02 13:53:08,028 WARNING [train.py:1204] (2/4) Exclude cut with ID _5289_210_2084_1_1527678202736_3779299_392-261988-0_sp1.1 from training. Duration: 0.5909375 2023-10-02 13:53:09,306 WARNING [train.py:1204] (2/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_119-224384-0_sp1.1 from training. Duration: 0.936375 2023-10-02 13:53:09,351 WARNING [train.py:1204] (2/4) Exclude cut with ID _57523_210_1856_1_1534766294545_3732989_262-267933-0_sp1.1 from training. Duration: 0.963625 2023-10-02 13:53:12,077 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_35361_210_4238_1_1533262810_7793476_675-87012-0_sp0.9 from training. Duration: 0.988875 2023-10-02 13:53:12,100 WARNING [train.py:1204] (2/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_17-90541-0 from training. Duration: 0.94 2023-10-02 13:53:13,980 WARNING [train.py:1204] (2/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_535-14410-0_sp1.1 from training. Duration: 0.42725 2023-10-02 13:53:13,985 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_24673_210_6400_1_1531968633_778659_94-154511-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 13:53:13,988 WARNING [train.py:1204] (2/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_212-10501-0 from training. Duration: 0.94 2023-10-02 13:53:19,955 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.1.bypass_mid.scale_min, batch_count=902486.6666666666, ans=0.2 2023-10-02 13:53:25,268 WARNING [train.py:1204] (2/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_383-285196-0_sp1.1 from training. Duration: 0.8818125 2023-10-02 13:53:26,837 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.0.conv_module1.balancer2.prob, batch_count=902486.6666666666, ans=0.125 2023-10-02 13:53:29,660 INFO [scaling.py:1118] (2/4) WithLoss: name=encoder.encoders.4.encoder.layers.0.self_attn_weights, loss-sum=0.000e+00 2023-10-02 13:53:30,824 WARNING [train.py:1204] (2/4) Exclude cut with ID _45560_210_21982_1_1533812603951_3584999_238-47020-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 13:53:30,841 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_21500_210_2054_1_1532149310_2716758_278-18053-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 13:53:30,849 WARNING [train.py:1204] (2/4) Exclude cut with ID _56235_210_10640_1_1534557335235_4161099_453-9626-0_sp1.1 from training. Duration: 0.936375 2023-10-02 13:53:32,666 WARNING [train.py:1204] (2/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_153-97441-0_sp1.1 from training. Duration: 0.5090625 2023-10-02 13:53:32,949 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.ff3_skip_rate, batch_count=902553.3333333334, ans=0.0 2023-10-02 13:53:40,363 WARNING [train.py:1204] (2/4) Exclude cut with ID _67725_210_27767_1_1535608756671_5105359_431-120895-0_sp1.1 from training. Duration: 0.963625 2023-10-02 13:53:41,813 WARNING [train.py:1204] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_574-45541-0_sp1.1 from training. Duration: 0.5909375 2023-10-02 13:53:41,835 WARNING [train.py:1204] (2/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_162-103620-0_sp1.1 from training. Duration: 0.8454375 2023-10-02 13:53:41,848 WARNING [train.py:1204] (2/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_734-129761-0_sp1.1 from training. Duration: 0.7454375 2023-10-02 13:53:41,874 WARNING [train.py:1204] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_620-276793-0_sp1.1 from training. Duration: 0.536375 2023-10-02 13:53:43,218 WARNING [train.py:1204] (2/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_153-97441-0_sp0.9 from training. Duration: 0.62225 2023-10-02 13:53:45,982 WARNING [train.py:1204] (2/4) Exclude cut with ID _12733_210_8341_1_1530167424556_3646380_523-85621-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 13:53:47,789 WARNING [train.py:1204] (2/4) Exclude cut with ID _64048_210_10626_1_1535198382802_3835179_352-179834-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 13:53:52,309 WARNING [train.py:1204] (2/4) Exclude cut with ID _55162_210_17071_1_1534408248583_3753480_43-242364-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 13:53:53,669 WARNING [train.py:1204] (2/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_661-264857-0 from training. Duration: 0.93 2023-10-02 13:53:53,676 WARNING [train.py:1204] (2/4) Exclude cut with ID _6274_210_2084_1_1527937583837_7494020_325-237800-0_sp1.1 from training. Duration: 0.97275 2023-10-02 13:53:53,727 WARNING [train.py:1204] (2/4) Exclude cut with ID _55328_210_27767_1_1534580973260_4088170_385-171459-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 13:53:53,796 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_3780_210_2087_1_1526467389_2145572_176-107140-0_sp1.1 from training. Duration: 0.62725 2023-10-02 13:53:55,161 WARNING [train.py:1204] (2/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_375-244976-0_sp1.1 from training. Duration: 0.6818125 2023-10-02 13:53:56,608 WARNING [train.py:1204] (2/4) Exclude cut with ID _35941_210_15379_1_1532995294515_4023089_309-33162-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 13:54:03,846 INFO [train.py:1046] (2/4) Epoch 26, batch 2600, loss[loss=0.1734, simple_loss=0.2516, pruned_loss=0.04762, over 23259.00 frames. ], tot_loss[loss=0.1672, simple_loss=0.245, pruned_loss=0.04468, over 4715911.81 frames. ], batch size: 93, lr: 3.94e-03, grad_scale: 16.0 2023-10-02 13:54:03,903 WARNING [train.py:1204] (2/4) Exclude cut with ID _14459_210_2228_1_1531443641946_7516700_1185-22038-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 13:54:07,205 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_1197-69460-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 13:54:09,001 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9012W0022-501157 from training. Duration: 0.896 2023-10-02 13:54:11,739 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9023W0009-506302 from training. Duration: 0.896 2023-10-02 13:54:11,762 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_2821_210_2715_1_1526108385_3763560_73-296673-0_sp1.1 from training. Duration: 0.7545625 2023-10-02 13:54:11,796 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9025W0113-507442 from training. Duration: 0.896 2023-10-02 13:54:13,177 WARNING [train.py:1204] (2/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_739-2098-0 from training. Duration: 0.92 2023-10-02 13:54:13,189 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9029W0332-509722 from training. Duration: 0.768 2023-10-02 13:54:15,883 WARNING [train.py:1204] (2/4) Exclude cut with ID _16908_210_5724_1_1531119157248_3772700_68-101953-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 13:54:15,899 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9042W0011-515617 from training. Duration: 0.768 2023-10-02 13:54:17,298 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_3019_210_2028_1_1526044245_3264865_191-72235-0 from training. Duration: 0.97 2023-10-02 13:54:19,101 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9052W0006-520792 from training. Duration: 0.896 2023-10-02 13:54:20,536 WARNING [train.py:1204] (2/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_245-174553-0_sp1.1 from training. Duration: 0.8 2023-10-02 13:54:21,880 WARNING [train.py:1204] (2/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_202-67017-0 from training. Duration: 0.82 2023-10-02 13:54:23,774 WARNING [train.py:1204] (2/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_304-59227-0 from training. Duration: 0.77 2023-10-02 13:54:25,164 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_3133_210_2087_1_1526106446_2324690_246-125000-0_sp0.9 from training. Duration: 0.688875 2023-10-02 13:54:25,210 WARNING [train.py:1204] (2/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_243-42116-0 from training. Duration: 0.73 2023-10-02 13:54:26,626 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9087W0121-538492 from training. Duration: 0.896 2023-10-02 13:54:26,645 WARNING [train.py:1204] (2/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_120-76843-0 from training. Duration: 0.55 2023-10-02 13:54:32,273 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.conv_module1.balancer1.prob, batch_count=902820.0, ans=0.125 2023-10-02 13:54:34,941 WARNING [train.py:1204] (2/4) Exclude cut with ID _7852_210_5219_1_1528286471316_8139400_850-304621-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 13:54:34,958 WARNING [train.py:1204] (2/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_154-285415-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 13:54:36,859 WARNING [train.py:1204] (2/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_538-278194-0_sp1.1 from training. Duration: 0.92725 2023-10-02 13:54:36,860 WARNING [train.py:1204] (2/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_119-207162-0 from training. Duration: 0.71 2023-10-02 13:54:38,475 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.balancer2.prob, batch_count=902820.0, ans=0.125 2023-10-02 13:54:40,062 WARNING [train.py:1204] (2/4) Exclude cut with ID _2334_210_2667_1_1525608238880_4705129_459-104997-0_sp1.1 from training. Duration: 0.863625 2023-10-02 13:54:43,652 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.3.encoder.layers.0.feed_forward2.out_whiten, num_groups=1, num_channels=512, metric=4.49 vs. limit=15.0 2023-10-02 13:54:43,766 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.1.encoder.layers.1.feed_forward3.out_whiten, num_groups=1, num_channels=256, metric=9.73 vs. limit=15.0 2023-10-02 13:54:44,371 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9147W0019-567847 from training. Duration: 0.768 2023-10-02 13:54:47,352 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.conv_module1.balancer1.prob, batch_count=902886.6666666666, ans=0.125 2023-10-02 13:54:50,018 WARNING [train.py:1204] (2/4) Exclude cut with ID _69884_210_9771_1_1535846392818_4166169_228-189439-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 13:54:50,049 WARNING [train.py:1204] (2/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_196-77172-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 13:54:51,980 WARNING [train.py:1204] (2/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_625-338546-0 from training. Duration: 0.88 2023-10-02 13:54:53,265 WARNING [train.py:1204] (2/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_193-327630-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 13:54:53,267 WARNING [train.py:1204] (2/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_391-24006-0_sp1.1 from training. Duration: 0.92725 2023-10-02 13:54:53,345 WARNING [train.py:1204] (2/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_197-28164-0 from training. Duration: 0.8 2023-10-02 13:54:54,628 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.527e+02 1.918e+02 2.069e+02 2.446e+02 3.571e+02, threshold=4.138e+02, percent-clipped=0.0 2023-10-02 13:54:55,491 WARNING [train.py:1204] (2/4) Exclude cut with ID _8395_210_1531_1_1528597871523_4411579_431-132470-0_sp1.1 from training. Duration: 0.67275 2023-10-02 13:54:56,659 WARNING [train.py:1204] (2/4) Exclude cut with ID _35350_210_16040_1_1533040191183_4075299_465-248326-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 13:54:59,367 WARNING [train.py:1204] (2/4) Exclude cut with ID _16756_210_4915_1_1531047803763_7104259_101-141922-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 13:55:02,028 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9212W0005-600172 from training. Duration: 0.896 2023-10-02 13:55:02,071 WARNING [train.py:1204] (2/4) Exclude cut with ID _63186_210_2084_1_1535414479705_3908649_378-166820-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 13:55:02,085 WARNING [train.py:1204] (2/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_52-211683-0_sp1.1 from training. Duration: 0.8181875 2023-10-02 13:55:06,363 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.1.bypass.scale_min, batch_count=902953.3333333334, ans=0.2 2023-10-02 13:55:07,613 WARNING [train.py:1204] (2/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_1147-237729-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 13:55:09,471 WARNING [train.py:1204] (2/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_234-14174-0_sp0.9 from training. Duration: 0.911125 2023-10-02 13:55:09,482 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_454-276429-0 from training. Duration: 0.87 2023-10-02 13:55:10,874 WARNING [train.py:1204] (2/4) Exclude cut with ID _53713_210_24116_1_1534314487762_7416751_755-165313-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 13:55:12,860 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8278_210_2550_1_1528637099_4220413_436-317072-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 13:55:14,183 WARNING [train.py:1204] (2/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_28-324729-0_sp1.1 from training. Duration: 0.8545625 2023-10-02 13:55:18,196 INFO [train.py:1046] (2/4) Epoch 26, batch 2650, loss[loss=0.1723, simple_loss=0.2613, pruned_loss=0.04161, over 24448.00 frames. ], tot_loss[loss=0.1683, simple_loss=0.2462, pruned_loss=0.04522, over 4719054.66 frames. ], batch size: 69, lr: 3.94e-03, grad_scale: 16.0 2023-10-02 13:55:19,678 WARNING [train.py:1204] (2/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_326-165900-0 from training. Duration: 0.69 2023-10-02 13:55:19,749 WARNING [train.py:1204] (2/4) Exclude cut with ID _53489_210_6228_1_1534298357169_3835970_96-30246-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 13:55:21,206 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.feed_forward2.hidden_balancer.prob, batch_count=903020.0, ans=0.125 2023-10-02 13:55:22,347 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8275_210_4198_1_1528455320_3892571_491-17707-0_sp1.1 from training. Duration: 0.5818125 2023-10-02 13:55:26,802 WARNING [train.py:1204] (2/4) Exclude cut with ID _14348_210_8341_1_1530599434996_3778640_240-232370-0 from training. Duration: 0.84 2023-10-02 13:55:26,814 WARNING [train.py:1204] (2/4) Exclude cut with ID _45012_210_12651_1_1533608435136_5371380_54-112623-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 13:55:27,546 WARNING [train.py:1204] (2/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_328-264776-0_sp1.1 from training. Duration: 0.6090625 2023-10-02 13:55:28,822 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9325W0001-648427 from training. Duration: 0.768 2023-10-02 13:55:28,838 WARNING [train.py:1204] (2/4) Exclude cut with ID _56480_210_4220_1_1534679942079_3075409_49-74717-0_sp1.1 from training. Duration: 0.936375 2023-10-02 13:55:31,583 WARNING [train.py:1204] (2/4) Exclude cut with ID _59500_210_8254_1_1534809610680_3736009_6-241869-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 13:55:33,177 WARNING [train.py:1204] (2/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_172-215932-0_sp0.9 from training. Duration: 0.7333125 2023-10-02 13:55:34,579 WARNING [train.py:1204] (2/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_17-90541-0_sp1.1 from training. Duration: 0.8545625 2023-10-02 13:55:36,127 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.conv_module1.balancer2.prob, batch_count=903086.6666666666, ans=0.125 2023-10-02 13:55:36,159 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.ff2_skip_rate, batch_count=903086.6666666666, ans=0.0 2023-10-02 13:55:37,450 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_43831_210_2158_1_1533725502_5872791_525-89447-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 13:55:37,522 WARNING [train.py:1204] (2/4) Exclude cut with ID _12302_210_5219_1_1529924446466_8107280_907-182949-0 from training. Duration: 0.92 2023-10-02 13:55:37,535 WARNING [train.py:1204] (2/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_402-102311-0_sp0.9 from training. Duration: 0.7444375 2023-10-02 13:55:37,572 WARNING [train.py:1204] (2/4) Exclude cut with ID _8021_210_7994_1_1528376451358_3653069_179-45206-0_sp0.9 from training. Duration: 0.9555625 2023-10-02 13:55:42,723 WARNING [train.py:1204] (2/4) Exclude cut with ID _40137_210_12210_1_1533290484046_3795180_249-187059-0 from training. Duration: 0.88 2023-10-02 13:55:42,800 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9653W0194-675472 from training. Duration: 0.9658125 2023-10-02 13:55:44,277 WARNING [train.py:1204] (2/4) Exclude cut with ID _51332_210_9988_1_1534244371836_3801270_336-255604-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 13:55:46,887 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_20120_210_6753_1_1531643035_5482859_510-96996-0 from training. Duration: 0.87 2023-10-02 13:55:46,900 WARNING [train.py:1204] (2/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_1167-20437-0_sp1.1 from training. Duration: 0.92725 2023-10-02 13:55:46,963 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_3475_210_2028_1_1526193578_3622569_265-276625-0 from training. Duration: 0.76 2023-10-02 13:55:51,109 WARNING [train.py:1204] (2/4) Exclude cut with ID _14860_210_4915_1_1530702020340_7343240_470-172418-0_sp1.1 from training. Duration: 0.97275 2023-10-02 13:55:51,117 WARNING [train.py:1204] (2/4) Exclude cut with ID _5444_210_2151_1_1527418808464_5312210_581-315411-0_sp1.1 from training. Duration: 0.563625 2023-10-02 13:55:51,132 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_34450_210_9547_1_1533099443_4109050_519-342439-0_sp1.1 from training. Duration: 0.97275 2023-10-02 13:55:52,420 WARNING [train.py:1204] (2/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_50-175961-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 13:55:56,964 WARNING [train.py:1204] (2/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_372-6869-0 from training. Duration: 0.44 2023-10-02 13:55:56,971 WARNING [train.py:1204] (2/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_3-237434-0 from training. Duration: 0.76 2023-10-02 13:55:59,001 WARNING [train.py:1204] (2/4) Exclude cut with ID _8772_210_1850_1_1528635595972_3664011_247-336448-0_sp1.1 from training. Duration: 0.67275 2023-10-02 13:56:03,135 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_427-78097-0 from training. Duration: 0.8 2023-10-02 13:56:03,149 WARNING [train.py:1204] (2/4) Exclude cut with ID _69884_210_9771_1_1535846392818_4166169_466-252727-0_sp1.1 from training. Duration: 0.97275 2023-10-02 13:56:04,560 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_15972_210_6400_1_1530877934_3405692_316-35478-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 13:56:04,590 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_809-17156-0_sp0.9 from training. Duration: 0.811125 2023-10-02 13:56:05,901 WARNING [train.py:1204] (2/4) Exclude cut with ID _57572_210_5517_1_1534921219624_3809484_594-263474-0_sp1.1 from training. Duration: 0.936375 2023-10-02 13:56:05,937 WARNING [train.py:1204] (2/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_190-228204-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 13:56:07,421 WARNING [train.py:1204] (2/4) Exclude cut with ID _19514_210_9259_1_1531530071330_5084230_117-21193-0_sp1.1 from training. Duration: 0.936375 2023-10-02 13:56:09,444 WARNING [train.py:1204] (2/4) Exclude cut with ID _69584_210_5219_1_1535785133775_4044450_190-21315-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 13:56:11,308 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_53071_210_16554_1_1534493434_1276796_184-302050-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 13:56:11,430 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.bypass_mid.scale_min, batch_count=903220.0, ans=0.2 2023-10-02 13:56:12,593 WARNING [train.py:1204] (2/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_120-76843-0_sp1.1 from training. Duration: 0.5 2023-10-02 13:56:13,936 WARNING [train.py:1204] (2/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_318-162017-0_sp1.1 from training. Duration: 0.82725 2023-10-02 13:56:15,282 WARNING [train.py:1204] (2/4) Exclude cut with ID _46067_210_4220_1_1534125571066_3307450_79-46864-0_sp1.1 from training. Duration: 0.92725 2023-10-02 13:56:15,531 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.nonlin_attention.balancer.prob, batch_count=903220.0, ans=0.125 2023-10-02 13:56:16,755 WARNING [train.py:1204] (2/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_358-215558-0_sp1.1 from training. Duration: 0.8454375 2023-10-02 13:56:16,832 WARNING [train.py:1204] (2/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_621-66764-0_sp1.1 from training. Duration: 0.92725 2023-10-02 13:56:19,649 WARNING [train.py:1204] (2/4) Exclude cut with ID _42863_210_9070_1_1533902398788_3696920_338-156851-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 13:56:19,679 WARNING [train.py:1204] (2/4) Exclude cut with ID _5289_210_2084_1_1527678202736_3779299_392-261988-0_sp0.9 from training. Duration: 0.72225 2023-10-02 13:56:22,535 WARNING [train.py:1204] (2/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_409-84682-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 13:56:23,916 WARNING [train.py:1204] (2/4) Exclude cut with ID _9235_210_5219_1_1528891154253_8046120_30-326681-0_sp0.9 from training. Duration: 0.92225 2023-10-02 13:56:23,921 WARNING [train.py:1204] (2/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_389-184695-0_sp1.1 from training. Duration: 0.92725 2023-10-02 13:56:23,968 WARNING [train.py:1204] (2/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_442-320317-0 from training. Duration: 0.82 2023-10-02 13:56:27,999 WARNING [train.py:1204] (2/4) Exclude cut with ID _44778_210_2054_1_1533798127203_7443090_756-29911-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 13:56:29,800 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_401-112036-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 13:56:29,883 WARNING [train.py:1204] (2/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_181-308827-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 13:56:31,293 WARNING [train.py:1204] (2/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_332-258725-0_sp1.1 from training. Duration: 0.963625 2023-10-02 13:56:33,060 INFO [train.py:1046] (2/4) Epoch 26, batch 2700, loss[loss=0.1631, simple_loss=0.2287, pruned_loss=0.04879, over 22722.00 frames. ], tot_loss[loss=0.1678, simple_loss=0.2459, pruned_loss=0.04479, over 4733586.66 frames. ], batch size: 322, lr: 3.94e-03, grad_scale: 16.0 2023-10-02 13:56:33,124 WARNING [train.py:1204] (2/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_188-260011-0_sp0.9 from training. Duration: 0.811125 2023-10-02 13:56:33,167 WARNING [train.py:1204] (2/4) Exclude cut with ID _49039_210_6315_1_1533967537541_3846118_99-302239-0_sp1.1 from training. Duration: 0.963625 2023-10-02 13:56:35,841 WARNING [train.py:1204] (2/4) Exclude cut with ID _4122_210_2084_1_1526713164417_4353280_312-294828-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 13:56:35,846 WARNING [train.py:1204] (2/4) Exclude cut with ID _14922_210_6228_1_1530707435273_3693310_303-228738-0 from training. Duration: 0.84 2023-10-02 13:56:36,178 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.conv_module1.balancer2.prob, batch_count=903353.3333333334, ans=0.125 2023-10-02 13:56:39,083 WARNING [train.py:1204] (2/4) Exclude cut with ID _14349_210_8341_1_1530615710818_3442470_115-285975-0_sp0.9 from training. Duration: 0.9555625 2023-10-02 13:56:40,523 WARNING [train.py:1204] (2/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_125-144174-0_sp1.1 from training. Duration: 0.3545625 2023-10-02 13:56:42,492 WARNING [train.py:1204] (2/4) Exclude cut with ID _53565_210_10635_1_1534337218335_3820259_351-54570-0_sp1.1 from training. Duration: 0.97275 2023-10-02 13:56:43,772 WARNING [train.py:1204] (2/4) Exclude cut with ID _28317_210_12742_1_1532480302381_8510325_410-40065-0_sp1.1 from training. Duration: 0.963625 2023-10-02 13:56:43,801 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_38965_210_4852_1_1533174906_3484735_167-339442-0_sp1.1 from training. Duration: 0.963625 2023-10-02 13:56:43,883 WARNING [train.py:1204] (2/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_265-164486-0_sp1.1 from training. Duration: 0.87275 2023-10-02 13:56:43,892 WARNING [train.py:1204] (2/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_725-182484-0_sp1.1 from training. Duration: 0.92725 2023-10-02 13:56:43,917 WARNING [train.py:1204] (2/4) Exclude cut with ID _28317_210_12742_1_1532480302381_8510325_607-213951-0_sp0.9 from training. Duration: 0.9333125 2023-10-02 13:56:45,268 WARNING [train.py:1204] (2/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_605-133395-0_sp1.1 from training. Duration: 0.6 2023-10-02 13:56:45,295 WARNING [train.py:1204] (2/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_351-294650-0 from training. Duration: 0.63 2023-10-02 13:56:45,351 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_14953_210_2151_1_1530701710_5616165_710-165400-0_sp1.1 from training. Duration: 0.7909375 2023-10-02 13:56:48,032 WARNING [train.py:1204] (2/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_237-58433-0_sp1.1 from training. Duration: 0.67275 2023-10-02 13:56:49,385 WARNING [train.py:1204] (2/4) Exclude cut with ID _5190_210_4557_1_1527244368799_373439_28-197030-0_sp1.1 from training. Duration: 0.6545625 2023-10-02 13:56:49,436 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_743-327651-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 13:56:53,447 WARNING [train.py:1204] (2/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_76-203867-0_sp0.9 from training. Duration: 0.8 2023-10-02 13:56:53,521 WARNING [train.py:1204] (2/4) Exclude cut with ID _14603_210_4220_1_1530615688441_3097319_274-274798-0 from training. Duration: 0.98 2023-10-02 13:56:53,561 WARNING [train.py:1204] (2/4) Exclude cut with ID _14087_210_2253_1_1530529224512_3014029_226-242140-0_sp1.1 from training. Duration: 0.736375 2023-10-02 13:56:56,598 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.bypass.skip_rate, batch_count=903420.0, ans=0.04949747468305833 2023-10-02 13:56:56,883 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.4.encoder.layers.2.nonlin_attention.whiten1, num_groups=1, num_channels=288, metric=4.85 vs. limit=10.0 2023-10-02 13:56:57,773 WARNING [train.py:1204] (2/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_352-59958-0_sp1.1 from training. Duration: 0.7818125 2023-10-02 13:56:57,785 WARNING [train.py:1204] (2/4) Exclude cut with ID _39411_210_5986_1_1533538825030_3343649_191-251819-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 13:57:03,703 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7046_210_6753_1_1527895337_6920917_642-106799-0_sp1.1 from training. Duration: 0.836375 2023-10-02 13:57:03,710 WARNING [train.py:1204] (2/4) Exclude cut with ID _22826_210_9994_1_1531908201522_3279169_412-3442-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 13:57:04,372 WARNING [train.py:1204] (2/4) Exclude cut with ID _60057_210_10640_1_1534903065793_3333150_171-181133-0_sp0.9 from training. Duration: 0.97775 2023-10-02 13:57:05,455 WARNING [train.py:1204] (2/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_154-135696-0_sp1.1 from training. Duration: 0.72725 2023-10-02 13:57:08,642 WARNING [train.py:1204] (2/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_148-186805-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 13:57:08,872 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.feed_forward1.out_proj.dropout_p, batch_count=903486.6666666666, ans=0.1 2023-10-02 13:57:11,966 WARNING [train.py:1204] (2/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_605-165641-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 13:57:11,972 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_174-277364-0_sp0.9 from training. Duration: 0.788875 2023-10-02 13:57:11,985 WARNING [train.py:1204] (2/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_89-321955-0_sp0.9 from training. Duration: 0.888875 2023-10-02 13:57:14,848 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.balancer2.prob, batch_count=903486.6666666666, ans=0.125 2023-10-02 13:57:16,194 WARNING [train.py:1204] (2/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_438-105429-0_sp1.1 from training. Duration: 0.963625 2023-10-02 13:57:16,198 WARNING [train.py:1204] (2/4) Exclude cut with ID _14970_210_15709_1_1530773543493_1715730_187-20259-0_sp0.9 from training. Duration: 0.988875 2023-10-02 13:57:22,897 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.555e+02 1.793e+02 2.064e+02 2.296e+02 3.697e+02, threshold=4.128e+02, percent-clipped=0.0 2023-10-02 13:57:23,053 WARNING [train.py:1204] (2/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_385-193052-0_sp0.9 from training. Duration: 0.9555625 2023-10-02 13:57:24,373 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7113_210_1849_1_1527942685_3448768_194-311626-0_sp1.1 from training. Duration: 0.92725 2023-10-02 13:57:27,219 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_3780_210_2087_1_1526467389_2145572_176-107140-0_sp0.9 from training. Duration: 0.7666875 2023-10-02 13:57:27,221 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_36756_210_7234_1_1533026991_966276_123-213457-0_sp1.1 from training. Duration: 0.936375 2023-10-02 13:57:31,363 WARNING [train.py:1204] (2/4) Exclude cut with ID _23557_210_8750_1_1531890106370_6846255_456-244861-0_sp1.1 from training. Duration: 0.963625 2023-10-02 13:57:33,352 WARNING [train.py:1204] (2/4) Exclude cut with ID _37086_210_9038_1_1532999540891_7021250_242-349170-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 13:57:34,688 WARNING [train.py:1204] (2/4) Exclude cut with ID _41298_210_9019_1_1533430371450_4822143_471-281351-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 13:57:36,115 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_53378_210_17282_1_1534221071_5769509_572-126176-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 13:57:36,834 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_1507-263032-0_sp1.1 from training. Duration: 0.963625 2023-10-02 13:57:37,000 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=903620.0, ans=0.1 2023-10-02 13:57:38,122 WARNING [train.py:1204] (2/4) Exclude cut with ID _13679_210_7497_1_1530534293662_3355098_411-186088-0_sp1.1 from training. Duration: 0.8545625 2023-10-02 13:57:41,870 WARNING [train.py:1204] (2/4) Exclude cut with ID _29345_210_4381_1_1532689168263_3860169_149-257268-0_sp1.1 from training. Duration: 0.736375 2023-10-02 13:57:43,182 WARNING [train.py:1204] (2/4) Exclude cut with ID _47790_210_16187_1_1533862814066_4207519_381-252014-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 13:57:43,188 WARNING [train.py:1204] (2/4) Exclude cut with ID _35799_210_5854_1_1533263487887_3529071_512-2717-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 13:57:44,662 WARNING [train.py:1204] (2/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_505-205270-0_sp0.9 from training. Duration: 0.42225 2023-10-02 13:57:46,098 WARNING [train.py:1204] (2/4) Exclude cut with ID _14998_210_5219_1_1530705582080_7474970_731-20117-0_sp1.1 from training. Duration: 0.936375 2023-10-02 13:57:47,400 INFO [train.py:1046] (2/4) Epoch 26, batch 2750, loss[loss=0.1428, simple_loss=0.2225, pruned_loss=0.0315, over 24591.00 frames. ], tot_loss[loss=0.167, simple_loss=0.245, pruned_loss=0.0445, over 4725254.99 frames. ], batch size: 60, lr: 3.94e-03, grad_scale: 16.0 2023-10-02 13:57:48,836 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_37351_210_7925_1_1533012931_3812113_265-85780-0_sp1.1 from training. Duration: 0.8 2023-10-02 13:57:48,842 WARNING [train.py:1204] (2/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_313-134930-0 from training. Duration: 0.97 2023-10-02 13:57:50,236 WARNING [train.py:1204] (2/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_374-138971-0 from training. Duration: 0.94 2023-10-02 13:57:50,289 WARNING [train.py:1204] (2/4) Exclude cut with ID _30304_210_6319_1_1532476827261_7582060_1227-287886-0_sp1.1 from training. Duration: 0.936375 2023-10-02 13:57:52,913 WARNING [train.py:1204] (2/4) Exclude cut with ID _59493_210_17282_1_1535078419077_3225879_332-217544-0_sp1.1 from training. Duration: 0.97275 2023-10-02 13:57:54,283 WARNING [train.py:1204] (2/4) Exclude cut with ID _23514_210_1389_1_1531832139785_3031439_361-119640-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 13:57:55,731 WARNING [train.py:1204] (2/4) Exclude cut with ID _69584_210_5219_1_1535785133775_4044450_142-118058-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 13:57:55,758 WARNING [train.py:1204] (2/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_178-177582-0_sp1.1 from training. Duration: 0.67275 2023-10-02 13:57:55,792 WARNING [train.py:1204] (2/4) Exclude cut with ID _25303_210_17519_1_1532050207576_3635970_41-195572-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 13:57:59,862 WARNING [train.py:1204] (2/4) Exclude cut with ID _55328_210_27767_1_1534580973260_4088170_329-341322-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 13:57:59,890 WARNING [train.py:1204] (2/4) Exclude cut with ID _5608_210_2106_1_1527415439089_6819768_909-82767-0_sp1.1 from training. Duration: 0.5545625 2023-10-02 13:57:59,940 WARNING [train.py:1204] (2/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_261-297503-0_sp1.1 from training. Duration: 0.9 2023-10-02 13:57:59,947 WARNING [train.py:1204] (2/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_459-291706-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 13:57:59,947 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7762_210_1794_1_1528199508_4057408_422-227034-0 from training. Duration: 0.73 2023-10-02 13:57:59,960 WARNING [train.py:1204] (2/4) Exclude cut with ID _3067_210_2667_1_1526212974640_4503168_250-285937-0_sp0.9 from training. Duration: 0.888875 2023-10-02 13:57:59,981 WARNING [train.py:1204] (2/4) Exclude cut with ID _66873_210_5517_1_1535454136506_4293370_332-253745-0_sp1.1 from training. Duration: 0.936375 2023-10-02 13:58:03,504 INFO [scaling.py:1118] (2/4) WithLoss: name=encoder.encoders.4.encoder.layers.0.self_attn_weights, loss-sum=0.000e+00 2023-10-02 13:58:05,933 WARNING [train.py:1204] (2/4) Exclude cut with ID _13938_210_9988_1_1530518718978_3693960_304-112784-0 from training. Duration: 0.46 2023-10-02 13:58:07,961 WARNING [train.py:1204] (2/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_226-267957-0_sp1.1 from training. Duration: 0.92725 2023-10-02 13:58:07,991 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_41822_210_13270_1_1533955976_4379284_608-254567-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 13:58:09,892 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_124-113562-0_sp1.1 from training. Duration: 0.8545625 2023-10-02 13:58:09,922 WARNING [train.py:1204] (2/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_117-290916-0_sp1.1 from training. Duration: 0.463625 2023-10-02 13:58:09,993 WARNING [train.py:1204] (2/4) Exclude cut with ID _28427_210_4220_1_1532581240967_3061069_102-66533-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 13:58:11,252 WARNING [train.py:1204] (2/4) Exclude cut with ID _55383_210_10635_1_1534468426172_2231669_134-128246-0_sp1.1 from training. Duration: 0.8090625 2023-10-02 13:58:11,304 WARNING [train.py:1204] (2/4) Exclude cut with ID _45871_210_5665_1_1533864614245_3296060_49-308316-0_sp1.1 from training. Duration: 0.97275 2023-10-02 13:58:12,644 WARNING [train.py:1204] (2/4) Exclude cut with ID _57572_210_5517_1_1534921219624_3809484_176-199205-0_sp1.1 from training. Duration: 0.97275 2023-10-02 13:58:13,521 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.2.encoder.layers.0.feed_forward1.out_whiten, num_groups=1, num_channels=384, metric=6.22 vs. limit=15.0 2023-10-02 13:58:15,449 WARNING [train.py:1204] (2/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_45-265684-0_sp1.1 from training. Duration: 0.6545625 2023-10-02 13:58:15,480 WARNING [train.py:1204] (2/4) Exclude cut with ID _15150_210_8341_1_1530752411803_7256384_469-239509-0_sp0.9 from training. Duration: 0.7555625 2023-10-02 13:58:16,871 WARNING [train.py:1204] (2/4) Exclude cut with ID _28461_210_2253_1_1532771775189_3041299_136-270644-0_sp1.1 from training. Duration: 0.8454375 2023-10-02 13:58:18,181 WARNING [train.py:1204] (2/4) Exclude cut with ID _69584_210_5219_1_1535785133775_4044450_302-47302-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 13:58:19,543 WARNING [train.py:1204] (2/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_166-128941-0_sp1.1 from training. Duration: 0.4909375 2023-10-02 13:58:25,171 WARNING [train.py:1204] (2/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_559-120504-0_sp1.1 from training. Duration: 0.97275 2023-10-02 13:58:27,788 WARNING [train.py:1204] (2/4) Exclude cut with ID _6456_210_1534_1_1527681634212_3181049_345-252877-0_sp1.1 from training. Duration: 0.5818125 2023-10-02 13:58:27,823 WARNING [train.py:1204] (2/4) Exclude cut with ID _28317_210_12742_1_1532480302381_8510325_902-233003-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 13:58:33,726 WARNING [train.py:1204] (2/4) Exclude cut with ID _36538_210_9994_1_1533204020363_3795620_204-261385-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 13:58:33,730 WARNING [train.py:1204] (2/4) Exclude cut with ID _14821_210_8750_1_1530680503991_6829359_189-297451-0_sp1.1 from training. Duration: 0.5 2023-10-02 13:58:33,774 WARNING [train.py:1204] (2/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_1211-226066-0_sp0.9 from training. Duration: 0.8444375 2023-10-02 13:58:38,986 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6786_210_1388_1_1527839699_4194495_273-149841-0_sp1.1 from training. Duration: 0.836375 2023-10-02 13:58:39,015 WARNING [train.py:1204] (2/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_380-162648-0_sp1.1 from training. Duration: 0.8545625 2023-10-02 13:58:39,019 WARNING [train.py:1204] (2/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_494-199538-0 from training. Duration: 0.75 2023-10-02 13:58:42,990 WARNING [train.py:1204] (2/4) Exclude cut with ID _49097_210_16049_1_1533952780207_4360979_276-87053-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 13:58:44,443 WARNING [train.py:1204] (2/4) Exclude cut with ID _5444_210_2151_1_1527418808464_5312210_726-27165-0 from training. Duration: 0.67 2023-10-02 13:58:48,651 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_128-202104-0_sp1.1 from training. Duration: 0.57275 2023-10-02 13:58:49,954 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_11349_210_6753_1_1529800861_1101207_26-65602-0_sp1.1 from training. Duration: 0.87275 2023-10-02 13:58:51,213 WARNING [train.py:1204] (2/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_159-328157-0 from training. Duration: 0.52 2023-10-02 13:58:51,308 WARNING [train.py:1204] (2/4) Exclude cut with ID _14069_210_5632_1_1530512692033_4803390_98-21934-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 13:58:52,767 WARNING [train.py:1204] (2/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_352-59958-0_sp0.9 from training. Duration: 0.9555625 2023-10-02 13:58:54,098 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6163_210_6400_1_1527506746_4263068_283-137811-0 from training. Duration: 0.77 2023-10-02 13:58:54,142 WARNING [train.py:1204] (2/4) Exclude cut with ID _21989_210_5854_1_1531899214931_3271360_133-44759-0_sp1.1 from training. Duration: 0.7 2023-10-02 13:58:57,096 WARNING [train.py:1204] (2/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_21-174514-0_sp0.9 from training. Duration: 0.47775 2023-10-02 13:58:57,124 WARNING [train.py:1204] (2/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_201-78811-0_sp1.1 from training. Duration: 0.963625 2023-10-02 13:58:58,402 WARNING [train.py:1204] (2/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_537-164630-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 13:59:00,217 INFO [train.py:1046] (2/4) Epoch 26, batch 2800, loss[loss=0.1504, simple_loss=0.1997, pruned_loss=0.05053, over 19162.00 frames. ], tot_loss[loss=0.1669, simple_loss=0.2443, pruned_loss=0.04473, over 4712024.30 frames. ], batch size: 388, lr: 3.94e-03, grad_scale: 32.0 2023-10-02 13:59:00,293 WARNING [train.py:1204] (2/4) Exclude cut with ID _14087_210_2253_1_1530529224512_3014029_156-257607-0 from training. Duration: 0.83 2023-10-02 13:59:00,311 WARNING [train.py:1204] (2/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_308-154450-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 13:59:00,349 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_45399_210_4852_1_1533624725_7579737_214-240574-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 13:59:04,772 WARNING [train.py:1204] (2/4) Exclude cut with ID _29303_210_2575_1_1532392237303_7410649_615-305517-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 13:59:04,815 WARNING [train.py:1204] (2/4) Exclude cut with ID IC0125W0002-61808 from training. Duration: 0.708 2023-10-02 13:59:04,816 WARNING [train.py:1204] (2/4) Exclude cut with ID IC0125W0017-61823 from training. Duration: 0.759 2023-10-02 13:59:06,421 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.0.feed_forward1.hidden_balancer.prob, batch_count=904020.0, ans=0.125 2023-10-02 13:59:08,153 WARNING [train.py:1204] (2/4) Exclude cut with ID _53219_210_4525_1_1534834836008_4064825_4-145291-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 13:59:09,640 WARNING [train.py:1204] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_185-174805-0_sp1.1 from training. Duration: 0.6181875 2023-10-02 13:59:09,654 WARNING [train.py:1204] (2/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_635-193046-0_sp1.1 from training. Duration: 0.92725 2023-10-02 13:59:13,814 WARNING [train.py:1204] (2/4) Exclude cut with ID _46067_210_4220_1_1534125571066_3307450_321-258866-0_sp1.1 from training. Duration: 0.97275 2023-10-02 13:59:16,515 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8558_210_1410_1_1528505225_7613406_131-329524-0 from training. Duration: 0.79 2023-10-02 13:59:17,901 WARNING [train.py:1204] (2/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_847-41334-0_sp0.9 from training. Duration: 0.488875 2023-10-02 13:59:19,393 WARNING [train.py:1204] (2/4) Exclude cut with ID _14227_210_4381_1_1530701804286_3940259_125-275642-0 from training. Duration: 0.99 2023-10-02 13:59:20,629 WARNING [train.py:1204] (2/4) Exclude cut with ID _25248_210_8750_1_1532062825941_5554180_248-177255-0_sp1.1 from training. Duration: 0.963625 2023-10-02 13:59:20,682 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_40047_210_2290_1_1533290519_2374311_195-165693-0_sp1.1 from training. Duration: 0.8090625 2023-10-02 13:59:20,686 WARNING [train.py:1204] (2/4) Exclude cut with ID _23109_210_4381_1_1532150828777_901949_29-258672-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 13:59:24,850 WARNING [train.py:1204] (2/4) Exclude cut with ID _14821_210_8750_1_1530680503991_6829359_2-227151-0_sp1.1 from training. Duration: 0.7545625 2023-10-02 13:59:24,887 WARNING [train.py:1204] (2/4) Exclude cut with ID _49643_210_7989_1_1534640244224_3939031_565-53982-0_sp1.1 from training. Duration: 0.963625 2023-10-02 13:59:24,890 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7544_210_6753_1_1528591327_1471426_33-346031-0_sp0.9 from training. Duration: 0.67775 2023-10-02 13:59:26,296 WARNING [train.py:1204] (2/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_95-61782-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 13:59:31,382 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.2.encoder.layers.0.self_attn2.whiten, num_groups=1, num_channels=384, metric=17.44 vs. limit=22.5 2023-10-02 13:59:33,839 WARNING [train.py:1204] (2/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_686-54383-0_sp0.9 from training. Duration: 0.9666875 2023-10-02 13:59:35,811 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_14056_210_4163_1_1530604483_7572827_505-276823-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 13:59:39,165 WARNING [train.py:1204] (2/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_745-128443-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 13:59:39,224 WARNING [train.py:1204] (2/4) Exclude cut with ID _3844_210_1390_1_1526727803582_2871209_20-251664-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 13:59:41,084 WARNING [train.py:1204] (2/4) Exclude cut with ID _36538_210_9994_1_1533204020363_3795620_487-164628-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 13:59:42,802 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.balancer1.prob, batch_count=904153.3333333334, ans=0.125 2023-10-02 13:59:44,039 WARNING [train.py:1204] (2/4) Exclude cut with ID _5166_210_2084_1_1527073197160_4087993_309-222545-0_sp1.1 from training. Duration: 0.763625 2023-10-02 13:59:45,374 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_3531_210_3528_1_1526294203_5216456_309-227302-0 from training. Duration: 0.69 2023-10-02 13:59:45,447 WARNING [train.py:1204] (2/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_42-192460-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 13:59:46,765 WARNING [train.py:1204] (2/4) Exclude cut with ID _22083_210_8254_1_1531702862439_3678970_49-234546-0_sp1.1 from training. Duration: 0.7545625 2023-10-02 13:59:46,775 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_427-78097-0_sp0.9 from training. Duration: 0.888875 2023-10-02 13:59:47,033 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.conv_module1.balancer2.min_abs, batch_count=904220.0, ans=0.5 2023-10-02 13:59:50,941 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.541e+02 1.855e+02 1.985e+02 2.213e+02 3.780e+02, threshold=3.970e+02, percent-clipped=0.0 2023-10-02 13:59:50,998 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_43802_210_2290_1_1533516203_8829504_378-205254-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 13:59:51,040 WARNING [train.py:1204] (2/4) Exclude cut with ID _45329_210_7005_1_1534157951211_6774339_672-314830-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 13:59:55,107 WARNING [train.py:1204] (2/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_90-30673-0_sp1.1 from training. Duration: 0.763625 2023-10-02 13:59:56,596 WARNING [train.py:1204] (2/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_563-329943-0_sp1.1 from training. Duration: 0.9 2023-10-02 13:59:56,611 WARNING [train.py:1204] (2/4) Exclude cut with ID _21632_210_2253_1_1531994001765_3744140_154-84090-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 13:59:56,612 WARNING [train.py:1204] (2/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_103-336064-0_sp0.9 from training. Duration: 0.5666875 2023-10-02 13:59:56,663 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_43-71989-0_sp0.9 from training. Duration: 0.6333125 2023-10-02 13:59:58,032 WARNING [train.py:1204] (2/4) Exclude cut with ID _3867_210_1883_1_1526641200575_4475830_319-349850-0_sp0.9 from training. Duration: 0.7444375 2023-10-02 13:59:58,628 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.3.encoder.layers.2.nonlin_attention.whiten1, num_groups=1, num_channels=384, metric=4.09 vs. limit=10.0 2023-10-02 13:59:59,301 WARNING [train.py:1204] (2/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_629-179454-0_sp1.1 from training. Duration: 0.963625 2023-10-02 13:59:59,308 WARNING [train.py:1204] (2/4) Exclude cut with ID _65263_210_22356_1_1535763598691_3921037_49-222917-0 from training. Duration: 0.92 2023-10-02 13:59:59,344 WARNING [train.py:1204] (2/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_602-331772-0_sp1.1 from training. Duration: 0.936375 2023-10-02 14:00:00,799 WARNING [train.py:1204] (2/4) Exclude cut with ID _58414_210_8254_1_1535421609162_7151182_292-134978-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 14:00:00,825 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_38793_210_2544_1_1533176114_3849320_200-299292-0_sp1.1 from training. Duration: 0.936375 2023-10-02 14:00:02,526 WARNING [train.py:1204] (2/4) Exclude cut with ID _14970_210_15709_1_1530775547361_1966965_6-23328-0 from training. Duration: 0.86 2023-10-02 14:00:04,359 WARNING [train.py:1204] (2/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_386-225511-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 14:00:04,376 WARNING [train.py:1204] (2/4) Exclude cut with ID _38323_210_12108_1_1533121233369_3303049_416-11919-0_sp1.1 from training. Duration: 0.87275 2023-10-02 14:00:05,677 WARNING [train.py:1204] (2/4) Exclude cut with ID _4866_210_2577_1_1527404412284_4364659_256-306830-0_sp1.1 from training. Duration: 0.7909375 2023-10-02 14:00:07,184 WARNING [train.py:1204] (2/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_53-300544-0 from training. Duration: 0.67 2023-10-02 14:00:10,920 WARNING [train.py:1204] (2/4) Exclude cut with ID _41314_210_12651_1_1533459476543_3766460_80-74184-0_sp1.1 from training. Duration: 0.7545625 2023-10-02 14:00:12,158 WARNING [train.py:1204] (2/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_332-256168-0_sp1.1 from training. Duration: 0.6818125 2023-10-02 14:00:12,205 WARNING [train.py:1204] (2/4) Exclude cut with ID _48836_210_12210_1_1534075848293_3904150_429-35475-0_sp1.1 from training. Duration: 0.8090625 2023-10-02 14:00:13,633 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_144-135833-0_sp1.1 from training. Duration: 0.97275 2023-10-02 14:00:14,809 INFO [train.py:1046] (2/4) Epoch 26, batch 2850, loss[loss=0.1739, simple_loss=0.2451, pruned_loss=0.05135, over 23488.00 frames. ], tot_loss[loss=0.1665, simple_loss=0.2443, pruned_loss=0.04435, over 4713995.66 frames. ], batch size: 120, lr: 3.94e-03, grad_scale: 32.0 2023-10-02 14:00:16,472 WARNING [train.py:1204] (2/4) Exclude cut with ID _14224_210_4381_1_1530529062037_4264010_380-343670-0_sp0.9 from training. Duration: 0.97775 2023-10-02 14:00:16,497 WARNING [train.py:1204] (2/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_213-265174-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 14:00:16,517 WARNING [train.py:1204] (2/4) Exclude cut with ID _20455_210_5219_1_1531565996775_8098100_86-314283-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 14:00:20,372 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_41750_210_1960_1_1533520678_3948042_68-223813-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 14:00:20,430 WARNING [train.py:1204] (2/4) Exclude cut with ID _55153_210_2253_1_1534384786677_3162096_24-198429-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 14:00:23,125 WARNING [train.py:1204] (2/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_119-207162-0_sp0.9 from training. Duration: 0.788875 2023-10-02 14:00:23,187 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_12868_210_6753_1_1530702479_3610564_148-172861-0 from training. Duration: 0.5 2023-10-02 14:00:31,843 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_5522_210_3273_1_1527249606_3909096_318-203153-0 from training. Duration: 0.43 2023-10-02 14:00:31,850 WARNING [train.py:1204] (2/4) Exclude cut with ID _14884_210_2252_1_1530707459689_5561250_288-189949-0_sp1.1 from training. Duration: 0.97275 2023-10-02 14:00:33,282 WARNING [train.py:1204] (2/4) Exclude cut with ID _5294_210_1747_1_1527249643928_3696179_529-242298-0 from training. Duration: 0.95 2023-10-02 14:00:33,347 WARNING [train.py:1204] (2/4) Exclude cut with ID _36538_210_9994_1_1533204020363_3795620_503-110289-0_sp1.1 from training. Duration: 0.936375 2023-10-02 14:00:36,685 WARNING [train.py:1204] (2/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_385-193052-0 from training. Duration: 0.86 2023-10-02 14:00:37,959 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_456-22141-0 from training. Duration: 0.88 2023-10-02 14:00:38,187 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.conv_module1.balancer1.min_positive, batch_count=904420.0, ans=0.025 2023-10-02 14:00:39,852 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_38965_210_4852_1_1533174906_3484735_284-117840-0_sp1.1 from training. Duration: 0.936375 2023-10-02 14:00:43,559 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.conv_module1.balancer2.prob, batch_count=904486.6666666666, ans=0.125 2023-10-02 14:00:50,414 WARNING [train.py:1204] (2/4) Exclude cut with ID _32704_210_8954_1_1533017488840_6747370_75-201625-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 14:00:51,881 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.conv_module1.balancer1.prob, batch_count=904486.6666666666, ans=0.125 2023-10-02 14:00:53,036 WARNING [train.py:1204] (2/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_255-312654-0_sp1.1 from training. Duration: 0.82725 2023-10-02 14:00:53,046 WARNING [train.py:1204] (2/4) Exclude cut with ID _47251_210_18628_1_1533814063128_4137250_214-88076-0_sp0.9 from training. Duration: 0.97775 2023-10-02 14:00:53,123 WARNING [train.py:1204] (2/4) Exclude cut with ID _4092_210_2151_1_1526643044449_3135759_227-213972-0_sp1.1 from training. Duration: 0.4909375 2023-10-02 14:00:53,145 WARNING [train.py:1204] (2/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_912-47473-0_sp1.1 from training. Duration: 0.6181875 2023-10-02 14:00:53,164 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6767_210_3273_1_1527855783_1718932_173-256601-0_sp1.1 from training. Duration: 0.72725 2023-10-02 14:00:54,545 WARNING [train.py:1204] (2/4) Exclude cut with ID _6274_210_2084_1_1527937583837_7494020_32-205288-0_sp1.1 from training. Duration: 0.7090625 2023-10-02 14:00:54,578 WARNING [train.py:1204] (2/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_39-248870-0 from training. Duration: 0.69 2023-10-02 14:00:57,321 WARNING [train.py:1204] (2/4) Exclude cut with ID _3541_210_2477_1_1526475408709_3263973_377-296839-0_sp1.1 from training. Duration: 0.5 2023-10-02 14:00:57,348 WARNING [train.py:1204] (2/4) Exclude cut with ID _13679_210_7497_1_1530534293662_3355098_413-56154-0_sp1.1 from training. Duration: 0.963625 2023-10-02 14:00:58,771 WARNING [train.py:1204] (2/4) Exclude cut with ID _14970_210_15709_1_1530775547361_1966965_201-84544-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 14:01:00,098 WARNING [train.py:1204] (2/4) Exclude cut with ID _21783_210_11357_1_1531742458730_3762679_105-95775-0_sp1.1 from training. Duration: 0.936375 2023-10-02 14:01:01,672 WARNING [train.py:1204] (2/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_131-191669-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 14:01:01,703 WARNING [train.py:1204] (2/4) Exclude cut with ID _14860_210_4915_1_1530702020340_7343240_695-4323-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 14:01:03,679 WARNING [train.py:1204] (2/4) Exclude cut with ID _39867_210_9826_1_1533553222030_9344259_991-130565-0_sp1.1 from training. Duration: 0.97275 2023-10-02 14:01:04,356 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.3.encoder.layers.0.nonlin_attention.whiten2, num_groups=1, num_channels=512, metric=6.75 vs. limit=15.0 2023-10-02 14:01:06,931 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_50-3289-0_sp1.1 from training. Duration: 0.82725 2023-10-02 14:01:08,306 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_54-347670-0_sp1.1 from training. Duration: 0.92725 2023-10-02 14:01:08,361 WARNING [train.py:1204] (2/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_200-153445-0_sp1.1 from training. Duration: 0.936375 2023-10-02 14:01:09,737 WARNING [train.py:1204] (2/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_279-302911-0_sp1.1 from training. Duration: 0.97275 2023-10-02 14:01:11,633 WARNING [train.py:1204] (2/4) Exclude cut with ID _14886_210_4220_1_1530702020355_3516190_159-73748-0_sp0.9 from training. Duration: 0.92225 2023-10-02 14:01:15,693 WARNING [train.py:1204] (2/4) Exclude cut with ID _53379_210_20034_1_1534222027363_3854654_23-222715-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 14:01:17,090 WARNING [train.py:1204] (2/4) Exclude cut with ID _12555_210_8750_1_1530075594402_3780239_449-267986-0 from training. Duration: 0.83 2023-10-02 14:01:18,426 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7544_210_6753_1_1528587722_3576686_295-258479-0 from training. Duration: 0.84 2023-10-02 14:01:18,575 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7002_210_1794_1_1527935634_4034444_487-236501-0_sp0.9 from training. Duration: 0.7333125 2023-10-02 14:01:19,936 WARNING [train.py:1204] (2/4) Exclude cut with ID _21783_210_11357_1_1531742458730_3762679_244-19107-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 14:01:21,204 WARNING [train.py:1204] (2/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_390-339137-0 from training. Duration: 0.56 2023-10-02 14:01:21,272 WARNING [train.py:1204] (2/4) Exclude cut with ID _7562_210_2084_1_1528887767916_3946229_186-326422-0_sp1.1 from training. Duration: 0.763625 2023-10-02 14:01:22,586 WARNING [train.py:1204] (2/4) Exclude cut with ID _33640_210_2655_1_1533117605479_4855469_645-145235-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 14:01:22,609 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_25767_210_2161_1_1532307483_3867748_506-171777-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 14:01:22,641 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_5438_210_1794_1_1527336687_2600009_233-58072-0_sp1.1 from training. Duration: 0.7 2023-10-02 14:01:22,641 WARNING [train.py:1204] (2/4) Exclude cut with ID IC0669W0063-330908 from training. Duration: 0.896 2023-10-02 14:01:22,671 WARNING [train.py:1204] (2/4) Exclude cut with ID IC0671W0070-331913 from training. Duration: 0.896 2023-10-02 14:01:22,674 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_258-310570-0_sp1.1 from training. Duration: 0.7454375 2023-10-02 14:01:24,069 WARNING [train.py:1204] (2/4) Exclude cut with ID _9461_210_4557_1_1529060514726_3677451_162-177507-0_sp1.1 from training. Duration: 0.97275 2023-10-02 14:01:28,027 INFO [train.py:1046] (2/4) Epoch 26, batch 2900, loss[loss=0.1776, simple_loss=0.259, pruned_loss=0.04803, over 23196.00 frames. ], tot_loss[loss=0.1665, simple_loss=0.2442, pruned_loss=0.04444, over 4704983.27 frames. ], batch size: 93, lr: 3.94e-03, grad_scale: 32.0 2023-10-02 14:01:28,204 WARNING [train.py:1204] (2/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_26-175553-0_sp0.9 from training. Duration: 0.67775 2023-10-02 14:01:28,218 WARNING [train.py:1204] (2/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_7-150481-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 14:01:29,546 WARNING [train.py:1204] (2/4) Exclude cut with ID _23557_210_8750_1_1531890106370_6846255_72-273262-0_sp1.1 from training. Duration: 0.963625 2023-10-02 14:01:29,640 WARNING [train.py:1204] (2/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_367-61330-0 from training. Duration: 0.75 2023-10-02 14:01:32,933 WARNING [train.py:1204] (2/4) Exclude cut with ID _39867_210_9826_1_1533553222030_9344259_1337-11141-0_sp1.1 from training. Duration: 0.936375 2023-10-02 14:01:32,954 WARNING [train.py:1204] (2/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_101-293367-0 from training. Duration: 0.63 2023-10-02 14:01:34,309 WARNING [train.py:1204] (2/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_10-100367-0 from training. Duration: 0.98 2023-10-02 14:01:37,394 WARNING [train.py:1204] (2/4) Exclude cut with ID _60057_210_10640_1_1534903065793_3333150_171-181133-0_sp1.1 from training. Duration: 0.8 2023-10-02 14:01:37,397 WARNING [train.py:1204] (2/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_844-96989-0_sp0.9 from training. Duration: 0.788875 2023-10-02 14:01:40,860 WARNING [train.py:1204] (2/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_301-144595-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 14:01:44,069 WARNING [train.py:1204] (2/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_115-147343-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 14:01:46,813 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_3171_210_2087_1_1526109084_2330007_37-64334-0_sp1.1 from training. Duration: 0.7454375 2023-10-02 14:01:46,864 WARNING [train.py:1204] (2/4) Exclude cut with ID _19514_210_9259_1_1531530071330_5084230_769-272914-0_sp1.1 from training. Duration: 0.936375 2023-10-02 14:01:50,964 WARNING [train.py:1204] (2/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_76-311963-0_sp0.9 from training. Duration: 0.77775 2023-10-02 14:01:51,002 WARNING [train.py:1204] (2/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_526-62079-0 from training. Duration: 0.67 2023-10-02 14:01:52,276 WARNING [train.py:1204] (2/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_7-111954-0_sp0.9 from training. Duration: 0.62225 2023-10-02 14:01:53,704 WARNING [train.py:1204] (2/4) Exclude cut with ID _57997_210_4163_1_1534766425889_3961049_360-279140-0_sp1.1 from training. Duration: 0.97275 2023-10-02 14:01:55,149 WARNING [train.py:1204] (2/4) Exclude cut with ID _4866_210_2577_1_1527404412284_4364659_133-1696-0 from training. Duration: 0.99 2023-10-02 14:01:55,223 WARNING [train.py:1204] (2/4) Exclude cut with ID _5190_210_4557_1_1527244368799_373439_28-197030-0 from training. Duration: 0.72 2023-10-02 14:01:57,194 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.3.encoder.layers.3.nonlin_attention.whiten1, num_groups=1, num_channels=384, metric=5.77 vs. limit=10.0 2023-10-02 14:01:57,876 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_53-39003-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 14:01:57,879 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6673_210_3273_1_1527767790_3173240_351-167954-0 from training. Duration: 0.48 2023-10-02 14:01:57,910 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_2822_210_2715_1_1526112586_1999587_165-36656-0_sp1.1 from training. Duration: 0.8090625 2023-10-02 14:02:00,544 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_328-264892-0_sp1.1 from training. Duration: 0.92725 2023-10-02 14:02:00,549 WARNING [train.py:1204] (2/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_29-293896-0_sp0.9 from training. Duration: 0.67775 2023-10-02 14:02:03,650 WARNING [train.py:1204] (2/4) Exclude cut with ID _65263_210_22356_1_1535763598691_3921037_465-55448-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 14:02:03,718 WARNING [train.py:1204] (2/4) Exclude cut with ID _21989_210_5854_1_1531899214931_3271360_311-53106-0_sp1.1 from training. Duration: 0.97275 2023-10-02 14:02:06,646 WARNING [train.py:1204] (2/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_389-277915-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 14:02:09,767 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_69630_210_7234_1_1535791094_677837_63-277139-0_sp1.1 from training. Duration: 0.963625 2023-10-02 14:02:09,906 WARNING [train.py:1204] (2/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_85-138257-0 from training. Duration: 0.71 2023-10-02 14:02:11,124 WARNING [train.py:1204] (2/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_686-247600-0 from training. Duration: 0.92 2023-10-02 14:02:11,768 WARNING [train.py:1204] (2/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_108-255430-0_sp1.1 from training. Duration: 0.8818125 2023-10-02 14:02:15,854 WARNING [train.py:1204] (2/4) Exclude cut with ID _14884_210_2252_1_1530707459689_5561250_114-201399-0_sp1.1 from training. Duration: 0.6909375 2023-10-02 14:02:18,557 WARNING [train.py:1204] (2/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_506-42614-0 from training. Duration: 0.79 2023-10-02 14:02:19,439 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.1.encoder.layers.1.conv_module2.whiten, num_groups=1, num_channels=256, metric=10.46 vs. limit=15.0 2023-10-02 14:02:19,822 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.537e+02 1.775e+02 2.023e+02 2.342e+02 3.264e+02, threshold=4.046e+02, percent-clipped=0.0 2023-10-02 14:02:19,904 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_43802_210_2290_1_1533516203_8829504_85-195240-0_sp1.1 from training. Duration: 0.7909375 2023-10-02 14:02:22,860 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_141-36243-0_sp1.1 from training. Duration: 0.97275 2023-10-02 14:02:27,132 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.conv_module2.balancer2.prob, batch_count=904953.3333333334, ans=0.125 2023-10-02 14:02:32,995 WARNING [train.py:1204] (2/4) Exclude cut with ID _7852_210_5219_1_1528286471316_8139400_358-287492-0_sp1.1 from training. Duration: 0.87275 2023-10-02 14:02:33,018 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_805-71497-0_sp0.9 from training. Duration: 0.788875 2023-10-02 14:02:34,352 WARNING [train.py:1204] (2/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_178-39722-0 from training. Duration: 0.49 2023-10-02 14:02:37,194 WARNING [train.py:1204] (2/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_428-181925-0_sp1.1 from training. Duration: 0.963625 2023-10-02 14:02:37,198 WARNING [train.py:1204] (2/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_770-200430-0 from training. Duration: 0.89 2023-10-02 14:02:37,232 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_1104-271009-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 14:02:38,572 WARNING [train.py:1204] (2/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_128-296581-0_sp0.9 from training. Duration: 0.82225 2023-10-02 14:02:41,846 INFO [train.py:1046] (2/4) Epoch 26, batch 2950, loss[loss=0.2381, simple_loss=0.2976, pruned_loss=0.08931, over 19276.00 frames. ], tot_loss[loss=0.1676, simple_loss=0.2455, pruned_loss=0.04482, over 4710411.26 frames. ], batch size: 388, lr: 3.94e-03, grad_scale: 16.0 2023-10-02 14:02:43,840 WARNING [train.py:1204] (2/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_19-62511-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 14:02:45,663 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_805-71497-0 from training. Duration: 0.71 2023-10-02 14:02:47,045 WARNING [train.py:1204] (2/4) Exclude cut with ID _44778_210_2054_1_1533798127203_7443090_573-152077-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 14:02:47,050 WARNING [train.py:1204] (2/4) Exclude cut with ID _42875_210_14856_1_1533704397487_4012433_393-177816-0_sp1.1 from training. Duration: 0.963625 2023-10-02 14:02:47,144 WARNING [train.py:1204] (2/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_278-35159-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 14:02:48,549 WARNING [train.py:1204] (2/4) Exclude cut with ID _15037_210_2253_1_1530752294104_3267650_13-130361-0_sp1.1 from training. Duration: 0.9 2023-10-02 14:02:49,966 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_3019_210_2028_1_1526044245_3264865_127-135615-0 from training. Duration: 0.94 2023-10-02 14:02:51,240 WARNING [train.py:1204] (2/4) Exclude cut with ID _28317_210_12742_1_1532480302381_8510325_607-213951-0 from training. Duration: 0.84 2023-10-02 14:02:51,305 WARNING [train.py:1204] (2/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_436-318113-0_sp0.9 from training. Duration: 0.8333125 2023-10-02 14:02:51,306 WARNING [train.py:1204] (2/4) Exclude cut with ID _13938_210_9988_1_1530518718978_3693960_148-152225-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 14:02:56,784 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_2541_210_1794_1_1525590269_3084765_156-173713-0_sp1.1 from training. Duration: 0.736375 2023-10-02 14:02:58,723 WARNING [train.py:1204] (2/4) Exclude cut with ID _28323_210_16187_1_1532480395667_4100300_531-24157-0_sp1.1 from training. Duration: 0.8909375 2023-10-02 14:02:58,969 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=905086.6666666666, ans=0.1 2023-10-02 14:03:00,173 WARNING [train.py:1204] (2/4) Exclude cut with ID _29303_210_2575_1_1532392237303_7410649_382-217492-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 14:03:00,224 WARNING [train.py:1204] (2/4) Exclude cut with ID _58895_210_8750_1_1535184081320_3649089_270-158013-0_sp1.1 from training. Duration: 0.92725 2023-10-02 14:03:04,211 WARNING [train.py:1204] (2/4) Exclude cut with ID _29277_210_3279_1_1532581109293_3173069_311-206227-0_sp1.1 from training. Duration: 0.936375 2023-10-02 14:03:04,222 WARNING [train.py:1204] (2/4) Exclude cut with ID _7160_210_1876_1_1527940923231_3703799_282-322611-0_sp0.9 from training. Duration: 0.9666875 2023-10-02 14:03:07,050 WARNING [train.py:1204] (2/4) Exclude cut with ID _53219_210_4525_1_1534834836008_4064825_562-32295-0_sp1.1 from training. Duration: 0.963625 2023-10-02 14:03:07,134 WARNING [train.py:1204] (2/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_408-87330-0_sp1.1 from training. Duration: 0.963625 2023-10-02 14:03:07,142 WARNING [train.py:1204] (2/4) Exclude cut with ID _59202_210_26984_1_1534816810612_3549174_137-318710-0_sp1.1 from training. Duration: 0.8090625 2023-10-02 14:03:11,127 WARNING [train.py:1204] (2/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_372-30302-0 from training. Duration: 0.86 2023-10-02 14:03:14,025 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.conv_module1.balancer2.prob, batch_count=905153.3333333334, ans=0.125 2023-10-02 14:03:15,086 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7543_210_6753_1_1528541907_1399322_178-56269-0_sp1.1 from training. Duration: 0.95275 2023-10-02 14:03:15,103 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9109W0012-549758 from training. Duration: 0.512 2023-10-02 14:03:17,032 WARNING [train.py:1204] (2/4) Exclude cut with ID _5410_210_1388_1_1527397055795_1719020_39-112221-0_sp0.9 from training. Duration: 0.7666875 2023-10-02 14:03:18,033 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.0.layers.1.nonlin_attention.whiten1, num_groups=1, num_channels=144, metric=5.41 vs. limit=10.0 2023-10-02 14:03:18,457 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9121W0012-554933 from training. Duration: 0.896 2023-10-02 14:03:19,816 WARNING [train.py:1204] (2/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_476-272757-0 from training. Duration: 0.93 2023-10-02 14:03:19,844 WARNING [train.py:1204] (2/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_1085-171718-0_sp1.1 from training. Duration: 0.92725 2023-10-02 14:03:19,905 WARNING [train.py:1204] (2/4) Exclude cut with ID _8480_210_1874_1_1528589391205_6728330_309-139896-0_sp1.1 from training. Duration: 0.736375 2023-10-02 14:03:19,905 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9130W0113-559703 from training. Duration: 0.896 2023-10-02 14:03:19,909 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_292-265928-0_sp1.1 from training. Duration: 0.463625 2023-10-02 14:03:22,662 WARNING [train.py:1204] (2/4) Exclude cut with ID _8772_210_1850_1_1528635595972_3664011_96-12756-0 from training. Duration: 0.98 2023-10-02 14:03:23,948 WARNING [train.py:1204] (2/4) Exclude cut with ID _12556_210_8341_1_1530081024100_7327690_725-193477-0_sp1.1 from training. Duration: 0.8909375 2023-10-02 14:03:24,002 WARNING [train.py:1204] (2/4) Exclude cut with ID _5289_210_2084_1_1527678202736_3779299_142-103317-0_sp1.1 from training. Duration: 0.763625 2023-10-02 14:03:26,727 WARNING [train.py:1204] (2/4) Exclude cut with ID _45012_210_12651_1_1533608435136_5371380_206-51075-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 14:03:28,125 WARNING [train.py:1204] (2/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_176-56120-0_sp1.1 from training. Duration: 0.7545625 2023-10-02 14:03:28,154 WARNING [train.py:1204] (2/4) Exclude cut with ID _40284_210_2084_1_1533513679814_3704279_220-201864-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 14:03:28,185 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9165W0004-576638 from training. Duration: 0.896 2023-10-02 14:03:28,216 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_5438_210_1794_1_1527336687_2600009_218-343336-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 14:03:29,457 WARNING [train.py:1204] (2/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_318-162017-0 from training. Duration: 0.91 2023-10-02 14:03:35,368 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_49544_210_1794_1_1534379770_5262083_483-349298-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 14:03:36,738 WARNING [train.py:1204] (2/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_197-28164-0_sp0.9 from training. Duration: 0.888875 2023-10-02 14:03:38,132 WARNING [train.py:1204] (2/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_643-52007-0 from training. Duration: 0.57 2023-10-02 14:03:38,155 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_38965_210_4852_1_1533174906_3484735_403-332963-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 14:03:39,512 WARNING [train.py:1204] (2/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_340-174176-0 from training. Duration: 0.49 2023-10-02 14:03:41,519 WARNING [train.py:1204] (2/4) Exclude cut with ID _56708_210_13177_1_1535252351282_3422899_363-917-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 14:03:43,453 WARNING [train.py:1204] (2/4) Exclude cut with ID _19896_210_1883_1_1531709022662_6649569_525-159063-0_sp1.1 from training. Duration: 0.936375 2023-10-02 14:03:43,479 WARNING [train.py:1204] (2/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_430-116139-0_sp0.9 from training. Duration: 0.9444375 2023-10-02 14:03:43,667 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.0.conv_module2.balancer1.prob, batch_count=905286.6666666666, ans=0.125 2023-10-02 14:03:44,883 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_53753_210_7234_1_1534320003_3594148_86-329250-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 14:03:44,906 WARNING [train.py:1204] (2/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_406-275649-0_sp1.1 from training. Duration: 0.3818125 2023-10-02 14:03:48,007 WARNING [train.py:1204] (2/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_1083-147536-0_sp1.1 from training. Duration: 0.8818125 2023-10-02 14:03:48,078 WARNING [train.py:1204] (2/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_54-311741-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 14:03:48,082 WARNING [train.py:1204] (2/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_237-58433-0_sp0.9 from training. Duration: 0.82225 2023-10-02 14:03:49,401 WARNING [train.py:1204] (2/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_98-51676-0_sp1.1 from training. Duration: 0.663625 2023-10-02 14:03:49,474 WARNING [train.py:1204] (2/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_386-270472-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 14:03:49,655 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.conv_skip_rate, batch_count=905286.6666666666, ans=0.0 2023-10-02 14:03:50,965 WARNING [train.py:1204] (2/4) Exclude cut with ID _19023_210_4353_1_1531378892552_5820780_83-254794-0_sp1.1 from training. Duration: 0.7818125 2023-10-02 14:03:53,854 WARNING [train.py:1204] (2/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_314-93656-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 14:03:53,875 WARNING [train.py:1204] (2/4) Exclude cut with ID _14170_210_5219_1_1530525949868_4469650_336-213530-0 from training. Duration: 0.9 2023-10-02 14:03:55,209 WARNING [train.py:1204] (2/4) Exclude cut with ID _36686_210_5665_1_1533778225913_3646410_65-221120-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 14:03:56,575 INFO [train.py:1046] (2/4) Epoch 26, batch 3000, loss[loss=0.1497, simple_loss=0.2318, pruned_loss=0.03377, over 22497.00 frames. ], tot_loss[loss=0.1686, simple_loss=0.2465, pruned_loss=0.04532, over 4713842.07 frames. ], batch size: 49, lr: 3.93e-03, grad_scale: 16.0 2023-10-02 14:03:56,576 INFO [train.py:1069] (2/4) Computing validation loss 2023-10-02 14:04:08,888 INFO [train.py:1078] (2/4) Epoch 26, validation: loss=0.3521, simple_loss=0.2784, pruned_loss=0.2129, over 1125622.00 frames. 2023-10-02 14:04:08,888 INFO [train.py:1079] (2/4) Maximum memory allocated so far is 20967MB 2023-10-02 14:04:09,216 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.feed_forward3.hidden_balancer.prob, batch_count=905353.3333333334, ans=0.125 2023-10-02 14:04:10,320 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_26433_210_12108_1_1532157158_4056480_271-68178-0_sp1.1 from training. Duration: 0.92725 2023-10-02 14:04:11,678 WARNING [train.py:1204] (2/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_46-207599-0_sp0.9 from training. Duration: 0.711125 2023-10-02 14:04:17,150 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9303W0114-637133 from training. Duration: 0.896 2023-10-02 14:04:17,180 WARNING [train.py:1204] (2/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_490-207861-0 from training. Duration: 0.98 2023-10-02 14:04:18,764 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.self_attn_weights.pos_emb_skip_rate, batch_count=905353.3333333334, ans=0.0 2023-10-02 14:04:19,912 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6767_210_3273_1_1527855783_1718932_173-256601-0_sp0.9 from training. Duration: 0.888875 2023-10-02 14:04:19,953 WARNING [train.py:1204] (2/4) Exclude cut with ID _13921_210_9994_1_1530841782272_3503520_327-69677-0_sp1.1 from training. Duration: 0.8181875 2023-10-02 14:04:20,001 WARNING [train.py:1204] (2/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_508-335369-0 from training. Duration: 0.48 2023-10-02 14:04:20,028 WARNING [train.py:1204] (2/4) Exclude cut with ID _64631_210_27767_1_1535349580068_4165160_24-46567-0_sp1.1 from training. Duration: 0.963625 2023-10-02 14:04:26,684 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_89-35748-0_sp1.1 from training. Duration: 0.7181875 2023-10-02 14:04:33,823 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_26433_210_12108_1_1532157158_4056480_578-262107-0_sp1.1 from training. Duration: 0.9 2023-10-02 14:04:39,628 WARNING [train.py:1204] (2/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_129-337802-0 from training. Duration: 0.72 2023-10-02 14:04:40,935 WARNING [train.py:1204] (2/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_356-167816-0_sp0.9 from training. Duration: 0.8 2023-10-02 14:04:44,157 WARNING [train.py:1204] (2/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_137-116442-0_sp1.1 from training. Duration: 0.7090625 2023-10-02 14:04:46,136 WARNING [train.py:1204] (2/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_358-172940-0_sp1.1 from training. Duration: 0.963625 2023-10-02 14:04:46,156 WARNING [train.py:1204] (2/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_668-344147-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 14:04:48,091 WARNING [train.py:1204] (2/4) Exclude cut with ID _62337_210_12392_1_1535009273105_3412229_255-157667-0_sp1.1 from training. Duration: 0.936375 2023-10-02 14:04:48,094 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_486-151131-0 from training. Duration: 0.85 2023-10-02 14:04:48,377 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.conv_skip_rate, batch_count=905486.6666666666, ans=0.0 2023-10-02 14:04:49,082 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.0.layers.1.feed_forward3.out_whiten, num_groups=1, num_channels=192, metric=9.47 vs. limit=15.0 2023-10-02 14:04:50,893 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_53638_210_21581_1_1534297319_5808449_885-80172-0 from training. Duration: 0.96 2023-10-02 14:04:52,240 WARNING [train.py:1204] (2/4) Exclude cut with ID _50093_210_4220_1_1534417141207_3432299_105-183067-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 14:04:52,293 WARNING [train.py:1204] (2/4) Exclude cut with ID _7562_210_2084_1_1528887767916_3946229_345-169156-0_sp0.9 from training. Duration: 0.6444375 2023-10-02 14:04:54,964 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_4528_210_3273_1_1527076654_3276777_209-219804-0_sp0.9 from training. Duration: 0.7666875 2023-10-02 14:04:54,998 WARNING [train.py:1204] (2/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_35-153886-0_sp0.9 from training. Duration: 0.9333125 2023-10-02 14:04:56,343 WARNING [train.py:1204] (2/4) Exclude cut with ID _30800_210_22382_1_1532499347904_4515959_332-121925-0_sp1.1 from training. Duration: 0.92725 2023-10-02 14:04:56,345 WARNING [train.py:1204] (2/4) Exclude cut with ID _59202_210_26984_1_1534816810612_3549174_495-65515-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 14:04:57,982 WARNING [train.py:1204] (2/4) Exclude cut with ID _6274_210_2084_1_1527937583837_7494020_769-168302-0_sp1.1 from training. Duration: 0.6454375 2023-10-02 14:04:59,303 WARNING [train.py:1204] (2/4) Exclude cut with ID _24271_210_12658_1_1531987270022_7190519_708-71345-0_sp1.1 from training. Duration: 0.936375 2023-10-02 14:04:59,303 WARNING [train.py:1204] (2/4) Exclude cut with ID 3033-130750-0096-55598-0_sp0.9 from training. Duration: 0.92225 2023-10-02 14:05:00,727 WARNING [train.py:1204] (2/4) Exclude cut with ID _14087_210_2253_1_1530529224512_3014029_219-224445-0_sp0.9 from training. Duration: 0.9333125 2023-10-02 14:05:03,752 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.596e+02 1.996e+02 2.253e+02 2.544e+02 4.342e+02, threshold=4.506e+02, percent-clipped=3.0 2023-10-02 14:05:03,875 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_174-277364-0 from training. Duration: 0.71 2023-10-02 14:05:03,958 WARNING [train.py:1204] (2/4) Exclude cut with ID _45021_210_9994_1_1533619751094_3330288_96-239010-0_sp1.1 from training. Duration: 0.82725 2023-10-02 14:05:05,232 WARNING [train.py:1204] (2/4) Exclude cut with ID _31602_210_5854_1_1532677021456_4458250_489-305116-0_sp1.1 from training. Duration: 0.97275 2023-10-02 14:05:05,257 WARNING [train.py:1204] (2/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_269-154726-0_sp0.9 from training. Duration: 0.9555625 2023-10-02 14:05:09,307 WARNING [train.py:1204] (2/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_126-54418-0_sp1.1 from training. Duration: 0.92725 2023-10-02 14:05:09,329 WARNING [train.py:1204] (2/4) Exclude cut with ID _42326_210_7770_1_1533814282531_3707269_182-224159-0_sp1.1 from training. Duration: 0.92725 2023-10-02 14:05:10,673 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7002_210_1794_1_1527939683_5157286_1-281264-0_sp0.9 from training. Duration: 0.488875 2023-10-02 14:05:10,706 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_86-16881-0 from training. Duration: 0.58 2023-10-02 14:05:10,728 WARNING [train.py:1204] (2/4) Exclude cut with ID _19514_210_9259_1_1531530071330_5084230_637-279094-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 14:05:10,759 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_26433_210_12108_1_1532157158_4056480_578-262107-0 from training. Duration: 0.99 2023-10-02 14:05:12,067 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6291_210_3273_1_1527683649_1078498_82-221196-0_sp1.1 from training. Duration: 0.6909375 2023-10-02 14:05:14,672 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.2.encoder.layers.2.self_attn_weights.whiten_keys, num_groups=4, num_channels=128, metric=3.55 vs. limit=6.0 2023-10-02 14:05:15,285 WARNING [train.py:1204] (2/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_402-102311-0 from training. Duration: 0.67 2023-10-02 14:05:19,144 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1015-343884-0_sp1.1 from training. Duration: 0.563625 2023-10-02 14:05:20,582 WARNING [train.py:1204] (2/4) Exclude cut with ID _13684_210_7497_1_1530619354863_3598359_312-235606-0_sp1.1 from training. Duration: 0.5454375 2023-10-02 14:05:20,629 WARNING [train.py:1204] (2/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_55-31904-0 from training. Duration: 0.88 2023-10-02 14:05:20,868 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.feed_forward2.hidden_balancer.prob, batch_count=905620.0, ans=0.125 2023-10-02 14:05:21,952 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_3133_210_2087_1_1526106446_2324690_25-332138-0 from training. Duration: 0.91 2023-10-02 14:05:21,954 WARNING [train.py:1204] (2/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_912-282412-0_sp0.9 from training. Duration: 0.5444375 2023-10-02 14:05:23,229 INFO [train.py:1046] (2/4) Epoch 26, batch 3050, loss[loss=0.1511, simple_loss=0.2306, pruned_loss=0.03578, over 24332.00 frames. ], tot_loss[loss=0.1685, simple_loss=0.2465, pruned_loss=0.04525, over 4728854.19 frames. ], batch size: 61, lr: 3.93e-03, grad_scale: 4.0 2023-10-02 14:05:23,328 WARNING [train.py:1204] (2/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_458-299435-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 14:05:24,703 WARNING [train.py:1204] (2/4) Exclude cut with ID _69584_210_5219_1_1535785133775_4044450_275-88403-0_sp1.1 from training. Duration: 0.92725 2023-10-02 14:05:24,719 WARNING [train.py:1204] (2/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_841-341679-0_sp1.1 from training. Duration: 0.52725 2023-10-02 14:05:24,726 WARNING [train.py:1204] (2/4) Exclude cut with ID _41391_210_7994_1_1533603771872_3638201_1-54521-0_sp1.1 from training. Duration: 0.97275 2023-10-02 14:05:26,084 WARNING [train.py:1204] (2/4) Exclude cut with ID _56235_210_10640_1_1534557335235_4161099_495-186609-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 14:05:27,528 WARNING [train.py:1204] (2/4) Exclude cut with ID _4681_210_3525_1_1527250689464_120889_15-126760-0 from training. Duration: 0.81 2023-10-02 14:05:30,105 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_3019_210_2028_1_1526044245_3264865_127-135615-0_sp1.1 from training. Duration: 0.8545625 2023-10-02 14:05:31,566 WARNING [train.py:1204] (2/4) Exclude cut with ID _30191_210_1664_1_1533189915047_7196670_915-21561-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 14:05:31,702 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.bypass_mid.scale_min, batch_count=905686.6666666666, ans=0.2 2023-10-02 14:05:32,886 WARNING [train.py:1204] (2/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_6-214815-0_sp0.9 from training. Duration: 0.9444375 2023-10-02 14:05:35,600 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_45399_210_4852_1_1533624725_7579737_190-137494-0_sp1.1 from training. Duration: 0.97275 2023-10-02 14:05:38,970 WARNING [train.py:1204] (2/4) Exclude cut with ID _41314_210_12651_1_1533459476543_3766460_207-132038-0 from training. Duration: 0.94 2023-10-02 14:05:43,127 WARNING [train.py:1204] (2/4) Exclude cut with ID _14603_210_4220_1_1530615688441_3097319_90-31383-0 from training. Duration: 0.87 2023-10-02 14:05:44,598 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_15517_210_6753_1_1530952293_1664795_119-31001-0 from training. Duration: 0.94 2023-10-02 14:05:44,624 WARNING [train.py:1204] (2/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_684-85342-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 14:05:49,709 WARNING [train.py:1204] (2/4) Exclude cut with ID _14603_210_4220_1_1530615688441_3097319_97-325053-0_sp1.1 from training. Duration: 0.836375 2023-10-02 14:05:54,441 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_68702_210_23689_1_1535799577_3420068_168-25150-0_sp1.1 from training. Duration: 0.97275 2023-10-02 14:05:54,448 WARNING [train.py:1204] (2/4) Exclude cut with ID _65134_210_14763_1_1535335393985_3505368_111-27984-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 14:05:54,502 WARNING [train.py:1204] (2/4) Exclude cut with ID _48678_210_5663_1_1533988830947_3769199_276-226230-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 14:05:57,413 WARNING [train.py:1204] (2/4) Exclude cut with ID _28427_210_4220_1_1532581240967_3061069_43-84928-0_sp1.1 from training. Duration: 0.9 2023-10-02 14:05:57,447 WARNING [train.py:1204] (2/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_158-250116-0_sp1.1 from training. Duration: 0.563625 2023-10-02 14:05:58,762 WARNING [train.py:1204] (2/4) Exclude cut with ID _66873_210_5517_1_1535454136506_4293370_279-332556-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 14:05:58,805 WARNING [train.py:1204] (2/4) Exclude cut with ID _42863_210_9070_1_1533902398788_3696920_488-230146-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 14:05:58,806 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_387-247159-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 14:06:00,134 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6729_210_2550_1_1528032481_1788725_261-42619-0_sp1.1 from training. Duration: 0.97275 2023-10-02 14:06:01,581 WARNING [train.py:1204] (2/4) Exclude cut with ID _19896_210_1883_1_1531709022662_6649569_831-44724-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 14:06:03,021 WARNING [train.py:1204] (2/4) Exclude cut with ID _55153_210_2253_1_1534384786677_3162096_280-285944-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 14:06:03,042 WARNING [train.py:1204] (2/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_448-4048-0 from training. Duration: 0.96 2023-10-02 14:06:04,448 WARNING [train.py:1204] (2/4) Exclude cut with ID _29678_210_7893_1_1532421042787_3906118_456-202548-0_sp1.1 from training. Duration: 0.97275 2023-10-02 14:06:04,460 WARNING [train.py:1204] (2/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_107-332398-0_sp1.1 from training. Duration: 0.7090625 2023-10-02 14:06:06,569 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.2.encoder.layers.0.whiten, num_groups=1, num_channels=384, metric=5.42 vs. limit=12.0 2023-10-02 14:06:07,166 WARNING [train.py:1204] (2/4) Exclude cut with ID _13921_210_9994_1_1530838702800_3003429_46-149444-0_sp1.1 from training. Duration: 0.8545625 2023-10-02 14:06:07,212 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_257-20733-0_sp0.9 from training. Duration: 0.8444375 2023-10-02 14:06:07,265 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_53753_210_7234_1_1534320003_3594148_385-234809-0_sp1.1 from training. Duration: 0.963625 2023-10-02 14:06:08,628 WARNING [train.py:1204] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_292-215346-0_sp1.1 from training. Duration: 0.8818125 2023-10-02 14:06:10,593 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.0.feed_forward3.hidden_balancer.prob, batch_count=905886.6666666666, ans=0.125 2023-10-02 14:06:13,180 WARNING [train.py:1204] (2/4) Exclude cut with ID _68965_210_1883_1_1535698962272_3497490_196-124268-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 14:06:14,501 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_23801_210_16223_1_1532066392_3294657_570-5439-0_sp1.1 from training. Duration: 0.8818125 2023-10-02 14:06:14,845 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=905886.6666666666, ans=0.1 2023-10-02 14:06:19,282 WARNING [train.py:1204] (2/4) Exclude cut with ID _5166_210_2084_1_1527073197160_4087993_349-6928-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 14:06:19,331 WARNING [train.py:1204] (2/4) Exclude cut with ID _60621_210_17071_1_1535013053530_3671739_226-255202-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 14:06:19,335 WARNING [train.py:1204] (2/4) Exclude cut with ID _41817_210_18835_1_1533434477160_7423049_448-130302-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 14:06:21,366 WARNING [train.py:1204] (2/4) Exclude cut with ID _6653_210_2629_1_1527919172441_6732679_758-143677-0_sp1.1 from training. Duration: 0.8909375 2023-10-02 14:06:22,719 WARNING [train.py:1204] (2/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_912-47473-0_sp0.9 from training. Duration: 0.7555625 2023-10-02 14:06:22,759 WARNING [train.py:1204] (2/4) Exclude cut with ID _6694_210_4599_1_1527852637490_3965529_356-169233-0_sp1.1 from training. Duration: 0.9 2023-10-02 14:06:24,123 WARNING [train.py:1204] (2/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_1-99680-0 from training. Duration: 0.67 2023-10-02 14:06:24,215 WARNING [train.py:1204] (2/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_199-86432-0_sp1.1 from training. Duration: 0.8909375 2023-10-02 14:06:24,239 WARNING [train.py:1204] (2/4) Exclude cut with ID _61148_210_6897_1_1535094022726_4217610_362-182430-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 14:06:25,648 WARNING [train.py:1204] (2/4) Exclude cut with ID _49376_210_5986_1_1533985278732_3370740_301-154763-0 from training. Duration: 0.98 2023-10-02 14:06:26,933 WARNING [train.py:1204] (2/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_572-185150-0_sp1.1 from training. Duration: 0.8818125 2023-10-02 14:06:32,392 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_47896_210_20508_1_1534492518_5958868_558-314572-0_sp1.1 from training. Duration: 0.8818125 2023-10-02 14:06:33,718 WARNING [train.py:1204] (2/4) Exclude cut with ID _45560_210_21982_1_1533812603951_3584999_111-6376-0_sp0.9 from training. Duration: 0.9333125 2023-10-02 14:06:34,023 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.bypass_mid.scale_min, batch_count=905953.3333333334, ans=0.2 2023-10-02 14:06:36,515 INFO [train.py:1046] (2/4) Epoch 26, batch 3100, loss[loss=0.1495, simple_loss=0.2261, pruned_loss=0.03645, over 24308.00 frames. ], tot_loss[loss=0.1683, simple_loss=0.2457, pruned_loss=0.04538, over 4724288.37 frames. ], batch size: 56, lr: 3.93e-03, grad_scale: 8.0 2023-10-02 14:06:36,569 WARNING [train.py:1204] (2/4) Exclude cut with ID _14170_210_5219_1_1530525949868_4469650_412-317077-0_sp1.1 from training. Duration: 0.6818125 2023-10-02 14:06:38,565 WARNING [train.py:1204] (2/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_718-68505-0 from training. Duration: 0.89 2023-10-02 14:06:39,938 WARNING [train.py:1204] (2/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_77-311404-0 from training. Duration: 0.94 2023-10-02 14:06:41,216 WARNING [train.py:1204] (2/4) Exclude cut with ID _60229_210_27767_1_1534838406158_3655350_264-279573-0 from training. Duration: 0.83 2023-10-02 14:06:42,616 WARNING [train.py:1204] (2/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_106-300985-0_sp1.1 from training. Duration: 0.8181875 2023-10-02 14:06:47,726 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_4183_210_2087_1_1526729100_1869838_79-277189-0_sp1.1 from training. Duration: 0.963625 2023-10-02 14:06:47,736 WARNING [train.py:1204] (2/4) Exclude cut with ID _28427_210_4220_1_1532581240967_3061069_195-295268-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 14:06:49,419 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.conv_skip_rate, batch_count=906020.0, ans=0.0 2023-10-02 14:06:50,557 WARNING [train.py:1204] (2/4) Exclude cut with ID _4092_210_2151_1_1526643044449_3135759_356-278694-0_sp0.9 from training. Duration: 0.52225 2023-10-02 14:06:53,737 WARNING [train.py:1204] (2/4) Exclude cut with ID _14459_210_2228_1_1531443641946_7516700_777-71446-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 14:06:59,121 WARNING [train.py:1204] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_520-126729-0 from training. Duration: 0.7 2023-10-02 14:07:04,545 WARNING [train.py:1204] (2/4) Exclude cut with ID _13684_210_7497_1_1530619354863_3598359_312-235606-0_sp0.9 from training. Duration: 0.6666875 2023-10-02 14:07:04,590 WARNING [train.py:1204] (2/4) Exclude cut with ID _37537_210_2253_1_1533171758568_3421949_283-349402-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 14:07:04,618 WARNING [train.py:1204] (2/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_29-117932-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 14:07:06,010 WARNING [train.py:1204] (2/4) Exclude cut with ID _29303_210_2575_1_1532392237303_7410649_721-283351-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 14:07:07,852 WARNING [train.py:1204] (2/4) Exclude cut with ID _14886_210_4220_1_1530702020355_3516190_175-229073-0_sp0.9 from training. Duration: 0.5 2023-10-02 14:07:09,339 WARNING [train.py:1204] (2/4) Exclude cut with ID _15458_210_8341_1_1530858614263_3834410_70-268830-0_sp1.1 from training. Duration: 0.97275 2023-10-02 14:07:09,356 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_2295_210_2087_1_1525500993_2407863_234-83104-0 from training. Duration: 0.62 2023-10-02 14:07:09,357 WARNING [train.py:1204] (2/4) Exclude cut with ID _22961_210_8254_1_1531976366672_4278180_317-199546-0_sp1.1 from training. Duration: 0.7818125 2023-10-02 14:07:10,779 WARNING [train.py:1204] (2/4) Exclude cut with ID _27894_210_12392_1_1532415496078_6902439_547-196022-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 14:07:12,157 WARNING [train.py:1204] (2/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_46-207599-0 from training. Duration: 0.64 2023-10-02 14:07:13,560 WARNING [train.py:1204] (2/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_426-326246-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 14:07:15,752 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.bypass_mid.scale_min, batch_count=906153.3333333334, ans=0.2 2023-10-02 14:07:16,854 WARNING [train.py:1204] (2/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_445-117398-0_sp0.9 from training. Duration: 0.988875 2023-10-02 14:07:16,894 WARNING [train.py:1204] (2/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_563-347832-0 from training. Duration: 0.89 2023-10-02 14:07:16,972 WARNING [train.py:1204] (2/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_252-216674-0 from training. Duration: 0.54 2023-10-02 14:07:18,912 WARNING [train.py:1204] (2/4) Exclude cut with ID _59611_210_8895_1_1534741198240_3650361_187-212779-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 14:07:18,969 WARNING [train.py:1204] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_282-159367-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 14:07:21,584 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_68702_210_23689_1_1535799577_3420068_33-136883-0_sp1.1 from training. Duration: 0.936375 2023-10-02 14:07:21,600 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_47228_210_4983_1_1533780021_3672027_184-293706-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 14:07:22,922 WARNING [train.py:1204] (2/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_656-188361-0_sp0.9 from training. Duration: 0.9666875 2023-10-02 14:07:23,042 WARNING [train.py:1204] (2/4) Exclude cut with ID _4866_210_2577_1_1527404412284_4364659_352-157930-0_sp0.9 from training. Duration: 0.9 2023-10-02 14:07:23,043 WARNING [train.py:1204] (2/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_210-106047-0_sp1.1 from training. Duration: 0.8818125 2023-10-02 14:07:24,848 WARNING [train.py:1204] (2/4) Exclude cut with ID _9389_210_1664_1_1529217337301_5289269_289-213417-0_sp1.1 from training. Duration: 0.8090625 2023-10-02 14:07:24,875 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_3910_210_1794_1_1526777667_3747260_178-41186-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 14:07:26,134 WARNING [train.py:1204] (2/4) Exclude cut with ID _27052_210_5986_1_1532329197187_3585009_24-172263-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 14:07:26,139 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_5522_210_3273_1_1527249606_3909096_318-203153-0_sp1.1 from training. Duration: 0.3909375 2023-10-02 14:07:30,735 WARNING [train.py:1204] (2/4) Exclude cut with ID _27894_210_12392_1_1532415496078_6902439_222-51982-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 14:07:31,917 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.510e+02 1.814e+02 2.073e+02 2.408e+02 3.405e+02, threshold=4.147e+02, percent-clipped=0.0 2023-10-02 14:07:32,036 WARNING [train.py:1204] (2/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_87-131427-0 from training. Duration: 0.78 2023-10-02 14:07:34,787 WARNING [train.py:1204] (2/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_372-298093-0_sp0.9 from training. Duration: 0.888875 2023-10-02 14:07:36,152 WARNING [train.py:1204] (2/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_663-166973-0 from training. Duration: 0.8 2023-10-02 14:07:36,190 WARNING [train.py:1204] (2/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_429-177469-0_sp1.1 from training. Duration: 0.936375 2023-10-02 14:07:37,929 WARNING [train.py:1204] (2/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_724-172651-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 14:07:37,973 WARNING [train.py:1204] (2/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_103-336064-0 from training. Duration: 0.51 2023-10-02 14:07:41,130 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.bypass.scale_min, batch_count=906286.6666666666, ans=0.2 2023-10-02 14:07:45,771 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.balancer1.prob, batch_count=906286.6666666666, ans=0.125 2023-10-02 14:07:48,289 WARNING [train.py:1204] (2/4) Exclude cut with ID _48677_210_5663_1_1533902405919_3892989_111-11439-0 from training. Duration: 0.96 2023-10-02 14:07:51,595 INFO [train.py:1046] (2/4) Epoch 26, batch 3150, loss[loss=0.1375, simple_loss=0.1878, pruned_loss=0.04359, over 19231.00 frames. ], tot_loss[loss=0.1669, simple_loss=0.2438, pruned_loss=0.04499, over 4712567.77 frames. ], batch size: 388, lr: 3.93e-03, grad_scale: 8.0 2023-10-02 14:07:51,671 WARNING [train.py:1204] (2/4) Exclude cut with ID _8085_210_4557_1_1528455633493_3716100_95-103398-0_sp1.1 from training. Duration: 0.963625 2023-10-02 14:07:51,724 WARNING [train.py:1204] (2/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_1041-110837-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 14:07:54,481 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_50050_210_16223_1_1535435796_7290138_1155-121757-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 14:07:54,483 WARNING [train.py:1204] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_602-275260-0_sp0.9 from training. Duration: 0.911125 2023-10-02 14:07:54,552 WARNING [train.py:1204] (2/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_265-164486-0 from training. Duration: 0.96 2023-10-02 14:07:56,478 WARNING [train.py:1204] (2/4) Exclude cut with ID _46067_210_4220_1_1534125571066_3307450_142-229375-0_sp1.1 from training. Duration: 0.963625 2023-10-02 14:07:57,830 WARNING [train.py:1204] (2/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_310-26186-0_sp1.1 from training. Duration: 0.636375 2023-10-02 14:07:59,205 WARNING [train.py:1204] (2/4) Exclude cut with ID _28461_210_2253_1_1532771775189_3041299_136-270644-0 from training. Duration: 0.93 2023-10-02 14:08:00,683 WARNING [train.py:1204] (2/4) Exclude cut with ID _14860_210_4915_1_1530702020340_7343240_438-286372-0_sp1.1 from training. Duration: 0.936375 2023-10-02 14:08:02,142 WARNING [train.py:1204] (2/4) Exclude cut with ID IC0128W0004-63309 from training. Duration: 0.831 2023-10-02 14:08:04,953 WARNING [train.py:1204] (2/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_135-174080-0 from training. Duration: 0.53 2023-10-02 14:08:04,993 WARNING [train.py:1204] (2/4) Exclude cut with ID _41326_210_5854_1_1533778076509_3492080_277-59511-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 14:08:05,062 WARNING [train.py:1204] (2/4) Exclude cut with ID IC0144W0112-71379 from training. Duration: 0.992 2023-10-02 14:08:06,392 WARNING [train.py:1204] (2/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_711-15688-0_sp0.9 from training. Duration: 0.47775 2023-10-02 14:08:07,699 WARNING [train.py:1204] (2/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_237-332027-0 from training. Duration: 0.95 2023-10-02 14:08:09,618 WARNING [train.py:1204] (2/4) Exclude cut with ID _7970_210_1883_1_1528455616296_3472230_25-178407-0 from training. Duration: 0.71 2023-10-02 14:08:09,620 WARNING [train.py:1204] (2/4) Exclude cut with ID _8480_210_1874_1_1528589391205_6728330_309-139896-0 from training. Duration: 0.81 2023-10-02 14:08:09,666 WARNING [train.py:1204] (2/4) Exclude cut with ID _39571_210_13449_1_1533801584143_3651077_319-268877-0_sp1.1 from training. Duration: 0.936375 2023-10-02 14:08:09,670 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_43802_210_2290_1_1533516203_8829504_155-246880-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 14:08:10,958 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_566-122599-0_sp1.1 from training. Duration: 0.936375 2023-10-02 14:08:11,751 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.3.encoder.layers.0.self_attn1.whiten, num_groups=1, num_channels=512, metric=20.30 vs. limit=22.5 2023-10-02 14:08:12,517 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_12868_210_6753_1_1530701932_530275_18-76322-0 from training. Duration: 0.87 2023-10-02 14:08:12,809 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.feed_forward3.hidden_balancer.prob, batch_count=906420.0, ans=0.125 2023-10-02 14:08:13,927 WARNING [train.py:1204] (2/4) Exclude cut with ID _56494_210_4220_1_1535284837625_3033169_159-106035-0_sp1.1 from training. Duration: 0.963625 2023-10-02 14:08:13,973 WARNING [train.py:1204] (2/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_98-276013-0_sp1.1 from training. Duration: 0.963625 2023-10-02 14:08:14,023 WARNING [train.py:1204] (2/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_330-216124-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 14:08:17,282 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_177-58005-0_sp0.9 from training. Duration: 0.688875 2023-10-02 14:08:21,385 WARNING [train.py:1204] (2/4) Exclude cut with ID _56235_210_10640_1_1534557335235_4161099_343-139898-0 from training. Duration: 0.96 2023-10-02 14:08:22,108 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.1.encoder.layers.0.feed_forward1.out_whiten, num_groups=1, num_channels=256, metric=5.06 vs. limit=15.0 2023-10-02 14:08:22,723 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6961_210_2087_1_1527937168_6815011_395-185060-0_sp1.1 from training. Duration: 0.863625 2023-10-02 14:08:25,970 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7002_210_1794_1_1527939683_5157286_184-229630-0_sp0.9 from training. Duration: 0.82225 2023-10-02 14:08:26,009 WARNING [train.py:1204] (2/4) Exclude cut with ID _59170_210_6721_1_1534983633960_4203335_153-18895-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 14:08:26,056 WARNING [train.py:1204] (2/4) Exclude cut with ID _49643_210_7989_1_1534640244224_3939031_598-161926-0 from training. Duration: 0.9 2023-10-02 14:08:29,305 WARNING [train.py:1204] (2/4) Exclude cut with ID _4068_210_2629_1_1526697350395_1546259_224-288693-0 from training. Duration: 0.59 2023-10-02 14:08:30,673 WARNING [train.py:1204] (2/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_718-68505-0_sp1.1 from training. Duration: 0.8090625 2023-10-02 14:08:30,735 WARNING [train.py:1204] (2/4) Exclude cut with ID _4068_210_2629_1_1526697350395_1546259_224-288693-0_sp0.9 from training. Duration: 0.6555625 2023-10-02 14:08:32,045 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_2296_210_2087_1_1525503448_3397335_57-185430-0_sp0.9 from training. Duration: 0.6666875 2023-10-02 14:08:32,107 WARNING [train.py:1204] (2/4) Exclude cut with ID _15458_210_8341_1_1530858614263_3834410_49-235472-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 14:08:32,117 WARNING [train.py:1204] (2/4) Exclude cut with ID _64135_210_2253_1_1535677267876_7608972_498-5039-0_sp0.9 from training. Duration: 0.8666875 2023-10-02 14:08:33,517 WARNING [train.py:1204] (2/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_283-220372-0_sp0.9 from training. Duration: 0.62225 2023-10-02 14:08:33,525 WARNING [train.py:1204] (2/4) Exclude cut with ID _6473_210_1741_1_1527901307396_3815380_108-17335-0_sp0.9 from training. Duration: 0.72225 2023-10-02 14:08:36,075 WARNING [train.py:1204] (2/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_113-92608-0 from training. Duration: 0.54 2023-10-02 14:08:36,118 WARNING [train.py:1204] (2/4) Exclude cut with ID _14170_210_5219_1_1530525949868_4469650_272-249985-0_sp1.1 from training. Duration: 0.6090625 2023-10-02 14:08:36,137 WARNING [train.py:1204] (2/4) Exclude cut with ID _50376_210_13401_1_1534031502101_4217740_518-187390-0_sp1.1 from training. Duration: 0.97275 2023-10-02 14:08:37,579 WARNING [train.py:1204] (2/4) Exclude cut with ID _59738_210_15379_1_1534849512562_3941470_165-248260-0_sp1.1 from training. Duration: 0.92725 2023-10-02 14:08:37,588 WARNING [train.py:1204] (2/4) Exclude cut with ID _23818_210_6228_1_1531917063884_3676920_400-343925-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 14:08:38,903 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6961_210_2087_1_1527937168_6815011_562-165518-0 from training. Duration: 0.77 2023-10-02 14:08:38,956 WARNING [train.py:1204] (2/4) Exclude cut with ID _24271_210_12658_1_1531987270022_7190519_289-207931-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 14:08:42,233 WARNING [train.py:1204] (2/4) Exclude cut with ID _14632_210_9988_1_1530691279392_3923200_332-105904-0 from training. Duration: 0.9 2023-10-02 14:08:42,245 WARNING [train.py:1204] (2/4) Exclude cut with ID _40402_210_18219_1_1533522324115_2953150_203-232438-0_sp1.1 from training. Duration: 0.97275 2023-10-02 14:08:43,615 WARNING [train.py:1204] (2/4) Exclude cut with ID _13252_210_1925_1_1530671372047_3336909_263-168388-0 from training. Duration: 0.86 2023-10-02 14:08:43,686 WARNING [train.py:1204] (2/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_174-205058-0 from training. Duration: 0.95 2023-10-02 14:08:45,047 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_94-195801-0_sp1.1 from training. Duration: 0.8545625 2023-10-02 14:08:46,353 WARNING [train.py:1204] (2/4) Exclude cut with ID _55153_210_2253_1_1534384786677_3162096_269-103204-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 14:08:46,427 WARNING [train.py:1204] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_748-36150-0 from training. Duration: 0.91 2023-10-02 14:08:47,114 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_12868_210_6753_1_1530702479_3610564_148-172861-0_sp1.1 from training. Duration: 0.4545625 2023-10-02 14:08:48,415 WARNING [train.py:1204] (2/4) Exclude cut with ID _67499_210_17282_1_1535509805220_3692430_460-77730-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 14:08:50,004 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.bypass.scale_min, batch_count=906620.0, ans=0.2 2023-10-02 14:08:51,128 WARNING [train.py:1204] (2/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_374-20753-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 14:08:52,513 WARNING [train.py:1204] (2/4) Exclude cut with ID _45329_210_7005_1_1534157951211_6774339_374-289983-0_sp1.1 from training. Duration: 0.97275 2023-10-02 14:08:52,555 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_34886_210_10633_1_1533203882_1755661_76-156096-0_sp0.9 from training. Duration: 0.9666875 2023-10-02 14:08:54,512 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.conv_module1.balancer2.prob, batch_count=906620.0, ans=0.125 2023-10-02 14:08:57,680 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_238-256668-0_sp0.9 from training. Duration: 0.7666875 2023-10-02 14:08:58,855 WARNING [train.py:1204] (2/4) Exclude cut with ID _46683_210_9642_1_1533730913492_4591116_489-146727-0_sp1.1 from training. Duration: 0.97275 2023-10-02 14:09:00,303 WARNING [train.py:1204] (2/4) Exclude cut with ID _13938_210_9988_1_1530518718978_3693960_304-112784-0_sp0.9 from training. Duration: 0.511125 2023-10-02 14:09:08,278 INFO [train.py:1046] (2/4) Epoch 26, batch 3200, loss[loss=0.1846, simple_loss=0.2695, pruned_loss=0.04985, over 24326.00 frames. ], tot_loss[loss=0.1664, simple_loss=0.2433, pruned_loss=0.04478, over 4711391.26 frames. ], batch size: 77, lr: 3.93e-03, grad_scale: 16.0 2023-10-02 14:09:08,394 WARNING [train.py:1204] (2/4) Exclude cut with ID _24271_210_12658_1_1531987270022_7190519_243-46549-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 14:09:08,398 WARNING [train.py:1204] (2/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_78-182912-0_sp1.1 from training. Duration: 0.536375 2023-10-02 14:09:11,155 WARNING [train.py:1204] (2/4) Exclude cut with ID _57572_210_5517_1_1534921219624_3809484_121-321334-0_sp1.1 from training. Duration: 0.97275 2023-10-02 14:09:11,256 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_40047_210_2290_1_1533290519_2374311_254-102149-0_sp1.1 from training. Duration: 0.963625 2023-10-02 14:09:11,256 WARNING [train.py:1204] (2/4) Exclude cut with ID _28461_210_2253_1_1532771775189_3041299_96-152158-0 from training. Duration: 0.85 2023-10-02 14:09:14,625 WARNING [train.py:1204] (2/4) Exclude cut with ID _29877_210_6228_1_1532397791106_5276149_161-102979-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 14:09:19,187 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_3787_210_2040_1_1526477544_5478586_603-120324-0_sp0.9 from training. Duration: 0.9 2023-10-02 14:09:21,995 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_41750_210_1960_1_1533520678_3948042_247-213456-0_sp1.1 from training. Duration: 0.97275 2023-10-02 14:09:29,849 WARNING [train.py:1204] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_127-119109-0_sp0.9 from training. Duration: 0.8 2023-10-02 14:09:34,424 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.feed_forward2.hidden_balancer.prob, batch_count=906753.3333333334, ans=0.125 2023-10-02 14:09:36,885 WARNING [train.py:1204] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_215-202973-0 from training. Duration: 0.88 2023-10-02 14:09:37,115 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.balancer1.prob, batch_count=906820.0, ans=0.125 2023-10-02 14:09:38,157 WARNING [train.py:1204] (2/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_17-320491-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 14:09:41,581 WARNING [train.py:1204] (2/4) Exclude cut with ID _5166_210_2084_1_1527073197160_4087993_310-114452-0 from training. Duration: 0.59 2023-10-02 14:09:41,649 WARNING [train.py:1204] (2/4) Exclude cut with ID _15133_210_6228_1_1530794512074_3080379_29-264458-0_sp0.9 from training. Duration: 0.6444375 2023-10-02 14:09:43,212 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.1.balancer1.prob, batch_count=906820.0, ans=0.125 2023-10-02 14:09:44,996 WARNING [train.py:1204] (2/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_159-281310-0_sp1.1 from training. Duration: 0.863625 2023-10-02 14:09:45,006 WARNING [train.py:1204] (2/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_195-167011-0_sp0.9 from training. Duration: 0.8333125 2023-10-02 14:09:46,289 WARNING [train.py:1204] (2/4) Exclude cut with ID _14087_210_2253_1_1530529224512_3014029_156-257607-0_sp1.1 from training. Duration: 0.7545625 2023-10-02 14:09:49,045 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_2295_210_2087_1_1525500993_2407863_116-238894-0 from training. Duration: 0.7 2023-10-02 14:09:50,508 WARNING [train.py:1204] (2/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_321-335475-0_sp0.9 from training. Duration: 0.5 2023-10-02 14:09:51,798 WARNING [train.py:1204] (2/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_188-114464-0 from training. Duration: 0.82 2023-10-02 14:09:54,953 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_2592_210_2290_1_1525697025_4882925_157-70512-0 from training. Duration: 0.91 2023-10-02 14:09:56,434 WARNING [train.py:1204] (2/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_138-64210-0_sp1.1 from training. Duration: 0.663625 2023-10-02 14:10:01,146 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_11305_210_1794_1_1529750428_7178148_651-95702-0_sp1.1 from training. Duration: 0.963625 2023-10-02 14:10:01,162 WARNING [train.py:1204] (2/4) Exclude cut with ID _14224_210_4381_1_1530529062037_4264010_379-184223-0_sp0.9 from training. Duration: 0.9444375 2023-10-02 14:10:01,189 WARNING [train.py:1204] (2/4) Exclude cut with ID _8394_210_6702_1_1528590504316_3540189_381-125325-0_sp1.1 from training. Duration: 0.963625 2023-10-02 14:10:02,484 WARNING [train.py:1204] (2/4) Exclude cut with ID IC0622W0021-307659 from training. Duration: 0.896 2023-10-02 14:10:02,486 WARNING [train.py:1204] (2/4) Exclude cut with ID _7624_210_1531_1_1528109802198_4933101_418-349834-0_sp1.1 from training. Duration: 0.5454375 2023-10-02 14:10:03,778 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.541e+02 1.962e+02 2.253e+02 2.668e+02 3.638e+02, threshold=4.506e+02, percent-clipped=0.0 2023-10-02 14:10:06,705 WARNING [train.py:1204] (2/4) Exclude cut with ID _49728_210_12651_1_1534041034294_6788150_607-3416-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 14:10:09,798 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8275_210_4198_1_1528455320_3892571_491-17707-0 from training. Duration: 0.64 2023-10-02 14:10:09,847 WARNING [train.py:1204] (2/4) Exclude cut with ID _5287_210_2084_1_1527418883040_3878670_333-48508-0 from training. Duration: 0.6 2023-10-02 14:10:11,189 WARNING [train.py:1204] (2/4) Exclude cut with ID _8365_210_2064_1_1528592311070_4677690_371-122295-0 from training. Duration: 0.55 2023-10-02 14:10:13,019 WARNING [train.py:1204] (2/4) Exclude cut with ID _14860_210_4915_1_1530702020340_7343240_836-328489-0 from training. Duration: 0.9 2023-10-02 14:10:14,447 WARNING [train.py:1204] (2/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_1033-274376-0_sp0.9 from training. Duration: 0.9666875 2023-10-02 14:10:17,175 WARNING [train.py:1204] (2/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_475-127455-0_sp0.9 from training. Duration: 0.77775 2023-10-02 14:10:17,182 WARNING [train.py:1204] (2/4) Exclude cut with ID IC0671W0071-331914 from training. Duration: 0.896 2023-10-02 14:10:17,211 WARNING [train.py:1204] (2/4) Exclude cut with ID _30327_210_16049_1_1532484161295_3924169_2-1523-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 14:10:17,213 WARNING [train.py:1204] (2/4) Exclude cut with ID _24314_210_4915_1_1532135078116_7170179_460-208279-0_sp1.1 from training. Duration: 0.97275 2023-10-02 14:10:19,905 WARNING [train.py:1204] (2/4) Exclude cut with ID IC0679W0052-335859 from training. Duration: 0.64 2023-10-02 14:10:21,431 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.0.ff2_skip_rate, batch_count=907020.0, ans=0.0 2023-10-02 14:10:21,741 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.5.encoder.layers.0.feed_forward2.out_whiten, num_groups=1, num_channels=256, metric=7.99 vs. limit=15.0 2023-10-02 14:10:22,561 INFO [train.py:1046] (2/4) Epoch 26, batch 3250, loss[loss=0.1807, simple_loss=0.2479, pruned_loss=0.0567, over 23692.00 frames. ], tot_loss[loss=0.1672, simple_loss=0.2436, pruned_loss=0.04535, over 4701302.82 frames. ], batch size: 232, lr: 3.93e-03, grad_scale: 16.0 2023-10-02 14:10:24,088 WARNING [train.py:1204] (2/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_3-237434-0_sp1.1 from training. Duration: 0.6909375 2023-10-02 14:10:26,223 WARNING [train.py:1204] (2/4) Exclude cut with ID _69884_210_9771_1_1535846392818_4166169_405-185283-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 14:10:33,861 WARNING [train.py:1204] (2/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_927-83684-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 14:10:33,869 WARNING [train.py:1204] (2/4) Exclude cut with ID _6694_210_4599_1_1527852637490_3965529_356-169233-0 from training. Duration: 0.99 2023-10-02 14:10:34,031 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.0.bypass.skip_rate, batch_count=907020.0, ans=0.035 2023-10-02 14:10:35,179 WARNING [train.py:1204] (2/4) Exclude cut with ID _19896_210_1883_1_1531709022662_6649569_562-233001-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 14:10:35,229 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_20120_210_6753_1_1531643035_5482859_354-78629-0_sp1.1 from training. Duration: 0.963625 2023-10-02 14:10:35,230 WARNING [train.py:1204] (2/4) Exclude cut with ID _60135_210_13427_1_1534935557922_3651319_96-168198-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 14:10:36,748 WARNING [train.py:1204] (2/4) Exclude cut with ID _31602_210_5854_1_1532677021456_4458250_249-96604-0_sp1.1 from training. Duration: 0.8090625 2023-10-02 14:10:38,115 WARNING [train.py:1204] (2/4) Exclude cut with ID _14100_210_8341_1_1530529320613_3678069_258-224599-0_sp0.9 from training. Duration: 0.8555625 2023-10-02 14:10:41,412 WARNING [train.py:1204] (2/4) Exclude cut with ID _19708_210_9206_1_1532136650733_4238364_215-160135-0_sp1.1 from training. Duration: 0.97275 2023-10-02 14:10:41,439 WARNING [train.py:1204] (2/4) Exclude cut with ID _21878_210_3525_1_1531738772002_7264330_113-18163-0_sp0.9 from training. Duration: 0.988875 2023-10-02 14:10:42,744 WARNING [train.py:1204] (2/4) Exclude cut with ID _64135_210_2253_1_1535677267876_7608972_552-337340-0_sp1.1 from training. Duration: 0.92725 2023-10-02 14:10:42,776 WARNING [train.py:1204] (2/4) Exclude cut with ID _67725_210_27767_1_1535608756671_5105359_10-12887-0_sp1.1 from training. Duration: 0.97275 2023-10-02 14:10:42,790 WARNING [train.py:1204] (2/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_401-284840-0_sp1.1 from training. Duration: 0.97275 2023-10-02 14:10:42,818 WARNING [train.py:1204] (2/4) Exclude cut with ID _44659_210_2721_1_1534036120061_6614860_483-141285-0_sp1.1 from training. Duration: 0.936375 2023-10-02 14:10:46,152 WARNING [train.py:1204] (2/4) Exclude cut with ID _9461_210_4557_1_1529060514726_3677451_358-332734-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 14:10:46,261 WARNING [train.py:1204] (2/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_788-298907-0_sp1.1 from training. Duration: 0.8090625 2023-10-02 14:10:46,778 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.4.encoder.layers.2.nonlin_attention.whiten1, num_groups=1, num_channels=288, metric=4.30 vs. limit=10.0 2023-10-02 14:10:48,904 WARNING [train.py:1204] (2/4) Exclude cut with ID _39411_210_5986_1_1533538825030_3343649_354-267196-0_sp1.1 from training. Duration: 0.92725 2023-10-02 14:10:48,941 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_14953_210_2151_1_1530701710_5616165_192-309792-0_sp1.1 from training. Duration: 0.97275 2023-10-02 14:10:50,337 WARNING [train.py:1204] (2/4) Exclude cut with ID _59170_210_6721_1_1534983633960_4203335_290-151255-0_sp1.1 from training. Duration: 0.92725 2023-10-02 14:10:51,558 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_5575_210_1410_1_1527301305_2570170_249-93769-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 14:10:51,566 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_46013_210_7234_1_1533776362_3744032_463-52378-0_sp1.1 from training. Duration: 0.7818125 2023-10-02 14:10:53,166 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.conv_module2.balancer1.prob, batch_count=907153.3333333334, ans=0.125 2023-10-02 14:10:54,399 WARNING [train.py:1204] (2/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_580-93798-0 from training. Duration: 0.56 2023-10-02 14:10:55,763 WARNING [train.py:1204] (2/4) Exclude cut with ID _19514_210_9259_1_1531530071330_5084230_385-13762-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 14:10:55,768 WARNING [train.py:1204] (2/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_458-67321-0_sp1.1 from training. Duration: 0.8818125 2023-10-02 14:10:57,569 WARNING [train.py:1204] (2/4) Exclude cut with ID _41298_210_9019_1_1533430371450_4822143_188-154791-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 14:10:57,974 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.5.encoder.layers.1.nonlin_attention.whiten2, num_groups=1, num_channels=256, metric=4.03 vs. limit=15.0 2023-10-02 14:10:58,883 WARNING [train.py:1204] (2/4) Exclude cut with ID _15037_210_2253_1_1530752294104_3267650_296-192597-0_sp0.9 from training. Duration: 0.9 2023-10-02 14:11:05,013 WARNING [train.py:1204] (2/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_661-264857-0_sp1.1 from training. Duration: 0.8454375 2023-10-02 14:11:12,062 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_34008_210_10431_1_1533345424_8183247_662-317331-0_sp1.1 from training. Duration: 0.8545625 2023-10-02 14:11:13,820 WARNING [train.py:1204] (2/4) Exclude cut with ID _46502_210_13794_1_1533724452198_3657159_241-278329-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 14:11:13,820 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_11349_210_6753_1_1529800861_1101207_26-65602-0 from training. Duration: 0.96 2023-10-02 14:11:13,824 WARNING [train.py:1204] (2/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_448-4048-0_sp1.1 from training. Duration: 0.87275 2023-10-02 14:11:13,838 WARNING [train.py:1204] (2/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_662-275621-0_sp1.1 from training. Duration: 0.3818125 2023-10-02 14:11:15,200 WARNING [train.py:1204] (2/4) Exclude cut with ID _13938_210_9988_1_1530518718978_3693960_256-54454-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 14:11:17,163 WARNING [train.py:1204] (2/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_256-32662-0 from training. Duration: 0.9 2023-10-02 14:11:18,473 WARNING [train.py:1204] (2/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_326-17061-0 from training. Duration: 0.94 2023-10-02 14:11:18,522 WARNING [train.py:1204] (2/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_505-97987-0_sp1.1 from training. Duration: 0.7818125 2023-10-02 14:11:19,903 WARNING [train.py:1204] (2/4) Exclude cut with ID _66873_210_5517_1_1535454136506_4293370_316-294364-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 14:11:21,227 WARNING [train.py:1204] (2/4) Exclude cut with ID _19514_210_9259_1_1531530071330_5084230_762-231063-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 14:11:21,298 WARNING [train.py:1204] (2/4) Exclude cut with ID _4697_210_2069_1_1526815801953_3823098_68-219807-0_sp0.9 from training. Duration: 0.52225 2023-10-02 14:11:21,420 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.1.attention_skip_rate, batch_count=907286.6666666666, ans=0.0 2023-10-02 14:11:22,549 WARNING [train.py:1204] (2/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_479-158053-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 14:11:25,386 WARNING [train.py:1204] (2/4) Exclude cut with ID _66112_210_10635_1_1535590236305_3801484_83-38181-0_sp1.1 from training. Duration: 0.936375 2023-10-02 14:11:25,400 WARNING [train.py:1204] (2/4) Exclude cut with ID _55857_210_2346_1_1534644007831_6898209_255-146346-0_sp1.1 from training. Duration: 0.963625 2023-10-02 14:11:28,238 WARNING [train.py:1204] (2/4) Exclude cut with ID _48836_210_12210_1_1534075848293_3904150_429-35475-0 from training. Duration: 0.89 2023-10-02 14:11:28,256 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_50154_210_13731_1_1534750862_1080961_119-187163-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 14:11:31,414 WARNING [train.py:1204] (2/4) Exclude cut with ID _8793_210_6310_1_1528542985139_2310009_121-179782-0_sp0.9 from training. Duration: 0.888875 2023-10-02 14:11:31,429 WARNING [train.py:1204] (2/4) Exclude cut with ID _14884_210_2252_1_1530707459689_5561250_591-173406-0 from training. Duration: 0.96 2023-10-02 14:11:34,646 WARNING [train.py:1204] (2/4) Exclude cut with ID _15133_210_6228_1_1530794512074_3080379_54-198193-0_sp1.1 from training. Duration: 0.8545625 2023-10-02 14:11:34,654 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6545_210_2040_1_1527774670_4245258_407-48070-0 from training. Duration: 0.76 2023-10-02 14:11:36,041 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1032-236863-0 from training. Duration: 0.84 2023-10-02 14:11:37,353 INFO [train.py:1046] (2/4) Epoch 26, batch 3300, loss[loss=0.1726, simple_loss=0.244, pruned_loss=0.05061, over 23875.00 frames. ], tot_loss[loss=0.1674, simple_loss=0.2442, pruned_loss=0.04527, over 4702608.01 frames. ], batch size: 195, lr: 3.93e-03, grad_scale: 16.0 2023-10-02 14:11:37,428 WARNING [train.py:1204] (2/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_507-303370-0 from training. Duration: 0.96 2023-10-02 14:11:37,455 WARNING [train.py:1204] (2/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_164-282366-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 14:11:40,418 WARNING [train.py:1204] (2/4) Exclude cut with ID _39867_210_9826_1_1533553222030_9344259_312-158727-0_sp1.1 from training. Duration: 0.963625 2023-10-02 14:11:41,708 WARNING [train.py:1204] (2/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_174-205058-0_sp1.1 from training. Duration: 0.863625 2023-10-02 14:11:43,047 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_11305_210_1794_1_1529750428_7178148_166-237358-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 14:11:43,312 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.feed_forward1.out_proj.dropout_p, batch_count=907353.3333333334, ans=0.1 2023-10-02 14:11:44,782 WARNING [train.py:1204] (2/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_26-175553-0_sp1.1 from training. Duration: 0.5545625 2023-10-02 14:11:44,809 WARNING [train.py:1204] (2/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_96-290039-0_sp1.1 from training. Duration: 0.5090625 2023-10-02 14:11:47,531 WARNING [train.py:1204] (2/4) Exclude cut with ID _53219_210_4525_1_1534834836008_4064825_261-101691-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 14:11:48,974 WARNING [train.py:1204] (2/4) Exclude cut with ID _24271_210_12658_1_1531987270022_7190519_96-293390-0_sp1.1 from training. Duration: 0.936375 2023-10-02 14:11:54,816 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_271-234962-0 from training. Duration: 0.79 2023-10-02 14:11:54,903 WARNING [train.py:1204] (2/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_136-30660-0_sp1.1 from training. Duration: 0.97275 2023-10-02 14:11:54,920 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_20120_210_6753_1_1531643035_5482859_625-281046-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 14:11:56,280 WARNING [train.py:1204] (2/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_333-168193-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 14:11:57,578 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9029W0269-509664 from training. Duration: 0.896 2023-10-02 14:11:57,660 WARNING [train.py:1204] (2/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_450-147393-0_sp1.1 from training. Duration: 0.92725 2023-10-02 14:11:59,007 WARNING [train.py:1204] (2/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_643-52007-0_sp1.1 from training. Duration: 0.5181875 2023-10-02 14:12:00,371 WARNING [train.py:1204] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_330-327861-0_sp0.9 from training. Duration: 0.7444375 2023-10-02 14:12:00,379 WARNING [train.py:1204] (2/4) Exclude cut with ID _47790_210_16187_1_1533862814066_4207519_260-317651-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 14:12:00,413 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9044W0010-516654 from training. Duration: 0.896 2023-10-02 14:12:05,598 WARNING [train.py:1204] (2/4) Exclude cut with ID _14087_210_2253_1_1530529224512_3014029_265-118378-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 14:12:05,611 WARNING [train.py:1204] (2/4) Exclude cut with ID _6194_210_1534_1_1527511826160_3941194_321-148641-0_sp0.9 from training. Duration: 0.711125 2023-10-02 14:12:07,121 WARNING [train.py:1204] (2/4) Exclude cut with ID _54871_210_22356_1_1534665798253_3856359_101-169176-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 14:12:07,132 WARNING [train.py:1204] (2/4) Exclude cut with ID 497-129325-0061-62254-0_sp1.1 from training. Duration: 0.97725 2023-10-02 14:12:07,425 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=907486.6666666666, ans=0.1 2023-10-02 14:12:07,438 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.balancer2.prob, batch_count=907486.6666666666, ans=0.125 2023-10-02 14:12:08,516 WARNING [train.py:1204] (2/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_795-33252-0_sp1.1 from training. Duration: 0.336375 2023-10-02 14:12:08,529 WARNING [train.py:1204] (2/4) Exclude cut with ID _28317_210_12742_1_1532480302381_8510325_168-246176-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 14:12:09,867 WARNING [train.py:1204] (2/4) Exclude cut with ID _7160_210_1876_1_1527940923231_3703799_120-150943-0_sp1.1 from training. Duration: 0.736375 2023-10-02 14:12:11,386 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.bypass.skip_rate, batch_count=907486.6666666666, ans=0.04949747468305833 2023-10-02 14:12:12,563 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9084W0005-536844 from training. Duration: 0.768 2023-10-02 14:12:14,050 WARNING [train.py:1204] (2/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_234-260682-0 from training. Duration: 0.84 2023-10-02 14:12:14,087 WARNING [train.py:1204] (2/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_188-114464-0_sp0.9 from training. Duration: 0.911125 2023-10-02 14:12:16,756 WARNING [train.py:1204] (2/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_81-75849-0 from training. Duration: 0.39 2023-10-02 14:12:16,890 WARNING [train.py:1204] (2/4) Exclude cut with ID _14224_210_4381_1_1530529062037_4264010_379-184223-0_sp1.1 from training. Duration: 0.77275 2023-10-02 14:12:20,168 WARNING [train.py:1204] (2/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_454-123002-0_sp1.1 from training. Duration: 0.62725 2023-10-02 14:12:20,202 WARNING [train.py:1204] (2/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_887-249555-0_sp1.1 from training. Duration: 0.7 2023-10-02 14:12:23,469 WARNING [train.py:1204] (2/4) Exclude cut with ID _31088_210_16446_1_1532520012284_3440919_20-260902-0_sp1.1 from training. Duration: 0.97275 2023-10-02 14:12:23,519 WARNING [train.py:1204] (2/4) Exclude cut with ID _45329_210_7005_1_1534157951211_6774339_856-344574-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 14:12:23,521 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_36756_210_7234_1_1533026991_966276_4-164118-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 14:12:23,535 WARNING [train.py:1204] (2/4) Exclude cut with ID _8365_210_2064_1_1528592311070_4677690_371-122295-0_sp1.1 from training. Duration: 0.5 2023-10-02 14:12:24,923 WARNING [train.py:1204] (2/4) Exclude cut with ID _5910_210_1389_1_1527337972454_5050100_555-159815-0_sp1.1 from training. Duration: 0.8909375 2023-10-02 14:12:24,932 WARNING [train.py:1204] (2/4) Exclude cut with ID _37086_210_9038_1_1532999540891_7021250_469-147941-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 14:12:26,347 WARNING [train.py:1204] (2/4) Exclude cut with ID _3067_210_2667_1_1526212974640_4503168_459-173867-0_sp0.9 from training. Duration: 0.97775 2023-10-02 14:12:27,620 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9156W0006-572499 from training. Duration: 0.896 2023-10-02 14:12:28,973 WARNING [train.py:1204] (2/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_277-210798-0 from training. Duration: 0.55 2023-10-02 14:12:32,061 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.568e+02 1.777e+02 1.941e+02 2.097e+02 3.420e+02, threshold=3.882e+02, percent-clipped=0.0 2023-10-02 14:12:32,159 WARNING [train.py:1204] (2/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_103-336064-0_sp1.1 from training. Duration: 0.463625 2023-10-02 14:12:33,998 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_3414_210_1988_1_1526212718_4813195_331-290945-0_sp1.1 from training. Duration: 0.936375 2023-10-02 14:12:33,999 WARNING [train.py:1204] (2/4) Exclude cut with ID _67499_210_17282_1_1535509805220_3692430_365-29276-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 14:12:35,462 WARNING [train.py:1204] (2/4) Exclude cut with ID _14821_210_8750_1_1530680503991_6829359_69-250847-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 14:12:35,463 WARNING [train.py:1204] (2/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_1102-61301-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 14:12:36,903 WARNING [train.py:1204] (2/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_166-246557-0_sp1.1 from training. Duration: 0.5818125 2023-10-02 14:12:38,211 WARNING [train.py:1204] (2/4) Exclude cut with ID _15351_210_10626_1_1530856635653_3576210_214-129918-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 14:12:38,237 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_120-41668-0_sp0.9 from training. Duration: 0.77775 2023-10-02 14:12:39,582 WARNING [train.py:1204] (2/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_27-72278-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 14:12:39,693 WARNING [train.py:1204] (2/4) Exclude cut with ID _24314_210_4915_1_1532135078116_7170179_521-258868-0_sp1.1 from training. Duration: 0.7181875 2023-10-02 14:12:42,413 WARNING [train.py:1204] (2/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_143-86344-0 from training. Duration: 0.77 2023-10-02 14:12:42,437 WARNING [train.py:1204] (2/4) Exclude cut with ID _53489_210_6228_1_1534298357169_3835970_465-157433-0_sp1.1 from training. Duration: 0.92725 2023-10-02 14:12:43,726 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_248-319353-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 14:12:45,145 WARNING [train.py:1204] (2/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_102-77064-0_sp1.1 from training. Duration: 0.8454375 2023-10-02 14:12:46,468 WARNING [train.py:1204] (2/4) Exclude cut with ID _9389_210_1664_1_1529217337301_5289269_379-128794-0_sp1.1 from training. Duration: 0.77275 2023-10-02 14:12:47,782 WARNING [train.py:1204] (2/4) Exclude cut with ID _3231_210_2106_1_1526205864934_6784220_236-260835-0_sp1.1 from training. Duration: 0.97275 2023-10-02 14:12:50,868 INFO [train.py:1046] (2/4) Epoch 26, batch 3350, loss[loss=0.1905, simple_loss=0.2771, pruned_loss=0.05192, over 24334.00 frames. ], tot_loss[loss=0.1677, simple_loss=0.245, pruned_loss=0.04521, over 4696840.00 frames. ], batch size: 77, lr: 3.93e-03, grad_scale: 16.0 2023-10-02 14:12:50,958 WARNING [train.py:1204] (2/4) Exclude cut with ID _24271_210_12658_1_1531987270022_7190519_184-342363-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 14:12:50,959 WARNING [train.py:1204] (2/4) Exclude cut with ID _27894_210_12392_1_1532415496078_6902439_67-221663-0_sp1.1 from training. Duration: 0.92725 2023-10-02 14:12:53,778 WARNING [train.py:1204] (2/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_304-59227-0_sp1.1 from training. Duration: 0.7 2023-10-02 14:12:55,289 WARNING [train.py:1204] (2/4) Exclude cut with ID _58605_210_12210_1_1534723195939_3904259_458-42232-0_sp1.1 from training. Duration: 0.92725 2023-10-02 14:12:57,126 WARNING [train.py:1204] (2/4) Exclude cut with ID _14922_210_6228_1_1530707435273_3693310_13-165512-0_sp0.9 from training. Duration: 0.911125 2023-10-02 14:12:58,625 WARNING [train.py:1204] (2/4) Exclude cut with ID _58602_210_2346_1_1534903210085_6908830_792-102241-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 14:13:01,319 WARNING [train.py:1204] (2/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_588-125165-0_sp0.9 from training. Duration: 0.92225 2023-10-02 14:13:02,692 WARNING [train.py:1204] (2/4) Exclude cut with ID _54218_210_12210_1_1534413646388_4409259_239-98429-0_sp1.1 from training. Duration: 0.97275 2023-10-02 14:13:02,945 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.conv_module2.balancer1.prob, batch_count=907686.6666666666, ans=0.125 2023-10-02 14:13:03,993 WARNING [train.py:1204] (2/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_538-32291-0_sp1.1 from training. Duration: 0.7545625 2023-10-02 14:13:04,091 WARNING [train.py:1204] (2/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_419-317062-0 from training. Duration: 0.91 2023-10-02 14:13:07,839 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9309W0202-640329 from training. Duration: 0.896 2023-10-02 14:13:07,894 WARNING [train.py:1204] (2/4) Exclude cut with ID _3231_210_2106_1_1526205864934_6784220_419-313291-0_sp1.1 from training. Duration: 0.97275 2023-10-02 14:13:10,563 WARNING [train.py:1204] (2/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_619-344499-0 from training. Duration: 0.63 2023-10-02 14:13:10,576 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_278-272612-0 from training. Duration: 0.73 2023-10-02 14:13:10,662 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_103-84010-0_sp0.9 from training. Duration: 0.7666875 2023-10-02 14:13:11,978 WARNING [train.py:1204] (2/4) Exclude cut with ID _26528_210_9070_1_1532570410842_3851310_497-97218-0_sp1.1 from training. Duration: 0.7818125 2023-10-02 14:13:13,324 WARNING [train.py:1204] (2/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_262-84613-0_sp1.1 from training. Duration: 0.936375 2023-10-02 14:13:13,370 WARNING [train.py:1204] (2/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_387-131538-0 from training. Duration: 0.81 2023-10-02 14:13:13,527 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.balancer2.prob, batch_count=907753.3333333334, ans=0.125 2023-10-02 14:13:14,654 WARNING [train.py:1204] (2/4) Exclude cut with ID _8030_210_7497_1_1528444886781_3531089_224-300489-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 14:13:14,671 WARNING [train.py:1204] (2/4) Exclude cut with ID _14349_210_8341_1_1530615710818_3442470_170-336541-0_sp0.9 from training. Duration: 0.9555625 2023-10-02 14:13:17,467 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_47069_210_15368_1_1533781267_4431320_180-31746-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 14:13:18,844 WARNING [train.py:1204] (2/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_447-7041-0_sp1.1 from training. Duration: 0.92725 2023-10-02 14:13:20,191 WARNING [train.py:1204] (2/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_2-55283-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 14:13:20,247 WARNING [train.py:1204] (2/4) Exclude cut with ID _12555_210_8750_1_1530075594402_3780239_70-181911-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 14:13:24,949 WARNING [train.py:1204] (2/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_311-328022-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 14:13:26,475 WARNING [train.py:1204] (2/4) Exclude cut with ID _14528_210_8254_1_1530669700514_4306939_330-2987-0_sp1.1 from training. Duration: 0.92725 2023-10-02 14:13:28,321 WARNING [train.py:1204] (2/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_385-184858-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 14:13:32,523 WARNING [train.py:1204] (2/4) Exclude cut with ID _64414_210_17301_1_1535243252048_3757759_348-333360-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 14:13:32,591 WARNING [train.py:1204] (2/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_753-214751-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 14:13:35,262 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_17613_210_2151_1_1531137569_2366160_202-208013-0_sp1.1 from training. Duration: 0.92725 2023-10-02 14:13:35,278 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_43831_210_2158_1_1533725502_5872791_610-129300-0_sp1.1 from training. Duration: 0.936375 2023-10-02 14:13:36,960 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.conv_module2.balancer1.prob, batch_count=907886.6666666666, ans=0.125 2023-10-02 14:13:38,158 WARNING [train.py:1204] (2/4) Exclude cut with ID _54218_210_12210_1_1534413646388_4409259_321-23852-0_sp1.1 from training. Duration: 0.936375 2023-10-02 14:13:40,008 WARNING [train.py:1204] (2/4) Exclude cut with ID _39411_210_5986_1_1533538825030_3343649_173-238248-0 from training. Duration: 0.83 2023-10-02 14:13:40,017 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_405-339986-0_sp0.9 from training. Duration: 0.5666875 2023-10-02 14:13:40,040 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_2009_210_2071_1_1525481134_4588937_385-302100-0 from training. Duration: 0.87 2023-10-02 14:13:40,075 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_161-276996-0_sp1.1 from training. Duration: 0.863625 2023-10-02 14:13:42,093 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_177-58005-0 from training. Duration: 0.62 2023-10-02 14:13:43,514 WARNING [train.py:1204] (2/4) Exclude cut with ID _64639_210_5816_1_1535367619437_3528000_146-343734-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 14:13:44,851 WARNING [train.py:1204] (2/4) Exclude cut with ID _3845_210_1390_1_1526731326141_3523610_72-61441-0_sp1.1 from training. Duration: 0.92725 2023-10-02 14:13:44,995 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.1.feed_forward2.hidden_balancer.prob, batch_count=907886.6666666666, ans=0.125 2023-10-02 14:13:47,848 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=907886.6666666666, ans=0.1 2023-10-02 14:13:51,768 WARNING [train.py:1204] (2/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_991-125028-0_sp1.1 from training. Duration: 0.936375 2023-10-02 14:13:53,055 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7544_210_6753_1_1528591327_1471426_172-348777-0 from training. Duration: 0.66 2023-10-02 14:13:53,101 WARNING [train.py:1204] (2/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_416-257730-0_sp1.1 from training. Duration: 0.5909375 2023-10-02 14:13:55,810 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1113-7369-0_sp1.1 from training. Duration: 0.77275 2023-10-02 14:13:55,900 WARNING [train.py:1204] (2/4) Exclude cut with ID _40402_210_18219_1_1533522324115_2953150_242-76943-0_sp1.1 from training. Duration: 0.8545625 2023-10-02 14:14:02,290 WARNING [train.py:1204] (2/4) Exclude cut with ID _44718_210_5816_1_1534050051869_6469030_190-49648-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 14:14:03,734 WARNING [train.py:1204] (2/4) Exclude cut with ID _47251_210_18628_1_1533814063128_4137250_214-88076-0 from training. Duration: 0.88 2023-10-02 14:14:05,074 INFO [train.py:1046] (2/4) Epoch 26, batch 3400, loss[loss=0.1775, simple_loss=0.2608, pruned_loss=0.04707, over 23197.00 frames. ], tot_loss[loss=0.169, simple_loss=0.2464, pruned_loss=0.04579, over 4710439.87 frames. ], batch size: 105, lr: 3.93e-03, grad_scale: 16.0 2023-10-02 14:14:05,117 WARNING [train.py:1204] (2/4) Exclude cut with ID _5585_210_2667_1_1527418813324_8007340_89-332072-0_sp1.1 from training. Duration: 0.5818125 2023-10-02 14:14:05,142 WARNING [train.py:1204] (2/4) Exclude cut with ID _41817_210_18835_1_1533434477160_7423049_555-293448-0_sp1.1 from training. Duration: 0.72725 2023-10-02 14:14:06,555 WARNING [train.py:1204] (2/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_286-331314-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 14:14:07,855 WARNING [train.py:1204] (2/4) Exclude cut with ID _13921_210_9994_1_1530838702800_3003429_46-149444-0 from training. Duration: 0.94 2023-10-02 14:14:09,163 WARNING [train.py:1204] (2/4) Exclude cut with ID _44778_210_2054_1_1533798127203_7443090_768-43595-0_sp1.1 from training. Duration: 0.936375 2023-10-02 14:14:09,171 WARNING [train.py:1204] (2/4) Exclude cut with ID _35355_210_16040_1_1533212988977_4016229_74-190076-0 from training. Duration: 0.8 2023-10-02 14:14:10,472 WARNING [train.py:1204] (2/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_1007-316380-0_sp1.1 from training. Duration: 0.963625 2023-10-02 14:14:10,507 WARNING [train.py:1204] (2/4) Exclude cut with ID _23557_210_8750_1_1531890106370_6846255_150-283437-0_sp1.1 from training. Duration: 0.963625 2023-10-02 14:14:10,561 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_3474_210_2028_1_1526188513_4825246_145-309131-0_sp1.1 from training. Duration: 0.67275 2023-10-02 14:14:11,924 WARNING [train.py:1204] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_391-22148-0_sp1.1 from training. Duration: 0.7 2023-10-02 14:14:11,940 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_238-256668-0 from training. Duration: 0.69 2023-10-02 14:14:16,766 WARNING [train.py:1204] (2/4) Exclude cut with ID _8394_210_6702_1_1528590504316_3540189_372-198878-0 from training. Duration: 0.6 2023-10-02 14:14:16,776 WARNING [train.py:1204] (2/4) Exclude cut with ID ID0174W0006-766659 from training. Duration: 0.953 2023-10-02 14:14:16,798 WARNING [train.py:1204] (2/4) Exclude cut with ID _6274_210_2084_1_1527937583837_7494020_668-23019-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 14:14:19,838 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.feed_forward1.out_proj.dropout_p, batch_count=908086.6666666666, ans=0.1 2023-10-02 14:14:20,940 WARNING [train.py:1204] (2/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_322-74328-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 14:14:20,946 WARNING [train.py:1204] (2/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_872-264525-0_sp1.1 from training. Duration: 0.5909375 2023-10-02 14:14:20,990 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_53378_210_17282_1_1534221071_5769509_641-286201-0_sp1.1 from training. Duration: 0.97275 2023-10-02 14:14:22,278 WARNING [train.py:1204] (2/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_199-295611-0_sp1.1 from training. Duration: 0.5 2023-10-02 14:14:25,306 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.conv_module2.balancer2.prob, batch_count=908086.6666666666, ans=0.125 2023-10-02 14:14:25,325 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=908086.6666666666, ans=0.1 2023-10-02 14:14:26,483 WARNING [train.py:1204] (2/4) Exclude cut with ID _5250_210_5219_1_1527249561806_8692110_1270-317971-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 14:14:30,242 WARNING [train.py:1204] (2/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_784-213099-0 from training. Duration: 0.93 2023-10-02 14:14:34,346 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_115-233929-0_sp0.9 from training. Duration: 0.8 2023-10-02 14:14:37,151 WARNING [train.py:1204] (2/4) Exclude cut with ID _54871_210_22356_1_1534665798253_3856359_256-223809-0_sp1.1 from training. Duration: 0.97275 2023-10-02 14:14:38,449 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_20120_210_6753_1_1531643035_5482859_285-87781-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 14:14:38,530 WARNING [train.py:1204] (2/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_619-344499-0_sp1.1 from training. Duration: 0.57275 2023-10-02 14:14:44,518 WARNING [train.py:1204] (2/4) Exclude cut with ID _60229_210_27767_1_1534838406158_3655350_264-279573-0_sp0.9 from training. Duration: 0.92225 2023-10-02 14:14:47,374 WARNING [train.py:1204] (2/4) Exclude cut with ID _7719_210_3525_1_1528456153163_4296960_333-110399-0 from training. Duration: 0.96 2023-10-02 14:14:49,110 INFO [scaling.py:1118] (2/4) WithLoss: name=encoder.encoders.2.encoder.layers.2.self_attn_weights, loss-sum=0.000e+00 2023-10-02 14:14:51,575 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_50423_210_13270_1_1534128598_5021734_759-90833-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 14:14:51,633 WARNING [train.py:1204] (2/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_151-254798-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 14:14:52,890 WARNING [train.py:1204] (2/4) Exclude cut with ID _3465_210_3525_1_1526175667060_3574049_450-203637-0 from training. Duration: 0.92 2023-10-02 14:14:52,924 WARNING [train.py:1204] (2/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_673-36456-0_sp1.1 from training. Duration: 0.936375 2023-10-02 14:14:54,294 WARNING [train.py:1204] (2/4) Exclude cut with ID _36538_210_9994_1_1533204020363_3795620_131-8685-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 14:14:54,344 WARNING [train.py:1204] (2/4) Exclude cut with ID _12556_210_8341_1_1530081024100_7327690_713-55626-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 14:14:55,643 WARNING [train.py:1204] (2/4) Exclude cut with ID _2322_210_2667_1_1525604548971_3656299_440-80494-0_sp0.9 from training. Duration: 0.97775 2023-10-02 14:14:57,632 WARNING [train.py:1204] (2/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_210-325081-0_sp1.1 from training. Duration: 0.97275 2023-10-02 14:14:58,789 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.538e+02 1.837e+02 2.033e+02 2.297e+02 3.671e+02, threshold=4.066e+02, percent-clipped=0.0 2023-10-02 14:15:00,690 WARNING [train.py:1204] (2/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_375-244976-0_sp0.9 from training. Duration: 0.8333125 2023-10-02 14:15:00,695 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_15517_210_6753_1_1530952293_1664795_119-31001-0_sp1.1 from training. Duration: 0.8545625 2023-10-02 14:15:06,487 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_41452_210_7291_1_1533437687_3941281_319-42326-0_sp1.1 from training. Duration: 0.92725 2023-10-02 14:15:06,594 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_372-173012-0 from training. Duration: 0.95 2023-10-02 14:15:13,790 WARNING [train.py:1204] (2/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_41-31157-0_sp1.1 from training. Duration: 0.5181875 2023-10-02 14:15:17,844 INFO [train.py:1046] (2/4) Epoch 26, batch 3450, loss[loss=0.149, simple_loss=0.2352, pruned_loss=0.03136, over 24449.00 frames. ], tot_loss[loss=0.1701, simple_loss=0.2472, pruned_loss=0.04647, over 4697079.36 frames. ], batch size: 63, lr: 3.93e-03, grad_scale: 16.0 2023-10-02 14:15:17,907 WARNING [train.py:1204] (2/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_91-212486-0 from training. Duration: 0.68 2023-10-02 14:15:20,225 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.1.encoder.layers.1.conv_module2.whiten, num_groups=1, num_channels=256, metric=10.22 vs. limit=15.0 2023-10-02 14:15:20,829 WARNING [train.py:1204] (2/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_1267-272693-0 from training. Duration: 0.73 2023-10-02 14:15:20,871 WARNING [train.py:1204] (2/4) Exclude cut with ID _13938_210_9988_1_1530518718978_3693960_152-67046-0_sp1.1 from training. Duration: 0.936375 2023-10-02 14:15:22,234 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_4528_210_3273_1_1527076654_3276777_342-168068-0_sp1.1 from training. Duration: 0.7909375 2023-10-02 14:15:22,242 WARNING [train.py:1204] (2/4) Exclude cut with ID _7606_210_4915_1_1528894253277_5080690_127-177761-0 from training. Duration: 0.64 2023-10-02 14:15:23,599 WARNING [train.py:1204] (2/4) Exclude cut with ID _45329_210_7005_1_1534157951211_6774339_335-137972-0_sp1.1 from training. Duration: 0.92725 2023-10-02 14:15:24,085 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.4.encoder.layers.1.nonlin_attention.whiten2, num_groups=1, num_channels=384, metric=4.01 vs. limit=15.0 2023-10-02 14:15:28,273 WARNING [train.py:1204] (2/4) Exclude cut with ID _5166_210_2084_1_1527073197160_4087993_331-117829-0_sp1.1 from training. Duration: 0.563625 2023-10-02 14:15:34,175 WARNING [train.py:1204] (2/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_125-125818-0_sp1.1 from training. Duration: 0.87275 2023-10-02 14:15:35,515 WARNING [train.py:1204] (2/4) Exclude cut with ID _6789_210_1534_1_1527854471671_2870519_48-294897-0_sp1.1 from training. Duration: 0.963625 2023-10-02 14:15:36,905 WARNING [train.py:1204] (2/4) Exclude cut with ID _6694_210_4599_1_1527852637490_3965529_116-109398-0_sp1.1 from training. Duration: 0.663625 2023-10-02 14:15:36,915 WARNING [train.py:1204] (2/4) Exclude cut with ID _52892_210_6721_1_1534378941259_4124129_222-172685-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 14:15:37,064 WARNING [train.py:1204] (2/4) Exclude cut with ID _50655_210_18628_1_1534229942193_3772154_320-132275-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 14:15:43,216 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.1.conv_module1.balancer1.prob, batch_count=908420.0, ans=0.125 2023-10-02 14:15:44,995 WARNING [train.py:1204] (2/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_76-311963-0 from training. Duration: 0.7 2023-10-02 14:15:45,261 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.1.attention_skip_rate, batch_count=908420.0, ans=0.0 2023-10-02 14:15:49,197 WARNING [train.py:1204] (2/4) Exclude cut with ID _6274_210_2084_1_1527937583837_7494020_32-205288-0 from training. Duration: 0.78 2023-10-02 14:15:49,396 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.conv_module2.balancer1.prob, batch_count=908486.6666666666, ans=0.125 2023-10-02 14:15:50,381 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_86-16881-0_sp0.9 from training. Duration: 0.6444375 2023-10-02 14:15:50,417 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_271-297505-0_sp1.1 from training. Duration: 0.9 2023-10-02 14:15:51,782 WARNING [train.py:1204] (2/4) Exclude cut with ID _57572_210_5517_1_1534921219624_3809484_187-295293-0_sp1.1 from training. Duration: 0.963625 2023-10-02 14:15:55,926 WARNING [train.py:1204] (2/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_563-329943-0 from training. Duration: 0.99 2023-10-02 14:15:55,991 WARNING [train.py:1204] (2/4) Exclude cut with ID _7160_210_1876_1_1527940923231_3703799_302-150728-0_sp0.9 from training. Duration: 0.8666875 2023-10-02 14:15:59,409 WARNING [train.py:1204] (2/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_926-7526-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 14:16:01,125 WARNING [train.py:1204] (2/4) Exclude cut with ID _62574_210_2655_1_1535180407058_3933249_467-22348-0_sp1.1 from training. Duration: 0.936375 2023-10-02 14:16:02,552 WARNING [train.py:1204] (2/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_280-161633-0_sp0.9 from training. Duration: 0.811125 2023-10-02 14:16:03,842 WARNING [train.py:1204] (2/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_385-193052-0_sp1.1 from training. Duration: 0.7818125 2023-10-02 14:16:05,345 WARNING [train.py:1204] (2/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_444-28041-0 from training. Duration: 0.85 2023-10-02 14:16:05,350 WARNING [train.py:1204] (2/4) Exclude cut with ID _66070_210_7640_1_1535441393096_7805310_341-249237-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 14:16:06,747 WARNING [train.py:1204] (2/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_425-269826-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 14:16:09,918 WARNING [train.py:1204] (2/4) Exclude cut with ID _53950_210_5816_1_1534417264919_3862951_124-185597-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 14:16:11,362 WARNING [train.py:1204] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_380-204756-0 from training. Duration: 0.86 2023-10-02 14:16:12,944 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.conv_module1.balancer2.prob, batch_count=908553.3333333334, ans=0.125 2023-10-02 14:16:14,121 WARNING [train.py:1204] (2/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_282-257791-0_sp0.9 from training. Duration: 0.9555625 2023-10-02 14:16:17,383 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.balancer2.prob, batch_count=908620.0, ans=0.125 2023-10-02 14:16:17,430 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.bypass.skip_rate, batch_count=908620.0, ans=0.07 2023-10-02 14:16:19,894 WARNING [train.py:1204] (2/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_372-131163-0_sp1.1 from training. Duration: 0.8545625 2023-10-02 14:16:21,411 WARNING [train.py:1204] (2/4) Exclude cut with ID _28317_210_12742_1_1532480302381_8510325_447-233513-0_sp1.1 from training. Duration: 0.963625 2023-10-02 14:16:24,030 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_54-118333-0_sp1.1 from training. Duration: 0.97275 2023-10-02 14:16:27,540 WARNING [train.py:1204] (2/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_388-230239-0_sp1.1 from training. Duration: 0.963625 2023-10-02 14:16:27,551 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_40047_210_2290_1_1533290519_2374311_141-54639-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 14:16:29,473 WARNING [train.py:1204] (2/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_91-267696-0_sp0.9 from training. Duration: 0.9666875 2023-10-02 14:16:29,512 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_14056_210_4163_1_1530604483_7572827_532-137847-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 14:16:32,135 INFO [train.py:1046] (2/4) Epoch 26, batch 3500, loss[loss=0.1826, simple_loss=0.2527, pruned_loss=0.05625, over 23872.00 frames. ], tot_loss[loss=0.1693, simple_loss=0.2466, pruned_loss=0.04605, over 4703455.81 frames. ], batch size: 179, lr: 3.93e-03, grad_scale: 8.0 2023-10-02 14:16:33,510 WARNING [train.py:1204] (2/4) Exclude cut with ID _8064_210_5399_1_1528372299291_4282973_155-283231-0_sp1.1 from training. Duration: 0.97275 2023-10-02 14:16:38,110 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_2245_210_2715_1_1525511548_3540254_310-52483-0_sp1.1 from training. Duration: 0.8 2023-10-02 14:16:39,452 WARNING [train.py:1204] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_185-174805-0 from training. Duration: 0.68 2023-10-02 14:16:40,893 WARNING [train.py:1204] (2/4) Exclude cut with ID _15012_210_1968_1_1530856820429_5095350_662-210469-0_sp1.1 from training. Duration: 0.5181875 2023-10-02 14:16:43,721 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_426-313557-0_sp0.9 from training. Duration: 0.588875 2023-10-02 14:16:45,218 WARNING [train.py:1204] (2/4) Exclude cut with ID _42326_210_7770_1_1533814282531_3707269_407-204343-0_sp1.1 from training. Duration: 0.97275 2023-10-02 14:16:46,471 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_258-310570-0 from training. Duration: 0.82 2023-10-02 14:16:51,119 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_4737_210_2040_1_1526998689_3293008_378-178650-0_sp1.1 from training. Duration: 0.863625 2023-10-02 14:16:52,579 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_47896_210_20508_1_1534492518_5958868_432-327451-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 14:16:52,676 WARNING [train.py:1204] (2/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_188-114464-0_sp1.1 from training. Duration: 0.7454375 2023-10-02 14:16:52,683 WARNING [train.py:1204] (2/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_798-106179-0_sp1.1 from training. Duration: 0.7545625 2023-10-02 14:16:53,961 WARNING [train.py:1204] (2/4) Exclude cut with ID _14368_210_4220_1_1530532714870_3595529_143-117421-0_sp1.1 from training. Duration: 0.52725 2023-10-02 14:16:53,990 WARNING [train.py:1204] (2/4) Exclude cut with ID _67499_210_17282_1_1535509805220_3692430_361-78786-0_sp1.1 from training. Duration: 0.963625 2023-10-02 14:16:55,377 WARNING [train.py:1204] (2/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_712-342563-0_sp1.1 from training. Duration: 0.8818125 2023-10-02 14:16:55,434 WARNING [train.py:1204] (2/4) Exclude cut with ID _9235_210_5219_1_1528891154253_8046120_387-159008-0 from training. Duration: 0.86 2023-10-02 14:16:58,237 WARNING [train.py:1204] (2/4) Exclude cut with ID _33640_210_2655_1_1533117605479_4855469_959-18820-0_sp1.1 from training. Duration: 0.963625 2023-10-02 14:16:58,268 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_12868_210_6753_1_1530702479_3610564_133-564-0_sp0.9 from training. Duration: 0.87775 2023-10-02 14:17:00,158 WARNING [train.py:1204] (2/4) Exclude cut with ID _23514_210_1389_1_1531832139785_3031439_12-29287-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 14:17:04,948 WARNING [train.py:1204] (2/4) Exclude cut with ID _23514_210_1389_1_1531832139785_3031439_489-185052-0_sp1.1 from training. Duration: 0.963625 2023-10-02 14:17:05,014 WARNING [train.py:1204] (2/4) Exclude cut with ID _5444_210_2151_1_1527418808464_5312210_634-194174-0 from training. Duration: 0.97 2023-10-02 14:17:06,719 WARNING [train.py:1204] (2/4) Exclude cut with ID _6275_210_2084_1_1528110062213_7669049_91-26919-0_sp1.1 from training. Duration: 0.8818125 2023-10-02 14:17:07,458 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.2.encoder.layers.2.self_attn1.whiten, num_groups=1, num_channels=384, metric=18.34 vs. limit=22.5 2023-10-02 14:17:09,511 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_37351_210_7925_1_1533012931_3812113_516-82303-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 14:17:12,166 WARNING [train.py:1204] (2/4) Exclude cut with ID _41314_210_12651_1_1533459476543_3766460_80-74184-0_sp0.9 from training. Duration: 0.92225 2023-10-02 14:17:12,233 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_9460_210_4211_1_1528954587_10049667_495-202963-0_sp1.1 from training. Duration: 0.963625 2023-10-02 14:17:13,646 WARNING [train.py:1204] (2/4) Exclude cut with ID _14632_210_9988_1_1530691279392_3923200_332-105904-0_sp1.1 from training. Duration: 0.8181875 2023-10-02 14:17:13,665 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_40764_210_1925_1_1533882359_3752345_493-167886-0_sp1.1 from training. Duration: 0.92725 2023-10-02 14:17:15,095 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_196-289637-0 from training. Duration: 0.75 2023-10-02 14:17:16,531 WARNING [train.py:1204] (2/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_26-175553-0 from training. Duration: 0.61 2023-10-02 14:17:16,566 WARNING [train.py:1204] (2/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_890-5117-0 from training. Duration: 0.78 2023-10-02 14:17:16,620 WARNING [train.py:1204] (2/4) Exclude cut with ID _41314_210_12651_1_1533459476543_3766460_206-306860-0_sp1.1 from training. Duration: 0.92725 2023-10-02 14:17:19,292 WARNING [train.py:1204] (2/4) Exclude cut with ID _25882_210_3036_1_1532775665815_6479510_570-79824-0_sp1.1 from training. Duration: 0.963625 2023-10-02 14:17:19,348 WARNING [train.py:1204] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_731-2428-0_sp1.1 from training. Duration: 0.7545625 2023-10-02 14:17:19,367 WARNING [train.py:1204] (2/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_138-154812-0_sp0.9 from training. Duration: 0.8666875 2023-10-02 14:17:22,608 WARNING [train.py:1204] (2/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_178-39722-0_sp0.9 from training. Duration: 0.5444375 2023-10-02 14:17:23,940 WARNING [train.py:1204] (2/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_1285-256622-0_sp0.9 from training. Duration: 0.9333125 2023-10-02 14:17:28,012 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.562e+02 1.927e+02 2.309e+02 3.023e+02 4.699e+02, threshold=4.619e+02, percent-clipped=3.0 2023-10-02 14:17:28,090 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_41428_210_2161_1_1533774184_3992532_65-271323-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 14:17:29,494 WARNING [train.py:1204] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_308-161067-0 from training. Duration: 0.66 2023-10-02 14:17:29,499 WARNING [train.py:1204] (2/4) Exclude cut with ID _3067_210_2667_1_1526212974640_4503168_459-173867-0 from training. Duration: 0.88 2023-10-02 14:17:29,500 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_2221_210_1988_1_1525597051_4935541_415-296882-0_sp1.1 from training. Duration: 0.7 2023-10-02 14:17:32,648 WARNING [train.py:1204] (2/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_354-192266-0_sp1.1 from training. Duration: 0.936375 2023-10-02 14:17:34,707 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_258-310570-0_sp0.9 from training. Duration: 0.911125 2023-10-02 14:17:34,835 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_548-141039-0_sp1.1 from training. Duration: 0.963625 2023-10-02 14:17:37,530 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_15101_210_7927_1_1530786415_5033274_320-327527-0 from training. Duration: 0.96 2023-10-02 14:17:38,853 WARNING [train.py:1204] (2/4) Exclude cut with ID _2895_210_2477_1_1525956785794_4063750_289-11475-0_sp0.9 from training. Duration: 0.911125 2023-10-02 14:17:39,186 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.ff2_skip_rate, batch_count=908953.3333333334, ans=0.0 2023-10-02 14:17:40,259 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7543_210_6753_1_1528543318_4552_1-85072-0_sp1.1 from training. Duration: 0.7545625 2023-10-02 14:17:41,580 WARNING [train.py:1204] (2/4) Exclude cut with ID _64639_210_5816_1_1535367619437_3528000_461-329156-0 from training. Duration: 0.96 2023-10-02 14:17:43,003 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_211-339622-0 from training. Duration: 0.45 2023-10-02 14:17:45,638 INFO [train.py:1046] (2/4) Epoch 26, batch 3550, loss[loss=0.1627, simple_loss=0.25, pruned_loss=0.03769, over 24684.00 frames. ], tot_loss[loss=0.1683, simple_loss=0.2456, pruned_loss=0.04547, over 4721747.69 frames. ], batch size: 73, lr: 3.93e-03, grad_scale: 8.0 2023-10-02 14:17:45,700 WARNING [train.py:1204] (2/4) Exclude cut with ID _20455_210_5219_1_1531565996775_8098100_402-112739-0_sp1.1 from training. Duration: 0.963625 2023-10-02 14:17:47,080 WARNING [train.py:1204] (2/4) Exclude cut with ID _53489_210_6228_1_1534298357169_3835970_256-115565-0_sp1.1 from training. Duration: 0.936375 2023-10-02 14:17:47,110 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_26326_210_7925_1_1533002493_3592406_360-305260-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 14:17:47,139 WARNING [train.py:1204] (2/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_18-204383-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 14:17:50,300 WARNING [train.py:1204] (2/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_306-232480-0_sp1.1 from training. Duration: 0.87275 2023-10-02 14:17:50,521 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.balancer2.prob, batch_count=909020.0, ans=0.125 2023-10-02 14:17:57,291 WARNING [train.py:1204] (2/4) Exclude cut with ID _41161_210_20034_1_1533431249657_3555314_116-299186-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 14:17:59,874 WARNING [train.py:1204] (2/4) Exclude cut with ID _13913_210_9994_1_1530579615187_3383897_473-277682-0_sp1.1 from training. Duration: 0.3545625 2023-10-02 14:18:03,267 WARNING [train.py:1204] (2/4) Exclude cut with ID _58895_210_8750_1_1535184081320_3649089_49-225702-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 14:18:03,349 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_3080_210_2071_1_1526086102_4365166_312-8726-0_sp1.1 from training. Duration: 0.82725 2023-10-02 14:18:05,183 WARNING [train.py:1204] (2/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_489-190175-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 14:18:05,248 WARNING [train.py:1204] (2/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_345-95717-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 14:18:05,387 INFO [scaling.py:1118] (2/4) WithLoss: name=encoder.encoders.0.layers.1.self_attn_weights, loss-sum=0.000e+00 2023-10-02 14:18:06,585 WARNING [train.py:1204] (2/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_85-138257-0_sp1.1 from training. Duration: 0.6454375 2023-10-02 14:18:09,311 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7046_210_6753_1_1527895337_6920917_783-278109-0_sp1.1 from training. Duration: 0.7 2023-10-02 14:18:09,339 WARNING [train.py:1204] (2/4) Exclude cut with ID _3465_210_3525_1_1526175667060_3574049_450-203637-0_sp1.1 from training. Duration: 0.836375 2023-10-02 14:18:09,384 WARNING [train.py:1204] (2/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_416-132893-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 14:18:09,408 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_2655_210_2087_1_1525688959_3897400_437-337690-0_sp0.9 from training. Duration: 0.7 2023-10-02 14:18:09,563 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.0.nonlin_attention.balancer.prob, batch_count=909086.6666666666, ans=0.125 2023-10-02 14:18:10,798 WARNING [train.py:1204] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_700-183549-0_sp1.1 from training. Duration: 0.6181875 2023-10-02 14:18:16,262 WARNING [train.py:1204] (2/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_191-59784-0_sp1.1 from training. Duration: 0.8 2023-10-02 14:18:16,275 WARNING [train.py:1204] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_218-295097-0_sp1.1 from training. Duration: 0.7 2023-10-02 14:18:17,627 WARNING [train.py:1204] (2/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_310-109461-0_sp1.1 from training. Duration: 0.72725 2023-10-02 14:18:17,631 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_33348_210_5632_1_1533086912_3324030_131-321149-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 14:18:17,675 WARNING [train.py:1204] (2/4) Exclude cut with ID _64135_210_2253_1_1535677267876_7608972_133-188266-0_sp1.1 from training. Duration: 0.736375 2023-10-02 14:18:17,827 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.balancer2.prob, batch_count=909153.3333333334, ans=0.125 2023-10-02 14:18:18,998 WARNING [train.py:1204] (2/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_505-205270-0 from training. Duration: 0.38 2023-10-02 14:18:19,015 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_67223_210_4238_1_1535612120_2537431_102-88248-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 14:18:20,305 WARNING [train.py:1204] (2/4) Exclude cut with ID _54871_210_22356_1_1534665798253_3856359_60-235433-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 14:18:20,401 WARNING [train.py:1204] (2/4) Exclude cut with ID _5189_210_4557_1_1527248319428_3769229_199-53526-0_sp1.1 from training. Duration: 0.3818125 2023-10-02 14:18:24,259 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.5.encoder.layers.1.self_attn1.whiten, num_groups=1, num_channels=256, metric=10.07 vs. limit=22.5 2023-10-02 14:18:26,506 WARNING [train.py:1204] (2/4) Exclude cut with ID _45329_210_7005_1_1534157951211_6774339_439-90979-0_sp1.1 from training. Duration: 0.963625 2023-10-02 14:18:26,587 WARNING [train.py:1204] (2/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_1162-132880-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 14:18:27,845 WARNING [train.py:1204] (2/4) Exclude cut with ID _56423_210_5854_1_1534896061049_3165969_220-22317-0_sp1.1 from training. Duration: 0.963625 2023-10-02 14:18:31,059 WARNING [train.py:1204] (2/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_909-64151-0 from training. Duration: 0.94 2023-10-02 14:18:32,827 WARNING [train.py:1204] (2/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_465-156500-0_sp0.9 from training. Duration: 0.8 2023-10-02 14:18:32,911 WARNING [train.py:1204] (2/4) Exclude cut with ID _44304_210_15374_1_1533639352765_4613129_302-22084-0 from training. Duration: 0.86 2023-10-02 14:18:33,589 WARNING [train.py:1204] (2/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_663-166973-0_sp1.1 from training. Duration: 0.72725 2023-10-02 14:18:36,150 WARNING [train.py:1204] (2/4) Exclude cut with ID _45329_210_7005_1_1534157951211_6774339_503-287883-0_sp0.9 from training. Duration: 0.97775 2023-10-02 14:18:36,176 WARNING [train.py:1204] (2/4) Exclude cut with ID _60057_210_10640_1_1534903065793_3333150_362-230377-0_sp0.9 from training. Duration: 0.9666875 2023-10-02 14:18:40,121 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_4183_210_2087_1_1526729100_1869838_143-280962-0 from training. Duration: 0.56 2023-10-02 14:18:42,775 WARNING [train.py:1204] (2/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_281-1313-0_sp1.1 from training. Duration: 0.936375 2023-10-02 14:18:48,483 WARNING [train.py:1204] (2/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_64-215118-0_sp1.1 from training. Duration: 0.936375 2023-10-02 14:18:48,535 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_2822_210_2715_1_1526112586_1999587_165-36656-0 from training. Duration: 0.89 2023-10-02 14:18:48,580 WARNING [train.py:1204] (2/4) Exclude cut with ID _22961_210_8254_1_1531976366672_4278180_402-208830-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 14:18:52,567 WARNING [train.py:1204] (2/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_98-193494-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 14:18:54,223 WARNING [train.py:1204] (2/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_383-285196-0 from training. Duration: 0.97 2023-10-02 14:18:58,950 INFO [train.py:1046] (2/4) Epoch 26, batch 3600, loss[loss=0.173, simple_loss=0.2609, pruned_loss=0.04254, over 24421.00 frames. ], tot_loss[loss=0.1673, simple_loss=0.2448, pruned_loss=0.04492, over 4714221.82 frames. ], batch size: 69, lr: 3.93e-03, grad_scale: 16.0 2023-10-02 14:19:00,424 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6767_210_3273_1_1527855783_1718932_142-127132-0 from training. Duration: 0.95 2023-10-02 14:19:00,451 WARNING [train.py:1204] (2/4) Exclude cut with ID _39867_210_9826_1_1533553222030_9344259_1018-339635-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 14:19:00,511 WARNING [train.py:1204] (2/4) Exclude cut with ID _14603_210_4220_1_1530615688441_3097319_274-274798-0_sp1.1 from training. Duration: 0.8909375 2023-10-02 14:19:03,773 WARNING [train.py:1204] (2/4) Exclude cut with ID _2334_210_2667_1_1525608238880_4705129_376-263393-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 14:19:05,503 WARNING [train.py:1204] (2/4) Exclude cut with ID _29303_210_2575_1_1532392237303_7410649_363-344675-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 14:19:07,430 WARNING [train.py:1204] (2/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_58-68956-0_sp1.1 from training. Duration: 0.7545625 2023-10-02 14:19:10,328 WARNING [train.py:1204] (2/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_709-311571-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 14:19:11,756 WARNING [train.py:1204] (2/4) Exclude cut with ID _46067_210_4220_1_1534125571066_3307450_136-257407-0_sp1.1 from training. Duration: 0.963625 2023-10-02 14:19:13,237 WARNING [train.py:1204] (2/4) Exclude cut with ID _6493_210_1747_1_1528459210424_4012470_324-323854-0_sp1.1 from training. Duration: 0.836375 2023-10-02 14:19:13,303 WARNING [train.py:1204] (2/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_234-260682-0_sp1.1 from training. Duration: 0.763625 2023-10-02 14:19:14,592 WARNING [train.py:1204] (2/4) Exclude cut with ID _36634_210_1794_1_1533296352450_1399884_55-119451-0_sp1.1 from training. Duration: 0.963625 2023-10-02 14:19:14,595 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_40047_210_2290_1_1533290519_2374311_195-165693-0 from training. Duration: 0.89 2023-10-02 14:19:17,513 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.0.conv_module2.balancer1.prob, batch_count=909420.0, ans=0.125 2023-10-02 14:19:18,791 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_5522_210_3273_1_1527249606_3909096_366-172508-0_sp0.9 from training. Duration: 0.7333125 2023-10-02 14:19:18,881 WARNING [train.py:1204] (2/4) Exclude cut with ID _21878_210_3525_1_1531738772002_7264330_187-10504-0_sp1.1 from training. Duration: 0.963625 2023-10-02 14:19:21,630 WARNING [train.py:1204] (2/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_282-257791-0_sp1.1 from training. Duration: 0.7818125 2023-10-02 14:19:24,382 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_43831_210_2158_1_1533725502_5872791_467-288776-0_sp1.1 from training. Duration: 0.92725 2023-10-02 14:19:25,699 WARNING [train.py:1204] (2/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_138-154812-0_sp1.1 from training. Duration: 0.7090625 2023-10-02 14:19:25,734 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_1154-185930-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 14:19:25,751 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_2655_210_2087_1_1525688959_3897400_52-279899-0 from training. Duration: 0.69 2023-10-02 14:19:27,617 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_141-89113-0_sp1.1 from training. Duration: 0.7818125 2023-10-02 14:19:30,339 WARNING [train.py:1204] (2/4) Exclude cut with ID _55134_210_21982_1_1534590028377_3882170_297-47310-0_sp1.1 from training. Duration: 0.963625 2023-10-02 14:19:30,405 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_486-151131-0_sp1.1 from training. Duration: 0.77275 2023-10-02 14:19:31,822 WARNING [train.py:1204] (2/4) Exclude cut with ID _44778_210_2054_1_1533798127203_7443090_21-232980-0_sp1.1 from training. Duration: 0.936375 2023-10-02 14:19:34,639 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6165_210_3528_1_1527590982_4379841_338-106380-0_sp1.1 from training. Duration: 0.92725 2023-10-02 14:19:36,618 WARNING [train.py:1204] (2/4) Exclude cut with ID _55162_210_17071_1_1534408248583_3753480_355-257218-0_sp1.1 from training. Duration: 0.97275 2023-10-02 14:19:36,697 WARNING [train.py:1204] (2/4) Exclude cut with ID _2895_210_2477_1_1525956785794_4063750_289-11475-0 from training. Duration: 0.82 2023-10-02 14:19:42,924 WARNING [train.py:1204] (2/4) Exclude cut with ID _16756_210_4915_1_1531047803763_7104259_839-183431-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 14:19:44,324 WARNING [train.py:1204] (2/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_427-208901-0_sp0.9 from training. Duration: 0.7555625 2023-10-02 14:19:44,362 WARNING [train.py:1204] (2/4) Exclude cut with ID _36512_210_6987_1_1533033143998_3126069_181-306500-0 from training. Duration: 0.95 2023-10-02 14:19:48,615 WARNING [train.py:1204] (2/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_28-70987-0_sp1.1 from training. Duration: 0.7909375 2023-10-02 14:19:48,916 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.bypass.skip_rate, batch_count=909553.3333333334, ans=0.04949747468305833 2023-10-02 14:19:51,638 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.feed_forward1.out_proj.dropout_p, batch_count=909553.3333333334, ans=0.1 2023-10-02 14:19:52,767 WARNING [train.py:1204] (2/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_590-58996-0_sp1.1 from training. Duration: 0.936375 2023-10-02 14:19:55,938 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.491e+02 1.927e+02 2.147e+02 2.530e+02 3.358e+02, threshold=4.294e+02, percent-clipped=0.0 2023-10-02 14:19:56,030 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_48730_210_5399_1_1533885886_2212769_108-47561-0_sp1.1 from training. Duration: 0.936375 2023-10-02 14:19:57,642 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.feed_forward1.out_proj.dropout_p, batch_count=909620.0, ans=0.1 2023-10-02 14:20:00,138 WARNING [train.py:1204] (2/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_401-158239-0_sp0.9 from training. Duration: 0.788875 2023-10-02 14:20:00,161 WARNING [train.py:1204] (2/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_278-154492-0_sp1.1 from training. Duration: 0.6909375 2023-10-02 14:20:00,166 WARNING [train.py:1204] (2/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_137-116442-0 from training. Duration: 0.78 2023-10-02 14:20:02,179 WARNING [train.py:1204] (2/4) Exclude cut with ID _3541_210_2477_1_1526475408709_3263973_189-317158-0 from training. Duration: 0.9 2023-10-02 14:20:03,505 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_47896_210_20508_1_1534492518_5958868_558-314572-0 from training. Duration: 0.97 2023-10-02 14:20:07,200 WARNING [train.py:1204] (2/4) Exclude cut with ID _66070_210_7640_1_1535441393096_7805310_290-40284-0_sp1.1 from training. Duration: 0.97275 2023-10-02 14:20:07,239 WARNING [train.py:1204] (2/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_110-224688-0_sp1.1 from training. Duration: 0.8818125 2023-10-02 14:20:07,312 WARNING [train.py:1204] (2/4) Exclude cut with ID _7719_210_3525_1_1528456153163_4296960_88-250259-0 from training. Duration: 0.84 2023-10-02 14:20:08,587 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8453_210_4211_1_1528502940_8562730_1157-114102-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 14:20:08,619 WARNING [train.py:1204] (2/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_317-175062-0_sp1.1 from training. Duration: 0.7454375 2023-10-02 14:20:08,633 WARNING [train.py:1204] (2/4) Exclude cut with ID _29857_210_20034_1_1532395886753_3798160_254-347254-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 14:20:09,962 WARNING [train.py:1204] (2/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_28-70987-0 from training. Duration: 0.87 2023-10-02 14:20:10,033 WARNING [train.py:1204] (2/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_321-335475-0 from training. Duration: 0.45 2023-10-02 14:20:11,628 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.nonlin_attention.balancer.max_positive, batch_count=909620.0, ans=0.95 2023-10-02 14:20:12,842 WARNING [train.py:1204] (2/4) Exclude cut with ID _40901_210_5665_1_1533363789958_3856130_356-107330-0_sp1.1 from training. Duration: 0.936375 2023-10-02 14:20:14,027 INFO [train.py:1046] (2/4) Epoch 26, batch 3650, loss[loss=0.2135, simple_loss=0.279, pruned_loss=0.07404, over 19542.00 frames. ], tot_loss[loss=0.1687, simple_loss=0.2461, pruned_loss=0.04567, over 4705296.42 frames. ], batch size: 388, lr: 3.93e-03, grad_scale: 16.0 2023-10-02 14:20:14,110 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_3780_210_2087_1_1526467389_2145572_176-107140-0 from training. Duration: 0.69 2023-10-02 14:20:18,241 WARNING [train.py:1204] (2/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_80-135200-0 from training. Duration: 0.79 2023-10-02 14:20:19,596 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_33348_210_5632_1_1533086912_3324030_85-28583-0_sp0.9 from training. Duration: 0.911125 2023-10-02 14:20:22,404 WARNING [train.py:1204] (2/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_29-293896-0 from training. Duration: 0.61 2023-10-02 14:20:23,835 WARNING [train.py:1204] (2/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_194-252800-0 from training. Duration: 0.97 2023-10-02 14:20:29,838 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_360-52446-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 14:20:29,839 WARNING [train.py:1204] (2/4) Exclude cut with ID _9389_210_1664_1_1529217337301_5289269_289-213417-0_sp0.9 from training. Duration: 0.988875 2023-10-02 14:20:31,166 WARNING [train.py:1204] (2/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_143-86344-0_sp0.9 from training. Duration: 0.8555625 2023-10-02 14:20:35,082 WARNING [train.py:1204] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_520-126729-0_sp0.9 from training. Duration: 0.77775 2023-10-02 14:20:35,104 WARNING [train.py:1204] (2/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_76-308663-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 14:20:35,164 WARNING [train.py:1204] (2/4) Exclude cut with ID _8021_210_7994_1_1528376451358_3653069_100-140231-0 from training. Duration: 0.67 2023-10-02 14:20:35,223 WARNING [train.py:1204] (2/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_387-131538-0_sp0.9 from training. Duration: 0.9 2023-10-02 14:20:35,257 WARNING [train.py:1204] (2/4) Exclude cut with ID _6275_210_2084_1_1528110062213_7669049_365-182054-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 14:20:37,079 WARNING [train.py:1204] (2/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_345-340690-0 from training. Duration: 0.93 2023-10-02 14:20:38,634 WARNING [train.py:1204] (2/4) Exclude cut with ID _5409_210_1388_1_1527292850189_5865191_178-127970-0_sp0.9 from training. Duration: 0.6333125 2023-10-02 14:20:38,670 WARNING [train.py:1204] (2/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_352-12734-0_sp1.1 from training. Duration: 0.92725 2023-10-02 14:20:38,673 WARNING [train.py:1204] (2/4) Exclude cut with ID _65134_210_14763_1_1535335393985_3505368_121-129373-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 14:20:41,368 WARNING [train.py:1204] (2/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_557-324258-0_sp1.1 from training. Duration: 0.736375 2023-10-02 14:20:42,727 WARNING [train.py:1204] (2/4) Exclude cut with ID _48678_210_5663_1_1533988830947_3769199_306-255886-0 from training. Duration: 0.77 2023-10-02 14:20:44,131 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7544_210_6753_1_1528587000_691468_87-178599-0 from training. Duration: 0.87 2023-10-02 14:20:44,186 WARNING [train.py:1204] (2/4) Exclude cut with ID _2941_210_2577_1_1526199673150_2489759_208-236402-0_sp1.1 from training. Duration: 0.963625 2023-10-02 14:20:45,567 WARNING [train.py:1204] (2/4) Exclude cut with ID _38670_210_15379_1_1533448810659_4391059_65-169397-0 from training. Duration: 0.97 2023-10-02 14:20:48,149 WARNING [train.py:1204] (2/4) Exclude cut with ID _30304_210_6319_1_1532476827261_7582060_306-328175-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 14:20:48,168 WARNING [train.py:1204] (2/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_212-149947-0_sp1.1 from training. Duration: 0.663625 2023-10-02 14:20:53,660 WARNING [train.py:1204] (2/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_321-22821-0_sp1.1 from training. Duration: 0.8090625 2023-10-02 14:20:55,098 WARNING [train.py:1204] (2/4) Exclude cut with ID _44010_210_4525_1_1533884262964_4013620_483-69110-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 14:20:55,106 WARNING [train.py:1204] (2/4) Exclude cut with ID _14922_210_6228_1_1530707435273_3693310_38-187851-0_sp1.1 from training. Duration: 0.563625 2023-10-02 14:20:56,554 WARNING [train.py:1204] (2/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_129-337802-0_sp0.9 from training. Duration: 0.8 2023-10-02 14:20:58,015 WARNING [train.py:1204] (2/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_268-87909-0_sp1.1 from training. Duration: 0.9 2023-10-02 14:21:00,059 WARNING [train.py:1204] (2/4) Exclude cut with ID _21878_210_3525_1_1531738772002_7264330_559-184862-0_sp1.1 from training. Duration: 0.8545625 2023-10-02 14:21:02,774 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_46032_210_14656_1_1533781412_4507969_237-120766-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 14:21:04,559 WARNING [train.py:1204] (2/4) Exclude cut with ID _28427_210_4220_1_1532581240967_3061069_330-170906-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 14:21:04,565 WARNING [train.py:1204] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_401-51578-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 14:21:05,265 WARNING [train.py:1204] (2/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_209-205365-0_sp1.1 from training. Duration: 0.7181875 2023-10-02 14:21:06,512 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_23801_210_16223_1_1532066392_3294657_57-242170-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 14:21:07,885 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_127-37371-0_sp1.1 from training. Duration: 0.936375 2023-10-02 14:21:13,383 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9171W0019-578905 from training. Duration: 0.896 2023-10-02 14:21:16,180 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_24605_210_1794_1_1532047863_4149755_349-49056-0_sp1.1 from training. Duration: 0.92725 2023-10-02 14:21:16,197 WARNING [train.py:1204] (2/4) Exclude cut with ID _12302_210_5219_1_1529924446466_8107280_897-21237-0_sp1.1 from training. Duration: 0.936375 2023-10-02 14:21:17,588 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_9460_210_4211_1_1528954587_10049667_480-260269-0_sp0.9 from training. Duration: 0.62225 2023-10-02 14:21:18,897 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_348-152548-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 14:21:18,966 WARNING [train.py:1204] (2/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_301-330455-0_sp0.9 from training. Duration: 0.888875 2023-10-02 14:21:20,229 WARNING [train.py:1204] (2/4) Exclude cut with ID _61008_210_10633_1_1534852769198_6974370_345-91705-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 14:21:21,706 WARNING [train.py:1204] (2/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_304-273433-0 from training. Duration: 0.99 2023-10-02 14:21:21,709 WARNING [train.py:1204] (2/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_359-345972-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 14:21:24,422 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_2296_210_2087_1_1525503448_3397335_57-185430-0_sp1.1 from training. Duration: 0.5454375 2023-10-02 14:21:27,017 INFO [train.py:1046] (2/4) Epoch 26, batch 3700, loss[loss=0.1967, simple_loss=0.2596, pruned_loss=0.06688, over 22782.00 frames. ], tot_loss[loss=0.1682, simple_loss=0.2456, pruned_loss=0.04538, over 4705404.00 frames. ], batch size: 322, lr: 3.92e-03, grad_scale: 8.0 2023-10-02 14:21:27,071 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_1256-120974-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 14:21:28,401 WARNING [train.py:1204] (2/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_372-30302-0_sp0.9 from training. Duration: 0.9555625 2023-10-02 14:21:28,621 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.conv_skip_rate, batch_count=910020.0, ans=0.0 2023-10-02 14:21:31,588 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_42012_210_7927_1_1533439936_3871556_451-344262-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 14:21:31,589 WARNING [train.py:1204] (2/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_28-324729-0 from training. Duration: 0.94 2023-10-02 14:21:31,598 WARNING [train.py:1204] (2/4) Exclude cut with ID _64135_210_2253_1_1535677267876_7608972_313-18394-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 14:21:31,681 WARNING [train.py:1204] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_620-276793-0_sp0.9 from training. Duration: 0.6555625 2023-10-02 14:21:33,035 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7544_210_6753_1_1528591327_1471426_172-348777-0_sp0.9 from training. Duration: 0.7333125 2023-10-02 14:21:37,703 WARNING [train.py:1204] (2/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_332-221057-0_sp0.9 from training. Duration: 0.7555625 2023-10-02 14:21:41,043 WARNING [train.py:1204] (2/4) Exclude cut with ID _19023_210_4353_1_1531378892552_5820780_5-300035-0_sp1.1 from training. Duration: 0.92725 2023-10-02 14:21:41,085 WARNING [train.py:1204] (2/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_343-309023-0_sp1.1 from training. Duration: 0.963625 2023-10-02 14:21:41,150 WARNING [train.py:1204] (2/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_37-41919-0_sp1.1 from training. Duration: 0.7909375 2023-10-02 14:21:42,489 WARNING [train.py:1204] (2/4) Exclude cut with ID _3541_210_2477_1_1526475408709_3263973_41-182910-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 14:21:42,549 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_229-45300-0_sp0.9 from training. Duration: 0.6333125 2023-10-02 14:21:45,270 WARNING [train.py:1204] (2/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_40-214716-0_sp1.1 from training. Duration: 0.963625 2023-10-02 14:21:45,473 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=910086.6666666666, ans=0.1 2023-10-02 14:21:46,628 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9309W0203-640330 from training. Duration: 0.896 2023-10-02 14:21:52,267 WARNING [train.py:1204] (2/4) Exclude cut with ID _60229_210_27767_1_1534838406158_3655350_264-279573-0_sp1.1 from training. Duration: 0.7545625 2023-10-02 14:21:52,295 WARNING [train.py:1204] (2/4) Exclude cut with ID _8394_210_6702_1_1528590504316_3540189_372-198878-0_sp0.9 from training. Duration: 0.6666875 2023-10-02 14:21:53,738 WARNING [train.py:1204] (2/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_359-118926-0_sp1.1 from training. Duration: 0.6545625 2023-10-02 14:21:53,748 WARNING [train.py:1204] (2/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_85-65056-0 from training. Duration: 0.96 2023-10-02 14:21:55,096 WARNING [train.py:1204] (2/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_89-242335-0_sp1.1 from training. Duration: 0.863625 2023-10-02 14:21:55,715 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.2.encoder.layers.2.self_attn2.whiten, num_groups=1, num_channels=384, metric=18.12 vs. limit=22.5 2023-10-02 14:21:58,780 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.0.layers.0.self_attn_weights.whiten_keys, num_groups=4, num_channels=128, metric=3.08 vs. limit=6.0 2023-10-02 14:21:59,202 WARNING [train.py:1204] (2/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_128-49444-0_sp1.1 from training. Duration: 0.936375 2023-10-02 14:21:59,277 WARNING [train.py:1204] (2/4) Exclude cut with ID _30327_210_16049_1_1532484161295_3924169_401-70819-0 from training. Duration: 0.86 2023-10-02 14:22:00,602 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_41822_210_13270_1_1533955976_4379284_260-256391-0_sp1.1 from training. Duration: 0.936375 2023-10-02 14:22:02,648 WARNING [train.py:1204] (2/4) Exclude cut with ID _67335_210_5219_1_1535525975652_4275229_599-247317-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 14:22:04,139 WARNING [train.py:1204] (2/4) Exclude cut with ID _29857_210_20034_1_1532395886753_3798160_209-239267-0_sp1.1 from training. Duration: 0.936375 2023-10-02 14:22:05,951 WARNING [train.py:1204] (2/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_46-207599-0_sp1.1 from training. Duration: 0.5818125 2023-10-02 14:22:06,223 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.ff2_skip_rate, batch_count=910153.3333333334, ans=0.0 2023-10-02 14:22:07,982 WARNING [train.py:1204] (2/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_912-282412-0_sp1.1 from training. Duration: 0.4454375 2023-10-02 14:22:12,183 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_4183_210_2087_1_1526729100_1869838_141-166600-0_sp1.1 from training. Duration: 0.863625 2023-10-02 14:22:12,188 WARNING [train.py:1204] (2/4) Exclude cut with ID _15037_210_2253_1_1530752294104_3267650_296-192597-0 from training. Duration: 0.81 2023-10-02 14:22:13,477 WARNING [train.py:1204] (2/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_571-220562-0_sp1.1 from training. Duration: 0.963625 2023-10-02 14:22:13,502 WARNING [train.py:1204] (2/4) Exclude cut with ID _5287_210_2084_1_1527418883040_3878670_12-221186-0 from training. Duration: 0.98 2023-10-02 14:22:19,083 WARNING [train.py:1204] (2/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_149-85602-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 14:22:19,117 WARNING [train.py:1204] (2/4) Exclude cut with ID _9235_210_5219_1_1528891154253_8046120_1003-101638-0_sp1.1 from training. Duration: 0.663625 2023-10-02 14:22:21,997 WARNING [train.py:1204] (2/4) Exclude cut with ID _55844_210_2346_1_1534471212569_7625707_1099-33689-0_sp1.1 from training. Duration: 0.92725 2023-10-02 14:22:22,040 WARNING [train.py:1204] (2/4) Exclude cut with ID _7606_210_4915_1_1528894253277_5080690_301-10732-0 from training. Duration: 0.81 2023-10-02 14:22:24,734 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.494e+02 1.885e+02 2.100e+02 2.312e+02 3.361e+02, threshold=4.200e+02, percent-clipped=0.0 2023-10-02 14:22:24,851 WARNING [train.py:1204] (2/4) Exclude cut with ID _22826_210_9994_1_1531908201522_3279169_403-34698-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 14:22:24,857 WARNING [train.py:1204] (2/4) Exclude cut with ID _8480_210_1874_1_1528589391205_6728330_755-112625-0_sp0.9 from training. Duration: 0.82225 2023-10-02 14:22:24,869 WARNING [train.py:1204] (2/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_527-238990-0_sp1.1 from training. Duration: 0.8181875 2023-10-02 14:22:24,889 WARNING [train.py:1204] (2/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_426-7328-0_sp1.1 from training. Duration: 0.92725 2023-10-02 14:22:27,662 WARNING [train.py:1204] (2/4) Exclude cut with ID _3541_210_2477_1_1526475408709_3263973_189-317158-0_sp1.1 from training. Duration: 0.8181875 2023-10-02 14:22:28,983 WARNING [train.py:1204] (2/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_162-103620-0 from training. Duration: 0.93 2023-10-02 14:22:30,321 WARNING [train.py:1204] (2/4) Exclude cut with ID _22083_210_8254_1_1531702862439_3678970_49-234546-0 from training. Duration: 0.83 2023-10-02 14:22:31,586 WARNING [train.py:1204] (2/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_370-331600-0_sp1.1 from training. Duration: 0.8545625 2023-10-02 14:22:31,597 WARNING [train.py:1204] (2/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_166-59042-0_sp1.1 from training. Duration: 0.97275 2023-10-02 14:22:32,992 WARNING [train.py:1204] (2/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_138-332773-0_sp1.1 from training. Duration: 0.736375 2023-10-02 14:22:33,247 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.feed_forward1.out_proj.dropout_p, batch_count=910286.6666666666, ans=0.1 2023-10-02 14:22:34,352 WARNING [train.py:1204] (2/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_90-30673-0_sp0.9 from training. Duration: 0.9333125 2023-10-02 14:22:38,304 WARNING [train.py:1204] (2/4) Exclude cut with ID _27967_210_12140_1_1532588391859_8419130_737-41898-0_sp1.1 from training. Duration: 0.936375 2023-10-02 14:22:41,243 INFO [train.py:1046] (2/4) Epoch 26, batch 3750, loss[loss=0.1563, simple_loss=0.24, pruned_loss=0.03633, over 24461.00 frames. ], tot_loss[loss=0.1687, simple_loss=0.2462, pruned_loss=0.04554, over 4718728.10 frames. ], batch size: 66, lr: 3.92e-03, grad_scale: 8.0 2023-10-02 14:22:41,341 WARNING [train.py:1204] (2/4) Exclude cut with ID _8833_210_5219_1_1528592665210_7553059_329-125156-0_sp1.1 from training. Duration: 0.7454375 2023-10-02 14:22:42,700 WARNING [train.py:1204] (2/4) Exclude cut with ID _41817_210_18835_1_1533434477160_7423049_352-42537-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 14:22:44,161 WARNING [train.py:1204] (2/4) Exclude cut with ID _8365_210_2064_1_1528592311070_4677690_431-294102-0 from training. Duration: 0.89 2023-10-02 14:22:45,526 WARNING [train.py:1204] (2/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_454-314901-0_sp1.1 from training. Duration: 0.4181875 2023-10-02 14:22:47,146 WARNING [train.py:1204] (2/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_80-135200-0_sp0.9 from training. Duration: 0.87775 2023-10-02 14:22:48,401 WARNING [train.py:1204] (2/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_181-78632-0 from training. Duration: 0.78 2023-10-02 14:22:48,458 WARNING [train.py:1204] (2/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_300-27235-0_sp0.9 from training. Duration: 0.9666875 2023-10-02 14:22:49,740 WARNING [train.py:1204] (2/4) Exclude cut with ID _47790_210_16187_1_1533862814066_4207519_96-177176-0_sp1.1 from training. Duration: 0.97275 2023-10-02 14:22:51,107 WARNING [train.py:1204] (2/4) Exclude cut with ID _35941_210_15379_1_1532995294515_4023089_246-348742-0_sp1.1 from training. Duration: 0.97275 2023-10-02 14:22:52,566 WARNING [train.py:1204] (2/4) Exclude cut with ID _38670_210_15379_1_1533448810659_4391059_65-169397-0_sp1.1 from training. Duration: 0.8818125 2023-10-02 14:22:52,679 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.0.feed_forward1.hidden_balancer.prob, batch_count=910353.3333333334, ans=0.125 2023-10-02 14:22:52,694 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.1.ff2_skip_rate, batch_count=910353.3333333334, ans=0.0 2023-10-02 14:22:56,741 WARNING [train.py:1204] (2/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_1003-99642-0_sp1.1 from training. Duration: 0.963625 2023-10-02 14:22:59,616 WARNING [train.py:1204] (2/4) Exclude cut with ID _40402_210_18219_1_1533522324115_2953150_320-14078-0_sp1.1 from training. Duration: 0.863625 2023-10-02 14:23:00,978 WARNING [train.py:1204] (2/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_367-61330-0_sp0.9 from training. Duration: 0.8333125 2023-10-02 14:23:02,390 WARNING [train.py:1204] (2/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_515-298514-0_sp1.1 from training. Duration: 0.92725 2023-10-02 14:23:03,921 WARNING [train.py:1204] (2/4) Exclude cut with ID _53731_210_7994_1_1534553979244_3757214_471-320815-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 14:23:05,326 WARNING [train.py:1204] (2/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_306-232480-0 from training. Duration: 0.96 2023-10-02 14:23:06,580 WARNING [train.py:1204] (2/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_478-162247-0_sp0.9 from training. Duration: 0.911125 2023-10-02 14:23:08,530 WARNING [train.py:1204] (2/4) Exclude cut with ID _56423_210_5854_1_1534896061049_3165969_345-284696-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 14:23:10,270 WARNING [train.py:1204] (2/4) Exclude cut with ID _44778_210_2054_1_1533798127203_7443090_600-122253-0_sp1.1 from training. Duration: 0.963625 2023-10-02 14:23:12,231 WARNING [train.py:1204] (2/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_199-295611-0 from training. Duration: 0.55 2023-10-02 14:23:17,886 WARNING [train.py:1204] (2/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_734-129761-0 from training. Duration: 0.82 2023-10-02 14:23:19,237 WARNING [train.py:1204] (2/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_349-169257-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 14:23:19,267 WARNING [train.py:1204] (2/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_202-67017-0_sp0.9 from training. Duration: 0.911125 2023-10-02 14:23:20,665 WARNING [train.py:1204] (2/4) Exclude cut with ID _16908_210_5724_1_1531119157248_3772700_61-258978-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 14:23:26,303 WARNING [train.py:1204] (2/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_172-156931-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 14:23:27,657 WARNING [train.py:1204] (2/4) Exclude cut with ID _4092_210_2151_1_1526643044449_3135759_356-278694-0_sp1.1 from training. Duration: 0.42725 2023-10-02 14:23:30,457 WARNING [train.py:1204] (2/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_116-332008-0 from training. Duration: 0.98 2023-10-02 14:23:31,942 WARNING [train.py:1204] (2/4) Exclude cut with ID _29003_210_19596_1_1532507391720_3894098_358-114789-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 14:23:36,062 WARNING [train.py:1204] (2/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_152-212533-0_sp1.1 from training. Duration: 0.8818125 2023-10-02 14:23:36,131 WARNING [train.py:1204] (2/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_267-214116-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 14:23:38,456 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.bypass.scale_min, batch_count=910620.0, ans=0.2 2023-10-02 14:23:39,701 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7543_210_6753_1_1528541907_1399322_165-323619-0_sp0.9 from training. Duration: 0.7666875 2023-10-02 14:23:39,833 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.0.bypass_mid.scale_min, batch_count=910620.0, ans=0.2 2023-10-02 14:23:43,496 WARNING [train.py:1204] (2/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_577-332600-0_sp0.9 from training. Duration: 0.6333125 2023-10-02 14:23:43,720 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.balancer1.prob, batch_count=910620.0, ans=0.125 2023-10-02 14:23:44,678 WARNING [train.py:1204] (2/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_342-100157-0_sp0.9 from training. Duration: 0.6 2023-10-02 14:23:46,097 WARNING [train.py:1204] (2/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_141-89727-0_sp1.1 from training. Duration: 0.6454375 2023-10-02 14:23:47,501 WARNING [train.py:1204] (2/4) Exclude cut with ID _3992_210_1747_1_1526644816031_3731060_107-251434-0_sp1.1 from training. Duration: 0.9 2023-10-02 14:23:50,393 WARNING [train.py:1204] (2/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_401-53423-0_sp1.1 from training. Duration: 0.5 2023-10-02 14:23:53,320 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.attention_skip_rate, batch_count=910686.6666666666, ans=0.0 2023-10-02 14:23:54,549 INFO [train.py:1046] (2/4) Epoch 26, batch 3800, loss[loss=0.149, simple_loss=0.2271, pruned_loss=0.0354, over 24584.00 frames. ], tot_loss[loss=0.1686, simple_loss=0.246, pruned_loss=0.04555, over 4723340.18 frames. ], batch size: 60, lr: 3.92e-03, grad_scale: 8.0 2023-10-02 14:23:58,673 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7762_210_1794_1_1528199508_4057408_422-227034-0_sp1.1 from training. Duration: 0.663625 2023-10-02 14:24:01,487 WARNING [train.py:1204] (2/4) Exclude cut with ID _4184_210_2550_1_1526730192013_2219380_102-250299-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 14:24:01,558 WARNING [train.py:1204] (2/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_253-308377-0_sp0.9 from training. Duration: 0.5444375 2023-10-02 14:24:02,894 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_271-297505-0 from training. Duration: 0.99 2023-10-02 14:24:04,270 WARNING [train.py:1204] (2/4) Exclude cut with ID _35941_210_15379_1_1532995294515_4023089_213-142973-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 14:24:05,656 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7544_210_6753_1_1528587722_3576686_162-63018-0_sp1.1 from training. Duration: 0.92725 2023-10-02 14:24:07,443 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_365-217119-0_sp0.9 from training. Duration: 0.7 2023-10-02 14:24:08,920 WARNING [train.py:1204] (2/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_662-275621-0_sp0.9 from training. Duration: 0.4666875 2023-10-02 14:24:08,921 WARNING [train.py:1204] (2/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_374-187555-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 14:24:10,280 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_289-180904-0_sp1.1 from training. Duration: 0.6909375 2023-10-02 14:24:12,279 WARNING [train.py:1204] (2/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_189-47206-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 14:24:12,304 WARNING [train.py:1204] (2/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_234-14174-0_sp1.1 from training. Duration: 0.7454375 2023-10-02 14:24:13,474 WARNING [train.py:1204] (2/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_576-76621-0_sp1.1 from training. Duration: 0.963625 2023-10-02 14:24:13,563 WARNING [train.py:1204] (2/4) Exclude cut with ID _13253_210_1925_1_1530757775819_3577873_253-246386-0 from training. Duration: 0.9 2023-10-02 14:24:17,562 WARNING [train.py:1204] (2/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_372-6869-0_sp1.1 from training. Duration: 0.4 2023-10-02 14:24:18,895 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_21434_210_4238_1_1531916339_1222252_22-54954-0_sp1.1 from training. Duration: 0.936375 2023-10-02 14:24:20,296 WARNING [train.py:1204] (2/4) Exclude cut with ID _29853_210_7497_1_1532505600851_6642110_492-170361-0_sp1.1 from training. Duration: 0.92725 2023-10-02 14:24:21,914 WARNING [train.py:1204] (2/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_505-97987-0_sp0.9 from training. Duration: 0.9555625 2023-10-02 14:24:23,182 WARNING [train.py:1204] (2/4) Exclude cut with ID _8365_210_2064_1_1528592311070_4677690_371-122295-0_sp0.9 from training. Duration: 0.611125 2023-10-02 14:24:24,468 WARNING [train.py:1204] (2/4) Exclude cut with ID _14268_210_9206_1_1530622771285_4714099_637-86630-0_sp1.1 from training. Duration: 0.463625 2023-10-02 14:24:24,479 WARNING [train.py:1204] (2/4) Exclude cut with ID _29857_210_20034_1_1532395886753_3798160_372-5678-0_sp1.1 from training. Duration: 0.963625 2023-10-02 14:24:27,165 WARNING [train.py:1204] (2/4) Exclude cut with ID _52892_210_6721_1_1534378941259_4124129_90-165108-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 14:24:28,488 WARNING [train.py:1204] (2/4) Exclude cut with ID _27052_210_5986_1_1532329197187_3585009_143-339198-0_sp1.1 from training. Duration: 0.963625 2023-10-02 14:24:32,652 WARNING [train.py:1204] (2/4) Exclude cut with ID _14268_210_9206_1_1530622771285_4714099_637-86630-0_sp0.9 from training. Duration: 0.5666875 2023-10-02 14:24:32,654 WARNING [train.py:1204] (2/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_401-255491-0 from training. Duration: 0.7 2023-10-02 14:24:34,018 WARNING [train.py:1204] (2/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_178-6946-0_sp1.1 from training. Duration: 0.97275 2023-10-02 14:24:42,009 WARNING [train.py:1204] (2/4) Exclude cut with ID _27485_210_4916_1_1532739191677_3668269_460-163885-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 14:24:48,015 WARNING [train.py:1204] (2/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_270-26053-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 14:24:50,719 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.491e+02 1.895e+02 2.051e+02 2.424e+02 3.630e+02, threshold=4.102e+02, percent-clipped=0.0 2023-10-02 14:24:50,814 WARNING [train.py:1204] (2/4) Exclude cut with ID _14170_210_5219_1_1530525949868_4469650_412-317077-0 from training. Duration: 0.75 2023-10-02 14:24:50,945 WARNING [train.py:1204] (2/4) Exclude cut with ID _5289_210_2084_1_1527678202736_3779299_142-103317-0 from training. Duration: 0.84 2023-10-02 14:24:50,990 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_150-143438-0_sp1.1 from training. Duration: 0.92725 2023-10-02 14:24:53,835 WARNING [train.py:1204] (2/4) Exclude cut with ID _27894_210_12392_1_1532415496078_6902439_668-269779-0_sp1.1 from training. Duration: 0.97275 2023-10-02 14:24:53,915 WARNING [train.py:1204] (2/4) Exclude cut with ID _18513_210_9206_1_1531618215223_3862620_56-222075-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 14:24:56,604 WARNING [train.py:1204] (2/4) Exclude cut with ID _4092_210_2151_1_1526643044449_3135759_356-278694-0 from training. Duration: 0.47 2023-10-02 14:24:59,318 WARNING [train.py:1204] (2/4) Exclude cut with ID _25808_210_13292_1_1532343514562_7366109_633-47524-0 from training. Duration: 0.99 2023-10-02 14:24:59,329 WARNING [train.py:1204] (2/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_872-264525-0 from training. Duration: 0.65 2023-10-02 14:24:59,374 WARNING [train.py:1204] (2/4) Exclude cut with ID _14632_210_9988_1_1530691279392_3923200_239-232545-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 14:25:00,875 WARNING [train.py:1204] (2/4) Exclude cut with ID _14349_210_8341_1_1530615710818_3442470_153-27432-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 14:25:03,809 WARNING [train.py:1204] (2/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_37-41919-0_sp0.9 from training. Duration: 0.9666875 2023-10-02 14:25:05,052 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_196-289637-0_sp0.9 from training. Duration: 0.8333125 2023-10-02 14:25:06,906 INFO [train.py:1046] (2/4) Epoch 26, batch 3850, loss[loss=0.1379, simple_loss=0.2002, pruned_loss=0.03783, over 22688.00 frames. ], tot_loss[loss=0.1677, simple_loss=0.2445, pruned_loss=0.04546, over 4711587.24 frames. ], batch size: 322, lr: 3.92e-03, grad_scale: 8.0 2023-10-02 14:25:12,072 WARNING [train.py:1204] (2/4) Exclude cut with ID _13678_210_7497_1_1530530069226_3463170_371-3221-0_sp1.1 from training. Duration: 0.8181875 2023-10-02 14:25:13,435 WARNING [train.py:1204] (2/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_557-324258-0 from training. Duration: 0.81 2023-10-02 14:25:14,805 WARNING [train.py:1204] (2/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_1012-279473-0_sp1.1 from training. Duration: 0.6090625 2023-10-02 14:25:14,889 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_66916_210_14656_1_1535725525_326566_7-92649-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 14:25:18,957 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_9460_210_4211_1_1528954587_10049667_480-260269-0_sp1.1 from training. Duration: 0.5090625 2023-10-02 14:25:21,878 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_40896_210_15710_1_1533433956_7100802_121-155001-0_sp1.1 from training. Duration: 0.92725 2023-10-02 14:25:23,240 WARNING [train.py:1204] (2/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_188-171673-0_sp1.1 from training. Duration: 0.52725 2023-10-02 14:25:24,552 WARNING [train.py:1204] (2/4) Exclude cut with ID _47790_210_16187_1_1533862814066_4207519_2-39653-0 from training. Duration: 0.98 2023-10-02 14:25:28,833 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_23850_210_4238_1_1532232597_1632433_112-229012-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 14:25:31,614 WARNING [train.py:1204] (2/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_626-79528-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 14:25:34,347 WARNING [train.py:1204] (2/4) Exclude cut with ID _30304_210_6319_1_1532476827261_7582060_856-90052-0_sp1.1 from training. Duration: 0.963625 2023-10-02 14:25:35,673 WARNING [train.py:1204] (2/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_122-109023-0_sp0.9 from training. Duration: 0.8444375 2023-10-02 14:25:37,140 WARNING [train.py:1204] (2/4) Exclude cut with ID _64414_210_17301_1_1535243252048_3757759_37-70328-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 14:25:38,027 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.bypass_mid.scale_min, batch_count=911153.3333333334, ans=0.2 2023-10-02 14:25:38,984 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_20120_210_6753_1_1531643035_5482859_622-344907-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 14:25:39,041 WARNING [train.py:1204] (2/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_410-321956-0_sp1.1 from training. Duration: 0.92725 2023-10-02 14:25:39,052 WARNING [train.py:1204] (2/4) Exclude cut with ID _7562_210_2084_1_1528887767916_3946229_186-326422-0_sp0.9 from training. Duration: 0.9333125 2023-10-02 14:25:40,965 WARNING [train.py:1204] (2/4) Exclude cut with ID _37537_210_2253_1_1533171758568_3421949_127-285192-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 14:25:43,794 WARNING [train.py:1204] (2/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_723-228168-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 14:25:43,870 WARNING [train.py:1204] (2/4) Exclude cut with ID _13678_210_7497_1_1530530069226_3463170_426-246984-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 14:25:45,667 WARNING [train.py:1204] (2/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_399-156791-0_sp0.9 from training. Duration: 0.92225 2023-10-02 14:25:45,731 WARNING [train.py:1204] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_391-22148-0 from training. Duration: 0.77 2023-10-02 14:25:45,763 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_3475_210_2028_1_1526193578_3622569_64-297072-0 from training. Duration: 0.91 2023-10-02 14:25:47,144 WARNING [train.py:1204] (2/4) Exclude cut with ID _17875_210_2087_1_1531224012907_1821339_225-280015-0_sp1.1 from training. Duration: 0.963625 2023-10-02 14:25:47,161 WARNING [train.py:1204] (2/4) Exclude cut with ID _50655_210_18628_1_1534229942193_3772154_325-246169-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 14:25:49,918 WARNING [train.py:1204] (2/4) Exclude cut with ID _29529_210_7234_1_1532393961455_3977460_597-257348-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 14:25:49,963 WARNING [train.py:1204] (2/4) Exclude cut with ID _58895_210_8750_1_1535184081320_3649089_77-173485-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 14:25:49,987 WARNING [train.py:1204] (2/4) Exclude cut with ID _6270_210_2084_1_1527850911573_3856420_26-277343-0 from training. Duration: 0.82 2023-10-02 14:25:52,688 WARNING [train.py:1204] (2/4) Exclude cut with ID _14368_210_4220_1_1530532714870_3595529_144-3287-0 from training. Duration: 0.67 2023-10-02 14:25:54,094 WARNING [train.py:1204] (2/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_293-145278-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 14:25:55,618 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.attention_skip_rate, batch_count=911220.0, ans=0.0 2023-10-02 14:25:56,663 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_40047_210_2290_1_1533290519_2374311_257-335162-0 from training. Duration: 0.95 2023-10-02 14:25:58,161 WARNING [train.py:1204] (2/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_619-344499-0_sp0.9 from training. Duration: 0.7 2023-10-02 14:26:02,357 WARNING [train.py:1204] (2/4) Exclude cut with ID _19504_210_9994_1_1531648863242_3325220_456-315379-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 14:26:02,437 WARNING [train.py:1204] (2/4) Exclude cut with ID _55857_210_2346_1_1534644007831_6898209_631-221692-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 14:26:06,622 WARNING [train.py:1204] (2/4) Exclude cut with ID _64631_210_27767_1_1535349580068_4165160_324-3682-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 14:26:06,642 WARNING [train.py:1204] (2/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_317-211107-0 from training. Duration: 0.92 2023-10-02 14:26:10,346 WARNING [train.py:1204] (2/4) Exclude cut with ID _6789_210_1534_1_1527857937218_3118950_311-61262-0 from training. Duration: 0.65 2023-10-02 14:26:10,511 WARNING [train.py:1204] (2/4) Exclude cut with ID _28323_210_16187_1_1532480395667_4100300_365-68882-0_sp1.1 from training. Duration: 0.92725 2023-10-02 14:26:10,539 WARNING [train.py:1204] (2/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_816-181825-0_sp1.1 from training. Duration: 0.92725 2023-10-02 14:26:15,018 WARNING [train.py:1204] (2/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_132-269633-0_sp1.1 from training. Duration: 0.6181875 2023-10-02 14:26:15,037 WARNING [train.py:1204] (2/4) Exclude cut with ID _44585_210_18628_1_1533605350387_4518199_280-73534-0_sp1.1 from training. Duration: 0.8090625 2023-10-02 14:26:15,097 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_68095_210_20185_1_1535718226_8888855_1009-164755-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 14:26:16,450 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_40223_210_6228_1_1533298404_4812267_400-103813-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 14:26:16,451 WARNING [train.py:1204] (2/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_491-261923-0_sp1.1 from training. Duration: 0.8909375 2023-10-02 14:26:16,456 WARNING [train.py:1204] (2/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_712-342563-0 from training. Duration: 0.97 2023-10-02 14:26:19,018 WARNING [train.py:1204] (2/4) Exclude cut with ID _41161_210_20034_1_1533431249657_3555314_175-190032-0_sp1.1 from training. Duration: 0.963625 2023-10-02 14:26:20,296 INFO [train.py:1046] (2/4) Epoch 26, batch 3900, loss[loss=0.1566, simple_loss=0.2363, pruned_loss=0.03848, over 24662.00 frames. ], tot_loss[loss=0.1663, simple_loss=0.2427, pruned_loss=0.04499, over 4708534.06 frames. ], batch size: 65, lr: 3.92e-03, grad_scale: 8.0 2023-10-02 14:26:20,393 WARNING [train.py:1204] (2/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_788-298907-0 from training. Duration: 0.89 2023-10-02 14:26:21,721 WARNING [train.py:1204] (2/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_225-70961-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 14:26:21,727 WARNING [train.py:1204] (2/4) Exclude cut with ID _58583_210_5236_1_1534726835266_3574249_235-175994-0_sp1.1 from training. Duration: 0.92725 2023-10-02 14:26:23,132 WARNING [train.py:1204] (2/4) Exclude cut with ID _12302_210_5219_1_1529924446466_8107280_907-182949-0_sp1.1 from training. Duration: 0.836375 2023-10-02 14:26:23,173 WARNING [train.py:1204] (2/4) Exclude cut with ID _54871_210_22356_1_1534665798253_3856359_94-193692-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 14:26:24,523 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_4528_210_3273_1_1527076654_3276777_342-168068-0_sp0.9 from training. Duration: 0.9666875 2023-10-02 14:26:24,571 WARNING [train.py:1204] (2/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_368-74239-0_sp1.1 from training. Duration: 0.92725 2023-10-02 14:26:24,579 WARNING [train.py:1204] (2/4) Exclude cut with ID _44831_210_13427_1_1533816011549_3689035_291-295928-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 14:26:25,933 WARNING [train.py:1204] (2/4) Exclude cut with ID _56406_210_9288_1_1534899618664_3496149_455-131997-0_sp1.1 from training. Duration: 0.97275 2023-10-02 14:26:25,940 WARNING [train.py:1204] (2/4) Exclude cut with ID _6473_210_1741_1_1527901307396_3815380_108-17335-0 from training. Duration: 0.65 2023-10-02 14:26:27,287 WARNING [train.py:1204] (2/4) Exclude cut with ID _14224_210_4381_1_1530529062037_4264010_286-135600-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 14:26:27,538 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=911353.3333333334, ans=0.1 2023-10-02 14:26:30,032 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_14056_210_4163_1_1530604483_7572827_928-321675-0_sp1.1 from training. Duration: 0.936375 2023-10-02 14:26:30,113 WARNING [train.py:1204] (2/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_81-300212-0_sp1.1 from training. Duration: 0.5454375 2023-10-02 14:26:31,437 WARNING [train.py:1204] (2/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_29-280622-0_sp0.9 from training. Duration: 0.911125 2023-10-02 14:26:32,872 WARNING [train.py:1204] (2/4) Exclude cut with ID _61079_210_4525_1_1534921008198_4011419_228-176447-0_sp1.1 from training. Duration: 0.936375 2023-10-02 14:26:34,354 WARNING [train.py:1204] (2/4) Exclude cut with ID _8394_210_6702_1_1528590504316_3540189_372-198878-0_sp1.1 from training. Duration: 0.5454375 2023-10-02 14:26:34,365 WARNING [train.py:1204] (2/4) Exclude cut with ID _64521_210_14699_1_1535243427259_3815328_244-208725-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 14:26:37,586 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8275_210_4198_1_1528455320_3892571_491-17707-0_sp0.9 from training. Duration: 0.711125 2023-10-02 14:26:38,948 WARNING [train.py:1204] (2/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_375-245677-0 from training. Duration: 0.81 2023-10-02 14:26:38,955 WARNING [train.py:1204] (2/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_690-286055-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 14:26:40,922 WARNING [train.py:1204] (2/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_89-321955-0 from training. Duration: 0.8 2023-10-02 14:26:40,966 WARNING [train.py:1204] (2/4) Exclude cut with ID _18895_210_6228_1_1531357274234_3784930_213-98721-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 14:26:42,365 WARNING [train.py:1204] (2/4) Exclude cut with ID _19708_210_9206_1_1532136650733_4238364_347-65240-0 from training. Duration: 0.97 2023-10-02 14:26:42,472 WARNING [train.py:1204] (2/4) Exclude cut with ID _64135_210_2253_1_1535677267876_7608972_498-5039-0 from training. Duration: 0.78 2023-10-02 14:26:45,305 WARNING [train.py:1204] (2/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_146-276944-0_sp1.1 from training. Duration: 0.7545625 2023-10-02 14:26:48,606 WARNING [train.py:1204] (2/4) Exclude cut with ID _63171_210_15488_1_1535104792794_3295110_155-34488-0_sp1.1 from training. Duration: 0.97275 2023-10-02 14:26:48,630 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_5438_210_1794_1_1527336687_2600009_233-58072-0_sp0.9 from training. Duration: 0.8555625 2023-10-02 14:26:48,679 WARNING [train.py:1204] (2/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_63-101996-0_sp0.9 from training. Duration: 0.97775 2023-10-02 14:26:50,150 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.self_attn_weights.pos_emb_skip_rate, batch_count=911486.6666666666, ans=0.0 2023-10-02 14:26:52,687 WARNING [train.py:1204] (2/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_271-269084-0_sp1.1 from training. Duration: 0.7545625 2023-10-02 14:26:55,381 WARNING [train.py:1204] (2/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_345-167986-0_sp1.1 from training. Duration: 0.763625 2023-10-02 14:26:58,140 WARNING [train.py:1204] (2/4) Exclude cut with ID _39290_210_4220_1_1533862778760_3075118_92-311600-0_sp1.1 from training. Duration: 0.8 2023-10-02 14:26:58,153 WARNING [train.py:1204] (2/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_209-65306-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 14:26:59,502 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_26433_210_12108_1_1532157158_4056480_317-335807-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 14:27:05,030 WARNING [train.py:1204] (2/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_238-94127-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 14:27:05,075 WARNING [train.py:1204] (2/4) Exclude cut with ID _41314_210_12651_1_1533459476543_3766460_207-132038-0_sp1.1 from training. Duration: 0.8545625 2023-10-02 14:27:06,014 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.conv_module1.balancer2.prob, batch_count=911553.3333333334, ans=0.125 2023-10-02 14:27:13,074 WARNING [train.py:1204] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_167-137255-0_sp1.1 from training. Duration: 0.5818125 2023-10-02 14:27:14,501 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_2009_210_2071_1_1525481134_4588937_385-302100-0_sp0.9 from training. Duration: 0.9666875 2023-10-02 14:27:17,263 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.543e+02 1.831e+02 1.989e+02 2.156e+02 3.105e+02, threshold=3.979e+02, percent-clipped=0.0 2023-10-02 14:27:20,771 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_15517_210_6753_1_1530954008_3487336_253-110617-0_sp1.1 from training. Duration: 0.936375 2023-10-02 14:27:23,393 WARNING [train.py:1204] (2/4) Exclude cut with ID _39290_210_4220_1_1533862778760_3075118_92-311600-0_sp0.9 from training. Duration: 0.97775 2023-10-02 14:27:23,432 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_42-142473-0 from training. Duration: 0.72 2023-10-02 14:27:23,468 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_2296_210_2087_1_1525503448_3397335_57-185430-0 from training. Duration: 0.6 2023-10-02 14:27:25,056 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_11305_210_1794_1_1529750428_7178148_257-312870-0_sp0.9 from training. Duration: 0.97775 2023-10-02 14:27:26,330 WARNING [train.py:1204] (2/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_222-198127-0 from training. Duration: 0.84 2023-10-02 14:27:26,424 WARNING [train.py:1204] (2/4) Exclude cut with ID _65263_210_22356_1_1535763598691_3921037_423-202699-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 14:27:26,476 WARNING [train.py:1204] (2/4) Exclude cut with ID _15037_210_2253_1_1530752294104_3267650_13-130361-0 from training. Duration: 0.99 2023-10-02 14:27:32,440 WARNING [train.py:1204] (2/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_628-198080-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 14:27:33,700 INFO [train.py:1046] (2/4) Epoch 26, batch 3950, loss[loss=0.1704, simple_loss=0.2387, pruned_loss=0.05104, over 22716.00 frames. ], tot_loss[loss=0.1657, simple_loss=0.242, pruned_loss=0.04473, over 4688693.09 frames. ], batch size: 322, lr: 3.92e-03, grad_scale: 8.0 2023-10-02 14:27:33,794 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_3171_210_2087_1_1526109084_2330007_37-64334-0 from training. Duration: 0.82 2023-10-02 14:27:35,066 WARNING [train.py:1204] (2/4) Exclude cut with ID _5287_210_2084_1_1527418883040_3878670_12-221186-0_sp1.1 from training. Duration: 0.8909375 2023-10-02 14:27:38,886 WARNING [train.py:1204] (2/4) Exclude cut with ID _6653_210_2629_1_1527919172441_6732679_1103-56445-0_sp1.1 from training. Duration: 0.663625 2023-10-02 14:27:40,347 WARNING [train.py:1204] (2/4) Exclude cut with ID _65263_210_22356_1_1535763598691_3921037_169-140068-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 14:27:45,836 WARNING [train.py:1204] (2/4) Exclude cut with ID IC0671W0118-331961 from training. Duration: 0.896 2023-10-02 14:27:47,186 WARNING [train.py:1204] (2/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_402-102311-0_sp1.1 from training. Duration: 0.6090625 2023-10-02 14:27:47,212 WARNING [train.py:1204] (2/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_202-293523-0 from training. Duration: 0.72 2023-10-02 14:27:48,521 WARNING [train.py:1204] (2/4) Exclude cut with ID IC0679W0038-335846 from training. Duration: 0.768 2023-10-02 14:27:48,543 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_37357_210_17024_1_1533089504_2316121_79-198866-0_sp1.1 from training. Duration: 0.936375 2023-10-02 14:27:50,270 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.conv_module1.balancer1.max_abs, batch_count=911753.3333333334, ans=10.0 2023-10-02 14:27:51,384 WARNING [train.py:1204] (2/4) Exclude cut with ID _7606_210_4915_1_1528894253277_5080690_292-271285-0_sp1.1 from training. Duration: 0.963625 2023-10-02 14:27:51,394 WARNING [train.py:1204] (2/4) Exclude cut with ID _3541_210_2477_1_1526475408709_3263973_377-296839-0_sp0.9 from training. Duration: 0.611125 2023-10-02 14:27:51,397 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_45886_210_17024_1_1533704917_2435333_58-131069-0_sp1.1 from training. Duration: 0.963625 2023-10-02 14:27:56,032 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6961_210_2087_1_1527937168_6815011_395-185060-0 from training. Duration: 0.95 2023-10-02 14:27:57,476 WARNING [train.py:1204] (2/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_304-266479-0_sp1.1 from training. Duration: 0.97275 2023-10-02 14:27:57,536 WARNING [train.py:1204] (2/4) Exclude cut with ID _3867_210_1883_1_1526641200575_4475830_319-349850-0_sp1.1 from training. Duration: 0.6090625 2023-10-02 14:27:57,544 WARNING [train.py:1204] (2/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_304-59227-0_sp0.9 from training. Duration: 0.8555625 2023-10-02 14:27:57,597 WARNING [train.py:1204] (2/4) Exclude cut with ID _8021_210_7994_1_1528376451358_3653069_100-140231-0_sp0.9 from training. Duration: 0.7444375 2023-10-02 14:27:58,069 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.5.encoder.layers.0.nonlin_attention.whiten2, num_groups=1, num_channels=256, metric=3.46 vs. limit=15.0 2023-10-02 14:27:58,929 WARNING [train.py:1204] (2/4) Exclude cut with ID _8833_210_5219_1_1528592665210_7553059_329-125156-0_sp0.9 from training. Duration: 0.911125 2023-10-02 14:28:05,066 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.ff2_skip_rate, batch_count=911820.0, ans=0.0 2023-10-02 14:28:05,621 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.2.encoder.layers.1.conv_module1.whiten, num_groups=1, num_channels=384, metric=6.73 vs. limit=15.0 2023-10-02 14:28:09,428 WARNING [train.py:1204] (2/4) Exclude cut with ID _14459_210_2228_1_1531443641946_7516700_792-287234-0_sp1.1 from training. Duration: 0.92725 2023-10-02 14:28:10,727 WARNING [train.py:1204] (2/4) Exclude cut with ID _19708_210_9206_1_1532136650733_4238364_347-65240-0_sp1.1 from training. Duration: 0.8818125 2023-10-02 14:28:14,028 WARNING [train.py:1204] (2/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_70-27732-0 from training. Duration: 0.94 2023-10-02 14:28:19,494 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_397-132565-0 from training. Duration: 0.99 2023-10-02 14:28:19,497 WARNING [train.py:1204] (2/4) Exclude cut with ID _49728_210_12651_1_1534041034294_6788150_624-195301-0 from training. Duration: 0.85 2023-10-02 14:28:20,834 WARNING [train.py:1204] (2/4) Exclude cut with ID _55429_210_10626_1_1534467588527_7174143_377-169391-0_sp1.1 from training. Duration: 0.87275 2023-10-02 14:28:22,661 WARNING [train.py:1204] (2/4) Exclude cut with ID _49376_210_5986_1_1533985278732_3370740_354-344695-0_sp1.1 from training. Duration: 0.9 2023-10-02 14:28:29,622 WARNING [train.py:1204] (2/4) Exclude cut with ID _4697_210_2069_1_1526815801953_3823098_276-96593-0_sp1.1 from training. Duration: 0.7 2023-10-02 14:28:29,638 WARNING [train.py:1204] (2/4) Exclude cut with ID _15012_210_1968_1_1530856820429_5095350_688-178849-0_sp0.9 from training. Duration: 0.711125 2023-10-02 14:28:29,671 WARNING [train.py:1204] (2/4) Exclude cut with ID _25565_210_12392_1_1532423009382_3952098_372-69023-0_sp1.1 from training. Duration: 0.936375 2023-10-02 14:28:30,927 WARNING [train.py:1204] (2/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_61-327062-0_sp1.1 from training. Duration: 0.736375 2023-10-02 14:28:30,972 WARNING [train.py:1204] (2/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_215-141059-0 from training. Duration: 0.62 2023-10-02 14:28:35,885 WARNING [train.py:1204] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_6-226750-0_sp1.1 from training. Duration: 0.8545625 2023-10-02 14:28:37,228 WARNING [train.py:1204] (2/4) Exclude cut with ID _14348_210_8341_1_1530599434996_3778640_240-232370-0_sp1.1 from training. Duration: 0.763625 2023-10-02 14:28:41,178 WARNING [train.py:1204] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_468-189058-0 from training. Duration: 0.96 2023-10-02 14:28:48,304 INFO [train.py:1046] (2/4) Epoch 26, batch 4000, loss[loss=0.1713, simple_loss=0.2397, pruned_loss=0.05146, over 23848.00 frames. ], tot_loss[loss=0.1661, simple_loss=0.2432, pruned_loss=0.04452, over 4713791.16 frames. ], batch size: 212, lr: 3.92e-03, grad_scale: 16.0 2023-10-02 14:28:48,427 WARNING [train.py:1204] (2/4) Exclude cut with ID _21987_210_5854_1_1531821500482_3637038_247-93340-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 14:28:54,530 WARNING [train.py:1204] (2/4) Exclude cut with ID _66868_210_9527_1_1535509287954_4561140_436-191602-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 14:28:56,283 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=912020.0, ans=0.1 2023-10-02 14:28:58,767 WARNING [train.py:1204] (2/4) Exclude cut with ID _18895_210_6228_1_1531357274234_3784930_394-58203-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 14:28:59,075 INFO [scaling.py:1118] (2/4) WithLoss: name=encoder.encoders.5.encoder.layers.1.self_attn_weights, loss-sum=0.000e+00 2023-10-02 14:29:00,088 WARNING [train.py:1204] (2/4) Exclude cut with ID _58605_210_12210_1_1534723195939_3904259_258-281498-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 14:29:00,123 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_33348_210_5632_1_1533086912_3324030_245-292752-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 14:29:00,140 WARNING [train.py:1204] (2/4) Exclude cut with ID _67831_210_12392_1_1535612464202_3092612_346-2148-0 from training. Duration: 0.93 2023-10-02 14:29:00,196 WARNING [train.py:1204] (2/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_369-222060-0_sp0.9 from training. Duration: 0.788875 2023-10-02 14:29:01,597 WARNING [train.py:1204] (2/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_245-174553-0 from training. Duration: 0.88 2023-10-02 14:29:01,603 WARNING [train.py:1204] (2/4) Exclude cut with ID _3541_210_2477_1_1526475408709_3263973_231-104736-0_sp1.1 from training. Duration: 0.7090625 2023-10-02 14:29:01,609 WARNING [train.py:1204] (2/4) Exclude cut with ID _44585_210_18628_1_1533605350387_4518199_280-73534-0 from training. Duration: 0.89 2023-10-02 14:29:03,504 WARNING [train.py:1204] (2/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_22-201476-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 14:29:06,203 WARNING [train.py:1204] (2/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_325-62802-0_sp0.9 from training. Duration: 0.9666875 2023-10-02 14:29:06,221 WARNING [train.py:1204] (2/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_199-274062-0_sp1.1 from training. Duration: 0.97275 2023-10-02 14:29:08,007 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_8-319957-0_sp1.1 from training. Duration: 0.87275 2023-10-02 14:29:08,037 WARNING [train.py:1204] (2/4) Exclude cut with ID _29343_210_4381_1_1532602757752_3674109_224-349726-0_sp1.1 from training. Duration: 0.936375 2023-10-02 14:29:08,042 WARNING [train.py:1204] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_520-126729-0_sp1.1 from training. Duration: 0.636375 2023-10-02 14:29:09,955 WARNING [train.py:1204] (2/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_239-22076-0_sp1.1 from training. Duration: 0.77275 2023-10-02 14:29:10,046 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9012W0011-501146 from training. Duration: 0.896 2023-10-02 14:29:11,365 WARNING [train.py:1204] (2/4) Exclude cut with ID _29857_210_20034_1_1532395886753_3798160_279-185801-0_sp1.1 from training. Duration: 0.82725 2023-10-02 14:29:12,710 WARNING [train.py:1204] (2/4) Exclude cut with ID _43552_210_8913_1_1533726031059_3523429_261-223933-0_sp1.1 from training. Duration: 0.92725 2023-10-02 14:29:15,579 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9029W0025-509426 from training. Duration: 0.896 2023-10-02 14:29:16,868 WARNING [train.py:1204] (2/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_337-181502-0_sp1.1 from training. Duration: 0.5909375 2023-10-02 14:29:16,889 WARNING [train.py:1204] (2/4) Exclude cut with ID _68635_210_9317_1_1535770813410_4315870_159-332599-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 14:29:18,593 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.self_attn_weights.pos_emb_skip_rate, batch_count=912153.3333333334, ans=0.0 2023-10-02 14:29:25,612 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_161-276996-0 from training. Duration: 0.95 2023-10-02 14:29:26,957 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7088_210_1874_1_1527939776_1101113_127-68870-0_sp1.1 from training. Duration: 0.936375 2023-10-02 14:29:28,384 WARNING [train.py:1204] (2/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_322-217220-0_sp1.1 from training. Duration: 0.7818125 2023-10-02 14:29:28,438 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9072W0002-530636 from training. Duration: 0.768 2023-10-02 14:29:29,787 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_340-336408-0_sp1.1 from training. Duration: 0.8454375 2023-10-02 14:29:29,842 WARNING [train.py:1204] (2/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_395-37222-0 from training. Duration: 0.76 2023-10-02 14:29:29,853 WARNING [train.py:1204] (2/4) Exclude cut with ID _64099_210_17301_1_1535526977656_1578188_159-209156-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 14:29:31,165 WARNING [train.py:1204] (2/4) Exclude cut with ID _53950_210_5816_1_1534417264919_3862951_28-29681-0_sp1.1 from training. Duration: 0.92725 2023-10-02 14:29:32,536 WARNING [train.py:1204] (2/4) Exclude cut with ID _4681_210_3525_1_1527250689464_120889_15-126760-0_sp1.1 from training. Duration: 0.736375 2023-10-02 14:29:33,872 WARNING [train.py:1204] (2/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_242-3472-0_sp1.1 from training. Duration: 0.663625 2023-10-02 14:29:33,920 WARNING [train.py:1204] (2/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_436-186311-0_sp0.9 from training. Duration: 0.72225 2023-10-02 14:29:33,942 WARNING [train.py:1204] (2/4) Exclude cut with ID _3675_210_4557_1_1526639934408_4182089_452-172147-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 14:29:34,088 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.conv_module1.balancer1.prob, batch_count=912220.0, ans=0.125 2023-10-02 14:29:35,268 WARNING [train.py:1204] (2/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_80-147301-0 from training. Duration: 0.84 2023-10-02 14:29:37,228 WARNING [train.py:1204] (2/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_351-107588-0_sp1.1 from training. Duration: 0.92725 2023-10-02 14:29:38,657 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9111W0002-550781 from training. Duration: 0.896 2023-10-02 14:29:38,815 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=912220.0, ans=0.1 2023-10-02 14:29:42,653 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=912220.0, ans=0.1 2023-10-02 14:29:45,232 WARNING [train.py:1204] (2/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_465-156500-0_sp1.1 from training. Duration: 0.6545625 2023-10-02 14:29:46,607 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.571e+02 1.860e+02 2.015e+02 2.242e+02 2.909e+02, threshold=4.030e+02, percent-clipped=0.0 2023-10-02 14:29:48,201 WARNING [train.py:1204] (2/4) Exclude cut with ID _13938_210_9988_1_1530518718978_3693960_304-112784-0_sp1.1 from training. Duration: 0.4181875 2023-10-02 14:29:48,493 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.balancer2.prob, batch_count=912286.6666666666, ans=0.125 2023-10-02 14:29:49,688 WARNING [train.py:1204] (2/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_202-67017-0_sp1.1 from training. Duration: 0.7454375 2023-10-02 14:29:49,727 WARNING [train.py:1204] (2/4) Exclude cut with ID _58414_210_8254_1_1535421609162_7151182_448-4358-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 14:29:49,786 WARNING [train.py:1204] (2/4) Exclude cut with ID _66840_210_5219_1_1535452168426_4196349_203-243234-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 14:29:51,148 WARNING [train.py:1204] (2/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_201-213513-0_sp1.1 from training. Duration: 0.97275 2023-10-02 14:29:55,791 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6291_210_3273_1_1527683649_1078498_117-187560-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 14:29:56,036 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.bypass.scale_min, batch_count=912286.6666666666, ans=0.2 2023-10-02 14:29:58,452 WARNING [train.py:1204] (2/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_135-174080-0_sp0.9 from training. Duration: 0.588875 2023-10-02 14:29:58,502 WARNING [train.py:1204] (2/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_45-265684-0 from training. Duration: 0.72 2023-10-02 14:29:58,777 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.feed_forward1.hidden_balancer.prob, batch_count=912286.6666666666, ans=0.125 2023-10-02 14:29:59,959 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_435-313305-0_sp1.1 from training. Duration: 0.6454375 2023-10-02 14:29:59,986 WARNING [train.py:1204] (2/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_235-307337-0_sp1.1 from training. Duration: 0.8545625 2023-10-02 14:30:02,646 INFO [train.py:1046] (2/4) Epoch 26, batch 4050, loss[loss=0.15, simple_loss=0.2349, pruned_loss=0.03255, over 24474.00 frames. ], tot_loss[loss=0.1662, simple_loss=0.2435, pruned_loss=0.04447, over 4703555.33 frames. ], batch size: 66, lr: 3.92e-03, grad_scale: 16.0 2023-10-02 14:30:02,701 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7762_210_1794_1_1528199508_4057408_419-125009-0_sp0.9 from training. Duration: 0.92225 2023-10-02 14:30:04,117 WARNING [train.py:1204] (2/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_215-141059-0_sp1.1 from training. Duration: 0.563625 2023-10-02 14:30:04,208 WARNING [train.py:1204] (2/4) Exclude cut with ID _27013_210_1389_1_1532437173240_3252649_222-125573-0_sp1.1 from training. Duration: 0.8909375 2023-10-02 14:30:08,894 WARNING [train.py:1204] (2/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_126-274694-0_sp1.1 from training. Duration: 0.8909375 2023-10-02 14:30:11,056 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_2592_210_2290_1_1525697025_4882925_157-70512-0_sp1.1 from training. Duration: 0.82725 2023-10-02 14:30:11,113 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_292-265928-0_sp0.9 from training. Duration: 0.5666875 2023-10-02 14:30:12,521 WARNING [train.py:1204] (2/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_1211-226066-0_sp1.1 from training. Duration: 0.6909375 2023-10-02 14:30:12,568 WARNING [train.py:1204] (2/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_394-265747-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 14:30:16,013 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.self_attn_weights.pos_emb_skip_rate, batch_count=912353.3333333334, ans=0.0 2023-10-02 14:30:16,021 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.balancer1.prob, batch_count=912353.3333333334, ans=0.125 2023-10-02 14:30:17,120 WARNING [train.py:1204] (2/4) Exclude cut with ID _48782_210_12658_1_1534467701544_2300422_239-67139-0_sp1.1 from training. Duration: 0.97275 2023-10-02 14:30:19,761 WARNING [train.py:1204] (2/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_329-113173-0_sp1.1 from training. Duration: 0.563625 2023-10-02 14:30:21,320 WARNING [train.py:1204] (2/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_81-75849-0_sp0.9 from training. Duration: 0.4333125 2023-10-02 14:30:22,767 WARNING [train.py:1204] (2/4) Exclude cut with ID _9235_210_5219_1_1528891154253_8046120_1003-101638-0 from training. Duration: 0.73 2023-10-02 14:30:22,798 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9309W0189-640316 from training. Duration: 0.64 2023-10-02 14:30:26,383 WARNING [train.py:1204] (2/4) Exclude cut with ID _45329_210_7005_1_1534157951211_6774339_503-287883-0_sp1.1 from training. Duration: 0.8 2023-10-02 14:30:31,095 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.feed_forward1.out_whiten.whitening_limit, batch_count=912420.0, ans=15.0 2023-10-02 14:30:31,833 WARNING [train.py:1204] (2/4) Exclude cut with ID _8833_210_5219_1_1528592665210_7553059_611-80313-0 from training. Duration: 0.64 2023-10-02 14:30:33,244 WARNING [train.py:1204] (2/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_335-106626-0_sp1.1 from training. Duration: 0.963625 2023-10-02 14:30:35,316 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.feed_forward1.out_proj.dropout_p, batch_count=912486.6666666666, ans=0.1 2023-10-02 14:30:36,449 WARNING [train.py:1204] (2/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_237-44332-0_sp1.1 from training. Duration: 0.8545625 2023-10-02 14:30:39,334 WARNING [train.py:1204] (2/4) Exclude cut with ID _46952_210_5986_1_1534230010357_3107379_178-147341-0_sp1.1 from training. Duration: 0.97275 2023-10-02 14:30:39,368 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_13091_210_7925_1_1530609447_8987354_946-154426-0_sp1.1 from training. Duration: 0.936375 2023-10-02 14:30:39,390 WARNING [train.py:1204] (2/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_326-17061-0_sp1.1 from training. Duration: 0.8545625 2023-10-02 14:30:43,317 WARNING [train.py:1204] (2/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_38-169920-0_sp1.1 from training. Duration: 0.82725 2023-10-02 14:30:45,196 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.2.encoder.layers.0.nonlin_attention.whiten1, num_groups=1, num_channels=288, metric=8.56 vs. limit=10.0 2023-10-02 14:30:47,080 WARNING [train.py:1204] (2/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_283-220372-0 from training. Duration: 0.56 2023-10-02 14:30:47,087 WARNING [train.py:1204] (2/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_252-216674-0_sp1.1 from training. Duration: 0.4909375 2023-10-02 14:30:47,252 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.1.feed_forward3.hidden_balancer.prob, batch_count=912553.3333333334, ans=0.125 2023-10-02 14:30:48,530 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_202-91290-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 14:30:50,021 WARNING [train.py:1204] (2/4) Exclude cut with ID _41314_210_12651_1_1533459476543_3766460_80-74184-0 from training. Duration: 0.83 2023-10-02 14:30:55,586 WARNING [train.py:1204] (2/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_411-271358-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 14:31:00,485 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.feed_forward1.hidden_balancer.prob, batch_count=912553.3333333334, ans=0.125 2023-10-02 14:31:01,693 WARNING [train.py:1204] (2/4) Exclude cut with ID _68777_210_4353_1_1535810390731_3117904_129-166012-0 from training. Duration: 0.85 2023-10-02 14:31:01,772 WARNING [train.py:1204] (2/4) Exclude cut with ID _44982_210_1511_1_1534312820108_3726029_410-109759-0_sp1.1 from training. Duration: 0.963625 2023-10-02 14:31:01,776 WARNING [train.py:1204] (2/4) Exclude cut with ID _14224_210_4381_1_1530529062037_4264010_364-22817-0_sp1.1 from training. Duration: 0.6454375 2023-10-02 14:31:04,579 WARNING [train.py:1204] (2/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_89-242335-0 from training. Duration: 0.95 2023-10-02 14:31:04,587 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_12868_210_6753_1_1530702479_3610564_151-325626-0 from training. Duration: 0.97 2023-10-02 14:31:04,588 WARNING [train.py:1204] (2/4) Exclude cut with ID _6275_210_2084_1_1528110062213_7669049_786-264332-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 14:31:05,992 WARNING [train.py:1204] (2/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_35-153886-0_sp1.1 from training. Duration: 0.763625 2023-10-02 14:31:06,197 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.bypass_mid.scale_min, batch_count=912620.0, ans=0.2 2023-10-02 14:31:09,359 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_24007_210_1794_1_1531918925_1663875_116-100916-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 14:31:09,382 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_162-262717-0_sp1.1 from training. Duration: 0.7545625 2023-10-02 14:31:16,807 WARNING [train.py:1204] (2/4) Exclude cut with ID _8263_210_1664_1_1528438953494_294100_40-204731-0 from training. Duration: 0.5 2023-10-02 14:31:18,135 INFO [train.py:1046] (2/4) Epoch 26, batch 4100, loss[loss=0.1917, simple_loss=0.27, pruned_loss=0.05668, over 23279.00 frames. ], tot_loss[loss=0.168, simple_loss=0.2454, pruned_loss=0.04528, over 4712310.04 frames. ], batch size: 105, lr: 3.92e-03, grad_scale: 16.0 2023-10-02 14:31:18,245 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_12868_210_6753_1_1530702479_3610564_416-293746-0 from training. Duration: 0.9 2023-10-02 14:31:19,653 WARNING [train.py:1204] (2/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_605-219732-0 from training. Duration: 0.94 2023-10-02 14:31:21,014 WARNING [train.py:1204] (2/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_454-123002-0 from training. Duration: 0.69 2023-10-02 14:31:21,020 WARNING [train.py:1204] (2/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_25-304273-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 14:31:21,079 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6860_210_4852_1_1527917457_3833327_451-39702-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 14:31:22,326 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_304-16505-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 14:31:22,338 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6291_210_3273_1_1527683649_1078498_4-266348-0_sp1.1 from training. Duration: 0.7090625 2023-10-02 14:31:22,415 WARNING [train.py:1204] (2/4) Exclude cut with ID ID0149W0001-753701 from training. Duration: 0.989 2023-10-02 14:31:23,935 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_41251_210_15368_1_1533429767_4820695_394-55393-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 14:31:25,284 WARNING [train.py:1204] (2/4) Exclude cut with ID _8793_210_6310_1_1528542985139_2310009_121-179782-0_sp1.1 from training. Duration: 0.72725 2023-10-02 14:31:25,305 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_2729_210_3528_1_1525863505_3882214_421-73047-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 14:31:26,589 WARNING [train.py:1204] (2/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_278-154492-0_sp0.9 from training. Duration: 0.8444375 2023-10-02 14:31:29,107 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.feed_forward3.out_whiten.whitening_limit, batch_count=912686.6666666666, ans=15.0 2023-10-02 14:31:32,673 WARNING [train.py:1204] (2/4) Exclude cut with ID _14170_210_5219_1_1530525949868_4469650_412-317077-0_sp0.9 from training. Duration: 0.8333125 2023-10-02 14:31:32,753 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_727-138317-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 14:31:34,137 WARNING [train.py:1204] (2/4) Exclude cut with ID _25808_210_13292_1_1532343514562_7366109_345-335048-0_sp1.1 from training. Duration: 0.92725 2023-10-02 14:31:34,151 WARNING [train.py:1204] (2/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_506-23200-0 from training. Duration: 0.97 2023-10-02 14:31:34,231 WARNING [train.py:1204] (2/4) Exclude cut with ID _44718_210_5816_1_1534050051869_6469030_643-245187-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 14:31:34,235 WARNING [train.py:1204] (2/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_42-186162-0_sp1.1 from training. Duration: 0.836375 2023-10-02 14:31:34,254 WARNING [train.py:1204] (2/4) Exclude cut with ID _3231_210_2106_1_1526205864934_6784220_171-256004-0_sp1.1 from training. Duration: 0.7818125 2023-10-02 14:31:36,033 WARNING [train.py:1204] (2/4) Exclude cut with ID _25914_210_6400_1_1532141317058_644740_43-248478-0_sp1.1 from training. Duration: 0.936375 2023-10-02 14:31:37,451 WARNING [train.py:1204] (2/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_577-287150-0 from training. Duration: 0.82 2023-10-02 14:31:41,594 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_46015_210_7234_1_1533863319_3087837_376-276068-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 14:31:42,957 WARNING [train.py:1204] (2/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_51-200220-0 from training. Duration: 0.56 2023-10-02 14:31:44,376 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_452-96101-0_sp1.1 from training. Duration: 0.963625 2023-10-02 14:31:47,673 WARNING [train.py:1204] (2/4) Exclude cut with ID _13252_210_1925_1_1530671372047_3336909_263-168388-0_sp1.1 from training. Duration: 0.7818125 2023-10-02 14:31:47,673 WARNING [train.py:1204] (2/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_469-94673-0 from training. Duration: 0.79 2023-10-02 14:31:48,413 WARNING [train.py:1204] (2/4) Exclude cut with ID _14922_210_6228_1_1530707435273_3693310_303-228738-0_sp1.1 from training. Duration: 0.763625 2023-10-02 14:31:49,671 WARNING [train.py:1204] (2/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_262-250826-0_sp1.1 from training. Duration: 0.9 2023-10-02 14:31:49,700 WARNING [train.py:1204] (2/4) Exclude cut with ID _68777_210_4353_1_1535810390731_3117904_129-166012-0_sp1.1 from training. Duration: 0.77275 2023-10-02 14:31:51,179 WARNING [train.py:1204] (2/4) Exclude cut with ID _58895_210_8750_1_1535184081320_3649089_265-323810-0 from training. Duration: 0.96 2023-10-02 14:31:52,700 WARNING [train.py:1204] (2/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_120-76843-0_sp0.9 from training. Duration: 0.611125 2023-10-02 14:31:52,912 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.nonlin_attention.balancer.prob, batch_count=912820.0, ans=0.125 2023-10-02 14:31:54,038 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7544_210_6753_1_1528587722_3576686_295-258479-0_sp0.9 from training. Duration: 0.9333125 2023-10-02 14:31:56,812 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7046_210_6753_1_1527895337_6920917_783-278109-0 from training. Duration: 0.77 2023-10-02 14:31:56,849 WARNING [train.py:1204] (2/4) Exclude cut with ID _42326_210_7770_1_1533814282531_3707269_113-308323-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 14:31:56,888 WARNING [train.py:1204] (2/4) Exclude cut with ID _7852_210_5219_1_1528286471316_8139400_242-105336-0_sp0.9 from training. Duration: 0.8 2023-10-02 14:31:59,671 WARNING [train.py:1204] (2/4) Exclude cut with ID _56235_210_10640_1_1534557335235_4161099_114-277905-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 14:32:07,075 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_13118_210_7925_1_1530755612_7608297_42-66683-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 14:32:08,674 WARNING [train.py:1204] (2/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_901-79438-0_sp1.1 from training. Duration: 0.8545625 2023-10-02 14:32:08,719 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_30280_210_1881_1_1532872779_3660139_211-182194-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 14:32:11,062 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.5.encoder.layers.1.self_attn1.whiten, num_groups=1, num_channels=256, metric=10.21 vs. limit=22.5 2023-10-02 14:32:16,360 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.437e+02 1.852e+02 2.033e+02 2.251e+02 3.212e+02, threshold=4.067e+02, percent-clipped=0.0 2023-10-02 14:32:17,774 WARNING [train.py:1204] (2/4) Exclude cut with ID _14898_210_9206_1_1530766812377_4177190_621-45732-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 14:32:17,782 WARNING [train.py:1204] (2/4) Exclude cut with ID _35350_210_16040_1_1533040191183_4075299_224-133775-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 14:32:18,074 INFO [scaling.py:1118] (2/4) WithLoss: name=encoder.encoders.1.encoder.layers.1.self_attn_weights, loss-sum=0.000e+00 2023-10-02 14:32:21,258 WARNING [train.py:1204] (2/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_147-293311-0_sp1.1 from training. Duration: 0.8545625 2023-10-02 14:32:23,996 WARNING [train.py:1204] (2/4) Exclude cut with ID _12302_210_5219_1_1529924446466_8107280_835-279754-0_sp1.1 from training. Duration: 0.8181875 2023-10-02 14:32:26,866 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8453_210_4211_1_1528502940_8562730_347-330673-0_sp0.9 from training. Duration: 0.8 2023-10-02 14:32:28,219 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_213-103282-0_sp0.9 from training. Duration: 0.8666875 2023-10-02 14:32:28,341 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.1.balancer2.prob, batch_count=912953.3333333334, ans=0.125 2023-10-02 14:32:29,571 WARNING [train.py:1204] (2/4) Exclude cut with ID _49728_210_12651_1_1534041034294_6788150_66-43496-0_sp1.1 from training. Duration: 0.97275 2023-10-02 14:32:29,575 WARNING [train.py:1204] (2/4) Exclude cut with ID _19023_210_4353_1_1531378892552_5820780_28-36615-0_sp1.1 from training. Duration: 0.8818125 2023-10-02 14:32:32,303 INFO [train.py:1046] (2/4) Epoch 26, batch 4150, loss[loss=0.1476, simple_loss=0.2261, pruned_loss=0.03454, over 24582.00 frames. ], tot_loss[loss=0.1673, simple_loss=0.2441, pruned_loss=0.04523, over 4705676.28 frames. ], batch size: 60, lr: 3.92e-03, grad_scale: 16.0 2023-10-02 14:32:32,371 WARNING [train.py:1204] (2/4) Exclude cut with ID _14349_210_8341_1_1530615710818_3442470_115-285975-0 from training. Duration: 0.86 2023-10-02 14:32:32,409 WARNING [train.py:1204] (2/4) Exclude cut with ID _12734_210_8341_1_1530183614296_3891929_236-47464-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 14:32:33,793 WARNING [train.py:1204] (2/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_177-236809-0 from training. Duration: 0.49 2023-10-02 14:32:33,881 WARNING [train.py:1204] (2/4) Exclude cut with ID _13678_210_7497_1_1530530069226_3463170_371-3221-0 from training. Duration: 0.9 2023-10-02 14:32:34,105 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.conv_module1.balancer2.prob, batch_count=913020.0, ans=0.125 2023-10-02 14:32:34,117 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.conv_module2.balancer1.prob, batch_count=913020.0, ans=0.125 2023-10-02 14:32:35,194 WARNING [train.py:1204] (2/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_48-338976-0 from training. Duration: 0.89 2023-10-02 14:32:36,954 WARNING [train.py:1204] (2/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_1167-309744-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 14:32:39,972 WARNING [train.py:1204] (2/4) Exclude cut with ID _30327_210_16049_1_1532484161295_3924169_401-70819-0_sp0.9 from training. Duration: 0.9555625 2023-10-02 14:32:39,987 WARNING [train.py:1204] (2/4) Exclude cut with ID _55134_210_21982_1_1534590028377_3882170_346-91022-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 14:32:43,371 WARNING [train.py:1204] (2/4) Exclude cut with ID _14459_210_2228_1_1531443641946_7516700_174-118801-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 14:32:45,375 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_269-278006-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 14:32:46,469 WARNING [train.py:1204] (2/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_390-339137-0_sp0.9 from training. Duration: 0.62225 2023-10-02 14:32:46,753 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.conv_skip_rate, batch_count=913086.6666666666, ans=0.0 2023-10-02 14:32:47,861 WARNING [train.py:1204] (2/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_253-308377-0_sp1.1 from training. Duration: 0.4454375 2023-10-02 14:32:47,872 WARNING [train.py:1204] (2/4) Exclude cut with ID _23825_210_13449_1_1531915313526_3497150_240-271367-0_sp1.1 from training. Duration: 0.8818125 2023-10-02 14:32:49,206 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1015-343884-0_sp0.9 from training. Duration: 0.688875 2023-10-02 14:32:54,046 WARNING [train.py:1204] (2/4) Exclude cut with ID _23514_210_1389_1_1531832139785_3031439_335-48210-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 14:32:55,550 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.bypass.scale_min, batch_count=913086.6666666666, ans=0.2 2023-10-02 14:32:59,414 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_309-65177-0_sp1.1 from training. Duration: 0.8 2023-10-02 14:33:00,829 WARNING [train.py:1204] (2/4) Exclude cut with ID _14368_210_4220_1_1530532714870_3595529_231-137892-0 from training. Duration: 0.99 2023-10-02 14:33:03,513 WARNING [train.py:1204] (2/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_57-270037-0 from training. Duration: 0.77 2023-10-02 14:33:03,518 WARNING [train.py:1204] (2/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_465-214726-0_sp1.1 from training. Duration: 0.7818125 2023-10-02 14:33:03,600 WARNING [train.py:1204] (2/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_106-300985-0 from training. Duration: 0.9 2023-10-02 14:33:03,604 WARNING [train.py:1204] (2/4) Exclude cut with ID _14632_210_9988_1_1530691279392_3923200_161-129530-0_sp1.1 from training. Duration: 0.87275 2023-10-02 14:33:04,862 WARNING [train.py:1204] (2/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_514-88370-0_sp1.1 from training. Duration: 0.963625 2023-10-02 14:33:07,588 WARNING [train.py:1204] (2/4) Exclude cut with ID _56495_210_4234_1_1534748080334_4136219_305-334505-0_sp1.1 from training. Duration: 0.92725 2023-10-02 14:33:07,660 WARNING [train.py:1204] (2/4) Exclude cut with ID _42875_210_14856_1_1533704397487_4012433_61-264914-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 14:33:14,289 WARNING [train.py:1204] (2/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_91-148831-0 from training. Duration: 0.99 2023-10-02 14:33:16,974 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_435-313305-0_sp0.9 from training. Duration: 0.788875 2023-10-02 14:33:18,923 WARNING [train.py:1204] (2/4) Exclude cut with ID _16744_210_5986_1_1531724405384_3907567_242-111316-0_sp1.1 from training. Duration: 0.8454375 2023-10-02 14:33:20,061 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_257-20733-0 from training. Duration: 0.76 2023-10-02 14:33:20,098 WARNING [train.py:1204] (2/4) Exclude cut with ID _8064_210_5399_1_1528372299291_4282973_4-91037-0_sp1.1 from training. Duration: 0.8 2023-10-02 14:33:21,499 WARNING [train.py:1204] (2/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_3-276701-0 from training. Duration: 0.91 2023-10-02 14:33:24,143 WARNING [train.py:1204] (2/4) Exclude cut with ID _3992_210_1747_1_1526644816031_3731060_36-147855-0_sp1.1 from training. Duration: 0.7090625 2023-10-02 14:33:24,226 WARNING [train.py:1204] (2/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_1296-298671-0_sp1.1 from training. Duration: 0.963625 2023-10-02 14:33:26,070 WARNING [train.py:1204] (2/4) Exclude cut with ID _68965_210_1883_1_1535698962272_3497490_309-334593-0_sp1.1 from training. Duration: 0.92725 2023-10-02 14:33:26,167 WARNING [train.py:1204] (2/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_128-296581-0 from training. Duration: 0.74 2023-10-02 14:33:26,167 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_17613_210_2151_1_1531137569_2366160_170-299079-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 14:33:26,169 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6287_210_3273_1_1527509506_2653716_182-199887-0_sp1.1 from training. Duration: 0.463625 2023-10-02 14:33:26,356 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.feed_forward3.hidden_balancer.prob, batch_count=913220.0, ans=0.125 2023-10-02 14:33:27,739 INFO [scaling.py:1118] (2/4) WithLoss: name=encoder.encoders.1.encoder.layers.1.self_attn_weights, loss-sum=0.000e+00 2023-10-02 14:33:28,780 WARNING [train.py:1204] (2/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_80-135200-0_sp1.1 from training. Duration: 0.7181875 2023-10-02 14:33:31,556 WARNING [train.py:1204] (2/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_34-183685-0 from training. Duration: 0.98 2023-10-02 14:33:31,579 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8278_210_2550_1_1528637099_4220413_418-210155-0_sp1.1 from training. Duration: 0.92725 2023-10-02 14:33:31,583 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8853_210_3273_1_1528633899_818968_51-187685-0_sp0.9 from training. Duration: 0.7666875 2023-10-02 14:33:32,807 WARNING [train.py:1204] (2/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_332-221057-0_sp1.1 from training. Duration: 0.6181875 2023-10-02 14:33:32,880 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_2221_210_1988_1_1525597051_4935541_415-296882-0 from training. Duration: 0.77 2023-10-02 14:33:32,914 WARNING [train.py:1204] (2/4) Exclude cut with ID _33640_210_2655_1_1533117605479_4855469_770-278185-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 14:33:34,260 WARNING [train.py:1204] (2/4) Exclude cut with ID _5287_210_2084_1_1527418883040_3878670_333-48508-0_sp1.1 from training. Duration: 0.5454375 2023-10-02 14:33:34,323 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_43802_210_2290_1_1533516203_8829504_142-276839-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 14:33:37,142 WARNING [train.py:1204] (2/4) Exclude cut with ID _28394_210_8913_1_1532430049518_3650118_126-139371-0_sp1.1 from training. Duration: 0.92725 2023-10-02 14:33:37,155 WARNING [train.py:1204] (2/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_76-203867-0 from training. Duration: 0.72 2023-10-02 14:33:37,199 WARNING [train.py:1204] (2/4) Exclude cut with ID _6194_210_1534_1_1527511826160_3941194_14-282876-0_sp0.9 from training. Duration: 0.788875 2023-10-02 14:33:42,182 WARNING [train.py:1204] (2/4) Exclude cut with ID _35355_210_16040_1_1533212988977_4016229_74-190076-0_sp1.1 from training. Duration: 0.72725 2023-10-02 14:33:44,118 WARNING [train.py:1204] (2/4) Exclude cut with ID _33920_210_4220_1_1533257851500_3084399_198-334063-0 from training. Duration: 0.94 2023-10-02 14:33:45,598 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6291_210_3273_1_1527683649_1078498_4-266348-0_sp0.9 from training. Duration: 0.8666875 2023-10-02 14:33:46,856 INFO [train.py:1046] (2/4) Epoch 26, batch 4200, loss[loss=0.149, simple_loss=0.1969, pruned_loss=0.05055, over 19276.00 frames. ], tot_loss[loss=0.1669, simple_loss=0.2435, pruned_loss=0.04513, over 4710168.39 frames. ], batch size: 388, lr: 3.92e-03, grad_scale: 16.0 2023-10-02 14:33:47,007 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_23856_210_4238_1_1531994785_3087934_122-38075-0_sp1.1 from training. Duration: 0.936375 2023-10-02 14:33:48,396 WARNING [train.py:1204] (2/4) Exclude cut with ID _4697_210_2069_1_1526815801953_3823098_276-96593-0_sp0.9 from training. Duration: 0.8555625 2023-10-02 14:33:50,240 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_308-278734-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 14:33:50,242 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_5190_210_4557_1_1527245078_2965646_353-250603-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 14:33:51,680 WARNING [train.py:1204] (2/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_472-216663-0 from training. Duration: 0.92 2023-10-02 14:33:51,805 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.1.balancer2.prob, batch_count=913353.3333333334, ans=0.125 2023-10-02 14:33:56,129 WARNING [train.py:1204] (2/4) Exclude cut with ID _7537_210_2084_1_1528502395431_3924359_14-312906-0 from training. Duration: 0.84 2023-10-02 14:33:57,496 WARNING [train.py:1204] (2/4) Exclude cut with ID _29003_210_19596_1_1532507391720_3894098_559-236660-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 14:33:58,933 WARNING [train.py:1204] (2/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_312-204798-0_sp1.1 from training. Duration: 0.8454375 2023-10-02 14:33:59,193 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.conv_module1.balancer1.prob, batch_count=913353.3333333334, ans=0.125 2023-10-02 14:34:00,359 WARNING [train.py:1204] (2/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_17-309070-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 14:34:02,030 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.conv_module1.balancer2.prob, batch_count=913420.0, ans=0.125 2023-10-02 14:34:03,201 WARNING [train.py:1204] (2/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_344-152526-0_sp0.9 from training. Duration: 0.588875 2023-10-02 14:34:04,590 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_41452_210_7291_1_1533437687_3941281_405-11559-0_sp1.1 from training. Duration: 0.82725 2023-10-02 14:34:05,825 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_48730_210_5399_1_1533885886_2212769_157-124052-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 14:34:05,864 WARNING [train.py:1204] (2/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_415-78792-0 from training. Duration: 0.81 2023-10-02 14:34:05,877 WARNING [train.py:1204] (2/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_345-340690-0_sp1.1 from training. Duration: 0.8454375 2023-10-02 14:34:07,347 WARNING [train.py:1204] (2/4) Exclude cut with ID _23825_210_13449_1_1531915313526_3497150_124-8138-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 14:34:07,395 WARNING [train.py:1204] (2/4) Exclude cut with ID _49258_210_7994_1_1534122017863_3738861_194-24864-0_sp1.1 from training. Duration: 0.936375 2023-10-02 14:34:07,414 WARNING [train.py:1204] (2/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_369-222060-0_sp1.1 from training. Duration: 0.6454375 2023-10-02 14:34:08,817 WARNING [train.py:1204] (2/4) Exclude cut with ID _12369_210_4381_1_1530356078581_4388520_480-13146-0_sp0.9 from training. Duration: 0.9444375 2023-10-02 14:34:12,005 WARNING [train.py:1204] (2/4) Exclude cut with ID _13253_210_1925_1_1530757775819_3577873_191-144977-0 from training. Duration: 0.78 2023-10-02 14:34:13,209 WARNING [train.py:1204] (2/4) Exclude cut with ID _30304_210_6319_1_1532476827261_7582060_1128-140266-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 14:34:17,821 WARNING [train.py:1204] (2/4) Exclude cut with ID _8772_210_1850_1_1528635595972_3664011_247-336448-0_sp0.9 from training. Duration: 0.82225 2023-10-02 14:34:19,245 WARNING [train.py:1204] (2/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_318-59097-0_sp1.1 from training. Duration: 0.6909375 2023-10-02 14:34:21,364 WARNING [train.py:1204] (2/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_540-131164-0_sp1.1 from training. Duration: 0.836375 2023-10-02 14:34:21,483 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.1.bypass.scale_min, batch_count=913486.6666666666, ans=0.2 2023-10-02 14:34:22,624 WARNING [train.py:1204] (2/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_39-110836-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 14:34:25,703 WARNING [train.py:1204] (2/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_860-96198-0_sp1.1 from training. Duration: 0.663625 2023-10-02 14:34:25,720 WARNING [train.py:1204] (2/4) Exclude cut with ID _15133_210_6228_1_1530794512074_3080379_29-264458-0 from training. Duration: 0.58 2023-10-02 14:34:25,738 WARNING [train.py:1204] (2/4) Exclude cut with ID _55209_210_1531_1_1534675828996_4597160_797-348343-0_sp1.1 from training. Duration: 0.963625 2023-10-02 14:34:27,063 WARNING [train.py:1204] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_23-189234-0_sp1.1 from training. Duration: 0.7545625 2023-10-02 14:34:27,706 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.3.encoder.layers.2.whiten, num_groups=1, num_channels=512, metric=4.98 vs. limit=12.0 2023-10-02 14:34:32,555 WARNING [train.py:1204] (2/4) Exclude cut with ID _5410_210_1388_1_1527397055795_1719020_39-112221-0_sp1.1 from training. Duration: 0.62725 2023-10-02 14:34:35,210 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_43802_210_2290_1_1533516203_8829504_100-348441-0_sp1.1 from training. Duration: 0.82725 2023-10-02 14:34:39,588 WARNING [train.py:1204] (2/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_143-86344-0_sp1.1 from training. Duration: 0.7 2023-10-02 14:34:42,388 WARNING [train.py:1204] (2/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_126-29104-0 from training. Duration: 0.85 2023-10-02 14:34:43,625 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.384e+02 1.882e+02 2.071e+02 2.346e+02 3.876e+02, threshold=4.142e+02, percent-clipped=0.0 2023-10-02 14:34:43,849 WARNING [train.py:1204] (2/4) Exclude cut with ID _39846_210_12651_1_1533603651992_1318030_138-31771-0_sp1.1 from training. Duration: 0.963625 2023-10-02 14:34:47,718 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.attention_skip_rate, batch_count=913620.0, ans=0.0 2023-10-02 14:34:50,399 WARNING [train.py:1204] (2/4) Exclude cut with ID _2220_210_1819_1_1525509109898_3658680_50-99837-0_sp1.1 from training. Duration: 0.4909375 2023-10-02 14:34:51,707 WARNING [train.py:1204] (2/4) Exclude cut with ID _6274_210_2084_1_1527937583837_7494020_836-210550-0_sp1.1 from training. Duration: 0.97275 2023-10-02 14:34:53,838 WARNING [train.py:1204] (2/4) Exclude cut with ID _28323_210_16187_1_1532480395667_4100300_306-74561-0 from training. Duration: 0.94 2023-10-02 14:34:58,187 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=913620.0, ans=0.1 2023-10-02 14:34:59,757 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_177-58005-0_sp1.1 from training. Duration: 0.563625 2023-10-02 14:35:01,174 INFO [train.py:1046] (2/4) Epoch 26, batch 4250, loss[loss=0.1535, simple_loss=0.2382, pruned_loss=0.03441, over 24636.00 frames. ], tot_loss[loss=0.1666, simple_loss=0.2432, pruned_loss=0.04504, over 4711317.60 frames. ], batch size: 65, lr: 3.92e-03, grad_scale: 16.0 2023-10-02 14:35:02,776 WARNING [train.py:1204] (2/4) Exclude cut with ID _54871_210_22356_1_1534665798253_3856359_19-342632-0_sp1.1 from training. Duration: 0.82725 2023-10-02 14:35:02,789 WARNING [train.py:1204] (2/4) Exclude cut with ID _35355_210_16040_1_1533212988977_4016229_74-190076-0_sp0.9 from training. Duration: 0.888875 2023-10-02 14:35:05,541 WARNING [train.py:1204] (2/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_285-97786-0_sp1.1 from training. Duration: 0.97275 2023-10-02 14:35:09,667 WARNING [train.py:1204] (2/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_337-181502-0_sp0.9 from training. Duration: 0.72225 2023-10-02 14:35:09,702 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_353-182855-0 from training. Duration: 0.96 2023-10-02 14:35:11,013 WARNING [train.py:1204] (2/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_409-327730-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 14:35:12,542 WARNING [train.py:1204] (2/4) Exclude cut with ID _8030_210_7497_1_1528444886781_3531089_373-19403-0_sp1.1 from training. Duration: 0.97275 2023-10-02 14:35:15,679 WARNING [train.py:1204] (2/4) Exclude cut with ID _6789_210_1534_1_1527857937218_3118950_231-340237-0_sp1.1 from training. Duration: 0.936375 2023-10-02 14:35:20,405 WARNING [train.py:1204] (2/4) Exclude cut with ID _6493_210_1747_1_1528459210424_4012470_536-57832-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 14:35:20,425 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7046_210_6753_1_1527895337_6920917_756-116002-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 14:35:21,846 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_37152_210_9697_1_1533106573_3844188_29-239070-0_sp1.1 from training. Duration: 0.8909375 2023-10-02 14:35:21,848 WARNING [train.py:1204] (2/4) Exclude cut with ID _19023_210_4353_1_1531378892552_5820780_61-334539-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 14:35:23,739 WARNING [train.py:1204] (2/4) Exclude cut with ID _14170_210_5219_1_1530525949868_4469650_159-28488-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 14:35:25,560 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_40896_210_15710_1_1533433956_7100802_764-71272-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 14:35:25,642 WARNING [train.py:1204] (2/4) Exclude cut with ID _44902_210_16187_1_1533888006062_3792078_258-178274-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 14:35:28,327 WARNING [train.py:1204] (2/4) Exclude cut with ID _7537_210_2084_1_1528502395431_3924359_14-312906-0_sp1.1 from training. Duration: 0.763625 2023-10-02 14:35:29,746 WARNING [train.py:1204] (2/4) Exclude cut with ID _49643_210_7989_1_1534640244224_3939031_674-116639-0_sp1.1 from training. Duration: 0.963625 2023-10-02 14:35:31,198 WARNING [train.py:1204] (2/4) Exclude cut with ID _22826_210_9994_1_1531908201522_3279169_415-161-0 from training. Duration: 0.98 2023-10-02 14:35:35,336 WARNING [train.py:1204] (2/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_264-348896-0 from training. Duration: 0.85 2023-10-02 14:35:36,649 WARNING [train.py:1204] (2/4) Exclude cut with ID _51899_210_20034_1_1534129254466_3669220_233-161492-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 14:35:36,709 WARNING [train.py:1204] (2/4) Exclude cut with ID _31255_210_13427_1_1532865619495_3815143_233-200023-0_sp1.1 from training. Duration: 0.936375 2023-10-02 14:35:36,724 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_23850_210_4238_1_1532232597_1632433_100-229879-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 14:35:36,812 WARNING [train.py:1204] (2/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_375-245677-0_sp0.9 from training. Duration: 0.9 2023-10-02 14:35:36,816 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_40223_210_6228_1_1533298404_4812267_355-317415-0_sp1.1 from training. Duration: 0.97275 2023-10-02 14:35:38,103 WARNING [train.py:1204] (2/4) Exclude cut with ID _9679_210_7497_1_1529129212906_4363929_235-81488-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 14:35:42,466 WARNING [train.py:1204] (2/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_172-215932-0_sp1.1 from training. Duration: 0.6 2023-10-02 14:35:42,523 WARNING [train.py:1204] (2/4) Exclude cut with ID _12369_210_4381_1_1530356078581_4388520_64-298917-0_sp1.1 from training. Duration: 0.8 2023-10-02 14:35:45,407 WARNING [train.py:1204] (2/4) Exclude cut with ID _38020_210_5986_1_1533112252259_7006089_131-53533-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 14:35:47,316 WARNING [train.py:1204] (2/4) Exclude cut with ID _42334_210_7770_1_1534206610396_3605880_392-48154-0_sp1.1 from training. Duration: 0.963625 2023-10-02 14:35:48,630 WARNING [train.py:1204] (2/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_337-181502-0 from training. Duration: 0.65 2023-10-02 14:35:48,637 WARNING [train.py:1204] (2/4) Exclude cut with ID _6267_210_2084_1_1527764658526_3894730_375-219887-0_sp0.9 from training. Duration: 0.7444375 2023-10-02 14:35:50,468 WARNING [train.py:1204] (2/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_60-146189-0 from training. Duration: 0.98 2023-10-02 14:35:51,798 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_3133_210_2087_1_1526106446_2324690_243-143119-0_sp1.1 from training. Duration: 0.7 2023-10-02 14:35:54,493 WARNING [train.py:1204] (2/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_1267-272693-0_sp0.9 from training. Duration: 0.811125 2023-10-02 14:35:56,345 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.4.encoder.layers.2.whiten, num_groups=1, num_channels=384, metric=3.58 vs. limit=12.0 2023-10-02 14:35:57,014 WARNING [train.py:1204] (2/4) Exclude cut with ID _69884_210_9771_1_1535846392818_4166169_83-182645-0_sp1.1 from training. Duration: 0.97275 2023-10-02 14:35:57,039 WARNING [train.py:1204] (2/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_403-292260-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 14:35:58,552 WARNING [train.py:1204] (2/4) Exclude cut with ID _55429_210_10626_1_1534467588527_7174143_377-169391-0 from training. Duration: 0.96 2023-10-02 14:35:59,918 WARNING [train.py:1204] (2/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_494-199538-0_sp1.1 from training. Duration: 0.6818125 2023-10-02 14:36:01,308 WARNING [train.py:1204] (2/4) Exclude cut with ID _3067_210_2667_1_1526212974640_4503168_119-175776-0_sp1.1 from training. Duration: 0.67275 2023-10-02 14:36:04,165 WARNING [train.py:1204] (2/4) Exclude cut with ID _3231_210_2106_1_1526205864934_6784220_183-303839-0_sp1.1 from training. Duration: 0.97275 2023-10-02 14:36:06,912 WARNING [train.py:1204] (2/4) Exclude cut with ID _18513_210_9206_1_1531618215223_3862620_165-38958-0_sp1.1 from training. Duration: 0.963625 2023-10-02 14:36:07,055 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.ff2_skip_rate, batch_count=913953.3333333334, ans=0.0 2023-10-02 14:36:08,226 WARNING [train.py:1204] (2/4) Exclude cut with ID _41314_210_12651_1_1533459476543_3766460_87-272068-0_sp1.1 from training. Duration: 0.7545625 2023-10-02 14:36:09,659 WARNING [train.py:1204] (2/4) Exclude cut with ID _3541_210_2477_1_1526475408709_3263973_353-95-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 14:36:09,915 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=913953.3333333334, ans=0.1 2023-10-02 14:36:11,029 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_2009_210_2071_1_1525481134_4588937_73-302813-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 14:36:11,291 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.nonlin_attention.balancer.prob, batch_count=913953.3333333334, ans=0.125 2023-10-02 14:36:12,563 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.conv_module2.balancer1.min_positive, batch_count=913953.3333333334, ans=0.025 2023-10-02 14:36:13,040 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.2.encoder.layers.0.feed_forward2.out_whiten, num_groups=1, num_channels=384, metric=6.14 vs. limit=15.0 2023-10-02 14:36:13,629 WARNING [train.py:1204] (2/4) Exclude cut with ID _64220_210_9527_1_1535250151772_4875259_88-95011-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 14:36:13,702 WARNING [train.py:1204] (2/4) Exclude cut with ID _44902_210_16187_1_1533888006062_3792078_160-12107-0_sp1.1 from training. Duration: 0.92725 2023-10-02 14:36:13,709 WARNING [train.py:1204] (2/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_547-5424-0 from training. Duration: 0.94 2023-10-02 14:36:15,018 INFO [train.py:1046] (2/4) Epoch 26, batch 4300, loss[loss=0.1745, simple_loss=0.2499, pruned_loss=0.04956, over 23536.00 frames. ], tot_loss[loss=0.1668, simple_loss=0.2436, pruned_loss=0.04503, over 4715472.07 frames. ], batch size: 256, lr: 3.92e-03, grad_scale: 16.0 2023-10-02 14:36:15,158 WARNING [train.py:1204] (2/4) Exclude cut with ID _21783_210_11357_1_1531742458730_3762679_268-85656-0_sp1.1 from training. Duration: 0.936375 2023-10-02 14:36:21,115 WARNING [train.py:1204] (2/4) Exclude cut with ID _66210_210_12651_1_1535446812922_3365850_42-284985-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 14:36:21,146 WARNING [train.py:1204] (2/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_465-214726-0_sp0.9 from training. Duration: 0.9555625 2023-10-02 14:36:23,271 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.feed_forward2.hidden_balancer.prob, batch_count=914020.0, ans=0.125 2023-10-02 14:36:25,191 WARNING [train.py:1204] (2/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_50-95770-0_sp1.1 from training. Duration: 0.936375 2023-10-02 14:36:32,268 WARNING [train.py:1204] (2/4) Exclude cut with ID _30800_210_22382_1_1532499347904_4515959_269-77567-0_sp1.1 from training. Duration: 0.963625 2023-10-02 14:36:32,270 WARNING [train.py:1204] (2/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_7-111954-0 from training. Duration: 0.56 2023-10-02 14:36:33,615 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_43802_210_2290_1_1533516203_8829504_85-195240-0_sp0.9 from training. Duration: 0.9666875 2023-10-02 14:36:33,882 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.conv_module1.balancer1.prob, batch_count=914086.6666666666, ans=0.125 2023-10-02 14:36:35,041 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_2822_210_2715_1_1526112586_1999587_165-36656-0_sp0.9 from training. Duration: 0.988875 2023-10-02 14:36:36,329 WARNING [train.py:1204] (2/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_181-78632-0_sp1.1 from training. Duration: 0.7090625 2023-10-02 14:36:36,353 WARNING [train.py:1204] (2/4) Exclude cut with ID IC0671W0044-331887 from training. Duration: 0.896 2023-10-02 14:36:40,448 WARNING [train.py:1204] (2/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_266-137104-0_sp1.1 from training. Duration: 0.5909375 2023-10-02 14:36:41,859 WARNING [train.py:1204] (2/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_137-116442-0_sp0.9 from training. Duration: 0.8666875 2023-10-02 14:36:44,647 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_23801_210_16223_1_1532066392_3294657_570-5439-0 from training. Duration: 0.97 2023-10-02 14:36:44,667 WARNING [train.py:1204] (2/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_29-280622-0_sp1.1 from training. Duration: 0.7454375 2023-10-02 14:36:44,688 WARNING [train.py:1204] (2/4) Exclude cut with ID 3557-8342-0013-54691-0 from training. Duration: 0.92 2023-10-02 14:36:47,395 WARNING [train.py:1204] (2/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_711-15688-0_sp1.1 from training. Duration: 0.3909375 2023-10-02 14:36:48,939 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_15126_210_6753_1_1530778677_2591185_120-277707-0_sp1.1 from training. Duration: 0.72725 2023-10-02 14:36:49,150 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.feed_forward1.out_proj.dropout_p, batch_count=914153.3333333334, ans=0.1 2023-10-02 14:36:52,316 WARNING [train.py:1204] (2/4) Exclude cut with ID _14970_210_15709_1_1530775547361_1966965_8-169924-0_sp1.1 from training. Duration: 0.836375 2023-10-02 14:36:53,615 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_4692_210_3528_1_1526899755_4323254_107-44401-0_sp0.9 from training. Duration: 0.9555625 2023-10-02 14:36:53,691 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_3475_210_2028_1_1526193578_3622569_265-276625-0_sp0.9 from training. Duration: 0.8444375 2023-10-02 14:36:55,483 WARNING [train.py:1204] (2/4) Exclude cut with ID _55153_210_2253_1_1534384786677_3162096_148-79022-0_sp1.1 from training. Duration: 0.87275 2023-10-02 14:36:57,443 WARNING [train.py:1204] (2/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_292-36336-0_sp1.1 from training. Duration: 0.92725 2023-10-02 14:36:57,460 WARNING [train.py:1204] (2/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_329-82235-0 from training. Duration: 0.82 2023-10-02 14:36:58,770 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_11305_210_1794_1_1529750428_7178148_257-312870-0 from training. Duration: 0.88 2023-10-02 14:37:00,858 WARNING [train.py:1204] (2/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_436-4493-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 14:37:01,117 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.conv_module1.balancer1.prob, batch_count=914220.0, ans=0.125 2023-10-02 14:37:02,267 WARNING [train.py:1204] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_321-54129-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 14:37:02,276 WARNING [train.py:1204] (2/4) Exclude cut with ID _5287_210_2084_1_1527418883040_3878670_333-48508-0_sp0.9 from training. Duration: 0.6666875 2023-10-02 14:37:02,289 WARNING [train.py:1204] (2/4) Exclude cut with ID _26528_210_9070_1_1532570410842_3851310_244-278158-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 14:37:02,323 WARNING [train.py:1204] (2/4) Exclude cut with ID _46952_210_5986_1_1534230010357_3107379_232-344503-0_sp1.1 from training. Duration: 0.87275 2023-10-02 14:37:02,332 WARNING [train.py:1204] (2/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_43-206757-0 from training. Duration: 0.75 2023-10-02 14:37:02,334 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_4183_210_2087_1_1526729100_1869838_141-166600-0 from training. Duration: 0.95 2023-10-02 14:37:02,505 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.ff2_skip_rate, batch_count=914220.0, ans=0.0 2023-10-02 14:37:03,711 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_296-287678-0 from training. Duration: 0.95 2023-10-02 14:37:03,839 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.0.bypass.skip_rate, batch_count=914220.0, ans=0.035 2023-10-02 14:37:05,055 WARNING [train.py:1204] (2/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_265-60343-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 14:37:05,206 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.conv_module2.balancer2.prob, batch_count=914220.0, ans=0.125 2023-10-02 14:37:06,347 WARNING [train.py:1204] (2/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_332-256168-0 from training. Duration: 0.75 2023-10-02 14:37:06,385 WARNING [train.py:1204] (2/4) Exclude cut with ID _13679_210_7497_1_1530534293662_3355098_411-186088-0 from training. Duration: 0.94 2023-10-02 14:37:10,398 WARNING [train.py:1204] (2/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_458-126779-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 14:37:11,862 WARNING [train.py:1204] (2/4) Exclude cut with ID IC0803W0380-397572 from training. Duration: 0.032 2023-10-02 14:37:12,100 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.conv_module1.balancer1.prob, batch_count=914220.0, ans=0.125 2023-10-02 14:37:13,156 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.468e+02 1.843e+02 2.069e+02 2.319e+02 3.612e+02, threshold=4.138e+02, percent-clipped=0.0 2023-10-02 14:37:13,233 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_278-272612-0_sp1.1 from training. Duration: 0.663625 2023-10-02 14:37:14,670 WARNING [train.py:1204] (2/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_491-263101-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 14:37:14,674 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_53555_210_5632_1_1534595220_7232297_334-336778-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 14:37:18,496 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_50-3289-0 from training. Duration: 0.91 2023-10-02 14:37:18,561 WARNING [train.py:1204] (2/4) Exclude cut with ID _8699_210_2069_1_1528630346831_3960409_155-39766-0_sp0.9 from training. Duration: 0.8666875 2023-10-02 14:37:18,564 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_502-82145-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 14:37:18,599 WARNING [train.py:1204] (2/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_550-43132-0_sp1.1 from training. Duration: 0.97275 2023-10-02 14:37:18,626 WARNING [train.py:1204] (2/4) Exclude cut with ID _14998_210_5219_1_1530705582080_7474970_446-112117-0_sp1.1 from training. Duration: 0.8181875 2023-10-02 14:37:19,963 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_3691_210_3528_1_1526468169_4062053_355-334432-0_sp1.1 from training. Duration: 0.7909375 2023-10-02 14:37:21,351 WARNING [train.py:1204] (2/4) Exclude cut with ID _58605_210_12210_1_1534723195939_3904259_239-94720-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 14:37:24,511 WARNING [train.py:1204] (2/4) Exclude cut with ID _49376_210_5986_1_1533985278732_3370740_391-344072-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 14:37:25,208 WARNING [train.py:1204] (2/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_558-322014-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 14:37:26,465 WARNING [train.py:1204] (2/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_256-32662-0_sp1.1 from training. Duration: 0.8181875 2023-10-02 14:37:29,589 INFO [train.py:1046] (2/4) Epoch 26, batch 4350, loss[loss=0.1478, simple_loss=0.2278, pruned_loss=0.03391, over 21086.00 frames. ], tot_loss[loss=0.1673, simple_loss=0.2444, pruned_loss=0.04511, over 4727917.50 frames. ], batch size: 46, lr: 3.92e-03, grad_scale: 16.0 2023-10-02 14:37:32,868 WARNING [train.py:1204] (2/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_572-185150-0 from training. Duration: 0.97 2023-10-02 14:37:32,911 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_89-35748-0_sp0.9 from training. Duration: 0.87775 2023-10-02 14:37:37,602 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.3.encoder.layers.0.nonlin_attention.whiten1, num_groups=1, num_channels=384, metric=7.66 vs. limit=10.0 2023-10-02 14:37:38,273 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6291_210_3273_1_1527683649_1078498_88-66747-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 14:37:41,031 WARNING [train.py:1204] (2/4) Exclude cut with ID _19504_210_9994_1_1531648863242_3325220_357-237029-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 14:37:43,791 WARNING [train.py:1204] (2/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_302-2550-0_sp1.1 from training. Duration: 0.736375 2023-10-02 14:37:43,791 WARNING [train.py:1204] (2/4) Exclude cut with ID _9461_210_4557_1_1529060514726_3677451_59-194017-0_sp1.1 from training. Duration: 0.963625 2023-10-02 14:37:45,872 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.2.encoder.layers.0.self_attn_weights.whiten_keys, num_groups=4, num_channels=128, metric=4.45 vs. limit=6.0 2023-10-02 14:37:49,298 WARNING [train.py:1204] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_665-196155-0_sp1.1 from training. Duration: 0.6090625 2023-10-02 14:37:52,127 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_326-315007-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 14:37:53,617 WARNING [train.py:1204] (2/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_105-328174-0_sp1.1 from training. Duration: 0.6909375 2023-10-02 14:37:53,624 WARNING [train.py:1204] (2/4) Exclude cut with ID _56494_210_4220_1_1535284837625_3033169_106-263722-0_sp1.1 from training. Duration: 0.97275 2023-10-02 14:37:57,000 WARNING [train.py:1204] (2/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_770-200430-0_sp0.9 from training. Duration: 0.988875 2023-10-02 14:38:00,264 WARNING [train.py:1204] (2/4) Exclude cut with ID _65640_210_6310_1_1535368201445_3173250_209-13008-0_sp1.1 from training. Duration: 0.9 2023-10-02 14:38:02,307 WARNING [train.py:1204] (2/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_212-149947-0_sp0.9 from training. Duration: 0.811125 2023-10-02 14:38:06,496 WARNING [train.py:1204] (2/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_188-171673-0 from training. Duration: 0.58 2023-10-02 14:38:07,881 WARNING [train.py:1204] (2/4) Exclude cut with ID _58895_210_8750_1_1535184081320_3649089_286-281724-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 14:38:07,948 WARNING [train.py:1204] (2/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_16-254909-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 14:38:12,191 WARNING [train.py:1204] (2/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_52-95655-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 14:38:13,624 WARNING [train.py:1204] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_554-45617-0 from training. Duration: 0.99 2023-10-02 14:38:16,386 WARNING [train.py:1204] (2/4) Exclude cut with ID _14683_210_5219_1_1530698352608_3974859_473-344611-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 14:38:16,565 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.nonlin_attention.balancer.max_positive, batch_count=914553.3333333334, ans=0.95 2023-10-02 14:38:17,765 WARNING [train.py:1204] (2/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_17-25045-0_sp0.9 from training. Duration: 0.6555625 2023-10-02 14:38:21,818 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9072W0003-530637 from training. Duration: 0.768 2023-10-02 14:38:23,191 WARNING [train.py:1204] (2/4) Exclude cut with ID _38670_210_15379_1_1533448810659_4391059_76-74025-0_sp1.1 from training. Duration: 0.936375 2023-10-02 14:38:23,223 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6291_210_3273_1_1527683649_1078498_74-292608-0_sp0.9 from training. Duration: 0.8 2023-10-02 14:38:25,052 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9084W0025-536862 from training. Duration: 0.896 2023-10-02 14:38:26,393 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9087W0021-538392 from training. Duration: 0.768 2023-10-02 14:38:26,410 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7543_210_6753_1_1528543323_645515_56-163234-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 14:38:26,430 WARNING [train.py:1204] (2/4) Exclude cut with ID _45560_210_21982_1_1533812603951_3584999_44-332643-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 14:38:28,193 WARNING [train.py:1204] (2/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_1285-256622-0_sp1.1 from training. Duration: 0.763625 2023-10-02 14:38:28,246 WARNING [train.py:1204] (2/4) Exclude cut with ID _52781_210_16187_1_1534297729099_1118149_141-325632-0_sp1.1 from training. Duration: 0.936375 2023-10-02 14:38:28,510 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.feed_forward1.out_proj.dropout_p, batch_count=914620.0, ans=0.1 2023-10-02 14:38:28,526 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.bypass_mid.scale_min, batch_count=914620.0, ans=0.2 2023-10-02 14:38:29,535 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_40896_210_15710_1_1533433956_7100802_1051-137631-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 14:38:29,589 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_13091_210_7925_1_1530609447_8987354_233-18333-0_sp1.1 from training. Duration: 0.97275 2023-10-02 14:38:32,825 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_34008_210_10431_1_1533345424_8183247_662-317331-0 from training. Duration: 0.94 2023-10-02 14:38:32,842 WARNING [train.py:1204] (2/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_291-248920-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 14:38:32,844 WARNING [train.py:1204] (2/4) Exclude cut with ID _65263_210_22356_1_1535763598691_3921037_274-288481-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 14:38:32,865 WARNING [train.py:1204] (2/4) Exclude cut with ID _5189_210_4557_1_1527248319428_3769229_233-168080-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 14:38:34,684 WARNING [train.py:1204] (2/4) Exclude cut with ID _54871_210_22356_1_1534665798253_3856359_135-184281-0 from training. Duration: 0.99 2023-10-02 14:38:36,070 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9123W0012-555972 from training. Duration: 0.896 2023-10-02 14:38:36,074 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9123W0118-556077 from training. Duration: 0.896 2023-10-02 14:38:36,083 WARNING [train.py:1204] (2/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_406-275649-0 from training. Duration: 0.42 2023-10-02 14:38:39,399 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.conv_skip_rate, batch_count=914620.0, ans=0.0 2023-10-02 14:38:40,548 WARNING [train.py:1204] (2/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_309-13684-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 14:38:40,567 WARNING [train.py:1204] (2/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_129-337802-0_sp1.1 from training. Duration: 0.6545625 2023-10-02 14:38:40,593 WARNING [train.py:1204] (2/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_649-266825-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 14:38:40,648 WARNING [train.py:1204] (2/4) Exclude cut with ID _40519_210_13794_1_1533715228655_7337390_895-224742-0_sp1.1 from training. Duration: 0.92725 2023-10-02 14:38:42,071 WARNING [train.py:1204] (2/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_234-14174-0 from training. Duration: 0.82 2023-10-02 14:38:42,607 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.4.encoder.layers.2.self_attn2.whiten, num_groups=1, num_channels=384, metric=12.25 vs. limit=22.5 2023-10-02 14:38:43,279 INFO [train.py:1046] (2/4) Epoch 26, batch 4400, loss[loss=0.1534, simple_loss=0.2282, pruned_loss=0.03931, over 24598.00 frames. ], tot_loss[loss=0.1682, simple_loss=0.2454, pruned_loss=0.04546, over 4727536.95 frames. ], batch size: 60, lr: 3.91e-03, grad_scale: 32.0 2023-10-02 14:38:44,725 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9156W0009-572502 from training. Duration: 0.768 2023-10-02 14:38:44,733 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_424-245529-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 14:38:46,326 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.conv_module2.balancer1.prob, batch_count=914686.6666666666, ans=0.125 2023-10-02 14:38:48,753 WARNING [train.py:1204] (2/4) Exclude cut with ID _26528_210_9070_1_1532570410842_3851310_232-10149-0_sp1.1 from training. Duration: 0.963625 2023-10-02 14:38:48,761 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_17292_210_6753_1_1531125388_2943734_152-30410-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 14:38:50,133 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_35441_210_16554_1_1533347572_4092680_291-82075-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 14:38:51,601 WARNING [train.py:1204] (2/4) Exclude cut with ID _12369_210_4381_1_1530356078581_4388520_64-298917-0 from training. Duration: 0.88 2023-10-02 14:38:51,618 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_5618_210_1874_1_1527311008_5851_1-214501-0_sp0.9 from training. Duration: 0.7655625 2023-10-02 14:38:52,977 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_9460_210_4211_1_1528954587_10049667_572-37035-0 from training. Duration: 0.8 2023-10-02 14:38:52,994 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9193W0023-590322 from training. Duration: 0.896 2023-10-02 14:38:54,355 WARNING [train.py:1204] (2/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_206-10097-0_sp1.1 from training. Duration: 0.4454375 2023-10-02 14:38:54,374 WARNING [train.py:1204] (2/4) Exclude cut with ID _55134_210_21982_1_1534590028377_3882170_17-51731-0_sp1.1 from training. Duration: 0.963625 2023-10-02 14:38:57,522 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_14953_210_2151_1_1530701710_5616165_710-165400-0 from training. Duration: 0.87 2023-10-02 14:38:57,651 WARNING [train.py:1204] (2/4) Exclude cut with ID _53203_210_2087_1_1534246218309_475590_61-85777-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 14:39:00,861 WARNING [train.py:1204] (2/4) Exclude cut with ID _55844_210_2346_1_1534471212569_7625707_1050-10453-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 14:39:00,871 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9218W0013-603297 from training. Duration: 0.896 2023-10-02 14:39:02,393 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_20120_210_6753_1_1531643035_5482859_267-102818-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 14:39:02,394 WARNING [train.py:1204] (2/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_191-41023-0 from training. Duration: 0.71 2023-10-02 14:39:02,442 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9231W0202-610227 from training. Duration: 0.896 2023-10-02 14:39:05,591 WARNING [train.py:1204] (2/4) Exclude cut with ID _6537_210_1883_1_1527987559710_3597129_382-133062-0 from training. Duration: 0.68 2023-10-02 14:39:07,048 WARNING [train.py:1204] (2/4) Exclude cut with ID _28427_210_4220_1_1532581240967_3061069_43-84928-0 from training. Duration: 0.99 2023-10-02 14:39:07,086 WARNING [train.py:1204] (2/4) Exclude cut with ID _59256_210_5986_1_1535076085997_3589120_121-237386-0 from training. Duration: 0.96 2023-10-02 14:39:07,552 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.5.encoder.layers.0.conv_module1.whiten, num_groups=1, num_channels=256, metric=8.14 vs. limit=15.0 2023-10-02 14:39:08,429 WARNING [train.py:1204] (2/4) Exclude cut with ID _8480_210_1874_1_1528589391205_6728330_137-256986-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 14:39:08,676 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.balancer2.prob, batch_count=914753.3333333334, ans=0.125 2023-10-02 14:39:09,906 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_9417_210_1794_1_1529235261_4614131_479-346963-0_sp1.1 from training. Duration: 0.936375 2023-10-02 14:39:10,076 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=914753.3333333334, ans=0.1 2023-10-02 14:39:10,088 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.balancer2.prob, batch_count=914753.3333333334, ans=0.125 2023-10-02 14:39:11,174 WARNING [train.py:1204] (2/4) Exclude cut with ID _26474_210_9570_1_1532520022699_3623380_267-7362-0_sp1.1 from training. Duration: 0.936375 2023-10-02 14:39:11,250 WARNING [train.py:1204] (2/4) Exclude cut with ID _55328_210_27767_1_1534580973260_4088170_287-61272-0_sp1.1 from training. Duration: 0.97275 2023-10-02 14:39:13,975 WARNING [train.py:1204] (2/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_589-9070-0 from training. Duration: 0.71 2023-10-02 14:39:13,981 WARNING [train.py:1204] (2/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_148-158509-0_sp1.1 from training. Duration: 0.37275 2023-10-02 14:39:14,052 WARNING [train.py:1204] (2/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_164-184548-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 14:39:16,761 WARNING [train.py:1204] (2/4) Exclude cut with ID _14970_210_15709_1_1530775547361_1966965_6-23328-0_sp1.1 from training. Duration: 0.7818125 2023-10-02 14:39:16,766 WARNING [train.py:1204] (2/4) Exclude cut with ID _29303_210_2575_1_1532392237303_7410649_612-165069-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 14:39:18,176 WARNING [train.py:1204] (2/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_383-321224-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 14:39:18,201 WARNING [train.py:1204] (2/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_283-160525-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 14:39:18,218 WARNING [train.py:1204] (2/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_505-97987-0 from training. Duration: 0.86 2023-10-02 14:39:18,367 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.conv_skip_rate, batch_count=914820.0, ans=0.0 2023-10-02 14:39:19,534 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9309W0190-640317 from training. Duration: 0.768 2023-10-02 14:39:22,345 WARNING [train.py:1204] (2/4) Exclude cut with ID _50093_210_4220_1_1534417141207_3432299_280-297390-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 14:39:27,885 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_17292_210_6753_1_1531125388_2943734_320-71385-0_sp1.1 from training. Duration: 0.97275 2023-10-02 14:39:29,952 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_213-103282-0 from training. Duration: 0.78 2023-10-02 14:39:35,826 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7615_210_4211_1_1528110965_4580160_72-235211-0_sp0.9 from training. Duration: 0.8444375 2023-10-02 14:39:37,310 WARNING [train.py:1204] (2/4) Exclude cut with ID _14867_210_1968_1_1530684179890_3846969_173-320977-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 14:39:38,921 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7046_210_6753_1_1527895337_6920917_783-278109-0_sp0.9 from training. Duration: 0.8555625 2023-10-02 14:39:38,959 WARNING [train.py:1204] (2/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_90-30673-0 from training. Duration: 0.84 2023-10-02 14:39:38,975 WARNING [train.py:1204] (2/4) Exclude cut with ID _22826_210_9994_1_1531908201522_3279169_415-161-0_sp1.1 from training. Duration: 0.8909375 2023-10-02 14:39:38,990 WARNING [train.py:1204] (2/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_86-66192-0_sp0.9 from training. Duration: 0.611125 2023-10-02 14:39:38,993 WARNING [train.py:1204] (2/4) Exclude cut with ID _4626_210_3532_1_1526817652717_3916859_446-206656-0_sp1.1 from training. Duration: 0.6909375 2023-10-02 14:39:40,331 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.495e+02 1.911e+02 2.191e+02 2.459e+02 3.786e+02, threshold=4.382e+02, percent-clipped=0.0 2023-10-02 14:39:40,393 WARNING [train.py:1204] (2/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_166-128941-0_sp0.9 from training. Duration: 0.6 2023-10-02 14:39:40,699 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.feed_forward1.hidden_balancer.prob, batch_count=914953.3333333334, ans=0.125 2023-10-02 14:39:44,504 WARNING [train.py:1204] (2/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_345-167986-0 from training. Duration: 0.84 2023-10-02 14:39:44,661 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.0.ff3_skip_rate, batch_count=914953.3333333334, ans=0.0 2023-10-02 14:39:44,716 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.conv_module2.balancer2.prob, batch_count=914953.3333333334, ans=0.125 2023-10-02 14:39:47,131 WARNING [train.py:1204] (2/4) Exclude cut with ID _6275_210_2084_1_1528110062213_7669049_973-198085-0 from training. Duration: 0.75 2023-10-02 14:39:48,481 WARNING [train.py:1204] (2/4) Exclude cut with ID _60057_210_10640_1_1534903065793_3333150_171-181133-0 from training. Duration: 0.88 2023-10-02 14:39:48,498 WARNING [train.py:1204] (2/4) Exclude cut with ID _19504_210_9994_1_1531648863242_3325220_336-196513-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 14:39:48,505 WARNING [train.py:1204] (2/4) Exclude cut with ID _6493_210_1747_1_1528459210424_4012470_513-140967-0 from training. Duration: 0.79 2023-10-02 14:39:49,918 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6673_210_3273_1_1527767790_3173240_327-306185-0_sp0.9 from training. Duration: 0.92225 2023-10-02 14:39:51,359 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1032-236863-0_sp1.1 from training. Duration: 0.763625 2023-10-02 14:39:52,987 INFO [scaling.py:1118] (2/4) WithLoss: name=encoder.encoders.2.encoder.layers.2.self_attn_weights, loss-sum=0.000e+00 2023-10-02 14:39:54,154 WARNING [train.py:1204] (2/4) Exclude cut with ID _14867_210_1968_1_1530684179890_3846969_413-29292-0 from training. Duration: 0.9 2023-10-02 14:39:55,582 INFO [train.py:1046] (2/4) Epoch 26, batch 4450, loss[loss=0.2264, simple_loss=0.2878, pruned_loss=0.08246, over 19549.00 frames. ], tot_loss[loss=0.1686, simple_loss=0.2463, pruned_loss=0.04546, over 4727482.04 frames. ], batch size: 388, lr: 3.91e-03, grad_scale: 32.0 2023-10-02 14:39:59,456 WARNING [train.py:1204] (2/4) Exclude cut with ID _47251_210_18628_1_1533814063128_4137250_186-343322-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 14:40:02,087 WARNING [train.py:1204] (2/4) Exclude cut with ID _56235_210_10640_1_1534557335235_4161099_624-46091-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 14:40:04,026 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_791-197891-0_sp0.9 from training. Duration: 0.9333125 2023-10-02 14:40:04,857 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.2.encoder.layers.2.self_attn2.whiten, num_groups=1, num_channels=384, metric=16.42 vs. limit=22.5 2023-10-02 14:40:09,630 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_50050_210_16223_1_1535435796_7290138_786-288457-0_sp1.1 from training. Duration: 0.92725 2023-10-02 14:40:09,660 WARNING [train.py:1204] (2/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_213-227283-0_sp1.1 from training. Duration: 0.963625 2023-10-02 14:40:12,376 WARNING [train.py:1204] (2/4) Exclude cut with ID _42659_210_2070_1_1533607442765_3120380_58-341121-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 14:40:13,827 WARNING [train.py:1204] (2/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_506-23200-0_sp1.1 from training. Duration: 0.8818125 2023-10-02 14:40:16,555 WARNING [train.py:1204] (2/4) Exclude cut with ID _14170_210_5219_1_1530525949868_4469650_336-213530-0_sp1.1 from training. Duration: 0.8181875 2023-10-02 14:40:17,886 WARNING [train.py:1204] (2/4) Exclude cut with ID _64135_210_2253_1_1535677267876_7608972_180-5312-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 14:40:17,971 WARNING [train.py:1204] (2/4) Exclude cut with ID _12302_210_5219_1_1529924446466_8107280_835-279754-0 from training. Duration: 0.9 2023-10-02 14:40:17,973 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_20120_210_6753_1_1531643035_5482859_510-96996-0_sp1.1 from training. Duration: 0.7909375 2023-10-02 14:40:19,320 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_15517_210_6753_1_1530954008_3487336_216-187834-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 14:40:19,346 WARNING [train.py:1204] (2/4) Exclude cut with ID _19504_210_9994_1_1531648863242_3325220_414-113427-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 14:40:19,347 WARNING [train.py:1204] (2/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_264-108685-0_sp0.9 from training. Duration: 0.611125 2023-10-02 14:40:20,963 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_5438_210_1794_1_1527336687_2600009_104-288274-0_sp0.9 from training. Duration: 0.6444375 2023-10-02 14:40:26,196 WARNING [train.py:1204] (2/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_520-39496-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 14:40:26,235 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_37818_210_4238_1_1533620910_4720083_315-308565-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 14:40:28,120 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_2009_210_2071_1_1525481134_4588937_385-302100-0_sp1.1 from training. Duration: 0.7909375 2023-10-02 14:40:29,461 WARNING [train.py:1204] (2/4) Exclude cut with ID _58895_210_8750_1_1535184081320_3649089_277-53219-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 14:40:29,539 WARNING [train.py:1204] (2/4) Exclude cut with ID _26664_210_7770_1_1532258943280_3418270_121-111258-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 14:40:31,113 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.0.feed_forward1.out_proj.dropout_p, batch_count=915153.3333333334, ans=0.1 2023-10-02 14:40:32,433 WARNING [train.py:1204] (2/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_99-167392-0_sp1.1 from training. Duration: 0.5545625 2023-10-02 14:40:35,734 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1113-7369-0 from training. Duration: 0.85 2023-10-02 14:40:35,749 WARNING [train.py:1204] (2/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_352-59958-0 from training. Duration: 0.86 2023-10-02 14:40:35,753 WARNING [train.py:1204] (2/4) Exclude cut with ID _14886_210_4220_1_1530702020355_3516190_159-73748-0_sp1.1 from training. Duration: 0.7545625 2023-10-02 14:40:37,195 WARNING [train.py:1204] (2/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_131-119569-0_sp1.1 from training. Duration: 0.92725 2023-10-02 14:40:38,514 WARNING [train.py:1204] (2/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_216-343007-0 from training. Duration: 0.66 2023-10-02 14:40:41,337 WARNING [train.py:1204] (2/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_113-92608-0_sp0.9 from training. Duration: 0.6 2023-10-02 14:40:45,375 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_40047_210_2290_1_1533290519_2374311_132-123271-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 14:40:45,556 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=915220.0, ans=0.1 2023-10-02 14:40:46,612 WARNING [train.py:1204] (2/4) Exclude cut with ID _8793_210_6310_1_1528542985139_2310009_121-179782-0 from training. Duration: 0.8 2023-10-02 14:40:46,632 WARNING [train.py:1204] (2/4) Exclude cut with ID _43997_210_4525_1_1533538632555_5189329_283-218639-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 14:40:46,635 WARNING [train.py:1204] (2/4) Exclude cut with ID _49376_210_5986_1_1533985278732_3370740_297-78397-0_sp1.1 from training. Duration: 0.936375 2023-10-02 14:40:46,649 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_3475_210_2028_1_1526193578_3622569_377-166133-0_sp0.9 from training. Duration: 0.9555625 2023-10-02 14:40:46,655 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_520-213149-0_sp1.1 from training. Duration: 0.92725 2023-10-02 14:40:49,312 WARNING [train.py:1204] (2/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_567-144483-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 14:40:52,113 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_4183_210_2087_1_1526729100_1869838_143-280962-0_sp0.9 from training. Duration: 0.62225 2023-10-02 14:40:53,463 WARNING [train.py:1204] (2/4) Exclude cut with ID _49643_210_7989_1_1534640244224_3939031_611-185515-0 from training. Duration: 0.94 2023-10-02 14:40:54,876 WARNING [train.py:1204] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_88-341718-0_sp0.9 from training. Duration: 0.7333125 2023-10-02 14:40:56,675 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_51225_210_3601_1_1534400634_2327383_49-235441-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 14:40:58,406 WARNING [train.py:1204] (2/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_240-294400-0_sp1.1 from training. Duration: 0.936375 2023-10-02 14:40:59,778 WARNING [train.py:1204] (2/4) Exclude cut with ID _55429_210_10626_1_1534467588527_7174143_640-304825-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 14:40:59,812 WARNING [train.py:1204] (2/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_187-221566-0_sp1.1 from training. Duration: 0.3909375 2023-10-02 14:41:02,528 WARNING [train.py:1204] (2/4) Exclude cut with ID _7606_210_4915_1_1528894253277_5080690_127-177761-0_sp0.9 from training. Duration: 0.711125 2023-10-02 14:41:03,981 WARNING [train.py:1204] (2/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_430-116139-0 from training. Duration: 0.85 2023-10-02 14:41:05,949 WARNING [train.py:1204] (2/4) Exclude cut with ID _3675_210_4557_1_1526639934408_4182089_456-18305-0_sp0.9 from training. Duration: 0.8444375 2023-10-02 14:41:08,732 INFO [train.py:1046] (2/4) Epoch 26, batch 4500, loss[loss=0.1646, simple_loss=0.2501, pruned_loss=0.03954, over 24494.00 frames. ], tot_loss[loss=0.1692, simple_loss=0.2466, pruned_loss=0.04588, over 4711045.87 frames. ], batch size: 66, lr: 3.91e-03, grad_scale: 32.0 2023-10-02 14:41:11,532 WARNING [train.py:1204] (2/4) Exclude cut with ID _18513_210_9206_1_1531618215223_3862620_402-5957-0_sp1.1 from training. Duration: 0.9 2023-10-02 14:41:11,616 WARNING [train.py:1204] (2/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_465-156500-0 from training. Duration: 0.72 2023-10-02 14:41:11,617 WARNING [train.py:1204] (2/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_714-310538-0 from training. Duration: 0.86 2023-10-02 14:41:13,019 WARNING [train.py:1204] (2/4) Exclude cut with ID _39411_210_5986_1_1533538825030_3343649_335-301187-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 14:41:18,538 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_41750_210_1960_1_1533520678_3948042_241-158616-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 14:41:18,590 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_46013_210_7234_1_1533776362_3744032_50-141535-0_sp1.1 from training. Duration: 0.9 2023-10-02 14:41:19,905 WARNING [train.py:1204] (2/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_43-206757-0_sp0.9 from training. Duration: 0.8333125 2023-10-02 14:41:19,962 WARNING [train.py:1204] (2/4) Exclude cut with ID _22961_210_8254_1_1531976366672_4278180_172-276296-0_sp1.1 from training. Duration: 0.97275 2023-10-02 14:41:19,990 WARNING [train.py:1204] (2/4) Exclude cut with ID _17956_210_4353_1_1531209568125_6979970_1305-301903-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 14:41:21,238 WARNING [train.py:1204] (2/4) Exclude cut with ID _28323_210_16187_1_1532480395667_4100300_604-44507-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 14:41:25,508 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.conv_module2.balancer2.prob, batch_count=915420.0, ans=0.125 2023-10-02 14:41:31,166 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.3.encoder.layers.1.conv_module1.whiten, num_groups=1, num_channels=512, metric=3.08 vs. limit=15.0 2023-10-02 14:41:34,683 WARNING [train.py:1204] (2/4) Exclude cut with ID _23514_210_1389_1_1531832139785_3031439_88-337544-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 14:41:36,129 WARNING [train.py:1204] (2/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_932-3672-0_sp1.1 from training. Duration: 0.963625 2023-10-02 14:41:38,281 WARNING [train.py:1204] (2/4) Exclude cut with ID _46502_210_13794_1_1533724452198_3657159_207-109991-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 14:41:39,532 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_216-73841-0_sp1.1 from training. Duration: 0.87275 2023-10-02 14:41:40,887 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8453_210_4211_1_1528502940_8562730_347-330673-0_sp1.1 from training. Duration: 0.6545625 2023-10-02 14:41:46,182 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_4183_210_2087_1_1526729100_1869838_143-280962-0_sp1.1 from training. Duration: 0.5090625 2023-10-02 14:41:49,160 WARNING [train.py:1204] (2/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_325-113798-0_sp1.1 from training. Duration: 0.77275 2023-10-02 14:41:52,040 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.bypass.skip_rate, batch_count=915553.3333333334, ans=0.07 2023-10-02 14:41:53,266 WARNING [train.py:1204] (2/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_91-212486-0_sp0.9 from training. Duration: 0.7555625 2023-10-02 14:41:53,469 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.balancer2.prob, batch_count=915553.3333333334, ans=0.125 2023-10-02 14:41:56,106 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8776_210_2040_1_1528638837_4088036_244-278763-0_sp1.1 from training. Duration: 0.8181875 2023-10-02 14:41:56,137 WARNING [train.py:1204] (2/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_106-286130-0 from training. Duration: 0.98 2023-10-02 14:41:57,859 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_2746_210_1988_1_1525872816_3795627_372-205861-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 14:41:57,898 WARNING [train.py:1204] (2/4) Exclude cut with ID _30800_210_22382_1_1532499347904_4515959_240-54646-0_sp1.1 from training. Duration: 0.936375 2023-10-02 14:42:01,207 WARNING [train.py:1204] (2/4) Exclude cut with ID _59500_210_8254_1_1534809610680_3736009_162-180695-0_sp1.1 from training. Duration: 0.936375 2023-10-02 14:42:01,235 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7544_210_6753_1_1528591327_1471426_146-93357-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 14:42:03,899 WARNING [train.py:1204] (2/4) Exclude cut with ID _42657_210_2070_1_1533520654016_3327008_405-87432-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 14:42:03,928 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_4528_210_3273_1_1527076654_3276777_209-219804-0 from training. Duration: 0.69 2023-10-02 14:42:03,928 WARNING [train.py:1204] (2/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_78-182912-0_sp0.9 from training. Duration: 0.6555625 2023-10-02 14:42:03,941 WARNING [train.py:1204] (2/4) Exclude cut with ID _31255_210_13427_1_1532865619495_3815143_322-114998-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 14:42:07,070 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.505e+02 1.824e+02 1.951e+02 2.153e+02 2.921e+02, threshold=3.902e+02, percent-clipped=0.0 2023-10-02 14:42:09,786 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.3.encoder.layers.2.feed_forward2.out_whiten, num_groups=1, num_channels=512, metric=9.70 vs. limit=15.0 2023-10-02 14:42:10,464 WARNING [train.py:1204] (2/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_848-280957-0_sp0.9 from training. Duration: 0.9666875 2023-10-02 14:42:10,491 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7696_210_2040_1_1528207167_3716099_117-125434-0_sp0.9 from training. Duration: 0.8444375 2023-10-02 14:42:11,920 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_1002-109192-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 14:42:13,336 WARNING [train.py:1204] (2/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_138-64210-0_sp0.9 from training. Duration: 0.811125 2023-10-02 14:42:13,360 WARNING [train.py:1204] (2/4) Exclude cut with ID _25808_210_13292_1_1532343514562_7366109_633-47524-0_sp1.1 from training. Duration: 0.9 2023-10-02 14:42:14,769 WARNING [train.py:1204] (2/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_297-5233-0 from training. Duration: 0.95 2023-10-02 14:42:16,140 WARNING [train.py:1204] (2/4) Exclude cut with ID _8394_210_6702_1_1528590504316_3540189_335-274780-0 from training. Duration: 0.78 2023-10-02 14:42:16,145 WARNING [train.py:1204] (2/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_301-330455-0 from training. Duration: 0.8 2023-10-02 14:42:20,313 WARNING [train.py:1204] (2/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_383-205833-0 from training. Duration: 0.95 2023-10-02 14:42:21,631 INFO [train.py:1046] (2/4) Epoch 26, batch 4550, loss[loss=0.1707, simple_loss=0.2622, pruned_loss=0.03966, over 24448.00 frames. ], tot_loss[loss=0.1692, simple_loss=0.2472, pruned_loss=0.04559, over 4709027.42 frames. ], batch size: 69, lr: 3.91e-03, grad_scale: 16.0 2023-10-02 14:42:24,421 WARNING [train.py:1204] (2/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_48-182155-0 from training. Duration: 0.87 2023-10-02 14:42:26,366 WARNING [train.py:1204] (2/4) Exclude cut with ID _14832_210_8750_1_1530691351539_3171348_1-54315-0_sp1.1 from training. Duration: 0.7909375 2023-10-02 14:42:29,118 WARNING [train.py:1204] (2/4) Exclude cut with ID _44982_210_1511_1_1534312820108_3726029_413-100814-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 14:42:29,180 WARNING [train.py:1204] (2/4) Exclude cut with ID _3231_210_2106_1_1526205864934_6784220_104-261392-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 14:42:33,899 WARNING [train.py:1204] (2/4) Exclude cut with ID _24314_210_4915_1_1532135078116_7170179_1-13588-0_sp1.1 from training. Duration: 0.97275 2023-10-02 14:42:36,584 WARNING [train.py:1204] (2/4) Exclude cut with ID _44304_210_15374_1_1533639352765_4613129_141-8356-0_sp1.1 from training. Duration: 0.963625 2023-10-02 14:42:36,711 WARNING [train.py:1204] (2/4) Exclude cut with ID _37892_210_19596_1_1533198677897_3964058_374-335034-0_sp1.1 from training. Duration: 0.936375 2023-10-02 14:42:39,905 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_195-257512-0_sp0.9 from training. Duration: 0.7444375 2023-10-02 14:42:39,907 WARNING [train.py:1204] (2/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_1267-272693-0_sp1.1 from training. Duration: 0.663625 2023-10-02 14:42:39,908 WARNING [train.py:1204] (2/4) Exclude cut with ID _30196_210_1664_1_1533708496522_7074140_690-326155-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 14:42:42,501 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_68847_210_14656_1_1536070214_2586843_38-20717-0_sp1.1 from training. Duration: 0.97275 2023-10-02 14:42:42,545 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_99-251314-0_sp1.1 from training. Duration: 0.7909375 2023-10-02 14:42:45,321 WARNING [train.py:1204] (2/4) Exclude cut with ID _49376_210_5986_1_1533985278732_3370740_225-95630-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 14:42:48,069 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_2655_210_2087_1_1525688959_3897400_202-147557-0 from training. Duration: 0.85 2023-10-02 14:42:49,337 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_5618_210_1874_1_1527311008_5851_1-214501-0_sp1.1 from training. Duration: 0.626375 2023-10-02 14:42:50,760 WARNING [train.py:1204] (2/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_225-43381-0_sp1.1 from training. Duration: 0.8545625 2023-10-02 14:42:52,173 WARNING [train.py:1204] (2/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_242-3472-0 from training. Duration: 0.73 2023-10-02 14:42:54,875 WARNING [train.py:1204] (2/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_339-104791-0 from training. Duration: 0.49 2023-10-02 14:42:54,934 WARNING [train.py:1204] (2/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_488-92261-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 14:42:58,365 WARNING [train.py:1204] (2/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_175-163894-0 from training. Duration: 0.74 2023-10-02 14:42:59,743 WARNING [train.py:1204] (2/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_41-234353-0_sp0.9 from training. Duration: 0.7555625 2023-10-02 14:43:04,296 WARNING [train.py:1204] (2/4) Exclude cut with ID _47251_210_18628_1_1533814063128_4137250_116-275927-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 14:43:04,329 WARNING [train.py:1204] (2/4) Exclude cut with ID _4697_210_2069_1_1526815801953_3823098_61-223324-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 14:43:04,343 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6961_210_2087_1_1527937168_6815011_562-165518-0_sp1.1 from training. Duration: 0.7 2023-10-02 14:43:05,820 WARNING [train.py:1204] (2/4) Exclude cut with ID _21989_210_5854_1_1531899214931_3271360_133-44759-0 from training. Duration: 0.77 2023-10-02 14:43:09,484 WARNING [train.py:1204] (2/4) Exclude cut with ID _23557_210_8750_1_1531890106370_6846255_77-284923-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 14:43:12,287 WARNING [train.py:1204] (2/4) Exclude cut with ID _13678_210_7497_1_1530530069226_3463170_464-296127-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 14:43:12,301 WARNING [train.py:1204] (2/4) Exclude cut with ID _22613_210_5219_1_1532253599686_4090170_393-36663-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 14:43:13,622 WARNING [train.py:1204] (2/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_235-262124-0_sp0.9 from training. Duration: 0.7444375 2023-10-02 14:43:13,711 WARNING [train.py:1204] (2/4) Exclude cut with ID _8480_210_1874_1_1528589391205_6728330_755-112625-0 from training. Duration: 0.74 2023-10-02 14:43:13,755 WARNING [train.py:1204] (2/4) Exclude cut with ID _6456_210_1534_1_1527681634212_3181049_215-309359-0 from training. Duration: 0.88 2023-10-02 14:43:15,115 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_3019_210_2028_1_1526044245_3264865_191-72235-0_sp1.1 from training. Duration: 0.8818125 2023-10-02 14:43:16,465 WARNING [train.py:1204] (2/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_380-162648-0 from training. Duration: 0.94 2023-10-02 14:43:17,916 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_92-46009-0 from training. Duration: 0.54 2023-10-02 14:43:19,262 WARNING [train.py:1204] (2/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_53-300544-0_sp0.9 from training. Duration: 0.7444375 2023-10-02 14:43:19,374 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_68235_210_1925_1_1535863247_4484656_101-250943-0_sp1.1 from training. Duration: 0.97275 2023-10-02 14:43:19,382 WARNING [train.py:1204] (2/4) Exclude cut with ID _14970_210_15709_1_1530775547361_1966965_204-22182-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 14:43:20,789 WARNING [train.py:1204] (2/4) Exclude cut with ID _55153_210_2253_1_1534384786677_3162096_310-130092-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 14:43:20,800 WARNING [train.py:1204] (2/4) Exclude cut with ID _60113_210_16271_1_1535164212481_4386079_559-167191-0_sp1.1 from training. Duration: 0.8454375 2023-10-02 14:43:22,294 WARNING [train.py:1204] (2/4) Exclude cut with ID _14368_210_4220_1_1530532714870_3595529_93-58002-0_sp1.1 from training. Duration: 0.5181875 2023-10-02 14:43:22,342 WARNING [train.py:1204] (2/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_110-224688-0 from training. Duration: 0.97 2023-10-02 14:43:23,782 WARNING [train.py:1204] (2/4) Exclude cut with ID _40402_210_18219_1_1533522324115_2953150_178-178004-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 14:43:23,797 WARNING [train.py:1204] (2/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_321-335475-0_sp1.1 from training. Duration: 0.4090625 2023-10-02 14:43:25,691 WARNING [train.py:1204] (2/4) Exclude cut with ID _66863_210_18053_1_1535698758334_8590319_1092-271784-0 from training. Duration: 0.96 2023-10-02 14:43:25,696 WARNING [train.py:1204] (2/4) Exclude cut with ID _28317_210_12742_1_1532480302381_8510325_607-213951-0_sp1.1 from training. Duration: 0.763625 2023-10-02 14:43:25,704 WARNING [train.py:1204] (2/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_468-283113-0 from training. Duration: 0.96 2023-10-02 14:43:28,563 WARNING [train.py:1204] (2/4) Exclude cut with ID _14922_210_6228_1_1530707435273_3693310_13-165512-0_sp1.1 from training. Duration: 0.7454375 2023-10-02 14:43:28,578 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_69630_210_7234_1_1535788670_910989_56-169762-0_sp1.1 from training. Duration: 0.92725 2023-10-02 14:43:31,469 WARNING [train.py:1204] (2/4) Exclude cut with ID _67335_210_5219_1_1535525975652_4275229_121-80357-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 14:43:33,207 WARNING [train.py:1204] (2/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_126-12079-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 14:43:33,236 WARNING [train.py:1204] (2/4) Exclude cut with ID _5294_210_1747_1_1527249643928_3696179_342-270278-0_sp1.1 from training. Duration: 0.5 2023-10-02 14:43:33,344 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_30322_210_1925_1_1532843478_826817_35-328264-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 14:43:36,496 INFO [train.py:1046] (2/4) Epoch 26, batch 4600, loss[loss=0.1562, simple_loss=0.2466, pruned_loss=0.0329, over 24367.00 frames. ], tot_loss[loss=0.1679, simple_loss=0.2457, pruned_loss=0.04507, over 4710299.07 frames. ], batch size: 74, lr: 3.91e-03, grad_scale: 16.0 2023-10-02 14:43:36,582 WARNING [train.py:1204] (2/4) Exclude cut with ID _8480_210_1874_1_1528589391205_6728330_755-112625-0_sp1.1 from training. Duration: 0.67275 2023-10-02 14:43:38,236 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.balancer_na.min_abs, batch_count=916020.0, ans=0.02 2023-10-02 14:43:39,819 WARNING [train.py:1204] (2/4) Exclude cut with ID _38670_210_15379_1_1533448810659_4391059_132-146412-0_sp1.1 from training. Duration: 0.97275 2023-10-02 14:43:39,906 WARNING [train.py:1204] (2/4) Exclude cut with ID _22020_210_2461_1_1531724164513_6326900_563-39691-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 14:43:42,664 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_9460_210_4211_1_1528954587_10049667_572-37035-0_sp1.1 from training. Duration: 0.72725 2023-10-02 14:43:42,675 WARNING [train.py:1204] (2/4) Exclude cut with ID _68777_210_4353_1_1535810390731_3117904_129-166012-0_sp0.9 from training. Duration: 0.9444375 2023-10-02 14:43:44,054 WARNING [train.py:1204] (2/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_487-91921-0_sp1.1 from training. Duration: 0.963625 2023-10-02 14:43:45,397 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_195-257512-0 from training. Duration: 0.67 2023-10-02 14:43:46,862 WARNING [train.py:1204] (2/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_188-260011-0_sp1.1 from training. Duration: 0.663625 2023-10-02 14:43:48,415 WARNING [train.py:1204] (2/4) Exclude cut with ID _8480_210_1874_1_1528589391205_6728330_309-139896-0_sp0.9 from training. Duration: 0.9 2023-10-02 14:43:48,477 WARNING [train.py:1204] (2/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_866-321248-0_sp1.1 from training. Duration: 0.963625 2023-10-02 14:43:52,422 WARNING [train.py:1204] (2/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_510-216513-0_sp1.1 from training. Duration: 0.97275 2023-10-02 14:43:58,429 WARNING [train.py:1204] (2/4) Exclude cut with ID _6653_210_2629_1_1527919172441_6732679_758-143677-0 from training. Duration: 0.98 2023-10-02 14:43:59,937 WARNING [train.py:1204] (2/4) Exclude cut with ID _55134_210_21982_1_1534590028377_3882170_50-214427-0_sp1.1 from training. Duration: 0.97275 2023-10-02 14:44:01,754 WARNING [train.py:1204] (2/4) Exclude cut with ID _31649_210_6228_1_1532584608063_4091078_482-301330-0_sp1.1 from training. Duration: 0.97275 2023-10-02 14:44:03,294 WARNING [train.py:1204] (2/4) Exclude cut with ID _35799_210_5854_1_1533263487887_3529071_490-326847-0_sp1.1 from training. Duration: 0.9 2023-10-02 14:44:03,301 WARNING [train.py:1204] (2/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_984-90716-0_sp1.1 from training. Duration: 0.963625 2023-10-02 14:44:10,974 WARNING [train.py:1204] (2/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_166-246557-0 from training. Duration: 0.64 2023-10-02 14:44:10,974 WARNING [train.py:1204] (2/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_51-200220-0_sp1.1 from training. Duration: 0.5090625 2023-10-02 14:44:11,051 WARNING [train.py:1204] (2/4) Exclude cut with ID _51382_210_23862_1_1534492801838_3650104_275-92796-0_sp1.1 from training. Duration: 0.936375 2023-10-02 14:44:11,171 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.1.conv_skip_rate, batch_count=916153.3333333334, ans=0.0 2023-10-02 14:44:15,210 WARNING [train.py:1204] (2/4) Exclude cut with ID _29853_210_7497_1_1532505600851_6642110_461-142829-0_sp1.1 from training. Duration: 0.97275 2023-10-02 14:44:15,254 WARNING [train.py:1204] (2/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_788-298907-0_sp0.9 from training. Duration: 0.988875 2023-10-02 14:44:18,057 WARNING [train.py:1204] (2/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_739-2098-0_sp1.1 from training. Duration: 0.836375 2023-10-02 14:44:20,798 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7543_210_6753_1_1528541907_1399322_165-323619-0 from training. Duration: 0.69 2023-10-02 14:44:20,951 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.conv_skip_rate, batch_count=916220.0, ans=0.0 2023-10-02 14:44:22,144 WARNING [train.py:1204] (2/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_401-255491-0_sp1.1 from training. Duration: 0.636375 2023-10-02 14:44:22,444 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.self_attn_weights.pos_emb_skip_rate, batch_count=916220.0, ans=0.0 2023-10-02 14:44:26,839 WARNING [train.py:1204] (2/4) Exclude cut with ID _66873_210_5517_1_1535454136506_4293370_24-143283-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 14:44:28,209 WARNING [train.py:1204] (2/4) Exclude cut with ID _14459_210_2228_1_1531443641946_7516700_1221-280439-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 14:44:30,899 WARNING [train.py:1204] (2/4) Exclude cut with ID _59202_210_26984_1_1534816810612_3549174_469-213482-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 14:44:30,900 WARNING [train.py:1204] (2/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_148-158509-0_sp0.9 from training. Duration: 0.4555625 2023-10-02 14:44:31,542 WARNING [train.py:1204] (2/4) Exclude cut with ID _24314_210_4915_1_1532135078116_7170179_863-259095-0_sp1.1 from training. Duration: 0.97275 2023-10-02 14:44:32,069 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.5.encoder.layers.0.self_attn2.whiten, num_groups=1, num_channels=256, metric=12.34 vs. limit=22.5 2023-10-02 14:44:32,852 WARNING [train.py:1204] (2/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_321-22821-0 from training. Duration: 0.89 2023-10-02 14:44:32,888 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_43831_210_2158_1_1533725502_5872791_55-163137-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 14:44:32,925 WARNING [train.py:1204] (2/4) Exclude cut with ID _27430_210_7770_1_1532604689375_1378590_26-25207-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 14:44:34,721 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_17613_210_2151_1_1531137569_2366160_223-316461-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 14:44:34,797 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_53679_210_4852_1_1534556726_4186141_541-213370-0_sp1.1 from training. Duration: 0.963625 2023-10-02 14:44:34,861 WARNING [train.py:1204] (2/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_369-295086-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 14:44:36,294 WARNING [train.py:1204] (2/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_91-267696-0 from training. Duration: 0.87 2023-10-02 14:44:36,343 WARNING [train.py:1204] (2/4) Exclude cut with ID _5294_210_1747_1_1527249643928_3696179_180-100713-0 from training. Duration: 0.81 2023-10-02 14:44:37,541 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.526e+02 1.833e+02 1.993e+02 2.244e+02 3.849e+02, threshold=3.987e+02, percent-clipped=0.0 2023-10-02 14:44:37,631 WARNING [train.py:1204] (2/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_32-335103-0 from training. Duration: 0.82 2023-10-02 14:44:37,637 WARNING [train.py:1204] (2/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_133-312727-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 14:44:39,043 WARNING [train.py:1204] (2/4) Exclude cut with ID _59288_210_5854_1_1535077757774_3412246_391-165934-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 14:44:39,085 WARNING [train.py:1204] (2/4) Exclude cut with ID _28323_210_16187_1_1532480395667_4100300_488-1281-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 14:44:40,389 WARNING [train.py:1204] (2/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_124-193207-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 14:44:48,232 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.conv_module2.balancer1.prob, batch_count=916286.6666666666, ans=0.125 2023-10-02 14:44:49,417 WARNING [train.py:1204] (2/4) Exclude cut with ID _14087_210_2253_1_1530529224512_3014029_226-242140-0_sp0.9 from training. Duration: 0.9 2023-10-02 14:44:50,683 INFO [train.py:1046] (2/4) Epoch 26, batch 4650, loss[loss=0.1651, simple_loss=0.2356, pruned_loss=0.04726, over 23562.00 frames. ], tot_loss[loss=0.1679, simple_loss=0.2453, pruned_loss=0.04529, over 4714868.34 frames. ], batch size: 256, lr: 3.91e-03, grad_scale: 8.0 2023-10-02 14:44:52,569 WARNING [train.py:1204] (2/4) Exclude cut with ID _29277_210_3279_1_1532581109293_3173069_392-3694-0_sp1.1 from training. Duration: 0.936375 2023-10-02 14:44:52,609 WARNING [train.py:1204] (2/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_563-329842-0_sp1.1 from training. Duration: 0.97275 2023-10-02 14:44:53,992 WARNING [train.py:1204] (2/4) Exclude cut with ID _58583_210_5236_1_1534726835266_3574249_391-87755-0_sp1.1 from training. Duration: 0.92725 2023-10-02 14:44:54,017 WARNING [train.py:1204] (2/4) Exclude cut with ID _28427_210_4220_1_1532581240967_3061069_281-237827-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 14:44:54,031 WARNING [train.py:1204] (2/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_44-147898-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 14:44:55,348 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_24007_210_1794_1_1531918925_1663875_125-320772-0_sp1.1 from training. Duration: 0.97275 2023-10-02 14:44:55,796 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.5.encoder.layers.1.feed_forward3.out_whiten, num_groups=1, num_channels=256, metric=9.41 vs. limit=15.0 2023-10-02 14:44:58,131 WARNING [train.py:1204] (2/4) Exclude cut with ID _14368_210_4220_1_1530532714870_3595529_143-117421-0 from training. Duration: 0.58 2023-10-02 14:45:00,433 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.feed_forward3.hidden_balancer.prob, batch_count=916353.3333333334, ans=0.125 2023-10-02 14:45:01,392 WARNING [train.py:1204] (2/4) Exclude cut with ID _69584_210_5219_1_1535785133775_4044450_325-7656-0_sp0.9 from training. Duration: 0.9555625 2023-10-02 14:45:03,164 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_37351_210_7925_1_1533012931_3812113_265-85780-0 from training. Duration: 0.88 2023-10-02 14:45:03,201 WARNING [train.py:1204] (2/4) Exclude cut with ID _18516_210_9206_1_1531704653206_3692406_331-237896-0_sp1.1 from training. Duration: 0.936375 2023-10-02 14:45:04,521 WARNING [train.py:1204] (2/4) Exclude cut with ID _14629_210_15709_1_1530604298182_956899_6-232695-0 from training. Duration: 0.94 2023-10-02 14:45:04,540 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7544_210_6753_1_1528587000_691468_87-178599-0_sp1.1 from training. Duration: 0.7909375 2023-10-02 14:45:04,587 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_326-303945-0 from training. Duration: 0.99 2023-10-02 14:45:04,606 WARNING [train.py:1204] (2/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_503-30719-0 from training. Duration: 0.98 2023-10-02 14:45:04,613 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_11305_210_1794_1_1529750428_7178148_299-344640-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 14:45:05,956 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_53071_210_16554_1_1534490405_2954066_265-170472-0_sp1.1 from training. Duration: 0.7545625 2023-10-02 14:45:10,016 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_174-277364-0_sp1.1 from training. Duration: 0.6454375 2023-10-02 14:45:11,822 WARNING [train.py:1204] (2/4) Exclude cut with ID _58414_210_8254_1_1535421609162_7151182_193-151884-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 14:45:11,848 WARNING [train.py:1204] (2/4) Exclude cut with ID IC0651W0104-322063 from training. Duration: 0.768 2023-10-02 14:45:14,674 WARNING [train.py:1204] (2/4) Exclude cut with ID _21881_210_3525_1_1531825242757_3798380_139-305982-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 14:45:14,775 WARNING [train.py:1204] (2/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_79-200392-0 from training. Duration: 0.97 2023-10-02 14:45:17,554 WARNING [train.py:1204] (2/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_451-280894-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 14:45:17,561 WARNING [train.py:1204] (2/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_304-273433-0_sp1.1 from training. Duration: 0.9 2023-10-02 14:45:17,624 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_5959_210_1794_1_1527400400_3445726_412-44052-0 from training. Duration: 0.77 2023-10-02 14:45:18,995 WARNING [train.py:1204] (2/4) Exclude cut with ID _4681_210_3525_1_1527251040321_1235713_148-26159-0_sp1.1 from training. Duration: 0.963625 2023-10-02 14:45:21,751 WARNING [train.py:1204] (2/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_887-249555-0_sp0.9 from training. Duration: 0.8555625 2023-10-02 14:45:25,004 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_25236_210_2158_1_1532051968_280567_37-6860-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 14:45:31,012 WARNING [train.py:1204] (2/4) Exclude cut with ID _29277_210_3279_1_1532581109293_3173069_417-115527-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 14:45:33,831 WARNING [train.py:1204] (2/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_552-134748-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 14:45:35,249 WARNING [train.py:1204] (2/4) Exclude cut with ID _43176_210_8254_1_1533524417469_4174339_188-174143-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 14:45:35,289 WARNING [train.py:1204] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_127-119109-0_sp1.1 from training. Duration: 0.6545625 2023-10-02 14:45:39,867 WARNING [train.py:1204] (2/4) Exclude cut with ID _14922_210_6228_1_1530707435273_3693310_38-187851-0 from training. Duration: 0.62 2023-10-02 14:45:39,898 WARNING [train.py:1204] (2/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_588-125165-0 from training. Duration: 0.83 2023-10-02 14:45:39,966 WARNING [train.py:1204] (2/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_344-152526-0_sp1.1 from training. Duration: 0.4818125 2023-10-02 14:45:39,974 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7002_210_1794_1_1527935634_4034444_487-236501-0 from training. Duration: 0.66 2023-10-02 14:45:41,430 WARNING [train.py:1204] (2/4) Exclude cut with ID _23825_210_13449_1_1531915313526_3497150_379-16981-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 14:45:41,678 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.self_attn_weights.pos_emb_skip_rate, batch_count=916553.3333333334, ans=0.0 2023-10-02 14:45:49,863 WARNING [train.py:1204] (2/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_297-5233-0_sp1.1 from training. Duration: 0.863625 2023-10-02 14:45:49,870 WARNING [train.py:1204] (2/4) Exclude cut with ID _41298_210_9019_1_1533430371450_4822143_178-283538-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 14:45:49,904 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7046_210_6753_1_1527895337_6920917_642-106799-0 from training. Duration: 0.92 2023-10-02 14:45:49,916 WARNING [train.py:1204] (2/4) Exclude cut with ID _6274_210_2084_1_1527937583837_7494020_532-291952-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 14:45:51,371 WARNING [train.py:1204] (2/4) Exclude cut with ID _47278_210_12041_1_1533902418349_3576110_260-270468-0_sp1.1 from training. Duration: 0.92725 2023-10-02 14:45:51,377 WARNING [train.py:1204] (2/4) Exclude cut with ID _14368_210_4220_1_1530532714870_3595529_144-3287-0_sp0.9 from training. Duration: 0.7444375 2023-10-02 14:45:52,906 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_162-262717-0_sp0.9 from training. Duration: 0.92225 2023-10-02 14:45:53,682 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.2.encoder.layers.1.whiten, num_groups=1, num_channels=384, metric=4.55 vs. limit=12.0 2023-10-02 14:45:55,621 WARNING [train.py:1204] (2/4) Exclude cut with ID _3465_210_3525_1_1526175667060_3574049_62-335552-0_sp0.9 from training. Duration: 0.9666875 2023-10-02 14:45:55,636 WARNING [train.py:1204] (2/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_726-272852-0_sp1.1 from training. Duration: 0.92725 2023-10-02 14:45:55,706 WARNING [train.py:1204] (2/4) Exclude cut with ID _14898_210_9206_1_1530766812377_4177190_26-135492-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 14:45:58,911 WARNING [train.py:1204] (2/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_200-95304-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 14:45:58,949 WARNING [train.py:1204] (2/4) Exclude cut with ID _45012_210_12651_1_1533608435136_5371380_240-37101-0_sp1.1 from training. Duration: 0.8454375 2023-10-02 14:45:58,965 WARNING [train.py:1204] (2/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_132-269633-0_sp0.9 from training. Duration: 0.7555625 2023-10-02 14:46:02,171 WARNING [train.py:1204] (2/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_907-151398-0_sp0.9 from training. Duration: 0.411125 2023-10-02 14:46:02,256 WARNING [train.py:1204] (2/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_45-265684-0_sp0.9 from training. Duration: 0.8 2023-10-02 14:46:03,523 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6291_210_3273_1_1527683649_1078498_74-292608-0 from training. Duration: 0.72 2023-10-02 14:46:04,807 INFO [train.py:1046] (2/4) Epoch 26, batch 4700, loss[loss=0.1514, simple_loss=0.2362, pruned_loss=0.03334, over 24471.00 frames. ], tot_loss[loss=0.1685, simple_loss=0.2457, pruned_loss=0.04563, over 4707609.24 frames. ], batch size: 63, lr: 3.91e-03, grad_scale: 8.0 2023-10-02 14:46:10,948 WARNING [train.py:1204] (2/4) Exclude cut with ID _59202_210_26984_1_1534816810612_3549174_221-84819-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 14:46:12,216 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_33696_210_3852_1_1533082780_2663375_181-325595-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 14:46:13,578 WARNING [train.py:1204] (2/4) Exclude cut with ID _47251_210_18628_1_1533814063128_4137250_351-154105-0_sp1.1 from training. Duration: 0.963625 2023-10-02 14:46:15,464 WARNING [train.py:1204] (2/4) Exclude cut with ID _35501_210_9288_1_1532998507986_3192189_351-334740-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 14:46:16,832 WARNING [train.py:1204] (2/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_342-100157-0_sp1.1 from training. Duration: 0.4909375 2023-10-02 14:46:17,042 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.conv_module1.balancer1.prob, batch_count=916686.6666666666, ans=0.125 2023-10-02 14:46:21,140 WARNING [train.py:1204] (2/4) Exclude cut with ID _6789_210_1534_1_1527857937218_3118950_262-48853-0 from training. Duration: 0.56 2023-10-02 14:46:22,398 WARNING [train.py:1204] (2/4) Exclude cut with ID _69884_210_9771_1_1535846392818_4166169_430-201020-0 from training. Duration: 0.97 2023-10-02 14:46:25,630 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_71-136518-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 14:46:25,747 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.0.balancer2.prob, batch_count=916753.3333333334, ans=0.125 2023-10-02 14:46:27,006 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_28-291982-0_sp1.1 from training. Duration: 0.8818125 2023-10-02 14:46:27,038 WARNING [train.py:1204] (2/4) Exclude cut with ID _54218_210_12210_1_1534413646388_4409259_437-275326-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 14:46:27,825 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.2.encoder.layers.2.feed_forward1.out_whiten, num_groups=1, num_channels=384, metric=7.19 vs. limit=15.0 2023-10-02 14:46:29,865 WARNING [train.py:1204] (2/4) Exclude cut with ID _55134_210_21982_1_1534590028377_3882170_40-309139-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 14:46:34,114 WARNING [train.py:1204] (2/4) Exclude cut with ID _32704_210_8954_1_1533017488840_6747370_266-329755-0_sp1.1 from training. Duration: 0.8090625 2023-10-02 14:46:36,079 WARNING [train.py:1204] (2/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_21-174514-0_sp1.1 from training. Duration: 0.3909375 2023-10-02 14:46:38,836 WARNING [train.py:1204] (2/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_409-207378-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 14:46:43,660 WARNING [train.py:1204] (2/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_132-269633-0 from training. Duration: 0.68 2023-10-02 14:46:45,045 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_2245_210_2715_1_1525511548_3540254_310-52483-0_sp0.9 from training. Duration: 0.97775 2023-10-02 14:46:46,486 WARNING [train.py:1204] (2/4) Exclude cut with ID _27430_210_7770_1_1532604689375_1378590_47-77403-0_sp1.1 from training. Duration: 0.92725 2023-10-02 14:46:52,212 WARNING [train.py:1204] (2/4) Exclude cut with ID _7562_210_2084_1_1528887767916_3946229_186-326422-0 from training. Duration: 0.84 2023-10-02 14:46:53,558 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_14056_210_4163_1_1530604483_7572827_230-35993-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 14:46:55,138 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.bypass.skip_rate, batch_count=916886.6666666666, ans=0.09899494936611666 2023-10-02 14:46:59,671 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_23854_210_1794_1_1531969696_1329157_57-123491-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 14:46:59,720 WARNING [train.py:1204] (2/4) Exclude cut with ID _14747_210_8341_1_1530685797463_3654089_147-131229-0 from training. Duration: 0.76 2023-10-02 14:47:02,396 WARNING [train.py:1204] (2/4) Exclude cut with ID _30191_210_1664_1_1533189915047_7196670_430-19539-0_sp1.1 from training. Duration: 0.92725 2023-10-02 14:47:02,410 WARNING [train.py:1204] (2/4) Exclude cut with ID _67725_210_27767_1_1535608756671_5105359_44-50762-0_sp1.1 from training. Duration: 0.963625 2023-10-02 14:47:03,998 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.0.feed_forward1.hidden_balancer.prob, batch_count=916953.3333333334, ans=0.125 2023-10-02 14:47:05,236 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.477e+02 1.831e+02 2.029e+02 2.263e+02 3.134e+02, threshold=4.059e+02, percent-clipped=0.0 2023-10-02 14:47:05,321 WARNING [train.py:1204] (2/4) Exclude cut with ID _27485_210_4916_1_1532739191677_3668269_217-161755-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 14:47:05,353 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_5931_210_2040_1_1527428481_4547999_321-193614-0_sp1.1 from training. Duration: 0.6090625 2023-10-02 14:47:05,375 WARNING [train.py:1204] (2/4) Exclude cut with ID _6267_210_2084_1_1527764658526_3894730_375-219887-0 from training. Duration: 0.67 2023-10-02 14:47:06,737 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9087W0022-538393 from training. Duration: 0.768 2023-10-02 14:47:08,615 WARNING [train.py:1204] (2/4) Exclude cut with ID _68777_210_4353_1_1535810390731_3117904_83-196459-0_sp1.1 from training. Duration: 0.963625 2023-10-02 14:47:08,759 WARNING [train.py:1204] (2/4) Exclude cut with ID _69884_210_9771_1_1535846392818_4166169_455-343812-0_sp1.1 from training. Duration: 0.92725 2023-10-02 14:47:08,760 WARNING [train.py:1204] (2/4) Exclude cut with ID _13916_210_9994_1_1530788341126_3763149_414-320174-0_sp1.1 from training. Duration: 0.92725 2023-10-02 14:47:08,764 WARNING [train.py:1204] (2/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_398-239819-0 from training. Duration: 0.99 2023-10-02 14:47:10,087 WARNING [train.py:1204] (2/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_199-232314-0_sp1.1 from training. Duration: 0.92725 2023-10-02 14:47:14,754 WARNING [train.py:1204] (2/4) Exclude cut with ID _2334_210_2667_1_1525608238880_4705129_459-104997-0 from training. Duration: 0.95 2023-10-02 14:47:17,405 WARNING [train.py:1204] (2/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_441-209395-0_sp1.1 from training. Duration: 0.936375 2023-10-02 14:47:18,753 INFO [train.py:1046] (2/4) Epoch 26, batch 4750, loss[loss=0.1577, simple_loss=0.2326, pruned_loss=0.04143, over 23525.00 frames. ], tot_loss[loss=0.1685, simple_loss=0.246, pruned_loss=0.04548, over 4716233.84 frames. ], batch size: 119, lr: 3.91e-03, grad_scale: 8.0 2023-10-02 14:47:18,851 WARNING [train.py:1204] (2/4) Exclude cut with ID _27052_210_5986_1_1532329197187_3585009_320-80393-0_sp1.1 from training. Duration: 0.97275 2023-10-02 14:47:21,827 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.conv_module2.balancer2.prob, batch_count=917020.0, ans=0.125 2023-10-02 14:47:24,617 WARNING [train.py:1204] (2/4) Exclude cut with ID _21783_210_11357_1_1531742458730_3762679_89-217253-0_sp1.1 from training. Duration: 0.97275 2023-10-02 14:47:24,632 WARNING [train.py:1204] (2/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_453-282588-0_sp1.1 from training. Duration: 0.8909375 2023-10-02 14:47:26,069 WARNING [train.py:1204] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_88-341718-0 from training. Duration: 0.66 2023-10-02 14:47:26,111 WARNING [train.py:1204] (2/4) Exclude cut with ID _43552_210_8913_1_1533726031059_3523429_230-3521-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 14:47:29,001 WARNING [train.py:1204] (2/4) Exclude cut with ID _9679_210_7497_1_1529129212906_4363929_512-313889-0 from training. Duration: 0.81 2023-10-02 14:47:30,851 WARNING [train.py:1204] (2/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_194-252800-0_sp1.1 from training. Duration: 0.8818125 2023-10-02 14:47:30,871 WARNING [train.py:1204] (2/4) Exclude cut with ID _55209_210_1531_1_1534675828996_4597160_239-293541-0_sp1.1 from training. Duration: 0.963625 2023-10-02 14:47:32,191 WARNING [train.py:1204] (2/4) Exclude cut with ID _35799_210_5854_1_1533263487887_3529071_15-325131-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 14:47:36,564 WARNING [train.py:1204] (2/4) Exclude cut with ID _14100_210_8341_1_1530529320613_3678069_279-55760-0 from training. Duration: 0.93 2023-10-02 14:47:42,290 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_489-230703-0_sp1.1 from training. Duration: 0.77275 2023-10-02 14:47:45,606 WARNING [train.py:1204] (2/4) Exclude cut with ID _6694_210_4599_1_1527852637490_3965529_144-185051-0 from training. Duration: 0.96 2023-10-02 14:47:45,671 WARNING [train.py:1204] (2/4) Exclude cut with ID _28461_210_2253_1_1532771775189_3041299_1-228155-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 14:47:49,808 WARNING [train.py:1204] (2/4) Exclude cut with ID _32704_210_8954_1_1533017488840_6747370_686-252323-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 14:47:49,818 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_40896_210_15710_1_1533433956_7100802_1075-294633-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 14:47:49,830 WARNING [train.py:1204] (2/4) Exclude cut with ID _22083_210_8254_1_1531702862439_3678970_30-92879-0_sp1.1 from training. Duration: 0.97275 2023-10-02 14:47:49,906 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9245W0121-616888 from training. Duration: 0.896 2023-10-02 14:47:49,909 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6786_210_1388_1_1527839699_4194495_378-46070-0 from training. Duration: 0.72 2023-10-02 14:47:55,436 WARNING [train.py:1204] (2/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_317-175062-0 from training. Duration: 0.82 2023-10-02 14:47:58,844 WARNING [train.py:1204] (2/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_238-89390-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 14:48:00,217 WARNING [train.py:1204] (2/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_524-310811-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 14:48:03,528 WARNING [train.py:1204] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_307-158462-0_sp0.9 from training. Duration: 0.7666875 2023-10-02 14:48:03,529 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9309W0191-640318 from training. Duration: 0.512 2023-10-02 14:48:03,533 WARNING [train.py:1204] (2/4) Exclude cut with ID _55153_210_2253_1_1534384786677_3162096_100-204617-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 14:48:05,005 WARNING [train.py:1204] (2/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_112-149213-0_sp1.1 from training. Duration: 0.863625 2023-10-02 14:48:07,758 WARNING [train.py:1204] (2/4) Exclude cut with ID _67831_210_12392_1_1535612464202_3092612_346-2148-0_sp1.1 from training. Duration: 0.8454375 2023-10-02 14:48:08,007 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.feed_forward3.hidden_balancer.prob, batch_count=917220.0, ans=0.125 2023-10-02 14:48:09,236 WARNING [train.py:1204] (2/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_51-117576-0_sp0.9 from training. Duration: 0.4444375 2023-10-02 14:48:10,467 WARNING [train.py:1204] (2/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_300-27235-0 from training. Duration: 0.87 2023-10-02 14:48:10,514 WARNING [train.py:1204] (2/4) Exclude cut with ID _40399_210_4353_1_1533305658194_3279406_195-116825-0_sp1.1 from training. Duration: 0.97275 2023-10-02 14:48:10,542 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7002_210_1794_1_1527939683_5157286_163-87552-0_sp0.9 from training. Duration: 0.9555625 2023-10-02 14:48:11,181 WARNING [train.py:1204] (2/4) Exclude cut with ID _19514_210_9259_1_1531530071330_5084230_560-237597-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 14:48:12,406 WARNING [train.py:1204] (2/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_41-234353-0_sp1.1 from training. Duration: 0.6181875 2023-10-02 14:48:12,437 WARNING [train.py:1204] (2/4) Exclude cut with ID _6274_210_2084_1_1527937583837_7494020_769-168302-0 from training. Duration: 0.71 2023-10-02 14:48:15,160 WARNING [train.py:1204] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_574-45541-0 from training. Duration: 0.65 2023-10-02 14:48:18,360 WARNING [train.py:1204] (2/4) Exclude cut with ID _50310_210_15488_1_1534068043345_3641080_262-67120-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 14:48:19,859 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_53071_210_16554_1_1534493434_1276796_35-11927-0_sp1.1 from training. Duration: 0.963625 2023-10-02 14:48:19,860 WARNING [train.py:1204] (2/4) Exclude cut with ID _3992_210_1747_1_1526644816031_3731060_107-251434-0 from training. Duration: 0.99 2023-10-02 14:48:21,170 WARNING [train.py:1204] (2/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_412-54488-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 14:48:22,599 WARNING [train.py:1204] (2/4) Exclude cut with ID _18516_210_9206_1_1531704653206_3692406_77-258490-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 14:48:24,001 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_40047_210_2290_1_1533290519_2374311_195-165693-0_sp0.9 from training. Duration: 0.988875 2023-10-02 14:48:24,052 WARNING [train.py:1204] (2/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_296-36971-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 14:48:24,118 WARNING [train.py:1204] (2/4) Exclude cut with ID _14970_210_15709_1_1530773543493_1715730_187-20259-0_sp1.1 from training. Duration: 0.8090625 2023-10-02 14:48:28,830 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_53373_210_5399_1_1534229471_4715695_135-305927-0_sp1.1 from training. Duration: 0.936375 2023-10-02 14:48:28,868 WARNING [train.py:1204] (2/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_60-18123-0 from training. Duration: 0.88 2023-10-02 14:48:28,933 WARNING [train.py:1204] (2/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_166-166990-0 from training. Duration: 0.96 2023-10-02 14:48:30,740 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_5438_210_1794_1_1527336687_2600009_233-58072-0 from training. Duration: 0.77 2023-10-02 14:48:33,386 INFO [train.py:1046] (2/4) Epoch 26, batch 4800, loss[loss=0.1827, simple_loss=0.263, pruned_loss=0.05124, over 23705.00 frames. ], tot_loss[loss=0.1695, simple_loss=0.2472, pruned_loss=0.0459, over 4720010.01 frames. ], batch size: 85, lr: 3.91e-03, grad_scale: 16.0 2023-10-02 14:48:34,766 WARNING [train.py:1204] (2/4) Exclude cut with ID _8833_210_5219_1_1528592665210_7553059_611-80313-0_sp0.9 from training. Duration: 0.711125 2023-10-02 14:48:34,801 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_53555_210_5632_1_1534595220_7232297_118-43239-0_sp1.1 from training. Duration: 0.936375 2023-10-02 14:48:36,158 WARNING [train.py:1204] (2/4) Exclude cut with ID _12369_210_4381_1_1530356078581_4388520_480-13146-0 from training. Duration: 0.85 2023-10-02 14:48:43,163 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_892-28814-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 14:48:43,217 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_24899_210_16554_1_1532046422_3827462_160-21255-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 14:48:49,225 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_154-316450-0_sp1.1 from training. Duration: 0.6909375 2023-10-02 14:48:51,202 WARNING [train.py:1204] (2/4) Exclude cut with ID _30196_210_1664_1_1533708496522_7074140_1225-301232-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 14:48:51,227 WARNING [train.py:1204] (2/4) Exclude cut with ID _28783_210_7943_1_1532671164808_7155479_414-208100-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 14:48:52,598 WARNING [train.py:1204] (2/4) Exclude cut with ID _19023_210_4353_1_1531378892552_5820780_83-254794-0 from training. Duration: 0.86 2023-10-02 14:48:52,777 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.conv_module2.balancer1.max_abs, batch_count=917420.0, ans=10.0 2023-10-02 14:48:53,925 WARNING [train.py:1204] (2/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_591-83752-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 14:48:53,991 WARNING [train.py:1204] (2/4) Exclude cut with ID _29877_210_6228_1_1532397791106_5276149_266-62291-0_sp1.1 from training. Duration: 0.7909375 2023-10-02 14:48:55,359 WARNING [train.py:1204] (2/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_280-161633-0_sp1.1 from training. Duration: 0.663625 2023-10-02 14:48:57,022 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.self_attn_weights.pos_emb_skip_rate, batch_count=917420.0, ans=0.0 2023-10-02 14:48:58,179 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7543_210_6753_1_1528539468_333764_39-156634-0_sp1.1 from training. Duration: 0.97275 2023-10-02 14:49:01,489 WARNING [train.py:1204] (2/4) Exclude cut with ID _67499_210_17282_1_1535509805220_3692430_505-312248-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 14:49:01,511 WARNING [train.py:1204] (2/4) Exclude cut with ID _5294_210_1747_1_1527249643928_3696179_180-100713-0_sp0.9 from training. Duration: 0.9 2023-10-02 14:49:02,896 WARNING [train.py:1204] (2/4) Exclude cut with ID _26528_210_9070_1_1532570410842_3851310_508-148132-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 14:49:02,906 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_211-339622-0_sp1.1 from training. Duration: 0.4090625 2023-10-02 14:49:02,927 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_40896_210_15710_1_1533433956_7100802_339-138356-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 14:49:03,002 WARNING [train.py:1204] (2/4) Exclude cut with ID _27060_210_5986_1_1532674838922_3522549_211-19933-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 14:49:04,285 WARNING [train.py:1204] (2/4) Exclude cut with ID _60229_210_27767_1_1534838406158_3655350_457-117923-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 14:49:04,440 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.conv_module1.balancer2.min_abs, batch_count=917486.6666666666, ans=0.5 2023-10-02 14:49:07,642 WARNING [train.py:1204] (2/4) Exclude cut with ID _55429_210_10626_1_1534467588527_7174143_460-141439-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 14:49:07,756 WARNING [train.py:1204] (2/4) Exclude cut with ID _59170_210_6721_1_1534983633960_4203335_263-10780-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 14:49:07,768 WARNING [train.py:1204] (2/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_872-264525-0_sp0.9 from training. Duration: 0.72225 2023-10-02 14:49:10,427 WARNING [train.py:1204] (2/4) Exclude cut with ID _7624_210_1531_1_1528109802198_4933101_418-349834-0_sp0.9 from training. Duration: 0.6666875 2023-10-02 14:49:10,528 WARNING [train.py:1204] (2/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_27-53998-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 14:49:11,424 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.1.encoder.layers.0.self_attn1.whiten, num_groups=1, num_channels=256, metric=13.51 vs. limit=22.5 2023-10-02 14:49:13,143 WARNING [train.py:1204] (2/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_202-183922-0 from training. Duration: 0.85 2023-10-02 14:49:13,164 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7002_210_1794_1_1527939683_5157286_1-281264-0 from training. Duration: 0.44 2023-10-02 14:49:14,973 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_53753_210_7234_1_1534320003_3594148_493-9314-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 14:49:14,992 WARNING [train.py:1204] (2/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_217-95056-0_sp1.1 from training. Duration: 0.8909375 2023-10-02 14:49:15,031 WARNING [train.py:1204] (2/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_406-45086-0_sp1.1 from training. Duration: 0.72725 2023-10-02 14:49:15,037 WARNING [train.py:1204] (2/4) Exclude cut with ID _31649_210_6228_1_1532584608063_4091078_505-213821-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 14:49:16,721 WARNING [train.py:1204] (2/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_317-175062-0_sp0.9 from training. Duration: 0.911125 2023-10-02 14:49:18,141 WARNING [train.py:1204] (2/4) Exclude cut with ID _5444_210_2151_1_1527418808464_5312210_726-27165-0_sp1.1 from training. Duration: 0.6090625 2023-10-02 14:49:19,465 WARNING [train.py:1204] (2/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_494-170437-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 14:49:22,486 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_474-64054-0_sp1.1 from training. Duration: 0.936375 2023-10-02 14:49:23,972 WARNING [train.py:1204] (2/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_521-7556-0_sp1.1 from training. Duration: 0.97275 2023-10-02 14:49:26,805 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_269-348505-0_sp1.1 from training. Duration: 0.92725 2023-10-02 14:49:31,019 WARNING [train.py:1204] (2/4) Exclude cut with ID _21878_210_3525_1_1531738772002_7264330_559-184862-0 from training. Duration: 0.94 2023-10-02 14:49:31,062 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_68702_210_23689_1_1535799577_3420068_92-76040-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 14:49:33,026 WARNING [train.py:1204] (2/4) Exclude cut with ID _22826_210_9994_1_1531908201522_3279169_227-102052-0_sp1.1 from training. Duration: 0.97275 2023-10-02 14:49:33,067 WARNING [train.py:1204] (2/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_718-250907-0_sp0.9 from training. Duration: 0.7666875 2023-10-02 14:49:34,319 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.523e+02 1.969e+02 2.180e+02 2.462e+02 3.461e+02, threshold=4.360e+02, percent-clipped=0.0 2023-10-02 14:49:34,404 WARNING [train.py:1204] (2/4) Exclude cut with ID _14069_210_5632_1_1530512692033_4803390_352-143891-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 14:49:34,658 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.self_attn_weights.pos_emb_skip_rate, batch_count=917620.0, ans=0.0 2023-10-02 14:49:37,567 WARNING [train.py:1204] (2/4) Exclude cut with ID _6267_210_2084_1_1527764658526_3894730_65-106602-0_sp1.1 from training. Duration: 0.936375 2023-10-02 14:49:37,620 WARNING [train.py:1204] (2/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_202-183922-0_sp0.9 from training. Duration: 0.9444375 2023-10-02 14:49:38,932 WARNING [train.py:1204] (2/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_465-268827-0_sp1.1 from training. Duration: 0.97275 2023-10-02 14:49:39,001 WARNING [train.py:1204] (2/4) Exclude cut with ID _12369_210_4381_1_1530356078581_4388520_58-29064-0_sp1.1 from training. Duration: 0.9 2023-10-02 14:49:40,310 WARNING [train.py:1204] (2/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_1-99680-0_sp0.9 from training. Duration: 0.7444375 2023-10-02 14:49:40,378 WARNING [train.py:1204] (2/4) Exclude cut with ID _13253_210_1925_1_1530757775819_3577873_191-144977-0_sp1.1 from training. Duration: 0.7090625 2023-10-02 14:49:44,392 WARNING [train.py:1204] (2/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_276-112684-0_sp1.1 from training. Duration: 0.92725 2023-10-02 14:49:44,408 WARNING [train.py:1204] (2/4) Exclude cut with ID _64639_210_5816_1_1535367619437_3528000_358-411-0_sp1.1 from training. Duration: 0.97275 2023-10-02 14:49:44,414 WARNING [train.py:1204] (2/4) Exclude cut with ID _31649_210_6228_1_1532584608063_4091078_106-21199-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 14:49:46,415 WARNING [train.py:1204] (2/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_901-79438-0 from training. Duration: 0.94 2023-10-02 14:49:48,374 INFO [train.py:1046] (2/4) Epoch 26, batch 4850, loss[loss=0.1675, simple_loss=0.252, pruned_loss=0.0415, over 23357.00 frames. ], tot_loss[loss=0.1695, simple_loss=0.2474, pruned_loss=0.04585, over 4716558.10 frames. ], batch size: 93, lr: 3.91e-03, grad_scale: 16.0 2023-10-02 14:49:48,470 WARNING [train.py:1204] (2/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_718-250907-0 from training. Duration: 0.69 2023-10-02 14:49:48,475 WARNING [train.py:1204] (2/4) Exclude cut with ID _5287_210_2084_1_1527418883040_3878670_363-194161-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 14:49:48,478 WARNING [train.py:1204] (2/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_586-321321-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 14:49:48,544 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_68900_210_2087_1_1535713157_3708529_164-292661-0_sp1.1 from training. Duration: 0.963625 2023-10-02 14:49:48,545 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_26433_210_12108_1_1532157158_4056480_114-144878-0_sp1.1 from training. Duration: 0.97275 2023-10-02 14:49:51,353 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_15517_210_6753_1_1530954008_3487336_96-341874-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 14:49:58,393 WARNING [train.py:1204] (2/4) Exclude cut with ID _14268_210_9206_1_1530622771285_4714099_637-86630-0 from training. Duration: 0.51 2023-10-02 14:49:58,482 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_28211_210_6388_1_1532861506_4523224_121-320092-0_sp1.1 from training. Duration: 0.92725 2023-10-02 14:50:01,908 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_141-89113-0_sp0.9 from training. Duration: 0.9555625 2023-10-02 14:50:03,296 WARNING [train.py:1204] (2/4) Exclude cut with ID _6789_210_1534_1_1527857937218_3118950_311-61262-0_sp1.1 from training. Duration: 0.5909375 2023-10-02 14:50:03,322 WARNING [train.py:1204] (2/4) Exclude cut with ID _58480_210_12392_1_1534811121111_3646160_389-200461-0_sp1.1 from training. Duration: 0.97275 2023-10-02 14:50:03,945 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.3.encoder.layers.2.feed_forward1.out_whiten, num_groups=1, num_channels=512, metric=9.03 vs. limit=15.0 2023-10-02 14:50:06,576 WARNING [train.py:1204] (2/4) Exclude cut with ID _41326_210_5854_1_1533778076509_3492080_28-156553-0_sp1.1 from training. Duration: 0.92725 2023-10-02 14:50:07,833 WARNING [train.py:1204] (2/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_577-287150-0_sp1.1 from training. Duration: 0.7454375 2023-10-02 14:50:09,226 WARNING [train.py:1204] (2/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_40-129756-0_sp1.1 from training. Duration: 0.736375 2023-10-02 14:50:09,236 WARNING [train.py:1204] (2/4) Exclude cut with ID _60113_210_16271_1_1535164212481_4386079_490-280290-0 from training. Duration: 0.98 2023-10-02 14:50:09,492 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.feed_forward2.hidden_balancer.prob, batch_count=917753.3333333334, ans=0.125 2023-10-02 14:50:13,984 WARNING [train.py:1204] (2/4) Exclude cut with ID _58059_210_5854_1_1534901471149_3164808_34-39905-0_sp1.1 from training. Duration: 0.963625 2023-10-02 14:50:15,443 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_791-197891-0_sp1.1 from training. Duration: 0.763625 2023-10-02 14:50:16,705 WARNING [train.py:1204] (2/4) Exclude cut with ID _6789_210_1534_1_1527857937218_3118950_262-48853-0_sp1.1 from training. Duration: 0.5090625 2023-10-02 14:50:18,071 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_2655_210_2087_1_1525688959_3897400_52-279899-0_sp0.9 from training. Duration: 0.7666875 2023-10-02 14:50:18,075 WARNING [train.py:1204] (2/4) Exclude cut with ID _14884_210_2252_1_1530707459689_5561250_114-201399-0 from training. Duration: 0.76 2023-10-02 14:50:19,684 WARNING [train.py:1204] (2/4) Exclude cut with ID _19023_210_4353_1_1531378892552_5820780_83-254794-0_sp0.9 from training. Duration: 0.9555625 2023-10-02 14:50:19,708 WARNING [train.py:1204] (2/4) Exclude cut with ID _53489_210_6228_1_1534298357169_3835970_10-297919-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 14:50:23,892 WARNING [train.py:1204] (2/4) Exclude cut with ID _44585_210_18628_1_1533605350387_4518199_418-3055-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 14:50:23,920 WARNING [train.py:1204] (2/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_310-109461-0 from training. Duration: 0.8 2023-10-02 14:50:24,164 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.feed_forward1.out_proj.dropout_p, batch_count=917820.0, ans=0.1 2023-10-02 14:50:25,205 WARNING [train.py:1204] (2/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_235-262124-0 from training. Duration: 0.67 2023-10-02 14:50:26,560 WARNING [train.py:1204] (2/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_195-167011-0_sp1.1 from training. Duration: 0.6818125 2023-10-02 14:50:26,794 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.ff3_skip_rate, batch_count=917820.0, ans=0.0 2023-10-02 14:50:32,964 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.2.encoder.layers.2.whiten, num_groups=1, num_channels=384, metric=3.86 vs. limit=12.0 2023-10-02 14:50:33,984 WARNING [train.py:1204] (2/4) Exclude cut with ID _7852_210_5219_1_1528286471316_8139400_188-55162-0_sp1.1 from training. Duration: 0.936375 2023-10-02 14:50:34,043 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_35361_210_4238_1_1533262810_7793476_675-87012-0 from training. Duration: 0.89 2023-10-02 14:50:35,343 WARNING [train.py:1204] (2/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_465-81427-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 14:50:35,349 WARNING [train.py:1204] (2/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_597-154640-0_sp1.1 from training. Duration: 0.8181875 2023-10-02 14:50:37,415 WARNING [train.py:1204] (2/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_186-309161-0_sp0.9 from training. Duration: 0.6 2023-10-02 14:50:38,801 WARNING [train.py:1204] (2/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_546-119312-0 from training. Duration: 0.99 2023-10-02 14:50:38,805 WARNING [train.py:1204] (2/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_330-240060-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 14:50:38,881 WARNING [train.py:1204] (2/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_477-255782-0 from training. Duration: 0.82 2023-10-02 14:50:38,905 WARNING [train.py:1204] (2/4) Exclude cut with ID _14970_210_15709_1_1530773543493_1715730_150-248939-0_sp1.1 from training. Duration: 0.97275 2023-10-02 14:50:40,316 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_556-206615-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 14:50:42,295 WARNING [train.py:1204] (2/4) Exclude cut with ID _3465_210_3525_1_1526175667060_3574049_308-231735-0 from training. Duration: 0.73 2023-10-02 14:50:50,898 WARNING [train.py:1204] (2/4) Exclude cut with ID _27485_210_4916_1_1532739191677_3668269_66-236571-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 14:50:56,243 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_2221_210_1988_1_1525597051_4935541_415-296882-0_sp0.9 from training. Duration: 0.8555625 2023-10-02 14:50:56,250 WARNING [train.py:1204] (2/4) Exclude cut with ID _64521_210_14699_1_1535243427259_3815328_231-78630-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 14:50:59,195 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.feed_forward1.out_proj.dropout_p, batch_count=917953.3333333334, ans=0.1 2023-10-02 14:51:00,360 WARNING [train.py:1204] (2/4) Exclude cut with ID _13942_210_9988_1_1530604906036_3733360_499-55676-0 from training. Duration: 0.62 2023-10-02 14:51:00,363 WARNING [train.py:1204] (2/4) Exclude cut with ID _49376_210_5986_1_1533985278732_3370740_301-154763-0_sp1.1 from training. Duration: 0.8909375 2023-10-02 14:51:01,638 INFO [train.py:1046] (2/4) Epoch 26, batch 4900, loss[loss=0.1621, simple_loss=0.2106, pruned_loss=0.05675, over 19169.00 frames. ], tot_loss[loss=0.1686, simple_loss=0.2453, pruned_loss=0.04593, over 4708560.66 frames. ], batch size: 388, lr: 3.91e-03, grad_scale: 16.0 2023-10-02 14:51:06,419 WARNING [train.py:1204] (2/4) Exclude cut with ID _27485_210_4916_1_1532739191677_3668269_255-216698-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 14:51:08,307 WARNING [train.py:1204] (2/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_137-334143-0_sp1.1 from training. Duration: 0.97275 2023-10-02 14:51:09,672 WARNING [train.py:1204] (2/4) Exclude cut with ID _6456_210_1534_1_1527681634212_3181049_215-309359-0_sp1.1 from training. Duration: 0.8 2023-10-02 14:51:11,241 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=918020.0, ans=0.1 2023-10-02 14:51:13,004 WARNING [train.py:1204] (2/4) Exclude cut with ID _65640_210_6310_1_1535368201445_3173250_209-13008-0 from training. Duration: 0.99 2023-10-02 14:51:16,146 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.bypass.scale_min, batch_count=918086.6666666666, ans=0.2 2023-10-02 14:51:17,699 WARNING [train.py:1204] (2/4) Exclude cut with ID _12369_210_4381_1_1530356078581_4388520_58-29064-0 from training. Duration: 0.99 2023-10-02 14:51:20,610 WARNING [train.py:1204] (2/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_410-287857-0 from training. Duration: 0.93 2023-10-02 14:51:21,867 WARNING [train.py:1204] (2/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_153-153537-0 from training. Duration: 0.91 2023-10-02 14:51:21,905 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7544_210_6753_1_1528587722_3576686_172-171750-0_sp0.9 from training. Duration: 0.72225 2023-10-02 14:51:23,187 WARNING [train.py:1204] (2/4) Exclude cut with ID _64521_210_14699_1_1535243427259_3815328_464-183853-0_sp1.1 from training. Duration: 0.97275 2023-10-02 14:51:23,210 WARNING [train.py:1204] (2/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_385-210308-0_sp1.1 from training. Duration: 0.963625 2023-10-02 14:51:23,225 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_46013_210_7234_1_1533776362_3744032_354-261865-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 14:51:23,231 WARNING [train.py:1204] (2/4) Exclude cut with ID _3067_210_2667_1_1526212974640_4503168_250-285937-0_sp1.1 from training. Duration: 0.72725 2023-10-02 14:51:24,561 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_40036_210_13405_1_1533648647_1315388_48-146016-0 from training. Duration: 0.93 2023-10-02 14:51:27,369 WARNING [train.py:1204] (2/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_280-161633-0 from training. Duration: 0.73 2023-10-02 14:51:28,602 WARNING [train.py:1204] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_308-161067-0_sp0.9 from training. Duration: 0.7333125 2023-10-02 14:51:28,704 WARNING [train.py:1204] (2/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_68-212801-0_sp0.9 from training. Duration: 0.811125 2023-10-02 14:51:29,214 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.4.encoder.layers.1.whiten, num_groups=1, num_channels=384, metric=4.51 vs. limit=12.0 2023-10-02 14:51:29,995 WARNING [train.py:1204] (2/4) Exclude cut with ID _6789_210_1534_1_1527857937218_3118950_311-61262-0_sp0.9 from training. Duration: 0.72225 2023-10-02 14:51:31,490 WARNING [train.py:1204] (2/4) Exclude cut with ID _14632_210_9988_1_1530691279392_3923200_102-204477-0_sp1.1 from training. Duration: 0.8818125 2023-10-02 14:51:32,841 WARNING [train.py:1204] (2/4) Exclude cut with ID _65242_210_18628_1_1535369393717_3823149_290-117517-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 14:51:34,269 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_69630_210_7234_1_1535788670_910989_35-337140-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 14:51:34,277 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_5618_210_1874_1_1527311008_5851_1-214501-0 from training. Duration: 0.689 2023-10-02 14:51:36,162 WARNING [train.py:1204] (2/4) Exclude cut with ID _8064_210_5399_1_1528372299291_4282973_168-50337-0_sp0.9 from training. Duration: 0.9333125 2023-10-02 14:51:36,247 WARNING [train.py:1204] (2/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_185-14438-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 14:51:36,269 WARNING [train.py:1204] (2/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_61-327062-0 from training. Duration: 0.81 2023-10-02 14:51:38,268 WARNING [train.py:1204] (2/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_98-741-0 from training. Duration: 0.8 2023-10-02 14:51:42,366 WARNING [train.py:1204] (2/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_312-204798-0 from training. Duration: 0.93 2023-10-02 14:51:42,487 WARNING [train.py:1204] (2/4) Exclude cut with ID _14998_210_5219_1_1530705582080_7474970_787-294416-0_sp0.9 from training. Duration: 0.911125 2023-10-02 14:51:44,425 WARNING [train.py:1204] (2/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_140-181176-0_sp1.1 from training. Duration: 0.836375 2023-10-02 14:51:44,468 WARNING [train.py:1204] (2/4) Exclude cut with ID _14687_210_5854_1_1530703838259_3664889_115-239046-0_sp1.1 from training. Duration: 0.6090625 2023-10-02 14:51:45,487 INFO [scaling.py:1022] (2/4) Whitening: name=encoder_embed.out_whiten, num_groups=1, num_channels=192, metric=7.01 vs. limit=8.0 2023-10-02 14:51:45,824 WARNING [train.py:1204] (2/4) Exclude cut with ID _56423_210_5854_1_1534896061049_3165969_370-230614-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 14:51:45,876 WARNING [train.py:1204] (2/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_406-275649-0_sp0.9 from training. Duration: 0.4666875 2023-10-02 14:51:46,094 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.bypass.scale_min, batch_count=918220.0, ans=0.2 2023-10-02 14:51:47,567 WARNING [train.py:1204] (2/4) Exclude cut with ID _15150_210_8341_1_1530752411803_7256384_22-138274-0_sp1.1 from training. Duration: 0.87275 2023-10-02 14:51:47,603 WARNING [train.py:1204] (2/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_125-125818-0 from training. Duration: 0.96 2023-10-02 14:51:50,415 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_4399_210_1874_1_1526705485_2400757_220-47974-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 14:51:50,547 WARNING [train.py:1204] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_308-161067-0_sp1.1 from training. Duration: 0.6 2023-10-02 14:51:53,163 WARNING [train.py:1204] (2/4) Exclude cut with ID _53489_210_6228_1_1534298357169_3835970_102-57645-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 14:51:55,956 WARNING [train.py:1204] (2/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_178-177582-0 from training. Duration: 0.74 2023-10-02 14:51:56,018 WARNING [train.py:1204] (2/4) Exclude cut with ID _14349_210_8341_1_1530615710818_3442470_170-336541-0_sp1.1 from training. Duration: 0.7818125 2023-10-02 14:51:57,298 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_521-214276-0_sp0.9 from training. Duration: 0.488875 2023-10-02 14:51:58,688 WARNING [train.py:1204] (2/4) Exclude cut with ID _66070_210_7640_1_1535441393096_7805310_489-292631-0 from training. Duration: 0.79 2023-10-02 14:52:00,342 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.balancer2.prob, batch_count=918286.6666666666, ans=0.125 2023-10-02 14:52:02,744 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.535e+02 1.867e+02 2.051e+02 2.311e+02 3.725e+02, threshold=4.103e+02, percent-clipped=0.0 2023-10-02 14:52:04,195 WARNING [train.py:1204] (2/4) Exclude cut with ID _14069_210_5632_1_1530512692033_4803390_438-340023-0_sp1.1 from training. Duration: 0.936375 2023-10-02 14:52:05,521 WARNING [train.py:1204] (2/4) Exclude cut with ID _7852_210_5219_1_1528286471316_8139400_242-105336-0_sp1.1 from training. Duration: 0.6545625 2023-10-02 14:52:06,869 WARNING [train.py:1204] (2/4) Exclude cut with ID _3867_210_1883_1_1526641200575_4475830_272-215563-0 from training. Duration: 0.57 2023-10-02 14:52:06,885 WARNING [train.py:1204] (2/4) Exclude cut with ID _14368_210_4220_1_1530532714870_3595529_143-117421-0_sp0.9 from training. Duration: 0.6444375 2023-10-02 14:52:06,892 WARNING [train.py:1204] (2/4) Exclude cut with ID _9235_210_5219_1_1528891154253_8046120_30-326681-0_sp1.1 from training. Duration: 0.7545625 2023-10-02 14:52:10,114 WARNING [train.py:1204] (2/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_538-218448-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 14:52:14,755 WARNING [train.py:1204] (2/4) Exclude cut with ID _33640_210_2655_1_1533117605479_4855469_60-192970-0_sp1.1 from training. Duration: 0.963625 2023-10-02 14:52:14,762 WARNING [train.py:1204] (2/4) Exclude cut with ID _65585_210_12210_1_1535450451584_3764098_112-294301-0_sp1.1 from training. Duration: 0.8 2023-10-02 14:52:14,804 WARNING [train.py:1204] (2/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_643-37412-0_sp1.1 from training. Duration: 0.936375 2023-10-02 14:52:14,822 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_9302_210_2550_1_1528889962_4655416_638-301620-0_sp1.1 from training. Duration: 0.42725 2023-10-02 14:52:16,113 INFO [train.py:1046] (2/4) Epoch 26, batch 4950, loss[loss=0.1514, simple_loss=0.2346, pruned_loss=0.03408, over 24540.00 frames. ], tot_loss[loss=0.1675, simple_loss=0.2441, pruned_loss=0.04538, over 4718392.92 frames. ], batch size: 63, lr: 3.91e-03, grad_scale: 16.0 2023-10-02 14:52:16,234 WARNING [train.py:1204] (2/4) Exclude cut with ID _3867_210_1883_1_1526641200575_4475830_272-215563-0_sp1.1 from training. Duration: 0.5181875 2023-10-02 14:52:19,679 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_53679_210_4852_1_1534556726_4186141_358-154143-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 14:52:19,691 WARNING [train.py:1204] (2/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_841-341679-0_sp0.9 from training. Duration: 0.6444375 2023-10-02 14:52:22,406 WARNING [train.py:1204] (2/4) Exclude cut with ID _7160_210_1876_1_1527940923231_3703799_302-150728-0 from training. Duration: 0.78 2023-10-02 14:52:22,450 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_125-203105-0 from training. Duration: 0.8 2023-10-02 14:52:22,464 WARNING [train.py:1204] (2/4) Exclude cut with ID _5585_210_2667_1_1527418813324_8007340_89-332072-0_sp0.9 from training. Duration: 0.711125 2023-10-02 14:52:23,775 WARNING [train.py:1204] (2/4) Exclude cut with ID _3992_210_1747_1_1526644816031_3731060_36-147855-0 from training. Duration: 0.78 2023-10-02 14:52:23,799 WARNING [train.py:1204] (2/4) Exclude cut with ID _59256_210_5986_1_1535076085997_3589120_172-239045-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 14:52:23,816 WARNING [train.py:1204] (2/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_472-216663-0_sp1.1 from training. Duration: 0.836375 2023-10-02 14:52:23,862 WARNING [train.py:1204] (2/4) Exclude cut with ID _36512_210_6987_1_1533033143998_3126069_181-306500-0_sp1.1 from training. Duration: 0.863625 2023-10-02 14:52:25,202 WARNING [train.py:1204] (2/4) Exclude cut with ID _56524_210_9642_1_1534572011779_3619170_492-95008-0_sp1.1 from training. Duration: 0.97275 2023-10-02 14:52:26,551 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_33348_210_5632_1_1533086912_3324030_119-90326-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 14:52:26,595 WARNING [train.py:1204] (2/4) Exclude cut with ID _14368_210_4220_1_1530532714870_3595529_231-137892-0_sp1.1 from training. Duration: 0.9 2023-10-02 14:52:27,916 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8453_210_4211_1_1528502940_8562730_596-306820-0_sp1.1 from training. Duration: 0.8818125 2023-10-02 14:52:29,228 WARNING [train.py:1204] (2/4) Exclude cut with ID _27052_210_5986_1_1532329197187_3585009_50-242750-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 14:52:31,827 WARNING [train.py:1204] (2/4) Exclude cut with ID _13913_210_9994_1_1530579615187_3383897_358-98435-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 14:52:31,839 WARNING [train.py:1204] (2/4) Exclude cut with ID _27013_210_1389_1_1532437173240_3252649_399-229778-0_sp1.1 from training. Duration: 0.963625 2023-10-02 14:52:34,580 WARNING [train.py:1204] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_185-174805-0_sp0.9 from training. Duration: 0.7555625 2023-10-02 14:52:39,647 WARNING [train.py:1204] (2/4) Exclude cut with ID _65585_210_12210_1_1535450451584_3764098_196-221014-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 14:52:42,802 WARNING [train.py:1204] (2/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_202-293523-0_sp1.1 from training. Duration: 0.6545625 2023-10-02 14:52:44,217 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8275_210_4198_1_1528455320_3892571_213-127634-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 14:52:44,262 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_40896_210_15710_1_1533433956_7100802_921-231678-0_sp1.1 from training. Duration: 0.97275 2023-10-02 14:52:45,580 WARNING [train.py:1204] (2/4) Exclude cut with ID _38020_210_5986_1_1533112252259_7006089_493-102745-0_sp1.1 from training. Duration: 0.8909375 2023-10-02 14:52:47,079 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6846_210_6753_1_1527850767_4295158_2-315073-0 from training. Duration: 0.88 2023-10-02 14:52:47,143 WARNING [train.py:1204] (2/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_249-281349-0 from training. Duration: 0.85 2023-10-02 14:52:50,467 WARNING [train.py:1204] (2/4) Exclude cut with ID _18513_210_9206_1_1531618215223_3862620_314-83429-0_sp1.1 from training. Duration: 0.97275 2023-10-02 14:52:53,167 WARNING [train.py:1204] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_215-202973-0_sp0.9 from training. Duration: 0.97775 2023-10-02 14:52:53,176 WARNING [train.py:1204] (2/4) Exclude cut with ID _7719_210_3525_1_1528456153163_4296960_88-250259-0_sp1.1 from training. Duration: 0.763625 2023-10-02 14:52:53,292 WARNING [train.py:1204] (2/4) Exclude cut with ID _7606_210_4915_1_1528894253277_5080690_301-10732-0_sp1.1 from training. Duration: 0.736375 2023-10-02 14:52:54,596 WARNING [train.py:1204] (2/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_818-11907-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 14:52:55,962 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_5959_210_1794_1_1527400400_3445726_412-44052-0_sp1.1 from training. Duration: 0.7 2023-10-02 14:52:57,361 WARNING [train.py:1204] (2/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_6-279195-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 14:52:58,739 WARNING [train.py:1204] (2/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_252-216674-0_sp0.9 from training. Duration: 0.6 2023-10-02 14:52:58,879 WARNING [train.py:1204] (2/4) Exclude cut with ID _14368_210_4220_1_1530532714870_3595529_144-3287-0_sp1.1 from training. Duration: 0.6090625 2023-10-02 14:53:00,340 WARNING [train.py:1204] (2/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_342-3879-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 14:53:00,366 WARNING [train.py:1204] (2/4) Exclude cut with ID _33640_210_2655_1_1533117605479_4855469_584-65630-0_sp1.1 from training. Duration: 0.97275 2023-10-02 14:53:01,684 WARNING [train.py:1204] (2/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_37-189262-0 from training. Duration: 0.98 2023-10-02 14:53:01,717 WARNING [train.py:1204] (2/4) Exclude cut with ID _9389_210_1664_1_1529217337301_5289269_379-128794-0_sp0.9 from training. Duration: 0.9444375 2023-10-02 14:53:03,204 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.bypass_mid.scale_min, batch_count=918553.3333333334, ans=0.2 2023-10-02 14:53:04,336 WARNING [train.py:1204] (2/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_119-207162-0_sp1.1 from training. Duration: 0.6454375 2023-10-02 14:53:07,953 WARNING [train.py:1204] (2/4) Exclude cut with ID _2941_210_2577_1_1526199673150_2489759_200-333643-0_sp1.1 from training. Duration: 0.936375 2023-10-02 14:53:09,368 WARNING [train.py:1204] (2/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_479-125611-0_sp1.1 from training. Duration: 0.87275 2023-10-02 14:53:09,377 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_3475_210_2028_1_1526193578_3622569_64-297072-0_sp1.1 from training. Duration: 0.82725 2023-10-02 14:53:11,312 WARNING [train.py:1204] (2/4) Exclude cut with ID _53565_210_10635_1_1534337218335_3820259_139-346291-0_sp1.1 from training. Duration: 0.97275 2023-10-02 14:53:11,358 WARNING [train.py:1204] (2/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_588-125165-0_sp1.1 from training. Duration: 0.7545625 2023-10-02 14:53:12,494 WARNING [train.py:1204] (2/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_938-205135-0_sp1.1 from training. Duration: 0.7909375 2023-10-02 14:53:13,870 WARNING [train.py:1204] (2/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_216-48707-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 14:53:15,217 WARNING [train.py:1204] (2/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_477-255782-0_sp1.1 from training. Duration: 0.7454375 2023-10-02 14:53:15,258 WARNING [train.py:1204] (2/4) Exclude cut with ID _16909_210_5724_1_1531292079912_4118919_583-92842-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 14:53:15,374 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.0.balancer1.prob, batch_count=918620.0, ans=0.125 2023-10-02 14:53:16,627 WARNING [train.py:1204] (2/4) Exclude cut with ID _60860_210_8254_1_1534896015875_3557169_245-256666-0 from training. Duration: 0.97 2023-10-02 14:53:21,197 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_45399_210_4852_1_1533624725_7579737_349-116173-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 14:53:24,361 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=918620.0, ans=0.1 2023-10-02 14:53:25,821 WARNING [train.py:1204] (2/4) Exclude cut with ID _8699_210_2069_1_1528630346831_3960409_155-39766-0 from training. Duration: 0.78 2023-10-02 14:53:25,830 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_86-16881-0_sp1.1 from training. Duration: 0.52725 2023-10-02 14:53:29,752 INFO [train.py:1046] (2/4) Epoch 26, batch 5000, loss[loss=0.1787, simple_loss=0.2337, pruned_loss=0.06184, over 19089.00 frames. ], tot_loss[loss=0.1672, simple_loss=0.2437, pruned_loss=0.04536, over 4706049.30 frames. ], batch size: 389, lr: 3.91e-03, grad_scale: 16.0 2023-10-02 14:53:31,468 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_14056_210_4163_1_1530604483_7572827_191-159384-0_sp1.1 from training. Duration: 0.97275 2023-10-02 14:53:31,475 WARNING [train.py:1204] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_307-158462-0_sp1.1 from training. Duration: 0.62725 2023-10-02 14:53:34,124 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7544_210_6753_1_1528591327_1471426_33-346031-0 from training. Duration: 0.61 2023-10-02 14:53:35,502 WARNING [train.py:1204] (2/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_112-149213-0 from training. Duration: 0.95 2023-10-02 14:53:36,787 WARNING [train.py:1204] (2/4) Exclude cut with ID _55844_210_2346_1_1534471212569_7625707_925-191342-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 14:53:38,724 WARNING [train.py:1204] (2/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_87-278686-0 from training. Duration: 0.77 2023-10-02 14:53:38,743 WARNING [train.py:1204] (2/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_415-78792-0_sp1.1 from training. Duration: 0.736375 2023-10-02 14:53:38,993 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.conv_skip_rate, batch_count=918686.6666666666, ans=0.0 2023-10-02 14:53:40,544 WARNING [train.py:1204] (2/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_107-332398-0_sp0.9 from training. Duration: 0.8666875 2023-10-02 14:53:40,623 WARNING [train.py:1204] (2/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_72-251255-0 from training. Duration: 0.94 2023-10-02 14:53:42,506 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_431-227273-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 14:53:42,562 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7544_210_6753_1_1528587000_691468_87-178599-0_sp0.9 from training. Duration: 0.9666875 2023-10-02 14:53:43,765 WARNING [train.py:1204] (2/4) Exclude cut with ID _13916_210_9994_1_1530788341126_3763149_60-191556-0 from training. Duration: 0.61 2023-10-02 14:53:43,767 WARNING [train.py:1204] (2/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_746-164782-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 14:53:45,022 WARNING [train.py:1204] (2/4) Exclude cut with ID _33640_210_2655_1_1533117605479_4855469_596-21066-0_sp1.1 from training. Duration: 0.92725 2023-10-02 14:53:45,148 WARNING [train.py:1204] (2/4) Exclude cut with ID _40402_210_18219_1_1533522324115_2953150_320-14078-0 from training. Duration: 0.95 2023-10-02 14:53:45,174 WARNING [train.py:1204] (2/4) Exclude cut with ID _29877_210_6228_1_1532397791106_5276149_266-62291-0 from training. Duration: 0.87 2023-10-02 14:53:46,535 WARNING [train.py:1204] (2/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_176-56120-0_sp0.9 from training. Duration: 0.92225 2023-10-02 14:53:46,582 WARNING [train.py:1204] (2/4) Exclude cut with ID _6275_210_2084_1_1528110062213_7669049_91-26919-0 from training. Duration: 0.97 2023-10-02 14:53:47,818 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_224-322394-0_sp0.9 from training. Duration: 0.7333125 2023-10-02 14:53:47,867 WARNING [train.py:1204] (2/4) Exclude cut with ID _44982_210_1511_1_1534312820108_3726029_586-297059-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 14:53:49,170 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_196-289637-0_sp1.1 from training. Duration: 0.6818125 2023-10-02 14:53:49,180 WARNING [train.py:1204] (2/4) Exclude cut with ID _18513_210_9206_1_1531618215223_3862620_402-5957-0 from training. Duration: 0.99 2023-10-02 14:53:49,186 WARNING [train.py:1204] (2/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_81-300212-0 from training. Duration: 0.6 2023-10-02 14:53:50,982 WARNING [train.py:1204] (2/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_166-128941-0 from training. Duration: 0.54 2023-10-02 14:53:51,005 WARNING [train.py:1204] (2/4) Exclude cut with ID _58059_210_5854_1_1534901471149_3164808_57-309660-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 14:53:51,064 WARNING [train.py:1204] (2/4) Exclude cut with ID _60621_210_17071_1_1535013053530_3671739_73-155814-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 14:53:52,454 WARNING [train.py:1204] (2/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_726-270780-0 from training. Duration: 0.86 2023-10-02 14:53:52,466 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8853_210_3273_1_1528633899_818968_51-187685-0_sp1.1 from training. Duration: 0.62725 2023-10-02 14:53:55,168 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_315-212794-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 14:53:56,532 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_53373_210_5399_1_1534229471_4715695_314-122795-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 14:53:57,747 WARNING [train.py:1204] (2/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_187-221566-0_sp0.9 from training. Duration: 0.47775 2023-10-02 14:53:58,538 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.2.encoder.layers.1.self_attn2.whiten, num_groups=1, num_channels=384, metric=17.08 vs. limit=22.5 2023-10-02 14:53:59,273 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_12868_210_6753_1_1530702479_3610564_133-564-0 from training. Duration: 0.79 2023-10-02 14:54:00,571 WARNING [train.py:1204] (2/4) Exclude cut with ID _55153_210_2253_1_1534384786677_3162096_255-228344-0_sp1.1 from training. Duration: 0.936375 2023-10-02 14:54:01,962 WARNING [train.py:1204] (2/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_383-165234-0_sp1.1 from training. Duration: 0.8545625 2023-10-02 14:54:05,953 WARNING [train.py:1204] (2/4) Exclude cut with ID IC0677W0056-334889 from training. Duration: 0.768 2023-10-02 14:54:06,205 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.conv_module1.balancer2.prob, batch_count=918820.0, ans=0.125 2023-10-02 14:54:07,513 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_14953_210_2151_1_1530701710_5616165_710-165400-0_sp0.9 from training. Duration: 0.9666875 2023-10-02 14:54:09,491 WARNING [train.py:1204] (2/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_355-236205-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 14:54:09,493 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_34450_210_9547_1_1533099443_4109050_503-246893-0_sp1.1 from training. Duration: 0.963625 2023-10-02 14:54:12,704 WARNING [train.py:1204] (2/4) Exclude cut with ID _43552_210_8913_1_1533726031059_3523429_25-120161-0 from training. Duration: 0.98 2023-10-02 14:54:14,437 WARNING [train.py:1204] (2/4) Exclude cut with ID _66112_210_10635_1_1535590236305_3801484_205-311930-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 14:54:14,465 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_48049_210_5632_1_1533864553_3519937_185-281218-0_sp1.1 from training. Duration: 0.92725 2023-10-02 14:54:14,509 WARNING [train.py:1204] (2/4) Exclude cut with ID _48782_210_12658_1_1534467701544_2300422_191-332415-0_sp1.1 from training. Duration: 0.97275 2023-10-02 14:54:15,920 WARNING [train.py:1204] (2/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_907-151398-0_sp1.1 from training. Duration: 0.336375 2023-10-02 14:54:17,269 WARNING [train.py:1204] (2/4) Exclude cut with ID _67499_210_17282_1_1535509805220_3692430_20-179346-0_sp1.1 from training. Duration: 0.8818125 2023-10-02 14:54:21,300 WARNING [train.py:1204] (2/4) Exclude cut with ID _14998_210_5219_1_1530705582080_7474970_441-91466-0_sp1.1 from training. Duration: 0.8818125 2023-10-02 14:54:21,376 WARNING [train.py:1204] (2/4) Exclude cut with ID _46681_210_9642_1_1533893005571_7603359_786-164007-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 14:54:27,653 WARNING [train.py:1204] (2/4) Exclude cut with ID _14970_210_15709_1_1530773543493_1715730_187-20259-0 from training. Duration: 0.89 2023-10-02 14:54:30,192 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.563e+02 1.795e+02 1.977e+02 2.128e+02 2.824e+02, threshold=3.954e+02, percent-clipped=0.0 2023-10-02 14:54:31,723 WARNING [train.py:1204] (2/4) Exclude cut with ID _12734_210_8341_1_1530183614296_3891929_383-339075-0_sp1.1 from training. Duration: 0.963625 2023-10-02 14:54:33,197 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.conv_module2.balancer1.prob, batch_count=918953.3333333334, ans=0.125 2023-10-02 14:54:37,442 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.conv_module2.balancer2.prob, batch_count=918953.3333333334, ans=0.125 2023-10-02 14:54:40,679 WARNING [train.py:1204] (2/4) Exclude cut with ID _37086_210_9038_1_1532999540891_7021250_834-322225-0_sp1.1 from training. Duration: 0.92725 2023-10-02 14:54:40,791 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_10987_210_4163_1_1529760555_4123902_56-290559-0_sp1.1 from training. Duration: 0.963625 2023-10-02 14:54:40,798 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_92-246270-0_sp0.9 from training. Duration: 0.9444375 2023-10-02 14:54:42,121 WARNING [train.py:1204] (2/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_56-13561-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 14:54:42,164 WARNING [train.py:1204] (2/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_393-57888-0_sp1.1 from training. Duration: 0.5818125 2023-10-02 14:54:42,193 WARNING [train.py:1204] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_168-32906-0_sp1.1 from training. Duration: 0.62725 2023-10-02 14:54:43,489 INFO [train.py:1046] (2/4) Epoch 26, batch 5050, loss[loss=0.1628, simple_loss=0.2502, pruned_loss=0.03767, over 24306.00 frames. ], tot_loss[loss=0.1673, simple_loss=0.2448, pruned_loss=0.04495, over 4712882.80 frames. ], batch size: 77, lr: 3.91e-03, grad_scale: 16.0 2023-10-02 14:54:43,580 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_51225_210_3601_1_1534400634_2327383_127-207553-0_sp1.1 from training. Duration: 0.963625 2023-10-02 14:54:48,098 WARNING [train.py:1204] (2/4) Exclude cut with ID _58583_210_5236_1_1534726835266_3574249_494-203521-0_sp1.1 from training. Duration: 0.963625 2023-10-02 14:54:48,111 WARNING [train.py:1204] (2/4) Exclude cut with ID _49376_210_5986_1_1533985278732_3370740_354-344695-0 from training. Duration: 0.99 2023-10-02 14:54:48,192 WARNING [train.py:1204] (2/4) Exclude cut with ID _8021_210_7994_1_1528376451358_3653069_179-45206-0_sp1.1 from training. Duration: 0.7818125 2023-10-02 14:54:48,399 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.balancer1.prob, batch_count=919020.0, ans=0.125 2023-10-02 14:54:50,886 WARNING [train.py:1204] (2/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_742-154017-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 14:54:50,979 WARNING [train.py:1204] (2/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_222-198127-0_sp1.1 from training. Duration: 0.763625 2023-10-02 14:54:52,303 WARNING [train.py:1204] (2/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_235-307337-0 from training. Duration: 0.94 2023-10-02 14:54:52,376 WARNING [train.py:1204] (2/4) Exclude cut with ID _8393_210_6702_1_1528505860249_3457328_453-176500-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 14:54:54,059 WARNING [train.py:1204] (2/4) Exclude cut with ID _53565_210_10635_1_1534337218335_3820259_102-179528-0_sp1.1 from training. Duration: 0.97275 2023-10-02 14:54:56,817 WARNING [train.py:1204] (2/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_43-206757-0_sp1.1 from training. Duration: 0.6818125 2023-10-02 14:54:56,905 WARNING [train.py:1204] (2/4) Exclude cut with ID _8394_210_6702_1_1528590504316_3540189_335-274780-0_sp1.1 from training. Duration: 0.7090625 2023-10-02 14:54:57,103 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.bypass.scale_min, batch_count=919086.6666666666, ans=0.2 2023-10-02 14:54:58,180 WARNING [train.py:1204] (2/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_128-296581-0_sp1.1 from training. Duration: 0.67275 2023-10-02 14:54:59,748 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.ff2_skip_rate, batch_count=919086.6666666666, ans=0.0 2023-10-02 14:55:07,675 WARNING [train.py:1204] (2/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_111-112362-0 from training. Duration: 0.91 2023-10-02 14:55:08,933 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_5522_210_3273_1_1527249606_3909096_366-172508-0_sp1.1 from training. Duration: 0.6 2023-10-02 14:55:09,009 WARNING [train.py:1204] (2/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_47-167373-0_sp1.1 from training. Duration: 0.836375 2023-10-02 14:55:09,055 WARNING [train.py:1204] (2/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_41-234353-0 from training. Duration: 0.68 2023-10-02 14:55:10,736 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6291_210_3273_1_1527683649_1078498_82-221196-0_sp0.9 from training. Duration: 0.8444375 2023-10-02 14:55:12,130 WARNING [train.py:1204] (2/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_47-4814-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 14:55:12,156 WARNING [train.py:1204] (2/4) Exclude cut with ID _43541_210_10626_1_1533796233418_3635440_40-247993-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 14:55:13,547 WARNING [train.py:1204] (2/4) Exclude cut with ID _21989_210_5854_1_1531899214931_3271360_133-44759-0_sp0.9 from training. Duration: 0.8555625 2023-10-02 14:55:13,550 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_94-195801-0 from training. Duration: 0.94 2023-10-02 14:55:14,863 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_141-89113-0 from training. Duration: 0.86 2023-10-02 14:55:14,953 WARNING [train.py:1204] (2/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_683-284578-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 14:55:17,715 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.1.encoder.layers.1.nonlin_attention.whiten1, num_groups=1, num_channels=192, metric=5.92 vs. limit=10.0 2023-10-02 14:55:18,829 WARNING [train.py:1204] (2/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_6-214815-0_sp1.1 from training. Duration: 0.77275 2023-10-02 14:55:20,521 WARNING [train.py:1204] (2/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_556-212082-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 14:55:20,558 WARNING [train.py:1204] (2/4) Exclude cut with ID _44902_210_16187_1_1533888006062_3792078_149-67212-0 from training. Duration: 0.87 2023-10-02 14:55:23,205 WARNING [train.py:1204] (2/4) Exclude cut with ID _61148_210_6897_1_1535094022726_4217610_333-159440-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 14:55:26,494 WARNING [train.py:1204] (2/4) Exclude cut with ID _3120_210_3525_1_1526036143112_4072980_404-249994-0 from training. Duration: 0.99 2023-10-02 14:55:26,584 WARNING [train.py:1204] (2/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_234-260682-0_sp0.9 from training. Duration: 0.9333125 2023-10-02 14:55:26,601 WARNING [train.py:1204] (2/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_374-138971-0_sp1.1 from training. Duration: 0.8545625 2023-10-02 14:55:27,972 WARNING [train.py:1204] (2/4) Exclude cut with ID _53713_210_24116_1_1534314487762_7416751_743-302085-0_sp1.1 from training. Duration: 0.92725 2023-10-02 14:55:29,372 WARNING [train.py:1204] (2/4) Exclude cut with ID _18516_210_9206_1_1531704653206_3692406_198-334870-0_sp1.1 from training. Duration: 0.836375 2023-10-02 14:55:30,746 WARNING [train.py:1204] (2/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_395-73570-0_sp1.1 from training. Duration: 0.9 2023-10-02 14:55:32,176 WARNING [train.py:1204] (2/4) Exclude cut with ID _46683_210_9642_1_1533730913492_4591116_421-19547-0_sp1.1 from training. Duration: 0.936375 2023-10-02 14:55:32,297 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.conv_module1.balancer1.prob, batch_count=919220.0, ans=0.125 2023-10-02 14:55:33,457 WARNING [train.py:1204] (2/4) Exclude cut with ID _19896_210_1883_1_1531709022662_6649569_714-8668-0_sp1.1 from training. Duration: 0.963625 2023-10-02 14:55:33,480 WARNING [train.py:1204] (2/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_49-101072-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 14:55:33,493 WARNING [train.py:1204] (2/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_220-191080-0_sp1.1 from training. Duration: 0.8909375 2023-10-02 14:55:33,533 WARNING [train.py:1204] (2/4) Exclude cut with ID _3465_210_3525_1_1526175667060_3574049_62-335552-0 from training. Duration: 0.87 2023-10-02 14:55:34,844 WARNING [train.py:1204] (2/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_111-112362-0_sp1.1 from training. Duration: 0.82725 2023-10-02 14:55:36,185 WARNING [train.py:1204] (2/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_318-59097-0_sp0.9 from training. Duration: 0.8444375 2023-10-02 14:55:39,515 WARNING [train.py:1204] (2/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_98-255710-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 14:55:39,522 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9042W0003-515609 from training. Duration: 0.896 2023-10-02 14:55:39,532 WARNING [train.py:1204] (2/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_158-250116-0_sp0.9 from training. Duration: 0.688875 2023-10-02 14:55:40,913 WARNING [train.py:1204] (2/4) Exclude cut with ID _26750_210_5665_1_1532226678442_4063230_7-346885-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 14:55:42,223 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_37839_210_7234_1_1533542462_3689303_357-227078-0_sp1.1 from training. Duration: 0.963625 2023-10-02 14:55:42,253 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9052W0119-520904 from training. Duration: 0.896 2023-10-02 14:55:44,969 WARNING [train.py:1204] (2/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_249-281349-0_sp1.1 from training. Duration: 0.77275 2023-10-02 14:55:44,973 WARNING [train.py:1204] (2/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_147-293311-0 from training. Duration: 0.94 2023-10-02 14:55:44,974 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_158-21166-0_sp1.1 from training. Duration: 0.963625 2023-10-02 14:55:48,828 WARNING [train.py:1204] (2/4) Exclude cut with ID _14100_210_8341_1_1530529320613_3678069_207-332194-0_sp1.1 from training. Duration: 0.92725 2023-10-02 14:55:49,098 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.conv_skip_rate, batch_count=919286.6666666666, ans=0.0 2023-10-02 14:55:50,091 WARNING [train.py:1204] (2/4) Exclude cut with ID _9389_210_1664_1_1529217337301_5289269_324-211031-0_sp1.1 from training. Duration: 0.963625 2023-10-02 14:55:50,103 WARNING [train.py:1204] (2/4) Exclude cut with ID _14603_210_4220_1_1530615688441_3097319_133-158308-0 from training. Duration: 0.84 2023-10-02 14:55:50,212 WARNING [train.py:1204] (2/4) Exclude cut with ID _13252_210_1925_1_1530671372047_3336909_166-314607-0 from training. Duration: 0.54 2023-10-02 14:55:53,608 WARNING [train.py:1204] (2/4) Exclude cut with ID _32704_210_8954_1_1533017488840_6747370_649-79538-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 14:55:53,627 WARNING [train.py:1204] (2/4) Exclude cut with ID _28394_210_8913_1_1532430049518_3650118_343-115667-0_sp1.1 from training. Duration: 0.97275 2023-10-02 14:55:53,663 WARNING [train.py:1204] (2/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_87-278686-0_sp0.9 from training. Duration: 0.8555625 2023-10-02 14:55:56,533 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9109W0003-549749 from training. Duration: 0.768 2023-10-02 14:55:57,806 INFO [train.py:1046] (2/4) Epoch 26, batch 5100, loss[loss=0.1738, simple_loss=0.2466, pruned_loss=0.05048, over 23818.00 frames. ], tot_loss[loss=0.1675, simple_loss=0.2454, pruned_loss=0.04478, over 4716209.66 frames. ], batch size: 164, lr: 3.90e-03, grad_scale: 8.0 2023-10-02 14:55:59,320 WARNING [train.py:1204] (2/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_126-29104-0_sp1.1 from training. Duration: 0.77275 2023-10-02 14:56:00,753 WARNING [train.py:1204] (2/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_325-113798-0 from training. Duration: 0.85 2023-10-02 14:56:02,086 WARNING [train.py:1204] (2/4) Exclude cut with ID _6694_210_4599_1_1527852637490_3965529_116-109398-0 from training. Duration: 0.73 2023-10-02 14:56:02,145 WARNING [train.py:1204] (2/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_46-57119-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 14:56:03,546 WARNING [train.py:1204] (2/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_546-119312-0_sp1.1 from training. Duration: 0.9 2023-10-02 14:56:06,324 WARNING [train.py:1204] (2/4) Exclude cut with ID _60057_210_10640_1_1534903065793_3333150_468-306939-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 14:56:06,361 WARNING [train.py:1204] (2/4) Exclude cut with ID _15150_210_8341_1_1530752411803_7256384_22-138274-0 from training. Duration: 0.96 2023-10-02 14:56:06,383 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_289-180904-0 from training. Duration: 0.76 2023-10-02 14:56:11,033 WARNING [train.py:1204] (2/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_64-133721-0_sp1.1 from training. Duration: 0.92725 2023-10-02 14:56:11,067 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_195-257512-0_sp1.1 from training. Duration: 0.6090625 2023-10-02 14:56:15,730 WARNING [train.py:1204] (2/4) Exclude cut with ID _66873_210_5517_1_1535454136506_4293370_704-101963-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 14:56:20,214 WARNING [train.py:1204] (2/4) Exclude cut with ID _4366_210_1388_1_1526792053633_3613819_231-211402-0 from training. Duration: 0.94 2023-10-02 14:56:20,276 WARNING [train.py:1204] (2/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_679-190385-0_sp1.1 from training. Duration: 0.97275 2023-10-02 14:56:21,769 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.conv_skip_rate, batch_count=919420.0, ans=0.0 2023-10-02 14:56:22,987 WARNING [train.py:1204] (2/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_298-81459-0_sp1.1 from training. Duration: 0.963625 2023-10-02 14:56:23,003 WARNING [train.py:1204] (2/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_658-282610-0_sp0.9 from training. Duration: 0.588875 2023-10-02 14:56:23,634 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.3.encoder.layers.2.nonlin_attention.whiten2, num_groups=1, num_channels=512, metric=6.99 vs. limit=15.0 2023-10-02 14:56:24,432 WARNING [train.py:1204] (2/4) Exclude cut with ID _57572_210_5517_1_1534921219624_3809484_196-283734-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 14:56:26,426 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_53071_210_16554_1_1534490405_2954066_294-198554-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 14:56:26,431 WARNING [train.py:1204] (2/4) Exclude cut with ID _4866_210_2577_1_1527404412284_4364659_352-157930-0 from training. Duration: 0.81 2023-10-02 14:56:29,153 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9231W0113-610139 from training. Duration: 0.896 2023-10-02 14:56:30,465 WARNING [train.py:1204] (2/4) Exclude cut with ID _24271_210_12658_1_1531987270022_7190519_638-171906-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 14:56:30,507 WARNING [train.py:1204] (2/4) Exclude cut with ID _14170_210_5219_1_1530525949868_4469650_272-249985-0 from training. Duration: 0.67 2023-10-02 14:56:30,518 WARNING [train.py:1204] (2/4) Exclude cut with ID _64086_210_7994_1_1535270417019_7435949_449-31567-0 from training. Duration: 0.97 2023-10-02 14:56:30,701 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.nonlin_attention.balancer.prob, batch_count=919486.6666666666, ans=0.125 2023-10-02 14:56:33,378 WARNING [train.py:1204] (2/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_109-282568-0_sp1.1 from training. Duration: 0.97275 2023-10-02 14:56:42,233 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_121-45934-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 14:56:43,665 WARNING [train.py:1204] (2/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_314-126819-0 from training. Duration: 0.5 2023-10-02 14:56:43,692 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9309W0192-640319 from training. Duration: 0.512 2023-10-02 14:56:43,798 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.0.conv_module2.balancer1.min_positive, batch_count=919553.3333333334, ans=0.025 2023-10-02 14:56:45,007 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9310W0003-640649 from training. Duration: 0.896 2023-10-02 14:56:46,971 WARNING [train.py:1204] (2/4) Exclude cut with ID _7160_210_1876_1_1527940923231_3703799_120-150943-0 from training. Duration: 0.81 2023-10-02 14:56:46,973 WARNING [train.py:1204] (2/4) Exclude cut with ID _40380_210_9059_1_1533380594759_4187380_6-33508-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 14:56:49,684 WARNING [train.py:1204] (2/4) Exclude cut with ID _59202_210_26984_1_1534816810612_3549174_137-318710-0 from training. Duration: 0.89 2023-10-02 14:56:54,198 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_435-313305-0 from training. Duration: 0.71 2023-10-02 14:56:55,501 WARNING [train.py:1204] (2/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_91-212486-0_sp1.1 from training. Duration: 0.6181875 2023-10-02 14:56:56,901 WARNING [train.py:1204] (2/4) Exclude cut with ID _14603_210_4220_1_1530615688441_3097319_133-158308-0_sp1.1 from training. Duration: 0.763625 2023-10-02 14:56:59,930 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.536e+02 1.826e+02 2.092e+02 2.413e+02 3.639e+02, threshold=4.183e+02, percent-clipped=0.0 2023-10-02 14:57:00,051 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_99-251314-0 from training. Duration: 0.87 2023-10-02 14:57:02,700 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_365-217119-0_sp1.1 from training. Duration: 0.57275 2023-10-02 14:57:02,895 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.attention_skip_rate, batch_count=919620.0, ans=0.0 2023-10-02 14:57:04,080 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_3475_210_2028_1_1526193578_3622569_377-166133-0 from training. Duration: 0.86 2023-10-02 14:57:04,335 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.conv_module1.balancer2.prob, batch_count=919620.0, ans=0.125 2023-10-02 14:57:08,296 WARNING [train.py:1204] (2/4) Exclude cut with ID _13916_210_9994_1_1530788341126_3763149_116-21034-0_sp1.1 from training. Duration: 0.87275 2023-10-02 14:57:08,306 WARNING [train.py:1204] (2/4) Exclude cut with ID _64220_210_9527_1_1535250151772_4875259_142-216831-0_sp1.1 from training. Duration: 0.936375 2023-10-02 14:57:08,306 WARNING [train.py:1204] (2/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_490-207861-0_sp1.1 from training. Duration: 0.8909375 2023-10-02 14:57:09,651 WARNING [train.py:1204] (2/4) Exclude cut with ID _41314_210_12651_1_1533459476543_3766460_87-272068-0_sp0.9 from training. Duration: 0.92225 2023-10-02 14:57:09,670 WARNING [train.py:1204] (2/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_18-46756-0_sp1.1 from training. Duration: 0.7181875 2023-10-02 14:57:10,904 INFO [train.py:1046] (2/4) Epoch 26, batch 5150, loss[loss=0.1893, simple_loss=0.2622, pruned_loss=0.05826, over 23635.00 frames. ], tot_loss[loss=0.1682, simple_loss=0.2458, pruned_loss=0.04529, over 4714530.66 frames. ], batch size: 232, lr: 3.90e-03, grad_scale: 8.0 2023-10-02 14:57:10,972 WARNING [train.py:1204] (2/4) Exclude cut with ID _39411_210_5986_1_1533538825030_3343649_98-311335-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 14:57:11,042 WARNING [train.py:1204] (2/4) Exclude cut with ID _21878_210_3525_1_1531738772002_7264330_113-18163-0 from training. Duration: 0.89 2023-10-02 14:57:11,045 WARNING [train.py:1204] (2/4) Exclude cut with ID _6653_210_2629_1_1527919172441_6732679_1103-56445-0 from training. Duration: 0.73 2023-10-02 14:57:11,100 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_3474_210_2028_1_1526188513_4825246_145-309131-0 from training. Duration: 0.74 2023-10-02 14:57:12,943 WARNING [train.py:1204] (2/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_120-211630-0_sp1.1 from training. Duration: 0.836375 2023-10-02 14:57:12,953 WARNING [train.py:1204] (2/4) Exclude cut with ID _8030_210_7497_1_1528444886781_3531089_32-31837-0 from training. Duration: 0.97 2023-10-02 14:57:14,331 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_19949_210_2161_1_1531706337_928079_8-160956-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 14:57:14,390 WARNING [train.py:1204] (2/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_321-215742-0_sp1.1 from training. Duration: 0.4545625 2023-10-02 14:57:15,823 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_583-124650-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 14:57:17,905 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_30434_210_13049_1_1532519384_981746_164-182755-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 14:57:21,973 WARNING [train.py:1204] (2/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_339-104791-0_sp1.1 from training. Duration: 0.4454375 2023-10-02 14:57:23,243 WARNING [train.py:1204] (2/4) Exclude cut with ID _15026_210_8750_1_1530777602316_3291850_211-195043-0 from training. Duration: 0.8 2023-10-02 14:57:24,719 WARNING [train.py:1204] (2/4) Exclude cut with ID _29877_210_6228_1_1532397791106_5276149_487-182647-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 14:57:24,769 WARNING [train.py:1204] (2/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_127-566-0_sp0.9 from training. Duration: 0.9333125 2023-10-02 14:57:26,788 WARNING [train.py:1204] (2/4) Exclude cut with ID _8263_210_1664_1_1528439452891_970220_68-181805-0_sp0.9 from training. Duration: 0.711125 2023-10-02 14:57:26,790 WARNING [train.py:1204] (2/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_504-72513-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 14:57:26,798 WARNING [train.py:1204] (2/4) Exclude cut with ID _64639_210_5816_1_1535367619437_3528000_324-265233-0_sp1.1 from training. Duration: 0.963625 2023-10-02 14:57:28,637 WARNING [train.py:1204] (2/4) Exclude cut with ID _6456_210_1534_1_1527681634212_3181049_215-309359-0_sp0.9 from training. Duration: 0.97775 2023-10-02 14:57:28,641 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_115-233929-0_sp1.1 from training. Duration: 0.6545625 2023-10-02 14:57:28,675 WARNING [train.py:1204] (2/4) Exclude cut with ID _14087_210_2253_1_1530529224512_3014029_219-224445-0 from training. Duration: 0.84 2023-10-02 14:57:31,375 WARNING [train.py:1204] (2/4) Exclude cut with ID _14867_210_1968_1_1530684179890_3846969_413-29292-0_sp1.1 from training. Duration: 0.8181875 2023-10-02 14:57:32,755 WARNING [train.py:1204] (2/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_1012-279473-0_sp0.9 from training. Duration: 0.7444375 2023-10-02 14:57:32,886 WARNING [train.py:1204] (2/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_605-133395-0_sp0.9 from training. Duration: 0.7333125 2023-10-02 14:57:34,247 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_89-35748-0 from training. Duration: 0.79 2023-10-02 14:57:36,937 WARNING [train.py:1204] (2/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_476-272757-0_sp1.1 from training. Duration: 0.8454375 2023-10-02 14:57:38,627 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.feed_forward2.hidden_balancer.prob, batch_count=919753.3333333334, ans=0.125 2023-10-02 14:57:41,140 WARNING [train.py:1204] (2/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_449-15508-0_sp0.9 from training. Duration: 0.911125 2023-10-02 14:57:42,553 WARNING [train.py:1204] (2/4) Exclude cut with ID _29345_210_4381_1_1532689168263_3860169_198-174328-0 from training. Duration: 0.91 2023-10-02 14:57:45,825 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_15517_210_6753_1_1530952293_1664795_125-158157-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 14:57:52,066 WARNING [train.py:1204] (2/4) Exclude cut with ID _25565_210_12392_1_1532423009382_3952098_420-113180-0_sp1.1 from training. Duration: 0.963625 2023-10-02 14:57:54,719 WARNING [train.py:1204] (2/4) Exclude cut with ID _51471_210_1889_1_1534399391390_3094250_236-146773-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 14:57:56,739 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.4.encoder.layers.0.nonlin_attention.whiten1, num_groups=1, num_channels=288, metric=7.46 vs. limit=10.0 2023-10-02 14:57:58,227 WARNING [train.py:1204] (2/4) Exclude cut with ID _30039_210_13270_1_1532484612900_2725150_175-35012-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 14:57:59,659 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_23850_210_4238_1_1532232597_1632433_99-275843-0_sp1.1 from training. Duration: 0.936375 2023-10-02 14:58:02,892 WARNING [train.py:1204] (2/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_658-282610-0 from training. Duration: 0.53 2023-10-02 14:58:03,774 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.0.layers.1.feed_forward3.out_whiten, num_groups=1, num_channels=192, metric=9.00 vs. limit=15.0 2023-10-02 14:58:05,637 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_37818_210_4238_1_1533620910_4720083_234-65934-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 14:58:05,706 WARNING [train.py:1204] (2/4) Exclude cut with ID _3067_210_2667_1_1526212974640_4503168_459-173867-0_sp1.1 from training. Duration: 0.8 2023-10-02 14:58:07,029 WARNING [train.py:1204] (2/4) Exclude cut with ID _8696_210_1876_1_1528545632440_3345058_209-54069-0_sp0.9 from training. Duration: 0.7444375 2023-10-02 14:58:09,770 WARNING [train.py:1204] (2/4) Exclude cut with ID _62574_210_2655_1_1535180407058_3933249_420-236465-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 14:58:11,167 WARNING [train.py:1204] (2/4) Exclude cut with ID _46681_210_9642_1_1533893005571_7603359_333-4042-0_sp1.1 from training. Duration: 0.936375 2023-10-02 14:58:12,467 WARNING [train.py:1204] (2/4) Exclude cut with ID _3541_210_2477_1_1526475408709_3263973_377-296839-0 from training. Duration: 0.55 2023-10-02 14:58:16,598 WARNING [train.py:1204] (2/4) Exclude cut with ID _43997_210_4525_1_1533538632555_5189329_426-92599-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 14:58:18,480 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_43-71989-0_sp1.1 from training. Duration: 0.5181875 2023-10-02 14:58:21,789 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_2009_210_2071_1_1525481134_4588937_140-345530-0_sp1.1 from training. Duration: 0.963625 2023-10-02 14:58:21,812 WARNING [train.py:1204] (2/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_138-332773-0_sp0.9 from training. Duration: 0.9 2023-10-02 14:58:21,873 WARNING [train.py:1204] (2/4) Exclude cut with ID _13942_210_9988_1_1530604906036_3733360_499-55676-0_sp0.9 from training. Duration: 0.688875 2023-10-02 14:58:23,156 WARNING [train.py:1204] (2/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_259-34038-0_sp1.1 from training. Duration: 0.67275 2023-10-02 14:58:23,173 WARNING [train.py:1204] (2/4) Exclude cut with ID _3120_210_3525_1_1526036143112_4072980_404-249994-0_sp1.1 from training. Duration: 0.9 2023-10-02 14:58:23,197 WARNING [train.py:1204] (2/4) Exclude cut with ID _40402_210_18219_1_1533522324115_2953150_222-108810-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 14:58:25,362 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.0.layers.1.feed_forward1.out_whiten, num_groups=1, num_channels=192, metric=5.14 vs. limit=15.0 2023-10-02 14:58:25,770 INFO [train.py:1046] (2/4) Epoch 26, batch 5200, loss[loss=0.18, simple_loss=0.2649, pruned_loss=0.04757, over 23589.00 frames. ], tot_loss[loss=0.1695, simple_loss=0.247, pruned_loss=0.04599, over 4705797.84 frames. ], batch size: 94, lr: 3.90e-03, grad_scale: 16.0 2023-10-02 14:58:25,936 WARNING [train.py:1204] (2/4) Exclude cut with ID _22083_210_8254_1_1531702862439_3678970_49-234546-0_sp0.9 from training. Duration: 0.92225 2023-10-02 14:58:27,384 WARNING [train.py:1204] (2/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_199-295611-0_sp0.9 from training. Duration: 0.611125 2023-10-02 14:58:30,587 WARNING [train.py:1204] (2/4) Exclude cut with ID _43552_210_8913_1_1533726031059_3523429_43-192945-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 14:58:32,225 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.conv_module1.balancer2.prob, batch_count=920020.0, ans=0.125 2023-10-02 14:58:35,098 WARNING [train.py:1204] (2/4) Exclude cut with ID _14224_210_4381_1_1530529062037_4264010_364-22817-0 from training. Duration: 0.71 2023-10-02 14:58:36,469 WARNING [train.py:1204] (2/4) Exclude cut with ID _14922_210_6228_1_1530707435273_3693310_388-129552-0_sp1.1 from training. Duration: 0.87275 2023-10-02 14:58:37,817 WARNING [train.py:1204] (2/4) Exclude cut with ID _6537_210_1883_1_1527987559710_3597129_190-280227-0_sp1.1 from training. Duration: 0.97275 2023-10-02 14:58:40,524 WARNING [train.py:1204] (2/4) Exclude cut with ID _55844_210_2346_1_1534471212569_7625707_398-56897-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 14:58:40,602 WARNING [train.py:1204] (2/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_48-182155-0_sp1.1 from training. Duration: 0.7909375 2023-10-02 14:58:41,874 WARNING [train.py:1204] (2/4) Exclude cut with ID _29529_210_7234_1_1532393961455_3977460_582-18084-0_sp1.1 from training. Duration: 0.97275 2023-10-02 14:58:42,002 WARNING [train.py:1204] (2/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_912-282412-0 from training. Duration: 0.49 2023-10-02 14:58:42,952 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.0.layers.1.conv_module2.whiten, num_groups=1, num_channels=192, metric=9.00 vs. limit=15.0 2023-10-02 14:58:43,740 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.conv_skip_rate, batch_count=920086.6666666666, ans=0.0 2023-10-02 14:58:44,187 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.3.encoder.layers.2.feed_forward2.out_whiten, num_groups=1, num_channels=512, metric=9.64 vs. limit=15.0 2023-10-02 14:58:44,872 WARNING [train.py:1204] (2/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_332-256168-0_sp0.9 from training. Duration: 0.8333125 2023-10-02 14:58:44,937 WARNING [train.py:1204] (2/4) Exclude cut with ID _64220_210_9527_1_1535250151772_4875259_55-250624-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 14:58:47,636 WARNING [train.py:1204] (2/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_264-108685-0 from training. Duration: 0.55 2023-10-02 14:58:50,805 WARNING [train.py:1204] (2/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_613-232856-0_sp0.9 from training. Duration: 0.6 2023-10-02 14:58:52,619 WARNING [train.py:1204] (2/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_68-212801-0_sp1.1 from training. Duration: 0.663625 2023-10-02 14:58:52,674 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6290_210_3273_1_1527595224_1282787_115-87095-0 from training. Duration: 0.55 2023-10-02 14:58:53,990 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_3080_210_2071_1_1526086102_4365166_312-8726-0 from training. Duration: 0.91 2023-10-02 14:58:55,509 WARNING [train.py:1204] (2/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_105-63309-0 from training. Duration: 0.98 2023-10-02 14:58:55,569 WARNING [train.py:1204] (2/4) Exclude cut with ID _2941_210_2577_1_1526199673150_2489759_201-160779-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 14:58:55,572 WARNING [train.py:1204] (2/4) Exclude cut with ID ID0406W0226-885434 from training. Duration: 0.997 2023-10-02 14:58:55,578 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_17292_210_6753_1_1531125388_2943734_353-328201-0_sp1.1 from training. Duration: 0.97275 2023-10-02 14:58:58,284 WARNING [train.py:1204] (2/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_572-96029-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 14:58:58,329 WARNING [train.py:1204] (2/4) Exclude cut with ID _14970_210_15709_1_1530775547361_1966965_6-23328-0_sp0.9 from training. Duration: 0.9555625 2023-10-02 14:58:58,367 WARNING [train.py:1204] (2/4) Exclude cut with ID _45012_210_12651_1_1533608435136_5371380_240-37101-0 from training. Duration: 0.93 2023-10-02 14:58:59,703 WARNING [train.py:1204] (2/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_685-90183-0_sp1.1 from training. Duration: 0.936375 2023-10-02 14:59:01,668 WARNING [train.py:1204] (2/4) Exclude cut with ID _13252_210_1925_1_1530671372047_3336909_41-22245-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 14:59:04,357 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_15126_210_6753_1_1530778677_2591185_120-277707-0 from training. Duration: 0.8 2023-10-02 14:59:04,513 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.feed_forward2.hidden_balancer.prob, batch_count=920153.3333333334, ans=0.125 2023-10-02 14:59:05,640 WARNING [train.py:1204] (2/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_310-26186-0 from training. Duration: 0.7 2023-10-02 14:59:05,654 WARNING [train.py:1204] (2/4) Exclude cut with ID _14998_210_5219_1_1530705582080_7474970_787-294416-0 from training. Duration: 0.82 2023-10-02 14:59:09,852 WARNING [train.py:1204] (2/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_795-33252-0 from training. Duration: 0.37 2023-10-02 14:59:09,885 WARNING [train.py:1204] (2/4) Exclude cut with ID _7537_210_2084_1_1528502395431_3924359_113-199933-0_sp0.9 from training. Duration: 0.6555625 2023-10-02 14:59:15,484 WARNING [train.py:1204] (2/4) Exclude cut with ID _3465_210_3525_1_1526175667060_3574049_308-231735-0_sp0.9 from training. Duration: 0.811125 2023-10-02 14:59:15,505 WARNING [train.py:1204] (2/4) Exclude cut with ID _19504_210_9994_1_1531648863242_3325220_354-296765-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 14:59:17,589 WARNING [train.py:1204] (2/4) Exclude cut with ID _5409_210_1388_1_1527292850189_5865191_178-127970-0 from training. Duration: 0.57 2023-10-02 14:59:18,904 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_1466-248056-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 14:59:18,929 WARNING [train.py:1204] (2/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_535-14410-0_sp0.9 from training. Duration: 0.52225 2023-10-02 14:59:18,932 WARNING [train.py:1204] (2/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_747-129941-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 14:59:18,973 WARNING [train.py:1204] (2/4) Exclude cut with ID _15012_210_1968_1_1530856820429_5095350_688-178849-0_sp1.1 from training. Duration: 0.5818125 2023-10-02 14:59:23,650 WARNING [train.py:1204] (2/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_356-45054-0_sp1.1 from training. Duration: 0.7818125 2023-10-02 14:59:25,046 WARNING [train.py:1204] (2/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_146-276944-0_sp0.9 from training. Duration: 0.92225 2023-10-02 14:59:26,445 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.0.feed_forward3.hidden_balancer.prob, batch_count=920286.6666666666, ans=0.125 2023-10-02 14:59:27,488 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.529e+02 1.858e+02 2.023e+02 2.241e+02 5.045e+02, threshold=4.045e+02, percent-clipped=1.0 2023-10-02 14:59:28,429 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.2.encoder.layers.0.conv_module1.whiten, num_groups=1, num_channels=384, metric=6.09 vs. limit=15.0 2023-10-02 14:59:28,998 WARNING [train.py:1204] (2/4) Exclude cut with ID _39204_210_4220_1_1533880547368_3191120_346-180306-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 14:59:30,282 WARNING [train.py:1204] (2/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_641-158017-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 14:59:30,282 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_19952_210_2161_1_1531875469_3851750_274-42137-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 14:59:31,922 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.0.feed_forward1.out_proj.dropout_p, batch_count=920286.6666666666, ans=0.1 2023-10-02 14:59:33,845 WARNING [train.py:1204] (2/4) Exclude cut with ID _60804_210_9642_1_1534831199859_7268299_87-327819-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 14:59:33,915 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_9302_210_2550_1_1528889962_4655416_638-301620-0 from training. Duration: 0.47 2023-10-02 14:59:35,838 WARNING [train.py:1204] (2/4) Exclude cut with ID _44304_210_15374_1_1533639352765_4613129_302-22084-0_sp1.1 from training. Duration: 0.7818125 2023-10-02 14:59:35,849 WARNING [train.py:1204] (2/4) Exclude cut with ID _18513_210_9206_1_1531618215223_3862620_177-227063-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 14:59:37,216 WARNING [train.py:1204] (2/4) Exclude cut with ID _41326_210_5854_1_1533778076509_3492080_510-45444-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 14:59:37,263 WARNING [train.py:1204] (2/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_76-311963-0_sp1.1 from training. Duration: 0.636375 2023-10-02 14:59:38,591 WARNING [train.py:1204] (2/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_156-302675-0_sp0.9 from training. Duration: 0.611125 2023-10-02 14:59:39,851 INFO [train.py:1046] (2/4) Epoch 26, batch 5250, loss[loss=0.154, simple_loss=0.2355, pruned_loss=0.03621, over 24499.00 frames. ], tot_loss[loss=0.1688, simple_loss=0.2461, pruned_loss=0.04575, over 4714279.01 frames. ], batch size: 63, lr: 3.90e-03, grad_scale: 16.0 2023-10-02 14:59:41,252 WARNING [train.py:1204] (2/4) Exclude cut with ID _53489_210_6228_1_1534298357169_3835970_367-148411-0_sp1.1 from training. Duration: 0.936375 2023-10-02 14:59:44,033 WARNING [train.py:1204] (2/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_389-144525-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 14:59:44,068 WARNING [train.py:1204] (2/4) Exclude cut with ID _14069_210_5632_1_1530512692033_4803390_158-238427-0_sp1.1 from training. Duration: 0.963625 2023-10-02 14:59:44,153 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_486-151131-0_sp0.9 from training. Duration: 0.9444375 2023-10-02 14:59:51,766 WARNING [train.py:1204] (2/4) Exclude cut with ID _39846_210_12651_1_1533603651992_1318030_188-71831-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 14:59:54,502 WARNING [train.py:1204] (2/4) Exclude cut with ID _39411_210_5986_1_1533538825030_3343649_173-238248-0_sp1.1 from training. Duration: 0.7545625 2023-10-02 14:59:57,333 WARNING [train.py:1204] (2/4) Exclude cut with ID _20684_210_6917_1_1531567751646_4203930_469-49081-0_sp1.1 from training. Duration: 0.92725 2023-10-02 14:59:58,684 WARNING [train.py:1204] (2/4) Exclude cut with ID _8833_210_5219_1_1528592665210_7553059_611-80313-0_sp1.1 from training. Duration: 0.5818125 2023-10-02 15:00:00,694 WARNING [train.py:1204] (2/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_440-141811-0 from training. Duration: 0.95 2023-10-02 15:00:00,717 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_41211_210_2544_1_1533348859_3702910_236-223652-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 15:00:02,109 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_40222_210_6228_1_1533385298_4481939_220-163998-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 15:00:23,969 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.feed_forward1.out_proj.dropout_p, batch_count=920553.3333333334, ans=0.1 2023-10-02 15:00:30,490 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.nonlin_attention.balancer.prob, batch_count=920553.3333333334, ans=0.125 2023-10-02 15:00:35,662 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.nonlin_attention.balancer.prob, batch_count=920620.0, ans=0.125 2023-10-02 15:00:40,740 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.balancer2.prob, batch_count=920620.0, ans=0.125 2023-10-02 15:00:45,536 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.conv_module2.balancer2.min_abs, batch_count=920620.0, ans=0.5 2023-10-02 15:00:48,978 INFO [train.py:1046] (2/4) Epoch 26, batch 5300, loss[loss=0.1656, simple_loss=0.2325, pruned_loss=0.04934, over 23755.00 frames. ], tot_loss[loss=0.1676, simple_loss=0.2442, pruned_loss=0.04551, over 4698497.23 frames. ], batch size: 212, lr: 3.90e-03, grad_scale: 16.0 2023-10-02 15:00:50,608 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=920686.6666666666, ans=0.1 2023-10-02 15:01:03,377 WARNING [train.py:1204] (2/4) Exclude cut with ID _14629_210_15709_1_1530604298182_956899_6-232695-0_sp1.1 from training. Duration: 0.8545625 2023-10-02 15:01:03,412 WARNING [train.py:1204] (2/4) Exclude cut with ID _13921_210_9994_1_1530841782272_3503520_327-69677-0 from training. Duration: 0.9 2023-10-02 15:01:03,439 WARNING [train.py:1204] (2/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_372-298093-0 from training. Duration: 0.8 2023-10-02 15:01:03,453 WARNING [train.py:1204] (2/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_515-139111-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 15:01:03,692 WARNING [train.py:1204] (2/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_85-65056-0_sp1.1 from training. Duration: 0.87275 2023-10-02 15:01:03,733 WARNING [train.py:1204] (2/4) Exclude cut with ID _15113_210_8254_1_1530842438579_3716279_209-54533-0_sp1.1 from training. Duration: 0.87275 2023-10-02 15:01:03,780 WARNING [train.py:1204] (2/4) Exclude cut with ID _70447_210_12392_1_1535972499300_3160420_123-312838-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 15:01:03,808 WARNING [train.py:1204] (2/4) Exclude cut with ID _60113_210_16271_1_1535164212481_4386079_70-63596-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 15:01:03,835 WARNING [train.py:1204] (2/4) Exclude cut with ID _14632_210_9988_1_1530691279392_3923200_225-188509-0_sp1.1 from training. Duration: 0.963625 2023-10-02 15:01:03,885 WARNING [train.py:1204] (2/4) Exclude cut with ID _34734_210_4525_1_1533365977542_4448720_584-180532-0_sp1.1 from training. Duration: 0.97275 2023-10-02 15:01:03,898 WARNING [train.py:1204] (2/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_351-294650-0_sp1.1 from training. Duration: 0.57275 2023-10-02 15:01:04,232 WARNING [train.py:1204] (2/4) Exclude cut with ID _13252_210_1925_1_1530671372047_3336909_263-168388-0_sp0.9 from training. Duration: 0.9555625 2023-10-02 15:01:04,261 WARNING [train.py:1204] (2/4) Exclude cut with ID _22083_210_8254_1_1531702862439_3678970_201-85738-0 from training. Duration: 0.9 2023-10-02 15:01:04,354 WARNING [train.py:1204] (2/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_237-58433-0 from training. Duration: 0.74 2023-10-02 15:01:04,380 WARNING [train.py:1204] (2/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_262-250826-0 from training. Duration: 0.99 2023-10-02 15:01:04,482 WARNING [train.py:1204] (2/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_927-43827-0_sp0.9 from training. Duration: 0.82225 2023-10-02 15:01:04,515 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_2821_210_2715_1_1526108385_3763560_73-296673-0 from training. Duration: 0.83 2023-10-02 15:01:04,587 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6287_210_3273_1_1527509506_2653716_182-199887-0 from training. Duration: 0.51 2023-10-02 15:01:04,710 WARNING [train.py:1204] (2/4) Exclude cut with ID _70519_210_4163_1_1535961595994_7458230_899-80223-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 15:01:04,960 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_210-234949-0_sp1.1 from training. Duration: 0.97275 2023-10-02 15:01:05,369 WARNING [train.py:1204] (2/4) Exclude cut with ID _30196_210_1664_1_1533708496522_7074140_768-278176-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 15:01:05,451 WARNING [train.py:1204] (2/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_315-128283-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 15:01:05,534 WARNING [train.py:1204] (2/4) Exclude cut with ID _14886_210_4220_1_1530702020355_3516190_330-323941-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 15:01:05,788 WARNING [train.py:1204] (2/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_264-348896-0_sp1.1 from training. Duration: 0.77275 2023-10-02 15:01:05,810 WARNING [train.py:1204] (2/4) Exclude cut with ID _37086_210_9038_1_1532999540891_7021250_433-10563-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 15:01:05,849 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_53638_210_21581_1_1534297319_5808449_885-80172-0_sp1.1 from training. Duration: 0.87275 2023-10-02 15:01:05,939 WARNING [train.py:1204] (2/4) Exclude cut with ID _14623_210_1925_1_1530773354962_3632609_157-218771-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 15:01:05,949 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_1368-78165-0_sp1.1 from training. Duration: 0.97275 2023-10-02 15:01:05,953 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_309-65177-0_sp0.9 from training. Duration: 0.97775 2023-10-02 15:01:05,959 WARNING [train.py:1204] (2/4) Exclude cut with ID _21632_210_2253_1_1531994001765_3744140_291-138812-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 15:01:05,972 WARNING [train.py:1204] (2/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_669-93163-0_sp1.1 from training. Duration: 0.8909375 2023-10-02 15:01:06,503 WARNING [train.py:1204] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_292-215346-0 from training. Duration: 0.97 2023-10-02 15:01:06,528 WARNING [train.py:1204] (2/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_286-222073-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 15:01:06,796 WARNING [train.py:1204] (2/4) Exclude cut with ID _59256_210_5986_1_1535076085997_3589120_121-237386-0_sp1.1 from training. Duration: 0.87275 2023-10-02 15:01:06,810 WARNING [train.py:1204] (2/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_96-320736-0 from training. Duration: 0.86 2023-10-02 15:01:06,818 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_128-202104-0 from training. Duration: 0.63 2023-10-02 15:01:06,900 WARNING [train.py:1204] (2/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_242-3472-0_sp0.9 from training. Duration: 0.811125 2023-10-02 15:01:06,941 WARNING [train.py:1204] (2/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_154-74182-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 15:01:06,944 WARNING [train.py:1204] (2/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_187-221566-0 from training. Duration: 0.43 2023-10-02 15:01:07,096 WARNING [train.py:1204] (2/4) Exclude cut with ID _6194_210_1534_1_1527511826160_3941194_14-282876-0 from training. Duration: 0.71 2023-10-02 15:01:07,134 WARNING [train.py:1204] (2/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_660-211088-0_sp1.1 from training. Duration: 0.563625 2023-10-02 15:01:07,951 WARNING [train.py:1204] (2/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_526-62079-0_sp1.1 from training. Duration: 0.6090625 2023-10-02 15:01:08,098 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_92-246270-0_sp1.1 from training. Duration: 0.77275 2023-10-02 15:01:08,181 WARNING [train.py:1204] (2/4) Exclude cut with ID IC0287W0023-142350 from training. Duration: 0.956 2023-10-02 15:01:08,251 WARNING [train.py:1204] (2/4) Exclude cut with ID _58605_210_12210_1_1534723195939_3904259_48-269402-0 from training. Duration: 0.96 2023-10-02 15:01:08,266 WARNING [train.py:1204] (2/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_416-257730-0_sp0.9 from training. Duration: 0.72225 2023-10-02 15:01:08,336 WARNING [train.py:1204] (2/4) Exclude cut with ID _7877_210_6014_1_1528286358099_3750136_185-17516-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 15:01:08,432 WARNING [train.py:1204] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_23-189234-0 from training. Duration: 0.83 2023-10-02 15:01:08,447 WARNING [train.py:1204] (2/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_237-89684-0 from training. Duration: 0.96 2023-10-02 15:01:08,512 WARNING [train.py:1204] (2/4) Exclude cut with ID _13916_210_9994_1_1530788341126_3763149_116-21034-0 from training. Duration: 0.96 2023-10-02 15:01:08,661 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_3133_210_2087_1_1526106446_2324690_246-125000-0_sp1.1 from training. Duration: 0.563625 2023-10-02 15:01:15,201 INFO [train.py:1046] (2/4) Epoch 27, batch 0, loss[loss=0.1505, simple_loss=0.2262, pruned_loss=0.03744, over 23908.00 frames. ], tot_loss[loss=0.1505, simple_loss=0.2262, pruned_loss=0.03744, over 23908.00 frames. ], batch size: 195, lr: 3.83e-03, grad_scale: 32.0 2023-10-02 15:01:15,201 INFO [train.py:1069] (2/4) Computing validation loss 2023-10-02 15:01:27,542 INFO [train.py:1078] (2/4) Epoch 27, validation: loss=0.313, simple_loss=0.2744, pruned_loss=0.1758, over 1125622.00 frames. 2023-10-02 15:01:27,544 INFO [train.py:1079] (2/4) Maximum memory allocated so far is 20967MB 2023-10-02 15:01:30,393 WARNING [train.py:1204] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_402-253027-0 from training. Duration: 0.54 2023-10-02 15:01:31,789 WARNING [train.py:1204] (2/4) Exclude cut with ID _59288_210_5854_1_1535077757774_3412246_158-289345-0_sp1.1 from training. Duration: 0.92725 2023-10-02 15:01:33,103 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_28211_210_6388_1_1532861506_4523224_128-328162-0_sp1.1 from training. Duration: 0.8181875 2023-10-02 15:01:37,636 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_788-278281-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 15:01:37,646 WARNING [train.py:1204] (2/4) Exclude cut with ID _49728_210_12651_1_1534041034294_6788150_624-195301-0_sp0.9 from training. Duration: 0.9444375 2023-10-02 15:01:37,702 WARNING [train.py:1204] (2/4) Exclude cut with ID _14860_210_4915_1_1530702020340_7343240_267-139775-0_sp1.1 from training. Duration: 0.963625 2023-10-02 15:01:39,036 WARNING [train.py:1204] (2/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_332-221057-0 from training. Duration: 0.68 2023-10-02 15:01:40,509 WARNING [train.py:1204] (2/4) Exclude cut with ID _40402_210_18219_1_1533522324115_2953150_242-76943-0 from training. Duration: 0.94 2023-10-02 15:01:41,858 WARNING [train.py:1204] (2/4) Exclude cut with ID _16747_210_5986_1_1531811544733_2749109_87-349587-0_sp1.1 from training. Duration: 0.963625 2023-10-02 15:01:43,370 WARNING [train.py:1204] (2/4) Exclude cut with ID _22982_210_1883_1_1531882790447_5449409_505-346452-0_sp1.1 from training. Duration: 0.963625 2023-10-02 15:01:46,180 WARNING [train.py:1204] (2/4) Exclude cut with ID _17956_210_4353_1_1531209568125_6979970_13-192107-0_sp1.1 from training. Duration: 0.963625 2023-10-02 15:01:46,231 WARNING [train.py:1204] (2/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_430-307666-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 15:01:46,281 WARNING [train.py:1204] (2/4) Exclude cut with ID _2895_210_2477_1_1525956785794_4063750_289-11475-0_sp1.1 from training. Duration: 0.7454375 2023-10-02 15:01:46,292 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_3910_210_1794_1_1526742306_334530_28-238909-0_sp0.9 from training. Duration: 0.9 2023-10-02 15:01:48,836 WARNING [train.py:1204] (2/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_18-130336-0 from training. Duration: 0.79 2023-10-02 15:01:50,200 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_548-294316-0_sp0.9 from training. Duration: 0.9 2023-10-02 15:01:56,340 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_42-142473-0_sp1.1 from training. Duration: 0.6545625 2023-10-02 15:01:56,346 WARNING [train.py:1204] (2/4) Exclude cut with ID _59170_210_6721_1_1534983633960_4203335_326-48066-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 15:01:58,520 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_45886_210_17024_1_1533709215_471063_7-79165-0 from training. Duration: 0.98 2023-10-02 15:02:01,496 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.4.encoder.layers.0.whiten, num_groups=1, num_channels=384, metric=5.97 vs. limit=12.0 2023-10-02 15:02:04,890 WARNING [train.py:1204] (2/4) Exclude cut with ID _44585_210_18628_1_1533605350387_4518199_280-73534-0_sp0.9 from training. Duration: 0.988875 2023-10-02 15:02:04,896 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_3475_210_2028_1_1526193578_3622569_265-276625-0_sp1.1 from training. Duration: 0.6909375 2023-10-02 15:02:06,921 WARNING [train.py:1204] (2/4) Exclude cut with ID _64631_210_27767_1_1535349580068_4165160_88-225232-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 15:02:09,733 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_205-265518-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 15:02:10,446 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.3.encoder.layers.2.self_attn_weights.whiten_keys, num_groups=8, num_channels=256, metric=4.17 vs. limit=6.0 2023-10-02 15:02:11,023 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.551e+02 2.074e+02 2.559e+02 3.176e+02 5.504e+02, threshold=5.117e+02, percent-clipped=16.0 2023-10-02 15:02:12,531 WARNING [train.py:1204] (2/4) Exclude cut with ID _40402_210_18219_1_1533522324115_2953150_271-253734-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 15:02:15,568 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.conv_skip_rate, batch_count=920966.6666666666, ans=0.0 2023-10-02 15:02:16,738 WARNING [train.py:1204] (2/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_282-257791-0 from training. Duration: 0.86 2023-10-02 15:02:20,844 WARNING [train.py:1204] (2/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_356-45054-0 from training. Duration: 0.86 2023-10-02 15:02:20,886 WARNING [train.py:1204] (2/4) Exclude cut with ID _69584_210_5219_1_1535785133775_4044450_38-348116-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 15:02:20,888 WARNING [train.py:1204] (2/4) Exclude cut with ID _66763_210_17301_1_1535788667617_241750_21-328480-0_sp1.1 from training. Duration: 0.936375 2023-10-02 15:02:22,234 WARNING [train.py:1204] (2/4) Exclude cut with ID _26528_210_9070_1_1532570410842_3851310_497-97218-0_sp0.9 from training. Duration: 0.9555625 2023-10-02 15:02:22,291 WARNING [train.py:1204] (2/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_592-101075-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 15:02:25,459 WARNING [train.py:1204] (2/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_678-159307-0 from training. Duration: 0.85 2023-10-02 15:02:27,506 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.2.encoder.layers.2.feed_forward3.out_whiten, num_groups=1, num_channels=384, metric=13.31 vs. limit=15.0 2023-10-02 15:02:28,571 WARNING [train.py:1204] (2/4) Exclude cut with ID _28427_210_4220_1_1532581240967_3061069_166-319773-0_sp1.1 from training. Duration: 0.936375 2023-10-02 15:02:28,657 WARNING [train.py:1204] (2/4) Exclude cut with ID _29877_210_6228_1_1532397791106_5276149_606-224984-0_sp1.1 from training. Duration: 0.936375 2023-10-02 15:02:32,035 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_125-203105-0_sp1.1 from training. Duration: 0.72725 2023-10-02 15:02:35,412 WARNING [train.py:1204] (2/4) Exclude cut with ID IC0620W0113-306765 from training. Duration: 0.768 2023-10-02 15:02:36,815 WARNING [train.py:1204] (2/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_410-287857-0_sp1.1 from training. Duration: 0.8454375 2023-10-02 15:02:38,290 WARNING [train.py:1204] (2/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_78-95260-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 15:02:39,685 WARNING [train.py:1204] (2/4) Exclude cut with ID _58059_210_5854_1_1534901471149_3164808_6-73080-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 15:02:40,890 INFO [train.py:1046] (2/4) Epoch 27, batch 50, loss[loss=0.1743, simple_loss=0.2634, pruned_loss=0.04256, over 24628.00 frames. ], tot_loss[loss=0.1635, simple_loss=0.2431, pruned_loss=0.04192, over 1080614.59 frames. ], batch size: 68, lr: 3.83e-03, grad_scale: 32.0 2023-10-02 15:02:40,939 WARNING [train.py:1204] (2/4) Exclude cut with ID _5166_210_2084_1_1527073197160_4087993_331-117829-0 from training. Duration: 0.62 2023-10-02 15:02:41,007 WARNING [train.py:1204] (2/4) Exclude cut with ID _14880_210_8895_1_1530680426655_3795259_289-212817-0_sp0.9 from training. Duration: 0.7666875 2023-10-02 15:02:42,313 WARNING [train.py:1204] (2/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_620-144779-0_sp1.1 from training. Duration: 0.92725 2023-10-02 15:02:43,700 WARNING [train.py:1204] (2/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_561-288575-0_sp1.1 from training. Duration: 0.963625 2023-10-02 15:02:44,981 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_22657_210_6890_1_1532086002_1637216_122-188128-0_sp1.1 from training. Duration: 0.963625 2023-10-02 15:02:46,394 WARNING [train.py:1204] (2/4) Exclude cut with ID _16756_210_4915_1_1531047803763_7104259_604-86607-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 15:02:51,661 WARNING [train.py:1204] (2/4) Exclude cut with ID _15150_210_8341_1_1530752411803_7256384_469-239509-0 from training. Duration: 0.68 2023-10-02 15:02:51,676 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_10987_210_4163_1_1529760555_4123902_302-87926-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 15:02:51,812 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.0.conv_skip_rate, batch_count=921100.0, ans=0.0 2023-10-02 15:02:53,086 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder_embed.dropout.p, batch_count=921166.6666666666, ans=0.1 2023-10-02 15:02:53,218 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.bypass.scale_min, batch_count=921166.6666666666, ans=0.2 2023-10-02 15:02:57,864 WARNING [train.py:1204] (2/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_469-94673-0_sp0.9 from training. Duration: 0.87775 2023-10-02 15:03:01,033 WARNING [train.py:1204] (2/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_798-106179-0 from training. Duration: 0.83 2023-10-02 15:03:02,367 WARNING [train.py:1204] (2/4) Exclude cut with ID _16744_210_5986_1_1531724405384_3907567_242-111316-0 from training. Duration: 0.93 2023-10-02 15:03:03,719 WARNING [train.py:1204] (2/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_938-205135-0_sp0.9 from training. Duration: 0.9666875 2023-10-02 15:03:05,721 WARNING [train.py:1204] (2/4) Exclude cut with ID _64135_210_2253_1_1535677267876_7608972_133-188266-0_sp0.9 from training. Duration: 0.9 2023-10-02 15:03:05,725 WARNING [train.py:1204] (2/4) Exclude cut with ID _44585_210_18628_1_1533605350387_4518199_623-346820-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 15:03:05,794 WARNING [train.py:1204] (2/4) Exclude cut with ID _42326_210_7770_1_1533814282531_3707269_454-266903-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 15:03:07,749 WARNING [train.py:1204] (2/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_5-55893-0_sp1.1 from training. Duration: 0.5 2023-10-02 15:03:07,792 WARNING [train.py:1204] (2/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_577-332600-0_sp1.1 from training. Duration: 0.5181875 2023-10-02 15:03:07,793 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_957-138882-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 15:03:15,936 WARNING [train.py:1204] (2/4) Exclude cut with ID _12556_210_8341_1_1530081024100_7327690_219-38004-0_sp1.1 from training. Duration: 0.97275 2023-10-02 15:03:17,261 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_397-147196-0_sp1.1 from training. Duration: 0.72725 2023-10-02 15:03:17,275 WARNING [train.py:1204] (2/4) Exclude cut with ID _3541_210_2477_1_1526475408709_3263973_231-104736-0_sp0.9 from training. Duration: 0.8666875 2023-10-02 15:03:18,594 WARNING [train.py:1204] (2/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_398-334558-0 from training. Duration: 0.96 2023-10-02 15:03:21,313 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_257-20733-0_sp1.1 from training. Duration: 0.6909375 2023-10-02 15:03:21,389 WARNING [train.py:1204] (2/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_146-282400-0_sp1.1 from training. Duration: 0.6454375 2023-10-02 15:03:21,393 WARNING [train.py:1204] (2/4) Exclude cut with ID _51899_210_20034_1_1534129254466_3669220_265-215428-0 from training. Duration: 0.98 2023-10-02 15:03:22,762 WARNING [train.py:1204] (2/4) Exclude cut with ID _34423_210_8750_1_1533430798456_3330199_219-106541-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 15:03:25,444 WARNING [train.py:1204] (2/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_458-67321-0 from training. Duration: 0.97 2023-10-02 15:03:32,687 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.conv_module2.balancer2.min_positive, batch_count=921300.0, ans=0.05 2023-10-02 15:03:33,780 WARNING [train.py:1204] (2/4) Exclude cut with ID _6653_210_2629_1_1527919172441_6732679_1051-182606-0_sp1.1 from training. Duration: 0.936375 2023-10-02 15:03:33,802 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_53555_210_5632_1_1534595220_7232297_603-4396-0_sp1.1 from training. Duration: 0.97275 2023-10-02 15:03:35,313 WARNING [train.py:1204] (2/4) Exclude cut with ID _54218_210_12210_1_1534413646388_4409259_159-144367-0_sp1.1 from training. Duration: 0.963625 2023-10-02 15:03:36,057 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.2.encoder.layers.0.nonlin_attention.whiten2, num_groups=1, num_channels=384, metric=8.39 vs. limit=15.0 2023-10-02 15:03:36,675 WARNING [train.py:1204] (2/4) Exclude cut with ID _7606_210_4915_1_1528894253277_5080690_301-10732-0_sp0.9 from training. Duration: 0.9 2023-10-02 15:03:36,678 WARNING [train.py:1204] (2/4) Exclude cut with ID _15026_210_8750_1_1530777602316_3291850_211-195043-0_sp0.9 from training. Duration: 0.888875 2023-10-02 15:03:39,977 WARNING [train.py:1204] (2/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_146-282400-0 from training. Duration: 0.71 2023-10-02 15:03:40,013 WARNING [train.py:1204] (2/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_65-88242-0 from training. Duration: 0.56 2023-10-02 15:03:43,238 WARNING [train.py:1204] (2/4) Exclude cut with ID _55857_210_2346_1_1534644007831_6898209_80-107757-0_sp1.1 from training. Duration: 0.963625 2023-10-02 15:03:43,283 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_15126_210_6753_1_1530778677_2591185_120-277707-0_sp0.9 from training. Duration: 0.888875 2023-10-02 15:03:44,685 WARNING [train.py:1204] (2/4) Exclude cut with ID _44778_210_2054_1_1533798127203_7443090_1033-168593-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 15:03:44,720 WARNING [train.py:1204] (2/4) Exclude cut with ID _46683_210_9642_1_1533730913492_4591116_475-30711-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 15:03:44,749 WARNING [train.py:1204] (2/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_253-308377-0 from training. Duration: 0.49 2023-10-02 15:03:44,794 WARNING [train.py:1204] (2/4) Exclude cut with ID _4680_210_3525_1_1527249757953_893386_75-78979-0 from training. Duration: 0.91 2023-10-02 15:03:46,143 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_5522_210_3273_1_1527249606_3909096_318-203153-0_sp0.9 from training. Duration: 0.47775 2023-10-02 15:03:47,548 WARNING [train.py:1204] (2/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_550-170116-0_sp1.1 from training. Duration: 0.92725 2023-10-02 15:03:48,882 WARNING [train.py:1204] (2/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_277-210798-0_sp0.9 from training. Duration: 0.611125 2023-10-02 15:03:48,958 WARNING [train.py:1204] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_620-276793-0 from training. Duration: 0.59 2023-10-02 15:03:48,968 WARNING [train.py:1204] (2/4) Exclude cut with ID _3067_210_2667_1_1526212974640_4503168_119-175776-0 from training. Duration: 0.74 2023-10-02 15:03:50,330 WARNING [train.py:1204] (2/4) Exclude cut with ID _55209_210_1531_1_1534675828996_4597160_681-195827-0_sp1.1 from training. Duration: 0.92725 2023-10-02 15:03:51,655 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_427-78097-0_sp1.1 from training. Duration: 0.72725 2023-10-02 15:03:53,064 WARNING [train.py:1204] (2/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_96-290039-0_sp0.9 from training. Duration: 0.62225 2023-10-02 15:03:53,081 WARNING [train.py:1204] (2/4) Exclude cut with ID _45329_210_7005_1_1534157951211_6774339_287-292174-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 15:03:54,364 INFO [train.py:1046] (2/4) Epoch 27, batch 100, loss[loss=0.1474, simple_loss=0.2309, pruned_loss=0.03196, over 24475.00 frames. ], tot_loss[loss=0.1668, simple_loss=0.2456, pruned_loss=0.04399, over 1875307.78 frames. ], batch size: 66, lr: 3.83e-03, grad_scale: 16.0 2023-10-02 15:03:55,875 WARNING [train.py:1204] (2/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_200-292228-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 15:03:58,745 WARNING [train.py:1204] (2/4) Exclude cut with ID _69884_210_9771_1_1535846392818_4166169_291-194999-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 15:04:02,158 WARNING [train.py:1204] (2/4) Exclude cut with ID _58895_210_8750_1_1535184081320_3649089_265-323810-0_sp1.1 from training. Duration: 0.87275 2023-10-02 15:04:03,535 WARNING [train.py:1204] (2/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_58-68956-0 from training. Duration: 0.83 2023-10-02 15:04:03,542 WARNING [train.py:1204] (2/4) Exclude cut with ID _39867_210_9826_1_1533553222030_9344259_322-115153-0_sp1.1 from training. Duration: 0.963625 2023-10-02 15:04:06,937 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_3371_210_1794_1_1526384462_5580777_396-339812-0_sp1.1 from training. Duration: 0.77275 2023-10-02 15:04:07,116 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.bypass.skip_rate, batch_count=921433.3333333334, ans=0.09899494936611666 2023-10-02 15:04:08,231 WARNING [train.py:1204] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_731-2428-0_sp0.9 from training. Duration: 0.92225 2023-10-02 15:04:08,251 WARNING [train.py:1204] (2/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_98-741-0_sp1.1 from training. Duration: 0.72725 2023-10-02 15:04:08,262 WARNING [train.py:1204] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_238-245655-0_sp0.9 from training. Duration: 0.9 2023-10-02 15:04:08,293 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6788_210_1388_1_1527898698_4562021_91-197981-0_sp0.9 from training. Duration: 0.92225 2023-10-02 15:04:09,599 WARNING [train.py:1204] (2/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_860-96198-0 from training. Duration: 0.73 2023-10-02 15:04:14,548 WARNING [train.py:1204] (2/4) Exclude cut with ID _14268_210_9206_1_1530622771285_4714099_331-105351-0_sp0.9 from training. Duration: 0.72225 2023-10-02 15:04:14,577 WARNING [train.py:1204] (2/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_204-137133-0_sp1.1 from training. Duration: 0.92725 2023-10-02 15:04:14,602 WARNING [train.py:1204] (2/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_5-46496-0_sp1.1 from training. Duration: 0.936375 2023-10-02 15:04:14,615 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_160-218721-0_sp1.1 from training. Duration: 0.87275 2023-10-02 15:04:18,595 WARNING [train.py:1204] (2/4) Exclude cut with ID _45329_210_7005_1_1534157951211_6774339_503-287883-0 from training. Duration: 0.88 2023-10-02 15:04:18,786 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.ff3_skip_rate, batch_count=921500.0, ans=0.0 2023-10-02 15:04:19,950 WARNING [train.py:1204] (2/4) Exclude cut with ID _57572_210_5517_1_1534921219624_3809484_110-79134-0_sp1.1 from training. Duration: 0.92725 2023-10-02 15:04:21,293 WARNING [train.py:1204] (2/4) Exclude cut with ID _44585_210_18628_1_1533605350387_4518199_421-37641-0_sp1.1 from training. Duration: 0.936375 2023-10-02 15:04:21,374 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_42-142473-0_sp0.9 from training. Duration: 0.8 2023-10-02 15:04:22,828 WARNING [train.py:1204] (2/4) Exclude cut with ID _5166_210_2084_1_1527073197160_4087993_310-114452-0_sp0.9 from training. Duration: 0.6555625 2023-10-02 15:04:23,075 INFO [scaling.py:1118] (2/4) WithLoss: name=encoder.encoders.3.encoder.layers.0.self_attn_weights, loss-sum=0.000e+00 2023-10-02 15:04:27,027 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9029W0120-509520 from training. Duration: 0.768 2023-10-02 15:04:27,052 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9029W0433-509820 from training. Duration: 0.896 2023-10-02 15:04:28,427 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_40222_210_6228_1_1533385298_4481939_163-334586-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 15:04:28,428 WARNING [train.py:1204] (2/4) Exclude cut with ID _48677_210_5663_1_1533902405919_3892989_408-40740-0_sp1.1 from training. Duration: 0.7545625 2023-10-02 15:04:31,322 WARNING [train.py:1204] (2/4) Exclude cut with ID _24314_210_4915_1_1532135078116_7170179_521-258868-0_sp0.9 from training. Duration: 0.87775 2023-10-02 15:04:34,372 WARNING [train.py:1204] (2/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_576-51198-0_sp1.1 from training. Duration: 0.92725 2023-10-02 15:04:35,796 WARNING [train.py:1204] (2/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_306-216871-0_sp1.1 from training. Duration: 0.97275 2023-10-02 15:04:38,512 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.2.encoder.layers.1.nonlin_attention.whiten1, num_groups=1, num_channels=288, metric=5.77 vs. limit=10.0 2023-10-02 15:04:39,409 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.ff2_skip_rate, batch_count=921633.3333333334, ans=0.0 2023-10-02 15:04:40,519 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.396e+02 1.778e+02 2.000e+02 2.218e+02 5.015e+02, threshold=4.000e+02, percent-clipped=0.0 2023-10-02 15:04:40,597 WARNING [train.py:1204] (2/4) Exclude cut with ID _20684_210_6917_1_1531567751646_4203930_463-119982-0_sp1.1 from training. Duration: 0.97275 2023-10-02 15:04:40,667 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9084W0012-536850 from training. Duration: 0.64 2023-10-02 15:04:42,678 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.0.layers.0.feed_forward2.out_whiten, num_groups=1, num_channels=192, metric=6.59 vs. limit=15.0 2023-10-02 15:04:43,138 WARNING [train.py:1204] (2/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_847-41334-0_sp1.1 from training. Duration: 0.4 2023-10-02 15:04:46,400 WARNING [train.py:1204] (2/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_326-165900-0_sp1.1 from training. Duration: 0.62725 2023-10-02 15:04:46,463 WARNING [train.py:1204] (2/4) Exclude cut with ID _7537_210_2084_1_1528502395431_3924359_128-268029-0_sp1.1 from training. Duration: 0.8909375 2023-10-02 15:04:46,588 INFO [scaling.py:1118] (2/4) WithLoss: name=encoder.encoders.0.layers.0.self_attn_weights, loss-sum=0.000e+00 2023-10-02 15:04:47,829 WARNING [train.py:1204] (2/4) Exclude cut with ID _67499_210_17282_1_1535509805220_3692430_333-155030-0_sp1.1 from training. Duration: 0.97275 2023-10-02 15:04:50,603 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_34008_210_10431_1_1533345424_8183247_588-257177-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 15:04:54,552 WARNING [train.py:1204] (2/4) Exclude cut with ID _25914_210_6400_1_1532141317058_644740_52-96808-0_sp1.1 from training. Duration: 0.82725 2023-10-02 15:04:55,937 WARNING [train.py:1204] (2/4) Exclude cut with ID _56494_210_4220_1_1535284837625_3033169_69-138150-0_sp1.1 from training. Duration: 0.8545625 2023-10-02 15:04:57,347 WARNING [train.py:1204] (2/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_19-143553-0_sp1.1 from training. Duration: 0.97275 2023-10-02 15:04:57,402 WARNING [train.py:1204] (2/4) Exclude cut with ID _21878_210_3525_1_1531738772002_7264330_582-130735-0_sp1.1 from training. Duration: 0.936375 2023-10-02 15:04:58,765 WARNING [train.py:1204] (2/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_943-201356-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 15:04:58,774 WARNING [train.py:1204] (2/4) Exclude cut with ID _3231_210_2106_1_1526205864934_6784220_422-229276-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 15:04:58,803 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_48049_210_5632_1_1533864553_3519937_73-198717-0_sp1.1 from training. Duration: 0.97275 2023-10-02 15:05:00,203 WARNING [train.py:1204] (2/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_329-285801-0 from training. Duration: 0.91 2023-10-02 15:05:00,218 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9171W0114-579000 from training. Duration: 0.896 2023-10-02 15:05:00,230 WARNING [train.py:1204] (2/4) Exclude cut with ID _3120_210_3525_1_1526036143112_4072980_210-132675-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 15:05:02,150 WARNING [train.py:1204] (2/4) Exclude cut with ID _55857_210_2346_1_1534644007831_6898209_745-98391-0_sp0.9 from training. Duration: 0.8444375 2023-10-02 15:05:02,210 WARNING [train.py:1204] (2/4) Exclude cut with ID _5250_210_5219_1_1527249561806_8692110_615-322035-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 15:05:02,214 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_36756_210_7234_1_1533026991_966276_119-315767-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 15:05:02,229 WARNING [train.py:1204] (2/4) Exclude cut with ID _13916_210_9994_1_1530788341126_3763149_60-191556-0_sp1.1 from training. Duration: 0.5545625 2023-10-02 15:05:02,252 WARNING [train.py:1204] (2/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_177-236809-0_sp1.1 from training. Duration: 0.4454375 2023-10-02 15:05:03,453 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7002_210_1794_1_1527939683_5157286_184-229630-0_sp1.1 from training. Duration: 0.67275 2023-10-02 15:05:03,467 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_50428_210_5724_1_1534247532_4126418_464-18322-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 15:05:04,912 WARNING [train.py:1204] (2/4) Exclude cut with ID _35799_210_5854_1_1533263487887_3529071_466-131402-0_sp1.1 from training. Duration: 0.936375 2023-10-02 15:05:06,320 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_1259-132768-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 15:05:06,376 WARNING [train.py:1204] (2/4) Exclude cut with ID _6694_210_4599_1_1527852637490_3965529_144-185051-0_sp1.1 from training. Duration: 0.87275 2023-10-02 15:05:06,417 WARNING [train.py:1204] (2/4) Exclude cut with ID _64639_210_5816_1_1535367619437_3528000_59-133290-0_sp1.1 from training. Duration: 0.963625 2023-10-02 15:05:08,222 INFO [train.py:1046] (2/4) Epoch 27, batch 150, loss[loss=0.1759, simple_loss=0.2483, pruned_loss=0.05176, over 23775.00 frames. ], tot_loss[loss=0.1663, simple_loss=0.2452, pruned_loss=0.04372, over 2523239.15 frames. ], batch size: 179, lr: 3.83e-03, grad_scale: 16.0 2023-10-02 15:05:08,414 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder_embed.convnext.out_balancer.prob, batch_count=921766.6666666666, ans=0.125 2023-10-02 15:05:09,657 WARNING [train.py:1204] (2/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_223-49525-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 15:05:11,306 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.self_attn_weights.pos_emb_skip_rate, batch_count=921766.6666666666, ans=0.0 2023-10-02 15:05:12,332 WARNING [train.py:1204] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_748-36150-0_sp1.1 from training. Duration: 0.82725 2023-10-02 15:05:12,333 WARNING [train.py:1204] (2/4) Exclude cut with ID _19896_210_1883_1_1531709022662_6649569_52-221627-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 15:05:12,980 WARNING [train.py:1204] (2/4) Exclude cut with ID _6789_210_1534_1_1527854471671_2870519_296-115766-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 15:05:15,630 WARNING [train.py:1204] (2/4) Exclude cut with ID _14624_210_1925_1_1530858690657_3642219_100-19871-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 15:05:15,672 WARNING [train.py:1204] (2/4) Exclude cut with ID _12555_210_8750_1_1530075594402_3780239_50-293675-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 15:05:18,212 WARNING [train.py:1204] (2/4) Exclude cut with ID _14880_210_8895_1_1530680426655_3795259_289-212817-0_sp1.1 from training. Duration: 0.62725 2023-10-02 15:05:18,275 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_14056_210_4163_1_1530604483_7572827_388-211626-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 15:05:20,311 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.2.encoder.layers.1.nonlin_attention.whiten1, num_groups=1, num_channels=288, metric=5.65 vs. limit=10.0 2023-10-02 15:05:23,682 WARNING [train.py:1204] (2/4) Exclude cut with ID _5289_210_2084_1_1527678202736_3779299_392-261988-0 from training. Duration: 0.65 2023-10-02 15:05:23,692 WARNING [train.py:1204] (2/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_124-345840-0 from training. Duration: 0.52 2023-10-02 15:05:23,703 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_2655_210_2087_1_1525688959_3897400_437-337690-0 from training. Duration: 0.63 2023-10-02 15:05:25,289 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_3780_210_2087_1_1526467391_239068_13-274633-0_sp1.1 from training. Duration: 0.92725 2023-10-02 15:05:25,295 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_805-71497-0_sp1.1 from training. Duration: 0.6454375 2023-10-02 15:05:26,593 WARNING [train.py:1204] (2/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_434-83000-0_sp1.1 from training. Duration: 0.8909375 2023-10-02 15:05:28,604 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_17613_210_2151_1_1531137569_2366160_285-219531-0_sp1.1 from training. Duration: 0.97275 2023-10-02 15:05:28,617 WARNING [train.py:1204] (2/4) Exclude cut with ID _56108_210_16380_1_1534505446159_3887959_22-37858-0_sp1.1 from training. Duration: 0.936375 2023-10-02 15:05:29,901 WARNING [train.py:1204] (2/4) Exclude cut with ID _28427_210_4220_1_1532581240967_3061069_200-88444-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 15:05:29,969 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_43802_210_2290_1_1533516203_8829504_77-279042-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 15:05:31,379 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9309W0193-640320 from training. Duration: 0.896 2023-10-02 15:05:33,442 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.2.encoder.layers.2.feed_forward3.out_whiten, num_groups=1, num_channels=384, metric=9.40 vs. limit=15.0 2023-10-02 15:05:34,064 WARNING [train.py:1204] (2/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_516-324055-0_sp1.1 from training. Duration: 0.936375 2023-10-02 15:05:39,930 WARNING [train.py:1204] (2/4) Exclude cut with ID _40094_210_16380_1_1533294005773_3641159_10-261240-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 15:05:40,115 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.feed_forward1.out_proj.dropout_p, batch_count=921900.0, ans=0.1 2023-10-02 15:05:42,785 WARNING [train.py:1204] (2/4) Exclude cut with ID _6267_210_2084_1_1527764658526_3894730_375-219887-0_sp1.1 from training. Duration: 0.6090625 2023-10-02 15:05:44,105 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6788_210_1388_1_1527898698_4562021_180-75288-0 from training. Duration: 0.99 2023-10-02 15:05:47,938 WARNING [train.py:1204] (2/4) Exclude cut with ID _8030_210_7497_1_1528444886781_3531089_262-86750-0_sp1.1 from training. Duration: 0.736375 2023-10-02 15:05:47,958 WARNING [train.py:1204] (2/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_934-325498-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 15:05:48,520 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.4.encoder.layers.2.nonlin_attention.whiten2, num_groups=1, num_channels=384, metric=3.64 vs. limit=15.0 2023-10-02 15:05:49,339 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_456-22141-0_sp0.9 from training. Duration: 0.97775 2023-10-02 15:05:50,677 WARNING [train.py:1204] (2/4) Exclude cut with ID _49643_210_7989_1_1534640244224_3939031_598-161926-0_sp1.1 from training. Duration: 0.8181875 2023-10-02 15:05:52,110 WARNING [train.py:1204] (2/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_345-49721-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 15:05:52,175 WARNING [train.py:1204] (2/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_440-141811-0_sp1.1 from training. Duration: 0.863625 2023-10-02 15:05:54,802 WARNING [train.py:1204] (2/4) Exclude cut with ID _64048_210_10626_1_1535198382802_3835179_394-212120-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 15:05:54,859 WARNING [train.py:1204] (2/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_91-277684-0 from training. Duration: 0.74 2023-10-02 15:06:00,165 WARNING [train.py:1204] (2/4) Exclude cut with ID _23038_210_9570_1_1532001584158_3652419_191-346343-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 15:06:01,977 WARNING [train.py:1204] (2/4) Exclude cut with ID _15140_210_9288_1_1530791388450_4880149_351-347974-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 15:06:03,357 WARNING [train.py:1204] (2/4) Exclude cut with ID _58605_210_12210_1_1534723195939_3904259_48-269402-0_sp1.1 from training. Duration: 0.87275 2023-10-02 15:06:03,364 WARNING [train.py:1204] (2/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_401-53423-0_sp0.9 from training. Duration: 0.611125 2023-10-02 15:06:03,551 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.conv_module2.balancer2.prob, batch_count=921966.6666666666, ans=0.125 2023-10-02 15:06:06,121 WARNING [train.py:1204] (2/4) Exclude cut with ID _42659_210_2070_1_1533607442765_3120380_158-49004-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 15:06:06,233 WARNING [train.py:1204] (2/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_81-75849-0_sp1.1 from training. Duration: 0.3545625 2023-10-02 15:06:09,126 WARNING [train.py:1204] (2/4) Exclude cut with ID _27052_210_5986_1_1532329197187_3585009_314-106499-0_sp1.1 from training. Duration: 0.836375 2023-10-02 15:06:09,278 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.feed_forward2.hidden_balancer.prob, batch_count=922033.3333333334, ans=0.125 2023-10-02 15:06:12,459 WARNING [train.py:1204] (2/4) Exclude cut with ID _4626_210_3532_1_1526817652717_3916859_446-206656-0_sp0.9 from training. Duration: 0.8444375 2023-10-02 15:06:13,448 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.0.layers.0.self_attn1.whiten, num_groups=1, num_channels=192, metric=10.68 vs. limit=22.5 2023-10-02 15:06:13,792 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_53378_210_17282_1_1534221071_5769509_158-114960-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 15:06:16,499 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_3171_210_2087_1_1526109084_2330007_37-64334-0_sp0.9 from training. Duration: 0.911125 2023-10-02 15:06:17,138 WARNING [train.py:1204] (2/4) Exclude cut with ID _5410_210_1388_1_1527397055795_1719020_39-112221-0 from training. Duration: 0.69 2023-10-02 15:06:17,157 WARNING [train.py:1204] (2/4) Exclude cut with ID _8064_210_5399_1_1528372299291_4282973_4-91037-0_sp0.9 from training. Duration: 0.97775 2023-10-02 15:06:17,189 WARNING [train.py:1204] (2/4) Exclude cut with ID ID0079W0443-717795 from training. Duration: 0.541 2023-10-02 15:06:20,088 WARNING [train.py:1204] (2/4) Exclude cut with ID _34734_210_4525_1_1533365977542_4448720_766-297928-0_sp1.1 from training. Duration: 0.936375 2023-10-02 15:06:21,854 INFO [train.py:1046] (2/4) Epoch 27, batch 200, loss[loss=0.1673, simple_loss=0.2528, pruned_loss=0.04091, over 24610.00 frames. ], tot_loss[loss=0.1686, simple_loss=0.2475, pruned_loss=0.04482, over 3013260.45 frames. ], batch size: 68, lr: 3.82e-03, grad_scale: 8.0 2023-10-02 15:06:26,139 WARNING [train.py:1204] (2/4) Exclude cut with ID _51471_210_1889_1_1534399391390_3094250_310-337793-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 15:06:26,160 WARNING [train.py:1204] (2/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_678-159307-0_sp0.9 from training. Duration: 0.9444375 2023-10-02 15:06:27,500 WARNING [train.py:1204] (2/4) Exclude cut with ID _50093_210_4220_1_1534417141207_3432299_259-322252-0 from training. Duration: 0.92 2023-10-02 15:06:28,864 WARNING [train.py:1204] (2/4) Exclude cut with ID _47790_210_16187_1_1533862814066_4207519_67-103444-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 15:06:30,197 WARNING [train.py:1204] (2/4) Exclude cut with ID _40551_210_10635_1_1533776424632_3416179_311-247578-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 15:06:33,046 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_4633_210_2040_1_1526824163_4145343_416-76117-0 from training. Duration: 0.95 2023-10-02 15:06:34,358 WARNING [train.py:1204] (2/4) Exclude cut with ID _5166_210_2084_1_1527073197160_4087993_331-117829-0_sp0.9 from training. Duration: 0.688875 2023-10-02 15:06:34,484 WARNING [train.py:1204] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_299-247059-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 15:06:37,634 WARNING [train.py:1204] (2/4) Exclude cut with ID _64002_210_9570_1_1535198605157_38979_2-240845-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 15:06:40,437 WARNING [train.py:1204] (2/4) Exclude cut with ID _4681_210_3525_1_1527250689464_120889_15-126760-0_sp0.9 from training. Duration: 0.9 2023-10-02 15:06:40,449 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_21500_210_2054_1_1532149310_2716758_256-156611-0_sp1.1 from training. Duration: 0.936375 2023-10-02 15:06:40,465 WARNING [train.py:1204] (2/4) Exclude cut with ID _16908_210_5724_1_1531119157248_3772700_624-203729-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 15:06:59,840 WARNING [train.py:1204] (2/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_605-219732-0_sp1.1 from training. Duration: 0.8545625 2023-10-02 15:07:01,172 WARNING [train.py:1204] (2/4) Exclude cut with ID _44718_210_5816_1_1534050051869_6469030_380-167520-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 15:07:02,429 WARNING [train.py:1204] (2/4) Exclude cut with ID _19896_210_1883_1_1531709022662_6649569_94-229927-0_sp1.1 from training. Duration: 0.8818125 2023-10-02 15:07:02,488 WARNING [train.py:1204] (2/4) Exclude cut with ID _19514_210_9259_1_1531530071330_5084230_519-174300-0_sp1.1 from training. Duration: 0.963625 2023-10-02 15:07:03,797 WARNING [train.py:1204] (2/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_340-174176-0_sp0.9 from training. Duration: 0.5444375 2023-10-02 15:07:03,800 WARNING [train.py:1204] (2/4) Exclude cut with ID _8263_210_1664_1_1528439452891_970220_68-181805-0_sp1.1 from training. Duration: 0.5818125 2023-10-02 15:07:05,202 WARNING [train.py:1204] (2/4) Exclude cut with ID _3120_210_3525_1_1526036143112_4072980_348-257723-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 15:07:06,580 WARNING [train.py:1204] (2/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_453-133983-0_sp1.1 from training. Duration: 0.7090625 2023-10-02 15:07:08,368 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.469e+02 1.865e+02 2.065e+02 2.281e+02 3.557e+02, threshold=4.130e+02, percent-clipped=0.0 2023-10-02 15:07:08,471 WARNING [train.py:1204] (2/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_17-86060-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 15:07:08,510 WARNING [train.py:1204] (2/4) Exclude cut with ID _29877_210_6228_1_1532397791106_5276149_228-184657-0_sp1.1 from training. Duration: 0.97275 2023-10-02 15:07:08,598 WARNING [train.py:1204] (2/4) Exclude cut with ID _18516_210_9206_1_1531704653206_3692406_198-334870-0 from training. Duration: 0.92 2023-10-02 15:07:09,940 WARNING [train.py:1204] (2/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_340-174176-0_sp1.1 from training. Duration: 0.4454375 2023-10-02 15:07:09,960 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_3133_210_2087_1_1526106446_2324690_14-139186-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 15:07:14,107 WARNING [train.py:1204] (2/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_126-29104-0_sp0.9 from training. Duration: 0.9444375 2023-10-02 15:07:18,216 WARNING [train.py:1204] (2/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_84-10310-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 15:07:24,861 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_15517_210_6753_1_1530954008_3487336_79-273076-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 15:07:24,931 WARNING [train.py:1204] (2/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_371-18169-0_sp0.9 from training. Duration: 0.9555625 2023-10-02 15:07:31,818 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_68702_210_23689_1_1535799577_3420068_170-92788-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 15:07:34,355 INFO [train.py:1046] (2/4) Epoch 27, batch 250, loss[loss=0.1719, simple_loss=0.2509, pruned_loss=0.04642, over 23313.00 frames. ], tot_loss[loss=0.1682, simple_loss=0.2469, pruned_loss=0.04475, over 3390622.49 frames. ], batch size: 119, lr: 3.82e-03, grad_scale: 8.0 2023-10-02 15:07:34,448 WARNING [train.py:1204] (2/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_78-182912-0 from training. Duration: 0.59 2023-10-02 15:07:35,759 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_3019_210_2028_1_1526044245_3264865_297-12384-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 15:07:35,764 WARNING [train.py:1204] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_147-111137-0_sp1.1 from training. Duration: 0.863625 2023-10-02 15:07:35,769 WARNING [train.py:1204] (2/4) Exclude cut with ID _28323_210_16187_1_1532480395667_4100300_117-8859-0_sp1.1 from training. Duration: 0.97275 2023-10-02 15:07:35,849 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7762_210_1794_1_1528199508_4057408_419-125009-0_sp1.1 from training. Duration: 0.7545625 2023-10-02 15:07:37,685 WARNING [train.py:1204] (2/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_268-87909-0 from training. Duration: 0.99 2023-10-02 15:07:39,015 WARNING [train.py:1204] (2/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_118-311357-0_sp1.1 from training. Duration: 0.92725 2023-10-02 15:07:39,029 WARNING [train.py:1204] (2/4) Exclude cut with ID ID0377W0003-870225 from training. Duration: 0.852 2023-10-02 15:07:40,682 WARNING [train.py:1204] (2/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_56-5838-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 15:07:41,999 WARNING [train.py:1204] (2/4) Exclude cut with ID _14603_210_4220_1_1530615688441_3097319_90-31383-0_sp0.9 from training. Duration: 0.9666875 2023-10-02 15:07:42,074 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_31583_210_3852_1_1532595851_4024413_58-86657-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 15:07:42,108 WARNING [train.py:1204] (2/4) Exclude cut with ID _30304_210_6319_1_1532476827261_7582060_344-113875-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 15:07:44,778 WARNING [train.py:1204] (2/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_278-104676-0_sp1.1 from training. Duration: 0.8909375 2023-10-02 15:07:44,801 WARNING [train.py:1204] (2/4) Exclude cut with ID _55844_210_2346_1_1534471212569_7625707_764-2387-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 15:07:46,153 WARNING [train.py:1204] (2/4) Exclude cut with ID _27430_210_7770_1_1532604689375_1378590_157-14963-0_sp1.1 from training. Duration: 0.963625 2023-10-02 15:07:49,331 WARNING [train.py:1204] (2/4) Exclude cut with ID _7160_210_1876_1_1527940923231_3703799_282-322611-0_sp1.1 from training. Duration: 0.7909375 2023-10-02 15:07:55,609 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.ff2_skip_rate, batch_count=922500.0, ans=0.0 2023-10-02 15:07:58,186 WARNING [train.py:1204] (2/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_190-138640-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 15:08:02,478 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.ff3_skip_rate, batch_count=922566.6666666666, ans=0.0 2023-10-02 15:08:03,535 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_33348_210_5632_1_1533086912_3324030_63-270922-0_sp1.1 from training. Duration: 0.97275 2023-10-02 15:08:03,571 WARNING [train.py:1204] (2/4) Exclude cut with ID _54871_210_22356_1_1534665798253_3856359_135-184281-0_sp1.1 from training. Duration: 0.9 2023-10-02 15:08:09,653 WARNING [train.py:1204] (2/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_277-210798-0_sp1.1 from training. Duration: 0.5 2023-10-02 15:08:09,694 WARNING [train.py:1204] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_402-253027-0_sp0.9 from training. Duration: 0.6 2023-10-02 15:08:11,089 WARNING [train.py:1204] (2/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_540-275685-0_sp0.9 from training. Duration: 0.988875 2023-10-02 15:08:12,437 WARNING [train.py:1204] (2/4) Exclude cut with ID _59493_210_17282_1_1535078419077_3225879_118-18960-0_sp1.1 from training. Duration: 0.936375 2023-10-02 15:08:12,506 WARNING [train.py:1204] (2/4) Exclude cut with ID _6194_210_1534_1_1527511826160_3941194_321-148641-0_sp1.1 from training. Duration: 0.5818125 2023-10-02 15:08:12,512 WARNING [train.py:1204] (2/4) Exclude cut with ID _61133_210_9969_1_1535196777227_3696409_333-270325-0_sp1.1 from training. Duration: 0.8454375 2023-10-02 15:08:12,575 WARNING [train.py:1204] (2/4) Exclude cut with ID _49376_210_5986_1_1533985278732_3370740_307-145221-0_sp1.1 from training. Duration: 0.936375 2023-10-02 15:08:15,284 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6846_210_6753_1_1527850767_4295158_2-315073-0_sp0.9 from training. Duration: 0.97775 2023-10-02 15:08:16,782 WARNING [train.py:1204] (2/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_17-25045-0 from training. Duration: 0.59 2023-10-02 15:08:16,804 WARNING [train.py:1204] (2/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_503-333510-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 15:08:18,771 WARNING [train.py:1204] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_238-245655-0_sp1.1 from training. Duration: 0.736375 2023-10-02 15:08:19,021 INFO [scaling.py:1118] (2/4) WithLoss: name=encoder.encoders.3.encoder.layers.1.self_attn_weights, loss-sum=0.000e+00 2023-10-02 15:08:20,551 WARNING [train.py:1204] (2/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_329-82235-0_sp0.9 from training. Duration: 0.911125 2023-10-02 15:08:20,556 WARNING [train.py:1204] (2/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_325-113798-0_sp0.9 from training. Duration: 0.9444375 2023-10-02 15:08:20,634 WARNING [train.py:1204] (2/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_395-37222-0_sp0.9 from training. Duration: 0.8444375 2023-10-02 15:08:21,952 WARNING [train.py:1204] (2/4) Exclude cut with ID _64135_210_2253_1_1535677267876_7608972_498-5039-0_sp1.1 from training. Duration: 0.7090625 2023-10-02 15:08:21,966 WARNING [train.py:1204] (2/4) Exclude cut with ID _14922_210_6228_1_1530707435273_3693310_303-228738-0_sp0.9 from training. Duration: 0.9333125 2023-10-02 15:08:24,251 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.2.encoder.layers.0.self_attn_weights.whiten_keys, num_groups=4, num_channels=128, metric=4.37 vs. limit=6.0 2023-10-02 15:08:24,822 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_577-220899-0_sp1.1 from training. Duration: 0.92725 2023-10-02 15:08:26,642 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_3475_210_2028_1_1526193578_3622569_377-166133-0_sp1.1 from training. Duration: 0.7818125 2023-10-02 15:08:27,968 WARNING [train.py:1204] (2/4) Exclude cut with ID _49376_210_5986_1_1533985278732_3370740_272-313512-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 15:08:29,539 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.balancer1.prob, batch_count=922633.3333333334, ans=0.125 2023-10-02 15:08:29,557 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.feed_forward1.hidden_balancer.prob, batch_count=922633.3333333334, ans=0.125 2023-10-02 15:08:32,062 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7762_210_1794_1_1528199508_4057408_422-227034-0_sp0.9 from training. Duration: 0.811125 2023-10-02 15:08:36,194 WARNING [train.py:1204] (2/4) Exclude cut with ID _18895_210_6228_1_1531357274234_3784930_74-341428-0_sp1.1 from training. Duration: 0.92725 2023-10-02 15:08:37,730 WARNING [train.py:1204] (2/4) Exclude cut with ID _44902_210_16187_1_1533888006062_3792078_149-67212-0_sp1.1 from training. Duration: 0.7909375 2023-10-02 15:08:43,748 WARNING [train.py:1204] (2/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_243-126020-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 15:08:43,865 WARNING [train.py:1204] (2/4) Exclude cut with ID _54218_210_12210_1_1534413646388_4409259_259-6561-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 15:08:44,049 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.bypass.scale_min, batch_count=922700.0, ans=0.2 2023-10-02 15:08:47,738 INFO [train.py:1046] (2/4) Epoch 27, batch 300, loss[loss=0.1734, simple_loss=0.2389, pruned_loss=0.05393, over 23725.00 frames. ], tot_loss[loss=0.1673, simple_loss=0.2455, pruned_loss=0.0445, over 3693172.93 frames. ], batch size: 232, lr: 3.82e-03, grad_scale: 8.0 2023-10-02 15:08:47,807 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_41452_210_7291_1_1533437687_3941281_405-11559-0 from training. Duration: 0.91 2023-10-02 15:08:49,696 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_20120_210_6753_1_1531643035_5482859_759-225084-0_sp1.1 from training. Duration: 0.97275 2023-10-02 15:08:49,715 WARNING [train.py:1204] (2/4) Exclude cut with ID _14747_210_8341_1_1530685797463_3654089_147-131229-0_sp0.9 from training. Duration: 0.8444375 2023-10-02 15:08:51,105 WARNING [train.py:1204] (2/4) Exclude cut with ID _14867_210_1968_1_1530684179890_3846969_319-250868-0 from training. Duration: 0.99 2023-10-02 15:08:52,372 WARNING [train.py:1204] (2/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_580-93798-0_sp0.9 from training. Duration: 0.62225 2023-10-02 15:08:55,580 WARNING [train.py:1204] (2/4) Exclude cut with ID _9679_210_7497_1_1529129212906_4363929_512-313889-0_sp0.9 from training. Duration: 0.9 2023-10-02 15:08:55,590 WARNING [train.py:1204] (2/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_18-189198-0 from training. Duration: 0.9 2023-10-02 15:08:59,042 WARNING [train.py:1204] (2/4) Exclude cut with ID _49755_210_18053_1_1534581005846_3218709_137-64179-0_sp1.1 from training. Duration: 0.92725 2023-10-02 15:09:00,396 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_48049_210_5632_1_1533864553_3519937_306-283794-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 15:09:03,198 WARNING [train.py:1204] (2/4) Exclude cut with ID _43552_210_8913_1_1533726031059_3523429_25-120161-0_sp1.1 from training. Duration: 0.8909375 2023-10-02 15:09:04,451 WARNING [train.py:1204] (2/4) Exclude cut with ID _8833_210_5219_1_1528592665210_7553059_319-349181-0 from training. Duration: 0.99 2023-10-02 15:09:04,542 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_296-332325-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 15:09:05,878 WARNING [train.py:1204] (2/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_436-186311-0_sp1.1 from training. Duration: 0.5909375 2023-10-02 15:09:05,889 WARNING [train.py:1204] (2/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_348-127789-0 from training. Duration: 0.85 2023-10-02 15:09:05,903 WARNING [train.py:1204] (2/4) Exclude cut with ID _3992_210_1747_1_1526644816031_3731060_443-195183-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 15:09:10,003 WARNING [train.py:1204] (2/4) Exclude cut with ID _3465_210_3525_1_1526175667060_3574049_308-231735-0_sp1.1 from training. Duration: 0.663625 2023-10-02 15:09:14,632 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6786_210_1388_1_1527839699_4194495_378-46070-0_sp1.1 from training. Duration: 0.6545625 2023-10-02 15:09:14,680 WARNING [train.py:1204] (2/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_63-101996-0 from training. Duration: 0.88 2023-10-02 15:09:17,450 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_2541_210_1794_1_1525590269_3084765_156-173713-0 from training. Duration: 0.81 2023-10-02 15:09:17,495 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_217-239796-0_sp1.1 from training. Duration: 0.963625 2023-10-02 15:09:20,775 WARNING [train.py:1204] (2/4) Exclude cut with ID _21878_210_3525_1_1531738772002_7264330_530-329400-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 15:09:24,048 WARNING [train.py:1204] (2/4) Exclude cut with ID _21783_210_11357_1_1531742458730_3762679_150-62920-0_sp1.1 from training. Duration: 0.963625 2023-10-02 15:09:24,049 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_5522_210_3273_1_1527249606_3909096_366-172508-0 from training. Duration: 0.66 2023-10-02 15:09:24,051 WARNING [train.py:1204] (2/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_303-340446-0_sp1.1 from training. Duration: 0.8181875 2023-10-02 15:09:24,175 WARNING [train.py:1204] (2/4) Exclude cut with ID _8699_210_2069_1_1528630346831_3960409_160-24408-0_sp1.1 from training. Duration: 0.82725 2023-10-02 15:09:25,549 WARNING [train.py:1204] (2/4) Exclude cut with ID _41314_210_12651_1_1533459476543_3766460_299-187801-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 15:09:25,580 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_34886_210_10633_1_1533203882_1755661_92-345288-0_sp1.1 from training. Duration: 0.97275 2023-10-02 15:09:29,006 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.1.feed_forward3.hidden_balancer.prob, batch_count=922900.0, ans=0.125 2023-10-02 15:09:31,626 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_5438_210_1794_1_1527336687_2600009_104-288274-0_sp1.1 from training. Duration: 0.52725 2023-10-02 15:09:31,647 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_154-316450-0 from training. Duration: 0.76 2023-10-02 15:09:32,987 WARNING [train.py:1204] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_23-189234-0_sp0.9 from training. Duration: 0.92225 2023-10-02 15:09:33,321 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.feed_forward1.out_proj.dropout_p, batch_count=922966.6666666666, ans=0.1 2023-10-02 15:09:34,450 WARNING [train.py:1204] (2/4) Exclude cut with ID _4779_210_2084_1_1527318022944_3914893_346-168820-0_sp1.1 from training. Duration: 0.963625 2023-10-02 15:09:35,794 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.609e+02 1.868e+02 2.076e+02 2.400e+02 4.267e+02, threshold=4.152e+02, percent-clipped=1.0 2023-10-02 15:09:35,924 WARNING [train.py:1204] (2/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_5-55893-0 from training. Duration: 0.55 2023-10-02 15:09:37,253 WARNING [train.py:1204] (2/4) Exclude cut with ID _20455_210_5219_1_1531565996775_8098100_180-24517-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 15:09:38,813 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=922966.6666666666, ans=0.1 2023-10-02 15:09:41,665 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_3133_210_2087_1_1526106446_2324690_243-143119-0_sp0.9 from training. Duration: 0.8555625 2023-10-02 15:09:43,186 WARNING [train.py:1204] (2/4) Exclude cut with ID _65585_210_12210_1_1535450451584_3764098_293-212045-0_sp1.1 from training. Duration: 0.92725 2023-10-02 15:09:43,190 WARNING [train.py:1204] (2/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_577-332600-0 from training. Duration: 0.57 2023-10-02 15:09:47,369 WARNING [train.py:1204] (2/4) Exclude cut with ID _30039_210_13270_1_1532484612900_2725150_104-289874-0_sp1.1 from training. Duration: 0.963625 2023-10-02 15:09:47,376 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_3172_210_2087_1_1526292929_3987865_197-97762-0_sp1.1 from training. Duration: 0.6818125 2023-10-02 15:09:49,373 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_256-150908-0_sp1.1 from training. Duration: 0.963625 2023-10-02 15:09:50,738 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_296-287678-0_sp1.1 from training. Duration: 0.863625 2023-10-02 15:09:50,763 WARNING [train.py:1204] (2/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_443-30034-0 from training. Duration: 0.64 2023-10-02 15:09:52,639 WARNING [train.py:1204] (2/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_494-199538-0_sp0.9 from training. Duration: 0.8333125 2023-10-02 15:09:52,695 WARNING [train.py:1204] (2/4) Exclude cut with ID _19708_210_9206_1_1532136650733_4238364_61-162325-0_sp1.1 from training. Duration: 0.97275 2023-10-02 15:09:54,002 WARNING [train.py:1204] (2/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_475-127455-0 from training. Duration: 0.7 2023-10-02 15:09:55,432 WARNING [train.py:1204] (2/4) Exclude cut with ID _48677_210_5663_1_1533902405919_3892989_188-11581-0_sp1.1 from training. Duration: 0.963625 2023-10-02 15:09:55,472 WARNING [train.py:1204] (2/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_136-8381-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 15:09:55,610 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.1.feed_forward2.hidden_balancer.prob, batch_count=923033.3333333334, ans=0.125 2023-10-02 15:09:57,294 WARNING [train.py:1204] (2/4) Exclude cut with ID _55328_210_27767_1_1534580973260_4088170_270-229555-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 15:09:57,332 WARNING [train.py:1204] (2/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_372-266951-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 15:09:58,653 WARNING [train.py:1204] (2/4) Exclude cut with ID _47790_210_16187_1_1533862814066_4207519_14-242149-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 15:10:02,727 INFO [train.py:1046] (2/4) Epoch 27, batch 350, loss[loss=0.1748, simple_loss=0.2559, pruned_loss=0.04679, over 23415.00 frames. ], tot_loss[loss=0.1658, simple_loss=0.244, pruned_loss=0.0438, over 3929722.38 frames. ], batch size: 93, lr: 3.82e-03, grad_scale: 8.0 2023-10-02 15:10:02,843 WARNING [train.py:1204] (2/4) Exclude cut with ID _7877_210_6014_1_1528286358099_3750136_159-76512-0_sp1.1 from training. Duration: 0.936375 2023-10-02 15:10:02,845 WARNING [train.py:1204] (2/4) Exclude cut with ID _8263_210_1664_1_1528438953494_294100_40-204731-0_sp1.1 from training. Duration: 0.4545625 2023-10-02 15:10:05,453 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8278_210_2550_1_1528637099_4220413_694-238413-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 15:10:11,387 WARNING [train.py:1204] (2/4) Exclude cut with ID _47278_210_12041_1_1533902418349_3576110_421-172710-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 15:10:12,965 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=923100.0, ans=0.1 2023-10-02 15:10:13,995 WARNING [train.py:1204] (2/4) Exclude cut with ID _14970_210_15709_1_1530773543493_1715730_215-282680-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 15:10:14,023 WARNING [train.py:1204] (2/4) Exclude cut with ID _64639_210_5816_1_1535367619437_3528000_306-149449-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 15:10:15,560 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_28-291982-0 from training. Duration: 0.97 2023-10-02 15:10:18,651 WARNING [train.py:1204] (2/4) Exclude cut with ID _55134_210_21982_1_1534590028377_3882170_412-15870-0_sp1.1 from training. Duration: 0.936375 2023-10-02 15:10:18,684 WARNING [train.py:1204] (2/4) Exclude cut with ID _13684_210_7497_1_1530619354863_3598359_312-235606-0 from training. Duration: 0.6 2023-10-02 15:10:21,511 WARNING [train.py:1204] (2/4) Exclude cut with ID _37112_210_2575_1_1532997007673_7268610_347-238993-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 15:10:23,445 WARNING [train.py:1204] (2/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_121-284152-0 from training. Duration: 0.59 2023-10-02 15:10:24,834 WARNING [train.py:1204] (2/4) Exclude cut with ID _2941_210_2577_1_1526199673150_2489759_228-2961-0_sp1.1 from training. Duration: 0.97275 2023-10-02 15:10:27,602 WARNING [train.py:1204] (2/4) Exclude cut with ID _28323_210_16187_1_1532480395667_4100300_531-24157-0 from training. Duration: 0.98 2023-10-02 15:10:29,529 WARNING [train.py:1204] (2/4) Exclude cut with ID _32704_210_8954_1_1533017488840_6747370_266-329755-0_sp0.9 from training. Duration: 0.988875 2023-10-02 15:10:30,969 WARNING [train.py:1204] (2/4) Exclude cut with ID _6653_210_2629_1_1527919172441_6732679_1038-146421-0_sp1.1 from training. Duration: 0.97275 2023-10-02 15:10:31,027 WARNING [train.py:1204] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_391-22148-0_sp0.9 from training. Duration: 0.8555625 2023-10-02 15:10:33,723 WARNING [train.py:1204] (2/4) Exclude cut with ID _64521_210_14699_1_1535243427259_3815328_543-341117-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 15:10:33,742 WARNING [train.py:1204] (2/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_52-168712-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 15:10:33,779 WARNING [train.py:1204] (2/4) Exclude cut with ID _33920_210_4220_1_1533257851500_3084399_198-334063-0_sp1.1 from training. Duration: 0.8545625 2023-10-02 15:10:33,799 WARNING [train.py:1204] (2/4) Exclude cut with ID _50093_210_4220_1_1534417141207_3432299_69-111987-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 15:10:33,834 WARNING [train.py:1204] (2/4) Exclude cut with ID _14821_210_8750_1_1530680503991_6829359_189-297451-0_sp0.9 from training. Duration: 0.611125 2023-10-02 15:10:34,043 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=923233.3333333334, ans=0.1 2023-10-02 15:10:35,224 WARNING [train.py:1204] (2/4) Exclude cut with ID _8030_210_7497_1_1528444886781_3531089_262-86750-0_sp0.9 from training. Duration: 0.9 2023-10-02 15:10:35,229 WARNING [train.py:1204] (2/4) Exclude cut with ID _24612_210_8913_1_1531998054193_3806359_294-191196-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 15:10:40,059 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.feed_forward1.hidden_balancer.prob, batch_count=923233.3333333334, ans=0.125 2023-10-02 15:10:43,956 WARNING [train.py:1204] (2/4) Exclude cut with ID _16909_210_5724_1_1531292079912_4118919_707-342896-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 15:10:43,978 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_9460_210_4211_1_1528954587_10049667_572-37035-0_sp0.9 from training. Duration: 0.888875 2023-10-02 15:10:44,023 WARNING [train.py:1204] (2/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_762-109886-0_sp1.1 from training. Duration: 0.82725 2023-10-02 15:10:45,416 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_14953_210_2151_1_1530701710_5616165_519-326393-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 15:10:48,461 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.ff3_skip_rate, batch_count=923300.0, ans=0.0 2023-10-02 15:10:48,720 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.4.encoder.layers.2.feed_forward2.out_whiten, num_groups=1, num_channels=384, metric=9.92 vs. limit=15.0 2023-10-02 15:10:50,940 WARNING [train.py:1204] (2/4) Exclude cut with ID _25914_210_6400_1_1532141317058_644740_52-96808-0 from training. Duration: 0.91 2023-10-02 15:10:50,946 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_68095_210_20185_1_1535718226_8888855_1079-94525-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 15:10:56,231 WARNING [train.py:1204] (2/4) Exclude cut with ID _14860_210_4915_1_1530702020340_7343240_746-3647-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 15:10:56,232 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_70379_210_7925_1_1535941455_557455_49-101720-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 15:10:56,256 WARNING [train.py:1204] (2/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_8-307291-0_sp1.1 from training. Duration: 0.936375 2023-10-02 15:10:57,697 WARNING [train.py:1204] (2/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_117-290916-0 from training. Duration: 0.51 2023-10-02 15:11:00,876 WARNING [train.py:1204] (2/4) Exclude cut with ID _22020_210_2461_1_1531724164513_6326900_944-339840-0_sp1.1 from training. Duration: 0.963625 2023-10-02 15:11:00,946 WARNING [train.py:1204] (2/4) Exclude cut with ID IC0515W0043-254911 from training. Duration: 0.976 2023-10-02 15:11:02,272 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_3910_210_1794_1_1526742306_334530_28-238909-0 from training. Duration: 0.81 2023-10-02 15:11:02,295 WARNING [train.py:1204] (2/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_269-267275-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 15:11:05,131 WARNING [train.py:1204] (2/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_72-251255-0_sp1.1 from training. Duration: 0.8545625 2023-10-02 15:11:05,147 WARNING [train.py:1204] (2/4) Exclude cut with ID _14886_210_4220_1_1530702020355_3516190_175-229073-0 from training. Duration: 0.45 2023-10-02 15:11:07,807 WARNING [train.py:1204] (2/4) Exclude cut with ID _40519_210_13794_1_1533715228655_7337390_921-209452-0_sp1.1 from training. Duration: 0.963625 2023-10-02 15:11:11,158 WARNING [train.py:1204] (2/4) Exclude cut with ID _29857_210_20034_1_1532395886753_3798160_335-163562-0_sp0.9 from training. Duration: 0.9444375 2023-10-02 15:11:12,682 WARNING [train.py:1204] (2/4) Exclude cut with ID _58583_210_5236_1_1534726835266_3574249_384-58445-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 15:11:14,061 WARNING [train.py:1204] (2/4) Exclude cut with ID _44011_210_2046_1_1533722401691_3631310_96-315656-0_sp1.1 from training. Duration: 0.963625 2023-10-02 15:11:14,066 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_68484_210_1794_1_1535849804_7678097_549-170613-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 15:11:16,746 INFO [train.py:1046] (2/4) Epoch 27, batch 400, loss[loss=0.1684, simple_loss=0.2568, pruned_loss=0.04001, over 24658.00 frames. ], tot_loss[loss=0.1655, simple_loss=0.243, pruned_loss=0.04398, over 4094685.33 frames. ], batch size: 68, lr: 3.82e-03, grad_scale: 16.0 2023-10-02 15:11:16,817 WARNING [train.py:1204] (2/4) Exclude cut with ID _37892_210_19596_1_1533198677897_3964058_214-148058-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 15:11:16,974 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.0.bypass_mid.scale_min, batch_count=923433.3333333334, ans=0.2 2023-10-02 15:11:19,492 WARNING [train.py:1204] (2/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_415-78792-0_sp0.9 from training. Duration: 0.9 2023-10-02 15:11:20,913 WARNING [train.py:1204] (2/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_383-205833-0_sp1.1 from training. Duration: 0.863625 2023-10-02 15:11:21,117 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.1.feed_forward2.hidden_balancer.prob, batch_count=923433.3333333334, ans=0.125 2023-10-02 15:11:22,289 WARNING [train.py:1204] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_167-137255-0 from training. Duration: 0.64 2023-10-02 15:11:22,302 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8275_210_4198_1_1528455320_3892571_484-154786-0_sp1.1 from training. Duration: 0.963625 2023-10-02 15:11:23,656 WARNING [train.py:1204] (2/4) Exclude cut with ID _40284_210_2084_1_1533513679814_3704279_396-319412-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 15:11:25,672 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.conv_module1.balancer1.max_abs, batch_count=923433.3333333334, ans=10.0 2023-10-02 15:11:26,852 WARNING [train.py:1204] (2/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_280-120942-0_sp0.9 from training. Duration: 0.8555625 2023-10-02 15:11:26,911 WARNING [train.py:1204] (2/4) Exclude cut with ID _45630_210_10556_1_1533949174498_3902049_558-240889-0_sp1.1 from training. Duration: 0.92725 2023-10-02 15:11:29,566 WARNING [train.py:1204] (2/4) Exclude cut with ID _56235_210_10640_1_1534557335235_4161099_197-307458-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 15:11:31,490 WARNING [train.py:1204] (2/4) Exclude cut with ID _16909_210_5724_1_1531292079912_4118919_163-254843-0_sp1.1 from training. Duration: 0.92725 2023-10-02 15:11:32,930 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_103-84010-0 from training. Duration: 0.69 2023-10-02 15:11:33,045 WARNING [train.py:1204] (2/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_401-158239-0 from training. Duration: 0.71 2023-10-02 15:11:33,046 WARNING [train.py:1204] (2/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_899-233658-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 15:11:35,885 WARNING [train.py:1204] (2/4) Exclude cut with ID _23825_210_13449_1_1531915313526_3497150_240-271367-0 from training. Duration: 0.97 2023-10-02 15:11:35,930 WARNING [train.py:1204] (2/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_424-134619-0_sp1.1 from training. Duration: 0.92725 2023-10-02 15:11:37,462 INFO [scaling.py:1118] (2/4) WithLoss: name=encoder.encoders.0.layers.0.self_attn_weights, loss-sum=0.000e+00 2023-10-02 15:11:38,765 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.0.conv_module1.balancer1.prob, batch_count=923500.0, ans=0.125 2023-10-02 15:11:39,906 WARNING [train.py:1204] (2/4) Exclude cut with ID _44831_210_13427_1_1533816011549_3689035_32-188147-0_sp1.1 from training. Duration: 0.87275 2023-10-02 15:11:39,912 WARNING [train.py:1204] (2/4) Exclude cut with ID _59170_210_6721_1_1534983633960_4203335_314-63879-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 15:11:39,930 WARNING [train.py:1204] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_602-275260-0 from training. Duration: 0.82 2023-10-02 15:11:41,195 WARNING [train.py:1204] (2/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_511-306767-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 15:11:41,236 WARNING [train.py:1204] (2/4) Exclude cut with ID _41817_210_18835_1_1533434477160_7423049_565-201775-0_sp1.1 from training. Duration: 0.92725 2023-10-02 15:11:41,245 WARNING [train.py:1204] (2/4) Exclude cut with ID _28317_210_12742_1_1532480302381_8510325_778-173151-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 15:11:43,163 WARNING [train.py:1204] (2/4) Exclude cut with ID _14998_210_5219_1_1530705582080_7474970_680-51001-0_sp1.1 from training. Duration: 0.963625 2023-10-02 15:11:44,662 WARNING [train.py:1204] (2/4) Exclude cut with ID IC0677W0059-334891 from training. Duration: 0.896 2023-10-02 15:11:44,701 WARNING [train.py:1204] (2/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_370-331600-0 from training. Duration: 0.94 2023-10-02 15:11:48,752 WARNING [train.py:1204] (2/4) Exclude cut with ID _66840_210_5219_1_1535452168426_4196349_446-240192-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 15:11:50,185 WARNING [train.py:1204] (2/4) Exclude cut with ID _65242_210_18628_1_1535369393717_3823149_38-6732-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 15:11:50,253 WARNING [train.py:1204] (2/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_302-2550-0 from training. Duration: 0.81 2023-10-02 15:11:52,773 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_43802_210_2290_1_1533516203_8829504_100-348441-0 from training. Duration: 0.91 2023-10-02 15:11:55,516 WARNING [train.py:1204] (2/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_329-285801-0_sp1.1 from training. Duration: 0.82725 2023-10-02 15:11:55,665 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_30167_210_3528_1_1532428012_4772375_343-334712-0_sp1.1 from training. Duration: 0.936375 2023-10-02 15:12:03,394 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.483e+02 1.781e+02 1.952e+02 2.127e+02 3.140e+02, threshold=3.905e+02, percent-clipped=0.0 2023-10-02 15:12:03,530 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7543_210_6753_1_1528543318_4552_1-85072-0 from training. Duration: 0.83 2023-10-02 15:12:03,796 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.balancer2.prob, batch_count=923633.3333333334, ans=0.125 2023-10-02 15:12:06,316 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7002_210_1794_1_1527935634_4034444_487-236501-0_sp1.1 from training. Duration: 0.6 2023-10-02 15:12:07,686 WARNING [train.py:1204] (2/4) Exclude cut with ID _7160_210_1876_1_1527940923231_3703799_282-322611-0 from training. Duration: 0.87 2023-10-02 15:12:07,850 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.ff2_skip_rate, batch_count=923633.3333333334, ans=0.0 2023-10-02 15:12:08,690 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.0.layers.0.conv_module1.whiten, num_groups=1, num_channels=192, metric=10.63 vs. limit=15.0 2023-10-02 15:12:09,997 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.0.layers.0.self_attn1.whiten, num_groups=1, num_channels=192, metric=10.42 vs. limit=22.5 2023-10-02 15:12:10,322 WARNING [train.py:1204] (2/4) Exclude cut with ID _67335_210_5219_1_1535525975652_4275229_570-262983-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 15:12:12,268 WARNING [train.py:1204] (2/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_625-338546-0_sp1.1 from training. Duration: 0.8 2023-10-02 15:12:13,607 WARNING [train.py:1204] (2/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_176-56120-0 from training. Duration: 0.83 2023-10-02 15:12:17,706 WARNING [train.py:1204] (2/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_963-187109-0_sp1.1 from training. Duration: 0.9 2023-10-02 15:12:19,157 WARNING [train.py:1204] (2/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_121-284152-0_sp0.9 from training. Duration: 0.6555625 2023-10-02 15:12:20,612 WARNING [train.py:1204] (2/4) Exclude cut with ID _45021_210_9994_1_1533619751094_3330288_104-81711-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 15:12:22,092 WARNING [train.py:1204] (2/4) Exclude cut with ID _51382_210_23862_1_1534492801838_3650104_129-343914-0_sp1.1 from training. Duration: 0.936375 2023-10-02 15:12:22,125 WARNING [train.py:1204] (2/4) Exclude cut with ID _14970_210_15709_1_1530775547361_1966965_8-169924-0 from training. Duration: 0.92 2023-10-02 15:12:24,990 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_2295_210_2087_1_1525500993_2407863_234-83104-0_sp1.1 from training. Duration: 0.563625 2023-10-02 15:12:27,536 WARNING [train.py:1204] (2/4) Exclude cut with ID _14687_210_5854_1_1530703838259_3664889_115-239046-0 from training. Duration: 0.67 2023-10-02 15:12:29,446 INFO [train.py:1046] (2/4) Epoch 27, batch 450, loss[loss=0.1647, simple_loss=0.243, pruned_loss=0.04323, over 23696.00 frames. ], tot_loss[loss=0.1657, simple_loss=0.2436, pruned_loss=0.04392, over 4235494.42 frames. ], batch size: 149, lr: 3.82e-03, grad_scale: 16.0 2023-10-02 15:12:29,528 WARNING [train.py:1204] (2/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_690-126306-0_sp1.1 from training. Duration: 0.7090625 2023-10-02 15:12:29,543 WARNING [train.py:1204] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_625-102955-0_sp0.9 from training. Duration: 0.8555625 2023-10-02 15:12:32,773 WARNING [train.py:1204] (2/4) Exclude cut with ID _9389_210_1664_1_1529217337301_5289269_289-213417-0 from training. Duration: 0.89 2023-10-02 15:12:34,185 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.2.encoder.layers.1.feed_forward2.out_whiten, num_groups=1, num_channels=384, metric=3.91 vs. limit=15.0 2023-10-02 15:12:34,751 WARNING [train.py:1204] (2/4) Exclude cut with ID _13253_210_1925_1_1530757775819_3577873_191-144977-0_sp0.9 from training. Duration: 0.8666875 2023-10-02 15:12:34,798 WARNING [train.py:1204] (2/4) Exclude cut with ID _21783_210_11357_1_1531742458730_3762679_325-155703-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 15:12:34,964 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.conv_module2.balancer2.min_abs, batch_count=923766.6666666666, ans=0.5 2023-10-02 15:12:36,242 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_2295_210_2087_1_1525500993_2407863_116-238894-0_sp0.9 from training. Duration: 0.77775 2023-10-02 15:12:37,642 WARNING [train.py:1204] (2/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_94-126128-0 from training. Duration: 0.86 2023-10-02 15:12:38,931 WARNING [train.py:1204] (2/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_395-310220-0_sp0.9 from training. Duration: 0.988875 2023-10-02 15:12:39,007 WARNING [train.py:1204] (2/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_821-110852-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 15:12:39,058 WARNING [train.py:1204] (2/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_64-248359-0_sp1.1 from training. Duration: 0.62725 2023-10-02 15:12:40,379 WARNING [train.py:1204] (2/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_138-187031-0 from training. Duration: 0.99 2023-10-02 15:12:40,429 WARNING [train.py:1204] (2/4) Exclude cut with ID _4122_210_2084_1_1526713164417_4353280_528-206771-0_sp0.9 from training. Duration: 0.97775 2023-10-02 15:12:41,698 WARNING [train.py:1204] (2/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_844-96989-0_sp1.1 from training. Duration: 0.6454375 2023-10-02 15:12:43,102 WARNING [train.py:1204] (2/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_106-30782-0_sp1.1 from training. Duration: 0.72725 2023-10-02 15:12:43,350 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.conv_module2.balancer2.prob, batch_count=923833.3333333334, ans=0.125 2023-10-02 15:12:48,959 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.0.layers.0.conv_module1.whiten, num_groups=1, num_channels=192, metric=9.73 vs. limit=15.0 2023-10-02 15:12:52,168 WARNING [train.py:1204] (2/4) Exclude cut with ID _14683_210_5219_1_1530698352608_3974859_281-250040-0_sp1.1 from training. Duration: 0.936375 2023-10-02 15:12:52,202 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_46013_210_7234_1_1533776362_3744032_463-52378-0_sp0.9 from training. Duration: 0.9555625 2023-10-02 15:12:54,995 WARNING [train.py:1204] (2/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_299-34413-0 from training. Duration: 0.65 2023-10-02 15:12:55,054 WARNING [train.py:1204] (2/4) Exclude cut with ID _35799_210_5854_1_1533263487887_3529071_490-326847-0 from training. Duration: 0.99 2023-10-02 15:12:59,063 WARNING [train.py:1204] (2/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_589-9070-0_sp0.9 from training. Duration: 0.788875 2023-10-02 15:13:00,575 WARNING [train.py:1204] (2/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_884-29515-0_sp1.1 from training. Duration: 0.936375 2023-10-02 15:13:02,622 WARNING [train.py:1204] (2/4) Exclude cut with ID _15012_210_1968_1_1530856820429_5095350_474-119995-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 15:13:05,898 WARNING [train.py:1204] (2/4) Exclude cut with ID _65585_210_12210_1_1535450451584_3764098_211-97124-0_sp1.1 from training. Duration: 0.963625 2023-10-02 15:13:07,193 WARNING [train.py:1204] (2/4) Exclude cut with ID _33640_210_2655_1_1533117605479_4855469_666-224463-0_sp1.1 from training. Duration: 0.963625 2023-10-02 15:13:08,704 WARNING [train.py:1204] (2/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_35-153886-0 from training. Duration: 0.84 2023-10-02 15:13:09,995 WARNING [train.py:1204] (2/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_210-106047-0 from training. Duration: 0.97 2023-10-02 15:13:10,215 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.1.feed_forward2.hidden_balancer.prob, batch_count=923900.0, ans=0.125 2023-10-02 15:13:11,462 WARNING [train.py:1204] (2/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_479-125611-0 from training. Duration: 0.96 2023-10-02 15:13:11,493 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_9417_210_1794_1_1529235261_4614131_427-153963-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 15:13:12,822 WARNING [train.py:1204] (2/4) Exclude cut with ID _23818_210_6228_1_1531917063884_3676920_133-170571-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 15:13:12,882 WARNING [train.py:1204] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_602-275260-0_sp1.1 from training. Duration: 0.7454375 2023-10-02 15:13:14,682 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9029W0121-509521 from training. Duration: 0.896 2023-10-02 15:13:16,069 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9029W0434-509821 from training. Duration: 0.64 2023-10-02 15:13:16,085 WARNING [train.py:1204] (2/4) Exclude cut with ID _23038_210_9570_1_1532001584158_3652419_289-307034-0_sp1.1 from training. Duration: 0.936375 2023-10-02 15:13:17,434 WARNING [train.py:1204] (2/4) Exclude cut with ID _29345_210_4381_1_1532689168263_3860169_149-257268-0_sp0.9 from training. Duration: 0.9 2023-10-02 15:13:18,764 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_405-339986-0_sp1.1 from training. Duration: 0.463625 2023-10-02 15:13:20,284 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_2655_210_2087_1_1525688959_3897400_437-337690-0_sp1.1 from training. Duration: 0.57275 2023-10-02 15:13:20,312 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7543_210_6753_1_1528541907_1399322_165-323619-0_sp1.1 from training. Duration: 0.62725 2023-10-02 15:13:21,723 WARNING [train.py:1204] (2/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_508-335369-0_sp1.1 from training. Duration: 0.436375 2023-10-02 15:13:21,789 WARNING [train.py:1204] (2/4) Exclude cut with ID _7852_210_5219_1_1528286471316_8139400_358-287492-0 from training. Duration: 0.96 2023-10-02 15:13:24,700 WARNING [train.py:1204] (2/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_322-217220-0_sp0.9 from training. Duration: 0.9555625 2023-10-02 15:13:26,200 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_3171_210_2087_1_1526109084_2330007_66-260583-0_sp0.9 from training. Duration: 0.8 2023-10-02 15:13:28,092 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8558_210_1410_1_1528505225_7613406_131-329524-0_sp1.1 from training. Duration: 0.7181875 2023-10-02 15:13:29,447 WARNING [train.py:1204] (2/4) Exclude cut with ID _9235_210_5219_1_1528891154253_8046120_30-326681-0 from training. Duration: 0.83 2023-10-02 15:13:31,078 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.feed_forward1.hidden_balancer.prob, batch_count=924033.3333333334, ans=0.125 2023-10-02 15:13:34,057 WARNING [train.py:1204] (2/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_58-68956-0_sp0.9 from training. Duration: 0.92225 2023-10-02 15:13:34,134 WARNING [train.py:1204] (2/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_255-312654-0 from training. Duration: 0.91 2023-10-02 15:13:35,936 WARNING [train.py:1204] (2/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_99-167392-0 from training. Duration: 0.61 2023-10-02 15:13:37,268 WARNING [train.py:1204] (2/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_94-126128-0_sp0.9 from training. Duration: 0.9555625 2023-10-02 15:13:41,552 WARNING [train.py:1204] (2/4) Exclude cut with ID _68635_210_9317_1_1535770813410_4315870_187-192817-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 15:13:42,985 WARNING [train.py:1204] (2/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_277-204970-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 15:13:44,260 INFO [train.py:1046] (2/4) Epoch 27, batch 500, loss[loss=0.1709, simple_loss=0.2447, pruned_loss=0.04853, over 23578.00 frames. ], tot_loss[loss=0.1678, simple_loss=0.2453, pruned_loss=0.04514, over 4328994.63 frames. ], batch size: 256, lr: 3.82e-03, grad_scale: 16.0 2023-10-02 15:13:44,384 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6163_210_6400_1_1527506746_4263068_283-137811-0_sp0.9 from training. Duration: 0.8555625 2023-10-02 15:13:46,112 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9144W0121-566446 from training. Duration: 0.896 2023-10-02 15:13:50,315 WARNING [train.py:1204] (2/4) Exclude cut with ID _49371_210_12071_1_1533952761021_3812190_351-315519-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 15:13:50,368 WARNING [train.py:1204] (2/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_115-326778-0_sp1.1 from training. Duration: 0.8454375 2023-10-02 15:13:50,604 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.feed_forward3.hidden_balancer.prob, batch_count=924100.0, ans=0.125 2023-10-02 15:13:51,668 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_17292_210_6753_1_1531125388_2943734_436-198965-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 15:13:51,677 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9165W0117-576751 from training. Duration: 0.896 2023-10-02 15:13:53,065 WARNING [train.py:1204] (2/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_212-149947-0 from training. Duration: 0.73 2023-10-02 15:13:53,074 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_13757_210_3528_1_1530440682_4107239_65-167544-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 15:13:55,897 WARNING [train.py:1204] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_700-183549-0_sp0.9 from training. Duration: 0.7555625 2023-10-02 15:14:00,484 WARNING [train.py:1204] (2/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_216-343007-0_sp0.9 from training. Duration: 0.7333125 2023-10-02 15:14:01,795 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7544_210_6753_1_1528587722_3576686_295-258479-0_sp1.1 from training. Duration: 0.763625 2023-10-02 15:14:03,756 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_365-287106-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 15:14:03,768 WARNING [train.py:1204] (2/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_300-3747-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 15:14:05,139 WARNING [train.py:1204] (2/4) Exclude cut with ID _14069_210_5632_1_1530512692033_4803390_487-303201-0_sp1.1 from training. Duration: 0.92725 2023-10-02 15:14:14,047 WARNING [train.py:1204] (2/4) Exclude cut with ID _14459_210_2228_1_1531443641946_7516700_668-123026-0_sp1.1 from training. Duration: 0.936375 2023-10-02 15:14:14,077 WARNING [train.py:1204] (2/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_108-53929-0_sp1.1 from training. Duration: 0.536375 2023-10-02 15:14:14,106 WARNING [train.py:1204] (2/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_266-137104-0_sp0.9 from training. Duration: 0.72225 2023-10-02 15:14:15,394 WARNING [train.py:1204] (2/4) Exclude cut with ID _12302_210_5219_1_1529924446466_8107280_902-168619-0_sp1.1 from training. Duration: 0.936375 2023-10-02 15:14:15,413 WARNING [train.py:1204] (2/4) Exclude cut with ID _59502_210_12210_1_1535107024698_1835139_146-162495-0 from training. Duration: 0.92 2023-10-02 15:14:15,435 WARNING [train.py:1204] (2/4) Exclude cut with ID _3465_210_3525_1_1526175667060_3574049_412-287526-0_sp1.1 from training. Duration: 0.6545625 2023-10-02 15:14:17,440 WARNING [train.py:1204] (2/4) Exclude cut with ID _7160_210_1876_1_1527940923231_3703799_120-150943-0_sp0.9 from training. Duration: 0.9 2023-10-02 15:14:18,766 WARNING [train.py:1204] (2/4) Exclude cut with ID _15037_210_2253_1_1530752294104_3267650_296-192597-0_sp1.1 from training. Duration: 0.736375 2023-10-02 15:14:18,787 WARNING [train.py:1204] (2/4) Exclude cut with ID _13252_210_1925_1_1530671372047_3336909_397-216497-0_sp1.1 from training. Duration: 0.87275 2023-10-02 15:14:18,822 WARNING [train.py:1204] (2/4) Exclude cut with ID _40094_210_16380_1_1533294005773_3641159_76-269320-0_sp1.1 from training. Duration: 0.936375 2023-10-02 15:14:20,126 WARNING [train.py:1204] (2/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_449-15508-0 from training. Duration: 0.82 2023-10-02 15:14:21,654 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9309W0194-640321 from training. Duration: 0.64 2023-10-02 15:14:21,826 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.1.feed_forward1.out_proj.dropout_p, batch_count=924233.3333333334, ans=0.1 2023-10-02 15:14:24,323 WARNING [train.py:1204] (2/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_284-312178-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 15:14:25,750 WARNING [train.py:1204] (2/4) Exclude cut with ID _29853_210_7497_1_1532505600851_6642110_401-55937-0_sp1.1 from training. Duration: 0.92725 2023-10-02 15:14:25,939 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.balancer1.prob, batch_count=924233.3333333334, ans=0.125 2023-10-02 15:14:25,977 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.self_attn_weights.pos_emb_skip_rate, batch_count=924233.3333333334, ans=0.0 2023-10-02 15:14:25,978 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.nonlin_attention.balancer.max_positive, batch_count=924233.3333333334, ans=0.95 2023-10-02 15:14:27,118 WARNING [train.py:1204] (2/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_716-24918-0_sp1.1 from training. Duration: 0.92725 2023-10-02 15:14:27,148 WARNING [train.py:1204] (2/4) Exclude cut with ID _19637_210_4220_1_1531537167676_3205060_314-305638-0_sp1.1 from training. Duration: 0.92725 2023-10-02 15:14:27,189 WARNING [train.py:1204] (2/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_475-127455-0_sp1.1 from training. Duration: 0.636375 2023-10-02 15:14:30,317 WARNING [train.py:1204] (2/4) Exclude cut with ID _5444_210_2151_1_1527418808464_5312210_581-315411-0 from training. Duration: 0.62 2023-10-02 15:14:31,510 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.459e+02 1.886e+02 2.128e+02 2.373e+02 3.584e+02, threshold=4.256e+02, percent-clipped=0.0 2023-10-02 15:14:31,937 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.balancer_ff2.min_abs, batch_count=924300.0, ans=0.1 2023-10-02 15:14:33,523 WARNING [train.py:1204] (2/4) Exclude cut with ID _44902_210_16187_1_1533888006062_3792078_149-67212-0_sp0.9 from training. Duration: 0.9666875 2023-10-02 15:14:33,588 WARNING [train.py:1204] (2/4) Exclude cut with ID _31602_210_5854_1_1532677021456_4458250_63-63253-0_sp1.1 from training. Duration: 0.963625 2023-10-02 15:14:36,999 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.balancer1.prob, batch_count=924300.0, ans=0.125 2023-10-02 15:14:38,180 WARNING [train.py:1204] (2/4) Exclude cut with ID _55209_210_1531_1_1534675828996_4597160_660-203992-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 15:14:39,676 WARNING [train.py:1204] (2/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_83-190624-0_sp1.1 from training. Duration: 0.92725 2023-10-02 15:14:44,300 WARNING [train.py:1204] (2/4) Exclude cut with ID _58605_210_12210_1_1534723195939_3904259_154-139635-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 15:14:47,143 WARNING [train.py:1204] (2/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_164-224052-0 from training. Duration: 0.7 2023-10-02 15:14:47,151 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_1126-189071-0_sp1.1 from training. Duration: 0.963625 2023-10-02 15:14:47,167 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_15396_210_6890_1_1531308473_1938936_310-214789-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 15:14:47,223 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder_embed.conv.8.prob, batch_count=924366.6666666666, ans=0.125 2023-10-02 15:14:51,151 WARNING [train.py:1204] (2/4) Exclude cut with ID _8395_210_1531_1_1528597871523_4411579_431-132470-0 from training. Duration: 0.74 2023-10-02 15:14:51,201 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_3780_210_2087_1_1526467760_2130483_170-259044-0_sp0.9 from training. Duration: 0.77775 2023-10-02 15:14:52,569 WARNING [train.py:1204] (2/4) Exclude cut with ID _12733_210_8341_1_1530167424556_3646380_537-230640-0_sp1.1 from training. Duration: 0.963625 2023-10-02 15:14:54,193 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.balancer2.prob, batch_count=924366.6666666666, ans=0.125 2023-10-02 15:14:55,564 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.conv_module1.balancer1.prob, batch_count=924366.6666666666, ans=0.125 2023-10-02 15:14:57,928 INFO [train.py:1046] (2/4) Epoch 27, batch 550, loss[loss=0.1552, simple_loss=0.2401, pruned_loss=0.03511, over 24478.00 frames. ], tot_loss[loss=0.1689, simple_loss=0.2468, pruned_loss=0.04552, over 4417038.91 frames. ], batch size: 66, lr: 3.82e-03, grad_scale: 16.0 2023-10-02 15:14:58,038 WARNING [train.py:1204] (2/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_159-281310-0 from training. Duration: 0.95 2023-10-02 15:15:01,079 WARNING [train.py:1204] (2/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_434-83000-0 from training. Duration: 0.98 2023-10-02 15:15:02,342 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_4405_210_1849_1_1526732205_3593939_493-32464-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 15:15:02,351 WARNING [train.py:1204] (2/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_342-100157-0 from training. Duration: 0.54 2023-10-02 15:15:02,422 WARNING [train.py:1204] (2/4) Exclude cut with ID _44778_210_2054_1_1533798127203_7443090_765-250133-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 15:15:02,426 WARNING [train.py:1204] (2/4) Exclude cut with ID _38670_210_15379_1_1533448810659_4391059_81-312245-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 15:15:04,300 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_50050_210_16223_1_1535435796_7290138_1273-247611-0_sp1.1 from training. Duration: 0.936375 2023-10-02 15:15:04,357 WARNING [train.py:1204] (2/4) Exclude cut with ID _44831_210_13427_1_1533816011549_3689035_399-18591-0_sp1.1 from training. Duration: 0.936375 2023-10-02 15:15:06,178 WARNING [train.py:1204] (2/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_557-324258-0_sp0.9 from training. Duration: 0.9 2023-10-02 15:15:06,264 WARNING [train.py:1204] (2/4) Exclude cut with ID _3465_210_3525_1_1526175667060_3574049_62-335552-0_sp1.1 from training. Duration: 0.7909375 2023-10-02 15:15:09,043 WARNING [train.py:1204] (2/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_673-264125-0_sp1.1 from training. Duration: 0.963625 2023-10-02 15:15:09,120 WARNING [train.py:1204] (2/4) Exclude cut with ID _8030_210_7497_1_1528444886781_3531089_262-86750-0 from training. Duration: 0.81 2023-10-02 15:15:09,136 WARNING [train.py:1204] (2/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_444-28041-0_sp1.1 from training. Duration: 0.77275 2023-10-02 15:15:13,345 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_34008_210_10431_1_1533345424_8183247_516-14858-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 15:15:13,367 WARNING [train.py:1204] (2/4) Exclude cut with ID _32704_210_8954_1_1533017488840_6747370_54-248218-0_sp1.1 from training. Duration: 0.936375 2023-10-02 15:15:17,793 WARNING [train.py:1204] (2/4) Exclude cut with ID _13253_210_1925_1_1530757775819_3577873_300-119782-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 15:15:18,037 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.conv_module2.balancer2.prob, batch_count=924500.0, ans=0.125 2023-10-02 15:15:19,101 WARNING [train.py:1204] (2/4) Exclude cut with ID _39290_210_4220_1_1533862778760_3075118_21-240703-0_sp1.1 from training. Duration: 0.936375 2023-10-02 15:15:23,318 WARNING [train.py:1204] (2/4) Exclude cut with ID 774-127930-0014-10412-0_sp1.1 from training. Duration: 0.95 2023-10-02 15:15:24,597 WARNING [train.py:1204] (2/4) Exclude cut with ID _67499_210_17282_1_1535509805220_3692430_20-179346-0 from training. Duration: 0.97 2023-10-02 15:15:25,962 WARNING [train.py:1204] (2/4) Exclude cut with ID _8365_210_2064_1_1528592311070_4677690_431-294102-0_sp0.9 from training. Duration: 0.988875 2023-10-02 15:15:28,347 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.0.layers.0.feed_forward1.out_whiten, num_groups=1, num_channels=192, metric=9.05 vs. limit=15.0 2023-10-02 15:15:30,233 WARNING [train.py:1204] (2/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_365-1492-0_sp1.1 from training. Duration: 0.82725 2023-10-02 15:15:30,260 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_105-114797-0_sp0.9 from training. Duration: 0.9333125 2023-10-02 15:15:32,340 WARNING [train.py:1204] (2/4) Exclude cut with ID _6274_210_2084_1_1527937583837_7494020_769-168302-0_sp0.9 from training. Duration: 0.788875 2023-10-02 15:15:36,957 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_40340_210_9024_1_1533433916_4132944_320-270277-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 15:15:36,962 WARNING [train.py:1204] (2/4) Exclude cut with ID ID0203W0003-781711 from training. Duration: 0.958 2023-10-02 15:15:38,847 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_30434_210_13049_1_1532519384_981746_52-12932-0_sp1.1 from training. Duration: 0.936375 2023-10-02 15:15:38,934 WARNING [train.py:1204] (2/4) Exclude cut with ID _8263_210_1664_1_1528438953494_294100_40-204731-0_sp0.9 from training. Duration: 0.5555625 2023-10-02 15:15:41,632 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1032-236863-0_sp0.9 from training. Duration: 0.9333125 2023-10-02 15:15:43,020 WARNING [train.py:1204] (2/4) Exclude cut with ID _8394_210_6702_1_1528590504316_3540189_335-274780-0_sp0.9 from training. Duration: 0.8666875 2023-10-02 15:15:43,028 WARNING [train.py:1204] (2/4) Exclude cut with ID _66873_210_5517_1_1535454136506_4293370_654-52713-0_sp1.1 from training. Duration: 0.7 2023-10-02 15:15:44,360 WARNING [train.py:1204] (2/4) Exclude cut with ID _5250_210_5219_1_1527249561806_8692110_742-267856-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 15:15:44,469 WARNING [train.py:1204] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_74-48717-0 from training. Duration: 0.58 2023-10-02 15:15:46,392 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.conv_module1.balancer1.prob, batch_count=924633.3333333334, ans=0.125 2023-10-02 15:15:47,617 WARNING [train.py:1204] (2/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_18-46756-0 from training. Duration: 0.79 2023-10-02 15:15:47,673 WARNING [train.py:1204] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_612-56061-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 15:15:47,685 WARNING [train.py:1204] (2/4) Exclude cut with ID _56423_210_5854_1_1534896061049_3165969_412-2403-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 15:15:48,928 WARNING [train.py:1204] (2/4) Exclude cut with ID _62574_210_2655_1_1535180407058_3933249_120-196848-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 15:15:48,929 WARNING [train.py:1204] (2/4) Exclude cut with ID _40519_210_13794_1_1533715228655_7337390_916-51000-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 15:15:50,406 WARNING [train.py:1204] (2/4) Exclude cut with ID _65263_210_22356_1_1535763598691_3921037_108-21902-0_sp1.1 from training. Duration: 0.9 2023-10-02 15:15:51,691 WARNING [train.py:1204] (2/4) Exclude cut with ID _4122_210_2084_1_1526713164417_4353280_528-206771-0_sp1.1 from training. Duration: 0.8 2023-10-02 15:15:54,405 WARNING [train.py:1204] (2/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_43-325657-0_sp1.1 from training. Duration: 0.92725 2023-10-02 15:15:55,848 WARNING [train.py:1204] (2/4) Exclude cut with ID _44659_210_2721_1_1534036120061_6614860_314-82041-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 15:15:55,904 WARNING [train.py:1204] (2/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_81-300212-0_sp0.9 from training. Duration: 0.6666875 2023-10-02 15:15:57,185 WARNING [train.py:1204] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_665-196155-0_sp0.9 from training. Duration: 0.7444375 2023-10-02 15:15:58,610 WARNING [train.py:1204] (2/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_140-336881-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 15:15:59,543 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.1.encoder.layers.0.nonlin_attention.whiten2, num_groups=1, num_channels=256, metric=5.49 vs. limit=15.0 2023-10-02 15:15:59,988 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6290_210_3273_1_1527595224_1282787_115-87095-0_sp0.9 from training. Duration: 0.611125 2023-10-02 15:16:00,047 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_53373_210_5399_1_1534229471_4715695_607-259685-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 15:16:01,405 WARNING [train.py:1204] (2/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_175-163894-0_sp1.1 from training. Duration: 0.67275 2023-10-02 15:16:01,434 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7002_210_1794_1_1527939683_5157286_1-281264-0_sp1.1 from training. Duration: 0.4 2023-10-02 15:16:07,584 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.0.layers.1.conv_module2.whiten, num_groups=1, num_channels=192, metric=9.47 vs. limit=15.0 2023-10-02 15:16:07,993 WARNING [train.py:1204] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_605-257205-0 from training. Duration: 0.59 2023-10-02 15:16:10,687 WARNING [train.py:1204] (2/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_454-314901-0 from training. Duration: 0.46 2023-10-02 15:16:10,776 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_46032_210_14656_1_1533781412_4507969_287-130422-0_sp1.1 from training. Duration: 0.97275 2023-10-02 15:16:10,793 WARNING [train.py:1204] (2/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_186-309161-0_sp1.1 from training. Duration: 0.4909375 2023-10-02 15:16:10,825 WARNING [train.py:1204] (2/4) Exclude cut with ID _42659_210_2070_1_1533607442765_3120380_45-342307-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 15:16:12,152 INFO [train.py:1046] (2/4) Epoch 27, batch 600, loss[loss=0.1619, simple_loss=0.2541, pruned_loss=0.03488, over 24665.00 frames. ], tot_loss[loss=0.1688, simple_loss=0.2469, pruned_loss=0.04533, over 4488734.92 frames. ], batch size: 73, lr: 3.82e-03, grad_scale: 16.0 2023-10-02 15:16:18,158 WARNING [train.py:1204] (2/4) Exclude cut with ID _12555_210_8750_1_1530075594402_3780239_478-326818-0_sp1.1 from training. Duration: 0.963625 2023-10-02 15:16:20,853 WARNING [train.py:1204] (2/4) Exclude cut with ID _7606_210_4915_1_1528894253277_5080690_127-177761-0_sp1.1 from training. Duration: 0.5818125 2023-10-02 15:16:22,234 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6788_210_1388_1_1527898698_4562021_91-197981-0 from training. Duration: 0.83 2023-10-02 15:16:23,715 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8558_210_1410_1_1528505225_7613406_131-329524-0_sp0.9 from training. Duration: 0.87775 2023-10-02 15:16:26,428 WARNING [train.py:1204] (2/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_153-153537-0_sp1.1 from training. Duration: 0.82725 2023-10-02 15:16:29,172 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_17292_210_6753_1_1531125388_2943734_285-349105-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 15:16:30,510 WARNING [train.py:1204] (2/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_37-41919-0 from training. Duration: 0.87 2023-10-02 15:16:31,785 WARNING [train.py:1204] (2/4) Exclude cut with ID _60860_210_8254_1_1534896015875_3557169_172-294852-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 15:16:36,231 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.5.encoder.layers.1.feed_forward3.out_whiten, num_groups=1, num_channels=256, metric=12.71 vs. limit=15.0 2023-10-02 15:16:39,037 WARNING [train.py:1204] (2/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_146-276944-0 from training. Duration: 0.83 2023-10-02 15:16:43,271 WARNING [train.py:1204] (2/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_1254-44082-0_sp1.1 from training. Duration: 0.936375 2023-10-02 15:16:43,284 WARNING [train.py:1204] (2/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_1115-180350-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 15:16:43,314 WARNING [train.py:1204] (2/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_60-146189-0_sp1.1 from training. Duration: 0.8909375 2023-10-02 15:16:48,299 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.conv_module2.balancer2.prob, batch_count=924900.0, ans=0.125 2023-10-02 15:16:49,412 WARNING [train.py:1204] (2/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_517-153649-0_sp1.1 from training. Duration: 0.92725 2023-10-02 15:16:49,431 WARNING [train.py:1204] (2/4) Exclude cut with ID _39571_210_13449_1_1533801584143_3651077_37-266257-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 15:16:49,467 WARNING [train.py:1204] (2/4) Exclude cut with ID _6653_210_2629_1_1527919172441_6732679_527-276118-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 15:16:53,777 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7696_210_2040_1_1528207167_3716099_117-125434-0_sp1.1 from training. Duration: 0.6909375 2023-10-02 15:16:53,996 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.feed_forward3.hidden_balancer.prob, batch_count=924900.0, ans=0.125 2023-10-02 15:16:57,965 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_25767_210_2161_1_1532307483_3867748_478-156360-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 15:16:57,970 WARNING [train.py:1204] (2/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_177-49092-0_sp1.1 from training. Duration: 0.82725 2023-10-02 15:16:57,979 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_11305_210_1794_1_1529750428_7178148_680-317073-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 15:16:59,384 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.548e+02 1.876e+02 2.029e+02 2.331e+02 3.965e+02, threshold=4.058e+02, percent-clipped=0.0 2023-10-02 15:17:02,885 WARNING [train.py:1204] (2/4) Exclude cut with ID _53565_210_10635_1_1534337218335_3820259_287-18120-0 from training. Duration: 0.97 2023-10-02 15:17:09,004 WARNING [train.py:1204] (2/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_498-40153-0_sp1.1 from training. Duration: 0.57275 2023-10-02 15:17:09,022 WARNING [train.py:1204] (2/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_555-63066-0_sp1.1 from training. Duration: 0.963625 2023-10-02 15:17:13,765 WARNING [train.py:1204] (2/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_395-310220-0 from training. Duration: 0.89 2023-10-02 15:17:15,088 WARNING [train.py:1204] (2/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_937-208281-0_sp1.1 from training. Duration: 0.77275 2023-10-02 15:17:15,764 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.4.encoder.layers.0.conv_module2.whiten, num_groups=1, num_channels=384, metric=9.08 vs. limit=15.0 2023-10-02 15:17:16,596 WARNING [train.py:1204] (2/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_120-211630-0 from training. Duration: 0.92 2023-10-02 15:17:16,622 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_454-276429-0_sp1.1 from training. Duration: 0.7909375 2023-10-02 15:17:17,985 WARNING [train.py:1204] (2/4) Exclude cut with ID _22826_210_9994_1_1531908201522_3279169_404-75889-0_sp1.1 from training. Duration: 0.8090625 2023-10-02 15:17:22,327 WARNING [train.py:1204] (2/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_282-136641-0_sp0.9 from training. Duration: 0.4666875 2023-10-02 15:17:22,432 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_809-17156-0_sp1.1 from training. Duration: 0.663625 2023-10-02 15:17:25,050 WARNING [train.py:1204] (2/4) Exclude cut with ID _9235_210_5219_1_1528891154253_8046120_1003-101638-0_sp0.9 from training. Duration: 0.811125 2023-10-02 15:17:25,143 WARNING [train.py:1204] (2/4) Exclude cut with ID _22826_210_9994_1_1531908201522_3279169_404-75889-0_sp0.9 from training. Duration: 0.988875 2023-10-02 15:17:26,478 INFO [train.py:1046] (2/4) Epoch 27, batch 650, loss[loss=0.1586, simple_loss=0.2082, pruned_loss=0.05447, over 19205.00 frames. ], tot_loss[loss=0.1678, simple_loss=0.2455, pruned_loss=0.04504, over 4531412.42 frames. ], batch size: 388, lr: 3.82e-03, grad_scale: 16.0 2023-10-02 15:17:26,629 WARNING [train.py:1204] (2/4) Exclude cut with ID _59288_210_5854_1_1535077757774_3412246_570-295255-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 15:17:31,275 WARNING [train.py:1204] (2/4) Exclude cut with ID _3465_210_3525_1_1526175667060_3574049_412-287526-0 from training. Duration: 0.72 2023-10-02 15:17:32,404 WARNING [train.py:1204] (2/4) Exclude cut with ID _44010_210_4525_1_1533884262964_4013620_649-79655-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 15:17:37,191 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.conv_module2.balancer1.min_positive, batch_count=925100.0, ans=0.025 2023-10-02 15:17:38,847 WARNING [train.py:1204] (2/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_57-270037-0_sp0.9 from training. Duration: 0.8555625 2023-10-02 15:17:38,847 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_34815_210_6388_1_1533381320_3571903_302-255963-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 15:17:41,680 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_47449_210_5399_1_1534076445_4006929_573-301852-0_sp1.1 from training. Duration: 0.97275 2023-10-02 15:17:43,892 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.conv_module2.balancer1.prob, batch_count=925166.6666666666, ans=0.125 2023-10-02 15:17:44,944 WARNING [train.py:1204] (2/4) Exclude cut with ID 6709-74022-0004-86860-0_sp1.1 from training. Duration: 0.9409375 2023-10-02 15:17:46,341 WARNING [train.py:1204] (2/4) Exclude cut with ID _40519_210_13794_1_1533715228655_7337390_111-71991-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 15:17:46,578 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=925166.6666666666, ans=0.1 2023-10-02 15:17:47,639 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_30167_210_3528_1_1532428012_4772375_169-214186-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 15:17:51,778 WARNING [train.py:1204] (2/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_20-235313-0_sp1.1 from training. Duration: 0.963625 2023-10-02 15:17:51,815 WARNING [train.py:1204] (2/4) Exclude cut with ID _4681_210_3525_1_1527251040321_1235713_123-24428-0_sp0.9 from training. Duration: 0.5333125 2023-10-02 15:17:53,246 WARNING [train.py:1204] (2/4) Exclude cut with ID _31602_210_5854_1_1532677021456_4458250_444-198752-0_sp1.1 from training. Duration: 0.97275 2023-10-02 15:17:53,305 WARNING [train.py:1204] (2/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_9-279442-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 15:17:54,604 WARNING [train.py:1204] (2/4) Exclude cut with ID _6275_210_2084_1_1528110062213_7669049_973-198085-0_sp0.9 from training. Duration: 0.8333125 2023-10-02 15:17:56,062 WARNING [train.py:1204] (2/4) Exclude cut with ID _29853_210_7497_1_1532505600851_6642110_567-250221-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 15:17:57,451 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6545_210_2040_1_1527774670_4245258_407-48070-0_sp1.1 from training. Duration: 0.6909375 2023-10-02 15:17:57,605 WARNING [train.py:1204] (2/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_224-343295-0_sp0.9 from training. Duration: 0.611125 2023-10-02 15:17:57,619 WARNING [train.py:1204] (2/4) Exclude cut with ID IC0125W0011-61817 from training. Duration: 0.88 2023-10-02 15:17:57,635 WARNING [train.py:1204] (2/4) Exclude cut with ID _60113_210_16271_1_1535164212481_4386079_435-154371-0_sp1.1 from training. Duration: 0.97275 2023-10-02 15:17:58,723 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_25836_210_16554_1_1532138167_53197_3-9394-0_sp1.1 from training. Duration: 0.92725 2023-10-02 15:18:02,086 WARNING [train.py:1204] (2/4) Exclude cut with ID _29343_210_4381_1_1532602757752_3674109_57-55125-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 15:18:02,178 WARNING [train.py:1204] (2/4) Exclude cut with ID _56235_210_10640_1_1534557335235_4161099_267-23560-0_sp1.1 from training. Duration: 0.936375 2023-10-02 15:18:03,862 WARNING [train.py:1204] (2/4) Exclude cut with ID _30196_210_1664_1_1533708496522_7074140_1150-278867-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 15:18:05,590 WARNING [train.py:1204] (2/4) Exclude cut with ID _6270_210_2084_1_1527850911573_3856420_26-277343-0_sp0.9 from training. Duration: 0.911125 2023-10-02 15:18:05,652 WARNING [train.py:1204] (2/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_325-62802-0 from training. Duration: 0.87 2023-10-02 15:18:07,021 WARNING [train.py:1204] (2/4) Exclude cut with ID _56524_210_9642_1_1534572011779_3619170_145-310842-0_sp1.1 from training. Duration: 0.9 2023-10-02 15:18:07,068 WARNING [train.py:1204] (2/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_860-96198-0_sp0.9 from training. Duration: 0.811125 2023-10-02 15:18:08,405 WARNING [train.py:1204] (2/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_17-25045-0_sp1.1 from training. Duration: 0.536375 2023-10-02 15:18:08,432 WARNING [train.py:1204] (2/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_349-69727-0_sp1.1 from training. Duration: 0.936375 2023-10-02 15:18:09,761 WARNING [train.py:1204] (2/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_580-93798-0_sp1.1 from training. Duration: 0.5090625 2023-10-02 15:18:09,869 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7762_210_1794_1_1528199508_4057408_419-125009-0 from training. Duration: 0.83 2023-10-02 15:18:11,504 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=925300.0, ans=0.1 2023-10-02 15:18:13,110 WARNING [train.py:1204] (2/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_356-167816-0 from training. Duration: 0.72 2023-10-02 15:18:13,131 WARNING [train.py:1204] (2/4) Exclude cut with ID _29345_210_4381_1_1532689168263_3860169_225-97100-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 15:18:13,135 WARNING [train.py:1204] (2/4) Exclude cut with ID _14227_210_4381_1_1530701804286_3940259_374-194712-0_sp1.1 from training. Duration: 0.92725 2023-10-02 15:18:13,169 WARNING [train.py:1204] (2/4) Exclude cut with ID _40137_210_12210_1_1533290484046_3795180_249-187059-0_sp0.9 from training. Duration: 0.97775 2023-10-02 15:18:13,186 WARNING [train.py:1204] (2/4) Exclude cut with ID _22826_210_9994_1_1531908201522_3279169_389-317194-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 15:18:14,594 WARNING [train.py:1204] (2/4) Exclude cut with ID _27485_210_4916_1_1532739191677_3668269_239-122918-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 15:18:20,344 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_2245_210_2715_1_1525511548_3540254_128-117609-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 15:18:21,682 WARNING [train.py:1204] (2/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_160-276178-0_sp1.1 from training. Duration: 0.963625 2023-10-02 15:18:23,098 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_31920_210_10633_1_1532860009_1686094_1-279735-0_sp1.1 from training. Duration: 0.97275 2023-10-02 15:18:23,347 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.conv_module2.balancer1.min_positive, batch_count=925300.0, ans=0.025 2023-10-02 15:18:25,918 WARNING [train.py:1204] (2/4) Exclude cut with ID _51078_210_9070_1_1534161557424_3805250_325-262383-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 15:18:25,937 WARNING [train.py:1204] (2/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_643-52007-0_sp0.9 from training. Duration: 0.6333125 2023-10-02 15:18:27,279 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_51937_210_5014_1_1534573610_4022025_518-299248-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 15:18:34,013 WARNING [train.py:1204] (2/4) Exclude cut with ID _13252_210_1925_1_1530671372047_3336909_166-314607-0_sp1.1 from training. Duration: 0.4909375 2023-10-02 15:18:34,017 WARNING [train.py:1204] (2/4) Exclude cut with ID _44778_210_2054_1_1533798127203_7443090_1087-282939-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 15:18:34,051 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_23850_210_4238_1_1532232597_1632433_32-185135-0_sp1.1 from training. Duration: 0.7818125 2023-10-02 15:18:35,330 WARNING [train.py:1204] (2/4) Exclude cut with ID _59500_210_8254_1_1534809610680_3736009_145-89359-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 15:18:41,046 INFO [train.py:1046] (2/4) Epoch 27, batch 700, loss[loss=0.1653, simple_loss=0.2349, pruned_loss=0.04783, over 23726.00 frames. ], tot_loss[loss=0.1666, simple_loss=0.2438, pruned_loss=0.04472, over 4573366.59 frames. ], batch size: 164, lr: 3.82e-03, grad_scale: 16.0 2023-10-02 15:18:41,187 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_340-336408-0 from training. Duration: 0.93 2023-10-02 15:18:42,571 WARNING [train.py:1204] (2/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_9-21737-0 from training. Duration: 0.97 2023-10-02 15:18:47,179 WARNING [train.py:1204] (2/4) Exclude cut with ID _2220_210_1819_1_1525509109898_3658680_50-99837-0 from training. Duration: 0.54 2023-10-02 15:18:47,256 WARNING [train.py:1204] (2/4) Exclude cut with ID _30304_210_6319_1_1532476827261_7582060_879-299350-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 15:18:48,672 WARNING [train.py:1204] (2/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_905-160074-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 15:18:50,058 WARNING [train.py:1204] (2/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_395-73570-0 from training. Duration: 0.99 2023-10-02 15:18:55,594 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7264_210_4938_1_1527939911_8132095_800-91995-0_sp1.1 from training. Duration: 0.7818125 2023-10-02 15:18:58,223 WARNING [train.py:1204] (2/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_77-311404-0_sp1.1 from training. Duration: 0.8545625 2023-10-02 15:18:59,587 WARNING [train.py:1204] (2/4) Exclude cut with ID _13679_210_7497_1_1530534293662_3355098_107-80772-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 15:19:00,985 WARNING [train.py:1204] (2/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_224-343295-0_sp1.1 from training. Duration: 0.5 2023-10-02 15:19:01,028 WARNING [train.py:1204] (2/4) Exclude cut with ID _54871_210_22356_1_1534665798253_3856359_156-253482-0_sp1.1 from training. Duration: 0.963625 2023-10-02 15:19:05,069 WARNING [train.py:1204] (2/4) Exclude cut with ID _49728_210_12651_1_1534041034294_6788150_309-125532-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 15:19:09,147 WARNING [train.py:1204] (2/4) Exclude cut with ID _13913_210_9994_1_1530579615187_3383897_473-277682-0_sp0.9 from training. Duration: 0.4333125 2023-10-02 15:19:09,151 WARNING [train.py:1204] (2/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_3-276701-0_sp1.1 from training. Duration: 0.82725 2023-10-02 15:19:10,480 WARNING [train.py:1204] (2/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_282-136641-0 from training. Duration: 0.42 2023-10-02 15:19:13,626 WARNING [train.py:1204] (2/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_127-566-0 from training. Duration: 0.84 2023-10-02 15:19:15,154 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6163_210_6400_1_1527506746_4263068_283-137811-0_sp1.1 from training. Duration: 0.7 2023-10-02 15:19:15,180 WARNING [train.py:1204] (2/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_34-183685-0_sp1.1 from training. Duration: 0.8909375 2023-10-02 15:19:17,033 WARNING [train.py:1204] (2/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_154-135696-0_sp0.9 from training. Duration: 0.888875 2023-10-02 15:19:21,109 WARNING [train.py:1204] (2/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_676-51014-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 15:19:21,172 WARNING [train.py:1204] (2/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_38-169920-0 from training. Duration: 0.91 2023-10-02 15:19:25,320 WARNING [train.py:1204] (2/4) Exclude cut with ID _6653_210_2629_1_1527919172441_6732679_747-114465-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 15:19:26,654 WARNING [train.py:1204] (2/4) Exclude cut with ID _8021_210_7994_1_1528376451358_3653069_100-140231-0_sp1.1 from training. Duration: 0.6090625 2023-10-02 15:19:26,671 WARNING [train.py:1204] (2/4) Exclude cut with ID _64135_210_2253_1_1535677267876_7608972_133-188266-0 from training. Duration: 0.81 2023-10-02 15:19:27,922 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.597e+02 1.868e+02 2.051e+02 2.317e+02 3.367e+02, threshold=4.102e+02, percent-clipped=0.0 2023-10-02 15:19:30,709 WARNING [train.py:1204] (2/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_714-310538-0_sp1.1 from training. Duration: 0.7818125 2023-10-02 15:19:33,340 WARNING [train.py:1204] (2/4) Exclude cut with ID _39867_210_9826_1_1533553222030_9344259_1134-288498-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 15:19:34,837 WARNING [train.py:1204] (2/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_211-237289-0_sp1.1 from training. Duration: 0.936375 2023-10-02 15:19:41,536 WARNING [train.py:1204] (2/4) Exclude cut with ID _59202_210_26984_1_1534816810612_3549174_137-318710-0_sp0.9 from training. Duration: 0.988875 2023-10-02 15:19:42,807 WARNING [train.py:1204] (2/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_86-66192-0 from training. Duration: 0.55 2023-10-02 15:19:45,645 WARNING [train.py:1204] (2/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_540-275685-0 from training. Duration: 0.89 2023-10-02 15:19:47,346 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8776_210_2040_1_1528638837_4088036_244-278763-0 from training. Duration: 0.9 2023-10-02 15:19:48,811 WARNING [train.py:1204] (2/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_9-219972-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 15:19:49,109 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.nonlin_attention.balancer.prob, batch_count=925700.0, ans=0.125 2023-10-02 15:19:50,206 WARNING [train.py:1204] (2/4) Exclude cut with ID _44659_210_2721_1_1534036120061_6614860_656-283823-0_sp1.1 from training. Duration: 0.92725 2023-10-02 15:19:50,271 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_69630_210_7234_1_1535789601_661049_77-216385-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 15:19:51,695 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_19952_210_2161_1_1531875469_3851750_210-308919-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 15:19:51,701 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_4737_210_2040_1_1526998689_3293008_378-178650-0 from training. Duration: 0.95 2023-10-02 15:19:54,428 INFO [train.py:1046] (2/4) Epoch 27, batch 750, loss[loss=0.1613, simple_loss=0.2428, pruned_loss=0.03995, over 24342.00 frames. ], tot_loss[loss=0.1672, simple_loss=0.2444, pruned_loss=0.04496, over 4612178.57 frames. ], batch size: 61, lr: 3.82e-03, grad_scale: 16.0 2023-10-02 15:19:55,878 WARNING [train.py:1204] (2/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_224-343295-0 from training. Duration: 0.55 2023-10-02 15:19:55,894 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7002_210_1794_1_1527939683_5157286_184-229630-0 from training. Duration: 0.74 2023-10-02 15:19:55,920 WARNING [train.py:1204] (2/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_6-214815-0 from training. Duration: 0.85 2023-10-02 15:19:55,998 WARNING [train.py:1204] (2/4) Exclude cut with ID _8699_210_2069_1_1528630346831_3960409_160-24408-0 from training. Duration: 0.91 2023-10-02 15:19:56,044 WARNING [train.py:1204] (2/4) Exclude cut with ID _5585_210_2667_1_1527418813324_8007340_89-332072-0 from training. Duration: 0.64 2023-10-02 15:19:57,341 WARNING [train.py:1204] (2/4) Exclude cut with ID _5250_210_5219_1_1527249561806_8692110_528-259800-0_sp1.1 from training. Duration: 0.963625 2023-10-02 15:19:57,430 WARNING [train.py:1204] (2/4) Exclude cut with ID _55383_210_10635_1_1534468426172_2231669_134-128246-0 from training. Duration: 0.89 2023-10-02 15:19:58,828 WARNING [train.py:1204] (2/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_400-338274-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 15:20:00,116 WARNING [train.py:1204] (2/4) Exclude cut with ID _14224_210_4381_1_1530529062037_4264010_364-22817-0_sp0.9 from training. Duration: 0.788875 2023-10-02 15:20:01,503 WARNING [train.py:1204] (2/4) Exclude cut with ID _42863_210_9070_1_1533902398788_3696920_331-291057-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 15:20:02,887 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_5436_210_1794_1_1527331317_2541494_229-211016-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 15:20:03,681 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.0.layers.1.self_attn1.whiten, num_groups=1, num_channels=192, metric=10.44 vs. limit=22.5 2023-10-02 15:20:04,144 WARNING [train.py:1204] (2/4) Exclude cut with ID _6456_210_1534_1_1527681634212_3181049_345-252877-0_sp0.9 from training. Duration: 0.711125 2023-10-02 15:20:04,176 WARNING [train.py:1204] (2/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_96-81499-0_sp1.1 from training. Duration: 0.92725 2023-10-02 15:20:07,311 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_5575_210_1410_1_1527301305_2570170_342-265572-0_sp1.1 from training. Duration: 0.8545625 2023-10-02 15:20:09,380 WARNING [train.py:1204] (2/4) Exclude cut with ID _5444_210_2151_1_1527418808464_5312210_726-27165-0_sp0.9 from training. Duration: 0.7444375 2023-10-02 15:20:11,398 WARNING [train.py:1204] (2/4) Exclude cut with ID _22083_210_8254_1_1531702862439_3678970_304-327750-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 15:20:12,997 WARNING [train.py:1204] (2/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_245-226199-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 15:20:14,265 WARNING [train.py:1204] (2/4) Exclude cut with ID _42875_210_14856_1_1533704397487_4012433_102-199499-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 15:20:14,323 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_51225_210_3601_1_1534400634_2327383_55-301844-0 from training. Duration: 0.93 2023-10-02 15:20:16,501 WARNING [train.py:1204] (2/4) Exclude cut with ID _28317_210_12742_1_1532480302381_8510325_916-195848-0_sp1.1 from training. Duration: 0.836375 2023-10-02 15:20:17,918 WARNING [train.py:1204] (2/4) Exclude cut with ID _44718_210_5816_1_1534050051869_6469030_497-311999-0_sp1.1 from training. Duration: 0.936375 2023-10-02 15:20:18,044 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_34886_210_10633_1_1533203882_1755661_91-194765-0_sp1.1 from training. Duration: 0.936375 2023-10-02 15:20:20,751 WARNING [train.py:1204] (2/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_310-26186-0_sp0.9 from training. Duration: 0.77775 2023-10-02 15:20:22,191 WARNING [train.py:1204] (2/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_416-257730-0 from training. Duration: 0.65 2023-10-02 15:20:22,205 WARNING [train.py:1204] (2/4) Exclude cut with ID _9389_210_1664_1_1529217337301_5289269_209-154760-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 15:20:23,561 WARNING [train.py:1204] (2/4) Exclude cut with ID _4092_210_2151_1_1526643044449_3135759_227-213972-0 from training. Duration: 0.54 2023-10-02 15:20:24,919 WARNING [train.py:1204] (2/4) Exclude cut with ID IC0679W0044-335852 from training. Duration: 0.768 2023-10-02 15:20:24,977 WARNING [train.py:1204] (2/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_42-186162-0 from training. Duration: 0.92 2023-10-02 15:20:24,982 WARNING [train.py:1204] (2/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_302-2550-0_sp0.9 from training. Duration: 0.9 2023-10-02 15:20:25,002 WARNING [train.py:1204] (2/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_356-31216-0_sp0.9 from training. Duration: 0.8333125 2023-10-02 15:20:25,642 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.4.encoder.layers.0.self_attn1.whiten, num_groups=1, num_channels=384, metric=15.47 vs. limit=22.5 2023-10-02 15:20:26,596 WARNING [train.py:1204] (2/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_125-322172-0_sp1.1 from training. Duration: 0.8454375 2023-10-02 15:20:32,462 INFO [scaling.py:1118] (2/4) WithLoss: name=encoder.encoders.5.encoder.layers.1.self_attn_weights, loss-sum=0.000e+00 2023-10-02 15:20:33,633 WARNING [train.py:1204] (2/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_141-89727-0_sp0.9 from training. Duration: 0.788875 2023-10-02 15:20:33,651 WARNING [train.py:1204] (2/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_647-337784-0_sp1.1 from training. Duration: 0.97275 2023-10-02 15:20:33,667 WARNING [train.py:1204] (2/4) Exclude cut with ID _6493_210_1747_1_1528459210424_4012470_513-140967-0_sp1.1 from training. Duration: 0.7181875 2023-10-02 15:20:36,399 WARNING [train.py:1204] (2/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_87-266334-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 15:20:37,701 WARNING [train.py:1204] (2/4) Exclude cut with ID _26528_210_9070_1_1532570410842_3851310_526-261338-0_sp1.1 from training. Duration: 0.92725 2023-10-02 15:20:39,070 WARNING [train.py:1204] (2/4) Exclude cut with ID _14224_210_4381_1_1530529062037_4264010_380-343670-0 from training. Duration: 0.88 2023-10-02 15:20:40,912 WARNING [train.py:1204] (2/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_449-15508-0_sp1.1 from training. Duration: 0.7454375 2023-10-02 15:20:42,903 WARNING [train.py:1204] (2/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_372-6869-0_sp0.9 from training. Duration: 0.488875 2023-10-02 15:20:42,964 WARNING [train.py:1204] (2/4) Exclude cut with ID _26907_210_7234_1_1532221740694_3205263_337-2494-0_sp0.9 from training. Duration: 0.97775 2023-10-02 15:20:43,154 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.conv_skip_rate, batch_count=925966.6666666666, ans=0.0 2023-10-02 15:20:46,217 WARNING [train.py:1204] (2/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_10-100367-0_sp1.1 from training. Duration: 0.8909375 2023-10-02 15:20:47,470 WARNING [train.py:1204] (2/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_47-167373-0 from training. Duration: 0.92 2023-10-02 15:20:47,534 WARNING [train.py:1204] (2/4) Exclude cut with ID _28323_210_16187_1_1532480395667_4100300_628-282179-0_sp1.1 from training. Duration: 0.97275 2023-10-02 15:20:51,746 WARNING [train.py:1204] (2/4) Exclude cut with ID _20455_210_5219_1_1531565996775_8098100_213-280898-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 15:20:53,059 WARNING [train.py:1204] (2/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_383-316000-0_sp1.1 from training. Duration: 0.8181875 2023-10-02 15:20:53,103 WARNING [train.py:1204] (2/4) Exclude cut with ID _18513_210_9206_1_1531618215223_3862620_16-78180-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 15:20:53,262 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.conv_module2.balancer2.prob, batch_count=926033.3333333334, ans=0.125 2023-10-02 15:20:55,897 WARNING [train.py:1204] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_330-327861-0_sp1.1 from training. Duration: 0.6090625 2023-10-02 15:20:58,599 WARNING [train.py:1204] (2/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_356-31216-0 from training. Duration: 0.75 2023-10-02 15:20:58,617 WARNING [train.py:1204] (2/4) Exclude cut with ID _25248_210_8750_1_1532062825941_5554180_255-148539-0_sp1.1 from training. Duration: 0.963625 2023-10-02 15:20:58,660 WARNING [train.py:1204] (2/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_269-154726-0_sp1.1 from training. Duration: 0.7818125 2023-10-02 15:21:02,768 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7002_210_1794_1_1527939683_5157286_163-87552-0_sp1.1 from training. Duration: 0.7818125 2023-10-02 15:21:02,800 WARNING [train.py:1204] (2/4) Exclude cut with ID _54871_210_22356_1_1534665798253_3856359_97-151896-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 15:21:04,252 WARNING [train.py:1204] (2/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_207-7026-0_sp1.1 from training. Duration: 0.97275 2023-10-02 15:21:04,279 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_92-46009-0_sp0.9 from training. Duration: 0.6 2023-10-02 15:21:08,208 INFO [train.py:1046] (2/4) Epoch 27, batch 800, loss[loss=0.1666, simple_loss=0.2442, pruned_loss=0.04448, over 23714.00 frames. ], tot_loss[loss=0.1675, simple_loss=0.2452, pruned_loss=0.04492, over 4645171.79 frames. ], batch size: 232, lr: 3.82e-03, grad_scale: 32.0 2023-10-02 15:21:11,616 WARNING [train.py:1204] (2/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_122-178847-0_sp1.1 from training. Duration: 0.97275 2023-10-02 15:21:11,628 WARNING [train.py:1204] (2/4) Exclude cut with ID _44778_210_2054_1_1533798127203_7443090_94-348956-0_sp1.1 from training. Duration: 0.92725 2023-10-02 15:21:14,870 WARNING [train.py:1204] (2/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_1018-244089-0_sp1.1 from training. Duration: 0.963625 2023-10-02 15:21:14,890 WARNING [train.py:1204] (2/4) Exclude cut with ID _56235_210_10640_1_1534557335235_4161099_87-118423-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 15:21:14,970 WARNING [train.py:1204] (2/4) Exclude cut with ID _53713_210_24116_1_1534314487762_7416751_504-212292-0_sp1.1 from training. Duration: 0.92725 2023-10-02 15:21:14,999 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_446-150327-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 15:21:17,623 WARNING [train.py:1204] (2/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_682-245989-0_sp1.1 from training. Duration: 0.92725 2023-10-02 15:21:20,396 WARNING [train.py:1204] (2/4) Exclude cut with ID _35723_210_1794_1_1533088341043_628250_8-234524-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 15:21:21,706 WARNING [train.py:1204] (2/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_64-248359-0_sp0.9 from training. Duration: 0.7666875 2023-10-02 15:21:22,387 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.4.encoder.layers.0.conv_module2.whiten, num_groups=1, num_channels=384, metric=8.12 vs. limit=15.0 2023-10-02 15:21:24,470 WARNING [train.py:1204] (2/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_49-97158-0 from training. Duration: 0.76 2023-10-02 15:21:25,789 WARNING [train.py:1204] (2/4) Exclude cut with ID _24314_210_4915_1_1532135078116_7170179_743-915-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 15:21:27,169 WARNING [train.py:1204] (2/4) Exclude cut with ID _61194_210_18628_1_1535007555628_3750909_353-323468-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 15:21:27,191 WARNING [train.py:1204] (2/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_387-131538-0_sp1.1 from training. Duration: 0.736375 2023-10-02 15:21:27,220 WARNING [train.py:1204] (2/4) Exclude cut with ID _44424_210_18163_1_1533623394848_1213110_79-187603-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 15:21:27,258 WARNING [train.py:1204] (2/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_86-19601-0 from training. Duration: 0.85 2023-10-02 15:21:27,285 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_50050_210_16223_1_1535435796_7290138_103-283529-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 15:21:28,585 WARNING [train.py:1204] (2/4) Exclude cut with ID _13913_210_9994_1_1530579615187_3383897_473-277682-0 from training. Duration: 0.39 2023-10-02 15:21:31,432 WARNING [train.py:1204] (2/4) Exclude cut with ID _42326_210_7770_1_1533814282531_3707269_368-68029-0_sp1.1 from training. Duration: 0.92725 2023-10-02 15:21:32,863 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_41428_210_2161_1_1533774184_3992532_446-339645-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 15:21:34,247 WARNING [train.py:1204] (2/4) Exclude cut with ID _69584_210_5219_1_1535785133775_4044450_325-7656-0_sp1.1 from training. Duration: 0.7818125 2023-10-02 15:21:34,262 WARNING [train.py:1204] (2/4) Exclude cut with ID _21632_210_2253_1_1531994001765_3744140_134-184387-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 15:21:37,103 WARNING [train.py:1204] (2/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_550-184707-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 15:21:37,132 WARNING [train.py:1204] (2/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_106-341780-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 15:21:43,866 WARNING [train.py:1204] (2/4) Exclude cut with ID _45871_210_5665_1_1533864614245_3296060_253-132552-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 15:21:45,673 WARNING [train.py:1204] (2/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_395-310220-0_sp1.1 from training. Duration: 0.8090625 2023-10-02 15:21:45,696 WARNING [train.py:1204] (2/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_795-33252-0_sp0.9 from training. Duration: 0.411125 2023-10-02 15:21:47,071 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9002W0003-495962 from training. Duration: 0.946 2023-10-02 15:21:47,102 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_43-71989-0 from training. Duration: 0.57 2023-10-02 15:21:48,365 WARNING [train.py:1204] (2/4) Exclude cut with ID _8699_210_2069_1_1528630346831_3960409_155-39766-0_sp1.1 from training. Duration: 0.7090625 2023-10-02 15:21:48,377 WARNING [train.py:1204] (2/4) Exclude cut with ID _27894_210_12392_1_1532415496078_6902439_788-232684-0_sp1.1 from training. Duration: 0.97275 2023-10-02 15:21:49,739 WARNING [train.py:1204] (2/4) Exclude cut with ID _56494_210_4220_1_1535284837625_3033169_10-39296-0_sp1.1 from training. Duration: 0.92725 2023-10-02 15:21:49,772 WARNING [train.py:1204] (2/4) Exclude cut with ID _40901_210_5665_1_1533363789958_3856130_341-20819-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 15:21:55,121 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.532e+02 1.828e+02 2.029e+02 2.331e+02 3.011e+02, threshold=4.058e+02, percent-clipped=0.0 2023-10-02 15:21:56,504 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9029W0435-509822 from training. Duration: 0.896 2023-10-02 15:21:56,537 WARNING [train.py:1204] (2/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_51-117576-0 from training. Duration: 0.4 2023-10-02 15:21:57,925 WARNING [train.py:1204] (2/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_197-28164-0_sp1.1 from training. Duration: 0.72725 2023-10-02 15:21:59,498 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.bypass.skip_rate, batch_count=926300.0, ans=0.09899494936611666 2023-10-02 15:22:00,672 WARNING [train.py:1204] (2/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_145-137790-0_sp1.1 from training. Duration: 0.7545625 2023-10-02 15:22:04,790 WARNING [train.py:1204] (2/4) Exclude cut with ID _47790_210_16187_1_1533862814066_4207519_2-39653-0_sp1.1 from training. Duration: 0.8909375 2023-10-02 15:22:05,736 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.1.encoder.layers.0.nonlin_attention.whiten2, num_groups=1, num_channels=256, metric=5.36 vs. limit=15.0 2023-10-02 15:22:07,634 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_3780_210_2087_1_1526467760_2130483_186-208240-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 15:22:09,012 WARNING [train.py:1204] (2/4) Exclude cut with ID _24055_210_12651_1_1532310907143_976979_92-59356-0 from training. Duration: 0.91 2023-10-02 15:22:09,067 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_3910_210_1794_1_1526742306_334530_28-238909-0_sp1.1 from training. Duration: 0.736375 2023-10-02 15:22:11,744 WARNING [train.py:1204] (2/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_269-154726-0 from training. Duration: 0.86 2023-10-02 15:22:18,187 WARNING [train.py:1204] (2/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_326-165900-0_sp0.9 from training. Duration: 0.7666875 2023-10-02 15:22:21,343 INFO [train.py:1046] (2/4) Epoch 27, batch 850, loss[loss=0.1589, simple_loss=0.2338, pruned_loss=0.04205, over 20341.00 frames. ], tot_loss[loss=0.1679, simple_loss=0.2456, pruned_loss=0.04509, over 4656977.98 frames. ], batch size: 44, lr: 3.82e-03, grad_scale: 16.0 2023-10-02 15:22:22,791 WARNING [train.py:1204] (2/4) Exclude cut with ID _5294_210_1747_1_1527249643928_3696179_470-276077-0_sp1.1 from training. Duration: 0.963625 2023-10-02 15:22:22,832 WARNING [train.py:1204] (2/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_372-131163-0 from training. Duration: 0.94 2023-10-02 15:22:24,151 WARNING [train.py:1204] (2/4) Exclude cut with ID _45297_210_2721_1_1533715351381_3264949_351-147136-0_sp1.1 from training. Duration: 0.7909375 2023-10-02 15:22:24,229 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_1072-12671-0_sp1.1 from training. Duration: 0.92725 2023-10-02 15:22:25,571 WARNING [train.py:1204] (2/4) Exclude cut with ID _15133_210_6228_1_1530794512074_3080379_54-198193-0 from training. Duration: 0.94 2023-10-02 15:22:25,594 WARNING [train.py:1204] (2/4) Exclude cut with ID _30196_210_1664_1_1533708496522_7074140_1194-270988-0_sp1.1 from training. Duration: 0.97275 2023-10-02 15:22:26,973 WARNING [train.py:1204] (2/4) Exclude cut with ID _50310_210_15488_1_1534068043345_3641080_289-193637-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 15:22:28,322 WARNING [train.py:1204] (2/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_313-263495-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 15:22:29,683 WARNING [train.py:1204] (2/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_18-130336-0_sp1.1 from training. Duration: 0.7181875 2023-10-02 15:22:31,068 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_47896_210_20508_1_1534492518_5958868_646-287209-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 15:22:31,165 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_294-163793-0 from training. Duration: 0.82 2023-10-02 15:22:31,203 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_8-319957-0 from training. Duration: 0.96 2023-10-02 15:22:31,206 WARNING [train.py:1204] (2/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_1211-226066-0 from training. Duration: 0.76 2023-10-02 15:22:32,679 WARNING [train.py:1204] (2/4) Exclude cut with ID _4779_210_2084_1_1527318022944_3914893_500-285013-0_sp0.9 from training. Duration: 0.7666875 2023-10-02 15:22:32,711 WARNING [train.py:1204] (2/4) Exclude cut with ID _22826_210_9994_1_1531908201522_3279169_347-232268-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 15:22:35,476 WARNING [train.py:1204] (2/4) Exclude cut with ID _29877_210_6228_1_1532397791106_5276149_111-55056-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 15:22:35,507 WARNING [train.py:1204] (2/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_812-271646-0_sp1.1 from training. Duration: 0.92725 2023-10-02 15:22:35,543 WARNING [train.py:1204] (2/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_235-262124-0_sp1.1 from training. Duration: 0.6090625 2023-10-02 15:22:35,795 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.feed_forward3.hidden_balancer.prob, batch_count=926500.0, ans=0.125 2023-10-02 15:22:39,497 WARNING [train.py:1204] (2/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_14-16530-0_sp1.1 from training. Duration: 0.97275 2023-10-02 15:22:39,531 WARNING [train.py:1204] (2/4) Exclude cut with ID _44718_210_5816_1_1534050051869_6469030_568-304512-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 15:22:40,912 WARNING [train.py:1204] (2/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_1033-274376-0 from training. Duration: 0.87 2023-10-02 15:22:44,205 WARNING [train.py:1204] (2/4) Exclude cut with ID _4681_210_3525_1_1527251040321_1235713_123-24428-0 from training. Duration: 0.48 2023-10-02 15:22:44,488 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.bypass_mid.scale_min, batch_count=926500.0, ans=0.2 2023-10-02 15:22:47,938 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_37818_210_4238_1_1533620910_4720083_251-288764-0_sp1.1 from training. Duration: 0.97275 2023-10-02 15:22:49,689 WARNING [train.py:1204] (2/4) Exclude cut with ID _65585_210_12210_1_1535450451584_3764098_112-294301-0 from training. Duration: 0.88 2023-10-02 15:22:51,326 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.balancer2.prob, batch_count=926566.6666666666, ans=0.125 2023-10-02 15:22:53,676 WARNING [train.py:1204] (2/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_329-113173-0 from training. Duration: 0.62 2023-10-02 15:22:55,032 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6767_210_3273_1_1527855783_1718932_173-256601-0 from training. Duration: 0.8 2023-10-02 15:22:56,523 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9266W0001-627677 from training. Duration: 0.896 2023-10-02 15:22:56,533 WARNING [train.py:1204] (2/4) Exclude cut with ID _49643_210_7989_1_1534640244224_3939031_611-185515-0_sp1.1 from training. Duration: 0.8545625 2023-10-02 15:22:56,543 WARNING [train.py:1204] (2/4) Exclude cut with ID _31649_210_6228_1_1532584608063_4091078_687-237771-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 15:22:56,552 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_12868_210_6753_1_1530702479_3610564_148-172861-0_sp0.9 from training. Duration: 0.5555625 2023-10-02 15:23:00,598 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_5129_210_6400_1_1527161005_3579054_107-69738-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 15:23:01,992 WARNING [train.py:1204] (2/4) Exclude cut with ID _40137_210_12210_1_1533290484046_3795180_36-301164-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 15:23:02,022 WARNING [train.py:1204] (2/4) Exclude cut with ID _7537_210_2084_1_1528502395431_3924359_113-199933-0 from training. Duration: 0.59 2023-10-02 15:23:03,471 WARNING [train.py:1204] (2/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_547-5424-0_sp1.1 from training. Duration: 0.8545625 2023-10-02 15:23:04,844 WARNING [train.py:1204] (2/4) Exclude cut with ID _43552_210_8913_1_1533726031059_3523429_147-284024-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 15:23:06,225 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_294-163793-0_sp1.1 from training. Duration: 0.7454375 2023-10-02 15:23:06,257 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_3780_210_2087_1_1526467760_2130483_170-259044-0_sp1.1 from training. Duration: 0.636375 2023-10-02 15:23:08,901 WARNING [train.py:1204] (2/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_726-270780-0_sp1.1 from training. Duration: 0.7818125 2023-10-02 15:23:10,256 WARNING [train.py:1204] (2/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_214-291369-0_sp1.1 from training. Duration: 0.57275 2023-10-02 15:23:10,315 WARNING [train.py:1204] (2/4) Exclude cut with ID _21881_210_3525_1_1531825242757_3798380_247-102591-0 from training. Duration: 0.98 2023-10-02 15:23:13,515 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.conv_module1.balancer2.min_positive, batch_count=926633.3333333334, ans=0.05 2023-10-02 15:23:15,378 WARNING [train.py:1204] (2/4) Exclude cut with ID _8064_210_5399_1_1528372299291_4282973_18-174647-0_sp1.1 from training. Duration: 0.82725 2023-10-02 15:23:15,380 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_415-52611-0_sp1.1 from training. Duration: 0.936375 2023-10-02 15:23:16,681 WARNING [train.py:1204] (2/4) Exclude cut with ID _8394_210_6702_1_1528590504316_3540189_379-49457-0_sp0.9 from training. Duration: 0.9666875 2023-10-02 15:23:16,695 WARNING [train.py:1204] (2/4) Exclude cut with ID _64521_210_14699_1_1535243427259_3815328_368-195106-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 15:23:16,758 WARNING [train.py:1204] (2/4) Exclude cut with ID _28394_210_8913_1_1532430049518_3650118_51-334124-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 15:23:20,115 WARNING [train.py:1204] (2/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_317-161769-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 15:23:24,015 WARNING [train.py:1204] (2/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_40-129756-0_sp0.9 from training. Duration: 0.9 2023-10-02 15:23:25,352 WARNING [train.py:1204] (2/4) Exclude cut with ID _3067_210_2667_1_1526212974640_4503168_119-175776-0_sp0.9 from training. Duration: 0.82225 2023-10-02 15:23:26,712 WARNING [train.py:1204] (2/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_1004-304915-0_sp1.1 from training. Duration: 0.92725 2023-10-02 15:23:26,770 WARNING [train.py:1204] (2/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_359-118926-0_sp0.9 from training. Duration: 0.8 2023-10-02 15:23:34,942 INFO [train.py:1046] (2/4) Epoch 27, batch 900, loss[loss=0.1693, simple_loss=0.257, pruned_loss=0.04084, over 24608.00 frames. ], tot_loss[loss=0.1691, simple_loss=0.2468, pruned_loss=0.0457, over 4665543.56 frames. ], batch size: 71, lr: 3.81e-03, grad_scale: 16.0 2023-10-02 15:23:35,001 WARNING [train.py:1204] (2/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_51-200220-0_sp0.9 from training. Duration: 0.62225 2023-10-02 15:23:35,084 WARNING [train.py:1204] (2/4) Exclude cut with ID _46683_210_9642_1_1533730913492_4591116_530-326202-0_sp1.1 from training. Duration: 0.936375 2023-10-02 15:23:36,414 WARNING [train.py:1204] (2/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_567-135114-0 from training. Duration: 0.87 2023-10-02 15:23:37,691 WARNING [train.py:1204] (2/4) Exclude cut with ID _66863_210_18053_1_1535698758334_8590319_1092-271784-0_sp1.1 from training. Duration: 0.87275 2023-10-02 15:23:37,708 WARNING [train.py:1204] (2/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_762-282614-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 15:23:39,207 WARNING [train.py:1204] (2/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_280-120942-0 from training. Duration: 0.77 2023-10-02 15:23:44,918 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_31583_210_3852_1_1532595851_4024413_27-170931-0_sp1.1 from training. Duration: 0.963625 2023-10-02 15:23:48,258 WARNING [train.py:1204] (2/4) Exclude cut with ID _12369_210_4381_1_1530356078581_4388520_266-271628-0_sp1.1 from training. Duration: 0.92725 2023-10-02 15:23:48,292 WARNING [train.py:1204] (2/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_41-31157-0 from training. Duration: 0.57 2023-10-02 15:23:51,595 WARNING [train.py:1204] (2/4) Exclude cut with ID _21878_210_3525_1_1531738772002_7264330_113-18163-0_sp1.1 from training. Duration: 0.8090625 2023-10-02 15:23:52,956 WARNING [train.py:1204] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_147-111137-0 from training. Duration: 0.95 2023-10-02 15:23:53,040 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1083-220122-0_sp1.1 from training. Duration: 0.463625 2023-10-02 15:23:53,294 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.balancer1.prob, batch_count=926833.3333333334, ans=0.125 2023-10-02 15:23:54,399 WARNING [train.py:1204] (2/4) Exclude cut with ID _14884_210_2252_1_1530707459689_5561250_591-173406-0_sp1.1 from training. Duration: 0.87275 2023-10-02 15:23:54,405 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_24677_210_1794_1_1532002693_4341296_149-303793-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 15:23:56,137 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_12868_210_6753_1_1530702479_3610564_133-564-0_sp1.1 from training. Duration: 0.7181875 2023-10-02 15:23:56,158 WARNING [train.py:1204] (2/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_625-338546-0_sp0.9 from training. Duration: 0.97775 2023-10-02 15:24:02,465 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.conv_module2.balancer2.prob, batch_count=926833.3333333334, ans=0.125 2023-10-02 15:24:05,068 WARNING [train.py:1204] (2/4) Exclude cut with ID _25882_210_3036_1_1532775665815_6479510_624-320169-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 15:24:05,075 WARNING [train.py:1204] (2/4) Exclude cut with ID _20622_210_3852_1_1531622427080_1093839_127-325326-0_sp1.1 from training. Duration: 0.92725 2023-10-02 15:24:06,414 WARNING [train.py:1204] (2/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_464-33931-0_sp1.1 from training. Duration: 0.4909375 2023-10-02 15:24:07,903 WARNING [train.py:1204] (2/4) Exclude cut with ID _66112_210_10635_1_1535590236305_3801484_120-304588-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 15:24:13,298 WARNING [train.py:1204] (2/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_110-130961-0 from training. Duration: 0.47 2023-10-02 15:24:14,725 WARNING [train.py:1204] (2/4) Exclude cut with ID _16908_210_5724_1_1531119157248_3772700_64-54020-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 15:24:16,613 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.5.encoder.layers.0.conv_module2.whiten, num_groups=1, num_channels=256, metric=4.95 vs. limit=15.0 2023-10-02 15:24:19,410 WARNING [train.py:1204] (2/4) Exclude cut with ID _4068_210_2629_1_1526697350395_1546259_224-288693-0_sp1.1 from training. Duration: 0.536375 2023-10-02 15:24:19,538 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=926966.6666666666, ans=0.1 2023-10-02 15:24:20,726 WARNING [train.py:1204] (2/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_191-41023-0_sp0.9 from training. Duration: 0.788875 2023-10-02 15:24:20,799 WARNING [train.py:1204] (2/4) Exclude cut with ID ID0205W0255-783002 from training. Duration: 0.958 2023-10-02 15:24:22,730 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_162-262717-0 from training. Duration: 0.83 2023-10-02 15:24:24,176 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.598e+02 1.889e+02 2.142e+02 2.586e+02 4.212e+02, threshold=4.284e+02, percent-clipped=1.0 2023-10-02 15:24:25,891 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.feed_forward3.hidden_balancer.prob, batch_count=926966.6666666666, ans=0.125 2023-10-02 15:24:27,485 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_224-322394-0_sp1.1 from training. Duration: 0.6 2023-10-02 15:24:27,491 WARNING [train.py:1204] (2/4) Exclude cut with ID _4866_210_2577_1_1527404412284_4364659_133-1696-0_sp1.1 from training. Duration: 0.9 2023-10-02 15:24:27,541 WARNING [train.py:1204] (2/4) Exclude cut with ID _6275_210_2084_1_1528110062213_7669049_973-198085-0_sp1.1 from training. Duration: 0.6818125 2023-10-02 15:24:31,122 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.0.conv_module2.balancer2.prob, batch_count=926966.6666666666, ans=0.125 2023-10-02 15:24:34,951 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_47449_210_5399_1_1534076445_4006929_410-237806-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 15:24:34,967 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7543_210_6753_1_1528543318_4552_1-85072-0_sp0.9 from training. Duration: 0.92225 2023-10-02 15:24:35,137 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.0.feed_forward3.hidden_balancer.prob, batch_count=927033.3333333334, ans=0.125 2023-10-02 15:24:36,343 WARNING [train.py:1204] (2/4) Exclude cut with ID _65263_210_22356_1_1535763598691_3921037_108-21902-0 from training. Duration: 0.99 2023-10-02 15:24:36,355 WARNING [train.py:1204] (2/4) Exclude cut with ID _65585_210_12210_1_1535450451584_3764098_309-174818-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 15:24:39,160 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_309-65177-0 from training. Duration: 0.88 2023-10-02 15:24:40,658 WARNING [train.py:1204] (2/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_363-185703-0_sp1.1 from training. Duration: 0.863625 2023-10-02 15:24:40,690 WARNING [train.py:1204] (2/4) Exclude cut with ID _42326_210_7770_1_1533814282531_3707269_442-238692-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 15:24:42,682 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.2.encoder.layers.2.self_attn1.whiten, num_groups=1, num_channels=384, metric=19.80 vs. limit=22.5 2023-10-02 15:24:43,385 WARNING [train.py:1204] (2/4) Exclude cut with ID _69884_210_9771_1_1535846392818_4166169_226-254446-0_sp1.1 from training. Duration: 0.936375 2023-10-02 15:24:43,410 WARNING [train.py:1204] (2/4) Exclude cut with ID _29853_210_7497_1_1532505600851_6642110_287-299903-0_sp1.1 from training. Duration: 0.963625 2023-10-02 15:24:46,247 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1015-343884-0 from training. Duration: 0.62 2023-10-02 15:24:47,547 WARNING [train.py:1204] (2/4) Exclude cut with ID ID0305W0008-833402 from training. Duration: 0.91 2023-10-02 15:24:49,475 INFO [train.py:1046] (2/4) Epoch 27, batch 950, loss[loss=0.1575, simple_loss=0.2437, pruned_loss=0.03562, over 24288.00 frames. ], tot_loss[loss=0.1697, simple_loss=0.2476, pruned_loss=0.04586, over 4670067.71 frames. ], batch size: 61, lr: 3.81e-03, grad_scale: 16.0 2023-10-02 15:24:49,538 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6673_210_3273_1_1527767790_3173240_351-167954-0_sp1.1 from training. Duration: 0.436375 2023-10-02 15:24:49,550 WARNING [train.py:1204] (2/4) Exclude cut with ID _6456_210_1534_1_1527681634212_3181049_345-252877-0 from training. Duration: 0.64 2023-10-02 15:24:51,466 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_13-205182-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 15:24:54,211 WARNING [train.py:1204] (2/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_427-208901-0 from training. Duration: 0.68 2023-10-02 15:24:56,401 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.conv_module2.balancer2.prob, batch_count=927100.0, ans=0.125 2023-10-02 15:24:59,332 WARNING [train.py:1204] (2/4) Exclude cut with ID _49039_210_6315_1_1533967537541_3846118_9-148921-0_sp1.1 from training. Duration: 0.97275 2023-10-02 15:25:00,850 WARNING [train.py:1204] (2/4) Exclude cut with ID _59493_210_17282_1_1535078419077_3225879_143-70317-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 15:25:02,172 WARNING [train.py:1204] (2/4) Exclude cut with ID _27967_210_12140_1_1532588391859_8419130_1056-236369-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 15:25:02,218 WARNING [train.py:1204] (2/4) Exclude cut with ID _6537_210_1883_1_1527987559710_3597129_382-133062-0_sp0.9 from training. Duration: 0.7555625 2023-10-02 15:25:04,993 WARNING [train.py:1204] (2/4) Exclude cut with ID ID0376W0255-869957 from training. Duration: 0.9510625 2023-10-02 15:25:09,018 WARNING [train.py:1204] (2/4) Exclude cut with ID _49371_210_12071_1_1533952761021_3812190_483-240691-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 15:25:10,350 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_12868_210_6753_1_1530701932_530275_18-76322-0_sp1.1 from training. Duration: 0.7909375 2023-10-02 15:25:10,413 WARNING [train.py:1204] (2/4) Exclude cut with ID _53950_210_5816_1_1534417264919_3862951_367-26344-0_sp1.1 from training. Duration: 0.97275 2023-10-02 15:25:10,442 WARNING [train.py:1204] (2/4) Exclude cut with ID _5444_210_2151_1_1527418808464_5312210_634-194174-0_sp1.1 from training. Duration: 0.8818125 2023-10-02 15:25:10,460 WARNING [train.py:1204] (2/4) Exclude cut with ID _56494_210_4220_1_1535284837625_3033169_69-138150-0 from training. Duration: 0.94 2023-10-02 15:25:11,811 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7543_210_6753_1_1528544010_1110402_25-183860-0_sp0.9 from training. Duration: 0.52225 2023-10-02 15:25:13,173 WARNING [train.py:1204] (2/4) Exclude cut with ID _40284_210_2084_1_1533513679814_3704279_287-191918-0_sp1.1 from training. Duration: 0.963625 2023-10-02 15:25:14,570 WARNING [train.py:1204] (2/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_291-270881-0 from training. Duration: 0.97 2023-10-02 15:25:15,912 WARNING [train.py:1204] (2/4) Exclude cut with ID _12555_210_8750_1_1530075594402_3780239_449-267986-0_sp0.9 from training. Duration: 0.92225 2023-10-02 15:25:16,078 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.bypass_mid.scale_min, batch_count=927166.6666666666, ans=0.2 2023-10-02 15:25:19,014 WARNING [train.py:1204] (2/4) Exclude cut with ID _3992_210_1747_1_1526644816031_3731060_268-64520-0_sp1.1 from training. Duration: 0.963625 2023-10-02 15:25:19,021 WARNING [train.py:1204] (2/4) Exclude cut with ID _39411_210_5986_1_1533538825030_3343649_173-238248-0_sp0.9 from training. Duration: 0.92225 2023-10-02 15:25:19,042 WARNING [train.py:1204] (2/4) Exclude cut with ID _32704_210_8954_1_1533017488840_6747370_828-313097-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 15:25:20,394 WARNING [train.py:1204] (2/4) Exclude cut with ID _26907_210_7234_1_1532221740694_3205263_337-2494-0 from training. Duration: 0.88 2023-10-02 15:25:23,699 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_229-45300-0_sp1.1 from training. Duration: 0.5181875 2023-10-02 15:25:25,547 WARNING [train.py:1204] (2/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_300-27235-0_sp1.1 from training. Duration: 0.7909375 2023-10-02 15:25:25,653 WARNING [train.py:1204] (2/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_48-338976-0_sp1.1 from training. Duration: 0.8090625 2023-10-02 15:25:31,819 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_13091_210_7925_1_1530609447_8987354_792-195353-0_sp1.1 from training. Duration: 0.936375 2023-10-02 15:25:31,824 WARNING [train.py:1204] (2/4) Exclude cut with ID _56524_210_9642_1_1534572011779_3619170_311-54500-0_sp1.1 from training. Duration: 0.97275 2023-10-02 15:25:34,650 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7543_210_6753_1_1528544010_1110402_25-183860-0 from training. Duration: 0.47 2023-10-02 15:25:36,011 WARNING [train.py:1204] (2/4) Exclude cut with ID 2411-132532-0017-82279-0_sp1.1 from training. Duration: 0.9681875 2023-10-02 15:25:36,011 WARNING [train.py:1204] (2/4) Exclude cut with ID _14170_210_5219_1_1530525949868_4469650_272-249985-0_sp0.9 from training. Duration: 0.7444375 2023-10-02 15:25:37,481 WARNING [train.py:1204] (2/4) Exclude cut with ID _55644_210_11856_1_1534593599582_4103719_320-229006-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 15:25:38,801 WARNING [train.py:1204] (2/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_177-100554-0_sp1.1 from training. Duration: 0.963625 2023-10-02 15:25:38,803 WARNING [train.py:1204] (2/4) Exclude cut with ID _14998_210_5219_1_1530705582080_7474970_787-294416-0_sp1.1 from training. Duration: 0.7454375 2023-10-02 15:25:41,638 WARNING [train.py:1204] (2/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_318-59097-0 from training. Duration: 0.76 2023-10-02 15:25:43,016 WARNING [train.py:1204] (2/4) Exclude cut with ID _12369_210_4381_1_1530356078581_4388520_480-13146-0_sp1.1 from training. Duration: 0.77275 2023-10-02 15:25:44,487 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_469-338558-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 15:25:45,806 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_40896_210_15710_1_1533433956_7100802_367-126897-0_sp1.1 from training. Duration: 0.963625 2023-10-02 15:25:45,822 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6545_210_2040_1_1527774670_4245258_379-14182-0 from training. Duration: 0.8 2023-10-02 15:25:45,840 WARNING [train.py:1204] (2/4) Exclude cut with ID _21631_210_2253_1_1531907880778_3160339_197-79498-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 15:25:45,850 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_12868_210_6753_1_1530702479_3610564_416-293746-0_sp1.1 from training. Duration: 0.8181875 2023-10-02 15:25:47,793 WARNING [train.py:1204] (2/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_52-211683-0 from training. Duration: 0.9 2023-10-02 15:25:50,598 WARNING [train.py:1204] (2/4) Exclude cut with ID _48678_210_5663_1_1533988830947_3769199_306-255886-0_sp0.9 from training. Duration: 0.8555625 2023-10-02 15:25:52,515 WARNING [train.py:1204] (2/4) Exclude cut with ID _65134_210_14763_1_1535335393985_3505368_296-24733-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 15:25:55,826 WARNING [train.py:1204] (2/4) Exclude cut with ID _14898_210_9206_1_1530766812377_4177190_335-99364-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 15:25:57,189 WARNING [train.py:1204] (2/4) Exclude cut with ID _54871_210_22356_1_1534665798253_3856359_19-342632-0 from training. Duration: 0.91 2023-10-02 15:25:57,221 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_53071_210_16554_1_1534490405_2954066_265-170472-0 from training. Duration: 0.83 2023-10-02 15:26:01,622 WARNING [train.py:1204] (2/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_189-298172-0_sp1.1 from training. Duration: 0.963625 2023-10-02 15:26:04,259 INFO [train.py:1046] (2/4) Epoch 27, batch 1000, loss[loss=0.1727, simple_loss=0.2521, pruned_loss=0.04663, over 23758.00 frames. ], tot_loss[loss=0.1688, simple_loss=0.2462, pruned_loss=0.04563, over 4683777.25 frames. ], batch size: 85, lr: 3.81e-03, grad_scale: 16.0 2023-10-02 15:26:07,003 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7615_210_4211_1_1528110965_4580160_72-235211-0 from training. Duration: 0.76 2023-10-02 15:26:07,052 WARNING [train.py:1204] (2/4) Exclude cut with ID _47790_210_16187_1_1533862814066_4207519_155-90874-0_sp1.1 from training. Duration: 0.936375 2023-10-02 15:26:11,244 WARNING [train.py:1204] (2/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_164-216310-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 15:26:12,663 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_46013_210_7234_1_1533776362_3744032_463-52378-0 from training. Duration: 0.86 2023-10-02 15:26:12,667 WARNING [train.py:1204] (2/4) Exclude cut with ID _61133_210_9969_1_1535196777227_3696409_333-270325-0 from training. Duration: 0.93 2023-10-02 15:26:15,450 WARNING [train.py:1204] (2/4) Exclude cut with ID _5172_210_1883_1_1527246062567_3577219_18-101380-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 15:26:15,465 WARNING [train.py:1204] (2/4) Exclude cut with ID _30800_210_22382_1_1532499347904_4515959_577-236858-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 15:26:17,467 WARNING [train.py:1204] (2/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_1006-177345-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 15:26:20,274 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6291_210_3273_1_1527683649_1078498_4-266348-0 from training. Duration: 0.78 2023-10-02 15:26:23,682 INFO [scaling.py:1118] (2/4) WithLoss: name=encoder.encoders.2.encoder.layers.1.self_attn_weights, loss-sum=0.000e+00 2023-10-02 15:26:23,753 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.nonlin_attention.balancer.prob, batch_count=927500.0, ans=0.125 2023-10-02 15:26:24,650 WARNING [train.py:1204] (2/4) Exclude cut with ID _55857_210_2346_1_1534644007831_6898209_745-98391-0 from training. Duration: 0.76 2023-10-02 15:26:26,742 WARNING [train.py:1204] (2/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_375-244976-0 from training. Duration: 0.75 2023-10-02 15:26:26,794 WARNING [train.py:1204] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_4-248404-0_sp1.1 from training. Duration: 0.97275 2023-10-02 15:26:29,548 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8453_210_4211_1_1528502940_8562730_596-306820-0 from training. Duration: 0.97 2023-10-02 15:26:29,663 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_211-339622-0_sp0.9 from training. Duration: 0.5 2023-10-02 15:26:29,692 WARNING [train.py:1204] (2/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_393-57888-0 from training. Duration: 0.64 2023-10-02 15:26:32,221 WARNING [train.py:1204] (2/4) Exclude cut with ID _23825_210_13449_1_1531915313526_3497150_143-343575-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 15:26:33,623 WARNING [train.py:1204] (2/4) Exclude cut with ID _58605_210_12210_1_1534723195939_3904259_36-12256-0_sp1.1 from training. Duration: 0.936375 2023-10-02 15:26:40,695 WARNING [train.py:1204] (2/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_38-67795-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 15:26:42,074 WARNING [train.py:1204] (2/4) Exclude cut with ID _64631_210_27767_1_1535349580068_4165160_308-327832-0_sp1.1 from training. Duration: 0.92725 2023-10-02 15:26:43,438 WARNING [train.py:1204] (2/4) Exclude cut with ID _37086_210_9038_1_1532999540891_7021250_726-81176-0_sp1.1 from training. Duration: 0.936375 2023-10-02 15:26:44,784 WARNING [train.py:1204] (2/4) Exclude cut with ID _47790_210_16187_1_1533862814066_4207519_47-141521-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 15:26:44,788 WARNING [train.py:1204] (2/4) Exclude cut with ID _3067_210_2667_1_1526212974640_4503168_250-285937-0 from training. Duration: 0.8 2023-10-02 15:26:44,820 WARNING [train.py:1204] (2/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_239-24000-0_sp1.1 from training. Duration: 0.97275 2023-10-02 15:26:44,878 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_3371_210_1794_1_1526384462_5580777_396-339812-0_sp0.9 from training. Duration: 0.9444375 2023-10-02 15:26:46,312 WARNING [train.py:1204] (2/4) Exclude cut with ID _16909_210_5724_1_1531292079912_4118919_580-173454-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 15:26:46,366 WARNING [train.py:1204] (2/4) Exclude cut with ID IC0124W0002-61308 from training. Duration: 0.982 2023-10-02 15:26:49,706 WARNING [train.py:1204] (2/4) Exclude cut with ID _31602_210_5854_1_1532677021456_4458250_249-96604-0 from training. Duration: 0.89 2023-10-02 15:26:49,946 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.feed_forward2.hidden_balancer.prob, batch_count=927633.3333333334, ans=0.125 2023-10-02 15:26:51,742 WARNING [train.py:1204] (2/4) Exclude cut with ID _69584_210_5219_1_1535785133775_4044450_325-7656-0 from training. Duration: 0.86 2023-10-02 15:26:53,525 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.534e+02 1.843e+02 1.984e+02 2.193e+02 2.939e+02, threshold=3.969e+02, percent-clipped=0.0 2023-10-02 15:26:54,933 WARNING [train.py:1204] (2/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_478-162247-0 from training. Duration: 0.82 2023-10-02 15:26:56,358 WARNING [train.py:1204] (2/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_686-54383-0_sp1.1 from training. Duration: 0.7909375 2023-10-02 15:27:01,246 WARNING [train.py:1204] (2/4) Exclude cut with ID _29303_210_2575_1_1532392237303_7410649_85-270092-0_sp1.1 from training. Duration: 0.936375 2023-10-02 15:27:02,548 WARNING [train.py:1204] (2/4) Exclude cut with ID _2322_210_2667_1_1525604548971_3656299_440-80494-0_sp1.1 from training. Duration: 0.8 2023-10-02 15:27:02,577 WARNING [train.py:1204] (2/4) Exclude cut with ID _47346_210_9476_1_1533815971553_3536160_201-40644-0_sp1.1 from training. Duration: 0.936375 2023-10-02 15:27:03,971 WARNING [train.py:1204] (2/4) Exclude cut with ID _56235_210_10640_1_1534557335235_4161099_343-139898-0_sp1.1 from training. Duration: 0.87275 2023-10-02 15:27:05,307 WARNING [train.py:1204] (2/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_1012-279473-0 from training. Duration: 0.67 2023-10-02 15:27:05,409 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7931_210_1794_1_1528594728_5564059_280-279560-0_sp1.1 from training. Duration: 0.8909375 2023-10-02 15:27:06,746 WARNING [train.py:1204] (2/4) Exclude cut with ID _8263_210_1664_1_1528439452891_970220_68-181805-0 from training. Duration: 0.64 2023-10-02 15:27:06,796 WARNING [train.py:1204] (2/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_605-133395-0 from training. Duration: 0.66 2023-10-02 15:27:08,406 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.feed_forward1.out_proj.dropout_p, batch_count=927700.0, ans=0.1 2023-10-02 15:27:09,541 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_43-171109-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 15:27:09,544 WARNING [train.py:1204] (2/4) Exclude cut with ID _40094_210_16380_1_1533294005773_3641159_107-280739-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 15:27:11,902 INFO [scaling.py:1022] (2/4) Whitening: name=encoder_embed.convnext.out_whiten, num_groups=1, num_channels=128, metric=4.40 vs. limit=5.0 2023-10-02 15:27:12,334 WARNING [train.py:1204] (2/4) Exclude cut with ID _14821_210_8750_1_1530680503991_6829359_2-227151-0_sp0.9 from training. Duration: 0.92225 2023-10-02 15:27:13,762 WARNING [train.py:1204] (2/4) Exclude cut with ID _14268_210_9206_1_1530622771285_4714099_331-105351-0_sp1.1 from training. Duration: 0.5909375 2023-10-02 15:27:15,148 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_26326_210_7925_1_1533002493_3592406_451-82741-0_sp1.1 from training. Duration: 0.97275 2023-10-02 15:27:18,223 INFO [train.py:1046] (2/4) Epoch 27, batch 1050, loss[loss=0.1469, simple_loss=0.214, pruned_loss=0.0399, over 23642.00 frames. ], tot_loss[loss=0.1668, simple_loss=0.2441, pruned_loss=0.04474, over 4689570.13 frames. ], batch size: 232, lr: 3.81e-03, grad_scale: 16.0 2023-10-02 15:27:18,323 WARNING [train.py:1204] (2/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_191-59784-0_sp0.9 from training. Duration: 0.97775 2023-10-02 15:27:18,607 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.nonlin_attention.balancer.prob, batch_count=927766.6666666666, ans=0.125 2023-10-02 15:27:19,654 WARNING [train.py:1204] (2/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_445-117398-0_sp1.1 from training. Duration: 0.8090625 2023-10-02 15:27:21,492 WARNING [train.py:1204] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_605-257205-0_sp0.9 from training. Duration: 0.6555625 2023-10-02 15:27:21,555 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_50050_210_16223_1_1535435796_7290138_592-258841-0_sp1.1 from training. Duration: 0.936375 2023-10-02 15:27:24,291 WARNING [train.py:1204] (2/4) Exclude cut with ID _14348_210_8341_1_1530599434996_3778640_240-232370-0_sp0.9 from training. Duration: 0.9333125 2023-10-02 15:27:24,850 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.3.encoder.layers.2.whiten, num_groups=1, num_channels=512, metric=5.17 vs. limit=12.0 2023-10-02 15:27:25,599 WARNING [train.py:1204] (2/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_239-316802-0_sp0.9 from training. Duration: 0.7666875 2023-10-02 15:27:25,854 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.conv_skip_rate, batch_count=927766.6666666666, ans=0.0 2023-10-02 15:27:28,423 WARNING [train.py:1204] (2/4) Exclude cut with ID _45560_210_21982_1_1533812603951_3584999_111-6376-0_sp1.1 from training. Duration: 0.763625 2023-10-02 15:27:30,440 WARNING [train.py:1204] (2/4) Exclude cut with ID _54189_210_16380_1_1534332670039_3911710_606-311481-0_sp1.1 from training. Duration: 0.963625 2023-10-02 15:27:30,510 WARNING [train.py:1204] (2/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_237-332027-0_sp1.1 from training. Duration: 0.863625 2023-10-02 15:27:31,864 WARNING [train.py:1204] (2/4) Exclude cut with ID _66070_210_7640_1_1535441393096_7805310_489-292631-0_sp0.9 from training. Duration: 0.87775 2023-10-02 15:27:31,941 WARNING [train.py:1204] (2/4) Exclude cut with ID _49728_210_12651_1_1534041034294_6788150_624-195301-0_sp1.1 from training. Duration: 0.77275 2023-10-02 15:27:33,326 WARNING [train.py:1204] (2/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_44-79427-0 from training. Duration: 0.98 2023-10-02 15:27:33,377 WARNING [train.py:1204] (2/4) Exclude cut with ID _35350_210_16040_1_1533040191183_4075299_490-198354-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 15:27:33,429 WARNING [train.py:1204] (2/4) Exclude cut with ID _5166_210_2084_1_1527073197160_4087993_309-222545-0 from training. Duration: 0.84 2023-10-02 15:27:36,199 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_13091_210_7925_1_1530609447_8987354_790-199469-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 15:27:36,205 WARNING [train.py:1204] (2/4) Exclude cut with ID _21632_210_2253_1_1531994001765_3744140_110-274654-0 from training. Duration: 0.82 2023-10-02 15:27:37,485 WARNING [train.py:1204] (2/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_498-40153-0_sp0.9 from training. Duration: 0.7 2023-10-02 15:27:41,792 WARNING [train.py:1204] (2/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_363-282909-0_sp1.1 from training. Duration: 0.936375 2023-10-02 15:27:43,138 WARNING [train.py:1204] (2/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_178-177582-0_sp0.9 from training. Duration: 0.82225 2023-10-02 15:27:43,154 WARNING [train.py:1204] (2/4) Exclude cut with ID _24612_210_8913_1_1531998054193_3806359_357-35487-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 15:27:45,900 WARNING [train.py:1204] (2/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_138-332773-0 from training. Duration: 0.81 2023-10-02 15:27:45,921 WARNING [train.py:1204] (2/4) Exclude cut with ID _7624_210_1531_1_1528109802198_4933101_418-349834-0 from training. Duration: 0.6 2023-10-02 15:27:48,046 WARNING [train.py:1204] (2/4) Exclude cut with ID _7537_210_2084_1_1528502395431_3924359_14-312906-0_sp0.9 from training. Duration: 0.9333125 2023-10-02 15:27:52,230 WARNING [train.py:1204] (2/4) Exclude cut with ID _60057_210_10640_1_1534903065793_3333150_362-230377-0 from training. Duration: 0.87 2023-10-02 15:27:56,754 WARNING [train.py:1204] (2/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_138-64210-0 from training. Duration: 0.73 2023-10-02 15:27:58,112 WARNING [train.py:1204] (2/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_454-339504-0_sp1.1 from training. Duration: 0.97275 2023-10-02 15:28:00,950 WARNING [train.py:1204] (2/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_508-335369-0_sp0.9 from training. Duration: 0.5333125 2023-10-02 15:28:02,880 WARNING [train.py:1204] (2/4) Exclude cut with ID _5608_210_2106_1_1527415439089_6819768_909-82767-0_sp0.9 from training. Duration: 0.67775 2023-10-02 15:28:04,209 WARNING [train.py:1204] (2/4) Exclude cut with ID _6694_210_4599_1_1527852637490_3965529_99-11611-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 15:28:04,276 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_2655_210_2087_1_1525688959_3897400_52-279899-0_sp1.1 from training. Duration: 0.62725 2023-10-02 15:28:06,994 WARNING [train.py:1204] (2/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_375-245677-0_sp1.1 from training. Duration: 0.736375 2023-10-02 15:28:10,105 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_23850_210_4238_1_1532232597_1632433_32-185135-0 from training. Duration: 0.86 2023-10-02 15:28:12,734 WARNING [train.py:1204] (2/4) Exclude cut with ID _15113_210_8254_1_1530842438579_3716279_209-54533-0 from training. Duration: 0.96 2023-10-02 15:28:12,771 WARNING [train.py:1204] (2/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_401-53423-0 from training. Duration: 0.55 2023-10-02 15:28:12,812 WARNING [train.py:1204] (2/4) Exclude cut with ID _14867_210_1968_1_1530684179890_3846969_319-250868-0_sp1.1 from training. Duration: 0.9 2023-10-02 15:28:12,988 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.balancer2.prob, batch_count=927966.6666666666, ans=0.125 2023-10-02 15:28:14,081 WARNING [train.py:1204] (2/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_106-30782-0_sp0.9 from training. Duration: 0.888875 2023-10-02 15:28:15,495 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_5960_210_1794_1_1527403865_3990999_515-2613-0 from training. Duration: 0.88 2023-10-02 15:28:18,421 WARNING [train.py:1204] (2/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_798-106179-0_sp0.9 from training. Duration: 0.92225 2023-10-02 15:28:18,579 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.0.ff2_skip_rate, batch_count=928033.3333333334, ans=0.0 2023-10-02 15:28:20,538 WARNING [train.py:1204] (2/4) Exclude cut with ID _14227_210_4381_1_1530701804286_3940259_125-275642-0_sp1.1 from training. Duration: 0.9 2023-10-02 15:28:20,540 WARNING [train.py:1204] (2/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_79-200392-0_sp1.1 from training. Duration: 0.8818125 2023-10-02 15:28:20,584 WARNING [train.py:1204] (2/4) Exclude cut with ID _48836_210_12210_1_1534075848293_3904150_429-35475-0_sp0.9 from training. Duration: 0.988875 2023-10-02 15:28:20,597 WARNING [train.py:1204] (2/4) Exclude cut with ID _69884_210_9771_1_1535846392818_4166169_224-172517-0_sp1.1 from training. Duration: 0.97275 2023-10-02 15:28:24,756 WARNING [train.py:1204] (2/4) Exclude cut with ID _15113_210_8254_1_1530842438579_3716279_76-232507-0_sp1.1 from training. Duration: 0.97275 2023-10-02 15:28:24,759 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_135-5501-0 from training. Duration: 0.98 2023-10-02 15:28:26,218 WARNING [train.py:1204] (2/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_563-347832-0_sp0.9 from training. Duration: 0.988875 2023-10-02 15:28:26,228 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6786_210_1388_1_1527839699_4194495_273-149841-0 from training. Duration: 0.92 2023-10-02 15:28:26,252 WARNING [train.py:1204] (2/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_172-215932-0 from training. Duration: 0.66 2023-10-02 15:28:26,387 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.nonlin_attention.balancer.prob, batch_count=928033.3333333334, ans=0.125 2023-10-02 15:28:28,176 WARNING [train.py:1204] (2/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_939-132910-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 15:28:28,336 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.1.bypass_mid.scale_min, batch_count=928033.3333333334, ans=0.2 2023-10-02 15:28:28,383 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.feed_forward2.hidden_balancer.prob, batch_count=928033.3333333334, ans=0.125 2023-10-02 15:28:28,421 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.conv_module1.balancer1.prob, batch_count=928033.3333333334, ans=0.125 2023-10-02 15:28:31,394 WARNING [train.py:1204] (2/4) Exclude cut with ID _49371_210_12071_1_1533952761021_3812190_16-246001-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 15:28:32,691 INFO [train.py:1046] (2/4) Epoch 27, batch 1100, loss[loss=0.1662, simple_loss=0.2518, pruned_loss=0.0403, over 24551.00 frames. ], tot_loss[loss=0.166, simple_loss=0.2432, pruned_loss=0.04437, over 4695942.38 frames. ], batch size: 71, lr: 3.81e-03, grad_scale: 16.0 2023-10-02 15:28:36,772 WARNING [train.py:1204] (2/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_680-189165-0_sp1.1 from training. Duration: 0.936375 2023-10-02 15:28:39,710 WARNING [train.py:1204] (2/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_53-300544-0_sp1.1 from training. Duration: 0.6090625 2023-10-02 15:28:41,040 WARNING [train.py:1204] (2/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_540-275685-0_sp1.1 from training. Duration: 0.8090625 2023-10-02 15:28:42,233 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_38965_210_4852_1_1533174906_3484735_453-106579-0_sp1.1 from training. Duration: 0.92725 2023-10-02 15:28:42,282 WARNING [train.py:1204] (2/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_105-328174-0 from training. Duration: 0.76 2023-10-02 15:28:43,644 WARNING [train.py:1204] (2/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_279-126205-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 15:28:45,047 WARNING [train.py:1204] (2/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_5-55893-0_sp0.9 from training. Duration: 0.611125 2023-10-02 15:28:46,483 WARNING [train.py:1204] (2/4) Exclude cut with ID _13678_210_7497_1_1530530069226_3463170_141-10835-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 15:28:49,275 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.conv_module2.balancer2.prob, batch_count=928166.6666666666, ans=0.125 2023-10-02 15:28:50,457 WARNING [train.py:1204] (2/4) Exclude cut with ID _22083_210_8254_1_1531702862439_3678970_201-85738-0_sp1.1 from training. Duration: 0.8181875 2023-10-02 15:28:50,476 WARNING [train.py:1204] (2/4) Exclude cut with ID _4697_210_2069_1_1526815801953_3823098_51-1049-0 from training. Duration: 0.75 2023-10-02 15:28:51,844 WARNING [train.py:1204] (2/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_206-10097-0_sp0.9 from training. Duration: 0.5444375 2023-10-02 15:28:53,232 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_4528_210_3273_1_1527076654_3276777_360-63661-0_sp1.1 from training. Duration: 0.92725 2023-10-02 15:28:53,240 WARNING [train.py:1204] (2/4) Exclude cut with ID _64086_210_7994_1_1535270417019_7435949_449-31567-0_sp1.1 from training. Duration: 0.8818125 2023-10-02 15:28:53,440 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.self_attn_weights.pos_emb_skip_rate, batch_count=928166.6666666666, ans=0.0 2023-10-02 15:28:56,077 WARNING [train.py:1204] (2/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_245-174553-0_sp0.9 from training. Duration: 0.97775 2023-10-02 15:28:59,314 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6545_210_2040_1_1527774670_4245258_379-14182-0_sp1.1 from training. Duration: 0.72725 2023-10-02 15:29:01,475 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.conv_skip_rate, batch_count=928233.3333333334, ans=0.0 2023-10-02 15:29:02,740 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_3133_210_2087_1_1526106446_2324690_256-206070-0_sp1.1 from training. Duration: 0.963625 2023-10-02 15:29:03,531 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.1.encoder.layers.1.whiten, num_groups=1, num_channels=256, metric=3.88 vs. limit=12.0 2023-10-02 15:29:06,863 WARNING [train.py:1204] (2/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_278-104676-0 from training. Duration: 0.98 2023-10-02 15:29:06,941 WARNING [train.py:1204] (2/4) Exclude cut with ID IC0671W0065-331908 from training. Duration: 0.896 2023-10-02 15:29:06,986 WARNING [train.py:1204] (2/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_244-129917-0_sp1.1 from training. Duration: 0.97275 2023-10-02 15:29:09,793 WARNING [train.py:1204] (2/4) Exclude cut with ID _7852_210_5219_1_1528286471316_8139400_1129-170857-0_sp1.1 from training. Duration: 0.97275 2023-10-02 15:29:09,858 WARNING [train.py:1204] (2/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_443-30034-0_sp0.9 from training. Duration: 0.711125 2023-10-02 15:29:09,904 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_31583_210_3852_1_1532595851_4024413_250-332999-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 15:29:11,281 WARNING [train.py:1204] (2/4) Exclude cut with ID _8772_210_1850_1_1528635595972_3664011_247-336448-0 from training. Duration: 0.74 2023-10-02 15:29:11,334 WARNING [train.py:1204] (2/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_567-135114-0_sp0.9 from training. Duration: 0.9666875 2023-10-02 15:29:11,338 WARNING [train.py:1204] (2/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_557-349534-0_sp1.1 from training. Duration: 0.82725 2023-10-02 15:29:12,649 WARNING [train.py:1204] (2/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_1341-26728-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 15:29:12,706 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_40222_210_6228_1_1533385298_4481939_375-328051-0_sp1.1 from training. Duration: 0.97275 2023-10-02 15:29:12,734 WARNING [train.py:1204] (2/4) Exclude cut with ID _4697_210_2069_1_1526815801953_3823098_276-96593-0 from training. Duration: 0.77 2023-10-02 15:29:15,893 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.feed_forward3.out_whiten.whitening_limit, batch_count=928300.0, ans=15.0 2023-10-02 15:29:19,979 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_53071_210_16554_1_1534490405_2954066_265-170472-0_sp0.9 from training. Duration: 0.92225 2023-10-02 15:29:20,011 WARNING [train.py:1204] (2/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_365-1492-0 from training. Duration: 0.91 2023-10-02 15:29:21,249 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.465e+02 1.794e+02 1.965e+02 2.259e+02 3.177e+02, threshold=3.930e+02, percent-clipped=0.0 2023-10-02 15:29:21,409 WARNING [train.py:1204] (2/4) Exclude cut with ID _66873_210_5517_1_1535454136506_4293370_654-52713-0_sp0.9 from training. Duration: 0.8555625 2023-10-02 15:29:25,600 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.0.conv_skip_rate, batch_count=928300.0, ans=0.0 2023-10-02 15:29:27,060 WARNING [train.py:1204] (2/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_113-92608-0_sp1.1 from training. Duration: 0.4909375 2023-10-02 15:29:31,732 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_46013_210_7234_1_1533776362_3744032_50-141535-0 from training. Duration: 0.99 2023-10-02 15:29:31,753 WARNING [train.py:1204] (2/4) Exclude cut with ID _13942_210_9988_1_1530604906036_3733360_499-55676-0_sp1.1 from training. Duration: 0.563625 2023-10-02 15:29:33,624 WARNING [train.py:1204] (2/4) Exclude cut with ID _30800_210_22382_1_1532499347904_4515959_375-65143-0_sp1.1 from training. Duration: 0.97275 2023-10-02 15:29:35,169 WARNING [train.py:1204] (2/4) Exclude cut with ID _16756_210_4915_1_1531047803763_7104259_199-325608-0_sp1.1 from training. Duration: 0.92725 2023-10-02 15:29:36,502 WARNING [train.py:1204] (2/4) Exclude cut with ID _31649_210_6228_1_1532584608063_4091078_443-124391-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 15:29:37,934 WARNING [train.py:1204] (2/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_146-218763-0 from training. Duration: 0.86 2023-10-02 15:29:39,241 WARNING [train.py:1204] (2/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_567-135114-0_sp1.1 from training. Duration: 0.7909375 2023-10-02 15:29:39,272 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_26326_210_7925_1_1533002493_3592406_462-288299-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 15:29:40,654 WARNING [train.py:1204] (2/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_322-217220-0 from training. Duration: 0.86 2023-10-02 15:29:40,691 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_238-256668-0_sp1.1 from training. Duration: 0.62725 2023-10-02 15:29:42,033 WARNING [train.py:1204] (2/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_371-18169-0 from training. Duration: 0.86 2023-10-02 15:29:43,475 WARNING [train.py:1204] (2/4) Exclude cut with ID _58480_210_12392_1_1534811121111_3646160_23-74512-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 15:29:43,480 WARNING [train.py:1204] (2/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_784-213099-0_sp1.1 from training. Duration: 0.8454375 2023-10-02 15:29:43,557 WARNING [train.py:1204] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_625-102955-0_sp1.1 from training. Duration: 0.7 2023-10-02 15:29:46,241 INFO [train.py:1046] (2/4) Epoch 27, batch 1150, loss[loss=0.1795, simple_loss=0.2512, pruned_loss=0.05385, over 23926.00 frames. ], tot_loss[loss=0.1672, simple_loss=0.2446, pruned_loss=0.04488, over 4706911.59 frames. ], batch size: 180, lr: 3.81e-03, grad_scale: 16.0 2023-10-02 15:29:47,687 WARNING [train.py:1204] (2/4) Exclude cut with ID _49748_210_24116_1_1533986971737_3158910_176-174327-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 15:29:51,034 WARNING [train.py:1204] (2/4) Exclude cut with ID _50310_210_15488_1_1534068043345_3641080_432-96140-0_sp1.1 from training. Duration: 0.936375 2023-10-02 15:29:52,956 WARNING [train.py:1204] (2/4) Exclude cut with ID _40402_210_18219_1_1533522324115_2953150_76-14910-0_sp1.1 from training. Duration: 0.92725 2023-10-02 15:29:54,354 WARNING [train.py:1204] (2/4) Exclude cut with ID _21783_210_11357_1_1531742458730_3762679_40-19458-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 15:29:54,387 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_14056_210_4163_1_1530604483_7572827_46-93547-0 from training. Duration: 0.95 2023-10-02 15:29:54,414 WARNING [train.py:1204] (2/4) Exclude cut with ID _21631_210_2253_1_1531907880778_3160339_321-35325-0_sp1.1 from training. Duration: 0.963625 2023-10-02 15:29:57,248 WARNING [train.py:1204] (2/4) Exclude cut with ID _4779_210_2084_1_1527318022944_3914893_500-285013-0 from training. Duration: 0.69 2023-10-02 15:29:58,564 WARNING [train.py:1204] (2/4) Exclude cut with ID _43552_210_8913_1_1533726031059_3523429_301-93324-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 15:29:58,578 WARNING [train.py:1204] (2/4) Exclude cut with ID _6473_210_1741_1_1527901307396_3815380_108-17335-0_sp1.1 from training. Duration: 0.5909375 2023-10-02 15:30:04,640 WARNING [train.py:1204] (2/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_206-10097-0 from training. Duration: 0.49 2023-10-02 15:30:06,625 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_36756_210_7234_1_1533026991_966276_103-77513-0_sp1.1 from training. Duration: 0.97275 2023-10-02 15:30:09,467 WARNING [train.py:1204] (2/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_662-18193-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 15:30:09,531 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_26433_210_12108_1_1532157158_4056480_439-52438-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 15:30:10,798 WARNING [train.py:1204] (2/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_464-33931-0 from training. Duration: 0.54 2023-10-02 15:30:10,804 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_3474_210_2028_1_1526188513_4825246_145-309131-0_sp0.9 from training. Duration: 0.82225 2023-10-02 15:30:10,821 WARNING [train.py:1204] (2/4) Exclude cut with ID _9679_210_7497_1_1529129212906_4363929_446-308321-0_sp1.1 from training. Duration: 0.963625 2023-10-02 15:30:15,006 WARNING [train.py:1204] (2/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_662-275621-0 from training. Duration: 0.42 2023-10-02 15:30:15,081 WARNING [train.py:1204] (2/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_344-337148-0_sp1.1 from training. Duration: 0.97275 2023-10-02 15:30:16,416 WARNING [train.py:1204] (2/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_759-179071-0_sp1.1 from training. Duration: 0.92725 2023-10-02 15:30:26,324 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.2.encoder.layers.1.self_attn1.whiten, num_groups=1, num_channels=384, metric=15.71 vs. limit=22.5 2023-10-02 15:30:27,053 WARNING [train.py:1204] (2/4) Exclude cut with ID _28783_210_7943_1_1532671164808_7155479_293-12188-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 15:30:34,468 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_17613_210_2151_1_1531137569_2366160_235-320304-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 15:30:34,483 WARNING [train.py:1204] (2/4) Exclude cut with ID _14349_210_8341_1_1530615710818_3442470_170-336541-0 from training. Duration: 0.86 2023-10-02 15:30:34,545 WARNING [train.py:1204] (2/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_375-218677-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 15:30:34,588 WARNING [train.py:1204] (2/4) Exclude cut with ID _28317_210_12742_1_1532480302381_8510325_829-202956-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 15:30:36,662 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.feed_forward1.hidden_balancer.prob, batch_count=928633.3333333334, ans=0.125 2023-10-02 15:30:40,512 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9029W0436-509823 from training. Duration: 0.768 2023-10-02 15:30:41,008 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.4.encoder.layers.2.nonlin_attention.whiten1, num_groups=1, num_channels=288, metric=5.75 vs. limit=10.0 2023-10-02 15:30:43,122 WARNING [train.py:1204] (2/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_231-112214-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 15:30:48,731 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9059W0470-524883 from training. Duration: 0.3415625 2023-10-02 15:30:52,061 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_43266_210_1794_1_1533521789_4497218_206-79512-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 15:30:53,282 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_294-163793-0_sp0.9 from training. Duration: 0.911125 2023-10-02 15:30:53,311 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6290_210_3273_1_1527595224_1282787_115-87095-0_sp1.1 from training. Duration: 0.5 2023-10-02 15:30:53,350 WARNING [train.py:1204] (2/4) Exclude cut with ID _14100_210_8341_1_1530529320613_3678069_279-55760-0_sp1.1 from training. Duration: 0.8454375 2023-10-02 15:30:58,014 WARNING [train.py:1204] (2/4) Exclude cut with ID _14621_210_1925_1_1530686901185_3675420_419-234200-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 15:31:01,335 INFO [train.py:1046] (2/4) Epoch 27, batch 1200, loss[loss=0.1729, simple_loss=0.2626, pruned_loss=0.04157, over 24538.00 frames. ], tot_loss[loss=0.1679, simple_loss=0.2455, pruned_loss=0.0451, over 4708638.01 frames. ], batch size: 71, lr: 3.81e-03, grad_scale: 32.0 2023-10-02 15:31:02,794 WARNING [train.py:1204] (2/4) Exclude cut with ID _60229_210_27767_1_1534838406158_3655350_353-213656-0_sp1.1 from training. Duration: 0.863625 2023-10-02 15:31:02,803 WARNING [train.py:1204] (2/4) Exclude cut with ID _48678_210_5663_1_1533988830947_3769199_306-255886-0_sp1.1 from training. Duration: 0.7 2023-10-02 15:31:04,228 WARNING [train.py:1204] (2/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_466-3492-0_sp1.1 from training. Duration: 0.97275 2023-10-02 15:31:04,229 WARNING [train.py:1204] (2/4) Exclude cut with ID _8755_210_1876_1_1528628559357_3258209_199-242167-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 15:31:04,279 WARNING [train.py:1204] (2/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_237-89684-0_sp1.1 from training. Duration: 0.87275 2023-10-02 15:31:08,230 WARNING [train.py:1204] (2/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_70-84121-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 15:31:10,165 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7544_210_6753_1_1528587722_3576686_172-171750-0_sp1.1 from training. Duration: 0.5909375 2023-10-02 15:31:11,569 WARNING [train.py:1204] (2/4) Exclude cut with ID _24271_210_12658_1_1531987270022_7190519_40-321822-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 15:31:11,586 WARNING [train.py:1204] (2/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_227-83612-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 15:31:14,309 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9156W0015-572508 from training. Duration: 0.896 2023-10-02 15:31:17,115 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_292-265928-0 from training. Duration: 0.51 2023-10-02 15:31:18,616 WARNING [train.py:1204] (2/4) Exclude cut with ID _14747_210_8341_1_1530685797463_3654089_147-131229-0_sp1.1 from training. Duration: 0.6909375 2023-10-02 15:31:21,354 WARNING [train.py:1204] (2/4) Exclude cut with ID _45297_210_2721_1_1533715351381_3264949_351-147136-0_sp0.9 from training. Duration: 0.9666875 2023-10-02 15:31:22,928 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.1.conv_skip_rate, batch_count=928833.3333333334, ans=0.0 2023-10-02 15:31:24,788 WARNING [train.py:1204] (2/4) Exclude cut with ID _5250_210_5219_1_1527249561806_8692110_1102-324568-0_sp1.1 from training. Duration: 0.97275 2023-10-02 15:31:26,185 WARNING [train.py:1204] (2/4) Exclude cut with ID _18516_210_9206_1_1531704653206_3692406_465-75446-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 15:31:26,187 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9203W0126-595623 from training. Duration: 0.896 2023-10-02 15:31:27,533 WARNING [train.py:1204] (2/4) Exclude cut with ID _55383_210_10635_1_1534468426172_2231669_139-328410-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 15:31:36,229 WARNING [train.py:1204] (2/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_166-246557-0_sp0.9 from training. Duration: 0.711125 2023-10-02 15:31:36,241 WARNING [train.py:1204] (2/4) Exclude cut with ID _30039_210_13270_1_1532484612900_2725150_239-304872-0_sp1.1 from training. Duration: 0.963625 2023-10-02 15:31:36,266 WARNING [train.py:1204] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_6-226750-0 from training. Duration: 0.94 2023-10-02 15:31:36,328 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_135-5501-0_sp1.1 from training. Duration: 0.8909375 2023-10-02 15:31:40,411 WARNING [train.py:1204] (2/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_359-118926-0 from training. Duration: 0.72 2023-10-02 15:31:41,991 WARNING [train.py:1204] (2/4) Exclude cut with ID _8064_210_5399_1_1528372299291_4282973_4-91037-0 from training. Duration: 0.88 2023-10-02 15:31:42,015 WARNING [train.py:1204] (2/4) Exclude cut with ID _6653_210_2629_1_1527919172441_6732679_415-288492-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 15:31:43,774 WARNING [train.py:1204] (2/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_544-287179-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 15:31:45,146 WARNING [train.py:1204] (2/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_122-32606-0_sp1.1 from training. Duration: 0.92725 2023-10-02 15:31:46,506 WARNING [train.py:1204] (2/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_121-284152-0_sp1.1 from training. Duration: 0.536375 2023-10-02 15:31:48,994 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.596e+02 1.904e+02 2.120e+02 2.564e+02 3.915e+02, threshold=4.240e+02, percent-clipped=0.0 2023-10-02 15:31:49,067 WARNING [train.py:1204] (2/4) Exclude cut with ID _44718_210_5816_1_1534050051869_6469030_350-299477-0_sp1.1 from training. Duration: 0.97275 2023-10-02 15:31:49,078 WARNING [train.py:1204] (2/4) Exclude cut with ID _6653_210_2629_1_1527919172441_6732679_1103-56445-0_sp0.9 from training. Duration: 0.811125 2023-10-02 15:31:49,145 WARNING [train.py:1204] (2/4) Exclude cut with ID _12369_210_4381_1_1530356078581_4388520_64-298917-0_sp0.9 from training. Duration: 0.97775 2023-10-02 15:31:50,396 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_2245_210_2715_1_1525511548_3540254_310-52483-0 from training. Duration: 0.88 2023-10-02 15:31:50,464 WARNING [train.py:1204] (2/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_401-158239-0_sp1.1 from training. Duration: 0.6454375 2023-10-02 15:31:51,794 WARNING [train.py:1204] (2/4) Exclude cut with ID _59502_210_12210_1_1535107024698_1835139_146-162495-0_sp1.1 from training. Duration: 0.836375 2023-10-02 15:31:51,804 WARNING [train.py:1204] (2/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_41-31157-0_sp0.9 from training. Duration: 0.6333125 2023-10-02 15:31:54,587 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_68717_210_3528_1_1535628683_1556759_95-36722-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 15:31:54,594 WARNING [train.py:1204] (2/4) Exclude cut with ID _30196_210_1664_1_1533708496522_7074140_654-162611-0_sp1.1 from training. Duration: 0.92725 2023-10-02 15:31:57,989 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_9302_210_2550_1_1528889962_4655416_638-301620-0_sp0.9 from training. Duration: 0.52225 2023-10-02 15:31:59,354 WARNING [train.py:1204] (2/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_478-162247-0_sp1.1 from training. Duration: 0.7454375 2023-10-02 15:32:02,908 WARNING [train.py:1204] (2/4) Exclude cut with ID _45297_210_2721_1_1533715351381_3264949_353-9541-0 from training. Duration: 0.97 2023-10-02 15:32:06,947 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9653W0190-675468 from training. Duration: 0.9935 2023-10-02 15:32:08,363 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_49730_210_13049_1_1534075294_1830676_214-84436-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 15:32:09,886 WARNING [train.py:1204] (2/4) Exclude cut with ID _50093_210_4220_1_1534417141207_3432299_259-322252-0_sp1.1 from training. Duration: 0.836375 2023-10-02 15:32:11,186 WARNING [train.py:1204] (2/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_599-50220-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 15:32:11,281 WARNING [train.py:1204] (2/4) Exclude cut with ID _21878_210_3525_1_1531738772002_7264330_174-109016-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 15:32:14,402 INFO [train.py:1046] (2/4) Epoch 27, batch 1250, loss[loss=0.165, simple_loss=0.246, pruned_loss=0.04204, over 24478.00 frames. ], tot_loss[loss=0.168, simple_loss=0.246, pruned_loss=0.04503, over 4717436.73 frames. ], batch size: 63, lr: 3.81e-03, grad_scale: 32.0 2023-10-02 15:32:14,505 WARNING [train.py:1204] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_665-196155-0 from training. Duration: 0.67 2023-10-02 15:32:18,665 WARNING [train.py:1204] (2/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_656-188361-0_sp1.1 from training. Duration: 0.7909375 2023-10-02 15:32:20,138 WARNING [train.py:1204] (2/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_60-18123-0_sp1.1 from training. Duration: 0.8 2023-10-02 15:32:21,510 WARNING [train.py:1204] (2/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_535-14410-0 from training. Duration: 0.47 2023-10-02 15:32:22,882 WARNING [train.py:1204] (2/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_94-126128-0_sp1.1 from training. Duration: 0.7818125 2023-10-02 15:32:25,540 WARNING [train.py:1204] (2/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_356-31216-0_sp1.1 from training. Duration: 0.6818125 2023-10-02 15:32:27,715 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.bypass_mid.scale_min, batch_count=929166.6666666666, ans=0.2 2023-10-02 15:32:29,355 WARNING [train.py:1204] (2/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_443-30034-0_sp1.1 from training. Duration: 0.5818125 2023-10-02 15:32:31,193 WARNING [train.py:1204] (2/4) Exclude cut with ID _26907_210_7234_1_1532221740694_3205263_337-2494-0_sp1.1 from training. Duration: 0.8 2023-10-02 15:32:31,386 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.feed_forward2.hidden_balancer.prob, batch_count=929166.6666666666, ans=0.125 2023-10-02 15:32:32,548 WARNING [train.py:1204] (2/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_49-97158-0_sp0.9 from training. Duration: 0.8444375 2023-10-02 15:32:32,558 WARNING [train.py:1204] (2/4) Exclude cut with ID _7877_210_6014_1_1528286358099_3750136_553-170128-0_sp1.1 from training. Duration: 0.963625 2023-10-02 15:32:35,222 WARNING [train.py:1204] (2/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_202-293523-0_sp0.9 from training. Duration: 0.8 2023-10-02 15:32:35,494 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.ff2_skip_rate, batch_count=929166.6666666666, ans=0.0 2023-10-02 15:32:38,070 WARNING [train.py:1204] (2/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_188-171673-0_sp0.9 from training. Duration: 0.6444375 2023-10-02 15:32:38,094 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_120-41668-0_sp1.1 from training. Duration: 0.636375 2023-10-02 15:32:38,106 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_40223_210_6228_1_1533298404_4812267_613-78769-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 15:32:39,678 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.conv_module2.balancer1.prob, batch_count=929166.6666666666, ans=0.125 2023-10-02 15:32:40,920 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_53861_210_5399_1_1534422156_4272077_150-336899-0_sp1.1 from training. Duration: 0.963625 2023-10-02 15:32:42,126 WARNING [train.py:1204] (2/4) Exclude cut with ID _14459_210_2228_1_1531443641946_7516700_1146-56843-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 15:32:43,697 WARNING [train.py:1204] (2/4) Exclude cut with ID _39867_210_9826_1_1533553222030_9344259_1194-299928-0_sp1.1 from training. Duration: 0.92725 2023-10-02 15:32:45,034 WARNING [train.py:1204] (2/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_214-291369-0_sp0.9 from training. Duration: 0.7 2023-10-02 15:32:49,539 WARNING [train.py:1204] (2/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_358-215558-0 from training. Duration: 0.93 2023-10-02 15:32:50,904 WARNING [train.py:1204] (2/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_577-287150-0_sp0.9 from training. Duration: 0.911125 2023-10-02 15:32:53,794 WARNING [train.py:1204] (2/4) Exclude cut with ID _60860_210_8254_1_1534896015875_3557169_229-187367-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 15:32:53,838 WARNING [train.py:1204] (2/4) Exclude cut with ID _6275_210_2084_1_1528110062213_7669049_857-298942-0 from training. Duration: 0.85 2023-10-02 15:32:55,138 WARNING [train.py:1204] (2/4) Exclude cut with ID _40137_210_12210_1_1533290484046_3795180_249-187059-0_sp1.1 from training. Duration: 0.8 2023-10-02 15:32:55,145 WARNING [train.py:1204] (2/4) Exclude cut with ID ID0171W0508-765603 from training. Duration: 0.541 2023-10-02 15:32:55,168 WARNING [train.py:1204] (2/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_521-219439-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 15:32:55,174 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_40223_210_6228_1_1533298404_4812267_741-302616-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 15:32:58,863 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.bypass_mid.scale_min, batch_count=929300.0, ans=0.2 2023-10-02 15:32:59,732 WARNING [train.py:1204] (2/4) Exclude cut with ID _65293_210_9070_1_1535545549765_4004210_438-154735-0_sp1.1 from training. Duration: 0.92725 2023-10-02 15:33:04,331 WARNING [train.py:1204] (2/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_564-132950-0_sp1.1 from training. Duration: 0.92725 2023-10-02 15:33:04,373 WARNING [train.py:1204] (2/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_291-270881-0_sp1.1 from training. Duration: 0.8818125 2023-10-02 15:33:04,475 WARNING [train.py:1204] (2/4) Exclude cut with ID _60229_210_27767_1_1534838406158_3655350_353-213656-0 from training. Duration: 0.95 2023-10-02 15:33:04,633 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.balancer1.prob, batch_count=929300.0, ans=0.125 2023-10-02 15:33:06,324 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_3133_210_2087_1_1526106446_2324690_243-143119-0 from training. Duration: 0.77 2023-10-02 15:33:06,361 WARNING [train.py:1204] (2/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_690-126306-0 from training. Duration: 0.78 2023-10-02 15:33:09,164 WARNING [train.py:1204] (2/4) Exclude cut with ID _30304_210_6319_1_1532476827261_7582060_347-95003-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 15:33:10,496 WARNING [train.py:1204] (2/4) Exclude cut with ID _2322_210_2667_1_1525604548971_3656299_440-80494-0 from training. Duration: 0.88 2023-10-02 15:33:10,524 WARNING [train.py:1204] (2/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_370-229434-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 15:33:13,254 WARNING [train.py:1204] (2/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_124-345840-0_sp1.1 from training. Duration: 0.47275 2023-10-02 15:33:13,262 WARNING [train.py:1204] (2/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_165-123581-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 15:33:14,640 WARNING [train.py:1204] (2/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_453-282588-0 from training. Duration: 0.98 2023-10-02 15:33:14,662 WARNING [train.py:1204] (2/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_299-34413-0_sp0.9 from training. Duration: 0.72225 2023-10-02 15:33:15,905 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_40036_210_13405_1_1533648647_1315388_48-146016-0_sp1.1 from training. Duration: 0.8454375 2023-10-02 15:33:15,942 WARNING [train.py:1204] (2/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_99-167392-0_sp0.9 from training. Duration: 0.67775 2023-10-02 15:33:17,379 WARNING [train.py:1204] (2/4) Exclude cut with ID _48677_210_5663_1_1533902405919_3892989_37-207114-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 15:33:19,298 WARNING [train.py:1204] (2/4) Exclude cut with ID _29857_210_20034_1_1532395886753_3798160_335-163562-0 from training. Duration: 0.85 2023-10-02 15:33:19,433 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.0.feed_forward1.out_proj.dropout_p, batch_count=929366.6666666666, ans=0.1 2023-10-02 15:33:19,493 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.ff2_skip_rate, batch_count=929366.6666666666, ans=0.0 2023-10-02 15:33:22,002 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_36492_210_7927_1_1533261987_4790834_703-343351-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 15:33:23,341 WARNING [train.py:1204] (2/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_80-147301-0_sp0.9 from training. Duration: 0.9333125 2023-10-02 15:33:24,783 WARNING [train.py:1204] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_218-295097-0_sp0.9 from training. Duration: 0.8555625 2023-10-02 15:33:26,194 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_2295_210_2087_1_1525500993_2407863_116-238894-0_sp1.1 from training. Duration: 0.636375 2023-10-02 15:33:27,266 INFO [train.py:1046] (2/4) Epoch 27, batch 1300, loss[loss=0.1627, simple_loss=0.2423, pruned_loss=0.04155, over 24065.00 frames. ], tot_loss[loss=0.1687, simple_loss=0.2463, pruned_loss=0.04549, over 4713756.50 frames. ], batch size: 80, lr: 3.81e-03, grad_scale: 16.0 2023-10-02 15:33:28,534 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.4.encoder.layers.0.feed_forward2.out_whiten, num_groups=1, num_channels=384, metric=5.90 vs. limit=15.0 2023-10-02 15:33:29,370 WARNING [train.py:1204] (2/4) Exclude cut with ID _3231_210_2106_1_1526205864934_6784220_697-171068-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 15:33:29,398 WARNING [train.py:1204] (2/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_102-77064-0 from training. Duration: 0.93 2023-10-02 15:33:35,485 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_13091_210_7925_1_1530609447_8987354_1186-261957-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 15:33:36,906 WARNING [train.py:1204] (2/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_91-277684-0_sp1.1 from training. Duration: 0.67275 2023-10-02 15:33:37,130 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.bypass.scale_min, batch_count=929433.3333333334, ans=0.2 2023-10-02 15:33:38,268 WARNING [train.py:1204] (2/4) Exclude cut with ID _31088_210_16446_1_1532520012284_3440919_159-313751-0_sp1.1 from training. Duration: 0.936375 2023-10-02 15:33:39,699 WARNING [train.py:1204] (2/4) Exclude cut with ID _42326_210_7770_1_1533814282531_3707269_227-203721-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 15:33:41,188 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_3531_210_3528_1_1526294203_5216456_309-227302-0_sp1.1 from training. Duration: 0.62725 2023-10-02 15:33:42,538 WARNING [train.py:1204] (2/4) Exclude cut with ID _14087_210_2253_1_1530529224512_3014029_226-242140-0 from training. Duration: 0.81 2023-10-02 15:33:45,916 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.3.encoder.layers.0.nonlin_attention.whiten2, num_groups=1, num_channels=512, metric=6.97 vs. limit=15.0 2023-10-02 15:33:46,626 WARNING [train.py:1204] (2/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_122-109023-0_sp1.1 from training. Duration: 0.6909375 2023-10-02 15:33:47,953 WARNING [train.py:1204] (2/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_660-211088-0_sp0.9 from training. Duration: 0.688875 2023-10-02 15:33:49,849 WARNING [train.py:1204] (2/4) Exclude cut with ID _39290_210_4220_1_1533862778760_3075118_92-311600-0 from training. Duration: 0.88 2023-10-02 15:33:53,977 WARNING [train.py:1204] (2/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_390-339137-0_sp1.1 from training. Duration: 0.5090625 2023-10-02 15:33:56,803 WARNING [train.py:1204] (2/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_449-341946-0_sp1.1 from training. Duration: 0.963625 2023-10-02 15:33:58,277 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_68095_210_20185_1_1535718226_8888855_884-327007-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 15:33:59,650 WARNING [train.py:1204] (2/4) Exclude cut with ID _42326_210_7770_1_1533814282531_3707269_308-222998-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 15:34:01,528 WARNING [train.py:1204] (2/4) Exclude cut with ID _40399_210_4353_1_1533305658194_3279406_344-238708-0_sp1.1 from training. Duration: 0.963625 2023-10-02 15:34:01,585 WARNING [train.py:1204] (2/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_87-131427-0_sp1.1 from training. Duration: 0.7090625 2023-10-02 15:34:03,441 WARNING [train.py:1204] (2/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_110-130961-0_sp0.9 from training. Duration: 0.52225 2023-10-02 15:34:03,484 WARNING [train.py:1204] (2/4) Exclude cut with ID _55153_210_2253_1_1534384786677_3162096_148-79022-0 from training. Duration: 0.96 2023-10-02 15:34:09,726 WARNING [train.py:1204] (2/4) Exclude cut with ID _9679_210_7497_1_1529129212906_4363929_512-313889-0_sp1.1 from training. Duration: 0.736375 2023-10-02 15:34:09,738 WARNING [train.py:1204] (2/4) Exclude cut with ID _7970_210_1883_1_1528455616296_3472230_25-178407-0_sp1.1 from training. Duration: 0.6454375 2023-10-02 15:34:09,848 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_3133_210_2087_1_1526106446_2324690_246-125000-0 from training. Duration: 0.62 2023-10-02 15:34:11,155 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_15126_210_6753_1_1530778677_2591185_113-157651-0_sp0.9 from training. Duration: 0.7555625 2023-10-02 15:34:12,521 WARNING [train.py:1204] (2/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_105-63309-0_sp1.1 from training. Duration: 0.8909375 2023-10-02 15:34:14,054 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.conv_module2.balancer1.prob, batch_count=929633.3333333334, ans=0.125 2023-10-02 15:34:15,195 WARNING [train.py:1204] (2/4) Exclude cut with ID _29877_210_6228_1_1532397791106_5276149_578-177271-0_sp1.1 from training. Duration: 0.936375 2023-10-02 15:34:16,481 WARNING [train.py:1204] (2/4) Exclude cut with ID _44831_210_13427_1_1533816011549_3689035_32-188147-0 from training. Duration: 0.96 2023-10-02 15:34:17,775 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.627e+02 1.852e+02 2.104e+02 2.381e+02 3.588e+02, threshold=4.208e+02, percent-clipped=0.0 2023-10-02 15:34:17,853 WARNING [train.py:1204] (2/4) Exclude cut with ID _12369_210_4381_1_1530356078581_4388520_300-254337-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 15:34:17,874 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_128-241008-0 from training. Duration: 0.94 2023-10-02 15:34:19,228 WARNING [train.py:1204] (2/4) Exclude cut with ID _21878_210_3525_1_1531738772002_7264330_53-279178-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 15:34:20,780 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_41452_210_7291_1_1533437687_3941281_273-334088-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 15:34:20,783 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_128-241008-0_sp1.1 from training. Duration: 0.8545625 2023-10-02 15:34:24,162 WARNING [train.py:1204] (2/4) Exclude cut with ID _8394_210_6702_1_1528590504316_3540189_379-49457-0 from training. Duration: 0.87 2023-10-02 15:34:25,497 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_105-114797-0 from training. Duration: 0.84 2023-10-02 15:34:26,799 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_14928_210_5632_1_1530773461_4714000_454-112198-0 from training. Duration: 0.66 2023-10-02 15:34:31,494 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_456-22141-0_sp1.1 from training. Duration: 0.8 2023-10-02 15:34:33,022 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_115-233929-0 from training. Duration: 0.72 2023-10-02 15:34:34,492 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_616-16795-0_sp1.1 from training. Duration: 0.963625 2023-10-02 15:34:35,199 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.2.encoder.layers.1.nonlin_attention.whiten2, num_groups=1, num_channels=384, metric=4.53 vs. limit=15.0 2023-10-02 15:34:40,722 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.balancer2.prob, batch_count=929766.6666666666, ans=0.125 2023-10-02 15:34:41,788 INFO [train.py:1046] (2/4) Epoch 27, batch 1350, loss[loss=0.1692, simple_loss=0.2475, pruned_loss=0.04541, over 23412.00 frames. ], tot_loss[loss=0.1681, simple_loss=0.2454, pruned_loss=0.04539, over 4709355.28 frames. ], batch size: 93, lr: 3.81e-03, grad_scale: 16.0 2023-10-02 15:34:41,911 WARNING [train.py:1204] (2/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_448-212135-0 from training. Duration: 0.91 2023-10-02 15:34:44,744 WARNING [train.py:1204] (2/4) Exclude cut with ID _46683_210_9642_1_1533730913492_4591116_201-205677-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 15:34:47,544 WARNING [train.py:1204] (2/4) Exclude cut with ID _64086_210_7994_1_1535270417019_7435949_43-80483-0_sp1.1 from training. Duration: 0.97275 2023-10-02 15:34:50,303 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_240-134124-0_sp1.1 from training. Duration: 0.963625 2023-10-02 15:34:50,323 WARNING [train.py:1204] (2/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_907-120940-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 15:34:53,016 WARNING [train.py:1204] (2/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_848-280957-0_sp1.1 from training. Duration: 0.7909375 2023-10-02 15:34:53,056 WARNING [train.py:1204] (2/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_243-42116-0_sp0.9 from training. Duration: 0.811125 2023-10-02 15:34:53,310 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.balancer1.prob, batch_count=929766.6666666666, ans=0.125 2023-10-02 15:34:54,612 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.bypass.scale_min, batch_count=929833.3333333334, ans=0.2 2023-10-02 15:34:57,619 WARNING [train.py:1204] (2/4) Exclude cut with ID _6694_210_4599_1_1527852637490_3965529_116-109398-0_sp0.9 from training. Duration: 0.811125 2023-10-02 15:35:00,775 WARNING [train.py:1204] (2/4) Exclude cut with ID _24314_210_4915_1_1532135078116_7170179_521-258868-0 from training. Duration: 0.79 2023-10-02 15:35:02,297 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.1.encoder.layers.1.conv_module2.whiten, num_groups=1, num_channels=256, metric=9.63 vs. limit=15.0 2023-10-02 15:35:02,787 WARNING [train.py:1204] (2/4) Exclude cut with ID _2220_210_1819_1_1525509109898_3658680_50-99837-0_sp0.9 from training. Duration: 0.6 2023-10-02 15:35:02,831 WARNING [train.py:1204] (2/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_313-134930-0_sp1.1 from training. Duration: 0.8818125 2023-10-02 15:35:05,568 WARNING [train.py:1204] (2/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_125-144174-0 from training. Duration: 0.39 2023-10-02 15:35:07,278 WARNING [train.py:1204] (2/4) Exclude cut with ID _59288_210_5854_1_1535077757774_3412246_275-293897-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 15:35:08,599 WARNING [train.py:1204] (2/4) Exclude cut with ID _40519_210_13794_1_1533715228655_7337390_722-52880-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 15:35:08,613 WARNING [train.py:1204] (2/4) Exclude cut with ID _7537_210_2084_1_1528502395431_3924359_128-268029-0 from training. Duration: 0.98 2023-10-02 15:35:09,989 WARNING [train.py:1204] (2/4) Exclude cut with ID _8021_210_7994_1_1528376451358_3653069_179-45206-0 from training. Duration: 0.86 2023-10-02 15:35:11,443 WARNING [train.py:1204] (2/4) Exclude cut with ID _13252_210_1925_1_1530671372047_3336909_88-276668-0 from training. Duration: 0.99 2023-10-02 15:35:12,753 WARNING [train.py:1204] (2/4) Exclude cut with ID _22613_210_5219_1_1532253599686_4090170_290-212862-0_sp1.1 from training. Duration: 0.97275 2023-10-02 15:35:12,760 WARNING [train.py:1204] (2/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_465-214726-0 from training. Duration: 0.86 2023-10-02 15:35:22,558 WARNING [train.py:1204] (2/4) Exclude cut with ID _56480_210_4220_1_1534679942079_3075409_138-36031-0_sp1.1 from training. Duration: 0.97275 2023-10-02 15:35:22,796 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.ff3_skip_rate, batch_count=929900.0, ans=0.0 2023-10-02 15:35:28,779 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.feed_forward2.hidden_balancer.prob, batch_count=929966.6666666666, ans=0.125 2023-10-02 15:35:31,723 WARNING [train.py:1204] (2/4) Exclude cut with ID _58605_210_12210_1_1534723195939_3904259_300-285878-0_sp1.1 from training. Duration: 0.97275 2023-10-02 15:35:31,752 WARNING [train.py:1204] (2/4) Exclude cut with ID _40901_210_5665_1_1533363789958_3856130_114-340229-0_sp1.1 from training. Duration: 0.936375 2023-10-02 15:35:31,789 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_489-230703-0 from training. Duration: 0.85 2023-10-02 15:35:35,269 WARNING [train.py:1204] (2/4) Exclude cut with ID _66873_210_5517_1_1535454136506_4293370_322-81243-0_sp1.1 from training. Duration: 0.936375 2023-10-02 15:35:37,134 WARNING [train.py:1204] (2/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_271-269084-0 from training. Duration: 0.83 2023-10-02 15:35:38,411 WARNING [train.py:1204] (2/4) Exclude cut with ID _13252_210_1925_1_1530671372047_3336909_166-314607-0_sp0.9 from training. Duration: 0.6 2023-10-02 15:35:38,465 WARNING [train.py:1204] (2/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_340-343302-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 15:35:39,349 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.0.layers.1.conv_module2.whiten, num_groups=1, num_channels=192, metric=12.20 vs. limit=15.0 2023-10-02 15:35:41,227 WARNING [train.py:1204] (2/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_286-112015-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 15:35:42,691 WARNING [train.py:1204] (2/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_328-264776-0 from training. Duration: 0.67 2023-10-02 15:35:45,401 WARNING [train.py:1204] (2/4) Exclude cut with ID _4866_210_2577_1_1527404412284_4364659_256-306830-0_sp0.9 from training. Duration: 0.9666875 2023-10-02 15:35:49,625 WARNING [train.py:1204] (2/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_239-316802-0 from training. Duration: 0.69 2023-10-02 15:35:51,014 WARNING [train.py:1204] (2/4) Exclude cut with ID _29857_210_20034_1_1532395886753_3798160_279-185801-0 from training. Duration: 0.91 2023-10-02 15:35:55,181 INFO [train.py:1046] (2/4) Epoch 27, batch 1400, loss[loss=0.1631, simple_loss=0.2481, pruned_loss=0.039, over 24435.00 frames. ], tot_loss[loss=0.1671, simple_loss=0.2445, pruned_loss=0.04487, over 4707292.73 frames. ], batch size: 69, lr: 3.81e-03, grad_scale: 16.0 2023-10-02 15:35:55,278 WARNING [train.py:1204] (2/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_656-188361-0 from training. Duration: 0.87 2023-10-02 15:35:57,092 WARNING [train.py:1204] (2/4) Exclude cut with ID _44718_210_5816_1_1534050051869_6469030_100-230287-0_sp1.1 from training. Duration: 0.936375 2023-10-02 15:35:59,279 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8275_210_4198_1_1528455320_3892571_276-215622-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 15:36:01,232 WARNING [train.py:1204] (2/4) Exclude cut with ID _30304_210_6319_1_1532476827261_7582060_698-309430-0_sp1.1 from training. Duration: 0.963625 2023-10-02 15:36:07,101 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_365-217119-0 from training. Duration: 0.63 2023-10-02 15:36:07,219 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7264_210_4938_1_1527939911_8132095_800-91995-0 from training. Duration: 0.86 2023-10-02 15:36:15,616 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.1.feed_forward1.out_proj.dropout_p, batch_count=930166.6666666666, ans=0.1 2023-10-02 15:36:16,801 WARNING [train.py:1204] (2/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_49-97158-0_sp1.1 from training. Duration: 0.6909375 2023-10-02 15:36:18,156 WARNING [train.py:1204] (2/4) Exclude cut with ID _61132_210_9969_1_1535021792964_3755419_61-57720-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 15:36:19,574 WARNING [train.py:1204] (2/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_345-306832-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 15:36:21,033 WARNING [train.py:1204] (2/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_164-224052-0_sp0.9 from training. Duration: 0.77775 2023-10-02 15:36:26,456 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_34886_210_10633_1_1533203882_1755661_76-156096-0_sp1.1 from training. Duration: 0.7909375 2023-10-02 15:36:26,542 WARNING [train.py:1204] (2/4) Exclude cut with ID 4133-6541-0027-40495-0_sp1.1 from training. Duration: 0.9681875 2023-10-02 15:36:36,503 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_514-211165-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 15:36:36,562 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_41211_210_2544_1_1533348859_3702910_97-212695-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 15:36:36,724 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.attention_skip_rate, batch_count=930233.3333333334, ans=0.0 2023-10-02 15:36:40,191 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.conv_module1.balancer2.prob, batch_count=930300.0, ans=0.125 2023-10-02 15:36:41,291 WARNING [train.py:1204] (2/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_406-45086-0 from training. Duration: 0.8 2023-10-02 15:36:41,353 WARNING [train.py:1204] (2/4) Exclude cut with ID _65263_210_22356_1_1535763598691_3921037_49-222917-0_sp1.1 from training. Duration: 0.836375 2023-10-02 15:36:42,663 WARNING [train.py:1204] (2/4) Exclude cut with ID _4866_210_2577_1_1527404412284_4364659_352-157930-0_sp1.1 from training. Duration: 0.736375 2023-10-02 15:36:43,264 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.4.encoder.layers.1.feed_forward2.out_whiten, num_groups=1, num_channels=384, metric=10.71 vs. limit=15.0 2023-10-02 15:36:44,067 WARNING [train.py:1204] (2/4) Exclude cut with ID _4366_210_1388_1_1526792053633_3613819_231-211402-0_sp1.1 from training. Duration: 0.8545625 2023-10-02 15:36:44,122 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_98-21576-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 15:36:45,448 WARNING [train.py:1204] (2/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_348-127789-0_sp0.9 from training. Duration: 0.9444375 2023-10-02 15:36:45,463 WARNING [train.py:1204] (2/4) Exclude cut with ID _45871_210_5665_1_1533864614245_3296060_98-329557-0_sp1.1 from training. Duration: 0.97275 2023-10-02 15:36:46,666 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.540e+02 1.831e+02 2.061e+02 2.226e+02 3.360e+02, threshold=4.122e+02, percent-clipped=0.0 2023-10-02 15:36:46,749 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6846_210_6753_1_1527850767_4295158_2-315073-0_sp1.1 from training. Duration: 0.8 2023-10-02 15:36:48,058 WARNING [train.py:1204] (2/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_145-137790-0 from training. Duration: 0.83 2023-10-02 15:36:48,081 WARNING [train.py:1204] (2/4) Exclude cut with ID _14603_210_4220_1_1530615688441_3097319_133-158308-0_sp0.9 from training. Duration: 0.9333125 2023-10-02 15:36:51,020 WARNING [train.py:1204] (2/4) Exclude cut with ID _14898_210_9206_1_1530766812377_4177190_320-11758-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 15:36:55,109 WARNING [train.py:1204] (2/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_406-45086-0_sp0.9 from training. Duration: 0.888875 2023-10-02 15:37:00,544 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.4.encoder.layers.1.nonlin_attention.whiten2, num_groups=1, num_channels=384, metric=3.23 vs. limit=15.0 2023-10-02 15:37:01,325 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_5438_210_1794_1_1527336687_2600009_104-288274-0 from training. Duration: 0.58 2023-10-02 15:37:02,568 WARNING [train.py:1204] (2/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_108-53929-0_sp0.9 from training. Duration: 0.6555625 2023-10-02 15:37:03,941 WARNING [train.py:1204] (2/4) Exclude cut with ID _30800_210_22382_1_1532499347904_4515959_444-286390-0_sp1.1 from training. Duration: 0.963625 2023-10-02 15:37:05,370 WARNING [train.py:1204] (2/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_125-144174-0_sp0.9 from training. Duration: 0.4333125 2023-10-02 15:37:06,776 WARNING [train.py:1204] (2/4) Exclude cut with ID _55209_210_1531_1_1534675828996_4597160_118-345621-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 15:37:08,896 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.attention_skip_rate, batch_count=930433.3333333334, ans=0.0 2023-10-02 15:37:09,949 INFO [train.py:1046] (2/4) Epoch 27, batch 1450, loss[loss=0.1686, simple_loss=0.2514, pruned_loss=0.04289, over 24486.00 frames. ], tot_loss[loss=0.1668, simple_loss=0.2437, pruned_loss=0.04494, over 4689769.74 frames. ], batch size: 66, lr: 3.81e-03, grad_scale: 16.0 2023-10-02 15:37:10,064 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_45886_210_17024_1_1533709215_471063_7-79165-0_sp1.1 from training. Duration: 0.8909375 2023-10-02 15:37:14,185 WARNING [train.py:1204] (2/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_175-163894-0_sp0.9 from training. Duration: 0.82225 2023-10-02 15:37:14,416 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.bypass.scale_min, batch_count=930433.3333333334, ans=0.2 2023-10-02 15:37:16,862 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_50188_210_4198_1_1534503621_3646041_353-81682-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 15:37:16,868 WARNING [train.py:1204] (2/4) Exclude cut with ID _17875_210_2087_1_1531224012907_1821339_175-80565-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 15:37:16,884 WARNING [train.py:1204] (2/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_159-328157-0_sp1.1 from training. Duration: 0.47275 2023-10-02 15:37:21,008 WARNING [train.py:1204] (2/4) Exclude cut with ID _60057_210_10640_1_1534903065793_3333150_137-267579-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 15:37:22,352 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_92-46009-0_sp1.1 from training. Duration: 0.4909375 2023-10-02 15:37:22,469 WARNING [train.py:1204] (2/4) Exclude cut with ID _53203_210_2087_1_1534246218309_475590_48-68507-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 15:37:23,762 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_28211_210_6388_1_1532861506_4523224_128-328162-0 from training. Duration: 0.9 2023-10-02 15:37:25,138 WARNING [train.py:1204] (2/4) Exclude cut with ID _4697_210_2069_1_1526815801953_3823098_51-1049-0_sp1.1 from training. Duration: 0.6818125 2023-10-02 15:37:26,658 WARNING [train.py:1204] (2/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_383-316000-0 from training. Duration: 0.9 2023-10-02 15:37:26,709 WARNING [train.py:1204] (2/4) Exclude cut with ID _4779_210_2084_1_1527318022944_3914893_185-295153-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 15:37:26,778 WARNING [train.py:1204] (2/4) Exclude cut with ID _3231_210_2106_1_1526205864934_6784220_171-256004-0_sp0.9 from training. Duration: 0.9555625 2023-10-02 15:37:26,779 WARNING [train.py:1204] (2/4) Exclude cut with ID _41314_210_12651_1_1533459476543_3766460_87-272068-0 from training. Duration: 0.83 2023-10-02 15:37:28,180 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_164-152777-0_sp1.1 from training. Duration: 0.92725 2023-10-02 15:37:28,226 WARNING [train.py:1204] (2/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_18-130336-0_sp0.9 from training. Duration: 0.87775 2023-10-02 15:37:28,480 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.nonlin_attention.balancer.prob, batch_count=930500.0, ans=0.125 2023-10-02 15:37:29,560 WARNING [train.py:1204] (2/4) Exclude cut with ID _14886_210_4220_1_1530702020355_3516190_175-229073-0_sp1.1 from training. Duration: 0.4090625 2023-10-02 15:37:29,573 WARNING [train.py:1204] (2/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_791-45149-0_sp0.9 from training. Duration: 0.9555625 2023-10-02 15:37:30,971 WARNING [train.py:1204] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_468-189058-0_sp1.1 from training. Duration: 0.87275 2023-10-02 15:37:31,734 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8278_210_2550_1_1528637099_4220413_32-142780-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 15:37:34,339 WARNING [train.py:1204] (2/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_146-218763-0_sp0.9 from training. Duration: 0.9555625 2023-10-02 15:37:39,473 WARNING [train.py:1204] (2/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_348-127789-0_sp1.1 from training. Duration: 0.77275 2023-10-02 15:37:39,485 WARNING [train.py:1204] (2/4) Exclude cut with ID _47790_210_16187_1_1533862814066_4207519_288-285445-0_sp1.1 from training. Duration: 0.936375 2023-10-02 15:37:42,176 WARNING [train.py:1204] (2/4) Exclude cut with ID _14603_210_4220_1_1530615688441_3097319_308-181732-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 15:37:42,215 WARNING [train.py:1204] (2/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_895-117428-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 15:37:43,665 WARNING [train.py:1204] (2/4) Exclude cut with ID _9235_210_5219_1_1528891154253_8046120_387-159008-0_sp0.9 from training. Duration: 0.9555625 2023-10-02 15:37:43,678 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_37351_210_7925_1_1533012931_3812113_265-85780-0_sp0.9 from training. Duration: 0.97775 2023-10-02 15:37:43,708 WARNING [train.py:1204] (2/4) Exclude cut with ID _40551_210_10635_1_1533776424632_3416179_361-3964-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 15:37:45,022 WARNING [train.py:1204] (2/4) Exclude cut with ID _25882_210_3036_1_1532775665815_6479510_485-122742-0_sp1.1 from training. Duration: 0.97275 2023-10-02 15:37:49,056 WARNING [train.py:1204] (2/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_98-51676-0 from training. Duration: 0.73 2023-10-02 15:37:51,967 WARNING [train.py:1204] (2/4) Exclude cut with ID _30548_210_16040_1_1532608097130_3146969_371-280610-0_sp1.1 from training. Duration: 0.92725 2023-10-02 15:37:54,750 WARNING [train.py:1204] (2/4) Exclude cut with ID IC0671W0081-331924 from training. Duration: 0.896 2023-10-02 15:37:56,197 WARNING [train.py:1204] (2/4) Exclude cut with ID _40284_210_2084_1_1533513679814_3704279_380-160775-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 15:37:57,638 WARNING [train.py:1204] (2/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_57-270037-0_sp1.1 from training. Duration: 0.7 2023-10-02 15:37:58,940 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_853-150237-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 15:38:00,606 WARNING [train.py:1204] (2/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_491-261923-0 from training. Duration: 0.98 2023-10-02 15:38:05,842 WARNING [train.py:1204] (2/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_836-244045-0_sp1.1 from training. Duration: 0.97275 2023-10-02 15:38:05,920 WARNING [train.py:1204] (2/4) Exclude cut with ID _14880_210_8895_1_1530680426655_3795259_289-212817-0 from training. Duration: 0.69 2023-10-02 15:38:07,331 WARNING [train.py:1204] (2/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_303-340446-0 from training. Duration: 0.9 2023-10-02 15:38:09,191 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_37152_210_9697_1_1533106573_3844188_418-41736-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 15:38:12,581 WARNING [train.py:1204] (2/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_371-18169-0_sp1.1 from training. Duration: 0.7818125 2023-10-02 15:38:13,797 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_187-219984-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 15:38:15,170 WARNING [train.py:1204] (2/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_63-267585-0 from training. Duration: 0.9 2023-10-02 15:38:17,999 WARNING [train.py:1204] (2/4) Exclude cut with ID _27013_210_1389_1_1532437173240_3252649_222-125573-0 from training. Duration: 0.98 2023-10-02 15:38:18,054 WARNING [train.py:1204] (2/4) Exclude cut with ID _14821_210_8750_1_1530680503991_6829359_189-297451-0 from training. Duration: 0.55 2023-10-02 15:38:19,392 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_557-163391-0_sp1.1 from training. Duration: 0.97275 2023-10-02 15:38:20,858 WARNING [train.py:1204] (2/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_65-88242-0_sp1.1 from training. Duration: 0.5090625 2023-10-02 15:38:22,530 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.feed_forward2.hidden_balancer.prob, batch_count=930766.6666666666, ans=0.125 2023-10-02 15:38:23,500 INFO [train.py:1046] (2/4) Epoch 27, batch 1500, loss[loss=0.1568, simple_loss=0.2382, pruned_loss=0.03772, over 24580.00 frames. ], tot_loss[loss=0.1671, simple_loss=0.244, pruned_loss=0.04507, over 4703514.18 frames. ], batch size: 60, lr: 3.81e-03, grad_scale: 16.0 2023-10-02 15:38:29,182 WARNING [train.py:1204] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_218-295097-0 from training. Duration: 0.77 2023-10-02 15:38:29,214 WARNING [train.py:1204] (2/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_686-247600-0_sp1.1 from training. Duration: 0.836375 2023-10-02 15:38:29,217 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_397-147196-0_sp0.9 from training. Duration: 0.888875 2023-10-02 15:38:30,533 WARNING [train.py:1204] (2/4) Exclude cut with ID _53950_210_5816_1_1534417264919_3862951_237-136449-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 15:38:31,885 WARNING [train.py:1204] (2/4) Exclude cut with ID _3120_210_3525_1_1526036143112_4072980_74-178400-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 15:38:31,966 WARNING [train.py:1204] (2/4) Exclude cut with ID _5166_210_2084_1_1527073197160_4087993_309-222545-0_sp0.9 from training. Duration: 0.9333125 2023-10-02 15:38:32,573 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.3.encoder.layers.2.self_attn2.whiten, num_groups=1, num_channels=512, metric=13.13 vs. limit=22.5 2023-10-02 15:38:33,674 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7544_210_6753_1_1528587722_3576686_172-171750-0 from training. Duration: 0.65 2023-10-02 15:38:35,666 WARNING [train.py:1204] (2/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_3-237434-0_sp0.9 from training. Duration: 0.8444375 2023-10-02 15:38:35,697 WARNING [train.py:1204] (2/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_101-293367-0_sp0.9 from training. Duration: 0.7 2023-10-02 15:38:35,705 WARNING [train.py:1204] (2/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_818-213552-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 15:38:36,892 WARNING [train.py:1204] (2/4) Exclude cut with ID _14349_210_8341_1_1530615710818_3442470_115-285975-0_sp1.1 from training. Duration: 0.7818125 2023-10-02 15:38:38,358 WARNING [train.py:1204] (2/4) Exclude cut with ID _30800_210_22382_1_1532499347904_4515959_291-309749-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 15:38:39,199 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.1.encoder.layers.1.conv_module2.whiten, num_groups=1, num_channels=256, metric=12.89 vs. limit=15.0 2023-10-02 15:38:40,279 WARNING [train.py:1204] (2/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_715-3462-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 15:38:44,441 WARNING [train.py:1204] (2/4) Exclude cut with ID _59502_210_12210_1_1535107024698_1835139_13-331827-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 15:38:44,451 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_4528_210_3273_1_1527076654_3276777_342-168068-0 from training. Duration: 0.87 2023-10-02 15:38:45,793 WARNING [train.py:1204] (2/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_215-141059-0_sp0.9 from training. Duration: 0.688875 2023-10-02 15:38:47,044 WARNING [train.py:1204] (2/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_106-286130-0_sp1.1 from training. Duration: 0.8909375 2023-10-02 15:38:47,119 WARNING [train.py:1204] (2/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_480-77655-0_sp1.1 from training. Duration: 0.97275 2023-10-02 15:38:51,284 WARNING [train.py:1204] (2/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_848-280957-0 from training. Duration: 0.87 2023-10-02 15:38:51,462 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.0.conv_module1.balancer2.prob, batch_count=930900.0, ans=0.125 2023-10-02 15:38:55,446 WARNING [train.py:1204] (2/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_122-109023-0 from training. Duration: 0.76 2023-10-02 15:38:56,749 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_11305_210_1794_1_1529750428_7178148_645-233480-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 15:38:56,809 WARNING [train.py:1204] (2/4) Exclude cut with ID _7562_210_2084_1_1528887767916_3946229_345-169156-0 from training. Duration: 0.58 2023-10-02 15:38:58,252 WARNING [train.py:1204] (2/4) Exclude cut with ID _7562_210_2084_1_1528887767916_3946229_345-169156-0_sp1.1 from training. Duration: 0.52725 2023-10-02 15:38:59,691 WARNING [train.py:1204] (2/4) Exclude cut with ID _13253_210_1925_1_1530757775819_3577873_253-246386-0_sp1.1 from training. Duration: 0.8181875 2023-10-02 15:39:00,970 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_136-264444-0_sp1.1 from training. Duration: 0.97275 2023-10-02 15:39:00,992 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_2168_210_2290_1_1525606920_1849400_136-222991-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 15:39:01,095 WARNING [train.py:1204] (2/4) Exclude cut with ID _7852_210_5219_1_1528286471316_8139400_242-105336-0 from training. Duration: 0.72 2023-10-02 15:39:02,422 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_5960_210_1794_1_1527403865_3990999_515-2613-0_sp1.1 from training. Duration: 0.8 2023-10-02 15:39:02,438 WARNING [train.py:1204] (2/4) Exclude cut with ID _39688_210_19596_1_1533371626725_4154159_551-167834-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 15:39:02,506 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_43802_210_2290_1_1533516203_8829504_85-195240-0 from training. Duration: 0.87 2023-10-02 15:39:04,411 WARNING [train.py:1204] (2/4) Exclude cut with ID _63171_210_15488_1_1535104792794_3295110_113-113534-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 15:39:06,763 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.4.encoder.layers.2.conv_module1.whiten, num_groups=1, num_channels=384, metric=2.62 vs. limit=15.0 2023-10-02 15:39:11,211 WARNING [train.py:1204] (2/4) Exclude cut with ID _8833_210_5219_1_1528592665210_7553059_319-349181-0_sp1.1 from training. Duration: 0.9 2023-10-02 15:39:11,212 WARNING [train.py:1204] (2/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_225-43381-0 from training. Duration: 0.94 2023-10-02 15:39:13,899 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.531e+02 1.907e+02 2.123e+02 2.554e+02 3.367e+02, threshold=4.246e+02, percent-clipped=0.0 2023-10-02 15:39:15,347 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_5959_210_1794_1_1527400400_3445726_412-44052-0_sp0.9 from training. Duration: 0.8555625 2023-10-02 15:39:17,933 WARNING [train.py:1204] (2/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_356-167816-0_sp1.1 from training. Duration: 0.6545625 2023-10-02 15:39:22,130 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9012W0112-501244 from training. Duration: 0.768 2023-10-02 15:39:22,181 WARNING [train.py:1204] (2/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_177-326952-0_sp1.1 from training. Duration: 0.92725 2023-10-02 15:39:22,185 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9016W0116-502789 from training. Duration: 0.896 2023-10-02 15:39:23,612 WARNING [train.py:1204] (2/4) Exclude cut with ID _56235_210_10640_1_1534557335235_4161099_557-204815-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 15:39:24,993 WARNING [train.py:1204] (2/4) Exclude cut with ID _55857_210_2346_1_1534644007831_6898209_279-229469-0_sp1.1 from training. Duration: 0.936375 2023-10-02 15:39:26,363 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9029W0375-509764 from training. Duration: 0.896 2023-10-02 15:39:27,805 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_2295_210_2087_1_1525500993_2407863_234-83104-0_sp0.9 from training. Duration: 0.688875 2023-10-02 15:39:30,613 WARNING [train.py:1204] (2/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_266-137104-0 from training. Duration: 0.65 2023-10-02 15:39:33,195 WARNING [train.py:1204] (2/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_289-163927-0_sp1.1 from training. Duration: 0.92725 2023-10-02 15:39:36,348 INFO [train.py:1046] (2/4) Epoch 27, batch 1550, loss[loss=0.1693, simple_loss=0.2445, pruned_loss=0.04703, over 23576.00 frames. ], tot_loss[loss=0.1671, simple_loss=0.2443, pruned_loss=0.04498, over 4717969.67 frames. ], batch size: 134, lr: 3.81e-03, grad_scale: 16.0 2023-10-02 15:39:37,068 WARNING [train.py:1204] (2/4) Exclude cut with ID _12733_210_8341_1_1530167424556_3646380_86-225314-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 15:39:37,081 WARNING [train.py:1204] (2/4) Exclude cut with ID _7877_210_6014_1_1528286358099_3750136_618-284064-0_sp1.1 from training. Duration: 0.92725 2023-10-02 15:39:38,220 WARNING [train.py:1204] (2/4) Exclude cut with ID _55844_210_2346_1_1534471212569_7625707_1017-31469-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 15:39:38,266 WARNING [train.py:1204] (2/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_617-241446-0_sp1.1 from training. Duration: 0.92725 2023-10-02 15:39:38,305 WARNING [train.py:1204] (2/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_866-173440-0_sp1.1 from training. Duration: 0.8181875 2023-10-02 15:39:41,013 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_9460_210_4211_1_1528954587_10049667_480-260269-0 from training. Duration: 0.56 2023-10-02 15:39:41,052 WARNING [train.py:1204] (2/4) Exclude cut with ID _44659_210_2721_1_1534036120061_6614860_626-298884-0 from training. Duration: 0.97 2023-10-02 15:39:41,079 WARNING [train.py:1204] (2/4) Exclude cut with ID _53565_210_10635_1_1534337218335_3820259_287-18120-0_sp1.1 from training. Duration: 0.8818125 2023-10-02 15:39:43,027 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_791-197891-0 from training. Duration: 0.84 2023-10-02 15:39:43,065 WARNING [train.py:1204] (2/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_527-238990-0 from training. Duration: 0.9 2023-10-02 15:39:44,533 WARNING [train.py:1204] (2/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_449-65969-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 15:39:46,437 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_553-341452-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 15:39:46,477 WARNING [train.py:1204] (2/4) Exclude cut with ID _16909_210_5724_1_1531292079912_4118919_300-47574-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 15:39:46,489 WARNING [train.py:1204] (2/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_70-27732-0_sp1.1 from training. Duration: 0.8545625 2023-10-02 15:39:48,023 WARNING [train.py:1204] (2/4) Exclude cut with ID _44718_210_5816_1_1534050051869_6469030_727-28074-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 15:39:49,299 WARNING [train.py:1204] (2/4) Exclude cut with ID _55857_210_2346_1_1534644007831_6898209_357-123664-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 15:39:50,836 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9123W0004-555964 from training. Duration: 0.896 2023-10-02 15:39:52,097 WARNING [train.py:1204] (2/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_445-145208-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 15:39:52,135 WARNING [train.py:1204] (2/4) Exclude cut with ID _3675_210_4557_1_1526639934408_4182089_456-18305-0_sp1.1 from training. Duration: 0.6909375 2023-10-02 15:39:53,525 WARNING [train.py:1204] (2/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_1-99680-0_sp1.1 from training. Duration: 0.6090625 2023-10-02 15:39:54,953 WARNING [train.py:1204] (2/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_146-282400-0_sp0.9 from training. Duration: 0.788875 2023-10-02 15:39:54,966 WARNING [train.py:1204] (2/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_844-96989-0 from training. Duration: 0.71 2023-10-02 15:39:56,370 WARNING [train.py:1204] (2/4) Exclude cut with ID _29853_210_7497_1_1532505600851_6642110_128-146364-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 15:39:56,399 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_224-322394-0 from training. Duration: 0.66 2023-10-02 15:39:57,772 WARNING [train.py:1204] (2/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_191-59784-0 from training. Duration: 0.88 2023-10-02 15:39:57,788 WARNING [train.py:1204] (2/4) Exclude cut with ID _12556_210_8341_1_1530081024100_7327690_725-193477-0 from training. Duration: 0.98 2023-10-02 15:39:58,334 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.5.encoder.layers.0.feed_forward2.out_whiten, num_groups=1, num_channels=256, metric=11.28 vs. limit=15.0 2023-10-02 15:39:59,206 WARNING [train.py:1204] (2/4) Exclude cut with ID _5166_210_2084_1_1527073197160_4087993_523-79838-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 15:40:00,581 WARNING [train.py:1204] (2/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_398-334558-0_sp1.1 from training. Duration: 0.87275 2023-10-02 15:40:03,443 WARNING [train.py:1204] (2/4) Exclude cut with ID _6456_210_1534_1_1527681634212_3181049_171-36697-0_sp1.1 from training. Duration: 0.936375 2023-10-02 15:40:04,949 WARNING [train.py:1204] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_700-183549-0 from training. Duration: 0.68 2023-10-02 15:40:04,951 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8853_210_3273_1_1528633899_818968_51-187685-0 from training. Duration: 0.69 2023-10-02 15:40:13,040 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.self_attn_weights.pos_emb_skip_rate, batch_count=931233.3333333334, ans=0.0 2023-10-02 15:40:14,025 WARNING [train.py:1204] (2/4) Exclude cut with ID _7719_210_3525_1_1528456153163_4296960_333-110399-0_sp1.1 from training. Duration: 0.87275 2023-10-02 15:40:18,796 WARNING [train.py:1204] (2/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_1022-83740-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 15:40:18,820 WARNING [train.py:1204] (2/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_61-327062-0_sp0.9 from training. Duration: 0.9 2023-10-02 15:40:18,826 WARNING [train.py:1204] (2/4) Exclude cut with ID _47251_210_18628_1_1533814063128_4137250_214-88076-0_sp1.1 from training. Duration: 0.8 2023-10-02 15:40:18,983 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.conv_skip_rate, batch_count=931233.3333333334, ans=0.0 2023-10-02 15:40:20,062 WARNING [train.py:1204] (2/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_839-173017-0 from training. Duration: 0.41 2023-10-02 15:40:24,249 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.0.feed_forward1.hidden_balancer.prob, batch_count=931300.0, ans=0.125 2023-10-02 15:40:26,781 WARNING [train.py:1204] (2/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_299-34413-0_sp1.1 from training. Duration: 0.5909375 2023-10-02 15:40:28,170 WARNING [train.py:1204] (2/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_664-159457-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 15:40:29,660 WARNING [train.py:1204] (2/4) Exclude cut with ID _66873_210_5517_1_1535454136506_4293370_483-291760-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 15:40:31,072 WARNING [train.py:1204] (2/4) Exclude cut with ID _30800_210_22382_1_1532499347904_4515959_287-255474-0_sp1.1 from training. Duration: 0.92725 2023-10-02 15:40:32,414 WARNING [train.py:1204] (2/4) Exclude cut with ID _48677_210_5663_1_1533902405919_3892989_111-11439-0_sp1.1 from training. Duration: 0.87275 2023-10-02 15:40:32,428 WARNING [train.py:1204] (2/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_399-156791-0 from training. Duration: 0.83 2023-10-02 15:40:33,799 WARNING [train.py:1204] (2/4) Exclude cut with ID _3992_210_1747_1_1526644816031_3731060_36-147855-0_sp0.9 from training. Duration: 0.8666875 2023-10-02 15:40:35,206 WARNING [train.py:1204] (2/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_442-320317-0_sp1.1 from training. Duration: 0.7454375 2023-10-02 15:40:35,226 WARNING [train.py:1204] (2/4) Exclude cut with ID _55429_210_10626_1_1534467588527_7174143_953-80868-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 15:40:35,289 WARNING [train.py:1204] (2/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_159-328157-0_sp0.9 from training. Duration: 0.57775 2023-10-02 15:40:35,295 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9309W0197-640324 from training. Duration: 0.896 2023-10-02 15:40:38,107 WARNING [train.py:1204] (2/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_361-30822-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 15:40:43,519 WARNING [train.py:1204] (2/4) Exclude cut with ID _5250_210_5219_1_1527249561806_8692110_636-251213-0 from training. Duration: 0.94 2023-10-02 15:40:49,573 WARNING [train.py:1204] (2/4) Exclude cut with ID _49728_210_12651_1_1534041034294_6788150_468-333827-0_sp1.1 from training. Duration: 0.963625 2023-10-02 15:40:50,816 INFO [train.py:1046] (2/4) Epoch 27, batch 1600, loss[loss=0.1684, simple_loss=0.2349, pruned_loss=0.05094, over 23966.00 frames. ], tot_loss[loss=0.168, simple_loss=0.2451, pruned_loss=0.04546, over 4714839.06 frames. ], batch size: 196, lr: 3.81e-03, grad_scale: 32.0 2023-10-02 15:40:50,891 WARNING [train.py:1204] (2/4) Exclude cut with ID _25882_210_3036_1_1532775665815_6479510_623-191759-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 15:40:52,157 WARNING [train.py:1204] (2/4) Exclude cut with ID _5189_210_4557_1_1527248319428_3769229_199-53526-0 from training. Duration: 0.42 2023-10-02 15:40:52,265 WARNING [train.py:1204] (2/4) Exclude cut with ID _6274_210_2084_1_1527937583837_7494020_32-205288-0_sp0.9 from training. Duration: 0.8666875 2023-10-02 15:40:53,680 WARNING [train.py:1204] (2/4) Exclude cut with ID _65263_210_22356_1_1535763598691_3921037_401-346646-0_sp1.1 from training. Duration: 0.963625 2023-10-02 15:40:53,684 WARNING [train.py:1204] (2/4) Exclude cut with ID _12555_210_8750_1_1530075594402_3780239_449-267986-0_sp1.1 from training. Duration: 0.7545625 2023-10-02 15:40:53,696 WARNING [train.py:1204] (2/4) Exclude cut with ID _69884_210_9771_1_1535846392818_4166169_430-201020-0_sp1.1 from training. Duration: 0.8818125 2023-10-02 15:40:54,975 WARNING [train.py:1204] (2/4) Exclude cut with ID _8394_210_6702_1_1528590504316_3540189_379-49457-0_sp1.1 from training. Duration: 0.7909375 2023-10-02 15:40:59,039 WARNING [train.py:1204] (2/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_200-105218-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 15:40:59,082 WARNING [train.py:1204] (2/4) Exclude cut with ID _3867_210_1883_1_1526641200575_4475830_319-349850-0 from training. Duration: 0.67 2023-10-02 15:41:00,409 WARNING [train.py:1204] (2/4) Exclude cut with ID _28317_210_12742_1_1532480302381_8510325_916-195848-0 from training. Duration: 0.92 2023-10-02 15:41:01,787 WARNING [train.py:1204] (2/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_436-186311-0 from training. Duration: 0.65 2023-10-02 15:41:03,214 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_3910_210_1794_1_1526777667_3747260_302-180617-0_sp1.1 from training. Duration: 0.97275 2023-10-02 15:41:04,669 WARNING [train.py:1204] (2/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_102-92751-0 from training. Duration: 0.99 2023-10-02 15:41:05,246 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.3.encoder.layers.2.feed_forward3.out_whiten, num_groups=1, num_channels=512, metric=8.85 vs. limit=15.0 2023-10-02 15:41:05,975 WARNING [train.py:1204] (2/4) Exclude cut with ID _51899_210_20034_1_1534129254466_3669220_265-215428-0_sp1.1 from training. Duration: 0.8909375 2023-10-02 15:41:08,029 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.bypass_mid.scale_min, batch_count=931500.0, ans=0.2 2023-10-02 15:41:09,078 WARNING [train.py:1204] (2/4) Exclude cut with ID _58583_210_5236_1_1534726835266_3574249_438-198993-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 15:41:12,558 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.bypass.skip_rate, batch_count=931500.0, ans=0.09899494936611666 2023-10-02 15:41:14,231 WARNING [train.py:1204] (2/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_166-166990-0_sp1.1 from training. Duration: 0.87275 2023-10-02 15:41:18,196 WARNING [train.py:1204] (2/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_907-151398-0 from training. Duration: 0.37 2023-10-02 15:41:19,814 WARNING [train.py:1204] (2/4) Exclude cut with ID _38670_210_15379_1_1533448810659_4391059_452-256669-0_sp1.1 from training. Duration: 0.936375 2023-10-02 15:41:21,117 WARNING [train.py:1204] (2/4) Exclude cut with ID _48677_210_5663_1_1533902405919_3892989_408-40740-0 from training. Duration: 0.83 2023-10-02 15:41:21,160 WARNING [train.py:1204] (2/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_903-158756-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 15:41:21,207 WARNING [train.py:1204] (2/4) Exclude cut with ID _13252_210_1925_1_1530671372047_3336909_397-216497-0 from training. Duration: 0.96 2023-10-02 15:41:24,755 INFO [scaling.py:1118] (2/4) WithLoss: name=encoder.encoders.5.encoder.layers.0.self_attn_weights, loss-sum=1.295e-02 2023-10-02 15:41:27,194 WARNING [train.py:1204] (2/4) Exclude cut with ID _55429_210_10626_1_1534467588527_7174143_97-335084-0 from training. Duration: 0.97 2023-10-02 15:41:34,014 WARNING [train.py:1204] (2/4) Exclude cut with ID _45329_210_7005_1_1534157951211_6774339_453-267730-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 15:41:34,240 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.conv_module1.balancer2.prob, batch_count=931633.3333333334, ans=0.125 2023-10-02 15:41:35,352 WARNING [train.py:1204] (2/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_153-97441-0 from training. Duration: 0.56 2023-10-02 15:41:35,419 WARNING [train.py:1204] (2/4) Exclude cut with ID _40901_210_5665_1_1533363789958_3856130_312-329122-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 15:41:35,454 WARNING [train.py:1204] (2/4) Exclude cut with ID _30800_210_22382_1_1532499347904_4515959_97-126471-0_sp1.1 from training. Duration: 0.97275 2023-10-02 15:41:35,454 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_11305_210_1794_1_1529750428_7178148_257-312870-0_sp1.1 from training. Duration: 0.8 2023-10-02 15:41:36,046 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.3.encoder.layers.1.self_attn_weights.whiten_keys, num_groups=8, num_channels=256, metric=4.51 vs. limit=6.0 2023-10-02 15:41:38,218 WARNING [train.py:1204] (2/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_454-314901-0_sp0.9 from training. Duration: 0.511125 2023-10-02 15:41:41,432 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.590e+02 1.847e+02 2.065e+02 2.421e+02 3.334e+02, threshold=4.130e+02, percent-clipped=0.0 2023-10-02 15:41:41,710 WARNING [train.py:1204] (2/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_282-136641-0_sp1.1 from training. Duration: 0.3818125 2023-10-02 15:41:43,611 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_46013_210_7234_1_1533776362_3744032_493-160724-0_sp1.1 from training. Duration: 0.963625 2023-10-02 15:41:43,652 WARNING [train.py:1204] (2/4) Exclude cut with ID _67831_210_12392_1_1535612464202_3092612_259-33442-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 15:41:44,179 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.3.encoder.layers.3.whiten, num_groups=1, num_channels=512, metric=3.95 vs. limit=12.0 2023-10-02 15:41:44,937 WARNING [train.py:1204] (2/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_289-62384-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 15:41:45,004 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6545_210_2040_1_1527774670_4245258_379-14182-0_sp0.9 from training. Duration: 0.888875 2023-10-02 15:41:46,382 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_548-294316-0_sp1.1 from training. Duration: 0.736375 2023-10-02 15:41:47,790 WARNING [train.py:1204] (2/4) Exclude cut with ID _6275_210_2084_1_1528110062213_7669049_857-298942-0_sp1.1 from training. Duration: 0.77275 2023-10-02 15:41:49,058 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6673_210_3273_1_1527767790_3173240_327-306185-0_sp1.1 from training. Duration: 0.7545625 2023-10-02 15:41:53,115 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.1.encoder.layers.1.conv_module1.whiten, num_groups=1, num_channels=256, metric=10.14 vs. limit=15.0 2023-10-02 15:41:54,960 WARNING [train.py:1204] (2/4) Exclude cut with ID _61008_210_10633_1_1534852769198_6974370_814-107647-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 15:41:56,119 WARNING [train.py:1204] (2/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_116-332008-0_sp1.1 from training. Duration: 0.8909375 2023-10-02 15:41:58,937 WARNING [train.py:1204] (2/4) Exclude cut with ID _32704_210_8954_1_1533017488840_6747370_266-329755-0 from training. Duration: 0.89 2023-10-02 15:41:58,948 WARNING [train.py:1204] (2/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_271-269084-0_sp0.9 from training. Duration: 0.92225 2023-10-02 15:41:59,014 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_3171_210_2087_1_1526109084_2330007_66-260583-0 from training. Duration: 0.72 2023-10-02 15:42:01,958 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.conv_module1.balancer1.prob, batch_count=931766.6666666666, ans=0.125 2023-10-02 15:42:03,086 INFO [train.py:1046] (2/4) Epoch 27, batch 1650, loss[loss=0.148, simple_loss=0.2395, pruned_loss=0.02826, over 24558.00 frames. ], tot_loss[loss=0.1682, simple_loss=0.2457, pruned_loss=0.04541, over 4722651.90 frames. ], batch size: 71, lr: 3.80e-03, grad_scale: 16.0 2023-10-02 15:42:04,560 WARNING [train.py:1204] (2/4) Exclude cut with ID _5250_210_5219_1_1527249561806_8692110_1326-276386-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 15:42:05,971 WARNING [train.py:1204] (2/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_372-30302-0_sp1.1 from training. Duration: 0.7818125 2023-10-02 15:42:07,771 WARNING [train.py:1204] (2/4) Exclude cut with ID _25882_210_3036_1_1532775665815_6479510_631-257029-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 15:42:07,786 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_3691_210_3528_1_1526468169_4062053_355-334432-0 from training. Duration: 0.87 2023-10-02 15:42:07,794 WARNING [train.py:1204] (2/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_839-173017-0_sp1.1 from training. Duration: 0.37275 2023-10-02 15:42:07,807 WARNING [train.py:1204] (2/4) Exclude cut with ID _14998_210_5219_1_1530705582080_7474970_441-91466-0 from training. Duration: 0.97 2023-10-02 15:42:09,566 WARNING [train.py:1204] (2/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_108-53929-0 from training. Duration: 0.59 2023-10-02 15:42:12,870 WARNING [train.py:1204] (2/4) Exclude cut with ID _9389_210_1664_1_1529217337301_5289269_260-1942-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 15:42:12,945 WARNING [train.py:1204] (2/4) Exclude cut with ID _56423_210_5854_1_1534896061049_3165969_198-61547-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 15:42:14,366 WARNING [train.py:1204] (2/4) Exclude cut with ID _44778_210_2054_1_1533798127203_7443090_412-142122-0_sp1.1 from training. Duration: 0.92725 2023-10-02 15:42:14,389 WARNING [train.py:1204] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_88-341718-0_sp1.1 from training. Duration: 0.6 2023-10-02 15:42:14,565 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.conv_module2.balancer1.prob, batch_count=931766.6666666666, ans=0.125 2023-10-02 15:42:17,139 WARNING [train.py:1204] (2/4) Exclude cut with ID _61079_210_4525_1_1534921008198_4011419_334-298697-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 15:42:18,503 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_5465_210_4211_1_1527227253_55357_8-2499-0_sp1.1 from training. Duration: 0.996375 2023-10-02 15:42:19,975 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_447-201565-0_sp1.1 from training. Duration: 0.97275 2023-10-02 15:42:19,989 WARNING [train.py:1204] (2/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_253-161789-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 15:42:19,992 WARNING [train.py:1204] (2/4) Exclude cut with ID _44304_210_15374_1_1533639352765_4613129_302-22084-0_sp0.9 from training. Duration: 0.9555625 2023-10-02 15:42:20,003 WARNING [train.py:1204] (2/4) Exclude cut with ID _26664_210_7770_1_1532258943280_3418270_187-80815-0_sp1.1 from training. Duration: 0.8454375 2023-10-02 15:42:21,368 WARNING [train.py:1204] (2/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_68-212801-0 from training. Duration: 0.73 2023-10-02 15:42:22,696 WARNING [train.py:1204] (2/4) Exclude cut with ID _29345_210_4381_1_1532689168263_3860169_149-257268-0 from training. Duration: 0.81 2023-10-02 15:42:27,409 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_213-103282-0_sp1.1 from training. Duration: 0.7090625 2023-10-02 15:42:28,985 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.attention_skip_rate, batch_count=931833.3333333334, ans=0.0 2023-10-02 15:42:30,074 WARNING [train.py:1204] (2/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_156-302675-0_sp1.1 from training. Duration: 0.5 2023-10-02 15:42:37,823 WARNING [train.py:1204] (2/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_125-322172-0 from training. Duration: 0.93 2023-10-02 15:42:37,886 WARNING [train.py:1204] (2/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_493-60474-0_sp1.1 from training. Duration: 0.963625 2023-10-02 15:42:41,101 WARNING [train.py:1204] (2/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_156-302675-0 from training. Duration: 0.55 2023-10-02 15:42:43,909 WARNING [train.py:1204] (2/4) Exclude cut with ID _41314_210_12651_1_1533459476543_3766460_123-267862-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 15:42:44,130 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.conv_module1.balancer2.prob, batch_count=931900.0, ans=0.125 2023-10-02 15:42:46,629 WARNING [train.py:1204] (2/4) Exclude cut with ID _28427_210_4220_1_1532581240967_3061069_201-143747-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 15:42:47,929 WARNING [train.py:1204] (2/4) Exclude cut with ID _8772_210_1850_1_1528635595972_3664011_96-12756-0_sp1.1 from training. Duration: 0.8909375 2023-10-02 15:42:47,963 WARNING [train.py:1204] (2/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_1347-145675-0_sp1.1 from training. Duration: 0.936375 2023-10-02 15:42:49,343 WARNING [train.py:1204] (2/4) Exclude cut with ID _9235_210_5219_1_1528891154253_8046120_387-159008-0_sp1.1 from training. Duration: 0.7818125 2023-10-02 15:42:49,366 WARNING [train.py:1204] (2/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_400-201896-0_sp1.1 from training. Duration: 0.963625 2023-10-02 15:42:53,358 WARNING [train.py:1204] (2/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_326-174612-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 15:42:53,399 WARNING [train.py:1204] (2/4) Exclude cut with ID _55844_210_2346_1_1534471212569_7625707_390-116233-0_sp1.1 from training. Duration: 0.963625 2023-10-02 15:42:53,426 WARNING [train.py:1204] (2/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_419-317062-0_sp1.1 from training. Duration: 0.82725 2023-10-02 15:42:53,478 WARNING [train.py:1204] (2/4) Exclude cut with ID _29877_210_6228_1_1532397791106_5276149_266-62291-0_sp0.9 from training. Duration: 0.9666875 2023-10-02 15:42:53,742 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.conv_skip_rate, batch_count=931966.6666666666, ans=0.0 2023-10-02 15:42:55,597 WARNING [train.py:1204] (2/4) Exclude cut with ID _7857_210_1534_1_1528286515745_6831470_431-338528-0_sp1.1 from training. Duration: 0.92725 2023-10-02 15:42:56,983 WARNING [train.py:1204] (2/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_87-131427-0_sp0.9 from training. Duration: 0.8666875 2023-10-02 15:42:58,424 WARNING [train.py:1204] (2/4) Exclude cut with ID _24055_210_12651_1_1532310907143_976979_92-59356-0_sp1.1 from training. Duration: 0.82725 2023-10-02 15:42:59,878 WARNING [train.py:1204] (2/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_96-290039-0 from training. Duration: 0.56 2023-10-02 15:43:00,110 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.out_combiner.scale_min, batch_count=931966.6666666666, ans=0.2 2023-10-02 15:43:01,294 WARNING [train.py:1204] (2/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_48-182155-0_sp0.9 from training. Duration: 0.9666875 2023-10-02 15:43:01,339 WARNING [train.py:1204] (2/4) Exclude cut with ID _4626_210_3532_1_1526817652717_3916859_446-206656-0 from training. Duration: 0.76 2023-10-02 15:43:01,547 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.conv_skip_rate, batch_count=932033.3333333334, ans=0.0 2023-10-02 15:43:02,777 WARNING [train.py:1204] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_168-32906-0 from training. Duration: 0.69 2023-10-02 15:43:02,801 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_3780_210_2087_1_1526467760_2130483_170-259044-0 from training. Duration: 0.7 2023-10-02 15:43:02,819 WARNING [train.py:1204] (2/4) Exclude cut with ID _65134_210_14763_1_1535335393985_3505368_153-216718-0_sp1.1 from training. Duration: 0.92725 2023-10-02 15:43:04,029 WARNING [train.py:1204] (2/4) Exclude cut with ID _64639_210_5816_1_1535367619437_3528000_461-329156-0_sp1.1 from training. Duration: 0.87275 2023-10-02 15:43:05,324 WARNING [train.py:1204] (2/4) Exclude cut with ID _21989_210_5854_1_1531899214931_3271360_126-347531-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 15:43:05,379 WARNING [train.py:1204] (2/4) Exclude cut with ID _7614_210_4915_1_1529326764092_4537359_96-162197-0_sp1.1 from training. Duration: 0.963625 2023-10-02 15:43:05,380 WARNING [train.py:1204] (2/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_1083-147536-0 from training. Duration: 0.97 2023-10-02 15:43:10,210 WARNING [train.py:1204] (2/4) Exclude cut with ID _64029_210_4381_1_1535178502772_3756459_275-16710-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 15:43:11,962 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7264_210_4938_1_1527939911_8132095_800-91995-0_sp0.9 from training. Duration: 0.9555625 2023-10-02 15:43:12,001 WARNING [train.py:1204] (2/4) Exclude cut with ID _3844_210_1390_1_1526727803582_2871209_50-193879-0_sp1.1 from training. Duration: 0.936375 2023-10-02 15:43:15,303 WARNING [train.py:1204] (2/4) Exclude cut with ID _46952_210_5986_1_1534230010357_3107379_232-344503-0 from training. Duration: 0.96 2023-10-02 15:43:18,078 INFO [train.py:1046] (2/4) Epoch 27, batch 1700, loss[loss=0.1631, simple_loss=0.2522, pruned_loss=0.03696, over 24544.00 frames. ], tot_loss[loss=0.1682, simple_loss=0.2454, pruned_loss=0.04545, over 4714474.09 frames. ], batch size: 71, lr: 3.80e-03, grad_scale: 16.0 2023-10-02 15:43:18,188 WARNING [train.py:1204] (2/4) Exclude cut with ID _25808_210_13292_1_1532343514562_7366109_206-14076-0_sp1.1 from training. Duration: 0.936375 2023-10-02 15:43:18,189 WARNING [train.py:1204] (2/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_55-31904-0_sp0.9 from training. Duration: 0.97775 2023-10-02 15:43:18,207 WARNING [train.py:1204] (2/4) Exclude cut with ID _14632_210_9988_1_1530691279392_3923200_161-129530-0 from training. Duration: 0.96 2023-10-02 15:43:18,270 WARNING [train.py:1204] (2/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_770-200430-0_sp1.1 from training. Duration: 0.8090625 2023-10-02 15:43:18,279 WARNING [train.py:1204] (2/4) Exclude cut with ID _6270_210_2084_1_1527850911573_3856420_26-277343-0_sp1.1 from training. Duration: 0.7454375 2023-10-02 15:43:18,280 WARNING [train.py:1204] (2/4) Exclude cut with ID _69884_210_9771_1_1535846392818_4166169_88-23776-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 15:43:21,060 WARNING [train.py:1204] (2/4) Exclude cut with ID _60860_210_8254_1_1534896015875_3557169_245-256666-0_sp1.1 from training. Duration: 0.8818125 2023-10-02 15:43:21,087 WARNING [train.py:1204] (2/4) Exclude cut with ID _5250_210_5219_1_1527249561806_8692110_636-251213-0_sp1.1 from training. Duration: 0.8545625 2023-10-02 15:43:22,386 WARNING [train.py:1204] (2/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_540-131164-0 from training. Duration: 0.92 2023-10-02 15:43:25,661 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_5931_210_2040_1_1527428481_4547999_321-193614-0_sp0.9 from training. Duration: 0.7444375 2023-10-02 15:43:33,809 WARNING [train.py:1204] (2/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_373-225403-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 15:43:37,050 WARNING [train.py:1204] (2/4) Exclude cut with ID _20622_210_3852_1_1531622427080_1093839_57-326027-0_sp1.1 from training. Duration: 0.97275 2023-10-02 15:43:38,678 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.ff3_skip_rate, batch_count=932166.6666666666, ans=0.0 2023-10-02 15:43:43,607 WARNING [train.py:1204] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_605-257205-0_sp1.1 from training. Duration: 0.536375 2023-10-02 15:43:43,638 WARNING [train.py:1204] (2/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_442-320317-0_sp0.9 from training. Duration: 0.911125 2023-10-02 15:43:45,030 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_35361_210_4238_1_1533262810_7793476_675-87012-0_sp1.1 from training. Duration: 0.8090625 2023-10-02 15:43:45,062 WARNING [train.py:1204] (2/4) Exclude cut with ID _4184_210_2550_1_1526730192013_2219380_306-44736-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 15:43:47,643 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_124-113562-0 from training. Duration: 0.94 2023-10-02 15:43:49,210 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_2821_210_2715_1_1526108385_3763560_73-296673-0_sp0.9 from training. Duration: 0.92225 2023-10-02 15:43:49,226 WARNING [train.py:1204] (2/4) Exclude cut with ID _10301_210_5886_1_1529222275332_3910620_216-46959-0_sp1.1 from training. Duration: 0.92725 2023-10-02 15:43:50,739 WARNING [train.py:1204] (2/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_91-277684-0_sp0.9 from training. Duration: 0.82225 2023-10-02 15:43:52,082 WARNING [train.py:1204] (2/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_351-294650-0_sp0.9 from training. Duration: 0.7 2023-10-02 15:43:53,458 WARNING [train.py:1204] (2/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_209-205365-0 from training. Duration: 0.79 2023-10-02 15:43:53,607 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.attention_skip_rate, batch_count=932233.3333333334, ans=0.0 2023-10-02 15:43:54,765 WARNING [train.py:1204] (2/4) Exclude cut with ID _14603_210_4220_1_1530615688441_3097319_97-325053-0 from training. Duration: 0.92 2023-10-02 15:43:56,656 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_49348_210_7927_1_1533954500_3691300_109-276430-0_sp1.1 from training. Duration: 0.92725 2023-10-02 15:43:59,386 WARNING [train.py:1204] (2/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_912-47473-0 from training. Duration: 0.68 2023-10-02 15:43:59,464 WARNING [train.py:1204] (2/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_96-320736-0_sp0.9 from training. Duration: 0.9555625 2023-10-02 15:44:02,845 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.2.encoder.layers.2.nonlin_attention.whiten2, num_groups=1, num_channels=384, metric=5.59 vs. limit=15.0 2023-10-02 15:44:06,382 WARNING [train.py:1204] (2/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_1252-235481-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 15:44:07,705 WARNING [train.py:1204] (2/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_813-261554-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 15:44:08,603 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.nonlin_attention.balancer.prob, batch_count=932300.0, ans=0.125 2023-10-02 15:44:09,557 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.549e+02 1.921e+02 2.104e+02 2.385e+02 3.447e+02, threshold=4.208e+02, percent-clipped=0.0 2023-10-02 15:44:09,652 WARNING [train.py:1204] (2/4) Exclude cut with ID _21632_210_2253_1_1531994001765_3744140_110-274654-0_sp0.9 from training. Duration: 0.911125 2023-10-02 15:44:12,257 WARNING [train.py:1204] (2/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_381-317791-0_sp1.1 from training. Duration: 0.663625 2023-10-02 15:44:12,277 WARNING [train.py:1204] (2/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_51-117576-0_sp1.1 from training. Duration: 0.363625 2023-10-02 15:44:12,299 WARNING [train.py:1204] (2/4) Exclude cut with ID _42321_210_7770_1_1533468631352_4366895_104-215915-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 15:44:14,269 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_40896_210_15710_1_1533433956_7100802_216-22824-0_sp1.1 from training. Duration: 0.92725 2023-10-02 15:44:14,270 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_14831_210_7927_1_1530700200_5583610_181-27467-0 from training. Duration: 0.91 2023-10-02 15:44:15,676 WARNING [train.py:1204] (2/4) Exclude cut with ID _30304_210_6319_1_1532476827261_7582060_1024-265645-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 15:44:15,679 WARNING [train.py:1204] (2/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_412-233904-0_sp1.1 from training. Duration: 0.936375 2023-10-02 15:44:15,696 WARNING [train.py:1204] (2/4) Exclude cut with ID _30191_210_1664_1_1533189915047_7196670_911-223461-0_sp1.1 from training. Duration: 0.92725 2023-10-02 15:44:15,703 WARNING [train.py:1204] (2/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_558-148266-0_sp1.1 from training. Duration: 0.963625 2023-10-02 15:44:18,535 WARNING [train.py:1204] (2/4) Exclude cut with ID _58480_210_12392_1_1534811121111_3646160_212-164495-0_sp1.1 from training. Duration: 0.936375 2023-10-02 15:44:18,536 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_454-276429-0_sp0.9 from training. Duration: 0.9666875 2023-10-02 15:44:19,897 WARNING [train.py:1204] (2/4) Exclude cut with ID _24736_210_3279_1_1532332894218_2838700_73-197086-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 15:44:19,939 WARNING [train.py:1204] (2/4) Exclude cut with ID _5294_210_1747_1_1527249643928_3696179_529-242298-0_sp1.1 from training. Duration: 0.863625 2023-10-02 15:44:19,973 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_53555_210_5632_1_1534595220_7232297_345-171438-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 15:44:21,545 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.bypass_mid.scale_min, batch_count=932366.6666666666, ans=0.2 2023-10-02 15:44:25,521 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_28211_210_6388_1_1532861506_4523224_22-25766-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 15:44:25,606 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7002_210_1794_1_1527939683_5157286_163-87552-0 from training. Duration: 0.86 2023-10-02 15:44:27,004 WARNING [train.py:1204] (2/4) Exclude cut with ID _56927_210_9969_1_1534848970840_3695229_409-225129-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 15:44:28,984 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_26192_210_7927_1_1532137269_1040115_21-219331-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 15:44:31,721 INFO [train.py:1046] (2/4) Epoch 27, batch 1750, loss[loss=0.1702, simple_loss=0.2404, pruned_loss=0.04997, over 23649.00 frames. ], tot_loss[loss=0.1669, simple_loss=0.2441, pruned_loss=0.04487, over 4712802.73 frames. ], batch size: 135, lr: 3.80e-03, grad_scale: 16.0 2023-10-02 15:44:31,765 WARNING [train.py:1204] (2/4) Exclude cut with ID _15012_210_1968_1_1530856820429_5095350_662-210469-0 from training. Duration: 0.57 2023-10-02 15:44:37,174 WARNING [train.py:1204] (2/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_582-191016-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 15:44:37,337 WARNING [train.py:1204] (2/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_270-210032-0_sp1.1 from training. Duration: 0.963625 2023-10-02 15:44:38,651 WARNING [train.py:1204] (2/4) Exclude cut with ID _6789_210_1534_1_1527857937218_3118950_262-48853-0_sp0.9 from training. Duration: 0.62225 2023-10-02 15:44:38,702 WARNING [train.py:1204] (2/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_847-41334-0 from training. Duration: 0.44 2023-10-02 15:44:40,570 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_40896_210_15710_1_1533433956_7100802_573-46532-0_sp1.1 from training. Duration: 0.92725 2023-10-02 15:44:40,758 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.1.conv_module2.balancer1.prob, batch_count=932433.3333333334, ans=0.125 2023-10-02 15:44:43,768 WARNING [train.py:1204] (2/4) Exclude cut with ID _21881_210_3525_1_1531825242757_3798380_247-102591-0_sp1.1 from training. Duration: 0.8909375 2023-10-02 15:44:43,798 WARNING [train.py:1204] (2/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_200-21395-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 15:44:45,898 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.conv_module1.balancer2.prob, batch_count=932500.0, ans=0.125 2023-10-02 15:44:48,307 WARNING [train.py:1204] (2/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_711-15688-0 from training. Duration: 0.43 2023-10-02 15:44:49,760 WARNING [train.py:1204] (2/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_53-75025-0_sp1.1 from training. Duration: 0.963625 2023-10-02 15:44:51,267 WARNING [train.py:1204] (2/4) Exclude cut with ID _6194_210_1534_1_1527511826160_3941194_321-148641-0 from training. Duration: 0.64 2023-10-02 15:44:51,298 WARNING [train.py:1204] (2/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_337-294431-0_sp1.1 from training. Duration: 0.936375 2023-10-02 15:44:52,670 WARNING [train.py:1204] (2/4) Exclude cut with ID _21632_210_2253_1_1531994001765_3744140_110-274654-0_sp1.1 from training. Duration: 0.7454375 2023-10-02 15:44:55,370 WARNING [train.py:1204] (2/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_135-174080-0_sp1.1 from training. Duration: 0.4818125 2023-10-02 15:44:55,460 WARNING [train.py:1204] (2/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_21-174514-0 from training. Duration: 0.43 2023-10-02 15:44:58,595 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_38965_210_4852_1_1533174906_3484735_215-294589-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 15:44:58,641 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_548-294316-0 from training. Duration: 0.81 2023-10-02 15:45:03,013 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.conv_module2.balancer1.prob, batch_count=932566.6666666666, ans=0.125 2023-10-02 15:45:05,514 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_103-84010-0_sp1.1 from training. Duration: 0.62725 2023-10-02 15:45:05,793 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=932566.6666666666, ans=0.1 2023-10-02 15:45:08,257 WARNING [train.py:1204] (2/4) Exclude cut with ID _18199_210_6109_1_1531475789864_4288790_295-198247-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 15:45:08,285 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_13757_210_3528_1_1530440682_4107239_103-313224-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 15:45:14,889 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_34886_210_10633_1_1533203882_1755661_67-294673-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 15:45:14,905 WARNING [train.py:1204] (2/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_350-101734-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 15:45:16,328 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_37357_210_17024_1_1533089504_2316121_204-153986-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 15:45:17,766 WARNING [train.py:1204] (2/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_673-115569-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 15:45:20,614 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_489-230703-0_sp0.9 from training. Duration: 0.9444375 2023-10-02 15:45:20,675 WARNING [train.py:1204] (2/4) Exclude cut with ID _5250_210_5219_1_1527249561806_8692110_575-102496-0_sp1.1 from training. Duration: 0.936375 2023-10-02 15:45:22,065 WARNING [train.py:1204] (2/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_669-93163-0 from training. Duration: 0.98 2023-10-02 15:45:23,426 WARNING [train.py:1204] (2/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_249-281349-0_sp0.9 from training. Duration: 0.9444375 2023-10-02 15:45:24,939 WARNING [train.py:1204] (2/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_106-30782-0 from training. Duration: 0.8 2023-10-02 15:45:26,251 WARNING [train.py:1204] (2/4) Exclude cut with ID _8772_210_1850_1_1528635595972_3664011_398-320049-0_sp1.1 from training. Duration: 0.8 2023-10-02 15:45:28,109 WARNING [train.py:1204] (2/4) Exclude cut with ID _44778_210_2054_1_1533798127203_7443090_516-151727-0_sp1.1 from training. Duration: 0.963625 2023-10-02 15:45:29,442 WARNING [train.py:1204] (2/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_325-62802-0_sp1.1 from training. Duration: 0.7909375 2023-10-02 15:45:32,262 WARNING [train.py:1204] (2/4) Exclude cut with ID _14368_210_4220_1_1530532714870_3595529_93-58002-0_sp0.9 from training. Duration: 0.6333125 2023-10-02 15:45:33,653 WARNING [train.py:1204] (2/4) Exclude cut with ID _4681_210_3525_1_1527251040321_1235713_123-24428-0_sp1.1 from training. Duration: 0.436375 2023-10-02 15:45:33,704 WARNING [train.py:1204] (2/4) Exclude cut with ID _5250_210_5219_1_1527249561806_8692110_1185-56473-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 15:45:35,153 WARNING [train.py:1204] (2/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_96-244299-0_sp1.1 from training. Duration: 0.8 2023-10-02 15:45:37,968 WARNING [train.py:1204] (2/4) Exclude cut with ID _59500_210_8254_1_1534809610680_3736009_104-72815-0_sp1.1 from training. Duration: 0.963625 2023-10-02 15:45:39,924 WARNING [train.py:1204] (2/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_468-283113-0_sp1.1 from training. Duration: 0.87275 2023-10-02 15:45:41,774 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6239_210_4211_1_1527765584_4306698_528-102186-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 15:45:41,994 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.bypass.skip_rate, batch_count=932700.0, ans=0.07 2023-10-02 15:45:43,577 WARNING [train.py:1204] (2/4) Exclude cut with ID _45297_210_2721_1_1533715351381_3264949_351-147136-0 from training. Duration: 0.87 2023-10-02 15:45:43,582 WARNING [train.py:1204] (2/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_520-51744-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 15:45:44,926 WARNING [train.py:1204] (2/4) Exclude cut with ID _5166_210_2084_1_1527073197160_4087993_310-114452-0_sp1.1 from training. Duration: 0.536375 2023-10-02 15:45:44,929 WARNING [train.py:1204] (2/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_450-165710-0_sp1.1 from training. Duration: 0.92725 2023-10-02 15:45:44,938 WARNING [train.py:1204] (2/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_280-120942-0_sp1.1 from training. Duration: 0.7 2023-10-02 15:45:45,154 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.attention_skip_rate, batch_count=932766.6666666666, ans=0.0 2023-10-02 15:45:46,221 INFO [train.py:1046] (2/4) Epoch 27, batch 1800, loss[loss=0.1859, simple_loss=0.2705, pruned_loss=0.05068, over 24018.00 frames. ], tot_loss[loss=0.1674, simple_loss=0.2447, pruned_loss=0.04509, over 4711526.00 frames. ], batch size: 86, lr: 3.80e-03, grad_scale: 16.0 2023-10-02 15:45:46,263 WARNING [train.py:1204] (2/4) Exclude cut with ID _65585_210_12210_1_1535450451584_3764098_112-294301-0_sp0.9 from training. Duration: 0.97775 2023-10-02 15:45:46,324 WARNING [train.py:1204] (2/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_98-51676-0_sp0.9 from training. Duration: 0.811125 2023-10-02 15:45:49,268 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.feed_forward1.out_proj.dropout_p, batch_count=932766.6666666666, ans=0.1 2023-10-02 15:45:49,274 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.ff2_skip_rate, batch_count=932766.6666666666, ans=0.0 2023-10-02 15:45:50,368 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_33348_210_5632_1_1533086912_3324030_85-28583-0_sp1.1 from training. Duration: 0.7454375 2023-10-02 15:45:51,699 WARNING [train.py:1204] (2/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_336-208232-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 15:45:53,134 WARNING [train.py:1204] (2/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_339-104791-0_sp0.9 from training. Duration: 0.5444375 2023-10-02 15:45:55,934 WARNING [train.py:1204] (2/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_559-29840-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 15:45:56,162 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.attention_skip_rate, batch_count=932766.6666666666, ans=0.0 2023-10-02 15:45:59,178 WARNING [train.py:1204] (2/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_117-290916-0_sp0.9 from training. Duration: 0.5666875 2023-10-02 15:45:59,260 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_43802_210_2290_1_1533516203_8829504_185-112289-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 15:46:02,032 WARNING [train.py:1204] (2/4) Exclude cut with ID _59256_210_5986_1_1535076085997_3589120_155-298797-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 15:46:04,810 WARNING [train.py:1204] (2/4) Exclude cut with ID _44659_210_2721_1_1534036120061_6614860_246-206531-0_sp1.1 from training. Duration: 0.92725 2023-10-02 15:46:04,851 WARNING [train.py:1204] (2/4) Exclude cut with ID _23818_210_6228_1_1531917063884_3676920_469-151944-0_sp1.1 from training. Duration: 0.92725 2023-10-02 15:46:04,924 WARNING [train.py:1204] (2/4) Exclude cut with ID _13252_210_1925_1_1530671372047_3336909_88-276668-0_sp1.1 from training. Duration: 0.9 2023-10-02 15:46:07,536 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_15101_210_7927_1_1530786415_5033274_320-327527-0_sp1.1 from training. Duration: 0.87275 2023-10-02 15:46:07,547 WARNING [train.py:1204] (2/4) Exclude cut with ID _14998_210_5219_1_1530705582080_7474970_446-112117-0 from training. Duration: 0.9 2023-10-02 15:46:08,919 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_30280_210_1881_1_1532872779_3660139_400-254159-0_sp1.1 from training. Duration: 0.936375 2023-10-02 15:46:12,606 WARNING [train.py:1204] (2/4) Exclude cut with ID _40284_210_2084_1_1533513679814_3704279_158-345341-0_sp1.1 from training. Duration: 0.936375 2023-10-02 15:46:15,962 WARNING [train.py:1204] (2/4) Exclude cut with ID 3033-130750-0096-55598-0 from training. Duration: 0.83 2023-10-02 15:46:18,670 WARNING [train.py:1204] (2/4) Exclude cut with ID _9389_210_1664_1_1529217337301_5289269_379-128794-0 from training. Duration: 0.85 2023-10-02 15:46:18,684 WARNING [train.py:1204] (2/4) Exclude cut with ID _3675_210_4557_1_1526639934408_4182089_456-18305-0 from training. Duration: 0.76 2023-10-02 15:46:18,925 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.feed_forward1.hidden_balancer.prob, batch_count=932900.0, ans=0.125 2023-10-02 15:46:20,040 WARNING [train.py:1204] (2/4) Exclude cut with ID _29853_210_7497_1_1532505600851_6642110_571-249460-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 15:46:21,404 WARNING [train.py:1204] (2/4) Exclude cut with ID _4068_210_2629_1_1526697350395_1546259_230-325568-0_sp1.1 from training. Duration: 0.92725 2023-10-02 15:46:21,404 WARNING [train.py:1204] (2/4) Exclude cut with ID _34415_210_8750_1_1533085213375_3451116_213-267570-0_sp1.1 from training. Duration: 0.963625 2023-10-02 15:46:21,476 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6767_210_3273_1_1527855783_1718932_142-127132-0_sp1.1 from training. Duration: 0.863625 2023-10-02 15:46:28,288 WARNING [train.py:1204] (2/4) Exclude cut with ID IC0653W0001-322925 from training. Duration: 0.896 2023-10-02 15:46:29,674 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_4528_210_3273_1_1527076654_3276777_209-219804-0_sp1.1 from training. Duration: 0.62725 2023-10-02 15:46:31,050 WARNING [train.py:1204] (2/4) Exclude cut with ID _55844_210_2346_1_1534471212569_7625707_187-204971-0_sp1.1 from training. Duration: 0.936375 2023-10-02 15:46:33,082 WARNING [train.py:1204] (2/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_369-325184-0 from training. Duration: 0.99 2023-10-02 15:46:34,371 WARNING [train.py:1204] (2/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_791-45149-0 from training. Duration: 0.86 2023-10-02 15:46:34,416 WARNING [train.py:1204] (2/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_317-211107-0_sp1.1 from training. Duration: 0.836375 2023-10-02 15:46:35,786 WARNING [train.py:1204] (2/4) Exclude cut with ID _30800_210_22382_1_1532499347904_4515959_488-330913-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 15:46:37,033 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.417e+02 1.793e+02 1.976e+02 2.310e+02 4.121e+02, threshold=3.953e+02, percent-clipped=0.0 2023-10-02 15:46:37,152 WARNING [train.py:1204] (2/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_506-42614-0_sp1.1 from training. Duration: 0.7181875 2023-10-02 15:46:37,400 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.conv_skip_rate, batch_count=932966.6666666666, ans=0.0 2023-10-02 15:46:41,882 WARNING [train.py:1204] (2/4) Exclude cut with ID _8696_210_1876_1_1528545632440_3345058_209-54069-0 from training. Duration: 0.67 2023-10-02 15:46:47,993 WARNING [train.py:1204] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_215-202973-0_sp1.1 from training. Duration: 0.8 2023-10-02 15:46:49,448 WARNING [train.py:1204] (2/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_321-215742-0 from training. Duration: 0.5 2023-10-02 15:46:50,781 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_40896_210_15710_1_1533433956_7100802_428-32114-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 15:46:50,790 WARNING [train.py:1204] (2/4) Exclude cut with ID _27013_210_1389_1_1532437173240_3252649_467-263151-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 15:46:50,839 WARNING [train.py:1204] (2/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_734-129761-0_sp0.9 from training. Duration: 0.911125 2023-10-02 15:46:50,878 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_521-214276-0 from training. Duration: 0.44 2023-10-02 15:46:53,632 WARNING [train.py:1204] (2/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_80-147301-0_sp1.1 from training. Duration: 0.763625 2023-10-02 15:46:53,634 WARNING [train.py:1204] (2/4) Exclude cut with ID _9679_210_7497_1_1529129212906_4363929_56-307796-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 15:46:56,269 WARNING [train.py:1204] (2/4) Exclude cut with ID _14832_210_8750_1_1530691351539_3171348_1-54315-0 from training. Duration: 0.87 2023-10-02 15:46:56,270 WARNING [train.py:1204] (2/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_558-155750-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 15:46:57,681 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_142-309370-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 15:46:57,705 WARNING [train.py:1204] (2/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_18-46756-0_sp0.9 from training. Duration: 0.87775 2023-10-02 15:46:57,712 WARNING [train.py:1204] (2/4) Exclude cut with ID _61148_210_6897_1_1535094022726_4217610_279-212159-0_sp1.1 from training. Duration: 0.936375 2023-10-02 15:46:58,968 INFO [train.py:1046] (2/4) Epoch 27, batch 1850, loss[loss=0.1878, simple_loss=0.2562, pruned_loss=0.05972, over 23654.00 frames. ], tot_loss[loss=0.1678, simple_loss=0.2451, pruned_loss=0.04528, over 4708196.11 frames. ], batch size: 256, lr: 3.80e-03, grad_scale: 16.0 2023-10-02 15:47:00,420 WARNING [train.py:1204] (2/4) Exclude cut with ID _21987_210_5854_1_1531821500482_3637038_164-2641-0_sp1.1 from training. Duration: 0.936375 2023-10-02 15:47:00,465 WARNING [train.py:1204] (2/4) Exclude cut with ID _8064_210_5399_1_1528372299291_4282973_395-340852-0_sp1.1 from training. Duration: 0.8454375 2023-10-02 15:47:03,220 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1077-184261-0_sp1.1 from training. Duration: 0.963625 2023-10-02 15:47:03,228 WARNING [train.py:1204] (2/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_1176-204992-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 15:47:05,188 WARNING [train.py:1204] (2/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_563-347832-0_sp1.1 from training. Duration: 0.8090625 2023-10-02 15:47:06,477 WARNING [train.py:1204] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_380-204756-0_sp0.9 from training. Duration: 0.9555625 2023-10-02 15:47:12,413 WARNING [train.py:1204] (2/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_212-10501-0_sp1.1 from training. Duration: 0.8545625 2023-10-02 15:47:12,436 WARNING [train.py:1204] (2/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_445-117398-0 from training. Duration: 0.89 2023-10-02 15:47:17,205 WARNING [train.py:1204] (2/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_29-280622-0 from training. Duration: 0.82 2023-10-02 15:47:19,942 WARNING [train.py:1204] (2/4) Exclude cut with ID _26664_210_7770_1_1532258943280_3418270_187-80815-0 from training. Duration: 0.93 2023-10-02 15:47:20,678 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.3.encoder.layers.2.feed_forward2.out_whiten, num_groups=1, num_channels=512, metric=9.35 vs. limit=15.0 2023-10-02 15:47:22,766 WARNING [train.py:1204] (2/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_414-349640-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 15:47:22,783 WARNING [train.py:1204] (2/4) Exclude cut with ID _45021_210_9994_1_1533619751094_3330288_96-239010-0 from training. Duration: 0.91 2023-10-02 15:47:22,786 WARNING [train.py:1204] (2/4) Exclude cut with ID _5189_210_4557_1_1527248319428_3769229_199-53526-0_sp0.9 from training. Duration: 0.4666875 2023-10-02 15:47:32,445 WARNING [train.py:1204] (2/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_63-101996-0_sp1.1 from training. Duration: 0.8 2023-10-02 15:47:33,848 WARNING [train.py:1204] (2/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_199-86432-0 from training. Duration: 0.98 2023-10-02 15:47:35,243 WARNING [train.py:1204] (2/4) Exclude cut with ID _36399_210_16049_1_1532998771377_3698333_207-259013-0_sp1.1 from training. Duration: 0.92725 2023-10-02 15:47:35,281 WARNING [train.py:1204] (2/4) Exclude cut with ID _62574_210_2655_1_1535180407058_3933249_567-111756-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 15:47:39,930 WARNING [train.py:1204] (2/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_436-318113-0 from training. Duration: 0.75 2023-10-02 15:47:39,978 WARNING [train.py:1204] (2/4) Exclude cut with ID _53379_210_20034_1_1534222027363_3854654_43-305214-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 15:47:41,781 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_15126_210_6753_1_1530778677_2591185_113-157651-0_sp1.1 from training. Duration: 0.6181875 2023-10-02 15:47:42,531 WARNING [train.py:1204] (2/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_146-218763-0_sp1.1 from training. Duration: 0.7818125 2023-10-02 15:47:43,307 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.2.encoder.layers.1.nonlin_attention.whiten2, num_groups=1, num_channels=384, metric=6.05 vs. limit=15.0 2023-10-02 15:47:45,205 WARNING [train.py:1204] (2/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_714-310538-0_sp0.9 from training. Duration: 0.9555625 2023-10-02 15:47:46,638 WARNING [train.py:1204] (2/4) Exclude cut with ID _24271_210_12658_1_1531987270022_7190519_709-16721-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 15:47:49,410 WARNING [train.py:1204] (2/4) Exclude cut with ID _5190_210_4557_1_1527244368799_373439_28-197030-0_sp0.9 from training. Duration: 0.8 2023-10-02 15:47:53,264 WARNING [train.py:1204] (2/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_267-197536-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 15:47:54,611 WARNING [train.py:1204] (2/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_658-282610-0_sp1.1 from training. Duration: 0.4818125 2023-10-02 15:47:54,620 WARNING [train.py:1204] (2/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_257-67050-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 15:47:56,076 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_14906_210_7927_1_1530754190_9408633_961-53402-0_sp1.1 from training. Duration: 0.936375 2023-10-02 15:47:57,625 WARNING [train.py:1204] (2/4) Exclude cut with ID _60113_210_16271_1_1535164212481_4386079_490-280290-0_sp1.1 from training. Duration: 0.8909375 2023-10-02 15:47:59,124 WARNING [train.py:1204] (2/4) Exclude cut with ID _14100_210_8341_1_1530529320613_3678069_258-224599-0 from training. Duration: 0.77 2023-10-02 15:47:59,171 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6961_210_2087_1_1527937168_6815011_565-326898-0_sp1.1 from training. Duration: 0.936375 2023-10-02 15:48:00,247 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.0.layers.0.whiten, num_groups=1, num_channels=192, metric=5.89 vs. limit=12.0 2023-10-02 15:48:03,320 WARNING [train.py:1204] (2/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_239-316802-0_sp1.1 from training. Duration: 0.62725 2023-10-02 15:48:03,386 WARNING [train.py:1204] (2/4) Exclude cut with ID _55857_210_2346_1_1534644007831_6898209_745-98391-0_sp1.1 from training. Duration: 0.6909375 2023-10-02 15:48:03,389 WARNING [train.py:1204] (2/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_154-135696-0 from training. Duration: 0.8 2023-10-02 15:48:03,391 WARNING [train.py:1204] (2/4) Exclude cut with ID _14368_210_4220_1_1530532714870_3595529_93-58002-0 from training. Duration: 0.57 2023-10-02 15:48:03,529 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.1.ff2_skip_rate, batch_count=933366.6666666666, ans=0.0 2023-10-02 15:48:06,009 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9025W0005-507335 from training. Duration: 0.896 2023-10-02 15:48:06,087 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9029W0034-509435 from training. Duration: 0.64 2023-10-02 15:48:08,122 WARNING [train.py:1204] (2/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_76-203867-0_sp1.1 from training. Duration: 0.6545625 2023-10-02 15:48:09,395 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_46639_210_8341_1_1533726088_4495500_66-346325-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 15:48:09,402 WARNING [train.py:1204] (2/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_145-137790-0_sp0.9 from training. Duration: 0.92225 2023-10-02 15:48:09,416 WARNING [train.py:1204] (2/4) Exclude cut with ID _36522_210_8887_1_1533177030611_1772478_7-327299-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 15:48:09,500 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9042W0009-515615 from training. Duration: 0.896 2023-10-02 15:48:09,508 WARNING [train.py:1204] (2/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_39-248870-0_sp0.9 from training. Duration: 0.7666875 2023-10-02 15:48:10,872 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_35441_210_16554_1_1533347572_4092680_362-242912-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 15:48:12,758 WARNING [train.py:1204] (2/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_216-343007-0_sp1.1 from training. Duration: 0.6 2023-10-02 15:48:13,061 INFO [scaling.py:1118] (2/4) WithLoss: name=encoder.encoders.3.encoder.layers.0.self_attn_weights, loss-sum=0.000e+00 2023-10-02 15:48:14,560 WARNING [train.py:1204] (2/4) Exclude cut with ID _3867_210_1883_1_1526641200575_4475830_272-215563-0_sp0.9 from training. Duration: 0.6333125 2023-10-02 15:48:15,789 INFO [train.py:1046] (2/4) Epoch 27, batch 1900, loss[loss=0.1594, simple_loss=0.2517, pruned_loss=0.03353, over 24321.00 frames. ], tot_loss[loss=0.1674, simple_loss=0.2451, pruned_loss=0.04487, over 4719560.08 frames. ], batch size: 74, lr: 3.80e-03, grad_scale: 16.0 2023-10-02 15:48:15,871 WARNING [train.py:1204] (2/4) Exclude cut with ID _21783_210_11357_1_1531742458730_3762679_553-175603-0_sp1.1 from training. Duration: 0.92725 2023-10-02 15:48:15,887 WARNING [train.py:1204] (2/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_963-187109-0 from training. Duration: 0.99 2023-10-02 15:48:19,145 WARNING [train.py:1204] (2/4) Exclude cut with ID _44973_210_1511_1_1533794381163_3634770_495-211466-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 15:48:19,162 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9068W0002-529085 from training. Duration: 0.896 2023-10-02 15:48:19,176 WARNING [train.py:1204] (2/4) Exclude cut with ID _5294_210_1747_1_1527249643928_3696179_180-100713-0_sp1.1 from training. Duration: 0.736375 2023-10-02 15:48:20,576 WARNING [train.py:1204] (2/4) Exclude cut with ID _35799_210_5854_1_1533263487887_3529071_149-49689-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 15:48:26,079 WARNING [train.py:1204] (2/4) Exclude cut with ID _67725_210_27767_1_1535608756671_5105359_36-220794-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 15:48:27,480 WARNING [train.py:1204] (2/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_338-96520-0_sp1.1 from training. Duration: 0.8545625 2023-10-02 15:48:28,839 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9104W0004-547160 from training. Duration: 0.896 2023-10-02 15:48:30,202 WARNING [train.py:1204] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_330-327861-0 from training. Duration: 0.67 2023-10-02 15:48:31,588 WARNING [train.py:1204] (2/4) Exclude cut with ID _14087_210_2253_1_1530529224512_3014029_156-257607-0_sp0.9 from training. Duration: 0.92225 2023-10-02 15:48:31,642 WARNING [train.py:1204] (2/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_476-322050-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 15:48:31,656 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9118W0017-553385 from training. Duration: 0.896 2023-10-02 15:48:31,686 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9120W0028-554435 from training. Duration: 0.896 2023-10-02 15:48:31,825 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.conv_module1.balancer1.max_abs, batch_count=933500.0, ans=10.0 2023-10-02 15:48:35,784 WARNING [train.py:1204] (2/4) Exclude cut with ID _56524_210_9642_1_1534572011779_3619170_145-310842-0 from training. Duration: 0.99 2023-10-02 15:48:37,137 WARNING [train.py:1204] (2/4) Exclude cut with ID _51631_210_8254_1_1534129228420_4084794_270-222508-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 15:48:41,898 WARNING [train.py:1204] (2/4) Exclude cut with ID _14268_210_9206_1_1530622771285_4714099_331-105351-0 from training. Duration: 0.65 2023-10-02 15:48:43,244 WARNING [train.py:1204] (2/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_40-129756-0 from training. Duration: 0.81 2023-10-02 15:48:54,486 WARNING [train.py:1204] (2/4) Exclude cut with ID _3541_210_2477_1_1526475408709_3263973_231-104736-0 from training. Duration: 0.78 2023-10-02 15:48:57,164 WARNING [train.py:1204] (2/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_214-291369-0 from training. Duration: 0.63 2023-10-02 15:48:57,176 WARNING [train.py:1204] (2/4) Exclude cut with ID _62574_210_2655_1_1535180407058_3933249_369-84347-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 15:48:58,491 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9210W0111-599240 from training. Duration: 0.896 2023-10-02 15:48:58,496 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_92-246270-0 from training. Duration: 0.85 2023-10-02 15:48:58,535 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_37152_210_9697_1_1533106573_3844188_29-239070-0 from training. Duration: 0.98 2023-10-02 15:49:00,017 WARNING [train.py:1204] (2/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_239-22076-0 from training. Duration: 0.85 2023-10-02 15:49:00,019 WARNING [train.py:1204] (2/4) Exclude cut with ID _65962_210_29927_1_1535436026519_4174930_368-44639-0_sp1.1 from training. Duration: 0.97275 2023-10-02 15:49:04,134 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.0.feed_forward1.out_proj.dropout_p, batch_count=933633.3333333334, ans=0.1 2023-10-02 15:49:05,347 WARNING [train.py:1204] (2/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_152-212533-0 from training. Duration: 0.97 2023-10-02 15:49:06,698 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.503e+02 1.841e+02 1.943e+02 2.107e+02 2.840e+02, threshold=3.886e+02, percent-clipped=0.0 2023-10-02 15:49:06,868 WARNING [train.py:1204] (2/4) Exclude cut with ID _14603_210_4220_1_1530615688441_3097319_90-31383-0_sp1.1 from training. Duration: 0.7909375 2023-10-02 15:49:10,915 WARNING [train.py:1204] (2/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_196-152868-0_sp1.1 from training. Duration: 0.92725 2023-10-02 15:49:10,917 WARNING [train.py:1204] (2/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_140-181176-0 from training. Duration: 0.92 2023-10-02 15:49:12,340 WARNING [train.py:1204] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_168-32906-0_sp0.9 from training. Duration: 0.7666875 2023-10-02 15:49:15,643 WARNING [train.py:1204] (2/4) Exclude cut with ID _14922_210_6228_1_1530707435273_3693310_13-165512-0 from training. Duration: 0.82 2023-10-02 15:49:15,677 WARNING [train.py:1204] (2/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_381-317791-0_sp0.9 from training. Duration: 0.811125 2023-10-02 15:49:20,436 INFO [scaling.py:1118] (2/4) WithLoss: name=encoder.encoders.3.encoder.layers.1.self_attn_weights, loss-sum=0.000e+00 2023-10-02 15:49:22,774 WARNING [train.py:1204] (2/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_63-267585-0_sp1.1 from training. Duration: 0.8181875 2023-10-02 15:49:22,776 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_326-303945-0_sp1.1 from training. Duration: 0.9 2023-10-02 15:49:22,788 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_194-143381-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 15:49:23,927 WARNING [train.py:1204] (2/4) Exclude cut with ID _39290_210_4220_1_1533862778760_3075118_194-16667-0_sp1.1 from training. Duration: 0.963625 2023-10-02 15:49:25,322 WARNING [train.py:1204] (2/4) Exclude cut with ID _15012_210_1968_1_1530856820429_5095350_662-210469-0_sp0.9 from training. Duration: 0.6333125 2023-10-02 15:49:25,358 WARNING [train.py:1204] (2/4) Exclude cut with ID _4697_210_2069_1_1526815801953_3823098_68-219807-0_sp1.1 from training. Duration: 0.42725 2023-10-02 15:49:26,652 WARNING [train.py:1204] (2/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_321-22821-0_sp0.9 from training. Duration: 0.988875 2023-10-02 15:49:29,314 INFO [train.py:1046] (2/4) Epoch 27, batch 1950, loss[loss=0.17, simple_loss=0.2495, pruned_loss=0.04523, over 23349.00 frames. ], tot_loss[loss=0.1678, simple_loss=0.2456, pruned_loss=0.04502, over 4723889.72 frames. ], batch size: 105, lr: 3.80e-03, grad_scale: 16.0 2023-10-02 15:49:30,775 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_14056_210_4163_1_1530604483_7572827_1037-45317-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 15:49:30,776 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_403-39619-0_sp1.1 from training. Duration: 0.763625 2023-10-02 15:49:33,544 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_20120_210_6753_1_1531643035_5482859_510-96996-0_sp0.9 from training. Duration: 0.9666875 2023-10-02 15:49:33,547 WARNING [train.py:1204] (2/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_225-180155-0_sp1.1 from training. Duration: 0.92725 2023-10-02 15:49:33,590 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_278-272612-0_sp0.9 from training. Duration: 0.811125 2023-10-02 15:49:34,932 WARNING [train.py:1204] (2/4) Exclude cut with ID _53713_210_24116_1_1534314487762_7416751_691-70244-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 15:49:38,979 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6788_210_1388_1_1527898698_4562021_91-197981-0_sp1.1 from training. Duration: 0.7545625 2023-10-02 15:49:39,120 WARNING [train.py:1204] (2/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_60-18123-0_sp0.9 from training. Duration: 0.97775 2023-10-02 15:49:40,501 WARNING [train.py:1204] (2/4) Exclude cut with ID _35799_210_5854_1_1533263487887_3529071_5-214617-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 15:49:40,512 WARNING [train.py:1204] (2/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_890-5117-0_sp1.1 from training. Duration: 0.7090625 2023-10-02 15:49:40,800 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.feed_forward2.hidden_balancer.prob, batch_count=933766.6666666666, ans=0.125 2023-10-02 15:49:41,945 WARNING [train.py:1204] (2/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_660-211088-0 from training. Duration: 0.62 2023-10-02 15:49:43,273 WARNING [train.py:1204] (2/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_314-126819-0_sp0.9 from training. Duration: 0.5555625 2023-10-02 15:49:43,299 WARNING [train.py:1204] (2/4) Exclude cut with ID _21632_210_2253_1_1531994001765_3744140_362-66225-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 15:49:44,606 WARNING [train.py:1204] (2/4) Exclude cut with ID _61118_210_9527_1_1534897668884_3752630_173-258555-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 15:49:45,708 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.0.layers.0.self_attn_weights.whiten_keys, num_groups=4, num_channels=128, metric=2.86 vs. limit=6.0 2023-10-02 15:49:46,527 WARNING [train.py:1204] (2/4) Exclude cut with ID _7719_210_3525_1_1528456153163_4296960_88-250259-0_sp0.9 from training. Duration: 0.9333125 2023-10-02 15:49:46,549 WARNING [train.py:1204] (2/4) Exclude cut with ID _15140_210_9288_1_1530791388450_4880149_539-153708-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 15:49:46,584 WARNING [train.py:1204] (2/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_253-42544-0_sp1.1 from training. Duration: 0.97275 2023-10-02 15:49:48,058 WARNING [train.py:1204] (2/4) Exclude cut with ID _33640_210_2655_1_1533117605479_4855469_479-296877-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 15:49:51,344 WARNING [train.py:1204] (2/4) Exclude cut with ID 3033-130750-0096-55598-0_sp1.1 from training. Duration: 0.7545625 2023-10-02 15:49:51,356 WARNING [train.py:1204] (2/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_191-41023-0_sp1.1 from training. Duration: 0.6454375 2023-10-02 15:49:51,370 WARNING [train.py:1204] (2/4) Exclude cut with ID _8365_210_2064_1_1528592311070_4677690_431-294102-0_sp1.1 from training. Duration: 0.8090625 2023-10-02 15:49:53,177 WARNING [train.py:1204] (2/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_893-296030-0_sp1.1 from training. Duration: 0.97275 2023-10-02 15:49:56,340 WARNING [train.py:1204] (2/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_401-343820-0_sp1.1 from training. Duration: 0.97275 2023-10-02 15:49:57,827 WARNING [train.py:1204] (2/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_127-566-0_sp1.1 from training. Duration: 0.763625 2023-10-02 15:49:57,828 WARNING [train.py:1204] (2/4) Exclude cut with ID _14100_210_8341_1_1530529320613_3678069_126-272847-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 15:49:57,843 WARNING [train.py:1204] (2/4) Exclude cut with ID _15133_210_6228_1_1530794512074_3080379_29-264458-0_sp1.1 from training. Duration: 0.52725 2023-10-02 15:49:57,844 WARNING [train.py:1204] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_127-119109-0 from training. Duration: 0.72 2023-10-02 15:49:59,231 WARNING [train.py:1204] (2/4) Exclude cut with ID _6537_210_1883_1_1527987559710_3597129_382-133062-0_sp1.1 from training. Duration: 0.6181875 2023-10-02 15:49:59,246 WARNING [train.py:1204] (2/4) Exclude cut with ID _29529_210_7234_1_1532393961455_3977460_391-219682-0_sp1.1 from training. Duration: 0.936375 2023-10-02 15:50:00,497 WARNING [train.py:1204] (2/4) Exclude cut with ID _37537_210_2253_1_1533171758568_3421949_96-137189-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 15:50:03,279 WARNING [train.py:1204] (2/4) Exclude cut with ID _58414_210_8254_1_1535421609162_7151182_15-276301-0_sp1.1 from training. Duration: 0.97275 2023-10-02 15:50:04,796 WARNING [train.py:1204] (2/4) Exclude cut with ID _59502_210_12210_1_1535107024698_1835139_110-289074-0_sp1.1 from training. Duration: 0.92725 2023-10-02 15:50:08,818 WARNING [train.py:1204] (2/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_367-61330-0_sp1.1 from training. Duration: 0.6818125 2023-10-02 15:50:10,454 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.conv_module2.balancer1.prob, batch_count=933900.0, ans=0.125 2023-10-02 15:50:11,564 WARNING [train.py:1204] (2/4) Exclude cut with ID _45297_210_2721_1_1533715351381_3264949_353-9541-0_sp1.1 from training. Duration: 0.8818125 2023-10-02 15:50:11,609 WARNING [train.py:1204] (2/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_85-138257-0_sp0.9 from training. Duration: 0.788875 2023-10-02 15:50:12,885 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_3787_210_2040_1_1526477544_5478586_603-120324-0 from training. Duration: 0.81 2023-10-02 15:50:12,931 WARNING [train.py:1204] (2/4) Exclude cut with ID _42863_210_9070_1_1533902398788_3696920_411-204131-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 15:50:16,110 WARNING [train.py:1204] (2/4) Exclude cut with ID _30196_210_1664_1_1533708496522_7074140_822-289909-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 15:50:16,316 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.conv_module1.balancer2.prob, batch_count=933966.6666666666, ans=0.125 2023-10-02 15:50:17,536 WARNING [train.py:1204] (2/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_32-335103-0_sp0.9 from training. Duration: 0.911125 2023-10-02 15:50:17,595 WARNING [train.py:1204] (2/4) Exclude cut with ID _6493_210_1747_1_1528459210424_4012470_513-140967-0_sp0.9 from training. Duration: 0.87775 2023-10-02 15:50:25,572 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.balancer1.prob, batch_count=933966.6666666666, ans=0.125 2023-10-02 15:50:26,729 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_47896_210_20508_1_1534492518_5958868_439-257766-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 15:50:28,156 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_1294-63152-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 15:50:29,530 WARNING [train.py:1204] (2/4) Exclude cut with ID _59170_210_6721_1_1534983633960_4203335_319-166512-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 15:50:31,014 WARNING [train.py:1204] (2/4) Exclude cut with ID _3541_210_2477_1_1526475408709_3263973_146-3470-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 15:50:34,912 WARNING [train.py:1204] (2/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_678-159307-0_sp1.1 from training. Duration: 0.77275 2023-10-02 15:50:36,213 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_343-48822-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 15:50:36,270 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_4692_210_3528_1_1526899755_4323254_107-44401-0 from training. Duration: 0.86 2023-10-02 15:50:36,274 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6291_210_3273_1_1527683649_1078498_74-292608-0_sp1.1 from training. Duration: 0.6545625 2023-10-02 15:50:36,341 WARNING [train.py:1204] (2/4) Exclude cut with ID _55644_210_11856_1_1534593599582_4103719_293-248694-0_sp1.1 from training. Duration: 0.97275 2023-10-02 15:50:37,718 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_3172_210_2087_1_1526292929_3987865_197-97762-0 from training. Duration: 0.75 2023-10-02 15:50:38,000 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.feed_forward1.out_proj.dropout_p, batch_count=934033.3333333334, ans=0.1 2023-10-02 15:50:39,154 WARNING [train.py:1204] (2/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_264-348896-0_sp0.9 from training. Duration: 0.9444375 2023-10-02 15:50:41,903 INFO [train.py:1046] (2/4) Epoch 27, batch 2000, loss[loss=0.1574, simple_loss=0.2275, pruned_loss=0.04361, over 23667.00 frames. ], tot_loss[loss=0.168, simple_loss=0.2458, pruned_loss=0.04517, over 4728393.59 frames. ], batch size: 149, lr: 3.80e-03, grad_scale: 32.0 2023-10-02 15:50:42,095 WARNING [train.py:1204] (2/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_209-205365-0_sp0.9 from training. Duration: 0.87775 2023-10-02 15:50:44,086 WARNING [train.py:1204] (2/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_222-198127-0_sp0.9 from training. Duration: 0.9333125 2023-10-02 15:50:44,130 WARNING [train.py:1204] (2/4) Exclude cut with ID _57572_210_5517_1_1534921219624_3809484_52-43530-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 15:50:45,562 WARNING [train.py:1204] (2/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_96-320736-0_sp1.1 from training. Duration: 0.7818125 2023-10-02 15:50:46,930 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_13091_210_7925_1_1530609447_8987354_1159-225508-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 15:50:49,597 WARNING [train.py:1204] (2/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_237-44332-0 from training. Duration: 0.94 2023-10-02 15:50:49,645 WARNING [train.py:1204] (2/4) Exclude cut with ID _7970_210_1883_1_1528455616296_3472230_25-178407-0_sp0.9 from training. Duration: 0.788875 2023-10-02 15:50:53,271 WARNING [train.py:1204] (2/4) Exclude cut with ID _29003_210_19596_1_1532507391720_3894098_357-257062-0_sp1.1 from training. Duration: 0.936375 2023-10-02 15:50:55,938 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_809-17156-0 from training. Duration: 0.73 2023-10-02 15:50:57,397 WARNING [train.py:1204] (2/4) Exclude cut with ID _7160_210_1876_1_1527940923231_3703799_302-150728-0_sp1.1 from training. Duration: 0.7090625 2023-10-02 15:50:57,404 WARNING [train.py:1204] (2/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_444-28041-0_sp0.9 from training. Duration: 0.9444375 2023-10-02 15:51:00,199 WARNING [train.py:1204] (2/4) Exclude cut with ID _52892_210_6721_1_1534378941259_4124129_278-251666-0_sp1.1 from training. Duration: 0.92725 2023-10-02 15:51:01,665 WARNING [train.py:1204] (2/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_115-326778-0 from training. Duration: 0.93 2023-10-02 15:51:03,061 WARNING [train.py:1204] (2/4) Exclude cut with ID _41391_210_7994_1_1533603771872_3638201_31-188961-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 15:51:04,428 WARNING [train.py:1204] (2/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_1073-6264-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 15:51:04,446 WARNING [train.py:1204] (2/4) Exclude cut with ID _66868_210_9527_1_1535509287954_4561140_435-259515-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 15:51:04,615 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.conv_module2.balancer2.prob, batch_count=934166.6666666666, ans=0.125 2023-10-02 15:51:05,796 WARNING [train.py:1204] (2/4) Exclude cut with ID _14886_210_4220_1_1530702020355_3516190_159-73748-0 from training. Duration: 0.83 2023-10-02 15:51:05,838 WARNING [train.py:1204] (2/4) Exclude cut with ID _15150_210_8341_1_1530752411803_7256384_469-239509-0_sp1.1 from training. Duration: 0.6181875 2023-10-02 15:51:07,267 WARNING [train.py:1204] (2/4) Exclude cut with ID _8064_210_5399_1_1528372299291_4282973_395-340852-0 from training. Duration: 0.93 2023-10-02 15:51:07,270 WARNING [train.py:1204] (2/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_138-187031-0_sp1.1 from training. Duration: 0.9 2023-10-02 15:51:11,523 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_13757_210_3528_1_1530440682_4107239_48-106347-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 15:51:11,603 WARNING [train.py:1204] (2/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_164-224052-0_sp1.1 from training. Duration: 0.636375 2023-10-02 15:51:13,297 WARNING [train.py:1204] (2/4) Exclude cut with ID _59288_210_5854_1_1535077757774_3412246_408-322648-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 15:51:13,339 WARNING [train.py:1204] (2/4) Exclude cut with ID _30191_210_1664_1_1533189915047_7196670_827-36359-0_sp1.1 from training. Duration: 0.963625 2023-10-02 15:51:13,421 WARNING [train.py:1204] (2/4) Exclude cut with ID _4680_210_3525_1_1527249757953_893386_75-78979-0_sp1.1 from training. Duration: 0.82725 2023-10-02 15:51:13,611 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.bypass_mid.scale_min, batch_count=934233.3333333334, ans=0.2 2023-10-02 15:51:15,227 WARNING [train.py:1204] (2/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_383-165234-0 from training. Duration: 0.94 2023-10-02 15:51:17,945 WARNING [train.py:1204] (2/4) Exclude cut with ID _19896_210_1883_1_1531709022662_6649569_94-229927-0 from training. Duration: 0.97 2023-10-02 15:51:17,951 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6788_210_1388_1_1527898698_4562021_180-75288-0_sp1.1 from training. Duration: 0.9 2023-10-02 15:51:17,959 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_26433_210_12108_1_1532157158_4056480_54-321349-0_sp1.1 from training. Duration: 0.97275 2023-10-02 15:51:23,014 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.feed_forward1.out_proj.dropout_p, batch_count=934233.3333333334, ans=0.1 2023-10-02 15:51:24,076 WARNING [train.py:1204] (2/4) Exclude cut with ID _56108_210_16380_1_1534505446159_3887959_265-161493-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 15:51:25,443 WARNING [train.py:1204] (2/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_395-111602-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 15:51:25,455 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_154-316450-0_sp0.9 from training. Duration: 0.8444375 2023-10-02 15:51:26,786 WARNING [train.py:1204] (2/4) Exclude cut with ID _43552_210_8913_1_1533726031059_3523429_381-116041-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 15:51:28,103 WARNING [train.py:1204] (2/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_65-104124-0_sp1.1 from training. Duration: 0.963625 2023-10-02 15:51:28,154 WARNING [train.py:1204] (2/4) Exclude cut with ID _6536_210_1883_1_1527850783339_3319069_169-23811-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 15:51:28,184 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_289-180904-0_sp0.9 from training. Duration: 0.8444375 2023-10-02 15:51:28,195 WARNING [train.py:1204] (2/4) Exclude cut with ID _56406_210_9288_1_1534899618664_3496149_226-242400-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 15:51:30,795 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_24899_210_16554_1_1532046422_3827462_276-216330-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 15:51:33,408 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.515e+02 1.846e+02 2.049e+02 2.214e+02 2.926e+02, threshold=4.097e+02, percent-clipped=0.0 2023-10-02 15:51:33,503 WARNING [train.py:1204] (2/4) Exclude cut with ID _29345_210_4381_1_1532689168263_3860169_198-174328-0_sp1.1 from training. Duration: 0.82725 2023-10-02 15:51:33,564 WARNING [train.py:1204] (2/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_338-96520-0 from training. Duration: 0.94 2023-10-02 15:51:37,586 WARNING [train.py:1204] (2/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_7-111954-0_sp1.1 from training. Duration: 0.5090625 2023-10-02 15:51:38,937 WARNING [train.py:1204] (2/4) Exclude cut with ID _36692_210_5665_1_1533608967420_398809_8-16261-0_sp1.1 from training. Duration: 0.97275 2023-10-02 15:51:42,919 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.3.encoder.layers.2.feed_forward1.out_whiten, num_groups=1, num_channels=512, metric=8.75 vs. limit=15.0 2023-10-02 15:51:43,658 WARNING [train.py:1204] (2/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_86-212657-0_sp1.1 from training. Duration: 0.97275 2023-10-02 15:51:43,665 WARNING [train.py:1204] (2/4) Exclude cut with ID _44659_210_2721_1_1534036120061_6614860_626-298884-0_sp1.1 from training. Duration: 0.8818125 2023-10-02 15:51:45,691 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.conv_skip_rate, batch_count=934366.6666666666, ans=0.0 2023-10-02 15:51:45,759 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.conv_module1.balancer1.prob, batch_count=934366.6666666666, ans=0.125 2023-10-02 15:51:48,334 WARNING [train.py:1204] (2/4) Exclude cut with ID _49755_210_18053_1_1534581005846_3218709_120-257918-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 15:51:49,886 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.conv_module1.balancer2.prob, batch_count=934366.6666666666, ans=0.125 2023-10-02 15:51:51,108 WARNING [train.py:1204] (2/4) Exclude cut with ID _21881_210_3525_1_1531825242757_3798380_400-113821-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 15:51:51,111 WARNING [train.py:1204] (2/4) Exclude cut with ID _21783_210_11357_1_1531742458730_3762679_387-236773-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 15:51:52,460 WARNING [train.py:1204] (2/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_613-232856-0_sp1.1 from training. Duration: 0.4909375 2023-10-02 15:51:52,490 WARNING [train.py:1204] (2/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_181-78632-0_sp0.9 from training. Duration: 0.8666875 2023-10-02 15:51:53,861 WARNING [train.py:1204] (2/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_175-17485-0_sp1.1 from training. Duration: 0.97275 2023-10-02 15:51:55,651 INFO [train.py:1046] (2/4) Epoch 27, batch 2050, loss[loss=0.1496, simple_loss=0.2269, pruned_loss=0.03615, over 24558.00 frames. ], tot_loss[loss=0.1674, simple_loss=0.2449, pruned_loss=0.04496, over 4735773.41 frames. ], batch size: 60, lr: 3.80e-03, grad_scale: 32.0 2023-10-02 15:51:55,734 WARNING [train.py:1204] (2/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_1004-183422-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 15:51:57,902 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.balancer2.prob, batch_count=934433.3333333334, ans=0.125 2023-10-02 15:51:59,070 WARNING [train.py:1204] (2/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_32-65962-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 15:52:00,390 WARNING [train.py:1204] (2/4) Exclude cut with ID _15458_210_8341_1_1530858614263_3834410_40-298664-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 15:52:00,687 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.conv_module2.balancer1.prob, batch_count=934433.3333333334, ans=0.125 2023-10-02 15:52:04,679 WARNING [train.py:1204] (2/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_380-261863-0_sp1.1 from training. Duration: 0.963625 2023-10-02 15:52:07,431 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_372-173012-0_sp1.1 from training. Duration: 0.863625 2023-10-02 15:52:07,490 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_57-82205-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 15:52:07,542 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_3691_210_3528_1_1526468169_4062053_355-334432-0_sp0.9 from training. Duration: 0.9666875 2023-10-02 15:52:08,974 WARNING [train.py:1204] (2/4) Exclude cut with ID _19023_210_4353_1_1531378892552_5820780_28-36615-0 from training. Duration: 0.97 2023-10-02 15:52:08,979 WARNING [train.py:1204] (2/4) Exclude cut with ID _29853_210_7497_1_1532505600851_6642110_286-251333-0_sp1.1 from training. Duration: 0.92725 2023-10-02 15:52:10,797 WARNING [train.py:1204] (2/4) Exclude cut with ID _31602_210_5854_1_1532677021456_4458250_518-80679-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 15:52:10,839 WARNING [train.py:1204] (2/4) Exclude cut with ID _14922_210_6228_1_1530707435273_3693310_38-187851-0_sp0.9 from training. Duration: 0.688875 2023-10-02 15:52:20,932 WARNING [train.py:1204] (2/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_39-248870-0_sp1.1 from training. Duration: 0.62725 2023-10-02 15:52:20,958 WARNING [train.py:1204] (2/4) Exclude cut with ID _16908_210_5724_1_1531119157248_3772700_154-31455-0_sp1.1 from training. Duration: 0.97275 2023-10-02 15:52:21,076 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8453_210_4211_1_1528502940_8562730_347-330673-0 from training. Duration: 0.72 2023-10-02 15:52:23,747 WARNING [train.py:1204] (2/4) Exclude cut with ID _30191_210_1664_1_1533189915047_7196670_674-204247-0_sp1.1 from training. Duration: 0.97275 2023-10-02 15:52:25,805 WARNING [train.py:1204] (2/4) Exclude cut with ID _8833_210_5219_1_1528592665210_7553059_329-125156-0 from training. Duration: 0.82 2023-10-02 15:52:27,452 WARNING [train.py:1204] (2/4) Exclude cut with ID _4779_210_2084_1_1527318022944_3914893_500-285013-0_sp1.1 from training. Duration: 0.62725 2023-10-02 15:52:28,862 WARNING [train.py:1204] (2/4) Exclude cut with ID _14421_210_8750_1_1530604795825_3542290_190-313578-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 15:52:30,294 WARNING [train.py:1204] (2/4) Exclude cut with ID _35799_210_5854_1_1533263487887_3529071_538-273574-0_sp1.1 from training. Duration: 0.936375 2023-10-02 15:52:31,655 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_14928_210_5632_1_1530773461_4714000_454-112198-0_sp1.1 from training. Duration: 0.6 2023-10-02 15:52:31,815 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=934566.6666666666, ans=0.1 2023-10-02 15:52:32,968 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_468-143126-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 15:52:33,061 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_4528_210_3273_1_1527076654_3276777_258-121681-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 15:52:34,258 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_397-132565-0_sp1.1 from training. Duration: 0.9 2023-10-02 15:52:34,279 WARNING [train.py:1204] (2/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_454-123002-0_sp0.9 from training. Duration: 0.7666875 2023-10-02 15:52:37,076 WARNING [train.py:1204] (2/4) Exclude cut with ID _27052_210_5986_1_1532329197187_3585009_297-86890-0_sp1.1 from training. Duration: 0.936375 2023-10-02 15:52:40,217 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_51225_210_3601_1_1534400634_2327383_55-301844-0_sp1.1 from training. Duration: 0.8454375 2023-10-02 15:52:41,657 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6786_210_1388_1_1527839699_4194495_378-46070-0_sp0.9 from training. Duration: 0.8 2023-10-02 15:52:43,054 WARNING [train.py:1204] (2/4) Exclude cut with ID _42334_210_7770_1_1534206610396_3605880_55-19758-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 15:52:47,976 WARNING [train.py:1204] (2/4) Exclude cut with ID _14884_210_2252_1_1530707459689_5561250_114-201399-0_sp0.9 from training. Duration: 0.8444375 2023-10-02 15:52:50,987 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.attention_skip_rate, batch_count=934633.3333333334, ans=0.0 2023-10-02 15:52:51,216 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.5.encoder.layers.0.nonlin_attention.whiten1, num_groups=1, num_channels=192, metric=4.47 vs. limit=10.0 2023-10-02 15:52:52,171 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_99-251314-0_sp0.9 from training. Duration: 0.9666875 2023-10-02 15:52:53,462 WARNING [train.py:1204] (2/4) Exclude cut with ID _26528_210_9070_1_1532570410842_3851310_497-97218-0 from training. Duration: 0.86 2023-10-02 15:52:58,853 WARNING [train.py:1204] (2/4) Exclude cut with ID _55328_210_27767_1_1534580973260_4088170_230-317241-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 15:52:58,935 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_125-203105-0_sp0.9 from training. Duration: 0.888875 2023-10-02 15:53:00,422 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=934700.0, ans=0.1 2023-10-02 15:53:02,925 WARNING [train.py:1204] (2/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_55-31904-0_sp1.1 from training. Duration: 0.8 2023-10-02 15:53:04,289 WARNING [train.py:1204] (2/4) Exclude cut with ID _8064_210_5399_1_1528372299291_4282973_18-174647-0 from training. Duration: 0.91 2023-10-02 15:53:07,149 WARNING [train.py:1204] (2/4) Exclude cut with ID IC0165W0005-81726 from training. Duration: 0.952 2023-10-02 15:53:07,150 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_31583_210_3852_1_1532595851_4024413_148-171847-0_sp1.1 from training. Duration: 0.963625 2023-10-02 15:53:08,537 WARNING [train.py:1204] (2/4) Exclude cut with ID _21989_210_5854_1_1531899214931_3271360_116-138769-0_sp1.1 from training. Duration: 0.936375 2023-10-02 15:53:08,693 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.0.self_attn_weights.pos_emb_skip_rate, batch_count=934766.6666666666, ans=0.0 2023-10-02 15:53:09,818 INFO [train.py:1046] (2/4) Epoch 27, batch 2100, loss[loss=0.1601, simple_loss=0.2433, pruned_loss=0.03851, over 24347.00 frames. ], tot_loss[loss=0.1666, simple_loss=0.2439, pruned_loss=0.04462, over 4733639.25 frames. ], batch size: 61, lr: 3.80e-03, grad_scale: 32.0 2023-10-02 15:53:09,874 WARNING [train.py:1204] (2/4) Exclude cut with ID _66070_210_7640_1_1535441393096_7805310_489-292631-0_sp1.1 from training. Duration: 0.7181875 2023-10-02 15:53:11,259 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_20120_210_6753_1_1531643035_5482859_202-27960-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 15:53:11,268 WARNING [train.py:1204] (2/4) Exclude cut with ID _5294_210_1747_1_1527249643928_3696179_342-270278-0 from training. Duration: 0.55 2023-10-02 15:53:11,303 WARNING [train.py:1204] (2/4) Exclude cut with ID _38323_210_12108_1_1533121233369_3303049_416-11919-0 from training. Duration: 0.96 2023-10-02 15:53:11,577 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.conv_module2.balancer1.min_positive, batch_count=934766.6666666666, ans=0.025 2023-10-02 15:53:14,546 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6545_210_2040_1_1527774670_4245258_407-48070-0_sp0.9 from training. Duration: 0.8444375 2023-10-02 15:53:17,959 WARNING [train.py:1204] (2/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_503-30719-0_sp1.1 from training. Duration: 0.8909375 2023-10-02 15:53:19,299 WARNING [train.py:1204] (2/4) Exclude cut with ID _49643_210_7989_1_1534640244224_3939031_234-326497-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 15:53:20,756 WARNING [train.py:1204] (2/4) Exclude cut with ID _14170_210_5219_1_1530525949868_4469650_301-77873-0_sp1.1 from training. Duration: 0.963625 2023-10-02 15:53:20,808 WARNING [train.py:1204] (2/4) Exclude cut with ID _45297_210_2721_1_1533715351381_3264949_152-154697-0_sp1.1 from training. Duration: 0.97275 2023-10-02 15:53:20,815 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_3371_210_1794_1_1526384462_5580777_396-339812-0 from training. Duration: 0.85 2023-10-02 15:53:20,895 WARNING [train.py:1204] (2/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_399-156791-0_sp1.1 from training. Duration: 0.7545625 2023-10-02 15:53:22,252 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7696_210_2040_1_1528207167_3716099_117-125434-0 from training. Duration: 0.76 2023-10-02 15:53:22,256 WARNING [train.py:1204] (2/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_96-244299-0 from training. Duration: 0.88 2023-10-02 15:53:24,919 WARNING [train.py:1204] (2/4) Exclude cut with ID _37892_210_19596_1_1533198677897_3964058_519-343462-0_sp1.1 from training. Duration: 0.92725 2023-10-02 15:53:26,607 WARNING [train.py:1204] (2/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_202-183922-0_sp1.1 from training. Duration: 0.77275 2023-10-02 15:53:26,614 WARNING [train.py:1204] (2/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_453-133983-0 from training. Duration: 0.78 2023-10-02 15:53:26,649 WARNING [train.py:1204] (2/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_178-39722-0_sp1.1 from training. Duration: 0.4454375 2023-10-02 15:53:30,342 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.conv_module1.balancer1.prob, batch_count=934833.3333333334, ans=0.125 2023-10-02 15:53:31,495 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_15126_210_6753_1_1530778677_2591185_113-157651-0 from training. Duration: 0.68 2023-10-02 15:53:31,496 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_271-234962-0_sp1.1 from training. Duration: 0.7181875 2023-10-02 15:53:31,713 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.conv_module2.balancer1.prob, batch_count=934833.3333333334, ans=0.125 2023-10-02 15:53:35,489 WARNING [train.py:1204] (2/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_195-64967-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 15:53:35,539 WARNING [train.py:1204] (2/4) Exclude cut with ID _43552_210_8913_1_1533726031059_3523429_365-215065-0_sp1.1 from training. Duration: 0.936375 2023-10-02 15:53:38,547 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=934900.0, ans=0.1 2023-10-02 15:53:39,628 WARNING [train.py:1204] (2/4) Exclude cut with ID _31602_210_5854_1_1532677021456_4458250_249-96604-0_sp0.9 from training. Duration: 0.988875 2023-10-02 15:53:39,696 WARNING [train.py:1204] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_307-158462-0 from training. Duration: 0.69 2023-10-02 15:53:40,962 WARNING [train.py:1204] (2/4) Exclude cut with ID _30918_210_6400_1_1532754078600_3125920_407-223394-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 15:53:40,965 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6673_210_3273_1_1527767790_3173240_351-167954-0_sp0.9 from training. Duration: 0.5333125 2023-10-02 15:53:42,366 WARNING [train.py:1204] (2/4) Exclude cut with ID _4866_210_2577_1_1527404412284_4364659_256-306830-0 from training. Duration: 0.87 2023-10-02 15:53:42,592 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.feed_forward3.hidden_balancer.prob, batch_count=934900.0, ans=0.125 2023-10-02 15:53:43,647 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_17613_210_2151_1_1531137569_2366160_222-131118-0_sp1.1 from training. Duration: 0.963625 2023-10-02 15:53:43,666 WARNING [train.py:1204] (2/4) Exclude cut with ID _45560_210_21982_1_1533812603951_3584999_111-6376-0 from training. Duration: 0.84 2023-10-02 15:53:43,692 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_216-73841-0 from training. Duration: 0.96 2023-10-02 15:53:44,933 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_14058_210_4163_1_1530690953_6327180_49-210687-0 from training. Duration: 0.94 2023-10-02 15:53:47,689 WARNING [train.py:1204] (2/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_506-42614-0_sp0.9 from training. Duration: 0.87775 2023-10-02 15:53:49,740 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_3133_210_2087_1_1526106446_2324690_25-332138-0_sp1.1 from training. Duration: 0.82725 2023-10-02 15:53:51,239 WARNING [train.py:1204] (2/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_589-9070-0_sp1.1 from training. Duration: 0.6454375 2023-10-02 15:53:51,353 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.0.conv_module2.balancer1.prob, batch_count=934900.0, ans=0.125 2023-10-02 15:53:52,643 WARNING [train.py:1204] (2/4) Exclude cut with ID _6194_210_1534_1_1527511826160_3941194_14-282876-0_sp1.1 from training. Duration: 0.6454375 2023-10-02 15:53:54,057 WARNING [train.py:1204] (2/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_1166-90654-0_sp1.1 from training. Duration: 0.92725 2023-10-02 15:53:54,159 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_14056_210_4163_1_1530604483_7572827_218-255858-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 15:53:54,160 WARNING [train.py:1204] (2/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_1285-256622-0 from training. Duration: 0.84 2023-10-02 15:53:54,173 WARNING [train.py:1204] (2/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_205-286261-0_sp1.1 from training. Duration: 0.963625 2023-10-02 15:53:54,186 WARNING [train.py:1204] (2/4) Exclude cut with ID _68777_210_4353_1_1535810390731_3117904_6-154119-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 15:53:55,596 WARNING [train.py:1204] (2/4) Exclude cut with ID _22982_210_1883_1_1531882790447_5449409_565-199393-0_sp1.1 from training. Duration: 0.92725 2023-10-02 15:53:55,606 WARNING [train.py:1204] (2/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_278-154492-0 from training. Duration: 0.76 2023-10-02 15:53:57,461 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_426-313557-0 from training. Duration: 0.53 2023-10-02 15:53:57,520 WARNING [train.py:1204] (2/4) Exclude cut with ID _41817_210_18835_1_1533434477160_7423049_555-293448-0 from training. Duration: 0.8 2023-10-02 15:53:58,496 INFO [scaling.py:1022] (2/4) Whitening: name=encoder_embed.out_whiten, num_groups=1, num_channels=192, metric=6.65 vs. limit=8.0 2023-10-02 15:53:59,110 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.conv_module2.balancer1.prob, batch_count=934966.6666666666, ans=0.125 2023-10-02 15:54:00,194 WARNING [train.py:1204] (2/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_32-335103-0_sp1.1 from training. Duration: 0.7454375 2023-10-02 15:54:02,754 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.547e+02 1.820e+02 2.011e+02 2.406e+02 3.767e+02, threshold=4.021e+02, percent-clipped=0.0 2023-10-02 15:54:02,901 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_2655_210_2087_1_1525688959_3897400_202-147557-0_sp1.1 from training. Duration: 0.77275 2023-10-02 15:54:02,931 WARNING [train.py:1204] (2/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_158-250116-0 from training. Duration: 0.62 2023-10-02 15:54:08,191 WARNING [train.py:1204] (2/4) Exclude cut with ID _55429_210_10626_1_1534467588527_7174143_528-88083-0_sp1.1 from training. Duration: 0.963625 2023-10-02 15:54:10,966 WARNING [train.py:1204] (2/4) Exclude cut with ID _8030_210_7497_1_1528444886781_3531089_32-31837-0_sp1.1 from training. Duration: 0.8818125 2023-10-02 15:54:12,266 WARNING [train.py:1204] (2/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_744-86168-0_sp1.1 from training. Duration: 0.97275 2023-10-02 15:54:12,276 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1113-7369-0_sp0.9 from training. Duration: 0.9444375 2023-10-02 15:54:12,300 WARNING [train.py:1204] (2/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_124-345840-0_sp0.9 from training. Duration: 0.57775 2023-10-02 15:54:12,341 WARNING [train.py:1204] (2/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_328-264776-0_sp0.9 from training. Duration: 0.7444375 2023-10-02 15:54:13,787 WARNING [train.py:1204] (2/4) Exclude cut with ID _18895_210_6228_1_1531357274234_3784930_469-166615-0_sp1.1 from training. Duration: 0.963625 2023-10-02 15:54:13,800 WARNING [train.py:1204] (2/4) Exclude cut with ID _8395_210_1531_1_1528597871523_4411579_431-132470-0_sp0.9 from training. Duration: 0.82225 2023-10-02 15:54:17,530 WARNING [train.py:1204] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_554-45617-0_sp1.1 from training. Duration: 0.9 2023-10-02 15:54:17,550 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_35361_210_4238_1_1533262810_7793476_524-188242-0_sp1.1 from training. Duration: 0.92725 2023-10-02 15:54:19,057 WARNING [train.py:1204] (2/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_938-205135-0 from training. Duration: 0.87 2023-10-02 15:54:20,423 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_5931_210_2040_1_1527428481_4547999_321-193614-0 from training. Duration: 0.67 2023-10-02 15:54:20,441 WARNING [train.py:1204] (2/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_445-269521-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 15:54:23,074 INFO [train.py:1046] (2/4) Epoch 27, batch 2150, loss[loss=0.1431, simple_loss=0.2231, pruned_loss=0.03152, over 24620.00 frames. ], tot_loss[loss=0.166, simple_loss=0.243, pruned_loss=0.04451, over 4726997.98 frames. ], batch size: 60, lr: 3.80e-03, grad_scale: 16.0 2023-10-02 15:54:23,129 WARNING [train.py:1204] (2/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_3-33116-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 15:54:23,130 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_4633_210_2040_1_1526824163_4145343_416-76117-0_sp1.1 from training. Duration: 0.863625 2023-10-02 15:54:23,170 WARNING [train.py:1204] (2/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_1033-274376-0_sp1.1 from training. Duration: 0.7909375 2023-10-02 15:54:24,469 WARNING [train.py:1204] (2/4) Exclude cut with ID _8772_210_1850_1_1528635595972_3664011_398-320049-0_sp0.9 from training. Duration: 0.97775 2023-10-02 15:54:28,182 WARNING [train.py:1204] (2/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_321-215742-0_sp0.9 from training. Duration: 0.5555625 2023-10-02 15:54:30,856 WARNING [train.py:1204] (2/4) Exclude cut with ID _28394_210_8913_1_1532430049518_3650118_272-190950-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 15:54:32,214 WARNING [train.py:1204] (2/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_31-299371-0_sp1.1 from training. Duration: 0.92725 2023-10-02 15:54:33,831 INFO [scaling.py:1118] (2/4) WithLoss: name=encoder.encoders.5.encoder.layers.0.self_attn_weights, loss-sum=0.000e+00 2023-10-02 15:54:34,880 WARNING [train.py:1204] (2/4) Exclude cut with ID _55383_210_10635_1_1534468426172_2231669_134-128246-0_sp0.9 from training. Duration: 0.988875 2023-10-02 15:54:34,883 WARNING [train.py:1204] (2/4) Exclude cut with ID _40519_210_13794_1_1533715228655_7337390_887-9547-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 15:54:34,923 WARNING [train.py:1204] (2/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_310-109461-0_sp0.9 from training. Duration: 0.888875 2023-10-02 15:54:35,381 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.5.encoder.layers.0.feed_forward1.out_whiten, num_groups=1, num_channels=256, metric=10.23 vs. limit=15.0 2023-10-02 15:54:36,531 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.bypass.scale_min, batch_count=935166.6666666666, ans=0.2 2023-10-02 15:54:37,790 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_2009_210_2071_1_1525481134_4588937_258-250196-0_sp1.1 from training. Duration: 0.92725 2023-10-02 15:54:39,062 WARNING [train.py:1204] (2/4) Exclude cut with ID _9235_210_5219_1_1528891154253_8046120_1186-72702-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 15:54:39,065 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_14831_210_7927_1_1530700200_5583610_181-27467-0_sp1.1 from training. Duration: 0.82725 2023-10-02 15:54:40,646 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.1.feed_forward1.out_proj.dropout_p, batch_count=935166.6666666666, ans=0.1 2023-10-02 15:54:43,119 WARNING [train.py:1204] (2/4) Exclude cut with ID _40402_210_18219_1_1533522324115_2953150_278-132109-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 15:54:43,143 WARNING [train.py:1204] (2/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_261-297503-0 from training. Duration: 0.99 2023-10-02 15:54:47,915 WARNING [train.py:1204] (2/4) Exclude cut with ID _19896_210_1883_1_1531709022662_6649569_313-265903-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 15:54:49,288 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_105-114797-0_sp1.1 from training. Duration: 0.763625 2023-10-02 15:54:51,280 WARNING [train.py:1204] (2/4) Exclude cut with ID _53755_210_8254_1_1534577312941_4707360_9-65843-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 15:54:52,601 WARNING [train.py:1204] (2/4) Exclude cut with ID _18516_210_9206_1_1531704653206_3692406_361-224549-0_sp1.1 from training. Duration: 0.97275 2023-10-02 15:54:52,635 WARNING [train.py:1204] (2/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_403-310975-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 15:54:52,672 WARNING [train.py:1204] (2/4) Exclude cut with ID _5444_210_2151_1_1527418808464_5312210_581-315411-0_sp0.9 from training. Duration: 0.688875 2023-10-02 15:54:54,024 WARNING [train.py:1204] (2/4) Exclude cut with ID _23825_210_13449_1_1531915313526_3497150_464-154904-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 15:54:54,033 WARNING [train.py:1204] (2/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_937-208281-0_sp0.9 from training. Duration: 0.9444375 2023-10-02 15:54:54,133 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.1.self_attn_weights.pos_emb_skip_rate, batch_count=935233.3333333334, ans=0.0 2023-10-02 15:54:55,483 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_37839_210_7234_1_1533542462_3689303_158-181212-0_sp1.1 from training. Duration: 0.963625 2023-10-02 15:54:56,854 WARNING [train.py:1204] (2/4) Exclude cut with ID _8772_210_1850_1_1528635595972_3664011_398-320049-0 from training. Duration: 0.88 2023-10-02 15:54:57,116 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.bypass.skip_rate, batch_count=935233.3333333334, ans=0.07 2023-10-02 15:54:58,808 WARNING [train.py:1204] (2/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_718-250907-0_sp1.1 from training. Duration: 0.62725 2023-10-02 15:54:58,876 WARNING [train.py:1204] (2/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_641-126395-0_sp1.1 from training. Duration: 0.92725 2023-10-02 15:55:00,662 WARNING [train.py:1204] (2/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_166-241493-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 15:55:00,756 WARNING [train.py:1204] (2/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_526-62079-0_sp0.9 from training. Duration: 0.7444375 2023-10-02 15:55:02,115 WARNING [train.py:1204] (2/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_439-275083-0_sp1.1 from training. Duration: 0.936375 2023-10-02 15:55:04,805 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_66907_210_21581_1_1535615661_3590177_493-229032-0_sp1.1 from training. Duration: 0.92725 2023-10-02 15:55:04,844 WARNING [train.py:1204] (2/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_89-321955-0_sp1.1 from training. Duration: 0.72725 2023-10-02 15:55:07,207 WARNING [train.py:1204] (2/4) Exclude cut with ID _29857_210_20034_1_1532395886753_3798160_273-226195-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 15:55:07,219 WARNING [train.py:1204] (2/4) Exclude cut with ID _22826_210_9994_1_1531908201522_3279169_404-75889-0 from training. Duration: 0.89 2023-10-02 15:55:08,492 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_271-234962-0_sp0.9 from training. Duration: 0.87775 2023-10-02 15:55:09,876 WARNING [train.py:1204] (2/4) Exclude cut with ID _22961_210_8254_1_1531976366672_4278180_156-339685-0_sp1.1 from training. Duration: 0.97275 2023-10-02 15:55:11,223 WARNING [train.py:1204] (2/4) Exclude cut with ID _37892_210_19596_1_1533198677897_3964058_425-231927-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 15:55:11,310 WARNING [train.py:1204] (2/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_102-288670-0_sp1.1 from training. Duration: 0.97275 2023-10-02 15:55:12,689 WARNING [train.py:1204] (2/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_283-220372-0_sp1.1 from training. Duration: 0.5090625 2023-10-02 15:55:14,031 WARNING [train.py:1204] (2/4) Exclude cut with ID _44304_210_15374_1_1533639352765_4613129_86-1699-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 15:55:15,347 WARNING [train.py:1204] (2/4) Exclude cut with ID _2891_210_2550_1_1525874398521_4427710_327-121590-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 15:55:15,353 WARNING [train.py:1204] (2/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_369-222060-0 from training. Duration: 0.71 2023-10-02 15:55:15,492 WARNING [train.py:1204] (2/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_762-109886-0 from training. Duration: 0.91 2023-10-02 15:55:16,844 WARNING [train.py:1204] (2/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_477-255782-0_sp0.9 from training. Duration: 0.911125 2023-10-02 15:55:16,876 WARNING [train.py:1204] (2/4) Exclude cut with ID IC0669W0061-330906 from training. Duration: 0.768 2023-10-02 15:55:16,931 WARNING [train.py:1204] (2/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_347-213071-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 15:55:16,956 WARNING [train.py:1204] (2/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_507-303370-0_sp1.1 from training. Duration: 0.87275 2023-10-02 15:55:18,260 WARNING [train.py:1204] (2/4) Exclude cut with ID _15012_210_1968_1_1530856820429_5095350_688-178849-0 from training. Duration: 0.64 2023-10-02 15:55:18,265 WARNING [train.py:1204] (2/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_28-70987-0_sp0.9 from training. Duration: 0.9666875 2023-10-02 15:55:18,267 WARNING [train.py:1204] (2/4) Exclude cut with ID _5910_210_1389_1_1527337972454_5050100_555-159815-0 from training. Duration: 0.98 2023-10-02 15:55:18,293 WARNING [train.py:1204] (2/4) Exclude cut with ID IC0679W0064-335871 from training. Duration: 0.768 2023-10-02 15:55:18,294 WARNING [train.py:1204] (2/4) Exclude cut with ID IC0679W0079-335886 from training. Duration: 0.896 2023-10-02 15:55:20,186 WARNING [train.py:1204] (2/4) Exclude cut with ID _5608_210_2106_1_1527415439089_6819768_909-82767-0 from training. Duration: 0.61 2023-10-02 15:55:21,616 WARNING [train.py:1204] (2/4) Exclude cut with ID _23038_210_9570_1_1532001584158_3652419_411-323967-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 15:55:22,938 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_40896_210_15710_1_1533433956_7100802_350-231748-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 15:55:24,659 WARNING [train.py:1204] (2/4) Exclude cut with ID _5289_210_2084_1_1527678202736_3779299_142-103317-0_sp0.9 from training. Duration: 0.9333125 2023-10-02 15:55:24,727 WARNING [train.py:1204] (2/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_314-144055-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 15:55:26,083 WARNING [train.py:1204] (2/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_177-236809-0_sp0.9 from training. Duration: 0.5444375 2023-10-02 15:55:28,006 WARNING [train.py:1204] (2/4) Exclude cut with ID _23038_210_9570_1_1532001584158_3652419_82-334733-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 15:55:28,014 WARNING [train.py:1204] (2/4) Exclude cut with ID _43552_210_8913_1_1533726031059_3523429_225-22526-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 15:55:35,562 WARNING [train.py:1204] (2/4) Exclude cut with ID _30327_210_16049_1_1532484161295_3924169_401-70819-0_sp1.1 from training. Duration: 0.7818125 2023-10-02 15:55:35,605 WARNING [train.py:1204] (2/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_538-32291-0 from training. Duration: 0.83 2023-10-02 15:55:36,905 INFO [train.py:1046] (2/4) Epoch 27, batch 2200, loss[loss=0.1578, simple_loss=0.2401, pruned_loss=0.03776, over 24492.00 frames. ], tot_loss[loss=0.1657, simple_loss=0.2433, pruned_loss=0.04401, over 4730416.86 frames. ], batch size: 66, lr: 3.80e-03, grad_scale: 16.0 2023-10-02 15:55:40,990 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_37357_210_17024_1_1533089504_2316121_258-211605-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 15:55:45,179 WARNING [train.py:1204] (2/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_550-335198-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 15:55:46,530 WARNING [train.py:1204] (2/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_98-741-0_sp0.9 from training. Duration: 0.888875 2023-10-02 15:55:46,584 WARNING [train.py:1204] (2/4) Exclude cut with ID _38323_210_12108_1_1533121233369_3303049_251-238988-0_sp1.1 from training. Duration: 0.92725 2023-10-02 15:55:46,670 WARNING [train.py:1204] (2/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_259-34038-0_sp0.9 from training. Duration: 0.82225 2023-10-02 15:55:49,351 WARNING [train.py:1204] (2/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_1060-180633-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 15:55:50,642 WARNING [train.py:1204] (2/4) Exclude cut with ID _8755_210_1876_1_1528628559357_3258209_211-37473-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 15:55:50,647 WARNING [train.py:1204] (2/4) Exclude cut with ID _4122_210_2084_1_1526713164417_4353280_528-206771-0 from training. Duration: 0.88 2023-10-02 15:55:55,823 WARNING [train.py:1204] (2/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_64-248359-0 from training. Duration: 0.69 2023-10-02 15:55:57,272 WARNING [train.py:1204] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_402-253027-0_sp1.1 from training. Duration: 0.4909375 2023-10-02 15:56:02,489 WARNING [train.py:1204] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_625-102955-0 from training. Duration: 0.77 2023-10-02 15:56:03,822 WARNING [train.py:1204] (2/4) Exclude cut with ID _14998_210_5219_1_1530705582080_7474970_611-219324-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 15:56:05,171 WARNING [train.py:1204] (2/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_718-68505-0_sp0.9 from training. Duration: 0.988875 2023-10-02 15:56:05,219 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_353-182855-0_sp1.1 from training. Duration: 0.87275 2023-10-02 15:56:09,420 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_5960_210_1794_1_1527403865_3990999_515-2613-0_sp0.9 from training. Duration: 0.97775 2023-10-02 15:56:09,457 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_5575_210_1410_1_1527301305_2570170_342-265572-0 from training. Duration: 0.94 2023-10-02 15:56:10,965 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.attention_skip_rate, batch_count=935566.6666666666, ans=0.0 2023-10-02 15:56:13,538 WARNING [train.py:1204] (2/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_329-113173-0_sp0.9 from training. Duration: 0.688875 2023-10-02 15:56:14,865 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_15396_210_6890_1_1531308473_1938936_213-312506-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 15:56:16,207 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_521-214276-0_sp1.1 from training. Duration: 0.4 2023-10-02 15:56:18,984 WARNING [train.py:1204] (2/4) Exclude cut with ID _15026_210_8750_1_1530777602316_3291850_211-195043-0_sp1.1 from training. Duration: 0.72725 2023-10-02 15:56:21,888 WARNING [train.py:1204] (2/4) Exclude cut with ID _29343_210_4381_1_1532602757752_3674109_108-257494-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 15:56:23,867 WARNING [train.py:1204] (2/4) Exclude cut with ID _64414_210_17301_1_1535243252048_3757759_403-58151-0_sp1.1 from training. Duration: 0.963625 2023-10-02 15:56:24,580 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.3.encoder.layers.0.conv_module2.whiten, num_groups=1, num_channels=512, metric=3.06 vs. limit=15.0 2023-10-02 15:56:25,274 WARNING [train.py:1204] (2/4) Exclude cut with ID _36512_210_6987_1_1533033143998_3126069_109-177787-0_sp1.1 from training. Duration: 0.92725 2023-10-02 15:56:28,573 WARNING [train.py:1204] (2/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_498-40153-0 from training. Duration: 0.63 2023-10-02 15:56:28,653 WARNING [train.py:1204] (2/4) Exclude cut with ID _56447_210_1389_1_1535074245910_3697189_161-334523-0_sp1.1 from training. Duration: 0.936375 2023-10-02 15:56:29,896 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.496e+02 1.796e+02 1.925e+02 2.148e+02 3.063e+02, threshold=3.850e+02, percent-clipped=0.0 2023-10-02 15:56:30,018 WARNING [train.py:1204] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_238-245655-0 from training. Duration: 0.81 2023-10-02 15:56:32,079 WARNING [train.py:1204] (2/4) Exclude cut with ID _64631_210_27767_1_1535349580068_4165160_108-43865-0_sp1.1 from training. Duration: 0.92725 2023-10-02 15:56:32,085 WARNING [train.py:1204] (2/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_243-42116-0_sp1.1 from training. Duration: 0.663625 2023-10-02 15:56:32,122 WARNING [train.py:1204] (2/4) Exclude cut with ID _66868_210_9527_1_1535509287954_4561140_594-132160-0_sp1.1 from training. Duration: 0.92725 2023-10-02 15:56:35,272 WARNING [train.py:1204] (2/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_48-338976-0_sp0.9 from training. Duration: 0.988875 2023-10-02 15:56:35,312 WARNING [train.py:1204] (2/4) Exclude cut with ID _5250_210_5219_1_1527249561806_8692110_137-100595-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 15:56:35,325 WARNING [train.py:1204] (2/4) Exclude cut with ID _49748_210_24116_1_1533986971737_3158910_12-271669-0_sp1.1 from training. Duration: 0.936375 2023-10-02 15:56:35,349 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_37357_210_17024_1_1533089504_2316121_206-80421-0_sp1.1 from training. Duration: 0.936375 2023-10-02 15:56:36,653 WARNING [train.py:1204] (2/4) Exclude cut with ID _7537_210_2084_1_1528502395431_3924359_113-199933-0_sp1.1 from training. Duration: 0.536375 2023-10-02 15:56:38,064 WARNING [train.py:1204] (2/4) Exclude cut with ID _55440_210_12651_1_1534496713364_2980520_31-59289-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 15:56:38,878 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.2.encoder.layers.0.self_attn2.whiten, num_groups=1, num_channels=384, metric=18.94 vs. limit=22.5 2023-10-02 15:56:39,589 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1083-220122-0_sp0.9 from training. Duration: 0.5666875 2023-10-02 15:56:42,372 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_426-313557-0_sp1.1 from training. Duration: 0.4818125 2023-10-02 15:56:43,692 WARNING [train.py:1204] (2/4) Exclude cut with ID _35799_210_5854_1_1533263487887_3529071_324-94843-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 15:56:45,086 WARNING [train.py:1204] (2/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_264-108685-0_sp1.1 from training. Duration: 0.5 2023-10-02 15:56:45,168 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9012W0006-501141 from training. Duration: 0.896 2023-10-02 15:56:47,911 WARNING [train.py:1204] (2/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_890-5117-0_sp0.9 from training. Duration: 0.8666875 2023-10-02 15:56:49,230 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9023W0008-506301 from training. Duration: 0.896 2023-10-02 15:56:49,307 WARNING [train.py:1204] (2/4) Exclude cut with ID _13916_210_9994_1_1530788341126_3763149_60-191556-0_sp0.9 from training. Duration: 0.67775 2023-10-02 15:56:49,363 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9029W0331-509721 from training. Duration: 0.896 2023-10-02 15:56:50,512 INFO [train.py:1046] (2/4) Epoch 27, batch 2250, loss[loss=0.1656, simple_loss=0.2552, pruned_loss=0.03803, over 24309.00 frames. ], tot_loss[loss=0.1664, simple_loss=0.2442, pruned_loss=0.04432, over 4724801.29 frames. ], batch size: 74, lr: 3.80e-03, grad_scale: 16.0 2023-10-02 15:56:51,869 WARNING [train.py:1204] (2/4) Exclude cut with ID _35823_210_6917_1_1533187881914_7297830_567-108596-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 15:56:53,668 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7543_210_6753_1_1528544010_1110402_25-183860-0_sp1.1 from training. Duration: 0.42725 2023-10-02 15:56:55,167 WARNING [train.py:1204] (2/4) Exclude cut with ID _47278_210_12041_1_1533902418349_3576110_495-260035-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 15:56:56,525 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9051W0015-520281 from training. Duration: 0.9045625 2023-10-02 15:56:59,793 WARNING [train.py:1204] (2/4) Exclude cut with ID _14459_210_2228_1_1531443641946_7516700_707-66193-0_sp1.1 from training. Duration: 0.97275 2023-10-02 15:57:01,338 WARNING [train.py:1204] (2/4) Exclude cut with ID _14087_210_2253_1_1530529224512_3014029_219-224445-0_sp1.1 from training. Duration: 0.763625 2023-10-02 15:57:06,598 WARNING [train.py:1204] (2/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_345-167986-0_sp0.9 from training. Duration: 0.9333125 2023-10-02 15:57:08,007 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_34815_210_6388_1_1533381320_3571903_279-305565-0_sp1.1 from training. Duration: 0.836375 2023-10-02 15:57:10,781 WARNING [train.py:1204] (2/4) Exclude cut with ID _21987_210_5854_1_1531821500482_3637038_317-179212-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 15:57:12,127 WARNING [train.py:1204] (2/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_395-37222-0_sp1.1 from training. Duration: 0.6909375 2023-10-02 15:57:12,553 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.self_attn1.whiten.whitening_limit, batch_count=935833.3333333334, ans=22.5 2023-10-02 15:57:13,557 WARNING [train.py:1204] (2/4) Exclude cut with ID _8064_210_5399_1_1528372299291_4282973_168-50337-0_sp1.1 from training. Duration: 0.763625 2023-10-02 15:57:14,948 WARNING [train.py:1204] (2/4) Exclude cut with ID _60113_210_16271_1_1535164212481_4386079_559-167191-0 from training. Duration: 0.93 2023-10-02 15:57:14,967 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_47228_210_4983_1_1533780021_3672027_344-179642-0_sp1.1 from training. Duration: 0.92725 2023-10-02 15:57:14,994 WARNING [train.py:1204] (2/4) Exclude cut with ID _22961_210_8254_1_1531976366672_4278180_317-199546-0_sp0.9 from training. Duration: 0.9555625 2023-10-02 15:57:16,485 WARNING [train.py:1204] (2/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_141-89727-0 from training. Duration: 0.71 2023-10-02 15:57:17,800 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_78-116075-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 15:57:17,811 WARNING [train.py:1204] (2/4) Exclude cut with ID _64086_210_7994_1_1535270417019_7435949_52-235756-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 15:57:19,208 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7615_210_4211_1_1528110965_4580160_72-235211-0_sp1.1 from training. Duration: 0.6909375 2023-10-02 15:57:23,309 WARNING [train.py:1204] (2/4) Exclude cut with ID _14069_210_5632_1_1530512692033_4803390_287-27950-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 15:57:24,670 WARNING [train.py:1204] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_74-48717-0_sp0.9 from training. Duration: 0.6444375 2023-10-02 15:57:24,694 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7544_210_6753_1_1528591327_1471426_172-348777-0_sp1.1 from training. Duration: 0.6 2023-10-02 15:57:28,126 WARNING [train.py:1204] (2/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_220-191080-0 from training. Duration: 0.98 2023-10-02 15:57:28,218 WARNING [train.py:1204] (2/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_1081-137641-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 15:57:29,694 WARNING [train.py:1204] (2/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_278-235459-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 15:57:35,372 WARNING [train.py:1204] (2/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_1181-77470-0_sp1.1 from training. Duration: 0.936375 2023-10-02 15:57:36,299 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.conv_module1.balancer1.prob, batch_count=935966.6666666666, ans=0.125 2023-10-02 15:57:37,280 WARNING [train.py:1204] (2/4) Exclude cut with ID _60860_210_8254_1_1534896015875_3557169_198-5531-0_sp1.1 from training. Duration: 0.936375 2023-10-02 15:57:37,375 WARNING [train.py:1204] (2/4) Exclude cut with ID _12369_210_4381_1_1530356078581_4388520_17-289781-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 15:57:38,672 WARNING [train.py:1204] (2/4) Exclude cut with ID _50310_210_15488_1_1534068043345_3641080_146-199752-0_sp1.1 from training. Duration: 0.92725 2023-10-02 15:57:40,107 WARNING [train.py:1204] (2/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_969-213354-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 15:57:41,451 WARNING [train.py:1204] (2/4) Exclude cut with ID _70519_210_4163_1_1535961595994_7458230_66-344156-0_sp1.1 from training. Duration: 0.963625 2023-10-02 15:57:41,784 INFO [scaling.py:1118] (2/4) WithLoss: name=encoder.encoders.4.encoder.layers.0.self_attn_weights, loss-sum=0.000e+00 2023-10-02 15:57:44,405 WARNING [train.py:1204] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_380-204756-0_sp1.1 from training. Duration: 0.7818125 2023-10-02 15:57:47,315 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_128-202104-0_sp0.9 from training. Duration: 0.7 2023-10-02 15:57:50,201 WARNING [train.py:1204] (2/4) Exclude cut with ID _4697_210_2069_1_1526815801953_3823098_51-1049-0_sp0.9 from training. Duration: 0.8333125 2023-10-02 15:57:50,225 WARNING [train.py:1204] (2/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_86-66192-0_sp1.1 from training. Duration: 0.5 2023-10-02 15:57:51,579 WARNING [train.py:1204] (2/4) Exclude cut with ID _14069_210_5632_1_1530512692033_4803390_95-1896-0_sp1.1 from training. Duration: 0.8909375 2023-10-02 15:57:57,884 WARNING [train.py:1204] (2/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_29-293896-0_sp1.1 from training. Duration: 0.5545625 2023-10-02 15:58:00,762 WARNING [train.py:1204] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_167-137255-0_sp0.9 from training. Duration: 0.711125 2023-10-02 15:58:00,763 WARNING [train.py:1204] (2/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_217-95056-0 from training. Duration: 0.98 2023-10-02 15:58:00,776 WARNING [train.py:1204] (2/4) Exclude cut with ID _47278_210_12041_1_1533902418349_3576110_97-266746-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 15:58:02,124 WARNING [train.py:1204] (2/4) Exclude cut with ID _14224_210_4381_1_1530529062037_4264010_380-343670-0_sp1.1 from training. Duration: 0.8 2023-10-02 15:58:04,180 WARNING [train.py:1204] (2/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_107-332398-0 from training. Duration: 0.78 2023-10-02 15:58:05,940 INFO [train.py:1046] (2/4) Epoch 27, batch 2300, loss[loss=0.1791, simple_loss=0.2589, pruned_loss=0.04964, over 23420.00 frames. ], tot_loss[loss=0.1675, simple_loss=0.2449, pruned_loss=0.04508, over 4711003.91 frames. ], batch size: 93, lr: 3.80e-03, grad_scale: 16.0 2023-10-02 15:58:07,433 WARNING [train.py:1204] (2/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_91-267696-0_sp1.1 from training. Duration: 0.7909375 2023-10-02 15:58:07,467 WARNING [train.py:1204] (2/4) Exclude cut with ID _41298_210_9019_1_1533430371450_4822143_498-186993-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 15:58:12,785 WARNING [train.py:1204] (2/4) Exclude cut with ID _17956_210_4353_1_1531209568125_6979970_54-77610-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 15:58:14,060 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_40047_210_2290_1_1533290519_2374311_257-335162-0_sp1.1 from training. Duration: 0.863625 2023-10-02 15:58:15,551 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9653W0193-675471 from training. Duration: 0.9656875 2023-10-02 15:58:15,677 WARNING [train.py:1204] (2/4) Exclude cut with ID _25248_210_8750_1_1532062825941_5554180_243-240536-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 15:58:22,525 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_32360_210_4198_1_1532689160_3648436_60-51908-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 15:58:22,541 WARNING [train.py:1204] (2/4) Exclude cut with ID _4092_210_2151_1_1526643044449_3135759_227-213972-0_sp0.9 from training. Duration: 0.6 2023-10-02 15:58:22,576 WARNING [train.py:1204] (2/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_392-173500-0_sp1.1 from training. Duration: 0.97275 2023-10-02 15:58:23,932 WARNING [train.py:1204] (2/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_71-329150-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 15:58:23,934 WARNING [train.py:1204] (2/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_866-173440-0 from training. Duration: 0.9 2023-10-02 15:58:25,315 WARNING [train.py:1204] (2/4) Exclude cut with ID _14832_210_8750_1_1530691351539_3171348_1-54315-0_sp0.9 from training. Duration: 0.9666875 2023-10-02 15:58:28,462 WARNING [train.py:1204] (2/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_430-116139-0_sp1.1 from training. Duration: 0.77275 2023-10-02 15:58:30,331 WARNING [train.py:1204] (2/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_356-45054-0_sp0.9 from training. Duration: 0.9555625 2023-10-02 15:58:35,016 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_3531_210_3528_1_1526294203_5216456_309-227302-0_sp0.9 from training. Duration: 0.7666875 2023-10-02 15:58:38,390 WARNING [train.py:1204] (2/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_301-330455-0_sp1.1 from training. Duration: 0.72725 2023-10-02 15:58:41,203 WARNING [train.py:1204] (2/4) Exclude cut with ID _23557_210_8750_1_1531890106370_6846255_667-145945-0_sp1.1 from training. Duration: 0.92725 2023-10-02 15:58:44,136 WARNING [train.py:1204] (2/4) Exclude cut with ID _14860_210_4915_1_1530702020340_7343240_836-328489-0_sp1.1 from training. Duration: 0.8181875 2023-10-02 15:58:45,511 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_37152_210_9697_1_1533106573_3844188_416-333249-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 15:58:47,011 WARNING [train.py:1204] (2/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_538-32291-0_sp0.9 from training. Duration: 0.92225 2023-10-02 15:58:51,001 WARNING [train.py:1204] (2/4) Exclude cut with ID _32704_210_8954_1_1533017488840_6747370_965-176824-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 15:58:53,889 WARNING [train.py:1204] (2/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_86-19601-0_sp1.1 from training. Duration: 0.77275 2023-10-02 15:58:55,210 WARNING [train.py:1204] (2/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_436-318113-0_sp1.1 from training. Duration: 0.6818125 2023-10-02 15:58:55,251 WARNING [train.py:1204] (2/4) Exclude cut with ID _41817_210_18835_1_1533434477160_7423049_555-293448-0_sp0.9 from training. Duration: 0.888875 2023-10-02 15:58:55,262 WARNING [train.py:1204] (2/4) Exclude cut with ID _4697_210_2069_1_1526815801953_3823098_68-219807-0 from training. Duration: 0.47 2023-10-02 15:58:58,043 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.529e+02 1.878e+02 2.081e+02 2.351e+02 3.500e+02, threshold=4.161e+02, percent-clipped=0.0 2023-10-02 15:58:58,195 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7544_210_6753_1_1528591327_1471426_33-346031-0_sp1.1 from training. Duration: 0.5545625 2023-10-02 15:58:58,197 WARNING [train.py:1204] (2/4) Exclude cut with ID _49748_210_24116_1_1533986971737_3158910_263-52113-0_sp1.1 from training. Duration: 0.97275 2023-10-02 15:58:59,414 WARNING [train.py:1204] (2/4) Exclude cut with ID _64639_210_5816_1_1535367619437_3528000_340-24580-0_sp1.1 from training. Duration: 0.92725 2023-10-02 15:58:59,431 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_47228_210_4983_1_1533780021_3672027_479-114941-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 15:58:59,465 WARNING [train.py:1204] (2/4) Exclude cut with ID _44585_210_18628_1_1533605350387_4518199_506-119659-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 15:59:01,364 WARNING [train.py:1204] (2/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_314-126819-0_sp1.1 from training. Duration: 0.4545625 2023-10-02 15:59:01,368 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_2541_210_1794_1_1525590269_3084765_156-173713-0_sp0.9 from training. Duration: 0.9 2023-10-02 15:59:03,128 WARNING [train.py:1204] (2/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_108-255430-0 from training. Duration: 0.97 2023-10-02 15:59:03,144 WARNING [train.py:1204] (2/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_791-45149-0_sp1.1 from training. Duration: 0.7818125 2023-10-02 15:59:03,149 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_53378_210_17282_1_1534227054_1261425_10-118295-0_sp1.1 from training. Duration: 0.97275 2023-10-02 15:59:04,480 WARNING [train.py:1204] (2/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_148-158509-0 from training. Duration: 0.41 2023-10-02 15:59:09,341 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_34726_210_4525_1_1533019474_5030395_687-291305-0_sp1.1 from training. Duration: 0.936375 2023-10-02 15:59:09,613 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=936366.6666666666, ans=0.1 2023-10-02 15:59:11,423 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.2.encoder.layers.2.whiten, num_groups=1, num_channels=384, metric=4.30 vs. limit=12.0 2023-10-02 15:59:12,058 WARNING [train.py:1204] (2/4) Exclude cut with ID _14860_210_4915_1_1530702020340_7343240_264-324537-0_sp1.1 from training. Duration: 0.963625 2023-10-02 15:59:15,000 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_41251_210_15368_1_1533429767_4820695_591-46386-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 15:59:16,436 WARNING [train.py:1204] (2/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_86-19601-0_sp0.9 from training. Duration: 0.9444375 2023-10-02 15:59:16,457 WARNING [train.py:1204] (2/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_401-255491-0_sp0.9 from training. Duration: 0.77775 2023-10-02 15:59:17,821 WARNING [train.py:1204] (2/4) Exclude cut with ID _8696_210_1876_1_1528545632440_3345058_209-54069-0_sp1.1 from training. Duration: 0.6090625 2023-10-02 15:59:17,838 WARNING [train.py:1204] (2/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_369-325184-0_sp1.1 from training. Duration: 0.9 2023-10-02 15:59:19,042 INFO [train.py:1046] (2/4) Epoch 27, batch 2350, loss[loss=0.163, simple_loss=0.2288, pruned_loss=0.04853, over 22600.00 frames. ], tot_loss[loss=0.1681, simple_loss=0.2453, pruned_loss=0.04548, over 4710695.27 frames. ], batch size: 322, lr: 3.80e-03, grad_scale: 16.0 2023-10-02 15:59:19,356 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.nonlin_attention.balancer.prob, batch_count=936433.3333333334, ans=0.125 2023-10-02 15:59:20,335 WARNING [train.py:1204] (2/4) Exclude cut with ID _14687_210_5854_1_1530703838259_3664889_115-239046-0_sp0.9 from training. Duration: 0.7444375 2023-10-02 15:59:20,377 WARNING [train.py:1204] (2/4) Exclude cut with ID _8064_210_5399_1_1528372299291_4282973_168-50337-0 from training. Duration: 0.84 2023-10-02 15:59:24,075 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.2.encoder.layers.0.feed_forward2.out_whiten, num_groups=1, num_channels=384, metric=4.66 vs. limit=15.0 2023-10-02 15:59:24,634 WARNING [train.py:1204] (2/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_909-64151-0_sp1.1 from training. Duration: 0.8545625 2023-10-02 15:59:24,650 WARNING [train.py:1204] (2/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_597-154640-0 from training. Duration: 0.9 2023-10-02 15:59:30,560 WARNING [train.py:1204] (2/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_126-274694-0 from training. Duration: 0.98 2023-10-02 15:59:34,971 WARNING [train.py:1204] (2/4) Exclude cut with ID _21783_210_11357_1_1531742458730_3762679_344-269786-0_sp1.1 from training. Duration: 0.97275 2023-10-02 15:59:37,061 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_37818_210_4238_1_1533620910_4720083_322-253154-0_sp1.1 from training. Duration: 0.92725 2023-10-02 15:59:37,064 WARNING [train.py:1204] (2/4) Exclude cut with ID _58895_210_8750_1_1535184081320_3649089_221-15823-0_sp1.1 from training. Duration: 0.92725 2023-10-02 15:59:37,073 WARNING [train.py:1204] (2/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_9-21737-0_sp1.1 from training. Duration: 0.8818125 2023-10-02 15:59:37,102 WARNING [train.py:1204] (2/4) Exclude cut with ID _14069_210_5632_1_1530512692033_4803390_522-341588-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 15:59:37,285 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.balancer2.prob, batch_count=936500.0, ans=0.125 2023-10-02 15:59:38,444 WARNING [train.py:1204] (2/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_363-185703-0 from training. Duration: 0.95 2023-10-02 15:59:42,627 WARNING [train.py:1204] (2/4) Exclude cut with ID _41817_210_18835_1_1533434477160_7423049_265-76216-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 15:59:48,108 WARNING [train.py:1204] (2/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_841-341679-0 from training. Duration: 0.58 2023-10-02 15:59:50,929 WARNING [train.py:1204] (2/4) Exclude cut with ID _55429_210_10626_1_1534467588527_7174143_97-335084-0_sp1.1 from training. Duration: 0.8818125 2023-10-02 15:59:53,713 WARNING [train.py:1204] (2/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_690-126306-0_sp0.9 from training. Duration: 0.8666875 2023-10-02 15:59:53,722 WARNING [train.py:1204] (2/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_102-92751-0_sp1.1 from training. Duration: 0.9 2023-10-02 15:59:55,195 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_3787_210_2040_1_1526477544_5478586_603-120324-0_sp1.1 from training. Duration: 0.736375 2023-10-02 15:59:55,427 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.attention_skip_rate, batch_count=936566.6666666666, ans=0.0 2023-10-02 15:59:56,566 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_33348_210_5632_1_1533086912_3324030_85-28583-0 from training. Duration: 0.82 2023-10-02 15:59:57,926 WARNING [train.py:1204] (2/4) Exclude cut with ID _60057_210_10640_1_1534903065793_3333150_362-230377-0_sp1.1 from training. Duration: 0.7909375 2023-10-02 15:59:59,930 WARNING [train.py:1204] (2/4) Exclude cut with ID _60113_210_16271_1_1535164212481_4386079_194-146486-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 15:59:59,941 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_53753_210_7234_1_1534320003_3594148_171-277668-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 16:00:01,292 WARNING [train.py:1204] (2/4) Exclude cut with ID _34423_210_8750_1_1533430798456_3330199_251-332788-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 16:00:04,107 WARNING [train.py:1204] (2/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_663-166973-0_sp0.9 from training. Duration: 0.888875 2023-10-02 16:00:06,168 WARNING [train.py:1204] (2/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_927-43827-0 from training. Duration: 0.74 2023-10-02 16:00:07,540 WARNING [train.py:1204] (2/4) Exclude cut with ID _28323_210_16187_1_1532480395667_4100300_306-74561-0_sp1.1 from training. Duration: 0.8545625 2023-10-02 16:00:09,350 WARNING [train.py:1204] (2/4) Exclude cut with ID _15037_210_2253_1_1530752294104_3267650_139-337989-0_sp1.1 from training. Duration: 0.92725 2023-10-02 16:00:10,596 WARNING [train.py:1204] (2/4) Exclude cut with ID _49039_210_6315_1_1533967537541_3846118_460-263670-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 16:00:12,145 WARNING [train.py:1204] (2/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_186-309161-0 from training. Duration: 0.54 2023-10-02 16:00:13,568 WARNING [train.py:1204] (2/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_87-278686-0_sp1.1 from training. Duration: 0.7 2023-10-02 16:00:13,805 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.bypass.scale_min, batch_count=936633.3333333334, ans=0.2 2023-10-02 16:00:16,305 WARNING [train.py:1204] (2/4) Exclude cut with ID _27052_210_5986_1_1532329197187_3585009_314-106499-0 from training. Duration: 0.92 2023-10-02 16:00:16,324 WARNING [train.py:1204] (2/4) Exclude cut with ID _3465_210_3525_1_1526175667060_3574049_412-287526-0_sp0.9 from training. Duration: 0.8 2023-10-02 16:00:20,483 WARNING [train.py:1204] (2/4) Exclude cut with ID _22961_210_8254_1_1531976366672_4278180_317-199546-0 from training. Duration: 0.86 2023-10-02 16:00:23,370 WARNING [train.py:1204] (2/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_381-317791-0 from training. Duration: 0.73 2023-10-02 16:00:24,738 WARNING [train.py:1204] (2/4) Exclude cut with ID _44010_210_4525_1_1533884262964_4013620_560-310411-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 16:00:24,746 WARNING [train.py:1204] (2/4) Exclude cut with ID _5294_210_1747_1_1527249643928_3696179_342-270278-0_sp0.9 from training. Duration: 0.611125 2023-10-02 16:00:24,772 WARNING [train.py:1204] (2/4) Exclude cut with ID ID0473W0002-918951 from training. Duration: 0.924 2023-10-02 16:00:24,796 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1083-220122-0 from training. Duration: 0.51 2023-10-02 16:00:26,433 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.feed_forward2.hidden_balancer.prob, batch_count=936700.0, ans=0.125 2023-10-02 16:00:27,497 WARNING [train.py:1204] (2/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_138-154812-0 from training. Duration: 0.78 2023-10-02 16:00:28,943 WARNING [train.py:1204] (2/4) Exclude cut with ID _44982_210_1511_1_1534312820108_3726029_331-19420-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 16:00:32,036 INFO [train.py:1046] (2/4) Epoch 27, batch 2400, loss[loss=0.1618, simple_loss=0.2377, pruned_loss=0.04294, over 23633.00 frames. ], tot_loss[loss=0.1686, simple_loss=0.2456, pruned_loss=0.04581, over 4701735.66 frames. ], batch size: 149, lr: 3.79e-03, grad_scale: 32.0 2023-10-02 16:00:33,470 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_2655_210_2087_1_1525688959_3897400_202-147557-0_sp0.9 from training. Duration: 0.9444375 2023-10-02 16:00:38,425 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_23850_210_4238_1_1532232597_1632433_32-185135-0_sp0.9 from training. Duration: 0.9555625 2023-10-02 16:00:38,535 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_14056_210_4163_1_1530604483_7572827_46-93547-0_sp1.1 from training. Duration: 0.863625 2023-10-02 16:00:38,581 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7931_210_1794_1_1528594728_5564059_280-279560-0 from training. Duration: 0.98 2023-10-02 16:00:38,609 WARNING [train.py:1204] (2/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_937-208281-0 from training. Duration: 0.85 2023-10-02 16:00:45,998 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_14928_210_5632_1_1530773461_4714000_454-112198-0_sp0.9 from training. Duration: 0.7333125 2023-10-02 16:00:45,999 WARNING [train.py:1204] (2/4) Exclude cut with ID _32704_210_8954_1_1533017488840_6747370_944-153333-0_sp1.1 from training. Duration: 0.963625 2023-10-02 16:00:48,787 WARNING [train.py:1204] (2/4) Exclude cut with ID _14821_210_8750_1_1530680503991_6829359_2-227151-0 from training. Duration: 0.83 2023-10-02 16:00:48,814 WARNING [train.py:1204] (2/4) Exclude cut with ID _28461_210_2253_1_1532771775189_3041299_96-152158-0_sp1.1 from training. Duration: 0.77275 2023-10-02 16:00:48,884 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7837_210_1792_1_1528453445_5841305_654-54195-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 16:00:50,232 WARNING [train.py:1204] (2/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_344-152526-0 from training. Duration: 0.53 2023-10-02 16:00:55,729 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_30167_210_3528_1_1532428012_4772375_480-142230-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 16:00:57,072 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_160-218721-0 from training. Duration: 0.96 2023-10-02 16:00:57,568 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.4.encoder.layers.1.conv_module2.whiten, num_groups=1, num_channels=384, metric=7.93 vs. limit=15.0 2023-10-02 16:01:03,805 WARNING [train.py:1204] (2/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_65-88242-0_sp0.9 from training. Duration: 0.62225 2023-10-02 16:01:07,716 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_34886_210_10633_1_1533203882_1755661_76-156096-0 from training. Duration: 0.87 2023-10-02 16:01:11,966 WARNING [train.py:1204] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_188-38388-0_sp1.1 from training. Duration: 0.92725 2023-10-02 16:01:13,475 WARNING [train.py:1204] (2/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_81-289335-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 16:01:17,622 WARNING [train.py:1204] (2/4) Exclude cut with ID _59493_210_17282_1_1535078419077_3225879_110-8778-0_sp1.1 from training. Duration: 0.936375 2023-10-02 16:01:17,662 WARNING [train.py:1204] (2/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_188-260011-0 from training. Duration: 0.73 2023-10-02 16:01:18,924 WARNING [train.py:1204] (2/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_453-133983-0_sp0.9 from training. Duration: 0.8666875 2023-10-02 16:01:25,751 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.564e+02 1.884e+02 2.076e+02 2.344e+02 3.327e+02, threshold=4.151e+02, percent-clipped=0.0 2023-10-02 16:01:25,829 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8054_210_7927_1_1528627721_4984094_553-17379-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 16:01:27,240 WARNING [train.py:1204] (2/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_341-171072-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 16:01:30,654 WARNING [train.py:1204] (2/4) Exclude cut with ID _30800_210_22382_1_1532499347904_4515959_265-257286-0_sp1.1 from training. Duration: 0.963625 2023-10-02 16:01:30,741 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_403-39619-0_sp0.9 from training. Duration: 0.9333125 2023-10-02 16:01:30,746 WARNING [train.py:1204] (2/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_464-33931-0_sp0.9 from training. Duration: 0.6 2023-10-02 16:01:31,986 WARNING [train.py:1204] (2/4) Exclude cut with ID _60804_210_9642_1_1534831199859_7268299_632-143863-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 16:01:31,998 WARNING [train.py:1204] (2/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_431-310997-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 16:01:33,320 WARNING [train.py:1204] (2/4) Exclude cut with ID _31649_210_6228_1_1532584608063_4091078_114-263860-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 16:01:33,340 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6961_210_2087_1_1527937168_6815011_562-165518-0_sp0.9 from training. Duration: 0.8555625 2023-10-02 16:01:35,446 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.conv_module2.balancer2.prob, batch_count=937033.3333333334, ans=0.125 2023-10-02 16:01:35,738 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.5.encoder.layers.1.self_attn1.whiten, num_groups=1, num_channels=256, metric=10.68 vs. limit=22.5 2023-10-02 16:01:36,750 WARNING [train.py:1204] (2/4) Exclude cut with ID _27052_210_5986_1_1532329197187_3585009_338-325870-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 16:01:36,790 WARNING [train.py:1204] (2/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_469-94673-0_sp1.1 from training. Duration: 0.7181875 2023-10-02 16:01:36,809 WARNING [train.py:1204] (2/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_259-34038-0 from training. Duration: 0.74 2023-10-02 16:01:38,155 WARNING [train.py:1204] (2/4) Exclude cut with ID _14922_210_6228_1_1530707435273_3693310_388-129552-0 from training. Duration: 0.96 2023-10-02 16:01:39,586 WARNING [train.py:1204] (2/4) Exclude cut with ID _28394_210_8913_1_1532430049518_3650118_342-251455-0_sp1.1 from training. Duration: 0.936375 2023-10-02 16:01:39,605 WARNING [train.py:1204] (2/4) Exclude cut with ID _53731_210_7994_1_1534553979244_3757214_8-191390-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 16:01:39,641 WARNING [train.py:1204] (2/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_613-232856-0 from training. Duration: 0.54 2023-10-02 16:01:41,416 WARNING [train.py:1204] (2/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_557-349534-0 from training. Duration: 0.91 2023-10-02 16:01:41,432 WARNING [train.py:1204] (2/4) Exclude cut with ID _14224_210_4381_1_1530529062037_4264010_379-184223-0 from training. Duration: 0.85 2023-10-02 16:01:41,438 WARNING [train.py:1204] (2/4) Exclude cut with ID IC0125W0001-61807 from training. Duration: 0.966 2023-10-02 16:01:42,824 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_229-45300-0 from training. Duration: 0.57 2023-10-02 16:01:42,895 WARNING [train.py:1204] (2/4) Exclude cut with ID _30304_210_6319_1_1532476827261_7582060_1069-24743-0_sp1.1 from training. Duration: 0.92725 2023-10-02 16:01:44,332 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_41452_210_7291_1_1533437687_3941281_323-172873-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 16:01:44,342 WARNING [train.py:1204] (2/4) Exclude cut with ID _37086_210_9038_1_1532999540891_7021250_802-226709-0_sp1.1 from training. Duration: 0.97275 2023-10-02 16:01:45,654 WARNING [train.py:1204] (2/4) Exclude cut with ID IC0144W0500-71767 from training. Duration: 0.931875 2023-10-02 16:01:46,937 INFO [train.py:1046] (2/4) Epoch 27, batch 2450, loss[loss=0.1527, simple_loss=0.1983, pruned_loss=0.05358, over 19047.00 frames. ], tot_loss[loss=0.1674, simple_loss=0.2436, pruned_loss=0.04558, over 4681116.41 frames. ], batch size: 388, lr: 3.79e-03, grad_scale: 32.0 2023-10-02 16:01:47,006 WARNING [train.py:1204] (2/4) Exclude cut with ID _39846_210_12651_1_1533605310265_2115212_233-129699-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 16:01:47,062 WARNING [train.py:1204] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_574-45541-0_sp0.9 from training. Duration: 0.72225 2023-10-02 16:01:47,704 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.2.encoder.layers.1.nonlin_attention.whiten2, num_groups=1, num_channels=384, metric=7.25 vs. limit=15.0 2023-10-02 16:01:49,812 WARNING [train.py:1204] (2/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_372-298093-0_sp1.1 from training. Duration: 0.72725 2023-10-02 16:01:51,027 WARNING [train.py:1204] (2/4) Exclude cut with ID _62574_210_2655_1_1535180407058_3933249_34-26343-0_sp1.1 from training. Duration: 0.97275 2023-10-02 16:01:53,819 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_512-256238-0_sp1.1 from training. Duration: 0.963625 2023-10-02 16:01:53,831 WARNING [train.py:1204] (2/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_196-55502-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 16:01:55,205 WARNING [train.py:1204] (2/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_195-167011-0 from training. Duration: 0.75 2023-10-02 16:01:57,215 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.0.conv_module2.balancer2.min_positive, batch_count=937100.0, ans=0.05 2023-10-02 16:01:59,956 WARNING [train.py:1204] (2/4) Exclude cut with ID _13678_210_7497_1_1530530069226_3463170_345-230416-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 16:02:00,598 WARNING [train.py:1204] (2/4) Exclude cut with ID _51078_210_9070_1_1534161557424_3805250_225-327809-0_sp1.1 from training. Duration: 0.963625 2023-10-02 16:02:03,322 WARNING [train.py:1204] (2/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_18-189198-0_sp1.1 from training. Duration: 0.8181875 2023-10-02 16:02:03,332 WARNING [train.py:1204] (2/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_329-82235-0_sp1.1 from training. Duration: 0.7454375 2023-10-02 16:02:04,655 WARNING [train.py:1204] (2/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_726-270780-0_sp0.9 from training. Duration: 0.9555625 2023-10-02 16:02:04,683 WARNING [train.py:1204] (2/4) Exclude cut with ID _66873_210_5517_1_1535454136506_4293370_654-52713-0 from training. Duration: 0.77 2023-10-02 16:02:08,045 WARNING [train.py:1204] (2/4) Exclude cut with ID _20455_210_5219_1_1531565996775_8098100_191-16006-0_sp1.1 from training. Duration: 0.963625 2023-10-02 16:02:09,549 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_3171_210_2087_1_1526109084_2330007_66-260583-0_sp1.1 from training. Duration: 0.6545625 2023-10-02 16:02:10,805 WARNING [train.py:1204] (2/4) Exclude cut with ID _38670_210_15379_1_1533448810659_4391059_63-129006-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 16:02:13,901 WARNING [train.py:1204] (2/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_393-57888-0_sp0.9 from training. Duration: 0.711125 2023-10-02 16:02:13,928 WARNING [train.py:1204] (2/4) Exclude cut with ID _6275_210_2084_1_1528110062213_7669049_857-298942-0_sp0.9 from training. Duration: 0.9444375 2023-10-02 16:02:15,305 WARNING [train.py:1204] (2/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_239-22076-0_sp0.9 from training. Duration: 0.9444375 2023-10-02 16:02:15,333 WARNING [train.py:1204] (2/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_424-227947-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 16:02:16,764 WARNING [train.py:1204] (2/4) Exclude cut with ID _14069_210_5632_1_1530512692033_4803390_95-1896-0 from training. Duration: 0.98 2023-10-02 16:02:18,147 WARNING [train.py:1204] (2/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_96-244299-0_sp0.9 from training. Duration: 0.97775 2023-10-02 16:02:19,777 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.ff2_skip_rate, batch_count=937233.3333333334, ans=0.0 2023-10-02 16:02:26,509 WARNING [train.py:1204] (2/4) Exclude cut with ID _46683_210_9642_1_1533730913492_4591116_562-319825-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 16:02:27,940 WARNING [train.py:1204] (2/4) Exclude cut with ID _64639_210_5816_1_1535367619437_3528000_284-173474-0_sp1.1 from training. Duration: 0.963625 2023-10-02 16:02:27,970 WARNING [train.py:1204] (2/4) Exclude cut with ID _29529_210_7234_1_1532393961455_3977460_497-100133-0_sp1.1 from training. Duration: 0.936375 2023-10-02 16:02:28,007 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_12868_210_6753_1_1530701932_530275_18-76322-0_sp0.9 from training. Duration: 0.9666875 2023-10-02 16:02:28,653 WARNING [train.py:1204] (2/4) Exclude cut with ID _50093_210_4220_1_1534417141207_3432299_116-303447-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 16:02:30,011 WARNING [train.py:1204] (2/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_44-79427-0_sp1.1 from training. Duration: 0.8909375 2023-10-02 16:02:31,367 WARNING [train.py:1204] (2/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_887-249555-0 from training. Duration: 0.77 2023-10-02 16:02:34,073 WARNING [train.py:1204] (2/4) Exclude cut with ID _28461_210_2253_1_1532771775189_3041299_96-152158-0_sp0.9 from training. Duration: 0.9444375 2023-10-02 16:02:34,119 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_14058_210_4163_1_1530690953_6327180_49-210687-0_sp1.1 from training. Duration: 0.8545625 2023-10-02 16:02:37,026 WARNING [train.py:1204] (2/4) Exclude cut with ID _40399_210_4353_1_1533305658194_3279406_351-127903-0_sp1.1 from training. Duration: 0.97275 2023-10-02 16:02:37,040 WARNING [train.py:1204] (2/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_184-206939-0_sp1.1 from training. Duration: 0.936375 2023-10-02 16:02:43,583 WARNING [train.py:1204] (2/4) Exclude cut with ID _14100_210_8341_1_1530529320613_3678069_258-224599-0_sp1.1 from training. Duration: 0.7 2023-10-02 16:02:44,882 WARNING [train.py:1204] (2/4) Exclude cut with ID _6493_210_1747_1_1528459210424_4012470_324-323854-0 from training. Duration: 0.92 2023-10-02 16:02:45,087 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.balancer2.prob, batch_count=937366.6666666666, ans=0.125 2023-10-02 16:02:46,262 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_12868_210_6753_1_1530702479_3610564_151-325626-0_sp1.1 from training. Duration: 0.8818125 2023-10-02 16:02:47,624 WARNING [train.py:1204] (2/4) Exclude cut with ID _14528_210_8254_1_1530669700514_4306939_106-43145-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 16:02:48,907 WARNING [train.py:1204] (2/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_686-54383-0 from training. Duration: 0.87 2023-10-02 16:02:48,953 WARNING [train.py:1204] (2/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_801-102362-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 16:02:49,023 WARNING [train.py:1204] (2/4) Exclude cut with ID _48677_210_5663_1_1533902405919_3892989_408-40740-0_sp0.9 from training. Duration: 0.92225 2023-10-02 16:02:52,704 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.1.encoder.layers.1.conv_module1.whiten, num_groups=1, num_channels=256, metric=10.78 vs. limit=15.0 2023-10-02 16:02:53,293 WARNING [train.py:1204] (2/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_448-212135-0_sp1.1 from training. Duration: 0.82725 2023-10-02 16:02:55,247 WARNING [train.py:1204] (2/4) Exclude cut with ID _59170_210_6721_1_1534983633960_4203335_320-124047-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 16:02:55,304 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_4692_210_3528_1_1526899755_4323254_107-44401-0_sp1.1 from training. Duration: 0.7818125 2023-10-02 16:02:59,412 WARNING [train.py:1204] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_731-2428-0 from training. Duration: 0.83 2023-10-02 16:03:00,700 INFO [train.py:1046] (2/4) Epoch 27, batch 2500, loss[loss=0.1547, simple_loss=0.2092, pruned_loss=0.05013, over 19099.00 frames. ], tot_loss[loss=0.1667, simple_loss=0.2427, pruned_loss=0.04532, over 4679425.27 frames. ], batch size: 388, lr: 3.79e-03, grad_scale: 16.0 2023-10-02 16:03:01,439 WARNING [train.py:1204] (2/4) Exclude cut with ID _29857_210_20034_1_1532395886753_3798160_335-163562-0_sp1.1 from training. Duration: 0.77275 2023-10-02 16:03:05,638 WARNING [train.py:1204] (2/4) Exclude cut with ID _14832_210_8750_1_1530691351539_3171348_171-275004-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 16:03:07,178 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.conv_module1.balancer2.prob, batch_count=937433.3333333334, ans=0.125 2023-10-02 16:03:10,277 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.3.encoder.layers.3.self_attn1.whiten, num_groups=1, num_channels=512, metric=16.05 vs. limit=22.5 2023-10-02 16:03:14,912 WARNING [train.py:1204] (2/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_105-328174-0_sp0.9 from training. Duration: 0.8444375 2023-10-02 16:03:16,158 WARNING [train.py:1204] (2/4) Exclude cut with ID _68965_210_1883_1_1535698962272_3497490_164-118421-0_sp1.1 from training. Duration: 0.936375 2023-10-02 16:03:17,529 WARNING [train.py:1204] (2/4) Exclude cut with ID _19896_210_1883_1_1531709022662_6649569_380-24170-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 16:03:17,534 WARNING [train.py:1204] (2/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_505-205270-0_sp1.1 from training. Duration: 0.3454375 2023-10-02 16:03:20,932 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.4.encoder.layers.0.feed_forward2.out_whiten, num_groups=1, num_channels=384, metric=6.55 vs. limit=15.0 2023-10-02 16:03:24,557 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.balancer_ff3.min_abs, batch_count=937500.0, ans=0.2 2023-10-02 16:03:25,714 WARNING [train.py:1204] (2/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_427-208901-0_sp1.1 from training. Duration: 0.6181875 2023-10-02 16:03:25,773 WARNING [train.py:1204] (2/4) Exclude cut with ID _23818_210_6228_1_1531917063884_3676920_85-326916-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 16:03:28,847 WARNING [train.py:1204] (2/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_101-293367-0_sp1.1 from training. Duration: 0.57275 2023-10-02 16:03:28,855 WARNING [train.py:1204] (2/4) Exclude cut with ID _5409_210_1388_1_1527292850189_5865191_178-127970-0_sp1.1 from training. Duration: 0.5181875 2023-10-02 16:03:28,908 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_403-39619-0 from training. Duration: 0.84 2023-10-02 16:03:30,329 WARNING [train.py:1204] (2/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_427-87841-0_sp1.1 from training. Duration: 0.963625 2023-10-02 16:03:30,370 WARNING [train.py:1204] (2/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_286-68181-0_sp1.1 from training. Duration: 0.97275 2023-10-02 16:03:31,775 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6673_210_3273_1_1527767790_3173240_327-306185-0 from training. Duration: 0.83 2023-10-02 16:03:31,802 WARNING [train.py:1204] (2/4) Exclude cut with ID _3675_210_4557_1_1526639934408_4182089_106-203713-0_sp1.1 from training. Duration: 0.963625 2023-10-02 16:03:31,854 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_53071_210_16554_1_1534493434_1276796_130-36076-0 from training. Duration: 0.95 2023-10-02 16:03:33,151 WARNING [train.py:1204] (2/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_586-93054-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 16:03:36,346 WARNING [train.py:1204] (2/4) Exclude cut with ID _23825_210_13449_1_1531915313526_3497150_262-10571-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 16:03:37,594 WARNING [train.py:1204] (2/4) Exclude cut with ID _28783_210_7943_1_1532671164808_7155479_432-67885-0_sp1.1 from training. Duration: 0.97275 2023-10-02 16:03:39,087 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6287_210_3273_1_1527509506_2653716_182-199887-0_sp0.9 from training. Duration: 0.5666875 2023-10-02 16:03:40,407 WARNING [train.py:1204] (2/4) Exclude cut with ID _3231_210_2106_1_1526205864934_6784220_171-256004-0 from training. Duration: 0.86 2023-10-02 16:03:40,456 WARNING [train.py:1204] (2/4) Exclude cut with ID _69051_210_1531_1_1535885566901_3829538_332-92090-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 16:03:43,080 WARNING [train.py:1204] (2/4) Exclude cut with ID _3675_210_4557_1_1526639934408_4182089_104-324125-0_sp1.1 from training. Duration: 0.963625 2023-10-02 16:03:46,931 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_41764_210_2521_1_1533686110_4089313_501-2119-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 16:03:51,052 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_14056_210_4163_1_1530604483_7572827_56-100988-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 16:03:53,875 WARNING [train.py:1204] (2/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_398-239819-0_sp1.1 from training. Duration: 0.9 2023-10-02 16:03:55,143 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.578e+02 1.821e+02 2.040e+02 2.416e+02 3.469e+02, threshold=4.081e+02, percent-clipped=0.0 2023-10-02 16:03:58,087 WARNING [train.py:1204] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_74-48717-0_sp1.1 from training. Duration: 0.52725 2023-10-02 16:04:01,709 WARNING [train.py:1204] (2/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_177-49092-0 from training. Duration: 0.91 2023-10-02 16:04:01,734 WARNING [train.py:1204] (2/4) Exclude cut with ID _62337_210_12392_1_1535009273105_3412229_44-176353-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 16:04:01,749 WARNING [train.py:1204] (2/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_110-130961-0_sp1.1 from training. Duration: 0.42725 2023-10-02 16:04:04,427 WARNING [train.py:1204] (2/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_37-189262-0_sp1.1 from training. Duration: 0.8909375 2023-10-02 16:04:04,427 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_3172_210_2087_1_1526292929_3987865_197-97762-0_sp0.9 from training. Duration: 0.8333125 2023-10-02 16:04:05,811 WARNING [train.py:1204] (2/4) Exclude cut with ID IC0677W0065-334897 from training. Duration: 0.896 2023-10-02 16:04:05,811 WARNING [train.py:1204] (2/4) Exclude cut with ID IC0677W0080-334912 from training. Duration: 0.768 2023-10-02 16:04:05,816 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_120-41668-0 from training. Duration: 0.7 2023-10-02 16:04:09,108 WARNING [train.py:1204] (2/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_460-205287-0_sp1.1 from training. Duration: 0.963625 2023-10-02 16:04:10,499 WARNING [train.py:1204] (2/4) Exclude cut with ID _14632_210_9988_1_1530691279392_3923200_102-204477-0 from training. Duration: 0.97 2023-10-02 16:04:10,507 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_34815_210_6388_1_1533381320_3571903_279-305565-0 from training. Duration: 0.92 2023-10-02 16:04:11,960 WARNING [train.py:1204] (2/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_91-148831-0_sp1.1 from training. Duration: 0.9 2023-10-02 16:04:13,227 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6291_210_3273_1_1527683649_1078498_82-221196-0 from training. Duration: 0.76 2023-10-02 16:04:14,537 INFO [train.py:1046] (2/4) Epoch 27, batch 2550, loss[loss=0.1616, simple_loss=0.2527, pruned_loss=0.03524, over 24417.00 frames. ], tot_loss[loss=0.1669, simple_loss=0.2437, pruned_loss=0.04507, over 4691669.59 frames. ], batch size: 69, lr: 3.79e-03, grad_scale: 16.0 2023-10-02 16:04:16,569 WARNING [train.py:1204] (2/4) Exclude cut with ID _38020_210_5986_1_1533112252259_7006089_493-102745-0 from training. Duration: 0.98 2023-10-02 16:04:19,919 WARNING [train.py:1204] (2/4) Exclude cut with ID _61133_210_9969_1_1535196777227_3696409_40-93442-0_sp1.1 from training. Duration: 0.92725 2023-10-02 16:04:21,346 WARNING [train.py:1204] (2/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_45-258852-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 16:04:21,373 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_53071_210_16554_1_1534493434_1276796_130-36076-0_sp1.1 from training. Duration: 0.863625 2023-10-02 16:04:22,779 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8694_210_6400_1_1529065754_3260756_332-13553-0_sp1.1 from training. Duration: 0.92725 2023-10-02 16:04:24,107 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_405-339986-0 from training. Duration: 0.51 2023-10-02 16:04:24,153 WARNING [train.py:1204] (2/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_927-43827-0_sp1.1 from training. Duration: 0.67275 2023-10-02 16:04:27,090 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_397-147196-0 from training. Duration: 0.8 2023-10-02 16:04:28,376 WARNING [train.py:1204] (2/4) Exclude cut with ID 3557-8342-0013-54691-0_sp1.1 from training. Duration: 0.836375 2023-10-02 16:04:30,388 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_47438_210_5399_1_1533797471_4513356_626-127722-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 16:04:33,003 WARNING [train.py:1204] (2/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_256-323883-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 16:04:34,312 WARNING [train.py:1204] (2/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_839-173017-0_sp0.9 from training. Duration: 0.4555625 2023-10-02 16:04:34,353 WARNING [train.py:1204] (2/4) Exclude cut with ID _5289_210_2084_1_1527678202736_3779299_392-261988-0_sp1.1 from training. Duration: 0.5909375 2023-10-02 16:04:35,738 WARNING [train.py:1204] (2/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_119-224384-0_sp1.1 from training. Duration: 0.936375 2023-10-02 16:04:35,788 WARNING [train.py:1204] (2/4) Exclude cut with ID _57523_210_1856_1_1534766294545_3732989_262-267933-0_sp1.1 from training. Duration: 0.963625 2023-10-02 16:04:38,538 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_35361_210_4238_1_1533262810_7793476_675-87012-0_sp0.9 from training. Duration: 0.988875 2023-10-02 16:04:38,561 WARNING [train.py:1204] (2/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_17-90541-0 from training. Duration: 0.94 2023-10-02 16:04:40,456 WARNING [train.py:1204] (2/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_535-14410-0_sp1.1 from training. Duration: 0.42725 2023-10-02 16:04:40,462 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_24673_210_6400_1_1531968633_778659_94-154511-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 16:04:40,465 WARNING [train.py:1204] (2/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_212-10501-0 from training. Duration: 0.94 2023-10-02 16:04:44,929 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.0.nonlin_attention.balancer.prob, batch_count=937900.0, ans=0.125 2023-10-02 16:04:51,423 WARNING [train.py:1204] (2/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_383-285196-0_sp1.1 from training. Duration: 0.8818125 2023-10-02 16:04:57,056 WARNING [train.py:1204] (2/4) Exclude cut with ID _45560_210_21982_1_1533812603951_3584999_238-47020-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 16:04:57,075 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_21500_210_2054_1_1532149310_2716758_278-18053-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 16:04:57,082 WARNING [train.py:1204] (2/4) Exclude cut with ID _56235_210_10640_1_1534557335235_4161099_453-9626-0_sp1.1 from training. Duration: 0.936375 2023-10-02 16:04:58,580 WARNING [train.py:1204] (2/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_153-97441-0_sp1.1 from training. Duration: 0.5090625 2023-10-02 16:05:03,387 WARNING [train.py:1204] (2/4) Exclude cut with ID _67725_210_27767_1_1535608756671_5105359_431-120895-0_sp1.1 from training. Duration: 0.963625 2023-10-02 16:05:04,765 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.0.feed_forward1.hidden_balancer.prob, batch_count=937966.6666666666, ans=0.125 2023-10-02 16:05:07,459 WARNING [train.py:1204] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_574-45541-0_sp1.1 from training. Duration: 0.5909375 2023-10-02 16:05:07,477 WARNING [train.py:1204] (2/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_162-103620-0_sp1.1 from training. Duration: 0.8454375 2023-10-02 16:05:07,487 WARNING [train.py:1204] (2/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_734-129761-0_sp1.1 from training. Duration: 0.7454375 2023-10-02 16:05:08,812 WARNING [train.py:1204] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_620-276793-0_sp1.1 from training. Duration: 0.536375 2023-10-02 16:05:08,852 WARNING [train.py:1204] (2/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_153-97441-0_sp0.9 from training. Duration: 0.62225 2023-10-02 16:05:13,331 WARNING [train.py:1204] (2/4) Exclude cut with ID _12733_210_8341_1_1530167424556_3646380_523-85621-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 16:05:13,447 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.0.conv_module2.balancer1.prob, batch_count=938033.3333333334, ans=0.125 2023-10-02 16:05:14,617 WARNING [train.py:1204] (2/4) Exclude cut with ID _64048_210_10626_1_1535198382802_3835179_352-179834-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 16:05:19,477 WARNING [train.py:1204] (2/4) Exclude cut with ID _55162_210_17071_1_1534408248583_3753480_43-242364-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 16:05:19,491 WARNING [train.py:1204] (2/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_661-264857-0 from training. Duration: 0.93 2023-10-02 16:05:19,497 WARNING [train.py:1204] (2/4) Exclude cut with ID _6274_210_2084_1_1527937583837_7494020_325-237800-0_sp1.1 from training. Duration: 0.97275 2023-10-02 16:05:20,777 WARNING [train.py:1204] (2/4) Exclude cut with ID _55328_210_27767_1_1534580973260_4088170_385-171459-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 16:05:20,862 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_3780_210_2087_1_1526467389_2145572_176-107140-0_sp1.1 from training. Duration: 0.62725 2023-10-02 16:05:22,698 WARNING [train.py:1204] (2/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_375-244976-0_sp1.1 from training. Duration: 0.6818125 2023-10-02 16:05:22,817 WARNING [train.py:1204] (2/4) Exclude cut with ID _35941_210_15379_1_1532995294515_4023089_309-33162-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 16:05:27,195 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.ff2_skip_rate, batch_count=938033.3333333334, ans=0.0 2023-10-02 16:05:28,382 WARNING [train.py:1204] (2/4) Exclude cut with ID _14459_210_2228_1_1531443641946_7516700_1185-22038-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 16:05:29,797 INFO [train.py:1046] (2/4) Epoch 27, batch 2600, loss[loss=0.1674, simple_loss=0.2401, pruned_loss=0.04737, over 23792.00 frames. ], tot_loss[loss=0.167, simple_loss=0.2441, pruned_loss=0.04499, over 4695736.56 frames. ], batch size: 179, lr: 3.79e-03, grad_scale: 16.0 2023-10-02 16:05:31,126 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_1197-69460-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 16:05:32,591 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9012W0022-501157 from training. Duration: 0.896 2023-10-02 16:05:35,911 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9023W0009-506302 from training. Duration: 0.896 2023-10-02 16:05:35,933 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_2821_210_2715_1_1526108385_3763560_73-296673-0_sp1.1 from training. Duration: 0.7545625 2023-10-02 16:05:37,247 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9025W0113-507442 from training. Duration: 0.896 2023-10-02 16:05:37,331 WARNING [train.py:1204] (2/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_739-2098-0 from training. Duration: 0.92 2023-10-02 16:05:37,351 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9029W0332-509722 from training. Duration: 0.768 2023-10-02 16:05:40,466 WARNING [train.py:1204] (2/4) Exclude cut with ID _16908_210_5724_1_1531119157248_3772700_68-101953-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 16:05:40,482 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9042W0011-515617 from training. Duration: 0.768 2023-10-02 16:05:40,748 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.feed_forward1.out_proj.dropout_p, batch_count=938100.0, ans=0.1 2023-10-02 16:05:41,841 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_3019_210_2028_1_1526044245_3264865_191-72235-0 from training. Duration: 0.97 2023-10-02 16:05:41,935 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9052W0006-520792 from training. Duration: 0.896 2023-10-02 16:05:44,553 WARNING [train.py:1204] (2/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_245-174553-0_sp1.1 from training. Duration: 0.8 2023-10-02 16:05:45,987 WARNING [train.py:1204] (2/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_202-67017-0 from training. Duration: 0.82 2023-10-02 16:05:47,284 WARNING [train.py:1204] (2/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_304-59227-0 from training. Duration: 0.77 2023-10-02 16:05:49,099 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_3133_210_2087_1_1526106446_2324690_246-125000-0_sp0.9 from training. Duration: 0.688875 2023-10-02 16:05:49,143 WARNING [train.py:1204] (2/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_243-42116-0 from training. Duration: 0.73 2023-10-02 16:05:51,880 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9087W0121-538492 from training. Duration: 0.896 2023-10-02 16:05:51,895 WARNING [train.py:1204] (2/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_120-76843-0 from training. Duration: 0.55 2023-10-02 16:05:52,193 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.conv_module2.balancer2.prob, batch_count=938166.6666666666, ans=0.125 2023-10-02 16:05:59,445 WARNING [train.py:1204] (2/4) Exclude cut with ID _7852_210_5219_1_1528286471316_8139400_850-304621-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 16:06:01,316 WARNING [train.py:1204] (2/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_154-285415-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 16:06:01,328 WARNING [train.py:1204] (2/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_538-278194-0_sp1.1 from training. Duration: 0.92725 2023-10-02 16:06:01,328 WARNING [train.py:1204] (2/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_119-207162-0 from training. Duration: 0.71 2023-10-02 16:06:02,772 WARNING [train.py:1204] (2/4) Exclude cut with ID _2334_210_2667_1_1525608238880_4705129_459-104997-0_sp1.1 from training. Duration: 0.863625 2023-10-02 16:06:07,117 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9147W0019-567847 from training. Duration: 0.768 2023-10-02 16:06:08,999 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.5.encoder.layers.1.self_attn1.whiten, num_groups=1, num_channels=256, metric=12.56 vs. limit=22.5 2023-10-02 16:06:13,017 WARNING [train.py:1204] (2/4) Exclude cut with ID _69884_210_9771_1_1535846392818_4166169_228-189439-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 16:06:13,062 WARNING [train.py:1204] (2/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_196-77172-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 16:06:14,335 WARNING [train.py:1204] (2/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_625-338546-0 from training. Duration: 0.88 2023-10-02 16:06:15,556 WARNING [train.py:1204] (2/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_193-327630-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 16:06:15,558 WARNING [train.py:1204] (2/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_391-24006-0_sp1.1 from training. Duration: 0.92725 2023-10-02 16:06:15,636 WARNING [train.py:1204] (2/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_197-28164-0 from training. Duration: 0.8 2023-10-02 16:06:18,988 WARNING [train.py:1204] (2/4) Exclude cut with ID _8395_210_1531_1_1528597871523_4411579_431-132470-0_sp1.1 from training. Duration: 0.67275 2023-10-02 16:06:19,005 WARNING [train.py:1204] (2/4) Exclude cut with ID _35350_210_16040_1_1533040191183_4075299_465-248326-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 16:06:21,756 WARNING [train.py:1204] (2/4) Exclude cut with ID _16756_210_4915_1_1531047803763_7104259_101-141922-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 16:06:24,356 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.500e+02 1.903e+02 2.122e+02 2.565e+02 3.470e+02, threshold=4.244e+02, percent-clipped=0.0 2023-10-02 16:06:24,542 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9212W0005-600172 from training. Duration: 0.896 2023-10-02 16:06:24,579 WARNING [train.py:1204] (2/4) Exclude cut with ID _63186_210_2084_1_1535414479705_3908649_378-166820-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 16:06:25,884 WARNING [train.py:1204] (2/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_52-211683-0_sp1.1 from training. Duration: 0.8181875 2023-10-02 16:06:29,460 WARNING [train.py:1204] (2/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_1147-237729-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 16:06:29,533 WARNING [train.py:1204] (2/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_234-14174-0_sp0.9 from training. Duration: 0.911125 2023-10-02 16:06:29,545 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_454-276429-0 from training. Duration: 0.87 2023-10-02 16:06:31,503 WARNING [train.py:1204] (2/4) Exclude cut with ID _53713_210_24116_1_1534314487762_7416751_755-165313-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 16:06:34,208 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8278_210_2550_1_1528637099_4220413_436-317072-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 16:06:34,270 WARNING [train.py:1204] (2/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_28-324729-0_sp1.1 from training. Duration: 0.8545625 2023-10-02 16:06:40,047 WARNING [train.py:1204] (2/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_326-165900-0 from training. Duration: 0.69 2023-10-02 16:06:41,461 WARNING [train.py:1204] (2/4) Exclude cut with ID _53489_210_6228_1_1534298357169_3835970_96-30246-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 16:06:44,154 INFO [train.py:1046] (2/4) Epoch 27, batch 2650, loss[loss=0.1776, simple_loss=0.2525, pruned_loss=0.0513, over 23149.00 frames. ], tot_loss[loss=0.168, simple_loss=0.245, pruned_loss=0.04549, over 4693918.73 frames. ], batch size: 105, lr: 3.79e-03, grad_scale: 16.0 2023-10-02 16:06:44,197 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8275_210_4198_1_1528455320_3892571_491-17707-0_sp1.1 from training. Duration: 0.5818125 2023-10-02 16:06:47,026 WARNING [train.py:1204] (2/4) Exclude cut with ID _14348_210_8341_1_1530599434996_3778640_240-232370-0 from training. Duration: 0.84 2023-10-02 16:06:47,038 WARNING [train.py:1204] (2/4) Exclude cut with ID _45012_210_12651_1_1533608435136_5371380_54-112623-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 16:06:48,912 WARNING [train.py:1204] (2/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_328-264776-0_sp1.1 from training. Duration: 0.6090625 2023-10-02 16:06:50,338 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9325W0001-648427 from training. Duration: 0.768 2023-10-02 16:06:50,347 WARNING [train.py:1204] (2/4) Exclude cut with ID _56480_210_4220_1_1534679942079_3075409_49-74717-0_sp1.1 from training. Duration: 0.936375 2023-10-02 16:06:51,695 WARNING [train.py:1204] (2/4) Exclude cut with ID _59500_210_8254_1_1534809610680_3736009_6-241869-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 16:06:53,259 WARNING [train.py:1204] (2/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_172-215932-0_sp0.9 from training. Duration: 0.7333125 2023-10-02 16:06:54,635 WARNING [train.py:1204] (2/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_17-90541-0_sp1.1 from training. Duration: 0.8545625 2023-10-02 16:06:57,769 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_43831_210_2158_1_1533725502_5872791_525-89447-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 16:06:57,833 WARNING [train.py:1204] (2/4) Exclude cut with ID _12302_210_5219_1_1529924446466_8107280_907-182949-0 from training. Duration: 0.92 2023-10-02 16:06:57,852 WARNING [train.py:1204] (2/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_402-102311-0_sp0.9 from training. Duration: 0.7444375 2023-10-02 16:06:59,184 WARNING [train.py:1204] (2/4) Exclude cut with ID _8021_210_7994_1_1528376451358_3653069_179-45206-0_sp0.9 from training. Duration: 0.9555625 2023-10-02 16:07:02,624 WARNING [train.py:1204] (2/4) Exclude cut with ID _40137_210_12210_1_1533290484046_3795180_249-187059-0 from training. Duration: 0.88 2023-10-02 16:07:03,989 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9653W0194-675472 from training. Duration: 0.9658125 2023-10-02 16:07:06,879 WARNING [train.py:1204] (2/4) Exclude cut with ID _51332_210_9988_1_1534244371836_3801270_336-255604-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 16:07:10,161 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_20120_210_6753_1_1531643035_5482859_510-96996-0 from training. Duration: 0.87 2023-10-02 16:07:10,174 WARNING [train.py:1204] (2/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_1167-20437-0_sp1.1 from training. Duration: 0.92725 2023-10-02 16:07:10,242 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_3475_210_2028_1_1526193578_3622569_265-276625-0 from training. Duration: 0.76 2023-10-02 16:07:14,515 WARNING [train.py:1204] (2/4) Exclude cut with ID _14860_210_4915_1_1530702020340_7343240_470-172418-0_sp1.1 from training. Duration: 0.97275 2023-10-02 16:07:14,536 WARNING [train.py:1204] (2/4) Exclude cut with ID _5444_210_2151_1_1527418808464_5312210_581-315411-0_sp1.1 from training. Duration: 0.563625 2023-10-02 16:07:14,550 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_34450_210_9547_1_1533099443_4109050_519-342439-0_sp1.1 from training. Duration: 0.97275 2023-10-02 16:07:14,585 WARNING [train.py:1204] (2/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_50-175961-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 16:07:19,964 WARNING [train.py:1204] (2/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_372-6869-0 from training. Duration: 0.44 2023-10-02 16:07:19,970 WARNING [train.py:1204] (2/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_3-237434-0 from training. Duration: 0.76 2023-10-02 16:07:21,870 WARNING [train.py:1204] (2/4) Exclude cut with ID _8772_210_1850_1_1528635595972_3664011_247-336448-0_sp1.1 from training. Duration: 0.67275 2023-10-02 16:07:23,446 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.conv_module2.balancer1.prob, batch_count=938566.6666666666, ans=0.125 2023-10-02 16:07:26,076 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_427-78097-0 from training. Duration: 0.8 2023-10-02 16:07:26,091 WARNING [train.py:1204] (2/4) Exclude cut with ID _69884_210_9771_1_1535846392818_4166169_466-252727-0_sp1.1 from training. Duration: 0.97275 2023-10-02 16:07:26,161 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_15972_210_6400_1_1530877934_3405692_316-35478-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 16:07:26,191 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_809-17156-0_sp0.9 from training. Duration: 0.811125 2023-10-02 16:07:27,506 WARNING [train.py:1204] (2/4) Exclude cut with ID _57572_210_5517_1_1534921219624_3809484_594-263474-0_sp1.1 from training. Duration: 0.936375 2023-10-02 16:07:27,549 WARNING [train.py:1204] (2/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_190-228204-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 16:07:30,770 WARNING [train.py:1204] (2/4) Exclude cut with ID _19514_210_9259_1_1531530071330_5084230_117-21193-0_sp1.1 from training. Duration: 0.936375 2023-10-02 16:07:33,338 WARNING [train.py:1204] (2/4) Exclude cut with ID _69584_210_5219_1_1535785133775_4044450_190-21315-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 16:07:33,408 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_53071_210_16554_1_1534493434_1276796_184-302050-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 16:07:35,308 WARNING [train.py:1204] (2/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_120-76843-0_sp1.1 from training. Duration: 0.5 2023-10-02 16:07:35,397 WARNING [train.py:1204] (2/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_318-162017-0_sp1.1 from training. Duration: 0.82725 2023-10-02 16:07:38,410 WARNING [train.py:1204] (2/4) Exclude cut with ID _46067_210_4220_1_1534125571066_3307450_79-46864-0_sp1.1 from training. Duration: 0.92725 2023-10-02 16:07:38,462 WARNING [train.py:1204] (2/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_358-215558-0_sp1.1 from training. Duration: 0.8454375 2023-10-02 16:07:39,833 WARNING [train.py:1204] (2/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_621-66764-0_sp1.1 from training. Duration: 0.92725 2023-10-02 16:07:39,956 WARNING [train.py:1204] (2/4) Exclude cut with ID _42863_210_9070_1_1533902398788_3696920_338-156851-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 16:07:41,878 WARNING [train.py:1204] (2/4) Exclude cut with ID _5289_210_2084_1_1527678202736_3779299_392-261988-0_sp0.9 from training. Duration: 0.72225 2023-10-02 16:07:43,366 WARNING [train.py:1204] (2/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_409-84682-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 16:07:44,796 WARNING [train.py:1204] (2/4) Exclude cut with ID _9235_210_5219_1_1528891154253_8046120_30-326681-0_sp0.9 from training. Duration: 0.92225 2023-10-02 16:07:45,391 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.3.encoder.layers.3.nonlin_attention.whiten1, num_groups=1, num_channels=384, metric=5.60 vs. limit=10.0 2023-10-02 16:07:46,109 WARNING [train.py:1204] (2/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_389-184695-0_sp1.1 from training. Duration: 0.92725 2023-10-02 16:07:46,147 WARNING [train.py:1204] (2/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_442-320317-0 from training. Duration: 0.82 2023-10-02 16:07:48,922 WARNING [train.py:1204] (2/4) Exclude cut with ID _44778_210_2054_1_1533798127203_7443090_756-29911-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 16:07:50,321 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_401-112036-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 16:07:52,106 WARNING [train.py:1204] (2/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_181-308827-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 16:07:53,512 WARNING [train.py:1204] (2/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_332-258725-0_sp1.1 from training. Duration: 0.963625 2023-10-02 16:07:53,580 WARNING [train.py:1204] (2/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_188-260011-0_sp0.9 from training. Duration: 0.811125 2023-10-02 16:07:53,625 WARNING [train.py:1204] (2/4) Exclude cut with ID _49039_210_6315_1_1533967537541_3846118_99-302239-0_sp1.1 from training. Duration: 0.963625 2023-10-02 16:07:56,228 WARNING [train.py:1204] (2/4) Exclude cut with ID _4122_210_2084_1_1526713164417_4353280_312-294828-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 16:07:56,236 WARNING [train.py:1204] (2/4) Exclude cut with ID _14922_210_6228_1_1530707435273_3693310_303-228738-0 from training. Duration: 0.84 2023-10-02 16:07:59,331 INFO [train.py:1046] (2/4) Epoch 27, batch 2700, loss[loss=0.234, simple_loss=0.2996, pruned_loss=0.08419, over 19667.00 frames. ], tot_loss[loss=0.1693, simple_loss=0.2464, pruned_loss=0.04609, over 4692942.86 frames. ], batch size: 388, lr: 3.79e-03, grad_scale: 16.0 2023-10-02 16:07:59,413 WARNING [train.py:1204] (2/4) Exclude cut with ID _14349_210_8341_1_1530615710818_3442470_115-285975-0_sp0.9 from training. Duration: 0.9555625 2023-10-02 16:08:00,835 WARNING [train.py:1204] (2/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_125-144174-0_sp1.1 from training. Duration: 0.3545625 2023-10-02 16:08:03,462 WARNING [train.py:1204] (2/4) Exclude cut with ID _53565_210_10635_1_1534337218335_3820259_351-54570-0_sp1.1 from training. Duration: 0.97275 2023-10-02 16:08:03,483 WARNING [train.py:1204] (2/4) Exclude cut with ID _28317_210_12742_1_1532480302381_8510325_410-40065-0_sp1.1 from training. Duration: 0.963625 2023-10-02 16:08:03,510 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_38965_210_4852_1_1533174906_3484735_167-339442-0_sp1.1 from training. Duration: 0.963625 2023-10-02 16:08:05,440 WARNING [train.py:1204] (2/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_265-164486-0_sp1.1 from training. Duration: 0.87275 2023-10-02 16:08:05,451 WARNING [train.py:1204] (2/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_725-182484-0_sp1.1 from training. Duration: 0.92725 2023-10-02 16:08:05,481 WARNING [train.py:1204] (2/4) Exclude cut with ID _28317_210_12742_1_1532480302381_8510325_607-213951-0_sp0.9 from training. Duration: 0.9333125 2023-10-02 16:08:05,494 WARNING [train.py:1204] (2/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_605-133395-0_sp1.1 from training. Duration: 0.6 2023-10-02 16:08:05,508 WARNING [train.py:1204] (2/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_351-294650-0 from training. Duration: 0.63 2023-10-02 16:08:06,972 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_14953_210_2151_1_1530701710_5616165_710-165400-0_sp1.1 from training. Duration: 0.7909375 2023-10-02 16:08:09,680 WARNING [train.py:1204] (2/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_237-58433-0_sp1.1 from training. Duration: 0.67275 2023-10-02 16:08:11,573 WARNING [train.py:1204] (2/4) Exclude cut with ID _5190_210_4557_1_1527244368799_373439_28-197030-0_sp1.1 from training. Duration: 0.6545625 2023-10-02 16:08:11,620 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_743-327651-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 16:08:14,363 WARNING [train.py:1204] (2/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_76-203867-0_sp0.9 from training. Duration: 0.8 2023-10-02 16:08:17,006 WARNING [train.py:1204] (2/4) Exclude cut with ID _14603_210_4220_1_1530615688441_3097319_274-274798-0 from training. Duration: 0.98 2023-10-02 16:08:17,053 WARNING [train.py:1204] (2/4) Exclude cut with ID _14087_210_2253_1_1530529224512_3014029_226-242140-0_sp1.1 from training. Duration: 0.736375 2023-10-02 16:08:21,154 WARNING [train.py:1204] (2/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_352-59958-0_sp1.1 from training. Duration: 0.7818125 2023-10-02 16:08:21,164 WARNING [train.py:1204] (2/4) Exclude cut with ID _39411_210_5986_1_1533538825030_3343649_191-251819-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 16:08:21,408 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.balancer1.prob, batch_count=938833.3333333334, ans=0.125 2023-10-02 16:08:27,043 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7046_210_6753_1_1527895337_6920917_642-106799-0_sp1.1 from training. Duration: 0.836375 2023-10-02 16:08:27,050 WARNING [train.py:1204] (2/4) Exclude cut with ID _22826_210_9994_1_1531908201522_3279169_412-3442-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 16:08:27,074 WARNING [train.py:1204] (2/4) Exclude cut with ID _60057_210_10640_1_1534903065793_3333150_171-181133-0_sp0.9 from training. Duration: 0.97775 2023-10-02 16:08:27,088 WARNING [train.py:1204] (2/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_154-135696-0_sp1.1 from training. Duration: 0.72725 2023-10-02 16:08:31,080 WARNING [train.py:1204] (2/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_148-186805-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 16:08:31,236 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.1.nonlin_attention.balancer.prob, batch_count=938900.0, ans=0.125 2023-10-02 16:08:35,561 WARNING [train.py:1204] (2/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_605-165641-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 16:08:35,567 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_174-277364-0_sp0.9 from training. Duration: 0.788875 2023-10-02 16:08:35,578 WARNING [train.py:1204] (2/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_89-321955-0_sp0.9 from training. Duration: 0.888875 2023-10-02 16:08:39,109 WARNING [train.py:1204] (2/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_438-105429-0_sp1.1 from training. Duration: 0.963625 2023-10-02 16:08:40,379 WARNING [train.py:1204] (2/4) Exclude cut with ID _14970_210_15709_1_1530773543493_1715730_187-20259-0_sp0.9 from training. Duration: 0.988875 2023-10-02 16:08:47,848 WARNING [train.py:1204] (2/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_385-193052-0_sp0.9 from training. Duration: 0.9555625 2023-10-02 16:08:47,893 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7113_210_1849_1_1527942685_3448768_194-311626-0_sp1.1 from training. Duration: 0.92725 2023-10-02 16:08:48,055 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.feed_forward3.hidden_balancer.prob, batch_count=938966.6666666666, ans=0.125 2023-10-02 16:08:51,879 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_3780_210_2087_1_1526467389_2145572_176-107140-0_sp0.9 from training. Duration: 0.7666875 2023-10-02 16:08:51,881 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_36756_210_7234_1_1533026991_966276_123-213457-0_sp1.1 from training. Duration: 0.936375 2023-10-02 16:08:53,642 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.489e+02 1.852e+02 1.992e+02 2.201e+02 2.706e+02, threshold=3.984e+02, percent-clipped=0.0 2023-10-02 16:08:55,207 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.nonlin_attention.balancer.prob, batch_count=938966.6666666666, ans=0.125 2023-10-02 16:08:56,338 WARNING [train.py:1204] (2/4) Exclude cut with ID _23557_210_8750_1_1531890106370_6846255_456-244861-0_sp1.1 from training. Duration: 0.963625 2023-10-02 16:08:56,400 WARNING [train.py:1204] (2/4) Exclude cut with ID _37086_210_9038_1_1532999540891_7021250_242-349170-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 16:08:57,861 WARNING [train.py:1204] (2/4) Exclude cut with ID _41298_210_9019_1_1533430371450_4822143_471-281351-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 16:08:57,954 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_53378_210_17282_1_1534221071_5769509_572-126176-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 16:08:59,316 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_1507-263032-0_sp1.1 from training. Duration: 0.963625 2023-10-02 16:08:59,459 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.conv_module1.balancer2.min_abs, batch_count=939033.3333333334, ans=0.5 2023-10-02 16:09:00,616 WARNING [train.py:1204] (2/4) Exclude cut with ID _13679_210_7497_1_1530534293662_3355098_411-186088-0_sp1.1 from training. Duration: 0.8545625 2023-10-02 16:09:02,187 WARNING [train.py:1204] (2/4) Exclude cut with ID _29345_210_4381_1_1532689168263_3860169_149-257268-0_sp1.1 from training. Duration: 0.736375 2023-10-02 16:09:03,527 WARNING [train.py:1204] (2/4) Exclude cut with ID _47790_210_16187_1_1533862814066_4207519_381-252014-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 16:09:03,536 WARNING [train.py:1204] (2/4) Exclude cut with ID _35799_210_5854_1_1533263487887_3529071_512-2717-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 16:09:03,765 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.feed_forward1.out_proj.dropout_p, batch_count=939033.3333333334, ans=0.1 2023-10-02 16:09:07,326 WARNING [train.py:1204] (2/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_505-205270-0_sp0.9 from training. Duration: 0.42225 2023-10-02 16:09:08,665 WARNING [train.py:1204] (2/4) Exclude cut with ID _14998_210_5219_1_1530705582080_7474970_731-20117-0_sp1.1 from training. Duration: 0.936375 2023-10-02 16:09:11,930 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_37351_210_7925_1_1533012931_3812113_265-85780-0_sp1.1 from training. Duration: 0.8 2023-10-02 16:09:11,936 WARNING [train.py:1204] (2/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_313-134930-0 from training. Duration: 0.97 2023-10-02 16:09:12,512 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.3.encoder.layers.3.feed_forward3.out_whiten, num_groups=1, num_channels=512, metric=13.09 vs. limit=15.0 2023-10-02 16:09:13,016 INFO [train.py:1046] (2/4) Epoch 27, batch 2750, loss[loss=0.162, simple_loss=0.2205, pruned_loss=0.0517, over 22716.00 frames. ], tot_loss[loss=0.1682, simple_loss=0.2452, pruned_loss=0.04562, over 4698690.72 frames. ], batch size: 322, lr: 3.79e-03, grad_scale: 16.0 2023-10-02 16:09:13,368 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.balancer1.prob, batch_count=939100.0, ans=0.125 2023-10-02 16:09:14,397 WARNING [train.py:1204] (2/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_374-138971-0 from training. Duration: 0.94 2023-10-02 16:09:14,439 WARNING [train.py:1204] (2/4) Exclude cut with ID _30304_210_6319_1_1532476827261_7582060_1227-287886-0_sp1.1 from training. Duration: 0.936375 2023-10-02 16:09:14,719 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.conv_skip_rate, batch_count=939100.0, ans=0.0 2023-10-02 16:09:17,235 WARNING [train.py:1204] (2/4) Exclude cut with ID _59493_210_17282_1_1535078419077_3225879_332-217544-0_sp1.1 from training. Duration: 0.97275 2023-10-02 16:09:18,535 WARNING [train.py:1204] (2/4) Exclude cut with ID _23514_210_1389_1_1531832139785_3031439_361-119640-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 16:09:21,266 WARNING [train.py:1204] (2/4) Exclude cut with ID _69584_210_5219_1_1535785133775_4044450_142-118058-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 16:09:21,288 WARNING [train.py:1204] (2/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_178-177582-0_sp1.1 from training. Duration: 0.67275 2023-10-02 16:09:21,330 WARNING [train.py:1204] (2/4) Exclude cut with ID _25303_210_17519_1_1532050207576_3635970_41-195572-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 16:09:25,814 WARNING [train.py:1204] (2/4) Exclude cut with ID _55328_210_27767_1_1534580973260_4088170_329-341322-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 16:09:25,855 WARNING [train.py:1204] (2/4) Exclude cut with ID _5608_210_2106_1_1527415439089_6819768_909-82767-0_sp1.1 from training. Duration: 0.5545625 2023-10-02 16:09:27,132 WARNING [train.py:1204] (2/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_261-297503-0_sp1.1 from training. Duration: 0.9 2023-10-02 16:09:27,139 WARNING [train.py:1204] (2/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_459-291706-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 16:09:27,139 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7762_210_1794_1_1528199508_4057408_422-227034-0 from training. Duration: 0.73 2023-10-02 16:09:27,148 WARNING [train.py:1204] (2/4) Exclude cut with ID _3067_210_2667_1_1526212974640_4503168_250-285937-0_sp0.9 from training. Duration: 0.888875 2023-10-02 16:09:27,161 WARNING [train.py:1204] (2/4) Exclude cut with ID _66873_210_5517_1_1535454136506_4293370_332-253745-0_sp1.1 from training. Duration: 0.936375 2023-10-02 16:09:32,679 WARNING [train.py:1204] (2/4) Exclude cut with ID _13938_210_9988_1_1530518718978_3693960_304-112784-0 from training. Duration: 0.46 2023-10-02 16:09:34,130 WARNING [train.py:1204] (2/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_226-267957-0_sp1.1 from training. Duration: 0.92725 2023-10-02 16:09:34,166 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_41822_210_13270_1_1533955976_4379284_608-254567-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 16:09:34,219 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_124-113562-0_sp1.1 from training. Duration: 0.8545625 2023-10-02 16:09:35,549 WARNING [train.py:1204] (2/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_117-290916-0_sp1.1 from training. Duration: 0.463625 2023-10-02 16:09:35,615 WARNING [train.py:1204] (2/4) Exclude cut with ID _28427_210_4220_1_1532581240967_3061069_102-66533-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 16:09:36,893 WARNING [train.py:1204] (2/4) Exclude cut with ID _55383_210_10635_1_1534468426172_2231669_134-128246-0_sp1.1 from training. Duration: 0.8090625 2023-10-02 16:09:36,957 WARNING [train.py:1204] (2/4) Exclude cut with ID _45871_210_5665_1_1533864614245_3296060_49-308316-0_sp1.1 from training. Duration: 0.97275 2023-10-02 16:09:37,007 WARNING [train.py:1204] (2/4) Exclude cut with ID _57572_210_5517_1_1534921219624_3809484_176-199205-0_sp1.1 from training. Duration: 0.97275 2023-10-02 16:09:42,214 WARNING [train.py:1204] (2/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_45-265684-0_sp1.1 from training. Duration: 0.6545625 2023-10-02 16:09:42,241 WARNING [train.py:1204] (2/4) Exclude cut with ID _15150_210_8341_1_1530752411803_7256384_469-239509-0_sp0.9 from training. Duration: 0.7555625 2023-10-02 16:09:44,260 WARNING [train.py:1204] (2/4) Exclude cut with ID _28461_210_2253_1_1532771775189_3041299_136-270644-0_sp1.1 from training. Duration: 0.8454375 2023-10-02 16:09:44,489 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.feed_forward1.out_proj.dropout_p, batch_count=939233.3333333334, ans=0.1 2023-10-02 16:09:45,370 WARNING [train.py:1204] (2/4) Exclude cut with ID _69584_210_5219_1_1535785133775_4044450_302-47302-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 16:09:45,468 WARNING [train.py:1204] (2/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_166-128941-0_sp1.1 from training. Duration: 0.4909375 2023-10-02 16:09:49,645 WARNING [train.py:1204] (2/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_559-120504-0_sp1.1 from training. Duration: 0.97275 2023-10-02 16:09:52,504 WARNING [train.py:1204] (2/4) Exclude cut with ID _6456_210_1534_1_1527681634212_3181049_345-252877-0_sp1.1 from training. Duration: 0.5818125 2023-10-02 16:09:52,542 WARNING [train.py:1204] (2/4) Exclude cut with ID _28317_210_12742_1_1532480302381_8510325_902-233003-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 16:09:58,401 WARNING [train.py:1204] (2/4) Exclude cut with ID _36538_210_9994_1_1533204020363_3795620_204-261385-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 16:09:58,405 WARNING [train.py:1204] (2/4) Exclude cut with ID _14821_210_8750_1_1530680503991_6829359_189-297451-0_sp1.1 from training. Duration: 0.5 2023-10-02 16:09:58,444 WARNING [train.py:1204] (2/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_1211-226066-0_sp0.9 from training. Duration: 0.8444375 2023-10-02 16:10:04,452 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6786_210_1388_1_1527839699_4194495_273-149841-0_sp1.1 from training. Duration: 0.836375 2023-10-02 16:10:05,707 WARNING [train.py:1204] (2/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_380-162648-0_sp1.1 from training. Duration: 0.8545625 2023-10-02 16:10:05,708 WARNING [train.py:1204] (2/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_494-199538-0 from training. Duration: 0.75 2023-10-02 16:10:08,744 WARNING [train.py:1204] (2/4) Exclude cut with ID _49097_210_16049_1_1533952780207_4360979_276-87053-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 16:10:12,065 WARNING [train.py:1204] (2/4) Exclude cut with ID _5444_210_2151_1_1527418808464_5312210_726-27165-0 from training. Duration: 0.67 2023-10-02 16:10:16,672 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_128-202104-0_sp1.1 from training. Duration: 0.57275 2023-10-02 16:10:19,398 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_11349_210_6753_1_1529800861_1101207_26-65602-0_sp1.1 from training. Duration: 0.87275 2023-10-02 16:10:19,422 WARNING [train.py:1204] (2/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_159-328157-0 from training. Duration: 0.52 2023-10-02 16:10:20,844 WARNING [train.py:1204] (2/4) Exclude cut with ID _14069_210_5632_1_1530512692033_4803390_98-21934-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 16:10:22,265 WARNING [train.py:1204] (2/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_352-59958-0_sp0.9 from training. Duration: 0.9555625 2023-10-02 16:10:22,500 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.conv_skip_rate, batch_count=939366.6666666666, ans=0.0 2023-10-02 16:10:23,536 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6163_210_6400_1_1527506746_4263068_283-137811-0 from training. Duration: 0.77 2023-10-02 16:10:23,564 WARNING [train.py:1204] (2/4) Exclude cut with ID _21989_210_5854_1_1531899214931_3271360_133-44759-0_sp1.1 from training. Duration: 0.7 2023-10-02 16:10:25,087 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.0.conv_module2.balancer2.prob, batch_count=939433.3333333334, ans=0.125 2023-10-02 16:10:26,191 INFO [train.py:1046] (2/4) Epoch 27, batch 2800, loss[loss=0.1552, simple_loss=0.2336, pruned_loss=0.03836, over 23278.00 frames. ], tot_loss[loss=0.1666, simple_loss=0.2437, pruned_loss=0.04476, over 4703030.97 frames. ], batch size: 105, lr: 3.79e-03, grad_scale: 16.0 2023-10-02 16:10:26,338 WARNING [train.py:1204] (2/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_21-174514-0_sp0.9 from training. Duration: 0.47775 2023-10-02 16:10:26,369 WARNING [train.py:1204] (2/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_201-78811-0_sp1.1 from training. Duration: 0.963625 2023-10-02 16:10:28,264 WARNING [train.py:1204] (2/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_537-164630-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 16:10:29,567 WARNING [train.py:1204] (2/4) Exclude cut with ID _14087_210_2253_1_1530529224512_3014029_156-257607-0 from training. Duration: 0.83 2023-10-02 16:10:29,579 WARNING [train.py:1204] (2/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_308-154450-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 16:10:29,620 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_45399_210_4852_1_1533624725_7579737_214-240574-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 16:10:30,997 WARNING [train.py:1204] (2/4) Exclude cut with ID _29303_210_2575_1_1532392237303_7410649_615-305517-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 16:10:31,034 WARNING [train.py:1204] (2/4) Exclude cut with ID IC0125W0002-61808 from training. Duration: 0.708 2023-10-02 16:10:31,035 WARNING [train.py:1204] (2/4) Exclude cut with ID IC0125W0017-61823 from training. Duration: 0.759 2023-10-02 16:10:33,367 INFO [scaling.py:1022] (2/4) Whitening: name=encoder_embed.out_whiten, num_groups=1, num_channels=192, metric=6.89 vs. limit=8.0 2023-10-02 16:10:35,616 WARNING [train.py:1204] (2/4) Exclude cut with ID _53219_210_4525_1_1534834836008_4064825_4-145291-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 16:10:35,738 WARNING [train.py:1204] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_185-174805-0_sp1.1 from training. Duration: 0.6181875 2023-10-02 16:10:35,757 WARNING [train.py:1204] (2/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_635-193046-0_sp1.1 from training. Duration: 0.92725 2023-10-02 16:10:38,528 WARNING [train.py:1204] (2/4) Exclude cut with ID _46067_210_4220_1_1534125571066_3307450_321-258866-0_sp1.1 from training. Duration: 0.97275 2023-10-02 16:10:38,714 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.bypass.scale_min, batch_count=939433.3333333334, ans=0.2 2023-10-02 16:10:40,067 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8558_210_1410_1_1528505225_7613406_131-329524-0 from training. Duration: 0.79 2023-10-02 16:10:43,079 WARNING [train.py:1204] (2/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_847-41334-0_sp0.9 from training. Duration: 0.488875 2023-10-02 16:10:43,898 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.conv_skip_rate, batch_count=939500.0, ans=0.0 2023-10-02 16:10:44,806 WARNING [train.py:1204] (2/4) Exclude cut with ID _14227_210_4381_1_1530701804286_3940259_125-275642-0 from training. Duration: 0.99 2023-10-02 16:10:46,210 WARNING [train.py:1204] (2/4) Exclude cut with ID _25248_210_8750_1_1532062825941_5554180_248-177255-0_sp1.1 from training. Duration: 0.963625 2023-10-02 16:10:46,391 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.conv_module2.balancer2.prob, batch_count=939500.0, ans=0.125 2023-10-02 16:10:47,499 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_40047_210_2290_1_1533290519_2374311_195-165693-0_sp1.1 from training. Duration: 0.8090625 2023-10-02 16:10:47,510 WARNING [train.py:1204] (2/4) Exclude cut with ID _23109_210_4381_1_1532150828777_901949_29-258672-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 16:10:50,280 WARNING [train.py:1204] (2/4) Exclude cut with ID _14821_210_8750_1_1530680503991_6829359_2-227151-0_sp1.1 from training. Duration: 0.7545625 2023-10-02 16:10:51,620 WARNING [train.py:1204] (2/4) Exclude cut with ID _49643_210_7989_1_1534640244224_3939031_565-53982-0_sp1.1 from training. Duration: 0.963625 2023-10-02 16:10:51,624 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7544_210_6753_1_1528591327_1471426_33-346031-0_sp0.9 from training. Duration: 0.67775 2023-10-02 16:10:51,704 WARNING [train.py:1204] (2/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_95-61782-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 16:10:57,855 WARNING [train.py:1204] (2/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_686-54383-0_sp0.9 from training. Duration: 0.9666875 2023-10-02 16:10:59,231 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_14056_210_4163_1_1530604483_7572827_505-276823-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 16:11:02,132 WARNING [train.py:1204] (2/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_745-128443-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 16:11:02,202 WARNING [train.py:1204] (2/4) Exclude cut with ID _3844_210_1390_1_1526727803582_2871209_20-251664-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 16:11:03,507 WARNING [train.py:1204] (2/4) Exclude cut with ID _36538_210_9994_1_1533204020363_3795620_487-164628-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 16:11:08,111 WARNING [train.py:1204] (2/4) Exclude cut with ID _5166_210_2084_1_1527073197160_4087993_309-222545-0_sp1.1 from training. Duration: 0.763625 2023-10-02 16:11:08,131 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_3531_210_3528_1_1526294203_5216456_309-227302-0 from training. Duration: 0.69 2023-10-02 16:11:09,889 WARNING [train.py:1204] (2/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_42-192460-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 16:11:09,962 WARNING [train.py:1204] (2/4) Exclude cut with ID _22083_210_8254_1_1531702862439_3678970_49-234546-0_sp1.1 from training. Duration: 0.7545625 2023-10-02 16:11:09,965 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_427-78097-0_sp0.9 from training. Duration: 0.888875 2023-10-02 16:11:16,023 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_43802_210_2290_1_1533516203_8829504_378-205254-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 16:11:16,067 WARNING [train.py:1204] (2/4) Exclude cut with ID _45329_210_7005_1_1534157951211_6774339_672-314830-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 16:11:18,754 WARNING [train.py:1204] (2/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_90-30673-0_sp1.1 from training. Duration: 0.763625 2023-10-02 16:11:21,492 WARNING [train.py:1204] (2/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_563-329943-0_sp1.1 from training. Duration: 0.9 2023-10-02 16:11:21,506 WARNING [train.py:1204] (2/4) Exclude cut with ID _21632_210_2253_1_1531994001765_3744140_154-84090-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 16:11:21,507 WARNING [train.py:1204] (2/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_103-336064-0_sp0.9 from training. Duration: 0.5666875 2023-10-02 16:11:22,775 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.568e+02 1.891e+02 2.118e+02 2.532e+02 5.316e+02, threshold=4.237e+02, percent-clipped=2.0 2023-10-02 16:11:22,868 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_43-71989-0_sp0.9 from training. Duration: 0.6333125 2023-10-02 16:11:24,214 WARNING [train.py:1204] (2/4) Exclude cut with ID _3867_210_1883_1_1526641200575_4475830_319-349850-0_sp0.9 from training. Duration: 0.7444375 2023-10-02 16:11:24,285 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder_embed.convnext.hidden_balancer.prob, batch_count=939700.0, ans=0.125 2023-10-02 16:11:25,581 WARNING [train.py:1204] (2/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_629-179454-0_sp1.1 from training. Duration: 0.963625 2023-10-02 16:11:25,590 WARNING [train.py:1204] (2/4) Exclude cut with ID _65263_210_22356_1_1535763598691_3921037_49-222917-0 from training. Duration: 0.92 2023-10-02 16:11:25,618 WARNING [train.py:1204] (2/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_602-331772-0_sp1.1 from training. Duration: 0.936375 2023-10-02 16:11:27,762 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=939700.0, ans=0.1 2023-10-02 16:11:28,818 WARNING [train.py:1204] (2/4) Exclude cut with ID _58414_210_8254_1_1535421609162_7151182_292-134978-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 16:11:28,864 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_38793_210_2544_1_1533176114_3849320_200-299292-0_sp1.1 from training. Duration: 0.936375 2023-10-02 16:11:31,521 WARNING [train.py:1204] (2/4) Exclude cut with ID _14970_210_15709_1_1530775547361_1966965_6-23328-0 from training. Duration: 0.86 2023-10-02 16:11:32,843 WARNING [train.py:1204] (2/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_386-225511-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 16:11:32,861 WARNING [train.py:1204] (2/4) Exclude cut with ID _38323_210_12108_1_1533121233369_3303049_416-11919-0_sp1.1 from training. Duration: 0.87275 2023-10-02 16:11:32,910 WARNING [train.py:1204] (2/4) Exclude cut with ID _4866_210_2577_1_1527404412284_4364659_256-306830-0_sp1.1 from training. Duration: 0.7909375 2023-10-02 16:11:35,599 WARNING [train.py:1204] (2/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_53-300544-0 from training. Duration: 0.67 2023-10-02 16:11:37,244 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.0.conv_module2.balancer1.max_abs, batch_count=939700.0, ans=10.0 2023-10-02 16:11:40,357 INFO [train.py:1046] (2/4) Epoch 27, batch 2850, loss[loss=0.1717, simple_loss=0.2632, pruned_loss=0.04006, over 24585.00 frames. ], tot_loss[loss=0.1655, simple_loss=0.2427, pruned_loss=0.04419, over 4702475.62 frames. ], batch size: 71, lr: 3.79e-03, grad_scale: 16.0 2023-10-02 16:11:41,761 WARNING [train.py:1204] (2/4) Exclude cut with ID _41314_210_12651_1_1533459476543_3766460_80-74184-0_sp1.1 from training. Duration: 0.7545625 2023-10-02 16:11:41,775 WARNING [train.py:1204] (2/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_332-256168-0_sp1.1 from training. Duration: 0.6818125 2023-10-02 16:11:41,827 WARNING [train.py:1204] (2/4) Exclude cut with ID _48836_210_12210_1_1534075848293_3904150_429-35475-0_sp1.1 from training. Duration: 0.8090625 2023-10-02 16:11:44,489 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_144-135833-0_sp1.1 from training. Duration: 0.97275 2023-10-02 16:11:47,000 WARNING [train.py:1204] (2/4) Exclude cut with ID _14224_210_4381_1_1530529062037_4264010_380-343670-0_sp0.9 from training. Duration: 0.97775 2023-10-02 16:11:48,345 WARNING [train.py:1204] (2/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_213-265174-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 16:11:48,368 WARNING [train.py:1204] (2/4) Exclude cut with ID _20455_210_5219_1_1531565996775_8098100_86-314283-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 16:11:49,869 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_41750_210_1960_1_1533520678_3948042_68-223813-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 16:11:51,238 WARNING [train.py:1204] (2/4) Exclude cut with ID _55153_210_2253_1_1534384786677_3162096_24-198429-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 16:11:52,624 WARNING [train.py:1204] (2/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_119-207162-0_sp0.9 from training. Duration: 0.788875 2023-10-02 16:11:52,787 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.conv_module1.balancer2.min_positive, batch_count=939766.6666666666, ans=0.05 2023-10-02 16:11:53,952 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_12868_210_6753_1_1530702479_3610564_148-172861-0 from training. Duration: 0.5 2023-10-02 16:11:56,783 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.self_attn_weights.pos_emb_skip_rate, batch_count=939833.3333333334, ans=0.0 2023-10-02 16:11:59,851 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_5522_210_3273_1_1527249606_3909096_318-203153-0 from training. Duration: 0.43 2023-10-02 16:11:59,865 WARNING [train.py:1204] (2/4) Exclude cut with ID _14884_210_2252_1_1530707459689_5561250_288-189949-0_sp1.1 from training. Duration: 0.97275 2023-10-02 16:12:01,272 WARNING [train.py:1204] (2/4) Exclude cut with ID _5294_210_1747_1_1527249643928_3696179_529-242298-0 from training. Duration: 0.95 2023-10-02 16:12:01,338 WARNING [train.py:1204] (2/4) Exclude cut with ID _36538_210_9994_1_1533204020363_3795620_503-110289-0_sp1.1 from training. Duration: 0.936375 2023-10-02 16:12:04,158 WARNING [train.py:1204] (2/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_385-193052-0 from training. Duration: 0.86 2023-10-02 16:12:05,465 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_456-22141-0 from training. Duration: 0.88 2023-10-02 16:12:06,851 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_38965_210_4852_1_1533174906_3484735_284-117840-0_sp1.1 from training. Duration: 0.936375 2023-10-02 16:12:11,626 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.bypass_mid.scale_min, batch_count=939900.0, ans=0.2 2023-10-02 16:12:17,842 WARNING [train.py:1204] (2/4) Exclude cut with ID _32704_210_8954_1_1533017488840_6747370_75-201625-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 16:12:19,254 WARNING [train.py:1204] (2/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_255-312654-0_sp1.1 from training. Duration: 0.82725 2023-10-02 16:12:19,272 WARNING [train.py:1204] (2/4) Exclude cut with ID _47251_210_18628_1_1533814063128_4137250_214-88076-0_sp0.9 from training. Duration: 0.97775 2023-10-02 16:12:20,669 WARNING [train.py:1204] (2/4) Exclude cut with ID _4092_210_2151_1_1526643044449_3135759_227-213972-0_sp1.1 from training. Duration: 0.4909375 2023-10-02 16:12:20,684 WARNING [train.py:1204] (2/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_912-47473-0_sp1.1 from training. Duration: 0.6181875 2023-10-02 16:12:20,703 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6767_210_3273_1_1527855783_1718932_173-256601-0_sp1.1 from training. Duration: 0.72725 2023-10-02 16:12:23,463 WARNING [train.py:1204] (2/4) Exclude cut with ID _6274_210_2084_1_1527937583837_7494020_32-205288-0_sp1.1 from training. Duration: 0.7090625 2023-10-02 16:12:23,487 WARNING [train.py:1204] (2/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_39-248870-0 from training. Duration: 0.69 2023-10-02 16:12:24,892 WARNING [train.py:1204] (2/4) Exclude cut with ID _3541_210_2477_1_1526475408709_3263973_377-296839-0_sp1.1 from training. Duration: 0.5 2023-10-02 16:12:24,913 WARNING [train.py:1204] (2/4) Exclude cut with ID _13679_210_7497_1_1530534293662_3355098_413-56154-0_sp1.1 from training. Duration: 0.963625 2023-10-02 16:12:25,085 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=939966.6666666666, ans=0.1 2023-10-02 16:12:26,125 WARNING [train.py:1204] (2/4) Exclude cut with ID _14970_210_15709_1_1530775547361_1966965_201-84544-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 16:12:26,184 WARNING [train.py:1204] (2/4) Exclude cut with ID _21783_210_11357_1_1531742458730_3762679_105-95775-0_sp1.1 from training. Duration: 0.936375 2023-10-02 16:12:29,595 WARNING [train.py:1204] (2/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_131-191669-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 16:12:29,634 WARNING [train.py:1204] (2/4) Exclude cut with ID _14860_210_4915_1_1530702020340_7343240_695-4323-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 16:12:30,964 WARNING [train.py:1204] (2/4) Exclude cut with ID _39867_210_9826_1_1533553222030_9344259_991-130565-0_sp1.1 from training. Duration: 0.97275 2023-10-02 16:12:32,560 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_50-3289-0_sp1.1 from training. Duration: 0.82725 2023-10-02 16:12:35,284 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_54-347670-0_sp1.1 from training. Duration: 0.92725 2023-10-02 16:12:35,347 WARNING [train.py:1204] (2/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_200-153445-0_sp1.1 from training. Duration: 0.936375 2023-10-02 16:12:36,650 WARNING [train.py:1204] (2/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_279-302911-0_sp1.1 from training. Duration: 0.97275 2023-10-02 16:12:38,058 WARNING [train.py:1204] (2/4) Exclude cut with ID _14886_210_4220_1_1530702020355_3516190_159-73748-0_sp0.9 from training. Duration: 0.92225 2023-10-02 16:12:43,019 WARNING [train.py:1204] (2/4) Exclude cut with ID _53379_210_20034_1_1534222027363_3854654_23-222715-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 16:12:43,881 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.1.encoder.layers.0.self_attn2.whiten, num_groups=1, num_channels=256, metric=12.47 vs. limit=22.5 2023-10-02 16:12:44,301 WARNING [train.py:1204] (2/4) Exclude cut with ID _12555_210_8750_1_1530075594402_3780239_449-267986-0 from training. Duration: 0.83 2023-10-02 16:12:44,318 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7544_210_6753_1_1528587722_3576686_295-258479-0 from training. Duration: 0.84 2023-10-02 16:12:47,716 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7002_210_1794_1_1527935634_4034444_487-236501-0_sp0.9 from training. Duration: 0.7333125 2023-10-02 16:12:47,759 WARNING [train.py:1204] (2/4) Exclude cut with ID _21783_210_11357_1_1531742458730_3762679_244-19107-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 16:12:47,779 WARNING [train.py:1204] (2/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_390-339137-0 from training. Duration: 0.56 2023-10-02 16:12:49,119 WARNING [train.py:1204] (2/4) Exclude cut with ID _7562_210_2084_1_1528887767916_3946229_186-326422-0_sp1.1 from training. Duration: 0.763625 2023-10-02 16:12:49,174 WARNING [train.py:1204] (2/4) Exclude cut with ID _33640_210_2655_1_1533117605479_4855469_645-145235-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 16:12:50,431 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_25767_210_2161_1_1532307483_3867748_506-171777-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 16:12:50,465 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_5438_210_1794_1_1527336687_2600009_233-58072-0_sp1.1 from training. Duration: 0.7 2023-10-02 16:12:50,466 WARNING [train.py:1204] (2/4) Exclude cut with ID IC0669W0063-330908 from training. Duration: 0.896 2023-10-02 16:12:50,505 WARNING [train.py:1204] (2/4) Exclude cut with ID IC0671W0070-331913 from training. Duration: 0.896 2023-10-02 16:12:50,509 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_258-310570-0_sp1.1 from training. Duration: 0.7454375 2023-10-02 16:12:50,717 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.feed_forward1.hidden_balancer.prob, batch_count=940033.3333333334, ans=0.125 2023-10-02 16:12:51,864 WARNING [train.py:1204] (2/4) Exclude cut with ID _9461_210_4557_1_1529060514726_3677451_162-177507-0_sp1.1 from training. Duration: 0.97275 2023-10-02 16:12:54,586 INFO [train.py:1046] (2/4) Epoch 27, batch 2900, loss[loss=0.1765, simple_loss=0.256, pruned_loss=0.04849, over 23329.00 frames. ], tot_loss[loss=0.1655, simple_loss=0.2425, pruned_loss=0.04418, over 4714192.75 frames. ], batch size: 105, lr: 3.79e-03, grad_scale: 8.0 2023-10-02 16:12:56,133 WARNING [train.py:1204] (2/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_26-175553-0_sp0.9 from training. Duration: 0.67775 2023-10-02 16:12:56,148 WARNING [train.py:1204] (2/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_7-150481-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 16:12:56,197 WARNING [train.py:1204] (2/4) Exclude cut with ID _23557_210_8750_1_1531890106370_6846255_72-273262-0_sp1.1 from training. Duration: 0.963625 2023-10-02 16:12:59,010 WARNING [train.py:1204] (2/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_367-61330-0 from training. Duration: 0.75 2023-10-02 16:13:03,508 WARNING [train.py:1204] (2/4) Exclude cut with ID _39867_210_9826_1_1533553222030_9344259_1337-11141-0_sp1.1 from training. Duration: 0.936375 2023-10-02 16:13:03,527 WARNING [train.py:1204] (2/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_101-293367-0 from training. Duration: 0.63 2023-10-02 16:13:04,863 WARNING [train.py:1204] (2/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_10-100367-0 from training. Duration: 0.98 2023-10-02 16:13:06,221 WARNING [train.py:1204] (2/4) Exclude cut with ID _60057_210_10640_1_1534903065793_3333150_171-181133-0_sp1.1 from training. Duration: 0.8 2023-10-02 16:13:06,223 WARNING [train.py:1204] (2/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_844-96989-0_sp0.9 from training. Duration: 0.788875 2023-10-02 16:13:08,892 WARNING [train.py:1204] (2/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_301-144595-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 16:13:10,251 WARNING [train.py:1204] (2/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_115-147343-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 16:13:11,853 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.bypass.skip_rate, batch_count=940166.6666666666, ans=0.04949747468305833 2023-10-02 16:13:15,631 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_3171_210_2087_1_1526109084_2330007_37-64334-0_sp1.1 from training. Duration: 0.7454375 2023-10-02 16:13:15,664 WARNING [train.py:1204] (2/4) Exclude cut with ID _19514_210_9259_1_1531530071330_5084230_769-272914-0_sp1.1 from training. Duration: 0.936375 2023-10-02 16:13:17,074 WARNING [train.py:1204] (2/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_76-311963-0_sp0.9 from training. Duration: 0.77775 2023-10-02 16:13:18,305 WARNING [train.py:1204] (2/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_526-62079-0 from training. Duration: 0.67 2023-10-02 16:13:18,353 WARNING [train.py:1204] (2/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_7-111954-0_sp0.9 from training. Duration: 0.62225 2023-10-02 16:13:19,775 WARNING [train.py:1204] (2/4) Exclude cut with ID _57997_210_4163_1_1534766425889_3961049_360-279140-0_sp1.1 from training. Duration: 0.97275 2023-10-02 16:13:21,642 WARNING [train.py:1204] (2/4) Exclude cut with ID _4866_210_2577_1_1527404412284_4364659_133-1696-0 from training. Duration: 0.99 2023-10-02 16:13:22,977 WARNING [train.py:1204] (2/4) Exclude cut with ID _5190_210_4557_1_1527244368799_373439_28-197030-0 from training. Duration: 0.72 2023-10-02 16:13:25,741 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_53-39003-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 16:13:25,744 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6673_210_3273_1_1527767790_3173240_351-167954-0 from training. Duration: 0.48 2023-10-02 16:13:25,758 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_2822_210_2715_1_1526112586_1999587_165-36656-0_sp1.1 from training. Duration: 0.8090625 2023-10-02 16:13:27,223 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_328-264892-0_sp1.1 from training. Duration: 0.92725 2023-10-02 16:13:27,228 WARNING [train.py:1204] (2/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_29-293896-0_sp0.9 from training. Duration: 0.67775 2023-10-02 16:13:31,358 WARNING [train.py:1204] (2/4) Exclude cut with ID _65263_210_22356_1_1535763598691_3921037_465-55448-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 16:13:31,424 WARNING [train.py:1204] (2/4) Exclude cut with ID _21989_210_5854_1_1531899214931_3271360_311-53106-0_sp1.1 from training. Duration: 0.97275 2023-10-02 16:13:36,128 WARNING [train.py:1204] (2/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_389-277915-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 16:13:38,982 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_69630_210_7234_1_1535791094_677837_63-277139-0_sp1.1 from training. Duration: 0.963625 2023-10-02 16:13:43,113 WARNING [train.py:1204] (2/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_85-138257-0 from training. Duration: 0.71 2023-10-02 16:13:43,128 WARNING [train.py:1204] (2/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_686-247600-0 from training. Duration: 0.92 2023-10-02 16:13:43,133 WARNING [train.py:1204] (2/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_108-255430-0_sp1.1 from training. Duration: 0.8818125 2023-10-02 16:13:45,309 WARNING [train.py:1204] (2/4) Exclude cut with ID _14884_210_2252_1_1530707459689_5561250_114-201399-0_sp1.1 from training. Duration: 0.6909375 2023-10-02 16:13:48,604 WARNING [train.py:1204] (2/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_506-42614-0 from training. Duration: 0.79 2023-10-02 16:13:48,804 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.bypass_mid.scale_min, batch_count=940300.0, ans=0.2 2023-10-02 16:13:49,964 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_43802_210_2290_1_1533516203_8829504_85-195240-0_sp1.1 from training. Duration: 0.7909375 2023-10-02 16:13:53,109 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.522e+02 1.791e+02 2.001e+02 2.222e+02 3.379e+02, threshold=4.002e+02, percent-clipped=0.0 2023-10-02 16:13:54,558 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_141-36243-0_sp1.1 from training. Duration: 0.97275 2023-10-02 16:14:02,642 WARNING [train.py:1204] (2/4) Exclude cut with ID _7852_210_5219_1_1528286471316_8139400_358-287492-0_sp1.1 from training. Duration: 0.87275 2023-10-02 16:14:02,668 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_805-71497-0_sp0.9 from training. Duration: 0.788875 2023-10-02 16:14:02,752 WARNING [train.py:1204] (2/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_178-39722-0 from training. Duration: 0.49 2023-10-02 16:14:06,034 WARNING [train.py:1204] (2/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_428-181925-0_sp1.1 from training. Duration: 0.963625 2023-10-02 16:14:06,038 WARNING [train.py:1204] (2/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_770-200430-0 from training. Duration: 0.89 2023-10-02 16:14:07,320 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_1104-271009-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 16:14:07,373 WARNING [train.py:1204] (2/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_128-296581-0_sp0.9 from training. Duration: 0.82225 2023-10-02 16:14:08,792 INFO [train.py:1046] (2/4) Epoch 27, batch 2950, loss[loss=0.1687, simple_loss=0.2446, pruned_loss=0.0464, over 23710.00 frames. ], tot_loss[loss=0.1661, simple_loss=0.2436, pruned_loss=0.04433, over 4723380.94 frames. ], batch size: 212, lr: 3.79e-03, grad_scale: 8.0 2023-10-02 16:14:10,373 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.feed_forward3.hidden_balancer.prob, batch_count=940433.3333333334, ans=0.125 2023-10-02 16:14:13,378 WARNING [train.py:1204] (2/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_19-62511-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 16:14:16,238 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_805-71497-0 from training. Duration: 0.71 2023-10-02 16:14:17,548 WARNING [train.py:1204] (2/4) Exclude cut with ID _44778_210_2054_1_1533798127203_7443090_573-152077-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 16:14:17,553 WARNING [train.py:1204] (2/4) Exclude cut with ID _42875_210_14856_1_1533704397487_4012433_393-177816-0_sp1.1 from training. Duration: 0.963625 2023-10-02 16:14:17,659 WARNING [train.py:1204] (2/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_278-35159-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 16:14:19,671 WARNING [train.py:1204] (2/4) Exclude cut with ID _15037_210_2253_1_1530752294104_3267650_13-130361-0_sp1.1 from training. Duration: 0.9 2023-10-02 16:14:21,061 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_3019_210_2028_1_1526044245_3264865_127-135615-0 from training. Duration: 0.94 2023-10-02 16:14:21,104 WARNING [train.py:1204] (2/4) Exclude cut with ID _28317_210_12742_1_1532480302381_8510325_607-213951-0 from training. Duration: 0.84 2023-10-02 16:14:22,466 WARNING [train.py:1204] (2/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_436-318113-0_sp0.9 from training. Duration: 0.8333125 2023-10-02 16:14:22,467 WARNING [train.py:1204] (2/4) Exclude cut with ID _13938_210_9988_1_1530518718978_3693960_148-152225-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 16:14:28,596 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_2541_210_1794_1_1525590269_3084765_156-173713-0_sp1.1 from training. Duration: 0.736375 2023-10-02 16:14:29,967 WARNING [train.py:1204] (2/4) Exclude cut with ID _28323_210_16187_1_1532480395667_4100300_531-24157-0_sp1.1 from training. Duration: 0.8909375 2023-10-02 16:14:30,202 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.conv_skip_rate, batch_count=940500.0, ans=0.0 2023-10-02 16:14:32,709 WARNING [train.py:1204] (2/4) Exclude cut with ID _29303_210_2575_1_1532392237303_7410649_382-217492-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 16:14:32,741 WARNING [train.py:1204] (2/4) Exclude cut with ID _58895_210_8750_1_1535184081320_3649089_270-158013-0_sp1.1 from training. Duration: 0.92725 2023-10-02 16:14:37,286 WARNING [train.py:1204] (2/4) Exclude cut with ID _29277_210_3279_1_1532581109293_3173069_311-206227-0_sp1.1 from training. Duration: 0.936375 2023-10-02 16:14:37,300 WARNING [train.py:1204] (2/4) Exclude cut with ID _7160_210_1876_1_1527940923231_3703799_282-322611-0_sp0.9 from training. Duration: 0.9666875 2023-10-02 16:14:37,453 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.0.ff2_skip_rate, batch_count=940566.6666666666, ans=0.0 2023-10-02 16:14:40,141 WARNING [train.py:1204] (2/4) Exclude cut with ID _53219_210_4525_1_1534834836008_4064825_562-32295-0_sp1.1 from training. Duration: 0.963625 2023-10-02 16:14:41,489 WARNING [train.py:1204] (2/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_408-87330-0_sp1.1 from training. Duration: 0.963625 2023-10-02 16:14:41,496 WARNING [train.py:1204] (2/4) Exclude cut with ID _59202_210_26984_1_1534816810612_3549174_137-318710-0_sp1.1 from training. Duration: 0.8090625 2023-10-02 16:14:42,953 WARNING [train.py:1204] (2/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_372-30302-0 from training. Duration: 0.86 2023-10-02 16:14:47,487 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7543_210_6753_1_1528541907_1399322_178-56269-0_sp1.1 from training. Duration: 0.95275 2023-10-02 16:14:47,505 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9109W0012-549758 from training. Duration: 0.512 2023-10-02 16:14:48,918 WARNING [train.py:1204] (2/4) Exclude cut with ID _5410_210_1388_1_1527397055795_1719020_39-112221-0_sp0.9 from training. Duration: 0.7666875 2023-10-02 16:14:50,417 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9121W0012-554933 from training. Duration: 0.896 2023-10-02 16:14:52,258 WARNING [train.py:1204] (2/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_476-272757-0 from training. Duration: 0.93 2023-10-02 16:14:52,276 WARNING [train.py:1204] (2/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_1085-171718-0_sp1.1 from training. Duration: 0.92725 2023-10-02 16:14:52,341 WARNING [train.py:1204] (2/4) Exclude cut with ID _8480_210_1874_1_1528589391205_6728330_309-139896-0_sp1.1 from training. Duration: 0.736375 2023-10-02 16:14:52,341 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9130W0113-559703 from training. Duration: 0.896 2023-10-02 16:14:53,194 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.0.layers.1.self_attn_weights.whiten_keys, num_groups=4, num_channels=128, metric=2.41 vs. limit=6.0 2023-10-02 16:14:53,601 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_292-265928-0_sp1.1 from training. Duration: 0.463625 2023-10-02 16:14:53,864 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.ff2_skip_rate, batch_count=940633.3333333334, ans=0.0 2023-10-02 16:14:56,828 WARNING [train.py:1204] (2/4) Exclude cut with ID _8772_210_1850_1_1528635595972_3664011_96-12756-0 from training. Duration: 0.98 2023-10-02 16:14:56,883 WARNING [train.py:1204] (2/4) Exclude cut with ID _12556_210_8341_1_1530081024100_7327690_725-193477-0_sp1.1 from training. Duration: 0.8909375 2023-10-02 16:14:58,146 WARNING [train.py:1204] (2/4) Exclude cut with ID _5289_210_2084_1_1527678202736_3779299_142-103317-0_sp1.1 from training. Duration: 0.763625 2023-10-02 16:15:00,987 WARNING [train.py:1204] (2/4) Exclude cut with ID _45012_210_12651_1_1533608435136_5371380_206-51075-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 16:15:02,472 WARNING [train.py:1204] (2/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_176-56120-0_sp1.1 from training. Duration: 0.7545625 2023-10-02 16:15:02,501 WARNING [train.py:1204] (2/4) Exclude cut with ID _40284_210_2084_1_1533513679814_3704279_220-201864-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 16:15:02,527 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9165W0004-576638 from training. Duration: 0.896 2023-10-02 16:15:03,779 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_5438_210_1794_1_1527336687_2600009_218-343336-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 16:15:03,808 WARNING [train.py:1204] (2/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_318-162017-0 from training. Duration: 0.91 2023-10-02 16:15:04,037 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.feed_forward3.hidden_balancer.prob, batch_count=940633.3333333334, ans=0.125 2023-10-02 16:15:09,848 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_49544_210_1794_1_1534379770_5262083_483-349298-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 16:15:09,925 WARNING [train.py:1204] (2/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_197-28164-0_sp0.9 from training. Duration: 0.888875 2023-10-02 16:15:11,281 WARNING [train.py:1204] (2/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_643-52007-0 from training. Duration: 0.57 2023-10-02 16:15:11,309 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_38965_210_4852_1_1533174906_3484735_403-332963-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 16:15:11,952 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.4.encoder.layers.0.feed_forward2.out_whiten, num_groups=1, num_channels=384, metric=6.22 vs. limit=15.0 2023-10-02 16:15:12,669 WARNING [train.py:1204] (2/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_340-174176-0 from training. Duration: 0.49 2023-10-02 16:15:15,372 WARNING [train.py:1204] (2/4) Exclude cut with ID _56708_210_13177_1_1535252351282_3422899_363-917-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 16:15:17,254 WARNING [train.py:1204] (2/4) Exclude cut with ID _19896_210_1883_1_1531709022662_6649569_525-159063-0_sp1.1 from training. Duration: 0.936375 2023-10-02 16:15:17,288 WARNING [train.py:1204] (2/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_430-116139-0_sp0.9 from training. Duration: 0.9444375 2023-10-02 16:15:18,674 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_53753_210_7234_1_1534320003_3594148_86-329250-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 16:15:20,064 WARNING [train.py:1204] (2/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_406-275649-0_sp1.1 from training. Duration: 0.3818125 2023-10-02 16:15:20,636 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.4.encoder.layers.0.nonlin_attention.whiten2, num_groups=1, num_channels=384, metric=11.58 vs. limit=15.0 2023-10-02 16:15:21,516 WARNING [train.py:1204] (2/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_1083-147536-0_sp1.1 from training. Duration: 0.8818125 2023-10-02 16:15:22,832 INFO [train.py:1046] (2/4) Epoch 27, batch 3000, loss[loss=0.1727, simple_loss=0.2509, pruned_loss=0.04728, over 23154.00 frames. ], tot_loss[loss=0.1665, simple_loss=0.244, pruned_loss=0.04444, over 4733248.18 frames. ], batch size: 105, lr: 3.79e-03, grad_scale: 8.0 2023-10-02 16:15:22,832 INFO [train.py:1069] (2/4) Computing validation loss 2023-10-02 16:15:31,793 INFO [zipformer.py:1853] (2/4) name=encoder.encoders.5.encoder.layers.1.self_attn_weights, attn_weights_entropy = tensor([3.4617, 2.9295, 4.2063, 4.0154], device='cuda:2') 2023-10-02 16:15:34,581 INFO [train.py:1078] (2/4) Epoch 27, validation: loss=0.3322, simple_loss=0.2706, pruned_loss=0.197, over 1125622.00 frames. 2023-10-02 16:15:34,582 INFO [train.py:1079] (2/4) Maximum memory allocated so far is 20967MB 2023-10-02 16:15:34,656 WARNING [train.py:1204] (2/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_54-311741-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 16:15:34,680 WARNING [train.py:1204] (2/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_237-58433-0_sp0.9 from training. Duration: 0.82225 2023-10-02 16:15:34,722 WARNING [train.py:1204] (2/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_98-51676-0_sp1.1 from training. Duration: 0.663625 2023-10-02 16:15:34,764 WARNING [train.py:1204] (2/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_386-270472-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 16:15:36,496 WARNING [train.py:1204] (2/4) Exclude cut with ID _19023_210_4353_1_1531378892552_5820780_83-254794-0_sp1.1 from training. Duration: 0.7818125 2023-10-02 16:15:37,889 WARNING [train.py:1204] (2/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_314-93656-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 16:15:37,901 WARNING [train.py:1204] (2/4) Exclude cut with ID _14170_210_5219_1_1530525949868_4469650_336-213530-0 from training. Duration: 0.9 2023-10-02 16:15:38,148 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.bypass.skip_rate, batch_count=940766.6666666666, ans=0.07 2023-10-02 16:15:40,575 WARNING [train.py:1204] (2/4) Exclude cut with ID _36686_210_5665_1_1533778225913_3646410_65-221120-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 16:15:40,762 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.0.bypass.skip_rate, batch_count=940766.6666666666, ans=0.035 2023-10-02 16:15:41,990 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_26433_210_12108_1_1532157158_4056480_271-68178-0_sp1.1 from training. Duration: 0.92725 2023-10-02 16:15:43,320 WARNING [train.py:1204] (2/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_46-207599-0_sp0.9 from training. Duration: 0.711125 2023-10-02 16:15:46,692 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9303W0114-637133 from training. Duration: 0.896 2023-10-02 16:15:46,719 WARNING [train.py:1204] (2/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_490-207861-0 from training. Duration: 0.98 2023-10-02 16:15:48,197 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6767_210_3273_1_1527855783_1718932_173-256601-0_sp0.9 from training. Duration: 0.888875 2023-10-02 16:15:49,495 WARNING [train.py:1204] (2/4) Exclude cut with ID _13921_210_9994_1_1530841782272_3503520_327-69677-0_sp1.1 from training. Duration: 0.8181875 2023-10-02 16:15:49,555 WARNING [train.py:1204] (2/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_508-335369-0 from training. Duration: 0.48 2023-10-02 16:15:49,576 WARNING [train.py:1204] (2/4) Exclude cut with ID _64631_210_27767_1_1535349580068_4165160_24-46567-0_sp1.1 from training. Duration: 0.963625 2023-10-02 16:15:57,344 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_89-35748-0_sp1.1 from training. Duration: 0.7181875 2023-10-02 16:16:06,124 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_26433_210_12108_1_1532157158_4056480_578-262107-0_sp1.1 from training. Duration: 0.9 2023-10-02 16:16:10,271 WARNING [train.py:1204] (2/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_129-337802-0 from training. Duration: 0.72 2023-10-02 16:16:12,155 WARNING [train.py:1204] (2/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_356-167816-0_sp0.9 from training. Duration: 0.8 2023-10-02 16:16:15,022 WARNING [train.py:1204] (2/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_137-116442-0_sp1.1 from training. Duration: 0.7090625 2023-10-02 16:16:15,067 WARNING [train.py:1204] (2/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_358-172940-0_sp1.1 from training. Duration: 0.963625 2023-10-02 16:16:15,094 WARNING [train.py:1204] (2/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_668-344147-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 16:16:15,245 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.1.feed_forward1.hidden_balancer.prob, batch_count=940900.0, ans=0.125 2023-10-02 16:16:15,691 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.4.encoder.layers.0.nonlin_attention.whiten1, num_groups=1, num_channels=288, metric=7.54 vs. limit=10.0 2023-10-02 16:16:17,677 WARNING [train.py:1204] (2/4) Exclude cut with ID _62337_210_12392_1_1535009273105_3412229_255-157667-0_sp1.1 from training. Duration: 0.936375 2023-10-02 16:16:17,686 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_486-151131-0 from training. Duration: 0.85 2023-10-02 16:16:20,428 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_53638_210_21581_1_1534297319_5808449_885-80172-0 from training. Duration: 0.96 2023-10-02 16:16:20,515 WARNING [train.py:1204] (2/4) Exclude cut with ID _50093_210_4220_1_1534417141207_3432299_105-183067-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 16:16:21,824 WARNING [train.py:1204] (2/4) Exclude cut with ID _7562_210_2084_1_1528887767916_3946229_345-169156-0_sp0.9 from training. Duration: 0.6444375 2023-10-02 16:16:23,234 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_4528_210_3273_1_1527076654_3276777_209-219804-0_sp0.9 from training. Duration: 0.7666875 2023-10-02 16:16:23,265 WARNING [train.py:1204] (2/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_35-153886-0_sp0.9 from training. Duration: 0.9333125 2023-10-02 16:16:25,247 WARNING [train.py:1204] (2/4) Exclude cut with ID _30800_210_22382_1_1532499347904_4515959_332-121925-0_sp1.1 from training. Duration: 0.92725 2023-10-02 16:16:25,249 WARNING [train.py:1204] (2/4) Exclude cut with ID _59202_210_26984_1_1534816810612_3549174_495-65515-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 16:16:29,355 WARNING [train.py:1204] (2/4) Exclude cut with ID _6274_210_2084_1_1527937583837_7494020_769-168302-0_sp1.1 from training. Duration: 0.6454375 2023-10-02 16:16:29,391 WARNING [train.py:1204] (2/4) Exclude cut with ID _24271_210_12658_1_1531987270022_7190519_708-71345-0_sp1.1 from training. Duration: 0.936375 2023-10-02 16:16:29,391 WARNING [train.py:1204] (2/4) Exclude cut with ID 3033-130750-0096-55598-0_sp0.9 from training. Duration: 0.92225 2023-10-02 16:16:30,778 WARNING [train.py:1204] (2/4) Exclude cut with ID _14087_210_2253_1_1530529224512_3014029_219-224445-0_sp0.9 from training. Duration: 0.9333125 2023-10-02 16:16:31,979 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.568e+02 1.934e+02 2.111e+02 2.505e+02 3.384e+02, threshold=4.221e+02, percent-clipped=0.0 2023-10-02 16:16:33,446 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_174-277364-0 from training. Duration: 0.71 2023-10-02 16:16:34,809 WARNING [train.py:1204] (2/4) Exclude cut with ID _45021_210_9994_1_1533619751094_3330288_96-239010-0_sp1.1 from training. Duration: 0.82725 2023-10-02 16:16:34,831 WARNING [train.py:1204] (2/4) Exclude cut with ID _31602_210_5854_1_1532677021456_4458250_489-305116-0_sp1.1 from training. Duration: 0.97275 2023-10-02 16:16:34,849 WARNING [train.py:1204] (2/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_269-154726-0_sp0.9 from training. Duration: 0.9555625 2023-10-02 16:16:38,216 WARNING [train.py:1204] (2/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_126-54418-0_sp1.1 from training. Duration: 0.92725 2023-10-02 16:16:39,520 WARNING [train.py:1204] (2/4) Exclude cut with ID _42326_210_7770_1_1533814282531_3707269_182-224159-0_sp1.1 from training. Duration: 0.92725 2023-10-02 16:16:39,608 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7002_210_1794_1_1527939683_5157286_1-281264-0_sp0.9 from training. Duration: 0.488875 2023-10-02 16:16:39,640 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_86-16881-0 from training. Duration: 0.58 2023-10-02 16:16:39,670 WARNING [train.py:1204] (2/4) Exclude cut with ID _19514_210_9259_1_1531530071330_5084230_637-279094-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 16:16:41,562 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_26433_210_12108_1_1532157158_4056480_578-262107-0 from training. Duration: 0.99 2023-10-02 16:16:42,871 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6291_210_3273_1_1527683649_1078498_82-221196-0_sp1.1 from training. Duration: 0.6909375 2023-10-02 16:16:45,427 WARNING [train.py:1204] (2/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_402-102311-0 from training. Duration: 0.67 2023-10-02 16:16:46,919 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1015-343884-0_sp1.1 from training. Duration: 0.563625 2023-10-02 16:16:48,215 INFO [train.py:1046] (2/4) Epoch 27, batch 3050, loss[loss=0.1575, simple_loss=0.2427, pruned_loss=0.0362, over 24627.00 frames. ], tot_loss[loss=0.1672, simple_loss=0.2451, pruned_loss=0.04467, over 4735242.05 frames. ], batch size: 68, lr: 3.79e-03, grad_scale: 8.0 2023-10-02 16:16:48,321 WARNING [train.py:1204] (2/4) Exclude cut with ID _13684_210_7497_1_1530619354863_3598359_312-235606-0_sp1.1 from training. Duration: 0.5454375 2023-10-02 16:16:49,621 WARNING [train.py:1204] (2/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_55-31904-0 from training. Duration: 0.88 2023-10-02 16:16:49,698 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_3133_210_2087_1_1526106446_2324690_25-332138-0 from training. Duration: 0.91 2023-10-02 16:16:49,699 WARNING [train.py:1204] (2/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_912-282412-0_sp0.9 from training. Duration: 0.5444375 2023-10-02 16:16:51,011 WARNING [train.py:1204] (2/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_458-299435-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 16:16:52,382 WARNING [train.py:1204] (2/4) Exclude cut with ID _69584_210_5219_1_1535785133775_4044450_275-88403-0_sp1.1 from training. Duration: 0.92725 2023-10-02 16:16:52,391 WARNING [train.py:1204] (2/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_841-341679-0_sp1.1 from training. Duration: 0.52725 2023-10-02 16:16:52,397 WARNING [train.py:1204] (2/4) Exclude cut with ID _41391_210_7994_1_1533603771872_3638201_1-54521-0_sp1.1 from training. Duration: 0.97275 2023-10-02 16:16:52,440 WARNING [train.py:1204] (2/4) Exclude cut with ID _56235_210_10640_1_1534557335235_4161099_495-186609-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 16:16:53,930 WARNING [train.py:1204] (2/4) Exclude cut with ID _4681_210_3525_1_1527250689464_120889_15-126760-0 from training. Duration: 0.81 2023-10-02 16:16:57,350 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_3019_210_2028_1_1526044245_3264865_127-135615-0_sp1.1 from training. Duration: 0.8545625 2023-10-02 16:16:58,796 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.bypass_mid.scale_min, batch_count=941100.0, ans=0.2 2023-10-02 16:17:00,021 WARNING [train.py:1204] (2/4) Exclude cut with ID _30191_210_1664_1_1533189915047_7196670_915-21561-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 16:17:00,066 WARNING [train.py:1204] (2/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_6-214815-0_sp0.9 from training. Duration: 0.9444375 2023-10-02 16:17:04,269 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_45399_210_4852_1_1533624725_7579737_190-137494-0_sp1.1 from training. Duration: 0.97275 2023-10-02 16:17:06,308 WARNING [train.py:1204] (2/4) Exclude cut with ID _41314_210_12651_1_1533459476543_3766460_207-132038-0 from training. Duration: 0.94 2023-10-02 16:17:09,268 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=941166.6666666666, ans=0.1 2023-10-02 16:17:13,683 WARNING [train.py:1204] (2/4) Exclude cut with ID _14603_210_4220_1_1530615688441_3097319_90-31383-0 from training. Duration: 0.87 2023-10-02 16:17:14,997 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_15517_210_6753_1_1530952293_1664795_119-31001-0 from training. Duration: 0.94 2023-10-02 16:17:15,035 WARNING [train.py:1204] (2/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_684-85342-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 16:17:16,417 WARNING [train.py:1204] (2/4) Exclude cut with ID _14603_210_4220_1_1530615688441_3097319_97-325053-0_sp1.1 from training. Duration: 0.836375 2023-10-02 16:17:19,224 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_68702_210_23689_1_1535799577_3420068_168-25150-0_sp1.1 from training. Duration: 0.97275 2023-10-02 16:17:19,230 WARNING [train.py:1204] (2/4) Exclude cut with ID _65134_210_14763_1_1535335393985_3505368_111-27984-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 16:17:19,278 WARNING [train.py:1204] (2/4) Exclude cut with ID _48678_210_5663_1_1533988830947_3769199_276-226230-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 16:17:22,041 WARNING [train.py:1204] (2/4) Exclude cut with ID _28427_210_4220_1_1532581240967_3061069_43-84928-0_sp1.1 from training. Duration: 0.9 2023-10-02 16:17:23,263 WARNING [train.py:1204] (2/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_158-250116-0_sp1.1 from training. Duration: 0.563625 2023-10-02 16:17:23,288 WARNING [train.py:1204] (2/4) Exclude cut with ID _66873_210_5517_1_1535454136506_4293370_279-332556-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 16:17:23,326 WARNING [train.py:1204] (2/4) Exclude cut with ID _42863_210_9070_1_1533902398788_3696920_488-230146-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 16:17:23,327 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_387-247159-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 16:17:25,434 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6729_210_2550_1_1528032481_1788725_261-42619-0_sp1.1 from training. Duration: 0.97275 2023-10-02 16:17:26,788 WARNING [train.py:1204] (2/4) Exclude cut with ID _19896_210_1883_1_1531709022662_6649569_831-44724-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 16:17:28,814 WARNING [train.py:1204] (2/4) Exclude cut with ID _55153_210_2253_1_1534384786677_3162096_280-285944-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 16:17:30,135 WARNING [train.py:1204] (2/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_448-4048-0 from training. Duration: 0.96 2023-10-02 16:17:30,185 WARNING [train.py:1204] (2/4) Exclude cut with ID _29678_210_7893_1_1532421042787_3906118_456-202548-0_sp1.1 from training. Duration: 0.97275 2023-10-02 16:17:30,205 WARNING [train.py:1204] (2/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_107-332398-0_sp1.1 from training. Duration: 0.7090625 2023-10-02 16:17:33,075 WARNING [train.py:1204] (2/4) Exclude cut with ID _13921_210_9994_1_1530838702800_3003429_46-149444-0_sp1.1 from training. Duration: 0.8545625 2023-10-02 16:17:33,104 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_257-20733-0_sp0.9 from training. Duration: 0.8444375 2023-10-02 16:17:34,969 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_53753_210_7234_1_1534320003_3594148_385-234809-0_sp1.1 from training. Duration: 0.963625 2023-10-02 16:17:35,018 WARNING [train.py:1204] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_292-215346-0_sp1.1 from training. Duration: 0.8818125 2023-10-02 16:17:40,793 WARNING [train.py:1204] (2/4) Exclude cut with ID _68965_210_1883_1_1535698962272_3497490_196-124268-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 16:17:42,695 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_23801_210_16223_1_1532066392_3294657_570-5439-0_sp1.1 from training. Duration: 0.8818125 2023-10-02 16:17:48,368 WARNING [train.py:1204] (2/4) Exclude cut with ID _5166_210_2084_1_1527073197160_4087993_349-6928-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 16:17:48,413 WARNING [train.py:1204] (2/4) Exclude cut with ID _60621_210_17071_1_1535013053530_3671739_226-255202-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 16:17:48,628 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.bypass.skip_rate, batch_count=941366.6666666666, ans=0.04949747468305833 2023-10-02 16:17:49,708 WARNING [train.py:1204] (2/4) Exclude cut with ID _41817_210_18835_1_1533434477160_7423049_448-130302-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 16:17:51,160 WARNING [train.py:1204] (2/4) Exclude cut with ID _6653_210_2629_1_1527919172441_6732679_758-143677-0_sp1.1 from training. Duration: 0.8909375 2023-10-02 16:17:52,446 WARNING [train.py:1204] (2/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_912-47473-0_sp0.9 from training. Duration: 0.7555625 2023-10-02 16:17:52,473 WARNING [train.py:1204] (2/4) Exclude cut with ID _6694_210_4599_1_1527852637490_3965529_356-169233-0_sp1.1 from training. Duration: 0.9 2023-10-02 16:17:53,929 WARNING [train.py:1204] (2/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_1-99680-0 from training. Duration: 0.67 2023-10-02 16:17:55,248 WARNING [train.py:1204] (2/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_199-86432-0_sp1.1 from training. Duration: 0.8909375 2023-10-02 16:17:55,273 WARNING [train.py:1204] (2/4) Exclude cut with ID _61148_210_6897_1_1535094022726_4217610_362-182430-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 16:17:56,653 WARNING [train.py:1204] (2/4) Exclude cut with ID _49376_210_5986_1_1533985278732_3370740_301-154763-0 from training. Duration: 0.98 2023-10-02 16:17:58,693 WARNING [train.py:1204] (2/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_572-185150-0_sp1.1 from training. Duration: 0.8818125 2023-10-02 16:17:59,514 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.1.encoder.layers.1.nonlin_attention.whiten2, num_groups=1, num_channels=256, metric=6.37 vs. limit=15.0 2023-10-02 16:18:02,755 INFO [train.py:1046] (2/4) Epoch 27, batch 3100, loss[loss=0.161, simple_loss=0.2483, pruned_loss=0.03687, over 24676.00 frames. ], tot_loss[loss=0.167, simple_loss=0.245, pruned_loss=0.04453, over 4729193.67 frames. ], batch size: 73, lr: 3.79e-03, grad_scale: 8.0 2023-10-02 16:18:04,185 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_47896_210_20508_1_1534492518_5958868_558-314572-0_sp1.1 from training. Duration: 0.8818125 2023-10-02 16:18:05,679 WARNING [train.py:1204] (2/4) Exclude cut with ID _45560_210_21982_1_1533812603951_3584999_111-6376-0_sp0.9 from training. Duration: 0.9333125 2023-10-02 16:18:07,155 WARNING [train.py:1204] (2/4) Exclude cut with ID _14170_210_5219_1_1530525949868_4469650_412-317077-0_sp1.1 from training. Duration: 0.6818125 2023-10-02 16:18:09,041 WARNING [train.py:1204] (2/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_718-68505-0 from training. Duration: 0.89 2023-10-02 16:18:13,092 WARNING [train.py:1204] (2/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_77-311404-0 from training. Duration: 0.94 2023-10-02 16:18:13,196 WARNING [train.py:1204] (2/4) Exclude cut with ID _60229_210_27767_1_1534838406158_3655350_264-279573-0 from training. Duration: 0.83 2023-10-02 16:18:14,413 WARNING [train.py:1204] (2/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_106-300985-0_sp1.1 from training. Duration: 0.8181875 2023-10-02 16:18:17,726 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_4183_210_2087_1_1526729100_1869838_79-277189-0_sp1.1 from training. Duration: 0.963625 2023-10-02 16:18:17,735 WARNING [train.py:1204] (2/4) Exclude cut with ID _28427_210_4220_1_1532581240967_3061069_195-295268-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 16:18:20,608 WARNING [train.py:1204] (2/4) Exclude cut with ID _4092_210_2151_1_1526643044449_3135759_356-278694-0_sp0.9 from training. Duration: 0.52225 2023-10-02 16:18:24,602 WARNING [train.py:1204] (2/4) Exclude cut with ID _14459_210_2228_1_1531443641946_7516700_777-71446-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 16:18:28,191 WARNING [train.py:1204] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_520-126729-0 from training. Duration: 0.7 2023-10-02 16:18:34,100 WARNING [train.py:1204] (2/4) Exclude cut with ID _13684_210_7497_1_1530619354863_3598359_312-235606-0_sp0.9 from training. Duration: 0.6666875 2023-10-02 16:18:34,142 WARNING [train.py:1204] (2/4) Exclude cut with ID _37537_210_2253_1_1533171758568_3421949_283-349402-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 16:18:34,183 WARNING [train.py:1204] (2/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_29-117932-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 16:18:35,461 WARNING [train.py:1204] (2/4) Exclude cut with ID _29303_210_2575_1_1532392237303_7410649_721-283351-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 16:18:36,806 WARNING [train.py:1204] (2/4) Exclude cut with ID _14886_210_4220_1_1530702020355_3516190_175-229073-0_sp0.9 from training. Duration: 0.5 2023-10-02 16:18:38,595 WARNING [train.py:1204] (2/4) Exclude cut with ID _15458_210_8341_1_1530858614263_3834410_70-268830-0_sp1.1 from training. Duration: 0.97275 2023-10-02 16:18:38,620 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_2295_210_2087_1_1525500993_2407863_234-83104-0 from training. Duration: 0.62 2023-10-02 16:18:38,621 WARNING [train.py:1204] (2/4) Exclude cut with ID _22961_210_8254_1_1531976366672_4278180_317-199546-0_sp1.1 from training. Duration: 0.7818125 2023-10-02 16:18:38,720 WARNING [train.py:1204] (2/4) Exclude cut with ID _27894_210_12392_1_1532415496078_6902439_547-196022-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 16:18:41,344 WARNING [train.py:1204] (2/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_46-207599-0 from training. Duration: 0.64 2023-10-02 16:18:41,439 WARNING [train.py:1204] (2/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_426-326246-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 16:18:41,691 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.feed_forward3.hidden_balancer.prob, batch_count=941566.6666666666, ans=0.125 2023-10-02 16:18:44,674 WARNING [train.py:1204] (2/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_445-117398-0_sp0.9 from training. Duration: 0.988875 2023-10-02 16:18:45,975 WARNING [train.py:1204] (2/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_563-347832-0 from training. Duration: 0.89 2023-10-02 16:18:47,353 WARNING [train.py:1204] (2/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_252-216674-0 from training. Duration: 0.54 2023-10-02 16:18:48,711 WARNING [train.py:1204] (2/4) Exclude cut with ID _59611_210_8895_1_1534741198240_3650361_187-212779-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 16:18:48,761 WARNING [train.py:1204] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_282-159367-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 16:18:50,118 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_68702_210_23689_1_1535799577_3420068_33-136883-0_sp1.1 from training. Duration: 0.936375 2023-10-02 16:18:50,134 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_47228_210_4983_1_1533780021_3672027_184-293706-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 16:18:51,410 WARNING [train.py:1204] (2/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_656-188361-0_sp0.9 from training. Duration: 0.9666875 2023-10-02 16:18:53,481 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.2.encoder.layers.2.self_attn2.whiten, num_groups=1, num_channels=384, metric=19.57 vs. limit=22.5 2023-10-02 16:18:54,089 WARNING [train.py:1204] (2/4) Exclude cut with ID _4866_210_2577_1_1527404412284_4364659_352-157930-0_sp0.9 from training. Duration: 0.9 2023-10-02 16:18:54,089 WARNING [train.py:1204] (2/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_210-106047-0_sp1.1 from training. Duration: 0.8818125 2023-10-02 16:18:54,211 WARNING [train.py:1204] (2/4) Exclude cut with ID _9389_210_1664_1_1529217337301_5289269_289-213417-0_sp1.1 from training. Duration: 0.8090625 2023-10-02 16:18:55,541 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_3910_210_1794_1_1526777667_3747260_178-41186-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 16:18:55,545 WARNING [train.py:1204] (2/4) Exclude cut with ID _27052_210_5986_1_1532329197187_3585009_24-172263-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 16:18:55,549 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_5522_210_3273_1_1527249606_3909096_318-203153-0_sp1.1 from training. Duration: 0.3909375 2023-10-02 16:18:59,148 WARNING [train.py:1204] (2/4) Exclude cut with ID _27894_210_12392_1_1532415496078_6902439_222-51982-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 16:19:00,436 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.528e+02 1.900e+02 2.040e+02 2.286e+02 3.067e+02, threshold=4.081e+02, percent-clipped=0.0 2023-10-02 16:19:00,549 WARNING [train.py:1204] (2/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_87-131427-0 from training. Duration: 0.78 2023-10-02 16:19:03,424 WARNING [train.py:1204] (2/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_372-298093-0_sp0.9 from training. Duration: 0.888875 2023-10-02 16:19:03,483 WARNING [train.py:1204] (2/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_663-166973-0 from training. Duration: 0.8 2023-10-02 16:19:04,752 WARNING [train.py:1204] (2/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_429-177469-0_sp1.1 from training. Duration: 0.936375 2023-10-02 16:19:04,785 WARNING [train.py:1204] (2/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_724-172651-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 16:19:04,833 WARNING [train.py:1204] (2/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_103-336064-0 from training. Duration: 0.51 2023-10-02 16:19:16,592 INFO [train.py:1046] (2/4) Epoch 27, batch 3150, loss[loss=0.167, simple_loss=0.2454, pruned_loss=0.04429, over 24626.00 frames. ], tot_loss[loss=0.1661, simple_loss=0.2437, pruned_loss=0.04422, over 4729441.85 frames. ], batch size: 60, lr: 3.78e-03, grad_scale: 8.0 2023-10-02 16:19:16,658 WARNING [train.py:1204] (2/4) Exclude cut with ID _48677_210_5663_1_1533902405919_3892989_111-11439-0 from training. Duration: 0.96 2023-10-02 16:19:18,095 WARNING [train.py:1204] (2/4) Exclude cut with ID _8085_210_4557_1_1528455633493_3716100_95-103398-0_sp1.1 from training. Duration: 0.963625 2023-10-02 16:19:18,147 WARNING [train.py:1204] (2/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_1041-110837-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 16:19:18,293 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.ff2_skip_rate, batch_count=941766.6666666666, ans=0.0 2023-10-02 16:19:18,958 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.1.encoder.layers.0.self_attn2.whiten, num_groups=1, num_channels=256, metric=13.23 vs. limit=22.5 2023-10-02 16:19:20,839 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_50050_210_16223_1_1535435796_7290138_1155-121757-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 16:19:20,841 WARNING [train.py:1204] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_602-275260-0_sp0.9 from training. Duration: 0.911125 2023-10-02 16:19:20,899 WARNING [train.py:1204] (2/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_265-164486-0 from training. Duration: 0.96 2023-10-02 16:19:22,261 WARNING [train.py:1204] (2/4) Exclude cut with ID _46067_210_4220_1_1534125571066_3307450_142-229375-0_sp1.1 from training. Duration: 0.963625 2023-10-02 16:19:22,300 WARNING [train.py:1204] (2/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_310-26186-0_sp1.1 from training. Duration: 0.636375 2023-10-02 16:19:23,623 WARNING [train.py:1204] (2/4) Exclude cut with ID _28461_210_2253_1_1532771775189_3041299_136-270644-0 from training. Duration: 0.93 2023-10-02 16:19:24,957 WARNING [train.py:1204] (2/4) Exclude cut with ID _14860_210_4915_1_1530702020340_7343240_438-286372-0_sp1.1 from training. Duration: 0.936375 2023-10-02 16:19:26,827 WARNING [train.py:1204] (2/4) Exclude cut with ID IC0128W0004-63309 from training. Duration: 0.831 2023-10-02 16:19:30,092 WARNING [train.py:1204] (2/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_135-174080-0 from training. Duration: 0.53 2023-10-02 16:19:30,129 WARNING [train.py:1204] (2/4) Exclude cut with ID _41326_210_5854_1_1533778076509_3492080_277-59511-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 16:19:31,476 WARNING [train.py:1204] (2/4) Exclude cut with ID IC0144W0112-71379 from training. Duration: 0.992 2023-10-02 16:19:32,809 WARNING [train.py:1204] (2/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_711-15688-0_sp0.9 from training. Duration: 0.47775 2023-10-02 16:19:34,198 WARNING [train.py:1204] (2/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_237-332027-0 from training. Duration: 0.95 2023-10-02 16:19:36,076 WARNING [train.py:1204] (2/4) Exclude cut with ID _7970_210_1883_1_1528455616296_3472230_25-178407-0 from training. Duration: 0.71 2023-10-02 16:19:36,077 WARNING [train.py:1204] (2/4) Exclude cut with ID _8480_210_1874_1_1528589391205_6728330_309-139896-0 from training. Duration: 0.81 2023-10-02 16:19:36,104 WARNING [train.py:1204] (2/4) Exclude cut with ID _39571_210_13449_1_1533801584143_3651077_319-268877-0_sp1.1 from training. Duration: 0.936375 2023-10-02 16:19:36,107 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_43802_210_2290_1_1533516203_8829504_155-246880-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 16:19:37,488 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_566-122599-0_sp1.1 from training. Duration: 0.936375 2023-10-02 16:19:38,917 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_12868_210_6753_1_1530701932_530275_18-76322-0 from training. Duration: 0.87 2023-10-02 16:19:40,373 WARNING [train.py:1204] (2/4) Exclude cut with ID _56494_210_4220_1_1535284837625_3033169_159-106035-0_sp1.1 from training. Duration: 0.963625 2023-10-02 16:19:41,682 WARNING [train.py:1204] (2/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_98-276013-0_sp1.1 from training. Duration: 0.963625 2023-10-02 16:19:41,730 WARNING [train.py:1204] (2/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_330-216124-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 16:19:44,817 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_177-58005-0_sp0.9 from training. Duration: 0.688875 2023-10-02 16:19:47,626 WARNING [train.py:1204] (2/4) Exclude cut with ID _56235_210_10640_1_1534557335235_4161099_343-139898-0 from training. Duration: 0.96 2023-10-02 16:19:47,687 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6961_210_2087_1_1527937168_6815011_395-185060-0_sp1.1 from training. Duration: 0.863625 2023-10-02 16:19:50,332 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7002_210_1794_1_1527939683_5157286_184-229630-0_sp0.9 from training. Duration: 0.82225 2023-10-02 16:19:50,375 WARNING [train.py:1204] (2/4) Exclude cut with ID _59170_210_6721_1_1534983633960_4203335_153-18895-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 16:19:51,726 WARNING [train.py:1204] (2/4) Exclude cut with ID _49643_210_7989_1_1534640244224_3939031_598-161926-0 from training. Duration: 0.9 2023-10-02 16:19:53,211 WARNING [train.py:1204] (2/4) Exclude cut with ID _4068_210_2629_1_1526697350395_1546259_224-288693-0 from training. Duration: 0.59 2023-10-02 16:19:54,555 WARNING [train.py:1204] (2/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_718-68505-0_sp1.1 from training. Duration: 0.8090625 2023-10-02 16:19:55,871 WARNING [train.py:1204] (2/4) Exclude cut with ID _4068_210_2629_1_1526697350395_1546259_224-288693-0_sp0.9 from training. Duration: 0.6555625 2023-10-02 16:19:55,887 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_2296_210_2087_1_1525503448_3397335_57-185430-0_sp0.9 from training. Duration: 0.6666875 2023-10-02 16:19:57,828 WARNING [train.py:1204] (2/4) Exclude cut with ID _15458_210_8341_1_1530858614263_3834410_49-235472-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 16:19:57,831 WARNING [train.py:1204] (2/4) Exclude cut with ID _64135_210_2253_1_1535677267876_7608972_498-5039-0_sp0.9 from training. Duration: 0.8666875 2023-10-02 16:19:57,936 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.0.feed_forward1.out_proj.dropout_p, batch_count=941900.0, ans=0.1 2023-10-02 16:20:00,809 WARNING [train.py:1204] (2/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_283-220372-0_sp0.9 from training. Duration: 0.62225 2023-10-02 16:20:00,818 WARNING [train.py:1204] (2/4) Exclude cut with ID _6473_210_1741_1_1527901307396_3815380_108-17335-0_sp0.9 from training. Duration: 0.72225 2023-10-02 16:20:00,911 WARNING [train.py:1204] (2/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_113-92608-0 from training. Duration: 0.54 2023-10-02 16:20:02,223 WARNING [train.py:1204] (2/4) Exclude cut with ID _14170_210_5219_1_1530525949868_4469650_272-249985-0_sp1.1 from training. Duration: 0.6090625 2023-10-02 16:20:02,239 WARNING [train.py:1204] (2/4) Exclude cut with ID _50376_210_13401_1_1534031502101_4217740_518-187390-0_sp1.1 from training. Duration: 0.97275 2023-10-02 16:20:04,991 WARNING [train.py:1204] (2/4) Exclude cut with ID _59738_210_15379_1_1534849512562_3941470_165-248260-0_sp1.1 from training. Duration: 0.92725 2023-10-02 16:20:05,010 WARNING [train.py:1204] (2/4) Exclude cut with ID _23818_210_6228_1_1531917063884_3676920_400-343925-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 16:20:05,062 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6961_210_2087_1_1527937168_6815011_562-165518-0 from training. Duration: 0.77 2023-10-02 16:20:05,123 WARNING [train.py:1204] (2/4) Exclude cut with ID _24271_210_12658_1_1531987270022_7190519_289-207931-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 16:20:06,738 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.conv_skip_rate, batch_count=941966.6666666666, ans=0.0 2023-10-02 16:20:08,303 WARNING [train.py:1204] (2/4) Exclude cut with ID _14632_210_9988_1_1530691279392_3923200_332-105904-0 from training. Duration: 0.9 2023-10-02 16:20:08,315 WARNING [train.py:1204] (2/4) Exclude cut with ID _40402_210_18219_1_1533522324115_2953150_203-232438-0_sp1.1 from training. Duration: 0.97275 2023-10-02 16:20:09,710 WARNING [train.py:1204] (2/4) Exclude cut with ID _13252_210_1925_1_1530671372047_3336909_263-168388-0 from training. Duration: 0.86 2023-10-02 16:20:09,777 WARNING [train.py:1204] (2/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_174-205058-0 from training. Duration: 0.95 2023-10-02 16:20:09,996 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.1.feed_forward1.out_proj.dropout_p, batch_count=941966.6666666666, ans=0.1 2023-10-02 16:20:11,143 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_94-195801-0_sp1.1 from training. Duration: 0.8545625 2023-10-02 16:20:11,173 WARNING [train.py:1204] (2/4) Exclude cut with ID _55153_210_2253_1_1534384786677_3162096_269-103204-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 16:20:11,250 WARNING [train.py:1204] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_748-36150-0 from training. Duration: 0.91 2023-10-02 16:20:12,509 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_12868_210_6753_1_1530702479_3610564_148-172861-0_sp1.1 from training. Duration: 0.4545625 2023-10-02 16:20:12,558 WARNING [train.py:1204] (2/4) Exclude cut with ID _67499_210_17282_1_1535509805220_3692430_460-77730-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 16:20:15,663 WARNING [train.py:1204] (2/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_374-20753-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 16:20:16,943 WARNING [train.py:1204] (2/4) Exclude cut with ID _45329_210_7005_1_1534157951211_6774339_374-289983-0_sp1.1 from training. Duration: 0.97275 2023-10-02 16:20:16,971 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_34886_210_10633_1_1533203882_1755661_76-156096-0_sp0.9 from training. Duration: 0.9666875 2023-10-02 16:20:22,544 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_238-256668-0_sp0.9 from training. Duration: 0.7666875 2023-10-02 16:20:23,846 WARNING [train.py:1204] (2/4) Exclude cut with ID _46683_210_9642_1_1533730913492_4591116_489-146727-0_sp1.1 from training. Duration: 0.97275 2023-10-02 16:20:25,244 WARNING [train.py:1204] (2/4) Exclude cut with ID _13938_210_9988_1_1530518718978_3693960_304-112784-0_sp0.9 from training. Duration: 0.511125 2023-10-02 16:20:30,372 INFO [train.py:1046] (2/4) Epoch 27, batch 3200, loss[loss=0.1782, simple_loss=0.2542, pruned_loss=0.05106, over 23379.00 frames. ], tot_loss[loss=0.166, simple_loss=0.2436, pruned_loss=0.04413, over 4721568.26 frames. ], batch size: 93, lr: 3.78e-03, grad_scale: 16.0 2023-10-02 16:20:30,450 WARNING [train.py:1204] (2/4) Exclude cut with ID _24271_210_12658_1_1531987270022_7190519_243-46549-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 16:20:30,454 WARNING [train.py:1204] (2/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_78-182912-0_sp1.1 from training. Duration: 0.536375 2023-10-02 16:20:33,321 WARNING [train.py:1204] (2/4) Exclude cut with ID _57572_210_5517_1_1534921219624_3809484_121-321334-0_sp1.1 from training. Duration: 0.97275 2023-10-02 16:20:33,589 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.ff3_skip_rate, batch_count=942100.0, ans=0.0 2023-10-02 16:20:35,154 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_40047_210_2290_1_1533290519_2374311_254-102149-0_sp1.1 from training. Duration: 0.963625 2023-10-02 16:20:35,155 WARNING [train.py:1204] (2/4) Exclude cut with ID _28461_210_2253_1_1532771775189_3041299_96-152158-0 from training. Duration: 0.85 2023-10-02 16:20:36,627 WARNING [train.py:1204] (2/4) Exclude cut with ID _29877_210_6228_1_1532397791106_5276149_161-102979-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 16:20:40,183 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.feed_forward2.out_whiten.whitening_limit, batch_count=942100.0, ans=15.0 2023-10-02 16:20:40,833 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_3787_210_2040_1_1526477544_5478586_603-120324-0_sp0.9 from training. Duration: 0.9 2023-10-02 16:20:44,053 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_41750_210_1960_1_1533520678_3948042_247-213456-0_sp1.1 from training. Duration: 0.97275 2023-10-02 16:20:53,360 WARNING [train.py:1204] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_127-119109-0_sp0.9 from training. Duration: 0.8 2023-10-02 16:21:02,284 WARNING [train.py:1204] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_215-202973-0 from training. Duration: 0.88 2023-10-02 16:21:03,505 WARNING [train.py:1204] (2/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_17-320491-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 16:21:05,507 WARNING [train.py:1204] (2/4) Exclude cut with ID _5166_210_2084_1_1527073197160_4087993_310-114452-0 from training. Duration: 0.59 2023-10-02 16:21:06,892 WARNING [train.py:1204] (2/4) Exclude cut with ID _15133_210_6228_1_1530794512074_3080379_29-264458-0_sp0.9 from training. Duration: 0.6444375 2023-10-02 16:21:09,510 WARNING [train.py:1204] (2/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_159-281310-0_sp1.1 from training. Duration: 0.863625 2023-10-02 16:21:10,756 WARNING [train.py:1204] (2/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_195-167011-0_sp0.9 from training. Duration: 0.8333125 2023-10-02 16:21:12,125 WARNING [train.py:1204] (2/4) Exclude cut with ID _14087_210_2253_1_1530529224512_3014029_156-257607-0_sp1.1 from training. Duration: 0.7545625 2023-10-02 16:21:12,357 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.1.feed_forward1.hidden_balancer.prob, batch_count=942300.0, ans=0.125 2023-10-02 16:21:15,417 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_2295_210_2087_1_1525500993_2407863_116-238894-0 from training. Duration: 0.7 2023-10-02 16:21:16,806 WARNING [train.py:1204] (2/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_321-335475-0_sp0.9 from training. Duration: 0.5 2023-10-02 16:21:19,537 WARNING [train.py:1204] (2/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_188-114464-0 from training. Duration: 0.82 2023-10-02 16:21:20,973 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_2592_210_2290_1_1525697025_4882925_157-70512-0 from training. Duration: 0.91 2023-10-02 16:21:25,110 WARNING [train.py:1204] (2/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_138-64210-0_sp1.1 from training. Duration: 0.663625 2023-10-02 16:21:25,861 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.1.encoder.layers.1.feed_forward1.out_whiten, num_groups=1, num_channels=256, metric=7.00 vs. limit=15.0 2023-10-02 16:21:26,757 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.562e+02 1.873e+02 2.063e+02 2.304e+02 3.218e+02, threshold=4.126e+02, percent-clipped=0.0 2023-10-02 16:21:30,316 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_11305_210_1794_1_1529750428_7178148_651-95702-0_sp1.1 from training. Duration: 0.963625 2023-10-02 16:21:31,637 WARNING [train.py:1204] (2/4) Exclude cut with ID _14224_210_4381_1_1530529062037_4264010_379-184223-0_sp0.9 from training. Duration: 0.9444375 2023-10-02 16:21:31,655 WARNING [train.py:1204] (2/4) Exclude cut with ID _8394_210_6702_1_1528590504316_3540189_381-125325-0_sp1.1 from training. Duration: 0.963625 2023-10-02 16:21:31,719 WARNING [train.py:1204] (2/4) Exclude cut with ID IC0622W0021-307659 from training. Duration: 0.896 2023-10-02 16:21:31,721 WARNING [train.py:1204] (2/4) Exclude cut with ID _7624_210_1531_1_1528109802198_4933101_418-349834-0_sp1.1 from training. Duration: 0.5454375 2023-10-02 16:21:35,849 WARNING [train.py:1204] (2/4) Exclude cut with ID _49728_210_12651_1_1534041034294_6788150_607-3416-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 16:21:39,099 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8275_210_4198_1_1528455320_3892571_491-17707-0 from training. Duration: 0.64 2023-10-02 16:21:39,140 WARNING [train.py:1204] (2/4) Exclude cut with ID _5287_210_2084_1_1527418883040_3878670_333-48508-0 from training. Duration: 0.6 2023-10-02 16:21:39,205 WARNING [train.py:1204] (2/4) Exclude cut with ID _8365_210_2064_1_1528592311070_4677690_371-122295-0 from training. Duration: 0.55 2023-10-02 16:21:40,566 WARNING [train.py:1204] (2/4) Exclude cut with ID _14860_210_4915_1_1530702020340_7343240_836-328489-0 from training. Duration: 0.9 2023-10-02 16:21:43,728 INFO [train.py:1046] (2/4) Epoch 27, batch 3250, loss[loss=0.1653, simple_loss=0.2379, pruned_loss=0.0464, over 23801.00 frames. ], tot_loss[loss=0.1662, simple_loss=0.2441, pruned_loss=0.04413, over 4728311.04 frames. ], batch size: 179, lr: 3.78e-03, grad_scale: 16.0 2023-10-02 16:21:43,801 WARNING [train.py:1204] (2/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_1033-274376-0_sp0.9 from training. Duration: 0.9666875 2023-10-02 16:21:45,312 WARNING [train.py:1204] (2/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_475-127455-0_sp0.9 from training. Duration: 0.77775 2023-10-02 16:21:45,325 WARNING [train.py:1204] (2/4) Exclude cut with ID IC0671W0071-331914 from training. Duration: 0.896 2023-10-02 16:21:46,554 WARNING [train.py:1204] (2/4) Exclude cut with ID _30327_210_16049_1_1532484161295_3924169_2-1523-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 16:21:46,556 WARNING [train.py:1204] (2/4) Exclude cut with ID _24314_210_4915_1_1532135078116_7170179_460-208279-0_sp1.1 from training. Duration: 0.97275 2023-10-02 16:21:48,089 WARNING [train.py:1204] (2/4) Exclude cut with ID IC0679W0052-335859 from training. Duration: 0.64 2023-10-02 16:21:53,669 WARNING [train.py:1204] (2/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_3-237434-0_sp1.1 from training. Duration: 0.6909375 2023-10-02 16:21:56,488 WARNING [train.py:1204] (2/4) Exclude cut with ID _69884_210_9771_1_1535846392818_4166169_405-185283-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 16:21:58,087 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.ff3_skip_rate, batch_count=942500.0, ans=0.0 2023-10-02 16:22:04,483 WARNING [train.py:1204] (2/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_927-83684-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 16:22:04,500 WARNING [train.py:1204] (2/4) Exclude cut with ID _6694_210_4599_1_1527852637490_3965529_356-169233-0 from training. Duration: 0.99 2023-10-02 16:22:05,781 WARNING [train.py:1204] (2/4) Exclude cut with ID _19896_210_1883_1_1531709022662_6649569_562-233001-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 16:22:05,830 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_20120_210_6753_1_1531643035_5482859_354-78629-0_sp1.1 from training. Duration: 0.963625 2023-10-02 16:22:05,831 WARNING [train.py:1204] (2/4) Exclude cut with ID _60135_210_13427_1_1534935557922_3651319_96-168198-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 16:22:07,252 WARNING [train.py:1204] (2/4) Exclude cut with ID _31602_210_5854_1_1532677021456_4458250_249-96604-0_sp1.1 from training. Duration: 0.8090625 2023-10-02 16:22:08,551 WARNING [train.py:1204] (2/4) Exclude cut with ID _14100_210_8341_1_1530529320613_3678069_258-224599-0_sp0.9 from training. Duration: 0.8555625 2023-10-02 16:22:09,977 WARNING [train.py:1204] (2/4) Exclude cut with ID _19708_210_9206_1_1532136650733_4238364_215-160135-0_sp1.1 from training. Duration: 0.97275 2023-10-02 16:22:10,007 WARNING [train.py:1204] (2/4) Exclude cut with ID _21878_210_3525_1_1531738772002_7264330_113-18163-0_sp0.9 from training. Duration: 0.988875 2023-10-02 16:22:11,897 WARNING [train.py:1204] (2/4) Exclude cut with ID _64135_210_2253_1_1535677267876_7608972_552-337340-0_sp1.1 from training. Duration: 0.92725 2023-10-02 16:22:11,929 WARNING [train.py:1204] (2/4) Exclude cut with ID _67725_210_27767_1_1535608756671_5105359_10-12887-0_sp1.1 from training. Duration: 0.97275 2023-10-02 16:22:11,937 WARNING [train.py:1204] (2/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_401-284840-0_sp1.1 from training. Duration: 0.97275 2023-10-02 16:22:11,962 WARNING [train.py:1204] (2/4) Exclude cut with ID _44659_210_2721_1_1534036120061_6614860_483-141285-0_sp1.1 from training. Duration: 0.936375 2023-10-02 16:22:15,230 WARNING [train.py:1204] (2/4) Exclude cut with ID _9461_210_4557_1_1529060514726_3677451_358-332734-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 16:22:16,589 WARNING [train.py:1204] (2/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_788-298907-0_sp1.1 from training. Duration: 0.8090625 2023-10-02 16:22:17,936 WARNING [train.py:1204] (2/4) Exclude cut with ID _39411_210_5986_1_1533538825030_3343649_354-267196-0_sp1.1 from training. Duration: 0.92725 2023-10-02 16:22:17,955 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_14953_210_2151_1_1530701710_5616165_192-309792-0_sp1.1 from training. Duration: 0.97275 2023-10-02 16:22:18,105 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.1.bypass_mid.scale_min, batch_count=942566.6666666666, ans=0.2 2023-10-02 16:22:18,157 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.feed_forward2.hidden_balancer.prob, batch_count=942566.6666666666, ans=0.125 2023-10-02 16:22:20,676 WARNING [train.py:1204] (2/4) Exclude cut with ID _59170_210_6721_1_1534983633960_4203335_290-151255-0_sp1.1 from training. Duration: 0.92725 2023-10-02 16:22:20,709 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_5575_210_1410_1_1527301305_2570170_249-93769-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 16:22:20,718 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_46013_210_7234_1_1533776362_3744032_463-52378-0_sp1.1 from training. Duration: 0.7818125 2023-10-02 16:22:23,725 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=942566.6666666666, ans=0.1 2023-10-02 16:22:24,776 WARNING [train.py:1204] (2/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_580-93798-0 from training. Duration: 0.56 2023-10-02 16:22:26,200 WARNING [train.py:1204] (2/4) Exclude cut with ID _19514_210_9259_1_1531530071330_5084230_385-13762-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 16:22:26,205 WARNING [train.py:1204] (2/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_458-67321-0_sp1.1 from training. Duration: 0.8818125 2023-10-02 16:22:26,312 WARNING [train.py:1204] (2/4) Exclude cut with ID _41298_210_9019_1_1533430371450_4822143_188-154791-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 16:22:27,572 WARNING [train.py:1204] (2/4) Exclude cut with ID _15037_210_2253_1_1530752294104_3267650_296-192597-0_sp0.9 from training. Duration: 0.9 2023-10-02 16:22:29,199 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.conv_module1.balancer2.prob, batch_count=942633.3333333334, ans=0.125 2023-10-02 16:22:33,650 WARNING [train.py:1204] (2/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_661-264857-0_sp1.1 from training. Duration: 0.8454375 2023-10-02 16:22:39,670 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_34008_210_10431_1_1533345424_8183247_662-317331-0_sp1.1 from training. Duration: 0.8545625 2023-10-02 16:22:41,459 WARNING [train.py:1204] (2/4) Exclude cut with ID _46502_210_13794_1_1533724452198_3657159_241-278329-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 16:22:41,460 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_11349_210_6753_1_1529800861_1101207_26-65602-0 from training. Duration: 0.96 2023-10-02 16:22:41,466 WARNING [train.py:1204] (2/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_448-4048-0_sp1.1 from training. Duration: 0.87275 2023-10-02 16:22:41,480 WARNING [train.py:1204] (2/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_662-275621-0_sp1.1 from training. Duration: 0.3818125 2023-10-02 16:22:41,554 WARNING [train.py:1204] (2/4) Exclude cut with ID _13938_210_9988_1_1530518718978_3693960_256-54454-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 16:22:44,213 WARNING [train.py:1204] (2/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_256-32662-0 from training. Duration: 0.9 2023-10-02 16:22:44,242 WARNING [train.py:1204] (2/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_326-17061-0 from training. Duration: 0.94 2023-10-02 16:22:44,286 WARNING [train.py:1204] (2/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_505-97987-0_sp1.1 from training. Duration: 0.7818125 2023-10-02 16:22:46,342 WARNING [train.py:1204] (2/4) Exclude cut with ID _66873_210_5517_1_1535454136506_4293370_316-294364-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 16:22:47,658 WARNING [train.py:1204] (2/4) Exclude cut with ID _19514_210_9259_1_1531530071330_5084230_762-231063-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 16:22:49,024 WARNING [train.py:1204] (2/4) Exclude cut with ID _4697_210_2069_1_1526815801953_3823098_68-219807-0_sp0.9 from training. Duration: 0.52225 2023-10-02 16:22:49,066 WARNING [train.py:1204] (2/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_479-158053-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 16:22:53,145 WARNING [train.py:1204] (2/4) Exclude cut with ID _66112_210_10635_1_1535590236305_3801484_83-38181-0_sp1.1 from training. Duration: 0.936375 2023-10-02 16:22:53,160 WARNING [train.py:1204] (2/4) Exclude cut with ID _55857_210_2346_1_1534644007831_6898209_255-146346-0_sp1.1 from training. Duration: 0.963625 2023-10-02 16:22:53,285 WARNING [train.py:1204] (2/4) Exclude cut with ID _48836_210_12210_1_1534075848293_3904150_429-35475-0 from training. Duration: 0.89 2023-10-02 16:22:54,522 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_50154_210_13731_1_1534750862_1080961_119-187163-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 16:22:55,968 WARNING [train.py:1204] (2/4) Exclude cut with ID _8793_210_6310_1_1528542985139_2310009_121-179782-0_sp0.9 from training. Duration: 0.888875 2023-10-02 16:22:55,978 WARNING [train.py:1204] (2/4) Exclude cut with ID _14884_210_2252_1_1530707459689_5561250_591-173406-0 from training. Duration: 0.96 2023-10-02 16:22:57,200 INFO [train.py:1046] (2/4) Epoch 27, batch 3300, loss[loss=0.1774, simple_loss=0.2494, pruned_loss=0.05266, over 22773.00 frames. ], tot_loss[loss=0.1669, simple_loss=0.245, pruned_loss=0.04439, over 4714232.91 frames. ], batch size: 322, lr: 3.78e-03, grad_scale: 16.0 2023-10-02 16:22:57,409 WARNING [train.py:1204] (2/4) Exclude cut with ID _15133_210_6228_1_1530794512074_3080379_54-198193-0_sp1.1 from training. Duration: 0.8545625 2023-10-02 16:22:58,715 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6545_210_2040_1_1527774670_4245258_407-48070-0 from training. Duration: 0.76 2023-10-02 16:23:00,293 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1032-236863-0 from training. Duration: 0.84 2023-10-02 16:23:00,502 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.ff2_skip_rate, batch_count=942766.6666666666, ans=0.0 2023-10-02 16:23:01,669 WARNING [train.py:1204] (2/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_507-303370-0 from training. Duration: 0.96 2023-10-02 16:23:01,697 WARNING [train.py:1204] (2/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_164-282366-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 16:23:06,273 WARNING [train.py:1204] (2/4) Exclude cut with ID _39867_210_9826_1_1533553222030_9344259_312-158727-0_sp1.1 from training. Duration: 0.963625 2023-10-02 16:23:06,352 WARNING [train.py:1204] (2/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_174-205058-0_sp1.1 from training. Duration: 0.863625 2023-10-02 16:23:06,378 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_11305_210_1794_1_1529750428_7178148_166-237358-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 16:23:07,884 WARNING [train.py:1204] (2/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_26-175553-0_sp1.1 from training. Duration: 0.5545625 2023-10-02 16:23:07,904 WARNING [train.py:1204] (2/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_96-290039-0_sp1.1 from training. Duration: 0.5090625 2023-10-02 16:23:10,684 WARNING [train.py:1204] (2/4) Exclude cut with ID _53219_210_4525_1_1534834836008_4064825_261-101691-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 16:23:10,978 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.self_attn_weights.pos_emb_skip_rate, batch_count=942833.3333333334, ans=0.0 2023-10-02 16:23:12,627 WARNING [train.py:1204] (2/4) Exclude cut with ID _24271_210_12658_1_1531987270022_7190519_96-293390-0_sp1.1 from training. Duration: 0.936375 2023-10-02 16:23:15,973 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_271-234962-0 from training. Duration: 0.79 2023-10-02 16:23:16,024 WARNING [train.py:1204] (2/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_136-30660-0_sp1.1 from training. Duration: 0.97275 2023-10-02 16:23:17,277 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_20120_210_6753_1_1531643035_5482859_625-281046-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 16:23:18,644 WARNING [train.py:1204] (2/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_333-168193-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 16:23:19,960 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9029W0269-509664 from training. Duration: 0.896 2023-10-02 16:23:20,049 WARNING [train.py:1204] (2/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_450-147393-0_sp1.1 from training. Duration: 0.92725 2023-10-02 16:23:21,381 WARNING [train.py:1204] (2/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_643-52007-0_sp1.1 from training. Duration: 0.5181875 2023-10-02 16:23:22,728 WARNING [train.py:1204] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_330-327861-0_sp0.9 from training. Duration: 0.7444375 2023-10-02 16:23:22,743 WARNING [train.py:1204] (2/4) Exclude cut with ID _47790_210_16187_1_1533862814066_4207519_260-317651-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 16:23:22,759 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9044W0010-516654 from training. Duration: 0.896 2023-10-02 16:23:26,928 WARNING [train.py:1204] (2/4) Exclude cut with ID _14087_210_2253_1_1530529224512_3014029_265-118378-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 16:23:26,934 WARNING [train.py:1204] (2/4) Exclude cut with ID _6194_210_1534_1_1527511826160_3941194_321-148641-0_sp0.9 from training. Duration: 0.711125 2023-10-02 16:23:29,694 WARNING [train.py:1204] (2/4) Exclude cut with ID _54871_210_22356_1_1534665798253_3856359_101-169176-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 16:23:29,697 WARNING [train.py:1204] (2/4) Exclude cut with ID 497-129325-0061-62254-0_sp1.1 from training. Duration: 0.97725 2023-10-02 16:23:31,081 WARNING [train.py:1204] (2/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_795-33252-0_sp1.1 from training. Duration: 0.336375 2023-10-02 16:23:31,094 WARNING [train.py:1204] (2/4) Exclude cut with ID _28317_210_12742_1_1532480302381_8510325_168-246176-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 16:23:33,087 WARNING [train.py:1204] (2/4) Exclude cut with ID _7160_210_1876_1_1527940923231_3703799_120-150943-0_sp1.1 from training. Duration: 0.736375 2023-10-02 16:23:34,438 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9084W0005-536844 from training. Duration: 0.768 2023-10-02 16:23:35,921 WARNING [train.py:1204] (2/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_234-260682-0 from training. Duration: 0.84 2023-10-02 16:23:37,215 WARNING [train.py:1204] (2/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_188-114464-0_sp0.9 from training. Duration: 0.911125 2023-10-02 16:23:38,670 WARNING [train.py:1204] (2/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_81-75849-0 from training. Duration: 0.39 2023-10-02 16:23:40,192 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.conv_skip_rate, batch_count=942966.6666666666, ans=0.0 2023-10-02 16:23:41,398 WARNING [train.py:1204] (2/4) Exclude cut with ID _14224_210_4381_1_1530529062037_4264010_379-184223-0_sp1.1 from training. Duration: 0.77275 2023-10-02 16:23:42,888 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.bypass.skip_rate, batch_count=942966.6666666666, ans=0.04949747468305833 2023-10-02 16:23:44,210 WARNING [train.py:1204] (2/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_454-123002-0_sp1.1 from training. Duration: 0.62725 2023-10-02 16:23:45,479 WARNING [train.py:1204] (2/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_887-249555-0_sp1.1 from training. Duration: 0.7 2023-10-02 16:23:47,537 WARNING [train.py:1204] (2/4) Exclude cut with ID _31088_210_16446_1_1532520012284_3440919_20-260902-0_sp1.1 from training. Duration: 0.97275 2023-10-02 16:23:48,872 WARNING [train.py:1204] (2/4) Exclude cut with ID _45329_210_7005_1_1534157951211_6774339_856-344574-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 16:23:48,874 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_36756_210_7234_1_1533026991_966276_4-164118-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 16:23:48,904 WARNING [train.py:1204] (2/4) Exclude cut with ID _8365_210_2064_1_1528592311070_4677690_371-122295-0_sp1.1 from training. Duration: 0.5 2023-10-02 16:23:49,123 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.conv_module1.balancer1.prob, batch_count=942966.6666666666, ans=0.125 2023-10-02 16:23:50,366 WARNING [train.py:1204] (2/4) Exclude cut with ID _5910_210_1389_1_1527337972454_5050100_555-159815-0_sp1.1 from training. Duration: 0.8909375 2023-10-02 16:23:50,375 WARNING [train.py:1204] (2/4) Exclude cut with ID _37086_210_9038_1_1532999540891_7021250_469-147941-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 16:23:51,769 WARNING [train.py:1204] (2/4) Exclude cut with ID _3067_210_2667_1_1526212974640_4503168_459-173867-0_sp0.9 from training. Duration: 0.97775 2023-10-02 16:23:51,878 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9156W0006-572499 from training. Duration: 0.896 2023-10-02 16:23:53,216 WARNING [train.py:1204] (2/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_277-210798-0 from training. Duration: 0.55 2023-10-02 16:23:54,393 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.582e+02 1.925e+02 2.169e+02 2.567e+02 3.728e+02, threshold=4.338e+02, percent-clipped=0.0 2023-10-02 16:23:54,534 WARNING [train.py:1204] (2/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_103-336064-0_sp1.1 from training. Duration: 0.463625 2023-10-02 16:23:55,874 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_3414_210_1988_1_1526212718_4813195_331-290945-0_sp1.1 from training. Duration: 0.936375 2023-10-02 16:23:55,875 WARNING [train.py:1204] (2/4) Exclude cut with ID _67499_210_17282_1_1535509805220_3692430_365-29276-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 16:23:57,417 WARNING [train.py:1204] (2/4) Exclude cut with ID _14821_210_8750_1_1530680503991_6829359_69-250847-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 16:23:57,425 WARNING [train.py:1204] (2/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_1102-61301-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 16:23:58,702 WARNING [train.py:1204] (2/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_166-246557-0_sp1.1 from training. Duration: 0.5818125 2023-10-02 16:23:58,758 WARNING [train.py:1204] (2/4) Exclude cut with ID _15351_210_10626_1_1530856635653_3576210_214-129918-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 16:23:58,771 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_120-41668-0_sp0.9 from training. Duration: 0.77775 2023-10-02 16:24:00,092 WARNING [train.py:1204] (2/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_27-72278-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 16:24:03,248 WARNING [train.py:1204] (2/4) Exclude cut with ID _24314_210_4915_1_1532135078116_7170179_521-258868-0_sp1.1 from training. Duration: 0.7181875 2023-10-02 16:24:05,990 WARNING [train.py:1204] (2/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_143-86344-0 from training. Duration: 0.77 2023-10-02 16:24:06,025 WARNING [train.py:1204] (2/4) Exclude cut with ID _53489_210_6228_1_1534298357169_3835970_465-157433-0_sp1.1 from training. Duration: 0.92725 2023-10-02 16:24:07,401 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_248-319353-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 16:24:08,807 WARNING [train.py:1204] (2/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_102-77064-0_sp1.1 from training. Duration: 0.8454375 2023-10-02 16:24:08,854 WARNING [train.py:1204] (2/4) Exclude cut with ID _9389_210_1664_1_1529217337301_5289269_379-128794-0_sp1.1 from training. Duration: 0.77275 2023-10-02 16:24:10,144 INFO [train.py:1046] (2/4) Epoch 27, batch 3350, loss[loss=0.168, simple_loss=0.2511, pruned_loss=0.04249, over 24480.00 frames. ], tot_loss[loss=0.1682, simple_loss=0.2461, pruned_loss=0.04513, over 4715628.10 frames. ], batch size: 63, lr: 3.78e-03, grad_scale: 16.0 2023-10-02 16:24:10,222 WARNING [train.py:1204] (2/4) Exclude cut with ID _3231_210_2106_1_1526205864934_6784220_236-260835-0_sp1.1 from training. Duration: 0.97275 2023-10-02 16:24:11,615 WARNING [train.py:1204] (2/4) Exclude cut with ID _24271_210_12658_1_1531987270022_7190519_184-342363-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 16:24:11,616 WARNING [train.py:1204] (2/4) Exclude cut with ID _27894_210_12392_1_1532415496078_6902439_67-221663-0_sp1.1 from training. Duration: 0.92725 2023-10-02 16:24:11,817 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.balancer2.prob, batch_count=943100.0, ans=0.125 2023-10-02 16:24:14,745 WARNING [train.py:1204] (2/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_304-59227-0_sp1.1 from training. Duration: 0.7 2023-10-02 16:24:16,223 WARNING [train.py:1204] (2/4) Exclude cut with ID _58605_210_12210_1_1534723195939_3904259_458-42232-0_sp1.1 from training. Duration: 0.92725 2023-10-02 16:24:18,211 WARNING [train.py:1204] (2/4) Exclude cut with ID _14922_210_6228_1_1530707435273_3693310_13-165512-0_sp0.9 from training. Duration: 0.911125 2023-10-02 16:24:21,059 WARNING [train.py:1204] (2/4) Exclude cut with ID _58602_210_2346_1_1534903210085_6908830_792-102241-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 16:24:23,733 WARNING [train.py:1204] (2/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_588-125165-0_sp0.9 from training. Duration: 0.92225 2023-10-02 16:24:23,840 WARNING [train.py:1204] (2/4) Exclude cut with ID _54218_210_12210_1_1534413646388_4409259_239-98429-0_sp1.1 from training. Duration: 0.97275 2023-10-02 16:24:25,174 WARNING [train.py:1204] (2/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_538-32291-0_sp1.1 from training. Duration: 0.7545625 2023-10-02 16:24:26,518 WARNING [train.py:1204] (2/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_419-317062-0 from training. Duration: 0.91 2023-10-02 16:24:27,922 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9309W0202-640329 from training. Duration: 0.896 2023-10-02 16:24:27,956 WARNING [train.py:1204] (2/4) Exclude cut with ID _3231_210_2106_1_1526205864934_6784220_419-313291-0_sp1.1 from training. Duration: 0.97275 2023-10-02 16:24:30,687 WARNING [train.py:1204] (2/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_619-344499-0 from training. Duration: 0.63 2023-10-02 16:24:30,708 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_278-272612-0 from training. Duration: 0.73 2023-10-02 16:24:32,756 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_103-84010-0_sp0.9 from training. Duration: 0.7666875 2023-10-02 16:24:32,786 WARNING [train.py:1204] (2/4) Exclude cut with ID _26528_210_9070_1_1532570410842_3851310_497-97218-0_sp1.1 from training. Duration: 0.7818125 2023-10-02 16:24:35,442 WARNING [train.py:1204] (2/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_262-84613-0_sp1.1 from training. Duration: 0.936375 2023-10-02 16:24:35,481 WARNING [train.py:1204] (2/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_387-131538-0 from training. Duration: 0.81 2023-10-02 16:24:35,513 WARNING [train.py:1204] (2/4) Exclude cut with ID _8030_210_7497_1_1528444886781_3531089_224-300489-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 16:24:35,535 WARNING [train.py:1204] (2/4) Exclude cut with ID _14349_210_8341_1_1530615710818_3442470_170-336541-0_sp0.9 from training. Duration: 0.9555625 2023-10-02 16:24:38,781 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_47069_210_15368_1_1533781267_4431320_180-31746-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 16:24:40,312 WARNING [train.py:1204] (2/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_447-7041-0_sp1.1 from training. Duration: 0.92725 2023-10-02 16:24:41,001 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.3.encoder.layers.0.nonlin_attention.whiten2, num_groups=1, num_channels=512, metric=7.76 vs. limit=15.0 2023-10-02 16:24:41,630 WARNING [train.py:1204] (2/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_2-55283-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 16:24:43,033 WARNING [train.py:1204] (2/4) Exclude cut with ID _12555_210_8750_1_1530075594402_3780239_70-181911-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 16:24:45,865 WARNING [train.py:1204] (2/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_311-328022-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 16:24:47,923 WARNING [train.py:1204] (2/4) Exclude cut with ID _14528_210_8254_1_1530669700514_4306939_330-2987-0_sp1.1 from training. Duration: 0.92725 2023-10-02 16:24:49,252 WARNING [train.py:1204] (2/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_385-184858-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 16:24:49,924 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.4.encoder.layers.0.nonlin_attention.whiten2, num_groups=1, num_channels=384, metric=10.47 vs. limit=15.0 2023-10-02 16:24:52,432 WARNING [train.py:1204] (2/4) Exclude cut with ID _64414_210_17301_1_1535243252048_3757759_348-333360-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 16:24:52,497 WARNING [train.py:1204] (2/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_753-214751-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 16:24:52,730 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.balancer1.prob, batch_count=943233.3333333334, ans=0.125 2023-10-02 16:24:55,292 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_17613_210_2151_1_1531137569_2366160_202-208013-0_sp1.1 from training. Duration: 0.92725 2023-10-02 16:24:56,558 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_43831_210_2158_1_1533725502_5872791_610-129300-0_sp1.1 from training. Duration: 0.936375 2023-10-02 16:24:56,785 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=943300.0, ans=0.1 2023-10-02 16:24:57,963 WARNING [train.py:1204] (2/4) Exclude cut with ID _54218_210_12210_1_1534413646388_4409259_321-23852-0_sp1.1 from training. Duration: 0.936375 2023-10-02 16:25:00,704 WARNING [train.py:1204] (2/4) Exclude cut with ID _39411_210_5986_1_1533538825030_3343649_173-238248-0 from training. Duration: 0.83 2023-10-02 16:25:00,710 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_405-339986-0_sp0.9 from training. Duration: 0.5666875 2023-10-02 16:25:02,029 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_2009_210_2071_1_1525481134_4588937_385-302100-0 from training. Duration: 0.87 2023-10-02 16:25:02,060 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_161-276996-0_sp1.1 from training. Duration: 0.863625 2023-10-02 16:25:02,154 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_177-58005-0 from training. Duration: 0.62 2023-10-02 16:25:03,589 WARNING [train.py:1204] (2/4) Exclude cut with ID _64639_210_5816_1_1535367619437_3528000_146-343734-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 16:25:05,489 WARNING [train.py:1204] (2/4) Exclude cut with ID _3845_210_1390_1_1526731326141_3523610_72-61441-0_sp1.1 from training. Duration: 0.92725 2023-10-02 16:25:11,392 WARNING [train.py:1204] (2/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_991-125028-0_sp1.1 from training. Duration: 0.936375 2023-10-02 16:25:11,453 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7544_210_6753_1_1528591327_1471426_172-348777-0 from training. Duration: 0.66 2023-10-02 16:25:11,606 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.ff3_skip_rate, batch_count=943366.6666666666, ans=0.0 2023-10-02 16:25:12,744 WARNING [train.py:1204] (2/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_416-257730-0_sp1.1 from training. Duration: 0.5909375 2023-10-02 16:25:14,155 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1113-7369-0_sp1.1 from training. Duration: 0.77275 2023-10-02 16:25:15,581 WARNING [train.py:1204] (2/4) Exclude cut with ID _40402_210_18219_1_1533522324115_2953150_242-76943-0_sp1.1 from training. Duration: 0.8545625 2023-10-02 16:25:18,967 WARNING [train.py:1204] (2/4) Exclude cut with ID _44718_210_5816_1_1534050051869_6469030_190-49648-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 16:25:20,518 WARNING [train.py:1204] (2/4) Exclude cut with ID _47251_210_18628_1_1533814063128_4137250_214-88076-0 from training. Duration: 0.88 2023-10-02 16:25:21,898 WARNING [train.py:1204] (2/4) Exclude cut with ID _5585_210_2667_1_1527418813324_8007340_89-332072-0_sp1.1 from training. Duration: 0.5818125 2023-10-02 16:25:21,926 WARNING [train.py:1204] (2/4) Exclude cut with ID _41817_210_18835_1_1533434477160_7423049_555-293448-0_sp1.1 from training. Duration: 0.72725 2023-10-02 16:25:23,293 WARNING [train.py:1204] (2/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_286-331314-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 16:25:23,344 WARNING [train.py:1204] (2/4) Exclude cut with ID _13921_210_9994_1_1530838702800_3003429_46-149444-0 from training. Duration: 0.94 2023-10-02 16:25:24,638 INFO [train.py:1046] (2/4) Epoch 27, batch 3400, loss[loss=0.1632, simple_loss=0.2541, pruned_loss=0.03621, over 24449.00 frames. ], tot_loss[loss=0.1686, simple_loss=0.2468, pruned_loss=0.04518, over 4720963.79 frames. ], batch size: 69, lr: 3.78e-03, grad_scale: 16.0 2023-10-02 16:25:24,679 WARNING [train.py:1204] (2/4) Exclude cut with ID _44778_210_2054_1_1533798127203_7443090_768-43595-0_sp1.1 from training. Duration: 0.936375 2023-10-02 16:25:24,688 WARNING [train.py:1204] (2/4) Exclude cut with ID _35355_210_16040_1_1533212988977_4016229_74-190076-0 from training. Duration: 0.8 2023-10-02 16:25:26,049 WARNING [train.py:1204] (2/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_1007-316380-0_sp1.1 from training. Duration: 0.963625 2023-10-02 16:25:26,087 WARNING [train.py:1204] (2/4) Exclude cut with ID _23557_210_8750_1_1531890106370_6846255_150-283437-0_sp1.1 from training. Duration: 0.963625 2023-10-02 16:25:27,337 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_3474_210_2028_1_1526188513_4825246_145-309131-0_sp1.1 from training. Duration: 0.67275 2023-10-02 16:25:28,688 WARNING [train.py:1204] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_391-22148-0_sp1.1 from training. Duration: 0.7 2023-10-02 16:25:28,716 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_238-256668-0 from training. Duration: 0.69 2023-10-02 16:25:32,824 WARNING [train.py:1204] (2/4) Exclude cut with ID _8394_210_6702_1_1528590504316_3540189_372-198878-0 from training. Duration: 0.6 2023-10-02 16:25:32,842 WARNING [train.py:1204] (2/4) Exclude cut with ID ID0174W0006-766659 from training. Duration: 0.953 2023-10-02 16:25:32,865 WARNING [train.py:1204] (2/4) Exclude cut with ID _6274_210_2084_1_1527937583837_7494020_668-23019-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 16:25:37,606 WARNING [train.py:1204] (2/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_322-74328-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 16:25:37,621 WARNING [train.py:1204] (2/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_872-264525-0_sp1.1 from training. Duration: 0.5909375 2023-10-02 16:25:37,676 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_53378_210_17282_1_1534221071_5769509_641-286201-0_sp1.1 from training. Duration: 0.97275 2023-10-02 16:25:40,378 WARNING [train.py:1204] (2/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_199-295611-0_sp1.1 from training. Duration: 0.5 2023-10-02 16:25:45,885 WARNING [train.py:1204] (2/4) Exclude cut with ID _5250_210_5219_1_1527249561806_8692110_1270-317971-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 16:25:49,056 WARNING [train.py:1204] (2/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_784-213099-0 from training. Duration: 0.93 2023-10-02 16:25:51,816 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.3.encoder.layers.0.feed_forward2.out_whiten, num_groups=1, num_channels=512, metric=6.53 vs. limit=15.0 2023-10-02 16:25:52,703 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_115-233929-0_sp0.9 from training. Duration: 0.8 2023-10-02 16:25:55,379 WARNING [train.py:1204] (2/4) Exclude cut with ID _54871_210_22356_1_1534665798253_3856359_256-223809-0_sp1.1 from training. Duration: 0.97275 2023-10-02 16:25:55,430 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_20120_210_6753_1_1531643035_5482859_285-87781-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 16:25:56,810 WARNING [train.py:1204] (2/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_619-344499-0_sp1.1 from training. Duration: 0.57275 2023-10-02 16:26:02,263 WARNING [train.py:1204] (2/4) Exclude cut with ID _60229_210_27767_1_1534838406158_3655350_264-279573-0_sp0.9 from training. Duration: 0.92225 2023-10-02 16:26:03,852 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.conv_module1.balancer1.prob, batch_count=943566.6666666666, ans=0.125 2023-10-02 16:26:05,109 WARNING [train.py:1204] (2/4) Exclude cut with ID _7719_210_3525_1_1528456153163_4296960_333-110399-0 from training. Duration: 0.96 2023-10-02 16:26:12,788 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_50423_210_13270_1_1534128598_5021734_759-90833-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 16:26:12,929 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.0.feed_forward1.hidden_balancer.prob, batch_count=943633.3333333334, ans=0.125 2023-10-02 16:26:13,019 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.conv_module1.balancer1.prob, batch_count=943633.3333333334, ans=0.125 2023-10-02 16:26:14,100 WARNING [train.py:1204] (2/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_151-254798-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 16:26:14,134 WARNING [train.py:1204] (2/4) Exclude cut with ID _3465_210_3525_1_1526175667060_3574049_450-203637-0 from training. Duration: 0.92 2023-10-02 16:26:15,354 WARNING [train.py:1204] (2/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_673-36456-0_sp1.1 from training. Duration: 0.936375 2023-10-02 16:26:15,405 WARNING [train.py:1204] (2/4) Exclude cut with ID _36538_210_9994_1_1533204020363_3795620_131-8685-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 16:26:15,456 WARNING [train.py:1204] (2/4) Exclude cut with ID _12556_210_8341_1_1530081024100_7327690_713-55626-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 16:26:15,488 WARNING [train.py:1204] (2/4) Exclude cut with ID _2322_210_2667_1_1525604548971_3656299_440-80494-0_sp0.9 from training. Duration: 0.97775 2023-10-02 16:26:20,067 WARNING [train.py:1204] (2/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_210-325081-0_sp1.1 from training. Duration: 0.97275 2023-10-02 16:26:21,335 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.566e+02 1.815e+02 1.942e+02 2.172e+02 3.090e+02, threshold=3.885e+02, percent-clipped=0.0 2023-10-02 16:26:22,907 WARNING [train.py:1204] (2/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_375-244976-0_sp0.9 from training. Duration: 0.8333125 2023-10-02 16:26:22,920 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_15517_210_6753_1_1530952293_1664795_119-31001-0_sp1.1 from training. Duration: 0.8545625 2023-10-02 16:26:27,783 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_41452_210_7291_1_1533437687_3941281_319-42326-0_sp1.1 from training. Duration: 0.92725 2023-10-02 16:26:28,032 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.self_attn_weights.pos_emb_skip_rate, batch_count=943700.0, ans=0.0 2023-10-02 16:26:30,398 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_372-173012-0 from training. Duration: 0.95 2023-10-02 16:26:34,639 WARNING [train.py:1204] (2/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_41-31157-0_sp1.1 from training. Duration: 0.5181875 2023-10-02 16:26:37,399 INFO [train.py:1046] (2/4) Epoch 27, batch 3450, loss[loss=0.1587, simple_loss=0.2304, pruned_loss=0.04353, over 23843.00 frames. ], tot_loss[loss=0.1683, simple_loss=0.2465, pruned_loss=0.04506, over 4723538.86 frames. ], batch size: 195, lr: 3.78e-03, grad_scale: 16.0 2023-10-02 16:26:39,249 WARNING [train.py:1204] (2/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_91-212486-0 from training. Duration: 0.68 2023-10-02 16:26:42,577 WARNING [train.py:1204] (2/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_1267-272693-0 from training. Duration: 0.73 2023-10-02 16:26:42,609 WARNING [train.py:1204] (2/4) Exclude cut with ID _13938_210_9988_1_1530518718978_3693960_152-67046-0_sp1.1 from training. Duration: 0.936375 2023-10-02 16:26:44,018 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_4528_210_3273_1_1527076654_3276777_342-168068-0_sp1.1 from training. Duration: 0.7909375 2023-10-02 16:26:44,035 WARNING [train.py:1204] (2/4) Exclude cut with ID _7606_210_4915_1_1528894253277_5080690_127-177761-0 from training. Duration: 0.64 2023-10-02 16:26:45,395 WARNING [train.py:1204] (2/4) Exclude cut with ID _45329_210_7005_1_1534157951211_6774339_335-137972-0_sp1.1 from training. Duration: 0.92725 2023-10-02 16:26:50,141 WARNING [train.py:1204] (2/4) Exclude cut with ID _5166_210_2084_1_1527073197160_4087993_331-117829-0_sp1.1 from training. Duration: 0.563625 2023-10-02 16:26:56,288 WARNING [train.py:1204] (2/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_125-125818-0_sp1.1 from training. Duration: 0.87275 2023-10-02 16:26:56,344 WARNING [train.py:1204] (2/4) Exclude cut with ID _6789_210_1534_1_1527854471671_2870519_48-294897-0_sp1.1 from training. Duration: 0.963625 2023-10-02 16:26:56,427 WARNING [train.py:1204] (2/4) Exclude cut with ID _6694_210_4599_1_1527852637490_3965529_116-109398-0_sp1.1 from training. Duration: 0.663625 2023-10-02 16:26:57,714 WARNING [train.py:1204] (2/4) Exclude cut with ID _52892_210_6721_1_1534378941259_4124129_222-172685-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 16:26:59,151 WARNING [train.py:1204] (2/4) Exclude cut with ID _50655_210_18628_1_1534229942193_3772154_320-132275-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 16:27:04,583 WARNING [train.py:1204] (2/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_76-311963-0 from training. Duration: 0.7 2023-10-02 16:27:09,436 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.feed_forward2.hidden_balancer.prob, batch_count=943900.0, ans=0.125 2023-10-02 16:27:10,485 WARNING [train.py:1204] (2/4) Exclude cut with ID _6274_210_2084_1_1527937583837_7494020_32-205288-0 from training. Duration: 0.78 2023-10-02 16:27:10,503 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_86-16881-0_sp0.9 from training. Duration: 0.6444375 2023-10-02 16:27:11,886 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_271-297505-0_sp1.1 from training. Duration: 0.9 2023-10-02 16:27:11,991 WARNING [train.py:1204] (2/4) Exclude cut with ID _57572_210_5517_1_1534921219624_3809484_187-295293-0_sp1.1 from training. Duration: 0.963625 2023-10-02 16:27:18,750 WARNING [train.py:1204] (2/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_563-329943-0 from training. Duration: 0.99 2023-10-02 16:27:18,828 WARNING [train.py:1204] (2/4) Exclude cut with ID _7160_210_1876_1_1527940923231_3703799_302-150728-0_sp0.9 from training. Duration: 0.8666875 2023-10-02 16:27:21,652 WARNING [train.py:1204] (2/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_926-7526-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 16:27:21,676 WARNING [train.py:1204] (2/4) Exclude cut with ID _62574_210_2655_1_1535180407058_3933249_467-22348-0_sp1.1 from training. Duration: 0.936375 2023-10-02 16:27:23,219 WARNING [train.py:1204] (2/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_280-161633-0_sp0.9 from training. Duration: 0.811125 2023-10-02 16:27:24,997 WARNING [train.py:1204] (2/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_385-193052-0_sp1.1 from training. Duration: 0.7818125 2023-10-02 16:27:26,449 WARNING [train.py:1204] (2/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_444-28041-0 from training. Duration: 0.85 2023-10-02 16:27:26,454 WARNING [train.py:1204] (2/4) Exclude cut with ID _66070_210_7640_1_1535441393096_7805310_341-249237-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 16:27:27,797 WARNING [train.py:1204] (2/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_425-269826-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 16:27:29,836 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.2.encoder.layers.2.self_attn1.whiten, num_groups=1, num_channels=384, metric=19.16 vs. limit=22.5 2023-10-02 16:27:30,849 WARNING [train.py:1204] (2/4) Exclude cut with ID _53950_210_5816_1_1534417264919_3862951_124-185597-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 16:27:33,649 WARNING [train.py:1204] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_380-204756-0 from training. Duration: 0.86 2023-10-02 16:27:36,549 WARNING [train.py:1204] (2/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_282-257791-0_sp0.9 from training. Duration: 0.9555625 2023-10-02 16:27:38,740 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.feed_forward1.hidden_balancer.prob, batch_count=944033.3333333334, ans=0.125 2023-10-02 16:27:41,234 WARNING [train.py:1204] (2/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_372-131163-0_sp1.1 from training. Duration: 0.8545625 2023-10-02 16:27:42,667 WARNING [train.py:1204] (2/4) Exclude cut with ID _28317_210_12742_1_1532480302381_8510325_447-233513-0_sp1.1 from training. Duration: 0.963625 2023-10-02 16:27:46,224 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_54-118333-0_sp1.1 from training. Duration: 0.97275 2023-10-02 16:27:50,423 WARNING [train.py:1204] (2/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_388-230239-0_sp1.1 from training. Duration: 0.963625 2023-10-02 16:27:50,441 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_40047_210_2290_1_1533290519_2374311_141-54639-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 16:27:50,499 WARNING [train.py:1204] (2/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_91-267696-0_sp0.9 from training. Duration: 0.9666875 2023-10-02 16:27:51,785 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_14056_210_4163_1_1530604483_7572827_532-137847-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 16:27:53,140 INFO [train.py:1046] (2/4) Epoch 27, batch 3500, loss[loss=0.1491, simple_loss=0.2127, pruned_loss=0.04279, over 23464.00 frames. ], tot_loss[loss=0.1669, simple_loss=0.2443, pruned_loss=0.04477, over 4707499.51 frames. ], batch size: 285, lr: 3.78e-03, grad_scale: 16.0 2023-10-02 16:27:53,335 WARNING [train.py:1204] (2/4) Exclude cut with ID _8064_210_5399_1_1528372299291_4282973_155-283231-0_sp1.1 from training. Duration: 0.97275 2023-10-02 16:27:56,695 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_2245_210_2715_1_1525511548_3540254_310-52483-0_sp1.1 from training. Duration: 0.8 2023-10-02 16:27:57,926 WARNING [train.py:1204] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_185-174805-0 from training. Duration: 0.68 2023-10-02 16:27:59,280 WARNING [train.py:1204] (2/4) Exclude cut with ID _15012_210_1968_1_1530856820429_5095350_662-210469-0_sp1.1 from training. Duration: 0.5181875 2023-10-02 16:28:01,886 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_426-313557-0_sp0.9 from training. Duration: 0.588875 2023-10-02 16:28:05,019 WARNING [train.py:1204] (2/4) Exclude cut with ID _42326_210_7770_1_1533814282531_3707269_407-204343-0_sp1.1 from training. Duration: 0.97275 2023-10-02 16:28:05,028 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_258-310570-0 from training. Duration: 0.82 2023-10-02 16:28:09,240 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_4737_210_2040_1_1526998689_3293008_378-178650-0_sp1.1 from training. Duration: 0.863625 2023-10-02 16:28:11,939 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_47896_210_20508_1_1534492518_5958868_432-327451-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 16:28:12,045 WARNING [train.py:1204] (2/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_188-114464-0_sp1.1 from training. Duration: 0.7454375 2023-10-02 16:28:12,053 WARNING [train.py:1204] (2/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_798-106179-0_sp1.1 from training. Duration: 0.7545625 2023-10-02 16:28:13,362 WARNING [train.py:1204] (2/4) Exclude cut with ID _14368_210_4220_1_1530532714870_3595529_143-117421-0_sp1.1 from training. Duration: 0.52725 2023-10-02 16:28:13,399 WARNING [train.py:1204] (2/4) Exclude cut with ID _67499_210_17282_1_1535509805220_3692430_361-78786-0_sp1.1 from training. Duration: 0.963625 2023-10-02 16:28:13,439 WARNING [train.py:1204] (2/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_712-342563-0_sp1.1 from training. Duration: 0.8818125 2023-10-02 16:28:13,692 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.conv_module2.balancer2.prob, batch_count=944166.6666666666, ans=0.125 2023-10-02 16:28:15,379 WARNING [train.py:1204] (2/4) Exclude cut with ID _9235_210_5219_1_1528891154253_8046120_387-159008-0 from training. Duration: 0.86 2023-10-02 16:28:18,117 WARNING [train.py:1204] (2/4) Exclude cut with ID _33640_210_2655_1_1533117605479_4855469_959-18820-0_sp1.1 from training. Duration: 0.963625 2023-10-02 16:28:18,168 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_12868_210_6753_1_1530702479_3610564_133-564-0_sp0.9 from training. Duration: 0.87775 2023-10-02 16:28:19,529 WARNING [train.py:1204] (2/4) Exclude cut with ID _23514_210_1389_1_1531832139785_3031439_12-29287-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 16:28:22,404 WARNING [train.py:1204] (2/4) Exclude cut with ID _23514_210_1389_1_1531832139785_3031439_489-185052-0_sp1.1 from training. Duration: 0.963625 2023-10-02 16:28:22,469 WARNING [train.py:1204] (2/4) Exclude cut with ID _5444_210_2151_1_1527418808464_5312210_634-194174-0 from training. Duration: 0.97 2023-10-02 16:28:22,494 WARNING [train.py:1204] (2/4) Exclude cut with ID _6275_210_2084_1_1528110062213_7669049_91-26919-0_sp1.1 from training. Duration: 0.8818125 2023-10-02 16:28:25,769 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_37351_210_7925_1_1533012931_3812113_516-82303-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 16:28:26,016 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.feed_forward1.out_proj.dropout_p, batch_count=944233.3333333334, ans=0.1 2023-10-02 16:28:27,150 WARNING [train.py:1204] (2/4) Exclude cut with ID _41314_210_12651_1_1533459476543_3766460_80-74184-0_sp0.9 from training. Duration: 0.92225 2023-10-02 16:28:28,497 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_9460_210_4211_1_1528954587_10049667_495-202963-0_sp1.1 from training. Duration: 0.963625 2023-10-02 16:28:29,876 WARNING [train.py:1204] (2/4) Exclude cut with ID _14632_210_9988_1_1530691279392_3923200_332-105904-0_sp1.1 from training. Duration: 0.8181875 2023-10-02 16:28:29,902 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_40764_210_1925_1_1533882359_3752345_493-167886-0_sp1.1 from training. Duration: 0.92725 2023-10-02 16:28:32,978 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_196-289637-0 from training. Duration: 0.75 2023-10-02 16:28:34,381 WARNING [train.py:1204] (2/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_26-175553-0 from training. Duration: 0.61 2023-10-02 16:28:34,439 WARNING [train.py:1204] (2/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_890-5117-0 from training. Duration: 0.78 2023-10-02 16:28:34,663 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.self_attn_weights.pos_emb_skip_rate, batch_count=944233.3333333334, ans=0.0 2023-10-02 16:28:35,738 WARNING [train.py:1204] (2/4) Exclude cut with ID _41314_210_12651_1_1533459476543_3766460_206-306860-0_sp1.1 from training. Duration: 0.92725 2023-10-02 16:28:37,196 WARNING [train.py:1204] (2/4) Exclude cut with ID _25882_210_3036_1_1532775665815_6479510_570-79824-0_sp1.1 from training. Duration: 0.963625 2023-10-02 16:28:37,425 INFO [scaling.py:1118] (2/4) WithLoss: name=encoder.encoders.2.encoder.layers.1.self_attn_weights, loss-sum=0.000e+00 2023-10-02 16:28:38,632 WARNING [train.py:1204] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_731-2428-0_sp1.1 from training. Duration: 0.7545625 2023-10-02 16:28:38,659 WARNING [train.py:1204] (2/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_138-154812-0_sp0.9 from training. Duration: 0.8666875 2023-10-02 16:28:41,468 WARNING [train.py:1204] (2/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_178-39722-0_sp0.9 from training. Duration: 0.5444375 2023-10-02 16:28:42,712 WARNING [train.py:1204] (2/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_1285-256622-0_sp0.9 from training. Duration: 0.9333125 2023-10-02 16:28:46,689 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_41428_210_2161_1_1533774184_3992532_65-271323-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 16:28:48,030 WARNING [train.py:1204] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_308-161067-0 from training. Duration: 0.66 2023-10-02 16:28:48,035 WARNING [train.py:1204] (2/4) Exclude cut with ID _3067_210_2667_1_1526212974640_4503168_459-173867-0 from training. Duration: 0.88 2023-10-02 16:28:48,046 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_2221_210_1988_1_1525597051_4935541_415-296882-0_sp1.1 from training. Duration: 0.7 2023-10-02 16:28:50,806 WARNING [train.py:1204] (2/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_354-192266-0_sp1.1 from training. Duration: 0.936375 2023-10-02 16:28:52,057 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.607e+02 1.840e+02 2.099e+02 2.420e+02 3.438e+02, threshold=4.198e+02, percent-clipped=0.0 2023-10-02 16:28:52,132 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_258-310570-0_sp0.9 from training. Duration: 0.911125 2023-10-02 16:28:53,480 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_548-141039-0_sp1.1 from training. Duration: 0.963625 2023-10-02 16:28:55,479 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_15101_210_7927_1_1530786415_5033274_320-327527-0 from training. Duration: 0.96 2023-10-02 16:28:55,530 WARNING [train.py:1204] (2/4) Exclude cut with ID _2895_210_2477_1_1525956785794_4063750_289-11475-0_sp0.9 from training. Duration: 0.911125 2023-10-02 16:28:56,943 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7543_210_6753_1_1528543318_4552_1-85072-0_sp1.1 from training. Duration: 0.7545625 2023-10-02 16:28:57,028 WARNING [train.py:1204] (2/4) Exclude cut with ID _64639_210_5816_1_1535367619437_3528000_461-329156-0 from training. Duration: 0.96 2023-10-02 16:28:58,599 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.bypass.scale_min, batch_count=944366.6666666666, ans=0.2 2023-10-02 16:28:59,680 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_211-339622-0 from training. Duration: 0.45 2023-10-02 16:29:01,544 WARNING [train.py:1204] (2/4) Exclude cut with ID _20455_210_5219_1_1531565996775_8098100_402-112739-0_sp1.1 from training. Duration: 0.963625 2023-10-02 16:29:01,723 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.bypass.skip_rate, batch_count=944366.6666666666, ans=0.09899494936611666 2023-10-02 16:29:02,900 WARNING [train.py:1204] (2/4) Exclude cut with ID _53489_210_6228_1_1534298357169_3835970_256-115565-0_sp1.1 from training. Duration: 0.936375 2023-10-02 16:29:02,923 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_26326_210_7925_1_1533002493_3592406_360-305260-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 16:29:02,945 WARNING [train.py:1204] (2/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_18-204383-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 16:29:05,708 WARNING [train.py:1204] (2/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_306-232480-0_sp1.1 from training. Duration: 0.87275 2023-10-02 16:29:07,052 INFO [train.py:1046] (2/4) Epoch 27, batch 3550, loss[loss=0.1478, simple_loss=0.2195, pruned_loss=0.03804, over 23879.00 frames. ], tot_loss[loss=0.1656, simple_loss=0.2421, pruned_loss=0.04456, over 4711524.28 frames. ], batch size: 195, lr: 3.78e-03, grad_scale: 8.0 2023-10-02 16:29:13,843 WARNING [train.py:1204] (2/4) Exclude cut with ID _41161_210_20034_1_1533431249657_3555314_116-299186-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 16:29:15,188 WARNING [train.py:1204] (2/4) Exclude cut with ID _13913_210_9994_1_1530579615187_3383897_473-277682-0_sp1.1 from training. Duration: 0.3545625 2023-10-02 16:29:17,863 WARNING [train.py:1204] (2/4) Exclude cut with ID _58895_210_8750_1_1535184081320_3649089_49-225702-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 16:29:19,265 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_3080_210_2071_1_1526086102_4365166_312-8726-0_sp1.1 from training. Duration: 0.82725 2023-10-02 16:29:20,688 WARNING [train.py:1204] (2/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_489-190175-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 16:29:22,107 WARNING [train.py:1204] (2/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_345-95717-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 16:29:22,126 WARNING [train.py:1204] (2/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_85-138257-0_sp1.1 from training. Duration: 0.6454375 2023-10-02 16:29:25,434 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7046_210_6753_1_1527895337_6920917_783-278109-0_sp1.1 from training. Duration: 0.7 2023-10-02 16:29:25,466 WARNING [train.py:1204] (2/4) Exclude cut with ID _3465_210_3525_1_1526175667060_3574049_450-203637-0_sp1.1 from training. Duration: 0.836375 2023-10-02 16:29:26,779 WARNING [train.py:1204] (2/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_416-132893-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 16:29:26,795 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_2655_210_2087_1_1525688959_3897400_437-337690-0_sp0.9 from training. Duration: 0.7 2023-10-02 16:29:26,880 WARNING [train.py:1204] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_700-183549-0_sp1.1 from training. Duration: 0.6181875 2023-10-02 16:29:30,865 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.3.encoder.layers.0.self_attn1.whiten, num_groups=1, num_channels=512, metric=15.75 vs. limit=22.5 2023-10-02 16:29:32,737 WARNING [train.py:1204] (2/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_191-59784-0_sp1.1 from training. Duration: 0.8 2023-10-02 16:29:32,756 WARNING [train.py:1204] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_218-295097-0_sp1.1 from training. Duration: 0.7 2023-10-02 16:29:35,397 WARNING [train.py:1204] (2/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_310-109461-0_sp1.1 from training. Duration: 0.72725 2023-10-02 16:29:35,402 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_33348_210_5632_1_1533086912_3324030_131-321149-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 16:29:35,457 WARNING [train.py:1204] (2/4) Exclude cut with ID _64135_210_2253_1_1535677267876_7608972_133-188266-0_sp1.1 from training. Duration: 0.736375 2023-10-02 16:29:35,475 WARNING [train.py:1204] (2/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_505-205270-0 from training. Duration: 0.38 2023-10-02 16:29:35,492 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_67223_210_4238_1_1535612120_2537431_102-88248-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 16:29:36,773 WARNING [train.py:1204] (2/4) Exclude cut with ID _54871_210_22356_1_1534665798253_3856359_60-235433-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 16:29:38,134 WARNING [train.py:1204] (2/4) Exclude cut with ID _5189_210_4557_1_1527248319428_3769229_199-53526-0_sp1.1 from training. Duration: 0.3818125 2023-10-02 16:29:42,856 WARNING [train.py:1204] (2/4) Exclude cut with ID _45329_210_7005_1_1534157951211_6774339_439-90979-0_sp1.1 from training. Duration: 0.963625 2023-10-02 16:29:42,926 WARNING [train.py:1204] (2/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_1162-132880-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 16:29:44,238 WARNING [train.py:1204] (2/4) Exclude cut with ID _56423_210_5854_1_1534896061049_3165969_220-22317-0_sp1.1 from training. Duration: 0.963625 2023-10-02 16:29:46,196 WARNING [train.py:1204] (2/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_909-64151-0 from training. Duration: 0.94 2023-10-02 16:29:46,254 WARNING [train.py:1204] (2/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_465-156500-0_sp0.9 from training. Duration: 0.8 2023-10-02 16:29:47,496 WARNING [train.py:1204] (2/4) Exclude cut with ID _44304_210_15374_1_1533639352765_4613129_302-22084-0 from training. Duration: 0.86 2023-10-02 16:29:48,694 WARNING [train.py:1204] (2/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_663-166973-0_sp1.1 from training. Duration: 0.72725 2023-10-02 16:29:50,092 WARNING [train.py:1204] (2/4) Exclude cut with ID _45329_210_7005_1_1534157951211_6774339_503-287883-0_sp0.9 from training. Duration: 0.97775 2023-10-02 16:29:50,122 WARNING [train.py:1204] (2/4) Exclude cut with ID _60057_210_10640_1_1534903065793_3333150_362-230377-0_sp0.9 from training. Duration: 0.9666875 2023-10-02 16:29:52,265 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.feed_forward2.hidden_balancer.prob, batch_count=944633.3333333334, ans=0.125 2023-10-02 16:29:54,640 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_4183_210_2087_1_1526729100_1869838_143-280962-0 from training. Duration: 0.56 2023-10-02 16:29:55,949 WARNING [train.py:1204] (2/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_281-1313-0_sp1.1 from training. Duration: 0.936375 2023-10-02 16:30:02,172 WARNING [train.py:1204] (2/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_64-215118-0_sp1.1 from training. Duration: 0.936375 2023-10-02 16:30:02,247 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_2822_210_2715_1_1526112586_1999587_165-36656-0 from training. Duration: 0.89 2023-10-02 16:30:03,566 WARNING [train.py:1204] (2/4) Exclude cut with ID _22961_210_8254_1_1531976366672_4278180_402-208830-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 16:30:06,571 WARNING [train.py:1204] (2/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_98-193494-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 16:30:07,915 WARNING [train.py:1204] (2/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_383-285196-0 from training. Duration: 0.97 2023-10-02 16:30:13,272 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.3.encoder.layers.1.nonlin_attention.whiten1, num_groups=1, num_channels=384, metric=6.59 vs. limit=10.0 2023-10-02 16:30:14,053 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6767_210_3273_1_1527855783_1718932_142-127132-0 from training. Duration: 0.95 2023-10-02 16:30:14,081 WARNING [train.py:1204] (2/4) Exclude cut with ID _39867_210_9826_1_1533553222030_9344259_1018-339635-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 16:30:15,933 WARNING [train.py:1204] (2/4) Exclude cut with ID _14603_210_4220_1_1530615688441_3097319_274-274798-0_sp1.1 from training. Duration: 0.8909375 2023-10-02 16:30:18,743 WARNING [train.py:1204] (2/4) Exclude cut with ID _2334_210_2667_1_1525608238880_4705129_376-263393-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 16:30:18,787 WARNING [train.py:1204] (2/4) Exclude cut with ID _29303_210_2575_1_1532392237303_7410649_363-344675-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 16:30:18,861 WARNING [train.py:1204] (2/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_58-68956-0_sp1.1 from training. Duration: 0.7545625 2023-10-02 16:30:21,526 INFO [train.py:1046] (2/4) Epoch 27, batch 3600, loss[loss=0.1664, simple_loss=0.2538, pruned_loss=0.03949, over 24447.00 frames. ], tot_loss[loss=0.1655, simple_loss=0.2427, pruned_loss=0.04418, over 4714459.23 frames. ], batch size: 69, lr: 3.78e-03, grad_scale: 16.0 2023-10-02 16:30:22,962 WARNING [train.py:1204] (2/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_709-311571-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 16:30:25,102 WARNING [train.py:1204] (2/4) Exclude cut with ID _46067_210_4220_1_1534125571066_3307450_136-257407-0_sp1.1 from training. Duration: 0.963625 2023-10-02 16:30:26,379 WARNING [train.py:1204] (2/4) Exclude cut with ID _6493_210_1747_1_1528459210424_4012470_324-323854-0_sp1.1 from training. Duration: 0.836375 2023-10-02 16:30:26,438 WARNING [train.py:1204] (2/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_234-260682-0_sp1.1 from training. Duration: 0.763625 2023-10-02 16:30:27,834 WARNING [train.py:1204] (2/4) Exclude cut with ID _36634_210_1794_1_1533296352450_1399884_55-119451-0_sp1.1 from training. Duration: 0.963625 2023-10-02 16:30:27,837 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_40047_210_2290_1_1533290519_2374311_195-165693-0 from training. Duration: 0.89 2023-10-02 16:30:31,797 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_5522_210_3273_1_1527249606_3909096_366-172508-0_sp0.9 from training. Duration: 0.7333125 2023-10-02 16:30:31,887 WARNING [train.py:1204] (2/4) Exclude cut with ID _21878_210_3525_1_1531738772002_7264330_187-10504-0_sp1.1 from training. Duration: 0.963625 2023-10-02 16:30:35,330 WARNING [train.py:1204] (2/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_282-257791-0_sp1.1 from training. Duration: 0.7818125 2023-10-02 16:30:35,535 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.out_combiner.scale_min, batch_count=944833.3333333334, ans=0.2 2023-10-02 16:30:38,081 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_43831_210_2158_1_1533725502_5872791_467-288776-0_sp1.1 from training. Duration: 0.92725 2023-10-02 16:30:38,173 WARNING [train.py:1204] (2/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_138-154812-0_sp1.1 from training. Duration: 0.7090625 2023-10-02 16:30:39,693 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_1154-185930-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 16:30:39,718 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_2655_210_2087_1_1525688959_3897400_52-279899-0 from training. Duration: 0.69 2023-10-02 16:30:39,775 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_141-89113-0_sp1.1 from training. Duration: 0.7818125 2023-10-02 16:30:43,074 WARNING [train.py:1204] (2/4) Exclude cut with ID _55134_210_21982_1_1534590028377_3882170_297-47310-0_sp1.1 from training. Duration: 0.963625 2023-10-02 16:30:44,313 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_486-151131-0_sp1.1 from training. Duration: 0.77275 2023-10-02 16:30:45,780 WARNING [train.py:1204] (2/4) Exclude cut with ID _44778_210_2054_1_1533798127203_7443090_21-232980-0_sp1.1 from training. Duration: 0.936375 2023-10-02 16:30:49,019 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6165_210_3528_1_1527590982_4379841_338-106380-0_sp1.1 from training. Duration: 0.92725 2023-10-02 16:30:49,088 WARNING [train.py:1204] (2/4) Exclude cut with ID _55162_210_17071_1_1534408248583_3753480_355-257218-0_sp1.1 from training. Duration: 0.97275 2023-10-02 16:30:50,423 WARNING [train.py:1204] (2/4) Exclude cut with ID _2895_210_2477_1_1525956785794_4063750_289-11475-0 from training. Duration: 0.82 2023-10-02 16:30:56,536 WARNING [train.py:1204] (2/4) Exclude cut with ID _16756_210_4915_1_1531047803763_7104259_839-183431-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 16:30:57,962 WARNING [train.py:1204] (2/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_427-208901-0_sp0.9 from training. Duration: 0.7555625 2023-10-02 16:30:58,012 WARNING [train.py:1204] (2/4) Exclude cut with ID _36512_210_6987_1_1533033143998_3126069_181-306500-0 from training. Duration: 0.95 2023-10-02 16:30:58,330 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.nonlin_attention.balancer.prob, batch_count=944900.0, ans=0.125 2023-10-02 16:31:01,345 WARNING [train.py:1204] (2/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_28-70987-0_sp1.1 from training. Duration: 0.7909375 2023-10-02 16:31:05,667 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.0.feed_forward1.hidden_balancer.prob, batch_count=944966.6666666666, ans=0.125 2023-10-02 16:31:06,783 WARNING [train.py:1204] (2/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_590-58996-0_sp1.1 from training. Duration: 0.936375 2023-10-02 16:31:08,258 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_48730_210_5399_1_1533885886_2212769_108-47561-0_sp1.1 from training. Duration: 0.936375 2023-10-02 16:31:13,419 INFO [scaling.py:1118] (2/4) WithLoss: name=encoder.encoders.3.encoder.layers.1.self_attn_weights, loss-sum=0.000e+00 2023-10-02 16:31:14,380 WARNING [train.py:1204] (2/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_401-158239-0_sp0.9 from training. Duration: 0.788875 2023-10-02 16:31:14,394 WARNING [train.py:1204] (2/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_278-154492-0_sp1.1 from training. Duration: 0.6909375 2023-10-02 16:31:14,400 WARNING [train.py:1204] (2/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_137-116442-0 from training. Duration: 0.78 2023-10-02 16:31:15,867 WARNING [train.py:1204] (2/4) Exclude cut with ID _3541_210_2477_1_1526475408709_3263973_189-317158-0 from training. Duration: 0.9 2023-10-02 16:31:16,134 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.balancer_ff3.min_abs, batch_count=944966.6666666666, ans=0.2 2023-10-02 16:31:17,239 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_47896_210_20508_1_1534492518_5958868_558-314572-0 from training. Duration: 0.97 2023-10-02 16:31:19,111 WARNING [train.py:1204] (2/4) Exclude cut with ID _66070_210_7640_1_1535441393096_7805310_290-40284-0_sp1.1 from training. Duration: 0.97275 2023-10-02 16:31:19,141 WARNING [train.py:1204] (2/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_110-224688-0_sp1.1 from training. Duration: 0.8818125 2023-10-02 16:31:20,969 WARNING [train.py:1204] (2/4) Exclude cut with ID _7719_210_3525_1_1528456153163_4296960_88-250259-0 from training. Duration: 0.84 2023-10-02 16:31:22,202 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.539e+02 1.857e+02 2.015e+02 2.444e+02 4.517e+02, threshold=4.030e+02, percent-clipped=2.0 2023-10-02 16:31:22,276 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8453_210_4211_1_1528502940_8562730_1157-114102-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 16:31:22,307 WARNING [train.py:1204] (2/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_317-175062-0_sp1.1 from training. Duration: 0.7454375 2023-10-02 16:31:22,313 WARNING [train.py:1204] (2/4) Exclude cut with ID _29857_210_20034_1_1532395886753_3798160_254-347254-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 16:31:23,667 WARNING [train.py:1204] (2/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_28-70987-0 from training. Duration: 0.87 2023-10-02 16:31:24,992 WARNING [train.py:1204] (2/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_321-335475-0 from training. Duration: 0.45 2023-10-02 16:31:25,261 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.conv_module1.balancer2.min_positive, batch_count=945033.3333333334, ans=0.05 2023-10-02 16:31:26,623 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=945033.3333333334, ans=0.1 2023-10-02 16:31:28,978 WARNING [train.py:1204] (2/4) Exclude cut with ID _40901_210_5665_1_1533363789958_3856130_356-107330-0_sp1.1 from training. Duration: 0.936375 2023-10-02 16:31:30,445 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_3780_210_2087_1_1526467389_2145572_176-107140-0 from training. Duration: 0.69 2023-10-02 16:31:36,506 INFO [train.py:1046] (2/4) Epoch 27, batch 3650, loss[loss=0.1411, simple_loss=0.2201, pruned_loss=0.03106, over 16399.00 frames. ], tot_loss[loss=0.1665, simple_loss=0.2436, pruned_loss=0.04468, over 4690822.24 frames. ], batch size: 35, lr: 3.78e-03, grad_scale: 16.0 2023-10-02 16:31:36,544 WARNING [train.py:1204] (2/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_80-135200-0 from training. Duration: 0.79 2023-10-02 16:31:36,652 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_33348_210_5632_1_1533086912_3324030_85-28583-0_sp0.9 from training. Duration: 0.911125 2023-10-02 16:31:39,500 WARNING [train.py:1204] (2/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_29-293896-0 from training. Duration: 0.61 2023-10-02 16:31:42,793 WARNING [train.py:1204] (2/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_194-252800-0 from training. Duration: 0.97 2023-10-02 16:31:46,899 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_360-52446-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 16:31:46,901 WARNING [train.py:1204] (2/4) Exclude cut with ID _9389_210_1664_1_1529217337301_5289269_289-213417-0_sp0.9 from training. Duration: 0.988875 2023-10-02 16:31:46,950 WARNING [train.py:1204] (2/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_143-86344-0_sp0.9 from training. Duration: 0.8555625 2023-10-02 16:31:49,896 WARNING [train.py:1204] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_520-126729-0_sp0.9 from training. Duration: 0.77775 2023-10-02 16:31:51,193 WARNING [train.py:1204] (2/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_76-308663-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 16:31:51,479 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.bypass.skip_rate, batch_count=945166.6666666666, ans=0.07 2023-10-02 16:31:52,594 WARNING [train.py:1204] (2/4) Exclude cut with ID _8021_210_7994_1_1528376451358_3653069_100-140231-0 from training. Duration: 0.67 2023-10-02 16:31:52,664 WARNING [train.py:1204] (2/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_387-131538-0_sp0.9 from training. Duration: 0.9 2023-10-02 16:31:52,917 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.feed_forward2.hidden_balancer.prob, batch_count=945166.6666666666, ans=0.125 2023-10-02 16:31:54,456 WARNING [train.py:1204] (2/4) Exclude cut with ID _6275_210_2084_1_1528110062213_7669049_365-182054-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 16:31:54,499 WARNING [train.py:1204] (2/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_345-340690-0 from training. Duration: 0.93 2023-10-02 16:31:56,501 WARNING [train.py:1204] (2/4) Exclude cut with ID _5409_210_1388_1_1527292850189_5865191_178-127970-0_sp0.9 from training. Duration: 0.6333125 2023-10-02 16:31:56,551 WARNING [train.py:1204] (2/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_352-12734-0_sp1.1 from training. Duration: 0.92725 2023-10-02 16:31:56,554 WARNING [train.py:1204] (2/4) Exclude cut with ID _65134_210_14763_1_1535335393985_3505368_121-129373-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 16:31:57,073 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.5.encoder.layers.0.feed_forward3.out_whiten, num_groups=1, num_channels=256, metric=9.36 vs. limit=15.0 2023-10-02 16:31:57,966 WARNING [train.py:1204] (2/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_557-324258-0_sp1.1 from training. Duration: 0.736375 2023-10-02 16:31:59,486 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.ff2_skip_rate, batch_count=945166.6666666666, ans=0.0 2023-10-02 16:32:00,770 WARNING [train.py:1204] (2/4) Exclude cut with ID _48678_210_5663_1_1533988830947_3769199_306-255886-0 from training. Duration: 0.77 2023-10-02 16:32:00,852 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7544_210_6753_1_1528587000_691468_87-178599-0 from training. Duration: 0.87 2023-10-02 16:32:02,270 WARNING [train.py:1204] (2/4) Exclude cut with ID _2941_210_2577_1_1526199673150_2489759_208-236402-0_sp1.1 from training. Duration: 0.963625 2023-10-02 16:32:03,652 WARNING [train.py:1204] (2/4) Exclude cut with ID _38670_210_15379_1_1533448810659_4391059_65-169397-0 from training. Duration: 0.97 2023-10-02 16:32:06,253 WARNING [train.py:1204] (2/4) Exclude cut with ID _30304_210_6319_1_1532476827261_7582060_306-328175-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 16:32:06,261 WARNING [train.py:1204] (2/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_212-149947-0_sp1.1 from training. Duration: 0.663625 2023-10-02 16:32:11,031 WARNING [train.py:1204] (2/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_321-22821-0_sp1.1 from training. Duration: 0.8090625 2023-10-02 16:32:13,051 WARNING [train.py:1204] (2/4) Exclude cut with ID _44010_210_4525_1_1533884262964_4013620_483-69110-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 16:32:13,060 WARNING [train.py:1204] (2/4) Exclude cut with ID _14922_210_6228_1_1530707435273_3693310_38-187851-0_sp1.1 from training. Duration: 0.563625 2023-10-02 16:32:15,555 WARNING [train.py:1204] (2/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_129-337802-0_sp0.9 from training. Duration: 0.8 2023-10-02 16:32:16,877 WARNING [train.py:1204] (2/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_268-87909-0_sp1.1 from training. Duration: 0.9 2023-10-02 16:32:18,284 WARNING [train.py:1204] (2/4) Exclude cut with ID _21878_210_3525_1_1531738772002_7264330_559-184862-0_sp1.1 from training. Duration: 0.8545625 2023-10-02 16:32:21,063 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_46032_210_14656_1_1533781412_4507969_237-120766-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 16:32:22,542 WARNING [train.py:1204] (2/4) Exclude cut with ID _28427_210_4220_1_1532581240967_3061069_330-170906-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 16:32:22,548 WARNING [train.py:1204] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_401-51578-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 16:32:25,609 WARNING [train.py:1204] (2/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_209-205365-0_sp1.1 from training. Duration: 0.7181875 2023-10-02 16:32:25,688 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_23801_210_16223_1_1532066392_3294657_57-242170-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 16:32:25,739 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_127-37371-0_sp1.1 from training. Duration: 0.936375 2023-10-02 16:32:32,145 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.conv_skip_rate, batch_count=945300.0, ans=0.0 2023-10-02 16:32:32,226 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.bypass.scale_min, batch_count=945300.0, ans=0.2 2023-10-02 16:32:33,416 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9171W0019-578905 from training. Duration: 0.896 2023-10-02 16:32:36,279 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_24605_210_1794_1_1532047863_4149755_349-49056-0_sp1.1 from training. Duration: 0.92725 2023-10-02 16:32:36,293 WARNING [train.py:1204] (2/4) Exclude cut with ID _12302_210_5219_1_1529924446466_8107280_897-21237-0_sp1.1 from training. Duration: 0.936375 2023-10-02 16:32:37,675 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_9460_210_4211_1_1528954587_10049667_480-260269-0_sp0.9 from training. Duration: 0.62225 2023-10-02 16:32:37,732 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_348-152548-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 16:32:37,900 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.conv_skip_rate, batch_count=945366.6666666666, ans=0.0 2023-10-02 16:32:39,102 WARNING [train.py:1204] (2/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_301-330455-0_sp0.9 from training. Duration: 0.888875 2023-10-02 16:32:41,033 WARNING [train.py:1204] (2/4) Exclude cut with ID _61008_210_10633_1_1534852769198_6974370_345-91705-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 16:32:42,384 WARNING [train.py:1204] (2/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_304-273433-0 from training. Duration: 0.99 2023-10-02 16:32:42,387 WARNING [train.py:1204] (2/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_359-345972-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 16:32:42,494 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.0.conv_module1.balancer1.max_abs, batch_count=945366.6666666666, ans=10.0 2023-10-02 16:32:45,583 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_2296_210_2087_1_1525503448_3397335_57-185430-0_sp1.1 from training. Duration: 0.5454375 2023-10-02 16:32:48,273 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_1256-120974-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 16:32:49,683 WARNING [train.py:1204] (2/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_372-30302-0_sp0.9 from training. Duration: 0.9555625 2023-10-02 16:32:50,908 INFO [train.py:1046] (2/4) Epoch 27, batch 3700, loss[loss=0.1725, simple_loss=0.2524, pruned_loss=0.0463, over 24099.00 frames. ], tot_loss[loss=0.1668, simple_loss=0.2446, pruned_loss=0.04452, over 4705401.26 frames. ], batch size: 80, lr: 3.78e-03, grad_scale: 16.0 2023-10-02 16:32:52,418 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_42012_210_7927_1_1533439936_3871556_451-344262-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 16:32:52,419 WARNING [train.py:1204] (2/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_28-324729-0 from training. Duration: 0.94 2023-10-02 16:32:52,433 WARNING [train.py:1204] (2/4) Exclude cut with ID _64135_210_2253_1_1535677267876_7608972_313-18394-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 16:32:52,702 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=945433.3333333334, ans=0.1 2023-10-02 16:32:53,833 WARNING [train.py:1204] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_620-276793-0_sp0.9 from training. Duration: 0.6555625 2023-10-02 16:32:55,124 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7544_210_6753_1_1528591327_1471426_172-348777-0_sp0.9 from training. Duration: 0.7333125 2023-10-02 16:32:58,425 WARNING [train.py:1204] (2/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_332-221057-0_sp0.9 from training. Duration: 0.7555625 2023-10-02 16:33:01,175 WARNING [train.py:1204] (2/4) Exclude cut with ID _19023_210_4353_1_1531378892552_5820780_5-300035-0_sp1.1 from training. Duration: 0.92725 2023-10-02 16:33:01,222 WARNING [train.py:1204] (2/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_343-309023-0_sp1.1 from training. Duration: 0.963625 2023-10-02 16:33:03,103 WARNING [train.py:1204] (2/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_37-41919-0_sp1.1 from training. Duration: 0.7909375 2023-10-02 16:33:03,135 WARNING [train.py:1204] (2/4) Exclude cut with ID _3541_210_2477_1_1526475408709_3263973_41-182910-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 16:33:04,534 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_229-45300-0_sp0.9 from training. Duration: 0.6333125 2023-10-02 16:33:04,682 WARNING [train.py:1204] (2/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_40-214716-0_sp1.1 from training. Duration: 0.963625 2023-10-02 16:33:06,080 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9309W0203-640330 from training. Duration: 0.896 2023-10-02 16:33:06,356 INFO [scaling.py:1118] (2/4) WithLoss: name=encoder.encoders.5.encoder.layers.0.self_attn_weights, loss-sum=2.520e-03 2023-10-02 16:33:07,647 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.conv_skip_rate, batch_count=945500.0, ans=0.0 2023-10-02 16:33:15,950 WARNING [train.py:1204] (2/4) Exclude cut with ID _60229_210_27767_1_1534838406158_3655350_264-279573-0_sp1.1 from training. Duration: 0.7545625 2023-10-02 16:33:15,989 WARNING [train.py:1204] (2/4) Exclude cut with ID _8394_210_6702_1_1528590504316_3540189_372-198878-0_sp0.9 from training. Duration: 0.6666875 2023-10-02 16:33:17,332 WARNING [train.py:1204] (2/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_359-118926-0_sp1.1 from training. Duration: 0.6545625 2023-10-02 16:33:19,222 WARNING [train.py:1204] (2/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_85-65056-0 from training. Duration: 0.96 2023-10-02 16:33:19,248 WARNING [train.py:1204] (2/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_89-242335-0_sp1.1 from training. Duration: 0.863625 2023-10-02 16:33:22,150 WARNING [train.py:1204] (2/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_128-49444-0_sp1.1 from training. Duration: 0.936375 2023-10-02 16:33:23,538 WARNING [train.py:1204] (2/4) Exclude cut with ID _30327_210_16049_1_1532484161295_3924169_401-70819-0 from training. Duration: 0.86 2023-10-02 16:33:24,977 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_41822_210_13270_1_1533955976_4379284_260-256391-0_sp1.1 from training. Duration: 0.936375 2023-10-02 16:33:26,359 WARNING [train.py:1204] (2/4) Exclude cut with ID _67335_210_5219_1_1535525975652_4275229_599-247317-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 16:33:28,255 WARNING [train.py:1204] (2/4) Exclude cut with ID _29857_210_20034_1_1532395886753_3798160_209-239267-0_sp1.1 from training. Duration: 0.936375 2023-10-02 16:33:28,278 WARNING [train.py:1204] (2/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_46-207599-0_sp1.1 from training. Duration: 0.5818125 2023-10-02 16:33:29,799 WARNING [train.py:1204] (2/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_912-282412-0_sp1.1 from training. Duration: 0.4454375 2023-10-02 16:33:35,086 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_4183_210_2087_1_1526729100_1869838_141-166600-0_sp1.1 from training. Duration: 0.863625 2023-10-02 16:33:35,090 WARNING [train.py:1204] (2/4) Exclude cut with ID _15037_210_2253_1_1530752294104_3267650_296-192597-0 from training. Duration: 0.81 2023-10-02 16:33:35,145 WARNING [train.py:1204] (2/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_571-220562-0_sp1.1 from training. Duration: 0.963625 2023-10-02 16:33:37,073 WARNING [train.py:1204] (2/4) Exclude cut with ID _5287_210_2084_1_1527418883040_3878670_12-221186-0 from training. Duration: 0.98 2023-10-02 16:33:40,070 WARNING [train.py:1204] (2/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_149-85602-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 16:33:41,363 WARNING [train.py:1204] (2/4) Exclude cut with ID _9235_210_5219_1_1528891154253_8046120_1003-101638-0_sp1.1 from training. Duration: 0.663625 2023-10-02 16:33:42,841 WARNING [train.py:1204] (2/4) Exclude cut with ID _55844_210_2346_1_1534471212569_7625707_1099-33689-0_sp1.1 from training. Duration: 0.92725 2023-10-02 16:33:44,169 WARNING [train.py:1204] (2/4) Exclude cut with ID _7606_210_4915_1_1528894253277_5080690_301-10732-0 from training. Duration: 0.81 2023-10-02 16:33:46,179 WARNING [train.py:1204] (2/4) Exclude cut with ID _22826_210_9994_1_1531908201522_3279169_403-34698-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 16:33:46,184 WARNING [train.py:1204] (2/4) Exclude cut with ID _8480_210_1874_1_1528589391205_6728330_755-112625-0_sp0.9 from training. Duration: 0.82225 2023-10-02 16:33:46,197 WARNING [train.py:1204] (2/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_527-238990-0_sp1.1 from training. Duration: 0.8181875 2023-10-02 16:33:47,306 WARNING [train.py:1204] (2/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_426-7328-0_sp1.1 from training. Duration: 0.92725 2023-10-02 16:33:50,529 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.545e+02 1.788e+02 1.952e+02 2.175e+02 3.581e+02, threshold=3.904e+02, percent-clipped=0.0 2023-10-02 16:33:50,673 WARNING [train.py:1204] (2/4) Exclude cut with ID _3541_210_2477_1_1526475408709_3263973_189-317158-0_sp1.1 from training. Duration: 0.8181875 2023-10-02 16:33:50,730 WARNING [train.py:1204] (2/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_162-103620-0 from training. Duration: 0.93 2023-10-02 16:33:52,126 WARNING [train.py:1204] (2/4) Exclude cut with ID _22083_210_8254_1_1531702862439_3678970_49-234546-0 from training. Duration: 0.83 2023-10-02 16:33:53,335 WARNING [train.py:1204] (2/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_370-331600-0_sp1.1 from training. Duration: 0.8545625 2023-10-02 16:33:53,348 WARNING [train.py:1204] (2/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_166-59042-0_sp1.1 from training. Duration: 0.97275 2023-10-02 16:33:54,817 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=945700.0, ans=0.1 2023-10-02 16:33:56,012 WARNING [train.py:1204] (2/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_138-332773-0_sp1.1 from training. Duration: 0.736375 2023-10-02 16:33:57,355 WARNING [train.py:1204] (2/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_90-30673-0_sp0.9 from training. Duration: 0.9333125 2023-10-02 16:33:57,593 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=945700.0, ans=0.1 2023-10-02 16:33:58,800 WARNING [train.py:1204] (2/4) Exclude cut with ID _27967_210_12140_1_1532588391859_8419130_737-41898-0_sp1.1 from training. Duration: 0.936375 2023-10-02 16:34:00,595 WARNING [train.py:1204] (2/4) Exclude cut with ID _8833_210_5219_1_1528592665210_7553059_329-125156-0_sp1.1 from training. Duration: 0.7454375 2023-10-02 16:34:01,992 WARNING [train.py:1204] (2/4) Exclude cut with ID _41817_210_18835_1_1533434477160_7423049_352-42537-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 16:34:03,381 WARNING [train.py:1204] (2/4) Exclude cut with ID _8365_210_2064_1_1528592311070_4677690_431-294102-0 from training. Duration: 0.89 2023-10-02 16:34:04,684 INFO [train.py:1046] (2/4) Epoch 27, batch 3750, loss[loss=0.1625, simple_loss=0.2508, pruned_loss=0.0371, over 24432.00 frames. ], tot_loss[loss=0.1673, simple_loss=0.2452, pruned_loss=0.04474, over 4702093.06 frames. ], batch size: 69, lr: 3.78e-03, grad_scale: 16.0 2023-10-02 16:34:04,807 WARNING [train.py:1204] (2/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_454-314901-0_sp1.1 from training. Duration: 0.4181875 2023-10-02 16:34:08,071 WARNING [train.py:1204] (2/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_80-135200-0_sp0.9 from training. Duration: 0.87775 2023-10-02 16:34:09,384 WARNING [train.py:1204] (2/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_181-78632-0 from training. Duration: 0.78 2023-10-02 16:34:10,676 WARNING [train.py:1204] (2/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_300-27235-0_sp0.9 from training. Duration: 0.9666875 2023-10-02 16:34:12,013 WARNING [train.py:1204] (2/4) Exclude cut with ID _47790_210_16187_1_1533862814066_4207519_96-177176-0_sp1.1 from training. Duration: 0.97275 2023-10-02 16:34:13,385 WARNING [train.py:1204] (2/4) Exclude cut with ID _35941_210_15379_1_1532995294515_4023089_246-348742-0_sp1.1 from training. Duration: 0.97275 2023-10-02 16:34:14,726 WARNING [train.py:1204] (2/4) Exclude cut with ID _38670_210_15379_1_1533448810659_4391059_65-169397-0_sp1.1 from training. Duration: 0.8818125 2023-10-02 16:34:15,969 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.3.encoder.layers.3.feed_forward1.out_whiten, num_groups=1, num_channels=512, metric=7.74 vs. limit=15.0 2023-10-02 16:34:17,994 WARNING [train.py:1204] (2/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_1003-99642-0_sp1.1 from training. Duration: 0.963625 2023-10-02 16:34:22,674 WARNING [train.py:1204] (2/4) Exclude cut with ID _40402_210_18219_1_1533522324115_2953150_320-14078-0_sp1.1 from training. Duration: 0.863625 2023-10-02 16:34:22,764 WARNING [train.py:1204] (2/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_367-61330-0_sp0.9 from training. Duration: 0.8333125 2023-10-02 16:34:24,865 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.2.encoder.layers.1.conv_module1.whiten, num_groups=1, num_channels=384, metric=7.69 vs. limit=15.0 2023-10-02 16:34:25,554 WARNING [train.py:1204] (2/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_515-298514-0_sp1.1 from training. Duration: 0.92725 2023-10-02 16:34:25,832 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.attention_skip_rate, batch_count=945833.3333333334, ans=0.0 2023-10-02 16:34:28,514 WARNING [train.py:1204] (2/4) Exclude cut with ID _53731_210_7994_1_1534553979244_3757214_471-320815-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 16:34:29,716 WARNING [train.py:1204] (2/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_306-232480-0 from training. Duration: 0.96 2023-10-02 16:34:29,788 WARNING [train.py:1204] (2/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_478-162247-0_sp0.9 from training. Duration: 0.911125 2023-10-02 16:34:31,627 WARNING [train.py:1204] (2/4) Exclude cut with ID _56423_210_5854_1_1534896061049_3165969_345-284696-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 16:34:31,660 WARNING [train.py:1204] (2/4) Exclude cut with ID _44778_210_2054_1_1533798127203_7443090_600-122253-0_sp1.1 from training. Duration: 0.963625 2023-10-02 16:34:35,757 WARNING [train.py:1204] (2/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_199-295611-0 from training. Duration: 0.55 2023-10-02 16:34:37,770 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.2.encoder.layers.2.nonlin_attention.whiten2, num_groups=1, num_channels=384, metric=5.16 vs. limit=15.0 2023-10-02 16:34:38,464 WARNING [train.py:1204] (2/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_734-129761-0 from training. Duration: 0.82 2023-10-02 16:34:39,758 WARNING [train.py:1204] (2/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_349-169257-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 16:34:41,077 WARNING [train.py:1204] (2/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_202-67017-0_sp0.9 from training. Duration: 0.911125 2023-10-02 16:34:44,298 WARNING [train.py:1204] (2/4) Exclude cut with ID _16908_210_5724_1_1531119157248_3772700_61-258978-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 16:34:44,565 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.balancer_ff3.min_abs, batch_count=945900.0, ans=0.2 2023-10-02 16:34:49,036 WARNING [train.py:1204] (2/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_172-156931-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 16:34:50,344 WARNING [train.py:1204] (2/4) Exclude cut with ID _4092_210_2151_1_1526643044449_3135759_356-278694-0_sp1.1 from training. Duration: 0.42725 2023-10-02 16:34:53,622 WARNING [train.py:1204] (2/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_116-332008-0 from training. Duration: 0.98 2023-10-02 16:34:56,339 WARNING [train.py:1204] (2/4) Exclude cut with ID _29003_210_19596_1_1532507391720_3894098_358-114789-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 16:34:59,123 WARNING [train.py:1204] (2/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_152-212533-0_sp1.1 from training. Duration: 0.8818125 2023-10-02 16:34:59,179 WARNING [train.py:1204] (2/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_267-214116-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 16:35:02,079 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7543_210_6753_1_1528541907_1399322_165-323619-0_sp0.9 from training. Duration: 0.7666875 2023-10-02 16:35:06,640 WARNING [train.py:1204] (2/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_577-332600-0_sp0.9 from training. Duration: 0.6333125 2023-10-02 16:35:08,019 WARNING [train.py:1204] (2/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_342-100157-0_sp0.9 from training. Duration: 0.6 2023-10-02 16:35:10,853 WARNING [train.py:1204] (2/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_141-89727-0_sp1.1 from training. Duration: 0.6454375 2023-10-02 16:35:12,254 WARNING [train.py:1204] (2/4) Exclude cut with ID _3992_210_1747_1_1526644816031_3731060_107-251434-0_sp1.1 from training. Duration: 0.9 2023-10-02 16:35:14,251 WARNING [train.py:1204] (2/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_401-53423-0_sp1.1 from training. Duration: 0.5 2023-10-02 16:35:15,825 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.conv_skip_rate, batch_count=946033.3333333334, ans=0.0 2023-10-02 16:35:18,928 INFO [train.py:1046] (2/4) Epoch 27, batch 3800, loss[loss=0.1749, simple_loss=0.2351, pruned_loss=0.05736, over 19519.00 frames. ], tot_loss[loss=0.1672, simple_loss=0.2449, pruned_loss=0.0448, over 4695651.40 frames. ], batch size: 388, lr: 3.78e-03, grad_scale: 16.0 2023-10-02 16:35:20,356 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7762_210_1794_1_1528199508_4057408_422-227034-0_sp1.1 from training. Duration: 0.663625 2023-10-02 16:35:24,848 WARNING [train.py:1204] (2/4) Exclude cut with ID _4184_210_2550_1_1526730192013_2219380_102-250299-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 16:35:26,118 WARNING [train.py:1204] (2/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_253-308377-0_sp0.9 from training. Duration: 0.5444375 2023-10-02 16:35:26,190 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_271-297505-0 from training. Duration: 0.99 2023-10-02 16:35:28,921 WARNING [train.py:1204] (2/4) Exclude cut with ID _35941_210_15379_1_1532995294515_4023089_213-142973-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 16:35:30,488 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7544_210_6753_1_1528587722_3576686_162-63018-0_sp1.1 from training. Duration: 0.92725 2023-10-02 16:35:31,764 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_365-217119-0_sp0.9 from training. Duration: 0.7 2023-10-02 16:35:31,900 WARNING [train.py:1204] (2/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_662-275621-0_sp0.9 from training. Duration: 0.4666875 2023-10-02 16:35:31,902 WARNING [train.py:1204] (2/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_374-187555-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 16:35:33,306 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_289-180904-0_sp1.1 from training. Duration: 0.6909375 2023-10-02 16:35:34,717 WARNING [train.py:1204] (2/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_189-47206-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 16:35:36,584 WARNING [train.py:1204] (2/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_234-14174-0_sp1.1 from training. Duration: 0.7454375 2023-10-02 16:35:36,619 WARNING [train.py:1204] (2/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_576-76621-0_sp1.1 from training. Duration: 0.963625 2023-10-02 16:35:36,707 WARNING [train.py:1204] (2/4) Exclude cut with ID _13253_210_1925_1_1530757775819_3577873_253-246386-0 from training. Duration: 0.9 2023-10-02 16:35:39,361 WARNING [train.py:1204] (2/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_372-6869-0_sp1.1 from training. Duration: 0.4 2023-10-02 16:35:40,762 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_21434_210_4238_1_1531916339_1222252_22-54954-0_sp1.1 from training. Duration: 0.936375 2023-10-02 16:35:43,940 WARNING [train.py:1204] (2/4) Exclude cut with ID _29853_210_7497_1_1532505600851_6642110_492-170361-0_sp1.1 from training. Duration: 0.92725 2023-10-02 16:35:46,766 WARNING [train.py:1204] (2/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_505-97987-0_sp0.9 from training. Duration: 0.9555625 2023-10-02 16:35:46,819 WARNING [train.py:1204] (2/4) Exclude cut with ID _8365_210_2064_1_1528592311070_4677690_371-122295-0_sp0.9 from training. Duration: 0.611125 2023-10-02 16:35:49,535 WARNING [train.py:1204] (2/4) Exclude cut with ID _14268_210_9206_1_1530622771285_4714099_637-86630-0_sp1.1 from training. Duration: 0.463625 2023-10-02 16:35:49,547 WARNING [train.py:1204] (2/4) Exclude cut with ID _29857_210_20034_1_1532395886753_3798160_372-5678-0_sp1.1 from training. Duration: 0.963625 2023-10-02 16:35:50,846 WARNING [train.py:1204] (2/4) Exclude cut with ID _52892_210_6721_1_1534378941259_4124129_90-165108-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 16:35:52,807 WARNING [train.py:1204] (2/4) Exclude cut with ID _27052_210_5986_1_1532329197187_3585009_143-339198-0_sp1.1 from training. Duration: 0.963625 2023-10-02 16:35:55,679 WARNING [train.py:1204] (2/4) Exclude cut with ID _14268_210_9206_1_1530622771285_4714099_637-86630-0_sp0.9 from training. Duration: 0.5666875 2023-10-02 16:35:55,681 WARNING [train.py:1204] (2/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_401-255491-0 from training. Duration: 0.7 2023-10-02 16:35:57,214 WARNING [train.py:1204] (2/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_178-6946-0_sp1.1 from training. Duration: 0.97275 2023-10-02 16:36:01,593 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.out_combiner.scale_min, batch_count=946300.0, ans=0.2 2023-10-02 16:36:02,676 WARNING [train.py:1204] (2/4) Exclude cut with ID _27485_210_4916_1_1532739191677_3668269_460-163885-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 16:36:06,042 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.0.feed_forward1.out_proj.dropout_p, batch_count=946300.0, ans=0.1 2023-10-02 16:36:10,071 WARNING [train.py:1204] (2/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_270-26053-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 16:36:11,513 WARNING [train.py:1204] (2/4) Exclude cut with ID _14170_210_5219_1_1530525949868_4469650_412-317077-0 from training. Duration: 0.75 2023-10-02 16:36:13,349 WARNING [train.py:1204] (2/4) Exclude cut with ID _5289_210_2084_1_1527678202736_3779299_142-103317-0 from training. Duration: 0.84 2023-10-02 16:36:15,053 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_150-143438-0_sp1.1 from training. Duration: 0.92725 2023-10-02 16:36:16,461 WARNING [train.py:1204] (2/4) Exclude cut with ID _27894_210_12392_1_1532415496078_6902439_668-269779-0_sp1.1 from training. Duration: 0.97275 2023-10-02 16:36:16,609 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.1.conv_module1.balancer1.prob, batch_count=946366.6666666666, ans=0.125 2023-10-02 16:36:17,639 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.550e+02 1.874e+02 2.041e+02 2.512e+02 4.276e+02, threshold=4.082e+02, percent-clipped=2.0 2023-10-02 16:36:17,762 WARNING [train.py:1204] (2/4) Exclude cut with ID _18513_210_9206_1_1531618215223_3862620_56-222075-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 16:36:19,196 WARNING [train.py:1204] (2/4) Exclude cut with ID _4092_210_2151_1_1526643044449_3135759_356-278694-0 from training. Duration: 0.47 2023-10-02 16:36:23,888 WARNING [train.py:1204] (2/4) Exclude cut with ID _25808_210_13292_1_1532343514562_7366109_633-47524-0 from training. Duration: 0.99 2023-10-02 16:36:23,905 WARNING [train.py:1204] (2/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_872-264525-0 from training. Duration: 0.65 2023-10-02 16:36:23,940 WARNING [train.py:1204] (2/4) Exclude cut with ID _14632_210_9988_1_1530691279392_3923200_239-232545-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 16:36:25,384 WARNING [train.py:1204] (2/4) Exclude cut with ID _14349_210_8341_1_1530615710818_3442470_153-27432-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 16:36:28,747 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.attention_skip_rate, batch_count=946366.6666666666, ans=0.0 2023-10-02 16:36:29,845 WARNING [train.py:1204] (2/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_37-41919-0_sp0.9 from training. Duration: 0.9666875 2023-10-02 16:36:29,904 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_196-289637-0_sp0.9 from training. Duration: 0.8333125 2023-10-02 16:36:32,744 INFO [train.py:1046] (2/4) Epoch 27, batch 3850, loss[loss=0.1705, simple_loss=0.2605, pruned_loss=0.04026, over 24553.00 frames. ], tot_loss[loss=0.1663, simple_loss=0.2433, pruned_loss=0.04467, over 4685936.08 frames. ], batch size: 71, lr: 3.78e-03, grad_scale: 16.0 2023-10-02 16:36:32,871 WARNING [train.py:1204] (2/4) Exclude cut with ID _13678_210_7497_1_1530530069226_3463170_371-3221-0_sp1.1 from training. Duration: 0.8181875 2023-10-02 16:36:34,249 WARNING [train.py:1204] (2/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_557-324258-0 from training. Duration: 0.81 2023-10-02 16:36:35,593 WARNING [train.py:1204] (2/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_1012-279473-0_sp1.1 from training. Duration: 0.6090625 2023-10-02 16:36:37,444 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_66916_210_14656_1_1535725525_326566_7-92649-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 16:36:40,219 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_9460_210_4211_1_1528954587_10049667_480-260269-0_sp1.1 from training. Duration: 0.5090625 2023-10-02 16:36:40,366 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.conv_module2.balancer1.prob, batch_count=946433.3333333334, ans=0.125 2023-10-02 16:36:41,787 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.conv_module2.balancer2.prob, batch_count=946433.3333333334, ans=0.125 2023-10-02 16:36:43,477 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_40896_210_15710_1_1533433956_7100802_121-155001-0_sp1.1 from training. Duration: 0.92725 2023-10-02 16:36:45,486 WARNING [train.py:1204] (2/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_188-171673-0_sp1.1 from training. Duration: 0.52725 2023-10-02 16:36:46,794 WARNING [train.py:1204] (2/4) Exclude cut with ID _47790_210_16187_1_1533862814066_4207519_2-39653-0 from training. Duration: 0.98 2023-10-02 16:36:47,143 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.conv_module2.balancer1.prob, batch_count=946500.0, ans=0.125 2023-10-02 16:36:51,598 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_23850_210_4238_1_1532232597_1632433_112-229012-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 16:36:54,230 WARNING [train.py:1204] (2/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_626-79528-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 16:36:55,645 WARNING [train.py:1204] (2/4) Exclude cut with ID _30304_210_6319_1_1532476827261_7582060_856-90052-0_sp1.1 from training. Duration: 0.963625 2023-10-02 16:36:55,688 WARNING [train.py:1204] (2/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_122-109023-0_sp0.9 from training. Duration: 0.8444375 2023-10-02 16:36:58,499 WARNING [train.py:1204] (2/4) Exclude cut with ID _64414_210_17301_1_1535243252048_3757759_37-70328-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 16:36:58,570 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_20120_210_6753_1_1531643035_5482859_622-344907-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 16:36:58,699 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.conv_module2.balancer2.min_positive, batch_count=946500.0, ans=0.05 2023-10-02 16:37:00,085 WARNING [train.py:1204] (2/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_410-321956-0_sp1.1 from training. Duration: 0.92725 2023-10-02 16:37:00,103 WARNING [train.py:1204] (2/4) Exclude cut with ID _7562_210_2084_1_1528887767916_3946229_186-326422-0_sp0.9 from training. Duration: 0.9333125 2023-10-02 16:37:01,424 WARNING [train.py:1204] (2/4) Exclude cut with ID _37537_210_2253_1_1533171758568_3421949_127-285192-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 16:37:04,099 WARNING [train.py:1204] (2/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_723-228168-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 16:37:05,580 WARNING [train.py:1204] (2/4) Exclude cut with ID _13678_210_7497_1_1530530069226_3463170_426-246984-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 16:37:05,594 WARNING [train.py:1204] (2/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_399-156791-0_sp0.9 from training. Duration: 0.92225 2023-10-02 16:37:06,908 WARNING [train.py:1204] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_391-22148-0 from training. Duration: 0.77 2023-10-02 16:37:06,940 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_3475_210_2028_1_1526193578_3622569_64-297072-0 from training. Duration: 0.91 2023-10-02 16:37:07,000 WARNING [train.py:1204] (2/4) Exclude cut with ID _17875_210_2087_1_1531224012907_1821339_225-280015-0_sp1.1 from training. Duration: 0.963625 2023-10-02 16:37:08,849 WARNING [train.py:1204] (2/4) Exclude cut with ID _50655_210_18628_1_1534229942193_3772154_325-246169-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 16:37:10,435 WARNING [train.py:1204] (2/4) Exclude cut with ID _29529_210_7234_1_1532393961455_3977460_597-257348-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 16:37:11,737 WARNING [train.py:1204] (2/4) Exclude cut with ID _58895_210_8750_1_1535184081320_3649089_77-173485-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 16:37:11,770 WARNING [train.py:1204] (2/4) Exclude cut with ID _6270_210_2084_1_1527850911573_3856420_26-277343-0 from training. Duration: 0.82 2023-10-02 16:37:11,975 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.conv_module2.balancer2.prob, batch_count=946566.6666666666, ans=0.125 2023-10-02 16:37:15,517 WARNING [train.py:1204] (2/4) Exclude cut with ID _14368_210_4220_1_1530532714870_3595529_144-3287-0 from training. Duration: 0.67 2023-10-02 16:37:15,626 WARNING [train.py:1204] (2/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_293-145278-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 16:37:17,244 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.bypass.scale_min, batch_count=946633.3333333334, ans=0.2 2023-10-02 16:37:18,390 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_40047_210_2290_1_1533290519_2374311_257-335162-0 from training. Duration: 0.95 2023-10-02 16:37:20,332 WARNING [train.py:1204] (2/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_619-344499-0_sp0.9 from training. Duration: 0.7 2023-10-02 16:37:25,914 WARNING [train.py:1204] (2/4) Exclude cut with ID _19504_210_9994_1_1531648863242_3325220_456-315379-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 16:37:25,999 WARNING [train.py:1204] (2/4) Exclude cut with ID _55857_210_2346_1_1534644007831_6898209_631-221692-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 16:37:27,689 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.balancer_ff2.min_abs, batch_count=946633.3333333334, ans=0.1 2023-10-02 16:37:30,205 WARNING [train.py:1204] (2/4) Exclude cut with ID _64631_210_27767_1_1535349580068_4165160_324-3682-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 16:37:31,550 WARNING [train.py:1204] (2/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_317-211107-0 from training. Duration: 0.92 2023-10-02 16:37:33,020 WARNING [train.py:1204] (2/4) Exclude cut with ID _6789_210_1534_1_1527857937218_3118950_311-61262-0 from training. Duration: 0.65 2023-10-02 16:37:35,903 WARNING [train.py:1204] (2/4) Exclude cut with ID _28323_210_16187_1_1532480395667_4100300_365-68882-0_sp1.1 from training. Duration: 0.92725 2023-10-02 16:37:35,935 WARNING [train.py:1204] (2/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_816-181825-0_sp1.1 from training. Duration: 0.92725 2023-10-02 16:37:36,108 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.out_combiner.scale_min, batch_count=946700.0, ans=0.2 2023-10-02 16:37:39,998 WARNING [train.py:1204] (2/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_132-269633-0_sp1.1 from training. Duration: 0.6181875 2023-10-02 16:37:40,016 WARNING [train.py:1204] (2/4) Exclude cut with ID _44585_210_18628_1_1533605350387_4518199_280-73534-0_sp1.1 from training. Duration: 0.8090625 2023-10-02 16:37:40,070 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_68095_210_20185_1_1535718226_8888855_1009-164755-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 16:37:41,807 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_40223_210_6228_1_1533298404_4812267_400-103813-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 16:37:41,808 WARNING [train.py:1204] (2/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_491-261923-0_sp1.1 from training. Duration: 0.8909375 2023-10-02 16:37:41,813 WARNING [train.py:1204] (2/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_712-342563-0 from training. Duration: 0.97 2023-10-02 16:37:43,170 WARNING [train.py:1204] (2/4) Exclude cut with ID _41161_210_20034_1_1533431249657_3555314_175-190032-0_sp1.1 from training. Duration: 0.963625 2023-10-02 16:37:43,908 WARNING [train.py:1204] (2/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_788-298907-0 from training. Duration: 0.89 2023-10-02 16:37:45,195 WARNING [train.py:1204] (2/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_225-70961-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 16:37:45,208 WARNING [train.py:1204] (2/4) Exclude cut with ID _58583_210_5236_1_1534726835266_3574249_235-175994-0_sp1.1 from training. Duration: 0.92725 2023-10-02 16:37:47,326 WARNING [train.py:1204] (2/4) Exclude cut with ID _12302_210_5219_1_1529924446466_8107280_907-182949-0_sp1.1 from training. Duration: 0.836375 2023-10-02 16:37:47,358 WARNING [train.py:1204] (2/4) Exclude cut with ID _54871_210_22356_1_1534665798253_3856359_94-193692-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 16:37:48,815 INFO [train.py:1046] (2/4) Epoch 27, batch 3900, loss[loss=0.1735, simple_loss=0.2424, pruned_loss=0.05233, over 23891.00 frames. ], tot_loss[loss=0.1652, simple_loss=0.242, pruned_loss=0.0442, over 4681979.66 frames. ], batch size: 195, lr: 3.77e-03, grad_scale: 16.0 2023-10-02 16:37:48,862 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_4528_210_3273_1_1527076654_3276777_342-168068-0_sp0.9 from training. Duration: 0.9666875 2023-10-02 16:37:48,912 WARNING [train.py:1204] (2/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_368-74239-0_sp1.1 from training. Duration: 0.92725 2023-10-02 16:37:48,914 WARNING [train.py:1204] (2/4) Exclude cut with ID _44831_210_13427_1_1533816011549_3689035_291-295928-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 16:37:48,993 WARNING [train.py:1204] (2/4) Exclude cut with ID _56406_210_9288_1_1534899618664_3496149_455-131997-0_sp1.1 from training. Duration: 0.97275 2023-10-02 16:37:50,172 WARNING [train.py:1204] (2/4) Exclude cut with ID _6473_210_1741_1_1527901307396_3815380_108-17335-0 from training. Duration: 0.65 2023-10-02 16:37:50,223 WARNING [train.py:1204] (2/4) Exclude cut with ID _14224_210_4381_1_1530529062037_4264010_286-135600-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 16:37:54,714 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_14056_210_4163_1_1530604483_7572827_928-321675-0_sp1.1 from training. Duration: 0.936375 2023-10-02 16:37:54,789 WARNING [train.py:1204] (2/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_81-300212-0_sp1.1 from training. Duration: 0.5454375 2023-10-02 16:37:54,818 WARNING [train.py:1204] (2/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_29-280622-0_sp0.9 from training. Duration: 0.911125 2023-10-02 16:37:56,190 WARNING [train.py:1204] (2/4) Exclude cut with ID _61079_210_4525_1_1534921008198_4011419_228-176447-0_sp1.1 from training. Duration: 0.936375 2023-10-02 16:37:57,588 WARNING [train.py:1204] (2/4) Exclude cut with ID _8394_210_6702_1_1528590504316_3540189_372-198878-0_sp1.1 from training. Duration: 0.5454375 2023-10-02 16:37:57,607 WARNING [train.py:1204] (2/4) Exclude cut with ID _64521_210_14699_1_1535243427259_3815328_244-208725-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 16:37:59,009 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8275_210_4198_1_1528455320_3892571_491-17707-0_sp0.9 from training. Duration: 0.711125 2023-10-02 16:38:00,415 WARNING [train.py:1204] (2/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_375-245677-0 from training. Duration: 0.81 2023-10-02 16:38:00,423 WARNING [train.py:1204] (2/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_690-286055-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 16:38:01,819 WARNING [train.py:1204] (2/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_89-321955-0 from training. Duration: 0.8 2023-10-02 16:38:01,847 WARNING [train.py:1204] (2/4) Exclude cut with ID _18895_210_6228_1_1531357274234_3784930_213-98721-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 16:38:03,132 WARNING [train.py:1204] (2/4) Exclude cut with ID _19708_210_9206_1_1532136650733_4238364_347-65240-0 from training. Duration: 0.97 2023-10-02 16:38:04,561 WARNING [train.py:1204] (2/4) Exclude cut with ID _64135_210_2253_1_1535677267876_7608972_498-5039-0 from training. Duration: 0.78 2023-10-02 16:38:09,084 WARNING [train.py:1204] (2/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_146-276944-0_sp1.1 from training. Duration: 0.7545625 2023-10-02 16:38:09,176 WARNING [train.py:1204] (2/4) Exclude cut with ID _63171_210_15488_1_1535104792794_3295110_155-34488-0_sp1.1 from training. Duration: 0.97275 2023-10-02 16:38:10,418 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_5438_210_1794_1_1527336687_2600009_233-58072-0_sp0.9 from training. Duration: 0.8555625 2023-10-02 16:38:10,459 WARNING [train.py:1204] (2/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_63-101996-0_sp0.9 from training. Duration: 0.97775 2023-10-02 16:38:15,322 WARNING [train.py:1204] (2/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_271-269084-0_sp1.1 from training. Duration: 0.7545625 2023-10-02 16:38:18,607 WARNING [train.py:1204] (2/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_345-167986-0_sp1.1 from training. Duration: 0.763625 2023-10-02 16:38:20,644 WARNING [train.py:1204] (2/4) Exclude cut with ID _39290_210_4220_1_1533862778760_3075118_92-311600-0_sp1.1 from training. Duration: 0.8 2023-10-02 16:38:21,945 WARNING [train.py:1204] (2/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_209-65306-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 16:38:22,028 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_26433_210_12108_1_1532157158_4056480_317-335807-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 16:38:28,844 WARNING [train.py:1204] (2/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_238-94127-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 16:38:28,891 WARNING [train.py:1204] (2/4) Exclude cut with ID _41314_210_12651_1_1533459476543_3766460_207-132038-0_sp1.1 from training. Duration: 0.8545625 2023-10-02 16:38:37,141 WARNING [train.py:1204] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_167-137255-0_sp1.1 from training. Duration: 0.5818125 2023-10-02 16:38:38,429 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_2009_210_2071_1_1525481134_4588937_385-302100-0_sp0.9 from training. Duration: 0.9666875 2023-10-02 16:38:41,977 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.conv_skip_rate, batch_count=946966.6666666666, ans=0.0 2023-10-02 16:38:47,388 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.495e+02 1.883e+02 2.029e+02 2.294e+02 3.792e+02, threshold=4.059e+02, percent-clipped=0.0 2023-10-02 16:38:47,554 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_15517_210_6753_1_1530954008_3487336_253-110617-0_sp1.1 from training. Duration: 0.936375 2023-10-02 16:38:50,782 WARNING [train.py:1204] (2/4) Exclude cut with ID _39290_210_4220_1_1533862778760_3075118_92-311600-0_sp0.9 from training. Duration: 0.97775 2023-10-02 16:38:52,129 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_42-142473-0 from training. Duration: 0.72 2023-10-02 16:38:52,167 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_2296_210_2087_1_1525503448_3397335_57-185430-0 from training. Duration: 0.6 2023-10-02 16:38:52,179 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_11305_210_1794_1_1529750428_7178148_257-312870-0_sp0.9 from training. Duration: 0.97775 2023-10-02 16:38:54,175 WARNING [train.py:1204] (2/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_222-198127-0 from training. Duration: 0.84 2023-10-02 16:38:54,957 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.1.encoder.layers.1.conv_module1.whiten, num_groups=1, num_channels=256, metric=12.29 vs. limit=15.0 2023-10-02 16:38:56,846 WARNING [train.py:1204] (2/4) Exclude cut with ID _65263_210_22356_1_1535763598691_3921037_423-202699-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 16:38:57,132 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.conv_module1.balancer1.prob, batch_count=947033.3333333334, ans=0.125 2023-10-02 16:38:58,203 WARNING [train.py:1204] (2/4) Exclude cut with ID _15037_210_2253_1_1530752294104_3267650_13-130361-0 from training. Duration: 0.99 2023-10-02 16:39:02,572 INFO [train.py:1046] (2/4) Epoch 27, batch 3950, loss[loss=0.1502, simple_loss=0.2285, pruned_loss=0.03596, over 20379.00 frames. ], tot_loss[loss=0.1651, simple_loss=0.2421, pruned_loss=0.04399, over 4682084.18 frames. ], batch size: 44, lr: 3.77e-03, grad_scale: 16.0 2023-10-02 16:39:02,658 WARNING [train.py:1204] (2/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_628-198080-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 16:39:02,835 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.nonlin_attention.balancer.prob, batch_count=947100.0, ans=0.125 2023-10-02 16:39:04,015 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_3171_210_2087_1_1526109084_2330007_37-64334-0 from training. Duration: 0.82 2023-10-02 16:39:04,064 WARNING [train.py:1204] (2/4) Exclude cut with ID _5287_210_2084_1_1527418883040_3878670_12-221186-0_sp1.1 from training. Duration: 0.8909375 2023-10-02 16:39:04,823 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.2.encoder.layers.0.conv_module1.whiten, num_groups=1, num_channels=384, metric=10.66 vs. limit=15.0 2023-10-02 16:39:07,920 WARNING [train.py:1204] (2/4) Exclude cut with ID _6653_210_2629_1_1527919172441_6732679_1103-56445-0_sp1.1 from training. Duration: 0.663625 2023-10-02 16:39:09,303 WARNING [train.py:1204] (2/4) Exclude cut with ID _65263_210_22356_1_1535763598691_3921037_169-140068-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 16:39:15,345 WARNING [train.py:1204] (2/4) Exclude cut with ID IC0671W0118-331961 from training. Duration: 0.896 2023-10-02 16:39:15,396 WARNING [train.py:1204] (2/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_402-102311-0_sp1.1 from training. Duration: 0.6090625 2023-10-02 16:39:15,421 WARNING [train.py:1204] (2/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_202-293523-0 from training. Duration: 0.72 2023-10-02 16:39:16,716 WARNING [train.py:1204] (2/4) Exclude cut with ID IC0679W0038-335846 from training. Duration: 0.768 2023-10-02 16:39:16,742 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_37357_210_17024_1_1533089504_2316121_79-198866-0_sp1.1 from training. Duration: 0.936375 2023-10-02 16:39:19,612 WARNING [train.py:1204] (2/4) Exclude cut with ID _7606_210_4915_1_1528894253277_5080690_292-271285-0_sp1.1 from training. Duration: 0.963625 2023-10-02 16:39:21,540 WARNING [train.py:1204] (2/4) Exclude cut with ID _3541_210_2477_1_1526475408709_3263973_377-296839-0_sp0.9 from training. Duration: 0.611125 2023-10-02 16:39:21,543 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_45886_210_17024_1_1533704917_2435333_58-131069-0_sp1.1 from training. Duration: 0.963625 2023-10-02 16:39:21,738 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=947166.6666666666, ans=0.1 2023-10-02 16:39:24,233 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6961_210_2087_1_1527937168_6815011_395-185060-0 from training. Duration: 0.95 2023-10-02 16:39:24,524 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=947166.6666666666, ans=0.1 2023-10-02 16:39:28,221 WARNING [train.py:1204] (2/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_304-266479-0_sp1.1 from training. Duration: 0.97275 2023-10-02 16:39:28,272 WARNING [train.py:1204] (2/4) Exclude cut with ID _3867_210_1883_1_1526641200575_4475830_319-349850-0_sp1.1 from training. Duration: 0.6090625 2023-10-02 16:39:28,486 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=947166.6666666666, ans=0.1 2023-10-02 16:39:29,678 WARNING [train.py:1204] (2/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_304-59227-0_sp0.9 from training. Duration: 0.8555625 2023-10-02 16:39:29,932 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=947166.6666666666, ans=0.1 2023-10-02 16:39:31,055 WARNING [train.py:1204] (2/4) Exclude cut with ID _8021_210_7994_1_1528376451358_3653069_100-140231-0_sp0.9 from training. Duration: 0.7444375 2023-10-02 16:39:31,124 WARNING [train.py:1204] (2/4) Exclude cut with ID _8833_210_5219_1_1528592665210_7553059_329-125156-0_sp0.9 from training. Duration: 0.911125 2023-10-02 16:39:40,908 WARNING [train.py:1204] (2/4) Exclude cut with ID _14459_210_2228_1_1531443641946_7516700_792-287234-0_sp1.1 from training. Duration: 0.92725 2023-10-02 16:39:40,930 WARNING [train.py:1204] (2/4) Exclude cut with ID _19708_210_9206_1_1532136650733_4238364_347-65240-0_sp1.1 from training. Duration: 0.8818125 2023-10-02 16:39:41,116 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.1.conv_module1.balancer2.prob, batch_count=947233.3333333334, ans=0.125 2023-10-02 16:39:41,131 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.conv_module1.balancer1.max_abs, batch_count=947233.3333333334, ans=10.0 2023-10-02 16:39:45,764 WARNING [train.py:1204] (2/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_70-27732-0 from training. Duration: 0.94 2023-10-02 16:39:52,027 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_397-132565-0 from training. Duration: 0.99 2023-10-02 16:39:52,038 WARNING [train.py:1204] (2/4) Exclude cut with ID _49728_210_12651_1_1534041034294_6788150_624-195301-0 from training. Duration: 0.85 2023-10-02 16:39:52,065 WARNING [train.py:1204] (2/4) Exclude cut with ID _55429_210_10626_1_1534467588527_7174143_377-169391-0_sp1.1 from training. Duration: 0.87275 2023-10-02 16:39:53,322 WARNING [train.py:1204] (2/4) Exclude cut with ID _49376_210_5986_1_1533985278732_3370740_354-344695-0_sp1.1 from training. Duration: 0.9 2023-10-02 16:40:02,271 WARNING [train.py:1204] (2/4) Exclude cut with ID _4697_210_2069_1_1526815801953_3823098_276-96593-0_sp1.1 from training. Duration: 0.7 2023-10-02 16:40:02,279 WARNING [train.py:1204] (2/4) Exclude cut with ID _15012_210_1968_1_1530856820429_5095350_688-178849-0_sp0.9 from training. Duration: 0.711125 2023-10-02 16:40:03,638 WARNING [train.py:1204] (2/4) Exclude cut with ID _25565_210_12392_1_1532423009382_3952098_372-69023-0_sp1.1 from training. Duration: 0.936375 2023-10-02 16:40:03,666 WARNING [train.py:1204] (2/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_61-327062-0_sp1.1 from training. Duration: 0.736375 2023-10-02 16:40:03,702 WARNING [train.py:1204] (2/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_215-141059-0 from training. Duration: 0.62 2023-10-02 16:40:09,326 WARNING [train.py:1204] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_6-226750-0_sp1.1 from training. Duration: 0.8545625 2023-10-02 16:40:09,416 WARNING [train.py:1204] (2/4) Exclude cut with ID _14348_210_8341_1_1530599434996_3778640_240-232370-0_sp1.1 from training. Duration: 0.763625 2023-10-02 16:40:13,120 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.0.layers.0.feed_forward3.out_whiten, num_groups=1, num_channels=192, metric=12.84 vs. limit=15.0 2023-10-02 16:40:13,460 WARNING [train.py:1204] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_468-189058-0 from training. Duration: 0.96 2023-10-02 16:40:16,472 INFO [train.py:1046] (2/4) Epoch 27, batch 4000, loss[loss=0.1475, simple_loss=0.2272, pruned_loss=0.03392, over 24618.00 frames. ], tot_loss[loss=0.1654, simple_loss=0.2425, pruned_loss=0.04416, over 4694623.62 frames. ], batch size: 60, lr: 3.77e-03, grad_scale: 32.0 2023-10-02 16:40:23,950 WARNING [train.py:1204] (2/4) Exclude cut with ID _21987_210_5854_1_1531821500482_3637038_247-93340-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 16:40:31,351 WARNING [train.py:1204] (2/4) Exclude cut with ID _66868_210_9527_1_1535509287954_4561140_436-191602-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 16:40:36,304 INFO [scaling.py:1118] (2/4) WithLoss: name=encoder.encoders.1.encoder.layers.1.self_attn_weights, loss-sum=0.000e+00 2023-10-02 16:40:37,372 WARNING [train.py:1204] (2/4) Exclude cut with ID _18895_210_6228_1_1531357274234_3784930_394-58203-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 16:40:37,418 WARNING [train.py:1204] (2/4) Exclude cut with ID _58605_210_12210_1_1534723195939_3904259_258-281498-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 16:40:38,779 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_33348_210_5632_1_1533086912_3324030_245-292752-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 16:40:38,798 WARNING [train.py:1204] (2/4) Exclude cut with ID _67831_210_12392_1_1535612464202_3092612_346-2148-0 from training. Duration: 0.93 2023-10-02 16:40:40,104 WARNING [train.py:1204] (2/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_369-222060-0_sp0.9 from training. Duration: 0.788875 2023-10-02 16:40:40,154 WARNING [train.py:1204] (2/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_245-174553-0 from training. Duration: 0.88 2023-10-02 16:40:40,170 WARNING [train.py:1204] (2/4) Exclude cut with ID _3541_210_2477_1_1526475408709_3263973_231-104736-0_sp1.1 from training. Duration: 0.7090625 2023-10-02 16:40:40,176 WARNING [train.py:1204] (2/4) Exclude cut with ID _44585_210_18628_1_1533605350387_4518199_280-73534-0 from training. Duration: 0.89 2023-10-02 16:40:41,628 WARNING [train.py:1204] (2/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_22-201476-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 16:40:44,442 WARNING [train.py:1204] (2/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_325-62802-0_sp0.9 from training. Duration: 0.9666875 2023-10-02 16:40:44,455 WARNING [train.py:1204] (2/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_199-274062-0_sp1.1 from training. Duration: 0.97275 2023-10-02 16:40:44,459 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_8-319957-0_sp1.1 from training. Duration: 0.87275 2023-10-02 16:40:44,645 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=947566.6666666666, ans=0.1 2023-10-02 16:40:45,739 WARNING [train.py:1204] (2/4) Exclude cut with ID _29343_210_4381_1_1532602757752_3674109_224-349726-0_sp1.1 from training. Duration: 0.936375 2023-10-02 16:40:45,744 WARNING [train.py:1204] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_520-126729-0_sp1.1 from training. Duration: 0.636375 2023-10-02 16:40:47,092 WARNING [train.py:1204] (2/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_239-22076-0_sp1.1 from training. Duration: 0.77275 2023-10-02 16:40:47,362 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.balancer1.prob, batch_count=947566.6666666666, ans=0.125 2023-10-02 16:40:47,433 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.bypass_mid.scale_min, batch_count=947566.6666666666, ans=0.2 2023-10-02 16:40:48,935 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9012W0011-501146 from training. Duration: 0.896 2023-10-02 16:40:49,008 WARNING [train.py:1204] (2/4) Exclude cut with ID _29857_210_20034_1_1532395886753_3798160_279-185801-0_sp1.1 from training. Duration: 0.82725 2023-10-02 16:40:49,167 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.bypass_mid.scale_min, batch_count=947566.6666666666, ans=0.2 2023-10-02 16:40:50,471 WARNING [train.py:1204] (2/4) Exclude cut with ID _43552_210_8913_1_1533726031059_3523429_261-223933-0_sp1.1 from training. Duration: 0.92725 2023-10-02 16:40:50,690 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.bypass.skip_rate, batch_count=947566.6666666666, ans=0.09899494936611666 2023-10-02 16:40:51,938 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9029W0025-509426 from training. Duration: 0.896 2023-10-02 16:40:53,263 WARNING [train.py:1204] (2/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_337-181502-0_sp1.1 from training. Duration: 0.5909375 2023-10-02 16:40:53,277 WARNING [train.py:1204] (2/4) Exclude cut with ID _68635_210_9317_1_1535770813410_4315870_159-332599-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 16:41:00,755 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_161-276996-0 from training. Duration: 0.95 2023-10-02 16:41:00,791 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7088_210_1874_1_1527939776_1101113_127-68870-0_sp1.1 from training. Duration: 0.936375 2023-10-02 16:41:00,956 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.nonlin_attention.balancer.min_positive, batch_count=947633.3333333334, ans=0.05 2023-10-02 16:41:02,396 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.conv_skip_rate, batch_count=947633.3333333334, ans=0.0 2023-10-02 16:41:02,477 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.ff3_skip_rate, batch_count=947633.3333333334, ans=0.0 2023-10-02 16:41:03,933 WARNING [train.py:1204] (2/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_322-217220-0_sp1.1 from training. Duration: 0.7818125 2023-10-02 16:41:03,988 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9072W0002-530636 from training. Duration: 0.768 2023-10-02 16:41:04,229 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.feed_forward1.out_proj.dropout_p, batch_count=947633.3333333334, ans=0.1 2023-10-02 16:41:05,485 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.balancer2.prob, batch_count=947633.3333333334, ans=0.125 2023-10-02 16:41:06,594 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_340-336408-0_sp1.1 from training. Duration: 0.8454375 2023-10-02 16:41:06,661 WARNING [train.py:1204] (2/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_395-37222-0 from training. Duration: 0.76 2023-10-02 16:41:06,664 WARNING [train.py:1204] (2/4) Exclude cut with ID _64099_210_17301_1_1535526977656_1578188_159-209156-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 16:41:08,681 WARNING [train.py:1204] (2/4) Exclude cut with ID _53950_210_5816_1_1534417264919_3862951_28-29681-0_sp1.1 from training. Duration: 0.92725 2023-10-02 16:41:08,756 WARNING [train.py:1204] (2/4) Exclude cut with ID _4681_210_3525_1_1527250689464_120889_15-126760-0_sp1.1 from training. Duration: 0.736375 2023-10-02 16:41:11,423 WARNING [train.py:1204] (2/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_242-3472-0_sp1.1 from training. Duration: 0.663625 2023-10-02 16:41:11,464 WARNING [train.py:1204] (2/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_436-186311-0_sp0.9 from training. Duration: 0.72225 2023-10-02 16:41:11,479 WARNING [train.py:1204] (2/4) Exclude cut with ID _3675_210_4557_1_1526639934408_4182089_452-172147-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 16:41:14,132 WARNING [train.py:1204] (2/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_80-147301-0 from training. Duration: 0.84 2023-10-02 16:41:14,162 WARNING [train.py:1204] (2/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_351-107588-0_sp1.1 from training. Duration: 0.92725 2023-10-02 16:41:15,413 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.476e+02 1.843e+02 2.082e+02 2.345e+02 3.565e+02, threshold=4.163e+02, percent-clipped=0.0 2023-10-02 16:41:15,550 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9111W0002-550781 from training. Duration: 0.896 2023-10-02 16:41:19,722 WARNING [train.py:1204] (2/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_465-156500-0_sp1.1 from training. Duration: 0.6545625 2023-10-02 16:41:22,987 WARNING [train.py:1204] (2/4) Exclude cut with ID _13938_210_9988_1_1530518718978_3693960_304-112784-0_sp1.1 from training. Duration: 0.4181875 2023-10-02 16:41:24,455 WARNING [train.py:1204] (2/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_202-67017-0_sp1.1 from training. Duration: 0.7454375 2023-10-02 16:41:24,499 WARNING [train.py:1204] (2/4) Exclude cut with ID _58414_210_8254_1_1535421609162_7151182_448-4358-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 16:41:24,733 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.conv_module2.balancer2.prob, batch_count=947700.0, ans=0.125 2023-10-02 16:41:25,797 WARNING [train.py:1204] (2/4) Exclude cut with ID _66840_210_5219_1_1535452168426_4196349_203-243234-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 16:41:27,755 WARNING [train.py:1204] (2/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_201-213513-0_sp1.1 from training. Duration: 0.97275 2023-10-02 16:41:30,554 INFO [train.py:1046] (2/4) Epoch 27, batch 4050, loss[loss=0.1544, simple_loss=0.2352, pruned_loss=0.03682, over 21582.00 frames. ], tot_loss[loss=0.1663, simple_loss=0.2435, pruned_loss=0.04456, over 4699717.87 frames. ], batch size: 47, lr: 3.77e-03, grad_scale: 16.0 2023-10-02 16:41:32,523 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6291_210_3273_1_1527683649_1078498_117-187560-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 16:41:35,297 WARNING [train.py:1204] (2/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_135-174080-0_sp0.9 from training. Duration: 0.588875 2023-10-02 16:41:35,334 WARNING [train.py:1204] (2/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_45-265684-0 from training. Duration: 0.72 2023-10-02 16:41:36,674 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_435-313305-0_sp1.1 from training. Duration: 0.6454375 2023-10-02 16:41:36,699 WARNING [train.py:1204] (2/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_235-307337-0_sp1.1 from training. Duration: 0.8545625 2023-10-02 16:41:38,649 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7762_210_1794_1_1528199508_4057408_419-125009-0_sp0.9 from training. Duration: 0.92225 2023-10-02 16:41:39,908 WARNING [train.py:1204] (2/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_215-141059-0_sp1.1 from training. Duration: 0.563625 2023-10-02 16:41:41,269 WARNING [train.py:1204] (2/4) Exclude cut with ID _27013_210_1389_1_1532437173240_3252649_222-125573-0_sp1.1 from training. Duration: 0.8909375 2023-10-02 16:41:44,017 WARNING [train.py:1204] (2/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_126-274694-0_sp1.1 from training. Duration: 0.8909375 2023-10-02 16:41:46,859 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_2592_210_2290_1_1525697025_4882925_157-70512-0_sp1.1 from training. Duration: 0.82725 2023-10-02 16:41:46,899 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_292-265928-0_sp0.9 from training. Duration: 0.5666875 2023-10-02 16:41:49,597 WARNING [train.py:1204] (2/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_1211-226066-0_sp1.1 from training. Duration: 0.6909375 2023-10-02 16:41:49,653 WARNING [train.py:1204] (2/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_394-265747-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 16:41:52,847 WARNING [train.py:1204] (2/4) Exclude cut with ID _48782_210_12658_1_1534467701544_2300422_239-67139-0_sp1.1 from training. Duration: 0.97275 2023-10-02 16:41:55,570 WARNING [train.py:1204] (2/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_329-113173-0_sp1.1 from training. Duration: 0.563625 2023-10-02 16:41:58,320 WARNING [train.py:1204] (2/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_81-75849-0_sp0.9 from training. Duration: 0.4333125 2023-10-02 16:41:59,659 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.3.encoder.layers.1.nonlin_attention.whiten2, num_groups=1, num_channels=512, metric=9.08 vs. limit=15.0 2023-10-02 16:42:01,627 WARNING [train.py:1204] (2/4) Exclude cut with ID _9235_210_5219_1_1528891154253_8046120_1003-101638-0 from training. Duration: 0.73 2023-10-02 16:42:01,650 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9309W0189-640316 from training. Duration: 0.64 2023-10-02 16:42:03,095 WARNING [train.py:1204] (2/4) Exclude cut with ID _45329_210_7005_1_1534157951211_6774339_503-287883-0_sp1.1 from training. Duration: 0.8 2023-10-02 16:42:10,802 WARNING [train.py:1204] (2/4) Exclude cut with ID _8833_210_5219_1_1528592665210_7553059_611-80313-0 from training. Duration: 0.64 2023-10-02 16:42:12,103 WARNING [train.py:1204] (2/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_335-106626-0_sp1.1 from training. Duration: 0.963625 2023-10-02 16:42:14,969 WARNING [train.py:1204] (2/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_237-44332-0_sp1.1 from training. Duration: 0.8545625 2023-10-02 16:42:17,725 WARNING [train.py:1204] (2/4) Exclude cut with ID _46952_210_5986_1_1534230010357_3107379_178-147341-0_sp1.1 from training. Duration: 0.97275 2023-10-02 16:42:17,771 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_13091_210_7925_1_1530609447_8987354_946-154426-0_sp1.1 from training. Duration: 0.936375 2023-10-02 16:42:17,789 WARNING [train.py:1204] (2/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_326-17061-0_sp1.1 from training. Duration: 0.8545625 2023-10-02 16:42:21,937 WARNING [train.py:1204] (2/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_38-169920-0_sp1.1 from training. Duration: 0.82725 2023-10-02 16:42:26,706 WARNING [train.py:1204] (2/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_283-220372-0 from training. Duration: 0.56 2023-10-02 16:42:26,721 WARNING [train.py:1204] (2/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_252-216674-0_sp1.1 from training. Duration: 0.4909375 2023-10-02 16:42:26,825 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_202-91290-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 16:42:29,692 WARNING [train.py:1204] (2/4) Exclude cut with ID _41314_210_12651_1_1533459476543_3766460_80-74184-0 from training. Duration: 0.83 2023-10-02 16:42:31,891 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=948033.3333333334, ans=0.1 2023-10-02 16:42:33,161 WARNING [train.py:1204] (2/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_411-271358-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 16:42:39,121 WARNING [train.py:1204] (2/4) Exclude cut with ID _68777_210_4353_1_1535810390731_3117904_129-166012-0 from training. Duration: 0.85 2023-10-02 16:42:39,378 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.balancer2.prob, batch_count=948033.3333333334, ans=0.125 2023-10-02 16:42:42,307 WARNING [train.py:1204] (2/4) Exclude cut with ID _44982_210_1511_1_1534312820108_3726029_410-109759-0_sp1.1 from training. Duration: 0.963625 2023-10-02 16:42:42,314 WARNING [train.py:1204] (2/4) Exclude cut with ID _14224_210_4381_1_1530529062037_4264010_364-22817-0_sp1.1 from training. Duration: 0.6454375 2023-10-02 16:42:42,477 WARNING [train.py:1204] (2/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_89-242335-0 from training. Duration: 0.95 2023-10-02 16:42:42,493 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_12868_210_6753_1_1530702479_3610564_151-325626-0 from training. Duration: 0.97 2023-10-02 16:42:42,494 WARNING [train.py:1204] (2/4) Exclude cut with ID _6275_210_2084_1_1528110062213_7669049_786-264332-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 16:42:45,016 INFO [train.py:1046] (2/4) Epoch 27, batch 4100, loss[loss=0.1722, simple_loss=0.2549, pruned_loss=0.04477, over 23967.00 frames. ], tot_loss[loss=0.1671, simple_loss=0.2447, pruned_loss=0.04471, over 4702062.30 frames. ], batch size: 80, lr: 3.77e-03, grad_scale: 16.0 2023-10-02 16:42:45,091 WARNING [train.py:1204] (2/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_35-153886-0_sp1.1 from training. Duration: 0.763625 2023-10-02 16:42:45,179 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_24007_210_1794_1_1531918925_1663875_116-100916-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 16:42:45,192 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_162-262717-0_sp1.1 from training. Duration: 0.7545625 2023-10-02 16:42:51,962 WARNING [train.py:1204] (2/4) Exclude cut with ID _8263_210_1664_1_1528438953494_294100_40-204731-0 from training. Duration: 0.5 2023-10-02 16:42:53,457 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder_embed.convnext.out_balancer.prob, batch_count=948100.0, ans=0.125 2023-10-02 16:42:54,747 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_12868_210_6753_1_1530702479_3610564_416-293746-0 from training. Duration: 0.9 2023-10-02 16:42:56,179 WARNING [train.py:1204] (2/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_605-219732-0 from training. Duration: 0.94 2023-10-02 16:42:57,922 WARNING [train.py:1204] (2/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_454-123002-0 from training. Duration: 0.69 2023-10-02 16:42:57,928 WARNING [train.py:1204] (2/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_25-304273-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 16:42:57,991 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6860_210_4852_1_1527917457_3833327_451-39702-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 16:42:59,270 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_304-16505-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 16:42:59,291 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6291_210_3273_1_1527683649_1078498_4-266348-0_sp1.1 from training. Duration: 0.7090625 2023-10-02 16:43:00,751 WARNING [train.py:1204] (2/4) Exclude cut with ID ID0149W0001-753701 from training. Duration: 0.989 2023-10-02 16:43:04,110 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_41251_210_15368_1_1533429767_4820695_394-55393-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 16:43:05,408 WARNING [train.py:1204] (2/4) Exclude cut with ID _8793_210_6310_1_1528542985139_2310009_121-179782-0_sp1.1 from training. Duration: 0.72725 2023-10-02 16:43:05,429 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_2729_210_3528_1_1525863505_3882214_421-73047-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 16:43:06,717 WARNING [train.py:1204] (2/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_278-154492-0_sp0.9 from training. Duration: 0.8444375 2023-10-02 16:43:10,118 WARNING [train.py:1204] (2/4) Exclude cut with ID _14170_210_5219_1_1530525949868_4469650_412-317077-0_sp0.9 from training. Duration: 0.8333125 2023-10-02 16:43:11,595 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_727-138317-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 16:43:11,634 WARNING [train.py:1204] (2/4) Exclude cut with ID _25808_210_13292_1_1532343514562_7366109_345-335048-0_sp1.1 from training. Duration: 0.92725 2023-10-02 16:43:11,646 WARNING [train.py:1204] (2/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_506-23200-0 from training. Duration: 0.97 2023-10-02 16:43:13,021 WARNING [train.py:1204] (2/4) Exclude cut with ID _44718_210_5816_1_1534050051869_6469030_643-245187-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 16:43:13,025 WARNING [train.py:1204] (2/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_42-186162-0_sp1.1 from training. Duration: 0.836375 2023-10-02 16:43:13,037 WARNING [train.py:1204] (2/4) Exclude cut with ID _3231_210_2106_1_1526205864934_6784220_171-256004-0_sp1.1 from training. Duration: 0.7818125 2023-10-02 16:43:13,072 WARNING [train.py:1204] (2/4) Exclude cut with ID _25914_210_6400_1_1532141317058_644740_43-248478-0_sp1.1 from training. Duration: 0.936375 2023-10-02 16:43:13,122 WARNING [train.py:1204] (2/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_577-287150-0 from training. Duration: 0.82 2023-10-02 16:43:13,321 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.ff2_skip_rate, batch_count=948233.3333333334, ans=0.0 2023-10-02 16:43:17,228 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_46015_210_7234_1_1533863319_3087837_376-276068-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 16:43:17,428 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.feed_forward1.hidden_balancer.prob, batch_count=948233.3333333334, ans=0.125 2023-10-02 16:43:18,572 WARNING [train.py:1204] (2/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_51-200220-0 from training. Duration: 0.56 2023-10-02 16:43:18,661 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_452-96101-0_sp1.1 from training. Duration: 0.963625 2023-10-02 16:43:20,231 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.conv_module1.balancer2.prob, batch_count=948233.3333333334, ans=0.125 2023-10-02 16:43:21,387 WARNING [train.py:1204] (2/4) Exclude cut with ID _13252_210_1925_1_1530671372047_3336909_263-168388-0_sp1.1 from training. Duration: 0.7818125 2023-10-02 16:43:21,388 WARNING [train.py:1204] (2/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_469-94673-0 from training. Duration: 0.79 2023-10-02 16:43:22,907 WARNING [train.py:1204] (2/4) Exclude cut with ID _14922_210_6228_1_1530707435273_3693310_303-228738-0_sp1.1 from training. Duration: 0.763625 2023-10-02 16:43:22,948 WARNING [train.py:1204] (2/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_262-250826-0_sp1.1 from training. Duration: 0.9 2023-10-02 16:43:22,970 WARNING [train.py:1204] (2/4) Exclude cut with ID _68777_210_4353_1_1535810390731_3117904_129-166012-0_sp1.1 from training. Duration: 0.77275 2023-10-02 16:43:24,366 WARNING [train.py:1204] (2/4) Exclude cut with ID _58895_210_8750_1_1535184081320_3649089_265-323810-0 from training. Duration: 0.96 2023-10-02 16:43:25,749 WARNING [train.py:1204] (2/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_120-76843-0_sp0.9 from training. Duration: 0.611125 2023-10-02 16:43:27,069 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7544_210_6753_1_1528587722_3576686_295-258479-0_sp0.9 from training. Duration: 0.9333125 2023-10-02 16:43:28,956 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7046_210_6753_1_1527895337_6920917_783-278109-0 from training. Duration: 0.77 2023-10-02 16:43:29,021 WARNING [train.py:1204] (2/4) Exclude cut with ID _42326_210_7770_1_1533814282531_3707269_113-308323-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 16:43:29,063 WARNING [train.py:1204] (2/4) Exclude cut with ID _7852_210_5219_1_1528286471316_8139400_242-105336-0_sp0.9 from training. Duration: 0.8 2023-10-02 16:43:33,851 WARNING [train.py:1204] (2/4) Exclude cut with ID _56235_210_10640_1_1534557335235_4161099_114-277905-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 16:43:38,072 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.nonlin_attention.balancer.prob, batch_count=948300.0, ans=0.125 2023-10-02 16:43:39,912 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_13118_210_7925_1_1530755612_7608297_42-66683-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 16:43:42,629 WARNING [train.py:1204] (2/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_901-79438-0_sp1.1 from training. Duration: 0.8545625 2023-10-02 16:43:42,685 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_30280_210_1881_1_1532872779_3660139_211-182194-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 16:43:45,429 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.562e+02 1.836e+02 1.976e+02 2.197e+02 2.879e+02, threshold=3.952e+02, percent-clipped=0.0 2023-10-02 16:43:48,371 WARNING [train.py:1204] (2/4) Exclude cut with ID _14898_210_9206_1_1530766812377_4177190_621-45732-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 16:43:48,379 WARNING [train.py:1204] (2/4) Exclude cut with ID _35350_210_16040_1_1533040191183_4075299_224-133775-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 16:43:51,228 WARNING [train.py:1204] (2/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_147-293311-0_sp1.1 from training. Duration: 0.8545625 2023-10-02 16:43:51,681 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.whiten.whitening_limit, batch_count=948366.6666666666, ans=12.0 2023-10-02 16:43:53,953 WARNING [train.py:1204] (2/4) Exclude cut with ID _12302_210_5219_1_1529924446466_8107280_835-279754-0_sp1.1 from training. Duration: 0.8181875 2023-10-02 16:43:56,710 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8453_210_4211_1_1528502940_8562730_347-330673-0_sp0.9 from training. Duration: 0.8 2023-10-02 16:43:58,460 INFO [train.py:1046] (2/4) Epoch 27, batch 4150, loss[loss=0.1847, simple_loss=0.2625, pruned_loss=0.05349, over 23807.00 frames. ], tot_loss[loss=0.1674, simple_loss=0.2449, pruned_loss=0.04498, over 4701800.14 frames. ], batch size: 85, lr: 3.77e-03, grad_scale: 16.0 2023-10-02 16:43:58,540 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_213-103282-0_sp0.9 from training. Duration: 0.8666875 2023-10-02 16:44:00,552 WARNING [train.py:1204] (2/4) Exclude cut with ID _49728_210_12651_1_1534041034294_6788150_66-43496-0_sp1.1 from training. Duration: 0.97275 2023-10-02 16:44:00,554 WARNING [train.py:1204] (2/4) Exclude cut with ID _19023_210_4353_1_1531378892552_5820780_28-36615-0_sp1.1 from training. Duration: 0.8818125 2023-10-02 16:44:03,324 WARNING [train.py:1204] (2/4) Exclude cut with ID _14349_210_8341_1_1530615710818_3442470_115-285975-0 from training. Duration: 0.86 2023-10-02 16:44:03,351 WARNING [train.py:1204] (2/4) Exclude cut with ID _12734_210_8341_1_1530183614296_3891929_236-47464-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 16:44:04,649 WARNING [train.py:1204] (2/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_177-236809-0 from training. Duration: 0.49 2023-10-02 16:44:04,716 WARNING [train.py:1204] (2/4) Exclude cut with ID _13678_210_7497_1_1530530069226_3463170_371-3221-0 from training. Duration: 0.9 2023-10-02 16:44:04,729 WARNING [train.py:1204] (2/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_48-338976-0 from training. Duration: 0.89 2023-10-02 16:44:07,981 WARNING [train.py:1204] (2/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_1167-309744-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 16:44:09,716 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.conv_module1.balancer1.prob, batch_count=948433.3333333334, ans=0.125 2023-10-02 16:44:13,633 WARNING [train.py:1204] (2/4) Exclude cut with ID _30327_210_16049_1_1532484161295_3924169_401-70819-0_sp0.9 from training. Duration: 0.9555625 2023-10-02 16:44:13,641 WARNING [train.py:1204] (2/4) Exclude cut with ID _55134_210_21982_1_1534590028377_3882170_346-91022-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 16:44:17,802 WARNING [train.py:1204] (2/4) Exclude cut with ID _14459_210_2228_1_1531443641946_7516700_174-118801-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 16:44:18,044 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.balancer2.prob, batch_count=948500.0, ans=0.125 2023-10-02 16:44:19,167 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_269-278006-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 16:44:19,227 WARNING [train.py:1204] (2/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_390-339137-0_sp0.9 from training. Duration: 0.62225 2023-10-02 16:44:21,854 WARNING [train.py:1204] (2/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_253-308377-0_sp1.1 from training. Duration: 0.4454375 2023-10-02 16:44:21,865 WARNING [train.py:1204] (2/4) Exclude cut with ID _23825_210_13449_1_1531915313526_3497150_240-271367-0_sp1.1 from training. Duration: 0.8818125 2023-10-02 16:44:23,251 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1015-343884-0_sp0.9 from training. Duration: 0.688875 2023-10-02 16:44:26,082 WARNING [train.py:1204] (2/4) Exclude cut with ID _23514_210_1389_1_1531832139785_3031439_335-48210-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 16:44:29,035 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_309-65177-0_sp1.1 from training. Duration: 0.8 2023-10-02 16:44:31,023 WARNING [train.py:1204] (2/4) Exclude cut with ID _14368_210_4220_1_1530532714870_3595529_231-137892-0 from training. Duration: 0.99 2023-10-02 16:44:34,790 WARNING [train.py:1204] (2/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_57-270037-0 from training. Duration: 0.77 2023-10-02 16:44:34,795 WARNING [train.py:1204] (2/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_465-214726-0_sp1.1 from training. Duration: 0.7818125 2023-10-02 16:44:36,170 WARNING [train.py:1204] (2/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_106-300985-0 from training. Duration: 0.9 2023-10-02 16:44:36,175 WARNING [train.py:1204] (2/4) Exclude cut with ID _14632_210_9988_1_1530691279392_3923200_161-129530-0_sp1.1 from training. Duration: 0.87275 2023-10-02 16:44:36,186 WARNING [train.py:1204] (2/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_514-88370-0_sp1.1 from training. Duration: 0.963625 2023-10-02 16:44:36,500 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.conv_module1.balancer1.prob, batch_count=948566.6666666666, ans=0.125 2023-10-02 16:44:36,510 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.conv_module2.balancer2.prob, batch_count=948566.6666666666, ans=0.125 2023-10-02 16:44:37,761 WARNING [train.py:1204] (2/4) Exclude cut with ID _56495_210_4234_1_1534748080334_4136219_305-334505-0_sp1.1 from training. Duration: 0.92725 2023-10-02 16:44:37,855 WARNING [train.py:1204] (2/4) Exclude cut with ID _42875_210_14856_1_1533704397487_4012433_61-264914-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 16:44:43,862 WARNING [train.py:1204] (2/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_91-148831-0 from training. Duration: 0.99 2023-10-02 16:44:46,646 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_435-313305-0_sp0.9 from training. Duration: 0.788875 2023-10-02 16:44:47,991 WARNING [train.py:1204] (2/4) Exclude cut with ID _16744_210_5986_1_1531724405384_3907567_242-111316-0_sp1.1 from training. Duration: 0.8454375 2023-10-02 16:44:49,408 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_257-20733-0 from training. Duration: 0.76 2023-10-02 16:44:49,463 WARNING [train.py:1204] (2/4) Exclude cut with ID _8064_210_5399_1_1528372299291_4282973_4-91037-0_sp1.1 from training. Duration: 0.8 2023-10-02 16:44:50,803 WARNING [train.py:1204] (2/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_3-276701-0 from training. Duration: 0.91 2023-10-02 16:44:53,456 WARNING [train.py:1204] (2/4) Exclude cut with ID _3992_210_1747_1_1526644816031_3731060_36-147855-0_sp1.1 from training. Duration: 0.7090625 2023-10-02 16:44:54,756 WARNING [train.py:1204] (2/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_1296-298671-0_sp1.1 from training. Duration: 0.963625 2023-10-02 16:44:56,159 WARNING [train.py:1204] (2/4) Exclude cut with ID _68965_210_1883_1_1535698962272_3497490_309-334593-0_sp1.1 from training. Duration: 0.92725 2023-10-02 16:44:57,454 WARNING [train.py:1204] (2/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_128-296581-0 from training. Duration: 0.74 2023-10-02 16:44:57,455 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_17613_210_2151_1_1531137569_2366160_170-299079-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 16:44:57,457 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6287_210_3273_1_1527509506_2653716_182-199887-0_sp1.1 from training. Duration: 0.463625 2023-10-02 16:44:58,939 WARNING [train.py:1204] (2/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_80-135200-0_sp1.1 from training. Duration: 0.7181875 2023-10-02 16:45:00,382 WARNING [train.py:1204] (2/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_34-183685-0 from training. Duration: 0.98 2023-10-02 16:45:00,410 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8278_210_2550_1_1528637099_4220413_418-210155-0_sp1.1 from training. Duration: 0.92725 2023-10-02 16:45:01,739 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8853_210_3273_1_1528633899_818968_51-187685-0_sp0.9 from training. Duration: 0.7666875 2023-10-02 16:45:01,767 WARNING [train.py:1204] (2/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_332-221057-0_sp1.1 from training. Duration: 0.6181875 2023-10-02 16:45:02,495 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_2221_210_1988_1_1525597051_4935541_415-296882-0 from training. Duration: 0.77 2023-10-02 16:45:02,515 WARNING [train.py:1204] (2/4) Exclude cut with ID _33640_210_2655_1_1533117605479_4855469_770-278185-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 16:45:03,695 WARNING [train.py:1204] (2/4) Exclude cut with ID _5287_210_2084_1_1527418883040_3878670_333-48508-0_sp1.1 from training. Duration: 0.5454375 2023-10-02 16:45:05,024 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_43802_210_2290_1_1533516203_8829504_142-276839-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 16:45:06,364 WARNING [train.py:1204] (2/4) Exclude cut with ID _28394_210_8913_1_1532430049518_3650118_126-139371-0_sp1.1 from training. Duration: 0.92725 2023-10-02 16:45:06,386 WARNING [train.py:1204] (2/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_76-203867-0 from training. Duration: 0.72 2023-10-02 16:45:06,422 WARNING [train.py:1204] (2/4) Exclude cut with ID _6194_210_1534_1_1527511826160_3941194_14-282876-0_sp0.9 from training. Duration: 0.788875 2023-10-02 16:45:11,239 WARNING [train.py:1204] (2/4) Exclude cut with ID _35355_210_16040_1_1533212988977_4016229_74-190076-0_sp1.1 from training. Duration: 0.72725 2023-10-02 16:45:12,418 INFO [train.py:1046] (2/4) Epoch 27, batch 4200, loss[loss=0.1611, simple_loss=0.2292, pruned_loss=0.04656, over 23749.00 frames. ], tot_loss[loss=0.1663, simple_loss=0.2436, pruned_loss=0.04452, over 4697237.23 frames. ], batch size: 232, lr: 3.77e-03, grad_scale: 16.0 2023-10-02 16:45:12,568 WARNING [train.py:1204] (2/4) Exclude cut with ID _33920_210_4220_1_1533257851500_3084399_198-334063-0 from training. Duration: 0.94 2023-10-02 16:45:13,990 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6291_210_3273_1_1527683649_1078498_4-266348-0_sp0.9 from training. Duration: 0.8666875 2023-10-02 16:45:15,445 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_23856_210_4238_1_1531994785_3087934_122-38075-0_sp1.1 from training. Duration: 0.936375 2023-10-02 16:45:18,215 WARNING [train.py:1204] (2/4) Exclude cut with ID _4697_210_2069_1_1526815801953_3823098_276-96593-0_sp0.9 from training. Duration: 0.8555625 2023-10-02 16:45:19,474 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_308-278734-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 16:45:19,475 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_5190_210_4557_1_1527245078_2965646_353-250603-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 16:45:20,839 WARNING [train.py:1204] (2/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_472-216663-0 from training. Duration: 0.92 2023-10-02 16:45:23,692 WARNING [train.py:1204] (2/4) Exclude cut with ID _7537_210_2084_1_1528502395431_3924359_14-312906-0 from training. Duration: 0.84 2023-10-02 16:45:23,739 WARNING [train.py:1204] (2/4) Exclude cut with ID _29003_210_19596_1_1532507391720_3894098_559-236660-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 16:45:25,096 WARNING [train.py:1204] (2/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_312-204798-0_sp1.1 from training. Duration: 0.8454375 2023-10-02 16:45:26,615 WARNING [train.py:1204] (2/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_17-309070-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 16:45:30,400 WARNING [train.py:1204] (2/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_344-152526-0_sp0.9 from training. Duration: 0.588875 2023-10-02 16:45:33,388 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_41452_210_7291_1_1533437687_3941281_405-11559-0_sp1.1 from training. Duration: 0.82725 2023-10-02 16:45:33,419 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_48730_210_5399_1_1533885886_2212769_157-124052-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 16:45:33,467 WARNING [train.py:1204] (2/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_415-78792-0 from training. Duration: 0.81 2023-10-02 16:45:33,471 WARNING [train.py:1204] (2/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_345-340690-0_sp1.1 from training. Duration: 0.8454375 2023-10-02 16:45:34,882 WARNING [train.py:1204] (2/4) Exclude cut with ID _23825_210_13449_1_1531915313526_3497150_124-8138-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 16:45:36,359 WARNING [train.py:1204] (2/4) Exclude cut with ID _49258_210_7994_1_1534122017863_3738861_194-24864-0_sp1.1 from training. Duration: 0.936375 2023-10-02 16:45:36,377 WARNING [train.py:1204] (2/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_369-222060-0_sp1.1 from training. Duration: 0.6454375 2023-10-02 16:45:36,504 WARNING [train.py:1204] (2/4) Exclude cut with ID _12369_210_4381_1_1530356078581_4388520_480-13146-0_sp0.9 from training. Duration: 0.9444375 2023-10-02 16:45:38,262 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder_embed.convnext.hidden_balancer.prob, batch_count=948833.3333333334, ans=0.125 2023-10-02 16:45:39,516 WARNING [train.py:1204] (2/4) Exclude cut with ID _13253_210_1925_1_1530757775819_3577873_191-144977-0 from training. Duration: 0.78 2023-10-02 16:45:39,541 WARNING [train.py:1204] (2/4) Exclude cut with ID _30304_210_6319_1_1532476827261_7582060_1128-140266-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 16:45:45,052 WARNING [train.py:1204] (2/4) Exclude cut with ID _8772_210_1850_1_1528635595972_3664011_247-336448-0_sp0.9 from training. Duration: 0.82225 2023-10-02 16:45:45,133 WARNING [train.py:1204] (2/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_318-59097-0_sp1.1 from training. Duration: 0.6909375 2023-10-02 16:45:48,048 WARNING [train.py:1204] (2/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_540-131164-0_sp1.1 from training. Duration: 0.836375 2023-10-02 16:45:49,466 WARNING [train.py:1204] (2/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_39-110836-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 16:45:49,643 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.balancer2.prob, batch_count=948900.0, ans=0.125 2023-10-02 16:45:50,722 WARNING [train.py:1204] (2/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_860-96198-0_sp1.1 from training. Duration: 0.663625 2023-10-02 16:45:50,730 WARNING [train.py:1204] (2/4) Exclude cut with ID _15133_210_6228_1_1530794512074_3080379_29-264458-0 from training. Duration: 0.58 2023-10-02 16:45:51,967 WARNING [train.py:1204] (2/4) Exclude cut with ID _55209_210_1531_1_1534675828996_4597160_797-348343-0_sp1.1 from training. Duration: 0.963625 2023-10-02 16:45:52,061 WARNING [train.py:1204] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_23-189234-0_sp1.1 from training. Duration: 0.7545625 2023-10-02 16:45:54,990 WARNING [train.py:1204] (2/4) Exclude cut with ID _5410_210_1388_1_1527397055795_1719020_39-112221-0_sp1.1 from training. Duration: 0.62725 2023-10-02 16:45:57,593 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_43802_210_2290_1_1533516203_8829504_100-348441-0_sp1.1 from training. Duration: 0.82725 2023-10-02 16:46:04,394 WARNING [train.py:1204] (2/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_143-86344-0_sp1.1 from training. Duration: 0.7 2023-10-02 16:46:08,865 WARNING [train.py:1204] (2/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_126-29104-0 from training. Duration: 0.85 2023-10-02 16:46:10,282 WARNING [train.py:1204] (2/4) Exclude cut with ID _39846_210_12651_1_1533603651992_1318030_138-31771-0_sp1.1 from training. Duration: 0.963625 2023-10-02 16:46:12,983 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.649e+02 1.962e+02 2.274e+02 2.740e+02 4.088e+02, threshold=4.548e+02, percent-clipped=1.0 2023-10-02 16:46:13,433 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.conv_skip_rate, batch_count=949033.3333333334, ans=0.0 2023-10-02 16:46:14,511 WARNING [train.py:1204] (2/4) Exclude cut with ID _2220_210_1819_1_1525509109898_3658680_50-99837-0_sp1.1 from training. Duration: 0.4909375 2023-10-02 16:46:14,576 WARNING [train.py:1204] (2/4) Exclude cut with ID _6274_210_2084_1_1527937583837_7494020_836-210550-0_sp1.1 from training. Duration: 0.97275 2023-10-02 16:46:15,932 WARNING [train.py:1204] (2/4) Exclude cut with ID _28323_210_16187_1_1532480395667_4100300_306-74561-0 from training. Duration: 0.94 2023-10-02 16:46:22,668 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_177-58005-0_sp1.1 from training. Duration: 0.563625 2023-10-02 16:46:25,304 INFO [train.py:1046] (2/4) Epoch 27, batch 4250, loss[loss=0.1703, simple_loss=0.2424, pruned_loss=0.04912, over 23662.00 frames. ], tot_loss[loss=0.1656, simple_loss=0.2426, pruned_loss=0.04432, over 4701225.17 frames. ], batch size: 232, lr: 3.77e-03, grad_scale: 16.0 2023-10-02 16:46:26,758 WARNING [train.py:1204] (2/4) Exclude cut with ID _54871_210_22356_1_1534665798253_3856359_19-342632-0_sp1.1 from training. Duration: 0.82725 2023-10-02 16:46:26,772 WARNING [train.py:1204] (2/4) Exclude cut with ID _35355_210_16040_1_1533212988977_4016229_74-190076-0_sp0.9 from training. Duration: 0.888875 2023-10-02 16:46:30,037 WARNING [train.py:1204] (2/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_285-97786-0_sp1.1 from training. Duration: 0.97275 2023-10-02 16:46:35,179 WARNING [train.py:1204] (2/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_337-181502-0_sp0.9 from training. Duration: 0.72225 2023-10-02 16:46:35,218 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_353-182855-0 from training. Duration: 0.96 2023-10-02 16:46:35,397 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.conv_module2.balancer1.prob, batch_count=949100.0, ans=0.125 2023-10-02 16:46:35,755 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.3.encoder.layers.3.nonlin_attention.whiten2, num_groups=1, num_channels=512, metric=10.03 vs. limit=15.0 2023-10-02 16:46:36,516 WARNING [train.py:1204] (2/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_409-327730-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 16:46:39,696 WARNING [train.py:1204] (2/4) Exclude cut with ID _8030_210_7497_1_1528444886781_3531089_373-19403-0_sp1.1 from training. Duration: 0.97275 2023-10-02 16:46:41,965 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.1.encoder.layers.0.feed_forward2.out_whiten, num_groups=1, num_channels=256, metric=4.27 vs. limit=15.0 2023-10-02 16:46:42,549 WARNING [train.py:1204] (2/4) Exclude cut with ID _6789_210_1534_1_1527857937218_3118950_231-340237-0_sp1.1 from training. Duration: 0.936375 2023-10-02 16:46:45,391 WARNING [train.py:1204] (2/4) Exclude cut with ID _6493_210_1747_1_1528459210424_4012470_536-57832-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 16:46:45,403 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7046_210_6753_1_1527895337_6920917_756-116002-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 16:46:46,847 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_37152_210_9697_1_1533106573_3844188_29-239070-0_sp1.1 from training. Duration: 0.8909375 2023-10-02 16:46:48,118 WARNING [train.py:1204] (2/4) Exclude cut with ID _19023_210_4353_1_1531378892552_5820780_61-334539-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 16:46:49,504 WARNING [train.py:1204] (2/4) Exclude cut with ID _14170_210_5219_1_1530525949868_4469650_159-28488-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 16:46:50,890 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_40896_210_15710_1_1533433956_7100802_764-71272-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 16:46:52,217 WARNING [train.py:1204] (2/4) Exclude cut with ID _44902_210_16187_1_1533888006062_3792078_258-178274-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 16:46:54,875 WARNING [train.py:1204] (2/4) Exclude cut with ID _7537_210_2084_1_1528502395431_3924359_14-312906-0_sp1.1 from training. Duration: 0.763625 2023-10-02 16:46:56,217 WARNING [train.py:1204] (2/4) Exclude cut with ID _49643_210_7989_1_1534640244224_3939031_674-116639-0_sp1.1 from training. Duration: 0.963625 2023-10-02 16:46:56,310 WARNING [train.py:1204] (2/4) Exclude cut with ID _22826_210_9994_1_1531908201522_3279169_415-161-0 from training. Duration: 0.98 2023-10-02 16:46:57,916 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.conv_skip_rate, batch_count=949233.3333333334, ans=0.0 2023-10-02 16:47:02,401 WARNING [train.py:1204] (2/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_264-348896-0 from training. Duration: 0.85 2023-10-02 16:47:02,416 WARNING [train.py:1204] (2/4) Exclude cut with ID _51899_210_20034_1_1534129254466_3669220_233-161492-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 16:47:02,466 WARNING [train.py:1204] (2/4) Exclude cut with ID _31255_210_13427_1_1532865619495_3815143_233-200023-0_sp1.1 from training. Duration: 0.936375 2023-10-02 16:47:02,488 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_23850_210_4238_1_1532232597_1632433_100-229879-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 16:47:03,833 WARNING [train.py:1204] (2/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_375-245677-0_sp0.9 from training. Duration: 0.9 2023-10-02 16:47:03,836 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_40223_210_6228_1_1533298404_4812267_355-317415-0_sp1.1 from training. Duration: 0.97275 2023-10-02 16:47:05,652 WARNING [train.py:1204] (2/4) Exclude cut with ID _9679_210_7497_1_1529129212906_4363929_235-81488-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 16:47:08,367 WARNING [train.py:1204] (2/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_172-215932-0_sp1.1 from training. Duration: 0.6 2023-10-02 16:47:09,802 WARNING [train.py:1204] (2/4) Exclude cut with ID _12369_210_4381_1_1530356078581_4388520_64-298917-0_sp1.1 from training. Duration: 0.8 2023-10-02 16:47:14,297 WARNING [train.py:1204] (2/4) Exclude cut with ID _38020_210_5986_1_1533112252259_7006089_131-53533-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 16:47:15,712 WARNING [train.py:1204] (2/4) Exclude cut with ID _42334_210_7770_1_1534206610396_3605880_392-48154-0_sp1.1 from training. Duration: 0.963625 2023-10-02 16:47:17,302 WARNING [train.py:1204] (2/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_337-181502-0 from training. Duration: 0.65 2023-10-02 16:47:17,309 WARNING [train.py:1204] (2/4) Exclude cut with ID _6267_210_2084_1_1527764658526_3894730_375-219887-0_sp0.9 from training. Duration: 0.7444375 2023-10-02 16:47:17,397 WARNING [train.py:1204] (2/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_60-146189-0 from training. Duration: 0.98 2023-10-02 16:47:18,792 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_3133_210_2087_1_1526106446_2324690_243-143119-0_sp1.1 from training. Duration: 0.7 2023-10-02 16:47:20,325 WARNING [train.py:1204] (2/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_1267-272693-0_sp0.9 from training. Duration: 0.811125 2023-10-02 16:47:21,688 WARNING [train.py:1204] (2/4) Exclude cut with ID _69884_210_9771_1_1535846392818_4166169_83-182645-0_sp1.1 from training. Duration: 0.97275 2023-10-02 16:47:21,712 WARNING [train.py:1204] (2/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_403-292260-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 16:47:21,864 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.feed_forward3.hidden_balancer.prob, batch_count=949300.0, ans=0.125 2023-10-02 16:47:24,416 WARNING [train.py:1204] (2/4) Exclude cut with ID _55429_210_10626_1_1534467588527_7174143_377-169391-0 from training. Duration: 0.96 2023-10-02 16:47:25,818 WARNING [train.py:1204] (2/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_494-199538-0_sp1.1 from training. Duration: 0.6818125 2023-10-02 16:47:27,118 WARNING [train.py:1204] (2/4) Exclude cut with ID _3067_210_2667_1_1526212974640_4503168_119-175776-0_sp1.1 from training. Duration: 0.67275 2023-10-02 16:47:31,922 WARNING [train.py:1204] (2/4) Exclude cut with ID _3231_210_2106_1_1526205864934_6784220_183-303839-0_sp1.1 from training. Duration: 0.97275 2023-10-02 16:47:33,974 WARNING [train.py:1204] (2/4) Exclude cut with ID _18513_210_9206_1_1531618215223_3862620_165-38958-0_sp1.1 from training. Duration: 0.963625 2023-10-02 16:47:35,354 WARNING [train.py:1204] (2/4) Exclude cut with ID _41314_210_12651_1_1533459476543_3766460_87-272068-0_sp1.1 from training. Duration: 0.7545625 2023-10-02 16:47:35,440 WARNING [train.py:1204] (2/4) Exclude cut with ID _3541_210_2477_1_1526475408709_3263973_353-95-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 16:47:38,493 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_2009_210_2071_1_1525481134_4588937_73-302813-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 16:47:39,723 INFO [train.py:1046] (2/4) Epoch 27, batch 4300, loss[loss=0.1628, simple_loss=0.228, pruned_loss=0.0488, over 22737.00 frames. ], tot_loss[loss=0.1653, simple_loss=0.2421, pruned_loss=0.04425, over 4691178.11 frames. ], batch size: 322, lr: 3.77e-03, grad_scale: 16.0 2023-10-02 16:47:39,856 WARNING [train.py:1204] (2/4) Exclude cut with ID _64220_210_9527_1_1535250151772_4875259_88-95011-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 16:47:40,107 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.conv_module1.balancer1.prob, batch_count=949433.3333333334, ans=0.125 2023-10-02 16:47:41,207 WARNING [train.py:1204] (2/4) Exclude cut with ID _44902_210_16187_1_1533888006062_3792078_160-12107-0_sp1.1 from training. Duration: 0.92725 2023-10-02 16:47:41,213 WARNING [train.py:1204] (2/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_547-5424-0 from training. Duration: 0.94 2023-10-02 16:47:41,322 WARNING [train.py:1204] (2/4) Exclude cut with ID _21783_210_11357_1_1531742458730_3762679_268-85656-0_sp1.1 from training. Duration: 0.936375 2023-10-02 16:47:44,931 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.ff3_skip_rate, batch_count=949433.3333333334, ans=0.0 2023-10-02 16:47:46,137 WARNING [train.py:1204] (2/4) Exclude cut with ID _66210_210_12651_1_1535446812922_3365850_42-284985-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 16:47:46,168 WARNING [train.py:1204] (2/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_465-214726-0_sp0.9 from training. Duration: 0.9555625 2023-10-02 16:47:50,315 WARNING [train.py:1204] (2/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_50-95770-0_sp1.1 from training. Duration: 0.936375 2023-10-02 16:47:51,916 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.attention_skip_rate, batch_count=949433.3333333334, ans=0.0 2023-10-02 16:47:58,571 WARNING [train.py:1204] (2/4) Exclude cut with ID _30800_210_22382_1_1532499347904_4515959_269-77567-0_sp1.1 from training. Duration: 0.963625 2023-10-02 16:47:58,573 WARNING [train.py:1204] (2/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_7-111954-0 from training. Duration: 0.56 2023-10-02 16:47:59,926 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_43802_210_2290_1_1533516203_8829504_85-195240-0_sp0.9 from training. Duration: 0.9666875 2023-10-02 16:48:01,341 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_2822_210_2715_1_1526112586_1999587_165-36656-0_sp0.9 from training. Duration: 0.988875 2023-10-02 16:48:01,358 WARNING [train.py:1204] (2/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_181-78632-0_sp1.1 from training. Duration: 0.7090625 2023-10-02 16:48:03,261 WARNING [train.py:1204] (2/4) Exclude cut with ID IC0671W0044-331887 from training. Duration: 0.896 2023-10-02 16:48:06,432 WARNING [train.py:1204] (2/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_266-137104-0_sp1.1 from training. Duration: 0.5909375 2023-10-02 16:48:08,476 WARNING [train.py:1204] (2/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_137-116442-0_sp0.9 from training. Duration: 0.8666875 2023-10-02 16:48:11,139 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_23801_210_16223_1_1532066392_3294657_570-5439-0 from training. Duration: 0.97 2023-10-02 16:48:11,160 WARNING [train.py:1204] (2/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_29-280622-0_sp1.1 from training. Duration: 0.7454375 2023-10-02 16:48:11,182 WARNING [train.py:1204] (2/4) Exclude cut with ID 3557-8342-0013-54691-0 from training. Duration: 0.92 2023-10-02 16:48:13,952 WARNING [train.py:1204] (2/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_711-15688-0_sp1.1 from training. Duration: 0.3909375 2023-10-02 16:48:17,216 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_15126_210_6753_1_1530778677_2591185_120-277707-0_sp1.1 from training. Duration: 0.72725 2023-10-02 16:48:18,743 WARNING [train.py:1204] (2/4) Exclude cut with ID _14970_210_15709_1_1530775547361_1966965_8-169924-0_sp1.1 from training. Duration: 0.836375 2023-10-02 16:48:18,745 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_4692_210_3528_1_1526899755_4323254_107-44401-0_sp0.9 from training. Duration: 0.9555625 2023-10-02 16:48:20,146 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_3475_210_2028_1_1526193578_3622569_265-276625-0_sp0.9 from training. Duration: 0.8444375 2023-10-02 16:48:20,481 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.conv_skip_rate, batch_count=949566.6666666666, ans=0.0 2023-10-02 16:48:21,581 WARNING [train.py:1204] (2/4) Exclude cut with ID _55153_210_2253_1_1534384786677_3162096_148-79022-0_sp1.1 from training. Duration: 0.87275 2023-10-02 16:48:21,857 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.feed_forward3.hidden_balancer.prob, batch_count=949566.6666666666, ans=0.125 2023-10-02 16:48:23,008 WARNING [train.py:1204] (2/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_292-36336-0_sp1.1 from training. Duration: 0.92725 2023-10-02 16:48:23,023 WARNING [train.py:1204] (2/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_329-82235-0 from training. Duration: 0.82 2023-10-02 16:48:24,371 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_11305_210_1794_1_1529750428_7178148_257-312870-0 from training. Duration: 0.88 2023-10-02 16:48:25,974 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.nonlin_attention.balancer.prob, batch_count=949633.3333333334, ans=0.125 2023-10-02 16:48:27,037 WARNING [train.py:1204] (2/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_436-4493-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 16:48:27,436 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.balancer1.prob, batch_count=949633.3333333334, ans=0.125 2023-10-02 16:48:28,508 WARNING [train.py:1204] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_321-54129-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 16:48:28,515 WARNING [train.py:1204] (2/4) Exclude cut with ID _5287_210_2084_1_1527418883040_3878670_333-48508-0_sp0.9 from training. Duration: 0.6666875 2023-10-02 16:48:28,528 WARNING [train.py:1204] (2/4) Exclude cut with ID _26528_210_9070_1_1532570410842_3851310_244-278158-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 16:48:28,565 WARNING [train.py:1204] (2/4) Exclude cut with ID _46952_210_5986_1_1534230010357_3107379_232-344503-0_sp1.1 from training. Duration: 0.87275 2023-10-02 16:48:28,574 WARNING [train.py:1204] (2/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_43-206757-0 from training. Duration: 0.75 2023-10-02 16:48:28,576 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_4183_210_2087_1_1526729100_1869838_141-166600-0 from training. Duration: 0.95 2023-10-02 16:48:29,307 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.3.encoder.layers.0.feed_forward3.out_whiten, num_groups=1, num_channels=512, metric=12.22 vs. limit=15.0 2023-10-02 16:48:29,936 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_296-287678-0 from training. Duration: 0.95 2023-10-02 16:48:29,995 WARNING [train.py:1204] (2/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_265-60343-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 16:48:31,281 WARNING [train.py:1204] (2/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_332-256168-0 from training. Duration: 0.75 2023-10-02 16:48:31,312 WARNING [train.py:1204] (2/4) Exclude cut with ID _13679_210_7497_1_1530534293662_3355098_411-186088-0 from training. Duration: 0.94 2023-10-02 16:48:34,671 WARNING [train.py:1204] (2/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_458-126779-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 16:48:34,777 WARNING [train.py:1204] (2/4) Exclude cut with ID IC0803W0380-397572 from training. Duration: 0.032 2023-10-02 16:48:36,614 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_278-272612-0_sp1.1 from training. Duration: 0.663625 2023-10-02 16:48:38,429 WARNING [train.py:1204] (2/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_491-263101-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 16:48:38,434 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_53555_210_5632_1_1534595220_7232297_334-336778-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 16:48:39,962 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.feed_forward2.hidden_balancer.prob, batch_count=949700.0, ans=0.125 2023-10-02 16:48:40,905 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.619e+02 1.825e+02 1.972e+02 2.208e+02 2.993e+02, threshold=3.944e+02, percent-clipped=0.0 2023-10-02 16:48:42,306 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_50-3289-0 from training. Duration: 0.91 2023-10-02 16:48:42,380 WARNING [train.py:1204] (2/4) Exclude cut with ID _8699_210_2069_1_1528630346831_3960409_155-39766-0_sp0.9 from training. Duration: 0.8666875 2023-10-02 16:48:44,311 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_502-82145-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 16:48:44,365 WARNING [train.py:1204] (2/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_550-43132-0_sp1.1 from training. Duration: 0.97275 2023-10-02 16:48:44,395 WARNING [train.py:1204] (2/4) Exclude cut with ID _14998_210_5219_1_1530705582080_7474970_446-112117-0_sp1.1 from training. Duration: 0.8181875 2023-10-02 16:48:45,730 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_3691_210_3528_1_1526468169_4062053_355-334432-0_sp1.1 from training. Duration: 0.7909375 2023-10-02 16:48:47,155 WARNING [train.py:1204] (2/4) Exclude cut with ID _58605_210_12210_1_1534723195939_3904259_239-94720-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 16:48:49,905 WARNING [train.py:1204] (2/4) Exclude cut with ID _49376_210_5986_1_1533985278732_3370740_391-344072-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 16:48:51,351 WARNING [train.py:1204] (2/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_558-322014-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 16:48:51,390 WARNING [train.py:1204] (2/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_256-32662-0_sp1.1 from training. Duration: 0.8181875 2023-10-02 16:48:54,090 INFO [train.py:1046] (2/4) Epoch 27, batch 4350, loss[loss=0.1807, simple_loss=0.2611, pruned_loss=0.05019, over 24073.00 frames. ], tot_loss[loss=0.1657, simple_loss=0.2428, pruned_loss=0.04427, over 4701723.48 frames. ], batch size: 86, lr: 3.77e-03, grad_scale: 16.0 2023-10-02 16:48:56,973 WARNING [train.py:1204] (2/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_572-185150-0 from training. Duration: 0.97 2023-10-02 16:48:57,002 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_89-35748-0_sp0.9 from training. Duration: 0.87775 2023-10-02 16:49:00,009 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.nonlin_attention.balancer.prob, batch_count=949766.6666666666, ans=0.125 2023-10-02 16:49:01,261 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6291_210_3273_1_1527683649_1078498_88-66747-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 16:49:02,701 WARNING [train.py:1204] (2/4) Exclude cut with ID _19504_210_9994_1_1531648863242_3325220_357-237029-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 16:49:05,416 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.0.layers.0.feed_forward1.out_whiten, num_groups=1, num_channels=192, metric=11.19 vs. limit=15.0 2023-10-02 16:49:05,937 WARNING [train.py:1204] (2/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_302-2550-0_sp1.1 from training. Duration: 0.736375 2023-10-02 16:49:05,937 WARNING [train.py:1204] (2/4) Exclude cut with ID _9461_210_4557_1_1529060514726_3677451_59-194017-0_sp1.1 from training. Duration: 0.963625 2023-10-02 16:49:09,341 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.feed_forward3.hidden_balancer.prob, batch_count=949833.3333333334, ans=0.125 2023-10-02 16:49:12,417 WARNING [train.py:1204] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_665-196155-0_sp1.1 from training. Duration: 0.6090625 2023-10-02 16:49:16,494 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_326-315007-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 16:49:18,644 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.nonlin_attention.balancer.prob, batch_count=949833.3333333334, ans=0.125 2023-10-02 16:49:19,808 WARNING [train.py:1204] (2/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_105-328174-0_sp1.1 from training. Duration: 0.6909375 2023-10-02 16:49:19,817 WARNING [train.py:1204] (2/4) Exclude cut with ID _56494_210_4220_1_1535284837625_3033169_106-263722-0_sp1.1 from training. Duration: 0.97275 2023-10-02 16:49:23,973 WARNING [train.py:1204] (2/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_770-200430-0_sp0.9 from training. Duration: 0.988875 2023-10-02 16:49:25,456 WARNING [train.py:1204] (2/4) Exclude cut with ID _65640_210_6310_1_1535368201445_3173250_209-13008-0_sp1.1 from training. Duration: 0.9 2023-10-02 16:49:25,562 WARNING [train.py:1204] (2/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_212-149947-0_sp0.9 from training. Duration: 0.811125 2023-10-02 16:49:30,902 WARNING [train.py:1204] (2/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_188-171673-0 from training. Duration: 0.58 2023-10-02 16:49:30,963 WARNING [train.py:1204] (2/4) Exclude cut with ID _58895_210_8750_1_1535184081320_3649089_286-281724-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 16:49:31,234 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=949900.0, ans=0.1 2023-10-02 16:49:31,782 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.1.encoder.layers.1.whiten, num_groups=1, num_channels=256, metric=3.85 vs. limit=12.0 2023-10-02 16:49:32,337 WARNING [train.py:1204] (2/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_16-254909-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 16:49:35,735 WARNING [train.py:1204] (2/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_52-95655-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 16:49:37,146 WARNING [train.py:1204] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_554-45617-0 from training. Duration: 0.99 2023-10-02 16:49:42,115 WARNING [train.py:1204] (2/4) Exclude cut with ID _14683_210_5219_1_1530698352608_3974859_473-344611-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 16:49:44,753 WARNING [train.py:1204] (2/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_17-25045-0_sp0.9 from training. Duration: 0.6555625 2023-10-02 16:49:46,374 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9072W0003-530637 from training. Duration: 0.768 2023-10-02 16:49:48,318 WARNING [train.py:1204] (2/4) Exclude cut with ID _38670_210_15379_1_1533448810659_4391059_76-74025-0_sp1.1 from training. Duration: 0.936375 2023-10-02 16:49:48,368 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6291_210_3273_1_1527683649_1078498_74-292608-0_sp0.9 from training. Duration: 0.8 2023-10-02 16:49:49,709 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9084W0025-536862 from training. Duration: 0.896 2023-10-02 16:49:49,773 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9087W0021-538392 from training. Duration: 0.768 2023-10-02 16:49:49,778 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7543_210_6753_1_1528543323_645515_56-163234-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 16:49:49,798 WARNING [train.py:1204] (2/4) Exclude cut with ID _45560_210_21982_1_1533812603951_3584999_44-332643-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 16:49:49,874 WARNING [train.py:1204] (2/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_1285-256622-0_sp1.1 from training. Duration: 0.763625 2023-10-02 16:49:51,275 WARNING [train.py:1204] (2/4) Exclude cut with ID _52781_210_16187_1_1534297729099_1118149_141-325632-0_sp1.1 from training. Duration: 0.936375 2023-10-02 16:49:51,920 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.3.encoder.layers.0.nonlin_attention.whiten1, num_groups=1, num_channels=384, metric=7.17 vs. limit=10.0 2023-10-02 16:49:52,573 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_40896_210_15710_1_1533433956_7100802_1051-137631-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 16:49:52,606 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_13091_210_7925_1_1530609447_8987354_233-18333-0_sp1.1 from training. Duration: 0.97275 2023-10-02 16:49:55,289 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_34008_210_10431_1_1533345424_8183247_662-317331-0 from training. Duration: 0.94 2023-10-02 16:49:55,295 WARNING [train.py:1204] (2/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_291-248920-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 16:49:55,297 WARNING [train.py:1204] (2/4) Exclude cut with ID _65263_210_22356_1_1535763598691_3921037_274-288481-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 16:49:55,333 WARNING [train.py:1204] (2/4) Exclude cut with ID _5189_210_4557_1_1527248319428_3769229_233-168080-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 16:49:56,695 WARNING [train.py:1204] (2/4) Exclude cut with ID _54871_210_22356_1_1534665798253_3856359_135-184281-0 from training. Duration: 0.99 2023-10-02 16:49:58,086 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9123W0012-555972 from training. Duration: 0.896 2023-10-02 16:49:58,090 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9123W0118-556077 from training. Duration: 0.896 2023-10-02 16:49:58,099 WARNING [train.py:1204] (2/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_406-275649-0 from training. Duration: 0.42 2023-10-02 16:50:00,758 WARNING [train.py:1204] (2/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_309-13684-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 16:50:02,093 WARNING [train.py:1204] (2/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_129-337802-0_sp1.1 from training. Duration: 0.6545625 2023-10-02 16:50:02,119 WARNING [train.py:1204] (2/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_649-266825-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 16:50:02,176 WARNING [train.py:1204] (2/4) Exclude cut with ID _40519_210_13794_1_1533715228655_7337390_895-224742-0_sp1.1 from training. Duration: 0.92725 2023-10-02 16:50:04,891 WARNING [train.py:1204] (2/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_234-14174-0 from training. Duration: 0.82 2023-10-02 16:50:06,819 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9156W0009-572502 from training. Duration: 0.768 2023-10-02 16:50:06,826 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_424-245529-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 16:50:08,101 INFO [train.py:1046] (2/4) Epoch 27, batch 4400, loss[loss=0.1663, simple_loss=0.2393, pruned_loss=0.04666, over 23795.00 frames. ], tot_loss[loss=0.1659, simple_loss=0.2432, pruned_loss=0.04433, over 4714916.35 frames. ], batch size: 212, lr: 3.77e-03, grad_scale: 32.0 2023-10-02 16:50:11,477 WARNING [train.py:1204] (2/4) Exclude cut with ID _26528_210_9070_1_1532570410842_3851310_232-10149-0_sp1.1 from training. Duration: 0.963625 2023-10-02 16:50:11,486 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_17292_210_6753_1_1531125388_2943734_152-30410-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 16:50:13,425 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_35441_210_16554_1_1533347572_4092680_291-82075-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 16:50:15,273 WARNING [train.py:1204] (2/4) Exclude cut with ID _12369_210_4381_1_1530356078581_4388520_64-298917-0 from training. Duration: 0.88 2023-10-02 16:50:15,303 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_5618_210_1874_1_1527311008_5851_1-214501-0_sp0.9 from training. Duration: 0.7655625 2023-10-02 16:50:16,598 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_9460_210_4211_1_1528954587_10049667_572-37035-0 from training. Duration: 0.8 2023-10-02 16:50:16,627 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9193W0023-590322 from training. Duration: 0.896 2023-10-02 16:50:18,005 WARNING [train.py:1204] (2/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_206-10097-0_sp1.1 from training. Duration: 0.4454375 2023-10-02 16:50:18,016 WARNING [train.py:1204] (2/4) Exclude cut with ID _55134_210_21982_1_1534590028377_3882170_17-51731-0_sp1.1 from training. Duration: 0.963625 2023-10-02 16:50:20,775 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_14953_210_2151_1_1530701710_5616165_710-165400-0 from training. Duration: 0.87 2023-10-02 16:50:22,174 WARNING [train.py:1204] (2/4) Exclude cut with ID _53203_210_2087_1_1534246218309_475590_61-85777-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 16:50:22,272 WARNING [train.py:1204] (2/4) Exclude cut with ID _55844_210_2346_1_1534471212569_7625707_1050-10453-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 16:50:22,282 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9218W0013-603297 from training. Duration: 0.896 2023-10-02 16:50:24,947 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_20120_210_6753_1_1531643035_5482859_267-102818-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 16:50:24,948 WARNING [train.py:1204] (2/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_191-41023-0 from training. Duration: 0.71 2023-10-02 16:50:24,997 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9231W0202-610227 from training. Duration: 0.896 2023-10-02 16:50:28,945 WARNING [train.py:1204] (2/4) Exclude cut with ID _6537_210_1883_1_1527987559710_3597129_382-133062-0 from training. Duration: 0.68 2023-10-02 16:50:28,983 WARNING [train.py:1204] (2/4) Exclude cut with ID _28427_210_4220_1_1532581240967_3061069_43-84928-0 from training. Duration: 0.99 2023-10-02 16:50:29,010 WARNING [train.py:1204] (2/4) Exclude cut with ID _59256_210_5986_1_1535076085997_3589120_121-237386-0 from training. Duration: 0.96 2023-10-02 16:50:30,316 WARNING [train.py:1204] (2/4) Exclude cut with ID _8480_210_1874_1_1528589391205_6728330_137-256986-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 16:50:30,432 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_9417_210_1794_1_1529235261_4614131_479-346963-0_sp1.1 from training. Duration: 0.936375 2023-10-02 16:50:31,788 WARNING [train.py:1204] (2/4) Exclude cut with ID _26474_210_9570_1_1532520022699_3623380_267-7362-0_sp1.1 from training. Duration: 0.936375 2023-10-02 16:50:31,862 WARNING [train.py:1204] (2/4) Exclude cut with ID _55328_210_27767_1_1534580973260_4088170_287-61272-0_sp1.1 from training. Duration: 0.97275 2023-10-02 16:50:35,131 WARNING [train.py:1204] (2/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_589-9070-0 from training. Duration: 0.71 2023-10-02 16:50:35,138 WARNING [train.py:1204] (2/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_148-158509-0_sp1.1 from training. Duration: 0.37275 2023-10-02 16:50:36,260 WARNING [train.py:1204] (2/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_164-184548-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 16:50:38,905 WARNING [train.py:1204] (2/4) Exclude cut with ID _14970_210_15709_1_1530775547361_1966965_6-23328-0_sp1.1 from training. Duration: 0.7818125 2023-10-02 16:50:38,911 WARNING [train.py:1204] (2/4) Exclude cut with ID _29303_210_2575_1_1532392237303_7410649_612-165069-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 16:50:40,935 WARNING [train.py:1204] (2/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_383-321224-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 16:50:42,232 WARNING [train.py:1204] (2/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_283-160525-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 16:50:42,243 WARNING [train.py:1204] (2/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_505-97987-0 from training. Duration: 0.86 2023-10-02 16:50:42,321 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9309W0190-640317 from training. Duration: 0.768 2023-10-02 16:50:47,711 WARNING [train.py:1204] (2/4) Exclude cut with ID _50093_210_4220_1_1534417141207_3432299_280-297390-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 16:50:47,928 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.bypass.scale_min, batch_count=950233.3333333334, ans=0.2 2023-10-02 16:50:48,001 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.self_attn_weights.pos_emb_skip_rate, batch_count=950233.3333333334, ans=0.0 2023-10-02 16:50:48,028 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.conv_skip_rate, batch_count=950233.3333333334, ans=0.0 2023-10-02 16:50:50,684 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.balancer1.prob, batch_count=950233.3333333334, ans=0.125 2023-10-02 16:50:51,874 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_17292_210_6753_1_1531125388_2943734_320-71385-0_sp1.1 from training. Duration: 0.97275 2023-10-02 16:50:54,666 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_213-103282-0 from training. Duration: 0.78 2023-10-02 16:50:56,133 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.feed_forward2.hidden_balancer.prob, batch_count=950300.0, ans=0.125 2023-10-02 16:50:58,666 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7615_210_4211_1_1528110965_4580160_72-235211-0_sp0.9 from training. Duration: 0.8444375 2023-10-02 16:51:00,211 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.bypass.scale_min, batch_count=950300.0, ans=0.2 2023-10-02 16:51:01,442 WARNING [train.py:1204] (2/4) Exclude cut with ID _14867_210_1968_1_1530684179890_3846969_173-320977-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 16:51:01,750 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.conv_module2.balancer2.min_abs, batch_count=950300.0, ans=0.5 2023-10-02 16:51:02,906 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7046_210_6753_1_1527895337_6920917_783-278109-0_sp0.9 from training. Duration: 0.8555625 2023-10-02 16:51:04,254 WARNING [train.py:1204] (2/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_90-30673-0 from training. Duration: 0.84 2023-10-02 16:51:04,276 WARNING [train.py:1204] (2/4) Exclude cut with ID _22826_210_9994_1_1531908201522_3279169_415-161-0_sp1.1 from training. Duration: 0.8909375 2023-10-02 16:51:04,294 WARNING [train.py:1204] (2/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_86-66192-0_sp0.9 from training. Duration: 0.611125 2023-10-02 16:51:04,297 WARNING [train.py:1204] (2/4) Exclude cut with ID _4626_210_3532_1_1526817652717_3916859_446-206656-0_sp1.1 from training. Duration: 0.6909375 2023-10-02 16:51:05,616 WARNING [train.py:1204] (2/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_166-128941-0_sp0.9 from training. Duration: 0.6 2023-10-02 16:51:08,773 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.498e+02 1.837e+02 2.009e+02 2.278e+02 3.254e+02, threshold=4.017e+02, percent-clipped=0.0 2023-10-02 16:51:10,174 WARNING [train.py:1204] (2/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_345-167986-0 from training. Duration: 0.84 2023-10-02 16:51:11,590 WARNING [train.py:1204] (2/4) Exclude cut with ID _6275_210_2084_1_1528110062213_7669049_973-198085-0 from training. Duration: 0.75 2023-10-02 16:51:13,306 WARNING [train.py:1204] (2/4) Exclude cut with ID _60057_210_10640_1_1534903065793_3333150_171-181133-0 from training. Duration: 0.88 2023-10-02 16:51:14,677 WARNING [train.py:1204] (2/4) Exclude cut with ID _19504_210_9994_1_1531648863242_3325220_336-196513-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 16:51:14,684 WARNING [train.py:1204] (2/4) Exclude cut with ID _6493_210_1747_1_1528459210424_4012470_513-140967-0 from training. Duration: 0.79 2023-10-02 16:51:14,768 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6673_210_3273_1_1527767790_3173240_327-306185-0_sp0.9 from training. Duration: 0.92225 2023-10-02 16:51:19,467 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1032-236863-0_sp1.1 from training. Duration: 0.763625 2023-10-02 16:51:19,794 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.feed_forward2.hidden_balancer.prob, batch_count=950366.6666666666, ans=0.125 2023-10-02 16:51:20,905 WARNING [train.py:1204] (2/4) Exclude cut with ID _14867_210_1968_1_1530684179890_3846969_413-29292-0 from training. Duration: 0.9 2023-10-02 16:51:22,142 INFO [train.py:1046] (2/4) Epoch 27, batch 4450, loss[loss=0.1816, simple_loss=0.265, pruned_loss=0.04913, over 24404.00 frames. ], tot_loss[loss=0.1674, simple_loss=0.245, pruned_loss=0.04488, over 4718293.58 frames. ], batch size: 77, lr: 3.77e-03, grad_scale: 32.0 2023-10-02 16:51:23,718 WARNING [train.py:1204] (2/4) Exclude cut with ID _47251_210_18628_1_1533814063128_4137250_186-343322-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 16:51:26,439 WARNING [train.py:1204] (2/4) Exclude cut with ID _56235_210_10640_1_1534557335235_4161099_624-46091-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 16:51:26,479 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_791-197891-0_sp0.9 from training. Duration: 0.9333125 2023-10-02 16:51:32,013 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_50050_210_16223_1_1535435796_7290138_786-288457-0_sp1.1 from training. Duration: 0.92725 2023-10-02 16:51:32,030 WARNING [train.py:1204] (2/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_213-227283-0_sp1.1 from training. Duration: 0.963625 2023-10-02 16:51:34,748 WARNING [train.py:1204] (2/4) Exclude cut with ID _42659_210_2070_1_1533607442765_3120380_58-341121-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 16:51:36,811 WARNING [train.py:1204] (2/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_506-23200-0_sp1.1 from training. Duration: 0.8818125 2023-10-02 16:51:37,046 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.self_attn_weights.pos_emb_skip_rate, batch_count=950500.0, ans=0.0 2023-10-02 16:51:39,329 WARNING [train.py:1204] (2/4) Exclude cut with ID _14170_210_5219_1_1530525949868_4469650_336-213530-0_sp1.1 from training. Duration: 0.8181875 2023-10-02 16:51:40,586 WARNING [train.py:1204] (2/4) Exclude cut with ID _64135_210_2253_1_1535677267876_7608972_180-5312-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 16:51:40,678 WARNING [train.py:1204] (2/4) Exclude cut with ID _12302_210_5219_1_1529924446466_8107280_835-279754-0 from training. Duration: 0.9 2023-10-02 16:51:40,679 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_20120_210_6753_1_1531643035_5482859_510-96996-0_sp1.1 from training. Duration: 0.7909375 2023-10-02 16:51:42,478 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_15517_210_6753_1_1530954008_3487336_216-187834-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 16:51:42,519 WARNING [train.py:1204] (2/4) Exclude cut with ID _19504_210_9994_1_1531648863242_3325220_414-113427-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 16:51:42,520 WARNING [train.py:1204] (2/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_264-108685-0_sp0.9 from training. Duration: 0.611125 2023-10-02 16:51:47,033 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_5438_210_1794_1_1527336687_2600009_104-288274-0_sp0.9 from training. Duration: 0.6444375 2023-10-02 16:51:53,194 WARNING [train.py:1204] (2/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_520-39496-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 16:51:53,239 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_37818_210_4238_1_1533620910_4720083_315-308565-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 16:51:54,632 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_2009_210_2071_1_1525481134_4588937_385-302100-0_sp1.1 from training. Duration: 0.7909375 2023-10-02 16:51:55,947 WARNING [train.py:1204] (2/4) Exclude cut with ID _58895_210_8750_1_1535184081320_3649089_277-53219-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 16:51:56,227 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.conv_module2.balancer2.prob, batch_count=950566.6666666666, ans=0.125 2023-10-02 16:51:57,288 WARNING [train.py:1204] (2/4) Exclude cut with ID _26664_210_7770_1_1532258943280_3418270_121-111258-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 16:51:58,001 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.2.encoder.layers.2.whiten, num_groups=1, num_channels=384, metric=3.59 vs. limit=12.0 2023-10-02 16:52:01,363 WARNING [train.py:1204] (2/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_99-167392-0_sp1.1 from training. Duration: 0.5545625 2023-10-02 16:52:03,957 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1113-7369-0 from training. Duration: 0.85 2023-10-02 16:52:03,972 WARNING [train.py:1204] (2/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_352-59958-0 from training. Duration: 0.86 2023-10-02 16:52:03,977 WARNING [train.py:1204] (2/4) Exclude cut with ID _14886_210_4220_1_1530702020355_3516190_159-73748-0_sp1.1 from training. Duration: 0.7545625 2023-10-02 16:52:05,449 WARNING [train.py:1204] (2/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_131-119569-0_sp1.1 from training. Duration: 0.92725 2023-10-02 16:52:06,858 WARNING [train.py:1204] (2/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_216-343007-0 from training. Duration: 0.66 2023-10-02 16:52:10,215 WARNING [train.py:1204] (2/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_113-92608-0_sp0.9 from training. Duration: 0.6 2023-10-02 16:52:13,242 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_40047_210_2290_1_1533290519_2374311_132-123271-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 16:52:13,287 WARNING [train.py:1204] (2/4) Exclude cut with ID _8793_210_6310_1_1528542985139_2310009_121-179782-0 from training. Duration: 0.8 2023-10-02 16:52:15,117 WARNING [train.py:1204] (2/4) Exclude cut with ID _43997_210_4525_1_1533538632555_5189329_283-218639-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 16:52:15,128 WARNING [train.py:1204] (2/4) Exclude cut with ID _49376_210_5986_1_1533985278732_3370740_297-78397-0_sp1.1 from training. Duration: 0.936375 2023-10-02 16:52:15,150 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_3475_210_2028_1_1526193578_3622569_377-166133-0_sp0.9 from training. Duration: 0.9555625 2023-10-02 16:52:15,156 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_520-213149-0_sp1.1 from training. Duration: 0.92725 2023-10-02 16:52:17,787 WARNING [train.py:1204] (2/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_567-144483-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 16:52:21,714 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_4183_210_2087_1_1526729100_1869838_143-280962-0_sp0.9 from training. Duration: 0.62225 2023-10-02 16:52:21,761 WARNING [train.py:1204] (2/4) Exclude cut with ID _49643_210_7989_1_1534640244224_3939031_611-185515-0 from training. Duration: 0.94 2023-10-02 16:52:23,120 WARNING [train.py:1204] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_88-341718-0_sp0.9 from training. Duration: 0.7333125 2023-10-02 16:52:24,556 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_51225_210_3601_1_1534400634_2327383_49-235441-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 16:52:25,910 WARNING [train.py:1204] (2/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_240-294400-0_sp1.1 from training. Duration: 0.936375 2023-10-02 16:52:28,490 WARNING [train.py:1204] (2/4) Exclude cut with ID _55429_210_10626_1_1534467588527_7174143_640-304825-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 16:52:28,518 WARNING [train.py:1204] (2/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_187-221566-0_sp1.1 from training. Duration: 0.3909375 2023-10-02 16:52:31,335 WARNING [train.py:1204] (2/4) Exclude cut with ID _7606_210_4915_1_1528894253277_5080690_127-177761-0_sp0.9 from training. Duration: 0.711125 2023-10-02 16:52:34,052 WARNING [train.py:1204] (2/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_430-116139-0 from training. Duration: 0.85 2023-10-02 16:52:35,448 INFO [train.py:1046] (2/4) Epoch 27, batch 4500, loss[loss=0.1574, simple_loss=0.2377, pruned_loss=0.03853, over 24623.00 frames. ], tot_loss[loss=0.1678, simple_loss=0.2455, pruned_loss=0.0451, over 4713123.55 frames. ], batch size: 60, lr: 3.77e-03, grad_scale: 8.0 2023-10-02 16:52:35,508 WARNING [train.py:1204] (2/4) Exclude cut with ID _3675_210_4557_1_1526639934408_4182089_456-18305-0_sp0.9 from training. Duration: 0.8444375 2023-10-02 16:52:38,366 WARNING [train.py:1204] (2/4) Exclude cut with ID _18513_210_9206_1_1531618215223_3862620_402-5957-0_sp1.1 from training. Duration: 0.9 2023-10-02 16:52:39,752 WARNING [train.py:1204] (2/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_465-156500-0 from training. Duration: 0.72 2023-10-02 16:52:39,754 WARNING [train.py:1204] (2/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_714-310538-0 from training. Duration: 0.86 2023-10-02 16:52:41,748 WARNING [train.py:1204] (2/4) Exclude cut with ID _39411_210_5986_1_1533538825030_3343649_335-301187-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 16:52:45,299 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_41750_210_1960_1_1533520678_3948042_241-158616-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 16:52:46,689 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_46013_210_7234_1_1533776362_3744032_50-141535-0_sp1.1 from training. Duration: 0.9 2023-10-02 16:52:46,748 WARNING [train.py:1204] (2/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_43-206757-0_sp0.9 from training. Duration: 0.8333125 2023-10-02 16:52:48,425 WARNING [train.py:1204] (2/4) Exclude cut with ID _22961_210_8254_1_1531976366672_4278180_172-276296-0_sp1.1 from training. Duration: 0.97275 2023-10-02 16:52:48,447 WARNING [train.py:1204] (2/4) Exclude cut with ID _17956_210_4353_1_1531209568125_6979970_1305-301903-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 16:52:49,745 WARNING [train.py:1204] (2/4) Exclude cut with ID _28323_210_16187_1_1532480395667_4100300_604-44507-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 16:52:51,901 INFO [scaling.py:1118] (2/4) WithLoss: name=encoder.encoders.2.encoder.layers.1.self_attn_weights, loss-sum=0.000e+00 2023-10-02 16:52:58,735 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.self_attn_weights.pos_emb_skip_rate, batch_count=950833.3333333334, ans=0.0 2023-10-02 16:52:59,979 WARNING [train.py:1204] (2/4) Exclude cut with ID _23514_210_1389_1_1531832139785_3031439_88-337544-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 16:53:01,337 WARNING [train.py:1204] (2/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_932-3672-0_sp1.1 from training. Duration: 0.963625 2023-10-02 16:53:04,048 WARNING [train.py:1204] (2/4) Exclude cut with ID _46502_210_13794_1_1533724452198_3657159_207-109991-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 16:53:05,417 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_216-73841-0_sp1.1 from training. Duration: 0.87275 2023-10-02 16:53:05,505 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8453_210_4211_1_1528502940_8562730_347-330673-0_sp1.1 from training. Duration: 0.6545625 2023-10-02 16:53:11,467 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_4183_210_2087_1_1526729100_1869838_143-280962-0_sp1.1 from training. Duration: 0.5090625 2023-10-02 16:53:13,018 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.ff2_skip_rate, batch_count=950900.0, ans=0.0 2023-10-02 16:53:14,279 WARNING [train.py:1204] (2/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_325-113798-0_sp1.1 from training. Duration: 0.77275 2023-10-02 16:53:19,295 WARNING [train.py:1204] (2/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_91-212486-0_sp0.9 from training. Duration: 0.7555625 2023-10-02 16:53:22,617 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8776_210_2040_1_1528638837_4088036_244-278763-0_sp1.1 from training. Duration: 0.8181875 2023-10-02 16:53:23,939 WARNING [train.py:1204] (2/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_106-286130-0 from training. Duration: 0.98 2023-10-02 16:53:23,998 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_2746_210_1988_1_1525872816_3795627_372-205861-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 16:53:25,367 WARNING [train.py:1204] (2/4) Exclude cut with ID _30800_210_22382_1_1532499347904_4515959_240-54646-0_sp1.1 from training. Duration: 0.936375 2023-10-02 16:53:28,057 WARNING [train.py:1204] (2/4) Exclude cut with ID _59500_210_8254_1_1534809610680_3736009_162-180695-0_sp1.1 from training. Duration: 0.936375 2023-10-02 16:53:28,084 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7544_210_6753_1_1528591327_1471426_146-93357-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 16:53:30,702 WARNING [train.py:1204] (2/4) Exclude cut with ID _42657_210_2070_1_1533520654016_3327008_405-87432-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 16:53:30,735 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_4528_210_3273_1_1527076654_3276777_209-219804-0 from training. Duration: 0.69 2023-10-02 16:53:30,736 WARNING [train.py:1204] (2/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_78-182912-0_sp0.9 from training. Duration: 0.6555625 2023-10-02 16:53:30,750 WARNING [train.py:1204] (2/4) Exclude cut with ID _31255_210_13427_1_1532865619495_3815143_322-114998-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 16:53:34,901 WARNING [train.py:1204] (2/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_848-280957-0_sp0.9 from training. Duration: 0.9666875 2023-10-02 16:53:34,932 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7696_210_2040_1_1528207167_3716099_117-125434-0_sp0.9 from training. Duration: 0.8444375 2023-10-02 16:53:37,671 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_1002-109192-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 16:53:38,921 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.458e+02 1.864e+02 2.019e+02 2.246e+02 3.268e+02, threshold=4.037e+02, percent-clipped=0.0 2023-10-02 16:53:39,110 WARNING [train.py:1204] (2/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_138-64210-0_sp0.9 from training. Duration: 0.811125 2023-10-02 16:53:39,126 WARNING [train.py:1204] (2/4) Exclude cut with ID _25808_210_13292_1_1532343514562_7366109_633-47524-0_sp1.1 from training. Duration: 0.9 2023-10-02 16:53:42,383 WARNING [train.py:1204] (2/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_297-5233-0 from training. Duration: 0.95 2023-10-02 16:53:42,607 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.feed_forward3.hidden_balancer.prob, batch_count=951033.3333333334, ans=0.125 2023-10-02 16:53:45,108 WARNING [train.py:1204] (2/4) Exclude cut with ID _8394_210_6702_1_1528590504316_3540189_335-274780-0 from training. Duration: 0.78 2023-10-02 16:53:45,113 WARNING [train.py:1204] (2/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_301-330455-0 from training. Duration: 0.8 2023-10-02 16:53:49,797 INFO [train.py:1046] (2/4) Epoch 27, batch 4550, loss[loss=0.1523, simple_loss=0.2327, pruned_loss=0.03598, over 24600.00 frames. ], tot_loss[loss=0.1669, simple_loss=0.2445, pruned_loss=0.04469, over 4717635.85 frames. ], batch size: 60, lr: 3.77e-03, grad_scale: 8.0 2023-10-02 16:53:49,868 WARNING [train.py:1204] (2/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_383-205833-0 from training. Duration: 0.95 2023-10-02 16:53:51,307 WARNING [train.py:1204] (2/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_48-182155-0 from training. Duration: 0.87 2023-10-02 16:53:53,355 WARNING [train.py:1204] (2/4) Exclude cut with ID _14832_210_8750_1_1530691351539_3171348_1-54315-0_sp1.1 from training. Duration: 0.7909375 2023-10-02 16:53:54,847 WARNING [train.py:1204] (2/4) Exclude cut with ID _44982_210_1511_1_1534312820108_3726029_413-100814-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 16:53:56,192 WARNING [train.py:1204] (2/4) Exclude cut with ID _3231_210_2106_1_1526205864934_6784220_104-261392-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 16:53:56,503 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.ff2_skip_rate, batch_count=951100.0, ans=0.0 2023-10-02 16:53:58,953 WARNING [train.py:1204] (2/4) Exclude cut with ID _24314_210_4915_1_1532135078116_7170179_1-13588-0_sp1.1 from training. Duration: 0.97275 2023-10-02 16:54:01,856 WARNING [train.py:1204] (2/4) Exclude cut with ID _44304_210_15374_1_1533639352765_4613129_141-8356-0_sp1.1 from training. Duration: 0.963625 2023-10-02 16:54:03,245 WARNING [train.py:1204] (2/4) Exclude cut with ID _37892_210_19596_1_1533198677897_3964058_374-335034-0_sp1.1 from training. Duration: 0.936375 2023-10-02 16:54:05,932 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_195-257512-0_sp0.9 from training. Duration: 0.7444375 2023-10-02 16:54:05,935 WARNING [train.py:1204] (2/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_1267-272693-0_sp1.1 from training. Duration: 0.663625 2023-10-02 16:54:05,935 WARNING [train.py:1204] (2/4) Exclude cut with ID _30196_210_1664_1_1533708496522_7074140_690-326155-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 16:54:08,733 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_68847_210_14656_1_1536070214_2586843_38-20717-0_sp1.1 from training. Duration: 0.97275 2023-10-02 16:54:08,777 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_99-251314-0_sp1.1 from training. Duration: 0.7909375 2023-10-02 16:54:09,017 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.ff3_skip_rate, batch_count=951166.6666666666, ans=0.0 2023-10-02 16:54:13,538 WARNING [train.py:1204] (2/4) Exclude cut with ID _49376_210_5986_1_1533985278732_3370740_225-95630-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 16:54:16,253 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_2655_210_2087_1_1525688959_3897400_202-147557-0 from training. Duration: 0.85 2023-10-02 16:54:17,585 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_5618_210_1874_1_1527311008_5851_1-214501-0_sp1.1 from training. Duration: 0.626375 2023-10-02 16:54:18,941 WARNING [train.py:1204] (2/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_225-43381-0_sp1.1 from training. Duration: 0.8545625 2023-10-02 16:54:19,046 WARNING [train.py:1204] (2/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_242-3472-0 from training. Duration: 0.73 2023-10-02 16:54:24,183 WARNING [train.py:1204] (2/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_339-104791-0 from training. Duration: 0.49 2023-10-02 16:54:24,251 WARNING [train.py:1204] (2/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_488-92261-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 16:54:26,261 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.3.encoder.layers.2.feed_forward2.out_whiten, num_groups=1, num_channels=512, metric=10.49 vs. limit=15.0 2023-10-02 16:54:26,953 WARNING [train.py:1204] (2/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_175-163894-0 from training. Duration: 0.74 2023-10-02 16:54:28,326 WARNING [train.py:1204] (2/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_41-234353-0_sp0.9 from training. Duration: 0.7555625 2023-10-02 16:54:31,156 WARNING [train.py:1204] (2/4) Exclude cut with ID _47251_210_18628_1_1533814063128_4137250_116-275927-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 16:54:31,186 WARNING [train.py:1204] (2/4) Exclude cut with ID _4697_210_2069_1_1526815801953_3823098_61-223324-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 16:54:32,465 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6961_210_2087_1_1527937168_6815011_562-165518-0_sp1.1 from training. Duration: 0.7 2023-10-02 16:54:33,938 WARNING [train.py:1204] (2/4) Exclude cut with ID _21989_210_5854_1_1531899214931_3271360_133-44759-0 from training. Duration: 0.77 2023-10-02 16:54:36,608 WARNING [train.py:1204] (2/4) Exclude cut with ID _23557_210_8750_1_1531890106370_6846255_77-284923-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 16:54:39,351 WARNING [train.py:1204] (2/4) Exclude cut with ID _13678_210_7497_1_1530530069226_3463170_464-296127-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 16:54:39,374 WARNING [train.py:1204] (2/4) Exclude cut with ID _22613_210_5219_1_1532253599686_4090170_393-36663-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 16:54:40,767 WARNING [train.py:1204] (2/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_235-262124-0_sp0.9 from training. Duration: 0.7444375 2023-10-02 16:54:42,157 WARNING [train.py:1204] (2/4) Exclude cut with ID _8480_210_1874_1_1528589391205_6728330_755-112625-0 from training. Duration: 0.74 2023-10-02 16:54:42,224 WARNING [train.py:1204] (2/4) Exclude cut with ID _6456_210_1534_1_1527681634212_3181049_215-309359-0 from training. Duration: 0.88 2023-10-02 16:54:42,242 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_3019_210_2028_1_1526044245_3264865_191-72235-0_sp1.1 from training. Duration: 0.8818125 2023-10-02 16:54:43,566 WARNING [train.py:1204] (2/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_380-162648-0 from training. Duration: 0.94 2023-10-02 16:54:45,582 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_92-46009-0 from training. Duration: 0.54 2023-10-02 16:54:45,598 WARNING [train.py:1204] (2/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_53-300544-0_sp0.9 from training. Duration: 0.7444375 2023-10-02 16:54:46,896 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_68235_210_1925_1_1535863247_4484656_101-250943-0_sp1.1 from training. Duration: 0.97275 2023-10-02 16:54:46,911 WARNING [train.py:1204] (2/4) Exclude cut with ID _14970_210_15709_1_1530775547361_1966965_204-22182-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 16:54:48,746 WARNING [train.py:1204] (2/4) Exclude cut with ID _55153_210_2253_1_1534384786677_3162096_310-130092-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 16:54:48,760 WARNING [train.py:1204] (2/4) Exclude cut with ID _60113_210_16271_1_1535164212481_4386079_559-167191-0_sp1.1 from training. Duration: 0.8454375 2023-10-02 16:54:50,773 WARNING [train.py:1204] (2/4) Exclude cut with ID _14368_210_4220_1_1530532714870_3595529_93-58002-0_sp1.1 from training. Duration: 0.5181875 2023-10-02 16:54:52,121 WARNING [train.py:1204] (2/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_110-224688-0 from training. Duration: 0.97 2023-10-02 16:54:53,539 WARNING [train.py:1204] (2/4) Exclude cut with ID _40402_210_18219_1_1533522324115_2953150_178-178004-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 16:54:54,809 WARNING [train.py:1204] (2/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_321-335475-0_sp1.1 from training. Duration: 0.4090625 2023-10-02 16:54:54,871 WARNING [train.py:1204] (2/4) Exclude cut with ID _66863_210_18053_1_1535698758334_8590319_1092-271784-0 from training. Duration: 0.96 2023-10-02 16:54:54,878 WARNING [train.py:1204] (2/4) Exclude cut with ID _28317_210_12742_1_1532480302381_8510325_607-213951-0_sp1.1 from training. Duration: 0.763625 2023-10-02 16:54:54,898 WARNING [train.py:1204] (2/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_468-283113-0 from training. Duration: 0.96 2023-10-02 16:54:57,011 WARNING [train.py:1204] (2/4) Exclude cut with ID _14922_210_6228_1_1530707435273_3693310_13-165512-0_sp1.1 from training. Duration: 0.7454375 2023-10-02 16:54:57,032 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_69630_210_7234_1_1535788670_910989_56-169762-0_sp1.1 from training. Duration: 0.92725 2023-10-02 16:54:58,443 WARNING [train.py:1204] (2/4) Exclude cut with ID _67335_210_5219_1_1535525975652_4275229_121-80357-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 16:54:58,491 WARNING [train.py:1204] (2/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_126-12079-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 16:54:59,750 WARNING [train.py:1204] (2/4) Exclude cut with ID _5294_210_1747_1_1527249643928_3696179_342-270278-0_sp1.1 from training. Duration: 0.5 2023-10-02 16:55:01,168 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_30322_210_1925_1_1532843478_826817_35-328264-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 16:55:02,421 WARNING [train.py:1204] (2/4) Exclude cut with ID _8480_210_1874_1_1528589391205_6728330_755-112625-0_sp1.1 from training. Duration: 0.67275 2023-10-02 16:55:03,696 INFO [train.py:1046] (2/4) Epoch 27, batch 4600, loss[loss=0.1501, simple_loss=0.2251, pruned_loss=0.03751, over 23445.00 frames. ], tot_loss[loss=0.166, simple_loss=0.2433, pruned_loss=0.0444, over 4708561.36 frames. ], batch size: 285, lr: 3.77e-03, grad_scale: 8.0 2023-10-02 16:55:05,182 WARNING [train.py:1204] (2/4) Exclude cut with ID _38670_210_15379_1_1533448810659_4391059_132-146412-0_sp1.1 from training. Duration: 0.97275 2023-10-02 16:55:06,550 WARNING [train.py:1204] (2/4) Exclude cut with ID _22020_210_2461_1_1531724164513_6326900_563-39691-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 16:55:09,280 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_9460_210_4211_1_1528954587_10049667_572-37035-0_sp1.1 from training. Duration: 0.72725 2023-10-02 16:55:09,296 WARNING [train.py:1204] (2/4) Exclude cut with ID _68777_210_4353_1_1535810390731_3117904_129-166012-0_sp0.9 from training. Duration: 0.9444375 2023-10-02 16:55:10,578 WARNING [train.py:1204] (2/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_487-91921-0_sp1.1 from training. Duration: 0.963625 2023-10-02 16:55:11,973 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_195-257512-0 from training. Duration: 0.67 2023-10-02 16:55:14,009 WARNING [train.py:1204] (2/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_188-260011-0_sp1.1 from training. Duration: 0.663625 2023-10-02 16:55:15,463 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.feed_forward2.hidden_balancer.prob, batch_count=951433.3333333334, ans=0.125 2023-10-02 16:55:16,739 WARNING [train.py:1204] (2/4) Exclude cut with ID _8480_210_1874_1_1528589391205_6728330_309-139896-0_sp0.9 from training. Duration: 0.9 2023-10-02 16:55:17,043 INFO [scaling.py:1118] (2/4) WithLoss: name=encoder.encoders.3.encoder.layers.1.self_attn_weights, loss-sum=0.000e+00 2023-10-02 16:55:18,639 WARNING [train.py:1204] (2/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_866-321248-0_sp1.1 from training. Duration: 0.963625 2023-10-02 16:55:20,653 WARNING [train.py:1204] (2/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_510-216513-0_sp1.1 from training. Duration: 0.97275 2023-10-02 16:55:28,189 WARNING [train.py:1204] (2/4) Exclude cut with ID _6653_210_2629_1_1527919172441_6732679_758-143677-0 from training. Duration: 0.98 2023-10-02 16:55:28,494 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.ff2_skip_rate, batch_count=951500.0, ans=0.0 2023-10-02 16:55:29,631 WARNING [train.py:1204] (2/4) Exclude cut with ID _55134_210_21982_1_1534590028377_3882170_50-214427-0_sp1.1 from training. Duration: 0.97275 2023-10-02 16:55:31,217 WARNING [train.py:1204] (2/4) Exclude cut with ID _31649_210_6228_1_1532584608063_4091078_482-301330-0_sp1.1 from training. Duration: 0.97275 2023-10-02 16:55:34,002 WARNING [train.py:1204] (2/4) Exclude cut with ID _35799_210_5854_1_1533263487887_3529071_490-326847-0_sp1.1 from training. Duration: 0.9 2023-10-02 16:55:34,010 WARNING [train.py:1204] (2/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_984-90716-0_sp1.1 from training. Duration: 0.963625 2023-10-02 16:55:39,396 WARNING [train.py:1204] (2/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_166-246557-0 from training. Duration: 0.64 2023-10-02 16:55:39,396 WARNING [train.py:1204] (2/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_51-200220-0_sp1.1 from training. Duration: 0.5090625 2023-10-02 16:55:39,455 WARNING [train.py:1204] (2/4) Exclude cut with ID _51382_210_23862_1_1534492801838_3650104_275-92796-0_sp1.1 from training. Duration: 0.936375 2023-10-02 16:55:44,105 WARNING [train.py:1204] (2/4) Exclude cut with ID _29853_210_7497_1_1532505600851_6642110_461-142829-0_sp1.1 from training. Duration: 0.97275 2023-10-02 16:55:45,389 WARNING [train.py:1204] (2/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_788-298907-0_sp0.9 from training. Duration: 0.988875 2023-10-02 16:55:47,430 WARNING [train.py:1204] (2/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_739-2098-0_sp1.1 from training. Duration: 0.836375 2023-10-02 16:55:47,946 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.5.encoder.layers.1.feed_forward3.out_whiten, num_groups=1, num_channels=256, metric=9.52 vs. limit=15.0 2023-10-02 16:55:50,692 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7543_210_6753_1_1528541907_1399322_165-323619-0 from training. Duration: 0.69 2023-10-02 16:55:52,044 WARNING [train.py:1204] (2/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_401-255491-0_sp1.1 from training. Duration: 0.636375 2023-10-02 16:55:56,229 WARNING [train.py:1204] (2/4) Exclude cut with ID _66873_210_5517_1_1535454136506_4293370_24-143283-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 16:55:56,322 WARNING [train.py:1204] (2/4) Exclude cut with ID _14459_210_2228_1_1531443641946_7516700_1221-280439-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 16:55:59,496 WARNING [train.py:1204] (2/4) Exclude cut with ID _59202_210_26984_1_1534816810612_3549174_469-213482-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 16:55:59,497 WARNING [train.py:1204] (2/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_148-158509-0_sp0.9 from training. Duration: 0.4555625 2023-10-02 16:56:00,807 WARNING [train.py:1204] (2/4) Exclude cut with ID _24314_210_4915_1_1532135078116_7170179_863-259095-0_sp1.1 from training. Duration: 0.97275 2023-10-02 16:56:00,871 WARNING [train.py:1204] (2/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_321-22821-0 from training. Duration: 0.89 2023-10-02 16:56:00,894 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_43831_210_2158_1_1533725502_5872791_55-163137-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 16:56:00,941 WARNING [train.py:1204] (2/4) Exclude cut with ID _27430_210_7770_1_1532604689375_1378590_26-25207-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 16:56:03,583 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_17613_210_2151_1_1531137569_2366160_223-316461-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 16:56:03,653 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_53679_210_4852_1_1534556726_4186141_541-213370-0_sp1.1 from training. Duration: 0.963625 2023-10-02 16:56:04,987 WARNING [train.py:1204] (2/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_369-295086-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 16:56:05,038 WARNING [train.py:1204] (2/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_91-267696-0 from training. Duration: 0.87 2023-10-02 16:56:06,336 WARNING [train.py:1204] (2/4) Exclude cut with ID _5294_210_1747_1_1527249643928_3696179_180-100713-0 from training. Duration: 0.81 2023-10-02 16:56:06,368 WARNING [train.py:1204] (2/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_32-335103-0 from training. Duration: 0.82 2023-10-02 16:56:06,373 WARNING [train.py:1204] (2/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_133-312727-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 16:56:07,496 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.372e+02 1.867e+02 2.097e+02 2.357e+02 3.231e+02, threshold=4.193e+02, percent-clipped=0.0 2023-10-02 16:56:07,621 WARNING [train.py:1204] (2/4) Exclude cut with ID _59288_210_5854_1_1535077757774_3412246_391-165934-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 16:56:07,666 WARNING [train.py:1204] (2/4) Exclude cut with ID _28323_210_16187_1_1532480395667_4100300_488-1281-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 16:56:10,337 WARNING [train.py:1204] (2/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_124-193207-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 16:56:18,267 INFO [train.py:1046] (2/4) Epoch 27, batch 4650, loss[loss=0.1481, simple_loss=0.2354, pruned_loss=0.03044, over 24495.00 frames. ], tot_loss[loss=0.1658, simple_loss=0.2427, pruned_loss=0.04438, over 4714578.41 frames. ], batch size: 63, lr: 3.76e-03, grad_scale: 8.0 2023-10-02 16:56:20,402 WARNING [train.py:1204] (2/4) Exclude cut with ID _14087_210_2253_1_1530529224512_3014029_226-242140-0_sp0.9 from training. Duration: 0.9 2023-10-02 16:56:24,314 WARNING [train.py:1204] (2/4) Exclude cut with ID _29277_210_3279_1_1532581109293_3173069_392-3694-0_sp1.1 from training. Duration: 0.936375 2023-10-02 16:56:24,357 WARNING [train.py:1204] (2/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_563-329842-0_sp1.1 from training. Duration: 0.97275 2023-10-02 16:56:25,621 WARNING [train.py:1204] (2/4) Exclude cut with ID _58583_210_5236_1_1534726835266_3574249_391-87755-0_sp1.1 from training. Duration: 0.92725 2023-10-02 16:56:25,647 WARNING [train.py:1204] (2/4) Exclude cut with ID _28427_210_4220_1_1532581240967_3061069_281-237827-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 16:56:26,962 WARNING [train.py:1204] (2/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_44-147898-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 16:56:27,043 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_24007_210_1794_1_1531918925_1663875_125-320772-0_sp1.1 from training. Duration: 0.97275 2023-10-02 16:56:29,781 WARNING [train.py:1204] (2/4) Exclude cut with ID _14368_210_4220_1_1530532714870_3595529_143-117421-0 from training. Duration: 0.58 2023-10-02 16:56:34,342 WARNING [train.py:1204] (2/4) Exclude cut with ID _69584_210_5219_1_1535785133775_4044450_325-7656-0_sp0.9 from training. Duration: 0.9555625 2023-10-02 16:56:35,788 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_37351_210_7925_1_1533012931_3812113_265-85780-0 from training. Duration: 0.88 2023-10-02 16:56:35,818 WARNING [train.py:1204] (2/4) Exclude cut with ID _18516_210_9206_1_1531704653206_3692406_331-237896-0_sp1.1 from training. Duration: 0.936375 2023-10-02 16:56:37,149 WARNING [train.py:1204] (2/4) Exclude cut with ID _14629_210_15709_1_1530604298182_956899_6-232695-0 from training. Duration: 0.94 2023-10-02 16:56:37,177 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7544_210_6753_1_1528587000_691468_87-178599-0_sp1.1 from training. Duration: 0.7909375 2023-10-02 16:56:37,231 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_326-303945-0 from training. Duration: 0.99 2023-10-02 16:56:38,615 WARNING [train.py:1204] (2/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_503-30719-0 from training. Duration: 0.98 2023-10-02 16:56:38,623 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_11305_210_1794_1_1529750428_7178148_299-344640-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 16:56:38,683 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_53071_210_16554_1_1534490405_2954066_265-170472-0_sp1.1 from training. Duration: 0.7545625 2023-10-02 16:56:41,523 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_174-277364-0_sp1.1 from training. Duration: 0.6454375 2023-10-02 16:56:41,613 WARNING [train.py:1204] (2/4) Exclude cut with ID _58414_210_8254_1_1535421609162_7151182_193-151884-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 16:56:43,347 WARNING [train.py:1204] (2/4) Exclude cut with ID IC0651W0104-322063 from training. Duration: 0.768 2023-10-02 16:56:45,990 WARNING [train.py:1204] (2/4) Exclude cut with ID _21881_210_3525_1_1531825242757_3798380_139-305982-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 16:56:47,348 WARNING [train.py:1204] (2/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_79-200392-0 from training. Duration: 0.97 2023-10-02 16:56:49,530 WARNING [train.py:1204] (2/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_451-280894-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 16:56:49,537 WARNING [train.py:1204] (2/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_304-273433-0_sp1.1 from training. Duration: 0.9 2023-10-02 16:56:51,568 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_5959_210_1794_1_1527400400_3445726_412-44052-0 from training. Duration: 0.77 2023-10-02 16:56:52,882 WARNING [train.py:1204] (2/4) Exclude cut with ID _4681_210_3525_1_1527251040321_1235713_148-26159-0_sp1.1 from training. Duration: 0.963625 2023-10-02 16:56:58,142 WARNING [train.py:1204] (2/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_887-249555-0_sp0.9 from training. Duration: 0.8555625 2023-10-02 16:57:01,098 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_25236_210_2158_1_1532051968_280567_37-6860-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 16:57:05,844 WARNING [train.py:1204] (2/4) Exclude cut with ID _29277_210_3279_1_1532581109293_3173069_417-115527-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 16:57:07,245 WARNING [train.py:1204] (2/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_552-134748-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 16:57:07,316 WARNING [train.py:1204] (2/4) Exclude cut with ID _43176_210_8254_1_1533524417469_4174339_188-174143-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 16:57:08,547 WARNING [train.py:1204] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_127-119109-0_sp1.1 from training. Duration: 0.6545625 2023-10-02 16:57:11,572 WARNING [train.py:1204] (2/4) Exclude cut with ID _14922_210_6228_1_1530707435273_3693310_38-187851-0 from training. Duration: 0.62 2023-10-02 16:57:12,874 WARNING [train.py:1204] (2/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_588-125165-0 from training. Duration: 0.83 2023-10-02 16:57:14,152 WARNING [train.py:1204] (2/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_344-152526-0_sp1.1 from training. Duration: 0.4818125 2023-10-02 16:57:14,153 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7002_210_1794_1_1527935634_4034444_487-236501-0 from training. Duration: 0.66 2023-10-02 16:57:16,043 WARNING [train.py:1204] (2/4) Exclude cut with ID _23825_210_13449_1_1531915313526_3497150_379-16981-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 16:57:23,634 WARNING [train.py:1204] (2/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_297-5233-0_sp1.1 from training. Duration: 0.863625 2023-10-02 16:57:23,639 WARNING [train.py:1204] (2/4) Exclude cut with ID _41298_210_9019_1_1533430371450_4822143_178-283538-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 16:57:23,665 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7046_210_6753_1_1527895337_6920917_642-106799-0 from training. Duration: 0.92 2023-10-02 16:57:23,679 WARNING [train.py:1204] (2/4) Exclude cut with ID _6274_210_2084_1_1527937583837_7494020_532-291952-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 16:57:23,792 WARNING [train.py:1204] (2/4) Exclude cut with ID _47278_210_12041_1_1533902418349_3576110_260-270468-0_sp1.1 from training. Duration: 0.92725 2023-10-02 16:57:25,664 WARNING [train.py:1204] (2/4) Exclude cut with ID _14368_210_4220_1_1530532714870_3595529_144-3287-0_sp0.9 from training. Duration: 0.7444375 2023-10-02 16:57:27,104 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_162-262717-0_sp0.9 from training. Duration: 0.92225 2023-10-02 16:57:27,365 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.conv_skip_rate, batch_count=952033.3333333334, ans=0.0 2023-10-02 16:57:28,581 WARNING [train.py:1204] (2/4) Exclude cut with ID _3465_210_3525_1_1526175667060_3574049_62-335552-0_sp0.9 from training. Duration: 0.9666875 2023-10-02 16:57:28,586 WARNING [train.py:1204] (2/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_726-272852-0_sp1.1 from training. Duration: 0.92725 2023-10-02 16:57:28,660 WARNING [train.py:1204] (2/4) Exclude cut with ID _14898_210_9206_1_1530766812377_4177190_26-135492-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 16:57:31,443 WARNING [train.py:1204] (2/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_200-95304-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 16:57:31,494 WARNING [train.py:1204] (2/4) Exclude cut with ID _45012_210_12651_1_1533608435136_5371380_240-37101-0_sp1.1 from training. Duration: 0.8454375 2023-10-02 16:57:31,502 WARNING [train.py:1204] (2/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_132-269633-0_sp0.9 from training. Duration: 0.7555625 2023-10-02 16:57:32,757 INFO [train.py:1046] (2/4) Epoch 27, batch 4700, loss[loss=0.1768, simple_loss=0.2488, pruned_loss=0.05245, over 23838.00 frames. ], tot_loss[loss=0.166, simple_loss=0.2433, pruned_loss=0.04435, over 4718321.11 frames. ], batch size: 195, lr: 3.76e-03, grad_scale: 8.0 2023-10-02 16:57:32,854 WARNING [train.py:1204] (2/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_907-151398-0_sp0.9 from training. Duration: 0.411125 2023-10-02 16:57:34,664 WARNING [train.py:1204] (2/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_45-265684-0_sp0.9 from training. Duration: 0.8 2023-10-02 16:57:36,108 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6291_210_3273_1_1527683649_1078498_74-292608-0 from training. Duration: 0.72 2023-10-02 16:57:43,184 WARNING [train.py:1204] (2/4) Exclude cut with ID _59202_210_26984_1_1534816810612_3549174_221-84819-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 16:57:45,175 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_33696_210_3852_1_1533082780_2663375_181-325595-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 16:57:45,220 WARNING [train.py:1204] (2/4) Exclude cut with ID _47251_210_18628_1_1533814063128_4137250_351-154105-0_sp1.1 from training. Duration: 0.963625 2023-10-02 16:57:46,443 WARNING [train.py:1204] (2/4) Exclude cut with ID _35501_210_9288_1_1532998507986_3192189_351-334740-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 16:57:47,878 WARNING [train.py:1204] (2/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_342-100157-0_sp1.1 from training. Duration: 0.4909375 2023-10-02 16:57:49,408 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.self_attn_weights.pos_emb_skip_rate, batch_count=952166.6666666666, ans=0.0 2023-10-02 16:57:52,541 WARNING [train.py:1204] (2/4) Exclude cut with ID _6789_210_1534_1_1527857937218_3118950_262-48853-0 from training. Duration: 0.56 2023-10-02 16:57:52,577 WARNING [train.py:1204] (2/4) Exclude cut with ID _69884_210_9771_1_1535846392818_4166169_430-201020-0 from training. Duration: 0.97 2023-10-02 16:57:52,821 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.conv_module1.balancer2.prob, batch_count=952166.6666666666, ans=0.125 2023-10-02 16:57:54,039 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_71-136518-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 16:57:55,333 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_28-291982-0_sp1.1 from training. Duration: 0.8818125 2023-10-02 16:57:55,357 WARNING [train.py:1204] (2/4) Exclude cut with ID _54218_210_12210_1_1534413646388_4409259_437-275326-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 16:57:59,426 WARNING [train.py:1204] (2/4) Exclude cut with ID _55134_210_21982_1_1534590028377_3882170_40-309139-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 16:58:06,584 WARNING [train.py:1204] (2/4) Exclude cut with ID _32704_210_8954_1_1533017488840_6747370_266-329755-0_sp1.1 from training. Duration: 0.8090625 2023-10-02 16:58:07,771 WARNING [train.py:1204] (2/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_21-174514-0_sp1.1 from training. Duration: 0.3909375 2023-10-02 16:58:09,211 WARNING [train.py:1204] (2/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_409-207378-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 16:58:14,774 WARNING [train.py:1204] (2/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_132-269633-0 from training. Duration: 0.68 2023-10-02 16:58:15,067 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.bypass.scale_min, batch_count=952300.0, ans=0.2 2023-10-02 16:58:16,766 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_2245_210_2715_1_1525511548_3540254_310-52483-0_sp0.9 from training. Duration: 0.97775 2023-10-02 16:58:18,021 WARNING [train.py:1204] (2/4) Exclude cut with ID _27430_210_7770_1_1532604689375_1378590_47-77403-0_sp1.1 from training. Duration: 0.92725 2023-10-02 16:58:21,412 WARNING [train.py:1204] (2/4) Exclude cut with ID _7562_210_2084_1_1528887767916_3946229_186-326422-0 from training. Duration: 0.84 2023-10-02 16:58:24,126 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_14056_210_4163_1_1530604483_7572827_230-35993-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 16:58:28,827 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_23854_210_1794_1_1531969696_1329157_57-123491-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 16:58:28,864 WARNING [train.py:1204] (2/4) Exclude cut with ID _14747_210_8341_1_1530685797463_3654089_147-131229-0 from training. Duration: 0.76 2023-10-02 16:58:31,582 WARNING [train.py:1204] (2/4) Exclude cut with ID _30191_210_1664_1_1533189915047_7196670_430-19539-0_sp1.1 from training. Duration: 0.92725 2023-10-02 16:58:31,596 WARNING [train.py:1204] (2/4) Exclude cut with ID _67725_210_27767_1_1535608756671_5105359_44-50762-0_sp1.1 from training. Duration: 0.963625 2023-10-02 16:58:33,056 WARNING [train.py:1204] (2/4) Exclude cut with ID _27485_210_4916_1_1532739191677_3668269_217-161755-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 16:58:34,359 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_5931_210_2040_1_1527428481_4547999_321-193614-0_sp1.1 from training. Duration: 0.6090625 2023-10-02 16:58:34,375 WARNING [train.py:1204] (2/4) Exclude cut with ID _6267_210_2084_1_1527764658526_3894730_375-219887-0 from training. Duration: 0.67 2023-10-02 16:58:36,115 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.512e+02 1.791e+02 1.934e+02 2.224e+02 4.121e+02, threshold=3.867e+02, percent-clipped=0.0 2023-10-02 16:58:36,195 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9087W0022-538393 from training. Duration: 0.768 2023-10-02 16:58:37,636 WARNING [train.py:1204] (2/4) Exclude cut with ID _68777_210_4353_1_1535810390731_3117904_83-196459-0_sp1.1 from training. Duration: 0.963625 2023-10-02 16:58:41,548 WARNING [train.py:1204] (2/4) Exclude cut with ID _69884_210_9771_1_1535846392818_4166169_455-343812-0_sp1.1 from training. Duration: 0.92725 2023-10-02 16:58:41,549 WARNING [train.py:1204] (2/4) Exclude cut with ID _13916_210_9994_1_1530788341126_3763149_414-320174-0_sp1.1 from training. Duration: 0.92725 2023-10-02 16:58:41,553 WARNING [train.py:1204] (2/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_398-239819-0 from training. Duration: 0.99 2023-10-02 16:58:41,637 WARNING [train.py:1204] (2/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_199-232314-0_sp1.1 from training. Duration: 0.92725 2023-10-02 16:58:45,874 INFO [train.py:1046] (2/4) Epoch 27, batch 4750, loss[loss=0.1511, simple_loss=0.23, pruned_loss=0.03614, over 19943.00 frames. ], tot_loss[loss=0.1668, simple_loss=0.2445, pruned_loss=0.04456, over 4726108.13 frames. ], batch size: 43, lr: 3.76e-03, grad_scale: 8.0 2023-10-02 16:58:45,941 WARNING [train.py:1204] (2/4) Exclude cut with ID _2334_210_2667_1_1525608238880_4705129_459-104997-0 from training. Duration: 0.95 2023-10-02 16:58:47,459 WARNING [train.py:1204] (2/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_441-209395-0_sp1.1 from training. Duration: 0.936375 2023-10-02 16:58:48,683 WARNING [train.py:1204] (2/4) Exclude cut with ID _27052_210_5986_1_1532329197187_3585009_320-80393-0_sp1.1 from training. Duration: 0.97275 2023-10-02 16:58:53,489 WARNING [train.py:1204] (2/4) Exclude cut with ID _21783_210_11357_1_1531742458730_3762679_89-217253-0_sp1.1 from training. Duration: 0.97275 2023-10-02 16:58:53,503 WARNING [train.py:1204] (2/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_453-282588-0_sp1.1 from training. Duration: 0.8909375 2023-10-02 16:58:55,427 WARNING [train.py:1204] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_88-341718-0 from training. Duration: 0.66 2023-10-02 16:58:55,458 WARNING [train.py:1204] (2/4) Exclude cut with ID _43552_210_8913_1_1533726031059_3523429_230-3521-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 16:59:01,221 WARNING [train.py:1204] (2/4) Exclude cut with ID _9679_210_7497_1_1529129212906_4363929_512-313889-0 from training. Duration: 0.81 2023-10-02 16:59:02,620 WARNING [train.py:1204] (2/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_194-252800-0_sp1.1 from training. Duration: 0.8818125 2023-10-02 16:59:02,639 WARNING [train.py:1204] (2/4) Exclude cut with ID _55209_210_1531_1_1534675828996_4597160_239-293541-0_sp1.1 from training. Duration: 0.963625 2023-10-02 16:59:03,995 WARNING [train.py:1204] (2/4) Exclude cut with ID _35799_210_5854_1_1533263487887_3529071_15-325131-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 16:59:05,523 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.conv_module2.balancer2.prob, batch_count=952500.0, ans=0.125 2023-10-02 16:59:08,294 WARNING [train.py:1204] (2/4) Exclude cut with ID _14100_210_8341_1_1530529320613_3678069_279-55760-0 from training. Duration: 0.93 2023-10-02 16:59:12,927 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_489-230703-0_sp1.1 from training. Duration: 0.77275 2023-10-02 16:59:14,288 WARNING [train.py:1204] (2/4) Exclude cut with ID _6694_210_4599_1_1527852637490_3965529_144-185051-0 from training. Duration: 0.96 2023-10-02 16:59:14,561 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.feed_forward2.hidden_balancer.prob, batch_count=952566.6666666666, ans=0.125 2023-10-02 16:59:15,671 WARNING [train.py:1204] (2/4) Exclude cut with ID _28461_210_2253_1_1532771775189_3041299_1-228155-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 16:59:18,436 WARNING [train.py:1204] (2/4) Exclude cut with ID _32704_210_8954_1_1533017488840_6747370_686-252323-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 16:59:18,439 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_40896_210_15710_1_1533433956_7100802_1075-294633-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 16:59:18,450 WARNING [train.py:1204] (2/4) Exclude cut with ID _22083_210_8254_1_1531702862439_3678970_30-92879-0_sp1.1 from training. Duration: 0.97275 2023-10-02 16:59:19,832 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9245W0121-616888 from training. Duration: 0.896 2023-10-02 16:59:19,835 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6786_210_1388_1_1527839699_4194495_378-46070-0 from training. Duration: 0.72 2023-10-02 16:59:25,782 WARNING [train.py:1204] (2/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_317-175062-0 from training. Duration: 0.82 2023-10-02 16:59:27,521 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.4.encoder.layers.2.nonlin_attention.whiten2, num_groups=1, num_channels=384, metric=3.65 vs. limit=15.0 2023-10-02 16:59:28,938 WARNING [train.py:1204] (2/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_238-89390-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 16:59:29,140 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.attention_skip_rate, batch_count=952633.3333333334, ans=0.0 2023-10-02 16:59:30,374 WARNING [train.py:1204] (2/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_524-310811-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 16:59:33,106 WARNING [train.py:1204] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_307-158462-0_sp0.9 from training. Duration: 0.7666875 2023-10-02 16:59:33,106 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9309W0191-640318 from training. Duration: 0.512 2023-10-02 16:59:33,110 WARNING [train.py:1204] (2/4) Exclude cut with ID _55153_210_2253_1_1534384786677_3162096_100-204617-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 16:59:35,910 WARNING [train.py:1204] (2/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_112-149213-0_sp1.1 from training. Duration: 0.863625 2023-10-02 16:59:37,337 WARNING [train.py:1204] (2/4) Exclude cut with ID _67831_210_12392_1_1535612464202_3092612_346-2148-0_sp1.1 from training. Duration: 0.8454375 2023-10-02 16:59:40,101 WARNING [train.py:1204] (2/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_51-117576-0_sp0.9 from training. Duration: 0.4444375 2023-10-02 16:59:40,122 WARNING [train.py:1204] (2/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_300-27235-0 from training. Duration: 0.87 2023-10-02 16:59:40,182 WARNING [train.py:1204] (2/4) Exclude cut with ID _40399_210_4353_1_1533305658194_3279406_195-116825-0_sp1.1 from training. Duration: 0.97275 2023-10-02 16:59:40,203 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7002_210_1794_1_1527939683_5157286_163-87552-0_sp0.9 from training. Duration: 0.9555625 2023-10-02 16:59:40,231 WARNING [train.py:1204] (2/4) Exclude cut with ID _19514_210_9259_1_1531530071330_5084230_560-237597-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 16:59:42,089 WARNING [train.py:1204] (2/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_41-234353-0_sp1.1 from training. Duration: 0.6181875 2023-10-02 16:59:42,105 WARNING [train.py:1204] (2/4) Exclude cut with ID _6274_210_2084_1_1527937583837_7494020_769-168302-0 from training. Duration: 0.71 2023-10-02 16:59:44,787 WARNING [train.py:1204] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_574-45541-0 from training. Duration: 0.65 2023-10-02 16:59:48,680 WARNING [train.py:1204] (2/4) Exclude cut with ID _50310_210_15488_1_1534068043345_3641080_262-67120-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 16:59:50,140 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_53071_210_16554_1_1534493434_1276796_35-11927-0_sp1.1 from training. Duration: 0.963625 2023-10-02 16:59:50,142 WARNING [train.py:1204] (2/4) Exclude cut with ID _3992_210_1747_1_1526644816031_3731060_107-251434-0 from training. Duration: 0.99 2023-10-02 16:59:51,976 WARNING [train.py:1204] (2/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_412-54488-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 16:59:53,336 WARNING [train.py:1204] (2/4) Exclude cut with ID _18516_210_9206_1_1531704653206_3692406_77-258490-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 16:59:54,761 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_40047_210_2290_1_1533290519_2374311_195-165693-0_sp0.9 from training. Duration: 0.988875 2023-10-02 16:59:56,239 WARNING [train.py:1204] (2/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_296-36971-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 16:59:56,290 WARNING [train.py:1204] (2/4) Exclude cut with ID _14970_210_15709_1_1530773543493_1715730_187-20259-0_sp1.1 from training. Duration: 0.8090625 2023-10-02 16:59:59,415 INFO [train.py:1046] (2/4) Epoch 27, batch 4800, loss[loss=0.1584, simple_loss=0.2379, pruned_loss=0.0395, over 24663.00 frames. ], tot_loss[loss=0.1674, simple_loss=0.245, pruned_loss=0.04491, over 4728304.41 frames. ], batch size: 65, lr: 3.76e-03, grad_scale: 16.0 2023-10-02 16:59:59,526 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_53373_210_5399_1_1534229471_4715695_135-305927-0_sp1.1 from training. Duration: 0.936375 2023-10-02 17:00:00,793 WARNING [train.py:1204] (2/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_60-18123-0 from training. Duration: 0.88 2023-10-02 17:00:00,877 WARNING [train.py:1204] (2/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_166-166990-0 from training. Duration: 0.96 2023-10-02 17:00:02,457 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.conv_module1.balancer2.prob, batch_count=952766.6666666666, ans=0.125 2023-10-02 17:00:03,441 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_5438_210_1794_1_1527336687_2600009_233-58072-0 from training. Duration: 0.77 2023-10-02 17:00:04,884 WARNING [train.py:1204] (2/4) Exclude cut with ID _8833_210_5219_1_1528592665210_7553059_611-80313-0_sp0.9 from training. Duration: 0.711125 2023-10-02 17:00:04,915 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_53555_210_5632_1_1534595220_7232297_118-43239-0_sp1.1 from training. Duration: 0.936375 2023-10-02 17:00:05,844 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.0.layers.1.conv_module2.whiten, num_groups=1, num_channels=192, metric=9.10 vs. limit=15.0 2023-10-02 17:00:06,274 WARNING [train.py:1204] (2/4) Exclude cut with ID _12369_210_4381_1_1530356078581_4388520_480-13146-0 from training. Duration: 0.85 2023-10-02 17:00:06,427 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.1.ff2_skip_rate, batch_count=952766.6666666666, ans=0.0 2023-10-02 17:00:06,511 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.feed_forward2.hidden_balancer.prob, batch_count=952766.6666666666, ans=0.125 2023-10-02 17:00:10,484 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_892-28814-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 17:00:10,556 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_24899_210_16554_1_1532046422_3827462_160-21255-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 17:00:12,937 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.5.encoder.layers.1.feed_forward2.out_whiten, num_groups=1, num_channels=256, metric=8.92 vs. limit=15.0 2023-10-02 17:00:16,603 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_154-316450-0_sp1.1 from training. Duration: 0.6909375 2023-10-02 17:00:17,956 WARNING [train.py:1204] (2/4) Exclude cut with ID _30196_210_1664_1_1533708496522_7074140_1225-301232-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 17:00:17,979 WARNING [train.py:1204] (2/4) Exclude cut with ID _28783_210_7943_1_1532671164808_7155479_414-208100-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 17:00:18,026 WARNING [train.py:1204] (2/4) Exclude cut with ID _19023_210_4353_1_1531378892552_5820780_83-254794-0 from training. Duration: 0.86 2023-10-02 17:00:19,313 WARNING [train.py:1204] (2/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_591-83752-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 17:00:19,353 WARNING [train.py:1204] (2/4) Exclude cut with ID _29877_210_6228_1_1532397791106_5276149_266-62291-0_sp1.1 from training. Duration: 0.7909375 2023-10-02 17:00:22,652 WARNING [train.py:1204] (2/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_280-161633-0_sp1.1 from training. Duration: 0.663625 2023-10-02 17:00:22,929 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.1.feed_forward3.hidden_balancer.prob, batch_count=952833.3333333334, ans=0.125 2023-10-02 17:00:25,487 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7543_210_6753_1_1528539468_333764_39-156634-0_sp1.1 from training. Duration: 0.97275 2023-10-02 17:00:26,957 WARNING [train.py:1204] (2/4) Exclude cut with ID _67499_210_17282_1_1535509805220_3692430_505-312248-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 17:00:28,271 WARNING [train.py:1204] (2/4) Exclude cut with ID _5294_210_1747_1_1527249643928_3696179_180-100713-0_sp0.9 from training. Duration: 0.9 2023-10-02 17:00:28,354 WARNING [train.py:1204] (2/4) Exclude cut with ID _26528_210_9070_1_1532570410842_3851310_508-148132-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 17:00:28,582 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.bypass_mid.scale_min, batch_count=952900.0, ans=0.2 2023-10-02 17:00:30,096 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_211-339622-0_sp1.1 from training. Duration: 0.4090625 2023-10-02 17:00:30,116 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_40896_210_15710_1_1533433956_7100802_339-138356-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 17:00:32,075 WARNING [train.py:1204] (2/4) Exclude cut with ID _27060_210_5986_1_1532674838922_3522549_211-19933-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 17:00:33,525 WARNING [train.py:1204] (2/4) Exclude cut with ID _60229_210_27767_1_1534838406158_3655350_457-117923-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 17:00:33,764 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.balancer2.prob, batch_count=952900.0, ans=0.125 2023-10-02 17:00:36,458 WARNING [train.py:1204] (2/4) Exclude cut with ID _55429_210_10626_1_1534467588527_7174143_460-141439-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 17:00:36,572 WARNING [train.py:1204] (2/4) Exclude cut with ID _59170_210_6721_1_1534983633960_4203335_263-10780-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 17:00:36,583 WARNING [train.py:1204] (2/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_872-264525-0_sp0.9 from training. Duration: 0.72225 2023-10-02 17:00:37,909 WARNING [train.py:1204] (2/4) Exclude cut with ID _7624_210_1531_1_1528109802198_4933101_418-349834-0_sp0.9 from training. Duration: 0.6666875 2023-10-02 17:00:39,403 WARNING [train.py:1204] (2/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_27-53998-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 17:00:42,082 WARNING [train.py:1204] (2/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_202-183922-0 from training. Duration: 0.85 2023-10-02 17:00:42,100 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7002_210_1794_1_1527939683_5157286_1-281264-0 from training. Duration: 0.44 2023-10-02 17:00:43,983 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_53753_210_7234_1_1534320003_3594148_493-9314-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 17:00:43,998 WARNING [train.py:1204] (2/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_217-95056-0_sp1.1 from training. Duration: 0.8909375 2023-10-02 17:00:44,041 WARNING [train.py:1204] (2/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_406-45086-0_sp1.1 from training. Duration: 0.72725 2023-10-02 17:00:44,047 WARNING [train.py:1204] (2/4) Exclude cut with ID _31649_210_6228_1_1532584608063_4091078_505-213821-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 17:00:44,056 WARNING [train.py:1204] (2/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_317-175062-0_sp0.9 from training. Duration: 0.911125 2023-10-02 17:00:46,686 WARNING [train.py:1204] (2/4) Exclude cut with ID _5444_210_2151_1_1527418808464_5312210_726-27165-0_sp1.1 from training. Duration: 0.6090625 2023-10-02 17:00:46,727 WARNING [train.py:1204] (2/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_494-170437-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 17:00:48,490 INFO [scaling.py:1118] (2/4) WithLoss: name=encoder.encoders.5.encoder.layers.1.self_attn_weights, loss-sum=0.000e+00 2023-10-02 17:00:51,344 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_474-64054-0_sp1.1 from training. Duration: 0.936375 2023-10-02 17:00:51,585 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.0.nonlin_attention.balancer.prob, batch_count=952966.6666666666, ans=0.125 2023-10-02 17:00:52,723 WARNING [train.py:1204] (2/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_521-7556-0_sp1.1 from training. Duration: 0.97275 2023-10-02 17:00:55,367 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_269-348505-0_sp1.1 from training. Duration: 0.92725 2023-10-02 17:00:57,776 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.1.encoder.layers.1.nonlin_attention.whiten2, num_groups=1, num_channels=256, metric=7.52 vs. limit=15.0 2023-10-02 17:00:58,229 WARNING [train.py:1204] (2/4) Exclude cut with ID _21878_210_3525_1_1531738772002_7264330_559-184862-0 from training. Duration: 0.94 2023-10-02 17:00:58,256 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_68702_210_23689_1_1535799577_3420068_92-76040-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 17:00:58,301 WARNING [train.py:1204] (2/4) Exclude cut with ID _22826_210_9994_1_1531908201522_3279169_227-102052-0_sp1.1 from training. Duration: 0.97275 2023-10-02 17:01:00,169 WARNING [train.py:1204] (2/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_718-250907-0_sp0.9 from training. Duration: 0.7666875 2023-10-02 17:01:00,234 WARNING [train.py:1204] (2/4) Exclude cut with ID _14069_210_5632_1_1530512692033_4803390_352-143891-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 17:01:00,460 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.ff3_skip_rate, batch_count=953033.3333333334, ans=0.0 2023-10-02 17:01:02,771 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.521e+02 1.815e+02 2.074e+02 2.431e+02 3.640e+02, threshold=4.148e+02, percent-clipped=0.0 2023-10-02 17:01:04,147 WARNING [train.py:1204] (2/4) Exclude cut with ID _6267_210_2084_1_1527764658526_3894730_65-106602-0_sp1.1 from training. Duration: 0.936375 2023-10-02 17:01:04,220 WARNING [train.py:1204] (2/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_202-183922-0_sp0.9 from training. Duration: 0.9444375 2023-10-02 17:01:05,541 WARNING [train.py:1204] (2/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_465-268827-0_sp1.1 from training. Duration: 0.97275 2023-10-02 17:01:05,607 WARNING [train.py:1204] (2/4) Exclude cut with ID _12369_210_4381_1_1530356078581_4388520_58-29064-0_sp1.1 from training. Duration: 0.9 2023-10-02 17:01:06,885 WARNING [train.py:1204] (2/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_1-99680-0_sp0.9 from training. Duration: 0.7444375 2023-10-02 17:01:06,928 WARNING [train.py:1204] (2/4) Exclude cut with ID _13253_210_1925_1_1530757775819_3577873_191-144977-0_sp1.1 from training. Duration: 0.7090625 2023-10-02 17:01:10,968 WARNING [train.py:1204] (2/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_276-112684-0_sp1.1 from training. Duration: 0.92725 2023-10-02 17:01:12,196 INFO [train.py:1046] (2/4) Epoch 27, batch 4850, loss[loss=0.1567, simple_loss=0.2445, pruned_loss=0.03444, over 24644.00 frames. ], tot_loss[loss=0.1679, simple_loss=0.2454, pruned_loss=0.04516, over 4715432.73 frames. ], batch size: 68, lr: 3.76e-03, grad_scale: 16.0 2023-10-02 17:01:12,200 WARNING [train.py:1204] (2/4) Exclude cut with ID _64639_210_5816_1_1535367619437_3528000_358-411-0_sp1.1 from training. Duration: 0.97275 2023-10-02 17:01:12,208 WARNING [train.py:1204] (2/4) Exclude cut with ID _31649_210_6228_1_1532584608063_4091078_106-21199-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 17:01:14,048 WARNING [train.py:1204] (2/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_901-79438-0 from training. Duration: 0.94 2023-10-02 17:01:15,455 WARNING [train.py:1204] (2/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_718-250907-0 from training. Duration: 0.69 2023-10-02 17:01:15,461 WARNING [train.py:1204] (2/4) Exclude cut with ID _5287_210_2084_1_1527418883040_3878670_363-194161-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 17:01:15,465 WARNING [train.py:1204] (2/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_586-321321-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 17:01:16,876 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_68900_210_2087_1_1535713157_3708529_164-292661-0_sp1.1 from training. Duration: 0.963625 2023-10-02 17:01:16,882 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_26433_210_12108_1_1532157158_4056480_114-144878-0_sp1.1 from training. Duration: 0.97275 2023-10-02 17:01:19,583 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_15517_210_6753_1_1530954008_3487336_96-341874-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 17:01:24,426 WARNING [train.py:1204] (2/4) Exclude cut with ID _14268_210_9206_1_1530622771285_4714099_637-86630-0 from training. Duration: 0.51 2023-10-02 17:01:27,059 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_28211_210_6388_1_1532861506_4523224_121-320092-0_sp1.1 from training. Duration: 0.92725 2023-10-02 17:01:32,482 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_141-89113-0_sp0.9 from training. Duration: 0.9555625 2023-10-02 17:01:32,570 WARNING [train.py:1204] (2/4) Exclude cut with ID _6789_210_1534_1_1527857937218_3118950_311-61262-0_sp1.1 from training. Duration: 0.5909375 2023-10-02 17:01:33,779 WARNING [train.py:1204] (2/4) Exclude cut with ID _58480_210_12392_1_1534811121111_3646160_389-200461-0_sp1.1 from training. Duration: 0.97275 2023-10-02 17:01:36,543 WARNING [train.py:1204] (2/4) Exclude cut with ID _41326_210_5854_1_1533778076509_3492080_28-156553-0_sp1.1 from training. Duration: 0.92725 2023-10-02 17:01:37,973 WARNING [train.py:1204] (2/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_577-287150-0_sp1.1 from training. Duration: 0.7454375 2023-10-02 17:01:38,065 WARNING [train.py:1204] (2/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_40-129756-0_sp1.1 from training. Duration: 0.736375 2023-10-02 17:01:39,255 WARNING [train.py:1204] (2/4) Exclude cut with ID _60113_210_16271_1_1535164212481_4386079_490-280290-0 from training. Duration: 0.98 2023-10-02 17:01:39,512 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.feed_forward1.out_proj.dropout_p, batch_count=953166.6666666666, ans=0.1 2023-10-02 17:01:43,417 WARNING [train.py:1204] (2/4) Exclude cut with ID _58059_210_5854_1_1534901471149_3164808_34-39905-0_sp1.1 from training. Duration: 0.963625 2023-10-02 17:01:45,355 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_791-197891-0_sp1.1 from training. Duration: 0.763625 2023-10-02 17:01:45,506 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.conv_module1.balancer1.prob, batch_count=953233.3333333334, ans=0.125 2023-10-02 17:01:45,938 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.4.encoder.layers.0.self_attn2.whiten, num_groups=1, num_channels=384, metric=14.64 vs. limit=22.5 2023-10-02 17:01:46,677 WARNING [train.py:1204] (2/4) Exclude cut with ID _6789_210_1534_1_1527857937218_3118950_262-48853-0_sp1.1 from training. Duration: 0.5090625 2023-10-02 17:01:46,736 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_2655_210_2087_1_1525688959_3897400_52-279899-0_sp0.9 from training. Duration: 0.7666875 2023-10-02 17:01:46,740 WARNING [train.py:1204] (2/4) Exclude cut with ID _14884_210_2252_1_1530707459689_5561250_114-201399-0 from training. Duration: 0.76 2023-10-02 17:01:48,773 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.3.encoder.layers.1.whiten, num_groups=1, num_channels=512, metric=5.44 vs. limit=12.0 2023-10-02 17:01:50,238 WARNING [train.py:1204] (2/4) Exclude cut with ID _19023_210_4353_1_1531378892552_5820780_83-254794-0_sp0.9 from training. Duration: 0.9555625 2023-10-02 17:01:50,261 WARNING [train.py:1204] (2/4) Exclude cut with ID _53489_210_6228_1_1534298357169_3835970_10-297919-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 17:01:52,858 WARNING [train.py:1204] (2/4) Exclude cut with ID _44585_210_18628_1_1533605350387_4518199_418-3055-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 17:01:52,878 WARNING [train.py:1204] (2/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_310-109461-0 from training. Duration: 0.8 2023-10-02 17:01:52,918 WARNING [train.py:1204] (2/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_235-262124-0 from training. Duration: 0.67 2023-10-02 17:01:54,242 WARNING [train.py:1204] (2/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_195-167011-0_sp1.1 from training. Duration: 0.6818125 2023-10-02 17:01:54,683 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.5.encoder.layers.1.whiten, num_groups=1, num_channels=256, metric=2.10 vs. limit=12.0 2023-10-02 17:02:02,119 WARNING [train.py:1204] (2/4) Exclude cut with ID _7852_210_5219_1_1528286471316_8139400_188-55162-0_sp1.1 from training. Duration: 0.936375 2023-10-02 17:02:03,429 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_35361_210_4238_1_1533262810_7793476_675-87012-0 from training. Duration: 0.89 2023-10-02 17:02:04,987 WARNING [train.py:1204] (2/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_465-81427-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 17:02:04,993 WARNING [train.py:1204] (2/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_597-154640-0_sp1.1 from training. Duration: 0.8181875 2023-10-02 17:02:06,397 WARNING [train.py:1204] (2/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_186-309161-0_sp0.9 from training. Duration: 0.6 2023-10-02 17:02:07,804 WARNING [train.py:1204] (2/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_546-119312-0 from training. Duration: 0.99 2023-10-02 17:02:07,808 WARNING [train.py:1204] (2/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_330-240060-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 17:02:07,878 WARNING [train.py:1204] (2/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_477-255782-0 from training. Duration: 0.82 2023-10-02 17:02:09,407 WARNING [train.py:1204] (2/4) Exclude cut with ID _14970_210_15709_1_1530773543493_1715730_150-248939-0_sp1.1 from training. Duration: 0.97275 2023-10-02 17:02:09,495 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_556-206615-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 17:02:09,550 WARNING [train.py:1204] (2/4) Exclude cut with ID _3465_210_3525_1_1526175667060_3574049_308-231735-0 from training. Duration: 0.73 2023-10-02 17:02:11,175 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.conv_skip_rate, batch_count=953366.6666666666, ans=0.0 2023-10-02 17:02:18,957 WARNING [train.py:1204] (2/4) Exclude cut with ID _27485_210_4916_1_1532739191677_3668269_66-236571-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 17:02:23,259 INFO [scaling.py:1118] (2/4) WithLoss: name=encoder.encoders.4.encoder.layers.2.self_attn_weights, loss-sum=0.000e+00 2023-10-02 17:02:24,273 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_2221_210_1988_1_1525597051_4935541_415-296882-0_sp0.9 from training. Duration: 0.8555625 2023-10-02 17:02:24,289 WARNING [train.py:1204] (2/4) Exclude cut with ID _64521_210_14699_1_1535243427259_3815328_231-78630-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 17:02:26,968 INFO [train.py:1046] (2/4) Epoch 27, batch 4900, loss[loss=0.1614, simple_loss=0.2261, pruned_loss=0.04829, over 23860.00 frames. ], tot_loss[loss=0.1672, simple_loss=0.2441, pruned_loss=0.04517, over 4704945.88 frames. ], batch size: 212, lr: 3.76e-03, grad_scale: 16.0 2023-10-02 17:02:29,033 WARNING [train.py:1204] (2/4) Exclude cut with ID _13942_210_9988_1_1530604906036_3733360_499-55676-0 from training. Duration: 0.62 2023-10-02 17:02:29,035 WARNING [train.py:1204] (2/4) Exclude cut with ID _49376_210_5986_1_1533985278732_3370740_301-154763-0_sp1.1 from training. Duration: 0.8909375 2023-10-02 17:02:33,962 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.bypass.skip_rate, batch_count=953433.3333333334, ans=0.07 2023-10-02 17:02:35,112 WARNING [train.py:1204] (2/4) Exclude cut with ID _27485_210_4916_1_1532739191677_3668269_255-216698-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 17:02:36,495 WARNING [train.py:1204] (2/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_137-334143-0_sp1.1 from training. Duration: 0.97275 2023-10-02 17:02:36,518 WARNING [train.py:1204] (2/4) Exclude cut with ID _6456_210_1534_1_1527681634212_3181049_215-309359-0_sp1.1 from training. Duration: 0.8 2023-10-02 17:02:38,056 WARNING [train.py:1204] (2/4) Exclude cut with ID _65640_210_6310_1_1535368201445_3173250_209-13008-0 from training. Duration: 0.99 2023-10-02 17:02:38,279 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.ff3_skip_rate, batch_count=953433.3333333334, ans=0.0 2023-10-02 17:02:43,536 WARNING [train.py:1204] (2/4) Exclude cut with ID _12369_210_4381_1_1530356078581_4388520_58-29064-0 from training. Duration: 0.99 2023-10-02 17:02:43,786 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=953500.0, ans=0.1 2023-10-02 17:02:46,936 WARNING [train.py:1204] (2/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_410-287857-0 from training. Duration: 0.93 2023-10-02 17:02:48,272 WARNING [train.py:1204] (2/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_153-153537-0 from training. Duration: 0.91 2023-10-02 17:02:48,311 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7544_210_6753_1_1528587722_3576686_172-171750-0_sp0.9 from training. Duration: 0.72225 2023-10-02 17:02:49,570 WARNING [train.py:1204] (2/4) Exclude cut with ID _64521_210_14699_1_1535243427259_3815328_464-183853-0_sp1.1 from training. Duration: 0.97275 2023-10-02 17:02:49,584 WARNING [train.py:1204] (2/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_385-210308-0_sp1.1 from training. Duration: 0.963625 2023-10-02 17:02:49,592 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_46013_210_7234_1_1533776362_3744032_354-261865-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 17:02:49,599 WARNING [train.py:1204] (2/4) Exclude cut with ID _3067_210_2667_1_1526212974640_4503168_250-285937-0_sp1.1 from training. Duration: 0.72725 2023-10-02 17:02:49,676 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_40036_210_13405_1_1533648647_1315388_48-146016-0 from training. Duration: 0.93 2023-10-02 17:02:54,299 WARNING [train.py:1204] (2/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_280-161633-0 from training. Duration: 0.73 2023-10-02 17:02:56,008 WARNING [train.py:1204] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_308-161067-0_sp0.9 from training. Duration: 0.7333125 2023-10-02 17:02:57,347 WARNING [train.py:1204] (2/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_68-212801-0_sp0.9 from training. Duration: 0.811125 2023-10-02 17:02:57,403 WARNING [train.py:1204] (2/4) Exclude cut with ID _6789_210_1534_1_1527857937218_3118950_311-61262-0_sp0.9 from training. Duration: 0.72225 2023-10-02 17:02:58,986 WARNING [train.py:1204] (2/4) Exclude cut with ID _14632_210_9988_1_1530691279392_3923200_102-204477-0_sp1.1 from training. Duration: 0.8818125 2023-10-02 17:03:00,296 WARNING [train.py:1204] (2/4) Exclude cut with ID _65242_210_18628_1_1535369393717_3823149_290-117517-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 17:03:00,403 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_69630_210_7234_1_1535788670_910989_35-337140-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 17:03:00,410 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_5618_210_1874_1_1527311008_5851_1-214501-0 from training. Duration: 0.689 2023-10-02 17:03:03,384 WARNING [train.py:1204] (2/4) Exclude cut with ID _8064_210_5399_1_1528372299291_4282973_168-50337-0_sp0.9 from training. Duration: 0.9333125 2023-10-02 17:03:04,702 WARNING [train.py:1204] (2/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_185-14438-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 17:03:04,721 WARNING [train.py:1204] (2/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_61-327062-0 from training. Duration: 0.81 2023-10-02 17:03:04,731 WARNING [train.py:1204] (2/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_98-741-0 from training. Duration: 0.8 2023-10-02 17:03:08,802 WARNING [train.py:1204] (2/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_312-204798-0 from training. Duration: 0.93 2023-10-02 17:03:10,199 WARNING [train.py:1204] (2/4) Exclude cut with ID _14998_210_5219_1_1530705582080_7474970_787-294416-0_sp0.9 from training. Duration: 0.911125 2023-10-02 17:03:12,872 WARNING [train.py:1204] (2/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_140-181176-0_sp1.1 from training. Duration: 0.836375 2023-10-02 17:03:12,901 WARNING [train.py:1204] (2/4) Exclude cut with ID _14687_210_5854_1_1530703838259_3664889_115-239046-0_sp1.1 from training. Duration: 0.6090625 2023-10-02 17:03:12,958 WARNING [train.py:1204] (2/4) Exclude cut with ID _56423_210_5854_1_1534896061049_3165969_370-230614-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 17:03:14,225 WARNING [train.py:1204] (2/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_406-275649-0_sp0.9 from training. Duration: 0.4666875 2023-10-02 17:03:14,236 WARNING [train.py:1204] (2/4) Exclude cut with ID _15150_210_8341_1_1530752411803_7256384_22-138274-0_sp1.1 from training. Duration: 0.87275 2023-10-02 17:03:14,276 WARNING [train.py:1204] (2/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_125-125818-0 from training. Duration: 0.96 2023-10-02 17:03:17,029 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_4399_210_1874_1_1526705485_2400757_220-47974-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 17:03:17,209 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.bypass.skip_rate, batch_count=953633.3333333334, ans=0.07 2023-10-02 17:03:18,462 WARNING [train.py:1204] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_308-161067-0_sp1.1 from training. Duration: 0.6 2023-10-02 17:03:21,531 WARNING [train.py:1204] (2/4) Exclude cut with ID _53489_210_6228_1_1534298357169_3835970_102-57645-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 17:03:24,721 WARNING [train.py:1204] (2/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_178-177582-0 from training. Duration: 0.74 2023-10-02 17:03:24,775 WARNING [train.py:1204] (2/4) Exclude cut with ID _14349_210_8341_1_1530615710818_3442470_170-336541-0_sp1.1 from training. Duration: 0.7818125 2023-10-02 17:03:25,670 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.attention_skip_rate, batch_count=953700.0, ans=0.0 2023-10-02 17:03:26,656 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_521-214276-0_sp0.9 from training. Duration: 0.488875 2023-10-02 17:03:26,705 WARNING [train.py:1204] (2/4) Exclude cut with ID _66070_210_7640_1_1535441393096_7805310_489-292631-0 from training. Duration: 0.79 2023-10-02 17:03:30,815 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.577e+02 1.829e+02 2.030e+02 2.368e+02 3.684e+02, threshold=4.060e+02, percent-clipped=0.0 2023-10-02 17:03:34,228 WARNING [train.py:1204] (2/4) Exclude cut with ID _14069_210_5632_1_1530512692033_4803390_438-340023-0_sp1.1 from training. Duration: 0.936375 2023-10-02 17:03:34,310 WARNING [train.py:1204] (2/4) Exclude cut with ID _7852_210_5219_1_1528286471316_8139400_242-105336-0_sp1.1 from training. Duration: 0.6545625 2023-10-02 17:03:35,652 WARNING [train.py:1204] (2/4) Exclude cut with ID _3867_210_1883_1_1526641200575_4475830_272-215563-0 from training. Duration: 0.57 2023-10-02 17:03:36,873 WARNING [train.py:1204] (2/4) Exclude cut with ID _14368_210_4220_1_1530532714870_3595529_143-117421-0_sp0.9 from training. Duration: 0.6444375 2023-10-02 17:03:36,879 WARNING [train.py:1204] (2/4) Exclude cut with ID _9235_210_5219_1_1528891154253_8046120_30-326681-0_sp1.1 from training. Duration: 0.7545625 2023-10-02 17:03:38,297 WARNING [train.py:1204] (2/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_538-218448-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 17:03:41,093 INFO [train.py:1046] (2/4) Epoch 27, batch 4950, loss[loss=0.1657, simple_loss=0.235, pruned_loss=0.04819, over 23667.00 frames. ], tot_loss[loss=0.1663, simple_loss=0.2436, pruned_loss=0.04449, over 4724100.59 frames. ], batch size: 232, lr: 3.76e-03, grad_scale: 16.0 2023-10-02 17:03:41,174 WARNING [train.py:1204] (2/4) Exclude cut with ID _33640_210_2655_1_1533117605479_4855469_60-192970-0_sp1.1 from training. Duration: 0.963625 2023-10-02 17:03:41,181 WARNING [train.py:1204] (2/4) Exclude cut with ID _65585_210_12210_1_1535450451584_3764098_112-294301-0_sp1.1 from training. Duration: 0.8 2023-10-02 17:03:41,210 WARNING [train.py:1204] (2/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_643-37412-0_sp1.1 from training. Duration: 0.936375 2023-10-02 17:03:42,570 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_9302_210_2550_1_1528889962_4655416_638-301620-0_sp1.1 from training. Duration: 0.42725 2023-10-02 17:03:42,679 WARNING [train.py:1204] (2/4) Exclude cut with ID _3867_210_1883_1_1526641200575_4475830_272-215563-0_sp1.1 from training. Duration: 0.5181875 2023-10-02 17:03:45,273 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_53679_210_4852_1_1534556726_4186141_358-154143-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 17:03:45,284 WARNING [train.py:1204] (2/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_841-341679-0_sp0.9 from training. Duration: 0.6444375 2023-10-02 17:03:48,083 WARNING [train.py:1204] (2/4) Exclude cut with ID _7160_210_1876_1_1527940923231_3703799_302-150728-0 from training. Duration: 0.78 2023-10-02 17:03:48,111 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_125-203105-0 from training. Duration: 0.8 2023-10-02 17:03:49,882 WARNING [train.py:1204] (2/4) Exclude cut with ID _5585_210_2667_1_1527418813324_8007340_89-332072-0_sp0.9 from training. Duration: 0.711125 2023-10-02 17:03:49,951 WARNING [train.py:1204] (2/4) Exclude cut with ID _3992_210_1747_1_1526644816031_3731060_36-147855-0 from training. Duration: 0.78 2023-10-02 17:03:49,976 WARNING [train.py:1204] (2/4) Exclude cut with ID _59256_210_5986_1_1535076085997_3589120_172-239045-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 17:03:49,985 WARNING [train.py:1204] (2/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_472-216663-0_sp1.1 from training. Duration: 0.836375 2023-10-02 17:03:50,040 WARNING [train.py:1204] (2/4) Exclude cut with ID _36512_210_6987_1_1533033143998_3126069_181-306500-0_sp1.1 from training. Duration: 0.863625 2023-10-02 17:03:50,056 WARNING [train.py:1204] (2/4) Exclude cut with ID _56524_210_9642_1_1534572011779_3619170_492-95008-0_sp1.1 from training. Duration: 0.97275 2023-10-02 17:03:53,348 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_33348_210_5632_1_1533086912_3324030_119-90326-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 17:03:54,015 WARNING [train.py:1204] (2/4) Exclude cut with ID _14368_210_4220_1_1530532714870_3595529_231-137892-0_sp1.1 from training. Duration: 0.9 2023-10-02 17:03:55,232 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8453_210_4211_1_1528502940_8562730_596-306820-0_sp1.1 from training. Duration: 0.8818125 2023-10-02 17:03:56,581 WARNING [train.py:1204] (2/4) Exclude cut with ID _27052_210_5986_1_1532329197187_3585009_50-242750-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 17:03:57,970 WARNING [train.py:1204] (2/4) Exclude cut with ID _13913_210_9994_1_1530579615187_3383897_358-98435-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 17:03:57,983 WARNING [train.py:1204] (2/4) Exclude cut with ID _27013_210_1389_1_1532437173240_3252649_399-229778-0_sp1.1 from training. Duration: 0.963625 2023-10-02 17:03:59,541 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.feed_forward1.out_proj.dropout_p, batch_count=953833.3333333334, ans=0.1 2023-10-02 17:04:00,732 WARNING [train.py:1204] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_185-174805-0_sp0.9 from training. Duration: 0.7555625 2023-10-02 17:04:06,537 WARNING [train.py:1204] (2/4) Exclude cut with ID _65585_210_12210_1_1535450451584_3764098_196-221014-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 17:04:07,943 WARNING [train.py:1204] (2/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_202-293523-0_sp1.1 from training. Duration: 0.6545625 2023-10-02 17:04:09,315 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8275_210_4198_1_1528455320_3892571_213-127634-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 17:04:10,756 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_40896_210_15710_1_1533433956_7100802_921-231678-0_sp1.1 from training. Duration: 0.97275 2023-10-02 17:04:10,975 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.conv_module1.balancer1.prob, batch_count=953900.0, ans=0.125 2023-10-02 17:04:12,095 WARNING [train.py:1204] (2/4) Exclude cut with ID _38020_210_5986_1_1533112252259_7006089_493-102745-0_sp1.1 from training. Duration: 0.8909375 2023-10-02 17:04:12,185 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6846_210_6753_1_1527850767_4295158_2-315073-0 from training. Duration: 0.88 2023-10-02 17:04:13,477 WARNING [train.py:1204] (2/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_249-281349-0 from training. Duration: 0.85 2023-10-02 17:04:16,284 WARNING [train.py:1204] (2/4) Exclude cut with ID _18513_210_9206_1_1531618215223_3862620_314-83429-0_sp1.1 from training. Duration: 0.97275 2023-10-02 17:04:17,850 WARNING [train.py:1204] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_215-202973-0_sp0.9 from training. Duration: 0.97775 2023-10-02 17:04:17,867 WARNING [train.py:1204] (2/4) Exclude cut with ID _7719_210_3525_1_1528456153163_4296960_88-250259-0_sp1.1 from training. Duration: 0.763625 2023-10-02 17:04:17,971 WARNING [train.py:1204] (2/4) Exclude cut with ID _7606_210_4915_1_1528894253277_5080690_301-10732-0_sp1.1 from training. Duration: 0.736375 2023-10-02 17:04:17,980 WARNING [train.py:1204] (2/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_818-11907-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 17:04:19,258 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_5959_210_1794_1_1527400400_3445726_412-44052-0_sp1.1 from training. Duration: 0.7 2023-10-02 17:04:23,169 WARNING [train.py:1204] (2/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_6-279195-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 17:04:24,540 WARNING [train.py:1204] (2/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_252-216674-0_sp0.9 from training. Duration: 0.6 2023-10-02 17:04:25,893 WARNING [train.py:1204] (2/4) Exclude cut with ID _14368_210_4220_1_1530532714870_3595529_144-3287-0_sp1.1 from training. Duration: 0.6090625 2023-10-02 17:04:28,446 WARNING [train.py:1204] (2/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_342-3879-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 17:04:28,479 WARNING [train.py:1204] (2/4) Exclude cut with ID _33640_210_2655_1_1533117605479_4855469_584-65630-0_sp1.1 from training. Duration: 0.97275 2023-10-02 17:04:29,823 WARNING [train.py:1204] (2/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_37-189262-0 from training. Duration: 0.98 2023-10-02 17:04:29,867 WARNING [train.py:1204] (2/4) Exclude cut with ID _9389_210_1664_1_1529217337301_5289269_379-128794-0_sp0.9 from training. Duration: 0.9444375 2023-10-02 17:04:31,773 WARNING [train.py:1204] (2/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_119-207162-0_sp1.1 from training. Duration: 0.6454375 2023-10-02 17:04:34,592 WARNING [train.py:1204] (2/4) Exclude cut with ID _2941_210_2577_1_1526199673150_2489759_200-333643-0_sp1.1 from training. Duration: 0.936375 2023-10-02 17:04:37,395 WARNING [train.py:1204] (2/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_479-125611-0_sp1.1 from training. Duration: 0.87275 2023-10-02 17:04:37,410 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_3475_210_2028_1_1526193578_3622569_64-297072-0_sp1.1 from training. Duration: 0.82725 2023-10-02 17:04:38,742 WARNING [train.py:1204] (2/4) Exclude cut with ID _53565_210_10635_1_1534337218335_3820259_139-346291-0_sp1.1 from training. Duration: 0.97275 2023-10-02 17:04:38,782 WARNING [train.py:1204] (2/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_588-125165-0_sp1.1 from training. Duration: 0.7545625 2023-10-02 17:04:38,838 WARNING [train.py:1204] (2/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_938-205135-0_sp1.1 from training. Duration: 0.7909375 2023-10-02 17:04:41,548 WARNING [train.py:1204] (2/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_216-48707-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 17:04:41,612 WARNING [train.py:1204] (2/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_477-255782-0_sp1.1 from training. Duration: 0.7454375 2023-10-02 17:04:41,640 WARNING [train.py:1204] (2/4) Exclude cut with ID _16909_210_5724_1_1531292079912_4118919_583-92842-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 17:04:43,140 WARNING [train.py:1204] (2/4) Exclude cut with ID _60860_210_8254_1_1534896015875_3557169_245-256666-0 from training. Duration: 0.97 2023-10-02 17:04:43,383 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=954033.3333333334, ans=0.1 2023-10-02 17:04:47,258 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_45399_210_4852_1_1533624725_7579737_349-116173-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 17:04:52,271 WARNING [train.py:1204] (2/4) Exclude cut with ID _8699_210_2069_1_1528630346831_3960409_155-39766-0 from training. Duration: 0.78 2023-10-02 17:04:52,287 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_86-16881-0_sp1.1 from training. Duration: 0.52725 2023-10-02 17:04:55,558 INFO [train.py:1046] (2/4) Epoch 27, batch 5000, loss[loss=0.1543, simple_loss=0.2352, pruned_loss=0.03664, over 24308.00 frames. ], tot_loss[loss=0.1658, simple_loss=0.2423, pruned_loss=0.04458, over 4694472.81 frames. ], batch size: 61, lr: 3.76e-03, grad_scale: 8.0 2023-10-02 17:04:56,183 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.5.encoder.layers.0.feed_forward1.out_whiten, num_groups=1, num_channels=256, metric=13.61 vs. limit=15.0 2023-10-02 17:04:58,321 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_14056_210_4163_1_1530604483_7572827_191-159384-0_sp1.1 from training. Duration: 0.97275 2023-10-02 17:04:58,329 WARNING [train.py:1204] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_307-158462-0_sp1.1 from training. Duration: 0.62725 2023-10-02 17:05:00,978 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7544_210_6753_1_1528591327_1471426_33-346031-0 from training. Duration: 0.61 2023-10-02 17:05:01,053 WARNING [train.py:1204] (2/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_112-149213-0 from training. Duration: 0.95 2023-10-02 17:05:04,321 WARNING [train.py:1204] (2/4) Exclude cut with ID _55844_210_2346_1_1534471212569_7625707_925-191342-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 17:05:05,825 WARNING [train.py:1204] (2/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_87-278686-0 from training. Duration: 0.77 2023-10-02 17:05:05,852 WARNING [train.py:1204] (2/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_415-78792-0_sp1.1 from training. Duration: 0.736375 2023-10-02 17:05:05,862 WARNING [train.py:1204] (2/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_107-332398-0_sp0.9 from training. Duration: 0.8666875 2023-10-02 17:05:07,254 WARNING [train.py:1204] (2/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_72-251255-0 from training. Duration: 0.94 2023-10-02 17:05:08,517 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_431-227273-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 17:05:09,844 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7544_210_6753_1_1528587000_691468_87-178599-0_sp0.9 from training. Duration: 0.9666875 2023-10-02 17:05:09,932 WARNING [train.py:1204] (2/4) Exclude cut with ID _13916_210_9994_1_1530788341126_3763149_60-191556-0 from training. Duration: 0.61 2023-10-02 17:05:09,934 WARNING [train.py:1204] (2/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_746-164782-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 17:05:11,272 WARNING [train.py:1204] (2/4) Exclude cut with ID _33640_210_2655_1_1533117605479_4855469_596-21066-0_sp1.1 from training. Duration: 0.92725 2023-10-02 17:05:12,658 WARNING [train.py:1204] (2/4) Exclude cut with ID _40402_210_18219_1_1533522324115_2953150_320-14078-0 from training. Duration: 0.95 2023-10-02 17:05:13,953 WARNING [train.py:1204] (2/4) Exclude cut with ID _29877_210_6228_1_1532397791106_5276149_266-62291-0 from training. Duration: 0.87 2023-10-02 17:05:14,031 WARNING [train.py:1204] (2/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_176-56120-0_sp0.9 from training. Duration: 0.92225 2023-10-02 17:05:15,361 WARNING [train.py:1204] (2/4) Exclude cut with ID _6275_210_2084_1_1528110062213_7669049_91-26919-0 from training. Duration: 0.97 2023-10-02 17:05:15,369 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_224-322394-0_sp0.9 from training. Duration: 0.7333125 2023-10-02 17:05:16,734 WARNING [train.py:1204] (2/4) Exclude cut with ID _44982_210_1511_1_1534312820108_3726029_586-297059-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 17:05:16,776 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_196-289637-0_sp1.1 from training. Duration: 0.6818125 2023-10-02 17:05:16,777 WARNING [train.py:1204] (2/4) Exclude cut with ID _18513_210_9206_1_1531618215223_3862620_402-5957-0 from training. Duration: 0.99 2023-10-02 17:05:16,783 WARNING [train.py:1204] (2/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_81-300212-0 from training. Duration: 0.6 2023-10-02 17:05:18,293 WARNING [train.py:1204] (2/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_166-128941-0 from training. Duration: 0.54 2023-10-02 17:05:18,309 WARNING [train.py:1204] (2/4) Exclude cut with ID _58059_210_5854_1_1534901471149_3164808_57-309660-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 17:05:19,615 WARNING [train.py:1204] (2/4) Exclude cut with ID _60621_210_17071_1_1535013053530_3671739_73-155814-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 17:05:21,585 WARNING [train.py:1204] (2/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_726-270780-0 from training. Duration: 0.86 2023-10-02 17:05:21,606 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8853_210_3273_1_1528633899_818968_51-187685-0_sp1.1 from training. Duration: 0.62725 2023-10-02 17:05:21,874 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.conv_skip_rate, batch_count=954166.6666666666, ans=0.0 2023-10-02 17:05:22,948 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_315-212794-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 17:05:23,103 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.0.conv_module1.balancer2.min_positive, batch_count=954233.3333333334, ans=0.05 2023-10-02 17:05:24,865 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_53373_210_5399_1_1534229471_4715695_314-122795-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 17:05:26,221 WARNING [train.py:1204] (2/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_187-221566-0_sp0.9 from training. Duration: 0.47775 2023-10-02 17:05:27,705 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_12868_210_6753_1_1530702479_3610564_133-564-0 from training. Duration: 0.79 2023-10-02 17:05:27,761 WARNING [train.py:1204] (2/4) Exclude cut with ID _55153_210_2253_1_1534384786677_3162096_255-228344-0_sp1.1 from training. Duration: 0.936375 2023-10-02 17:05:29,692 WARNING [train.py:1204] (2/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_383-165234-0_sp1.1 from training. Duration: 0.8545625 2023-10-02 17:05:33,790 WARNING [train.py:1204] (2/4) Exclude cut with ID IC0677W0056-334889 from training. Duration: 0.768 2023-10-02 17:05:37,195 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_14953_210_2151_1_1530701710_5616165_710-165400-0_sp0.9 from training. Duration: 0.9666875 2023-10-02 17:05:38,564 WARNING [train.py:1204] (2/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_355-236205-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 17:05:38,566 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_34450_210_9547_1_1533099443_4109050_503-246893-0_sp1.1 from training. Duration: 0.963625 2023-10-02 17:05:41,377 WARNING [train.py:1204] (2/4) Exclude cut with ID _43552_210_8913_1_1533726031059_3523429_25-120161-0 from training. Duration: 0.98 2023-10-02 17:05:41,407 WARNING [train.py:1204] (2/4) Exclude cut with ID _66112_210_10635_1_1535590236305_3801484_205-311930-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 17:05:42,687 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_48049_210_5632_1_1533864553_3519937_185-281218-0_sp1.1 from training. Duration: 0.92725 2023-10-02 17:05:42,717 WARNING [train.py:1204] (2/4) Exclude cut with ID _48782_210_12658_1_1534467701544_2300422_191-332415-0_sp1.1 from training. Duration: 0.97275 2023-10-02 17:05:44,085 WARNING [train.py:1204] (2/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_907-151398-0_sp1.1 from training. Duration: 0.336375 2023-10-02 17:05:44,137 WARNING [train.py:1204] (2/4) Exclude cut with ID _67499_210_17282_1_1535509805220_3692430_20-179346-0_sp1.1 from training. Duration: 0.8818125 2023-10-02 17:05:48,066 WARNING [train.py:1204] (2/4) Exclude cut with ID _14998_210_5219_1_1530705582080_7474970_441-91466-0_sp1.1 from training. Duration: 0.8818125 2023-10-02 17:05:49,369 WARNING [train.py:1204] (2/4) Exclude cut with ID _46681_210_9642_1_1533893005571_7603359_786-164007-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 17:05:51,669 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.conv_module2.balancer1.prob, batch_count=954300.0, ans=0.125 2023-10-02 17:05:52,056 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.3.encoder.layers.3.self_attn1.whiten, num_groups=1, num_channels=512, metric=11.88 vs. limit=22.5 2023-10-02 17:05:56,084 WARNING [train.py:1204] (2/4) Exclude cut with ID _14970_210_15709_1_1530773543493_1715730_187-20259-0 from training. Duration: 0.89 2023-10-02 17:05:59,917 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.605e+02 1.877e+02 2.124e+02 2.613e+02 3.895e+02, threshold=4.248e+02, percent-clipped=0.0 2023-10-02 17:06:00,030 WARNING [train.py:1204] (2/4) Exclude cut with ID _12734_210_8341_1_1530183614296_3891929_383-339075-0_sp1.1 from training. Duration: 0.963625 2023-10-02 17:06:07,757 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.conv_module2.balancer1.max_abs, batch_count=954433.3333333334, ans=10.0 2023-10-02 17:06:09,329 INFO [train.py:1046] (2/4) Epoch 27, batch 5050, loss[loss=0.1576, simple_loss=0.2278, pruned_loss=0.04373, over 23423.00 frames. ], tot_loss[loss=0.1655, simple_loss=0.2423, pruned_loss=0.04434, over 4695458.30 frames. ], batch size: 134, lr: 3.76e-03, grad_scale: 8.0 2023-10-02 17:06:09,430 WARNING [train.py:1204] (2/4) Exclude cut with ID _37086_210_9038_1_1532999540891_7021250_834-322225-0_sp1.1 from training. Duration: 0.92725 2023-10-02 17:06:09,662 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=954433.3333333334, ans=0.1 2023-10-02 17:06:10,784 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_10987_210_4163_1_1529760555_4123902_56-290559-0_sp1.1 from training. Duration: 0.963625 2023-10-02 17:06:10,791 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_92-246270-0_sp0.9 from training. Duration: 0.9444375 2023-10-02 17:06:10,810 WARNING [train.py:1204] (2/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_56-13561-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 17:06:10,833 WARNING [train.py:1204] (2/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_393-57888-0_sp1.1 from training. Duration: 0.5818125 2023-10-02 17:06:12,279 WARNING [train.py:1204] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_168-32906-0_sp1.1 from training. Duration: 0.62725 2023-10-02 17:06:12,328 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_51225_210_3601_1_1534400634_2327383_127-207553-0_sp1.1 from training. Duration: 0.963625 2023-10-02 17:06:16,438 WARNING [train.py:1204] (2/4) Exclude cut with ID _58583_210_5236_1_1534726835266_3574249_494-203521-0_sp1.1 from training. Duration: 0.963625 2023-10-02 17:06:16,464 WARNING [train.py:1204] (2/4) Exclude cut with ID _49376_210_5986_1_1533985278732_3370740_354-344695-0 from training. Duration: 0.99 2023-10-02 17:06:16,535 WARNING [train.py:1204] (2/4) Exclude cut with ID _8021_210_7994_1_1528376451358_3653069_179-45206-0_sp1.1 from training. Duration: 0.7818125 2023-10-02 17:06:19,238 WARNING [train.py:1204] (2/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_742-154017-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 17:06:20,560 WARNING [train.py:1204] (2/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_222-198127-0_sp1.1 from training. Duration: 0.763625 2023-10-02 17:06:22,498 WARNING [train.py:1204] (2/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_235-307337-0 from training. Duration: 0.94 2023-10-02 17:06:23,601 WARNING [train.py:1204] (2/4) Exclude cut with ID _8393_210_6702_1_1528505860249_3457328_453-176500-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 17:06:23,646 WARNING [train.py:1204] (2/4) Exclude cut with ID _53565_210_10635_1_1534337218335_3820259_102-179528-0_sp1.1 from training. Duration: 0.97275 2023-10-02 17:06:25,099 WARNING [train.py:1204] (2/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_43-206757-0_sp1.1 from training. Duration: 0.6818125 2023-10-02 17:06:26,428 WARNING [train.py:1204] (2/4) Exclude cut with ID _8394_210_6702_1_1528590504316_3540189_335-274780-0_sp1.1 from training. Duration: 0.7090625 2023-10-02 17:06:26,464 WARNING [train.py:1204] (2/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_128-296581-0_sp1.1 from training. Duration: 0.67275 2023-10-02 17:06:27,434 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.0.layers.1.conv_module1.whiten, num_groups=1, num_channels=192, metric=7.57 vs. limit=15.0 2023-10-02 17:06:35,719 WARNING [train.py:1204] (2/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_111-112362-0 from training. Duration: 0.91 2023-10-02 17:06:37,034 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_5522_210_3273_1_1527249606_3909096_366-172508-0_sp1.1 from training. Duration: 0.6 2023-10-02 17:06:37,099 WARNING [train.py:1204] (2/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_47-167373-0_sp1.1 from training. Duration: 0.836375 2023-10-02 17:06:37,939 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.0.layers.1.feed_forward2.out_whiten, num_groups=1, num_channels=192, metric=4.66 vs. limit=15.0 2023-10-02 17:06:38,401 WARNING [train.py:1204] (2/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_41-234353-0 from training. Duration: 0.68 2023-10-02 17:06:38,427 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6291_210_3273_1_1527683649_1078498_82-221196-0_sp0.9 from training. Duration: 0.8444375 2023-10-02 17:06:38,610 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.0.feed_forward1.out_proj.dropout_p, batch_count=954566.6666666666, ans=0.1 2023-10-02 17:06:40,369 WARNING [train.py:1204] (2/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_47-4814-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 17:06:40,395 WARNING [train.py:1204] (2/4) Exclude cut with ID _43541_210_10626_1_1533796233418_3635440_40-247993-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 17:06:41,731 WARNING [train.py:1204] (2/4) Exclude cut with ID _21989_210_5854_1_1531899214931_3271360_133-44759-0_sp0.9 from training. Duration: 0.8555625 2023-10-02 17:06:41,734 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_94-195801-0 from training. Duration: 0.94 2023-10-02 17:06:41,817 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_141-89113-0 from training. Duration: 0.86 2023-10-02 17:06:43,132 WARNING [train.py:1204] (2/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_683-284578-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 17:06:45,922 WARNING [train.py:1204] (2/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_6-214815-0_sp1.1 from training. Duration: 0.77275 2023-10-02 17:06:47,518 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=954566.6666666666, ans=0.1 2023-10-02 17:06:49,945 WARNING [train.py:1204] (2/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_556-212082-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 17:06:49,985 WARNING [train.py:1204] (2/4) Exclude cut with ID _44902_210_16187_1_1533888006062_3792078_149-67212-0 from training. Duration: 0.87 2023-10-02 17:06:51,390 WARNING [train.py:1204] (2/4) Exclude cut with ID _61148_210_6897_1_1535094022726_4217610_333-159440-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 17:06:54,541 WARNING [train.py:1204] (2/4) Exclude cut with ID _3120_210_3525_1_1526036143112_4072980_404-249994-0 from training. Duration: 0.99 2023-10-02 17:06:55,853 WARNING [train.py:1204] (2/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_234-260682-0_sp0.9 from training. Duration: 0.9333125 2023-10-02 17:06:55,873 WARNING [train.py:1204] (2/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_374-138971-0_sp1.1 from training. Duration: 0.8545625 2023-10-02 17:06:57,224 WARNING [train.py:1204] (2/4) Exclude cut with ID _53713_210_24116_1_1534314487762_7416751_743-302085-0_sp1.1 from training. Duration: 0.92725 2023-10-02 17:06:57,288 WARNING [train.py:1204] (2/4) Exclude cut with ID _18516_210_9206_1_1531704653206_3692406_198-334870-0_sp1.1 from training. Duration: 0.836375 2023-10-02 17:06:59,167 WARNING [train.py:1204] (2/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_395-73570-0_sp1.1 from training. Duration: 0.9 2023-10-02 17:07:00,877 WARNING [train.py:1204] (2/4) Exclude cut with ID _46683_210_9642_1_1533730913492_4591116_421-19547-0_sp1.1 from training. Duration: 0.936375 2023-10-02 17:07:02,289 WARNING [train.py:1204] (2/4) Exclude cut with ID _19896_210_1883_1_1531709022662_6649569_714-8668-0_sp1.1 from training. Duration: 0.963625 2023-10-02 17:07:03,461 WARNING [train.py:1204] (2/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_49-101072-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 17:07:03,476 WARNING [train.py:1204] (2/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_220-191080-0_sp1.1 from training. Duration: 0.8909375 2023-10-02 17:07:03,512 WARNING [train.py:1204] (2/4) Exclude cut with ID _3465_210_3525_1_1526175667060_3574049_62-335552-0 from training. Duration: 0.87 2023-10-02 17:07:05,396 WARNING [train.py:1204] (2/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_111-112362-0_sp1.1 from training. Duration: 0.82725 2023-10-02 17:07:05,508 WARNING [train.py:1204] (2/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_318-59097-0_sp0.9 from training. Duration: 0.8444375 2023-10-02 17:07:08,371 WARNING [train.py:1204] (2/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_98-255710-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 17:07:08,377 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9042W0003-515609 from training. Duration: 0.896 2023-10-02 17:07:08,385 WARNING [train.py:1204] (2/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_158-250116-0_sp0.9 from training. Duration: 0.688875 2023-10-02 17:07:11,073 WARNING [train.py:1204] (2/4) Exclude cut with ID _26750_210_5665_1_1532226678442_4063230_7-346885-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 17:07:11,120 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_37839_210_7234_1_1533542462_3689303_357-227078-0_sp1.1 from training. Duration: 0.963625 2023-10-02 17:07:11,155 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9052W0119-520904 from training. Duration: 0.896 2023-10-02 17:07:12,049 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.0.layers.1.nonlin_attention.whiten2, num_groups=1, num_channels=192, metric=5.62 vs. limit=15.0 2023-10-02 17:07:14,605 WARNING [train.py:1204] (2/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_249-281349-0_sp1.1 from training. Duration: 0.77275 2023-10-02 17:07:14,610 WARNING [train.py:1204] (2/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_147-293311-0 from training. Duration: 0.94 2023-10-02 17:07:14,611 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_158-21166-0_sp1.1 from training. Duration: 0.963625 2023-10-02 17:07:17,450 WARNING [train.py:1204] (2/4) Exclude cut with ID _14100_210_8341_1_1530529320613_3678069_207-332194-0_sp1.1 from training. Duration: 0.92725 2023-10-02 17:07:18,781 WARNING [train.py:1204] (2/4) Exclude cut with ID _9389_210_1664_1_1529217337301_5289269_324-211031-0_sp1.1 from training. Duration: 0.963625 2023-10-02 17:07:18,801 WARNING [train.py:1204] (2/4) Exclude cut with ID _14603_210_4220_1_1530615688441_3097319_133-158308-0 from training. Duration: 0.84 2023-10-02 17:07:20,273 WARNING [train.py:1204] (2/4) Exclude cut with ID _13252_210_1925_1_1530671372047_3336909_166-314607-0 from training. Duration: 0.54 2023-10-02 17:07:23,381 INFO [train.py:1046] (2/4) Epoch 27, batch 5100, loss[loss=0.1678, simple_loss=0.2445, pruned_loss=0.04556, over 23574.00 frames. ], tot_loss[loss=0.1668, simple_loss=0.2437, pruned_loss=0.04502, over 4688821.71 frames. ], batch size: 256, lr: 3.76e-03, grad_scale: 8.0 2023-10-02 17:07:23,534 WARNING [train.py:1204] (2/4) Exclude cut with ID _32704_210_8954_1_1533017488840_6747370_649-79538-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 17:07:23,541 WARNING [train.py:1204] (2/4) Exclude cut with ID _28394_210_8913_1_1532430049518_3650118_343-115667-0_sp1.1 from training. Duration: 0.97275 2023-10-02 17:07:23,580 WARNING [train.py:1204] (2/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_87-278686-0_sp0.9 from training. Duration: 0.8555625 2023-10-02 17:07:26,298 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9109W0003-549749 from training. Duration: 0.768 2023-10-02 17:07:29,560 WARNING [train.py:1204] (2/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_126-29104-0_sp1.1 from training. Duration: 0.77275 2023-10-02 17:07:32,318 WARNING [train.py:1204] (2/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_325-113798-0 from training. Duration: 0.85 2023-10-02 17:07:32,417 WARNING [train.py:1204] (2/4) Exclude cut with ID _6694_210_4599_1_1527852637490_3965529_116-109398-0 from training. Duration: 0.73 2023-10-02 17:07:33,587 WARNING [train.py:1204] (2/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_46-57119-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 17:07:34,983 WARNING [train.py:1204] (2/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_546-119312-0_sp1.1 from training. Duration: 0.9 2023-10-02 17:07:37,714 WARNING [train.py:1204] (2/4) Exclude cut with ID _60057_210_10640_1_1534903065793_3333150_468-306939-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 17:07:37,760 WARNING [train.py:1204] (2/4) Exclude cut with ID _15150_210_8341_1_1530752411803_7256384_22-138274-0 from training. Duration: 0.96 2023-10-02 17:07:39,493 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_289-180904-0 from training. Duration: 0.76 2023-10-02 17:07:41,254 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.conv_module2.balancer1.prob, batch_count=954833.3333333334, ans=0.125 2023-10-02 17:07:42,723 WARNING [train.py:1204] (2/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_64-133721-0_sp1.1 from training. Duration: 0.92725 2023-10-02 17:07:42,770 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_195-257512-0_sp1.1 from training. Duration: 0.6090625 2023-10-02 17:07:45,608 WARNING [train.py:1204] (2/4) Exclude cut with ID _66873_210_5517_1_1535454136506_4293370_704-101963-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 17:07:46,263 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.3.encoder.layers.0.self_attn2.whiten, num_groups=1, num_channels=512, metric=16.45 vs. limit=22.5 2023-10-02 17:07:47,074 WARNING [train.py:1204] (2/4) Exclude cut with ID _4366_210_1388_1_1526792053633_3613819_231-211402-0 from training. Duration: 0.94 2023-10-02 17:07:48,454 WARNING [train.py:1204] (2/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_679-190385-0_sp1.1 from training. Duration: 0.97275 2023-10-02 17:07:49,800 WARNING [train.py:1204] (2/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_298-81459-0_sp1.1 from training. Duration: 0.963625 2023-10-02 17:07:49,811 WARNING [train.py:1204] (2/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_658-282610-0_sp0.9 from training. Duration: 0.588875 2023-10-02 17:07:53,112 WARNING [train.py:1204] (2/4) Exclude cut with ID _57572_210_5517_1_1534921219624_3809484_196-283734-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 17:07:54,548 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_53071_210_16554_1_1534490405_2954066_294-198554-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 17:07:54,553 WARNING [train.py:1204] (2/4) Exclude cut with ID _4866_210_2577_1_1527404412284_4364659_352-157930-0 from training. Duration: 0.81 2023-10-02 17:07:56,028 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9231W0113-610139 from training. Duration: 0.896 2023-10-02 17:07:57,727 WARNING [train.py:1204] (2/4) Exclude cut with ID _24271_210_12658_1_1531987270022_7190519_638-171906-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 17:07:58,997 WARNING [train.py:1204] (2/4) Exclude cut with ID _14170_210_5219_1_1530525949868_4469650_272-249985-0 from training. Duration: 0.67 2023-10-02 17:07:59,007 WARNING [train.py:1204] (2/4) Exclude cut with ID _64086_210_7994_1_1535270417019_7435949_449-31567-0 from training. Duration: 0.97 2023-10-02 17:08:03,246 WARNING [train.py:1204] (2/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_109-282568-0_sp1.1 from training. Duration: 0.97275 2023-10-02 17:08:11,897 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_121-45934-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 17:08:13,478 WARNING [train.py:1204] (2/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_314-126819-0 from training. Duration: 0.5 2023-10-02 17:08:13,508 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9309W0192-640319 from training. Duration: 0.512 2023-10-02 17:08:15,287 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9310W0003-640649 from training. Duration: 0.896 2023-10-02 17:08:16,669 WARNING [train.py:1204] (2/4) Exclude cut with ID _7160_210_1876_1_1527940923231_3703799_120-150943-0 from training. Duration: 0.81 2023-10-02 17:08:16,671 WARNING [train.py:1204] (2/4) Exclude cut with ID _40380_210_9059_1_1533380594759_4187380_6-33508-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 17:08:18,187 WARNING [train.py:1204] (2/4) Exclude cut with ID _59202_210_26984_1_1534816810612_3549174_137-318710-0 from training. Duration: 0.89 2023-10-02 17:08:22,167 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_435-313305-0 from training. Duration: 0.71 2023-10-02 17:08:23,571 WARNING [train.py:1204] (2/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_91-212486-0_sp1.1 from training. Duration: 0.6181875 2023-10-02 17:08:25,433 WARNING [train.py:1204] (2/4) Exclude cut with ID _14603_210_4220_1_1530615688441_3097319_133-158308-0_sp1.1 from training. Duration: 0.763625 2023-10-02 17:08:28,127 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.605e+02 1.816e+02 1.951e+02 2.163e+02 3.190e+02, threshold=3.902e+02, percent-clipped=0.0 2023-10-02 17:08:28,270 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_99-251314-0 from training. Duration: 0.87 2023-10-02 17:08:31,433 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_365-217119-0_sp1.1 from training. Duration: 0.57275 2023-10-02 17:08:31,476 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_3475_210_2028_1_1526193578_3622569_377-166133-0 from training. Duration: 0.86 2023-10-02 17:08:36,952 INFO [train.py:1046] (2/4) Epoch 27, batch 5150, loss[loss=0.1555, simple_loss=0.2401, pruned_loss=0.03542, over 24470.00 frames. ], tot_loss[loss=0.1671, simple_loss=0.2445, pruned_loss=0.04481, over 4705493.95 frames. ], batch size: 66, lr: 3.76e-03, grad_scale: 8.0 2023-10-02 17:08:37,075 WARNING [train.py:1204] (2/4) Exclude cut with ID _13916_210_9994_1_1530788341126_3763149_116-21034-0_sp1.1 from training. Duration: 0.87275 2023-10-02 17:08:37,085 WARNING [train.py:1204] (2/4) Exclude cut with ID _64220_210_9527_1_1535250151772_4875259_142-216831-0_sp1.1 from training. Duration: 0.936375 2023-10-02 17:08:37,085 WARNING [train.py:1204] (2/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_490-207861-0_sp1.1 from training. Duration: 0.8909375 2023-10-02 17:08:38,568 WARNING [train.py:1204] (2/4) Exclude cut with ID _41314_210_12651_1_1533459476543_3766460_87-272068-0_sp0.9 from training. Duration: 0.92225 2023-10-02 17:08:38,597 WARNING [train.py:1204] (2/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_18-46756-0_sp1.1 from training. Duration: 0.7181875 2023-10-02 17:08:40,397 WARNING [train.py:1204] (2/4) Exclude cut with ID _39411_210_5986_1_1533538825030_3343649_98-311335-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 17:08:40,475 WARNING [train.py:1204] (2/4) Exclude cut with ID _21878_210_3525_1_1531738772002_7264330_113-18163-0 from training. Duration: 0.89 2023-10-02 17:08:40,477 WARNING [train.py:1204] (2/4) Exclude cut with ID _6653_210_2629_1_1527919172441_6732679_1103-56445-0 from training. Duration: 0.73 2023-10-02 17:08:41,779 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_3474_210_2028_1_1526188513_4825246_145-309131-0 from training. Duration: 0.74 2023-10-02 17:08:41,804 WARNING [train.py:1204] (2/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_120-211630-0_sp1.1 from training. Duration: 0.836375 2023-10-02 17:08:41,815 WARNING [train.py:1204] (2/4) Exclude cut with ID _8030_210_7497_1_1528444886781_3531089_32-31837-0 from training. Duration: 0.97 2023-10-02 17:08:43,248 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_19949_210_2161_1_1531706337_928079_8-160956-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 17:08:43,309 WARNING [train.py:1204] (2/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_321-215742-0_sp1.1 from training. Duration: 0.4545625 2023-10-02 17:08:44,730 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_583-124650-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 17:08:47,957 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_30434_210_13049_1_1532519384_981746_164-182755-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 17:08:50,846 WARNING [train.py:1204] (2/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_339-104791-0_sp1.1 from training. Duration: 0.4454375 2023-10-02 17:08:52,127 WARNING [train.py:1204] (2/4) Exclude cut with ID _15026_210_8750_1_1530777602316_3291850_211-195043-0 from training. Duration: 0.8 2023-10-02 17:08:52,371 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.conv_module1.balancer1.prob, batch_count=955166.6666666666, ans=0.125 2023-10-02 17:08:53,517 WARNING [train.py:1204] (2/4) Exclude cut with ID _29877_210_6228_1_1532397791106_5276149_487-182647-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 17:08:53,575 WARNING [train.py:1204] (2/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_127-566-0_sp0.9 from training. Duration: 0.9333125 2023-10-02 17:08:56,241 WARNING [train.py:1204] (2/4) Exclude cut with ID _8263_210_1664_1_1528439452891_970220_68-181805-0_sp0.9 from training. Duration: 0.711125 2023-10-02 17:08:56,243 WARNING [train.py:1204] (2/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_504-72513-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 17:08:56,251 WARNING [train.py:1204] (2/4) Exclude cut with ID _64639_210_5816_1_1535367619437_3528000_324-265233-0_sp1.1 from training. Duration: 0.963625 2023-10-02 17:08:56,939 WARNING [train.py:1204] (2/4) Exclude cut with ID _6456_210_1534_1_1527681634212_3181049_215-309359-0_sp0.9 from training. Duration: 0.97775 2023-10-02 17:08:56,950 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_115-233929-0_sp1.1 from training. Duration: 0.6545625 2023-10-02 17:08:58,295 WARNING [train.py:1204] (2/4) Exclude cut with ID _14087_210_2253_1_1530529224512_3014029_219-224445-0 from training. Duration: 0.84 2023-10-02 17:08:59,646 WARNING [train.py:1204] (2/4) Exclude cut with ID _14867_210_1968_1_1530684179890_3846969_413-29292-0_sp1.1 from training. Duration: 0.8181875 2023-10-02 17:09:01,402 WARNING [train.py:1204] (2/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_1012-279473-0_sp0.9 from training. Duration: 0.7444375 2023-10-02 17:09:01,533 WARNING [train.py:1204] (2/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_605-133395-0_sp0.9 from training. Duration: 0.7333125 2023-10-02 17:09:02,854 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_89-35748-0 from training. Duration: 0.79 2023-10-02 17:09:03,054 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.1.balancer1.prob, batch_count=955166.6666666666, ans=0.125 2023-10-02 17:09:04,140 WARNING [train.py:1204] (2/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_476-272757-0_sp1.1 from training. Duration: 0.8454375 2023-10-02 17:09:09,582 WARNING [train.py:1204] (2/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_449-15508-0_sp0.9 from training. Duration: 0.911125 2023-10-02 17:09:11,519 WARNING [train.py:1204] (2/4) Exclude cut with ID _29345_210_4381_1_1532689168263_3860169_198-174328-0 from training. Duration: 0.91 2023-10-02 17:09:14,250 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_15517_210_6753_1_1530952293_1664795_125-158157-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 17:09:19,100 WARNING [train.py:1204] (2/4) Exclude cut with ID _25565_210_12392_1_1532423009382_3952098_420-113180-0_sp1.1 from training. Duration: 0.963625 2023-10-02 17:09:20,771 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.conv_module2.balancer2.prob, batch_count=955300.0, ans=0.125 2023-10-02 17:09:21,898 WARNING [train.py:1204] (2/4) Exclude cut with ID _51471_210_1889_1_1534399391390_3094250_236-146773-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 17:09:22,853 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.0.layers.0.feed_forward2.out_whiten, num_groups=1, num_channels=192, metric=6.42 vs. limit=15.0 2023-10-02 17:09:24,669 WARNING [train.py:1204] (2/4) Exclude cut with ID _30039_210_13270_1_1532484612900_2725150_175-35012-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 17:09:26,028 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_23850_210_4238_1_1532232597_1632433_99-275843-0_sp1.1 from training. Duration: 0.936375 2023-10-02 17:09:28,076 WARNING [train.py:1204] (2/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_658-282610-0 from training. Duration: 0.53 2023-10-02 17:09:28,317 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.feed_forward3.hidden_balancer.prob, batch_count=955300.0, ans=0.125 2023-10-02 17:09:32,715 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_37818_210_4238_1_1533620910_4720083_234-65934-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 17:09:32,790 WARNING [train.py:1204] (2/4) Exclude cut with ID _3067_210_2667_1_1526212974640_4503168_459-173867-0_sp1.1 from training. Duration: 0.8 2023-10-02 17:09:32,809 WARNING [train.py:1204] (2/4) Exclude cut with ID _8696_210_1876_1_1528545632440_3345058_209-54069-0_sp0.9 from training. Duration: 0.7444375 2023-10-02 17:09:36,854 WARNING [train.py:1204] (2/4) Exclude cut with ID _62574_210_2655_1_1535180407058_3933249_420-236465-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 17:09:38,161 WARNING [train.py:1204] (2/4) Exclude cut with ID _46681_210_9642_1_1533893005571_7603359_333-4042-0_sp1.1 from training. Duration: 0.936375 2023-10-02 17:09:38,235 WARNING [train.py:1204] (2/4) Exclude cut with ID _3541_210_2477_1_1526475408709_3263973_377-296839-0 from training. Duration: 0.55 2023-10-02 17:09:41,494 WARNING [train.py:1204] (2/4) Exclude cut with ID _43997_210_4525_1_1533538632555_5189329_426-92599-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 17:09:44,311 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_43-71989-0_sp1.1 from training. Duration: 0.5181875 2023-10-02 17:09:45,190 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.1.encoder.layers.1.self_attn_weights.whiten_keys, num_groups=4, num_channels=128, metric=2.71 vs. limit=6.0 2023-10-02 17:09:45,744 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_2009_210_2071_1_1525481134_4588937_140-345530-0_sp1.1 from training. Duration: 0.963625 2023-10-02 17:09:45,752 WARNING [train.py:1204] (2/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_138-332773-0_sp0.9 from training. Duration: 0.9 2023-10-02 17:09:47,600 WARNING [train.py:1204] (2/4) Exclude cut with ID _13942_210_9988_1_1530604906036_3733360_499-55676-0_sp0.9 from training. Duration: 0.688875 2023-10-02 17:09:47,623 WARNING [train.py:1204] (2/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_259-34038-0_sp1.1 from training. Duration: 0.67275 2023-10-02 17:09:47,639 WARNING [train.py:1204] (2/4) Exclude cut with ID _3120_210_3525_1_1526036143112_4072980_404-249994-0_sp1.1 from training. Duration: 0.9 2023-10-02 17:09:49,028 WARNING [train.py:1204] (2/4) Exclude cut with ID _40402_210_18219_1_1533522324115_2953150_222-108810-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 17:09:51,792 INFO [train.py:1046] (2/4) Epoch 27, batch 5200, loss[loss=0.1839, simple_loss=0.2638, pruned_loss=0.05198, over 24023.00 frames. ], tot_loss[loss=0.1672, simple_loss=0.245, pruned_loss=0.04472, over 4703305.24 frames. ], batch size: 86, lr: 3.76e-03, grad_scale: 8.0 2023-10-02 17:09:51,857 WARNING [train.py:1204] (2/4) Exclude cut with ID _22083_210_8254_1_1531702862439_3678970_49-234546-0_sp0.9 from training. Duration: 0.92225 2023-10-02 17:09:53,165 WARNING [train.py:1204] (2/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_199-295611-0_sp0.9 from training. Duration: 0.611125 2023-10-02 17:09:54,791 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.ff2_skip_rate, batch_count=955433.3333333334, ans=0.0 2023-10-02 17:09:55,995 WARNING [train.py:1204] (2/4) Exclude cut with ID _43552_210_8913_1_1533726031059_3523429_43-192945-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 17:10:02,391 WARNING [train.py:1204] (2/4) Exclude cut with ID _14224_210_4381_1_1530529062037_4264010_364-22817-0 from training. Duration: 0.71 2023-10-02 17:10:02,473 WARNING [train.py:1204] (2/4) Exclude cut with ID _14922_210_6228_1_1530707435273_3693310_388-129552-0_sp1.1 from training. Duration: 0.87275 2023-10-02 17:10:02,527 WARNING [train.py:1204] (2/4) Exclude cut with ID _6537_210_1883_1_1527987559710_3597129_190-280227-0_sp1.1 from training. Duration: 0.97275 2023-10-02 17:10:05,292 WARNING [train.py:1204] (2/4) Exclude cut with ID _55844_210_2346_1_1534471212569_7625707_398-56897-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 17:10:06,038 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.3.encoder.layers.0.self_attn_weights.whiten_keys, num_groups=8, num_channels=256, metric=5.09 vs. limit=6.0 2023-10-02 17:10:06,668 WARNING [train.py:1204] (2/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_48-182155-0_sp1.1 from training. Duration: 0.7909375 2023-10-02 17:10:06,692 WARNING [train.py:1204] (2/4) Exclude cut with ID _29529_210_7234_1_1532393961455_3977460_582-18084-0_sp1.1 from training. Duration: 0.97275 2023-10-02 17:10:08,063 WARNING [train.py:1204] (2/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_912-282412-0 from training. Duration: 0.49 2023-10-02 17:10:08,290 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.conv_module1.balancer2.prob, batch_count=955500.0, ans=0.125 2023-10-02 17:10:10,813 WARNING [train.py:1204] (2/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_332-256168-0_sp0.9 from training. Duration: 0.8333125 2023-10-02 17:10:10,867 WARNING [train.py:1204] (2/4) Exclude cut with ID _64220_210_9527_1_1535250151772_4875259_55-250624-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 17:10:11,061 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.conv_module2.balancer2.prob, batch_count=955500.0, ans=0.125 2023-10-02 17:10:14,007 WARNING [train.py:1204] (2/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_264-108685-0 from training. Duration: 0.55 2023-10-02 17:10:14,299 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=955500.0, ans=0.1 2023-10-02 17:10:16,670 WARNING [train.py:1204] (2/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_613-232856-0_sp0.9 from training. Duration: 0.6 2023-10-02 17:10:18,002 WARNING [train.py:1204] (2/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_68-212801-0_sp1.1 from training. Duration: 0.663625 2023-10-02 17:10:18,044 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6290_210_3273_1_1527595224_1282787_115-87095-0 from training. Duration: 0.55 2023-10-02 17:10:18,082 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_3080_210_2071_1_1526086102_4365166_312-8726-0 from training. Duration: 0.91 2023-10-02 17:10:21,543 WARNING [train.py:1204] (2/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_105-63309-0 from training. Duration: 0.98 2023-10-02 17:10:22,918 WARNING [train.py:1204] (2/4) Exclude cut with ID _2941_210_2577_1_1526199673150_2489759_201-160779-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 17:10:22,921 WARNING [train.py:1204] (2/4) Exclude cut with ID ID0406W0226-885434 from training. Duration: 0.997 2023-10-02 17:10:22,928 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_17292_210_6753_1_1531125388_2943734_353-328201-0_sp1.1 from training. Duration: 0.97275 2023-10-02 17:10:25,573 WARNING [train.py:1204] (2/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_572-96029-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 17:10:25,611 WARNING [train.py:1204] (2/4) Exclude cut with ID _14970_210_15709_1_1530775547361_1966965_6-23328-0_sp0.9 from training. Duration: 0.9555625 2023-10-02 17:10:26,928 WARNING [train.py:1204] (2/4) Exclude cut with ID _45012_210_12651_1_1533608435136_5371380_240-37101-0 from training. Duration: 0.93 2023-10-02 17:10:27,000 WARNING [train.py:1204] (2/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_685-90183-0_sp1.1 from training. Duration: 0.936375 2023-10-02 17:10:29,700 WARNING [train.py:1204] (2/4) Exclude cut with ID _13252_210_1925_1_1530671372047_3336909_41-22245-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 17:10:33,308 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_15126_210_6753_1_1530778677_2591185_120-277707-0 from training. Duration: 0.8 2023-10-02 17:10:34,612 WARNING [train.py:1204] (2/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_310-26186-0 from training. Duration: 0.7 2023-10-02 17:10:34,634 WARNING [train.py:1204] (2/4) Exclude cut with ID _14998_210_5219_1_1530705582080_7474970_787-294416-0 from training. Duration: 0.82 2023-10-02 17:10:40,204 WARNING [train.py:1204] (2/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_795-33252-0 from training. Duration: 0.37 2023-10-02 17:10:40,248 WARNING [train.py:1204] (2/4) Exclude cut with ID _7537_210_2084_1_1528502395431_3924359_113-199933-0_sp0.9 from training. Duration: 0.6555625 2023-10-02 17:10:43,072 WARNING [train.py:1204] (2/4) Exclude cut with ID _3465_210_3525_1_1526175667060_3574049_308-231735-0_sp0.9 from training. Duration: 0.811125 2023-10-02 17:10:43,094 WARNING [train.py:1204] (2/4) Exclude cut with ID _19504_210_9994_1_1531648863242_3325220_354-296765-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 17:10:44,513 WARNING [train.py:1204] (2/4) Exclude cut with ID _5409_210_1388_1_1527292850189_5865191_178-127970-0 from training. Duration: 0.57 2023-10-02 17:10:44,562 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_1466-248056-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 17:10:46,331 WARNING [train.py:1204] (2/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_535-14410-0_sp0.9 from training. Duration: 0.52225 2023-10-02 17:10:46,334 WARNING [train.py:1204] (2/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_747-129941-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 17:10:47,668 WARNING [train.py:1204] (2/4) Exclude cut with ID _15012_210_1968_1_1530856820429_5095350_688-178849-0_sp1.1 from training. Duration: 0.5818125 2023-10-02 17:10:50,753 WARNING [train.py:1204] (2/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_356-45054-0_sp1.1 from training. Duration: 0.7818125 2023-10-02 17:10:53,396 WARNING [train.py:1204] (2/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_146-276944-0_sp0.9 from training. Duration: 0.92225 2023-10-02 17:10:55,008 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.0.conv_module2.balancer1.prob, batch_count=955700.0, ans=0.125 2023-10-02 17:10:56,227 WARNING [train.py:1204] (2/4) Exclude cut with ID _39204_210_4220_1_1533880547368_3191120_346-180306-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 17:10:57,478 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.505e+02 1.823e+02 2.038e+02 2.325e+02 3.987e+02, threshold=4.077e+02, percent-clipped=1.0 2023-10-02 17:10:57,547 WARNING [train.py:1204] (2/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_641-158017-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 17:10:57,547 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_19952_210_2161_1_1531875469_3851750_274-42137-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 17:11:01,669 WARNING [train.py:1204] (2/4) Exclude cut with ID _60804_210_9642_1_1534831199859_7268299_87-327819-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 17:11:01,734 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_9302_210_2550_1_1528889962_4655416_638-301620-0 from training. Duration: 0.47 2023-10-02 17:11:04,028 WARNING [train.py:1204] (2/4) Exclude cut with ID _44304_210_15374_1_1533639352765_4613129_302-22084-0_sp1.1 from training. Duration: 0.7818125 2023-10-02 17:11:05,095 INFO [train.py:1046] (2/4) Epoch 27, batch 5250, loss[loss=0.1849, simple_loss=0.2682, pruned_loss=0.05083, over 23954.00 frames. ], tot_loss[loss=0.1665, simple_loss=0.2445, pruned_loss=0.04423, over 4719194.96 frames. ], batch size: 80, lr: 3.76e-03, grad_scale: 8.0 2023-10-02 17:11:05,143 WARNING [train.py:1204] (2/4) Exclude cut with ID _18513_210_9206_1_1531618215223_3862620_177-227063-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 17:11:05,236 WARNING [train.py:1204] (2/4) Exclude cut with ID _41326_210_5854_1_1533778076509_3492080_510-45444-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 17:11:06,453 WARNING [train.py:1204] (2/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_76-311963-0_sp1.1 from training. Duration: 0.636375 2023-10-02 17:11:06,555 WARNING [train.py:1204] (2/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_156-302675-0_sp0.9 from training. Duration: 0.611125 2023-10-02 17:11:09,381 WARNING [train.py:1204] (2/4) Exclude cut with ID _53489_210_6228_1_1534298357169_3835970_367-148411-0_sp1.1 from training. Duration: 0.936375 2023-10-02 17:11:12,246 WARNING [train.py:1204] (2/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_389-144525-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 17:11:13,554 WARNING [train.py:1204] (2/4) Exclude cut with ID _14069_210_5632_1_1530512692033_4803390_158-238427-0_sp1.1 from training. Duration: 0.963625 2023-10-02 17:11:13,639 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_486-151131-0_sp0.9 from training. Duration: 0.9444375 2023-10-02 17:11:18,370 WARNING [train.py:1204] (2/4) Exclude cut with ID _39846_210_12651_1_1533603651992_1318030_188-71831-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 17:11:20,356 WARNING [train.py:1204] (2/4) Exclude cut with ID _39411_210_5986_1_1533538825030_3343649_173-238248-0_sp1.1 from training. Duration: 0.7545625 2023-10-02 17:11:21,816 WARNING [train.py:1204] (2/4) Exclude cut with ID _20684_210_6917_1_1531567751646_4203930_469-49081-0_sp1.1 from training. Duration: 0.92725 2023-10-02 17:11:24,642 WARNING [train.py:1204] (2/4) Exclude cut with ID _8833_210_5219_1_1528592665210_7553059_611-80313-0_sp1.1 from training. Duration: 0.5818125 2023-10-02 17:11:24,916 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.bypass.scale_min, batch_count=955833.3333333334, ans=0.2 2023-10-02 17:11:26,001 WARNING [train.py:1204] (2/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_440-141811-0 from training. Duration: 0.95 2023-10-02 17:11:26,016 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_41211_210_2544_1_1533348859_3702910_236-223652-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 17:11:26,110 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_40222_210_6228_1_1533385298_4481939_220-163998-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 17:11:43,913 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.bypass.scale_min, batch_count=955900.0, ans=0.2 2023-10-02 17:11:43,923 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.conv_module1.balancer1.prob, batch_count=955900.0, ans=0.125 2023-10-02 17:12:12,899 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=956100.0, ans=0.1 2023-10-02 17:12:13,998 INFO [train.py:1046] (2/4) Epoch 27, batch 5300, loss[loss=0.1617, simple_loss=0.2421, pruned_loss=0.04067, over 24275.00 frames. ], tot_loss[loss=0.1663, simple_loss=0.244, pruned_loss=0.04427, over 4711489.98 frames. ], batch size: 61, lr: 3.76e-03, grad_scale: 8.0 2023-10-02 17:12:28,179 WARNING [train.py:1204] (2/4) Exclude cut with ID _14629_210_15709_1_1530604298182_956899_6-232695-0_sp1.1 from training. Duration: 0.8545625 2023-10-02 17:12:28,214 WARNING [train.py:1204] (2/4) Exclude cut with ID _13921_210_9994_1_1530841782272_3503520_327-69677-0 from training. Duration: 0.9 2023-10-02 17:12:28,239 WARNING [train.py:1204] (2/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_372-298093-0 from training. Duration: 0.8 2023-10-02 17:12:28,253 WARNING [train.py:1204] (2/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_515-139111-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 17:12:28,489 WARNING [train.py:1204] (2/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_85-65056-0_sp1.1 from training. Duration: 0.87275 2023-10-02 17:12:28,529 WARNING [train.py:1204] (2/4) Exclude cut with ID _15113_210_8254_1_1530842438579_3716279_209-54533-0_sp1.1 from training. Duration: 0.87275 2023-10-02 17:12:28,573 WARNING [train.py:1204] (2/4) Exclude cut with ID _70447_210_12392_1_1535972499300_3160420_123-312838-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 17:12:28,600 WARNING [train.py:1204] (2/4) Exclude cut with ID _60113_210_16271_1_1535164212481_4386079_70-63596-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 17:12:28,627 WARNING [train.py:1204] (2/4) Exclude cut with ID _14632_210_9988_1_1530691279392_3923200_225-188509-0_sp1.1 from training. Duration: 0.963625 2023-10-02 17:12:28,675 WARNING [train.py:1204] (2/4) Exclude cut with ID _34734_210_4525_1_1533365977542_4448720_584-180532-0_sp1.1 from training. Duration: 0.97275 2023-10-02 17:12:28,688 WARNING [train.py:1204] (2/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_351-294650-0_sp1.1 from training. Duration: 0.57275 2023-10-02 17:12:29,363 WARNING [train.py:1204] (2/4) Exclude cut with ID _13252_210_1925_1_1530671372047_3336909_263-168388-0_sp0.9 from training. Duration: 0.9555625 2023-10-02 17:12:29,391 WARNING [train.py:1204] (2/4) Exclude cut with ID _22083_210_8254_1_1531702862439_3678970_201-85738-0 from training. Duration: 0.9 2023-10-02 17:12:29,482 WARNING [train.py:1204] (2/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_237-58433-0 from training. Duration: 0.74 2023-10-02 17:12:29,509 WARNING [train.py:1204] (2/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_262-250826-0 from training. Duration: 0.99 2023-10-02 17:12:29,613 WARNING [train.py:1204] (2/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_927-43827-0_sp0.9 from training. Duration: 0.82225 2023-10-02 17:12:29,643 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_2821_210_2715_1_1526108385_3763560_73-296673-0 from training. Duration: 0.83 2023-10-02 17:12:29,720 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6287_210_3273_1_1527509506_2653716_182-199887-0 from training. Duration: 0.51 2023-10-02 17:12:29,856 WARNING [train.py:1204] (2/4) Exclude cut with ID _70519_210_4163_1_1535961595994_7458230_899-80223-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 17:12:30,123 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_210-234949-0_sp1.1 from training. Duration: 0.97275 2023-10-02 17:12:30,161 WARNING [train.py:1204] (2/4) Exclude cut with ID _30196_210_1664_1_1533708496522_7074140_768-278176-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 17:12:30,243 WARNING [train.py:1204] (2/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_315-128283-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 17:12:30,327 WARNING [train.py:1204] (2/4) Exclude cut with ID _14886_210_4220_1_1530702020355_3516190_330-323941-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 17:12:30,573 WARNING [train.py:1204] (2/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_264-348896-0_sp1.1 from training. Duration: 0.77275 2023-10-02 17:12:30,596 WARNING [train.py:1204] (2/4) Exclude cut with ID _37086_210_9038_1_1532999540891_7021250_433-10563-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 17:12:30,630 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_53638_210_21581_1_1534297319_5808449_885-80172-0_sp1.1 from training. Duration: 0.87275 2023-10-02 17:12:30,715 WARNING [train.py:1204] (2/4) Exclude cut with ID _14623_210_1925_1_1530773354962_3632609_157-218771-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 17:12:30,725 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_1368-78165-0_sp1.1 from training. Duration: 0.97275 2023-10-02 17:12:30,729 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_309-65177-0_sp0.9 from training. Duration: 0.97775 2023-10-02 17:12:30,734 WARNING [train.py:1204] (2/4) Exclude cut with ID _21632_210_2253_1_1531994001765_3744140_291-138812-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 17:12:30,746 WARNING [train.py:1204] (2/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_669-93163-0_sp1.1 from training. Duration: 0.8909375 2023-10-02 17:12:31,732 WARNING [train.py:1204] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_292-215346-0 from training. Duration: 0.97 2023-10-02 17:12:31,759 WARNING [train.py:1204] (2/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_286-222073-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 17:12:32,054 WARNING [train.py:1204] (2/4) Exclude cut with ID _59256_210_5986_1_1535076085997_3589120_121-237386-0_sp1.1 from training. Duration: 0.87275 2023-10-02 17:12:32,068 WARNING [train.py:1204] (2/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_96-320736-0 from training. Duration: 0.86 2023-10-02 17:12:32,077 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_128-202104-0 from training. Duration: 0.63 2023-10-02 17:12:32,173 WARNING [train.py:1204] (2/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_242-3472-0_sp0.9 from training. Duration: 0.811125 2023-10-02 17:12:32,232 WARNING [train.py:1204] (2/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_154-74182-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 17:12:32,236 WARNING [train.py:1204] (2/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_187-221566-0 from training. Duration: 0.43 2023-10-02 17:12:32,400 WARNING [train.py:1204] (2/4) Exclude cut with ID _6194_210_1534_1_1527511826160_3941194_14-282876-0 from training. Duration: 0.71 2023-10-02 17:12:32,440 WARNING [train.py:1204] (2/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_660-211088-0_sp1.1 from training. Duration: 0.563625 2023-10-02 17:12:32,792 WARNING [train.py:1204] (2/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_526-62079-0_sp1.1 from training. Duration: 0.6090625 2023-10-02 17:12:32,941 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_92-246270-0_sp1.1 from training. Duration: 0.77275 2023-10-02 17:12:33,027 WARNING [train.py:1204] (2/4) Exclude cut with ID IC0287W0023-142350 from training. Duration: 0.956 2023-10-02 17:12:33,101 WARNING [train.py:1204] (2/4) Exclude cut with ID _58605_210_12210_1_1534723195939_3904259_48-269402-0 from training. Duration: 0.96 2023-10-02 17:12:33,114 WARNING [train.py:1204] (2/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_416-257730-0_sp0.9 from training. Duration: 0.72225 2023-10-02 17:12:33,190 WARNING [train.py:1204] (2/4) Exclude cut with ID _7877_210_6014_1_1528286358099_3750136_185-17516-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 17:12:33,293 WARNING [train.py:1204] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_23-189234-0 from training. Duration: 0.83 2023-10-02 17:12:33,310 WARNING [train.py:1204] (2/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_237-89684-0 from training. Duration: 0.96 2023-10-02 17:12:33,380 WARNING [train.py:1204] (2/4) Exclude cut with ID _13916_210_9994_1_1530788341126_3763149_116-21034-0 from training. Duration: 0.96 2023-10-02 17:12:33,535 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_3133_210_2087_1_1526106446_2324690_246-125000-0_sp1.1 from training. Duration: 0.563625 2023-10-02 17:12:40,107 INFO [train.py:1046] (2/4) Epoch 28, batch 0, loss[loss=0.153, simple_loss=0.238, pruned_loss=0.03406, over 24424.00 frames. ], tot_loss[loss=0.153, simple_loss=0.238, pruned_loss=0.03406, over 24424.00 frames. ], batch size: 66, lr: 3.69e-03, grad_scale: 16.0 2023-10-02 17:12:40,107 INFO [train.py:1069] (2/4) Computing validation loss 2023-10-02 17:12:52,151 INFO [train.py:1078] (2/4) Epoch 28, validation: loss=0.3134, simple_loss=0.267, pruned_loss=0.1799, over 1125622.00 frames. 2023-10-02 17:12:52,152 INFO [train.py:1079] (2/4) Maximum memory allocated so far is 20967MB 2023-10-02 17:12:54,875 WARNING [train.py:1204] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_402-253027-0 from training. Duration: 0.54 2023-10-02 17:12:56,178 WARNING [train.py:1204] (2/4) Exclude cut with ID _59288_210_5854_1_1535077757774_3412246_158-289345-0_sp1.1 from training. Duration: 0.92725 2023-10-02 17:12:57,685 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_28211_210_6388_1_1532861506_4523224_128-328162-0_sp1.1 from training. Duration: 0.8181875 2023-10-02 17:13:04,402 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_788-278281-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 17:13:04,410 WARNING [train.py:1204] (2/4) Exclude cut with ID _49728_210_12651_1_1534041034294_6788150_624-195301-0_sp0.9 from training. Duration: 0.9444375 2023-10-02 17:13:04,449 WARNING [train.py:1204] (2/4) Exclude cut with ID _14860_210_4915_1_1530702020340_7343240_267-139775-0_sp1.1 from training. Duration: 0.963625 2023-10-02 17:13:05,548 WARNING [train.py:1204] (2/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_332-221057-0 from training. Duration: 0.68 2023-10-02 17:13:07,008 WARNING [train.py:1204] (2/4) Exclude cut with ID _40402_210_18219_1_1533522324115_2953150_242-76943-0 from training. Duration: 0.94 2023-10-02 17:13:08,430 WARNING [train.py:1204] (2/4) Exclude cut with ID _16747_210_5986_1_1531811544733_2749109_87-349587-0_sp1.1 from training. Duration: 0.963625 2023-10-02 17:13:09,804 WARNING [train.py:1204] (2/4) Exclude cut with ID _22982_210_1883_1_1531882790447_5449409_505-346452-0_sp1.1 from training. Duration: 0.963625 2023-10-02 17:13:12,604 WARNING [train.py:1204] (2/4) Exclude cut with ID _17956_210_4353_1_1531209568125_6979970_13-192107-0_sp1.1 from training. Duration: 0.963625 2023-10-02 17:13:12,631 WARNING [train.py:1204] (2/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_430-307666-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 17:13:13,901 WARNING [train.py:1204] (2/4) Exclude cut with ID _2895_210_2477_1_1525956785794_4063750_289-11475-0_sp1.1 from training. Duration: 0.7454375 2023-10-02 17:13:13,921 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_3910_210_1794_1_1526742306_334530_28-238909-0_sp0.9 from training. Duration: 0.9 2023-10-02 17:13:14,474 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.5.encoder.layers.1.conv_module1.whiten, num_groups=1, num_channels=256, metric=3.18 vs. limit=15.0 2023-10-02 17:13:15,375 WARNING [train.py:1204] (2/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_18-130336-0 from training. Duration: 0.79 2023-10-02 17:13:16,763 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_548-294316-0_sp0.9 from training. Duration: 0.9 2023-10-02 17:13:24,597 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_42-142473-0_sp1.1 from training. Duration: 0.6545625 2023-10-02 17:13:24,611 WARNING [train.py:1204] (2/4) Exclude cut with ID _59170_210_6721_1_1534983633960_4203335_326-48066-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 17:13:26,050 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_45886_210_17024_1_1533709215_471063_7-79165-0 from training. Duration: 0.98 2023-10-02 17:13:28,938 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.0.feed_forward1.out_proj.dropout_p, batch_count=956313.3333333334, ans=0.1 2023-10-02 17:13:30,719 WARNING [train.py:1204] (2/4) Exclude cut with ID _44585_210_18628_1_1533605350387_4518199_280-73534-0_sp0.9 from training. Duration: 0.988875 2023-10-02 17:13:30,725 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_3475_210_2028_1_1526193578_3622569_265-276625-0_sp1.1 from training. Duration: 0.6909375 2023-10-02 17:13:33,377 WARNING [train.py:1204] (2/4) Exclude cut with ID _64631_210_27767_1_1535349580068_4165160_88-225232-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 17:13:36,910 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_205-265518-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 17:13:39,768 WARNING [train.py:1204] (2/4) Exclude cut with ID _40402_210_18219_1_1533522324115_2953150_271-253734-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 17:13:40,010 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.conv_skip_rate, batch_count=956380.0, ans=0.0 2023-10-02 17:13:41,016 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.621e+02 1.877e+02 2.089e+02 2.395e+02 3.641e+02, threshold=4.178e+02, percent-clipped=0.0 2023-10-02 17:13:45,276 WARNING [train.py:1204] (2/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_282-257791-0 from training. Duration: 0.86 2023-10-02 17:13:46,747 WARNING [train.py:1204] (2/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_356-45054-0 from training. Duration: 0.86 2023-10-02 17:13:48,067 WARNING [train.py:1204] (2/4) Exclude cut with ID _69584_210_5219_1_1535785133775_4044450_38-348116-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 17:13:48,069 WARNING [train.py:1204] (2/4) Exclude cut with ID _66763_210_17301_1_1535788667617_241750_21-328480-0_sp1.1 from training. Duration: 0.936375 2023-10-02 17:13:49,494 WARNING [train.py:1204] (2/4) Exclude cut with ID _26528_210_9070_1_1532570410842_3851310_497-97218-0_sp0.9 from training. Duration: 0.9555625 2023-10-02 17:13:49,547 WARNING [train.py:1204] (2/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_592-101075-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 17:13:51,650 WARNING [train.py:1204] (2/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_678-159307-0 from training. Duration: 0.85 2023-10-02 17:13:54,812 WARNING [train.py:1204] (2/4) Exclude cut with ID _28427_210_4220_1_1532581240967_3061069_166-319773-0_sp1.1 from training. Duration: 0.936375 2023-10-02 17:13:56,171 WARNING [train.py:1204] (2/4) Exclude cut with ID _29877_210_6228_1_1532397791106_5276149_606-224984-0_sp1.1 from training. Duration: 0.936375 2023-10-02 17:14:00,589 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_125-203105-0_sp1.1 from training. Duration: 0.72725 2023-10-02 17:14:03,384 WARNING [train.py:1204] (2/4) Exclude cut with ID IC0620W0113-306765 from training. Duration: 0.768 2023-10-02 17:14:04,825 WARNING [train.py:1204] (2/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_410-287857-0_sp1.1 from training. Duration: 0.8454375 2023-10-02 17:14:06,216 INFO [train.py:1046] (2/4) Epoch 28, batch 50, loss[loss=0.163, simple_loss=0.2473, pruned_loss=0.03939, over 24337.00 frames. ], tot_loss[loss=0.1674, simple_loss=0.2475, pruned_loss=0.04361, over 1067454.44 frames. ], batch size: 61, lr: 3.69e-03, grad_scale: 16.0 2023-10-02 17:14:06,340 WARNING [train.py:1204] (2/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_78-95260-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 17:14:09,621 WARNING [train.py:1204] (2/4) Exclude cut with ID _58059_210_5854_1_1534901471149_3164808_6-73080-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 17:14:09,634 WARNING [train.py:1204] (2/4) Exclude cut with ID _5166_210_2084_1_1527073197160_4087993_331-117829-0 from training. Duration: 0.62 2023-10-02 17:14:09,694 WARNING [train.py:1204] (2/4) Exclude cut with ID _14880_210_8895_1_1530680426655_3795259_289-212817-0_sp0.9 from training. Duration: 0.7666875 2023-10-02 17:14:09,726 WARNING [train.py:1204] (2/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_620-144779-0_sp1.1 from training. Duration: 0.92725 2023-10-02 17:14:12,488 WARNING [train.py:1204] (2/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_561-288575-0_sp1.1 from training. Duration: 0.963625 2023-10-02 17:14:12,772 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.ff3_skip_rate, batch_count=956513.3333333334, ans=0.0 2023-10-02 17:14:13,807 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_22657_210_6890_1_1532086002_1637216_122-188128-0_sp1.1 from training. Duration: 0.963625 2023-10-02 17:14:16,363 WARNING [train.py:1204] (2/4) Exclude cut with ID _16756_210_4915_1_1531047803763_7104259_604-86607-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 17:14:19,178 WARNING [train.py:1204] (2/4) Exclude cut with ID _15150_210_8341_1_1530752411803_7256384_469-239509-0 from training. Duration: 0.68 2023-10-02 17:14:19,186 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_10987_210_4163_1_1529760555_4123902_302-87926-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 17:14:25,772 WARNING [train.py:1204] (2/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_469-94673-0_sp0.9 from training. Duration: 0.87775 2023-10-02 17:14:27,138 WARNING [train.py:1204] (2/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_798-106179-0 from training. Duration: 0.83 2023-10-02 17:14:30,388 WARNING [train.py:1204] (2/4) Exclude cut with ID _16744_210_5986_1_1531724405384_3907567_242-111316-0 from training. Duration: 0.93 2023-10-02 17:14:31,785 WARNING [train.py:1204] (2/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_938-205135-0_sp0.9 from training. Duration: 0.9666875 2023-10-02 17:14:33,207 WARNING [train.py:1204] (2/4) Exclude cut with ID _64135_210_2253_1_1535677267876_7608972_133-188266-0_sp0.9 from training. Duration: 0.9 2023-10-02 17:14:33,212 WARNING [train.py:1204] (2/4) Exclude cut with ID _44585_210_18628_1_1533605350387_4518199_623-346820-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 17:14:34,575 WARNING [train.py:1204] (2/4) Exclude cut with ID _42326_210_7770_1_1533814282531_3707269_454-266903-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 17:14:35,939 WARNING [train.py:1204] (2/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_5-55893-0_sp1.1 from training. Duration: 0.5 2023-10-02 17:14:37,287 WARNING [train.py:1204] (2/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_577-332600-0_sp1.1 from training. Duration: 0.5181875 2023-10-02 17:14:37,288 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_957-138882-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 17:14:40,313 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.nonlin_attention.balancer.prob, batch_count=956646.6666666666, ans=0.125 2023-10-02 17:14:46,265 WARNING [train.py:1204] (2/4) Exclude cut with ID _12556_210_8341_1_1530081024100_7327690_219-38004-0_sp1.1 from training. Duration: 0.97275 2023-10-02 17:14:46,350 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_397-147196-0_sp1.1 from training. Duration: 0.72725 2023-10-02 17:14:46,365 WARNING [train.py:1204] (2/4) Exclude cut with ID _3541_210_2477_1_1526475408709_3263973_231-104736-0_sp0.9 from training. Duration: 0.8666875 2023-10-02 17:14:46,430 WARNING [train.py:1204] (2/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_398-334558-0 from training. Duration: 0.96 2023-10-02 17:14:49,301 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_257-20733-0_sp1.1 from training. Duration: 0.6909375 2023-10-02 17:14:49,391 WARNING [train.py:1204] (2/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_146-282400-0_sp1.1 from training. Duration: 0.6454375 2023-10-02 17:14:49,395 WARNING [train.py:1204] (2/4) Exclude cut with ID _51899_210_20034_1_1534129254466_3669220_265-215428-0 from training. Duration: 0.98 2023-10-02 17:14:50,839 WARNING [train.py:1204] (2/4) Exclude cut with ID _34423_210_8750_1_1533430798456_3330199_219-106541-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 17:14:52,244 WARNING [train.py:1204] (2/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_458-67321-0 from training. Duration: 0.97 2023-10-02 17:14:59,933 WARNING [train.py:1204] (2/4) Exclude cut with ID _6653_210_2629_1_1527919172441_6732679_1051-182606-0_sp1.1 from training. Duration: 0.936375 2023-10-02 17:14:59,955 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_53555_210_5632_1_1534595220_7232297_603-4396-0_sp1.1 from training. Duration: 0.97275 2023-10-02 17:15:01,861 WARNING [train.py:1204] (2/4) Exclude cut with ID _54218_210_12210_1_1534413646388_4409259_159-144367-0_sp1.1 from training. Duration: 0.963625 2023-10-02 17:15:03,201 WARNING [train.py:1204] (2/4) Exclude cut with ID _7606_210_4915_1_1528894253277_5080690_301-10732-0_sp0.9 from training. Duration: 0.9 2023-10-02 17:15:03,204 WARNING [train.py:1204] (2/4) Exclude cut with ID _15026_210_8750_1_1530777602316_3291850_211-195043-0_sp0.9 from training. Duration: 0.888875 2023-10-02 17:15:04,681 WARNING [train.py:1204] (2/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_146-282400-0 from training. Duration: 0.71 2023-10-02 17:15:04,711 WARNING [train.py:1204] (2/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_65-88242-0 from training. Duration: 0.56 2023-10-02 17:15:06,111 WARNING [train.py:1204] (2/4) Exclude cut with ID _55857_210_2346_1_1534644007831_6898209_80-107757-0_sp1.1 from training. Duration: 0.963625 2023-10-02 17:15:07,475 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_15126_210_6753_1_1530778677_2591185_120-277707-0_sp0.9 from training. Duration: 0.888875 2023-10-02 17:15:07,775 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.feed_forward2.hidden_balancer.prob, batch_count=956780.0, ans=0.125 2023-10-02 17:15:08,879 WARNING [train.py:1204] (2/4) Exclude cut with ID _44778_210_2054_1_1533798127203_7443090_1033-168593-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 17:15:08,924 WARNING [train.py:1204] (2/4) Exclude cut with ID _46683_210_9642_1_1533730913492_4591116_475-30711-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 17:15:08,948 WARNING [train.py:1204] (2/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_253-308377-0 from training. Duration: 0.49 2023-10-02 17:15:09,013 WARNING [train.py:1204] (2/4) Exclude cut with ID _4680_210_3525_1_1527249757953_893386_75-78979-0 from training. Duration: 0.91 2023-10-02 17:15:10,419 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_5522_210_3273_1_1527249606_3909096_318-203153-0_sp0.9 from training. Duration: 0.47775 2023-10-02 17:15:12,230 WARNING [train.py:1204] (2/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_550-170116-0_sp1.1 from training. Duration: 0.92725 2023-10-02 17:15:12,280 WARNING [train.py:1204] (2/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_277-210798-0_sp0.9 from training. Duration: 0.611125 2023-10-02 17:15:13,592 WARNING [train.py:1204] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_620-276793-0 from training. Duration: 0.59 2023-10-02 17:15:13,612 WARNING [train.py:1204] (2/4) Exclude cut with ID _3067_210_2667_1_1526212974640_4503168_119-175776-0 from training. Duration: 0.74 2023-10-02 17:15:14,302 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.2.encoder.layers.2.self_attn_weights.whiten_keys, num_groups=4, num_channels=128, metric=3.25 vs. limit=6.0 2023-10-02 17:15:15,023 WARNING [train.py:1204] (2/4) Exclude cut with ID _55209_210_1531_1_1534675828996_4597160_681-195827-0_sp1.1 from training. Duration: 0.92725 2023-10-02 17:15:15,096 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_427-78097-0_sp1.1 from training. Duration: 0.72725 2023-10-02 17:15:17,779 WARNING [train.py:1204] (2/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_96-290039-0_sp0.9 from training. Duration: 0.62225 2023-10-02 17:15:17,797 WARNING [train.py:1204] (2/4) Exclude cut with ID _45329_210_7005_1_1534157951211_6774339_287-292174-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 17:15:20,552 INFO [train.py:1046] (2/4) Epoch 28, batch 100, loss[loss=0.1633, simple_loss=0.2495, pruned_loss=0.03859, over 24629.00 frames. ], tot_loss[loss=0.1671, simple_loss=0.2466, pruned_loss=0.04379, over 1883514.34 frames. ], batch size: 68, lr: 3.69e-03, grad_scale: 16.0 2023-10-02 17:15:20,657 WARNING [train.py:1204] (2/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_200-292228-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 17:15:22,148 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.conv_module1.balancer2.prob, batch_count=956846.6666666666, ans=0.125 2023-10-02 17:15:22,207 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.bypass_mid.scale_min, batch_count=956846.6666666666, ans=0.2 2023-10-02 17:15:23,549 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.balancer2.prob, batch_count=956846.6666666666, ans=0.125 2023-10-02 17:15:23,644 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.attention_skip_rate, batch_count=956846.6666666666, ans=0.0 2023-10-02 17:15:24,689 WARNING [train.py:1204] (2/4) Exclude cut with ID _69884_210_9771_1_1535846392818_4166169_291-194999-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 17:15:26,179 WARNING [train.py:1204] (2/4) Exclude cut with ID _58895_210_8750_1_1535184081320_3649089_265-323810-0_sp1.1 from training. Duration: 0.87275 2023-10-02 17:15:30,028 WARNING [train.py:1204] (2/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_58-68956-0 from training. Duration: 0.83 2023-10-02 17:15:30,034 WARNING [train.py:1204] (2/4) Exclude cut with ID _39867_210_9826_1_1533553222030_9344259_322-115153-0_sp1.1 from training. Duration: 0.963625 2023-10-02 17:15:30,567 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.5.encoder.layers.0.feed_forward1.out_whiten, num_groups=1, num_channels=256, metric=6.41 vs. limit=15.0 2023-10-02 17:15:33,357 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_3371_210_1794_1_1526384462_5580777_396-339812-0_sp1.1 from training. Duration: 0.77275 2023-10-02 17:15:34,620 WARNING [train.py:1204] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_731-2428-0_sp0.9 from training. Duration: 0.92225 2023-10-02 17:15:34,647 WARNING [train.py:1204] (2/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_98-741-0_sp1.1 from training. Duration: 0.72725 2023-10-02 17:15:34,649 WARNING [train.py:1204] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_238-245655-0_sp0.9 from training. Duration: 0.9 2023-10-02 17:15:34,677 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6788_210_1388_1_1527898698_4562021_91-197981-0_sp0.9 from training. Duration: 0.92225 2023-10-02 17:15:36,076 WARNING [train.py:1204] (2/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_860-96198-0 from training. Duration: 0.73 2023-10-02 17:15:36,203 WARNING [train.py:1204] (2/4) Exclude cut with ID _14268_210_9206_1_1530622771285_4714099_331-105351-0_sp0.9 from training. Duration: 0.72225 2023-10-02 17:15:37,563 WARNING [train.py:1204] (2/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_204-137133-0_sp1.1 from training. Duration: 0.92725 2023-10-02 17:15:37,589 WARNING [train.py:1204] (2/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_5-46496-0_sp1.1 from training. Duration: 0.936375 2023-10-02 17:15:37,598 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_160-218721-0_sp1.1 from training. Duration: 0.87275 2023-10-02 17:15:41,610 WARNING [train.py:1204] (2/4) Exclude cut with ID _45329_210_7005_1_1534157951211_6774339_503-287883-0 from training. Duration: 0.88 2023-10-02 17:15:42,924 WARNING [train.py:1204] (2/4) Exclude cut with ID _57572_210_5517_1_1534921219624_3809484_110-79134-0_sp1.1 from training. Duration: 0.92725 2023-10-02 17:15:44,686 WARNING [train.py:1204] (2/4) Exclude cut with ID _44585_210_18628_1_1533605350387_4518199_421-37641-0_sp1.1 from training. Duration: 0.936375 2023-10-02 17:15:45,970 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_42-142473-0_sp0.9 from training. Duration: 0.8 2023-10-02 17:15:47,422 WARNING [train.py:1204] (2/4) Exclude cut with ID _5166_210_2084_1_1527073197160_4087993_310-114452-0_sp0.9 from training. Duration: 0.6555625 2023-10-02 17:15:50,258 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9029W0120-509520 from training. Duration: 0.768 2023-10-02 17:15:50,280 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9029W0433-509820 from training. Duration: 0.896 2023-10-02 17:15:51,704 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_40222_210_6228_1_1533385298_4481939_163-334586-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 17:15:51,704 WARNING [train.py:1204] (2/4) Exclude cut with ID _48677_210_5663_1_1533902405919_3892989_408-40740-0_sp1.1 from training. Duration: 0.7545625 2023-10-02 17:15:55,766 WARNING [train.py:1204] (2/4) Exclude cut with ID _24314_210_4915_1_1532135078116_7170179_521-258868-0_sp0.9 from training. Duration: 0.87775 2023-10-02 17:15:57,175 WARNING [train.py:1204] (2/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_576-51198-0_sp1.1 from training. Duration: 0.92725 2023-10-02 17:15:59,111 WARNING [train.py:1204] (2/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_306-216871-0_sp1.1 from training. Duration: 0.97275 2023-10-02 17:16:01,188 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.feed_forward1.out_proj.dropout_p, batch_count=956980.0, ans=0.1 2023-10-02 17:16:04,259 WARNING [train.py:1204] (2/4) Exclude cut with ID _20684_210_6917_1_1531567751646_4203930_463-119982-0_sp1.1 from training. Duration: 0.97275 2023-10-02 17:16:05,617 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9084W0012-536850 from training. Duration: 0.64 2023-10-02 17:16:08,215 WARNING [train.py:1204] (2/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_847-41334-0_sp1.1 from training. Duration: 0.4 2023-10-02 17:16:10,931 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.532e+02 1.862e+02 2.053e+02 2.349e+02 3.571e+02, threshold=4.105e+02, percent-clipped=0.0 2023-10-02 17:16:12,383 WARNING [train.py:1204] (2/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_326-165900-0_sp1.1 from training. Duration: 0.62725 2023-10-02 17:16:12,450 WARNING [train.py:1204] (2/4) Exclude cut with ID _7537_210_2084_1_1528502395431_3924359_128-268029-0_sp1.1 from training. Duration: 0.8909375 2023-10-02 17:16:13,865 WARNING [train.py:1204] (2/4) Exclude cut with ID _67499_210_17282_1_1535509805220_3692430_333-155030-0_sp1.1 from training. Duration: 0.97275 2023-10-02 17:16:16,127 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.conv_skip_rate, batch_count=957046.6666666666, ans=0.0 2023-10-02 17:16:18,585 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_34008_210_10431_1_1533345424_8183247_588-257177-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 17:16:21,395 WARNING [train.py:1204] (2/4) Exclude cut with ID _25914_210_6400_1_1532141317058_644740_52-96808-0_sp1.1 from training. Duration: 0.82725 2023-10-02 17:16:22,754 WARNING [train.py:1204] (2/4) Exclude cut with ID _56494_210_4220_1_1535284837625_3033169_69-138150-0_sp1.1 from training. Duration: 0.8545625 2023-10-02 17:16:24,457 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.ff2_skip_rate, batch_count=957113.3333333334, ans=0.0 2023-10-02 17:16:25,621 WARNING [train.py:1204] (2/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_19-143553-0_sp1.1 from training. Duration: 0.97275 2023-10-02 17:16:25,674 WARNING [train.py:1204] (2/4) Exclude cut with ID _21878_210_3525_1_1531738772002_7264330_582-130735-0_sp1.1 from training. Duration: 0.936375 2023-10-02 17:16:27,139 WARNING [train.py:1204] (2/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_943-201356-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 17:16:27,148 WARNING [train.py:1204] (2/4) Exclude cut with ID _3231_210_2106_1_1526205864934_6784220_422-229276-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 17:16:28,551 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_48049_210_5632_1_1533864553_3519937_73-198717-0_sp1.1 from training. Duration: 0.97275 2023-10-02 17:16:29,972 WARNING [train.py:1204] (2/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_329-285801-0 from training. Duration: 0.91 2023-10-02 17:16:30,002 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9171W0114-579000 from training. Duration: 0.896 2023-10-02 17:16:30,020 WARNING [train.py:1204] (2/4) Exclude cut with ID _3120_210_3525_1_1526036143112_4072980_210-132675-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 17:16:31,311 WARNING [train.py:1204] (2/4) Exclude cut with ID _55857_210_2346_1_1534644007831_6898209_745-98391-0_sp0.9 from training. Duration: 0.8444375 2023-10-02 17:16:33,150 WARNING [train.py:1204] (2/4) Exclude cut with ID _5250_210_5219_1_1527249561806_8692110_615-322035-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 17:16:33,154 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_36756_210_7234_1_1533026991_966276_119-315767-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 17:16:33,169 WARNING [train.py:1204] (2/4) Exclude cut with ID _13916_210_9994_1_1530788341126_3763149_60-191556-0_sp1.1 from training. Duration: 0.5545625 2023-10-02 17:16:33,183 WARNING [train.py:1204] (2/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_177-236809-0_sp1.1 from training. Duration: 0.4454375 2023-10-02 17:16:33,225 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7002_210_1794_1_1527939683_5157286_184-229630-0_sp1.1 from training. Duration: 0.67275 2023-10-02 17:16:33,237 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_50428_210_5724_1_1534247532_4126418_464-18322-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 17:16:34,568 INFO [train.py:1046] (2/4) Epoch 28, batch 150, loss[loss=0.1677, simple_loss=0.2476, pruned_loss=0.04396, over 24691.00 frames. ], tot_loss[loss=0.1684, simple_loss=0.2476, pruned_loss=0.04466, over 2514221.64 frames. ], batch size: 65, lr: 3.69e-03, grad_scale: 8.0 2023-10-02 17:16:34,651 WARNING [train.py:1204] (2/4) Exclude cut with ID _35799_210_5854_1_1533263487887_3529071_466-131402-0_sp1.1 from training. Duration: 0.936375 2023-10-02 17:16:36,531 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_1259-132768-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 17:16:36,599 WARNING [train.py:1204] (2/4) Exclude cut with ID _6694_210_4599_1_1527852637490_3965529_144-185051-0_sp1.1 from training. Duration: 0.87275 2023-10-02 17:16:36,630 WARNING [train.py:1204] (2/4) Exclude cut with ID _64639_210_5816_1_1535367619437_3528000_59-133290-0_sp1.1 from training. Duration: 0.963625 2023-10-02 17:16:38,182 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.conv_module2.balancer1.prob, batch_count=957180.0, ans=0.125 2023-10-02 17:16:40,674 WARNING [train.py:1204] (2/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_223-49525-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 17:16:42,134 WARNING [train.py:1204] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_748-36150-0_sp1.1 from training. Duration: 0.82725 2023-10-02 17:16:42,135 WARNING [train.py:1204] (2/4) Exclude cut with ID _19896_210_1883_1_1531709022662_6649569_52-221627-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 17:16:43,517 WARNING [train.py:1204] (2/4) Exclude cut with ID _6789_210_1534_1_1527854471671_2870519_296-115766-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 17:16:44,985 WARNING [train.py:1204] (2/4) Exclude cut with ID _14624_210_1925_1_1530858690657_3642219_100-19871-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 17:16:45,030 WARNING [train.py:1204] (2/4) Exclude cut with ID _12555_210_8750_1_1530075594402_3780239_50-293675-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 17:16:45,613 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.3.encoder.layers.3.conv_module2.whiten, num_groups=1, num_channels=512, metric=4.98 vs. limit=15.0 2023-10-02 17:16:48,300 WARNING [train.py:1204] (2/4) Exclude cut with ID _14880_210_8895_1_1530680426655_3795259_289-212817-0_sp1.1 from training. Duration: 0.62725 2023-10-02 17:16:48,363 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_14056_210_4163_1_1530604483_7572827_388-211626-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 17:16:53,525 WARNING [train.py:1204] (2/4) Exclude cut with ID _5289_210_2084_1_1527678202736_3779299_392-261988-0 from training. Duration: 0.65 2023-10-02 17:16:53,533 WARNING [train.py:1204] (2/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_124-345840-0 from training. Duration: 0.52 2023-10-02 17:16:53,537 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_2655_210_2087_1_1525688959_3897400_437-337690-0 from training. Duration: 0.63 2023-10-02 17:16:56,249 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_3780_210_2087_1_1526467391_239068_13-274633-0_sp1.1 from training. Duration: 0.92725 2023-10-02 17:16:56,253 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_805-71497-0_sp1.1 from training. Duration: 0.6454375 2023-10-02 17:16:56,344 WARNING [train.py:1204] (2/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_434-83000-0_sp1.1 from training. Duration: 0.8909375 2023-10-02 17:16:57,706 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_17613_210_2151_1_1531137569_2366160_285-219531-0_sp1.1 from training. Duration: 0.97275 2023-10-02 17:16:57,711 WARNING [train.py:1204] (2/4) Exclude cut with ID _56108_210_16380_1_1534505446159_3887959_22-37858-0_sp1.1 from training. Duration: 0.936375 2023-10-02 17:16:59,034 WARNING [train.py:1204] (2/4) Exclude cut with ID _28427_210_4220_1_1532581240967_3061069_200-88444-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 17:16:59,109 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_43802_210_2290_1_1533516203_8829504_77-279042-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 17:17:00,488 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9309W0193-640320 from training. Duration: 0.896 2023-10-02 17:17:02,417 WARNING [train.py:1204] (2/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_516-324055-0_sp1.1 from training. Duration: 0.936375 2023-10-02 17:17:09,163 WARNING [train.py:1204] (2/4) Exclude cut with ID _40094_210_16380_1_1533294005773_3641159_10-261240-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 17:17:11,883 WARNING [train.py:1204] (2/4) Exclude cut with ID _6267_210_2084_1_1527764658526_3894730_375-219887-0_sp1.1 from training. Duration: 0.6090625 2023-10-02 17:17:11,974 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6788_210_1388_1_1527898698_4562021_180-75288-0 from training. Duration: 0.99 2023-10-02 17:17:12,165 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.conv_module2.balancer1.prob, batch_count=957313.3333333334, ans=0.125 2023-10-02 17:17:13,799 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.4.encoder.layers.2.nonlin_attention.whiten2, num_groups=1, num_channels=384, metric=3.56 vs. limit=15.0 2023-10-02 17:17:15,967 WARNING [train.py:1204] (2/4) Exclude cut with ID _8030_210_7497_1_1528444886781_3531089_262-86750-0_sp1.1 from training. Duration: 0.736375 2023-10-02 17:17:15,980 WARNING [train.py:1204] (2/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_934-325498-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 17:17:16,000 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_456-22141-0_sp0.9 from training. Duration: 0.97775 2023-10-02 17:17:17,346 WARNING [train.py:1204] (2/4) Exclude cut with ID _49643_210_7989_1_1534640244224_3939031_598-161926-0_sp1.1 from training. Duration: 0.8181875 2023-10-02 17:17:18,565 WARNING [train.py:1204] (2/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_345-49721-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 17:17:19,261 WARNING [train.py:1204] (2/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_440-141811-0_sp1.1 from training. Duration: 0.863625 2023-10-02 17:17:20,593 WARNING [train.py:1204] (2/4) Exclude cut with ID _64048_210_10626_1_1535198382802_3835179_394-212120-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 17:17:20,635 WARNING [train.py:1204] (2/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_91-277684-0 from training. Duration: 0.74 2023-10-02 17:17:25,903 WARNING [train.py:1204] (2/4) Exclude cut with ID _23038_210_9570_1_1532001584158_3652419_191-346343-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 17:17:27,262 WARNING [train.py:1204] (2/4) Exclude cut with ID _15140_210_9288_1_1530791388450_4880149_351-347974-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 17:17:27,304 WARNING [train.py:1204] (2/4) Exclude cut with ID _58605_210_12210_1_1534723195939_3904259_48-269402-0_sp1.1 from training. Duration: 0.87275 2023-10-02 17:17:27,311 WARNING [train.py:1204] (2/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_401-53423-0_sp0.9 from training. Duration: 0.611125 2023-10-02 17:17:30,543 WARNING [train.py:1204] (2/4) Exclude cut with ID _42659_210_2070_1_1533607442765_3120380_158-49004-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 17:17:30,816 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.balancer1.prob, batch_count=957380.0, ans=0.125 2023-10-02 17:17:31,925 WARNING [train.py:1204] (2/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_81-75849-0_sp1.1 from training. Duration: 0.3545625 2023-10-02 17:17:33,895 WARNING [train.py:1204] (2/4) Exclude cut with ID _27052_210_5986_1_1532329197187_3585009_314-106499-0_sp1.1 from training. Duration: 0.836375 2023-10-02 17:17:36,725 WARNING [train.py:1204] (2/4) Exclude cut with ID _4626_210_3532_1_1526817652717_3916859_446-206656-0_sp0.9 from training. Duration: 0.8444375 2023-10-02 17:17:39,344 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_53378_210_17282_1_1534221071_5769509_158-114960-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 17:17:42,072 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_3171_210_2087_1_1526109084_2330007_37-64334-0_sp0.9 from training. Duration: 0.911125 2023-10-02 17:17:42,096 WARNING [train.py:1204] (2/4) Exclude cut with ID _5410_210_1388_1_1527397055795_1719020_39-112221-0 from training. Duration: 0.69 2023-10-02 17:17:42,121 WARNING [train.py:1204] (2/4) Exclude cut with ID _8064_210_5399_1_1528372299291_4282973_4-91037-0_sp0.9 from training. Duration: 0.97775 2023-10-02 17:17:42,143 WARNING [train.py:1204] (2/4) Exclude cut with ID ID0079W0443-717795 from training. Duration: 0.541 2023-10-02 17:17:46,297 WARNING [train.py:1204] (2/4) Exclude cut with ID _34734_210_4525_1_1533365977542_4448720_766-297928-0_sp1.1 from training. Duration: 0.936375 2023-10-02 17:17:47,801 INFO [train.py:1046] (2/4) Epoch 28, batch 200, loss[loss=0.1778, simple_loss=0.256, pruned_loss=0.0498, over 23523.00 frames. ], tot_loss[loss=0.1689, simple_loss=0.2476, pruned_loss=0.0451, over 3010616.19 frames. ], batch size: 106, lr: 3.68e-03, grad_scale: 8.0 2023-10-02 17:17:47,901 WARNING [train.py:1204] (2/4) Exclude cut with ID _51471_210_1889_1_1534399391390_3094250_310-337793-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 17:17:47,920 WARNING [train.py:1204] (2/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_678-159307-0_sp0.9 from training. Duration: 0.9444375 2023-10-02 17:17:51,404 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.feed_forward2.hidden_balancer.prob, batch_count=957513.3333333334, ans=0.125 2023-10-02 17:17:52,383 WARNING [train.py:1204] (2/4) Exclude cut with ID _50093_210_4220_1_1534417141207_3432299_259-322252-0 from training. Duration: 0.92 2023-10-02 17:17:52,450 WARNING [train.py:1204] (2/4) Exclude cut with ID _47790_210_16187_1_1533862814066_4207519_67-103444-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 17:17:53,684 WARNING [train.py:1204] (2/4) Exclude cut with ID _40551_210_10635_1_1533776424632_3416179_311-247578-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 17:17:56,397 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_4633_210_2040_1_1526824163_4145343_416-76117-0 from training. Duration: 0.95 2023-10-02 17:17:57,763 WARNING [train.py:1204] (2/4) Exclude cut with ID _5166_210_2084_1_1527073197160_4087993_331-117829-0_sp0.9 from training. Duration: 0.688875 2023-10-02 17:17:59,181 WARNING [train.py:1204] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_299-247059-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 17:18:00,748 WARNING [train.py:1204] (2/4) Exclude cut with ID _64002_210_9570_1_1535198605157_38979_2-240845-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 17:18:02,270 WARNING [train.py:1204] (2/4) Exclude cut with ID _4681_210_3525_1_1527250689464_120889_15-126760-0_sp0.9 from training. Duration: 0.9 2023-10-02 17:18:02,282 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_21500_210_2054_1_1532149310_2716758_256-156611-0_sp1.1 from training. Duration: 0.936375 2023-10-02 17:18:02,296 WARNING [train.py:1204] (2/4) Exclude cut with ID _16908_210_5724_1_1531119157248_3772700_624-203729-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 17:18:11,082 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.nonlin_attention.balancer.prob, batch_count=957580.0, ans=0.125 2023-10-02 17:18:24,949 WARNING [train.py:1204] (2/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_605-219732-0_sp1.1 from training. Duration: 0.8545625 2023-10-02 17:18:24,983 WARNING [train.py:1204] (2/4) Exclude cut with ID _44718_210_5816_1_1534050051869_6469030_380-167520-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 17:18:26,491 WARNING [train.py:1204] (2/4) Exclude cut with ID _19896_210_1883_1_1531709022662_6649569_94-229927-0_sp1.1 from training. Duration: 0.8818125 2023-10-02 17:18:27,829 WARNING [train.py:1204] (2/4) Exclude cut with ID _19514_210_9259_1_1531530071330_5084230_519-174300-0_sp1.1 from training. Duration: 0.963625 2023-10-02 17:18:29,162 WARNING [train.py:1204] (2/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_340-174176-0_sp0.9 from training. Duration: 0.5444375 2023-10-02 17:18:29,165 WARNING [train.py:1204] (2/4) Exclude cut with ID _8263_210_1664_1_1528439452891_970220_68-181805-0_sp1.1 from training. Duration: 0.5818125 2023-10-02 17:18:31,907 WARNING [train.py:1204] (2/4) Exclude cut with ID _3120_210_3525_1_1526036143112_4072980_348-257723-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 17:18:32,544 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.4.encoder.layers.0.self_attn_weights.whiten_keys, num_groups=4, num_channels=128, metric=1.64 vs. limit=6.0 2023-10-02 17:18:33,327 WARNING [train.py:1204] (2/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_453-133983-0_sp1.1 from training. Duration: 0.7090625 2023-10-02 17:18:33,393 WARNING [train.py:1204] (2/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_17-86060-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 17:18:33,416 WARNING [train.py:1204] (2/4) Exclude cut with ID _29877_210_6228_1_1532397791106_5276149_228-184657-0_sp1.1 from training. Duration: 0.97275 2023-10-02 17:18:34,695 WARNING [train.py:1204] (2/4) Exclude cut with ID _18516_210_9206_1_1531704653206_3692406_198-334870-0 from training. Duration: 0.92 2023-10-02 17:18:36,056 WARNING [train.py:1204] (2/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_340-174176-0_sp1.1 from training. Duration: 0.4454375 2023-10-02 17:18:36,074 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_3133_210_2087_1_1526106446_2324690_14-139186-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 17:18:37,241 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.447e+02 1.956e+02 2.202e+02 2.604e+02 4.152e+02, threshold=4.404e+02, percent-clipped=1.0 2023-10-02 17:18:40,721 WARNING [train.py:1204] (2/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_126-29104-0_sp0.9 from training. Duration: 0.9444375 2023-10-02 17:18:44,210 WARNING [train.py:1204] (2/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_84-10310-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 17:18:52,414 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_15517_210_6753_1_1530954008_3487336_79-273076-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 17:18:52,454 WARNING [train.py:1204] (2/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_371-18169-0_sp0.9 from training. Duration: 0.9555625 2023-10-02 17:18:57,272 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_68702_210_23689_1_1535799577_3420068_170-92788-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 17:19:00,016 WARNING [train.py:1204] (2/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_78-182912-0 from training. Duration: 0.59 2023-10-02 17:19:00,074 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_3019_210_2028_1_1526044245_3264865_297-12384-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 17:19:00,079 WARNING [train.py:1204] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_147-111137-0_sp1.1 from training. Duration: 0.863625 2023-10-02 17:19:00,083 WARNING [train.py:1204] (2/4) Exclude cut with ID _28323_210_16187_1_1532480395667_4100300_117-8859-0_sp1.1 from training. Duration: 0.97275 2023-10-02 17:19:01,475 INFO [train.py:1046] (2/4) Epoch 28, batch 250, loss[loss=0.182, simple_loss=0.2668, pruned_loss=0.04867, over 23949.00 frames. ], tot_loss[loss=0.1693, simple_loss=0.2476, pruned_loss=0.04547, over 3384678.08 frames. ], batch size: 86, lr: 3.68e-03, grad_scale: 8.0 2023-10-02 17:19:01,575 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7762_210_1794_1_1528199508_4057408_419-125009-0_sp1.1 from training. Duration: 0.7545625 2023-10-02 17:19:02,932 WARNING [train.py:1204] (2/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_268-87909-0 from training. Duration: 0.99 2023-10-02 17:19:04,300 WARNING [train.py:1204] (2/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_118-311357-0_sp1.1 from training. Duration: 0.92725 2023-10-02 17:19:04,322 WARNING [train.py:1204] (2/4) Exclude cut with ID ID0377W0003-870225 from training. Duration: 0.852 2023-10-02 17:19:05,718 WARNING [train.py:1204] (2/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_56-5838-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 17:19:07,709 WARNING [train.py:1204] (2/4) Exclude cut with ID _14603_210_4220_1_1530615688441_3097319_90-31383-0_sp0.9 from training. Duration: 0.9666875 2023-10-02 17:19:07,796 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_31583_210_3852_1_1532595851_4024413_58-86657-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 17:19:09,552 WARNING [train.py:1204] (2/4) Exclude cut with ID _30304_210_6319_1_1532476827261_7582060_344-113875-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 17:19:11,005 WARNING [train.py:1204] (2/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_278-104676-0_sp1.1 from training. Duration: 0.8909375 2023-10-02 17:19:11,025 WARNING [train.py:1204] (2/4) Exclude cut with ID _55844_210_2346_1_1534471212569_7625707_764-2387-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 17:19:12,980 WARNING [train.py:1204] (2/4) Exclude cut with ID _27430_210_7770_1_1532604689375_1378590_157-14963-0_sp1.1 from training. Duration: 0.963625 2023-10-02 17:19:14,925 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.4.encoder.layers.1.nonlin_attention.whiten2, num_groups=1, num_channels=384, metric=3.08 vs. limit=15.0 2023-10-02 17:19:17,093 WARNING [train.py:1204] (2/4) Exclude cut with ID _7160_210_1876_1_1527940923231_3703799_282-322611-0_sp1.1 from training. Duration: 0.7909375 2023-10-02 17:19:18,847 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.conv_module1.balancer2.prob, batch_count=957913.3333333334, ans=0.125 2023-10-02 17:19:18,853 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.bypass.scale_min, batch_count=957913.3333333334, ans=0.2 2023-10-02 17:19:26,047 WARNING [train.py:1204] (2/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_190-138640-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 17:19:26,297 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=957913.3333333334, ans=0.1 2023-10-02 17:19:28,731 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_33348_210_5632_1_1533086912_3324030_63-270922-0_sp1.1 from training. Duration: 0.97275 2023-10-02 17:19:28,761 WARNING [train.py:1204] (2/4) Exclude cut with ID _54871_210_22356_1_1534665798253_3856359_135-184281-0_sp1.1 from training. Duration: 0.9 2023-10-02 17:19:38,482 WARNING [train.py:1204] (2/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_277-210798-0_sp1.1 from training. Duration: 0.5 2023-10-02 17:19:38,540 WARNING [train.py:1204] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_402-253027-0_sp0.9 from training. Duration: 0.6 2023-10-02 17:19:39,833 WARNING [train.py:1204] (2/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_540-275685-0_sp0.9 from training. Duration: 0.988875 2023-10-02 17:19:39,885 WARNING [train.py:1204] (2/4) Exclude cut with ID _59493_210_17282_1_1535078419077_3225879_118-18960-0_sp1.1 from training. Duration: 0.936375 2023-10-02 17:19:41,286 WARNING [train.py:1204] (2/4) Exclude cut with ID _6194_210_1534_1_1527511826160_3941194_321-148641-0_sp1.1 from training. Duration: 0.5818125 2023-10-02 17:19:41,293 WARNING [train.py:1204] (2/4) Exclude cut with ID _61133_210_9969_1_1535196777227_3696409_333-270325-0_sp1.1 from training. Duration: 0.8454375 2023-10-02 17:19:41,353 WARNING [train.py:1204] (2/4) Exclude cut with ID _49376_210_5986_1_1533985278732_3370740_307-145221-0_sp1.1 from training. Duration: 0.936375 2023-10-02 17:19:41,483 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.0.feed_forward1.hidden_balancer.prob, batch_count=957980.0, ans=0.125 2023-10-02 17:19:44,015 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6846_210_6753_1_1527850767_4295158_2-315073-0_sp0.9 from training. Duration: 0.97775 2023-10-02 17:19:46,010 WARNING [train.py:1204] (2/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_17-25045-0 from training. Duration: 0.59 2023-10-02 17:19:46,027 WARNING [train.py:1204] (2/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_503-333510-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 17:19:47,485 WARNING [train.py:1204] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_238-245655-0_sp1.1 from training. Duration: 0.736375 2023-10-02 17:19:48,718 WARNING [train.py:1204] (2/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_329-82235-0_sp0.9 from training. Duration: 0.911125 2023-10-02 17:19:48,723 WARNING [train.py:1204] (2/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_325-113798-0_sp0.9 from training. Duration: 0.9444375 2023-10-02 17:19:48,794 WARNING [train.py:1204] (2/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_395-37222-0_sp0.9 from training. Duration: 0.8444375 2023-10-02 17:19:50,144 WARNING [train.py:1204] (2/4) Exclude cut with ID _64135_210_2253_1_1535677267876_7608972_498-5039-0_sp1.1 from training. Duration: 0.7090625 2023-10-02 17:19:50,153 WARNING [train.py:1204] (2/4) Exclude cut with ID _14922_210_6228_1_1530707435273_3693310_303-228738-0_sp0.9 from training. Duration: 0.9333125 2023-10-02 17:19:51,592 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_577-220899-0_sp1.1 from training. Duration: 0.92725 2023-10-02 17:19:52,933 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_3475_210_2028_1_1526193578_3622569_377-166133-0_sp1.1 from training. Duration: 0.7818125 2023-10-02 17:19:52,956 WARNING [train.py:1204] (2/4) Exclude cut with ID _49376_210_5986_1_1533985278732_3370740_272-313512-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 17:19:57,122 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7762_210_1794_1_1528199508_4057408_422-227034-0_sp0.9 from training. Duration: 0.811125 2023-10-02 17:20:01,934 WARNING [train.py:1204] (2/4) Exclude cut with ID _18895_210_6228_1_1531357274234_3784930_74-341428-0_sp1.1 from training. Duration: 0.92725 2023-10-02 17:20:04,866 WARNING [train.py:1204] (2/4) Exclude cut with ID _44902_210_16187_1_1533888006062_3792078_149-67212-0_sp1.1 from training. Duration: 0.7909375 2023-10-02 17:20:09,554 WARNING [train.py:1204] (2/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_243-126020-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 17:20:12,739 WARNING [train.py:1204] (2/4) Exclude cut with ID _54218_210_12210_1_1534413646388_4409259_259-6561-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 17:20:14,252 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_41452_210_7291_1_1533437687_3941281_405-11559-0 from training. Duration: 0.91 2023-10-02 17:20:15,504 INFO [train.py:1046] (2/4) Epoch 28, batch 300, loss[loss=0.1613, simple_loss=0.2506, pruned_loss=0.03595, over 24651.00 frames. ], tot_loss[loss=0.167, simple_loss=0.245, pruned_loss=0.04452, over 3679844.13 frames. ], batch size: 68, lr: 3.68e-03, grad_scale: 8.0 2023-10-02 17:20:15,610 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_20120_210_6753_1_1531643035_5482859_759-225084-0_sp1.1 from training. Duration: 0.97275 2023-10-02 17:20:17,377 WARNING [train.py:1204] (2/4) Exclude cut with ID _14747_210_8341_1_1530685797463_3654089_147-131229-0_sp0.9 from training. Duration: 0.8444375 2023-10-02 17:20:18,775 WARNING [train.py:1204] (2/4) Exclude cut with ID _14867_210_1968_1_1530684179890_3846969_319-250868-0 from training. Duration: 0.99 2023-10-02 17:20:18,796 WARNING [train.py:1204] (2/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_580-93798-0_sp0.9 from training. Duration: 0.62225 2023-10-02 17:20:20,183 WARNING [train.py:1204] (2/4) Exclude cut with ID _9679_210_7497_1_1529129212906_4363929_512-313889-0_sp0.9 from training. Duration: 0.9 2023-10-02 17:20:20,192 WARNING [train.py:1204] (2/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_18-189198-0 from training. Duration: 0.9 2023-10-02 17:20:20,382 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.bypass.scale_min, batch_count=958180.0, ans=0.2 2023-10-02 17:20:25,016 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.2.encoder.layers.2.nonlin_attention.whiten2, num_groups=1, num_channels=384, metric=5.11 vs. limit=15.0 2023-10-02 17:20:25,552 WARNING [train.py:1204] (2/4) Exclude cut with ID _49755_210_18053_1_1534581005846_3218709_137-64179-0_sp1.1 from training. Duration: 0.92725 2023-10-02 17:20:25,619 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_48049_210_5632_1_1533864553_3519937_306-283794-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 17:20:28,853 WARNING [train.py:1204] (2/4) Exclude cut with ID _43552_210_8913_1_1533726031059_3523429_25-120161-0_sp1.1 from training. Duration: 0.8909375 2023-10-02 17:20:28,924 WARNING [train.py:1204] (2/4) Exclude cut with ID _8833_210_5219_1_1528592665210_7553059_319-349181-0 from training. Duration: 0.99 2023-10-02 17:20:30,157 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_296-332325-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 17:20:32,812 WARNING [train.py:1204] (2/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_436-186311-0_sp1.1 from training. Duration: 0.5909375 2023-10-02 17:20:32,824 WARNING [train.py:1204] (2/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_348-127789-0 from training. Duration: 0.85 2023-10-02 17:20:32,829 WARNING [train.py:1204] (2/4) Exclude cut with ID _3992_210_1747_1_1526644816031_3731060_443-195183-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 17:20:35,735 WARNING [train.py:1204] (2/4) Exclude cut with ID _3465_210_3525_1_1526175667060_3574049_308-231735-0_sp1.1 from training. Duration: 0.663625 2023-10-02 17:20:35,992 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.self_attn_weights.pos_emb_skip_rate, batch_count=958246.6666666666, ans=0.0 2023-10-02 17:20:41,751 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6786_210_1388_1_1527839699_4194495_378-46070-0_sp1.1 from training. Duration: 0.6545625 2023-10-02 17:20:43,189 WARNING [train.py:1204] (2/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_63-101996-0 from training. Duration: 0.88 2023-10-02 17:20:43,462 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.nonlin_attention.balancer.prob, batch_count=958313.3333333334, ans=0.125 2023-10-02 17:20:44,817 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.balancer2.prob, batch_count=958313.3333333334, ans=0.125 2023-10-02 17:20:46,059 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_2541_210_1794_1_1525590269_3084765_156-173713-0 from training. Duration: 0.81 2023-10-02 17:20:46,090 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_217-239796-0_sp1.1 from training. Duration: 0.963625 2023-10-02 17:20:47,518 WARNING [train.py:1204] (2/4) Exclude cut with ID _21878_210_3525_1_1531738772002_7264330_530-329400-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 17:20:49,516 WARNING [train.py:1204] (2/4) Exclude cut with ID _21783_210_11357_1_1531742458730_3762679_150-62920-0_sp1.1 from training. Duration: 0.963625 2023-10-02 17:20:49,516 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_5522_210_3273_1_1527249606_3909096_366-172508-0 from training. Duration: 0.66 2023-10-02 17:20:49,525 WARNING [train.py:1204] (2/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_303-340446-0_sp1.1 from training. Duration: 0.8181875 2023-10-02 17:20:50,942 WARNING [train.py:1204] (2/4) Exclude cut with ID _8699_210_2069_1_1528630346831_3960409_160-24408-0_sp1.1 from training. Duration: 0.82725 2023-10-02 17:20:53,642 WARNING [train.py:1204] (2/4) Exclude cut with ID _41314_210_12651_1_1533459476543_3766460_299-187801-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 17:20:53,675 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_34886_210_10633_1_1533203882_1755661_92-345288-0_sp1.1 from training. Duration: 0.97275 2023-10-02 17:20:57,046 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_5438_210_1794_1_1527336687_2600009_104-288274-0_sp1.1 from training. Duration: 0.52725 2023-10-02 17:20:57,050 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_154-316450-0 from training. Duration: 0.76 2023-10-02 17:20:58,381 WARNING [train.py:1204] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_23-189234-0_sp0.9 from training. Duration: 0.92225 2023-10-02 17:21:01,214 WARNING [train.py:1204] (2/4) Exclude cut with ID _4779_210_2084_1_1527318022944_3914893_346-168820-0_sp1.1 from training. Duration: 0.963625 2023-10-02 17:21:02,540 WARNING [train.py:1204] (2/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_5-55893-0 from training. Duration: 0.55 2023-10-02 17:21:03,760 WARNING [train.py:1204] (2/4) Exclude cut with ID _20455_210_5219_1_1531565996775_8098100_180-24517-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 17:21:05,028 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.503e+02 1.790e+02 1.923e+02 2.144e+02 2.937e+02, threshold=3.846e+02, percent-clipped=0.0 2023-10-02 17:21:09,297 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_3133_210_2087_1_1526106446_2324690_243-143119-0_sp0.9 from training. Duration: 0.8555625 2023-10-02 17:21:10,723 WARNING [train.py:1204] (2/4) Exclude cut with ID _65585_210_12210_1_1535450451584_3764098_293-212045-0_sp1.1 from training. Duration: 0.92725 2023-10-02 17:21:10,727 WARNING [train.py:1204] (2/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_577-332600-0 from training. Duration: 0.57 2023-10-02 17:21:12,974 INFO [scaling.py:1118] (2/4) WithLoss: name=encoder.encoders.2.encoder.layers.1.self_attn_weights, loss-sum=0.000e+00 2023-10-02 17:21:15,490 WARNING [train.py:1204] (2/4) Exclude cut with ID _30039_210_13270_1_1532484612900_2725150_104-289874-0_sp1.1 from training. Duration: 0.963625 2023-10-02 17:21:15,503 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_3172_210_2087_1_1526292929_3987865_197-97762-0_sp1.1 from training. Duration: 0.6818125 2023-10-02 17:21:18,682 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_256-150908-0_sp1.1 from training. Duration: 0.963625 2023-10-02 17:21:18,795 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_296-287678-0_sp1.1 from training. Duration: 0.863625 2023-10-02 17:21:20,082 WARNING [train.py:1204] (2/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_443-30034-0 from training. Duration: 0.64 2023-10-02 17:21:20,550 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.4.encoder.layers.2.feed_forward1.out_whiten, num_groups=1, num_channels=384, metric=13.35 vs. limit=15.0 2023-10-02 17:21:21,442 WARNING [train.py:1204] (2/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_494-199538-0_sp0.9 from training. Duration: 0.8333125 2023-10-02 17:21:21,506 WARNING [train.py:1204] (2/4) Exclude cut with ID _19708_210_9206_1_1532136650733_4238364_61-162325-0_sp1.1 from training. Duration: 0.97275 2023-10-02 17:21:22,834 WARNING [train.py:1204] (2/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_475-127455-0 from training. Duration: 0.7 2023-10-02 17:21:24,293 WARNING [train.py:1204] (2/4) Exclude cut with ID _48677_210_5663_1_1533902405919_3892989_188-11581-0_sp1.1 from training. Duration: 0.963625 2023-10-02 17:21:24,320 WARNING [train.py:1204] (2/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_136-8381-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 17:21:25,767 WARNING [train.py:1204] (2/4) Exclude cut with ID _55328_210_27767_1_1534580973260_4088170_270-229555-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 17:21:25,798 WARNING [train.py:1204] (2/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_372-266951-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 17:21:27,315 WARNING [train.py:1204] (2/4) Exclude cut with ID _47790_210_16187_1_1533862814066_4207519_14-242149-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 17:21:29,219 INFO [train.py:1046] (2/4) Epoch 28, batch 350, loss[loss=0.1533, simple_loss=0.2327, pruned_loss=0.03697, over 24594.00 frames. ], tot_loss[loss=0.1657, simple_loss=0.2433, pruned_loss=0.04412, over 3907398.35 frames. ], batch size: 60, lr: 3.68e-03, grad_scale: 8.0 2023-10-02 17:21:30,816 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.conv_skip_rate, batch_count=958513.3333333334, ans=0.0 2023-10-02 17:21:31,909 WARNING [train.py:1204] (2/4) Exclude cut with ID _7877_210_6014_1_1528286358099_3750136_159-76512-0_sp1.1 from training. Duration: 0.936375 2023-10-02 17:21:31,912 WARNING [train.py:1204] (2/4) Exclude cut with ID _8263_210_1664_1_1528438953494_294100_40-204731-0_sp1.1 from training. Duration: 0.4545625 2023-10-02 17:21:33,476 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.feed_forward3.hidden_balancer.prob, batch_count=958513.3333333334, ans=0.125 2023-10-02 17:21:34,648 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8278_210_2550_1_1528637099_4220413_694-238413-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 17:21:40,585 WARNING [train.py:1204] (2/4) Exclude cut with ID _47278_210_12041_1_1533902418349_3576110_421-172710-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 17:21:42,040 WARNING [train.py:1204] (2/4) Exclude cut with ID _14970_210_15709_1_1530773543493_1715730_215-282680-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 17:21:43,697 WARNING [train.py:1204] (2/4) Exclude cut with ID _64639_210_5816_1_1535367619437_3528000_306-149449-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 17:21:45,221 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_28-291982-0 from training. Duration: 0.97 2023-10-02 17:21:46,583 WARNING [train.py:1204] (2/4) Exclude cut with ID _55134_210_21982_1_1534590028377_3882170_412-15870-0_sp1.1 from training. Duration: 0.936375 2023-10-02 17:21:46,613 WARNING [train.py:1204] (2/4) Exclude cut with ID _13684_210_7497_1_1530619354863_3598359_312-235606-0 from training. Duration: 0.6 2023-10-02 17:21:48,333 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.feed_forward3.hidden_balancer.prob, batch_count=958580.0, ans=0.125 2023-10-02 17:21:49,430 WARNING [train.py:1204] (2/4) Exclude cut with ID _37112_210_2575_1_1532997007673_7268610_347-238993-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 17:21:49,471 WARNING [train.py:1204] (2/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_121-284152-0 from training. Duration: 0.59 2023-10-02 17:21:50,812 WARNING [train.py:1204] (2/4) Exclude cut with ID _2941_210_2577_1_1526199673150_2489759_228-2961-0_sp1.1 from training. Duration: 0.97275 2023-10-02 17:21:54,221 WARNING [train.py:1204] (2/4) Exclude cut with ID _28323_210_16187_1_1532480395667_4100300_531-24157-0 from training. Duration: 0.98 2023-10-02 17:21:56,049 WARNING [train.py:1204] (2/4) Exclude cut with ID _32704_210_8954_1_1533017488840_6747370_266-329755-0_sp0.9 from training. Duration: 0.988875 2023-10-02 17:21:57,410 WARNING [train.py:1204] (2/4) Exclude cut with ID _6653_210_2629_1_1527919172441_6732679_1038-146421-0_sp1.1 from training. Duration: 0.97275 2023-10-02 17:21:58,742 WARNING [train.py:1204] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_391-22148-0_sp0.9 from training. Duration: 0.8555625 2023-10-02 17:22:00,112 WARNING [train.py:1204] (2/4) Exclude cut with ID _64521_210_14699_1_1535243427259_3815328_543-341117-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 17:22:00,130 WARNING [train.py:1204] (2/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_52-168712-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 17:22:00,172 WARNING [train.py:1204] (2/4) Exclude cut with ID _33920_210_4220_1_1533257851500_3084399_198-334063-0_sp1.1 from training. Duration: 0.8545625 2023-10-02 17:22:01,472 WARNING [train.py:1204] (2/4) Exclude cut with ID _50093_210_4220_1_1534417141207_3432299_69-111987-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 17:22:01,523 WARNING [train.py:1204] (2/4) Exclude cut with ID _14821_210_8750_1_1530680503991_6829359_189-297451-0_sp0.9 from training. Duration: 0.611125 2023-10-02 17:22:03,207 WARNING [train.py:1204] (2/4) Exclude cut with ID _8030_210_7497_1_1528444886781_3531089_262-86750-0_sp0.9 from training. Duration: 0.9 2023-10-02 17:22:03,221 WARNING [train.py:1204] (2/4) Exclude cut with ID _24612_210_8913_1_1531998054193_3806359_294-191196-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 17:22:09,281 WARNING [train.py:1204] (2/4) Exclude cut with ID _16909_210_5724_1_1531292079912_4118919_707-342896-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 17:22:10,578 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_9460_210_4211_1_1528954587_10049667_572-37035-0_sp0.9 from training. Duration: 0.888875 2023-10-02 17:22:12,423 WARNING [train.py:1204] (2/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_762-109886-0_sp1.1 from training. Duration: 0.82725 2023-10-02 17:22:12,445 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_14953_210_2151_1_1530701710_5616165_519-326393-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 17:22:15,409 WARNING [train.py:1204] (2/4) Exclude cut with ID _25914_210_6400_1_1532141317058_644740_52-96808-0 from training. Duration: 0.91 2023-10-02 17:22:15,413 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_68095_210_20185_1_1535718226_8888855_1079-94525-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 17:22:20,790 WARNING [train.py:1204] (2/4) Exclude cut with ID _14860_210_4915_1_1530702020340_7343240_746-3647-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 17:22:20,791 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_70379_210_7925_1_1535941455_557455_49-101720-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 17:22:22,069 WARNING [train.py:1204] (2/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_8-307291-0_sp1.1 from training. Duration: 0.936375 2023-10-02 17:22:22,192 WARNING [train.py:1204] (2/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_117-290916-0 from training. Duration: 0.51 2023-10-02 17:22:25,491 WARNING [train.py:1204] (2/4) Exclude cut with ID _22020_210_2461_1_1531724164513_6326900_944-339840-0_sp1.1 from training. Duration: 0.963625 2023-10-02 17:22:26,820 WARNING [train.py:1204] (2/4) Exclude cut with ID IC0515W0043-254911 from training. Duration: 0.976 2023-10-02 17:22:26,904 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_3910_210_1794_1_1526742306_334530_28-238909-0 from training. Duration: 0.81 2023-10-02 17:22:27,551 WARNING [train.py:1204] (2/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_269-267275-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 17:22:31,493 WARNING [train.py:1204] (2/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_72-251255-0_sp1.1 from training. Duration: 0.8545625 2023-10-02 17:22:31,512 WARNING [train.py:1204] (2/4) Exclude cut with ID _14886_210_4220_1_1530702020355_3516190_175-229073-0 from training. Duration: 0.45 2023-10-02 17:22:33,109 WARNING [train.py:1204] (2/4) Exclude cut with ID _40519_210_13794_1_1533715228655_7337390_921-209452-0_sp1.1 from training. Duration: 0.963625 2023-10-02 17:22:34,514 WARNING [train.py:1204] (2/4) Exclude cut with ID _29857_210_20034_1_1532395886753_3798160_335-163562-0_sp0.9 from training. Duration: 0.9444375 2023-10-02 17:22:34,683 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.attention_skip_rate, batch_count=958780.0, ans=0.0 2023-10-02 17:22:37,615 WARNING [train.py:1204] (2/4) Exclude cut with ID _58583_210_5236_1_1534726835266_3574249_384-58445-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 17:22:37,692 WARNING [train.py:1204] (2/4) Exclude cut with ID _44011_210_2046_1_1533722401691_3631310_96-315656-0_sp1.1 from training. Duration: 0.963625 2023-10-02 17:22:37,696 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_68484_210_1794_1_1535849804_7678097_549-170613-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 17:22:40,513 WARNING [train.py:1204] (2/4) Exclude cut with ID _37892_210_19596_1_1533198677897_3964058_214-148058-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 17:22:40,774 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.bypass_mid.scale_min, batch_count=958780.0, ans=0.2 2023-10-02 17:22:43,230 INFO [train.py:1046] (2/4) Epoch 28, batch 400, loss[loss=0.1507, simple_loss=0.2274, pruned_loss=0.03697, over 23468.00 frames. ], tot_loss[loss=0.1654, simple_loss=0.243, pruned_loss=0.04387, over 4086077.07 frames. ], batch size: 134, lr: 3.68e-03, grad_scale: 16.0 2023-10-02 17:22:43,290 WARNING [train.py:1204] (2/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_415-78792-0_sp0.9 from training. Duration: 0.9 2023-10-02 17:22:45,237 WARNING [train.py:1204] (2/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_383-205833-0_sp1.1 from training. Duration: 0.863625 2023-10-02 17:22:46,483 WARNING [train.py:1204] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_167-137255-0 from training. Duration: 0.64 2023-10-02 17:22:46,491 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8275_210_4198_1_1528455320_3892571_484-154786-0_sp1.1 from training. Duration: 0.963625 2023-10-02 17:22:46,674 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=958846.6666666666, ans=0.1 2023-10-02 17:22:47,848 WARNING [train.py:1204] (2/4) Exclude cut with ID _40284_210_2084_1_1533513679814_3704279_396-319412-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 17:22:49,217 WARNING [train.py:1204] (2/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_280-120942-0_sp0.9 from training. Duration: 0.8555625 2023-10-02 17:22:49,261 WARNING [train.py:1204] (2/4) Exclude cut with ID _45630_210_10556_1_1533949174498_3902049_558-240889-0_sp1.1 from training. Duration: 0.92725 2023-10-02 17:22:50,898 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.feed_forward2.hidden_balancer.prob, batch_count=958846.6666666666, ans=0.125 2023-10-02 17:22:52,004 WARNING [train.py:1204] (2/4) Exclude cut with ID _56235_210_10640_1_1534557335235_4161099_197-307458-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 17:22:53,925 WARNING [train.py:1204] (2/4) Exclude cut with ID _16909_210_5724_1_1531292079912_4118919_163-254843-0_sp1.1 from training. Duration: 0.92725 2023-10-02 17:22:56,523 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_103-84010-0 from training. Duration: 0.69 2023-10-02 17:22:58,059 WARNING [train.py:1204] (2/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_401-158239-0 from training. Duration: 0.71 2023-10-02 17:22:58,060 WARNING [train.py:1204] (2/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_899-233658-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 17:23:00,224 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.conv_module1.balancer1.max_abs, batch_count=958913.3333333334, ans=10.0 2023-10-02 17:23:01,157 WARNING [train.py:1204] (2/4) Exclude cut with ID _23825_210_13449_1_1531915313526_3497150_240-271367-0 from training. Duration: 0.97 2023-10-02 17:23:01,209 WARNING [train.py:1204] (2/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_424-134619-0_sp1.1 from training. Duration: 0.92725 2023-10-02 17:23:04,006 WARNING [train.py:1204] (2/4) Exclude cut with ID _44831_210_13427_1_1533816011549_3689035_32-188147-0_sp1.1 from training. Duration: 0.87275 2023-10-02 17:23:04,010 WARNING [train.py:1204] (2/4) Exclude cut with ID _59170_210_6721_1_1534983633960_4203335_314-63879-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 17:23:04,030 WARNING [train.py:1204] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_602-275260-0 from training. Duration: 0.82 2023-10-02 17:23:05,323 WARNING [train.py:1204] (2/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_511-306767-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 17:23:05,350 WARNING [train.py:1204] (2/4) Exclude cut with ID _41817_210_18835_1_1533434477160_7423049_565-201775-0_sp1.1 from training. Duration: 0.92725 2023-10-02 17:23:05,359 WARNING [train.py:1204] (2/4) Exclude cut with ID _28317_210_12742_1_1532480302381_8510325_778-173151-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 17:23:05,400 WARNING [train.py:1204] (2/4) Exclude cut with ID _14998_210_5219_1_1530705582080_7474970_680-51001-0_sp1.1 from training. Duration: 0.963625 2023-10-02 17:23:08,577 WARNING [train.py:1204] (2/4) Exclude cut with ID IC0677W0059-334891 from training. Duration: 0.896 2023-10-02 17:23:08,617 WARNING [train.py:1204] (2/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_370-331600-0 from training. Duration: 0.94 2023-10-02 17:23:13,322 WARNING [train.py:1204] (2/4) Exclude cut with ID _66840_210_5219_1_1535452168426_4196349_446-240192-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 17:23:13,402 WARNING [train.py:1204] (2/4) Exclude cut with ID _65242_210_18628_1_1535369393717_3823149_38-6732-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 17:23:14,850 WARNING [train.py:1204] (2/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_302-2550-0 from training. Duration: 0.81 2023-10-02 17:23:16,196 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_43802_210_2290_1_1533516203_8829504_100-348441-0 from training. Duration: 0.91 2023-10-02 17:23:18,872 WARNING [train.py:1204] (2/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_329-285801-0_sp1.1 from training. Duration: 0.82725 2023-10-02 17:23:21,530 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_30167_210_3528_1_1532428012_4772375_343-334712-0_sp1.1 from training. Duration: 0.936375 2023-10-02 17:23:27,627 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7543_210_6753_1_1528543318_4552_1-85072-0 from training. Duration: 0.83 2023-10-02 17:23:27,845 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.nonlin_attention.balancer.prob, batch_count=959046.6666666666, ans=0.125 2023-10-02 17:23:30,811 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7002_210_1794_1_1527935634_4034444_487-236501-0_sp1.1 from training. Duration: 0.6 2023-10-02 17:23:31,044 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.bypass_mid.scale_min, batch_count=959046.6666666666, ans=0.2 2023-10-02 17:23:32,165 WARNING [train.py:1204] (2/4) Exclude cut with ID _7160_210_1876_1_1527940923231_3703799_282-322611-0 from training. Duration: 0.87 2023-10-02 17:23:34,764 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.466e+02 1.847e+02 2.073e+02 2.548e+02 3.934e+02, threshold=4.147e+02, percent-clipped=1.0 2023-10-02 17:23:34,828 WARNING [train.py:1204] (2/4) Exclude cut with ID _67335_210_5219_1_1535525975652_4275229_570-262983-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 17:23:34,935 WARNING [train.py:1204] (2/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_625-338546-0_sp1.1 from training. Duration: 0.8 2023-10-02 17:23:34,961 WARNING [train.py:1204] (2/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_176-56120-0 from training. Duration: 0.83 2023-10-02 17:23:39,759 WARNING [train.py:1204] (2/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_963-187109-0_sp1.1 from training. Duration: 0.9 2023-10-02 17:23:43,041 WARNING [train.py:1204] (2/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_121-284152-0_sp0.9 from training. Duration: 0.6555625 2023-10-02 17:23:43,141 WARNING [train.py:1204] (2/4) Exclude cut with ID _45021_210_9994_1_1533619751094_3330288_104-81711-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 17:23:43,431 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=959113.3333333334, ans=0.1 2023-10-02 17:23:44,810 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.conv_skip_rate, batch_count=959113.3333333334, ans=0.0 2023-10-02 17:23:46,015 WARNING [train.py:1204] (2/4) Exclude cut with ID _51382_210_23862_1_1534492801838_3650104_129-343914-0_sp1.1 from training. Duration: 0.936375 2023-10-02 17:23:47,319 WARNING [train.py:1204] (2/4) Exclude cut with ID _14970_210_15709_1_1530775547361_1966965_8-169924-0 from training. Duration: 0.92 2023-10-02 17:23:48,786 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_2295_210_2087_1_1525500993_2407863_234-83104-0_sp1.1 from training. Duration: 0.563625 2023-10-02 17:23:48,855 WARNING [train.py:1204] (2/4) Exclude cut with ID _14687_210_5854_1_1530703838259_3664889_115-239046-0 from training. Duration: 0.67 2023-10-02 17:23:50,245 WARNING [train.py:1204] (2/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_690-126306-0_sp1.1 from training. Duration: 0.7090625 2023-10-02 17:23:50,256 WARNING [train.py:1204] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_625-102955-0_sp0.9 from training. Duration: 0.8555625 2023-10-02 17:23:50,535 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.feed_forward1.hidden_balancer.prob, batch_count=959113.3333333334, ans=0.125 2023-10-02 17:23:51,673 WARNING [train.py:1204] (2/4) Exclude cut with ID _9389_210_1664_1_1529217337301_5289269_289-213417-0 from training. Duration: 0.89 2023-10-02 17:23:53,055 WARNING [train.py:1204] (2/4) Exclude cut with ID _13253_210_1925_1_1530757775819_3577873_191-144977-0_sp0.9 from training. Duration: 0.8666875 2023-10-02 17:23:54,282 WARNING [train.py:1204] (2/4) Exclude cut with ID _21783_210_11357_1_1531742458730_3762679_325-155703-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 17:23:54,326 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_2295_210_2087_1_1525500993_2407863_116-238894-0_sp0.9 from training. Duration: 0.77775 2023-10-02 17:23:57,450 INFO [train.py:1046] (2/4) Epoch 28, batch 450, loss[loss=0.1706, simple_loss=0.2442, pruned_loss=0.04853, over 22851.00 frames. ], tot_loss[loss=0.1658, simple_loss=0.2438, pruned_loss=0.04384, over 4223406.58 frames. ], batch size: 322, lr: 3.68e-03, grad_scale: 8.0 2023-10-02 17:23:57,516 WARNING [train.py:1204] (2/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_94-126128-0 from training. Duration: 0.86 2023-10-02 17:23:57,531 WARNING [train.py:1204] (2/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_395-310220-0_sp0.9 from training. Duration: 0.988875 2023-10-02 17:23:57,607 WARNING [train.py:1204] (2/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_821-110852-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 17:23:59,352 WARNING [train.py:1204] (2/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_64-248359-0_sp1.1 from training. Duration: 0.62725 2023-10-02 17:23:59,378 WARNING [train.py:1204] (2/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_138-187031-0 from training. Duration: 0.99 2023-10-02 17:24:00,726 WARNING [train.py:1204] (2/4) Exclude cut with ID _4122_210_2084_1_1526713164417_4353280_528-206771-0_sp0.9 from training. Duration: 0.97775 2023-10-02 17:24:00,820 WARNING [train.py:1204] (2/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_844-96989-0_sp1.1 from training. Duration: 0.6454375 2023-10-02 17:24:03,541 WARNING [train.py:1204] (2/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_106-30782-0_sp1.1 from training. Duration: 0.72725 2023-10-02 17:24:11,187 WARNING [train.py:1204] (2/4) Exclude cut with ID _14683_210_5219_1_1530698352608_3974859_281-250040-0_sp1.1 from training. Duration: 0.936375 2023-10-02 17:24:12,915 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_46013_210_7234_1_1533776362_3744032_463-52378-0_sp0.9 from training. Duration: 0.9555625 2023-10-02 17:24:15,565 WARNING [train.py:1204] (2/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_299-34413-0 from training. Duration: 0.65 2023-10-02 17:24:15,626 WARNING [train.py:1204] (2/4) Exclude cut with ID _35799_210_5854_1_1533263487887_3529071_490-326847-0 from training. Duration: 0.99 2023-10-02 17:24:18,620 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.self_attn_weights.pos_emb_skip_rate, batch_count=959246.6666666666, ans=0.0 2023-10-02 17:24:18,700 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.conv_module1.balancer2.prob, batch_count=959246.6666666666, ans=0.125 2023-10-02 17:24:19,857 WARNING [train.py:1204] (2/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_589-9070-0_sp0.9 from training. Duration: 0.788875 2023-10-02 17:24:22,551 WARNING [train.py:1204] (2/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_884-29515-0_sp1.1 from training. Duration: 0.936375 2023-10-02 17:24:23,962 WARNING [train.py:1204] (2/4) Exclude cut with ID _15012_210_1968_1_1530856820429_5095350_474-119995-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 17:24:28,035 WARNING [train.py:1204] (2/4) Exclude cut with ID _65585_210_12210_1_1535450451584_3764098_211-97124-0_sp1.1 from training. Duration: 0.963625 2023-10-02 17:24:29,798 WARNING [train.py:1204] (2/4) Exclude cut with ID _33640_210_2655_1_1533117605479_4855469_666-224463-0_sp1.1 from training. Duration: 0.963625 2023-10-02 17:24:31,262 WARNING [train.py:1204] (2/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_35-153886-0 from training. Duration: 0.84 2023-10-02 17:24:31,922 WARNING [train.py:1204] (2/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_210-106047-0 from training. Duration: 0.97 2023-10-02 17:24:34,646 WARNING [train.py:1204] (2/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_479-125611-0 from training. Duration: 0.96 2023-10-02 17:24:34,668 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_9417_210_1794_1_1529235261_4614131_427-153963-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 17:24:34,768 WARNING [train.py:1204] (2/4) Exclude cut with ID _23818_210_6228_1_1531917063884_3676920_133-170571-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 17:24:36,045 WARNING [train.py:1204] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_602-275260-0_sp1.1 from training. Duration: 0.7454375 2023-10-02 17:24:37,876 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9029W0121-509521 from training. Duration: 0.896 2023-10-02 17:24:37,886 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9029W0434-509821 from training. Duration: 0.64 2023-10-02 17:24:37,907 WARNING [train.py:1204] (2/4) Exclude cut with ID _23038_210_9570_1_1532001584158_3652419_289-307034-0_sp1.1 from training. Duration: 0.936375 2023-10-02 17:24:39,241 WARNING [train.py:1204] (2/4) Exclude cut with ID _29345_210_4381_1_1532689168263_3860169_149-257268-0_sp0.9 from training. Duration: 0.9 2023-10-02 17:24:40,579 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_405-339986-0_sp1.1 from training. Duration: 0.463625 2023-10-02 17:24:40,887 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.balancer2.prob, batch_count=959380.0, ans=0.125 2023-10-02 17:24:42,195 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.feed_forward3.hidden_balancer.prob, batch_count=959380.0, ans=0.125 2023-10-02 17:24:45,092 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_2655_210_2087_1_1525688959_3897400_437-337690-0_sp1.1 from training. Duration: 0.57275 2023-10-02 17:24:45,139 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7543_210_6753_1_1528541907_1399322_165-323619-0_sp1.1 from training. Duration: 0.62725 2023-10-02 17:24:46,452 WARNING [train.py:1204] (2/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_508-335369-0_sp1.1 from training. Duration: 0.436375 2023-10-02 17:24:46,521 WARNING [train.py:1204] (2/4) Exclude cut with ID _7852_210_5219_1_1528286471316_8139400_358-287492-0 from training. Duration: 0.96 2023-10-02 17:24:49,310 WARNING [train.py:1204] (2/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_322-217220-0_sp0.9 from training. Duration: 0.9555625 2023-10-02 17:24:51,362 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.3.encoder.layers.2.nonlin_attention.whiten2, num_groups=1, num_channels=512, metric=5.97 vs. limit=15.0 2023-10-02 17:24:52,040 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_3171_210_2087_1_1526109084_2330007_66-260583-0_sp0.9 from training. Duration: 0.8 2023-10-02 17:24:52,069 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8558_210_1410_1_1528505225_7613406_131-329524-0_sp1.1 from training. Duration: 0.7181875 2023-10-02 17:24:53,478 WARNING [train.py:1204] (2/4) Exclude cut with ID _9235_210_5219_1_1528891154253_8046120_30-326681-0 from training. Duration: 0.83 2023-10-02 17:24:55,059 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.feed_forward2.hidden_balancer.prob, batch_count=959446.6666666666, ans=0.125 2023-10-02 17:24:56,264 WARNING [train.py:1204] (2/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_58-68956-0_sp0.9 from training. Duration: 0.92225 2023-10-02 17:24:56,324 WARNING [train.py:1204] (2/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_255-312654-0 from training. Duration: 0.91 2023-10-02 17:24:57,662 WARNING [train.py:1204] (2/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_99-167392-0 from training. Duration: 0.61 2023-10-02 17:24:59,017 WARNING [train.py:1204] (2/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_94-126128-0_sp0.9 from training. Duration: 0.9555625 2023-10-02 17:25:03,130 WARNING [train.py:1204] (2/4) Exclude cut with ID _68635_210_9317_1_1535770813410_4315870_187-192817-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 17:25:04,299 WARNING [train.py:1204] (2/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_277-204970-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 17:25:05,765 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6163_210_6400_1_1527506746_4263068_283-137811-0_sp0.9 from training. Duration: 0.8555625 2023-10-02 17:25:05,794 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9144W0121-566446 from training. Duration: 0.896 2023-10-02 17:25:10,404 WARNING [train.py:1204] (2/4) Exclude cut with ID _49371_210_12071_1_1533952761021_3812190_351-315519-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 17:25:11,642 INFO [train.py:1046] (2/4) Epoch 28, batch 500, loss[loss=0.1748, simple_loss=0.2446, pruned_loss=0.05251, over 23672.00 frames. ], tot_loss[loss=0.1664, simple_loss=0.2446, pruned_loss=0.04412, over 4339874.74 frames. ], batch size: 232, lr: 3.68e-03, grad_scale: 8.0 2023-10-02 17:25:11,680 WARNING [train.py:1204] (2/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_115-326778-0_sp1.1 from training. Duration: 0.8454375 2023-10-02 17:25:11,726 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_17292_210_6753_1_1531125388_2943734_436-198965-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 17:25:11,734 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9165W0117-576751 from training. Duration: 0.896 2023-10-02 17:25:13,540 WARNING [train.py:1204] (2/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_212-149947-0 from training. Duration: 0.73 2023-10-02 17:25:13,548 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_13757_210_3528_1_1530440682_4107239_65-167544-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 17:25:15,577 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.4.encoder.layers.2.feed_forward3.out_whiten, num_groups=1, num_channels=384, metric=11.25 vs. limit=15.0 2023-10-02 17:25:16,333 WARNING [train.py:1204] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_700-183549-0_sp0.9 from training. Duration: 0.7555625 2023-10-02 17:25:19,150 WARNING [train.py:1204] (2/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_216-343007-0_sp0.9 from training. Duration: 0.7333125 2023-10-02 17:25:20,670 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7544_210_6753_1_1528587722_3576686_295-258479-0_sp1.1 from training. Duration: 0.763625 2023-10-02 17:25:22,132 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_365-287106-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 17:25:22,143 WARNING [train.py:1204] (2/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_300-3747-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 17:25:23,334 WARNING [train.py:1204] (2/4) Exclude cut with ID _14069_210_5632_1_1530512692033_4803390_487-303201-0_sp1.1 from training. Duration: 0.92725 2023-10-02 17:25:34,063 WARNING [train.py:1204] (2/4) Exclude cut with ID _14459_210_2228_1_1531443641946_7516700_668-123026-0_sp1.1 from training. Duration: 0.936375 2023-10-02 17:25:34,090 WARNING [train.py:1204] (2/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_108-53929-0_sp1.1 from training. Duration: 0.536375 2023-10-02 17:25:35,409 WARNING [train.py:1204] (2/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_266-137104-0_sp0.9 from training. Duration: 0.72225 2023-10-02 17:25:35,441 WARNING [train.py:1204] (2/4) Exclude cut with ID _12302_210_5219_1_1529924446466_8107280_902-168619-0_sp1.1 from training. Duration: 0.936375 2023-10-02 17:25:37,306 WARNING [train.py:1204] (2/4) Exclude cut with ID _59502_210_12210_1_1535107024698_1835139_146-162495-0 from training. Duration: 0.92 2023-10-02 17:25:37,327 WARNING [train.py:1204] (2/4) Exclude cut with ID _3465_210_3525_1_1526175667060_3574049_412-287526-0_sp1.1 from training. Duration: 0.6545625 2023-10-02 17:25:38,822 WARNING [train.py:1204] (2/4) Exclude cut with ID _7160_210_1876_1_1527940923231_3703799_120-150943-0_sp0.9 from training. Duration: 0.9 2023-10-02 17:25:40,377 WARNING [train.py:1204] (2/4) Exclude cut with ID _15037_210_2253_1_1530752294104_3267650_296-192597-0_sp1.1 from training. Duration: 0.736375 2023-10-02 17:25:40,397 WARNING [train.py:1204] (2/4) Exclude cut with ID _13252_210_1925_1_1530671372047_3336909_397-216497-0_sp1.1 from training. Duration: 0.87275 2023-10-02 17:25:41,703 WARNING [train.py:1204] (2/4) Exclude cut with ID _40094_210_16380_1_1533294005773_3641159_76-269320-0_sp1.1 from training. Duration: 0.936375 2023-10-02 17:25:41,757 WARNING [train.py:1204] (2/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_449-15508-0 from training. Duration: 0.82 2023-10-02 17:25:43,287 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.nonlin_attention.balancer.prob, batch_count=959646.6666666666, ans=0.125 2023-10-02 17:25:47,691 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9309W0194-640321 from training. Duration: 0.64 2023-10-02 17:25:49,196 WARNING [train.py:1204] (2/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_284-312178-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 17:25:50,633 WARNING [train.py:1204] (2/4) Exclude cut with ID _29853_210_7497_1_1532505600851_6642110_401-55937-0_sp1.1 from training. Duration: 0.92725 2023-10-02 17:25:51,975 WARNING [train.py:1204] (2/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_716-24918-0_sp1.1 from training. Duration: 0.92725 2023-10-02 17:25:51,995 WARNING [train.py:1204] (2/4) Exclude cut with ID _19637_210_4220_1_1531537167676_3205060_314-305638-0_sp1.1 from training. Duration: 0.92725 2023-10-02 17:25:52,038 WARNING [train.py:1204] (2/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_475-127455-0_sp1.1 from training. Duration: 0.636375 2023-10-02 17:25:54,798 WARNING [train.py:1204] (2/4) Exclude cut with ID _5444_210_2151_1_1527418808464_5312210_581-315411-0 from training. Duration: 0.62 2023-10-02 17:25:56,244 WARNING [train.py:1204] (2/4) Exclude cut with ID _44902_210_16187_1_1533888006062_3792078_149-67212-0_sp0.9 from training. Duration: 0.9666875 2023-10-02 17:25:57,580 WARNING [train.py:1204] (2/4) Exclude cut with ID _31602_210_5854_1_1532677021456_4458250_63-63253-0_sp1.1 from training. Duration: 0.963625 2023-10-02 17:25:59,091 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.feed_forward1.out_proj.dropout_p, batch_count=959713.3333333334, ans=0.1 2023-10-02 17:26:01,822 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.1.ff2_skip_rate, batch_count=959713.3333333334, ans=0.0 2023-10-02 17:26:03,512 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.572e+02 1.887e+02 2.063e+02 2.289e+02 3.276e+02, threshold=4.127e+02, percent-clipped=0.0 2023-10-02 17:26:03,600 WARNING [train.py:1204] (2/4) Exclude cut with ID _55209_210_1531_1_1534675828996_4597160_660-203992-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 17:26:06,584 WARNING [train.py:1204] (2/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_83-190624-0_sp1.1 from training. Duration: 0.92725 2023-10-02 17:26:11,327 WARNING [train.py:1204] (2/4) Exclude cut with ID _58605_210_12210_1_1534723195939_3904259_154-139635-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 17:26:15,285 WARNING [train.py:1204] (2/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_164-224052-0 from training. Duration: 0.7 2023-10-02 17:26:15,300 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_1126-189071-0_sp1.1 from training. Duration: 0.963625 2023-10-02 17:26:15,317 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_15396_210_6890_1_1531308473_1938936_310-214789-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 17:26:18,051 WARNING [train.py:1204] (2/4) Exclude cut with ID _8395_210_1531_1_1528597871523_4411579_431-132470-0 from training. Duration: 0.74 2023-10-02 17:26:20,023 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_3780_210_2087_1_1526467760_2130483_170-259044-0_sp0.9 from training. Duration: 0.77775 2023-10-02 17:26:22,621 WARNING [train.py:1204] (2/4) Exclude cut with ID _12733_210_8341_1_1530167424556_3646380_537-230640-0_sp1.1 from training. Duration: 0.963625 2023-10-02 17:26:25,368 INFO [train.py:1046] (2/4) Epoch 28, batch 550, loss[loss=0.1779, simple_loss=0.2645, pruned_loss=0.04569, over 24076.00 frames. ], tot_loss[loss=0.1676, simple_loss=0.2458, pruned_loss=0.04471, over 4412991.25 frames. ], batch size: 80, lr: 3.68e-03, grad_scale: 8.0 2023-10-02 17:26:25,468 WARNING [train.py:1204] (2/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_159-281310-0 from training. Duration: 0.95 2023-10-02 17:26:26,884 WARNING [train.py:1204] (2/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_434-83000-0 from training. Duration: 0.98 2023-10-02 17:26:28,326 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_4405_210_1849_1_1526732205_3593939_493-32464-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 17:26:28,336 WARNING [train.py:1204] (2/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_342-100157-0 from training. Duration: 0.54 2023-10-02 17:26:29,711 WARNING [train.py:1204] (2/4) Exclude cut with ID _44778_210_2054_1_1533798127203_7443090_765-250133-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 17:26:29,724 WARNING [train.py:1204] (2/4) Exclude cut with ID _38670_210_15379_1_1533448810659_4391059_81-312245-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 17:26:29,800 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_50050_210_16223_1_1535435796_7290138_1273-247611-0_sp1.1 from training. Duration: 0.936375 2023-10-02 17:26:31,080 WARNING [train.py:1204] (2/4) Exclude cut with ID _44831_210_13427_1_1533816011549_3689035_399-18591-0_sp1.1 from training. Duration: 0.936375 2023-10-02 17:26:31,106 WARNING [train.py:1204] (2/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_557-324258-0_sp0.9 from training. Duration: 0.9 2023-10-02 17:26:32,927 WARNING [train.py:1204] (2/4) Exclude cut with ID _3465_210_3525_1_1526175667060_3574049_62-335552-0_sp1.1 from training. Duration: 0.7909375 2023-10-02 17:26:34,350 WARNING [train.py:1204] (2/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_673-264125-0_sp1.1 from training. Duration: 0.963625 2023-10-02 17:26:35,705 WARNING [train.py:1204] (2/4) Exclude cut with ID _8030_210_7497_1_1528444886781_3531089_262-86750-0 from training. Duration: 0.81 2023-10-02 17:26:35,718 WARNING [train.py:1204] (2/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_444-28041-0_sp1.1 from training. Duration: 0.77275 2023-10-02 17:26:40,334 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_34008_210_10431_1_1533345424_8183247_516-14858-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 17:26:40,348 WARNING [train.py:1204] (2/4) Exclude cut with ID _32704_210_8954_1_1533017488840_6747370_54-248218-0_sp1.1 from training. Duration: 0.936375 2023-10-02 17:26:43,188 WARNING [train.py:1204] (2/4) Exclude cut with ID _13253_210_1925_1_1530757775819_3577873_300-119782-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 17:26:45,089 WARNING [train.py:1204] (2/4) Exclude cut with ID _39290_210_4220_1_1533862778760_3075118_21-240703-0_sp1.1 from training. Duration: 0.936375 2023-10-02 17:26:47,942 WARNING [train.py:1204] (2/4) Exclude cut with ID 774-127930-0014-10412-0_sp1.1 from training. Duration: 0.95 2023-10-02 17:26:49,415 WARNING [train.py:1204] (2/4) Exclude cut with ID _67499_210_17282_1_1535509805220_3692430_20-179346-0 from training. Duration: 0.97 2023-10-02 17:26:51,212 WARNING [train.py:1204] (2/4) Exclude cut with ID _8365_210_2064_1_1528592311070_4677690_431-294102-0_sp0.9 from training. Duration: 0.988875 2023-10-02 17:26:54,312 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.bypass_mid.scale_min, batch_count=959980.0, ans=0.2 2023-10-02 17:26:55,450 WARNING [train.py:1204] (2/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_365-1492-0_sp1.1 from training. Duration: 0.82725 2023-10-02 17:26:55,485 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_105-114797-0_sp0.9 from training. Duration: 0.9333125 2023-10-02 17:26:56,872 WARNING [train.py:1204] (2/4) Exclude cut with ID _6274_210_2084_1_1527937583837_7494020_769-168302-0_sp0.9 from training. Duration: 0.788875 2023-10-02 17:27:02,110 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_40340_210_9024_1_1533433916_4132944_320-270277-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 17:27:03,338 WARNING [train.py:1204] (2/4) Exclude cut with ID ID0203W0003-781711 from training. Duration: 0.958 2023-10-02 17:27:04,664 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_30434_210_13049_1_1532519384_981746_52-12932-0_sp1.1 from training. Duration: 0.936375 2023-10-02 17:27:04,759 WARNING [train.py:1204] (2/4) Exclude cut with ID _8263_210_1664_1_1528438953494_294100_40-204731-0_sp0.9 from training. Duration: 0.5555625 2023-10-02 17:27:08,111 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1032-236863-0_sp0.9 from training. Duration: 0.9333125 2023-10-02 17:27:09,402 WARNING [train.py:1204] (2/4) Exclude cut with ID _8394_210_6702_1_1528590504316_3540189_335-274780-0_sp0.9 from training. Duration: 0.8666875 2023-10-02 17:27:09,411 WARNING [train.py:1204] (2/4) Exclude cut with ID _66873_210_5517_1_1535454136506_4293370_654-52713-0_sp1.1 from training. Duration: 0.7 2023-10-02 17:27:10,836 WARNING [train.py:1204] (2/4) Exclude cut with ID _5250_210_5219_1_1527249561806_8692110_742-267856-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 17:27:12,144 WARNING [train.py:1204] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_74-48717-0 from training. Duration: 0.58 2023-10-02 17:27:12,242 WARNING [train.py:1204] (2/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_18-46756-0 from training. Duration: 0.79 2023-10-02 17:27:13,559 WARNING [train.py:1204] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_612-56061-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 17:27:13,568 WARNING [train.py:1204] (2/4) Exclude cut with ID _56423_210_5854_1_1534896061049_3165969_412-2403-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 17:27:13,631 WARNING [train.py:1204] (2/4) Exclude cut with ID _62574_210_2655_1_1535180407058_3933249_120-196848-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 17:27:13,631 WARNING [train.py:1204] (2/4) Exclude cut with ID _40519_210_13794_1_1533715228655_7337390_916-51000-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 17:27:16,979 WARNING [train.py:1204] (2/4) Exclude cut with ID _65263_210_22356_1_1535763598691_3921037_108-21902-0_sp1.1 from training. Duration: 0.9 2023-10-02 17:27:18,356 WARNING [train.py:1204] (2/4) Exclude cut with ID _4122_210_2084_1_1526713164417_4353280_528-206771-0_sp1.1 from training. Duration: 0.8 2023-10-02 17:27:21,629 WARNING [train.py:1204] (2/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_43-325657-0_sp1.1 from training. Duration: 0.92725 2023-10-02 17:27:21,679 WARNING [train.py:1204] (2/4) Exclude cut with ID _44659_210_2721_1_1534036120061_6614860_314-82041-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 17:27:22,987 WARNING [train.py:1204] (2/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_81-300212-0_sp0.9 from training. Duration: 0.6666875 2023-10-02 17:27:24,251 WARNING [train.py:1204] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_665-196155-0_sp0.9 from training. Duration: 0.7444375 2023-10-02 17:27:27,045 WARNING [train.py:1204] (2/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_140-336881-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 17:27:27,122 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6290_210_3273_1_1527595224_1282787_115-87095-0_sp0.9 from training. Duration: 0.611125 2023-10-02 17:27:27,289 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.conv_module1.balancer1.prob, batch_count=960113.3333333334, ans=0.125 2023-10-02 17:27:28,457 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_53373_210_5399_1_1534229471_4715695_607-259685-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 17:27:29,857 WARNING [train.py:1204] (2/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_175-163894-0_sp1.1 from training. Duration: 0.67275 2023-10-02 17:27:29,875 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7002_210_1794_1_1527939683_5157286_1-281264-0_sp1.1 from training. Duration: 0.4 2023-10-02 17:27:36,634 WARNING [train.py:1204] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_605-257205-0 from training. Duration: 0.59 2023-10-02 17:27:38,716 WARNING [train.py:1204] (2/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_454-314901-0 from training. Duration: 0.46 2023-10-02 17:27:40,085 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_46032_210_14656_1_1533781412_4507969_287-130422-0_sp1.1 from training. Duration: 0.97275 2023-10-02 17:27:41,849 INFO [train.py:1046] (2/4) Epoch 28, batch 600, loss[loss=0.1779, simple_loss=0.2638, pruned_loss=0.046, over 23987.00 frames. ], tot_loss[loss=0.1677, simple_loss=0.2459, pruned_loss=0.04477, over 4477441.51 frames. ], batch size: 86, lr: 3.68e-03, grad_scale: 8.0 2023-10-02 17:27:41,862 WARNING [train.py:1204] (2/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_186-309161-0_sp1.1 from training. Duration: 0.4909375 2023-10-02 17:27:41,915 WARNING [train.py:1204] (2/4) Exclude cut with ID _42659_210_2070_1_1533607442765_3120380_45-342307-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 17:27:49,527 WARNING [train.py:1204] (2/4) Exclude cut with ID _12555_210_8750_1_1530075594402_3780239_478-326818-0_sp1.1 from training. Duration: 0.963625 2023-10-02 17:27:50,817 WARNING [train.py:1204] (2/4) Exclude cut with ID _7606_210_4915_1_1528894253277_5080690_127-177761-0_sp1.1 from training. Duration: 0.5818125 2023-10-02 17:27:52,710 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6788_210_1388_1_1527898698_4562021_91-197981-0 from training. Duration: 0.83 2023-10-02 17:27:54,086 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8558_210_1410_1_1528505225_7613406_131-329524-0_sp0.9 from training. Duration: 0.87775 2023-10-02 17:27:55,622 WARNING [train.py:1204] (2/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_153-153537-0_sp1.1 from training. Duration: 0.82725 2023-10-02 17:27:58,274 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_17292_210_6753_1_1531125388_2943734_285-349105-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 17:28:01,021 WARNING [train.py:1204] (2/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_37-41919-0 from training. Duration: 0.87 2023-10-02 17:28:01,055 WARNING [train.py:1204] (2/4) Exclude cut with ID _60860_210_8254_1_1534896015875_3557169_172-294852-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 17:28:05,193 WARNING [train.py:1204] (2/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_146-276944-0 from training. Duration: 0.83 2023-10-02 17:28:09,881 WARNING [train.py:1204] (2/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_1254-44082-0_sp1.1 from training. Duration: 0.936375 2023-10-02 17:28:09,896 WARNING [train.py:1204] (2/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_1115-180350-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 17:28:09,925 WARNING [train.py:1204] (2/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_60-146189-0_sp1.1 from training. Duration: 0.8909375 2023-10-02 17:28:16,454 WARNING [train.py:1204] (2/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_517-153649-0_sp1.1 from training. Duration: 0.92725 2023-10-02 17:28:16,473 WARNING [train.py:1204] (2/4) Exclude cut with ID _39571_210_13449_1_1533801584143_3651077_37-266257-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 17:28:16,522 WARNING [train.py:1204] (2/4) Exclude cut with ID _6653_210_2629_1_1527919172441_6732679_527-276118-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 17:28:21,290 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.feed_forward1.hidden_balancer.prob, batch_count=960313.3333333334, ans=0.125 2023-10-02 17:28:23,739 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7696_210_2040_1_1528207167_3716099_117-125434-0_sp1.1 from training. Duration: 0.6909375 2023-10-02 17:28:26,779 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.ff3_skip_rate, batch_count=960380.0, ans=0.0 2023-10-02 17:28:29,109 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_25767_210_2161_1_1532307483_3867748_478-156360-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 17:28:29,113 WARNING [train.py:1204] (2/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_177-49092-0_sp1.1 from training. Duration: 0.82725 2023-10-02 17:28:29,120 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_11305_210_1794_1_1529750428_7178148_680-317073-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 17:28:30,855 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.ff3_skip_rate, batch_count=960380.0, ans=0.0 2023-10-02 17:28:32,648 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.3.encoder.layers.2.conv_module2.whiten, num_groups=1, num_channels=512, metric=4.25 vs. limit=15.0 2023-10-02 17:28:33,182 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.480e+02 1.831e+02 2.042e+02 2.282e+02 3.339e+02, threshold=4.085e+02, percent-clipped=0.0 2023-10-02 17:28:36,059 WARNING [train.py:1204] (2/4) Exclude cut with ID _53565_210_10635_1_1534337218335_3820259_287-18120-0 from training. Duration: 0.97 2023-10-02 17:28:40,268 WARNING [train.py:1204] (2/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_498-40153-0_sp1.1 from training. Duration: 0.57275 2023-10-02 17:28:40,892 WARNING [train.py:1204] (2/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_555-63066-0_sp1.1 from training. Duration: 0.963625 2023-10-02 17:28:45,159 WARNING [train.py:1204] (2/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_395-310220-0 from training. Duration: 0.89 2023-10-02 17:28:47,048 WARNING [train.py:1204] (2/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_937-208281-0_sp1.1 from training. Duration: 0.77275 2023-10-02 17:28:48,432 WARNING [train.py:1204] (2/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_120-211630-0 from training. Duration: 0.92 2023-10-02 17:28:49,816 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_454-276429-0_sp1.1 from training. Duration: 0.7909375 2023-10-02 17:28:49,852 WARNING [train.py:1204] (2/4) Exclude cut with ID _22826_210_9994_1_1531908201522_3279169_404-75889-0_sp1.1 from training. Duration: 0.8090625 2023-10-02 17:28:55,749 INFO [train.py:1046] (2/4) Epoch 28, batch 650, loss[loss=0.1564, simple_loss=0.2428, pruned_loss=0.03505, over 24639.00 frames. ], tot_loss[loss=0.1673, simple_loss=0.2455, pruned_loss=0.04453, over 4530385.92 frames. ], batch size: 68, lr: 3.68e-03, grad_scale: 8.0 2023-10-02 17:28:55,827 WARNING [train.py:1204] (2/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_282-136641-0_sp0.9 from training. Duration: 0.4666875 2023-10-02 17:28:57,220 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_809-17156-0_sp1.1 from training. Duration: 0.663625 2023-10-02 17:28:57,367 WARNING [train.py:1204] (2/4) Exclude cut with ID _9235_210_5219_1_1528891154253_8046120_1003-101638-0_sp0.9 from training. Duration: 0.811125 2023-10-02 17:28:58,622 WARNING [train.py:1204] (2/4) Exclude cut with ID _22826_210_9994_1_1531908201522_3279169_404-75889-0_sp0.9 from training. Duration: 0.988875 2023-10-02 17:28:58,883 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=960513.3333333334, ans=0.1 2023-10-02 17:29:01,301 WARNING [train.py:1204] (2/4) Exclude cut with ID _59288_210_5854_1_1535077757774_3412246_570-295255-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 17:29:03,839 WARNING [train.py:1204] (2/4) Exclude cut with ID _3465_210_3525_1_1526175667060_3574049_412-287526-0 from training. Duration: 0.72 2023-10-02 17:29:05,091 WARNING [train.py:1204] (2/4) Exclude cut with ID _44010_210_4525_1_1533884262964_4013620_649-79655-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 17:29:05,436 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.out_combiner.scale_min, batch_count=960513.3333333334, ans=0.2 2023-10-02 17:29:06,829 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.bypass_mid.scale_min, batch_count=960513.3333333334, ans=0.2 2023-10-02 17:29:10,653 WARNING [train.py:1204] (2/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_57-270037-0_sp0.9 from training. Duration: 0.8555625 2023-10-02 17:29:10,654 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_34815_210_6388_1_1533381320_3571903_302-255963-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 17:29:14,043 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_47449_210_5399_1_1534076445_4006929_573-301852-0_sp1.1 from training. Duration: 0.97275 2023-10-02 17:29:18,724 WARNING [train.py:1204] (2/4) Exclude cut with ID 6709-74022-0004-86860-0_sp1.1 from training. Duration: 0.9409375 2023-10-02 17:29:19,937 WARNING [train.py:1204] (2/4) Exclude cut with ID _40519_210_13794_1_1533715228655_7337390_111-71991-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 17:29:19,978 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_30167_210_3528_1_1532428012_4772375_169-214186-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 17:29:23,353 WARNING [train.py:1204] (2/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_20-235313-0_sp1.1 from training. Duration: 0.963625 2023-10-02 17:29:23,388 WARNING [train.py:1204] (2/4) Exclude cut with ID _4681_210_3525_1_1527251040321_1235713_123-24428-0_sp0.9 from training. Duration: 0.5333125 2023-10-02 17:29:23,699 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.nonlin_attention.balancer.max_positive, batch_count=960646.6666666666, ans=0.95 2023-10-02 17:29:26,150 WARNING [train.py:1204] (2/4) Exclude cut with ID _31602_210_5854_1_1532677021456_4458250_444-198752-0_sp1.1 from training. Duration: 0.97275 2023-10-02 17:29:26,212 WARNING [train.py:1204] (2/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_9-279442-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 17:29:27,539 WARNING [train.py:1204] (2/4) Exclude cut with ID _6275_210_2084_1_1528110062213_7669049_973-198085-0_sp0.9 from training. Duration: 0.8333125 2023-10-02 17:29:27,625 WARNING [train.py:1204] (2/4) Exclude cut with ID _29853_210_7497_1_1532505600851_6642110_567-250221-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 17:29:28,958 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6545_210_2040_1_1527774670_4245258_407-48070-0_sp1.1 from training. Duration: 0.6909375 2023-10-02 17:29:32,946 WARNING [train.py:1204] (2/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_224-343295-0_sp0.9 from training. Duration: 0.611125 2023-10-02 17:29:32,968 WARNING [train.py:1204] (2/4) Exclude cut with ID IC0125W0011-61817 from training. Duration: 0.88 2023-10-02 17:29:32,983 WARNING [train.py:1204] (2/4) Exclude cut with ID _60113_210_16271_1_1535164212481_4386079_435-154371-0_sp1.1 from training. Duration: 0.97275 2023-10-02 17:29:32,995 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_25836_210_16554_1_1532138167_53197_3-9394-0_sp1.1 from training. Duration: 0.92725 2023-10-02 17:29:35,816 WARNING [train.py:1204] (2/4) Exclude cut with ID _29343_210_4381_1_1532602757752_3674109_57-55125-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 17:29:35,906 WARNING [train.py:1204] (2/4) Exclude cut with ID _56235_210_10640_1_1534557335235_4161099_267-23560-0_sp1.1 from training. Duration: 0.936375 2023-10-02 17:29:37,215 WARNING [train.py:1204] (2/4) Exclude cut with ID _30196_210_1664_1_1533708496522_7074140_1150-278867-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 17:29:37,274 WARNING [train.py:1204] (2/4) Exclude cut with ID _6270_210_2084_1_1527850911573_3856420_26-277343-0_sp0.9 from training. Duration: 0.911125 2023-10-02 17:29:38,660 WARNING [train.py:1204] (2/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_325-62802-0 from training. Duration: 0.87 2023-10-02 17:29:38,732 WARNING [train.py:1204] (2/4) Exclude cut with ID _56524_210_9642_1_1534572011779_3619170_145-310842-0_sp1.1 from training. Duration: 0.9 2023-10-02 17:29:40,093 WARNING [train.py:1204] (2/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_860-96198-0_sp0.9 from training. Duration: 0.811125 2023-10-02 17:29:41,902 WARNING [train.py:1204] (2/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_17-25045-0_sp1.1 from training. Duration: 0.536375 2023-10-02 17:29:41,925 WARNING [train.py:1204] (2/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_349-69727-0_sp1.1 from training. Duration: 0.936375 2023-10-02 17:29:43,187 WARNING [train.py:1204] (2/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_580-93798-0_sp1.1 from training. Duration: 0.5090625 2023-10-02 17:29:44,596 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7762_210_1794_1_1528199508_4057408_419-125009-0 from training. Duration: 0.83 2023-10-02 17:29:46,474 WARNING [train.py:1204] (2/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_356-167816-0 from training. Duration: 0.72 2023-10-02 17:29:46,501 WARNING [train.py:1204] (2/4) Exclude cut with ID _29345_210_4381_1_1532689168263_3860169_225-97100-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 17:29:48,331 WARNING [train.py:1204] (2/4) Exclude cut with ID _14227_210_4381_1_1530701804286_3940259_374-194712-0_sp1.1 from training. Duration: 0.92725 2023-10-02 17:29:48,357 WARNING [train.py:1204] (2/4) Exclude cut with ID _40137_210_12210_1_1533290484046_3795180_249-187059-0_sp0.9 from training. Duration: 0.97775 2023-10-02 17:29:48,383 WARNING [train.py:1204] (2/4) Exclude cut with ID _22826_210_9994_1_1531908201522_3279169_389-317194-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 17:29:51,571 WARNING [train.py:1204] (2/4) Exclude cut with ID _27485_210_4916_1_1532739191677_3668269_239-122918-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 17:29:55,818 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_2245_210_2715_1_1525511548_3540254_128-117609-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 17:29:55,855 WARNING [train.py:1204] (2/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_160-276178-0_sp1.1 from training. Duration: 0.963625 2023-10-02 17:29:57,236 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_31920_210_10633_1_1532860009_1686094_1-279735-0_sp1.1 from training. Duration: 0.97275 2023-10-02 17:29:58,839 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.ff3_skip_rate, batch_count=960780.0, ans=0.0 2023-10-02 17:29:59,917 WARNING [train.py:1204] (2/4) Exclude cut with ID _51078_210_9070_1_1534161557424_3805250_325-262383-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 17:29:59,947 WARNING [train.py:1204] (2/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_643-52007-0_sp0.9 from training. Duration: 0.6333125 2023-10-02 17:30:01,310 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_51937_210_5014_1_1534573610_4022025_518-299248-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 17:30:08,008 INFO [train.py:1046] (2/4) Epoch 28, batch 700, loss[loss=0.1745, simple_loss=0.254, pruned_loss=0.04751, over 23480.00 frames. ], tot_loss[loss=0.1652, simple_loss=0.243, pruned_loss=0.04374, over 4576830.70 frames. ], batch size: 120, lr: 3.68e-03, grad_scale: 8.0 2023-10-02 17:30:08,055 WARNING [train.py:1204] (2/4) Exclude cut with ID _13252_210_1925_1_1530671372047_3336909_166-314607-0_sp1.1 from training. Duration: 0.4909375 2023-10-02 17:30:08,058 WARNING [train.py:1204] (2/4) Exclude cut with ID _44778_210_2054_1_1533798127203_7443090_1087-282939-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 17:30:08,098 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_23850_210_4238_1_1532232597_1632433_32-185135-0_sp1.1 from training. Duration: 0.7818125 2023-10-02 17:30:09,442 WARNING [train.py:1204] (2/4) Exclude cut with ID _59500_210_8254_1_1534809610680_3736009_145-89359-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 17:30:15,203 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_340-336408-0 from training. Duration: 0.93 2023-10-02 17:30:15,272 WARNING [train.py:1204] (2/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_9-21737-0 from training. Duration: 0.97 2023-10-02 17:30:18,476 WARNING [train.py:1204] (2/4) Exclude cut with ID _2220_210_1819_1_1525509109898_3658680_50-99837-0 from training. Duration: 0.54 2023-10-02 17:30:18,539 WARNING [train.py:1204] (2/4) Exclude cut with ID _30304_210_6319_1_1532476827261_7582060_879-299350-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 17:30:21,797 WARNING [train.py:1204] (2/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_905-160074-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 17:30:24,543 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.1.encoder.layers.0.nonlin_attention.whiten2, num_groups=1, num_channels=256, metric=5.36 vs. limit=15.0 2023-10-02 17:30:24,963 WARNING [train.py:1204] (2/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_395-73570-0 from training. Duration: 0.99 2023-10-02 17:30:27,861 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7264_210_4938_1_1527939911_8132095_800-91995-0_sp1.1 from training. Duration: 0.7818125 2023-10-02 17:30:29,226 WARNING [train.py:1204] (2/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_77-311404-0_sp1.1 from training. Duration: 0.8545625 2023-10-02 17:30:30,564 WARNING [train.py:1204] (2/4) Exclude cut with ID _13679_210_7497_1_1530534293662_3355098_107-80772-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 17:30:31,351 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.3.encoder.layers.3.nonlin_attention.whiten2, num_groups=1, num_channels=512, metric=6.23 vs. limit=15.0 2023-10-02 17:30:32,091 WARNING [train.py:1204] (2/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_224-343295-0_sp1.1 from training. Duration: 0.5 2023-10-02 17:30:32,134 WARNING [train.py:1204] (2/4) Exclude cut with ID _54871_210_22356_1_1534665798253_3856359_156-253482-0_sp1.1 from training. Duration: 0.963625 2023-10-02 17:30:34,843 WARNING [train.py:1204] (2/4) Exclude cut with ID _49728_210_12651_1_1534041034294_6788150_309-125532-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 17:30:36,288 WARNING [train.py:1204] (2/4) Exclude cut with ID _13913_210_9994_1_1530579615187_3383897_473-277682-0_sp0.9 from training. Duration: 0.4333125 2023-10-02 17:30:37,567 WARNING [train.py:1204] (2/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_3-276701-0_sp1.1 from training. Duration: 0.82725 2023-10-02 17:30:37,723 WARNING [train.py:1204] (2/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_282-136641-0 from training. Duration: 0.42 2023-10-02 17:30:40,440 WARNING [train.py:1204] (2/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_127-566-0 from training. Duration: 0.84 2023-10-02 17:30:45,093 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6163_210_6400_1_1527506746_4263068_283-137811-0_sp1.1 from training. Duration: 0.7 2023-10-02 17:30:45,121 WARNING [train.py:1204] (2/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_34-183685-0_sp1.1 from training. Duration: 0.8909375 2023-10-02 17:30:47,090 WARNING [train.py:1204] (2/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_154-135696-0_sp0.9 from training. Duration: 0.888875 2023-10-02 17:30:47,203 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.feed_forward1.hidden_balancer.prob, batch_count=960980.0, ans=0.125 2023-10-02 17:30:51,042 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.2.encoder.layers.1.conv_module1.whiten, num_groups=1, num_channels=384, metric=9.15 vs. limit=15.0 2023-10-02 17:30:52,308 WARNING [train.py:1204] (2/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_676-51014-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 17:30:52,373 WARNING [train.py:1204] (2/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_38-169920-0 from training. Duration: 0.91 2023-10-02 17:30:56,587 WARNING [train.py:1204] (2/4) Exclude cut with ID _6653_210_2629_1_1527919172441_6732679_747-114465-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 17:30:56,643 WARNING [train.py:1204] (2/4) Exclude cut with ID _8021_210_7994_1_1528376451358_3653069_100-140231-0_sp1.1 from training. Duration: 0.6090625 2023-10-02 17:30:56,667 WARNING [train.py:1204] (2/4) Exclude cut with ID _64135_210_2253_1_1535677267876_7608972_133-188266-0 from training. Duration: 0.81 2023-10-02 17:31:00,697 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.450e+02 1.837e+02 1.990e+02 2.293e+02 3.578e+02, threshold=3.980e+02, percent-clipped=0.0 2023-10-02 17:31:00,802 WARNING [train.py:1204] (2/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_714-310538-0_sp1.1 from training. Duration: 0.7818125 2023-10-02 17:31:02,227 WARNING [train.py:1204] (2/4) Exclude cut with ID _39867_210_9826_1_1533553222030_9344259_1134-288498-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 17:31:04,445 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.0.layers.1.feed_forward3.out_whiten, num_groups=1, num_channels=192, metric=9.48 vs. limit=15.0 2023-10-02 17:31:04,877 WARNING [train.py:1204] (2/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_211-237289-0_sp1.1 from training. Duration: 0.936375 2023-10-02 17:31:09,091 WARNING [train.py:1204] (2/4) Exclude cut with ID _59202_210_26984_1_1534816810612_3549174_137-318710-0_sp0.9 from training. Duration: 0.988875 2023-10-02 17:31:09,126 WARNING [train.py:1204] (2/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_86-66192-0 from training. Duration: 0.55 2023-10-02 17:31:13,189 WARNING [train.py:1204] (2/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_540-275685-0 from training. Duration: 0.89 2023-10-02 17:31:14,893 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8776_210_2040_1_1528638837_4088036_244-278763-0 from training. Duration: 0.9 2023-10-02 17:31:18,167 WARNING [train.py:1204] (2/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_9-219972-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 17:31:19,999 WARNING [train.py:1204] (2/4) Exclude cut with ID _44659_210_2721_1_1534036120061_6614860_656-283823-0_sp1.1 from training. Duration: 0.92725 2023-10-02 17:31:20,073 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_69630_210_7234_1_1535789601_661049_77-216385-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 17:31:22,853 INFO [train.py:1046] (2/4) Epoch 28, batch 750, loss[loss=0.1529, simple_loss=0.2316, pruned_loss=0.0371, over 24353.00 frames. ], tot_loss[loss=0.1647, simple_loss=0.2427, pruned_loss=0.04337, over 4613889.86 frames. ], batch size: 61, lr: 3.68e-03, grad_scale: 8.0 2023-10-02 17:31:22,938 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_19952_210_2161_1_1531875469_3851750_210-308919-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 17:31:22,943 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_4737_210_2040_1_1526998689_3293008_378-178650-0 from training. Duration: 0.95 2023-10-02 17:31:27,662 WARNING [train.py:1204] (2/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_224-343295-0 from training. Duration: 0.55 2023-10-02 17:31:27,677 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7002_210_1794_1_1527939683_5157286_184-229630-0 from training. Duration: 0.74 2023-10-02 17:31:27,709 WARNING [train.py:1204] (2/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_6-214815-0 from training. Duration: 0.85 2023-10-02 17:31:29,079 WARNING [train.py:1204] (2/4) Exclude cut with ID _8699_210_2069_1_1528630346831_3960409_160-24408-0 from training. Duration: 0.91 2023-10-02 17:31:29,115 WARNING [train.py:1204] (2/4) Exclude cut with ID _5585_210_2667_1_1527418813324_8007340_89-332072-0 from training. Duration: 0.64 2023-10-02 17:31:30,422 WARNING [train.py:1204] (2/4) Exclude cut with ID _5250_210_5219_1_1527249561806_8692110_528-259800-0_sp1.1 from training. Duration: 0.963625 2023-10-02 17:31:31,822 WARNING [train.py:1204] (2/4) Exclude cut with ID _55383_210_10635_1_1534468426172_2231669_134-128246-0 from training. Duration: 0.89 2023-10-02 17:31:31,901 WARNING [train.py:1204] (2/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_400-338274-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 17:31:31,956 WARNING [train.py:1204] (2/4) Exclude cut with ID _14224_210_4381_1_1530529062037_4264010_364-22817-0_sp0.9 from training. Duration: 0.788875 2023-10-02 17:31:33,361 WARNING [train.py:1204] (2/4) Exclude cut with ID _42863_210_9070_1_1533902398788_3696920_331-291057-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 17:31:36,151 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_5436_210_1794_1_1527331317_2541494_229-211016-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 17:31:36,196 WARNING [train.py:1204] (2/4) Exclude cut with ID _6456_210_1534_1_1527681634212_3181049_345-252877-0_sp0.9 from training. Duration: 0.711125 2023-10-02 17:31:36,224 WARNING [train.py:1204] (2/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_96-81499-0_sp1.1 from training. Duration: 0.92725 2023-10-02 17:31:38,859 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_5575_210_1410_1_1527301305_2570170_342-265572-0_sp1.1 from training. Duration: 0.8545625 2023-10-02 17:31:40,170 WARNING [train.py:1204] (2/4) Exclude cut with ID _5444_210_2151_1_1527418808464_5312210_726-27165-0_sp0.9 from training. Duration: 0.7444375 2023-10-02 17:31:40,306 WARNING [train.py:1204] (2/4) Exclude cut with ID _22083_210_8254_1_1531702862439_3678970_304-327750-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 17:31:41,748 WARNING [train.py:1204] (2/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_245-226199-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 17:31:43,004 WARNING [train.py:1204] (2/4) Exclude cut with ID _42875_210_14856_1_1533704397487_4012433_102-199499-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 17:31:43,882 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.balancer2.prob, batch_count=961246.6666666666, ans=0.125 2023-10-02 17:31:44,917 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_51225_210_3601_1_1534400634_2327383_55-301844-0 from training. Duration: 0.93 2023-10-02 17:31:45,163 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.bypass_mid.scale_min, batch_count=961246.6666666666, ans=0.2 2023-10-02 17:31:46,727 WARNING [train.py:1204] (2/4) Exclude cut with ID _28317_210_12742_1_1532480302381_8510325_916-195848-0_sp1.1 from training. Duration: 0.836375 2023-10-02 17:31:48,071 WARNING [train.py:1204] (2/4) Exclude cut with ID _44718_210_5816_1_1534050051869_6469030_497-311999-0_sp1.1 from training. Duration: 0.936375 2023-10-02 17:31:48,184 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_34886_210_10633_1_1533203882_1755661_91-194765-0_sp1.1 from training. Duration: 0.936375 2023-10-02 17:31:48,381 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.balancer2.prob, batch_count=961246.6666666666, ans=0.125 2023-10-02 17:31:49,541 WARNING [train.py:1204] (2/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_310-26186-0_sp0.9 from training. Duration: 0.77775 2023-10-02 17:31:49,619 WARNING [train.py:1204] (2/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_416-257730-0 from training. Duration: 0.65 2023-10-02 17:31:49,625 WARNING [train.py:1204] (2/4) Exclude cut with ID _9389_210_1664_1_1529217337301_5289269_209-154760-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 17:31:53,018 WARNING [train.py:1204] (2/4) Exclude cut with ID _4092_210_2151_1_1526643044449_3135759_227-213972-0 from training. Duration: 0.54 2023-10-02 17:31:53,055 WARNING [train.py:1204] (2/4) Exclude cut with ID IC0679W0044-335852 from training. Duration: 0.768 2023-10-02 17:31:54,324 WARNING [train.py:1204] (2/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_42-186162-0 from training. Duration: 0.92 2023-10-02 17:31:54,330 WARNING [train.py:1204] (2/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_302-2550-0_sp0.9 from training. Duration: 0.9 2023-10-02 17:31:54,344 WARNING [train.py:1204] (2/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_356-31216-0_sp0.9 from training. Duration: 0.8333125 2023-10-02 17:31:56,325 WARNING [train.py:1204] (2/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_125-322172-0_sp1.1 from training. Duration: 0.8454375 2023-10-02 17:32:03,539 WARNING [train.py:1204] (2/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_141-89727-0_sp0.9 from training. Duration: 0.788875 2023-10-02 17:32:04,815 WARNING [train.py:1204] (2/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_647-337784-0_sp1.1 from training. Duration: 0.97275 2023-10-02 17:32:04,837 WARNING [train.py:1204] (2/4) Exclude cut with ID _6493_210_1747_1_1528459210424_4012470_513-140967-0_sp1.1 from training. Duration: 0.7181875 2023-10-02 17:32:06,389 WARNING [train.py:1204] (2/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_87-266334-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 17:32:07,848 WARNING [train.py:1204] (2/4) Exclude cut with ID _26528_210_9070_1_1532570410842_3851310_526-261338-0_sp1.1 from training. Duration: 0.92725 2023-10-02 17:32:09,159 WARNING [train.py:1204] (2/4) Exclude cut with ID _14224_210_4381_1_1530529062037_4264010_380-343670-0 from training. Duration: 0.88 2023-10-02 17:32:09,227 WARNING [train.py:1204] (2/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_449-15508-0_sp1.1 from training. Duration: 0.7454375 2023-10-02 17:32:10,827 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.conv_module1.balancer1.min_positive, batch_count=961380.0, ans=0.025 2023-10-02 17:32:11,951 WARNING [train.py:1204] (2/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_372-6869-0_sp0.9 from training. Duration: 0.488875 2023-10-02 17:32:12,011 WARNING [train.py:1204] (2/4) Exclude cut with ID _26907_210_7234_1_1532221740694_3205263_337-2494-0_sp0.9 from training. Duration: 0.97775 2023-10-02 17:32:14,018 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.1.conv_module1.balancer2.min_abs, batch_count=961380.0, ans=0.5 2023-10-02 17:32:15,196 WARNING [train.py:1204] (2/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_10-100367-0_sp1.1 from training. Duration: 0.8909375 2023-10-02 17:32:15,230 WARNING [train.py:1204] (2/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_47-167373-0 from training. Duration: 0.92 2023-10-02 17:32:16,615 WARNING [train.py:1204] (2/4) Exclude cut with ID _28323_210_16187_1_1532480395667_4100300_628-282179-0_sp1.1 from training. Duration: 0.97275 2023-10-02 17:32:22,542 WARNING [train.py:1204] (2/4) Exclude cut with ID _20455_210_5219_1_1531565996775_8098100_213-280898-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 17:32:23,985 WARNING [train.py:1204] (2/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_383-316000-0_sp1.1 from training. Duration: 0.8181875 2023-10-02 17:32:25,385 WARNING [train.py:1204] (2/4) Exclude cut with ID _18513_210_9206_1_1531618215223_3862620_16-78180-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 17:32:25,646 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.attention_skip_rate, batch_count=961446.6666666666, ans=0.0 2023-10-02 17:32:26,960 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=961446.6666666666, ans=0.1 2023-10-02 17:32:28,128 WARNING [train.py:1204] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_330-327861-0_sp1.1 from training. Duration: 0.6090625 2023-10-02 17:32:29,648 WARNING [train.py:1204] (2/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_356-31216-0 from training. Duration: 0.75 2023-10-02 17:32:29,675 WARNING [train.py:1204] (2/4) Exclude cut with ID _25248_210_8750_1_1532062825941_5554180_255-148539-0_sp1.1 from training. Duration: 0.963625 2023-10-02 17:32:29,711 WARNING [train.py:1204] (2/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_269-154726-0_sp1.1 from training. Duration: 0.7818125 2023-10-02 17:32:33,139 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7002_210_1794_1_1527939683_5157286_163-87552-0_sp1.1 from training. Duration: 0.7818125 2023-10-02 17:32:34,466 WARNING [train.py:1204] (2/4) Exclude cut with ID _54871_210_22356_1_1534665798253_3856359_97-151896-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 17:32:37,109 INFO [train.py:1046] (2/4) Epoch 28, batch 800, loss[loss=0.1655, simple_loss=0.2583, pruned_loss=0.03631, over 24430.00 frames. ], tot_loss[loss=0.166, simple_loss=0.2439, pruned_loss=0.04406, over 4631143.69 frames. ], batch size: 69, lr: 3.68e-03, grad_scale: 16.0 2023-10-02 17:32:37,174 WARNING [train.py:1204] (2/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_207-7026-0_sp1.1 from training. Duration: 0.97275 2023-10-02 17:32:37,193 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_92-46009-0_sp0.9 from training. Duration: 0.6 2023-10-02 17:32:37,432 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.conv_skip_rate, batch_count=961513.3333333334, ans=0.0 2023-10-02 17:32:40,633 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.2.encoder.layers.2.feed_forward2.out_whiten, num_groups=1, num_channels=384, metric=5.16 vs. limit=15.0 2023-10-02 17:32:44,021 WARNING [train.py:1204] (2/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_122-178847-0_sp1.1 from training. Duration: 0.97275 2023-10-02 17:32:44,024 WARNING [train.py:1204] (2/4) Exclude cut with ID _44778_210_2054_1_1533798127203_7443090_94-348956-0_sp1.1 from training. Duration: 0.92725 2023-10-02 17:32:46,070 WARNING [train.py:1204] (2/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_1018-244089-0_sp1.1 from training. Duration: 0.963625 2023-10-02 17:32:46,082 WARNING [train.py:1204] (2/4) Exclude cut with ID _56235_210_10640_1_1534557335235_4161099_87-118423-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 17:32:47,403 WARNING [train.py:1204] (2/4) Exclude cut with ID _53713_210_24116_1_1534314487762_7416751_504-212292-0_sp1.1 from training. Duration: 0.92725 2023-10-02 17:32:47,431 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_446-150327-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 17:32:48,884 WARNING [train.py:1204] (2/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_682-245989-0_sp1.1 from training. Duration: 0.92725 2023-10-02 17:32:53,503 WARNING [train.py:1204] (2/4) Exclude cut with ID _35723_210_1794_1_1533088341043_628250_8-234524-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 17:32:54,869 WARNING [train.py:1204] (2/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_64-248359-0_sp0.9 from training. Duration: 0.7666875 2023-10-02 17:32:57,611 WARNING [train.py:1204] (2/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_49-97158-0 from training. Duration: 0.76 2023-10-02 17:32:57,659 WARNING [train.py:1204] (2/4) Exclude cut with ID _24314_210_4915_1_1532135078116_7170179_743-915-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 17:32:59,093 WARNING [train.py:1204] (2/4) Exclude cut with ID _61194_210_18628_1_1535007555628_3750909_353-323468-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 17:32:59,122 WARNING [train.py:1204] (2/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_387-131538-0_sp1.1 from training. Duration: 0.736375 2023-10-02 17:32:59,152 WARNING [train.py:1204] (2/4) Exclude cut with ID _44424_210_18163_1_1533623394848_1213110_79-187603-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 17:32:59,189 WARNING [train.py:1204] (2/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_86-19601-0 from training. Duration: 0.85 2023-10-02 17:32:59,209 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_50050_210_16223_1_1535435796_7290138_103-283529-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 17:33:00,601 WARNING [train.py:1204] (2/4) Exclude cut with ID _13913_210_9994_1_1530579615187_3383897_473-277682-0 from training. Duration: 0.39 2023-10-02 17:33:03,984 WARNING [train.py:1204] (2/4) Exclude cut with ID _42326_210_7770_1_1533814282531_3707269_368-68029-0_sp1.1 from training. Duration: 0.92725 2023-10-02 17:33:06,616 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_41428_210_2161_1_1533774184_3992532_446-339645-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 17:33:09,422 WARNING [train.py:1204] (2/4) Exclude cut with ID _69584_210_5219_1_1535785133775_4044450_325-7656-0_sp1.1 from training. Duration: 0.7818125 2023-10-02 17:33:09,430 WARNING [train.py:1204] (2/4) Exclude cut with ID _21632_210_2253_1_1531994001765_3744140_134-184387-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 17:33:12,186 WARNING [train.py:1204] (2/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_550-184707-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 17:33:12,210 WARNING [train.py:1204] (2/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_106-341780-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 17:33:15,112 WARNING [train.py:1204] (2/4) Exclude cut with ID _45871_210_5665_1_1533864614245_3296060_253-132552-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 17:33:15,156 WARNING [train.py:1204] (2/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_395-310220-0_sp1.1 from training. Duration: 0.8090625 2023-10-02 17:33:17,068 WARNING [train.py:1204] (2/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_795-33252-0_sp0.9 from training. Duration: 0.411125 2023-10-02 17:33:18,558 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9002W0003-495962 from training. Duration: 0.946 2023-10-02 17:33:19,697 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_43-71989-0 from training. Duration: 0.57 2023-10-02 17:33:19,710 WARNING [train.py:1204] (2/4) Exclude cut with ID _8699_210_2069_1_1528630346831_3960409_155-39766-0_sp1.1 from training. Duration: 0.7090625 2023-10-02 17:33:19,723 WARNING [train.py:1204] (2/4) Exclude cut with ID _27894_210_12392_1_1532415496078_6902439_788-232684-0_sp1.1 from training. Duration: 0.97275 2023-10-02 17:33:21,111 WARNING [train.py:1204] (2/4) Exclude cut with ID _56494_210_4220_1_1535284837625_3033169_10-39296-0_sp1.1 from training. Duration: 0.92725 2023-10-02 17:33:21,139 WARNING [train.py:1204] (2/4) Exclude cut with ID _40901_210_5665_1_1533363789958_3856130_341-20819-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 17:33:27,250 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9029W0435-509822 from training. Duration: 0.896 2023-10-02 17:33:27,282 WARNING [train.py:1204] (2/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_51-117576-0 from training. Duration: 0.4 2023-10-02 17:33:28,511 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.528e+02 1.943e+02 2.202e+02 2.642e+02 5.405e+02, threshold=4.403e+02, percent-clipped=5.0 2023-10-02 17:33:29,914 WARNING [train.py:1204] (2/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_197-28164-0_sp1.1 from training. Duration: 0.72725 2023-10-02 17:33:30,057 WARNING [train.py:1204] (2/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_145-137790-0_sp1.1 from training. Duration: 0.7545625 2023-10-02 17:33:33,578 WARNING [train.py:1204] (2/4) Exclude cut with ID _47790_210_16187_1_1533862814066_4207519_2-39653-0_sp1.1 from training. Duration: 0.8909375 2023-10-02 17:33:37,673 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_3780_210_2087_1_1526467760_2130483_186-208240-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 17:33:39,071 WARNING [train.py:1204] (2/4) Exclude cut with ID _24055_210_12651_1_1532310907143_976979_92-59356-0 from training. Duration: 0.91 2023-10-02 17:33:39,118 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_3910_210_1794_1_1526742306_334530_28-238909-0_sp1.1 from training. Duration: 0.736375 2023-10-02 17:33:39,409 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.self_attn_weights.pos_emb_skip_rate, batch_count=961780.0, ans=0.0 2023-10-02 17:33:40,587 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.bypass_mid.scale_min, batch_count=961780.0, ans=0.2 2023-10-02 17:33:41,686 WARNING [train.py:1204] (2/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_269-154726-0 from training. Duration: 0.86 2023-10-02 17:33:46,443 WARNING [train.py:1204] (2/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_326-165900-0_sp0.9 from training. Duration: 0.7666875 2023-10-02 17:33:49,240 WARNING [train.py:1204] (2/4) Exclude cut with ID _5294_210_1747_1_1527249643928_3696179_470-276077-0_sp1.1 from training. Duration: 0.963625 2023-10-02 17:33:49,281 WARNING [train.py:1204] (2/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_372-131163-0 from training. Duration: 0.94 2023-10-02 17:33:49,450 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.out_combiner.scale_min, batch_count=961846.6666666666, ans=0.2 2023-10-02 17:33:50,495 INFO [train.py:1046] (2/4) Epoch 28, batch 850, loss[loss=0.156, simple_loss=0.2301, pruned_loss=0.04093, over 23681.00 frames. ], tot_loss[loss=0.1664, simple_loss=0.2446, pruned_loss=0.04417, over 4649498.30 frames. ], batch size: 149, lr: 3.68e-03, grad_scale: 16.0 2023-10-02 17:33:50,604 WARNING [train.py:1204] (2/4) Exclude cut with ID _45297_210_2721_1_1533715351381_3264949_351-147136-0_sp1.1 from training. Duration: 0.7909375 2023-10-02 17:33:50,688 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_1072-12671-0_sp1.1 from training. Duration: 0.92725 2023-10-02 17:33:52,545 WARNING [train.py:1204] (2/4) Exclude cut with ID _15133_210_6228_1_1530794512074_3080379_54-198193-0 from training. Duration: 0.94 2023-10-02 17:33:52,564 WARNING [train.py:1204] (2/4) Exclude cut with ID _30196_210_1664_1_1533708496522_7074140_1194-270988-0_sp1.1 from training. Duration: 0.97275 2023-10-02 17:33:54,063 WARNING [train.py:1204] (2/4) Exclude cut with ID _50310_210_15488_1_1534068043345_3641080_289-193637-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 17:33:55,447 WARNING [train.py:1204] (2/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_313-263495-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 17:33:56,900 WARNING [train.py:1204] (2/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_18-130336-0_sp1.1 from training. Duration: 0.7181875 2023-10-02 17:33:57,124 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.feed_forward3.hidden_balancer.prob, batch_count=961846.6666666666, ans=0.125 2023-10-02 17:33:58,127 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_47896_210_20508_1_1534492518_5958868_646-287209-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 17:33:59,529 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_294-163793-0 from training. Duration: 0.82 2023-10-02 17:33:59,571 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_8-319957-0 from training. Duration: 0.96 2023-10-02 17:33:59,574 WARNING [train.py:1204] (2/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_1211-226066-0 from training. Duration: 0.76 2023-10-02 17:34:01,027 WARNING [train.py:1204] (2/4) Exclude cut with ID _4779_210_2084_1_1527318022944_3914893_500-285013-0_sp0.9 from training. Duration: 0.7666875 2023-10-02 17:34:01,056 WARNING [train.py:1204] (2/4) Exclude cut with ID _22826_210_9994_1_1531908201522_3279169_347-232268-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 17:34:04,180 WARNING [train.py:1204] (2/4) Exclude cut with ID _29877_210_6228_1_1532397791106_5276149_111-55056-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 17:34:04,222 WARNING [train.py:1204] (2/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_812-271646-0_sp1.1 from training. Duration: 0.92725 2023-10-02 17:34:05,505 WARNING [train.py:1204] (2/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_235-262124-0_sp1.1 from training. Duration: 0.6090625 2023-10-02 17:34:09,680 WARNING [train.py:1204] (2/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_14-16530-0_sp1.1 from training. Duration: 0.97275 2023-10-02 17:34:09,716 WARNING [train.py:1204] (2/4) Exclude cut with ID _44718_210_5816_1_1534050051869_6469030_568-304512-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 17:34:09,742 WARNING [train.py:1204] (2/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_1033-274376-0 from training. Duration: 0.87 2023-10-02 17:34:13,909 WARNING [train.py:1204] (2/4) Exclude cut with ID _4681_210_3525_1_1527251040321_1235713_123-24428-0 from training. Duration: 0.48 2023-10-02 17:34:16,682 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_37818_210_4238_1_1533620910_4720083_251-288764-0_sp1.1 from training. Duration: 0.97275 2023-10-02 17:34:17,398 WARNING [train.py:1204] (2/4) Exclude cut with ID _65585_210_12210_1_1535450451584_3764098_112-294301-0 from training. Duration: 0.88 2023-10-02 17:34:23,496 WARNING [train.py:1204] (2/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_329-113173-0 from training. Duration: 0.62 2023-10-02 17:34:23,607 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6767_210_3273_1_1527855783_1718932_173-256601-0 from training. Duration: 0.8 2023-10-02 17:34:26,383 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9266W0001-627677 from training. Duration: 0.896 2023-10-02 17:34:26,393 WARNING [train.py:1204] (2/4) Exclude cut with ID _49643_210_7989_1_1534640244224_3939031_611-185515-0_sp1.1 from training. Duration: 0.8545625 2023-10-02 17:34:26,403 WARNING [train.py:1204] (2/4) Exclude cut with ID _31649_210_6228_1_1532584608063_4091078_687-237771-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 17:34:26,415 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_12868_210_6753_1_1530702479_3610564_148-172861-0_sp0.9 from training. Duration: 0.5555625 2023-10-02 17:34:29,184 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_5129_210_6400_1_1527161005_3579054_107-69738-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 17:34:30,500 WARNING [train.py:1204] (2/4) Exclude cut with ID _40137_210_12210_1_1533290484046_3795180_36-301164-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 17:34:30,522 WARNING [train.py:1204] (2/4) Exclude cut with ID _7537_210_2084_1_1528502395431_3924359_113-199933-0 from training. Duration: 0.59 2023-10-02 17:34:30,818 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.balancer1.prob, batch_count=961980.0, ans=0.125 2023-10-02 17:34:33,255 WARNING [train.py:1204] (2/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_547-5424-0_sp1.1 from training. Duration: 0.8545625 2023-10-02 17:34:35,120 WARNING [train.py:1204] (2/4) Exclude cut with ID _43552_210_8913_1_1533726031059_3523429_147-284024-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 17:34:35,190 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_294-163793-0_sp1.1 from training. Duration: 0.7454375 2023-10-02 17:34:36,463 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_3780_210_2087_1_1526467760_2130483_170-259044-0_sp1.1 from training. Duration: 0.636375 2023-10-02 17:34:37,884 WARNING [train.py:1204] (2/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_726-270780-0_sp1.1 from training. Duration: 0.7818125 2023-10-02 17:34:39,265 WARNING [train.py:1204] (2/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_214-291369-0_sp1.1 from training. Duration: 0.57275 2023-10-02 17:34:40,559 WARNING [train.py:1204] (2/4) Exclude cut with ID _21881_210_3525_1_1531825242757_3798380_247-102591-0 from training. Duration: 0.98 2023-10-02 17:34:43,489 WARNING [train.py:1204] (2/4) Exclude cut with ID _8064_210_5399_1_1528372299291_4282973_18-174647-0_sp1.1 from training. Duration: 0.82725 2023-10-02 17:34:43,492 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_415-52611-0_sp1.1 from training. Duration: 0.936375 2023-10-02 17:34:44,867 WARNING [train.py:1204] (2/4) Exclude cut with ID _8394_210_6702_1_1528590504316_3540189_379-49457-0_sp0.9 from training. Duration: 0.9666875 2023-10-02 17:34:44,889 WARNING [train.py:1204] (2/4) Exclude cut with ID _64521_210_14699_1_1535243427259_3815328_368-195106-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 17:34:46,247 WARNING [train.py:1204] (2/4) Exclude cut with ID _28394_210_8913_1_1532430049518_3650118_51-334124-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 17:34:50,336 WARNING [train.py:1204] (2/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_317-161769-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 17:34:52,222 WARNING [train.py:1204] (2/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_40-129756-0_sp0.9 from training. Duration: 0.9 2023-10-02 17:34:52,966 WARNING [train.py:1204] (2/4) Exclude cut with ID _3067_210_2667_1_1526212974640_4503168_119-175776-0_sp0.9 from training. Duration: 0.82225 2023-10-02 17:34:54,355 WARNING [train.py:1204] (2/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_1004-304915-0_sp1.1 from training. Duration: 0.92725 2023-10-02 17:34:54,417 WARNING [train.py:1204] (2/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_359-118926-0_sp0.9 from training. Duration: 0.8 2023-10-02 17:34:59,875 WARNING [train.py:1204] (2/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_51-200220-0_sp0.9 from training. Duration: 0.62225 2023-10-02 17:35:01,266 WARNING [train.py:1204] (2/4) Exclude cut with ID _46683_210_9642_1_1533730913492_4591116_530-326202-0_sp1.1 from training. Duration: 0.936375 2023-10-02 17:35:01,316 WARNING [train.py:1204] (2/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_567-135114-0 from training. Duration: 0.87 2023-10-02 17:35:01,344 WARNING [train.py:1204] (2/4) Exclude cut with ID _66863_210_18053_1_1535698758334_8590319_1092-271784-0_sp1.1 from training. Duration: 0.87275 2023-10-02 17:35:01,360 WARNING [train.py:1204] (2/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_762-282614-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 17:35:04,354 INFO [train.py:1046] (2/4) Epoch 28, batch 900, loss[loss=0.1682, simple_loss=0.2596, pruned_loss=0.03836, over 24436.00 frames. ], tot_loss[loss=0.1666, simple_loss=0.245, pruned_loss=0.04409, over 4680847.33 frames. ], batch size: 69, lr: 3.68e-03, grad_scale: 16.0 2023-10-02 17:35:04,420 WARNING [train.py:1204] (2/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_280-120942-0 from training. Duration: 0.77 2023-10-02 17:35:11,120 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_31583_210_3852_1_1532595851_4024413_27-170931-0_sp1.1 from training. Duration: 0.963625 2023-10-02 17:35:12,551 WARNING [train.py:1204] (2/4) Exclude cut with ID _12369_210_4381_1_1530356078581_4388520_266-271628-0_sp1.1 from training. Duration: 0.92725 2023-10-02 17:35:13,903 WARNING [train.py:1204] (2/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_41-31157-0 from training. Duration: 0.57 2023-10-02 17:35:15,407 WARNING [train.py:1204] (2/4) Exclude cut with ID _21878_210_3525_1_1531738772002_7264330_113-18163-0_sp1.1 from training. Duration: 0.8090625 2023-10-02 17:35:15,446 WARNING [train.py:1204] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_147-111137-0 from training. Duration: 0.95 2023-10-02 17:35:16,821 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1083-220122-0_sp1.1 from training. Duration: 0.463625 2023-10-02 17:35:19,520 WARNING [train.py:1204] (2/4) Exclude cut with ID _14884_210_2252_1_1530707459689_5561250_591-173406-0_sp1.1 from training. Duration: 0.87275 2023-10-02 17:35:19,527 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_24677_210_1794_1_1532002693_4341296_149-303793-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 17:35:19,575 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_12868_210_6753_1_1530702479_3610564_133-564-0_sp1.1 from training. Duration: 0.7181875 2023-10-02 17:35:19,597 WARNING [train.py:1204] (2/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_625-338546-0_sp0.9 from training. Duration: 0.97775 2023-10-02 17:35:29,204 WARNING [train.py:1204] (2/4) Exclude cut with ID _25882_210_3036_1_1532775665815_6479510_624-320169-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 17:35:29,211 WARNING [train.py:1204] (2/4) Exclude cut with ID _20622_210_3852_1_1531622427080_1093839_127-325326-0_sp1.1 from training. Duration: 0.92725 2023-10-02 17:35:29,357 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.self_attn_weights.pos_emb_skip_rate, batch_count=962246.6666666666, ans=0.0 2023-10-02 17:35:30,539 WARNING [train.py:1204] (2/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_464-33931-0_sp1.1 from training. Duration: 0.4909375 2023-10-02 17:35:31,079 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.5.encoder.layers.1.nonlin_attention.whiten2, num_groups=1, num_channels=256, metric=9.32 vs. limit=15.0 2023-10-02 17:35:33,334 WARNING [train.py:1204] (2/4) Exclude cut with ID _66112_210_10635_1_1535590236305_3801484_120-304588-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 17:35:38,101 WARNING [train.py:1204] (2/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_110-130961-0 from training. Duration: 0.47 2023-10-02 17:35:40,768 WARNING [train.py:1204] (2/4) Exclude cut with ID _16908_210_5724_1_1531119157248_3772700_64-54020-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 17:35:42,426 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.feed_forward1.hidden_balancer.prob, batch_count=962313.3333333334, ans=0.125 2023-10-02 17:35:46,253 WARNING [train.py:1204] (2/4) Exclude cut with ID _4068_210_2629_1_1526697350395_1546259_224-288693-0_sp1.1 from training. Duration: 0.536375 2023-10-02 17:35:46,298 WARNING [train.py:1204] (2/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_191-41023-0_sp0.9 from training. Duration: 0.788875 2023-10-02 17:35:46,360 WARNING [train.py:1204] (2/4) Exclude cut with ID ID0205W0255-783002 from training. Duration: 0.958 2023-10-02 17:35:47,629 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_162-262717-0 from training. Duration: 0.83 2023-10-02 17:35:53,582 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_224-322394-0_sp1.1 from training. Duration: 0.6 2023-10-02 17:35:53,588 WARNING [train.py:1204] (2/4) Exclude cut with ID _4866_210_2577_1_1527404412284_4364659_133-1696-0_sp1.1 from training. Duration: 0.9 2023-10-02 17:35:53,643 WARNING [train.py:1204] (2/4) Exclude cut with ID _6275_210_2084_1_1528110062213_7669049_973-198085-0_sp1.1 from training. Duration: 0.6818125 2023-10-02 17:35:57,653 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.547e+02 1.851e+02 2.081e+02 2.371e+02 3.484e+02, threshold=4.162e+02, percent-clipped=0.0 2023-10-02 17:35:59,116 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_47449_210_5399_1_1534076445_4006929_410-237806-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 17:35:59,133 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7543_210_6753_1_1528543318_4552_1-85072-0_sp0.9 from training. Duration: 0.92225 2023-10-02 17:36:01,808 WARNING [train.py:1204] (2/4) Exclude cut with ID _65263_210_22356_1_1535763598691_3921037_108-21902-0 from training. Duration: 0.99 2023-10-02 17:36:01,813 WARNING [train.py:1204] (2/4) Exclude cut with ID _65585_210_12210_1_1535450451584_3764098_309-174818-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 17:36:01,965 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_309-65177-0 from training. Duration: 0.88 2023-10-02 17:36:03,355 WARNING [train.py:1204] (2/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_363-185703-0_sp1.1 from training. Duration: 0.863625 2023-10-02 17:36:03,512 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.feed_forward3.hidden_balancer.prob, batch_count=962446.6666666666, ans=0.125 2023-10-02 17:36:03,587 INFO [scaling.py:1118] (2/4) WithLoss: name=encoder.encoders.3.encoder.layers.0.self_attn_weights, loss-sum=0.000e+00 2023-10-02 17:36:04,667 WARNING [train.py:1204] (2/4) Exclude cut with ID _42326_210_7770_1_1533814282531_3707269_442-238692-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 17:36:06,718 WARNING [train.py:1204] (2/4) Exclude cut with ID _69884_210_9771_1_1535846392818_4166169_226-254446-0_sp1.1 from training. Duration: 0.936375 2023-10-02 17:36:06,734 WARNING [train.py:1204] (2/4) Exclude cut with ID _29853_210_7497_1_1532505600851_6642110_287-299903-0_sp1.1 from training. Duration: 0.963625 2023-10-02 17:36:10,762 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1015-343884-0 from training. Duration: 0.62 2023-10-02 17:36:12,103 WARNING [train.py:1204] (2/4) Exclude cut with ID ID0305W0008-833402 from training. Duration: 0.91 2023-10-02 17:36:13,507 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6673_210_3273_1_1527767790_3173240_351-167954-0_sp1.1 from training. Duration: 0.436375 2023-10-02 17:36:13,520 WARNING [train.py:1204] (2/4) Exclude cut with ID _6456_210_1534_1_1527681634212_3181049_345-252877-0 from training. Duration: 0.64 2023-10-02 17:36:14,941 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_13-205182-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 17:36:15,213 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.ff3_skip_rate, batch_count=962446.6666666666, ans=0.0 2023-10-02 17:36:18,166 INFO [train.py:1046] (2/4) Epoch 28, batch 950, loss[loss=0.1551, simple_loss=0.2377, pruned_loss=0.03625, over 24305.00 frames. ], tot_loss[loss=0.1674, simple_loss=0.2456, pruned_loss=0.04456, over 4690802.90 frames. ], batch size: 61, lr: 3.67e-03, grad_scale: 8.0 2023-10-02 17:36:18,277 WARNING [train.py:1204] (2/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_427-208901-0 from training. Duration: 0.68 2023-10-02 17:36:23,117 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.feed_forward1.hidden_balancer.prob, batch_count=962513.3333333334, ans=0.125 2023-10-02 17:36:24,752 WARNING [train.py:1204] (2/4) Exclude cut with ID _49039_210_6315_1_1533967537541_3846118_9-148921-0_sp1.1 from training. Duration: 0.97275 2023-10-02 17:36:26,311 WARNING [train.py:1204] (2/4) Exclude cut with ID _59493_210_17282_1_1535078419077_3225879_143-70317-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 17:36:27,602 WARNING [train.py:1204] (2/4) Exclude cut with ID _27967_210_12140_1_1532588391859_8419130_1056-236369-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 17:36:27,646 WARNING [train.py:1204] (2/4) Exclude cut with ID _6537_210_1883_1_1527987559710_3597129_382-133062-0_sp0.9 from training. Duration: 0.7555625 2023-10-02 17:36:30,346 WARNING [train.py:1204] (2/4) Exclude cut with ID ID0376W0255-869957 from training. Duration: 0.9510625 2023-10-02 17:36:33,178 WARNING [train.py:1204] (2/4) Exclude cut with ID _49371_210_12071_1_1533952761021_3812190_483-240691-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 17:36:34,576 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_12868_210_6753_1_1530701932_530275_18-76322-0_sp1.1 from training. Duration: 0.7909375 2023-10-02 17:36:35,887 WARNING [train.py:1204] (2/4) Exclude cut with ID _53950_210_5816_1_1534417264919_3862951_367-26344-0_sp1.1 from training. Duration: 0.97275 2023-10-02 17:36:35,913 WARNING [train.py:1204] (2/4) Exclude cut with ID _5444_210_2151_1_1527418808464_5312210_634-194174-0_sp1.1 from training. Duration: 0.8818125 2023-10-02 17:36:35,937 WARNING [train.py:1204] (2/4) Exclude cut with ID _56494_210_4220_1_1535284837625_3033169_69-138150-0 from training. Duration: 0.94 2023-10-02 17:36:37,291 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7543_210_6753_1_1528544010_1110402_25-183860-0_sp0.9 from training. Duration: 0.52225 2023-10-02 17:36:39,263 WARNING [train.py:1204] (2/4) Exclude cut with ID _40284_210_2084_1_1533513679814_3704279_287-191918-0_sp1.1 from training. Duration: 0.963625 2023-10-02 17:36:40,783 WARNING [train.py:1204] (2/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_291-270881-0 from training. Duration: 0.97 2023-10-02 17:36:42,131 WARNING [train.py:1204] (2/4) Exclude cut with ID _12555_210_8750_1_1530075594402_3780239_449-267986-0_sp0.9 from training. Duration: 0.92225 2023-10-02 17:36:44,964 WARNING [train.py:1204] (2/4) Exclude cut with ID _3992_210_1747_1_1526644816031_3731060_268-64520-0_sp1.1 from training. Duration: 0.963625 2023-10-02 17:36:44,971 WARNING [train.py:1204] (2/4) Exclude cut with ID _39411_210_5986_1_1533538825030_3343649_173-238248-0_sp0.9 from training. Duration: 0.92225 2023-10-02 17:36:44,990 WARNING [train.py:1204] (2/4) Exclude cut with ID _32704_210_8954_1_1533017488840_6747370_828-313097-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 17:36:46,325 WARNING [train.py:1204] (2/4) Exclude cut with ID _26907_210_7234_1_1532221740694_3205263_337-2494-0 from training. Duration: 0.88 2023-10-02 17:36:49,581 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_229-45300-0_sp1.1 from training. Duration: 0.5181875 2023-10-02 17:36:49,710 WARNING [train.py:1204] (2/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_300-27235-0_sp1.1 from training. Duration: 0.7909375 2023-10-02 17:36:51,666 WARNING [train.py:1204] (2/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_48-338976-0_sp1.1 from training. Duration: 0.8090625 2023-10-02 17:36:57,767 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_13091_210_7925_1_1530609447_8987354_792-195353-0_sp1.1 from training. Duration: 0.936375 2023-10-02 17:36:57,773 WARNING [train.py:1204] (2/4) Exclude cut with ID _56524_210_9642_1_1534572011779_3619170_311-54500-0_sp1.1 from training. Duration: 0.97275 2023-10-02 17:37:01,951 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7543_210_6753_1_1528544010_1110402_25-183860-0 from training. Duration: 0.47 2023-10-02 17:37:02,221 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.conv_module2.balancer2.prob, batch_count=962713.3333333334, ans=0.125 2023-10-02 17:37:03,296 WARNING [train.py:1204] (2/4) Exclude cut with ID 2411-132532-0017-82279-0_sp1.1 from training. Duration: 0.9681875 2023-10-02 17:37:03,296 WARNING [train.py:1204] (2/4) Exclude cut with ID _14170_210_5219_1_1530525949868_4469650_272-249985-0_sp0.9 from training. Duration: 0.7444375 2023-10-02 17:37:04,531 WARNING [train.py:1204] (2/4) Exclude cut with ID _55644_210_11856_1_1534593599582_4103719_320-229006-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 17:37:05,884 WARNING [train.py:1204] (2/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_177-100554-0_sp1.1 from training. Duration: 0.963625 2023-10-02 17:37:05,886 WARNING [train.py:1204] (2/4) Exclude cut with ID _14998_210_5219_1_1530705582080_7474970_787-294416-0_sp1.1 from training. Duration: 0.7454375 2023-10-02 17:37:10,489 WARNING [train.py:1204] (2/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_318-59097-0 from training. Duration: 0.76 2023-10-02 17:37:13,221 WARNING [train.py:1204] (2/4) Exclude cut with ID _12369_210_4381_1_1530356078581_4388520_480-13146-0_sp1.1 from training. Duration: 0.77275 2023-10-02 17:37:14,642 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_469-338558-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 17:37:14,699 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_40896_210_15710_1_1533433956_7100802_367-126897-0_sp1.1 from training. Duration: 0.963625 2023-10-02 17:37:14,716 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6545_210_2040_1_1527774670_4245258_379-14182-0 from training. Duration: 0.8 2023-10-02 17:37:14,731 WARNING [train.py:1204] (2/4) Exclude cut with ID _21631_210_2253_1_1531907880778_3160339_197-79498-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 17:37:14,735 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_12868_210_6753_1_1530702479_3610564_416-293746-0_sp1.1 from training. Duration: 0.8181875 2023-10-02 17:37:15,956 WARNING [train.py:1204] (2/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_52-211683-0 from training. Duration: 0.9 2023-10-02 17:37:16,582 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.5.encoder.layers.0.self_attn1.whiten, num_groups=1, num_channels=256, metric=12.31 vs. limit=22.5 2023-10-02 17:37:17,611 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.conv_module2.balancer1.max_abs, batch_count=962780.0, ans=10.0 2023-10-02 17:37:18,860 WARNING [train.py:1204] (2/4) Exclude cut with ID _48678_210_5663_1_1533988830947_3769199_306-255886-0_sp0.9 from training. Duration: 0.8555625 2023-10-02 17:37:19,033 WARNING [train.py:1204] (2/4) Exclude cut with ID _65134_210_14763_1_1535335393985_3505368_296-24733-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 17:37:25,952 WARNING [train.py:1204] (2/4) Exclude cut with ID _14898_210_9206_1_1530766812377_4177190_335-99364-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 17:37:27,171 WARNING [train.py:1204] (2/4) Exclude cut with ID _54871_210_22356_1_1534665798253_3856359_19-342632-0 from training. Duration: 0.91 2023-10-02 17:37:28,494 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_53071_210_16554_1_1534490405_2954066_265-170472-0 from training. Duration: 0.83 2023-10-02 17:37:32,591 INFO [train.py:1046] (2/4) Epoch 28, batch 1000, loss[loss=0.1628, simple_loss=0.2374, pruned_loss=0.04404, over 23630.00 frames. ], tot_loss[loss=0.1669, simple_loss=0.2449, pruned_loss=0.04445, over 4703889.51 frames. ], batch size: 149, lr: 3.67e-03, grad_scale: 8.0 2023-10-02 17:37:32,654 WARNING [train.py:1204] (2/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_189-298172-0_sp1.1 from training. Duration: 0.963625 2023-10-02 17:37:36,838 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7615_210_4211_1_1528110965_4580160_72-235211-0 from training. Duration: 0.76 2023-10-02 17:37:36,875 WARNING [train.py:1204] (2/4) Exclude cut with ID _47790_210_16187_1_1533862814066_4207519_155-90874-0_sp1.1 from training. Duration: 0.936375 2023-10-02 17:37:43,110 WARNING [train.py:1204] (2/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_164-216310-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 17:37:44,488 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_46013_210_7234_1_1533776362_3744032_463-52378-0 from training. Duration: 0.86 2023-10-02 17:37:44,492 WARNING [train.py:1204] (2/4) Exclude cut with ID _61133_210_9969_1_1535196777227_3696409_333-270325-0 from training. Duration: 0.93 2023-10-02 17:37:48,597 WARNING [train.py:1204] (2/4) Exclude cut with ID _5172_210_1883_1_1527246062567_3577219_18-101380-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 17:37:48,605 WARNING [train.py:1204] (2/4) Exclude cut with ID _30800_210_22382_1_1532499347904_4515959_577-236858-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 17:37:49,971 WARNING [train.py:1204] (2/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_1006-177345-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 17:37:51,492 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6291_210_3273_1_1527683649_1078498_4-266348-0 from training. Duration: 0.78 2023-10-02 17:37:56,834 WARNING [train.py:1204] (2/4) Exclude cut with ID _55857_210_2346_1_1534644007831_6898209_745-98391-0 from training. Duration: 0.76 2023-10-02 17:37:59,914 WARNING [train.py:1204] (2/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_375-244976-0 from training. Duration: 0.75 2023-10-02 17:37:59,952 WARNING [train.py:1204] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_4-248404-0_sp1.1 from training. Duration: 0.97275 2023-10-02 17:38:01,439 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8453_210_4211_1_1528502940_8562730_596-306820-0 from training. Duration: 0.97 2023-10-02 17:38:01,641 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.nonlin_attention.balancer.prob, batch_count=962980.0, ans=0.125 2023-10-02 17:38:03,973 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_211-339622-0_sp0.9 from training. Duration: 0.5 2023-10-02 17:38:04,018 WARNING [train.py:1204] (2/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_393-57888-0 from training. Duration: 0.64 2023-10-02 17:38:05,401 WARNING [train.py:1204] (2/4) Exclude cut with ID _23825_210_13449_1_1531915313526_3497150_143-343575-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 17:38:05,603 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.ff3_skip_rate, batch_count=962980.0, ans=0.0 2023-10-02 17:38:06,773 WARNING [train.py:1204] (2/4) Exclude cut with ID _58605_210_12210_1_1534723195939_3904259_36-12256-0_sp1.1 from training. Duration: 0.936375 2023-10-02 17:38:13,913 WARNING [train.py:1204] (2/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_38-67795-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 17:38:13,971 WARNING [train.py:1204] (2/4) Exclude cut with ID _64631_210_27767_1_1535349580068_4165160_308-327832-0_sp1.1 from training. Duration: 0.92725 2023-10-02 17:38:15,776 WARNING [train.py:1204] (2/4) Exclude cut with ID _37086_210_9038_1_1532999540891_7021250_726-81176-0_sp1.1 from training. Duration: 0.936375 2023-10-02 17:38:17,068 WARNING [train.py:1204] (2/4) Exclude cut with ID _47790_210_16187_1_1533862814066_4207519_47-141521-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 17:38:17,072 WARNING [train.py:1204] (2/4) Exclude cut with ID _3067_210_2667_1_1526212974640_4503168_250-285937-0 from training. Duration: 0.8 2023-10-02 17:38:17,099 WARNING [train.py:1204] (2/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_239-24000-0_sp1.1 from training. Duration: 0.97275 2023-10-02 17:38:18,497 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_3371_210_1794_1_1526384462_5580777_396-339812-0_sp0.9 from training. Duration: 0.9444375 2023-10-02 17:38:18,554 WARNING [train.py:1204] (2/4) Exclude cut with ID _16909_210_5724_1_1531292079912_4118919_580-173454-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 17:38:19,939 WARNING [train.py:1204] (2/4) Exclude cut with ID IC0124W0002-61308 from training. Duration: 0.982 2023-10-02 17:38:23,277 WARNING [train.py:1204] (2/4) Exclude cut with ID _31602_210_5854_1_1532677021456_4458250_249-96604-0 from training. Duration: 0.89 2023-10-02 17:38:23,508 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.self_attn_weights.pos_emb_skip_rate, batch_count=963046.6666666666, ans=0.0 2023-10-02 17:38:24,618 WARNING [train.py:1204] (2/4) Exclude cut with ID _69584_210_5219_1_1535785133775_4044450_325-7656-0 from training. Duration: 0.86 2023-10-02 17:38:25,895 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.507e+02 1.974e+02 2.112e+02 2.590e+02 4.842e+02, threshold=4.225e+02, percent-clipped=1.0 2023-10-02 17:38:25,969 WARNING [train.py:1204] (2/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_478-162247-0 from training. Duration: 0.82 2023-10-02 17:38:27,389 WARNING [train.py:1204] (2/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_686-54383-0_sp1.1 from training. Duration: 0.7909375 2023-10-02 17:38:34,109 WARNING [train.py:1204] (2/4) Exclude cut with ID _29303_210_2575_1_1532392237303_7410649_85-270092-0_sp1.1 from training. Duration: 0.936375 2023-10-02 17:38:34,119 WARNING [train.py:1204] (2/4) Exclude cut with ID _2322_210_2667_1_1525604548971_3656299_440-80494-0_sp1.1 from training. Duration: 0.8 2023-10-02 17:38:34,156 WARNING [train.py:1204] (2/4) Exclude cut with ID _47346_210_9476_1_1533815971553_3536160_201-40644-0_sp1.1 from training. Duration: 0.936375 2023-10-02 17:38:35,514 WARNING [train.py:1204] (2/4) Exclude cut with ID _56235_210_10640_1_1534557335235_4161099_343-139898-0_sp1.1 from training. Duration: 0.87275 2023-10-02 17:38:38,240 WARNING [train.py:1204] (2/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_1012-279473-0 from training. Duration: 0.67 2023-10-02 17:38:38,434 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.conv_module1.balancer2.prob, batch_count=963113.3333333334, ans=0.125 2023-10-02 17:38:39,652 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7931_210_1794_1_1528594728_5564059_280-279560-0_sp1.1 from training. Duration: 0.8909375 2023-10-02 17:38:39,696 WARNING [train.py:1204] (2/4) Exclude cut with ID _8263_210_1664_1_1528439452891_970220_68-181805-0 from training. Duration: 0.64 2023-10-02 17:38:39,740 WARNING [train.py:1204] (2/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_605-133395-0 from training. Duration: 0.66 2023-10-02 17:38:42,426 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_43-171109-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 17:38:42,430 WARNING [train.py:1204] (2/4) Exclude cut with ID _40094_210_16380_1_1533294005773_3641159_107-280739-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 17:38:43,862 WARNING [train.py:1204] (2/4) Exclude cut with ID _14821_210_8750_1_1530680503991_6829359_2-227151-0_sp0.9 from training. Duration: 0.92225 2023-10-02 17:38:46,982 INFO [train.py:1046] (2/4) Epoch 28, batch 1050, loss[loss=0.1413, simple_loss=0.2212, pruned_loss=0.03068, over 24439.00 frames. ], tot_loss[loss=0.1657, simple_loss=0.2437, pruned_loss=0.04388, over 4705739.50 frames. ], batch size: 58, lr: 3.67e-03, grad_scale: 8.0 2023-10-02 17:38:47,060 WARNING [train.py:1204] (2/4) Exclude cut with ID _14268_210_9206_1_1530622771285_4714099_331-105351-0_sp1.1 from training. Duration: 0.5909375 2023-10-02 17:38:48,508 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.1.conv_module2.balancer2.min_abs, batch_count=963180.0, ans=0.5 2023-10-02 17:38:49,668 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_26326_210_7925_1_1533002493_3592406_451-82741-0_sp1.1 from training. Duration: 0.97275 2023-10-02 17:38:51,730 WARNING [train.py:1204] (2/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_191-59784-0_sp0.9 from training. Duration: 0.97775 2023-10-02 17:38:53,107 WARNING [train.py:1204] (2/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_445-117398-0_sp1.1 from training. Duration: 0.8090625 2023-10-02 17:38:54,589 WARNING [train.py:1204] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_605-257205-0_sp0.9 from training. Duration: 0.6555625 2023-10-02 17:38:54,648 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_50050_210_16223_1_1535435796_7290138_592-258841-0_sp1.1 from training. Duration: 0.936375 2023-10-02 17:38:56,235 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.balancer1.prob, batch_count=963180.0, ans=0.125 2023-10-02 17:38:57,276 WARNING [train.py:1204] (2/4) Exclude cut with ID _14348_210_8341_1_1530599434996_3778640_240-232370-0_sp0.9 from training. Duration: 0.9333125 2023-10-02 17:39:00,436 WARNING [train.py:1204] (2/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_239-316802-0_sp0.9 from training. Duration: 0.7666875 2023-10-02 17:39:01,746 WARNING [train.py:1204] (2/4) Exclude cut with ID _45560_210_21982_1_1533812603951_3584999_111-6376-0_sp1.1 from training. Duration: 0.763625 2023-10-02 17:39:03,663 WARNING [train.py:1204] (2/4) Exclude cut with ID _54189_210_16380_1_1534332670039_3911710_606-311481-0_sp1.1 from training. Duration: 0.963625 2023-10-02 17:39:03,727 WARNING [train.py:1204] (2/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_237-332027-0_sp1.1 from training. Duration: 0.863625 2023-10-02 17:39:03,753 WARNING [train.py:1204] (2/4) Exclude cut with ID _66070_210_7640_1_1535441393096_7805310_489-292631-0_sp0.9 from training. Duration: 0.87775 2023-10-02 17:39:05,255 WARNING [train.py:1204] (2/4) Exclude cut with ID _49728_210_12651_1_1534041034294_6788150_624-195301-0_sp1.1 from training. Duration: 0.77275 2023-10-02 17:39:06,577 WARNING [train.py:1204] (2/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_44-79427-0 from training. Duration: 0.98 2023-10-02 17:39:07,833 WARNING [train.py:1204] (2/4) Exclude cut with ID _35350_210_16040_1_1533040191183_4075299_490-198354-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 17:39:09,155 WARNING [train.py:1204] (2/4) Exclude cut with ID _5166_210_2084_1_1527073197160_4087993_309-222545-0 from training. Duration: 0.84 2023-10-02 17:39:10,515 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_13091_210_7925_1_1530609447_8987354_790-199469-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 17:39:10,529 WARNING [train.py:1204] (2/4) Exclude cut with ID _21632_210_2253_1_1531994001765_3744140_110-274654-0 from training. Duration: 0.82 2023-10-02 17:39:10,536 WARNING [train.py:1204] (2/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_498-40153-0_sp0.9 from training. Duration: 0.7 2023-10-02 17:39:14,996 WARNING [train.py:1204] (2/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_363-282909-0_sp1.1 from training. Duration: 0.936375 2023-10-02 17:39:16,764 WARNING [train.py:1204] (2/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_178-177582-0_sp0.9 from training. Duration: 0.82225 2023-10-02 17:39:16,774 WARNING [train.py:1204] (2/4) Exclude cut with ID _24612_210_8913_1_1531998054193_3806359_357-35487-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 17:39:19,585 WARNING [train.py:1204] (2/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_138-332773-0 from training. Duration: 0.81 2023-10-02 17:39:19,603 WARNING [train.py:1204] (2/4) Exclude cut with ID _7624_210_1531_1_1528109802198_4933101_418-349834-0 from training. Duration: 0.6 2023-10-02 17:39:19,623 WARNING [train.py:1204] (2/4) Exclude cut with ID _7537_210_2084_1_1528502395431_3924359_14-312906-0_sp0.9 from training. Duration: 0.9333125 2023-10-02 17:39:24,111 WARNING [train.py:1204] (2/4) Exclude cut with ID _60057_210_10640_1_1534903065793_3333150_362-230377-0 from training. Duration: 0.87 2023-10-02 17:39:26,893 WARNING [train.py:1204] (2/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_138-64210-0 from training. Duration: 0.73 2023-10-02 17:39:28,217 WARNING [train.py:1204] (2/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_454-339504-0_sp1.1 from training. Duration: 0.97275 2023-10-02 17:39:31,649 WARNING [train.py:1204] (2/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_508-335369-0_sp0.9 from training. Duration: 0.5333125 2023-10-02 17:39:33,629 WARNING [train.py:1204] (2/4) Exclude cut with ID _5608_210_2106_1_1527415439089_6819768_909-82767-0_sp0.9 from training. Duration: 0.67775 2023-10-02 17:39:35,045 WARNING [train.py:1204] (2/4) Exclude cut with ID _6694_210_4599_1_1527852637490_3965529_99-11611-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 17:39:35,110 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_2655_210_2087_1_1525688959_3897400_52-279899-0_sp1.1 from training. Duration: 0.62725 2023-10-02 17:39:39,176 WARNING [train.py:1204] (2/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_375-245677-0_sp1.1 from training. Duration: 0.736375 2023-10-02 17:39:41,974 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_23850_210_4238_1_1532232597_1632433_32-185135-0 from training. Duration: 0.86 2023-10-02 17:39:43,401 WARNING [train.py:1204] (2/4) Exclude cut with ID _15113_210_8254_1_1530842438579_3716279_209-54533-0 from training. Duration: 0.96 2023-10-02 17:39:43,534 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.balancer1.prob, batch_count=963380.0, ans=0.125 2023-10-02 17:39:44,745 WARNING [train.py:1204] (2/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_401-53423-0 from training. Duration: 0.55 2023-10-02 17:39:44,776 WARNING [train.py:1204] (2/4) Exclude cut with ID _14867_210_1968_1_1530684179890_3846969_319-250868-0_sp1.1 from training. Duration: 0.9 2023-10-02 17:39:44,791 WARNING [train.py:1204] (2/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_106-30782-0_sp0.9 from training. Duration: 0.888875 2023-10-02 17:39:46,237 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_5960_210_1794_1_1527403865_3990999_515-2613-0 from training. Duration: 0.88 2023-10-02 17:39:47,214 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.0.layers.1.self_attn_weights.whiten_keys, num_groups=4, num_channels=128, metric=2.28 vs. limit=6.0 2023-10-02 17:39:49,688 WARNING [train.py:1204] (2/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_798-106179-0_sp0.9 from training. Duration: 0.92225 2023-10-02 17:39:51,217 WARNING [train.py:1204] (2/4) Exclude cut with ID _14227_210_4381_1_1530701804286_3940259_125-275642-0_sp1.1 from training. Duration: 0.9 2023-10-02 17:39:51,219 WARNING [train.py:1204] (2/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_79-200392-0_sp1.1 from training. Duration: 0.8818125 2023-10-02 17:39:52,533 WARNING [train.py:1204] (2/4) Exclude cut with ID _48836_210_12210_1_1534075848293_3904150_429-35475-0_sp0.9 from training. Duration: 0.988875 2023-10-02 17:39:52,561 WARNING [train.py:1204] (2/4) Exclude cut with ID _69884_210_9771_1_1535846392818_4166169_224-172517-0_sp1.1 from training. Duration: 0.97275 2023-10-02 17:39:56,024 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.conv_module1.balancer2.prob, batch_count=963446.6666666666, ans=0.125 2023-10-02 17:39:57,060 WARNING [train.py:1204] (2/4) Exclude cut with ID _15113_210_8254_1_1530842438579_3716279_76-232507-0_sp1.1 from training. Duration: 0.97275 2023-10-02 17:39:57,062 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_135-5501-0 from training. Duration: 0.98 2023-10-02 17:39:57,314 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=963446.6666666666, ans=0.1 2023-10-02 17:39:58,483 WARNING [train.py:1204] (2/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_563-347832-0_sp0.9 from training. Duration: 0.988875 2023-10-02 17:39:58,493 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6786_210_1388_1_1527839699_4194495_273-149841-0 from training. Duration: 0.92 2023-10-02 17:39:58,520 WARNING [train.py:1204] (2/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_172-215932-0 from training. Duration: 0.66 2023-10-02 17:40:00,406 WARNING [train.py:1204] (2/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_939-132910-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 17:40:01,626 INFO [train.py:1046] (2/4) Epoch 28, batch 1100, loss[loss=0.1371, simple_loss=0.219, pruned_loss=0.02759, over 24307.00 frames. ], tot_loss[loss=0.1648, simple_loss=0.2428, pruned_loss=0.04345, over 4708302.91 frames. ], batch size: 56, lr: 3.67e-03, grad_scale: 8.0 2023-10-02 17:40:03,387 WARNING [train.py:1204] (2/4) Exclude cut with ID _49371_210_12071_1_1533952761021_3812190_16-246001-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 17:40:09,487 WARNING [train.py:1204] (2/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_680-189165-0_sp1.1 from training. Duration: 0.936375 2023-10-02 17:40:13,699 WARNING [train.py:1204] (2/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_53-300544-0_sp1.1 from training. Duration: 0.6090625 2023-10-02 17:40:15,069 WARNING [train.py:1204] (2/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_540-275685-0_sp1.1 from training. Duration: 0.8090625 2023-10-02 17:40:15,091 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_38965_210_4852_1_1533174906_3484735_453-106579-0_sp1.1 from training. Duration: 0.92725 2023-10-02 17:40:16,426 WARNING [train.py:1204] (2/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_105-328174-0 from training. Duration: 0.76 2023-10-02 17:40:16,547 WARNING [train.py:1204] (2/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_279-126205-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 17:40:18,013 WARNING [train.py:1204] (2/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_5-55893-0_sp0.9 from training. Duration: 0.611125 2023-10-02 17:40:21,211 WARNING [train.py:1204] (2/4) Exclude cut with ID _13678_210_7497_1_1530530069226_3463170_141-10835-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 17:40:21,467 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.0.nonlin_attention.balancer.prob, batch_count=963580.0, ans=0.125 2023-10-02 17:40:22,665 WARNING [train.py:1204] (2/4) Exclude cut with ID _22083_210_8254_1_1531702862439_3678970_201-85738-0_sp1.1 from training. Duration: 0.8181875 2023-10-02 17:40:22,701 WARNING [train.py:1204] (2/4) Exclude cut with ID _4697_210_2069_1_1526815801953_3823098_51-1049-0 from training. Duration: 0.75 2023-10-02 17:40:24,458 WARNING [train.py:1204] (2/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_206-10097-0_sp0.9 from training. Duration: 0.5444375 2023-10-02 17:40:25,754 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_4528_210_3273_1_1527076654_3276777_360-63661-0_sp1.1 from training. Duration: 0.92725 2023-10-02 17:40:25,761 WARNING [train.py:1204] (2/4) Exclude cut with ID _64086_210_7994_1_1535270417019_7435949_449-31567-0_sp1.1 from training. Duration: 0.8818125 2023-10-02 17:40:28,909 WARNING [train.py:1204] (2/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_245-174553-0_sp0.9 from training. Duration: 0.97775 2023-10-02 17:40:30,448 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6545_210_2040_1_1527774670_4245258_379-14182-0_sp1.1 from training. Duration: 0.72725 2023-10-02 17:40:33,839 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_3133_210_2087_1_1526106446_2324690_256-206070-0_sp1.1 from training. Duration: 0.963625 2023-10-02 17:40:36,807 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.0.conv_skip_rate, batch_count=963646.6666666666, ans=0.0 2023-10-02 17:40:38,104 WARNING [train.py:1204] (2/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_278-104676-0 from training. Duration: 0.98 2023-10-02 17:40:39,462 WARNING [train.py:1204] (2/4) Exclude cut with ID IC0671W0065-331908 from training. Duration: 0.896 2023-10-02 17:40:39,506 WARNING [train.py:1204] (2/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_244-129917-0_sp1.1 from training. Duration: 0.97275 2023-10-02 17:40:40,928 WARNING [train.py:1204] (2/4) Exclude cut with ID _7852_210_5219_1_1528286471316_8139400_1129-170857-0_sp1.1 from training. Duration: 0.97275 2023-10-02 17:40:41,097 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.balancer2.prob, batch_count=963646.6666666666, ans=0.125 2023-10-02 17:40:42,306 WARNING [train.py:1204] (2/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_443-30034-0_sp0.9 from training. Duration: 0.711125 2023-10-02 17:40:42,347 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_31583_210_3852_1_1532595851_4024413_250-332999-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 17:40:43,775 WARNING [train.py:1204] (2/4) Exclude cut with ID _8772_210_1850_1_1528635595972_3664011_247-336448-0 from training. Duration: 0.74 2023-10-02 17:40:45,102 WARNING [train.py:1204] (2/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_567-135114-0_sp0.9 from training. Duration: 0.9666875 2023-10-02 17:40:45,106 WARNING [train.py:1204] (2/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_557-349534-0_sp1.1 from training. Duration: 0.82725 2023-10-02 17:40:45,118 WARNING [train.py:1204] (2/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_1341-26728-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 17:40:46,495 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_40222_210_6228_1_1533385298_4481939_375-328051-0_sp1.1 from training. Duration: 0.97275 2023-10-02 17:40:46,514 WARNING [train.py:1204] (2/4) Exclude cut with ID _4697_210_2069_1_1526815801953_3823098_276-96593-0 from training. Duration: 0.77 2023-10-02 17:40:51,209 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_53071_210_16554_1_1534490405_2954066_265-170472-0_sp0.9 from training. Duration: 0.92225 2023-10-02 17:40:51,225 WARNING [train.py:1204] (2/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_365-1492-0 from training. Duration: 0.91 2023-10-02 17:40:54,632 WARNING [train.py:1204] (2/4) Exclude cut with ID _66873_210_5517_1_1535454136506_4293370_654-52713-0_sp0.9 from training. Duration: 0.8555625 2023-10-02 17:40:54,869 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.feed_forward1.hidden_balancer.prob, batch_count=963713.3333333334, ans=0.125 2023-10-02 17:40:56,022 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.515e+02 1.777e+02 1.908e+02 2.106e+02 3.203e+02, threshold=3.817e+02, percent-clipped=0.0 2023-10-02 17:40:59,266 WARNING [train.py:1204] (2/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_113-92608-0_sp1.1 from training. Duration: 0.4909375 2023-10-02 17:41:02,033 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_46013_210_7234_1_1533776362_3744032_50-141535-0 from training. Duration: 0.99 2023-10-02 17:41:02,049 WARNING [train.py:1204] (2/4) Exclude cut with ID _13942_210_9988_1_1530604906036_3733360_499-55676-0_sp1.1 from training. Duration: 0.563625 2023-10-02 17:41:03,451 WARNING [train.py:1204] (2/4) Exclude cut with ID _30800_210_22382_1_1532499347904_4515959_375-65143-0_sp1.1 from training. Duration: 0.97275 2023-10-02 17:41:04,966 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.0.feed_forward1.out_proj.dropout_p, batch_count=963780.0, ans=0.1 2023-10-02 17:41:06,668 WARNING [train.py:1204] (2/4) Exclude cut with ID _16756_210_4915_1_1531047803763_7104259_199-325608-0_sp1.1 from training. Duration: 0.92725 2023-10-02 17:41:06,701 WARNING [train.py:1204] (2/4) Exclude cut with ID _31649_210_6228_1_1532584608063_4091078_443-124391-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 17:41:08,010 WARNING [train.py:1204] (2/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_146-218763-0 from training. Duration: 0.86 2023-10-02 17:41:08,055 WARNING [train.py:1204] (2/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_567-135114-0_sp1.1 from training. Duration: 0.7909375 2023-10-02 17:41:08,100 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_26326_210_7925_1_1533002493_3592406_462-288299-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 17:41:09,457 WARNING [train.py:1204] (2/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_322-217220-0 from training. Duration: 0.86 2023-10-02 17:41:09,496 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_238-256668-0_sp1.1 from training. Duration: 0.62725 2023-10-02 17:41:10,784 WARNING [train.py:1204] (2/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_371-18169-0 from training. Duration: 0.86 2023-10-02 17:41:12,252 WARNING [train.py:1204] (2/4) Exclude cut with ID _58480_210_12392_1_1534811121111_3646160_23-74512-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 17:41:13,458 WARNING [train.py:1204] (2/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_784-213099-0_sp1.1 from training. Duration: 0.8454375 2023-10-02 17:41:14,827 WARNING [train.py:1204] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_625-102955-0_sp1.1 from training. Duration: 0.7 2023-10-02 17:41:16,118 INFO [train.py:1046] (2/4) Epoch 28, batch 1150, loss[loss=0.1676, simple_loss=0.2575, pruned_loss=0.03879, over 24289.00 frames. ], tot_loss[loss=0.1654, simple_loss=0.2439, pruned_loss=0.04347, over 4717406.82 frames. ], batch size: 74, lr: 3.67e-03, grad_scale: 8.0 2023-10-02 17:41:17,712 WARNING [train.py:1204] (2/4) Exclude cut with ID _49748_210_24116_1_1533986971737_3158910_176-174327-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 17:41:20,905 WARNING [train.py:1204] (2/4) Exclude cut with ID _50310_210_15488_1_1534068043345_3641080_432-96140-0_sp1.1 from training. Duration: 0.936375 2023-10-02 17:41:22,346 WARNING [train.py:1204] (2/4) Exclude cut with ID _40402_210_18219_1_1533522324115_2953150_76-14910-0_sp1.1 from training. Duration: 0.92725 2023-10-02 17:41:22,367 WARNING [train.py:1204] (2/4) Exclude cut with ID _21783_210_11357_1_1531742458730_3762679_40-19458-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 17:41:22,403 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_14056_210_4163_1_1530604483_7572827_46-93547-0 from training. Duration: 0.95 2023-10-02 17:41:22,432 WARNING [train.py:1204] (2/4) Exclude cut with ID _21631_210_2253_1_1531907880778_3160339_321-35325-0_sp1.1 from training. Duration: 0.963625 2023-10-02 17:41:25,539 WARNING [train.py:1204] (2/4) Exclude cut with ID _4779_210_2084_1_1527318022944_3914893_500-285013-0 from training. Duration: 0.69 2023-10-02 17:41:27,325 WARNING [train.py:1204] (2/4) Exclude cut with ID _43552_210_8913_1_1533726031059_3523429_301-93324-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 17:41:27,331 WARNING [train.py:1204] (2/4) Exclude cut with ID _6473_210_1741_1_1527901307396_3815380_108-17335-0_sp1.1 from training. Duration: 0.5909375 2023-10-02 17:41:27,679 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.attention_skip_rate, batch_count=963846.6666666666, ans=0.0 2023-10-02 17:41:30,257 WARNING [train.py:1204] (2/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_206-10097-0 from training. Duration: 0.49 2023-10-02 17:41:31,646 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_36756_210_7234_1_1533026991_966276_103-77513-0_sp1.1 from training. Duration: 0.97275 2023-10-02 17:41:34,961 WARNING [train.py:1204] (2/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_662-18193-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 17:41:36,234 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_26433_210_12108_1_1532157158_4056480_439-52438-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 17:41:36,282 WARNING [train.py:1204] (2/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_464-33931-0 from training. Duration: 0.54 2023-10-02 17:41:36,288 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_3474_210_2028_1_1526188513_4825246_145-309131-0_sp0.9 from training. Duration: 0.82225 2023-10-02 17:41:36,307 WARNING [train.py:1204] (2/4) Exclude cut with ID _9679_210_7497_1_1529129212906_4363929_446-308321-0_sp1.1 from training. Duration: 0.963625 2023-10-02 17:41:40,372 WARNING [train.py:1204] (2/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_662-275621-0 from training. Duration: 0.42 2023-10-02 17:41:41,724 WARNING [train.py:1204] (2/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_344-337148-0_sp1.1 from training. Duration: 0.97275 2023-10-02 17:41:43,082 WARNING [train.py:1204] (2/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_759-179071-0_sp1.1 from training. Duration: 0.92725 2023-10-02 17:41:46,703 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.conv_skip_rate, batch_count=963980.0, ans=0.0 2023-10-02 17:41:52,822 WARNING [train.py:1204] (2/4) Exclude cut with ID _28783_210_7943_1_1532671164808_7155479_293-12188-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 17:41:57,875 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=963980.0, ans=0.1 2023-10-02 17:41:58,918 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_17613_210_2151_1_1531137569_2366160_235-320304-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 17:41:58,936 WARNING [train.py:1204] (2/4) Exclude cut with ID _14349_210_8341_1_1530615710818_3442470_170-336541-0 from training. Duration: 0.86 2023-10-02 17:42:00,318 WARNING [train.py:1204] (2/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_375-218677-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 17:42:00,355 WARNING [train.py:1204] (2/4) Exclude cut with ID _28317_210_12742_1_1532480302381_8510325_829-202956-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 17:42:03,234 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.conv_module2.balancer1.max_abs, batch_count=964046.6666666666, ans=10.0 2023-10-02 17:42:08,702 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9029W0436-509823 from training. Duration: 0.768 2023-10-02 17:42:08,829 WARNING [train.py:1204] (2/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_231-112214-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 17:42:09,005 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.conv_module1.balancer1.prob, batch_count=964046.6666666666, ans=0.125 2023-10-02 17:42:15,713 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9059W0470-524883 from training. Duration: 0.3415625 2023-10-02 17:42:21,386 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_43266_210_1794_1_1533521789_4497218_206-79512-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 17:42:22,762 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_294-163793-0_sp0.9 from training. Duration: 0.911125 2023-10-02 17:42:22,791 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6290_210_3273_1_1527595224_1282787_115-87095-0_sp1.1 from training. Duration: 0.5 2023-10-02 17:42:22,844 WARNING [train.py:1204] (2/4) Exclude cut with ID _14100_210_8341_1_1530529320613_3678069_279-55760-0_sp1.1 from training. Duration: 0.8454375 2023-10-02 17:42:27,345 WARNING [train.py:1204] (2/4) Exclude cut with ID _14621_210_1925_1_1530686901185_3675420_419-234200-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 17:42:30,689 INFO [train.py:1046] (2/4) Epoch 28, batch 1200, loss[loss=0.1913, simple_loss=0.2644, pruned_loss=0.05913, over 23275.00 frames. ], tot_loss[loss=0.1656, simple_loss=0.2442, pruned_loss=0.04356, over 4729180.48 frames. ], batch size: 93, lr: 3.67e-03, grad_scale: 16.0 2023-10-02 17:42:32,121 WARNING [train.py:1204] (2/4) Exclude cut with ID _60229_210_27767_1_1534838406158_3655350_353-213656-0_sp1.1 from training. Duration: 0.863625 2023-10-02 17:42:32,132 WARNING [train.py:1204] (2/4) Exclude cut with ID _48678_210_5663_1_1533988830947_3769199_306-255886-0_sp1.1 from training. Duration: 0.7 2023-10-02 17:42:34,632 WARNING [train.py:1204] (2/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_466-3492-0_sp1.1 from training. Duration: 0.97275 2023-10-02 17:42:34,633 WARNING [train.py:1204] (2/4) Exclude cut with ID _8755_210_1876_1_1528628559357_3258209_199-242167-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 17:42:34,694 WARNING [train.py:1204] (2/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_237-89684-0_sp1.1 from training. Duration: 0.87275 2023-10-02 17:42:37,417 WARNING [train.py:1204] (2/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_70-84121-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 17:42:39,181 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7544_210_6753_1_1528587722_3576686_172-171750-0_sp1.1 from training. Duration: 0.5909375 2023-10-02 17:42:40,633 WARNING [train.py:1204] (2/4) Exclude cut with ID _24271_210_12658_1_1531987270022_7190519_40-321822-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 17:42:40,666 WARNING [train.py:1204] (2/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_227-83612-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 17:42:43,514 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9156W0015-572508 from training. Duration: 0.896 2023-10-02 17:42:44,963 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_292-265928-0 from training. Duration: 0.51 2023-10-02 17:42:46,564 INFO [scaling.py:1118] (2/4) WithLoss: name=encoder.encoders.4.encoder.layers.1.self_attn_weights, loss-sum=0.000e+00 2023-10-02 17:42:47,654 WARNING [train.py:1204] (2/4) Exclude cut with ID _14747_210_8341_1_1530685797463_3654089_147-131229-0_sp1.1 from training. Duration: 0.6909375 2023-10-02 17:42:50,367 WARNING [train.py:1204] (2/4) Exclude cut with ID _45297_210_2721_1_1533715351381_3264949_351-147136-0_sp0.9 from training. Duration: 0.9666875 2023-10-02 17:42:52,389 WARNING [train.py:1204] (2/4) Exclude cut with ID _5250_210_5219_1_1527249561806_8692110_1102-324568-0_sp1.1 from training. Duration: 0.97275 2023-10-02 17:42:52,551 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.bypass_mid.scale_min, batch_count=964246.6666666666, ans=0.2 2023-10-02 17:42:53,830 WARNING [train.py:1204] (2/4) Exclude cut with ID _18516_210_9206_1_1531704653206_3692406_465-75446-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 17:42:53,832 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9203W0126-595623 from training. Duration: 0.896 2023-10-02 17:42:55,238 WARNING [train.py:1204] (2/4) Exclude cut with ID _55383_210_10635_1_1534468426172_2231669_139-328410-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 17:42:55,780 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.3.encoder.layers.3.feed_forward1.out_whiten, num_groups=1, num_channels=512, metric=8.83 vs. limit=15.0 2023-10-02 17:43:01,879 WARNING [train.py:1204] (2/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_166-246557-0_sp0.9 from training. Duration: 0.711125 2023-10-02 17:43:01,884 WARNING [train.py:1204] (2/4) Exclude cut with ID _30039_210_13270_1_1532484612900_2725150_239-304872-0_sp1.1 from training. Duration: 0.963625 2023-10-02 17:43:01,925 WARNING [train.py:1204] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_6-226750-0 from training. Duration: 0.94 2023-10-02 17:43:03,372 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_135-5501-0_sp1.1 from training. Duration: 0.8909375 2023-10-02 17:43:08,975 WARNING [train.py:1204] (2/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_359-118926-0 from training. Duration: 0.72 2023-10-02 17:43:14,428 WARNING [train.py:1204] (2/4) Exclude cut with ID _8064_210_5399_1_1528372299291_4282973_4-91037-0 from training. Duration: 0.88 2023-10-02 17:43:14,445 WARNING [train.py:1204] (2/4) Exclude cut with ID _6653_210_2629_1_1527919172441_6732679_415-288492-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 17:43:15,894 WARNING [train.py:1204] (2/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_544-287179-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 17:43:17,341 WARNING [train.py:1204] (2/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_122-32606-0_sp1.1 from training. Duration: 0.92725 2023-10-02 17:43:17,393 WARNING [train.py:1204] (2/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_121-284152-0_sp1.1 from training. Duration: 0.536375 2023-10-02 17:43:18,674 WARNING [train.py:1204] (2/4) Exclude cut with ID _44718_210_5816_1_1534050051869_6469030_350-299477-0_sp1.1 from training. Duration: 0.97275 2023-10-02 17:43:18,685 WARNING [train.py:1204] (2/4) Exclude cut with ID _6653_210_2629_1_1527919172441_6732679_1103-56445-0_sp0.9 from training. Duration: 0.811125 2023-10-02 17:43:18,739 WARNING [train.py:1204] (2/4) Exclude cut with ID _12369_210_4381_1_1530356078581_4388520_64-298917-0_sp0.9 from training. Duration: 0.97775 2023-10-02 17:43:20,064 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_2245_210_2715_1_1525511548_3540254_310-52483-0 from training. Duration: 0.88 2023-10-02 17:43:20,113 WARNING [train.py:1204] (2/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_401-158239-0_sp1.1 from training. Duration: 0.6454375 2023-10-02 17:43:21,561 WARNING [train.py:1204] (2/4) Exclude cut with ID _59502_210_12210_1_1535107024698_1835139_146-162495-0_sp1.1 from training. Duration: 0.836375 2023-10-02 17:43:21,570 WARNING [train.py:1204] (2/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_41-31157-0_sp0.9 from training. Duration: 0.6333125 2023-10-02 17:43:21,909 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.nonlin_attention.balancer.prob, batch_count=964380.0, ans=0.125 2023-10-02 17:43:22,956 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.536e+02 1.818e+02 2.090e+02 2.388e+02 3.203e+02, threshold=4.180e+02, percent-clipped=0.0 2023-10-02 17:43:23,054 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_68717_210_3528_1_1535628683_1556759_95-36722-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 17:43:23,061 WARNING [train.py:1204] (2/4) Exclude cut with ID _30196_210_1664_1_1533708496522_7074140_654-162611-0_sp1.1 from training. Duration: 0.92725 2023-10-02 17:43:27,564 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_9302_210_2550_1_1528889962_4655416_638-301620-0_sp0.9 from training. Duration: 0.52225 2023-10-02 17:43:30,615 WARNING [train.py:1204] (2/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_478-162247-0_sp1.1 from training. Duration: 0.7454375 2023-10-02 17:43:33,375 WARNING [train.py:1204] (2/4) Exclude cut with ID _45297_210_2721_1_1533715351381_3264949_353-9541-0 from training. Duration: 0.97 2023-10-02 17:43:36,591 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9653W0190-675468 from training. Duration: 0.9935 2023-10-02 17:43:39,141 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_49730_210_13049_1_1534075294_1830676_214-84436-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 17:43:42,634 WARNING [train.py:1204] (2/4) Exclude cut with ID _50093_210_4220_1_1534417141207_3432299_259-322252-0_sp1.1 from training. Duration: 0.836375 2023-10-02 17:43:43,959 INFO [train.py:1046] (2/4) Epoch 28, batch 1250, loss[loss=0.1487, simple_loss=0.2242, pruned_loss=0.03659, over 24458.00 frames. ], tot_loss[loss=0.1674, simple_loss=0.2458, pruned_loss=0.04445, over 4717607.52 frames. ], batch size: 58, lr: 3.67e-03, grad_scale: 16.0 2023-10-02 17:43:44,023 WARNING [train.py:1204] (2/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_599-50220-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 17:43:44,254 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=964513.3333333334, ans=0.1 2023-10-02 17:43:45,430 WARNING [train.py:1204] (2/4) Exclude cut with ID _21878_210_3525_1_1531738772002_7264330_174-109016-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 17:43:46,854 WARNING [train.py:1204] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_665-196155-0 from training. Duration: 0.67 2023-10-02 17:43:51,077 WARNING [train.py:1204] (2/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_656-188361-0_sp1.1 from training. Duration: 0.7909375 2023-10-02 17:43:52,469 WARNING [train.py:1204] (2/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_60-18123-0_sp1.1 from training. Duration: 0.8 2023-10-02 17:43:53,805 WARNING [train.py:1204] (2/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_535-14410-0 from training. Duration: 0.47 2023-10-02 17:43:57,100 WARNING [train.py:1204] (2/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_94-126128-0_sp1.1 from training. Duration: 0.7818125 2023-10-02 17:43:57,178 WARNING [train.py:1204] (2/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_356-31216-0_sp1.1 from training. Duration: 0.6818125 2023-10-02 17:44:01,723 WARNING [train.py:1204] (2/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_443-30034-0_sp1.1 from training. Duration: 0.5818125 2023-10-02 17:44:01,819 WARNING [train.py:1204] (2/4) Exclude cut with ID _26907_210_7234_1_1532221740694_3205263_337-2494-0_sp1.1 from training. Duration: 0.8 2023-10-02 17:44:02,046 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.nonlin_attention.balancer.max_positive, batch_count=964580.0, ans=0.95 2023-10-02 17:44:03,237 WARNING [train.py:1204] (2/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_49-97158-0_sp0.9 from training. Duration: 0.8444375 2023-10-02 17:44:03,251 WARNING [train.py:1204] (2/4) Exclude cut with ID _7877_210_6014_1_1528286358099_3750136_553-170128-0_sp1.1 from training. Duration: 0.963625 2023-10-02 17:44:04,739 WARNING [train.py:1204] (2/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_202-293523-0_sp0.9 from training. Duration: 0.8 2023-10-02 17:44:10,566 WARNING [train.py:1204] (2/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_188-171673-0_sp0.9 from training. Duration: 0.6444375 2023-10-02 17:44:10,588 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_120-41668-0_sp1.1 from training. Duration: 0.636375 2023-10-02 17:44:10,593 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_40223_210_6228_1_1533298404_4812267_613-78769-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 17:44:11,956 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_53861_210_5399_1_1534422156_4272077_150-336899-0_sp1.1 from training. Duration: 0.963625 2023-10-02 17:44:12,023 WARNING [train.py:1204] (2/4) Exclude cut with ID _14459_210_2228_1_1531443641946_7516700_1146-56843-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 17:44:14,895 WARNING [train.py:1204] (2/4) Exclude cut with ID _39867_210_9826_1_1533553222030_9344259_1194-299928-0_sp1.1 from training. Duration: 0.92725 2023-10-02 17:44:16,825 WARNING [train.py:1204] (2/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_214-291369-0_sp0.9 from training. Duration: 0.7 2023-10-02 17:44:21,004 WARNING [train.py:1204] (2/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_358-215558-0 from training. Duration: 0.93 2023-10-02 17:44:21,066 WARNING [train.py:1204] (2/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_577-287150-0_sp0.9 from training. Duration: 0.911125 2023-10-02 17:44:22,526 WARNING [train.py:1204] (2/4) Exclude cut with ID _60860_210_8254_1_1534896015875_3557169_229-187367-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 17:44:23,859 WARNING [train.py:1204] (2/4) Exclude cut with ID _6275_210_2084_1_1528110062213_7669049_857-298942-0 from training. Duration: 0.85 2023-10-02 17:44:23,915 WARNING [train.py:1204] (2/4) Exclude cut with ID _40137_210_12210_1_1533290484046_3795180_249-187059-0_sp1.1 from training. Duration: 0.8 2023-10-02 17:44:23,922 WARNING [train.py:1204] (2/4) Exclude cut with ID ID0171W0508-765603 from training. Duration: 0.541 2023-10-02 17:44:23,937 WARNING [train.py:1204] (2/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_521-219439-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 17:44:23,942 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_40223_210_6228_1_1533298404_4812267_741-302616-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 17:44:28,789 WARNING [train.py:1204] (2/4) Exclude cut with ID _65293_210_9070_1_1535545549765_4004210_438-154735-0_sp1.1 from training. Duration: 0.92725 2023-10-02 17:44:33,329 WARNING [train.py:1204] (2/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_564-132950-0_sp1.1 from training. Duration: 0.92725 2023-10-02 17:44:33,365 WARNING [train.py:1204] (2/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_291-270881-0_sp1.1 from training. Duration: 0.8818125 2023-10-02 17:44:35,217 WARNING [train.py:1204] (2/4) Exclude cut with ID _60229_210_27767_1_1534838406158_3655350_353-213656-0 from training. Duration: 0.95 2023-10-02 17:44:35,236 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_3133_210_2087_1_1526106446_2324690_243-143119-0 from training. Duration: 0.77 2023-10-02 17:44:35,256 WARNING [train.py:1204] (2/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_690-126306-0 from training. Duration: 0.78 2023-10-02 17:44:38,202 WARNING [train.py:1204] (2/4) Exclude cut with ID _30304_210_6319_1_1532476827261_7582060_347-95003-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 17:44:39,591 WARNING [train.py:1204] (2/4) Exclude cut with ID _2322_210_2667_1_1525604548971_3656299_440-80494-0 from training. Duration: 0.88 2023-10-02 17:44:39,605 WARNING [train.py:1204] (2/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_370-229434-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 17:44:42,225 WARNING [train.py:1204] (2/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_124-345840-0_sp1.1 from training. Duration: 0.47275 2023-10-02 17:44:42,240 WARNING [train.py:1204] (2/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_165-123581-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 17:44:44,879 WARNING [train.py:1204] (2/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_453-282588-0 from training. Duration: 0.98 2023-10-02 17:44:44,912 WARNING [train.py:1204] (2/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_299-34413-0_sp0.9 from training. Duration: 0.72225 2023-10-02 17:44:46,267 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_40036_210_13405_1_1533648647_1315388_48-146016-0_sp1.1 from training. Duration: 0.8454375 2023-10-02 17:44:46,302 WARNING [train.py:1204] (2/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_99-167392-0_sp0.9 from training. Duration: 0.67775 2023-10-02 17:44:48,015 WARNING [train.py:1204] (2/4) Exclude cut with ID _48677_210_5663_1_1533902405919_3892989_37-207114-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 17:44:48,132 WARNING [train.py:1204] (2/4) Exclude cut with ID _29857_210_20034_1_1532395886753_3798160_335-163562-0 from training. Duration: 0.85 2023-10-02 17:44:51,796 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.0.layers.0.self_attn_weights.whiten_keys, num_groups=4, num_channels=128, metric=2.96 vs. limit=6.0 2023-10-02 17:44:52,149 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_36492_210_7927_1_1533261987_4790834_703-343351-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 17:44:52,281 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=964780.0, ans=0.1 2023-10-02 17:44:53,499 WARNING [train.py:1204] (2/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_80-147301-0_sp0.9 from training. Duration: 0.9333125 2023-10-02 17:44:54,765 WARNING [train.py:1204] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_218-295097-0_sp0.9 from training. Duration: 0.8555625 2023-10-02 17:44:56,252 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_2295_210_2087_1_1525500993_2407863_116-238894-0_sp1.1 from training. Duration: 0.636375 2023-10-02 17:44:57,479 INFO [train.py:1046] (2/4) Epoch 28, batch 1300, loss[loss=0.1737, simple_loss=0.2521, pruned_loss=0.0477, over 23381.00 frames. ], tot_loss[loss=0.1673, simple_loss=0.2461, pruned_loss=0.04428, over 4730753.20 frames. ], batch size: 93, lr: 3.67e-03, grad_scale: 16.0 2023-10-02 17:44:59,585 WARNING [train.py:1204] (2/4) Exclude cut with ID _3231_210_2106_1_1526205864934_6784220_697-171068-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 17:44:59,616 WARNING [train.py:1204] (2/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_102-77064-0 from training. Duration: 0.93 2023-10-02 17:45:02,843 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_13091_210_7925_1_1530609447_8987354_1186-261957-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 17:45:04,561 WARNING [train.py:1204] (2/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_91-277684-0_sp1.1 from training. Duration: 0.67275 2023-10-02 17:45:05,969 WARNING [train.py:1204] (2/4) Exclude cut with ID _31088_210_16446_1_1532520012284_3440919_159-313751-0_sp1.1 from training. Duration: 0.936375 2023-10-02 17:45:07,395 WARNING [train.py:1204] (2/4) Exclude cut with ID _42326_210_7770_1_1533814282531_3707269_227-203721-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 17:45:08,871 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_3531_210_3528_1_1526294203_5216456_309-227302-0_sp1.1 from training. Duration: 0.62725 2023-10-02 17:45:10,202 WARNING [train.py:1204] (2/4) Exclude cut with ID _14087_210_2253_1_1530529224512_3014029_226-242140-0 from training. Duration: 0.81 2023-10-02 17:45:15,820 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.nonlin_attention.balancer.max_positive, batch_count=964913.3333333334, ans=0.95 2023-10-02 17:45:16,956 WARNING [train.py:1204] (2/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_122-109023-0_sp1.1 from training. Duration: 0.6909375 2023-10-02 17:45:17,044 WARNING [train.py:1204] (2/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_660-211088-0_sp0.9 from training. Duration: 0.688875 2023-10-02 17:45:18,462 WARNING [train.py:1204] (2/4) Exclude cut with ID _39290_210_4220_1_1533862778760_3075118_92-311600-0 from training. Duration: 0.88 2023-10-02 17:45:21,972 WARNING [train.py:1204] (2/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_390-339137-0_sp1.1 from training. Duration: 0.5090625 2023-10-02 17:45:25,969 WARNING [train.py:1204] (2/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_449-341946-0_sp1.1 from training. Duration: 0.963625 2023-10-02 17:45:26,221 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.feed_forward3.hidden_balancer.prob, batch_count=964980.0, ans=0.125 2023-10-02 17:45:27,302 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_68095_210_20185_1_1535718226_8888855_884-327007-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 17:45:28,610 WARNING [train.py:1204] (2/4) Exclude cut with ID _42326_210_7770_1_1533814282531_3707269_308-222998-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 17:45:30,080 WARNING [train.py:1204] (2/4) Exclude cut with ID _40399_210_4353_1_1533305658194_3279406_344-238708-0_sp1.1 from training. Duration: 0.963625 2023-10-02 17:45:30,140 WARNING [train.py:1204] (2/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_87-131427-0_sp1.1 from training. Duration: 0.7090625 2023-10-02 17:45:32,076 WARNING [train.py:1204] (2/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_110-130961-0_sp0.9 from training. Duration: 0.52225 2023-10-02 17:45:32,127 WARNING [train.py:1204] (2/4) Exclude cut with ID _55153_210_2253_1_1534384786677_3162096_148-79022-0 from training. Duration: 0.96 2023-10-02 17:45:38,196 WARNING [train.py:1204] (2/4) Exclude cut with ID _9679_210_7497_1_1529129212906_4363929_512-313889-0_sp1.1 from training. Duration: 0.736375 2023-10-02 17:45:38,207 WARNING [train.py:1204] (2/4) Exclude cut with ID _7970_210_1883_1_1528455616296_3472230_25-178407-0_sp1.1 from training. Duration: 0.6454375 2023-10-02 17:45:39,644 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_3133_210_2087_1_1526106446_2324690_246-125000-0 from training. Duration: 0.62 2023-10-02 17:45:39,696 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_15126_210_6753_1_1530778677_2591185_113-157651-0_sp0.9 from training. Duration: 0.7555625 2023-10-02 17:45:41,074 WARNING [train.py:1204] (2/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_105-63309-0_sp1.1 from training. Duration: 0.8909375 2023-10-02 17:45:42,555 WARNING [train.py:1204] (2/4) Exclude cut with ID _29877_210_6228_1_1532397791106_5276149_578-177271-0_sp1.1 from training. Duration: 0.936375 2023-10-02 17:45:43,913 WARNING [train.py:1204] (2/4) Exclude cut with ID _44831_210_13427_1_1533816011549_3689035_32-188147-0 from training. Duration: 0.96 2023-10-02 17:45:45,120 WARNING [train.py:1204] (2/4) Exclude cut with ID _12369_210_4381_1_1530356078581_4388520_300-254337-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 17:45:45,139 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_128-241008-0 from training. Duration: 0.94 2023-10-02 17:45:47,770 WARNING [train.py:1204] (2/4) Exclude cut with ID _21878_210_3525_1_1531738772002_7264330_53-279178-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 17:45:49,401 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.balancer1.prob, batch_count=965046.6666666666, ans=0.125 2023-10-02 17:45:50,982 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.537e+02 1.887e+02 2.082e+02 2.300e+02 3.182e+02, threshold=4.165e+02, percent-clipped=0.0 2023-10-02 17:45:51,153 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_41452_210_7291_1_1533437687_3941281_273-334088-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 17:45:51,156 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_128-241008-0_sp1.1 from training. Duration: 0.8545625 2023-10-02 17:45:54,092 WARNING [train.py:1204] (2/4) Exclude cut with ID _8394_210_6702_1_1528590504316_3540189_379-49457-0 from training. Duration: 0.87 2023-10-02 17:45:55,548 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_105-114797-0 from training. Duration: 0.84 2023-10-02 17:45:55,637 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_14928_210_5632_1_1530773461_4714000_454-112198-0 from training. Duration: 0.66 2023-10-02 17:45:59,765 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_456-22141-0_sp1.1 from training. Duration: 0.8 2023-10-02 17:46:03,363 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_115-233929-0 from training. Duration: 0.72 2023-10-02 17:46:03,667 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.bypass_mid.scale_min, batch_count=965113.3333333334, ans=0.2 2023-10-02 17:46:04,741 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_616-16795-0_sp1.1 from training. Duration: 0.963625 2023-10-02 17:46:12,132 INFO [train.py:1046] (2/4) Epoch 28, batch 1350, loss[loss=0.1702, simple_loss=0.2441, pruned_loss=0.0482, over 23483.00 frames. ], tot_loss[loss=0.1667, simple_loss=0.2449, pruned_loss=0.04418, over 4721298.15 frames. ], batch size: 120, lr: 3.67e-03, grad_scale: 16.0 2023-10-02 17:46:13,589 WARNING [train.py:1204] (2/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_448-212135-0 from training. Duration: 0.91 2023-10-02 17:46:16,413 WARNING [train.py:1204] (2/4) Exclude cut with ID _46683_210_9642_1_1533730913492_4591116_201-205677-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 17:46:19,171 WARNING [train.py:1204] (2/4) Exclude cut with ID _64086_210_7994_1_1535270417019_7435949_43-80483-0_sp1.1 from training. Duration: 0.97275 2023-10-02 17:46:22,439 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_240-134124-0_sp1.1 from training. Duration: 0.963625 2023-10-02 17:46:22,459 WARNING [train.py:1204] (2/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_907-120940-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 17:46:25,155 WARNING [train.py:1204] (2/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_848-280957-0_sp1.1 from training. Duration: 0.7909375 2023-10-02 17:46:25,194 WARNING [train.py:1204] (2/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_243-42116-0_sp0.9 from training. Duration: 0.811125 2023-10-02 17:46:25,494 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=965246.6666666666, ans=0.1 2023-10-02 17:46:27,987 WARNING [train.py:1204] (2/4) Exclude cut with ID _6694_210_4599_1_1527852637490_3965529_116-109398-0_sp0.9 from training. Duration: 0.811125 2023-10-02 17:46:29,324 WARNING [train.py:1204] (2/4) Exclude cut with ID _24314_210_4915_1_1532135078116_7170179_521-258868-0 from training. Duration: 0.79 2023-10-02 17:46:31,333 WARNING [train.py:1204] (2/4) Exclude cut with ID _2220_210_1819_1_1525509109898_3658680_50-99837-0_sp0.9 from training. Duration: 0.6 2023-10-02 17:46:32,673 WARNING [train.py:1204] (2/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_313-134930-0_sp1.1 from training. Duration: 0.8818125 2023-10-02 17:46:34,528 WARNING [train.py:1204] (2/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_125-144174-0 from training. Duration: 0.39 2023-10-02 17:46:34,578 WARNING [train.py:1204] (2/4) Exclude cut with ID _59288_210_5854_1_1535077757774_3412246_275-293897-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 17:46:36,013 WARNING [train.py:1204] (2/4) Exclude cut with ID _40519_210_13794_1_1533715228655_7337390_722-52880-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 17:46:36,035 WARNING [train.py:1204] (2/4) Exclude cut with ID _7537_210_2084_1_1528502395431_3924359_128-268029-0 from training. Duration: 0.98 2023-10-02 17:46:38,880 WARNING [train.py:1204] (2/4) Exclude cut with ID _8021_210_7994_1_1528376451358_3653069_179-45206-0 from training. Duration: 0.86 2023-10-02 17:46:40,238 WARNING [train.py:1204] (2/4) Exclude cut with ID _13252_210_1925_1_1530671372047_3336909_88-276668-0 from training. Duration: 0.99 2023-10-02 17:46:42,203 WARNING [train.py:1204] (2/4) Exclude cut with ID _22613_210_5219_1_1532253599686_4090170_290-212862-0_sp1.1 from training. Duration: 0.97275 2023-10-02 17:46:42,208 WARNING [train.py:1204] (2/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_465-214726-0 from training. Duration: 0.86 2023-10-02 17:46:45,239 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.conv_module1.balancer1.prob, batch_count=965313.3333333334, ans=0.125 2023-10-02 17:46:52,878 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.conv_module1.balancer1.prob, batch_count=965313.3333333334, ans=0.125 2023-10-02 17:46:55,346 WARNING [train.py:1204] (2/4) Exclude cut with ID _56480_210_4220_1_1534679942079_3075409_138-36031-0_sp1.1 from training. Duration: 0.97275 2023-10-02 17:47:02,472 WARNING [train.py:1204] (2/4) Exclude cut with ID _58605_210_12210_1_1534723195939_3904259_300-285878-0_sp1.1 from training. Duration: 0.97275 2023-10-02 17:47:02,497 WARNING [train.py:1204] (2/4) Exclude cut with ID _40901_210_5665_1_1533363789958_3856130_114-340229-0_sp1.1 from training. Duration: 0.936375 2023-10-02 17:47:04,275 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_489-230703-0 from training. Duration: 0.85 2023-10-02 17:47:04,471 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.bypass_mid.scale_min, batch_count=965380.0, ans=0.2 2023-10-02 17:47:06,534 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.2.encoder.layers.0.nonlin_attention.whiten1, num_groups=1, num_channels=288, metric=8.80 vs. limit=10.0 2023-10-02 17:47:07,218 WARNING [train.py:1204] (2/4) Exclude cut with ID _66873_210_5517_1_1535454136506_4293370_322-81243-0_sp1.1 from training. Duration: 0.936375 2023-10-02 17:47:07,309 WARNING [train.py:1204] (2/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_271-269084-0 from training. Duration: 0.83 2023-10-02 17:47:07,319 WARNING [train.py:1204] (2/4) Exclude cut with ID _13252_210_1925_1_1530671372047_3336909_166-314607-0_sp0.9 from training. Duration: 0.6 2023-10-02 17:47:07,840 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.4.encoder.layers.2.self_attn_weights.whiten_keys, num_groups=4, num_channels=128, metric=2.22 vs. limit=6.0 2023-10-02 17:47:08,665 WARNING [train.py:1204] (2/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_340-343302-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 17:47:10,240 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.ff2_skip_rate, batch_count=965446.6666666666, ans=0.0 2023-10-02 17:47:10,284 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.feed_forward1.hidden_balancer.prob, batch_count=965446.6666666666, ans=0.125 2023-10-02 17:47:11,372 WARNING [train.py:1204] (2/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_286-112015-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 17:47:13,169 WARNING [train.py:1204] (2/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_328-264776-0 from training. Duration: 0.67 2023-10-02 17:47:14,448 WARNING [train.py:1204] (2/4) Exclude cut with ID _4866_210_2577_1_1527404412284_4364659_256-306830-0_sp0.9 from training. Duration: 0.9666875 2023-10-02 17:47:16,067 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.conv_module2.balancer1.prob, batch_count=965446.6666666666, ans=0.125 2023-10-02 17:47:19,225 WARNING [train.py:1204] (2/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_239-316802-0 from training. Duration: 0.69 2023-10-02 17:47:20,698 WARNING [train.py:1204] (2/4) Exclude cut with ID _29857_210_20034_1_1532395886753_3798160_279-185801-0 from training. Duration: 0.91 2023-10-02 17:47:22,328 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.conv_module1.balancer1.prob, batch_count=965446.6666666666, ans=0.125 2023-10-02 17:47:25,897 INFO [train.py:1046] (2/4) Epoch 28, batch 1400, loss[loss=0.1808, simple_loss=0.2528, pruned_loss=0.05443, over 23420.00 frames. ], tot_loss[loss=0.1656, simple_loss=0.243, pruned_loss=0.04413, over 4709146.75 frames. ], batch size: 120, lr: 3.67e-03, grad_scale: 16.0 2023-10-02 17:47:26,010 WARNING [train.py:1204] (2/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_656-188361-0 from training. Duration: 0.87 2023-10-02 17:47:27,448 WARNING [train.py:1204] (2/4) Exclude cut with ID _44718_210_5816_1_1534050051869_6469030_100-230287-0_sp1.1 from training. Duration: 0.936375 2023-10-02 17:47:30,273 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8275_210_4198_1_1528455320_3892571_276-215622-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 17:47:31,622 WARNING [train.py:1204] (2/4) Exclude cut with ID _30304_210_6319_1_1532476827261_7582060_698-309430-0_sp1.1 from training. Duration: 0.963625 2023-10-02 17:47:31,926 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.bypass.scale_min, batch_count=965513.3333333334, ans=0.2 2023-10-02 17:47:36,336 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_365-217119-0 from training. Duration: 0.63 2023-10-02 17:47:37,733 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7264_210_4938_1_1527939911_8132095_800-91995-0 from training. Duration: 0.86 2023-10-02 17:47:47,192 WARNING [train.py:1204] (2/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_49-97158-0_sp1.1 from training. Duration: 0.6909375 2023-10-02 17:47:49,932 WARNING [train.py:1204] (2/4) Exclude cut with ID _61132_210_9969_1_1535021792964_3755419_61-57720-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 17:47:50,174 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.0.balancer1.prob, batch_count=965580.0, ans=0.125 2023-10-02 17:47:51,395 WARNING [train.py:1204] (2/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_345-306832-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 17:47:51,425 WARNING [train.py:1204] (2/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_164-224052-0_sp0.9 from training. Duration: 0.77775 2023-10-02 17:47:54,391 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_34886_210_10633_1_1533203882_1755661_76-156096-0_sp1.1 from training. Duration: 0.7909375 2023-10-02 17:47:55,748 WARNING [train.py:1204] (2/4) Exclude cut with ID 4133-6541-0027-40495-0_sp1.1 from training. Duration: 0.9681875 2023-10-02 17:48:06,140 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_514-211165-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 17:48:07,760 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_41211_210_2544_1_1533348859_3702910_97-212695-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 17:48:11,048 WARNING [train.py:1204] (2/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_406-45086-0 from training. Duration: 0.8 2023-10-02 17:48:12,440 WARNING [train.py:1204] (2/4) Exclude cut with ID _65263_210_22356_1_1535763598691_3921037_49-222917-0_sp1.1 from training. Duration: 0.836375 2023-10-02 17:48:12,492 WARNING [train.py:1204] (2/4) Exclude cut with ID _4866_210_2577_1_1527404412284_4364659_352-157930-0_sp1.1 from training. Duration: 0.736375 2023-10-02 17:48:12,560 WARNING [train.py:1204] (2/4) Exclude cut with ID _4366_210_1388_1_1526792053633_3613819_231-211402-0_sp1.1 from training. Duration: 0.8545625 2023-10-02 17:48:13,838 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_98-21576-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 17:48:15,176 WARNING [train.py:1204] (2/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_348-127789-0_sp0.9 from training. Duration: 0.9444375 2023-10-02 17:48:15,191 WARNING [train.py:1204] (2/4) Exclude cut with ID _45871_210_5665_1_1533864614245_3296060_98-329557-0_sp1.1 from training. Duration: 0.97275 2023-10-02 17:48:15,241 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6846_210_6753_1_1527850767_4295158_2-315073-0_sp1.1 from training. Duration: 0.8 2023-10-02 17:48:15,343 WARNING [train.py:1204] (2/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_145-137790-0 from training. Duration: 0.83 2023-10-02 17:48:16,701 WARNING [train.py:1204] (2/4) Exclude cut with ID _14603_210_4220_1_1530615688441_3097319_133-158308-0_sp0.9 from training. Duration: 0.9333125 2023-10-02 17:48:19,941 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.611e+02 1.826e+02 2.070e+02 2.525e+02 5.054e+02, threshold=4.140e+02, percent-clipped=1.0 2023-10-02 17:48:20,109 WARNING [train.py:1204] (2/4) Exclude cut with ID _14898_210_9206_1_1530766812377_4177190_320-11758-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 17:48:24,276 WARNING [train.py:1204] (2/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_406-45086-0_sp0.9 from training. Duration: 0.888875 2023-10-02 17:48:27,324 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.bypass.scale_min, batch_count=965780.0, ans=0.2 2023-10-02 17:48:28,512 INFO [scaling.py:1118] (2/4) WithLoss: name=encoder.encoders.0.layers.0.self_attn_weights, loss-sum=0.000e+00 2023-10-02 17:48:31,319 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_5438_210_1794_1_1527336687_2600009_104-288274-0 from training. Duration: 0.58 2023-10-02 17:48:33,206 WARNING [train.py:1204] (2/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_108-53929-0_sp0.9 from training. Duration: 0.6555625 2023-10-02 17:48:33,269 WARNING [train.py:1204] (2/4) Exclude cut with ID _30800_210_22382_1_1532499347904_4515959_444-286390-0_sp1.1 from training. Duration: 0.963625 2023-10-02 17:48:36,497 WARNING [train.py:1204] (2/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_125-144174-0_sp0.9 from training. Duration: 0.4333125 2023-10-02 17:48:36,569 WARNING [train.py:1204] (2/4) Exclude cut with ID _55209_210_1531_1_1534675828996_4597160_118-345621-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 17:48:39,307 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_45886_210_17024_1_1533709215_471063_7-79165-0_sp1.1 from training. Duration: 0.8909375 2023-10-02 17:48:40,600 INFO [train.py:1046] (2/4) Epoch 28, batch 1450, loss[loss=0.1665, simple_loss=0.2389, pruned_loss=0.04707, over 23491.00 frames. ], tot_loss[loss=0.1647, simple_loss=0.2424, pruned_loss=0.04349, over 4713032.19 frames. ], batch size: 256, lr: 3.67e-03, grad_scale: 16.0 2023-10-02 17:48:43,931 WARNING [train.py:1204] (2/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_175-163894-0_sp0.9 from training. Duration: 0.82225 2023-10-02 17:48:45,334 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_50188_210_4198_1_1534503621_3646041_353-81682-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 17:48:45,347 WARNING [train.py:1204] (2/4) Exclude cut with ID _17875_210_2087_1_1531224012907_1821339_175-80565-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 17:48:45,365 WARNING [train.py:1204] (2/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_159-328157-0_sp1.1 from training. Duration: 0.47275 2023-10-02 17:48:51,463 WARNING [train.py:1204] (2/4) Exclude cut with ID _60057_210_10640_1_1534903065793_3333150_137-267579-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 17:48:51,524 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_92-46009-0_sp1.1 from training. Duration: 0.4909375 2023-10-02 17:48:52,936 WARNING [train.py:1204] (2/4) Exclude cut with ID _53203_210_2087_1_1534246218309_475590_48-68507-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 17:48:54,248 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_28211_210_6388_1_1532861506_4523224_128-328162-0 from training. Duration: 0.9 2023-10-02 17:48:54,320 WARNING [train.py:1204] (2/4) Exclude cut with ID _4697_210_2069_1_1526815801953_3823098_51-1049-0_sp1.1 from training. Duration: 0.6818125 2023-10-02 17:48:55,714 WARNING [train.py:1204] (2/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_383-316000-0 from training. Duration: 0.9 2023-10-02 17:48:57,019 WARNING [train.py:1204] (2/4) Exclude cut with ID _4779_210_2084_1_1527318022944_3914893_185-295153-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 17:48:58,346 WARNING [train.py:1204] (2/4) Exclude cut with ID _3231_210_2106_1_1526205864934_6784220_171-256004-0_sp0.9 from training. Duration: 0.9555625 2023-10-02 17:48:58,347 WARNING [train.py:1204] (2/4) Exclude cut with ID _41314_210_12651_1_1533459476543_3766460_87-272068-0 from training. Duration: 0.83 2023-10-02 17:48:59,831 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_164-152777-0_sp1.1 from training. Duration: 0.92725 2023-10-02 17:48:59,874 WARNING [train.py:1204] (2/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_18-130336-0_sp0.9 from training. Duration: 0.87775 2023-10-02 17:48:59,926 WARNING [train.py:1204] (2/4) Exclude cut with ID _14886_210_4220_1_1530702020355_3516190_175-229073-0_sp1.1 from training. Duration: 0.4090625 2023-10-02 17:48:59,938 WARNING [train.py:1204] (2/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_791-45149-0_sp0.9 from training. Duration: 0.9555625 2023-10-02 17:49:01,384 WARNING [train.py:1204] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_468-189058-0_sp1.1 from training. Duration: 0.87275 2023-10-02 17:49:03,071 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.conv_skip_rate, batch_count=965913.3333333334, ans=0.0 2023-10-02 17:49:04,117 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8278_210_2550_1_1528637099_4220413_32-142780-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 17:49:06,068 WARNING [train.py:1204] (2/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_146-218763-0_sp0.9 from training. Duration: 0.9555625 2023-10-02 17:49:08,863 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.bypass.scale_min, batch_count=965980.0, ans=0.2 2023-10-02 17:49:10,594 WARNING [train.py:1204] (2/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_348-127789-0_sp1.1 from training. Duration: 0.77275 2023-10-02 17:49:10,599 WARNING [train.py:1204] (2/4) Exclude cut with ID _47790_210_16187_1_1533862814066_4207519_288-285445-0_sp1.1 from training. Duration: 0.936375 2023-10-02 17:49:12,062 WARNING [train.py:1204] (2/4) Exclude cut with ID _14603_210_4220_1_1530615688441_3097319_308-181732-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 17:49:13,358 WARNING [train.py:1204] (2/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_895-117428-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 17:49:14,769 WARNING [train.py:1204] (2/4) Exclude cut with ID _9235_210_5219_1_1528891154253_8046120_387-159008-0_sp0.9 from training. Duration: 0.9555625 2023-10-02 17:49:14,781 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_37351_210_7925_1_1533012931_3812113_265-85780-0_sp0.9 from training. Duration: 0.97775 2023-10-02 17:49:14,794 WARNING [train.py:1204] (2/4) Exclude cut with ID _40551_210_10635_1_1533776424632_3416179_361-3964-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 17:49:16,126 WARNING [train.py:1204] (2/4) Exclude cut with ID _25882_210_3036_1_1532775665815_6479510_485-122742-0_sp1.1 from training. Duration: 0.97275 2023-10-02 17:49:16,686 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.4.encoder.layers.2.conv_module2.whiten, num_groups=1, num_channels=384, metric=2.98 vs. limit=15.0 2023-10-02 17:49:20,682 WARNING [train.py:1204] (2/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_98-51676-0 from training. Duration: 0.73 2023-10-02 17:49:23,808 WARNING [train.py:1204] (2/4) Exclude cut with ID _30548_210_16040_1_1532608097130_3146969_371-280610-0_sp1.1 from training. Duration: 0.92725 2023-10-02 17:49:25,306 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.ff2_skip_rate, batch_count=966046.6666666666, ans=0.0 2023-10-02 17:49:27,816 WARNING [train.py:1204] (2/4) Exclude cut with ID IC0671W0081-331924 from training. Duration: 0.896 2023-10-02 17:49:27,914 WARNING [train.py:1204] (2/4) Exclude cut with ID _40284_210_2084_1_1533513679814_3704279_380-160775-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 17:49:29,288 WARNING [train.py:1204] (2/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_57-270037-0_sp1.1 from training. Duration: 0.7 2023-10-02 17:49:30,583 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_853-150237-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 17:49:30,670 WARNING [train.py:1204] (2/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_491-261923-0 from training. Duration: 0.98 2023-10-02 17:49:34,782 WARNING [train.py:1204] (2/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_836-244045-0_sp1.1 from training. Duration: 0.97275 2023-10-02 17:49:34,864 WARNING [train.py:1204] (2/4) Exclude cut with ID _14880_210_8895_1_1530680426655_3795259_289-212817-0 from training. Duration: 0.69 2023-10-02 17:49:36,669 WARNING [train.py:1204] (2/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_303-340446-0 from training. Duration: 0.9 2023-10-02 17:49:38,098 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_37152_210_9697_1_1533106573_3844188_418-41736-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 17:49:42,226 WARNING [train.py:1204] (2/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_371-18169-0_sp1.1 from training. Duration: 0.7818125 2023-10-02 17:49:42,261 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_187-219984-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 17:49:44,259 WARNING [train.py:1204] (2/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_63-267585-0 from training. Duration: 0.9 2023-10-02 17:49:46,222 WARNING [train.py:1204] (2/4) Exclude cut with ID _27013_210_1389_1_1532437173240_3252649_222-125573-0 from training. Duration: 0.98 2023-10-02 17:49:46,289 WARNING [train.py:1204] (2/4) Exclude cut with ID _14821_210_8750_1_1530680503991_6829359_189-297451-0 from training. Duration: 0.55 2023-10-02 17:49:48,905 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_557-163391-0_sp1.1 from training. Duration: 0.97275 2023-10-02 17:49:49,011 WARNING [train.py:1204] (2/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_65-88242-0_sp1.1 from training. Duration: 0.5090625 2023-10-02 17:49:54,981 INFO [train.py:1046] (2/4) Epoch 28, batch 1500, loss[loss=0.1876, simple_loss=0.2596, pruned_loss=0.0578, over 23690.00 frames. ], tot_loss[loss=0.1649, simple_loss=0.2427, pruned_loss=0.04362, over 4714804.15 frames. ], batch size: 232, lr: 3.67e-03, grad_scale: 16.0 2023-10-02 17:49:59,347 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.feed_forward1.hidden_balancer.prob, batch_count=966180.0, ans=0.125 2023-10-02 17:50:00,457 WARNING [train.py:1204] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_218-295097-0 from training. Duration: 0.77 2023-10-02 17:50:00,486 WARNING [train.py:1204] (2/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_686-247600-0_sp1.1 from training. Duration: 0.836375 2023-10-02 17:50:00,490 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_397-147196-0_sp0.9 from training. Duration: 0.888875 2023-10-02 17:50:00,575 WARNING [train.py:1204] (2/4) Exclude cut with ID _53950_210_5816_1_1534417264919_3862951_237-136449-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 17:50:01,846 WARNING [train.py:1204] (2/4) Exclude cut with ID _3120_210_3525_1_1526036143112_4072980_74-178400-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 17:50:01,923 WARNING [train.py:1204] (2/4) Exclude cut with ID _5166_210_2084_1_1527073197160_4087993_309-222545-0_sp0.9 from training. Duration: 0.9333125 2023-10-02 17:50:03,286 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7544_210_6753_1_1528587722_3576686_172-171750-0 from training. Duration: 0.65 2023-10-02 17:50:05,298 WARNING [train.py:1204] (2/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_3-237434-0_sp0.9 from training. Duration: 0.8444375 2023-10-02 17:50:05,336 WARNING [train.py:1204] (2/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_101-293367-0_sp0.9 from training. Duration: 0.7 2023-10-02 17:50:05,343 WARNING [train.py:1204] (2/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_818-213552-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 17:50:06,708 WARNING [train.py:1204] (2/4) Exclude cut with ID _14349_210_8341_1_1530615710818_3442470_115-285975-0_sp1.1 from training. Duration: 0.7818125 2023-10-02 17:50:09,295 WARNING [train.py:1204] (2/4) Exclude cut with ID _30800_210_22382_1_1532499347904_4515959_291-309749-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 17:50:09,385 WARNING [train.py:1204] (2/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_715-3462-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 17:50:13,599 WARNING [train.py:1204] (2/4) Exclude cut with ID _59502_210_12210_1_1535107024698_1835139_13-331827-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 17:50:13,609 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_4528_210_3273_1_1527076654_3276777_342-168068-0 from training. Duration: 0.87 2023-10-02 17:50:15,473 WARNING [train.py:1204] (2/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_215-141059-0_sp0.9 from training. Duration: 0.688875 2023-10-02 17:50:15,509 WARNING [train.py:1204] (2/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_106-286130-0_sp1.1 from training. Duration: 0.8909375 2023-10-02 17:50:17,386 WARNING [train.py:1204] (2/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_480-77655-0_sp1.1 from training. Duration: 0.97275 2023-10-02 17:50:20,150 WARNING [train.py:1204] (2/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_848-280957-0 from training. Duration: 0.87 2023-10-02 17:50:24,856 WARNING [train.py:1204] (2/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_122-109023-0 from training. Duration: 0.76 2023-10-02 17:50:26,209 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_11305_210_1794_1_1529750428_7178148_645-233480-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 17:50:26,258 WARNING [train.py:1204] (2/4) Exclude cut with ID _7562_210_2084_1_1528887767916_3946229_345-169156-0 from training. Duration: 0.58 2023-10-02 17:50:30,267 WARNING [train.py:1204] (2/4) Exclude cut with ID _7562_210_2084_1_1528887767916_3946229_345-169156-0_sp1.1 from training. Duration: 0.52725 2023-10-02 17:50:31,720 WARNING [train.py:1204] (2/4) Exclude cut with ID _13253_210_1925_1_1530757775819_3577873_253-246386-0_sp1.1 from training. Duration: 0.8181875 2023-10-02 17:50:31,809 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_136-264444-0_sp1.1 from training. Duration: 0.97275 2023-10-02 17:50:31,842 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_2168_210_2290_1_1525606920_1849400_136-222991-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 17:50:33,182 WARNING [train.py:1204] (2/4) Exclude cut with ID _7852_210_5219_1_1528286471316_8139400_242-105336-0 from training. Duration: 0.72 2023-10-02 17:50:35,157 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_5960_210_1794_1_1527403865_3990999_515-2613-0_sp1.1 from training. Duration: 0.8 2023-10-02 17:50:35,180 WARNING [train.py:1204] (2/4) Exclude cut with ID _39688_210_19596_1_1533371626725_4154159_551-167834-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 17:50:36,492 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_43802_210_2290_1_1533516203_8829504_85-195240-0 from training. Duration: 0.87 2023-10-02 17:50:36,553 WARNING [train.py:1204] (2/4) Exclude cut with ID _63171_210_15488_1_1535104792794_3295110_113-113534-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 17:50:41,286 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.4.encoder.layers.0.nonlin_attention.whiten2, num_groups=1, num_channels=384, metric=13.65 vs. limit=15.0 2023-10-02 17:50:42,077 WARNING [train.py:1204] (2/4) Exclude cut with ID _8833_210_5219_1_1528592665210_7553059_319-349181-0_sp1.1 from training. Duration: 0.9 2023-10-02 17:50:42,078 WARNING [train.py:1204] (2/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_225-43381-0 from training. Duration: 0.94 2023-10-02 17:50:46,224 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_5959_210_1794_1_1527400400_3445726_412-44052-0_sp0.9 from training. Duration: 0.8555625 2023-10-02 17:50:47,939 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.619e+02 1.911e+02 2.171e+02 2.439e+02 3.664e+02, threshold=4.341e+02, percent-clipped=0.0 2023-10-02 17:50:48,163 WARNING [train.py:1204] (2/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_356-167816-0_sp1.1 from training. Duration: 0.6545625 2023-10-02 17:50:51,394 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9012W0112-501244 from training. Duration: 0.768 2023-10-02 17:50:52,786 WARNING [train.py:1204] (2/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_177-326952-0_sp1.1 from training. Duration: 0.92725 2023-10-02 17:50:52,790 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9016W0116-502789 from training. Duration: 0.896 2023-10-02 17:50:54,798 WARNING [train.py:1204] (2/4) Exclude cut with ID _56235_210_10640_1_1534557335235_4161099_557-204815-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 17:50:55,951 WARNING [train.py:1204] (2/4) Exclude cut with ID _55857_210_2346_1_1534644007831_6898209_279-229469-0_sp1.1 from training. Duration: 0.936375 2023-10-02 17:50:56,028 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9029W0375-509764 from training. Duration: 0.896 2023-10-02 17:50:57,316 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_2295_210_2087_1_1525500993_2407863_234-83104-0_sp0.9 from training. Duration: 0.688875 2023-10-02 17:50:58,812 WARNING [train.py:1204] (2/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_266-137104-0 from training. Duration: 0.65 2023-10-02 17:51:00,218 WARNING [train.py:1204] (2/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_289-163927-0_sp1.1 from training. Duration: 0.92725 2023-10-02 17:51:01,718 WARNING [train.py:1204] (2/4) Exclude cut with ID _12733_210_8341_1_1530167424556_3646380_86-225314-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 17:51:01,744 WARNING [train.py:1204] (2/4) Exclude cut with ID _7877_210_6014_1_1528286358099_3750136_618-284064-0_sp1.1 from training. Duration: 0.92725 2023-10-02 17:51:03,484 WARNING [train.py:1204] (2/4) Exclude cut with ID _55844_210_2346_1_1534471212569_7625707_1017-31469-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 17:51:03,545 WARNING [train.py:1204] (2/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_617-241446-0_sp1.1 from training. Duration: 0.92725 2023-10-02 17:51:04,840 WARNING [train.py:1204] (2/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_866-173440-0_sp1.1 from training. Duration: 0.8181875 2023-10-02 17:51:06,243 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_9460_210_4211_1_1528954587_10049667_480-260269-0 from training. Duration: 0.56 2023-10-02 17:51:07,405 WARNING [train.py:1204] (2/4) Exclude cut with ID _44659_210_2721_1_1534036120061_6614860_626-298884-0 from training. Duration: 0.97 2023-10-02 17:51:07,438 WARNING [train.py:1204] (2/4) Exclude cut with ID _53565_210_10635_1_1534337218335_3820259_287-18120-0_sp1.1 from training. Duration: 0.8818125 2023-10-02 17:51:07,500 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_791-197891-0 from training. Duration: 0.84 2023-10-02 17:51:08,836 INFO [train.py:1046] (2/4) Epoch 28, batch 1550, loss[loss=0.1664, simple_loss=0.2357, pruned_loss=0.0486, over 23681.00 frames. ], tot_loss[loss=0.1661, simple_loss=0.2439, pruned_loss=0.04416, over 4710904.25 frames. ], batch size: 232, lr: 3.67e-03, grad_scale: 16.0 2023-10-02 17:51:10,200 WARNING [train.py:1204] (2/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_527-238990-0 from training. Duration: 0.9 2023-10-02 17:51:11,616 WARNING [train.py:1204] (2/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_449-65969-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 17:51:12,984 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_553-341452-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 17:51:13,032 WARNING [train.py:1204] (2/4) Exclude cut with ID _16909_210_5724_1_1531292079912_4118919_300-47574-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 17:51:13,235 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.feed_forward2.hidden_balancer.prob, batch_count=966513.3333333334, ans=0.125 2023-10-02 17:51:14,396 WARNING [train.py:1204] (2/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_70-27732-0_sp1.1 from training. Duration: 0.8545625 2023-10-02 17:51:15,923 WARNING [train.py:1204] (2/4) Exclude cut with ID _44718_210_5816_1_1534050051869_6469030_727-28074-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 17:51:17,635 WARNING [train.py:1204] (2/4) Exclude cut with ID _55857_210_2346_1_1534644007831_6898209_357-123664-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 17:51:21,122 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9123W0004-555964 from training. Duration: 0.896 2023-10-02 17:51:21,152 WARNING [train.py:1204] (2/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_445-145208-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 17:51:21,192 WARNING [train.py:1204] (2/4) Exclude cut with ID _3675_210_4557_1_1526639934408_4182089_456-18305-0_sp1.1 from training. Duration: 0.6909375 2023-10-02 17:51:22,558 WARNING [train.py:1204] (2/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_1-99680-0_sp1.1 from training. Duration: 0.6090625 2023-10-02 17:51:23,961 WARNING [train.py:1204] (2/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_146-282400-0_sp0.9 from training. Duration: 0.788875 2023-10-02 17:51:23,976 WARNING [train.py:1204] (2/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_844-96989-0 from training. Duration: 0.71 2023-10-02 17:51:25,973 WARNING [train.py:1204] (2/4) Exclude cut with ID _29853_210_7497_1_1532505600851_6642110_128-146364-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 17:51:26,004 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_224-322394-0 from training. Duration: 0.66 2023-10-02 17:51:27,368 WARNING [train.py:1204] (2/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_191-59784-0 from training. Duration: 0.88 2023-10-02 17:51:27,373 WARNING [train.py:1204] (2/4) Exclude cut with ID _12556_210_8341_1_1530081024100_7327690_725-193477-0 from training. Duration: 0.98 2023-10-02 17:51:27,424 WARNING [train.py:1204] (2/4) Exclude cut with ID _5166_210_2084_1_1527073197160_4087993_523-79838-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 17:51:30,075 WARNING [train.py:1204] (2/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_398-334558-0_sp1.1 from training. Duration: 0.87275 2023-10-02 17:51:34,184 WARNING [train.py:1204] (2/4) Exclude cut with ID _6456_210_1534_1_1527681634212_3181049_171-36697-0_sp1.1 from training. Duration: 0.936375 2023-10-02 17:51:35,615 WARNING [train.py:1204] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_700-183549-0 from training. Duration: 0.68 2023-10-02 17:51:35,617 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8853_210_3273_1_1528633899_818968_51-187685-0 from training. Duration: 0.69 2023-10-02 17:51:43,351 WARNING [train.py:1204] (2/4) Exclude cut with ID _7719_210_3525_1_1528456153163_4296960_333-110399-0_sp1.1 from training. Duration: 0.87275 2023-10-02 17:51:46,099 WARNING [train.py:1204] (2/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_1022-83740-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 17:51:46,121 WARNING [train.py:1204] (2/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_61-327062-0_sp0.9 from training. Duration: 0.9 2023-10-02 17:51:46,126 WARNING [train.py:1204] (2/4) Exclude cut with ID _47251_210_18628_1_1533814063128_4137250_214-88076-0_sp1.1 from training. Duration: 0.8 2023-10-02 17:51:47,417 WARNING [train.py:1204] (2/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_839-173017-0 from training. Duration: 0.41 2023-10-02 17:51:52,666 WARNING [train.py:1204] (2/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_299-34413-0_sp1.1 from training. Duration: 0.5909375 2023-10-02 17:51:55,352 WARNING [train.py:1204] (2/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_664-159457-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 17:51:58,616 WARNING [train.py:1204] (2/4) Exclude cut with ID _66873_210_5517_1_1535454136506_4293370_483-291760-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 17:52:01,323 WARNING [train.py:1204] (2/4) Exclude cut with ID _30800_210_22382_1_1532499347904_4515959_287-255474-0_sp1.1 from training. Duration: 0.92725 2023-10-02 17:52:01,367 WARNING [train.py:1204] (2/4) Exclude cut with ID _48677_210_5663_1_1533902405919_3892989_111-11439-0_sp1.1 from training. Duration: 0.87275 2023-10-02 17:52:02,635 WARNING [train.py:1204] (2/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_399-156791-0 from training. Duration: 0.83 2023-10-02 17:52:02,674 WARNING [train.py:1204] (2/4) Exclude cut with ID _3992_210_1747_1_1526644816031_3731060_36-147855-0_sp0.9 from training. Duration: 0.8666875 2023-10-02 17:52:04,103 WARNING [train.py:1204] (2/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_442-320317-0_sp1.1 from training. Duration: 0.7454375 2023-10-02 17:52:04,119 WARNING [train.py:1204] (2/4) Exclude cut with ID _55429_210_10626_1_1534467588527_7174143_953-80868-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 17:52:05,478 WARNING [train.py:1204] (2/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_159-328157-0_sp0.9 from training. Duration: 0.57775 2023-10-02 17:52:05,485 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9309W0197-640324 from training. Duration: 0.896 2023-10-02 17:52:05,699 INFO [scaling.py:1118] (2/4) WithLoss: name=encoder.encoders.1.encoder.layers.0.self_attn_weights, loss-sum=0.000e+00 2023-10-02 17:52:08,189 WARNING [train.py:1204] (2/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_361-30822-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 17:52:14,077 WARNING [train.py:1204] (2/4) Exclude cut with ID _5250_210_5219_1_1527249561806_8692110_636-251213-0 from training. Duration: 0.94 2023-10-02 17:52:19,533 WARNING [train.py:1204] (2/4) Exclude cut with ID _49728_210_12651_1_1534041034294_6788150_468-333827-0_sp1.1 from training. Duration: 0.963625 2023-10-02 17:52:20,817 WARNING [train.py:1204] (2/4) Exclude cut with ID _25882_210_3036_1_1532775665815_6479510_623-191759-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 17:52:20,880 WARNING [train.py:1204] (2/4) Exclude cut with ID _5189_210_4557_1_1527248319428_3769229_199-53526-0 from training. Duration: 0.42 2023-10-02 17:52:22,668 INFO [train.py:1046] (2/4) Epoch 28, batch 1600, loss[loss=0.1571, simple_loss=0.2459, pruned_loss=0.03417, over 24512.00 frames. ], tot_loss[loss=0.1667, simple_loss=0.2446, pruned_loss=0.04443, over 4717622.43 frames. ], batch size: 71, lr: 3.67e-03, grad_scale: 32.0 2023-10-02 17:52:24,575 WARNING [train.py:1204] (2/4) Exclude cut with ID _6274_210_2084_1_1527937583837_7494020_32-205288-0_sp0.9 from training. Duration: 0.8666875 2023-10-02 17:52:25,934 WARNING [train.py:1204] (2/4) Exclude cut with ID _65263_210_22356_1_1535763598691_3921037_401-346646-0_sp1.1 from training. Duration: 0.963625 2023-10-02 17:52:25,939 WARNING [train.py:1204] (2/4) Exclude cut with ID _12555_210_8750_1_1530075594402_3780239_449-267986-0_sp1.1 from training. Duration: 0.7545625 2023-10-02 17:52:25,959 WARNING [train.py:1204] (2/4) Exclude cut with ID _69884_210_9771_1_1535846392818_4166169_430-201020-0_sp1.1 from training. Duration: 0.8818125 2023-10-02 17:52:26,044 WARNING [train.py:1204] (2/4) Exclude cut with ID _8394_210_6702_1_1528590504316_3540189_379-49457-0_sp1.1 from training. Duration: 0.7909375 2023-10-02 17:52:30,839 WARNING [train.py:1204] (2/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_200-105218-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 17:52:32,097 WARNING [train.py:1204] (2/4) Exclude cut with ID _3867_210_1883_1_1526641200575_4475830_319-349850-0 from training. Duration: 0.67 2023-10-02 17:52:32,176 WARNING [train.py:1204] (2/4) Exclude cut with ID _28317_210_12742_1_1532480302381_8510325_916-195848-0 from training. Duration: 0.92 2023-10-02 17:52:33,459 WARNING [train.py:1204] (2/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_436-186311-0 from training. Duration: 0.65 2023-10-02 17:52:34,006 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.4.encoder.layers.0.conv_module1.whiten, num_groups=1, num_channels=384, metric=5.41 vs. limit=15.0 2023-10-02 17:52:34,865 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_3910_210_1794_1_1526777667_3747260_302-180617-0_sp1.1 from training. Duration: 0.97275 2023-10-02 17:52:35,394 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.4.encoder.layers.2.self_attn2.whiten, num_groups=1, num_channels=384, metric=11.60 vs. limit=22.5 2023-10-02 17:52:36,227 WARNING [train.py:1204] (2/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_102-92751-0 from training. Duration: 0.99 2023-10-02 17:52:36,289 WARNING [train.py:1204] (2/4) Exclude cut with ID _51899_210_20034_1_1534129254466_3669220_265-215428-0_sp1.1 from training. Duration: 0.8909375 2023-10-02 17:52:40,252 WARNING [train.py:1204] (2/4) Exclude cut with ID _58583_210_5236_1_1534726835266_3574249_438-198993-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 17:52:40,488 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.self_attn_weights.pos_emb_skip_rate, batch_count=966913.3333333334, ans=0.0 2023-10-02 17:52:44,864 WARNING [train.py:1204] (2/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_166-166990-0_sp1.1 from training. Duration: 0.87275 2023-10-02 17:52:47,531 WARNING [train.py:1204] (2/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_907-151398-0 from training. Duration: 0.37 2023-10-02 17:52:48,950 WARNING [train.py:1204] (2/4) Exclude cut with ID _38670_210_15379_1_1533448810659_4391059_452-256669-0_sp1.1 from training. Duration: 0.936375 2023-10-02 17:52:50,313 WARNING [train.py:1204] (2/4) Exclude cut with ID _48677_210_5663_1_1533902405919_3892989_408-40740-0 from training. Duration: 0.83 2023-10-02 17:52:50,350 WARNING [train.py:1204] (2/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_903-158756-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 17:52:51,719 WARNING [train.py:1204] (2/4) Exclude cut with ID _13252_210_1925_1_1530671372047_3336909_397-216497-0 from training. Duration: 0.96 2023-10-02 17:52:56,962 WARNING [train.py:1204] (2/4) Exclude cut with ID _55429_210_10626_1_1534467588527_7174143_97-335084-0 from training. Duration: 0.97 2023-10-02 17:53:05,258 WARNING [train.py:1204] (2/4) Exclude cut with ID _45329_210_7005_1_1534157951211_6774339_453-267730-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 17:53:05,327 WARNING [train.py:1204] (2/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_153-97441-0 from training. Duration: 0.56 2023-10-02 17:53:06,675 WARNING [train.py:1204] (2/4) Exclude cut with ID _40901_210_5665_1_1533363789958_3856130_312-329122-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 17:53:06,728 WARNING [train.py:1204] (2/4) Exclude cut with ID _30800_210_22382_1_1532499347904_4515959_97-126471-0_sp1.1 from training. Duration: 0.97275 2023-10-02 17:53:06,728 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_11305_210_1794_1_1529750428_7178148_257-312870-0_sp1.1 from training. Duration: 0.8 2023-10-02 17:53:09,464 WARNING [train.py:1204] (2/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_454-314901-0_sp0.9 from training. Duration: 0.511125 2023-10-02 17:53:12,212 WARNING [train.py:1204] (2/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_282-136641-0_sp1.1 from training. Duration: 0.3818125 2023-10-02 17:53:14,271 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_46013_210_7234_1_1533776362_3744032_493-160724-0_sp1.1 from training. Duration: 0.963625 2023-10-02 17:53:15,487 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.365e+02 1.816e+02 2.018e+02 2.318e+02 3.311e+02, threshold=4.036e+02, percent-clipped=0.0 2023-10-02 17:53:15,571 WARNING [train.py:1204] (2/4) Exclude cut with ID _67831_210_12392_1_1535612464202_3092612_259-33442-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 17:53:16,747 WARNING [train.py:1204] (2/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_289-62384-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 17:53:16,800 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6545_210_2040_1_1527774670_4245258_379-14182-0_sp0.9 from training. Duration: 0.888875 2023-10-02 17:53:18,164 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_548-294316-0_sp1.1 from training. Duration: 0.736375 2023-10-02 17:53:19,528 WARNING [train.py:1204] (2/4) Exclude cut with ID _6275_210_2084_1_1528110062213_7669049_857-298942-0_sp1.1 from training. Duration: 0.77275 2023-10-02 17:53:20,873 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6673_210_3273_1_1527767790_3173240_327-306185-0_sp1.1 from training. Duration: 0.7545625 2023-10-02 17:53:25,754 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.conv_module1.balancer1.prob, batch_count=967113.3333333334, ans=0.125 2023-10-02 17:53:26,246 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.2.encoder.layers.1.self_attn_weights.whiten_keys, num_groups=4, num_channels=128, metric=3.05 vs. limit=6.0 2023-10-02 17:53:28,833 WARNING [train.py:1204] (2/4) Exclude cut with ID _61008_210_10633_1_1534852769198_6974370_814-107647-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 17:53:28,904 WARNING [train.py:1204] (2/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_116-332008-0_sp1.1 from training. Duration: 0.8909375 2023-10-02 17:53:32,063 WARNING [train.py:1204] (2/4) Exclude cut with ID _32704_210_8954_1_1533017488840_6747370_266-329755-0 from training. Duration: 0.89 2023-10-02 17:53:32,066 WARNING [train.py:1204] (2/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_271-269084-0_sp0.9 from training. Duration: 0.92225 2023-10-02 17:53:32,155 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_3171_210_2087_1_1526109084_2330007_66-260583-0 from training. Duration: 0.72 2023-10-02 17:53:36,102 INFO [train.py:1046] (2/4) Epoch 28, batch 1650, loss[loss=0.1778, simple_loss=0.2404, pruned_loss=0.05755, over 23428.00 frames. ], tot_loss[loss=0.1674, simple_loss=0.2451, pruned_loss=0.04484, over 4709760.20 frames. ], batch size: 285, lr: 3.67e-03, grad_scale: 32.0 2023-10-02 17:53:36,411 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.feed_forward2.hidden_balancer.prob, batch_count=967180.0, ans=0.125 2023-10-02 17:53:37,573 WARNING [train.py:1204] (2/4) Exclude cut with ID _5250_210_5219_1_1527249561806_8692110_1326-276386-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 17:53:38,954 WARNING [train.py:1204] (2/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_372-30302-0_sp1.1 from training. Duration: 0.7818125 2023-10-02 17:53:39,019 WARNING [train.py:1204] (2/4) Exclude cut with ID _25882_210_3036_1_1532775665815_6479510_631-257029-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 17:53:39,027 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_3691_210_3528_1_1526468169_4062053_355-334432-0 from training. Duration: 0.87 2023-10-02 17:53:40,324 WARNING [train.py:1204] (2/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_839-173017-0_sp1.1 from training. Duration: 0.37275 2023-10-02 17:53:40,333 WARNING [train.py:1204] (2/4) Exclude cut with ID _14998_210_5219_1_1530705582080_7474970_441-91466-0 from training. Duration: 0.97 2023-10-02 17:53:40,370 WARNING [train.py:1204] (2/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_108-53929-0 from training. Duration: 0.59 2023-10-02 17:53:43,613 WARNING [train.py:1204] (2/4) Exclude cut with ID _9389_210_1664_1_1529217337301_5289269_260-1942-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 17:53:43,665 WARNING [train.py:1204] (2/4) Exclude cut with ID _56423_210_5854_1_1534896061049_3165969_198-61547-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 17:53:43,697 WARNING [train.py:1204] (2/4) Exclude cut with ID _44778_210_2054_1_1533798127203_7443090_412-142122-0_sp1.1 from training. Duration: 0.92725 2023-10-02 17:53:45,038 WARNING [train.py:1204] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_88-341718-0_sp1.1 from training. Duration: 0.6 2023-10-02 17:53:47,789 WARNING [train.py:1204] (2/4) Exclude cut with ID _61079_210_4525_1_1534921008198_4011419_334-298697-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 17:53:49,256 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_5465_210_4211_1_1527227253_55357_8-2499-0_sp1.1 from training. Duration: 0.996375 2023-10-02 17:53:51,934 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_447-201565-0_sp1.1 from training. Duration: 0.97275 2023-10-02 17:53:51,948 WARNING [train.py:1204] (2/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_253-161789-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 17:53:51,951 WARNING [train.py:1204] (2/4) Exclude cut with ID _44304_210_15374_1_1533639352765_4613129_302-22084-0_sp0.9 from training. Duration: 0.9555625 2023-10-02 17:53:51,967 WARNING [train.py:1204] (2/4) Exclude cut with ID _26664_210_7770_1_1532258943280_3418270_187-80815-0_sp1.1 from training. Duration: 0.8454375 2023-10-02 17:53:53,307 WARNING [train.py:1204] (2/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_68-212801-0 from training. Duration: 0.73 2023-10-02 17:53:53,365 WARNING [train.py:1204] (2/4) Exclude cut with ID _29345_210_4381_1_1532689168263_3860169_149-257268-0 from training. Duration: 0.81 2023-10-02 17:53:59,386 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.balancer2.prob, batch_count=967246.6666666666, ans=0.125 2023-10-02 17:54:00,513 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_213-103282-0_sp1.1 from training. Duration: 0.7090625 2023-10-02 17:54:03,636 WARNING [train.py:1204] (2/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_156-302675-0_sp1.1 from training. Duration: 0.5 2023-10-02 17:54:09,284 WARNING [train.py:1204] (2/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_125-322172-0 from training. Duration: 0.93 2023-10-02 17:54:11,079 WARNING [train.py:1204] (2/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_493-60474-0_sp1.1 from training. Duration: 0.963625 2023-10-02 17:54:13,856 WARNING [train.py:1204] (2/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_156-302675-0 from training. Duration: 0.55 2023-10-02 17:54:16,674 WARNING [train.py:1204] (2/4) Exclude cut with ID _41314_210_12651_1_1533459476543_3766460_123-267862-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 17:54:19,443 WARNING [train.py:1204] (2/4) Exclude cut with ID _28427_210_4220_1_1532581240967_3061069_201-143747-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 17:54:19,468 WARNING [train.py:1204] (2/4) Exclude cut with ID _8772_210_1850_1_1528635595972_3664011_96-12756-0_sp1.1 from training. Duration: 0.8909375 2023-10-02 17:54:19,502 WARNING [train.py:1204] (2/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_1347-145675-0_sp1.1 from training. Duration: 0.936375 2023-10-02 17:54:20,830 WARNING [train.py:1204] (2/4) Exclude cut with ID _9235_210_5219_1_1528891154253_8046120_387-159008-0_sp1.1 from training. Duration: 0.7818125 2023-10-02 17:54:20,849 WARNING [train.py:1204] (2/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_400-201896-0_sp1.1 from training. Duration: 0.963625 2023-10-02 17:54:23,554 WARNING [train.py:1204] (2/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_326-174612-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 17:54:25,363 WARNING [train.py:1204] (2/4) Exclude cut with ID _55844_210_2346_1_1534471212569_7625707_390-116233-0_sp1.1 from training. Duration: 0.963625 2023-10-02 17:54:25,398 WARNING [train.py:1204] (2/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_419-317062-0_sp1.1 from training. Duration: 0.82725 2023-10-02 17:54:26,762 WARNING [train.py:1204] (2/4) Exclude cut with ID _29877_210_6228_1_1532397791106_5276149_266-62291-0_sp0.9 from training. Duration: 0.9666875 2023-10-02 17:54:28,641 WARNING [train.py:1204] (2/4) Exclude cut with ID _7857_210_1534_1_1528286515745_6831470_431-338528-0_sp1.1 from training. Duration: 0.92725 2023-10-02 17:54:28,706 WARNING [train.py:1204] (2/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_87-131427-0_sp0.9 from training. Duration: 0.8666875 2023-10-02 17:54:32,025 WARNING [train.py:1204] (2/4) Exclude cut with ID _24055_210_12651_1_1532310907143_976979_92-59356-0_sp1.1 from training. Duration: 0.82725 2023-10-02 17:54:33,337 WARNING [train.py:1204] (2/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_96-290039-0 from training. Duration: 0.56 2023-10-02 17:54:33,806 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.5.encoder.layers.1.feed_forward2.out_whiten, num_groups=1, num_channels=256, metric=7.86 vs. limit=15.0 2023-10-02 17:54:34,702 WARNING [train.py:1204] (2/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_48-182155-0_sp0.9 from training. Duration: 0.9666875 2023-10-02 17:54:34,742 WARNING [train.py:1204] (2/4) Exclude cut with ID _4626_210_3532_1_1526817652717_3916859_446-206656-0 from training. Duration: 0.76 2023-10-02 17:54:36,033 WARNING [train.py:1204] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_168-32906-0 from training. Duration: 0.69 2023-10-02 17:54:37,358 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_3780_210_2087_1_1526467760_2130483_170-259044-0 from training. Duration: 0.7 2023-10-02 17:54:37,376 WARNING [train.py:1204] (2/4) Exclude cut with ID _65134_210_14763_1_1535335393985_3505368_153-216718-0_sp1.1 from training. Duration: 0.92725 2023-10-02 17:54:37,440 WARNING [train.py:1204] (2/4) Exclude cut with ID _64639_210_5816_1_1535367619437_3528000_461-329156-0_sp1.1 from training. Duration: 0.87275 2023-10-02 17:54:37,611 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.conv_skip_rate, batch_count=967446.6666666666, ans=0.0 2023-10-02 17:54:38,727 WARNING [train.py:1204] (2/4) Exclude cut with ID _21989_210_5854_1_1531899214931_3271360_126-347531-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 17:54:38,773 WARNING [train.py:1204] (2/4) Exclude cut with ID _7614_210_4915_1_1529326764092_4537359_96-162197-0_sp1.1 from training. Duration: 0.963625 2023-10-02 17:54:38,774 WARNING [train.py:1204] (2/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_1083-147536-0 from training. Duration: 0.97 2023-10-02 17:54:42,976 WARNING [train.py:1204] (2/4) Exclude cut with ID _64029_210_4381_1_1535178502772_3756459_275-16710-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 17:54:44,962 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7264_210_4938_1_1527939911_8132095_800-91995-0_sp0.9 from training. Duration: 0.9555625 2023-10-02 17:54:45,005 WARNING [train.py:1204] (2/4) Exclude cut with ID _3844_210_1390_1_1526727803582_2871209_50-193879-0_sp1.1 from training. Duration: 0.936375 2023-10-02 17:54:46,607 WARNING [train.py:1204] (2/4) Exclude cut with ID _46952_210_5986_1_1534230010357_3107379_232-344503-0 from training. Duration: 0.96 2023-10-02 17:54:49,265 INFO [train.py:1046] (2/4) Epoch 28, batch 1700, loss[loss=0.166, simple_loss=0.2523, pruned_loss=0.03982, over 23529.00 frames. ], tot_loss[loss=0.1668, simple_loss=0.2445, pruned_loss=0.04457, over 4705904.78 frames. ], batch size: 94, lr: 3.67e-03, grad_scale: 32.0 2023-10-02 17:54:50,692 WARNING [train.py:1204] (2/4) Exclude cut with ID _25808_210_13292_1_1532343514562_7366109_206-14076-0_sp1.1 from training. Duration: 0.936375 2023-10-02 17:54:50,693 WARNING [train.py:1204] (2/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_55-31904-0_sp0.9 from training. Duration: 0.97775 2023-10-02 17:54:50,711 WARNING [train.py:1204] (2/4) Exclude cut with ID _14632_210_9988_1_1530691279392_3923200_161-129530-0 from training. Duration: 0.96 2023-10-02 17:54:50,780 WARNING [train.py:1204] (2/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_770-200430-0_sp1.1 from training. Duration: 0.8090625 2023-10-02 17:54:52,010 WARNING [train.py:1204] (2/4) Exclude cut with ID _6270_210_2084_1_1527850911573_3856420_26-277343-0_sp1.1 from training. Duration: 0.7454375 2023-10-02 17:54:52,019 WARNING [train.py:1204] (2/4) Exclude cut with ID _69884_210_9771_1_1535846392818_4166169_88-23776-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 17:54:53,467 WARNING [train.py:1204] (2/4) Exclude cut with ID _60860_210_8254_1_1534896015875_3557169_245-256666-0_sp1.1 from training. Duration: 0.8818125 2023-10-02 17:54:53,507 WARNING [train.py:1204] (2/4) Exclude cut with ID _5250_210_5219_1_1527249561806_8692110_636-251213-0_sp1.1 from training. Duration: 0.8545625 2023-10-02 17:54:53,519 WARNING [train.py:1204] (2/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_540-131164-0 from training. Duration: 0.92 2023-10-02 17:54:56,684 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.2.encoder.layers.0.conv_module1.whiten, num_groups=1, num_channels=384, metric=9.43 vs. limit=15.0 2023-10-02 17:54:57,184 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_5931_210_2040_1_1527428481_4547999_321-193614-0_sp0.9 from training. Duration: 0.7444375 2023-10-02 17:54:59,057 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.5.encoder.layers.1.feed_forward1.out_whiten, num_groups=1, num_channels=256, metric=7.65 vs. limit=15.0 2023-10-02 17:55:04,650 WARNING [train.py:1204] (2/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_373-225403-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 17:55:07,330 WARNING [train.py:1204] (2/4) Exclude cut with ID _20622_210_3852_1_1531622427080_1093839_57-326027-0_sp1.1 from training. Duration: 0.97275 2023-10-02 17:55:13,483 WARNING [train.py:1204] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_605-257205-0_sp1.1 from training. Duration: 0.536375 2023-10-02 17:55:13,519 WARNING [train.py:1204] (2/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_442-320317-0_sp0.9 from training. Duration: 0.911125 2023-10-02 17:55:14,883 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_35361_210_4238_1_1533262810_7793476_675-87012-0_sp1.1 from training. Duration: 0.8090625 2023-10-02 17:55:14,924 WARNING [train.py:1204] (2/4) Exclude cut with ID _4184_210_2550_1_1526730192013_2219380_306-44736-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 17:55:17,624 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_124-113562-0 from training. Duration: 0.94 2023-10-02 17:55:20,383 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_2821_210_2715_1_1526108385_3763560_73-296673-0_sp0.9 from training. Duration: 0.92225 2023-10-02 17:55:20,398 WARNING [train.py:1204] (2/4) Exclude cut with ID _10301_210_5886_1_1529222275332_3910620_216-46959-0_sp1.1 from training. Duration: 0.92725 2023-10-02 17:55:21,887 WARNING [train.py:1204] (2/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_91-277684-0_sp0.9 from training. Duration: 0.82225 2023-10-02 17:55:25,016 WARNING [train.py:1204] (2/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_351-294650-0_sp0.9 from training. Duration: 0.7 2023-10-02 17:55:25,144 WARNING [train.py:1204] (2/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_209-205365-0 from training. Duration: 0.79 2023-10-02 17:55:26,531 WARNING [train.py:1204] (2/4) Exclude cut with ID _14603_210_4220_1_1530615688441_3097319_97-325053-0 from training. Duration: 0.92 2023-10-02 17:55:28,323 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_49348_210_7927_1_1533954500_3691300_109-276430-0_sp1.1 from training. Duration: 0.92725 2023-10-02 17:55:28,511 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.conv_skip_rate, batch_count=967646.6666666666, ans=0.0 2023-10-02 17:55:29,733 WARNING [train.py:1204] (2/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_912-47473-0 from training. Duration: 0.68 2023-10-02 17:55:31,159 WARNING [train.py:1204] (2/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_96-320736-0_sp0.9 from training. Duration: 0.9555625 2023-10-02 17:55:34,804 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.balancer2.prob, batch_count=967713.3333333334, ans=0.125 2023-10-02 17:55:38,747 WARNING [train.py:1204] (2/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_1252-235481-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 17:55:38,841 WARNING [train.py:1204] (2/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_813-261554-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 17:55:40,246 WARNING [train.py:1204] (2/4) Exclude cut with ID _21632_210_2253_1_1531994001765_3744140_110-274654-0_sp0.9 from training. Duration: 0.911125 2023-10-02 17:55:41,664 WARNING [train.py:1204] (2/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_381-317791-0_sp1.1 from training. Duration: 0.663625 2023-10-02 17:55:41,685 WARNING [train.py:1204] (2/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_51-117576-0_sp1.1 from training. Duration: 0.363625 2023-10-02 17:55:42,881 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.460e+02 1.838e+02 2.056e+02 2.244e+02 3.312e+02, threshold=4.111e+02, percent-clipped=0.0 2023-10-02 17:55:42,959 WARNING [train.py:1204] (2/4) Exclude cut with ID _42321_210_7770_1_1533468631352_4366895_104-215915-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 17:55:44,854 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_40896_210_15710_1_1533433956_7100802_216-22824-0_sp1.1 from training. Duration: 0.92725 2023-10-02 17:55:44,855 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_14831_210_7927_1_1530700200_5583610_181-27467-0 from training. Duration: 0.91 2023-10-02 17:55:46,215 WARNING [train.py:1204] (2/4) Exclude cut with ID _30304_210_6319_1_1532476827261_7582060_1024-265645-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 17:55:46,218 WARNING [train.py:1204] (2/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_412-233904-0_sp1.1 from training. Duration: 0.936375 2023-10-02 17:55:46,233 WARNING [train.py:1204] (2/4) Exclude cut with ID _30191_210_1664_1_1533189915047_7196670_911-223461-0_sp1.1 from training. Duration: 0.92725 2023-10-02 17:55:46,249 WARNING [train.py:1204] (2/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_558-148266-0_sp1.1 from training. Duration: 0.963625 2023-10-02 17:55:47,777 WARNING [train.py:1204] (2/4) Exclude cut with ID _58480_210_12392_1_1534811121111_3646160_212-164495-0_sp1.1 from training. Duration: 0.936375 2023-10-02 17:55:47,778 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_454-276429-0_sp0.9 from training. Duration: 0.9666875 2023-10-02 17:55:49,262 WARNING [train.py:1204] (2/4) Exclude cut with ID _24736_210_3279_1_1532332894218_2838700_73-197086-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 17:55:50,581 WARNING [train.py:1204] (2/4) Exclude cut with ID _5294_210_1747_1_1527249643928_3696179_529-242298-0_sp1.1 from training. Duration: 0.863625 2023-10-02 17:55:50,618 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_53555_210_5632_1_1534595220_7232297_345-171438-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 17:55:53,588 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.bypass.skip_rate, batch_count=967780.0, ans=0.04949747468305833 2023-10-02 17:55:54,650 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_28211_210_6388_1_1532861506_4523224_22-25766-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 17:55:57,013 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7002_210_1794_1_1527939683_5157286_163-87552-0 from training. Duration: 0.86 2023-10-02 17:55:58,217 WARNING [train.py:1204] (2/4) Exclude cut with ID _56927_210_9969_1_1534848970840_3695229_409-225129-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 17:55:59,609 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_26192_210_7927_1_1532137269_1040115_21-219331-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 17:56:01,035 WARNING [train.py:1204] (2/4) Exclude cut with ID _15012_210_1968_1_1530856820429_5095350_662-210469-0 from training. Duration: 0.57 2023-10-02 17:56:01,396 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.conv_module2.balancer2.prob, batch_count=967780.0, ans=0.125 2023-10-02 17:56:04,363 INFO [train.py:1046] (2/4) Epoch 28, batch 1750, loss[loss=0.1483, simple_loss=0.2152, pruned_loss=0.04069, over 23458.00 frames. ], tot_loss[loss=0.1662, simple_loss=0.2435, pruned_loss=0.04449, over 4705538.61 frames. ], batch size: 285, lr: 3.66e-03, grad_scale: 32.0 2023-10-02 17:56:05,882 WARNING [train.py:1204] (2/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_582-191016-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 17:56:09,668 WARNING [train.py:1204] (2/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_270-210032-0_sp1.1 from training. Duration: 0.963625 2023-10-02 17:56:09,698 WARNING [train.py:1204] (2/4) Exclude cut with ID _6789_210_1534_1_1527857937218_3118950_262-48853-0_sp0.9 from training. Duration: 0.62225 2023-10-02 17:56:09,766 WARNING [train.py:1204] (2/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_847-41334-0 from training. Duration: 0.44 2023-10-02 17:56:09,790 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_40896_210_15710_1_1533433956_7100802_573-46532-0_sp1.1 from training. Duration: 0.92725 2023-10-02 17:56:13,953 WARNING [train.py:1204] (2/4) Exclude cut with ID _21881_210_3525_1_1531825242757_3798380_247-102591-0_sp1.1 from training. Duration: 0.8909375 2023-10-02 17:56:13,985 WARNING [train.py:1204] (2/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_200-21395-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 17:56:17,230 WARNING [train.py:1204] (2/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_711-15688-0 from training. Duration: 0.43 2023-10-02 17:56:19,990 WARNING [train.py:1204] (2/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_53-75025-0_sp1.1 from training. Duration: 0.963625 2023-10-02 17:56:21,418 WARNING [train.py:1204] (2/4) Exclude cut with ID _6194_210_1534_1_1527511826160_3941194_321-148641-0 from training. Duration: 0.64 2023-10-02 17:56:21,441 WARNING [train.py:1204] (2/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_337-294431-0_sp1.1 from training. Duration: 0.936375 2023-10-02 17:56:23,358 WARNING [train.py:1204] (2/4) Exclude cut with ID _21632_210_2253_1_1531994001765_3744140_110-274654-0_sp1.1 from training. Duration: 0.7454375 2023-10-02 17:56:25,992 WARNING [train.py:1204] (2/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_135-174080-0_sp1.1 from training. Duration: 0.4818125 2023-10-02 17:56:27,806 WARNING [train.py:1204] (2/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_21-174514-0 from training. Duration: 0.43 2023-10-02 17:56:29,240 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_38965_210_4852_1_1533174906_3484735_215-294589-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 17:56:29,279 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_548-294316-0 from training. Duration: 0.81 2023-10-02 17:56:38,371 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_103-84010-0_sp1.1 from training. Duration: 0.62725 2023-10-02 17:56:39,954 WARNING [train.py:1204] (2/4) Exclude cut with ID _18199_210_6109_1_1531475789864_4288790_295-198247-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 17:56:39,974 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_13757_210_3528_1_1530440682_4107239_103-313224-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 17:56:40,158 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.feed_forward2.hidden_balancer.prob, batch_count=967980.0, ans=0.125 2023-10-02 17:56:42,762 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_34886_210_10633_1_1533203882_1755661_67-294673-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 17:56:42,771 WARNING [train.py:1204] (2/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_350-101734-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 17:56:44,288 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.conv_skip_rate, batch_count=967980.0, ans=0.0 2023-10-02 17:56:45,968 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_37357_210_17024_1_1533089504_2316121_204-153986-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 17:56:48,485 WARNING [train.py:1204] (2/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_673-115569-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 17:56:49,923 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_489-230703-0_sp0.9 from training. Duration: 0.9444375 2023-10-02 17:56:49,978 WARNING [train.py:1204] (2/4) Exclude cut with ID _5250_210_5219_1_1527249561806_8692110_575-102496-0_sp1.1 from training. Duration: 0.936375 2023-10-02 17:56:51,776 WARNING [train.py:1204] (2/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_669-93163-0 from training. Duration: 0.98 2023-10-02 17:56:53,174 WARNING [train.py:1204] (2/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_249-281349-0_sp0.9 from training. Duration: 0.9444375 2023-10-02 17:56:56,333 WARNING [train.py:1204] (2/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_106-30782-0 from training. Duration: 0.8 2023-10-02 17:56:57,719 WARNING [train.py:1204] (2/4) Exclude cut with ID _8772_210_1850_1_1528635595972_3664011_398-320049-0_sp1.1 from training. Duration: 0.8 2023-10-02 17:56:59,154 WARNING [train.py:1204] (2/4) Exclude cut with ID _44778_210_2054_1_1533798127203_7443090_516-151727-0_sp1.1 from training. Duration: 0.963625 2023-10-02 17:56:59,210 WARNING [train.py:1204] (2/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_325-62802-0_sp1.1 from training. Duration: 0.7909375 2023-10-02 17:57:00,718 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.conv_skip_rate, batch_count=968046.6666666666, ans=0.0 2023-10-02 17:57:00,740 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.self_attn_weights.pos_emb_skip_rate, batch_count=968046.6666666666, ans=0.0 2023-10-02 17:57:03,433 WARNING [train.py:1204] (2/4) Exclude cut with ID _14368_210_4220_1_1530532714870_3595529_93-58002-0_sp0.9 from training. Duration: 0.6333125 2023-10-02 17:57:04,797 WARNING [train.py:1204] (2/4) Exclude cut with ID _4681_210_3525_1_1527251040321_1235713_123-24428-0_sp1.1 from training. Duration: 0.436375 2023-10-02 17:57:06,497 WARNING [train.py:1204] (2/4) Exclude cut with ID _5250_210_5219_1_1527249561806_8692110_1185-56473-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 17:57:07,852 WARNING [train.py:1204] (2/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_96-244299-0_sp1.1 from training. Duration: 0.8 2023-10-02 17:57:08,070 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.feed_forward3.hidden_balancer.prob, batch_count=968113.3333333334, ans=0.125 2023-10-02 17:57:08,097 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.attention_skip_rate, batch_count=968113.3333333334, ans=0.0 2023-10-02 17:57:09,472 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.bypass_mid.scale_min, batch_count=968113.3333333334, ans=0.2 2023-10-02 17:57:11,931 WARNING [train.py:1204] (2/4) Exclude cut with ID _59500_210_8254_1_1534809610680_3736009_104-72815-0_sp1.1 from training. Duration: 0.963625 2023-10-02 17:57:14,623 WARNING [train.py:1204] (2/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_468-283113-0_sp1.1 from training. Duration: 0.87275 2023-10-02 17:57:14,844 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.conv_skip_rate, batch_count=968113.3333333334, ans=0.0 2023-10-02 17:57:16,059 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6239_210_4211_1_1527765584_4306698_528-102186-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 17:57:16,136 WARNING [train.py:1204] (2/4) Exclude cut with ID _45297_210_2721_1_1533715351381_3264949_351-147136-0 from training. Duration: 0.87 2023-10-02 17:57:16,140 WARNING [train.py:1204] (2/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_520-51744-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 17:57:17,446 INFO [train.py:1046] (2/4) Epoch 28, batch 1800, loss[loss=0.1803, simple_loss=0.2644, pruned_loss=0.04808, over 24306.00 frames. ], tot_loss[loss=0.1657, simple_loss=0.2432, pruned_loss=0.04412, over 4714555.52 frames. ], batch size: 77, lr: 3.66e-03, grad_scale: 32.0 2023-10-02 17:57:17,595 WARNING [train.py:1204] (2/4) Exclude cut with ID _5166_210_2084_1_1527073197160_4087993_310-114452-0_sp1.1 from training. Duration: 0.536375 2023-10-02 17:57:17,608 WARNING [train.py:1204] (2/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_450-165710-0_sp1.1 from training. Duration: 0.92725 2023-10-02 17:57:17,622 WARNING [train.py:1204] (2/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_280-120942-0_sp1.1 from training. Duration: 0.7 2023-10-02 17:57:17,645 WARNING [train.py:1204] (2/4) Exclude cut with ID _65585_210_12210_1_1535450451584_3764098_112-294301-0_sp0.9 from training. Duration: 0.97775 2023-10-02 17:57:19,576 WARNING [train.py:1204] (2/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_98-51676-0_sp0.9 from training. Duration: 0.811125 2023-10-02 17:57:22,398 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_33348_210_5632_1_1533086912_3324030_85-28583-0_sp1.1 from training. Duration: 0.7454375 2023-10-02 17:57:24,306 WARNING [train.py:1204] (2/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_336-208232-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 17:57:25,593 WARNING [train.py:1204] (2/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_339-104791-0_sp0.9 from training. Duration: 0.5444375 2023-10-02 17:57:28,326 WARNING [train.py:1204] (2/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_559-29840-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 17:57:33,360 WARNING [train.py:1204] (2/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_117-290916-0_sp0.9 from training. Duration: 0.5666875 2023-10-02 17:57:33,442 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_43802_210_2290_1_1533516203_8829504_185-112289-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 17:57:33,680 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.conv_module1.balancer2.prob, batch_count=968246.6666666666, ans=0.125 2023-10-02 17:57:36,226 WARNING [train.py:1204] (2/4) Exclude cut with ID _59256_210_5986_1_1535076085997_3589120_155-298797-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 17:57:39,633 WARNING [train.py:1204] (2/4) Exclude cut with ID _44659_210_2721_1_1534036120061_6614860_246-206531-0_sp1.1 from training. Duration: 0.92725 2023-10-02 17:57:39,678 WARNING [train.py:1204] (2/4) Exclude cut with ID _23818_210_6228_1_1531917063884_3676920_469-151944-0_sp1.1 from training. Duration: 0.92725 2023-10-02 17:57:39,921 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.feed_forward1.out_proj.dropout_p, batch_count=968246.6666666666, ans=0.1 2023-10-02 17:57:41,018 WARNING [train.py:1204] (2/4) Exclude cut with ID _13252_210_1925_1_1530671372047_3336909_88-276668-0_sp1.1 from training. Duration: 0.9 2023-10-02 17:57:42,379 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_15101_210_7927_1_1530786415_5033274_320-327527-0_sp1.1 from training. Duration: 0.87275 2023-10-02 17:57:42,387 WARNING [train.py:1204] (2/4) Exclude cut with ID _14998_210_5219_1_1530705582080_7474970_446-112117-0 from training. Duration: 0.9 2023-10-02 17:57:43,711 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_30280_210_1881_1_1532872779_3660139_400-254159-0_sp1.1 from training. Duration: 0.936375 2023-10-02 17:57:44,399 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.4.encoder.layers.0.feed_forward3.out_whiten, num_groups=1, num_channels=384, metric=12.27 vs. limit=15.0 2023-10-02 17:57:46,547 WARNING [train.py:1204] (2/4) Exclude cut with ID _40284_210_2084_1_1533513679814_3704279_158-345341-0_sp1.1 from training. Duration: 0.936375 2023-10-02 17:57:46,832 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.conv_module1.balancer2.prob, batch_count=968313.3333333334, ans=0.125 2023-10-02 17:57:49,438 WARNING [train.py:1204] (2/4) Exclude cut with ID 3033-130750-0096-55598-0 from training. Duration: 0.83 2023-10-02 17:57:52,297 WARNING [train.py:1204] (2/4) Exclude cut with ID _9389_210_1664_1_1529217337301_5289269_379-128794-0 from training. Duration: 0.85 2023-10-02 17:57:52,322 WARNING [train.py:1204] (2/4) Exclude cut with ID _3675_210_4557_1_1526639934408_4182089_456-18305-0 from training. Duration: 0.76 2023-10-02 17:57:54,157 WARNING [train.py:1204] (2/4) Exclude cut with ID _29853_210_7497_1_1532505600851_6642110_571-249460-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 17:57:54,257 WARNING [train.py:1204] (2/4) Exclude cut with ID _4068_210_2629_1_1526697350395_1546259_230-325568-0_sp1.1 from training. Duration: 0.92725 2023-10-02 17:57:54,258 WARNING [train.py:1204] (2/4) Exclude cut with ID _34415_210_8750_1_1533085213375_3451116_213-267570-0_sp1.1 from training. Duration: 0.963625 2023-10-02 17:57:55,611 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6767_210_3273_1_1527855783_1718932_142-127132-0_sp1.1 from training. Duration: 0.863625 2023-10-02 17:58:02,340 WARNING [train.py:1204] (2/4) Exclude cut with ID IC0653W0001-322925 from training. Duration: 0.896 2023-10-02 17:58:05,115 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_4528_210_3273_1_1527076654_3276777_209-219804-0_sp1.1 from training. Duration: 0.62725 2023-10-02 17:58:06,476 WARNING [train.py:1204] (2/4) Exclude cut with ID _55844_210_2346_1_1534471212569_7625707_187-204971-0_sp1.1 from training. Duration: 0.936375 2023-10-02 17:58:09,097 WARNING [train.py:1204] (2/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_369-325184-0 from training. Duration: 0.99 2023-10-02 17:58:09,121 WARNING [train.py:1204] (2/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_791-45149-0 from training. Duration: 0.86 2023-10-02 17:58:09,162 WARNING [train.py:1204] (2/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_317-211107-0_sp1.1 from training. Duration: 0.836375 2023-10-02 17:58:10,919 WARNING [train.py:1204] (2/4) Exclude cut with ID _30800_210_22382_1_1532499347904_4515959_488-330913-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 17:58:11,024 WARNING [train.py:1204] (2/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_506-42614-0_sp1.1 from training. Duration: 0.7181875 2023-10-02 17:58:12,306 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.534e+02 1.835e+02 1.992e+02 2.220e+02 3.215e+02, threshold=3.984e+02, percent-clipped=0.0 2023-10-02 17:58:16,561 WARNING [train.py:1204] (2/4) Exclude cut with ID _8696_210_1876_1_1528545632440_3345058_209-54069-0 from training. Duration: 0.67 2023-10-02 17:58:20,979 WARNING [train.py:1204] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_215-202973-0_sp1.1 from training. Duration: 0.8 2023-10-02 17:58:21,038 WARNING [train.py:1204] (2/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_321-215742-0 from training. Duration: 0.5 2023-10-02 17:58:22,379 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_40896_210_15710_1_1533433956_7100802_428-32114-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 17:58:22,387 WARNING [train.py:1204] (2/4) Exclude cut with ID _27013_210_1389_1_1532437173240_3252649_467-263151-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 17:58:23,725 WARNING [train.py:1204] (2/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_734-129761-0_sp0.9 from training. Duration: 0.911125 2023-10-02 17:58:23,774 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_521-214276-0 from training. Duration: 0.44 2023-10-02 17:58:26,998 WARNING [train.py:1204] (2/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_80-147301-0_sp1.1 from training. Duration: 0.763625 2023-10-02 17:58:27,000 WARNING [train.py:1204] (2/4) Exclude cut with ID _9679_210_7497_1_1529129212906_4363929_56-307796-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 17:58:30,746 WARNING [train.py:1204] (2/4) Exclude cut with ID _14832_210_8750_1_1530691351539_3171348_1-54315-0 from training. Duration: 0.87 2023-10-02 17:58:30,746 WARNING [train.py:1204] (2/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_558-155750-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 17:58:32,222 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_142-309370-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 17:58:33,426 INFO [train.py:1046] (2/4) Epoch 28, batch 1850, loss[loss=0.1909, simple_loss=0.2649, pruned_loss=0.05849, over 23891.00 frames. ], tot_loss[loss=0.1663, simple_loss=0.2435, pruned_loss=0.04456, over 4705435.22 frames. ], batch size: 195, lr: 3.66e-03, grad_scale: 32.0 2023-10-02 17:58:33,481 WARNING [train.py:1204] (2/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_18-46756-0_sp0.9 from training. Duration: 0.87775 2023-10-02 17:58:33,495 WARNING [train.py:1204] (2/4) Exclude cut with ID _61148_210_6897_1_1535094022726_4217610_279-212159-0_sp1.1 from training. Duration: 0.936375 2023-10-02 17:58:34,827 WARNING [train.py:1204] (2/4) Exclude cut with ID _21987_210_5854_1_1531821500482_3637038_164-2641-0_sp1.1 from training. Duration: 0.936375 2023-10-02 17:58:34,866 WARNING [train.py:1204] (2/4) Exclude cut with ID _8064_210_5399_1_1528372299291_4282973_395-340852-0_sp1.1 from training. Duration: 0.8454375 2023-10-02 17:58:36,972 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.3.encoder.layers.0.nonlin_attention.whiten1, num_groups=1, num_channels=384, metric=6.79 vs. limit=10.0 2023-10-02 17:58:37,569 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1077-184261-0_sp1.1 from training. Duration: 0.963625 2023-10-02 17:58:37,584 WARNING [train.py:1204] (2/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_1176-204992-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 17:58:40,801 WARNING [train.py:1204] (2/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_563-347832-0_sp1.1 from training. Duration: 0.8090625 2023-10-02 17:58:42,159 WARNING [train.py:1204] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_380-204756-0_sp0.9 from training. Duration: 0.9555625 2023-10-02 17:58:46,533 WARNING [train.py:1204] (2/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_212-10501-0_sp1.1 from training. Duration: 0.8545625 2023-10-02 17:58:46,562 WARNING [train.py:1204] (2/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_445-117398-0 from training. Duration: 0.89 2023-10-02 17:58:48,060 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.balancer1.prob, batch_count=968580.0, ans=0.125 2023-10-02 17:58:51,812 WARNING [train.py:1204] (2/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_29-280622-0 from training. Duration: 0.82 2023-10-02 17:58:53,269 WARNING [train.py:1204] (2/4) Exclude cut with ID _26664_210_7770_1_1532258943280_3418270_187-80815-0 from training. Duration: 0.93 2023-10-02 17:58:57,545 WARNING [train.py:1204] (2/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_414-349640-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 17:58:57,562 WARNING [train.py:1204] (2/4) Exclude cut with ID _45021_210_9994_1_1533619751094_3330288_96-239010-0 from training. Duration: 0.91 2023-10-02 17:58:57,564 WARNING [train.py:1204] (2/4) Exclude cut with ID _5189_210_4557_1_1527248319428_3769229_199-53526-0_sp0.9 from training. Duration: 0.4666875 2023-10-02 17:59:08,110 WARNING [train.py:1204] (2/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_63-101996-0_sp1.1 from training. Duration: 0.8 2023-10-02 17:59:08,229 WARNING [train.py:1204] (2/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_199-86432-0 from training. Duration: 0.98 2023-10-02 17:59:12,679 WARNING [train.py:1204] (2/4) Exclude cut with ID _36399_210_16049_1_1532998771377_3698333_207-259013-0_sp1.1 from training. Duration: 0.92725 2023-10-02 17:59:12,720 WARNING [train.py:1204] (2/4) Exclude cut with ID _62574_210_2655_1_1535180407058_3933249_567-111756-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 17:59:16,859 WARNING [train.py:1204] (2/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_436-318113-0 from training. Duration: 0.75 2023-10-02 17:59:16,919 WARNING [train.py:1204] (2/4) Exclude cut with ID _53379_210_20034_1_1534222027363_3854654_43-305214-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 17:59:18,170 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_15126_210_6753_1_1530778677_2591185_113-157651-0_sp1.1 from training. Duration: 0.6181875 2023-10-02 17:59:19,594 WARNING [train.py:1204] (2/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_146-218763-0_sp1.1 from training. Duration: 0.7818125 2023-10-02 17:59:21,039 WARNING [train.py:1204] (2/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_714-310538-0_sp0.9 from training. Duration: 0.9555625 2023-10-02 17:59:23,751 WARNING [train.py:1204] (2/4) Exclude cut with ID _24271_210_12658_1_1531987270022_7190519_709-16721-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 17:59:25,771 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.3.encoder.layers.2.feed_forward3.out_whiten, num_groups=1, num_channels=512, metric=8.99 vs. limit=15.0 2023-10-02 17:59:26,482 WARNING [train.py:1204] (2/4) Exclude cut with ID _5190_210_4557_1_1527244368799_373439_28-197030-0_sp0.9 from training. Duration: 0.8 2023-10-02 17:59:26,511 WARNING [train.py:1204] (2/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_267-197536-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 17:59:28,393 WARNING [train.py:1204] (2/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_658-282610-0_sp1.1 from training. Duration: 0.4818125 2023-10-02 17:59:28,406 WARNING [train.py:1204] (2/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_257-67050-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 17:59:30,425 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_14906_210_7927_1_1530754190_9408633_961-53402-0_sp1.1 from training. Duration: 0.936375 2023-10-02 17:59:31,786 WARNING [train.py:1204] (2/4) Exclude cut with ID _60113_210_16271_1_1535164212481_4386079_490-280290-0_sp1.1 from training. Duration: 0.8909375 2023-10-02 17:59:33,996 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.feed_forward2.hidden_balancer.prob, batch_count=968780.0, ans=0.125 2023-10-02 17:59:36,509 WARNING [train.py:1204] (2/4) Exclude cut with ID _14100_210_8341_1_1530529320613_3678069_258-224599-0 from training. Duration: 0.77 2023-10-02 17:59:36,562 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6961_210_2087_1_1527937168_6815011_565-326898-0_sp1.1 from training. Duration: 0.936375 2023-10-02 17:59:39,577 WARNING [train.py:1204] (2/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_239-316802-0_sp1.1 from training. Duration: 0.62725 2023-10-02 17:59:40,945 WARNING [train.py:1204] (2/4) Exclude cut with ID _55857_210_2346_1_1534644007831_6898209_745-98391-0_sp1.1 from training. Duration: 0.6909375 2023-10-02 17:59:40,948 WARNING [train.py:1204] (2/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_154-135696-0 from training. Duration: 0.8 2023-10-02 17:59:40,950 WARNING [train.py:1204] (2/4) Exclude cut with ID _14368_210_4220_1_1530532714870_3595529_93-58002-0 from training. Duration: 0.57 2023-10-02 17:59:42,288 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9025W0005-507335 from training. Duration: 0.896 2023-10-02 17:59:42,362 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9029W0034-509435 from training. Duration: 0.64 2023-10-02 17:59:45,061 WARNING [train.py:1204] (2/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_76-203867-0_sp1.1 from training. Duration: 0.6545625 2023-10-02 17:59:45,071 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_46639_210_8341_1_1533726088_4495500_66-346325-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 17:59:45,077 WARNING [train.py:1204] (2/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_145-137790-0_sp0.9 from training. Duration: 0.92225 2023-10-02 17:59:45,101 WARNING [train.py:1204] (2/4) Exclude cut with ID _36522_210_8887_1_1533177030611_1772478_7-327299-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 17:59:45,173 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9042W0009-515615 from training. Duration: 0.896 2023-10-02 17:59:46,910 INFO [train.py:1046] (2/4) Epoch 28, batch 1900, loss[loss=0.159, simple_loss=0.2442, pruned_loss=0.03689, over 24412.00 frames. ], tot_loss[loss=0.1669, simple_loss=0.2445, pruned_loss=0.04469, over 4706497.25 frames. ], batch size: 69, lr: 3.66e-03, grad_scale: 32.0 2023-10-02 17:59:46,955 WARNING [train.py:1204] (2/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_39-248870-0_sp0.9 from training. Duration: 0.7666875 2023-10-02 17:59:47,002 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_35441_210_16554_1_1533347572_4092680_362-242912-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 17:59:48,301 WARNING [train.py:1204] (2/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_216-343007-0_sp1.1 from training. Duration: 0.6 2023-10-02 17:59:49,607 WARNING [train.py:1204] (2/4) Exclude cut with ID _3867_210_1883_1_1526641200575_4475830_272-215563-0_sp0.9 from training. Duration: 0.6333125 2023-10-02 17:59:51,004 WARNING [train.py:1204] (2/4) Exclude cut with ID _21783_210_11357_1_1531742458730_3762679_553-175603-0_sp1.1 from training. Duration: 0.92725 2023-10-02 17:59:51,021 WARNING [train.py:1204] (2/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_963-187109-0 from training. Duration: 0.99 2023-10-02 17:59:53,695 WARNING [train.py:1204] (2/4) Exclude cut with ID _44973_210_1511_1_1533794381163_3634770_495-211466-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 17:59:53,707 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9068W0002-529085 from training. Duration: 0.896 2023-10-02 17:59:53,720 WARNING [train.py:1204] (2/4) Exclude cut with ID _5294_210_1747_1_1527249643928_3696179_180-100713-0_sp1.1 from training. Duration: 0.736375 2023-10-02 17:59:55,101 WARNING [train.py:1204] (2/4) Exclude cut with ID _35799_210_5854_1_1533263487887_3529071_149-49689-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 17:59:59,647 WARNING [train.py:1204] (2/4) Exclude cut with ID _67725_210_27767_1_1535608756671_5105359_36-220794-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 18:00:01,559 WARNING [train.py:1204] (2/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_338-96520-0_sp1.1 from training. Duration: 0.8545625 2023-10-02 18:00:02,879 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9104W0004-547160 from training. Duration: 0.896 2023-10-02 18:00:04,302 WARNING [train.py:1204] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_330-327861-0 from training. Duration: 0.67 2023-10-02 18:00:06,089 WARNING [train.py:1204] (2/4) Exclude cut with ID _14087_210_2253_1_1530529224512_3014029_156-257607-0_sp0.9 from training. Duration: 0.92225 2023-10-02 18:00:06,141 WARNING [train.py:1204] (2/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_476-322050-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 18:00:06,154 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9118W0017-553385 from training. Duration: 0.896 2023-10-02 18:00:07,527 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9120W0028-554435 from training. Duration: 0.896 2023-10-02 18:00:11,828 WARNING [train.py:1204] (2/4) Exclude cut with ID _56524_210_9642_1_1534572011779_3619170_145-310842-0 from training. Duration: 0.99 2023-10-02 18:00:13,202 WARNING [train.py:1204] (2/4) Exclude cut with ID _51631_210_8254_1_1534129228420_4084794_270-222508-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 18:00:18,111 WARNING [train.py:1204] (2/4) Exclude cut with ID _14268_210_9206_1_1530622771285_4714099_331-105351-0 from training. Duration: 0.65 2023-10-02 18:00:18,389 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.balancer_ff2.min_abs, batch_count=968980.0, ans=0.1 2023-10-02 18:00:19,508 WARNING [train.py:1204] (2/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_40-129756-0 from training. Duration: 0.81 2023-10-02 18:00:26,489 WARNING [train.py:1204] (2/4) Exclude cut with ID _3541_210_2477_1_1526475408709_3263973_231-104736-0 from training. Duration: 0.78 2023-10-02 18:00:29,304 WARNING [train.py:1204] (2/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_214-291369-0 from training. Duration: 0.63 2023-10-02 18:00:29,311 WARNING [train.py:1204] (2/4) Exclude cut with ID _62574_210_2655_1_1535180407058_3933249_369-84347-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 18:00:29,369 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9210W0111-599240 from training. Duration: 0.896 2023-10-02 18:00:29,373 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_92-246270-0 from training. Duration: 0.85 2023-10-02 18:00:31,188 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_37152_210_9697_1_1533106573_3844188_29-239070-0 from training. Duration: 0.98 2023-10-02 18:00:31,248 WARNING [train.py:1204] (2/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_239-22076-0 from training. Duration: 0.85 2023-10-02 18:00:31,251 WARNING [train.py:1204] (2/4) Exclude cut with ID _65962_210_29927_1_1535436026519_4174930_368-44639-0_sp1.1 from training. Duration: 0.97275 2023-10-02 18:00:35,823 WARNING [train.py:1204] (2/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_152-212533-0 from training. Duration: 0.97 2023-10-02 18:00:36,781 INFO [scaling.py:1118] (2/4) WithLoss: name=encoder.encoders.2.encoder.layers.0.self_attn_weights, loss-sum=0.000e+00 2023-10-02 18:00:37,872 WARNING [train.py:1204] (2/4) Exclude cut with ID _14603_210_4220_1_1530615688441_3097319_90-31383-0_sp1.1 from training. Duration: 0.7909375 2023-10-02 18:00:40,419 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.533e+02 1.863e+02 2.033e+02 2.281e+02 3.695e+02, threshold=4.067e+02, percent-clipped=0.0 2023-10-02 18:00:41,823 WARNING [train.py:1204] (2/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_196-152868-0_sp1.1 from training. Duration: 0.92725 2023-10-02 18:00:41,825 WARNING [train.py:1204] (2/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_140-181176-0 from training. Duration: 0.92 2023-10-02 18:00:41,998 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.1.feed_forward2.hidden_balancer.prob, batch_count=969046.6666666666, ans=0.125 2023-10-02 18:00:44,559 WARNING [train.py:1204] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_168-32906-0_sp0.9 from training. Duration: 0.7666875 2023-10-02 18:00:46,227 INFO [scaling.py:1118] (2/4) WithLoss: name=encoder.encoders.3.encoder.layers.1.self_attn_weights, loss-sum=0.000e+00 2023-10-02 18:00:47,774 WARNING [train.py:1204] (2/4) Exclude cut with ID _14922_210_6228_1_1530707435273_3693310_13-165512-0 from training. Duration: 0.82 2023-10-02 18:00:47,808 WARNING [train.py:1204] (2/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_381-317791-0_sp0.9 from training. Duration: 0.811125 2023-10-02 18:00:51,999 WARNING [train.py:1204] (2/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_63-267585-0_sp1.1 from training. Duration: 0.8181875 2023-10-02 18:00:52,001 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_326-303945-0_sp1.1 from training. Duration: 0.9 2023-10-02 18:00:52,013 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_194-143381-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 18:00:53,404 WARNING [train.py:1204] (2/4) Exclude cut with ID _39290_210_4220_1_1533862778760_3075118_194-16667-0_sp1.1 from training. Duration: 0.963625 2023-10-02 18:00:54,668 WARNING [train.py:1204] (2/4) Exclude cut with ID _15012_210_1968_1_1530856820429_5095350_662-210469-0_sp0.9 from training. Duration: 0.6333125 2023-10-02 18:00:54,705 WARNING [train.py:1204] (2/4) Exclude cut with ID _4697_210_2069_1_1526815801953_3823098_68-219807-0_sp1.1 from training. Duration: 0.42725 2023-10-02 18:00:55,150 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.5.encoder.layers.1.nonlin_attention.whiten1, num_groups=1, num_channels=192, metric=2.73 vs. limit=10.0 2023-10-02 18:00:56,045 WARNING [train.py:1204] (2/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_321-22821-0_sp0.9 from training. Duration: 0.988875 2023-10-02 18:00:58,768 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_14056_210_4163_1_1530604483_7572827_1037-45317-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 18:00:58,770 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_403-39619-0_sp1.1 from training. Duration: 0.763625 2023-10-02 18:01:00,731 INFO [train.py:1046] (2/4) Epoch 28, batch 1950, loss[loss=0.1792, simple_loss=0.251, pruned_loss=0.05367, over 22853.00 frames. ], tot_loss[loss=0.1666, simple_loss=0.2447, pruned_loss=0.04423, over 4715291.04 frames. ], batch size: 322, lr: 3.66e-03, grad_scale: 32.0 2023-10-02 18:01:00,858 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_20120_210_6753_1_1531643035_5482859_510-96996-0_sp0.9 from training. Duration: 0.9666875 2023-10-02 18:01:00,860 WARNING [train.py:1204] (2/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_225-180155-0_sp1.1 from training. Duration: 0.92725 2023-10-02 18:01:02,222 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_278-272612-0_sp0.9 from training. Duration: 0.811125 2023-10-02 18:01:03,552 WARNING [train.py:1204] (2/4) Exclude cut with ID _53713_210_24116_1_1534314487762_7416751_691-70244-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 18:01:05,045 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6788_210_1388_1_1527898698_4562021_91-197981-0_sp1.1 from training. Duration: 0.7545625 2023-10-02 18:01:05,831 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.0.bypass.skip_rate, batch_count=969180.0, ans=0.035 2023-10-02 18:01:08,400 WARNING [train.py:1204] (2/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_60-18123-0_sp0.9 from training. Duration: 0.97775 2023-10-02 18:01:08,434 WARNING [train.py:1204] (2/4) Exclude cut with ID _35799_210_5854_1_1533263487887_3529071_5-214617-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 18:01:08,449 WARNING [train.py:1204] (2/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_890-5117-0_sp1.1 from training. Duration: 0.7090625 2023-10-02 18:01:11,232 WARNING [train.py:1204] (2/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_660-211088-0 from training. Duration: 0.62 2023-10-02 18:01:11,308 WARNING [train.py:1204] (2/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_314-126819-0_sp0.9 from training. Duration: 0.5555625 2023-10-02 18:01:11,343 WARNING [train.py:1204] (2/4) Exclude cut with ID _21632_210_2253_1_1531994001765_3744140_362-66225-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 18:01:12,660 WARNING [train.py:1204] (2/4) Exclude cut with ID _61118_210_9527_1_1534897668884_3752630_173-258555-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 18:01:15,456 WARNING [train.py:1204] (2/4) Exclude cut with ID _7719_210_3525_1_1528456153163_4296960_88-250259-0_sp0.9 from training. Duration: 0.9333125 2023-10-02 18:01:15,486 WARNING [train.py:1204] (2/4) Exclude cut with ID _15140_210_9288_1_1530791388450_4880149_539-153708-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 18:01:15,516 WARNING [train.py:1204] (2/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_253-42544-0_sp1.1 from training. Duration: 0.97275 2023-10-02 18:01:17,611 WARNING [train.py:1204] (2/4) Exclude cut with ID _33640_210_2655_1_1533117605479_4855469_479-296877-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 18:01:20,384 WARNING [train.py:1204] (2/4) Exclude cut with ID 3033-130750-0096-55598-0_sp1.1 from training. Duration: 0.7545625 2023-10-02 18:01:20,402 WARNING [train.py:1204] (2/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_191-41023-0_sp1.1 from training. Duration: 0.6454375 2023-10-02 18:01:20,428 WARNING [train.py:1204] (2/4) Exclude cut with ID _8365_210_2064_1_1528592311070_4677690_431-294102-0_sp1.1 from training. Duration: 0.8090625 2023-10-02 18:01:21,714 WARNING [train.py:1204] (2/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_893-296030-0_sp1.1 from training. Duration: 0.97275 2023-10-02 18:01:25,793 WARNING [train.py:1204] (2/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_401-343820-0_sp1.1 from training. Duration: 0.97275 2023-10-02 18:01:26,079 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.conv_module1.balancer1.max_abs, batch_count=969246.6666666666, ans=10.0 2023-10-02 18:01:28,560 WARNING [train.py:1204] (2/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_127-566-0_sp1.1 from training. Duration: 0.763625 2023-10-02 18:01:28,561 WARNING [train.py:1204] (2/4) Exclude cut with ID _14100_210_8341_1_1530529320613_3678069_126-272847-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 18:01:30,500 WARNING [train.py:1204] (2/4) Exclude cut with ID _15133_210_6228_1_1530794512074_3080379_29-264458-0_sp1.1 from training. Duration: 0.52725 2023-10-02 18:01:30,500 WARNING [train.py:1204] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_127-119109-0 from training. Duration: 0.72 2023-10-02 18:01:32,040 WARNING [train.py:1204] (2/4) Exclude cut with ID _6537_210_1883_1_1527987559710_3597129_382-133062-0_sp1.1 from training. Duration: 0.6181875 2023-10-02 18:01:32,051 WARNING [train.py:1204] (2/4) Exclude cut with ID _29529_210_7234_1_1532393961455_3977460_391-219682-0_sp1.1 from training. Duration: 0.936375 2023-10-02 18:01:32,100 WARNING [train.py:1204] (2/4) Exclude cut with ID _37537_210_2253_1_1533171758568_3421949_96-137189-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 18:01:35,911 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.feed_forward3.hidden_balancer.prob, batch_count=969313.3333333334, ans=0.125 2023-10-02 18:01:37,000 WARNING [train.py:1204] (2/4) Exclude cut with ID _58414_210_8254_1_1535421609162_7151182_15-276301-0_sp1.1 from training. Duration: 0.97275 2023-10-02 18:01:38,445 WARNING [train.py:1204] (2/4) Exclude cut with ID _59502_210_12210_1_1535107024698_1835139_110-289074-0_sp1.1 from training. Duration: 0.92725 2023-10-02 18:01:43,157 WARNING [train.py:1204] (2/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_367-61330-0_sp1.1 from training. Duration: 0.6818125 2023-10-02 18:01:47,201 WARNING [train.py:1204] (2/4) Exclude cut with ID _45297_210_2721_1_1533715351381_3264949_353-9541-0_sp1.1 from training. Duration: 0.8818125 2023-10-02 18:01:47,241 WARNING [train.py:1204] (2/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_85-138257-0_sp0.9 from training. Duration: 0.788875 2023-10-02 18:01:49,097 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_3787_210_2040_1_1526477544_5478586_603-120324-0 from training. Duration: 0.81 2023-10-02 18:01:49,147 WARNING [train.py:1204] (2/4) Exclude cut with ID _42863_210_9070_1_1533902398788_3696920_411-204131-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 18:01:53,276 WARNING [train.py:1204] (2/4) Exclude cut with ID _30196_210_1664_1_1533708496522_7074140_822-289909-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 18:01:53,353 WARNING [train.py:1204] (2/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_32-335103-0_sp0.9 from training. Duration: 0.911125 2023-10-02 18:01:53,921 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.4.encoder.layers.1.self_attn_weights.whiten_keys, num_groups=4, num_channels=128, metric=2.09 vs. limit=6.0 2023-10-02 18:01:54,683 WARNING [train.py:1204] (2/4) Exclude cut with ID _6493_210_1747_1_1528459210424_4012470_513-140967-0_sp0.9 from training. Duration: 0.87775 2023-10-02 18:01:56,861 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.3.encoder.layers.0.feed_forward3.out_whiten, num_groups=1, num_channels=512, metric=12.12 vs. limit=15.0 2023-10-02 18:02:01,881 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_47896_210_20508_1_1534492518_5958868_439-257766-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 18:02:03,205 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_1294-63152-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 18:02:05,153 WARNING [train.py:1204] (2/4) Exclude cut with ID _59170_210_6721_1_1534983633960_4203335_319-166512-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 18:02:06,613 WARNING [train.py:1204] (2/4) Exclude cut with ID _3541_210_2477_1_1526475408709_3263973_146-3470-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 18:02:09,422 WARNING [train.py:1204] (2/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_678-159307-0_sp1.1 from training. Duration: 0.77275 2023-10-02 18:02:09,461 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_343-48822-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 18:02:10,157 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_4692_210_3528_1_1526899755_4323254_107-44401-0 from training. Duration: 0.86 2023-10-02 18:02:11,227 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6291_210_3273_1_1527683649_1078498_74-292608-0_sp1.1 from training. Duration: 0.6545625 2023-10-02 18:02:12,559 WARNING [train.py:1204] (2/4) Exclude cut with ID _55644_210_11856_1_1534593599582_4103719_293-248694-0_sp1.1 from training. Duration: 0.97275 2023-10-02 18:02:12,642 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_3172_210_2087_1_1526292929_3987865_197-97762-0 from training. Duration: 0.75 2023-10-02 18:02:15,242 INFO [train.py:1046] (2/4) Epoch 28, batch 2000, loss[loss=0.1726, simple_loss=0.2573, pruned_loss=0.04395, over 24427.00 frames. ], tot_loss[loss=0.1672, simple_loss=0.2456, pruned_loss=0.04443, over 4718777.20 frames. ], batch size: 69, lr: 3.66e-03, grad_scale: 32.0 2023-10-02 18:02:15,343 WARNING [train.py:1204] (2/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_264-348896-0_sp0.9 from training. Duration: 0.9444375 2023-10-02 18:02:18,140 WARNING [train.py:1204] (2/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_209-205365-0_sp0.9 from training. Duration: 0.87775 2023-10-02 18:02:18,397 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.conv_skip_rate, batch_count=969513.3333333334, ans=0.0 2023-10-02 18:02:19,867 WARNING [train.py:1204] (2/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_222-198127-0_sp0.9 from training. Duration: 0.9333125 2023-10-02 18:02:19,908 WARNING [train.py:1204] (2/4) Exclude cut with ID _57572_210_5517_1_1534921219624_3809484_52-43530-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 18:02:22,631 WARNING [train.py:1204] (2/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_96-320736-0_sp1.1 from training. Duration: 0.7818125 2023-10-02 18:02:23,983 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_13091_210_7925_1_1530609447_8987354_1159-225508-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 18:02:25,494 WARNING [train.py:1204] (2/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_237-44332-0 from training. Duration: 0.94 2023-10-02 18:02:25,538 WARNING [train.py:1204] (2/4) Exclude cut with ID _7970_210_1883_1_1528455616296_3472230_25-178407-0_sp0.9 from training. Duration: 0.788875 2023-10-02 18:02:29,532 WARNING [train.py:1204] (2/4) Exclude cut with ID _29003_210_19596_1_1532507391720_3894098_357-257062-0_sp1.1 from training. Duration: 0.936375 2023-10-02 18:02:31,534 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_809-17156-0 from training. Duration: 0.73 2023-10-02 18:02:32,771 WARNING [train.py:1204] (2/4) Exclude cut with ID _7160_210_1876_1_1527940923231_3703799_302-150728-0_sp1.1 from training. Duration: 0.7090625 2023-10-02 18:02:32,780 WARNING [train.py:1204] (2/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_444-28041-0_sp0.9 from training. Duration: 0.9444375 2023-10-02 18:02:37,225 WARNING [train.py:1204] (2/4) Exclude cut with ID _52892_210_6721_1_1534378941259_4124129_278-251666-0_sp1.1 from training. Duration: 0.92725 2023-10-02 18:02:37,328 WARNING [train.py:1204] (2/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_115-326778-0 from training. Duration: 0.93 2023-10-02 18:02:37,555 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.balancer2.prob, batch_count=969580.0, ans=0.125 2023-10-02 18:02:38,692 WARNING [train.py:1204] (2/4) Exclude cut with ID _41391_210_7994_1_1533603771872_3638201_31-188961-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 18:02:40,095 WARNING [train.py:1204] (2/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_1073-6264-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 18:02:40,112 WARNING [train.py:1204] (2/4) Exclude cut with ID _66868_210_9527_1_1535509287954_4561140_435-259515-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 18:02:40,200 WARNING [train.py:1204] (2/4) Exclude cut with ID _14886_210_4220_1_1530702020355_3516190_159-73748-0 from training. Duration: 0.83 2023-10-02 18:02:40,236 WARNING [train.py:1204] (2/4) Exclude cut with ID _15150_210_8341_1_1530752411803_7256384_469-239509-0_sp1.1 from training. Duration: 0.6181875 2023-10-02 18:02:42,300 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=969580.0, ans=0.1 2023-10-02 18:02:43,506 WARNING [train.py:1204] (2/4) Exclude cut with ID _8064_210_5399_1_1528372299291_4282973_395-340852-0 from training. Duration: 0.93 2023-10-02 18:02:44,767 WARNING [train.py:1204] (2/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_138-187031-0_sp1.1 from training. Duration: 0.9 2023-10-02 18:02:47,551 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_13757_210_3528_1_1530440682_4107239_48-106347-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 18:02:48,891 WARNING [train.py:1204] (2/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_164-224052-0_sp1.1 from training. Duration: 0.636375 2023-10-02 18:02:48,896 WARNING [train.py:1204] (2/4) Exclude cut with ID _59288_210_5854_1_1535077757774_3412246_408-322648-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 18:02:48,937 WARNING [train.py:1204] (2/4) Exclude cut with ID _30191_210_1664_1_1533189915047_7196670_827-36359-0_sp1.1 from training. Duration: 0.963625 2023-10-02 18:02:51,012 WARNING [train.py:1204] (2/4) Exclude cut with ID _4680_210_3525_1_1527249757953_893386_75-78979-0_sp1.1 from training. Duration: 0.82725 2023-10-02 18:02:52,349 WARNING [train.py:1204] (2/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_383-165234-0 from training. Duration: 0.94 2023-10-02 18:02:53,823 WARNING [train.py:1204] (2/4) Exclude cut with ID _19896_210_1883_1_1531709022662_6649569_94-229927-0 from training. Duration: 0.97 2023-10-02 18:02:53,829 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6788_210_1388_1_1527898698_4562021_180-75288-0_sp1.1 from training. Duration: 0.9 2023-10-02 18:02:53,837 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_26433_210_12108_1_1532157158_4056480_54-321349-0_sp1.1 from training. Duration: 0.97275 2023-10-02 18:02:57,955 WARNING [train.py:1204] (2/4) Exclude cut with ID _56108_210_16380_1_1534505446159_3887959_265-161493-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 18:02:59,343 WARNING [train.py:1204] (2/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_395-111602-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 18:02:59,348 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_154-316450-0_sp0.9 from training. Duration: 0.8444375 2023-10-02 18:02:59,418 WARNING [train.py:1204] (2/4) Exclude cut with ID _43552_210_8913_1_1533726031059_3523429_381-116041-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 18:03:01,349 WARNING [train.py:1204] (2/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_65-104124-0_sp1.1 from training. Duration: 0.963625 2023-10-02 18:03:03,202 WARNING [train.py:1204] (2/4) Exclude cut with ID _6536_210_1883_1_1527850783339_3319069_169-23811-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 18:03:04,533 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_289-180904-0_sp0.9 from training. Duration: 0.8444375 2023-10-02 18:03:04,543 WARNING [train.py:1204] (2/4) Exclude cut with ID _56406_210_9288_1_1534899618664_3496149_226-242400-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 18:03:05,838 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_24899_210_16554_1_1532046422_3827462_276-216330-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 18:03:07,238 WARNING [train.py:1204] (2/4) Exclude cut with ID _29345_210_4381_1_1532689168263_3860169_198-174328-0_sp1.1 from training. Duration: 0.82725 2023-10-02 18:03:08,506 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.565e+02 1.888e+02 2.044e+02 2.349e+02 3.109e+02, threshold=4.088e+02, percent-clipped=0.0 2023-10-02 18:03:08,610 WARNING [train.py:1204] (2/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_338-96520-0 from training. Duration: 0.94 2023-10-02 18:03:08,870 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.conv_module1.balancer1.prob, batch_count=969713.3333333334, ans=0.125 2023-10-02 18:03:13,178 WARNING [train.py:1204] (2/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_7-111954-0_sp1.1 from training. Duration: 0.5090625 2023-10-02 18:03:14,603 WARNING [train.py:1204] (2/4) Exclude cut with ID _36692_210_5665_1_1533608967420_398809_8-16261-0_sp1.1 from training. Duration: 0.97275 2023-10-02 18:03:18,649 WARNING [train.py:1204] (2/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_86-212657-0_sp1.1 from training. Duration: 0.97275 2023-10-02 18:03:18,664 WARNING [train.py:1204] (2/4) Exclude cut with ID _44659_210_2721_1_1534036120061_6614860_626-298884-0_sp1.1 from training. Duration: 0.8818125 2023-10-02 18:03:23,336 WARNING [train.py:1204] (2/4) Exclude cut with ID _49755_210_18053_1_1534581005846_3218709_120-257918-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 18:03:26,076 WARNING [train.py:1204] (2/4) Exclude cut with ID _21881_210_3525_1_1531825242757_3798380_400-113821-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 18:03:26,078 WARNING [train.py:1204] (2/4) Exclude cut with ID _21783_210_11357_1_1531742458730_3762679_387-236773-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 18:03:26,159 WARNING [train.py:1204] (2/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_613-232856-0_sp1.1 from training. Duration: 0.4909375 2023-10-02 18:03:26,179 WARNING [train.py:1204] (2/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_181-78632-0_sp0.9 from training. Duration: 0.8666875 2023-10-02 18:03:28,968 INFO [train.py:1046] (2/4) Epoch 28, batch 2050, loss[loss=0.1484, simple_loss=0.2335, pruned_loss=0.03161, over 24430.00 frames. ], tot_loss[loss=0.1663, simple_loss=0.2441, pruned_loss=0.04419, over 4717026.95 frames. ], batch size: 63, lr: 3.66e-03, grad_scale: 32.0 2023-10-02 18:03:30,424 WARNING [train.py:1204] (2/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_175-17485-0_sp1.1 from training. Duration: 0.97275 2023-10-02 18:03:30,510 WARNING [train.py:1204] (2/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_1004-183422-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 18:03:35,059 WARNING [train.py:1204] (2/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_32-65962-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 18:03:35,127 WARNING [train.py:1204] (2/4) Exclude cut with ID _15458_210_8341_1_1530858614263_3834410_40-298664-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 18:03:39,838 WARNING [train.py:1204] (2/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_380-261863-0_sp1.1 from training. Duration: 0.963625 2023-10-02 18:03:41,320 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_372-173012-0_sp1.1 from training. Duration: 0.863625 2023-10-02 18:03:41,362 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_57-82205-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 18:03:42,698 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_3691_210_3528_1_1526468169_4062053_355-334432-0_sp0.9 from training. Duration: 0.9666875 2023-10-02 18:03:44,498 WARNING [train.py:1204] (2/4) Exclude cut with ID _19023_210_4353_1_1531378892552_5820780_28-36615-0 from training. Duration: 0.97 2023-10-02 18:03:44,503 WARNING [train.py:1204] (2/4) Exclude cut with ID _29853_210_7497_1_1532505600851_6642110_286-251333-0_sp1.1 from training. Duration: 0.92725 2023-10-02 18:03:47,112 WARNING [train.py:1204] (2/4) Exclude cut with ID _31602_210_5854_1_1532677021456_4458250_518-80679-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 18:03:47,156 WARNING [train.py:1204] (2/4) Exclude cut with ID _14922_210_6228_1_1530707435273_3693310_38-187851-0_sp0.9 from training. Duration: 0.688875 2023-10-02 18:03:56,203 WARNING [train.py:1204] (2/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_39-248870-0_sp1.1 from training. Duration: 0.62725 2023-10-02 18:03:56,227 WARNING [train.py:1204] (2/4) Exclude cut with ID _16908_210_5724_1_1531119157248_3772700_154-31455-0_sp1.1 from training. Duration: 0.97275 2023-10-02 18:03:57,642 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8453_210_4211_1_1528502940_8562730_347-330673-0 from training. Duration: 0.72 2023-10-02 18:04:00,297 WARNING [train.py:1204] (2/4) Exclude cut with ID _30191_210_1664_1_1533189915047_7196670_674-204247-0_sp1.1 from training. Duration: 0.97275 2023-10-02 18:04:01,640 WARNING [train.py:1204] (2/4) Exclude cut with ID _8833_210_5219_1_1528592665210_7553059_329-125156-0 from training. Duration: 0.82 2023-10-02 18:04:01,677 WARNING [train.py:1204] (2/4) Exclude cut with ID _4779_210_2084_1_1527318022944_3914893_500-285013-0_sp1.1 from training. Duration: 0.62725 2023-10-02 18:04:04,982 WARNING [train.py:1204] (2/4) Exclude cut with ID _14421_210_8750_1_1530604795825_3542290_190-313578-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 18:04:07,710 WARNING [train.py:1204] (2/4) Exclude cut with ID _35799_210_5854_1_1533263487887_3529071_538-273574-0_sp1.1 from training. Duration: 0.936375 2023-10-02 18:04:09,116 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_14928_210_5632_1_1530773461_4714000_454-112198-0_sp1.1 from training. Duration: 0.6 2023-10-02 18:04:09,161 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_468-143126-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 18:04:09,347 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.bypass_mid.scale_min, batch_count=969980.0, ans=0.2 2023-10-02 18:04:10,474 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_4528_210_3273_1_1527076654_3276777_258-121681-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 18:04:11,835 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_397-132565-0_sp1.1 from training. Duration: 0.9 2023-10-02 18:04:11,855 WARNING [train.py:1204] (2/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_454-123002-0_sp0.9 from training. Duration: 0.7666875 2023-10-02 18:04:13,990 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.conv_module2.balancer1.prob, batch_count=970046.6666666666, ans=0.125 2023-10-02 18:04:15,214 WARNING [train.py:1204] (2/4) Exclude cut with ID _27052_210_5986_1_1532329197187_3585009_297-86890-0_sp1.1 from training. Duration: 0.936375 2023-10-02 18:04:17,820 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_51225_210_3601_1_1534400634_2327383_55-301844-0_sp1.1 from training. Duration: 0.8454375 2023-10-02 18:04:20,542 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6786_210_1388_1_1527839699_4194495_378-46070-0_sp0.9 from training. Duration: 0.8 2023-10-02 18:04:20,642 WARNING [train.py:1204] (2/4) Exclude cut with ID _42334_210_7770_1_1534206610396_3605880_55-19758-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 18:04:24,753 WARNING [train.py:1204] (2/4) Exclude cut with ID _14884_210_2252_1_1530707459689_5561250_114-201399-0_sp0.9 from training. Duration: 0.8444375 2023-10-02 18:04:30,920 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_99-251314-0_sp0.9 from training. Duration: 0.9666875 2023-10-02 18:04:30,997 WARNING [train.py:1204] (2/4) Exclude cut with ID _26528_210_9070_1_1532570410842_3851310_497-97218-0 from training. Duration: 0.86 2023-10-02 18:04:34,532 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.conv_module1.balancer2.prob, batch_count=970113.3333333334, ans=0.125 2023-10-02 18:04:37,117 WARNING [train.py:1204] (2/4) Exclude cut with ID _55328_210_27767_1_1534580973260_4088170_230-317241-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 18:04:38,483 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_125-203105-0_sp0.9 from training. Duration: 0.888875 2023-10-02 18:04:39,778 WARNING [train.py:1204] (2/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_55-31904-0_sp1.1 from training. Duration: 0.8 2023-10-02 18:04:41,217 WARNING [train.py:1204] (2/4) Exclude cut with ID _8064_210_5399_1_1528372299291_4282973_18-174647-0 from training. Duration: 0.91 2023-10-02 18:04:42,478 INFO [train.py:1046] (2/4) Epoch 28, batch 2100, loss[loss=0.1861, simple_loss=0.2547, pruned_loss=0.05875, over 23789.00 frames. ], tot_loss[loss=0.1661, simple_loss=0.2432, pruned_loss=0.04445, over 4713531.79 frames. ], batch size: 164, lr: 3.66e-03, grad_scale: 8.0 2023-10-02 18:04:44,400 WARNING [train.py:1204] (2/4) Exclude cut with ID IC0165W0005-81726 from training. Duration: 0.952 2023-10-02 18:04:44,400 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_31583_210_3852_1_1532595851_4024413_148-171847-0_sp1.1 from training. Duration: 0.963625 2023-10-02 18:04:44,449 WARNING [train.py:1204] (2/4) Exclude cut with ID _21989_210_5854_1_1531899214931_3271360_116-138769-0_sp1.1 from training. Duration: 0.936375 2023-10-02 18:04:45,778 WARNING [train.py:1204] (2/4) Exclude cut with ID _66070_210_7640_1_1535441393096_7805310_489-292631-0_sp1.1 from training. Duration: 0.7181875 2023-10-02 18:04:46,744 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.0.layers.0.nonlin_attention.whiten1, num_groups=1, num_channels=144, metric=9.12 vs. limit=10.0 2023-10-02 18:04:47,116 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_20120_210_6753_1_1531643035_5482859_202-27960-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 18:04:47,124 WARNING [train.py:1204] (2/4) Exclude cut with ID _5294_210_1747_1_1527249643928_3696179_342-270278-0 from training. Duration: 0.55 2023-10-02 18:04:47,155 WARNING [train.py:1204] (2/4) Exclude cut with ID _38323_210_12108_1_1533121233369_3303049_416-11919-0 from training. Duration: 0.96 2023-10-02 18:04:48,507 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6545_210_2040_1_1527774670_4245258_407-48070-0_sp0.9 from training. Duration: 0.8444375 2023-10-02 18:04:48,711 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.self_attn_weights.pos_emb_skip_rate, batch_count=970180.0, ans=0.0 2023-10-02 18:04:51,236 WARNING [train.py:1204] (2/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_503-30719-0_sp1.1 from training. Duration: 0.8909375 2023-10-02 18:04:52,638 WARNING [train.py:1204] (2/4) Exclude cut with ID _49643_210_7989_1_1534640244224_3939031_234-326497-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 18:04:54,115 WARNING [train.py:1204] (2/4) Exclude cut with ID _14170_210_5219_1_1530525949868_4469650_301-77873-0_sp1.1 from training. Duration: 0.963625 2023-10-02 18:04:54,356 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.balancer1.prob, batch_count=970180.0, ans=0.125 2023-10-02 18:04:55,450 WARNING [train.py:1204] (2/4) Exclude cut with ID _45297_210_2721_1_1533715351381_3264949_152-154697-0_sp1.1 from training. Duration: 0.97275 2023-10-02 18:04:55,457 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_3371_210_1794_1_1526384462_5580777_396-339812-0 from training. Duration: 0.85 2023-10-02 18:04:56,852 WARNING [train.py:1204] (2/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_399-156791-0_sp1.1 from training. Duration: 0.7545625 2023-10-02 18:04:56,900 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7696_210_2040_1_1528207167_3716099_117-125434-0 from training. Duration: 0.76 2023-10-02 18:04:56,908 WARNING [train.py:1204] (2/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_96-244299-0 from training. Duration: 0.88 2023-10-02 18:04:58,872 WARNING [train.py:1204] (2/4) Exclude cut with ID _37892_210_19596_1_1533198677897_3964058_519-343462-0_sp1.1 from training. Duration: 0.92725 2023-10-02 18:05:00,185 WARNING [train.py:1204] (2/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_202-183922-0_sp1.1 from training. Duration: 0.77275 2023-10-02 18:05:00,191 WARNING [train.py:1204] (2/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_453-133983-0 from training. Duration: 0.78 2023-10-02 18:05:00,226 WARNING [train.py:1204] (2/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_178-39722-0_sp1.1 from training. Duration: 0.4454375 2023-10-02 18:05:00,567 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.balancer1.prob, batch_count=970246.6666666666, ans=0.125 2023-10-02 18:05:03,590 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_15126_210_6753_1_1530778677_2591185_113-157651-0 from training. Duration: 0.68 2023-10-02 18:05:03,591 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_271-234962-0_sp1.1 from training. Duration: 0.7181875 2023-10-02 18:05:03,743 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.feed_forward2.hidden_balancer.prob, batch_count=970246.6666666666, ans=0.125 2023-10-02 18:05:08,183 WARNING [train.py:1204] (2/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_195-64967-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 18:05:09,453 WARNING [train.py:1204] (2/4) Exclude cut with ID _43552_210_8913_1_1533726031059_3523429_365-215065-0_sp1.1 from training. Duration: 0.936375 2023-10-02 18:05:12,972 WARNING [train.py:1204] (2/4) Exclude cut with ID _31602_210_5854_1_1532677021456_4458250_249-96604-0_sp0.9 from training. Duration: 0.988875 2023-10-02 18:05:13,018 WARNING [train.py:1204] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_307-158462-0 from training. Duration: 0.69 2023-10-02 18:05:13,070 WARNING [train.py:1204] (2/4) Exclude cut with ID _30918_210_6400_1_1532754078600_3125920_407-223394-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 18:05:13,074 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6673_210_3273_1_1527767790_3173240_351-167954-0_sp0.9 from training. Duration: 0.5333125 2023-10-02 18:05:15,826 WARNING [train.py:1204] (2/4) Exclude cut with ID _4866_210_2577_1_1527404412284_4364659_256-306830-0 from training. Duration: 0.87 2023-10-02 18:05:15,881 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_17613_210_2151_1_1531137569_2366160_222-131118-0_sp1.1 from training. Duration: 0.963625 2023-10-02 18:05:15,892 WARNING [train.py:1204] (2/4) Exclude cut with ID _45560_210_21982_1_1533812603951_3584999_111-6376-0 from training. Duration: 0.84 2023-10-02 18:05:15,920 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_216-73841-0 from training. Duration: 0.96 2023-10-02 18:05:15,957 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_14058_210_4163_1_1530690953_6327180_49-210687-0 from training. Duration: 0.94 2023-10-02 18:05:18,619 WARNING [train.py:1204] (2/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_506-42614-0_sp0.9 from training. Duration: 0.87775 2023-10-02 18:05:20,066 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.attention_skip_rate, batch_count=970313.3333333334, ans=0.0 2023-10-02 18:05:21,198 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_3133_210_2087_1_1526106446_2324690_25-332138-0_sp1.1 from training. Duration: 0.82725 2023-10-02 18:05:21,411 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.1.attention_skip_rate, batch_count=970313.3333333334, ans=0.0 2023-10-02 18:05:22,682 WARNING [train.py:1204] (2/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_589-9070-0_sp1.1 from training. Duration: 0.6454375 2023-10-02 18:05:22,920 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.conv_module2.balancer1.prob, batch_count=970313.3333333334, ans=0.125 2023-10-02 18:05:24,014 WARNING [train.py:1204] (2/4) Exclude cut with ID _6194_210_1534_1_1527511826160_3941194_14-282876-0_sp1.1 from training. Duration: 0.6454375 2023-10-02 18:05:25,392 WARNING [train.py:1204] (2/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_1166-90654-0_sp1.1 from training. Duration: 0.92725 2023-10-02 18:05:26,774 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_14056_210_4163_1_1530604483_7572827_218-255858-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 18:05:26,776 WARNING [train.py:1204] (2/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_1285-256622-0 from training. Duration: 0.84 2023-10-02 18:05:26,788 WARNING [train.py:1204] (2/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_205-286261-0_sp1.1 from training. Duration: 0.963625 2023-10-02 18:05:26,802 WARNING [train.py:1204] (2/4) Exclude cut with ID _68777_210_4353_1_1535810390731_3117904_6-154119-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 18:05:27,045 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.bypass.scale_min, batch_count=970380.0, ans=0.2 2023-10-02 18:05:28,710 WARNING [train.py:1204] (2/4) Exclude cut with ID _22982_210_1883_1_1531882790447_5449409_565-199393-0_sp1.1 from training. Duration: 0.92725 2023-10-02 18:05:28,719 WARNING [train.py:1204] (2/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_278-154492-0 from training. Duration: 0.76 2023-10-02 18:05:30,709 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_426-313557-0 from training. Duration: 0.53 2023-10-02 18:05:30,958 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.conv_module1.balancer2.prob, batch_count=970380.0, ans=0.125 2023-10-02 18:05:32,029 WARNING [train.py:1204] (2/4) Exclude cut with ID _41817_210_18835_1_1533434477160_7423049_555-293448-0 from training. Duration: 0.8 2023-10-02 18:05:34,255 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.0.layers.1.self_attn2.whiten, num_groups=1, num_channels=192, metric=8.40 vs. limit=22.5 2023-10-02 18:05:35,458 WARNING [train.py:1204] (2/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_32-335103-0_sp1.1 from training. Duration: 0.7454375 2023-10-02 18:05:38,186 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_2655_210_2087_1_1525688959_3897400_202-147557-0_sp1.1 from training. Duration: 0.77275 2023-10-02 18:05:39,456 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.616e+02 1.839e+02 2.053e+02 2.400e+02 3.677e+02, threshold=4.107e+02, percent-clipped=0.0 2023-10-02 18:05:39,526 WARNING [train.py:1204] (2/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_158-250116-0 from training. Duration: 0.62 2023-10-02 18:05:43,142 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.1.conv_module2.balancer2.prob, batch_count=970446.6666666666, ans=0.125 2023-10-02 18:05:44,285 WARNING [train.py:1204] (2/4) Exclude cut with ID _55429_210_10626_1_1534467588527_7174143_528-88083-0_sp1.1 from training. Duration: 0.963625 2023-10-02 18:05:44,557 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.bypass_mid.scale_min, batch_count=970446.6666666666, ans=0.2 2023-10-02 18:05:45,879 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.balancer2.prob, batch_count=970446.6666666666, ans=0.125 2023-10-02 18:05:47,018 WARNING [train.py:1204] (2/4) Exclude cut with ID _8030_210_7497_1_1528444886781_3531089_32-31837-0_sp1.1 from training. Duration: 0.8818125 2023-10-02 18:05:47,064 WARNING [train.py:1204] (2/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_744-86168-0_sp1.1 from training. Duration: 0.97275 2023-10-02 18:05:47,073 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1113-7369-0_sp0.9 from training. Duration: 0.9444375 2023-10-02 18:05:47,209 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.feed_forward1.hidden_balancer.prob, batch_count=970446.6666666666, ans=0.125 2023-10-02 18:05:48,374 WARNING [train.py:1204] (2/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_124-345840-0_sp0.9 from training. Duration: 0.57775 2023-10-02 18:05:48,432 WARNING [train.py:1204] (2/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_328-264776-0_sp0.9 from training. Duration: 0.7444375 2023-10-02 18:05:48,699 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.ff2_skip_rate, batch_count=970446.6666666666, ans=0.0 2023-10-02 18:05:49,838 WARNING [train.py:1204] (2/4) Exclude cut with ID _18895_210_6228_1_1531357274234_3784930_469-166615-0_sp1.1 from training. Duration: 0.963625 2023-10-02 18:05:49,844 WARNING [train.py:1204] (2/4) Exclude cut with ID _8395_210_1531_1_1528597871523_4411579_431-132470-0_sp0.9 from training. Duration: 0.82225 2023-10-02 18:05:49,916 WARNING [train.py:1204] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_554-45617-0_sp1.1 from training. Duration: 0.9 2023-10-02 18:05:49,935 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_35361_210_4238_1_1533262810_7793476_524-188242-0_sp1.1 from training. Duration: 0.92725 2023-10-02 18:05:52,676 WARNING [train.py:1204] (2/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_938-205135-0 from training. Duration: 0.87 2023-10-02 18:05:54,200 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_5931_210_2040_1_1527428481_4547999_321-193614-0 from training. Duration: 0.67 2023-10-02 18:05:54,211 WARNING [train.py:1204] (2/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_445-269521-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 18:05:56,851 INFO [train.py:1046] (2/4) Epoch 28, batch 2150, loss[loss=0.1625, simple_loss=0.2318, pruned_loss=0.04654, over 23433.00 frames. ], tot_loss[loss=0.1648, simple_loss=0.2414, pruned_loss=0.04414, over 4700514.14 frames. ], batch size: 285, lr: 3.66e-03, grad_scale: 8.0 2023-10-02 18:05:56,981 WARNING [train.py:1204] (2/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_3-33116-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 18:05:56,983 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_4633_210_2040_1_1526824163_4145343_416-76117-0_sp1.1 from training. Duration: 0.863625 2023-10-02 18:05:57,010 WARNING [train.py:1204] (2/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_1033-274376-0_sp1.1 from training. Duration: 0.7909375 2023-10-02 18:05:58,299 WARNING [train.py:1204] (2/4) Exclude cut with ID _8772_210_1850_1_1528635595972_3664011_398-320049-0_sp0.9 from training. Duration: 0.97775 2023-10-02 18:06:04,926 WARNING [train.py:1204] (2/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_321-215742-0_sp0.9 from training. Duration: 0.5555625 2023-10-02 18:06:06,380 WARNING [train.py:1204] (2/4) Exclude cut with ID _28394_210_8913_1_1532430049518_3650118_272-190950-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 18:06:08,194 WARNING [train.py:1204] (2/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_31-299371-0_sp1.1 from training. Duration: 0.92725 2023-10-02 18:06:11,549 WARNING [train.py:1204] (2/4) Exclude cut with ID _55383_210_10635_1_1534468426172_2231669_134-128246-0_sp0.9 from training. Duration: 0.988875 2023-10-02 18:06:11,552 WARNING [train.py:1204] (2/4) Exclude cut with ID _40519_210_13794_1_1533715228655_7337390_887-9547-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 18:06:11,578 WARNING [train.py:1204] (2/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_310-109461-0_sp0.9 from training. Duration: 0.888875 2023-10-02 18:06:14,291 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.1.attention_skip_rate, batch_count=970580.0, ans=0.0 2023-10-02 18:06:15,478 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_2009_210_2071_1_1525481134_4588937_258-250196-0_sp1.1 from training. Duration: 0.92725 2023-10-02 18:06:16,810 WARNING [train.py:1204] (2/4) Exclude cut with ID _9235_210_5219_1_1528891154253_8046120_1186-72702-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 18:06:16,813 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_14831_210_7927_1_1530700200_5583610_181-27467-0_sp1.1 from training. Duration: 0.82725 2023-10-02 18:06:19,642 WARNING [train.py:1204] (2/4) Exclude cut with ID _40402_210_18219_1_1533522324115_2953150_278-132109-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 18:06:20,873 WARNING [train.py:1204] (2/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_261-297503-0 from training. Duration: 0.99 2023-10-02 18:06:21,899 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.0.layers.1.feed_forward2.out_whiten, num_groups=1, num_channels=192, metric=4.49 vs. limit=15.0 2023-10-02 18:06:23,745 WARNING [train.py:1204] (2/4) Exclude cut with ID _19896_210_1883_1_1531709022662_6649569_313-265903-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 18:06:25,154 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_105-114797-0_sp1.1 from training. Duration: 0.763625 2023-10-02 18:06:25,242 WARNING [train.py:1204] (2/4) Exclude cut with ID _53755_210_8254_1_1534577312941_4707360_9-65843-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 18:06:25,272 WARNING [train.py:1204] (2/4) Exclude cut with ID _18516_210_9206_1_1531704653206_3692406_361-224549-0_sp1.1 from training. Duration: 0.97275 2023-10-02 18:06:26,516 WARNING [train.py:1204] (2/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_403-310975-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 18:06:26,559 WARNING [train.py:1204] (2/4) Exclude cut with ID _5444_210_2151_1_1527418808464_5312210_581-315411-0_sp0.9 from training. Duration: 0.688875 2023-10-02 18:06:26,605 WARNING [train.py:1204] (2/4) Exclude cut with ID _23825_210_13449_1_1531915313526_3497150_464-154904-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 18:06:26,615 WARNING [train.py:1204] (2/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_937-208281-0_sp0.9 from training. Duration: 0.9444375 2023-10-02 18:06:27,974 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_37839_210_7234_1_1533542462_3689303_158-181212-0_sp1.1 from training. Duration: 0.963625 2023-10-02 18:06:29,463 WARNING [train.py:1204] (2/4) Exclude cut with ID _8772_210_1850_1_1528635595972_3664011_398-320049-0 from training. Duration: 0.88 2023-10-02 18:06:31,651 WARNING [train.py:1204] (2/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_718-250907-0_sp1.1 from training. Duration: 0.62725 2023-10-02 18:06:33,613 WARNING [train.py:1204] (2/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_641-126395-0_sp1.1 from training. Duration: 0.92725 2023-10-02 18:06:33,654 WARNING [train.py:1204] (2/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_166-241493-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 18:06:34,945 WARNING [train.py:1204] (2/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_526-62079-0_sp0.9 from training. Duration: 0.7444375 2023-10-02 18:06:37,632 WARNING [train.py:1204] (2/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_439-275083-0_sp1.1 from training. Duration: 0.936375 2023-10-02 18:06:39,044 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_66907_210_21581_1_1535615661_3590177_493-229032-0_sp1.1 from training. Duration: 0.92725 2023-10-02 18:06:39,219 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.conv_module1.balancer2.prob, batch_count=970646.6666666666, ans=0.125 2023-10-02 18:06:40,369 WARNING [train.py:1204] (2/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_89-321955-0_sp1.1 from training. Duration: 0.72725 2023-10-02 18:06:40,468 WARNING [train.py:1204] (2/4) Exclude cut with ID _29857_210_20034_1_1532395886753_3798160_273-226195-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 18:06:41,080 WARNING [train.py:1204] (2/4) Exclude cut with ID _22826_210_9994_1_1531908201522_3279169_404-75889-0 from training. Duration: 0.89 2023-10-02 18:06:42,256 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_271-234962-0_sp0.9 from training. Duration: 0.87775 2023-10-02 18:06:45,615 WARNING [train.py:1204] (2/4) Exclude cut with ID _22961_210_8254_1_1531976366672_4278180_156-339685-0_sp1.1 from training. Duration: 0.97275 2023-10-02 18:06:45,660 WARNING [train.py:1204] (2/4) Exclude cut with ID _37892_210_19596_1_1533198677897_3964058_425-231927-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 18:06:47,029 WARNING [train.py:1204] (2/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_102-288670-0_sp1.1 from training. Duration: 0.97275 2023-10-02 18:06:48,308 WARNING [train.py:1204] (2/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_283-220372-0_sp1.1 from training. Duration: 0.5090625 2023-10-02 18:06:49,744 WARNING [train.py:1204] (2/4) Exclude cut with ID _44304_210_15374_1_1533639352765_4613129_86-1699-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 18:06:49,823 WARNING [train.py:1204] (2/4) Exclude cut with ID _2891_210_2550_1_1525874398521_4427710_327-121590-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 18:06:49,832 WARNING [train.py:1204] (2/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_369-222060-0 from training. Duration: 0.71 2023-10-02 18:06:51,185 WARNING [train.py:1204] (2/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_762-109886-0 from training. Duration: 0.91 2023-10-02 18:06:52,435 WARNING [train.py:1204] (2/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_477-255782-0_sp0.9 from training. Duration: 0.911125 2023-10-02 18:06:52,469 WARNING [train.py:1204] (2/4) Exclude cut with ID IC0669W0061-330906 from training. Duration: 0.768 2023-10-02 18:06:52,512 WARNING [train.py:1204] (2/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_347-213071-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 18:06:52,545 WARNING [train.py:1204] (2/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_507-303370-0_sp1.1 from training. Duration: 0.87275 2023-10-02 18:06:53,980 WARNING [train.py:1204] (2/4) Exclude cut with ID _15012_210_1968_1_1530856820429_5095350_688-178849-0 from training. Duration: 0.64 2023-10-02 18:06:53,986 WARNING [train.py:1204] (2/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_28-70987-0_sp0.9 from training. Duration: 0.9666875 2023-10-02 18:06:53,988 WARNING [train.py:1204] (2/4) Exclude cut with ID _5910_210_1389_1_1527337972454_5050100_555-159815-0 from training. Duration: 0.98 2023-10-02 18:06:54,016 WARNING [train.py:1204] (2/4) Exclude cut with ID IC0679W0064-335871 from training. Duration: 0.768 2023-10-02 18:06:54,016 WARNING [train.py:1204] (2/4) Exclude cut with ID IC0679W0079-335886 from training. Duration: 0.896 2023-10-02 18:06:55,409 WARNING [train.py:1204] (2/4) Exclude cut with ID _5608_210_2106_1_1527415439089_6819768_909-82767-0 from training. Duration: 0.61 2023-10-02 18:06:56,732 WARNING [train.py:1204] (2/4) Exclude cut with ID _23038_210_9570_1_1532001584158_3652419_411-323967-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 18:06:58,044 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_40896_210_15710_1_1533433956_7100802_350-231748-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 18:06:58,055 WARNING [train.py:1204] (2/4) Exclude cut with ID _5289_210_2084_1_1527678202736_3779299_142-103317-0_sp0.9 from training. Duration: 0.9333125 2023-10-02 18:06:58,141 WARNING [train.py:1204] (2/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_314-144055-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 18:06:59,557 WARNING [train.py:1204] (2/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_177-236809-0_sp0.9 from training. Duration: 0.5444375 2023-10-02 18:07:00,520 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.1.encoder.layers.0.self_attn1.whiten, num_groups=1, num_channels=256, metric=11.80 vs. limit=22.5 2023-10-02 18:07:00,984 WARNING [train.py:1204] (2/4) Exclude cut with ID _23038_210_9570_1_1532001584158_3652419_82-334733-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 18:07:00,999 WARNING [train.py:1204] (2/4) Exclude cut with ID _43552_210_8913_1_1533726031059_3523429_225-22526-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 18:07:09,986 WARNING [train.py:1204] (2/4) Exclude cut with ID _30327_210_16049_1_1532484161295_3924169_401-70819-0_sp1.1 from training. Duration: 0.7818125 2023-10-02 18:07:10,039 WARNING [train.py:1204] (2/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_538-32291-0 from training. Duration: 0.83 2023-10-02 18:07:11,931 INFO [train.py:1046] (2/4) Epoch 28, batch 2200, loss[loss=0.1451, simple_loss=0.2184, pruned_loss=0.03589, over 21504.00 frames. ], tot_loss[loss=0.165, simple_loss=0.2415, pruned_loss=0.04426, over 4707474.03 frames. ], batch size: 47, lr: 3.66e-03, grad_scale: 8.0 2023-10-02 18:07:14,718 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_37357_210_17024_1_1533089504_2316121_258-211605-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 18:07:17,488 WARNING [train.py:1204] (2/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_550-335198-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 18:07:18,856 WARNING [train.py:1204] (2/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_98-741-0_sp0.9 from training. Duration: 0.888875 2023-10-02 18:07:20,180 WARNING [train.py:1204] (2/4) Exclude cut with ID _38323_210_12108_1_1533121233369_3303049_251-238988-0_sp1.1 from training. Duration: 0.92725 2023-10-02 18:07:20,275 WARNING [train.py:1204] (2/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_259-34038-0_sp0.9 from training. Duration: 0.82225 2023-10-02 18:07:24,167 WARNING [train.py:1204] (2/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_1060-180633-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 18:07:24,207 WARNING [train.py:1204] (2/4) Exclude cut with ID _8755_210_1876_1_1528628559357_3258209_211-37473-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 18:07:24,212 WARNING [train.py:1204] (2/4) Exclude cut with ID _4122_210_2084_1_1526713164417_4353280_528-206771-0 from training. Duration: 0.88 2023-10-02 18:07:27,766 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.4.encoder.layers.0.conv_module1.whiten, num_groups=1, num_channels=384, metric=4.28 vs. limit=15.0 2023-10-02 18:07:27,928 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.2.encoder.layers.1.conv_module2.whiten, num_groups=1, num_channels=384, metric=4.75 vs. limit=15.0 2023-10-02 18:07:28,565 WARNING [train.py:1204] (2/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_64-248359-0 from training. Duration: 0.69 2023-10-02 18:07:30,534 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.4.encoder.layers.1.nonlin_attention.whiten1, num_groups=1, num_channels=288, metric=4.12 vs. limit=10.0 2023-10-02 18:07:31,286 WARNING [train.py:1204] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_402-253027-0_sp1.1 from training. Duration: 0.4909375 2023-10-02 18:07:37,276 WARNING [train.py:1204] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_625-102955-0 from training. Duration: 0.77 2023-10-02 18:07:39,338 WARNING [train.py:1204] (2/4) Exclude cut with ID _14998_210_5219_1_1530705582080_7474970_611-219324-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 18:07:40,921 WARNING [train.py:1204] (2/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_718-68505-0_sp0.9 from training. Duration: 0.988875 2023-10-02 18:07:40,967 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_353-182855-0_sp1.1 from training. Duration: 0.87275 2023-10-02 18:07:44,857 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_5960_210_1794_1_1527403865_3990999_515-2613-0_sp0.9 from training. Duration: 0.97775 2023-10-02 18:07:44,884 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_5575_210_1410_1_1527301305_2570170_342-265572-0 from training. Duration: 0.94 2023-10-02 18:07:47,573 WARNING [train.py:1204] (2/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_329-113173-0_sp0.9 from training. Duration: 0.688875 2023-10-02 18:07:48,945 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_15396_210_6890_1_1531308473_1938936_213-312506-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 18:07:49,008 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_521-214276-0_sp1.1 from training. Duration: 0.4 2023-10-02 18:07:53,130 WARNING [train.py:1204] (2/4) Exclude cut with ID _15026_210_8750_1_1530777602316_3291850_211-195043-0_sp1.1 from training. Duration: 0.72725 2023-10-02 18:07:54,526 WARNING [train.py:1204] (2/4) Exclude cut with ID _29343_210_4381_1_1532602757752_3674109_108-257494-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 18:07:55,991 WARNING [train.py:1204] (2/4) Exclude cut with ID _64414_210_17301_1_1535243252048_3757759_403-58151-0_sp1.1 from training. Duration: 0.963625 2023-10-02 18:07:57,422 WARNING [train.py:1204] (2/4) Exclude cut with ID _36512_210_6987_1_1533033143998_3126069_109-177787-0_sp1.1 from training. Duration: 0.92725 2023-10-02 18:07:57,599 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.attention_skip_rate, batch_count=971046.6666666666, ans=0.0 2023-10-02 18:08:00,139 WARNING [train.py:1204] (2/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_498-40153-0 from training. Duration: 0.63 2023-10-02 18:08:01,484 WARNING [train.py:1204] (2/4) Exclude cut with ID _56447_210_1389_1_1535074245910_3697189_161-334523-0_sp1.1 from training. Duration: 0.936375 2023-10-02 18:08:02,861 WARNING [train.py:1204] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_238-245655-0 from training. Duration: 0.81 2023-10-02 18:08:06,111 WARNING [train.py:1204] (2/4) Exclude cut with ID _64631_210_27767_1_1535349580068_4165160_108-43865-0_sp1.1 from training. Duration: 0.92725 2023-10-02 18:08:06,116 WARNING [train.py:1204] (2/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_243-42116-0_sp1.1 from training. Duration: 0.663625 2023-10-02 18:08:06,144 WARNING [train.py:1204] (2/4) Exclude cut with ID _66868_210_9527_1_1535509287954_4561140_594-132160-0_sp1.1 from training. Duration: 0.92725 2023-10-02 18:08:06,352 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.out_combiner.scale_min, batch_count=971046.6666666666, ans=0.2 2023-10-02 18:08:07,323 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.569e+02 1.865e+02 2.059e+02 2.462e+02 3.335e+02, threshold=4.117e+02, percent-clipped=0.0 2023-10-02 18:08:08,820 WARNING [train.py:1204] (2/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_48-338976-0_sp0.9 from training. Duration: 0.988875 2023-10-02 18:08:08,878 WARNING [train.py:1204] (2/4) Exclude cut with ID _5250_210_5219_1_1527249561806_8692110_137-100595-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 18:08:10,244 WARNING [train.py:1204] (2/4) Exclude cut with ID _49748_210_24116_1_1533986971737_3158910_12-271669-0_sp1.1 from training. Duration: 0.936375 2023-10-02 18:08:10,264 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_37357_210_17024_1_1533089504_2316121_206-80421-0_sp1.1 from training. Duration: 0.936375 2023-10-02 18:08:10,350 WARNING [train.py:1204] (2/4) Exclude cut with ID _7537_210_2084_1_1528502395431_3924359_113-199933-0_sp1.1 from training. Duration: 0.536375 2023-10-02 18:08:12,987 WARNING [train.py:1204] (2/4) Exclude cut with ID _55440_210_12651_1_1534496713364_2980520_31-59289-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 18:08:14,302 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1083-220122-0_sp0.9 from training. Duration: 0.5666875 2023-10-02 18:08:16,418 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_426-313557-0_sp1.1 from training. Duration: 0.4818125 2023-10-02 18:08:17,771 WARNING [train.py:1204] (2/4) Exclude cut with ID _35799_210_5854_1_1533263487887_3529071_324-94843-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 18:08:20,412 WARNING [train.py:1204] (2/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_264-108685-0_sp1.1 from training. Duration: 0.5 2023-10-02 18:08:20,500 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9012W0006-501141 from training. Duration: 0.896 2023-10-02 18:08:23,144 WARNING [train.py:1204] (2/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_890-5117-0_sp0.9 from training. Duration: 0.8666875 2023-10-02 18:08:24,640 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9023W0008-506301 from training. Duration: 0.896 2023-10-02 18:08:25,918 INFO [train.py:1046] (2/4) Epoch 28, batch 2250, loss[loss=0.1696, simple_loss=0.2577, pruned_loss=0.04077, over 24546.00 frames. ], tot_loss[loss=0.1663, simple_loss=0.243, pruned_loss=0.04477, over 4699072.52 frames. ], batch size: 71, lr: 3.66e-03, grad_scale: 8.0 2023-10-02 18:08:25,996 WARNING [train.py:1204] (2/4) Exclude cut with ID _13916_210_9994_1_1530788341126_3763149_60-191556-0_sp0.9 from training. Duration: 0.67775 2023-10-02 18:08:26,039 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9029W0331-509721 from training. Duration: 0.896 2023-10-02 18:08:26,751 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.2.encoder.layers.2.conv_module1.whiten, num_groups=1, num_channels=384, metric=5.79 vs. limit=15.0 2023-10-02 18:08:27,400 WARNING [train.py:1204] (2/4) Exclude cut with ID _35823_210_6917_1_1533187881914_7297830_567-108596-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 18:08:27,965 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.5.encoder.layers.1.feed_forward1.out_whiten, num_groups=1, num_channels=256, metric=7.56 vs. limit=15.0 2023-10-02 18:08:28,854 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7543_210_6753_1_1528544010_1110402_25-183860-0_sp1.1 from training. Duration: 0.42725 2023-10-02 18:08:28,961 WARNING [train.py:1204] (2/4) Exclude cut with ID _47278_210_12041_1_1533902418349_3576110_495-260035-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 18:08:30,422 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9051W0015-520281 from training. Duration: 0.9045625 2023-10-02 18:08:31,733 WARNING [train.py:1204] (2/4) Exclude cut with ID _14459_210_2228_1_1531443641946_7516700_707-66193-0_sp1.1 from training. Duration: 0.97275 2023-10-02 18:08:33,202 WARNING [train.py:1204] (2/4) Exclude cut with ID _14087_210_2253_1_1530529224512_3014029_219-224445-0_sp1.1 from training. Duration: 0.763625 2023-10-02 18:08:37,150 WARNING [train.py:1204] (2/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_345-167986-0_sp0.9 from training. Duration: 0.9333125 2023-10-02 18:08:37,266 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_34815_210_6388_1_1533381320_3571903_279-305565-0_sp1.1 from training. Duration: 0.836375 2023-10-02 18:08:37,449 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.conv_module2.balancer2.min_positive, batch_count=971180.0, ans=0.05 2023-10-02 18:08:40,960 WARNING [train.py:1204] (2/4) Exclude cut with ID _21987_210_5854_1_1531821500482_3637038_317-179212-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 18:08:42,948 WARNING [train.py:1204] (2/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_395-37222-0_sp1.1 from training. Duration: 0.6909375 2023-10-02 18:08:44,277 WARNING [train.py:1204] (2/4) Exclude cut with ID _8064_210_5399_1_1528372299291_4282973_168-50337-0_sp1.1 from training. Duration: 0.763625 2023-10-02 18:08:45,692 WARNING [train.py:1204] (2/4) Exclude cut with ID _60113_210_16271_1_1535164212481_4386079_559-167191-0 from training. Duration: 0.93 2023-10-02 18:08:46,984 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_47228_210_4983_1_1533780021_3672027_344-179642-0_sp1.1 from training. Duration: 0.92725 2023-10-02 18:08:47,009 WARNING [train.py:1204] (2/4) Exclude cut with ID _22961_210_8254_1_1531976366672_4278180_317-199546-0_sp0.9 from training. Duration: 0.9555625 2023-10-02 18:08:48,548 WARNING [train.py:1204] (2/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_141-89727-0 from training. Duration: 0.71 2023-10-02 18:08:49,829 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_78-116075-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 18:08:49,849 WARNING [train.py:1204] (2/4) Exclude cut with ID _64086_210_7994_1_1535270417019_7435949_52-235756-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 18:08:50,084 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.attention_skip_rate, batch_count=971246.6666666666, ans=0.0 2023-10-02 18:08:52,438 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7615_210_4211_1_1528110965_4580160_72-235211-0_sp1.1 from training. Duration: 0.6909375 2023-10-02 18:08:55,424 WARNING [train.py:1204] (2/4) Exclude cut with ID _14069_210_5632_1_1530512692033_4803390_287-27950-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 18:08:56,785 WARNING [train.py:1204] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_74-48717-0_sp0.9 from training. Duration: 0.6444375 2023-10-02 18:08:56,964 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.ff3_skip_rate, batch_count=971313.3333333334, ans=0.0 2023-10-02 18:08:58,111 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7544_210_6753_1_1528591327_1471426_172-348777-0_sp1.1 from training. Duration: 0.6 2023-10-02 18:08:59,464 WARNING [train.py:1204] (2/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_220-191080-0 from training. Duration: 0.98 2023-10-02 18:09:00,879 WARNING [train.py:1204] (2/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_1081-137641-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 18:09:02,470 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.conv_module1.balancer2.prob, batch_count=971313.3333333334, ans=0.125 2023-10-02 18:09:03,552 WARNING [train.py:1204] (2/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_278-235459-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 18:09:08,993 WARNING [train.py:1204] (2/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_1181-77470-0_sp1.1 from training. Duration: 0.936375 2023-10-02 18:09:09,269 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.conv_module1.balancer2.prob, batch_count=971380.0, ans=0.125 2023-10-02 18:09:11,015 WARNING [train.py:1204] (2/4) Exclude cut with ID _60860_210_8254_1_1534896015875_3557169_198-5531-0_sp1.1 from training. Duration: 0.936375 2023-10-02 18:09:12,376 WARNING [train.py:1204] (2/4) Exclude cut with ID _12369_210_4381_1_1530356078581_4388520_17-289781-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 18:09:12,389 WARNING [train.py:1204] (2/4) Exclude cut with ID _50310_210_15488_1_1534068043345_3641080_146-199752-0_sp1.1 from training. Duration: 0.92725 2023-10-02 18:09:15,662 WARNING [train.py:1204] (2/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_969-213354-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 18:09:17,569 WARNING [train.py:1204] (2/4) Exclude cut with ID _70519_210_4163_1_1535961595994_7458230_66-344156-0_sp1.1 from training. Duration: 0.963625 2023-10-02 18:09:19,145 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.conv_module1.balancer2.prob, batch_count=971380.0, ans=0.125 2023-10-02 18:09:22,849 WARNING [train.py:1204] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_380-204756-0_sp1.1 from training. Duration: 0.7818125 2023-10-02 18:09:24,350 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_128-202104-0_sp0.9 from training. Duration: 0.7 2023-10-02 18:09:27,306 WARNING [train.py:1204] (2/4) Exclude cut with ID _4697_210_2069_1_1526815801953_3823098_51-1049-0_sp0.9 from training. Duration: 0.8333125 2023-10-02 18:09:28,731 WARNING [train.py:1204] (2/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_86-66192-0_sp1.1 from training. Duration: 0.5 2023-10-02 18:09:28,786 WARNING [train.py:1204] (2/4) Exclude cut with ID _14069_210_5632_1_1530512692033_4803390_95-1896-0_sp1.1 from training. Duration: 0.8909375 2023-10-02 18:09:32,820 WARNING [train.py:1204] (2/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_29-293896-0_sp1.1 from training. Duration: 0.5545625 2023-10-02 18:09:34,329 WARNING [train.py:1204] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_167-137255-0_sp0.9 from training. Duration: 0.711125 2023-10-02 18:09:34,330 WARNING [train.py:1204] (2/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_217-95056-0 from training. Duration: 0.98 2023-10-02 18:09:34,343 WARNING [train.py:1204] (2/4) Exclude cut with ID _47278_210_12041_1_1533902418349_3576110_97-266746-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 18:09:35,675 WARNING [train.py:1204] (2/4) Exclude cut with ID _14224_210_4381_1_1530529062037_4264010_380-343670-0_sp1.1 from training. Duration: 0.8 2023-10-02 18:09:36,617 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.0.layers.0.nonlin_attention.whiten2, num_groups=1, num_channels=192, metric=5.13 vs. limit=15.0 2023-10-02 18:09:38,380 INFO [train.py:1046] (2/4) Epoch 28, batch 2300, loss[loss=0.1483, simple_loss=0.238, pruned_loss=0.02926, over 24613.00 frames. ], tot_loss[loss=0.167, simple_loss=0.2443, pruned_loss=0.04485, over 4707049.47 frames. ], batch size: 68, lr: 3.66e-03, grad_scale: 8.0 2023-10-02 18:09:38,485 WARNING [train.py:1204] (2/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_107-332398-0 from training. Duration: 0.78 2023-10-02 18:09:41,698 WARNING [train.py:1204] (2/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_91-267696-0_sp1.1 from training. Duration: 0.7909375 2023-10-02 18:09:41,727 WARNING [train.py:1204] (2/4) Exclude cut with ID _41298_210_9019_1_1533430371450_4822143_498-186993-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 18:09:47,008 WARNING [train.py:1204] (2/4) Exclude cut with ID _17956_210_4353_1_1531209568125_6979970_54-77610-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 18:09:47,040 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_40047_210_2290_1_1533290519_2374311_257-335162-0_sp1.1 from training. Duration: 0.863625 2023-10-02 18:09:48,367 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9653W0193-675471 from training. Duration: 0.9656875 2023-10-02 18:09:51,228 WARNING [train.py:1204] (2/4) Exclude cut with ID _25248_210_8750_1_1532062825941_5554180_243-240536-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 18:09:51,390 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.0.conv_module1.balancer1.prob, batch_count=971513.3333333334, ans=0.125 2023-10-02 18:09:54,096 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.conv_skip_rate, batch_count=971580.0, ans=0.0 2023-10-02 18:09:56,067 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.2.encoder.layers.2.self_attn2.whiten, num_groups=1, num_channels=384, metric=25.66 vs. limit=22.5 2023-10-02 18:09:58,116 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_32360_210_4198_1_1532689160_3648436_60-51908-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 18:09:58,124 WARNING [train.py:1204] (2/4) Exclude cut with ID _4092_210_2151_1_1526643044449_3135759_227-213972-0_sp0.9 from training. Duration: 0.6 2023-10-02 18:09:58,159 WARNING [train.py:1204] (2/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_392-173500-0_sp1.1 from training. Duration: 0.97275 2023-10-02 18:09:59,531 WARNING [train.py:1204] (2/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_71-329150-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 18:09:59,540 WARNING [train.py:1204] (2/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_866-173440-0 from training. Duration: 0.9 2023-10-02 18:09:59,616 WARNING [train.py:1204] (2/4) Exclude cut with ID _14832_210_8750_1_1530691351539_3171348_1-54315-0_sp0.9 from training. Duration: 0.9666875 2023-10-02 18:10:01,119 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=971580.0, ans=0.1 2023-10-02 18:10:03,694 WARNING [train.py:1204] (2/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_430-116139-0_sp1.1 from training. Duration: 0.77275 2023-10-02 18:10:03,718 WARNING [train.py:1204] (2/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_356-45054-0_sp0.9 from training. Duration: 0.9555625 2023-10-02 18:10:06,518 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_3531_210_3528_1_1526294203_5216456_309-227302-0_sp0.9 from training. Duration: 0.7666875 2023-10-02 18:10:09,258 WARNING [train.py:1204] (2/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_301-330455-0_sp1.1 from training. Duration: 0.72725 2023-10-02 18:10:13,012 WARNING [train.py:1204] (2/4) Exclude cut with ID _23557_210_8750_1_1531890106370_6846255_667-145945-0_sp1.1 from training. Duration: 0.92725 2023-10-02 18:10:13,288 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.bypass_mid.scale_min, batch_count=971646.6666666666, ans=0.2 2023-10-02 18:10:17,656 WARNING [train.py:1204] (2/4) Exclude cut with ID _14860_210_4915_1_1530702020340_7343240_836-328489-0_sp1.1 from training. Duration: 0.8181875 2023-10-02 18:10:17,696 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_37152_210_9697_1_1533106573_3844188_416-333249-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 18:10:19,254 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.attention_skip_rate, batch_count=971646.6666666666, ans=0.0 2023-10-02 18:10:20,375 WARNING [train.py:1204] (2/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_538-32291-0_sp0.9 from training. Duration: 0.92225 2023-10-02 18:10:23,164 WARNING [train.py:1204] (2/4) Exclude cut with ID _32704_210_8954_1_1533017488840_6747370_965-176824-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 18:10:27,322 WARNING [train.py:1204] (2/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_86-19601-0_sp1.1 from training. Duration: 0.77275 2023-10-02 18:10:28,683 WARNING [train.py:1204] (2/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_436-318113-0_sp1.1 from training. Duration: 0.6818125 2023-10-02 18:10:28,727 WARNING [train.py:1204] (2/4) Exclude cut with ID _41817_210_18835_1_1533434477160_7423049_555-293448-0_sp0.9 from training. Duration: 0.888875 2023-10-02 18:10:28,740 WARNING [train.py:1204] (2/4) Exclude cut with ID _4697_210_2069_1_1526815801953_3823098_68-219807-0 from training. Duration: 0.47 2023-10-02 18:10:28,861 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.0.conv_module1.balancer1.prob, batch_count=971713.3333333334, ans=0.125 2023-10-02 18:10:32,758 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7544_210_6753_1_1528591327_1471426_33-346031-0_sp1.1 from training. Duration: 0.5545625 2023-10-02 18:10:32,760 WARNING [train.py:1204] (2/4) Exclude cut with ID _49748_210_24116_1_1533986971737_3158910_263-52113-0_sp1.1 from training. Duration: 0.97275 2023-10-02 18:10:32,814 WARNING [train.py:1204] (2/4) Exclude cut with ID _64639_210_5816_1_1535367619437_3528000_340-24580-0_sp1.1 from training. Duration: 0.92725 2023-10-02 18:10:32,839 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_47228_210_4983_1_1533780021_3672027_479-114941-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 18:10:32,863 WARNING [train.py:1204] (2/4) Exclude cut with ID _44585_210_18628_1_1533605350387_4518199_506-119659-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 18:10:34,207 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.510e+02 1.868e+02 2.077e+02 2.377e+02 3.384e+02, threshold=4.153e+02, percent-clipped=0.0 2023-10-02 18:10:34,316 WARNING [train.py:1204] (2/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_314-126819-0_sp1.1 from training. Duration: 0.4545625 2023-10-02 18:10:34,319 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_2541_210_1794_1_1525590269_3084765_156-173713-0_sp0.9 from training. Duration: 0.9 2023-10-02 18:10:34,374 WARNING [train.py:1204] (2/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_108-255430-0 from training. Duration: 0.97 2023-10-02 18:10:34,382 WARNING [train.py:1204] (2/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_791-45149-0_sp1.1 from training. Duration: 0.7818125 2023-10-02 18:10:34,387 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_53378_210_17282_1_1534227054_1261425_10-118295-0_sp1.1 from training. Duration: 0.97275 2023-10-02 18:10:35,791 WARNING [train.py:1204] (2/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_148-158509-0 from training. Duration: 0.41 2023-10-02 18:10:39,964 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_34726_210_4525_1_1533019474_5030395_687-291305-0_sp1.1 from training. Duration: 0.936375 2023-10-02 18:10:42,179 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=971780.0, ans=0.1 2023-10-02 18:10:43,840 WARNING [train.py:1204] (2/4) Exclude cut with ID _14860_210_4915_1_1530702020340_7343240_264-324537-0_sp1.1 from training. Duration: 0.963625 2023-10-02 18:10:46,639 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_41251_210_15368_1_1533429767_4820695_591-46386-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 18:10:47,841 WARNING [train.py:1204] (2/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_86-19601-0_sp0.9 from training. Duration: 0.9444375 2023-10-02 18:10:47,872 WARNING [train.py:1204] (2/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_401-255491-0_sp0.9 from training. Duration: 0.77775 2023-10-02 18:10:49,311 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.0.bypass.skip_rate, batch_count=971780.0, ans=0.035 2023-10-02 18:10:49,828 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.3.encoder.layers.3.self_attn1.whiten, num_groups=1, num_channels=512, metric=12.98 vs. limit=22.5 2023-10-02 18:10:50,570 WARNING [train.py:1204] (2/4) Exclude cut with ID _8696_210_1876_1_1528545632440_3345058_209-54069-0_sp1.1 from training. Duration: 0.6090625 2023-10-02 18:10:50,588 WARNING [train.py:1204] (2/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_369-325184-0_sp1.1 from training. Duration: 0.9 2023-10-02 18:10:51,982 INFO [train.py:1046] (2/4) Epoch 28, batch 2350, loss[loss=0.1652, simple_loss=0.2445, pruned_loss=0.04293, over 24305.00 frames. ], tot_loss[loss=0.1677, simple_loss=0.245, pruned_loss=0.04524, over 4702338.03 frames. ], batch size: 61, lr: 3.66e-03, grad_scale: 8.0 2023-10-02 18:10:52,040 WARNING [train.py:1204] (2/4) Exclude cut with ID _14687_210_5854_1_1530703838259_3664889_115-239046-0_sp0.9 from training. Duration: 0.7444375 2023-10-02 18:10:52,090 WARNING [train.py:1204] (2/4) Exclude cut with ID _8064_210_5399_1_1528372299291_4282973_168-50337-0 from training. Duration: 0.84 2023-10-02 18:10:58,841 WARNING [train.py:1204] (2/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_909-64151-0_sp1.1 from training. Duration: 0.8545625 2023-10-02 18:10:58,850 WARNING [train.py:1204] (2/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_597-154640-0 from training. Duration: 0.9 2023-10-02 18:11:03,043 WARNING [train.py:1204] (2/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_126-274694-0 from training. Duration: 0.98 2023-10-02 18:11:05,835 WARNING [train.py:1204] (2/4) Exclude cut with ID _21783_210_11357_1_1531742458730_3762679_344-269786-0_sp1.1 from training. Duration: 0.97275 2023-10-02 18:11:09,923 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_37818_210_4238_1_1533620910_4720083_322-253154-0_sp1.1 from training. Duration: 0.92725 2023-10-02 18:11:09,926 WARNING [train.py:1204] (2/4) Exclude cut with ID _58895_210_8750_1_1535184081320_3649089_221-15823-0_sp1.1 from training. Duration: 0.92725 2023-10-02 18:11:09,935 WARNING [train.py:1204] (2/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_9-21737-0_sp1.1 from training. Duration: 0.8818125 2023-10-02 18:11:09,974 WARNING [train.py:1204] (2/4) Exclude cut with ID _14069_210_5632_1_1530512692033_4803390_522-341588-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 18:11:11,899 WARNING [train.py:1204] (2/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_363-185703-0 from training. Duration: 0.95 2023-10-02 18:11:15,217 WARNING [train.py:1204] (2/4) Exclude cut with ID _41817_210_18835_1_1533434477160_7423049_265-76216-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 18:11:19,572 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.self_attn_weights.pos_emb_skip_rate, batch_count=971980.0, ans=0.0 2023-10-02 18:11:20,755 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.conv_skip_rate, batch_count=971980.0, ans=0.0 2023-10-02 18:11:21,946 WARNING [train.py:1204] (2/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_841-341679-0 from training. Duration: 0.58 2023-10-02 18:11:23,341 WARNING [train.py:1204] (2/4) Exclude cut with ID _55429_210_10626_1_1534467588527_7174143_97-335084-0_sp1.1 from training. Duration: 0.8818125 2023-10-02 18:11:26,264 WARNING [train.py:1204] (2/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_690-126306-0_sp0.9 from training. Duration: 0.8666875 2023-10-02 18:11:26,279 WARNING [train.py:1204] (2/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_102-92751-0_sp1.1 from training. Duration: 0.9 2023-10-02 18:11:29,025 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_3787_210_2040_1_1526477544_5478586_603-120324-0_sp1.1 from training. Duration: 0.736375 2023-10-02 18:11:30,397 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_33348_210_5632_1_1533086912_3324030_85-28583-0 from training. Duration: 0.82 2023-10-02 18:11:30,457 WARNING [train.py:1204] (2/4) Exclude cut with ID _60057_210_10640_1_1534903065793_3333150_362-230377-0_sp1.1 from training. Duration: 0.7909375 2023-10-02 18:11:31,792 WARNING [train.py:1204] (2/4) Exclude cut with ID _60113_210_16271_1_1535164212481_4386079_194-146486-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 18:11:31,804 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_53753_210_7234_1_1534320003_3594148_171-277668-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 18:11:31,830 WARNING [train.py:1204] (2/4) Exclude cut with ID _34423_210_8750_1_1533430798456_3330199_251-332788-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 18:11:37,117 WARNING [train.py:1204] (2/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_663-166973-0_sp0.9 from training. Duration: 0.888875 2023-10-02 18:11:37,241 WARNING [train.py:1204] (2/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_927-43827-0 from training. Duration: 0.74 2023-10-02 18:11:37,287 WARNING [train.py:1204] (2/4) Exclude cut with ID _28323_210_16187_1_1532480395667_4100300_306-74561-0_sp1.1 from training. Duration: 0.8545625 2023-10-02 18:11:41,384 WARNING [train.py:1204] (2/4) Exclude cut with ID _15037_210_2253_1_1530752294104_3267650_139-337989-0_sp1.1 from training. Duration: 0.92725 2023-10-02 18:11:41,402 WARNING [train.py:1204] (2/4) Exclude cut with ID _49039_210_6315_1_1533967537541_3846118_460-263670-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 18:11:43,216 WARNING [train.py:1204] (2/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_186-309161-0 from training. Duration: 0.54 2023-10-02 18:11:45,039 WARNING [train.py:1204] (2/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_87-278686-0_sp1.1 from training. Duration: 0.7 2023-10-02 18:11:48,501 WARNING [train.py:1204] (2/4) Exclude cut with ID _27052_210_5986_1_1532329197187_3585009_314-106499-0 from training. Duration: 0.92 2023-10-02 18:11:48,518 WARNING [train.py:1204] (2/4) Exclude cut with ID _3465_210_3525_1_1526175667060_3574049_412-287526-0_sp0.9 from training. Duration: 0.8 2023-10-02 18:11:52,649 WARNING [train.py:1204] (2/4) Exclude cut with ID _22961_210_8254_1_1531976366672_4278180_317-199546-0 from training. Duration: 0.86 2023-10-02 18:11:54,213 WARNING [train.py:1204] (2/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_381-317791-0 from training. Duration: 0.73 2023-10-02 18:11:55,617 WARNING [train.py:1204] (2/4) Exclude cut with ID _44010_210_4525_1_1533884262964_4013620_560-310411-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 18:11:55,622 WARNING [train.py:1204] (2/4) Exclude cut with ID _5294_210_1747_1_1527249643928_3696179_342-270278-0_sp0.9 from training. Duration: 0.611125 2023-10-02 18:11:55,642 WARNING [train.py:1204] (2/4) Exclude cut with ID ID0473W0002-918951 from training. Duration: 0.924 2023-10-02 18:11:55,673 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1083-220122-0 from training. Duration: 0.51 2023-10-02 18:11:59,641 WARNING [train.py:1204] (2/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_138-154812-0 from training. Duration: 0.78 2023-10-02 18:12:01,072 WARNING [train.py:1204] (2/4) Exclude cut with ID _44982_210_1511_1_1534312820108_3726029_331-19420-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 18:12:03,856 INFO [train.py:1046] (2/4) Epoch 28, batch 2400, loss[loss=0.162, simple_loss=0.2504, pruned_loss=0.03675, over 24665.00 frames. ], tot_loss[loss=0.1666, simple_loss=0.2438, pruned_loss=0.04473, over 4702566.95 frames. ], batch size: 73, lr: 3.66e-03, grad_scale: 16.0 2023-10-02 18:12:03,946 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_2655_210_2087_1_1525688959_3897400_202-147557-0_sp0.9 from training. Duration: 0.9444375 2023-10-02 18:12:08,012 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_23850_210_4238_1_1532232597_1632433_32-185135-0_sp0.9 from training. Duration: 0.9555625 2023-10-02 18:12:08,127 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_14056_210_4163_1_1530604483_7572827_46-93547-0_sp1.1 from training. Duration: 0.863625 2023-10-02 18:12:09,569 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7931_210_1794_1_1528594728_5564059_280-279560-0 from training. Duration: 0.98 2023-10-02 18:12:09,599 WARNING [train.py:1204] (2/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_937-208281-0 from training. Duration: 0.85 2023-10-02 18:12:13,605 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=972180.0, ans=0.1 2023-10-02 18:12:17,937 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_14928_210_5632_1_1530773461_4714000_454-112198-0_sp0.9 from training. Duration: 0.7333125 2023-10-02 18:12:17,938 WARNING [train.py:1204] (2/4) Exclude cut with ID _32704_210_8954_1_1533017488840_6747370_944-153333-0_sp1.1 from training. Duration: 0.963625 2023-10-02 18:12:19,413 WARNING [train.py:1204] (2/4) Exclude cut with ID _14821_210_8750_1_1530680503991_6829359_2-227151-0 from training. Duration: 0.83 2023-10-02 18:12:19,430 WARNING [train.py:1204] (2/4) Exclude cut with ID _28461_210_2253_1_1532771775189_3041299_96-152158-0_sp1.1 from training. Duration: 0.77275 2023-10-02 18:12:20,746 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7837_210_1792_1_1528453445_5841305_654-54195-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 18:12:20,796 WARNING [train.py:1204] (2/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_344-152526-0 from training. Duration: 0.53 2023-10-02 18:12:26,321 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_30167_210_3528_1_1532428012_4772375_480-142230-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 18:12:27,741 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_160-218721-0 from training. Duration: 0.96 2023-10-02 18:12:30,653 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=972246.6666666666, ans=0.1 2023-10-02 18:12:33,157 WARNING [train.py:1204] (2/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_65-88242-0_sp0.9 from training. Duration: 0.62225 2023-10-02 18:12:33,437 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.nonlin_attention.balancer.prob, batch_count=972313.3333333334, ans=0.125 2023-10-02 18:12:37,254 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_34886_210_10633_1_1533203882_1755661_76-156096-0 from training. Duration: 0.87 2023-10-02 18:12:39,175 WARNING [train.py:1204] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_188-38388-0_sp1.1 from training. Duration: 0.92725 2023-10-02 18:12:40,622 WARNING [train.py:1204] (2/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_81-289335-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 18:12:45,394 WARNING [train.py:1204] (2/4) Exclude cut with ID _59493_210_17282_1_1535078419077_3225879_110-8778-0_sp1.1 from training. Duration: 0.936375 2023-10-02 18:12:47,144 WARNING [train.py:1204] (2/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_188-260011-0 from training. Duration: 0.73 2023-10-02 18:12:47,184 WARNING [train.py:1204] (2/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_453-133983-0_sp0.9 from training. Duration: 0.8666875 2023-10-02 18:12:50,047 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.0.conv_module1.balancer1.prob, batch_count=972380.0, ans=0.125 2023-10-02 18:12:55,285 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8054_210_7927_1_1528627721_4984094_553-17379-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 18:12:56,688 WARNING [train.py:1204] (2/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_341-171072-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 18:12:58,309 WARNING [train.py:1204] (2/4) Exclude cut with ID _30800_210_22382_1_1532499347904_4515959_265-257286-0_sp1.1 from training. Duration: 0.963625 2023-10-02 18:12:59,479 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.511e+02 1.861e+02 2.059e+02 2.310e+02 3.271e+02, threshold=4.118e+02, percent-clipped=0.0 2023-10-02 18:12:59,611 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_403-39619-0_sp0.9 from training. Duration: 0.9333125 2023-10-02 18:12:59,615 WARNING [train.py:1204] (2/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_464-33931-0_sp0.9 from training. Duration: 0.6 2023-10-02 18:12:59,640 WARNING [train.py:1204] (2/4) Exclude cut with ID _60804_210_9642_1_1534831199859_7268299_632-143863-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 18:12:59,646 WARNING [train.py:1204] (2/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_431-310997-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 18:13:00,952 WARNING [train.py:1204] (2/4) Exclude cut with ID _31649_210_6228_1_1532584608063_4091078_114-263860-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 18:13:00,965 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6961_210_2087_1_1527937168_6815011_562-165518-0_sp0.9 from training. Duration: 0.8555625 2023-10-02 18:13:03,999 WARNING [train.py:1204] (2/4) Exclude cut with ID _27052_210_5986_1_1532329197187_3585009_338-325870-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 18:13:04,040 WARNING [train.py:1204] (2/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_469-94673-0_sp1.1 from training. Duration: 0.7181875 2023-10-02 18:13:04,076 WARNING [train.py:1204] (2/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_259-34038-0 from training. Duration: 0.74 2023-10-02 18:13:05,476 WARNING [train.py:1204] (2/4) Exclude cut with ID _14922_210_6228_1_1530707435273_3693310_388-129552-0 from training. Duration: 0.96 2023-10-02 18:13:08,639 WARNING [train.py:1204] (2/4) Exclude cut with ID _28394_210_8913_1_1532430049518_3650118_342-251455-0_sp1.1 from training. Duration: 0.936375 2023-10-02 18:13:08,673 WARNING [train.py:1204] (2/4) Exclude cut with ID _53731_210_7994_1_1534553979244_3757214_8-191390-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 18:13:08,710 WARNING [train.py:1204] (2/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_613-232856-0 from training. Duration: 0.54 2023-10-02 18:13:08,770 WARNING [train.py:1204] (2/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_557-349534-0 from training. Duration: 0.91 2023-10-02 18:13:08,778 WARNING [train.py:1204] (2/4) Exclude cut with ID _14224_210_4381_1_1530529062037_4264010_379-184223-0 from training. Duration: 0.85 2023-10-02 18:13:08,781 WARNING [train.py:1204] (2/4) Exclude cut with ID IC0125W0001-61807 from training. Duration: 0.966 2023-10-02 18:13:10,210 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_229-45300-0 from training. Duration: 0.57 2023-10-02 18:13:11,606 WARNING [train.py:1204] (2/4) Exclude cut with ID _30304_210_6319_1_1532476827261_7582060_1069-24743-0_sp1.1 from training. Duration: 0.92725 2023-10-02 18:13:14,297 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_41452_210_7291_1_1533437687_3941281_323-172873-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 18:13:14,307 WARNING [train.py:1204] (2/4) Exclude cut with ID _37086_210_9038_1_1532999540891_7021250_802-226709-0_sp1.1 from training. Duration: 0.97275 2023-10-02 18:13:15,502 WARNING [train.py:1204] (2/4) Exclude cut with ID IC0144W0500-71767 from training. Duration: 0.931875 2023-10-02 18:13:17,410 WARNING [train.py:1204] (2/4) Exclude cut with ID _39846_210_12651_1_1533605310265_2115212_233-129699-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 18:13:17,466 WARNING [train.py:1204] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_574-45541-0_sp0.9 from training. Duration: 0.72225 2023-10-02 18:13:18,805 INFO [train.py:1046] (2/4) Epoch 28, batch 2450, loss[loss=0.1661, simple_loss=0.2327, pruned_loss=0.04976, over 23782.00 frames. ], tot_loss[loss=0.1659, simple_loss=0.2428, pruned_loss=0.04445, over 4708958.91 frames. ], batch size: 212, lr: 3.66e-03, grad_scale: 16.0 2023-10-02 18:13:21,590 WARNING [train.py:1204] (2/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_372-298093-0_sp1.1 from training. Duration: 0.72725 2023-10-02 18:13:21,608 WARNING [train.py:1204] (2/4) Exclude cut with ID _62574_210_2655_1_1535180407058_3933249_34-26343-0_sp1.1 from training. Duration: 0.97275 2023-10-02 18:13:27,101 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_512-256238-0_sp1.1 from training. Duration: 0.963625 2023-10-02 18:13:27,111 WARNING [train.py:1204] (2/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_196-55502-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 18:13:27,215 WARNING [train.py:1204] (2/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_195-167011-0 from training. Duration: 0.75 2023-10-02 18:13:28,738 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.ff3_skip_rate, batch_count=972513.3333333334, ans=0.0 2023-10-02 18:13:34,167 WARNING [train.py:1204] (2/4) Exclude cut with ID _13678_210_7497_1_1530530069226_3463170_345-230416-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 18:13:34,178 WARNING [train.py:1204] (2/4) Exclude cut with ID _51078_210_9070_1_1534161557424_3805250_225-327809-0_sp1.1 from training. Duration: 0.963625 2023-10-02 18:13:37,052 WARNING [train.py:1204] (2/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_18-189198-0_sp1.1 from training. Duration: 0.8181875 2023-10-02 18:13:37,071 WARNING [train.py:1204] (2/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_329-82235-0_sp1.1 from training. Duration: 0.7454375 2023-10-02 18:13:37,079 WARNING [train.py:1204] (2/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_726-270780-0_sp0.9 from training. Duration: 0.9555625 2023-10-02 18:13:38,329 WARNING [train.py:1204] (2/4) Exclude cut with ID _66873_210_5517_1_1535454136506_4293370_654-52713-0 from training. Duration: 0.77 2023-10-02 18:13:38,632 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.balancer_na.min_abs, batch_count=972580.0, ans=0.02 2023-10-02 18:13:42,966 WARNING [train.py:1204] (2/4) Exclude cut with ID _20455_210_5219_1_1531565996775_8098100_191-16006-0_sp1.1 from training. Duration: 0.963625 2023-10-02 18:13:44,865 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_3171_210_2087_1_1526109084_2330007_66-260583-0_sp1.1 from training. Duration: 0.6545625 2023-10-02 18:13:46,168 WARNING [train.py:1204] (2/4) Exclude cut with ID _38670_210_15379_1_1533448810659_4391059_63-129006-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 18:13:49,294 WARNING [train.py:1204] (2/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_393-57888-0_sp0.9 from training. Duration: 0.711125 2023-10-02 18:13:50,627 WARNING [train.py:1204] (2/4) Exclude cut with ID _6275_210_2084_1_1528110062213_7669049_857-298942-0_sp0.9 from training. Duration: 0.9444375 2023-10-02 18:13:50,727 WARNING [train.py:1204] (2/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_239-22076-0_sp0.9 from training. Duration: 0.9444375 2023-10-02 18:13:50,756 WARNING [train.py:1204] (2/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_424-227947-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 18:13:54,089 WARNING [train.py:1204] (2/4) Exclude cut with ID _14069_210_5632_1_1530512692033_4803390_95-1896-0 from training. Duration: 0.98 2023-10-02 18:13:54,286 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.self_attn_weights.pos_emb_skip_rate, batch_count=972646.6666666666, ans=0.0 2023-10-02 18:13:55,451 WARNING [train.py:1204] (2/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_96-244299-0_sp0.9 from training. Duration: 0.97775 2023-10-02 18:14:02,362 WARNING [train.py:1204] (2/4) Exclude cut with ID _46683_210_9642_1_1533730913492_4591116_562-319825-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 18:14:05,016 WARNING [train.py:1204] (2/4) Exclude cut with ID _64639_210_5816_1_1535367619437_3528000_284-173474-0_sp1.1 from training. Duration: 0.963625 2023-10-02 18:14:05,037 WARNING [train.py:1204] (2/4) Exclude cut with ID _29529_210_7234_1_1532393961455_3977460_497-100133-0_sp1.1 from training. Duration: 0.936375 2023-10-02 18:14:05,079 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_12868_210_6753_1_1530701932_530275_18-76322-0_sp0.9 from training. Duration: 0.9666875 2023-10-02 18:14:05,110 WARNING [train.py:1204] (2/4) Exclude cut with ID _50093_210_4220_1_1534417141207_3432299_116-303447-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 18:14:06,503 WARNING [train.py:1204] (2/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_44-79427-0_sp1.1 from training. Duration: 0.8909375 2023-10-02 18:14:07,845 WARNING [train.py:1204] (2/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_887-249555-0 from training. Duration: 0.77 2023-10-02 18:14:10,744 WARNING [train.py:1204] (2/4) Exclude cut with ID _28461_210_2253_1_1532771775189_3041299_96-152158-0_sp0.9 from training. Duration: 0.9444375 2023-10-02 18:14:10,790 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_14058_210_4163_1_1530690953_6327180_49-210687-0_sp1.1 from training. Duration: 0.8545625 2023-10-02 18:14:13,468 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.2.encoder.layers.2.whiten, num_groups=1, num_channels=384, metric=3.61 vs. limit=12.0 2023-10-02 18:14:14,067 WARNING [train.py:1204] (2/4) Exclude cut with ID _40399_210_4353_1_1533305658194_3279406_351-127903-0_sp1.1 from training. Duration: 0.97275 2023-10-02 18:14:14,072 WARNING [train.py:1204] (2/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_184-206939-0_sp1.1 from training. Duration: 0.936375 2023-10-02 18:14:18,119 WARNING [train.py:1204] (2/4) Exclude cut with ID _14100_210_8341_1_1530529320613_3678069_258-224599-0_sp1.1 from training. Duration: 0.7 2023-10-02 18:14:18,140 WARNING [train.py:1204] (2/4) Exclude cut with ID _6493_210_1747_1_1528459210424_4012470_324-323854-0 from training. Duration: 0.92 2023-10-02 18:14:19,253 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_12868_210_6753_1_1530702479_3610564_151-325626-0_sp1.1 from training. Duration: 0.8818125 2023-10-02 18:14:19,325 WARNING [train.py:1204] (2/4) Exclude cut with ID _14528_210_8254_1_1530669700514_4306939_106-43145-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 18:14:19,339 WARNING [train.py:1204] (2/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_686-54383-0 from training. Duration: 0.87 2023-10-02 18:14:20,638 WARNING [train.py:1204] (2/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_801-102362-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 18:14:21,947 WARNING [train.py:1204] (2/4) Exclude cut with ID _48677_210_5663_1_1533902405919_3892989_408-40740-0_sp0.9 from training. Duration: 0.92225 2023-10-02 18:14:25,438 WARNING [train.py:1204] (2/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_448-212135-0_sp1.1 from training. Duration: 0.82725 2023-10-02 18:14:28,076 WARNING [train.py:1204] (2/4) Exclude cut with ID _59170_210_6721_1_1534983633960_4203335_320-124047-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 18:14:29,400 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_4692_210_3528_1_1526899755_4323254_107-44401-0_sp1.1 from training. Duration: 0.7818125 2023-10-02 18:14:31,032 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.bypass_mid.scale_min, batch_count=972846.6666666666, ans=0.2 2023-10-02 18:14:32,116 INFO [train.py:1046] (2/4) Epoch 28, batch 2500, loss[loss=0.1834, simple_loss=0.2483, pruned_loss=0.05922, over 23381.00 frames. ], tot_loss[loss=0.1652, simple_loss=0.2422, pruned_loss=0.04411, over 4710000.85 frames. ], batch size: 120, lr: 3.66e-03, grad_scale: 8.0 2023-10-02 18:14:32,186 WARNING [train.py:1204] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_731-2428-0 from training. Duration: 0.83 2023-10-02 18:14:32,277 WARNING [train.py:1204] (2/4) Exclude cut with ID _29857_210_20034_1_1532395886753_3798160_335-163562-0_sp1.1 from training. Duration: 0.77275 2023-10-02 18:14:34,619 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.0.layers.1.conv_module1.whiten, num_groups=1, num_channels=192, metric=7.10 vs. limit=15.0 2023-10-02 18:14:37,773 WARNING [train.py:1204] (2/4) Exclude cut with ID _14832_210_8750_1_1530691351539_3171348_171-275004-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 18:14:48,516 WARNING [train.py:1204] (2/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_105-328174-0_sp0.9 from training. Duration: 0.8444375 2023-10-02 18:14:48,547 WARNING [train.py:1204] (2/4) Exclude cut with ID _68965_210_1883_1_1535698962272_3497490_164-118421-0_sp1.1 from training. Duration: 0.936375 2023-10-02 18:14:49,938 WARNING [train.py:1204] (2/4) Exclude cut with ID _19896_210_1883_1_1531709022662_6649569_380-24170-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 18:14:49,943 WARNING [train.py:1204] (2/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_505-205270-0_sp1.1 from training. Duration: 0.3454375 2023-10-02 18:14:57,095 WARNING [train.py:1204] (2/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_427-208901-0_sp1.1 from training. Duration: 0.6181875 2023-10-02 18:14:57,146 WARNING [train.py:1204] (2/4) Exclude cut with ID _23818_210_6228_1_1531917063884_3676920_85-326916-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 18:14:57,226 WARNING [train.py:1204] (2/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_101-293367-0_sp1.1 from training. Duration: 0.57275 2023-10-02 18:14:57,234 WARNING [train.py:1204] (2/4) Exclude cut with ID _5409_210_1388_1_1527292850189_5865191_178-127970-0_sp1.1 from training. Duration: 0.5181875 2023-10-02 18:14:58,537 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_403-39619-0 from training. Duration: 0.84 2023-10-02 18:14:59,915 WARNING [train.py:1204] (2/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_427-87841-0_sp1.1 from training. Duration: 0.963625 2023-10-02 18:15:01,227 WARNING [train.py:1204] (2/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_286-68181-0_sp1.1 from training. Duration: 0.97275 2023-10-02 18:15:01,268 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6673_210_3273_1_1527767790_3173240_327-306185-0 from training. Duration: 0.83 2023-10-02 18:15:01,296 WARNING [train.py:1204] (2/4) Exclude cut with ID _3675_210_4557_1_1526639934408_4182089_106-203713-0_sp1.1 from training. Duration: 0.963625 2023-10-02 18:15:01,984 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.3.encoder.layers.0.whiten, num_groups=1, num_channels=512, metric=5.33 vs. limit=12.0 2023-10-02 18:15:02,634 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_53071_210_16554_1_1534493434_1276796_130-36076-0 from training. Duration: 0.95 2023-10-02 18:15:02,662 WARNING [train.py:1204] (2/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_586-93054-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 18:15:06,770 WARNING [train.py:1204] (2/4) Exclude cut with ID _23825_210_13449_1_1531915313526_3497150_262-10571-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 18:15:08,269 WARNING [train.py:1204] (2/4) Exclude cut with ID _28783_210_7943_1_1532671164808_7155479_432-67885-0_sp1.1 from training. Duration: 0.97275 2023-10-02 18:15:08,446 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.0.feed_forward2.hidden_balancer.prob, batch_count=972980.0, ans=0.125 2023-10-02 18:15:08,576 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.conv_module2.balancer1.prob, batch_count=972980.0, ans=0.125 2023-10-02 18:15:09,757 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6287_210_3273_1_1527509506_2653716_182-199887-0_sp0.9 from training. Duration: 0.5666875 2023-10-02 18:15:11,132 WARNING [train.py:1204] (2/4) Exclude cut with ID _3231_210_2106_1_1526205864934_6784220_171-256004-0 from training. Duration: 0.86 2023-10-02 18:15:12,478 WARNING [train.py:1204] (2/4) Exclude cut with ID _69051_210_1531_1_1535885566901_3829538_332-92090-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 18:15:14,381 WARNING [train.py:1204] (2/4) Exclude cut with ID _3675_210_4557_1_1526639934408_4182089_104-324125-0_sp1.1 from training. Duration: 0.963625 2023-10-02 18:15:17,927 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_41764_210_2521_1_1533686110_4089313_501-2119-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 18:15:20,657 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_14056_210_4163_1_1530604483_7572827_56-100988-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 18:15:23,399 WARNING [train.py:1204] (2/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_398-239819-0_sp1.1 from training. Duration: 0.9 2023-10-02 18:15:29,178 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.531e+02 1.840e+02 2.003e+02 2.187e+02 4.032e+02, threshold=4.007e+02, percent-clipped=0.0 2023-10-02 18:15:30,648 WARNING [train.py:1204] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_74-48717-0_sp1.1 from training. Duration: 0.52725 2023-10-02 18:15:33,399 WARNING [train.py:1204] (2/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_177-49092-0 from training. Duration: 0.91 2023-10-02 18:15:33,422 WARNING [train.py:1204] (2/4) Exclude cut with ID _62337_210_12392_1_1535009273105_3412229_44-176353-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 18:15:33,431 WARNING [train.py:1204] (2/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_110-130961-0_sp1.1 from training. Duration: 0.42725 2023-10-02 18:15:34,937 WARNING [train.py:1204] (2/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_37-189262-0_sp1.1 from training. Duration: 0.8909375 2023-10-02 18:15:34,937 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_3172_210_2087_1_1526292929_3987865_197-97762-0_sp0.9 from training. Duration: 0.8333125 2023-10-02 18:15:36,394 WARNING [train.py:1204] (2/4) Exclude cut with ID IC0677W0065-334897 from training. Duration: 0.896 2023-10-02 18:15:36,394 WARNING [train.py:1204] (2/4) Exclude cut with ID IC0677W0080-334912 from training. Duration: 0.768 2023-10-02 18:15:36,406 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_120-41668-0 from training. Duration: 0.7 2023-10-02 18:15:39,127 WARNING [train.py:1204] (2/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_460-205287-0_sp1.1 from training. Duration: 0.963625 2023-10-02 18:15:39,409 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.feed_forward1.hidden_balancer.prob, batch_count=973113.3333333334, ans=0.125 2023-10-02 18:15:40,549 WARNING [train.py:1204] (2/4) Exclude cut with ID _14632_210_9988_1_1530691279392_3923200_102-204477-0 from training. Duration: 0.97 2023-10-02 18:15:40,554 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_34815_210_6388_1_1533381320_3571903_279-305565-0 from training. Duration: 0.92 2023-10-02 18:15:41,851 WARNING [train.py:1204] (2/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_91-148831-0_sp1.1 from training. Duration: 0.9 2023-10-02 18:15:41,915 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6291_210_3273_1_1527683649_1078498_82-221196-0 from training. Duration: 0.76 2023-10-02 18:15:45,235 INFO [train.py:1046] (2/4) Epoch 28, batch 2550, loss[loss=0.1728, simple_loss=0.2312, pruned_loss=0.05723, over 19099.00 frames. ], tot_loss[loss=0.1652, simple_loss=0.2423, pruned_loss=0.04406, over 4706810.90 frames. ], batch size: 388, lr: 3.65e-03, grad_scale: 8.0 2023-10-02 18:15:45,983 WARNING [train.py:1204] (2/4) Exclude cut with ID _38020_210_5986_1_1533112252259_7006089_493-102745-0 from training. Duration: 0.98 2023-10-02 18:15:49,135 WARNING [train.py:1204] (2/4) Exclude cut with ID _61133_210_9969_1_1535196777227_3696409_40-93442-0_sp1.1 from training. Duration: 0.92725 2023-10-02 18:15:51,797 WARNING [train.py:1204] (2/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_45-258852-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 18:15:51,831 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_53071_210_16554_1_1534493434_1276796_130-36076-0_sp1.1 from training. Duration: 0.863625 2023-10-02 18:15:54,464 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8694_210_6400_1_1529065754_3260756_332-13553-0_sp1.1 from training. Duration: 0.92725 2023-10-02 18:15:54,545 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_405-339986-0 from training. Duration: 0.51 2023-10-02 18:15:55,884 WARNING [train.py:1204] (2/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_927-43827-0_sp1.1 from training. Duration: 0.67275 2023-10-02 18:15:56,124 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.balancer_ff2.min_abs, batch_count=973180.0, ans=0.1 2023-10-02 18:15:56,669 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.2.encoder.layers.1.nonlin_attention.whiten1, num_groups=1, num_channels=288, metric=5.22 vs. limit=10.0 2023-10-02 18:16:00,551 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_397-147196-0 from training. Duration: 0.8 2023-10-02 18:16:00,662 WARNING [train.py:1204] (2/4) Exclude cut with ID 3557-8342-0013-54691-0_sp1.1 from training. Duration: 0.836375 2023-10-02 18:16:03,456 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_47438_210_5399_1_1533797471_4513356_626-127722-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 18:16:06,081 WARNING [train.py:1204] (2/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_256-323883-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 18:16:06,086 WARNING [train.py:1204] (2/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_839-173017-0_sp0.9 from training. Duration: 0.4555625 2023-10-02 18:16:06,111 WARNING [train.py:1204] (2/4) Exclude cut with ID _5289_210_2084_1_1527678202736_3779299_392-261988-0_sp1.1 from training. Duration: 0.5909375 2023-10-02 18:16:06,151 WARNING [train.py:1204] (2/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_119-224384-0_sp1.1 from training. Duration: 0.936375 2023-10-02 18:16:06,186 WARNING [train.py:1204] (2/4) Exclude cut with ID _57523_210_1856_1_1534766294545_3732989_262-267933-0_sp1.1 from training. Duration: 0.963625 2023-10-02 18:16:08,905 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_35361_210_4238_1_1533262810_7793476_675-87012-0_sp0.9 from training. Duration: 0.988875 2023-10-02 18:16:08,929 WARNING [train.py:1204] (2/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_17-90541-0 from training. Duration: 0.94 2023-10-02 18:16:08,965 WARNING [train.py:1204] (2/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_535-14410-0_sp1.1 from training. Duration: 0.42725 2023-10-02 18:16:08,971 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_24673_210_6400_1_1531968633_778659_94-154511-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 18:16:10,259 WARNING [train.py:1204] (2/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_212-10501-0 from training. Duration: 0.94 2023-10-02 18:16:10,536 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.bypass.skip_rate, batch_count=973246.6666666666, ans=0.07 2023-10-02 18:16:13,276 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.bypass_mid.scale_min, batch_count=973313.3333333334, ans=0.2 2023-10-02 18:16:15,803 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.0.nonlin_attention.whiten1.whitening_limit, batch_count=973313.3333333334, ans=10.0 2023-10-02 18:16:16,802 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.3.encoder.layers.1.conv_module1.whiten, num_groups=1, num_channels=512, metric=3.14 vs. limit=15.0 2023-10-02 18:16:23,875 WARNING [train.py:1204] (2/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_383-285196-0_sp1.1 from training. Duration: 0.8818125 2023-10-02 18:16:28,080 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.attention_skip_rate, batch_count=973380.0, ans=0.0 2023-10-02 18:16:29,231 WARNING [train.py:1204] (2/4) Exclude cut with ID _45560_210_21982_1_1533812603951_3584999_238-47020-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 18:16:29,247 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_21500_210_2054_1_1532149310_2716758_278-18053-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 18:16:29,253 WARNING [train.py:1204] (2/4) Exclude cut with ID _56235_210_10640_1_1534557335235_4161099_453-9626-0_sp1.1 from training. Duration: 0.936375 2023-10-02 18:16:30,671 WARNING [train.py:1204] (2/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_153-97441-0_sp1.1 from training. Duration: 0.5090625 2023-10-02 18:16:32,323 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.self_attn_weights.pos_emb_skip_rate, batch_count=973380.0, ans=0.0 2023-10-02 18:16:37,924 WARNING [train.py:1204] (2/4) Exclude cut with ID _67725_210_27767_1_1535608756671_5105359_431-120895-0_sp1.1 from training. Duration: 0.963625 2023-10-02 18:16:40,647 WARNING [train.py:1204] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_574-45541-0_sp1.1 from training. Duration: 0.5909375 2023-10-02 18:16:40,664 WARNING [train.py:1204] (2/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_162-103620-0_sp1.1 from training. Duration: 0.8454375 2023-10-02 18:16:40,684 WARNING [train.py:1204] (2/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_734-129761-0_sp1.1 from training. Duration: 0.7454375 2023-10-02 18:16:40,729 WARNING [train.py:1204] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_620-276793-0_sp1.1 from training. Duration: 0.536375 2023-10-02 18:16:42,029 WARNING [train.py:1204] (2/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_153-97441-0_sp0.9 from training. Duration: 0.62225 2023-10-02 18:16:44,853 WARNING [train.py:1204] (2/4) Exclude cut with ID _12733_210_8341_1_1530167424556_3646380_523-85621-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 18:16:44,873 WARNING [train.py:1204] (2/4) Exclude cut with ID _64048_210_10626_1_1535198382802_3835179_352-179834-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 18:16:51,210 WARNING [train.py:1204] (2/4) Exclude cut with ID _55162_210_17071_1_1534408248583_3753480_43-242364-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 18:16:51,232 WARNING [train.py:1204] (2/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_661-264857-0 from training. Duration: 0.93 2023-10-02 18:16:51,245 WARNING [train.py:1204] (2/4) Exclude cut with ID _6274_210_2084_1_1527937583837_7494020_325-237800-0_sp1.1 from training. Duration: 0.97275 2023-10-02 18:16:51,291 WARNING [train.py:1204] (2/4) Exclude cut with ID _55328_210_27767_1_1534580973260_4088170_385-171459-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 18:16:53,018 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_3780_210_2087_1_1526467389_2145572_176-107140-0_sp1.1 from training. Duration: 0.62725 2023-10-02 18:16:53,115 WARNING [train.py:1204] (2/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_375-244976-0_sp1.1 from training. Duration: 0.6818125 2023-10-02 18:16:54,559 WARNING [train.py:1204] (2/4) Exclude cut with ID _35941_210_15379_1_1532995294515_4023089_309-33162-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 18:16:54,695 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.feed_forward2.hidden_balancer.prob, batch_count=973446.6666666666, ans=0.125 2023-10-02 18:16:58,689 INFO [train.py:1046] (2/4) Epoch 28, batch 2600, loss[loss=0.2171, simple_loss=0.2804, pruned_loss=0.07689, over 19365.00 frames. ], tot_loss[loss=0.1652, simple_loss=0.2428, pruned_loss=0.04384, over 4712357.74 frames. ], batch size: 388, lr: 3.65e-03, grad_scale: 8.0 2023-10-02 18:17:00,139 WARNING [train.py:1204] (2/4) Exclude cut with ID _14459_210_2228_1_1531443641946_7516700_1185-22038-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 18:17:02,759 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_1197-69460-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 18:17:04,308 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9012W0022-501157 from training. Duration: 0.896 2023-10-02 18:17:06,360 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder_embed.conv.2.prob, batch_count=973513.3333333334, ans=0.125 2023-10-02 18:17:07,698 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9023W0009-506302 from training. Duration: 0.896 2023-10-02 18:17:07,712 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_2821_210_2715_1_1526108385_3763560_73-296673-0_sp1.1 from training. Duration: 0.7545625 2023-10-02 18:17:07,740 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9025W0113-507442 from training. Duration: 0.896 2023-10-02 18:17:09,218 WARNING [train.py:1204] (2/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_739-2098-0 from training. Duration: 0.92 2023-10-02 18:17:10,490 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9029W0332-509722 from training. Duration: 0.768 2023-10-02 18:17:11,919 WARNING [train.py:1204] (2/4) Exclude cut with ID _16908_210_5724_1_1531119157248_3772700_68-101953-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 18:17:11,935 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9042W0011-515617 from training. Duration: 0.768 2023-10-02 18:17:12,086 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=973580.0, ans=0.1 2023-10-02 18:17:13,438 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_3019_210_2028_1_1526044245_3264865_191-72235-0 from training. Duration: 0.97 2023-10-02 18:17:14,784 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9052W0006-520792 from training. Duration: 0.896 2023-10-02 18:17:16,177 WARNING [train.py:1204] (2/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_245-174553-0_sp1.1 from training. Duration: 0.8 2023-10-02 18:17:16,293 WARNING [train.py:1204] (2/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_202-67017-0 from training. Duration: 0.82 2023-10-02 18:17:17,621 WARNING [train.py:1204] (2/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_304-59227-0 from training. Duration: 0.77 2023-10-02 18:17:19,614 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_3133_210_2087_1_1526106446_2324690_246-125000-0_sp0.9 from training. Duration: 0.688875 2023-10-02 18:17:19,643 WARNING [train.py:1204] (2/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_243-42116-0 from training. Duration: 0.73 2023-10-02 18:17:22,675 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9087W0121-538492 from training. Duration: 0.896 2023-10-02 18:17:22,688 WARNING [train.py:1204] (2/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_120-76843-0 from training. Duration: 0.55 2023-10-02 18:17:27,578 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.2.encoder.layers.1.feed_forward2.out_whiten, num_groups=1, num_channels=384, metric=3.96 vs. limit=15.0 2023-10-02 18:17:31,022 WARNING [train.py:1204] (2/4) Exclude cut with ID _7852_210_5219_1_1528286471316_8139400_850-304621-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 18:17:31,042 WARNING [train.py:1204] (2/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_154-285415-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 18:17:31,057 WARNING [train.py:1204] (2/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_538-278194-0_sp1.1 from training. Duration: 0.92725 2023-10-02 18:17:31,058 WARNING [train.py:1204] (2/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_119-207162-0 from training. Duration: 0.71 2023-10-02 18:17:33,816 WARNING [train.py:1204] (2/4) Exclude cut with ID _2334_210_2667_1_1525608238880_4705129_459-104997-0_sp1.1 from training. Duration: 0.863625 2023-10-02 18:17:39,864 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9147W0019-567847 from training. Duration: 0.768 2023-10-02 18:17:44,222 WARNING [train.py:1204] (2/4) Exclude cut with ID _69884_210_9771_1_1535846392818_4166169_228-189439-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 18:17:44,253 WARNING [train.py:1204] (2/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_196-77172-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 18:17:45,598 WARNING [train.py:1204] (2/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_625-338546-0 from training. Duration: 0.88 2023-10-02 18:17:45,645 WARNING [train.py:1204] (2/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_193-327630-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 18:17:45,646 WARNING [train.py:1204] (2/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_391-24006-0_sp1.1 from training. Duration: 0.92725 2023-10-02 18:17:46,933 WARNING [train.py:1204] (2/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_197-28164-0 from training. Duration: 0.8 2023-10-02 18:17:47,549 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.3.encoder.layers.3.conv_module2.whiten, num_groups=1, num_channels=512, metric=6.74 vs. limit=15.0 2023-10-02 18:17:50,400 WARNING [train.py:1204] (2/4) Exclude cut with ID _8395_210_1531_1_1528597871523_4411579_431-132470-0_sp1.1 from training. Duration: 0.67275 2023-10-02 18:17:50,423 WARNING [train.py:1204] (2/4) Exclude cut with ID _35350_210_16040_1_1533040191183_4075299_465-248326-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 18:17:53,560 WARNING [train.py:1204] (2/4) Exclude cut with ID _16756_210_4915_1_1531047803763_7104259_101-141922-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 18:17:56,167 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.450e+02 1.832e+02 1.959e+02 2.190e+02 4.032e+02, threshold=3.917e+02, percent-clipped=2.0 2023-10-02 18:17:56,275 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9212W0005-600172 from training. Duration: 0.896 2023-10-02 18:17:56,314 WARNING [train.py:1204] (2/4) Exclude cut with ID _63186_210_2084_1_1535414479705_3908649_378-166820-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 18:17:56,335 WARNING [train.py:1204] (2/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_52-211683-0_sp1.1 from training. Duration: 0.8181875 2023-10-02 18:17:57,973 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.nonlin_attention.balancer.prob, batch_count=973780.0, ans=0.125 2023-10-02 18:18:00,806 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.nonlin_attention.balancer.max_positive, batch_count=973780.0, ans=0.95 2023-10-02 18:18:01,880 WARNING [train.py:1204] (2/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_1147-237729-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 18:18:01,955 WARNING [train.py:1204] (2/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_234-14174-0_sp0.9 from training. Duration: 0.911125 2023-10-02 18:18:01,967 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_454-276429-0 from training. Duration: 0.87 2023-10-02 18:18:02,044 WARNING [train.py:1204] (2/4) Exclude cut with ID _53713_210_24116_1_1534314487762_7416751_755-165313-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 18:18:03,464 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8278_210_2550_1_1528637099_4220413_436-317072-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 18:18:04,804 WARNING [train.py:1204] (2/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_28-324729-0_sp1.1 from training. Duration: 0.8545625 2023-10-02 18:18:08,259 INFO [scaling.py:1118] (2/4) WithLoss: name=encoder.encoders.3.encoder.layers.0.self_attn_weights, loss-sum=0.000e+00 2023-10-02 18:18:11,953 INFO [train.py:1046] (2/4) Epoch 28, batch 2650, loss[loss=0.1854, simple_loss=0.2707, pruned_loss=0.05007, over 24570.00 frames. ], tot_loss[loss=0.1666, simple_loss=0.244, pruned_loss=0.04461, over 4715888.76 frames. ], batch size: 71, lr: 3.65e-03, grad_scale: 8.0 2023-10-02 18:18:12,049 WARNING [train.py:1204] (2/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_326-165900-0 from training. Duration: 0.69 2023-10-02 18:18:13,438 WARNING [train.py:1204] (2/4) Exclude cut with ID _53489_210_6228_1_1534298357169_3835970_96-30246-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 18:18:14,795 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8275_210_4198_1_1528455320_3892571_491-17707-0_sp1.1 from training. Duration: 0.5818125 2023-10-02 18:18:17,690 WARNING [train.py:1204] (2/4) Exclude cut with ID _14348_210_8341_1_1530599434996_3778640_240-232370-0 from training. Duration: 0.84 2023-10-02 18:18:17,707 WARNING [train.py:1204] (2/4) Exclude cut with ID _45012_210_12651_1_1533608435136_5371380_54-112623-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 18:18:19,124 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.balancer2.prob, batch_count=973846.6666666666, ans=0.125 2023-10-02 18:18:20,815 WARNING [train.py:1204] (2/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_328-264776-0_sp1.1 from training. Duration: 0.6090625 2023-10-02 18:18:20,872 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9325W0001-648427 from training. Duration: 0.768 2023-10-02 18:18:21,460 WARNING [train.py:1204] (2/4) Exclude cut with ID _56480_210_4220_1_1534679942079_3075409_49-74717-0_sp1.1 from training. Duration: 0.936375 2023-10-02 18:18:22,909 WARNING [train.py:1204] (2/4) Exclude cut with ID _59500_210_8254_1_1534809610680_3736009_6-241869-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 18:18:24,436 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.conv_module1.balancer2.prob, batch_count=973846.6666666666, ans=0.125 2023-10-02 18:18:26,818 WARNING [train.py:1204] (2/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_172-215932-0_sp0.9 from training. Duration: 0.7333125 2023-10-02 18:18:28,134 WARNING [train.py:1204] (2/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_17-90541-0_sp1.1 from training. Duration: 0.8545625 2023-10-02 18:18:30,928 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_43831_210_2158_1_1533725502_5872791_525-89447-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 18:18:32,335 WARNING [train.py:1204] (2/4) Exclude cut with ID _12302_210_5219_1_1529924446466_8107280_907-182949-0 from training. Duration: 0.92 2023-10-02 18:18:33,621 WARNING [train.py:1204] (2/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_402-102311-0_sp0.9 from training. Duration: 0.7444375 2023-10-02 18:18:33,659 WARNING [train.py:1204] (2/4) Exclude cut with ID _8021_210_7994_1_1528376451358_3653069_179-45206-0_sp0.9 from training. Duration: 0.9555625 2023-10-02 18:18:33,946 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.out_combiner.scale_min, batch_count=973913.3333333334, ans=0.2 2023-10-02 18:18:35,235 WARNING [train.py:1204] (2/4) Exclude cut with ID _40137_210_12210_1_1533290484046_3795180_249-187059-0 from training. Duration: 0.88 2023-10-02 18:18:35,360 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.self_attn_weights.pos_emb_skip_rate, batch_count=973913.3333333334, ans=0.0 2023-10-02 18:18:36,719 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9653W0194-675472 from training. Duration: 0.9658125 2023-10-02 18:18:39,504 WARNING [train.py:1204] (2/4) Exclude cut with ID _51332_210_9988_1_1534244371836_3801270_336-255604-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 18:18:39,810 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.ff2_skip_rate, batch_count=973980.0, ans=0.0 2023-10-02 18:18:41,469 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_20120_210_6753_1_1531643035_5482859_510-96996-0 from training. Duration: 0.87 2023-10-02 18:18:41,481 WARNING [train.py:1204] (2/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_1167-20437-0_sp1.1 from training. Duration: 0.92725 2023-10-02 18:18:42,783 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_3475_210_2028_1_1526193578_3622569_265-276625-0 from training. Duration: 0.76 2023-10-02 18:18:45,631 WARNING [train.py:1204] (2/4) Exclude cut with ID _14860_210_4915_1_1530702020340_7343240_470-172418-0_sp1.1 from training. Duration: 0.97275 2023-10-02 18:18:45,639 WARNING [train.py:1204] (2/4) Exclude cut with ID _5444_210_2151_1_1527418808464_5312210_581-315411-0_sp1.1 from training. Duration: 0.563625 2023-10-02 18:18:46,979 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_34450_210_9547_1_1533099443_4109050_519-342439-0_sp1.1 from training. Duration: 0.97275 2023-10-02 18:18:47,010 WARNING [train.py:1204] (2/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_50-175961-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 18:18:50,403 WARNING [train.py:1204] (2/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_372-6869-0 from training. Duration: 0.44 2023-10-02 18:18:50,409 WARNING [train.py:1204] (2/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_3-237434-0 from training. Duration: 0.76 2023-10-02 18:18:53,601 WARNING [train.py:1204] (2/4) Exclude cut with ID _8772_210_1850_1_1528635595972_3664011_247-336448-0_sp1.1 from training. Duration: 0.67275 2023-10-02 18:18:56,421 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_427-78097-0 from training. Duration: 0.8 2023-10-02 18:18:56,443 WARNING [train.py:1204] (2/4) Exclude cut with ID _69884_210_9771_1_1535846392818_4166169_466-252727-0_sp1.1 from training. Duration: 0.97275 2023-10-02 18:18:56,511 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_15972_210_6400_1_1530877934_3405692_316-35478-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 18:18:57,820 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_809-17156-0_sp0.9 from training. Duration: 0.811125 2023-10-02 18:18:57,865 WARNING [train.py:1204] (2/4) Exclude cut with ID _57572_210_5517_1_1534921219624_3809484_594-263474-0_sp1.1 from training. Duration: 0.936375 2023-10-02 18:18:59,246 WARNING [train.py:1204] (2/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_190-228204-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 18:19:00,631 WARNING [train.py:1204] (2/4) Exclude cut with ID _19514_210_9259_1_1531530071330_5084230_117-21193-0_sp1.1 from training. Duration: 0.936375 2023-10-02 18:19:02,053 WARNING [train.py:1204] (2/4) Exclude cut with ID _69584_210_5219_1_1535785133775_4044450_190-21315-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 18:19:02,130 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_53071_210_16554_1_1534493434_1276796_184-302050-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 18:19:02,172 WARNING [train.py:1204] (2/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_120-76843-0_sp1.1 from training. Duration: 0.5 2023-10-02 18:19:02,369 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.feed_forward1.hidden_balancer.prob, batch_count=974046.6666666666, ans=0.125 2023-10-02 18:19:03,539 WARNING [train.py:1204] (2/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_318-162017-0_sp1.1 from training. Duration: 0.82725 2023-10-02 18:19:03,842 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.conv_module1.balancer1.max_abs, batch_count=974046.6666666666, ans=10.0 2023-10-02 18:19:04,195 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.3.encoder.layers.2.feed_forward2.out_whiten, num_groups=1, num_channels=512, metric=8.67 vs. limit=15.0 2023-10-02 18:19:04,927 WARNING [train.py:1204] (2/4) Exclude cut with ID _46067_210_4220_1_1534125571066_3307450_79-46864-0_sp1.1 from training. Duration: 0.92725 2023-10-02 18:19:04,979 WARNING [train.py:1204] (2/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_358-215558-0_sp1.1 from training. Duration: 0.8454375 2023-10-02 18:19:06,353 WARNING [train.py:1204] (2/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_621-66764-0_sp1.1 from training. Duration: 0.92725 2023-10-02 18:19:07,748 WARNING [train.py:1204] (2/4) Exclude cut with ID _42863_210_9070_1_1533902398788_3696920_338-156851-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 18:19:09,493 WARNING [train.py:1204] (2/4) Exclude cut with ID _5289_210_2084_1_1527678202736_3779299_392-261988-0_sp0.9 from training. Duration: 0.72225 2023-10-02 18:19:13,517 WARNING [train.py:1204] (2/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_409-84682-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 18:19:13,597 WARNING [train.py:1204] (2/4) Exclude cut with ID _9235_210_5219_1_1528891154253_8046120_30-326681-0_sp0.9 from training. Duration: 0.92225 2023-10-02 18:19:13,601 WARNING [train.py:1204] (2/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_389-184695-0_sp1.1 from training. Duration: 0.92725 2023-10-02 18:19:14,943 WARNING [train.py:1204] (2/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_442-320317-0 from training. Duration: 0.82 2023-10-02 18:19:18,233 WARNING [train.py:1204] (2/4) Exclude cut with ID _44778_210_2054_1_1533798127203_7443090_756-29911-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 18:19:20,211 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_401-112036-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 18:19:21,504 WARNING [train.py:1204] (2/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_181-308827-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 18:19:22,905 WARNING [train.py:1204] (2/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_332-258725-0_sp1.1 from training. Duration: 0.963625 2023-10-02 18:19:24,657 WARNING [train.py:1204] (2/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_188-260011-0_sp0.9 from training. Duration: 0.811125 2023-10-02 18:19:24,698 WARNING [train.py:1204] (2/4) Exclude cut with ID _49039_210_6315_1_1533967537541_3846118_99-302239-0_sp1.1 from training. Duration: 0.963625 2023-10-02 18:19:26,034 INFO [train.py:1046] (2/4) Epoch 28, batch 2700, loss[loss=0.1729, simple_loss=0.2673, pruned_loss=0.0393, over 24369.00 frames. ], tot_loss[loss=0.1683, simple_loss=0.2456, pruned_loss=0.04552, over 4691820.45 frames. ], batch size: 74, lr: 3.65e-03, grad_scale: 8.0 2023-10-02 18:19:26,161 WARNING [train.py:1204] (2/4) Exclude cut with ID _4122_210_2084_1_1526713164417_4353280_312-294828-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 18:19:26,167 WARNING [train.py:1204] (2/4) Exclude cut with ID _14922_210_6228_1_1530707435273_3693310_303-228738-0 from training. Duration: 0.84 2023-10-02 18:19:29,067 WARNING [train.py:1204] (2/4) Exclude cut with ID _14349_210_8341_1_1530615710818_3442470_115-285975-0_sp0.9 from training. Duration: 0.9555625 2023-10-02 18:19:31,739 WARNING [train.py:1204] (2/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_125-144174-0_sp1.1 from training. Duration: 0.3545625 2023-10-02 18:19:33,253 WARNING [train.py:1204] (2/4) Exclude cut with ID _53565_210_10635_1_1534337218335_3820259_351-54570-0_sp1.1 from training. Duration: 0.97275 2023-10-02 18:19:33,280 WARNING [train.py:1204] (2/4) Exclude cut with ID _28317_210_12742_1_1532480302381_8510325_410-40065-0_sp1.1 from training. Duration: 0.963625 2023-10-02 18:19:33,301 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_38965_210_4852_1_1533174906_3484735_167-339442-0_sp1.1 from training. Duration: 0.963625 2023-10-02 18:19:33,533 INFO [scaling.py:1118] (2/4) WithLoss: name=encoder.encoders.3.encoder.layers.1.self_attn_weights, loss-sum=0.000e+00 2023-10-02 18:19:36,055 WARNING [train.py:1204] (2/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_265-164486-0_sp1.1 from training. Duration: 0.87275 2023-10-02 18:19:36,065 WARNING [train.py:1204] (2/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_725-182484-0_sp1.1 from training. Duration: 0.92725 2023-10-02 18:19:36,089 WARNING [train.py:1204] (2/4) Exclude cut with ID _28317_210_12742_1_1532480302381_8510325_607-213951-0_sp0.9 from training. Duration: 0.9333125 2023-10-02 18:19:36,102 WARNING [train.py:1204] (2/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_605-133395-0_sp1.1 from training. Duration: 0.6 2023-10-02 18:19:36,121 WARNING [train.py:1204] (2/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_351-294650-0 from training. Duration: 0.63 2023-10-02 18:19:37,505 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_14953_210_2151_1_1530701710_5616165_710-165400-0_sp1.1 from training. Duration: 0.7909375 2023-10-02 18:19:38,899 WARNING [train.py:1204] (2/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_237-58433-0_sp1.1 from training. Duration: 0.67275 2023-10-02 18:19:40,311 WARNING [train.py:1204] (2/4) Exclude cut with ID _5190_210_4557_1_1527244368799_373439_28-197030-0_sp1.1 from training. Duration: 0.6545625 2023-10-02 18:19:40,364 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_743-327651-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 18:19:43,738 WARNING [train.py:1204] (2/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_76-203867-0_sp0.9 from training. Duration: 0.8 2023-10-02 18:19:43,818 WARNING [train.py:1204] (2/4) Exclude cut with ID _14603_210_4220_1_1530615688441_3097319_274-274798-0 from training. Duration: 0.98 2023-10-02 18:19:45,503 WARNING [train.py:1204] (2/4) Exclude cut with ID _14087_210_2253_1_1530529224512_3014029_226-242140-0_sp1.1 from training. Duration: 0.736375 2023-10-02 18:19:49,762 WARNING [train.py:1204] (2/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_352-59958-0_sp1.1 from training. Duration: 0.7818125 2023-10-02 18:19:49,773 WARNING [train.py:1204] (2/4) Exclude cut with ID _39411_210_5986_1_1533538825030_3343649_191-251819-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 18:19:55,724 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7046_210_6753_1_1527895337_6920917_642-106799-0_sp1.1 from training. Duration: 0.836375 2023-10-02 18:19:57,574 WARNING [train.py:1204] (2/4) Exclude cut with ID _22826_210_9994_1_1531908201522_3279169_412-3442-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 18:19:57,607 WARNING [train.py:1204] (2/4) Exclude cut with ID _60057_210_10640_1_1534903065793_3333150_171-181133-0_sp0.9 from training. Duration: 0.97775 2023-10-02 18:19:57,616 WARNING [train.py:1204] (2/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_154-135696-0_sp1.1 from training. Duration: 0.72725 2023-10-02 18:20:00,523 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.0.feed_forward1.out_proj.dropout_p, batch_count=974313.3333333334, ans=0.1 2023-10-02 18:20:01,718 WARNING [train.py:1204] (2/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_148-186805-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 18:20:04,468 WARNING [train.py:1204] (2/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_605-165641-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 18:20:04,474 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_174-277364-0_sp0.9 from training. Duration: 0.788875 2023-10-02 18:20:04,489 WARNING [train.py:1204] (2/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_89-321955-0_sp0.9 from training. Duration: 0.888875 2023-10-02 18:20:08,581 WARNING [train.py:1204] (2/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_438-105429-0_sp1.1 from training. Duration: 0.963625 2023-10-02 18:20:08,585 WARNING [train.py:1204] (2/4) Exclude cut with ID _14970_210_15709_1_1530773543493_1715730_187-20259-0_sp0.9 from training. Duration: 0.988875 2023-10-02 18:20:15,916 WARNING [train.py:1204] (2/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_385-193052-0_sp0.9 from training. Duration: 0.9555625 2023-10-02 18:20:16,618 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7113_210_1849_1_1527942685_3448768_194-311626-0_sp1.1 from training. Duration: 0.92725 2023-10-02 18:20:18,052 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.0.conv_module2.balancer1.max_abs, batch_count=974380.0, ans=10.0 2023-10-02 18:20:20,570 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_3780_210_2087_1_1526467389_2145572_176-107140-0_sp0.9 from training. Duration: 0.7666875 2023-10-02 18:20:20,572 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_36756_210_7234_1_1533026991_966276_123-213457-0_sp1.1 from training. Duration: 0.936375 2023-10-02 18:20:23,617 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.646e+02 1.829e+02 1.997e+02 2.292e+02 3.269e+02, threshold=3.995e+02, percent-clipped=0.0 2023-10-02 18:20:23,793 WARNING [train.py:1204] (2/4) Exclude cut with ID _23557_210_8750_1_1531890106370_6846255_456-244861-0_sp1.1 from training. Duration: 0.963625 2023-10-02 18:20:25,266 WARNING [train.py:1204] (2/4) Exclude cut with ID _37086_210_9038_1_1532999540891_7021250_242-349170-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 18:20:26,602 WARNING [train.py:1204] (2/4) Exclude cut with ID _41298_210_9019_1_1533430371450_4822143_471-281351-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 18:20:28,101 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_53378_210_17282_1_1534221071_5769509_572-126176-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 18:20:30,041 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_1507-263032-0_sp1.1 from training. Duration: 0.963625 2023-10-02 18:20:30,069 WARNING [train.py:1204] (2/4) Exclude cut with ID _13679_210_7497_1_1530534293662_3355098_411-186088-0_sp1.1 from training. Duration: 0.8545625 2023-10-02 18:20:31,512 WARNING [train.py:1204] (2/4) Exclude cut with ID _29345_210_4381_1_1532689168263_3860169_149-257268-0_sp1.1 from training. Duration: 0.736375 2023-10-02 18:20:32,807 WARNING [train.py:1204] (2/4) Exclude cut with ID _47790_210_16187_1_1533862814066_4207519_381-252014-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 18:20:32,819 WARNING [train.py:1204] (2/4) Exclude cut with ID _35799_210_5854_1_1533263487887_3529071_512-2717-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 18:20:35,740 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.self_attn_weights.pos_emb_skip_rate, batch_count=974446.6666666666, ans=0.0 2023-10-02 18:20:36,899 WARNING [train.py:1204] (2/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_505-205270-0_sp0.9 from training. Duration: 0.42225 2023-10-02 18:20:36,968 WARNING [train.py:1204] (2/4) Exclude cut with ID _14998_210_5219_1_1530705582080_7474970_731-20117-0_sp1.1 from training. Duration: 0.936375 2023-10-02 18:20:39,661 INFO [train.py:1046] (2/4) Epoch 28, batch 2750, loss[loss=0.1516, simple_loss=0.237, pruned_loss=0.03309, over 24460.00 frames. ], tot_loss[loss=0.1674, simple_loss=0.2447, pruned_loss=0.04502, over 4703276.70 frames. ], batch size: 63, lr: 3.65e-03, grad_scale: 8.0 2023-10-02 18:20:39,731 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_37351_210_7925_1_1533012931_3812113_265-85780-0_sp1.1 from training. Duration: 0.8 2023-10-02 18:20:39,738 WARNING [train.py:1204] (2/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_313-134930-0 from training. Duration: 0.97 2023-10-02 18:20:41,198 WARNING [train.py:1204] (2/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_374-138971-0 from training. Duration: 0.94 2023-10-02 18:20:41,242 WARNING [train.py:1204] (2/4) Exclude cut with ID _30304_210_6319_1_1532476827261_7582060_1227-287886-0_sp1.1 from training. Duration: 0.936375 2023-10-02 18:20:45,671 WARNING [train.py:1204] (2/4) Exclude cut with ID _59493_210_17282_1_1535078419077_3225879_332-217544-0_sp1.1 from training. Duration: 0.97275 2023-10-02 18:20:45,716 WARNING [train.py:1204] (2/4) Exclude cut with ID _23514_210_1389_1_1531832139785_3031439_361-119640-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 18:20:47,797 WARNING [train.py:1204] (2/4) Exclude cut with ID _69584_210_5219_1_1535785133775_4044450_142-118058-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 18:20:49,068 WARNING [train.py:1204] (2/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_178-177582-0_sp1.1 from training. Duration: 0.67275 2023-10-02 18:20:49,119 WARNING [train.py:1204] (2/4) Exclude cut with ID _25303_210_17519_1_1532050207576_3635970_41-195572-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 18:20:50,729 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.ff3_skip_rate, batch_count=974513.3333333334, ans=0.0 2023-10-02 18:20:51,921 WARNING [train.py:1204] (2/4) Exclude cut with ID _55328_210_27767_1_1534580973260_4088170_329-341322-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 18:20:53,193 WARNING [train.py:1204] (2/4) Exclude cut with ID _5608_210_2106_1_1527415439089_6819768_909-82767-0_sp1.1 from training. Duration: 0.5545625 2023-10-02 18:20:53,232 WARNING [train.py:1204] (2/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_261-297503-0_sp1.1 from training. Duration: 0.9 2023-10-02 18:20:53,238 WARNING [train.py:1204] (2/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_459-291706-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 18:20:53,238 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7762_210_1794_1_1528199508_4057408_422-227034-0 from training. Duration: 0.73 2023-10-02 18:20:53,247 WARNING [train.py:1204] (2/4) Exclude cut with ID _3067_210_2667_1_1526212974640_4503168_250-285937-0_sp0.9 from training. Duration: 0.888875 2023-10-02 18:20:53,266 WARNING [train.py:1204] (2/4) Exclude cut with ID _66873_210_5517_1_1535454136506_4293370_332-253745-0_sp1.1 from training. Duration: 0.936375 2023-10-02 18:20:57,006 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.conv_module1.balancer2.prob, batch_count=974580.0, ans=0.125 2023-10-02 18:20:59,487 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.2.encoder.layers.0.feed_forward1.out_whiten, num_groups=1, num_channels=384, metric=5.33 vs. limit=15.0 2023-10-02 18:21:00,091 WARNING [train.py:1204] (2/4) Exclude cut with ID _13938_210_9988_1_1530518718978_3693960_304-112784-0 from training. Duration: 0.46 2023-10-02 18:21:02,806 WARNING [train.py:1204] (2/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_226-267957-0_sp1.1 from training. Duration: 0.92725 2023-10-02 18:21:02,844 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_41822_210_13270_1_1533955976_4379284_608-254567-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 18:21:02,906 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_124-113562-0_sp1.1 from training. Duration: 0.8545625 2023-10-02 18:21:04,253 WARNING [train.py:1204] (2/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_117-290916-0_sp1.1 from training. Duration: 0.463625 2023-10-02 18:21:05,603 WARNING [train.py:1204] (2/4) Exclude cut with ID _28427_210_4220_1_1532581240967_3061069_102-66533-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 18:21:06,875 WARNING [train.py:1204] (2/4) Exclude cut with ID _55383_210_10635_1_1534468426172_2231669_134-128246-0_sp1.1 from training. Duration: 0.8090625 2023-10-02 18:21:06,932 WARNING [train.py:1204] (2/4) Exclude cut with ID _45871_210_5665_1_1533864614245_3296060_49-308316-0_sp1.1 from training. Duration: 0.97275 2023-10-02 18:21:08,271 WARNING [train.py:1204] (2/4) Exclude cut with ID _57572_210_5517_1_1534921219624_3809484_176-199205-0_sp1.1 from training. Duration: 0.97275 2023-10-02 18:21:08,472 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.self_attn_weights.pos_emb_skip_rate, batch_count=974646.6666666666, ans=0.0 2023-10-02 18:21:09,946 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.bypass.skip_rate, batch_count=974646.6666666666, ans=0.07 2023-10-02 18:21:11,164 WARNING [train.py:1204] (2/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_45-265684-0_sp1.1 from training. Duration: 0.6545625 2023-10-02 18:21:12,522 WARNING [train.py:1204] (2/4) Exclude cut with ID _15150_210_8341_1_1530752411803_7256384_469-239509-0_sp0.9 from training. Duration: 0.7555625 2023-10-02 18:21:12,558 WARNING [train.py:1204] (2/4) Exclude cut with ID _28461_210_2253_1_1532771775189_3041299_136-270644-0_sp1.1 from training. Duration: 0.8454375 2023-10-02 18:21:12,628 WARNING [train.py:1204] (2/4) Exclude cut with ID _69584_210_5219_1_1535785133775_4044450_302-47302-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 18:21:14,304 WARNING [train.py:1204] (2/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_166-128941-0_sp1.1 from training. Duration: 0.4909375 2023-10-02 18:21:17,051 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.ff3_skip_rate, batch_count=974646.6666666666, ans=0.0 2023-10-02 18:21:20,861 WARNING [train.py:1204] (2/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_559-120504-0_sp1.1 from training. Duration: 0.97275 2023-10-02 18:21:22,755 WARNING [train.py:1204] (2/4) Exclude cut with ID _6456_210_1534_1_1527681634212_3181049_345-252877-0_sp1.1 from training. Duration: 0.5818125 2023-10-02 18:21:22,788 WARNING [train.py:1204] (2/4) Exclude cut with ID _28317_210_12742_1_1532480302381_8510325_902-233003-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 18:21:23,600 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.3.encoder.layers.0.nonlin_attention.whiten2, num_groups=1, num_channels=512, metric=6.98 vs. limit=15.0 2023-10-02 18:21:25,620 WARNING [train.py:1204] (2/4) Exclude cut with ID _36538_210_9994_1_1533204020363_3795620_204-261385-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 18:21:25,632 WARNING [train.py:1204] (2/4) Exclude cut with ID _14821_210_8750_1_1530680503991_6829359_189-297451-0_sp1.1 from training. Duration: 0.5 2023-10-02 18:21:26,951 WARNING [train.py:1204] (2/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_1211-226066-0_sp0.9 from training. Duration: 0.8444375 2023-10-02 18:21:28,484 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder_embed.conv.2.prob, batch_count=974713.3333333334, ans=0.125 2023-10-02 18:21:30,101 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.conv_module2.balancer2.prob, batch_count=974713.3333333334, ans=0.125 2023-10-02 18:21:31,834 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6786_210_1388_1_1527839699_4194495_273-149841-0_sp1.1 from training. Duration: 0.836375 2023-10-02 18:21:33,168 WARNING [train.py:1204] (2/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_380-162648-0_sp1.1 from training. Duration: 0.8545625 2023-10-02 18:21:33,169 WARNING [train.py:1204] (2/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_494-199538-0 from training. Duration: 0.75 2023-10-02 18:21:37,396 WARNING [train.py:1204] (2/4) Exclude cut with ID _49097_210_16049_1_1533952780207_4360979_276-87053-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 18:21:38,800 WARNING [train.py:1204] (2/4) Exclude cut with ID _5444_210_2151_1_1527418808464_5312210_726-27165-0 from training. Duration: 0.67 2023-10-02 18:21:41,844 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.bypass.scale_min, batch_count=974780.0, ans=0.2 2023-10-02 18:21:44,921 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_128-202104-0_sp1.1 from training. Duration: 0.57275 2023-10-02 18:21:46,359 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_11349_210_6753_1_1529800861_1101207_26-65602-0_sp1.1 from training. Duration: 0.87275 2023-10-02 18:21:46,380 WARNING [train.py:1204] (2/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_159-328157-0 from training. Duration: 0.52 2023-10-02 18:21:49,648 WARNING [train.py:1204] (2/4) Exclude cut with ID _14069_210_5632_1_1530512692033_4803390_98-21934-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 18:21:51,016 WARNING [train.py:1204] (2/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_352-59958-0_sp0.9 from training. Duration: 0.9555625 2023-10-02 18:21:51,054 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6163_210_6400_1_1527506746_4263068_283-137811-0 from training. Duration: 0.77 2023-10-02 18:21:52,276 WARNING [train.py:1204] (2/4) Exclude cut with ID _21989_210_5854_1_1531899214931_3271360_133-44759-0_sp1.1 from training. Duration: 0.7 2023-10-02 18:21:55,383 INFO [train.py:1046] (2/4) Epoch 28, batch 2800, loss[loss=0.1758, simple_loss=0.2634, pruned_loss=0.04408, over 24071.00 frames. ], tot_loss[loss=0.1667, simple_loss=0.2441, pruned_loss=0.04463, over 4716892.63 frames. ], batch size: 80, lr: 3.65e-03, grad_scale: 16.0 2023-10-02 18:21:55,476 WARNING [train.py:1204] (2/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_21-174514-0_sp0.9 from training. Duration: 0.47775 2023-10-02 18:21:56,824 WARNING [train.py:1204] (2/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_201-78811-0_sp1.1 from training. Duration: 0.963625 2023-10-02 18:21:56,849 WARNING [train.py:1204] (2/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_537-164630-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 18:21:56,919 WARNING [train.py:1204] (2/4) Exclude cut with ID _14087_210_2253_1_1530529224512_3014029_156-257607-0 from training. Duration: 0.83 2023-10-02 18:21:56,940 WARNING [train.py:1204] (2/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_308-154450-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 18:21:57,624 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.3.encoder.layers.0.self_attn_weights.whiten_keys, num_groups=8, num_channels=256, metric=4.87 vs. limit=6.0 2023-10-02 18:21:58,243 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_45399_210_4852_1_1533624725_7579737_214-240574-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 18:21:59,675 WARNING [train.py:1204] (2/4) Exclude cut with ID _29303_210_2575_1_1532392237303_7410649_615-305517-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 18:21:59,717 WARNING [train.py:1204] (2/4) Exclude cut with ID IC0125W0002-61808 from training. Duration: 0.708 2023-10-02 18:21:59,718 WARNING [train.py:1204] (2/4) Exclude cut with ID IC0125W0017-61823 from training. Duration: 0.759 2023-10-02 18:22:03,042 WARNING [train.py:1204] (2/4) Exclude cut with ID _53219_210_4525_1_1534834836008_4064825_4-145291-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 18:22:05,762 WARNING [train.py:1204] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_185-174805-0_sp1.1 from training. Duration: 0.6181875 2023-10-02 18:22:05,777 WARNING [train.py:1204] (2/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_635-193046-0_sp1.1 from training. Duration: 0.92725 2023-10-02 18:22:07,375 WARNING [train.py:1204] (2/4) Exclude cut with ID _46067_210_4220_1_1534125571066_3307450_321-258866-0_sp1.1 from training. Duration: 0.97275 2023-10-02 18:22:10,016 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8558_210_1410_1_1528505225_7613406_131-329524-0 from training. Duration: 0.79 2023-10-02 18:22:10,325 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=974913.3333333334, ans=0.1 2023-10-02 18:22:11,407 WARNING [train.py:1204] (2/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_847-41334-0_sp0.9 from training. Duration: 0.488875 2023-10-02 18:22:11,499 WARNING [train.py:1204] (2/4) Exclude cut with ID _14227_210_4381_1_1530701804286_3940259_125-275642-0 from training. Duration: 0.99 2023-10-02 18:22:14,560 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.0.layers.0.self_attn_weights.whiten_keys, num_groups=4, num_channels=128, metric=2.86 vs. limit=6.0 2023-10-02 18:22:14,836 WARNING [train.py:1204] (2/4) Exclude cut with ID _25248_210_8750_1_1532062825941_5554180_248-177255-0_sp1.1 from training. Duration: 0.963625 2023-10-02 18:22:16,177 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_40047_210_2290_1_1533290519_2374311_195-165693-0_sp1.1 from training. Duration: 0.8090625 2023-10-02 18:22:16,189 WARNING [train.py:1204] (2/4) Exclude cut with ID _23109_210_4381_1_1532150828777_901949_29-258672-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 18:22:19,030 WARNING [train.py:1204] (2/4) Exclude cut with ID _14821_210_8750_1_1530680503991_6829359_2-227151-0_sp1.1 from training. Duration: 0.7545625 2023-10-02 18:22:19,668 WARNING [train.py:1204] (2/4) Exclude cut with ID _49643_210_7989_1_1534640244224_3939031_565-53982-0_sp1.1 from training. Duration: 0.963625 2023-10-02 18:22:19,672 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7544_210_6753_1_1528591327_1471426_33-346031-0_sp0.9 from training. Duration: 0.67775 2023-10-02 18:22:20,998 WARNING [train.py:1204] (2/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_95-61782-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 18:22:28,477 WARNING [train.py:1204] (2/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_686-54383-0_sp0.9 from training. Duration: 0.9666875 2023-10-02 18:22:29,816 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_14056_210_4163_1_1530604483_7572827_505-276823-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 18:22:31,670 WARNING [train.py:1204] (2/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_745-128443-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 18:22:31,717 WARNING [train.py:1204] (2/4) Exclude cut with ID _3844_210_1390_1_1526727803582_2871209_20-251664-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 18:22:32,960 WARNING [train.py:1204] (2/4) Exclude cut with ID _36538_210_9994_1_1533204020363_3795620_487-164628-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 18:22:37,084 WARNING [train.py:1204] (2/4) Exclude cut with ID _5166_210_2084_1_1527073197160_4087993_309-222545-0_sp1.1 from training. Duration: 0.763625 2023-10-02 18:22:37,097 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_3531_210_3528_1_1526294203_5216456_309-227302-0 from training. Duration: 0.69 2023-10-02 18:22:37,159 WARNING [train.py:1204] (2/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_42-192460-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 18:22:38,643 WARNING [train.py:1204] (2/4) Exclude cut with ID _22083_210_8254_1_1531702862439_3678970_49-234546-0_sp1.1 from training. Duration: 0.7545625 2023-10-02 18:22:38,647 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_427-78097-0_sp0.9 from training. Duration: 0.888875 2023-10-02 18:22:42,061 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_43802_210_2290_1_1533516203_8829504_378-205254-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 18:22:43,206 WARNING [train.py:1204] (2/4) Exclude cut with ID _45329_210_7005_1_1534157951211_6774339_672-314830-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 18:22:46,604 WARNING [train.py:1204] (2/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_90-30673-0_sp1.1 from training. Duration: 0.763625 2023-10-02 18:22:48,017 WARNING [train.py:1204] (2/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_563-329943-0_sp1.1 from training. Duration: 0.9 2023-10-02 18:22:48,033 WARNING [train.py:1204] (2/4) Exclude cut with ID _21632_210_2253_1_1531994001765_3744140_154-84090-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 18:22:48,034 WARNING [train.py:1204] (2/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_103-336064-0_sp0.9 from training. Duration: 0.5666875 2023-10-02 18:22:49,319 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_43-71989-0_sp0.9 from training. Duration: 0.6333125 2023-10-02 18:22:49,376 WARNING [train.py:1204] (2/4) Exclude cut with ID _3867_210_1883_1_1526641200575_4475830_319-349850-0_sp0.9 from training. Duration: 0.7444375 2023-10-02 18:22:50,708 WARNING [train.py:1204] (2/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_629-179454-0_sp1.1 from training. Duration: 0.963625 2023-10-02 18:22:50,716 WARNING [train.py:1204] (2/4) Exclude cut with ID _65263_210_22356_1_1535763598691_3921037_49-222917-0 from training. Duration: 0.92 2023-10-02 18:22:50,757 WARNING [train.py:1204] (2/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_602-331772-0_sp1.1 from training. Duration: 0.936375 2023-10-02 18:22:51,025 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.out_combiner.scale_min, batch_count=975046.6666666666, ans=0.2 2023-10-02 18:22:52,107 WARNING [train.py:1204] (2/4) Exclude cut with ID _58414_210_8254_1_1535421609162_7151182_292-134978-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 18:22:52,142 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_38793_210_2544_1_1533176114_3849320_200-299292-0_sp1.1 from training. Duration: 0.936375 2023-10-02 18:22:53,780 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.659e+02 1.840e+02 2.003e+02 2.153e+02 3.195e+02, threshold=4.007e+02, percent-clipped=0.0 2023-10-02 18:22:53,924 WARNING [train.py:1204] (2/4) Exclude cut with ID _14970_210_15709_1_1530775547361_1966965_6-23328-0 from training. Duration: 0.86 2023-10-02 18:22:54,008 WARNING [train.py:1204] (2/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_386-225511-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 18:22:55,295 WARNING [train.py:1204] (2/4) Exclude cut with ID _38323_210_12108_1_1533121233369_3303049_416-11919-0_sp1.1 from training. Duration: 0.87275 2023-10-02 18:22:55,355 WARNING [train.py:1204] (2/4) Exclude cut with ID _4866_210_2577_1_1527404412284_4364659_256-306830-0_sp1.1 from training. Duration: 0.7909375 2023-10-02 18:22:56,743 WARNING [train.py:1204] (2/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_53-300544-0 from training. Duration: 0.67 2023-10-02 18:23:02,800 WARNING [train.py:1204] (2/4) Exclude cut with ID _41314_210_12651_1_1533459476543_3766460_80-74184-0_sp1.1 from training. Duration: 0.7545625 2023-10-02 18:23:02,821 WARNING [train.py:1204] (2/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_332-256168-0_sp1.1 from training. Duration: 0.6818125 2023-10-02 18:23:02,869 WARNING [train.py:1204] (2/4) Exclude cut with ID _48836_210_12210_1_1534075848293_3904150_429-35475-0_sp1.1 from training. Duration: 0.8090625 2023-10-02 18:23:05,699 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_144-135833-0_sp1.1 from training. Duration: 0.97275 2023-10-02 18:23:09,113 WARNING [train.py:1204] (2/4) Exclude cut with ID _14224_210_4381_1_1530529062037_4264010_380-343670-0_sp0.9 from training. Duration: 0.97775 2023-10-02 18:23:09,132 WARNING [train.py:1204] (2/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_213-265174-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 18:23:09,160 WARNING [train.py:1204] (2/4) Exclude cut with ID _20455_210_5219_1_1531565996775_8098100_86-314283-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 18:23:10,543 INFO [train.py:1046] (2/4) Epoch 28, batch 2850, loss[loss=0.1715, simple_loss=0.2424, pruned_loss=0.05034, over 23638.00 frames. ], tot_loss[loss=0.1662, simple_loss=0.2429, pruned_loss=0.04469, over 4692556.21 frames. ], batch size: 232, lr: 3.65e-03, grad_scale: 8.0 2023-10-02 18:23:10,940 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=975180.0, ans=0.1 2023-10-02 18:23:12,033 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_41750_210_1960_1_1533520678_3948042_68-223813-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 18:23:12,095 WARNING [train.py:1204] (2/4) Exclude cut with ID _55153_210_2253_1_1534384786677_3162096_24-198429-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 18:23:13,029 INFO [scaling.py:1118] (2/4) WithLoss: name=encoder.encoders.4.encoder.layers.1.self_attn_weights, loss-sum=0.000e+00 2023-10-02 18:23:13,963 WARNING [train.py:1204] (2/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_119-207162-0_sp0.9 from training. Duration: 0.788875 2023-10-02 18:23:14,016 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_12868_210_6753_1_1530702479_3610564_148-172861-0 from training. Duration: 0.5 2023-10-02 18:23:20,665 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_5522_210_3273_1_1527249606_3909096_318-203153-0 from training. Duration: 0.43 2023-10-02 18:23:20,672 WARNING [train.py:1204] (2/4) Exclude cut with ID _14884_210_2252_1_1530707459689_5561250_288-189949-0_sp1.1 from training. Duration: 0.97275 2023-10-02 18:23:20,960 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.feed_forward3.hidden_balancer.prob, batch_count=975180.0, ans=0.125 2023-10-02 18:23:22,713 WARNING [train.py:1204] (2/4) Exclude cut with ID _5294_210_1747_1_1527249643928_3696179_529-242298-0 from training. Duration: 0.95 2023-10-02 18:23:24,069 WARNING [train.py:1204] (2/4) Exclude cut with ID _36538_210_9994_1_1533204020363_3795620_503-110289-0_sp1.1 from training. Duration: 0.936375 2023-10-02 18:23:27,358 WARNING [train.py:1204] (2/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_385-193052-0 from training. Duration: 0.86 2023-10-02 18:23:28,736 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_456-22141-0 from training. Duration: 0.88 2023-10-02 18:23:30,159 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_38965_210_4852_1_1533174906_3484735_284-117840-0_sp1.1 from training. Duration: 0.936375 2023-10-02 18:23:40,842 WARNING [train.py:1204] (2/4) Exclude cut with ID _32704_210_8954_1_1533017488840_6747370_75-201625-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 18:23:42,591 WARNING [train.py:1204] (2/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_255-312654-0_sp1.1 from training. Duration: 0.82725 2023-10-02 18:23:42,606 WARNING [train.py:1204] (2/4) Exclude cut with ID _47251_210_18628_1_1533814063128_4137250_214-88076-0_sp0.9 from training. Duration: 0.97775 2023-10-02 18:23:43,909 WARNING [train.py:1204] (2/4) Exclude cut with ID _4092_210_2151_1_1526643044449_3135759_227-213972-0_sp1.1 from training. Duration: 0.4909375 2023-10-02 18:23:43,931 WARNING [train.py:1204] (2/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_912-47473-0_sp1.1 from training. Duration: 0.6181875 2023-10-02 18:23:43,950 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6767_210_3273_1_1527855783_1718932_173-256601-0_sp1.1 from training. Duration: 0.72725 2023-10-02 18:23:47,133 WARNING [train.py:1204] (2/4) Exclude cut with ID _6274_210_2084_1_1527937583837_7494020_32-205288-0_sp1.1 from training. Duration: 0.7090625 2023-10-02 18:23:47,164 WARNING [train.py:1204] (2/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_39-248870-0 from training. Duration: 0.69 2023-10-02 18:23:48,809 WARNING [train.py:1204] (2/4) Exclude cut with ID _3541_210_2477_1_1526475408709_3263973_377-296839-0_sp1.1 from training. Duration: 0.5 2023-10-02 18:23:50,049 WARNING [train.py:1204] (2/4) Exclude cut with ID _13679_210_7497_1_1530534293662_3355098_413-56154-0_sp1.1 from training. Duration: 0.963625 2023-10-02 18:23:50,104 WARNING [train.py:1204] (2/4) Exclude cut with ID _14970_210_15709_1_1530775547361_1966965_201-84544-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 18:23:50,153 WARNING [train.py:1204] (2/4) Exclude cut with ID _21783_210_11357_1_1531742458730_3762679_105-95775-0_sp1.1 from training. Duration: 0.936375 2023-10-02 18:23:51,804 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.conv_module2.balancer2.prob, batch_count=975313.3333333334, ans=0.125 2023-10-02 18:23:52,890 WARNING [train.py:1204] (2/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_131-191669-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 18:23:53,696 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.1.encoder.layers.1.whiten, num_groups=1, num_channels=256, metric=3.85 vs. limit=12.0 2023-10-02 18:23:54,796 WARNING [train.py:1204] (2/4) Exclude cut with ID _14860_210_4915_1_1530702020340_7343240_695-4323-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 18:23:54,877 WARNING [train.py:1204] (2/4) Exclude cut with ID _39867_210_9826_1_1533553222030_9344259_991-130565-0_sp1.1 from training. Duration: 0.97275 2023-10-02 18:23:57,669 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_50-3289-0_sp1.1 from training. Duration: 0.82725 2023-10-02 18:23:57,863 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.1.nonlin_attention.balancer.min_positive, batch_count=975380.0, ans=0.05 2023-10-02 18:23:59,484 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_54-347670-0_sp1.1 from training. Duration: 0.92725 2023-10-02 18:23:59,537 WARNING [train.py:1204] (2/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_200-153445-0_sp1.1 from training. Duration: 0.936375 2023-10-02 18:24:00,868 WARNING [train.py:1204] (2/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_279-302911-0_sp1.1 from training. Duration: 0.97275 2023-10-02 18:24:02,262 WARNING [train.py:1204] (2/4) Exclude cut with ID _14886_210_4220_1_1530702020355_3516190_159-73748-0_sp0.9 from training. Duration: 0.92225 2023-10-02 18:24:02,415 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.nonlin_attention.balancer.prob, batch_count=975380.0, ans=0.125 2023-10-02 18:24:03,868 INFO [scaling.py:1118] (2/4) WithLoss: name=encoder.encoders.0.layers.0.self_attn_weights, loss-sum=0.000e+00 2023-10-02 18:24:05,073 WARNING [train.py:1204] (2/4) Exclude cut with ID _53379_210_20034_1_1534222027363_3854654_23-222715-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 18:24:06,413 WARNING [train.py:1204] (2/4) Exclude cut with ID _12555_210_8750_1_1530075594402_3780239_449-267986-0 from training. Duration: 0.83 2023-10-02 18:24:06,423 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7544_210_6753_1_1528587722_3576686_295-258479-0 from training. Duration: 0.84 2023-10-02 18:24:09,234 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7002_210_1794_1_1527935634_4034444_487-236501-0_sp0.9 from training. Duration: 0.7333125 2023-10-02 18:24:09,277 WARNING [train.py:1204] (2/4) Exclude cut with ID _21783_210_11357_1_1531742458730_3762679_244-19107-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 18:24:10,508 WARNING [train.py:1204] (2/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_390-339137-0 from training. Duration: 0.56 2023-10-02 18:24:11,477 INFO [scaling.py:1022] (2/4) Whitening: name=encoder_embed.out_whiten, num_groups=1, num_channels=192, metric=6.73 vs. limit=8.0 2023-10-02 18:24:11,713 WARNING [train.py:1204] (2/4) Exclude cut with ID _7562_210_2084_1_1528887767916_3946229_186-326422-0_sp1.1 from training. Duration: 0.763625 2023-10-02 18:24:11,774 WARNING [train.py:1204] (2/4) Exclude cut with ID _33640_210_2655_1_1533117605479_4855469_645-145235-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 18:24:13,531 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_25767_210_2161_1_1532307483_3867748_506-171777-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 18:24:13,558 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_5438_210_1794_1_1527336687_2600009_233-58072-0_sp1.1 from training. Duration: 0.7 2023-10-02 18:24:13,558 WARNING [train.py:1204] (2/4) Exclude cut with ID IC0669W0063-330908 from training. Duration: 0.896 2023-10-02 18:24:13,593 WARNING [train.py:1204] (2/4) Exclude cut with ID IC0671W0070-331913 from training. Duration: 0.896 2023-10-02 18:24:13,596 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_258-310570-0_sp1.1 from training. Duration: 0.7454375 2023-10-02 18:24:14,913 WARNING [train.py:1204] (2/4) Exclude cut with ID _9461_210_4557_1_1529060514726_3677451_162-177507-0_sp1.1 from training. Duration: 0.97275 2023-10-02 18:24:17,159 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.conv_module2.balancer2.prob, batch_count=975446.6666666666, ans=0.125 2023-10-02 18:24:19,912 WARNING [train.py:1204] (2/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_26-175553-0_sp0.9 from training. Duration: 0.67775 2023-10-02 18:24:21,208 WARNING [train.py:1204] (2/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_7-150481-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 18:24:21,273 WARNING [train.py:1204] (2/4) Exclude cut with ID _23557_210_8750_1_1531890106370_6846255_72-273262-0_sp1.1 from training. Duration: 0.963625 2023-10-02 18:24:22,628 WARNING [train.py:1204] (2/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_367-61330-0 from training. Duration: 0.75 2023-10-02 18:24:23,879 INFO [train.py:1046] (2/4) Epoch 28, batch 2900, loss[loss=0.152, simple_loss=0.2382, pruned_loss=0.03287, over 24649.00 frames. ], tot_loss[loss=0.1663, simple_loss=0.2437, pruned_loss=0.04448, over 4701665.89 frames. ], batch size: 68, lr: 3.65e-03, grad_scale: 8.0 2023-10-02 18:24:26,184 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.conv_module1.balancer2.prob, batch_count=975513.3333333334, ans=0.125 2023-10-02 18:24:27,281 WARNING [train.py:1204] (2/4) Exclude cut with ID _39867_210_9826_1_1533553222030_9344259_1337-11141-0_sp1.1 from training. Duration: 0.936375 2023-10-02 18:24:27,306 WARNING [train.py:1204] (2/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_101-293367-0 from training. Duration: 0.63 2023-10-02 18:24:28,684 WARNING [train.py:1204] (2/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_10-100367-0 from training. Duration: 0.98 2023-10-02 18:24:30,525 WARNING [train.py:1204] (2/4) Exclude cut with ID _60057_210_10640_1_1534903065793_3333150_171-181133-0_sp1.1 from training. Duration: 0.8 2023-10-02 18:24:30,528 WARNING [train.py:1204] (2/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_844-96989-0_sp0.9 from training. Duration: 0.788875 2023-10-02 18:24:31,986 WARNING [train.py:1204] (2/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_301-144595-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 18:24:33,361 WARNING [train.py:1204] (2/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_115-147343-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 18:24:37,281 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_3171_210_2087_1_1526109084_2330007_37-64334-0_sp1.1 from training. Duration: 0.7454375 2023-10-02 18:24:37,341 WARNING [train.py:1204] (2/4) Exclude cut with ID _19514_210_9259_1_1531530071330_5084230_769-272914-0_sp1.1 from training. Duration: 0.936375 2023-10-02 18:24:40,353 WARNING [train.py:1204] (2/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_76-311963-0_sp0.9 from training. Duration: 0.77775 2023-10-02 18:24:40,386 WARNING [train.py:1204] (2/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_526-62079-0 from training. Duration: 0.67 2023-10-02 18:24:41,711 WARNING [train.py:1204] (2/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_7-111954-0_sp0.9 from training. Duration: 0.62225 2023-10-02 18:24:41,994 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.feed_forward1.hidden_balancer.prob, batch_count=975580.0, ans=0.125 2023-10-02 18:24:43,069 WARNING [train.py:1204] (2/4) Exclude cut with ID _57997_210_4163_1_1534766425889_3961049_360-279140-0_sp1.1 from training. Duration: 0.97275 2023-10-02 18:24:45,948 WARNING [train.py:1204] (2/4) Exclude cut with ID _4866_210_2577_1_1527404412284_4364659_133-1696-0 from training. Duration: 0.99 2023-10-02 18:24:46,012 WARNING [train.py:1204] (2/4) Exclude cut with ID _5190_210_4557_1_1527244368799_373439_28-197030-0 from training. Duration: 0.72 2023-10-02 18:24:49,213 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_53-39003-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 18:24:49,216 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6673_210_3273_1_1527767790_3173240_351-167954-0 from training. Duration: 0.48 2023-10-02 18:24:50,528 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_2822_210_2715_1_1526112586_1999587_165-36656-0_sp1.1 from training. Duration: 0.8090625 2023-10-02 18:24:51,981 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_328-264892-0_sp1.1 from training. Duration: 0.92725 2023-10-02 18:24:51,986 WARNING [train.py:1204] (2/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_29-293896-0_sp0.9 from training. Duration: 0.67775 2023-10-02 18:24:53,933 WARNING [train.py:1204] (2/4) Exclude cut with ID _65263_210_22356_1_1535763598691_3921037_465-55448-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 18:24:55,141 WARNING [train.py:1204] (2/4) Exclude cut with ID _21989_210_5854_1_1531899214931_3271360_311-53106-0_sp1.1 from training. Duration: 0.97275 2023-10-02 18:24:58,570 WARNING [train.py:1204] (2/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_389-277915-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 18:25:01,867 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_69630_210_7234_1_1535791094_677837_63-277139-0_sp1.1 from training. Duration: 0.963625 2023-10-02 18:25:03,203 WARNING [train.py:1204] (2/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_85-138257-0 from training. Duration: 0.71 2023-10-02 18:25:03,229 WARNING [train.py:1204] (2/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_686-247600-0 from training. Duration: 0.92 2023-10-02 18:25:03,235 WARNING [train.py:1204] (2/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_108-255430-0_sp1.1 from training. Duration: 0.8818125 2023-10-02 18:25:06,130 WARNING [train.py:1204] (2/4) Exclude cut with ID _14884_210_2252_1_1530707459689_5561250_114-201399-0_sp1.1 from training. Duration: 0.6909375 2023-10-02 18:25:07,607 WARNING [train.py:1204] (2/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_506-42614-0 from training. Duration: 0.79 2023-10-02 18:25:08,968 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_43802_210_2290_1_1533516203_8829504_85-195240-0_sp1.1 from training. Duration: 0.7909375 2023-10-02 18:25:13,072 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_141-36243-0_sp1.1 from training. Duration: 0.97275 2023-10-02 18:25:22,907 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.577e+02 1.803e+02 1.962e+02 2.123e+02 3.494e+02, threshold=3.924e+02, percent-clipped=0.0 2023-10-02 18:25:22,975 WARNING [train.py:1204] (2/4) Exclude cut with ID _7852_210_5219_1_1528286471316_8139400_358-287492-0_sp1.1 from training. Duration: 0.87275 2023-10-02 18:25:23,000 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_805-71497-0_sp0.9 from training. Duration: 0.788875 2023-10-02 18:25:24,698 WARNING [train.py:1204] (2/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_178-39722-0 from training. Duration: 0.49 2023-10-02 18:25:27,715 WARNING [train.py:1204] (2/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_428-181925-0_sp1.1 from training. Duration: 0.963625 2023-10-02 18:25:27,719 WARNING [train.py:1204] (2/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_770-200430-0 from training. Duration: 0.89 2023-10-02 18:25:27,778 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_1104-271009-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 18:25:29,768 WARNING [train.py:1204] (2/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_128-296581-0_sp0.9 from training. Duration: 0.82225 2023-10-02 18:25:36,460 WARNING [train.py:1204] (2/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_19-62511-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 18:25:37,844 INFO [train.py:1046] (2/4) Epoch 28, batch 2950, loss[loss=0.2242, simple_loss=0.2862, pruned_loss=0.08109, over 19442.00 frames. ], tot_loss[loss=0.1667, simple_loss=0.2445, pruned_loss=0.04444, over 4700572.37 frames. ], batch size: 388, lr: 3.65e-03, grad_scale: 8.0 2023-10-02 18:25:37,961 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_805-71497-0 from training. Duration: 0.71 2023-10-02 18:25:39,371 WARNING [train.py:1204] (2/4) Exclude cut with ID _44778_210_2054_1_1533798127203_7443090_573-152077-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 18:25:39,378 WARNING [train.py:1204] (2/4) Exclude cut with ID _42875_210_14856_1_1533704397487_4012433_393-177816-0_sp1.1 from training. Duration: 0.963625 2023-10-02 18:25:40,756 WARNING [train.py:1204] (2/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_278-35159-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 18:25:43,328 WARNING [train.py:1204] (2/4) Exclude cut with ID _15037_210_2253_1_1530752294104_3267650_13-130361-0_sp1.1 from training. Duration: 0.9 2023-10-02 18:25:44,738 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_3019_210_2028_1_1526044245_3264865_127-135615-0 from training. Duration: 0.94 2023-10-02 18:25:44,790 WARNING [train.py:1204] (2/4) Exclude cut with ID _28317_210_12742_1_1532480302381_8510325_607-213951-0 from training. Duration: 0.84 2023-10-02 18:25:46,076 WARNING [train.py:1204] (2/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_436-318113-0_sp0.9 from training. Duration: 0.8333125 2023-10-02 18:25:46,077 WARNING [train.py:1204] (2/4) Exclude cut with ID _13938_210_9988_1_1530518718978_3693960_148-152225-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 18:25:46,474 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.feed_forward1.out_proj.dropout_p, batch_count=975846.6666666666, ans=0.1 2023-10-02 18:25:47,950 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.conv_module1.balancer1.prob, batch_count=975846.6666666666, ans=0.125 2023-10-02 18:25:49,185 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_2541_210_1794_1_1525590269_3084765_156-173713-0_sp1.1 from training. Duration: 0.736375 2023-10-02 18:25:52,490 WARNING [train.py:1204] (2/4) Exclude cut with ID _28323_210_16187_1_1532480395667_4100300_531-24157-0_sp1.1 from training. Duration: 0.8909375 2023-10-02 18:25:54,354 WARNING [train.py:1204] (2/4) Exclude cut with ID _29303_210_2575_1_1532392237303_7410649_382-217492-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 18:25:54,401 WARNING [train.py:1204] (2/4) Exclude cut with ID _58895_210_8750_1_1535184081320_3649089_270-158013-0_sp1.1 from training. Duration: 0.92725 2023-10-02 18:25:57,693 WARNING [train.py:1204] (2/4) Exclude cut with ID _29277_210_3279_1_1532581109293_3173069_311-206227-0_sp1.1 from training. Duration: 0.936375 2023-10-02 18:25:57,712 WARNING [train.py:1204] (2/4) Exclude cut with ID _7160_210_1876_1_1527940923231_3703799_282-322611-0_sp0.9 from training. Duration: 0.9666875 2023-10-02 18:25:59,123 WARNING [train.py:1204] (2/4) Exclude cut with ID _53219_210_4525_1_1534834836008_4064825_562-32295-0_sp1.1 from training. Duration: 0.963625 2023-10-02 18:25:59,203 WARNING [train.py:1204] (2/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_408-87330-0_sp1.1 from training. Duration: 0.963625 2023-10-02 18:26:00,909 WARNING [train.py:1204] (2/4) Exclude cut with ID _59202_210_26984_1_1534816810612_3549174_137-318710-0_sp1.1 from training. Duration: 0.8090625 2023-10-02 18:26:02,306 WARNING [train.py:1204] (2/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_372-30302-0 from training. Duration: 0.86 2023-10-02 18:26:06,423 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7543_210_6753_1_1528541907_1399322_178-56269-0_sp1.1 from training. Duration: 0.95275 2023-10-02 18:26:06,457 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9109W0012-549758 from training. Duration: 0.512 2023-10-02 18:26:07,770 WARNING [train.py:1204] (2/4) Exclude cut with ID _5410_210_1388_1_1527397055795_1719020_39-112221-0_sp0.9 from training. Duration: 0.7666875 2023-10-02 18:26:09,089 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9121W0012-554933 from training. Duration: 0.896 2023-10-02 18:26:09,177 WARNING [train.py:1204] (2/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_476-272757-0 from training. Duration: 0.93 2023-10-02 18:26:10,529 WARNING [train.py:1204] (2/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_1085-171718-0_sp1.1 from training. Duration: 0.92725 2023-10-02 18:26:10,581 WARNING [train.py:1204] (2/4) Exclude cut with ID _8480_210_1874_1_1528589391205_6728330_309-139896-0_sp1.1 from training. Duration: 0.736375 2023-10-02 18:26:10,581 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9130W0113-559703 from training. Duration: 0.896 2023-10-02 18:26:10,586 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_292-265928-0_sp1.1 from training. Duration: 0.463625 2023-10-02 18:26:13,671 WARNING [train.py:1204] (2/4) Exclude cut with ID _8772_210_1850_1_1528635595972_3664011_96-12756-0 from training. Duration: 0.98 2023-10-02 18:26:13,752 WARNING [train.py:1204] (2/4) Exclude cut with ID _12556_210_8341_1_1530081024100_7327690_725-193477-0_sp1.1 from training. Duration: 0.8909375 2023-10-02 18:26:15,126 WARNING [train.py:1204] (2/4) Exclude cut with ID _5289_210_2084_1_1527678202736_3779299_142-103317-0_sp1.1 from training. Duration: 0.763625 2023-10-02 18:26:16,464 WARNING [train.py:1204] (2/4) Exclude cut with ID _45012_210_12651_1_1533608435136_5371380_206-51075-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 18:26:17,835 WARNING [train.py:1204] (2/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_176-56120-0_sp1.1 from training. Duration: 0.7545625 2023-10-02 18:26:17,861 WARNING [train.py:1204] (2/4) Exclude cut with ID _40284_210_2084_1_1533513679814_3704279_220-201864-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 18:26:19,160 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9165W0004-576638 from training. Duration: 0.896 2023-10-02 18:26:19,204 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_5438_210_1794_1_1527336687_2600009_218-343336-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 18:26:19,232 WARNING [train.py:1204] (2/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_318-162017-0 from training. Duration: 0.91 2023-10-02 18:26:23,996 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_49544_210_1794_1_1534379770_5262083_483-349298-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 18:26:25,850 WARNING [train.py:1204] (2/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_197-28164-0_sp0.9 from training. Duration: 0.888875 2023-10-02 18:26:27,215 WARNING [train.py:1204] (2/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_643-52007-0 from training. Duration: 0.57 2023-10-02 18:26:27,233 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_38965_210_4852_1_1533174906_3484735_403-332963-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 18:26:29,226 WARNING [train.py:1204] (2/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_340-174176-0 from training. Duration: 0.49 2023-10-02 18:26:33,276 WARNING [train.py:1204] (2/4) Exclude cut with ID _56708_210_13177_1_1535252351282_3422899_363-917-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 18:26:34,675 WARNING [train.py:1204] (2/4) Exclude cut with ID _19896_210_1883_1_1531709022662_6649569_525-159063-0_sp1.1 from training. Duration: 0.936375 2023-10-02 18:26:34,705 WARNING [train.py:1204] (2/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_430-116139-0_sp0.9 from training. Duration: 0.9444375 2023-10-02 18:26:37,308 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_53753_210_7234_1_1534320003_3594148_86-329250-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 18:26:37,317 WARNING [train.py:1204] (2/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_406-275649-0_sp1.1 from training. Duration: 0.3818125 2023-10-02 18:26:38,737 WARNING [train.py:1204] (2/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_1083-147536-0_sp1.1 from training. Duration: 0.8818125 2023-10-02 18:26:40,303 WARNING [train.py:1204] (2/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_54-311741-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 18:26:40,308 WARNING [train.py:1204] (2/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_237-58433-0_sp0.9 from training. Duration: 0.82225 2023-10-02 18:26:40,354 WARNING [train.py:1204] (2/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_98-51676-0_sp1.1 from training. Duration: 0.663625 2023-10-02 18:26:41,814 WARNING [train.py:1204] (2/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_386-270472-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 18:26:41,917 WARNING [train.py:1204] (2/4) Exclude cut with ID _19023_210_4353_1_1531378892552_5820780_83-254794-0_sp1.1 from training. Duration: 0.7818125 2023-10-02 18:26:43,225 WARNING [train.py:1204] (2/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_314-93656-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 18:26:44,664 WARNING [train.py:1204] (2/4) Exclude cut with ID _14170_210_5219_1_1530525949868_4469650_336-213530-0 from training. Duration: 0.9 2023-10-02 18:26:44,765 WARNING [train.py:1204] (2/4) Exclude cut with ID _36686_210_5665_1_1533778225913_3646410_65-221120-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 18:26:47,522 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_26433_210_12108_1_1532157158_4056480_271-68178-0_sp1.1 from training. Duration: 0.92725 2023-10-02 18:26:47,573 WARNING [train.py:1204] (2/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_46-207599-0_sp0.9 from training. Duration: 0.711125 2023-10-02 18:26:48,145 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.3.encoder.layers.2.whiten, num_groups=1, num_channels=512, metric=5.35 vs. limit=12.0 2023-10-02 18:26:51,591 INFO [train.py:1046] (2/4) Epoch 28, batch 3000, loss[loss=0.1801, simple_loss=0.2524, pruned_loss=0.05389, over 22689.00 frames. ], tot_loss[loss=0.1671, simple_loss=0.2449, pruned_loss=0.04465, over 4689661.26 frames. ], batch size: 322, lr: 3.65e-03, grad_scale: 8.0 2023-10-02 18:26:51,592 INFO [train.py:1069] (2/4) Computing validation loss 2023-10-02 18:27:03,150 INFO [train.py:1078] (2/4) Epoch 28, validation: loss=0.3199, simple_loss=0.2738, pruned_loss=0.183, over 1125622.00 frames. 2023-10-02 18:27:03,151 INFO [train.py:1079] (2/4) Maximum memory allocated so far is 20967MB 2023-10-02 18:27:03,267 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9303W0114-637133 from training. Duration: 0.896 2023-10-02 18:27:04,574 WARNING [train.py:1204] (2/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_490-207861-0 from training. Duration: 0.98 2023-10-02 18:27:06,061 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6767_210_3273_1_1527855783_1718932_173-256601-0_sp0.9 from training. Duration: 0.888875 2023-10-02 18:27:06,114 WARNING [train.py:1204] (2/4) Exclude cut with ID _13921_210_9994_1_1530841782272_3503520_327-69677-0_sp1.1 from training. Duration: 0.8181875 2023-10-02 18:27:07,406 WARNING [train.py:1204] (2/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_508-335369-0 from training. Duration: 0.48 2023-10-02 18:27:07,440 WARNING [train.py:1204] (2/4) Exclude cut with ID _64631_210_27767_1_1535349580068_4165160_24-46567-0_sp1.1 from training. Duration: 0.963625 2023-10-02 18:27:08,261 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.0.layers.0.conv_module2.whiten, num_groups=1, num_channels=192, metric=10.41 vs. limit=15.0 2023-10-02 18:27:13,149 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_89-35748-0_sp1.1 from training. Duration: 0.7181875 2023-10-02 18:27:21,416 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.conv_skip_rate, batch_count=976246.6666666666, ans=0.0 2023-10-02 18:27:23,903 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_26433_210_12108_1_1532157158_4056480_578-262107-0_sp1.1 from training. Duration: 0.9 2023-10-02 18:27:25,703 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.conv_module2.balancer2.prob, batch_count=976246.6666666666, ans=0.125 2023-10-02 18:27:28,744 WARNING [train.py:1204] (2/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_129-337802-0 from training. Duration: 0.72 2023-10-02 18:27:30,700 WARNING [train.py:1204] (2/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_356-167816-0_sp0.9 from training. Duration: 0.8 2023-10-02 18:27:30,897 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.nonlin_attention.balancer.prob, batch_count=976246.6666666666, ans=0.125 2023-10-02 18:27:34,792 WARNING [train.py:1204] (2/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_137-116442-0_sp1.1 from training. Duration: 0.7090625 2023-10-02 18:27:34,822 WARNING [train.py:1204] (2/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_358-172940-0_sp1.1 from training. Duration: 0.963625 2023-10-02 18:27:34,848 WARNING [train.py:1204] (2/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_668-344147-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 18:27:37,628 WARNING [train.py:1204] (2/4) Exclude cut with ID _62337_210_12392_1_1535009273105_3412229_255-157667-0_sp1.1 from training. Duration: 0.936375 2023-10-02 18:27:37,639 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_486-151131-0 from training. Duration: 0.85 2023-10-02 18:27:40,390 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_53638_210_21581_1_1534297319_5808449_885-80172-0 from training. Duration: 0.96 2023-10-02 18:27:41,846 WARNING [train.py:1204] (2/4) Exclude cut with ID _50093_210_4220_1_1534417141207_3432299_105-183067-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 18:27:43,126 WARNING [train.py:1204] (2/4) Exclude cut with ID _7562_210_2084_1_1528887767916_3946229_345-169156-0_sp0.9 from training. Duration: 0.6444375 2023-10-02 18:27:45,777 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_4528_210_3273_1_1527076654_3276777_209-219804-0_sp0.9 from training. Duration: 0.7666875 2023-10-02 18:27:45,804 WARNING [train.py:1204] (2/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_35-153886-0_sp0.9 from training. Duration: 0.9333125 2023-10-02 18:27:45,877 WARNING [train.py:1204] (2/4) Exclude cut with ID _30800_210_22382_1_1532499347904_4515959_332-121925-0_sp1.1 from training. Duration: 0.92725 2023-10-02 18:27:45,879 WARNING [train.py:1204] (2/4) Exclude cut with ID _59202_210_26984_1_1534816810612_3549174_495-65515-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 18:27:48,656 WARNING [train.py:1204] (2/4) Exclude cut with ID _6274_210_2084_1_1527937583837_7494020_769-168302-0_sp1.1 from training. Duration: 0.6454375 2023-10-02 18:27:49,911 WARNING [train.py:1204] (2/4) Exclude cut with ID _24271_210_12658_1_1531987270022_7190519_708-71345-0_sp1.1 from training. Duration: 0.936375 2023-10-02 18:27:49,912 WARNING [train.py:1204] (2/4) Exclude cut with ID 3033-130750-0096-55598-0_sp0.9 from training. Duration: 0.92225 2023-10-02 18:27:51,369 WARNING [train.py:1204] (2/4) Exclude cut with ID _14087_210_2253_1_1530529224512_3014029_219-224445-0_sp0.9 from training. Duration: 0.9333125 2023-10-02 18:27:54,601 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_174-277364-0 from training. Duration: 0.71 2023-10-02 18:27:54,685 WARNING [train.py:1204] (2/4) Exclude cut with ID _45021_210_9994_1_1533619751094_3330288_96-239010-0_sp1.1 from training. Duration: 0.82725 2023-10-02 18:27:55,983 WARNING [train.py:1204] (2/4) Exclude cut with ID _31602_210_5854_1_1532677021456_4458250_489-305116-0_sp1.1 from training. Duration: 0.97275 2023-10-02 18:27:56,016 WARNING [train.py:1204] (2/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_269-154726-0_sp0.9 from training. Duration: 0.9555625 2023-10-02 18:28:00,065 WARNING [train.py:1204] (2/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_126-54418-0_sp1.1 from training. Duration: 0.92725 2023-10-02 18:28:00,089 WARNING [train.py:1204] (2/4) Exclude cut with ID _42326_210_7770_1_1533814282531_3707269_182-224159-0_sp1.1 from training. Duration: 0.92725 2023-10-02 18:28:01,803 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.605e+02 1.829e+02 2.050e+02 2.331e+02 3.345e+02, threshold=4.099e+02, percent-clipped=0.0 2023-10-02 18:28:01,956 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7002_210_1794_1_1527939683_5157286_1-281264-0_sp0.9 from training. Duration: 0.488875 2023-10-02 18:28:03,647 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_86-16881-0 from training. Duration: 0.58 2023-10-02 18:28:03,680 WARNING [train.py:1204] (2/4) Exclude cut with ID _19514_210_9259_1_1531530071330_5084230_637-279094-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 18:28:03,709 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_26433_210_12108_1_1532157158_4056480_578-262107-0 from training. Duration: 0.99 2023-10-02 18:28:03,751 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6291_210_3273_1_1527683649_1078498_82-221196-0_sp1.1 from training. Duration: 0.6909375 2023-10-02 18:28:06,542 WARNING [train.py:1204] (2/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_402-102311-0 from training. Duration: 0.67 2023-10-02 18:28:08,038 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1015-343884-0_sp1.1 from training. Duration: 0.563625 2023-10-02 18:28:09,411 WARNING [train.py:1204] (2/4) Exclude cut with ID _13684_210_7497_1_1530619354863_3598359_312-235606-0_sp1.1 from training. Duration: 0.5454375 2023-10-02 18:28:09,445 WARNING [train.py:1204] (2/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_55-31904-0 from training. Duration: 0.88 2023-10-02 18:28:09,691 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.conv_module1.balancer2.prob, batch_count=976446.6666666666, ans=0.125 2023-10-02 18:28:10,751 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_3133_210_2087_1_1526106446_2324690_25-332138-0 from training. Duration: 0.91 2023-10-02 18:28:10,761 WARNING [train.py:1204] (2/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_912-282412-0_sp0.9 from training. Duration: 0.5444375 2023-10-02 18:28:12,167 WARNING [train.py:1204] (2/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_458-299435-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 18:28:13,404 WARNING [train.py:1204] (2/4) Exclude cut with ID _69584_210_5219_1_1535785133775_4044450_275-88403-0_sp1.1 from training. Duration: 0.92725 2023-10-02 18:28:13,412 WARNING [train.py:1204] (2/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_841-341679-0_sp1.1 from training. Duration: 0.52725 2023-10-02 18:28:13,423 WARNING [train.py:1204] (2/4) Exclude cut with ID _41391_210_7994_1_1533603771872_3638201_1-54521-0_sp1.1 from training. Duration: 0.97275 2023-10-02 18:28:14,742 WARNING [train.py:1204] (2/4) Exclude cut with ID _56235_210_10640_1_1534557335235_4161099_495-186609-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 18:28:16,136 INFO [train.py:1046] (2/4) Epoch 28, batch 3050, loss[loss=0.1756, simple_loss=0.2503, pruned_loss=0.05045, over 23584.00 frames. ], tot_loss[loss=0.1677, simple_loss=0.2454, pruned_loss=0.04495, over 4706233.38 frames. ], batch size: 256, lr: 3.65e-03, grad_scale: 8.0 2023-10-02 18:28:16,285 WARNING [train.py:1204] (2/4) Exclude cut with ID _4681_210_3525_1_1527250689464_120889_15-126760-0 from training. Duration: 0.81 2023-10-02 18:28:19,107 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_3019_210_2028_1_1526044245_3264865_127-135615-0_sp1.1 from training. Duration: 0.8545625 2023-10-02 18:28:20,504 WARNING [train.py:1204] (2/4) Exclude cut with ID _30191_210_1664_1_1533189915047_7196670_915-21561-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 18:28:20,543 WARNING [train.py:1204] (2/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_6-214815-0_sp0.9 from training. Duration: 0.9444375 2023-10-02 18:28:24,301 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_45399_210_4852_1_1533624725_7579737_190-137494-0_sp1.1 from training. Duration: 0.97275 2023-10-02 18:28:25,825 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.0.nonlin_attention.balancer.prob, batch_count=976513.3333333334, ans=0.125 2023-10-02 18:28:27,054 WARNING [train.py:1204] (2/4) Exclude cut with ID _41314_210_12651_1_1533459476543_3766460_207-132038-0 from training. Duration: 0.94 2023-10-02 18:28:34,315 WARNING [train.py:1204] (2/4) Exclude cut with ID _14603_210_4220_1_1530615688441_3097319_90-31383-0 from training. Duration: 0.87 2023-10-02 18:28:34,355 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_15517_210_6753_1_1530952293_1664795_119-31001-0 from training. Duration: 0.94 2023-10-02 18:28:34,395 WARNING [train.py:1204] (2/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_684-85342-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 18:28:37,883 WARNING [train.py:1204] (2/4) Exclude cut with ID _14603_210_4220_1_1530615688441_3097319_97-325053-0_sp1.1 from training. Duration: 0.836375 2023-10-02 18:28:40,705 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_68702_210_23689_1_1535799577_3420068_168-25150-0_sp1.1 from training. Duration: 0.97275 2023-10-02 18:28:40,712 WARNING [train.py:1204] (2/4) Exclude cut with ID _65134_210_14763_1_1535335393985_3505368_111-27984-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 18:28:40,769 WARNING [train.py:1204] (2/4) Exclude cut with ID _48678_210_5663_1_1533988830947_3769199_276-226230-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 18:28:43,514 WARNING [train.py:1204] (2/4) Exclude cut with ID _28427_210_4220_1_1532581240967_3061069_43-84928-0_sp1.1 from training. Duration: 0.9 2023-10-02 18:28:43,552 WARNING [train.py:1204] (2/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_158-250116-0_sp1.1 from training. Duration: 0.563625 2023-10-02 18:28:44,829 WARNING [train.py:1204] (2/4) Exclude cut with ID _66873_210_5517_1_1535454136506_4293370_279-332556-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 18:28:44,870 WARNING [train.py:1204] (2/4) Exclude cut with ID _42863_210_9070_1_1533902398788_3696920_488-230146-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 18:28:44,870 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_387-247159-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 18:28:46,233 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6729_210_2550_1_1528032481_1788725_261-42619-0_sp1.1 from training. Duration: 0.97275 2023-10-02 18:28:47,711 WARNING [train.py:1204] (2/4) Exclude cut with ID _19896_210_1883_1_1531709022662_6649569_831-44724-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 18:28:47,876 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.bypass_mid.scale_min, batch_count=976646.6666666666, ans=0.2 2023-10-02 18:28:52,237 WARNING [train.py:1204] (2/4) Exclude cut with ID _55153_210_2253_1_1534384786677_3162096_280-285944-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 18:28:52,271 WARNING [train.py:1204] (2/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_448-4048-0 from training. Duration: 0.96 2023-10-02 18:28:53,622 WARNING [train.py:1204] (2/4) Exclude cut with ID _29678_210_7893_1_1532421042787_3906118_456-202548-0_sp1.1 from training. Duration: 0.97275 2023-10-02 18:28:53,648 WARNING [train.py:1204] (2/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_107-332398-0_sp1.1 from training. Duration: 0.7090625 2023-10-02 18:28:56,897 WARNING [train.py:1204] (2/4) Exclude cut with ID _13921_210_9994_1_1530838702800_3003429_46-149444-0_sp1.1 from training. Duration: 0.8545625 2023-10-02 18:28:58,217 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_257-20733-0_sp0.9 from training. Duration: 0.8444375 2023-10-02 18:28:58,267 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_53753_210_7234_1_1534320003_3594148_385-234809-0_sp1.1 from training. Duration: 0.963625 2023-10-02 18:28:59,597 WARNING [train.py:1204] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_292-215346-0_sp1.1 from training. Duration: 0.8818125 2023-10-02 18:29:04,339 WARNING [train.py:1204] (2/4) Exclude cut with ID _68965_210_1883_1_1535698962272_3497490_196-124268-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 18:29:04,390 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_23801_210_16223_1_1532066392_3294657_570-5439-0_sp1.1 from training. Duration: 0.8818125 2023-10-02 18:29:10,846 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.conv_module1.balancer2.min_positive, batch_count=976713.3333333334, ans=0.05 2023-10-02 18:29:11,952 WARNING [train.py:1204] (2/4) Exclude cut with ID _5166_210_2084_1_1527073197160_4087993_349-6928-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 18:29:12,007 WARNING [train.py:1204] (2/4) Exclude cut with ID _60621_210_17071_1_1535013053530_3671739_226-255202-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 18:29:12,011 WARNING [train.py:1204] (2/4) Exclude cut with ID _41817_210_18835_1_1533434477160_7423049_448-130302-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 18:29:14,622 WARNING [train.py:1204] (2/4) Exclude cut with ID _6653_210_2629_1_1527919172441_6732679_758-143677-0_sp1.1 from training. Duration: 0.8909375 2023-10-02 18:29:14,673 WARNING [train.py:1204] (2/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_912-47473-0_sp0.9 from training. Duration: 0.7555625 2023-10-02 18:29:14,709 WARNING [train.py:1204] (2/4) Exclude cut with ID _6694_210_4599_1_1527852637490_3965529_356-169233-0_sp1.1 from training. Duration: 0.9 2023-10-02 18:29:17,394 WARNING [train.py:1204] (2/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_1-99680-0 from training. Duration: 0.67 2023-10-02 18:29:18,836 WARNING [train.py:1204] (2/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_199-86432-0_sp1.1 from training. Duration: 0.8909375 2023-10-02 18:29:20,151 WARNING [train.py:1204] (2/4) Exclude cut with ID _61148_210_6897_1_1535094022726_4217610_362-182430-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 18:29:20,231 WARNING [train.py:1204] (2/4) Exclude cut with ID _49376_210_5986_1_1533985278732_3370740_301-154763-0 from training. Duration: 0.98 2023-10-02 18:29:21,665 WARNING [train.py:1204] (2/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_572-185150-0_sp1.1 from training. Duration: 0.8818125 2023-10-02 18:29:27,535 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_47896_210_20508_1_1534492518_5958868_558-314572-0_sp1.1 from training. Duration: 0.8818125 2023-10-02 18:29:27,635 WARNING [train.py:1204] (2/4) Exclude cut with ID _45560_210_21982_1_1533812603951_3584999_111-6376-0_sp0.9 from training. Duration: 0.9333125 2023-10-02 18:29:29,110 WARNING [train.py:1204] (2/4) Exclude cut with ID _14170_210_5219_1_1530525949868_4469650_412-317077-0_sp1.1 from training. Duration: 0.6818125 2023-10-02 18:29:30,968 INFO [train.py:1046] (2/4) Epoch 28, batch 3100, loss[loss=0.1864, simple_loss=0.2749, pruned_loss=0.04897, over 24351.00 frames. ], tot_loss[loss=0.1673, simple_loss=0.245, pruned_loss=0.04474, over 4715775.61 frames. ], batch size: 74, lr: 3.65e-03, grad_scale: 8.0 2023-10-02 18:29:31,110 WARNING [train.py:1204] (2/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_718-68505-0 from training. Duration: 0.89 2023-10-02 18:29:34,530 WARNING [train.py:1204] (2/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_77-311404-0 from training. Duration: 0.94 2023-10-02 18:29:34,608 WARNING [train.py:1204] (2/4) Exclude cut with ID _60229_210_27767_1_1534838406158_3655350_264-279573-0 from training. Duration: 0.83 2023-10-02 18:29:35,812 WARNING [train.py:1204] (2/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_106-300985-0_sp1.1 from training. Duration: 0.8181875 2023-10-02 18:29:41,502 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_4183_210_2087_1_1526729100_1869838_79-277189-0_sp1.1 from training. Duration: 0.963625 2023-10-02 18:29:41,511 WARNING [train.py:1204] (2/4) Exclude cut with ID _28427_210_4220_1_1532581240967_3061069_195-295268-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 18:29:44,233 WARNING [train.py:1204] (2/4) Exclude cut with ID _4092_210_2151_1_1526643044449_3135759_356-278694-0_sp0.9 from training. Duration: 0.52225 2023-10-02 18:29:47,186 WARNING [train.py:1204] (2/4) Exclude cut with ID _14459_210_2228_1_1531443641946_7516700_777-71446-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 18:29:51,443 WARNING [train.py:1204] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_520-126729-0 from training. Duration: 0.7 2023-10-02 18:29:52,102 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.3.encoder.layers.2.feed_forward1.out_whiten, num_groups=1, num_channels=512, metric=8.09 vs. limit=15.0 2023-10-02 18:29:56,080 WARNING [train.py:1204] (2/4) Exclude cut with ID _13684_210_7497_1_1530619354863_3598359_312-235606-0_sp0.9 from training. Duration: 0.6666875 2023-10-02 18:29:57,308 WARNING [train.py:1204] (2/4) Exclude cut with ID _37537_210_2253_1_1533171758568_3421949_283-349402-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 18:29:57,339 WARNING [train.py:1204] (2/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_29-117932-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 18:29:57,359 WARNING [train.py:1204] (2/4) Exclude cut with ID _29303_210_2575_1_1532392237303_7410649_721-283351-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 18:29:58,715 WARNING [train.py:1204] (2/4) Exclude cut with ID _14886_210_4220_1_1530702020355_3516190_175-229073-0_sp0.9 from training. Duration: 0.5 2023-10-02 18:29:59,049 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.feed_forward2.hidden_balancer.prob, batch_count=976980.0, ans=0.125 2023-10-02 18:30:00,268 WARNING [train.py:1204] (2/4) Exclude cut with ID _15458_210_8341_1_1530858614263_3834410_70-268830-0_sp1.1 from training. Duration: 0.97275 2023-10-02 18:30:00,293 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_2295_210_2087_1_1525500993_2407863_234-83104-0 from training. Duration: 0.62 2023-10-02 18:30:00,294 WARNING [train.py:1204] (2/4) Exclude cut with ID _22961_210_8254_1_1531976366672_4278180_317-199546-0_sp1.1 from training. Duration: 0.7818125 2023-10-02 18:30:02,228 WARNING [train.py:1204] (2/4) Exclude cut with ID _27894_210_12392_1_1532415496078_6902439_547-196022-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 18:30:04,160 WARNING [train.py:1204] (2/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_46-207599-0 from training. Duration: 0.64 2023-10-02 18:30:05,558 WARNING [train.py:1204] (2/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_426-326246-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 18:30:07,230 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=976980.0, ans=0.1 2023-10-02 18:30:08,961 WARNING [train.py:1204] (2/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_445-117398-0_sp0.9 from training. Duration: 0.988875 2023-10-02 18:30:10,211 WARNING [train.py:1204] (2/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_563-347832-0 from training. Duration: 0.89 2023-10-02 18:30:10,294 WARNING [train.py:1204] (2/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_252-216674-0 from training. Duration: 0.54 2023-10-02 18:30:10,375 WARNING [train.py:1204] (2/4) Exclude cut with ID _59611_210_8895_1_1534741198240_3650361_187-212779-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 18:30:11,711 WARNING [train.py:1204] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_282-159367-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 18:30:13,508 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.4.encoder.layers.2.conv_module2.whiten, num_groups=1, num_channels=384, metric=4.67 vs. limit=15.0 2023-10-02 18:30:14,370 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_68702_210_23689_1_1535799577_3420068_33-136883-0_sp1.1 from training. Duration: 0.936375 2023-10-02 18:30:15,683 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_47228_210_4983_1_1533780021_3672027_184-293706-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 18:30:15,716 WARNING [train.py:1204] (2/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_656-188361-0_sp0.9 from training. Duration: 0.9666875 2023-10-02 18:30:17,230 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.0.bypass.scale_min, batch_count=977046.6666666666, ans=0.2 2023-10-02 18:30:18,450 WARNING [train.py:1204] (2/4) Exclude cut with ID _4866_210_2577_1_1527404412284_4364659_352-157930-0_sp0.9 from training. Duration: 0.9 2023-10-02 18:30:18,451 WARNING [train.py:1204] (2/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_210-106047-0_sp1.1 from training. Duration: 0.8818125 2023-10-02 18:30:19,796 WARNING [train.py:1204] (2/4) Exclude cut with ID _9389_210_1664_1_1529217337301_5289269_289-213417-0_sp1.1 from training. Duration: 0.8090625 2023-10-02 18:30:19,831 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_3910_210_1794_1_1526777667_3747260_178-41186-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 18:30:19,835 WARNING [train.py:1204] (2/4) Exclude cut with ID _27052_210_5986_1_1532329197187_3585009_24-172263-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 18:30:19,841 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_5522_210_3273_1_1527249606_3909096_318-203153-0_sp1.1 from training. Duration: 0.3909375 2023-10-02 18:30:25,283 WARNING [train.py:1204] (2/4) Exclude cut with ID _27894_210_12392_1_1532415496078_6902439_222-51982-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 18:30:25,369 WARNING [train.py:1204] (2/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_87-131427-0 from training. Duration: 0.78 2023-10-02 18:30:27,095 WARNING [train.py:1204] (2/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_372-298093-0_sp0.9 from training. Duration: 0.888875 2023-10-02 18:30:28,437 WARNING [train.py:1204] (2/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_663-166973-0 from training. Duration: 0.8 2023-10-02 18:30:28,467 WARNING [train.py:1204] (2/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_429-177469-0_sp1.1 from training. Duration: 0.936375 2023-10-02 18:30:29,751 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.547e+02 1.812e+02 1.984e+02 2.223e+02 4.054e+02, threshold=3.967e+02, percent-clipped=0.0 2023-10-02 18:30:29,821 WARNING [train.py:1204] (2/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_724-172651-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 18:30:29,868 WARNING [train.py:1204] (2/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_103-336064-0 from training. Duration: 0.51 2023-10-02 18:30:41,568 WARNING [train.py:1204] (2/4) Exclude cut with ID _48677_210_5663_1_1533902405919_3892989_111-11439-0 from training. Duration: 0.96 2023-10-02 18:30:44,706 INFO [train.py:1046] (2/4) Epoch 28, batch 3150, loss[loss=0.1823, simple_loss=0.256, pruned_loss=0.05434, over 23558.00 frames. ], tot_loss[loss=0.1664, simple_loss=0.2438, pruned_loss=0.04453, over 4719562.73 frames. ], batch size: 134, lr: 3.65e-03, grad_scale: 8.0 2023-10-02 18:30:44,829 WARNING [train.py:1204] (2/4) Exclude cut with ID _8085_210_4557_1_1528455633493_3716100_95-103398-0_sp1.1 from training. Duration: 0.963625 2023-10-02 18:30:44,891 WARNING [train.py:1204] (2/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_1041-110837-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 18:30:47,521 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_50050_210_16223_1_1535435796_7290138_1155-121757-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 18:30:47,524 WARNING [train.py:1204] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_602-275260-0_sp0.9 from training. Duration: 0.911125 2023-10-02 18:30:48,873 WARNING [train.py:1204] (2/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_265-164486-0 from training. Duration: 0.96 2023-10-02 18:30:50,261 WARNING [train.py:1204] (2/4) Exclude cut with ID _46067_210_4220_1_1534125571066_3307450_142-229375-0_sp1.1 from training. Duration: 0.963625 2023-10-02 18:30:50,288 WARNING [train.py:1204] (2/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_310-26186-0_sp1.1 from training. Duration: 0.636375 2023-10-02 18:30:51,629 WARNING [train.py:1204] (2/4) Exclude cut with ID _28461_210_2253_1_1532771775189_3041299_136-270644-0 from training. Duration: 0.93 2023-10-02 18:30:54,496 WARNING [train.py:1204] (2/4) Exclude cut with ID _14860_210_4915_1_1530702020340_7343240_438-286372-0_sp1.1 from training. Duration: 0.936375 2023-10-02 18:30:56,181 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.bypass.scale_min, batch_count=977180.0, ans=0.2 2023-10-02 18:30:57,284 WARNING [train.py:1204] (2/4) Exclude cut with ID IC0128W0004-63309 from training. Duration: 0.831 2023-10-02 18:30:57,499 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.self_attn_weights.pos_emb_skip_rate, batch_count=977246.6666666666, ans=0.0 2023-10-02 18:30:59,520 WARNING [train.py:1204] (2/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_135-174080-0 from training. Duration: 0.53 2023-10-02 18:31:00,797 WARNING [train.py:1204] (2/4) Exclude cut with ID _41326_210_5854_1_1533778076509_3492080_277-59511-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 18:31:00,866 WARNING [train.py:1204] (2/4) Exclude cut with ID IC0144W0112-71379 from training. Duration: 0.992 2023-10-02 18:31:00,932 WARNING [train.py:1204] (2/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_711-15688-0_sp0.9 from training. Duration: 0.47775 2023-10-02 18:31:03,709 WARNING [train.py:1204] (2/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_237-332027-0 from training. Duration: 0.95 2023-10-02 18:31:03,762 WARNING [train.py:1204] (2/4) Exclude cut with ID _7970_210_1883_1_1528455616296_3472230_25-178407-0 from training. Duration: 0.71 2023-10-02 18:31:03,764 WARNING [train.py:1204] (2/4) Exclude cut with ID _8480_210_1874_1_1528589391205_6728330_309-139896-0 from training. Duration: 0.81 2023-10-02 18:31:03,780 WARNING [train.py:1204] (2/4) Exclude cut with ID _39571_210_13449_1_1533801584143_3651077_319-268877-0_sp1.1 from training. Duration: 0.936375 2023-10-02 18:31:03,783 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_43802_210_2290_1_1533516203_8829504_155-246880-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 18:31:06,412 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_566-122599-0_sp1.1 from training. Duration: 0.936375 2023-10-02 18:31:08,206 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_12868_210_6753_1_1530701932_530275_18-76322-0 from training. Duration: 0.87 2023-10-02 18:31:09,992 WARNING [train.py:1204] (2/4) Exclude cut with ID _56494_210_4220_1_1535284837625_3033169_159-106035-0_sp1.1 from training. Duration: 0.963625 2023-10-02 18:31:10,027 WARNING [train.py:1204] (2/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_98-276013-0_sp1.1 from training. Duration: 0.963625 2023-10-02 18:31:11,426 WARNING [train.py:1204] (2/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_330-216124-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 18:31:12,918 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_177-58005-0_sp0.9 from training. Duration: 0.688875 2023-10-02 18:31:17,432 WARNING [train.py:1204] (2/4) Exclude cut with ID _56235_210_10640_1_1534557335235_4161099_343-139898-0 from training. Duration: 0.96 2023-10-02 18:31:17,505 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6961_210_2087_1_1527937168_6815011_395-185060-0_sp1.1 from training. Duration: 0.863625 2023-10-02 18:31:20,374 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7002_210_1794_1_1527939683_5157286_184-229630-0_sp0.9 from training. Duration: 0.82225 2023-10-02 18:31:20,420 WARNING [train.py:1204] (2/4) Exclude cut with ID _59170_210_6721_1_1534983633960_4203335_153-18895-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 18:31:21,830 WARNING [train.py:1204] (2/4) Exclude cut with ID _49643_210_7989_1_1534640244224_3939031_598-161926-0 from training. Duration: 0.9 2023-10-02 18:31:24,671 WARNING [train.py:1204] (2/4) Exclude cut with ID _4068_210_2629_1_1526697350395_1546259_224-288693-0 from training. Duration: 0.59 2023-10-02 18:31:26,015 WARNING [train.py:1204] (2/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_718-68505-0_sp1.1 from training. Duration: 0.8090625 2023-10-02 18:31:26,075 WARNING [train.py:1204] (2/4) Exclude cut with ID _4068_210_2629_1_1526697350395_1546259_224-288693-0_sp0.9 from training. Duration: 0.6555625 2023-10-02 18:31:26,093 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_2296_210_2087_1_1525503448_3397335_57-185430-0_sp0.9 from training. Duration: 0.6666875 2023-10-02 18:31:27,401 WARNING [train.py:1204] (2/4) Exclude cut with ID _15458_210_8341_1_1530858614263_3834410_49-235472-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 18:31:27,403 WARNING [train.py:1204] (2/4) Exclude cut with ID _64135_210_2253_1_1535677267876_7608972_498-5039-0_sp0.9 from training. Duration: 0.8666875 2023-10-02 18:31:28,791 WARNING [train.py:1204] (2/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_283-220372-0_sp0.9 from training. Duration: 0.62225 2023-10-02 18:31:28,802 WARNING [train.py:1204] (2/4) Exclude cut with ID _6473_210_1741_1_1527901307396_3815380_108-17335-0_sp0.9 from training. Duration: 0.72225 2023-10-02 18:31:28,887 WARNING [train.py:1204] (2/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_113-92608-0 from training. Duration: 0.54 2023-10-02 18:31:29,555 WARNING [train.py:1204] (2/4) Exclude cut with ID _14170_210_5219_1_1530525949868_4469650_272-249985-0_sp1.1 from training. Duration: 0.6090625 2023-10-02 18:31:29,571 WARNING [train.py:1204] (2/4) Exclude cut with ID _50376_210_13401_1_1534031502101_4217740_518-187390-0_sp1.1 from training. Duration: 0.97275 2023-10-02 18:31:32,165 WARNING [train.py:1204] (2/4) Exclude cut with ID _59738_210_15379_1_1534849512562_3941470_165-248260-0_sp1.1 from training. Duration: 0.92725 2023-10-02 18:31:32,175 WARNING [train.py:1204] (2/4) Exclude cut with ID _23818_210_6228_1_1531917063884_3676920_400-343925-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 18:31:33,565 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6961_210_2087_1_1527937168_6815011_562-165518-0 from training. Duration: 0.77 2023-10-02 18:31:33,629 WARNING [train.py:1204] (2/4) Exclude cut with ID _24271_210_12658_1_1531987270022_7190519_289-207931-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 18:31:35,076 WARNING [train.py:1204] (2/4) Exclude cut with ID _14632_210_9988_1_1530691279392_3923200_332-105904-0 from training. Duration: 0.9 2023-10-02 18:31:35,095 WARNING [train.py:1204] (2/4) Exclude cut with ID _40402_210_18219_1_1533522324115_2953150_203-232438-0_sp1.1 from training. Duration: 0.97275 2023-10-02 18:31:36,481 WARNING [train.py:1204] (2/4) Exclude cut with ID _13252_210_1925_1_1530671372047_3336909_263-168388-0 from training. Duration: 0.86 2023-10-02 18:31:37,855 WARNING [train.py:1204] (2/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_174-205058-0 from training. Duration: 0.95 2023-10-02 18:31:39,745 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_94-195801-0_sp1.1 from training. Duration: 0.8545625 2023-10-02 18:31:39,766 WARNING [train.py:1204] (2/4) Exclude cut with ID _55153_210_2253_1_1534384786677_3162096_269-103204-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 18:31:41,525 WARNING [train.py:1204] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_748-36150-0 from training. Duration: 0.91 2023-10-02 18:31:43,266 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_12868_210_6753_1_1530702479_3610564_148-172861-0_sp1.1 from training. Duration: 0.4545625 2023-10-02 18:31:43,325 WARNING [train.py:1204] (2/4) Exclude cut with ID _67499_210_17282_1_1535509805220_3692430_460-77730-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 18:31:46,217 WARNING [train.py:1204] (2/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_374-20753-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 18:31:47,474 WARNING [train.py:1204] (2/4) Exclude cut with ID _45329_210_7005_1_1534157951211_6774339_374-289983-0_sp1.1 from training. Duration: 0.97275 2023-10-02 18:31:47,509 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_34886_210_10633_1_1533203882_1755661_76-156096-0_sp0.9 from training. Duration: 0.9666875 2023-10-02 18:31:51,654 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_238-256668-0_sp0.9 from training. Duration: 0.7666875 2023-10-02 18:31:53,073 WARNING [train.py:1204] (2/4) Exclude cut with ID _46683_210_9642_1_1533730913492_4591116_489-146727-0_sp1.1 from training. Duration: 0.97275 2023-10-02 18:31:54,534 WARNING [train.py:1204] (2/4) Exclude cut with ID _13938_210_9988_1_1530518718978_3693960_304-112784-0_sp0.9 from training. Duration: 0.511125 2023-10-02 18:31:58,494 INFO [train.py:1046] (2/4) Epoch 28, batch 3200, loss[loss=0.1638, simple_loss=0.2505, pruned_loss=0.03855, over 24319.00 frames. ], tot_loss[loss=0.1657, simple_loss=0.2433, pruned_loss=0.04401, over 4720251.87 frames. ], batch size: 74, lr: 3.65e-03, grad_scale: 16.0 2023-10-02 18:31:58,673 WARNING [train.py:1204] (2/4) Exclude cut with ID _24271_210_12658_1_1531987270022_7190519_243-46549-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 18:31:58,680 WARNING [train.py:1204] (2/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_78-182912-0_sp1.1 from training. Duration: 0.536375 2023-10-02 18:32:00,689 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.0.balancer1.prob, batch_count=977513.3333333334, ans=0.125 2023-10-02 18:32:04,298 WARNING [train.py:1204] (2/4) Exclude cut with ID _57572_210_5517_1_1534921219624_3809484_121-321334-0_sp1.1 from training. Duration: 0.97275 2023-10-02 18:32:04,810 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.4.encoder.layers.2.nonlin_attention.whiten1, num_groups=1, num_channels=288, metric=5.83 vs. limit=10.0 2023-10-02 18:32:05,642 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_40047_210_2290_1_1533290519_2374311_254-102149-0_sp1.1 from training. Duration: 0.963625 2023-10-02 18:32:05,643 WARNING [train.py:1204] (2/4) Exclude cut with ID _28461_210_2253_1_1532771775189_3041299_96-152158-0 from training. Duration: 0.85 2023-10-02 18:32:07,325 WARNING [train.py:1204] (2/4) Exclude cut with ID _29877_210_6228_1_1532397791106_5276149_161-102979-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 18:32:12,469 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_3787_210_2040_1_1526477544_5478586_603-120324-0_sp0.9 from training. Duration: 0.9 2023-10-02 18:32:18,448 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_41750_210_1960_1_1533520678_3948042_247-213456-0_sp1.1 from training. Duration: 0.97275 2023-10-02 18:32:19,161 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.3.encoder.layers.3.self_attn1.whiten, num_groups=1, num_channels=512, metric=12.63 vs. limit=22.5 2023-10-02 18:32:25,628 WARNING [train.py:1204] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_127-119109-0_sp0.9 from training. Duration: 0.8 2023-10-02 18:32:34,397 WARNING [train.py:1204] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_215-202973-0 from training. Duration: 0.88 2023-10-02 18:32:35,722 WARNING [train.py:1204] (2/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_17-320491-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 18:32:35,942 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.conv_module1.balancer2.prob, batch_count=977646.6666666666, ans=0.125 2023-10-02 18:32:36,441 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.3.encoder.layers.1.conv_module2.whiten, num_groups=1, num_channels=512, metric=3.38 vs. limit=15.0 2023-10-02 18:32:38,434 WARNING [train.py:1204] (2/4) Exclude cut with ID _5166_210_2084_1_1527073197160_4087993_310-114452-0 from training. Duration: 0.59 2023-10-02 18:32:39,865 WARNING [train.py:1204] (2/4) Exclude cut with ID _15133_210_6228_1_1530794512074_3080379_29-264458-0_sp0.9 from training. Duration: 0.6444375 2023-10-02 18:32:41,812 WARNING [train.py:1204] (2/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_159-281310-0_sp1.1 from training. Duration: 0.863625 2023-10-02 18:32:41,829 WARNING [train.py:1204] (2/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_195-167011-0_sp0.9 from training. Duration: 0.8333125 2023-10-02 18:32:43,740 WARNING [train.py:1204] (2/4) Exclude cut with ID _14087_210_2253_1_1530529224512_3014029_156-257607-0_sp1.1 from training. Duration: 0.7545625 2023-10-02 18:32:46,875 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_2295_210_2087_1_1525500993_2407863_116-238894-0 from training. Duration: 0.7 2023-10-02 18:32:47,101 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.1.feed_forward1.hidden_balancer.prob, batch_count=977713.3333333334, ans=0.125 2023-10-02 18:32:48,405 WARNING [train.py:1204] (2/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_321-335475-0_sp0.9 from training. Duration: 0.5 2023-10-02 18:32:49,828 WARNING [train.py:1204] (2/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_188-114464-0 from training. Duration: 0.82 2023-10-02 18:32:51,440 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=977713.3333333334, ans=0.1 2023-10-02 18:32:52,598 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_2592_210_2290_1_1525697025_4882925_157-70512-0 from training. Duration: 0.91 2023-10-02 18:32:54,028 WARNING [train.py:1204] (2/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_138-64210-0_sp1.1 from training. Duration: 0.663625 2023-10-02 18:32:58,087 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.399e+02 1.829e+02 1.986e+02 2.158e+02 2.804e+02, threshold=3.972e+02, percent-clipped=0.0 2023-10-02 18:33:00,945 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_11305_210_1794_1_1529750428_7178148_651-95702-0_sp1.1 from training. Duration: 0.963625 2023-10-02 18:33:00,966 WARNING [train.py:1204] (2/4) Exclude cut with ID _14224_210_4381_1_1530529062037_4264010_379-184223-0_sp0.9 from training. Duration: 0.9444375 2023-10-02 18:33:00,985 WARNING [train.py:1204] (2/4) Exclude cut with ID _8394_210_6702_1_1528590504316_3540189_381-125325-0_sp1.1 from training. Duration: 0.963625 2023-10-02 18:33:02,288 WARNING [train.py:1204] (2/4) Exclude cut with ID IC0622W0021-307659 from training. Duration: 0.896 2023-10-02 18:33:02,290 WARNING [train.py:1204] (2/4) Exclude cut with ID _7624_210_1531_1_1528109802198_4933101_418-349834-0_sp1.1 from training. Duration: 0.5454375 2023-10-02 18:33:05,499 WARNING [train.py:1204] (2/4) Exclude cut with ID _49728_210_12651_1_1534041034294_6788150_607-3416-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 18:33:05,601 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8275_210_4198_1_1528455320_3892571_491-17707-0 from training. Duration: 0.64 2023-10-02 18:33:07,007 WARNING [train.py:1204] (2/4) Exclude cut with ID _5287_210_2084_1_1527418883040_3878670_333-48508-0 from training. Duration: 0.6 2023-10-02 18:33:07,092 WARNING [train.py:1204] (2/4) Exclude cut with ID _8365_210_2064_1_1528592311070_4677690_371-122295-0 from training. Duration: 0.55 2023-10-02 18:33:08,982 WARNING [train.py:1204] (2/4) Exclude cut with ID _14860_210_4915_1_1530702020340_7343240_836-328489-0 from training. Duration: 0.9 2023-10-02 18:33:11,627 WARNING [train.py:1204] (2/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_1033-274376-0_sp0.9 from training. Duration: 0.9666875 2023-10-02 18:33:13,359 INFO [train.py:1046] (2/4) Epoch 28, batch 3250, loss[loss=0.1819, simple_loss=0.2515, pruned_loss=0.05613, over 23377.00 frames. ], tot_loss[loss=0.1661, simple_loss=0.244, pruned_loss=0.04413, over 4723872.36 frames. ], batch size: 285, lr: 3.65e-03, grad_scale: 16.0 2023-10-02 18:33:15,391 WARNING [train.py:1204] (2/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_475-127455-0_sp0.9 from training. Duration: 0.77775 2023-10-02 18:33:15,397 WARNING [train.py:1204] (2/4) Exclude cut with ID IC0671W0071-331914 from training. Duration: 0.896 2023-10-02 18:33:15,423 WARNING [train.py:1204] (2/4) Exclude cut with ID _30327_210_16049_1_1532484161295_3924169_2-1523-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 18:33:15,425 WARNING [train.py:1204] (2/4) Exclude cut with ID _24314_210_4915_1_1532135078116_7170179_460-208279-0_sp1.1 from training. Duration: 0.97275 2023-10-02 18:33:16,823 WARNING [train.py:1204] (2/4) Exclude cut with ID IC0679W0052-335859 from training. Duration: 0.64 2023-10-02 18:33:20,920 WARNING [train.py:1204] (2/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_3-237434-0_sp1.1 from training. Duration: 0.6909375 2023-10-02 18:33:23,800 WARNING [train.py:1204] (2/4) Exclude cut with ID _69884_210_9771_1_1535846392818_4166169_405-185283-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 18:33:30,724 WARNING [train.py:1204] (2/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_927-83684-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 18:33:30,733 WARNING [train.py:1204] (2/4) Exclude cut with ID _6694_210_4599_1_1527852637490_3965529_356-169233-0 from training. Duration: 0.99 2023-10-02 18:33:30,828 WARNING [train.py:1204] (2/4) Exclude cut with ID _19896_210_1883_1_1531709022662_6649569_562-233001-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 18:33:31,052 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.attention_skip_rate, batch_count=977913.3333333334, ans=0.0 2023-10-02 18:33:32,103 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_20120_210_6753_1_1531643035_5482859_354-78629-0_sp1.1 from training. Duration: 0.963625 2023-10-02 18:33:32,104 WARNING [train.py:1204] (2/4) Exclude cut with ID _60135_210_13427_1_1534935557922_3651319_96-168198-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 18:33:33,489 WARNING [train.py:1204] (2/4) Exclude cut with ID _31602_210_5854_1_1532677021456_4458250_249-96604-0_sp1.1 from training. Duration: 0.8090625 2023-10-02 18:33:34,152 WARNING [train.py:1204] (2/4) Exclude cut with ID _14100_210_8341_1_1530529320613_3678069_258-224599-0_sp0.9 from training. Duration: 0.8555625 2023-10-02 18:33:36,542 WARNING [train.py:1204] (2/4) Exclude cut with ID _19708_210_9206_1_1532136650733_4238364_215-160135-0_sp1.1 from training. Duration: 0.97275 2023-10-02 18:33:36,561 WARNING [train.py:1204] (2/4) Exclude cut with ID _21878_210_3525_1_1531738772002_7264330_113-18163-0_sp0.9 from training. Duration: 0.988875 2023-10-02 18:33:36,615 WARNING [train.py:1204] (2/4) Exclude cut with ID _64135_210_2253_1_1535677267876_7608972_552-337340-0_sp1.1 from training. Duration: 0.92725 2023-10-02 18:33:36,638 WARNING [train.py:1204] (2/4) Exclude cut with ID _67725_210_27767_1_1535608756671_5105359_10-12887-0_sp1.1 from training. Duration: 0.97275 2023-10-02 18:33:36,644 WARNING [train.py:1204] (2/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_401-284840-0_sp1.1 from training. Duration: 0.97275 2023-10-02 18:33:37,890 WARNING [train.py:1204] (2/4) Exclude cut with ID _44659_210_2721_1_1534036120061_6614860_483-141285-0_sp1.1 from training. Duration: 0.936375 2023-10-02 18:33:41,123 WARNING [train.py:1204] (2/4) Exclude cut with ID _9461_210_4557_1_1529060514726_3677451_358-332734-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 18:33:44,247 WARNING [train.py:1204] (2/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_788-298907-0_sp1.1 from training. Duration: 0.8090625 2023-10-02 18:33:45,779 WARNING [train.py:1204] (2/4) Exclude cut with ID _39411_210_5986_1_1533538825030_3343649_354-267196-0_sp1.1 from training. Duration: 0.92725 2023-10-02 18:33:45,799 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_14953_210_2151_1_1530701710_5616165_192-309792-0_sp1.1 from training. Duration: 0.97275 2023-10-02 18:33:47,152 WARNING [train.py:1204] (2/4) Exclude cut with ID _59170_210_6721_1_1534983633960_4203335_290-151255-0_sp1.1 from training. Duration: 0.92725 2023-10-02 18:33:49,114 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_5575_210_1410_1_1527301305_2570170_249-93769-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 18:33:49,130 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_46013_210_7234_1_1533776362_3744032_463-52378-0_sp1.1 from training. Duration: 0.7818125 2023-10-02 18:33:53,397 WARNING [train.py:1204] (2/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_580-93798-0 from training. Duration: 0.56 2023-10-02 18:33:54,843 WARNING [train.py:1204] (2/4) Exclude cut with ID _19514_210_9259_1_1531530071330_5084230_385-13762-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 18:33:54,848 WARNING [train.py:1204] (2/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_458-67321-0_sp1.1 from training. Duration: 0.8818125 2023-10-02 18:33:54,960 WARNING [train.py:1204] (2/4) Exclude cut with ID _41298_210_9019_1_1533430371450_4822143_188-154791-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 18:33:56,286 WARNING [train.py:1204] (2/4) Exclude cut with ID _15037_210_2253_1_1530752294104_3267650_296-192597-0_sp0.9 from training. Duration: 0.9 2023-10-02 18:34:01,771 WARNING [train.py:1204] (2/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_661-264857-0_sp1.1 from training. Duration: 0.8454375 2023-10-02 18:34:09,310 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_34008_210_10431_1_1533345424_8183247_662-317331-0_sp1.1 from training. Duration: 0.8545625 2023-10-02 18:34:09,338 WARNING [train.py:1204] (2/4) Exclude cut with ID _46502_210_13794_1_1533724452198_3657159_241-278329-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 18:34:09,339 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_11349_210_6753_1_1529800861_1101207_26-65602-0 from training. Duration: 0.96 2023-10-02 18:34:09,342 WARNING [train.py:1204] (2/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_448-4048-0_sp1.1 from training. Duration: 0.87275 2023-10-02 18:34:09,349 WARNING [train.py:1204] (2/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_662-275621-0_sp1.1 from training. Duration: 0.3818125 2023-10-02 18:34:09,390 WARNING [train.py:1204] (2/4) Exclude cut with ID _13938_210_9988_1_1530518718978_3693960_256-54454-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 18:34:10,989 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.feed_forward2.hidden_balancer.prob, batch_count=978113.3333333334, ans=0.125 2023-10-02 18:34:12,613 WARNING [train.py:1204] (2/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_256-32662-0 from training. Duration: 0.9 2023-10-02 18:34:12,683 WARNING [train.py:1204] (2/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_326-17061-0 from training. Duration: 0.94 2023-10-02 18:34:12,879 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.conv_skip_rate, batch_count=978113.3333333334, ans=0.0 2023-10-02 18:34:13,905 WARNING [train.py:1204] (2/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_505-97987-0_sp1.1 from training. Duration: 0.7818125 2023-10-02 18:34:15,769 WARNING [train.py:1204] (2/4) Exclude cut with ID _66873_210_5517_1_1535454136506_4293370_316-294364-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 18:34:15,820 WARNING [train.py:1204] (2/4) Exclude cut with ID _19514_210_9259_1_1531530071330_5084230_762-231063-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 18:34:15,874 WARNING [train.py:1204] (2/4) Exclude cut with ID _4697_210_2069_1_1526815801953_3823098_68-219807-0_sp0.9 from training. Duration: 0.52225 2023-10-02 18:34:17,112 WARNING [train.py:1204] (2/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_479-158053-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 18:34:17,313 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.conv_skip_rate, batch_count=978113.3333333334, ans=0.0 2023-10-02 18:34:17,410 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.conv_skip_rate, batch_count=978113.3333333334, ans=0.0 2023-10-02 18:34:19,305 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.2.encoder.layers.1.self_attn2.whiten, num_groups=1, num_channels=384, metric=17.15 vs. limit=22.5 2023-10-02 18:34:21,653 WARNING [train.py:1204] (2/4) Exclude cut with ID _66112_210_10635_1_1535590236305_3801484_83-38181-0_sp1.1 from training. Duration: 0.936375 2023-10-02 18:34:21,670 WARNING [train.py:1204] (2/4) Exclude cut with ID _55857_210_2346_1_1534644007831_6898209_255-146346-0_sp1.1 from training. Duration: 0.963625 2023-10-02 18:34:24,324 WARNING [train.py:1204] (2/4) Exclude cut with ID _48836_210_12210_1_1534075848293_3904150_429-35475-0 from training. Duration: 0.89 2023-10-02 18:34:24,335 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_50154_210_13731_1_1534750862_1080961_119-187163-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 18:34:24,984 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.3.encoder.layers.3.self_attn2.whiten, num_groups=1, num_channels=512, metric=11.75 vs. limit=22.5 2023-10-02 18:34:27,077 INFO [train.py:1046] (2/4) Epoch 28, batch 3300, loss[loss=0.1648, simple_loss=0.2546, pruned_loss=0.0375, over 24460.00 frames. ], tot_loss[loss=0.1663, simple_loss=0.2443, pruned_loss=0.04409, over 4729114.46 frames. ], batch size: 69, lr: 3.65e-03, grad_scale: 16.0 2023-10-02 18:34:27,149 WARNING [train.py:1204] (2/4) Exclude cut with ID _8793_210_6310_1_1528542985139_2310009_121-179782-0_sp0.9 from training. Duration: 0.888875 2023-10-02 18:34:27,152 WARNING [train.py:1204] (2/4) Exclude cut with ID _14884_210_2252_1_1530707459689_5561250_591-173406-0 from training. Duration: 0.96 2023-10-02 18:34:29,958 WARNING [train.py:1204] (2/4) Exclude cut with ID _15133_210_6228_1_1530794512074_3080379_54-198193-0_sp1.1 from training. Duration: 0.8545625 2023-10-02 18:34:29,966 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6545_210_2040_1_1527774670_4245258_407-48070-0 from training. Duration: 0.76 2023-10-02 18:34:31,376 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1032-236863-0 from training. Duration: 0.84 2023-10-02 18:34:32,763 WARNING [train.py:1204] (2/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_507-303370-0 from training. Duration: 0.96 2023-10-02 18:34:32,787 WARNING [train.py:1204] (2/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_164-282366-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 18:34:32,899 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder_embed.convnext.out_balancer.prob, batch_count=978180.0, ans=0.125 2023-10-02 18:34:35,713 WARNING [train.py:1204] (2/4) Exclude cut with ID _39867_210_9826_1_1533553222030_9344259_312-158727-0_sp1.1 from training. Duration: 0.963625 2023-10-02 18:34:38,332 WARNING [train.py:1204] (2/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_174-205058-0_sp1.1 from training. Duration: 0.863625 2023-10-02 18:34:38,353 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_11305_210_1794_1_1529750428_7178148_166-237358-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 18:34:41,533 WARNING [train.py:1204] (2/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_26-175553-0_sp1.1 from training. Duration: 0.5545625 2023-10-02 18:34:41,559 WARNING [train.py:1204] (2/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_96-290039-0_sp1.1 from training. Duration: 0.5090625 2023-10-02 18:34:41,838 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.attention_skip_rate, batch_count=978246.6666666666, ans=0.0 2023-10-02 18:34:44,945 WARNING [train.py:1204] (2/4) Exclude cut with ID _53219_210_4525_1_1534834836008_4064825_261-101691-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 18:34:45,057 WARNING [train.py:1204] (2/4) Exclude cut with ID _24271_210_12658_1_1531987270022_7190519_96-293390-0_sp1.1 from training. Duration: 0.936375 2023-10-02 18:34:49,610 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_271-234962-0 from training. Duration: 0.79 2023-10-02 18:34:49,673 WARNING [train.py:1204] (2/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_136-30660-0_sp1.1 from training. Duration: 0.97275 2023-10-02 18:34:49,686 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_20120_210_6753_1_1531643035_5482859_625-281046-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 18:34:52,303 WARNING [train.py:1204] (2/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_333-168193-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 18:34:52,373 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9029W0269-509664 from training. Duration: 0.896 2023-10-02 18:34:53,748 WARNING [train.py:1204] (2/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_450-147393-0_sp1.1 from training. Duration: 0.92725 2023-10-02 18:34:55,649 WARNING [train.py:1204] (2/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_643-52007-0_sp1.1 from training. Duration: 0.5181875 2023-10-02 18:34:55,723 WARNING [train.py:1204] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_330-327861-0_sp0.9 from training. Duration: 0.7444375 2023-10-02 18:34:56,984 WARNING [train.py:1204] (2/4) Exclude cut with ID _47790_210_16187_1_1533862814066_4207519_260-317651-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 18:34:57,016 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9044W0010-516654 from training. Duration: 0.896 2023-10-02 18:34:57,799 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.1.encoder.layers.0.whiten, num_groups=1, num_channels=256, metric=5.01 vs. limit=12.0 2023-10-02 18:34:59,799 WARNING [train.py:1204] (2/4) Exclude cut with ID _14087_210_2253_1_1530529224512_3014029_265-118378-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 18:34:59,804 WARNING [train.py:1204] (2/4) Exclude cut with ID _6194_210_1534_1_1527511826160_3941194_321-148641-0_sp0.9 from training. Duration: 0.711125 2023-10-02 18:35:02,660 WARNING [train.py:1204] (2/4) Exclude cut with ID _54871_210_22356_1_1534665798253_3856359_101-169176-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 18:35:02,662 WARNING [train.py:1204] (2/4) Exclude cut with ID 497-129325-0061-62254-0_sp1.1 from training. Duration: 0.97725 2023-10-02 18:35:04,026 WARNING [train.py:1204] (2/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_795-33252-0_sp1.1 from training. Duration: 0.336375 2023-10-02 18:35:04,040 WARNING [train.py:1204] (2/4) Exclude cut with ID _28317_210_12742_1_1532480302381_8510325_168-246176-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 18:35:04,156 WARNING [train.py:1204] (2/4) Exclude cut with ID _7160_210_1876_1_1527940923231_3703799_120-150943-0_sp1.1 from training. Duration: 0.736375 2023-10-02 18:35:05,534 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9084W0005-536844 from training. Duration: 0.768 2023-10-02 18:35:08,192 WARNING [train.py:1204] (2/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_234-260682-0 from training. Duration: 0.84 2023-10-02 18:35:09,516 WARNING [train.py:1204] (2/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_188-114464-0_sp0.9 from training. Duration: 0.911125 2023-10-02 18:35:11,639 WARNING [train.py:1204] (2/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_81-75849-0 from training. Duration: 0.39 2023-10-02 18:35:14,717 WARNING [train.py:1204] (2/4) Exclude cut with ID _14224_210_4381_1_1530529062037_4264010_379-184223-0_sp1.1 from training. Duration: 0.77275 2023-10-02 18:35:15,651 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.0.layers.1.feed_forward2.out_whiten, num_groups=1, num_channels=192, metric=4.11 vs. limit=15.0 2023-10-02 18:35:16,650 WARNING [train.py:1204] (2/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_454-123002-0_sp1.1 from training. Duration: 0.62725 2023-10-02 18:35:16,689 WARNING [train.py:1204] (2/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_887-249555-0_sp1.1 from training. Duration: 0.7 2023-10-02 18:35:19,456 WARNING [train.py:1204] (2/4) Exclude cut with ID _31088_210_16446_1_1532520012284_3440919_20-260902-0_sp1.1 from training. Duration: 0.97275 2023-10-02 18:35:20,772 WARNING [train.py:1204] (2/4) Exclude cut with ID _45329_210_7005_1_1534157951211_6774339_856-344574-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 18:35:20,782 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_36756_210_7234_1_1533026991_966276_4-164118-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 18:35:20,798 WARNING [train.py:1204] (2/4) Exclude cut with ID _8365_210_2064_1_1528592311070_4677690_371-122295-0_sp1.1 from training. Duration: 0.5 2023-10-02 18:35:23,730 WARNING [train.py:1204] (2/4) Exclude cut with ID _5910_210_1389_1_1527337972454_5050100_555-159815-0_sp1.1 from training. Duration: 0.8909375 2023-10-02 18:35:23,739 WARNING [train.py:1204] (2/4) Exclude cut with ID _37086_210_9038_1_1532999540891_7021250_469-147941-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 18:35:23,810 WARNING [train.py:1204] (2/4) Exclude cut with ID _3067_210_2667_1_1526212974640_4503168_459-173867-0_sp0.9 from training. Duration: 0.97775 2023-10-02 18:35:25,221 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9156W0006-572499 from training. Duration: 0.896 2023-10-02 18:35:26,522 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.464e+02 1.840e+02 2.125e+02 2.554e+02 4.181e+02, threshold=4.250e+02, percent-clipped=1.0 2023-10-02 18:35:26,594 WARNING [train.py:1204] (2/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_277-210798-0 from training. Duration: 0.55 2023-10-02 18:35:28,668 WARNING [train.py:1204] (2/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_103-336064-0_sp1.1 from training. Duration: 0.463625 2023-10-02 18:35:28,720 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_3414_210_1988_1_1526212718_4813195_331-290945-0_sp1.1 from training. Duration: 0.936375 2023-10-02 18:35:28,721 WARNING [train.py:1204] (2/4) Exclude cut with ID _67499_210_17282_1_1535509805220_3692430_365-29276-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 18:35:31,386 WARNING [train.py:1204] (2/4) Exclude cut with ID _14821_210_8750_1_1530680503991_6829359_69-250847-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 18:35:31,387 WARNING [train.py:1204] (2/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_1102-61301-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 18:35:34,063 WARNING [train.py:1204] (2/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_166-246557-0_sp1.1 from training. Duration: 0.5818125 2023-10-02 18:35:35,351 WARNING [train.py:1204] (2/4) Exclude cut with ID _15351_210_10626_1_1530856635653_3576210_214-129918-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 18:35:35,372 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_120-41668-0_sp0.9 from training. Duration: 0.77775 2023-10-02 18:35:35,437 WARNING [train.py:1204] (2/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_27-72278-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 18:35:36,855 WARNING [train.py:1204] (2/4) Exclude cut with ID _24314_210_4915_1_1532135078116_7170179_521-258868-0_sp1.1 from training. Duration: 0.7181875 2023-10-02 18:35:38,237 WARNING [train.py:1204] (2/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_143-86344-0 from training. Duration: 0.77 2023-10-02 18:35:38,271 WARNING [train.py:1204] (2/4) Exclude cut with ID _53489_210_6228_1_1534298357169_3835970_465-157433-0_sp1.1 from training. Duration: 0.92725 2023-10-02 18:35:39,800 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.feed_forward1.hidden_balancer.prob, batch_count=978513.3333333334, ans=0.125 2023-10-02 18:35:40,904 INFO [train.py:1046] (2/4) Epoch 28, batch 3350, loss[loss=0.1508, simple_loss=0.2318, pruned_loss=0.03483, over 24469.00 frames. ], tot_loss[loss=0.1662, simple_loss=0.2447, pruned_loss=0.04392, over 4735273.48 frames. ], batch size: 58, lr: 3.64e-03, grad_scale: 16.0 2023-10-02 18:35:40,956 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_248-319353-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 18:35:42,364 WARNING [train.py:1204] (2/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_102-77064-0_sp1.1 from training. Duration: 0.8454375 2023-10-02 18:35:42,389 WARNING [train.py:1204] (2/4) Exclude cut with ID _9389_210_1664_1_1529217337301_5289269_379-128794-0_sp1.1 from training. Duration: 0.77275 2023-10-02 18:35:42,488 WARNING [train.py:1204] (2/4) Exclude cut with ID _3231_210_2106_1_1526205864934_6784220_236-260835-0_sp1.1 from training. Duration: 0.97275 2023-10-02 18:35:45,659 WARNING [train.py:1204] (2/4) Exclude cut with ID _24271_210_12658_1_1531987270022_7190519_184-342363-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 18:35:45,660 WARNING [train.py:1204] (2/4) Exclude cut with ID _27894_210_12392_1_1532415496078_6902439_67-221663-0_sp1.1 from training. Duration: 0.92725 2023-10-02 18:35:48,826 WARNING [train.py:1204] (2/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_304-59227-0_sp1.1 from training. Duration: 0.7 2023-10-02 18:35:50,254 WARNING [train.py:1204] (2/4) Exclude cut with ID _58605_210_12210_1_1534723195939_3904259_458-42232-0_sp1.1 from training. Duration: 0.92725 2023-10-02 18:35:51,612 WARNING [train.py:1204] (2/4) Exclude cut with ID _14922_210_6228_1_1530707435273_3693310_13-165512-0_sp0.9 from training. Duration: 0.911125 2023-10-02 18:35:54,416 WARNING [train.py:1204] (2/4) Exclude cut with ID _58602_210_2346_1_1534903210085_6908830_792-102241-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 18:35:55,822 WARNING [train.py:1204] (2/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_588-125165-0_sp0.9 from training. Duration: 0.92225 2023-10-02 18:35:55,925 WARNING [train.py:1204] (2/4) Exclude cut with ID _54218_210_12210_1_1534413646388_4409259_239-98429-0_sp1.1 from training. Duration: 0.97275 2023-10-02 18:35:56,201 INFO [scaling.py:1118] (2/4) WithLoss: name=encoder.encoders.5.encoder.layers.1.self_attn_weights, loss-sum=0.000e+00 2023-10-02 18:35:57,256 WARNING [train.py:1204] (2/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_538-32291-0_sp1.1 from training. Duration: 0.7545625 2023-10-02 18:35:57,485 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.conv_module1.balancer2.prob, batch_count=978580.0, ans=0.125 2023-10-02 18:35:59,097 WARNING [train.py:1204] (2/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_419-317062-0 from training. Duration: 0.91 2023-10-02 18:36:00,534 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9309W0202-640329 from training. Duration: 0.896 2023-10-02 18:36:00,571 WARNING [train.py:1204] (2/4) Exclude cut with ID _3231_210_2106_1_1526205864934_6784220_419-313291-0_sp1.1 from training. Duration: 0.97275 2023-10-02 18:36:02,119 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.balancer2.prob, batch_count=978580.0, ans=0.125 2023-10-02 18:36:04,568 WARNING [train.py:1204] (2/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_619-344499-0 from training. Duration: 0.63 2023-10-02 18:36:04,588 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_278-272612-0 from training. Duration: 0.73 2023-10-02 18:36:04,666 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_103-84010-0_sp0.9 from training. Duration: 0.7666875 2023-10-02 18:36:05,945 WARNING [train.py:1204] (2/4) Exclude cut with ID _26528_210_9070_1_1532570410842_3851310_497-97218-0_sp1.1 from training. Duration: 0.7818125 2023-10-02 18:36:07,329 WARNING [train.py:1204] (2/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_262-84613-0_sp1.1 from training. Duration: 0.936375 2023-10-02 18:36:08,622 WARNING [train.py:1204] (2/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_387-131538-0 from training. Duration: 0.81 2023-10-02 18:36:08,655 WARNING [train.py:1204] (2/4) Exclude cut with ID _8030_210_7497_1_1528444886781_3531089_224-300489-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 18:36:08,673 WARNING [train.py:1204] (2/4) Exclude cut with ID _14349_210_8341_1_1530615710818_3442470_170-336541-0_sp0.9 from training. Duration: 0.9555625 2023-10-02 18:36:10,106 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_47069_210_15368_1_1533781267_4431320_180-31746-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 18:36:10,309 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.conv_module1.balancer2.prob, batch_count=978646.6666666666, ans=0.125 2023-10-02 18:36:11,981 WARNING [train.py:1204] (2/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_447-7041-0_sp1.1 from training. Duration: 0.92725 2023-10-02 18:36:13,260 WARNING [train.py:1204] (2/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_2-55283-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 18:36:13,311 WARNING [train.py:1204] (2/4) Exclude cut with ID _12555_210_8750_1_1530075594402_3780239_70-181911-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 18:36:15,401 WARNING [train.py:1204] (2/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_311-328022-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 18:36:18,535 WARNING [train.py:1204] (2/4) Exclude cut with ID _14528_210_8254_1_1530669700514_4306939_330-2987-0_sp1.1 from training. Duration: 0.92725 2023-10-02 18:36:18,588 WARNING [train.py:1204] (2/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_385-184858-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 18:36:21,648 INFO [scaling.py:1118] (2/4) WithLoss: name=encoder.encoders.2.encoder.layers.1.self_attn_weights, loss-sum=0.000e+00 2023-10-02 18:36:22,758 WARNING [train.py:1204] (2/4) Exclude cut with ID _64414_210_17301_1_1535243252048_3757759_348-333360-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 18:36:24,086 WARNING [train.py:1204] (2/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_753-214751-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 18:36:25,497 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_17613_210_2151_1_1531137569_2366160_202-208013-0_sp1.1 from training. Duration: 0.92725 2023-10-02 18:36:25,506 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_43831_210_2158_1_1533725502_5872791_610-129300-0_sp1.1 from training. Duration: 0.936375 2023-10-02 18:36:26,994 WARNING [train.py:1204] (2/4) Exclude cut with ID _54218_210_12210_1_1534413646388_4409259_321-23852-0_sp1.1 from training. Duration: 0.936375 2023-10-02 18:36:28,421 WARNING [train.py:1204] (2/4) Exclude cut with ID _39411_210_5986_1_1533538825030_3343649_173-238248-0 from training. Duration: 0.83 2023-10-02 18:36:28,428 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_405-339986-0_sp0.9 from training. Duration: 0.5666875 2023-10-02 18:36:28,458 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_2009_210_2071_1_1525481134_4588937_385-302100-0 from training. Duration: 0.87 2023-10-02 18:36:29,681 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_161-276996-0_sp1.1 from training. Duration: 0.863625 2023-10-02 18:36:31,841 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_177-58005-0 from training. Duration: 0.62 2023-10-02 18:36:33,244 WARNING [train.py:1204] (2/4) Exclude cut with ID _64639_210_5816_1_1535367619437_3528000_146-343734-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 18:36:34,562 WARNING [train.py:1204] (2/4) Exclude cut with ID _3845_210_1390_1_1526731326141_3523610_72-61441-0_sp1.1 from training. Duration: 0.92725 2023-10-02 18:36:40,899 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.0.layers.0.conv_module2.whiten, num_groups=1, num_channels=192, metric=10.55 vs. limit=15.0 2023-10-02 18:36:43,062 WARNING [train.py:1204] (2/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_991-125028-0_sp1.1 from training. Duration: 0.936375 2023-10-02 18:36:43,126 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7544_210_6753_1_1528591327_1471426_172-348777-0 from training. Duration: 0.66 2023-10-02 18:36:43,171 WARNING [train.py:1204] (2/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_416-257730-0_sp1.1 from training. Duration: 0.5909375 2023-10-02 18:36:44,607 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1113-7369-0_sp1.1 from training. Duration: 0.77275 2023-10-02 18:36:46,565 WARNING [train.py:1204] (2/4) Exclude cut with ID _40402_210_18219_1_1533522324115_2953150_242-76943-0_sp1.1 from training. Duration: 0.8545625 2023-10-02 18:36:51,345 WARNING [train.py:1204] (2/4) Exclude cut with ID _44718_210_5816_1_1534050051869_6469030_190-49648-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 18:36:52,895 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.balancer2.prob, batch_count=978780.0, ans=0.125 2023-10-02 18:36:54,089 WARNING [train.py:1204] (2/4) Exclude cut with ID _47251_210_18628_1_1533814063128_4137250_214-88076-0 from training. Duration: 0.88 2023-10-02 18:36:54,118 WARNING [train.py:1204] (2/4) Exclude cut with ID _5585_210_2667_1_1527418813324_8007340_89-332072-0_sp1.1 from training. Duration: 0.5818125 2023-10-02 18:36:54,161 WARNING [train.py:1204] (2/4) Exclude cut with ID _41817_210_18835_1_1533434477160_7423049_555-293448-0_sp1.1 from training. Duration: 0.72725 2023-10-02 18:36:55,460 INFO [train.py:1046] (2/4) Epoch 28, batch 3400, loss[loss=0.1681, simple_loss=0.253, pruned_loss=0.04164, over 24182.00 frames. ], tot_loss[loss=0.1671, simple_loss=0.2456, pruned_loss=0.04432, over 4731743.46 frames. ], batch size: 80, lr: 3.64e-03, grad_scale: 16.0 2023-10-02 18:36:55,587 WARNING [train.py:1204] (2/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_286-331314-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 18:36:55,654 WARNING [train.py:1204] (2/4) Exclude cut with ID _13921_210_9994_1_1530838702800_3003429_46-149444-0 from training. Duration: 0.94 2023-10-02 18:36:56,929 WARNING [train.py:1204] (2/4) Exclude cut with ID _44778_210_2054_1_1533798127203_7443090_768-43595-0_sp1.1 from training. Duration: 0.936375 2023-10-02 18:36:56,938 WARNING [train.py:1204] (2/4) Exclude cut with ID _35355_210_16040_1_1533212988977_4016229_74-190076-0 from training. Duration: 0.8 2023-10-02 18:36:57,040 WARNING [train.py:1204] (2/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_1007-316380-0_sp1.1 from training. Duration: 0.963625 2023-10-02 18:36:57,079 WARNING [train.py:1204] (2/4) Exclude cut with ID _23557_210_8750_1_1531890106370_6846255_150-283437-0_sp1.1 from training. Duration: 0.963625 2023-10-02 18:36:58,460 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_3474_210_2028_1_1526188513_4825246_145-309131-0_sp1.1 from training. Duration: 0.67275 2023-10-02 18:36:58,544 WARNING [train.py:1204] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_391-22148-0_sp1.1 from training. Duration: 0.7 2023-10-02 18:36:58,569 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_238-256668-0 from training. Duration: 0.69 2023-10-02 18:37:01,386 WARNING [train.py:1204] (2/4) Exclude cut with ID _8394_210_6702_1_1528590504316_3540189_372-198878-0 from training. Duration: 0.6 2023-10-02 18:37:01,399 WARNING [train.py:1204] (2/4) Exclude cut with ID ID0174W0006-766659 from training. Duration: 0.953 2023-10-02 18:37:01,605 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.nonlin_attention.balancer.max_positive, batch_count=978846.6666666666, ans=0.95 2023-10-02 18:37:03,071 WARNING [train.py:1204] (2/4) Exclude cut with ID _6274_210_2084_1_1527937583837_7494020_668-23019-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 18:37:07,113 WARNING [train.py:1204] (2/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_322-74328-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 18:37:07,120 WARNING [train.py:1204] (2/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_872-264525-0_sp1.1 from training. Duration: 0.5909375 2023-10-02 18:37:07,158 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_53378_210_17282_1_1534221071_5769509_641-286201-0_sp1.1 from training. Duration: 0.97275 2023-10-02 18:37:08,537 WARNING [train.py:1204] (2/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_199-295611-0_sp1.1 from training. Duration: 0.5 2023-10-02 18:37:12,101 WARNING [train.py:1204] (2/4) Exclude cut with ID _5250_210_5219_1_1527249561806_8692110_1270-317971-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 18:37:13,973 WARNING [train.py:1204] (2/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_784-213099-0 from training. Duration: 0.93 2023-10-02 18:37:18,023 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_115-233929-0_sp0.9 from training. Duration: 0.8 2023-10-02 18:37:19,411 WARNING [train.py:1204] (2/4) Exclude cut with ID _54871_210_22356_1_1534665798253_3856359_256-223809-0_sp1.1 from training. Duration: 0.97275 2023-10-02 18:37:19,445 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_20120_210_6753_1_1531643035_5482859_285-87781-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 18:37:20,884 WARNING [train.py:1204] (2/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_619-344499-0_sp1.1 from training. Duration: 0.57275 2023-10-02 18:37:24,997 WARNING [train.py:1204] (2/4) Exclude cut with ID _60229_210_27767_1_1534838406158_3655350_264-279573-0_sp0.9 from training. Duration: 0.92225 2023-10-02 18:37:29,742 WARNING [train.py:1204] (2/4) Exclude cut with ID _7719_210_3525_1_1528456153163_4296960_333-110399-0 from training. Duration: 0.96 2023-10-02 18:37:35,353 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_50423_210_13270_1_1534128598_5021734_759-90833-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 18:37:36,666 WARNING [train.py:1204] (2/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_151-254798-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 18:37:37,986 WARNING [train.py:1204] (2/4) Exclude cut with ID _3465_210_3525_1_1526175667060_3574049_450-203637-0 from training. Duration: 0.92 2023-10-02 18:37:38,025 WARNING [train.py:1204] (2/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_673-36456-0_sp1.1 from training. Duration: 0.936375 2023-10-02 18:37:39,768 WARNING [train.py:1204] (2/4) Exclude cut with ID _36538_210_9994_1_1533204020363_3795620_131-8685-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 18:37:39,816 WARNING [train.py:1204] (2/4) Exclude cut with ID _12556_210_8341_1_1530081024100_7327690_713-55626-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 18:37:39,858 WARNING [train.py:1204] (2/4) Exclude cut with ID _2322_210_2667_1_1525604548971_3656299_440-80494-0_sp0.9 from training. Duration: 0.97775 2023-10-02 18:37:43,258 WARNING [train.py:1204] (2/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_210-325081-0_sp1.1 from training. Duration: 0.97275 2023-10-02 18:37:49,064 WARNING [train.py:1204] (2/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_375-244976-0_sp0.9 from training. Duration: 0.8333125 2023-10-02 18:37:49,070 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_15517_210_6753_1_1530952293_1664795_119-31001-0_sp1.1 from training. Duration: 0.8545625 2023-10-02 18:37:53,226 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_41452_210_7291_1_1533437687_3941281_319-42326-0_sp1.1 from training. Duration: 0.92725 2023-10-02 18:37:54,672 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_372-173012-0 from training. Duration: 0.95 2023-10-02 18:37:55,880 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.378e+02 1.833e+02 2.007e+02 2.238e+02 3.330e+02, threshold=4.014e+02, percent-clipped=0.0 2023-10-02 18:37:56,365 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.ff3_skip_rate, batch_count=979113.3333333334, ans=0.0 2023-10-02 18:38:00,127 WARNING [train.py:1204] (2/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_41-31157-0_sp1.1 from training. Duration: 0.5181875 2023-10-02 18:38:04,731 WARNING [train.py:1204] (2/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_91-212486-0 from training. Duration: 0.68 2023-10-02 18:38:07,590 WARNING [train.py:1204] (2/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_1267-272693-0 from training. Duration: 0.73 2023-10-02 18:38:07,631 WARNING [train.py:1204] (2/4) Exclude cut with ID _13938_210_9988_1_1530518718978_3693960_152-67046-0_sp1.1 from training. Duration: 0.936375 2023-10-02 18:38:08,997 INFO [train.py:1046] (2/4) Epoch 28, batch 3450, loss[loss=0.156, simple_loss=0.241, pruned_loss=0.03554, over 24451.00 frames. ], tot_loss[loss=0.1666, simple_loss=0.2448, pruned_loss=0.04427, over 4733104.17 frames. ], batch size: 63, lr: 3.64e-03, grad_scale: 8.0 2023-10-02 18:38:10,421 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_4528_210_3273_1_1527076654_3276777_342-168068-0_sp1.1 from training. Duration: 0.7909375 2023-10-02 18:38:10,438 WARNING [train.py:1204] (2/4) Exclude cut with ID _7606_210_4915_1_1528894253277_5080690_127-177761-0 from training. Duration: 0.64 2023-10-02 18:38:11,764 WARNING [train.py:1204] (2/4) Exclude cut with ID _45329_210_7005_1_1534157951211_6774339_335-137972-0_sp1.1 from training. Duration: 0.92725 2023-10-02 18:38:17,027 WARNING [train.py:1204] (2/4) Exclude cut with ID _5166_210_2084_1_1527073197160_4087993_331-117829-0_sp1.1 from training. Duration: 0.563625 2023-10-02 18:38:20,017 WARNING [train.py:1204] (2/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_125-125818-0_sp1.1 from training. Duration: 0.87275 2023-10-02 18:38:21,307 WARNING [train.py:1204] (2/4) Exclude cut with ID _6789_210_1534_1_1527854471671_2870519_48-294897-0_sp1.1 from training. Duration: 0.963625 2023-10-02 18:38:21,384 WARNING [train.py:1204] (2/4) Exclude cut with ID _6694_210_4599_1_1527852637490_3965529_116-109398-0_sp1.1 from training. Duration: 0.663625 2023-10-02 18:38:21,393 WARNING [train.py:1204] (2/4) Exclude cut with ID _52892_210_6721_1_1534378941259_4124129_222-172685-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 18:38:22,823 WARNING [train.py:1204] (2/4) Exclude cut with ID _50655_210_18628_1_1534229942193_3772154_320-132275-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 18:38:28,384 WARNING [train.py:1204] (2/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_76-311963-0 from training. Duration: 0.7 2023-10-02 18:38:35,801 WARNING [train.py:1204] (2/4) Exclude cut with ID _6274_210_2084_1_1527937583837_7494020_32-205288-0 from training. Duration: 0.78 2023-10-02 18:38:35,826 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_86-16881-0_sp0.9 from training. Duration: 0.6444375 2023-10-02 18:38:35,979 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.bypass_mid.scale_min, batch_count=979246.6666666666, ans=0.2 2023-10-02 18:38:37,191 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_271-297505-0_sp1.1 from training. Duration: 0.9 2023-10-02 18:38:38,722 WARNING [train.py:1204] (2/4) Exclude cut with ID _57572_210_5517_1_1534921219624_3809484_187-295293-0_sp1.1 from training. Duration: 0.963625 2023-10-02 18:38:45,528 WARNING [train.py:1204] (2/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_563-329943-0 from training. Duration: 0.99 2023-10-02 18:38:45,588 WARNING [train.py:1204] (2/4) Exclude cut with ID _7160_210_1876_1_1527940923231_3703799_302-150728-0_sp0.9 from training. Duration: 0.8666875 2023-10-02 18:38:47,046 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.0.conv_module1.balancer1.prob, batch_count=979313.3333333334, ans=0.125 2023-10-02 18:38:49,632 WARNING [train.py:1204] (2/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_926-7526-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 18:38:49,653 WARNING [train.py:1204] (2/4) Exclude cut with ID _62574_210_2655_1_1535180407058_3933249_467-22348-0_sp1.1 from training. Duration: 0.936375 2023-10-02 18:38:51,160 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.conv_skip_rate, batch_count=979313.3333333334, ans=0.0 2023-10-02 18:38:52,379 WARNING [train.py:1204] (2/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_280-161633-0_sp0.9 from training. Duration: 0.811125 2023-10-02 18:38:53,718 WARNING [train.py:1204] (2/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_385-193052-0_sp1.1 from training. Duration: 0.7818125 2023-10-02 18:38:55,087 WARNING [train.py:1204] (2/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_444-28041-0 from training. Duration: 0.85 2023-10-02 18:38:55,092 WARNING [train.py:1204] (2/4) Exclude cut with ID _66070_210_7640_1_1535441393096_7805310_341-249237-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 18:38:56,508 WARNING [train.py:1204] (2/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_425-269826-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 18:38:57,945 WARNING [train.py:1204] (2/4) Exclude cut with ID _53950_210_5816_1_1534417264919_3862951_124-185597-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 18:39:00,689 WARNING [train.py:1204] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_380-204756-0 from training. Duration: 0.86 2023-10-02 18:39:03,669 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.1.feed_forward1.out_proj.dropout_p, batch_count=979380.0, ans=0.1 2023-10-02 18:39:04,979 WARNING [train.py:1204] (2/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_282-257791-0_sp0.9 from training. Duration: 0.9555625 2023-10-02 18:39:08,232 WARNING [train.py:1204] (2/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_372-131163-0_sp1.1 from training. Duration: 0.8545625 2023-10-02 18:39:09,617 WARNING [train.py:1204] (2/4) Exclude cut with ID _28317_210_12742_1_1532480302381_8510325_447-233513-0_sp1.1 from training. Duration: 0.963625 2023-10-02 18:39:12,923 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_54-118333-0_sp1.1 from training. Duration: 0.97275 2023-10-02 18:39:19,170 WARNING [train.py:1204] (2/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_388-230239-0_sp1.1 from training. Duration: 0.963625 2023-10-02 18:39:19,188 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_40047_210_2290_1_1533290519_2374311_141-54639-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 18:39:20,498 WARNING [train.py:1204] (2/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_91-267696-0_sp0.9 from training. Duration: 0.9666875 2023-10-02 18:39:21,813 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_14056_210_4163_1_1530604483_7572827_532-137847-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 18:39:23,140 INFO [train.py:1046] (2/4) Epoch 28, batch 3500, loss[loss=0.1394, simple_loss=0.2034, pruned_loss=0.03775, over 23586.00 frames. ], tot_loss[loss=0.1659, simple_loss=0.2439, pruned_loss=0.04395, over 4734892.08 frames. ], batch size: 256, lr: 3.64e-03, grad_scale: 8.0 2023-10-02 18:39:26,027 WARNING [train.py:1204] (2/4) Exclude cut with ID _8064_210_5399_1_1528372299291_4282973_155-283231-0_sp1.1 from training. Duration: 0.97275 2023-10-02 18:39:27,656 INFO [scaling.py:1118] (2/4) WithLoss: name=encoder.encoders.3.encoder.layers.3.self_attn_weights, loss-sum=0.000e+00 2023-10-02 18:39:28,089 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.3.encoder.layers.1.conv_module1.whiten, num_groups=1, num_channels=512, metric=2.48 vs. limit=15.0 2023-10-02 18:39:28,811 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_2245_210_2715_1_1525511548_3540254_310-52483-0_sp1.1 from training. Duration: 0.8 2023-10-02 18:39:28,865 WARNING [train.py:1204] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_185-174805-0 from training. Duration: 0.68 2023-10-02 18:39:31,474 WARNING [train.py:1204] (2/4) Exclude cut with ID _15012_210_1968_1_1530856820429_5095350_662-210469-0_sp1.1 from training. Duration: 0.5181875 2023-10-02 18:39:35,398 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_426-313557-0_sp0.9 from training. Duration: 0.588875 2023-10-02 18:39:36,933 WARNING [train.py:1204] (2/4) Exclude cut with ID _42326_210_7770_1_1533814282531_3707269_407-204343-0_sp1.1 from training. Duration: 0.97275 2023-10-02 18:39:36,941 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_258-310570-0 from training. Duration: 0.82 2023-10-02 18:39:41,604 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_4737_210_2040_1_1526998689_3293008_378-178650-0_sp1.1 from training. Duration: 0.863625 2023-10-02 18:39:42,934 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_47896_210_20508_1_1534492518_5958868_432-327451-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 18:39:44,345 WARNING [train.py:1204] (2/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_188-114464-0_sp1.1 from training. Duration: 0.7454375 2023-10-02 18:39:44,352 WARNING [train.py:1204] (2/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_798-106179-0_sp1.1 from training. Duration: 0.7545625 2023-10-02 18:39:44,392 WARNING [train.py:1204] (2/4) Exclude cut with ID _14368_210_4220_1_1530532714870_3595529_143-117421-0_sp1.1 from training. Duration: 0.52725 2023-10-02 18:39:46,246 WARNING [train.py:1204] (2/4) Exclude cut with ID _67499_210_17282_1_1535509805220_3692430_361-78786-0_sp1.1 from training. Duration: 0.963625 2023-10-02 18:39:47,464 WARNING [train.py:1204] (2/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_712-342563-0_sp1.1 from training. Duration: 0.8818125 2023-10-02 18:39:47,511 WARNING [train.py:1204] (2/4) Exclude cut with ID _9235_210_5219_1_1528891154253_8046120_387-159008-0 from training. Duration: 0.86 2023-10-02 18:39:49,511 WARNING [train.py:1204] (2/4) Exclude cut with ID _33640_210_2655_1_1533117605479_4855469_959-18820-0_sp1.1 from training. Duration: 0.963625 2023-10-02 18:39:50,173 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_12868_210_6753_1_1530702479_3610564_133-564-0_sp0.9 from training. Duration: 0.87775 2023-10-02 18:39:51,527 WARNING [train.py:1204] (2/4) Exclude cut with ID _23514_210_1389_1_1531832139785_3031439_12-29287-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 18:39:51,777 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.conv_module2.balancer2.prob, batch_count=979646.6666666666, ans=0.125 2023-10-02 18:39:55,606 WARNING [train.py:1204] (2/4) Exclude cut with ID _23514_210_1389_1_1531832139785_3031439_489-185052-0_sp1.1 from training. Duration: 0.963625 2023-10-02 18:39:55,682 WARNING [train.py:1204] (2/4) Exclude cut with ID _5444_210_2151_1_1527418808464_5312210_634-194174-0 from training. Duration: 0.97 2023-10-02 18:39:55,698 WARNING [train.py:1204] (2/4) Exclude cut with ID _6275_210_2084_1_1528110062213_7669049_91-26919-0_sp1.1 from training. Duration: 0.8818125 2023-10-02 18:39:58,497 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_37351_210_7925_1_1533012931_3812113_516-82303-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 18:39:58,600 WARNING [train.py:1204] (2/4) Exclude cut with ID _41314_210_12651_1_1533459476543_3766460_80-74184-0_sp0.9 from training. Duration: 0.92225 2023-10-02 18:39:59,924 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_9460_210_4211_1_1528954587_10049667_495-202963-0_sp1.1 from training. Duration: 0.963625 2023-10-02 18:40:01,365 WARNING [train.py:1204] (2/4) Exclude cut with ID _14632_210_9988_1_1530691279392_3923200_332-105904-0_sp1.1 from training. Duration: 0.8181875 2023-10-02 18:40:01,380 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_40764_210_1925_1_1533882359_3752345_493-167886-0_sp1.1 from training. Duration: 0.92725 2023-10-02 18:40:01,495 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_196-289637-0 from training. Duration: 0.75 2023-10-02 18:40:02,802 WARNING [train.py:1204] (2/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_26-175553-0 from training. Duration: 0.61 2023-10-02 18:40:04,083 WARNING [train.py:1204] (2/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_890-5117-0 from training. Duration: 0.78 2023-10-02 18:40:04,138 WARNING [train.py:1204] (2/4) Exclude cut with ID _41314_210_12651_1_1533459476543_3766460_206-306860-0_sp1.1 from training. Duration: 0.92725 2023-10-02 18:40:06,769 WARNING [train.py:1204] (2/4) Exclude cut with ID _25882_210_3036_1_1532775665815_6479510_570-79824-0_sp1.1 from training. Duration: 0.963625 2023-10-02 18:40:08,051 WARNING [train.py:1204] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_731-2428-0_sp1.1 from training. Duration: 0.7545625 2023-10-02 18:40:08,074 WARNING [train.py:1204] (2/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_138-154812-0_sp0.9 from training. Duration: 0.8666875 2023-10-02 18:40:11,518 WARNING [train.py:1204] (2/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_178-39722-0_sp0.9 from training. Duration: 0.5444375 2023-10-02 18:40:12,822 WARNING [train.py:1204] (2/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_1285-256622-0_sp0.9 from training. Duration: 0.9333125 2023-10-02 18:40:18,032 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_41428_210_2161_1_1533774184_3992532_65-271323-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 18:40:19,370 WARNING [train.py:1204] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_308-161067-0 from training. Duration: 0.66 2023-10-02 18:40:19,376 WARNING [train.py:1204] (2/4) Exclude cut with ID _3067_210_2667_1_1526212974640_4503168_459-173867-0 from training. Duration: 0.88 2023-10-02 18:40:19,377 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_2221_210_1988_1_1525597051_4935541_415-296882-0_sp1.1 from training. Duration: 0.7 2023-10-02 18:40:23,420 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.513e+02 1.910e+02 2.082e+02 2.470e+02 3.693e+02, threshold=4.164e+02, percent-clipped=0.0 2023-10-02 18:40:23,496 WARNING [train.py:1204] (2/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_354-192266-0_sp1.1 from training. Duration: 0.936375 2023-10-02 18:40:23,562 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_258-310570-0_sp0.9 from training. Duration: 0.911125 2023-10-02 18:40:24,921 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_548-141039-0_sp1.1 from training. Duration: 0.963625 2023-10-02 18:40:27,674 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_15101_210_7927_1_1530786415_5033274_320-327527-0 from training. Duration: 0.96 2023-10-02 18:40:28,943 WARNING [train.py:1204] (2/4) Exclude cut with ID _2895_210_2477_1_1525956785794_4063750_289-11475-0_sp0.9 from training. Duration: 0.911125 2023-10-02 18:40:29,213 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.conv_module2.balancer1.prob, batch_count=979780.0, ans=0.125 2023-10-02 18:40:30,349 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7543_210_6753_1_1528543318_4552_1-85072-0_sp1.1 from training. Duration: 0.7545625 2023-10-02 18:40:30,416 WARNING [train.py:1204] (2/4) Exclude cut with ID _64639_210_5816_1_1535367619437_3528000_461-329156-0 from training. Duration: 0.96 2023-10-02 18:40:31,877 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_211-339622-0 from training. Duration: 0.45 2023-10-02 18:40:33,312 WARNING [train.py:1204] (2/4) Exclude cut with ID _20455_210_5219_1_1531565996775_8098100_402-112739-0_sp1.1 from training. Duration: 0.963625 2023-10-02 18:40:35,943 INFO [train.py:1046] (2/4) Epoch 28, batch 3550, loss[loss=0.1569, simple_loss=0.2415, pruned_loss=0.03615, over 24511.00 frames. ], tot_loss[loss=0.1654, simple_loss=0.2434, pruned_loss=0.04375, over 4734308.56 frames. ], batch size: 63, lr: 3.64e-03, grad_scale: 8.0 2023-10-02 18:40:35,984 WARNING [train.py:1204] (2/4) Exclude cut with ID _53489_210_6228_1_1534298357169_3835970_256-115565-0_sp1.1 from training. Duration: 0.936375 2023-10-02 18:40:36,000 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_26326_210_7925_1_1533002493_3592406_360-305260-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 18:40:36,029 WARNING [train.py:1204] (2/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_18-204383-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 18:40:36,227 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.conv_module1.balancer1.min_positive, batch_count=979846.6666666666, ans=0.025 2023-10-02 18:40:38,698 WARNING [train.py:1204] (2/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_306-232480-0_sp1.1 from training. Duration: 0.87275 2023-10-02 18:40:46,948 WARNING [train.py:1204] (2/4) Exclude cut with ID _41161_210_20034_1_1533431249657_3555314_116-299186-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 18:40:50,099 WARNING [train.py:1204] (2/4) Exclude cut with ID _13913_210_9994_1_1530579615187_3383897_473-277682-0_sp1.1 from training. Duration: 0.3545625 2023-10-02 18:40:51,789 INFO [scaling.py:1118] (2/4) WithLoss: name=encoder.encoders.5.encoder.layers.0.self_attn_weights, loss-sum=0.000e+00 2023-10-02 18:40:52,886 WARNING [train.py:1204] (2/4) Exclude cut with ID _58895_210_8750_1_1535184081320_3649089_49-225702-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 18:40:54,239 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_3080_210_2071_1_1526086102_4365166_312-8726-0_sp1.1 from training. Duration: 0.82725 2023-10-02 18:40:55,586 WARNING [train.py:1204] (2/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_489-190175-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 18:40:56,932 WARNING [train.py:1204] (2/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_345-95717-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 18:40:56,950 WARNING [train.py:1204] (2/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_85-138257-0_sp1.1 from training. Duration: 0.6454375 2023-10-02 18:40:59,706 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7046_210_6753_1_1527895337_6920917_783-278109-0_sp1.1 from training. Duration: 0.7 2023-10-02 18:41:00,984 WARNING [train.py:1204] (2/4) Exclude cut with ID _3465_210_3525_1_1526175667060_3574049_450-203637-0_sp1.1 from training. Duration: 0.836375 2023-10-02 18:41:01,030 WARNING [train.py:1204] (2/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_416-132893-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 18:41:01,055 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_2655_210_2087_1_1525688959_3897400_437-337690-0_sp0.9 from training. Duration: 0.7 2023-10-02 18:41:02,495 WARNING [train.py:1204] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_700-183549-0_sp1.1 from training. Duration: 0.6181875 2023-10-02 18:41:07,963 WARNING [train.py:1204] (2/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_191-59784-0_sp1.1 from training. Duration: 0.8 2023-10-02 18:41:07,974 WARNING [train.py:1204] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_218-295097-0_sp1.1 from training. Duration: 0.7 2023-10-02 18:41:09,555 WARNING [train.py:1204] (2/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_310-109461-0_sp1.1 from training. Duration: 0.72725 2023-10-02 18:41:09,565 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_33348_210_5632_1_1533086912_3324030_131-321149-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 18:41:10,866 WARNING [train.py:1204] (2/4) Exclude cut with ID _64135_210_2253_1_1535677267876_7608972_133-188266-0_sp1.1 from training. Duration: 0.736375 2023-10-02 18:41:10,899 WARNING [train.py:1204] (2/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_505-205270-0 from training. Duration: 0.38 2023-10-02 18:41:10,910 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_67223_210_4238_1_1535612120_2537431_102-88248-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 18:41:12,302 WARNING [train.py:1204] (2/4) Exclude cut with ID _54871_210_22356_1_1534665798253_3856359_60-235433-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 18:41:14,100 WARNING [train.py:1204] (2/4) Exclude cut with ID _5189_210_4557_1_1527248319428_3769229_199-53526-0_sp1.1 from training. Duration: 0.3818125 2023-10-02 18:41:20,647 WARNING [train.py:1204] (2/4) Exclude cut with ID _45329_210_7005_1_1534157951211_6774339_439-90979-0_sp1.1 from training. Duration: 0.963625 2023-10-02 18:41:20,704 WARNING [train.py:1204] (2/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_1162-132880-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 18:41:20,773 WARNING [train.py:1204] (2/4) Exclude cut with ID _56423_210_5854_1_1534896061049_3165969_220-22317-0_sp1.1 from training. Duration: 0.963625 2023-10-02 18:41:20,940 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.balancer2.prob, batch_count=980046.6666666666, ans=0.125 2023-10-02 18:41:23,652 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.nonlin_attention.whiten2.whitening_limit, batch_count=980046.6666666666, ans=15.0 2023-10-02 18:41:24,074 WARNING [train.py:1204] (2/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_909-64151-0 from training. Duration: 0.94 2023-10-02 18:41:25,392 WARNING [train.py:1204] (2/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_465-156500-0_sp0.9 from training. Duration: 0.8 2023-10-02 18:41:26,764 WARNING [train.py:1204] (2/4) Exclude cut with ID _44304_210_15374_1_1533639352765_4613129_302-22084-0 from training. Duration: 0.86 2023-10-02 18:41:26,807 WARNING [train.py:1204] (2/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_663-166973-0_sp1.1 from training. Duration: 0.72725 2023-10-02 18:41:29,429 WARNING [train.py:1204] (2/4) Exclude cut with ID _45329_210_7005_1_1534157951211_6774339_503-287883-0_sp0.9 from training. Duration: 0.97775 2023-10-02 18:41:29,445 WARNING [train.py:1204] (2/4) Exclude cut with ID _60057_210_10640_1_1534903065793_3333150_362-230377-0_sp0.9 from training. Duration: 0.9666875 2023-10-02 18:41:29,679 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=980046.6666666666, ans=0.1 2023-10-02 18:41:32,161 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_4183_210_2087_1_1526729100_1869838_143-280962-0 from training. Duration: 0.56 2023-10-02 18:41:33,518 WARNING [train.py:1204] (2/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_281-1313-0_sp1.1 from training. Duration: 0.936375 2023-10-02 18:41:35,197 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.1.bypass_mid.scale_min, batch_count=980113.3333333334, ans=0.2 2023-10-02 18:41:37,786 WARNING [train.py:1204] (2/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_64-215118-0_sp1.1 from training. Duration: 0.936375 2023-10-02 18:41:39,264 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_2822_210_2715_1_1526112586_1999587_165-36656-0 from training. Duration: 0.89 2023-10-02 18:41:39,312 WARNING [train.py:1204] (2/4) Exclude cut with ID _22961_210_8254_1_1531976366672_4278180_402-208830-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 18:41:43,312 WARNING [train.py:1204] (2/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_98-193494-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 18:41:46,510 WARNING [train.py:1204] (2/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_383-285196-0 from training. Duration: 0.97 2023-10-02 18:41:48,050 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.bypass.skip_rate, batch_count=980180.0, ans=0.04949747468305833 2023-10-02 18:41:49,882 INFO [train.py:1046] (2/4) Epoch 28, batch 3600, loss[loss=0.1776, simple_loss=0.2606, pruned_loss=0.04731, over 24373.00 frames. ], tot_loss[loss=0.1647, simple_loss=0.2434, pruned_loss=0.043, over 4755008.67 frames. ], batch size: 77, lr: 3.64e-03, grad_scale: 16.0 2023-10-02 18:41:54,407 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6767_210_3273_1_1527855783_1718932_142-127132-0 from training. Duration: 0.95 2023-10-02 18:41:54,440 WARNING [train.py:1204] (2/4) Exclude cut with ID _39867_210_9826_1_1533553222030_9344259_1018-339635-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 18:41:55,728 WARNING [train.py:1204] (2/4) Exclude cut with ID _14603_210_4220_1_1530615688441_3097319_274-274798-0_sp1.1 from training. Duration: 0.8909375 2023-10-02 18:41:57,284 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.conv_module2.balancer2.prob, batch_count=980180.0, ans=0.125 2023-10-02 18:41:58,902 WARNING [train.py:1204] (2/4) Exclude cut with ID _2334_210_2667_1_1525608238880_4705129_376-263393-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 18:41:59,158 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.balancer2.prob, batch_count=980180.0, ans=0.125 2023-10-02 18:42:00,256 WARNING [train.py:1204] (2/4) Exclude cut with ID _29303_210_2575_1_1532392237303_7410649_363-344675-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 18:42:00,334 WARNING [train.py:1204] (2/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_58-68956-0_sp1.1 from training. Duration: 0.7545625 2023-10-02 18:42:03,220 WARNING [train.py:1204] (2/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_709-311571-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 18:42:04,560 WARNING [train.py:1204] (2/4) Exclude cut with ID _46067_210_4220_1_1534125571066_3307450_136-257407-0_sp1.1 from training. Duration: 0.963625 2023-10-02 18:42:05,969 WARNING [train.py:1204] (2/4) Exclude cut with ID _6493_210_1747_1_1528459210424_4012470_324-323854-0_sp1.1 from training. Duration: 0.836375 2023-10-02 18:42:06,032 WARNING [train.py:1204] (2/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_234-260682-0_sp1.1 from training. Duration: 0.763625 2023-10-02 18:42:06,084 WARNING [train.py:1204] (2/4) Exclude cut with ID _36634_210_1794_1_1533296352450_1399884_55-119451-0_sp1.1 from training. Duration: 0.963625 2023-10-02 18:42:06,088 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_40047_210_2290_1_1533290519_2374311_195-165693-0 from training. Duration: 0.89 2023-10-02 18:42:10,293 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_5522_210_3273_1_1527249606_3909096_366-172508-0_sp0.9 from training. Duration: 0.7333125 2023-10-02 18:42:10,375 WARNING [train.py:1204] (2/4) Exclude cut with ID _21878_210_3525_1_1531738772002_7264330_187-10504-0_sp1.1 from training. Duration: 0.963625 2023-10-02 18:42:11,832 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.1.bypass_mid.scale_min, batch_count=980246.6666666666, ans=0.2 2023-10-02 18:42:12,995 WARNING [train.py:1204] (2/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_282-257791-0_sp1.1 from training. Duration: 0.7818125 2023-10-02 18:42:16,058 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_43831_210_2158_1_1533725502_5872791_467-288776-0_sp1.1 from training. Duration: 0.92725 2023-10-02 18:42:17,999 WARNING [train.py:1204] (2/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_138-154812-0_sp1.1 from training. Duration: 0.7090625 2023-10-02 18:42:18,046 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_1154-185930-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 18:42:18,063 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_2655_210_2087_1_1525688959_3897400_52-279899-0 from training. Duration: 0.69 2023-10-02 18:42:18,339 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.conv_skip_rate, batch_count=980313.3333333334, ans=0.0 2023-10-02 18:42:19,941 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_141-89113-0_sp1.1 from training. Duration: 0.7818125 2023-10-02 18:42:21,420 WARNING [train.py:1204] (2/4) Exclude cut with ID _55134_210_21982_1_1534590028377_3882170_297-47310-0_sp1.1 from training. Duration: 0.963625 2023-10-02 18:42:22,711 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_486-151131-0_sp1.1 from training. Duration: 0.77275 2023-10-02 18:42:22,836 WARNING [train.py:1204] (2/4) Exclude cut with ID _44778_210_2054_1_1533798127203_7443090_21-232980-0_sp1.1 from training. Duration: 0.936375 2023-10-02 18:42:25,498 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6165_210_3528_1_1527590982_4379841_338-106380-0_sp1.1 from training. Duration: 0.92725 2023-10-02 18:42:26,845 WARNING [train.py:1204] (2/4) Exclude cut with ID _55162_210_17071_1_1534408248583_3753480_355-257218-0_sp1.1 from training. Duration: 0.97275 2023-10-02 18:42:28,742 WARNING [train.py:1204] (2/4) Exclude cut with ID _2895_210_2477_1_1525956785794_4063750_289-11475-0 from training. Duration: 0.82 2023-10-02 18:42:33,039 WARNING [train.py:1204] (2/4) Exclude cut with ID _16756_210_4915_1_1531047803763_7104259_839-183431-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 18:42:34,463 WARNING [train.py:1204] (2/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_427-208901-0_sp0.9 from training. Duration: 0.7555625 2023-10-02 18:42:34,505 WARNING [train.py:1204] (2/4) Exclude cut with ID _36512_210_6987_1_1533033143998_3126069_181-306500-0 from training. Duration: 0.95 2023-10-02 18:42:36,041 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.feed_forward2.hidden_balancer.prob, batch_count=980380.0, ans=0.125 2023-10-02 18:42:39,793 WARNING [train.py:1204] (2/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_28-70987-0_sp1.1 from training. Duration: 0.7909375 2023-10-02 18:42:45,296 WARNING [train.py:1204] (2/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_590-58996-0_sp1.1 from training. Duration: 0.936375 2023-10-02 18:42:48,648 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_48730_210_5399_1_1533885886_2212769_108-47561-0_sp1.1 from training. Duration: 0.936375 2023-10-02 18:42:50,298 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.559e+02 1.786e+02 1.946e+02 2.303e+02 3.332e+02, threshold=3.893e+02, percent-clipped=0.0 2023-10-02 18:42:54,564 WARNING [train.py:1204] (2/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_401-158239-0_sp0.9 from training. Duration: 0.788875 2023-10-02 18:42:54,584 WARNING [train.py:1204] (2/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_278-154492-0_sp1.1 from training. Duration: 0.6909375 2023-10-02 18:42:54,589 WARNING [train.py:1204] (2/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_137-116442-0 from training. Duration: 0.78 2023-10-02 18:42:55,949 WARNING [train.py:1204] (2/4) Exclude cut with ID _3541_210_2477_1_1526475408709_3263973_189-317158-0 from training. Duration: 0.9 2023-10-02 18:42:57,875 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_47896_210_20508_1_1534492518_5958868_558-314572-0 from training. Duration: 0.97 2023-10-02 18:42:59,325 WARNING [train.py:1204] (2/4) Exclude cut with ID _66070_210_7640_1_1535441393096_7805310_290-40284-0_sp1.1 from training. Duration: 0.97275 2023-10-02 18:43:00,707 WARNING [train.py:1204] (2/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_110-224688-0_sp1.1 from training. Duration: 0.8818125 2023-10-02 18:43:00,789 WARNING [train.py:1204] (2/4) Exclude cut with ID _7719_210_3525_1_1528456153163_4296960_88-250259-0 from training. Duration: 0.84 2023-10-02 18:43:02,103 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8453_210_4211_1_1528502940_8562730_1157-114102-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 18:43:02,131 WARNING [train.py:1204] (2/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_317-175062-0_sp1.1 from training. Duration: 0.7454375 2023-10-02 18:43:02,137 WARNING [train.py:1204] (2/4) Exclude cut with ID _29857_210_20034_1_1532395886753_3798160_254-347254-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 18:43:03,427 INFO [train.py:1046] (2/4) Epoch 28, batch 3650, loss[loss=0.1567, simple_loss=0.2293, pruned_loss=0.0421, over 24435.00 frames. ], tot_loss[loss=0.1654, simple_loss=0.2439, pruned_loss=0.04342, over 4748572.43 frames. ], batch size: 58, lr: 3.64e-03, grad_scale: 16.0 2023-10-02 18:43:03,551 WARNING [train.py:1204] (2/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_28-70987-0 from training. Duration: 0.87 2023-10-02 18:43:04,860 WARNING [train.py:1204] (2/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_321-335475-0 from training. Duration: 0.45 2023-10-02 18:43:09,037 WARNING [train.py:1204] (2/4) Exclude cut with ID _40901_210_5665_1_1533363789958_3856130_356-107330-0_sp1.1 from training. Duration: 0.936375 2023-10-02 18:43:09,102 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_3780_210_2087_1_1526467389_2145572_176-107140-0 from training. Duration: 0.69 2023-10-02 18:43:11,889 WARNING [train.py:1204] (2/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_80-135200-0 from training. Duration: 0.79 2023-10-02 18:43:13,290 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_33348_210_5632_1_1533086912_3324030_85-28583-0_sp0.9 from training. Duration: 0.911125 2023-10-02 18:43:17,847 WARNING [train.py:1204] (2/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_29-293896-0 from training. Duration: 0.61 2023-10-02 18:43:19,723 WARNING [train.py:1204] (2/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_194-252800-0 from training. Duration: 0.97 2023-10-02 18:43:23,842 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_360-52446-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 18:43:23,844 WARNING [train.py:1204] (2/4) Exclude cut with ID _9389_210_1664_1_1529217337301_5289269_289-213417-0_sp0.9 from training. Duration: 0.988875 2023-10-02 18:43:23,895 WARNING [train.py:1204] (2/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_143-86344-0_sp0.9 from training. Duration: 0.8555625 2023-10-02 18:43:26,688 WARNING [train.py:1204] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_520-126729-0_sp0.9 from training. Duration: 0.77775 2023-10-02 18:43:26,721 WARNING [train.py:1204] (2/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_76-308663-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 18:43:28,534 WARNING [train.py:1204] (2/4) Exclude cut with ID _8021_210_7994_1_1528376451358_3653069_100-140231-0 from training. Duration: 0.67 2023-10-02 18:43:28,594 WARNING [train.py:1204] (2/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_387-131538-0_sp0.9 from training. Duration: 0.9 2023-10-02 18:43:29,865 WARNING [train.py:1204] (2/4) Exclude cut with ID _6275_210_2084_1_1528110062213_7669049_365-182054-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 18:43:29,908 WARNING [train.py:1204] (2/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_345-340690-0 from training. Duration: 0.93 2023-10-02 18:43:29,992 WARNING [train.py:1204] (2/4) Exclude cut with ID _5409_210_1388_1_1527292850189_5865191_178-127970-0_sp0.9 from training. Duration: 0.6333125 2023-10-02 18:43:31,267 WARNING [train.py:1204] (2/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_352-12734-0_sp1.1 from training. Duration: 0.92725 2023-10-02 18:43:31,278 WARNING [train.py:1204] (2/4) Exclude cut with ID _65134_210_14763_1_1535335393985_3505368_121-129373-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 18:43:32,685 WARNING [train.py:1204] (2/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_557-324258-0_sp1.1 from training. Duration: 0.736375 2023-10-02 18:43:35,396 WARNING [train.py:1204] (2/4) Exclude cut with ID _48678_210_5663_1_1533988830947_3769199_306-255886-0 from training. Duration: 0.77 2023-10-02 18:43:36,619 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7544_210_6753_1_1528587000_691468_87-178599-0 from training. Duration: 0.87 2023-10-02 18:43:38,007 WARNING [train.py:1204] (2/4) Exclude cut with ID _2941_210_2577_1_1526199673150_2489759_208-236402-0_sp1.1 from training. Duration: 0.963625 2023-10-02 18:43:38,140 WARNING [train.py:1204] (2/4) Exclude cut with ID _38670_210_15379_1_1533448810659_4391059_65-169397-0 from training. Duration: 0.97 2023-10-02 18:43:40,849 WARNING [train.py:1204] (2/4) Exclude cut with ID _30304_210_6319_1_1532476827261_7582060_306-328175-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 18:43:40,866 WARNING [train.py:1204] (2/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_212-149947-0_sp1.1 from training. Duration: 0.663625 2023-10-02 18:43:41,067 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.ff2_skip_rate, batch_count=980646.6666666666, ans=0.0 2023-10-02 18:43:45,108 WARNING [train.py:1204] (2/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_321-22821-0_sp1.1 from training. Duration: 0.8090625 2023-10-02 18:43:47,152 WARNING [train.py:1204] (2/4) Exclude cut with ID _44010_210_4525_1_1533884262964_4013620_483-69110-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 18:43:47,160 WARNING [train.py:1204] (2/4) Exclude cut with ID _14922_210_6228_1_1530707435273_3693310_38-187851-0_sp1.1 from training. Duration: 0.563625 2023-10-02 18:43:49,134 WARNING [train.py:1204] (2/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_129-337802-0_sp0.9 from training. Duration: 0.8 2023-10-02 18:43:49,200 WARNING [train.py:1204] (2/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_268-87909-0_sp1.1 from training. Duration: 0.9 2023-10-02 18:43:51,961 WARNING [train.py:1204] (2/4) Exclude cut with ID _21878_210_3525_1_1531738772002_7264330_559-184862-0_sp1.1 from training. Duration: 0.8545625 2023-10-02 18:43:53,460 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_46032_210_14656_1_1533781412_4507969_237-120766-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 18:43:53,554 WARNING [train.py:1204] (2/4) Exclude cut with ID _28427_210_4220_1_1532581240967_3061069_330-170906-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 18:43:54,836 WARNING [train.py:1204] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_401-51578-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 18:43:56,773 WARNING [train.py:1204] (2/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_209-205365-0_sp1.1 from training. Duration: 0.7181875 2023-10-02 18:43:57,006 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.bypass.scale_min, batch_count=980713.3333333334, ans=0.2 2023-10-02 18:43:58,243 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_23801_210_16223_1_1532066392_3294657_57-242170-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 18:43:59,592 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_127-37371-0_sp1.1 from training. Duration: 0.936375 2023-10-02 18:44:06,501 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9171W0019-578905 from training. Duration: 0.896 2023-10-02 18:44:07,957 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_24605_210_1794_1_1532047863_4149755_349-49056-0_sp1.1 from training. Duration: 0.92725 2023-10-02 18:44:07,967 WARNING [train.py:1204] (2/4) Exclude cut with ID _12302_210_5219_1_1529924446466_8107280_897-21237-0_sp1.1 from training. Duration: 0.936375 2023-10-02 18:44:09,323 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_9460_210_4211_1_1528954587_10049667_480-260269-0_sp0.9 from training. Duration: 0.62225 2023-10-02 18:44:10,625 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_348-152548-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 18:44:10,700 WARNING [train.py:1204] (2/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_301-330455-0_sp0.9 from training. Duration: 0.888875 2023-10-02 18:44:12,201 WARNING [train.py:1204] (2/4) Exclude cut with ID _61008_210_10633_1_1534852769198_6974370_345-91705-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 18:44:13,617 WARNING [train.py:1204] (2/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_304-273433-0 from training. Duration: 0.99 2023-10-02 18:44:13,620 WARNING [train.py:1204] (2/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_359-345972-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 18:44:13,820 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.balancer1.prob, batch_count=980780.0, ans=0.125 2023-10-02 18:44:16,236 INFO [train.py:1046] (2/4) Epoch 28, batch 3700, loss[loss=0.212, simple_loss=0.2763, pruned_loss=0.07384, over 19208.00 frames. ], tot_loss[loss=0.1661, simple_loss=0.2448, pruned_loss=0.04372, over 4742480.06 frames. ], batch size: 388, lr: 3.64e-03, grad_scale: 16.0 2023-10-02 18:44:18,187 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_2296_210_2087_1_1525503448_3397335_57-185430-0_sp1.1 from training. Duration: 0.5454375 2023-10-02 18:44:18,942 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.1.encoder.layers.0.nonlin_attention.whiten1, num_groups=1, num_channels=192, metric=4.50 vs. limit=10.0 2023-10-02 18:44:19,050 INFO [scaling.py:1022] (2/4) Whitening: name=encoder_embed.convnext.out_whiten, num_groups=1, num_channels=128, metric=4.22 vs. limit=5.0 2023-10-02 18:44:20,107 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_1256-120974-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 18:44:20,277 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.feed_forward3.hidden_balancer.prob, batch_count=980846.6666666666, ans=0.125 2023-10-02 18:44:21,427 WARNING [train.py:1204] (2/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_372-30302-0_sp0.9 from training. Duration: 0.9555625 2023-10-02 18:44:24,243 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_42012_210_7927_1_1533439936_3871556_451-344262-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 18:44:24,252 WARNING [train.py:1204] (2/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_28-324729-0 from training. Duration: 0.94 2023-10-02 18:44:24,260 WARNING [train.py:1204] (2/4) Exclude cut with ID _64135_210_2253_1_1535677267876_7608972_313-18394-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 18:44:24,333 WARNING [train.py:1204] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_620-276793-0_sp0.9 from training. Duration: 0.6555625 2023-10-02 18:44:24,540 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.balancer_ff3.min_abs, batch_count=980846.6666666666, ans=0.2 2023-10-02 18:44:25,656 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7544_210_6753_1_1528591327_1471426_172-348777-0_sp0.9 from training. Duration: 0.7333125 2023-10-02 18:44:29,142 WARNING [train.py:1204] (2/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_332-221057-0_sp0.9 from training. Duration: 0.7555625 2023-10-02 18:44:31,948 WARNING [train.py:1204] (2/4) Exclude cut with ID _19023_210_4353_1_1531378892552_5820780_5-300035-0_sp1.1 from training. Duration: 0.92725 2023-10-02 18:44:33,205 WARNING [train.py:1204] (2/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_343-309023-0_sp1.1 from training. Duration: 0.963625 2023-10-02 18:44:33,290 WARNING [train.py:1204] (2/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_37-41919-0_sp1.1 from training. Duration: 0.7909375 2023-10-02 18:44:34,612 WARNING [train.py:1204] (2/4) Exclude cut with ID _3541_210_2477_1_1526475408709_3263973_41-182910-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 18:44:34,818 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.conv_module1.balancer1.prob, batch_count=980913.3333333334, ans=0.125 2023-10-02 18:44:35,866 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_229-45300-0_sp0.9 from training. Duration: 0.6333125 2023-10-02 18:44:38,682 WARNING [train.py:1204] (2/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_40-214716-0_sp1.1 from training. Duration: 0.963625 2023-10-02 18:44:40,142 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9309W0203-640330 from training. Duration: 0.896 2023-10-02 18:44:45,832 WARNING [train.py:1204] (2/4) Exclude cut with ID _60229_210_27767_1_1534838406158_3655350_264-279573-0_sp1.1 from training. Duration: 0.7545625 2023-10-02 18:44:45,864 WARNING [train.py:1204] (2/4) Exclude cut with ID _8394_210_6702_1_1528590504316_3540189_372-198878-0_sp0.9 from training. Duration: 0.6666875 2023-10-02 18:44:45,956 WARNING [train.py:1204] (2/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_359-118926-0_sp1.1 from training. Duration: 0.6545625 2023-10-02 18:44:45,966 WARNING [train.py:1204] (2/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_85-65056-0 from training. Duration: 0.96 2023-10-02 18:44:45,982 WARNING [train.py:1204] (2/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_89-242335-0_sp1.1 from training. Duration: 0.863625 2023-10-02 18:44:47,548 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=980980.0, ans=0.1 2023-10-02 18:44:50,946 WARNING [train.py:1204] (2/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_128-49444-0_sp1.1 from training. Duration: 0.936375 2023-10-02 18:44:52,722 WARNING [train.py:1204] (2/4) Exclude cut with ID _30327_210_16049_1_1532484161295_3924169_401-70819-0 from training. Duration: 0.86 2023-10-02 18:44:54,091 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_41822_210_13270_1_1533955976_4379284_260-256391-0_sp1.1 from training. Duration: 0.936375 2023-10-02 18:44:55,536 WARNING [train.py:1204] (2/4) Exclude cut with ID _67335_210_5219_1_1535525975652_4275229_599-247317-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 18:44:58,738 WARNING [train.py:1204] (2/4) Exclude cut with ID _29857_210_20034_1_1532395886753_3798160_209-239267-0_sp1.1 from training. Duration: 0.936375 2023-10-02 18:45:00,002 WARNING [train.py:1204] (2/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_46-207599-0_sp1.1 from training. Duration: 0.5818125 2023-10-02 18:45:01,474 WARNING [train.py:1204] (2/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_912-282412-0_sp1.1 from training. Duration: 0.4454375 2023-10-02 18:45:04,333 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_4183_210_2087_1_1526729100_1869838_141-166600-0_sp1.1 from training. Duration: 0.863625 2023-10-02 18:45:04,339 WARNING [train.py:1204] (2/4) Exclude cut with ID _15037_210_2253_1_1530752294104_3267650_296-192597-0 from training. Duration: 0.81 2023-10-02 18:45:05,614 WARNING [train.py:1204] (2/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_571-220562-0_sp1.1 from training. Duration: 0.963625 2023-10-02 18:45:05,641 WARNING [train.py:1204] (2/4) Exclude cut with ID _5287_210_2084_1_1527418883040_3878670_12-221186-0 from training. Duration: 0.98 2023-10-02 18:45:11,131 WARNING [train.py:1204] (2/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_149-85602-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 18:45:11,152 WARNING [train.py:1204] (2/4) Exclude cut with ID _9235_210_5219_1_1528891154253_8046120_1003-101638-0_sp1.1 from training. Duration: 0.663625 2023-10-02 18:45:13,848 WARNING [train.py:1204] (2/4) Exclude cut with ID _55844_210_2346_1_1534471212569_7625707_1099-33689-0_sp1.1 from training. Duration: 0.92725 2023-10-02 18:45:15,151 WARNING [train.py:1204] (2/4) Exclude cut with ID _7606_210_4915_1_1528894253277_5080690_301-10732-0 from training. Duration: 0.81 2023-10-02 18:45:16,604 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.560e+02 1.928e+02 2.164e+02 2.452e+02 3.426e+02, threshold=4.329e+02, percent-clipped=0.0 2023-10-02 18:45:16,696 WARNING [train.py:1204] (2/4) Exclude cut with ID _22826_210_9994_1_1531908201522_3279169_403-34698-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 18:45:16,701 WARNING [train.py:1204] (2/4) Exclude cut with ID _8480_210_1874_1_1528589391205_6728330_755-112625-0_sp0.9 from training. Duration: 0.82225 2023-10-02 18:45:16,714 WARNING [train.py:1204] (2/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_527-238990-0_sp1.1 from training. Duration: 0.8181875 2023-10-02 18:45:16,740 WARNING [train.py:1204] (2/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_426-7328-0_sp1.1 from training. Duration: 0.92725 2023-10-02 18:45:16,966 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.conv_module1.balancer1.prob, batch_count=981113.3333333334, ans=0.125 2023-10-02 18:45:20,125 WARNING [train.py:1204] (2/4) Exclude cut with ID _3541_210_2477_1_1526475408709_3263973_189-317158-0_sp1.1 from training. Duration: 0.8181875 2023-10-02 18:45:20,199 WARNING [train.py:1204] (2/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_162-103620-0 from training. Duration: 0.93 2023-10-02 18:45:20,444 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.attention_skip_rate, batch_count=981113.3333333334, ans=0.0 2023-10-02 18:45:21,648 WARNING [train.py:1204] (2/4) Exclude cut with ID _22083_210_8254_1_1531702862439_3678970_49-234546-0 from training. Duration: 0.83 2023-10-02 18:45:21,707 WARNING [train.py:1204] (2/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_370-331600-0_sp1.1 from training. Duration: 0.8545625 2023-10-02 18:45:23,724 WARNING [train.py:1204] (2/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_166-59042-0_sp1.1 from training. Duration: 0.97275 2023-10-02 18:45:25,010 WARNING [train.py:1204] (2/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_138-332773-0_sp1.1 from training. Duration: 0.736375 2023-10-02 18:45:26,310 WARNING [train.py:1204] (2/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_90-30673-0_sp0.9 from training. Duration: 0.9333125 2023-10-02 18:45:27,157 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.0.balancer2.prob, batch_count=981113.3333333334, ans=0.125 2023-10-02 18:45:29,741 WARNING [train.py:1204] (2/4) Exclude cut with ID _27967_210_12140_1_1532588391859_8419130_737-41898-0_sp1.1 from training. Duration: 0.936375 2023-10-02 18:45:31,123 INFO [train.py:1046] (2/4) Epoch 28, batch 3750, loss[loss=0.15, simple_loss=0.2376, pruned_loss=0.03115, over 24494.00 frames. ], tot_loss[loss=0.1669, simple_loss=0.2455, pruned_loss=0.04417, over 4723750.56 frames. ], batch size: 66, lr: 3.64e-03, grad_scale: 16.0 2023-10-02 18:45:31,171 WARNING [train.py:1204] (2/4) Exclude cut with ID _8833_210_5219_1_1528592665210_7553059_329-125156-0_sp1.1 from training. Duration: 0.7454375 2023-10-02 18:45:32,562 WARNING [train.py:1204] (2/4) Exclude cut with ID _41817_210_18835_1_1533434477160_7423049_352-42537-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 18:45:35,178 WARNING [train.py:1204] (2/4) Exclude cut with ID _8365_210_2064_1_1528592311070_4677690_431-294102-0 from training. Duration: 0.89 2023-10-02 18:45:35,283 WARNING [train.py:1204] (2/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_454-314901-0_sp1.1 from training. Duration: 0.4181875 2023-10-02 18:45:38,010 WARNING [train.py:1204] (2/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_80-135200-0_sp0.9 from training. Duration: 0.87775 2023-10-02 18:45:38,075 WARNING [train.py:1204] (2/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_181-78632-0 from training. Duration: 0.78 2023-10-02 18:45:39,409 WARNING [train.py:1204] (2/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_300-27235-0_sp0.9 from training. Duration: 0.9666875 2023-10-02 18:45:40,728 WARNING [train.py:1204] (2/4) Exclude cut with ID _47790_210_16187_1_1533862814066_4207519_96-177176-0_sp1.1 from training. Duration: 0.97275 2023-10-02 18:45:40,810 WARNING [train.py:1204] (2/4) Exclude cut with ID _35941_210_15379_1_1532995294515_4023089_246-348742-0_sp1.1 from training. Duration: 0.97275 2023-10-02 18:45:42,189 WARNING [train.py:1204] (2/4) Exclude cut with ID _38670_210_15379_1_1533448810659_4391059_65-169397-0_sp1.1 from training. Duration: 0.8818125 2023-10-02 18:45:46,386 WARNING [train.py:1204] (2/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_1003-99642-0_sp1.1 from training. Duration: 0.963625 2023-10-02 18:45:49,745 WARNING [train.py:1204] (2/4) Exclude cut with ID _40402_210_18219_1_1533522324115_2953150_320-14078-0_sp1.1 from training. Duration: 0.863625 2023-10-02 18:45:51,140 WARNING [train.py:1204] (2/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_367-61330-0_sp0.9 from training. Duration: 0.8333125 2023-10-02 18:45:52,517 WARNING [train.py:1204] (2/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_515-298514-0_sp1.1 from training. Duration: 0.92725 2023-10-02 18:45:54,599 WARNING [train.py:1204] (2/4) Exclude cut with ID _53731_210_7994_1_1534553979244_3757214_471-320815-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 18:45:56,589 WARNING [train.py:1204] (2/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_306-232480-0 from training. Duration: 0.96 2023-10-02 18:45:57,921 WARNING [train.py:1204] (2/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_478-162247-0_sp0.9 from training. Duration: 0.911125 2023-10-02 18:45:58,642 WARNING [train.py:1204] (2/4) Exclude cut with ID _56423_210_5854_1_1534896061049_3165969_345-284696-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 18:45:58,781 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.balancer2.prob, batch_count=981246.6666666666, ans=0.125 2023-10-02 18:45:58,887 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.conv_module2.balancer2.prob, batch_count=981246.6666666666, ans=0.125 2023-10-02 18:45:59,739 WARNING [train.py:1204] (2/4) Exclude cut with ID _44778_210_2054_1_1533798127203_7443090_600-122253-0_sp1.1 from training. Duration: 0.963625 2023-10-02 18:46:01,974 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.2.encoder.layers.2.self_attn_weights.whiten_keys, num_groups=4, num_channels=128, metric=4.12 vs. limit=6.0 2023-10-02 18:46:02,646 WARNING [train.py:1204] (2/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_199-295611-0 from training. Duration: 0.55 2023-10-02 18:46:05,844 WARNING [train.py:1204] (2/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_734-129761-0 from training. Duration: 0.82 2023-10-02 18:46:07,258 WARNING [train.py:1204] (2/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_349-169257-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 18:46:07,301 WARNING [train.py:1204] (2/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_202-67017-0_sp0.9 from training. Duration: 0.911125 2023-10-02 18:46:09,974 WARNING [train.py:1204] (2/4) Exclude cut with ID _16908_210_5724_1_1531119157248_3772700_61-258978-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 18:46:11,538 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.feed_forward1.hidden_balancer.prob, batch_count=981313.3333333334, ans=0.125 2023-10-02 18:46:15,303 WARNING [train.py:1204] (2/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_172-156931-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 18:46:17,968 WARNING [train.py:1204] (2/4) Exclude cut with ID _4092_210_2151_1_1526643044449_3135759_356-278694-0_sp1.1 from training. Duration: 0.42725 2023-10-02 18:46:20,108 WARNING [train.py:1204] (2/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_116-332008-0 from training. Duration: 0.98 2023-10-02 18:46:24,144 WARNING [train.py:1204] (2/4) Exclude cut with ID _29003_210_19596_1_1532507391720_3894098_358-114789-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 18:46:26,917 WARNING [train.py:1204] (2/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_152-212533-0_sp1.1 from training. Duration: 0.8818125 2023-10-02 18:46:28,861 WARNING [train.py:1204] (2/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_267-214116-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 18:46:30,409 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7543_210_6753_1_1528541907_1399322_165-323619-0_sp0.9 from training. Duration: 0.7666875 2023-10-02 18:46:34,501 WARNING [train.py:1204] (2/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_577-332600-0_sp0.9 from training. Duration: 0.6333125 2023-10-02 18:46:35,886 WARNING [train.py:1204] (2/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_342-100157-0_sp0.9 from training. Duration: 0.6 2023-10-02 18:46:37,317 WARNING [train.py:1204] (2/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_141-89727-0_sp1.1 from training. Duration: 0.6454375 2023-10-02 18:46:40,014 WARNING [train.py:1204] (2/4) Exclude cut with ID _3992_210_1747_1_1526644816031_3731060_107-251434-0_sp1.1 from training. Duration: 0.9 2023-10-02 18:46:42,648 WARNING [train.py:1204] (2/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_401-53423-0_sp1.1 from training. Duration: 0.5 2023-10-02 18:46:44,013 INFO [train.py:1046] (2/4) Epoch 28, batch 3800, loss[loss=0.1462, simple_loss=0.222, pruned_loss=0.03519, over 24422.00 frames. ], tot_loss[loss=0.1671, simple_loss=0.2456, pruned_loss=0.04432, over 4715676.88 frames. ], batch size: 58, lr: 3.64e-03, grad_scale: 16.0 2023-10-02 18:46:47,216 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.ff2_skip_rate, batch_count=981513.3333333334, ans=0.0 2023-10-02 18:46:50,046 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7762_210_1794_1_1528199508_4057408_422-227034-0_sp1.1 from training. Duration: 0.663625 2023-10-02 18:46:54,146 WARNING [train.py:1204] (2/4) Exclude cut with ID _4184_210_2550_1_1526730192013_2219380_102-250299-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 18:46:55,506 WARNING [train.py:1204] (2/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_253-308377-0_sp0.9 from training. Duration: 0.5444375 2023-10-02 18:46:56,951 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_271-297505-0 from training. Duration: 0.99 2023-10-02 18:46:58,350 WARNING [train.py:1204] (2/4) Exclude cut with ID _35941_210_15379_1_1532995294515_4023089_213-142973-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 18:47:00,349 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7544_210_6753_1_1528587722_3576686_162-63018-0_sp1.1 from training. Duration: 0.92725 2023-10-02 18:47:01,778 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_365-217119-0_sp0.9 from training. Duration: 0.7 2023-10-02 18:47:02,665 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.0.layers.1.whiten, num_groups=1, num_channels=192, metric=3.63 vs. limit=12.0 2023-10-02 18:47:03,180 WARNING [train.py:1204] (2/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_662-275621-0_sp0.9 from training. Duration: 0.4666875 2023-10-02 18:47:03,181 WARNING [train.py:1204] (2/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_374-187555-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 18:47:03,256 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_289-180904-0_sp1.1 from training. Duration: 0.6909375 2023-10-02 18:47:05,894 WARNING [train.py:1204] (2/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_189-47206-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 18:47:05,924 WARNING [train.py:1204] (2/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_234-14174-0_sp1.1 from training. Duration: 0.7454375 2023-10-02 18:47:05,959 WARNING [train.py:1204] (2/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_576-76621-0_sp1.1 from training. Duration: 0.963625 2023-10-02 18:47:06,046 WARNING [train.py:1204] (2/4) Exclude cut with ID _13253_210_1925_1_1530757775819_3577873_253-246386-0 from training. Duration: 0.9 2023-10-02 18:47:10,160 WARNING [train.py:1204] (2/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_372-6869-0_sp1.1 from training. Duration: 0.4 2023-10-02 18:47:10,200 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_21434_210_4238_1_1531916339_1222252_22-54954-0_sp1.1 from training. Duration: 0.936375 2023-10-02 18:47:12,838 WARNING [train.py:1204] (2/4) Exclude cut with ID _29853_210_7497_1_1532505600851_6642110_492-170361-0_sp1.1 from training. Duration: 0.92725 2023-10-02 18:47:13,079 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.conv_module1.balancer1.prob, batch_count=981646.6666666666, ans=0.125 2023-10-02 18:47:14,295 WARNING [train.py:1204] (2/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_505-97987-0_sp0.9 from training. Duration: 0.9555625 2023-10-02 18:47:15,636 WARNING [train.py:1204] (2/4) Exclude cut with ID _8365_210_2064_1_1528592311070_4677690_371-122295-0_sp0.9 from training. Duration: 0.611125 2023-10-02 18:47:16,359 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.2.encoder.layers.0.conv_module2.whiten, num_groups=1, num_channels=384, metric=14.47 vs. limit=15.0 2023-10-02 18:47:16,980 WARNING [train.py:1204] (2/4) Exclude cut with ID _14268_210_9206_1_1530622771285_4714099_637-86630-0_sp1.1 from training. Duration: 0.463625 2023-10-02 18:47:16,997 WARNING [train.py:1204] (2/4) Exclude cut with ID _29857_210_20034_1_1532395886753_3798160_372-5678-0_sp1.1 from training. Duration: 0.963625 2023-10-02 18:47:20,088 WARNING [train.py:1204] (2/4) Exclude cut with ID _52892_210_6721_1_1534378941259_4124129_90-165108-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 18:47:20,177 WARNING [train.py:1204] (2/4) Exclude cut with ID _27052_210_5986_1_1532329197187_3585009_143-339198-0_sp1.1 from training. Duration: 0.963625 2023-10-02 18:47:25,647 WARNING [train.py:1204] (2/4) Exclude cut with ID _14268_210_9206_1_1530622771285_4714099_637-86630-0_sp0.9 from training. Duration: 0.5666875 2023-10-02 18:47:25,649 WARNING [train.py:1204] (2/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_401-255491-0 from training. Duration: 0.7 2023-10-02 18:47:27,550 WARNING [train.py:1204] (2/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_178-6946-0_sp1.1 from training. Duration: 0.97275 2023-10-02 18:47:34,223 WARNING [train.py:1204] (2/4) Exclude cut with ID _27485_210_4916_1_1532739191677_3668269_460-163885-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 18:47:38,413 WARNING [train.py:1204] (2/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_270-26053-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 18:47:40,871 WARNING [train.py:1204] (2/4) Exclude cut with ID _14170_210_5219_1_1530525949868_4469650_412-317077-0 from training. Duration: 0.75 2023-10-02 18:47:42,341 WARNING [train.py:1204] (2/4) Exclude cut with ID _5289_210_2084_1_1527678202736_3779299_142-103317-0 from training. Duration: 0.84 2023-10-02 18:47:42,387 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_150-143438-0_sp1.1 from training. Duration: 0.92725 2023-10-02 18:47:43,818 WARNING [train.py:1204] (2/4) Exclude cut with ID _27894_210_12392_1_1532415496078_6902439_668-269779-0_sp1.1 from training. Duration: 0.97275 2023-10-02 18:47:45,042 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.499e+02 1.791e+02 1.970e+02 2.183e+02 2.893e+02, threshold=3.939e+02, percent-clipped=0.0 2023-10-02 18:47:45,141 WARNING [train.py:1204] (2/4) Exclude cut with ID _18513_210_9206_1_1531618215223_3862620_56-222075-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 18:47:46,660 WARNING [train.py:1204] (2/4) Exclude cut with ID _4092_210_2151_1_1526643044449_3135759_356-278694-0 from training. Duration: 0.47 2023-10-02 18:47:47,078 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.5.encoder.layers.1.feed_forward2.out_whiten, num_groups=1, num_channels=256, metric=9.67 vs. limit=15.0 2023-10-02 18:47:51,240 WARNING [train.py:1204] (2/4) Exclude cut with ID _25808_210_13292_1_1532343514562_7366109_633-47524-0 from training. Duration: 0.99 2023-10-02 18:47:51,252 WARNING [train.py:1204] (2/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_872-264525-0 from training. Duration: 0.65 2023-10-02 18:47:52,580 WARNING [train.py:1204] (2/4) Exclude cut with ID _14632_210_9988_1_1530691279392_3923200_239-232545-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 18:47:52,657 WARNING [train.py:1204] (2/4) Exclude cut with ID _14349_210_8341_1_1530615710818_3442470_153-27432-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 18:47:57,281 INFO [train.py:1046] (2/4) Epoch 28, batch 3850, loss[loss=0.1864, simple_loss=0.255, pruned_loss=0.05888, over 23825.00 frames. ], tot_loss[loss=0.1665, simple_loss=0.2448, pruned_loss=0.0441, over 4716654.69 frames. ], batch size: 164, lr: 3.64e-03, grad_scale: 8.0 2023-10-02 18:47:57,384 WARNING [train.py:1204] (2/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_37-41919-0_sp0.9 from training. Duration: 0.9666875 2023-10-02 18:47:58,707 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_196-289637-0_sp0.9 from training. Duration: 0.8333125 2023-10-02 18:48:00,898 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=981846.6666666666, ans=0.1 2023-10-02 18:48:00,901 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.feed_forward3.hidden_balancer.prob, batch_count=981846.6666666666, ans=0.125 2023-10-02 18:48:03,869 WARNING [train.py:1204] (2/4) Exclude cut with ID _13678_210_7497_1_1530530069226_3463170_371-3221-0_sp1.1 from training. Duration: 0.8181875 2023-10-02 18:48:03,943 WARNING [train.py:1204] (2/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_557-324258-0 from training. Duration: 0.81 2023-10-02 18:48:05,450 WARNING [train.py:1204] (2/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_1012-279473-0_sp1.1 from training. Duration: 0.6090625 2023-10-02 18:48:05,533 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_66916_210_14656_1_1535725525_326566_7-92649-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 18:48:05,750 INFO [scaling.py:1118] (2/4) WithLoss: name=encoder.encoders.1.encoder.layers.1.self_attn_weights, loss-sum=0.000e+00 2023-10-02 18:48:07,537 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.4.encoder.layers.0.feed_forward3.out_whiten, num_groups=1, num_channels=384, metric=12.12 vs. limit=15.0 2023-10-02 18:48:09,599 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_9460_210_4211_1_1528954587_10049667_480-260269-0_sp1.1 from training. Duration: 0.5090625 2023-10-02 18:48:12,231 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_40896_210_15710_1_1533433956_7100802_121-155001-0_sp1.1 from training. Duration: 0.92725 2023-10-02 18:48:12,380 WARNING [train.py:1204] (2/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_188-171673-0_sp1.1 from training. Duration: 0.52725 2023-10-02 18:48:14,203 WARNING [train.py:1204] (2/4) Exclude cut with ID _47790_210_16187_1_1533862814066_4207519_2-39653-0 from training. Duration: 0.98 2023-10-02 18:48:19,716 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_23850_210_4238_1_1532232597_1632433_112-229012-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 18:48:21,639 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.4.encoder.layers.1.nonlin_attention.whiten2, num_groups=1, num_channels=384, metric=3.43 vs. limit=15.0 2023-10-02 18:48:22,357 WARNING [train.py:1204] (2/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_626-79528-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 18:48:23,816 WARNING [train.py:1204] (2/4) Exclude cut with ID _30304_210_6319_1_1532476827261_7582060_856-90052-0_sp1.1 from training. Duration: 0.963625 2023-10-02 18:48:25,191 WARNING [train.py:1204] (2/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_122-109023-0_sp0.9 from training. Duration: 0.8444375 2023-10-02 18:48:28,443 WARNING [train.py:1204] (2/4) Exclude cut with ID _64414_210_17301_1_1535243252048_3757759_37-70328-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 18:48:28,517 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_20120_210_6753_1_1531643035_5482859_622-344907-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 18:48:30,591 WARNING [train.py:1204] (2/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_410-321956-0_sp1.1 from training. Duration: 0.92725 2023-10-02 18:48:30,604 WARNING [train.py:1204] (2/4) Exclude cut with ID _7562_210_2084_1_1528887767916_3946229_186-326422-0_sp0.9 from training. Duration: 0.9333125 2023-10-02 18:48:32,004 WARNING [train.py:1204] (2/4) Exclude cut with ID _37537_210_2253_1_1533171758568_3421949_127-285192-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 18:48:33,403 WARNING [train.py:1204] (2/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_723-228168-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 18:48:34,716 WARNING [train.py:1204] (2/4) Exclude cut with ID _13678_210_7497_1_1530530069226_3463170_426-246984-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 18:48:34,733 WARNING [train.py:1204] (2/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_399-156791-0_sp0.9 from training. Duration: 0.92225 2023-10-02 18:48:34,799 WARNING [train.py:1204] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_391-22148-0 from training. Duration: 0.77 2023-10-02 18:48:34,831 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_3475_210_2028_1_1526193578_3622569_64-297072-0 from training. Duration: 0.91 2023-10-02 18:48:36,125 WARNING [train.py:1204] (2/4) Exclude cut with ID _17875_210_2087_1_1531224012907_1821339_225-280015-0_sp1.1 from training. Duration: 0.963625 2023-10-02 18:48:37,406 WARNING [train.py:1204] (2/4) Exclude cut with ID _50655_210_18628_1_1534229942193_3772154_325-246169-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 18:48:40,173 WARNING [train.py:1204] (2/4) Exclude cut with ID _29529_210_7234_1_1532393961455_3977460_597-257348-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 18:48:40,220 WARNING [train.py:1204] (2/4) Exclude cut with ID _58895_210_8750_1_1535184081320_3649089_77-173485-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 18:48:40,240 WARNING [train.py:1204] (2/4) Exclude cut with ID _6270_210_2084_1_1527850911573_3856420_26-277343-0 from training. Duration: 0.82 2023-10-02 18:48:41,720 WARNING [train.py:1204] (2/4) Exclude cut with ID _14368_210_4220_1_1530532714870_3595529_144-3287-0 from training. Duration: 0.67 2023-10-02 18:48:43,247 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.feed_forward2.hidden_balancer.prob, batch_count=982046.6666666666, ans=0.125 2023-10-02 18:48:44,890 WARNING [train.py:1204] (2/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_293-145278-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 18:48:46,332 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_40047_210_2290_1_1533290519_2374311_257-335162-0 from training. Duration: 0.95 2023-10-02 18:48:47,748 WARNING [train.py:1204] (2/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_619-344499-0_sp0.9 from training. Duration: 0.7 2023-10-02 18:48:49,580 INFO [scaling.py:1118] (2/4) WithLoss: name=encoder.encoders.1.encoder.layers.0.self_attn_weights, loss-sum=0.000e+00 2023-10-02 18:48:52,076 WARNING [train.py:1204] (2/4) Exclude cut with ID _19504_210_9994_1_1531648863242_3325220_456-315379-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 18:48:52,155 WARNING [train.py:1204] (2/4) Exclude cut with ID _55857_210_2346_1_1534644007831_6898209_631-221692-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 18:48:56,193 WARNING [train.py:1204] (2/4) Exclude cut with ID _64631_210_27767_1_1535349580068_4165160_324-3682-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 18:48:58,040 WARNING [train.py:1204] (2/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_317-211107-0 from training. Duration: 0.92 2023-10-02 18:49:00,090 WARNING [train.py:1204] (2/4) Exclude cut with ID _6789_210_1534_1_1527857937218_3118950_311-61262-0 from training. Duration: 0.65 2023-10-02 18:49:02,886 WARNING [train.py:1204] (2/4) Exclude cut with ID _28323_210_16187_1_1532480395667_4100300_365-68882-0_sp1.1 from training. Duration: 0.92725 2023-10-02 18:49:02,914 WARNING [train.py:1204] (2/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_816-181825-0_sp1.1 from training. Duration: 0.92725 2023-10-02 18:49:05,535 WARNING [train.py:1204] (2/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_132-269633-0_sp1.1 from training. Duration: 0.6181875 2023-10-02 18:49:05,546 WARNING [train.py:1204] (2/4) Exclude cut with ID _44585_210_18628_1_1533605350387_4518199_280-73534-0_sp1.1 from training. Duration: 0.8090625 2023-10-02 18:49:06,897 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_68095_210_20185_1_1535718226_8888855_1009-164755-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 18:49:06,965 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_40223_210_6228_1_1533298404_4812267_400-103813-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 18:49:06,966 WARNING [train.py:1204] (2/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_491-261923-0_sp1.1 from training. Duration: 0.8909375 2023-10-02 18:49:06,971 WARNING [train.py:1204] (2/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_712-342563-0 from training. Duration: 0.97 2023-10-02 18:49:08,287 WARNING [train.py:1204] (2/4) Exclude cut with ID _41161_210_20034_1_1533431249657_3555314_175-190032-0_sp1.1 from training. Duration: 0.963625 2023-10-02 18:49:09,724 WARNING [train.py:1204] (2/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_788-298907-0 from training. Duration: 0.89 2023-10-02 18:49:09,741 WARNING [train.py:1204] (2/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_225-70961-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 18:49:09,747 WARNING [train.py:1204] (2/4) Exclude cut with ID _58583_210_5236_1_1534726835266_3574249_235-175994-0_sp1.1 from training. Duration: 0.92725 2023-10-02 18:49:10,988 INFO [train.py:1046] (2/4) Epoch 28, batch 3900, loss[loss=0.1652, simple_loss=0.2444, pruned_loss=0.04304, over 23359.00 frames. ], tot_loss[loss=0.1655, simple_loss=0.2433, pruned_loss=0.0438, over 4701348.10 frames. ], batch size: 119, lr: 3.64e-03, grad_scale: 8.0 2023-10-02 18:49:11,136 WARNING [train.py:1204] (2/4) Exclude cut with ID _12302_210_5219_1_1529924446466_8107280_907-182949-0_sp1.1 from training. Duration: 0.836375 2023-10-02 18:49:13,086 WARNING [train.py:1204] (2/4) Exclude cut with ID _54871_210_22356_1_1534665798253_3856359_94-193692-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 18:49:15,709 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_4528_210_3273_1_1527076654_3276777_342-168068-0_sp0.9 from training. Duration: 0.9666875 2023-10-02 18:49:15,770 WARNING [train.py:1204] (2/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_368-74239-0_sp1.1 from training. Duration: 0.92725 2023-10-02 18:49:15,772 WARNING [train.py:1204] (2/4) Exclude cut with ID _44831_210_13427_1_1533816011549_3689035_291-295928-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 18:49:17,263 WARNING [train.py:1204] (2/4) Exclude cut with ID _56406_210_9288_1_1534899618664_3496149_455-131997-0_sp1.1 from training. Duration: 0.97275 2023-10-02 18:49:17,270 WARNING [train.py:1204] (2/4) Exclude cut with ID _6473_210_1741_1_1527901307396_3815380_108-17335-0 from training. Duration: 0.65 2023-10-02 18:49:18,592 WARNING [train.py:1204] (2/4) Exclude cut with ID _14224_210_4381_1_1530529062037_4264010_286-135600-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 18:49:20,714 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.2.encoder.layers.2.conv_module2.whiten, num_groups=1, num_channels=384, metric=5.41 vs. limit=15.0 2023-10-02 18:49:22,663 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_14056_210_4163_1_1530604483_7572827_928-321675-0_sp1.1 from training. Duration: 0.936375 2023-10-02 18:49:22,734 WARNING [train.py:1204] (2/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_81-300212-0_sp1.1 from training. Duration: 0.5454375 2023-10-02 18:49:22,769 WARNING [train.py:1204] (2/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_29-280622-0_sp0.9 from training. Duration: 0.911125 2023-10-02 18:49:22,984 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.balancer1.prob, batch_count=982180.0, ans=0.125 2023-10-02 18:49:24,169 WARNING [train.py:1204] (2/4) Exclude cut with ID _61079_210_4525_1_1534921008198_4011419_228-176447-0_sp1.1 from training. Duration: 0.936375 2023-10-02 18:49:25,650 WARNING [train.py:1204] (2/4) Exclude cut with ID _8394_210_6702_1_1528590504316_3540189_372-198878-0_sp1.1 from training. Duration: 0.5454375 2023-10-02 18:49:25,661 WARNING [train.py:1204] (2/4) Exclude cut with ID _64521_210_14699_1_1535243427259_3815328_244-208725-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 18:49:25,851 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.bypass_mid.scale_min, batch_count=982246.6666666666, ans=0.2 2023-10-02 18:49:28,744 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8275_210_4198_1_1528455320_3892571_491-17707-0_sp0.9 from training. Duration: 0.711125 2023-10-02 18:49:30,094 WARNING [train.py:1204] (2/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_375-245677-0 from training. Duration: 0.81 2023-10-02 18:49:30,101 WARNING [train.py:1204] (2/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_690-286055-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 18:49:30,849 WARNING [train.py:1204] (2/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_89-321955-0 from training. Duration: 0.8 2023-10-02 18:49:32,170 WARNING [train.py:1204] (2/4) Exclude cut with ID _18895_210_6228_1_1531357274234_3784930_213-98721-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 18:49:32,254 WARNING [train.py:1204] (2/4) Exclude cut with ID _19708_210_9206_1_1532136650733_4238364_347-65240-0 from training. Duration: 0.97 2023-10-02 18:49:34,064 WARNING [train.py:1204] (2/4) Exclude cut with ID _64135_210_2253_1_1535677267876_7608972_498-5039-0 from training. Duration: 0.78 2023-10-02 18:49:38,195 WARNING [train.py:1204] (2/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_146-276944-0_sp1.1 from training. Duration: 0.7545625 2023-10-02 18:49:40,776 WARNING [train.py:1204] (2/4) Exclude cut with ID _63171_210_15488_1_1535104792794_3295110_155-34488-0_sp1.1 from training. Duration: 0.97275 2023-10-02 18:49:40,790 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_5438_210_1794_1_1527336687_2600009_233-58072-0_sp0.9 from training. Duration: 0.8555625 2023-10-02 18:49:40,847 WARNING [train.py:1204] (2/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_63-101996-0_sp0.9 from training. Duration: 0.97775 2023-10-02 18:49:45,629 WARNING [train.py:1204] (2/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_271-269084-0_sp1.1 from training. Duration: 0.7545625 2023-10-02 18:49:46,995 WARNING [train.py:1204] (2/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_345-167986-0_sp1.1 from training. Duration: 0.763625 2023-10-02 18:49:49,849 WARNING [train.py:1204] (2/4) Exclude cut with ID _39290_210_4220_1_1533862778760_3075118_92-311600-0_sp1.1 from training. Duration: 0.8 2023-10-02 18:49:49,855 WARNING [train.py:1204] (2/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_209-65306-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 18:49:49,933 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_26433_210_12108_1_1532157158_4056480_317-335807-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 18:49:55,528 WARNING [train.py:1204] (2/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_238-94127-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 18:49:55,671 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.feed_forward1.hidden_balancer.prob, batch_count=982380.0, ans=0.125 2023-10-02 18:49:56,847 WARNING [train.py:1204] (2/4) Exclude cut with ID _41314_210_12651_1_1533459476543_3766460_207-132038-0_sp1.1 from training. Duration: 0.8545625 2023-10-02 18:50:03,034 WARNING [train.py:1204] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_167-137255-0_sp1.1 from training. Duration: 0.5818125 2023-10-02 18:50:04,877 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_2009_210_2071_1_1525481134_4588937_385-302100-0_sp0.9 from training. Duration: 0.9666875 2023-10-02 18:50:11,923 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.conv_module1.balancer2.min_abs, batch_count=982446.6666666666, ans=0.5 2023-10-02 18:50:13,023 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.596e+02 1.882e+02 2.101e+02 2.412e+02 3.470e+02, threshold=4.202e+02, percent-clipped=0.0 2023-10-02 18:50:13,201 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_15517_210_6753_1_1530954008_3487336_253-110617-0_sp1.1 from training. Duration: 0.936375 2023-10-02 18:50:13,414 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.bypass.scale_min, batch_count=982446.6666666666, ans=0.2 2023-10-02 18:50:16,250 WARNING [train.py:1204] (2/4) Exclude cut with ID _39290_210_4220_1_1533862778760_3075118_92-311600-0_sp0.9 from training. Duration: 0.97775 2023-10-02 18:50:16,304 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_42-142473-0 from training. Duration: 0.72 2023-10-02 18:50:16,336 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_2296_210_2087_1_1525503448_3397335_57-185430-0 from training. Duration: 0.6 2023-10-02 18:50:16,348 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_11305_210_1794_1_1529750428_7178148_257-312870-0_sp0.9 from training. Duration: 0.97775 2023-10-02 18:50:16,554 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.self_attn_weights.pos_emb_skip_rate, batch_count=982446.6666666666, ans=0.0 2023-10-02 18:50:17,786 WARNING [train.py:1204] (2/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_222-198127-0 from training. Duration: 0.84 2023-10-02 18:50:19,158 WARNING [train.py:1204] (2/4) Exclude cut with ID _65263_210_22356_1_1535763598691_3921037_423-202699-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 18:50:19,242 WARNING [train.py:1204] (2/4) Exclude cut with ID _15037_210_2253_1_1530752294104_3267650_13-130361-0 from training. Duration: 0.99 2023-10-02 18:50:24,517 INFO [train.py:1046] (2/4) Epoch 28, batch 3950, loss[loss=0.1801, simple_loss=0.2574, pruned_loss=0.05138, over 23473.00 frames. ], tot_loss[loss=0.1651, simple_loss=0.243, pruned_loss=0.04364, over 4702095.37 frames. ], batch size: 93, lr: 3.64e-03, grad_scale: 8.0 2023-10-02 18:50:25,922 WARNING [train.py:1204] (2/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_628-198080-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 18:50:26,015 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_3171_210_2087_1_1526109084_2330007_37-64334-0 from training. Duration: 0.82 2023-10-02 18:50:27,382 WARNING [train.py:1204] (2/4) Exclude cut with ID _5287_210_2084_1_1527418883040_3878670_12-221186-0_sp1.1 from training. Duration: 0.8909375 2023-10-02 18:50:29,419 WARNING [train.py:1204] (2/4) Exclude cut with ID _6653_210_2629_1_1527919172441_6732679_1103-56445-0_sp1.1 from training. Duration: 0.663625 2023-10-02 18:50:31,409 WARNING [train.py:1204] (2/4) Exclude cut with ID _65263_210_22356_1_1535763598691_3921037_169-140068-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 18:50:34,422 WARNING [train.py:1204] (2/4) Exclude cut with ID IC0671W0118-331961 from training. Duration: 0.896 2023-10-02 18:50:35,708 WARNING [train.py:1204] (2/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_402-102311-0_sp1.1 from training. Duration: 0.6090625 2023-10-02 18:50:35,727 WARNING [train.py:1204] (2/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_202-293523-0 from training. Duration: 0.72 2023-10-02 18:50:37,077 WARNING [train.py:1204] (2/4) Exclude cut with ID IC0679W0038-335846 from training. Duration: 0.768 2023-10-02 18:50:37,107 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_37357_210_17024_1_1533089504_2316121_79-198866-0_sp1.1 from training. Duration: 0.936375 2023-10-02 18:50:38,509 WARNING [train.py:1204] (2/4) Exclude cut with ID _7606_210_4915_1_1528894253277_5080690_292-271285-0_sp1.1 from training. Duration: 0.963625 2023-10-02 18:50:39,758 WARNING [train.py:1204] (2/4) Exclude cut with ID _3541_210_2477_1_1526475408709_3263973_377-296839-0_sp0.9 from training. Duration: 0.611125 2023-10-02 18:50:39,769 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_45886_210_17024_1_1533704917_2435333_58-131069-0_sp1.1 from training. Duration: 0.963625 2023-10-02 18:50:41,845 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6961_210_2087_1_1527937168_6815011_395-185060-0 from training. Duration: 0.95 2023-10-02 18:50:43,220 WARNING [train.py:1204] (2/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_304-266479-0_sp1.1 from training. Duration: 0.97275 2023-10-02 18:50:44,509 WARNING [train.py:1204] (2/4) Exclude cut with ID _3867_210_1883_1_1526641200575_4475830_319-349850-0_sp1.1 from training. Duration: 0.6090625 2023-10-02 18:50:44,517 WARNING [train.py:1204] (2/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_304-59227-0_sp0.9 from training. Duration: 0.8555625 2023-10-02 18:50:44,581 WARNING [train.py:1204] (2/4) Exclude cut with ID _8021_210_7994_1_1528376451358_3653069_100-140231-0_sp0.9 from training. Duration: 0.7444375 2023-10-02 18:50:45,953 WARNING [train.py:1204] (2/4) Exclude cut with ID _8833_210_5219_1_1528592665210_7553059_329-125156-0_sp0.9 from training. Duration: 0.911125 2023-10-02 18:50:56,309 WARNING [train.py:1204] (2/4) Exclude cut with ID _14459_210_2228_1_1531443641946_7516700_792-287234-0_sp1.1 from training. Duration: 0.92725 2023-10-02 18:50:56,334 WARNING [train.py:1204] (2/4) Exclude cut with ID _19708_210_9206_1_1532136650733_4238364_347-65240-0_sp1.1 from training. Duration: 0.8818125 2023-10-02 18:51:02,683 WARNING [train.py:1204] (2/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_70-27732-0 from training. Duration: 0.94 2023-10-02 18:51:08,085 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_397-132565-0 from training. Duration: 0.99 2023-10-02 18:51:08,089 WARNING [train.py:1204] (2/4) Exclude cut with ID _49728_210_12651_1_1534041034294_6788150_624-195301-0 from training. Duration: 0.85 2023-10-02 18:51:09,392 WARNING [train.py:1204] (2/4) Exclude cut with ID _55429_210_10626_1_1534467588527_7174143_377-169391-0_sp1.1 from training. Duration: 0.87275 2023-10-02 18:51:12,651 WARNING [train.py:1204] (2/4) Exclude cut with ID _49376_210_5986_1_1533985278732_3370740_354-344695-0_sp1.1 from training. Duration: 0.9 2023-10-02 18:51:18,317 WARNING [train.py:1204] (2/4) Exclude cut with ID _4697_210_2069_1_1526815801953_3823098_276-96593-0_sp1.1 from training. Duration: 0.7 2023-10-02 18:51:18,324 WARNING [train.py:1204] (2/4) Exclude cut with ID _15012_210_1968_1_1530856820429_5095350_688-178849-0_sp0.9 from training. Duration: 0.711125 2023-10-02 18:51:18,372 WARNING [train.py:1204] (2/4) Exclude cut with ID _25565_210_12392_1_1532423009382_3952098_372-69023-0_sp1.1 from training. Duration: 0.936375 2023-10-02 18:51:19,685 WARNING [train.py:1204] (2/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_61-327062-0_sp1.1 from training. Duration: 0.736375 2023-10-02 18:51:19,722 WARNING [train.py:1204] (2/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_215-141059-0 from training. Duration: 0.62 2023-10-02 18:51:21,424 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.bypass.skip_rate, batch_count=982713.3333333334, ans=0.07 2023-10-02 18:51:22,599 WARNING [train.py:1204] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_6-226750-0_sp1.1 from training. Duration: 0.8545625 2023-10-02 18:51:24,014 WARNING [train.py:1204] (2/4) Exclude cut with ID _14348_210_8341_1_1530599434996_3778640_240-232370-0_sp1.1 from training. Duration: 0.763625 2023-10-02 18:51:28,422 WARNING [train.py:1204] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_468-189058-0 from training. Duration: 0.96 2023-10-02 18:51:32,077 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.conv_module1.balancer1.prob, batch_count=982780.0, ans=0.125 2023-10-02 18:51:37,607 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=982846.6666666666, ans=0.1 2023-10-02 18:51:38,603 INFO [train.py:1046] (2/4) Epoch 28, batch 4000, loss[loss=0.1552, simple_loss=0.2318, pruned_loss=0.03935, over 23709.00 frames. ], tot_loss[loss=0.166, simple_loss=0.2439, pruned_loss=0.04401, over 4705521.18 frames. ], batch size: 232, lr: 3.64e-03, grad_scale: 16.0 2023-10-02 18:51:38,661 WARNING [train.py:1204] (2/4) Exclude cut with ID _21987_210_5854_1_1531821500482_3637038_247-93340-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 18:51:44,733 WARNING [train.py:1204] (2/4) Exclude cut with ID _66868_210_9527_1_1535509287954_4561140_436-191602-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 18:51:50,160 WARNING [train.py:1204] (2/4) Exclude cut with ID _18895_210_6228_1_1531357274234_3784930_394-58203-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 18:51:50,220 WARNING [train.py:1204] (2/4) Exclude cut with ID _58605_210_12210_1_1534723195939_3904259_258-281498-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 18:51:51,524 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_33348_210_5632_1_1533086912_3324030_245-292752-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 18:51:51,559 WARNING [train.py:1204] (2/4) Exclude cut with ID _67831_210_12392_1_1535612464202_3092612_346-2148-0 from training. Duration: 0.93 2023-10-02 18:51:52,895 WARNING [train.py:1204] (2/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_369-222060-0_sp0.9 from training. Duration: 0.788875 2023-10-02 18:51:53,144 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.balancer1.prob, batch_count=982913.3333333334, ans=0.125 2023-10-02 18:51:54,200 WARNING [train.py:1204] (2/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_245-174553-0 from training. Duration: 0.88 2023-10-02 18:51:54,205 WARNING [train.py:1204] (2/4) Exclude cut with ID _3541_210_2477_1_1526475408709_3263973_231-104736-0_sp1.1 from training. Duration: 0.7090625 2023-10-02 18:51:54,211 WARNING [train.py:1204] (2/4) Exclude cut with ID _44585_210_18628_1_1533605350387_4518199_280-73534-0 from training. Duration: 0.89 2023-10-02 18:51:54,926 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.3.encoder.layers.2.self_attn1.whiten, num_groups=1, num_channels=512, metric=16.33 vs. limit=22.5 2023-10-02 18:51:55,704 WARNING [train.py:1204] (2/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_22-201476-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 18:51:59,014 WARNING [train.py:1204] (2/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_325-62802-0_sp0.9 from training. Duration: 0.9666875 2023-10-02 18:51:59,037 WARNING [train.py:1204] (2/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_199-274062-0_sp1.1 from training. Duration: 0.97275 2023-10-02 18:51:59,040 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_8-319957-0_sp1.1 from training. Duration: 0.87275 2023-10-02 18:51:59,069 WARNING [train.py:1204] (2/4) Exclude cut with ID _29343_210_4381_1_1532602757752_3674109_224-349726-0_sp1.1 from training. Duration: 0.936375 2023-10-02 18:51:59,073 WARNING [train.py:1204] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_520-126729-0_sp1.1 from training. Duration: 0.636375 2023-10-02 18:52:02,362 WARNING [train.py:1204] (2/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_239-22076-0_sp1.1 from training. Duration: 0.77275 2023-10-02 18:52:03,815 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9012W0011-501146 from training. Duration: 0.896 2023-10-02 18:52:05,610 WARNING [train.py:1204] (2/4) Exclude cut with ID _29857_210_20034_1_1532395886753_3798160_279-185801-0_sp1.1 from training. Duration: 0.82725 2023-10-02 18:52:05,649 WARNING [train.py:1204] (2/4) Exclude cut with ID _43552_210_8913_1_1533726031059_3523429_261-223933-0_sp1.1 from training. Duration: 0.92725 2023-10-02 18:52:08,385 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9029W0025-509426 from training. Duration: 0.896 2023-10-02 18:52:09,759 WARNING [train.py:1204] (2/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_337-181502-0_sp1.1 from training. Duration: 0.5909375 2023-10-02 18:52:09,772 WARNING [train.py:1204] (2/4) Exclude cut with ID _68635_210_9317_1_1535770813410_4315870_159-332599-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 18:52:17,249 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_161-276996-0 from training. Duration: 0.95 2023-10-02 18:52:18,550 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7088_210_1874_1_1527939776_1101113_127-68870-0_sp1.1 from training. Duration: 0.936375 2023-10-02 18:52:18,714 WARNING [train.py:1204] (2/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_322-217220-0_sp1.1 from training. Duration: 0.7818125 2023-10-02 18:52:20,159 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9072W0002-530636 from training. Duration: 0.768 2023-10-02 18:52:21,036 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.1.encoder.layers.1.feed_forward1.out_whiten, num_groups=1, num_channels=256, metric=7.38 vs. limit=15.0 2023-10-02 18:52:21,596 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_340-336408-0_sp1.1 from training. Duration: 0.8454375 2023-10-02 18:52:21,861 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.self_attn_weights.pos_emb_skip_rate, batch_count=983046.6666666666, ans=0.0 2023-10-02 18:52:22,851 WARNING [train.py:1204] (2/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_395-37222-0 from training. Duration: 0.76 2023-10-02 18:52:22,862 WARNING [train.py:1204] (2/4) Exclude cut with ID _64099_210_17301_1_1535526977656_1578188_159-209156-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 18:52:24,223 WARNING [train.py:1204] (2/4) Exclude cut with ID _53950_210_5816_1_1534417264919_3862951_28-29681-0_sp1.1 from training. Duration: 0.92725 2023-10-02 18:52:24,314 WARNING [train.py:1204] (2/4) Exclude cut with ID _4681_210_3525_1_1527250689464_120889_15-126760-0_sp1.1 from training. Duration: 0.736375 2023-10-02 18:52:25,583 WARNING [train.py:1204] (2/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_242-3472-0_sp1.1 from training. Duration: 0.663625 2023-10-02 18:52:26,837 WARNING [train.py:1204] (2/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_436-186311-0_sp0.9 from training. Duration: 0.72225 2023-10-02 18:52:26,862 WARNING [train.py:1204] (2/4) Exclude cut with ID _3675_210_4557_1_1526639934408_4182089_452-172147-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 18:52:28,286 WARNING [train.py:1204] (2/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_80-147301-0 from training. Duration: 0.84 2023-10-02 18:52:28,319 WARNING [train.py:1204] (2/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_351-107588-0_sp1.1 from training. Duration: 0.92725 2023-10-02 18:52:28,554 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.feed_forward2.hidden_balancer.prob, batch_count=983046.6666666666, ans=0.125 2023-10-02 18:52:31,540 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9111W0002-550781 from training. Duration: 0.896 2023-10-02 18:52:36,156 WARNING [train.py:1204] (2/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_465-156500-0_sp1.1 from training. Duration: 0.6545625 2023-10-02 18:52:37,661 WARNING [train.py:1204] (2/4) Exclude cut with ID _13938_210_9988_1_1530518718978_3693960_304-112784-0_sp1.1 from training. Duration: 0.4181875 2023-10-02 18:52:40,741 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.514e+02 1.835e+02 2.053e+02 2.244e+02 3.735e+02, threshold=4.107e+02, percent-clipped=0.0 2023-10-02 18:52:40,808 WARNING [train.py:1204] (2/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_202-67017-0_sp1.1 from training. Duration: 0.7454375 2023-10-02 18:52:40,861 WARNING [train.py:1204] (2/4) Exclude cut with ID _58414_210_8254_1_1535421609162_7151182_448-4358-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 18:52:42,741 WARNING [train.py:1204] (2/4) Exclude cut with ID _66840_210_5219_1_1535452168426_4196349_203-243234-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 18:52:44,124 WARNING [train.py:1204] (2/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_201-213513-0_sp1.1 from training. Duration: 0.97275 2023-10-02 18:52:48,418 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.conv_module2.balancer2.prob, batch_count=983113.3333333334, ans=0.125 2023-10-02 18:52:49,562 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6291_210_3273_1_1527683649_1078498_117-187560-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 18:52:52,217 INFO [train.py:1046] (2/4) Epoch 28, batch 4050, loss[loss=0.1701, simple_loss=0.2409, pruned_loss=0.04965, over 23856.00 frames. ], tot_loss[loss=0.166, simple_loss=0.2439, pruned_loss=0.04401, over 4720194.92 frames. ], batch size: 179, lr: 3.64e-03, grad_scale: 16.0 2023-10-02 18:52:52,276 WARNING [train.py:1204] (2/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_135-174080-0_sp0.9 from training. Duration: 0.588875 2023-10-02 18:52:53,581 WARNING [train.py:1204] (2/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_45-265684-0 from training. Duration: 0.72 2023-10-02 18:52:55,008 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_435-313305-0_sp1.1 from training. Duration: 0.6454375 2023-10-02 18:52:55,035 WARNING [train.py:1204] (2/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_235-307337-0_sp1.1 from training. Duration: 0.8545625 2023-10-02 18:52:56,540 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7762_210_1794_1_1528199508_4057408_419-125009-0_sp0.9 from training. Duration: 0.92225 2023-10-02 18:52:57,875 WARNING [train.py:1204] (2/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_215-141059-0_sp1.1 from training. Duration: 0.563625 2023-10-02 18:52:57,962 WARNING [train.py:1204] (2/4) Exclude cut with ID _27013_210_1389_1_1532437173240_3252649_222-125573-0_sp1.1 from training. Duration: 0.8909375 2023-10-02 18:53:01,008 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.conv_module1.balancer2.prob, batch_count=983180.0, ans=0.125 2023-10-02 18:53:02,176 WARNING [train.py:1204] (2/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_126-274694-0_sp1.1 from training. Duration: 0.8909375 2023-10-02 18:53:05,289 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_2592_210_2290_1_1525697025_4882925_157-70512-0_sp1.1 from training. Duration: 0.82725 2023-10-02 18:53:05,536 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.conv_module1.balancer1.min_positive, batch_count=983246.6666666666, ans=0.025 2023-10-02 18:53:07,276 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_292-265928-0_sp0.9 from training. Duration: 0.5666875 2023-10-02 18:53:08,484 WARNING [train.py:1204] (2/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_1211-226066-0_sp1.1 from training. Duration: 0.6909375 2023-10-02 18:53:08,535 WARNING [train.py:1204] (2/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_394-265747-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 18:53:13,218 WARNING [train.py:1204] (2/4) Exclude cut with ID _48782_210_12658_1_1534467701544_2300422_239-67139-0_sp1.1 from training. Duration: 0.97275 2023-10-02 18:53:15,118 WARNING [train.py:1204] (2/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_329-113173-0_sp1.1 from training. Duration: 0.563625 2023-10-02 18:53:19,133 WARNING [train.py:1204] (2/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_81-75849-0_sp0.9 from training. Duration: 0.4333125 2023-10-02 18:53:20,575 WARNING [train.py:1204] (2/4) Exclude cut with ID _9235_210_5219_1_1528891154253_8046120_1003-101638-0 from training. Duration: 0.73 2023-10-02 18:53:20,604 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9309W0189-640316 from training. Duration: 0.64 2023-10-02 18:53:22,055 WARNING [train.py:1204] (2/4) Exclude cut with ID _45329_210_7005_1_1534157951211_6774339_503-287883-0_sp1.1 from training. Duration: 0.8 2023-10-02 18:53:30,197 WARNING [train.py:1204] (2/4) Exclude cut with ID _8833_210_5219_1_1528592665210_7553059_611-80313-0 from training. Duration: 0.64 2023-10-02 18:53:30,263 WARNING [train.py:1204] (2/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_335-106626-0_sp1.1 from training. Duration: 0.963625 2023-10-02 18:53:31,769 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder_embed.convnext.out_balancer.prob, batch_count=983313.3333333334, ans=0.125 2023-10-02 18:53:34,317 WARNING [train.py:1204] (2/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_237-44332-0_sp1.1 from training. Duration: 0.8545625 2023-10-02 18:53:35,803 WARNING [train.py:1204] (2/4) Exclude cut with ID _46952_210_5986_1_1534230010357_3107379_178-147341-0_sp1.1 from training. Duration: 0.97275 2023-10-02 18:53:37,655 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_13091_210_7925_1_1530609447_8987354_946-154426-0_sp1.1 from training. Duration: 0.936375 2023-10-02 18:53:37,678 WARNING [train.py:1204] (2/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_326-17061-0_sp1.1 from training. Duration: 0.8545625 2023-10-02 18:53:40,823 WARNING [train.py:1204] (2/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_38-169920-0_sp1.1 from training. Duration: 0.82725 2023-10-02 18:53:44,331 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.conv_module1.balancer2.prob, batch_count=983380.0, ans=0.125 2023-10-02 18:53:45,543 WARNING [train.py:1204] (2/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_283-220372-0 from training. Duration: 0.56 2023-10-02 18:53:45,550 WARNING [train.py:1204] (2/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_252-216674-0_sp1.1 from training. Duration: 0.4909375 2023-10-02 18:53:47,052 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_202-91290-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 18:53:48,854 WARNING [train.py:1204] (2/4) Exclude cut with ID _41314_210_12651_1_1533459476543_3766460_80-74184-0 from training. Duration: 0.83 2023-10-02 18:53:51,752 WARNING [train.py:1204] (2/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_411-271358-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 18:53:58,520 WARNING [train.py:1204] (2/4) Exclude cut with ID _68777_210_4353_1_1535810390731_3117904_129-166012-0 from training. Duration: 0.85 2023-10-02 18:53:58,593 WARNING [train.py:1204] (2/4) Exclude cut with ID _44982_210_1511_1_1534312820108_3726029_410-109759-0_sp1.1 from training. Duration: 0.963625 2023-10-02 18:53:58,605 WARNING [train.py:1204] (2/4) Exclude cut with ID _14224_210_4381_1_1530529062037_4264010_364-22817-0_sp1.1 from training. Duration: 0.6454375 2023-10-02 18:53:58,796 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.bypass.skip_rate, batch_count=983446.6666666666, ans=0.07 2023-10-02 18:54:01,269 WARNING [train.py:1204] (2/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_89-242335-0 from training. Duration: 0.95 2023-10-02 18:54:01,284 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_12868_210_6753_1_1530702479_3610564_151-325626-0 from training. Duration: 0.97 2023-10-02 18:54:01,285 WARNING [train.py:1204] (2/4) Exclude cut with ID _6275_210_2084_1_1528110062213_7669049_786-264332-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 18:54:01,499 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.conv_skip_rate, batch_count=983446.6666666666, ans=0.0 2023-10-02 18:54:02,682 WARNING [train.py:1204] (2/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_35-153886-0_sp1.1 from training. Duration: 0.763625 2023-10-02 18:54:02,769 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_24007_210_1794_1_1531918925_1663875_116-100916-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 18:54:02,782 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_162-262717-0_sp1.1 from training. Duration: 0.7545625 2023-10-02 18:54:05,500 INFO [train.py:1046] (2/4) Epoch 28, batch 4100, loss[loss=0.1825, simple_loss=0.2525, pruned_loss=0.05629, over 22670.00 frames. ], tot_loss[loss=0.1661, simple_loss=0.2445, pruned_loss=0.04378, over 4734003.51 frames. ], batch size: 322, lr: 3.64e-03, grad_scale: 16.0 2023-10-02 18:54:10,600 WARNING [train.py:1204] (2/4) Exclude cut with ID _8263_210_1664_1_1528438953494_294100_40-204731-0 from training. Duration: 0.5 2023-10-02 18:54:12,029 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_12868_210_6753_1_1530702479_3610564_416-293746-0 from training. Duration: 0.9 2023-10-02 18:54:13,909 WARNING [train.py:1204] (2/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_605-219732-0 from training. Duration: 0.94 2023-10-02 18:54:14,001 WARNING [train.py:1204] (2/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_454-123002-0 from training. Duration: 0.69 2023-10-02 18:54:14,006 WARNING [train.py:1204] (2/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_25-304273-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 18:54:15,782 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6860_210_4852_1_1527917457_3833327_451-39702-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 18:54:15,808 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_304-16505-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 18:54:15,831 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6291_210_3273_1_1527683649_1078498_4-266348-0_sp1.1 from training. Duration: 0.7090625 2023-10-02 18:54:17,200 WARNING [train.py:1204] (2/4) Exclude cut with ID ID0149W0001-753701 from training. Duration: 0.989 2023-10-02 18:54:19,961 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_41251_210_15368_1_1533429767_4820695_394-55393-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 18:54:20,037 WARNING [train.py:1204] (2/4) Exclude cut with ID _8793_210_6310_1_1528542985139_2310009_121-179782-0_sp1.1 from training. Duration: 0.72725 2023-10-02 18:54:20,051 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_2729_210_3528_1_1525863505_3882214_421-73047-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 18:54:21,458 WARNING [train.py:1204] (2/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_278-154492-0_sp0.9 from training. Duration: 0.8444375 2023-10-02 18:54:25,582 WARNING [train.py:1204] (2/4) Exclude cut with ID _14170_210_5219_1_1530525949868_4469650_412-317077-0_sp0.9 from training. Duration: 0.8333125 2023-10-02 18:54:25,659 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_727-138317-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 18:54:26,958 WARNING [train.py:1204] (2/4) Exclude cut with ID _25808_210_13292_1_1532343514562_7366109_345-335048-0_sp1.1 from training. Duration: 0.92725 2023-10-02 18:54:26,978 WARNING [train.py:1204] (2/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_506-23200-0 from training. Duration: 0.97 2023-10-02 18:54:27,057 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder_embed.conv.5.prob, batch_count=983580.0, ans=0.125 2023-10-02 18:54:28,331 WARNING [train.py:1204] (2/4) Exclude cut with ID _44718_210_5816_1_1534050051869_6469030_643-245187-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 18:54:28,335 WARNING [train.py:1204] (2/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_42-186162-0_sp1.1 from training. Duration: 0.836375 2023-10-02 18:54:28,356 WARNING [train.py:1204] (2/4) Exclude cut with ID _3231_210_2106_1_1526205864934_6784220_171-256004-0_sp1.1 from training. Duration: 0.7818125 2023-10-02 18:54:28,390 WARNING [train.py:1204] (2/4) Exclude cut with ID _25914_210_6400_1_1532141317058_644740_43-248478-0_sp1.1 from training. Duration: 0.936375 2023-10-02 18:54:29,668 WARNING [train.py:1204] (2/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_577-287150-0 from training. Duration: 0.82 2023-10-02 18:54:29,897 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.feed_forward1.hidden_balancer.prob, batch_count=983580.0, ans=0.125 2023-10-02 18:54:32,559 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_46015_210_7234_1_1533863319_3087837_376-276068-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 18:54:35,158 WARNING [train.py:1204] (2/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_51-200220-0 from training. Duration: 0.56 2023-10-02 18:54:36,601 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_452-96101-0_sp1.1 from training. Duration: 0.963625 2023-10-02 18:54:38,475 WARNING [train.py:1204] (2/4) Exclude cut with ID _13252_210_1925_1_1530671372047_3336909_263-168388-0_sp1.1 from training. Duration: 0.7818125 2023-10-02 18:54:38,476 WARNING [train.py:1204] (2/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_469-94673-0 from training. Duration: 0.79 2023-10-02 18:54:40,417 WARNING [train.py:1204] (2/4) Exclude cut with ID _14922_210_6228_1_1530707435273_3693310_303-228738-0_sp1.1 from training. Duration: 0.763625 2023-10-02 18:54:40,566 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.1.conv_module2.balancer1.max_abs, batch_count=983646.6666666666, ans=10.0 2023-10-02 18:54:41,707 WARNING [train.py:1204] (2/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_262-250826-0_sp1.1 from training. Duration: 0.9 2023-10-02 18:54:41,736 WARNING [train.py:1204] (2/4) Exclude cut with ID _68777_210_4353_1_1535810390731_3117904_129-166012-0_sp1.1 from training. Duration: 0.77275 2023-10-02 18:54:44,426 WARNING [train.py:1204] (2/4) Exclude cut with ID _58895_210_8750_1_1535184081320_3649089_265-323810-0 from training. Duration: 0.96 2023-10-02 18:54:47,790 WARNING [train.py:1204] (2/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_120-76843-0_sp0.9 from training. Duration: 0.611125 2023-10-02 18:54:47,848 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7544_210_6753_1_1528587722_3576686_295-258479-0_sp0.9 from training. Duration: 0.9333125 2023-10-02 18:54:50,598 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7046_210_6753_1_1527895337_6920917_783-278109-0 from training. Duration: 0.77 2023-10-02 18:54:52,016 WARNING [train.py:1204] (2/4) Exclude cut with ID _42326_210_7770_1_1533814282531_3707269_113-308323-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 18:54:53,336 WARNING [train.py:1204] (2/4) Exclude cut with ID _7852_210_5219_1_1528286471316_8139400_242-105336-0_sp0.9 from training. Duration: 0.8 2023-10-02 18:54:56,151 WARNING [train.py:1204] (2/4) Exclude cut with ID _56235_210_10640_1_1534557335235_4161099_114-277905-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 18:55:01,745 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_13118_210_7925_1_1530755612_7608297_42-66683-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 18:55:01,981 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=983713.3333333334, ans=0.1 2023-10-02 18:55:04,471 WARNING [train.py:1204] (2/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_901-79438-0_sp1.1 from training. Duration: 0.8545625 2023-10-02 18:55:05,770 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_30280_210_1881_1_1532872779_3660139_211-182194-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 18:55:07,211 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.556e+02 1.958e+02 2.238e+02 2.586e+02 4.135e+02, threshold=4.476e+02, percent-clipped=1.0 2023-10-02 18:55:11,963 WARNING [train.py:1204] (2/4) Exclude cut with ID _14898_210_9206_1_1530766812377_4177190_621-45732-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 18:55:11,970 WARNING [train.py:1204] (2/4) Exclude cut with ID _35350_210_16040_1_1533040191183_4075299_224-133775-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 18:55:15,283 WARNING [train.py:1204] (2/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_147-293311-0_sp1.1 from training. Duration: 0.8545625 2023-10-02 18:55:17,927 WARNING [train.py:1204] (2/4) Exclude cut with ID _12302_210_5219_1_1529924446466_8107280_835-279754-0_sp1.1 from training. Duration: 0.8181875 2023-10-02 18:55:19,513 INFO [train.py:1046] (2/4) Epoch 28, batch 4150, loss[loss=0.1525, simple_loss=0.2343, pruned_loss=0.03538, over 24466.00 frames. ], tot_loss[loss=0.1662, simple_loss=0.245, pruned_loss=0.04366, over 4736669.12 frames. ], batch size: 63, lr: 3.63e-03, grad_scale: 16.0 2023-10-02 18:55:22,989 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8453_210_4211_1_1528502940_8562730_347-330673-0_sp0.9 from training. Duration: 0.8 2023-10-02 18:55:23,064 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_213-103282-0_sp0.9 from training. Duration: 0.8666875 2023-10-02 18:55:24,364 WARNING [train.py:1204] (2/4) Exclude cut with ID _49728_210_12651_1_1534041034294_6788150_66-43496-0_sp1.1 from training. Duration: 0.97275 2023-10-02 18:55:24,366 WARNING [train.py:1204] (2/4) Exclude cut with ID _19023_210_4353_1_1531378892552_5820780_28-36615-0_sp1.1 from training. Duration: 0.8818125 2023-10-02 18:55:27,111 WARNING [train.py:1204] (2/4) Exclude cut with ID _14349_210_8341_1_1530615710818_3442470_115-285975-0 from training. Duration: 0.86 2023-10-02 18:55:27,139 WARNING [train.py:1204] (2/4) Exclude cut with ID _12734_210_8341_1_1530183614296_3891929_236-47464-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 18:55:28,367 WARNING [train.py:1204] (2/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_177-236809-0 from training. Duration: 0.49 2023-10-02 18:55:28,440 WARNING [train.py:1204] (2/4) Exclude cut with ID _13678_210_7497_1_1530530069226_3463170_371-3221-0 from training. Duration: 0.9 2023-10-02 18:55:28,454 WARNING [train.py:1204] (2/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_48-338976-0 from training. Duration: 0.89 2023-10-02 18:55:31,160 WARNING [train.py:1204] (2/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_1167-309744-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 18:55:34,039 WARNING [train.py:1204] (2/4) Exclude cut with ID _30327_210_16049_1_1532484161295_3924169_401-70819-0_sp0.9 from training. Duration: 0.9555625 2023-10-02 18:55:34,055 WARNING [train.py:1204] (2/4) Exclude cut with ID _55134_210_21982_1_1534590028377_3882170_346-91022-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 18:55:36,124 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.3.encoder.layers.2.self_attn_weights.whiten_keys, num_groups=8, num_channels=256, metric=3.96 vs. limit=6.0 2023-10-02 18:55:39,907 WARNING [train.py:1204] (2/4) Exclude cut with ID _14459_210_2228_1_1531443641946_7516700_174-118801-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 18:55:41,311 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_269-278006-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 18:55:41,364 WARNING [train.py:1204] (2/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_390-339137-0_sp0.9 from training. Duration: 0.62225 2023-10-02 18:55:41,655 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.bypass.scale_min, batch_count=983913.3333333334, ans=0.2 2023-10-02 18:55:42,161 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.3.encoder.layers.0.self_attn_weights.whiten_keys, num_groups=8, num_channels=256, metric=5.24 vs. limit=6.0 2023-10-02 18:55:42,797 WARNING [train.py:1204] (2/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_253-308377-0_sp1.1 from training. Duration: 0.4454375 2023-10-02 18:55:42,813 WARNING [train.py:1204] (2/4) Exclude cut with ID _23825_210_13449_1_1531915313526_3497150_240-271367-0_sp1.1 from training. Duration: 0.8818125 2023-10-02 18:55:44,833 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1015-343884-0_sp0.9 from training. Duration: 0.688875 2023-10-02 18:55:47,696 WARNING [train.py:1204] (2/4) Exclude cut with ID _23514_210_1389_1_1531832139785_3031439_335-48210-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 18:55:48,020 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.conv_module2.balancer2.min_abs, batch_count=983980.0, ans=0.5 2023-10-02 18:55:50,934 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_309-65177-0_sp1.1 from training. Duration: 0.8 2023-10-02 18:55:51,077 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.feed_forward3.hidden_balancer.prob, batch_count=983980.0, ans=0.125 2023-10-02 18:55:52,331 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.conv_skip_rate, batch_count=983980.0, ans=0.0 2023-10-02 18:55:53,934 WARNING [train.py:1204] (2/4) Exclude cut with ID _14368_210_4220_1_1530532714870_3595529_231-137892-0 from training. Duration: 0.99 2023-10-02 18:55:55,340 WARNING [train.py:1204] (2/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_57-270037-0 from training. Duration: 0.77 2023-10-02 18:55:55,349 WARNING [train.py:1204] (2/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_465-214726-0_sp1.1 from training. Duration: 0.7818125 2023-10-02 18:55:57,937 WARNING [train.py:1204] (2/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_106-300985-0 from training. Duration: 0.9 2023-10-02 18:55:57,942 WARNING [train.py:1204] (2/4) Exclude cut with ID _14632_210_9988_1_1530691279392_3923200_161-129530-0_sp1.1 from training. Duration: 0.87275 2023-10-02 18:55:57,959 WARNING [train.py:1204] (2/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_514-88370-0_sp1.1 from training. Duration: 0.963625 2023-10-02 18:55:58,136 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.conv_module2.balancer2.prob, batch_count=983980.0, ans=0.125 2023-10-02 18:55:59,524 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder_embed.conv.5.prob, batch_count=983980.0, ans=0.125 2023-10-02 18:56:00,741 WARNING [train.py:1204] (2/4) Exclude cut with ID _56495_210_4234_1_1534748080334_4136219_305-334505-0_sp1.1 from training. Duration: 0.92725 2023-10-02 18:56:02,069 WARNING [train.py:1204] (2/4) Exclude cut with ID _42875_210_14856_1_1533704397487_4012433_61-264914-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 18:56:03,657 WARNING [train.py:1204] (2/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_91-148831-0 from training. Duration: 0.99 2023-10-02 18:56:07,039 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_435-313305-0_sp0.9 from training. Duration: 0.788875 2023-10-02 18:56:08,891 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.4.encoder.layers.1.self_attn2.whiten, num_groups=1, num_channels=384, metric=14.41 vs. limit=22.5 2023-10-02 18:56:09,637 WARNING [train.py:1204] (2/4) Exclude cut with ID _16744_210_5986_1_1531724405384_3907567_242-111316-0_sp1.1 from training. Duration: 0.8454375 2023-10-02 18:56:09,708 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_257-20733-0 from training. Duration: 0.76 2023-10-02 18:56:10,927 WARNING [train.py:1204] (2/4) Exclude cut with ID _8064_210_5399_1_1528372299291_4282973_4-91037-0_sp1.1 from training. Duration: 0.8 2023-10-02 18:56:12,326 WARNING [train.py:1204] (2/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_3-276701-0 from training. Duration: 0.91 2023-10-02 18:56:15,500 WARNING [train.py:1204] (2/4) Exclude cut with ID _3992_210_1747_1_1526644816031_3731060_36-147855-0_sp1.1 from training. Duration: 0.7090625 2023-10-02 18:56:15,576 WARNING [train.py:1204] (2/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_1296-298671-0_sp1.1 from training. Duration: 0.963625 2023-10-02 18:56:16,947 WARNING [train.py:1204] (2/4) Exclude cut with ID _68965_210_1883_1_1535698962272_3497490_309-334593-0_sp1.1 from training. Duration: 0.92725 2023-10-02 18:56:18,382 WARNING [train.py:1204] (2/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_128-296581-0 from training. Duration: 0.74 2023-10-02 18:56:18,382 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_17613_210_2151_1_1531137569_2366160_170-299079-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 18:56:18,384 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6287_210_3273_1_1527509506_2653716_182-199887-0_sp1.1 from training. Duration: 0.463625 2023-10-02 18:56:18,690 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.bypass.skip_rate, batch_count=984113.3333333334, ans=0.09899494936611666 2023-10-02 18:56:19,762 WARNING [train.py:1204] (2/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_80-135200-0_sp1.1 from training. Duration: 0.7181875 2023-10-02 18:56:21,771 WARNING [train.py:1204] (2/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_34-183685-0 from training. Duration: 0.98 2023-10-02 18:56:21,794 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8278_210_2550_1_1528637099_4220413_418-210155-0_sp1.1 from training. Duration: 0.92725 2023-10-02 18:56:21,798 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8853_210_3273_1_1528633899_818968_51-187685-0_sp0.9 from training. Duration: 0.7666875 2023-10-02 18:56:21,812 WARNING [train.py:1204] (2/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_332-221057-0_sp1.1 from training. Duration: 0.6181875 2023-10-02 18:56:23,727 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_2221_210_1988_1_1525597051_4935541_415-296882-0 from training. Duration: 0.77 2023-10-02 18:56:23,751 WARNING [train.py:1204] (2/4) Exclude cut with ID _33640_210_2655_1_1533117605479_4855469_770-278185-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 18:56:23,785 WARNING [train.py:1204] (2/4) Exclude cut with ID _5287_210_2084_1_1527418883040_3878670_333-48508-0_sp1.1 from training. Duration: 0.5454375 2023-10-02 18:56:25,195 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_43802_210_2290_1_1533516203_8829504_142-276839-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 18:56:26,560 WARNING [train.py:1204] (2/4) Exclude cut with ID _28394_210_8913_1_1532430049518_3650118_126-139371-0_sp1.1 from training. Duration: 0.92725 2023-10-02 18:56:26,573 WARNING [train.py:1204] (2/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_76-203867-0 from training. Duration: 0.72 2023-10-02 18:56:27,881 WARNING [train.py:1204] (2/4) Exclude cut with ID _6194_210_1534_1_1527511826160_3941194_14-282876-0_sp0.9 from training. Duration: 0.788875 2023-10-02 18:56:33,350 INFO [train.py:1046] (2/4) Epoch 28, batch 4200, loss[loss=0.1825, simple_loss=0.2461, pruned_loss=0.0595, over 23781.00 frames. ], tot_loss[loss=0.1658, simple_loss=0.2438, pruned_loss=0.04389, over 4726813.97 frames. ], batch size: 212, lr: 3.63e-03, grad_scale: 16.0 2023-10-02 18:56:33,430 WARNING [train.py:1204] (2/4) Exclude cut with ID _35355_210_16040_1_1533212988977_4016229_74-190076-0_sp1.1 from training. Duration: 0.72725 2023-10-02 18:56:35,200 WARNING [train.py:1204] (2/4) Exclude cut with ID _33920_210_4220_1_1533257851500_3084399_198-334063-0 from training. Duration: 0.94 2023-10-02 18:56:36,564 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6291_210_3273_1_1527683649_1078498_4-266348-0_sp0.9 from training. Duration: 0.8666875 2023-10-02 18:56:37,979 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_23856_210_4238_1_1531994785_3087934_122-38075-0_sp1.1 from training. Duration: 0.936375 2023-10-02 18:56:38,145 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=984180.0, ans=0.1 2023-10-02 18:56:39,353 WARNING [train.py:1204] (2/4) Exclude cut with ID _4697_210_2069_1_1526815801953_3823098_276-96593-0_sp0.9 from training. Duration: 0.8555625 2023-10-02 18:56:39,407 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_308-278734-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 18:56:40,865 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_5190_210_4557_1_1527245078_2965646_353-250603-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 18:56:43,489 WARNING [train.py:1204] (2/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_472-216663-0 from training. Duration: 0.92 2023-10-02 18:56:46,712 WARNING [train.py:1204] (2/4) Exclude cut with ID _7537_210_2084_1_1528502395431_3924359_14-312906-0 from training. Duration: 0.84 2023-10-02 18:56:46,757 WARNING [train.py:1204] (2/4) Exclude cut with ID _29003_210_19596_1_1532507391720_3894098_559-236660-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 18:56:48,279 WARNING [train.py:1204] (2/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_312-204798-0_sp1.1 from training. Duration: 0.8454375 2023-10-02 18:56:49,709 WARNING [train.py:1204] (2/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_17-309070-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 18:56:54,707 WARNING [train.py:1204] (2/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_344-152526-0_sp0.9 from training. Duration: 0.588875 2023-10-02 18:56:56,110 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_41452_210_7291_1_1533437687_3941281_405-11559-0_sp1.1 from training. Duration: 0.82725 2023-10-02 18:56:57,443 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_48730_210_5399_1_1533885886_2212769_157-124052-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 18:56:57,504 WARNING [train.py:1204] (2/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_415-78792-0 from training. Duration: 0.81 2023-10-02 18:56:58,748 WARNING [train.py:1204] (2/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_345-340690-0_sp1.1 from training. Duration: 0.8454375 2023-10-02 18:57:00,132 WARNING [train.py:1204] (2/4) Exclude cut with ID _23825_210_13449_1_1531915313526_3497150_124-8138-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 18:57:01,434 WARNING [train.py:1204] (2/4) Exclude cut with ID _49258_210_7994_1_1534122017863_3738861_194-24864-0_sp1.1 from training. Duration: 0.936375 2023-10-02 18:57:01,462 WARNING [train.py:1204] (2/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_369-222060-0_sp1.1 from training. Duration: 0.6454375 2023-10-02 18:57:02,789 WARNING [train.py:1204] (2/4) Exclude cut with ID _12369_210_4381_1_1530356078581_4388520_480-13146-0_sp0.9 from training. Duration: 0.9444375 2023-10-02 18:57:03,564 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.3.encoder.layers.2.nonlin_attention.whiten2, num_groups=1, num_channels=512, metric=5.95 vs. limit=15.0 2023-10-02 18:57:04,340 WARNING [train.py:1204] (2/4) Exclude cut with ID _13253_210_1925_1_1530757775819_3577873_191-144977-0 from training. Duration: 0.78 2023-10-02 18:57:04,357 WARNING [train.py:1204] (2/4) Exclude cut with ID _30304_210_6319_1_1532476827261_7582060_1128-140266-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 18:57:07,797 WARNING [train.py:1204] (2/4) Exclude cut with ID _8772_210_1850_1_1528635595972_3664011_247-336448-0_sp0.9 from training. Duration: 0.82225 2023-10-02 18:57:09,230 WARNING [train.py:1204] (2/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_318-59097-0_sp1.1 from training. Duration: 0.6909375 2023-10-02 18:57:11,924 WARNING [train.py:1204] (2/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_540-131164-0_sp1.1 from training. Duration: 0.836375 2023-10-02 18:57:13,262 WARNING [train.py:1204] (2/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_39-110836-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 18:57:13,452 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.out_combiner.scale_min, batch_count=984313.3333333334, ans=0.2 2023-10-02 18:57:15,966 WARNING [train.py:1204] (2/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_860-96198-0_sp1.1 from training. Duration: 0.663625 2023-10-02 18:57:15,973 WARNING [train.py:1204] (2/4) Exclude cut with ID _15133_210_6228_1_1530794512074_3080379_29-264458-0 from training. Duration: 0.58 2023-10-02 18:57:15,992 WARNING [train.py:1204] (2/4) Exclude cut with ID _55209_210_1531_1_1534675828996_4597160_797-348343-0_sp1.1 from training. Duration: 0.963625 2023-10-02 18:57:16,089 WARNING [train.py:1204] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_23-189234-0_sp1.1 from training. Duration: 0.7545625 2023-10-02 18:57:20,856 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.conv_module2.balancer1.prob, batch_count=984380.0, ans=0.125 2023-10-02 18:57:21,469 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.whiten.whitening_limit, batch_count=984380.0, ans=12.0 2023-10-02 18:57:22,530 WARNING [train.py:1204] (2/4) Exclude cut with ID _5410_210_1388_1_1527397055795_1719020_39-112221-0_sp1.1 from training. Duration: 0.62725 2023-10-02 18:57:22,648 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_43802_210_2290_1_1533516203_8829504_100-348441-0_sp1.1 from training. Duration: 0.82725 2023-10-02 18:57:25,959 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.0.feed_forward3.hidden_balancer.prob, batch_count=984380.0, ans=0.125 2023-10-02 18:57:29,962 WARNING [train.py:1204] (2/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_143-86344-0_sp1.1 from training. Duration: 0.7 2023-10-02 18:57:32,659 WARNING [train.py:1204] (2/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_126-29104-0 from training. Duration: 0.85 2023-10-02 18:57:35,471 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.562e+02 1.827e+02 2.035e+02 2.482e+02 4.070e+02, threshold=4.070e+02, percent-clipped=0.0 2023-10-02 18:57:35,572 WARNING [train.py:1204] (2/4) Exclude cut with ID _39846_210_12651_1_1533603651992_1318030_138-31771-0_sp1.1 from training. Duration: 0.963625 2023-10-02 18:57:37,129 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.0.bypass_mid.scale_min, batch_count=984446.6666666666, ans=0.2 2023-10-02 18:57:38,857 WARNING [train.py:1204] (2/4) Exclude cut with ID _2220_210_1819_1_1525509109898_3658680_50-99837-0_sp1.1 from training. Duration: 0.4909375 2023-10-02 18:57:40,238 WARNING [train.py:1204] (2/4) Exclude cut with ID _6274_210_2084_1_1527937583837_7494020_836-210550-0_sp1.1 from training. Duration: 0.97275 2023-10-02 18:57:41,757 WARNING [train.py:1204] (2/4) Exclude cut with ID _28323_210_16187_1_1532480395667_4100300_306-74561-0 from training. Duration: 0.94 2023-10-02 18:57:46,029 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.conv_module1.balancer1.prob, batch_count=984513.3333333334, ans=0.125 2023-10-02 18:57:47,022 INFO [train.py:1046] (2/4) Epoch 28, batch 4250, loss[loss=0.1601, simple_loss=0.2302, pruned_loss=0.04504, over 23875.00 frames. ], tot_loss[loss=0.1652, simple_loss=0.2436, pruned_loss=0.04342, over 4740146.40 frames. ], batch size: 195, lr: 3.63e-03, grad_scale: 16.0 2023-10-02 18:57:47,105 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_177-58005-0_sp1.1 from training. Duration: 0.563625 2023-10-02 18:57:51,761 WARNING [train.py:1204] (2/4) Exclude cut with ID _54871_210_22356_1_1534665798253_3856359_19-342632-0_sp1.1 from training. Duration: 0.82725 2023-10-02 18:57:51,781 WARNING [train.py:1204] (2/4) Exclude cut with ID _35355_210_16040_1_1533212988977_4016229_74-190076-0_sp0.9 from training. Duration: 0.888875 2023-10-02 18:57:55,341 WARNING [train.py:1204] (2/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_285-97786-0_sp1.1 from training. Duration: 0.97275 2023-10-02 18:57:57,359 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.4.encoder.layers.2.self_attn_weights.whiten_keys, num_groups=4, num_channels=128, metric=1.83 vs. limit=6.0 2023-10-02 18:57:58,847 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.3.encoder.layers.1.conv_module2.whiten, num_groups=1, num_channels=512, metric=4.25 vs. limit=15.0 2023-10-02 18:57:59,513 WARNING [train.py:1204] (2/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_337-181502-0_sp0.9 from training. Duration: 0.72225 2023-10-02 18:57:59,559 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_353-182855-0 from training. Duration: 0.96 2023-10-02 18:58:00,827 WARNING [train.py:1204] (2/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_409-327730-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 18:58:02,375 WARNING [train.py:1204] (2/4) Exclude cut with ID _8030_210_7497_1_1528444886781_3531089_373-19403-0_sp1.1 from training. Duration: 0.97275 2023-10-02 18:58:06,349 WARNING [train.py:1204] (2/4) Exclude cut with ID _6789_210_1534_1_1527857937218_3118950_231-340237-0_sp1.1 from training. Duration: 0.936375 2023-10-02 18:58:07,137 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.3.encoder.layers.2.nonlin_attention.whiten1, num_groups=1, num_channels=384, metric=4.33 vs. limit=10.0 2023-10-02 18:58:09,857 WARNING [train.py:1204] (2/4) Exclude cut with ID _6493_210_1747_1_1528459210424_4012470_536-57832-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 18:58:09,869 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7046_210_6753_1_1527895337_6920917_756-116002-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 18:58:11,479 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.ff3_skip_rate, batch_count=984580.0, ans=0.0 2023-10-02 18:58:12,645 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_37152_210_9697_1_1533106573_3844188_29-239070-0_sp1.1 from training. Duration: 0.8909375 2023-10-02 18:58:12,647 WARNING [train.py:1204] (2/4) Exclude cut with ID _19023_210_4353_1_1531378892552_5820780_61-334539-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 18:58:12,756 WARNING [train.py:1204] (2/4) Exclude cut with ID _14170_210_5219_1_1530525949868_4469650_159-28488-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 18:58:14,018 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_40896_210_15710_1_1533433956_7100802_764-71272-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 18:58:15,433 WARNING [train.py:1204] (2/4) Exclude cut with ID _44902_210_16187_1_1533888006062_3792078_258-178274-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 18:58:16,847 WARNING [train.py:1204] (2/4) Exclude cut with ID _7537_210_2084_1_1528502395431_3924359_14-312906-0_sp1.1 from training. Duration: 0.763625 2023-10-02 18:58:16,940 WARNING [train.py:1204] (2/4) Exclude cut with ID _49643_210_7989_1_1534640244224_3939031_674-116639-0_sp1.1 from training. Duration: 0.963625 2023-10-02 18:58:20,015 WARNING [train.py:1204] (2/4) Exclude cut with ID _22826_210_9994_1_1531908201522_3279169_415-161-0 from training. Duration: 0.98 2023-10-02 18:58:23,228 WARNING [train.py:1204] (2/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_264-348896-0 from training. Duration: 0.85 2023-10-02 18:58:23,241 WARNING [train.py:1204] (2/4) Exclude cut with ID _51899_210_20034_1_1534129254466_3669220_233-161492-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 18:58:25,004 WARNING [train.py:1204] (2/4) Exclude cut with ID _31255_210_13427_1_1532865619495_3815143_233-200023-0_sp1.1 from training. Duration: 0.936375 2023-10-02 18:58:25,029 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_23850_210_4238_1_1532232597_1632433_100-229879-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 18:58:26,375 WARNING [train.py:1204] (2/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_375-245677-0_sp0.9 from training. Duration: 0.9 2023-10-02 18:58:26,378 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_40223_210_6228_1_1533298404_4812267_355-317415-0_sp1.1 from training. Duration: 0.97275 2023-10-02 18:58:27,676 WARNING [train.py:1204] (2/4) Exclude cut with ID _9679_210_7497_1_1529129212906_4363929_235-81488-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 18:58:31,896 WARNING [train.py:1204] (2/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_172-215932-0_sp1.1 from training. Duration: 0.6 2023-10-02 18:58:33,294 WARNING [train.py:1204] (2/4) Exclude cut with ID _12369_210_4381_1_1530356078581_4388520_64-298917-0_sp1.1 from training. Duration: 0.8 2023-10-02 18:58:36,142 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.feed_forward1.hidden_balancer.prob, batch_count=984713.3333333334, ans=0.125 2023-10-02 18:58:37,451 WARNING [train.py:1204] (2/4) Exclude cut with ID _38020_210_5986_1_1533112252259_7006089_131-53533-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 18:58:40,616 WARNING [train.py:1204] (2/4) Exclude cut with ID _42334_210_7770_1_1534206610396_3605880_392-48154-0_sp1.1 from training. Duration: 0.963625 2023-10-02 18:58:40,681 WARNING [train.py:1204] (2/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_337-181502-0 from training. Duration: 0.65 2023-10-02 18:58:40,695 WARNING [train.py:1204] (2/4) Exclude cut with ID _6267_210_2084_1_1527764658526_3894730_375-219887-0_sp0.9 from training. Duration: 0.7444375 2023-10-02 18:58:41,982 WARNING [train.py:1204] (2/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_60-146189-0 from training. Duration: 0.98 2023-10-02 18:58:43,326 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_3133_210_2087_1_1526106446_2324690_243-143119-0_sp1.1 from training. Duration: 0.7 2023-10-02 18:58:44,873 WARNING [train.py:1204] (2/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_1267-272693-0_sp0.9 from training. Duration: 0.811125 2023-10-02 18:58:46,125 WARNING [train.py:1204] (2/4) Exclude cut with ID _69884_210_9771_1_1535846392818_4166169_83-182645-0_sp1.1 from training. Duration: 0.97275 2023-10-02 18:58:46,143 WARNING [train.py:1204] (2/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_403-292260-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 18:58:48,769 WARNING [train.py:1204] (2/4) Exclude cut with ID _55429_210_10626_1_1534467588527_7174143_377-169391-0 from training. Duration: 0.96 2023-10-02 18:58:48,924 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.0.conv_module1.balancer2.prob, batch_count=984780.0, ans=0.125 2023-10-02 18:58:50,211 WARNING [train.py:1204] (2/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_494-199538-0_sp1.1 from training. Duration: 0.6818125 2023-10-02 18:58:50,257 WARNING [train.py:1204] (2/4) Exclude cut with ID _3067_210_2667_1_1526212974640_4503168_119-175776-0_sp1.1 from training. Duration: 0.67275 2023-10-02 18:58:55,291 WARNING [train.py:1204] (2/4) Exclude cut with ID _3231_210_2106_1_1526205864934_6784220_183-303839-0_sp1.1 from training. Duration: 0.97275 2023-10-02 18:58:55,931 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.5.encoder.layers.0.conv_module2.whiten, num_groups=1, num_channels=256, metric=4.24 vs. limit=15.0 2023-10-02 18:58:56,786 WARNING [train.py:1204] (2/4) Exclude cut with ID _18513_210_9206_1_1531618215223_3862620_165-38958-0_sp1.1 from training. Duration: 0.963625 2023-10-02 18:58:56,894 WARNING [train.py:1204] (2/4) Exclude cut with ID _41314_210_12651_1_1533459476543_3766460_87-272068-0_sp1.1 from training. Duration: 0.7545625 2023-10-02 18:58:58,649 WARNING [train.py:1204] (2/4) Exclude cut with ID _3541_210_2477_1_1526475408709_3263973_353-95-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 18:59:00,059 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_2009_210_2071_1_1525481134_4588937_73-302813-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 18:59:01,435 INFO [train.py:1046] (2/4) Epoch 28, batch 4300, loss[loss=0.1817, simple_loss=0.2617, pruned_loss=0.05086, over 24334.00 frames. ], tot_loss[loss=0.1654, simple_loss=0.2437, pruned_loss=0.04357, over 4738709.99 frames. ], batch size: 77, lr: 3.63e-03, grad_scale: 16.0 2023-10-02 18:59:01,522 WARNING [train.py:1204] (2/4) Exclude cut with ID _64220_210_9527_1_1535250151772_4875259_88-95011-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 18:59:02,867 WARNING [train.py:1204] (2/4) Exclude cut with ID _44902_210_16187_1_1533888006062_3792078_160-12107-0_sp1.1 from training. Duration: 0.92725 2023-10-02 18:59:02,881 WARNING [train.py:1204] (2/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_547-5424-0 from training. Duration: 0.94 2023-10-02 18:59:04,241 WARNING [train.py:1204] (2/4) Exclude cut with ID _21783_210_11357_1_1531742458730_3762679_268-85656-0_sp1.1 from training. Duration: 0.936375 2023-10-02 18:59:08,878 WARNING [train.py:1204] (2/4) Exclude cut with ID _66210_210_12651_1_1535446812922_3365850_42-284985-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 18:59:09,220 INFO [scaling.py:1118] (2/4) WithLoss: name=encoder.encoders.5.encoder.layers.1.self_attn_weights, loss-sum=0.000e+00 2023-10-02 18:59:10,288 WARNING [train.py:1204] (2/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_465-214726-0_sp0.9 from training. Duration: 0.9555625 2023-10-02 18:59:13,250 WARNING [train.py:1204] (2/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_50-95770-0_sp1.1 from training. Duration: 0.936375 2023-10-02 18:59:17,590 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.ff3_skip_rate, batch_count=984913.3333333334, ans=0.0 2023-10-02 18:59:20,061 WARNING [train.py:1204] (2/4) Exclude cut with ID _30800_210_22382_1_1532499347904_4515959_269-77567-0_sp1.1 from training. Duration: 0.963625 2023-10-02 18:59:20,064 WARNING [train.py:1204] (2/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_7-111954-0 from training. Duration: 0.56 2023-10-02 18:59:22,006 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_43802_210_2290_1_1533516203_8829504_85-195240-0_sp0.9 from training. Duration: 0.9666875 2023-10-02 18:59:23,866 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_2822_210_2715_1_1526112586_1999587_165-36656-0_sp0.9 from training. Duration: 0.988875 2023-10-02 18:59:23,885 WARNING [train.py:1204] (2/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_181-78632-0_sp1.1 from training. Duration: 0.7090625 2023-10-02 18:59:23,896 WARNING [train.py:1204] (2/4) Exclude cut with ID IC0671W0044-331887 from training. Duration: 0.896 2023-10-02 18:59:28,380 WARNING [train.py:1204] (2/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_266-137104-0_sp1.1 from training. Duration: 0.5909375 2023-10-02 18:59:28,657 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.ff2_skip_rate, batch_count=984913.3333333334, ans=0.0 2023-10-02 18:59:29,852 WARNING [train.py:1204] (2/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_137-116442-0_sp0.9 from training. Duration: 0.8666875 2023-10-02 18:59:32,558 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_23801_210_16223_1_1532066392_3294657_570-5439-0 from training. Duration: 0.97 2023-10-02 18:59:33,831 WARNING [train.py:1204] (2/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_29-280622-0_sp1.1 from training. Duration: 0.7454375 2023-10-02 18:59:33,867 WARNING [train.py:1204] (2/4) Exclude cut with ID 3557-8342-0013-54691-0 from training. Duration: 0.92 2023-10-02 18:59:34,098 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.self_attn_weights.pos_emb_skip_rate, batch_count=984980.0, ans=0.0 2023-10-02 18:59:35,257 WARNING [train.py:1204] (2/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_711-15688-0_sp1.1 from training. Duration: 0.3909375 2023-10-02 18:59:36,632 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_15126_210_6753_1_1530778677_2591185_120-277707-0_sp1.1 from training. Duration: 0.72725 2023-10-02 18:59:39,440 WARNING [train.py:1204] (2/4) Exclude cut with ID _14970_210_15709_1_1530775547361_1966965_8-169924-0_sp1.1 from training. Duration: 0.836375 2023-10-02 18:59:39,442 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_4692_210_3528_1_1526899755_4323254_107-44401-0_sp0.9 from training. Duration: 0.9555625 2023-10-02 18:59:41,413 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_3475_210_2028_1_1526193578_3622569_265-276625-0_sp0.9 from training. Duration: 0.8444375 2023-10-02 18:59:44,014 WARNING [train.py:1204] (2/4) Exclude cut with ID _55153_210_2253_1_1534384786677_3162096_148-79022-0_sp1.1 from training. Duration: 0.87275 2023-10-02 18:59:45,412 WARNING [train.py:1204] (2/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_292-36336-0_sp1.1 from training. Duration: 0.92725 2023-10-02 18:59:45,421 WARNING [train.py:1204] (2/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_329-82235-0 from training. Duration: 0.82 2023-10-02 18:59:45,615 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.1.ff3_skip_rate, batch_count=985046.6666666666, ans=0.0 2023-10-02 18:59:45,670 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.conv_module1.balancer2.min_positive, batch_count=985046.6666666666, ans=0.05 2023-10-02 18:59:46,763 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_11305_210_1794_1_1529750428_7178148_257-312870-0 from training. Duration: 0.88 2023-10-02 18:59:48,173 WARNING [train.py:1204] (2/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_436-4493-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 18:59:51,043 WARNING [train.py:1204] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_321-54129-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 18:59:51,051 WARNING [train.py:1204] (2/4) Exclude cut with ID _5287_210_2084_1_1527418883040_3878670_333-48508-0_sp0.9 from training. Duration: 0.6666875 2023-10-02 18:59:51,064 WARNING [train.py:1204] (2/4) Exclude cut with ID _26528_210_9070_1_1532570410842_3851310_244-278158-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 18:59:51,103 WARNING [train.py:1204] (2/4) Exclude cut with ID _46952_210_5986_1_1534230010357_3107379_232-344503-0_sp1.1 from training. Duration: 0.87275 2023-10-02 18:59:51,117 WARNING [train.py:1204] (2/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_43-206757-0 from training. Duration: 0.75 2023-10-02 18:59:51,119 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_4183_210_2087_1_1526729100_1869838_141-166600-0 from training. Duration: 0.95 2023-10-02 18:59:52,892 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_296-287678-0 from training. Duration: 0.95 2023-10-02 18:59:52,950 WARNING [train.py:1204] (2/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_265-60343-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 18:59:52,973 WARNING [train.py:1204] (2/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_332-256168-0 from training. Duration: 0.75 2023-10-02 18:59:54,833 WARNING [train.py:1204] (2/4) Exclude cut with ID _13679_210_7497_1_1530534293662_3355098_411-186088-0 from training. Duration: 0.94 2023-10-02 18:59:58,937 WARNING [train.py:1204] (2/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_458-126779-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 18:59:59,161 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.conv_module2.balancer2.min_positive, batch_count=985113.3333333334, ans=0.05 2023-10-02 19:00:00,804 WARNING [train.py:1204] (2/4) Exclude cut with ID IC0803W0380-397572 from training. Duration: 0.032 2023-10-02 19:00:00,867 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_278-272612-0_sp1.1 from training. Duration: 0.663625 2023-10-02 19:00:02,295 WARNING [train.py:1204] (2/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_491-263101-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 19:00:02,300 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_53555_210_5632_1_1534595220_7232297_334-336778-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 19:00:03,577 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.642e+02 1.971e+02 2.225e+02 2.582e+02 4.307e+02, threshold=4.451e+02, percent-clipped=1.0 2023-10-02 19:00:05,052 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_50-3289-0 from training. Duration: 0.91 2023-10-02 19:00:05,113 WARNING [train.py:1204] (2/4) Exclude cut with ID _8699_210_2069_1_1528630346831_3960409_155-39766-0_sp0.9 from training. Duration: 0.8666875 2023-10-02 19:00:05,117 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_502-82145-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 19:00:06,473 WARNING [train.py:1204] (2/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_550-43132-0_sp1.1 from training. Duration: 0.97275 2023-10-02 19:00:06,502 WARNING [train.py:1204] (2/4) Exclude cut with ID _14998_210_5219_1_1530705582080_7474970_446-112117-0_sp1.1 from training. Duration: 0.8181875 2023-10-02 19:00:07,883 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_3691_210_3528_1_1526468169_4062053_355-334432-0_sp1.1 from training. Duration: 0.7909375 2023-10-02 19:00:09,344 WARNING [train.py:1204] (2/4) Exclude cut with ID _58605_210_12210_1_1534723195939_3904259_239-94720-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 19:00:10,739 WARNING [train.py:1204] (2/4) Exclude cut with ID _49376_210_5986_1_1533985278732_3370740_391-344072-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 19:00:13,890 WARNING [train.py:1204] (2/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_558-322014-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 19:00:13,922 WARNING [train.py:1204] (2/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_256-32662-0_sp1.1 from training. Duration: 0.8181875 2023-10-02 19:00:15,292 INFO [train.py:1046] (2/4) Epoch 28, batch 4350, loss[loss=0.1734, simple_loss=0.2628, pruned_loss=0.04199, over 24538.00 frames. ], tot_loss[loss=0.166, simple_loss=0.2443, pruned_loss=0.0439, over 4731341.18 frames. ], batch size: 71, lr: 3.63e-03, grad_scale: 16.0 2023-10-02 19:00:18,028 WARNING [train.py:1204] (2/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_572-185150-0 from training. Duration: 0.97 2023-10-02 19:00:18,068 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_89-35748-0_sp0.9 from training. Duration: 0.87775 2023-10-02 19:00:18,241 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.nonlin_attention.balancer.prob, batch_count=985180.0, ans=0.125 2023-10-02 19:00:24,189 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6291_210_3273_1_1527683649_1078498_88-66747-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 19:00:26,173 WARNING [train.py:1204] (2/4) Exclude cut with ID _19504_210_9994_1_1531648863242_3325220_357-237029-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 19:00:29,710 WARNING [train.py:1204] (2/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_302-2550-0_sp1.1 from training. Duration: 0.736375 2023-10-02 19:00:29,711 WARNING [train.py:1204] (2/4) Exclude cut with ID _9461_210_4557_1_1529060514726_3677451_59-194017-0_sp1.1 from training. Duration: 0.963625 2023-10-02 19:00:32,648 WARNING [train.py:1204] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_665-196155-0_sp1.1 from training. Duration: 0.6090625 2023-10-02 19:00:36,700 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_326-315007-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 19:00:39,380 WARNING [train.py:1204] (2/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_105-328174-0_sp1.1 from training. Duration: 0.6909375 2023-10-02 19:00:39,387 WARNING [train.py:1204] (2/4) Exclude cut with ID _56494_210_4220_1_1535284837625_3033169_106-263722-0_sp1.1 from training. Duration: 0.97275 2023-10-02 19:00:43,922 WARNING [train.py:1204] (2/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_770-200430-0_sp0.9 from training. Duration: 0.988875 2023-10-02 19:00:45,406 WARNING [train.py:1204] (2/4) Exclude cut with ID _65640_210_6310_1_1535368201445_3173250_209-13008-0_sp1.1 from training. Duration: 0.9 2023-10-02 19:00:46,820 WARNING [train.py:1204] (2/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_212-149947-0_sp0.9 from training. Duration: 0.811125 2023-10-02 19:00:48,930 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.2.encoder.layers.1.feed_forward3.out_whiten, num_groups=1, num_channels=384, metric=9.89 vs. limit=15.0 2023-10-02 19:00:52,530 WARNING [train.py:1204] (2/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_188-171673-0 from training. Duration: 0.58 2023-10-02 19:00:54,464 WARNING [train.py:1204] (2/4) Exclude cut with ID _58895_210_8750_1_1535184081320_3649089_286-281724-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 19:00:54,534 WARNING [train.py:1204] (2/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_16-254909-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 19:00:58,697 WARNING [train.py:1204] (2/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_52-95655-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 19:01:02,448 WARNING [train.py:1204] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_554-45617-0 from training. Duration: 0.99 2023-10-02 19:01:05,186 WARNING [train.py:1204] (2/4) Exclude cut with ID _14683_210_5219_1_1530698352608_3974859_473-344611-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 19:01:06,561 WARNING [train.py:1204] (2/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_17-25045-0_sp0.9 from training. Duration: 0.6555625 2023-10-02 19:01:07,104 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.5.encoder.layers.0.whiten, num_groups=1, num_channels=256, metric=2.18 vs. limit=12.0 2023-10-02 19:01:10,771 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9072W0003-530637 from training. Duration: 0.768 2023-10-02 19:01:13,475 WARNING [train.py:1204] (2/4) Exclude cut with ID _38670_210_15379_1_1533448810659_4391059_76-74025-0_sp1.1 from training. Duration: 0.936375 2023-10-02 19:01:13,520 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6291_210_3273_1_1527683649_1078498_74-292608-0_sp0.9 from training. Duration: 0.8 2023-10-02 19:01:15,315 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9084W0025-536862 from training. Duration: 0.896 2023-10-02 19:01:15,382 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9087W0021-538392 from training. Duration: 0.768 2023-10-02 19:01:15,387 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7543_210_6753_1_1528543323_645515_56-163234-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 19:01:15,406 WARNING [train.py:1204] (2/4) Exclude cut with ID _45560_210_21982_1_1533812603951_3584999_44-332643-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 19:01:16,750 WARNING [train.py:1204] (2/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_1285-256622-0_sp1.1 from training. Duration: 0.763625 2023-10-02 19:01:16,809 WARNING [train.py:1204] (2/4) Exclude cut with ID _52781_210_16187_1_1534297729099_1118149_141-325632-0_sp1.1 from training. Duration: 0.936375 2023-10-02 19:01:18,143 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_40896_210_15710_1_1533433956_7100802_1051-137631-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 19:01:18,188 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_13091_210_7925_1_1530609447_8987354_233-18333-0_sp1.1 from training. Duration: 0.97275 2023-10-02 19:01:20,918 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_34008_210_10431_1_1533345424_8183247_662-317331-0 from training. Duration: 0.94 2023-10-02 19:01:20,925 WARNING [train.py:1204] (2/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_291-248920-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 19:01:20,934 WARNING [train.py:1204] (2/4) Exclude cut with ID _65263_210_22356_1_1535763598691_3921037_274-288481-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 19:01:20,955 WARNING [train.py:1204] (2/4) Exclude cut with ID _5189_210_4557_1_1527248319428_3769229_233-168080-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 19:01:21,145 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.ff3_skip_rate, batch_count=985446.6666666666, ans=0.0 2023-10-02 19:01:22,326 WARNING [train.py:1204] (2/4) Exclude cut with ID _54871_210_22356_1_1534665798253_3856359_135-184281-0 from training. Duration: 0.99 2023-10-02 19:01:24,816 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9123W0012-555972 from training. Duration: 0.896 2023-10-02 19:01:24,820 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9123W0118-556077 from training. Duration: 0.896 2023-10-02 19:01:24,830 WARNING [train.py:1204] (2/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_406-275649-0 from training. Duration: 0.42 2023-10-02 19:01:28,183 WARNING [train.py:1204] (2/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_309-13684-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 19:01:28,217 WARNING [train.py:1204] (2/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_129-337802-0_sp1.1 from training. Duration: 0.6545625 2023-10-02 19:01:28,234 WARNING [train.py:1204] (2/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_649-266825-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 19:01:28,411 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.nonlin_attention.balancer.prob, batch_count=985513.3333333334, ans=0.125 2023-10-02 19:01:29,572 INFO [train.py:1046] (2/4) Epoch 28, batch 4400, loss[loss=0.1682, simple_loss=0.2423, pruned_loss=0.04703, over 23807.00 frames. ], tot_loss[loss=0.1658, simple_loss=0.2443, pruned_loss=0.04366, over 4733316.74 frames. ], batch size: 179, lr: 3.63e-03, grad_scale: 32.0 2023-10-02 19:01:29,626 WARNING [train.py:1204] (2/4) Exclude cut with ID _40519_210_13794_1_1533715228655_7337390_895-224742-0_sp1.1 from training. Duration: 0.92725 2023-10-02 19:01:31,058 WARNING [train.py:1204] (2/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_234-14174-0 from training. Duration: 0.82 2023-10-02 19:01:33,076 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9156W0009-572502 from training. Duration: 0.768 2023-10-02 19:01:33,093 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_424-245529-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 19:01:37,586 WARNING [train.py:1204] (2/4) Exclude cut with ID _26528_210_9070_1_1532570410842_3851310_232-10149-0_sp1.1 from training. Duration: 0.963625 2023-10-02 19:01:37,602 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_17292_210_6753_1_1531125388_2943734_152-30410-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 19:01:37,905 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.feed_forward1.out_proj.dropout_p, batch_count=985513.3333333334, ans=0.1 2023-10-02 19:01:38,951 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_35441_210_16554_1_1533347572_4092680_291-82075-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 19:01:40,449 WARNING [train.py:1204] (2/4) Exclude cut with ID _12369_210_4381_1_1530356078581_4388520_64-298917-0 from training. Duration: 0.88 2023-10-02 19:01:41,833 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_5618_210_1874_1_1527311008_5851_1-214501-0_sp0.9 from training. Duration: 0.7655625 2023-10-02 19:01:41,878 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_9460_210_4211_1_1528954587_10049667_572-37035-0 from training. Duration: 0.8 2023-10-02 19:01:41,906 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9193W0023-590322 from training. Duration: 0.896 2023-10-02 19:01:42,374 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.5.encoder.layers.1.self_attn1.whiten, num_groups=1, num_channels=256, metric=9.81 vs. limit=22.5 2023-10-02 19:01:43,802 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.4.encoder.layers.0.self_attn_weights.whiten_keys, num_groups=4, num_channels=128, metric=1.64 vs. limit=6.0 2023-10-02 19:01:44,558 WARNING [train.py:1204] (2/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_206-10097-0_sp1.1 from training. Duration: 0.4454375 2023-10-02 19:01:44,570 WARNING [train.py:1204] (2/4) Exclude cut with ID _55134_210_21982_1_1534590028377_3882170_17-51731-0_sp1.1 from training. Duration: 0.963625 2023-10-02 19:01:46,601 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_14953_210_2151_1_1530701710_5616165_710-165400-0 from training. Duration: 0.87 2023-10-02 19:01:48,023 WARNING [train.py:1204] (2/4) Exclude cut with ID _53203_210_2087_1_1534246218309_475590_61-85777-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 19:01:49,389 WARNING [train.py:1204] (2/4) Exclude cut with ID _55844_210_2346_1_1534471212569_7625707_1050-10453-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 19:01:49,401 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9218W0013-603297 from training. Duration: 0.896 2023-10-02 19:01:52,251 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_20120_210_6753_1_1531643035_5482859_267-102818-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 19:01:52,252 WARNING [train.py:1204] (2/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_191-41023-0 from training. Duration: 0.71 2023-10-02 19:01:52,302 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9231W0202-610227 from training. Duration: 0.896 2023-10-02 19:01:56,796 WARNING [train.py:1204] (2/4) Exclude cut with ID _6537_210_1883_1_1527987559710_3597129_382-133062-0 from training. Duration: 0.68 2023-10-02 19:01:56,826 WARNING [train.py:1204] (2/4) Exclude cut with ID _28427_210_4220_1_1532581240967_3061069_43-84928-0 from training. Duration: 0.99 2023-10-02 19:01:56,860 WARNING [train.py:1204] (2/4) Exclude cut with ID _59256_210_5986_1_1535076085997_3589120_121-237386-0 from training. Duration: 0.96 2023-10-02 19:01:56,881 WARNING [train.py:1204] (2/4) Exclude cut with ID _8480_210_1874_1_1528589391205_6728330_137-256986-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 19:01:58,253 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_9417_210_1794_1_1529235261_4614131_479-346963-0_sp1.1 from training. Duration: 0.936375 2023-10-02 19:01:58,284 WARNING [train.py:1204] (2/4) Exclude cut with ID _26474_210_9570_1_1532520022699_3623380_267-7362-0_sp1.1 from training. Duration: 0.936375 2023-10-02 19:01:59,661 WARNING [train.py:1204] (2/4) Exclude cut with ID _55328_210_27767_1_1534580973260_4088170_287-61272-0_sp1.1 from training. Duration: 0.97275 2023-10-02 19:02:01,092 WARNING [train.py:1204] (2/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_589-9070-0 from training. Duration: 0.71 2023-10-02 19:02:02,403 WARNING [train.py:1204] (2/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_148-158509-0_sp1.1 from training. Duration: 0.37275 2023-10-02 19:02:02,477 WARNING [train.py:1204] (2/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_164-184548-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 19:02:02,755 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.bypass.skip_rate, batch_count=985646.6666666666, ans=0.04949747468305833 2023-10-02 19:02:04,423 WARNING [train.py:1204] (2/4) Exclude cut with ID _14970_210_15709_1_1530775547361_1966965_6-23328-0_sp1.1 from training. Duration: 0.7818125 2023-10-02 19:02:04,428 WARNING [train.py:1204] (2/4) Exclude cut with ID _29303_210_2575_1_1532392237303_7410649_612-165069-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 19:02:06,249 WARNING [train.py:1204] (2/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_383-321224-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 19:02:06,285 WARNING [train.py:1204] (2/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_283-160525-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 19:02:06,304 WARNING [train.py:1204] (2/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_505-97987-0 from training. Duration: 0.86 2023-10-02 19:02:07,624 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9309W0190-640317 from training. Duration: 0.768 2023-10-02 19:02:07,887 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.conv_skip_rate, batch_count=985646.6666666666, ans=0.0 2023-10-02 19:02:07,910 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.feed_forward1.hidden_balancer.prob, batch_count=985646.6666666666, ans=0.125 2023-10-02 19:02:11,528 WARNING [train.py:1204] (2/4) Exclude cut with ID _50093_210_4220_1_1534417141207_3432299_280-297390-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 19:02:15,854 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_17292_210_6753_1_1531125388_2943734_320-71385-0_sp1.1 from training. Duration: 0.97275 2023-10-02 19:02:19,252 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_213-103282-0 from training. Duration: 0.78 2023-10-02 19:02:22,074 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7615_210_4211_1_1528110965_4580160_72-235211-0_sp0.9 from training. Duration: 0.8444375 2023-10-02 19:02:24,834 WARNING [train.py:1204] (2/4) Exclude cut with ID _14867_210_1968_1_1530684179890_3846969_173-320977-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 19:02:27,084 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.conv_module1.balancer1.prob, batch_count=985713.3333333334, ans=0.125 2023-10-02 19:02:28,070 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7046_210_6753_1_1527895337_6920917_783-278109-0_sp0.9 from training. Duration: 0.8555625 2023-10-02 19:02:29,395 WARNING [train.py:1204] (2/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_90-30673-0 from training. Duration: 0.84 2023-10-02 19:02:29,418 WARNING [train.py:1204] (2/4) Exclude cut with ID _22826_210_9994_1_1531908201522_3279169_415-161-0_sp1.1 from training. Duration: 0.8909375 2023-10-02 19:02:29,430 WARNING [train.py:1204] (2/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_86-66192-0_sp0.9 from training. Duration: 0.611125 2023-10-02 19:02:29,432 WARNING [train.py:1204] (2/4) Exclude cut with ID _4626_210_3532_1_1526817652717_3916859_446-206656-0_sp1.1 from training. Duration: 0.6909375 2023-10-02 19:02:29,494 WARNING [train.py:1204] (2/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_166-128941-0_sp0.9 from training. Duration: 0.6 2023-10-02 19:02:32,188 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.611e+02 1.814e+02 2.040e+02 2.293e+02 3.611e+02, threshold=4.080e+02, percent-clipped=0.0 2023-10-02 19:02:32,388 WARNING [train.py:1204] (2/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_345-167986-0 from training. Duration: 0.84 2023-10-02 19:02:33,110 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.3.encoder.layers.0.nonlin_attention.whiten1, num_groups=1, num_channels=384, metric=6.88 vs. limit=10.0 2023-10-02 19:02:35,764 WARNING [train.py:1204] (2/4) Exclude cut with ID _6275_210_2084_1_1528110062213_7669049_973-198085-0 from training. Duration: 0.75 2023-10-02 19:02:37,165 WARNING [train.py:1204] (2/4) Exclude cut with ID _60057_210_10640_1_1534903065793_3333150_171-181133-0 from training. Duration: 0.88 2023-10-02 19:02:37,187 WARNING [train.py:1204] (2/4) Exclude cut with ID _19504_210_9994_1_1531648863242_3325220_336-196513-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 19:02:37,195 WARNING [train.py:1204] (2/4) Exclude cut with ID _6493_210_1747_1_1528459210424_4012470_513-140967-0 from training. Duration: 0.79 2023-10-02 19:02:38,512 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6673_210_3273_1_1527767790_3173240_327-306185-0_sp0.9 from training. Duration: 0.92225 2023-10-02 19:02:41,190 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1032-236863-0_sp1.1 from training. Duration: 0.763625 2023-10-02 19:02:42,624 WARNING [train.py:1204] (2/4) Exclude cut with ID _14867_210_1968_1_1530684179890_3846969_413-29292-0 from training. Duration: 0.9 2023-10-02 19:02:43,973 INFO [train.py:1046] (2/4) Epoch 28, batch 4450, loss[loss=0.2227, simple_loss=0.2888, pruned_loss=0.07828, over 19742.00 frames. ], tot_loss[loss=0.1671, simple_loss=0.2451, pruned_loss=0.04449, over 4722149.74 frames. ], batch size: 388, lr: 3.63e-03, grad_scale: 32.0 2023-10-02 19:02:46,067 WARNING [train.py:1204] (2/4) Exclude cut with ID _47251_210_18628_1_1533814063128_4137250_186-343322-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 19:02:48,787 WARNING [train.py:1204] (2/4) Exclude cut with ID _56235_210_10640_1_1534557335235_4161099_624-46091-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 19:02:48,832 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_791-197891-0_sp0.9 from training. Duration: 0.9333125 2023-10-02 19:02:54,223 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_50050_210_16223_1_1535435796_7290138_786-288457-0_sp1.1 from training. Duration: 0.92725 2023-10-02 19:02:55,990 WARNING [train.py:1204] (2/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_213-227283-0_sp1.1 from training. Duration: 0.963625 2023-10-02 19:03:00,158 WARNING [train.py:1204] (2/4) Exclude cut with ID _42659_210_2070_1_1533607442765_3120380_58-341121-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 19:03:01,624 WARNING [train.py:1204] (2/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_506-23200-0_sp1.1 from training. Duration: 0.8818125 2023-10-02 19:03:04,970 WARNING [train.py:1204] (2/4) Exclude cut with ID _14170_210_5219_1_1530525949868_4469650_336-213530-0_sp1.1 from training. Duration: 0.8181875 2023-10-02 19:03:04,997 WARNING [train.py:1204] (2/4) Exclude cut with ID _64135_210_2253_1_1535677267876_7608972_180-5312-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 19:03:06,742 WARNING [train.py:1204] (2/4) Exclude cut with ID _12302_210_5219_1_1529924446466_8107280_835-279754-0 from training. Duration: 0.9 2023-10-02 19:03:06,744 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_20120_210_6753_1_1531643035_5482859_510-96996-0_sp1.1 from training. Duration: 0.7909375 2023-10-02 19:03:06,801 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_15517_210_6753_1_1530954008_3487336_216-187834-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 19:03:06,847 WARNING [train.py:1204] (2/4) Exclude cut with ID _19504_210_9994_1_1531648863242_3325220_414-113427-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 19:03:06,848 WARNING [train.py:1204] (2/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_264-108685-0_sp0.9 from training. Duration: 0.611125 2023-10-02 19:03:09,565 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_5438_210_1794_1_1527336687_2600009_104-288274-0_sp0.9 from training. Duration: 0.6444375 2023-10-02 19:03:14,308 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.2.encoder.layers.2.feed_forward2.out_whiten, num_groups=1, num_channels=384, metric=5.91 vs. limit=15.0 2023-10-02 19:03:15,633 WARNING [train.py:1204] (2/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_520-39496-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 19:03:15,677 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_37818_210_4238_1_1533620910_4720083_315-308565-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 19:03:17,101 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_2009_210_2071_1_1525481134_4588937_385-302100-0_sp1.1 from training. Duration: 0.7909375 2023-10-02 19:03:17,145 WARNING [train.py:1204] (2/4) Exclude cut with ID _58895_210_8750_1_1535184081320_3649089_277-53219-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 19:03:18,545 WARNING [train.py:1204] (2/4) Exclude cut with ID _26664_210_7770_1_1532258943280_3418270_121-111258-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 19:03:19,140 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.3.encoder.layers.3.whiten, num_groups=1, num_channels=512, metric=3.43 vs. limit=12.0 2023-10-02 19:03:23,998 WARNING [train.py:1204] (2/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_99-167392-0_sp1.1 from training. Duration: 0.5545625 2023-10-02 19:03:24,092 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1113-7369-0 from training. Duration: 0.85 2023-10-02 19:03:25,440 WARNING [train.py:1204] (2/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_352-59958-0 from training. Duration: 0.86 2023-10-02 19:03:25,444 WARNING [train.py:1204] (2/4) Exclude cut with ID _14886_210_4220_1_1530702020355_3516190_159-73748-0_sp1.1 from training. Duration: 0.7545625 2023-10-02 19:03:26,913 WARNING [train.py:1204] (2/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_131-119569-0_sp1.1 from training. Duration: 0.92725 2023-10-02 19:03:28,746 WARNING [train.py:1204] (2/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_216-343007-0 from training. Duration: 0.66 2023-10-02 19:03:31,515 WARNING [train.py:1204] (2/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_113-92608-0_sp0.9 from training. Duration: 0.6 2023-10-02 19:03:34,807 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_40047_210_2290_1_1533290519_2374311_132-123271-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 19:03:36,129 WARNING [train.py:1204] (2/4) Exclude cut with ID _8793_210_6310_1_1528542985139_2310009_121-179782-0 from training. Duration: 0.8 2023-10-02 19:03:36,145 WARNING [train.py:1204] (2/4) Exclude cut with ID _43997_210_4525_1_1533538632555_5189329_283-218639-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 19:03:36,148 WARNING [train.py:1204] (2/4) Exclude cut with ID _49376_210_5986_1_1533985278732_3370740_297-78397-0_sp1.1 from training. Duration: 0.936375 2023-10-02 19:03:36,163 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_3475_210_2028_1_1526193578_3622569_377-166133-0_sp0.9 from training. Duration: 0.9555625 2023-10-02 19:03:36,173 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_520-213149-0_sp1.1 from training. Duration: 0.92725 2023-10-02 19:03:38,840 WARNING [train.py:1204] (2/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_567-144483-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 19:03:42,152 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_4183_210_2087_1_1526729100_1869838_143-280962-0_sp0.9 from training. Duration: 0.62225 2023-10-02 19:03:43,509 WARNING [train.py:1204] (2/4) Exclude cut with ID _49643_210_7989_1_1534640244224_3939031_611-185515-0 from training. Duration: 0.94 2023-10-02 19:03:45,522 WARNING [train.py:1204] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_88-341718-0_sp0.9 from training. Duration: 0.7333125 2023-10-02 19:03:48,222 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_51225_210_3601_1_1534400634_2327383_49-235441-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 19:03:49,725 WARNING [train.py:1204] (2/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_240-294400-0_sp1.1 from training. Duration: 0.936375 2023-10-02 19:03:51,010 WARNING [train.py:1204] (2/4) Exclude cut with ID _55429_210_10626_1_1534467588527_7174143_640-304825-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 19:03:51,036 WARNING [train.py:1204] (2/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_187-221566-0_sp1.1 from training. Duration: 0.3909375 2023-10-02 19:03:53,673 WARNING [train.py:1204] (2/4) Exclude cut with ID _7606_210_4915_1_1528894253277_5080690_127-177761-0_sp0.9 from training. Duration: 0.711125 2023-10-02 19:03:56,396 WARNING [train.py:1204] (2/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_430-116139-0 from training. Duration: 0.85 2023-10-02 19:03:57,698 INFO [train.py:1046] (2/4) Epoch 28, batch 4500, loss[loss=0.1658, simple_loss=0.2444, pruned_loss=0.04356, over 24486.00 frames. ], tot_loss[loss=0.1669, simple_loss=0.2448, pruned_loss=0.04453, over 4715996.97 frames. ], batch size: 66, lr: 3.63e-03, grad_scale: 16.0 2023-10-02 19:03:57,756 WARNING [train.py:1204] (2/4) Exclude cut with ID _3675_210_4557_1_1526639934408_4182089_456-18305-0_sp0.9 from training. Duration: 0.8444375 2023-10-02 19:03:59,998 WARNING [train.py:1204] (2/4) Exclude cut with ID _18513_210_9206_1_1531618215223_3862620_402-5957-0_sp1.1 from training. Duration: 0.9 2023-10-02 19:04:01,486 WARNING [train.py:1204] (2/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_465-156500-0 from training. Duration: 0.72 2023-10-02 19:04:01,487 WARNING [train.py:1204] (2/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_714-310538-0 from training. Duration: 0.86 2023-10-02 19:04:04,568 WARNING [train.py:1204] (2/4) Exclude cut with ID _39411_210_5986_1_1533538825030_3343649_335-301187-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 19:04:09,825 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_41750_210_1960_1_1533520678_3948042_241-158616-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 19:04:09,864 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_46013_210_7234_1_1533776362_3744032_50-141535-0_sp1.1 from training. Duration: 0.9 2023-10-02 19:04:09,937 WARNING [train.py:1204] (2/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_43-206757-0_sp0.9 from training. Duration: 0.8333125 2023-10-02 19:04:11,747 WARNING [train.py:1204] (2/4) Exclude cut with ID _22961_210_8254_1_1531976366672_4278180_172-276296-0_sp1.1 from training. Duration: 0.97275 2023-10-02 19:04:11,772 WARNING [train.py:1204] (2/4) Exclude cut with ID _17956_210_4353_1_1531209568125_6979970_1305-301903-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 19:04:13,158 WARNING [train.py:1204] (2/4) Exclude cut with ID _28323_210_16187_1_1532480395667_4100300_604-44507-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 19:04:21,014 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.4.encoder.layers.0.self_attn2.whiten, num_groups=1, num_channels=384, metric=14.31 vs. limit=22.5 2023-10-02 19:04:23,369 WARNING [train.py:1204] (2/4) Exclude cut with ID _23514_210_1389_1_1531832139785_3031439_88-337544-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 19:04:24,746 WARNING [train.py:1204] (2/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_932-3672-0_sp1.1 from training. Duration: 0.963625 2023-10-02 19:04:27,445 WARNING [train.py:1204] (2/4) Exclude cut with ID _46502_210_13794_1_1533724452198_3657159_207-109991-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 19:04:27,512 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_216-73841-0_sp1.1 from training. Duration: 0.87275 2023-10-02 19:04:27,678 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.0.feed_forward1.out_proj.dropout_p, batch_count=986313.3333333334, ans=0.1 2023-10-02 19:04:29,551 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8453_210_4211_1_1528502940_8562730_347-330673-0_sp1.1 from training. Duration: 0.6545625 2023-10-02 19:04:33,657 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_4183_210_2087_1_1526729100_1869838_143-280962-0_sp1.1 from training. Duration: 0.5090625 2023-10-02 19:04:38,396 WARNING [train.py:1204] (2/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_325-113798-0_sp1.1 from training. Duration: 0.77275 2023-10-02 19:04:41,933 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.0.conv_skip_rate, batch_count=986380.0, ans=0.0 2023-10-02 19:04:43,186 WARNING [train.py:1204] (2/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_91-212486-0_sp0.9 from training. Duration: 0.7555625 2023-10-02 19:04:43,500 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.self_attn_weights.pos_emb_skip_rate, batch_count=986380.0, ans=0.0 2023-10-02 19:04:46,464 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8776_210_2040_1_1528638837_4088036_244-278763-0_sp1.1 from training. Duration: 0.8181875 2023-10-02 19:04:46,504 WARNING [train.py:1204] (2/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_106-286130-0 from training. Duration: 0.98 2023-10-02 19:04:47,815 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_2746_210_1988_1_1525872816_3795627_372-205861-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 19:04:49,163 WARNING [train.py:1204] (2/4) Exclude cut with ID _30800_210_22382_1_1532499347904_4515959_240-54646-0_sp1.1 from training. Duration: 0.936375 2023-10-02 19:04:50,597 WARNING [train.py:1204] (2/4) Exclude cut with ID _59500_210_8254_1_1534809610680_3736009_162-180695-0_sp1.1 from training. Duration: 0.936375 2023-10-02 19:04:50,619 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7544_210_6753_1_1528591327_1471426_146-93357-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 19:04:52,044 WARNING [train.py:1204] (2/4) Exclude cut with ID _42657_210_2070_1_1533520654016_3327008_405-87432-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 19:04:52,079 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_4528_210_3273_1_1527076654_3276777_209-219804-0 from training. Duration: 0.69 2023-10-02 19:04:52,080 WARNING [train.py:1204] (2/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_78-182912-0_sp0.9 from training. Duration: 0.6555625 2023-10-02 19:04:52,089 WARNING [train.py:1204] (2/4) Exclude cut with ID _31255_210_13427_1_1532865619495_3815143_322-114998-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 19:04:56,236 WARNING [train.py:1204] (2/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_848-280957-0_sp0.9 from training. Duration: 0.9666875 2023-10-02 19:04:56,256 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7696_210_2040_1_1528207167_3716099_117-125434-0_sp0.9 from training. Duration: 0.8444375 2023-10-02 19:04:57,787 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_1002-109192-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 19:05:00,966 WARNING [train.py:1204] (2/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_138-64210-0_sp0.9 from training. Duration: 0.811125 2023-10-02 19:05:00,989 WARNING [train.py:1204] (2/4) Exclude cut with ID _25808_210_13292_1_1532343514562_7366109_633-47524-0_sp1.1 from training. Duration: 0.9 2023-10-02 19:05:02,251 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.546e+02 1.810e+02 2.088e+02 2.457e+02 3.731e+02, threshold=4.176e+02, percent-clipped=0.0 2023-10-02 19:05:02,423 WARNING [train.py:1204] (2/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_297-5233-0 from training. Duration: 0.95 2023-10-02 19:05:05,385 WARNING [train.py:1204] (2/4) Exclude cut with ID _8394_210_6702_1_1528590504316_3540189_335-274780-0 from training. Duration: 0.78 2023-10-02 19:05:05,390 WARNING [train.py:1204] (2/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_301-330455-0 from training. Duration: 0.8 2023-10-02 19:05:06,882 WARNING [train.py:1204] (2/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_383-205833-0 from training. Duration: 0.95 2023-10-02 19:05:10,366 WARNING [train.py:1204] (2/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_48-182155-0 from training. Duration: 0.87 2023-10-02 19:05:10,445 WARNING [train.py:1204] (2/4) Exclude cut with ID _14832_210_8750_1_1530691351539_3171348_1-54315-0_sp1.1 from training. Duration: 0.7909375 2023-10-02 19:05:13,561 INFO [train.py:1046] (2/4) Epoch 28, batch 4550, loss[loss=0.1664, simple_loss=0.2324, pruned_loss=0.05018, over 23819.00 frames. ], tot_loss[loss=0.166, simple_loss=0.2433, pruned_loss=0.04436, over 4714456.70 frames. ], batch size: 179, lr: 3.63e-03, grad_scale: 16.0 2023-10-02 19:05:15,030 WARNING [train.py:1204] (2/4) Exclude cut with ID _44982_210_1511_1_1534312820108_3726029_413-100814-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 19:05:15,092 WARNING [train.py:1204] (2/4) Exclude cut with ID _3231_210_2106_1_1526205864934_6784220_104-261392-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 19:05:19,357 WARNING [train.py:1204] (2/4) Exclude cut with ID _24314_210_4915_1_1532135078116_7170179_1-13588-0_sp1.1 from training. Duration: 0.97275 2023-10-02 19:05:19,667 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.feed_forward1.out_proj.dropout_p, batch_count=986513.3333333334, ans=0.1 2023-10-02 19:05:22,168 WARNING [train.py:1204] (2/4) Exclude cut with ID _44304_210_15374_1_1533639352765_4613129_141-8356-0_sp1.1 from training. Duration: 0.963625 2023-10-02 19:05:25,020 WARNING [train.py:1204] (2/4) Exclude cut with ID _37892_210_19596_1_1533198677897_3964058_374-335034-0_sp1.1 from training. Duration: 0.936375 2023-10-02 19:05:25,155 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_195-257512-0_sp0.9 from training. Duration: 0.7444375 2023-10-02 19:05:25,157 WARNING [train.py:1204] (2/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_1267-272693-0_sp1.1 from training. Duration: 0.663625 2023-10-02 19:05:25,158 WARNING [train.py:1204] (2/4) Exclude cut with ID _30196_210_1664_1_1533708496522_7074140_690-326155-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 19:05:29,709 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_68847_210_14656_1_1536070214_2586843_38-20717-0_sp1.1 from training. Duration: 0.97275 2023-10-02 19:05:29,763 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_99-251314-0_sp1.1 from training. Duration: 0.7909375 2023-10-02 19:05:30,034 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.conv_module1.balancer2.min_positive, batch_count=986580.0, ans=0.05 2023-10-02 19:05:32,597 WARNING [train.py:1204] (2/4) Exclude cut with ID _49376_210_5986_1_1533985278732_3370740_225-95630-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 19:05:35,802 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_2655_210_2087_1_1525688959_3897400_202-147557-0 from training. Duration: 0.85 2023-10-02 19:05:37,221 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_5618_210_1874_1_1527311008_5851_1-214501-0_sp1.1 from training. Duration: 0.626375 2023-10-02 19:05:38,618 WARNING [train.py:1204] (2/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_225-43381-0_sp1.1 from training. Duration: 0.8545625 2023-10-02 19:05:38,719 WARNING [train.py:1204] (2/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_242-3472-0 from training. Duration: 0.73 2023-10-02 19:05:42,243 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.attention_skip_rate, batch_count=986646.6666666666, ans=0.0 2023-10-02 19:05:43,499 WARNING [train.py:1204] (2/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_339-104791-0 from training. Duration: 0.49 2023-10-02 19:05:45,304 WARNING [train.py:1204] (2/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_488-92261-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 19:05:50,578 WARNING [train.py:1204] (2/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_175-163894-0 from training. Duration: 0.74 2023-10-02 19:05:51,978 WARNING [train.py:1204] (2/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_41-234353-0_sp0.9 from training. Duration: 0.7555625 2023-10-02 19:05:54,190 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.3.encoder.layers.0.nonlin_attention.whiten2, num_groups=1, num_channels=512, metric=7.95 vs. limit=15.0 2023-10-02 19:05:54,860 WARNING [train.py:1204] (2/4) Exclude cut with ID _47251_210_18628_1_1533814063128_4137250_116-275927-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 19:05:54,903 WARNING [train.py:1204] (2/4) Exclude cut with ID _4697_210_2069_1_1526815801953_3823098_61-223324-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 19:05:56,219 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6961_210_2087_1_1527937168_6815011_562-165518-0_sp1.1 from training. Duration: 0.7 2023-10-02 19:05:57,633 WARNING [train.py:1204] (2/4) Exclude cut with ID _21989_210_5854_1_1531899214931_3271360_133-44759-0 from training. Duration: 0.77 2023-10-02 19:06:00,312 WARNING [train.py:1204] (2/4) Exclude cut with ID _23557_210_8750_1_1531890106370_6846255_77-284923-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 19:06:03,517 WARNING [train.py:1204] (2/4) Exclude cut with ID _13678_210_7497_1_1530530069226_3463170_464-296127-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 19:06:03,537 WARNING [train.py:1204] (2/4) Exclude cut with ID _22613_210_5219_1_1532253599686_4090170_393-36663-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 19:06:04,873 WARNING [train.py:1204] (2/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_235-262124-0_sp0.9 from training. Duration: 0.7444375 2023-10-02 19:06:06,230 WARNING [train.py:1204] (2/4) Exclude cut with ID _8480_210_1874_1_1528589391205_6728330_755-112625-0 from training. Duration: 0.74 2023-10-02 19:06:07,552 WARNING [train.py:1204] (2/4) Exclude cut with ID _6456_210_1534_1_1527681634212_3181049_215-309359-0 from training. Duration: 0.88 2023-10-02 19:06:07,571 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_3019_210_2028_1_1526044245_3264865_191-72235-0_sp1.1 from training. Duration: 0.8818125 2023-10-02 19:06:09,582 WARNING [train.py:1204] (2/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_380-162648-0 from training. Duration: 0.94 2023-10-02 19:06:11,038 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_92-46009-0 from training. Duration: 0.54 2023-10-02 19:06:11,054 WARNING [train.py:1204] (2/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_53-300544-0_sp0.9 from training. Duration: 0.7444375 2023-10-02 19:06:11,259 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.balancer2.prob, batch_count=986713.3333333334, ans=0.125 2023-10-02 19:06:12,419 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_68235_210_1925_1_1535863247_4484656_101-250943-0_sp1.1 from training. Duration: 0.97275 2023-10-02 19:06:12,438 WARNING [train.py:1204] (2/4) Exclude cut with ID _14970_210_15709_1_1530775547361_1966965_204-22182-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 19:06:13,769 WARNING [train.py:1204] (2/4) Exclude cut with ID _55153_210_2253_1_1534384786677_3162096_310-130092-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 19:06:13,783 WARNING [train.py:1204] (2/4) Exclude cut with ID _60113_210_16271_1_1535164212481_4386079_559-167191-0_sp1.1 from training. Duration: 0.8454375 2023-10-02 19:06:17,166 WARNING [train.py:1204] (2/4) Exclude cut with ID _14368_210_4220_1_1530532714870_3595529_93-58002-0_sp1.1 from training. Duration: 0.5181875 2023-10-02 19:06:17,377 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.feed_forward1.out_proj.dropout_p, batch_count=986780.0, ans=0.1 2023-10-02 19:06:18,541 WARNING [train.py:1204] (2/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_110-224688-0 from training. Duration: 0.97 2023-10-02 19:06:18,659 WARNING [train.py:1204] (2/4) Exclude cut with ID _40402_210_18219_1_1533522324115_2953150_178-178004-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 19:06:18,670 WARNING [train.py:1204] (2/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_321-335475-0_sp1.1 from training. Duration: 0.4090625 2023-10-02 19:06:20,425 WARNING [train.py:1204] (2/4) Exclude cut with ID _66863_210_18053_1_1535698758334_8590319_1092-271784-0 from training. Duration: 0.96 2023-10-02 19:06:20,447 WARNING [train.py:1204] (2/4) Exclude cut with ID _28317_210_12742_1_1532480302381_8510325_607-213951-0_sp1.1 from training. Duration: 0.763625 2023-10-02 19:06:20,462 WARNING [train.py:1204] (2/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_468-283113-0 from training. Duration: 0.96 2023-10-02 19:06:23,251 WARNING [train.py:1204] (2/4) Exclude cut with ID _14922_210_6228_1_1530707435273_3693310_13-165512-0_sp1.1 from training. Duration: 0.7454375 2023-10-02 19:06:23,268 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_69630_210_7234_1_1535788670_910989_56-169762-0_sp1.1 from training. Duration: 0.92725 2023-10-02 19:06:26,006 WARNING [train.py:1204] (2/4) Exclude cut with ID _67335_210_5219_1_1535525975652_4275229_121-80357-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 19:06:26,060 WARNING [train.py:1204] (2/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_126-12079-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 19:06:26,092 WARNING [train.py:1204] (2/4) Exclude cut with ID _5294_210_1747_1_1527249643928_3696179_342-270278-0_sp1.1 from training. Duration: 0.5 2023-10-02 19:06:28,722 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_30322_210_1925_1_1532843478_826817_35-328264-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 19:06:30,061 INFO [train.py:1046] (2/4) Epoch 28, batch 4600, loss[loss=0.1636, simple_loss=0.26, pruned_loss=0.03364, over 24421.00 frames. ], tot_loss[loss=0.1655, simple_loss=0.243, pruned_loss=0.04402, over 4719044.16 frames. ], batch size: 69, lr: 3.63e-03, grad_scale: 16.0 2023-10-02 19:06:30,126 WARNING [train.py:1204] (2/4) Exclude cut with ID _8480_210_1874_1_1528589391205_6728330_755-112625-0_sp1.1 from training. Duration: 0.67275 2023-10-02 19:06:31,554 WARNING [train.py:1204] (2/4) Exclude cut with ID _38670_210_15379_1_1533448810659_4391059_132-146412-0_sp1.1 from training. Duration: 0.97275 2023-10-02 19:06:31,625 WARNING [train.py:1204] (2/4) Exclude cut with ID _22020_210_2461_1_1531724164513_6326900_563-39691-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 19:06:36,281 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_9460_210_4211_1_1528954587_10049667_572-37035-0_sp1.1 from training. Duration: 0.72725 2023-10-02 19:06:36,292 WARNING [train.py:1204] (2/4) Exclude cut with ID _68777_210_4353_1_1535810390731_3117904_129-166012-0_sp0.9 from training. Duration: 0.9444375 2023-10-02 19:06:37,574 WARNING [train.py:1204] (2/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_487-91921-0_sp1.1 from training. Duration: 0.963625 2023-10-02 19:06:39,454 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_195-257512-0 from training. Duration: 0.67 2023-10-02 19:06:39,647 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.conv_module1.balancer2.min_abs, batch_count=986846.6666666666, ans=0.5 2023-10-02 19:06:40,835 WARNING [train.py:1204] (2/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_188-260011-0_sp1.1 from training. Duration: 0.663625 2023-10-02 19:06:43,621 WARNING [train.py:1204] (2/4) Exclude cut with ID _8480_210_1874_1_1528589391205_6728330_309-139896-0_sp0.9 from training. Duration: 0.9 2023-10-02 19:06:45,016 WARNING [train.py:1204] (2/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_866-321248-0_sp1.1 from training. Duration: 0.963625 2023-10-02 19:06:48,196 WARNING [train.py:1204] (2/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_510-216513-0_sp1.1 from training. Duration: 0.97275 2023-10-02 19:06:53,338 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.feed_forward3.hidden_balancer.prob, batch_count=986913.3333333334, ans=0.125 2023-10-02 19:06:54,515 WARNING [train.py:1204] (2/4) Exclude cut with ID _6653_210_2629_1_1527919172441_6732679_758-143677-0 from training. Duration: 0.98 2023-10-02 19:06:55,744 WARNING [train.py:1204] (2/4) Exclude cut with ID _55134_210_21982_1_1534590028377_3882170_50-214427-0_sp1.1 from training. Duration: 0.97275 2023-10-02 19:06:57,266 WARNING [train.py:1204] (2/4) Exclude cut with ID _31649_210_6228_1_1532584608063_4091078_482-301330-0_sp1.1 from training. Duration: 0.97275 2023-10-02 19:06:58,742 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.1.ff3_skip_rate, batch_count=986980.0, ans=0.0 2023-10-02 19:07:00,041 WARNING [train.py:1204] (2/4) Exclude cut with ID _35799_210_5854_1_1533263487887_3529071_490-326847-0_sp1.1 from training. Duration: 0.9 2023-10-02 19:07:00,049 WARNING [train.py:1204] (2/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_984-90716-0_sp1.1 from training. Duration: 0.963625 2023-10-02 19:07:06,294 WARNING [train.py:1204] (2/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_166-246557-0 from training. Duration: 0.64 2023-10-02 19:07:06,294 WARNING [train.py:1204] (2/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_51-200220-0_sp1.1 from training. Duration: 0.5090625 2023-10-02 19:07:07,476 WARNING [train.py:1204] (2/4) Exclude cut with ID _51382_210_23862_1_1534492801838_3650104_275-92796-0_sp1.1 from training. Duration: 0.936375 2023-10-02 19:07:07,797 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.nonlin_attention.balancer.prob, batch_count=986980.0, ans=0.125 2023-10-02 19:07:10,863 WARNING [train.py:1204] (2/4) Exclude cut with ID _29853_210_7497_1_1532505600851_6642110_461-142829-0_sp1.1 from training. Duration: 0.97275 2023-10-02 19:07:12,176 WARNING [train.py:1204] (2/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_788-298907-0_sp0.9 from training. Duration: 0.988875 2023-10-02 19:07:12,288 WARNING [train.py:1204] (2/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_739-2098-0_sp1.1 from training. Duration: 0.836375 2023-10-02 19:07:14,968 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7543_210_6753_1_1528541907_1399322_165-323619-0 from training. Duration: 0.69 2023-10-02 19:07:16,412 WARNING [train.py:1204] (2/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_401-255491-0_sp1.1 from training. Duration: 0.636375 2023-10-02 19:07:21,666 WARNING [train.py:1204] (2/4) Exclude cut with ID _66873_210_5517_1_1535454136506_4293370_24-143283-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 19:07:22,988 WARNING [train.py:1204] (2/4) Exclude cut with ID _14459_210_2228_1_1531443641946_7516700_1221-280439-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 19:07:23,782 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.1.encoder.layers.0.feed_forward1.out_whiten, num_groups=1, num_channels=256, metric=5.18 vs. limit=15.0 2023-10-02 19:07:25,786 WARNING [train.py:1204] (2/4) Exclude cut with ID _59202_210_26984_1_1534816810612_3549174_469-213482-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 19:07:25,787 WARNING [train.py:1204] (2/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_148-158509-0_sp0.9 from training. Duration: 0.4555625 2023-10-02 19:07:25,829 WARNING [train.py:1204] (2/4) Exclude cut with ID _24314_210_4915_1_1532135078116_7170179_863-259095-0_sp1.1 from training. Duration: 0.97275 2023-10-02 19:07:27,044 WARNING [train.py:1204] (2/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_321-22821-0 from training. Duration: 0.89 2023-10-02 19:07:27,064 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_43831_210_2158_1_1533725502_5872791_55-163137-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 19:07:27,106 WARNING [train.py:1204] (2/4) Exclude cut with ID _27430_210_7770_1_1532604689375_1378590_26-25207-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 19:07:29,772 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_17613_210_2151_1_1531137569_2366160_223-316461-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 19:07:29,843 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_53679_210_4852_1_1534556726_4186141_541-213370-0_sp1.1 from training. Duration: 0.963625 2023-10-02 19:07:31,261 WARNING [train.py:1204] (2/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_369-295086-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 19:07:32,621 WARNING [train.py:1204] (2/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_91-267696-0 from training. Duration: 0.87 2023-10-02 19:07:32,675 WARNING [train.py:1204] (2/4) Exclude cut with ID _5294_210_1747_1_1527249643928_3696179_180-100713-0 from training. Duration: 0.81 2023-10-02 19:07:33,824 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.512e+02 1.870e+02 2.077e+02 2.555e+02 3.884e+02, threshold=4.155e+02, percent-clipped=0.0 2023-10-02 19:07:33,898 WARNING [train.py:1204] (2/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_32-335103-0 from training. Duration: 0.82 2023-10-02 19:07:33,905 WARNING [train.py:1204] (2/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_133-312727-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 19:07:34,008 WARNING [train.py:1204] (2/4) Exclude cut with ID _59288_210_5854_1_1535077757774_3412246_391-165934-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 19:07:34,820 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.1.encoder.layers.1.conv_module2.whiten, num_groups=1, num_channels=256, metric=10.42 vs. limit=15.0 2023-10-02 19:07:35,358 WARNING [train.py:1204] (2/4) Exclude cut with ID _28323_210_16187_1_1532480395667_4100300_488-1281-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 19:07:36,029 WARNING [train.py:1204] (2/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_124-193207-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 19:07:44,202 INFO [train.py:1046] (2/4) Epoch 28, batch 4650, loss[loss=0.1685, simple_loss=0.2358, pruned_loss=0.05063, over 23913.00 frames. ], tot_loss[loss=0.1648, simple_loss=0.2424, pruned_loss=0.04363, over 4721483.26 frames. ], batch size: 195, lr: 3.63e-03, grad_scale: 16.0 2023-10-02 19:07:46,929 WARNING [train.py:1204] (2/4) Exclude cut with ID _14087_210_2253_1_1530529224512_3014029_226-242140-0_sp0.9 from training. Duration: 0.9 2023-10-02 19:07:48,412 WARNING [train.py:1204] (2/4) Exclude cut with ID _29277_210_3279_1_1532581109293_3173069_392-3694-0_sp1.1 from training. Duration: 0.936375 2023-10-02 19:07:48,622 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=987180.0, ans=0.1 2023-10-02 19:07:49,718 WARNING [train.py:1204] (2/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_563-329842-0_sp1.1 from training. Duration: 0.97275 2023-10-02 19:07:49,764 WARNING [train.py:1204] (2/4) Exclude cut with ID _58583_210_5236_1_1534726835266_3574249_391-87755-0_sp1.1 from training. Duration: 0.92725 2023-10-02 19:07:49,782 WARNING [train.py:1204] (2/4) Exclude cut with ID _28427_210_4220_1_1532581240967_3061069_281-237827-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 19:07:49,802 WARNING [train.py:1204] (2/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_44-147898-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 19:07:53,693 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_24007_210_1794_1_1531918925_1663875_125-320772-0_sp1.1 from training. Duration: 0.97275 2023-10-02 19:07:54,009 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.bypass.scale_min, batch_count=987180.0, ans=0.2 2023-10-02 19:07:56,460 WARNING [train.py:1204] (2/4) Exclude cut with ID _14368_210_4220_1_1530532714870_3595529_143-117421-0 from training. Duration: 0.58 2023-10-02 19:07:56,620 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.conv_module1.balancer1.prob, batch_count=987180.0, ans=0.125 2023-10-02 19:07:58,065 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder_embed.conv.8.prob, batch_count=987246.6666666666, ans=0.125 2023-10-02 19:07:59,277 WARNING [train.py:1204] (2/4) Exclude cut with ID _69584_210_5219_1_1535785133775_4044450_325-7656-0_sp0.9 from training. Duration: 0.9555625 2023-10-02 19:07:59,403 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_37351_210_7925_1_1533012931_3812113_265-85780-0 from training. Duration: 0.88 2023-10-02 19:07:59,422 WARNING [train.py:1204] (2/4) Exclude cut with ID _18516_210_9206_1_1531704653206_3692406_331-237896-0_sp1.1 from training. Duration: 0.936375 2023-10-02 19:08:00,850 WARNING [train.py:1204] (2/4) Exclude cut with ID _14629_210_15709_1_1530604298182_956899_6-232695-0 from training. Duration: 0.94 2023-10-02 19:08:00,880 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7544_210_6753_1_1528587000_691468_87-178599-0_sp1.1 from training. Duration: 0.7909375 2023-10-02 19:08:02,193 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_326-303945-0 from training. Duration: 0.99 2023-10-02 19:08:02,214 WARNING [train.py:1204] (2/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_503-30719-0 from training. Duration: 0.98 2023-10-02 19:08:02,229 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_11305_210_1794_1_1529750428_7178148_299-344640-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 19:08:03,561 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_53071_210_16554_1_1534490405_2954066_265-170472-0_sp1.1 from training. Duration: 0.7545625 2023-10-02 19:08:06,717 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_174-277364-0_sp1.1 from training. Duration: 0.6454375 2023-10-02 19:08:08,119 WARNING [train.py:1204] (2/4) Exclude cut with ID _58414_210_8254_1_1535421609162_7151182_193-151884-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 19:08:08,146 WARNING [train.py:1204] (2/4) Exclude cut with ID IC0651W0104-322063 from training. Duration: 0.768 2023-10-02 19:08:08,372 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.bypass_mid.scale_min, batch_count=987246.6666666666, ans=0.2 2023-10-02 19:08:10,830 WARNING [train.py:1204] (2/4) Exclude cut with ID _21881_210_3525_1_1531825242757_3798380_139-305982-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 19:08:10,931 WARNING [train.py:1204] (2/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_79-200392-0 from training. Duration: 0.97 2023-10-02 19:08:13,657 WARNING [train.py:1204] (2/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_451-280894-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 19:08:13,665 WARNING [train.py:1204] (2/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_304-273433-0_sp1.1 from training. Duration: 0.9 2023-10-02 19:08:16,284 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_5959_210_1794_1_1527400400_3445726_412-44052-0 from training. Duration: 0.77 2023-10-02 19:08:16,404 WARNING [train.py:1204] (2/4) Exclude cut with ID _4681_210_3525_1_1527251040321_1235713_148-26159-0_sp1.1 from training. Duration: 0.963625 2023-10-02 19:08:19,104 WARNING [train.py:1204] (2/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_887-249555-0_sp0.9 from training. Duration: 0.8555625 2023-10-02 19:08:24,418 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_25236_210_2158_1_1532051968_280567_37-6860-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 19:08:30,012 WARNING [train.py:1204] (2/4) Exclude cut with ID _29277_210_3279_1_1532581109293_3173069_417-115527-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 19:08:31,467 WARNING [train.py:1204] (2/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_552-134748-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 19:08:31,528 WARNING [train.py:1204] (2/4) Exclude cut with ID _43176_210_8254_1_1533524417469_4174339_188-174143-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 19:08:31,560 WARNING [train.py:1204] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_127-119109-0_sp1.1 from training. Duration: 0.6545625 2023-10-02 19:08:35,605 WARNING [train.py:1204] (2/4) Exclude cut with ID _14922_210_6228_1_1530707435273_3693310_38-187851-0 from training. Duration: 0.62 2023-10-02 19:08:35,633 WARNING [train.py:1204] (2/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_588-125165-0 from training. Duration: 0.83 2023-10-02 19:08:36,303 WARNING [train.py:1204] (2/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_344-152526-0_sp1.1 from training. Duration: 0.4818125 2023-10-02 19:08:36,305 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7002_210_1794_1_1527935634_4034444_487-236501-0 from training. Duration: 0.66 2023-10-02 19:08:38,916 WARNING [train.py:1204] (2/4) Exclude cut with ID _23825_210_13449_1_1531915313526_3497150_379-16981-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 19:08:44,567 WARNING [train.py:1204] (2/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_297-5233-0_sp1.1 from training. Duration: 0.863625 2023-10-02 19:08:44,571 WARNING [train.py:1204] (2/4) Exclude cut with ID _41298_210_9019_1_1533430371450_4822143_178-283538-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 19:08:44,612 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7046_210_6753_1_1527895337_6920917_642-106799-0 from training. Duration: 0.92 2023-10-02 19:08:45,853 WARNING [train.py:1204] (2/4) Exclude cut with ID _6274_210_2084_1_1527937583837_7494020_532-291952-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 19:08:47,292 WARNING [train.py:1204] (2/4) Exclude cut with ID _47278_210_12041_1_1533902418349_3576110_260-270468-0_sp1.1 from training. Duration: 0.92725 2023-10-02 19:08:47,305 WARNING [train.py:1204] (2/4) Exclude cut with ID _14368_210_4220_1_1530532714870_3595529_144-3287-0_sp0.9 from training. Duration: 0.7444375 2023-10-02 19:08:47,402 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_162-262717-0_sp0.9 from training. Duration: 0.92225 2023-10-02 19:08:48,816 WARNING [train.py:1204] (2/4) Exclude cut with ID _3465_210_3525_1_1526175667060_3574049_62-335552-0_sp0.9 from training. Duration: 0.9666875 2023-10-02 19:08:48,821 WARNING [train.py:1204] (2/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_726-272852-0_sp1.1 from training. Duration: 0.92725 2023-10-02 19:08:50,183 WARNING [train.py:1204] (2/4) Exclude cut with ID _14898_210_9206_1_1530766812377_4177190_26-135492-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 19:08:53,462 WARNING [train.py:1204] (2/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_200-95304-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 19:08:54,715 WARNING [train.py:1204] (2/4) Exclude cut with ID _45012_210_12651_1_1533608435136_5371380_240-37101-0_sp1.1 from training. Duration: 0.8454375 2023-10-02 19:08:54,724 WARNING [train.py:1204] (2/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_132-269633-0_sp0.9 from training. Duration: 0.7555625 2023-10-02 19:08:54,805 WARNING [train.py:1204] (2/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_907-151398-0_sp0.9 from training. Duration: 0.411125 2023-10-02 19:08:57,389 INFO [train.py:1046] (2/4) Epoch 28, batch 4700, loss[loss=0.1794, simple_loss=0.2495, pruned_loss=0.05466, over 23818.00 frames. ], tot_loss[loss=0.165, simple_loss=0.2427, pruned_loss=0.04363, over 4723529.45 frames. ], batch size: 195, lr: 3.63e-03, grad_scale: 16.0 2023-10-02 19:08:57,444 WARNING [train.py:1204] (2/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_45-265684-0_sp0.9 from training. Duration: 0.8 2023-10-02 19:08:58,835 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6291_210_3273_1_1527683649_1078498_74-292608-0 from training. Duration: 0.72 2023-10-02 19:09:00,458 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.1.feed_forward1.out_proj.dropout_p, batch_count=987513.3333333334, ans=0.1 2023-10-02 19:09:06,270 WARNING [train.py:1204] (2/4) Exclude cut with ID _59202_210_26984_1_1534816810612_3549174_221-84819-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 19:09:06,552 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.conv_module2.balancer1.prob, batch_count=987513.3333333334, ans=0.125 2023-10-02 19:09:08,150 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_33696_210_3852_1_1533082780_2663375_181-325595-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 19:09:08,185 WARNING [train.py:1204] (2/4) Exclude cut with ID _47251_210_18628_1_1533814063128_4137250_351-154105-0_sp1.1 from training. Duration: 0.963625 2023-10-02 19:09:08,286 WARNING [train.py:1204] (2/4) Exclude cut with ID _35501_210_9288_1_1532998507986_3192189_351-334740-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 19:09:10,876 WARNING [train.py:1204] (2/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_342-100157-0_sp1.1 from training. Duration: 0.4909375 2023-10-02 19:09:14,912 WARNING [train.py:1204] (2/4) Exclude cut with ID _6789_210_1534_1_1527857937218_3118950_262-48853-0 from training. Duration: 0.56 2023-10-02 19:09:14,946 WARNING [train.py:1204] (2/4) Exclude cut with ID _69884_210_9771_1_1535846392818_4166169_430-201020-0 from training. Duration: 0.97 2023-10-02 19:09:16,400 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.nonlin_attention.balancer.min_positive, batch_count=987580.0, ans=0.05 2023-10-02 19:09:17,670 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_71-136518-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 19:09:19,033 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_28-291982-0_sp1.1 from training. Duration: 0.8818125 2023-10-02 19:09:20,735 WARNING [train.py:1204] (2/4) Exclude cut with ID _54218_210_12210_1_1534413646388_4409259_437-275326-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 19:09:23,551 WARNING [train.py:1204] (2/4) Exclude cut with ID _55134_210_21982_1_1534590028377_3882170_40-309139-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 19:09:23,685 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.1.conv_module2.balancer1.min_positive, batch_count=987580.0, ans=0.025 2023-10-02 19:09:30,071 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.5.encoder.layers.0.conv_module2.whiten, num_groups=1, num_channels=256, metric=4.86 vs. limit=15.0 2023-10-02 19:09:30,942 WARNING [train.py:1204] (2/4) Exclude cut with ID _32704_210_8954_1_1533017488840_6747370_266-329755-0_sp1.1 from training. Duration: 0.8090625 2023-10-02 19:09:32,302 WARNING [train.py:1204] (2/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_21-174514-0_sp1.1 from training. Duration: 0.3909375 2023-10-02 19:09:33,796 WARNING [train.py:1204] (2/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_409-207378-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 19:09:38,251 WARNING [train.py:1204] (2/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_132-269633-0 from training. Duration: 0.68 2023-10-02 19:09:40,215 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_2245_210_2715_1_1525511548_3540254_310-52483-0_sp0.9 from training. Duration: 0.97775 2023-10-02 19:09:42,839 WARNING [train.py:1204] (2/4) Exclude cut with ID _27430_210_7770_1_1532604689375_1378590_47-77403-0_sp1.1 from training. Duration: 0.92725 2023-10-02 19:09:47,041 WARNING [train.py:1204] (2/4) Exclude cut with ID _7562_210_2084_1_1528887767916_3946229_186-326422-0 from training. Duration: 0.84 2023-10-02 19:09:47,149 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_14056_210_4163_1_1530604483_7572827_230-35993-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 19:09:48,641 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.conv_skip_rate, batch_count=987713.3333333334, ans=0.0 2023-10-02 19:09:52,982 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_23854_210_1794_1_1531969696_1329157_57-123491-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 19:09:53,031 WARNING [train.py:1204] (2/4) Exclude cut with ID _14747_210_8341_1_1530685797463_3654089_147-131229-0 from training. Duration: 0.76 2023-10-02 19:09:55,853 WARNING [train.py:1204] (2/4) Exclude cut with ID _30191_210_1664_1_1533189915047_7196670_430-19539-0_sp1.1 from training. Duration: 0.92725 2023-10-02 19:09:55,867 WARNING [train.py:1204] (2/4) Exclude cut with ID _67725_210_27767_1_1535608756671_5105359_44-50762-0_sp1.1 from training. Duration: 0.963625 2023-10-02 19:09:57,422 WARNING [train.py:1204] (2/4) Exclude cut with ID _27485_210_4916_1_1532739191677_3668269_217-161755-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 19:09:57,459 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_5931_210_2040_1_1527428481_4547999_321-193614-0_sp1.1 from training. Duration: 0.6090625 2023-10-02 19:09:57,487 WARNING [train.py:1204] (2/4) Exclude cut with ID _6267_210_2084_1_1527764658526_3894730_375-219887-0 from training. Duration: 0.67 2023-10-02 19:09:59,308 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9087W0022-538393 from training. Duration: 0.768 2023-10-02 19:09:59,414 WARNING [train.py:1204] (2/4) Exclude cut with ID _68777_210_4353_1_1535810390731_3117904_83-196459-0_sp1.1 from training. Duration: 0.963625 2023-10-02 19:10:00,766 WARNING [train.py:1204] (2/4) Exclude cut with ID _69884_210_9771_1_1535846392818_4166169_455-343812-0_sp1.1 from training. Duration: 0.92725 2023-10-02 19:10:00,767 WARNING [train.py:1204] (2/4) Exclude cut with ID _13916_210_9994_1_1530788341126_3763149_414-320174-0_sp1.1 from training. Duration: 0.92725 2023-10-02 19:10:02,128 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.576e+02 1.850e+02 1.974e+02 2.169e+02 3.460e+02, threshold=3.948e+02, percent-clipped=0.0 2023-10-02 19:10:02,204 WARNING [train.py:1204] (2/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_398-239819-0 from training. Duration: 0.99 2023-10-02 19:10:02,290 WARNING [train.py:1204] (2/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_199-232314-0_sp1.1 from training. Duration: 0.92725 2023-10-02 19:10:04,907 WARNING [train.py:1204] (2/4) Exclude cut with ID _2334_210_2667_1_1525608238880_4705129_459-104997-0 from training. Duration: 0.95 2023-10-02 19:10:08,256 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder_embed.conv.8.prob, batch_count=987780.0, ans=0.125 2023-10-02 19:10:09,416 WARNING [train.py:1204] (2/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_441-209395-0_sp1.1 from training. Duration: 0.936375 2023-10-02 19:10:09,502 WARNING [train.py:1204] (2/4) Exclude cut with ID _27052_210_5986_1_1532329197187_3585009_320-80393-0_sp1.1 from training. Duration: 0.97275 2023-10-02 19:10:10,751 INFO [train.py:1046] (2/4) Epoch 28, batch 4750, loss[loss=0.1794, simple_loss=0.2481, pruned_loss=0.05533, over 22809.00 frames. ], tot_loss[loss=0.166, simple_loss=0.2437, pruned_loss=0.04418, over 4714136.07 frames. ], batch size: 322, lr: 3.63e-03, grad_scale: 8.0 2023-10-02 19:10:13,708 WARNING [train.py:1204] (2/4) Exclude cut with ID _21783_210_11357_1_1531742458730_3762679_89-217253-0_sp1.1 from training. Duration: 0.97275 2023-10-02 19:10:13,731 WARNING [train.py:1204] (2/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_453-282588-0_sp1.1 from training. Duration: 0.8909375 2023-10-02 19:10:15,102 WARNING [train.py:1204] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_88-341718-0 from training. Duration: 0.66 2023-10-02 19:10:15,144 WARNING [train.py:1204] (2/4) Exclude cut with ID _43552_210_8913_1_1533726031059_3523429_230-3521-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 19:10:17,878 WARNING [train.py:1204] (2/4) Exclude cut with ID _9679_210_7497_1_1529129212906_4363929_512-313889-0 from training. Duration: 0.81 2023-10-02 19:10:21,064 WARNING [train.py:1204] (2/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_194-252800-0_sp1.1 from training. Duration: 0.8818125 2023-10-02 19:10:21,091 WARNING [train.py:1204] (2/4) Exclude cut with ID _55209_210_1531_1_1534675828996_4597160_239-293541-0_sp1.1 from training. Duration: 0.963625 2023-10-02 19:10:21,136 WARNING [train.py:1204] (2/4) Exclude cut with ID _35799_210_5854_1_1533263487887_3529071_15-325131-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 19:10:25,482 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.feed_forward3.hidden_balancer.prob, batch_count=987913.3333333334, ans=0.125 2023-10-02 19:10:26,676 WARNING [train.py:1204] (2/4) Exclude cut with ID _14100_210_8341_1_1530529320613_3678069_279-55760-0 from training. Duration: 0.93 2023-10-02 19:10:32,668 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_489-230703-0_sp1.1 from training. Duration: 0.77275 2023-10-02 19:10:34,042 WARNING [train.py:1204] (2/4) Exclude cut with ID _6694_210_4599_1_1527852637490_3965529_144-185051-0 from training. Duration: 0.96 2023-10-02 19:10:34,112 WARNING [train.py:1204] (2/4) Exclude cut with ID _28461_210_2253_1_1532771775189_3041299_1-228155-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 19:10:38,611 WARNING [train.py:1204] (2/4) Exclude cut with ID _32704_210_8954_1_1533017488840_6747370_686-252323-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 19:10:38,614 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_40896_210_15710_1_1533433956_7100802_1075-294633-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 19:10:38,634 WARNING [train.py:1204] (2/4) Exclude cut with ID _22083_210_8254_1_1531702862439_3678970_30-92879-0_sp1.1 from training. Duration: 0.97275 2023-10-02 19:10:38,713 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9245W0121-616888 from training. Duration: 0.896 2023-10-02 19:10:40,532 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6786_210_1388_1_1527839699_4194495_378-46070-0 from training. Duration: 0.72 2023-10-02 19:10:46,373 WARNING [train.py:1204] (2/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_317-175062-0 from training. Duration: 0.82 2023-10-02 19:10:49,270 WARNING [train.py:1204] (2/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_238-89390-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 19:10:50,492 WARNING [train.py:1204] (2/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_524-310811-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 19:10:53,682 WARNING [train.py:1204] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_307-158462-0_sp0.9 from training. Duration: 0.7666875 2023-10-02 19:10:53,683 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9309W0191-640318 from training. Duration: 0.512 2023-10-02 19:10:53,686 WARNING [train.py:1204] (2/4) Exclude cut with ID _55153_210_2253_1_1534384786677_3162096_100-204617-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 19:10:56,416 WARNING [train.py:1204] (2/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_112-149213-0_sp1.1 from training. Duration: 0.863625 2023-10-02 19:10:59,115 WARNING [train.py:1204] (2/4) Exclude cut with ID _67831_210_12392_1_1535612464202_3092612_346-2148-0_sp1.1 from training. Duration: 0.8454375 2023-10-02 19:11:00,440 WARNING [train.py:1204] (2/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_51-117576-0_sp0.9 from training. Duration: 0.4444375 2023-10-02 19:11:01,908 WARNING [train.py:1204] (2/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_300-27235-0 from training. Duration: 0.87 2023-10-02 19:11:01,968 WARNING [train.py:1204] (2/4) Exclude cut with ID _40399_210_4353_1_1533305658194_3279406_195-116825-0_sp1.1 from training. Duration: 0.97275 2023-10-02 19:11:01,993 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7002_210_1794_1_1527939683_5157286_163-87552-0_sp0.9 from training. Duration: 0.9555625 2023-10-02 19:11:02,022 WARNING [train.py:1204] (2/4) Exclude cut with ID _19514_210_9259_1_1531530071330_5084230_560-237597-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 19:11:03,318 WARNING [train.py:1204] (2/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_41-234353-0_sp1.1 from training. Duration: 0.6181875 2023-10-02 19:11:03,342 WARNING [train.py:1204] (2/4) Exclude cut with ID _6274_210_2084_1_1527937583837_7494020_769-168302-0 from training. Duration: 0.71 2023-10-02 19:11:04,784 WARNING [train.py:1204] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_574-45541-0 from training. Duration: 0.65 2023-10-02 19:11:09,809 WARNING [train.py:1204] (2/4) Exclude cut with ID _50310_210_15488_1_1534068043345_3641080_262-67120-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 19:11:14,494 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_53071_210_16554_1_1534493434_1276796_35-11927-0_sp1.1 from training. Duration: 0.963625 2023-10-02 19:11:14,496 WARNING [train.py:1204] (2/4) Exclude cut with ID _3992_210_1747_1_1526644816031_3731060_107-251434-0 from training. Duration: 0.99 2023-10-02 19:11:14,545 WARNING [train.py:1204] (2/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_412-54488-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 19:11:14,730 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.bypass.scale_min, batch_count=988113.3333333334, ans=0.2 2023-10-02 19:11:15,920 WARNING [train.py:1204] (2/4) Exclude cut with ID _18516_210_9206_1_1531704653206_3692406_77-258490-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 19:11:17,377 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_40047_210_2290_1_1533290519_2374311_195-165693-0_sp0.9 from training. Duration: 0.988875 2023-10-02 19:11:18,755 WARNING [train.py:1204] (2/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_296-36971-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 19:11:18,813 WARNING [train.py:1204] (2/4) Exclude cut with ID _14970_210_15709_1_1530773543493_1715730_187-20259-0_sp1.1 from training. Duration: 0.8090625 2023-10-02 19:11:21,608 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_53373_210_5399_1_1534229471_4715695_135-305927-0_sp1.1 from training. Duration: 0.936375 2023-10-02 19:11:21,865 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.conv_module2.balancer1.prob, batch_count=988113.3333333334, ans=0.125 2023-10-02 19:11:22,979 WARNING [train.py:1204] (2/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_60-18123-0 from training. Duration: 0.88 2023-10-02 19:11:23,043 WARNING [train.py:1204] (2/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_166-166990-0 from training. Duration: 0.96 2023-10-02 19:11:24,791 INFO [train.py:1046] (2/4) Epoch 28, batch 4800, loss[loss=0.1915, simple_loss=0.2636, pruned_loss=0.05967, over 23883.00 frames. ], tot_loss[loss=0.1674, simple_loss=0.2453, pruned_loss=0.04469, over 4708487.50 frames. ], batch size: 196, lr: 3.63e-03, grad_scale: 16.0 2023-10-02 19:11:24,846 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_5438_210_1794_1_1527336687_2600009_233-58072-0 from training. Duration: 0.77 2023-10-02 19:11:26,371 WARNING [train.py:1204] (2/4) Exclude cut with ID _8833_210_5219_1_1528592665210_7553059_611-80313-0_sp0.9 from training. Duration: 0.711125 2023-10-02 19:11:27,663 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_53555_210_5632_1_1534595220_7232297_118-43239-0_sp1.1 from training. Duration: 0.936375 2023-10-02 19:11:27,790 WARNING [train.py:1204] (2/4) Exclude cut with ID _12369_210_4381_1_1530356078581_4388520_480-13146-0 from training. Duration: 0.85 2023-10-02 19:11:30,771 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_892-28814-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 19:11:32,160 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_24899_210_16554_1_1532046422_3827462_160-21255-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 19:11:39,529 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_154-316450-0_sp1.1 from training. Duration: 0.6909375 2023-10-02 19:11:40,957 WARNING [train.py:1204] (2/4) Exclude cut with ID _30196_210_1664_1_1533708496522_7074140_1225-301232-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 19:11:40,994 WARNING [train.py:1204] (2/4) Exclude cut with ID _28783_210_7943_1_1532671164808_7155479_414-208100-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 19:11:41,033 WARNING [train.py:1204] (2/4) Exclude cut with ID _19023_210_4353_1_1531378892552_5820780_83-254794-0 from training. Duration: 0.86 2023-10-02 19:11:42,396 WARNING [train.py:1204] (2/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_591-83752-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 19:11:42,428 WARNING [train.py:1204] (2/4) Exclude cut with ID _29877_210_6228_1_1532397791106_5276149_266-62291-0_sp1.1 from training. Duration: 0.7909375 2023-10-02 19:11:42,546 WARNING [train.py:1204] (2/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_280-161633-0_sp1.1 from training. Duration: 0.663625 2023-10-02 19:11:47,275 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7543_210_6753_1_1528539468_333764_39-156634-0_sp1.1 from training. Duration: 0.97275 2023-10-02 19:11:48,635 WARNING [train.py:1204] (2/4) Exclude cut with ID _67499_210_17282_1_1535509805220_3692430_505-312248-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 19:11:49,309 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.2.encoder.layers.0.feed_forward1.out_whiten, num_groups=1, num_channels=384, metric=5.54 vs. limit=15.0 2023-10-02 19:11:49,956 WARNING [train.py:1204] (2/4) Exclude cut with ID _5294_210_1747_1_1527249643928_3696179_180-100713-0_sp0.9 from training. Duration: 0.9 2023-10-02 19:11:51,423 WARNING [train.py:1204] (2/4) Exclude cut with ID _26528_210_9070_1_1532570410842_3851310_508-148132-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 19:11:51,443 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_211-339622-0_sp1.1 from training. Duration: 0.4090625 2023-10-02 19:11:51,464 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_40896_210_15710_1_1533433956_7100802_339-138356-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 19:11:51,530 WARNING [train.py:1204] (2/4) Exclude cut with ID _27060_210_5986_1_1532674838922_3522549_211-19933-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 19:11:56,139 WARNING [train.py:1204] (2/4) Exclude cut with ID _60229_210_27767_1_1534838406158_3655350_457-117923-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 19:11:58,857 WARNING [train.py:1204] (2/4) Exclude cut with ID _55429_210_10626_1_1534467588527_7174143_460-141439-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 19:11:59,758 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.1.encoder.layers.1.nonlin_attention.whiten2, num_groups=1, num_channels=256, metric=7.47 vs. limit=15.0 2023-10-02 19:12:01,510 WARNING [train.py:1204] (2/4) Exclude cut with ID _59170_210_6721_1_1534983633960_4203335_263-10780-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 19:12:01,529 WARNING [train.py:1204] (2/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_872-264525-0_sp0.9 from training. Duration: 0.72225 2023-10-02 19:12:02,882 WARNING [train.py:1204] (2/4) Exclude cut with ID _7624_210_1531_1_1528109802198_4933101_418-349834-0_sp0.9 from training. Duration: 0.6666875 2023-10-02 19:12:02,975 WARNING [train.py:1204] (2/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_27-53998-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 19:12:04,375 WARNING [train.py:1204] (2/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_202-183922-0 from training. Duration: 0.85 2023-10-02 19:12:04,399 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7002_210_1794_1_1527939683_5157286_1-281264-0 from training. Duration: 0.44 2023-10-02 19:12:04,480 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_53753_210_7234_1_1534320003_3594148_493-9314-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 19:12:05,800 WARNING [train.py:1204] (2/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_217-95056-0_sp1.1 from training. Duration: 0.8909375 2023-10-02 19:12:05,836 WARNING [train.py:1204] (2/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_406-45086-0_sp1.1 from training. Duration: 0.72725 2023-10-02 19:12:05,841 WARNING [train.py:1204] (2/4) Exclude cut with ID _31649_210_6228_1_1532584608063_4091078_505-213821-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 19:12:05,855 WARNING [train.py:1204] (2/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_317-175062-0_sp0.9 from training. Duration: 0.911125 2023-10-02 19:12:08,705 WARNING [train.py:1204] (2/4) Exclude cut with ID _5444_210_2151_1_1527418808464_5312210_726-27165-0_sp1.1 from training. Duration: 0.6090625 2023-10-02 19:12:08,755 WARNING [train.py:1204] (2/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_494-170437-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 19:12:13,809 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_474-64054-0_sp1.1 from training. Duration: 0.936375 2023-10-02 19:12:13,974 WARNING [train.py:1204] (2/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_521-7556-0_sp1.1 from training. Duration: 0.97275 2023-10-02 19:12:15,815 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_269-348505-0_sp1.1 from training. Duration: 0.92725 2023-10-02 19:12:19,923 WARNING [train.py:1204] (2/4) Exclude cut with ID _21878_210_3525_1_1531738772002_7264330_559-184862-0 from training. Duration: 0.94 2023-10-02 19:12:21,063 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_68702_210_23689_1_1535799577_3420068_92-76040-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 19:12:21,100 WARNING [train.py:1204] (2/4) Exclude cut with ID _22826_210_9994_1_1531908201522_3279169_227-102052-0_sp1.1 from training. Duration: 0.97275 2023-10-02 19:12:21,137 WARNING [train.py:1204] (2/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_718-250907-0_sp0.9 from training. Duration: 0.7666875 2023-10-02 19:12:22,717 WARNING [train.py:1204] (2/4) Exclude cut with ID _14069_210_5632_1_1530512692033_4803390_352-143891-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 19:12:22,803 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder_embed.conv.8.prob, batch_count=988446.6666666666, ans=0.125 2023-10-02 19:12:27,280 WARNING [train.py:1204] (2/4) Exclude cut with ID _6267_210_2084_1_1527764658526_3894730_65-106602-0_sp1.1 from training. Duration: 0.936375 2023-10-02 19:12:27,347 WARNING [train.py:1204] (2/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_202-183922-0_sp0.9 from training. Duration: 0.9444375 2023-10-02 19:12:27,353 WARNING [train.py:1204] (2/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_465-268827-0_sp1.1 from training. Duration: 0.97275 2023-10-02 19:12:28,651 WARNING [train.py:1204] (2/4) Exclude cut with ID _12369_210_4381_1_1530356078581_4388520_58-29064-0_sp1.1 from training. Duration: 0.9 2023-10-02 19:12:28,679 WARNING [train.py:1204] (2/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_1-99680-0_sp0.9 from training. Duration: 0.7444375 2023-10-02 19:12:30,075 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.581e+02 1.923e+02 2.078e+02 2.346e+02 3.782e+02, threshold=4.155e+02, percent-clipped=0.0 2023-10-02 19:12:30,189 WARNING [train.py:1204] (2/4) Exclude cut with ID _13253_210_1925_1_1530757775819_3577873_191-144977-0_sp1.1 from training. Duration: 0.7090625 2023-10-02 19:12:34,292 WARNING [train.py:1204] (2/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_276-112684-0_sp1.1 from training. Duration: 0.92725 2023-10-02 19:12:34,307 WARNING [train.py:1204] (2/4) Exclude cut with ID _64639_210_5816_1_1535367619437_3528000_358-411-0_sp1.1 from training. Duration: 0.97275 2023-10-02 19:12:34,313 WARNING [train.py:1204] (2/4) Exclude cut with ID _31649_210_6228_1_1532584608063_4091078_106-21199-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 19:12:35,698 WARNING [train.py:1204] (2/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_901-79438-0 from training. Duration: 0.94 2023-10-02 19:12:37,239 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.conv_module2.balancer2.prob, batch_count=988513.3333333334, ans=0.125 2023-10-02 19:12:38,346 INFO [train.py:1046] (2/4) Epoch 28, batch 4850, loss[loss=0.1532, simple_loss=0.2441, pruned_loss=0.03116, over 24326.00 frames. ], tot_loss[loss=0.1673, simple_loss=0.2452, pruned_loss=0.04471, over 4701827.99 frames. ], batch size: 74, lr: 3.63e-03, grad_scale: 16.0 2023-10-02 19:12:38,429 WARNING [train.py:1204] (2/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_718-250907-0 from training. Duration: 0.69 2023-10-02 19:12:38,435 WARNING [train.py:1204] (2/4) Exclude cut with ID _5287_210_2084_1_1527418883040_3878670_363-194161-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 19:12:38,446 WARNING [train.py:1204] (2/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_586-321321-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 19:12:40,425 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_68900_210_2087_1_1535713157_3708529_164-292661-0_sp1.1 from training. Duration: 0.963625 2023-10-02 19:12:40,426 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_26433_210_12108_1_1532157158_4056480_114-144878-0_sp1.1 from training. Duration: 0.97275 2023-10-02 19:12:43,286 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.0.layers.0.conv_module2.whiten, num_groups=1, num_channels=192, metric=11.51 vs. limit=15.0 2023-10-02 19:12:43,612 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_15517_210_6753_1_1530954008_3487336_96-341874-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 19:12:49,963 WARNING [train.py:1204] (2/4) Exclude cut with ID _14268_210_9206_1_1530622771285_4714099_637-86630-0 from training. Duration: 0.51 2023-10-02 19:12:52,659 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_28211_210_6388_1_1532861506_4523224_121-320092-0_sp1.1 from training. Duration: 0.92725 2023-10-02 19:12:56,023 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_141-89113-0_sp0.9 from training. Duration: 0.9555625 2023-10-02 19:12:57,424 WARNING [train.py:1204] (2/4) Exclude cut with ID _6789_210_1534_1_1527857937218_3118950_311-61262-0_sp1.1 from training. Duration: 0.5909375 2023-10-02 19:12:57,443 WARNING [train.py:1204] (2/4) Exclude cut with ID _58480_210_12392_1_1534811121111_3646160_389-200461-0_sp1.1 from training. Duration: 0.97275 2023-10-02 19:13:00,294 WARNING [train.py:1204] (2/4) Exclude cut with ID _41326_210_5854_1_1533778076509_3492080_28-156553-0_sp1.1 from training. Duration: 0.92725 2023-10-02 19:13:01,564 WARNING [train.py:1204] (2/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_577-287150-0_sp1.1 from training. Duration: 0.7454375 2023-10-02 19:13:01,797 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.conv_module1.balancer1.prob, batch_count=988580.0, ans=0.125 2023-10-02 19:13:02,936 WARNING [train.py:1204] (2/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_40-129756-0_sp1.1 from training. Duration: 0.736375 2023-10-02 19:13:04,203 WARNING [train.py:1204] (2/4) Exclude cut with ID _60113_210_16271_1_1535164212481_4386079_490-280290-0 from training. Duration: 0.98 2023-10-02 19:13:06,911 WARNING [train.py:1204] (2/4) Exclude cut with ID _58059_210_5854_1_1534901471149_3164808_34-39905-0_sp1.1 from training. Duration: 0.963625 2023-10-02 19:13:09,774 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_791-197891-0_sp1.1 from training. Duration: 0.763625 2023-10-02 19:13:09,797 WARNING [train.py:1204] (2/4) Exclude cut with ID _6789_210_1534_1_1527857937218_3118950_262-48853-0_sp1.1 from training. Duration: 0.5090625 2023-10-02 19:13:11,161 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_2655_210_2087_1_1525688959_3897400_52-279899-0_sp0.9 from training. Duration: 0.7666875 2023-10-02 19:13:11,166 WARNING [train.py:1204] (2/4) Exclude cut with ID _14884_210_2252_1_1530707459689_5561250_114-201399-0 from training. Duration: 0.76 2023-10-02 19:13:14,957 WARNING [train.py:1204] (2/4) Exclude cut with ID _19023_210_4353_1_1531378892552_5820780_83-254794-0_sp0.9 from training. Duration: 0.9555625 2023-10-02 19:13:14,978 WARNING [train.py:1204] (2/4) Exclude cut with ID _53489_210_6228_1_1534298357169_3835970_10-297919-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 19:13:19,216 WARNING [train.py:1204] (2/4) Exclude cut with ID _44585_210_18628_1_1533605350387_4518199_418-3055-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 19:13:19,235 WARNING [train.py:1204] (2/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_310-109461-0 from training. Duration: 0.8 2023-10-02 19:13:19,280 WARNING [train.py:1204] (2/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_235-262124-0 from training. Duration: 0.67 2023-10-02 19:13:21,212 WARNING [train.py:1204] (2/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_195-167011-0_sp1.1 from training. Duration: 0.6818125 2023-10-02 19:13:30,054 WARNING [train.py:1204] (2/4) Exclude cut with ID _7852_210_5219_1_1528286471316_8139400_188-55162-0_sp1.1 from training. Duration: 0.936375 2023-10-02 19:13:30,113 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_35361_210_4238_1_1533262810_7793476_675-87012-0 from training. Duration: 0.89 2023-10-02 19:13:31,393 WARNING [train.py:1204] (2/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_465-81427-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 19:13:31,400 WARNING [train.py:1204] (2/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_597-154640-0_sp1.1 from training. Duration: 0.8181875 2023-10-02 19:13:32,834 WARNING [train.py:1204] (2/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_186-309161-0_sp0.9 from training. Duration: 0.6 2023-10-02 19:13:34,165 WARNING [train.py:1204] (2/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_546-119312-0 from training. Duration: 0.99 2023-10-02 19:13:34,169 WARNING [train.py:1204] (2/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_330-240060-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 19:13:37,001 WARNING [train.py:1204] (2/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_477-255782-0 from training. Duration: 0.82 2023-10-02 19:13:37,028 WARNING [train.py:1204] (2/4) Exclude cut with ID _14970_210_15709_1_1530773543493_1715730_150-248939-0_sp1.1 from training. Duration: 0.97275 2023-10-02 19:13:37,118 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_556-206615-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 19:13:38,522 WARNING [train.py:1204] (2/4) Exclude cut with ID _3465_210_3525_1_1526175667060_3574049_308-231735-0 from training. Duration: 0.73 2023-10-02 19:13:47,894 WARNING [train.py:1204] (2/4) Exclude cut with ID _27485_210_4916_1_1532739191677_3668269_66-236571-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 19:13:52,753 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_2221_210_1988_1_1525597051_4935541_415-296882-0_sp0.9 from training. Duration: 0.8555625 2023-10-02 19:13:52,760 WARNING [train.py:1204] (2/4) Exclude cut with ID _64521_210_14699_1_1535243427259_3815328_231-78630-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 19:13:54,087 INFO [train.py:1046] (2/4) Epoch 28, batch 4900, loss[loss=0.1589, simple_loss=0.2471, pruned_loss=0.03529, over 24440.00 frames. ], tot_loss[loss=0.166, simple_loss=0.2438, pruned_loss=0.04408, over 4710608.50 frames. ], batch size: 69, lr: 3.63e-03, grad_scale: 8.0 2023-10-02 19:13:56,918 WARNING [train.py:1204] (2/4) Exclude cut with ID _13942_210_9988_1_1530604906036_3733360_499-55676-0 from training. Duration: 0.62 2023-10-02 19:13:56,920 WARNING [train.py:1204] (2/4) Exclude cut with ID _49376_210_5986_1_1533985278732_3370740_301-154763-0_sp1.1 from training. Duration: 0.8909375 2023-10-02 19:13:57,056 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.balancer1.prob, batch_count=988846.6666666666, ans=0.125 2023-10-02 19:14:01,813 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.conv_skip_rate, batch_count=988846.6666666666, ans=0.0 2023-10-02 19:14:02,968 WARNING [train.py:1204] (2/4) Exclude cut with ID _27485_210_4916_1_1532739191677_3668269_255-216698-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 19:14:03,068 WARNING [train.py:1204] (2/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_137-334143-0_sp1.1 from training. Duration: 0.97275 2023-10-02 19:14:04,339 WARNING [train.py:1204] (2/4) Exclude cut with ID _6456_210_1534_1_1527681634212_3181049_215-309359-0_sp1.1 from training. Duration: 0.8 2023-10-02 19:14:07,227 WARNING [train.py:1204] (2/4) Exclude cut with ID _65640_210_6310_1_1535368201445_3173250_209-13008-0 from training. Duration: 0.99 2023-10-02 19:14:12,729 WARNING [train.py:1204] (2/4) Exclude cut with ID _12369_210_4381_1_1530356078581_4388520_58-29064-0 from training. Duration: 0.99 2023-10-02 19:14:15,993 WARNING [train.py:1204] (2/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_410-287857-0 from training. Duration: 0.93 2023-10-02 19:14:17,369 WARNING [train.py:1204] (2/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_153-153537-0 from training. Duration: 0.91 2023-10-02 19:14:18,034 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7544_210_6753_1_1528587722_3576686_172-171750-0_sp0.9 from training. Duration: 0.72225 2023-10-02 19:14:18,069 WARNING [train.py:1204] (2/4) Exclude cut with ID _64521_210_14699_1_1535243427259_3815328_464-183853-0_sp1.1 from training. Duration: 0.97275 2023-10-02 19:14:18,087 WARNING [train.py:1204] (2/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_385-210308-0_sp1.1 from training. Duration: 0.963625 2023-10-02 19:14:18,094 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_46013_210_7234_1_1533776362_3744032_354-261865-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 19:14:19,372 WARNING [train.py:1204] (2/4) Exclude cut with ID _3067_210_2667_1_1526212974640_4503168_250-285937-0_sp1.1 from training. Duration: 0.72725 2023-10-02 19:14:19,445 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_40036_210_13405_1_1533648647_1315388_48-146016-0 from training. Duration: 0.93 2023-10-02 19:14:23,977 WARNING [train.py:1204] (2/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_280-161633-0 from training. Duration: 0.73 2023-10-02 19:14:25,288 WARNING [train.py:1204] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_308-161067-0_sp0.9 from training. Duration: 0.7333125 2023-10-02 19:14:25,380 WARNING [train.py:1204] (2/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_68-212801-0_sp0.9 from training. Duration: 0.811125 2023-10-02 19:14:26,584 WARNING [train.py:1204] (2/4) Exclude cut with ID _6789_210_1534_1_1527857937218_3118950_311-61262-0_sp0.9 from training. Duration: 0.72225 2023-10-02 19:14:26,737 WARNING [train.py:1204] (2/4) Exclude cut with ID _14632_210_9988_1_1530691279392_3923200_102-204477-0_sp1.1 from training. Duration: 0.8818125 2023-10-02 19:14:28,191 WARNING [train.py:1204] (2/4) Exclude cut with ID _65242_210_18628_1_1535369393717_3823149_290-117517-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 19:14:28,292 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_69630_210_7234_1_1535788670_910989_35-337140-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 19:14:28,300 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_5618_210_1874_1_1527311008_5851_1-214501-0 from training. Duration: 0.689 2023-10-02 19:14:29,757 WARNING [train.py:1204] (2/4) Exclude cut with ID _8064_210_5399_1_1528372299291_4282973_168-50337-0_sp0.9 from training. Duration: 0.9333125 2023-10-02 19:14:30,018 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.feed_forward1.out_proj.dropout_p, batch_count=988980.0, ans=0.1 2023-10-02 19:14:31,776 WARNING [train.py:1204] (2/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_185-14438-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 19:14:31,792 WARNING [train.py:1204] (2/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_61-327062-0 from training. Duration: 0.81 2023-10-02 19:14:31,798 WARNING [train.py:1204] (2/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_98-741-0 from training. Duration: 0.8 2023-10-02 19:14:35,745 WARNING [train.py:1204] (2/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_312-204798-0 from training. Duration: 0.93 2023-10-02 19:14:37,092 WARNING [train.py:1204] (2/4) Exclude cut with ID _14998_210_5219_1_1530705582080_7474970_787-294416-0_sp0.9 from training. Duration: 0.911125 2023-10-02 19:14:38,501 WARNING [train.py:1204] (2/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_140-181176-0_sp1.1 from training. Duration: 0.836375 2023-10-02 19:14:38,541 WARNING [train.py:1204] (2/4) Exclude cut with ID _14687_210_5854_1_1530703838259_3664889_115-239046-0_sp1.1 from training. Duration: 0.6090625 2023-10-02 19:14:39,879 WARNING [train.py:1204] (2/4) Exclude cut with ID _56423_210_5854_1_1534896061049_3165969_370-230614-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 19:14:39,926 WARNING [train.py:1204] (2/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_406-275649-0_sp0.9 from training. Duration: 0.4666875 2023-10-02 19:14:39,937 WARNING [train.py:1204] (2/4) Exclude cut with ID _15150_210_8341_1_1530752411803_7256384_22-138274-0_sp1.1 from training. Duration: 0.87275 2023-10-02 19:14:39,970 WARNING [train.py:1204] (2/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_125-125818-0 from training. Duration: 0.96 2023-10-02 19:14:44,469 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_4399_210_1874_1_1526705485_2400757_220-47974-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 19:14:46,465 WARNING [train.py:1204] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_308-161067-0_sp1.1 from training. Duration: 0.6 2023-10-02 19:14:47,849 WARNING [train.py:1204] (2/4) Exclude cut with ID _53489_210_6228_1_1534298357169_3835970_102-57645-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 19:14:50,398 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.conv_module2.whiten.whitening_limit, batch_count=989046.6666666666, ans=15.0 2023-10-02 19:14:51,084 WARNING [train.py:1204] (2/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_178-177582-0 from training. Duration: 0.74 2023-10-02 19:14:52,373 WARNING [train.py:1204] (2/4) Exclude cut with ID _14349_210_8341_1_1530615710818_3442470_170-336541-0_sp1.1 from training. Duration: 0.7818125 2023-10-02 19:14:52,443 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_521-214276-0_sp0.9 from training. Duration: 0.488875 2023-10-02 19:14:53,882 WARNING [train.py:1204] (2/4) Exclude cut with ID _66070_210_7640_1_1535441393096_7805310_489-292631-0 from training. Duration: 0.79 2023-10-02 19:14:59,586 WARNING [train.py:1204] (2/4) Exclude cut with ID _14069_210_5632_1_1530512692033_4803390_438-340023-0_sp1.1 from training. Duration: 0.936375 2023-10-02 19:15:00,537 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.0.layers.1.nonlin_attention.whiten2, num_groups=1, num_channels=192, metric=5.57 vs. limit=15.0 2023-10-02 19:15:01,254 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.483e+02 1.868e+02 2.016e+02 2.240e+02 3.516e+02, threshold=4.032e+02, percent-clipped=0.0 2023-10-02 19:15:01,362 WARNING [train.py:1204] (2/4) Exclude cut with ID _7852_210_5219_1_1528286471316_8139400_242-105336-0_sp1.1 from training. Duration: 0.6545625 2023-10-02 19:15:04,111 WARNING [train.py:1204] (2/4) Exclude cut with ID _3867_210_1883_1_1526641200575_4475830_272-215563-0 from training. Duration: 0.57 2023-10-02 19:15:04,125 WARNING [train.py:1204] (2/4) Exclude cut with ID _14368_210_4220_1_1530532714870_3595529_143-117421-0_sp0.9 from training. Duration: 0.6444375 2023-10-02 19:15:04,141 WARNING [train.py:1204] (2/4) Exclude cut with ID _9235_210_5219_1_1528891154253_8046120_30-326681-0_sp1.1 from training. Duration: 0.7545625 2023-10-02 19:15:05,568 WARNING [train.py:1204] (2/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_538-218448-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 19:15:06,301 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.1.encoder.layers.0.self_attn_weights.whiten_keys, num_groups=4, num_channels=128, metric=3.42 vs. limit=6.0 2023-10-02 19:15:08,283 INFO [train.py:1046] (2/4) Epoch 28, batch 4950, loss[loss=0.1727, simple_loss=0.2602, pruned_loss=0.04256, over 24035.00 frames. ], tot_loss[loss=0.1645, simple_loss=0.2423, pruned_loss=0.04335, over 4722111.73 frames. ], batch size: 80, lr: 3.63e-03, grad_scale: 8.0 2023-10-02 19:15:09,655 WARNING [train.py:1204] (2/4) Exclude cut with ID _33640_210_2655_1_1533117605479_4855469_60-192970-0_sp1.1 from training. Duration: 0.963625 2023-10-02 19:15:09,670 WARNING [train.py:1204] (2/4) Exclude cut with ID _65585_210_12210_1_1535450451584_3764098_112-294301-0_sp1.1 from training. Duration: 0.8 2023-10-02 19:15:09,692 WARNING [train.py:1204] (2/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_643-37412-0_sp1.1 from training. Duration: 0.936375 2023-10-02 19:15:09,719 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_9302_210_2550_1_1528889962_4655416_638-301620-0_sp1.1 from training. Duration: 0.42725 2023-10-02 19:15:12,441 WARNING [train.py:1204] (2/4) Exclude cut with ID _3867_210_1883_1_1526641200575_4475830_272-215563-0_sp1.1 from training. Duration: 0.5181875 2023-10-02 19:15:15,119 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_53679_210_4852_1_1534556726_4186141_358-154143-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 19:15:15,136 WARNING [train.py:1204] (2/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_841-341679-0_sp0.9 from training. Duration: 0.6444375 2023-10-02 19:15:17,153 WARNING [train.py:1204] (2/4) Exclude cut with ID _7160_210_1876_1_1527940923231_3703799_302-150728-0 from training. Duration: 0.78 2023-10-02 19:15:18,992 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_125-203105-0 from training. Duration: 0.8 2023-10-02 19:15:19,015 WARNING [train.py:1204] (2/4) Exclude cut with ID _5585_210_2667_1_1527418813324_8007340_89-332072-0_sp0.9 from training. Duration: 0.711125 2023-10-02 19:15:20,374 WARNING [train.py:1204] (2/4) Exclude cut with ID _3992_210_1747_1_1526644816031_3731060_36-147855-0 from training. Duration: 0.78 2023-10-02 19:15:20,401 WARNING [train.py:1204] (2/4) Exclude cut with ID _59256_210_5986_1_1535076085997_3589120_172-239045-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 19:15:20,410 WARNING [train.py:1204] (2/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_472-216663-0_sp1.1 from training. Duration: 0.836375 2023-10-02 19:15:22,285 WARNING [train.py:1204] (2/4) Exclude cut with ID _36512_210_6987_1_1533033143998_3126069_181-306500-0_sp1.1 from training. Duration: 0.863625 2023-10-02 19:15:22,310 WARNING [train.py:1204] (2/4) Exclude cut with ID _56524_210_9642_1_1534572011779_3619170_492-95008-0_sp1.1 from training. Duration: 0.97275 2023-10-02 19:15:23,872 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_33348_210_5632_1_1533086912_3324030_119-90326-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 19:15:25,092 WARNING [train.py:1204] (2/4) Exclude cut with ID _14368_210_4220_1_1530532714870_3595529_231-137892-0_sp1.1 from training. Duration: 0.9 2023-10-02 19:15:26,371 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8453_210_4211_1_1528502940_8562730_596-306820-0_sp1.1 from training. Duration: 0.8818125 2023-10-02 19:15:26,453 WARNING [train.py:1204] (2/4) Exclude cut with ID _27052_210_5986_1_1532329197187_3585009_50-242750-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 19:15:26,615 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.bypass_mid.scale_min, batch_count=989246.6666666666, ans=0.2 2023-10-02 19:15:29,201 WARNING [train.py:1204] (2/4) Exclude cut with ID _13913_210_9994_1_1530579615187_3383897_358-98435-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 19:15:29,213 WARNING [train.py:1204] (2/4) Exclude cut with ID _27013_210_1389_1_1532437173240_3252649_399-229778-0_sp1.1 from training. Duration: 0.963625 2023-10-02 19:15:31,951 WARNING [train.py:1204] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_185-174805-0_sp0.9 from training. Duration: 0.7555625 2023-10-02 19:15:33,592 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.feed_forward1.hidden_balancer.prob, batch_count=989246.6666666666, ans=0.125 2023-10-02 19:15:35,242 WARNING [train.py:1204] (2/4) Exclude cut with ID _65585_210_12210_1_1535450451584_3764098_196-221014-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 19:15:36,704 WARNING [train.py:1204] (2/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_202-293523-0_sp1.1 from training. Duration: 0.6545625 2023-10-02 19:15:38,064 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8275_210_4198_1_1528455320_3892571_213-127634-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 19:15:38,113 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_40896_210_15710_1_1533433956_7100802_921-231678-0_sp1.1 from training. Duration: 0.97275 2023-10-02 19:15:39,517 WARNING [train.py:1204] (2/4) Exclude cut with ID _38020_210_5986_1_1533112252259_7006089_493-102745-0_sp1.1 from training. Duration: 0.8909375 2023-10-02 19:15:41,150 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.self_attn_weights.pos_emb_skip_rate, batch_count=989313.3333333334, ans=0.0 2023-10-02 19:15:42,567 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6846_210_6753_1_1527850767_4295158_2-315073-0 from training. Duration: 0.88 2023-10-02 19:15:42,627 WARNING [train.py:1204] (2/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_249-281349-0 from training. Duration: 0.85 2023-10-02 19:15:45,210 WARNING [train.py:1204] (2/4) Exclude cut with ID _18513_210_9206_1_1531618215223_3862620_314-83429-0_sp1.1 from training. Duration: 0.97275 2023-10-02 19:15:47,270 WARNING [train.py:1204] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_215-202973-0_sp0.9 from training. Duration: 0.97775 2023-10-02 19:15:47,279 WARNING [train.py:1204] (2/4) Exclude cut with ID _7719_210_3525_1_1528456153163_4296960_88-250259-0_sp1.1 from training. Duration: 0.763625 2023-10-02 19:15:48,640 WARNING [train.py:1204] (2/4) Exclude cut with ID _7606_210_4915_1_1528894253277_5080690_301-10732-0_sp1.1 from training. Duration: 0.736375 2023-10-02 19:15:50,250 WARNING [train.py:1204] (2/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_818-11907-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 19:15:50,327 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_5959_210_1794_1_1527400400_3445726_412-44052-0_sp1.1 from training. Duration: 0.7 2023-10-02 19:15:50,512 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.balancer2.prob, batch_count=989313.3333333334, ans=0.125 2023-10-02 19:15:53,015 WARNING [train.py:1204] (2/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_6-279195-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 19:15:55,669 WARNING [train.py:1204] (2/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_252-216674-0_sp0.9 from training. Duration: 0.6 2023-10-02 19:15:57,036 WARNING [train.py:1204] (2/4) Exclude cut with ID _14368_210_4220_1_1530532714870_3595529_144-3287-0_sp1.1 from training. Duration: 0.6090625 2023-10-02 19:15:58,337 WARNING [train.py:1204] (2/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_342-3879-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 19:15:58,373 WARNING [train.py:1204] (2/4) Exclude cut with ID _33640_210_2655_1_1533117605479_4855469_584-65630-0_sp1.1 from training. Duration: 0.97275 2023-10-02 19:15:59,656 WARNING [train.py:1204] (2/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_37-189262-0 from training. Duration: 0.98 2023-10-02 19:15:59,697 WARNING [train.py:1204] (2/4) Exclude cut with ID _9389_210_1664_1_1529217337301_5289269_379-128794-0_sp0.9 from training. Duration: 0.9444375 2023-10-02 19:16:01,147 WARNING [train.py:1204] (2/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_119-207162-0_sp1.1 from training. Duration: 0.6454375 2023-10-02 19:16:07,031 WARNING [train.py:1204] (2/4) Exclude cut with ID _2941_210_2577_1_1526199673150_2489759_200-333643-0_sp1.1 from training. Duration: 0.936375 2023-10-02 19:16:08,367 WARNING [train.py:1204] (2/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_479-125611-0_sp1.1 from training. Duration: 0.87275 2023-10-02 19:16:08,377 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_3475_210_2028_1_1526193578_3622569_64-297072-0_sp1.1 from training. Duration: 0.82725 2023-10-02 19:16:08,420 WARNING [train.py:1204] (2/4) Exclude cut with ID _53565_210_10635_1_1534337218335_3820259_139-346291-0_sp1.1 from training. Duration: 0.97275 2023-10-02 19:16:08,459 WARNING [train.py:1204] (2/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_588-125165-0_sp1.1 from training. Duration: 0.7545625 2023-10-02 19:16:09,848 WARNING [train.py:1204] (2/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_938-205135-0_sp1.1 from training. Duration: 0.7909375 2023-10-02 19:16:10,144 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.conv_module2.balancer1.prob, batch_count=989446.6666666666, ans=0.125 2023-10-02 19:16:11,273 WARNING [train.py:1204] (2/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_216-48707-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 19:16:12,628 WARNING [train.py:1204] (2/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_477-255782-0_sp1.1 from training. Duration: 0.7454375 2023-10-02 19:16:12,663 WARNING [train.py:1204] (2/4) Exclude cut with ID _16909_210_5724_1_1531292079912_4118919_583-92842-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 19:16:12,740 WARNING [train.py:1204] (2/4) Exclude cut with ID _60860_210_8254_1_1534896015875_3557169_245-256666-0 from training. Duration: 0.97 2023-10-02 19:16:17,201 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_45399_210_4852_1_1533624725_7579737_349-116173-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 19:16:21,700 INFO [train.py:1046] (2/4) Epoch 28, batch 5000, loss[loss=0.1625, simple_loss=0.2439, pruned_loss=0.04051, over 23180.00 frames. ], tot_loss[loss=0.1641, simple_loss=0.2415, pruned_loss=0.04334, over 4712981.50 frames. ], batch size: 105, lr: 3.62e-03, grad_scale: 8.0 2023-10-02 19:16:21,775 WARNING [train.py:1204] (2/4) Exclude cut with ID _8699_210_2069_1_1528630346831_3960409_155-39766-0 from training. Duration: 0.78 2023-10-02 19:16:21,785 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_86-16881-0_sp1.1 from training. Duration: 0.52725 2023-10-02 19:16:23,351 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.bypass_mid.scale_min, batch_count=989513.3333333334, ans=0.2 2023-10-02 19:16:27,492 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_14056_210_4163_1_1530604483_7572827_191-159384-0_sp1.1 from training. Duration: 0.97275 2023-10-02 19:16:28,710 WARNING [train.py:1204] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_307-158462-0_sp1.1 from training. Duration: 0.62725 2023-10-02 19:16:30,150 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7544_210_6753_1_1528591327_1471426_33-346031-0 from training. Duration: 0.61 2023-10-02 19:16:30,236 WARNING [train.py:1204] (2/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_112-149213-0 from training. Duration: 0.95 2023-10-02 19:16:33,348 WARNING [train.py:1204] (2/4) Exclude cut with ID _55844_210_2346_1_1534471212569_7625707_925-191342-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 19:16:34,804 WARNING [train.py:1204] (2/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_87-278686-0 from training. Duration: 0.77 2023-10-02 19:16:34,824 WARNING [train.py:1204] (2/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_415-78792-0_sp1.1 from training. Duration: 0.736375 2023-10-02 19:16:34,844 WARNING [train.py:1204] (2/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_107-332398-0_sp0.9 from training. Duration: 0.8666875 2023-10-02 19:16:36,160 WARNING [train.py:1204] (2/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_72-251255-0 from training. Duration: 0.94 2023-10-02 19:16:36,210 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_431-227273-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 19:16:36,710 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.5.encoder.layers.0.nonlin_attention.whiten2, num_groups=1, num_channels=256, metric=3.35 vs. limit=15.0 2023-10-02 19:16:37,572 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7544_210_6753_1_1528587000_691468_87-178599-0_sp0.9 from training. Duration: 0.9666875 2023-10-02 19:16:38,886 WARNING [train.py:1204] (2/4) Exclude cut with ID _13916_210_9994_1_1530788341126_3763149_60-191556-0 from training. Duration: 0.61 2023-10-02 19:16:38,888 WARNING [train.py:1204] (2/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_746-164782-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 19:16:38,924 WARNING [train.py:1204] (2/4) Exclude cut with ID _33640_210_2655_1_1533117605479_4855469_596-21066-0_sp1.1 from training. Duration: 0.92725 2023-10-02 19:16:41,598 WARNING [train.py:1204] (2/4) Exclude cut with ID _40402_210_18219_1_1533522324115_2953150_320-14078-0 from training. Duration: 0.95 2023-10-02 19:16:41,629 WARNING [train.py:1204] (2/4) Exclude cut with ID _29877_210_6228_1_1532397791106_5276149_266-62291-0 from training. Duration: 0.87 2023-10-02 19:16:41,695 WARNING [train.py:1204] (2/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_176-56120-0_sp0.9 from training. Duration: 0.92225 2023-10-02 19:16:43,052 WARNING [train.py:1204] (2/4) Exclude cut with ID _6275_210_2084_1_1528110062213_7669049_91-26919-0 from training. Duration: 0.97 2023-10-02 19:16:43,059 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_224-322394-0_sp0.9 from training. Duration: 0.7333125 2023-10-02 19:16:43,103 WARNING [train.py:1204] (2/4) Exclude cut with ID _44982_210_1511_1_1534312820108_3726029_586-297059-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 19:16:43,150 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_196-289637-0_sp1.1 from training. Duration: 0.6818125 2023-10-02 19:16:43,151 WARNING [train.py:1204] (2/4) Exclude cut with ID _18513_210_9206_1_1531618215223_3862620_402-5957-0 from training. Duration: 0.99 2023-10-02 19:16:43,316 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=989580.0, ans=0.1 2023-10-02 19:16:44,445 WARNING [train.py:1204] (2/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_81-300212-0 from training. Duration: 0.6 2023-10-02 19:16:46,561 WARNING [train.py:1204] (2/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_166-128941-0 from training. Duration: 0.54 2023-10-02 19:16:46,574 WARNING [train.py:1204] (2/4) Exclude cut with ID _58059_210_5854_1_1534901471149_3164808_57-309660-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 19:16:47,733 WARNING [train.py:1204] (2/4) Exclude cut with ID _60621_210_17071_1_1535013053530_3671739_73-155814-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 19:16:47,841 WARNING [train.py:1204] (2/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_726-270780-0 from training. Duration: 0.86 2023-10-02 19:16:49,099 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8853_210_3273_1_1528633899_818968_51-187685-0_sp1.1 from training. Duration: 0.62725 2023-10-02 19:16:51,152 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_315-212794-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 19:16:51,235 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_53373_210_5399_1_1534229471_4715695_314-122795-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 19:16:53,196 WARNING [train.py:1204] (2/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_187-221566-0_sp0.9 from training. Duration: 0.47775 2023-10-02 19:16:54,436 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_12868_210_6753_1_1530702479_3610564_133-564-0 from training. Duration: 0.79 2023-10-02 19:16:55,815 WARNING [train.py:1204] (2/4) Exclude cut with ID _55153_210_2253_1_1534384786677_3162096_255-228344-0_sp1.1 from training. Duration: 0.936375 2023-10-02 19:16:57,243 WARNING [train.py:1204] (2/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_383-165234-0_sp1.1 from training. Duration: 0.8545625 2023-10-02 19:17:01,764 WARNING [train.py:1204] (2/4) Exclude cut with ID IC0677W0056-334889 from training. Duration: 0.768 2023-10-02 19:17:05,944 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_14953_210_2151_1_1530701710_5616165_710-165400-0_sp0.9 from training. Duration: 0.9666875 2023-10-02 19:17:07,266 WARNING [train.py:1204] (2/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_355-236205-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 19:17:07,267 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_34450_210_9547_1_1533099443_4109050_503-246893-0_sp1.1 from training. Duration: 0.963625 2023-10-02 19:17:10,041 WARNING [train.py:1204] (2/4) Exclude cut with ID _43552_210_8913_1_1533726031059_3523429_25-120161-0 from training. Duration: 0.98 2023-10-02 19:17:10,064 WARNING [train.py:1204] (2/4) Exclude cut with ID _66112_210_10635_1_1535590236305_3801484_205-311930-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 19:17:10,093 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_48049_210_5632_1_1533864553_3519937_185-281218-0_sp1.1 from training. Duration: 0.92725 2023-10-02 19:17:11,314 WARNING [train.py:1204] (2/4) Exclude cut with ID _48782_210_12658_1_1534467701544_2300422_191-332415-0_sp1.1 from training. Duration: 0.97275 2023-10-02 19:17:12,709 WARNING [train.py:1204] (2/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_907-151398-0_sp1.1 from training. Duration: 0.336375 2023-10-02 19:17:12,754 WARNING [train.py:1204] (2/4) Exclude cut with ID _67499_210_17282_1_1535509805220_3692430_20-179346-0_sp1.1 from training. Duration: 0.8818125 2023-10-02 19:17:14,416 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.ff3_skip_rate, batch_count=989713.3333333334, ans=0.0 2023-10-02 19:17:15,587 WARNING [train.py:1204] (2/4) Exclude cut with ID _14998_210_5219_1_1530705582080_7474970_441-91466-0_sp1.1 from training. Duration: 0.8818125 2023-10-02 19:17:16,930 WARNING [train.py:1204] (2/4) Exclude cut with ID _46681_210_9642_1_1533893005571_7603359_786-164007-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 19:17:22,992 WARNING [train.py:1204] (2/4) Exclude cut with ID _14970_210_15709_1_1530773543493_1715730_187-20259-0 from training. Duration: 0.89 2023-10-02 19:17:27,011 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.feed_forward2.hidden_balancer.prob, batch_count=989780.0, ans=0.125 2023-10-02 19:17:28,005 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.460e+02 1.761e+02 1.932e+02 2.126e+02 3.192e+02, threshold=3.865e+02, percent-clipped=0.0 2023-10-02 19:17:28,074 WARNING [train.py:1204] (2/4) Exclude cut with ID _12734_210_8341_1_1530183614296_3891929_383-339075-0_sp1.1 from training. Duration: 0.963625 2023-10-02 19:17:31,020 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.bypass_mid.scale_min, batch_count=989780.0, ans=0.2 2023-10-02 19:17:34,265 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.attention_skip_rate, batch_count=989846.6666666666, ans=0.0 2023-10-02 19:17:35,322 INFO [train.py:1046] (2/4) Epoch 28, batch 5050, loss[loss=0.1654, simple_loss=0.2494, pruned_loss=0.04068, over 24486.00 frames. ], tot_loss[loss=0.1651, simple_loss=0.2427, pruned_loss=0.04377, over 4715362.99 frames. ], batch size: 66, lr: 3.62e-03, grad_scale: 8.0 2023-10-02 19:17:36,766 WARNING [train.py:1204] (2/4) Exclude cut with ID _37086_210_9038_1_1532999540891_7021250_834-322225-0_sp1.1 from training. Duration: 0.92725 2023-10-02 19:17:36,871 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_10987_210_4163_1_1529760555_4123902_56-290559-0_sp1.1 from training. Duration: 0.963625 2023-10-02 19:17:36,878 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_92-246270-0_sp0.9 from training. Duration: 0.9444375 2023-10-02 19:17:36,907 WARNING [train.py:1204] (2/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_56-13561-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 19:17:38,334 WARNING [train.py:1204] (2/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_393-57888-0_sp1.1 from training. Duration: 0.5818125 2023-10-02 19:17:38,364 WARNING [train.py:1204] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_168-32906-0_sp1.1 from training. Duration: 0.62725 2023-10-02 19:17:38,409 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_51225_210_3601_1_1534400634_2327383_127-207553-0_sp1.1 from training. Duration: 0.963625 2023-10-02 19:17:43,749 WARNING [train.py:1204] (2/4) Exclude cut with ID _58583_210_5236_1_1534726835266_3574249_494-203521-0_sp1.1 from training. Duration: 0.963625 2023-10-02 19:17:43,769 WARNING [train.py:1204] (2/4) Exclude cut with ID _49376_210_5986_1_1533985278732_3370740_354-344695-0 from training. Duration: 0.99 2023-10-02 19:17:43,842 WARNING [train.py:1204] (2/4) Exclude cut with ID _8021_210_7994_1_1528376451358_3653069_179-45206-0_sp1.1 from training. Duration: 0.7818125 2023-10-02 19:17:46,635 WARNING [train.py:1204] (2/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_742-154017-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 19:17:47,941 WARNING [train.py:1204] (2/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_222-198127-0_sp1.1 from training. Duration: 0.763625 2023-10-02 19:17:48,003 WARNING [train.py:1204] (2/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_235-307337-0 from training. Duration: 0.94 2023-10-02 19:17:49,383 WARNING [train.py:1204] (2/4) Exclude cut with ID _8393_210_6702_1_1528505860249_3457328_453-176500-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 19:17:49,433 WARNING [train.py:1204] (2/4) Exclude cut with ID _53565_210_10635_1_1534337218335_3820259_102-179528-0_sp1.1 from training. Duration: 0.97275 2023-10-02 19:17:51,671 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.conv_skip_rate, batch_count=989913.3333333334, ans=0.0 2023-10-02 19:17:52,705 WARNING [train.py:1204] (2/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_43-206757-0_sp1.1 from training. Duration: 0.6818125 2023-10-02 19:17:54,551 WARNING [train.py:1204] (2/4) Exclude cut with ID _8394_210_6702_1_1528590504316_3540189_335-274780-0_sp1.1 from training. Duration: 0.7090625 2023-10-02 19:17:54,597 WARNING [train.py:1204] (2/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_128-296581-0_sp1.1 from training. Duration: 0.67275 2023-10-02 19:18:03,503 WARNING [train.py:1204] (2/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_111-112362-0 from training. Duration: 0.91 2023-10-02 19:18:03,545 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_5522_210_3273_1_1527249606_3909096_366-172508-0_sp1.1 from training. Duration: 0.6 2023-10-02 19:18:05,466 WARNING [train.py:1204] (2/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_47-167373-0_sp1.1 from training. Duration: 0.836375 2023-10-02 19:18:05,512 WARNING [train.py:1204] (2/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_41-234353-0 from training. Duration: 0.68 2023-10-02 19:18:05,543 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6291_210_3273_1_1527683649_1078498_82-221196-0_sp0.9 from training. Duration: 0.8444375 2023-10-02 19:18:05,739 INFO [scaling.py:1118] (2/4) WithLoss: name=encoder.encoders.1.encoder.layers.1.self_attn_weights, loss-sum=0.000e+00 2023-10-02 19:18:06,874 WARNING [train.py:1204] (2/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_47-4814-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 19:18:08,235 WARNING [train.py:1204] (2/4) Exclude cut with ID _43541_210_10626_1_1533796233418_3635440_40-247993-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 19:18:08,297 WARNING [train.py:1204] (2/4) Exclude cut with ID _21989_210_5854_1_1531899214931_3271360_133-44759-0_sp0.9 from training. Duration: 0.8555625 2023-10-02 19:18:08,300 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_94-195801-0 from training. Duration: 0.94 2023-10-02 19:18:09,631 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_141-89113-0 from training. Duration: 0.86 2023-10-02 19:18:10,967 WARNING [train.py:1204] (2/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_683-284578-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 19:18:12,406 WARNING [train.py:1204] (2/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_6-214815-0_sp1.1 from training. Duration: 0.77275 2023-10-02 19:18:15,171 WARNING [train.py:1204] (2/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_556-212082-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 19:18:16,472 WARNING [train.py:1204] (2/4) Exclude cut with ID _44902_210_16187_1_1533888006062_3792078_149-67212-0 from training. Duration: 0.87 2023-10-02 19:18:17,873 WARNING [train.py:1204] (2/4) Exclude cut with ID _61148_210_6897_1_1535094022726_4217610_333-159440-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 19:18:21,231 WARNING [train.py:1204] (2/4) Exclude cut with ID _3120_210_3525_1_1526036143112_4072980_404-249994-0 from training. Duration: 0.99 2023-10-02 19:18:21,428 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.bypass.scale_min, batch_count=990046.6666666666, ans=0.2 2023-10-02 19:18:22,534 WARNING [train.py:1204] (2/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_234-260682-0_sp0.9 from training. Duration: 0.9333125 2023-10-02 19:18:22,566 WARNING [train.py:1204] (2/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_374-138971-0_sp1.1 from training. Duration: 0.8545625 2023-10-02 19:18:22,628 WARNING [train.py:1204] (2/4) Exclude cut with ID _53713_210_24116_1_1534314487762_7416751_743-302085-0_sp1.1 from training. Duration: 0.92725 2023-10-02 19:18:24,138 WARNING [train.py:1204] (2/4) Exclude cut with ID _18516_210_9206_1_1531704653206_3692406_198-334870-0_sp1.1 from training. Duration: 0.836375 2023-10-02 19:18:26,163 WARNING [train.py:1204] (2/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_395-73570-0_sp1.1 from training. Duration: 0.9 2023-10-02 19:18:27,687 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.nonlin_attention.balancer.min_positive, batch_count=990046.6666666666, ans=0.05 2023-10-02 19:18:28,897 WARNING [train.py:1204] (2/4) Exclude cut with ID _46683_210_9642_1_1533730913492_4591116_421-19547-0_sp1.1 from training. Duration: 0.936375 2023-10-02 19:18:30,128 WARNING [train.py:1204] (2/4) Exclude cut with ID _19896_210_1883_1_1531709022662_6649569_714-8668-0_sp1.1 from training. Duration: 0.963625 2023-10-02 19:18:30,157 WARNING [train.py:1204] (2/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_49-101072-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 19:18:30,163 WARNING [train.py:1204] (2/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_220-191080-0_sp1.1 from training. Duration: 0.8909375 2023-10-02 19:18:30,189 WARNING [train.py:1204] (2/4) Exclude cut with ID _3465_210_3525_1_1526175667060_3574049_62-335552-0 from training. Duration: 0.87 2023-10-02 19:18:32,148 WARNING [train.py:1204] (2/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_111-112362-0_sp1.1 from training. Duration: 0.82725 2023-10-02 19:18:33,627 WARNING [train.py:1204] (2/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_318-59097-0_sp0.9 from training. Duration: 0.8444375 2023-10-02 19:18:37,209 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.1.bypass.scale_min, batch_count=990113.3333333334, ans=0.2 2023-10-02 19:18:38,384 WARNING [train.py:1204] (2/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_98-255710-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 19:18:38,391 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9042W0003-515609 from training. Duration: 0.896 2023-10-02 19:18:38,406 WARNING [train.py:1204] (2/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_158-250116-0_sp0.9 from training. Duration: 0.688875 2023-10-02 19:18:39,769 WARNING [train.py:1204] (2/4) Exclude cut with ID _26750_210_5665_1_1532226678442_4063230_7-346885-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 19:18:39,813 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_37839_210_7234_1_1533542462_3689303_357-227078-0_sp1.1 from training. Duration: 0.963625 2023-10-02 19:18:41,032 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9052W0119-520904 from training. Duration: 0.896 2023-10-02 19:18:43,811 WARNING [train.py:1204] (2/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_249-281349-0_sp1.1 from training. Duration: 0.77275 2023-10-02 19:18:43,823 WARNING [train.py:1204] (2/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_147-293311-0 from training. Duration: 0.94 2023-10-02 19:18:43,824 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_158-21166-0_sp1.1 from training. Duration: 0.963625 2023-10-02 19:18:46,697 WARNING [train.py:1204] (2/4) Exclude cut with ID _14100_210_8341_1_1530529320613_3678069_207-332194-0_sp1.1 from training. Duration: 0.92725 2023-10-02 19:18:46,738 WARNING [train.py:1204] (2/4) Exclude cut with ID _9389_210_1664_1_1529217337301_5289269_324-211031-0_sp1.1 from training. Duration: 0.963625 2023-10-02 19:18:48,000 WARNING [train.py:1204] (2/4) Exclude cut with ID _14603_210_4220_1_1530615688441_3097319_133-158308-0 from training. Duration: 0.84 2023-10-02 19:18:49,388 INFO [train.py:1046] (2/4) Epoch 28, batch 5100, loss[loss=0.1474, simple_loss=0.2272, pruned_loss=0.03374, over 20221.00 frames. ], tot_loss[loss=0.1659, simple_loss=0.2435, pruned_loss=0.04419, over 4709620.31 frames. ], batch size: 44, lr: 3.62e-03, grad_scale: 8.0 2023-10-02 19:18:49,429 WARNING [train.py:1204] (2/4) Exclude cut with ID _13252_210_1925_1_1530671372047_3336909_166-314607-0 from training. Duration: 0.54 2023-10-02 19:18:50,846 WARNING [train.py:1204] (2/4) Exclude cut with ID _32704_210_8954_1_1533017488840_6747370_649-79538-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 19:18:52,484 WARNING [train.py:1204] (2/4) Exclude cut with ID _28394_210_8913_1_1532430049518_3650118_343-115667-0_sp1.1 from training. Duration: 0.97275 2023-10-02 19:18:52,527 WARNING [train.py:1204] (2/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_87-278686-0_sp0.9 from training. Duration: 0.8555625 2023-10-02 19:18:54,414 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9109W0003-549749 from training. Duration: 0.768 2023-10-02 19:18:55,860 WARNING [train.py:1204] (2/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_126-29104-0_sp1.1 from training. Duration: 0.77275 2023-10-02 19:18:57,359 WARNING [train.py:1204] (2/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_325-113798-0 from training. Duration: 0.85 2023-10-02 19:18:58,738 WARNING [train.py:1204] (2/4) Exclude cut with ID _6694_210_4599_1_1527852637490_3965529_116-109398-0 from training. Duration: 0.73 2023-10-02 19:18:59,994 WARNING [train.py:1204] (2/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_46-57119-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 19:19:01,390 WARNING [train.py:1204] (2/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_546-119312-0_sp1.1 from training. Duration: 0.9 2023-10-02 19:19:03,182 WARNING [train.py:1204] (2/4) Exclude cut with ID _60057_210_10640_1_1534903065793_3333150_468-306939-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 19:19:04,486 WARNING [train.py:1204] (2/4) Exclude cut with ID _15150_210_8341_1_1530752411803_7256384_22-138274-0 from training. Duration: 0.96 2023-10-02 19:19:04,501 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_289-180904-0 from training. Duration: 0.76 2023-10-02 19:19:09,274 WARNING [train.py:1204] (2/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_64-133721-0_sp1.1 from training. Duration: 0.92725 2023-10-02 19:19:09,323 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_195-257512-0_sp1.1 from training. Duration: 0.6090625 2023-10-02 19:19:13,492 WARNING [train.py:1204] (2/4) Exclude cut with ID _66873_210_5517_1_1535454136506_4293370_704-101963-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 19:19:15,074 WARNING [train.py:1204] (2/4) Exclude cut with ID _4366_210_1388_1_1526792053633_3613819_231-211402-0 from training. Duration: 0.94 2023-10-02 19:19:15,121 WARNING [train.py:1204] (2/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_679-190385-0_sp1.1 from training. Duration: 0.97275 2023-10-02 19:19:17,768 WARNING [train.py:1204] (2/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_298-81459-0_sp1.1 from training. Duration: 0.963625 2023-10-02 19:19:17,784 WARNING [train.py:1204] (2/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_658-282610-0_sp0.9 from training. Duration: 0.588875 2023-10-02 19:19:21,072 WARNING [train.py:1204] (2/4) Exclude cut with ID _57572_210_5517_1_1534921219624_3809484_196-283734-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 19:19:22,373 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_53071_210_16554_1_1534490405_2954066_294-198554-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 19:19:22,378 WARNING [train.py:1204] (2/4) Exclude cut with ID _4866_210_2577_1_1527404412284_4364659_352-157930-0 from training. Duration: 0.81 2023-10-02 19:19:24,367 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9231W0113-610139 from training. Duration: 0.896 2023-10-02 19:19:25,678 WARNING [train.py:1204] (2/4) Exclude cut with ID _24271_210_12658_1_1531987270022_7190519_638-171906-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 19:19:25,721 WARNING [train.py:1204] (2/4) Exclude cut with ID _14170_210_5219_1_1530525949868_4469650_272-249985-0 from training. Duration: 0.67 2023-10-02 19:19:26,943 WARNING [train.py:1204] (2/4) Exclude cut with ID _64086_210_7994_1_1535270417019_7435949_449-31567-0 from training. Duration: 0.97 2023-10-02 19:19:29,703 WARNING [train.py:1204] (2/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_109-282568-0_sp1.1 from training. Duration: 0.97275 2023-10-02 19:19:35,971 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_121-45934-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 19:19:38,671 WARNING [train.py:1204] (2/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_314-126819-0 from training. Duration: 0.5 2023-10-02 19:19:38,697 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9309W0192-640319 from training. Duration: 0.512 2023-10-02 19:19:38,706 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9310W0003-640649 from training. Duration: 0.896 2023-10-02 19:19:40,199 WARNING [train.py:1204] (2/4) Exclude cut with ID _7160_210_1876_1_1527940923231_3703799_120-150943-0 from training. Duration: 0.81 2023-10-02 19:19:40,201 WARNING [train.py:1204] (2/4) Exclude cut with ID _40380_210_9059_1_1533380594759_4187380_6-33508-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 19:19:44,170 WARNING [train.py:1204] (2/4) Exclude cut with ID _59202_210_26984_1_1534816810612_3549174_137-318710-0 from training. Duration: 0.89 2023-10-02 19:19:46,995 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_435-313305-0 from training. Duration: 0.71 2023-10-02 19:19:50,240 WARNING [train.py:1204] (2/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_91-212486-0_sp1.1 from training. Duration: 0.6181875 2023-10-02 19:19:52,858 WARNING [train.py:1204] (2/4) Exclude cut with ID _14603_210_4220_1_1530615688441_3097319_133-158308-0_sp1.1 from training. Duration: 0.763625 2023-10-02 19:19:54,275 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_99-251314-0 from training. Duration: 0.87 2023-10-02 19:19:55,502 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.566e+02 1.787e+02 2.059e+02 2.354e+02 3.734e+02, threshold=4.118e+02, percent-clipped=0.0 2023-10-02 19:19:57,118 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_365-217119-0_sp1.1 from training. Duration: 0.57275 2023-10-02 19:19:58,841 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_3475_210_2028_1_1526193578_3622569_377-166133-0 from training. Duration: 0.86 2023-10-02 19:19:59,605 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.2.encoder.layers.1.self_attn_weights.whiten_keys, num_groups=4, num_channels=128, metric=3.36 vs. limit=6.0 2023-10-02 19:20:02,925 INFO [train.py:1046] (2/4) Epoch 28, batch 5150, loss[loss=0.1457, simple_loss=0.2286, pruned_loss=0.03142, over 24603.00 frames. ], tot_loss[loss=0.167, simple_loss=0.2444, pruned_loss=0.0448, over 4710873.37 frames. ], batch size: 60, lr: 3.62e-03, grad_scale: 8.0 2023-10-02 19:20:04,373 WARNING [train.py:1204] (2/4) Exclude cut with ID _13916_210_9994_1_1530788341126_3763149_116-21034-0_sp1.1 from training. Duration: 0.87275 2023-10-02 19:20:04,383 WARNING [train.py:1204] (2/4) Exclude cut with ID _64220_210_9527_1_1535250151772_4875259_142-216831-0_sp1.1 from training. Duration: 0.936375 2023-10-02 19:20:04,384 WARNING [train.py:1204] (2/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_490-207861-0_sp1.1 from training. Duration: 0.8909375 2023-10-02 19:20:04,434 WARNING [train.py:1204] (2/4) Exclude cut with ID _41314_210_12651_1_1533459476543_3766460_87-272068-0_sp0.9 from training. Duration: 0.92225 2023-10-02 19:20:06,312 WARNING [train.py:1204] (2/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_18-46756-0_sp1.1 from training. Duration: 0.7181875 2023-10-02 19:20:06,380 WARNING [train.py:1204] (2/4) Exclude cut with ID _39411_210_5986_1_1533538825030_3343649_98-311335-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 19:20:06,454 WARNING [train.py:1204] (2/4) Exclude cut with ID _21878_210_3525_1_1531738772002_7264330_113-18163-0 from training. Duration: 0.89 2023-10-02 19:20:06,456 WARNING [train.py:1204] (2/4) Exclude cut with ID _6653_210_2629_1_1527919172441_6732679_1103-56445-0 from training. Duration: 0.73 2023-10-02 19:20:07,750 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_3474_210_2028_1_1526188513_4825246_145-309131-0 from training. Duration: 0.74 2023-10-02 19:20:07,768 WARNING [train.py:1204] (2/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_120-211630-0_sp1.1 from training. Duration: 0.836375 2023-10-02 19:20:07,788 WARNING [train.py:1204] (2/4) Exclude cut with ID _8030_210_7497_1_1528444886781_3531089_32-31837-0 from training. Duration: 0.97 2023-10-02 19:20:09,135 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_19949_210_2161_1_1531706337_928079_8-160956-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 19:20:09,176 WARNING [train.py:1204] (2/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_321-215742-0_sp1.1 from training. Duration: 0.4545625 2023-10-02 19:20:10,568 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_583-124650-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 19:20:11,867 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_30434_210_13049_1_1532519384_981746_164-182755-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 19:20:16,101 WARNING [train.py:1204] (2/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_339-104791-0_sp1.1 from training. Duration: 0.4454375 2023-10-02 19:20:16,124 WARNING [train.py:1204] (2/4) Exclude cut with ID _15026_210_8750_1_1530777602316_3291850_211-195043-0 from training. Duration: 0.8 2023-10-02 19:20:16,316 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.feed_forward1.out_proj.dropout_p, batch_count=990580.0, ans=0.1 2023-10-02 19:20:18,669 WARNING [train.py:1204] (2/4) Exclude cut with ID _29877_210_6228_1_1532397791106_5276149_487-182647-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 19:20:18,706 WARNING [train.py:1204] (2/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_127-566-0_sp0.9 from training. Duration: 0.9333125 2023-10-02 19:20:18,828 WARNING [train.py:1204] (2/4) Exclude cut with ID _8263_210_1664_1_1528439452891_970220_68-181805-0_sp0.9 from training. Duration: 0.711125 2023-10-02 19:20:18,829 WARNING [train.py:1204] (2/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_504-72513-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 19:20:18,837 WARNING [train.py:1204] (2/4) Exclude cut with ID _64639_210_5816_1_1535367619437_3528000_324-265233-0_sp1.1 from training. Duration: 0.963625 2023-10-02 19:20:20,249 WARNING [train.py:1204] (2/4) Exclude cut with ID _6456_210_1534_1_1527681634212_3181049_215-309359-0_sp0.9 from training. Duration: 0.97775 2023-10-02 19:20:20,253 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_115-233929-0_sp1.1 from training. Duration: 0.6545625 2023-10-02 19:20:20,296 WARNING [train.py:1204] (2/4) Exclude cut with ID _14087_210_2253_1_1530529224512_3014029_219-224445-0 from training. Duration: 0.84 2023-10-02 19:20:22,310 WARNING [train.py:1204] (2/4) Exclude cut with ID _14867_210_1968_1_1530684179890_3846969_413-29292-0_sp1.1 from training. Duration: 0.8181875 2023-10-02 19:20:23,699 WARNING [train.py:1204] (2/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_1012-279473-0_sp0.9 from training. Duration: 0.7444375 2023-10-02 19:20:26,864 WARNING [train.py:1204] (2/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_605-133395-0_sp0.9 from training. Duration: 0.7333125 2023-10-02 19:20:28,244 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_89-35748-0 from training. Duration: 0.79 2023-10-02 19:20:29,613 WARNING [train.py:1204] (2/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_476-272757-0_sp1.1 from training. Duration: 0.8454375 2023-10-02 19:20:34,358 WARNING [train.py:1204] (2/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_449-15508-0_sp0.9 from training. Duration: 0.911125 2023-10-02 19:20:37,695 WARNING [train.py:1204] (2/4) Exclude cut with ID _29345_210_4381_1_1532689168263_3860169_198-174328-0 from training. Duration: 0.91 2023-10-02 19:20:41,746 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_15517_210_6753_1_1530952293_1664795_125-158157-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 19:20:45,914 WARNING [train.py:1204] (2/4) Exclude cut with ID _25565_210_12392_1_1532423009382_3952098_420-113180-0_sp1.1 from training. Duration: 0.963625 2023-10-02 19:20:45,987 WARNING [train.py:1204] (2/4) Exclude cut with ID _51471_210_1889_1_1534399391390_3094250_236-146773-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 19:20:49,962 WARNING [train.py:1204] (2/4) Exclude cut with ID _30039_210_13270_1_1532484612900_2725150_175-35012-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 19:20:50,002 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_23850_210_4238_1_1532232597_1632433_99-275843-0_sp1.1 from training. Duration: 0.936375 2023-10-02 19:20:52,628 WARNING [train.py:1204] (2/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_658-282610-0 from training. Duration: 0.53 2023-10-02 19:20:54,721 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.balancer1.prob, batch_count=990713.3333333334, ans=0.125 2023-10-02 19:20:57,688 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_37818_210_4238_1_1533620910_4720083_234-65934-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 19:20:59,049 WARNING [train.py:1204] (2/4) Exclude cut with ID _3067_210_2667_1_1526212974640_4503168_459-173867-0_sp1.1 from training. Duration: 0.8 2023-10-02 19:20:59,078 WARNING [train.py:1204] (2/4) Exclude cut with ID _8696_210_1876_1_1528545632440_3345058_209-54069-0_sp0.9 from training. Duration: 0.7444375 2023-10-02 19:21:03,119 WARNING [train.py:1204] (2/4) Exclude cut with ID _62574_210_2655_1_1535180407058_3933249_420-236465-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 19:21:04,921 WARNING [train.py:1204] (2/4) Exclude cut with ID _46681_210_9642_1_1533893005571_7603359_333-4042-0_sp1.1 from training. Duration: 0.936375 2023-10-02 19:21:04,998 WARNING [train.py:1204] (2/4) Exclude cut with ID _3541_210_2477_1_1526475408709_3263973_377-296839-0 from training. Duration: 0.55 2023-10-02 19:21:09,795 WARNING [train.py:1204] (2/4) Exclude cut with ID _43997_210_4525_1_1533538632555_5189329_426-92599-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 19:21:10,003 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.feed_forward1.out_proj.dropout_p, batch_count=990780.0, ans=0.1 2023-10-02 19:21:11,233 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_43-71989-0_sp1.1 from training. Duration: 0.5181875 2023-10-02 19:21:14,104 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_2009_210_2071_1_1525481134_4588937_140-345530-0_sp1.1 from training. Duration: 0.963625 2023-10-02 19:21:14,117 WARNING [train.py:1204] (2/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_138-332773-0_sp0.9 from training. Duration: 0.9 2023-10-02 19:21:15,447 WARNING [train.py:1204] (2/4) Exclude cut with ID _13942_210_9988_1_1530604906036_3733360_499-55676-0_sp0.9 from training. Duration: 0.688875 2023-10-02 19:21:15,465 WARNING [train.py:1204] (2/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_259-34038-0_sp1.1 from training. Duration: 0.67275 2023-10-02 19:21:15,477 WARNING [train.py:1204] (2/4) Exclude cut with ID _3120_210_3525_1_1526036143112_4072980_404-249994-0_sp1.1 from training. Duration: 0.9 2023-10-02 19:21:15,508 WARNING [train.py:1204] (2/4) Exclude cut with ID _40402_210_18219_1_1533522324115_2953150_222-108810-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 19:21:16,764 INFO [train.py:1046] (2/4) Epoch 28, batch 5200, loss[loss=0.1647, simple_loss=0.2526, pruned_loss=0.03842, over 24462.00 frames. ], tot_loss[loss=0.1666, simple_loss=0.2445, pruned_loss=0.04438, over 4731803.65 frames. ], batch size: 69, lr: 3.62e-03, grad_scale: 16.0 2023-10-02 19:21:19,544 WARNING [train.py:1204] (2/4) Exclude cut with ID _22083_210_8254_1_1531702862439_3678970_49-234546-0_sp0.9 from training. Duration: 0.92225 2023-10-02 19:21:19,657 WARNING [train.py:1204] (2/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_199-295611-0_sp0.9 from training. Duration: 0.611125 2023-10-02 19:21:22,436 WARNING [train.py:1204] (2/4) Exclude cut with ID _43552_210_8913_1_1533726031059_3523429_43-192945-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 19:21:28,332 WARNING [train.py:1204] (2/4) Exclude cut with ID _14224_210_4381_1_1530529062037_4264010_364-22817-0 from training. Duration: 0.71 2023-10-02 19:21:30,291 WARNING [train.py:1204] (2/4) Exclude cut with ID _14922_210_6228_1_1530707435273_3693310_388-129552-0_sp1.1 from training. Duration: 0.87275 2023-10-02 19:21:31,591 WARNING [train.py:1204] (2/4) Exclude cut with ID _6537_210_1883_1_1527987559710_3597129_190-280227-0_sp1.1 from training. Duration: 0.97275 2023-10-02 19:21:34,281 WARNING [train.py:1204] (2/4) Exclude cut with ID _55844_210_2346_1_1534471212569_7625707_398-56897-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 19:21:34,354 WARNING [train.py:1204] (2/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_48-182155-0_sp1.1 from training. Duration: 0.7909375 2023-10-02 19:21:34,373 WARNING [train.py:1204] (2/4) Exclude cut with ID _29529_210_7234_1_1532393961455_3977460_582-18084-0_sp1.1 from training. Duration: 0.97275 2023-10-02 19:21:37,700 WARNING [train.py:1204] (2/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_912-282412-0 from training. Duration: 0.49 2023-10-02 19:21:40,398 WARNING [train.py:1204] (2/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_332-256168-0_sp0.9 from training. Duration: 0.8333125 2023-10-02 19:21:40,448 WARNING [train.py:1204] (2/4) Exclude cut with ID _64220_210_9527_1_1535250151772_4875259_55-250624-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 19:21:42,401 WARNING [train.py:1204] (2/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_264-108685-0 from training. Duration: 0.55 2023-10-02 19:21:43,898 WARNING [train.py:1204] (2/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_613-232856-0_sp0.9 from training. Duration: 0.6 2023-10-02 19:21:45,264 WARNING [train.py:1204] (2/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_68-212801-0_sp1.1 from training. Duration: 0.663625 2023-10-02 19:21:46,600 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6290_210_3273_1_1527595224_1282787_115-87095-0 from training. Duration: 0.55 2023-10-02 19:21:46,646 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_3080_210_2071_1_1526086102_4365166_312-8726-0 from training. Duration: 0.91 2023-10-02 19:21:49,390 WARNING [train.py:1204] (2/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_105-63309-0 from training. Duration: 0.98 2023-10-02 19:21:50,663 WARNING [train.py:1204] (2/4) Exclude cut with ID _2941_210_2577_1_1526199673150_2489759_201-160779-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 19:21:50,666 WARNING [train.py:1204] (2/4) Exclude cut with ID ID0406W0226-885434 from training. Duration: 0.997 2023-10-02 19:21:50,672 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_17292_210_6753_1_1531125388_2943734_353-328201-0_sp1.1 from training. Duration: 0.97275 2023-10-02 19:21:52,075 WARNING [train.py:1204] (2/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_572-96029-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 19:21:52,113 WARNING [train.py:1204] (2/4) Exclude cut with ID _14970_210_15709_1_1530775547361_1966965_6-23328-0_sp0.9 from training. Duration: 0.9555625 2023-10-02 19:21:53,473 WARNING [train.py:1204] (2/4) Exclude cut with ID _45012_210_12651_1_1533608435136_5371380_240-37101-0 from training. Duration: 0.93 2023-10-02 19:21:53,539 WARNING [train.py:1204] (2/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_685-90183-0_sp1.1 from training. Duration: 0.936375 2023-10-02 19:21:54,994 WARNING [train.py:1204] (2/4) Exclude cut with ID _13252_210_1925_1_1530671372047_3336909_41-22245-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 19:21:58,352 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_15126_210_6753_1_1530778677_2591185_120-277707-0 from training. Duration: 0.8 2023-10-02 19:21:59,460 WARNING [train.py:1204] (2/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_310-26186-0 from training. Duration: 0.7 2023-10-02 19:21:59,477 WARNING [train.py:1204] (2/4) Exclude cut with ID _14998_210_5219_1_1530705582080_7474970_787-294416-0 from training. Duration: 0.82 2023-10-02 19:22:04,227 WARNING [train.py:1204] (2/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_795-33252-0 from training. Duration: 0.37 2023-10-02 19:22:05,582 WARNING [train.py:1204] (2/4) Exclude cut with ID _7537_210_2084_1_1528502395431_3924359_113-199933-0_sp0.9 from training. Duration: 0.6555625 2023-10-02 19:22:10,443 WARNING [train.py:1204] (2/4) Exclude cut with ID _3465_210_3525_1_1526175667060_3574049_308-231735-0_sp0.9 from training. Duration: 0.811125 2023-10-02 19:22:10,463 WARNING [train.py:1204] (2/4) Exclude cut with ID _19504_210_9994_1_1531648863242_3325220_354-296765-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 19:22:12,403 WARNING [train.py:1204] (2/4) Exclude cut with ID _5409_210_1388_1_1527292850189_5865191_178-127970-0 from training. Duration: 0.57 2023-10-02 19:22:12,455 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_1466-248056-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 19:22:12,473 WARNING [train.py:1204] (2/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_535-14410-0_sp0.9 from training. Duration: 0.52225 2023-10-02 19:22:12,475 WARNING [train.py:1204] (2/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_747-129941-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 19:22:13,839 WARNING [train.py:1204] (2/4) Exclude cut with ID _15012_210_1968_1_1530856820429_5095350_688-178849-0_sp1.1 from training. Duration: 0.5818125 2023-10-02 19:22:14,061 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.conv_skip_rate, batch_count=991046.6666666666, ans=0.0 2023-10-02 19:22:16,620 WARNING [train.py:1204] (2/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_356-45054-0_sp1.1 from training. Duration: 0.7818125 2023-10-02 19:22:17,956 WARNING [train.py:1204] (2/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_146-276944-0_sp0.9 from training. Duration: 0.92225 2023-10-02 19:22:19,607 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.attention_skip_rate, batch_count=991113.3333333334, ans=0.0 2023-10-02 19:22:20,771 WARNING [train.py:1204] (2/4) Exclude cut with ID _39204_210_4220_1_1533880547368_3191120_346-180306-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 19:22:22,220 WARNING [train.py:1204] (2/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_641-158017-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 19:22:22,220 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_19952_210_2161_1_1531875469_3851750_274-42137-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 19:22:23,452 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.647e+02 1.958e+02 2.159e+02 2.397e+02 4.088e+02, threshold=4.317e+02, percent-clipped=0.0 2023-10-02 19:22:26,238 WARNING [train.py:1204] (2/4) Exclude cut with ID _60804_210_9642_1_1534831199859_7268299_87-327819-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 19:22:28,242 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_9302_210_2550_1_1528889962_4655416_638-301620-0 from training. Duration: 0.47 2023-10-02 19:22:28,306 WARNING [train.py:1204] (2/4) Exclude cut with ID _44304_210_15374_1_1533639352765_4613129_302-22084-0_sp1.1 from training. Duration: 0.7818125 2023-10-02 19:22:28,317 WARNING [train.py:1204] (2/4) Exclude cut with ID _18513_210_9206_1_1531618215223_3862620_177-227063-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 19:22:29,675 WARNING [train.py:1204] (2/4) Exclude cut with ID _41326_210_5854_1_1533778076509_3492080_510-45444-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 19:22:30,961 INFO [train.py:1046] (2/4) Epoch 28, batch 5250, loss[loss=0.1642, simple_loss=0.2189, pruned_loss=0.05472, over 22684.00 frames. ], tot_loss[loss=0.1656, simple_loss=0.243, pruned_loss=0.0441, over 4728304.46 frames. ], batch size: 322, lr: 3.62e-03, grad_scale: 16.0 2023-10-02 19:22:31,008 WARNING [train.py:1204] (2/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_76-311963-0_sp1.1 from training. Duration: 0.636375 2023-10-02 19:22:31,105 WARNING [train.py:1204] (2/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_156-302675-0_sp0.9 from training. Duration: 0.611125 2023-10-02 19:22:32,659 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.1.bypass.skip_rate, batch_count=991180.0, ans=0.035 2023-10-02 19:22:32,704 INFO [scaling.py:1118] (2/4) WithLoss: name=encoder.encoders.1.encoder.layers.1.self_attn_weights, loss-sum=0.000e+00 2023-10-02 19:22:33,795 WARNING [train.py:1204] (2/4) Exclude cut with ID _53489_210_6228_1_1534298357169_3835970_367-148411-0_sp1.1 from training. Duration: 0.936375 2023-10-02 19:22:37,088 WARNING [train.py:1204] (2/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_389-144525-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 19:22:37,136 WARNING [train.py:1204] (2/4) Exclude cut with ID _14069_210_5632_1_1530512692033_4803390_158-238427-0_sp1.1 from training. Duration: 0.963625 2023-10-02 19:22:38,521 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_486-151131-0_sp0.9 from training. Duration: 0.9444375 2023-10-02 19:22:45,819 WARNING [train.py:1204] (2/4) Exclude cut with ID _39846_210_12651_1_1533603651992_1318030_188-71831-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 19:22:47,189 WARNING [train.py:1204] (2/4) Exclude cut with ID _39411_210_5986_1_1533538825030_3343649_173-238248-0_sp1.1 from training. Duration: 0.7545625 2023-10-02 19:22:50,089 WARNING [train.py:1204] (2/4) Exclude cut with ID _20684_210_6917_1_1531567751646_4203930_469-49081-0_sp1.1 from training. Duration: 0.92725 2023-10-02 19:22:51,452 WARNING [train.py:1204] (2/4) Exclude cut with ID _8833_210_5219_1_1528592665210_7553059_611-80313-0_sp1.1 from training. Duration: 0.5818125 2023-10-02 19:22:54,132 WARNING [train.py:1204] (2/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_440-141811-0 from training. Duration: 0.95 2023-10-02 19:22:54,146 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_41211_210_2544_1_1533348859_3702910_236-223652-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 19:22:54,224 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_40222_210_6228_1_1533385298_4481939_220-163998-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 19:22:58,349 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.feed_forward1.out_proj.dropout_p, batch_count=991313.3333333334, ans=0.1 2023-10-02 19:23:05,666 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=991313.3333333334, ans=0.1 2023-10-02 19:23:15,191 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.2.encoder.layers.0.self_attn2.whiten, num_groups=1, num_channels=384, metric=19.05 vs. limit=22.5 2023-10-02 19:23:39,947 INFO [train.py:1046] (2/4) Epoch 28, batch 5300, loss[loss=0.1734, simple_loss=0.2434, pruned_loss=0.05167, over 13487.00 frames. ], tot_loss[loss=0.1644, simple_loss=0.2418, pruned_loss=0.04352, over 4706118.27 frames. ], batch size: 28, lr: 3.62e-03, grad_scale: 16.0 2023-10-02 19:23:52,015 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.attention_skip_rate, batch_count=991580.0, ans=0.0 2023-10-02 19:23:55,173 WARNING [train.py:1204] (2/4) Exclude cut with ID _14629_210_15709_1_1530604298182_956899_6-232695-0_sp1.1 from training. Duration: 0.8545625 2023-10-02 19:23:55,194 WARNING [train.py:1204] (2/4) Exclude cut with ID _13921_210_9994_1_1530841782272_3503520_327-69677-0 from training. Duration: 0.9 2023-10-02 19:23:55,218 WARNING [train.py:1204] (2/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_372-298093-0 from training. Duration: 0.8 2023-10-02 19:23:55,231 WARNING [train.py:1204] (2/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_515-139111-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 19:23:55,458 WARNING [train.py:1204] (2/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_85-65056-0_sp1.1 from training. Duration: 0.87275 2023-10-02 19:23:55,499 WARNING [train.py:1204] (2/4) Exclude cut with ID _15113_210_8254_1_1530842438579_3716279_209-54533-0_sp1.1 from training. Duration: 0.87275 2023-10-02 19:23:55,893 WARNING [train.py:1204] (2/4) Exclude cut with ID _70447_210_12392_1_1535972499300_3160420_123-312838-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 19:23:55,921 WARNING [train.py:1204] (2/4) Exclude cut with ID _60113_210_16271_1_1535164212481_4386079_70-63596-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 19:23:55,950 WARNING [train.py:1204] (2/4) Exclude cut with ID _14632_210_9988_1_1530691279392_3923200_225-188509-0_sp1.1 from training. Duration: 0.963625 2023-10-02 19:23:55,998 WARNING [train.py:1204] (2/4) Exclude cut with ID _34734_210_4525_1_1533365977542_4448720_584-180532-0_sp1.1 from training. Duration: 0.97275 2023-10-02 19:23:56,012 WARNING [train.py:1204] (2/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_351-294650-0_sp1.1 from training. Duration: 0.57275 2023-10-02 19:23:56,348 WARNING [train.py:1204] (2/4) Exclude cut with ID _13252_210_1925_1_1530671372047_3336909_263-168388-0_sp0.9 from training. Duration: 0.9555625 2023-10-02 19:23:56,375 WARNING [train.py:1204] (2/4) Exclude cut with ID _22083_210_8254_1_1531702862439_3678970_201-85738-0 from training. Duration: 0.9 2023-10-02 19:23:56,468 WARNING [train.py:1204] (2/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_237-58433-0 from training. Duration: 0.74 2023-10-02 19:23:56,494 WARNING [train.py:1204] (2/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_262-250826-0 from training. Duration: 0.99 2023-10-02 19:23:56,597 WARNING [train.py:1204] (2/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_927-43827-0_sp0.9 from training. Duration: 0.82225 2023-10-02 19:23:56,628 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_2821_210_2715_1_1526108385_3763560_73-296673-0 from training. Duration: 0.83 2023-10-02 19:23:56,702 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6287_210_3273_1_1527509506_2653716_182-199887-0 from training. Duration: 0.51 2023-10-02 19:23:56,838 WARNING [train.py:1204] (2/4) Exclude cut with ID _70519_210_4163_1_1535961595994_7458230_899-80223-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 19:23:57,090 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_210-234949-0_sp1.1 from training. Duration: 0.97275 2023-10-02 19:23:57,126 WARNING [train.py:1204] (2/4) Exclude cut with ID _30196_210_1664_1_1533708496522_7074140_768-278176-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 19:23:57,206 WARNING [train.py:1204] (2/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_315-128283-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 19:23:57,289 WARNING [train.py:1204] (2/4) Exclude cut with ID _14886_210_4220_1_1530702020355_3516190_330-323941-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 19:23:57,530 WARNING [train.py:1204] (2/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_264-348896-0_sp1.1 from training. Duration: 0.77275 2023-10-02 19:23:57,550 WARNING [train.py:1204] (2/4) Exclude cut with ID _37086_210_9038_1_1532999540891_7021250_433-10563-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 19:23:57,583 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_53638_210_21581_1_1534297319_5808449_885-80172-0_sp1.1 from training. Duration: 0.87275 2023-10-02 19:23:58,055 WARNING [train.py:1204] (2/4) Exclude cut with ID _14623_210_1925_1_1530773354962_3632609_157-218771-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 19:23:58,066 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_1368-78165-0_sp1.1 from training. Duration: 0.97275 2023-10-02 19:23:58,070 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_309-65177-0_sp0.9 from training. Duration: 0.97775 2023-10-02 19:23:58,076 WARNING [train.py:1204] (2/4) Exclude cut with ID _21632_210_2253_1_1531994001765_3744140_291-138812-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 19:23:58,088 WARNING [train.py:1204] (2/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_669-93163-0_sp1.1 from training. Duration: 0.8909375 2023-10-02 19:23:58,637 WARNING [train.py:1204] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_292-215346-0 from training. Duration: 0.97 2023-10-02 19:23:58,662 WARNING [train.py:1204] (2/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_286-222073-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 19:23:58,951 WARNING [train.py:1204] (2/4) Exclude cut with ID _59256_210_5986_1_1535076085997_3589120_121-237386-0_sp1.1 from training. Duration: 0.87275 2023-10-02 19:23:58,965 WARNING [train.py:1204] (2/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_96-320736-0 from training. Duration: 0.86 2023-10-02 19:23:58,974 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_128-202104-0 from training. Duration: 0.63 2023-10-02 19:23:59,060 WARNING [train.py:1204] (2/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_242-3472-0_sp0.9 from training. Duration: 0.811125 2023-10-02 19:23:59,105 WARNING [train.py:1204] (2/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_154-74182-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 19:23:59,109 WARNING [train.py:1204] (2/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_187-221566-0 from training. Duration: 0.43 2023-10-02 19:23:59,270 WARNING [train.py:1204] (2/4) Exclude cut with ID _6194_210_1534_1_1527511826160_3941194_14-282876-0 from training. Duration: 0.71 2023-10-02 19:23:59,311 WARNING [train.py:1204] (2/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_660-211088-0_sp1.1 from training. Duration: 0.563625 2023-10-02 19:23:59,644 WARNING [train.py:1204] (2/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_526-62079-0_sp1.1 from training. Duration: 0.6090625 2023-10-02 19:23:59,784 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_92-246270-0_sp1.1 from training. Duration: 0.77275 2023-10-02 19:23:59,866 WARNING [train.py:1204] (2/4) Exclude cut with ID IC0287W0023-142350 from training. Duration: 0.956 2023-10-02 19:23:59,937 WARNING [train.py:1204] (2/4) Exclude cut with ID _58605_210_12210_1_1534723195939_3904259_48-269402-0 from training. Duration: 0.96 2023-10-02 19:23:59,951 WARNING [train.py:1204] (2/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_416-257730-0_sp0.9 from training. Duration: 0.72225 2023-10-02 19:24:00,023 WARNING [train.py:1204] (2/4) Exclude cut with ID _7877_210_6014_1_1528286358099_3750136_185-17516-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 19:24:00,619 WARNING [train.py:1204] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_23-189234-0 from training. Duration: 0.83 2023-10-02 19:24:00,635 WARNING [train.py:1204] (2/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_237-89684-0 from training. Duration: 0.96 2023-10-02 19:24:00,708 WARNING [train.py:1204] (2/4) Exclude cut with ID _13916_210_9994_1_1530788341126_3763149_116-21034-0 from training. Duration: 0.96 2023-10-02 19:24:00,856 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_3133_210_2087_1_1526106446_2324690_246-125000-0_sp1.1 from training. Duration: 0.563625 2023-10-02 19:24:07,201 INFO [train.py:1046] (2/4) Epoch 29, batch 0, loss[loss=0.1665, simple_loss=0.2428, pruned_loss=0.0451, over 23697.00 frames. ], tot_loss[loss=0.1665, simple_loss=0.2428, pruned_loss=0.0451, over 23697.00 frames. ], batch size: 232, lr: 3.56e-03, grad_scale: 32.0 2023-10-02 19:24:07,201 INFO [train.py:1069] (2/4) Computing validation loss 2023-10-02 19:24:18,214 INFO [zipformer.py:1853] (2/4) name=encoder.encoders.5.encoder.layers.0.self_attn_weights, attn_weights_entropy = tensor([1.8230, 1.7438, 3.8497, 3.5798], device='cuda:2') 2023-10-02 19:24:19,053 INFO [train.py:1078] (2/4) Epoch 29, validation: loss=0.3081, simple_loss=0.2785, pruned_loss=0.1688, over 1125622.00 frames. 2023-10-02 19:24:19,053 INFO [train.py:1079] (2/4) Maximum memory allocated so far is 20967MB 2023-10-02 19:24:20,566 WARNING [train.py:1204] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_402-253027-0 from training. Duration: 0.54 2023-10-02 19:24:20,638 WARNING [train.py:1204] (2/4) Exclude cut with ID _59288_210_5854_1_1535077757774_3412246_158-289345-0_sp1.1 from training. Duration: 0.92725 2023-10-02 19:24:22,005 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_28211_210_6388_1_1532861506_4523224_128-328162-0_sp1.1 from training. Duration: 0.8181875 2023-10-02 19:24:27,766 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_788-278281-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 19:24:27,773 WARNING [train.py:1204] (2/4) Exclude cut with ID _49728_210_12651_1_1534041034294_6788150_624-195301-0_sp0.9 from training. Duration: 0.9444375 2023-10-02 19:24:27,805 WARNING [train.py:1204] (2/4) Exclude cut with ID _14860_210_4915_1_1530702020340_7343240_267-139775-0_sp1.1 from training. Duration: 0.963625 2023-10-02 19:24:29,193 WARNING [train.py:1204] (2/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_332-221057-0 from training. Duration: 0.68 2023-10-02 19:24:30,569 WARNING [train.py:1204] (2/4) Exclude cut with ID _40402_210_18219_1_1533522324115_2953150_242-76943-0 from training. Duration: 0.94 2023-10-02 19:24:32,250 INFO [scaling.py:1118] (2/4) WithLoss: name=encoder.encoders.4.encoder.layers.2.self_attn_weights, loss-sum=0.000e+00 2023-10-02 19:24:33,421 WARNING [train.py:1204] (2/4) Exclude cut with ID _16747_210_5986_1_1531811544733_2749109_87-349587-0_sp1.1 from training. Duration: 0.963625 2023-10-02 19:24:33,482 WARNING [train.py:1204] (2/4) Exclude cut with ID _22982_210_1883_1_1531882790447_5449409_505-346452-0_sp1.1 from training. Duration: 0.963625 2023-10-02 19:24:38,137 WARNING [train.py:1204] (2/4) Exclude cut with ID _17956_210_4353_1_1531209568125_6979970_13-192107-0_sp1.1 from training. Duration: 0.963625 2023-10-02 19:24:38,192 WARNING [train.py:1204] (2/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_430-307666-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 19:24:39,568 WARNING [train.py:1204] (2/4) Exclude cut with ID _2895_210_2477_1_1525956785794_4063750_289-11475-0_sp1.1 from training. Duration: 0.7454375 2023-10-02 19:24:39,587 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_3910_210_1794_1_1526742306_334530_28-238909-0_sp0.9 from training. Duration: 0.9 2023-10-02 19:24:40,937 WARNING [train.py:1204] (2/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_18-130336-0 from training. Duration: 0.79 2023-10-02 19:24:43,730 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_548-294316-0_sp0.9 from training. Duration: 0.9 2023-10-02 19:24:50,829 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_42-142473-0_sp1.1 from training. Duration: 0.6545625 2023-10-02 19:24:50,843 WARNING [train.py:1204] (2/4) Exclude cut with ID _59170_210_6721_1_1534983633960_4203335_326-48066-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 19:24:54,220 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_45886_210_17024_1_1533709215_471063_7-79165-0 from training. Duration: 0.98 2023-10-02 19:24:57,042 WARNING [train.py:1204] (2/4) Exclude cut with ID _44585_210_18628_1_1533605350387_4518199_280-73534-0_sp0.9 from training. Duration: 0.988875 2023-10-02 19:24:57,703 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_3475_210_2028_1_1526193578_3622569_265-276625-0_sp1.1 from training. Duration: 0.6909375 2023-10-02 19:24:59,111 WARNING [train.py:1204] (2/4) Exclude cut with ID _64631_210_27767_1_1535349580068_4165160_88-225232-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 19:25:03,101 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_205-265518-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 19:25:05,700 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.501e+02 1.855e+02 2.126e+02 2.436e+02 5.590e+02, threshold=4.252e+02, percent-clipped=2.0 2023-10-02 19:25:07,137 WARNING [train.py:1204] (2/4) Exclude cut with ID _40402_210_18219_1_1533522324115_2953150_271-253734-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 19:25:13,153 WARNING [train.py:1204] (2/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_282-257791-0 from training. Duration: 0.86 2023-10-02 19:25:17,207 WARNING [train.py:1204] (2/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_356-45054-0 from training. Duration: 0.86 2023-10-02 19:25:18,454 WARNING [train.py:1204] (2/4) Exclude cut with ID _69584_210_5219_1_1535785133775_4044450_38-348116-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 19:25:18,455 WARNING [train.py:1204] (2/4) Exclude cut with ID _66763_210_17301_1_1535788667617_241750_21-328480-0_sp1.1 from training. Duration: 0.936375 2023-10-02 19:25:18,524 WARNING [train.py:1204] (2/4) Exclude cut with ID _26528_210_9070_1_1532570410842_3851310_497-97218-0_sp0.9 from training. Duration: 0.9555625 2023-10-02 19:25:19,860 WARNING [train.py:1204] (2/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_592-101075-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 19:25:21,311 WARNING [train.py:1204] (2/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_678-159307-0 from training. Duration: 0.85 2023-10-02 19:25:22,743 WARNING [train.py:1204] (2/4) Exclude cut with ID _28427_210_4220_1_1532581240967_3061069_166-319773-0_sp1.1 from training. Duration: 0.936375 2023-10-02 19:25:24,122 WARNING [train.py:1204] (2/4) Exclude cut with ID _29877_210_6228_1_1532397791106_5276149_606-224984-0_sp1.1 from training. Duration: 0.936375 2023-10-02 19:25:26,263 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_125-203105-0_sp1.1 from training. Duration: 0.72725 2023-10-02 19:25:29,434 WARNING [train.py:1204] (2/4) Exclude cut with ID IC0620W0113-306765 from training. Duration: 0.768 2023-10-02 19:25:30,856 WARNING [train.py:1204] (2/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_410-287857-0_sp1.1 from training. Duration: 0.8454375 2023-10-02 19:25:32,150 INFO [train.py:1046] (2/4) Epoch 29, batch 50, loss[loss=0.224, simple_loss=0.2951, pruned_loss=0.0765, over 19983.00 frames. ], tot_loss[loss=0.1698, simple_loss=0.2478, pruned_loss=0.04583, over 1060765.19 frames. ], batch size: 388, lr: 3.56e-03, grad_scale: 32.0 2023-10-02 19:25:33,564 WARNING [train.py:1204] (2/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_78-95260-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 19:25:34,977 WARNING [train.py:1204] (2/4) Exclude cut with ID _58059_210_5854_1_1534901471149_3164808_6-73080-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 19:25:34,999 WARNING [train.py:1204] (2/4) Exclude cut with ID _5166_210_2084_1_1527073197160_4087993_331-117829-0 from training. Duration: 0.62 2023-10-02 19:25:36,295 WARNING [train.py:1204] (2/4) Exclude cut with ID _14880_210_8895_1_1530680426655_3795259_289-212817-0_sp0.9 from training. Duration: 0.7666875 2023-10-02 19:25:36,320 WARNING [train.py:1204] (2/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_620-144779-0_sp1.1 from training. Duration: 0.92725 2023-10-02 19:25:37,861 WARNING [train.py:1204] (2/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_561-288575-0_sp1.1 from training. Duration: 0.963625 2023-10-02 19:25:37,986 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.0.bypass_mid.scale_min, batch_count=991933.3333333334, ans=0.2 2023-10-02 19:25:41,002 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_22657_210_6890_1_1532086002_1637216_122-188128-0_sp1.1 from training. Duration: 0.963625 2023-10-02 19:25:43,583 WARNING [train.py:1204] (2/4) Exclude cut with ID _16756_210_4915_1_1531047803763_7104259_604-86607-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 19:25:46,789 WARNING [train.py:1204] (2/4) Exclude cut with ID _15150_210_8341_1_1530752411803_7256384_469-239509-0 from training. Duration: 0.68 2023-10-02 19:25:46,805 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_10987_210_4163_1_1529760555_4123902_302-87926-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 19:25:47,604 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.1.encoder.layers.0.nonlin_attention.whiten1, num_groups=1, num_channels=192, metric=4.60 vs. limit=10.0 2023-10-02 19:25:49,741 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.conv_module2.balancer1.min_positive, batch_count=992000.0, ans=0.025 2023-10-02 19:25:52,356 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.1.nonlin_attention.balancer.prob, batch_count=992000.0, ans=0.125 2023-10-02 19:25:53,511 WARNING [train.py:1204] (2/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_469-94673-0_sp0.9 from training. Duration: 0.87775 2023-10-02 19:25:54,881 WARNING [train.py:1204] (2/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_798-106179-0 from training. Duration: 0.83 2023-10-02 19:25:56,883 WARNING [train.py:1204] (2/4) Exclude cut with ID _16744_210_5986_1_1531724405384_3907567_242-111316-0 from training. Duration: 0.93 2023-10-02 19:25:58,810 WARNING [train.py:1204] (2/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_938-205135-0_sp0.9 from training. Duration: 0.9666875 2023-10-02 19:26:00,079 WARNING [train.py:1204] (2/4) Exclude cut with ID _64135_210_2253_1_1535677267876_7608972_133-188266-0_sp0.9 from training. Duration: 0.9 2023-10-02 19:26:00,083 WARNING [train.py:1204] (2/4) Exclude cut with ID _44585_210_18628_1_1533605350387_4518199_623-346820-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 19:26:01,468 WARNING [train.py:1204] (2/4) Exclude cut with ID _42326_210_7770_1_1533814282531_3707269_454-266903-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 19:26:03,255 WARNING [train.py:1204] (2/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_5-55893-0_sp1.1 from training. Duration: 0.5 2023-10-02 19:26:03,288 WARNING [train.py:1204] (2/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_577-332600-0_sp1.1 from training. Duration: 0.5181875 2023-10-02 19:26:03,289 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_957-138882-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 19:26:09,148 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.feed_forward1.hidden_balancer.prob, batch_count=992066.6666666666, ans=0.125 2023-10-02 19:26:10,211 WARNING [train.py:1204] (2/4) Exclude cut with ID _12556_210_8341_1_1530081024100_7327690_219-38004-0_sp1.1 from training. Duration: 0.97275 2023-10-02 19:26:13,160 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_397-147196-0_sp1.1 from training. Duration: 0.72725 2023-10-02 19:26:13,178 WARNING [train.py:1204] (2/4) Exclude cut with ID _3541_210_2477_1_1526475408709_3263973_231-104736-0_sp0.9 from training. Duration: 0.8666875 2023-10-02 19:26:14,512 WARNING [train.py:1204] (2/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_398-334558-0 from training. Duration: 0.96 2023-10-02 19:26:15,959 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_257-20733-0_sp1.1 from training. Duration: 0.6909375 2023-10-02 19:26:17,307 WARNING [train.py:1204] (2/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_146-282400-0_sp1.1 from training. Duration: 0.6454375 2023-10-02 19:26:17,311 WARNING [train.py:1204] (2/4) Exclude cut with ID _51899_210_20034_1_1534129254466_3669220_265-215428-0 from training. Duration: 0.98 2023-10-02 19:26:17,392 WARNING [train.py:1204] (2/4) Exclude cut with ID _34423_210_8750_1_1533430798456_3330199_219-106541-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 19:26:18,804 WARNING [train.py:1204] (2/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_458-67321-0 from training. Duration: 0.97 2023-10-02 19:26:26,953 WARNING [train.py:1204] (2/4) Exclude cut with ID _6653_210_2629_1_1527919172441_6732679_1051-182606-0_sp1.1 from training. Duration: 0.936375 2023-10-02 19:26:26,976 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_53555_210_5632_1_1534595220_7232297_603-4396-0_sp1.1 from training. Duration: 0.97275 2023-10-02 19:26:28,347 WARNING [train.py:1204] (2/4) Exclude cut with ID _54218_210_12210_1_1534413646388_4409259_159-144367-0_sp1.1 from training. Duration: 0.963625 2023-10-02 19:26:28,449 WARNING [train.py:1204] (2/4) Exclude cut with ID _7606_210_4915_1_1528894253277_5080690_301-10732-0_sp0.9 from training. Duration: 0.9 2023-10-02 19:26:28,452 WARNING [train.py:1204] (2/4) Exclude cut with ID _15026_210_8750_1_1530777602316_3291850_211-195043-0_sp0.9 from training. Duration: 0.888875 2023-10-02 19:26:32,271 WARNING [train.py:1204] (2/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_146-282400-0 from training. Duration: 0.71 2023-10-02 19:26:32,300 WARNING [train.py:1204] (2/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_65-88242-0 from training. Duration: 0.56 2023-10-02 19:26:33,690 WARNING [train.py:1204] (2/4) Exclude cut with ID _55857_210_2346_1_1534644007831_6898209_80-107757-0_sp1.1 from training. Duration: 0.963625 2023-10-02 19:26:33,735 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_15126_210_6753_1_1530778677_2591185_120-277707-0_sp0.9 from training. Duration: 0.888875 2023-10-02 19:26:35,620 WARNING [train.py:1204] (2/4) Exclude cut with ID _44778_210_2054_1_1533798127203_7443090_1033-168593-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 19:26:35,670 WARNING [train.py:1204] (2/4) Exclude cut with ID _46683_210_9642_1_1533730913492_4591116_475-30711-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 19:26:35,890 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.feed_forward2.hidden_balancer.prob, batch_count=992200.0, ans=0.125 2023-10-02 19:26:36,959 WARNING [train.py:1204] (2/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_253-308377-0 from training. Duration: 0.49 2023-10-02 19:26:37,018 WARNING [train.py:1204] (2/4) Exclude cut with ID _4680_210_3525_1_1527249757953_893386_75-78979-0 from training. Duration: 0.91 2023-10-02 19:26:38,381 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_5522_210_3273_1_1527249606_3909096_318-203153-0_sp0.9 from training. Duration: 0.47775 2023-10-02 19:26:38,489 WARNING [train.py:1204] (2/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_550-170116-0_sp1.1 from training. Duration: 0.92725 2023-10-02 19:26:38,516 WARNING [train.py:1204] (2/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_277-210798-0_sp0.9 from training. Duration: 0.611125 2023-10-02 19:26:39,754 WARNING [train.py:1204] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_620-276793-0 from training. Duration: 0.59 2023-10-02 19:26:39,767 WARNING [train.py:1204] (2/4) Exclude cut with ID _3067_210_2667_1_1526212974640_4503168_119-175776-0 from training. Duration: 0.74 2023-10-02 19:26:41,105 WARNING [train.py:1204] (2/4) Exclude cut with ID _55209_210_1531_1_1534675828996_4597160_681-195827-0_sp1.1 from training. Duration: 0.92725 2023-10-02 19:26:41,357 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.nonlin_attention.balancer.prob, batch_count=992200.0, ans=0.125 2023-10-02 19:26:42,710 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_427-78097-0_sp1.1 from training. Duration: 0.72725 2023-10-02 19:26:45,420 INFO [train.py:1046] (2/4) Epoch 29, batch 100, loss[loss=0.2326, simple_loss=0.2928, pruned_loss=0.0862, over 19691.00 frames. ], tot_loss[loss=0.1682, simple_loss=0.247, pruned_loss=0.04473, over 1867884.47 frames. ], batch size: 388, lr: 3.56e-03, grad_scale: 16.0 2023-10-02 19:26:45,460 WARNING [train.py:1204] (2/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_96-290039-0_sp0.9 from training. Duration: 0.62225 2023-10-02 19:26:45,469 WARNING [train.py:1204] (2/4) Exclude cut with ID _45329_210_7005_1_1534157951211_6774339_287-292174-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 19:26:46,960 WARNING [train.py:1204] (2/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_200-292228-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 19:26:48,644 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.ff2_skip_rate, batch_count=992266.6666666666, ans=0.0 2023-10-02 19:26:49,769 WARNING [train.py:1204] (2/4) Exclude cut with ID _69884_210_9771_1_1535846392818_4166169_291-194999-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 19:26:51,235 WARNING [train.py:1204] (2/4) Exclude cut with ID _58895_210_8750_1_1535184081320_3649089_265-323810-0_sp1.1 from training. Duration: 0.87275 2023-10-02 19:26:52,723 WARNING [train.py:1204] (2/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_58-68956-0 from training. Duration: 0.83 2023-10-02 19:26:52,736 WARNING [train.py:1204] (2/4) Exclude cut with ID _39867_210_9826_1_1533553222030_9344259_322-115153-0_sp1.1 from training. Duration: 0.963625 2023-10-02 19:26:56,760 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_3371_210_1794_1_1526384462_5580777_396-339812-0_sp1.1 from training. Duration: 0.77275 2023-10-02 19:26:56,779 WARNING [train.py:1204] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_731-2428-0_sp0.9 from training. Duration: 0.92225 2023-10-02 19:26:58,603 WARNING [train.py:1204] (2/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_98-741-0_sp1.1 from training. Duration: 0.72725 2023-10-02 19:26:58,605 WARNING [train.py:1204] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_238-245655-0_sp0.9 from training. Duration: 0.9 2023-10-02 19:26:58,625 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6788_210_1388_1_1527898698_4562021_91-197981-0_sp0.9 from training. Duration: 0.92225 2023-10-02 19:27:00,575 WARNING [train.py:1204] (2/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_860-96198-0 from training. Duration: 0.73 2023-10-02 19:27:02,054 WARNING [train.py:1204] (2/4) Exclude cut with ID _14268_210_9206_1_1530622771285_4714099_331-105351-0_sp0.9 from training. Duration: 0.72225 2023-10-02 19:27:02,075 WARNING [train.py:1204] (2/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_204-137133-0_sp1.1 from training. Duration: 0.92725 2023-10-02 19:27:02,093 WARNING [train.py:1204] (2/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_5-46496-0_sp1.1 from training. Duration: 0.936375 2023-10-02 19:27:02,113 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_160-218721-0_sp1.1 from training. Duration: 0.87275 2023-10-02 19:27:06,848 WARNING [train.py:1204] (2/4) Exclude cut with ID _45329_210_7005_1_1534157951211_6774339_503-287883-0 from training. Duration: 0.88 2023-10-02 19:27:06,915 WARNING [train.py:1204] (2/4) Exclude cut with ID _57572_210_5517_1_1534921219624_3809484_110-79134-0_sp1.1 from training. Duration: 0.92725 2023-10-02 19:27:09,701 WARNING [train.py:1204] (2/4) Exclude cut with ID _44585_210_18628_1_1533605350387_4518199_421-37641-0_sp1.1 from training. Duration: 0.936375 2023-10-02 19:27:11,140 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_42-142473-0_sp0.9 from training. Duration: 0.8 2023-10-02 19:27:13,011 WARNING [train.py:1204] (2/4) Exclude cut with ID _5166_210_2084_1_1527073197160_4087993_310-114452-0_sp0.9 from training. Duration: 0.6555625 2023-10-02 19:27:15,800 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9029W0120-509520 from training. Duration: 0.768 2023-10-02 19:27:15,814 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9029W0433-509820 from training. Duration: 0.896 2023-10-02 19:27:15,986 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.attention_skip_rate, batch_count=992400.0, ans=0.0 2023-10-02 19:27:17,239 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_40222_210_6228_1_1533385298_4481939_163-334586-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 19:27:17,239 WARNING [train.py:1204] (2/4) Exclude cut with ID _48677_210_5663_1_1533902405919_3892989_408-40740-0_sp1.1 from training. Duration: 0.7545625 2023-10-02 19:27:20,617 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.5.encoder.layers.0.conv_module1.whiten, num_groups=1, num_channels=256, metric=7.74 vs. limit=15.0 2023-10-02 19:27:21,359 WARNING [train.py:1204] (2/4) Exclude cut with ID _24314_210_4915_1_1532135078116_7170179_521-258868-0_sp0.9 from training. Duration: 0.87775 2023-10-02 19:27:24,096 WARNING [train.py:1204] (2/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_576-51198-0_sp1.1 from training. Duration: 0.92725 2023-10-02 19:27:25,566 WARNING [train.py:1204] (2/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_306-216871-0_sp1.1 from training. Duration: 0.97275 2023-10-02 19:27:30,356 WARNING [train.py:1204] (2/4) Exclude cut with ID _20684_210_6917_1_1531567751646_4203930_463-119982-0_sp1.1 from training. Duration: 0.97275 2023-10-02 19:27:30,427 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9084W0012-536850 from training. Duration: 0.64 2023-10-02 19:27:30,666 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.ff2_skip_rate, batch_count=992466.6666666666, ans=0.0 2023-10-02 19:27:32,568 WARNING [train.py:1204] (2/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_847-41334-0_sp1.1 from training. Duration: 0.4 2023-10-02 19:27:34,992 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.572e+02 1.846e+02 1.979e+02 2.261e+02 3.658e+02, threshold=3.958e+02, percent-clipped=0.0 2023-10-02 19:27:35,198 WARNING [train.py:1204] (2/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_326-165900-0_sp1.1 from training. Duration: 0.62725 2023-10-02 19:27:36,349 WARNING [train.py:1204] (2/4) Exclude cut with ID _7537_210_2084_1_1528502395431_3924359_128-268029-0_sp1.1 from training. Duration: 0.8909375 2023-10-02 19:27:38,296 WARNING [train.py:1204] (2/4) Exclude cut with ID _67499_210_17282_1_1535509805220_3692430_333-155030-0_sp1.1 from training. Duration: 0.97275 2023-10-02 19:27:39,734 INFO [scaling.py:1118] (2/4) WithLoss: name=encoder.encoders.2.encoder.layers.1.self_attn_weights, loss-sum=0.000e+00 2023-10-02 19:27:40,861 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_34008_210_10431_1_1533345424_8183247_588-257177-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 19:27:44,032 WARNING [train.py:1204] (2/4) Exclude cut with ID _25914_210_6400_1_1532141317058_644740_52-96808-0_sp1.1 from training. Duration: 0.82725 2023-10-02 19:27:45,412 WARNING [train.py:1204] (2/4) Exclude cut with ID _56494_210_4220_1_1535284837625_3033169_69-138150-0_sp1.1 from training. Duration: 0.8545625 2023-10-02 19:27:47,020 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.conv_module1.balancer1.prob, batch_count=992533.3333333334, ans=0.125 2023-10-02 19:27:47,335 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.3.encoder.layers.2.whiten, num_groups=1, num_channels=512, metric=4.95 vs. limit=12.0 2023-10-02 19:27:48,014 WARNING [train.py:1204] (2/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_19-143553-0_sp1.1 from training. Duration: 0.97275 2023-10-02 19:27:48,084 WARNING [train.py:1204] (2/4) Exclude cut with ID _21878_210_3525_1_1531738772002_7264330_582-130735-0_sp1.1 from training. Duration: 0.936375 2023-10-02 19:27:49,486 WARNING [train.py:1204] (2/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_943-201356-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 19:27:49,496 WARNING [train.py:1204] (2/4) Exclude cut with ID _3231_210_2106_1_1526205864934_6784220_422-229276-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 19:27:49,533 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_48049_210_5632_1_1533864553_3519937_73-198717-0_sp1.1 from training. Duration: 0.97275 2023-10-02 19:27:49,584 WARNING [train.py:1204] (2/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_329-285801-0 from training. Duration: 0.91 2023-10-02 19:27:49,607 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9171W0114-579000 from training. Duration: 0.896 2023-10-02 19:27:50,925 WARNING [train.py:1204] (2/4) Exclude cut with ID _3120_210_3525_1_1526036143112_4072980_210-132675-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 19:27:50,996 WARNING [train.py:1204] (2/4) Exclude cut with ID _55857_210_2346_1_1534644007831_6898209_745-98391-0_sp0.9 from training. Duration: 0.8444375 2023-10-02 19:27:52,262 WARNING [train.py:1204] (2/4) Exclude cut with ID _5250_210_5219_1_1527249561806_8692110_615-322035-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 19:27:52,266 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_36756_210_7234_1_1533026991_966276_119-315767-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 19:27:52,289 WARNING [train.py:1204] (2/4) Exclude cut with ID _13916_210_9994_1_1530788341126_3763149_60-191556-0_sp1.1 from training. Duration: 0.5545625 2023-10-02 19:27:52,306 WARNING [train.py:1204] (2/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_177-236809-0_sp1.1 from training. Duration: 0.4454375 2023-10-02 19:27:53,669 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7002_210_1794_1_1527939683_5157286_184-229630-0_sp1.1 from training. Duration: 0.67275 2023-10-02 19:27:53,674 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_50428_210_5724_1_1534247532_4126418_464-18322-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 19:27:53,736 WARNING [train.py:1204] (2/4) Exclude cut with ID _35799_210_5854_1_1533263487887_3529071_466-131402-0_sp1.1 from training. Duration: 0.936375 2023-10-02 19:27:53,810 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_1259-132768-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 19:27:55,129 WARNING [train.py:1204] (2/4) Exclude cut with ID _6694_210_4599_1_1527852637490_3965529_144-185051-0_sp1.1 from training. Duration: 0.87275 2023-10-02 19:27:56,430 WARNING [train.py:1204] (2/4) Exclude cut with ID _64639_210_5816_1_1535367619437_3528000_59-133290-0_sp1.1 from training. Duration: 0.963625 2023-10-02 19:27:57,917 WARNING [train.py:1204] (2/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_223-49525-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 19:27:59,693 INFO [train.py:1046] (2/4) Epoch 29, batch 150, loss[loss=0.1569, simple_loss=0.2337, pruned_loss=0.04007, over 24582.00 frames. ], tot_loss[loss=0.1676, simple_loss=0.246, pruned_loss=0.04463, over 2506746.78 frames. ], batch size: 60, lr: 3.55e-03, grad_scale: 8.0 2023-10-02 19:27:59,812 WARNING [train.py:1204] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_748-36150-0_sp1.1 from training. Duration: 0.82725 2023-10-02 19:27:59,813 WARNING [train.py:1204] (2/4) Exclude cut with ID _19896_210_1883_1_1531709022662_6649569_52-221627-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 19:28:01,731 WARNING [train.py:1204] (2/4) Exclude cut with ID _6789_210_1534_1_1527854471671_2870519_296-115766-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 19:28:04,471 WARNING [train.py:1204] (2/4) Exclude cut with ID _14624_210_1925_1_1530858690657_3642219_100-19871-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 19:28:04,530 WARNING [train.py:1204] (2/4) Exclude cut with ID _12555_210_8750_1_1530075594402_3780239_50-293675-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 19:28:07,158 WARNING [train.py:1204] (2/4) Exclude cut with ID _14880_210_8895_1_1530680426655_3795259_289-212817-0_sp1.1 from training. Duration: 0.62725 2023-10-02 19:28:08,532 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_14056_210_4163_1_1530604483_7572827_388-211626-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 19:28:11,922 WARNING [train.py:1204] (2/4) Exclude cut with ID _5289_210_2084_1_1527678202736_3779299_392-261988-0 from training. Duration: 0.65 2023-10-02 19:28:11,937 WARNING [train.py:1204] (2/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_124-345840-0 from training. Duration: 0.52 2023-10-02 19:28:11,941 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_2655_210_2087_1_1525688959_3897400_437-337690-0 from training. Duration: 0.63 2023-10-02 19:28:14,660 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_3780_210_2087_1_1526467391_239068_13-274633-0_sp1.1 from training. Duration: 0.92725 2023-10-02 19:28:14,665 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_805-71497-0_sp1.1 from training. Duration: 0.6454375 2023-10-02 19:28:14,750 WARNING [train.py:1204] (2/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_434-83000-0_sp1.1 from training. Duration: 0.8909375 2023-10-02 19:28:17,309 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_17613_210_2151_1_1531137569_2366160_285-219531-0_sp1.1 from training. Duration: 0.97275 2023-10-02 19:28:17,314 WARNING [train.py:1204] (2/4) Exclude cut with ID _56108_210_16380_1_1534505446159_3887959_22-37858-0_sp1.1 from training. Duration: 0.936375 2023-10-02 19:28:17,354 WARNING [train.py:1204] (2/4) Exclude cut with ID _28427_210_4220_1_1532581240967_3061069_200-88444-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 19:28:18,664 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_43802_210_2290_1_1533516203_8829504_77-279042-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 19:28:20,033 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9309W0193-640320 from training. Duration: 0.896 2023-10-02 19:28:21,518 WARNING [train.py:1204] (2/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_516-324055-0_sp1.1 from training. Duration: 0.936375 2023-10-02 19:28:26,951 WARNING [train.py:1204] (2/4) Exclude cut with ID _40094_210_16380_1_1533294005773_3641159_10-261240-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 19:28:30,342 WARNING [train.py:1204] (2/4) Exclude cut with ID _6267_210_2084_1_1527764658526_3894730_375-219887-0_sp1.1 from training. Duration: 0.6090625 2023-10-02 19:28:30,431 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6788_210_1388_1_1527898698_4562021_180-75288-0 from training. Duration: 0.99 2023-10-02 19:28:33,383 WARNING [train.py:1204] (2/4) Exclude cut with ID _8030_210_7497_1_1528444886781_3531089_262-86750-0_sp1.1 from training. Duration: 0.736375 2023-10-02 19:28:33,395 WARNING [train.py:1204] (2/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_934-325498-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 19:28:33,414 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_456-22141-0_sp0.9 from training. Duration: 0.97775 2023-10-02 19:28:36,164 WARNING [train.py:1204] (2/4) Exclude cut with ID _49643_210_7989_1_1534640244224_3939031_598-161926-0_sp1.1 from training. Duration: 0.8181875 2023-10-02 19:28:37,569 WARNING [train.py:1204] (2/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_345-49721-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 19:28:40,539 WARNING [train.py:1204] (2/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_440-141811-0_sp1.1 from training. Duration: 0.863625 2023-10-02 19:28:41,895 WARNING [train.py:1204] (2/4) Exclude cut with ID _64048_210_10626_1_1535198382802_3835179_394-212120-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 19:28:41,946 WARNING [train.py:1204] (2/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_91-277684-0 from training. Duration: 0.74 2023-10-02 19:28:48,746 WARNING [train.py:1204] (2/4) Exclude cut with ID _23038_210_9570_1_1532001584158_3652419_191-346343-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 19:28:50,159 WARNING [train.py:1204] (2/4) Exclude cut with ID _15140_210_9288_1_1530791388450_4880149_351-347974-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 19:28:50,196 WARNING [train.py:1204] (2/4) Exclude cut with ID _58605_210_12210_1_1534723195939_3904259_48-269402-0_sp1.1 from training. Duration: 0.87275 2023-10-02 19:28:50,202 WARNING [train.py:1204] (2/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_401-53423-0_sp0.9 from training. Duration: 0.611125 2023-10-02 19:28:52,933 WARNING [train.py:1204] (2/4) Exclude cut with ID _42659_210_2070_1_1533607442765_3120380_158-49004-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 19:28:55,613 WARNING [train.py:1204] (2/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_81-75849-0_sp1.1 from training. Duration: 0.3545625 2023-10-02 19:28:55,884 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.ff3_skip_rate, batch_count=992866.6666666666, ans=0.0 2023-10-02 19:28:58,339 WARNING [train.py:1204] (2/4) Exclude cut with ID _27052_210_5986_1_1532329197187_3585009_314-106499-0_sp1.1 from training. Duration: 0.836375 2023-10-02 19:28:59,765 WARNING [train.py:1204] (2/4) Exclude cut with ID _4626_210_3532_1_1526817652717_3916859_446-206656-0_sp0.9 from training. Duration: 0.8444375 2023-10-02 19:29:01,108 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_53378_210_17282_1_1534221071_5769509_158-114960-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 19:29:02,997 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_3171_210_2087_1_1526109084_2330007_37-64334-0_sp0.9 from training. Duration: 0.911125 2023-10-02 19:29:03,018 WARNING [train.py:1204] (2/4) Exclude cut with ID _5410_210_1388_1_1527397055795_1719020_39-112221-0 from training. Duration: 0.69 2023-10-02 19:29:04,368 WARNING [train.py:1204] (2/4) Exclude cut with ID _8064_210_5399_1_1528372299291_4282973_4-91037-0_sp0.9 from training. Duration: 0.97775 2023-10-02 19:29:04,389 WARNING [train.py:1204] (2/4) Exclude cut with ID ID0079W0443-717795 from training. Duration: 0.541 2023-10-02 19:29:07,684 WARNING [train.py:1204] (2/4) Exclude cut with ID _34734_210_4525_1_1533365977542_4448720_766-297928-0_sp1.1 from training. Duration: 0.936375 2023-10-02 19:29:11,743 INFO [train.py:1046] (2/4) Epoch 29, batch 200, loss[loss=0.1768, simple_loss=0.2616, pruned_loss=0.04607, over 23499.00 frames. ], tot_loss[loss=0.1677, simple_loss=0.2462, pruned_loss=0.04465, over 3008682.41 frames. ], batch size: 93, lr: 3.55e-03, grad_scale: 8.0 2023-10-02 19:29:11,842 WARNING [train.py:1204] (2/4) Exclude cut with ID _51471_210_1889_1_1534399391390_3094250_310-337793-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 19:29:11,860 WARNING [train.py:1204] (2/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_678-159307-0_sp0.9 from training. Duration: 0.9444375 2023-10-02 19:29:16,415 WARNING [train.py:1204] (2/4) Exclude cut with ID _50093_210_4220_1_1534417141207_3432299_259-322252-0 from training. Duration: 0.92 2023-10-02 19:29:17,793 WARNING [train.py:1204] (2/4) Exclude cut with ID _47790_210_16187_1_1533862814066_4207519_67-103444-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 19:29:17,822 WARNING [train.py:1204] (2/4) Exclude cut with ID _40551_210_10635_1_1533776424632_3416179_311-247578-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 19:29:21,943 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_4633_210_2040_1_1526824163_4145343_416-76117-0 from training. Duration: 0.95 2023-10-02 19:29:22,075 WARNING [train.py:1204] (2/4) Exclude cut with ID _5166_210_2084_1_1527073197160_4087993_331-117829-0_sp0.9 from training. Duration: 0.688875 2023-10-02 19:29:23,708 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.feed_forward3.hidden_balancer.prob, batch_count=992933.3333333334, ans=0.125 2023-10-02 19:29:24,823 WARNING [train.py:1204] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_299-247059-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 19:29:24,902 WARNING [train.py:1204] (2/4) Exclude cut with ID _64002_210_9570_1_1535198605157_38979_2-240845-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 19:29:29,014 WARNING [train.py:1204] (2/4) Exclude cut with ID _4681_210_3525_1_1527250689464_120889_15-126760-0_sp0.9 from training. Duration: 0.9 2023-10-02 19:29:29,025 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_21500_210_2054_1_1532149310_2716758_256-156611-0_sp1.1 from training. Duration: 0.936375 2023-10-02 19:29:29,047 WARNING [train.py:1204] (2/4) Exclude cut with ID _16908_210_5724_1_1531119157248_3772700_624-203729-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 19:29:44,136 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.conv_module2.balancer2.prob, batch_count=993066.6666666666, ans=0.125 2023-10-02 19:29:45,194 WARNING [train.py:1204] (2/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_605-219732-0_sp1.1 from training. Duration: 0.8545625 2023-10-02 19:29:45,234 WARNING [train.py:1204] (2/4) Exclude cut with ID _44718_210_5816_1_1534050051869_6469030_380-167520-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 19:29:46,938 WARNING [train.py:1204] (2/4) Exclude cut with ID _19896_210_1883_1_1531709022662_6649569_94-229927-0_sp1.1 from training. Duration: 0.8818125 2023-10-02 19:29:48,311 WARNING [train.py:1204] (2/4) Exclude cut with ID _19514_210_9259_1_1531530071330_5084230_519-174300-0_sp1.1 from training. Duration: 0.963625 2023-10-02 19:29:49,567 WARNING [train.py:1204] (2/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_340-174176-0_sp0.9 from training. Duration: 0.5444375 2023-10-02 19:29:49,570 WARNING [train.py:1204] (2/4) Exclude cut with ID _8263_210_1664_1_1528439452891_970220_68-181805-0_sp1.1 from training. Duration: 0.5818125 2023-10-02 19:29:50,230 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.3.encoder.layers.0.whiten, num_groups=1, num_channels=512, metric=5.92 vs. limit=12.0 2023-10-02 19:29:51,019 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.0.feed_forward1.out_proj.dropout_p, batch_count=993066.6666666666, ans=0.1 2023-10-02 19:29:51,113 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.balancer2.prob, batch_count=993066.6666666666, ans=0.125 2023-10-02 19:29:52,138 WARNING [train.py:1204] (2/4) Exclude cut with ID _3120_210_3525_1_1526036143112_4072980_348-257723-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 19:29:52,223 WARNING [train.py:1204] (2/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_453-133983-0_sp1.1 from training. Duration: 0.7090625 2023-10-02 19:29:53,654 WARNING [train.py:1204] (2/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_17-86060-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 19:29:53,674 WARNING [train.py:1204] (2/4) Exclude cut with ID _29877_210_6228_1_1532397791106_5276149_228-184657-0_sp1.1 from training. Duration: 0.97275 2023-10-02 19:29:55,040 WARNING [train.py:1204] (2/4) Exclude cut with ID _18516_210_9206_1_1531704653206_3692406_198-334870-0 from training. Duration: 0.92 2023-10-02 19:29:55,078 WARNING [train.py:1204] (2/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_340-174176-0_sp1.1 from training. Duration: 0.4454375 2023-10-02 19:29:55,090 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_3133_210_2087_1_1526106446_2324690_14-139186-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 19:29:56,991 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.4.encoder.layers.2.self_attn2.whiten, num_groups=1, num_channels=384, metric=11.70 vs. limit=22.5 2023-10-02 19:29:59,336 WARNING [train.py:1204] (2/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_126-29104-0_sp0.9 from training. Duration: 0.9444375 2023-10-02 19:30:00,532 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.484e+02 1.804e+02 1.986e+02 2.199e+02 3.066e+02, threshold=3.973e+02, percent-clipped=0.0 2023-10-02 19:30:03,428 WARNING [train.py:1204] (2/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_84-10310-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 19:30:09,117 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.feed_forward2.hidden_balancer.prob, batch_count=993200.0, ans=0.125 2023-10-02 19:30:10,172 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_15517_210_6753_1_1530954008_3487336_79-273076-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 19:30:11,535 WARNING [train.py:1204] (2/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_371-18169-0_sp0.9 from training. Duration: 0.9555625 2023-10-02 19:30:20,318 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_68702_210_23689_1_1535799577_3420068_170-92788-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 19:30:23,108 WARNING [train.py:1204] (2/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_78-182912-0 from training. Duration: 0.59 2023-10-02 19:30:23,166 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_3019_210_2028_1_1526044245_3264865_297-12384-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 19:30:23,172 WARNING [train.py:1204] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_147-111137-0_sp1.1 from training. Duration: 0.863625 2023-10-02 19:30:23,184 WARNING [train.py:1204] (2/4) Exclude cut with ID _28323_210_16187_1_1532480395667_4100300_117-8859-0_sp1.1 from training. Duration: 0.97275 2023-10-02 19:30:24,480 INFO [train.py:1046] (2/4) Epoch 29, batch 250, loss[loss=0.1742, simple_loss=0.2608, pruned_loss=0.04381, over 24428.00 frames. ], tot_loss[loss=0.1668, simple_loss=0.2453, pruned_loss=0.04411, over 3399498.85 frames. ], batch size: 69, lr: 3.55e-03, grad_scale: 8.0 2023-10-02 19:30:24,559 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7762_210_1794_1_1528199508_4057408_419-125009-0_sp1.1 from training. Duration: 0.7545625 2023-10-02 19:30:24,655 WARNING [train.py:1204] (2/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_268-87909-0 from training. Duration: 0.99 2023-10-02 19:30:25,987 WARNING [train.py:1204] (2/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_118-311357-0_sp1.1 from training. Duration: 0.92725 2023-10-02 19:30:26,001 WARNING [train.py:1204] (2/4) Exclude cut with ID ID0377W0003-870225 from training. Duration: 0.852 2023-10-02 19:30:27,451 WARNING [train.py:1204] (2/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_56-5838-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 19:30:28,945 WARNING [train.py:1204] (2/4) Exclude cut with ID _14603_210_4220_1_1530615688441_3097319_90-31383-0_sp0.9 from training. Duration: 0.9666875 2023-10-02 19:30:30,337 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_31583_210_3852_1_1532595851_4024413_58-86657-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 19:30:30,367 WARNING [train.py:1204] (2/4) Exclude cut with ID _30304_210_6319_1_1532476827261_7582060_344-113875-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 19:30:30,624 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.conv_skip_rate, batch_count=993266.6666666666, ans=0.0 2023-10-02 19:30:33,237 WARNING [train.py:1204] (2/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_278-104676-0_sp1.1 from training. Duration: 0.8909375 2023-10-02 19:30:34,491 WARNING [train.py:1204] (2/4) Exclude cut with ID _55844_210_2346_1_1534471212569_7625707_764-2387-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 19:30:35,992 WARNING [train.py:1204] (2/4) Exclude cut with ID _27430_210_7770_1_1532604689375_1378590_157-14963-0_sp1.1 from training. Duration: 0.963625 2023-10-02 19:30:40,936 WARNING [train.py:1204] (2/4) Exclude cut with ID _7160_210_1876_1_1527940923231_3703799_282-322611-0_sp1.1 from training. Duration: 0.7909375 2023-10-02 19:30:51,491 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.conv_module2.balancer2.min_abs, batch_count=993333.3333333334, ans=0.5 2023-10-02 19:30:53,182 WARNING [train.py:1204] (2/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_190-138640-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 19:30:54,676 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_33348_210_5632_1_1533086912_3324030_63-270922-0_sp1.1 from training. Duration: 0.97275 2023-10-02 19:30:55,955 WARNING [train.py:1204] (2/4) Exclude cut with ID _54871_210_22356_1_1534665798253_3856359_135-184281-0_sp1.1 from training. Duration: 0.9 2023-10-02 19:30:56,227 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=993400.0, ans=0.1 2023-10-02 19:31:00,173 WARNING [train.py:1204] (2/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_277-210798-0_sp1.1 from training. Duration: 0.5 2023-10-02 19:31:01,473 WARNING [train.py:1204] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_402-253027-0_sp0.9 from training. Duration: 0.6 2023-10-02 19:31:01,555 WARNING [train.py:1204] (2/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_540-275685-0_sp0.9 from training. Duration: 0.988875 2023-10-02 19:31:01,594 WARNING [train.py:1204] (2/4) Exclude cut with ID _59493_210_17282_1_1535078419077_3225879_118-18960-0_sp1.1 from training. Duration: 0.936375 2023-10-02 19:31:02,969 WARNING [train.py:1204] (2/4) Exclude cut with ID _6194_210_1534_1_1527511826160_3941194_321-148641-0_sp1.1 from training. Duration: 0.5818125 2023-10-02 19:31:04,226 WARNING [train.py:1204] (2/4) Exclude cut with ID _61133_210_9969_1_1535196777227_3696409_333-270325-0_sp1.1 from training. Duration: 0.8454375 2023-10-02 19:31:04,280 WARNING [train.py:1204] (2/4) Exclude cut with ID _49376_210_5986_1_1533985278732_3370740_307-145221-0_sp1.1 from training. Duration: 0.936375 2023-10-02 19:31:05,732 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6846_210_6753_1_1527850767_4295158_2-315073-0_sp0.9 from training. Duration: 0.97775 2023-10-02 19:31:08,468 WARNING [train.py:1204] (2/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_17-25045-0 from training. Duration: 0.59 2023-10-02 19:31:08,480 WARNING [train.py:1204] (2/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_503-333510-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 19:31:11,768 WARNING [train.py:1204] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_238-245655-0_sp1.1 from training. Duration: 0.736375 2023-10-02 19:31:12,416 WARNING [train.py:1204] (2/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_329-82235-0_sp0.9 from training. Duration: 0.911125 2023-10-02 19:31:12,421 WARNING [train.py:1204] (2/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_325-113798-0_sp0.9 from training. Duration: 0.9444375 2023-10-02 19:31:14,855 WARNING [train.py:1204] (2/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_395-37222-0_sp0.9 from training. Duration: 0.8444375 2023-10-02 19:31:14,933 WARNING [train.py:1204] (2/4) Exclude cut with ID _64135_210_2253_1_1535677267876_7608972_498-5039-0_sp1.1 from training. Duration: 0.7090625 2023-10-02 19:31:14,948 WARNING [train.py:1204] (2/4) Exclude cut with ID _14922_210_6228_1_1530707435273_3693310_303-228738-0_sp0.9 from training. Duration: 0.9333125 2023-10-02 19:31:16,800 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_577-220899-0_sp1.1 from training. Duration: 0.92725 2023-10-02 19:31:19,493 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_3475_210_2028_1_1526193578_3622569_377-166133-0_sp1.1 from training. Duration: 0.7818125 2023-10-02 19:31:19,527 WARNING [train.py:1204] (2/4) Exclude cut with ID _49376_210_5986_1_1533985278732_3370740_272-313512-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 19:31:24,194 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7762_210_1794_1_1528199508_4057408_422-227034-0_sp0.9 from training. Duration: 0.811125 2023-10-02 19:31:26,945 WARNING [train.py:1204] (2/4) Exclude cut with ID _18895_210_6228_1_1531357274234_3784930_74-341428-0_sp1.1 from training. Duration: 0.92725 2023-10-02 19:31:27,211 INFO [scaling.py:1118] (2/4) WithLoss: name=encoder.encoders.3.encoder.layers.1.self_attn_weights, loss-sum=0.000e+00 2023-10-02 19:31:29,767 WARNING [train.py:1204] (2/4) Exclude cut with ID _44902_210_16187_1_1533888006062_3792078_149-67212-0_sp1.1 from training. Duration: 0.7909375 2023-10-02 19:31:34,040 WARNING [train.py:1204] (2/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_243-126020-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 19:31:36,614 WARNING [train.py:1204] (2/4) Exclude cut with ID _54218_210_12210_1_1534413646388_4409259_259-6561-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 19:31:37,998 INFO [train.py:1046] (2/4) Epoch 29, batch 300, loss[loss=0.1738, simple_loss=0.2479, pruned_loss=0.04982, over 23364.00 frames. ], tot_loss[loss=0.1663, simple_loss=0.245, pruned_loss=0.04384, over 3704676.43 frames. ], batch size: 119, lr: 3.55e-03, grad_scale: 8.0 2023-10-02 19:31:38,107 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_41452_210_7291_1_1533437687_3941281_405-11559-0 from training. Duration: 0.91 2023-10-02 19:31:39,497 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_20120_210_6753_1_1531643035_5482859_759-225084-0_sp1.1 from training. Duration: 0.97275 2023-10-02 19:31:39,510 WARNING [train.py:1204] (2/4) Exclude cut with ID _14747_210_8341_1_1530685797463_3654089_147-131229-0_sp0.9 from training. Duration: 0.8444375 2023-10-02 19:31:40,934 WARNING [train.py:1204] (2/4) Exclude cut with ID _14867_210_1968_1_1530684179890_3846969_319-250868-0 from training. Duration: 0.99 2023-10-02 19:31:40,969 WARNING [train.py:1204] (2/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_580-93798-0_sp0.9 from training. Duration: 0.62225 2023-10-02 19:31:42,995 WARNING [train.py:1204] (2/4) Exclude cut with ID _9679_210_7497_1_1529129212906_4363929_512-313889-0_sp0.9 from training. Duration: 0.9 2023-10-02 19:31:44,307 WARNING [train.py:1204] (2/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_18-189198-0 from training. Duration: 0.9 2023-10-02 19:31:48,931 WARNING [train.py:1204] (2/4) Exclude cut with ID _49755_210_18053_1_1534581005846_3218709_137-64179-0_sp1.1 from training. Duration: 0.92725 2023-10-02 19:31:49,013 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_48049_210_5632_1_1533864553_3519937_306-283794-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 19:31:51,958 WARNING [train.py:1204] (2/4) Exclude cut with ID _43552_210_8913_1_1533726031059_3523429_25-120161-0_sp1.1 from training. Duration: 0.8909375 2023-10-02 19:31:53,786 WARNING [train.py:1204] (2/4) Exclude cut with ID _8833_210_5219_1_1528592665210_7553059_319-349181-0 from training. Duration: 0.99 2023-10-02 19:31:53,875 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_296-332325-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 19:31:56,501 WARNING [train.py:1204] (2/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_436-186311-0_sp1.1 from training. Duration: 0.5909375 2023-10-02 19:31:56,519 WARNING [train.py:1204] (2/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_348-127789-0 from training. Duration: 0.85 2023-10-02 19:31:56,524 WARNING [train.py:1204] (2/4) Exclude cut with ID _3992_210_1747_1_1526644816031_3731060_443-195183-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 19:31:59,328 WARNING [train.py:1204] (2/4) Exclude cut with ID _3465_210_3525_1_1526175667060_3574049_308-231735-0_sp1.1 from training. Duration: 0.663625 2023-10-02 19:32:03,404 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6786_210_1388_1_1527839699_4194495_378-46070-0_sp1.1 from training. Duration: 0.6545625 2023-10-02 19:32:04,665 WARNING [train.py:1204] (2/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_63-101996-0 from training. Duration: 0.88 2023-10-02 19:32:04,940 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.conv_skip_rate, batch_count=993666.6666666666, ans=0.0 2023-10-02 19:32:07,403 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_2541_210_1794_1_1525590269_3084765_156-173713-0 from training. Duration: 0.81 2023-10-02 19:32:07,438 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_217-239796-0_sp1.1 from training. Duration: 0.963625 2023-10-02 19:32:09,002 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.balancer2.prob, batch_count=993733.3333333334, ans=0.125 2023-10-02 19:32:10,506 WARNING [train.py:1204] (2/4) Exclude cut with ID _21878_210_3525_1_1531738772002_7264330_530-329400-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 19:32:10,668 WARNING [train.py:1204] (2/4) Exclude cut with ID _21783_210_11357_1_1531742458730_3762679_150-62920-0_sp1.1 from training. Duration: 0.963625 2023-10-02 19:32:10,668 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_5522_210_3273_1_1527249606_3909096_366-172508-0 from training. Duration: 0.66 2023-10-02 19:32:10,671 WARNING [train.py:1204] (2/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_303-340446-0_sp1.1 from training. Duration: 0.8181875 2023-10-02 19:32:13,808 WARNING [train.py:1204] (2/4) Exclude cut with ID _8699_210_2069_1_1528630346831_3960409_160-24408-0_sp1.1 from training. Duration: 0.82725 2023-10-02 19:32:15,913 WARNING [train.py:1204] (2/4) Exclude cut with ID _41314_210_12651_1_1533459476543_3766460_299-187801-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 19:32:15,947 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_34886_210_10633_1_1533203882_1755661_92-345288-0_sp1.1 from training. Duration: 0.97275 2023-10-02 19:32:20,187 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_5438_210_1794_1_1527336687_2600009_104-288274-0_sp1.1 from training. Duration: 0.52725 2023-10-02 19:32:20,191 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_154-316450-0 from training. Duration: 0.76 2023-10-02 19:32:21,519 WARNING [train.py:1204] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_23-189234-0_sp0.9 from training. Duration: 0.92225 2023-10-02 19:32:24,813 WARNING [train.py:1204] (2/4) Exclude cut with ID _4779_210_2084_1_1527318022944_3914893_346-168820-0_sp1.1 from training. Duration: 0.963625 2023-10-02 19:32:26,219 WARNING [train.py:1204] (2/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_5-55893-0 from training. Duration: 0.55 2023-10-02 19:32:27,609 WARNING [train.py:1204] (2/4) Exclude cut with ID _20455_210_5219_1_1531565996775_8098100_180-24517-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 19:32:28,948 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.523e+02 1.830e+02 2.036e+02 2.251e+02 3.092e+02, threshold=4.071e+02, percent-clipped=0.0 2023-10-02 19:32:30,464 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_3133_210_2087_1_1526106446_2324690_243-143119-0_sp0.9 from training. Duration: 0.8555625 2023-10-02 19:32:33,184 WARNING [train.py:1204] (2/4) Exclude cut with ID _65585_210_12210_1_1535450451584_3764098_293-212045-0_sp1.1 from training. Duration: 0.92725 2023-10-02 19:32:33,188 WARNING [train.py:1204] (2/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_577-332600-0 from training. Duration: 0.57 2023-10-02 19:32:37,277 WARNING [train.py:1204] (2/4) Exclude cut with ID _30039_210_13270_1_1532484612900_2725150_104-289874-0_sp1.1 from training. Duration: 0.963625 2023-10-02 19:32:37,284 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_3172_210_2087_1_1526292929_3987865_197-97762-0_sp1.1 from training. Duration: 0.6818125 2023-10-02 19:32:41,901 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_256-150908-0_sp1.1 from training. Duration: 0.963625 2023-10-02 19:32:43,708 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_296-287678-0_sp1.1 from training. Duration: 0.863625 2023-10-02 19:32:43,740 WARNING [train.py:1204] (2/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_443-30034-0 from training. Duration: 0.64 2023-10-02 19:32:45,014 WARNING [train.py:1204] (2/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_494-199538-0_sp0.9 from training. Duration: 0.8333125 2023-10-02 19:32:45,078 WARNING [train.py:1204] (2/4) Exclude cut with ID _19708_210_9206_1_1532136650733_4238364_61-162325-0_sp1.1 from training. Duration: 0.97275 2023-10-02 19:32:45,165 WARNING [train.py:1204] (2/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_475-127455-0 from training. Duration: 0.7 2023-10-02 19:32:46,547 WARNING [train.py:1204] (2/4) Exclude cut with ID _48677_210_5663_1_1533902405919_3892989_188-11581-0_sp1.1 from training. Duration: 0.963625 2023-10-02 19:32:46,575 WARNING [train.py:1204] (2/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_136-8381-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 19:32:48,424 WARNING [train.py:1204] (2/4) Exclude cut with ID _55328_210_27767_1_1534580973260_4088170_270-229555-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 19:32:49,731 WARNING [train.py:1204] (2/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_372-266951-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 19:32:49,798 WARNING [train.py:1204] (2/4) Exclude cut with ID _47790_210_16187_1_1533862814066_4207519_14-242149-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 19:32:52,475 INFO [train.py:1046] (2/4) Epoch 29, batch 350, loss[loss=0.1459, simple_loss=0.1955, pruned_loss=0.04817, over 19232.00 frames. ], tot_loss[loss=0.1649, simple_loss=0.2432, pruned_loss=0.04334, over 3930869.07 frames. ], batch size: 388, lr: 3.55e-03, grad_scale: 8.0 2023-10-02 19:32:54,038 WARNING [train.py:1204] (2/4) Exclude cut with ID _7877_210_6014_1_1528286358099_3750136_159-76512-0_sp1.1 from training. Duration: 0.936375 2023-10-02 19:32:54,040 WARNING [train.py:1204] (2/4) Exclude cut with ID _8263_210_1664_1_1528438953494_294100_40-204731-0_sp1.1 from training. Duration: 0.4545625 2023-10-02 19:32:56,642 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.3.encoder.layers.1.self_attn1.whiten, num_groups=1, num_channels=512, metric=15.26 vs. limit=22.5 2023-10-02 19:32:57,326 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8278_210_2550_1_1528637099_4220413_694-238413-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 19:33:02,801 WARNING [train.py:1204] (2/4) Exclude cut with ID _47278_210_12041_1_1533902418349_3576110_421-172710-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 19:33:05,492 WARNING [train.py:1204] (2/4) Exclude cut with ID _14970_210_15709_1_1530773543493_1715730_215-282680-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 19:33:06,760 WARNING [train.py:1204] (2/4) Exclude cut with ID _64639_210_5816_1_1535367619437_3528000_306-149449-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 19:33:09,635 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_28-291982-0 from training. Duration: 0.97 2023-10-02 19:33:11,445 WARNING [train.py:1204] (2/4) Exclude cut with ID _55134_210_21982_1_1534590028377_3882170_412-15870-0_sp1.1 from training. Duration: 0.936375 2023-10-02 19:33:11,474 WARNING [train.py:1204] (2/4) Exclude cut with ID _13684_210_7497_1_1530619354863_3598359_312-235606-0 from training. Duration: 0.6 2023-10-02 19:33:11,640 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.conv_module1.balancer2.prob, batch_count=994000.0, ans=0.125 2023-10-02 19:33:11,677 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.conv_module1.balancer2.prob, batch_count=994000.0, ans=0.125 2023-10-02 19:33:14,831 WARNING [train.py:1204] (2/4) Exclude cut with ID _37112_210_2575_1_1532997007673_7268610_347-238993-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 19:33:16,139 WARNING [train.py:1204] (2/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_121-284152-0 from training. Duration: 0.59 2023-10-02 19:33:16,191 WARNING [train.py:1204] (2/4) Exclude cut with ID _2941_210_2577_1_1526199673150_2489759_228-2961-0_sp1.1 from training. Duration: 0.97275 2023-10-02 19:33:19,246 WARNING [train.py:1204] (2/4) Exclude cut with ID _28323_210_16187_1_1532480395667_4100300_531-24157-0 from training. Duration: 0.98 2023-10-02 19:33:21,049 WARNING [train.py:1204] (2/4) Exclude cut with ID _32704_210_8954_1_1533017488840_6747370_266-329755-0_sp0.9 from training. Duration: 0.988875 2023-10-02 19:33:22,509 WARNING [train.py:1204] (2/4) Exclude cut with ID _6653_210_2629_1_1527919172441_6732679_1038-146421-0_sp1.1 from training. Duration: 0.97275 2023-10-02 19:33:22,587 WARNING [train.py:1204] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_391-22148-0_sp0.9 from training. Duration: 0.8555625 2023-10-02 19:33:23,978 WARNING [train.py:1204] (2/4) Exclude cut with ID _64521_210_14699_1_1535243427259_3815328_543-341117-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 19:33:23,993 WARNING [train.py:1204] (2/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_52-168712-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 19:33:25,781 WARNING [train.py:1204] (2/4) Exclude cut with ID _33920_210_4220_1_1533257851500_3084399_198-334063-0_sp1.1 from training. Duration: 0.8545625 2023-10-02 19:33:25,802 WARNING [train.py:1204] (2/4) Exclude cut with ID _50093_210_4220_1_1534417141207_3432299_69-111987-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 19:33:25,850 WARNING [train.py:1204] (2/4) Exclude cut with ID _14821_210_8750_1_1530680503991_6829359_189-297451-0_sp0.9 from training. Duration: 0.611125 2023-10-02 19:33:27,211 WARNING [train.py:1204] (2/4) Exclude cut with ID _8030_210_7497_1_1528444886781_3531089_262-86750-0_sp0.9 from training. Duration: 0.9 2023-10-02 19:33:27,216 WARNING [train.py:1204] (2/4) Exclude cut with ID _24612_210_8913_1_1531998054193_3806359_294-191196-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 19:33:27,494 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.ff3_skip_rate, batch_count=994066.6666666666, ans=0.0 2023-10-02 19:33:28,771 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.balancer1.prob, batch_count=994066.6666666666, ans=0.125 2023-10-02 19:33:30,192 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.balancer2.prob, batch_count=994066.6666666666, ans=0.125 2023-10-02 19:33:32,836 WARNING [train.py:1204] (2/4) Exclude cut with ID _16909_210_5724_1_1531292079912_4118919_707-342896-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 19:33:34,128 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_9460_210_4211_1_1528954587_10049667_572-37035-0_sp0.9 from training. Duration: 0.888875 2023-10-02 19:33:34,365 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.feed_forward1.hidden_balancer.prob, batch_count=994066.6666666666, ans=0.125 2023-10-02 19:33:35,507 WARNING [train.py:1204] (2/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_762-109886-0_sp1.1 from training. Duration: 0.82725 2023-10-02 19:33:35,528 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_14953_210_2151_1_1530701710_5616165_519-326393-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 19:33:41,189 WARNING [train.py:1204] (2/4) Exclude cut with ID _25914_210_6400_1_1532141317058_644740_52-96808-0 from training. Duration: 0.91 2023-10-02 19:33:41,194 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_68095_210_20185_1_1535718226_8888855_1079-94525-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 19:33:45,073 WARNING [train.py:1204] (2/4) Exclude cut with ID _14860_210_4915_1_1530702020340_7343240_746-3647-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 19:33:45,074 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_70379_210_7925_1_1535941455_557455_49-101720-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 19:33:45,095 WARNING [train.py:1204] (2/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_8-307291-0_sp1.1 from training. Duration: 0.936375 2023-10-02 19:33:45,926 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.1.encoder.layers.0.nonlin_attention.whiten1, num_groups=1, num_channels=192, metric=4.54 vs. limit=10.0 2023-10-02 19:33:46,490 WARNING [train.py:1204] (2/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_117-290916-0 from training. Duration: 0.51 2023-10-02 19:33:49,619 WARNING [train.py:1204] (2/4) Exclude cut with ID _22020_210_2461_1_1531724164513_6326900_944-339840-0_sp1.1 from training. Duration: 0.963625 2023-10-02 19:33:50,480 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.0.layers.0.feed_forward3.out_whiten, num_groups=1, num_channels=192, metric=13.10 vs. limit=15.0 2023-10-02 19:33:50,986 WARNING [train.py:1204] (2/4) Exclude cut with ID IC0515W0043-254911 from training. Duration: 0.976 2023-10-02 19:33:52,318 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_3910_210_1794_1_1526742306_334530_28-238909-0 from training. Duration: 0.81 2023-10-02 19:33:52,331 WARNING [train.py:1204] (2/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_269-267275-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 19:33:54,361 WARNING [train.py:1204] (2/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_72-251255-0_sp1.1 from training. Duration: 0.8545625 2023-10-02 19:33:54,372 WARNING [train.py:1204] (2/4) Exclude cut with ID _14886_210_4220_1_1530702020355_3516190_175-229073-0 from training. Duration: 0.45 2023-10-02 19:33:55,934 WARNING [train.py:1204] (2/4) Exclude cut with ID _40519_210_13794_1_1533715228655_7337390_921-209452-0_sp1.1 from training. Duration: 0.963625 2023-10-02 19:33:56,205 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.out_combiner.scale_min, batch_count=994200.0, ans=0.2 2023-10-02 19:33:58,702 WARNING [train.py:1204] (2/4) Exclude cut with ID _29857_210_20034_1_1532395886753_3798160_335-163562-0_sp0.9 from training. Duration: 0.9444375 2023-10-02 19:33:58,790 WARNING [train.py:1204] (2/4) Exclude cut with ID _58583_210_5236_1_1534726835266_3574249_384-58445-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 19:34:00,046 WARNING [train.py:1204] (2/4) Exclude cut with ID _44011_210_2046_1_1533722401691_3631310_96-315656-0_sp1.1 from training. Duration: 0.963625 2023-10-02 19:34:00,053 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_68484_210_1794_1_1535849804_7678097_549-170613-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 19:34:04,024 WARNING [train.py:1204] (2/4) Exclude cut with ID _37892_210_19596_1_1533198677897_3964058_214-148058-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 19:34:06,739 INFO [train.py:1046] (2/4) Epoch 29, batch 400, loss[loss=0.165, simple_loss=0.2481, pruned_loss=0.04091, over 24636.00 frames. ], tot_loss[loss=0.1648, simple_loss=0.243, pruned_loss=0.04332, over 4111992.53 frames. ], batch size: 65, lr: 3.55e-03, grad_scale: 16.0 2023-10-02 19:34:08,192 WARNING [train.py:1204] (2/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_415-78792-0_sp0.9 from training. Duration: 0.9 2023-10-02 19:34:09,020 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.feed_forward1.out_whiten.whitening_limit, batch_count=994266.6666666666, ans=15.0 2023-10-02 19:34:09,615 WARNING [train.py:1204] (2/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_383-205833-0_sp1.1 from training. Duration: 0.863625 2023-10-02 19:34:10,972 WARNING [train.py:1204] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_167-137255-0 from training. Duration: 0.64 2023-10-02 19:34:10,986 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8275_210_4198_1_1528455320_3892571_484-154786-0_sp1.1 from training. Duration: 0.963625 2023-10-02 19:34:12,321 WARNING [train.py:1204] (2/4) Exclude cut with ID _40284_210_2084_1_1533513679814_3704279_396-319412-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 19:34:13,734 WARNING [train.py:1204] (2/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_280-120942-0_sp0.9 from training. Duration: 0.8555625 2023-10-02 19:34:15,605 WARNING [train.py:1204] (2/4) Exclude cut with ID _45630_210_10556_1_1533949174498_3902049_558-240889-0_sp1.1 from training. Duration: 0.92725 2023-10-02 19:34:17,716 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.conv_module2.balancer2.prob, batch_count=994266.6666666666, ans=0.125 2023-10-02 19:34:18,918 WARNING [train.py:1204] (2/4) Exclude cut with ID _56235_210_10640_1_1534557335235_4161099_197-307458-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 19:34:20,770 WARNING [train.py:1204] (2/4) Exclude cut with ID _16909_210_5724_1_1531292079912_4118919_163-254843-0_sp1.1 from training. Duration: 0.92725 2023-10-02 19:34:23,411 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_103-84010-0 from training. Duration: 0.69 2023-10-02 19:34:24,219 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.2.encoder.layers.2.conv_module2.whiten, num_groups=1, num_channels=384, metric=10.67 vs. limit=15.0 2023-10-02 19:34:24,951 WARNING [train.py:1204] (2/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_401-158239-0 from training. Duration: 0.71 2023-10-02 19:34:24,951 WARNING [train.py:1204] (2/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_899-233658-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 19:34:26,452 WARNING [train.py:1204] (2/4) Exclude cut with ID _23825_210_13449_1_1531915313526_3497150_240-271367-0 from training. Duration: 0.97 2023-10-02 19:34:27,746 WARNING [train.py:1204] (2/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_424-134619-0_sp1.1 from training. Duration: 0.92725 2023-10-02 19:34:32,244 WARNING [train.py:1204] (2/4) Exclude cut with ID _44831_210_13427_1_1533816011549_3689035_32-188147-0_sp1.1 from training. Duration: 0.87275 2023-10-02 19:34:32,248 WARNING [train.py:1204] (2/4) Exclude cut with ID _59170_210_6721_1_1534983633960_4203335_314-63879-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 19:34:32,263 WARNING [train.py:1204] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_602-275260-0 from training. Duration: 0.82 2023-10-02 19:34:32,293 WARNING [train.py:1204] (2/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_511-306767-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 19:34:32,335 WARNING [train.py:1204] (2/4) Exclude cut with ID _41817_210_18835_1_1533434477160_7423049_565-201775-0_sp1.1 from training. Duration: 0.92725 2023-10-02 19:34:32,349 WARNING [train.py:1204] (2/4) Exclude cut with ID _28317_210_12742_1_1532480302381_8510325_778-173151-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 19:34:33,702 WARNING [train.py:1204] (2/4) Exclude cut with ID _14998_210_5219_1_1530705582080_7474970_680-51001-0_sp1.1 from training. Duration: 0.963625 2023-10-02 19:34:33,854 WARNING [train.py:1204] (2/4) Exclude cut with ID IC0677W0059-334891 from training. Duration: 0.896 2023-10-02 19:34:35,128 WARNING [train.py:1204] (2/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_370-331600-0 from training. Duration: 0.94 2023-10-02 19:34:35,435 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.bypass.scale_min, batch_count=994400.0, ans=0.2 2023-10-02 19:34:39,398 WARNING [train.py:1204] (2/4) Exclude cut with ID _66840_210_5219_1_1535452168426_4196349_446-240192-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 19:34:40,629 WARNING [train.py:1204] (2/4) Exclude cut with ID _65242_210_18628_1_1535369393717_3823149_38-6732-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 19:34:40,685 WARNING [train.py:1204] (2/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_302-2550-0 from training. Duration: 0.81 2023-10-02 19:34:41,964 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_43802_210_2290_1_1533516203_8829504_100-348441-0 from training. Duration: 0.91 2023-10-02 19:34:42,104 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=994400.0, ans=0.1 2023-10-02 19:34:44,679 WARNING [train.py:1204] (2/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_329-285801-0_sp1.1 from training. Duration: 0.82725 2023-10-02 19:34:47,990 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_30167_210_3528_1_1532428012_4772375_343-334712-0_sp1.1 from training. Duration: 0.936375 2023-10-02 19:34:52,837 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7543_210_6753_1_1528543318_4552_1-85072-0 from training. Duration: 0.83 2023-10-02 19:34:55,597 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7002_210_1794_1_1527935634_4034444_487-236501-0_sp1.1 from training. Duration: 0.6 2023-10-02 19:34:57,565 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.577e+02 1.820e+02 1.968e+02 2.221e+02 3.877e+02, threshold=3.936e+02, percent-clipped=0.0 2023-10-02 19:34:57,644 WARNING [train.py:1204] (2/4) Exclude cut with ID _7160_210_1876_1_1527940923231_3703799_282-322611-0 from training. Duration: 0.87 2023-10-02 19:34:59,059 WARNING [train.py:1204] (2/4) Exclude cut with ID _67335_210_5219_1_1535525975652_4275229_570-262983-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 19:34:59,174 WARNING [train.py:1204] (2/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_625-338546-0_sp1.1 from training. Duration: 0.8 2023-10-02 19:34:59,196 WARNING [train.py:1204] (2/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_176-56120-0 from training. Duration: 0.83 2023-10-02 19:35:01,941 WARNING [train.py:1204] (2/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_963-187109-0_sp1.1 from training. Duration: 0.9 2023-10-02 19:35:03,292 WARNING [train.py:1204] (2/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_121-284152-0_sp0.9 from training. Duration: 0.6555625 2023-10-02 19:35:04,609 WARNING [train.py:1204] (2/4) Exclude cut with ID _45021_210_9994_1_1533619751094_3330288_104-81711-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 19:35:04,892 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.balancer2.prob, batch_count=994533.3333333334, ans=0.125 2023-10-02 19:35:07,325 WARNING [train.py:1204] (2/4) Exclude cut with ID _51382_210_23862_1_1534492801838_3650104_129-343914-0_sp1.1 from training. Duration: 0.936375 2023-10-02 19:35:07,373 WARNING [train.py:1204] (2/4) Exclude cut with ID _14970_210_15709_1_1530775547361_1966965_8-169924-0 from training. Duration: 0.92 2023-10-02 19:35:10,132 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_2295_210_2087_1_1525500993_2407863_234-83104-0_sp1.1 from training. Duration: 0.563625 2023-10-02 19:35:11,306 WARNING [train.py:1204] (2/4) Exclude cut with ID _14687_210_5854_1_1530703838259_3664889_115-239046-0 from training. Duration: 0.67 2023-10-02 19:35:14,386 WARNING [train.py:1204] (2/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_690-126306-0_sp1.1 from training. Duration: 0.7090625 2023-10-02 19:35:14,393 WARNING [train.py:1204] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_625-102955-0_sp0.9 from training. Duration: 0.8555625 2023-10-02 19:35:17,679 WARNING [train.py:1204] (2/4) Exclude cut with ID _9389_210_1664_1_1529217337301_5289269_289-213417-0 from training. Duration: 0.89 2023-10-02 19:35:19,251 WARNING [train.py:1204] (2/4) Exclude cut with ID _13253_210_1925_1_1530757775819_3577873_191-144977-0_sp0.9 from training. Duration: 0.8666875 2023-10-02 19:35:20,552 INFO [train.py:1046] (2/4) Epoch 29, batch 450, loss[loss=0.1668, simple_loss=0.2565, pruned_loss=0.03851, over 24294.00 frames. ], tot_loss[loss=0.1648, simple_loss=0.2433, pruned_loss=0.04317, over 4247341.42 frames. ], batch size: 74, lr: 3.55e-03, grad_scale: 16.0 2023-10-02 19:35:20,617 WARNING [train.py:1204] (2/4) Exclude cut with ID _21783_210_11357_1_1531742458730_3762679_325-155703-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 19:35:20,652 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_2295_210_2087_1_1525500993_2407863_116-238894-0_sp0.9 from training. Duration: 0.77775 2023-10-02 19:35:22,006 WARNING [train.py:1204] (2/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_94-126128-0 from training. Duration: 0.86 2023-10-02 19:35:22,030 WARNING [train.py:1204] (2/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_395-310220-0_sp0.9 from training. Duration: 0.988875 2023-10-02 19:35:23,407 WARNING [train.py:1204] (2/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_821-110852-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 19:35:23,472 WARNING [train.py:1204] (2/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_64-248359-0_sp1.1 from training. Duration: 0.62725 2023-10-02 19:35:23,483 WARNING [train.py:1204] (2/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_138-187031-0 from training. Duration: 0.99 2023-10-02 19:35:24,687 WARNING [train.py:1204] (2/4) Exclude cut with ID _4122_210_2084_1_1526713164417_4353280_528-206771-0_sp0.9 from training. Duration: 0.97775 2023-10-02 19:35:24,776 WARNING [train.py:1204] (2/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_844-96989-0_sp1.1 from training. Duration: 0.6454375 2023-10-02 19:35:26,794 WARNING [train.py:1204] (2/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_106-30782-0_sp1.1 from training. Duration: 0.72725 2023-10-02 19:35:33,696 WARNING [train.py:1204] (2/4) Exclude cut with ID _14683_210_5219_1_1530698352608_3974859_281-250040-0_sp1.1 from training. Duration: 0.936375 2023-10-02 19:35:33,732 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_46013_210_7234_1_1533776362_3744032_463-52378-0_sp0.9 from training. Duration: 0.9555625 2023-10-02 19:35:35,260 WARNING [train.py:1204] (2/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_299-34413-0 from training. Duration: 0.65 2023-10-02 19:35:35,318 WARNING [train.py:1204] (2/4) Exclude cut with ID _35799_210_5854_1_1533263487887_3529071_490-326847-0 from training. Duration: 0.99 2023-10-02 19:35:35,538 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.feed_forward2.hidden_balancer.prob, batch_count=994666.6666666666, ans=0.125 2023-10-02 19:35:39,755 WARNING [train.py:1204] (2/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_589-9070-0_sp0.9 from training. Duration: 0.788875 2023-10-02 19:35:42,429 WARNING [train.py:1204] (2/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_884-29515-0_sp1.1 from training. Duration: 0.936375 2023-10-02 19:35:43,911 WARNING [train.py:1204] (2/4) Exclude cut with ID _15012_210_1968_1_1530856820429_5095350_474-119995-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 19:35:48,625 WARNING [train.py:1204] (2/4) Exclude cut with ID _65585_210_12210_1_1535450451584_3764098_211-97124-0_sp1.1 from training. Duration: 0.963625 2023-10-02 19:35:48,689 WARNING [train.py:1204] (2/4) Exclude cut with ID _33640_210_2655_1_1533117605479_4855469_666-224463-0_sp1.1 from training. Duration: 0.963625 2023-10-02 19:35:51,420 WARNING [train.py:1204] (2/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_35-153886-0 from training. Duration: 0.84 2023-10-02 19:35:51,481 WARNING [train.py:1204] (2/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_210-106047-0 from training. Duration: 0.97 2023-10-02 19:35:51,592 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.1.conv_skip_rate, batch_count=994733.3333333334, ans=0.0 2023-10-02 19:35:54,663 WARNING [train.py:1204] (2/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_479-125611-0 from training. Duration: 0.96 2023-10-02 19:35:55,918 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_9417_210_1794_1_1529235261_4614131_427-153963-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 19:35:57,345 WARNING [train.py:1204] (2/4) Exclude cut with ID _23818_210_6228_1_1531917063884_3676920_133-170571-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 19:35:57,413 WARNING [train.py:1204] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_602-275260-0_sp1.1 from training. Duration: 0.7454375 2023-10-02 19:35:58,885 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9029W0121-509521 from training. Duration: 0.896 2023-10-02 19:35:58,893 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9029W0434-509821 from training. Duration: 0.64 2023-10-02 19:36:00,173 WARNING [train.py:1204] (2/4) Exclude cut with ID _23038_210_9570_1_1532001584158_3652419_289-307034-0_sp1.1 from training. Duration: 0.936375 2023-10-02 19:36:00,419 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.balancer2.prob, batch_count=994733.3333333334, ans=0.125 2023-10-02 19:36:01,537 WARNING [train.py:1204] (2/4) Exclude cut with ID _29345_210_4381_1_1532689168263_3860169_149-257268-0_sp0.9 from training. Duration: 0.9 2023-10-02 19:36:02,899 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_405-339986-0_sp1.1 from training. Duration: 0.463625 2023-10-02 19:36:03,683 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.2.encoder.layers.1.feed_forward1.out_whiten, num_groups=1, num_channels=384, metric=8.49 vs. limit=15.0 2023-10-02 19:36:05,720 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_2655_210_2087_1_1525688959_3897400_437-337690-0_sp1.1 from training. Duration: 0.57275 2023-10-02 19:36:07,002 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7543_210_6753_1_1528541907_1399322_165-323619-0_sp1.1 from training. Duration: 0.62725 2023-10-02 19:36:07,045 WARNING [train.py:1204] (2/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_508-335369-0_sp1.1 from training. Duration: 0.436375 2023-10-02 19:36:07,114 WARNING [train.py:1204] (2/4) Exclude cut with ID _7852_210_5219_1_1528286471316_8139400_358-287492-0 from training. Duration: 0.96 2023-10-02 19:36:09,888 WARNING [train.py:1204] (2/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_322-217220-0_sp0.9 from training. Duration: 0.9555625 2023-10-02 19:36:11,722 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_3171_210_2087_1_1526109084_2330007_66-260583-0_sp0.9 from training. Duration: 0.8 2023-10-02 19:36:13,047 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8558_210_1410_1_1528505225_7613406_131-329524-0_sp1.1 from training. Duration: 0.7181875 2023-10-02 19:36:13,171 WARNING [train.py:1204] (2/4) Exclude cut with ID _9235_210_5219_1_1528891154253_8046120_30-326681-0 from training. Duration: 0.83 2023-10-02 19:36:16,443 WARNING [train.py:1204] (2/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_58-68956-0_sp0.9 from training. Duration: 0.92225 2023-10-02 19:36:17,808 WARNING [train.py:1204] (2/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_255-312654-0 from training. Duration: 0.91 2023-10-02 19:36:19,160 WARNING [train.py:1204] (2/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_99-167392-0 from training. Duration: 0.61 2023-10-02 19:36:19,260 WARNING [train.py:1204] (2/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_94-126128-0_sp0.9 from training. Duration: 0.9555625 2023-10-02 19:36:25,685 WARNING [train.py:1204] (2/4) Exclude cut with ID _68635_210_9317_1_1535770813410_4315870_187-192817-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 19:36:28,214 WARNING [train.py:1204] (2/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_277-204970-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 19:36:29,621 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6163_210_6400_1_1527506746_4263068_283-137811-0_sp0.9 from training. Duration: 0.8555625 2023-10-02 19:36:29,649 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9144W0121-566446 from training. Duration: 0.896 2023-10-02 19:36:32,495 WARNING [train.py:1204] (2/4) Exclude cut with ID _49371_210_12071_1_1533952761021_3812190_351-315519-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 19:36:33,975 INFO [train.py:1046] (2/4) Epoch 29, batch 500, loss[loss=0.1683, simple_loss=0.256, pruned_loss=0.04031, over 24419.00 frames. ], tot_loss[loss=0.1655, simple_loss=0.2442, pruned_loss=0.04343, over 4356644.50 frames. ], batch size: 69, lr: 3.55e-03, grad_scale: 16.0 2023-10-02 19:36:34,030 WARNING [train.py:1204] (2/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_115-326778-0_sp1.1 from training. Duration: 0.8454375 2023-10-02 19:36:34,101 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_17292_210_6753_1_1531125388_2943734_436-198965-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 19:36:34,111 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9165W0117-576751 from training. Duration: 0.896 2023-10-02 19:36:36,865 WARNING [train.py:1204] (2/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_212-149947-0 from training. Duration: 0.73 2023-10-02 19:36:36,885 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_13757_210_3528_1_1530440682_4107239_65-167544-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 19:36:39,674 WARNING [train.py:1204] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_700-183549-0_sp0.9 from training. Duration: 0.7555625 2023-10-02 19:36:42,973 WARNING [train.py:1204] (2/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_216-343007-0_sp0.9 from training. Duration: 0.7333125 2023-10-02 19:36:44,318 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7544_210_6753_1_1528587722_3576686_295-258479-0_sp1.1 from training. Duration: 0.763625 2023-10-02 19:36:47,061 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_365-287106-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 19:36:47,073 WARNING [train.py:1204] (2/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_300-3747-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 19:36:47,135 WARNING [train.py:1204] (2/4) Exclude cut with ID _14069_210_5632_1_1530512692033_4803390_487-303201-0_sp1.1 from training. Duration: 0.92725 2023-10-02 19:36:57,114 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.balancer1.prob, batch_count=995000.0, ans=0.125 2023-10-02 19:36:59,767 WARNING [train.py:1204] (2/4) Exclude cut with ID _14459_210_2228_1_1531443641946_7516700_668-123026-0_sp1.1 from training. Duration: 0.936375 2023-10-02 19:36:59,787 WARNING [train.py:1204] (2/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_108-53929-0_sp1.1 from training. Duration: 0.536375 2023-10-02 19:36:59,824 WARNING [train.py:1204] (2/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_266-137104-0_sp0.9 from training. Duration: 0.72225 2023-10-02 19:36:59,859 WARNING [train.py:1204] (2/4) Exclude cut with ID _12302_210_5219_1_1529924446466_8107280_902-168619-0_sp1.1 from training. Duration: 0.936375 2023-10-02 19:36:59,884 WARNING [train.py:1204] (2/4) Exclude cut with ID _59502_210_12210_1_1535107024698_1835139_146-162495-0 from training. Duration: 0.92 2023-10-02 19:37:01,191 WARNING [train.py:1204] (2/4) Exclude cut with ID _3465_210_3525_1_1526175667060_3574049_412-287526-0_sp1.1 from training. Duration: 0.6545625 2023-10-02 19:37:01,470 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.bypass_mid.scale_min, batch_count=995000.0, ans=0.2 2023-10-02 19:37:03,903 WARNING [train.py:1204] (2/4) Exclude cut with ID _7160_210_1876_1_1527940923231_3703799_120-150943-0_sp0.9 from training. Duration: 0.9 2023-10-02 19:37:03,965 WARNING [train.py:1204] (2/4) Exclude cut with ID _15037_210_2253_1_1530752294104_3267650_296-192597-0_sp1.1 from training. Duration: 0.736375 2023-10-02 19:37:03,984 WARNING [train.py:1204] (2/4) Exclude cut with ID _13252_210_1925_1_1530671372047_3336909_397-216497-0_sp1.1 from training. Duration: 0.87275 2023-10-02 19:37:04,002 WARNING [train.py:1204] (2/4) Exclude cut with ID _40094_210_16380_1_1533294005773_3641159_76-269320-0_sp1.1 from training. Duration: 0.936375 2023-10-02 19:37:05,336 WARNING [train.py:1204] (2/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_449-15508-0 from training. Duration: 0.82 2023-10-02 19:37:08,183 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9309W0194-640321 from training. Duration: 0.64 2023-10-02 19:37:10,908 WARNING [train.py:1204] (2/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_284-312178-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 19:37:12,832 WARNING [train.py:1204] (2/4) Exclude cut with ID _29853_210_7497_1_1532505600851_6642110_401-55937-0_sp1.1 from training. Duration: 0.92725 2023-10-02 19:37:14,196 WARNING [train.py:1204] (2/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_716-24918-0_sp1.1 from training. Duration: 0.92725 2023-10-02 19:37:14,214 WARNING [train.py:1204] (2/4) Exclude cut with ID _19637_210_4220_1_1531537167676_3205060_314-305638-0_sp1.1 from training. Duration: 0.92725 2023-10-02 19:37:15,504 WARNING [train.py:1204] (2/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_475-127455-0_sp1.1 from training. Duration: 0.636375 2023-10-02 19:37:16,972 WARNING [train.py:1204] (2/4) Exclude cut with ID _5444_210_2151_1_1527418808464_5312210_581-315411-0 from training. Duration: 0.62 2023-10-02 19:37:19,095 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.1.encoder.layers.0.feed_forward1.out_whiten, num_groups=1, num_channels=256, metric=5.45 vs. limit=15.0 2023-10-02 19:37:19,865 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.balancer1.prob, batch_count=995133.3333333334, ans=0.125 2023-10-02 19:37:20,953 WARNING [train.py:1204] (2/4) Exclude cut with ID _44902_210_16187_1_1533888006062_3792078_149-67212-0_sp0.9 from training. Duration: 0.9666875 2023-10-02 19:37:22,830 WARNING [train.py:1204] (2/4) Exclude cut with ID _31602_210_5854_1_1532677021456_4458250_63-63253-0_sp1.1 from training. Duration: 0.963625 2023-10-02 19:37:24,156 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.616e+02 1.819e+02 1.997e+02 2.314e+02 3.134e+02, threshold=3.994e+02, percent-clipped=0.0 2023-10-02 19:37:25,686 WARNING [train.py:1204] (2/4) Exclude cut with ID _55209_210_1531_1_1534675828996_4597160_660-203992-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 19:37:28,215 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.3.encoder.layers.3.feed_forward3.out_whiten, num_groups=1, num_channels=512, metric=12.78 vs. limit=15.0 2023-10-02 19:37:28,977 WARNING [train.py:1204] (2/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_83-190624-0_sp1.1 from training. Duration: 0.92725 2023-10-02 19:37:33,083 WARNING [train.py:1204] (2/4) Exclude cut with ID _58605_210_12210_1_1534723195939_3904259_154-139635-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 19:37:37,166 WARNING [train.py:1204] (2/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_164-224052-0 from training. Duration: 0.7 2023-10-02 19:37:37,175 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_1126-189071-0_sp1.1 from training. Duration: 0.963625 2023-10-02 19:37:37,193 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_15396_210_6890_1_1531308473_1938936_310-214789-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 19:37:38,825 INFO [scaling.py:1118] (2/4) WithLoss: name=encoder.encoders.2.encoder.layers.1.self_attn_weights, loss-sum=0.000e+00 2023-10-02 19:37:40,016 WARNING [train.py:1204] (2/4) Exclude cut with ID _8395_210_1531_1_1528597871523_4411579_431-132470-0 from training. Duration: 0.74 2023-10-02 19:37:41,388 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_3780_210_2087_1_1526467760_2130483_170-259044-0_sp0.9 from training. Duration: 0.77775 2023-10-02 19:37:42,838 WARNING [train.py:1204] (2/4) Exclude cut with ID _12733_210_8341_1_1530167424556_3646380_537-230640-0_sp1.1 from training. Duration: 0.963625 2023-10-02 19:37:47,407 INFO [train.py:1046] (2/4) Epoch 29, batch 550, loss[loss=0.1641, simple_loss=0.2402, pruned_loss=0.04405, over 23405.00 frames. ], tot_loss[loss=0.1654, simple_loss=0.2446, pruned_loss=0.04311, over 4456923.85 frames. ], batch size: 119, lr: 3.55e-03, grad_scale: 8.0 2023-10-02 19:37:47,551 WARNING [train.py:1204] (2/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_159-281310-0 from training. Duration: 0.95 2023-10-02 19:37:48,962 WARNING [train.py:1204] (2/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_434-83000-0 from training. Duration: 0.98 2023-10-02 19:37:48,982 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_4405_210_1849_1_1526732205_3593939_493-32464-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 19:37:48,991 WARNING [train.py:1204] (2/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_342-100157-0 from training. Duration: 0.54 2023-10-02 19:37:50,691 WARNING [train.py:1204] (2/4) Exclude cut with ID _44778_210_2054_1_1533798127203_7443090_765-250133-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 19:37:50,696 WARNING [train.py:1204] (2/4) Exclude cut with ID _38670_210_15379_1_1533448810659_4391059_81-312245-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 19:37:50,783 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_50050_210_16223_1_1535435796_7290138_1273-247611-0_sp1.1 from training. Duration: 0.936375 2023-10-02 19:37:52,153 WARNING [train.py:1204] (2/4) Exclude cut with ID _44831_210_13427_1_1533816011549_3689035_399-18591-0_sp1.1 from training. Duration: 0.936375 2023-10-02 19:37:52,171 WARNING [train.py:1204] (2/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_557-324258-0_sp0.9 from training. Duration: 0.9 2023-10-02 19:37:52,245 WARNING [train.py:1204] (2/4) Exclude cut with ID _3465_210_3525_1_1526175667060_3574049_62-335552-0_sp1.1 from training. Duration: 0.7909375 2023-10-02 19:37:55,533 WARNING [train.py:1204] (2/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_673-264125-0_sp1.1 from training. Duration: 0.963625 2023-10-02 19:37:56,831 WARNING [train.py:1204] (2/4) Exclude cut with ID _8030_210_7497_1_1528444886781_3531089_262-86750-0 from training. Duration: 0.81 2023-10-02 19:37:56,854 WARNING [train.py:1204] (2/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_444-28041-0_sp1.1 from training. Duration: 0.77275 2023-10-02 19:38:01,327 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_34008_210_10431_1_1533345424_8183247_516-14858-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 19:38:01,341 WARNING [train.py:1204] (2/4) Exclude cut with ID _32704_210_8954_1_1533017488840_6747370_54-248218-0_sp1.1 from training. Duration: 0.936375 2023-10-02 19:38:05,272 WARNING [train.py:1204] (2/4) Exclude cut with ID _13253_210_1925_1_1530757775819_3577873_300-119782-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 19:38:05,350 WARNING [train.py:1204] (2/4) Exclude cut with ID _39290_210_4220_1_1533862778760_3075118_21-240703-0_sp1.1 from training. Duration: 0.936375 2023-10-02 19:38:09,592 WARNING [train.py:1204] (2/4) Exclude cut with ID 774-127930-0014-10412-0_sp1.1 from training. Duration: 0.95 2023-10-02 19:38:10,904 WARNING [train.py:1204] (2/4) Exclude cut with ID _67499_210_17282_1_1535509805220_3692430_20-179346-0 from training. Duration: 0.97 2023-10-02 19:38:12,269 WARNING [train.py:1204] (2/4) Exclude cut with ID _8365_210_2064_1_1528592311070_4677690_431-294102-0_sp0.9 from training. Duration: 0.988875 2023-10-02 19:38:13,014 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.2.encoder.layers.0.feed_forward2.out_whiten, num_groups=1, num_channels=384, metric=5.46 vs. limit=15.0 2023-10-02 19:38:18,500 WARNING [train.py:1204] (2/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_365-1492-0_sp1.1 from training. Duration: 0.82725 2023-10-02 19:38:18,535 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_105-114797-0_sp0.9 from training. Duration: 0.9333125 2023-10-02 19:38:19,946 WARNING [train.py:1204] (2/4) Exclude cut with ID _6274_210_2084_1_1527937583837_7494020_769-168302-0_sp0.9 from training. Duration: 0.788875 2023-10-02 19:38:23,403 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_40340_210_9024_1_1533433916_4132944_320-270277-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 19:38:23,408 WARNING [train.py:1204] (2/4) Exclude cut with ID ID0203W0003-781711 from training. Duration: 0.958 2023-10-02 19:38:24,656 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_30434_210_13049_1_1532519384_981746_52-12932-0_sp1.1 from training. Duration: 0.936375 2023-10-02 19:38:25,993 WARNING [train.py:1204] (2/4) Exclude cut with ID _8263_210_1664_1_1528438953494_294100_40-204731-0_sp0.9 from training. Duration: 0.5555625 2023-10-02 19:38:28,922 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1032-236863-0_sp0.9 from training. Duration: 0.9333125 2023-10-02 19:38:30,230 WARNING [train.py:1204] (2/4) Exclude cut with ID _8394_210_6702_1_1528590504316_3540189_335-274780-0_sp0.9 from training. Duration: 0.8666875 2023-10-02 19:38:30,244 WARNING [train.py:1204] (2/4) Exclude cut with ID _66873_210_5517_1_1535454136506_4293370_654-52713-0_sp1.1 from training. Duration: 0.7 2023-10-02 19:38:31,988 WARNING [train.py:1204] (2/4) Exclude cut with ID _5250_210_5219_1_1527249561806_8692110_742-267856-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 19:38:33,340 WARNING [train.py:1204] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_74-48717-0 from training. Duration: 0.58 2023-10-02 19:38:33,431 WARNING [train.py:1204] (2/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_18-46756-0 from training. Duration: 0.79 2023-10-02 19:38:34,768 WARNING [train.py:1204] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_612-56061-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 19:38:34,785 WARNING [train.py:1204] (2/4) Exclude cut with ID _56423_210_5854_1_1534896061049_3165969_412-2403-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 19:38:34,833 WARNING [train.py:1204] (2/4) Exclude cut with ID _62574_210_2655_1_1535180407058_3933249_120-196848-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 19:38:34,834 WARNING [train.py:1204] (2/4) Exclude cut with ID _40519_210_13794_1_1533715228655_7337390_916-51000-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 19:38:37,674 WARNING [train.py:1204] (2/4) Exclude cut with ID _65263_210_22356_1_1535763598691_3921037_108-21902-0_sp1.1 from training. Duration: 0.9 2023-10-02 19:38:39,005 WARNING [train.py:1204] (2/4) Exclude cut with ID _4122_210_2084_1_1526713164417_4353280_528-206771-0_sp1.1 from training. Duration: 0.8 2023-10-02 19:38:40,427 WARNING [train.py:1204] (2/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_43-325657-0_sp1.1 from training. Duration: 0.92725 2023-10-02 19:38:41,688 WARNING [train.py:1204] (2/4) Exclude cut with ID _44659_210_2721_1_1534036120061_6614860_314-82041-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 19:38:41,743 WARNING [train.py:1204] (2/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_81-300212-0_sp0.9 from training. Duration: 0.6666875 2023-10-02 19:38:43,212 WARNING [train.py:1204] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_665-196155-0_sp0.9 from training. Duration: 0.7444375 2023-10-02 19:38:45,248 WARNING [train.py:1204] (2/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_140-336881-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 19:38:46,649 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6290_210_3273_1_1527595224_1282787_115-87095-0_sp0.9 from training. Duration: 0.611125 2023-10-02 19:38:46,705 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_53373_210_5399_1_1534229471_4715695_607-259685-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 19:38:46,892 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.ff2_skip_rate, batch_count=995533.3333333334, ans=0.0 2023-10-02 19:38:48,083 WARNING [train.py:1204] (2/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_175-163894-0_sp1.1 from training. Duration: 0.67275 2023-10-02 19:38:48,099 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7002_210_1794_1_1527939683_5157286_1-281264-0_sp1.1 from training. Duration: 0.4 2023-10-02 19:38:53,622 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.feed_forward1.out_proj.dropout_p, batch_count=995533.3333333334, ans=0.1 2023-10-02 19:38:54,658 WARNING [train.py:1204] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_605-257205-0 from training. Duration: 0.59 2023-10-02 19:38:57,314 WARNING [train.py:1204] (2/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_454-314901-0 from training. Duration: 0.46 2023-10-02 19:38:57,408 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_46032_210_14656_1_1533781412_4507969_287-130422-0_sp1.1 from training. Duration: 0.97275 2023-10-02 19:38:57,425 WARNING [train.py:1204] (2/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_186-309161-0_sp1.1 from training. Duration: 0.4909375 2023-10-02 19:38:58,917 WARNING [train.py:1204] (2/4) Exclude cut with ID _42659_210_2070_1_1533607442765_3120380_45-342307-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 19:39:02,064 INFO [train.py:1046] (2/4) Epoch 29, batch 600, loss[loss=0.165, simple_loss=0.2489, pruned_loss=0.0405, over 24639.00 frames. ], tot_loss[loss=0.1661, simple_loss=0.2451, pruned_loss=0.0436, over 4516194.18 frames. ], batch size: 65, lr: 3.55e-03, grad_scale: 8.0 2023-10-02 19:39:07,660 WARNING [train.py:1204] (2/4) Exclude cut with ID _12555_210_8750_1_1530075594402_3780239_478-326818-0_sp1.1 from training. Duration: 0.963625 2023-10-02 19:39:09,059 WARNING [train.py:1204] (2/4) Exclude cut with ID _7606_210_4915_1_1528894253277_5080690_127-177761-0_sp1.1 from training. Duration: 0.5818125 2023-10-02 19:39:10,511 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6788_210_1388_1_1527898698_4562021_91-197981-0 from training. Duration: 0.83 2023-10-02 19:39:13,338 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8558_210_1410_1_1528505225_7613406_131-329524-0_sp0.9 from training. Duration: 0.87775 2023-10-02 19:39:13,459 WARNING [train.py:1204] (2/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_153-153537-0_sp1.1 from training. Duration: 0.82725 2023-10-02 19:39:14,803 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_17292_210_6753_1_1531125388_2943734_285-349105-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 19:39:16,818 WARNING [train.py:1204] (2/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_37-41919-0 from training. Duration: 0.87 2023-10-02 19:39:18,072 WARNING [train.py:1204] (2/4) Exclude cut with ID _60860_210_8254_1_1534896015875_3557169_172-294852-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 19:39:24,526 WARNING [train.py:1204] (2/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_146-276944-0 from training. Duration: 0.83 2023-10-02 19:39:27,355 WARNING [train.py:1204] (2/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_1254-44082-0_sp1.1 from training. Duration: 0.936375 2023-10-02 19:39:27,367 WARNING [train.py:1204] (2/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_1115-180350-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 19:39:27,397 WARNING [train.py:1204] (2/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_60-146189-0_sp1.1 from training. Duration: 0.8909375 2023-10-02 19:39:36,271 WARNING [train.py:1204] (2/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_517-153649-0_sp1.1 from training. Duration: 0.92725 2023-10-02 19:39:36,283 WARNING [train.py:1204] (2/4) Exclude cut with ID _39571_210_13449_1_1533801584143_3651077_37-266257-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 19:39:36,328 WARNING [train.py:1204] (2/4) Exclude cut with ID _6653_210_2629_1_1527919172441_6732679_527-276118-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 19:39:44,410 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7696_210_2040_1_1528207167_3716099_117-125434-0_sp1.1 from training. Duration: 0.6909375 2023-10-02 19:39:47,926 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_25767_210_2161_1_1532307483_3867748_478-156360-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 19:39:47,930 WARNING [train.py:1204] (2/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_177-49092-0_sp1.1 from training. Duration: 0.82725 2023-10-02 19:39:49,149 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_11305_210_1794_1_1529750428_7178148_680-317073-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 19:39:53,223 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.471e+02 1.812e+02 1.989e+02 2.203e+02 3.587e+02, threshold=3.977e+02, percent-clipped=0.0 2023-10-02 19:39:55,373 WARNING [train.py:1204] (2/4) Exclude cut with ID _53565_210_10635_1_1534337218335_3820259_287-18120-0 from training. Duration: 0.97 2023-10-02 19:39:57,265 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.balancer2.prob, batch_count=995800.0, ans=0.125 2023-10-02 19:40:02,318 WARNING [train.py:1204] (2/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_498-40153-0_sp1.1 from training. Duration: 0.57275 2023-10-02 19:40:02,336 WARNING [train.py:1204] (2/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_555-63066-0_sp1.1 from training. Duration: 0.963625 2023-10-02 19:40:05,238 WARNING [train.py:1204] (2/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_395-310220-0 from training. Duration: 0.89 2023-10-02 19:40:06,581 WARNING [train.py:1204] (2/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_937-208281-0_sp1.1 from training. Duration: 0.77275 2023-10-02 19:40:09,882 WARNING [train.py:1204] (2/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_120-211630-0 from training. Duration: 0.92 2023-10-02 19:40:09,914 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_454-276429-0_sp1.1 from training. Duration: 0.7909375 2023-10-02 19:40:09,954 WARNING [train.py:1204] (2/4) Exclude cut with ID _22826_210_9994_1_1531908201522_3279169_404-75889-0_sp1.1 from training. Duration: 0.8090625 2023-10-02 19:40:15,413 INFO [train.py:1046] (2/4) Epoch 29, batch 650, loss[loss=0.1706, simple_loss=0.2519, pruned_loss=0.04468, over 24457.00 frames. ], tot_loss[loss=0.1659, simple_loss=0.2443, pruned_loss=0.04375, over 4559017.78 frames. ], batch size: 63, lr: 3.55e-03, grad_scale: 8.0 2023-10-02 19:40:15,512 WARNING [train.py:1204] (2/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_282-136641-0_sp0.9 from training. Duration: 0.4666875 2023-10-02 19:40:16,865 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_809-17156-0_sp1.1 from training. Duration: 0.663625 2023-10-02 19:40:20,073 WARNING [train.py:1204] (2/4) Exclude cut with ID _9235_210_5219_1_1528891154253_8046120_1003-101638-0_sp0.9 from training. Duration: 0.811125 2023-10-02 19:40:20,182 WARNING [train.py:1204] (2/4) Exclude cut with ID _22826_210_9994_1_1531908201522_3279169_404-75889-0_sp0.9 from training. Duration: 0.988875 2023-10-02 19:40:22,939 WARNING [train.py:1204] (2/4) Exclude cut with ID _59288_210_5854_1_1535077757774_3412246_570-295255-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 19:40:26,184 WARNING [train.py:1204] (2/4) Exclude cut with ID _3465_210_3525_1_1526175667060_3574049_412-287526-0 from training. Duration: 0.72 2023-10-02 19:40:27,951 WARNING [train.py:1204] (2/4) Exclude cut with ID _44010_210_4525_1_1533884262964_4013620_649-79655-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 19:40:33,765 WARNING [train.py:1204] (2/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_57-270037-0_sp0.9 from training. Duration: 0.8555625 2023-10-02 19:40:33,766 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_34815_210_6388_1_1533381320_3571903_302-255963-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 19:40:36,645 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_47449_210_5399_1_1534076445_4006929_573-301852-0_sp1.1 from training. Duration: 0.97275 2023-10-02 19:40:39,634 WARNING [train.py:1204] (2/4) Exclude cut with ID 6709-74022-0004-86860-0_sp1.1 from training. Duration: 0.9409375 2023-10-02 19:40:42,783 WARNING [train.py:1204] (2/4) Exclude cut with ID _40519_210_13794_1_1533715228655_7337390_111-71991-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 19:40:44,081 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_30167_210_3528_1_1532428012_4772375_169-214186-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 19:40:45,720 WARNING [train.py:1204] (2/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_20-235313-0_sp1.1 from training. Duration: 0.963625 2023-10-02 19:40:45,756 WARNING [train.py:1204] (2/4) Exclude cut with ID _4681_210_3525_1_1527251040321_1235713_123-24428-0_sp0.9 from training. Duration: 0.5333125 2023-10-02 19:40:48,553 WARNING [train.py:1204] (2/4) Exclude cut with ID _31602_210_5854_1_1532677021456_4458250_444-198752-0_sp1.1 from training. Duration: 0.97275 2023-10-02 19:40:48,594 WARNING [train.py:1204] (2/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_9-279442-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 19:40:50,480 WARNING [train.py:1204] (2/4) Exclude cut with ID _6275_210_2084_1_1528110062213_7669049_973-198085-0_sp0.9 from training. Duration: 0.8333125 2023-10-02 19:40:51,869 WARNING [train.py:1204] (2/4) Exclude cut with ID _29853_210_7497_1_1532505600851_6642110_567-250221-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 19:40:51,971 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6545_210_2040_1_1527774670_4245258_407-48070-0_sp1.1 from training. Duration: 0.6909375 2023-10-02 19:40:54,607 WARNING [train.py:1204] (2/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_224-343295-0_sp0.9 from training. Duration: 0.611125 2023-10-02 19:40:54,621 WARNING [train.py:1204] (2/4) Exclude cut with ID IC0125W0011-61817 from training. Duration: 0.88 2023-10-02 19:40:54,637 WARNING [train.py:1204] (2/4) Exclude cut with ID _60113_210_16271_1_1535164212481_4386079_435-154371-0_sp1.1 from training. Duration: 0.97275 2023-10-02 19:40:55,903 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_25836_210_16554_1_1532138167_53197_3-9394-0_sp1.1 from training. Duration: 0.92725 2023-10-02 19:40:58,039 WARNING [train.py:1204] (2/4) Exclude cut with ID _29343_210_4381_1_1532602757752_3674109_57-55125-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 19:41:00,120 WARNING [train.py:1204] (2/4) Exclude cut with ID _56235_210_10640_1_1534557335235_4161099_267-23560-0_sp1.1 from training. Duration: 0.936375 2023-10-02 19:41:00,142 WARNING [train.py:1204] (2/4) Exclude cut with ID _30196_210_1664_1_1533708496522_7074140_1150-278867-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 19:41:01,505 WARNING [train.py:1204] (2/4) Exclude cut with ID _6270_210_2084_1_1527850911573_3856420_26-277343-0_sp0.9 from training. Duration: 0.911125 2023-10-02 19:41:01,575 WARNING [train.py:1204] (2/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_325-62802-0 from training. Duration: 0.87 2023-10-02 19:41:02,915 WARNING [train.py:1204] (2/4) Exclude cut with ID _56524_210_9642_1_1534572011779_3619170_145-310842-0_sp1.1 from training. Duration: 0.9 2023-10-02 19:41:02,958 WARNING [train.py:1204] (2/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_860-96198-0_sp0.9 from training. Duration: 0.811125 2023-10-02 19:41:04,321 WARNING [train.py:1204] (2/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_17-25045-0_sp1.1 from training. Duration: 0.536375 2023-10-02 19:41:04,337 WARNING [train.py:1204] (2/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_349-69727-0_sp1.1 from training. Duration: 0.936375 2023-10-02 19:41:05,683 WARNING [train.py:1204] (2/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_580-93798-0_sp1.1 from training. Duration: 0.5090625 2023-10-02 19:41:07,043 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7762_210_1794_1_1528199508_4057408_419-125009-0 from training. Duration: 0.83 2023-10-02 19:41:08,480 WARNING [train.py:1204] (2/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_356-167816-0 from training. Duration: 0.72 2023-10-02 19:41:09,715 WARNING [train.py:1204] (2/4) Exclude cut with ID _29345_210_4381_1_1532689168263_3860169_225-97100-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 19:41:09,719 WARNING [train.py:1204] (2/4) Exclude cut with ID _14227_210_4381_1_1530701804286_3940259_374-194712-0_sp1.1 from training. Duration: 0.92725 2023-10-02 19:41:09,745 WARNING [train.py:1204] (2/4) Exclude cut with ID _40137_210_12210_1_1533290484046_3795180_249-187059-0_sp0.9 from training. Duration: 0.97775 2023-10-02 19:41:09,779 WARNING [train.py:1204] (2/4) Exclude cut with ID _22826_210_9994_1_1531908201522_3279169_389-317194-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 19:41:13,073 WARNING [train.py:1204] (2/4) Exclude cut with ID _27485_210_4916_1_1532739191677_3668269_239-122918-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 19:41:18,556 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_2245_210_2715_1_1525511548_3540254_128-117609-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 19:41:18,596 WARNING [train.py:1204] (2/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_160-276178-0_sp1.1 from training. Duration: 0.963625 2023-10-02 19:41:20,385 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_31920_210_10633_1_1532860009_1686094_1-279735-0_sp1.1 from training. Duration: 0.97275 2023-10-02 19:41:23,182 WARNING [train.py:1204] (2/4) Exclude cut with ID _51078_210_9070_1_1534161557424_3805250_325-262383-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 19:41:23,201 WARNING [train.py:1204] (2/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_643-52007-0_sp0.9 from training. Duration: 0.6333125 2023-10-02 19:41:23,244 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_51937_210_5014_1_1534573610_4022025_518-299248-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 19:41:31,060 INFO [train.py:1046] (2/4) Epoch 29, batch 700, loss[loss=0.1682, simple_loss=0.2379, pruned_loss=0.0492, over 23824.00 frames. ], tot_loss[loss=0.1646, simple_loss=0.2429, pruned_loss=0.04314, over 4597180.78 frames. ], batch size: 179, lr: 3.55e-03, grad_scale: 8.0 2023-10-02 19:41:31,103 WARNING [train.py:1204] (2/4) Exclude cut with ID _13252_210_1925_1_1530671372047_3336909_166-314607-0_sp1.1 from training. Duration: 0.4909375 2023-10-02 19:41:31,107 WARNING [train.py:1204] (2/4) Exclude cut with ID _44778_210_2054_1_1533798127203_7443090_1087-282939-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 19:41:31,148 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_23850_210_4238_1_1532232597_1632433_32-185135-0_sp1.1 from training. Duration: 0.7818125 2023-10-02 19:41:31,177 WARNING [train.py:1204] (2/4) Exclude cut with ID _59500_210_8254_1_1534809610680_3736009_145-89359-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 19:41:35,447 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_340-336408-0 from training. Duration: 0.93 2023-10-02 19:41:36,786 WARNING [train.py:1204] (2/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_9-21737-0 from training. Duration: 0.97 2023-10-02 19:41:38,266 WARNING [train.py:1204] (2/4) Exclude cut with ID _2220_210_1819_1_1525509109898_3658680_50-99837-0 from training. Duration: 0.54 2023-10-02 19:41:38,484 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=996266.6666666666, ans=0.1 2023-10-02 19:41:39,644 WARNING [train.py:1204] (2/4) Exclude cut with ID _30304_210_6319_1_1532476827261_7582060_879-299350-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 19:41:41,414 WARNING [train.py:1204] (2/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_905-160074-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 19:41:42,314 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.1.encoder.layers.0.nonlin_attention.whiten1, num_groups=1, num_channels=192, metric=4.79 vs. limit=10.0 2023-10-02 19:41:44,107 WARNING [train.py:1204] (2/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_395-73570-0 from training. Duration: 0.99 2023-10-02 19:41:48,224 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7264_210_4938_1_1527939911_8132095_800-91995-0_sp1.1 from training. Duration: 0.7818125 2023-10-02 19:41:51,302 WARNING [train.py:1204] (2/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_77-311404-0_sp1.1 from training. Duration: 0.8545625 2023-10-02 19:41:51,397 WARNING [train.py:1204] (2/4) Exclude cut with ID _13679_210_7497_1_1530534293662_3355098_107-80772-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 19:41:52,777 WARNING [train.py:1204] (2/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_224-343295-0_sp1.1 from training. Duration: 0.5 2023-10-02 19:41:54,053 WARNING [train.py:1204] (2/4) Exclude cut with ID _54871_210_22356_1_1534665798253_3856359_156-253482-0_sp1.1 from training. Duration: 0.963625 2023-10-02 19:41:55,570 WARNING [train.py:1204] (2/4) Exclude cut with ID _49728_210_12651_1_1534041034294_6788150_309-125532-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 19:42:00,141 WARNING [train.py:1204] (2/4) Exclude cut with ID _13913_210_9994_1_1530579615187_3383897_473-277682-0_sp0.9 from training. Duration: 0.4333125 2023-10-02 19:42:00,153 WARNING [train.py:1204] (2/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_3-276701-0_sp1.1 from training. Duration: 0.82725 2023-10-02 19:42:02,146 WARNING [train.py:1204] (2/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_282-136641-0 from training. Duration: 0.42 2023-10-02 19:42:03,390 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.0.layers.0.nonlin_attention.whiten1, num_groups=1, num_channels=144, metric=9.50 vs. limit=10.0 2023-10-02 19:42:03,716 WARNING [train.py:1204] (2/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_127-566-0 from training. Duration: 0.84 2023-10-02 19:42:07,840 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6163_210_6400_1_1527506746_4263068_283-137811-0_sp1.1 from training. Duration: 0.7 2023-10-02 19:42:07,874 WARNING [train.py:1204] (2/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_34-183685-0_sp1.1 from training. Duration: 0.8909375 2023-10-02 19:42:10,588 WARNING [train.py:1204] (2/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_154-135696-0_sp0.9 from training. Duration: 0.888875 2023-10-02 19:42:15,103 WARNING [train.py:1204] (2/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_676-51014-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 19:42:15,157 WARNING [train.py:1204] (2/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_38-169920-0 from training. Duration: 0.91 2023-10-02 19:42:18,199 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.conv_module1.balancer2.min_abs, batch_count=996466.6666666666, ans=0.5 2023-10-02 19:42:19,396 WARNING [train.py:1204] (2/4) Exclude cut with ID _6653_210_2629_1_1527919172441_6732679_747-114465-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 19:42:19,442 WARNING [train.py:1204] (2/4) Exclude cut with ID _8021_210_7994_1_1528376451358_3653069_100-140231-0_sp1.1 from training. Duration: 0.6090625 2023-10-02 19:42:21,343 WARNING [train.py:1204] (2/4) Exclude cut with ID _64135_210_2253_1_1535677267876_7608972_133-188266-0 from training. Duration: 0.81 2023-10-02 19:42:22,596 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.501e+02 1.803e+02 2.016e+02 2.224e+02 3.281e+02, threshold=4.032e+02, percent-clipped=0.0 2023-10-02 19:42:25,374 WARNING [train.py:1204] (2/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_714-310538-0_sp1.1 from training. Duration: 0.7818125 2023-10-02 19:42:25,470 WARNING [train.py:1204] (2/4) Exclude cut with ID _39867_210_9826_1_1533553222030_9344259_1134-288498-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 19:42:28,230 WARNING [train.py:1204] (2/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_211-237289-0_sp1.1 from training. Duration: 0.936375 2023-10-02 19:42:33,097 WARNING [train.py:1204] (2/4) Exclude cut with ID _59202_210_26984_1_1534816810612_3549174_137-318710-0_sp0.9 from training. Duration: 0.988875 2023-10-02 19:42:33,873 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.1.encoder.layers.1.nonlin_attention.whiten1, num_groups=1, num_channels=192, metric=5.66 vs. limit=10.0 2023-10-02 19:42:34,915 WARNING [train.py:1204] (2/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_86-66192-0 from training. Duration: 0.55 2023-10-02 19:42:37,714 WARNING [train.py:1204] (2/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_540-275685-0 from training. Duration: 0.89 2023-10-02 19:42:39,018 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8776_210_2040_1_1528638837_4088036_244-278763-0 from training. Duration: 0.9 2023-10-02 19:42:40,442 WARNING [train.py:1204] (2/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_9-219972-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 19:42:43,115 WARNING [train.py:1204] (2/4) Exclude cut with ID _44659_210_2721_1_1534036120061_6614860_656-283823-0_sp1.1 from training. Duration: 0.92725 2023-10-02 19:42:43,180 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_69630_210_7234_1_1535789601_661049_77-216385-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 19:42:45,100 INFO [train.py:1046] (2/4) Epoch 29, batch 750, loss[loss=0.1635, simple_loss=0.2494, pruned_loss=0.03882, over 24527.00 frames. ], tot_loss[loss=0.1637, simple_loss=0.2417, pruned_loss=0.04287, over 4623416.13 frames. ], batch size: 63, lr: 3.55e-03, grad_scale: 8.0 2023-10-02 19:42:46,501 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_19952_210_2161_1_1531875469_3851750_210-308919-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 19:42:46,506 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_4737_210_2040_1_1526998689_3293008_378-178650-0 from training. Duration: 0.95 2023-10-02 19:42:50,000 WARNING [train.py:1204] (2/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_224-343295-0 from training. Duration: 0.55 2023-10-02 19:42:50,025 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7002_210_1794_1_1527939683_5157286_184-229630-0 from training. Duration: 0.74 2023-10-02 19:42:50,056 WARNING [train.py:1204] (2/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_6-214815-0 from training. Duration: 0.85 2023-10-02 19:42:51,401 WARNING [train.py:1204] (2/4) Exclude cut with ID _8699_210_2069_1_1528630346831_3960409_160-24408-0 from training. Duration: 0.91 2023-10-02 19:42:51,438 WARNING [train.py:1204] (2/4) Exclude cut with ID _5585_210_2667_1_1527418813324_8007340_89-332072-0 from training. Duration: 0.64 2023-10-02 19:42:52,800 WARNING [train.py:1204] (2/4) Exclude cut with ID _5250_210_5219_1_1527249561806_8692110_528-259800-0_sp1.1 from training. Duration: 0.963625 2023-10-02 19:42:52,886 WARNING [train.py:1204] (2/4) Exclude cut with ID _55383_210_10635_1_1534468426172_2231669_134-128246-0 from training. Duration: 0.89 2023-10-02 19:42:53,060 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.bypass_mid.scale_min, batch_count=996600.0, ans=0.2 2023-10-02 19:42:54,183 WARNING [train.py:1204] (2/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_400-338274-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 19:42:55,559 WARNING [train.py:1204] (2/4) Exclude cut with ID _14224_210_4381_1_1530529062037_4264010_364-22817-0_sp0.9 from training. Duration: 0.788875 2023-10-02 19:42:55,680 WARNING [train.py:1204] (2/4) Exclude cut with ID _42863_210_9070_1_1533902398788_3696920_331-291057-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 19:42:57,178 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_5436_210_1794_1_1527331317_2541494_229-211016-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 19:42:59,091 WARNING [train.py:1204] (2/4) Exclude cut with ID _6456_210_1534_1_1527681634212_3181049_345-252877-0_sp0.9 from training. Duration: 0.711125 2023-10-02 19:43:00,203 WARNING [train.py:1204] (2/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_96-81499-0_sp1.1 from training. Duration: 0.92725 2023-10-02 19:43:02,842 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_5575_210_1410_1_1527301305_2570170_342-265572-0_sp1.1 from training. Duration: 0.8545625 2023-10-02 19:43:04,216 WARNING [train.py:1204] (2/4) Exclude cut with ID _5444_210_2151_1_1527418808464_5312210_726-27165-0_sp0.9 from training. Duration: 0.7444375 2023-10-02 19:43:05,593 WARNING [train.py:1204] (2/4) Exclude cut with ID _22083_210_8254_1_1531702862439_3678970_304-327750-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 19:43:05,776 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.out_combiner.scale_min, batch_count=996666.6666666666, ans=0.2 2023-10-02 19:43:07,096 WARNING [train.py:1204] (2/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_245-226199-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 19:43:09,043 WARNING [train.py:1204] (2/4) Exclude cut with ID _42875_210_14856_1_1533704397487_4012433_102-199499-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 19:43:09,100 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_51225_210_3601_1_1534400634_2327383_55-301844-0 from training. Duration: 0.93 2023-10-02 19:43:10,384 WARNING [train.py:1204] (2/4) Exclude cut with ID _28317_210_12742_1_1532480302381_8510325_916-195848-0_sp1.1 from training. Duration: 0.836375 2023-10-02 19:43:10,446 WARNING [train.py:1204] (2/4) Exclude cut with ID _44718_210_5816_1_1534050051869_6469030_497-311999-0_sp1.1 from training. Duration: 0.936375 2023-10-02 19:43:13,194 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_34886_210_10633_1_1533203882_1755661_91-194765-0_sp1.1 from training. Duration: 0.936375 2023-10-02 19:43:15,904 WARNING [train.py:1204] (2/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_310-26186-0_sp0.9 from training. Duration: 0.77775 2023-10-02 19:43:15,993 WARNING [train.py:1204] (2/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_416-257730-0 from training. Duration: 0.65 2023-10-02 19:43:15,998 WARNING [train.py:1204] (2/4) Exclude cut with ID _9389_210_1664_1_1529217337301_5289269_209-154760-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 19:43:19,198 WARNING [train.py:1204] (2/4) Exclude cut with ID _4092_210_2151_1_1526643044449_3135759_227-213972-0 from training. Duration: 0.54 2023-10-02 19:43:19,217 WARNING [train.py:1204] (2/4) Exclude cut with ID IC0679W0044-335852 from training. Duration: 0.768 2023-10-02 19:43:19,272 WARNING [train.py:1204] (2/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_42-186162-0 from training. Duration: 0.92 2023-10-02 19:43:19,280 WARNING [train.py:1204] (2/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_302-2550-0_sp0.9 from training. Duration: 0.9 2023-10-02 19:43:19,293 WARNING [train.py:1204] (2/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_356-31216-0_sp0.9 from training. Duration: 0.8333125 2023-10-02 19:43:22,564 WARNING [train.py:1204] (2/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_125-322172-0_sp1.1 from training. Duration: 0.8454375 2023-10-02 19:43:28,122 WARNING [train.py:1204] (2/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_141-89727-0_sp0.9 from training. Duration: 0.788875 2023-10-02 19:43:28,140 WARNING [train.py:1204] (2/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_647-337784-0_sp1.1 from training. Duration: 0.97275 2023-10-02 19:43:28,150 WARNING [train.py:1204] (2/4) Exclude cut with ID _6493_210_1747_1_1528459210424_4012470_513-140967-0_sp1.1 from training. Duration: 0.7181875 2023-10-02 19:43:31,276 WARNING [train.py:1204] (2/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_87-266334-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 19:43:31,540 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.attention_skip_rate, batch_count=996800.0, ans=0.0 2023-10-02 19:43:32,650 WARNING [train.py:1204] (2/4) Exclude cut with ID _26528_210_9070_1_1532570410842_3851310_526-261338-0_sp1.1 from training. Duration: 0.92725 2023-10-02 19:43:32,701 WARNING [train.py:1204] (2/4) Exclude cut with ID _14224_210_4381_1_1530529062037_4264010_380-343670-0 from training. Duration: 0.88 2023-10-02 19:43:34,021 WARNING [train.py:1204] (2/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_449-15508-0_sp1.1 from training. Duration: 0.7454375 2023-10-02 19:43:35,365 WARNING [train.py:1204] (2/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_372-6869-0_sp0.9 from training. Duration: 0.488875 2023-10-02 19:43:35,422 WARNING [train.py:1204] (2/4) Exclude cut with ID _26907_210_7234_1_1532221740694_3205263_337-2494-0_sp0.9 from training. Duration: 0.97775 2023-10-02 19:43:38,771 WARNING [train.py:1204] (2/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_10-100367-0_sp1.1 from training. Duration: 0.8909375 2023-10-02 19:43:38,811 WARNING [train.py:1204] (2/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_47-167373-0 from training. Duration: 0.92 2023-10-02 19:43:40,100 WARNING [train.py:1204] (2/4) Exclude cut with ID _28323_210_16187_1_1532480395667_4100300_628-282179-0_sp1.1 from training. Duration: 0.97275 2023-10-02 19:43:44,505 WARNING [train.py:1204] (2/4) Exclude cut with ID _20455_210_5219_1_1531565996775_8098100_213-280898-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 19:43:45,835 WARNING [train.py:1204] (2/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_383-316000-0_sp1.1 from training. Duration: 0.8181875 2023-10-02 19:43:45,868 WARNING [train.py:1204] (2/4) Exclude cut with ID _18513_210_9206_1_1531618215223_3862620_16-78180-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 19:43:47,330 WARNING [train.py:1204] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_330-327861-0_sp1.1 from training. Duration: 0.6090625 2023-10-02 19:43:52,497 WARNING [train.py:1204] (2/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_356-31216-0 from training. Duration: 0.75 2023-10-02 19:43:53,725 WARNING [train.py:1204] (2/4) Exclude cut with ID _25248_210_8750_1_1532062825941_5554180_255-148539-0_sp1.1 from training. Duration: 0.963625 2023-10-02 19:43:53,781 WARNING [train.py:1204] (2/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_269-154726-0_sp1.1 from training. Duration: 0.7818125 2023-10-02 19:43:57,834 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7002_210_1794_1_1527939683_5157286_163-87552-0_sp1.1 from training. Duration: 0.7818125 2023-10-02 19:43:57,860 WARNING [train.py:1204] (2/4) Exclude cut with ID _54871_210_22356_1_1534665798253_3856359_97-151896-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 19:43:58,170 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.attention_skip_rate, batch_count=996933.3333333334, ans=0.0 2023-10-02 19:43:59,210 INFO [train.py:1046] (2/4) Epoch 29, batch 800, loss[loss=0.1545, simple_loss=0.2442, pruned_loss=0.03243, over 24663.00 frames. ], tot_loss[loss=0.1644, simple_loss=0.2424, pruned_loss=0.04322, over 4655207.26 frames. ], batch size: 68, lr: 3.55e-03, grad_scale: 16.0 2023-10-02 19:44:01,875 WARNING [train.py:1204] (2/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_207-7026-0_sp1.1 from training. Duration: 0.97275 2023-10-02 19:44:01,895 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_92-46009-0_sp0.9 from training. Duration: 0.6 2023-10-02 19:44:04,062 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.bypass.scale_min, batch_count=996933.3333333334, ans=0.2 2023-10-02 19:44:05,846 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.5.encoder.layers.1.conv_module2.whiten, num_groups=1, num_channels=256, metric=4.73 vs. limit=15.0 2023-10-02 19:44:08,613 WARNING [train.py:1204] (2/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_122-178847-0_sp1.1 from training. Duration: 0.97275 2023-10-02 19:44:08,617 WARNING [train.py:1204] (2/4) Exclude cut with ID _44778_210_2054_1_1533798127203_7443090_94-348956-0_sp1.1 from training. Duration: 0.92725 2023-10-02 19:44:11,442 WARNING [train.py:1204] (2/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_1018-244089-0_sp1.1 from training. Duration: 0.963625 2023-10-02 19:44:11,460 WARNING [train.py:1204] (2/4) Exclude cut with ID _56235_210_10640_1_1534557335235_4161099_87-118423-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 19:44:11,545 WARNING [train.py:1204] (2/4) Exclude cut with ID _53713_210_24116_1_1534314487762_7416751_504-212292-0_sp1.1 from training. Duration: 0.92725 2023-10-02 19:44:12,777 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_446-150327-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 19:44:14,190 WARNING [train.py:1204] (2/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_682-245989-0_sp1.1 from training. Duration: 0.92725 2023-10-02 19:44:16,907 WARNING [train.py:1204] (2/4) Exclude cut with ID _35723_210_1794_1_1533088341043_628250_8-234524-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 19:44:18,296 WARNING [train.py:1204] (2/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_64-248359-0_sp0.9 from training. Duration: 0.7666875 2023-10-02 19:44:21,735 WARNING [train.py:1204] (2/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_49-97158-0 from training. Duration: 0.76 2023-10-02 19:44:22,965 WARNING [train.py:1204] (2/4) Exclude cut with ID _24314_210_4915_1_1532135078116_7170179_743-915-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 19:44:24,790 WARNING [train.py:1204] (2/4) Exclude cut with ID _61194_210_18628_1_1535007555628_3750909_353-323468-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 19:44:24,816 WARNING [train.py:1204] (2/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_387-131538-0_sp1.1 from training. Duration: 0.736375 2023-10-02 19:44:26,082 WARNING [train.py:1204] (2/4) Exclude cut with ID _44424_210_18163_1_1533623394848_1213110_79-187603-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 19:44:26,121 WARNING [train.py:1204] (2/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_86-19601-0 from training. Duration: 0.85 2023-10-02 19:44:26,153 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_50050_210_16223_1_1535435796_7290138_103-283529-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 19:44:27,477 WARNING [train.py:1204] (2/4) Exclude cut with ID _13913_210_9994_1_1530579615187_3383897_473-277682-0 from training. Duration: 0.39 2023-10-02 19:44:31,601 WARNING [train.py:1204] (2/4) Exclude cut with ID _42326_210_7770_1_1533814282531_3707269_368-68029-0_sp1.1 from training. Duration: 0.92725 2023-10-02 19:44:33,042 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_41428_210_2161_1_1533774184_3992532_446-339645-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 19:44:34,444 WARNING [train.py:1204] (2/4) Exclude cut with ID _69584_210_5219_1_1535785133775_4044450_325-7656-0_sp1.1 from training. Duration: 0.7818125 2023-10-02 19:44:34,454 WARNING [train.py:1204] (2/4) Exclude cut with ID _21632_210_2253_1_1531994001765_3744140_134-184387-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 19:44:37,827 WARNING [train.py:1204] (2/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_550-184707-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 19:44:37,861 WARNING [train.py:1204] (2/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_106-341780-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 19:44:42,371 WARNING [train.py:1204] (2/4) Exclude cut with ID _45871_210_5665_1_1533864614245_3296060_253-132552-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 19:44:42,407 WARNING [train.py:1204] (2/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_395-310220-0_sp1.1 from training. Duration: 0.8090625 2023-10-02 19:44:42,432 WARNING [train.py:1204] (2/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_795-33252-0_sp0.9 from training. Duration: 0.411125 2023-10-02 19:44:43,884 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9002W0003-495962 from training. Duration: 0.946 2023-10-02 19:44:43,905 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_43-71989-0 from training. Duration: 0.57 2023-10-02 19:44:45,193 WARNING [train.py:1204] (2/4) Exclude cut with ID _8699_210_2069_1_1528630346831_3960409_155-39766-0_sp1.1 from training. Duration: 0.7090625 2023-10-02 19:44:45,207 WARNING [train.py:1204] (2/4) Exclude cut with ID _27894_210_12392_1_1532415496078_6902439_788-232684-0_sp1.1 from training. Duration: 0.97275 2023-10-02 19:44:46,622 WARNING [train.py:1204] (2/4) Exclude cut with ID _56494_210_4220_1_1535284837625_3033169_10-39296-0_sp1.1 from training. Duration: 0.92725 2023-10-02 19:44:46,644 WARNING [train.py:1204] (2/4) Exclude cut with ID _40901_210_5665_1_1533363789958_3856130_341-20819-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 19:44:49,689 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9029W0435-509822 from training. Duration: 0.896 2023-10-02 19:44:51,488 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.543e+02 1.909e+02 2.086e+02 2.400e+02 3.373e+02, threshold=4.173e+02, percent-clipped=0.0 2023-10-02 19:44:51,557 WARNING [train.py:1204] (2/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_51-117576-0 from training. Duration: 0.4 2023-10-02 19:44:51,675 WARNING [train.py:1204] (2/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_197-28164-0_sp1.1 from training. Duration: 0.72725 2023-10-02 19:44:53,089 WARNING [train.py:1204] (2/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_145-137790-0_sp1.1 from training. Duration: 0.7545625 2023-10-02 19:44:57,596 WARNING [train.py:1204] (2/4) Exclude cut with ID _47790_210_16187_1_1533862814066_4207519_2-39653-0_sp1.1 from training. Duration: 0.8909375 2023-10-02 19:45:00,178 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_3780_210_2087_1_1526467760_2130483_186-208240-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 19:45:00,304 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.0.bypass_mid.scale_min, batch_count=997200.0, ans=0.2 2023-10-02 19:45:02,810 WARNING [train.py:1204] (2/4) Exclude cut with ID _24055_210_12651_1_1532310907143_976979_92-59356-0 from training. Duration: 0.91 2023-10-02 19:45:02,857 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_3910_210_1794_1_1526742306_334530_28-238909-0_sp1.1 from training. Duration: 0.736375 2023-10-02 19:45:06,213 WARNING [train.py:1204] (2/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_269-154726-0 from training. Duration: 0.86 2023-10-02 19:45:13,271 INFO [train.py:1046] (2/4) Epoch 29, batch 850, loss[loss=0.1699, simple_loss=0.2453, pruned_loss=0.04723, over 23470.00 frames. ], tot_loss[loss=0.1648, simple_loss=0.2432, pruned_loss=0.04319, over 4685720.33 frames. ], batch size: 285, lr: 3.55e-03, grad_scale: 16.0 2023-10-02 19:45:13,316 WARNING [train.py:1204] (2/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_326-165900-0_sp0.9 from training. Duration: 0.7666875 2023-10-02 19:45:14,716 WARNING [train.py:1204] (2/4) Exclude cut with ID _5294_210_1747_1_1527249643928_3696179_470-276077-0_sp1.1 from training. Duration: 0.963625 2023-10-02 19:45:16,042 WARNING [train.py:1204] (2/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_372-131163-0 from training. Duration: 0.94 2023-10-02 19:45:17,330 WARNING [train.py:1204] (2/4) Exclude cut with ID _45297_210_2721_1_1533715351381_3264949_351-147136-0_sp1.1 from training. Duration: 0.7909375 2023-10-02 19:45:18,681 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_1072-12671-0_sp1.1 from training. Duration: 0.92725 2023-10-02 19:45:20,444 WARNING [train.py:1204] (2/4) Exclude cut with ID _15133_210_6228_1_1530794512074_3080379_54-198193-0 from training. Duration: 0.94 2023-10-02 19:45:20,468 WARNING [train.py:1204] (2/4) Exclude cut with ID _30196_210_1664_1_1533708496522_7074140_1194-270988-0_sp1.1 from training. Duration: 0.97275 2023-10-02 19:45:20,567 WARNING [train.py:1204] (2/4) Exclude cut with ID _50310_210_15488_1_1534068043345_3641080_289-193637-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 19:45:23,315 WARNING [train.py:1204] (2/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_313-263495-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 19:45:23,437 WARNING [train.py:1204] (2/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_18-130336-0_sp1.1 from training. Duration: 0.7181875 2023-10-02 19:45:24,815 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_47896_210_20508_1_1534492518_5958868_646-287209-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 19:45:26,190 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_294-163793-0 from training. Duration: 0.82 2023-10-02 19:45:26,228 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_8-319957-0 from training. Duration: 0.96 2023-10-02 19:45:26,232 WARNING [train.py:1204] (2/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_1211-226066-0 from training. Duration: 0.76 2023-10-02 19:45:29,652 WARNING [train.py:1204] (2/4) Exclude cut with ID _4779_210_2084_1_1527318022944_3914893_500-285013-0_sp0.9 from training. Duration: 0.7666875 2023-10-02 19:45:29,670 WARNING [train.py:1204] (2/4) Exclude cut with ID _22826_210_9994_1_1531908201522_3279169_347-232268-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 19:45:31,132 WARNING [train.py:1204] (2/4) Exclude cut with ID _29877_210_6228_1_1532397791106_5276149_111-55056-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 19:45:31,168 WARNING [train.py:1204] (2/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_812-271646-0_sp1.1 from training. Duration: 0.92725 2023-10-02 19:45:31,193 WARNING [train.py:1204] (2/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_235-262124-0_sp1.1 from training. Duration: 0.6090625 2023-10-02 19:45:36,520 WARNING [train.py:1204] (2/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_14-16530-0_sp1.1 from training. Duration: 0.97275 2023-10-02 19:45:36,558 WARNING [train.py:1204] (2/4) Exclude cut with ID _44718_210_5816_1_1534050051869_6469030_568-304512-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 19:45:36,576 WARNING [train.py:1204] (2/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_1033-274376-0 from training. Duration: 0.87 2023-10-02 19:45:41,203 WARNING [train.py:1204] (2/4) Exclude cut with ID _4681_210_3525_1_1527251040321_1235713_123-24428-0 from training. Duration: 0.48 2023-10-02 19:45:44,568 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_37818_210_4238_1_1533620910_4720083_251-288764-0_sp1.1 from training. Duration: 0.97275 2023-10-02 19:45:45,882 WARNING [train.py:1204] (2/4) Exclude cut with ID _65585_210_12210_1_1535450451584_3764098_112-294301-0 from training. Duration: 0.88 2023-10-02 19:45:49,004 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=997400.0, ans=0.1 2023-10-02 19:45:50,123 WARNING [train.py:1204] (2/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_329-113173-0 from training. Duration: 0.62 2023-10-02 19:45:51,930 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6767_210_3273_1_1527855783_1718932_173-256601-0 from training. Duration: 0.8 2023-10-02 19:45:53,323 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9266W0001-627677 from training. Duration: 0.896 2023-10-02 19:45:53,335 WARNING [train.py:1204] (2/4) Exclude cut with ID _49643_210_7989_1_1534640244224_3939031_611-185515-0_sp1.1 from training. Duration: 0.8545625 2023-10-02 19:45:53,339 WARNING [train.py:1204] (2/4) Exclude cut with ID _31649_210_6228_1_1532584608063_4091078_687-237771-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 19:45:53,348 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_12868_210_6753_1_1530702479_3610564_148-172861-0_sp0.9 from training. Duration: 0.5555625 2023-10-02 19:45:56,144 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_5129_210_6400_1_1527161005_3579054_107-69738-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 19:45:57,459 WARNING [train.py:1204] (2/4) Exclude cut with ID _40137_210_12210_1_1533290484046_3795180_36-301164-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 19:45:57,493 WARNING [train.py:1204] (2/4) Exclude cut with ID _7537_210_2084_1_1528502395431_3924359_113-199933-0 from training. Duration: 0.59 2023-10-02 19:45:58,815 WARNING [train.py:1204] (2/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_547-5424-0_sp1.1 from training. Duration: 0.8545625 2023-10-02 19:45:58,882 WARNING [train.py:1204] (2/4) Exclude cut with ID _43552_210_8913_1_1533726031059_3523429_147-284024-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 19:46:00,903 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_294-163793-0_sp1.1 from training. Duration: 0.7454375 2023-10-02 19:46:02,282 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_3780_210_2087_1_1526467760_2130483_170-259044-0_sp1.1 from training. Duration: 0.636375 2023-10-02 19:46:02,401 WARNING [train.py:1204] (2/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_726-270780-0_sp1.1 from training. Duration: 0.7818125 2023-10-02 19:46:03,803 WARNING [train.py:1204] (2/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_214-291369-0_sp1.1 from training. Duration: 0.57275 2023-10-02 19:46:03,853 WARNING [train.py:1204] (2/4) Exclude cut with ID _21881_210_3525_1_1531825242757_3798380_247-102591-0 from training. Duration: 0.98 2023-10-02 19:46:08,088 WARNING [train.py:1204] (2/4) Exclude cut with ID _8064_210_5399_1_1528372299291_4282973_18-174647-0_sp1.1 from training. Duration: 0.82725 2023-10-02 19:46:08,091 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_415-52611-0_sp1.1 from training. Duration: 0.936375 2023-10-02 19:46:08,766 WARNING [train.py:1204] (2/4) Exclude cut with ID _8394_210_6702_1_1528590504316_3540189_379-49457-0_sp0.9 from training. Duration: 0.9666875 2023-10-02 19:46:10,068 WARNING [train.py:1204] (2/4) Exclude cut with ID _64521_210_14699_1_1535243427259_3815328_368-195106-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 19:46:10,142 WARNING [train.py:1204] (2/4) Exclude cut with ID _28394_210_8913_1_1532430049518_3650118_51-334124-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 19:46:14,193 WARNING [train.py:1204] (2/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_317-161769-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 19:46:16,149 WARNING [train.py:1204] (2/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_40-129756-0_sp0.9 from training. Duration: 0.9 2023-10-02 19:46:16,310 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.0.bypass.scale_min, batch_count=997533.3333333334, ans=0.2 2023-10-02 19:46:17,506 WARNING [train.py:1204] (2/4) Exclude cut with ID _3067_210_2667_1_1526212974640_4503168_119-175776-0_sp0.9 from training. Duration: 0.82225 2023-10-02 19:46:18,861 WARNING [train.py:1204] (2/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_1004-304915-0_sp1.1 from training. Duration: 0.92725 2023-10-02 19:46:20,202 WARNING [train.py:1204] (2/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_359-118926-0_sp0.9 from training. Duration: 0.8 2023-10-02 19:46:20,415 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.bypass_mid.scale_min, batch_count=997533.3333333334, ans=0.2 2023-10-02 19:46:25,004 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.bypass.scale_min, batch_count=997533.3333333334, ans=0.2 2023-10-02 19:46:26,395 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.conv_skip_rate, batch_count=997600.0, ans=0.0 2023-10-02 19:46:27,484 INFO [train.py:1046] (2/4) Epoch 29, batch 900, loss[loss=0.1643, simple_loss=0.2354, pruned_loss=0.04662, over 24439.00 frames. ], tot_loss[loss=0.1657, simple_loss=0.244, pruned_loss=0.04374, over 4695177.77 frames. ], batch size: 58, lr: 3.55e-03, grad_scale: 16.0 2023-10-02 19:46:27,616 WARNING [train.py:1204] (2/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_51-200220-0_sp0.9 from training. Duration: 0.62225 2023-10-02 19:46:28,978 WARNING [train.py:1204] (2/4) Exclude cut with ID _46683_210_9642_1_1533730913492_4591116_530-326202-0_sp1.1 from training. Duration: 0.936375 2023-10-02 19:46:29,019 WARNING [train.py:1204] (2/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_567-135114-0 from training. Duration: 0.87 2023-10-02 19:46:29,071 WARNING [train.py:1204] (2/4) Exclude cut with ID _66863_210_18053_1_1535698758334_8590319_1092-271784-0_sp1.1 from training. Duration: 0.87275 2023-10-02 19:46:30,346 WARNING [train.py:1204] (2/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_762-282614-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 19:46:31,718 WARNING [train.py:1204] (2/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_280-120942-0 from training. Duration: 0.77 2023-10-02 19:46:37,966 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_31583_210_3852_1_1532595851_4024413_27-170931-0_sp1.1 from training. Duration: 0.963625 2023-10-02 19:46:41,282 WARNING [train.py:1204] (2/4) Exclude cut with ID _12369_210_4381_1_1530356078581_4388520_266-271628-0_sp1.1 from training. Duration: 0.92725 2023-10-02 19:46:41,328 WARNING [train.py:1204] (2/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_41-31157-0 from training. Duration: 0.57 2023-10-02 19:46:44,013 WARNING [train.py:1204] (2/4) Exclude cut with ID _21878_210_3525_1_1531738772002_7264330_113-18163-0_sp1.1 from training. Duration: 0.8090625 2023-10-02 19:46:44,040 WARNING [train.py:1204] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_147-111137-0 from training. Duration: 0.95 2023-10-02 19:46:44,107 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1083-220122-0_sp1.1 from training. Duration: 0.463625 2023-10-02 19:46:47,341 WARNING [train.py:1204] (2/4) Exclude cut with ID _14884_210_2252_1_1530707459689_5561250_591-173406-0_sp1.1 from training. Duration: 0.87275 2023-10-02 19:46:47,347 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_24677_210_1794_1_1532002693_4341296_149-303793-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 19:46:47,401 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_12868_210_6753_1_1530702479_3610564_133-564-0_sp1.1 from training. Duration: 0.7181875 2023-10-02 19:46:48,740 WARNING [train.py:1204] (2/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_625-338546-0_sp0.9 from training. Duration: 0.97775 2023-10-02 19:46:56,301 WARNING [train.py:1204] (2/4) Exclude cut with ID _25882_210_3036_1_1532775665815_6479510_624-320169-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 19:46:56,307 WARNING [train.py:1204] (2/4) Exclude cut with ID _20622_210_3852_1_1531622427080_1093839_127-325326-0_sp1.1 from training. Duration: 0.92725 2023-10-02 19:46:57,148 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.0.layers.0.whiten, num_groups=1, num_channels=192, metric=3.98 vs. limit=12.0 2023-10-02 19:46:57,564 WARNING [train.py:1204] (2/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_464-33931-0_sp1.1 from training. Duration: 0.4909375 2023-10-02 19:47:00,424 WARNING [train.py:1204] (2/4) Exclude cut with ID _66112_210_10635_1_1535590236305_3801484_120-304588-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 19:47:05,150 WARNING [train.py:1204] (2/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_110-130961-0 from training. Duration: 0.47 2023-10-02 19:47:07,765 WARNING [train.py:1204] (2/4) Exclude cut with ID _16908_210_5724_1_1531119157248_3772700_64-54020-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 19:47:08,875 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.5.encoder.layers.1.feed_forward3.out_whiten, num_groups=1, num_channels=256, metric=13.06 vs. limit=15.0 2023-10-02 19:47:12,547 WARNING [train.py:1204] (2/4) Exclude cut with ID _4068_210_2629_1_1526697350395_1546259_224-288693-0_sp1.1 from training. Duration: 0.536375 2023-10-02 19:47:12,591 WARNING [train.py:1204] (2/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_191-41023-0_sp0.9 from training. Duration: 0.788875 2023-10-02 19:47:13,915 WARNING [train.py:1204] (2/4) Exclude cut with ID ID0205W0255-783002 from training. Duration: 0.958 2023-10-02 19:47:13,967 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_162-262717-0 from training. Duration: 0.83 2023-10-02 19:47:19,915 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.555e+02 1.861e+02 2.059e+02 2.455e+02 3.512e+02, threshold=4.118e+02, percent-clipped=0.0 2023-10-02 19:47:20,049 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_224-322394-0_sp1.1 from training. Duration: 0.6 2023-10-02 19:47:20,060 WARNING [train.py:1204] (2/4) Exclude cut with ID _4866_210_2577_1_1527404412284_4364659_133-1696-0_sp1.1 from training. Duration: 0.9 2023-10-02 19:47:21,414 WARNING [train.py:1204] (2/4) Exclude cut with ID _6275_210_2084_1_1528110062213_7669049_973-198085-0_sp1.1 from training. Duration: 0.6818125 2023-10-02 19:47:27,669 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_47449_210_5399_1_1534076445_4006929_410-237806-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 19:47:27,678 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7543_210_6753_1_1528543318_4552_1-85072-0_sp0.9 from training. Duration: 0.92225 2023-10-02 19:47:30,369 WARNING [train.py:1204] (2/4) Exclude cut with ID _65263_210_22356_1_1535763598691_3921037_108-21902-0 from training. Duration: 0.99 2023-10-02 19:47:30,374 WARNING [train.py:1204] (2/4) Exclude cut with ID _65585_210_12210_1_1535450451584_3764098_309-174818-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 19:47:32,328 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_309-65177-0 from training. Duration: 0.88 2023-10-02 19:47:33,744 WARNING [train.py:1204] (2/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_363-185703-0_sp1.1 from training. Duration: 0.863625 2023-10-02 19:47:33,779 WARNING [train.py:1204] (2/4) Exclude cut with ID _42326_210_7770_1_1533814282531_3707269_442-238692-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 19:47:36,403 WARNING [train.py:1204] (2/4) Exclude cut with ID _69884_210_9771_1_1535846392818_4166169_226-254446-0_sp1.1 from training. Duration: 0.936375 2023-10-02 19:47:36,427 WARNING [train.py:1204] (2/4) Exclude cut with ID _29853_210_7497_1_1532505600851_6642110_287-299903-0_sp1.1 from training. Duration: 0.963625 2023-10-02 19:47:37,986 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=997866.6666666666, ans=0.1 2023-10-02 19:47:40,481 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1015-343884-0 from training. Duration: 0.62 2023-10-02 19:47:40,518 WARNING [train.py:1204] (2/4) Exclude cut with ID ID0305W0008-833402 from training. Duration: 0.91 2023-10-02 19:47:41,231 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6673_210_3273_1_1527767790_3173240_351-167954-0_sp1.1 from training. Duration: 0.436375 2023-10-02 19:47:41,688 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.5.encoder.layers.1.feed_forward2.out_whiten, num_groups=1, num_channels=256, metric=9.60 vs. limit=15.0 2023-10-02 19:47:42,462 INFO [train.py:1046] (2/4) Epoch 29, batch 950, loss[loss=0.1705, simple_loss=0.2562, pruned_loss=0.04246, over 24014.00 frames. ], tot_loss[loss=0.166, simple_loss=0.2444, pruned_loss=0.04379, over 4709749.86 frames. ], batch size: 86, lr: 3.55e-03, grad_scale: 16.0 2023-10-02 19:47:42,521 WARNING [train.py:1204] (2/4) Exclude cut with ID _6456_210_1534_1_1527681634212_3181049_345-252877-0 from training. Duration: 0.64 2023-10-02 19:47:43,959 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_13-205182-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 19:47:48,559 WARNING [train.py:1204] (2/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_427-208901-0 from training. Duration: 0.68 2023-10-02 19:47:52,911 WARNING [train.py:1204] (2/4) Exclude cut with ID _49039_210_6315_1_1533967537541_3846118_9-148921-0_sp1.1 from training. Duration: 0.97275 2023-10-02 19:47:53,863 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.1.encoder.layers.0.self_attn1.whiten, num_groups=1, num_channels=256, metric=12.40 vs. limit=22.5 2023-10-02 19:47:54,425 WARNING [train.py:1204] (2/4) Exclude cut with ID _59493_210_17282_1_1535078419077_3225879_143-70317-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 19:47:56,289 WARNING [train.py:1204] (2/4) Exclude cut with ID _27967_210_12140_1_1532588391859_8419130_1056-236369-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 19:47:56,330 WARNING [train.py:1204] (2/4) Exclude cut with ID _6537_210_1883_1_1527987559710_3597129_382-133062-0_sp0.9 from training. Duration: 0.7555625 2023-10-02 19:47:57,689 WARNING [train.py:1204] (2/4) Exclude cut with ID ID0376W0255-869957 from training. Duration: 0.9510625 2023-10-02 19:48:03,537 WARNING [train.py:1204] (2/4) Exclude cut with ID _49371_210_12071_1_1533952761021_3812190_483-240691-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 19:48:03,608 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_12868_210_6753_1_1530701932_530275_18-76322-0_sp1.1 from training. Duration: 0.7909375 2023-10-02 19:48:03,871 INFO [scaling.py:1118] (2/4) WithLoss: name=encoder.encoders.3.encoder.layers.2.self_attn_weights, loss-sum=0.000e+00 2023-10-02 19:48:04,955 WARNING [train.py:1204] (2/4) Exclude cut with ID _53950_210_5816_1_1534417264919_3862951_367-26344-0_sp1.1 from training. Duration: 0.97275 2023-10-02 19:48:04,974 WARNING [train.py:1204] (2/4) Exclude cut with ID _5444_210_2151_1_1527418808464_5312210_634-194174-0_sp1.1 from training. Duration: 0.8818125 2023-10-02 19:48:06,251 WARNING [train.py:1204] (2/4) Exclude cut with ID _56494_210_4220_1_1535284837625_3033169_69-138150-0 from training. Duration: 0.94 2023-10-02 19:48:06,345 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7543_210_6753_1_1528544010_1110402_25-183860-0_sp0.9 from training. Duration: 0.52225 2023-10-02 19:48:07,721 WARNING [train.py:1204] (2/4) Exclude cut with ID _40284_210_2084_1_1533513679814_3704279_287-191918-0_sp1.1 from training. Duration: 0.963625 2023-10-02 19:48:09,118 WARNING [train.py:1204] (2/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_291-270881-0 from training. Duration: 0.97 2023-10-02 19:48:09,159 WARNING [train.py:1204] (2/4) Exclude cut with ID _12555_210_8750_1_1530075594402_3780239_449-267986-0_sp0.9 from training. Duration: 0.92225 2023-10-02 19:48:13,663 WARNING [train.py:1204] (2/4) Exclude cut with ID _3992_210_1747_1_1526644816031_3731060_268-64520-0_sp1.1 from training. Duration: 0.963625 2023-10-02 19:48:13,670 WARNING [train.py:1204] (2/4) Exclude cut with ID _39411_210_5986_1_1533538825030_3343649_173-238248-0_sp0.9 from training. Duration: 0.92225 2023-10-02 19:48:14,957 WARNING [train.py:1204] (2/4) Exclude cut with ID _32704_210_8954_1_1533017488840_6747370_828-313097-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 19:48:15,568 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.3.encoder.layers.1.conv_module1.whiten, num_groups=1, num_channels=512, metric=3.26 vs. limit=15.0 2023-10-02 19:48:16,339 WARNING [train.py:1204] (2/4) Exclude cut with ID _26907_210_7234_1_1532221740694_3205263_337-2494-0 from training. Duration: 0.88 2023-10-02 19:48:17,747 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_229-45300-0_sp1.1 from training. Duration: 0.5181875 2023-10-02 19:48:19,047 WARNING [train.py:1204] (2/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_300-27235-0_sp1.1 from training. Duration: 0.7909375 2023-10-02 19:48:21,077 WARNING [train.py:1204] (2/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_48-338976-0_sp1.1 from training. Duration: 0.8090625 2023-10-02 19:48:25,293 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_13091_210_7925_1_1530609447_8987354_792-195353-0_sp1.1 from training. Duration: 0.936375 2023-10-02 19:48:25,305 WARNING [train.py:1204] (2/4) Exclude cut with ID _56524_210_9642_1_1534572011779_3619170_311-54500-0_sp1.1 from training. Duration: 0.97275 2023-10-02 19:48:28,583 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7543_210_6753_1_1528544010_1110402_25-183860-0 from training. Duration: 0.47 2023-10-02 19:48:31,248 WARNING [train.py:1204] (2/4) Exclude cut with ID 2411-132532-0017-82279-0_sp1.1 from training. Duration: 0.9681875 2023-10-02 19:48:31,248 WARNING [train.py:1204] (2/4) Exclude cut with ID _14170_210_5219_1_1530525949868_4469650_272-249985-0_sp0.9 from training. Duration: 0.7444375 2023-10-02 19:48:31,301 WARNING [train.py:1204] (2/4) Exclude cut with ID _55644_210_11856_1_1534593599582_4103719_320-229006-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 19:48:31,369 WARNING [train.py:1204] (2/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_177-100554-0_sp1.1 from training. Duration: 0.963625 2023-10-02 19:48:31,371 WARNING [train.py:1204] (2/4) Exclude cut with ID _14998_210_5219_1_1530705582080_7474970_787-294416-0_sp1.1 from training. Duration: 0.7454375 2023-10-02 19:48:35,982 WARNING [train.py:1204] (2/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_318-59097-0 from training. Duration: 0.76 2023-10-02 19:48:36,048 WARNING [train.py:1204] (2/4) Exclude cut with ID _12369_210_4381_1_1530356078581_4388520_480-13146-0_sp1.1 from training. Duration: 0.77275 2023-10-02 19:48:38,735 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_469-338558-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 19:48:38,971 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=998133.3333333334, ans=0.1 2023-10-02 19:48:40,072 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_40896_210_15710_1_1533433956_7100802_367-126897-0_sp1.1 from training. Duration: 0.963625 2023-10-02 19:48:40,087 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6545_210_2040_1_1527774670_4245258_379-14182-0 from training. Duration: 0.8 2023-10-02 19:48:40,098 WARNING [train.py:1204] (2/4) Exclude cut with ID _21631_210_2253_1_1531907880778_3160339_197-79498-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 19:48:40,102 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_12868_210_6753_1_1530702479_3610564_416-293746-0_sp1.1 from training. Duration: 0.8181875 2023-10-02 19:48:40,139 WARNING [train.py:1204] (2/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_52-211683-0 from training. Duration: 0.9 2023-10-02 19:48:44,838 WARNING [train.py:1204] (2/4) Exclude cut with ID _48678_210_5663_1_1533988830947_3769199_306-255886-0_sp0.9 from training. Duration: 0.8555625 2023-10-02 19:48:47,529 WARNING [train.py:1204] (2/4) Exclude cut with ID _65134_210_14763_1_1535335393985_3505368_296-24733-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 19:48:50,834 WARNING [train.py:1204] (2/4) Exclude cut with ID _14898_210_9206_1_1530766812377_4177190_335-99364-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 19:48:52,223 WARNING [train.py:1204] (2/4) Exclude cut with ID _54871_210_22356_1_1534665798253_3856359_19-342632-0 from training. Duration: 0.91 2023-10-02 19:48:52,249 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_53071_210_16554_1_1534490405_2954066_265-170472-0 from training. Duration: 0.83 2023-10-02 19:48:55,683 WARNING [train.py:1204] (2/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_189-298172-0_sp1.1 from training. Duration: 0.963625 2023-10-02 19:48:56,966 INFO [train.py:1046] (2/4) Epoch 29, batch 1000, loss[loss=0.1422, simple_loss=0.2188, pruned_loss=0.03284, over 24458.00 frames. ], tot_loss[loss=0.1651, simple_loss=0.2431, pruned_loss=0.04353, over 4695133.90 frames. ], batch size: 58, lr: 3.54e-03, grad_scale: 16.0 2023-10-02 19:48:57,210 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.out_combiner.scale_min, batch_count=998266.6666666666, ans=0.2 2023-10-02 19:48:59,861 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7615_210_4211_1_1528110965_4580160_72-235211-0 from training. Duration: 0.76 2023-10-02 19:48:59,909 WARNING [train.py:1204] (2/4) Exclude cut with ID _47790_210_16187_1_1533862814066_4207519_155-90874-0_sp1.1 from training. Duration: 0.936375 2023-10-02 19:49:05,962 WARNING [train.py:1204] (2/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_164-216310-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 19:49:07,355 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_46013_210_7234_1_1533776362_3744032_463-52378-0 from training. Duration: 0.86 2023-10-02 19:49:07,358 WARNING [train.py:1204] (2/4) Exclude cut with ID _61133_210_9969_1_1535196777227_3696409_333-270325-0 from training. Duration: 0.93 2023-10-02 19:49:12,087 WARNING [train.py:1204] (2/4) Exclude cut with ID _5172_210_1883_1_1527246062567_3577219_18-101380-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 19:49:12,095 WARNING [train.py:1204] (2/4) Exclude cut with ID _30800_210_22382_1_1532499347904_4515959_577-236858-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 19:49:13,275 WARNING [train.py:1204] (2/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_1006-177345-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 19:49:16,073 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6291_210_3273_1_1527683649_1078498_4-266348-0 from training. Duration: 0.78 2023-10-02 19:49:17,662 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.feed_forward1.hidden_balancer.prob, batch_count=998333.3333333334, ans=0.125 2023-10-02 19:49:20,622 WARNING [train.py:1204] (2/4) Exclude cut with ID _55857_210_2346_1_1534644007831_6898209_745-98391-0 from training. Duration: 0.76 2023-10-02 19:49:22,033 WARNING [train.py:1204] (2/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_375-244976-0 from training. Duration: 0.75 2023-10-02 19:49:22,080 WARNING [train.py:1204] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_4-248404-0_sp1.1 from training. Duration: 0.97275 2023-10-02 19:49:22,975 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.0.layers.1.feed_forward1.out_whiten, num_groups=1, num_channels=192, metric=4.76 vs. limit=15.0 2023-10-02 19:49:23,451 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8453_210_4211_1_1528502940_8562730_596-306820-0 from training. Duration: 0.97 2023-10-02 19:49:25,498 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_211-339622-0_sp0.9 from training. Duration: 0.5 2023-10-02 19:49:25,539 WARNING [train.py:1204] (2/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_393-57888-0 from training. Duration: 0.64 2023-10-02 19:49:28,320 WARNING [train.py:1204] (2/4) Exclude cut with ID _23825_210_13449_1_1531915313526_3497150_143-343575-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 19:49:28,385 WARNING [train.py:1204] (2/4) Exclude cut with ID _58605_210_12210_1_1534723195939_3904259_36-12256-0_sp1.1 from training. Duration: 0.936375 2023-10-02 19:49:35,940 WARNING [train.py:1204] (2/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_38-67795-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 19:49:37,294 WARNING [train.py:1204] (2/4) Exclude cut with ID _64631_210_27767_1_1535349580068_4165160_308-327832-0_sp1.1 from training. Duration: 0.92725 2023-10-02 19:49:37,340 WARNING [train.py:1204] (2/4) Exclude cut with ID _37086_210_9038_1_1532999540891_7021250_726-81176-0_sp1.1 from training. Duration: 0.936375 2023-10-02 19:49:38,628 WARNING [train.py:1204] (2/4) Exclude cut with ID _47790_210_16187_1_1533862814066_4207519_47-141521-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 19:49:38,632 WARNING [train.py:1204] (2/4) Exclude cut with ID _3067_210_2667_1_1526212974640_4503168_250-285937-0 from training. Duration: 0.8 2023-10-02 19:49:38,655 WARNING [train.py:1204] (2/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_239-24000-0_sp1.1 from training. Duration: 0.97275 2023-10-02 19:49:40,052 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_3371_210_1794_1_1526384462_5580777_396-339812-0_sp0.9 from training. Duration: 0.9444375 2023-10-02 19:49:40,113 WARNING [train.py:1204] (2/4) Exclude cut with ID _16909_210_5724_1_1531292079912_4118919_580-173454-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 19:49:41,895 WARNING [train.py:1204] (2/4) Exclude cut with ID IC0124W0002-61308 from training. Duration: 0.982 2023-10-02 19:49:44,648 WARNING [train.py:1204] (2/4) Exclude cut with ID _31602_210_5854_1_1532677021456_4458250_249-96604-0 from training. Duration: 0.89 2023-10-02 19:49:46,046 WARNING [train.py:1204] (2/4) Exclude cut with ID _69584_210_5219_1_1535785133775_4044450_325-7656-0 from training. Duration: 0.86 2023-10-02 19:49:47,449 WARNING [train.py:1204] (2/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_478-162247-0 from training. Duration: 0.82 2023-10-02 19:49:48,770 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.532e+02 1.876e+02 2.032e+02 2.220e+02 3.868e+02, threshold=4.064e+02, percent-clipped=0.0 2023-10-02 19:49:48,861 WARNING [train.py:1204] (2/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_686-54383-0_sp1.1 from training. Duration: 0.7909375 2023-10-02 19:49:49,141 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.feed_forward1.hidden_balancer.prob, batch_count=998466.6666666666, ans=0.125 2023-10-02 19:49:56,637 WARNING [train.py:1204] (2/4) Exclude cut with ID _29303_210_2575_1_1532392237303_7410649_85-270092-0_sp1.1 from training. Duration: 0.936375 2023-10-02 19:49:56,647 WARNING [train.py:1204] (2/4) Exclude cut with ID _2322_210_2667_1_1525604548971_3656299_440-80494-0_sp1.1 from training. Duration: 0.8 2023-10-02 19:49:56,676 WARNING [train.py:1204] (2/4) Exclude cut with ID _47346_210_9476_1_1533815971553_3536160_201-40644-0_sp1.1 from training. Duration: 0.936375 2023-10-02 19:49:58,073 WARNING [train.py:1204] (2/4) Exclude cut with ID _56235_210_10640_1_1534557335235_4161099_343-139898-0_sp1.1 from training. Duration: 0.87275 2023-10-02 19:49:58,330 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.nonlin_attention.balancer.prob, batch_count=998533.3333333334, ans=0.125 2023-10-02 19:49:59,585 WARNING [train.py:1204] (2/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_1012-279473-0 from training. Duration: 0.67 2023-10-02 19:49:59,675 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7931_210_1794_1_1528594728_5564059_280-279560-0_sp1.1 from training. Duration: 0.8909375 2023-10-02 19:49:59,716 WARNING [train.py:1204] (2/4) Exclude cut with ID _8263_210_1664_1_1528439452891_970220_68-181805-0 from training. Duration: 0.64 2023-10-02 19:49:59,753 WARNING [train.py:1204] (2/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_605-133395-0 from training. Duration: 0.66 2023-10-02 19:50:01,111 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_43-171109-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 19:50:01,114 WARNING [train.py:1204] (2/4) Exclude cut with ID _40094_210_16380_1_1533294005773_3641159_107-280739-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 19:50:03,855 WARNING [train.py:1204] (2/4) Exclude cut with ID _14821_210_8750_1_1530680503991_6829359_2-227151-0_sp0.9 from training. Duration: 0.92225 2023-10-02 19:50:07,054 WARNING [train.py:1204] (2/4) Exclude cut with ID _14268_210_9206_1_1530622771285_4714099_331-105351-0_sp1.1 from training. Duration: 0.5909375 2023-10-02 19:50:08,409 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_26326_210_7925_1_1533002493_3592406_451-82741-0_sp1.1 from training. Duration: 0.97275 2023-10-02 19:50:11,772 INFO [train.py:1046] (2/4) Epoch 29, batch 1050, loss[loss=0.1722, simple_loss=0.2556, pruned_loss=0.04439, over 23257.00 frames. ], tot_loss[loss=0.1644, simple_loss=0.2418, pruned_loss=0.0435, over 4698582.67 frames. ], batch size: 93, lr: 3.54e-03, grad_scale: 16.0 2023-10-02 19:50:11,809 WARNING [train.py:1204] (2/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_191-59784-0_sp0.9 from training. Duration: 0.97775 2023-10-02 19:50:11,897 WARNING [train.py:1204] (2/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_445-117398-0_sp1.1 from training. Duration: 0.8090625 2023-10-02 19:50:13,314 WARNING [train.py:1204] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_605-257205-0_sp0.9 from training. Duration: 0.6555625 2023-10-02 19:50:13,379 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_50050_210_16223_1_1535435796_7290138_592-258841-0_sp1.1 from training. Duration: 0.936375 2023-10-02 19:50:13,449 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder_embed.convnext.hidden_balancer.prob, batch_count=998600.0, ans=0.125 2023-10-02 19:50:16,433 WARNING [train.py:1204] (2/4) Exclude cut with ID _14348_210_8341_1_1530599434996_3778640_240-232370-0_sp0.9 from training. Duration: 0.9333125 2023-10-02 19:50:17,185 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.3.encoder.layers.1.feed_forward2.out_whiten, num_groups=1, num_channels=512, metric=5.14 vs. limit=15.0 2023-10-02 19:50:17,896 WARNING [train.py:1204] (2/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_239-316802-0_sp0.9 from training. Duration: 0.7666875 2023-10-02 19:50:21,017 WARNING [train.py:1204] (2/4) Exclude cut with ID _45560_210_21982_1_1533812603951_3584999_111-6376-0_sp1.1 from training. Duration: 0.763625 2023-10-02 19:50:22,485 WARNING [train.py:1204] (2/4) Exclude cut with ID _54189_210_16380_1_1534332670039_3911710_606-311481-0_sp1.1 from training. Duration: 0.963625 2023-10-02 19:50:23,812 WARNING [train.py:1204] (2/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_237-332027-0_sp1.1 from training. Duration: 0.863625 2023-10-02 19:50:23,836 WARNING [train.py:1204] (2/4) Exclude cut with ID _66070_210_7640_1_1535441393096_7805310_489-292631-0_sp0.9 from training. Duration: 0.87775 2023-10-02 19:50:25,206 WARNING [train.py:1204] (2/4) Exclude cut with ID _49728_210_12651_1_1534041034294_6788150_624-195301-0_sp1.1 from training. Duration: 0.77275 2023-10-02 19:50:25,373 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=998666.6666666666, ans=0.1 2023-10-02 19:50:25,441 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.bypass.skip_rate, batch_count=998666.6666666666, ans=0.09899494936611666 2023-10-02 19:50:26,804 WARNING [train.py:1204] (2/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_44-79427-0 from training. Duration: 0.98 2023-10-02 19:50:26,877 WARNING [train.py:1204] (2/4) Exclude cut with ID _35350_210_16040_1_1533040191183_4075299_490-198354-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 19:50:27,018 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.conv_skip_rate, batch_count=998666.6666666666, ans=0.0 2023-10-02 19:50:28,313 WARNING [train.py:1204] (2/4) Exclude cut with ID _5166_210_2084_1_1527073197160_4087993_309-222545-0 from training. Duration: 0.84 2023-10-02 19:50:28,496 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.feed_forward1.hidden_balancer.prob, batch_count=998666.6666666666, ans=0.125 2023-10-02 19:50:31,086 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.ff3_skip_rate, batch_count=998666.6666666666, ans=0.0 2023-10-02 19:50:32,235 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_13091_210_7925_1_1530609447_8987354_790-199469-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 19:50:32,241 WARNING [train.py:1204] (2/4) Exclude cut with ID _21632_210_2253_1_1531994001765_3744140_110-274654-0 from training. Duration: 0.82 2023-10-02 19:50:32,247 WARNING [train.py:1204] (2/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_498-40153-0_sp0.9 from training. Duration: 0.7 2023-10-02 19:50:33,835 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.self_attn_weights.pos_emb_skip_rate, batch_count=998666.6666666666, ans=0.0 2023-10-02 19:50:36,550 WARNING [train.py:1204] (2/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_363-282909-0_sp1.1 from training. Duration: 0.936375 2023-10-02 19:50:38,577 WARNING [train.py:1204] (2/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_178-177582-0_sp0.9 from training. Duration: 0.82225 2023-10-02 19:50:38,595 WARNING [train.py:1204] (2/4) Exclude cut with ID _24612_210_8913_1_1531998054193_3806359_357-35487-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 19:50:41,306 WARNING [train.py:1204] (2/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_138-332773-0 from training. Duration: 0.81 2023-10-02 19:50:41,323 WARNING [train.py:1204] (2/4) Exclude cut with ID _7624_210_1531_1_1528109802198_4933101_418-349834-0 from training. Duration: 0.6 2023-10-02 19:50:42,556 WARNING [train.py:1204] (2/4) Exclude cut with ID _7537_210_2084_1_1528502395431_3924359_14-312906-0_sp0.9 from training. Duration: 0.9333125 2023-10-02 19:50:44,691 WARNING [train.py:1204] (2/4) Exclude cut with ID _60057_210_10640_1_1534903065793_3333150_362-230377-0 from training. Duration: 0.87 2023-10-02 19:50:46,349 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.ff3_skip_rate, batch_count=998733.3333333334, ans=0.0 2023-10-02 19:50:48,085 WARNING [train.py:1204] (2/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_138-64210-0 from training. Duration: 0.73 2023-10-02 19:50:50,026 WARNING [train.py:1204] (2/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_454-339504-0_sp1.1 from training. Duration: 0.97275 2023-10-02 19:50:52,795 WARNING [train.py:1204] (2/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_508-335369-0_sp0.9 from training. Duration: 0.5333125 2023-10-02 19:50:54,163 WARNING [train.py:1204] (2/4) Exclude cut with ID _5608_210_2106_1_1527415439089_6819768_909-82767-0_sp0.9 from training. Duration: 0.67775 2023-10-02 19:50:55,507 WARNING [train.py:1204] (2/4) Exclude cut with ID _6694_210_4599_1_1527852637490_3965529_99-11611-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 19:50:55,571 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_2655_210_2087_1_1525688959_3897400_52-279899-0_sp1.1 from training. Duration: 0.62725 2023-10-02 19:50:58,539 WARNING [train.py:1204] (2/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_375-245677-0_sp1.1 from training. Duration: 0.736375 2023-10-02 19:51:02,568 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_23850_210_4238_1_1532232597_1632433_32-185135-0 from training. Duration: 0.86 2023-10-02 19:51:03,911 WARNING [train.py:1204] (2/4) Exclude cut with ID _15113_210_8254_1_1530842438579_3716279_209-54533-0 from training. Duration: 0.96 2023-10-02 19:51:03,949 WARNING [train.py:1204] (2/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_401-53423-0 from training. Duration: 0.55 2023-10-02 19:51:05,266 WARNING [train.py:1204] (2/4) Exclude cut with ID _14867_210_1968_1_1530684179890_3846969_319-250868-0_sp1.1 from training. Duration: 0.9 2023-10-02 19:51:05,279 WARNING [train.py:1204] (2/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_106-30782-0_sp0.9 from training. Duration: 0.888875 2023-10-02 19:51:06,678 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_5960_210_1794_1_1527403865_3990999_515-2613-0 from training. Duration: 0.88 2023-10-02 19:51:08,354 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.feed_forward2.hidden_balancer.prob, batch_count=998800.0, ans=0.125 2023-10-02 19:51:11,224 WARNING [train.py:1204] (2/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_798-106179-0_sp0.9 from training. Duration: 0.92225 2023-10-02 19:51:13,927 WARNING [train.py:1204] (2/4) Exclude cut with ID _14227_210_4381_1_1530701804286_3940259_125-275642-0_sp1.1 from training. Duration: 0.9 2023-10-02 19:51:13,929 WARNING [train.py:1204] (2/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_79-200392-0_sp1.1 from training. Duration: 0.8818125 2023-10-02 19:51:13,975 WARNING [train.py:1204] (2/4) Exclude cut with ID _48836_210_12210_1_1534075848293_3904150_429-35475-0_sp0.9 from training. Duration: 0.988875 2023-10-02 19:51:14,617 WARNING [train.py:1204] (2/4) Exclude cut with ID _69884_210_9771_1_1535846392818_4166169_224-172517-0_sp1.1 from training. Duration: 0.97275 2023-10-02 19:51:19,274 WARNING [train.py:1204] (2/4) Exclude cut with ID _15113_210_8254_1_1530842438579_3716279_76-232507-0_sp1.1 from training. Duration: 0.97275 2023-10-02 19:51:19,277 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_135-5501-0 from training. Duration: 0.98 2023-10-02 19:51:21,128 WARNING [train.py:1204] (2/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_563-347832-0_sp0.9 from training. Duration: 0.988875 2023-10-02 19:51:21,138 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6786_210_1388_1_1527839699_4194495_273-149841-0 from training. Duration: 0.92 2023-10-02 19:51:21,161 WARNING [train.py:1204] (2/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_172-215932-0 from training. Duration: 0.66 2023-10-02 19:51:21,350 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.feed_forward3.hidden_balancer.prob, batch_count=998866.6666666666, ans=0.125 2023-10-02 19:51:22,430 WARNING [train.py:1204] (2/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_939-132910-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 19:51:24,088 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.feed_forward1.out_proj.dropout_p, batch_count=998866.6666666666, ans=0.1 2023-10-02 19:51:26,503 INFO [train.py:1046] (2/4) Epoch 29, batch 1100, loss[loss=0.1653, simple_loss=0.2256, pruned_loss=0.05251, over 18996.00 frames. ], tot_loss[loss=0.164, simple_loss=0.2416, pruned_loss=0.04319, over 4699751.34 frames. ], batch size: 388, lr: 3.54e-03, grad_scale: 16.0 2023-10-02 19:51:26,595 WARNING [train.py:1204] (2/4) Exclude cut with ID _49371_210_12071_1_1533952761021_3812190_16-246001-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 19:51:30,749 WARNING [train.py:1204] (2/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_680-189165-0_sp1.1 from training. Duration: 0.936375 2023-10-02 19:51:34,859 WARNING [train.py:1204] (2/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_53-300544-0_sp1.1 from training. Duration: 0.6090625 2023-10-02 19:51:35,105 INFO [scaling.py:1118] (2/4) WithLoss: name=encoder.encoders.3.encoder.layers.2.self_attn_weights, loss-sum=0.000e+00 2023-10-02 19:51:36,149 WARNING [train.py:1204] (2/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_540-275685-0_sp1.1 from training. Duration: 0.8090625 2023-10-02 19:51:37,411 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_38965_210_4852_1_1533174906_3484735_453-106579-0_sp1.1 from training. Duration: 0.92725 2023-10-02 19:51:37,457 WARNING [train.py:1204] (2/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_105-328174-0 from training. Duration: 0.76 2023-10-02 19:51:37,563 WARNING [train.py:1204] (2/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_279-126205-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 19:51:39,862 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.0.layers.0.feed_forward1.out_whiten, num_groups=1, num_channels=192, metric=10.65 vs. limit=15.0 2023-10-02 19:51:40,172 WARNING [train.py:1204] (2/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_5-55893-0_sp0.9 from training. Duration: 0.611125 2023-10-02 19:51:41,606 WARNING [train.py:1204] (2/4) Exclude cut with ID _13678_210_7497_1_1530530069226_3463170_141-10835-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 19:51:45,256 WARNING [train.py:1204] (2/4) Exclude cut with ID _22083_210_8254_1_1531702862439_3678970_201-85738-0_sp1.1 from training. Duration: 0.8181875 2023-10-02 19:51:45,277 WARNING [train.py:1204] (2/4) Exclude cut with ID _4697_210_2069_1_1526815801953_3823098_51-1049-0 from training. Duration: 0.75 2023-10-02 19:51:46,606 WARNING [train.py:1204] (2/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_206-10097-0_sp0.9 from training. Duration: 0.5444375 2023-10-02 19:51:46,692 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_4528_210_3273_1_1527076654_3276777_360-63661-0_sp1.1 from training. Duration: 0.92725 2023-10-02 19:51:46,706 WARNING [train.py:1204] (2/4) Exclude cut with ID _64086_210_7994_1_1535270417019_7435949_449-31567-0_sp1.1 from training. Duration: 0.8818125 2023-10-02 19:51:50,141 WARNING [train.py:1204] (2/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_245-174553-0_sp0.9 from training. Duration: 0.97775 2023-10-02 19:51:50,414 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.feed_forward1.out_proj.dropout_p, batch_count=999000.0, ans=0.1 2023-10-02 19:51:51,595 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6545_210_2040_1_1527774670_4245258_379-14182-0_sp1.1 from training. Duration: 0.72725 2023-10-02 19:51:54,544 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.conv_module2.balancer1.prob, batch_count=999066.6666666666, ans=0.125 2023-10-02 19:51:55,713 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_3133_210_2087_1_1526106446_2324690_256-206070-0_sp1.1 from training. Duration: 0.963625 2023-10-02 19:51:59,753 WARNING [train.py:1204] (2/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_278-104676-0 from training. Duration: 0.98 2023-10-02 19:52:01,123 WARNING [train.py:1204] (2/4) Exclude cut with ID IC0671W0065-331908 from training. Duration: 0.896 2023-10-02 19:52:02,492 WARNING [train.py:1204] (2/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_244-129917-0_sp1.1 from training. Duration: 0.97275 2023-10-02 19:52:02,797 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.conv_module1.balancer2.prob, batch_count=999066.6666666666, ans=0.125 2023-10-02 19:52:03,924 WARNING [train.py:1204] (2/4) Exclude cut with ID _7852_210_5219_1_1528286471316_8139400_1129-170857-0_sp1.1 from training. Duration: 0.97275 2023-10-02 19:52:04,714 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.2.encoder.layers.2.nonlin_attention.whiten2, num_groups=1, num_channels=384, metric=6.50 vs. limit=15.0 2023-10-02 19:52:05,308 WARNING [train.py:1204] (2/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_443-30034-0_sp0.9 from training. Duration: 0.711125 2023-10-02 19:52:05,346 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_31583_210_3852_1_1532595851_4024413_250-332999-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 19:52:06,639 WARNING [train.py:1204] (2/4) Exclude cut with ID _8772_210_1850_1_1528635595972_3664011_247-336448-0 from training. Duration: 0.74 2023-10-02 19:52:06,789 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.conv_skip_rate, batch_count=999066.6666666666, ans=0.0 2023-10-02 19:52:07,948 WARNING [train.py:1204] (2/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_567-135114-0_sp0.9 from training. Duration: 0.9666875 2023-10-02 19:52:07,953 WARNING [train.py:1204] (2/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_557-349534-0_sp1.1 from training. Duration: 0.82725 2023-10-02 19:52:07,971 WARNING [train.py:1204] (2/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_1341-26728-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 19:52:08,019 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_40222_210_6228_1_1533385298_4481939_375-328051-0_sp1.1 from training. Duration: 0.97275 2023-10-02 19:52:09,337 WARNING [train.py:1204] (2/4) Exclude cut with ID _4697_210_2069_1_1526815801953_3823098_276-96593-0 from training. Duration: 0.77 2023-10-02 19:52:12,856 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.2.encoder.layers.2.feed_forward1.out_whiten, num_groups=1, num_channels=384, metric=11.18 vs. limit=15.0 2023-10-02 19:52:16,368 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=999133.3333333334, ans=0.1 2023-10-02 19:52:17,272 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.440e+02 1.856e+02 2.110e+02 2.423e+02 3.915e+02, threshold=4.220e+02, percent-clipped=0.0 2023-10-02 19:52:17,385 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_53071_210_16554_1_1534490405_2954066_265-170472-0_sp0.9 from training. Duration: 0.92225 2023-10-02 19:52:17,404 WARNING [train.py:1204] (2/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_365-1492-0 from training. Duration: 0.91 2023-10-02 19:52:18,882 WARNING [train.py:1204] (2/4) Exclude cut with ID _66873_210_5517_1_1535454136506_4293370_654-52713-0_sp0.9 from training. Duration: 0.8555625 2023-10-02 19:52:24,206 WARNING [train.py:1204] (2/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_113-92608-0_sp1.1 from training. Duration: 0.4909375 2023-10-02 19:52:26,889 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_46013_210_7234_1_1533776362_3744032_50-141535-0 from training. Duration: 0.99 2023-10-02 19:52:26,904 WARNING [train.py:1204] (2/4) Exclude cut with ID _13942_210_9988_1_1530604906036_3733360_499-55676-0_sp1.1 from training. Duration: 0.563625 2023-10-02 19:52:28,290 WARNING [train.py:1204] (2/4) Exclude cut with ID _30800_210_22382_1_1532499347904_4515959_375-65143-0_sp1.1 from training. Duration: 0.97275 2023-10-02 19:52:30,932 WARNING [train.py:1204] (2/4) Exclude cut with ID _16756_210_4915_1_1531047803763_7104259_199-325608-0_sp1.1 from training. Duration: 0.92725 2023-10-02 19:52:30,960 WARNING [train.py:1204] (2/4) Exclude cut with ID _31649_210_6228_1_1532584608063_4091078_443-124391-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 19:52:32,231 WARNING [train.py:1204] (2/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_146-218763-0 from training. Duration: 0.86 2023-10-02 19:52:32,269 WARNING [train.py:1204] (2/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_567-135114-0_sp1.1 from training. Duration: 0.7909375 2023-10-02 19:52:33,608 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_26326_210_7925_1_1533002493_3592406_462-288299-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 19:52:33,697 WARNING [train.py:1204] (2/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_322-217220-0 from training. Duration: 0.86 2023-10-02 19:52:34,901 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_238-256668-0_sp1.1 from training. Duration: 0.62725 2023-10-02 19:52:34,959 WARNING [train.py:1204] (2/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_371-18169-0 from training. Duration: 0.86 2023-10-02 19:52:36,303 WARNING [train.py:1204] (2/4) Exclude cut with ID _58480_210_12392_1_1534811121111_3646160_23-74512-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 19:52:36,315 WARNING [train.py:1204] (2/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_784-213099-0_sp1.1 from training. Duration: 0.8454375 2023-10-02 19:52:37,668 WARNING [train.py:1204] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_625-102955-0_sp1.1 from training. Duration: 0.7 2023-10-02 19:52:38,444 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.3.encoder.layers.2.feed_forward3.out_whiten, num_groups=1, num_channels=512, metric=8.68 vs. limit=15.0 2023-10-02 19:52:39,034 INFO [train.py:1046] (2/4) Epoch 29, batch 1150, loss[loss=0.1704, simple_loss=0.2589, pruned_loss=0.04098, over 24560.00 frames. ], tot_loss[loss=0.1646, simple_loss=0.2421, pruned_loss=0.04349, over 4695144.74 frames. ], batch size: 71, lr: 3.54e-03, grad_scale: 16.0 2023-10-02 19:52:40,629 WARNING [train.py:1204] (2/4) Exclude cut with ID _49748_210_24116_1_1533986971737_3158910_176-174327-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 19:52:43,356 WARNING [train.py:1204] (2/4) Exclude cut with ID _50310_210_15488_1_1534068043345_3641080_432-96140-0_sp1.1 from training. Duration: 0.936375 2023-10-02 19:52:46,572 WARNING [train.py:1204] (2/4) Exclude cut with ID _40402_210_18219_1_1533522324115_2953150_76-14910-0_sp1.1 from training. Duration: 0.92725 2023-10-02 19:52:46,598 WARNING [train.py:1204] (2/4) Exclude cut with ID _21783_210_11357_1_1531742458730_3762679_40-19458-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 19:52:46,640 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_14056_210_4163_1_1530604483_7572827_46-93547-0 from training. Duration: 0.95 2023-10-02 19:52:46,678 WARNING [train.py:1204] (2/4) Exclude cut with ID _21631_210_2253_1_1531907880778_3160339_321-35325-0_sp1.1 from training. Duration: 0.963625 2023-10-02 19:52:48,814 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=999266.6666666666, ans=0.1 2023-10-02 19:52:50,476 WARNING [train.py:1204] (2/4) Exclude cut with ID _4779_210_2084_1_1527318022944_3914893_500-285013-0 from training. Duration: 0.69 2023-10-02 19:52:52,344 WARNING [train.py:1204] (2/4) Exclude cut with ID _43552_210_8913_1_1533726031059_3523429_301-93324-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 19:52:52,356 WARNING [train.py:1204] (2/4) Exclude cut with ID _6473_210_1741_1_1527901307396_3815380_108-17335-0_sp1.1 from training. Duration: 0.5909375 2023-10-02 19:52:57,813 WARNING [train.py:1204] (2/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_206-10097-0 from training. Duration: 0.49 2023-10-02 19:52:59,274 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_36756_210_7234_1_1533026991_966276_103-77513-0_sp1.1 from training. Duration: 0.97275 2023-10-02 19:53:03,245 WARNING [train.py:1204] (2/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_662-18193-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 19:53:03,301 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_26433_210_12108_1_1532157158_4056480_439-52438-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 19:53:03,338 WARNING [train.py:1204] (2/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_464-33931-0 from training. Duration: 0.54 2023-10-02 19:53:03,352 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_3474_210_2028_1_1526188513_4825246_145-309131-0_sp0.9 from training. Duration: 0.82225 2023-10-02 19:53:04,655 WARNING [train.py:1204] (2/4) Exclude cut with ID _9679_210_7497_1_1529129212906_4363929_446-308321-0_sp1.1 from training. Duration: 0.963625 2023-10-02 19:53:07,497 WARNING [train.py:1204] (2/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_662-275621-0 from training. Duration: 0.42 2023-10-02 19:53:07,573 WARNING [train.py:1204] (2/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_344-337148-0_sp1.1 from training. Duration: 0.97275 2023-10-02 19:53:08,955 WARNING [train.py:1204] (2/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_759-179071-0_sp1.1 from training. Duration: 0.92725 2023-10-02 19:53:20,956 WARNING [train.py:1204] (2/4) Exclude cut with ID _28783_210_7943_1_1532671164808_7155479_293-12188-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 19:53:25,023 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.3.encoder.layers.0.conv_module2.whiten, num_groups=1, num_channels=512, metric=3.19 vs. limit=15.0 2023-10-02 19:53:27,104 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_17613_210_2151_1_1531137569_2366160_235-320304-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 19:53:28,382 WARNING [train.py:1204] (2/4) Exclude cut with ID _14349_210_8341_1_1530615710818_3442470_170-336541-0 from training. Duration: 0.86 2023-10-02 19:53:28,454 WARNING [train.py:1204] (2/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_375-218677-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 19:53:28,489 WARNING [train.py:1204] (2/4) Exclude cut with ID _28317_210_12742_1_1532480302381_8510325_829-202956-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 19:53:33,965 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9029W0436-509823 from training. Duration: 0.768 2023-10-02 19:53:36,655 WARNING [train.py:1204] (2/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_231-112214-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 19:53:41,131 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.bypass.skip_rate, batch_count=999533.3333333334, ans=0.04949747468305833 2023-10-02 19:53:42,411 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9059W0470-524883 from training. Duration: 0.3415625 2023-10-02 19:53:45,147 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_43266_210_1794_1_1533521789_4497218_206-79512-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 19:53:45,306 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.1.feed_forward2.hidden_balancer.prob, batch_count=999533.3333333334, ans=0.125 2023-10-02 19:53:47,092 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_294-163793-0_sp0.9 from training. Duration: 0.911125 2023-10-02 19:53:47,118 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6290_210_3273_1_1527595224_1282787_115-87095-0_sp1.1 from training. Duration: 0.5 2023-10-02 19:53:47,152 WARNING [train.py:1204] (2/4) Exclude cut with ID _14100_210_8341_1_1530529320613_3678069_279-55760-0_sp1.1 from training. Duration: 0.8454375 2023-10-02 19:53:50,370 WARNING [train.py:1204] (2/4) Exclude cut with ID _14621_210_1925_1_1530686901185_3675420_419-234200-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 19:53:51,427 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.0.layers.0.self_attn1.whiten, num_groups=1, num_channels=192, metric=11.53 vs. limit=22.5 2023-10-02 19:53:53,719 INFO [train.py:1046] (2/4) Epoch 29, batch 1200, loss[loss=0.1548, simple_loss=0.2383, pruned_loss=0.03564, over 24497.00 frames. ], tot_loss[loss=0.1649, simple_loss=0.2428, pruned_loss=0.04343, over 4708334.61 frames. ], batch size: 63, lr: 3.54e-03, grad_scale: 32.0 2023-10-02 19:53:55,056 WARNING [train.py:1204] (2/4) Exclude cut with ID _60229_210_27767_1_1534838406158_3655350_353-213656-0_sp1.1 from training. Duration: 0.863625 2023-10-02 19:53:55,074 WARNING [train.py:1204] (2/4) Exclude cut with ID _48678_210_5663_1_1533988830947_3769199_306-255886-0_sp1.1 from training. Duration: 0.7 2023-10-02 19:53:55,190 WARNING [train.py:1204] (2/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_466-3492-0_sp1.1 from training. Duration: 0.97275 2023-10-02 19:53:55,190 WARNING [train.py:1204] (2/4) Exclude cut with ID _8755_210_1876_1_1528628559357_3258209_199-242167-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 19:53:56,500 WARNING [train.py:1204] (2/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_237-89684-0_sp1.1 from training. Duration: 0.87275 2023-10-02 19:53:57,792 WARNING [train.py:1204] (2/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_70-84121-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 19:54:00,464 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7544_210_6753_1_1528587722_3576686_172-171750-0_sp1.1 from training. Duration: 0.5909375 2023-10-02 19:54:01,743 WARNING [train.py:1204] (2/4) Exclude cut with ID _24271_210_12658_1_1531987270022_7190519_40-321822-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 19:54:01,760 WARNING [train.py:1204] (2/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_227-83612-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 19:54:03,163 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9156W0015-572508 from training. Duration: 0.896 2023-10-02 19:54:04,642 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_292-265928-0 from training. Duration: 0.51 2023-10-02 19:54:08,798 WARNING [train.py:1204] (2/4) Exclude cut with ID _14747_210_8341_1_1530685797463_3654089_147-131229-0_sp1.1 from training. Duration: 0.6909375 2023-10-02 19:54:10,208 WARNING [train.py:1204] (2/4) Exclude cut with ID _45297_210_2721_1_1533715351381_3264949_351-147136-0_sp0.9 from training. Duration: 0.9666875 2023-10-02 19:54:12,907 WARNING [train.py:1204] (2/4) Exclude cut with ID _5250_210_5219_1_1527249561806_8692110_1102-324568-0_sp1.1 from training. Duration: 0.97275 2023-10-02 19:54:14,901 WARNING [train.py:1204] (2/4) Exclude cut with ID _18516_210_9206_1_1531704653206_3692406_465-75446-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 19:54:14,903 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9203W0126-595623 from training. Duration: 0.896 2023-10-02 19:54:16,833 WARNING [train.py:1204] (2/4) Exclude cut with ID _55383_210_10635_1_1534468426172_2231669_139-328410-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 19:54:19,652 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.conv_skip_rate, batch_count=999666.6666666666, ans=0.0 2023-10-02 19:54:26,024 WARNING [train.py:1204] (2/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_166-246557-0_sp0.9 from training. Duration: 0.711125 2023-10-02 19:54:26,035 WARNING [train.py:1204] (2/4) Exclude cut with ID _30039_210_13270_1_1532484612900_2725150_239-304872-0_sp1.1 from training. Duration: 0.963625 2023-10-02 19:54:26,067 WARNING [train.py:1204] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_6-226750-0 from training. Duration: 0.94 2023-10-02 19:54:27,426 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_135-5501-0_sp1.1 from training. Duration: 0.8909375 2023-10-02 19:54:30,267 WARNING [train.py:1204] (2/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_359-118926-0 from training. Duration: 0.72 2023-10-02 19:54:34,427 WARNING [train.py:1204] (2/4) Exclude cut with ID _8064_210_5399_1_1528372299291_4282973_4-91037-0 from training. Duration: 0.88 2023-10-02 19:54:34,443 WARNING [train.py:1204] (2/4) Exclude cut with ID _6653_210_2629_1_1527919172441_6732679_415-288492-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 19:54:35,787 WARNING [train.py:1204] (2/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_544-287179-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 19:54:37,147 WARNING [train.py:1204] (2/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_122-32606-0_sp1.1 from training. Duration: 0.92725 2023-10-02 19:54:38,486 WARNING [train.py:1204] (2/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_121-284152-0_sp1.1 from training. Duration: 0.536375 2023-10-02 19:54:38,581 WARNING [train.py:1204] (2/4) Exclude cut with ID _44718_210_5816_1_1534050051869_6469030_350-299477-0_sp1.1 from training. Duration: 0.97275 2023-10-02 19:54:38,601 WARNING [train.py:1204] (2/4) Exclude cut with ID _6653_210_2629_1_1527919172441_6732679_1103-56445-0_sp0.9 from training. Duration: 0.811125 2023-10-02 19:54:39,958 WARNING [train.py:1204] (2/4) Exclude cut with ID _12369_210_4381_1_1530356078581_4388520_64-298917-0_sp0.9 from training. Duration: 0.97775 2023-10-02 19:54:41,261 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_2245_210_2715_1_1525511548_3540254_310-52483-0 from training. Duration: 0.88 2023-10-02 19:54:41,314 WARNING [train.py:1204] (2/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_401-158239-0_sp1.1 from training. Duration: 0.6454375 2023-10-02 19:54:41,363 WARNING [train.py:1204] (2/4) Exclude cut with ID _59502_210_12210_1_1535107024698_1835139_146-162495-0_sp1.1 from training. Duration: 0.836375 2023-10-02 19:54:41,373 WARNING [train.py:1204] (2/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_41-31157-0_sp0.9 from training. Duration: 0.6333125 2023-10-02 19:54:44,074 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_68717_210_3528_1_1535628683_1556759_95-36722-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 19:54:44,081 WARNING [train.py:1204] (2/4) Exclude cut with ID _30196_210_1664_1_1533708496522_7074140_654-162611-0_sp1.1 from training. Duration: 0.92725 2023-10-02 19:54:45,319 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.549e+02 1.823e+02 1.995e+02 2.246e+02 2.877e+02, threshold=3.989e+02, percent-clipped=0.0 2023-10-02 19:54:49,327 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_9302_210_2550_1_1528889962_4655416_638-301620-0_sp0.9 from training. Duration: 0.52225 2023-10-02 19:54:50,715 WARNING [train.py:1204] (2/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_478-162247-0_sp1.1 from training. Duration: 0.7454375 2023-10-02 19:54:53,903 WARNING [train.py:1204] (2/4) Exclude cut with ID _45297_210_2721_1_1533715351381_3264949_353-9541-0 from training. Duration: 0.97 2023-10-02 19:54:54,814 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.0.layers.1.conv_module2.whiten, num_groups=1, num_channels=192, metric=9.49 vs. limit=15.0 2023-10-02 19:54:58,464 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9653W0190-675468 from training. Duration: 0.9935 2023-10-02 19:54:59,845 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_49730_210_13049_1_1534075294_1830676_214-84436-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 19:55:02,595 WARNING [train.py:1204] (2/4) Exclude cut with ID _50093_210_4220_1_1534417141207_3432299_259-322252-0_sp1.1 from training. Duration: 0.836375 2023-10-02 19:55:03,094 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.4.encoder.layers.2.conv_module2.whiten, num_groups=1, num_channels=384, metric=4.27 vs. limit=15.0 2023-10-02 19:55:05,233 WARNING [train.py:1204] (2/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_599-50220-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 19:55:06,512 INFO [train.py:1046] (2/4) Epoch 29, batch 1250, loss[loss=0.149, simple_loss=0.2238, pruned_loss=0.03714, over 23549.00 frames. ], tot_loss[loss=0.1659, simple_loss=0.2438, pruned_loss=0.04399, over 4702443.63 frames. ], batch size: 149, lr: 3.54e-03, grad_scale: 16.0 2023-10-02 19:55:06,611 WARNING [train.py:1204] (2/4) Exclude cut with ID _21878_210_3525_1_1531738772002_7264330_174-109016-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 19:55:08,116 WARNING [train.py:1204] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_665-196155-0 from training. Duration: 0.67 2023-10-02 19:55:12,416 WARNING [train.py:1204] (2/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_656-188361-0_sp1.1 from training. Duration: 0.7909375 2023-10-02 19:55:13,704 WARNING [train.py:1204] (2/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_60-18123-0_sp1.1 from training. Duration: 0.8 2023-10-02 19:55:13,766 WARNING [train.py:1204] (2/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_535-14410-0 from training. Duration: 0.47 2023-10-02 19:55:16,381 WARNING [train.py:1204] (2/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_94-126128-0_sp1.1 from training. Duration: 0.7818125 2023-10-02 19:55:16,463 WARNING [train.py:1204] (2/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_356-31216-0_sp1.1 from training. Duration: 0.6818125 2023-10-02 19:55:23,307 WARNING [train.py:1204] (2/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_443-30034-0_sp1.1 from training. Duration: 0.5818125 2023-10-02 19:55:23,374 WARNING [train.py:1204] (2/4) Exclude cut with ID _26907_210_7234_1_1532221740694_3205263_337-2494-0_sp1.1 from training. Duration: 0.8 2023-10-02 19:55:24,724 WARNING [train.py:1204] (2/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_49-97158-0_sp0.9 from training. Duration: 0.8444375 2023-10-02 19:55:24,735 WARNING [train.py:1204] (2/4) Exclude cut with ID _7877_210_6014_1_1528286358099_3750136_553-170128-0_sp1.1 from training. Duration: 0.963625 2023-10-02 19:55:26,769 WARNING [train.py:1204] (2/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_202-293523-0_sp0.9 from training. Duration: 0.8 2023-10-02 19:55:31,336 WARNING [train.py:1204] (2/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_188-171673-0_sp0.9 from training. Duration: 0.6444375 2023-10-02 19:55:32,618 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_120-41668-0_sp1.1 from training. Duration: 0.636375 2023-10-02 19:55:32,630 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_40223_210_6228_1_1533298404_4812267_613-78769-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 19:55:34,012 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_53861_210_5399_1_1534422156_4272077_150-336899-0_sp1.1 from training. Duration: 0.963625 2023-10-02 19:55:34,073 WARNING [train.py:1204] (2/4) Exclude cut with ID _14459_210_2228_1_1531443641946_7516700_1146-56843-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 19:55:36,854 WARNING [train.py:1204] (2/4) Exclude cut with ID _39867_210_9826_1_1533553222030_9344259_1194-299928-0_sp1.1 from training. Duration: 0.92725 2023-10-02 19:55:36,933 WARNING [train.py:1204] (2/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_214-291369-0_sp0.9 from training. Duration: 0.7 2023-10-02 19:55:42,366 WARNING [train.py:1204] (2/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_358-215558-0 from training. Duration: 0.93 2023-10-02 19:55:43,588 WARNING [train.py:1204] (2/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_577-287150-0_sp0.9 from training. Duration: 0.911125 2023-10-02 19:55:45,086 WARNING [train.py:1204] (2/4) Exclude cut with ID _60860_210_8254_1_1534896015875_3557169_229-187367-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 19:55:45,619 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.4.encoder.layers.0.self_attn2.whiten, num_groups=1, num_channels=384, metric=14.02 vs. limit=22.5 2023-10-02 19:55:46,418 WARNING [train.py:1204] (2/4) Exclude cut with ID _6275_210_2084_1_1528110062213_7669049_857-298942-0 from training. Duration: 0.85 2023-10-02 19:55:46,471 WARNING [train.py:1204] (2/4) Exclude cut with ID _40137_210_12210_1_1533290484046_3795180_249-187059-0_sp1.1 from training. Duration: 0.8 2023-10-02 19:55:46,478 WARNING [train.py:1204] (2/4) Exclude cut with ID ID0171W0508-765603 from training. Duration: 0.541 2023-10-02 19:55:46,496 WARNING [train.py:1204] (2/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_521-219439-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 19:55:46,508 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_40223_210_6228_1_1533298404_4812267_741-302616-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 19:55:52,600 WARNING [train.py:1204] (2/4) Exclude cut with ID _65293_210_9070_1_1535545549765_4004210_438-154735-0_sp1.1 from training. Duration: 0.92725 2023-10-02 19:55:54,419 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.bypass.scale_min, batch_count=1000133.3333333334, ans=0.2 2023-10-02 19:55:55,972 WARNING [train.py:1204] (2/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_564-132950-0_sp1.1 from training. Duration: 0.92725 2023-10-02 19:55:56,012 WARNING [train.py:1204] (2/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_291-270881-0_sp1.1 from training. Duration: 0.8818125 2023-10-02 19:55:57,834 WARNING [train.py:1204] (2/4) Exclude cut with ID _60229_210_27767_1_1534838406158_3655350_353-213656-0 from training. Duration: 0.95 2023-10-02 19:55:59,140 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_3133_210_2087_1_1526106446_2324690_243-143119-0 from training. Duration: 0.77 2023-10-02 19:55:59,161 WARNING [train.py:1204] (2/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_690-126306-0 from training. Duration: 0.78 2023-10-02 19:56:01,923 WARNING [train.py:1204] (2/4) Exclude cut with ID _30304_210_6319_1_1532476827261_7582060_347-95003-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 19:56:03,399 WARNING [train.py:1204] (2/4) Exclude cut with ID _2322_210_2667_1_1525604548971_3656299_440-80494-0 from training. Duration: 0.88 2023-10-02 19:56:04,732 WARNING [train.py:1204] (2/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_370-229434-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 19:56:06,686 WARNING [train.py:1204] (2/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_124-345840-0_sp1.1 from training. Duration: 0.47275 2023-10-02 19:56:06,694 WARNING [train.py:1204] (2/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_165-123581-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 19:56:09,447 WARNING [train.py:1204] (2/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_453-282588-0 from training. Duration: 0.98 2023-10-02 19:56:09,475 WARNING [train.py:1204] (2/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_299-34413-0_sp0.9 from training. Duration: 0.72225 2023-10-02 19:56:10,801 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_40036_210_13405_1_1533648647_1315388_48-146016-0_sp1.1 from training. Duration: 0.8454375 2023-10-02 19:56:10,822 WARNING [train.py:1204] (2/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_99-167392-0_sp0.9 from training. Duration: 0.67775 2023-10-02 19:56:10,884 WARNING [train.py:1204] (2/4) Exclude cut with ID _48677_210_5663_1_1533902405919_3892989_37-207114-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 19:56:12,970 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.2.encoder.layers.1.self_attn1.whiten, num_groups=1, num_channels=384, metric=21.38 vs. limit=22.5 2023-10-02 19:56:13,502 WARNING [train.py:1204] (2/4) Exclude cut with ID _29857_210_20034_1_1532395886753_3798160_335-163562-0 from training. Duration: 0.85 2023-10-02 19:56:16,270 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_36492_210_7927_1_1533261987_4790834_703-343351-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 19:56:17,160 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.0.layers.1.conv_module1.whiten, num_groups=1, num_channels=192, metric=7.30 vs. limit=15.0 2023-10-02 19:56:17,584 WARNING [train.py:1204] (2/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_80-147301-0_sp0.9 from training. Duration: 0.9333125 2023-10-02 19:56:18,906 WARNING [train.py:1204] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_218-295097-0_sp0.9 from training. Duration: 0.8555625 2023-10-02 19:56:20,358 INFO [train.py:1046] (2/4) Epoch 29, batch 1300, loss[loss=0.1763, simple_loss=0.2592, pruned_loss=0.0467, over 23930.00 frames. ], tot_loss[loss=0.1667, simple_loss=0.2448, pruned_loss=0.04435, over 4702382.32 frames. ], batch size: 86, lr: 3.54e-03, grad_scale: 8.0 2023-10-02 19:56:21,804 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_2295_210_2087_1_1525500993_2407863_116-238894-0_sp1.1 from training. Duration: 0.636375 2023-10-02 19:56:23,348 WARNING [train.py:1204] (2/4) Exclude cut with ID _3231_210_2106_1_1526205864934_6784220_697-171068-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 19:56:24,008 WARNING [train.py:1204] (2/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_102-77064-0 from training. Duration: 0.93 2023-10-02 19:56:28,503 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_13091_210_7925_1_1530609447_8987354_1186-261957-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 19:56:31,521 WARNING [train.py:1204] (2/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_91-277684-0_sp1.1 from training. Duration: 0.67275 2023-10-02 19:56:32,929 WARNING [train.py:1204] (2/4) Exclude cut with ID _31088_210_16446_1_1532520012284_3440919_159-313751-0_sp1.1 from training. Duration: 0.936375 2023-10-02 19:56:34,319 WARNING [train.py:1204] (2/4) Exclude cut with ID _42326_210_7770_1_1533814282531_3707269_227-203721-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 19:56:36,460 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_3531_210_3528_1_1526294203_5216456_309-227302-0_sp1.1 from training. Duration: 0.62725 2023-10-02 19:56:36,516 WARNING [train.py:1204] (2/4) Exclude cut with ID _14087_210_2253_1_1530529224512_3014029_226-242140-0 from training. Duration: 0.81 2023-10-02 19:56:40,756 WARNING [train.py:1204] (2/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_122-109023-0_sp1.1 from training. Duration: 0.6909375 2023-10-02 19:56:42,183 WARNING [train.py:1204] (2/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_660-211088-0_sp0.9 from training. Duration: 0.688875 2023-10-02 19:56:43,519 WARNING [train.py:1204] (2/4) Exclude cut with ID _39290_210_4220_1_1533862778760_3075118_92-311600-0 from training. Duration: 0.88 2023-10-02 19:56:45,048 WARNING [train.py:1204] (2/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_390-339137-0_sp1.1 from training. Duration: 0.5090625 2023-10-02 19:56:48,981 WARNING [train.py:1204] (2/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_449-341946-0_sp1.1 from training. Duration: 0.963625 2023-10-02 19:56:51,604 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_68095_210_20185_1_1535718226_8888855_884-327007-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 19:56:51,691 WARNING [train.py:1204] (2/4) Exclude cut with ID _42326_210_7770_1_1533814282531_3707269_308-222998-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 19:56:52,433 WARNING [train.py:1204] (2/4) Exclude cut with ID _40399_210_4353_1_1533305658194_3279406_344-238708-0_sp1.1 from training. Duration: 0.963625 2023-10-02 19:56:53,650 WARNING [train.py:1204] (2/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_87-131427-0_sp1.1 from training. Duration: 0.7090625 2023-10-02 19:56:54,952 WARNING [train.py:1204] (2/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_110-130961-0_sp0.9 from training. Duration: 0.52225 2023-10-02 19:56:54,997 WARNING [train.py:1204] (2/4) Exclude cut with ID _55153_210_2253_1_1534384786677_3162096_148-79022-0 from training. Duration: 0.96 2023-10-02 19:57:01,163 WARNING [train.py:1204] (2/4) Exclude cut with ID _9679_210_7497_1_1529129212906_4363929_512-313889-0_sp1.1 from training. Duration: 0.736375 2023-10-02 19:57:01,181 WARNING [train.py:1204] (2/4) Exclude cut with ID _7970_210_1883_1_1528455616296_3472230_25-178407-0_sp1.1 from training. Duration: 0.6454375 2023-10-02 19:57:02,713 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_3133_210_2087_1_1526106446_2324690_246-125000-0 from training. Duration: 0.62 2023-10-02 19:57:04,502 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_15126_210_6753_1_1530778677_2591185_113-157651-0_sp0.9 from training. Duration: 0.7555625 2023-10-02 19:57:05,940 WARNING [train.py:1204] (2/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_105-63309-0_sp1.1 from training. Duration: 0.8909375 2023-10-02 19:57:06,224 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.feed_forward3.hidden_balancer.prob, batch_count=1000466.6666666666, ans=0.125 2023-10-02 19:57:09,223 WARNING [train.py:1204] (2/4) Exclude cut with ID _29877_210_6228_1_1532397791106_5276149_578-177271-0_sp1.1 from training. Duration: 0.936375 2023-10-02 19:57:09,282 WARNING [train.py:1204] (2/4) Exclude cut with ID _44831_210_13427_1_1533816011549_3689035_32-188147-0 from training. Duration: 0.96 2023-10-02 19:57:10,614 WARNING [train.py:1204] (2/4) Exclude cut with ID _12369_210_4381_1_1530356078581_4388520_300-254337-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 19:57:10,626 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_128-241008-0 from training. Duration: 0.94 2023-10-02 19:57:12,116 WARNING [train.py:1204] (2/4) Exclude cut with ID _21878_210_3525_1_1531738772002_7264330_53-279178-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 19:57:14,878 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_41452_210_7291_1_1533437687_3941281_273-334088-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 19:57:16,069 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.523e+02 1.865e+02 2.006e+02 2.214e+02 3.009e+02, threshold=4.012e+02, percent-clipped=0.0 2023-10-02 19:57:16,148 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_128-241008-0_sp1.1 from training. Duration: 0.8545625 2023-10-02 19:57:18,879 WARNING [train.py:1204] (2/4) Exclude cut with ID _8394_210_6702_1_1528590504316_3540189_379-49457-0 from training. Duration: 0.87 2023-10-02 19:57:18,947 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_105-114797-0 from training. Duration: 0.84 2023-10-02 19:57:20,357 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_14928_210_5632_1_1530773461_4714000_454-112198-0 from training. Duration: 0.66 2023-10-02 19:57:24,939 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_456-22141-0_sp1.1 from training. Duration: 0.8 2023-10-02 19:57:27,643 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_115-233929-0 from training. Duration: 0.72 2023-10-02 19:57:29,565 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_616-16795-0_sp1.1 from training. Duration: 0.963625 2023-10-02 19:57:35,524 INFO [train.py:1046] (2/4) Epoch 29, batch 1350, loss[loss=0.1538, simple_loss=0.2109, pruned_loss=0.04838, over 19349.00 frames. ], tot_loss[loss=0.1662, simple_loss=0.2434, pruned_loss=0.04445, over 4698734.73 frames. ], batch size: 388, lr: 3.54e-03, grad_scale: 8.0 2023-10-02 19:57:35,751 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.0.ff2_skip_rate, batch_count=1000600.0, ans=0.0 2023-10-02 19:57:37,364 WARNING [train.py:1204] (2/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_448-212135-0 from training. Duration: 0.91 2023-10-02 19:57:40,163 WARNING [train.py:1204] (2/4) Exclude cut with ID _46683_210_9642_1_1533730913492_4591116_201-205677-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 19:57:42,891 WARNING [train.py:1204] (2/4) Exclude cut with ID _64086_210_7994_1_1535270417019_7435949_43-80483-0_sp1.1 from training. Duration: 0.97275 2023-10-02 19:57:45,743 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_240-134124-0_sp1.1 from training. Duration: 0.963625 2023-10-02 19:57:45,764 WARNING [train.py:1204] (2/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_907-120940-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 19:57:47,168 WARNING [train.py:1204] (2/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_848-280957-0_sp1.1 from training. Duration: 0.7909375 2023-10-02 19:57:47,372 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.bypass_mid.scale_min, batch_count=1000600.0, ans=0.2 2023-10-02 19:57:48,489 WARNING [train.py:1204] (2/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_243-42116-0_sp0.9 from training. Duration: 0.811125 2023-10-02 19:57:51,400 WARNING [train.py:1204] (2/4) Exclude cut with ID _6694_210_4599_1_1527852637490_3965529_116-109398-0_sp0.9 from training. Duration: 0.811125 2023-10-02 19:57:53,411 WARNING [train.py:1204] (2/4) Exclude cut with ID _24314_210_4915_1_1532135078116_7170179_521-258868-0 from training. Duration: 0.79 2023-10-02 19:57:54,487 WARNING [train.py:1204] (2/4) Exclude cut with ID _2220_210_1819_1_1525509109898_3658680_50-99837-0_sp0.9 from training. Duration: 0.6 2023-10-02 19:57:54,539 WARNING [train.py:1204] (2/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_313-134930-0_sp1.1 from training. Duration: 0.8818125 2023-10-02 19:57:56,770 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.1.encoder.layers.1.self_attn_weights.whiten_keys, num_groups=4, num_channels=128, metric=2.10 vs. limit=6.0 2023-10-02 19:57:57,264 WARNING [train.py:1204] (2/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_125-144174-0 from training. Duration: 0.39 2023-10-02 19:57:57,319 WARNING [train.py:1204] (2/4) Exclude cut with ID _59288_210_5854_1_1535077757774_3412246_275-293897-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 19:57:58,658 WARNING [train.py:1204] (2/4) Exclude cut with ID _40519_210_13794_1_1533715228655_7337390_722-52880-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 19:57:58,678 WARNING [train.py:1204] (2/4) Exclude cut with ID _7537_210_2084_1_1528502395431_3924359_128-268029-0 from training. Duration: 0.98 2023-10-02 19:57:58,795 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.self_attn_weights.pos_emb_skip_rate, batch_count=1000666.6666666666, ans=0.0 2023-10-02 19:58:01,895 WARNING [train.py:1204] (2/4) Exclude cut with ID _8021_210_7994_1_1528376451358_3653069_179-45206-0 from training. Duration: 0.86 2023-10-02 19:58:03,891 WARNING [train.py:1204] (2/4) Exclude cut with ID _13252_210_1925_1_1530671372047_3336909_88-276668-0 from training. Duration: 0.99 2023-10-02 19:58:05,439 WARNING [train.py:1204] (2/4) Exclude cut with ID _22613_210_5219_1_1532253599686_4090170_290-212862-0_sp1.1 from training. Duration: 0.97275 2023-10-02 19:58:05,445 WARNING [train.py:1204] (2/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_465-214726-0 from training. Duration: 0.86 2023-10-02 19:58:05,774 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.balancer1.prob, batch_count=1000733.3333333334, ans=0.125 2023-10-02 19:58:15,740 WARNING [train.py:1204] (2/4) Exclude cut with ID _56480_210_4220_1_1534679942079_3075409_138-36031-0_sp1.1 from training. Duration: 0.97275 2023-10-02 19:58:19,952 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.feed_forward1.hidden_balancer.prob, batch_count=1000800.0, ans=0.125 2023-10-02 19:58:19,968 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.self_attn_weights.pos_emb_skip_rate, batch_count=1000800.0, ans=0.0 2023-10-02 19:58:24,513 WARNING [train.py:1204] (2/4) Exclude cut with ID _58605_210_12210_1_1534723195939_3904259_300-285878-0_sp1.1 from training. Duration: 0.97275 2023-10-02 19:58:24,548 WARNING [train.py:1204] (2/4) Exclude cut with ID _40901_210_5665_1_1533363789958_3856130_114-340229-0_sp1.1 from training. Duration: 0.936375 2023-10-02 19:58:24,770 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.conv_skip_rate, batch_count=1000800.0, ans=0.0 2023-10-02 19:58:25,920 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_489-230703-0 from training. Duration: 0.85 2023-10-02 19:58:27,917 WARNING [train.py:1204] (2/4) Exclude cut with ID _66873_210_5517_1_1535454136506_4293370_322-81243-0_sp1.1 from training. Duration: 0.936375 2023-10-02 19:58:29,225 WARNING [train.py:1204] (2/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_271-269084-0 from training. Duration: 0.83 2023-10-02 19:58:29,243 WARNING [train.py:1204] (2/4) Exclude cut with ID _13252_210_1925_1_1530671372047_3336909_166-314607-0_sp0.9 from training. Duration: 0.6 2023-10-02 19:58:29,304 WARNING [train.py:1204] (2/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_340-343302-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 19:58:32,386 WARNING [train.py:1204] (2/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_286-112015-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 19:58:33,778 WARNING [train.py:1204] (2/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_328-264776-0 from training. Duration: 0.67 2023-10-02 19:58:35,229 WARNING [train.py:1204] (2/4) Exclude cut with ID _4866_210_2577_1_1527404412284_4364659_256-306830-0_sp0.9 from training. Duration: 0.9666875 2023-10-02 19:58:41,540 WARNING [train.py:1204] (2/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_239-316802-0 from training. Duration: 0.69 2023-10-02 19:58:44,162 WARNING [train.py:1204] (2/4) Exclude cut with ID _29857_210_20034_1_1532395886753_3798160_279-185801-0 from training. Duration: 0.91 2023-10-02 19:58:48,498 WARNING [train.py:1204] (2/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_656-188361-0 from training. Duration: 0.87 2023-10-02 19:58:49,922 INFO [train.py:1046] (2/4) Epoch 29, batch 1400, loss[loss=0.1531, simple_loss=0.2385, pruned_loss=0.03386, over 24687.00 frames. ], tot_loss[loss=0.1651, simple_loss=0.2425, pruned_loss=0.04386, over 4698207.51 frames. ], batch size: 65, lr: 3.54e-03, grad_scale: 8.0 2023-10-02 19:58:49,967 WARNING [train.py:1204] (2/4) Exclude cut with ID _44718_210_5816_1_1534050051869_6469030_100-230287-0_sp1.1 from training. Duration: 0.936375 2023-10-02 19:58:53,200 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8275_210_4198_1_1528455320_3892571_276-215622-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 19:58:53,264 WARNING [train.py:1204] (2/4) Exclude cut with ID _30304_210_6319_1_1532476827261_7582060_698-309430-0_sp1.1 from training. Duration: 0.963625 2023-10-02 19:58:59,173 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_365-217119-0 from training. Duration: 0.63 2023-10-02 19:58:59,301 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7264_210_4938_1_1527939911_8132095_800-91995-0 from training. Duration: 0.86 2023-10-02 19:59:09,904 WARNING [train.py:1204] (2/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_49-97158-0_sp1.1 from training. Duration: 0.6909375 2023-10-02 19:59:11,308 WARNING [train.py:1204] (2/4) Exclude cut with ID _61132_210_9969_1_1535021792964_3755419_61-57720-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 19:59:13,989 WARNING [train.py:1204] (2/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_345-306832-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 19:59:14,016 WARNING [train.py:1204] (2/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_164-224052-0_sp0.9 from training. Duration: 0.77775 2023-10-02 19:59:16,940 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_34886_210_10633_1_1533203882_1755661_76-156096-0_sp1.1 from training. Duration: 0.7909375 2023-10-02 19:59:18,341 WARNING [train.py:1204] (2/4) Exclude cut with ID 4133-6541-0027-40495-0_sp1.1 from training. Duration: 0.9681875 2023-10-02 19:59:28,507 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_514-211165-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 19:59:28,556 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_41211_210_2544_1_1533348859_3702910_97-212695-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 19:59:33,550 WARNING [train.py:1204] (2/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_406-45086-0 from training. Duration: 0.8 2023-10-02 19:59:33,616 WARNING [train.py:1204] (2/4) Exclude cut with ID _65263_210_22356_1_1535763598691_3921037_49-222917-0_sp1.1 from training. Duration: 0.836375 2023-10-02 19:59:33,666 WARNING [train.py:1204] (2/4) Exclude cut with ID _4866_210_2577_1_1527404412284_4364659_352-157930-0_sp1.1 from training. Duration: 0.736375 2023-10-02 19:59:35,083 WARNING [train.py:1204] (2/4) Exclude cut with ID _4366_210_1388_1_1526792053633_3613819_231-211402-0_sp1.1 from training. Duration: 0.8545625 2023-10-02 19:59:35,141 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_98-21576-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 19:59:36,554 WARNING [train.py:1204] (2/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_348-127789-0_sp0.9 from training. Duration: 0.9444375 2023-10-02 19:59:36,569 WARNING [train.py:1204] (2/4) Exclude cut with ID _45871_210_5665_1_1533864614245_3296060_98-329557-0_sp1.1 from training. Duration: 0.97275 2023-10-02 19:59:36,619 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6846_210_6753_1_1527850767_4295158_2-315073-0_sp1.1 from training. Duration: 0.8 2023-10-02 19:59:38,653 WARNING [train.py:1204] (2/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_145-137790-0 from training. Duration: 0.83 2023-10-02 19:59:38,667 WARNING [train.py:1204] (2/4) Exclude cut with ID _14603_210_4220_1_1530615688441_3097319_133-158308-0_sp0.9 from training. Duration: 0.9333125 2023-10-02 19:59:45,130 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.626e+02 1.875e+02 2.105e+02 2.508e+02 3.725e+02, threshold=4.210e+02, percent-clipped=0.0 2023-10-02 19:59:45,198 WARNING [train.py:1204] (2/4) Exclude cut with ID _14898_210_9206_1_1530766812377_4177190_320-11758-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 19:59:45,486 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.bypass_mid.scale_min, batch_count=1001133.3333333334, ans=0.2 2023-10-02 19:59:49,377 WARNING [train.py:1204] (2/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_406-45086-0_sp0.9 from training. Duration: 0.888875 2023-10-02 19:59:57,145 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_5438_210_1794_1_1527336687_2600009_104-288274-0 from training. Duration: 0.58 2023-10-02 19:59:58,301 WARNING [train.py:1204] (2/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_108-53929-0_sp0.9 from training. Duration: 0.6555625 2023-10-02 19:59:58,452 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.ff2_skip_rate, batch_count=1001200.0, ans=0.0 2023-10-02 19:59:59,585 WARNING [train.py:1204] (2/4) Exclude cut with ID _30800_210_22382_1_1532499347904_4515959_444-286390-0_sp1.1 from training. Duration: 0.963625 2023-10-02 20:00:01,018 WARNING [train.py:1204] (2/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_125-144174-0_sp0.9 from training. Duration: 0.4333125 2023-10-02 20:00:03,011 WARNING [train.py:1204] (2/4) Exclude cut with ID _55209_210_1531_1_1534675828996_4597160_118-345621-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 20:00:04,351 INFO [train.py:1046] (2/4) Epoch 29, batch 1450, loss[loss=0.1766, simple_loss=0.2653, pruned_loss=0.04392, over 24403.00 frames. ], tot_loss[loss=0.1646, simple_loss=0.2425, pruned_loss=0.0434, over 4705167.46 frames. ], batch size: 69, lr: 3.54e-03, grad_scale: 8.0 2023-10-02 20:00:04,446 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_45886_210_17024_1_1533709215_471063_7-79165-0_sp1.1 from training. Duration: 0.8909375 2023-10-02 20:00:07,851 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.conv_module2.balancer1.max_abs, batch_count=1001266.6666666666, ans=10.0 2023-10-02 20:00:08,998 WARNING [train.py:1204] (2/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_175-163894-0_sp0.9 from training. Duration: 0.82225 2023-10-02 20:00:10,386 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_50188_210_4198_1_1534503621_3646041_353-81682-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 20:00:10,392 WARNING [train.py:1204] (2/4) Exclude cut with ID _17875_210_2087_1_1531224012907_1821339_175-80565-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 20:00:10,408 WARNING [train.py:1204] (2/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_159-328157-0_sp1.1 from training. Duration: 0.47275 2023-10-02 20:00:15,387 WARNING [train.py:1204] (2/4) Exclude cut with ID _60057_210_10640_1_1534903065793_3333150_137-267579-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 20:00:15,425 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_92-46009-0_sp1.1 from training. Duration: 0.4909375 2023-10-02 20:00:18,094 WARNING [train.py:1204] (2/4) Exclude cut with ID _53203_210_2087_1_1534246218309_475590_48-68507-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 20:00:18,118 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_28211_210_6388_1_1532861506_4523224_128-328162-0 from training. Duration: 0.9 2023-10-02 20:00:18,194 WARNING [train.py:1204] (2/4) Exclude cut with ID _4697_210_2069_1_1526815801953_3823098_51-1049-0_sp1.1 from training. Duration: 0.6818125 2023-10-02 20:00:18,839 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.3.encoder.layers.3.self_attn_weights.whiten_keys, num_groups=8, num_channels=256, metric=2.77 vs. limit=6.0 2023-10-02 20:00:18,847 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.3.encoder.layers.2.conv_module2.whiten, num_groups=1, num_channels=512, metric=3.93 vs. limit=15.0 2023-10-02 20:00:19,762 WARNING [train.py:1204] (2/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_383-316000-0 from training. Duration: 0.9 2023-10-02 20:00:21,106 WARNING [train.py:1204] (2/4) Exclude cut with ID _4779_210_2084_1_1527318022944_3914893_185-295153-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 20:00:22,556 WARNING [train.py:1204] (2/4) Exclude cut with ID _3231_210_2106_1_1526205864934_6784220_171-256004-0_sp0.9 from training. Duration: 0.9555625 2023-10-02 20:00:22,556 WARNING [train.py:1204] (2/4) Exclude cut with ID _41314_210_12651_1_1533459476543_3766460_87-272068-0 from training. Duration: 0.83 2023-10-02 20:00:23,947 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_164-152777-0_sp1.1 from training. Duration: 0.92725 2023-10-02 20:00:23,992 WARNING [train.py:1204] (2/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_18-130336-0_sp0.9 from training. Duration: 0.87775 2023-10-02 20:00:24,047 WARNING [train.py:1204] (2/4) Exclude cut with ID _14886_210_4220_1_1530702020355_3516190_175-229073-0_sp1.1 from training. Duration: 0.4090625 2023-10-02 20:00:25,158 WARNING [train.py:1204] (2/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_791-45149-0_sp0.9 from training. Duration: 0.9555625 2023-10-02 20:00:25,244 WARNING [train.py:1204] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_468-189058-0_sp1.1 from training. Duration: 0.87275 2023-10-02 20:00:26,712 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8278_210_2550_1_1528637099_4220413_32-142780-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 20:00:29,420 WARNING [train.py:1204] (2/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_146-218763-0_sp0.9 from training. Duration: 0.9555625 2023-10-02 20:00:32,672 WARNING [train.py:1204] (2/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_348-127789-0_sp1.1 from training. Duration: 0.77275 2023-10-02 20:00:32,677 WARNING [train.py:1204] (2/4) Exclude cut with ID _47790_210_16187_1_1533862814066_4207519_288-285445-0_sp1.1 from training. Duration: 0.936375 2023-10-02 20:00:37,819 WARNING [train.py:1204] (2/4) Exclude cut with ID _14603_210_4220_1_1530615688441_3097319_308-181732-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 20:00:37,859 WARNING [train.py:1204] (2/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_895-117428-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 20:00:39,300 WARNING [train.py:1204] (2/4) Exclude cut with ID _9235_210_5219_1_1528891154253_8046120_387-159008-0_sp0.9 from training. Duration: 0.9555625 2023-10-02 20:00:39,312 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_37351_210_7925_1_1533012931_3812113_265-85780-0_sp0.9 from training. Duration: 0.97775 2023-10-02 20:00:39,334 WARNING [train.py:1204] (2/4) Exclude cut with ID _40551_210_10635_1_1533776424632_3416179_361-3964-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 20:00:39,372 WARNING [train.py:1204] (2/4) Exclude cut with ID _25882_210_3036_1_1532775665815_6479510_485-122742-0_sp1.1 from training. Duration: 0.97275 2023-10-02 20:00:44,001 WARNING [train.py:1204] (2/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_98-51676-0 from training. Duration: 0.73 2023-10-02 20:00:44,212 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.nonlin_attention.balancer.prob, batch_count=1001400.0, ans=0.125 2023-10-02 20:00:45,541 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.feed_forward2.hidden_balancer.prob, batch_count=1001400.0, ans=0.125 2023-10-02 20:00:46,744 WARNING [train.py:1204] (2/4) Exclude cut with ID _30548_210_16040_1_1532608097130_3146969_371-280610-0_sp1.1 from training. Duration: 0.92725 2023-10-02 20:00:49,718 WARNING [train.py:1204] (2/4) Exclude cut with ID IC0671W0081-331924 from training. Duration: 0.896 2023-10-02 20:00:51,076 WARNING [train.py:1204] (2/4) Exclude cut with ID _40284_210_2084_1_1533513679814_3704279_380-160775-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 20:00:52,492 WARNING [train.py:1204] (2/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_57-270037-0_sp1.1 from training. Duration: 0.7 2023-10-02 20:00:52,577 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_853-150237-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 20:00:53,911 WARNING [train.py:1204] (2/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_491-261923-0 from training. Duration: 0.98 2023-10-02 20:00:56,678 WARNING [train.py:1204] (2/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_836-244045-0_sp1.1 from training. Duration: 0.97275 2023-10-02 20:00:58,620 WARNING [train.py:1204] (2/4) Exclude cut with ID _14880_210_8895_1_1530680426655_3795259_289-212817-0 from training. Duration: 0.69 2023-10-02 20:01:01,046 WARNING [train.py:1204] (2/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_303-340446-0 from training. Duration: 0.9 2023-10-02 20:01:02,542 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_37152_210_9697_1_1533106573_3844188_418-41736-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 20:01:02,768 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.conv_module2.balancer2.prob, batch_count=1001533.3333333334, ans=0.125 2023-10-02 20:01:05,956 WARNING [train.py:1204] (2/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_371-18169-0_sp1.1 from training. Duration: 0.7818125 2023-10-02 20:01:07,268 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_187-219984-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 20:01:07,383 WARNING [train.py:1204] (2/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_63-267585-0 from training. Duration: 0.9 2023-10-02 20:01:10,670 WARNING [train.py:1204] (2/4) Exclude cut with ID _27013_210_1389_1_1532437173240_3252649_222-125573-0 from training. Duration: 0.98 2023-10-02 20:01:10,729 WARNING [train.py:1204] (2/4) Exclude cut with ID _14821_210_8750_1_1530680503991_6829359_189-297451-0 from training. Duration: 0.55 2023-10-02 20:01:13,363 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_557-163391-0_sp1.1 from training. Duration: 0.97275 2023-10-02 20:01:13,444 WARNING [train.py:1204] (2/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_65-88242-0_sp1.1 from training. Duration: 0.5090625 2023-10-02 20:01:19,193 INFO [train.py:1046] (2/4) Epoch 29, batch 1500, loss[loss=0.1686, simple_loss=0.2571, pruned_loss=0.04008, over 24386.00 frames. ], tot_loss[loss=0.1647, simple_loss=0.2428, pruned_loss=0.04333, over 4712852.07 frames. ], batch size: 74, lr: 3.54e-03, grad_scale: 8.0 2023-10-02 20:01:23,513 WARNING [train.py:1204] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_218-295097-0 from training. Duration: 0.77 2023-10-02 20:01:23,544 WARNING [train.py:1204] (2/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_686-247600-0_sp1.1 from training. Duration: 0.836375 2023-10-02 20:01:23,547 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_397-147196-0_sp0.9 from training. Duration: 0.888875 2023-10-02 20:01:23,706 INFO [scaling.py:1118] (2/4) WithLoss: name=encoder.encoders.0.layers.0.self_attn_weights, loss-sum=0.000e+00 2023-10-02 20:01:24,932 WARNING [train.py:1204] (2/4) Exclude cut with ID _53950_210_5816_1_1534417264919_3862951_237-136449-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 20:01:26,294 WARNING [train.py:1204] (2/4) Exclude cut with ID _3120_210_3525_1_1526036143112_4072980_74-178400-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 20:01:26,365 WARNING [train.py:1204] (2/4) Exclude cut with ID _5166_210_2084_1_1527073197160_4087993_309-222545-0_sp0.9 from training. Duration: 0.9333125 2023-10-02 20:01:26,907 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.3.encoder.layers.3.feed_forward2.out_whiten, num_groups=1, num_channels=512, metric=8.76 vs. limit=15.0 2023-10-02 20:01:27,655 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7544_210_6753_1_1528587722_3576686_172-171750-0 from training. Duration: 0.65 2023-10-02 20:01:29,656 WARNING [train.py:1204] (2/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_3-237434-0_sp0.9 from training. Duration: 0.8444375 2023-10-02 20:01:29,688 WARNING [train.py:1204] (2/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_101-293367-0_sp0.9 from training. Duration: 0.7 2023-10-02 20:01:29,832 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.conv_module2.balancer1.prob, batch_count=1001600.0, ans=0.125 2023-10-02 20:01:31,042 WARNING [train.py:1204] (2/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_818-213552-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 20:01:31,118 WARNING [train.py:1204] (2/4) Exclude cut with ID _14349_210_8341_1_1530615710818_3442470_115-285975-0_sp1.1 from training. Duration: 0.7818125 2023-10-02 20:01:31,245 WARNING [train.py:1204] (2/4) Exclude cut with ID _30800_210_22382_1_1532499347904_4515959_291-309749-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 20:01:32,763 WARNING [train.py:1204] (2/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_715-3462-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 20:01:40,562 WARNING [train.py:1204] (2/4) Exclude cut with ID _59502_210_12210_1_1535107024698_1835139_13-331827-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 20:01:40,573 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_4528_210_3273_1_1527076654_3276777_342-168068-0 from training. Duration: 0.87 2023-10-02 20:01:40,629 WARNING [train.py:1204] (2/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_215-141059-0_sp0.9 from training. Duration: 0.688875 2023-10-02 20:01:40,656 WARNING [train.py:1204] (2/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_106-286130-0_sp1.1 from training. Duration: 0.8909375 2023-10-02 20:01:42,030 WARNING [train.py:1204] (2/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_480-77655-0_sp1.1 from training. Duration: 0.97275 2023-10-02 20:01:46,054 WARNING [train.py:1204] (2/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_848-280957-0 from training. Duration: 0.87 2023-10-02 20:01:50,768 WARNING [train.py:1204] (2/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_122-109023-0 from training. Duration: 0.76 2023-10-02 20:01:51,552 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.2.encoder.layers.1.feed_forward1.out_whiten, num_groups=1, num_channels=384, metric=6.69 vs. limit=15.0 2023-10-02 20:01:52,266 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_11305_210_1794_1_1529750428_7178148_645-233480-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 20:01:52,322 WARNING [train.py:1204] (2/4) Exclude cut with ID _7562_210_2084_1_1528887767916_3946229_345-169156-0 from training. Duration: 0.58 2023-10-02 20:01:54,988 WARNING [train.py:1204] (2/4) Exclude cut with ID _7562_210_2084_1_1528887767916_3946229_345-169156-0_sp1.1 from training. Duration: 0.52725 2023-10-02 20:01:57,596 WARNING [train.py:1204] (2/4) Exclude cut with ID _13253_210_1925_1_1530757775819_3577873_253-246386-0_sp1.1 from training. Duration: 0.8181875 2023-10-02 20:01:57,670 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_136-264444-0_sp1.1 from training. Duration: 0.97275 2023-10-02 20:01:58,947 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_2168_210_2290_1_1525606920_1849400_136-222991-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 20:02:00,846 WARNING [train.py:1204] (2/4) Exclude cut with ID _7852_210_5219_1_1528286471316_8139400_242-105336-0 from training. Duration: 0.72 2023-10-02 20:02:00,899 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_5960_210_1794_1_1527403865_3990999_515-2613-0_sp1.1 from training. Duration: 0.8 2023-10-02 20:02:00,913 WARNING [train.py:1204] (2/4) Exclude cut with ID _39688_210_19596_1_1533371626725_4154159_551-167834-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 20:02:02,184 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_43802_210_2290_1_1533516203_8829504_85-195240-0 from training. Duration: 0.87 2023-10-02 20:02:02,244 WARNING [train.py:1204] (2/4) Exclude cut with ID _63171_210_15488_1_1535104792794_3295110_113-113534-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 20:02:09,620 WARNING [train.py:1204] (2/4) Exclude cut with ID _8833_210_5219_1_1528592665210_7553059_319-349181-0_sp1.1 from training. Duration: 0.9 2023-10-02 20:02:09,621 WARNING [train.py:1204] (2/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_225-43381-0 from training. Duration: 0.94 2023-10-02 20:02:13,646 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.562e+02 1.835e+02 2.062e+02 2.409e+02 3.555e+02, threshold=4.124e+02, percent-clipped=0.0 2023-10-02 20:02:15,139 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_5959_210_1794_1_1527400400_3445726_412-44052-0_sp0.9 from training. Duration: 0.8555625 2023-10-02 20:02:15,324 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.conv_module2.balancer2.prob, batch_count=1001800.0, ans=0.125 2023-10-02 20:02:16,536 WARNING [train.py:1204] (2/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_356-167816-0_sp1.1 from training. Duration: 0.6545625 2023-10-02 20:02:21,300 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9012W0112-501244 from training. Duration: 0.768 2023-10-02 20:02:21,359 WARNING [train.py:1204] (2/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_177-326952-0_sp1.1 from training. Duration: 0.92725 2023-10-02 20:02:21,363 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9016W0116-502789 from training. Duration: 0.896 2023-10-02 20:02:23,194 WARNING [train.py:1204] (2/4) Exclude cut with ID _56235_210_10640_1_1534557335235_4161099_557-204815-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 20:02:23,279 WARNING [train.py:1204] (2/4) Exclude cut with ID _55857_210_2346_1_1534644007831_6898209_279-229469-0_sp1.1 from training. Duration: 0.936375 2023-10-02 20:02:23,387 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.1.nonlin_attention.balancer.prob, batch_count=1001866.6666666666, ans=0.125 2023-10-02 20:02:24,622 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9029W0375-509764 from training. Duration: 0.896 2023-10-02 20:02:26,009 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_2295_210_2087_1_1525500993_2407863_234-83104-0_sp0.9 from training. Duration: 0.688875 2023-10-02 20:02:28,783 WARNING [train.py:1204] (2/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_266-137104-0 from training. Duration: 0.65 2023-10-02 20:02:30,072 WARNING [train.py:1204] (2/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_289-163927-0_sp1.1 from training. Duration: 0.92725 2023-10-02 20:02:33,221 INFO [train.py:1046] (2/4) Epoch 29, batch 1550, loss[loss=0.1767, simple_loss=0.2501, pruned_loss=0.05161, over 22672.00 frames. ], tot_loss[loss=0.1655, simple_loss=0.2432, pruned_loss=0.04384, over 4716314.62 frames. ], batch size: 322, lr: 3.54e-03, grad_scale: 8.0 2023-10-02 20:02:33,317 WARNING [train.py:1204] (2/4) Exclude cut with ID _12733_210_8341_1_1530167424556_3646380_86-225314-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 20:02:33,329 WARNING [train.py:1204] (2/4) Exclude cut with ID _7877_210_6014_1_1528286358099_3750136_618-284064-0_sp1.1 from training. Duration: 0.92725 2023-10-02 20:02:34,671 WARNING [train.py:1204] (2/4) Exclude cut with ID _55844_210_2346_1_1534471212569_7625707_1017-31469-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 20:02:34,722 WARNING [train.py:1204] (2/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_617-241446-0_sp1.1 from training. Duration: 0.92725 2023-10-02 20:02:36,174 WARNING [train.py:1204] (2/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_866-173440-0_sp1.1 from training. Duration: 0.8181875 2023-10-02 20:02:36,301 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_9460_210_4211_1_1528954587_10049667_480-260269-0 from training. Duration: 0.56 2023-10-02 20:02:38,100 WARNING [train.py:1204] (2/4) Exclude cut with ID _44659_210_2721_1_1534036120061_6614860_626-298884-0 from training. Duration: 0.97 2023-10-02 20:02:39,377 WARNING [train.py:1204] (2/4) Exclude cut with ID _53565_210_10635_1_1534337218335_3820259_287-18120-0_sp1.1 from training. Duration: 0.8818125 2023-10-02 20:02:39,427 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_791-197891-0 from training. Duration: 0.84 2023-10-02 20:02:39,487 WARNING [train.py:1204] (2/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_527-238990-0 from training. Duration: 0.9 2023-10-02 20:02:42,113 WARNING [train.py:1204] (2/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_449-65969-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 20:02:43,476 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_553-341452-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 20:02:43,523 WARNING [train.py:1204] (2/4) Exclude cut with ID _16909_210_5724_1_1531292079912_4118919_300-47574-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 20:02:43,540 WARNING [train.py:1204] (2/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_70-27732-0_sp1.1 from training. Duration: 0.8545625 2023-10-02 20:02:46,210 WARNING [train.py:1204] (2/4) Exclude cut with ID _44718_210_5816_1_1534050051869_6469030_727-28074-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 20:02:46,261 WARNING [train.py:1204] (2/4) Exclude cut with ID _55857_210_2346_1_1534644007831_6898209_357-123664-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 20:02:49,047 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9123W0004-555964 from training. Duration: 0.896 2023-10-02 20:02:49,068 WARNING [train.py:1204] (2/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_445-145208-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 20:02:50,470 WARNING [train.py:1204] (2/4) Exclude cut with ID _3675_210_4557_1_1526639934408_4182089_456-18305-0_sp1.1 from training. Duration: 0.6909375 2023-10-02 20:02:50,527 WARNING [train.py:1204] (2/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_1-99680-0_sp1.1 from training. Duration: 0.6090625 2023-10-02 20:02:52,595 WARNING [train.py:1204] (2/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_146-282400-0_sp0.9 from training. Duration: 0.788875 2023-10-02 20:02:52,617 WARNING [train.py:1204] (2/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_844-96989-0 from training. Duration: 0.71 2023-10-02 20:02:55,314 WARNING [train.py:1204] (2/4) Exclude cut with ID _29853_210_7497_1_1532505600851_6642110_128-146364-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 20:02:55,336 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_224-322394-0 from training. Duration: 0.66 2023-10-02 20:02:55,421 WARNING [train.py:1204] (2/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_191-59784-0 from training. Duration: 0.88 2023-10-02 20:02:56,648 WARNING [train.py:1204] (2/4) Exclude cut with ID _12556_210_8341_1_1530081024100_7327690_725-193477-0 from training. Duration: 0.98 2023-10-02 20:02:56,697 WARNING [train.py:1204] (2/4) Exclude cut with ID _5166_210_2084_1_1527073197160_4087993_523-79838-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 20:02:56,778 WARNING [train.py:1204] (2/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_398-334558-0_sp1.1 from training. Duration: 0.87275 2023-10-02 20:03:01,010 WARNING [train.py:1204] (2/4) Exclude cut with ID _6456_210_1534_1_1527681634212_3181049_171-36697-0_sp1.1 from training. Duration: 0.936375 2023-10-02 20:03:03,084 WARNING [train.py:1204] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_700-183549-0 from training. Duration: 0.68 2023-10-02 20:03:03,092 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8853_210_3273_1_1528633899_818968_51-187685-0 from training. Duration: 0.69 2023-10-02 20:03:11,622 WARNING [train.py:1204] (2/4) Exclude cut with ID _7719_210_3525_1_1528456153163_4296960_333-110399-0_sp1.1 from training. Duration: 0.87275 2023-10-02 20:03:15,783 WARNING [train.py:1204] (2/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_1022-83740-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 20:03:15,799 WARNING [train.py:1204] (2/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_61-327062-0_sp0.9 from training. Duration: 0.9 2023-10-02 20:03:15,807 WARNING [train.py:1204] (2/4) Exclude cut with ID _47251_210_18628_1_1533814063128_4137250_214-88076-0_sp1.1 from training. Duration: 0.8 2023-10-02 20:03:17,176 WARNING [train.py:1204] (2/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_839-173017-0 from training. Duration: 0.41 2023-10-02 20:03:23,047 WARNING [train.py:1204] (2/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_299-34413-0_sp1.1 from training. Duration: 0.5909375 2023-10-02 20:03:25,642 WARNING [train.py:1204] (2/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_664-159457-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 20:03:27,587 WARNING [train.py:1204] (2/4) Exclude cut with ID _66873_210_5517_1_1535454136506_4293370_483-291760-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 20:03:30,379 WARNING [train.py:1204] (2/4) Exclude cut with ID _30800_210_22382_1_1532499347904_4515959_287-255474-0_sp1.1 from training. Duration: 0.92725 2023-10-02 20:03:30,436 WARNING [train.py:1204] (2/4) Exclude cut with ID _48677_210_5663_1_1533902405919_3892989_111-11439-0_sp1.1 from training. Duration: 0.87275 2023-10-02 20:03:30,445 WARNING [train.py:1204] (2/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_399-156791-0 from training. Duration: 0.83 2023-10-02 20:03:31,864 WARNING [train.py:1204] (2/4) Exclude cut with ID _3992_210_1747_1_1526644816031_3731060_36-147855-0_sp0.9 from training. Duration: 0.8666875 2023-10-02 20:03:33,186 WARNING [train.py:1204] (2/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_442-320317-0_sp1.1 from training. Duration: 0.7454375 2023-10-02 20:03:33,211 WARNING [train.py:1204] (2/4) Exclude cut with ID _55429_210_10626_1_1534467588527_7174143_953-80868-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 20:03:34,712 WARNING [train.py:1204] (2/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_159-328157-0_sp0.9 from training. Duration: 0.57775 2023-10-02 20:03:34,718 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9309W0197-640324 from training. Duration: 0.896 2023-10-02 20:03:36,894 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.bypass.scale_min, batch_count=1002200.0, ans=0.2 2023-10-02 20:03:38,031 WARNING [train.py:1204] (2/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_361-30822-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 20:03:38,298 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.conv_module2.balancer1.prob, batch_count=1002200.0, ans=0.125 2023-10-02 20:03:42,616 WARNING [train.py:1204] (2/4) Exclude cut with ID _5250_210_5219_1_1527249561806_8692110_636-251213-0 from training. Duration: 0.94 2023-10-02 20:03:43,264 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.3.encoder.layers.3.self_attn1.whiten, num_groups=1, num_channels=512, metric=14.27 vs. limit=22.5 2023-10-02 20:03:46,181 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.2.encoder.layers.1.self_attn1.whiten, num_groups=1, num_channels=384, metric=17.04 vs. limit=22.5 2023-10-02 20:03:46,748 INFO [train.py:1046] (2/4) Epoch 29, batch 1600, loss[loss=0.1676, simple_loss=0.2325, pruned_loss=0.0513, over 23828.00 frames. ], tot_loss[loss=0.1654, simple_loss=0.2434, pruned_loss=0.04373, over 4726303.27 frames. ], batch size: 150, lr: 3.54e-03, grad_scale: 16.0 2023-10-02 20:03:46,839 WARNING [train.py:1204] (2/4) Exclude cut with ID _49728_210_12651_1_1534041034294_6788150_468-333827-0_sp1.1 from training. Duration: 0.963625 2023-10-02 20:03:47,066 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.bypass.scale_min, batch_count=1002266.6666666666, ans=0.2 2023-10-02 20:03:48,141 WARNING [train.py:1204] (2/4) Exclude cut with ID _25882_210_3036_1_1532775665815_6479510_623-191759-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 20:03:49,541 WARNING [train.py:1204] (2/4) Exclude cut with ID _5189_210_4557_1_1527248319428_3769229_199-53526-0 from training. Duration: 0.42 2023-10-02 20:03:49,649 WARNING [train.py:1204] (2/4) Exclude cut with ID _6274_210_2084_1_1527937583837_7494020_32-205288-0_sp0.9 from training. Duration: 0.8666875 2023-10-02 20:03:49,733 WARNING [train.py:1204] (2/4) Exclude cut with ID _65263_210_22356_1_1535763598691_3921037_401-346646-0_sp1.1 from training. Duration: 0.963625 2023-10-02 20:03:49,737 WARNING [train.py:1204] (2/4) Exclude cut with ID _12555_210_8750_1_1530075594402_3780239_449-267986-0_sp1.1 from training. Duration: 0.7545625 2023-10-02 20:03:51,070 WARNING [train.py:1204] (2/4) Exclude cut with ID _69884_210_9771_1_1535846392818_4166169_430-201020-0_sp1.1 from training. Duration: 0.8818125 2023-10-02 20:03:52,409 WARNING [train.py:1204] (2/4) Exclude cut with ID _8394_210_6702_1_1528590504316_3540189_379-49457-0_sp1.1 from training. Duration: 0.7909375 2023-10-02 20:03:54,505 WARNING [train.py:1204] (2/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_200-105218-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 20:03:55,877 WARNING [train.py:1204] (2/4) Exclude cut with ID _3867_210_1883_1_1526641200575_4475830_319-349850-0 from training. Duration: 0.67 2023-10-02 20:03:57,705 WARNING [train.py:1204] (2/4) Exclude cut with ID _28317_210_12742_1_1532480302381_8510325_916-195848-0 from training. Duration: 0.92 2023-10-02 20:03:59,143 WARNING [train.py:1204] (2/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_436-186311-0 from training. Duration: 0.65 2023-10-02 20:04:01,876 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_3910_210_1794_1_1526777667_3747260_302-180617-0_sp1.1 from training. Duration: 0.97275 2023-10-02 20:04:03,222 WARNING [train.py:1204] (2/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_102-92751-0 from training. Duration: 0.99 2023-10-02 20:04:03,276 WARNING [train.py:1204] (2/4) Exclude cut with ID _51899_210_20034_1_1534129254466_3669220_265-215428-0_sp1.1 from training. Duration: 0.8909375 2023-10-02 20:04:04,755 WARNING [train.py:1204] (2/4) Exclude cut with ID _58583_210_5236_1_1534726835266_3574249_438-198993-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 20:04:11,122 WARNING [train.py:1204] (2/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_166-166990-0_sp1.1 from training. Duration: 0.87275 2023-10-02 20:04:13,864 WARNING [train.py:1204] (2/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_907-151398-0 from training. Duration: 0.37 2023-10-02 20:04:16,590 WARNING [train.py:1204] (2/4) Exclude cut with ID _38670_210_15379_1_1533448810659_4391059_452-256669-0_sp1.1 from training. Duration: 0.936375 2023-10-02 20:04:18,021 WARNING [train.py:1204] (2/4) Exclude cut with ID _48677_210_5663_1_1533902405919_3892989_408-40740-0 from training. Duration: 0.83 2023-10-02 20:04:18,065 WARNING [train.py:1204] (2/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_903-158756-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 20:04:18,113 WARNING [train.py:1204] (2/4) Exclude cut with ID _13252_210_1925_1_1530671372047_3336909_397-216497-0 from training. Duration: 0.96 2023-10-02 20:04:21,438 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.4.encoder.layers.1.nonlin_attention.whiten2, num_groups=1, num_channels=384, metric=3.58 vs. limit=15.0 2023-10-02 20:04:24,043 WARNING [train.py:1204] (2/4) Exclude cut with ID _55429_210_10626_1_1534467588527_7174143_97-335084-0 from training. Duration: 0.97 2023-10-02 20:04:31,622 WARNING [train.py:1204] (2/4) Exclude cut with ID _45329_210_7005_1_1534157951211_6774339_453-267730-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 20:04:31,689 WARNING [train.py:1204] (2/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_153-97441-0 from training. Duration: 0.56 2023-10-02 20:04:33,027 WARNING [train.py:1204] (2/4) Exclude cut with ID _40901_210_5665_1_1533363789958_3856130_312-329122-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 20:04:33,070 WARNING [train.py:1204] (2/4) Exclude cut with ID _30800_210_22382_1_1532499347904_4515959_97-126471-0_sp1.1 from training. Duration: 0.97275 2023-10-02 20:04:33,070 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_11305_210_1794_1_1529750428_7178148_257-312870-0_sp1.1 from training. Duration: 0.8 2023-10-02 20:04:37,623 WARNING [train.py:1204] (2/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_454-314901-0_sp0.9 from training. Duration: 0.511125 2023-10-02 20:04:41,143 WARNING [train.py:1204] (2/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_282-136641-0_sp1.1 from training. Duration: 0.3818125 2023-10-02 20:04:42,330 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.617e+02 1.872e+02 2.190e+02 2.383e+02 3.841e+02, threshold=4.379e+02, percent-clipped=0.0 2023-10-02 20:04:42,431 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_46013_210_7234_1_1533776362_3744032_493-160724-0_sp1.1 from training. Duration: 0.963625 2023-10-02 20:04:42,470 WARNING [train.py:1204] (2/4) Exclude cut with ID _67831_210_12392_1_1535612464202_3092612_259-33442-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 20:04:42,523 WARNING [train.py:1204] (2/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_289-62384-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 20:04:43,830 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6545_210_2040_1_1527774670_4245258_379-14182-0_sp0.9 from training. Duration: 0.888875 2023-10-02 20:04:45,275 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_548-294316-0_sp1.1 from training. Duration: 0.736375 2023-10-02 20:04:46,670 WARNING [train.py:1204] (2/4) Exclude cut with ID _6275_210_2084_1_1528110062213_7669049_857-298942-0_sp1.1 from training. Duration: 0.77275 2023-10-02 20:04:49,309 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6673_210_3273_1_1527767790_3173240_327-306185-0_sp1.1 from training. Duration: 0.7545625 2023-10-02 20:04:55,349 WARNING [train.py:1204] (2/4) Exclude cut with ID _61008_210_10633_1_1534852769198_6974370_814-107647-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 20:04:55,411 WARNING [train.py:1204] (2/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_116-332008-0_sp1.1 from training. Duration: 0.8909375 2023-10-02 20:04:58,167 WARNING [train.py:1204] (2/4) Exclude cut with ID _32704_210_8954_1_1533017488840_6747370_266-329755-0 from training. Duration: 0.89 2023-10-02 20:04:58,171 WARNING [train.py:1204] (2/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_271-269084-0_sp0.9 from training. Duration: 0.92225 2023-10-02 20:04:58,247 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_3171_210_2087_1_1526109084_2330007_66-260583-0 from training. Duration: 0.72 2023-10-02 20:05:00,930 INFO [train.py:1046] (2/4) Epoch 29, batch 1650, loss[loss=0.1499, simple_loss=0.2221, pruned_loss=0.0389, over 20298.00 frames. ], tot_loss[loss=0.166, simple_loss=0.2442, pruned_loss=0.04391, over 4712619.09 frames. ], batch size: 44, lr: 3.54e-03, grad_scale: 16.0 2023-10-02 20:05:02,894 WARNING [train.py:1204] (2/4) Exclude cut with ID _5250_210_5219_1_1527249561806_8692110_1326-276386-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 20:05:04,305 WARNING [train.py:1204] (2/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_372-30302-0_sp1.1 from training. Duration: 0.7818125 2023-10-02 20:05:05,733 WARNING [train.py:1204] (2/4) Exclude cut with ID _25882_210_3036_1_1532775665815_6479510_631-257029-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 20:05:05,741 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_3691_210_3528_1_1526468169_4062053_355-334432-0 from training. Duration: 0.87 2023-10-02 20:05:05,749 WARNING [train.py:1204] (2/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_839-173017-0_sp1.1 from training. Duration: 0.37275 2023-10-02 20:05:05,756 WARNING [train.py:1204] (2/4) Exclude cut with ID _14998_210_5219_1_1530705582080_7474970_441-91466-0 from training. Duration: 0.97 2023-10-02 20:05:07,059 WARNING [train.py:1204] (2/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_108-53929-0 from training. Duration: 0.59 2023-10-02 20:05:10,981 WARNING [train.py:1204] (2/4) Exclude cut with ID _9389_210_1664_1_1529217337301_5289269_260-1942-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 20:05:12,290 WARNING [train.py:1204] (2/4) Exclude cut with ID _56423_210_5854_1_1534896061049_3165969_198-61547-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 20:05:13,615 WARNING [train.py:1204] (2/4) Exclude cut with ID _44778_210_2054_1_1533798127203_7443090_412-142122-0_sp1.1 from training. Duration: 0.92725 2023-10-02 20:05:13,648 WARNING [train.py:1204] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_88-341718-0_sp1.1 from training. Duration: 0.6 2023-10-02 20:05:16,529 WARNING [train.py:1204] (2/4) Exclude cut with ID _61079_210_4525_1_1534921008198_4011419_334-298697-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 20:05:17,943 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_5465_210_4211_1_1527227253_55357_8-2499-0_sp1.1 from training. Duration: 0.996375 2023-10-02 20:05:19,533 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_447-201565-0_sp1.1 from training. Duration: 0.97275 2023-10-02 20:05:19,539 WARNING [train.py:1204] (2/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_253-161789-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 20:05:19,543 WARNING [train.py:1204] (2/4) Exclude cut with ID _44304_210_15374_1_1533639352765_4613129_302-22084-0_sp0.9 from training. Duration: 0.9555625 2023-10-02 20:05:19,563 WARNING [train.py:1204] (2/4) Exclude cut with ID _26664_210_7770_1_1532258943280_3418270_187-80815-0_sp1.1 from training. Duration: 0.8454375 2023-10-02 20:05:20,929 WARNING [train.py:1204] (2/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_68-212801-0 from training. Duration: 0.73 2023-10-02 20:05:20,966 WARNING [train.py:1204] (2/4) Exclude cut with ID _29345_210_4381_1_1532689168263_3860169_149-257268-0 from training. Duration: 0.81 2023-10-02 20:05:25,634 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_213-103282-0_sp1.1 from training. Duration: 0.7090625 2023-10-02 20:05:28,440 WARNING [train.py:1204] (2/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_156-302675-0_sp1.1 from training. Duration: 0.5 2023-10-02 20:05:34,759 WARNING [train.py:1204] (2/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_125-322172-0 from training. Duration: 0.93 2023-10-02 20:05:36,062 WARNING [train.py:1204] (2/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_493-60474-0_sp1.1 from training. Duration: 0.963625 2023-10-02 20:05:38,913 WARNING [train.py:1204] (2/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_156-302675-0 from training. Duration: 0.55 2023-10-02 20:05:39,221 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.conv_module1.balancer2.min_abs, batch_count=1002733.3333333334, ans=0.5 2023-10-02 20:05:43,008 WARNING [train.py:1204] (2/4) Exclude cut with ID _41314_210_12651_1_1533459476543_3766460_123-267862-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 20:05:45,012 WARNING [train.py:1204] (2/4) Exclude cut with ID _28427_210_4220_1_1532581240967_3061069_201-143747-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 20:05:46,282 WARNING [train.py:1204] (2/4) Exclude cut with ID _8772_210_1850_1_1528635595972_3664011_96-12756-0_sp1.1 from training. Duration: 0.8909375 2023-10-02 20:05:46,323 WARNING [train.py:1204] (2/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_1347-145675-0_sp1.1 from training. Duration: 0.936375 2023-10-02 20:05:47,746 WARNING [train.py:1204] (2/4) Exclude cut with ID _9235_210_5219_1_1528891154253_8046120_387-159008-0_sp1.1 from training. Duration: 0.7818125 2023-10-02 20:05:47,769 WARNING [train.py:1204] (2/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_400-201896-0_sp1.1 from training. Duration: 0.963625 2023-10-02 20:05:48,054 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.conv_module2.balancer2.prob, batch_count=1002800.0, ans=0.125 2023-10-02 20:05:50,594 WARNING [train.py:1204] (2/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_326-174612-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 20:05:50,628 WARNING [train.py:1204] (2/4) Exclude cut with ID _55844_210_2346_1_1534471212569_7625707_390-116233-0_sp1.1 from training. Duration: 0.963625 2023-10-02 20:05:52,334 WARNING [train.py:1204] (2/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_419-317062-0_sp1.1 from training. Duration: 0.82725 2023-10-02 20:05:52,382 WARNING [train.py:1204] (2/4) Exclude cut with ID _29877_210_6228_1_1532397791106_5276149_266-62291-0_sp0.9 from training. Duration: 0.9666875 2023-10-02 20:05:53,725 WARNING [train.py:1204] (2/4) Exclude cut with ID _7857_210_1534_1_1528286515745_6831470_431-338528-0_sp1.1 from training. Duration: 0.92725 2023-10-02 20:05:53,795 WARNING [train.py:1204] (2/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_87-131427-0_sp0.9 from training. Duration: 0.8666875 2023-10-02 20:05:55,310 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.0.conv_module1.balancer1.prob, batch_count=1002800.0, ans=0.125 2023-10-02 20:05:56,517 WARNING [train.py:1204] (2/4) Exclude cut with ID _24055_210_12651_1_1532310907143_976979_92-59356-0_sp1.1 from training. Duration: 0.82725 2023-10-02 20:05:56,604 WARNING [train.py:1204] (2/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_96-290039-0 from training. Duration: 0.56 2023-10-02 20:05:57,935 WARNING [train.py:1204] (2/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_48-182155-0_sp0.9 from training. Duration: 0.9666875 2023-10-02 20:05:57,975 WARNING [train.py:1204] (2/4) Exclude cut with ID _4626_210_3532_1_1526817652717_3916859_446-206656-0 from training. Duration: 0.76 2023-10-02 20:06:00,745 WARNING [train.py:1204] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_168-32906-0 from training. Duration: 0.69 2023-10-02 20:06:00,776 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_3780_210_2087_1_1526467760_2130483_170-259044-0 from training. Duration: 0.7 2023-10-02 20:06:00,800 WARNING [train.py:1204] (2/4) Exclude cut with ID _65134_210_14763_1_1535335393985_3505368_153-216718-0_sp1.1 from training. Duration: 0.92725 2023-10-02 20:06:02,206 WARNING [train.py:1204] (2/4) Exclude cut with ID _64639_210_5816_1_1535367619437_3528000_461-329156-0_sp1.1 from training. Duration: 0.87275 2023-10-02 20:06:02,239 WARNING [train.py:1204] (2/4) Exclude cut with ID _21989_210_5854_1_1531899214931_3271360_126-347531-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 20:06:03,930 WARNING [train.py:1204] (2/4) Exclude cut with ID _7614_210_4915_1_1529326764092_4537359_96-162197-0_sp1.1 from training. Duration: 0.963625 2023-10-02 20:06:03,931 WARNING [train.py:1204] (2/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_1083-147536-0 from training. Duration: 0.97 2023-10-02 20:06:07,390 WARNING [train.py:1204] (2/4) Exclude cut with ID _64029_210_4381_1_1535178502772_3756459_275-16710-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 20:06:10,093 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7264_210_4938_1_1527939911_8132095_800-91995-0_sp0.9 from training. Duration: 0.9555625 2023-10-02 20:06:10,126 WARNING [train.py:1204] (2/4) Exclude cut with ID _3844_210_1390_1_1526727803582_2871209_50-193879-0_sp1.1 from training. Duration: 0.936375 2023-10-02 20:06:11,662 WARNING [train.py:1204] (2/4) Exclude cut with ID _46952_210_5986_1_1534230010357_3107379_232-344503-0 from training. Duration: 0.96 2023-10-02 20:06:16,114 INFO [train.py:1046] (2/4) Epoch 29, batch 1700, loss[loss=0.1685, simple_loss=0.2497, pruned_loss=0.04361, over 23852.00 frames. ], tot_loss[loss=0.1653, simple_loss=0.2435, pruned_loss=0.04357, over 4721116.75 frames. ], batch size: 94, lr: 3.54e-03, grad_scale: 8.0 2023-10-02 20:06:16,190 WARNING [train.py:1204] (2/4) Exclude cut with ID _25808_210_13292_1_1532343514562_7366109_206-14076-0_sp1.1 from training. Duration: 0.936375 2023-10-02 20:06:16,191 WARNING [train.py:1204] (2/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_55-31904-0_sp0.9 from training. Duration: 0.97775 2023-10-02 20:06:16,217 WARNING [train.py:1204] (2/4) Exclude cut with ID _14632_210_9988_1_1530691279392_3923200_161-129530-0 from training. Duration: 0.96 2023-10-02 20:06:17,596 WARNING [train.py:1204] (2/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_770-200430-0_sp1.1 from training. Duration: 0.8090625 2023-10-02 20:06:17,614 WARNING [train.py:1204] (2/4) Exclude cut with ID _6270_210_2084_1_1527850911573_3856420_26-277343-0_sp1.1 from training. Duration: 0.7454375 2023-10-02 20:06:17,615 WARNING [train.py:1204] (2/4) Exclude cut with ID _69884_210_9771_1_1535846392818_4166169_88-23776-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 20:06:19,065 WARNING [train.py:1204] (2/4) Exclude cut with ID _60860_210_8254_1_1534896015875_3557169_245-256666-0_sp1.1 from training. Duration: 0.8818125 2023-10-02 20:06:19,095 WARNING [train.py:1204] (2/4) Exclude cut with ID _5250_210_5219_1_1527249561806_8692110_636-251213-0_sp1.1 from training. Duration: 0.8545625 2023-10-02 20:06:20,894 WARNING [train.py:1204] (2/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_540-131164-0 from training. Duration: 0.92 2023-10-02 20:06:23,604 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_5931_210_2040_1_1527428481_4547999_321-193614-0_sp0.9 from training. Duration: 0.7444375 2023-10-02 20:06:30,883 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.conv_module2.balancer2.prob, batch_count=1003000.0, ans=0.125 2023-10-02 20:06:32,065 WARNING [train.py:1204] (2/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_373-225403-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 20:06:34,151 WARNING [train.py:1204] (2/4) Exclude cut with ID _20622_210_3852_1_1531622427080_1093839_57-326027-0_sp1.1 from training. Duration: 0.97275 2023-10-02 20:06:38,946 WARNING [train.py:1204] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_605-257205-0_sp1.1 from training. Duration: 0.536375 2023-10-02 20:06:40,128 WARNING [train.py:1204] (2/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_442-320317-0_sp0.9 from training. Duration: 0.911125 2023-10-02 20:06:40,167 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_35361_210_4238_1_1533262810_7793476_675-87012-0_sp1.1 from training. Duration: 0.8090625 2023-10-02 20:06:40,206 WARNING [train.py:1204] (2/4) Exclude cut with ID _4184_210_2550_1_1526730192013_2219380_306-44736-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 20:06:43,387 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_124-113562-0 from training. Duration: 0.94 2023-10-02 20:06:44,178 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.3.encoder.layers.0.whiten, num_groups=1, num_channels=512, metric=5.95 vs. limit=12.0 2023-10-02 20:06:44,803 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_2821_210_2715_1_1526108385_3763560_73-296673-0_sp0.9 from training. Duration: 0.92225 2023-10-02 20:06:44,818 WARNING [train.py:1204] (2/4) Exclude cut with ID _10301_210_5886_1_1529222275332_3910620_216-46959-0_sp1.1 from training. Duration: 0.92725 2023-10-02 20:06:46,256 WARNING [train.py:1204] (2/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_91-277684-0_sp0.9 from training. Duration: 0.82225 2023-10-02 20:06:47,744 WARNING [train.py:1204] (2/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_351-294650-0_sp0.9 from training. Duration: 0.7 2023-10-02 20:06:47,981 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.conv_skip_rate, batch_count=1003066.6666666666, ans=0.0 2023-10-02 20:06:49,163 WARNING [train.py:1204] (2/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_209-205365-0 from training. Duration: 0.79 2023-10-02 20:06:50,954 WARNING [train.py:1204] (2/4) Exclude cut with ID _14603_210_4220_1_1530615688441_3097319_97-325053-0 from training. Duration: 0.92 2023-10-02 20:06:51,052 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_49348_210_7927_1_1533954500_3691300_109-276430-0_sp1.1 from training. Duration: 0.92725 2023-10-02 20:06:52,515 WARNING [train.py:1204] (2/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_912-47473-0 from training. Duration: 0.68 2023-10-02 20:06:53,206 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.3.encoder.layers.3.self_attn2.whiten, num_groups=1, num_channels=512, metric=14.02 vs. limit=22.5 2023-10-02 20:06:54,045 WARNING [train.py:1204] (2/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_96-320736-0_sp0.9 from training. Duration: 0.9555625 2023-10-02 20:07:02,380 WARNING [train.py:1204] (2/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_1252-235481-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 20:07:04,314 WARNING [train.py:1204] (2/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_813-261554-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 20:07:04,375 WARNING [train.py:1204] (2/4) Exclude cut with ID _21632_210_2253_1_1531994001765_3744140_110-274654-0_sp0.9 from training. Duration: 0.911125 2023-10-02 20:07:05,812 WARNING [train.py:1204] (2/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_381-317791-0_sp1.1 from training. Duration: 0.663625 2023-10-02 20:07:07,259 WARNING [train.py:1204] (2/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_51-117576-0_sp1.1 from training. Duration: 0.363625 2023-10-02 20:07:07,283 WARNING [train.py:1204] (2/4) Exclude cut with ID _42321_210_7770_1_1533468631352_4366895_104-215915-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 20:07:09,198 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_40896_210_15710_1_1533433956_7100802_216-22824-0_sp1.1 from training. Duration: 0.92725 2023-10-02 20:07:09,199 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_14831_210_7927_1_1530700200_5583610_181-27467-0 from training. Duration: 0.91 2023-10-02 20:07:11,184 WARNING [train.py:1204] (2/4) Exclude cut with ID _30304_210_6319_1_1532476827261_7582060_1024-265645-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 20:07:11,193 WARNING [train.py:1204] (2/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_412-233904-0_sp1.1 from training. Duration: 0.936375 2023-10-02 20:07:11,209 WARNING [train.py:1204] (2/4) Exclude cut with ID _30191_210_1664_1_1533189915047_7196670_911-223461-0_sp1.1 from training. Duration: 0.92725 2023-10-02 20:07:11,217 WARNING [train.py:1204] (2/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_558-148266-0_sp1.1 from training. Duration: 0.963625 2023-10-02 20:07:11,433 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.balancer2.prob, batch_count=1003133.3333333334, ans=0.125 2023-10-02 20:07:11,754 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.4.encoder.layers.0.conv_module2.whiten, num_groups=1, num_channels=384, metric=12.27 vs. limit=15.0 2023-10-02 20:07:13,658 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.591e+02 1.858e+02 2.032e+02 2.351e+02 3.196e+02, threshold=4.063e+02, percent-clipped=0.0 2023-10-02 20:07:13,834 WARNING [train.py:1204] (2/4) Exclude cut with ID _58480_210_12392_1_1534811121111_3646160_212-164495-0_sp1.1 from training. Duration: 0.936375 2023-10-02 20:07:13,836 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_454-276429-0_sp0.9 from training. Duration: 0.9666875 2023-10-02 20:07:15,178 WARNING [train.py:1204] (2/4) Exclude cut with ID _24736_210_3279_1_1532332894218_2838700_73-197086-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 20:07:16,523 WARNING [train.py:1204] (2/4) Exclude cut with ID _5294_210_1747_1_1527249643928_3696179_529-242298-0_sp1.1 from training. Duration: 0.863625 2023-10-02 20:07:17,867 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_53555_210_5632_1_1534595220_7232297_345-171438-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 20:07:20,808 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_28211_210_6388_1_1532861506_4523224_22-25766-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 20:07:22,191 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7002_210_1794_1_1527939683_5157286_163-87552-0 from training. Duration: 0.86 2023-10-02 20:07:24,255 WARNING [train.py:1204] (2/4) Exclude cut with ID _56927_210_9969_1_1534848970840_3695229_409-225129-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 20:07:24,503 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.balancer1.prob, batch_count=1003200.0, ans=0.125 2023-10-02 20:07:25,692 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_26192_210_7927_1_1532137269_1040115_21-219331-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 20:07:27,092 WARNING [train.py:1204] (2/4) Exclude cut with ID _15012_210_1968_1_1530856820429_5095350_662-210469-0 from training. Duration: 0.57 2023-10-02 20:07:31,187 INFO [train.py:1046] (2/4) Epoch 29, batch 1750, loss[loss=0.1435, simple_loss=0.19, pruned_loss=0.04849, over 19313.00 frames. ], tot_loss[loss=0.1636, simple_loss=0.2409, pruned_loss=0.04314, over 4702577.72 frames. ], batch size: 388, lr: 3.54e-03, grad_scale: 8.0 2023-10-02 20:07:32,712 WARNING [train.py:1204] (2/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_582-191016-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 20:07:33,716 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.0.layers.1.self_attn_weights.whiten_keys, num_groups=4, num_channels=128, metric=2.36 vs. limit=6.0 2023-10-02 20:07:34,157 WARNING [train.py:1204] (2/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_270-210032-0_sp1.1 from training. Duration: 0.963625 2023-10-02 20:07:35,872 WARNING [train.py:1204] (2/4) Exclude cut with ID _6789_210_1534_1_1527857937218_3118950_262-48853-0_sp0.9 from training. Duration: 0.62225 2023-10-02 20:07:35,923 WARNING [train.py:1204] (2/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_847-41334-0 from training. Duration: 0.44 2023-10-02 20:07:35,959 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_40896_210_15710_1_1533433956_7100802_573-46532-0_sp1.1 from training. Duration: 0.92725 2023-10-02 20:07:40,539 WARNING [train.py:1204] (2/4) Exclude cut with ID _21881_210_3525_1_1531825242757_3798380_247-102591-0_sp1.1 from training. Duration: 0.8909375 2023-10-02 20:07:40,561 WARNING [train.py:1204] (2/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_200-21395-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 20:07:42,803 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.conv_skip_rate, batch_count=1003266.6666666666, ans=0.0 2023-10-02 20:07:44,001 WARNING [train.py:1204] (2/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_711-15688-0 from training. Duration: 0.43 2023-10-02 20:07:45,405 WARNING [train.py:1204] (2/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_53-75025-0_sp1.1 from training. Duration: 0.963625 2023-10-02 20:07:47,487 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.3.encoder.layers.0.conv_module1.whiten, num_groups=1, num_channels=512, metric=2.91 vs. limit=15.0 2023-10-02 20:07:48,152 WARNING [train.py:1204] (2/4) Exclude cut with ID _6194_210_1534_1_1527511826160_3941194_321-148641-0 from training. Duration: 0.64 2023-10-02 20:07:48,174 WARNING [train.py:1204] (2/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_337-294431-0_sp1.1 from training. Duration: 0.936375 2023-10-02 20:07:49,548 WARNING [train.py:1204] (2/4) Exclude cut with ID _21632_210_2253_1_1531994001765_3744140_110-274654-0_sp1.1 from training. Duration: 0.7454375 2023-10-02 20:07:52,751 WARNING [train.py:1204] (2/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_135-174080-0_sp1.1 from training. Duration: 0.4818125 2023-10-02 20:07:52,854 WARNING [train.py:1204] (2/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_21-174514-0 from training. Duration: 0.43 2023-10-02 20:07:55,405 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_38965_210_4852_1_1533174906_3484735_215-294589-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 20:07:55,439 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_548-294316-0 from training. Duration: 0.81 2023-10-02 20:08:01,180 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_103-84010-0_sp1.1 from training. Duration: 0.62725 2023-10-02 20:08:04,304 WARNING [train.py:1204] (2/4) Exclude cut with ID _18199_210_6109_1_1531475789864_4288790_295-198247-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 20:08:04,333 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_13757_210_3528_1_1530440682_4107239_103-313224-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 20:08:07,179 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_34886_210_10633_1_1533203882_1755661_67-294673-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 20:08:07,191 WARNING [train.py:1204] (2/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_350-101734-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 20:08:09,157 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_37357_210_17024_1_1533089504_2316121_204-153986-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 20:08:11,267 WARNING [train.py:1204] (2/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_673-115569-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 20:08:13,952 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_489-230703-0_sp0.9 from training. Duration: 0.9444375 2023-10-02 20:08:14,009 WARNING [train.py:1204] (2/4) Exclude cut with ID _5250_210_5219_1_1527249561806_8692110_575-102496-0_sp1.1 from training. Duration: 0.936375 2023-10-02 20:08:15,422 WARNING [train.py:1204] (2/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_669-93163-0 from training. Duration: 0.98 2023-10-02 20:08:18,181 WARNING [train.py:1204] (2/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_249-281349-0_sp0.9 from training. Duration: 0.9444375 2023-10-02 20:08:19,840 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.bypass.skip_rate, batch_count=1003466.6666666666, ans=0.09899494936611666 2023-10-02 20:08:21,452 WARNING [train.py:1204] (2/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_106-30782-0 from training. Duration: 0.8 2023-10-02 20:08:21,509 WARNING [train.py:1204] (2/4) Exclude cut with ID _8772_210_1850_1_1528635595972_3664011_398-320049-0_sp1.1 from training. Duration: 0.8 2023-10-02 20:08:24,294 WARNING [train.py:1204] (2/4) Exclude cut with ID _44778_210_2054_1_1533798127203_7443090_516-151727-0_sp1.1 from training. Duration: 0.963625 2023-10-02 20:08:24,351 WARNING [train.py:1204] (2/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_325-62802-0_sp1.1 from training. Duration: 0.7909375 2023-10-02 20:08:26,367 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.3.encoder.layers.0.feed_forward3.out_whiten, num_groups=1, num_channels=512, metric=11.03 vs. limit=15.0 2023-10-02 20:08:28,432 WARNING [train.py:1204] (2/4) Exclude cut with ID _14368_210_4220_1_1530532714870_3595529_93-58002-0_sp0.9 from training. Duration: 0.6333125 2023-10-02 20:08:29,723 WARNING [train.py:1204] (2/4) Exclude cut with ID _4681_210_3525_1_1527251040321_1235713_123-24428-0_sp1.1 from training. Duration: 0.436375 2023-10-02 20:08:29,770 WARNING [train.py:1204] (2/4) Exclude cut with ID _5250_210_5219_1_1527249561806_8692110_1185-56473-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 20:08:31,169 WARNING [train.py:1204] (2/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_96-244299-0_sp1.1 from training. Duration: 0.8 2023-10-02 20:08:31,460 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.attention_skip_rate, batch_count=1003533.3333333334, ans=0.0 2023-10-02 20:08:33,994 WARNING [train.py:1204] (2/4) Exclude cut with ID _59500_210_8254_1_1534809610680_3736009_104-72815-0_sp1.1 from training. Duration: 0.963625 2023-10-02 20:08:37,464 WARNING [train.py:1204] (2/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_468-283113-0_sp1.1 from training. Duration: 0.87275 2023-10-02 20:08:37,724 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.balancer1.prob, batch_count=1003533.3333333334, ans=0.125 2023-10-02 20:08:39,378 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6239_210_4211_1_1527765584_4306698_528-102186-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 20:08:39,443 WARNING [train.py:1204] (2/4) Exclude cut with ID _45297_210_2721_1_1533715351381_3264949_351-147136-0 from training. Duration: 0.87 2023-10-02 20:08:39,447 WARNING [train.py:1204] (2/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_520-51744-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 20:08:41,364 WARNING [train.py:1204] (2/4) Exclude cut with ID _5166_210_2084_1_1527073197160_4087993_310-114452-0_sp1.1 from training. Duration: 0.536375 2023-10-02 20:08:41,368 WARNING [train.py:1204] (2/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_450-165710-0_sp1.1 from training. Duration: 0.92725 2023-10-02 20:08:41,384 WARNING [train.py:1204] (2/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_280-120942-0_sp1.1 from training. Duration: 0.7 2023-10-02 20:08:41,409 WARNING [train.py:1204] (2/4) Exclude cut with ID _65585_210_12210_1_1535450451584_3764098_112-294301-0_sp0.9 from training. Duration: 0.97775 2023-10-02 20:08:42,661 WARNING [train.py:1204] (2/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_98-51676-0_sp0.9 from training. Duration: 0.811125 2023-10-02 20:08:44,191 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.nonlin_attention.balancer.prob, batch_count=1003533.3333333334, ans=0.125 2023-10-02 20:08:44,195 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.ff3_skip_rate, batch_count=1003533.3333333334, ans=0.0 2023-10-02 20:08:45,404 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_33348_210_5632_1_1533086912_3324030_85-28583-0_sp1.1 from training. Duration: 0.7454375 2023-10-02 20:08:46,644 INFO [train.py:1046] (2/4) Epoch 29, batch 1800, loss[loss=0.1623, simple_loss=0.2432, pruned_loss=0.04068, over 23580.00 frames. ], tot_loss[loss=0.1632, simple_loss=0.2405, pruned_loss=0.04298, over 4708258.33 frames. ], batch size: 149, lr: 3.54e-03, grad_scale: 8.0 2023-10-02 20:08:46,756 WARNING [train.py:1204] (2/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_336-208232-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 20:08:48,220 WARNING [train.py:1204] (2/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_339-104791-0_sp0.9 from training. Duration: 0.5444375 2023-10-02 20:08:48,440 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.balancer1.prob, batch_count=1003600.0, ans=0.125 2023-10-02 20:08:50,972 WARNING [train.py:1204] (2/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_559-29840-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 20:08:54,234 WARNING [train.py:1204] (2/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_117-290916-0_sp0.9 from training. Duration: 0.5666875 2023-10-02 20:08:55,556 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_43802_210_2290_1_1533516203_8829504_185-112289-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 20:08:58,332 WARNING [train.py:1204] (2/4) Exclude cut with ID _59256_210_5986_1_1535076085997_3589120_155-298797-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 20:09:01,082 WARNING [train.py:1204] (2/4) Exclude cut with ID _44659_210_2721_1_1534036120061_6614860_246-206531-0_sp1.1 from training. Duration: 0.92725 2023-10-02 20:09:01,118 WARNING [train.py:1204] (2/4) Exclude cut with ID _23818_210_6228_1_1531917063884_3676920_469-151944-0_sp1.1 from training. Duration: 0.92725 2023-10-02 20:09:02,495 WARNING [train.py:1204] (2/4) Exclude cut with ID _13252_210_1925_1_1530671372047_3336909_88-276668-0_sp1.1 from training. Duration: 0.9 2023-10-02 20:09:03,975 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_15101_210_7927_1_1530786415_5033274_320-327527-0_sp1.1 from training. Duration: 0.87275 2023-10-02 20:09:03,992 WARNING [train.py:1204] (2/4) Exclude cut with ID _14998_210_5219_1_1530705582080_7474970_446-112117-0 from training. Duration: 0.9 2023-10-02 20:09:04,043 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_30280_210_1881_1_1532872779_3660139_400-254159-0_sp1.1 from training. Duration: 0.936375 2023-10-02 20:09:06,856 WARNING [train.py:1204] (2/4) Exclude cut with ID _40284_210_2084_1_1533513679814_3704279_158-345341-0_sp1.1 from training. Duration: 0.936375 2023-10-02 20:09:12,483 WARNING [train.py:1204] (2/4) Exclude cut with ID 3033-130750-0096-55598-0 from training. Duration: 0.83 2023-10-02 20:09:13,876 WARNING [train.py:1204] (2/4) Exclude cut with ID _9389_210_1664_1_1529217337301_5289269_379-128794-0 from training. Duration: 0.85 2023-10-02 20:09:13,898 WARNING [train.py:1204] (2/4) Exclude cut with ID _3675_210_4557_1_1526639934408_4182089_456-18305-0 from training. Duration: 0.76 2023-10-02 20:09:15,193 WARNING [train.py:1204] (2/4) Exclude cut with ID _29853_210_7497_1_1532505600851_6642110_571-249460-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 20:09:15,291 WARNING [train.py:1204] (2/4) Exclude cut with ID _4068_210_2629_1_1526697350395_1546259_230-325568-0_sp1.1 from training. Duration: 0.92725 2023-10-02 20:09:15,292 WARNING [train.py:1204] (2/4) Exclude cut with ID _34415_210_8750_1_1533085213375_3451116_213-267570-0_sp1.1 from training. Duration: 0.963625 2023-10-02 20:09:16,646 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6767_210_3273_1_1527855783_1718932_142-127132-0_sp1.1 from training. Duration: 0.863625 2023-10-02 20:09:22,549 WARNING [train.py:1204] (2/4) Exclude cut with ID IC0653W0001-322925 from training. Duration: 0.896 2023-10-02 20:09:23,976 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_4528_210_3273_1_1527076654_3276777_209-219804-0_sp1.1 from training. Duration: 0.62725 2023-10-02 20:09:25,383 WARNING [train.py:1204] (2/4) Exclude cut with ID _55844_210_2346_1_1534471212569_7625707_187-204971-0_sp1.1 from training. Duration: 0.936375 2023-10-02 20:09:26,797 WARNING [train.py:1204] (2/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_369-325184-0 from training. Duration: 0.99 2023-10-02 20:09:28,223 WARNING [train.py:1204] (2/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_791-45149-0 from training. Duration: 0.86 2023-10-02 20:09:28,281 WARNING [train.py:1204] (2/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_317-211107-0_sp1.1 from training. Duration: 0.836375 2023-10-02 20:09:31,084 WARNING [train.py:1204] (2/4) Exclude cut with ID _30800_210_22382_1_1532499347904_4515959_488-330913-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 20:09:31,404 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.conv_module1.balancer2.prob, batch_count=1003800.0, ans=0.125 2023-10-02 20:09:32,515 WARNING [train.py:1204] (2/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_506-42614-0_sp1.1 from training. Duration: 0.7181875 2023-10-02 20:09:36,719 WARNING [train.py:1204] (2/4) Exclude cut with ID _8696_210_1876_1_1528545632440_3345058_209-54069-0 from training. Duration: 0.67 2023-10-02 20:09:42,205 WARNING [train.py:1204] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_215-202973-0_sp1.1 from training. Duration: 0.8 2023-10-02 20:09:43,454 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.608e+02 1.936e+02 2.168e+02 2.501e+02 3.680e+02, threshold=4.336e+02, percent-clipped=0.0 2023-10-02 20:09:43,576 WARNING [train.py:1204] (2/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_321-215742-0 from training. Duration: 0.5 2023-10-02 20:09:44,247 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_40896_210_15710_1_1533433956_7100802_428-32114-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 20:09:44,255 WARNING [train.py:1204] (2/4) Exclude cut with ID _27013_210_1389_1_1532437173240_3252649_467-263151-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 20:09:45,527 WARNING [train.py:1204] (2/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_734-129761-0_sp0.9 from training. Duration: 0.911125 2023-10-02 20:09:46,774 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_521-214276-0 from training. Duration: 0.44 2023-10-02 20:09:47,040 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.ff3_skip_rate, batch_count=1003866.6666666666, ans=0.0 2023-10-02 20:09:48,229 WARNING [train.py:1204] (2/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_80-147301-0_sp1.1 from training. Duration: 0.763625 2023-10-02 20:09:49,561 WARNING [train.py:1204] (2/4) Exclude cut with ID _9679_210_7497_1_1529129212906_4363929_56-307796-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 20:09:52,769 WARNING [train.py:1204] (2/4) Exclude cut with ID _14832_210_8750_1_1530691351539_3171348_1-54315-0 from training. Duration: 0.87 2023-10-02 20:09:52,770 WARNING [train.py:1204] (2/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_558-155750-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 20:09:54,271 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_142-309370-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 20:09:54,304 WARNING [train.py:1204] (2/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_18-46756-0_sp0.9 from training. Duration: 0.87775 2023-10-02 20:09:54,314 WARNING [train.py:1204] (2/4) Exclude cut with ID _61148_210_6897_1_1535094022726_4217610_279-212159-0_sp1.1 from training. Duration: 0.936375 2023-10-02 20:09:55,686 WARNING [train.py:1204] (2/4) Exclude cut with ID _21987_210_5854_1_1531821500482_3637038_164-2641-0_sp1.1 from training. Duration: 0.936375 2023-10-02 20:09:55,724 WARNING [train.py:1204] (2/4) Exclude cut with ID _8064_210_5399_1_1528372299291_4282973_395-340852-0_sp1.1 from training. Duration: 0.8454375 2023-10-02 20:09:55,871 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.conv_module2.balancer2.prob, batch_count=1003866.6666666666, ans=0.125 2023-10-02 20:09:58,366 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1077-184261-0_sp1.1 from training. Duration: 0.963625 2023-10-02 20:09:58,374 WARNING [train.py:1204] (2/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_1176-204992-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 20:10:00,369 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.4.encoder.layers.1.nonlin_attention.whiten2, num_groups=1, num_channels=384, metric=3.09 vs. limit=15.0 2023-10-02 20:10:01,021 INFO [train.py:1046] (2/4) Epoch 29, batch 1850, loss[loss=0.1603, simple_loss=0.2437, pruned_loss=0.03848, over 24034.00 frames. ], tot_loss[loss=0.1636, simple_loss=0.2417, pruned_loss=0.04274, over 4725565.85 frames. ], batch size: 86, lr: 3.53e-03, grad_scale: 8.0 2023-10-02 20:10:01,116 WARNING [train.py:1204] (2/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_563-347832-0_sp1.1 from training. Duration: 0.8090625 2023-10-02 20:10:02,424 WARNING [train.py:1204] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_380-204756-0_sp0.9 from training. Duration: 0.9555625 2023-10-02 20:10:03,197 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.1.encoder.layers.1.whiten, num_groups=1, num_channels=256, metric=3.86 vs. limit=12.0 2023-10-02 20:10:07,116 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.feed_forward1.out_whiten.whitening_limit, batch_count=1003933.3333333334, ans=15.0 2023-10-02 20:10:09,290 WARNING [train.py:1204] (2/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_212-10501-0_sp1.1 from training. Duration: 0.8545625 2023-10-02 20:10:09,319 WARNING [train.py:1204] (2/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_445-117398-0 from training. Duration: 0.89 2023-10-02 20:10:12,589 WARNING [train.py:1204] (2/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_29-280622-0 from training. Duration: 0.82 2023-10-02 20:10:16,010 WARNING [train.py:1204] (2/4) Exclude cut with ID _26664_210_7770_1_1532258943280_3418270_187-80815-0 from training. Duration: 0.93 2023-10-02 20:10:16,128 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.attention_skip_rate, batch_count=1004000.0, ans=0.0 2023-10-02 20:10:20,708 WARNING [train.py:1204] (2/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_414-349640-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 20:10:20,727 WARNING [train.py:1204] (2/4) Exclude cut with ID _45021_210_9994_1_1533619751094_3330288_96-239010-0 from training. Duration: 0.91 2023-10-02 20:10:20,729 WARNING [train.py:1204] (2/4) Exclude cut with ID _5189_210_4557_1_1527248319428_3769229_199-53526-0_sp0.9 from training. Duration: 0.4666875 2023-10-02 20:10:24,201 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.bypass.scale_min, batch_count=1004000.0, ans=0.2 2023-10-02 20:10:30,737 WARNING [train.py:1204] (2/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_63-101996-0_sp1.1 from training. Duration: 0.8 2023-10-02 20:10:32,166 WARNING [train.py:1204] (2/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_199-86432-0 from training. Duration: 0.98 2023-10-02 20:10:35,105 WARNING [train.py:1204] (2/4) Exclude cut with ID _36399_210_16049_1_1532998771377_3698333_207-259013-0_sp1.1 from training. Duration: 0.92725 2023-10-02 20:10:35,150 WARNING [train.py:1204] (2/4) Exclude cut with ID _62574_210_2655_1_1535180407058_3933249_567-111756-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 20:10:39,192 WARNING [train.py:1204] (2/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_436-318113-0 from training. Duration: 0.75 2023-10-02 20:10:39,247 WARNING [train.py:1204] (2/4) Exclude cut with ID _53379_210_20034_1_1534222027363_3854654_43-305214-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 20:10:40,507 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_15126_210_6753_1_1530778677_2591185_113-157651-0_sp1.1 from training. Duration: 0.6181875 2023-10-02 20:10:40,618 WARNING [train.py:1204] (2/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_146-218763-0_sp1.1 from training. Duration: 0.7818125 2023-10-02 20:10:40,746 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.1.ff3_skip_rate, batch_count=1004066.6666666666, ans=0.0 2023-10-02 20:10:43,210 WARNING [train.py:1204] (2/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_714-310538-0_sp0.9 from training. Duration: 0.9555625 2023-10-02 20:10:45,374 WARNING [train.py:1204] (2/4) Exclude cut with ID _24271_210_12658_1_1531987270022_7190519_709-16721-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 20:10:49,985 WARNING [train.py:1204] (2/4) Exclude cut with ID _5190_210_4557_1_1527244368799_373439_28-197030-0_sp0.9 from training. Duration: 0.8 2023-10-02 20:10:51,781 WARNING [train.py:1204] (2/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_267-197536-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 20:10:51,820 WARNING [train.py:1204] (2/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_658-282610-0_sp1.1 from training. Duration: 0.4818125 2023-10-02 20:10:51,830 WARNING [train.py:1204] (2/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_257-67050-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 20:10:55,183 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_14906_210_7927_1_1530754190_9408633_961-53402-0_sp1.1 from training. Duration: 0.936375 2023-10-02 20:10:55,294 WARNING [train.py:1204] (2/4) Exclude cut with ID _60113_210_16271_1_1535164212481_4386079_490-280290-0_sp1.1 from training. Duration: 0.8909375 2023-10-02 20:10:58,134 WARNING [train.py:1204] (2/4) Exclude cut with ID _14100_210_8341_1_1530529320613_3678069_258-224599-0 from training. Duration: 0.77 2023-10-02 20:10:58,189 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6961_210_2087_1_1527937168_6815011_565-326898-0_sp1.1 from training. Duration: 0.936375 2023-10-02 20:11:01,218 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.attention_skip_rate, batch_count=1004200.0, ans=0.0 2023-10-02 20:11:02,364 WARNING [train.py:1204] (2/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_239-316802-0_sp1.1 from training. Duration: 0.62725 2023-10-02 20:11:03,718 WARNING [train.py:1204] (2/4) Exclude cut with ID _55857_210_2346_1_1534644007831_6898209_745-98391-0_sp1.1 from training. Duration: 0.6909375 2023-10-02 20:11:03,721 WARNING [train.py:1204] (2/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_154-135696-0 from training. Duration: 0.8 2023-10-02 20:11:03,723 WARNING [train.py:1204] (2/4) Exclude cut with ID _14368_210_4220_1_1530532714870_3595529_93-58002-0 from training. Duration: 0.57 2023-10-02 20:11:05,219 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9025W0005-507335 from training. Duration: 0.896 2023-10-02 20:11:06,599 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9029W0034-509435 from training. Duration: 0.64 2023-10-02 20:11:06,713 WARNING [train.py:1204] (2/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_76-203867-0_sp1.1 from training. Duration: 0.6545625 2023-10-02 20:11:06,721 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_46639_210_8341_1_1533726088_4495500_66-346325-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 20:11:06,733 WARNING [train.py:1204] (2/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_145-137790-0_sp0.9 from training. Duration: 0.92225 2023-10-02 20:11:07,991 WARNING [train.py:1204] (2/4) Exclude cut with ID _36522_210_8887_1_1533177030611_1772478_7-327299-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 20:11:09,253 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9042W0009-515615 from training. Duration: 0.896 2023-10-02 20:11:09,263 WARNING [train.py:1204] (2/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_39-248870-0_sp0.9 from training. Duration: 0.7666875 2023-10-02 20:11:09,312 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_35441_210_16554_1_1533347572_4092680_362-242912-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 20:11:09,906 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.4.encoder.layers.1.feed_forward3.out_whiten, num_groups=1, num_channels=384, metric=12.76 vs. limit=15.0 2023-10-02 20:11:10,689 WARNING [train.py:1204] (2/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_216-343007-0_sp1.1 from training. Duration: 0.6 2023-10-02 20:11:10,760 WARNING [train.py:1204] (2/4) Exclude cut with ID _3867_210_1883_1_1526641200575_4475830_272-215563-0_sp0.9 from training. Duration: 0.6333125 2023-10-02 20:11:12,109 WARNING [train.py:1204] (2/4) Exclude cut with ID _21783_210_11357_1_1531742458730_3762679_553-175603-0_sp1.1 from training. Duration: 0.92725 2023-10-02 20:11:13,496 WARNING [train.py:1204] (2/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_963-187109-0 from training. Duration: 0.99 2023-10-02 20:11:14,856 INFO [train.py:1046] (2/4) Epoch 29, batch 1900, loss[loss=0.174, simple_loss=0.2519, pruned_loss=0.04803, over 23190.00 frames. ], tot_loss[loss=0.1634, simple_loss=0.242, pruned_loss=0.04235, over 4736306.96 frames. ], batch size: 105, lr: 3.53e-03, grad_scale: 8.0 2023-10-02 20:11:14,952 WARNING [train.py:1204] (2/4) Exclude cut with ID _44973_210_1511_1_1533794381163_3634770_495-211466-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 20:11:14,962 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9068W0002-529085 from training. Duration: 0.896 2023-10-02 20:11:14,984 WARNING [train.py:1204] (2/4) Exclude cut with ID _5294_210_1747_1_1527249643928_3696179_180-100713-0_sp1.1 from training. Duration: 0.736375 2023-10-02 20:11:17,018 WARNING [train.py:1204] (2/4) Exclude cut with ID _35799_210_5854_1_1533263487887_3529071_149-49689-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 20:11:17,191 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.conv_module2.balancer2.prob, batch_count=1004266.6666666666, ans=0.125 2023-10-02 20:11:19,048 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.3.encoder.layers.1.self_attn_weights.whiten_keys, num_groups=8, num_channels=256, metric=4.59 vs. limit=6.0 2023-10-02 20:11:21,274 WARNING [train.py:1204] (2/4) Exclude cut with ID _67725_210_27767_1_1535608756671_5105359_36-220794-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 20:11:23,723 WARNING [train.py:1204] (2/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_338-96520-0_sp1.1 from training. Duration: 0.8545625 2023-10-02 20:11:23,793 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9104W0004-547160 from training. Duration: 0.896 2023-10-02 20:11:25,100 WARNING [train.py:1204] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_330-327861-0 from training. Duration: 0.67 2023-10-02 20:11:26,489 WARNING [train.py:1204] (2/4) Exclude cut with ID _14087_210_2253_1_1530529224512_3014029_156-257607-0_sp0.9 from training. Duration: 0.92225 2023-10-02 20:11:27,823 WARNING [train.py:1204] (2/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_476-322050-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 20:11:27,832 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9118W0017-553385 from training. Duration: 0.896 2023-10-02 20:11:27,855 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9120W0028-554435 from training. Duration: 0.896 2023-10-02 20:11:30,709 WARNING [train.py:1204] (2/4) Exclude cut with ID _56524_210_9642_1_1534572011779_3619170_145-310842-0 from training. Duration: 0.99 2023-10-02 20:11:32,079 WARNING [train.py:1204] (2/4) Exclude cut with ID _51631_210_8254_1_1534129228420_4084794_270-222508-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 20:11:36,104 WARNING [train.py:1204] (2/4) Exclude cut with ID _14268_210_9206_1_1530622771285_4714099_331-105351-0 from training. Duration: 0.65 2023-10-02 20:11:38,703 WARNING [train.py:1204] (2/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_40-129756-0 from training. Duration: 0.81 2023-10-02 20:11:48,166 WARNING [train.py:1204] (2/4) Exclude cut with ID _3541_210_2477_1_1526475408709_3263973_231-104736-0 from training. Duration: 0.78 2023-10-02 20:11:49,669 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.balancer2.prob, batch_count=1004400.0, ans=0.125 2023-10-02 20:11:50,875 WARNING [train.py:1204] (2/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_214-291369-0 from training. Duration: 0.63 2023-10-02 20:11:52,752 WARNING [train.py:1204] (2/4) Exclude cut with ID _62574_210_2655_1_1535180407058_3933249_369-84347-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 20:11:52,821 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9210W0111-599240 from training. Duration: 0.896 2023-10-02 20:11:54,050 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_92-246270-0 from training. Duration: 0.85 2023-10-02 20:11:54,084 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_37152_210_9697_1_1533106573_3844188_29-239070-0 from training. Duration: 0.98 2023-10-02 20:11:54,142 WARNING [train.py:1204] (2/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_239-22076-0 from training. Duration: 0.85 2023-10-02 20:11:54,145 WARNING [train.py:1204] (2/4) Exclude cut with ID _65962_210_29927_1_1535436026519_4174930_368-44639-0_sp1.1 from training. Duration: 0.97275 2023-10-02 20:11:58,863 WARNING [train.py:1204] (2/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_152-212533-0 from training. Duration: 0.97 2023-10-02 20:12:00,403 WARNING [train.py:1204] (2/4) Exclude cut with ID _14603_210_4220_1_1530615688441_3097319_90-31383-0_sp1.1 from training. Duration: 0.7909375 2023-10-02 20:12:02,404 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.feed_forward1.out_whiten.whitening_limit, batch_count=1004466.6666666666, ans=15.0 2023-10-02 20:12:03,145 WARNING [train.py:1204] (2/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_196-152868-0_sp1.1 from training. Duration: 0.92725 2023-10-02 20:12:03,153 WARNING [train.py:1204] (2/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_140-181176-0 from training. Duration: 0.92 2023-10-02 20:12:05,843 WARNING [train.py:1204] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_168-32906-0_sp0.9 from training. Duration: 0.7666875 2023-10-02 20:12:08,676 WARNING [train.py:1204] (2/4) Exclude cut with ID _14922_210_6228_1_1530707435273_3693310_13-165512-0 from training. Duration: 0.82 2023-10-02 20:12:08,706 WARNING [train.py:1204] (2/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_381-317791-0_sp0.9 from training. Duration: 0.811125 2023-10-02 20:12:11,359 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.593e+02 1.825e+02 1.966e+02 2.194e+02 3.470e+02, threshold=3.932e+02, percent-clipped=0.0 2023-10-02 20:12:16,045 WARNING [train.py:1204] (2/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_63-267585-0_sp1.1 from training. Duration: 0.8181875 2023-10-02 20:12:16,047 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_326-303945-0_sp1.1 from training. Duration: 0.9 2023-10-02 20:12:16,062 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_194-143381-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 20:12:16,127 WARNING [train.py:1204] (2/4) Exclude cut with ID _39290_210_4220_1_1533862778760_3075118_194-16667-0_sp1.1 from training. Duration: 0.963625 2023-10-02 20:12:17,947 WARNING [train.py:1204] (2/4) Exclude cut with ID _15012_210_1968_1_1530856820429_5095350_662-210469-0_sp0.9 from training. Duration: 0.6333125 2023-10-02 20:12:19,284 WARNING [train.py:1204] (2/4) Exclude cut with ID _4697_210_2069_1_1526815801953_3823098_68-219807-0_sp1.1 from training. Duration: 0.42725 2023-10-02 20:12:19,346 WARNING [train.py:1204] (2/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_321-22821-0_sp0.9 from training. Duration: 0.988875 2023-10-02 20:12:23,504 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_14056_210_4163_1_1530604483_7572827_1037-45317-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 20:12:23,505 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_403-39619-0_sp1.1 from training. Duration: 0.763625 2023-10-02 20:12:26,831 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_20120_210_6753_1_1531643035_5482859_510-96996-0_sp0.9 from training. Duration: 0.9666875 2023-10-02 20:12:26,834 WARNING [train.py:1204] (2/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_225-180155-0_sp1.1 from training. Duration: 0.92725 2023-10-02 20:12:28,200 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_278-272612-0_sp0.9 from training. Duration: 0.811125 2023-10-02 20:12:30,107 INFO [train.py:1046] (2/4) Epoch 29, batch 1950, loss[loss=0.154, simple_loss=0.2412, pruned_loss=0.03339, over 24502.00 frames. ], tot_loss[loss=0.1648, simple_loss=0.2437, pruned_loss=0.04294, over 4738658.39 frames. ], batch size: 66, lr: 3.53e-03, grad_scale: 8.0 2023-10-02 20:12:30,145 WARNING [train.py:1204] (2/4) Exclude cut with ID _53713_210_24116_1_1534314487762_7416751_691-70244-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 20:12:32,909 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6788_210_1388_1_1527898698_4562021_91-197981-0_sp1.1 from training. Duration: 0.7545625 2023-10-02 20:12:34,358 WARNING [train.py:1204] (2/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_60-18123-0_sp0.9 from training. Duration: 0.97775 2023-10-02 20:12:34,396 WARNING [train.py:1204] (2/4) Exclude cut with ID _35799_210_5854_1_1533263487887_3529071_5-214617-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 20:12:34,400 WARNING [train.py:1204] (2/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_890-5117-0_sp1.1 from training. Duration: 0.7090625 2023-10-02 20:12:35,847 WARNING [train.py:1204] (2/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_660-211088-0 from training. Duration: 0.62 2023-10-02 20:12:37,221 WARNING [train.py:1204] (2/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_314-126819-0_sp0.9 from training. Duration: 0.5555625 2023-10-02 20:12:38,438 WARNING [train.py:1204] (2/4) Exclude cut with ID _21632_210_2253_1_1531994001765_3744140_362-66225-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 20:12:39,845 WARNING [train.py:1204] (2/4) Exclude cut with ID _61118_210_9527_1_1534897668884_3752630_173-258555-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 20:12:42,500 WARNING [train.py:1204] (2/4) Exclude cut with ID _7719_210_3525_1_1528456153163_4296960_88-250259-0_sp0.9 from training. Duration: 0.9333125 2023-10-02 20:12:42,533 WARNING [train.py:1204] (2/4) Exclude cut with ID _15140_210_9288_1_1530791388450_4880149_539-153708-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 20:12:42,554 WARNING [train.py:1204] (2/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_253-42544-0_sp1.1 from training. Duration: 0.97275 2023-10-02 20:12:45,468 WARNING [train.py:1204] (2/4) Exclude cut with ID _33640_210_2655_1_1533117605479_4855469_479-296877-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 20:12:49,497 WARNING [train.py:1204] (2/4) Exclude cut with ID 3033-130750-0096-55598-0_sp1.1 from training. Duration: 0.7545625 2023-10-02 20:12:49,508 WARNING [train.py:1204] (2/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_191-41023-0_sp1.1 from training. Duration: 0.6454375 2023-10-02 20:12:49,525 WARNING [train.py:1204] (2/4) Exclude cut with ID _8365_210_2064_1_1528592311070_4677690_431-294102-0_sp1.1 from training. Duration: 0.8090625 2023-10-02 20:12:49,565 WARNING [train.py:1204] (2/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_893-296030-0_sp1.1 from training. Duration: 0.97275 2023-10-02 20:12:53,965 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.3.encoder.layers.1.feed_forward3.out_whiten, num_groups=1, num_channels=512, metric=10.10 vs. limit=15.0 2023-10-02 20:12:54,685 WARNING [train.py:1204] (2/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_401-343820-0_sp1.1 from training. Duration: 0.97275 2023-10-02 20:12:57,526 WARNING [train.py:1204] (2/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_127-566-0_sp1.1 from training. Duration: 0.763625 2023-10-02 20:12:57,527 WARNING [train.py:1204] (2/4) Exclude cut with ID _14100_210_8341_1_1530529320613_3678069_126-272847-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 20:12:57,541 WARNING [train.py:1204] (2/4) Exclude cut with ID _15133_210_6228_1_1530794512074_3080379_29-264458-0_sp1.1 from training. Duration: 0.52725 2023-10-02 20:12:57,541 WARNING [train.py:1204] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_127-119109-0 from training. Duration: 0.72 2023-10-02 20:12:59,491 WARNING [train.py:1204] (2/4) Exclude cut with ID _6537_210_1883_1_1527987559710_3597129_382-133062-0_sp1.1 from training. Duration: 0.6181875 2023-10-02 20:12:59,511 WARNING [train.py:1204] (2/4) Exclude cut with ID _29529_210_7234_1_1532393961455_3977460_391-219682-0_sp1.1 from training. Duration: 0.936375 2023-10-02 20:12:59,561 WARNING [train.py:1204] (2/4) Exclude cut with ID _37537_210_2253_1_1533171758568_3421949_96-137189-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 20:13:04,371 WARNING [train.py:1204] (2/4) Exclude cut with ID _58414_210_8254_1_1535421609162_7151182_15-276301-0_sp1.1 from training. Duration: 0.97275 2023-10-02 20:13:07,000 WARNING [train.py:1204] (2/4) Exclude cut with ID _59502_210_12210_1_1535107024698_1835139_110-289074-0_sp1.1 from training. Duration: 0.92725 2023-10-02 20:13:08,504 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.feed_forward1.hidden_balancer.prob, batch_count=1004733.3333333334, ans=0.125 2023-10-02 20:13:09,929 WARNING [train.py:1204] (2/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_367-61330-0_sp1.1 from training. Duration: 0.6818125 2023-10-02 20:13:12,688 WARNING [train.py:1204] (2/4) Exclude cut with ID _45297_210_2721_1_1533715351381_3264949_353-9541-0_sp1.1 from training. Duration: 0.8818125 2023-10-02 20:13:14,051 WARNING [train.py:1204] (2/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_85-138257-0_sp0.9 from training. Duration: 0.788875 2023-10-02 20:13:14,077 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_3787_210_2040_1_1526477544_5478586_603-120324-0 from training. Duration: 0.81 2023-10-02 20:13:14,132 WARNING [train.py:1204] (2/4) Exclude cut with ID _42863_210_9070_1_1533902398788_3696920_411-204131-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 20:13:19,603 WARNING [train.py:1204] (2/4) Exclude cut with ID _30196_210_1664_1_1533708496522_7074140_822-289909-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 20:13:20,957 WARNING [train.py:1204] (2/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_32-335103-0_sp0.9 from training. Duration: 0.911125 2023-10-02 20:13:21,014 WARNING [train.py:1204] (2/4) Exclude cut with ID _6493_210_1747_1_1528459210424_4012470_513-140967-0_sp0.9 from training. Duration: 0.87775 2023-10-02 20:13:25,877 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder_embed.conv.8.prob, batch_count=1004800.0, ans=0.125 2023-10-02 20:13:30,379 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_47896_210_20508_1_1534492518_5958868_439-257766-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 20:13:30,437 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_1294-63152-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 20:13:30,673 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=1004866.6666666666, ans=0.1 2023-10-02 20:13:33,797 WARNING [train.py:1204] (2/4) Exclude cut with ID _59170_210_6721_1_1534983633960_4203335_319-166512-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 20:13:36,470 WARNING [train.py:1204] (2/4) Exclude cut with ID _3541_210_2477_1_1526475408709_3263973_146-3470-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 20:13:38,392 WARNING [train.py:1204] (2/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_678-159307-0_sp1.1 from training. Duration: 0.77275 2023-10-02 20:13:38,423 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_343-48822-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 20:13:38,473 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_4692_210_3528_1_1526899755_4323254_107-44401-0 from training. Duration: 0.86 2023-10-02 20:13:38,478 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6291_210_3273_1_1527683649_1078498_74-292608-0_sp1.1 from training. Duration: 0.6545625 2023-10-02 20:13:38,537 WARNING [train.py:1204] (2/4) Exclude cut with ID _55644_210_11856_1_1534593599582_4103719_293-248694-0_sp1.1 from training. Duration: 0.97275 2023-10-02 20:13:41,269 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_3172_210_2087_1_1526292929_3987865_197-97762-0 from training. Duration: 0.75 2023-10-02 20:13:41,436 WARNING [train.py:1204] (2/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_264-348896-0_sp0.9 from training. Duration: 0.9444375 2023-10-02 20:13:43,990 INFO [train.py:1046] (2/4) Epoch 29, batch 2000, loss[loss=0.1517, simple_loss=0.2264, pruned_loss=0.03848, over 23776.00 frames. ], tot_loss[loss=0.1647, simple_loss=0.2437, pruned_loss=0.04289, over 4738471.77 frames. ], batch size: 150, lr: 3.53e-03, grad_scale: 16.0 2023-10-02 20:13:45,431 WARNING [train.py:1204] (2/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_209-205365-0_sp0.9 from training. Duration: 0.87775 2023-10-02 20:13:46,860 WARNING [train.py:1204] (2/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_222-198127-0_sp0.9 from training. Duration: 0.9333125 2023-10-02 20:13:46,899 WARNING [train.py:1204] (2/4) Exclude cut with ID _57572_210_5517_1_1534921219624_3809484_52-43530-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 20:13:49,536 WARNING [train.py:1204] (2/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_96-320736-0_sp1.1 from training. Duration: 0.7818125 2023-10-02 20:13:52,201 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_13091_210_7925_1_1530609447_8987354_1159-225508-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 20:13:55,548 WARNING [train.py:1204] (2/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_237-44332-0 from training. Duration: 0.94 2023-10-02 20:13:55,603 WARNING [train.py:1204] (2/4) Exclude cut with ID _7970_210_1883_1_1528455616296_3472230_25-178407-0_sp0.9 from training. Duration: 0.788875 2023-10-02 20:13:59,735 WARNING [train.py:1204] (2/4) Exclude cut with ID _29003_210_19596_1_1532507391720_3894098_357-257062-0_sp1.1 from training. Duration: 0.936375 2023-10-02 20:14:01,111 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_809-17156-0 from training. Duration: 0.73 2023-10-02 20:14:02,931 WARNING [train.py:1204] (2/4) Exclude cut with ID _7160_210_1876_1_1527940923231_3703799_302-150728-0_sp1.1 from training. Duration: 0.7090625 2023-10-02 20:14:02,940 WARNING [train.py:1204] (2/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_444-28041-0_sp0.9 from training. Duration: 0.9444375 2023-10-02 20:14:04,453 WARNING [train.py:1204] (2/4) Exclude cut with ID _52892_210_6721_1_1534378941259_4124129_278-251666-0_sp1.1 from training. Duration: 0.92725 2023-10-02 20:14:05,824 WARNING [train.py:1204] (2/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_115-326778-0 from training. Duration: 0.93 2023-10-02 20:14:07,188 WARNING [train.py:1204] (2/4) Exclude cut with ID _41391_210_7994_1_1533603771872_3638201_31-188961-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 20:14:09,162 WARNING [train.py:1204] (2/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_1073-6264-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 20:14:09,188 WARNING [train.py:1204] (2/4) Exclude cut with ID _66868_210_9527_1_1535509287954_4561140_435-259515-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 20:14:10,530 WARNING [train.py:1204] (2/4) Exclude cut with ID _14886_210_4220_1_1530702020355_3516190_159-73748-0 from training. Duration: 0.83 2023-10-02 20:14:11,818 WARNING [train.py:1204] (2/4) Exclude cut with ID _15150_210_8341_1_1530752411803_7256384_469-239509-0_sp1.1 from training. Duration: 0.6181875 2023-10-02 20:14:13,269 WARNING [train.py:1204] (2/4) Exclude cut with ID _8064_210_5399_1_1528372299291_4282973_395-340852-0 from training. Duration: 0.93 2023-10-02 20:14:13,273 WARNING [train.py:1204] (2/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_138-187031-0_sp1.1 from training. Duration: 0.9 2023-10-02 20:14:14,804 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_13757_210_3528_1_1530440682_4107239_48-106347-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 20:14:16,274 WARNING [train.py:1204] (2/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_164-224052-0_sp1.1 from training. Duration: 0.636375 2023-10-02 20:14:16,284 WARNING [train.py:1204] (2/4) Exclude cut with ID _59288_210_5854_1_1535077757774_3412246_408-322648-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 20:14:17,563 WARNING [train.py:1204] (2/4) Exclude cut with ID _30191_210_1664_1_1533189915047_7196670_827-36359-0_sp1.1 from training. Duration: 0.963625 2023-10-02 20:14:18,966 WARNING [train.py:1204] (2/4) Exclude cut with ID _4680_210_3525_1_1527249757953_893386_75-78979-0_sp1.1 from training. Duration: 0.82725 2023-10-02 20:14:20,373 WARNING [train.py:1204] (2/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_383-165234-0 from training. Duration: 0.94 2023-10-02 20:14:23,152 WARNING [train.py:1204] (2/4) Exclude cut with ID _19896_210_1883_1_1531709022662_6649569_94-229927-0 from training. Duration: 0.97 2023-10-02 20:14:23,159 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6788_210_1388_1_1527898698_4562021_180-75288-0_sp1.1 from training. Duration: 0.9 2023-10-02 20:14:23,175 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_26433_210_12108_1_1532157158_4056480_54-321349-0_sp1.1 from training. Duration: 0.97275 2023-10-02 20:14:27,946 WARNING [train.py:1204] (2/4) Exclude cut with ID _56108_210_16380_1_1534505446159_3887959_265-161493-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 20:14:29,766 WARNING [train.py:1204] (2/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_395-111602-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 20:14:29,772 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_154-316450-0_sp0.9 from training. Duration: 0.8444375 2023-10-02 20:14:29,831 WARNING [train.py:1204] (2/4) Exclude cut with ID _43552_210_8913_1_1533726031059_3523429_381-116041-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 20:14:32,557 WARNING [train.py:1204] (2/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_65-104124-0_sp1.1 from training. Duration: 0.963625 2023-10-02 20:14:32,630 WARNING [train.py:1204] (2/4) Exclude cut with ID _6536_210_1883_1_1527850783339_3319069_169-23811-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 20:14:33,912 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_289-180904-0_sp0.9 from training. Duration: 0.8444375 2023-10-02 20:14:33,930 WARNING [train.py:1204] (2/4) Exclude cut with ID _56406_210_9288_1_1534899618664_3496149_226-242400-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 20:14:35,319 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_24899_210_16554_1_1532046422_3827462_276-216330-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 20:14:38,132 WARNING [train.py:1204] (2/4) Exclude cut with ID _29345_210_4381_1_1532689168263_3860169_198-174328-0_sp1.1 from training. Duration: 0.82725 2023-10-02 20:14:38,776 WARNING [train.py:1204] (2/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_338-96520-0 from training. Duration: 0.94 2023-10-02 20:14:38,983 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.attention_skip_rate, batch_count=1005133.3333333334, ans=0.0 2023-10-02 20:14:39,875 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.521e+02 1.873e+02 1.992e+02 2.268e+02 3.359e+02, threshold=3.985e+02, percent-clipped=0.0 2023-10-02 20:14:41,482 WARNING [train.py:1204] (2/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_7-111954-0_sp1.1 from training. Duration: 0.5090625 2023-10-02 20:14:42,761 WARNING [train.py:1204] (2/4) Exclude cut with ID _36692_210_5665_1_1533608967420_398809_8-16261-0_sp1.1 from training. Duration: 0.97275 2023-10-02 20:14:44,272 WARNING [train.py:1204] (2/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_86-212657-0_sp1.1 from training. Duration: 0.97275 2023-10-02 20:14:44,285 WARNING [train.py:1204] (2/4) Exclude cut with ID _44659_210_2721_1_1534036120061_6614860_626-298884-0_sp1.1 from training. Duration: 0.8818125 2023-10-02 20:14:48,345 WARNING [train.py:1204] (2/4) Exclude cut with ID _49755_210_18053_1_1534581005846_3218709_120-257918-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 20:14:49,614 WARNING [train.py:1204] (2/4) Exclude cut with ID _21881_210_3525_1_1531825242757_3798380_400-113821-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 20:14:49,616 WARNING [train.py:1204] (2/4) Exclude cut with ID _21783_210_11357_1_1531742458730_3762679_387-236773-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 20:14:51,520 WARNING [train.py:1204] (2/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_613-232856-0_sp1.1 from training. Duration: 0.4909375 2023-10-02 20:14:51,546 WARNING [train.py:1204] (2/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_181-78632-0_sp0.9 from training. Duration: 0.8666875 2023-10-02 20:14:53,179 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=1005200.0, ans=0.1 2023-10-02 20:14:54,264 WARNING [train.py:1204] (2/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_175-17485-0_sp1.1 from training. Duration: 0.97275 2023-10-02 20:14:56,257 WARNING [train.py:1204] (2/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_1004-183422-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 20:14:57,723 INFO [train.py:1046] (2/4) Epoch 29, batch 2050, loss[loss=0.1377, simple_loss=0.2211, pruned_loss=0.02718, over 24441.00 frames. ], tot_loss[loss=0.1646, simple_loss=0.2431, pruned_loss=0.04305, over 4721407.41 frames. ], batch size: 58, lr: 3.53e-03, grad_scale: 16.0 2023-10-02 20:14:57,784 WARNING [train.py:1204] (2/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_32-65962-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 20:14:57,855 WARNING [train.py:1204] (2/4) Exclude cut with ID _15458_210_8341_1_1530858614263_3834410_40-298664-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 20:15:03,277 WARNING [train.py:1204] (2/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_380-261863-0_sp1.1 from training. Duration: 0.963625 2023-10-02 20:15:05,848 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_372-173012-0_sp1.1 from training. Duration: 0.863625 2023-10-02 20:15:05,896 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_57-82205-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 20:15:06,725 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.bypass.skip_rate, batch_count=1005266.6666666666, ans=0.07 2023-10-02 20:15:07,210 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.3.encoder.layers.1.conv_module2.whiten, num_groups=1, num_channels=512, metric=3.10 vs. limit=15.0 2023-10-02 20:15:07,764 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_3691_210_3528_1_1526468169_4062053_355-334432-0_sp0.9 from training. Duration: 0.9666875 2023-10-02 20:15:10,373 WARNING [train.py:1204] (2/4) Exclude cut with ID _19023_210_4353_1_1531378892552_5820780_28-36615-0 from training. Duration: 0.97 2023-10-02 20:15:10,385 WARNING [train.py:1204] (2/4) Exclude cut with ID _29853_210_7497_1_1532505600851_6642110_286-251333-0_sp1.1 from training. Duration: 0.92725 2023-10-02 20:15:12,105 WARNING [train.py:1204] (2/4) Exclude cut with ID _31602_210_5854_1_1532677021456_4458250_518-80679-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 20:15:13,462 WARNING [train.py:1204] (2/4) Exclude cut with ID _14922_210_6228_1_1530707435273_3693310_38-187851-0_sp0.9 from training. Duration: 0.688875 2023-10-02 20:15:23,797 WARNING [train.py:1204] (2/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_39-248870-0_sp1.1 from training. Duration: 0.62725 2023-10-02 20:15:23,823 WARNING [train.py:1204] (2/4) Exclude cut with ID _16908_210_5724_1_1531119157248_3772700_154-31455-0_sp1.1 from training. Duration: 0.97275 2023-10-02 20:15:25,212 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8453_210_4211_1_1528502940_8562730_347-330673-0 from training. Duration: 0.72 2023-10-02 20:15:27,332 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.balancer2.prob, batch_count=1005400.0, ans=0.125 2023-10-02 20:15:28,480 WARNING [train.py:1204] (2/4) Exclude cut with ID _30191_210_1664_1_1533189915047_7196670_674-204247-0_sp1.1 from training. Duration: 0.97275 2023-10-02 20:15:28,576 WARNING [train.py:1204] (2/4) Exclude cut with ID _8833_210_5219_1_1528592665210_7553059_329-125156-0 from training. Duration: 0.82 2023-10-02 20:15:28,608 WARNING [train.py:1204] (2/4) Exclude cut with ID _4779_210_2084_1_1527318022944_3914893_500-285013-0_sp1.1 from training. Duration: 0.62725 2023-10-02 20:15:33,274 WARNING [train.py:1204] (2/4) Exclude cut with ID _14421_210_8750_1_1530604795825_3542290_190-313578-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 20:15:35,915 WARNING [train.py:1204] (2/4) Exclude cut with ID _35799_210_5854_1_1533263487887_3529071_538-273574-0_sp1.1 from training. Duration: 0.936375 2023-10-02 20:15:36,142 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=1005400.0, ans=0.1 2023-10-02 20:15:37,281 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_14928_210_5632_1_1530773461_4714000_454-112198-0_sp1.1 from training. Duration: 0.6 2023-10-02 20:15:37,330 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_468-143126-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 20:15:38,775 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_4528_210_3273_1_1527076654_3276777_258-121681-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 20:15:40,154 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_397-132565-0_sp1.1 from training. Duration: 0.9 2023-10-02 20:15:40,181 WARNING [train.py:1204] (2/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_454-123002-0_sp0.9 from training. Duration: 0.7666875 2023-10-02 20:15:42,201 WARNING [train.py:1204] (2/4) Exclude cut with ID _27052_210_5986_1_1532329197187_3585009_297-86890-0_sp1.1 from training. Duration: 0.936375 2023-10-02 20:15:42,327 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.0.conv_skip_rate, batch_count=1005466.6666666666, ans=0.0 2023-10-02 20:15:44,807 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_51225_210_3601_1_1534400634_2327383_55-301844-0_sp1.1 from training. Duration: 0.8454375 2023-10-02 20:15:46,403 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.self_attn_weights.pos_emb_skip_rate, batch_count=1005466.6666666666, ans=0.0 2023-10-02 20:15:47,493 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6786_210_1388_1_1527839699_4194495_378-46070-0_sp0.9 from training. Duration: 0.8 2023-10-02 20:15:48,772 WARNING [train.py:1204] (2/4) Exclude cut with ID _42334_210_7770_1_1534206610396_3605880_55-19758-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 20:15:52,956 WARNING [train.py:1204] (2/4) Exclude cut with ID _14884_210_2252_1_1530707459689_5561250_114-201399-0_sp0.9 from training. Duration: 0.8444375 2023-10-02 20:15:58,000 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_99-251314-0_sp0.9 from training. Duration: 0.9666875 2023-10-02 20:15:58,066 WARNING [train.py:1204] (2/4) Exclude cut with ID _26528_210_9070_1_1532570410842_3851310_497-97218-0 from training. Duration: 0.86 2023-10-02 20:16:04,156 WARNING [train.py:1204] (2/4) Exclude cut with ID _55328_210_27767_1_1534580973260_4088170_230-317241-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 20:16:04,208 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_125-203105-0_sp0.9 from training. Duration: 0.888875 2023-10-02 20:16:06,885 WARNING [train.py:1204] (2/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_55-31904-0_sp1.1 from training. Duration: 0.8 2023-10-02 20:16:10,020 WARNING [train.py:1204] (2/4) Exclude cut with ID _8064_210_5399_1_1528372299291_4282973_18-174647-0 from training. Duration: 0.91 2023-10-02 20:16:11,384 INFO [train.py:1046] (2/4) Epoch 29, batch 2100, loss[loss=0.1579, simple_loss=0.2316, pruned_loss=0.04207, over 24421.00 frames. ], tot_loss[loss=0.1642, simple_loss=0.2421, pruned_loss=0.04314, over 4714523.18 frames. ], batch size: 58, lr: 3.53e-03, grad_scale: 8.0 2023-10-02 20:16:12,811 WARNING [train.py:1204] (2/4) Exclude cut with ID IC0165W0005-81726 from training. Duration: 0.952 2023-10-02 20:16:12,812 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_31583_210_3852_1_1532595851_4024413_148-171847-0_sp1.1 from training. Duration: 0.963625 2023-10-02 20:16:14,211 WARNING [train.py:1204] (2/4) Exclude cut with ID _21989_210_5854_1_1531899214931_3271360_116-138769-0_sp1.1 from training. Duration: 0.936375 2023-10-02 20:16:14,266 WARNING [train.py:1204] (2/4) Exclude cut with ID _66070_210_7640_1_1535441393096_7805310_489-292631-0_sp1.1 from training. Duration: 0.7181875 2023-10-02 20:16:15,647 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_20120_210_6753_1_1531643035_5482859_202-27960-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 20:16:15,655 WARNING [train.py:1204] (2/4) Exclude cut with ID _5294_210_1747_1_1527249643928_3696179_342-270278-0 from training. Duration: 0.55 2023-10-02 20:16:15,707 WARNING [train.py:1204] (2/4) Exclude cut with ID _38323_210_12108_1_1533121233369_3303049_416-11919-0 from training. Duration: 0.96 2023-10-02 20:16:17,055 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6545_210_2040_1_1527774670_4245258_407-48070-0_sp0.9 from training. Duration: 0.8444375 2023-10-02 20:16:19,858 WARNING [train.py:1204] (2/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_503-30719-0_sp1.1 from training. Duration: 0.8909375 2023-10-02 20:16:19,928 WARNING [train.py:1204] (2/4) Exclude cut with ID _49643_210_7989_1_1534640244224_3939031_234-326497-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 20:16:24,614 WARNING [train.py:1204] (2/4) Exclude cut with ID _14170_210_5219_1_1530525949868_4469650_301-77873-0_sp1.1 from training. Duration: 0.963625 2023-10-02 20:16:25,971 WARNING [train.py:1204] (2/4) Exclude cut with ID _45297_210_2721_1_1533715351381_3264949_152-154697-0_sp1.1 from training. Duration: 0.97275 2023-10-02 20:16:25,979 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_3371_210_1794_1_1526384462_5580777_396-339812-0 from training. Duration: 0.85 2023-10-02 20:16:26,796 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.conv_module1.whiten.whitening_limit, batch_count=1005666.6666666666, ans=15.0 2023-10-02 20:16:27,302 WARNING [train.py:1204] (2/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_399-156791-0_sp1.1 from training. Duration: 0.7545625 2023-10-02 20:16:27,338 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7696_210_2040_1_1528207167_3716099_117-125434-0 from training. Duration: 0.76 2023-10-02 20:16:27,343 WARNING [train.py:1204] (2/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_96-244299-0 from training. Duration: 0.88 2023-10-02 20:16:28,744 WARNING [train.py:1204] (2/4) Exclude cut with ID _37892_210_19596_1_1533198677897_3964058_519-343462-0_sp1.1 from training. Duration: 0.92725 2023-10-02 20:16:28,765 WARNING [train.py:1204] (2/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_202-183922-0_sp1.1 from training. Duration: 0.77275 2023-10-02 20:16:28,773 WARNING [train.py:1204] (2/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_453-133983-0 from training. Duration: 0.78 2023-10-02 20:16:30,117 WARNING [train.py:1204] (2/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_178-39722-0_sp1.1 from training. Duration: 0.4454375 2023-10-02 20:16:35,570 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_15126_210_6753_1_1530778677_2591185_113-157651-0 from training. Duration: 0.68 2023-10-02 20:16:35,571 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_271-234962-0_sp1.1 from training. Duration: 0.7181875 2023-10-02 20:16:38,374 WARNING [train.py:1204] (2/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_195-64967-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 20:16:39,685 WARNING [train.py:1204] (2/4) Exclude cut with ID _43552_210_8913_1_1533726031059_3523429_365-215065-0_sp1.1 from training. Duration: 0.936375 2023-10-02 20:16:41,648 WARNING [train.py:1204] (2/4) Exclude cut with ID _31602_210_5854_1_1532677021456_4458250_249-96604-0_sp0.9 from training. Duration: 0.988875 2023-10-02 20:16:43,040 WARNING [train.py:1204] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_307-158462-0 from training. Duration: 0.69 2023-10-02 20:16:43,077 WARNING [train.py:1204] (2/4) Exclude cut with ID _30918_210_6400_1_1532754078600_3125920_407-223394-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 20:16:43,082 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6673_210_3273_1_1527767790_3173240_351-167954-0_sp0.9 from training. Duration: 0.5333125 2023-10-02 20:16:45,736 WARNING [train.py:1204] (2/4) Exclude cut with ID _4866_210_2577_1_1527404412284_4364659_256-306830-0 from training. Duration: 0.87 2023-10-02 20:16:45,785 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_17613_210_2151_1_1531137569_2366160_222-131118-0_sp1.1 from training. Duration: 0.963625 2023-10-02 20:16:45,812 WARNING [train.py:1204] (2/4) Exclude cut with ID _45560_210_21982_1_1533812603951_3584999_111-6376-0 from training. Duration: 0.84 2023-10-02 20:16:45,834 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_216-73841-0 from training. Duration: 0.96 2023-10-02 20:16:47,175 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_14058_210_4163_1_1530690953_6327180_49-210687-0 from training. Duration: 0.94 2023-10-02 20:16:48,535 WARNING [train.py:1204] (2/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_506-42614-0_sp0.9 from training. Duration: 0.87775 2023-10-02 20:16:48,784 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.bypass_mid.scale_min, batch_count=1005733.3333333334, ans=0.2 2023-10-02 20:16:50,087 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.bypass.scale_min, batch_count=1005733.3333333334, ans=0.2 2023-10-02 20:16:51,212 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_3133_210_2087_1_1526106446_2324690_25-332138-0_sp1.1 from training. Duration: 0.82725 2023-10-02 20:16:52,789 WARNING [train.py:1204] (2/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_589-9070-0_sp1.1 from training. Duration: 0.6454375 2023-10-02 20:16:54,730 WARNING [train.py:1204] (2/4) Exclude cut with ID _6194_210_1534_1_1527511826160_3941194_14-282876-0_sp1.1 from training. Duration: 0.6454375 2023-10-02 20:16:56,175 WARNING [train.py:1204] (2/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_1166-90654-0_sp1.1 from training. Duration: 0.92725 2023-10-02 20:16:57,568 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_14056_210_4163_1_1530604483_7572827_218-255858-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 20:16:57,570 WARNING [train.py:1204] (2/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_1285-256622-0 from training. Duration: 0.84 2023-10-02 20:16:57,589 WARNING [train.py:1204] (2/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_205-286261-0_sp1.1 from training. Duration: 0.963625 2023-10-02 20:16:57,605 WARNING [train.py:1204] (2/4) Exclude cut with ID _68777_210_4353_1_1535810390731_3117904_6-154119-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 20:16:58,905 WARNING [train.py:1204] (2/4) Exclude cut with ID _22982_210_1883_1_1531882790447_5449409_565-199393-0_sp1.1 from training. Duration: 0.92725 2023-10-02 20:16:58,921 WARNING [train.py:1204] (2/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_278-154492-0 from training. Duration: 0.76 2023-10-02 20:17:00,725 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_426-313557-0 from training. Duration: 0.53 2023-10-02 20:17:02,055 WARNING [train.py:1204] (2/4) Exclude cut with ID _41817_210_18835_1_1533434477160_7423049_555-293448-0 from training. Duration: 0.8 2023-10-02 20:17:02,304 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.conv_module2.balancer2.prob, batch_count=1005800.0, ans=0.125 2023-10-02 20:17:03,603 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.attention_skip_rate, batch_count=1005800.0, ans=0.0 2023-10-02 20:17:06,353 WARNING [train.py:1204] (2/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_32-335103-0_sp1.1 from training. Duration: 0.7454375 2023-10-02 20:17:09,075 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.543e+02 1.849e+02 2.136e+02 2.701e+02 4.119e+02, threshold=4.273e+02, percent-clipped=1.0 2023-10-02 20:17:09,177 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_2655_210_2087_1_1525688959_3897400_202-147557-0_sp1.1 from training. Duration: 0.77275 2023-10-02 20:17:09,206 WARNING [train.py:1204] (2/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_158-250116-0 from training. Duration: 0.62 2023-10-02 20:17:16,489 WARNING [train.py:1204] (2/4) Exclude cut with ID _55429_210_10626_1_1534467588527_7174143_528-88083-0_sp1.1 from training. Duration: 0.963625 2023-10-02 20:17:17,946 WARNING [train.py:1204] (2/4) Exclude cut with ID _8030_210_7497_1_1528444886781_3531089_32-31837-0_sp1.1 from training. Duration: 0.8818125 2023-10-02 20:17:19,312 WARNING [train.py:1204] (2/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_744-86168-0_sp1.1 from training. Duration: 0.97275 2023-10-02 20:17:19,320 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1113-7369-0_sp0.9 from training. Duration: 0.9444375 2023-10-02 20:17:19,344 WARNING [train.py:1204] (2/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_124-345840-0_sp0.9 from training. Duration: 0.57775 2023-10-02 20:17:19,386 WARNING [train.py:1204] (2/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_328-264776-0_sp0.9 from training. Duration: 0.7444375 2023-10-02 20:17:20,812 WARNING [train.py:1204] (2/4) Exclude cut with ID _18895_210_6228_1_1531357274234_3784930_469-166615-0_sp1.1 from training. Duration: 0.963625 2023-10-02 20:17:20,818 WARNING [train.py:1204] (2/4) Exclude cut with ID _8395_210_1531_1_1528597871523_4411579_431-132470-0_sp0.9 from training. Duration: 0.82225 2023-10-02 20:17:22,210 WARNING [train.py:1204] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_554-45617-0_sp1.1 from training. Duration: 0.9 2023-10-02 20:17:22,243 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_35361_210_4238_1_1533262810_7793476_524-188242-0_sp1.1 from training. Duration: 0.92725 2023-10-02 20:17:23,656 WARNING [train.py:1204] (2/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_938-205135-0 from training. Duration: 0.87 2023-10-02 20:17:25,432 INFO [train.py:1046] (2/4) Epoch 29, batch 2150, loss[loss=0.1626, simple_loss=0.2527, pruned_loss=0.03622, over 24585.00 frames. ], tot_loss[loss=0.1634, simple_loss=0.2412, pruned_loss=0.04277, over 4709686.68 frames. ], batch size: 71, lr: 3.53e-03, grad_scale: 8.0 2023-10-02 20:17:26,905 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_5931_210_2040_1_1527428481_4547999_321-193614-0 from training. Duration: 0.67 2023-10-02 20:17:26,917 WARNING [train.py:1204] (2/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_445-269521-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 20:17:28,292 WARNING [train.py:1204] (2/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_3-33116-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 20:17:28,293 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_4633_210_2040_1_1526824163_4145343_416-76117-0_sp1.1 from training. Duration: 0.863625 2023-10-02 20:17:28,328 WARNING [train.py:1204] (2/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_1033-274376-0_sp1.1 from training. Duration: 0.7909375 2023-10-02 20:17:29,636 WARNING [train.py:1204] (2/4) Exclude cut with ID _8772_210_1850_1_1528635595972_3664011_398-320049-0_sp0.9 from training. Duration: 0.97775 2023-10-02 20:17:34,388 WARNING [train.py:1204] (2/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_321-215742-0_sp0.9 from training. Duration: 0.5555625 2023-10-02 20:17:37,457 WARNING [train.py:1204] (2/4) Exclude cut with ID _28394_210_8913_1_1532430049518_3650118_272-190950-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 20:17:38,791 WARNING [train.py:1204] (2/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_31-299371-0_sp1.1 from training. Duration: 0.92725 2023-10-02 20:17:40,215 WARNING [train.py:1204] (2/4) Exclude cut with ID _55383_210_10635_1_1534468426172_2231669_134-128246-0_sp0.9 from training. Duration: 0.988875 2023-10-02 20:17:40,217 WARNING [train.py:1204] (2/4) Exclude cut with ID _40519_210_13794_1_1533715228655_7337390_887-9547-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 20:17:40,249 WARNING [train.py:1204] (2/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_310-109461-0_sp0.9 from training. Duration: 0.888875 2023-10-02 20:17:43,432 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_2009_210_2071_1_1525481134_4588937_258-250196-0_sp1.1 from training. Duration: 0.92725 2023-10-02 20:17:44,729 WARNING [train.py:1204] (2/4) Exclude cut with ID _9235_210_5219_1_1528891154253_8046120_1186-72702-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 20:17:44,731 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_14831_210_7927_1_1530700200_5583610_181-27467-0_sp1.1 from training. Duration: 0.82725 2023-10-02 20:17:47,533 WARNING [train.py:1204] (2/4) Exclude cut with ID _40402_210_18219_1_1533522324115_2953150_278-132109-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 20:17:48,938 WARNING [train.py:1204] (2/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_261-297503-0 from training. Duration: 0.99 2023-10-02 20:17:51,763 WARNING [train.py:1204] (2/4) Exclude cut with ID _19896_210_1883_1_1531709022662_6649569_313-265903-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 20:17:53,139 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_105-114797-0_sp1.1 from training. Duration: 0.763625 2023-10-02 20:17:53,238 WARNING [train.py:1204] (2/4) Exclude cut with ID _53755_210_8254_1_1534577312941_4707360_9-65843-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 20:17:54,565 WARNING [train.py:1204] (2/4) Exclude cut with ID _18516_210_9206_1_1531704653206_3692406_361-224549-0_sp1.1 from training. Duration: 0.97275 2023-10-02 20:17:54,599 WARNING [train.py:1204] (2/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_403-310975-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 20:17:55,900 WARNING [train.py:1204] (2/4) Exclude cut with ID _5444_210_2151_1_1527418808464_5312210_581-315411-0_sp0.9 from training. Duration: 0.688875 2023-10-02 20:17:55,935 WARNING [train.py:1204] (2/4) Exclude cut with ID _23825_210_13449_1_1531915313526_3497150_464-154904-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 20:17:55,943 WARNING [train.py:1204] (2/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_937-208281-0_sp0.9 from training. Duration: 0.9444375 2023-10-02 20:17:56,006 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_37839_210_7234_1_1533542462_3689303_158-181212-0_sp1.1 from training. Duration: 0.963625 2023-10-02 20:17:57,993 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.feed_forward3.hidden_balancer.prob, batch_count=1006066.6666666666, ans=0.125 2023-10-02 20:17:59,028 WARNING [train.py:1204] (2/4) Exclude cut with ID _8772_210_1850_1_1528635595972_3664011_398-320049-0 from training. Duration: 0.88 2023-10-02 20:17:59,151 WARNING [train.py:1204] (2/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_718-250907-0_sp1.1 from training. Duration: 0.62725 2023-10-02 20:17:59,724 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.4.encoder.layers.0.self_attn2.whiten, num_groups=1, num_channels=384, metric=13.35 vs. limit=22.5 2023-10-02 20:18:00,460 WARNING [train.py:1204] (2/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_641-126395-0_sp1.1 from training. Duration: 0.92725 2023-10-02 20:18:00,491 WARNING [train.py:1204] (2/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_166-241493-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 20:18:01,919 WARNING [train.py:1204] (2/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_526-62079-0_sp0.9 from training. Duration: 0.7444375 2023-10-02 20:18:03,764 WARNING [train.py:1204] (2/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_439-275083-0_sp1.1 from training. Duration: 0.936375 2023-10-02 20:18:07,002 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_66907_210_21581_1_1535615661_3590177_493-229032-0_sp1.1 from training. Duration: 0.92725 2023-10-02 20:18:07,038 WARNING [train.py:1204] (2/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_89-321955-0_sp1.1 from training. Duration: 0.72725 2023-10-02 20:18:07,134 WARNING [train.py:1204] (2/4) Exclude cut with ID _29857_210_20034_1_1532395886753_3798160_273-226195-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 20:18:07,140 WARNING [train.py:1204] (2/4) Exclude cut with ID _22826_210_9994_1_1531908201522_3279169_404-75889-0 from training. Duration: 0.89 2023-10-02 20:18:08,560 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_271-234962-0_sp0.9 from training. Duration: 0.87775 2023-10-02 20:18:13,093 WARNING [train.py:1204] (2/4) Exclude cut with ID _22961_210_8254_1_1531976366672_4278180_156-339685-0_sp1.1 from training. Duration: 0.97275 2023-10-02 20:18:13,166 WARNING [train.py:1204] (2/4) Exclude cut with ID _37892_210_19596_1_1533198677897_3964058_425-231927-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 20:18:13,252 WARNING [train.py:1204] (2/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_102-288670-0_sp1.1 from training. Duration: 0.97275 2023-10-02 20:18:14,593 WARNING [train.py:1204] (2/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_283-220372-0_sp1.1 from training. Duration: 0.5090625 2023-10-02 20:18:15,934 WARNING [train.py:1204] (2/4) Exclude cut with ID _44304_210_15374_1_1533639352765_4613129_86-1699-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 20:18:16,013 WARNING [train.py:1204] (2/4) Exclude cut with ID _2891_210_2550_1_1525874398521_4427710_327-121590-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 20:18:16,019 WARNING [train.py:1204] (2/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_369-222060-0 from training. Duration: 0.71 2023-10-02 20:18:17,467 WARNING [train.py:1204] (2/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_762-109886-0 from training. Duration: 0.91 2023-10-02 20:18:18,864 WARNING [train.py:1204] (2/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_477-255782-0_sp0.9 from training. Duration: 0.911125 2023-10-02 20:18:18,898 WARNING [train.py:1204] (2/4) Exclude cut with ID IC0669W0061-330906 from training. Duration: 0.768 2023-10-02 20:18:20,080 WARNING [train.py:1204] (2/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_347-213071-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 20:18:20,119 WARNING [train.py:1204] (2/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_507-303370-0_sp1.1 from training. Duration: 0.87275 2023-10-02 20:18:21,441 WARNING [train.py:1204] (2/4) Exclude cut with ID _15012_210_1968_1_1530856820429_5095350_688-178849-0 from training. Duration: 0.64 2023-10-02 20:18:21,446 WARNING [train.py:1204] (2/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_28-70987-0_sp0.9 from training. Duration: 0.9666875 2023-10-02 20:18:21,448 WARNING [train.py:1204] (2/4) Exclude cut with ID _5910_210_1389_1_1527337972454_5050100_555-159815-0 from training. Duration: 0.98 2023-10-02 20:18:21,482 WARNING [train.py:1204] (2/4) Exclude cut with ID IC0679W0064-335871 from training. Duration: 0.768 2023-10-02 20:18:21,482 WARNING [train.py:1204] (2/4) Exclude cut with ID IC0679W0079-335886 from training. Duration: 0.896 2023-10-02 20:18:22,791 WARNING [train.py:1204] (2/4) Exclude cut with ID _5608_210_2106_1_1527415439089_6819768_909-82767-0 from training. Duration: 0.61 2023-10-02 20:18:23,068 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.conv_module2.balancer2.prob, batch_count=1006200.0, ans=0.125 2023-10-02 20:18:24,203 WARNING [train.py:1204] (2/4) Exclude cut with ID _23038_210_9570_1_1532001584158_3652419_411-323967-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 20:18:24,260 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_40896_210_15710_1_1533433956_7100802_350-231748-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 20:18:24,271 WARNING [train.py:1204] (2/4) Exclude cut with ID _5289_210_2084_1_1527678202736_3779299_142-103317-0_sp0.9 from training. Duration: 0.9333125 2023-10-02 20:18:25,729 WARNING [train.py:1204] (2/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_314-144055-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 20:18:25,789 WARNING [train.py:1204] (2/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_177-236809-0_sp0.9 from training. Duration: 0.5444375 2023-10-02 20:18:29,053 WARNING [train.py:1204] (2/4) Exclude cut with ID _23038_210_9570_1_1532001584158_3652419_82-334733-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 20:18:29,066 WARNING [train.py:1204] (2/4) Exclude cut with ID _43552_210_8913_1_1533726031059_3523429_225-22526-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 20:18:37,898 WARNING [train.py:1204] (2/4) Exclude cut with ID _30327_210_16049_1_1532484161295_3924169_401-70819-0_sp1.1 from training. Duration: 0.7818125 2023-10-02 20:18:39,724 INFO [train.py:1046] (2/4) Epoch 29, batch 2200, loss[loss=0.1779, simple_loss=0.2578, pruned_loss=0.04895, over 24657.00 frames. ], tot_loss[loss=0.1635, simple_loss=0.2413, pruned_loss=0.04287, over 4700501.75 frames. ], batch size: 65, lr: 3.53e-03, grad_scale: 8.0 2023-10-02 20:18:39,777 WARNING [train.py:1204] (2/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_538-32291-0 from training. Duration: 0.83 2023-10-02 20:18:43,267 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_37357_210_17024_1_1533089504_2316121_258-211605-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 20:18:48,314 WARNING [train.py:1204] (2/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_550-335198-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 20:18:48,369 WARNING [train.py:1204] (2/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_98-741-0_sp0.9 from training. Duration: 0.888875 2023-10-02 20:18:49,666 WARNING [train.py:1204] (2/4) Exclude cut with ID _38323_210_12108_1_1533121233369_3303049_251-238988-0_sp1.1 from training. Duration: 0.92725 2023-10-02 20:18:49,744 WARNING [train.py:1204] (2/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_259-34038-0_sp0.9 from training. Duration: 0.82225 2023-10-02 20:18:52,588 WARNING [train.py:1204] (2/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_1060-180633-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 20:18:52,617 WARNING [train.py:1204] (2/4) Exclude cut with ID _8755_210_1876_1_1528628559357_3258209_211-37473-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 20:18:52,622 WARNING [train.py:1204] (2/4) Exclude cut with ID _4122_210_2084_1_1526713164417_4353280_528-206771-0 from training. Duration: 0.88 2023-10-02 20:18:56,903 WARNING [train.py:1204] (2/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_64-248359-0 from training. Duration: 0.69 2023-10-02 20:18:59,552 WARNING [train.py:1204] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_402-253027-0_sp1.1 from training. Duration: 0.4909375 2023-10-02 20:19:04,366 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.feed_forward2.hidden_balancer.prob, batch_count=1006333.3333333334, ans=0.125 2023-10-02 20:19:05,588 WARNING [train.py:1204] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_625-102955-0 from training. Duration: 0.77 2023-10-02 20:19:07,696 WARNING [train.py:1204] (2/4) Exclude cut with ID _14998_210_5219_1_1530705582080_7474970_611-219324-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 20:19:09,026 WARNING [train.py:1204] (2/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_718-68505-0_sp0.9 from training. Duration: 0.988875 2023-10-02 20:19:09,073 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_353-182855-0_sp1.1 from training. Duration: 0.87275 2023-10-02 20:19:13,490 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_5960_210_1794_1_1527403865_3990999_515-2613-0_sp0.9 from training. Duration: 0.97775 2023-10-02 20:19:13,509 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_5575_210_1410_1_1527301305_2570170_342-265572-0 from training. Duration: 0.94 2023-10-02 20:19:16,561 WARNING [train.py:1204] (2/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_329-113173-0_sp0.9 from training. Duration: 0.688875 2023-10-02 20:19:18,019 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_15396_210_6890_1_1531308473_1938936_213-312506-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 20:19:19,272 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_521-214276-0_sp1.1 from training. Duration: 0.4 2023-10-02 20:19:20,856 WARNING [train.py:1204] (2/4) Exclude cut with ID _15026_210_8750_1_1530777602316_3291850_211-195043-0_sp1.1 from training. Duration: 0.72725 2023-10-02 20:19:22,158 WARNING [train.py:1204] (2/4) Exclude cut with ID _29343_210_4381_1_1532602757752_3674109_108-257494-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 20:19:23,593 WARNING [train.py:1204] (2/4) Exclude cut with ID _64414_210_17301_1_1535243252048_3757759_403-58151-0_sp1.1 from training. Duration: 0.963625 2023-10-02 20:19:23,774 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.1.feed_forward1.out_proj.dropout_p, batch_count=1006466.6666666666, ans=0.1 2023-10-02 20:19:24,917 WARNING [train.py:1204] (2/4) Exclude cut with ID _36512_210_6987_1_1533033143998_3126069_109-177787-0_sp1.1 from training. Duration: 0.92725 2023-10-02 20:19:25,214 INFO [scaling.py:1118] (2/4) WithLoss: name=encoder.encoders.3.encoder.layers.2.self_attn_weights, loss-sum=0.000e+00 2023-10-02 20:19:26,197 WARNING [train.py:1204] (2/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_498-40153-0 from training. Duration: 0.63 2023-10-02 20:19:27,547 WARNING [train.py:1204] (2/4) Exclude cut with ID _56447_210_1389_1_1535074245910_3697189_161-334523-0_sp1.1 from training. Duration: 0.936375 2023-10-02 20:19:29,347 WARNING [train.py:1204] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_238-245655-0 from training. Duration: 0.81 2023-10-02 20:19:29,568 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.conv_module2.balancer2.prob, batch_count=1006466.6666666666, ans=0.125 2023-10-02 20:19:31,978 WARNING [train.py:1204] (2/4) Exclude cut with ID _64631_210_27767_1_1535349580068_4165160_108-43865-0_sp1.1 from training. Duration: 0.92725 2023-10-02 20:19:31,992 WARNING [train.py:1204] (2/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_243-42116-0_sp1.1 from training. Duration: 0.663625 2023-10-02 20:19:32,017 WARNING [train.py:1204] (2/4) Exclude cut with ID _66868_210_9527_1_1535509287954_4561140_594-132160-0_sp1.1 from training. Duration: 0.92725 2023-10-02 20:19:34,742 WARNING [train.py:1204] (2/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_48-338976-0_sp0.9 from training. Duration: 0.988875 2023-10-02 20:19:34,798 WARNING [train.py:1204] (2/4) Exclude cut with ID _5250_210_5219_1_1527249561806_8692110_137-100595-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 20:19:34,804 WARNING [train.py:1204] (2/4) Exclude cut with ID _49748_210_24116_1_1533986971737_3158910_12-271669-0_sp1.1 from training. Duration: 0.936375 2023-10-02 20:19:36,201 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.518e+02 1.909e+02 2.141e+02 2.599e+02 8.500e+02, threshold=4.282e+02, percent-clipped=2.0 2023-10-02 20:19:36,288 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_37357_210_17024_1_1533089504_2316121_206-80421-0_sp1.1 from training. Duration: 0.936375 2023-10-02 20:19:36,385 WARNING [train.py:1204] (2/4) Exclude cut with ID _7537_210_2084_1_1528502395431_3924359_113-199933-0_sp1.1 from training. Duration: 0.536375 2023-10-02 20:19:38,242 WARNING [train.py:1204] (2/4) Exclude cut with ID _55440_210_12651_1_1534496713364_2980520_31-59289-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 20:19:38,512 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.feed_forward1.hidden_balancer.prob, batch_count=1006533.3333333334, ans=0.125 2023-10-02 20:19:39,751 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1083-220122-0_sp0.9 from training. Duration: 0.5666875 2023-10-02 20:19:40,620 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.conv_module2.balancer1.prob, batch_count=1006533.3333333334, ans=0.125 2023-10-02 20:19:43,023 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_426-313557-0_sp1.1 from training. Duration: 0.4818125 2023-10-02 20:19:44,650 WARNING [train.py:1204] (2/4) Exclude cut with ID _35799_210_5854_1_1533263487887_3529071_324-94843-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 20:19:46,025 WARNING [train.py:1204] (2/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_264-108685-0_sp1.1 from training. Duration: 0.5 2023-10-02 20:19:47,367 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9012W0006-501141 from training. Duration: 0.896 2023-10-02 20:19:48,765 WARNING [train.py:1204] (2/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_890-5117-0_sp0.9 from training. Duration: 0.8666875 2023-10-02 20:19:48,810 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9023W0008-506301 from training. Duration: 0.896 2023-10-02 20:19:50,174 WARNING [train.py:1204] (2/4) Exclude cut with ID _13916_210_9994_1_1530788341126_3763149_60-191556-0_sp0.9 from training. Duration: 0.67775 2023-10-02 20:19:51,509 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9029W0331-509721 from training. Duration: 0.896 2023-10-02 20:19:52,844 INFO [train.py:1046] (2/4) Epoch 29, batch 2250, loss[loss=0.1711, simple_loss=0.2555, pruned_loss=0.04338, over 24461.00 frames. ], tot_loss[loss=0.1646, simple_loss=0.2421, pruned_loss=0.04357, over 4697513.78 frames. ], batch size: 66, lr: 3.53e-03, grad_scale: 8.0 2023-10-02 20:19:52,937 WARNING [train.py:1204] (2/4) Exclude cut with ID _35823_210_6917_1_1533187881914_7297830_567-108596-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 20:19:52,970 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7543_210_6753_1_1528544010_1110402_25-183860-0_sp1.1 from training. Duration: 0.42725 2023-10-02 20:19:54,167 WARNING [train.py:1204] (2/4) Exclude cut with ID _47278_210_12041_1_1533902418349_3576110_495-260035-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 20:19:54,283 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9051W0015-520281 from training. Duration: 0.9045625 2023-10-02 20:19:56,933 WARNING [train.py:1204] (2/4) Exclude cut with ID _14459_210_2228_1_1531443641946_7516700_707-66193-0_sp1.1 from training. Duration: 0.97275 2023-10-02 20:20:00,376 WARNING [train.py:1204] (2/4) Exclude cut with ID _14087_210_2253_1_1530529224512_3014029_219-224445-0_sp1.1 from training. Duration: 0.763625 2023-10-02 20:20:04,774 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.conv_module2.balancer2.prob, batch_count=1006600.0, ans=0.125 2023-10-02 20:20:05,902 WARNING [train.py:1204] (2/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_345-167986-0_sp0.9 from training. Duration: 0.9333125 2023-10-02 20:20:06,013 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_34815_210_6388_1_1533381320_3571903_279-305565-0_sp1.1 from training. Duration: 0.836375 2023-10-02 20:20:09,664 WARNING [train.py:1204] (2/4) Exclude cut with ID _21987_210_5854_1_1531821500482_3637038_317-179212-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 20:20:11,422 WARNING [train.py:1204] (2/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_395-37222-0_sp1.1 from training. Duration: 0.6909375 2023-10-02 20:20:11,487 WARNING [train.py:1204] (2/4) Exclude cut with ID _8064_210_5399_1_1528372299291_4282973_168-50337-0_sp1.1 from training. Duration: 0.763625 2023-10-02 20:20:14,780 WARNING [train.py:1204] (2/4) Exclude cut with ID _60113_210_16271_1_1535164212481_4386079_559-167191-0 from training. Duration: 0.93 2023-10-02 20:20:14,790 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_47228_210_4983_1_1533780021_3672027_344-179642-0_sp1.1 from training. Duration: 0.92725 2023-10-02 20:20:16,126 WARNING [train.py:1204] (2/4) Exclude cut with ID _22961_210_8254_1_1531976366672_4278180_317-199546-0_sp0.9 from training. Duration: 0.9555625 2023-10-02 20:20:17,582 WARNING [train.py:1204] (2/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_141-89727-0 from training. Duration: 0.71 2023-10-02 20:20:18,869 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_78-116075-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 20:20:18,887 WARNING [train.py:1204] (2/4) Exclude cut with ID _64086_210_7994_1_1535270417019_7435949_52-235756-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 20:20:20,361 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.conv_skip_rate, batch_count=1006666.6666666666, ans=0.0 2023-10-02 20:20:21,516 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7615_210_4211_1_1528110965_4580160_72-235211-0_sp1.1 from training. Duration: 0.6909375 2023-10-02 20:20:26,923 WARNING [train.py:1204] (2/4) Exclude cut with ID _14069_210_5632_1_1530512692033_4803390_287-27950-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 20:20:28,300 WARNING [train.py:1204] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_74-48717-0_sp0.9 from training. Duration: 0.6444375 2023-10-02 20:20:28,329 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7544_210_6753_1_1528591327_1471426_172-348777-0_sp1.1 from training. Duration: 0.6 2023-10-02 20:20:29,761 WARNING [train.py:1204] (2/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_220-191080-0 from training. Duration: 0.98 2023-10-02 20:20:31,235 WARNING [train.py:1204] (2/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_1081-137641-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 20:20:34,666 WARNING [train.py:1204] (2/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_278-235459-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 20:20:37,510 WARNING [train.py:1204] (2/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_1181-77470-0_sp1.1 from training. Duration: 0.936375 2023-10-02 20:20:37,883 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.bypass.skip_rate, batch_count=1006800.0, ans=0.04949747468305833 2023-10-02 20:20:38,613 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.0.layers.0.self_attn1.whiten, num_groups=1, num_channels=192, metric=11.89 vs. limit=22.5 2023-10-02 20:20:38,890 WARNING [train.py:1204] (2/4) Exclude cut with ID _60860_210_8254_1_1534896015875_3557169_198-5531-0_sp1.1 from training. Duration: 0.936375 2023-10-02 20:20:40,308 WARNING [train.py:1204] (2/4) Exclude cut with ID _12369_210_4381_1_1530356078581_4388520_17-289781-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 20:20:40,320 WARNING [train.py:1204] (2/4) Exclude cut with ID _50310_210_15488_1_1534068043345_3641080_146-199752-0_sp1.1 from training. Duration: 0.92725 2023-10-02 20:20:41,746 WARNING [train.py:1204] (2/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_969-213354-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 20:20:43,712 WARNING [train.py:1204] (2/4) Exclude cut with ID _70519_210_4163_1_1535961595994_7458230_66-344156-0_sp1.1 from training. Duration: 0.963625 2023-10-02 20:20:47,493 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.balancer2.prob, batch_count=1006800.0, ans=0.125 2023-10-02 20:20:48,532 WARNING [train.py:1204] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_380-204756-0_sp1.1 from training. Duration: 0.7818125 2023-10-02 20:20:49,944 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_128-202104-0_sp0.9 from training. Duration: 0.7 2023-10-02 20:20:54,060 WARNING [train.py:1204] (2/4) Exclude cut with ID _4697_210_2069_1_1526815801953_3823098_51-1049-0_sp0.9 from training. Duration: 0.8333125 2023-10-02 20:20:54,084 WARNING [train.py:1204] (2/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_86-66192-0_sp1.1 from training. Duration: 0.5 2023-10-02 20:20:55,319 WARNING [train.py:1204] (2/4) Exclude cut with ID _14069_210_5632_1_1530512692033_4803390_95-1896-0_sp1.1 from training. Duration: 0.8909375 2023-10-02 20:20:58,184 WARNING [train.py:1204] (2/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_29-293896-0_sp1.1 from training. Duration: 0.5545625 2023-10-02 20:21:00,873 WARNING [train.py:1204] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_167-137255-0_sp0.9 from training. Duration: 0.711125 2023-10-02 20:21:00,873 WARNING [train.py:1204] (2/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_217-95056-0 from training. Duration: 0.98 2023-10-02 20:21:00,888 WARNING [train.py:1204] (2/4) Exclude cut with ID _47278_210_12041_1_1533902418349_3576110_97-266746-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 20:21:00,925 WARNING [train.py:1204] (2/4) Exclude cut with ID _14224_210_4381_1_1530529062037_4264010_380-343670-0_sp1.1 from training. Duration: 0.8 2023-10-02 20:21:04,054 WARNING [train.py:1204] (2/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_107-332398-0 from training. Duration: 0.78 2023-10-02 20:21:06,749 INFO [train.py:1046] (2/4) Epoch 29, batch 2300, loss[loss=0.1641, simple_loss=0.2527, pruned_loss=0.03779, over 23847.00 frames. ], tot_loss[loss=0.1656, simple_loss=0.2432, pruned_loss=0.04403, over 4703280.62 frames. ], batch size: 86, lr: 3.53e-03, grad_scale: 8.0 2023-10-02 20:21:06,849 WARNING [train.py:1204] (2/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_91-267696-0_sp1.1 from training. Duration: 0.7909375 2023-10-02 20:21:06,875 WARNING [train.py:1204] (2/4) Exclude cut with ID _41298_210_9019_1_1533430371450_4822143_498-186993-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 20:21:11,712 WARNING [train.py:1204] (2/4) Exclude cut with ID _17956_210_4353_1_1531209568125_6979970_54-77610-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 20:21:11,951 INFO [scaling.py:1118] (2/4) WithLoss: name=encoder.encoders.2.encoder.layers.2.self_attn_weights, loss-sum=0.000e+00 2023-10-02 20:21:13,529 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_40047_210_2290_1_1533290519_2374311_257-335162-0_sp1.1 from training. Duration: 0.863625 2023-10-02 20:21:16,151 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9653W0193-675471 from training. Duration: 0.9656875 2023-10-02 20:21:17,519 WARNING [train.py:1204] (2/4) Exclude cut with ID _25248_210_8750_1_1532062825941_5554180_243-240536-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 20:21:24,425 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_32360_210_4198_1_1532689160_3648436_60-51908-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 20:21:24,442 WARNING [train.py:1204] (2/4) Exclude cut with ID _4092_210_2151_1_1526643044449_3135759_227-213972-0_sp0.9 from training. Duration: 0.6 2023-10-02 20:21:24,470 WARNING [train.py:1204] (2/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_392-173500-0_sp1.1 from training. Duration: 0.97275 2023-10-02 20:21:24,995 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.5.encoder.layers.0.feed_forward1.out_whiten, num_groups=1, num_channels=256, metric=6.66 vs. limit=15.0 2023-10-02 20:21:25,900 WARNING [train.py:1204] (2/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_71-329150-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 20:21:25,909 WARNING [train.py:1204] (2/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_866-173440-0 from training. Duration: 0.9 2023-10-02 20:21:27,328 WARNING [train.py:1204] (2/4) Exclude cut with ID _14832_210_8750_1_1530691351539_3171348_1-54315-0_sp0.9 from training. Duration: 0.9666875 2023-10-02 20:21:30,071 WARNING [train.py:1204] (2/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_430-116139-0_sp1.1 from training. Duration: 0.77275 2023-10-02 20:21:31,389 WARNING [train.py:1204] (2/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_356-45054-0_sp0.9 from training. Duration: 0.9555625 2023-10-02 20:21:34,804 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_3531_210_3528_1_1526294203_5216456_309-227302-0_sp0.9 from training. Duration: 0.7666875 2023-10-02 20:21:37,452 WARNING [train.py:1204] (2/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_301-330455-0_sp1.1 from training. Duration: 0.72725 2023-10-02 20:21:39,734 WARNING [train.py:1204] (2/4) Exclude cut with ID _23557_210_8750_1_1531890106370_6846255_667-145945-0_sp1.1 from training. Duration: 0.92725 2023-10-02 20:21:44,350 WARNING [train.py:1204] (2/4) Exclude cut with ID _14860_210_4915_1_1530702020340_7343240_836-328489-0_sp1.1 from training. Duration: 0.8181875 2023-10-02 20:21:44,602 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.feed_forward3.hidden_balancer.prob, batch_count=1007066.6666666666, ans=0.125 2023-10-02 20:21:45,680 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_37152_210_9697_1_1533106573_3844188_416-333249-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 20:21:47,095 WARNING [train.py:1204] (2/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_538-32291-0_sp0.9 from training. Duration: 0.92225 2023-10-02 20:21:49,085 WARNING [train.py:1204] (2/4) Exclude cut with ID _32704_210_8954_1_1533017488840_6747370_965-176824-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 20:21:53,278 WARNING [train.py:1204] (2/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_86-19601-0_sp1.1 from training. Duration: 0.77275 2023-10-02 20:21:54,567 WARNING [train.py:1204] (2/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_436-318113-0_sp1.1 from training. Duration: 0.6818125 2023-10-02 20:21:54,600 WARNING [train.py:1204] (2/4) Exclude cut with ID _41817_210_18835_1_1533434477160_7423049_555-293448-0_sp0.9 from training. Duration: 0.888875 2023-10-02 20:21:54,611 WARNING [train.py:1204] (2/4) Exclude cut with ID _4697_210_2069_1_1526815801953_3823098_68-219807-0 from training. Duration: 0.47 2023-10-02 20:22:00,147 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7544_210_6753_1_1528591327_1471426_33-346031-0_sp1.1 from training. Duration: 0.5545625 2023-10-02 20:22:00,148 WARNING [train.py:1204] (2/4) Exclude cut with ID _49748_210_24116_1_1533986971737_3158910_263-52113-0_sp1.1 from training. Duration: 0.97275 2023-10-02 20:22:01,450 WARNING [train.py:1204] (2/4) Exclude cut with ID _64639_210_5816_1_1535367619437_3528000_340-24580-0_sp1.1 from training. Duration: 0.92725 2023-10-02 20:22:01,473 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_47228_210_4983_1_1533780021_3672027_479-114941-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 20:22:01,504 WARNING [train.py:1204] (2/4) Exclude cut with ID _44585_210_18628_1_1533605350387_4518199_506-119659-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 20:22:02,872 WARNING [train.py:1204] (2/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_314-126819-0_sp1.1 from training. Duration: 0.4545625 2023-10-02 20:22:02,877 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_2541_210_1794_1_1525590269_3084765_156-173713-0_sp0.9 from training. Duration: 0.9 2023-10-02 20:22:02,936 WARNING [train.py:1204] (2/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_108-255430-0 from training. Duration: 0.97 2023-10-02 20:22:02,953 WARNING [train.py:1204] (2/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_791-45149-0_sp1.1 from training. Duration: 0.7818125 2023-10-02 20:22:02,959 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_53378_210_17282_1_1534227054_1261425_10-118295-0_sp1.1 from training. Duration: 0.97275 2023-10-02 20:22:04,290 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.508e+02 1.800e+02 1.972e+02 2.166e+02 3.182e+02, threshold=3.945e+02, percent-clipped=0.0 2023-10-02 20:22:04,402 WARNING [train.py:1204] (2/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_148-158509-0 from training. Duration: 0.41 2023-10-02 20:22:10,629 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_34726_210_4525_1_1533019474_5030395_687-291305-0_sp1.1 from training. Duration: 0.936375 2023-10-02 20:22:13,458 WARNING [train.py:1204] (2/4) Exclude cut with ID _14860_210_4915_1_1530702020340_7343240_264-324537-0_sp1.1 from training. Duration: 0.963625 2023-10-02 20:22:17,444 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_41251_210_15368_1_1533429767_4820695_591-46386-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 20:22:17,469 WARNING [train.py:1204] (2/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_86-19601-0_sp0.9 from training. Duration: 0.9444375 2023-10-02 20:22:17,499 WARNING [train.py:1204] (2/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_401-255491-0_sp0.9 from training. Duration: 0.77775 2023-10-02 20:22:17,674 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.conv_module1.balancer2.min_abs, batch_count=1007200.0, ans=0.5 2023-10-02 20:22:19,302 WARNING [train.py:1204] (2/4) Exclude cut with ID _8696_210_1876_1_1528545632440_3345058_209-54069-0_sp1.1 from training. Duration: 0.6090625 2023-10-02 20:22:19,507 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.ff2_skip_rate, batch_count=1007200.0, ans=0.0 2023-10-02 20:22:20,515 WARNING [train.py:1204] (2/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_369-325184-0_sp1.1 from training. Duration: 0.9 2023-10-02 20:22:21,774 INFO [train.py:1046] (2/4) Epoch 29, batch 2350, loss[loss=0.1792, simple_loss=0.2509, pruned_loss=0.05374, over 23844.00 frames. ], tot_loss[loss=0.1663, simple_loss=0.2441, pruned_loss=0.04423, over 4700146.13 frames. ], batch size: 195, lr: 3.53e-03, grad_scale: 8.0 2023-10-02 20:22:21,816 WARNING [train.py:1204] (2/4) Exclude cut with ID _14687_210_5854_1_1530703838259_3664889_115-239046-0_sp0.9 from training. Duration: 0.7444375 2023-10-02 20:22:21,870 WARNING [train.py:1204] (2/4) Exclude cut with ID _8064_210_5399_1_1528372299291_4282973_168-50337-0 from training. Duration: 0.84 2023-10-02 20:22:24,819 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.feed_forward3.hidden_balancer.prob, batch_count=1007266.6666666666, ans=0.125 2023-10-02 20:22:27,394 WARNING [train.py:1204] (2/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_909-64151-0_sp1.1 from training. Duration: 0.8545625 2023-10-02 20:22:27,404 WARNING [train.py:1204] (2/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_597-154640-0 from training. Duration: 0.9 2023-10-02 20:22:32,961 WARNING [train.py:1204] (2/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_126-274694-0 from training. Duration: 0.98 2023-10-02 20:22:37,623 WARNING [train.py:1204] (2/4) Exclude cut with ID _21783_210_11357_1_1531742458730_3762679_344-269786-0_sp1.1 from training. Duration: 0.97275 2023-10-02 20:22:40,476 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_37818_210_4238_1_1533620910_4720083_322-253154-0_sp1.1 from training. Duration: 0.92725 2023-10-02 20:22:40,479 WARNING [train.py:1204] (2/4) Exclude cut with ID _58895_210_8750_1_1535184081320_3649089_221-15823-0_sp1.1 from training. Duration: 0.92725 2023-10-02 20:22:40,488 WARNING [train.py:1204] (2/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_9-21737-0_sp1.1 from training. Duration: 0.8818125 2023-10-02 20:22:40,527 WARNING [train.py:1204] (2/4) Exclude cut with ID _14069_210_5632_1_1530512692033_4803390_522-341588-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 20:22:41,905 WARNING [train.py:1204] (2/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_363-185703-0 from training. Duration: 0.95 2023-10-02 20:22:47,086 WARNING [train.py:1204] (2/4) Exclude cut with ID _41817_210_18835_1_1533434477160_7423049_265-76216-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 20:22:51,320 WARNING [train.py:1204] (2/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_841-341679-0 from training. Duration: 0.58 2023-10-02 20:22:52,618 WARNING [train.py:1204] (2/4) Exclude cut with ID _55429_210_10626_1_1534467588527_7174143_97-335084-0_sp1.1 from training. Duration: 0.8818125 2023-10-02 20:22:57,322 WARNING [train.py:1204] (2/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_690-126306-0_sp0.9 from training. Duration: 0.8666875 2023-10-02 20:22:57,331 WARNING [train.py:1204] (2/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_102-92751-0_sp1.1 from training. Duration: 0.9 2023-10-02 20:22:58,675 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_3787_210_2040_1_1526477544_5478586_603-120324-0_sp1.1 from training. Duration: 0.736375 2023-10-02 20:22:58,767 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_33348_210_5632_1_1533086912_3324030_85-28583-0 from training. Duration: 0.82 2023-10-02 20:23:00,148 WARNING [train.py:1204] (2/4) Exclude cut with ID _60057_210_10640_1_1534903065793_3333150_362-230377-0_sp1.1 from training. Duration: 0.7909375 2023-10-02 20:23:01,553 WARNING [train.py:1204] (2/4) Exclude cut with ID _60113_210_16271_1_1535164212481_4386079_194-146486-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 20:23:01,576 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_53753_210_7234_1_1534320003_3594148_171-277668-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 20:23:02,900 WARNING [train.py:1204] (2/4) Exclude cut with ID _34423_210_8750_1_1533430798456_3330199_251-332788-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 20:23:06,134 WARNING [train.py:1204] (2/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_663-166973-0_sp0.9 from training. Duration: 0.888875 2023-10-02 20:23:08,772 WARNING [train.py:1204] (2/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_927-43827-0 from training. Duration: 0.74 2023-10-02 20:23:08,848 WARNING [train.py:1204] (2/4) Exclude cut with ID _28323_210_16187_1_1532480395667_4100300_306-74561-0_sp1.1 from training. Duration: 0.8545625 2023-10-02 20:23:11,625 WARNING [train.py:1204] (2/4) Exclude cut with ID _15037_210_2253_1_1530752294104_3267650_139-337989-0_sp1.1 from training. Duration: 0.92725 2023-10-02 20:23:11,646 WARNING [train.py:1204] (2/4) Exclude cut with ID _49039_210_6315_1_1533967537541_3846118_460-263670-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 20:23:13,042 WARNING [train.py:1204] (2/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_186-309161-0 from training. Duration: 0.54 2023-10-02 20:23:14,990 WARNING [train.py:1204] (2/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_87-278686-0_sp1.1 from training. Duration: 0.7 2023-10-02 20:23:16,432 WARNING [train.py:1204] (2/4) Exclude cut with ID _27052_210_5986_1_1532329197187_3585009_314-106499-0 from training. Duration: 0.92 2023-10-02 20:23:16,454 WARNING [train.py:1204] (2/4) Exclude cut with ID _3465_210_3525_1_1526175667060_3574049_412-287526-0_sp0.9 from training. Duration: 0.8 2023-10-02 20:23:21,833 WARNING [train.py:1204] (2/4) Exclude cut with ID _22961_210_8254_1_1531976366672_4278180_317-199546-0 from training. Duration: 0.86 2023-10-02 20:23:22,218 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.bypass_mid.scale_min, batch_count=1007533.3333333334, ans=0.2 2023-10-02 20:23:24,681 WARNING [train.py:1204] (2/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_381-317791-0 from training. Duration: 0.73 2023-10-02 20:23:25,988 WARNING [train.py:1204] (2/4) Exclude cut with ID _44010_210_4525_1_1533884262964_4013620_560-310411-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 20:23:26,001 WARNING [train.py:1204] (2/4) Exclude cut with ID _5294_210_1747_1_1527249643928_3696179_342-270278-0_sp0.9 from training. Duration: 0.611125 2023-10-02 20:23:26,020 WARNING [train.py:1204] (2/4) Exclude cut with ID ID0473W0002-918951 from training. Duration: 0.924 2023-10-02 20:23:26,038 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1083-220122-0 from training. Duration: 0.51 2023-10-02 20:23:29,315 WARNING [train.py:1204] (2/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_138-154812-0 from training. Duration: 0.78 2023-10-02 20:23:30,891 WARNING [train.py:1204] (2/4) Exclude cut with ID _44982_210_1511_1_1534312820108_3726029_331-19420-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 20:23:35,435 INFO [train.py:1046] (2/4) Epoch 29, batch 2400, loss[loss=0.183, simple_loss=0.2482, pruned_loss=0.05896, over 23794.00 frames. ], tot_loss[loss=0.166, simple_loss=0.2436, pruned_loss=0.04419, over 4702314.70 frames. ], batch size: 164, lr: 3.53e-03, grad_scale: 8.0 2023-10-02 20:23:35,492 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_2655_210_2087_1_1525688959_3897400_202-147557-0_sp0.9 from training. Duration: 0.9444375 2023-10-02 20:23:38,389 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_23850_210_4238_1_1532232597_1632433_32-185135-0_sp0.9 from training. Duration: 0.9555625 2023-10-02 20:23:39,675 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_14056_210_4163_1_1530604483_7572827_46-93547-0_sp1.1 from training. Duration: 0.863625 2023-10-02 20:23:39,724 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7931_210_1794_1_1528594728_5564059_280-279560-0 from training. Duration: 0.98 2023-10-02 20:23:41,049 WARNING [train.py:1204] (2/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_937-208281-0 from training. Duration: 0.85 2023-10-02 20:23:47,257 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_14928_210_5632_1_1530773461_4714000_454-112198-0_sp0.9 from training. Duration: 0.7333125 2023-10-02 20:23:47,258 WARNING [train.py:1204] (2/4) Exclude cut with ID _32704_210_8954_1_1533017488840_6747370_944-153333-0_sp1.1 from training. Duration: 0.963625 2023-10-02 20:23:50,001 WARNING [train.py:1204] (2/4) Exclude cut with ID _14821_210_8750_1_1530680503991_6829359_2-227151-0 from training. Duration: 0.83 2023-10-02 20:23:50,019 WARNING [train.py:1204] (2/4) Exclude cut with ID _28461_210_2253_1_1532771775189_3041299_96-152158-0_sp1.1 from training. Duration: 0.77275 2023-10-02 20:23:51,435 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7837_210_1792_1_1528453445_5841305_654-54195-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 20:23:51,469 WARNING [train.py:1204] (2/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_344-152526-0 from training. Duration: 0.53 2023-10-02 20:23:56,354 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_30167_210_3528_1_1532428012_4772375_480-142230-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 20:23:59,087 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_160-218721-0 from training. Duration: 0.96 2023-10-02 20:24:05,024 WARNING [train.py:1204] (2/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_65-88242-0_sp0.9 from training. Duration: 0.62225 2023-10-02 20:24:05,261 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.conv_skip_rate, batch_count=1007733.3333333334, ans=0.0 2023-10-02 20:24:07,911 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_34886_210_10633_1_1533203882_1755661_76-156096-0 from training. Duration: 0.87 2023-10-02 20:24:09,450 WARNING [train.py:1204] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_188-38388-0_sp1.1 from training. Duration: 0.92725 2023-10-02 20:24:10,882 WARNING [train.py:1204] (2/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_81-289335-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 20:24:15,553 WARNING [train.py:1204] (2/4) Exclude cut with ID _59493_210_17282_1_1535078419077_3225879_110-8778-0_sp1.1 from training. Duration: 0.936375 2023-10-02 20:24:16,261 WARNING [train.py:1204] (2/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_188-260011-0 from training. Duration: 0.73 2023-10-02 20:24:16,501 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.ff3_skip_rate, batch_count=1007733.3333333334, ans=0.0 2023-10-02 20:24:17,314 WARNING [train.py:1204] (2/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_453-133983-0_sp0.9 from training. Duration: 0.8666875 2023-10-02 20:24:17,484 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.bypass_mid.scale_min, batch_count=1007733.3333333334, ans=0.2 2023-10-02 20:24:18,969 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.balancer1.prob, batch_count=1007800.0, ans=0.125 2023-10-02 20:24:22,957 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8054_210_7927_1_1528627721_4984094_553-17379-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 20:24:26,242 WARNING [train.py:1204] (2/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_341-171072-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 20:24:26,986 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.2.encoder.layers.2.whiten, num_groups=1, num_channels=384, metric=3.93 vs. limit=12.0 2023-10-02 20:24:28,964 WARNING [train.py:1204] (2/4) Exclude cut with ID _30800_210_22382_1_1532499347904_4515959_265-257286-0_sp1.1 from training. Duration: 0.963625 2023-10-02 20:24:30,316 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_403-39619-0_sp0.9 from training. Duration: 0.9333125 2023-10-02 20:24:30,328 WARNING [train.py:1204] (2/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_464-33931-0_sp0.9 from training. Duration: 0.6 2023-10-02 20:24:30,354 WARNING [train.py:1204] (2/4) Exclude cut with ID _60804_210_9642_1_1534831199859_7268299_632-143863-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 20:24:30,360 WARNING [train.py:1204] (2/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_431-310997-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 20:24:30,406 WARNING [train.py:1204] (2/4) Exclude cut with ID _31649_210_6228_1_1532584608063_4091078_114-263860-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 20:24:30,417 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6961_210_2087_1_1527937168_6815011_562-165518-0_sp0.9 from training. Duration: 0.8555625 2023-10-02 20:24:35,051 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.563e+02 1.909e+02 2.127e+02 2.478e+02 3.965e+02, threshold=4.255e+02, percent-clipped=1.0 2023-10-02 20:24:35,144 WARNING [train.py:1204] (2/4) Exclude cut with ID _27052_210_5986_1_1532329197187_3585009_338-325870-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 20:24:35,200 WARNING [train.py:1204] (2/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_469-94673-0_sp1.1 from training. Duration: 0.7181875 2023-10-02 20:24:35,226 WARNING [train.py:1204] (2/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_259-34038-0 from training. Duration: 0.74 2023-10-02 20:24:36,569 WARNING [train.py:1204] (2/4) Exclude cut with ID _14922_210_6228_1_1530707435273_3693310_388-129552-0 from training. Duration: 0.96 2023-10-02 20:24:38,013 WARNING [train.py:1204] (2/4) Exclude cut with ID _28394_210_8913_1_1532430049518_3650118_342-251455-0_sp1.1 from training. Duration: 0.936375 2023-10-02 20:24:38,042 WARNING [train.py:1204] (2/4) Exclude cut with ID _53731_210_7994_1_1534553979244_3757214_8-191390-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 20:24:38,075 WARNING [train.py:1204] (2/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_613-232856-0 from training. Duration: 0.54 2023-10-02 20:24:39,413 WARNING [train.py:1204] (2/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_557-349534-0 from training. Duration: 0.91 2023-10-02 20:24:39,421 WARNING [train.py:1204] (2/4) Exclude cut with ID _14224_210_4381_1_1530529062037_4264010_379-184223-0 from training. Duration: 0.85 2023-10-02 20:24:39,425 WARNING [train.py:1204] (2/4) Exclude cut with ID IC0125W0001-61807 from training. Duration: 0.966 2023-10-02 20:24:40,870 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_229-45300-0 from training. Duration: 0.57 2023-10-02 20:24:42,682 WARNING [train.py:1204] (2/4) Exclude cut with ID _30304_210_6319_1_1532476827261_7582060_1069-24743-0_sp1.1 from training. Duration: 0.92725 2023-10-02 20:24:45,361 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_41452_210_7291_1_1533437687_3941281_323-172873-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 20:24:45,378 WARNING [train.py:1204] (2/4) Exclude cut with ID _37086_210_9038_1_1532999540891_7021250_802-226709-0_sp1.1 from training. Duration: 0.97275 2023-10-02 20:24:46,654 WARNING [train.py:1204] (2/4) Exclude cut with ID IC0144W0500-71767 from training. Duration: 0.931875 2023-10-02 20:24:48,703 WARNING [train.py:1204] (2/4) Exclude cut with ID _39846_210_12651_1_1533605310265_2115212_233-129699-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 20:24:48,769 WARNING [train.py:1204] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_574-45541-0_sp0.9 from training. Duration: 0.72225 2023-10-02 20:24:50,010 INFO [train.py:1046] (2/4) Epoch 29, batch 2450, loss[loss=0.1889, simple_loss=0.2685, pruned_loss=0.05463, over 23280.00 frames. ], tot_loss[loss=0.1646, simple_loss=0.242, pruned_loss=0.04358, over 4699387.45 frames. ], batch size: 105, lr: 3.53e-03, grad_scale: 8.0 2023-10-02 20:24:51,617 WARNING [train.py:1204] (2/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_372-298093-0_sp1.1 from training. Duration: 0.72725 2023-10-02 20:24:51,629 WARNING [train.py:1204] (2/4) Exclude cut with ID _62574_210_2655_1_1535180407058_3933249_34-26343-0_sp1.1 from training. Duration: 0.97275 2023-10-02 20:24:55,808 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_512-256238-0_sp1.1 from training. Duration: 0.963625 2023-10-02 20:24:55,821 WARNING [train.py:1204] (2/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_196-55502-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 20:24:58,902 WARNING [train.py:1204] (2/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_195-167011-0 from training. Duration: 0.75 2023-10-02 20:25:03,731 WARNING [train.py:1204] (2/4) Exclude cut with ID _13678_210_7497_1_1530530069226_3463170_345-230416-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 20:25:03,754 WARNING [train.py:1204] (2/4) Exclude cut with ID _51078_210_9070_1_1534161557424_3805250_225-327809-0_sp1.1 from training. Duration: 0.963625 2023-10-02 20:25:07,989 WARNING [train.py:1204] (2/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_18-189198-0_sp1.1 from training. Duration: 0.8181875 2023-10-02 20:25:08,003 WARNING [train.py:1204] (2/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_329-82235-0_sp1.1 from training. Duration: 0.7454375 2023-10-02 20:25:08,013 WARNING [train.py:1204] (2/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_726-270780-0_sp0.9 from training. Duration: 0.9555625 2023-10-02 20:25:08,061 WARNING [train.py:1204] (2/4) Exclude cut with ID _66873_210_5517_1_1535454136506_4293370_654-52713-0 from training. Duration: 0.77 2023-10-02 20:25:12,736 WARNING [train.py:1204] (2/4) Exclude cut with ID _20455_210_5219_1_1531565996775_8098100_191-16006-0_sp1.1 from training. Duration: 0.963625 2023-10-02 20:25:14,266 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_3171_210_2087_1_1526109084_2330007_66-260583-0_sp1.1 from training. Duration: 0.6545625 2023-10-02 20:25:15,598 WARNING [train.py:1204] (2/4) Exclude cut with ID _38670_210_15379_1_1533448810659_4391059_63-129006-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 20:25:18,969 WARNING [train.py:1204] (2/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_393-57888-0_sp0.9 from training. Duration: 0.711125 2023-10-02 20:25:20,223 WARNING [train.py:1204] (2/4) Exclude cut with ID _6275_210_2084_1_1528110062213_7669049_857-298942-0_sp0.9 from training. Duration: 0.9444375 2023-10-02 20:25:21,659 WARNING [train.py:1204] (2/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_239-22076-0_sp0.9 from training. Duration: 0.9444375 2023-10-02 20:25:21,695 WARNING [train.py:1204] (2/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_424-227947-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 20:25:24,392 WARNING [train.py:1204] (2/4) Exclude cut with ID _14069_210_5632_1_1530512692033_4803390_95-1896-0 from training. Duration: 0.98 2023-10-02 20:25:24,468 WARNING [train.py:1204] (2/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_96-244299-0_sp0.9 from training. Duration: 0.97775 2023-10-02 20:25:30,090 WARNING [train.py:1204] (2/4) Exclude cut with ID _46683_210_9642_1_1533730913492_4591116_562-319825-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 20:25:31,945 WARNING [train.py:1204] (2/4) Exclude cut with ID _64639_210_5816_1_1535367619437_3528000_284-173474-0_sp1.1 from training. Duration: 0.963625 2023-10-02 20:25:31,980 WARNING [train.py:1204] (2/4) Exclude cut with ID _29529_210_7234_1_1532393961455_3977460_497-100133-0_sp1.1 from training. Duration: 0.936375 2023-10-02 20:25:32,007 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_12868_210_6753_1_1530701932_530275_18-76322-0_sp0.9 from training. Duration: 0.9666875 2023-10-02 20:25:32,685 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.2.encoder.layers.2.conv_module1.whiten, num_groups=1, num_channels=384, metric=5.63 vs. limit=15.0 2023-10-02 20:25:33,273 WARNING [train.py:1204] (2/4) Exclude cut with ID _50093_210_4220_1_1534417141207_3432299_116-303447-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 20:25:35,286 WARNING [train.py:1204] (2/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_44-79427-0_sp1.1 from training. Duration: 0.8909375 2023-10-02 20:25:36,709 WARNING [train.py:1204] (2/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_887-249555-0 from training. Duration: 0.77 2023-10-02 20:25:40,829 WARNING [train.py:1204] (2/4) Exclude cut with ID _28461_210_2253_1_1532771775189_3041299_96-152158-0_sp0.9 from training. Duration: 0.9444375 2023-10-02 20:25:40,883 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_14058_210_4163_1_1530690953_6327180_49-210687-0_sp1.1 from training. Duration: 0.8545625 2023-10-02 20:25:45,378 WARNING [train.py:1204] (2/4) Exclude cut with ID _40399_210_4353_1_1533305658194_3279406_351-127903-0_sp1.1 from training. Duration: 0.97275 2023-10-02 20:25:45,383 WARNING [train.py:1204] (2/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_184-206939-0_sp1.1 from training. Duration: 0.936375 2023-10-02 20:25:48,275 WARNING [train.py:1204] (2/4) Exclude cut with ID _14100_210_8341_1_1530529320613_3678069_258-224599-0_sp1.1 from training. Duration: 0.7 2023-10-02 20:25:48,290 WARNING [train.py:1204] (2/4) Exclude cut with ID _6493_210_1747_1_1528459210424_4012470_324-323854-0 from training. Duration: 0.92 2023-10-02 20:25:48,374 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_12868_210_6753_1_1530702479_3610564_151-325626-0_sp1.1 from training. Duration: 0.8818125 2023-10-02 20:25:49,913 WARNING [train.py:1204] (2/4) Exclude cut with ID _14528_210_8254_1_1530669700514_4306939_106-43145-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 20:25:50,542 WARNING [train.py:1204] (2/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_686-54383-0 from training. Duration: 0.87 2023-10-02 20:25:51,666 WARNING [train.py:1204] (2/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_801-102362-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 20:25:51,734 WARNING [train.py:1204] (2/4) Exclude cut with ID _48677_210_5663_1_1533902405919_3892989_408-40740-0_sp0.9 from training. Duration: 0.92225 2023-10-02 20:25:54,495 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.feed_forward3.hidden_balancer.prob, batch_count=1008200.0, ans=0.125 2023-10-02 20:25:55,669 WARNING [train.py:1204] (2/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_448-212135-0_sp1.1 from training. Duration: 0.82725 2023-10-02 20:25:59,733 WARNING [train.py:1204] (2/4) Exclude cut with ID _59170_210_6721_1_1534983633960_4203335_320-124047-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 20:25:59,767 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_4692_210_3528_1_1526899755_4323254_107-44401-0_sp1.1 from training. Duration: 0.7818125 2023-10-02 20:26:02,552 WARNING [train.py:1204] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_731-2428-0 from training. Duration: 0.83 2023-10-02 20:26:04,385 INFO [train.py:1046] (2/4) Epoch 29, batch 2500, loss[loss=0.1616, simple_loss=0.2455, pruned_loss=0.03884, over 24305.00 frames. ], tot_loss[loss=0.1641, simple_loss=0.241, pruned_loss=0.04356, over 4707258.51 frames. ], batch size: 61, lr: 3.53e-03, grad_scale: 8.0 2023-10-02 20:26:04,457 WARNING [train.py:1204] (2/4) Exclude cut with ID _29857_210_20034_1_1532395886753_3798160_335-163562-0_sp1.1 from training. Duration: 0.77275 2023-10-02 20:26:09,967 WARNING [train.py:1204] (2/4) Exclude cut with ID _14832_210_8750_1_1530691351539_3171348_171-275004-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 20:26:17,418 WARNING [train.py:1204] (2/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_105-328174-0_sp0.9 from training. Duration: 0.8444375 2023-10-02 20:26:18,776 WARNING [train.py:1204] (2/4) Exclude cut with ID _68965_210_1883_1_1535698962272_3497490_164-118421-0_sp1.1 from training. Duration: 0.936375 2023-10-02 20:26:18,863 WARNING [train.py:1204] (2/4) Exclude cut with ID _19896_210_1883_1_1531709022662_6649569_380-24170-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 20:26:18,867 WARNING [train.py:1204] (2/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_505-205270-0_sp1.1 from training. Duration: 0.3454375 2023-10-02 20:26:19,636 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.ff3_skip_rate, batch_count=1008333.3333333334, ans=0.0 2023-10-02 20:26:24,945 WARNING [train.py:1204] (2/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_427-208901-0_sp1.1 from training. Duration: 0.6181875 2023-10-02 20:26:25,630 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.3.encoder.layers.2.self_attn_weights.whiten_keys, num_groups=8, num_channels=256, metric=3.70 vs. limit=6.0 2023-10-02 20:26:26,416 WARNING [train.py:1204] (2/4) Exclude cut with ID _23818_210_6228_1_1531917063884_3676920_85-326916-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 20:26:26,507 WARNING [train.py:1204] (2/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_101-293367-0_sp1.1 from training. Duration: 0.57275 2023-10-02 20:26:26,515 WARNING [train.py:1204] (2/4) Exclude cut with ID _5409_210_1388_1_1527292850189_5865191_178-127970-0_sp1.1 from training. Duration: 0.5181875 2023-10-02 20:26:27,813 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_403-39619-0 from training. Duration: 0.84 2023-10-02 20:26:27,912 WARNING [train.py:1204] (2/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_427-87841-0_sp1.1 from training. Duration: 0.963625 2023-10-02 20:26:29,229 WARNING [train.py:1204] (2/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_286-68181-0_sp1.1 from training. Duration: 0.97275 2023-10-02 20:26:29,267 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6673_210_3273_1_1527767790_3173240_327-306185-0 from training. Duration: 0.83 2023-10-02 20:26:29,283 WARNING [train.py:1204] (2/4) Exclude cut with ID _3675_210_4557_1_1526639934408_4182089_106-203713-0_sp1.1 from training. Duration: 0.963625 2023-10-02 20:26:30,653 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_53071_210_16554_1_1534493434_1276796_130-36076-0 from training. Duration: 0.95 2023-10-02 20:26:30,684 WARNING [train.py:1204] (2/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_586-93054-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 20:26:34,728 WARNING [train.py:1204] (2/4) Exclude cut with ID _23825_210_13449_1_1531915313526_3497150_262-10571-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 20:26:36,163 WARNING [train.py:1204] (2/4) Exclude cut with ID _28783_210_7943_1_1532671164808_7155479_432-67885-0_sp1.1 from training. Duration: 0.97275 2023-10-02 20:26:37,724 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6287_210_3273_1_1527509506_2653716_182-199887-0_sp0.9 from training. Duration: 0.5666875 2023-10-02 20:26:38,983 WARNING [train.py:1204] (2/4) Exclude cut with ID _3231_210_2106_1_1526205864934_6784220_171-256004-0 from training. Duration: 0.86 2023-10-02 20:26:40,266 WARNING [train.py:1204] (2/4) Exclude cut with ID _69051_210_1531_1_1535885566901_3829538_332-92090-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 20:26:40,391 WARNING [train.py:1204] (2/4) Exclude cut with ID _3675_210_4557_1_1526639934408_4182089_104-324125-0_sp1.1 from training. Duration: 0.963625 2023-10-02 20:26:44,991 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_41764_210_2521_1_1533686110_4089313_501-2119-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 20:26:46,537 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_14056_210_4163_1_1530604483_7572827_56-100988-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 20:26:49,711 WARNING [train.py:1204] (2/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_398-239819-0_sp1.1 from training. Duration: 0.9 2023-10-02 20:26:56,336 WARNING [train.py:1204] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_74-48717-0_sp1.1 from training. Duration: 0.52725 2023-10-02 20:26:58,355 WARNING [train.py:1204] (2/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_177-49092-0 from training. Duration: 0.91 2023-10-02 20:26:59,720 WARNING [train.py:1204] (2/4) Exclude cut with ID _62337_210_12392_1_1535009273105_3412229_44-176353-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 20:26:59,736 WARNING [train.py:1204] (2/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_110-130961-0_sp1.1 from training. Duration: 0.42725 2023-10-02 20:27:01,154 WARNING [train.py:1204] (2/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_37-189262-0_sp1.1 from training. Duration: 0.8909375 2023-10-02 20:27:01,155 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_3172_210_2087_1_1526292929_3987865_197-97762-0_sp0.9 from training. Duration: 0.8333125 2023-10-02 20:27:01,254 WARNING [train.py:1204] (2/4) Exclude cut with ID IC0677W0065-334897 from training. Duration: 0.896 2023-10-02 20:27:01,254 WARNING [train.py:1204] (2/4) Exclude cut with ID IC0677W0080-334912 from training. Duration: 0.768 2023-10-02 20:27:01,259 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_120-41668-0 from training. Duration: 0.7 2023-10-02 20:27:04,409 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.547e+02 1.824e+02 2.011e+02 2.167e+02 3.747e+02, threshold=4.021e+02, percent-clipped=0.0 2023-10-02 20:27:04,549 WARNING [train.py:1204] (2/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_460-205287-0_sp1.1 from training. Duration: 0.963625 2023-10-02 20:27:06,009 WARNING [train.py:1204] (2/4) Exclude cut with ID _14632_210_9988_1_1530691279392_3923200_102-204477-0 from training. Duration: 0.97 2023-10-02 20:27:06,014 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_34815_210_6388_1_1533381320_3571903_279-305565-0 from training. Duration: 0.92 2023-10-02 20:27:07,233 WARNING [train.py:1204] (2/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_91-148831-0_sp1.1 from training. Duration: 0.9 2023-10-02 20:27:08,551 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6291_210_3273_1_1527683649_1078498_82-221196-0 from training. Duration: 0.76 2023-10-02 20:27:13,149 WARNING [train.py:1204] (2/4) Exclude cut with ID _38020_210_5986_1_1533112252259_7006089_493-102745-0 from training. Duration: 0.98 2023-10-02 20:27:17,227 WARNING [train.py:1204] (2/4) Exclude cut with ID _61133_210_9969_1_1535196777227_3696409_40-93442-0_sp1.1 from training. Duration: 0.92725 2023-10-02 20:27:18,558 INFO [train.py:1046] (2/4) Epoch 29, batch 2550, loss[loss=0.1578, simple_loss=0.236, pruned_loss=0.03974, over 20192.00 frames. ], tot_loss[loss=0.1651, simple_loss=0.2424, pruned_loss=0.04386, over 4710507.30 frames. ], batch size: 44, lr: 3.53e-03, grad_scale: 8.0 2023-10-02 20:27:18,656 WARNING [train.py:1204] (2/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_45-258852-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 20:27:18,692 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_53071_210_16554_1_1534493434_1276796_130-36076-0_sp1.1 from training. Duration: 0.863625 2023-10-02 20:27:20,104 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8694_210_6400_1_1529065754_3260756_332-13553-0_sp1.1 from training. Duration: 0.92725 2023-10-02 20:27:22,083 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_405-339986-0 from training. Duration: 0.51 2023-10-02 20:27:23,357 WARNING [train.py:1204] (2/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_927-43827-0_sp1.1 from training. Duration: 0.67275 2023-10-02 20:27:26,198 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_397-147196-0 from training. Duration: 0.8 2023-10-02 20:27:26,318 WARNING [train.py:1204] (2/4) Exclude cut with ID 3557-8342-0013-54691-0_sp1.1 from training. Duration: 0.836375 2023-10-02 20:27:29,066 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_47438_210_5399_1_1533797471_4513356_626-127722-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 20:27:32,349 WARNING [train.py:1204] (2/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_256-323883-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 20:27:32,354 WARNING [train.py:1204] (2/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_839-173017-0_sp0.9 from training. Duration: 0.4555625 2023-10-02 20:27:32,389 WARNING [train.py:1204] (2/4) Exclude cut with ID _5289_210_2084_1_1527678202736_3779299_392-261988-0_sp1.1 from training. Duration: 0.5909375 2023-10-02 20:27:32,428 WARNING [train.py:1204] (2/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_119-224384-0_sp1.1 from training. Duration: 0.936375 2023-10-02 20:27:32,455 WARNING [train.py:1204] (2/4) Exclude cut with ID _57523_210_1856_1_1534766294545_3732989_262-267933-0_sp1.1 from training. Duration: 0.963625 2023-10-02 20:27:36,504 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_35361_210_4238_1_1533262810_7793476_675-87012-0_sp0.9 from training. Duration: 0.988875 2023-10-02 20:27:36,521 WARNING [train.py:1204] (2/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_17-90541-0 from training. Duration: 0.94 2023-10-02 20:27:36,559 WARNING [train.py:1204] (2/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_535-14410-0_sp1.1 from training. Duration: 0.42725 2023-10-02 20:27:36,564 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_24673_210_6400_1_1531968633_778659_94-154511-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 20:27:36,568 WARNING [train.py:1204] (2/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_212-10501-0 from training. Duration: 0.94 2023-10-02 20:27:44,846 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.feed_forward3.hidden_balancer.prob, batch_count=1008666.6666666666, ans=0.125 2023-10-02 20:27:50,185 WARNING [train.py:1204] (2/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_383-285196-0_sp1.1 from training. Duration: 0.8818125 2023-10-02 20:27:54,926 WARNING [train.py:1204] (2/4) Exclude cut with ID _45560_210_21982_1_1533812603951_3584999_238-47020-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 20:27:54,935 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_21500_210_2054_1_1532149310_2716758_278-18053-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 20:27:54,951 WARNING [train.py:1204] (2/4) Exclude cut with ID _56235_210_10640_1_1534557335235_4161099_453-9626-0_sp1.1 from training. Duration: 0.936375 2023-10-02 20:27:56,426 WARNING [train.py:1204] (2/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_153-97441-0_sp1.1 from training. Duration: 0.5090625 2023-10-02 20:28:03,521 WARNING [train.py:1204] (2/4) Exclude cut with ID _67725_210_27767_1_1535608756671_5105359_431-120895-0_sp1.1 from training. Duration: 0.963625 2023-10-02 20:28:03,789 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.feed_forward2.hidden_balancer.prob, batch_count=1008800.0, ans=0.125 2023-10-02 20:28:06,335 WARNING [train.py:1204] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_574-45541-0_sp1.1 from training. Duration: 0.5909375 2023-10-02 20:28:06,361 WARNING [train.py:1204] (2/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_162-103620-0_sp1.1 from training. Duration: 0.8454375 2023-10-02 20:28:06,371 WARNING [train.py:1204] (2/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_734-129761-0_sp1.1 from training. Duration: 0.7454375 2023-10-02 20:28:06,411 WARNING [train.py:1204] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_620-276793-0_sp1.1 from training. Duration: 0.536375 2023-10-02 20:28:07,700 WARNING [train.py:1204] (2/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_153-97441-0_sp0.9 from training. Duration: 0.62225 2023-10-02 20:28:11,096 WARNING [train.py:1204] (2/4) Exclude cut with ID _12733_210_8341_1_1530167424556_3646380_523-85621-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 20:28:12,381 WARNING [train.py:1204] (2/4) Exclude cut with ID _64048_210_10626_1_1535198382802_3835179_352-179834-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 20:28:15,869 WARNING [train.py:1204] (2/4) Exclude cut with ID _55162_210_17071_1_1534408248583_3753480_43-242364-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 20:28:15,892 WARNING [train.py:1204] (2/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_661-264857-0 from training. Duration: 0.93 2023-10-02 20:28:15,906 WARNING [train.py:1204] (2/4) Exclude cut with ID _6274_210_2084_1_1527937583837_7494020_325-237800-0_sp1.1 from training. Duration: 0.97275 2023-10-02 20:28:16,097 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.bypass.scale_min, batch_count=1008800.0, ans=0.2 2023-10-02 20:28:17,284 WARNING [train.py:1204] (2/4) Exclude cut with ID _55328_210_27767_1_1534580973260_4088170_385-171459-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 20:28:18,617 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_3780_210_2087_1_1526467389_2145572_176-107140-0_sp1.1 from training. Duration: 0.62725 2023-10-02 20:28:19,954 WARNING [train.py:1204] (2/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_375-244976-0_sp1.1 from training. Duration: 0.6818125 2023-10-02 20:28:21,346 WARNING [train.py:1204] (2/4) Exclude cut with ID _35941_210_15379_1_1532995294515_4023089_309-33162-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 20:28:27,633 WARNING [train.py:1204] (2/4) Exclude cut with ID _14459_210_2228_1_1531443641946_7516700_1185-22038-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 20:28:28,945 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_1197-69460-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 20:28:29,161 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.feed_forward3.hidden_balancer.prob, batch_count=1008866.6666666666, ans=0.125 2023-10-02 20:28:32,970 INFO [train.py:1046] (2/4) Epoch 29, batch 2600, loss[loss=0.1615, simple_loss=0.239, pruned_loss=0.04201, over 24446.00 frames. ], tot_loss[loss=0.1651, simple_loss=0.2426, pruned_loss=0.04378, over 4719098.38 frames. ], batch size: 63, lr: 3.53e-03, grad_scale: 8.0 2023-10-02 20:28:33,008 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9012W0022-501157 from training. Duration: 0.896 2023-10-02 20:28:36,075 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9023W0009-506302 from training. Duration: 0.896 2023-10-02 20:28:36,089 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_2821_210_2715_1_1526108385_3763560_73-296673-0_sp1.1 from training. Duration: 0.7545625 2023-10-02 20:28:36,127 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9025W0113-507442 from training. Duration: 0.896 2023-10-02 20:28:37,515 WARNING [train.py:1204] (2/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_739-2098-0 from training. Duration: 0.92 2023-10-02 20:28:37,526 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9029W0332-509722 from training. Duration: 0.768 2023-10-02 20:28:39,095 WARNING [train.py:1204] (2/4) Exclude cut with ID _16908_210_5724_1_1531119157248_3772700_68-101953-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 20:28:40,922 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9042W0011-515617 from training. Duration: 0.768 2023-10-02 20:28:42,380 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_3019_210_2028_1_1526044245_3264865_191-72235-0 from training. Duration: 0.97 2023-10-02 20:28:43,705 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9052W0006-520792 from training. Duration: 0.896 2023-10-02 20:28:45,027 WARNING [train.py:1204] (2/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_245-174553-0_sp1.1 from training. Duration: 0.8 2023-10-02 20:28:47,031 WARNING [train.py:1204] (2/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_202-67017-0 from training. Duration: 0.82 2023-10-02 20:28:48,432 WARNING [train.py:1204] (2/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_304-59227-0 from training. Duration: 0.77 2023-10-02 20:28:49,880 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_3133_210_2087_1_1526106446_2324690_246-125000-0_sp0.9 from training. Duration: 0.688875 2023-10-02 20:28:51,094 WARNING [train.py:1204] (2/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_243-42116-0 from training. Duration: 0.73 2023-10-02 20:28:53,785 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9087W0121-538492 from training. Duration: 0.896 2023-10-02 20:28:53,800 WARNING [train.py:1204] (2/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_120-76843-0 from training. Duration: 0.55 2023-10-02 20:29:00,038 WARNING [train.py:1204] (2/4) Exclude cut with ID _7852_210_5219_1_1528286471316_8139400_850-304621-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 20:29:01,468 WARNING [train.py:1204] (2/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_154-285415-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 20:29:01,480 WARNING [train.py:1204] (2/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_538-278194-0_sp1.1 from training. Duration: 0.92725 2023-10-02 20:29:01,489 WARNING [train.py:1204] (2/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_119-207162-0 from training. Duration: 0.71 2023-10-02 20:29:02,865 WARNING [train.py:1204] (2/4) Exclude cut with ID _2334_210_2667_1_1525608238880_4705129_459-104997-0_sp1.1 from training. Duration: 0.863625 2023-10-02 20:29:08,902 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9147W0019-567847 from training. Duration: 0.768 2023-10-02 20:29:15,005 WARNING [train.py:1204] (2/4) Exclude cut with ID _69884_210_9771_1_1535846392818_4166169_228-189439-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 20:29:15,032 WARNING [train.py:1204] (2/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_196-77172-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 20:29:15,102 WARNING [train.py:1204] (2/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_625-338546-0 from training. Duration: 0.88 2023-10-02 20:29:15,309 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.conv_module2.balancer1.prob, batch_count=1009066.6666666666, ans=0.125 2023-10-02 20:29:16,533 WARNING [train.py:1204] (2/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_193-327630-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 20:29:16,535 WARNING [train.py:1204] (2/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_391-24006-0_sp1.1 from training. Duration: 0.92725 2023-10-02 20:29:17,825 WARNING [train.py:1204] (2/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_197-28164-0 from training. Duration: 0.8 2023-10-02 20:29:22,469 WARNING [train.py:1204] (2/4) Exclude cut with ID _8395_210_1531_1_1528597871523_4411579_431-132470-0_sp1.1 from training. Duration: 0.67275 2023-10-02 20:29:22,500 WARNING [train.py:1204] (2/4) Exclude cut with ID _35350_210_16040_1_1533040191183_4075299_465-248326-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 20:29:23,117 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.3.encoder.layers.0.self_attn2.whiten, num_groups=1, num_channels=512, metric=14.99 vs. limit=22.5 2023-10-02 20:29:25,181 WARNING [train.py:1204] (2/4) Exclude cut with ID _16756_210_4915_1_1531047803763_7104259_101-141922-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 20:29:28,064 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9212W0005-600172 from training. Duration: 0.896 2023-10-02 20:29:28,091 WARNING [train.py:1204] (2/4) Exclude cut with ID _63186_210_2084_1_1535414479705_3908649_378-166820-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 20:29:29,717 WARNING [train.py:1204] (2/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_52-211683-0_sp1.1 from training. Duration: 0.8181875 2023-10-02 20:29:32,332 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.423e+02 1.950e+02 2.073e+02 2.316e+02 4.084e+02, threshold=4.145e+02, percent-clipped=1.0 2023-10-02 20:29:33,924 WARNING [train.py:1204] (2/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_1147-237729-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 20:29:35,312 WARNING [train.py:1204] (2/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_234-14174-0_sp0.9 from training. Duration: 0.911125 2023-10-02 20:29:35,332 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_454-276429-0 from training. Duration: 0.87 2023-10-02 20:29:36,701 WARNING [train.py:1204] (2/4) Exclude cut with ID _53713_210_24116_1_1534314487762_7416751_755-165313-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 20:29:38,112 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8278_210_2550_1_1528637099_4220413_436-317072-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 20:29:38,181 WARNING [train.py:1204] (2/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_28-324729-0_sp1.1 from training. Duration: 0.8545625 2023-10-02 20:29:38,393 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.ff3_skip_rate, batch_count=1009200.0, ans=0.0 2023-10-02 20:29:42,922 WARNING [train.py:1204] (2/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_326-165900-0 from training. Duration: 0.69 2023-10-02 20:29:42,997 WARNING [train.py:1204] (2/4) Exclude cut with ID _53489_210_6228_1_1534298357169_3835970_96-30246-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 20:29:46,096 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8275_210_4198_1_1528455320_3892571_491-17707-0_sp1.1 from training. Duration: 0.5818125 2023-10-02 20:29:47,467 INFO [train.py:1046] (2/4) Epoch 29, batch 2650, loss[loss=0.208, simple_loss=0.2714, pruned_loss=0.07232, over 19537.00 frames. ], tot_loss[loss=0.1657, simple_loss=0.2435, pruned_loss=0.04394, over 4718480.48 frames. ], batch size: 388, lr: 3.53e-03, grad_scale: 8.0 2023-10-02 20:29:49,514 WARNING [train.py:1204] (2/4) Exclude cut with ID _14348_210_8341_1_1530599434996_3778640_240-232370-0 from training. Duration: 0.84 2023-10-02 20:29:49,531 WARNING [train.py:1204] (2/4) Exclude cut with ID _45012_210_12651_1_1533608435136_5371380_54-112623-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 20:29:50,868 WARNING [train.py:1204] (2/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_328-264776-0_sp1.1 from training. Duration: 0.6090625 2023-10-02 20:29:50,934 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9325W0001-648427 from training. Duration: 0.768 2023-10-02 20:29:51,389 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.5.encoder.layers.1.whiten, num_groups=1, num_channels=256, metric=2.22 vs. limit=12.0 2023-10-02 20:29:52,190 WARNING [train.py:1204] (2/4) Exclude cut with ID _56480_210_4220_1_1534679942079_3075409_49-74717-0_sp1.1 from training. Duration: 0.936375 2023-10-02 20:29:53,585 WARNING [train.py:1204] (2/4) Exclude cut with ID _59500_210_8254_1_1534809610680_3736009_6-241869-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 20:29:53,811 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.bypass_mid.scale_min, batch_count=1009266.6666666666, ans=0.2 2023-10-02 20:29:54,976 WARNING [train.py:1204] (2/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_172-215932-0_sp0.9 from training. Duration: 0.7333125 2023-10-02 20:29:56,341 WARNING [train.py:1204] (2/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_17-90541-0_sp1.1 from training. Duration: 0.8545625 2023-10-02 20:29:59,495 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_43831_210_2158_1_1533725502_5872791_525-89447-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 20:30:00,965 WARNING [train.py:1204] (2/4) Exclude cut with ID _12302_210_5219_1_1529924446466_8107280_907-182949-0 from training. Duration: 0.92 2023-10-02 20:30:00,979 WARNING [train.py:1204] (2/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_402-102311-0_sp0.9 from training. Duration: 0.7444375 2023-10-02 20:30:01,009 WARNING [train.py:1204] (2/4) Exclude cut with ID _8021_210_7994_1_1528376451358_3653069_179-45206-0_sp0.9 from training. Duration: 0.9555625 2023-10-02 20:30:05,449 WARNING [train.py:1204] (2/4) Exclude cut with ID _40137_210_12210_1_1533290484046_3795180_249-187059-0 from training. Duration: 0.88 2023-10-02 20:30:05,536 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9653W0194-675472 from training. Duration: 0.9658125 2023-10-02 20:30:08,638 WARNING [train.py:1204] (2/4) Exclude cut with ID _51332_210_9988_1_1534244371836_3801270_336-255604-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 20:30:11,481 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_20120_210_6753_1_1531643035_5482859_510-96996-0 from training. Duration: 0.87 2023-10-02 20:30:12,696 WARNING [train.py:1204] (2/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_1167-20437-0_sp1.1 from training. Duration: 0.92725 2023-10-02 20:30:14,011 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_3475_210_2028_1_1526193578_3622569_265-276625-0 from training. Duration: 0.76 2023-10-02 20:30:18,726 WARNING [train.py:1204] (2/4) Exclude cut with ID _14860_210_4915_1_1530702020340_7343240_470-172418-0_sp1.1 from training. Duration: 0.97275 2023-10-02 20:30:18,734 WARNING [train.py:1204] (2/4) Exclude cut with ID _5444_210_2151_1_1527418808464_5312210_581-315411-0_sp1.1 from training. Duration: 0.563625 2023-10-02 20:30:18,755 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_34450_210_9547_1_1533099443_4109050_519-342439-0_sp1.1 from training. Duration: 0.97275 2023-10-02 20:30:18,778 WARNING [train.py:1204] (2/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_50-175961-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 20:30:22,847 WARNING [train.py:1204] (2/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_372-6869-0 from training. Duration: 0.44 2023-10-02 20:30:24,708 WARNING [train.py:1204] (2/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_3-237434-0 from training. Duration: 0.76 2023-10-02 20:30:26,198 WARNING [train.py:1204] (2/4) Exclude cut with ID _8772_210_1850_1_1528635595972_3664011_247-336448-0_sp1.1 from training. Duration: 0.67275 2023-10-02 20:30:28,994 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_427-78097-0 from training. Duration: 0.8 2023-10-02 20:30:29,009 WARNING [train.py:1204] (2/4) Exclude cut with ID _69884_210_9771_1_1535846392818_4166169_466-252727-0_sp1.1 from training. Duration: 0.97275 2023-10-02 20:30:29,760 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.2.encoder.layers.1.feed_forward1.out_whiten, num_groups=1, num_channels=384, metric=10.81 vs. limit=15.0 2023-10-02 20:30:30,385 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_15972_210_6400_1_1530877934_3405692_316-35478-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 20:30:32,152 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_809-17156-0_sp0.9 from training. Duration: 0.811125 2023-10-02 20:30:32,187 WARNING [train.py:1204] (2/4) Exclude cut with ID _57572_210_5517_1_1534921219624_3809484_594-263474-0_sp1.1 from training. Duration: 0.936375 2023-10-02 20:30:32,232 WARNING [train.py:1204] (2/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_190-228204-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 20:30:33,633 WARNING [train.py:1204] (2/4) Exclude cut with ID _19514_210_9259_1_1531530071330_5084230_117-21193-0_sp1.1 from training. Duration: 0.936375 2023-10-02 20:30:34,949 WARNING [train.py:1204] (2/4) Exclude cut with ID _69584_210_5219_1_1535785133775_4044450_190-21315-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 20:30:36,419 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_53071_210_16554_1_1534493434_1276796_184-302050-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 20:30:37,776 WARNING [train.py:1204] (2/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_120-76843-0_sp1.1 from training. Duration: 0.5 2023-10-02 20:30:39,527 WARNING [train.py:1204] (2/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_318-162017-0_sp1.1 from training. Duration: 0.82725 2023-10-02 20:30:39,626 WARNING [train.py:1204] (2/4) Exclude cut with ID _46067_210_4220_1_1534125571066_3307450_79-46864-0_sp1.1 from training. Duration: 0.92725 2023-10-02 20:30:39,665 WARNING [train.py:1204] (2/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_358-215558-0_sp1.1 from training. Duration: 0.8454375 2023-10-02 20:30:39,837 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.feed_forward3.hidden_balancer.prob, batch_count=1009466.6666666666, ans=0.125 2023-10-02 20:30:42,319 WARNING [train.py:1204] (2/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_621-66764-0_sp1.1 from training. Duration: 0.92725 2023-10-02 20:30:43,698 WARNING [train.py:1204] (2/4) Exclude cut with ID _42863_210_9070_1_1533902398788_3696920_338-156851-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 20:30:43,728 WARNING [train.py:1204] (2/4) Exclude cut with ID _5289_210_2084_1_1527678202736_3779299_392-261988-0_sp0.9 from training. Duration: 0.72225 2023-10-02 20:30:46,471 WARNING [train.py:1204] (2/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_409-84682-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 20:30:49,132 WARNING [train.py:1204] (2/4) Exclude cut with ID _9235_210_5219_1_1528891154253_8046120_30-326681-0_sp0.9 from training. Duration: 0.92225 2023-10-02 20:30:49,137 WARNING [train.py:1204] (2/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_389-184695-0_sp1.1 from training. Duration: 0.92725 2023-10-02 20:30:49,178 WARNING [train.py:1204] (2/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_442-320317-0 from training. Duration: 0.82 2023-10-02 20:30:54,578 WARNING [train.py:1204] (2/4) Exclude cut with ID _44778_210_2054_1_1533798127203_7443090_756-29911-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 20:30:55,983 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_401-112036-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 20:30:56,077 WARNING [train.py:1204] (2/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_181-308827-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 20:30:57,474 WARNING [train.py:1204] (2/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_332-258725-0_sp1.1 from training. Duration: 0.963625 2023-10-02 20:31:00,023 WARNING [train.py:1204] (2/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_188-260011-0_sp0.9 from training. Duration: 0.811125 2023-10-02 20:31:00,072 WARNING [train.py:1204] (2/4) Exclude cut with ID _49039_210_6315_1_1533967537541_3846118_99-302239-0_sp1.1 from training. Duration: 0.963625 2023-10-02 20:31:01,349 INFO [train.py:1046] (2/4) Epoch 29, batch 2700, loss[loss=0.1815, simple_loss=0.2678, pruned_loss=0.04753, over 24330.00 frames. ], tot_loss[loss=0.1658, simple_loss=0.2439, pruned_loss=0.04388, over 4725652.28 frames. ], batch size: 77, lr: 3.52e-03, grad_scale: 8.0 2023-10-02 20:31:02,753 WARNING [train.py:1204] (2/4) Exclude cut with ID _4122_210_2084_1_1526713164417_4353280_312-294828-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 20:31:02,759 WARNING [train.py:1204] (2/4) Exclude cut with ID _14922_210_6228_1_1530707435273_3693310_303-228738-0 from training. Duration: 0.84 2023-10-02 20:31:04,814 WARNING [train.py:1204] (2/4) Exclude cut with ID _14349_210_8341_1_1530615710818_3442470_115-285975-0_sp0.9 from training. Duration: 0.9555625 2023-10-02 20:31:07,504 WARNING [train.py:1204] (2/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_125-144174-0_sp1.1 from training. Duration: 0.3545625 2023-10-02 20:31:08,333 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.2.encoder.layers.0.whiten, num_groups=1, num_channels=384, metric=5.54 vs. limit=12.0 2023-10-02 20:31:08,954 WARNING [train.py:1204] (2/4) Exclude cut with ID _53565_210_10635_1_1534337218335_3820259_351-54570-0_sp1.1 from training. Duration: 0.97275 2023-10-02 20:31:08,976 WARNING [train.py:1204] (2/4) Exclude cut with ID _28317_210_12742_1_1532480302381_8510325_410-40065-0_sp1.1 from training. Duration: 0.963625 2023-10-02 20:31:09,005 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_38965_210_4852_1_1533174906_3484735_167-339442-0_sp1.1 from training. Duration: 0.963625 2023-10-02 20:31:10,366 WARNING [train.py:1204] (2/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_265-164486-0_sp1.1 from training. Duration: 0.87275 2023-10-02 20:31:10,376 WARNING [train.py:1204] (2/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_725-182484-0_sp1.1 from training. Duration: 0.92725 2023-10-02 20:31:11,641 WARNING [train.py:1204] (2/4) Exclude cut with ID _28317_210_12742_1_1532480302381_8510325_607-213951-0_sp0.9 from training. Duration: 0.9333125 2023-10-02 20:31:11,663 WARNING [train.py:1204] (2/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_605-133395-0_sp1.1 from training. Duration: 0.6 2023-10-02 20:31:11,686 WARNING [train.py:1204] (2/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_351-294650-0 from training. Duration: 0.63 2023-10-02 20:31:11,745 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_14953_210_2151_1_1530701710_5616165_710-165400-0_sp1.1 from training. Duration: 0.7909375 2023-10-02 20:31:15,114 WARNING [train.py:1204] (2/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_237-58433-0_sp1.1 from training. Duration: 0.67275 2023-10-02 20:31:16,445 WARNING [train.py:1204] (2/4) Exclude cut with ID _5190_210_4557_1_1527244368799_373439_28-197030-0_sp1.1 from training. Duration: 0.6545625 2023-10-02 20:31:16,513 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_743-327651-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 20:31:19,354 WARNING [train.py:1204] (2/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_76-203867-0_sp0.9 from training. Duration: 0.8 2023-10-02 20:31:20,640 WARNING [train.py:1204] (2/4) Exclude cut with ID _14603_210_4220_1_1530615688441_3097319_274-274798-0 from training. Duration: 0.98 2023-10-02 20:31:20,685 WARNING [train.py:1204] (2/4) Exclude cut with ID _14087_210_2253_1_1530529224512_3014029_226-242140-0_sp1.1 from training. Duration: 0.736375 2023-10-02 20:31:26,955 WARNING [train.py:1204] (2/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_352-59958-0_sp1.1 from training. Duration: 0.7818125 2023-10-02 20:31:26,959 WARNING [train.py:1204] (2/4) Exclude cut with ID _39411_210_5986_1_1533538825030_3343649_191-251819-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 20:31:34,201 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7046_210_6753_1_1527895337_6920917_642-106799-0_sp1.1 from training. Duration: 0.836375 2023-10-02 20:31:34,208 WARNING [train.py:1204] (2/4) Exclude cut with ID _22826_210_9994_1_1531908201522_3279169_412-3442-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 20:31:34,226 WARNING [train.py:1204] (2/4) Exclude cut with ID _60057_210_10640_1_1534903065793_3333150_171-181133-0_sp0.9 from training. Duration: 0.97775 2023-10-02 20:31:34,237 WARNING [train.py:1204] (2/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_154-135696-0_sp1.1 from training. Duration: 0.72725 2023-10-02 20:31:35,772 WARNING [train.py:1204] (2/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_148-186805-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 20:31:36,024 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.conv_module2.balancer2.min_abs, batch_count=1009733.3333333334, ans=0.5 2023-10-02 20:31:38,676 WARNING [train.py:1204] (2/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_605-165641-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 20:31:38,687 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_174-277364-0_sp0.9 from training. Duration: 0.788875 2023-10-02 20:31:38,701 WARNING [train.py:1204] (2/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_89-321955-0_sp0.9 from training. Duration: 0.888875 2023-10-02 20:31:42,728 WARNING [train.py:1204] (2/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_438-105429-0_sp1.1 from training. Duration: 0.963625 2023-10-02 20:31:42,731 WARNING [train.py:1204] (2/4) Exclude cut with ID _14970_210_15709_1_1530773543493_1715730_187-20259-0_sp0.9 from training. Duration: 0.988875 2023-10-02 20:31:50,388 WARNING [train.py:1204] (2/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_385-193052-0_sp0.9 from training. Duration: 0.9555625 2023-10-02 20:31:51,738 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7113_210_1849_1_1527942685_3448768_194-311626-0_sp1.1 from training. Duration: 0.92725 2023-10-02 20:31:54,817 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_3780_210_2087_1_1526467389_2145572_176-107140-0_sp0.9 from training. Duration: 0.7666875 2023-10-02 20:31:54,819 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_36756_210_7234_1_1533026991_966276_123-213457-0_sp1.1 from training. Duration: 0.936375 2023-10-02 20:31:58,255 WARNING [train.py:1204] (2/4) Exclude cut with ID _23557_210_8750_1_1531890106370_6846255_456-244861-0_sp1.1 from training. Duration: 0.963625 2023-10-02 20:31:58,313 WARNING [train.py:1204] (2/4) Exclude cut with ID _37086_210_9038_1_1532999540891_7021250_242-349170-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 20:31:58,391 WARNING [train.py:1204] (2/4) Exclude cut with ID _41298_210_9019_1_1533430371450_4822143_471-281351-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 20:31:59,163 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.2.encoder.layers.1.self_attn1.whiten, num_groups=1, num_channels=384, metric=16.57 vs. limit=22.5 2023-10-02 20:32:00,990 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.538e+02 1.786e+02 2.084e+02 2.386e+02 3.655e+02, threshold=4.169e+02, percent-clipped=0.0 2023-10-02 20:32:01,070 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_53378_210_17282_1_1534221071_5769509_572-126176-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 20:32:01,775 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_1507-263032-0_sp1.1 from training. Duration: 0.963625 2023-10-02 20:32:03,062 WARNING [train.py:1204] (2/4) Exclude cut with ID _13679_210_7497_1_1530534293662_3355098_411-186088-0_sp1.1 from training. Duration: 0.8545625 2023-10-02 20:32:03,329 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.feed_forward1.hidden_balancer.prob, batch_count=1009866.6666666666, ans=0.125 2023-10-02 20:32:05,695 WARNING [train.py:1204] (2/4) Exclude cut with ID _29345_210_4381_1_1532689168263_3860169_149-257268-0_sp1.1 from training. Duration: 0.736375 2023-10-02 20:32:07,196 WARNING [train.py:1204] (2/4) Exclude cut with ID _47790_210_16187_1_1533862814066_4207519_381-252014-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 20:32:07,793 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.3.encoder.layers.0.conv_module2.whiten, num_groups=1, num_channels=512, metric=2.90 vs. limit=15.0 2023-10-02 20:32:08,385 WARNING [train.py:1204] (2/4) Exclude cut with ID _35799_210_5854_1_1533263487887_3529071_512-2717-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 20:32:11,150 WARNING [train.py:1204] (2/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_505-205270-0_sp0.9 from training. Duration: 0.42225 2023-10-02 20:32:11,232 WARNING [train.py:1204] (2/4) Exclude cut with ID _14998_210_5219_1_1530705582080_7474970_731-20117-0_sp1.1 from training. Duration: 0.936375 2023-10-02 20:32:12,077 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.1.encoder.layers.1.feed_forward2.out_whiten, num_groups=1, num_channels=256, metric=3.85 vs. limit=15.0 2023-10-02 20:32:14,035 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_37351_210_7925_1_1533012931_3812113_265-85780-0_sp1.1 from training. Duration: 0.8 2023-10-02 20:32:14,048 WARNING [train.py:1204] (2/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_313-134930-0 from training. Duration: 0.97 2023-10-02 20:32:15,906 INFO [train.py:1046] (2/4) Epoch 29, batch 2750, loss[loss=0.1539, simple_loss=0.2077, pruned_loss=0.0501, over 19419.00 frames. ], tot_loss[loss=0.1664, simple_loss=0.2441, pruned_loss=0.0444, over 4704147.39 frames. ], batch size: 388, lr: 3.52e-03, grad_scale: 8.0 2023-10-02 20:32:17,299 WARNING [train.py:1204] (2/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_374-138971-0 from training. Duration: 0.94 2023-10-02 20:32:17,342 WARNING [train.py:1204] (2/4) Exclude cut with ID _30304_210_6319_1_1532476827261_7582060_1227-287886-0_sp1.1 from training. Duration: 0.936375 2023-10-02 20:32:18,835 WARNING [train.py:1204] (2/4) Exclude cut with ID _59493_210_17282_1_1535078419077_3225879_332-217544-0_sp1.1 from training. Duration: 0.97275 2023-10-02 20:32:18,878 WARNING [train.py:1204] (2/4) Exclude cut with ID _23514_210_1389_1_1531832139785_3031439_361-119640-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 20:32:21,464 WARNING [train.py:1204] (2/4) Exclude cut with ID _69584_210_5219_1_1535785133775_4044450_142-118058-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 20:32:21,485 WARNING [train.py:1204] (2/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_178-177582-0_sp1.1 from training. Duration: 0.67275 2023-10-02 20:32:21,527 WARNING [train.py:1204] (2/4) Exclude cut with ID _25303_210_17519_1_1532050207576_3635970_41-195572-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 20:32:24,711 WARNING [train.py:1204] (2/4) Exclude cut with ID _55328_210_27767_1_1534580973260_4088170_329-341322-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 20:32:26,030 WARNING [train.py:1204] (2/4) Exclude cut with ID _5608_210_2106_1_1527415439089_6819768_909-82767-0_sp1.1 from training. Duration: 0.5545625 2023-10-02 20:32:26,082 WARNING [train.py:1204] (2/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_261-297503-0_sp1.1 from training. Duration: 0.9 2023-10-02 20:32:26,095 WARNING [train.py:1204] (2/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_459-291706-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 20:32:26,095 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7762_210_1794_1_1528199508_4057408_422-227034-0 from training. Duration: 0.73 2023-10-02 20:32:26,105 WARNING [train.py:1204] (2/4) Exclude cut with ID _3067_210_2667_1_1526212974640_4503168_250-285937-0_sp0.9 from training. Duration: 0.888875 2023-10-02 20:32:27,994 WARNING [train.py:1204] (2/4) Exclude cut with ID _66873_210_5517_1_1535454136506_4293370_332-253745-0_sp1.1 from training. Duration: 0.936375 2023-10-02 20:32:33,837 WARNING [train.py:1204] (2/4) Exclude cut with ID _13938_210_9988_1_1530518718978_3693960_304-112784-0 from training. Duration: 0.46 2023-10-02 20:32:35,265 WARNING [train.py:1204] (2/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_226-267957-0_sp1.1 from training. Duration: 0.92725 2023-10-02 20:32:36,610 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_41822_210_13270_1_1533955976_4379284_608-254567-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 20:32:36,672 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_124-113562-0_sp1.1 from training. Duration: 0.8545625 2023-10-02 20:32:36,698 WARNING [train.py:1204] (2/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_117-290916-0_sp1.1 from training. Duration: 0.463625 2023-10-02 20:32:38,055 WARNING [train.py:1204] (2/4) Exclude cut with ID _28427_210_4220_1_1532581240967_3061069_102-66533-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 20:32:38,123 WARNING [train.py:1204] (2/4) Exclude cut with ID _55383_210_10635_1_1534468426172_2231669_134-128246-0_sp1.1 from training. Duration: 0.8090625 2023-10-02 20:32:39,468 WARNING [train.py:1204] (2/4) Exclude cut with ID _45871_210_5665_1_1533864614245_3296060_49-308316-0_sp1.1 from training. Duration: 0.97275 2023-10-02 20:32:39,526 WARNING [train.py:1204] (2/4) Exclude cut with ID _57572_210_5517_1_1534921219624_3809484_176-199205-0_sp1.1 from training. Duration: 0.97275 2023-10-02 20:32:43,753 WARNING [train.py:1204] (2/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_45-265684-0_sp1.1 from training. Duration: 0.6545625 2023-10-02 20:32:43,779 WARNING [train.py:1204] (2/4) Exclude cut with ID _15150_210_8341_1_1530752411803_7256384_469-239509-0_sp0.9 from training. Duration: 0.7555625 2023-10-02 20:32:45,719 WARNING [train.py:1204] (2/4) Exclude cut with ID _28461_210_2253_1_1532771775189_3041299_136-270644-0_sp1.1 from training. Duration: 0.8454375 2023-10-02 20:32:45,792 WARNING [train.py:1204] (2/4) Exclude cut with ID _69584_210_5219_1_1535785133775_4044450_302-47302-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 20:32:47,147 WARNING [train.py:1204] (2/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_166-128941-0_sp1.1 from training. Duration: 0.4909375 2023-10-02 20:32:53,994 WARNING [train.py:1204] (2/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_559-120504-0_sp1.1 from training. Duration: 0.97275 2023-10-02 20:32:55,864 WARNING [train.py:1204] (2/4) Exclude cut with ID _6456_210_1534_1_1527681634212_3181049_345-252877-0_sp1.1 from training. Duration: 0.5818125 2023-10-02 20:32:55,896 WARNING [train.py:1204] (2/4) Exclude cut with ID _28317_210_12742_1_1532480302381_8510325_902-233003-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 20:32:56,106 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.conv_module1.balancer2.prob, batch_count=1010066.6666666666, ans=0.125 2023-10-02 20:33:01,279 WARNING [train.py:1204] (2/4) Exclude cut with ID _36538_210_9994_1_1533204020363_3795620_204-261385-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 20:33:01,284 WARNING [train.py:1204] (2/4) Exclude cut with ID _14821_210_8750_1_1530680503991_6829359_189-297451-0_sp1.1 from training. Duration: 0.5 2023-10-02 20:33:01,324 WARNING [train.py:1204] (2/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_1211-226066-0_sp0.9 from training. Duration: 0.8444375 2023-10-02 20:33:06,771 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6786_210_1388_1_1527839699_4194495_273-149841-0_sp1.1 from training. Duration: 0.836375 2023-10-02 20:33:06,806 WARNING [train.py:1204] (2/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_380-162648-0_sp1.1 from training. Duration: 0.8545625 2023-10-02 20:33:06,807 WARNING [train.py:1204] (2/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_494-199538-0 from training. Duration: 0.75 2023-10-02 20:33:11,004 WARNING [train.py:1204] (2/4) Exclude cut with ID _49097_210_16049_1_1533952780207_4360979_276-87053-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 20:33:11,690 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.3.encoder.layers.0.conv_module1.whiten, num_groups=1, num_channels=512, metric=3.08 vs. limit=15.0 2023-10-02 20:33:13,594 WARNING [train.py:1204] (2/4) Exclude cut with ID _5444_210_2151_1_1527418808464_5312210_726-27165-0 from training. Duration: 0.67 2023-10-02 20:33:19,664 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_128-202104-0_sp1.1 from training. Duration: 0.57275 2023-10-02 20:33:21,189 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_11349_210_6753_1_1529800861_1101207_26-65602-0_sp1.1 from training. Duration: 0.87275 2023-10-02 20:33:21,223 WARNING [train.py:1204] (2/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_159-328157-0 from training. Duration: 0.52 2023-10-02 20:33:21,309 WARNING [train.py:1204] (2/4) Exclude cut with ID _14069_210_5632_1_1530512692033_4803390_98-21934-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 20:33:24,108 WARNING [train.py:1204] (2/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_352-59958-0_sp0.9 from training. Duration: 0.9555625 2023-10-02 20:33:24,147 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6163_210_6400_1_1527506746_4263068_283-137811-0 from training. Duration: 0.77 2023-10-02 20:33:24,174 WARNING [train.py:1204] (2/4) Exclude cut with ID _21989_210_5854_1_1531899214931_3271360_133-44759-0_sp1.1 from training. Duration: 0.7 2023-10-02 20:33:27,292 WARNING [train.py:1204] (2/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_21-174514-0_sp0.9 from training. Duration: 0.47775 2023-10-02 20:33:29,042 WARNING [train.py:1204] (2/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_201-78811-0_sp1.1 from training. Duration: 0.963625 2023-10-02 20:33:29,073 WARNING [train.py:1204] (2/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_537-164630-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 20:33:29,161 WARNING [train.py:1204] (2/4) Exclude cut with ID _14087_210_2253_1_1530529224512_3014029_156-257607-0 from training. Duration: 0.83 2023-10-02 20:33:29,373 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.conv_module1.balancer2.prob, batch_count=1010266.6666666666, ans=0.125 2023-10-02 20:33:30,439 INFO [train.py:1046] (2/4) Epoch 29, batch 2800, loss[loss=0.1613, simple_loss=0.2412, pruned_loss=0.04069, over 24301.00 frames. ], tot_loss[loss=0.1646, simple_loss=0.2419, pruned_loss=0.04361, over 4700893.31 frames. ], batch size: 61, lr: 3.52e-03, grad_scale: 16.0 2023-10-02 20:33:30,493 WARNING [train.py:1204] (2/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_308-154450-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 20:33:30,548 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_45399_210_4852_1_1533624725_7579737_214-240574-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 20:33:32,496 WARNING [train.py:1204] (2/4) Exclude cut with ID _29303_210_2575_1_1532392237303_7410649_615-305517-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 20:33:33,816 WARNING [train.py:1204] (2/4) Exclude cut with ID IC0125W0002-61808 from training. Duration: 0.708 2023-10-02 20:33:33,817 WARNING [train.py:1204] (2/4) Exclude cut with ID IC0125W0017-61823 from training. Duration: 0.759 2023-10-02 20:33:36,585 WARNING [train.py:1204] (2/4) Exclude cut with ID _53219_210_4525_1_1534834836008_4064825_4-145291-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 20:33:38,090 WARNING [train.py:1204] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_185-174805-0_sp1.1 from training. Duration: 0.6181875 2023-10-02 20:33:39,401 WARNING [train.py:1204] (2/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_635-193046-0_sp1.1 from training. Duration: 0.92725 2023-10-02 20:33:39,699 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.ff2_skip_rate, batch_count=1010266.6666666666, ans=0.0 2023-10-02 20:33:42,241 WARNING [train.py:1204] (2/4) Exclude cut with ID _46067_210_4220_1_1534125571066_3307450_321-258866-0_sp1.1 from training. Duration: 0.97275 2023-10-02 20:33:43,546 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8558_210_1410_1_1528505225_7613406_131-329524-0 from training. Duration: 0.79 2023-10-02 20:33:44,982 WARNING [train.py:1204] (2/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_847-41334-0_sp0.9 from training. Duration: 0.488875 2023-10-02 20:33:47,509 WARNING [train.py:1204] (2/4) Exclude cut with ID _14227_210_4381_1_1530701804286_3940259_125-275642-0 from training. Duration: 0.99 2023-10-02 20:33:49,355 WARNING [train.py:1204] (2/4) Exclude cut with ID _25248_210_8750_1_1532062825941_5554180_248-177255-0_sp1.1 from training. Duration: 0.963625 2023-10-02 20:33:49,397 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_40047_210_2290_1_1533290519_2374311_195-165693-0_sp1.1 from training. Duration: 0.8090625 2023-10-02 20:33:49,400 WARNING [train.py:1204] (2/4) Exclude cut with ID _23109_210_4381_1_1532150828777_901949_29-258672-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 20:33:52,118 WARNING [train.py:1204] (2/4) Exclude cut with ID _14821_210_8750_1_1530680503991_6829359_2-227151-0_sp1.1 from training. Duration: 0.7545625 2023-10-02 20:33:52,155 WARNING [train.py:1204] (2/4) Exclude cut with ID _49643_210_7989_1_1534640244224_3939031_565-53982-0_sp1.1 from training. Duration: 0.963625 2023-10-02 20:33:53,938 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7544_210_6753_1_1528591327_1471426_33-346031-0_sp0.9 from training. Duration: 0.67775 2023-10-02 20:33:54,025 WARNING [train.py:1204] (2/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_95-61782-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 20:33:59,806 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.0.bypass_mid.scale_min, batch_count=1010400.0, ans=0.2 2023-10-02 20:34:03,585 WARNING [train.py:1204] (2/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_686-54383-0_sp0.9 from training. Duration: 0.9666875 2023-10-02 20:34:04,840 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_14056_210_4163_1_1530604483_7572827_505-276823-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 20:34:07,774 WARNING [train.py:1204] (2/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_745-128443-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 20:34:09,188 WARNING [train.py:1204] (2/4) Exclude cut with ID _3844_210_1390_1_1526727803582_2871209_20-251664-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 20:34:09,254 WARNING [train.py:1204] (2/4) Exclude cut with ID _36538_210_9994_1_1533204020363_3795620_487-164628-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 20:34:11,290 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.4.encoder.layers.0.conv_module2.whiten, num_groups=1, num_channels=384, metric=8.77 vs. limit=15.0 2023-10-02 20:34:13,465 WARNING [train.py:1204] (2/4) Exclude cut with ID _5166_210_2084_1_1527073197160_4087993_309-222545-0_sp1.1 from training. Duration: 0.763625 2023-10-02 20:34:13,479 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_3531_210_3528_1_1526294203_5216456_309-227302-0 from training. Duration: 0.69 2023-10-02 20:34:13,538 WARNING [train.py:1204] (2/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_42-192460-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 20:34:15,019 WARNING [train.py:1204] (2/4) Exclude cut with ID _22083_210_8254_1_1531702862439_3678970_49-234546-0_sp1.1 from training. Duration: 0.7545625 2023-10-02 20:34:15,023 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_427-78097-0_sp0.9 from training. Duration: 0.888875 2023-10-02 20:34:19,589 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_43802_210_2290_1_1533516203_8829504_378-205254-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 20:34:19,624 WARNING [train.py:1204] (2/4) Exclude cut with ID _45329_210_7005_1_1534157951211_6774339_672-314830-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 20:34:23,725 WARNING [train.py:1204] (2/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_90-30673-0_sp1.1 from training. Duration: 0.763625 2023-10-02 20:34:25,153 WARNING [train.py:1204] (2/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_563-329943-0_sp1.1 from training. Duration: 0.9 2023-10-02 20:34:27,011 WARNING [train.py:1204] (2/4) Exclude cut with ID _21632_210_2253_1_1531994001765_3744140_154-84090-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 20:34:27,012 WARNING [train.py:1204] (2/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_103-336064-0_sp0.9 from training. Duration: 0.5666875 2023-10-02 20:34:27,062 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_43-71989-0_sp0.9 from training. Duration: 0.6333125 2023-10-02 20:34:27,112 WARNING [train.py:1204] (2/4) Exclude cut with ID _3867_210_1883_1_1526641200575_4475830_319-349850-0_sp0.9 from training. Duration: 0.7444375 2023-10-02 20:34:28,327 WARNING [train.py:1204] (2/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_629-179454-0_sp1.1 from training. Duration: 0.963625 2023-10-02 20:34:28,343 WARNING [train.py:1204] (2/4) Exclude cut with ID _65263_210_22356_1_1535763598691_3921037_49-222917-0 from training. Duration: 0.92 2023-10-02 20:34:28,366 WARNING [train.py:1204] (2/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_602-331772-0_sp1.1 from training. Duration: 0.936375 2023-10-02 20:34:29,022 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.3.encoder.layers.3.feed_forward2.out_whiten, num_groups=1, num_channels=512, metric=5.46 vs. limit=15.0 2023-10-02 20:34:29,615 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.502e+02 1.917e+02 2.083e+02 2.340e+02 4.683e+02, threshold=4.167e+02, percent-clipped=1.0 2023-10-02 20:34:31,063 WARNING [train.py:1204] (2/4) Exclude cut with ID _58414_210_8254_1_1535421609162_7151182_292-134978-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 20:34:31,113 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_38793_210_2544_1_1533176114_3849320_200-299292-0_sp1.1 from training. Duration: 0.936375 2023-10-02 20:34:32,454 WARNING [train.py:1204] (2/4) Exclude cut with ID _14970_210_15709_1_1530775547361_1966965_6-23328-0 from training. Duration: 0.86 2023-10-02 20:34:34,458 WARNING [train.py:1204] (2/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_386-225511-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 20:34:34,482 WARNING [train.py:1204] (2/4) Exclude cut with ID _38323_210_12108_1_1533121233369_3303049_416-11919-0_sp1.1 from training. Duration: 0.87275 2023-10-02 20:34:35,823 WARNING [train.py:1204] (2/4) Exclude cut with ID _4866_210_2577_1_1527404412284_4364659_256-306830-0_sp1.1 from training. Duration: 0.7909375 2023-10-02 20:34:35,933 WARNING [train.py:1204] (2/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_53-300544-0 from training. Duration: 0.67 2023-10-02 20:34:36,119 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.attention_skip_rate, batch_count=1010533.3333333334, ans=0.0 2023-10-02 20:34:43,735 INFO [train.py:1046] (2/4) Epoch 29, batch 2850, loss[loss=0.1706, simple_loss=0.2528, pruned_loss=0.04417, over 23738.00 frames. ], tot_loss[loss=0.1641, simple_loss=0.2415, pruned_loss=0.04335, over 4706450.86 frames. ], batch size: 85, lr: 3.52e-03, grad_scale: 16.0 2023-10-02 20:34:43,790 WARNING [train.py:1204] (2/4) Exclude cut with ID _41314_210_12651_1_1533459476543_3766460_80-74184-0_sp1.1 from training. Duration: 0.7545625 2023-10-02 20:34:43,812 WARNING [train.py:1204] (2/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_332-256168-0_sp1.1 from training. Duration: 0.6818125 2023-10-02 20:34:43,855 WARNING [train.py:1204] (2/4) Exclude cut with ID _48836_210_12210_1_1534075848293_3904150_429-35475-0_sp1.1 from training. Duration: 0.8090625 2023-10-02 20:34:45,312 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_144-135833-0_sp1.1 from training. Duration: 0.97275 2023-10-02 20:34:50,083 WARNING [train.py:1204] (2/4) Exclude cut with ID _14224_210_4381_1_1530529062037_4264010_380-343670-0_sp0.9 from training. Duration: 0.97775 2023-10-02 20:34:50,117 WARNING [train.py:1204] (2/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_213-265174-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 20:34:50,139 WARNING [train.py:1204] (2/4) Exclude cut with ID _20455_210_5219_1_1531565996775_8098100_86-314283-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 20:34:52,876 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_41750_210_1960_1_1533520678_3948042_68-223813-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 20:34:52,934 WARNING [train.py:1204] (2/4) Exclude cut with ID _55153_210_2253_1_1534384786677_3162096_24-198429-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 20:34:54,405 WARNING [train.py:1204] (2/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_119-207162-0_sp0.9 from training. Duration: 0.788875 2023-10-02 20:34:54,572 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.1.attention_skip_rate, batch_count=1010600.0, ans=0.0 2023-10-02 20:34:55,992 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_12868_210_6753_1_1530702479_3610564_148-172861-0 from training. Duration: 0.5 2023-10-02 20:35:02,177 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_5522_210_3273_1_1527249606_3909096_318-203153-0 from training. Duration: 0.43 2023-10-02 20:35:02,183 WARNING [train.py:1204] (2/4) Exclude cut with ID _14884_210_2252_1_1530707459689_5561250_288-189949-0_sp1.1 from training. Duration: 0.97275 2023-10-02 20:35:03,565 WARNING [train.py:1204] (2/4) Exclude cut with ID _5294_210_1747_1_1527249643928_3696179_529-242298-0 from training. Duration: 0.95 2023-10-02 20:35:05,320 WARNING [train.py:1204] (2/4) Exclude cut with ID _36538_210_9994_1_1533204020363_3795620_503-110289-0_sp1.1 from training. Duration: 0.936375 2023-10-02 20:35:07,365 WARNING [train.py:1204] (2/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_385-193052-0 from training. Duration: 0.86 2023-10-02 20:35:07,538 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.conv_module1.balancer2.prob, batch_count=1010666.6666666666, ans=0.125 2023-10-02 20:35:08,806 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_456-22141-0 from training. Duration: 0.88 2023-10-02 20:35:10,172 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_38965_210_4852_1_1533174906_3484735_284-117840-0_sp1.1 from training. Duration: 0.936375 2023-10-02 20:35:10,489 INFO [scaling.py:1118] (2/4) WithLoss: name=encoder.encoders.3.encoder.layers.1.self_attn_weights, loss-sum=0.000e+00 2023-10-02 20:35:16,727 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.0.layers.0.conv_module1.whiten, num_groups=1, num_channels=192, metric=11.85 vs. limit=15.0 2023-10-02 20:35:19,244 WARNING [train.py:1204] (2/4) Exclude cut with ID _32704_210_8954_1_1533017488840_6747370_75-201625-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 20:35:21,929 WARNING [train.py:1204] (2/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_255-312654-0_sp1.1 from training. Duration: 0.82725 2023-10-02 20:35:21,939 WARNING [train.py:1204] (2/4) Exclude cut with ID _47251_210_18628_1_1533814063128_4137250_214-88076-0_sp0.9 from training. Duration: 0.97775 2023-10-02 20:35:23,425 WARNING [train.py:1204] (2/4) Exclude cut with ID _4092_210_2151_1_1526643044449_3135759_227-213972-0_sp1.1 from training. Duration: 0.4909375 2023-10-02 20:35:23,443 WARNING [train.py:1204] (2/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_912-47473-0_sp1.1 from training. Duration: 0.6181875 2023-10-02 20:35:23,469 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6767_210_3273_1_1527855783_1718932_173-256601-0_sp1.1 from training. Duration: 0.72725 2023-10-02 20:35:25,396 WARNING [train.py:1204] (2/4) Exclude cut with ID _6274_210_2084_1_1527937583837_7494020_32-205288-0_sp1.1 from training. Duration: 0.7090625 2023-10-02 20:35:25,419 WARNING [train.py:1204] (2/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_39-248870-0 from training. Duration: 0.69 2023-10-02 20:35:28,146 WARNING [train.py:1204] (2/4) Exclude cut with ID _3541_210_2477_1_1526475408709_3263973_377-296839-0_sp1.1 from training. Duration: 0.5 2023-10-02 20:35:28,169 WARNING [train.py:1204] (2/4) Exclude cut with ID _13679_210_7497_1_1530534293662_3355098_413-56154-0_sp1.1 from training. Duration: 0.963625 2023-10-02 20:35:29,702 WARNING [train.py:1204] (2/4) Exclude cut with ID _14970_210_15709_1_1530775547361_1966965_201-84544-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 20:35:30,994 WARNING [train.py:1204] (2/4) Exclude cut with ID _21783_210_11357_1_1531742458730_3762679_105-95775-0_sp1.1 from training. Duration: 0.936375 2023-10-02 20:35:32,480 WARNING [train.py:1204] (2/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_131-191669-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 20:35:33,741 WARNING [train.py:1204] (2/4) Exclude cut with ID _14860_210_4915_1_1530702020340_7343240_695-4323-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 20:35:33,843 WARNING [train.py:1204] (2/4) Exclude cut with ID _39867_210_9826_1_1533553222030_9344259_991-130565-0_sp1.1 from training. Duration: 0.97275 2023-10-02 20:35:35,303 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_50-3289-0_sp1.1 from training. Duration: 0.82725 2023-10-02 20:35:37,395 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_54-347670-0_sp1.1 from training. Duration: 0.92725 2023-10-02 20:35:38,817 WARNING [train.py:1204] (2/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_200-153445-0_sp1.1 from training. Duration: 0.936375 2023-10-02 20:35:40,107 WARNING [train.py:1204] (2/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_279-302911-0_sp1.1 from training. Duration: 0.97275 2023-10-02 20:35:41,624 WARNING [train.py:1204] (2/4) Exclude cut with ID _14886_210_4220_1_1530702020355_3516190_159-73748-0_sp0.9 from training. Duration: 0.92225 2023-10-02 20:35:41,912 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.conv_module1.balancer2.prob, batch_count=1010800.0, ans=0.125 2023-10-02 20:35:46,932 WARNING [train.py:1204] (2/4) Exclude cut with ID _53379_210_20034_1_1534222027363_3854654_23-222715-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 20:35:48,310 WARNING [train.py:1204] (2/4) Exclude cut with ID _12555_210_8750_1_1530075594402_3780239_449-267986-0 from training. Duration: 0.83 2023-10-02 20:35:48,329 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7544_210_6753_1_1528587722_3576686_295-258479-0 from training. Duration: 0.84 2023-10-02 20:35:48,530 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.bypass.skip_rate, batch_count=1010866.6666666666, ans=0.07 2023-10-02 20:35:51,545 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7002_210_1794_1_1527935634_4034444_487-236501-0_sp0.9 from training. Duration: 0.7333125 2023-10-02 20:35:51,620 WARNING [train.py:1204] (2/4) Exclude cut with ID _21783_210_11357_1_1531742458730_3762679_244-19107-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 20:35:51,646 WARNING [train.py:1204] (2/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_390-339137-0 from training. Duration: 0.56 2023-10-02 20:35:53,079 WARNING [train.py:1204] (2/4) Exclude cut with ID _7562_210_2084_1_1528887767916_3946229_186-326422-0_sp1.1 from training. Duration: 0.763625 2023-10-02 20:35:53,134 WARNING [train.py:1204] (2/4) Exclude cut with ID _33640_210_2655_1_1533117605479_4855469_645-145235-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 20:35:53,149 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_25767_210_2161_1_1532307483_3867748_506-171777-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 20:35:53,173 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_5438_210_1794_1_1527336687_2600009_233-58072-0_sp1.1 from training. Duration: 0.7 2023-10-02 20:35:53,173 WARNING [train.py:1204] (2/4) Exclude cut with ID IC0669W0063-330908 from training. Duration: 0.896 2023-10-02 20:35:53,205 WARNING [train.py:1204] (2/4) Exclude cut with ID IC0671W0070-331913 from training. Duration: 0.896 2023-10-02 20:35:53,208 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_258-310570-0_sp1.1 from training. Duration: 0.7454375 2023-10-02 20:35:54,439 WARNING [train.py:1204] (2/4) Exclude cut with ID _9461_210_4557_1_1529060514726_3677451_162-177507-0_sp1.1 from training. Duration: 0.97275 2023-10-02 20:35:57,869 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.conv_module1.balancer1.prob, batch_count=1010933.3333333334, ans=0.125 2023-10-02 20:35:58,940 INFO [train.py:1046] (2/4) Epoch 29, batch 2900, loss[loss=0.1508, simple_loss=0.2356, pruned_loss=0.03297, over 24664.00 frames. ], tot_loss[loss=0.164, simple_loss=0.242, pruned_loss=0.04305, over 4716573.24 frames. ], batch size: 65, lr: 3.52e-03, grad_scale: 8.0 2023-10-02 20:36:00,242 WARNING [train.py:1204] (2/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_26-175553-0_sp0.9 from training. Duration: 0.67775 2023-10-02 20:36:00,267 WARNING [train.py:1204] (2/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_7-150481-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 20:36:01,638 WARNING [train.py:1204] (2/4) Exclude cut with ID _23557_210_8750_1_1531890106370_6846255_72-273262-0_sp1.1 from training. Duration: 0.963625 2023-10-02 20:36:02,932 WARNING [train.py:1204] (2/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_367-61330-0 from training. Duration: 0.75 2023-10-02 20:36:07,248 WARNING [train.py:1204] (2/4) Exclude cut with ID _39867_210_9826_1_1533553222030_9344259_1337-11141-0_sp1.1 from training. Duration: 0.936375 2023-10-02 20:36:07,280 WARNING [train.py:1204] (2/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_101-293367-0 from training. Duration: 0.63 2023-10-02 20:36:08,083 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.self_attn1.whiten.whitening_limit, batch_count=1010933.3333333334, ans=22.5 2023-10-02 20:36:08,575 WARNING [train.py:1204] (2/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_10-100367-0 from training. Duration: 0.98 2023-10-02 20:36:10,631 WARNING [train.py:1204] (2/4) Exclude cut with ID _60057_210_10640_1_1534903065793_3333150_171-181133-0_sp1.1 from training. Duration: 0.8 2023-10-02 20:36:10,634 WARNING [train.py:1204] (2/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_844-96989-0_sp0.9 from training. Duration: 0.788875 2023-10-02 20:36:13,954 WARNING [train.py:1204] (2/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_301-144595-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 20:36:14,047 WARNING [train.py:1204] (2/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_115-147343-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 20:36:16,888 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_3171_210_2087_1_1526109084_2330007_37-64334-0_sp1.1 from training. Duration: 0.7454375 2023-10-02 20:36:18,110 WARNING [train.py:1204] (2/4) Exclude cut with ID _19514_210_9259_1_1531530071330_5084230_769-272914-0_sp1.1 from training. Duration: 0.936375 2023-10-02 20:36:19,641 WARNING [train.py:1204] (2/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_76-311963-0_sp0.9 from training. Duration: 0.77775 2023-10-02 20:36:20,912 WARNING [train.py:1204] (2/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_526-62079-0 from training. Duration: 0.67 2023-10-02 20:36:22,866 WARNING [train.py:1204] (2/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_7-111954-0_sp0.9 from training. Duration: 0.62225 2023-10-02 20:36:22,980 WARNING [train.py:1204] (2/4) Exclude cut with ID _57997_210_4163_1_1534766425889_3961049_360-279140-0_sp1.1 from training. Duration: 0.97275 2023-10-02 20:36:25,686 WARNING [train.py:1204] (2/4) Exclude cut with ID _4866_210_2577_1_1527404412284_4364659_133-1696-0 from training. Duration: 0.99 2023-10-02 20:36:25,746 WARNING [train.py:1204] (2/4) Exclude cut with ID _5190_210_4557_1_1527244368799_373439_28-197030-0 from training. Duration: 0.72 2023-10-02 20:36:28,633 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.bypass.skip_rate, batch_count=1011066.6666666666, ans=0.07 2023-10-02 20:36:30,243 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_53-39003-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 20:36:30,253 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6673_210_3273_1_1527767790_3173240_351-167954-0 from training. Duration: 0.48 2023-10-02 20:36:30,268 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_2822_210_2715_1_1526112586_1999587_165-36656-0_sp1.1 from training. Duration: 0.8090625 2023-10-02 20:36:33,091 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_328-264892-0_sp1.1 from training. Duration: 0.92725 2023-10-02 20:36:33,096 WARNING [train.py:1204] (2/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_29-293896-0_sp0.9 from training. Duration: 0.67775 2023-10-02 20:36:35,148 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.4.encoder.layers.2.feed_forward3.out_whiten, num_groups=1, num_channels=384, metric=13.17 vs. limit=15.0 2023-10-02 20:36:36,049 WARNING [train.py:1204] (2/4) Exclude cut with ID _65263_210_22356_1_1535763598691_3921037_465-55448-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 20:36:37,300 WARNING [train.py:1204] (2/4) Exclude cut with ID _21989_210_5854_1_1531899214931_3271360_311-53106-0_sp1.1 from training. Duration: 0.97275 2023-10-02 20:36:40,807 WARNING [train.py:1204] (2/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_389-277915-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 20:36:43,368 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_69630_210_7234_1_1535791094_677837_63-277139-0_sp1.1 from training. Duration: 0.963625 2023-10-02 20:36:45,158 WARNING [train.py:1204] (2/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_85-138257-0 from training. Duration: 0.71 2023-10-02 20:36:45,183 WARNING [train.py:1204] (2/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_686-247600-0 from training. Duration: 0.92 2023-10-02 20:36:45,188 WARNING [train.py:1204] (2/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_108-255430-0_sp1.1 from training. Duration: 0.8818125 2023-10-02 20:36:48,000 WARNING [train.py:1204] (2/4) Exclude cut with ID _14884_210_2252_1_1530707459689_5561250_114-201399-0_sp1.1 from training. Duration: 0.6909375 2023-10-02 20:36:50,651 WARNING [train.py:1204] (2/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_506-42614-0 from training. Duration: 0.79 2023-10-02 20:36:52,458 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_43802_210_2290_1_1533516203_8829504_85-195240-0_sp1.1 from training. Duration: 0.7909375 2023-10-02 20:36:54,254 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_141-36243-0_sp1.1 from training. Duration: 0.97275 2023-10-02 20:37:00,192 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.490e+02 1.887e+02 2.065e+02 2.292e+02 3.818e+02, threshold=4.129e+02, percent-clipped=0.0 2023-10-02 20:37:02,998 WARNING [train.py:1204] (2/4) Exclude cut with ID _7852_210_5219_1_1528286471316_8139400_358-287492-0_sp1.1 from training. Duration: 0.87275 2023-10-02 20:37:03,032 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_805-71497-0_sp0.9 from training. Duration: 0.788875 2023-10-02 20:37:04,366 WARNING [train.py:1204] (2/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_178-39722-0 from training. Duration: 0.49 2023-10-02 20:37:06,969 WARNING [train.py:1204] (2/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_428-181925-0_sp1.1 from training. Duration: 0.963625 2023-10-02 20:37:06,973 WARNING [train.py:1204] (2/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_770-200430-0 from training. Duration: 0.89 2023-10-02 20:37:07,017 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_1104-271009-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 20:37:08,933 WARNING [train.py:1204] (2/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_128-296581-0_sp0.9 from training. Duration: 0.82225 2023-10-02 20:37:13,641 INFO [train.py:1046] (2/4) Epoch 29, batch 2950, loss[loss=0.1709, simple_loss=0.2607, pruned_loss=0.04052, over 24333.00 frames. ], tot_loss[loss=0.1651, simple_loss=0.2431, pruned_loss=0.04353, over 4699979.88 frames. ], batch size: 74, lr: 3.52e-03, grad_scale: 8.0 2023-10-02 20:37:15,050 WARNING [train.py:1204] (2/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_19-62511-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 20:37:17,726 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_805-71497-0 from training. Duration: 0.71 2023-10-02 20:37:17,818 WARNING [train.py:1204] (2/4) Exclude cut with ID _44778_210_2054_1_1533798127203_7443090_573-152077-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 20:37:17,824 WARNING [train.py:1204] (2/4) Exclude cut with ID _42875_210_14856_1_1533704397487_4012433_393-177816-0_sp1.1 from training. Duration: 0.963625 2023-10-02 20:37:20,927 WARNING [train.py:1204] (2/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_278-35159-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 20:37:21,021 WARNING [train.py:1204] (2/4) Exclude cut with ID _15037_210_2253_1_1530752294104_3267650_13-130361-0_sp1.1 from training. Duration: 0.9 2023-10-02 20:37:22,461 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_3019_210_2028_1_1526044245_3264865_127-135615-0 from training. Duration: 0.94 2023-10-02 20:37:22,921 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.5.encoder.layers.1.self_attn_weights.whiten_keys, num_groups=4, num_channels=128, metric=1.57 vs. limit=6.0 2023-10-02 20:37:23,900 WARNING [train.py:1204] (2/4) Exclude cut with ID _28317_210_12742_1_1532480302381_8510325_607-213951-0 from training. Duration: 0.84 2023-10-02 20:37:25,250 WARNING [train.py:1204] (2/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_436-318113-0_sp0.9 from training. Duration: 0.8333125 2023-10-02 20:37:25,251 WARNING [train.py:1204] (2/4) Exclude cut with ID _13938_210_9988_1_1530518718978_3693960_148-152225-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 20:37:30,769 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_2541_210_1794_1_1525590269_3084765_156-173713-0_sp1.1 from training. Duration: 0.736375 2023-10-02 20:37:30,930 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.0.conv_module2.balancer2.prob, batch_count=1011333.3333333334, ans=0.125 2023-10-02 20:37:32,717 WARNING [train.py:1204] (2/4) Exclude cut with ID _28323_210_16187_1_1532480395667_4100300_531-24157-0_sp1.1 from training. Duration: 0.8909375 2023-10-02 20:37:35,314 WARNING [train.py:1204] (2/4) Exclude cut with ID _29303_210_2575_1_1532392237303_7410649_382-217492-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 20:37:35,357 WARNING [train.py:1204] (2/4) Exclude cut with ID _58895_210_8750_1_1535184081320_3649089_270-158013-0_sp1.1 from training. Duration: 0.92725 2023-10-02 20:37:38,141 WARNING [train.py:1204] (2/4) Exclude cut with ID _29277_210_3279_1_1532581109293_3173069_311-206227-0_sp1.1 from training. Duration: 0.936375 2023-10-02 20:37:38,151 WARNING [train.py:1204] (2/4) Exclude cut with ID _7160_210_1876_1_1527940923231_3703799_282-322611-0_sp0.9 from training. Duration: 0.9666875 2023-10-02 20:37:38,455 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.self_attn_weights.pos_emb_skip_rate, batch_count=1011333.3333333334, ans=0.0 2023-10-02 20:37:39,508 WARNING [train.py:1204] (2/4) Exclude cut with ID _53219_210_4525_1_1534834836008_4064825_562-32295-0_sp1.1 from training. Duration: 0.963625 2023-10-02 20:37:40,874 WARNING [train.py:1204] (2/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_408-87330-0_sp1.1 from training. Duration: 0.963625 2023-10-02 20:37:40,881 WARNING [train.py:1204] (2/4) Exclude cut with ID _59202_210_26984_1_1534816810612_3549174_137-318710-0_sp1.1 from training. Duration: 0.8090625 2023-10-02 20:37:44,149 WARNING [train.py:1204] (2/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_372-30302-0 from training. Duration: 0.86 2023-10-02 20:37:44,858 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.3.encoder.layers.1.feed_forward3.out_whiten, num_groups=1, num_channels=512, metric=9.99 vs. limit=15.0 2023-10-02 20:37:48,684 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7543_210_6753_1_1528541907_1399322_178-56269-0_sp1.1 from training. Duration: 0.95275 2023-10-02 20:37:48,702 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9109W0012-549758 from training. Duration: 0.512 2023-10-02 20:37:50,051 WARNING [train.py:1204] (2/4) Exclude cut with ID _5410_210_1388_1_1527397055795_1719020_39-112221-0_sp0.9 from training. Duration: 0.7666875 2023-10-02 20:37:51,976 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9121W0012-554933 from training. Duration: 0.896 2023-10-02 20:37:53,349 WARNING [train.py:1204] (2/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_476-272757-0 from training. Duration: 0.93 2023-10-02 20:37:54,691 WARNING [train.py:1204] (2/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_1085-171718-0_sp1.1 from training. Duration: 0.92725 2023-10-02 20:37:54,760 WARNING [train.py:1204] (2/4) Exclude cut with ID _8480_210_1874_1_1528589391205_6728330_309-139896-0_sp1.1 from training. Duration: 0.736375 2023-10-02 20:37:54,760 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9130W0113-559703 from training. Duration: 0.896 2023-10-02 20:37:54,764 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_292-265928-0_sp1.1 from training. Duration: 0.463625 2023-10-02 20:37:56,184 WARNING [train.py:1204] (2/4) Exclude cut with ID _8772_210_1850_1_1528635595972_3664011_96-12756-0 from training. Duration: 0.98 2023-10-02 20:37:57,548 WARNING [train.py:1204] (2/4) Exclude cut with ID _12556_210_8341_1_1530081024100_7327690_725-193477-0_sp1.1 from training. Duration: 0.8909375 2023-10-02 20:37:58,910 WARNING [train.py:1204] (2/4) Exclude cut with ID _5289_210_2084_1_1527678202736_3779299_142-103317-0_sp1.1 from training. Duration: 0.763625 2023-10-02 20:38:01,508 WARNING [train.py:1204] (2/4) Exclude cut with ID _45012_210_12651_1_1533608435136_5371380_206-51075-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 20:38:01,720 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=1011466.6666666666, ans=0.1 2023-10-02 20:38:02,902 WARNING [train.py:1204] (2/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_176-56120-0_sp1.1 from training. Duration: 0.7545625 2023-10-02 20:38:02,924 WARNING [train.py:1204] (2/4) Exclude cut with ID _40284_210_2084_1_1533513679814_3704279_220-201864-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 20:38:02,958 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9165W0004-576638 from training. Duration: 0.896 2023-10-02 20:38:04,400 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_5438_210_1794_1_1527336687_2600009_218-343336-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 20:38:04,433 WARNING [train.py:1204] (2/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_318-162017-0 from training. Duration: 0.91 2023-10-02 20:38:09,379 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_49544_210_1794_1_1534379770_5262083_483-349298-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 20:38:10,690 WARNING [train.py:1204] (2/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_197-28164-0_sp0.9 from training. Duration: 0.888875 2023-10-02 20:38:10,750 WARNING [train.py:1204] (2/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_643-52007-0 from training. Duration: 0.57 2023-10-02 20:38:10,769 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_38965_210_4852_1_1533174906_3484735_403-332963-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 20:38:12,181 WARNING [train.py:1204] (2/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_340-174176-0 from training. Duration: 0.49 2023-10-02 20:38:15,393 WARNING [train.py:1204] (2/4) Exclude cut with ID _56708_210_13177_1_1535252351282_3422899_363-917-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 20:38:16,703 WARNING [train.py:1204] (2/4) Exclude cut with ID _19896_210_1883_1_1531709022662_6649569_525-159063-0_sp1.1 from training. Duration: 0.936375 2023-10-02 20:38:16,737 WARNING [train.py:1204] (2/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_430-116139-0_sp0.9 from training. Duration: 0.9444375 2023-10-02 20:38:19,902 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_53753_210_7234_1_1534320003_3594148_86-329250-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 20:38:19,912 WARNING [train.py:1204] (2/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_406-275649-0_sp1.1 from training. Duration: 0.3818125 2023-10-02 20:38:21,303 WARNING [train.py:1204] (2/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_1083-147536-0_sp1.1 from training. Duration: 0.8818125 2023-10-02 20:38:21,369 WARNING [train.py:1204] (2/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_54-311741-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 20:38:21,373 WARNING [train.py:1204] (2/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_237-58433-0_sp0.9 from training. Duration: 0.82225 2023-10-02 20:38:22,630 WARNING [train.py:1204] (2/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_98-51676-0_sp1.1 from training. Duration: 0.663625 2023-10-02 20:38:22,816 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=1011533.3333333334, ans=0.1 2023-10-02 20:38:24,019 WARNING [train.py:1204] (2/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_386-270472-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 20:38:24,105 WARNING [train.py:1204] (2/4) Exclude cut with ID _19023_210_4353_1_1531378892552_5820780_83-254794-0_sp1.1 from training. Duration: 0.7818125 2023-10-02 20:38:25,510 WARNING [train.py:1204] (2/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_314-93656-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 20:38:25,527 WARNING [train.py:1204] (2/4) Exclude cut with ID _14170_210_5219_1_1530525949868_4469650_336-213530-0 from training. Duration: 0.9 2023-10-02 20:38:26,930 INFO [train.py:1046] (2/4) Epoch 29, batch 3000, loss[loss=0.1752, simple_loss=0.245, pruned_loss=0.05274, over 23727.00 frames. ], tot_loss[loss=0.1659, simple_loss=0.2443, pruned_loss=0.04373, over 4718276.41 frames. ], batch size: 164, lr: 3.52e-03, grad_scale: 8.0 2023-10-02 20:38:26,930 INFO [train.py:1069] (2/4) Computing validation loss 2023-10-02 20:38:39,078 INFO [train.py:1078] (2/4) Epoch 29, validation: loss=0.3203, simple_loss=0.2757, pruned_loss=0.1825, over 1125622.00 frames. 2023-10-02 20:38:39,079 INFO [train.py:1079] (2/4) Maximum memory allocated so far is 20967MB 2023-10-02 20:38:39,217 WARNING [train.py:1204] (2/4) Exclude cut with ID _36686_210_5665_1_1533778225913_3646410_65-221120-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 20:38:42,512 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_26433_210_12108_1_1532157158_4056480_271-68178-0_sp1.1 from training. Duration: 0.92725 2023-10-02 20:38:42,549 WARNING [train.py:1204] (2/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_46-207599-0_sp0.9 from training. Duration: 0.711125 2023-10-02 20:38:45,294 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9303W0114-637133 from training. Duration: 0.896 2023-10-02 20:38:45,319 WARNING [train.py:1204] (2/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_490-207861-0 from training. Duration: 0.98 2023-10-02 20:38:47,372 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6767_210_3273_1_1527855783_1718932_173-256601-0_sp0.9 from training. Duration: 0.888875 2023-10-02 20:38:48,795 WARNING [train.py:1204] (2/4) Exclude cut with ID _13921_210_9994_1_1530841782272_3503520_327-69677-0_sp1.1 from training. Duration: 0.8181875 2023-10-02 20:38:48,878 WARNING [train.py:1204] (2/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_508-335369-0 from training. Duration: 0.48 2023-10-02 20:38:48,901 WARNING [train.py:1204] (2/4) Exclude cut with ID _64631_210_27767_1_1535349580068_4165160_24-46567-0_sp1.1 from training. Duration: 0.963625 2023-10-02 20:38:51,866 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.feed_forward3.hidden_balancer.prob, batch_count=1011600.0, ans=0.125 2023-10-02 20:38:55,645 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_89-35748-0_sp1.1 from training. Duration: 0.7181875 2023-10-02 20:39:05,357 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_26433_210_12108_1_1532157158_4056480_578-262107-0_sp1.1 from training. Duration: 0.9 2023-10-02 20:39:11,424 WARNING [train.py:1204] (2/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_129-337802-0 from training. Duration: 0.72 2023-10-02 20:39:12,793 WARNING [train.py:1204] (2/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_356-167816-0_sp0.9 from training. Duration: 0.8 2023-10-02 20:39:14,622 WARNING [train.py:1204] (2/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_137-116442-0_sp1.1 from training. Duration: 0.7090625 2023-10-02 20:39:16,544 WARNING [train.py:1204] (2/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_358-172940-0_sp1.1 from training. Duration: 0.963625 2023-10-02 20:39:16,564 WARNING [train.py:1204] (2/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_668-344147-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 20:39:18,426 WARNING [train.py:1204] (2/4) Exclude cut with ID _62337_210_12392_1_1535009273105_3412229_255-157667-0_sp1.1 from training. Duration: 0.936375 2023-10-02 20:39:18,429 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_486-151131-0 from training. Duration: 0.85 2023-10-02 20:39:18,673 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.conv_skip_rate, batch_count=1011733.3333333334, ans=0.0 2023-10-02 20:39:21,148 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_53638_210_21581_1_1534297319_5808449_885-80172-0 from training. Duration: 0.96 2023-10-02 20:39:21,270 WARNING [train.py:1204] (2/4) Exclude cut with ID _50093_210_4220_1_1534417141207_3432299_105-183067-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 20:39:22,700 WARNING [train.py:1204] (2/4) Exclude cut with ID _7562_210_2084_1_1528887767916_3946229_345-169156-0_sp0.9 from training. Duration: 0.6444375 2023-10-02 20:39:25,393 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_4528_210_3273_1_1527076654_3276777_209-219804-0_sp0.9 from training. Duration: 0.7666875 2023-10-02 20:39:25,432 WARNING [train.py:1204] (2/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_35-153886-0_sp0.9 from training. Duration: 0.9333125 2023-10-02 20:39:25,489 WARNING [train.py:1204] (2/4) Exclude cut with ID _30800_210_22382_1_1532499347904_4515959_332-121925-0_sp1.1 from training. Duration: 0.92725 2023-10-02 20:39:25,491 WARNING [train.py:1204] (2/4) Exclude cut with ID _59202_210_26984_1_1534816810612_3549174_495-65515-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 20:39:29,583 WARNING [train.py:1204] (2/4) Exclude cut with ID _6274_210_2084_1_1527937583837_7494020_769-168302-0_sp1.1 from training. Duration: 0.6454375 2023-10-02 20:39:29,618 WARNING [train.py:1204] (2/4) Exclude cut with ID _24271_210_12658_1_1531987270022_7190519_708-71345-0_sp1.1 from training. Duration: 0.936375 2023-10-02 20:39:29,618 WARNING [train.py:1204] (2/4) Exclude cut with ID 3033-130750-0096-55598-0_sp0.9 from training. Duration: 0.92225 2023-10-02 20:39:30,974 WARNING [train.py:1204] (2/4) Exclude cut with ID _14087_210_2253_1_1530529224512_3014029_219-224445-0_sp0.9 from training. Duration: 0.9333125 2023-10-02 20:39:33,712 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_174-277364-0 from training. Duration: 0.71 2023-10-02 20:39:33,791 WARNING [train.py:1204] (2/4) Exclude cut with ID _45021_210_9994_1_1533619751094_3330288_96-239010-0_sp1.1 from training. Duration: 0.82725 2023-10-02 20:39:35,133 WARNING [train.py:1204] (2/4) Exclude cut with ID _31602_210_5854_1_1532677021456_4458250_489-305116-0_sp1.1 from training. Duration: 0.97275 2023-10-02 20:39:35,151 WARNING [train.py:1204] (2/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_269-154726-0_sp0.9 from training. Duration: 0.9555625 2023-10-02 20:39:38,617 WARNING [train.py:1204] (2/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_126-54418-0_sp1.1 from training. Duration: 0.92725 2023-10-02 20:39:39,806 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.518e+02 1.868e+02 2.049e+02 2.188e+02 3.716e+02, threshold=4.098e+02, percent-clipped=0.0 2023-10-02 20:39:39,880 WARNING [train.py:1204] (2/4) Exclude cut with ID _42326_210_7770_1_1533814282531_3707269_182-224159-0_sp1.1 from training. Duration: 0.92725 2023-10-02 20:39:39,981 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7002_210_1794_1_1527939683_5157286_1-281264-0_sp0.9 from training. Duration: 0.488875 2023-10-02 20:39:41,317 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_86-16881-0 from training. Duration: 0.58 2023-10-02 20:39:41,356 WARNING [train.py:1204] (2/4) Exclude cut with ID _19514_210_9259_1_1531530071330_5084230_637-279094-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 20:39:41,389 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_26433_210_12108_1_1532157158_4056480_578-262107-0 from training. Duration: 0.99 2023-10-02 20:39:42,526 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6291_210_3273_1_1527683649_1078498_82-221196-0_sp1.1 from training. Duration: 0.6909375 2023-10-02 20:39:44,477 WARNING [train.py:1204] (2/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_402-102311-0 from training. Duration: 0.67 2023-10-02 20:39:46,548 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1015-343884-0_sp1.1 from training. Duration: 0.563625 2023-10-02 20:39:49,069 WARNING [train.py:1204] (2/4) Exclude cut with ID _13684_210_7497_1_1530619354863_3598359_312-235606-0_sp1.1 from training. Duration: 0.5454375 2023-10-02 20:39:49,104 WARNING [train.py:1204] (2/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_55-31904-0 from training. Duration: 0.88 2023-10-02 20:39:50,414 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_3133_210_2087_1_1526106446_2324690_25-332138-0 from training. Duration: 0.91 2023-10-02 20:39:50,416 WARNING [train.py:1204] (2/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_912-282412-0_sp0.9 from training. Duration: 0.5444375 2023-10-02 20:39:51,799 WARNING [train.py:1204] (2/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_458-299435-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 20:39:51,884 WARNING [train.py:1204] (2/4) Exclude cut with ID _69584_210_5219_1_1535785133775_4044450_275-88403-0_sp1.1 from training. Duration: 0.92725 2023-10-02 20:39:53,103 INFO [train.py:1046] (2/4) Epoch 29, batch 3050, loss[loss=0.1634, simple_loss=0.2548, pruned_loss=0.03599, over 24319.00 frames. ], tot_loss[loss=0.1654, simple_loss=0.2444, pruned_loss=0.04315, over 4733383.03 frames. ], batch size: 74, lr: 3.52e-03, grad_scale: 8.0 2023-10-02 20:39:53,140 WARNING [train.py:1204] (2/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_841-341679-0_sp1.1 from training. Duration: 0.52725 2023-10-02 20:39:53,148 WARNING [train.py:1204] (2/4) Exclude cut with ID _41391_210_7994_1_1533603771872_3638201_1-54521-0_sp1.1 from training. Duration: 0.97275 2023-10-02 20:39:54,583 WARNING [train.py:1204] (2/4) Exclude cut with ID _56235_210_10640_1_1534557335235_4161099_495-186609-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 20:39:54,762 WARNING [train.py:1204] (2/4) Exclude cut with ID _4681_210_3525_1_1527250689464_120889_15-126760-0 from training. Duration: 0.81 2023-10-02 20:39:56,206 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_3019_210_2028_1_1526044245_3264865_127-135615-0_sp1.1 from training. Duration: 0.8545625 2023-10-02 20:39:57,646 WARNING [train.py:1204] (2/4) Exclude cut with ID _30191_210_1664_1_1533189915047_7196670_915-21561-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 20:39:59,010 WARNING [train.py:1204] (2/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_6-214815-0_sp0.9 from training. Duration: 0.9444375 2023-10-02 20:40:01,707 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_45399_210_4852_1_1533624725_7579737_190-137494-0_sp1.1 from training. Duration: 0.97275 2023-10-02 20:40:06,312 WARNING [train.py:1204] (2/4) Exclude cut with ID _41314_210_12651_1_1533459476543_3766460_207-132038-0 from training. Duration: 0.94 2023-10-02 20:40:09,659 WARNING [train.py:1204] (2/4) Exclude cut with ID _14603_210_4220_1_1530615688441_3097319_90-31383-0 from training. Duration: 0.87 2023-10-02 20:40:09,703 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_15517_210_6753_1_1530952293_1664795_119-31001-0 from training. Duration: 0.94 2023-10-02 20:40:09,730 WARNING [train.py:1204] (2/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_684-85342-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 20:40:14,266 WARNING [train.py:1204] (2/4) Exclude cut with ID _14603_210_4220_1_1530615688441_3097319_97-325053-0_sp1.1 from training. Duration: 0.836375 2023-10-02 20:40:14,442 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.conv_skip_rate, batch_count=1012000.0, ans=0.0 2023-10-02 20:40:16,446 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.feed_forward1.out_proj.dropout_p, batch_count=1012000.0, ans=0.1 2023-10-02 20:40:18,886 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_68702_210_23689_1_1535799577_3420068_168-25150-0_sp1.1 from training. Duration: 0.97275 2023-10-02 20:40:18,896 WARNING [train.py:1204] (2/4) Exclude cut with ID _65134_210_14763_1_1535335393985_3505368_111-27984-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 20:40:18,947 WARNING [train.py:1204] (2/4) Exclude cut with ID _48678_210_5663_1_1533988830947_3769199_276-226230-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 20:40:21,721 WARNING [train.py:1204] (2/4) Exclude cut with ID _28427_210_4220_1_1532581240967_3061069_43-84928-0_sp1.1 from training. Duration: 0.9 2023-10-02 20:40:21,759 WARNING [train.py:1204] (2/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_158-250116-0_sp1.1 from training. Duration: 0.563625 2023-10-02 20:40:21,786 WARNING [train.py:1204] (2/4) Exclude cut with ID _66873_210_5517_1_1535454136506_4293370_279-332556-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 20:40:23,193 WARNING [train.py:1204] (2/4) Exclude cut with ID _42863_210_9070_1_1533902398788_3696920_488-230146-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 20:40:23,193 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_387-247159-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 20:40:23,277 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6729_210_2550_1_1528032481_1788725_261-42619-0_sp1.1 from training. Duration: 0.97275 2023-10-02 20:40:24,748 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.ff2_skip_rate, batch_count=1012066.6666666666, ans=0.0 2023-10-02 20:40:25,822 WARNING [train.py:1204] (2/4) Exclude cut with ID _19896_210_1883_1_1531709022662_6649569_831-44724-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 20:40:28,606 WARNING [train.py:1204] (2/4) Exclude cut with ID _55153_210_2253_1_1534384786677_3162096_280-285944-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 20:40:28,641 WARNING [train.py:1204] (2/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_448-4048-0 from training. Duration: 0.96 2023-10-02 20:40:29,908 WARNING [train.py:1204] (2/4) Exclude cut with ID _29678_210_7893_1_1532421042787_3906118_456-202548-0_sp1.1 from training. Duration: 0.97275 2023-10-02 20:40:29,921 WARNING [train.py:1204] (2/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_107-332398-0_sp1.1 from training. Duration: 0.7090625 2023-10-02 20:40:32,690 WARNING [train.py:1204] (2/4) Exclude cut with ID _13921_210_9994_1_1530838702800_3003429_46-149444-0_sp1.1 from training. Duration: 0.8545625 2023-10-02 20:40:32,731 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_257-20733-0_sp0.9 from training. Duration: 0.8444375 2023-10-02 20:40:34,530 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_53753_210_7234_1_1534320003_3594148_385-234809-0_sp1.1 from training. Duration: 0.963625 2023-10-02 20:40:35,828 WARNING [train.py:1204] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_292-215346-0_sp1.1 from training. Duration: 0.8818125 2023-10-02 20:40:39,979 WARNING [train.py:1204] (2/4) Exclude cut with ID _68965_210_1883_1_1535698962272_3497490_196-124268-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 20:40:41,260 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_23801_210_16223_1_1532066392_3294657_570-5439-0_sp1.1 from training. Duration: 0.8818125 2023-10-02 20:40:45,572 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.ff2_skip_rate, batch_count=1012133.3333333334, ans=0.0 2023-10-02 20:40:46,581 WARNING [train.py:1204] (2/4) Exclude cut with ID _5166_210_2084_1_1527073197160_4087993_349-6928-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 20:40:47,889 WARNING [train.py:1204] (2/4) Exclude cut with ID _60621_210_17071_1_1535013053530_3671739_226-255202-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 20:40:47,893 WARNING [train.py:1204] (2/4) Exclude cut with ID _41817_210_18835_1_1533434477160_7423049_448-130302-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 20:40:48,043 WARNING [train.py:1204] (2/4) Exclude cut with ID _6653_210_2629_1_1527919172441_6732679_758-143677-0_sp1.1 from training. Duration: 0.8909375 2023-10-02 20:40:48,182 INFO [scaling.py:1118] (2/4) WithLoss: name=encoder.encoders.0.layers.1.self_attn_weights, loss-sum=0.000e+00 2023-10-02 20:40:49,781 WARNING [train.py:1204] (2/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_912-47473-0_sp0.9 from training. Duration: 0.7555625 2023-10-02 20:40:49,829 WARNING [train.py:1204] (2/4) Exclude cut with ID _6694_210_4599_1_1527852637490_3965529_356-169233-0_sp1.1 from training. Duration: 0.9 2023-10-02 20:40:51,177 WARNING [train.py:1204] (2/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_1-99680-0 from training. Duration: 0.67 2023-10-02 20:40:51,267 WARNING [train.py:1204] (2/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_199-86432-0_sp1.1 from training. Duration: 0.8909375 2023-10-02 20:40:52,484 WARNING [train.py:1204] (2/4) Exclude cut with ID _61148_210_6897_1_1535094022726_4217610_362-182430-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 20:40:53,855 WARNING [train.py:1204] (2/4) Exclude cut with ID _49376_210_5986_1_1533985278732_3370740_301-154763-0 from training. Duration: 0.98 2023-10-02 20:40:55,362 WARNING [train.py:1204] (2/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_572-185150-0_sp1.1 from training. Duration: 0.8818125 2023-10-02 20:41:01,065 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_47896_210_20508_1_1534492518_5958868_558-314572-0_sp1.1 from training. Duration: 0.8818125 2023-10-02 20:41:02,423 WARNING [train.py:1204] (2/4) Exclude cut with ID _45560_210_21982_1_1533812603951_3584999_111-6376-0_sp0.9 from training. Duration: 0.9333125 2023-10-02 20:41:02,730 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.feed_forward3.hidden_balancer.prob, batch_count=1012200.0, ans=0.125 2023-10-02 20:41:05,904 WARNING [train.py:1204] (2/4) Exclude cut with ID _14170_210_5219_1_1530525949868_4469650_412-317077-0_sp1.1 from training. Duration: 0.6818125 2023-10-02 20:41:06,528 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.3.encoder.layers.3.feed_forward1.out_whiten, num_groups=1, num_channels=512, metric=5.91 vs. limit=15.0 2023-10-02 20:41:07,159 INFO [train.py:1046] (2/4) Epoch 29, batch 3100, loss[loss=0.1697, simple_loss=0.2452, pruned_loss=0.04708, over 23358.00 frames. ], tot_loss[loss=0.1652, simple_loss=0.2439, pruned_loss=0.04328, over 4721290.21 frames. ], batch size: 119, lr: 3.52e-03, grad_scale: 8.0 2023-10-02 20:41:08,620 WARNING [train.py:1204] (2/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_718-68505-0 from training. Duration: 0.89 2023-10-02 20:41:11,340 WARNING [train.py:1204] (2/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_77-311404-0 from training. Duration: 0.94 2023-10-02 20:41:12,766 WARNING [train.py:1204] (2/4) Exclude cut with ID _60229_210_27767_1_1534838406158_3655350_264-279573-0 from training. Duration: 0.83 2023-10-02 20:41:15,813 WARNING [train.py:1204] (2/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_106-300985-0_sp1.1 from training. Duration: 0.8181875 2023-10-02 20:41:19,062 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_4183_210_2087_1_1526729100_1869838_79-277189-0_sp1.1 from training. Duration: 0.963625 2023-10-02 20:41:19,079 WARNING [train.py:1204] (2/4) Exclude cut with ID _28427_210_4220_1_1532581240967_3061069_195-295268-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 20:41:19,309 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.feed_forward1.hidden_balancer.prob, batch_count=1012266.6666666666, ans=0.125 2023-10-02 20:41:21,840 WARNING [train.py:1204] (2/4) Exclude cut with ID _4092_210_2151_1_1526643044449_3135759_356-278694-0_sp0.9 from training. Duration: 0.52225 2023-10-02 20:41:25,138 WARNING [train.py:1204] (2/4) Exclude cut with ID _14459_210_2228_1_1531443641946_7516700_777-71446-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 20:41:30,628 WARNING [train.py:1204] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_520-126729-0 from training. Duration: 0.7 2023-10-02 20:41:34,774 WARNING [train.py:1204] (2/4) Exclude cut with ID _13684_210_7497_1_1530619354863_3598359_312-235606-0_sp0.9 from training. Duration: 0.6666875 2023-10-02 20:41:34,828 WARNING [train.py:1204] (2/4) Exclude cut with ID _37537_210_2253_1_1533171758568_3421949_283-349402-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 20:41:36,157 WARNING [train.py:1204] (2/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_29-117932-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 20:41:36,185 WARNING [train.py:1204] (2/4) Exclude cut with ID _29303_210_2575_1_1532392237303_7410649_721-283351-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 20:41:39,211 WARNING [train.py:1204] (2/4) Exclude cut with ID _14886_210_4220_1_1530702020355_3516190_175-229073-0_sp0.9 from training. Duration: 0.5 2023-10-02 20:41:40,622 WARNING [train.py:1204] (2/4) Exclude cut with ID _15458_210_8341_1_1530858614263_3834410_70-268830-0_sp1.1 from training. Duration: 0.97275 2023-10-02 20:41:40,639 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_2295_210_2087_1_1525500993_2407863_234-83104-0 from training. Duration: 0.62 2023-10-02 20:41:40,640 WARNING [train.py:1204] (2/4) Exclude cut with ID _22961_210_8254_1_1531976366672_4278180_317-199546-0_sp1.1 from training. Duration: 0.7818125 2023-10-02 20:41:42,117 WARNING [train.py:1204] (2/4) Exclude cut with ID _27894_210_12392_1_1532415496078_6902439_547-196022-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 20:41:43,508 WARNING [train.py:1204] (2/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_46-207599-0 from training. Duration: 0.64 2023-10-02 20:41:44,971 WARNING [train.py:1204] (2/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_426-326246-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 20:41:46,477 WARNING [train.py:1204] (2/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_445-117398-0_sp0.9 from training. Duration: 0.988875 2023-10-02 20:41:46,536 WARNING [train.py:1204] (2/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_563-347832-0 from training. Duration: 0.89 2023-10-02 20:41:50,310 WARNING [train.py:1204] (2/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_252-216674-0 from training. Duration: 0.54 2023-10-02 20:41:50,395 WARNING [train.py:1204] (2/4) Exclude cut with ID _59611_210_8895_1_1534741198240_3650361_187-212779-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 20:41:51,594 WARNING [train.py:1204] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_282-159367-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 20:41:52,892 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_68702_210_23689_1_1535799577_3420068_33-136883-0_sp1.1 from training. Duration: 0.936375 2023-10-02 20:41:52,909 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_47228_210_4983_1_1533780021_3672027_184-293706-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 20:41:54,761 WARNING [train.py:1204] (2/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_656-188361-0_sp0.9 from training. Duration: 0.9666875 2023-10-02 20:41:54,994 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.self_attn_weights.pos_emb_skip_rate, batch_count=1012466.6666666666, ans=0.0 2023-10-02 20:41:55,647 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.2.encoder.layers.0.feed_forward1.out_whiten, num_groups=1, num_channels=384, metric=5.31 vs. limit=15.0 2023-10-02 20:41:56,188 WARNING [train.py:1204] (2/4) Exclude cut with ID _4866_210_2577_1_1527404412284_4364659_352-157930-0_sp0.9 from training. Duration: 0.9 2023-10-02 20:41:56,188 WARNING [train.py:1204] (2/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_210-106047-0_sp1.1 from training. Duration: 0.8818125 2023-10-02 20:41:57,585 WARNING [train.py:1204] (2/4) Exclude cut with ID _9389_210_1664_1_1529217337301_5289269_289-213417-0_sp1.1 from training. Duration: 0.8090625 2023-10-02 20:41:57,617 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_3910_210_1794_1_1526777667_3747260_178-41186-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 20:41:57,627 WARNING [train.py:1204] (2/4) Exclude cut with ID _27052_210_5986_1_1532329197187_3585009_24-172263-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 20:41:57,631 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_5522_210_3273_1_1527249606_3909096_318-203153-0_sp1.1 from training. Duration: 0.3909375 2023-10-02 20:42:01,689 WARNING [train.py:1204] (2/4) Exclude cut with ID _27894_210_12392_1_1532415496078_6902439_222-51982-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 20:42:03,022 WARNING [train.py:1204] (2/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_87-131427-0 from training. Duration: 0.78 2023-10-02 20:42:05,721 WARNING [train.py:1204] (2/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_372-298093-0_sp0.9 from training. Duration: 0.888875 2023-10-02 20:42:05,781 WARNING [train.py:1204] (2/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_663-166973-0 from training. Duration: 0.8 2023-10-02 20:42:07,524 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.478e+02 1.894e+02 2.077e+02 2.394e+02 5.109e+02, threshold=4.155e+02, percent-clipped=1.0 2023-10-02 20:42:07,615 WARNING [train.py:1204] (2/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_429-177469-0_sp1.1 from training. Duration: 0.936375 2023-10-02 20:42:07,647 WARNING [train.py:1204] (2/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_724-172651-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 20:42:07,683 WARNING [train.py:1204] (2/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_103-336064-0 from training. Duration: 0.51 2023-10-02 20:42:18,203 WARNING [train.py:1204] (2/4) Exclude cut with ID _48677_210_5663_1_1533902405919_3892989_111-11439-0 from training. Duration: 0.96 2023-10-02 20:42:19,695 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.balancer2.prob, batch_count=1012600.0, ans=0.125 2023-10-02 20:42:20,656 INFO [train.py:1046] (2/4) Epoch 29, batch 3150, loss[loss=0.175, simple_loss=0.2649, pruned_loss=0.0426, over 24623.00 frames. ], tot_loss[loss=0.1649, simple_loss=0.2437, pruned_loss=0.04301, over 4735292.47 frames. ], batch size: 73, lr: 3.52e-03, grad_scale: 8.0 2023-10-02 20:42:20,722 WARNING [train.py:1204] (2/4) Exclude cut with ID _8085_210_4557_1_1528455633493_3716100_95-103398-0_sp1.1 from training. Duration: 0.963625 2023-10-02 20:42:21,364 WARNING [train.py:1204] (2/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_1041-110837-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 20:42:24,086 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_50050_210_16223_1_1535435796_7290138_1155-121757-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 20:42:24,088 WARNING [train.py:1204] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_602-275260-0_sp0.9 from training. Duration: 0.911125 2023-10-02 20:42:24,152 WARNING [train.py:1204] (2/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_265-164486-0 from training. Duration: 0.96 2023-10-02 20:42:24,353 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.ff2_skip_rate, batch_count=1012600.0, ans=0.0 2023-10-02 20:42:27,312 WARNING [train.py:1204] (2/4) Exclude cut with ID _46067_210_4220_1_1534125571066_3307450_142-229375-0_sp1.1 from training. Duration: 0.963625 2023-10-02 20:42:27,351 WARNING [train.py:1204] (2/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_310-26186-0_sp1.1 from training. Duration: 0.636375 2023-10-02 20:42:29,979 WARNING [train.py:1204] (2/4) Exclude cut with ID _28461_210_2253_1_1532771775189_3041299_136-270644-0 from training. Duration: 0.93 2023-10-02 20:42:31,384 WARNING [train.py:1204] (2/4) Exclude cut with ID _14860_210_4915_1_1530702020340_7343240_438-286372-0_sp1.1 from training. Duration: 0.936375 2023-10-02 20:42:34,120 WARNING [train.py:1204] (2/4) Exclude cut with ID IC0128W0004-63309 from training. Duration: 0.831 2023-10-02 20:42:34,300 WARNING [train.py:1204] (2/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_135-174080-0 from training. Duration: 0.53 2023-10-02 20:42:35,624 WARNING [train.py:1204] (2/4) Exclude cut with ID _41326_210_5854_1_1533778076509_3492080_277-59511-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 20:42:35,696 WARNING [train.py:1204] (2/4) Exclude cut with ID IC0144W0112-71379 from training. Duration: 0.992 2023-10-02 20:42:35,756 WARNING [train.py:1204] (2/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_711-15688-0_sp0.9 from training. Duration: 0.47775 2023-10-02 20:42:37,099 WARNING [train.py:1204] (2/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_237-332027-0 from training. Duration: 0.95 2023-10-02 20:42:38,842 WARNING [train.py:1204] (2/4) Exclude cut with ID _7970_210_1883_1_1528455616296_3472230_25-178407-0 from training. Duration: 0.71 2023-10-02 20:42:38,844 WARNING [train.py:1204] (2/4) Exclude cut with ID _8480_210_1874_1_1528589391205_6728330_309-139896-0 from training. Duration: 0.81 2023-10-02 20:42:38,860 WARNING [train.py:1204] (2/4) Exclude cut with ID _39571_210_13449_1_1533801584143_3651077_319-268877-0_sp1.1 from training. Duration: 0.936375 2023-10-02 20:42:38,864 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_43802_210_2290_1_1533516203_8829504_155-246880-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 20:42:40,136 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_566-122599-0_sp1.1 from training. Duration: 0.936375 2023-10-02 20:42:40,254 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_12868_210_6753_1_1530701932_530275_18-76322-0 from training. Duration: 0.87 2023-10-02 20:42:41,580 WARNING [train.py:1204] (2/4) Exclude cut with ID _56494_210_4220_1_1535284837625_3033169_159-106035-0_sp1.1 from training. Duration: 0.963625 2023-10-02 20:42:41,616 WARNING [train.py:1204] (2/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_98-276013-0_sp1.1 from training. Duration: 0.963625 2023-10-02 20:42:41,812 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.ff3_skip_rate, batch_count=1012666.6666666666, ans=0.0 2023-10-02 20:42:42,915 WARNING [train.py:1204] (2/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_330-216124-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 20:42:46,136 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_177-58005-0_sp0.9 from training. Duration: 0.688875 2023-10-02 20:42:48,945 WARNING [train.py:1204] (2/4) Exclude cut with ID _56235_210_10640_1_1534557335235_4161099_343-139898-0 from training. Duration: 0.96 2023-10-02 20:42:49,016 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6961_210_2087_1_1527937168_6815011_395-185060-0_sp1.1 from training. Duration: 0.863625 2023-10-02 20:42:49,897 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=1012733.3333333334, ans=0.1 2023-10-02 20:42:52,461 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7002_210_1794_1_1527939683_5157286_184-229630-0_sp0.9 from training. Duration: 0.82225 2023-10-02 20:42:54,201 WARNING [train.py:1204] (2/4) Exclude cut with ID _59170_210_6721_1_1534983633960_4203335_153-18895-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 20:42:54,247 WARNING [train.py:1204] (2/4) Exclude cut with ID _49643_210_7989_1_1534640244224_3939031_598-161926-0 from training. Duration: 0.9 2023-10-02 20:42:55,744 WARNING [train.py:1204] (2/4) Exclude cut with ID _4068_210_2629_1_1526697350395_1546259_224-288693-0 from training. Duration: 0.59 2023-10-02 20:42:57,040 WARNING [train.py:1204] (2/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_718-68505-0_sp1.1 from training. Duration: 0.8090625 2023-10-02 20:42:57,090 WARNING [train.py:1204] (2/4) Exclude cut with ID _4068_210_2629_1_1526697350395_1546259_224-288693-0_sp0.9 from training. Duration: 0.6555625 2023-10-02 20:42:57,101 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_2296_210_2087_1_1525503448_3397335_57-185430-0_sp0.9 from training. Duration: 0.6666875 2023-10-02 20:42:58,415 WARNING [train.py:1204] (2/4) Exclude cut with ID _15458_210_8341_1_1530858614263_3834410_49-235472-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 20:42:58,417 WARNING [train.py:1204] (2/4) Exclude cut with ID _64135_210_2253_1_1535677267876_7608972_498-5039-0_sp0.9 from training. Duration: 0.8666875 2023-10-02 20:42:58,620 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.balancer1.prob, batch_count=1012733.3333333334, ans=0.125 2023-10-02 20:42:59,822 WARNING [train.py:1204] (2/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_283-220372-0_sp0.9 from training. Duration: 0.62225 2023-10-02 20:42:59,830 WARNING [train.py:1204] (2/4) Exclude cut with ID _6473_210_1741_1_1527901307396_3815380_108-17335-0_sp0.9 from training. Duration: 0.72225 2023-10-02 20:43:01,025 WARNING [train.py:1204] (2/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_113-92608-0 from training. Duration: 0.54 2023-10-02 20:43:01,084 WARNING [train.py:1204] (2/4) Exclude cut with ID _14170_210_5219_1_1530525949868_4469650_272-249985-0_sp1.1 from training. Duration: 0.6090625 2023-10-02 20:43:02,388 WARNING [train.py:1204] (2/4) Exclude cut with ID _50376_210_13401_1_1534031502101_4217740_518-187390-0_sp1.1 from training. Duration: 0.97275 2023-10-02 20:43:03,728 WARNING [train.py:1204] (2/4) Exclude cut with ID _59738_210_15379_1_1534849512562_3941470_165-248260-0_sp1.1 from training. Duration: 0.92725 2023-10-02 20:43:03,738 WARNING [train.py:1204] (2/4) Exclude cut with ID _23818_210_6228_1_1531917063884_3676920_400-343925-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 20:43:03,805 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6961_210_2087_1_1527937168_6815011_562-165518-0 from training. Duration: 0.77 2023-10-02 20:43:05,248 WARNING [train.py:1204] (2/4) Exclude cut with ID _24271_210_12658_1_1531987270022_7190519_289-207931-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 20:43:07,989 WARNING [train.py:1204] (2/4) Exclude cut with ID _14632_210_9988_1_1530691279392_3923200_332-105904-0 from training. Duration: 0.9 2023-10-02 20:43:08,001 WARNING [train.py:1204] (2/4) Exclude cut with ID _40402_210_18219_1_1533522324115_2953150_203-232438-0_sp1.1 from training. Duration: 0.97275 2023-10-02 20:43:11,086 WARNING [train.py:1204] (2/4) Exclude cut with ID _13252_210_1925_1_1530671372047_3336909_263-168388-0 from training. Duration: 0.86 2023-10-02 20:43:11,154 WARNING [train.py:1204] (2/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_174-205058-0 from training. Duration: 0.95 2023-10-02 20:43:12,511 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_94-195801-0_sp1.1 from training. Duration: 0.8545625 2023-10-02 20:43:12,543 WARNING [train.py:1204] (2/4) Exclude cut with ID _55153_210_2253_1_1534384786677_3162096_269-103204-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 20:43:12,619 WARNING [train.py:1204] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_748-36150-0 from training. Duration: 0.91 2023-10-02 20:43:13,799 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_12868_210_6753_1_1530702479_3610564_148-172861-0_sp1.1 from training. Duration: 0.4545625 2023-10-02 20:43:13,844 WARNING [train.py:1204] (2/4) Exclude cut with ID _67499_210_17282_1_1535509805220_3692430_460-77730-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 20:43:17,036 WARNING [train.py:1204] (2/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_374-20753-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 20:43:20,051 WARNING [train.py:1204] (2/4) Exclude cut with ID _45329_210_7005_1_1534157951211_6774339_374-289983-0_sp1.1 from training. Duration: 0.97275 2023-10-02 20:43:20,093 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_34886_210_10633_1_1533203882_1755661_76-156096-0_sp0.9 from training. Duration: 0.9666875 2023-10-02 20:43:25,526 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_238-256668-0_sp0.9 from training. Duration: 0.7666875 2023-10-02 20:43:27,427 WARNING [train.py:1204] (2/4) Exclude cut with ID _46683_210_9642_1_1533730913492_4591116_489-146727-0_sp1.1 from training. Duration: 0.97275 2023-10-02 20:43:30,110 WARNING [train.py:1204] (2/4) Exclude cut with ID _13938_210_9988_1_1530518718978_3693960_304-112784-0_sp0.9 from training. Duration: 0.511125 2023-10-02 20:43:32,352 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.2.encoder.layers.1.self_attn_weights.whiten_keys, num_groups=4, num_channels=128, metric=3.20 vs. limit=6.0 2023-10-02 20:43:34,259 INFO [train.py:1046] (2/4) Epoch 29, batch 3200, loss[loss=0.1714, simple_loss=0.2522, pruned_loss=0.04533, over 24072.00 frames. ], tot_loss[loss=0.1636, simple_loss=0.2417, pruned_loss=0.04275, over 4734320.60 frames. ], batch size: 80, lr: 3.52e-03, grad_scale: 16.0 2023-10-02 20:43:34,324 WARNING [train.py:1204] (2/4) Exclude cut with ID _24271_210_12658_1_1531987270022_7190519_243-46549-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 20:43:34,328 WARNING [train.py:1204] (2/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_78-182912-0_sp1.1 from training. Duration: 0.536375 2023-10-02 20:43:34,625 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.bypass.skip_rate, batch_count=1012933.3333333334, ans=0.07 2023-10-02 20:43:37,196 WARNING [train.py:1204] (2/4) Exclude cut with ID _57572_210_5517_1_1534921219624_3809484_121-321334-0_sp1.1 from training. Duration: 0.97275 2023-10-02 20:43:37,446 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.conv_module2.balancer1.min_positive, batch_count=1012933.3333333334, ans=0.025 2023-10-02 20:43:38,531 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_40047_210_2290_1_1533290519_2374311_254-102149-0_sp1.1 from training. Duration: 0.963625 2023-10-02 20:43:38,531 WARNING [train.py:1204] (2/4) Exclude cut with ID _28461_210_2253_1_1532771775189_3041299_96-152158-0 from training. Duration: 0.85 2023-10-02 20:43:41,857 WARNING [train.py:1204] (2/4) Exclude cut with ID _29877_210_6228_1_1532397791106_5276149_161-102979-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 20:43:42,069 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.bypass.skip_rate, batch_count=1012933.3333333334, ans=0.07 2023-10-02 20:43:44,702 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_3787_210_2040_1_1526477544_5478586_603-120324-0_sp0.9 from training. Duration: 0.9 2023-10-02 20:43:49,183 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_41750_210_1960_1_1533520678_3948042_247-213456-0_sp1.1 from training. Duration: 0.97275 2023-10-02 20:43:58,187 WARNING [train.py:1204] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_127-119109-0_sp0.9 from training. Duration: 0.8 2023-10-02 20:44:02,562 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.bypass_mid.scale_min, batch_count=1013066.6666666666, ans=0.2 2023-10-02 20:44:09,298 WARNING [train.py:1204] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_215-202973-0 from training. Duration: 0.88 2023-10-02 20:44:09,368 WARNING [train.py:1204] (2/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_17-320491-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 20:44:12,701 WARNING [train.py:1204] (2/4) Exclude cut with ID _5166_210_2084_1_1527073197160_4087993_310-114452-0 from training. Duration: 0.59 2023-10-02 20:44:12,777 WARNING [train.py:1204] (2/4) Exclude cut with ID _15133_210_6228_1_1530794512074_3080379_29-264458-0_sp0.9 from training. Duration: 0.6444375 2023-10-02 20:44:15,003 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.2.encoder.layers.2.nonlin_attention.whiten2, num_groups=1, num_channels=384, metric=5.91 vs. limit=15.0 2023-10-02 20:44:16,149 WARNING [train.py:1204] (2/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_159-281310-0_sp1.1 from training. Duration: 0.863625 2023-10-02 20:44:16,167 WARNING [train.py:1204] (2/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_195-167011-0_sp0.9 from training. Duration: 0.8333125 2023-10-02 20:44:17,526 WARNING [train.py:1204] (2/4) Exclude cut with ID _14087_210_2253_1_1530529224512_3014029_156-257607-0_sp1.1 from training. Duration: 0.7545625 2023-10-02 20:44:20,265 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_2295_210_2087_1_1525500993_2407863_116-238894-0 from training. Duration: 0.7 2023-10-02 20:44:21,602 WARNING [train.py:1204] (2/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_321-335475-0_sp0.9 from training. Duration: 0.5 2023-10-02 20:44:23,621 WARNING [train.py:1204] (2/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_188-114464-0 from training. Duration: 0.82 2023-10-02 20:44:25,105 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_2592_210_2290_1_1525697025_4882925_157-70512-0 from training. Duration: 0.91 2023-10-02 20:44:26,448 WARNING [train.py:1204] (2/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_138-64210-0_sp1.1 from training. Duration: 0.663625 2023-10-02 20:44:31,837 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.2.encoder.layers.2.conv_module2.whiten, num_groups=1, num_channels=384, metric=6.19 vs. limit=15.0 2023-10-02 20:44:35,237 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.473e+02 1.842e+02 2.034e+02 2.276e+02 3.426e+02, threshold=4.068e+02, percent-clipped=0.0 2023-10-02 20:44:36,648 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_11305_210_1794_1_1529750428_7178148_651-95702-0_sp1.1 from training. Duration: 0.963625 2023-10-02 20:44:36,661 WARNING [train.py:1204] (2/4) Exclude cut with ID _14224_210_4381_1_1530529062037_4264010_379-184223-0_sp0.9 from training. Duration: 0.9444375 2023-10-02 20:44:36,680 WARNING [train.py:1204] (2/4) Exclude cut with ID _8394_210_6702_1_1528590504316_3540189_381-125325-0_sp1.1 from training. Duration: 0.963625 2023-10-02 20:44:37,977 WARNING [train.py:1204] (2/4) Exclude cut with ID IC0622W0021-307659 from training. Duration: 0.896 2023-10-02 20:44:37,980 WARNING [train.py:1204] (2/4) Exclude cut with ID _7624_210_1531_1_1528109802198_4933101_418-349834-0_sp1.1 from training. Duration: 0.5454375 2023-10-02 20:44:42,165 WARNING [train.py:1204] (2/4) Exclude cut with ID _49728_210_12651_1_1534041034294_6788150_607-3416-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 20:44:42,273 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8275_210_4198_1_1528455320_3892571_491-17707-0 from training. Duration: 0.64 2023-10-02 20:44:43,972 WARNING [train.py:1204] (2/4) Exclude cut with ID _5287_210_2084_1_1527418883040_3878670_333-48508-0 from training. Duration: 0.6 2023-10-02 20:44:44,055 WARNING [train.py:1204] (2/4) Exclude cut with ID _8365_210_2064_1_1528592311070_4677690_371-122295-0 from training. Duration: 0.55 2023-10-02 20:44:44,201 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=1013200.0, ans=0.1 2023-10-02 20:44:46,697 WARNING [train.py:1204] (2/4) Exclude cut with ID _14860_210_4915_1_1530702020340_7343240_836-328489-0 from training. Duration: 0.9 2023-10-02 20:44:48,058 INFO [train.py:1046] (2/4) Epoch 29, batch 3250, loss[loss=0.1798, simple_loss=0.2495, pruned_loss=0.05499, over 22797.00 frames. ], tot_loss[loss=0.1634, simple_loss=0.2417, pruned_loss=0.04249, over 4734542.65 frames. ], batch size: 322, lr: 3.52e-03, grad_scale: 16.0 2023-10-02 20:44:48,148 WARNING [train.py:1204] (2/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_1033-274376-0_sp0.9 from training. Duration: 0.9666875 2023-10-02 20:44:51,360 WARNING [train.py:1204] (2/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_475-127455-0_sp0.9 from training. Duration: 0.77775 2023-10-02 20:44:51,370 WARNING [train.py:1204] (2/4) Exclude cut with ID IC0671W0071-331914 from training. Duration: 0.896 2023-10-02 20:44:51,401 WARNING [train.py:1204] (2/4) Exclude cut with ID _30327_210_16049_1_1532484161295_3924169_2-1523-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 20:44:51,403 WARNING [train.py:1204] (2/4) Exclude cut with ID _24314_210_4915_1_1532135078116_7170179_460-208279-0_sp1.1 from training. Duration: 0.97275 2023-10-02 20:44:54,181 WARNING [train.py:1204] (2/4) Exclude cut with ID IC0679W0052-335859 from training. Duration: 0.64 2023-10-02 20:44:56,533 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.conv_module2.balancer2.prob, batch_count=1013266.6666666666, ans=0.125 2023-10-02 20:44:59,082 WARNING [train.py:1204] (2/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_3-237434-0_sp1.1 from training. Duration: 0.6909375 2023-10-02 20:45:00,763 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.balancer1.prob, batch_count=1013266.6666666666, ans=0.125 2023-10-02 20:45:02,414 WARNING [train.py:1204] (2/4) Exclude cut with ID _69884_210_9771_1_1535846392818_4166169_405-185283-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 20:45:10,935 WARNING [train.py:1204] (2/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_927-83684-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 20:45:10,944 WARNING [train.py:1204] (2/4) Exclude cut with ID _6694_210_4599_1_1527852637490_3965529_356-169233-0 from training. Duration: 0.99 2023-10-02 20:45:12,345 WARNING [train.py:1204] (2/4) Exclude cut with ID _19896_210_1883_1_1531709022662_6649569_562-233001-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 20:45:12,383 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_20120_210_6753_1_1531643035_5482859_354-78629-0_sp1.1 from training. Duration: 0.963625 2023-10-02 20:45:12,385 WARNING [train.py:1204] (2/4) Exclude cut with ID _60135_210_13427_1_1534935557922_3651319_96-168198-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 20:45:13,820 WARNING [train.py:1204] (2/4) Exclude cut with ID _31602_210_5854_1_1532677021456_4458250_249-96604-0_sp1.1 from training. Duration: 0.8090625 2023-10-02 20:45:15,090 WARNING [train.py:1204] (2/4) Exclude cut with ID _14100_210_8341_1_1530529320613_3678069_258-224599-0_sp0.9 from training. Duration: 0.8555625 2023-10-02 20:45:18,427 WARNING [train.py:1204] (2/4) Exclude cut with ID _19708_210_9206_1_1532136650733_4238364_215-160135-0_sp1.1 from training. Duration: 0.97275 2023-10-02 20:45:18,465 WARNING [train.py:1204] (2/4) Exclude cut with ID _21878_210_3525_1_1531738772002_7264330_113-18163-0_sp0.9 from training. Duration: 0.988875 2023-10-02 20:45:18,501 WARNING [train.py:1204] (2/4) Exclude cut with ID _64135_210_2253_1_1535677267876_7608972_552-337340-0_sp1.1 from training. Duration: 0.92725 2023-10-02 20:45:18,534 WARNING [train.py:1204] (2/4) Exclude cut with ID _67725_210_27767_1_1535608756671_5105359_10-12887-0_sp1.1 from training. Duration: 0.97275 2023-10-02 20:45:18,540 WARNING [train.py:1204] (2/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_401-284840-0_sp1.1 from training. Duration: 0.97275 2023-10-02 20:45:18,558 WARNING [train.py:1204] (2/4) Exclude cut with ID _44659_210_2721_1_1534036120061_6614860_483-141285-0_sp1.1 from training. Duration: 0.936375 2023-10-02 20:45:21,214 WARNING [train.py:1204] (2/4) Exclude cut with ID _9461_210_4557_1_1529060514726_3677451_358-332734-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 20:45:22,583 WARNING [train.py:1204] (2/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_788-298907-0_sp1.1 from training. Duration: 0.8090625 2023-10-02 20:45:24,557 WARNING [train.py:1204] (2/4) Exclude cut with ID _39411_210_5986_1_1533538825030_3343649_354-267196-0_sp1.1 from training. Duration: 0.92725 2023-10-02 20:45:24,586 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_14953_210_2151_1_1530701710_5616165_192-309792-0_sp1.1 from training. Duration: 0.97275 2023-10-02 20:45:25,998 WARNING [train.py:1204] (2/4) Exclude cut with ID _59170_210_6721_1_1534983633960_4203335_290-151255-0_sp1.1 from training. Duration: 0.92725 2023-10-02 20:45:27,332 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_5575_210_1410_1_1527301305_2570170_249-93769-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 20:45:27,341 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_46013_210_7234_1_1533776362_3744032_463-52378-0_sp1.1 from training. Duration: 0.7818125 2023-10-02 20:45:31,932 WARNING [train.py:1204] (2/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_580-93798-0 from training. Duration: 0.56 2023-10-02 20:45:31,989 WARNING [train.py:1204] (2/4) Exclude cut with ID _19514_210_9259_1_1531530071330_5084230_385-13762-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 20:45:32,125 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.bypass_mid.scale_min, batch_count=1013400.0, ans=0.2 2023-10-02 20:45:33,207 WARNING [train.py:1204] (2/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_458-67321-0_sp1.1 from training. Duration: 0.8818125 2023-10-02 20:45:34,586 WARNING [train.py:1204] (2/4) Exclude cut with ID _41298_210_9019_1_1533430371450_4822143_188-154791-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 20:45:34,648 WARNING [train.py:1204] (2/4) Exclude cut with ID _15037_210_2253_1_1530752294104_3267650_296-192597-0_sp0.9 from training. Duration: 0.9 2023-10-02 20:45:37,003 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.feed_forward2.hidden_balancer.prob, batch_count=1013466.6666666666, ans=0.125 2023-10-02 20:45:39,577 WARNING [train.py:1204] (2/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_661-264857-0_sp1.1 from training. Duration: 0.8454375 2023-10-02 20:45:45,206 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_34008_210_10431_1_1533345424_8183247_662-317331-0_sp1.1 from training. Duration: 0.8545625 2023-10-02 20:45:45,237 WARNING [train.py:1204] (2/4) Exclude cut with ID _46502_210_13794_1_1533724452198_3657159_241-278329-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 20:45:45,237 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_11349_210_6753_1_1529800861_1101207_26-65602-0 from training. Duration: 0.96 2023-10-02 20:45:45,241 WARNING [train.py:1204] (2/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_448-4048-0_sp1.1 from training. Duration: 0.87275 2023-10-02 20:45:45,247 WARNING [train.py:1204] (2/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_662-275621-0_sp1.1 from training. Duration: 0.3818125 2023-10-02 20:45:47,053 WARNING [train.py:1204] (2/4) Exclude cut with ID _13938_210_9988_1_1530518718978_3693960_256-54454-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 20:45:48,664 INFO [scaling.py:1118] (2/4) WithLoss: name=encoder.encoders.1.encoder.layers.1.self_attn_weights, loss-sum=0.000e+00 2023-10-02 20:45:49,773 WARNING [train.py:1204] (2/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_256-32662-0 from training. Duration: 0.9 2023-10-02 20:45:49,844 WARNING [train.py:1204] (2/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_326-17061-0 from training. Duration: 0.94 2023-10-02 20:45:51,779 WARNING [train.py:1204] (2/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_505-97987-0_sp1.1 from training. Duration: 0.7818125 2023-10-02 20:45:53,140 WARNING [train.py:1204] (2/4) Exclude cut with ID _66873_210_5517_1_1535454136506_4293370_316-294364-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 20:45:54,443 WARNING [train.py:1204] (2/4) Exclude cut with ID _19514_210_9259_1_1531530071330_5084230_762-231063-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 20:45:55,800 WARNING [train.py:1204] (2/4) Exclude cut with ID _4697_210_2069_1_1526815801953_3823098_68-219807-0_sp0.9 from training. Duration: 0.52225 2023-10-02 20:45:55,836 WARNING [train.py:1204] (2/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_479-158053-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 20:45:58,693 WARNING [train.py:1204] (2/4) Exclude cut with ID _66112_210_10635_1_1535590236305_3801484_83-38181-0_sp1.1 from training. Duration: 0.936375 2023-10-02 20:45:58,800 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.0.attention_skip_rate, batch_count=1013533.3333333334, ans=0.0 2023-10-02 20:46:00,090 WARNING [train.py:1204] (2/4) Exclude cut with ID _55857_210_2346_1_1534644007831_6898209_255-146346-0_sp1.1 from training. Duration: 0.963625 2023-10-02 20:46:02,087 WARNING [train.py:1204] (2/4) Exclude cut with ID _48836_210_12210_1_1534075848293_3904150_429-35475-0 from training. Duration: 0.89 2023-10-02 20:46:02,105 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_50154_210_13731_1_1534750862_1080961_119-187163-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 20:46:03,518 WARNING [train.py:1204] (2/4) Exclude cut with ID _8793_210_6310_1_1528542985139_2310009_121-179782-0_sp0.9 from training. Duration: 0.888875 2023-10-02 20:46:03,521 WARNING [train.py:1204] (2/4) Exclude cut with ID _14884_210_2252_1_1530707459689_5561250_591-173406-0 from training. Duration: 0.96 2023-10-02 20:46:06,106 INFO [train.py:1046] (2/4) Epoch 29, batch 3300, loss[loss=0.1791, simple_loss=0.2478, pruned_loss=0.05519, over 23758.00 frames. ], tot_loss[loss=0.1639, simple_loss=0.2427, pruned_loss=0.04254, over 4741674.26 frames. ], batch size: 232, lr: 3.52e-03, grad_scale: 16.0 2023-10-02 20:46:06,250 WARNING [train.py:1204] (2/4) Exclude cut with ID _15133_210_6228_1_1530794512074_3080379_54-198193-0_sp1.1 from training. Duration: 0.8545625 2023-10-02 20:46:06,267 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6545_210_2040_1_1527774670_4245258_407-48070-0 from training. Duration: 0.76 2023-10-02 20:46:08,212 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1032-236863-0 from training. Duration: 0.84 2023-10-02 20:46:09,675 WARNING [train.py:1204] (2/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_507-303370-0 from training. Duration: 0.96 2023-10-02 20:46:09,705 WARNING [train.py:1204] (2/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_164-282366-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 20:46:13,822 WARNING [train.py:1204] (2/4) Exclude cut with ID _39867_210_9826_1_1533553222030_9344259_312-158727-0_sp1.1 from training. Duration: 0.963625 2023-10-02 20:46:13,897 WARNING [train.py:1204] (2/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_174-205058-0_sp1.1 from training. Duration: 0.863625 2023-10-02 20:46:15,211 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_11305_210_1794_1_1529750428_7178148_166-237358-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 20:46:16,008 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.1.encoder.layers.1.conv_module2.whiten, num_groups=1, num_channels=256, metric=12.51 vs. limit=15.0 2023-10-02 20:46:16,677 WARNING [train.py:1204] (2/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_26-175553-0_sp1.1 from training. Duration: 0.5545625 2023-10-02 20:46:16,702 WARNING [train.py:1204] (2/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_96-290039-0_sp1.1 from training. Duration: 0.5090625 2023-10-02 20:46:19,921 WARNING [train.py:1204] (2/4) Exclude cut with ID _53219_210_4525_1_1534834836008_4064825_261-101691-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 20:46:21,292 WARNING [train.py:1204] (2/4) Exclude cut with ID _24271_210_12658_1_1531987270022_7190519_96-293390-0_sp1.1 from training. Duration: 0.936375 2023-10-02 20:46:22,825 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.1.conv_module1.balancer1.prob, batch_count=1013666.6666666666, ans=0.125 2023-10-02 20:46:25,973 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_271-234962-0 from training. Duration: 0.79 2023-10-02 20:46:27,340 WARNING [train.py:1204] (2/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_136-30660-0_sp1.1 from training. Duration: 0.97275 2023-10-02 20:46:27,354 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_20120_210_6753_1_1531643035_5482859_625-281046-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 20:46:27,511 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.0.feed_forward1.out_proj.dropout_p, batch_count=1013666.6666666666, ans=0.1 2023-10-02 20:46:28,731 WARNING [train.py:1204] (2/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_333-168193-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 20:46:30,048 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9029W0269-509664 from training. Duration: 0.896 2023-10-02 20:46:32,011 WARNING [train.py:1204] (2/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_450-147393-0_sp1.1 from training. Duration: 0.92725 2023-10-02 20:46:32,059 WARNING [train.py:1204] (2/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_643-52007-0_sp1.1 from training. Duration: 0.5181875 2023-10-02 20:46:33,240 WARNING [train.py:1204] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_330-327861-0_sp0.9 from training. Duration: 0.7444375 2023-10-02 20:46:33,257 WARNING [train.py:1204] (2/4) Exclude cut with ID _47790_210_16187_1_1533862814066_4207519_260-317651-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 20:46:33,282 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9044W0010-516654 from training. Duration: 0.896 2023-10-02 20:46:33,480 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.feed_forward2.hidden_balancer.prob, batch_count=1013666.6666666666, ans=0.125 2023-10-02 20:46:35,298 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.3.encoder.layers.2.conv_module1.whiten, num_groups=1, num_channels=512, metric=5.44 vs. limit=15.0 2023-10-02 20:46:36,111 WARNING [train.py:1204] (2/4) Exclude cut with ID _14087_210_2253_1_1530529224512_3014029_265-118378-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 20:46:36,120 WARNING [train.py:1204] (2/4) Exclude cut with ID _6194_210_1534_1_1527511826160_3941194_321-148641-0_sp0.9 from training. Duration: 0.711125 2023-10-02 20:46:38,679 WARNING [train.py:1204] (2/4) Exclude cut with ID _54871_210_22356_1_1534665798253_3856359_101-169176-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 20:46:38,682 WARNING [train.py:1204] (2/4) Exclude cut with ID 497-129325-0061-62254-0_sp1.1 from training. Duration: 0.97725 2023-10-02 20:46:38,875 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.conv_module1.balancer1.prob, batch_count=1013733.3333333334, ans=0.125 2023-10-02 20:46:38,919 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.conv_module1.balancer1.prob, batch_count=1013733.3333333334, ans=0.125 2023-10-02 20:46:40,109 WARNING [train.py:1204] (2/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_795-33252-0_sp1.1 from training. Duration: 0.336375 2023-10-02 20:46:40,123 WARNING [train.py:1204] (2/4) Exclude cut with ID _28317_210_12742_1_1532480302381_8510325_168-246176-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 20:46:42,061 WARNING [train.py:1204] (2/4) Exclude cut with ID _7160_210_1876_1_1527940923231_3703799_120-150943-0_sp1.1 from training. Duration: 0.736375 2023-10-02 20:46:43,506 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9084W0005-536844 from training. Duration: 0.768 2023-10-02 20:46:46,250 WARNING [train.py:1204] (2/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_234-260682-0 from training. Duration: 0.84 2023-10-02 20:46:46,281 WARNING [train.py:1204] (2/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_188-114464-0_sp0.9 from training. Duration: 0.911125 2023-10-02 20:46:49,043 WARNING [train.py:1204] (2/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_81-75849-0 from training. Duration: 0.39 2023-10-02 20:46:49,237 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.ff3_skip_rate, batch_count=1013800.0, ans=0.0 2023-10-02 20:46:50,453 WARNING [train.py:1204] (2/4) Exclude cut with ID _14224_210_4381_1_1530529062037_4264010_379-184223-0_sp1.1 from training. Duration: 0.77275 2023-10-02 20:46:51,936 WARNING [train.py:1204] (2/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_454-123002-0_sp1.1 from training. Duration: 0.62725 2023-10-02 20:46:51,988 WARNING [train.py:1204] (2/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_887-249555-0_sp1.1 from training. Duration: 0.7 2023-10-02 20:46:56,611 WARNING [train.py:1204] (2/4) Exclude cut with ID _31088_210_16446_1_1532520012284_3440919_20-260902-0_sp1.1 from training. Duration: 0.97275 2023-10-02 20:46:56,671 WARNING [train.py:1204] (2/4) Exclude cut with ID _45329_210_7005_1_1534157951211_6774339_856-344574-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 20:46:56,674 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_36756_210_7234_1_1533026991_966276_4-164118-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 20:46:56,689 WARNING [train.py:1204] (2/4) Exclude cut with ID _8365_210_2064_1_1528592311070_4677690_371-122295-0_sp1.1 from training. Duration: 0.5 2023-10-02 20:46:59,304 WARNING [train.py:1204] (2/4) Exclude cut with ID _5910_210_1389_1_1527337972454_5050100_555-159815-0_sp1.1 from training. Duration: 0.8909375 2023-10-02 20:46:59,322 WARNING [train.py:1204] (2/4) Exclude cut with ID _37086_210_9038_1_1532999540891_7021250_469-147941-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 20:46:59,551 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=1013800.0, ans=0.1 2023-10-02 20:46:59,571 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.ff2_skip_rate, batch_count=1013800.0, ans=0.0 2023-10-02 20:47:00,665 WARNING [train.py:1204] (2/4) Exclude cut with ID _3067_210_2667_1_1526212974640_4503168_459-173867-0_sp0.9 from training. Duration: 0.97775 2023-10-02 20:47:00,766 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9156W0006-572499 from training. Duration: 0.896 2023-10-02 20:47:00,854 WARNING [train.py:1204] (2/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_277-210798-0 from training. Duration: 0.55 2023-10-02 20:47:04,141 WARNING [train.py:1204] (2/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_103-336064-0_sp1.1 from training. Duration: 0.463625 2023-10-02 20:47:04,208 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_3414_210_1988_1_1526212718_4813195_331-290945-0_sp1.1 from training. Duration: 0.936375 2023-10-02 20:47:04,209 WARNING [train.py:1204] (2/4) Exclude cut with ID _67499_210_17282_1_1535509805220_3692430_365-29276-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 20:47:06,764 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.463e+02 1.852e+02 2.083e+02 2.464e+02 2.851e+02, threshold=4.166e+02, percent-clipped=0.0 2023-10-02 20:47:06,868 WARNING [train.py:1204] (2/4) Exclude cut with ID _14821_210_8750_1_1530680503991_6829359_69-250847-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 20:47:06,869 WARNING [train.py:1204] (2/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_1102-61301-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 20:47:08,301 WARNING [train.py:1204] (2/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_166-246557-0_sp1.1 from training. Duration: 0.5818125 2023-10-02 20:47:08,362 WARNING [train.py:1204] (2/4) Exclude cut with ID _15351_210_10626_1_1530856635653_3576210_214-129918-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 20:47:09,655 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_120-41668-0_sp0.9 from training. Duration: 0.77775 2023-10-02 20:47:09,723 WARNING [train.py:1204] (2/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_27-72278-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 20:47:09,896 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.conv_skip_rate, batch_count=1013866.6666666666, ans=0.0 2023-10-02 20:47:09,920 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.conv_skip_rate, batch_count=1013866.6666666666, ans=0.0 2023-10-02 20:47:12,927 WARNING [train.py:1204] (2/4) Exclude cut with ID _24314_210_4915_1_1532135078116_7170179_521-258868-0_sp1.1 from training. Duration: 0.7181875 2023-10-02 20:47:14,300 WARNING [train.py:1204] (2/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_143-86344-0 from training. Duration: 0.77 2023-10-02 20:47:14,334 WARNING [train.py:1204] (2/4) Exclude cut with ID _53489_210_6228_1_1534298357169_3835970_465-157433-0_sp1.1 from training. Duration: 0.92725 2023-10-02 20:47:14,407 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_248-319353-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 20:47:15,901 WARNING [train.py:1204] (2/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_102-77064-0_sp1.1 from training. Duration: 0.8454375 2023-10-02 20:47:17,238 WARNING [train.py:1204] (2/4) Exclude cut with ID _9389_210_1664_1_1529217337301_5289269_379-128794-0_sp1.1 from training. Duration: 0.77275 2023-10-02 20:47:17,342 WARNING [train.py:1204] (2/4) Exclude cut with ID _3231_210_2106_1_1526205864934_6784220_236-260835-0_sp1.1 from training. Duration: 0.97275 2023-10-02 20:47:18,857 WARNING [train.py:1204] (2/4) Exclude cut with ID _24271_210_12658_1_1531987270022_7190519_184-342363-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 20:47:18,858 WARNING [train.py:1204] (2/4) Exclude cut with ID _27894_210_12392_1_1532415496078_6902439_67-221663-0_sp1.1 from training. Duration: 0.92725 2023-10-02 20:47:20,619 INFO [train.py:1046] (2/4) Epoch 29, batch 3350, loss[loss=0.2135, simple_loss=0.2787, pruned_loss=0.07417, over 19321.00 frames. ], tot_loss[loss=0.165, simple_loss=0.2435, pruned_loss=0.04319, over 4725140.10 frames. ], batch size: 388, lr: 3.52e-03, grad_scale: 16.0 2023-10-02 20:47:23,887 WARNING [train.py:1204] (2/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_304-59227-0_sp1.1 from training. Duration: 0.7 2023-10-02 20:47:25,282 WARNING [train.py:1204] (2/4) Exclude cut with ID _58605_210_12210_1_1534723195939_3904259_458-42232-0_sp1.1 from training. Duration: 0.92725 2023-10-02 20:47:26,645 WARNING [train.py:1204] (2/4) Exclude cut with ID _14922_210_6228_1_1530707435273_3693310_13-165512-0_sp0.9 from training. Duration: 0.911125 2023-10-02 20:47:29,472 WARNING [train.py:1204] (2/4) Exclude cut with ID _58602_210_2346_1_1534903210085_6908830_792-102241-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 20:47:30,856 WARNING [train.py:1204] (2/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_588-125165-0_sp0.9 from training. Duration: 0.92225 2023-10-02 20:47:32,694 WARNING [train.py:1204] (2/4) Exclude cut with ID _54218_210_12210_1_1534413646388_4409259_239-98429-0_sp1.1 from training. Duration: 0.97275 2023-10-02 20:47:33,987 WARNING [train.py:1204] (2/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_538-32291-0_sp1.1 from training. Duration: 0.7545625 2023-10-02 20:47:35,410 WARNING [train.py:1204] (2/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_419-317062-0 from training. Duration: 0.91 2023-10-02 20:47:36,783 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9309W0202-640329 from training. Duration: 0.896 2023-10-02 20:47:36,828 WARNING [train.py:1204] (2/4) Exclude cut with ID _3231_210_2106_1_1526205864934_6784220_419-313291-0_sp1.1 from training. Duration: 0.97275 2023-10-02 20:47:39,740 WARNING [train.py:1204] (2/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_619-344499-0 from training. Duration: 0.63 2023-10-02 20:47:40,997 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_278-272612-0 from training. Duration: 0.73 2023-10-02 20:47:42,344 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_103-84010-0_sp0.9 from training. Duration: 0.7666875 2023-10-02 20:47:42,368 WARNING [train.py:1204] (2/4) Exclude cut with ID _26528_210_9070_1_1532570410842_3851310_497-97218-0_sp1.1 from training. Duration: 0.7818125 2023-10-02 20:47:42,576 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.feed_forward1.out_proj.dropout_p, batch_count=1014000.0, ans=0.1 2023-10-02 20:47:43,728 WARNING [train.py:1204] (2/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_262-84613-0_sp1.1 from training. Duration: 0.936375 2023-10-02 20:47:43,766 WARNING [train.py:1204] (2/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_387-131538-0 from training. Duration: 0.81 2023-10-02 20:47:43,800 WARNING [train.py:1204] (2/4) Exclude cut with ID _8030_210_7497_1_1528444886781_3531089_224-300489-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 20:47:43,816 WARNING [train.py:1204] (2/4) Exclude cut with ID _14349_210_8341_1_1530615710818_3442470_170-336541-0_sp0.9 from training. Duration: 0.9555625 2023-10-02 20:47:45,617 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_47069_210_15368_1_1533781267_4431320_180-31746-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 20:47:48,310 WARNING [train.py:1204] (2/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_447-7041-0_sp1.1 from training. Duration: 0.92725 2023-10-02 20:47:48,336 WARNING [train.py:1204] (2/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_2-55283-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 20:47:49,634 WARNING [train.py:1204] (2/4) Exclude cut with ID _12555_210_8750_1_1530075594402_3780239_70-181911-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 20:47:54,184 WARNING [train.py:1204] (2/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_311-328022-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 20:47:56,089 WARNING [train.py:1204] (2/4) Exclude cut with ID _14528_210_8254_1_1530669700514_4306939_330-2987-0_sp1.1 from training. Duration: 0.92725 2023-10-02 20:47:57,423 WARNING [train.py:1204] (2/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_385-184858-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 20:48:00,322 WARNING [train.py:1204] (2/4) Exclude cut with ID _64414_210_17301_1_1535243252048_3757759_348-333360-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 20:48:00,958 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.4.encoder.layers.0.nonlin_attention.whiten2, num_groups=1, num_channels=384, metric=10.38 vs. limit=15.0 2023-10-02 20:48:01,673 WARNING [train.py:1204] (2/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_753-214751-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 20:48:03,083 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_17613_210_2151_1_1531137569_2366160_202-208013-0_sp1.1 from training. Duration: 0.92725 2023-10-02 20:48:03,099 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_43831_210_2158_1_1533725502_5872791_610-129300-0_sp1.1 from training. Duration: 0.936375 2023-10-02 20:48:06,325 WARNING [train.py:1204] (2/4) Exclude cut with ID _54218_210_12210_1_1534413646388_4409259_321-23852-0_sp1.1 from training. Duration: 0.936375 2023-10-02 20:48:08,911 WARNING [train.py:1204] (2/4) Exclude cut with ID _39411_210_5986_1_1533538825030_3343649_173-238248-0 from training. Duration: 0.83 2023-10-02 20:48:08,918 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_405-339986-0_sp0.9 from training. Duration: 0.5666875 2023-10-02 20:48:08,958 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_2009_210_2071_1_1525481134_4588937_385-302100-0 from training. Duration: 0.87 2023-10-02 20:48:09,525 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.4.encoder.layers.1.self_attn2.whiten, num_groups=1, num_channels=384, metric=12.16 vs. limit=22.5 2023-10-02 20:48:10,321 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_161-276996-0_sp1.1 from training. Duration: 0.863625 2023-10-02 20:48:10,419 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_177-58005-0 from training. Duration: 0.62 2023-10-02 20:48:11,794 WARNING [train.py:1204] (2/4) Exclude cut with ID _64639_210_5816_1_1535367619437_3528000_146-343734-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 20:48:13,186 WARNING [train.py:1204] (2/4) Exclude cut with ID _3845_210_1390_1_1526731326141_3523610_72-61441-0_sp1.1 from training. Duration: 0.92725 2023-10-02 20:48:19,209 WARNING [train.py:1204] (2/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_991-125028-0_sp1.1 from training. Duration: 0.936375 2023-10-02 20:48:20,587 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7544_210_6753_1_1528591327_1471426_172-348777-0 from training. Duration: 0.66 2023-10-02 20:48:20,649 WARNING [train.py:1204] (2/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_416-257730-0_sp1.1 from training. Duration: 0.5909375 2023-10-02 20:48:22,000 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1113-7369-0_sp1.1 from training. Duration: 0.77275 2023-10-02 20:48:22,091 WARNING [train.py:1204] (2/4) Exclude cut with ID _40402_210_18219_1_1533522324115_2953150_242-76943-0_sp1.1 from training. Duration: 0.8545625 2023-10-02 20:48:26,020 WARNING [train.py:1204] (2/4) Exclude cut with ID _44718_210_5816_1_1534050051869_6469030_190-49648-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 20:48:27,573 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.0.feed_forward2.hidden_balancer.prob, batch_count=1014200.0, ans=0.125 2023-10-02 20:48:28,809 WARNING [train.py:1204] (2/4) Exclude cut with ID _47251_210_18628_1_1533814063128_4137250_214-88076-0 from training. Duration: 0.88 2023-10-02 20:48:28,843 WARNING [train.py:1204] (2/4) Exclude cut with ID _5585_210_2667_1_1527418813324_8007340_89-332072-0_sp1.1 from training. Duration: 0.5818125 2023-10-02 20:48:30,090 WARNING [train.py:1204] (2/4) Exclude cut with ID _41817_210_18835_1_1533434477160_7423049_555-293448-0_sp1.1 from training. Duration: 0.72725 2023-10-02 20:48:30,195 WARNING [train.py:1204] (2/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_286-331314-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 20:48:31,394 WARNING [train.py:1204] (2/4) Exclude cut with ID _13921_210_9994_1_1530838702800_3003429_46-149444-0 from training. Duration: 0.94 2023-10-02 20:48:32,127 WARNING [train.py:1204] (2/4) Exclude cut with ID _44778_210_2054_1_1533798127203_7443090_768-43595-0_sp1.1 from training. Duration: 0.936375 2023-10-02 20:48:32,135 WARNING [train.py:1204] (2/4) Exclude cut with ID _35355_210_16040_1_1533212988977_4016229_74-190076-0 from training. Duration: 0.8 2023-10-02 20:48:32,312 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.balancer2.prob, batch_count=1014200.0, ans=0.125 2023-10-02 20:48:33,493 WARNING [train.py:1204] (2/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_1007-316380-0_sp1.1 from training. Duration: 0.963625 2023-10-02 20:48:34,759 INFO [train.py:1046] (2/4) Epoch 29, batch 3400, loss[loss=0.1697, simple_loss=0.258, pruned_loss=0.04068, over 24352.00 frames. ], tot_loss[loss=0.1661, simple_loss=0.245, pruned_loss=0.04363, over 4724976.69 frames. ], batch size: 74, lr: 3.52e-03, grad_scale: 16.0 2023-10-02 20:48:34,829 WARNING [train.py:1204] (2/4) Exclude cut with ID _23557_210_8750_1_1531890106370_6846255_150-283437-0_sp1.1 from training. Duration: 0.963625 2023-10-02 20:48:34,877 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_3474_210_2028_1_1526188513_4825246_145-309131-0_sp1.1 from training. Duration: 0.67275 2023-10-02 20:48:36,247 WARNING [train.py:1204] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_391-22148-0_sp1.1 from training. Duration: 0.7 2023-10-02 20:48:36,275 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_238-256668-0 from training. Duration: 0.69 2023-10-02 20:48:40,503 WARNING [train.py:1204] (2/4) Exclude cut with ID _8394_210_6702_1_1528590504316_3540189_372-198878-0 from training. Duration: 0.6 2023-10-02 20:48:41,764 WARNING [train.py:1204] (2/4) Exclude cut with ID ID0174W0006-766659 from training. Duration: 0.953 2023-10-02 20:48:41,781 WARNING [train.py:1204] (2/4) Exclude cut with ID _6274_210_2084_1_1527937583837_7494020_668-23019-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 20:48:44,634 WARNING [train.py:1204] (2/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_322-74328-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 20:48:44,640 WARNING [train.py:1204] (2/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_872-264525-0_sp1.1 from training. Duration: 0.5909375 2023-10-02 20:48:44,684 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_53378_210_17282_1_1534221071_5769509_641-286201-0_sp1.1 from training. Duration: 0.97275 2023-10-02 20:48:46,675 WARNING [train.py:1204] (2/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_199-295611-0_sp1.1 from training. Duration: 0.5 2023-10-02 20:48:50,915 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.conv_module1.balancer1.prob, batch_count=1014333.3333333334, ans=0.125 2023-10-02 20:48:53,600 WARNING [train.py:1204] (2/4) Exclude cut with ID _5250_210_5219_1_1527249561806_8692110_1270-317971-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 20:48:55,537 WARNING [train.py:1204] (2/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_784-213099-0 from training. Duration: 0.93 2023-10-02 20:48:59,702 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_115-233929-0_sp0.9 from training. Duration: 0.8 2023-10-02 20:49:01,115 WARNING [train.py:1204] (2/4) Exclude cut with ID _54871_210_22356_1_1534665798253_3856359_256-223809-0_sp1.1 from training. Duration: 0.97275 2023-10-02 20:49:01,147 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_20120_210_6753_1_1531643035_5482859_285-87781-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 20:49:02,618 WARNING [train.py:1204] (2/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_619-344499-0_sp1.1 from training. Duration: 0.57275 2023-10-02 20:49:08,755 WARNING [train.py:1204] (2/4) Exclude cut with ID _60229_210_27767_1_1534838406158_3655350_264-279573-0_sp0.9 from training. Duration: 0.92225 2023-10-02 20:49:11,498 WARNING [train.py:1204] (2/4) Exclude cut with ID _7719_210_3525_1_1528456153163_4296960_333-110399-0 from training. Duration: 0.96 2023-10-02 20:49:11,730 INFO [scaling.py:1118] (2/4) WithLoss: name=encoder.encoders.3.encoder.layers.2.self_attn_weights, loss-sum=0.000e+00 2023-10-02 20:49:13,090 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.ff3_skip_rate, batch_count=1014400.0, ans=0.0 2023-10-02 20:49:18,884 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_50423_210_13270_1_1534128598_5021734_759-90833-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 20:49:20,082 WARNING [train.py:1204] (2/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_151-254798-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 20:49:20,127 WARNING [train.py:1204] (2/4) Exclude cut with ID _3465_210_3525_1_1526175667060_3574049_450-203637-0 from training. Duration: 0.92 2023-10-02 20:49:21,402 WARNING [train.py:1204] (2/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_673-36456-0_sp1.1 from training. Duration: 0.936375 2023-10-02 20:49:21,437 WARNING [train.py:1204] (2/4) Exclude cut with ID _36538_210_9994_1_1533204020363_3795620_131-8685-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 20:49:21,491 WARNING [train.py:1204] (2/4) Exclude cut with ID _12556_210_8341_1_1530081024100_7327690_713-55626-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 20:49:21,527 WARNING [train.py:1204] (2/4) Exclude cut with ID _2322_210_2667_1_1525604548971_3656299_440-80494-0_sp0.9 from training. Duration: 0.97775 2023-10-02 20:49:23,125 INFO [scaling.py:1118] (2/4) WithLoss: name=encoder.encoders.2.encoder.layers.0.self_attn_weights, loss-sum=0.000e+00 2023-10-02 20:49:26,124 WARNING [train.py:1204] (2/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_210-325081-0_sp1.1 from training. Duration: 0.97275 2023-10-02 20:49:29,406 WARNING [train.py:1204] (2/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_375-244976-0_sp0.9 from training. Duration: 0.8333125 2023-10-02 20:49:29,412 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_15517_210_6753_1_1530952293_1664795_119-31001-0_sp1.1 from training. Duration: 0.8545625 2023-10-02 20:49:33,590 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_41452_210_7291_1_1533437687_3941281_319-42326-0_sp1.1 from training. Duration: 0.92725 2023-10-02 20:49:35,397 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.519e+02 1.869e+02 2.116e+02 2.414e+02 3.746e+02, threshold=4.233e+02, percent-clipped=0.0 2023-10-02 20:49:35,471 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_372-173012-0 from training. Duration: 0.95 2023-10-02 20:49:39,694 WARNING [train.py:1204] (2/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_41-31157-0_sp1.1 from training. Duration: 0.5181875 2023-10-02 20:49:42,546 WARNING [train.py:1204] (2/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_91-212486-0 from training. Duration: 0.68 2023-10-02 20:49:45,280 WARNING [train.py:1204] (2/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_1267-272693-0 from training. Duration: 0.73 2023-10-02 20:49:46,615 WARNING [train.py:1204] (2/4) Exclude cut with ID _13938_210_9988_1_1530518718978_3693960_152-67046-0_sp1.1 from training. Duration: 0.936375 2023-10-02 20:49:47,902 INFO [train.py:1046] (2/4) Epoch 29, batch 3450, loss[loss=0.1605, simple_loss=0.2207, pruned_loss=0.05012, over 23458.00 frames. ], tot_loss[loss=0.1656, simple_loss=0.2441, pruned_loss=0.04357, over 4717568.74 frames. ], batch size: 285, lr: 3.52e-03, grad_scale: 16.0 2023-10-02 20:49:48,036 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_4528_210_3273_1_1527076654_3276777_342-168068-0_sp1.1 from training. Duration: 0.7909375 2023-10-02 20:49:48,585 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.3.encoder.layers.3.feed_forward2.out_whiten, num_groups=1, num_channels=512, metric=11.64 vs. limit=15.0 2023-10-02 20:49:49,809 WARNING [train.py:1204] (2/4) Exclude cut with ID _7606_210_4915_1_1528894253277_5080690_127-177761-0 from training. Duration: 0.64 2023-10-02 20:49:51,071 WARNING [train.py:1204] (2/4) Exclude cut with ID _45329_210_7005_1_1534157951211_6774339_335-137972-0_sp1.1 from training. Duration: 0.92725 2023-10-02 20:49:54,419 WARNING [train.py:1204] (2/4) Exclude cut with ID _5166_210_2084_1_1527073197160_4087993_331-117829-0_sp1.1 from training. Duration: 0.563625 2023-10-02 20:50:01,414 WARNING [train.py:1204] (2/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_125-125818-0_sp1.1 from training. Duration: 0.87275 2023-10-02 20:50:01,475 WARNING [train.py:1204] (2/4) Exclude cut with ID _6789_210_1534_1_1527854471671_2870519_48-294897-0_sp1.1 from training. Duration: 0.963625 2023-10-02 20:50:03,205 WARNING [train.py:1204] (2/4) Exclude cut with ID _6694_210_4599_1_1527852637490_3965529_116-109398-0_sp1.1 from training. Duration: 0.663625 2023-10-02 20:50:03,214 WARNING [train.py:1204] (2/4) Exclude cut with ID _52892_210_6721_1_1534378941259_4124129_222-172685-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 20:50:04,617 WARNING [train.py:1204] (2/4) Exclude cut with ID _50655_210_18628_1_1534229942193_3772154_320-132275-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 20:50:09,526 WARNING [train.py:1204] (2/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_76-311963-0 from training. Duration: 0.7 2023-10-02 20:50:14,951 WARNING [train.py:1204] (2/4) Exclude cut with ID _6274_210_2084_1_1527937583837_7494020_32-205288-0 from training. Duration: 0.78 2023-10-02 20:50:14,968 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_86-16881-0_sp0.9 from training. Duration: 0.6444375 2023-10-02 20:50:16,314 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_271-297505-0_sp1.1 from training. Duration: 0.9 2023-10-02 20:50:17,733 WARNING [train.py:1204] (2/4) Exclude cut with ID _57572_210_5517_1_1534921219624_3809484_187-295293-0_sp1.1 from training. Duration: 0.963625 2023-10-02 20:50:23,590 WARNING [train.py:1204] (2/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_563-329943-0 from training. Duration: 0.99 2023-10-02 20:50:23,878 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.conv_module2.balancer2.prob, batch_count=1014733.3333333334, ans=0.125 2023-10-02 20:50:25,444 WARNING [train.py:1204] (2/4) Exclude cut with ID _7160_210_1876_1_1527940923231_3703799_302-150728-0_sp0.9 from training. Duration: 0.8666875 2023-10-02 20:50:30,178 WARNING [train.py:1204] (2/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_926-7526-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 20:50:30,209 WARNING [train.py:1204] (2/4) Exclude cut with ID _62574_210_2655_1_1535180407058_3933249_467-22348-0_sp1.1 from training. Duration: 0.936375 2023-10-02 20:50:31,621 WARNING [train.py:1204] (2/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_280-161633-0_sp0.9 from training. Duration: 0.811125 2023-10-02 20:50:33,091 WARNING [train.py:1204] (2/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_385-193052-0_sp1.1 from training. Duration: 0.7818125 2023-10-02 20:50:34,440 WARNING [train.py:1204] (2/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_444-28041-0 from training. Duration: 0.85 2023-10-02 20:50:34,444 WARNING [train.py:1204] (2/4) Exclude cut with ID _66070_210_7640_1_1535441393096_7805310_341-249237-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 20:50:34,529 WARNING [train.py:1204] (2/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_425-269826-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 20:50:37,932 WARNING [train.py:1204] (2/4) Exclude cut with ID _53950_210_5816_1_1534417264919_3862951_124-185597-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 20:50:40,834 INFO [scaling.py:1118] (2/4) WithLoss: name=encoder.encoders.0.layers.1.self_attn_weights, loss-sum=0.000e+00 2023-10-02 20:50:42,004 WARNING [train.py:1204] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_380-204756-0 from training. Duration: 0.86 2023-10-02 20:50:44,911 WARNING [train.py:1204] (2/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_282-257791-0_sp0.9 from training. Duration: 0.9555625 2023-10-02 20:50:47,890 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.nonlin_attention.balancer.min_positive, batch_count=1014866.6666666666, ans=0.05 2023-10-02 20:50:48,971 WARNING [train.py:1204] (2/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_372-131163-0_sp1.1 from training. Duration: 0.8545625 2023-10-02 20:50:50,374 WARNING [train.py:1204] (2/4) Exclude cut with ID _28317_210_12742_1_1532480302381_8510325_447-233513-0_sp1.1 from training. Duration: 0.963625 2023-10-02 20:50:50,631 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.balancer2.prob, batch_count=1014866.6666666666, ans=0.125 2023-10-02 20:50:53,151 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_54-118333-0_sp1.1 from training. Duration: 0.97275 2023-10-02 20:50:57,882 WARNING [train.py:1204] (2/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_388-230239-0_sp1.1 from training. Duration: 0.963625 2023-10-02 20:50:57,902 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_40047_210_2290_1_1533290519_2374311_141-54639-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 20:50:57,964 WARNING [train.py:1204] (2/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_91-267696-0_sp0.9 from training. Duration: 0.9666875 2023-10-02 20:50:59,892 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_14056_210_4163_1_1530604483_7572827_532-137847-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 20:51:02,653 INFO [train.py:1046] (2/4) Epoch 29, batch 3500, loss[loss=0.1698, simple_loss=0.2346, pruned_loss=0.05252, over 23768.00 frames. ], tot_loss[loss=0.1644, simple_loss=0.2425, pruned_loss=0.04313, over 4706866.27 frames. ], batch size: 179, lr: 3.52e-03, grad_scale: 16.0 2023-10-02 20:51:04,115 WARNING [train.py:1204] (2/4) Exclude cut with ID _8064_210_5399_1_1528372299291_4282973_155-283231-0_sp1.1 from training. Duration: 0.97275 2023-10-02 20:51:07,589 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_2245_210_2715_1_1525511548_3540254_310-52483-0_sp1.1 from training. Duration: 0.8 2023-10-02 20:51:08,744 WARNING [train.py:1204] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_185-174805-0 from training. Duration: 0.68 2023-10-02 20:51:10,132 WARNING [train.py:1204] (2/4) Exclude cut with ID _15012_210_1968_1_1530856820429_5095350_662-210469-0_sp1.1 from training. Duration: 0.5181875 2023-10-02 20:51:10,348 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.conv_skip_rate, batch_count=1014933.3333333334, ans=0.0 2023-10-02 20:51:12,843 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_426-313557-0_sp0.9 from training. Duration: 0.588875 2023-10-02 20:51:15,551 WARNING [train.py:1204] (2/4) Exclude cut with ID _42326_210_7770_1_1533814282531_3707269_407-204343-0_sp1.1 from training. Duration: 0.97275 2023-10-02 20:51:15,567 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_258-310570-0 from training. Duration: 0.82 2023-10-02 20:51:21,094 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_4737_210_2040_1_1526998689_3293008_378-178650-0_sp1.1 from training. Duration: 0.863625 2023-10-02 20:51:21,183 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_47896_210_20508_1_1534492518_5958868_432-327451-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 20:51:22,640 WARNING [train.py:1204] (2/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_188-114464-0_sp1.1 from training. Duration: 0.7454375 2023-10-02 20:51:22,647 WARNING [train.py:1204] (2/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_798-106179-0_sp1.1 from training. Duration: 0.7545625 2023-10-02 20:51:22,686 WARNING [train.py:1204] (2/4) Exclude cut with ID _14368_210_4220_1_1530532714870_3595529_143-117421-0_sp1.1 from training. Duration: 0.52725 2023-10-02 20:51:22,726 WARNING [train.py:1204] (2/4) Exclude cut with ID _67499_210_17282_1_1535509805220_3692430_361-78786-0_sp1.1 from training. Duration: 0.963625 2023-10-02 20:51:24,024 WARNING [train.py:1204] (2/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_712-342563-0_sp1.1 from training. Duration: 0.8818125 2023-10-02 20:51:24,087 WARNING [train.py:1204] (2/4) Exclude cut with ID _9235_210_5219_1_1528891154253_8046120_387-159008-0 from training. Duration: 0.86 2023-10-02 20:51:27,296 WARNING [train.py:1204] (2/4) Exclude cut with ID _33640_210_2655_1_1533117605479_4855469_959-18820-0_sp1.1 from training. Duration: 0.963625 2023-10-02 20:51:27,325 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_12868_210_6753_1_1530702479_3610564_133-564-0_sp0.9 from training. Duration: 0.87775 2023-10-02 20:51:28,688 WARNING [train.py:1204] (2/4) Exclude cut with ID _23514_210_1389_1_1531832139785_3031439_12-29287-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 20:51:32,069 WARNING [train.py:1204] (2/4) Exclude cut with ID _23514_210_1389_1_1531832139785_3031439_489-185052-0_sp1.1 from training. Duration: 0.963625 2023-10-02 20:51:33,363 WARNING [train.py:1204] (2/4) Exclude cut with ID _5444_210_2151_1_1527418808464_5312210_634-194174-0 from training. Duration: 0.97 2023-10-02 20:51:33,388 WARNING [train.py:1204] (2/4) Exclude cut with ID _6275_210_2084_1_1528110062213_7669049_91-26919-0_sp1.1 from training. Duration: 0.8818125 2023-10-02 20:51:36,708 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_37351_210_7925_1_1533012931_3812113_516-82303-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 20:51:39,523 WARNING [train.py:1204] (2/4) Exclude cut with ID _41314_210_12651_1_1533459476543_3766460_80-74184-0_sp0.9 from training. Duration: 0.92225 2023-10-02 20:51:40,927 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_9460_210_4211_1_1528954587_10049667_495-202963-0_sp1.1 from training. Duration: 0.963625 2023-10-02 20:51:42,304 WARNING [train.py:1204] (2/4) Exclude cut with ID _14632_210_9988_1_1530691279392_3923200_332-105904-0_sp1.1 from training. Duration: 0.8181875 2023-10-02 20:51:42,327 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_40764_210_1925_1_1533882359_3752345_493-167886-0_sp1.1 from training. Duration: 0.92725 2023-10-02 20:51:45,067 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_196-289637-0 from training. Duration: 0.75 2023-10-02 20:51:45,139 WARNING [train.py:1204] (2/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_26-175553-0 from training. Duration: 0.61 2023-10-02 20:51:45,186 WARNING [train.py:1204] (2/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_890-5117-0 from training. Duration: 0.78 2023-10-02 20:51:46,583 WARNING [train.py:1204] (2/4) Exclude cut with ID _41314_210_12651_1_1533459476543_3766460_206-306860-0_sp1.1 from training. Duration: 0.92725 2023-10-02 20:51:47,588 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.0.layers.1.nonlin_attention.whiten1, num_groups=1, num_channels=144, metric=5.13 vs. limit=10.0 2023-10-02 20:51:48,083 WARNING [train.py:1204] (2/4) Exclude cut with ID _25882_210_3036_1_1532775665815_6479510_570-79824-0_sp1.1 from training. Duration: 0.963625 2023-10-02 20:51:48,152 WARNING [train.py:1204] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_731-2428-0_sp1.1 from training. Duration: 0.7545625 2023-10-02 20:51:48,178 WARNING [train.py:1204] (2/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_138-154812-0_sp0.9 from training. Duration: 0.8666875 2023-10-02 20:51:52,184 WARNING [train.py:1204] (2/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_178-39722-0_sp0.9 from training. Duration: 0.5444375 2023-10-02 20:51:52,235 WARNING [train.py:1204] (2/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_1285-256622-0_sp0.9 from training. Duration: 0.9333125 2023-10-02 20:51:55,711 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_41428_210_2161_1_1533774184_3992532_65-271323-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 20:51:57,020 WARNING [train.py:1204] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_308-161067-0 from training. Duration: 0.66 2023-10-02 20:51:57,033 WARNING [train.py:1204] (2/4) Exclude cut with ID _3067_210_2667_1_1526212974640_4503168_459-173867-0 from training. Duration: 0.88 2023-10-02 20:51:57,035 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_2221_210_1988_1_1525597051_4935541_415-296882-0_sp1.1 from training. Duration: 0.7 2023-10-02 20:51:58,593 WARNING [train.py:1204] (2/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_354-192266-0_sp1.1 from training. Duration: 0.936375 2023-10-02 20:51:59,929 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_258-310570-0_sp0.9 from training. Duration: 0.911125 2023-10-02 20:52:01,892 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_548-141039-0_sp1.1 from training. Duration: 0.963625 2023-10-02 20:52:03,218 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.529e+02 1.823e+02 1.998e+02 2.225e+02 3.593e+02, threshold=3.995e+02, percent-clipped=0.0 2023-10-02 20:52:05,142 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_15101_210_7927_1_1530786415_5033274_320-327527-0 from training. Duration: 0.96 2023-10-02 20:52:06,429 WARNING [train.py:1204] (2/4) Exclude cut with ID _2895_210_2477_1_1525956785794_4063750_289-11475-0_sp0.9 from training. Duration: 0.911125 2023-10-02 20:52:07,836 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7543_210_6753_1_1528543318_4552_1-85072-0_sp1.1 from training. Duration: 0.7545625 2023-10-02 20:52:09,197 WARNING [train.py:1204] (2/4) Exclude cut with ID _64639_210_5816_1_1535367619437_3528000_461-329156-0 from training. Duration: 0.96 2023-10-02 20:52:11,814 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_211-339622-0 from training. Duration: 0.45 2023-10-02 20:52:13,363 WARNING [train.py:1204] (2/4) Exclude cut with ID _20455_210_5219_1_1531565996775_8098100_402-112739-0_sp1.1 from training. Duration: 0.963625 2023-10-02 20:52:14,757 WARNING [train.py:1204] (2/4) Exclude cut with ID _53489_210_6228_1_1534298357169_3835970_256-115565-0_sp1.1 from training. Duration: 0.936375 2023-10-02 20:52:14,781 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_26326_210_7925_1_1533002493_3592406_360-305260-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 20:52:14,803 WARNING [train.py:1204] (2/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_18-204383-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 20:52:16,034 INFO [train.py:1046] (2/4) Epoch 29, batch 3550, loss[loss=0.1621, simple_loss=0.2302, pruned_loss=0.04699, over 23850.00 frames. ], tot_loss[loss=0.164, simple_loss=0.2418, pruned_loss=0.04309, over 4701231.20 frames. ], batch size: 164, lr: 3.51e-03, grad_scale: 8.0 2023-10-02 20:52:18,786 WARNING [train.py:1204] (2/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_306-232480-0_sp1.1 from training. Duration: 0.87275 2023-10-02 20:52:22,458 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.0.layers.0.conv_module1.whiten, num_groups=1, num_channels=192, metric=14.88 vs. limit=15.0 2023-10-02 20:52:28,152 WARNING [train.py:1204] (2/4) Exclude cut with ID _41161_210_20034_1_1533431249657_3555314_116-299186-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 20:52:29,577 WARNING [train.py:1204] (2/4) Exclude cut with ID _13913_210_9994_1_1530579615187_3383897_473-277682-0_sp1.1 from training. Duration: 0.3545625 2023-10-02 20:52:32,326 WARNING [train.py:1204] (2/4) Exclude cut with ID _58895_210_8750_1_1535184081320_3649089_49-225702-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 20:52:32,407 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_3080_210_2071_1_1526086102_4365166_312-8726-0_sp1.1 from training. Duration: 0.82725 2023-10-02 20:52:33,691 WARNING [train.py:1204] (2/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_489-190175-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 20:52:35,689 WARNING [train.py:1204] (2/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_345-95717-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 20:52:35,707 WARNING [train.py:1204] (2/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_85-138257-0_sp1.1 from training. Duration: 0.6454375 2023-10-02 20:52:39,186 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7046_210_6753_1_1527895337_6920917_783-278109-0_sp1.1 from training. Duration: 0.7 2023-10-02 20:52:39,211 WARNING [train.py:1204] (2/4) Exclude cut with ID _3465_210_3525_1_1526175667060_3574049_450-203637-0_sp1.1 from training. Duration: 0.836375 2023-10-02 20:52:40,493 WARNING [train.py:1204] (2/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_416-132893-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 20:52:40,511 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_2655_210_2087_1_1525688959_3897400_437-337690-0_sp0.9 from training. Duration: 0.7 2023-10-02 20:52:40,583 WARNING [train.py:1204] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_700-183549-0_sp1.1 from training. Duration: 0.6181875 2023-10-02 20:52:46,073 WARNING [train.py:1204] (2/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_191-59784-0_sp1.1 from training. Duration: 0.8 2023-10-02 20:52:46,084 WARNING [train.py:1204] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_218-295097-0_sp1.1 from training. Duration: 0.7 2023-10-02 20:52:47,453 WARNING [train.py:1204] (2/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_310-109461-0_sp1.1 from training. Duration: 0.72725 2023-10-02 20:52:47,458 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_33348_210_5632_1_1533086912_3324030_131-321149-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 20:52:48,807 WARNING [train.py:1204] (2/4) Exclude cut with ID _64135_210_2253_1_1535677267876_7608972_133-188266-0_sp1.1 from training. Duration: 0.736375 2023-10-02 20:52:48,835 WARNING [train.py:1204] (2/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_505-205270-0 from training. Duration: 0.38 2023-10-02 20:52:48,845 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_67223_210_4238_1_1535612120_2537431_102-88248-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 20:52:50,292 WARNING [train.py:1204] (2/4) Exclude cut with ID _54871_210_22356_1_1534665798253_3856359_60-235433-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 20:52:51,861 WARNING [train.py:1204] (2/4) Exclude cut with ID _5189_210_4557_1_1527248319428_3769229_199-53526-0_sp1.1 from training. Duration: 0.3818125 2023-10-02 20:52:54,672 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.1.conv_module1.balancer2.min_positive, batch_count=1015400.0, ans=0.05 2023-10-02 20:52:55,221 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.1.encoder.layers.1.feed_forward3.out_whiten, num_groups=1, num_channels=256, metric=9.08 vs. limit=15.0 2023-10-02 20:52:56,569 WARNING [train.py:1204] (2/4) Exclude cut with ID _45329_210_7005_1_1534157951211_6774339_439-90979-0_sp1.1 from training. Duration: 0.963625 2023-10-02 20:52:57,867 WARNING [train.py:1204] (2/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_1162-132880-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 20:52:59,214 WARNING [train.py:1204] (2/4) Exclude cut with ID _56423_210_5854_1_1534896061049_3165969_220-22317-0_sp1.1 from training. Duration: 0.963625 2023-10-02 20:53:00,596 WARNING [train.py:1204] (2/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_909-64151-0 from training. Duration: 0.94 2023-10-02 20:53:01,889 WARNING [train.py:1204] (2/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_465-156500-0_sp0.9 from training. Duration: 0.8 2023-10-02 20:53:03,332 WARNING [train.py:1204] (2/4) Exclude cut with ID _44304_210_15374_1_1533639352765_4613129_302-22084-0 from training. Duration: 0.86 2023-10-02 20:53:03,381 WARNING [train.py:1204] (2/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_663-166973-0_sp1.1 from training. Duration: 0.72725 2023-10-02 20:53:04,833 WARNING [train.py:1204] (2/4) Exclude cut with ID _45329_210_7005_1_1534157951211_6774339_503-287883-0_sp0.9 from training. Duration: 0.97775 2023-10-02 20:53:04,856 WARNING [train.py:1204] (2/4) Exclude cut with ID _60057_210_10640_1_1534903065793_3333150_362-230377-0_sp0.9 from training. Duration: 0.9666875 2023-10-02 20:53:08,434 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_4183_210_2087_1_1526729100_1869838_143-280962-0 from training. Duration: 0.56 2023-10-02 20:53:10,988 WARNING [train.py:1204] (2/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_281-1313-0_sp1.1 from training. Duration: 0.936375 2023-10-02 20:53:15,216 WARNING [train.py:1204] (2/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_64-215118-0_sp1.1 from training. Duration: 0.936375 2023-10-02 20:53:16,523 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_2822_210_2715_1_1526112586_1999587_165-36656-0 from training. Duration: 0.89 2023-10-02 20:53:17,899 WARNING [train.py:1204] (2/4) Exclude cut with ID _22961_210_8254_1_1531976366672_4278180_402-208830-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 20:53:20,697 WARNING [train.py:1204] (2/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_98-193494-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 20:53:22,076 WARNING [train.py:1204] (2/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_383-285196-0 from training. Duration: 0.97 2023-10-02 20:53:22,385 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.conv_skip_rate, batch_count=1015533.3333333334, ans=0.0 2023-10-02 20:53:29,814 INFO [train.py:1046] (2/4) Epoch 29, batch 3600, loss[loss=0.1697, simple_loss=0.2608, pruned_loss=0.03927, over 24643.00 frames. ], tot_loss[loss=0.1639, simple_loss=0.2417, pruned_loss=0.04302, over 4705612.63 frames. ], batch size: 73, lr: 3.51e-03, grad_scale: 16.0 2023-10-02 20:53:29,888 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6767_210_3273_1_1527855783_1718932_142-127132-0 from training. Duration: 0.95 2023-10-02 20:53:29,929 WARNING [train.py:1204] (2/4) Exclude cut with ID _39867_210_9826_1_1533553222030_9344259_1018-339635-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 20:53:31,239 WARNING [train.py:1204] (2/4) Exclude cut with ID _14603_210_4220_1_1530615688441_3097319_274-274798-0_sp1.1 from training. Duration: 0.8909375 2023-10-02 20:53:31,761 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.5.encoder.layers.0.feed_forward3.out_whiten, num_groups=1, num_channels=256, metric=10.72 vs. limit=15.0 2023-10-02 20:53:34,042 WARNING [train.py:1204] (2/4) Exclude cut with ID _2334_210_2667_1_1525608238880_4705129_376-263393-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 20:53:34,092 WARNING [train.py:1204] (2/4) Exclude cut with ID _29303_210_2575_1_1532392237303_7410649_363-344675-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 20:53:36,165 WARNING [train.py:1204] (2/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_58-68956-0_sp1.1 from training. Duration: 0.7545625 2023-10-02 20:53:39,548 WARNING [train.py:1204] (2/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_709-311571-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 20:53:39,662 WARNING [train.py:1204] (2/4) Exclude cut with ID _46067_210_4220_1_1534125571066_3307450_136-257407-0_sp1.1 from training. Duration: 0.963625 2023-10-02 20:53:41,001 WARNING [train.py:1204] (2/4) Exclude cut with ID _6493_210_1747_1_1528459210424_4012470_324-323854-0_sp1.1 from training. Duration: 0.836375 2023-10-02 20:53:42,301 WARNING [train.py:1204] (2/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_234-260682-0_sp1.1 from training. Duration: 0.763625 2023-10-02 20:53:42,374 WARNING [train.py:1204] (2/4) Exclude cut with ID _36634_210_1794_1_1533296352450_1399884_55-119451-0_sp1.1 from training. Duration: 0.963625 2023-10-02 20:53:42,378 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_40047_210_2290_1_1533290519_2374311_195-165693-0 from training. Duration: 0.89 2023-10-02 20:53:46,536 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_5522_210_3273_1_1527249606_3909096_366-172508-0_sp0.9 from training. Duration: 0.7333125 2023-10-02 20:53:47,814 WARNING [train.py:1204] (2/4) Exclude cut with ID _21878_210_3525_1_1531738772002_7264330_187-10504-0_sp1.1 from training. Duration: 0.963625 2023-10-02 20:53:48,525 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.3.encoder.layers.2.nonlin_attention.whiten1, num_groups=1, num_channels=384, metric=4.37 vs. limit=10.0 2023-10-02 20:53:50,610 WARNING [train.py:1204] (2/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_282-257791-0_sp1.1 from training. Duration: 0.7818125 2023-10-02 20:53:52,100 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_43831_210_2158_1_1533725502_5872791_467-288776-0_sp1.1 from training. Duration: 0.92725 2023-10-02 20:53:53,497 WARNING [train.py:1204] (2/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_138-154812-0_sp1.1 from training. Duration: 0.7090625 2023-10-02 20:53:54,936 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_1154-185930-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 20:53:54,967 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_2655_210_2087_1_1525688959_3897400_52-279899-0 from training. Duration: 0.69 2023-10-02 20:53:56,415 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_141-89113-0_sp1.1 from training. Duration: 0.7818125 2023-10-02 20:53:59,534 WARNING [train.py:1204] (2/4) Exclude cut with ID _55134_210_21982_1_1534590028377_3882170_297-47310-0_sp1.1 from training. Duration: 0.963625 2023-10-02 20:54:01,496 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_486-151131-0_sp1.1 from training. Duration: 0.77275 2023-10-02 20:54:03,008 WARNING [train.py:1204] (2/4) Exclude cut with ID _44778_210_2054_1_1533798127203_7443090_21-232980-0_sp1.1 from training. Duration: 0.936375 2023-10-02 20:54:03,162 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.balancer1.prob, batch_count=1015733.3333333334, ans=0.125 2023-10-02 20:54:04,462 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6165_210_3528_1_1527590982_4379841_338-106380-0_sp1.1 from training. Duration: 0.92725 2023-10-02 20:54:04,514 WARNING [train.py:1204] (2/4) Exclude cut with ID _55162_210_17071_1_1534408248583_3753480_355-257218-0_sp1.1 from training. Duration: 0.97275 2023-10-02 20:54:05,919 WARNING [train.py:1204] (2/4) Exclude cut with ID _2895_210_2477_1_1525956785794_4063750_289-11475-0 from training. Duration: 0.82 2023-10-02 20:54:12,436 WARNING [train.py:1204] (2/4) Exclude cut with ID _16756_210_4915_1_1531047803763_7104259_839-183431-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 20:54:12,538 WARNING [train.py:1204] (2/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_427-208901-0_sp0.9 from training. Duration: 0.7555625 2023-10-02 20:54:13,886 WARNING [train.py:1204] (2/4) Exclude cut with ID _36512_210_6987_1_1533033143998_3126069_181-306500-0 from training. Duration: 0.95 2023-10-02 20:54:15,520 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.out_combiner.scale_min, batch_count=1015800.0, ans=0.2 2023-10-02 20:54:18,143 WARNING [train.py:1204] (2/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_28-70987-0_sp1.1 from training. Duration: 0.7909375 2023-10-02 20:54:23,530 WARNING [train.py:1204] (2/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_590-58996-0_sp1.1 from training. Duration: 0.936375 2023-10-02 20:54:26,251 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_48730_210_5399_1_1533885886_2212769_108-47561-0_sp1.1 from training. Duration: 0.936375 2023-10-02 20:54:31,385 WARNING [train.py:1204] (2/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_401-158239-0_sp0.9 from training. Duration: 0.788875 2023-10-02 20:54:31,398 WARNING [train.py:1204] (2/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_278-154492-0_sp1.1 from training. Duration: 0.6909375 2023-10-02 20:54:31,403 WARNING [train.py:1204] (2/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_137-116442-0 from training. Duration: 0.78 2023-10-02 20:54:32,623 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.573e+02 1.883e+02 2.051e+02 2.269e+02 3.379e+02, threshold=4.102e+02, percent-clipped=0.0 2023-10-02 20:54:34,029 WARNING [train.py:1204] (2/4) Exclude cut with ID _3541_210_2477_1_1526475408709_3263973_189-317158-0 from training. Duration: 0.9 2023-10-02 20:54:35,417 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_47896_210_20508_1_1534492518_5958868_558-314572-0 from training. Duration: 0.97 2023-10-02 20:54:38,672 WARNING [train.py:1204] (2/4) Exclude cut with ID _66070_210_7640_1_1535441393096_7805310_290-40284-0_sp1.1 from training. Duration: 0.97275 2023-10-02 20:54:38,719 WARNING [train.py:1204] (2/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_110-224688-0_sp1.1 from training. Duration: 0.8818125 2023-10-02 20:54:40,069 WARNING [train.py:1204] (2/4) Exclude cut with ID _7719_210_3525_1_1528456153163_4296960_88-250259-0 from training. Duration: 0.84 2023-10-02 20:54:40,373 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.bypass.skip_rate, batch_count=1015866.6666666666, ans=0.04949747468305833 2023-10-02 20:54:41,435 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8453_210_4211_1_1528502940_8562730_1157-114102-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 20:54:41,460 WARNING [train.py:1204] (2/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_317-175062-0_sp1.1 from training. Duration: 0.7454375 2023-10-02 20:54:41,466 WARNING [train.py:1204] (2/4) Exclude cut with ID _29857_210_20034_1_1532395886753_3798160_254-347254-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 20:54:42,176 WARNING [train.py:1204] (2/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_28-70987-0 from training. Duration: 0.87 2023-10-02 20:54:43,507 WARNING [train.py:1204] (2/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_321-335475-0 from training. Duration: 0.45 2023-10-02 20:54:44,829 INFO [train.py:1046] (2/4) Epoch 29, batch 3650, loss[loss=0.1571, simple_loss=0.2423, pruned_loss=0.03595, over 24645.00 frames. ], tot_loss[loss=0.1643, simple_loss=0.2424, pruned_loss=0.04311, over 4705911.94 frames. ], batch size: 65, lr: 3.51e-03, grad_scale: 16.0 2023-10-02 20:54:46,274 WARNING [train.py:1204] (2/4) Exclude cut with ID _40901_210_5665_1_1533363789958_3856130_356-107330-0_sp1.1 from training. Duration: 0.936375 2023-10-02 20:54:46,344 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_3780_210_2087_1_1526467389_2145572_176-107140-0 from training. Duration: 0.69 2023-10-02 20:54:46,472 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.feed_forward3.hidden_balancer.prob, batch_count=1015933.3333333334, ans=0.125 2023-10-02 20:54:50,562 WARNING [train.py:1204] (2/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_80-135200-0 from training. Duration: 0.79 2023-10-02 20:54:51,973 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_33348_210_5632_1_1533086912_3324030_85-28583-0_sp0.9 from training. Duration: 0.911125 2023-10-02 20:54:56,069 WARNING [train.py:1204] (2/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_29-293896-0 from training. Duration: 0.61 2023-10-02 20:54:57,397 WARNING [train.py:1204] (2/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_194-252800-0 from training. Duration: 0.97 2023-10-02 20:55:02,609 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_360-52446-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 20:55:02,612 WARNING [train.py:1204] (2/4) Exclude cut with ID _9389_210_1664_1_1529217337301_5289269_289-213417-0_sp0.9 from training. Duration: 0.988875 2023-10-02 20:55:03,893 WARNING [train.py:1204] (2/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_143-86344-0_sp0.9 from training. Duration: 0.8555625 2023-10-02 20:55:06,709 WARNING [train.py:1204] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_520-126729-0_sp0.9 from training. Duration: 0.77775 2023-10-02 20:55:08,709 WARNING [train.py:1204] (2/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_76-308663-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 20:55:08,770 WARNING [train.py:1204] (2/4) Exclude cut with ID _8021_210_7994_1_1528376451358_3653069_100-140231-0 from training. Duration: 0.67 2023-10-02 20:55:10,229 WARNING [train.py:1204] (2/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_387-131538-0_sp0.9 from training. Duration: 0.9 2023-10-02 20:55:10,254 WARNING [train.py:1204] (2/4) Exclude cut with ID _6275_210_2084_1_1528110062213_7669049_365-182054-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 20:55:10,291 WARNING [train.py:1204] (2/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_345-340690-0 from training. Duration: 0.93 2023-10-02 20:55:11,672 WARNING [train.py:1204] (2/4) Exclude cut with ID _5409_210_1388_1_1527292850189_5865191_178-127970-0_sp0.9 from training. Duration: 0.6333125 2023-10-02 20:55:13,006 WARNING [train.py:1204] (2/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_352-12734-0_sp1.1 from training. Duration: 0.92725 2023-10-02 20:55:13,018 WARNING [train.py:1204] (2/4) Exclude cut with ID _65134_210_14763_1_1535335393985_3505368_121-129373-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 20:55:13,172 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder_embed.conv.5.prob, batch_count=1016066.6666666666, ans=0.125 2023-10-02 20:55:14,210 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.5.encoder.layers.0.self_attn2.whiten, num_groups=1, num_channels=256, metric=12.35 vs. limit=22.5 2023-10-02 20:55:15,012 WARNING [train.py:1204] (2/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_557-324258-0_sp1.1 from training. Duration: 0.736375 2023-10-02 20:55:16,500 WARNING [train.py:1204] (2/4) Exclude cut with ID _48678_210_5663_1_1533988830947_3769199_306-255886-0 from training. Duration: 0.77 2023-10-02 20:55:16,712 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.conv_skip_rate, batch_count=1016066.6666666666, ans=0.0 2023-10-02 20:55:16,732 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.balancer2.prob, batch_count=1016066.6666666666, ans=0.125 2023-10-02 20:55:17,813 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7544_210_6753_1_1528587000_691468_87-178599-0 from training. Duration: 0.87 2023-10-02 20:55:17,871 WARNING [train.py:1204] (2/4) Exclude cut with ID _2941_210_2577_1_1526199673150_2489759_208-236402-0_sp1.1 from training. Duration: 0.963625 2023-10-02 20:55:19,307 WARNING [train.py:1204] (2/4) Exclude cut with ID _38670_210_15379_1_1533448810659_4391059_65-169397-0 from training. Duration: 0.97 2023-10-02 20:55:20,679 WARNING [train.py:1204] (2/4) Exclude cut with ID _30304_210_6319_1_1532476827261_7582060_306-328175-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 20:55:20,688 WARNING [train.py:1204] (2/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_212-149947-0_sp1.1 from training. Duration: 0.663625 2023-10-02 20:55:24,993 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.self_attn_weights.pos_emb_skip_rate, batch_count=1016066.6666666666, ans=0.0 2023-10-02 20:55:26,236 WARNING [train.py:1204] (2/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_321-22821-0_sp1.1 from training. Duration: 0.8090625 2023-10-02 20:55:27,645 WARNING [train.py:1204] (2/4) Exclude cut with ID _44010_210_4525_1_1533884262964_4013620_483-69110-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 20:55:27,659 WARNING [train.py:1204] (2/4) Exclude cut with ID _14922_210_6228_1_1530707435273_3693310_38-187851-0_sp1.1 from training. Duration: 0.563625 2023-10-02 20:55:30,362 WARNING [train.py:1204] (2/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_129-337802-0_sp0.9 from training. Duration: 0.8 2023-10-02 20:55:30,428 WARNING [train.py:1204] (2/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_268-87909-0_sp1.1 from training. Duration: 0.9 2023-10-02 20:55:33,732 WARNING [train.py:1204] (2/4) Exclude cut with ID _21878_210_3525_1_1531738772002_7264330_559-184862-0_sp1.1 from training. Duration: 0.8545625 2023-10-02 20:55:37,389 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_46032_210_14656_1_1533781412_4507969_237-120766-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 20:55:37,631 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=1016133.3333333334, ans=0.1 2023-10-02 20:55:38,826 WARNING [train.py:1204] (2/4) Exclude cut with ID _28427_210_4220_1_1532581240967_3061069_330-170906-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 20:55:38,832 WARNING [train.py:1204] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_401-51578-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 20:55:40,264 WARNING [train.py:1204] (2/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_209-205365-0_sp1.1 from training. Duration: 0.7181875 2023-10-02 20:55:41,597 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_23801_210_16223_1_1532066392_3294657_57-242170-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 20:55:41,731 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.1.balancer2.prob, batch_count=1016133.3333333334, ans=0.125 2023-10-02 20:55:42,906 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_127-37371-0_sp1.1 from training. Duration: 0.936375 2023-10-02 20:55:48,878 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9171W0019-578905 from training. Duration: 0.896 2023-10-02 20:55:52,952 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_24605_210_1794_1_1532047863_4149755_349-49056-0_sp1.1 from training. Duration: 0.92725 2023-10-02 20:55:52,970 WARNING [train.py:1204] (2/4) Exclude cut with ID _12302_210_5219_1_1529924446466_8107280_897-21237-0_sp1.1 from training. Duration: 0.936375 2023-10-02 20:55:54,304 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_9460_210_4211_1_1528954587_10049667_480-260269-0_sp0.9 from training. Duration: 0.62225 2023-10-02 20:55:54,364 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_348-152548-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 20:55:55,590 WARNING [train.py:1204] (2/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_301-330455-0_sp0.9 from training. Duration: 0.888875 2023-10-02 20:55:58,308 INFO [train.py:1046] (2/4) Epoch 29, batch 3700, loss[loss=0.153, simple_loss=0.2381, pruned_loss=0.03397, over 24675.00 frames. ], tot_loss[loss=0.1644, simple_loss=0.243, pruned_loss=0.04283, over 4724689.90 frames. ], batch size: 65, lr: 3.51e-03, grad_scale: 8.0 2023-10-02 20:55:58,355 WARNING [train.py:1204] (2/4) Exclude cut with ID _61008_210_10633_1_1534852769198_6974370_345-91705-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 20:55:59,777 WARNING [train.py:1204] (2/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_304-273433-0 from training. Duration: 0.99 2023-10-02 20:55:59,780 WARNING [train.py:1204] (2/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_359-345972-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 20:56:02,489 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_2296_210_2087_1_1525503448_3397335_57-185430-0_sp1.1 from training. Duration: 0.5454375 2023-10-02 20:56:03,960 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_1256-120974-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 20:56:04,032 WARNING [train.py:1204] (2/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_372-30302-0_sp0.9 from training. Duration: 0.9555625 2023-10-02 20:56:05,866 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_42012_210_7927_1_1533439936_3871556_451-344262-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 20:56:05,867 WARNING [train.py:1204] (2/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_28-324729-0 from training. Duration: 0.94 2023-10-02 20:56:05,875 WARNING [train.py:1204] (2/4) Exclude cut with ID _64135_210_2253_1_1535677267876_7608972_313-18394-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 20:56:07,724 WARNING [train.py:1204] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_620-276793-0_sp0.9 from training. Duration: 0.6555625 2023-10-02 20:56:07,752 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7544_210_6753_1_1528591327_1471426_172-348777-0_sp0.9 from training. Duration: 0.7333125 2023-10-02 20:56:11,016 WARNING [train.py:1204] (2/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_332-221057-0_sp0.9 from training. Duration: 0.7555625 2023-10-02 20:56:12,421 WARNING [train.py:1204] (2/4) Exclude cut with ID _19023_210_4353_1_1531378892552_5820780_5-300035-0_sp1.1 from training. Duration: 0.92725 2023-10-02 20:56:13,840 WARNING [train.py:1204] (2/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_343-309023-0_sp1.1 from training. Duration: 0.963625 2023-10-02 20:56:15,318 WARNING [train.py:1204] (2/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_37-41919-0_sp1.1 from training. Duration: 0.7909375 2023-10-02 20:56:15,349 WARNING [train.py:1204] (2/4) Exclude cut with ID _3541_210_2477_1_1526475408709_3263973_41-182910-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 20:56:16,663 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_229-45300-0_sp0.9 from training. Duration: 0.6333125 2023-10-02 20:56:19,654 WARNING [train.py:1204] (2/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_40-214716-0_sp1.1 from training. Duration: 0.963625 2023-10-02 20:56:21,003 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9309W0203-640330 from training. Duration: 0.896 2023-10-02 20:56:27,833 WARNING [train.py:1204] (2/4) Exclude cut with ID _60229_210_27767_1_1534838406158_3655350_264-279573-0_sp1.1 from training. Duration: 0.7545625 2023-10-02 20:56:28,338 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.4.encoder.layers.2.nonlin_attention.whiten1, num_groups=1, num_channels=288, metric=4.65 vs. limit=10.0 2023-10-02 20:56:29,196 WARNING [train.py:1204] (2/4) Exclude cut with ID _8394_210_6702_1_1528590504316_3540189_372-198878-0_sp0.9 from training. Duration: 0.6666875 2023-10-02 20:56:30,620 WARNING [train.py:1204] (2/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_359-118926-0_sp1.1 from training. Duration: 0.6545625 2023-10-02 20:56:30,632 WARNING [train.py:1204] (2/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_85-65056-0 from training. Duration: 0.96 2023-10-02 20:56:30,659 WARNING [train.py:1204] (2/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_89-242335-0_sp1.1 from training. Duration: 0.863625 2023-10-02 20:56:33,570 WARNING [train.py:1204] (2/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_128-49444-0_sp1.1 from training. Duration: 0.936375 2023-10-02 20:56:35,478 WARNING [train.py:1204] (2/4) Exclude cut with ID _30327_210_16049_1_1532484161295_3924169_401-70819-0 from training. Duration: 0.86 2023-10-02 20:56:36,879 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_41822_210_13270_1_1533955976_4379284_260-256391-0_sp1.1 from training. Duration: 0.936375 2023-10-02 20:56:36,990 WARNING [train.py:1204] (2/4) Exclude cut with ID _67335_210_5219_1_1535525975652_4275229_599-247317-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 20:56:40,379 WARNING [train.py:1204] (2/4) Exclude cut with ID _29857_210_20034_1_1532395886753_3798160_209-239267-0_sp1.1 from training. Duration: 0.936375 2023-10-02 20:56:40,406 WARNING [train.py:1204] (2/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_46-207599-0_sp1.1 from training. Duration: 0.5818125 2023-10-02 20:56:42,259 WARNING [train.py:1204] (2/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_912-282412-0_sp1.1 from training. Duration: 0.4454375 2023-10-02 20:56:45,820 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.1.encoder.layers.0.nonlin_attention.whiten2, num_groups=1, num_channels=256, metric=5.56 vs. limit=15.0 2023-10-02 20:56:46,293 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_4183_210_2087_1_1526729100_1869838_141-166600-0_sp1.1 from training. Duration: 0.863625 2023-10-02 20:56:46,298 WARNING [train.py:1204] (2/4) Exclude cut with ID _15037_210_2253_1_1530752294104_3267650_296-192597-0 from training. Duration: 0.81 2023-10-02 20:56:46,347 WARNING [train.py:1204] (2/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_571-220562-0_sp1.1 from training. Duration: 0.963625 2023-10-02 20:56:46,373 WARNING [train.py:1204] (2/4) Exclude cut with ID _5287_210_2084_1_1527418883040_3878670_12-221186-0 from training. Duration: 0.98 2023-10-02 20:56:50,295 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.1.encoder.layers.1.self_attn2.whiten, num_groups=1, num_channels=256, metric=9.74 vs. limit=22.5 2023-10-02 20:56:50,976 WARNING [train.py:1204] (2/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_149-85602-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 20:56:50,997 WARNING [train.py:1204] (2/4) Exclude cut with ID _9235_210_5219_1_1528891154253_8046120_1003-101638-0_sp1.1 from training. Duration: 0.663625 2023-10-02 20:56:53,776 WARNING [train.py:1204] (2/4) Exclude cut with ID _55844_210_2346_1_1534471212569_7625707_1099-33689-0_sp1.1 from training. Duration: 0.92725 2023-10-02 20:56:55,133 WARNING [train.py:1204] (2/4) Exclude cut with ID _7606_210_4915_1_1528894253277_5080690_301-10732-0 from training. Duration: 0.81 2023-10-02 20:56:55,378 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.self_attn_weights.pos_emb_skip_rate, batch_count=1016466.6666666666, ans=0.0 2023-10-02 20:56:57,843 WARNING [train.py:1204] (2/4) Exclude cut with ID _22826_210_9994_1_1531908201522_3279169_403-34698-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 20:56:59,087 WARNING [train.py:1204] (2/4) Exclude cut with ID _8480_210_1874_1_1528589391205_6728330_755-112625-0_sp0.9 from training. Duration: 0.82225 2023-10-02 20:56:59,113 WARNING [train.py:1204] (2/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_527-238990-0_sp1.1 from training. Duration: 0.8181875 2023-10-02 20:56:59,134 WARNING [train.py:1204] (2/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_426-7328-0_sp1.1 from training. Duration: 0.92725 2023-10-02 20:57:01,746 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.589e+02 1.813e+02 1.991e+02 2.176e+02 3.248e+02, threshold=3.981e+02, percent-clipped=0.0 2023-10-02 20:57:03,203 WARNING [train.py:1204] (2/4) Exclude cut with ID _3541_210_2477_1_1526475408709_3263973_189-317158-0_sp1.1 from training. Duration: 0.8181875 2023-10-02 20:57:03,270 WARNING [train.py:1204] (2/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_162-103620-0 from training. Duration: 0.93 2023-10-02 20:57:05,200 WARNING [train.py:1204] (2/4) Exclude cut with ID _22083_210_8254_1_1531702862439_3678970_49-234546-0 from training. Duration: 0.83 2023-10-02 20:57:05,254 WARNING [train.py:1204] (2/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_370-331600-0_sp1.1 from training. Duration: 0.8545625 2023-10-02 20:57:05,267 WARNING [train.py:1204] (2/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_166-59042-0_sp1.1 from training. Duration: 0.97275 2023-10-02 20:57:07,283 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.conv_module2.balancer1.max_abs, batch_count=1016533.3333333334, ans=10.0 2023-10-02 20:57:08,371 WARNING [train.py:1204] (2/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_138-332773-0_sp1.1 from training. Duration: 0.736375 2023-10-02 20:57:09,762 WARNING [train.py:1204] (2/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_90-30673-0_sp0.9 from training. Duration: 0.9333125 2023-10-02 20:57:09,938 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.0.balancer1.prob, batch_count=1016533.3333333334, ans=0.125 2023-10-02 20:57:11,473 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.ff3_skip_rate, batch_count=1016600.0, ans=0.0 2023-10-02 20:57:12,503 INFO [train.py:1046] (2/4) Epoch 29, batch 3750, loss[loss=0.1723, simple_loss=0.2438, pruned_loss=0.05038, over 23619.00 frames. ], tot_loss[loss=0.1657, simple_loss=0.2442, pruned_loss=0.04365, over 4718339.92 frames. ], batch size: 256, lr: 3.51e-03, grad_scale: 8.0 2023-10-02 20:57:12,551 WARNING [train.py:1204] (2/4) Exclude cut with ID _27967_210_12140_1_1532588391859_8419130_737-41898-0_sp1.1 from training. Duration: 0.936375 2023-10-02 20:57:12,650 WARNING [train.py:1204] (2/4) Exclude cut with ID _8833_210_5219_1_1528592665210_7553059_329-125156-0_sp1.1 from training. Duration: 0.7454375 2023-10-02 20:57:13,991 WARNING [train.py:1204] (2/4) Exclude cut with ID _41817_210_18835_1_1533434477160_7423049_352-42537-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 20:57:17,273 WARNING [train.py:1204] (2/4) Exclude cut with ID _8365_210_2064_1_1528592311070_4677690_431-294102-0 from training. Duration: 0.89 2023-10-02 20:57:18,835 WARNING [train.py:1204] (2/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_454-314901-0_sp1.1 from training. Duration: 0.4181875 2023-10-02 20:57:22,031 WARNING [train.py:1204] (2/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_80-135200-0_sp0.9 from training. Duration: 0.87775 2023-10-02 20:57:22,077 WARNING [train.py:1204] (2/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_181-78632-0 from training. Duration: 0.78 2023-10-02 20:57:22,125 WARNING [train.py:1204] (2/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_300-27235-0_sp0.9 from training. Duration: 0.9666875 2023-10-02 20:57:23,461 WARNING [train.py:1204] (2/4) Exclude cut with ID _47790_210_16187_1_1533862814066_4207519_96-177176-0_sp1.1 from training. Duration: 0.97275 2023-10-02 20:57:24,110 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.3.encoder.layers.0.whiten, num_groups=1, num_channels=512, metric=6.14 vs. limit=12.0 2023-10-02 20:57:26,095 WARNING [train.py:1204] (2/4) Exclude cut with ID _35941_210_15379_1_1532995294515_4023089_246-348742-0_sp1.1 from training. Duration: 0.97275 2023-10-02 20:57:27,500 WARNING [train.py:1204] (2/4) Exclude cut with ID _38670_210_15379_1_1533448810659_4391059_65-169397-0_sp1.1 from training. Duration: 0.8818125 2023-10-02 20:57:30,360 WARNING [train.py:1204] (2/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_1003-99642-0_sp1.1 from training. Duration: 0.963625 2023-10-02 20:57:31,839 WARNING [train.py:1204] (2/4) Exclude cut with ID _40402_210_18219_1_1533522324115_2953150_320-14078-0_sp1.1 from training. Duration: 0.863625 2023-10-02 20:57:34,539 WARNING [train.py:1204] (2/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_367-61330-0_sp0.9 from training. Duration: 0.8333125 2023-10-02 20:57:35,851 WARNING [train.py:1204] (2/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_515-298514-0_sp1.1 from training. Duration: 0.92725 2023-10-02 20:57:39,112 WARNING [train.py:1204] (2/4) Exclude cut with ID _53731_210_7994_1_1534553979244_3757214_471-320815-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 20:57:40,999 WARNING [train.py:1204] (2/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_306-232480-0 from training. Duration: 0.96 2023-10-02 20:57:42,376 WARNING [train.py:1204] (2/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_478-162247-0_sp0.9 from training. Duration: 0.911125 2023-10-02 20:57:42,483 WARNING [train.py:1204] (2/4) Exclude cut with ID _56423_210_5854_1_1534896061049_3165969_345-284696-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 20:57:43,794 WARNING [train.py:1204] (2/4) Exclude cut with ID _44778_210_2054_1_1533798127203_7443090_600-122253-0_sp1.1 from training. Duration: 0.963625 2023-10-02 20:57:45,809 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.1.self_attn_weights.pos_emb_skip_rate, batch_count=1016733.3333333334, ans=0.0 2023-10-02 20:57:47,035 WARNING [train.py:1204] (2/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_199-295611-0 from training. Duration: 0.55 2023-10-02 20:57:49,901 WARNING [train.py:1204] (2/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_734-129761-0 from training. Duration: 0.82 2023-10-02 20:57:50,528 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.3.encoder.layers.0.feed_forward2.out_whiten, num_groups=1, num_channels=512, metric=5.05 vs. limit=15.0 2023-10-02 20:57:51,220 WARNING [train.py:1204] (2/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_349-169257-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 20:57:53,053 WARNING [train.py:1204] (2/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_202-67017-0_sp0.9 from training. Duration: 0.911125 2023-10-02 20:57:54,469 WARNING [train.py:1204] (2/4) Exclude cut with ID _16908_210_5724_1_1531119157248_3772700_61-258978-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 20:57:57,525 WARNING [train.py:1204] (2/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_172-156931-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 20:57:58,860 WARNING [train.py:1204] (2/4) Exclude cut with ID _4092_210_2151_1_1526643044449_3135759_356-278694-0_sp1.1 from training. Duration: 0.42725 2023-10-02 20:58:00,913 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.3.encoder.layers.3.feed_forward2.out_whiten, num_groups=1, num_channels=512, metric=6.13 vs. limit=15.0 2023-10-02 20:58:01,699 WARNING [train.py:1204] (2/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_116-332008-0 from training. Duration: 0.98 2023-10-02 20:58:01,838 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.0.conv_skip_rate, batch_count=1016800.0, ans=0.0 2023-10-02 20:58:04,868 WARNING [train.py:1204] (2/4) Exclude cut with ID _29003_210_19596_1_1532507391720_3894098_358-114789-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 20:58:07,722 WARNING [train.py:1204] (2/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_152-212533-0_sp1.1 from training. Duration: 0.8818125 2023-10-02 20:58:07,793 WARNING [train.py:1204] (2/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_267-214116-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 20:58:11,012 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7543_210_6753_1_1528541907_1399322_165-323619-0_sp0.9 from training. Duration: 0.7666875 2023-10-02 20:58:16,053 WARNING [train.py:1204] (2/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_577-332600-0_sp0.9 from training. Duration: 0.6333125 2023-10-02 20:58:17,362 WARNING [train.py:1204] (2/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_342-100157-0_sp0.9 from training. Duration: 0.6 2023-10-02 20:58:18,749 WARNING [train.py:1204] (2/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_141-89727-0_sp1.1 from training. Duration: 0.6454375 2023-10-02 20:58:20,223 WARNING [train.py:1204] (2/4) Exclude cut with ID _3992_210_1747_1_1526644816031_3731060_107-251434-0_sp1.1 from training. Duration: 0.9 2023-10-02 20:58:23,508 WARNING [train.py:1204] (2/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_401-53423-0_sp1.1 from training. Duration: 0.5 2023-10-02 20:58:27,702 INFO [train.py:1046] (2/4) Epoch 29, batch 3800, loss[loss=0.1493, simple_loss=0.2305, pruned_loss=0.03402, over 24507.00 frames. ], tot_loss[loss=0.1655, simple_loss=0.2441, pruned_loss=0.04348, over 4729202.88 frames. ], batch size: 63, lr: 3.51e-03, grad_scale: 8.0 2023-10-02 20:58:28,036 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.bypass.skip_rate, batch_count=1016933.3333333334, ans=0.04949747468305833 2023-10-02 20:58:29,277 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7762_210_1794_1_1528199508_4057408_422-227034-0_sp1.1 from training. Duration: 0.663625 2023-10-02 20:58:29,531 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.ff3_skip_rate, batch_count=1016933.3333333334, ans=0.0 2023-10-02 20:58:33,861 WARNING [train.py:1204] (2/4) Exclude cut with ID _4184_210_2550_1_1526730192013_2219380_102-250299-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 20:58:33,937 WARNING [train.py:1204] (2/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_253-308377-0_sp0.9 from training. Duration: 0.5444375 2023-10-02 20:58:35,256 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_271-297505-0 from training. Duration: 0.99 2023-10-02 20:58:35,506 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.bypass.scale_min, batch_count=1016933.3333333334, ans=0.2 2023-10-02 20:58:36,695 WARNING [train.py:1204] (2/4) Exclude cut with ID _35941_210_15379_1_1532995294515_4023089_213-142973-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 20:58:39,508 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7544_210_6753_1_1528587722_3576686_162-63018-0_sp1.1 from training. Duration: 0.92725 2023-10-02 20:58:40,795 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_365-217119-0_sp0.9 from training. Duration: 0.7 2023-10-02 20:58:44,404 WARNING [train.py:1204] (2/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_662-275621-0_sp0.9 from training. Duration: 0.4666875 2023-10-02 20:58:44,405 WARNING [train.py:1204] (2/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_374-187555-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 20:58:44,474 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_289-180904-0_sp1.1 from training. Duration: 0.6909375 2023-10-02 20:58:45,876 WARNING [train.py:1204] (2/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_189-47206-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 20:58:45,914 WARNING [train.py:1204] (2/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_234-14174-0_sp1.1 from training. Duration: 0.7454375 2023-10-02 20:58:47,250 WARNING [train.py:1204] (2/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_576-76621-0_sp1.1 from training. Duration: 0.963625 2023-10-02 20:58:48,598 WARNING [train.py:1204] (2/4) Exclude cut with ID _13253_210_1925_1_1530757775819_3577873_253-246386-0 from training. Duration: 0.9 2023-10-02 20:58:50,090 WARNING [train.py:1204] (2/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_372-6869-0_sp1.1 from training. Duration: 0.4 2023-10-02 20:58:51,421 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_21434_210_4238_1_1531916339_1222252_22-54954-0_sp1.1 from training. Duration: 0.936375 2023-10-02 20:58:54,679 WARNING [train.py:1204] (2/4) Exclude cut with ID _29853_210_7497_1_1532505600851_6642110_492-170361-0_sp1.1 from training. Duration: 0.92725 2023-10-02 20:58:56,085 WARNING [train.py:1204] (2/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_505-97987-0_sp0.9 from training. Duration: 0.9555625 2023-10-02 20:58:56,805 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.1.encoder.layers.1.whiten, num_groups=1, num_channels=256, metric=3.84 vs. limit=12.0 2023-10-02 20:58:57,341 WARNING [train.py:1204] (2/4) Exclude cut with ID _8365_210_2064_1_1528592311070_4677690_371-122295-0_sp0.9 from training. Duration: 0.611125 2023-10-02 20:58:57,480 WARNING [train.py:1204] (2/4) Exclude cut with ID _14268_210_9206_1_1530622771285_4714099_637-86630-0_sp1.1 from training. Duration: 0.463625 2023-10-02 20:58:57,493 WARNING [train.py:1204] (2/4) Exclude cut with ID _29857_210_20034_1_1532395886753_3798160_372-5678-0_sp1.1 from training. Duration: 0.963625 2023-10-02 20:59:00,407 WARNING [train.py:1204] (2/4) Exclude cut with ID _52892_210_6721_1_1534378941259_4124129_90-165108-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 20:59:01,769 WARNING [train.py:1204] (2/4) Exclude cut with ID _27052_210_5986_1_1532329197187_3585009_143-339198-0_sp1.1 from training. Duration: 0.963625 2023-10-02 20:59:06,361 WARNING [train.py:1204] (2/4) Exclude cut with ID _14268_210_9206_1_1530622771285_4714099_637-86630-0_sp0.9 from training. Duration: 0.5666875 2023-10-02 20:59:06,364 WARNING [train.py:1204] (2/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_401-255491-0 from training. Duration: 0.7 2023-10-02 20:59:07,869 WARNING [train.py:1204] (2/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_178-6946-0_sp1.1 from training. Duration: 0.97275 2023-10-02 20:59:15,238 WARNING [train.py:1204] (2/4) Exclude cut with ID _27485_210_4916_1_1532739191677_3668269_460-163885-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 20:59:19,505 WARNING [train.py:1204] (2/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_270-26053-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 20:59:22,202 WARNING [train.py:1204] (2/4) Exclude cut with ID _14170_210_5219_1_1530525949868_4469650_412-317077-0 from training. Duration: 0.75 2023-10-02 20:59:25,388 WARNING [train.py:1204] (2/4) Exclude cut with ID _5289_210_2084_1_1527678202736_3779299_142-103317-0 from training. Duration: 0.84 2023-10-02 20:59:26,730 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_150-143438-0_sp1.1 from training. Duration: 0.92725 2023-10-02 20:59:29,513 WARNING [train.py:1204] (2/4) Exclude cut with ID _27894_210_12392_1_1532415496078_6902439_668-269779-0_sp1.1 from training. Duration: 0.97275 2023-10-02 20:59:29,568 WARNING [train.py:1204] (2/4) Exclude cut with ID _18513_210_9206_1_1531618215223_3862620_56-222075-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 20:59:30,955 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.574e+02 1.830e+02 2.115e+02 2.392e+02 3.412e+02, threshold=4.230e+02, percent-clipped=0.0 2023-10-02 20:59:31,070 WARNING [train.py:1204] (2/4) Exclude cut with ID _4092_210_2151_1_1526643044449_3135759_356-278694-0 from training. Duration: 0.47 2023-10-02 20:59:33,907 WARNING [train.py:1204] (2/4) Exclude cut with ID _25808_210_13292_1_1532343514562_7366109_633-47524-0 from training. Duration: 0.99 2023-10-02 20:59:33,918 WARNING [train.py:1204] (2/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_872-264525-0 from training. Duration: 0.65 2023-10-02 20:59:33,947 WARNING [train.py:1204] (2/4) Exclude cut with ID _14632_210_9988_1_1530691279392_3923200_239-232545-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 20:59:35,302 WARNING [train.py:1204] (2/4) Exclude cut with ID _14349_210_8341_1_1530615710818_3442470_153-27432-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 20:59:41,230 INFO [train.py:1046] (2/4) Epoch 29, batch 3850, loss[loss=0.1541, simple_loss=0.2274, pruned_loss=0.04042, over 23332.00 frames. ], tot_loss[loss=0.1653, simple_loss=0.2437, pruned_loss=0.04349, over 4729564.55 frames. ], batch size: 119, lr: 3.51e-03, grad_scale: 8.0 2023-10-02 20:59:41,304 WARNING [train.py:1204] (2/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_37-41919-0_sp0.9 from training. Duration: 0.9666875 2023-10-02 20:59:43,101 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_196-289637-0_sp0.9 from training. Duration: 0.8333125 2023-10-02 20:59:43,279 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.conv_module1.balancer1.prob, batch_count=1017266.6666666666, ans=0.125 2023-10-02 20:59:48,029 WARNING [train.py:1204] (2/4) Exclude cut with ID _13678_210_7497_1_1530530069226_3463170_371-3221-0_sp1.1 from training. Duration: 0.8181875 2023-10-02 20:59:49,328 WARNING [train.py:1204] (2/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_557-324258-0 from training. Duration: 0.81 2023-10-02 20:59:50,661 WARNING [train.py:1204] (2/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_1012-279473-0_sp1.1 from training. Duration: 0.6090625 2023-10-02 20:59:52,020 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_66916_210_14656_1_1535725525_326566_7-92649-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 20:59:54,911 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_9460_210_4211_1_1528954587_10049667_480-260269-0_sp1.1 from training. Duration: 0.5090625 2023-10-02 20:59:56,504 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_40896_210_15710_1_1533433956_7100802_121-155001-0_sp1.1 from training. Duration: 0.92725 2023-10-02 20:59:59,715 WARNING [train.py:1204] (2/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_188-171673-0_sp1.1 from training. Duration: 0.52725 2023-10-02 20:59:59,782 WARNING [train.py:1204] (2/4) Exclude cut with ID _47790_210_16187_1_1533862814066_4207519_2-39653-0 from training. Duration: 0.98 2023-10-02 21:00:02,761 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.bypass.scale_min, batch_count=1017333.3333333334, ans=0.2 2023-10-02 21:00:02,767 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=1017333.3333333334, ans=0.1 2023-10-02 21:00:05,385 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_23850_210_4238_1_1532232597_1632433_112-229012-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 21:00:06,710 WARNING [train.py:1204] (2/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_626-79528-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 21:00:08,120 WARNING [train.py:1204] (2/4) Exclude cut with ID _30304_210_6319_1_1532476827261_7582060_856-90052-0_sp1.1 from training. Duration: 0.963625 2023-10-02 21:00:08,168 WARNING [train.py:1204] (2/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_122-109023-0_sp0.9 from training. Duration: 0.8444375 2023-10-02 21:00:08,325 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.bypass.scale_min, batch_count=1017333.3333333334, ans=0.2 2023-10-02 21:00:10,488 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=1017400.0, ans=0.1 2023-10-02 21:00:13,596 WARNING [train.py:1204] (2/4) Exclude cut with ID _64414_210_17301_1_1535243252048_3757759_37-70328-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 21:00:15,344 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_20120_210_6753_1_1531643035_5482859_622-344907-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 21:00:15,409 WARNING [train.py:1204] (2/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_410-321956-0_sp1.1 from training. Duration: 0.92725 2023-10-02 21:00:15,420 WARNING [train.py:1204] (2/4) Exclude cut with ID _7562_210_2084_1_1528887767916_3946229_186-326422-0_sp0.9 from training. Duration: 0.9333125 2023-10-02 21:00:16,824 WARNING [train.py:1204] (2/4) Exclude cut with ID _37537_210_2253_1_1533171758568_3421949_127-285192-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 21:00:18,242 WARNING [train.py:1204] (2/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_723-228168-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 21:00:19,621 WARNING [train.py:1204] (2/4) Exclude cut with ID _13678_210_7497_1_1530530069226_3463170_426-246984-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 21:00:19,645 WARNING [train.py:1204] (2/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_399-156791-0_sp0.9 from training. Duration: 0.92225 2023-10-02 21:00:19,709 WARNING [train.py:1204] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_391-22148-0 from training. Duration: 0.77 2023-10-02 21:00:21,063 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_3475_210_2028_1_1526193578_3622569_64-297072-0 from training. Duration: 0.91 2023-10-02 21:00:21,128 WARNING [train.py:1204] (2/4) Exclude cut with ID _17875_210_2087_1_1531224012907_1821339_225-280015-0_sp1.1 from training. Duration: 0.963625 2023-10-02 21:00:22,365 WARNING [train.py:1204] (2/4) Exclude cut with ID _50655_210_18628_1_1534229942193_3772154_325-246169-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 21:00:24,459 WARNING [train.py:1204] (2/4) Exclude cut with ID _29529_210_7234_1_1532393961455_3977460_597-257348-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 21:00:24,506 WARNING [train.py:1204] (2/4) Exclude cut with ID _58895_210_8750_1_1535184081320_3649089_77-173485-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 21:00:24,533 WARNING [train.py:1204] (2/4) Exclude cut with ID _6270_210_2084_1_1527850911573_3856420_26-277343-0 from training. Duration: 0.82 2023-10-02 21:00:27,165 WARNING [train.py:1204] (2/4) Exclude cut with ID _14368_210_4220_1_1530532714870_3595529_144-3287-0 from training. Duration: 0.67 2023-10-02 21:00:28,535 WARNING [train.py:1204] (2/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_293-145278-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 21:00:31,232 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_40047_210_2290_1_1533290519_2374311_257-335162-0 from training. Duration: 0.95 2023-10-02 21:00:33,986 WARNING [train.py:1204] (2/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_619-344499-0_sp0.9 from training. Duration: 0.7 2023-10-02 21:00:38,234 WARNING [train.py:1204] (2/4) Exclude cut with ID _19504_210_9994_1_1531648863242_3325220_456-315379-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 21:00:38,324 WARNING [train.py:1204] (2/4) Exclude cut with ID _55857_210_2346_1_1534644007831_6898209_631-221692-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 21:00:41,887 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.feed_forward2.hidden_balancer.prob, batch_count=1017533.3333333334, ans=0.125 2023-10-02 21:00:43,105 WARNING [train.py:1204] (2/4) Exclude cut with ID _64631_210_27767_1_1535349580068_4165160_324-3682-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 21:00:44,380 WARNING [train.py:1204] (2/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_317-211107-0 from training. Duration: 0.92 2023-10-02 21:00:46,491 WARNING [train.py:1204] (2/4) Exclude cut with ID _6789_210_1534_1_1527857937218_3118950_311-61262-0 from training. Duration: 0.65 2023-10-02 21:00:49,766 WARNING [train.py:1204] (2/4) Exclude cut with ID _28323_210_16187_1_1532480395667_4100300_365-68882-0_sp1.1 from training. Duration: 0.92725 2023-10-02 21:00:49,796 WARNING [train.py:1204] (2/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_816-181825-0_sp1.1 from training. Duration: 0.92725 2023-10-02 21:00:52,513 WARNING [train.py:1204] (2/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_132-269633-0_sp1.1 from training. Duration: 0.6181875 2023-10-02 21:00:52,523 WARNING [train.py:1204] (2/4) Exclude cut with ID _44585_210_18628_1_1533605350387_4518199_280-73534-0_sp1.1 from training. Duration: 0.8090625 2023-10-02 21:00:52,572 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_68095_210_20185_1_1535718226_8888855_1009-164755-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 21:00:54,285 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_40223_210_6228_1_1533298404_4812267_400-103813-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 21:00:54,286 WARNING [train.py:1204] (2/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_491-261923-0_sp1.1 from training. Duration: 0.8909375 2023-10-02 21:00:54,292 WARNING [train.py:1204] (2/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_712-342563-0 from training. Duration: 0.97 2023-10-02 21:00:55,635 WARNING [train.py:1204] (2/4) Exclude cut with ID _41161_210_20034_1_1533431249657_3555314_175-190032-0_sp1.1 from training. Duration: 0.963625 2023-10-02 21:00:56,968 INFO [train.py:1046] (2/4) Epoch 29, batch 3900, loss[loss=0.1684, simple_loss=0.2564, pruned_loss=0.04019, over 24367.00 frames. ], tot_loss[loss=0.1641, simple_loss=0.2415, pruned_loss=0.04333, over 4709040.16 frames. ], batch size: 77, lr: 3.51e-03, grad_scale: 8.0 2023-10-02 21:00:57,009 WARNING [train.py:1204] (2/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_788-298907-0 from training. Duration: 0.89 2023-10-02 21:00:57,040 WARNING [train.py:1204] (2/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_225-70961-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 21:00:57,046 WARNING [train.py:1204] (2/4) Exclude cut with ID _58583_210_5236_1_1534726835266_3574249_235-175994-0_sp1.1 from training. Duration: 0.92725 2023-10-02 21:00:58,480 WARNING [train.py:1204] (2/4) Exclude cut with ID _12302_210_5219_1_1529924446466_8107280_907-182949-0_sp1.1 from training. Duration: 0.836375 2023-10-02 21:00:59,728 WARNING [train.py:1204] (2/4) Exclude cut with ID _54871_210_22356_1_1534665798253_3856359_94-193692-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 21:01:01,114 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_4528_210_3273_1_1527076654_3276777_342-168068-0_sp0.9 from training. Duration: 0.9666875 2023-10-02 21:01:01,165 WARNING [train.py:1204] (2/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_368-74239-0_sp1.1 from training. Duration: 0.92725 2023-10-02 21:01:01,166 WARNING [train.py:1204] (2/4) Exclude cut with ID _44831_210_13427_1_1533816011549_3689035_291-295928-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 21:01:02,583 WARNING [train.py:1204] (2/4) Exclude cut with ID _56406_210_9288_1_1534899618664_3496149_455-131997-0_sp1.1 from training. Duration: 0.97275 2023-10-02 21:01:02,592 WARNING [train.py:1204] (2/4) Exclude cut with ID _6473_210_1741_1_1527901307396_3815380_108-17335-0 from training. Duration: 0.65 2023-10-02 21:01:03,885 WARNING [train.py:1204] (2/4) Exclude cut with ID _14224_210_4381_1_1530529062037_4264010_286-135600-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 21:01:06,729 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_14056_210_4163_1_1530604483_7572827_928-321675-0_sp1.1 from training. Duration: 0.936375 2023-10-02 21:01:06,807 WARNING [train.py:1204] (2/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_81-300212-0_sp1.1 from training. Duration: 0.5454375 2023-10-02 21:01:06,839 WARNING [train.py:1204] (2/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_29-280622-0_sp0.9 from training. Duration: 0.911125 2023-10-02 21:01:08,194 WARNING [train.py:1204] (2/4) Exclude cut with ID _61079_210_4525_1_1534921008198_4011419_228-176447-0_sp1.1 from training. Duration: 0.936375 2023-10-02 21:01:10,278 WARNING [train.py:1204] (2/4) Exclude cut with ID _8394_210_6702_1_1528590504316_3540189_372-198878-0_sp1.1 from training. Duration: 0.5454375 2023-10-02 21:01:10,289 WARNING [train.py:1204] (2/4) Exclude cut with ID _64521_210_14699_1_1535243427259_3815328_244-208725-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 21:01:12,969 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8275_210_4198_1_1528455320_3892571_491-17707-0_sp0.9 from training. Duration: 0.711125 2023-10-02 21:01:14,279 WARNING [train.py:1204] (2/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_375-245677-0 from training. Duration: 0.81 2023-10-02 21:01:14,286 WARNING [train.py:1204] (2/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_690-286055-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 21:01:14,479 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.bypass.scale_min, batch_count=1017666.6666666666, ans=0.2 2023-10-02 21:01:16,162 WARNING [train.py:1204] (2/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_89-321955-0 from training. Duration: 0.8 2023-10-02 21:01:17,891 WARNING [train.py:1204] (2/4) Exclude cut with ID _18895_210_6228_1_1531357274234_3784930_213-98721-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 21:01:19,246 WARNING [train.py:1204] (2/4) Exclude cut with ID _19708_210_9206_1_1532136650733_4238364_347-65240-0 from training. Duration: 0.97 2023-10-02 21:01:19,360 WARNING [train.py:1204] (2/4) Exclude cut with ID _64135_210_2253_1_1535677267876_7608972_498-5039-0 from training. Duration: 0.78 2023-10-02 21:01:23,640 WARNING [train.py:1204] (2/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_146-276944-0_sp1.1 from training. Duration: 0.7545625 2023-10-02 21:01:25,677 WARNING [train.py:1204] (2/4) Exclude cut with ID _63171_210_15488_1_1535104792794_3295110_155-34488-0_sp1.1 from training. Duration: 0.97275 2023-10-02 21:01:25,699 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_5438_210_1794_1_1527336687_2600009_233-58072-0_sp0.9 from training. Duration: 0.8555625 2023-10-02 21:01:26,798 WARNING [train.py:1204] (2/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_63-101996-0_sp0.9 from training. Duration: 0.97775 2023-10-02 21:01:28,323 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=1017733.3333333334, ans=0.1 2023-10-02 21:01:32,121 WARNING [train.py:1204] (2/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_271-269084-0_sp1.1 from training. Duration: 0.7545625 2023-10-02 21:01:33,501 WARNING [train.py:1204] (2/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_345-167986-0_sp1.1 from training. Duration: 0.763625 2023-10-02 21:01:36,170 WARNING [train.py:1204] (2/4) Exclude cut with ID _39290_210_4220_1_1533862778760_3075118_92-311600-0_sp1.1 from training. Duration: 0.8 2023-10-02 21:01:36,183 WARNING [train.py:1204] (2/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_209-65306-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 21:01:37,516 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_26433_210_12108_1_1532157158_4056480_317-335807-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 21:01:38,393 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.2.encoder.layers.2.feed_forward2.out_whiten, num_groups=1, num_channels=384, metric=3.98 vs. limit=15.0 2023-10-02 21:01:41,065 WARNING [train.py:1204] (2/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_238-94127-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 21:01:42,402 WARNING [train.py:1204] (2/4) Exclude cut with ID _41314_210_12651_1_1533459476543_3766460_207-132038-0_sp1.1 from training. Duration: 0.8545625 2023-10-02 21:01:48,535 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.bypass_mid.scale_min, batch_count=1017800.0, ans=0.2 2023-10-02 21:01:49,650 WARNING [train.py:1204] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_167-137255-0_sp1.1 from training. Duration: 0.5818125 2023-10-02 21:01:51,709 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.0.conv_module1.balancer2.prob, batch_count=1017800.0, ans=0.125 2023-10-02 21:01:52,896 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_2009_210_2071_1_1525481134_4588937_385-302100-0_sp0.9 from training. Duration: 0.9666875 2023-10-02 21:01:57,706 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.1.balancer2.prob, batch_count=1017866.6666666666, ans=0.125 2023-10-02 21:02:00,139 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.598e+02 1.908e+02 2.090e+02 2.279e+02 3.319e+02, threshold=4.180e+02, percent-clipped=0.0 2023-10-02 21:02:01,590 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_15517_210_6753_1_1530954008_3487336_253-110617-0_sp1.1 from training. Duration: 0.936375 2023-10-02 21:02:04,389 WARNING [train.py:1204] (2/4) Exclude cut with ID _39290_210_4220_1_1533862778760_3075118_92-311600-0_sp0.9 from training. Duration: 0.97775 2023-10-02 21:02:04,436 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_42-142473-0 from training. Duration: 0.72 2023-10-02 21:02:04,471 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_2296_210_2087_1_1525503448_3397335_57-185430-0 from training. Duration: 0.6 2023-10-02 21:02:04,483 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_11305_210_1794_1_1529750428_7178148_257-312870-0_sp0.9 from training. Duration: 0.97775 2023-10-02 21:02:05,901 WARNING [train.py:1204] (2/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_222-198127-0 from training. Duration: 0.84 2023-10-02 21:02:07,280 WARNING [train.py:1204] (2/4) Exclude cut with ID _65263_210_22356_1_1535763598691_3921037_423-202699-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 21:02:07,341 WARNING [train.py:1204] (2/4) Exclude cut with ID _15037_210_2253_1_1530752294104_3267650_13-130361-0 from training. Duration: 0.99 2023-10-02 21:02:10,058 INFO [train.py:1046] (2/4) Epoch 29, batch 3950, loss[loss=0.1637, simple_loss=0.2451, pruned_loss=0.04113, over 24446.00 frames. ], tot_loss[loss=0.1637, simple_loss=0.2413, pruned_loss=0.04307, over 4708689.91 frames. ], batch size: 63, lr: 3.51e-03, grad_scale: 8.0 2023-10-02 21:02:14,492 WARNING [train.py:1204] (2/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_628-198080-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 21:02:14,579 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_3171_210_2087_1_1526109084_2330007_37-64334-0 from training. Duration: 0.82 2023-10-02 21:02:15,917 WARNING [train.py:1204] (2/4) Exclude cut with ID _5287_210_2084_1_1527418883040_3878670_12-221186-0_sp1.1 from training. Duration: 0.8909375 2023-10-02 21:02:18,924 WARNING [train.py:1204] (2/4) Exclude cut with ID _6653_210_2629_1_1527919172441_6732679_1103-56445-0_sp1.1 from training. Duration: 0.663625 2023-10-02 21:02:20,337 WARNING [train.py:1204] (2/4) Exclude cut with ID _65263_210_22356_1_1535763598691_3921037_169-140068-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 21:02:26,547 WARNING [train.py:1204] (2/4) Exclude cut with ID IC0671W0118-331961 from training. Duration: 0.896 2023-10-02 21:02:26,607 WARNING [train.py:1204] (2/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_402-102311-0_sp1.1 from training. Duration: 0.6090625 2023-10-02 21:02:26,632 WARNING [train.py:1204] (2/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_202-293523-0 from training. Duration: 0.72 2023-10-02 21:02:26,774 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.ff2_skip_rate, batch_count=1018000.0, ans=0.0 2023-10-02 21:02:28,633 WARNING [train.py:1204] (2/4) Exclude cut with ID IC0679W0038-335846 from training. Duration: 0.768 2023-10-02 21:02:28,670 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_37357_210_17024_1_1533089504_2316121_79-198866-0_sp1.1 from training. Duration: 0.936375 2023-10-02 21:02:30,121 WARNING [train.py:1204] (2/4) Exclude cut with ID _7606_210_4915_1_1528894253277_5080690_292-271285-0_sp1.1 from training. Duration: 0.963625 2023-10-02 21:02:31,416 WARNING [train.py:1204] (2/4) Exclude cut with ID _3541_210_2477_1_1526475408709_3263973_377-296839-0_sp0.9 from training. Duration: 0.611125 2023-10-02 21:02:31,419 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_45886_210_17024_1_1533704917_2435333_58-131069-0_sp1.1 from training. Duration: 0.963625 2023-10-02 21:02:32,845 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6961_210_2087_1_1527937168_6815011_395-185060-0 from training. Duration: 0.95 2023-10-02 21:02:35,606 WARNING [train.py:1204] (2/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_304-266479-0_sp1.1 from training. Duration: 0.97275 2023-10-02 21:02:35,669 WARNING [train.py:1204] (2/4) Exclude cut with ID _3867_210_1883_1_1526641200575_4475830_319-349850-0_sp1.1 from training. Duration: 0.6090625 2023-10-02 21:02:35,678 WARNING [train.py:1204] (2/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_304-59227-0_sp0.9 from training. Duration: 0.8555625 2023-10-02 21:02:36,925 WARNING [train.py:1204] (2/4) Exclude cut with ID _8021_210_7994_1_1528376451358_3653069_100-140231-0_sp0.9 from training. Duration: 0.7444375 2023-10-02 21:02:36,987 WARNING [train.py:1204] (2/4) Exclude cut with ID _8833_210_5219_1_1528592665210_7553059_329-125156-0_sp0.9 from training. Duration: 0.911125 2023-10-02 21:02:41,778 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.conv_skip_rate, batch_count=1018066.6666666666, ans=0.0 2023-10-02 21:02:48,803 WARNING [train.py:1204] (2/4) Exclude cut with ID _14459_210_2228_1_1531443641946_7516700_792-287234-0_sp1.1 from training. Duration: 0.92725 2023-10-02 21:02:48,828 WARNING [train.py:1204] (2/4) Exclude cut with ID _19708_210_9206_1_1532136650733_4238364_347-65240-0_sp1.1 from training. Duration: 0.8818125 2023-10-02 21:02:54,392 WARNING [train.py:1204] (2/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_70-27732-0 from training. Duration: 0.94 2023-10-02 21:02:59,639 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_397-132565-0 from training. Duration: 0.99 2023-10-02 21:02:59,642 WARNING [train.py:1204] (2/4) Exclude cut with ID _49728_210_12651_1_1534041034294_6788150_624-195301-0 from training. Duration: 0.85 2023-10-02 21:03:00,934 WARNING [train.py:1204] (2/4) Exclude cut with ID _55429_210_10626_1_1534467588527_7174143_377-169391-0_sp1.1 from training. Duration: 0.87275 2023-10-02 21:03:02,223 WARNING [train.py:1204] (2/4) Exclude cut with ID _49376_210_5986_1_1533985278732_3370740_354-344695-0_sp1.1 from training. Duration: 0.9 2023-10-02 21:03:07,861 WARNING [train.py:1204] (2/4) Exclude cut with ID _4697_210_2069_1_1526815801953_3823098_276-96593-0_sp1.1 from training. Duration: 0.7 2023-10-02 21:03:07,877 WARNING [train.py:1204] (2/4) Exclude cut with ID _15012_210_1968_1_1530856820429_5095350_688-178849-0_sp0.9 from training. Duration: 0.711125 2023-10-02 21:03:08,020 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.1.feed_forward2.hidden_balancer.prob, batch_count=1018200.0, ans=0.125 2023-10-02 21:03:09,244 WARNING [train.py:1204] (2/4) Exclude cut with ID _25565_210_12392_1_1532423009382_3952098_372-69023-0_sp1.1 from training. Duration: 0.936375 2023-10-02 21:03:09,287 WARNING [train.py:1204] (2/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_61-327062-0_sp1.1 from training. Duration: 0.736375 2023-10-02 21:03:10,440 WARNING [train.py:1204] (2/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_215-141059-0 from training. Duration: 0.62 2023-10-02 21:03:17,062 WARNING [train.py:1204] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_6-226750-0_sp1.1 from training. Duration: 0.8545625 2023-10-02 21:03:17,150 WARNING [train.py:1204] (2/4) Exclude cut with ID _14348_210_8341_1_1530599434996_3778640_240-232370-0_sp1.1 from training. Duration: 0.763625 2023-10-02 21:03:19,978 WARNING [train.py:1204] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_468-189058-0 from training. Duration: 0.96 2023-10-02 21:03:21,583 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.0.balancer1.prob, batch_count=1018200.0, ans=0.125 2023-10-02 21:03:23,939 INFO [train.py:1046] (2/4) Epoch 29, batch 4000, loss[loss=0.163, simple_loss=0.2537, pruned_loss=0.03613, over 24663.00 frames. ], tot_loss[loss=0.1638, simple_loss=0.2419, pruned_loss=0.04286, over 4714844.43 frames. ], batch size: 68, lr: 3.51e-03, grad_scale: 16.0 2023-10-02 21:03:24,269 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.conv_module1.balancer1.prob, batch_count=1018266.6666666666, ans=0.125 2023-10-02 21:03:28,548 WARNING [train.py:1204] (2/4) Exclude cut with ID _21987_210_5854_1_1531821500482_3637038_247-93340-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 21:03:37,211 WARNING [train.py:1204] (2/4) Exclude cut with ID _66868_210_9527_1_1535509287954_4561140_436-191602-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 21:03:41,576 WARNING [train.py:1204] (2/4) Exclude cut with ID _18895_210_6228_1_1531357274234_3784930_394-58203-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 21:03:41,625 WARNING [train.py:1204] (2/4) Exclude cut with ID _58605_210_12210_1_1534723195939_3904259_258-281498-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 21:03:43,626 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_33348_210_5632_1_1533086912_3324030_245-292752-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 21:03:43,644 WARNING [train.py:1204] (2/4) Exclude cut with ID _67831_210_12392_1_1535612464202_3092612_346-2148-0 from training. Duration: 0.93 2023-10-02 21:03:43,715 WARNING [train.py:1204] (2/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_369-222060-0_sp0.9 from training. Duration: 0.788875 2023-10-02 21:03:43,877 INFO [scaling.py:1118] (2/4) WithLoss: name=encoder.encoders.1.encoder.layers.1.self_attn_weights, loss-sum=0.000e+00 2023-10-02 21:03:44,909 WARNING [train.py:1204] (2/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_245-174553-0 from training. Duration: 0.88 2023-10-02 21:03:44,915 WARNING [train.py:1204] (2/4) Exclude cut with ID _3541_210_2477_1_1526475408709_3263973_231-104736-0_sp1.1 from training. Duration: 0.7090625 2023-10-02 21:03:46,259 WARNING [train.py:1204] (2/4) Exclude cut with ID _44585_210_18628_1_1533605350387_4518199_280-73534-0 from training. Duration: 0.89 2023-10-02 21:03:47,609 WARNING [train.py:1204] (2/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_22-201476-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 21:03:49,299 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.conv_module2.balancer1.prob, batch_count=1018333.3333333334, ans=0.125 2023-10-02 21:03:50,948 WARNING [train.py:1204] (2/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_325-62802-0_sp0.9 from training. Duration: 0.9666875 2023-10-02 21:03:50,960 WARNING [train.py:1204] (2/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_199-274062-0_sp1.1 from training. Duration: 0.97275 2023-10-02 21:03:50,963 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_8-319957-0_sp1.1 from training. Duration: 0.87275 2023-10-02 21:03:50,994 WARNING [train.py:1204] (2/4) Exclude cut with ID _29343_210_4381_1_1532602757752_3674109_224-349726-0_sp1.1 from training. Duration: 0.936375 2023-10-02 21:03:50,998 WARNING [train.py:1204] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_520-126729-0_sp1.1 from training. Duration: 0.636375 2023-10-02 21:03:51,222 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.conv_skip_rate, batch_count=1018333.3333333334, ans=0.0 2023-10-02 21:03:52,469 WARNING [train.py:1204] (2/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_239-22076-0_sp1.1 from training. Duration: 0.77275 2023-10-02 21:03:53,884 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9012W0011-501146 from training. Duration: 0.896 2023-10-02 21:03:55,361 WARNING [train.py:1204] (2/4) Exclude cut with ID _29857_210_20034_1_1532395886753_3798160_279-185801-0_sp1.1 from training. Duration: 0.82725 2023-10-02 21:03:55,420 WARNING [train.py:1204] (2/4) Exclude cut with ID _43552_210_8913_1_1533726031059_3523429_261-223933-0_sp1.1 from training. Duration: 0.92725 2023-10-02 21:03:58,534 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9029W0025-509426 from training. Duration: 0.896 2023-10-02 21:03:59,874 WARNING [train.py:1204] (2/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_337-181502-0_sp1.1 from training. Duration: 0.5909375 2023-10-02 21:03:59,901 WARNING [train.py:1204] (2/4) Exclude cut with ID _68635_210_9317_1_1535770813410_4315870_159-332599-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 21:04:04,509 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.1.nonlin_attention.balancer.prob, batch_count=1018400.0, ans=0.125 2023-10-02 21:04:07,282 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_161-276996-0 from training. Duration: 0.95 2023-10-02 21:04:07,330 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7088_210_1874_1_1527939776_1101113_127-68870-0_sp1.1 from training. Duration: 0.936375 2023-10-02 21:04:10,021 WARNING [train.py:1204] (2/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_322-217220-0_sp1.1 from training. Duration: 0.7818125 2023-10-02 21:04:10,092 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9072W0002-530636 from training. Duration: 0.768 2023-10-02 21:04:12,042 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_340-336408-0_sp1.1 from training. Duration: 0.8454375 2023-10-02 21:04:13,530 WARNING [train.py:1204] (2/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_395-37222-0 from training. Duration: 0.76 2023-10-02 21:04:13,539 WARNING [train.py:1204] (2/4) Exclude cut with ID _64099_210_17301_1_1535526977656_1578188_159-209156-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 21:04:14,808 WARNING [train.py:1204] (2/4) Exclude cut with ID _53950_210_5816_1_1534417264919_3862951_28-29681-0_sp1.1 from training. Duration: 0.92725 2023-10-02 21:04:16,277 WARNING [train.py:1204] (2/4) Exclude cut with ID _4681_210_3525_1_1527250689464_120889_15-126760-0_sp1.1 from training. Duration: 0.736375 2023-10-02 21:04:17,678 WARNING [train.py:1204] (2/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_242-3472-0_sp1.1 from training. Duration: 0.663625 2023-10-02 21:04:19,036 WARNING [train.py:1204] (2/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_436-186311-0_sp0.9 from training. Duration: 0.72225 2023-10-02 21:04:19,060 WARNING [train.py:1204] (2/4) Exclude cut with ID _3675_210_4557_1_1526639934408_4182089_452-172147-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 21:04:20,466 WARNING [train.py:1204] (2/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_80-147301-0 from training. Duration: 0.84 2023-10-02 21:04:20,507 WARNING [train.py:1204] (2/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_351-107588-0_sp1.1 from training. Duration: 0.92725 2023-10-02 21:04:22,010 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9111W0002-550781 from training. Duration: 0.896 2023-10-02 21:04:25,442 WARNING [train.py:1204] (2/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_465-156500-0_sp1.1 from training. Duration: 0.6545625 2023-10-02 21:04:28,232 WARNING [train.py:1204] (2/4) Exclude cut with ID _13938_210_9988_1_1530518718978_3693960_304-112784-0_sp1.1 from training. Duration: 0.4181875 2023-10-02 21:04:30,108 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.610e+02 1.895e+02 2.160e+02 2.522e+02 3.518e+02, threshold=4.320e+02, percent-clipped=0.0 2023-10-02 21:04:30,136 WARNING [train.py:1204] (2/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_202-67017-0_sp1.1 from training. Duration: 0.7454375 2023-10-02 21:04:30,183 WARNING [train.py:1204] (2/4) Exclude cut with ID _58414_210_8254_1_1535421609162_7151182_448-4358-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 21:04:30,258 WARNING [train.py:1204] (2/4) Exclude cut with ID _66840_210_5219_1_1535452168426_4196349_203-243234-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 21:04:30,479 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.conv_module1.balancer1.prob, batch_count=1018533.3333333334, ans=0.125 2023-10-02 21:04:33,366 WARNING [train.py:1204] (2/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_201-213513-0_sp1.1 from training. Duration: 0.97275 2023-10-02 21:04:36,946 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.2.encoder.layers.0.feed_forward2.out_whiten, num_groups=1, num_channels=384, metric=4.57 vs. limit=15.0 2023-10-02 21:04:37,538 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6291_210_3273_1_1527683649_1078498_117-187560-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 21:04:38,756 INFO [train.py:1046] (2/4) Epoch 29, batch 4050, loss[loss=0.1718, simple_loss=0.2508, pruned_loss=0.04637, over 23562.00 frames. ], tot_loss[loss=0.1643, simple_loss=0.2426, pruned_loss=0.043, over 4716068.09 frames. ], batch size: 93, lr: 3.51e-03, grad_scale: 8.0 2023-10-02 21:04:40,320 WARNING [train.py:1204] (2/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_135-174080-0_sp0.9 from training. Duration: 0.588875 2023-10-02 21:04:41,514 WARNING [train.py:1204] (2/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_45-265684-0 from training. Duration: 0.72 2023-10-02 21:04:43,637 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_435-313305-0_sp1.1 from training. Duration: 0.6454375 2023-10-02 21:04:43,655 WARNING [train.py:1204] (2/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_235-307337-0_sp1.1 from training. Duration: 0.8545625 2023-10-02 21:04:43,728 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7762_210_1794_1_1528199508_4057408_419-125009-0_sp0.9 from training. Duration: 0.92225 2023-10-02 21:04:45,175 WARNING [train.py:1204] (2/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_215-141059-0_sp1.1 from training. Duration: 0.563625 2023-10-02 21:04:46,644 WARNING [train.py:1204] (2/4) Exclude cut with ID _27013_210_1389_1_1532437173240_3252649_222-125573-0_sp1.1 from training. Duration: 0.8909375 2023-10-02 21:04:50,539 WARNING [train.py:1204] (2/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_126-274694-0_sp1.1 from training. Duration: 0.8909375 2023-10-02 21:04:52,684 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_2592_210_2290_1_1525697025_4882925_157-70512-0_sp1.1 from training. Duration: 0.82725 2023-10-02 21:04:54,128 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_292-265928-0_sp0.9 from training. Duration: 0.5666875 2023-10-02 21:04:55,556 WARNING [train.py:1204] (2/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_1211-226066-0_sp1.1 from training. Duration: 0.6909375 2023-10-02 21:04:55,596 WARNING [train.py:1204] (2/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_394-265747-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 21:04:57,493 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.3.encoder.layers.0.whiten, num_groups=1, num_channels=512, metric=8.45 vs. limit=12.0 2023-10-02 21:04:58,338 WARNING [train.py:1204] (2/4) Exclude cut with ID _48782_210_12658_1_1534467701544_2300422_239-67139-0_sp1.1 from training. Duration: 0.97275 2023-10-02 21:05:01,107 WARNING [train.py:1204] (2/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_329-113173-0_sp1.1 from training. Duration: 0.563625 2023-10-02 21:05:04,881 WARNING [train.py:1204] (2/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_81-75849-0_sp0.9 from training. Duration: 0.4333125 2023-10-02 21:05:07,508 WARNING [train.py:1204] (2/4) Exclude cut with ID _9235_210_5219_1_1528891154253_8046120_1003-101638-0 from training. Duration: 0.73 2023-10-02 21:05:07,540 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9309W0189-640316 from training. Duration: 0.64 2023-10-02 21:05:10,433 WARNING [train.py:1204] (2/4) Exclude cut with ID _45329_210_7005_1_1534157951211_6774339_503-287883-0_sp1.1 from training. Duration: 0.8 2023-10-02 21:05:16,578 WARNING [train.py:1204] (2/4) Exclude cut with ID _8833_210_5219_1_1528592665210_7553059_611-80313-0 from training. Duration: 0.64 2023-10-02 21:05:17,911 WARNING [train.py:1204] (2/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_335-106626-0_sp1.1 from training. Duration: 0.963625 2023-10-02 21:05:19,452 WARNING [train.py:1204] (2/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_237-44332-0_sp1.1 from training. Duration: 0.8545625 2023-10-02 21:05:22,854 WARNING [train.py:1204] (2/4) Exclude cut with ID _46952_210_5986_1_1534230010357_3107379_178-147341-0_sp1.1 from training. Duration: 0.97275 2023-10-02 21:05:24,154 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_13091_210_7925_1_1530609447_8987354_946-154426-0_sp1.1 from training. Duration: 0.936375 2023-10-02 21:05:24,184 WARNING [train.py:1204] (2/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_326-17061-0_sp1.1 from training. Duration: 0.8545625 2023-10-02 21:05:26,829 WARNING [train.py:1204] (2/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_38-169920-0_sp1.1 from training. Duration: 0.82725 2023-10-02 21:05:29,606 WARNING [train.py:1204] (2/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_283-220372-0 from training. Duration: 0.56 2023-10-02 21:05:29,614 WARNING [train.py:1204] (2/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_252-216674-0_sp1.1 from training. Duration: 0.4909375 2023-10-02 21:05:31,134 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_202-91290-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 21:05:32,497 WARNING [train.py:1204] (2/4) Exclude cut with ID _41314_210_12651_1_1533459476543_3766460_80-74184-0 from training. Duration: 0.83 2023-10-02 21:05:37,668 WARNING [train.py:1204] (2/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_411-271358-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 21:05:37,920 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.balancer2.prob, batch_count=1018866.6666666666, ans=0.125 2023-10-02 21:05:45,785 WARNING [train.py:1204] (2/4) Exclude cut with ID _68777_210_4353_1_1535810390731_3117904_129-166012-0 from training. Duration: 0.85 2023-10-02 21:05:45,872 WARNING [train.py:1204] (2/4) Exclude cut with ID _44982_210_1511_1_1534312820108_3726029_410-109759-0_sp1.1 from training. Duration: 0.963625 2023-10-02 21:05:45,877 WARNING [train.py:1204] (2/4) Exclude cut with ID _14224_210_4381_1_1530529062037_4264010_364-22817-0_sp1.1 from training. Duration: 0.6454375 2023-10-02 21:05:49,153 WARNING [train.py:1204] (2/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_89-242335-0 from training. Duration: 0.95 2023-10-02 21:05:49,161 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_12868_210_6753_1_1530702479_3610564_151-325626-0 from training. Duration: 0.97 2023-10-02 21:05:49,162 WARNING [train.py:1204] (2/4) Exclude cut with ID _6275_210_2084_1_1528110062213_7669049_786-264332-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 21:05:50,602 WARNING [train.py:1204] (2/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_35-153886-0_sp1.1 from training. Duration: 0.763625 2023-10-02 21:05:52,660 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_24007_210_1794_1_1531918925_1663875_116-100916-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 21:05:52,675 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_162-262717-0_sp1.1 from training. Duration: 0.7545625 2023-10-02 21:05:53,896 INFO [train.py:1046] (2/4) Epoch 29, batch 4100, loss[loss=0.2258, simple_loss=0.2887, pruned_loss=0.08147, over 19348.00 frames. ], tot_loss[loss=0.1647, simple_loss=0.2432, pruned_loss=0.0431, over 4714916.14 frames. ], batch size: 388, lr: 3.51e-03, grad_scale: 8.0 2023-10-02 21:05:59,498 WARNING [train.py:1204] (2/4) Exclude cut with ID _8263_210_1664_1_1528438953494_294100_40-204731-0 from training. Duration: 0.5 2023-10-02 21:05:59,590 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_12868_210_6753_1_1530702479_3610564_416-293746-0 from training. Duration: 0.9 2023-10-02 21:06:02,434 WARNING [train.py:1204] (2/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_605-219732-0 from training. Duration: 0.94 2023-10-02 21:06:05,198 WARNING [train.py:1204] (2/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_454-123002-0 from training. Duration: 0.69 2023-10-02 21:06:05,206 WARNING [train.py:1204] (2/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_25-304273-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 21:06:05,280 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6860_210_4852_1_1527917457_3833327_451-39702-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 21:06:05,306 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_304-16505-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 21:06:05,318 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6291_210_3273_1_1527683649_1078498_4-266348-0_sp1.1 from training. Duration: 0.7090625 2023-10-02 21:06:07,120 WARNING [train.py:1204] (2/4) Exclude cut with ID ID0149W0001-753701 from training. Duration: 0.989 2023-10-02 21:06:09,210 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.feed_forward1.hidden_balancer.prob, batch_count=1019000.0, ans=0.125 2023-10-02 21:06:11,766 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_41251_210_15368_1_1533429767_4820695_394-55393-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 21:06:11,849 WARNING [train.py:1204] (2/4) Exclude cut with ID _8793_210_6310_1_1528542985139_2310009_121-179782-0_sp1.1 from training. Duration: 0.72725 2023-10-02 21:06:13,115 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_2729_210_3528_1_1525863505_3882214_421-73047-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 21:06:13,172 WARNING [train.py:1204] (2/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_278-154492-0_sp0.9 from training. Duration: 0.8444375 2023-10-02 21:06:14,738 WARNING [train.py:1204] (2/4) Exclude cut with ID _14170_210_5219_1_1530525949868_4469650_412-317077-0_sp0.9 from training. Duration: 0.8333125 2023-10-02 21:06:17,542 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_727-138317-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 21:06:17,574 WARNING [train.py:1204] (2/4) Exclude cut with ID _25808_210_13292_1_1532343514562_7366109_345-335048-0_sp1.1 from training. Duration: 0.92725 2023-10-02 21:06:17,586 WARNING [train.py:1204] (2/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_506-23200-0 from training. Duration: 0.97 2023-10-02 21:06:17,909 INFO [scaling.py:1118] (2/4) WithLoss: name=encoder.encoders.5.encoder.layers.0.self_attn_weights, loss-sum=0.000e+00 2023-10-02 21:06:19,517 WARNING [train.py:1204] (2/4) Exclude cut with ID _44718_210_5816_1_1534050051869_6469030_643-245187-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 21:06:19,522 WARNING [train.py:1204] (2/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_42-186162-0_sp1.1 from training. Duration: 0.836375 2023-10-02 21:06:19,542 WARNING [train.py:1204] (2/4) Exclude cut with ID _3231_210_2106_1_1526205864934_6784220_171-256004-0_sp1.1 from training. Duration: 0.7818125 2023-10-02 21:06:19,569 WARNING [train.py:1204] (2/4) Exclude cut with ID _25914_210_6400_1_1532141317058_644740_43-248478-0_sp1.1 from training. Duration: 0.936375 2023-10-02 21:06:20,769 WARNING [train.py:1204] (2/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_577-287150-0 from training. Duration: 0.82 2023-10-02 21:06:22,320 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_46015_210_7234_1_1533863319_3087837_376-276068-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 21:06:22,536 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.conv_skip_rate, batch_count=1019066.6666666666, ans=0.0 2023-10-02 21:06:24,381 WARNING [train.py:1204] (2/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_51-200220-0 from training. Duration: 0.56 2023-10-02 21:06:25,787 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_452-96101-0_sp1.1 from training. Duration: 0.963625 2023-10-02 21:06:27,193 WARNING [train.py:1204] (2/4) Exclude cut with ID _13252_210_1925_1_1530671372047_3336909_263-168388-0_sp1.1 from training. Duration: 0.7818125 2023-10-02 21:06:27,194 WARNING [train.py:1204] (2/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_469-94673-0 from training. Duration: 0.79 2023-10-02 21:06:28,486 WARNING [train.py:1204] (2/4) Exclude cut with ID _14922_210_6228_1_1530707435273_3693310_303-228738-0_sp1.1 from training. Duration: 0.763625 2023-10-02 21:06:28,522 WARNING [train.py:1204] (2/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_262-250826-0_sp1.1 from training. Duration: 0.9 2023-10-02 21:06:28,542 WARNING [train.py:1204] (2/4) Exclude cut with ID _68777_210_4353_1_1535810390731_3117904_129-166012-0_sp1.1 from training. Duration: 0.77275 2023-10-02 21:06:31,493 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.bypass_mid.scale_min, batch_count=1019066.6666666666, ans=0.2 2023-10-02 21:06:32,617 WARNING [train.py:1204] (2/4) Exclude cut with ID _58895_210_8750_1_1535184081320_3649089_265-323810-0 from training. Duration: 0.96 2023-10-02 21:06:32,707 WARNING [train.py:1204] (2/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_120-76843-0_sp0.9 from training. Duration: 0.611125 2023-10-02 21:06:32,757 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7544_210_6753_1_1528587722_3576686_295-258479-0_sp0.9 from training. Duration: 0.9333125 2023-10-02 21:06:33,027 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.conv_skip_rate, batch_count=1019066.6666666666, ans=0.0 2023-10-02 21:06:36,114 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7046_210_6753_1_1527895337_6920917_783-278109-0 from training. Duration: 0.77 2023-10-02 21:06:36,164 WARNING [train.py:1204] (2/4) Exclude cut with ID _42326_210_7770_1_1533814282531_3707269_113-308323-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 21:06:36,205 WARNING [train.py:1204] (2/4) Exclude cut with ID _7852_210_5219_1_1528286471316_8139400_242-105336-0_sp0.9 from training. Duration: 0.8 2023-10-02 21:06:39,523 WARNING [train.py:1204] (2/4) Exclude cut with ID _56235_210_10640_1_1534557335235_4161099_114-277905-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 21:06:41,156 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.conv_module1.balancer2.min_abs, batch_count=1019133.3333333334, ans=0.5 2023-10-02 21:06:43,774 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_13118_210_7925_1_1530755612_7608297_42-66683-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 21:06:46,531 WARNING [train.py:1204] (2/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_901-79438-0_sp1.1 from training. Duration: 0.8545625 2023-10-02 21:06:47,894 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_30280_210_1881_1_1532872779_3660139_211-182194-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 21:06:56,173 WARNING [train.py:1204] (2/4) Exclude cut with ID _14898_210_9206_1_1530766812377_4177190_621-45732-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 21:06:56,181 WARNING [train.py:1204] (2/4) Exclude cut with ID _35350_210_16040_1_1533040191183_4075299_224-133775-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 21:07:00,153 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.365e+02 1.918e+02 2.139e+02 2.602e+02 3.703e+02, threshold=4.278e+02, percent-clipped=0.0 2023-10-02 21:07:00,295 WARNING [train.py:1204] (2/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_147-293311-0_sp1.1 from training. Duration: 0.8545625 2023-10-02 21:07:01,717 WARNING [train.py:1204] (2/4) Exclude cut with ID _12302_210_5219_1_1529924446466_8107280_835-279754-0_sp1.1 from training. Duration: 0.8181875 2023-10-02 21:07:06,302 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8453_210_4211_1_1528502940_8562730_347-330673-0_sp0.9 from training. Duration: 0.8 2023-10-02 21:07:07,617 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_213-103282-0_sp0.9 from training. Duration: 0.8666875 2023-10-02 21:07:08,933 INFO [train.py:1046] (2/4) Epoch 29, batch 4150, loss[loss=0.1499, simple_loss=0.2369, pruned_loss=0.03146, over 24469.00 frames. ], tot_loss[loss=0.165, simple_loss=0.2429, pruned_loss=0.04358, over 4711978.54 frames. ], batch size: 63, lr: 3.51e-03, grad_scale: 8.0 2023-10-02 21:07:08,982 WARNING [train.py:1204] (2/4) Exclude cut with ID _49728_210_12651_1_1534041034294_6788150_66-43496-0_sp1.1 from training. Duration: 0.97275 2023-10-02 21:07:08,984 WARNING [train.py:1204] (2/4) Exclude cut with ID _19023_210_4353_1_1531378892552_5820780_28-36615-0_sp1.1 from training. Duration: 0.8818125 2023-10-02 21:07:10,464 WARNING [train.py:1204] (2/4) Exclude cut with ID _14349_210_8341_1_1530615710818_3442470_115-285975-0 from training. Duration: 0.86 2023-10-02 21:07:10,703 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.conv_module2.balancer2.prob, batch_count=1019266.6666666666, ans=0.125 2023-10-02 21:07:12,313 WARNING [train.py:1204] (2/4) Exclude cut with ID _12734_210_8341_1_1530183614296_3891929_236-47464-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 21:07:12,360 WARNING [train.py:1204] (2/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_177-236809-0 from training. Duration: 0.49 2023-10-02 21:07:12,419 WARNING [train.py:1204] (2/4) Exclude cut with ID _13678_210_7497_1_1530530069226_3463170_371-3221-0 from training. Duration: 0.9 2023-10-02 21:07:12,432 WARNING [train.py:1204] (2/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_48-338976-0 from training. Duration: 0.89 2023-10-02 21:07:13,811 WARNING [train.py:1204] (2/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_1167-309744-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 21:07:19,090 WARNING [train.py:1204] (2/4) Exclude cut with ID _30327_210_16049_1_1532484161295_3924169_401-70819-0_sp0.9 from training. Duration: 0.9555625 2023-10-02 21:07:19,098 WARNING [train.py:1204] (2/4) Exclude cut with ID _55134_210_21982_1_1534590028377_3882170_346-91022-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 21:07:24,474 WARNING [train.py:1204] (2/4) Exclude cut with ID _14459_210_2228_1_1531443641946_7516700_174-118801-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 21:07:25,076 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.4.encoder.layers.0.whiten, num_groups=1, num_channels=384, metric=4.36 vs. limit=12.0 2023-10-02 21:07:25,814 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_269-278006-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 21:07:25,868 WARNING [train.py:1204] (2/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_390-339137-0_sp0.9 from training. Duration: 0.62225 2023-10-02 21:07:28,572 WARNING [train.py:1204] (2/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_253-308377-0_sp1.1 from training. Duration: 0.4454375 2023-10-02 21:07:28,591 WARNING [train.py:1204] (2/4) Exclude cut with ID _23825_210_13449_1_1531915313526_3497150_240-271367-0_sp1.1 from training. Duration: 0.8818125 2023-10-02 21:07:29,945 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1015-343884-0_sp0.9 from training. Duration: 0.688875 2023-10-02 21:07:32,801 WARNING [train.py:1204] (2/4) Exclude cut with ID _23514_210_1389_1_1531832139785_3031439_335-48210-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 21:07:38,711 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_309-65177-0_sp1.1 from training. Duration: 0.8 2023-10-02 21:07:38,781 WARNING [train.py:1204] (2/4) Exclude cut with ID _14368_210_4220_1_1530532714870_3595529_231-137892-0 from training. Duration: 0.99 2023-10-02 21:07:42,201 WARNING [train.py:1204] (2/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_57-270037-0 from training. Duration: 0.77 2023-10-02 21:07:42,218 WARNING [train.py:1204] (2/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_465-214726-0_sp1.1 from training. Duration: 0.7818125 2023-10-02 21:07:42,296 WARNING [train.py:1204] (2/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_106-300985-0 from training. Duration: 0.9 2023-10-02 21:07:42,300 WARNING [train.py:1204] (2/4) Exclude cut with ID _14632_210_9988_1_1530691279392_3923200_161-129530-0_sp1.1 from training. Duration: 0.87275 2023-10-02 21:07:42,320 WARNING [train.py:1204] (2/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_514-88370-0_sp1.1 from training. Duration: 0.963625 2023-10-02 21:07:45,069 WARNING [train.py:1204] (2/4) Exclude cut with ID _56495_210_4234_1_1534748080334_4136219_305-334505-0_sp1.1 from training. Duration: 0.92725 2023-10-02 21:07:46,347 WARNING [train.py:1204] (2/4) Exclude cut with ID _42875_210_14856_1_1533704397487_4012433_61-264914-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 21:07:46,668 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.feed_forward1.out_proj.dropout_p, batch_count=1019400.0, ans=0.1 2023-10-02 21:07:50,659 WARNING [train.py:1204] (2/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_91-148831-0 from training. Duration: 0.99 2023-10-02 21:07:53,812 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_435-313305-0_sp0.9 from training. Duration: 0.788875 2023-10-02 21:07:55,790 WARNING [train.py:1204] (2/4) Exclude cut with ID _16744_210_5986_1_1531724405384_3907567_242-111316-0_sp1.1 from training. Duration: 0.8454375 2023-10-02 21:07:57,106 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_257-20733-0 from training. Duration: 0.76 2023-10-02 21:07:57,160 WARNING [train.py:1204] (2/4) Exclude cut with ID _8064_210_5399_1_1528372299291_4282973_4-91037-0_sp1.1 from training. Duration: 0.8 2023-10-02 21:07:58,510 WARNING [train.py:1204] (2/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_3-276701-0 from training. Duration: 0.91 2023-10-02 21:08:01,295 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.self_attn_weights.pos_emb_skip_rate, batch_count=1019466.6666666666, ans=0.0 2023-10-02 21:08:02,313 WARNING [train.py:1204] (2/4) Exclude cut with ID _3992_210_1747_1_1526644816031_3731060_36-147855-0_sp1.1 from training. Duration: 0.7090625 2023-10-02 21:08:02,395 WARNING [train.py:1204] (2/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_1296-298671-0_sp1.1 from training. Duration: 0.963625 2023-10-02 21:08:03,833 WARNING [train.py:1204] (2/4) Exclude cut with ID _68965_210_1883_1_1535698962272_3497490_309-334593-0_sp1.1 from training. Duration: 0.92725 2023-10-02 21:08:04,101 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=1019466.6666666666, ans=0.1 2023-10-02 21:08:04,751 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.1.encoder.layers.0.conv_module1.whiten, num_groups=1, num_channels=256, metric=10.53 vs. limit=15.0 2023-10-02 21:08:05,203 WARNING [train.py:1204] (2/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_128-296581-0 from training. Duration: 0.74 2023-10-02 21:08:05,203 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_17613_210_2151_1_1531137569_2366160_170-299079-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 21:08:05,205 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6287_210_3273_1_1527509506_2653716_182-199887-0_sp1.1 from training. Duration: 0.463625 2023-10-02 21:08:06,129 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.1.encoder.layers.0.self_attn2.whiten, num_groups=1, num_channels=256, metric=13.55 vs. limit=22.5 2023-10-02 21:08:06,649 WARNING [train.py:1204] (2/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_80-135200-0_sp1.1 from training. Duration: 0.7181875 2023-10-02 21:08:09,936 WARNING [train.py:1204] (2/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_34-183685-0 from training. Duration: 0.98 2023-10-02 21:08:11,283 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8278_210_2550_1_1528637099_4220413_418-210155-0_sp1.1 from training. Duration: 0.92725 2023-10-02 21:08:11,287 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8853_210_3273_1_1528633899_818968_51-187685-0_sp0.9 from training. Duration: 0.7666875 2023-10-02 21:08:11,314 WARNING [train.py:1204] (2/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_332-221057-0_sp1.1 from training. Duration: 0.6181875 2023-10-02 21:08:11,384 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_2221_210_1988_1_1525597051_4935541_415-296882-0 from training. Duration: 0.77 2023-10-02 21:08:13,227 WARNING [train.py:1204] (2/4) Exclude cut with ID _33640_210_2655_1_1533117605479_4855469_770-278185-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 21:08:13,262 WARNING [train.py:1204] (2/4) Exclude cut with ID _5287_210_2084_1_1527418883040_3878670_333-48508-0_sp1.1 from training. Duration: 0.5454375 2023-10-02 21:08:14,577 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_43802_210_2290_1_1533516203_8829504_142-276839-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 21:08:16,012 WARNING [train.py:1204] (2/4) Exclude cut with ID _28394_210_8913_1_1532430049518_3650118_126-139371-0_sp1.1 from training. Duration: 0.92725 2023-10-02 21:08:16,024 WARNING [train.py:1204] (2/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_76-203867-0 from training. Duration: 0.72 2023-10-02 21:08:16,065 WARNING [train.py:1204] (2/4) Exclude cut with ID _6194_210_1534_1_1527511826160_3941194_14-282876-0_sp0.9 from training. Duration: 0.788875 2023-10-02 21:08:21,700 WARNING [train.py:1204] (2/4) Exclude cut with ID _35355_210_16040_1_1533212988977_4016229_74-190076-0_sp1.1 from training. Duration: 0.72725 2023-10-02 21:08:22,981 INFO [train.py:1046] (2/4) Epoch 29, batch 4200, loss[loss=0.1769, simple_loss=0.2548, pruned_loss=0.04947, over 23328.00 frames. ], tot_loss[loss=0.1644, simple_loss=0.2422, pruned_loss=0.04331, over 4717416.45 frames. ], batch size: 93, lr: 3.51e-03, grad_scale: 8.0 2023-10-02 21:08:23,105 WARNING [train.py:1204] (2/4) Exclude cut with ID _33920_210_4220_1_1533257851500_3084399_198-334063-0 from training. Duration: 0.94 2023-10-02 21:08:23,213 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6291_210_3273_1_1527683649_1078498_4-266348-0_sp0.9 from training. Duration: 0.8666875 2023-10-02 21:08:27,092 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_23856_210_4238_1_1531994785_3087934_122-38075-0_sp1.1 from training. Duration: 0.936375 2023-10-02 21:08:28,444 WARNING [train.py:1204] (2/4) Exclude cut with ID _4697_210_2069_1_1526815801953_3823098_276-96593-0_sp0.9 from training. Duration: 0.8555625 2023-10-02 21:08:28,516 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_308-278734-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 21:08:28,518 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_5190_210_4557_1_1527245078_2965646_353-250603-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 21:08:29,911 WARNING [train.py:1204] (2/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_472-216663-0 from training. Duration: 0.92 2023-10-02 21:08:33,962 WARNING [train.py:1204] (2/4) Exclude cut with ID _7537_210_2084_1_1528502395431_3924359_14-312906-0 from training. Duration: 0.84 2023-10-02 21:08:34,009 WARNING [train.py:1204] (2/4) Exclude cut with ID _29003_210_19596_1_1532507391720_3894098_559-236660-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 21:08:36,893 WARNING [train.py:1204] (2/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_312-204798-0_sp1.1 from training. Duration: 0.8454375 2023-10-02 21:08:39,130 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.out_combiner.scale_min, batch_count=1019666.6666666666, ans=0.2 2023-10-02 21:08:40,175 WARNING [train.py:1204] (2/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_17-309070-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 21:08:41,710 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=1019666.6666666666, ans=0.1 2023-10-02 21:08:43,361 WARNING [train.py:1204] (2/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_344-152526-0_sp0.9 from training. Duration: 0.588875 2023-10-02 21:08:44,768 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_41452_210_7291_1_1533437687_3941281_405-11559-0_sp1.1 from training. Duration: 0.82725 2023-10-02 21:08:44,798 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_48730_210_5399_1_1533885886_2212769_157-124052-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 21:08:46,185 WARNING [train.py:1204] (2/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_415-78792-0 from training. Duration: 0.81 2023-10-02 21:08:46,190 WARNING [train.py:1204] (2/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_345-340690-0_sp1.1 from training. Duration: 0.8454375 2023-10-02 21:08:47,581 WARNING [train.py:1204] (2/4) Exclude cut with ID _23825_210_13449_1_1531915313526_3497150_124-8138-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 21:08:47,620 WARNING [train.py:1204] (2/4) Exclude cut with ID _49258_210_7994_1_1534122017863_3738861_194-24864-0_sp1.1 from training. Duration: 0.936375 2023-10-02 21:08:47,638 WARNING [train.py:1204] (2/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_369-222060-0_sp1.1 from training. Duration: 0.6454375 2023-10-02 21:08:49,027 WARNING [train.py:1204] (2/4) Exclude cut with ID _12369_210_4381_1_1530356078581_4388520_480-13146-0_sp0.9 from training. Duration: 0.9444375 2023-10-02 21:08:51,821 WARNING [train.py:1204] (2/4) Exclude cut with ID _13253_210_1925_1_1530757775819_3577873_191-144977-0 from training. Duration: 0.78 2023-10-02 21:08:51,837 WARNING [train.py:1204] (2/4) Exclude cut with ID _30304_210_6319_1_1532476827261_7582060_1128-140266-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 21:08:54,153 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.0.layers.1.nonlin_attention.whiten2, num_groups=1, num_channels=192, metric=5.90 vs. limit=15.0 2023-10-02 21:08:55,197 WARNING [train.py:1204] (2/4) Exclude cut with ID _8772_210_1850_1_1528635595972_3664011_247-336448-0_sp0.9 from training. Duration: 0.82225 2023-10-02 21:08:56,557 WARNING [train.py:1204] (2/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_318-59097-0_sp1.1 from training. Duration: 0.6909375 2023-10-02 21:08:58,605 WARNING [train.py:1204] (2/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_540-131164-0_sp1.1 from training. Duration: 0.836375 2023-10-02 21:08:58,846 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=1019733.3333333334, ans=0.1 2023-10-02 21:08:59,140 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.3.encoder.layers.3.nonlin_attention.whiten1, num_groups=1, num_channels=384, metric=5.73 vs. limit=10.0 2023-10-02 21:08:59,923 WARNING [train.py:1204] (2/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_39-110836-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 21:09:02,790 WARNING [train.py:1204] (2/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_860-96198-0_sp1.1 from training. Duration: 0.663625 2023-10-02 21:09:02,797 WARNING [train.py:1204] (2/4) Exclude cut with ID _15133_210_6228_1_1530794512074_3080379_29-264458-0 from training. Duration: 0.58 2023-10-02 21:09:04,153 WARNING [train.py:1204] (2/4) Exclude cut with ID _55209_210_1531_1_1534675828996_4597160_797-348343-0_sp1.1 from training. Duration: 0.963625 2023-10-02 21:09:05,538 WARNING [train.py:1204] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_23-189234-0_sp1.1 from training. Duration: 0.7545625 2023-10-02 21:09:09,756 WARNING [train.py:1204] (2/4) Exclude cut with ID _5410_210_1388_1_1527397055795_1719020_39-112221-0_sp1.1 from training. Duration: 0.62725 2023-10-02 21:09:13,557 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_43802_210_2290_1_1533516203_8829504_100-348441-0_sp1.1 from training. Duration: 0.82725 2023-10-02 21:09:18,995 WARNING [train.py:1204] (2/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_143-86344-0_sp1.1 from training. Duration: 0.7 2023-10-02 21:09:21,661 WARNING [train.py:1204] (2/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_126-29104-0 from training. Duration: 0.85 2023-10-02 21:09:23,224 WARNING [train.py:1204] (2/4) Exclude cut with ID _39846_210_12651_1_1533603651992_1318030_138-31771-0_sp1.1 from training. Duration: 0.963625 2023-10-02 21:09:24,808 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.balancer1.prob, batch_count=1019866.6666666666, ans=0.125 2023-10-02 21:09:28,933 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.633e+02 1.836e+02 2.050e+02 2.312e+02 3.476e+02, threshold=4.100e+02, percent-clipped=0.0 2023-10-02 21:09:29,045 WARNING [train.py:1204] (2/4) Exclude cut with ID _2220_210_1819_1_1525509109898_3658680_50-99837-0_sp1.1 from training. Duration: 0.4909375 2023-10-02 21:09:29,100 WARNING [train.py:1204] (2/4) Exclude cut with ID _6274_210_2084_1_1527937583837_7494020_836-210550-0_sp1.1 from training. Duration: 0.97275 2023-10-02 21:09:31,137 WARNING [train.py:1204] (2/4) Exclude cut with ID _28323_210_16187_1_1532480395667_4100300_306-74561-0 from training. Duration: 0.94 2023-10-02 21:09:35,882 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.conv_skip_rate, batch_count=1019866.6666666666, ans=0.0 2023-10-02 21:09:36,793 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_177-58005-0_sp1.1 from training. Duration: 0.563625 2023-10-02 21:09:38,219 INFO [train.py:1046] (2/4) Epoch 29, batch 4250, loss[loss=0.1766, simple_loss=0.2682, pruned_loss=0.04247, over 24437.00 frames. ], tot_loss[loss=0.1643, simple_loss=0.2421, pruned_loss=0.04328, over 4707050.24 frames. ], batch size: 69, lr: 3.51e-03, grad_scale: 8.0 2023-10-02 21:09:39,594 WARNING [train.py:1204] (2/4) Exclude cut with ID _54871_210_22356_1_1534665798253_3856359_19-342632-0_sp1.1 from training. Duration: 0.82725 2023-10-02 21:09:39,616 WARNING [train.py:1204] (2/4) Exclude cut with ID _35355_210_16040_1_1533212988977_4016229_74-190076-0_sp0.9 from training. Duration: 0.888875 2023-10-02 21:09:42,609 WARNING [train.py:1204] (2/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_285-97786-0_sp1.1 from training. Duration: 0.97275 2023-10-02 21:09:47,316 WARNING [train.py:1204] (2/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_337-181502-0_sp0.9 from training. Duration: 0.72225 2023-10-02 21:09:47,366 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_353-182855-0 from training. Duration: 0.96 2023-10-02 21:09:48,577 WARNING [train.py:1204] (2/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_409-327730-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 21:09:50,143 WARNING [train.py:1204] (2/4) Exclude cut with ID _8030_210_7497_1_1528444886781_3531089_373-19403-0_sp1.1 from training. Duration: 0.97275 2023-10-02 21:09:51,632 WARNING [train.py:1204] (2/4) Exclude cut with ID _6789_210_1534_1_1527857937218_3118950_231-340237-0_sp1.1 from training. Duration: 0.936375 2023-10-02 21:09:53,572 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.conv_module1.balancer2.min_abs, batch_count=1020000.0, ans=0.5 2023-10-02 21:09:57,820 WARNING [train.py:1204] (2/4) Exclude cut with ID _6493_210_1747_1_1528459210424_4012470_536-57832-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 21:09:57,841 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7046_210_6753_1_1527895337_6920917_756-116002-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 21:09:58,051 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.conv_module1.balancer2.prob, batch_count=1020000.0, ans=0.125 2023-10-02 21:09:59,333 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_37152_210_9697_1_1533106573_3844188_29-239070-0_sp1.1 from training. Duration: 0.8909375 2023-10-02 21:09:59,336 WARNING [train.py:1204] (2/4) Exclude cut with ID _19023_210_4353_1_1531378892552_5820780_61-334539-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 21:10:01,219 WARNING [train.py:1204] (2/4) Exclude cut with ID _14170_210_5219_1_1530525949868_4469650_159-28488-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 21:10:01,355 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.bypass_mid.scale_min, batch_count=1020000.0, ans=0.2 2023-10-02 21:10:02,724 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_40896_210_15710_1_1533433956_7100802_764-71272-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 21:10:04,015 WARNING [train.py:1204] (2/4) Exclude cut with ID _44902_210_16187_1_1533888006062_3792078_258-178274-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 21:10:05,494 WARNING [train.py:1204] (2/4) Exclude cut with ID _7537_210_2084_1_1528502395431_3924359_14-312906-0_sp1.1 from training. Duration: 0.763625 2023-10-02 21:10:08,120 WARNING [train.py:1204] (2/4) Exclude cut with ID _49643_210_7989_1_1534640244224_3939031_674-116639-0_sp1.1 from training. Duration: 0.963625 2023-10-02 21:10:10,044 WARNING [train.py:1204] (2/4) Exclude cut with ID _22826_210_9994_1_1531908201522_3279169_415-161-0 from training. Duration: 0.98 2023-10-02 21:10:13,208 WARNING [train.py:1204] (2/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_264-348896-0 from training. Duration: 0.85 2023-10-02 21:10:13,222 WARNING [train.py:1204] (2/4) Exclude cut with ID _51899_210_20034_1_1534129254466_3669220_233-161492-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 21:10:14,512 WARNING [train.py:1204] (2/4) Exclude cut with ID _31255_210_13427_1_1532865619495_3815143_233-200023-0_sp1.1 from training. Duration: 0.936375 2023-10-02 21:10:14,536 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_23850_210_4238_1_1532232597_1632433_100-229879-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 21:10:15,991 WARNING [train.py:1204] (2/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_375-245677-0_sp0.9 from training. Duration: 0.9 2023-10-02 21:10:15,995 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_40223_210_6228_1_1533298404_4812267_355-317415-0_sp1.1 from training. Duration: 0.97275 2023-10-02 21:10:17,263 WARNING [train.py:1204] (2/4) Exclude cut with ID _9679_210_7497_1_1529129212906_4363929_235-81488-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 21:10:20,250 WARNING [train.py:1204] (2/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_172-215932-0_sp1.1 from training. Duration: 0.6 2023-10-02 21:10:21,669 WARNING [train.py:1204] (2/4) Exclude cut with ID _12369_210_4381_1_1530356078581_4388520_64-298917-0_sp1.1 from training. Duration: 0.8 2023-10-02 21:10:25,719 WARNING [train.py:1204] (2/4) Exclude cut with ID _38020_210_5986_1_1533112252259_7006089_131-53533-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 21:10:28,431 WARNING [train.py:1204] (2/4) Exclude cut with ID _42334_210_7770_1_1534206610396_3605880_392-48154-0_sp1.1 from training. Duration: 0.963625 2023-10-02 21:10:28,504 WARNING [train.py:1204] (2/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_337-181502-0 from training. Duration: 0.65 2023-10-02 21:10:28,513 WARNING [train.py:1204] (2/4) Exclude cut with ID _6267_210_2084_1_1527764658526_3894730_375-219887-0_sp0.9 from training. Duration: 0.7444375 2023-10-02 21:10:28,725 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.self_attn_weights.pos_emb_skip_rate, batch_count=1020133.3333333334, ans=0.0 2023-10-02 21:10:30,349 WARNING [train.py:1204] (2/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_60-146189-0 from training. Duration: 0.98 2023-10-02 21:10:30,439 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_3133_210_2087_1_1526106446_2324690_243-143119-0_sp1.1 from training. Duration: 0.7 2023-10-02 21:10:31,826 WARNING [train.py:1204] (2/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_1267-272693-0_sp0.9 from training. Duration: 0.811125 2023-10-02 21:10:35,193 WARNING [train.py:1204] (2/4) Exclude cut with ID _69884_210_9771_1_1535846392818_4166169_83-182645-0_sp1.1 from training. Duration: 0.97275 2023-10-02 21:10:35,218 WARNING [train.py:1204] (2/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_403-292260-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 21:10:37,844 WARNING [train.py:1204] (2/4) Exclude cut with ID _55429_210_10626_1_1534467588527_7174143_377-169391-0 from training. Duration: 0.96 2023-10-02 21:10:39,193 WARNING [train.py:1204] (2/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_494-199538-0_sp1.1 from training. Duration: 0.6818125 2023-10-02 21:10:40,584 WARNING [train.py:1204] (2/4) Exclude cut with ID _3067_210_2667_1_1526212974640_4503168_119-175776-0_sp1.1 from training. Duration: 0.67275 2023-10-02 21:10:43,405 WARNING [train.py:1204] (2/4) Exclude cut with ID _3231_210_2106_1_1526205864934_6784220_183-303839-0_sp1.1 from training. Duration: 0.97275 2023-10-02 21:10:47,073 WARNING [train.py:1204] (2/4) Exclude cut with ID _18513_210_9206_1_1531618215223_3862620_165-38958-0_sp1.1 from training. Duration: 0.963625 2023-10-02 21:10:48,414 WARNING [train.py:1204] (2/4) Exclude cut with ID _41314_210_12651_1_1533459476543_3766460_87-272068-0_sp1.1 from training. Duration: 0.7545625 2023-10-02 21:10:49,878 WARNING [train.py:1204] (2/4) Exclude cut with ID _3541_210_2477_1_1526475408709_3263973_353-95-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 21:10:49,989 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_2009_210_2071_1_1525481134_4588937_73-302813-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 21:10:51,350 WARNING [train.py:1204] (2/4) Exclude cut with ID _64220_210_9527_1_1535250151772_4875259_88-95011-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 21:10:52,704 INFO [train.py:1046] (2/4) Epoch 29, batch 4300, loss[loss=0.1579, simple_loss=0.2443, pruned_loss=0.03576, over 24284.00 frames. ], tot_loss[loss=0.1637, simple_loss=0.2416, pruned_loss=0.04289, over 4716092.03 frames. ], batch size: 74, lr: 3.51e-03, grad_scale: 8.0 2023-10-02 21:10:52,804 WARNING [train.py:1204] (2/4) Exclude cut with ID _44902_210_16187_1_1533888006062_3792078_160-12107-0_sp1.1 from training. Duration: 0.92725 2023-10-02 21:10:52,811 WARNING [train.py:1204] (2/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_547-5424-0 from training. Duration: 0.94 2023-10-02 21:10:55,465 WARNING [train.py:1204] (2/4) Exclude cut with ID _21783_210_11357_1_1531742458730_3762679_268-85656-0_sp1.1 from training. Duration: 0.936375 2023-10-02 21:10:56,387 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.2.encoder.layers.0.feed_forward1.out_whiten, num_groups=1, num_channels=384, metric=6.14 vs. limit=15.0 2023-10-02 21:10:58,385 WARNING [train.py:1204] (2/4) Exclude cut with ID _66210_210_12651_1_1535446812922_3365850_42-284985-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 21:10:58,408 WARNING [train.py:1204] (2/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_465-214726-0_sp0.9 from training. Duration: 0.9555625 2023-10-02 21:11:01,786 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.conv_module2.balancer2.prob, batch_count=1020266.6666666666, ans=0.125 2023-10-02 21:11:04,676 WARNING [train.py:1204] (2/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_50-95770-0_sp1.1 from training. Duration: 0.936375 2023-10-02 21:11:11,672 WARNING [train.py:1204] (2/4) Exclude cut with ID _30800_210_22382_1_1532499347904_4515959_269-77567-0_sp1.1 from training. Duration: 0.963625 2023-10-02 21:11:11,674 WARNING [train.py:1204] (2/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_7-111954-0 from training. Duration: 0.56 2023-10-02 21:11:13,068 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_43802_210_2290_1_1533516203_8829504_85-195240-0_sp0.9 from training. Duration: 0.9666875 2023-10-02 21:11:15,083 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_2822_210_2715_1_1526112586_1999587_165-36656-0_sp0.9 from training. Duration: 0.988875 2023-10-02 21:11:15,098 WARNING [train.py:1204] (2/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_181-78632-0_sp1.1 from training. Duration: 0.7090625 2023-10-02 21:11:15,117 WARNING [train.py:1204] (2/4) Exclude cut with ID IC0671W0044-331887 from training. Duration: 0.896 2023-10-02 21:11:17,684 WARNING [train.py:1204] (2/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_266-137104-0_sp1.1 from training. Duration: 0.5909375 2023-10-02 21:11:19,170 WARNING [train.py:1204] (2/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_137-116442-0_sp0.9 from training. Duration: 0.8666875 2023-10-02 21:11:22,245 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_23801_210_16223_1_1532066392_3294657_570-5439-0 from training. Duration: 0.97 2023-10-02 21:11:22,257 WARNING [train.py:1204] (2/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_29-280622-0_sp1.1 from training. Duration: 0.7454375 2023-10-02 21:11:22,277 WARNING [train.py:1204] (2/4) Exclude cut with ID 3557-8342-0013-54691-0 from training. Duration: 0.92 2023-10-02 21:11:23,711 WARNING [train.py:1204] (2/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_711-15688-0_sp1.1 from training. Duration: 0.3909375 2023-10-02 21:11:25,099 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_15126_210_6753_1_1530778677_2591185_120-277707-0_sp1.1 from training. Duration: 0.72725 2023-10-02 21:11:29,109 WARNING [train.py:1204] (2/4) Exclude cut with ID _14970_210_15709_1_1530775547361_1966965_8-169924-0_sp1.1 from training. Duration: 0.836375 2023-10-02 21:11:29,111 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_4692_210_3528_1_1526899755_4323254_107-44401-0_sp0.9 from training. Duration: 0.9555625 2023-10-02 21:11:31,124 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_3475_210_2028_1_1526193578_3622569_265-276625-0_sp0.9 from training. Duration: 0.8444375 2023-10-02 21:11:32,516 WARNING [train.py:1204] (2/4) Exclude cut with ID _55153_210_2253_1_1534384786677_3162096_148-79022-0_sp1.1 from training. Duration: 0.87275 2023-10-02 21:11:32,603 WARNING [train.py:1204] (2/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_292-36336-0_sp1.1 from training. Duration: 0.92725 2023-10-02 21:11:32,612 WARNING [train.py:1204] (2/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_329-82235-0 from training. Duration: 0.82 2023-10-02 21:11:34,609 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_11305_210_1794_1_1529750428_7178148_257-312870-0 from training. Duration: 0.88 2023-10-02 21:11:36,060 WARNING [train.py:1204] (2/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_436-4493-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 21:11:37,515 WARNING [train.py:1204] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_321-54129-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 21:11:37,523 WARNING [train.py:1204] (2/4) Exclude cut with ID _5287_210_2084_1_1527418883040_3878670_333-48508-0_sp0.9 from training. Duration: 0.6666875 2023-10-02 21:11:37,539 WARNING [train.py:1204] (2/4) Exclude cut with ID _26528_210_9070_1_1532570410842_3851310_244-278158-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 21:11:38,859 WARNING [train.py:1204] (2/4) Exclude cut with ID _46952_210_5986_1_1534230010357_3107379_232-344503-0_sp1.1 from training. Duration: 0.87275 2023-10-02 21:11:38,869 WARNING [train.py:1204] (2/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_43-206757-0 from training. Duration: 0.75 2023-10-02 21:11:38,871 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_4183_210_2087_1_1526729100_1869838_141-166600-0 from training. Duration: 0.95 2023-10-02 21:11:38,934 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_296-287678-0 from training. Duration: 0.95 2023-10-02 21:11:40,258 WARNING [train.py:1204] (2/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_265-60343-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 21:11:40,292 WARNING [train.py:1204] (2/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_332-256168-0 from training. Duration: 0.75 2023-10-02 21:11:40,327 WARNING [train.py:1204] (2/4) Exclude cut with ID _13679_210_7497_1_1530534293662_3355098_411-186088-0 from training. Duration: 0.94 2023-10-02 21:11:44,842 WARNING [train.py:1204] (2/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_458-126779-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 21:11:46,193 WARNING [train.py:1204] (2/4) Exclude cut with ID IC0803W0380-397572 from training. Duration: 0.032 2023-10-02 21:11:46,254 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_278-272612-0_sp1.1 from training. Duration: 0.663625 2023-10-02 21:11:49,488 WARNING [train.py:1204] (2/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_491-263101-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 21:11:49,493 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_53555_210_5632_1_1534595220_7232297_334-336778-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 21:11:50,967 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_50-3289-0 from training. Duration: 0.91 2023-10-02 21:11:52,280 WARNING [train.py:1204] (2/4) Exclude cut with ID _8699_210_2069_1_1528630346831_3960409_155-39766-0_sp0.9 from training. Duration: 0.8666875 2023-10-02 21:11:52,284 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_502-82145-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 21:11:53,536 WARNING [train.py:1204] (2/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_550-43132-0_sp1.1 from training. Duration: 0.97275 2023-10-02 21:11:53,567 WARNING [train.py:1204] (2/4) Exclude cut with ID _14998_210_5219_1_1530705582080_7474970_446-112117-0_sp1.1 from training. Duration: 0.8181875 2023-10-02 21:11:53,613 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_3691_210_3528_1_1526468169_4062053_355-334432-0_sp1.1 from training. Duration: 0.7909375 2023-10-02 21:11:55,071 WARNING [train.py:1204] (2/4) Exclude cut with ID _58605_210_12210_1_1534723195939_3904259_239-94720-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 21:11:56,559 WARNING [train.py:1204] (2/4) Exclude cut with ID _49376_210_5986_1_1533985278732_3370740_391-344072-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 21:11:57,716 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.617e+02 1.918e+02 2.229e+02 2.683e+02 3.939e+02, threshold=4.458e+02, percent-clipped=0.0 2023-10-02 21:11:59,793 WARNING [train.py:1204] (2/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_558-322014-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 21:11:59,839 WARNING [train.py:1204] (2/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_256-32662-0_sp1.1 from training. Duration: 0.8181875 2023-10-02 21:12:04,447 WARNING [train.py:1204] (2/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_572-185150-0 from training. Duration: 0.97 2023-10-02 21:12:05,788 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_89-35748-0_sp0.9 from training. Duration: 0.87775 2023-10-02 21:12:07,114 INFO [train.py:1046] (2/4) Epoch 29, batch 4350, loss[loss=0.2148, simple_loss=0.2776, pruned_loss=0.07598, over 19481.00 frames. ], tot_loss[loss=0.1649, simple_loss=0.2427, pruned_loss=0.04353, over 4705868.15 frames. ], batch size: 388, lr: 3.51e-03, grad_scale: 8.0 2023-10-02 21:12:08,730 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6291_210_3273_1_1527683649_1078498_88-66747-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 21:12:11,533 WARNING [train.py:1204] (2/4) Exclude cut with ID _19504_210_9994_1_1531648863242_3325220_357-237029-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 21:12:14,857 WARNING [train.py:1204] (2/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_302-2550-0_sp1.1 from training. Duration: 0.736375 2023-10-02 21:12:14,857 WARNING [train.py:1204] (2/4) Exclude cut with ID _9461_210_4557_1_1529060514726_3677451_59-194017-0_sp1.1 from training. Duration: 0.963625 2023-10-02 21:12:19,629 WARNING [train.py:1204] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_665-196155-0_sp1.1 from training. Duration: 0.6090625 2023-10-02 21:12:22,466 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_326-315007-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 21:12:26,527 WARNING [train.py:1204] (2/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_105-328174-0_sp1.1 from training. Duration: 0.6909375 2023-10-02 21:12:26,535 WARNING [train.py:1204] (2/4) Exclude cut with ID _56494_210_4220_1_1535284837625_3033169_106-263722-0_sp1.1 from training. Duration: 0.97275 2023-10-02 21:12:28,038 WARNING [train.py:1204] (2/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_770-200430-0_sp0.9 from training. Duration: 0.988875 2023-10-02 21:12:28,199 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.1.balancer2.prob, batch_count=1020666.6666666666, ans=0.125 2023-10-02 21:12:31,363 WARNING [train.py:1204] (2/4) Exclude cut with ID _65640_210_6310_1_1535368201445_3173250_209-13008-0_sp1.1 from training. Duration: 0.9 2023-10-02 21:12:32,738 WARNING [train.py:1204] (2/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_212-149947-0_sp0.9 from training. Duration: 0.811125 2023-10-02 21:12:33,060 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=1020666.6666666666, ans=0.1 2023-10-02 21:12:37,282 WARNING [train.py:1204] (2/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_188-171673-0 from training. Duration: 0.58 2023-10-02 21:12:37,471 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.conv_module2.balancer1.prob, batch_count=1020733.3333333334, ans=0.125 2023-10-02 21:12:38,609 WARNING [train.py:1204] (2/4) Exclude cut with ID _58895_210_8750_1_1535184081320_3649089_286-281724-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 21:12:39,972 WARNING [train.py:1204] (2/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_16-254909-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 21:12:43,303 WARNING [train.py:1204] (2/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_52-95655-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 21:12:44,829 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=1020733.3333333334, ans=0.1 2023-10-02 21:12:45,984 WARNING [train.py:1204] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_554-45617-0 from training. Duration: 0.99 2023-10-02 21:12:48,834 WARNING [train.py:1204] (2/4) Exclude cut with ID _14683_210_5219_1_1530698352608_3974859_473-344611-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 21:12:50,681 WARNING [train.py:1204] (2/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_17-25045-0_sp0.9 from training. Duration: 0.6555625 2023-10-02 21:12:54,886 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9072W0003-530637 from training. Duration: 0.768 2023-10-02 21:12:56,386 WARNING [train.py:1204] (2/4) Exclude cut with ID _38670_210_15379_1_1533448810659_4391059_76-74025-0_sp1.1 from training. Duration: 0.936375 2023-10-02 21:12:56,431 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6291_210_3273_1_1527683649_1078498_74-292608-0_sp0.9 from training. Duration: 0.8 2023-10-02 21:12:57,780 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9084W0025-536862 from training. Duration: 0.896 2023-10-02 21:12:57,847 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9087W0021-538392 from training. Duration: 0.768 2023-10-02 21:12:57,852 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7543_210_6753_1_1528543323_645515_56-163234-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 21:12:57,890 WARNING [train.py:1204] (2/4) Exclude cut with ID _45560_210_21982_1_1533812603951_3584999_44-332643-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 21:12:59,231 WARNING [train.py:1204] (2/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_1285-256622-0_sp1.1 from training. Duration: 0.763625 2023-10-02 21:12:59,394 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.conv_module1.balancer2.prob, batch_count=1020800.0, ans=0.125 2023-10-02 21:13:01,142 WARNING [train.py:1204] (2/4) Exclude cut with ID _52781_210_16187_1_1534297729099_1118149_141-325632-0_sp1.1 from training. Duration: 0.936375 2023-10-02 21:13:01,212 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_40896_210_15710_1_1533433956_7100802_1051-137631-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 21:13:01,254 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_13091_210_7925_1_1530609447_8987354_233-18333-0_sp1.1 from training. Duration: 0.97275 2023-10-02 21:13:05,836 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_34008_210_10431_1_1533345424_8183247_662-317331-0 from training. Duration: 0.94 2023-10-02 21:13:05,843 WARNING [train.py:1204] (2/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_291-248920-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 21:13:05,853 WARNING [train.py:1204] (2/4) Exclude cut with ID _65263_210_22356_1_1535763598691_3921037_274-288481-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 21:13:05,876 WARNING [train.py:1204] (2/4) Exclude cut with ID _5189_210_4557_1_1527248319428_3769229_233-168080-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 21:13:05,939 WARNING [train.py:1204] (2/4) Exclude cut with ID _54871_210_22356_1_1534665798253_3856359_135-184281-0 from training. Duration: 0.99 2023-10-02 21:13:07,288 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9123W0012-555972 from training. Duration: 0.896 2023-10-02 21:13:07,293 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9123W0118-556077 from training. Duration: 0.896 2023-10-02 21:13:07,309 WARNING [train.py:1204] (2/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_406-275649-0 from training. Duration: 0.42 2023-10-02 21:13:11,243 WARNING [train.py:1204] (2/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_309-13684-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 21:13:11,268 WARNING [train.py:1204] (2/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_129-337802-0_sp1.1 from training. Duration: 0.6545625 2023-10-02 21:13:11,299 WARNING [train.py:1204] (2/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_649-266825-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 21:13:12,632 WARNING [train.py:1204] (2/4) Exclude cut with ID _40519_210_13794_1_1533715228655_7337390_895-224742-0_sp1.1 from training. Duration: 0.92725 2023-10-02 21:13:14,698 WARNING [train.py:1204] (2/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_234-14174-0 from training. Duration: 0.82 2023-10-02 21:13:17,266 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9156W0009-572502 from training. Duration: 0.768 2023-10-02 21:13:17,272 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_424-245529-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 21:13:21,794 INFO [train.py:1046] (2/4) Epoch 29, batch 4400, loss[loss=0.1751, simple_loss=0.2489, pruned_loss=0.05071, over 23513.00 frames. ], tot_loss[loss=0.1653, simple_loss=0.2433, pruned_loss=0.04361, over 4718484.91 frames. ], batch size: 120, lr: 3.51e-03, grad_scale: 16.0 2023-10-02 21:13:21,829 WARNING [train.py:1204] (2/4) Exclude cut with ID _26528_210_9070_1_1532570410842_3851310_232-10149-0_sp1.1 from training. Duration: 0.963625 2023-10-02 21:13:21,837 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_17292_210_6753_1_1531125388_2943734_152-30410-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 21:13:23,301 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_35441_210_16554_1_1533347572_4092680_291-82075-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 21:13:25,974 WARNING [train.py:1204] (2/4) Exclude cut with ID _12369_210_4381_1_1530356078581_4388520_64-298917-0 from training. Duration: 0.88 2023-10-02 21:13:26,008 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_5618_210_1874_1_1527311008_5851_1-214501-0_sp0.9 from training. Duration: 0.7655625 2023-10-02 21:13:26,039 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_9460_210_4211_1_1528954587_10049667_572-37035-0 from training. Duration: 0.8 2023-10-02 21:13:27,358 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9193W0023-590322 from training. Duration: 0.896 2023-10-02 21:13:28,732 WARNING [train.py:1204] (2/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_206-10097-0_sp1.1 from training. Duration: 0.4454375 2023-10-02 21:13:28,743 WARNING [train.py:1204] (2/4) Exclude cut with ID _55134_210_21982_1_1534590028377_3882170_17-51731-0_sp1.1 from training. Duration: 0.963625 2023-10-02 21:13:30,090 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_14953_210_2151_1_1530701710_5616165_710-165400-0 from training. Duration: 0.87 2023-10-02 21:13:33,238 WARNING [train.py:1204] (2/4) Exclude cut with ID _53203_210_2087_1_1534246218309_475590_61-85777-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 21:13:34,636 WARNING [train.py:1204] (2/4) Exclude cut with ID _55844_210_2346_1_1534471212569_7625707_1050-10453-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 21:13:34,653 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9218W0013-603297 from training. Duration: 0.896 2023-10-02 21:13:37,910 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_20120_210_6753_1_1531643035_5482859_267-102818-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 21:13:37,911 WARNING [train.py:1204] (2/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_191-41023-0 from training. Duration: 0.71 2023-10-02 21:13:39,200 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9231W0202-610227 from training. Duration: 0.896 2023-10-02 21:13:41,949 WARNING [train.py:1204] (2/4) Exclude cut with ID _6537_210_1883_1_1527987559710_3597129_382-133062-0 from training. Duration: 0.68 2023-10-02 21:13:41,991 WARNING [train.py:1204] (2/4) Exclude cut with ID _28427_210_4220_1_1532581240967_3061069_43-84928-0 from training. Duration: 0.99 2023-10-02 21:13:42,032 WARNING [train.py:1204] (2/4) Exclude cut with ID _59256_210_5986_1_1535076085997_3589120_121-237386-0 from training. Duration: 0.96 2023-10-02 21:13:43,424 WARNING [train.py:1204] (2/4) Exclude cut with ID _8480_210_1874_1_1528589391205_6728330_137-256986-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 21:13:44,756 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_9417_210_1794_1_1529235261_4614131_479-346963-0_sp1.1 from training. Duration: 0.936375 2023-10-02 21:13:44,786 WARNING [train.py:1204] (2/4) Exclude cut with ID _26474_210_9570_1_1532520022699_3623380_267-7362-0_sp1.1 from training. Duration: 0.936375 2023-10-02 21:13:44,842 WARNING [train.py:1204] (2/4) Exclude cut with ID _55328_210_27767_1_1534580973260_4088170_287-61272-0_sp1.1 from training. Duration: 0.97275 2023-10-02 21:13:46,253 WARNING [train.py:1204] (2/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_589-9070-0 from training. Duration: 0.71 2023-10-02 21:13:46,260 WARNING [train.py:1204] (2/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_148-158509-0_sp1.1 from training. Duration: 0.37275 2023-10-02 21:13:47,535 WARNING [train.py:1204] (2/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_164-184548-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 21:13:50,893 WARNING [train.py:1204] (2/4) Exclude cut with ID _14970_210_15709_1_1530775547361_1966965_6-23328-0_sp1.1 from training. Duration: 0.7818125 2023-10-02 21:13:50,907 WARNING [train.py:1204] (2/4) Exclude cut with ID _29303_210_2575_1_1532392237303_7410649_612-165069-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 21:13:52,210 WARNING [train.py:1204] (2/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_383-321224-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 21:13:52,238 WARNING [train.py:1204] (2/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_283-160525-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 21:13:52,248 WARNING [train.py:1204] (2/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_505-97987-0 from training. Duration: 0.86 2023-10-02 21:13:55,256 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9309W0190-640317 from training. Duration: 0.768 2023-10-02 21:13:59,498 WARNING [train.py:1204] (2/4) Exclude cut with ID _50093_210_4220_1_1534417141207_3432299_280-297390-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 21:14:05,588 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_17292_210_6753_1_1531125388_2943734_320-71385-0_sp1.1 from training. Duration: 0.97275 2023-10-02 21:14:08,711 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_213-103282-0 from training. Duration: 0.78 2023-10-02 21:14:10,457 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.nonlin_attention.balancer.min_positive, batch_count=1021133.3333333334, ans=0.05 2023-10-02 21:14:11,631 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7615_210_4211_1_1528110965_4580160_72-235211-0_sp0.9 from training. Duration: 0.8444375 2023-10-02 21:14:11,806 WARNING [train.py:1204] (2/4) Exclude cut with ID _14867_210_1968_1_1530684179890_3846969_173-320977-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 21:14:14,697 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7046_210_6753_1_1527895337_6920917_783-278109-0_sp0.9 from training. Duration: 0.8555625 2023-10-02 21:14:15,985 WARNING [train.py:1204] (2/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_90-30673-0 from training. Duration: 0.84 2023-10-02 21:14:16,000 WARNING [train.py:1204] (2/4) Exclude cut with ID _22826_210_9994_1_1531908201522_3279169_415-161-0_sp1.1 from training. Duration: 0.8909375 2023-10-02 21:14:16,012 WARNING [train.py:1204] (2/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_86-66192-0_sp0.9 from training. Duration: 0.611125 2023-10-02 21:14:16,014 WARNING [train.py:1204] (2/4) Exclude cut with ID _4626_210_3532_1_1526817652717_3916859_446-206656-0_sp1.1 from training. Duration: 0.6909375 2023-10-02 21:14:16,078 WARNING [train.py:1204] (2/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_166-128941-0_sp0.9 from training. Duration: 0.6 2023-10-02 21:14:17,693 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.bypass.scale_min, batch_count=1021133.3333333334, ans=0.2 2023-10-02 21:14:18,861 WARNING [train.py:1204] (2/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_345-167986-0 from training. Duration: 0.84 2023-10-02 21:14:22,115 WARNING [train.py:1204] (2/4) Exclude cut with ID _6275_210_2084_1_1528110062213_7669049_973-198085-0 from training. Duration: 0.75 2023-10-02 21:14:23,505 WARNING [train.py:1204] (2/4) Exclude cut with ID _60057_210_10640_1_1534903065793_3333150_171-181133-0 from training. Duration: 0.88 2023-10-02 21:14:23,519 WARNING [train.py:1204] (2/4) Exclude cut with ID _19504_210_9994_1_1531648863242_3325220_336-196513-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 21:14:23,526 WARNING [train.py:1204] (2/4) Exclude cut with ID _6493_210_1747_1_1528459210424_4012470_513-140967-0 from training. Duration: 0.79 2023-10-02 21:14:24,742 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6673_210_3273_1_1527767790_3173240_327-306185-0_sp0.9 from training. Duration: 0.92225 2023-10-02 21:14:26,535 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.432e+02 1.930e+02 2.073e+02 2.466e+02 3.633e+02, threshold=4.147e+02, percent-clipped=0.0 2023-10-02 21:14:28,151 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1032-236863-0_sp1.1 from training. Duration: 0.763625 2023-10-02 21:14:29,638 WARNING [train.py:1204] (2/4) Exclude cut with ID _14867_210_1968_1_1530684179890_3846969_413-29292-0 from training. Duration: 0.9 2023-10-02 21:14:32,502 WARNING [train.py:1204] (2/4) Exclude cut with ID _47251_210_18628_1_1533814063128_4137250_186-343322-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 21:14:35,495 INFO [train.py:1046] (2/4) Epoch 29, batch 4450, loss[loss=0.1763, simple_loss=0.2581, pruned_loss=0.04721, over 24338.00 frames. ], tot_loss[loss=0.1658, simple_loss=0.2439, pruned_loss=0.04388, over 4719080.03 frames. ], batch size: 77, lr: 3.50e-03, grad_scale: 16.0 2023-10-02 21:14:35,621 WARNING [train.py:1204] (2/4) Exclude cut with ID _56235_210_10640_1_1534557335235_4161099_624-46091-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 21:14:37,109 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_791-197891-0_sp0.9 from training. Duration: 0.9333125 2023-10-02 21:14:41,392 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_50050_210_16223_1_1535435796_7290138_786-288457-0_sp1.1 from training. Duration: 0.92725 2023-10-02 21:14:41,416 WARNING [train.py:1204] (2/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_213-227283-0_sp1.1 from training. Duration: 0.963625 2023-10-02 21:14:46,934 WARNING [train.py:1204] (2/4) Exclude cut with ID _42659_210_2070_1_1533607442765_3120380_58-341121-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 21:14:48,382 WARNING [train.py:1204] (2/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_506-23200-0_sp1.1 from training. Duration: 0.8818125 2023-10-02 21:14:53,352 WARNING [train.py:1204] (2/4) Exclude cut with ID _14170_210_5219_1_1530525949868_4469650_336-213530-0_sp1.1 from training. Duration: 0.8181875 2023-10-02 21:14:53,379 WARNING [train.py:1204] (2/4) Exclude cut with ID _64135_210_2253_1_1535677267876_7608972_180-5312-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 21:14:54,786 WARNING [train.py:1204] (2/4) Exclude cut with ID _12302_210_5219_1_1529924446466_8107280_835-279754-0 from training. Duration: 0.9 2023-10-02 21:14:54,787 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_20120_210_6753_1_1531643035_5482859_510-96996-0_sp1.1 from training. Duration: 0.7909375 2023-10-02 21:14:54,856 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_15517_210_6753_1_1530954008_3487336_216-187834-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 21:14:54,883 WARNING [train.py:1204] (2/4) Exclude cut with ID _19504_210_9994_1_1531648863242_3325220_414-113427-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 21:14:54,884 WARNING [train.py:1204] (2/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_264-108685-0_sp0.9 from training. Duration: 0.611125 2023-10-02 21:14:57,592 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_5438_210_1794_1_1527336687_2600009_104-288274-0_sp0.9 from training. Duration: 0.6444375 2023-10-02 21:14:59,930 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.nonlin_attention.balancer.prob, batch_count=1021333.3333333334, ans=0.125 2023-10-02 21:15:02,387 WARNING [train.py:1204] (2/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_520-39496-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 21:15:02,431 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_37818_210_4238_1_1533620910_4720083_315-308565-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 21:15:04,589 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=1021400.0, ans=0.1 2023-10-02 21:15:05,622 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_2009_210_2071_1_1525481134_4588937_385-302100-0_sp1.1 from training. Duration: 0.7909375 2023-10-02 21:15:05,668 WARNING [train.py:1204] (2/4) Exclude cut with ID _58895_210_8750_1_1535184081320_3649089_277-53219-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 21:15:05,748 WARNING [train.py:1204] (2/4) Exclude cut with ID _26664_210_7770_1_1532258943280_3418270_121-111258-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 21:15:10,493 WARNING [train.py:1204] (2/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_99-167392-0_sp1.1 from training. Duration: 0.5545625 2023-10-02 21:15:10,579 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1113-7369-0 from training. Duration: 0.85 2023-10-02 21:15:10,602 WARNING [train.py:1204] (2/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_352-59958-0 from training. Duration: 0.86 2023-10-02 21:15:10,606 WARNING [train.py:1204] (2/4) Exclude cut with ID _14886_210_4220_1_1530702020355_3516190_159-73748-0_sp1.1 from training. Duration: 0.7545625 2023-10-02 21:15:14,573 WARNING [train.py:1204] (2/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_131-119569-0_sp1.1 from training. Duration: 0.92725 2023-10-02 21:15:17,213 WARNING [train.py:1204] (2/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_216-343007-0 from training. Duration: 0.66 2023-10-02 21:15:20,060 WARNING [train.py:1204] (2/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_113-92608-0_sp0.9 from training. Duration: 0.6 2023-10-02 21:15:23,569 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_40047_210_2290_1_1533290519_2374311_132-123271-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 21:15:24,770 WARNING [train.py:1204] (2/4) Exclude cut with ID _8793_210_6310_1_1528542985139_2310009_121-179782-0 from training. Duration: 0.8 2023-10-02 21:15:26,057 WARNING [train.py:1204] (2/4) Exclude cut with ID _43997_210_4525_1_1533538632555_5189329_283-218639-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 21:15:26,061 WARNING [train.py:1204] (2/4) Exclude cut with ID _49376_210_5986_1_1533985278732_3370740_297-78397-0_sp1.1 from training. Duration: 0.936375 2023-10-02 21:15:26,089 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_3475_210_2028_1_1526193578_3622569_377-166133-0_sp0.9 from training. Duration: 0.9555625 2023-10-02 21:15:26,098 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_520-213149-0_sp1.1 from training. Duration: 0.92725 2023-10-02 21:15:27,491 WARNING [train.py:1204] (2/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_567-144483-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 21:15:31,915 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_4183_210_2087_1_1526729100_1869838_143-280962-0_sp0.9 from training. Duration: 0.62225 2023-10-02 21:15:33,243 WARNING [train.py:1204] (2/4) Exclude cut with ID _49643_210_7989_1_1534640244224_3939031_611-185515-0 from training. Duration: 0.94 2023-10-02 21:15:33,566 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.ff3_skip_rate, batch_count=1021533.3333333334, ans=0.0 2023-10-02 21:15:34,607 WARNING [train.py:1204] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_88-341718-0_sp0.9 from training. Duration: 0.7333125 2023-10-02 21:15:36,014 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_51225_210_3601_1_1534400634_2327383_49-235441-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 21:15:37,971 WARNING [train.py:1204] (2/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_240-294400-0_sp1.1 from training. Duration: 0.936375 2023-10-02 21:15:39,952 WARNING [train.py:1204] (2/4) Exclude cut with ID _55429_210_10626_1_1534467588527_7174143_640-304825-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 21:15:39,989 WARNING [train.py:1204] (2/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_187-221566-0_sp1.1 from training. Duration: 0.3909375 2023-10-02 21:15:42,740 WARNING [train.py:1204] (2/4) Exclude cut with ID _7606_210_4915_1_1528894253277_5080690_127-177761-0_sp0.9 from training. Duration: 0.711125 2023-10-02 21:15:45,504 WARNING [train.py:1204] (2/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_430-116139-0 from training. Duration: 0.85 2023-10-02 21:15:46,862 WARNING [train.py:1204] (2/4) Exclude cut with ID _3675_210_4557_1_1526639934408_4182089_456-18305-0_sp0.9 from training. Duration: 0.8444375 2023-10-02 21:15:49,630 INFO [train.py:1046] (2/4) Epoch 29, batch 4500, loss[loss=0.1703, simple_loss=0.2322, pruned_loss=0.05423, over 22657.00 frames. ], tot_loss[loss=0.1656, simple_loss=0.2441, pruned_loss=0.04358, over 4722625.17 frames. ], batch size: 322, lr: 3.50e-03, grad_scale: 16.0 2023-10-02 21:15:52,392 WARNING [train.py:1204] (2/4) Exclude cut with ID _18513_210_9206_1_1531618215223_3862620_402-5957-0_sp1.1 from training. Duration: 0.9 2023-10-02 21:15:53,737 WARNING [train.py:1204] (2/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_465-156500-0 from training. Duration: 0.72 2023-10-02 21:15:53,738 WARNING [train.py:1204] (2/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_714-310538-0 from training. Duration: 0.86 2023-10-02 21:15:55,767 WARNING [train.py:1204] (2/4) Exclude cut with ID _39411_210_5986_1_1533538825030_3343649_335-301187-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 21:16:00,065 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_41750_210_1960_1_1533520678_3948042_241-158616-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 21:16:01,389 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_46013_210_7234_1_1533776362_3744032_50-141535-0_sp1.1 from training. Duration: 0.9 2023-10-02 21:16:01,452 WARNING [train.py:1204] (2/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_43-206757-0_sp0.9 from training. Duration: 0.8333125 2023-10-02 21:16:02,190 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.1.encoder.layers.1.self_attn2.whiten, num_groups=1, num_channels=256, metric=8.62 vs. limit=22.5 2023-10-02 21:16:02,685 WARNING [train.py:1204] (2/4) Exclude cut with ID _22961_210_8254_1_1531976366672_4278180_172-276296-0_sp1.1 from training. Duration: 0.97275 2023-10-02 21:16:02,715 WARNING [train.py:1204] (2/4) Exclude cut with ID _17956_210_4353_1_1531209568125_6979970_1305-301903-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 21:16:04,476 WARNING [train.py:1204] (2/4) Exclude cut with ID _28323_210_16187_1_1532480395667_4100300_604-44507-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 21:16:04,978 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.5.encoder.layers.1.feed_forward2.out_whiten, num_groups=1, num_channels=256, metric=8.84 vs. limit=15.0 2023-10-02 21:16:16,672 WARNING [train.py:1204] (2/4) Exclude cut with ID _23514_210_1389_1_1531832139785_3031439_88-337544-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 21:16:16,738 WARNING [train.py:1204] (2/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_932-3672-0_sp1.1 from training. Duration: 0.963625 2023-10-02 21:16:19,489 WARNING [train.py:1204] (2/4) Exclude cut with ID _46502_210_13794_1_1533724452198_3657159_207-109991-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 21:16:19,554 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_216-73841-0_sp1.1 from training. Duration: 0.87275 2023-10-02 21:16:20,867 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8453_210_4211_1_1528502940_8562730_347-330673-0_sp1.1 from training. Duration: 0.6545625 2023-10-02 21:16:26,611 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_4183_210_2087_1_1526729100_1869838_143-280962-0_sp1.1 from training. Duration: 0.5090625 2023-10-02 21:16:31,951 WARNING [train.py:1204] (2/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_325-113798-0_sp1.1 from training. Duration: 0.77275 2023-10-02 21:16:34,783 WARNING [train.py:1204] (2/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_91-212486-0_sp0.9 from training. Duration: 0.7555625 2023-10-02 21:16:37,959 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8776_210_2040_1_1528638837_4088036_244-278763-0_sp1.1 from training. Duration: 0.8181875 2023-10-02 21:16:38,006 WARNING [train.py:1204] (2/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_106-286130-0 from training. Duration: 0.98 2023-10-02 21:16:39,355 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_2746_210_1988_1_1525872816_3795627_372-205861-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 21:16:39,389 WARNING [train.py:1204] (2/4) Exclude cut with ID _30800_210_22382_1_1532499347904_4515959_240-54646-0_sp1.1 from training. Duration: 0.936375 2023-10-02 21:16:40,762 WARNING [train.py:1204] (2/4) Exclude cut with ID _59500_210_8254_1_1534809610680_3736009_162-180695-0_sp1.1 from training. Duration: 0.936375 2023-10-02 21:16:40,784 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7544_210_6753_1_1528591327_1471426_146-93357-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 21:16:44,091 WARNING [train.py:1204] (2/4) Exclude cut with ID _42657_210_2070_1_1533520654016_3327008_405-87432-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 21:16:44,119 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_4528_210_3273_1_1527076654_3276777_209-219804-0 from training. Duration: 0.69 2023-10-02 21:16:44,120 WARNING [train.py:1204] (2/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_78-182912-0_sp0.9 from training. Duration: 0.6555625 2023-10-02 21:16:44,129 WARNING [train.py:1204] (2/4) Exclude cut with ID _31255_210_13427_1_1532865619495_3815143_322-114998-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 21:16:50,312 WARNING [train.py:1204] (2/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_848-280957-0_sp0.9 from training. Duration: 0.9666875 2023-10-02 21:16:50,336 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7696_210_2040_1_1528207167_3716099_117-125434-0_sp0.9 from training. Duration: 0.8444375 2023-10-02 21:16:53,178 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_1002-109192-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 21:16:54,475 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.489e+02 1.911e+02 2.155e+02 2.344e+02 3.258e+02, threshold=4.309e+02, percent-clipped=0.0 2023-10-02 21:16:55,269 WARNING [train.py:1204] (2/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_138-64210-0_sp0.9 from training. Duration: 0.811125 2023-10-02 21:16:55,292 WARNING [train.py:1204] (2/4) Exclude cut with ID _25808_210_13292_1_1532343514562_7366109_633-47524-0_sp1.1 from training. Duration: 0.9 2023-10-02 21:16:56,651 WARNING [train.py:1204] (2/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_297-5233-0 from training. Duration: 0.95 2023-10-02 21:16:58,046 WARNING [train.py:1204] (2/4) Exclude cut with ID _8394_210_6702_1_1528590504316_3540189_335-274780-0 from training. Duration: 0.78 2023-10-02 21:16:58,051 WARNING [train.py:1204] (2/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_301-330455-0 from training. Duration: 0.8 2023-10-02 21:17:02,125 WARNING [train.py:1204] (2/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_383-205833-0 from training. Duration: 0.95 2023-10-02 21:17:03,821 INFO [train.py:1046] (2/4) Epoch 29, batch 4550, loss[loss=0.1556, simple_loss=0.2246, pruned_loss=0.04328, over 23739.00 frames. ], tot_loss[loss=0.1653, simple_loss=0.2434, pruned_loss=0.04363, over 4711971.91 frames. ], batch size: 232, lr: 3.50e-03, grad_scale: 16.0 2023-10-02 21:17:03,966 WARNING [train.py:1204] (2/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_48-182155-0 from training. Duration: 0.87 2023-10-02 21:17:05,282 WARNING [train.py:1204] (2/4) Exclude cut with ID _14832_210_8750_1_1530691351539_3171348_1-54315-0_sp1.1 from training. Duration: 0.7909375 2023-10-02 21:17:09,458 WARNING [train.py:1204] (2/4) Exclude cut with ID _44982_210_1511_1_1534312820108_3726029_413-100814-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 21:17:10,826 WARNING [train.py:1204] (2/4) Exclude cut with ID _3231_210_2106_1_1526205864934_6784220_104-261392-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 21:17:12,302 WARNING [train.py:1204] (2/4) Exclude cut with ID _24314_210_4915_1_1532135078116_7170179_1-13588-0_sp1.1 from training. Duration: 0.97275 2023-10-02 21:17:13,785 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.nonlin_attention.balancer.min_positive, batch_count=1021933.3333333334, ans=0.05 2023-10-02 21:17:18,305 WARNING [train.py:1204] (2/4) Exclude cut with ID _44304_210_15374_1_1533639352765_4613129_141-8356-0_sp1.1 from training. Duration: 0.963625 2023-10-02 21:17:18,432 WARNING [train.py:1204] (2/4) Exclude cut with ID _37892_210_19596_1_1533198677897_3964058_374-335034-0_sp1.1 from training. Duration: 0.936375 2023-10-02 21:17:19,731 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_195-257512-0_sp0.9 from training. Duration: 0.7444375 2023-10-02 21:17:19,733 WARNING [train.py:1204] (2/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_1267-272693-0_sp1.1 from training. Duration: 0.663625 2023-10-02 21:17:19,733 WARNING [train.py:1204] (2/4) Exclude cut with ID _30196_210_1664_1_1533708496522_7074140_690-326155-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 21:17:24,344 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_68847_210_14656_1_1536070214_2586843_38-20717-0_sp1.1 from training. Duration: 0.97275 2023-10-02 21:17:24,385 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_99-251314-0_sp1.1 from training. Duration: 0.7909375 2023-10-02 21:17:26,899 WARNING [train.py:1204] (2/4) Exclude cut with ID _49376_210_5986_1_1533985278732_3370740_225-95630-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 21:17:29,834 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_2655_210_2087_1_1525688959_3897400_202-147557-0 from training. Duration: 0.85 2023-10-02 21:17:29,896 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_5618_210_1874_1_1527311008_5851_1-214501-0_sp1.1 from training. Duration: 0.626375 2023-10-02 21:17:31,285 WARNING [train.py:1204] (2/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_225-43381-0_sp1.1 from training. Duration: 0.8545625 2023-10-02 21:17:34,032 WARNING [train.py:1204] (2/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_242-3472-0 from training. Duration: 0.73 2023-10-02 21:17:35,890 WARNING [train.py:1204] (2/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_339-104791-0 from training. Duration: 0.49 2023-10-02 21:17:37,256 WARNING [train.py:1204] (2/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_488-92261-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 21:17:40,115 WARNING [train.py:1204] (2/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_175-163894-0 from training. Duration: 0.74 2023-10-02 21:17:41,572 WARNING [train.py:1204] (2/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_41-234353-0_sp0.9 from training. Duration: 0.7555625 2023-10-02 21:17:44,859 WARNING [train.py:1204] (2/4) Exclude cut with ID _47251_210_18628_1_1533814063128_4137250_116-275927-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 21:17:44,886 WARNING [train.py:1204] (2/4) Exclude cut with ID _4697_210_2069_1_1526815801953_3823098_61-223324-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 21:17:46,577 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6961_210_2087_1_1527937168_6815011_562-165518-0_sp1.1 from training. Duration: 0.7 2023-10-02 21:17:47,895 WARNING [train.py:1204] (2/4) Exclude cut with ID _21989_210_5854_1_1531899214931_3271360_133-44759-0 from training. Duration: 0.77 2023-10-02 21:17:49,379 WARNING [train.py:1204] (2/4) Exclude cut with ID _23557_210_8750_1_1531890106370_6846255_77-284923-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 21:17:52,249 WARNING [train.py:1204] (2/4) Exclude cut with ID _13678_210_7497_1_1530530069226_3463170_464-296127-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 21:17:52,262 WARNING [train.py:1204] (2/4) Exclude cut with ID _22613_210_5219_1_1532253599686_4090170_393-36663-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 21:17:52,359 WARNING [train.py:1204] (2/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_235-262124-0_sp0.9 from training. Duration: 0.7444375 2023-10-02 21:17:53,215 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.feed_forward2.hidden_balancer.prob, batch_count=1022133.3333333334, ans=0.125 2023-10-02 21:17:54,118 WARNING [train.py:1204] (2/4) Exclude cut with ID _8480_210_1874_1_1528589391205_6728330_755-112625-0 from training. Duration: 0.74 2023-10-02 21:17:54,177 WARNING [train.py:1204] (2/4) Exclude cut with ID _6456_210_1534_1_1527681634212_3181049_215-309359-0 from training. Duration: 0.88 2023-10-02 21:17:54,199 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_3019_210_2028_1_1526044245_3264865_191-72235-0_sp1.1 from training. Duration: 0.8818125 2023-10-02 21:17:55,519 WARNING [train.py:1204] (2/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_380-162648-0 from training. Duration: 0.94 2023-10-02 21:17:56,938 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_92-46009-0 from training. Duration: 0.54 2023-10-02 21:17:58,210 WARNING [train.py:1204] (2/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_53-300544-0_sp0.9 from training. Duration: 0.7444375 2023-10-02 21:17:58,326 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_68235_210_1925_1_1535863247_4484656_101-250943-0_sp1.1 from training. Duration: 0.97275 2023-10-02 21:17:59,589 WARNING [train.py:1204] (2/4) Exclude cut with ID _14970_210_15709_1_1530775547361_1966965_204-22182-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 21:17:59,669 WARNING [train.py:1204] (2/4) Exclude cut with ID _55153_210_2253_1_1534384786677_3162096_310-130092-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 21:17:59,684 WARNING [train.py:1204] (2/4) Exclude cut with ID _60113_210_16271_1_1535164212481_4386079_559-167191-0_sp1.1 from training. Duration: 0.8454375 2023-10-02 21:18:01,048 WARNING [train.py:1204] (2/4) Exclude cut with ID _14368_210_4220_1_1530532714870_3595529_93-58002-0_sp1.1 from training. Duration: 0.5181875 2023-10-02 21:18:02,340 WARNING [train.py:1204] (2/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_110-224688-0 from training. Duration: 0.97 2023-10-02 21:18:05,519 WARNING [train.py:1204] (2/4) Exclude cut with ID _40402_210_18219_1_1533522324115_2953150_178-178004-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 21:18:05,528 WARNING [train.py:1204] (2/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_321-335475-0_sp1.1 from training. Duration: 0.4090625 2023-10-02 21:18:05,599 WARNING [train.py:1204] (2/4) Exclude cut with ID _66863_210_18053_1_1535698758334_8590319_1092-271784-0 from training. Duration: 0.96 2023-10-02 21:18:05,605 WARNING [train.py:1204] (2/4) Exclude cut with ID _28317_210_12742_1_1532480302381_8510325_607-213951-0_sp1.1 from training. Duration: 0.763625 2023-10-02 21:18:05,613 WARNING [train.py:1204] (2/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_468-283113-0 from training. Duration: 0.96 2023-10-02 21:18:07,341 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.ff3_skip_rate, batch_count=1022200.0, ans=0.0 2023-10-02 21:18:08,374 WARNING [train.py:1204] (2/4) Exclude cut with ID _14922_210_6228_1_1530707435273_3693310_13-165512-0_sp1.1 from training. Duration: 0.7454375 2023-10-02 21:18:08,394 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_69630_210_7234_1_1535788670_910989_56-169762-0_sp1.1 from training. Duration: 0.92725 2023-10-02 21:18:10,425 WARNING [train.py:1204] (2/4) Exclude cut with ID _67335_210_5219_1_1535525975652_4275229_121-80357-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 21:18:11,777 WARNING [train.py:1204] (2/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_126-12079-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 21:18:11,820 WARNING [train.py:1204] (2/4) Exclude cut with ID _5294_210_1747_1_1527249643928_3696179_342-270278-0_sp1.1 from training. Duration: 0.5 2023-10-02 21:18:14,564 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_30322_210_1925_1_1532843478_826817_35-328264-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 21:18:16,565 WARNING [train.py:1204] (2/4) Exclude cut with ID _8480_210_1874_1_1528589391205_6728330_755-112625-0_sp1.1 from training. Duration: 0.67275 2023-10-02 21:18:17,923 INFO [train.py:1046] (2/4) Epoch 29, batch 4600, loss[loss=0.1436, simple_loss=0.2211, pruned_loss=0.03306, over 24301.00 frames. ], tot_loss[loss=0.164, simple_loss=0.2415, pruned_loss=0.04324, over 4689805.26 frames. ], batch size: 56, lr: 3.50e-03, grad_scale: 16.0 2023-10-02 21:18:19,421 WARNING [train.py:1204] (2/4) Exclude cut with ID _38670_210_15379_1_1533448810659_4391059_132-146412-0_sp1.1 from training. Duration: 0.97275 2023-10-02 21:18:19,503 WARNING [train.py:1204] (2/4) Exclude cut with ID _22020_210_2461_1_1531724164513_6326900_563-39691-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 21:18:22,249 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_9460_210_4211_1_1528954587_10049667_572-37035-0_sp1.1 from training. Duration: 0.72725 2023-10-02 21:18:22,265 WARNING [train.py:1204] (2/4) Exclude cut with ID _68777_210_4353_1_1535810390731_3117904_129-166012-0_sp0.9 from training. Duration: 0.9444375 2023-10-02 21:18:23,640 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.0.layers.1.whiten, num_groups=1, num_channels=192, metric=3.74 vs. limit=12.0 2023-10-02 21:18:23,939 WARNING [train.py:1204] (2/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_487-91921-0_sp1.1 from training. Duration: 0.963625 2023-10-02 21:18:24,045 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_195-257512-0 from training. Duration: 0.67 2023-10-02 21:18:25,611 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.conv_module2.balancer2.prob, batch_count=1022266.6666666666, ans=0.125 2023-10-02 21:18:26,670 WARNING [train.py:1204] (2/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_188-260011-0_sp1.1 from training. Duration: 0.663625 2023-10-02 21:18:30,689 WARNING [train.py:1204] (2/4) Exclude cut with ID _8480_210_1874_1_1528589391205_6728330_309-139896-0_sp0.9 from training. Duration: 0.9 2023-10-02 21:18:32,077 WARNING [train.py:1204] (2/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_866-321248-0_sp1.1 from training. Duration: 0.963625 2023-10-02 21:18:34,812 WARNING [train.py:1204] (2/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_510-216513-0_sp1.1 from training. Duration: 0.97275 2023-10-02 21:18:42,379 WARNING [train.py:1204] (2/4) Exclude cut with ID _6653_210_2629_1_1527919172441_6732679_758-143677-0 from training. Duration: 0.98 2023-10-02 21:18:44,218 WARNING [train.py:1204] (2/4) Exclude cut with ID _55134_210_21982_1_1534590028377_3882170_50-214427-0_sp1.1 from training. Duration: 0.97275 2023-10-02 21:18:45,696 WARNING [train.py:1204] (2/4) Exclude cut with ID _31649_210_6228_1_1532584608063_4091078_482-301330-0_sp1.1 from training. Duration: 0.97275 2023-10-02 21:18:49,055 WARNING [train.py:1204] (2/4) Exclude cut with ID _35799_210_5854_1_1533263487887_3529071_490-326847-0_sp1.1 from training. Duration: 0.9 2023-10-02 21:18:49,064 WARNING [train.py:1204] (2/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_984-90716-0_sp1.1 from training. Duration: 0.963625 2023-10-02 21:18:53,306 WARNING [train.py:1204] (2/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_166-246557-0 from training. Duration: 0.64 2023-10-02 21:18:53,306 WARNING [train.py:1204] (2/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_51-200220-0_sp1.1 from training. Duration: 0.5090625 2023-10-02 21:18:54,622 WARNING [train.py:1204] (2/4) Exclude cut with ID _51382_210_23862_1_1534492801838_3650104_275-92796-0_sp1.1 from training. Duration: 0.936375 2023-10-02 21:18:55,494 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.feed_forward2.hidden_balancer.prob, batch_count=1022400.0, ans=0.125 2023-10-02 21:18:55,502 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.conv_skip_rate, batch_count=1022400.0, ans=0.0 2023-10-02 21:18:59,473 WARNING [train.py:1204] (2/4) Exclude cut with ID _29853_210_7497_1_1532505600851_6642110_461-142829-0_sp1.1 from training. Duration: 0.97275 2023-10-02 21:19:00,756 WARNING [train.py:1204] (2/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_788-298907-0_sp0.9 from training. Duration: 0.988875 2023-10-02 21:19:02,188 WARNING [train.py:1204] (2/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_739-2098-0_sp1.1 from training. Duration: 0.836375 2023-10-02 21:19:06,262 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7543_210_6753_1_1528541907_1399322_165-323619-0 from training. Duration: 0.69 2023-10-02 21:19:07,595 WARNING [train.py:1204] (2/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_401-255491-0_sp1.1 from training. Duration: 0.636375 2023-10-02 21:19:11,206 WARNING [train.py:1204] (2/4) Exclude cut with ID _66873_210_5517_1_1535454136506_4293370_24-143283-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 21:19:12,495 WARNING [train.py:1204] (2/4) Exclude cut with ID _14459_210_2228_1_1531443641946_7516700_1221-280439-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 21:19:17,147 WARNING [train.py:1204] (2/4) Exclude cut with ID _59202_210_26984_1_1534816810612_3549174_469-213482-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 21:19:17,147 WARNING [train.py:1204] (2/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_148-158509-0_sp0.9 from training. Duration: 0.4555625 2023-10-02 21:19:17,184 WARNING [train.py:1204] (2/4) Exclude cut with ID _24314_210_4915_1_1532135078116_7170179_863-259095-0_sp1.1 from training. Duration: 0.97275 2023-10-02 21:19:17,256 WARNING [train.py:1204] (2/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_321-22821-0 from training. Duration: 0.89 2023-10-02 21:19:17,275 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_43831_210_2158_1_1533725502_5872791_55-163137-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 21:19:19,169 WARNING [train.py:1204] (2/4) Exclude cut with ID _27430_210_7770_1_1532604689375_1378590_26-25207-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 21:19:20,594 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_17613_210_2151_1_1531137569_2366160_223-316461-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 21:19:20,671 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_53679_210_4852_1_1534556726_4186141_541-213370-0_sp1.1 from training. Duration: 0.963625 2023-10-02 21:19:21,998 WARNING [train.py:1204] (2/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_369-295086-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 21:19:23,159 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.672e+02 1.896e+02 2.092e+02 2.608e+02 4.694e+02, threshold=4.185e+02, percent-clipped=1.0 2023-10-02 21:19:23,245 WARNING [train.py:1204] (2/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_91-267696-0 from training. Duration: 0.87 2023-10-02 21:19:23,303 WARNING [train.py:1204] (2/4) Exclude cut with ID _5294_210_1747_1_1527249643928_3696179_180-100713-0 from training. Duration: 0.81 2023-10-02 21:19:23,332 WARNING [train.py:1204] (2/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_32-335103-0 from training. Duration: 0.82 2023-10-02 21:19:23,338 WARNING [train.py:1204] (2/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_133-312727-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 21:19:24,694 WARNING [train.py:1204] (2/4) Exclude cut with ID _59288_210_5854_1_1535077757774_3412246_391-165934-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 21:19:26,547 WARNING [train.py:1204] (2/4) Exclude cut with ID _28323_210_16187_1_1532480395667_4100300_488-1281-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 21:19:26,810 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.conv_module2.balancer1.prob, batch_count=1022533.3333333334, ans=0.125 2023-10-02 21:19:27,858 WARNING [train.py:1204] (2/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_124-193207-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 21:19:30,153 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.1.encoder.layers.0.nonlin_attention.whiten1, num_groups=1, num_channels=192, metric=4.48 vs. limit=10.0 2023-10-02 21:19:31,862 INFO [train.py:1046] (2/4) Epoch 29, batch 4650, loss[loss=0.1502, simple_loss=0.2256, pruned_loss=0.0374, over 24415.00 frames. ], tot_loss[loss=0.1634, simple_loss=0.2405, pruned_loss=0.04318, over 4696537.54 frames. ], batch size: 58, lr: 3.50e-03, grad_scale: 16.0 2023-10-02 21:19:38,778 WARNING [train.py:1204] (2/4) Exclude cut with ID _14087_210_2253_1_1530529224512_3014029_226-242140-0_sp0.9 from training. Duration: 0.9 2023-10-02 21:19:40,242 WARNING [train.py:1204] (2/4) Exclude cut with ID _29277_210_3279_1_1532581109293_3173069_392-3694-0_sp1.1 from training. Duration: 0.936375 2023-10-02 21:19:41,582 WARNING [train.py:1204] (2/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_563-329842-0_sp1.1 from training. Duration: 0.97275 2023-10-02 21:19:41,634 WARNING [train.py:1204] (2/4) Exclude cut with ID _58583_210_5236_1_1534726835266_3574249_391-87755-0_sp1.1 from training. Duration: 0.92725 2023-10-02 21:19:41,864 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.self_attn_weights.pos_emb_skip_rate, batch_count=1022600.0, ans=0.0 2023-10-02 21:19:43,533 WARNING [train.py:1204] (2/4) Exclude cut with ID _28427_210_4220_1_1532581240967_3061069_281-237827-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 21:19:43,547 WARNING [train.py:1204] (2/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_44-147898-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 21:19:43,633 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_24007_210_1794_1_1531918925_1663875_125-320772-0_sp1.1 from training. Duration: 0.97275 2023-10-02 21:19:46,859 WARNING [train.py:1204] (2/4) Exclude cut with ID _14368_210_4220_1_1530532714870_3595529_143-117421-0 from training. Duration: 0.58 2023-10-02 21:19:49,744 WARNING [train.py:1204] (2/4) Exclude cut with ID _69584_210_5219_1_1535785133775_4044450_325-7656-0_sp0.9 from training. Duration: 0.9555625 2023-10-02 21:19:51,775 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_37351_210_7925_1_1533012931_3812113_265-85780-0 from training. Duration: 0.88 2023-10-02 21:19:51,821 WARNING [train.py:1204] (2/4) Exclude cut with ID _18516_210_9206_1_1531704653206_3692406_331-237896-0_sp1.1 from training. Duration: 0.936375 2023-10-02 21:19:53,085 WARNING [train.py:1204] (2/4) Exclude cut with ID _14629_210_15709_1_1530604298182_956899_6-232695-0 from training. Duration: 0.94 2023-10-02 21:19:53,113 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7544_210_6753_1_1528587000_691468_87-178599-0_sp1.1 from training. Duration: 0.7909375 2023-10-02 21:19:54,557 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_326-303945-0 from training. Duration: 0.99 2023-10-02 21:19:54,580 WARNING [train.py:1204] (2/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_503-30719-0 from training. Duration: 0.98 2023-10-02 21:19:54,588 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_11305_210_1794_1_1529750428_7178148_299-344640-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 21:19:54,647 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_53071_210_16554_1_1534490405_2954066_265-170472-0_sp1.1 from training. Duration: 0.7545625 2023-10-02 21:19:57,892 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_174-277364-0_sp1.1 from training. Duration: 0.6454375 2023-10-02 21:19:59,260 WARNING [train.py:1204] (2/4) Exclude cut with ID _58414_210_8254_1_1535421609162_7151182_193-151884-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 21:20:00,557 WARNING [train.py:1204] (2/4) Exclude cut with ID IC0651W0104-322063 from training. Duration: 0.768 2023-10-02 21:20:02,018 WARNING [train.py:1204] (2/4) Exclude cut with ID _21881_210_3525_1_1531825242757_3798380_139-305982-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 21:20:03,385 WARNING [train.py:1204] (2/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_79-200392-0 from training. Duration: 0.97 2023-10-02 21:20:07,443 WARNING [train.py:1204] (2/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_451-280894-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 21:20:07,459 WARNING [train.py:1204] (2/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_304-273433-0_sp1.1 from training. Duration: 0.9 2023-10-02 21:20:07,533 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_5959_210_1794_1_1527400400_3445726_412-44052-0 from training. Duration: 0.77 2023-10-02 21:20:08,952 WARNING [train.py:1204] (2/4) Exclude cut with ID _4681_210_3525_1_1527251040321_1235713_148-26159-0_sp1.1 from training. Duration: 0.963625 2023-10-02 21:20:11,691 WARNING [train.py:1204] (2/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_887-249555-0_sp0.9 from training. Duration: 0.8555625 2023-10-02 21:20:14,421 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_25236_210_2158_1_1532051968_280567_37-6860-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 21:20:18,047 WARNING [train.py:1204] (2/4) Exclude cut with ID _29277_210_3279_1_1532581109293_3173069_417-115527-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 21:20:21,113 WARNING [train.py:1204] (2/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_552-134748-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 21:20:22,412 WARNING [train.py:1204] (2/4) Exclude cut with ID _43176_210_8254_1_1533524417469_4174339_188-174143-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 21:20:22,444 WARNING [train.py:1204] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_127-119109-0_sp1.1 from training. Duration: 0.6545625 2023-10-02 21:20:26,546 WARNING [train.py:1204] (2/4) Exclude cut with ID _14922_210_6228_1_1530707435273_3693310_38-187851-0 from training. Duration: 0.62 2023-10-02 21:20:26,577 WARNING [train.py:1204] (2/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_588-125165-0 from training. Duration: 0.83 2023-10-02 21:20:27,310 WARNING [train.py:1204] (2/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_344-152526-0_sp1.1 from training. Duration: 0.4818125 2023-10-02 21:20:27,311 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7002_210_1794_1_1527935634_4034444_487-236501-0 from training. Duration: 0.66 2023-10-02 21:20:29,954 WARNING [train.py:1204] (2/4) Exclude cut with ID _23825_210_13449_1_1531915313526_3497150_379-16981-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 21:20:35,637 WARNING [train.py:1204] (2/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_297-5233-0_sp1.1 from training. Duration: 0.863625 2023-10-02 21:20:36,894 WARNING [train.py:1204] (2/4) Exclude cut with ID _41298_210_9019_1_1533430371450_4822143_178-283538-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 21:20:36,925 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7046_210_6753_1_1527895337_6920917_642-106799-0 from training. Duration: 0.92 2023-10-02 21:20:36,940 WARNING [train.py:1204] (2/4) Exclude cut with ID _6274_210_2084_1_1527937583837_7494020_532-291952-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 21:20:38,423 WARNING [train.py:1204] (2/4) Exclude cut with ID _47278_210_12041_1_1533902418349_3576110_260-270468-0_sp1.1 from training. Duration: 0.92725 2023-10-02 21:20:38,437 WARNING [train.py:1204] (2/4) Exclude cut with ID _14368_210_4220_1_1530532714870_3595529_144-3287-0_sp0.9 from training. Duration: 0.7444375 2023-10-02 21:20:39,801 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_162-262717-0_sp0.9 from training. Duration: 0.92225 2023-10-02 21:20:42,567 WARNING [train.py:1204] (2/4) Exclude cut with ID _3465_210_3525_1_1526175667060_3574049_62-335552-0_sp0.9 from training. Duration: 0.9666875 2023-10-02 21:20:42,572 WARNING [train.py:1204] (2/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_726-272852-0_sp1.1 from training. Duration: 0.92725 2023-10-02 21:20:43,913 WARNING [train.py:1204] (2/4) Exclude cut with ID _14898_210_9206_1_1530766812377_4177190_26-135492-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 21:20:45,730 INFO [train.py:1046] (2/4) Epoch 29, batch 4700, loss[loss=0.149, simple_loss=0.2373, pruned_loss=0.03039, over 24483.00 frames. ], tot_loss[loss=0.1638, simple_loss=0.2414, pruned_loss=0.04312, over 4708861.96 frames. ], batch size: 63, lr: 3.50e-03, grad_scale: 16.0 2023-10-02 21:20:47,217 WARNING [train.py:1204] (2/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_200-95304-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 21:20:49,095 WARNING [train.py:1204] (2/4) Exclude cut with ID _45012_210_12651_1_1533608435136_5371380_240-37101-0_sp1.1 from training. Duration: 0.8454375 2023-10-02 21:20:49,103 WARNING [train.py:1204] (2/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_132-269633-0_sp0.9 from training. Duration: 0.7555625 2023-10-02 21:20:50,471 WARNING [train.py:1204] (2/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_907-151398-0_sp0.9 from training. Duration: 0.411125 2023-10-02 21:20:51,884 WARNING [train.py:1204] (2/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_45-265684-0_sp0.9 from training. Duration: 0.8 2023-10-02 21:20:51,964 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6291_210_3273_1_1527683649_1078498_74-292608-0 from training. Duration: 0.72 2023-10-02 21:21:00,025 WARNING [train.py:1204] (2/4) Exclude cut with ID _59202_210_26984_1_1534816810612_3549174_221-84819-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 21:21:00,085 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_33696_210_3852_1_1533082780_2663375_181-325595-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 21:21:01,292 WARNING [train.py:1204] (2/4) Exclude cut with ID _47251_210_18628_1_1533814063128_4137250_351-154105-0_sp1.1 from training. Duration: 0.963625 2023-10-02 21:21:01,382 WARNING [train.py:1204] (2/4) Exclude cut with ID _35501_210_9288_1_1532998507986_3192189_351-334740-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 21:21:02,799 WARNING [train.py:1204] (2/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_342-100157-0_sp1.1 from training. Duration: 0.4909375 2023-10-02 21:21:07,097 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.feed_forward3.hidden_balancer.prob, batch_count=1023000.0, ans=0.125 2023-10-02 21:21:08,306 WARNING [train.py:1204] (2/4) Exclude cut with ID _6789_210_1534_1_1527857937218_3118950_262-48853-0 from training. Duration: 0.56 2023-10-02 21:21:08,331 WARNING [train.py:1204] (2/4) Exclude cut with ID _69884_210_9771_1_1535846392818_4166169_430-201020-0 from training. Duration: 0.97 2023-10-02 21:21:11,051 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_71-136518-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 21:21:11,134 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_28-291982-0_sp1.1 from training. Duration: 0.8818125 2023-10-02 21:21:12,528 WARNING [train.py:1204] (2/4) Exclude cut with ID _54218_210_12210_1_1534413646388_4409259_437-275326-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 21:21:12,782 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.self_attn_weights.pos_emb_skip_rate, batch_count=1023000.0, ans=0.0 2023-10-02 21:21:14,107 WARNING [train.py:1204] (2/4) Exclude cut with ID _55134_210_21982_1_1534590028377_3882170_40-309139-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 21:21:20,275 WARNING [train.py:1204] (2/4) Exclude cut with ID _32704_210_8954_1_1533017488840_6747370_266-329755-0_sp1.1 from training. Duration: 0.8090625 2023-10-02 21:21:21,599 WARNING [train.py:1204] (2/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_21-174514-0_sp1.1 from training. Duration: 0.3909375 2023-10-02 21:21:25,106 WARNING [train.py:1204] (2/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_409-207378-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 21:21:29,334 WARNING [train.py:1204] (2/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_132-269633-0 from training. Duration: 0.68 2023-10-02 21:21:29,426 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_2245_210_2715_1_1525511548_3540254_310-52483-0_sp0.9 from training. Duration: 0.97775 2023-10-02 21:21:33,935 WARNING [train.py:1204] (2/4) Exclude cut with ID _27430_210_7770_1_1532604689375_1378590_47-77403-0_sp1.1 from training. Duration: 0.92725 2023-10-02 21:21:36,632 WARNING [train.py:1204] (2/4) Exclude cut with ID _7562_210_2084_1_1528887767916_3946229_186-326422-0 from training. Duration: 0.84 2023-10-02 21:21:38,007 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_14056_210_4163_1_1530604483_7572827_230-35993-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 21:21:43,447 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_23854_210_1794_1_1531969696_1329157_57-123491-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 21:21:43,500 WARNING [train.py:1204] (2/4) Exclude cut with ID _14747_210_8341_1_1530685797463_3654089_147-131229-0 from training. Duration: 0.76 2023-10-02 21:21:44,915 WARNING [train.py:1204] (2/4) Exclude cut with ID _30191_210_1664_1_1533189915047_7196670_430-19539-0_sp1.1 from training. Duration: 0.92725 2023-10-02 21:21:44,932 WARNING [train.py:1204] (2/4) Exclude cut with ID _67725_210_27767_1_1535608756671_5105359_44-50762-0_sp1.1 from training. Duration: 0.963625 2023-10-02 21:21:48,388 WARNING [train.py:1204] (2/4) Exclude cut with ID _27485_210_4916_1_1532739191677_3668269_217-161755-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 21:21:48,435 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_5931_210_2040_1_1527428481_4547999_321-193614-0_sp1.1 from training. Duration: 0.6090625 2023-10-02 21:21:48,452 WARNING [train.py:1204] (2/4) Exclude cut with ID _6267_210_2084_1_1527764658526_3894730_375-219887-0 from training. Duration: 0.67 2023-10-02 21:21:50,434 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9087W0022-538393 from training. Duration: 0.768 2023-10-02 21:21:51,669 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.637e+02 1.909e+02 2.162e+02 2.551e+02 3.062e+02, threshold=4.324e+02, percent-clipped=0.0 2023-10-02 21:21:51,795 WARNING [train.py:1204] (2/4) Exclude cut with ID _68777_210_4353_1_1535810390731_3117904_83-196459-0_sp1.1 from training. Duration: 0.963625 2023-10-02 21:21:54,381 WARNING [train.py:1204] (2/4) Exclude cut with ID _69884_210_9771_1_1535846392818_4166169_455-343812-0_sp1.1 from training. Duration: 0.92725 2023-10-02 21:21:54,382 WARNING [train.py:1204] (2/4) Exclude cut with ID _13916_210_9994_1_1530788341126_3763149_414-320174-0_sp1.1 from training. Duration: 0.92725 2023-10-02 21:21:54,386 WARNING [train.py:1204] (2/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_398-239819-0 from training. Duration: 0.99 2023-10-02 21:21:54,475 WARNING [train.py:1204] (2/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_199-232314-0_sp1.1 from training. Duration: 0.92725 2023-10-02 21:21:59,140 WARNING [train.py:1204] (2/4) Exclude cut with ID _2334_210_2667_1_1525608238880_4705129_459-104997-0 from training. Duration: 0.95 2023-10-02 21:22:00,436 INFO [train.py:1046] (2/4) Epoch 29, batch 4750, loss[loss=0.1683, simple_loss=0.2552, pruned_loss=0.0407, over 24014.00 frames. ], tot_loss[loss=0.1639, simple_loss=0.2419, pruned_loss=0.04289, over 4708954.39 frames. ], batch size: 80, lr: 3.50e-03, grad_scale: 16.0 2023-10-02 21:22:01,709 WARNING [train.py:1204] (2/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_441-209395-0_sp1.1 from training. Duration: 0.936375 2023-10-02 21:22:01,788 WARNING [train.py:1204] (2/4) Exclude cut with ID _27052_210_5986_1_1532329197187_3585009_320-80393-0_sp1.1 from training. Duration: 0.97275 2023-10-02 21:22:06,407 WARNING [train.py:1204] (2/4) Exclude cut with ID _21783_210_11357_1_1531742458730_3762679_89-217253-0_sp1.1 from training. Duration: 0.97275 2023-10-02 21:22:06,424 WARNING [train.py:1204] (2/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_453-282588-0_sp1.1 from training. Duration: 0.8909375 2023-10-02 21:22:07,837 WARNING [train.py:1204] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_88-341718-0 from training. Duration: 0.66 2023-10-02 21:22:09,160 WARNING [train.py:1204] (2/4) Exclude cut with ID _43552_210_8913_1_1533726031059_3523429_230-3521-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 21:22:11,889 WARNING [train.py:1204] (2/4) Exclude cut with ID _9679_210_7497_1_1529129212906_4363929_512-313889-0 from training. Duration: 0.81 2023-10-02 21:22:13,356 WARNING [train.py:1204] (2/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_194-252800-0_sp1.1 from training. Duration: 0.8818125 2023-10-02 21:22:14,549 WARNING [train.py:1204] (2/4) Exclude cut with ID _55209_210_1531_1_1534675828996_4597160_239-293541-0_sp1.1 from training. Duration: 0.963625 2023-10-02 21:22:14,600 WARNING [train.py:1204] (2/4) Exclude cut with ID _35799_210_5854_1_1533263487887_3529071_15-325131-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 21:22:20,466 WARNING [train.py:1204] (2/4) Exclude cut with ID _14100_210_8341_1_1530529320613_3678069_279-55760-0 from training. Duration: 0.93 2023-10-02 21:22:25,729 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_489-230703-0_sp1.1 from training. Duration: 0.77275 2023-10-02 21:22:27,172 WARNING [train.py:1204] (2/4) Exclude cut with ID _6694_210_4599_1_1527852637490_3965529_144-185051-0 from training. Duration: 0.96 2023-10-02 21:22:27,776 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.2.encoder.layers.2.self_attn2.whiten, num_groups=1, num_channels=384, metric=44.44 vs. limit=22.5 2023-10-02 21:22:28,490 WARNING [train.py:1204] (2/4) Exclude cut with ID _28461_210_2253_1_1532771775189_3041299_1-228155-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 21:22:32,531 WARNING [train.py:1204] (2/4) Exclude cut with ID _32704_210_8954_1_1533017488840_6747370_686-252323-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 21:22:32,533 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_40896_210_15710_1_1533433956_7100802_1075-294633-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 21:22:32,553 WARNING [train.py:1204] (2/4) Exclude cut with ID _22083_210_8254_1_1531702862439_3678970_30-92879-0_sp1.1 from training. Duration: 0.97275 2023-10-02 21:22:33,899 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9245W0121-616888 from training. Duration: 0.896 2023-10-02 21:22:33,901 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6786_210_1388_1_1527839699_4194495_378-46070-0 from training. Duration: 0.72 2023-10-02 21:22:38,718 WARNING [train.py:1204] (2/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_317-175062-0 from training. Duration: 0.82 2023-10-02 21:22:41,428 WARNING [train.py:1204] (2/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_238-89390-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 21:22:41,661 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.balancer1.prob, batch_count=1023400.0, ans=0.125 2023-10-02 21:22:42,881 WARNING [train.py:1204] (2/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_524-310811-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 21:22:45,581 WARNING [train.py:1204] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_307-158462-0_sp0.9 from training. Duration: 0.7666875 2023-10-02 21:22:45,582 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9309W0191-640318 from training. Duration: 0.512 2023-10-02 21:22:45,586 WARNING [train.py:1204] (2/4) Exclude cut with ID _55153_210_2253_1_1534384786677_3162096_100-204617-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 21:22:47,088 WARNING [train.py:1204] (2/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_112-149213-0_sp1.1 from training. Duration: 0.863625 2023-10-02 21:22:48,986 WARNING [train.py:1204] (2/4) Exclude cut with ID _67831_210_12392_1_1535612464202_3092612_346-2148-0_sp1.1 from training. Duration: 0.8454375 2023-10-02 21:22:49,181 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.conv_module2.balancer1.prob, batch_count=1023466.6666666666, ans=0.125 2023-10-02 21:22:50,375 WARNING [train.py:1204] (2/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_51-117576-0_sp0.9 from training. Duration: 0.4444375 2023-10-02 21:22:50,395 WARNING [train.py:1204] (2/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_300-27235-0 from training. Duration: 0.87 2023-10-02 21:22:52,154 WARNING [train.py:1204] (2/4) Exclude cut with ID _40399_210_4353_1_1533305658194_3279406_195-116825-0_sp1.1 from training. Duration: 0.97275 2023-10-02 21:22:52,187 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7002_210_1794_1_1527939683_5157286_163-87552-0_sp0.9 from training. Duration: 0.9555625 2023-10-02 21:22:52,230 WARNING [train.py:1204] (2/4) Exclude cut with ID _19514_210_9259_1_1531530071330_5084230_560-237597-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 21:22:55,313 WARNING [train.py:1204] (2/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_41-234353-0_sp1.1 from training. Duration: 0.6181875 2023-10-02 21:22:55,329 WARNING [train.py:1204] (2/4) Exclude cut with ID _6274_210_2084_1_1527937583837_7494020_769-168302-0 from training. Duration: 0.71 2023-10-02 21:22:58,074 WARNING [train.py:1204] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_574-45541-0 from training. Duration: 0.65 2023-10-02 21:22:59,596 WARNING [train.py:1204] (2/4) Exclude cut with ID _50310_210_15488_1_1534068043345_3641080_262-67120-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 21:23:01,169 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_53071_210_16554_1_1534493434_1276796_35-11927-0_sp1.1 from training. Duration: 0.963625 2023-10-02 21:23:01,173 WARNING [train.py:1204] (2/4) Exclude cut with ID _3992_210_1747_1_1526644816031_3731060_107-251434-0 from training. Duration: 0.99 2023-10-02 21:23:02,545 WARNING [train.py:1204] (2/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_412-54488-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 21:23:04,696 WARNING [train.py:1204] (2/4) Exclude cut with ID _18516_210_9206_1_1531704653206_3692406_77-258490-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 21:23:05,859 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_40047_210_2290_1_1533290519_2374311_195-165693-0_sp0.9 from training. Duration: 0.988875 2023-10-02 21:23:05,911 WARNING [train.py:1204] (2/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_296-36971-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 21:23:07,291 WARNING [train.py:1204] (2/4) Exclude cut with ID _14970_210_15709_1_1530773543493_1715730_187-20259-0_sp1.1 from training. Duration: 0.8090625 2023-10-02 21:23:10,049 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_53373_210_5399_1_1534229471_4715695_135-305927-0_sp1.1 from training. Duration: 0.936375 2023-10-02 21:23:11,354 WARNING [train.py:1204] (2/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_60-18123-0 from training. Duration: 0.88 2023-10-02 21:23:11,419 WARNING [train.py:1204] (2/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_166-166990-0 from training. Duration: 0.96 2023-10-02 21:23:11,497 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_5438_210_1794_1_1527336687_2600009_233-58072-0 from training. Duration: 0.77 2023-10-02 21:23:14,249 INFO [train.py:1046] (2/4) Epoch 29, batch 4800, loss[loss=0.1677, simple_loss=0.2564, pruned_loss=0.03955, over 24651.00 frames. ], tot_loss[loss=0.1655, simple_loss=0.2438, pruned_loss=0.04362, over 4704596.83 frames. ], batch size: 68, lr: 3.50e-03, grad_scale: 32.0 2023-10-02 21:23:15,708 WARNING [train.py:1204] (2/4) Exclude cut with ID _8833_210_5219_1_1528592665210_7553059_611-80313-0_sp0.9 from training. Duration: 0.711125 2023-10-02 21:23:15,730 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_53555_210_5632_1_1534595220_7232297_118-43239-0_sp1.1 from training. Duration: 0.936375 2023-10-02 21:23:17,081 WARNING [train.py:1204] (2/4) Exclude cut with ID _12369_210_4381_1_1530356078581_4388520_480-13146-0 from training. Duration: 0.85 2023-10-02 21:23:17,687 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.4.encoder.layers.2.self_attn2.whiten, num_groups=1, num_channels=384, metric=11.75 vs. limit=22.5 2023-10-02 21:23:20,642 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_892-28814-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 21:23:21,963 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_24899_210_16554_1_1532046422_3827462_160-21255-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 21:23:24,007 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.feed_forward3.hidden_balancer.prob, batch_count=1023600.0, ans=0.125 2023-10-02 21:23:27,308 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_154-316450-0_sp1.1 from training. Duration: 0.6909375 2023-10-02 21:23:28,641 WARNING [train.py:1204] (2/4) Exclude cut with ID _30196_210_1664_1_1533708496522_7074140_1225-301232-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 21:23:28,673 WARNING [train.py:1204] (2/4) Exclude cut with ID _28783_210_7943_1_1532671164808_7155479_414-208100-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 21:23:29,964 WARNING [train.py:1204] (2/4) Exclude cut with ID _19023_210_4353_1_1531378892552_5820780_83-254794-0 from training. Duration: 0.86 2023-10-02 21:23:30,022 WARNING [train.py:1204] (2/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_591-83752-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 21:23:30,066 WARNING [train.py:1204] (2/4) Exclude cut with ID _29877_210_6228_1_1532397791106_5276149_266-62291-0_sp1.1 from training. Duration: 0.7909375 2023-10-02 21:23:31,479 WARNING [train.py:1204] (2/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_280-161633-0_sp1.1 from training. Duration: 0.663625 2023-10-02 21:23:34,724 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7543_210_6753_1_1528539468_333764_39-156634-0_sp1.1 from training. Duration: 0.97275 2023-10-02 21:23:37,320 WARNING [train.py:1204] (2/4) Exclude cut with ID _67499_210_17282_1_1535509805220_3692430_505-312248-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 21:23:37,342 WARNING [train.py:1204] (2/4) Exclude cut with ID _5294_210_1747_1_1527249643928_3696179_180-100713-0_sp0.9 from training. Duration: 0.9 2023-10-02 21:23:37,429 WARNING [train.py:1204] (2/4) Exclude cut with ID _26528_210_9070_1_1532570410842_3851310_508-148132-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 21:23:37,444 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_211-339622-0_sp1.1 from training. Duration: 0.4090625 2023-10-02 21:23:37,456 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_40896_210_15710_1_1533433956_7100802_339-138356-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 21:23:38,775 WARNING [train.py:1204] (2/4) Exclude cut with ID _27060_210_5986_1_1532674838922_3522549_211-19933-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 21:23:41,545 WARNING [train.py:1204] (2/4) Exclude cut with ID _60229_210_27767_1_1534838406158_3655350_457-117923-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 21:23:44,241 WARNING [train.py:1204] (2/4) Exclude cut with ID _55429_210_10626_1_1534467588527_7174143_460-141439-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 21:23:45,573 WARNING [train.py:1204] (2/4) Exclude cut with ID _59170_210_6721_1_1534983633960_4203335_263-10780-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 21:23:45,584 WARNING [train.py:1204] (2/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_872-264525-0_sp0.9 from training. Duration: 0.72225 2023-10-02 21:23:47,429 WARNING [train.py:1204] (2/4) Exclude cut with ID _7624_210_1531_1_1528109802198_4933101_418-349834-0_sp0.9 from training. Duration: 0.6666875 2023-10-02 21:23:47,526 WARNING [train.py:1204] (2/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_27-53998-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 21:23:48,932 WARNING [train.py:1204] (2/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_202-183922-0 from training. Duration: 0.85 2023-10-02 21:23:48,952 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7002_210_1794_1_1527939683_5157286_1-281264-0 from training. Duration: 0.44 2023-10-02 21:23:50,316 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_53753_210_7234_1_1534320003_3594148_493-9314-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 21:23:50,337 WARNING [train.py:1204] (2/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_217-95056-0_sp1.1 from training. Duration: 0.8909375 2023-10-02 21:23:52,104 WARNING [train.py:1204] (2/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_406-45086-0_sp1.1 from training. Duration: 0.72725 2023-10-02 21:23:52,110 WARNING [train.py:1204] (2/4) Exclude cut with ID _31649_210_6228_1_1532584608063_4091078_505-213821-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 21:23:52,118 WARNING [train.py:1204] (2/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_317-175062-0_sp0.9 from training. Duration: 0.911125 2023-10-02 21:23:54,128 WARNING [train.py:1204] (2/4) Exclude cut with ID _5444_210_2151_1_1527418808464_5312210_726-27165-0_sp1.1 from training. Duration: 0.6090625 2023-10-02 21:23:55,410 WARNING [train.py:1204] (2/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_494-170437-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 21:23:58,147 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_474-64054-0_sp1.1 from training. Duration: 0.936375 2023-10-02 21:23:59,616 WARNING [train.py:1204] (2/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_521-7556-0_sp1.1 from training. Duration: 0.97275 2023-10-02 21:24:02,700 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_269-348505-0_sp1.1 from training. Duration: 0.92725 2023-10-02 21:24:02,898 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.ff2_skip_rate, batch_count=1023800.0, ans=0.0 2023-10-02 21:24:07,118 WARNING [train.py:1204] (2/4) Exclude cut with ID _21878_210_3525_1_1531738772002_7264330_559-184862-0 from training. Duration: 0.94 2023-10-02 21:24:07,153 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_68702_210_23689_1_1535799577_3420068_92-76040-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 21:24:08,496 WARNING [train.py:1204] (2/4) Exclude cut with ID _22826_210_9994_1_1531908201522_3279169_227-102052-0_sp1.1 from training. Duration: 0.97275 2023-10-02 21:24:08,527 WARNING [train.py:1204] (2/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_718-250907-0_sp0.9 from training. Duration: 0.7666875 2023-10-02 21:24:09,896 WARNING [train.py:1204] (2/4) Exclude cut with ID _14069_210_5632_1_1530512692033_4803390_352-143891-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 21:24:12,879 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.conv_module2.balancer2.prob, batch_count=1023866.6666666666, ans=0.125 2023-10-02 21:24:14,049 WARNING [train.py:1204] (2/4) Exclude cut with ID _6267_210_2084_1_1527764658526_3894730_65-106602-0_sp1.1 from training. Duration: 0.936375 2023-10-02 21:24:14,116 WARNING [train.py:1204] (2/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_202-183922-0_sp0.9 from training. Duration: 0.9444375 2023-10-02 21:24:14,122 WARNING [train.py:1204] (2/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_465-268827-0_sp1.1 from training. Duration: 0.97275 2023-10-02 21:24:15,548 WARNING [train.py:1204] (2/4) Exclude cut with ID _12369_210_4381_1_1530356078581_4388520_58-29064-0_sp1.1 from training. Duration: 0.9 2023-10-02 21:24:16,887 WARNING [train.py:1204] (2/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_1-99680-0_sp0.9 from training. Duration: 0.7444375 2023-10-02 21:24:16,950 WARNING [train.py:1204] (2/4) Exclude cut with ID _13253_210_1925_1_1530757775819_3577873_191-144977-0_sp1.1 from training. Duration: 0.7090625 2023-10-02 21:24:21,852 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.631e+02 1.889e+02 2.092e+02 2.309e+02 3.142e+02, threshold=4.184e+02, percent-clipped=0.0 2023-10-02 21:24:21,974 WARNING [train.py:1204] (2/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_276-112684-0_sp1.1 from training. Duration: 0.92725 2023-10-02 21:24:21,982 WARNING [train.py:1204] (2/4) Exclude cut with ID _64639_210_5816_1_1535367619437_3528000_358-411-0_sp1.1 from training. Duration: 0.97275 2023-10-02 21:24:21,989 WARNING [train.py:1204] (2/4) Exclude cut with ID _31649_210_6228_1_1532584608063_4091078_106-21199-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 21:24:23,354 WARNING [train.py:1204] (2/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_901-79438-0 from training. Duration: 0.94 2023-10-02 21:24:26,621 WARNING [train.py:1204] (2/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_718-250907-0 from training. Duration: 0.69 2023-10-02 21:24:26,634 WARNING [train.py:1204] (2/4) Exclude cut with ID _5287_210_2084_1_1527418883040_3878670_363-194161-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 21:24:26,638 WARNING [train.py:1204] (2/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_586-321321-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 21:24:26,711 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_68900_210_2087_1_1535713157_3708529_164-292661-0_sp1.1 from training. Duration: 0.963625 2023-10-02 21:24:26,712 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_26433_210_12108_1_1532157158_4056480_114-144878-0_sp1.1 from training. Duration: 0.97275 2023-10-02 21:24:29,472 INFO [train.py:1046] (2/4) Epoch 29, batch 4850, loss[loss=0.1753, simple_loss=0.2542, pruned_loss=0.04818, over 23506.00 frames. ], tot_loss[loss=0.1662, simple_loss=0.2445, pruned_loss=0.04389, over 4699325.51 frames. ], batch size: 106, lr: 3.50e-03, grad_scale: 16.0 2023-10-02 21:24:30,919 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_15517_210_6753_1_1530954008_3487336_96-341874-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 21:24:38,439 WARNING [train.py:1204] (2/4) Exclude cut with ID _14268_210_9206_1_1530622771285_4714099_637-86630-0 from training. Duration: 0.51 2023-10-02 21:24:39,810 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_28211_210_6388_1_1532861506_4523224_121-320092-0_sp1.1 from training. Duration: 0.92725 2023-10-02 21:24:44,264 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_141-89113-0_sp0.9 from training. Duration: 0.9555625 2023-10-02 21:24:45,687 WARNING [train.py:1204] (2/4) Exclude cut with ID _6789_210_1534_1_1527857937218_3118950_311-61262-0_sp1.1 from training. Duration: 0.5909375 2023-10-02 21:24:45,705 WARNING [train.py:1204] (2/4) Exclude cut with ID _58480_210_12392_1_1534811121111_3646160_389-200461-0_sp1.1 from training. Duration: 0.97275 2023-10-02 21:24:48,464 WARNING [train.py:1204] (2/4) Exclude cut with ID _41326_210_5854_1_1533778076509_3492080_28-156553-0_sp1.1 from training. Duration: 0.92725 2023-10-02 21:24:48,571 WARNING [train.py:1204] (2/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_577-287150-0_sp1.1 from training. Duration: 0.7454375 2023-10-02 21:24:49,905 WARNING [train.py:1204] (2/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_40-129756-0_sp1.1 from training. Duration: 0.736375 2023-10-02 21:24:49,915 WARNING [train.py:1204] (2/4) Exclude cut with ID _60113_210_16271_1_1535164212481_4386079_490-280290-0 from training. Duration: 0.98 2023-10-02 21:24:53,195 WARNING [train.py:1204] (2/4) Exclude cut with ID _58059_210_5854_1_1534901471149_3164808_34-39905-0_sp1.1 from training. Duration: 0.963625 2023-10-02 21:24:56,566 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_791-197891-0_sp1.1 from training. Duration: 0.763625 2023-10-02 21:24:56,588 WARNING [train.py:1204] (2/4) Exclude cut with ID _6789_210_1534_1_1527857937218_3118950_262-48853-0_sp1.1 from training. Duration: 0.5090625 2023-10-02 21:24:57,841 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_2655_210_2087_1_1525688959_3897400_52-279899-0_sp0.9 from training. Duration: 0.7666875 2023-10-02 21:24:57,845 WARNING [train.py:1204] (2/4) Exclude cut with ID _14884_210_2252_1_1530707459689_5561250_114-201399-0 from training. Duration: 0.76 2023-10-02 21:25:00,576 WARNING [train.py:1204] (2/4) Exclude cut with ID _19023_210_4353_1_1531378892552_5820780_83-254794-0_sp0.9 from training. Duration: 0.9555625 2023-10-02 21:25:01,921 WARNING [train.py:1204] (2/4) Exclude cut with ID _53489_210_6228_1_1534298357169_3835970_10-297919-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 21:25:06,467 WARNING [train.py:1204] (2/4) Exclude cut with ID _44585_210_18628_1_1533605350387_4518199_418-3055-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 21:25:06,486 WARNING [train.py:1204] (2/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_310-109461-0 from training. Duration: 0.8 2023-10-02 21:25:06,523 WARNING [train.py:1204] (2/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_235-262124-0 from training. Duration: 0.67 2023-10-02 21:25:07,922 WARNING [train.py:1204] (2/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_195-167011-0_sp1.1 from training. Duration: 0.6818125 2023-10-02 21:25:13,580 WARNING [train.py:1204] (2/4) Exclude cut with ID _7852_210_5219_1_1528286471316_8139400_188-55162-0_sp1.1 from training. Duration: 0.936375 2023-10-02 21:25:13,627 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_35361_210_4238_1_1533262810_7793476_675-87012-0 from training. Duration: 0.89 2023-10-02 21:25:15,035 WARNING [train.py:1204] (2/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_465-81427-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 21:25:15,042 WARNING [train.py:1204] (2/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_597-154640-0_sp1.1 from training. Duration: 0.8181875 2023-10-02 21:25:15,170 WARNING [train.py:1204] (2/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_186-309161-0_sp0.9 from training. Duration: 0.6 2023-10-02 21:25:16,491 WARNING [train.py:1204] (2/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_546-119312-0 from training. Duration: 0.99 2023-10-02 21:25:16,495 WARNING [train.py:1204] (2/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_330-240060-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 21:25:18,140 WARNING [train.py:1204] (2/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_477-255782-0 from training. Duration: 0.82 2023-10-02 21:25:18,165 WARNING [train.py:1204] (2/4) Exclude cut with ID _14970_210_15709_1_1530773543493_1715730_150-248939-0_sp1.1 from training. Duration: 0.97275 2023-10-02 21:25:20,849 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_556-206615-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 21:25:20,922 WARNING [train.py:1204] (2/4) Exclude cut with ID _3465_210_3525_1_1526175667060_3574049_308-231735-0 from training. Duration: 0.73 2023-10-02 21:25:31,156 WARNING [train.py:1204] (2/4) Exclude cut with ID _27485_210_4916_1_1532739191677_3668269_66-236571-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 21:25:37,065 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_2221_210_1988_1_1525597051_4935541_415-296882-0_sp0.9 from training. Duration: 0.8555625 2023-10-02 21:25:37,081 WARNING [train.py:1204] (2/4) Exclude cut with ID _64521_210_14699_1_1535243427259_3815328_231-78630-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 21:25:41,164 WARNING [train.py:1204] (2/4) Exclude cut with ID _13942_210_9988_1_1530604906036_3733360_499-55676-0 from training. Duration: 0.62 2023-10-02 21:25:41,166 WARNING [train.py:1204] (2/4) Exclude cut with ID _49376_210_5986_1_1533985278732_3370740_301-154763-0_sp1.1 from training. Duration: 0.8909375 2023-10-02 21:25:42,581 INFO [train.py:1046] (2/4) Epoch 29, batch 4900, loss[loss=0.1744, simple_loss=0.2596, pruned_loss=0.04459, over 24360.00 frames. ], tot_loss[loss=0.1652, simple_loss=0.2432, pruned_loss=0.0436, over 4710433.12 frames. ], batch size: 77, lr: 3.50e-03, grad_scale: 16.0 2023-10-02 21:25:45,322 WARNING [train.py:1204] (2/4) Exclude cut with ID _27485_210_4916_1_1532739191677_3668269_255-216698-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 21:25:46,681 WARNING [train.py:1204] (2/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_137-334143-0_sp1.1 from training. Duration: 0.97275 2023-10-02 21:25:46,861 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.balancer2.prob, batch_count=1024266.6666666666, ans=0.125 2023-10-02 21:25:48,466 WARNING [train.py:1204] (2/4) Exclude cut with ID _6456_210_1534_1_1527681634212_3181049_215-309359-0_sp1.1 from training. Duration: 0.8 2023-10-02 21:25:51,181 WARNING [train.py:1204] (2/4) Exclude cut with ID _65640_210_6310_1_1535368201445_3173250_209-13008-0 from training. Duration: 0.99 2023-10-02 21:25:58,409 WARNING [train.py:1204] (2/4) Exclude cut with ID _12369_210_4381_1_1530356078581_4388520_58-29064-0 from training. Duration: 0.99 2023-10-02 21:25:59,999 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.1.bypass_mid.scale_min, batch_count=1024333.3333333334, ans=0.2 2023-10-02 21:26:00,086 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.conv_module2.balancer1.prob, batch_count=1024333.3333333334, ans=0.125 2023-10-02 21:26:03,065 WARNING [train.py:1204] (2/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_410-287857-0 from training. Duration: 0.93 2023-10-02 21:26:04,317 WARNING [train.py:1204] (2/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_153-153537-0 from training. Duration: 0.91 2023-10-02 21:26:04,363 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7544_210_6753_1_1528587722_3576686_172-171750-0_sp0.9 from training. Duration: 0.72225 2023-10-02 21:26:04,392 WARNING [train.py:1204] (2/4) Exclude cut with ID _64521_210_14699_1_1535243427259_3815328_464-183853-0_sp1.1 from training. Duration: 0.97275 2023-10-02 21:26:05,749 WARNING [train.py:1204] (2/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_385-210308-0_sp1.1 from training. Duration: 0.963625 2023-10-02 21:26:05,756 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_46013_210_7234_1_1533776362_3744032_354-261865-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 21:26:05,763 WARNING [train.py:1204] (2/4) Exclude cut with ID _3067_210_2667_1_1526212974640_4503168_250-285937-0_sp1.1 from training. Duration: 0.72725 2023-10-02 21:26:05,823 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_40036_210_13405_1_1533648647_1315388_48-146016-0 from training. Duration: 0.93 2023-10-02 21:26:09,245 WARNING [train.py:1204] (2/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_280-161633-0 from training. Duration: 0.73 2023-10-02 21:26:09,278 WARNING [train.py:1204] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_308-161067-0_sp0.9 from training. Duration: 0.7333125 2023-10-02 21:26:10,610 WARNING [train.py:1204] (2/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_68-212801-0_sp0.9 from training. Duration: 0.811125 2023-10-02 21:26:12,127 WARNING [train.py:1204] (2/4) Exclude cut with ID _6789_210_1534_1_1527857937218_3118950_311-61262-0_sp0.9 from training. Duration: 0.72225 2023-10-02 21:26:13,614 WARNING [train.py:1204] (2/4) Exclude cut with ID _14632_210_9988_1_1530691279392_3923200_102-204477-0_sp1.1 from training. Duration: 0.8818125 2023-10-02 21:26:13,658 WARNING [train.py:1204] (2/4) Exclude cut with ID _65242_210_18628_1_1535369393717_3823149_290-117517-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 21:26:15,136 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=1024400.0, ans=0.1 2023-10-02 21:26:16,256 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_69630_210_7234_1_1535788670_910989_35-337140-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 21:26:16,271 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_5618_210_1874_1_1527311008_5851_1-214501-0 from training. Duration: 0.689 2023-10-02 21:26:17,640 WARNING [train.py:1204] (2/4) Exclude cut with ID _8064_210_5399_1_1528372299291_4282973_168-50337-0_sp0.9 from training. Duration: 0.9333125 2023-10-02 21:26:18,967 WARNING [train.py:1204] (2/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_185-14438-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 21:26:18,989 WARNING [train.py:1204] (2/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_61-327062-0 from training. Duration: 0.81 2023-10-02 21:26:18,995 WARNING [train.py:1204] (2/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_98-741-0 from training. Duration: 0.8 2023-10-02 21:26:23,774 WARNING [train.py:1204] (2/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_312-204798-0 from training. Duration: 0.93 2023-10-02 21:26:25,133 WARNING [train.py:1204] (2/4) Exclude cut with ID _14998_210_5219_1_1530705582080_7474970_787-294416-0_sp0.9 from training. Duration: 0.911125 2023-10-02 21:26:25,209 WARNING [train.py:1204] (2/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_140-181176-0_sp1.1 from training. Duration: 0.836375 2023-10-02 21:26:26,397 WARNING [train.py:1204] (2/4) Exclude cut with ID _14687_210_5854_1_1530703838259_3664889_115-239046-0_sp1.1 from training. Duration: 0.6090625 2023-10-02 21:26:26,444 WARNING [train.py:1204] (2/4) Exclude cut with ID _56423_210_5854_1_1534896061049_3165969_370-230614-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 21:26:26,501 WARNING [train.py:1204] (2/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_406-275649-0_sp0.9 from training. Duration: 0.4666875 2023-10-02 21:26:26,520 WARNING [train.py:1204] (2/4) Exclude cut with ID _15150_210_8341_1_1530752411803_7256384_22-138274-0_sp1.1 from training. Duration: 0.87275 2023-10-02 21:26:27,876 WARNING [train.py:1204] (2/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_125-125818-0 from training. Duration: 0.96 2023-10-02 21:26:29,772 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_4399_210_1874_1_1526705485_2400757_220-47974-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 21:26:31,622 WARNING [train.py:1204] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_308-161067-0_sp1.1 from training. Duration: 0.6 2023-10-02 21:26:32,957 WARNING [train.py:1204] (2/4) Exclude cut with ID _53489_210_6228_1_1534298357169_3835970_102-57645-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 21:26:37,503 WARNING [train.py:1204] (2/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_178-177582-0 from training. Duration: 0.74 2023-10-02 21:26:37,579 WARNING [train.py:1204] (2/4) Exclude cut with ID _14349_210_8341_1_1530615710818_3442470_170-336541-0_sp1.1 from training. Duration: 0.7818125 2023-10-02 21:26:38,992 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_521-214276-0_sp0.9 from training. Duration: 0.488875 2023-10-02 21:26:39,036 WARNING [train.py:1204] (2/4) Exclude cut with ID _66070_210_7640_1_1535441393096_7805310_489-292631-0 from training. Duration: 0.79 2023-10-02 21:26:45,855 WARNING [train.py:1204] (2/4) Exclude cut with ID _14069_210_5632_1_1530512692033_4803390_438-340023-0_sp1.1 from training. Duration: 0.936375 2023-10-02 21:26:47,186 WARNING [train.py:1204] (2/4) Exclude cut with ID _7852_210_5219_1_1528286471316_8139400_242-105336-0_sp1.1 from training. Duration: 0.6545625 2023-10-02 21:26:48,527 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.630e+02 1.893e+02 2.056e+02 2.305e+02 3.001e+02, threshold=4.112e+02, percent-clipped=0.0 2023-10-02 21:26:48,604 WARNING [train.py:1204] (2/4) Exclude cut with ID _3867_210_1883_1_1526641200575_4475830_272-215563-0 from training. Duration: 0.57 2023-10-02 21:26:48,611 WARNING [train.py:1204] (2/4) Exclude cut with ID _14368_210_4220_1_1530532714870_3595529_143-117421-0_sp0.9 from training. Duration: 0.6444375 2023-10-02 21:26:48,619 WARNING [train.py:1204] (2/4) Exclude cut with ID _9235_210_5219_1_1528891154253_8046120_30-326681-0_sp1.1 from training. Duration: 0.7545625 2023-10-02 21:26:50,625 WARNING [train.py:1204] (2/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_538-218448-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 21:26:54,922 WARNING [train.py:1204] (2/4) Exclude cut with ID _33640_210_2655_1_1533117605479_4855469_60-192970-0_sp1.1 from training. Duration: 0.963625 2023-10-02 21:26:54,930 WARNING [train.py:1204] (2/4) Exclude cut with ID _65585_210_12210_1_1535450451584_3764098_112-294301-0_sp1.1 from training. Duration: 0.8 2023-10-02 21:26:54,952 WARNING [train.py:1204] (2/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_643-37412-0_sp1.1 from training. Duration: 0.936375 2023-10-02 21:26:54,968 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_9302_210_2550_1_1528889962_4655416_638-301620-0_sp1.1 from training. Duration: 0.42725 2023-10-02 21:26:55,206 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.feed_forward3.hidden_balancer.prob, batch_count=1024600.0, ans=0.125 2023-10-02 21:26:56,264 INFO [train.py:1046] (2/4) Epoch 29, batch 4950, loss[loss=0.1609, simple_loss=0.2236, pruned_loss=0.04907, over 22698.00 frames. ], tot_loss[loss=0.1639, simple_loss=0.2414, pruned_loss=0.04322, over 4695374.89 frames. ], batch size: 322, lr: 3.50e-03, grad_scale: 16.0 2023-10-02 21:26:56,397 WARNING [train.py:1204] (2/4) Exclude cut with ID _3867_210_1883_1_1526641200575_4475830_272-215563-0_sp1.1 from training. Duration: 0.5181875 2023-10-02 21:26:59,608 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_53679_210_4852_1_1534556726_4186141_358-154143-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 21:26:59,619 WARNING [train.py:1204] (2/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_841-341679-0_sp0.9 from training. Duration: 0.6444375 2023-10-02 21:27:03,810 WARNING [train.py:1204] (2/4) Exclude cut with ID _7160_210_1876_1_1527940923231_3703799_302-150728-0 from training. Duration: 0.78 2023-10-02 21:27:03,837 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_125-203105-0 from training. Duration: 0.8 2023-10-02 21:27:03,854 WARNING [train.py:1204] (2/4) Exclude cut with ID _5585_210_2667_1_1527418813324_8007340_89-332072-0_sp0.9 from training. Duration: 0.711125 2023-10-02 21:27:05,731 WARNING [train.py:1204] (2/4) Exclude cut with ID _3992_210_1747_1_1526644816031_3731060_36-147855-0 from training. Duration: 0.78 2023-10-02 21:27:05,757 WARNING [train.py:1204] (2/4) Exclude cut with ID _59256_210_5986_1_1535076085997_3589120_172-239045-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 21:27:05,765 WARNING [train.py:1204] (2/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_472-216663-0_sp1.1 from training. Duration: 0.836375 2023-10-02 21:27:07,099 WARNING [train.py:1204] (2/4) Exclude cut with ID _36512_210_6987_1_1533033143998_3126069_181-306500-0_sp1.1 from training. Duration: 0.863625 2023-10-02 21:27:07,114 WARNING [train.py:1204] (2/4) Exclude cut with ID _56524_210_9642_1_1534572011779_3619170_492-95008-0_sp1.1 from training. Duration: 0.97275 2023-10-02 21:27:10,363 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_33348_210_5632_1_1533086912_3324030_119-90326-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 21:27:10,406 WARNING [train.py:1204] (2/4) Exclude cut with ID _14368_210_4220_1_1530532714870_3595529_231-137892-0_sp1.1 from training. Duration: 0.9 2023-10-02 21:27:11,933 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8453_210_4211_1_1528502940_8562730_596-306820-0_sp1.1 from training. Duration: 0.8818125 2023-10-02 21:27:13,305 WARNING [train.py:1204] (2/4) Exclude cut with ID _27052_210_5986_1_1532329197187_3585009_50-242750-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 21:27:14,713 WARNING [train.py:1204] (2/4) Exclude cut with ID _13913_210_9994_1_1530579615187_3383897_358-98435-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 21:27:14,733 WARNING [train.py:1204] (2/4) Exclude cut with ID _27013_210_1389_1_1532437173240_3252649_399-229778-0_sp1.1 from training. Duration: 0.963625 2023-10-02 21:27:18,740 WARNING [train.py:1204] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_185-174805-0_sp0.9 from training. Duration: 0.7555625 2023-10-02 21:27:22,047 WARNING [train.py:1204] (2/4) Exclude cut with ID _65585_210_12210_1_1535450451584_3764098_196-221014-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 21:27:23,564 WARNING [train.py:1204] (2/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_202-293523-0_sp1.1 from training. Duration: 0.6545625 2023-10-02 21:27:26,340 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8275_210_4198_1_1528455320_3892571_213-127634-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 21:27:26,385 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_40896_210_15710_1_1533433956_7100802_921-231678-0_sp1.1 from training. Duration: 0.97275 2023-10-02 21:27:27,789 WARNING [train.py:1204] (2/4) Exclude cut with ID _38020_210_5986_1_1533112252259_7006089_493-102745-0_sp1.1 from training. Duration: 0.8909375 2023-10-02 21:27:30,519 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6846_210_6753_1_1527850767_4295158_2-315073-0 from training. Duration: 0.88 2023-10-02 21:27:32,413 WARNING [train.py:1204] (2/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_249-281349-0 from training. Duration: 0.85 2023-10-02 21:27:35,175 WARNING [train.py:1204] (2/4) Exclude cut with ID _18513_210_9206_1_1531618215223_3862620_314-83429-0_sp1.1 from training. Duration: 0.97275 2023-10-02 21:27:36,591 WARNING [train.py:1204] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_215-202973-0_sp0.9 from training. Duration: 0.97775 2023-10-02 21:27:36,600 WARNING [train.py:1204] (2/4) Exclude cut with ID _7719_210_3525_1_1528456153163_4296960_88-250259-0_sp1.1 from training. Duration: 0.763625 2023-10-02 21:27:37,961 WARNING [train.py:1204] (2/4) Exclude cut with ID _7606_210_4915_1_1528894253277_5080690_301-10732-0_sp1.1 from training. Duration: 0.736375 2023-10-02 21:27:37,970 WARNING [train.py:1204] (2/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_818-11907-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 21:27:39,835 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_5959_210_1794_1_1527400400_3445726_412-44052-0_sp1.1 from training. Duration: 0.7 2023-10-02 21:27:41,302 WARNING [train.py:1204] (2/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_6-279195-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 21:27:42,732 WARNING [train.py:1204] (2/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_252-216674-0_sp0.9 from training. Duration: 0.6 2023-10-02 21:27:45,456 WARNING [train.py:1204] (2/4) Exclude cut with ID _14368_210_4220_1_1530532714870_3595529_144-3287-0_sp1.1 from training. Duration: 0.6090625 2023-10-02 21:27:46,965 WARNING [train.py:1204] (2/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_342-3879-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 21:27:48,272 WARNING [train.py:1204] (2/4) Exclude cut with ID _33640_210_2655_1_1533117605479_4855469_584-65630-0_sp1.1 from training. Duration: 0.97275 2023-10-02 21:27:48,344 WARNING [train.py:1204] (2/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_37-189262-0 from training. Duration: 0.98 2023-10-02 21:27:48,388 WARNING [train.py:1204] (2/4) Exclude cut with ID _9389_210_1664_1_1529217337301_5289269_379-128794-0_sp0.9 from training. Duration: 0.9444375 2023-10-02 21:27:48,597 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.1.feed_forward1.out_proj.dropout_p, batch_count=1024800.0, ans=0.1 2023-10-02 21:27:49,801 WARNING [train.py:1204] (2/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_119-207162-0_sp1.1 from training. Duration: 0.6454375 2023-10-02 21:27:52,713 WARNING [train.py:1204] (2/4) Exclude cut with ID _2941_210_2577_1_1526199673150_2489759_200-333643-0_sp1.1 from training. Duration: 0.936375 2023-10-02 21:27:54,711 WARNING [train.py:1204] (2/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_479-125611-0_sp1.1 from training. Duration: 0.87275 2023-10-02 21:27:54,721 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_3475_210_2028_1_1526193578_3622569_64-297072-0_sp1.1 from training. Duration: 0.82725 2023-10-02 21:27:54,762 WARNING [train.py:1204] (2/4) Exclude cut with ID _53565_210_10635_1_1534337218335_3820259_139-346291-0_sp1.1 from training. Duration: 0.97275 2023-10-02 21:27:56,053 WARNING [train.py:1204] (2/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_588-125165-0_sp1.1 from training. Duration: 0.7545625 2023-10-02 21:27:56,110 WARNING [train.py:1204] (2/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_938-205135-0_sp1.1 from training. Duration: 0.7909375 2023-10-02 21:27:57,510 WARNING [train.py:1204] (2/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_216-48707-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 21:27:58,821 WARNING [train.py:1204] (2/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_477-255782-0_sp1.1 from training. Duration: 0.7454375 2023-10-02 21:27:58,851 WARNING [train.py:1204] (2/4) Exclude cut with ID _16909_210_5724_1_1531292079912_4118919_583-92842-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 21:28:00,261 WARNING [train.py:1204] (2/4) Exclude cut with ID _60860_210_8254_1_1534896015875_3557169_245-256666-0 from training. Duration: 0.97 2023-10-02 21:28:03,100 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_45399_210_4852_1_1533624725_7579737_349-116173-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 21:28:08,339 WARNING [train.py:1204] (2/4) Exclude cut with ID _8699_210_2069_1_1528630346831_3960409_155-39766-0 from training. Duration: 0.78 2023-10-02 21:28:08,348 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_86-16881-0_sp1.1 from training. Duration: 0.52725 2023-10-02 21:28:11,011 INFO [train.py:1046] (2/4) Epoch 29, batch 5000, loss[loss=0.1561, simple_loss=0.2073, pruned_loss=0.05246, over 19313.00 frames. ], tot_loss[loss=0.1636, simple_loss=0.2409, pruned_loss=0.04316, over 4696211.72 frames. ], batch size: 388, lr: 3.50e-03, grad_scale: 16.0 2023-10-02 21:28:15,084 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_14056_210_4163_1_1530604483_7572827_191-159384-0_sp1.1 from training. Duration: 0.97275 2023-10-02 21:28:15,092 WARNING [train.py:1204] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_307-158462-0_sp1.1 from training. Duration: 0.62725 2023-10-02 21:28:16,566 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7544_210_6753_1_1528591327_1471426_33-346031-0 from training. Duration: 0.61 2023-10-02 21:28:17,846 WARNING [train.py:1204] (2/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_112-149213-0 from training. Duration: 0.95 2023-10-02 21:28:21,035 WARNING [train.py:1204] (2/4) Exclude cut with ID _55844_210_2346_1_1534471212569_7625707_925-191342-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 21:28:23,588 WARNING [train.py:1204] (2/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_87-278686-0 from training. Duration: 0.77 2023-10-02 21:28:23,615 WARNING [train.py:1204] (2/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_415-78792-0_sp1.1 from training. Duration: 0.736375 2023-10-02 21:28:23,631 WARNING [train.py:1204] (2/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_107-332398-0_sp0.9 from training. Duration: 0.8666875 2023-10-02 21:28:25,017 WARNING [train.py:1204] (2/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_72-251255-0 from training. Duration: 0.94 2023-10-02 21:28:25,039 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_431-227273-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 21:28:26,466 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7544_210_6753_1_1528587000_691468_87-178599-0_sp0.9 from training. Duration: 0.9666875 2023-10-02 21:28:26,554 WARNING [train.py:1204] (2/4) Exclude cut with ID _13916_210_9994_1_1530788341126_3763149_60-191556-0 from training. Duration: 0.61 2023-10-02 21:28:26,556 WARNING [train.py:1204] (2/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_746-164782-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 21:28:27,873 WARNING [train.py:1204] (2/4) Exclude cut with ID _33640_210_2655_1_1533117605479_4855469_596-21066-0_sp1.1 from training. Duration: 0.92725 2023-10-02 21:28:29,401 WARNING [train.py:1204] (2/4) Exclude cut with ID _40402_210_18219_1_1533522324115_2953150_320-14078-0 from training. Duration: 0.95 2023-10-02 21:28:29,442 WARNING [train.py:1204] (2/4) Exclude cut with ID _29877_210_6228_1_1532397791106_5276149_266-62291-0 from training. Duration: 0.87 2023-10-02 21:28:29,506 WARNING [train.py:1204] (2/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_176-56120-0_sp0.9 from training. Duration: 0.92225 2023-10-02 21:28:30,816 WARNING [train.py:1204] (2/4) Exclude cut with ID _6275_210_2084_1_1528110062213_7669049_91-26919-0 from training. Duration: 0.97 2023-10-02 21:28:30,825 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_224-322394-0_sp0.9 from training. Duration: 0.7333125 2023-10-02 21:28:30,866 WARNING [train.py:1204] (2/4) Exclude cut with ID _44982_210_1511_1_1534312820108_3726029_586-297059-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 21:28:31,106 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.attention_skip_rate, batch_count=1025000.0, ans=0.0 2023-10-02 21:28:32,175 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_196-289637-0_sp1.1 from training. Duration: 0.6818125 2023-10-02 21:28:32,176 WARNING [train.py:1204] (2/4) Exclude cut with ID _18513_210_9206_1_1531618215223_3862620_402-5957-0 from training. Duration: 0.99 2023-10-02 21:28:32,182 WARNING [train.py:1204] (2/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_81-300212-0 from training. Duration: 0.6 2023-10-02 21:28:33,575 WARNING [train.py:1204] (2/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_166-128941-0 from training. Duration: 0.54 2023-10-02 21:28:33,589 WARNING [train.py:1204] (2/4) Exclude cut with ID _58059_210_5854_1_1534901471149_3164808_57-309660-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 21:28:35,477 WARNING [train.py:1204] (2/4) Exclude cut with ID _60621_210_17071_1_1535013053530_3671739_73-155814-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 21:28:36,865 WARNING [train.py:1204] (2/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_726-270780-0 from training. Duration: 0.86 2023-10-02 21:28:36,877 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8853_210_3273_1_1528633899_818968_51-187685-0_sp1.1 from training. Duration: 0.62725 2023-10-02 21:28:38,370 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_315-212794-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 21:28:41,055 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_53373_210_5399_1_1534229471_4715695_314-122795-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 21:28:42,427 WARNING [train.py:1204] (2/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_187-221566-0_sp0.9 from training. Duration: 0.47775 2023-10-02 21:28:43,730 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_12868_210_6753_1_1530702479_3610564_133-564-0 from training. Duration: 0.79 2023-10-02 21:28:43,780 WARNING [train.py:1204] (2/4) Exclude cut with ID _55153_210_2253_1_1534384786677_3162096_255-228344-0_sp1.1 from training. Duration: 0.936375 2023-10-02 21:28:45,165 WARNING [train.py:1204] (2/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_383-165234-0_sp1.1 from training. Duration: 0.8545625 2023-10-02 21:28:47,332 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.2.encoder.layers.0.conv_module2.whiten, num_groups=1, num_channels=384, metric=5.82 vs. limit=15.0 2023-10-02 21:28:47,945 WARNING [train.py:1204] (2/4) Exclude cut with ID IC0677W0056-334889 from training. Duration: 0.768 2023-10-02 21:28:51,256 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_14953_210_2151_1_1530701710_5616165_710-165400-0_sp0.9 from training. Duration: 0.9666875 2023-10-02 21:28:53,907 WARNING [train.py:1204] (2/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_355-236205-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 21:28:53,908 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_34450_210_9547_1_1533099443_4109050_503-246893-0_sp1.1 from training. Duration: 0.963625 2023-10-02 21:28:58,125 WARNING [train.py:1204] (2/4) Exclude cut with ID _43552_210_8913_1_1533726031059_3523429_25-120161-0 from training. Duration: 0.98 2023-10-02 21:28:58,148 WARNING [train.py:1204] (2/4) Exclude cut with ID _66112_210_10635_1_1535590236305_3801484_205-311930-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 21:28:58,169 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_48049_210_5632_1_1533864553_3519937_185-281218-0_sp1.1 from training. Duration: 0.92725 2023-10-02 21:28:58,203 WARNING [train.py:1204] (2/4) Exclude cut with ID _48782_210_12658_1_1534467701544_2300422_191-332415-0_sp1.1 from training. Duration: 0.97275 2023-10-02 21:28:59,686 WARNING [train.py:1204] (2/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_907-151398-0_sp1.1 from training. Duration: 0.336375 2023-10-02 21:29:01,074 WARNING [train.py:1204] (2/4) Exclude cut with ID _67499_210_17282_1_1535509805220_3692430_20-179346-0_sp1.1 from training. Duration: 0.8818125 2023-10-02 21:29:03,785 WARNING [train.py:1204] (2/4) Exclude cut with ID _14998_210_5219_1_1530705582080_7474970_441-91466-0_sp1.1 from training. Duration: 0.8818125 2023-10-02 21:29:05,717 WARNING [train.py:1204] (2/4) Exclude cut with ID _46681_210_9642_1_1533893005571_7603359_786-164007-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 21:29:11,738 WARNING [train.py:1204] (2/4) Exclude cut with ID _14970_210_15709_1_1530773543493_1715730_187-20259-0 from training. Duration: 0.89 2023-10-02 21:29:14,955 WARNING [train.py:1204] (2/4) Exclude cut with ID _12734_210_8341_1_1530183614296_3891929_383-339075-0_sp1.1 from training. Duration: 0.963625 2023-10-02 21:29:17,604 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.554e+02 1.798e+02 1.947e+02 2.210e+02 2.765e+02, threshold=3.894e+02, percent-clipped=0.0 2023-10-02 21:29:20,631 INFO [scaling.py:1118] (2/4) WithLoss: name=encoder.encoders.4.encoder.layers.1.self_attn_weights, loss-sum=0.000e+00 2023-10-02 21:29:24,024 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.feed_forward1.out_proj.dropout_p, batch_count=1025266.6666666666, ans=0.1 2023-10-02 21:29:25,094 INFO [train.py:1046] (2/4) Epoch 29, batch 5050, loss[loss=0.1786, simple_loss=0.2606, pruned_loss=0.04829, over 23379.00 frames. ], tot_loss[loss=0.1641, simple_loss=0.2419, pruned_loss=0.04314, over 4711122.54 frames. ], batch size: 93, lr: 3.50e-03, grad_scale: 16.0 2023-10-02 21:29:25,195 WARNING [train.py:1204] (2/4) Exclude cut with ID _37086_210_9038_1_1532999540891_7021250_834-322225-0_sp1.1 from training. Duration: 0.92725 2023-10-02 21:29:26,653 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_10987_210_4163_1_1529760555_4123902_56-290559-0_sp1.1 from training. Duration: 0.963625 2023-10-02 21:29:26,660 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_92-246270-0_sp0.9 from training. Duration: 0.9444375 2023-10-02 21:29:26,685 WARNING [train.py:1204] (2/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_56-13561-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 21:29:26,719 WARNING [train.py:1204] (2/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_393-57888-0_sp1.1 from training. Duration: 0.5818125 2023-10-02 21:29:26,741 WARNING [train.py:1204] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_168-32906-0_sp1.1 from training. Duration: 0.62725 2023-10-02 21:29:28,131 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_51225_210_3601_1_1534400634_2327383_127-207553-0_sp1.1 from training. Duration: 0.963625 2023-10-02 21:29:30,866 WARNING [train.py:1204] (2/4) Exclude cut with ID _58583_210_5236_1_1534726835266_3574249_494-203521-0_sp1.1 from training. Duration: 0.963625 2023-10-02 21:29:30,880 WARNING [train.py:1204] (2/4) Exclude cut with ID _49376_210_5986_1_1533985278732_3370740_354-344695-0 from training. Duration: 0.99 2023-10-02 21:29:32,239 WARNING [train.py:1204] (2/4) Exclude cut with ID _8021_210_7994_1_1528376451358_3653069_179-45206-0_sp1.1 from training. Duration: 0.7818125 2023-10-02 21:29:33,647 WARNING [train.py:1204] (2/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_742-154017-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 21:29:35,008 WARNING [train.py:1204] (2/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_222-198127-0_sp1.1 from training. Duration: 0.763625 2023-10-02 21:29:35,055 WARNING [train.py:1204] (2/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_235-307337-0 from training. Duration: 0.94 2023-10-02 21:29:35,127 WARNING [train.py:1204] (2/4) Exclude cut with ID _8393_210_6702_1_1528505860249_3457328_453-176500-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 21:29:36,867 WARNING [train.py:1204] (2/4) Exclude cut with ID _53565_210_10635_1_1534337218335_3820259_102-179528-0_sp1.1 from training. Duration: 0.97275 2023-10-02 21:29:39,795 WARNING [train.py:1204] (2/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_43-206757-0_sp1.1 from training. Duration: 0.6818125 2023-10-02 21:29:41,144 WARNING [train.py:1204] (2/4) Exclude cut with ID _8394_210_6702_1_1528590504316_3540189_335-274780-0_sp1.1 from training. Duration: 0.7090625 2023-10-02 21:29:41,178 WARNING [train.py:1204] (2/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_128-296581-0_sp1.1 from training. Duration: 0.67275 2023-10-02 21:29:53,319 WARNING [train.py:1204] (2/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_111-112362-0 from training. Duration: 0.91 2023-10-02 21:29:53,351 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_5522_210_3273_1_1527249606_3909096_366-172508-0_sp1.1 from training. Duration: 0.6 2023-10-02 21:29:53,433 WARNING [train.py:1204] (2/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_47-167373-0_sp1.1 from training. Duration: 0.836375 2023-10-02 21:29:54,666 WARNING [train.py:1204] (2/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_41-234353-0 from training. Duration: 0.68 2023-10-02 21:29:54,707 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6291_210_3273_1_1527683649_1078498_82-221196-0_sp0.9 from training. Duration: 0.8444375 2023-10-02 21:29:56,720 WARNING [train.py:1204] (2/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_47-4814-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 21:29:56,751 WARNING [train.py:1204] (2/4) Exclude cut with ID _43541_210_10626_1_1533796233418_3635440_40-247993-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 21:29:56,801 WARNING [train.py:1204] (2/4) Exclude cut with ID _21989_210_5854_1_1531899214931_3271360_133-44759-0_sp0.9 from training. Duration: 0.8555625 2023-10-02 21:29:56,804 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_94-195801-0 from training. Duration: 0.94 2023-10-02 21:29:58,195 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_141-89113-0 from training. Duration: 0.86 2023-10-02 21:29:59,577 WARNING [train.py:1204] (2/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_683-284578-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 21:30:02,274 WARNING [train.py:1204] (2/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_6-214815-0_sp1.1 from training. Duration: 0.77275 2023-10-02 21:30:05,108 WARNING [train.py:1204] (2/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_556-212082-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 21:30:05,146 WARNING [train.py:1204] (2/4) Exclude cut with ID _44902_210_16187_1_1533888006062_3792078_149-67212-0 from training. Duration: 0.87 2023-10-02 21:30:06,534 WARNING [train.py:1204] (2/4) Exclude cut with ID _61148_210_6897_1_1535094022726_4217610_333-159440-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 21:30:08,571 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.1.bypass.skip_rate, batch_count=1025466.6666666666, ans=0.035 2023-10-02 21:30:09,786 WARNING [train.py:1204] (2/4) Exclude cut with ID _3120_210_3525_1_1526036143112_4072980_404-249994-0 from training. Duration: 0.99 2023-10-02 21:30:11,198 WARNING [train.py:1204] (2/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_234-260682-0_sp0.9 from training. Duration: 0.9333125 2023-10-02 21:30:11,222 WARNING [train.py:1204] (2/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_374-138971-0_sp1.1 from training. Duration: 0.8545625 2023-10-02 21:30:11,283 WARNING [train.py:1204] (2/4) Exclude cut with ID _53713_210_24116_1_1534314487762_7416751_743-302085-0_sp1.1 from training. Duration: 0.92725 2023-10-02 21:30:11,435 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.conv_module2.balancer2.prob, batch_count=1025466.6666666666, ans=0.125 2023-10-02 21:30:13,935 WARNING [train.py:1204] (2/4) Exclude cut with ID _18516_210_9206_1_1531704653206_3692406_198-334870-0_sp1.1 from training. Duration: 0.836375 2023-10-02 21:30:15,358 WARNING [train.py:1204] (2/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_395-73570-0_sp1.1 from training. Duration: 0.9 2023-10-02 21:30:17,255 WARNING [train.py:1204] (2/4) Exclude cut with ID _46683_210_9642_1_1533730913492_4591116_421-19547-0_sp1.1 from training. Duration: 0.936375 2023-10-02 21:30:19,150 WARNING [train.py:1204] (2/4) Exclude cut with ID _19896_210_1883_1_1531709022662_6649569_714-8668-0_sp1.1 from training. Duration: 0.963625 2023-10-02 21:30:19,184 WARNING [train.py:1204] (2/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_49-101072-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 21:30:20,468 WARNING [train.py:1204] (2/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_220-191080-0_sp1.1 from training. Duration: 0.8909375 2023-10-02 21:30:20,495 WARNING [train.py:1204] (2/4) Exclude cut with ID _3465_210_3525_1_1526175667060_3574049_62-335552-0 from training. Duration: 0.87 2023-10-02 21:30:20,585 WARNING [train.py:1204] (2/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_111-112362-0_sp1.1 from training. Duration: 0.82725 2023-10-02 21:30:23,246 WARNING [train.py:1204] (2/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_318-59097-0_sp0.9 from training. Duration: 0.8444375 2023-10-02 21:30:25,031 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.balancer2.prob, batch_count=1025533.3333333334, ans=0.125 2023-10-02 21:30:26,527 WARNING [train.py:1204] (2/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_98-255710-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 21:30:26,533 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9042W0003-515609 from training. Duration: 0.896 2023-10-02 21:30:26,540 WARNING [train.py:1204] (2/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_158-250116-0_sp0.9 from training. Duration: 0.688875 2023-10-02 21:30:27,931 WARNING [train.py:1204] (2/4) Exclude cut with ID _26750_210_5665_1_1532226678442_4063230_7-346885-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 21:30:29,262 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_37839_210_7234_1_1533542462_3689303_357-227078-0_sp1.1 from training. Duration: 0.963625 2023-10-02 21:30:29,283 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9052W0119-520904 from training. Duration: 0.896 2023-10-02 21:30:32,039 WARNING [train.py:1204] (2/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_249-281349-0_sp1.1 from training. Duration: 0.77275 2023-10-02 21:30:32,045 WARNING [train.py:1204] (2/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_147-293311-0 from training. Duration: 0.94 2023-10-02 21:30:32,046 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_158-21166-0_sp1.1 from training. Duration: 0.963625 2023-10-02 21:30:33,658 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.self_attn_weights.pos_emb_skip_rate, batch_count=1025533.3333333334, ans=0.0 2023-10-02 21:30:36,135 WARNING [train.py:1204] (2/4) Exclude cut with ID _14100_210_8341_1_1530529320613_3678069_207-332194-0_sp1.1 from training. Duration: 0.92725 2023-10-02 21:30:36,173 WARNING [train.py:1204] (2/4) Exclude cut with ID _9389_210_1664_1_1529217337301_5289269_324-211031-0_sp1.1 from training. Duration: 0.963625 2023-10-02 21:30:36,188 WARNING [train.py:1204] (2/4) Exclude cut with ID _14603_210_4220_1_1530615688441_3097319_133-158308-0 from training. Duration: 0.84 2023-10-02 21:30:37,575 WARNING [train.py:1204] (2/4) Exclude cut with ID _13252_210_1925_1_1530671372047_3336909_166-314607-0 from training. Duration: 0.54 2023-10-02 21:30:39,562 INFO [train.py:1046] (2/4) Epoch 29, batch 5100, loss[loss=0.1653, simple_loss=0.2469, pruned_loss=0.04184, over 23394.00 frames. ], tot_loss[loss=0.1646, simple_loss=0.243, pruned_loss=0.04316, over 4718039.01 frames. ], batch size: 93, lr: 3.50e-03, grad_scale: 16.0 2023-10-02 21:30:40,964 WARNING [train.py:1204] (2/4) Exclude cut with ID _32704_210_8954_1_1533017488840_6747370_649-79538-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 21:30:40,972 WARNING [train.py:1204] (2/4) Exclude cut with ID _28394_210_8913_1_1532430049518_3650118_343-115667-0_sp1.1 from training. Duration: 0.97275 2023-10-02 21:30:41,025 WARNING [train.py:1204] (2/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_87-278686-0_sp0.9 from training. Duration: 0.8555625 2023-10-02 21:30:43,734 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9109W0003-549749 from training. Duration: 0.768 2023-10-02 21:30:44,061 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.conv_module2.balancer2.prob, batch_count=1025600.0, ans=0.125 2023-10-02 21:30:46,328 WARNING [train.py:1204] (2/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_126-29104-0_sp1.1 from training. Duration: 0.77275 2023-10-02 21:30:46,671 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.nonlin_attention.balancer.prob, batch_count=1025600.0, ans=0.125 2023-10-02 21:30:48,413 WARNING [train.py:1204] (2/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_325-113798-0 from training. Duration: 0.85 2023-10-02 21:30:49,622 WARNING [train.py:1204] (2/4) Exclude cut with ID _6694_210_4599_1_1527852637490_3965529_116-109398-0 from training. Duration: 0.73 2023-10-02 21:30:51,017 WARNING [train.py:1204] (2/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_46-57119-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 21:30:51,154 WARNING [train.py:1204] (2/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_546-119312-0_sp1.1 from training. Duration: 0.9 2023-10-02 21:30:52,791 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.bypass.scale_min, batch_count=1025666.6666666666, ans=0.2 2023-10-02 21:30:54,035 WARNING [train.py:1204] (2/4) Exclude cut with ID _60057_210_10640_1_1534903065793_3333150_468-306939-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 21:30:54,083 WARNING [train.py:1204] (2/4) Exclude cut with ID _15150_210_8341_1_1530752411803_7256384_22-138274-0 from training. Duration: 0.96 2023-10-02 21:30:55,356 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_289-180904-0 from training. Duration: 0.76 2023-10-02 21:30:58,854 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.bypass.skip_rate, batch_count=1025666.6666666666, ans=0.07 2023-10-02 21:31:00,087 WARNING [train.py:1204] (2/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_64-133721-0_sp1.1 from training. Duration: 0.92725 2023-10-02 21:31:01,388 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_195-257512-0_sp1.1 from training. Duration: 0.6090625 2023-10-02 21:31:04,322 WARNING [train.py:1204] (2/4) Exclude cut with ID _66873_210_5517_1_1535454136506_4293370_704-101963-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 21:31:07,291 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.feed_forward1.out_proj.dropout_p, batch_count=1025733.3333333334, ans=0.1 2023-10-02 21:31:08,415 WARNING [train.py:1204] (2/4) Exclude cut with ID _4366_210_1388_1_1526792053633_3613819_231-211402-0 from training. Duration: 0.94 2023-10-02 21:31:08,463 WARNING [train.py:1204] (2/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_679-190385-0_sp1.1 from training. Duration: 0.97275 2023-10-02 21:31:10,482 WARNING [train.py:1204] (2/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_298-81459-0_sp1.1 from training. Duration: 0.963625 2023-10-02 21:31:10,491 WARNING [train.py:1204] (2/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_658-282610-0_sp0.9 from training. Duration: 0.588875 2023-10-02 21:31:11,908 WARNING [train.py:1204] (2/4) Exclude cut with ID _57572_210_5517_1_1534921219624_3809484_196-283734-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 21:31:11,978 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_53071_210_16554_1_1534490405_2954066_294-198554-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 21:31:11,983 WARNING [train.py:1204] (2/4) Exclude cut with ID _4866_210_2577_1_1527404412284_4364659_352-157930-0 from training. Duration: 0.81 2023-10-02 21:31:14,782 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9231W0113-610139 from training. Duration: 0.896 2023-10-02 21:31:14,840 WARNING [train.py:1204] (2/4) Exclude cut with ID _24271_210_12658_1_1531987270022_7190519_638-171906-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 21:31:16,141 WARNING [train.py:1204] (2/4) Exclude cut with ID _14170_210_5219_1_1530525949868_4469650_272-249985-0 from training. Duration: 0.67 2023-10-02 21:31:16,150 WARNING [train.py:1204] (2/4) Exclude cut with ID _64086_210_7994_1_1535270417019_7435949_449-31567-0 from training. Duration: 0.97 2023-10-02 21:31:19,923 WARNING [train.py:1204] (2/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_109-282568-0_sp1.1 from training. Duration: 0.97275 2023-10-02 21:31:20,061 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.balancer1.prob, batch_count=1025733.3333333334, ans=0.125 2023-10-02 21:31:26,990 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_121-45934-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 21:31:28,755 WARNING [train.py:1204] (2/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_314-126819-0 from training. Duration: 0.5 2023-10-02 21:31:28,777 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9309W0192-640319 from training. Duration: 0.512 2023-10-02 21:31:28,785 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9310W0003-640649 from training. Duration: 0.896 2023-10-02 21:31:31,410 WARNING [train.py:1204] (2/4) Exclude cut with ID _7160_210_1876_1_1527940923231_3703799_120-150943-0 from training. Duration: 0.81 2023-10-02 21:31:31,412 WARNING [train.py:1204] (2/4) Exclude cut with ID _40380_210_9059_1_1533380594759_4187380_6-33508-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 21:31:32,926 WARNING [train.py:1204] (2/4) Exclude cut with ID _59202_210_26984_1_1534816810612_3549174_137-318710-0 from training. Duration: 0.89 2023-10-02 21:31:33,209 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.feed_forward3.hidden_balancer.prob, batch_count=1025800.0, ans=0.125 2023-10-02 21:31:37,621 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_435-313305-0 from training. Duration: 0.71 2023-10-02 21:31:39,502 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.3.encoder.layers.1.nonlin_attention.whiten1, num_groups=1, num_channels=384, metric=6.63 vs. limit=10.0 2023-10-02 21:31:40,285 WARNING [train.py:1204] (2/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_91-212486-0_sp1.1 from training. Duration: 0.6181875 2023-10-02 21:31:41,630 WARNING [train.py:1204] (2/4) Exclude cut with ID _14603_210_4220_1_1530615688441_3097319_133-158308-0_sp1.1 from training. Duration: 0.763625 2023-10-02 21:31:44,387 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_99-251314-0 from training. Duration: 0.87 2023-10-02 21:31:46,092 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.574e+02 1.819e+02 1.985e+02 2.208e+02 2.966e+02, threshold=3.970e+02, percent-clipped=0.0 2023-10-02 21:31:47,477 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_365-217119-0_sp1.1 from training. Duration: 0.57275 2023-10-02 21:31:47,506 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_3475_210_2028_1_1526193578_3622569_377-166133-0 from training. Duration: 0.86 2023-10-02 21:31:52,363 WARNING [train.py:1204] (2/4) Exclude cut with ID _13916_210_9994_1_1530788341126_3763149_116-21034-0_sp1.1 from training. Duration: 0.87275 2023-10-02 21:31:52,373 WARNING [train.py:1204] (2/4) Exclude cut with ID _64220_210_9527_1_1535250151772_4875259_142-216831-0_sp1.1 from training. Duration: 0.936375 2023-10-02 21:31:52,373 WARNING [train.py:1204] (2/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_490-207861-0_sp1.1 from training. Duration: 0.8909375 2023-10-02 21:31:53,616 INFO [train.py:1046] (2/4) Epoch 29, batch 5150, loss[loss=0.1709, simple_loss=0.259, pruned_loss=0.04144, over 24658.00 frames. ], tot_loss[loss=0.1663, simple_loss=0.2444, pruned_loss=0.04407, over 4707423.29 frames. ], batch size: 73, lr: 3.50e-03, grad_scale: 16.0 2023-10-02 21:31:53,666 WARNING [train.py:1204] (2/4) Exclude cut with ID _41314_210_12651_1_1533459476543_3766460_87-272068-0_sp0.9 from training. Duration: 0.92225 2023-10-02 21:31:53,702 WARNING [train.py:1204] (2/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_18-46756-0_sp1.1 from training. Duration: 0.7181875 2023-10-02 21:31:55,049 WARNING [train.py:1204] (2/4) Exclude cut with ID _39411_210_5986_1_1533538825030_3343649_98-311335-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 21:31:56,368 WARNING [train.py:1204] (2/4) Exclude cut with ID _21878_210_3525_1_1531738772002_7264330_113-18163-0 from training. Duration: 0.89 2023-10-02 21:31:56,370 WARNING [train.py:1204] (2/4) Exclude cut with ID _6653_210_2629_1_1527919172441_6732679_1103-56445-0 from training. Duration: 0.73 2023-10-02 21:31:57,688 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_3474_210_2028_1_1526188513_4825246_145-309131-0 from training. Duration: 0.74 2023-10-02 21:31:57,721 WARNING [train.py:1204] (2/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_120-211630-0_sp1.1 from training. Duration: 0.836375 2023-10-02 21:31:57,739 WARNING [train.py:1204] (2/4) Exclude cut with ID _8030_210_7497_1_1528444886781_3531089_32-31837-0 from training. Duration: 0.97 2023-10-02 21:31:59,104 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_19949_210_2161_1_1531706337_928079_8-160956-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 21:32:00,425 WARNING [train.py:1204] (2/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_321-215742-0_sp1.1 from training. Duration: 0.4545625 2023-10-02 21:32:02,449 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_583-124650-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 21:32:05,269 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_30434_210_13049_1_1532519384_981746_164-182755-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 21:32:09,364 WARNING [train.py:1204] (2/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_339-104791-0_sp1.1 from training. Duration: 0.4454375 2023-10-02 21:32:09,389 WARNING [train.py:1204] (2/4) Exclude cut with ID _15026_210_8750_1_1530777602316_3291850_211-195043-0 from training. Duration: 0.8 2023-10-02 21:32:09,672 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.feed_forward3.hidden_balancer.prob, batch_count=1026000.0, ans=0.125 2023-10-02 21:32:10,819 WARNING [train.py:1204] (2/4) Exclude cut with ID _29877_210_6228_1_1532397791106_5276149_487-182647-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 21:32:10,872 WARNING [train.py:1204] (2/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_127-566-0_sp0.9 from training. Duration: 0.9333125 2023-10-02 21:32:12,880 WARNING [train.py:1204] (2/4) Exclude cut with ID _8263_210_1664_1_1528439452891_970220_68-181805-0_sp0.9 from training. Duration: 0.711125 2023-10-02 21:32:12,881 WARNING [train.py:1204] (2/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_504-72513-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 21:32:12,890 WARNING [train.py:1204] (2/4) Exclude cut with ID _64639_210_5816_1_1535367619437_3528000_324-265233-0_sp1.1 from training. Duration: 0.963625 2023-10-02 21:32:12,951 WARNING [train.py:1204] (2/4) Exclude cut with ID _6456_210_1534_1_1527681634212_3181049_215-309359-0_sp0.9 from training. Duration: 0.97775 2023-10-02 21:32:12,956 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_115-233929-0_sp1.1 from training. Duration: 0.6545625 2023-10-02 21:32:12,996 WARNING [train.py:1204] (2/4) Exclude cut with ID _14087_210_2253_1_1530529224512_3014029_219-224445-0 from training. Duration: 0.84 2023-10-02 21:32:15,711 WARNING [train.py:1204] (2/4) Exclude cut with ID _14867_210_1968_1_1530684179890_3846969_413-29292-0_sp1.1 from training. Duration: 0.8181875 2023-10-02 21:32:15,766 WARNING [train.py:1204] (2/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_1012-279473-0_sp0.9 from training. Duration: 0.7444375 2023-10-02 21:32:18,827 WARNING [train.py:1204] (2/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_605-133395-0_sp0.9 from training. Duration: 0.7333125 2023-10-02 21:32:20,304 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_89-35748-0 from training. Duration: 0.79 2023-10-02 21:32:23,003 WARNING [train.py:1204] (2/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_476-272757-0_sp1.1 from training. Duration: 0.8454375 2023-10-02 21:32:27,846 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.balancer2.prob, batch_count=1026066.6666666666, ans=0.125 2023-10-02 21:32:28,992 WARNING [train.py:1204] (2/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_449-15508-0_sp0.9 from training. Duration: 0.911125 2023-10-02 21:32:29,127 WARNING [train.py:1204] (2/4) Exclude cut with ID _29345_210_4381_1_1532689168263_3860169_198-174328-0 from training. Duration: 0.91 2023-10-02 21:32:33,709 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_15517_210_6753_1_1530952293_1664795_125-158157-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 21:32:39,356 WARNING [train.py:1204] (2/4) Exclude cut with ID _25565_210_12392_1_1532423009382_3952098_420-113180-0_sp1.1 from training. Duration: 0.963625 2023-10-02 21:32:39,598 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=1026133.3333333334, ans=0.1 2023-10-02 21:32:40,835 WARNING [train.py:1204] (2/4) Exclude cut with ID _51471_210_1889_1_1534399391390_3094250_236-146773-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 21:32:42,527 WARNING [train.py:1204] (2/4) Exclude cut with ID _30039_210_13270_1_1532484612900_2725150_175-35012-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 21:32:44,480 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_23850_210_4238_1_1532232597_1632433_99-275843-0_sp1.1 from training. Duration: 0.936375 2023-10-02 21:32:47,205 WARNING [train.py:1204] (2/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_658-282610-0 from training. Duration: 0.53 2023-10-02 21:32:50,483 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_37818_210_4238_1_1533620910_4720083_234-65934-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 21:32:51,792 WARNING [train.py:1204] (2/4) Exclude cut with ID _3067_210_2667_1_1526212974640_4503168_459-173867-0_sp1.1 from training. Duration: 0.8 2023-10-02 21:32:51,819 WARNING [train.py:1204] (2/4) Exclude cut with ID _8696_210_1876_1_1528545632440_3345058_209-54069-0_sp0.9 from training. Duration: 0.7444375 2023-10-02 21:32:52,038 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.ff2_skip_rate, batch_count=1026200.0, ans=0.0 2023-10-02 21:32:54,584 WARNING [train.py:1204] (2/4) Exclude cut with ID _62574_210_2655_1_1535180407058_3933249_420-236465-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 21:32:55,920 WARNING [train.py:1204] (2/4) Exclude cut with ID _46681_210_9642_1_1533893005571_7603359_333-4042-0_sp1.1 from training. Duration: 0.936375 2023-10-02 21:32:55,996 WARNING [train.py:1204] (2/4) Exclude cut with ID _3541_210_2477_1_1526475408709_3263973_377-296839-0 from training. Duration: 0.55 2023-10-02 21:33:01,888 WARNING [train.py:1204] (2/4) Exclude cut with ID _43997_210_4525_1_1533538632555_5189329_426-92599-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 21:33:03,579 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_43-71989-0_sp1.1 from training. Duration: 0.5181875 2023-10-02 21:33:05,014 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_2009_210_2071_1_1525481134_4588937_140-345530-0_sp1.1 from training. Duration: 0.963625 2023-10-02 21:33:05,029 WARNING [train.py:1204] (2/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_138-332773-0_sp0.9 from training. Duration: 0.9 2023-10-02 21:33:06,349 WARNING [train.py:1204] (2/4) Exclude cut with ID _13942_210_9988_1_1530604906036_3733360_499-55676-0_sp0.9 from training. Duration: 0.688875 2023-10-02 21:33:06,371 WARNING [train.py:1204] (2/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_259-34038-0_sp1.1 from training. Duration: 0.67275 2023-10-02 21:33:06,612 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=1026266.6666666666, ans=0.1 2023-10-02 21:33:07,674 INFO [train.py:1046] (2/4) Epoch 29, batch 5200, loss[loss=0.142, simple_loss=0.2257, pruned_loss=0.02918, over 24329.00 frames. ], tot_loss[loss=0.1669, simple_loss=0.2454, pruned_loss=0.04426, over 4710845.70 frames. ], batch size: 56, lr: 3.50e-03, grad_scale: 32.0 2023-10-02 21:33:07,727 WARNING [train.py:1204] (2/4) Exclude cut with ID _3120_210_3525_1_1526036143112_4072980_404-249994-0_sp1.1 from training. Duration: 0.9 2023-10-02 21:33:07,761 WARNING [train.py:1204] (2/4) Exclude cut with ID _40402_210_18219_1_1533522324115_2953150_222-108810-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 21:33:10,572 WARNING [train.py:1204] (2/4) Exclude cut with ID _22083_210_8254_1_1531702862439_3678970_49-234546-0_sp0.9 from training. Duration: 0.92225 2023-10-02 21:33:12,095 WARNING [train.py:1204] (2/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_199-295611-0_sp0.9 from training. Duration: 0.611125 2023-10-02 21:33:14,793 WARNING [train.py:1204] (2/4) Exclude cut with ID _43552_210_8913_1_1533726031059_3523429_43-192945-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 21:33:19,450 WARNING [train.py:1204] (2/4) Exclude cut with ID _14224_210_4381_1_1530529062037_4264010_364-22817-0 from training. Duration: 0.71 2023-10-02 21:33:19,627 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.ff3_skip_rate, batch_count=1026266.6666666666, ans=0.0 2023-10-02 21:33:21,476 WARNING [train.py:1204] (2/4) Exclude cut with ID _14922_210_6228_1_1530707435273_3693310_388-129552-0_sp1.1 from training. Duration: 0.87275 2023-10-02 21:33:22,786 WARNING [train.py:1204] (2/4) Exclude cut with ID _6537_210_1883_1_1527987559710_3597129_190-280227-0_sp1.1 from training. Duration: 0.97275 2023-10-02 21:33:24,193 WARNING [train.py:1204] (2/4) Exclude cut with ID _55844_210_2346_1_1534471212569_7625707_398-56897-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 21:33:24,401 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.nonlin_attention.balancer.min_positive, batch_count=1026333.3333333334, ans=0.05 2023-10-02 21:33:25,562 WARNING [train.py:1204] (2/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_48-182155-0_sp1.1 from training. Duration: 0.7909375 2023-10-02 21:33:25,589 WARNING [train.py:1204] (2/4) Exclude cut with ID _29529_210_7234_1_1532393961455_3977460_582-18084-0_sp1.1 from training. Duration: 0.97275 2023-10-02 21:33:26,300 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.2.encoder.layers.2.whiten, num_groups=1, num_channels=384, metric=3.63 vs. limit=12.0 2023-10-02 21:33:28,674 WARNING [train.py:1204] (2/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_912-282412-0 from training. Duration: 0.49 2023-10-02 21:33:31,269 WARNING [train.py:1204] (2/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_332-256168-0_sp0.9 from training. Duration: 0.8333125 2023-10-02 21:33:31,324 WARNING [train.py:1204] (2/4) Exclude cut with ID _64220_210_9527_1_1535250151772_4875259_55-250624-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 21:33:31,515 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.feed_forward1.hidden_balancer.prob, batch_count=1026333.3333333334, ans=0.125 2023-10-02 21:33:32,755 WARNING [train.py:1204] (2/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_264-108685-0 from training. Duration: 0.55 2023-10-02 21:33:34,259 WARNING [train.py:1204] (2/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_613-232856-0_sp0.9 from training. Duration: 0.6 2023-10-02 21:33:37,601 WARNING [train.py:1204] (2/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_68-212801-0_sp1.1 from training. Duration: 0.663625 2023-10-02 21:33:37,643 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6290_210_3273_1_1527595224_1282787_115-87095-0 from training. Duration: 0.55 2023-10-02 21:33:37,698 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_3080_210_2071_1_1526086102_4365166_312-8726-0 from training. Duration: 0.91 2023-10-02 21:33:39,277 WARNING [train.py:1204] (2/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_105-63309-0 from training. Duration: 0.98 2023-10-02 21:33:40,505 WARNING [train.py:1204] (2/4) Exclude cut with ID _2941_210_2577_1_1526199673150_2489759_201-160779-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 21:33:40,508 WARNING [train.py:1204] (2/4) Exclude cut with ID ID0406W0226-885434 from training. Duration: 0.997 2023-10-02 21:33:40,520 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_17292_210_6753_1_1531125388_2943734_353-328201-0_sp1.1 from training. Duration: 0.97275 2023-10-02 21:33:41,941 WARNING [train.py:1204] (2/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_572-96029-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 21:33:43,281 WARNING [train.py:1204] (2/4) Exclude cut with ID _14970_210_15709_1_1530775547361_1966965_6-23328-0_sp0.9 from training. Duration: 0.9555625 2023-10-02 21:33:43,329 WARNING [train.py:1204] (2/4) Exclude cut with ID _45012_210_12651_1_1533608435136_5371380_240-37101-0 from training. Duration: 0.93 2023-10-02 21:33:44,644 WARNING [train.py:1204] (2/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_685-90183-0_sp1.1 from training. Duration: 0.936375 2023-10-02 21:33:46,072 WARNING [train.py:1204] (2/4) Exclude cut with ID _13252_210_1925_1_1530671372047_3336909_41-22245-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 21:33:46,234 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.attention_skip_rate, batch_count=1026400.0, ans=0.0 2023-10-02 21:33:49,472 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.0.self_attn_weights.pos_emb_skip_rate, batch_count=1026400.0, ans=0.0 2023-10-02 21:33:50,669 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_15126_210_6753_1_1530778677_2591185_120-277707-0 from training. Duration: 0.8 2023-10-02 21:33:51,955 WARNING [train.py:1204] (2/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_310-26186-0 from training. Duration: 0.7 2023-10-02 21:33:51,970 WARNING [train.py:1204] (2/4) Exclude cut with ID _14998_210_5219_1_1530705582080_7474970_787-294416-0 from training. Duration: 0.82 2023-10-02 21:33:56,713 WARNING [train.py:1204] (2/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_795-33252-0 from training. Duration: 0.37 2023-10-02 21:33:56,753 WARNING [train.py:1204] (2/4) Exclude cut with ID _7537_210_2084_1_1528502395431_3924359_113-199933-0_sp0.9 from training. Duration: 0.6555625 2023-10-02 21:33:58,957 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.feed_forward3.hidden_balancer.prob, batch_count=1026466.6666666666, ans=0.125 2023-10-02 21:34:02,672 WARNING [train.py:1204] (2/4) Exclude cut with ID _3465_210_3525_1_1526175667060_3574049_308-231735-0_sp0.9 from training. Duration: 0.811125 2023-10-02 21:34:03,866 WARNING [train.py:1204] (2/4) Exclude cut with ID _19504_210_9994_1_1531648863242_3325220_354-296765-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 21:34:03,984 WARNING [train.py:1204] (2/4) Exclude cut with ID _5409_210_1388_1_1527292850189_5865191_178-127970-0 from training. Duration: 0.57 2023-10-02 21:34:04,026 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_1466-248056-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 21:34:04,044 WARNING [train.py:1204] (2/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_535-14410-0_sp0.9 from training. Duration: 0.52225 2023-10-02 21:34:04,047 WARNING [train.py:1204] (2/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_747-129941-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 21:34:05,360 WARNING [train.py:1204] (2/4) Exclude cut with ID _15012_210_1968_1_1530856820429_5095350_688-178849-0_sp1.1 from training. Duration: 0.5818125 2023-10-02 21:34:09,977 WARNING [train.py:1204] (2/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_356-45054-0_sp1.1 from training. Duration: 0.7818125 2023-10-02 21:34:11,383 WARNING [train.py:1204] (2/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_146-276944-0_sp0.9 from training. Duration: 0.92225 2023-10-02 21:34:11,704 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.attention_skip_rate, batch_count=1026533.3333333334, ans=0.0 2023-10-02 21:34:12,862 WARNING [train.py:1204] (2/4) Exclude cut with ID _39204_210_4220_1_1533880547368_3191120_346-180306-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 21:34:12,933 WARNING [train.py:1204] (2/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_641-158017-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 21:34:12,934 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_19952_210_2161_1_1531875469_3851750_274-42137-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 21:34:14,166 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.557e+02 1.881e+02 2.113e+02 2.405e+02 3.374e+02, threshold=4.225e+02, percent-clipped=0.0 2023-10-02 21:34:17,032 WARNING [train.py:1204] (2/4) Exclude cut with ID _60804_210_9642_1_1534831199859_7268299_87-327819-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 21:34:18,402 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_9302_210_2550_1_1528889962_4655416_638-301620-0 from training. Duration: 0.47 2023-10-02 21:34:20,202 WARNING [train.py:1204] (2/4) Exclude cut with ID _44304_210_15374_1_1533639352765_4613129_302-22084-0_sp1.1 from training. Duration: 0.7818125 2023-10-02 21:34:20,213 WARNING [train.py:1204] (2/4) Exclude cut with ID _18513_210_9206_1_1531618215223_3862620_177-227063-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 21:34:20,947 WARNING [train.py:1204] (2/4) Exclude cut with ID _41326_210_5854_1_1533778076509_3492080_510-45444-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 21:34:22,023 INFO [train.py:1046] (2/4) Epoch 29, batch 5250, loss[loss=0.1507, simple_loss=0.209, pruned_loss=0.04615, over 19428.00 frames. ], tot_loss[loss=0.1666, simple_loss=0.2447, pruned_loss=0.0443, over 4696477.96 frames. ], batch size: 388, lr: 3.50e-03, grad_scale: 32.0 2023-10-02 21:34:22,105 WARNING [train.py:1204] (2/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_76-311963-0_sp1.1 from training. Duration: 0.636375 2023-10-02 21:34:23,555 WARNING [train.py:1204] (2/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_156-302675-0_sp0.9 from training. Duration: 0.611125 2023-10-02 21:34:24,998 WARNING [train.py:1204] (2/4) Exclude cut with ID _53489_210_6228_1_1534298357169_3835970_367-148411-0_sp1.1 from training. Duration: 0.936375 2023-10-02 21:34:26,499 WARNING [train.py:1204] (2/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_389-144525-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 21:34:26,538 WARNING [train.py:1204] (2/4) Exclude cut with ID _14069_210_5632_1_1530512692033_4803390_158-238427-0_sp1.1 from training. Duration: 0.963625 2023-10-02 21:34:29,619 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_486-151131-0_sp0.9 from training. Duration: 0.9444375 2023-10-02 21:34:33,839 WARNING [train.py:1204] (2/4) Exclude cut with ID _39846_210_12651_1_1533603651992_1318030_188-71831-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 21:34:37,204 WARNING [train.py:1204] (2/4) Exclude cut with ID _39411_210_5986_1_1533538825030_3343649_173-238248-0_sp1.1 from training. Duration: 0.7545625 2023-10-02 21:34:38,522 WARNING [train.py:1204] (2/4) Exclude cut with ID _20684_210_6917_1_1531567751646_4203930_469-49081-0_sp1.1 from training. Duration: 0.92725 2023-10-02 21:34:39,934 WARNING [train.py:1204] (2/4) Exclude cut with ID _8833_210_5219_1_1528592665210_7553059_611-80313-0_sp1.1 from training. Duration: 0.5818125 2023-10-02 21:34:42,602 WARNING [train.py:1204] (2/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_440-141811-0 from training. Duration: 0.95 2023-10-02 21:34:42,622 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_41211_210_2544_1_1533348859_3702910_236-223652-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 21:34:43,957 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_40222_210_6228_1_1533385298_4481939_220-163998-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 21:35:08,238 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.feed_forward1.out_proj.dropout_p, batch_count=1026800.0, ans=0.1 2023-10-02 21:35:21,137 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.3.encoder.layers.3.nonlin_attention.whiten2, num_groups=1, num_channels=512, metric=6.04 vs. limit=15.0 2023-10-02 21:35:23,324 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.conv_module1.balancer2.prob, batch_count=1026866.6666666666, ans=0.125 2023-10-02 21:35:24,663 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.bypass.skip_rate, batch_count=1026866.6666666666, ans=0.07 2023-10-02 21:35:30,836 INFO [train.py:1046] (2/4) Epoch 29, batch 5300, loss[loss=0.1551, simple_loss=0.2377, pruned_loss=0.03626, over 24468.00 frames. ], tot_loss[loss=0.1649, simple_loss=0.2422, pruned_loss=0.0438, over 4688368.28 frames. ], batch size: 63, lr: 3.49e-03, grad_scale: 32.0 2023-10-02 21:35:31,110 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.bypass_mid.scale_min, batch_count=1026933.3333333334, ans=0.2 2023-10-02 21:35:45,053 WARNING [train.py:1204] (2/4) Exclude cut with ID _14629_210_15709_1_1530604298182_956899_6-232695-0_sp1.1 from training. Duration: 0.8545625 2023-10-02 21:35:45,074 WARNING [train.py:1204] (2/4) Exclude cut with ID _13921_210_9994_1_1530841782272_3503520_327-69677-0 from training. Duration: 0.9 2023-10-02 21:35:45,100 WARNING [train.py:1204] (2/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_372-298093-0 from training. Duration: 0.8 2023-10-02 21:35:45,114 WARNING [train.py:1204] (2/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_515-139111-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 21:35:45,350 WARNING [train.py:1204] (2/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_85-65056-0_sp1.1 from training. Duration: 0.87275 2023-10-02 21:35:45,749 WARNING [train.py:1204] (2/4) Exclude cut with ID _15113_210_8254_1_1530842438579_3716279_209-54533-0_sp1.1 from training. Duration: 0.87275 2023-10-02 21:35:45,798 WARNING [train.py:1204] (2/4) Exclude cut with ID _70447_210_12392_1_1535972499300_3160420_123-312838-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 21:35:45,825 WARNING [train.py:1204] (2/4) Exclude cut with ID _60113_210_16271_1_1535164212481_4386079_70-63596-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 21:35:45,853 WARNING [train.py:1204] (2/4) Exclude cut with ID _14632_210_9988_1_1530691279392_3923200_225-188509-0_sp1.1 from training. Duration: 0.963625 2023-10-02 21:35:45,910 WARNING [train.py:1204] (2/4) Exclude cut with ID _34734_210_4525_1_1533365977542_4448720_584-180532-0_sp1.1 from training. Duration: 0.97275 2023-10-02 21:35:45,924 WARNING [train.py:1204] (2/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_351-294650-0_sp1.1 from training. Duration: 0.57275 2023-10-02 21:35:46,259 WARNING [train.py:1204] (2/4) Exclude cut with ID _13252_210_1925_1_1530671372047_3336909_263-168388-0_sp0.9 from training. Duration: 0.9555625 2023-10-02 21:35:46,289 WARNING [train.py:1204] (2/4) Exclude cut with ID _22083_210_8254_1_1531702862439_3678970_201-85738-0 from training. Duration: 0.9 2023-10-02 21:35:46,396 WARNING [train.py:1204] (2/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_237-58433-0 from training. Duration: 0.74 2023-10-02 21:35:46,424 WARNING [train.py:1204] (2/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_262-250826-0 from training. Duration: 0.99 2023-10-02 21:35:46,535 WARNING [train.py:1204] (2/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_927-43827-0_sp0.9 from training. Duration: 0.82225 2023-10-02 21:35:46,567 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_2821_210_2715_1_1526108385_3763560_73-296673-0 from training. Duration: 0.83 2023-10-02 21:35:46,640 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6287_210_3273_1_1527509506_2653716_182-199887-0 from training. Duration: 0.51 2023-10-02 21:35:46,773 WARNING [train.py:1204] (2/4) Exclude cut with ID _70519_210_4163_1_1535961595994_7458230_899-80223-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 21:35:47,030 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_210-234949-0_sp1.1 from training. Duration: 0.97275 2023-10-02 21:35:47,068 WARNING [train.py:1204] (2/4) Exclude cut with ID _30196_210_1664_1_1533708496522_7074140_768-278176-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 21:35:47,147 WARNING [train.py:1204] (2/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_315-128283-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 21:35:47,226 WARNING [train.py:1204] (2/4) Exclude cut with ID _14886_210_4220_1_1530702020355_3516190_330-323941-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 21:35:47,866 WARNING [train.py:1204] (2/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_264-348896-0_sp1.1 from training. Duration: 0.77275 2023-10-02 21:35:47,888 WARNING [train.py:1204] (2/4) Exclude cut with ID _37086_210_9038_1_1532999540891_7021250_433-10563-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 21:35:47,929 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_53638_210_21581_1_1534297319_5808449_885-80172-0_sp1.1 from training. Duration: 0.87275 2023-10-02 21:35:48,033 WARNING [train.py:1204] (2/4) Exclude cut with ID _14623_210_1925_1_1530773354962_3632609_157-218771-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 21:35:48,045 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_1368-78165-0_sp1.1 from training. Duration: 0.97275 2023-10-02 21:35:48,050 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_309-65177-0_sp0.9 from training. Duration: 0.97775 2023-10-02 21:35:48,056 WARNING [train.py:1204] (2/4) Exclude cut with ID _21632_210_2253_1_1531994001765_3744140_291-138812-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 21:35:48,069 WARNING [train.py:1204] (2/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_669-93163-0_sp1.1 from training. Duration: 0.8909375 2023-10-02 21:35:48,641 WARNING [train.py:1204] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_292-215346-0 from training. Duration: 0.97 2023-10-02 21:35:48,666 WARNING [train.py:1204] (2/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_286-222073-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 21:35:48,953 WARNING [train.py:1204] (2/4) Exclude cut with ID _59256_210_5986_1_1535076085997_3589120_121-237386-0_sp1.1 from training. Duration: 0.87275 2023-10-02 21:35:48,967 WARNING [train.py:1204] (2/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_96-320736-0 from training. Duration: 0.86 2023-10-02 21:35:48,976 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_128-202104-0 from training. Duration: 0.63 2023-10-02 21:35:49,063 WARNING [train.py:1204] (2/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_242-3472-0_sp0.9 from training. Duration: 0.811125 2023-10-02 21:35:49,105 WARNING [train.py:1204] (2/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_154-74182-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 21:35:49,109 WARNING [train.py:1204] (2/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_187-221566-0 from training. Duration: 0.43 2023-10-02 21:35:49,274 WARNING [train.py:1204] (2/4) Exclude cut with ID _6194_210_1534_1_1527511826160_3941194_14-282876-0 from training. Duration: 0.71 2023-10-02 21:35:49,315 WARNING [train.py:1204] (2/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_660-211088-0_sp1.1 from training. Duration: 0.563625 2023-10-02 21:35:49,637 WARNING [train.py:1204] (2/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_526-62079-0_sp1.1 from training. Duration: 0.6090625 2023-10-02 21:35:49,784 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_92-246270-0_sp1.1 from training. Duration: 0.77275 2023-10-02 21:35:49,884 WARNING [train.py:1204] (2/4) Exclude cut with ID IC0287W0023-142350 from training. Duration: 0.956 2023-10-02 21:35:49,964 WARNING [train.py:1204] (2/4) Exclude cut with ID _58605_210_12210_1_1534723195939_3904259_48-269402-0 from training. Duration: 0.96 2023-10-02 21:35:49,981 WARNING [train.py:1204] (2/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_416-257730-0_sp0.9 from training. Duration: 0.72225 2023-10-02 21:35:50,552 WARNING [train.py:1204] (2/4) Exclude cut with ID _7877_210_6014_1_1528286358099_3750136_185-17516-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 21:35:50,654 WARNING [train.py:1204] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_23-189234-0 from training. Duration: 0.83 2023-10-02 21:35:50,670 WARNING [train.py:1204] (2/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_237-89684-0 from training. Duration: 0.96 2023-10-02 21:35:50,736 WARNING [train.py:1204] (2/4) Exclude cut with ID _13916_210_9994_1_1530788341126_3763149_116-21034-0 from training. Duration: 0.96 2023-10-02 21:35:50,870 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_3133_210_2087_1_1526106446_2324690_246-125000-0_sp1.1 from training. Duration: 0.563625 2023-10-02 21:35:52,341 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.bypass.scale_min, batch_count=1027013.3333333334, ans=0.2 2023-10-02 21:35:57,017 INFO [train.py:1046] (2/4) Epoch 30, batch 0, loss[loss=0.1627, simple_loss=0.245, pruned_loss=0.04017, over 23364.00 frames. ], tot_loss[loss=0.1627, simple_loss=0.245, pruned_loss=0.04017, over 23364.00 frames. ], batch size: 119, lr: 3.43e-03, grad_scale: 32.0 2023-10-02 21:35:57,017 INFO [train.py:1069] (2/4) Computing validation loss 2023-10-02 21:36:08,954 INFO [train.py:1078] (2/4) Epoch 30, validation: loss=0.3201, simple_loss=0.2693, pruned_loss=0.1854, over 1125622.00 frames. 2023-10-02 21:36:08,954 INFO [train.py:1079] (2/4) Maximum memory allocated so far is 20967MB 2023-10-02 21:36:10,423 WARNING [train.py:1204] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_402-253027-0 from training. Duration: 0.54 2023-10-02 21:36:11,712 WARNING [train.py:1204] (2/4) Exclude cut with ID _59288_210_5854_1_1535077757774_3412246_158-289345-0_sp1.1 from training. Duration: 0.92725 2023-10-02 21:36:14,803 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_28211_210_6388_1_1532861506_4523224_128-328162-0_sp1.1 from training. Duration: 0.8181875 2023-10-02 21:36:17,012 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.feed_forward2.hidden_balancer.prob, batch_count=1027013.3333333334, ans=0.125 2023-10-02 21:36:20,834 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_788-278281-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 21:36:20,842 WARNING [train.py:1204] (2/4) Exclude cut with ID _49728_210_12651_1_1534041034294_6788150_624-195301-0_sp0.9 from training. Duration: 0.9444375 2023-10-02 21:36:22,190 WARNING [train.py:1204] (2/4) Exclude cut with ID _14860_210_4915_1_1530702020340_7343240_267-139775-0_sp1.1 from training. Duration: 0.963625 2023-10-02 21:36:23,576 WARNING [train.py:1204] (2/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_332-221057-0 from training. Duration: 0.68 2023-10-02 21:36:24,955 WARNING [train.py:1204] (2/4) Exclude cut with ID _40402_210_18219_1_1533522324115_2953150_242-76943-0 from training. Duration: 0.94 2023-10-02 21:36:26,377 WARNING [train.py:1204] (2/4) Exclude cut with ID _16747_210_5986_1_1531811544733_2749109_87-349587-0_sp1.1 from training. Duration: 0.963625 2023-10-02 21:36:27,763 WARNING [train.py:1204] (2/4) Exclude cut with ID _22982_210_1883_1_1531882790447_5449409_505-346452-0_sp1.1 from training. Duration: 0.963625 2023-10-02 21:36:31,911 WARNING [train.py:1204] (2/4) Exclude cut with ID _17956_210_4353_1_1531209568125_6979970_13-192107-0_sp1.1 from training. Duration: 0.963625 2023-10-02 21:36:31,953 WARNING [train.py:1204] (2/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_430-307666-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 21:36:33,786 WARNING [train.py:1204] (2/4) Exclude cut with ID _2895_210_2477_1_1525956785794_4063750_289-11475-0_sp1.1 from training. Duration: 0.7454375 2023-10-02 21:36:33,800 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_3910_210_1794_1_1526742306_334530_28-238909-0_sp0.9 from training. Duration: 0.9 2023-10-02 21:36:33,928 WARNING [train.py:1204] (2/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_18-130336-0 from training. Duration: 0.79 2023-10-02 21:36:35,494 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.nonlin_attention.balancer.prob, batch_count=1027080.0, ans=0.125 2023-10-02 21:36:36,620 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_548-294316-0_sp0.9 from training. Duration: 0.9 2023-10-02 21:36:38,537 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.self_attn_weights.pos_emb_skip_rate, batch_count=1027146.6666666666, ans=0.0 2023-10-02 21:36:41,080 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_42-142473-0_sp1.1 from training. Duration: 0.6545625 2023-10-02 21:36:41,095 WARNING [train.py:1204] (2/4) Exclude cut with ID _59170_210_6721_1_1534983633960_4203335_326-48066-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 21:36:42,478 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_45886_210_17024_1_1533709215_471063_7-79165-0 from training. Duration: 0.98 2023-10-02 21:36:46,450 WARNING [train.py:1204] (2/4) Exclude cut with ID _44585_210_18628_1_1533605350387_4518199_280-73534-0_sp0.9 from training. Duration: 0.988875 2023-10-02 21:36:46,457 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_3475_210_2028_1_1526193578_3622569_265-276625-0_sp1.1 from training. Duration: 0.6909375 2023-10-02 21:36:47,809 WARNING [train.py:1204] (2/4) Exclude cut with ID _64631_210_27767_1_1535349580068_4165160_88-225232-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 21:36:50,670 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_205-265518-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 21:36:54,791 WARNING [train.py:1204] (2/4) Exclude cut with ID _40402_210_18219_1_1533522324115_2953150_271-253734-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 21:36:58,860 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.602e+02 2.060e+02 2.425e+02 2.930e+02 5.326e+02, threshold=4.849e+02, percent-clipped=3.0 2023-10-02 21:37:00,376 WARNING [train.py:1204] (2/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_282-257791-0 from training. Duration: 0.86 2023-10-02 21:37:04,981 WARNING [train.py:1204] (2/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_356-45054-0 from training. Duration: 0.86 2023-10-02 21:37:05,012 WARNING [train.py:1204] (2/4) Exclude cut with ID _69584_210_5219_1_1535785133775_4044450_38-348116-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 21:37:05,014 WARNING [train.py:1204] (2/4) Exclude cut with ID _66763_210_17301_1_1535788667617_241750_21-328480-0_sp1.1 from training. Duration: 0.936375 2023-10-02 21:37:06,344 WARNING [train.py:1204] (2/4) Exclude cut with ID _26528_210_9070_1_1532570410842_3851310_497-97218-0_sp0.9 from training. Duration: 0.9555625 2023-10-02 21:37:06,406 WARNING [train.py:1204] (2/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_592-101075-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 21:37:09,202 WARNING [train.py:1204] (2/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_678-159307-0 from training. Duration: 0.85 2023-10-02 21:37:10,566 WARNING [train.py:1204] (2/4) Exclude cut with ID _28427_210_4220_1_1532581240967_3061069_166-319773-0_sp1.1 from training. Duration: 0.936375 2023-10-02 21:37:11,959 WARNING [train.py:1204] (2/4) Exclude cut with ID _29877_210_6228_1_1532397791106_5276149_606-224984-0_sp1.1 from training. Duration: 0.936375 2023-10-02 21:37:17,049 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_125-203105-0_sp1.1 from training. Duration: 0.72725 2023-10-02 21:37:19,810 WARNING [train.py:1204] (2/4) Exclude cut with ID IC0620W0113-306765 from training. Duration: 0.768 2023-10-02 21:37:21,238 WARNING [train.py:1204] (2/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_410-287857-0_sp1.1 from training. Duration: 0.8454375 2023-10-02 21:37:22,487 INFO [train.py:1046] (2/4) Epoch 30, batch 50, loss[loss=0.1761, simple_loss=0.248, pruned_loss=0.05209, over 23777.00 frames. ], tot_loss[loss=0.1681, simple_loss=0.2456, pruned_loss=0.0453, over 1058770.77 frames. ], batch size: 212, lr: 3.43e-03, grad_scale: 16.0 2023-10-02 21:37:23,920 WARNING [train.py:1204] (2/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_78-95260-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 21:37:25,389 WARNING [train.py:1204] (2/4) Exclude cut with ID _58059_210_5854_1_1534901471149_3164808_6-73080-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 21:37:25,403 WARNING [train.py:1204] (2/4) Exclude cut with ID _5166_210_2084_1_1527073197160_4087993_331-117829-0 from training. Duration: 0.62 2023-10-02 21:37:26,713 WARNING [train.py:1204] (2/4) Exclude cut with ID _14880_210_8895_1_1530680426655_3795259_289-212817-0_sp0.9 from training. Duration: 0.7666875 2023-10-02 21:37:26,753 WARNING [train.py:1204] (2/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_620-144779-0_sp1.1 from training. Duration: 0.92725 2023-10-02 21:37:29,421 WARNING [train.py:1204] (2/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_561-288575-0_sp1.1 from training. Duration: 0.963625 2023-10-02 21:37:30,835 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_22657_210_6890_1_1532086002_1637216_122-188128-0_sp1.1 from training. Duration: 0.963625 2023-10-02 21:37:33,463 WARNING [train.py:1204] (2/4) Exclude cut with ID _16756_210_4915_1_1531047803763_7104259_604-86607-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 21:37:34,977 WARNING [train.py:1204] (2/4) Exclude cut with ID _15150_210_8341_1_1530752411803_7256384_469-239509-0 from training. Duration: 0.68 2023-10-02 21:37:36,293 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_10987_210_4163_1_1529760555_4123902_302-87926-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 21:37:42,554 WARNING [train.py:1204] (2/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_469-94673-0_sp0.9 from training. Duration: 0.87775 2023-10-02 21:37:44,564 WARNING [train.py:1204] (2/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_798-106179-0 from training. Duration: 0.83 2023-10-02 21:37:45,945 WARNING [train.py:1204] (2/4) Exclude cut with ID _16744_210_5986_1_1531724405384_3907567_242-111316-0 from training. Duration: 0.93 2023-10-02 21:37:47,473 WARNING [train.py:1204] (2/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_938-205135-0_sp0.9 from training. Duration: 0.9666875 2023-10-02 21:37:50,637 WARNING [train.py:1204] (2/4) Exclude cut with ID _64135_210_2253_1_1535677267876_7608972_133-188266-0_sp0.9 from training. Duration: 0.9 2023-10-02 21:37:50,642 WARNING [train.py:1204] (2/4) Exclude cut with ID _44585_210_18628_1_1533605350387_4518199_623-346820-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 21:37:52,094 WARNING [train.py:1204] (2/4) Exclude cut with ID _42326_210_7770_1_1533814282531_3707269_454-266903-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 21:37:52,175 WARNING [train.py:1204] (2/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_5-55893-0_sp1.1 from training. Duration: 0.5 2023-10-02 21:37:53,516 WARNING [train.py:1204] (2/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_577-332600-0_sp1.1 from training. Duration: 0.5181875 2023-10-02 21:37:53,517 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_957-138882-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 21:37:56,476 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.ff2_skip_rate, batch_count=1027480.0, ans=0.0 2023-10-02 21:37:56,484 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.feed_forward1.out_proj.dropout_p, batch_count=1027480.0, ans=0.1 2023-10-02 21:38:00,425 WARNING [train.py:1204] (2/4) Exclude cut with ID _12556_210_8341_1_1530081024100_7327690_219-38004-0_sp1.1 from training. Duration: 0.97275 2023-10-02 21:38:00,535 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_397-147196-0_sp1.1 from training. Duration: 0.72725 2023-10-02 21:38:01,887 WARNING [train.py:1204] (2/4) Exclude cut with ID _3541_210_2477_1_1526475408709_3263973_231-104736-0_sp0.9 from training. Duration: 0.8666875 2023-10-02 21:38:03,220 WARNING [train.py:1204] (2/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_398-334558-0 from training. Duration: 0.96 2023-10-02 21:38:04,597 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_257-20733-0_sp1.1 from training. Duration: 0.6909375 2023-10-02 21:38:04,666 WARNING [train.py:1204] (2/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_146-282400-0_sp1.1 from training. Duration: 0.6454375 2023-10-02 21:38:04,669 WARNING [train.py:1204] (2/4) Exclude cut with ID _51899_210_20034_1_1534129254466_3669220_265-215428-0 from training. Duration: 0.98 2023-10-02 21:38:05,997 WARNING [train.py:1204] (2/4) Exclude cut with ID _34423_210_8750_1_1533430798456_3330199_219-106541-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 21:38:08,581 WARNING [train.py:1204] (2/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_458-67321-0 from training. Duration: 0.97 2023-10-02 21:38:12,371 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.conv_module2.balancer2.prob, batch_count=1027546.6666666666, ans=0.125 2023-10-02 21:38:14,145 WARNING [train.py:1204] (2/4) Exclude cut with ID _6653_210_2629_1_1527919172441_6732679_1051-182606-0_sp1.1 from training. Duration: 0.936375 2023-10-02 21:38:14,160 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_53555_210_5632_1_1534595220_7232297_603-4396-0_sp1.1 from training. Duration: 0.97275 2023-10-02 21:38:14,713 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.4.encoder.layers.2.whiten, num_groups=1, num_channels=384, metric=3.03 vs. limit=12.0 2023-10-02 21:38:15,488 WARNING [train.py:1204] (2/4) Exclude cut with ID _54218_210_12210_1_1534413646388_4409259_159-144367-0_sp1.1 from training. Duration: 0.963625 2023-10-02 21:38:18,636 WARNING [train.py:1204] (2/4) Exclude cut with ID _7606_210_4915_1_1528894253277_5080690_301-10732-0_sp0.9 from training. Duration: 0.9 2023-10-02 21:38:18,638 WARNING [train.py:1204] (2/4) Exclude cut with ID _15026_210_8750_1_1530777602316_3291850_211-195043-0_sp0.9 from training. Duration: 0.888875 2023-10-02 21:38:20,256 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.nonlin_attention.balancer.prob, batch_count=1027613.3333333334, ans=0.125 2023-10-02 21:38:21,982 WARNING [train.py:1204] (2/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_146-282400-0 from training. Duration: 0.71 2023-10-02 21:38:22,012 WARNING [train.py:1204] (2/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_65-88242-0 from training. Duration: 0.56 2023-10-02 21:38:23,432 WARNING [train.py:1204] (2/4) Exclude cut with ID _55857_210_2346_1_1534644007831_6898209_80-107757-0_sp1.1 from training. Duration: 0.963625 2023-10-02 21:38:23,480 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_15126_210_6753_1_1530778677_2591185_120-277707-0_sp0.9 from training. Duration: 0.888875 2023-10-02 21:38:24,857 WARNING [train.py:1204] (2/4) Exclude cut with ID _44778_210_2054_1_1533798127203_7443090_1033-168593-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 21:38:24,895 WARNING [train.py:1204] (2/4) Exclude cut with ID _46683_210_9642_1_1533730913492_4591116_475-30711-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 21:38:26,215 WARNING [train.py:1204] (2/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_253-308377-0 from training. Duration: 0.49 2023-10-02 21:38:26,282 WARNING [train.py:1204] (2/4) Exclude cut with ID _4680_210_3525_1_1527249757953_893386_75-78979-0 from training. Duration: 0.91 2023-10-02 21:38:27,725 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_5522_210_3273_1_1527249606_3909096_318-203153-0_sp0.9 from training. Duration: 0.47775 2023-10-02 21:38:30,500 WARNING [train.py:1204] (2/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_550-170116-0_sp1.1 from training. Duration: 0.92725 2023-10-02 21:38:30,536 WARNING [train.py:1204] (2/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_277-210798-0_sp0.9 from training. Duration: 0.611125 2023-10-02 21:38:31,854 WARNING [train.py:1204] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_620-276793-0 from training. Duration: 0.59 2023-10-02 21:38:31,864 WARNING [train.py:1204] (2/4) Exclude cut with ID _3067_210_2667_1_1526212974640_4503168_119-175776-0 from training. Duration: 0.74 2023-10-02 21:38:31,944 WARNING [train.py:1204] (2/4) Exclude cut with ID _55209_210_1531_1_1534675828996_4597160_681-195827-0_sp1.1 from training. Duration: 0.92725 2023-10-02 21:38:33,367 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_427-78097-0_sp1.1 from training. Duration: 0.72725 2023-10-02 21:38:34,726 WARNING [train.py:1204] (2/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_96-290039-0_sp0.9 from training. Duration: 0.62225 2023-10-02 21:38:34,742 WARNING [train.py:1204] (2/4) Exclude cut with ID _45329_210_7005_1_1534157951211_6774339_287-292174-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 21:38:36,011 INFO [train.py:1046] (2/4) Epoch 30, batch 100, loss[loss=0.1798, simple_loss=0.2644, pruned_loss=0.04762, over 23233.00 frames. ], tot_loss[loss=0.1682, simple_loss=0.2462, pruned_loss=0.04512, over 1869773.18 frames. ], batch size: 93, lr: 3.43e-03, grad_scale: 16.0 2023-10-02 21:38:37,398 WARNING [train.py:1204] (2/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_200-292228-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 21:38:40,579 WARNING [train.py:1204] (2/4) Exclude cut with ID _69884_210_9771_1_1535846392818_4166169_291-194999-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 21:38:43,390 WARNING [train.py:1204] (2/4) Exclude cut with ID _58895_210_8750_1_1535184081320_3649089_265-323810-0_sp1.1 from training. Duration: 0.87275 2023-10-02 21:38:44,776 WARNING [train.py:1204] (2/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_58-68956-0 from training. Duration: 0.83 2023-10-02 21:38:44,780 WARNING [train.py:1204] (2/4) Exclude cut with ID _39867_210_9826_1_1533553222030_9344259_322-115153-0_sp1.1 from training. Duration: 0.963625 2023-10-02 21:38:49,722 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_3371_210_1794_1_1526384462_5580777_396-339812-0_sp1.1 from training. Duration: 0.77275 2023-10-02 21:38:49,733 WARNING [train.py:1204] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_731-2428-0_sp0.9 from training. Duration: 0.92225 2023-10-02 21:38:49,753 WARNING [train.py:1204] (2/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_98-741-0_sp1.1 from training. Duration: 0.72725 2023-10-02 21:38:49,755 WARNING [train.py:1204] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_238-245655-0_sp0.9 from training. Duration: 0.9 2023-10-02 21:38:49,787 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6788_210_1388_1_1527898698_4562021_91-197981-0_sp0.9 from training. Duration: 0.92225 2023-10-02 21:38:51,211 WARNING [train.py:1204] (2/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_860-96198-0 from training. Duration: 0.73 2023-10-02 21:38:52,716 WARNING [train.py:1204] (2/4) Exclude cut with ID _14268_210_9206_1_1530622771285_4714099_331-105351-0_sp0.9 from training. Duration: 0.72225 2023-10-02 21:38:52,748 WARNING [train.py:1204] (2/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_204-137133-0_sp1.1 from training. Duration: 0.92725 2023-10-02 21:38:52,780 WARNING [train.py:1204] (2/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_5-46496-0_sp1.1 from training. Duration: 0.936375 2023-10-02 21:38:52,788 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_160-218721-0_sp1.1 from training. Duration: 0.87275 2023-10-02 21:38:57,442 WARNING [train.py:1204] (2/4) Exclude cut with ID _45329_210_7005_1_1534157951211_6774339_503-287883-0 from training. Duration: 0.88 2023-10-02 21:38:58,738 WARNING [train.py:1204] (2/4) Exclude cut with ID _57572_210_5517_1_1534921219624_3809484_110-79134-0_sp1.1 from training. Duration: 0.92725 2023-10-02 21:38:58,821 WARNING [train.py:1204] (2/4) Exclude cut with ID _44585_210_18628_1_1533605350387_4518199_421-37641-0_sp1.1 from training. Duration: 0.936375 2023-10-02 21:39:00,246 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_42-142473-0_sp0.9 from training. Duration: 0.8 2023-10-02 21:39:01,675 WARNING [train.py:1204] (2/4) Exclude cut with ID _5166_210_2084_1_1527073197160_4087993_310-114452-0_sp0.9 from training. Duration: 0.6555625 2023-10-02 21:39:05,861 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9029W0120-509520 from training. Duration: 0.768 2023-10-02 21:39:05,875 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9029W0433-509820 from training. Duration: 0.896 2023-10-02 21:39:05,995 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_40222_210_6228_1_1533385298_4481939_163-334586-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 21:39:05,995 WARNING [train.py:1204] (2/4) Exclude cut with ID _48677_210_5663_1_1533902405919_3892989_408-40740-0_sp1.1 from training. Duration: 0.7545625 2023-10-02 21:39:11,261 WARNING [train.py:1204] (2/4) Exclude cut with ID _24314_210_4915_1_1532135078116_7170179_521-258868-0_sp0.9 from training. Duration: 0.87775 2023-10-02 21:39:13,088 WARNING [train.py:1204] (2/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_576-51198-0_sp1.1 from training. Duration: 0.92725 2023-10-02 21:39:13,371 INFO [scaling.py:1118] (2/4) WithLoss: name=encoder.encoders.2.encoder.layers.0.self_attn_weights, loss-sum=0.000e+00 2023-10-02 21:39:14,554 WARNING [train.py:1204] (2/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_306-216871-0_sp1.1 from training. Duration: 0.97275 2023-10-02 21:39:19,367 WARNING [train.py:1204] (2/4) Exclude cut with ID _20684_210_6917_1_1531567751646_4203930_463-119982-0_sp1.1 from training. Duration: 0.97275 2023-10-02 21:39:19,439 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9084W0012-536850 from training. Duration: 0.64 2023-10-02 21:39:22,054 WARNING [train.py:1204] (2/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_847-41334-0_sp1.1 from training. Duration: 0.4 2023-10-02 21:39:25,379 WARNING [train.py:1204] (2/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_326-165900-0_sp1.1 from training. Duration: 0.62725 2023-10-02 21:39:26,743 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.523e+02 1.810e+02 1.965e+02 2.263e+02 3.377e+02, threshold=3.931e+02, percent-clipped=0.0 2023-10-02 21:39:26,860 WARNING [train.py:1204] (2/4) Exclude cut with ID _7537_210_2084_1_1528502395431_3924359_128-268029-0_sp1.1 from training. Duration: 0.8909375 2023-10-02 21:39:29,560 WARNING [train.py:1204] (2/4) Exclude cut with ID _67499_210_17282_1_1535509805220_3692430_333-155030-0_sp1.1 from training. Duration: 0.97275 2023-10-02 21:39:32,354 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_34008_210_10431_1_1533345424_8183247_588-257177-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 21:39:33,814 WARNING [train.py:1204] (2/4) Exclude cut with ID _25914_210_6400_1_1532141317058_644740_52-96808-0_sp1.1 from training. Duration: 0.82725 2023-10-02 21:39:35,208 WARNING [train.py:1204] (2/4) Exclude cut with ID _56494_210_4220_1_1535284837625_3033169_69-138150-0_sp1.1 from training. Duration: 0.8545625 2023-10-02 21:39:37,910 WARNING [train.py:1204] (2/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_19-143553-0_sp1.1 from training. Duration: 0.97275 2023-10-02 21:39:38,040 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.ff2_skip_rate, batch_count=1027946.6666666666, ans=0.0 2023-10-02 21:39:39,204 WARNING [train.py:1204] (2/4) Exclude cut with ID _21878_210_3525_1_1531738772002_7264330_582-130735-0_sp1.1 from training. Duration: 0.936375 2023-10-02 21:39:40,664 WARNING [train.py:1204] (2/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_943-201356-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 21:39:40,674 WARNING [train.py:1204] (2/4) Exclude cut with ID _3231_210_2106_1_1526205864934_6784220_422-229276-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 21:39:40,699 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_48049_210_5632_1_1533864553_3519937_73-198717-0_sp1.1 from training. Duration: 0.97275 2023-10-02 21:39:40,759 WARNING [train.py:1204] (2/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_329-285801-0 from training. Duration: 0.91 2023-10-02 21:39:42,661 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9171W0114-579000 from training. Duration: 0.896 2023-10-02 21:39:42,673 WARNING [train.py:1204] (2/4) Exclude cut with ID _3120_210_3525_1_1526036143112_4072980_210-132675-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 21:39:42,751 WARNING [train.py:1204] (2/4) Exclude cut with ID _55857_210_2346_1_1534644007831_6898209_745-98391-0_sp0.9 from training. Duration: 0.8444375 2023-10-02 21:39:42,809 WARNING [train.py:1204] (2/4) Exclude cut with ID _5250_210_5219_1_1527249561806_8692110_615-322035-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 21:39:42,813 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_36756_210_7234_1_1533026991_966276_119-315767-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 21:39:42,968 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.bypass_mid.scale_min, batch_count=1027946.6666666666, ans=0.2 2023-10-02 21:39:44,102 WARNING [train.py:1204] (2/4) Exclude cut with ID _13916_210_9994_1_1530788341126_3763149_60-191556-0_sp1.1 from training. Duration: 0.5545625 2023-10-02 21:39:44,126 WARNING [train.py:1204] (2/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_177-236809-0_sp1.1 from training. Duration: 0.4454375 2023-10-02 21:39:45,499 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7002_210_1794_1_1527939683_5157286_184-229630-0_sp1.1 from training. Duration: 0.67275 2023-10-02 21:39:45,506 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_50428_210_5724_1_1534247532_4126418_464-18322-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 21:39:45,588 WARNING [train.py:1204] (2/4) Exclude cut with ID _35799_210_5854_1_1533263487887_3529071_466-131402-0_sp1.1 from training. Duration: 0.936375 2023-10-02 21:39:47,311 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_1259-132768-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 21:39:47,363 WARNING [train.py:1204] (2/4) Exclude cut with ID _6694_210_4599_1_1527852637490_3965529_144-185051-0_sp1.1 from training. Duration: 0.87275 2023-10-02 21:39:48,656 WARNING [train.py:1204] (2/4) Exclude cut with ID _64639_210_5816_1_1535367619437_3528000_59-133290-0_sp1.1 from training. Duration: 0.963625 2023-10-02 21:39:50,452 INFO [train.py:1046] (2/4) Epoch 30, batch 150, loss[loss=0.1693, simple_loss=0.241, pruned_loss=0.04878, over 23943.00 frames. ], tot_loss[loss=0.1687, simple_loss=0.2475, pruned_loss=0.04492, over 2500417.41 frames. ], batch size: 180, lr: 3.43e-03, grad_scale: 16.0 2023-10-02 21:39:50,562 WARNING [train.py:1204] (2/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_223-49525-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 21:39:53,843 WARNING [train.py:1204] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_748-36150-0_sp1.1 from training. Duration: 0.82725 2023-10-02 21:39:53,843 WARNING [train.py:1204] (2/4) Exclude cut with ID _19896_210_1883_1_1531709022662_6649569_52-221627-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 21:39:53,897 WARNING [train.py:1204] (2/4) Exclude cut with ID _6789_210_1534_1_1527854471671_2870519_296-115766-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 21:39:56,562 WARNING [train.py:1204] (2/4) Exclude cut with ID _14624_210_1925_1_1530858690657_3642219_100-19871-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 21:39:56,604 WARNING [train.py:1204] (2/4) Exclude cut with ID _12555_210_8750_1_1530075594402_3780239_50-293675-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 21:39:59,424 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.bypass.scale_min, batch_count=1028013.3333333334, ans=0.2 2023-10-02 21:40:00,653 WARNING [train.py:1204] (2/4) Exclude cut with ID _14880_210_8895_1_1530680426655_3795259_289-212817-0_sp1.1 from training. Duration: 0.62725 2023-10-02 21:40:00,710 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_14056_210_4163_1_1530604483_7572827_388-211626-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 21:40:04,787 WARNING [train.py:1204] (2/4) Exclude cut with ID _5289_210_2084_1_1527678202736_3779299_392-261988-0 from training. Duration: 0.65 2023-10-02 21:40:04,803 WARNING [train.py:1204] (2/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_124-345840-0 from training. Duration: 0.52 2023-10-02 21:40:04,807 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_2655_210_2087_1_1525688959_3897400_437-337690-0 from training. Duration: 0.63 2023-10-02 21:40:07,540 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_3780_210_2087_1_1526467391_239068_13-274633-0_sp1.1 from training. Duration: 0.92725 2023-10-02 21:40:07,545 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_805-71497-0_sp1.1 from training. Duration: 0.6454375 2023-10-02 21:40:08,881 WARNING [train.py:1204] (2/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_434-83000-0_sp1.1 from training. Duration: 0.8909375 2023-10-02 21:40:08,983 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_17613_210_2151_1_1531137569_2366160_285-219531-0_sp1.1 from training. Duration: 0.97275 2023-10-02 21:40:08,989 WARNING [train.py:1204] (2/4) Exclude cut with ID _56108_210_16380_1_1534505446159_3887959_22-37858-0_sp1.1 from training. Duration: 0.936375 2023-10-02 21:40:10,365 WARNING [train.py:1204] (2/4) Exclude cut with ID _28427_210_4220_1_1532581240967_3061069_200-88444-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 21:40:12,198 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_43802_210_2290_1_1533516203_8829504_77-279042-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 21:40:13,577 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9309W0193-640320 from training. Duration: 0.896 2023-10-02 21:40:14,992 WARNING [train.py:1204] (2/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_516-324055-0_sp1.1 from training. Duration: 0.936375 2023-10-02 21:40:19,998 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.2.encoder.layers.0.self_attn2.whiten, num_groups=1, num_channels=384, metric=17.13 vs. limit=22.5 2023-10-02 21:40:21,015 WARNING [train.py:1204] (2/4) Exclude cut with ID _40094_210_16380_1_1533294005773_3641159_10-261240-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 21:40:24,524 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.conv_module1.balancer1.prob, batch_count=1028146.6666666666, ans=0.125 2023-10-02 21:40:25,697 WARNING [train.py:1204] (2/4) Exclude cut with ID _6267_210_2084_1_1527764658526_3894730_375-219887-0_sp1.1 from training. Duration: 0.6090625 2023-10-02 21:40:27,620 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6788_210_1388_1_1527898698_4562021_180-75288-0 from training. Duration: 0.99 2023-10-02 21:40:30,517 WARNING [train.py:1204] (2/4) Exclude cut with ID _8030_210_7497_1_1528444886781_3531089_262-86750-0_sp1.1 from training. Duration: 0.736375 2023-10-02 21:40:30,537 WARNING [train.py:1204] (2/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_934-325498-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 21:40:30,556 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_456-22141-0_sp0.9 from training. Duration: 0.97775 2023-10-02 21:40:31,953 WARNING [train.py:1204] (2/4) Exclude cut with ID _49643_210_7989_1_1534640244224_3939031_598-161926-0_sp1.1 from training. Duration: 0.8181875 2023-10-02 21:40:33,295 WARNING [train.py:1204] (2/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_345-49721-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 21:40:34,205 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.1.encoder.layers.0.conv_module1.whiten, num_groups=1, num_channels=256, metric=9.99 vs. limit=15.0 2023-10-02 21:40:34,608 WARNING [train.py:1204] (2/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_440-141811-0_sp1.1 from training. Duration: 0.863625 2023-10-02 21:40:34,699 WARNING [train.py:1204] (2/4) Exclude cut with ID _64048_210_10626_1_1535198382802_3835179_394-212120-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 21:40:36,017 WARNING [train.py:1204] (2/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_91-277684-0 from training. Duration: 0.74 2023-10-02 21:40:41,532 WARNING [train.py:1204] (2/4) Exclude cut with ID _23038_210_9570_1_1532001584158_3652419_191-346343-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 21:40:43,368 WARNING [train.py:1204] (2/4) Exclude cut with ID _15140_210_9288_1_1530791388450_4880149_351-347974-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 21:40:43,413 WARNING [train.py:1204] (2/4) Exclude cut with ID _58605_210_12210_1_1534723195939_3904259_48-269402-0_sp1.1 from training. Duration: 0.87275 2023-10-02 21:40:43,420 WARNING [train.py:1204] (2/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_401-53423-0_sp0.9 from training. Duration: 0.611125 2023-10-02 21:40:46,156 WARNING [train.py:1204] (2/4) Exclude cut with ID _42659_210_2070_1_1533607442765_3120380_158-49004-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 21:40:46,373 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.conv_module1.balancer2.prob, batch_count=1028213.3333333334, ans=0.125 2023-10-02 21:40:47,557 WARNING [train.py:1204] (2/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_81-75849-0_sp1.1 from training. Duration: 0.3545625 2023-10-02 21:40:49,035 WARNING [train.py:1204] (2/4) Exclude cut with ID _27052_210_5986_1_1532329197187_3585009_314-106499-0_sp1.1 from training. Duration: 0.836375 2023-10-02 21:40:50,457 WARNING [train.py:1204] (2/4) Exclude cut with ID _4626_210_3532_1_1526817652717_3916859_446-206656-0_sp0.9 from training. Duration: 0.8444375 2023-10-02 21:40:51,751 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_53378_210_17282_1_1534221071_5769509_158-114960-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 21:40:54,396 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_3171_210_2087_1_1526109084_2330007_37-64334-0_sp0.9 from training. Duration: 0.911125 2023-10-02 21:40:54,411 WARNING [train.py:1204] (2/4) Exclude cut with ID _5410_210_1388_1_1527397055795_1719020_39-112221-0 from training. Duration: 0.69 2023-10-02 21:40:54,430 WARNING [train.py:1204] (2/4) Exclude cut with ID _8064_210_5399_1_1528372299291_4282973_4-91037-0_sp0.9 from training. Duration: 0.97775 2023-10-02 21:40:54,444 WARNING [train.py:1204] (2/4) Exclude cut with ID ID0079W0443-717795 from training. Duration: 0.541 2023-10-02 21:40:57,731 WARNING [train.py:1204] (2/4) Exclude cut with ID _34734_210_4525_1_1533365977542_4448720_766-297928-0_sp1.1 from training. Duration: 0.936375 2023-10-02 21:40:58,124 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.out_combiner.scale_min, batch_count=1028280.0, ans=0.2 2023-10-02 21:40:59,318 WARNING [train.py:1204] (2/4) Exclude cut with ID _51471_210_1889_1_1534399391390_3094250_310-337793-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 21:41:00,601 WARNING [train.py:1204] (2/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_678-159307-0_sp0.9 from training. Duration: 0.9444375 2023-10-02 21:41:02,091 WARNING [train.py:1204] (2/4) Exclude cut with ID _50093_210_4220_1_1534417141207_3432299_259-322252-0 from training. Duration: 0.92 2023-10-02 21:41:02,264 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.feed_forward2.hidden_balancer.prob, batch_count=1028280.0, ans=0.125 2023-10-02 21:41:03,372 WARNING [train.py:1204] (2/4) Exclude cut with ID _47790_210_16187_1_1533862814066_4207519_67-103444-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 21:41:04,557 INFO [train.py:1046] (2/4) Epoch 30, batch 200, loss[loss=0.2017, simple_loss=0.2698, pruned_loss=0.06676, over 19544.00 frames. ], tot_loss[loss=0.1679, simple_loss=0.2466, pruned_loss=0.04463, over 2993688.67 frames. ], batch size: 389, lr: 3.43e-03, grad_scale: 16.0 2023-10-02 21:41:04,606 WARNING [train.py:1204] (2/4) Exclude cut with ID _40551_210_10635_1_1533776424632_3416179_311-247578-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 21:41:06,030 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_4633_210_2040_1_1526824163_4145343_416-76117-0 from training. Duration: 0.95 2023-10-02 21:41:08,717 WARNING [train.py:1204] (2/4) Exclude cut with ID _5166_210_2084_1_1527073197160_4087993_331-117829-0_sp0.9 from training. Duration: 0.688875 2023-10-02 21:41:10,115 WARNING [train.py:1204] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_299-247059-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 21:41:10,305 INFO [scaling.py:1118] (2/4) WithLoss: name=encoder.encoders.0.layers.0.self_attn_weights, loss-sum=0.000e+00 2023-10-02 21:41:11,526 WARNING [train.py:1204] (2/4) Exclude cut with ID _64002_210_9570_1_1535198605157_38979_2-240845-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 21:41:16,370 WARNING [train.py:1204] (2/4) Exclude cut with ID _4681_210_3525_1_1527250689464_120889_15-126760-0_sp0.9 from training. Duration: 0.9 2023-10-02 21:41:17,718 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_21500_210_2054_1_1532149310_2716758_256-156611-0_sp1.1 from training. Duration: 0.936375 2023-10-02 21:41:17,741 WARNING [train.py:1204] (2/4) Exclude cut with ID _16908_210_5724_1_1531119157248_3772700_624-203729-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 21:41:22,025 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.conv_module2.balancer2.prob, batch_count=1028413.3333333334, ans=0.125 2023-10-02 21:41:35,660 WARNING [train.py:1204] (2/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_605-219732-0_sp1.1 from training. Duration: 0.8545625 2023-10-02 21:41:37,123 WARNING [train.py:1204] (2/4) Exclude cut with ID _44718_210_5816_1_1534050051869_6469030_380-167520-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 21:41:37,194 WARNING [train.py:1204] (2/4) Exclude cut with ID _19896_210_1883_1_1531709022662_6649569_94-229927-0_sp1.1 from training. Duration: 0.8818125 2023-10-02 21:41:37,263 WARNING [train.py:1204] (2/4) Exclude cut with ID _19514_210_9259_1_1531530071330_5084230_519-174300-0_sp1.1 from training. Duration: 0.963625 2023-10-02 21:41:39,770 WARNING [train.py:1204] (2/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_340-174176-0_sp0.9 from training. Duration: 0.5444375 2023-10-02 21:41:39,774 WARNING [train.py:1204] (2/4) Exclude cut with ID _8263_210_1664_1_1528439452891_970220_68-181805-0_sp1.1 from training. Duration: 0.5818125 2023-10-02 21:41:41,211 WARNING [train.py:1204] (2/4) Exclude cut with ID _3120_210_3525_1_1526036143112_4072980_348-257723-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 21:41:42,497 WARNING [train.py:1204] (2/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_453-133983-0_sp1.1 from training. Duration: 0.7090625 2023-10-02 21:41:43,312 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.2.encoder.layers.0.feed_forward1.out_whiten, num_groups=1, num_channels=384, metric=5.39 vs. limit=15.0 2023-10-02 21:41:43,823 WARNING [train.py:1204] (2/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_17-86060-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 21:41:43,853 WARNING [train.py:1204] (2/4) Exclude cut with ID _29877_210_6228_1_1532397791106_5276149_228-184657-0_sp1.1 from training. Duration: 0.97275 2023-10-02 21:41:45,293 WARNING [train.py:1204] (2/4) Exclude cut with ID _18516_210_9206_1_1531704653206_3692406_198-334870-0 from training. Duration: 0.92 2023-10-02 21:41:46,680 WARNING [train.py:1204] (2/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_340-174176-0_sp1.1 from training. Duration: 0.4454375 2023-10-02 21:41:46,699 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_3133_210_2087_1_1526106446_2324690_14-139186-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 21:41:51,159 WARNING [train.py:1204] (2/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_126-29104-0_sp0.9 from training. Duration: 0.9444375 2023-10-02 21:41:53,900 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.479e+02 1.828e+02 1.977e+02 2.173e+02 2.870e+02, threshold=3.954e+02, percent-clipped=0.0 2023-10-02 21:41:55,381 WARNING [train.py:1204] (2/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_84-10310-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 21:42:00,693 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_15517_210_6753_1_1530954008_3487336_79-273076-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 21:42:00,774 WARNING [train.py:1204] (2/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_371-18169-0_sp0.9 from training. Duration: 0.9555625 2023-10-02 21:42:02,887 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.2.encoder.layers.0.whiten, num_groups=1, num_channels=384, metric=5.71 vs. limit=12.0 2023-10-02 21:42:07,609 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_68702_210_23689_1_1535799577_3420068_170-92788-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 21:42:10,309 WARNING [train.py:1204] (2/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_78-182912-0 from training. Duration: 0.59 2023-10-02 21:42:10,383 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_3019_210_2028_1_1526044245_3264865_297-12384-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 21:42:11,689 WARNING [train.py:1204] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_147-111137-0_sp1.1 from training. Duration: 0.863625 2023-10-02 21:42:11,694 WARNING [train.py:1204] (2/4) Exclude cut with ID _28323_210_16187_1_1532480395667_4100300_117-8859-0_sp1.1 from training. Duration: 0.97275 2023-10-02 21:42:13,061 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7762_210_1794_1_1528199508_4057408_419-125009-0_sp1.1 from training. Duration: 0.7545625 2023-10-02 21:42:14,430 WARNING [train.py:1204] (2/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_268-87909-0 from training. Duration: 0.99 2023-10-02 21:42:14,503 WARNING [train.py:1204] (2/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_118-311357-0_sp1.1 from training. Duration: 0.92725 2023-10-02 21:42:14,753 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.bypass_mid.scale_min, batch_count=1028613.3333333334, ans=0.2 2023-10-02 21:42:15,816 WARNING [train.py:1204] (2/4) Exclude cut with ID ID0377W0003-870225 from training. Duration: 0.852 2023-10-02 21:42:17,145 INFO [train.py:1046] (2/4) Epoch 30, batch 250, loss[loss=0.1782, simple_loss=0.2598, pruned_loss=0.04831, over 24611.00 frames. ], tot_loss[loss=0.1664, simple_loss=0.2449, pruned_loss=0.04398, over 3383698.16 frames. ], batch size: 68, lr: 3.43e-03, grad_scale: 16.0 2023-10-02 21:42:17,256 WARNING [train.py:1204] (2/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_56-5838-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 21:42:19,195 WARNING [train.py:1204] (2/4) Exclude cut with ID _14603_210_4220_1_1530615688441_3097319_90-31383-0_sp0.9 from training. Duration: 0.9666875 2023-10-02 21:42:19,278 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_31583_210_3852_1_1532595851_4024413_58-86657-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 21:42:19,314 WARNING [train.py:1204] (2/4) Exclude cut with ID _30304_210_6319_1_1532476827261_7582060_344-113875-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 21:42:22,040 WARNING [train.py:1204] (2/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_278-104676-0_sp1.1 from training. Duration: 0.8909375 2023-10-02 21:42:22,649 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.3.encoder.layers.3.nonlin_attention.whiten2, num_groups=1, num_channels=512, metric=5.04 vs. limit=15.0 2023-10-02 21:42:23,331 WARNING [train.py:1204] (2/4) Exclude cut with ID _55844_210_2346_1_1534471212569_7625707_764-2387-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 21:42:24,668 WARNING [train.py:1204] (2/4) Exclude cut with ID _27430_210_7770_1_1532604689375_1378590_157-14963-0_sp1.1 from training. Duration: 0.963625 2023-10-02 21:42:28,421 WARNING [train.py:1204] (2/4) Exclude cut with ID _7160_210_1876_1_1527940923231_3703799_282-322611-0_sp1.1 from training. Duration: 0.7909375 2023-10-02 21:42:35,260 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.1.encoder.layers.0.self_attn_weights.whiten_keys, num_groups=4, num_channels=128, metric=3.48 vs. limit=6.0 2023-10-02 21:42:38,783 WARNING [train.py:1204] (2/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_190-138640-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 21:42:41,617 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_33348_210_5632_1_1533086912_3324030_63-270922-0_sp1.1 from training. Duration: 0.97275 2023-10-02 21:42:41,650 WARNING [train.py:1204] (2/4) Exclude cut with ID _54871_210_22356_1_1534665798253_3856359_135-184281-0_sp1.1 from training. Duration: 0.9 2023-10-02 21:42:47,182 WARNING [train.py:1204] (2/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_277-210798-0_sp1.1 from training. Duration: 0.5 2023-10-02 21:42:47,426 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.conv_module1.balancer2.prob, batch_count=1028813.3333333334, ans=0.125 2023-10-02 21:42:48,507 WARNING [train.py:1204] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_402-253027-0_sp0.9 from training. Duration: 0.6 2023-10-02 21:42:49,833 WARNING [train.py:1204] (2/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_540-275685-0_sp0.9 from training. Duration: 0.988875 2023-10-02 21:42:49,867 WARNING [train.py:1204] (2/4) Exclude cut with ID _59493_210_17282_1_1535078419077_3225879_118-18960-0_sp1.1 from training. Duration: 0.936375 2023-10-02 21:42:51,271 WARNING [train.py:1204] (2/4) Exclude cut with ID _6194_210_1534_1_1527511826160_3941194_321-148641-0_sp1.1 from training. Duration: 0.5818125 2023-10-02 21:42:51,277 WARNING [train.py:1204] (2/4) Exclude cut with ID _61133_210_9969_1_1535196777227_3696409_333-270325-0_sp1.1 from training. Duration: 0.8454375 2023-10-02 21:42:51,329 WARNING [train.py:1204] (2/4) Exclude cut with ID _49376_210_5986_1_1533985278732_3370740_307-145221-0_sp1.1 from training. Duration: 0.936375 2023-10-02 21:42:54,546 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6846_210_6753_1_1527850767_4295158_2-315073-0_sp0.9 from training. Duration: 0.97775 2023-10-02 21:42:56,021 WARNING [train.py:1204] (2/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_17-25045-0 from training. Duration: 0.59 2023-10-02 21:42:57,749 WARNING [train.py:1204] (2/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_503-333510-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 21:42:59,644 WARNING [train.py:1204] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_238-245655-0_sp1.1 from training. Duration: 0.736375 2023-10-02 21:42:59,674 WARNING [train.py:1204] (2/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_329-82235-0_sp0.9 from training. Duration: 0.911125 2023-10-02 21:42:59,679 WARNING [train.py:1204] (2/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_325-113798-0_sp0.9 from training. Duration: 0.9444375 2023-10-02 21:43:00,990 WARNING [train.py:1204] (2/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_395-37222-0_sp0.9 from training. Duration: 0.8444375 2023-10-02 21:43:03,078 WARNING [train.py:1204] (2/4) Exclude cut with ID _64135_210_2253_1_1535677267876_7608972_498-5039-0_sp1.1 from training. Duration: 0.7090625 2023-10-02 21:43:03,086 WARNING [train.py:1204] (2/4) Exclude cut with ID _14922_210_6228_1_1530707435273_3693310_303-228738-0_sp0.9 from training. Duration: 0.9333125 2023-10-02 21:43:04,556 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_577-220899-0_sp1.1 from training. Duration: 0.92725 2023-10-02 21:43:05,938 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_3475_210_2028_1_1526193578_3622569_377-166133-0_sp1.1 from training. Duration: 0.7818125 2023-10-02 21:43:05,985 WARNING [train.py:1204] (2/4) Exclude cut with ID _49376_210_5986_1_1533985278732_3370740_272-313512-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 21:43:10,024 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7762_210_1794_1_1528199508_4057408_422-227034-0_sp0.9 from training. Duration: 0.811125 2023-10-02 21:43:10,262 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.balancer2.prob, batch_count=1028880.0, ans=0.125 2023-10-02 21:43:12,783 WARNING [train.py:1204] (2/4) Exclude cut with ID _18895_210_6228_1_1531357274234_3784930_74-341428-0_sp1.1 from training. Duration: 0.92725 2023-10-02 21:43:15,213 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.0.layers.1.conv_module1.whiten, num_groups=1, num_channels=192, metric=8.00 vs. limit=15.0 2023-10-02 21:43:16,906 WARNING [train.py:1204] (2/4) Exclude cut with ID _44902_210_16187_1_1533888006062_3792078_149-67212-0_sp1.1 from training. Duration: 0.7909375 2023-10-02 21:43:19,653 WARNING [train.py:1204] (2/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_243-126020-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 21:43:21,287 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.conv_skip_rate, batch_count=1028946.6666666666, ans=0.0 2023-10-02 21:43:22,370 WARNING [train.py:1204] (2/4) Exclude cut with ID _54218_210_12210_1_1534413646388_4409259_259-6561-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 21:43:28,183 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_41452_210_7291_1_1533437687_3941281_405-11559-0 from training. Duration: 0.91 2023-10-02 21:43:30,151 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_20120_210_6753_1_1531643035_5482859_759-225084-0_sp1.1 from training. Duration: 0.97275 2023-10-02 21:43:30,164 WARNING [train.py:1204] (2/4) Exclude cut with ID _14747_210_8341_1_1530685797463_3654089_147-131229-0_sp0.9 from training. Duration: 0.8444375 2023-10-02 21:43:31,448 INFO [train.py:1046] (2/4) Epoch 30, batch 300, loss[loss=0.1625, simple_loss=0.2431, pruned_loss=0.04099, over 24007.00 frames. ], tot_loss[loss=0.1645, simple_loss=0.242, pruned_loss=0.04352, over 3670615.39 frames. ], batch size: 86, lr: 3.43e-03, grad_scale: 16.0 2023-10-02 21:43:31,577 WARNING [train.py:1204] (2/4) Exclude cut with ID _14867_210_1968_1_1530684179890_3846969_319-250868-0 from training. Duration: 0.99 2023-10-02 21:43:31,606 WARNING [train.py:1204] (2/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_580-93798-0_sp0.9 from training. Duration: 0.62225 2023-10-02 21:43:33,041 WARNING [train.py:1204] (2/4) Exclude cut with ID _9679_210_7497_1_1529129212906_4363929_512-313889-0_sp0.9 from training. Duration: 0.9 2023-10-02 21:43:34,911 WARNING [train.py:1204] (2/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_18-189198-0 from training. Duration: 0.9 2023-10-02 21:43:37,844 WARNING [train.py:1204] (2/4) Exclude cut with ID _49755_210_18053_1_1534581005846_3218709_137-64179-0_sp1.1 from training. Duration: 0.92725 2023-10-02 21:43:38,475 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.4.encoder.layers.0.nonlin_attention.whiten2, num_groups=1, num_channels=384, metric=10.40 vs. limit=15.0 2023-10-02 21:43:39,195 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_48049_210_5632_1_1533864553_3519937_306-283794-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 21:43:42,012 WARNING [train.py:1204] (2/4) Exclude cut with ID _43552_210_8913_1_1533726031059_3523429_25-120161-0_sp1.1 from training. Duration: 0.8909375 2023-10-02 21:43:43,247 WARNING [train.py:1204] (2/4) Exclude cut with ID _8833_210_5219_1_1528592665210_7553059_319-349181-0 from training. Duration: 0.99 2023-10-02 21:43:44,645 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_296-332325-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 21:43:44,737 WARNING [train.py:1204] (2/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_436-186311-0_sp1.1 from training. Duration: 0.5909375 2023-10-02 21:43:44,958 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.ff3_skip_rate, batch_count=1029080.0, ans=0.0 2023-10-02 21:43:46,035 WARNING [train.py:1204] (2/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_348-127789-0 from training. Duration: 0.85 2023-10-02 21:43:46,040 WARNING [train.py:1204] (2/4) Exclude cut with ID _3992_210_1747_1_1526644816031_3731060_443-195183-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 21:43:46,266 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.bypass_mid.scale_min, batch_count=1029080.0, ans=0.2 2023-10-02 21:43:49,057 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.conv_module1.balancer1.prob, batch_count=1029080.0, ans=0.125 2023-10-02 21:43:49,105 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.attention_skip_rate, batch_count=1029080.0, ans=0.0 2023-10-02 21:43:50,224 WARNING [train.py:1204] (2/4) Exclude cut with ID _3465_210_3525_1_1526175667060_3574049_308-231735-0_sp1.1 from training. Duration: 0.663625 2023-10-02 21:43:54,330 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6786_210_1388_1_1527839699_4194495_378-46070-0_sp1.1 from training. Duration: 0.6545625 2023-10-02 21:43:54,385 WARNING [train.py:1204] (2/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_63-101996-0 from training. Duration: 0.88 2023-10-02 21:43:57,656 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_2541_210_1794_1_1525590269_3084765_156-173713-0 from training. Duration: 0.81 2023-10-02 21:43:57,685 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_217-239796-0_sp1.1 from training. Duration: 0.963625 2023-10-02 21:44:01,481 WARNING [train.py:1204] (2/4) Exclude cut with ID _21878_210_3525_1_1531738772002_7264330_530-329400-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 21:44:02,909 WARNING [train.py:1204] (2/4) Exclude cut with ID _21783_210_11357_1_1531742458730_3762679_150-62920-0_sp1.1 from training. Duration: 0.963625 2023-10-02 21:44:02,909 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_5522_210_3273_1_1527249606_3909096_366-172508-0 from training. Duration: 0.66 2023-10-02 21:44:02,911 WARNING [train.py:1204] (2/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_303-340446-0_sp1.1 from training. Duration: 0.8181875 2023-10-02 21:44:06,094 WARNING [train.py:1204] (2/4) Exclude cut with ID _8699_210_2069_1_1528630346831_3960409_160-24408-0_sp1.1 from training. Duration: 0.82725 2023-10-02 21:44:07,486 WARNING [train.py:1204] (2/4) Exclude cut with ID _41314_210_12651_1_1533459476543_3766460_299-187801-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 21:44:07,525 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_34886_210_10633_1_1533203882_1755661_92-345288-0_sp1.1 from training. Duration: 0.97275 2023-10-02 21:44:10,540 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.conv_module2.balancer2.prob, batch_count=1029146.6666666666, ans=0.125 2023-10-02 21:44:11,669 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_5438_210_1794_1_1527336687_2600009_104-288274-0_sp1.1 from training. Duration: 0.52725 2023-10-02 21:44:11,673 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_154-316450-0 from training. Duration: 0.76 2023-10-02 21:44:11,744 WARNING [train.py:1204] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_23-189234-0_sp0.9 from training. Duration: 0.92225 2023-10-02 21:44:14,553 WARNING [train.py:1204] (2/4) Exclude cut with ID _4779_210_2084_1_1527318022944_3914893_346-168820-0_sp1.1 from training. Duration: 0.963625 2023-10-02 21:44:15,900 WARNING [train.py:1204] (2/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_5-55893-0 from training. Duration: 0.55 2023-10-02 21:44:15,983 WARNING [train.py:1204] (2/4) Exclude cut with ID _20455_210_5219_1_1531565996775_8098100_180-24517-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 21:44:21,391 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.558e+02 1.890e+02 2.076e+02 2.319e+02 2.784e+02, threshold=4.153e+02, percent-clipped=0.0 2023-10-02 21:44:22,730 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_3133_210_2087_1_1526106446_2324690_243-143119-0_sp0.9 from training. Duration: 0.8555625 2023-10-02 21:44:24,229 WARNING [train.py:1204] (2/4) Exclude cut with ID _65585_210_12210_1_1535450451584_3764098_293-212045-0_sp1.1 from training. Duration: 0.92725 2023-10-02 21:44:24,233 WARNING [train.py:1204] (2/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_577-332600-0 from training. Duration: 0.57 2023-10-02 21:44:28,982 WARNING [train.py:1204] (2/4) Exclude cut with ID _30039_210_13270_1_1532484612900_2725150_104-289874-0_sp1.1 from training. Duration: 0.963625 2023-10-02 21:44:28,989 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_3172_210_2087_1_1526292929_3987865_197-97762-0_sp1.1 from training. Duration: 0.6818125 2023-10-02 21:44:31,908 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_256-150908-0_sp1.1 from training. Duration: 0.963625 2023-10-02 21:44:33,910 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_296-287678-0_sp1.1 from training. Duration: 0.863625 2023-10-02 21:44:33,946 WARNING [train.py:1204] (2/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_443-30034-0 from training. Duration: 0.64 2023-10-02 21:44:35,166 WARNING [train.py:1204] (2/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_494-199538-0_sp0.9 from training. Duration: 0.8333125 2023-10-02 21:44:35,230 WARNING [train.py:1204] (2/4) Exclude cut with ID _19708_210_9206_1_1532136650733_4238364_61-162325-0_sp1.1 from training. Duration: 0.97275 2023-10-02 21:44:36,995 WARNING [train.py:1204] (2/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_475-127455-0 from training. Duration: 0.7 2023-10-02 21:44:38,605 WARNING [train.py:1204] (2/4) Exclude cut with ID _48677_210_5663_1_1533902405919_3892989_188-11581-0_sp1.1 from training. Duration: 0.963625 2023-10-02 21:44:38,633 WARNING [train.py:1204] (2/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_136-8381-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 21:44:41,255 WARNING [train.py:1204] (2/4) Exclude cut with ID _55328_210_27767_1_1534580973260_4088170_270-229555-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 21:44:41,294 WARNING [train.py:1204] (2/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_372-266951-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 21:44:41,360 WARNING [train.py:1204] (2/4) Exclude cut with ID _47790_210_16187_1_1533862814066_4207519_14-242149-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 21:44:43,149 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.bypass.skip_rate, batch_count=1029280.0, ans=0.04949747468305833 2023-10-02 21:44:45,534 INFO [train.py:1046] (2/4) Epoch 30, batch 350, loss[loss=0.1663, simple_loss=0.2555, pruned_loss=0.03856, over 24334.00 frames. ], tot_loss[loss=0.1641, simple_loss=0.2421, pruned_loss=0.04303, over 3902170.92 frames. ], batch size: 74, lr: 3.43e-03, grad_scale: 16.0 2023-10-02 21:44:46,937 WARNING [train.py:1204] (2/4) Exclude cut with ID _7877_210_6014_1_1528286358099_3750136_159-76512-0_sp1.1 from training. Duration: 0.936375 2023-10-02 21:44:46,940 WARNING [train.py:1204] (2/4) Exclude cut with ID _8263_210_1664_1_1528438953494_294100_40-204731-0_sp1.1 from training. Duration: 0.4545625 2023-10-02 21:44:49,594 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8278_210_2550_1_1528637099_4220413_694-238413-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 21:44:55,134 WARNING [train.py:1204] (2/4) Exclude cut with ID _47278_210_12041_1_1533902418349_3576110_421-172710-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 21:44:56,583 WARNING [train.py:1204] (2/4) Exclude cut with ID _14970_210_15709_1_1530773543493_1715730_215-282680-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 21:44:58,450 WARNING [train.py:1204] (2/4) Exclude cut with ID _64639_210_5816_1_1535367619437_3528000_306-149449-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 21:44:59,958 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_28-291982-0 from training. Duration: 0.97 2023-10-02 21:45:00,249 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=1029413.3333333334, ans=0.1 2023-10-02 21:45:01,377 WARNING [train.py:1204] (2/4) Exclude cut with ID _55134_210_21982_1_1534590028377_3882170_412-15870-0_sp1.1 from training. Duration: 0.936375 2023-10-02 21:45:02,641 WARNING [train.py:1204] (2/4) Exclude cut with ID _13684_210_7497_1_1530619354863_3598359_312-235606-0 from training. Duration: 0.6 2023-10-02 21:45:05,959 WARNING [train.py:1204] (2/4) Exclude cut with ID _37112_210_2575_1_1532997007673_7268610_347-238993-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 21:45:06,000 WARNING [train.py:1204] (2/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_121-284152-0 from training. Duration: 0.59 2023-10-02 21:45:07,897 WARNING [train.py:1204] (2/4) Exclude cut with ID _2941_210_2577_1_1526199673150_2489759_228-2961-0_sp1.1 from training. Duration: 0.97275 2023-10-02 21:45:10,611 WARNING [train.py:1204] (2/4) Exclude cut with ID _28323_210_16187_1_1532480395667_4100300_531-24157-0 from training. Duration: 0.98 2023-10-02 21:45:11,962 WARNING [train.py:1204] (2/4) Exclude cut with ID _32704_210_8954_1_1533017488840_6747370_266-329755-0_sp0.9 from training. Duration: 0.988875 2023-10-02 21:45:13,392 WARNING [train.py:1204] (2/4) Exclude cut with ID _6653_210_2629_1_1527919172441_6732679_1038-146421-0_sp1.1 from training. Duration: 0.97275 2023-10-02 21:45:14,689 WARNING [train.py:1204] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_391-22148-0_sp0.9 from training. Duration: 0.8555625 2023-10-02 21:45:14,777 WARNING [train.py:1204] (2/4) Exclude cut with ID _64521_210_14699_1_1535243427259_3815328_543-341117-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 21:45:16,060 WARNING [train.py:1204] (2/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_52-168712-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 21:45:16,107 WARNING [train.py:1204] (2/4) Exclude cut with ID _33920_210_4220_1_1533257851500_3084399_198-334063-0_sp1.1 from training. Duration: 0.8545625 2023-10-02 21:45:16,126 WARNING [train.py:1204] (2/4) Exclude cut with ID _50093_210_4220_1_1534417141207_3432299_69-111987-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 21:45:17,441 WARNING [train.py:1204] (2/4) Exclude cut with ID _14821_210_8750_1_1530680503991_6829359_189-297451-0_sp0.9 from training. Duration: 0.611125 2023-10-02 21:45:18,928 WARNING [train.py:1204] (2/4) Exclude cut with ID _8030_210_7497_1_1528444886781_3531089_262-86750-0_sp0.9 from training. Duration: 0.9 2023-10-02 21:45:18,940 WARNING [train.py:1204] (2/4) Exclude cut with ID _24612_210_8913_1_1531998054193_3806359_294-191196-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 21:45:24,536 WARNING [train.py:1204] (2/4) Exclude cut with ID _16909_210_5724_1_1531292079912_4118919_707-342896-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 21:45:24,557 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_9460_210_4211_1_1528954587_10049667_572-37035-0_sp0.9 from training. Duration: 0.888875 2023-10-02 21:45:25,861 WARNING [train.py:1204] (2/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_762-109886-0_sp1.1 from training. Duration: 0.82725 2023-10-02 21:45:27,122 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_14953_210_2151_1_1530701710_5616165_519-326393-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 21:45:31,805 WARNING [train.py:1204] (2/4) Exclude cut with ID _25914_210_6400_1_1532141317058_644740_52-96808-0 from training. Duration: 0.91 2023-10-02 21:45:31,809 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_68095_210_20185_1_1535718226_8888855_1079-94525-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 21:45:35,844 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.feed_forward1.hidden_balancer.prob, batch_count=1029546.6666666666, ans=0.125 2023-10-02 21:45:38,469 WARNING [train.py:1204] (2/4) Exclude cut with ID _14860_210_4915_1_1530702020340_7343240_746-3647-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 21:45:38,470 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_70379_210_7925_1_1535941455_557455_49-101720-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 21:45:38,483 WARNING [train.py:1204] (2/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_8-307291-0_sp1.1 from training. Duration: 0.936375 2023-10-02 21:45:38,580 WARNING [train.py:1204] (2/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_117-290916-0 from training. Duration: 0.51 2023-10-02 21:45:38,747 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=1029546.6666666666, ans=0.1 2023-10-02 21:45:41,688 WARNING [train.py:1204] (2/4) Exclude cut with ID _22020_210_2461_1_1531724164513_6326900_944-339840-0_sp1.1 from training. Duration: 0.963625 2023-10-02 21:45:43,163 WARNING [train.py:1204] (2/4) Exclude cut with ID IC0515W0043-254911 from training. Duration: 0.976 2023-10-02 21:45:43,259 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_3910_210_1794_1_1526742306_334530_28-238909-0 from training. Duration: 0.81 2023-10-02 21:45:44,604 WARNING [train.py:1204] (2/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_269-267275-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 21:45:44,933 INFO [scaling.py:1118] (2/4) WithLoss: name=encoder.encoders.3.encoder.layers.1.self_attn_weights, loss-sum=0.000e+00 2023-10-02 21:45:44,960 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=1029613.3333333334, ans=0.1 2023-10-02 21:45:46,074 WARNING [train.py:1204] (2/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_72-251255-0_sp1.1 from training. Duration: 0.8545625 2023-10-02 21:45:46,084 WARNING [train.py:1204] (2/4) Exclude cut with ID _14886_210_4220_1_1530702020355_3516190_175-229073-0 from training. Duration: 0.45 2023-10-02 21:45:47,476 WARNING [train.py:1204] (2/4) Exclude cut with ID _40519_210_13794_1_1533715228655_7337390_921-209452-0_sp1.1 from training. Duration: 0.963625 2023-10-02 21:45:50,391 WARNING [train.py:1204] (2/4) Exclude cut with ID _29857_210_20034_1_1532395886753_3798160_335-163562-0_sp0.9 from training. Duration: 0.9444375 2023-10-02 21:45:51,698 WARNING [train.py:1204] (2/4) Exclude cut with ID _58583_210_5236_1_1534726835266_3574249_384-58445-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 21:45:52,918 WARNING [train.py:1204] (2/4) Exclude cut with ID _44011_210_2046_1_1533722401691_3631310_96-315656-0_sp1.1 from training. Duration: 0.963625 2023-10-02 21:45:52,923 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_68484_210_1794_1_1535849804_7678097_549-170613-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 21:45:54,317 WARNING [train.py:1204] (2/4) Exclude cut with ID _37892_210_19596_1_1533198677897_3964058_214-148058-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 21:45:57,079 WARNING [train.py:1204] (2/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_415-78792-0_sp0.9 from training. Duration: 0.9 2023-10-02 21:45:57,302 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.bypass_mid.scale_min, batch_count=1029680.0, ans=0.2 2023-10-02 21:45:58,320 INFO [train.py:1046] (2/4) Epoch 30, batch 400, loss[loss=0.1496, simple_loss=0.231, pruned_loss=0.0341, over 24462.00 frames. ], tot_loss[loss=0.1635, simple_loss=0.2417, pruned_loss=0.04265, over 4087799.60 frames. ], batch size: 63, lr: 3.43e-03, grad_scale: 32.0 2023-10-02 21:46:00,433 WARNING [train.py:1204] (2/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_383-205833-0_sp1.1 from training. Duration: 0.863625 2023-10-02 21:46:01,810 WARNING [train.py:1204] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_167-137255-0 from training. Duration: 0.64 2023-10-02 21:46:01,818 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8275_210_4198_1_1528455320_3892571_484-154786-0_sp1.1 from training. Duration: 0.963625 2023-10-02 21:46:01,879 WARNING [train.py:1204] (2/4) Exclude cut with ID _40284_210_2084_1_1533513679814_3704279_396-319412-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 21:46:03,345 WARNING [train.py:1204] (2/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_280-120942-0_sp0.9 from training. Duration: 0.8555625 2023-10-02 21:46:05,258 WARNING [train.py:1204] (2/4) Exclude cut with ID _45630_210_10556_1_1533949174498_3902049_558-240889-0_sp1.1 from training. Duration: 0.92725 2023-10-02 21:46:06,744 WARNING [train.py:1204] (2/4) Exclude cut with ID _56235_210_10640_1_1534557335235_4161099_197-307458-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 21:46:06,826 WARNING [train.py:1204] (2/4) Exclude cut with ID _16909_210_5724_1_1531292079912_4118919_163-254843-0_sp1.1 from training. Duration: 0.92725 2023-10-02 21:46:10,011 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_103-84010-0 from training. Duration: 0.69 2023-10-02 21:46:11,398 WARNING [train.py:1204] (2/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_401-158239-0 from training. Duration: 0.71 2023-10-02 21:46:11,399 WARNING [train.py:1204] (2/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_899-233658-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 21:46:12,756 WARNING [train.py:1204] (2/4) Exclude cut with ID _23825_210_13449_1_1531915313526_3497150_240-271367-0 from training. Duration: 0.97 2023-10-02 21:46:14,194 WARNING [train.py:1204] (2/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_424-134619-0_sp1.1 from training. Duration: 0.92725 2023-10-02 21:46:18,518 WARNING [train.py:1204] (2/4) Exclude cut with ID _44831_210_13427_1_1533816011549_3689035_32-188147-0_sp1.1 from training. Duration: 0.87275 2023-10-02 21:46:18,521 WARNING [train.py:1204] (2/4) Exclude cut with ID _59170_210_6721_1_1534983633960_4203335_314-63879-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 21:46:19,791 WARNING [train.py:1204] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_602-275260-0 from training. Duration: 0.82 2023-10-02 21:46:19,830 WARNING [train.py:1204] (2/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_511-306767-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 21:46:19,864 WARNING [train.py:1204] (2/4) Exclude cut with ID _41817_210_18835_1_1533434477160_7423049_565-201775-0_sp1.1 from training. Duration: 0.92725 2023-10-02 21:46:19,874 WARNING [train.py:1204] (2/4) Exclude cut with ID _28317_210_12742_1_1532480302381_8510325_778-173151-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 21:46:20,010 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.0.bypass_mid.scale_min, batch_count=1029746.6666666666, ans=0.2 2023-10-02 21:46:21,385 WARNING [train.py:1204] (2/4) Exclude cut with ID _14998_210_5219_1_1530705582080_7474970_680-51001-0_sp1.1 from training. Duration: 0.963625 2023-10-02 21:46:22,743 WARNING [train.py:1204] (2/4) Exclude cut with ID IC0677W0059-334891 from training. Duration: 0.896 2023-10-02 21:46:22,792 WARNING [train.py:1204] (2/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_370-331600-0 from training. Duration: 0.94 2023-10-02 21:46:27,052 WARNING [train.py:1204] (2/4) Exclude cut with ID _66840_210_5219_1_1535452168426_4196349_446-240192-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 21:46:29,718 WARNING [train.py:1204] (2/4) Exclude cut with ID _65242_210_18628_1_1535369393717_3823149_38-6732-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 21:46:29,792 WARNING [train.py:1204] (2/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_302-2550-0 from training. Duration: 0.81 2023-10-02 21:46:31,665 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_43802_210_2290_1_1533516203_8829504_100-348441-0 from training. Duration: 0.91 2023-10-02 21:46:35,636 WARNING [train.py:1204] (2/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_329-285801-0_sp1.1 from training. Duration: 0.82725 2023-10-02 21:46:37,244 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_30167_210_3528_1_1532428012_4772375_343-334712-0_sp1.1 from training. Duration: 0.936375 2023-10-02 21:46:44,624 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7543_210_6753_1_1528543318_4552_1-85072-0 from training. Duration: 0.83 2023-10-02 21:46:48,728 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7002_210_1794_1_1527935634_4034444_487-236501-0_sp1.1 from training. Duration: 0.6 2023-10-02 21:46:49,950 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.538e+02 1.876e+02 2.050e+02 2.332e+02 4.522e+02, threshold=4.101e+02, percent-clipped=1.0 2023-10-02 21:46:50,099 WARNING [train.py:1204] (2/4) Exclude cut with ID _7160_210_1876_1_1527940923231_3703799_282-322611-0 from training. Duration: 0.87 2023-10-02 21:46:51,685 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.bypass_mid.scale_min, batch_count=1029880.0, ans=0.2 2023-10-02 21:46:52,786 WARNING [train.py:1204] (2/4) Exclude cut with ID _67335_210_5219_1_1535525975652_4275229_570-262983-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 21:46:55,481 WARNING [train.py:1204] (2/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_625-338546-0_sp1.1 from training. Duration: 0.8 2023-10-02 21:46:55,503 WARNING [train.py:1204] (2/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_176-56120-0 from training. Duration: 0.83 2023-10-02 21:46:55,702 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=1029880.0, ans=0.1 2023-10-02 21:46:59,570 WARNING [train.py:1204] (2/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_963-187109-0_sp1.1 from training. Duration: 0.9 2023-10-02 21:47:01,064 WARNING [train.py:1204] (2/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_121-284152-0_sp0.9 from training. Duration: 0.6555625 2023-10-02 21:47:02,462 WARNING [train.py:1204] (2/4) Exclude cut with ID _45021_210_9994_1_1533619751094_3330288_104-81711-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 21:47:02,738 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.conv_module1.balancer1.prob, batch_count=1029946.6666666666, ans=0.125 2023-10-02 21:47:03,988 WARNING [train.py:1204] (2/4) Exclude cut with ID _51382_210_23862_1_1534492801838_3650104_129-343914-0_sp1.1 from training. Duration: 0.936375 2023-10-02 21:47:05,875 WARNING [train.py:1204] (2/4) Exclude cut with ID _14970_210_15709_1_1530775547361_1966965_8-169924-0 from training. Duration: 0.92 2023-10-02 21:47:06,630 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_2295_210_2087_1_1525500993_2407863_234-83104-0_sp1.1 from training. Duration: 0.563625 2023-10-02 21:47:07,943 WARNING [train.py:1204] (2/4) Exclude cut with ID _14687_210_5854_1_1530703838259_3664889_115-239046-0 from training. Duration: 0.67 2023-10-02 21:47:10,553 WARNING [train.py:1204] (2/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_690-126306-0_sp1.1 from training. Duration: 0.7090625 2023-10-02 21:47:10,568 WARNING [train.py:1204] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_625-102955-0_sp0.9 from training. Duration: 0.8555625 2023-10-02 21:47:11,985 WARNING [train.py:1204] (2/4) Exclude cut with ID _9389_210_1664_1_1529217337301_5289269_289-213417-0 from training. Duration: 0.89 2023-10-02 21:47:13,225 INFO [train.py:1046] (2/4) Epoch 30, batch 450, loss[loss=0.1653, simple_loss=0.2381, pruned_loss=0.04628, over 23573.00 frames. ], tot_loss[loss=0.1643, simple_loss=0.2421, pruned_loss=0.04321, over 4219357.65 frames. ], batch size: 256, lr: 3.43e-03, grad_scale: 32.0 2023-10-02 21:47:15,177 WARNING [train.py:1204] (2/4) Exclude cut with ID _13253_210_1925_1_1530757775819_3577873_191-144977-0_sp0.9 from training. Duration: 0.8666875 2023-10-02 21:47:15,224 WARNING [train.py:1204] (2/4) Exclude cut with ID _21783_210_11357_1_1531742458730_3762679_325-155703-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 21:47:15,251 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_2295_210_2087_1_1525500993_2407863_116-238894-0_sp0.9 from training. Duration: 0.77775 2023-10-02 21:47:16,653 WARNING [train.py:1204] (2/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_94-126128-0 from training. Duration: 0.86 2023-10-02 21:47:16,671 WARNING [train.py:1204] (2/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_395-310220-0_sp0.9 from training. Duration: 0.988875 2023-10-02 21:47:18,076 WARNING [train.py:1204] (2/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_821-110852-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 21:47:18,123 WARNING [train.py:1204] (2/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_64-248359-0_sp1.1 from training. Duration: 0.62725 2023-10-02 21:47:18,141 WARNING [train.py:1204] (2/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_138-187031-0 from training. Duration: 0.99 2023-10-02 21:47:19,590 WARNING [train.py:1204] (2/4) Exclude cut with ID _4122_210_2084_1_1526713164417_4353280_528-206771-0_sp0.9 from training. Duration: 0.97775 2023-10-02 21:47:20,956 WARNING [train.py:1204] (2/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_844-96989-0_sp1.1 from training. Duration: 0.6454375 2023-10-02 21:47:21,561 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.4.encoder.layers.0.feed_forward2.out_whiten, num_groups=1, num_channels=384, metric=7.02 vs. limit=15.0 2023-10-02 21:47:23,599 WARNING [train.py:1204] (2/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_106-30782-0_sp1.1 from training. Duration: 0.72725 2023-10-02 21:47:31,085 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.attention_skip_rate, batch_count=1030080.0, ans=0.0 2023-10-02 21:47:32,135 WARNING [train.py:1204] (2/4) Exclude cut with ID _14683_210_5219_1_1530698352608_3974859_281-250040-0_sp1.1 from training. Duration: 0.936375 2023-10-02 21:47:33,451 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_46013_210_7234_1_1533776362_3744032_463-52378-0_sp0.9 from training. Duration: 0.9555625 2023-10-02 21:47:35,301 WARNING [train.py:1204] (2/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_299-34413-0 from training. Duration: 0.65 2023-10-02 21:47:35,371 WARNING [train.py:1204] (2/4) Exclude cut with ID _35799_210_5854_1_1533263487887_3529071_490-326847-0 from training. Duration: 0.99 2023-10-02 21:47:39,895 WARNING [train.py:1204] (2/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_589-9070-0_sp0.9 from training. Duration: 0.788875 2023-10-02 21:47:42,608 WARNING [train.py:1204] (2/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_884-29515-0_sp1.1 from training. Duration: 0.936375 2023-10-02 21:47:45,321 WARNING [train.py:1204] (2/4) Exclude cut with ID _15012_210_1968_1_1530856820429_5095350_474-119995-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 21:47:48,838 WARNING [train.py:1204] (2/4) Exclude cut with ID _65585_210_12210_1_1535450451584_3764098_211-97124-0_sp1.1 from training. Duration: 0.963625 2023-10-02 21:47:50,244 WARNING [train.py:1204] (2/4) Exclude cut with ID _33640_210_2655_1_1533117605479_4855469_666-224463-0_sp1.1 from training. Duration: 0.963625 2023-10-02 21:47:51,673 WARNING [train.py:1204] (2/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_35-153886-0 from training. Duration: 0.84 2023-10-02 21:47:52,965 WARNING [train.py:1204] (2/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_210-106047-0 from training. Duration: 0.97 2023-10-02 21:47:53,105 WARNING [train.py:1204] (2/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_479-125611-0 from training. Duration: 0.96 2023-10-02 21:47:53,125 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_9417_210_1794_1_1529235261_4614131_427-153963-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 21:47:55,780 WARNING [train.py:1204] (2/4) Exclude cut with ID _23818_210_6228_1_1531917063884_3676920_133-170571-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 21:47:55,831 WARNING [train.py:1204] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_602-275260-0_sp1.1 from training. Duration: 0.7454375 2023-10-02 21:47:57,222 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9029W0121-509521 from training. Duration: 0.896 2023-10-02 21:47:57,231 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9029W0434-509821 from training. Duration: 0.64 2023-10-02 21:47:57,256 WARNING [train.py:1204] (2/4) Exclude cut with ID _23038_210_9570_1_1532001584158_3652419_289-307034-0_sp1.1 from training. Duration: 0.936375 2023-10-02 21:47:58,609 WARNING [train.py:1204] (2/4) Exclude cut with ID _29345_210_4381_1_1532689168263_3860169_149-257268-0_sp0.9 from training. Duration: 0.9 2023-10-02 21:48:00,000 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_405-339986-0_sp1.1 from training. Duration: 0.463625 2023-10-02 21:48:02,888 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_2655_210_2087_1_1525688959_3897400_437-337690-0_sp1.1 from training. Duration: 0.57275 2023-10-02 21:48:04,135 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7543_210_6753_1_1528541907_1399322_165-323619-0_sp1.1 from training. Duration: 0.62725 2023-10-02 21:48:04,174 WARNING [train.py:1204] (2/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_508-335369-0_sp1.1 from training. Duration: 0.436375 2023-10-02 21:48:04,229 WARNING [train.py:1204] (2/4) Exclude cut with ID _7852_210_5219_1_1528286471316_8139400_358-287492-0 from training. Duration: 0.96 2023-10-02 21:48:07,449 WARNING [train.py:1204] (2/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_322-217220-0_sp0.9 from training. Duration: 0.9555625 2023-10-02 21:48:09,443 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_3171_210_2087_1_1526109084_2330007_66-260583-0_sp0.9 from training. Duration: 0.8 2023-10-02 21:48:10,712 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8558_210_1410_1_1528505225_7613406_131-329524-0_sp1.1 from training. Duration: 0.7181875 2023-10-02 21:48:12,016 WARNING [train.py:1204] (2/4) Exclude cut with ID _9235_210_5219_1_1528891154253_8046120_30-326681-0 from training. Duration: 0.83 2023-10-02 21:48:12,278 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.ff2_skip_rate, batch_count=1030280.0, ans=0.0 2023-10-02 21:48:14,803 WARNING [train.py:1204] (2/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_58-68956-0_sp0.9 from training. Duration: 0.92225 2023-10-02 21:48:16,673 WARNING [train.py:1204] (2/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_255-312654-0 from training. Duration: 0.91 2023-10-02 21:48:16,744 WARNING [train.py:1204] (2/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_99-167392-0 from training. Duration: 0.61 2023-10-02 21:48:18,253 WARNING [train.py:1204] (2/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_94-126128-0_sp0.9 from training. Duration: 0.9555625 2023-10-02 21:48:22,302 WARNING [train.py:1204] (2/4) Exclude cut with ID _68635_210_9317_1_1535770813410_4315870_187-192817-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 21:48:23,752 WARNING [train.py:1204] (2/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_277-204970-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 21:48:25,091 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6163_210_6400_1_1527506746_4263068_283-137811-0_sp0.9 from training. Duration: 0.8555625 2023-10-02 21:48:25,119 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9144W0121-566446 from training. Duration: 0.896 2023-10-02 21:48:26,419 INFO [train.py:1046] (2/4) Epoch 30, batch 500, loss[loss=0.1645, simple_loss=0.2342, pruned_loss=0.04744, over 23476.00 frames. ], tot_loss[loss=0.1649, simple_loss=0.2428, pruned_loss=0.04347, over 4322683.64 frames. ], batch size: 285, lr: 3.43e-03, grad_scale: 16.0 2023-10-02 21:48:29,300 WARNING [train.py:1204] (2/4) Exclude cut with ID _49371_210_12071_1_1533952761021_3812190_351-315519-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 21:48:30,870 WARNING [train.py:1204] (2/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_115-326778-0_sp1.1 from training. Duration: 0.8454375 2023-10-02 21:48:30,925 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_17292_210_6753_1_1531125388_2943734_436-198965-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 21:48:30,934 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9165W0117-576751 from training. Duration: 0.896 2023-10-02 21:48:32,204 WARNING [train.py:1204] (2/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_212-149947-0 from training. Duration: 0.73 2023-10-02 21:48:32,212 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_13757_210_3528_1_1530440682_4107239_65-167544-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 21:48:35,525 WARNING [train.py:1204] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_700-183549-0_sp0.9 from training. Duration: 0.7555625 2023-10-02 21:48:36,333 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=1030346.6666666666, ans=0.1 2023-10-02 21:48:40,282 WARNING [train.py:1204] (2/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_216-343007-0_sp0.9 from training. Duration: 0.7333125 2023-10-02 21:48:41,616 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7544_210_6753_1_1528587722_3576686_295-258479-0_sp1.1 from training. Duration: 0.763625 2023-10-02 21:48:44,345 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_365-287106-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 21:48:44,358 WARNING [train.py:1204] (2/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_300-3747-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 21:48:44,420 WARNING [train.py:1204] (2/4) Exclude cut with ID _14069_210_5632_1_1530512692033_4803390_487-303201-0_sp1.1 from training. Duration: 0.92725 2023-10-02 21:48:54,803 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.attention_skip_rate, batch_count=1030480.0, ans=0.0 2023-10-02 21:48:55,905 WARNING [train.py:1204] (2/4) Exclude cut with ID _14459_210_2228_1_1531443641946_7516700_668-123026-0_sp1.1 from training. Duration: 0.936375 2023-10-02 21:48:55,927 WARNING [train.py:1204] (2/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_108-53929-0_sp1.1 from training. Duration: 0.536375 2023-10-02 21:48:55,960 WARNING [train.py:1204] (2/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_266-137104-0_sp0.9 from training. Duration: 0.72225 2023-10-02 21:48:57,351 WARNING [train.py:1204] (2/4) Exclude cut with ID _12302_210_5219_1_1529924446466_8107280_902-168619-0_sp1.1 from training. Duration: 0.936375 2023-10-02 21:48:57,379 WARNING [train.py:1204] (2/4) Exclude cut with ID _59502_210_12210_1_1535107024698_1835139_146-162495-0 from training. Duration: 0.92 2023-10-02 21:48:57,394 WARNING [train.py:1204] (2/4) Exclude cut with ID _3465_210_3525_1_1526175667060_3574049_412-287526-0_sp1.1 from training. Duration: 0.6545625 2023-10-02 21:49:00,189 WARNING [train.py:1204] (2/4) Exclude cut with ID _7160_210_1876_1_1527940923231_3703799_120-150943-0_sp0.9 from training. Duration: 0.9 2023-10-02 21:49:01,450 WARNING [train.py:1204] (2/4) Exclude cut with ID _15037_210_2253_1_1530752294104_3267650_296-192597-0_sp1.1 from training. Duration: 0.736375 2023-10-02 21:49:01,478 WARNING [train.py:1204] (2/4) Exclude cut with ID _13252_210_1925_1_1530671372047_3336909_397-216497-0_sp1.1 from training. Duration: 0.87275 2023-10-02 21:49:01,499 WARNING [train.py:1204] (2/4) Exclude cut with ID _40094_210_16380_1_1533294005773_3641159_76-269320-0_sp1.1 from training. Duration: 0.936375 2023-10-02 21:49:02,893 WARNING [train.py:1204] (2/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_449-15508-0 from training. Duration: 0.82 2023-10-02 21:49:04,464 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.ff2_skip_rate, batch_count=1030480.0, ans=0.0 2023-10-02 21:49:06,975 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9309W0194-640321 from training. Duration: 0.64 2023-10-02 21:49:09,120 WARNING [train.py:1204] (2/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_284-312178-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 21:49:10,441 WARNING [train.py:1204] (2/4) Exclude cut with ID _29853_210_7497_1_1532505600851_6642110_401-55937-0_sp1.1 from training. Duration: 0.92725 2023-10-02 21:49:12,498 WARNING [train.py:1204] (2/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_716-24918-0_sp1.1 from training. Duration: 0.92725 2023-10-02 21:49:12,517 WARNING [train.py:1204] (2/4) Exclude cut with ID _19637_210_4220_1_1531537167676_3205060_314-305638-0_sp1.1 from training. Duration: 0.92725 2023-10-02 21:49:12,559 WARNING [train.py:1204] (2/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_475-127455-0_sp1.1 from training. Duration: 0.636375 2023-10-02 21:49:15,423 WARNING [train.py:1204] (2/4) Exclude cut with ID _5444_210_2151_1_1527418808464_5312210_581-315411-0 from training. Duration: 0.62 2023-10-02 21:49:16,130 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.3.encoder.layers.0.nonlin_attention.whiten1, num_groups=1, num_channels=384, metric=7.03 vs. limit=10.0 2023-10-02 21:49:16,717 WARNING [train.py:1204] (2/4) Exclude cut with ID _44902_210_16187_1_1533888006062_3792078_149-67212-0_sp0.9 from training. Duration: 0.9666875 2023-10-02 21:49:18,449 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.583e+02 1.804e+02 1.945e+02 2.150e+02 2.589e+02, threshold=3.890e+02, percent-clipped=0.0 2023-10-02 21:49:18,609 WARNING [train.py:1204] (2/4) Exclude cut with ID _31602_210_5854_1_1532677021456_4458250_63-63253-0_sp1.1 from training. Duration: 0.963625 2023-10-02 21:49:21,381 WARNING [train.py:1204] (2/4) Exclude cut with ID _55209_210_1531_1_1534675828996_4597160_660-203992-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 21:49:25,408 WARNING [train.py:1204] (2/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_83-190624-0_sp1.1 from training. Duration: 0.92725 2023-10-02 21:49:25,677 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=1030613.3333333334, ans=0.1 2023-10-02 21:49:29,588 WARNING [train.py:1204] (2/4) Exclude cut with ID _58605_210_12210_1_1534723195939_3904259_154-139635-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 21:49:31,002 WARNING [train.py:1204] (2/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_164-224052-0 from training. Duration: 0.7 2023-10-02 21:49:31,019 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_1126-189071-0_sp1.1 from training. Duration: 0.963625 2023-10-02 21:49:31,034 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_15396_210_6890_1_1531308473_1938936_310-214789-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 21:49:35,217 WARNING [train.py:1204] (2/4) Exclude cut with ID _8395_210_1531_1_1528597871523_4411579_431-132470-0 from training. Duration: 0.74 2023-10-02 21:49:35,280 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_3780_210_2087_1_1526467760_2130483_170-259044-0_sp0.9 from training. Duration: 0.77775 2023-10-02 21:49:38,626 WARNING [train.py:1204] (2/4) Exclude cut with ID _12733_210_8341_1_1530167424556_3646380_537-230640-0_sp1.1 from training. Duration: 0.963625 2023-10-02 21:49:39,880 INFO [train.py:1046] (2/4) Epoch 30, batch 550, loss[loss=0.1886, simple_loss=0.2563, pruned_loss=0.06041, over 23724.00 frames. ], tot_loss[loss=0.166, simple_loss=0.2441, pruned_loss=0.04395, over 4405782.23 frames. ], batch size: 164, lr: 3.43e-03, grad_scale: 16.0 2023-10-02 21:49:42,067 WARNING [train.py:1204] (2/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_159-281310-0 from training. Duration: 0.95 2023-10-02 21:49:44,816 WARNING [train.py:1204] (2/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_434-83000-0 from training. Duration: 0.98 2023-10-02 21:49:44,843 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_4405_210_1849_1_1526732205_3593939_493-32464-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 21:49:44,854 WARNING [train.py:1204] (2/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_342-100157-0 from training. Duration: 0.54 2023-10-02 21:49:44,914 WARNING [train.py:1204] (2/4) Exclude cut with ID _44778_210_2054_1_1533798127203_7443090_765-250133-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 21:49:44,918 WARNING [train.py:1204] (2/4) Exclude cut with ID _38670_210_15379_1_1533448810659_4391059_81-312245-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 21:49:46,277 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_50050_210_16223_1_1535435796_7290138_1273-247611-0_sp1.1 from training. Duration: 0.936375 2023-10-02 21:49:46,354 WARNING [train.py:1204] (2/4) Exclude cut with ID _44831_210_13427_1_1533816011549_3689035_399-18591-0_sp1.1 from training. Duration: 0.936375 2023-10-02 21:49:47,456 WARNING [train.py:1204] (2/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_557-324258-0_sp0.9 from training. Duration: 0.9 2023-10-02 21:49:47,532 WARNING [train.py:1204] (2/4) Exclude cut with ID _3465_210_3525_1_1526175667060_3574049_62-335552-0_sp1.1 from training. Duration: 0.7909375 2023-10-02 21:49:50,946 WARNING [train.py:1204] (2/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_673-264125-0_sp1.1 from training. Duration: 0.963625 2023-10-02 21:49:51,201 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.conv_module2.balancer1.prob, batch_count=1030680.0, ans=0.125 2023-10-02 21:49:52,331 WARNING [train.py:1204] (2/4) Exclude cut with ID _8030_210_7497_1_1528444886781_3531089_262-86750-0 from training. Duration: 0.81 2023-10-02 21:49:52,345 WARNING [train.py:1204] (2/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_444-28041-0_sp1.1 from training. Duration: 0.77275 2023-10-02 21:49:52,620 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.conv_module1.balancer1.prob, batch_count=1030680.0, ans=0.125 2023-10-02 21:49:56,416 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_34008_210_10431_1_1533345424_8183247_516-14858-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 21:49:56,437 WARNING [train.py:1204] (2/4) Exclude cut with ID _32704_210_8954_1_1533017488840_6747370_54-248218-0_sp1.1 from training. Duration: 0.936375 2023-10-02 21:49:59,268 WARNING [train.py:1204] (2/4) Exclude cut with ID _13253_210_1925_1_1530757775819_3577873_300-119782-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 21:50:00,538 WARNING [train.py:1204] (2/4) Exclude cut with ID _39290_210_4220_1_1533862778760_3075118_21-240703-0_sp1.1 from training. Duration: 0.936375 2023-10-02 21:50:05,856 WARNING [train.py:1204] (2/4) Exclude cut with ID 774-127930-0014-10412-0_sp1.1 from training. Duration: 0.95 2023-10-02 21:50:07,326 WARNING [train.py:1204] (2/4) Exclude cut with ID _67499_210_17282_1_1535509805220_3692430_20-179346-0 from training. Duration: 0.97 2023-10-02 21:50:08,737 WARNING [train.py:1204] (2/4) Exclude cut with ID _8365_210_2064_1_1528592311070_4677690_431-294102-0_sp0.9 from training. Duration: 0.988875 2023-10-02 21:50:09,621 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.2.encoder.layers.0.conv_module2.whiten, num_groups=1, num_channels=384, metric=8.08 vs. limit=15.0 2023-10-02 21:50:14,497 WARNING [train.py:1204] (2/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_365-1492-0_sp1.1 from training. Duration: 0.82725 2023-10-02 21:50:14,647 INFO [scaling.py:1118] (2/4) WithLoss: name=encoder.encoders.0.layers.1.self_attn_weights, loss-sum=0.000e+00 2023-10-02 21:50:14,669 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.bypass_mid.scale_min, batch_count=1030813.3333333334, ans=0.2 2023-10-02 21:50:15,004 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.4.encoder.layers.1.conv_module1.whiten, num_groups=1, num_channels=384, metric=5.85 vs. limit=15.0 2023-10-02 21:50:15,870 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_105-114797-0_sp0.9 from training. Duration: 0.9333125 2023-10-02 21:50:15,988 WARNING [train.py:1204] (2/4) Exclude cut with ID _6274_210_2084_1_1527937583837_7494020_769-168302-0_sp0.9 from training. Duration: 0.788875 2023-10-02 21:50:18,875 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_40340_210_9024_1_1533433916_4132944_320-270277-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 21:50:18,880 WARNING [train.py:1204] (2/4) Exclude cut with ID ID0203W0003-781711 from training. Duration: 0.958 2023-10-02 21:50:20,807 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_30434_210_13049_1_1532519384_981746_52-12932-0_sp1.1 from training. Duration: 0.936375 2023-10-02 21:50:22,131 WARNING [train.py:1204] (2/4) Exclude cut with ID _8263_210_1664_1_1528438953494_294100_40-204731-0_sp0.9 from training. Duration: 0.5555625 2023-10-02 21:50:24,904 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1032-236863-0_sp0.9 from training. Duration: 0.9333125 2023-10-02 21:50:24,958 WARNING [train.py:1204] (2/4) Exclude cut with ID _8394_210_6702_1_1528590504316_3540189_335-274780-0_sp0.9 from training. Duration: 0.8666875 2023-10-02 21:50:24,974 WARNING [train.py:1204] (2/4) Exclude cut with ID _66873_210_5517_1_1535454136506_4293370_654-52713-0_sp1.1 from training. Duration: 0.7 2023-10-02 21:50:26,444 WARNING [train.py:1204] (2/4) Exclude cut with ID _5250_210_5219_1_1527249561806_8692110_742-267856-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 21:50:27,843 WARNING [train.py:1204] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_74-48717-0 from training. Duration: 0.58 2023-10-02 21:50:29,204 WARNING [train.py:1204] (2/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_18-46756-0 from training. Duration: 0.79 2023-10-02 21:50:30,510 WARNING [train.py:1204] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_612-56061-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 21:50:30,519 WARNING [train.py:1204] (2/4) Exclude cut with ID _56423_210_5854_1_1534896061049_3165969_412-2403-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 21:50:30,574 WARNING [train.py:1204] (2/4) Exclude cut with ID _62574_210_2655_1_1535180407058_3933249_120-196848-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 21:50:30,575 WARNING [train.py:1204] (2/4) Exclude cut with ID _40519_210_13794_1_1533715228655_7337390_916-51000-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 21:50:32,092 WARNING [train.py:1204] (2/4) Exclude cut with ID _65263_210_22356_1_1535763598691_3921037_108-21902-0_sp1.1 from training. Duration: 0.9 2023-10-02 21:50:33,450 WARNING [train.py:1204] (2/4) Exclude cut with ID _4122_210_2084_1_1526713164417_4353280_528-206771-0_sp1.1 from training. Duration: 0.8 2023-10-02 21:50:35,048 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.conv_module2.balancer1.prob, batch_count=1030880.0, ans=0.125 2023-10-02 21:50:36,249 WARNING [train.py:1204] (2/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_43-325657-0_sp1.1 from training. Duration: 0.92725 2023-10-02 21:50:36,303 WARNING [train.py:1204] (2/4) Exclude cut with ID _44659_210_2721_1_1534036120061_6614860_314-82041-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 21:50:38,820 WARNING [train.py:1204] (2/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_81-300212-0_sp0.9 from training. Duration: 0.6666875 2023-10-02 21:50:38,894 WARNING [train.py:1204] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_665-196155-0_sp0.9 from training. Duration: 0.7444375 2023-10-02 21:50:41,497 WARNING [train.py:1204] (2/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_140-336881-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 21:50:41,579 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6290_210_3273_1_1527595224_1282787_115-87095-0_sp0.9 from training. Duration: 0.611125 2023-10-02 21:50:42,903 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_53373_210_5399_1_1534229471_4715695_607-259685-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 21:50:44,268 WARNING [train.py:1204] (2/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_175-163894-0_sp1.1 from training. Duration: 0.67275 2023-10-02 21:50:44,296 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7002_210_1794_1_1527939683_5157286_1-281264-0_sp1.1 from training. Duration: 0.4 2023-10-02 21:50:44,582 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.conv_module1.balancer1.prob, batch_count=1030946.6666666666, ans=0.125 2023-10-02 21:50:45,888 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=1030946.6666666666, ans=0.1 2023-10-02 21:50:47,262 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.balancer2.prob, batch_count=1030946.6666666666, ans=0.125 2023-10-02 21:50:50,233 WARNING [train.py:1204] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_605-257205-0 from training. Duration: 0.59 2023-10-02 21:50:53,158 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=1031013.3333333334, ans=0.1 2023-10-02 21:50:54,244 INFO [train.py:1046] (2/4) Epoch 30, batch 600, loss[loss=0.1617, simple_loss=0.2477, pruned_loss=0.03785, over 24656.00 frames. ], tot_loss[loss=0.1658, simple_loss=0.2439, pruned_loss=0.04387, over 4475369.76 frames. ], batch size: 73, lr: 3.43e-03, grad_scale: 16.0 2023-10-02 21:50:54,289 WARNING [train.py:1204] (2/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_454-314901-0 from training. Duration: 0.46 2023-10-02 21:50:54,386 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_46032_210_14656_1_1533781412_4507969_287-130422-0_sp1.1 from training. Duration: 0.97275 2023-10-02 21:50:55,569 WARNING [train.py:1204] (2/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_186-309161-0_sp1.1 from training. Duration: 0.4909375 2023-10-02 21:50:55,617 WARNING [train.py:1204] (2/4) Exclude cut with ID _42659_210_2070_1_1533607442765_3120380_45-342307-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 21:51:02,603 WARNING [train.py:1204] (2/4) Exclude cut with ID _12555_210_8750_1_1530075594402_3780239_478-326818-0_sp1.1 from training. Duration: 0.963625 2023-10-02 21:51:02,818 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=1031013.3333333334, ans=0.1 2023-10-02 21:51:05,365 WARNING [train.py:1204] (2/4) Exclude cut with ID _7606_210_4915_1_1528894253277_5080690_127-177761-0_sp1.1 from training. Duration: 0.5818125 2023-10-02 21:51:05,460 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6788_210_1388_1_1527898698_4562021_91-197981-0 from training. Duration: 0.83 2023-10-02 21:51:06,932 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8558_210_1410_1_1528505225_7613406_131-329524-0_sp0.9 from training. Duration: 0.87775 2023-10-02 21:51:09,667 WARNING [train.py:1204] (2/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_153-153537-0_sp1.1 from training. Duration: 0.82725 2023-10-02 21:51:11,713 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_17292_210_6753_1_1531125388_2943734_285-349105-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 21:51:13,231 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.nonlin_attention.balancer.prob, batch_count=1031080.0, ans=0.125 2023-10-02 21:51:14,381 WARNING [train.py:1204] (2/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_37-41919-0 from training. Duration: 0.87 2023-10-02 21:51:14,409 WARNING [train.py:1204] (2/4) Exclude cut with ID _60860_210_8254_1_1534896015875_3557169_172-294852-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 21:51:20,621 WARNING [train.py:1204] (2/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_146-276944-0 from training. Duration: 0.83 2023-10-02 21:51:24,761 WARNING [train.py:1204] (2/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_1254-44082-0_sp1.1 from training. Duration: 0.936375 2023-10-02 21:51:24,780 WARNING [train.py:1204] (2/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_1115-180350-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 21:51:26,086 WARNING [train.py:1204] (2/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_60-146189-0_sp1.1 from training. Duration: 0.8909375 2023-10-02 21:51:30,361 WARNING [train.py:1204] (2/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_517-153649-0_sp1.1 from training. Duration: 0.92725 2023-10-02 21:51:30,373 WARNING [train.py:1204] (2/4) Exclude cut with ID _39571_210_13449_1_1533801584143_3651077_37-266257-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 21:51:31,693 WARNING [train.py:1204] (2/4) Exclude cut with ID _6653_210_2629_1_1527919172441_6732679_527-276118-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 21:51:37,366 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7696_210_2040_1_1528207167_3716099_117-125434-0_sp1.1 from training. Duration: 0.6909375 2023-10-02 21:51:37,703 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.conv_skip_rate, batch_count=1031213.3333333334, ans=0.0 2023-10-02 21:51:41,997 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_25767_210_2161_1_1532307483_3867748_478-156360-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 21:51:42,001 WARNING [train.py:1204] (2/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_177-49092-0_sp1.1 from training. Duration: 0.82725 2023-10-02 21:51:42,009 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_11305_210_1794_1_1529750428_7178148_680-317073-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 21:51:44,815 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.472e+02 1.850e+02 2.032e+02 2.229e+02 3.048e+02, threshold=4.063e+02, percent-clipped=0.0 2023-10-02 21:51:47,138 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.bypass.skip_rate, batch_count=1031213.3333333334, ans=0.07 2023-10-02 21:51:49,648 WARNING [train.py:1204] (2/4) Exclude cut with ID _53565_210_10635_1_1534337218335_3820259_287-18120-0 from training. Duration: 0.97 2023-10-02 21:51:55,664 WARNING [train.py:1204] (2/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_498-40153-0_sp1.1 from training. Duration: 0.57275 2023-10-02 21:51:55,693 WARNING [train.py:1204] (2/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_555-63066-0_sp1.1 from training. Duration: 0.963625 2023-10-02 21:51:58,526 WARNING [train.py:1204] (2/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_395-310220-0 from training. Duration: 0.89 2023-10-02 21:51:59,848 WARNING [train.py:1204] (2/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_937-208281-0_sp1.1 from training. Duration: 0.77275 2023-10-02 21:52:00,097 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.attention_skip_rate, batch_count=1031280.0, ans=0.0 2023-10-02 21:52:01,308 WARNING [train.py:1204] (2/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_120-211630-0 from training. Duration: 0.92 2023-10-02 21:52:02,661 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_454-276429-0_sp1.1 from training. Duration: 0.7909375 2023-10-02 21:52:02,711 WARNING [train.py:1204] (2/4) Exclude cut with ID _22826_210_9994_1_1531908201522_3279169_404-75889-0_sp1.1 from training. Duration: 0.8090625 2023-10-02 21:52:06,828 INFO [train.py:1046] (2/4) Epoch 30, batch 650, loss[loss=0.1543, simple_loss=0.2401, pruned_loss=0.03428, over 24455.00 frames. ], tot_loss[loss=0.1647, simple_loss=0.2428, pruned_loss=0.04332, over 4509949.26 frames. ], batch size: 66, lr: 3.43e-03, grad_scale: 16.0 2023-10-02 21:52:07,057 WARNING [train.py:1204] (2/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_282-136641-0_sp0.9 from training. Duration: 0.4666875 2023-10-02 21:52:09,089 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_809-17156-0_sp1.1 from training. Duration: 0.663625 2023-10-02 21:52:10,293 WARNING [train.py:1204] (2/4) Exclude cut with ID _9235_210_5219_1_1528891154253_8046120_1003-101638-0_sp0.9 from training. Duration: 0.811125 2023-10-02 21:52:11,670 WARNING [train.py:1204] (2/4) Exclude cut with ID _22826_210_9994_1_1531908201522_3279169_404-75889-0_sp0.9 from training. Duration: 0.988875 2023-10-02 21:52:13,053 WARNING [train.py:1204] (2/4) Exclude cut with ID _59288_210_5854_1_1535077757774_3412246_570-295255-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 21:52:16,561 WARNING [train.py:1204] (2/4) Exclude cut with ID _3465_210_3525_1_1526175667060_3574049_412-287526-0 from training. Duration: 0.72 2023-10-02 21:52:17,819 WARNING [train.py:1204] (2/4) Exclude cut with ID _44010_210_4525_1_1533884262964_4013620_649-79655-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 21:52:22,568 WARNING [train.py:1204] (2/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_57-270037-0_sp0.9 from training. Duration: 0.8555625 2023-10-02 21:52:22,569 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_34815_210_6388_1_1533381320_3571903_302-255963-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 21:52:26,730 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_47449_210_5399_1_1534076445_4006929_573-301852-0_sp1.1 from training. Duration: 0.97275 2023-10-02 21:52:28,352 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.conv_module1.balancer2.prob, batch_count=1031413.3333333334, ans=0.125 2023-10-02 21:52:29,523 WARNING [train.py:1204] (2/4) Exclude cut with ID 6709-74022-0004-86860-0_sp1.1 from training. Duration: 0.9409375 2023-10-02 21:52:30,916 WARNING [train.py:1204] (2/4) Exclude cut with ID _40519_210_13794_1_1533715228655_7337390_111-71991-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 21:52:32,268 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_30167_210_3528_1_1532428012_4772375_169-214186-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 21:52:34,989 WARNING [train.py:1204] (2/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_20-235313-0_sp1.1 from training. Duration: 0.963625 2023-10-02 21:52:35,031 WARNING [train.py:1204] (2/4) Exclude cut with ID _4681_210_3525_1_1527251040321_1235713_123-24428-0_sp0.9 from training. Duration: 0.5333125 2023-10-02 21:52:35,314 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.feed_forward1.hidden_balancer.prob, batch_count=1031480.0, ans=0.125 2023-10-02 21:52:38,171 WARNING [train.py:1204] (2/4) Exclude cut with ID _31602_210_5854_1_1532677021456_4458250_444-198752-0_sp1.1 from training. Duration: 0.97275 2023-10-02 21:52:38,230 WARNING [train.py:1204] (2/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_9-279442-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 21:52:40,046 WARNING [train.py:1204] (2/4) Exclude cut with ID _6275_210_2084_1_1528110062213_7669049_973-198085-0_sp0.9 from training. Duration: 0.8333125 2023-10-02 21:52:41,361 WARNING [train.py:1204] (2/4) Exclude cut with ID _29853_210_7497_1_1532505600851_6642110_567-250221-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 21:52:42,773 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6545_210_2040_1_1527774670_4245258_407-48070-0_sp1.1 from training. Duration: 0.6909375 2023-10-02 21:52:45,448 WARNING [train.py:1204] (2/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_224-343295-0_sp0.9 from training. Duration: 0.611125 2023-10-02 21:52:45,471 WARNING [train.py:1204] (2/4) Exclude cut with ID IC0125W0011-61817 from training. Duration: 0.88 2023-10-02 21:52:45,481 WARNING [train.py:1204] (2/4) Exclude cut with ID _60113_210_16271_1_1535164212481_4386079_435-154371-0_sp1.1 from training. Duration: 0.97275 2023-10-02 21:52:45,492 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_25836_210_16554_1_1532138167_53197_3-9394-0_sp1.1 from training. Duration: 0.92725 2023-10-02 21:52:47,020 WARNING [train.py:1204] (2/4) Exclude cut with ID _29343_210_4381_1_1532602757752_3674109_57-55125-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 21:52:47,304 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.feed_forward1.hidden_balancer.prob, batch_count=1031480.0, ans=0.125 2023-10-02 21:52:48,800 WARNING [train.py:1204] (2/4) Exclude cut with ID _56235_210_10640_1_1534557335235_4161099_267-23560-0_sp1.1 from training. Duration: 0.936375 2023-10-02 21:52:48,830 WARNING [train.py:1204] (2/4) Exclude cut with ID _30196_210_1664_1_1533708496522_7074140_1150-278867-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 21:52:50,164 WARNING [train.py:1204] (2/4) Exclude cut with ID _6270_210_2084_1_1527850911573_3856420_26-277343-0_sp0.9 from training. Duration: 0.911125 2023-10-02 21:52:51,934 WARNING [train.py:1204] (2/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_325-62802-0 from training. Duration: 0.87 2023-10-02 21:52:52,243 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.conv_skip_rate, batch_count=1031546.6666666666, ans=0.0 2023-10-02 21:52:53,301 WARNING [train.py:1204] (2/4) Exclude cut with ID _56524_210_9642_1_1534572011779_3619170_145-310842-0_sp1.1 from training. Duration: 0.9 2023-10-02 21:52:53,327 WARNING [train.py:1204] (2/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_860-96198-0_sp0.9 from training. Duration: 0.811125 2023-10-02 21:52:54,693 WARNING [train.py:1204] (2/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_17-25045-0_sp1.1 from training. Duration: 0.536375 2023-10-02 21:52:54,713 WARNING [train.py:1204] (2/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_349-69727-0_sp1.1 from training. Duration: 0.936375 2023-10-02 21:52:56,124 WARNING [train.py:1204] (2/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_580-93798-0_sp1.1 from training. Duration: 0.5090625 2023-10-02 21:52:56,224 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7762_210_1794_1_1528199508_4057408_419-125009-0 from training. Duration: 0.83 2023-10-02 21:52:56,722 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.4.encoder.layers.2.self_attn2.whiten, num_groups=1, num_channels=384, metric=11.92 vs. limit=22.5 2023-10-02 21:52:57,643 WARNING [train.py:1204] (2/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_356-167816-0 from training. Duration: 0.72 2023-10-02 21:52:58,966 WARNING [train.py:1204] (2/4) Exclude cut with ID _29345_210_4381_1_1532689168263_3860169_225-97100-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 21:52:58,977 WARNING [train.py:1204] (2/4) Exclude cut with ID _14227_210_4381_1_1530701804286_3940259_374-194712-0_sp1.1 from training. Duration: 0.92725 2023-10-02 21:52:59,005 WARNING [train.py:1204] (2/4) Exclude cut with ID _40137_210_12210_1_1533290484046_3795180_249-187059-0_sp0.9 from training. Duration: 0.97775 2023-10-02 21:52:59,030 WARNING [train.py:1204] (2/4) Exclude cut with ID _22826_210_9994_1_1531908201522_3279169_389-317194-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 21:52:59,164 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.1.bypass.skip_rate, batch_count=1031546.6666666666, ans=0.035 2023-10-02 21:53:01,755 WARNING [train.py:1204] (2/4) Exclude cut with ID _27485_210_4916_1_1532739191677_3668269_239-122918-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 21:53:06,558 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_2245_210_2715_1_1525511548_3540254_128-117609-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 21:53:06,585 WARNING [train.py:1204] (2/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_160-276178-0_sp1.1 from training. Duration: 0.963625 2023-10-02 21:53:07,935 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_31920_210_10633_1_1532860009_1686094_1-279735-0_sp1.1 from training. Duration: 0.97275 2023-10-02 21:53:11,176 WARNING [train.py:1204] (2/4) Exclude cut with ID _51078_210_9070_1_1534161557424_3805250_325-262383-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 21:53:11,202 WARNING [train.py:1204] (2/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_643-52007-0_sp0.9 from training. Duration: 0.6333125 2023-10-02 21:53:12,564 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_51937_210_5014_1_1534573610_4022025_518-299248-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 21:53:15,565 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.bypass.skip_rate, batch_count=1031613.3333333334, ans=0.04949747468305833 2023-10-02 21:53:19,648 WARNING [train.py:1204] (2/4) Exclude cut with ID _13252_210_1925_1_1530671372047_3336909_166-314607-0_sp1.1 from training. Duration: 0.4909375 2023-10-02 21:53:19,652 WARNING [train.py:1204] (2/4) Exclude cut with ID _44778_210_2054_1_1533798127203_7443090_1087-282939-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 21:53:19,679 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_23850_210_4238_1_1532232597_1632433_32-185135-0_sp1.1 from training. Duration: 0.7818125 2023-10-02 21:53:19,710 WARNING [train.py:1204] (2/4) Exclude cut with ID _59500_210_8254_1_1534809610680_3736009_145-89359-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 21:53:21,521 INFO [train.py:1046] (2/4) Epoch 30, batch 700, loss[loss=0.1538, simple_loss=0.2428, pruned_loss=0.03238, over 24629.00 frames. ], tot_loss[loss=0.164, simple_loss=0.2412, pruned_loss=0.04338, over 4540987.28 frames. ], batch size: 68, lr: 3.43e-03, grad_scale: 16.0 2023-10-02 21:53:24,335 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_340-336408-0 from training. Duration: 0.93 2023-10-02 21:53:24,404 WARNING [train.py:1204] (2/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_9-21737-0 from training. Duration: 0.97 2023-10-02 21:53:27,089 WARNING [train.py:1204] (2/4) Exclude cut with ID _2220_210_1819_1_1525509109898_3658680_50-99837-0 from training. Duration: 0.54 2023-10-02 21:53:28,491 WARNING [train.py:1204] (2/4) Exclude cut with ID _30304_210_6319_1_1532476827261_7582060_879-299350-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 21:53:29,966 WARNING [train.py:1204] (2/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_905-160074-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 21:53:31,325 WARNING [train.py:1204] (2/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_395-73570-0 from training. Duration: 0.99 2023-10-02 21:53:31,551 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.conv_module2.balancer2.prob, batch_count=1031680.0, ans=0.125 2023-10-02 21:53:35,925 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7264_210_4938_1_1527939911_8132095_800-91995-0_sp1.1 from training. Duration: 0.7818125 2023-10-02 21:53:39,903 WARNING [train.py:1204] (2/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_77-311404-0_sp1.1 from training. Duration: 0.8545625 2023-10-02 21:53:41,806 WARNING [train.py:1204] (2/4) Exclude cut with ID _13679_210_7497_1_1530534293662_3355098_107-80772-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 21:53:43,273 WARNING [train.py:1204] (2/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_224-343295-0_sp1.1 from training. Duration: 0.5 2023-10-02 21:53:44,572 WARNING [train.py:1204] (2/4) Exclude cut with ID _54871_210_22356_1_1534665798253_3856359_156-253482-0_sp1.1 from training. Duration: 0.963625 2023-10-02 21:53:46,025 WARNING [train.py:1204] (2/4) Exclude cut with ID _49728_210_12651_1_1534041034294_6788150_309-125532-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 21:53:47,579 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.nonlin_attention.balancer.prob, batch_count=1031746.6666666666, ans=0.125 2023-10-02 21:53:48,794 WARNING [train.py:1204] (2/4) Exclude cut with ID _13913_210_9994_1_1530579615187_3383897_473-277682-0_sp0.9 from training. Duration: 0.4333125 2023-10-02 21:53:48,799 WARNING [train.py:1204] (2/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_3-276701-0_sp1.1 from training. Duration: 0.82725 2023-10-02 21:53:50,995 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.bypass.scale_min, batch_count=1031813.3333333334, ans=0.2 2023-10-02 21:53:52,094 WARNING [train.py:1204] (2/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_282-136641-0 from training. Duration: 0.42 2023-10-02 21:53:52,318 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.conv_module1.balancer2.prob, batch_count=1031813.3333333334, ans=0.125 2023-10-02 21:53:54,891 WARNING [train.py:1204] (2/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_127-566-0 from training. Duration: 0.84 2023-10-02 21:53:58,061 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6163_210_6400_1_1527506746_4263068_283-137811-0_sp1.1 from training. Duration: 0.7 2023-10-02 21:53:58,098 WARNING [train.py:1204] (2/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_34-183685-0_sp1.1 from training. Duration: 0.8909375 2023-10-02 21:53:59,504 WARNING [train.py:1204] (2/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_154-135696-0_sp0.9 from training. Duration: 0.888875 2023-10-02 21:54:02,768 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.5.encoder.layers.1.self_attn_weights.whiten_keys, num_groups=4, num_channels=128, metric=1.82 vs. limit=6.0 2023-10-02 21:54:04,935 WARNING [train.py:1204] (2/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_676-51014-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 21:54:05,004 WARNING [train.py:1204] (2/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_38-169920-0 from training. Duration: 0.91 2023-10-02 21:54:09,510 WARNING [train.py:1204] (2/4) Exclude cut with ID _6653_210_2629_1_1527919172441_6732679_747-114465-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 21:54:09,559 WARNING [train.py:1204] (2/4) Exclude cut with ID _8021_210_7994_1_1528376451358_3653069_100-140231-0_sp1.1 from training. Duration: 0.6090625 2023-10-02 21:54:10,922 WARNING [train.py:1204] (2/4) Exclude cut with ID _64135_210_2253_1_1535677267876_7608972_133-188266-0 from training. Duration: 0.81 2023-10-02 21:54:12,183 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.583e+02 1.810e+02 2.003e+02 2.229e+02 3.158e+02, threshold=4.006e+02, percent-clipped=0.0 2023-10-02 21:54:13,727 WARNING [train.py:1204] (2/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_714-310538-0_sp1.1 from training. Duration: 0.7818125 2023-10-02 21:54:15,605 WARNING [train.py:1204] (2/4) Exclude cut with ID _39867_210_9826_1_1533553222030_9344259_1134-288498-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 21:54:18,420 WARNING [train.py:1204] (2/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_211-237289-0_sp1.1 from training. Duration: 0.936375 2023-10-02 21:54:18,610 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.1.self_attn_weights.pos_emb_skip_rate, batch_count=1031946.6666666666, ans=0.0 2023-10-02 21:54:22,942 WARNING [train.py:1204] (2/4) Exclude cut with ID _59202_210_26984_1_1534816810612_3549174_137-318710-0_sp0.9 from training. Duration: 0.988875 2023-10-02 21:54:22,967 WARNING [train.py:1204] (2/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_86-66192-0 from training. Duration: 0.55 2023-10-02 21:54:24,434 WARNING [train.py:1204] (2/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_540-275685-0 from training. Duration: 0.89 2023-10-02 21:54:26,359 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8776_210_2040_1_1528638837_4088036_244-278763-0 from training. Duration: 0.9 2023-10-02 21:54:27,890 WARNING [train.py:1204] (2/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_9-219972-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 21:54:29,281 WARNING [train.py:1204] (2/4) Exclude cut with ID _44659_210_2721_1_1534036120061_6614860_656-283823-0_sp1.1 from training. Duration: 0.92725 2023-10-02 21:54:30,586 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_69630_210_7234_1_1535789601_661049_77-216385-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 21:54:33,574 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_19952_210_2161_1_1531875469_3851750_210-308919-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 21:54:33,587 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_4737_210_2040_1_1526998689_3293008_378-178650-0 from training. Duration: 0.95 2023-10-02 21:54:34,945 INFO [train.py:1046] (2/4) Epoch 30, batch 750, loss[loss=0.1532, simple_loss=0.232, pruned_loss=0.0372, over 24447.00 frames. ], tot_loss[loss=0.1631, simple_loss=0.2405, pruned_loss=0.04283, over 4574476.45 frames. ], batch size: 58, lr: 3.43e-03, grad_scale: 16.0 2023-10-02 21:54:38,356 WARNING [train.py:1204] (2/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_224-343295-0 from training. Duration: 0.55 2023-10-02 21:54:38,588 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=1032013.3333333334, ans=0.1 2023-10-02 21:54:39,625 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7002_210_1794_1_1527939683_5157286_184-229630-0 from training. Duration: 0.74 2023-10-02 21:54:39,663 WARNING [train.py:1204] (2/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_6-214815-0 from training. Duration: 0.85 2023-10-02 21:54:39,855 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.balancer2.prob, batch_count=1032013.3333333334, ans=0.125 2023-10-02 21:54:41,039 WARNING [train.py:1204] (2/4) Exclude cut with ID _8699_210_2069_1_1528630346831_3960409_160-24408-0 from training. Duration: 0.91 2023-10-02 21:54:42,391 WARNING [train.py:1204] (2/4) Exclude cut with ID _5585_210_2667_1_1527418813324_8007340_89-332072-0 from training. Duration: 0.64 2023-10-02 21:54:43,718 WARNING [train.py:1204] (2/4) Exclude cut with ID _5250_210_5219_1_1527249561806_8692110_528-259800-0_sp1.1 from training. Duration: 0.963625 2023-10-02 21:54:43,796 WARNING [train.py:1204] (2/4) Exclude cut with ID _55383_210_10635_1_1534468426172_2231669_134-128246-0 from training. Duration: 0.89 2023-10-02 21:54:45,089 WARNING [train.py:1204] (2/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_400-338274-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 21:54:45,143 WARNING [train.py:1204] (2/4) Exclude cut with ID _14224_210_4381_1_1530529062037_4264010_364-22817-0_sp0.9 from training. Duration: 0.788875 2023-10-02 21:54:47,109 WARNING [train.py:1204] (2/4) Exclude cut with ID _42863_210_9070_1_1533902398788_3696920_331-291057-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 21:54:47,265 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=1032013.3333333334, ans=0.1 2023-10-02 21:54:47,321 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.balancer1.prob, batch_count=1032013.3333333334, ans=0.125 2023-10-02 21:54:48,622 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.1.conv_skip_rate, batch_count=1032080.0, ans=0.0 2023-10-02 21:54:50,304 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_5436_210_1794_1_1527331317_2541494_229-211016-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 21:54:50,340 WARNING [train.py:1204] (2/4) Exclude cut with ID _6456_210_1534_1_1527681634212_3181049_345-252877-0_sp0.9 from training. Duration: 0.711125 2023-10-02 21:54:51,586 WARNING [train.py:1204] (2/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_96-81499-0_sp1.1 from training. Duration: 0.92725 2023-10-02 21:54:53,009 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_5575_210_1410_1_1527301305_2570170_342-265572-0_sp1.1 from training. Duration: 0.8545625 2023-10-02 21:54:53,080 WARNING [train.py:1204] (2/4) Exclude cut with ID _5444_210_2151_1_1527418808464_5312210_726-27165-0_sp0.9 from training. Duration: 0.7444375 2023-10-02 21:54:55,915 WARNING [train.py:1204] (2/4) Exclude cut with ID _22083_210_8254_1_1531702862439_3678970_304-327750-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 21:54:58,673 WARNING [train.py:1204] (2/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_245-226199-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 21:54:58,710 WARNING [train.py:1204] (2/4) Exclude cut with ID _42875_210_14856_1_1533704397487_4012433_102-199499-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 21:55:00,384 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_51225_210_3601_1_1534400634_2327383_55-301844-0 from training. Duration: 0.93 2023-10-02 21:55:01,939 WARNING [train.py:1204] (2/4) Exclude cut with ID _28317_210_12742_1_1532480302381_8510325_916-195848-0_sp1.1 from training. Duration: 0.836375 2023-10-02 21:55:01,999 WARNING [train.py:1204] (2/4) Exclude cut with ID _44718_210_5816_1_1534050051869_6469030_497-311999-0_sp1.1 from training. Duration: 0.936375 2023-10-02 21:55:03,378 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_34886_210_10633_1_1533203882_1755661_91-194765-0_sp1.1 from training. Duration: 0.936375 2023-10-02 21:55:04,777 WARNING [train.py:1204] (2/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_310-26186-0_sp0.9 from training. Duration: 0.77775 2023-10-02 21:55:06,118 WARNING [train.py:1204] (2/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_416-257730-0 from training. Duration: 0.65 2023-10-02 21:55:06,137 WARNING [train.py:1204] (2/4) Exclude cut with ID _9389_210_1664_1_1529217337301_5289269_209-154760-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 21:55:07,458 WARNING [train.py:1204] (2/4) Exclude cut with ID _4092_210_2151_1_1526643044449_3135759_227-213972-0 from training. Duration: 0.54 2023-10-02 21:55:07,477 WARNING [train.py:1204] (2/4) Exclude cut with ID IC0679W0044-335852 from training. Duration: 0.768 2023-10-02 21:55:08,817 WARNING [train.py:1204] (2/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_42-186162-0 from training. Duration: 0.92 2023-10-02 21:55:08,824 WARNING [train.py:1204] (2/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_302-2550-0_sp0.9 from training. Duration: 0.9 2023-10-02 21:55:08,856 WARNING [train.py:1204] (2/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_356-31216-0_sp0.9 from training. Duration: 0.8333125 2023-10-02 21:55:10,680 WARNING [train.py:1204] (2/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_125-322172-0_sp1.1 from training. Duration: 0.8454375 2023-10-02 21:55:16,799 WARNING [train.py:1204] (2/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_141-89727-0_sp0.9 from training. Duration: 0.788875 2023-10-02 21:55:17,417 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.3.encoder.layers.2.nonlin_attention.whiten2, num_groups=1, num_channels=512, metric=6.10 vs. limit=15.0 2023-10-02 21:55:18,570 WARNING [train.py:1204] (2/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_647-337784-0_sp1.1 from training. Duration: 0.97275 2023-10-02 21:55:18,579 WARNING [train.py:1204] (2/4) Exclude cut with ID _6493_210_1747_1_1528459210424_4012470_513-140967-0_sp1.1 from training. Duration: 0.7181875 2023-10-02 21:55:20,017 WARNING [train.py:1204] (2/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_87-266334-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 21:55:22,641 WARNING [train.py:1204] (2/4) Exclude cut with ID _26528_210_9070_1_1532570410842_3851310_526-261338-0_sp1.1 from training. Duration: 0.92725 2023-10-02 21:55:22,690 WARNING [train.py:1204] (2/4) Exclude cut with ID _14224_210_4381_1_1530529062037_4264010_380-343670-0 from training. Duration: 0.88 2023-10-02 21:55:24,026 WARNING [train.py:1204] (2/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_449-15508-0_sp1.1 from training. Duration: 0.7454375 2023-10-02 21:55:24,132 WARNING [train.py:1204] (2/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_372-6869-0_sp0.9 from training. Duration: 0.488875 2023-10-02 21:55:25,417 WARNING [train.py:1204] (2/4) Exclude cut with ID _26907_210_7234_1_1532221740694_3205263_337-2494-0_sp0.9 from training. Duration: 0.97775 2023-10-02 21:55:28,189 WARNING [train.py:1204] (2/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_10-100367-0_sp1.1 from training. Duration: 0.8909375 2023-10-02 21:55:28,223 WARNING [train.py:1204] (2/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_47-167373-0 from training. Duration: 0.92 2023-10-02 21:55:29,567 WARNING [train.py:1204] (2/4) Exclude cut with ID _28323_210_16187_1_1532480395667_4100300_628-282179-0_sp1.1 from training. Duration: 0.97275 2023-10-02 21:55:34,408 WARNING [train.py:1204] (2/4) Exclude cut with ID _20455_210_5219_1_1531565996775_8098100_213-280898-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 21:55:35,823 WARNING [train.py:1204] (2/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_383-316000-0_sp1.1 from training. Duration: 0.8181875 2023-10-02 21:55:37,115 WARNING [train.py:1204] (2/4) Exclude cut with ID _18513_210_9206_1_1531618215223_3862620_16-78180-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 21:55:38,549 WARNING [train.py:1204] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_330-327861-0_sp1.1 from training. Duration: 0.6090625 2023-10-02 21:55:43,332 WARNING [train.py:1204] (2/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_356-31216-0 from training. Duration: 0.75 2023-10-02 21:55:43,350 WARNING [train.py:1204] (2/4) Exclude cut with ID _25248_210_8750_1_1532062825941_5554180_255-148539-0_sp1.1 from training. Duration: 0.963625 2023-10-02 21:55:44,814 WARNING [train.py:1204] (2/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_269-154726-0_sp1.1 from training. Duration: 0.7818125 2023-10-02 21:55:47,594 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7002_210_1794_1_1527939683_5157286_163-87552-0_sp1.1 from training. Duration: 0.7818125 2023-10-02 21:55:47,637 WARNING [train.py:1204] (2/4) Exclude cut with ID _54871_210_22356_1_1534665798253_3856359_97-151896-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 21:55:48,922 INFO [train.py:1046] (2/4) Epoch 30, batch 800, loss[loss=0.1717, simple_loss=0.2595, pruned_loss=0.04197, over 24400.00 frames. ], tot_loss[loss=0.1633, simple_loss=0.241, pruned_loss=0.04284, over 4602764.37 frames. ], batch size: 77, lr: 3.43e-03, grad_scale: 32.0 2023-10-02 21:55:49,056 WARNING [train.py:1204] (2/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_207-7026-0_sp1.1 from training. Duration: 0.97275 2023-10-02 21:55:49,075 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_92-46009-0_sp0.9 from training. Duration: 0.6 2023-10-02 21:55:56,826 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.feed_forward1.hidden_balancer.prob, batch_count=1032346.6666666666, ans=0.125 2023-10-02 21:55:58,063 WARNING [train.py:1204] (2/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_122-178847-0_sp1.1 from training. Duration: 0.97275 2023-10-02 21:55:58,067 WARNING [train.py:1204] (2/4) Exclude cut with ID _44778_210_2054_1_1533798127203_7443090_94-348956-0_sp1.1 from training. Duration: 0.92725 2023-10-02 21:56:00,679 WARNING [train.py:1204] (2/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_1018-244089-0_sp1.1 from training. Duration: 0.963625 2023-10-02 21:56:00,693 WARNING [train.py:1204] (2/4) Exclude cut with ID _56235_210_10640_1_1534557335235_4161099_87-118423-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 21:56:00,789 WARNING [train.py:1204] (2/4) Exclude cut with ID _53713_210_24116_1_1534314487762_7416751_504-212292-0_sp1.1 from training. Duration: 0.92725 2023-10-02 21:56:02,146 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_446-150327-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 21:56:04,085 WARNING [train.py:1204] (2/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_682-245989-0_sp1.1 from training. Duration: 0.92725 2023-10-02 21:56:08,152 WARNING [train.py:1204] (2/4) Exclude cut with ID _35723_210_1794_1_1533088341043_628250_8-234524-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 21:56:08,224 WARNING [train.py:1204] (2/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_64-248359-0_sp0.9 from training. Duration: 0.7666875 2023-10-02 21:56:12,191 WARNING [train.py:1204] (2/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_49-97158-0 from training. Duration: 0.76 2023-10-02 21:56:12,242 WARNING [train.py:1204] (2/4) Exclude cut with ID _24314_210_4915_1_1532135078116_7170179_743-915-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 21:56:13,624 WARNING [train.py:1204] (2/4) Exclude cut with ID _61194_210_18628_1_1535007555628_3750909_353-323468-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 21:56:13,651 WARNING [train.py:1204] (2/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_387-131538-0_sp1.1 from training. Duration: 0.736375 2023-10-02 21:56:15,389 WARNING [train.py:1204] (2/4) Exclude cut with ID _44424_210_18163_1_1533623394848_1213110_79-187603-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 21:56:15,426 WARNING [train.py:1204] (2/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_86-19601-0 from training. Duration: 0.85 2023-10-02 21:56:15,452 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_50050_210_16223_1_1535435796_7290138_103-283529-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 21:56:16,886 WARNING [train.py:1204] (2/4) Exclude cut with ID _13913_210_9994_1_1530579615187_3383897_473-277682-0 from training. Duration: 0.39 2023-10-02 21:56:19,652 WARNING [train.py:1204] (2/4) Exclude cut with ID _42326_210_7770_1_1533814282531_3707269_368-68029-0_sp1.1 from training. Duration: 0.92725 2023-10-02 21:56:21,582 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_41428_210_2161_1_1533774184_3992532_446-339645-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 21:56:24,320 WARNING [train.py:1204] (2/4) Exclude cut with ID _69584_210_5219_1_1535785133775_4044450_325-7656-0_sp1.1 from training. Duration: 0.7818125 2023-10-02 21:56:24,334 WARNING [train.py:1204] (2/4) Exclude cut with ID _21632_210_2253_1_1531994001765_3744140_134-184387-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 21:56:26,988 WARNING [train.py:1204] (2/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_550-184707-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 21:56:27,014 WARNING [train.py:1204] (2/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_106-341780-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 21:56:31,318 WARNING [train.py:1204] (2/4) Exclude cut with ID _45871_210_5665_1_1533864614245_3296060_253-132552-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 21:56:31,363 WARNING [train.py:1204] (2/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_395-310220-0_sp1.1 from training. Duration: 0.8090625 2023-10-02 21:56:31,376 WARNING [train.py:1204] (2/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_795-33252-0_sp0.9 from training. Duration: 0.411125 2023-10-02 21:56:32,767 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9002W0003-495962 from training. Duration: 0.946 2023-10-02 21:56:32,788 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_43-71989-0 from training. Duration: 0.57 2023-10-02 21:56:32,801 WARNING [train.py:1204] (2/4) Exclude cut with ID _8699_210_2069_1_1528630346831_3960409_155-39766-0_sp1.1 from training. Duration: 0.7090625 2023-10-02 21:56:32,815 WARNING [train.py:1204] (2/4) Exclude cut with ID _27894_210_12392_1_1532415496078_6902439_788-232684-0_sp1.1 from training. Duration: 0.97275 2023-10-02 21:56:34,208 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.nonlin_attention.balancer.prob, batch_count=1032546.6666666666, ans=0.125 2023-10-02 21:56:35,350 WARNING [train.py:1204] (2/4) Exclude cut with ID _56494_210_4220_1_1535284837625_3033169_10-39296-0_sp1.1 from training. Duration: 0.92725 2023-10-02 21:56:35,397 WARNING [train.py:1204] (2/4) Exclude cut with ID _40901_210_5665_1_1533363789958_3856130_341-20819-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 21:56:40,117 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.601e+02 1.899e+02 2.201e+02 2.718e+02 4.038e+02, threshold=4.403e+02, percent-clipped=2.0 2023-10-02 21:56:40,308 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9029W0435-509822 from training. Duration: 0.896 2023-10-02 21:56:41,649 WARNING [train.py:1204] (2/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_51-117576-0 from training. Duration: 0.4 2023-10-02 21:56:43,078 WARNING [train.py:1204] (2/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_197-28164-0_sp1.1 from training. Duration: 0.72725 2023-10-02 21:56:44,588 WARNING [train.py:1204] (2/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_145-137790-0_sp1.1 from training. Duration: 0.7545625 2023-10-02 21:56:49,191 WARNING [train.py:1204] (2/4) Exclude cut with ID _47790_210_16187_1_1533862814066_4207519_2-39653-0_sp1.1 from training. Duration: 0.8909375 2023-10-02 21:56:51,420 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_3780_210_2087_1_1526467760_2130483_186-208240-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 21:56:51,494 WARNING [train.py:1204] (2/4) Exclude cut with ID _24055_210_12651_1_1532310907143_976979_92-59356-0 from training. Duration: 0.91 2023-10-02 21:56:52,815 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_3910_210_1794_1_1526742306_334530_28-238909-0_sp1.1 from training. Duration: 0.736375 2023-10-02 21:56:56,811 WARNING [train.py:1204] (2/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_269-154726-0 from training. Duration: 0.86 2023-10-02 21:56:59,740 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.conv_module1.balancer2.prob, batch_count=1032613.3333333334, ans=0.125 2023-10-02 21:57:00,533 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.0.layers.0.self_attn1.whiten, num_groups=1, num_channels=192, metric=11.11 vs. limit=22.5 2023-10-02 21:57:00,875 WARNING [train.py:1204] (2/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_326-165900-0_sp0.9 from training. Duration: 0.7666875 2023-10-02 21:57:02,196 INFO [train.py:1046] (2/4) Epoch 30, batch 850, loss[loss=0.1285, simple_loss=0.205, pruned_loss=0.02602, over 24303.00 frames. ], tot_loss[loss=0.1642, simple_loss=0.2422, pruned_loss=0.04304, over 4640947.99 frames. ], batch size: 56, lr: 3.43e-03, grad_scale: 32.0 2023-10-02 21:57:02,376 WARNING [train.py:1204] (2/4) Exclude cut with ID _5294_210_1747_1_1527249643928_3696179_470-276077-0_sp1.1 from training. Duration: 0.963625 2023-10-02 21:57:03,683 WARNING [train.py:1204] (2/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_372-131163-0 from training. Duration: 0.94 2023-10-02 21:57:04,988 WARNING [train.py:1204] (2/4) Exclude cut with ID _45297_210_2721_1_1533715351381_3264949_351-147136-0_sp1.1 from training. Duration: 0.7909375 2023-10-02 21:57:05,062 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_1072-12671-0_sp1.1 from training. Duration: 0.92725 2023-10-02 21:57:05,151 WARNING [train.py:1204] (2/4) Exclude cut with ID _15133_210_6228_1_1530794512074_3080379_54-198193-0 from training. Duration: 0.94 2023-10-02 21:57:06,937 WARNING [train.py:1204] (2/4) Exclude cut with ID _30196_210_1664_1_1533708496522_7074140_1194-270988-0_sp1.1 from training. Duration: 0.97275 2023-10-02 21:57:07,049 WARNING [train.py:1204] (2/4) Exclude cut with ID _50310_210_15488_1_1534068043345_3641080_289-193637-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 21:57:08,436 WARNING [train.py:1204] (2/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_313-263495-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 21:57:09,917 WARNING [train.py:1204] (2/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_18-130336-0_sp1.1 from training. Duration: 0.7181875 2023-10-02 21:57:11,218 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_47896_210_20508_1_1534492518_5958868_646-287209-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 21:57:11,977 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.feed_forward2.out_whiten.whitening_limit, batch_count=1032680.0, ans=15.0 2023-10-02 21:57:12,500 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_294-163793-0 from training. Duration: 0.82 2023-10-02 21:57:12,532 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_8-319957-0 from training. Duration: 0.96 2023-10-02 21:57:12,534 WARNING [train.py:1204] (2/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_1211-226066-0 from training. Duration: 0.76 2023-10-02 21:57:14,728 INFO [scaling.py:1118] (2/4) WithLoss: name=encoder.encoders.3.encoder.layers.0.self_attn_weights, loss-sum=0.000e+00 2023-10-02 21:57:15,843 WARNING [train.py:1204] (2/4) Exclude cut with ID _4779_210_2084_1_1527318022944_3914893_500-285013-0_sp0.9 from training. Duration: 0.7666875 2023-10-02 21:57:15,876 WARNING [train.py:1204] (2/4) Exclude cut with ID _22826_210_9994_1_1531908201522_3279169_347-232268-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 21:57:19,219 WARNING [train.py:1204] (2/4) Exclude cut with ID _29877_210_6228_1_1532397791106_5276149_111-55056-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 21:57:19,250 WARNING [train.py:1204] (2/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_812-271646-0_sp1.1 from training. Duration: 0.92725 2023-10-02 21:57:19,281 WARNING [train.py:1204] (2/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_235-262124-0_sp1.1 from training. Duration: 0.6090625 2023-10-02 21:57:24,062 WARNING [train.py:1204] (2/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_14-16530-0_sp1.1 from training. Duration: 0.97275 2023-10-02 21:57:24,097 WARNING [train.py:1204] (2/4) Exclude cut with ID _44718_210_5816_1_1534050051869_6469030_568-304512-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 21:57:24,117 WARNING [train.py:1204] (2/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_1033-274376-0 from training. Duration: 0.87 2023-10-02 21:57:26,789 WARNING [train.py:1204] (2/4) Exclude cut with ID _4681_210_3525_1_1527251040321_1235713_123-24428-0 from training. Duration: 0.48 2023-10-02 21:57:28,319 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_37818_210_4238_1_1533620910_4720083_251-288764-0_sp1.1 from training. Duration: 0.97275 2023-10-02 21:57:28,537 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.ff2_skip_rate, batch_count=1032746.6666666666, ans=0.0 2023-10-02 21:57:29,613 WARNING [train.py:1204] (2/4) Exclude cut with ID _65585_210_12210_1_1535450451584_3764098_112-294301-0 from training. Duration: 0.88 2023-10-02 21:57:29,862 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.self_attn_weights.pos_emb_skip_rate, batch_count=1032746.6666666666, ans=0.0 2023-10-02 21:57:31,943 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.0.layers.1.nonlin_attention.whiten1, num_groups=1, num_channels=144, metric=5.35 vs. limit=10.0 2023-10-02 21:57:33,721 WARNING [train.py:1204] (2/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_329-113173-0 from training. Duration: 0.62 2023-10-02 21:57:35,092 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6767_210_3273_1_1527855783_1718932_173-256601-0 from training. Duration: 0.8 2023-10-02 21:57:35,250 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9266W0001-627677 from training. Duration: 0.896 2023-10-02 21:57:35,261 WARNING [train.py:1204] (2/4) Exclude cut with ID _49643_210_7989_1_1534640244224_3939031_611-185515-0_sp1.1 from training. Duration: 0.8545625 2023-10-02 21:57:37,017 WARNING [train.py:1204] (2/4) Exclude cut with ID _31649_210_6228_1_1532584608063_4091078_687-237771-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 21:57:37,028 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_12868_210_6753_1_1530702479_3610564_148-172861-0_sp0.9 from training. Duration: 0.5555625 2023-10-02 21:57:37,932 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.2.encoder.layers.1.feed_forward2.out_whiten, num_groups=1, num_channels=384, metric=3.83 vs. limit=15.0 2023-10-02 21:57:38,528 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_5129_210_6400_1_1527161005_3579054_107-69738-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 21:57:39,924 WARNING [train.py:1204] (2/4) Exclude cut with ID _40137_210_12210_1_1533290484046_3795180_36-301164-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 21:57:39,957 WARNING [train.py:1204] (2/4) Exclude cut with ID _7537_210_2084_1_1528502395431_3924359_113-199933-0 from training. Duration: 0.59 2023-10-02 21:57:43,167 WARNING [train.py:1204] (2/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_547-5424-0_sp1.1 from training. Duration: 0.8545625 2023-10-02 21:57:43,220 WARNING [train.py:1204] (2/4) Exclude cut with ID _43552_210_8913_1_1533726031059_3523429_147-284024-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 21:57:44,664 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.1.conv_skip_rate, batch_count=1032813.3333333334, ans=0.0 2023-10-02 21:57:46,296 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_294-163793-0_sp1.1 from training. Duration: 0.7454375 2023-10-02 21:57:46,315 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_3780_210_2087_1_1526467760_2130483_170-259044-0_sp1.1 from training. Duration: 0.636375 2023-10-02 21:57:47,813 WARNING [train.py:1204] (2/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_726-270780-0_sp1.1 from training. Duration: 0.7818125 2023-10-02 21:57:49,707 WARNING [train.py:1204] (2/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_214-291369-0_sp1.1 from training. Duration: 0.57275 2023-10-02 21:57:49,750 WARNING [train.py:1204] (2/4) Exclude cut with ID _21881_210_3525_1_1531825242757_3798380_247-102591-0 from training. Duration: 0.98 2023-10-02 21:57:53,929 WARNING [train.py:1204] (2/4) Exclude cut with ID _8064_210_5399_1_1528372299291_4282973_18-174647-0_sp1.1 from training. Duration: 0.82725 2023-10-02 21:57:53,931 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_415-52611-0_sp1.1 from training. Duration: 0.936375 2023-10-02 21:57:53,989 WARNING [train.py:1204] (2/4) Exclude cut with ID _8394_210_6702_1_1528590504316_3540189_379-49457-0_sp0.9 from training. Duration: 0.9666875 2023-10-02 21:57:55,256 WARNING [train.py:1204] (2/4) Exclude cut with ID _64521_210_14699_1_1535243427259_3815328_368-195106-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 21:57:55,325 WARNING [train.py:1204] (2/4) Exclude cut with ID _28394_210_8913_1_1532430049518_3650118_51-334124-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 21:57:56,846 WARNING [train.py:1204] (2/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_317-161769-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 21:57:59,648 WARNING [train.py:1204] (2/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_40-129756-0_sp0.9 from training. Duration: 0.9 2023-10-02 21:58:00,861 WARNING [train.py:1204] (2/4) Exclude cut with ID _3067_210_2667_1_1526212974640_4503168_119-175776-0_sp0.9 from training. Duration: 0.82225 2023-10-02 21:58:02,125 WARNING [train.py:1204] (2/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_1004-304915-0_sp1.1 from training. Duration: 0.92725 2023-10-02 21:58:02,186 WARNING [train.py:1204] (2/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_359-118926-0_sp0.9 from training. Duration: 0.8 2023-10-02 21:58:05,064 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.ff2_skip_rate, batch_count=1032946.6666666666, ans=0.0 2023-10-02 21:58:09,420 WARNING [train.py:1204] (2/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_51-200220-0_sp0.9 from training. Duration: 0.62225 2023-10-02 21:58:09,503 WARNING [train.py:1204] (2/4) Exclude cut with ID _46683_210_9642_1_1533730913492_4591116_530-326202-0_sp1.1 from training. Duration: 0.936375 2023-10-02 21:58:09,736 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.feed_forward2.hidden_balancer.prob, batch_count=1032946.6666666666, ans=0.125 2023-10-02 21:58:10,868 WARNING [train.py:1204] (2/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_567-135114-0 from training. Duration: 0.87 2023-10-02 21:58:10,902 WARNING [train.py:1204] (2/4) Exclude cut with ID _66863_210_18053_1_1535698758334_8590319_1092-271784-0_sp1.1 from training. Duration: 0.87275 2023-10-02 21:58:10,921 WARNING [train.py:1204] (2/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_762-282614-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 21:58:13,993 WARNING [train.py:1204] (2/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_280-120942-0 from training. Duration: 0.77 2023-10-02 21:58:17,198 INFO [train.py:1046] (2/4) Epoch 30, batch 900, loss[loss=0.147, simple_loss=0.232, pruned_loss=0.03101, over 24464.00 frames. ], tot_loss[loss=0.1655, simple_loss=0.2439, pruned_loss=0.04355, over 4656703.26 frames. ], batch size: 63, lr: 3.42e-03, grad_scale: 8.0 2023-10-02 21:58:22,010 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_31583_210_3852_1_1532595851_4024413_27-170931-0_sp1.1 from training. Duration: 0.963625 2023-10-02 21:58:24,825 WARNING [train.py:1204] (2/4) Exclude cut with ID _12369_210_4381_1_1530356078581_4388520_266-271628-0_sp1.1 from training. Duration: 0.92725 2023-10-02 21:58:24,869 WARNING [train.py:1204] (2/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_41-31157-0 from training. Duration: 0.57 2023-10-02 21:58:27,542 WARNING [train.py:1204] (2/4) Exclude cut with ID _21878_210_3525_1_1531738772002_7264330_113-18163-0_sp1.1 from training. Duration: 0.8090625 2023-10-02 21:58:27,576 WARNING [train.py:1204] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_147-111137-0 from training. Duration: 0.95 2023-10-02 21:58:27,709 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.conv_module2.balancer1.prob, batch_count=1033013.3333333334, ans=0.125 2023-10-02 21:58:28,955 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1083-220122-0_sp1.1 from training. Duration: 0.463625 2023-10-02 21:58:30,287 WARNING [train.py:1204] (2/4) Exclude cut with ID _14884_210_2252_1_1530707459689_5561250_591-173406-0_sp1.1 from training. Duration: 0.87275 2023-10-02 21:58:30,293 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_24677_210_1794_1_1532002693_4341296_149-303793-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 21:58:30,344 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_12868_210_6753_1_1530702479_3610564_133-564-0_sp1.1 from training. Duration: 0.7181875 2023-10-02 21:58:31,674 WARNING [train.py:1204] (2/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_625-338546-0_sp0.9 from training. Duration: 0.97775 2023-10-02 21:58:35,966 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.0.conv_skip_rate, batch_count=1033080.0, ans=0.0 2023-10-02 21:58:41,826 WARNING [train.py:1204] (2/4) Exclude cut with ID _25882_210_3036_1_1532775665815_6479510_624-320169-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 21:58:41,840 WARNING [train.py:1204] (2/4) Exclude cut with ID _20622_210_3852_1_1531622427080_1093839_127-325326-0_sp1.1 from training. Duration: 0.92725 2023-10-02 21:58:41,879 WARNING [train.py:1204] (2/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_464-33931-0_sp1.1 from training. Duration: 0.4909375 2023-10-02 21:58:42,120 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.bypass.skip_rate, batch_count=1033080.0, ans=0.04949747468305833 2023-10-02 21:58:44,412 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.4.encoder.layers.2.feed_forward3.out_whiten, num_groups=1, num_channels=384, metric=14.08 vs. limit=15.0 2023-10-02 21:58:45,170 WARNING [train.py:1204] (2/4) Exclude cut with ID _66112_210_10635_1_1535590236305_3801484_120-304588-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 21:58:46,672 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=1033146.6666666666, ans=0.1 2023-10-02 21:58:49,858 WARNING [train.py:1204] (2/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_110-130961-0 from training. Duration: 0.47 2023-10-02 21:58:54,233 WARNING [train.py:1204] (2/4) Exclude cut with ID _16908_210_5724_1_1531119157248_3772700_64-54020-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 21:58:56,210 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.3.encoder.layers.1.self_attn_weights.whiten_keys, num_groups=8, num_channels=256, metric=4.15 vs. limit=6.0 2023-10-02 21:58:58,361 WARNING [train.py:1204] (2/4) Exclude cut with ID _4068_210_2629_1_1526697350395_1546259_224-288693-0_sp1.1 from training. Duration: 0.536375 2023-10-02 21:58:58,417 WARNING [train.py:1204] (2/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_191-41023-0_sp0.9 from training. Duration: 0.788875 2023-10-02 21:58:58,477 WARNING [train.py:1204] (2/4) Exclude cut with ID ID0205W0255-783002 from training. Duration: 0.958 2023-10-02 21:58:59,820 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_162-262717-0 from training. Duration: 0.83 2023-10-02 21:59:05,394 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_224-322394-0_sp1.1 from training. Duration: 0.6 2023-10-02 21:59:05,408 WARNING [train.py:1204] (2/4) Exclude cut with ID _4866_210_2577_1_1527404412284_4364659_133-1696-0_sp1.1 from training. Duration: 0.9 2023-10-02 21:59:06,647 WARNING [train.py:1204] (2/4) Exclude cut with ID _6275_210_2084_1_1528110062213_7669049_973-198085-0_sp1.1 from training. Duration: 0.6818125 2023-10-02 21:59:11,417 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.496e+02 1.769e+02 1.935e+02 2.161e+02 3.184e+02, threshold=3.870e+02, percent-clipped=0.0 2023-10-02 21:59:12,966 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_47449_210_5399_1_1534076445_4006929_410-237806-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 21:59:12,976 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7543_210_6753_1_1528543318_4552_1-85072-0_sp0.9 from training. Duration: 0.92225 2023-10-02 21:59:14,345 WARNING [train.py:1204] (2/4) Exclude cut with ID _65263_210_22356_1_1535763598691_3921037_108-21902-0 from training. Duration: 0.99 2023-10-02 21:59:14,349 WARNING [train.py:1204] (2/4) Exclude cut with ID _65585_210_12210_1_1535450451584_3764098_309-174818-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 21:59:17,526 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_309-65177-0 from training. Duration: 0.88 2023-10-02 21:59:20,178 WARNING [train.py:1204] (2/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_363-185703-0_sp1.1 from training. Duration: 0.863625 2023-10-02 21:59:20,211 WARNING [train.py:1204] (2/4) Exclude cut with ID _42326_210_7770_1_1533814282531_3707269_442-238692-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 21:59:22,010 WARNING [train.py:1204] (2/4) Exclude cut with ID _69884_210_9771_1_1535846392818_4166169_226-254446-0_sp1.1 from training. Duration: 0.936375 2023-10-02 21:59:22,041 WARNING [train.py:1204] (2/4) Exclude cut with ID _29853_210_7497_1_1532505600851_6642110_287-299903-0_sp1.1 from training. Duration: 0.963625 2023-10-02 21:59:25,443 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1015-343884-0 from training. Duration: 0.62 2023-10-02 21:59:25,482 WARNING [train.py:1204] (2/4) Exclude cut with ID ID0305W0008-833402 from training. Duration: 0.91 2023-10-02 21:59:28,334 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6673_210_3273_1_1527767790_3173240_351-167954-0_sp1.1 from training. Duration: 0.436375 2023-10-02 21:59:28,347 WARNING [train.py:1204] (2/4) Exclude cut with ID _6456_210_1534_1_1527681634212_3181049_345-252877-0 from training. Duration: 0.64 2023-10-02 21:59:30,954 INFO [train.py:1046] (2/4) Epoch 30, batch 950, loss[loss=0.181, simple_loss=0.2649, pruned_loss=0.04852, over 24462.00 frames. ], tot_loss[loss=0.166, simple_loss=0.2445, pruned_loss=0.04374, over 4680677.00 frames. ], batch size: 69, lr: 3.42e-03, grad_scale: 8.0 2023-10-02 21:59:31,081 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_13-205182-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 21:59:35,150 WARNING [train.py:1204] (2/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_427-208901-0 from training. Duration: 0.68 2023-10-02 21:59:39,265 WARNING [train.py:1204] (2/4) Exclude cut with ID _49039_210_6315_1_1533967537541_3846118_9-148921-0_sp1.1 from training. Duration: 0.97275 2023-10-02 21:59:41,351 WARNING [train.py:1204] (2/4) Exclude cut with ID _59493_210_17282_1_1535078419077_3225879_143-70317-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 21:59:42,774 WARNING [train.py:1204] (2/4) Exclude cut with ID _27967_210_12140_1_1532588391859_8419130_1056-236369-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 21:59:42,826 WARNING [train.py:1204] (2/4) Exclude cut with ID _6537_210_1883_1_1527987559710_3597129_382-133062-0_sp0.9 from training. Duration: 0.7555625 2023-10-02 21:59:46,068 WARNING [train.py:1204] (2/4) Exclude cut with ID ID0376W0255-869957 from training. Duration: 0.9510625 2023-10-02 21:59:47,600 WARNING [train.py:1204] (2/4) Exclude cut with ID _49371_210_12071_1_1533952761021_3812190_483-240691-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 21:59:48,957 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_12868_210_6753_1_1530701932_530275_18-76322-0_sp1.1 from training. Duration: 0.7909375 2023-10-02 21:59:50,279 WARNING [train.py:1204] (2/4) Exclude cut with ID _53950_210_5816_1_1534417264919_3862951_367-26344-0_sp1.1 from training. Duration: 0.97275 2023-10-02 21:59:50,298 WARNING [train.py:1204] (2/4) Exclude cut with ID _5444_210_2151_1_1527418808464_5312210_634-194174-0_sp1.1 from training. Duration: 0.8818125 2023-10-02 21:59:50,317 WARNING [train.py:1204] (2/4) Exclude cut with ID _56494_210_4220_1_1535284837625_3033169_69-138150-0 from training. Duration: 0.94 2023-10-02 21:59:52,114 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7543_210_6753_1_1528544010_1110402_25-183860-0_sp0.9 from training. Duration: 0.52225 2023-10-02 21:59:53,452 WARNING [train.py:1204] (2/4) Exclude cut with ID _40284_210_2084_1_1533513679814_3704279_287-191918-0_sp1.1 from training. Duration: 0.963625 2023-10-02 21:59:54,958 WARNING [train.py:1204] (2/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_291-270881-0 from training. Duration: 0.97 2023-10-02 21:59:56,877 WARNING [train.py:1204] (2/4) Exclude cut with ID _12555_210_8750_1_1530075594402_3780239_449-267986-0_sp0.9 from training. Duration: 0.92225 2023-10-02 21:59:58,367 WARNING [train.py:1204] (2/4) Exclude cut with ID _3992_210_1747_1_1526644816031_3731060_268-64520-0_sp1.1 from training. Duration: 0.963625 2023-10-02 21:59:58,374 WARNING [train.py:1204] (2/4) Exclude cut with ID _39411_210_5986_1_1533538825030_3343649_173-238248-0_sp0.9 from training. Duration: 0.92225 2023-10-02 21:59:58,396 WARNING [train.py:1204] (2/4) Exclude cut with ID _32704_210_8954_1_1533017488840_6747370_828-313097-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 21:59:58,500 WARNING [train.py:1204] (2/4) Exclude cut with ID _26907_210_7234_1_1532221740694_3205263_337-2494-0 from training. Duration: 0.88 2023-10-02 21:59:59,920 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_229-45300-0_sp1.1 from training. Duration: 0.5181875 2023-10-02 22:00:00,911 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.0.layers.0.feed_forward2.out_whiten, num_groups=1, num_channels=192, metric=5.23 vs. limit=15.0 2023-10-02 22:00:01,308 WARNING [train.py:1204] (2/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_300-27235-0_sp1.1 from training. Duration: 0.7909375 2023-10-02 22:00:02,759 WARNING [train.py:1204] (2/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_48-338976-0_sp1.1 from training. Duration: 0.8090625 2023-10-02 22:00:07,983 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_13091_210_7925_1_1530609447_8987354_792-195353-0_sp1.1 from training. Duration: 0.936375 2023-10-02 22:00:07,989 WARNING [train.py:1204] (2/4) Exclude cut with ID _56524_210_9642_1_1534572011779_3619170_311-54500-0_sp1.1 from training. Duration: 0.97275 2023-10-02 22:00:09,626 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.1.feed_forward3.hidden_balancer.prob, batch_count=1033480.0, ans=0.125 2023-10-02 22:00:12,002 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7543_210_6753_1_1528544010_1110402_25-183860-0 from training. Duration: 0.47 2023-10-02 22:00:14,629 WARNING [train.py:1204] (2/4) Exclude cut with ID 2411-132532-0017-82279-0_sp1.1 from training. Duration: 0.9681875 2023-10-02 22:00:14,630 WARNING [train.py:1204] (2/4) Exclude cut with ID _14170_210_5219_1_1530525949868_4469650_272-249985-0_sp0.9 from training. Duration: 0.7444375 2023-10-02 22:00:14,690 WARNING [train.py:1204] (2/4) Exclude cut with ID _55644_210_11856_1_1534593599582_4103719_320-229006-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 22:00:16,116 WARNING [train.py:1204] (2/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_177-100554-0_sp1.1 from training. Duration: 0.963625 2023-10-02 22:00:16,118 WARNING [train.py:1204] (2/4) Exclude cut with ID _14998_210_5219_1_1530705582080_7474970_787-294416-0_sp1.1 from training. Duration: 0.7454375 2023-10-02 22:00:22,072 WARNING [train.py:1204] (2/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_318-59097-0 from training. Duration: 0.76 2023-10-02 22:00:23,401 WARNING [train.py:1204] (2/4) Exclude cut with ID _12369_210_4381_1_1530356078581_4388520_480-13146-0_sp1.1 from training. Duration: 0.77275 2023-10-02 22:00:26,659 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_469-338558-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 22:00:26,718 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_40896_210_15710_1_1533433956_7100802_367-126897-0_sp1.1 from training. Duration: 0.963625 2023-10-02 22:00:27,993 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6545_210_2040_1_1527774670_4245258_379-14182-0 from training. Duration: 0.8 2023-10-02 22:00:28,005 WARNING [train.py:1204] (2/4) Exclude cut with ID _21631_210_2253_1_1531907880778_3160339_197-79498-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 22:00:28,009 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_12868_210_6753_1_1530702479_3610564_416-293746-0_sp1.1 from training. Duration: 0.8181875 2023-10-02 22:00:28,043 WARNING [train.py:1204] (2/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_52-211683-0 from training. Duration: 0.9 2023-10-02 22:00:32,233 WARNING [train.py:1204] (2/4) Exclude cut with ID _48678_210_5663_1_1533988830947_3769199_306-255886-0_sp0.9 from training. Duration: 0.8555625 2023-10-02 22:00:33,644 WARNING [train.py:1204] (2/4) Exclude cut with ID _65134_210_14763_1_1535335393985_3505368_296-24733-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 22:00:39,066 WARNING [train.py:1204] (2/4) Exclude cut with ID _14898_210_9206_1_1530766812377_4177190_335-99364-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 22:00:39,151 WARNING [train.py:1204] (2/4) Exclude cut with ID _54871_210_22356_1_1534665798253_3856359_19-342632-0 from training. Duration: 0.91 2023-10-02 22:00:40,441 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_53071_210_16554_1_1534490405_2954066_265-170472-0 from training. Duration: 0.83 2023-10-02 22:00:43,308 WARNING [train.py:1204] (2/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_189-298172-0_sp1.1 from training. Duration: 0.963625 2023-10-02 22:00:45,198 INFO [train.py:1046] (2/4) Epoch 30, batch 1000, loss[loss=0.1625, simple_loss=0.2532, pruned_loss=0.03585, over 24656.00 frames. ], tot_loss[loss=0.1651, simple_loss=0.2434, pruned_loss=0.04341, over 4691260.84 frames. ], batch size: 73, lr: 3.42e-03, grad_scale: 8.0 2023-10-02 22:00:47,294 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.attention_skip_rate, batch_count=1033680.0, ans=0.0 2023-10-02 22:00:48,425 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7615_210_4211_1_1528110965_4580160_72-235211-0 from training. Duration: 0.76 2023-10-02 22:00:48,482 WARNING [train.py:1204] (2/4) Exclude cut with ID _47790_210_16187_1_1533862814066_4207519_155-90874-0_sp1.1 from training. Duration: 0.936375 2023-10-02 22:00:53,644 WARNING [train.py:1204] (2/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_164-216310-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 22:00:55,001 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_46013_210_7234_1_1533776362_3744032_463-52378-0 from training. Duration: 0.86 2023-10-02 22:00:55,005 WARNING [train.py:1204] (2/4) Exclude cut with ID _61133_210_9969_1_1535196777227_3696409_333-270325-0 from training. Duration: 0.93 2023-10-02 22:00:59,349 WARNING [train.py:1204] (2/4) Exclude cut with ID _5172_210_1883_1_1527246062567_3577219_18-101380-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 22:01:00,576 WARNING [train.py:1204] (2/4) Exclude cut with ID _30800_210_22382_1_1532499347904_4515959_577-236858-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 22:01:01,945 WARNING [train.py:1204] (2/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_1006-177345-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 22:01:03,488 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6291_210_3273_1_1527683649_1078498_4-266348-0 from training. Duration: 0.78 2023-10-02 22:01:06,351 WARNING [train.py:1204] (2/4) Exclude cut with ID _55857_210_2346_1_1534644007831_6898209_745-98391-0 from training. Duration: 0.76 2023-10-02 22:01:07,781 WARNING [train.py:1204] (2/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_375-244976-0 from training. Duration: 0.75 2023-10-02 22:01:07,817 WARNING [train.py:1204] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_4-248404-0_sp1.1 from training. Duration: 0.97275 2023-10-02 22:01:09,293 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8453_210_4211_1_1528502940_8562730_596-306820-0 from training. Duration: 0.97 2023-10-02 22:01:10,603 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_211-339622-0_sp0.9 from training. Duration: 0.5 2023-10-02 22:01:11,952 WARNING [train.py:1204] (2/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_393-57888-0 from training. Duration: 0.64 2023-10-02 22:01:13,356 WARNING [train.py:1204] (2/4) Exclude cut with ID _23825_210_13449_1_1531915313526_3497150_143-343575-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 22:01:14,820 WARNING [train.py:1204] (2/4) Exclude cut with ID _58605_210_12210_1_1534723195939_3904259_36-12256-0_sp1.1 from training. Duration: 0.936375 2023-10-02 22:01:24,953 WARNING [train.py:1204] (2/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_38-67795-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 22:01:26,315 WARNING [train.py:1204] (2/4) Exclude cut with ID _64631_210_27767_1_1535349580068_4165160_308-327832-0_sp1.1 from training. Duration: 0.92725 2023-10-02 22:01:26,367 WARNING [train.py:1204] (2/4) Exclude cut with ID _37086_210_9038_1_1532999540891_7021250_726-81176-0_sp1.1 from training. Duration: 0.936375 2023-10-02 22:01:27,781 WARNING [train.py:1204] (2/4) Exclude cut with ID _47790_210_16187_1_1533862814066_4207519_47-141521-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 22:01:27,794 WARNING [train.py:1204] (2/4) Exclude cut with ID _3067_210_2667_1_1526212974640_4503168_250-285937-0 from training. Duration: 0.8 2023-10-02 22:01:27,826 WARNING [train.py:1204] (2/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_239-24000-0_sp1.1 from training. Duration: 0.97275 2023-10-02 22:01:29,137 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_3371_210_1794_1_1526384462_5580777_396-339812-0_sp0.9 from training. Duration: 0.9444375 2023-10-02 22:01:30,453 WARNING [train.py:1204] (2/4) Exclude cut with ID _16909_210_5724_1_1531292079912_4118919_580-173454-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 22:01:30,731 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.conv_module2.balancer1.prob, batch_count=1033880.0, ans=0.125 2023-10-02 22:01:31,768 WARNING [train.py:1204] (2/4) Exclude cut with ID IC0124W0002-61308 from training. Duration: 0.982 2023-10-02 22:01:33,237 WARNING [train.py:1204] (2/4) Exclude cut with ID _31602_210_5854_1_1532677021456_4458250_249-96604-0 from training. Duration: 0.89 2023-10-02 22:01:36,011 WARNING [train.py:1204] (2/4) Exclude cut with ID _69584_210_5219_1_1535785133775_4044450_325-7656-0 from training. Duration: 0.86 2023-10-02 22:01:36,133 WARNING [train.py:1204] (2/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_478-162247-0 from training. Duration: 0.82 2023-10-02 22:01:38,825 WARNING [train.py:1204] (2/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_686-54383-0_sp1.1 from training. Duration: 0.7909375 2023-10-02 22:01:40,104 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.472e+02 1.798e+02 1.994e+02 2.237e+02 2.934e+02, threshold=3.988e+02, percent-clipped=0.0 2023-10-02 22:01:44,503 WARNING [train.py:1204] (2/4) Exclude cut with ID _29303_210_2575_1_1532392237303_7410649_85-270092-0_sp1.1 from training. Duration: 0.936375 2023-10-02 22:01:44,521 WARNING [train.py:1204] (2/4) Exclude cut with ID _2322_210_2667_1_1525604548971_3656299_440-80494-0_sp1.1 from training. Duration: 0.8 2023-10-02 22:01:45,801 WARNING [train.py:1204] (2/4) Exclude cut with ID _47346_210_9476_1_1533815971553_3536160_201-40644-0_sp1.1 from training. Duration: 0.936375 2023-10-02 22:01:45,904 WARNING [train.py:1204] (2/4) Exclude cut with ID _56235_210_10640_1_1534557335235_4161099_343-139898-0_sp1.1 from training. Duration: 0.87275 2023-10-02 22:01:49,134 WARNING [train.py:1204] (2/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_1012-279473-0 from training. Duration: 0.67 2023-10-02 22:01:50,508 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7931_210_1794_1_1528594728_5564059_280-279560-0_sp1.1 from training. Duration: 0.8909375 2023-10-02 22:01:52,454 WARNING [train.py:1204] (2/4) Exclude cut with ID _8263_210_1664_1_1528439452891_970220_68-181805-0 from training. Duration: 0.64 2023-10-02 22:01:52,506 WARNING [train.py:1204] (2/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_605-133395-0 from training. Duration: 0.66 2023-10-02 22:01:54,304 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_43-171109-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 22:01:54,308 WARNING [train.py:1204] (2/4) Exclude cut with ID _40094_210_16380_1_1533294005773_3641159_107-280739-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 22:01:56,347 WARNING [train.py:1204] (2/4) Exclude cut with ID _14821_210_8750_1_1530680503991_6829359_2-227151-0_sp0.9 from training. Duration: 0.92225 2023-10-02 22:01:59,081 WARNING [train.py:1204] (2/4) Exclude cut with ID _14268_210_9206_1_1530622771285_4714099_331-105351-0_sp1.1 from training. Duration: 0.5909375 2023-10-02 22:02:00,383 INFO [train.py:1046] (2/4) Epoch 30, batch 1050, loss[loss=0.1629, simple_loss=0.2288, pruned_loss=0.04852, over 23748.00 frames. ], tot_loss[loss=0.1639, simple_loss=0.2414, pruned_loss=0.04317, over 4702317.23 frames. ], batch size: 164, lr: 3.42e-03, grad_scale: 8.0 2023-10-02 22:02:00,435 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_26326_210_7925_1_1533002493_3592406_451-82741-0_sp1.1 from training. Duration: 0.97275 2023-10-02 22:02:03,203 WARNING [train.py:1204] (2/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_191-59784-0_sp0.9 from training. Duration: 0.97775 2023-10-02 22:02:03,307 WARNING [train.py:1204] (2/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_445-117398-0_sp1.1 from training. Duration: 0.8090625 2023-10-02 22:02:04,773 WARNING [train.py:1204] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_605-257205-0_sp0.9 from training. Duration: 0.6555625 2023-10-02 22:02:06,154 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_50050_210_16223_1_1535435796_7290138_592-258841-0_sp1.1 from training. Duration: 0.936375 2023-10-02 22:02:07,605 WARNING [train.py:1204] (2/4) Exclude cut with ID _14348_210_8341_1_1530599434996_3778640_240-232370-0_sp0.9 from training. Duration: 0.9333125 2023-10-02 22:02:10,291 WARNING [train.py:1204] (2/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_239-316802-0_sp0.9 from training. Duration: 0.7666875 2023-10-02 22:02:11,626 WARNING [train.py:1204] (2/4) Exclude cut with ID _45560_210_21982_1_1533812603951_3584999_111-6376-0_sp1.1 from training. Duration: 0.763625 2023-10-02 22:02:13,029 WARNING [train.py:1204] (2/4) Exclude cut with ID _54189_210_16380_1_1534332670039_3911710_606-311481-0_sp1.1 from training. Duration: 0.963625 2023-10-02 22:02:14,400 WARNING [train.py:1204] (2/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_237-332027-0_sp1.1 from training. Duration: 0.863625 2023-10-02 22:02:14,434 WARNING [train.py:1204] (2/4) Exclude cut with ID _66070_210_7640_1_1535441393096_7805310_489-292631-0_sp0.9 from training. Duration: 0.87775 2023-10-02 22:02:15,762 WARNING [train.py:1204] (2/4) Exclude cut with ID _49728_210_12651_1_1534041034294_6788150_624-195301-0_sp1.1 from training. Duration: 0.77275 2023-10-02 22:02:17,046 WARNING [train.py:1204] (2/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_44-79427-0 from training. Duration: 0.98 2023-10-02 22:02:17,116 WARNING [train.py:1204] (2/4) Exclude cut with ID _35350_210_16040_1_1533040191183_4075299_490-198354-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 22:02:17,156 WARNING [train.py:1204] (2/4) Exclude cut with ID _5166_210_2084_1_1527073197160_4087993_309-222545-0 from training. Duration: 0.84 2023-10-02 22:02:20,941 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_13091_210_7925_1_1530609447_8987354_790-199469-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 22:02:20,955 WARNING [train.py:1204] (2/4) Exclude cut with ID _21632_210_2253_1_1531994001765_3744140_110-274654-0 from training. Duration: 0.82 2023-10-02 22:02:20,961 WARNING [train.py:1204] (2/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_498-40153-0_sp0.9 from training. Duration: 0.7 2023-10-02 22:02:28,330 WARNING [train.py:1204] (2/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_363-282909-0_sp1.1 from training. Duration: 0.936375 2023-10-02 22:02:28,381 WARNING [train.py:1204] (2/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_178-177582-0_sp0.9 from training. Duration: 0.82225 2023-10-02 22:02:28,391 WARNING [train.py:1204] (2/4) Exclude cut with ID _24612_210_8913_1_1531998054193_3806359_357-35487-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 22:02:31,203 WARNING [train.py:1204] (2/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_138-332773-0 from training. Duration: 0.81 2023-10-02 22:02:31,221 WARNING [train.py:1204] (2/4) Exclude cut with ID _7624_210_1531_1_1528109802198_4933101_418-349834-0 from training. Duration: 0.6 2023-10-02 22:02:32,522 WARNING [train.py:1204] (2/4) Exclude cut with ID _7537_210_2084_1_1528502395431_3924359_14-312906-0_sp0.9 from training. Duration: 0.9333125 2023-10-02 22:02:33,991 WARNING [train.py:1204] (2/4) Exclude cut with ID _60057_210_10640_1_1534903065793_3333150_362-230377-0 from training. Duration: 0.87 2023-10-02 22:02:36,768 WARNING [train.py:1204] (2/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_138-64210-0 from training. Duration: 0.73 2023-10-02 22:02:38,115 WARNING [train.py:1204] (2/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_454-339504-0_sp1.1 from training. Duration: 0.97275 2023-10-02 22:02:40,767 WARNING [train.py:1204] (2/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_508-335369-0_sp0.9 from training. Duration: 0.5333125 2023-10-02 22:02:42,193 WARNING [train.py:1204] (2/4) Exclude cut with ID _5608_210_2106_1_1527415439089_6819768_909-82767-0_sp0.9 from training. Duration: 0.67775 2023-10-02 22:02:42,222 WARNING [train.py:1204] (2/4) Exclude cut with ID _6694_210_4599_1_1527852637490_3965529_99-11611-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 22:02:43,574 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_2655_210_2087_1_1525688959_3897400_52-279899-0_sp1.1 from training. Duration: 0.62725 2023-10-02 22:02:47,023 WARNING [train.py:1204] (2/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_375-245677-0_sp1.1 from training. Duration: 0.736375 2023-10-02 22:02:53,571 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_23850_210_4238_1_1532232597_1632433_32-185135-0 from training. Duration: 0.86 2023-10-02 22:02:54,992 WARNING [train.py:1204] (2/4) Exclude cut with ID _15113_210_8254_1_1530842438579_3716279_209-54533-0 from training. Duration: 0.96 2023-10-02 22:02:56,359 WARNING [train.py:1204] (2/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_401-53423-0 from training. Duration: 0.55 2023-10-02 22:02:56,386 WARNING [train.py:1204] (2/4) Exclude cut with ID _14867_210_1968_1_1530684179890_3846969_319-250868-0_sp1.1 from training. Duration: 0.9 2023-10-02 22:02:56,408 WARNING [train.py:1204] (2/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_106-30782-0_sp0.9 from training. Duration: 0.888875 2023-10-02 22:02:57,772 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_5960_210_1794_1_1527403865_3990999_515-2613-0 from training. Duration: 0.88 2023-10-02 22:03:00,752 WARNING [train.py:1204] (2/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_798-106179-0_sp0.9 from training. Duration: 0.92225 2023-10-02 22:03:02,166 WARNING [train.py:1204] (2/4) Exclude cut with ID _14227_210_4381_1_1530701804286_3940259_125-275642-0_sp1.1 from training. Duration: 0.9 2023-10-02 22:03:02,175 WARNING [train.py:1204] (2/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_79-200392-0_sp1.1 from training. Duration: 0.8818125 2023-10-02 22:03:03,552 WARNING [train.py:1204] (2/4) Exclude cut with ID _48836_210_12210_1_1534075848293_3904150_429-35475-0_sp0.9 from training. Duration: 0.988875 2023-10-02 22:03:03,574 WARNING [train.py:1204] (2/4) Exclude cut with ID _69884_210_9771_1_1535846392818_4166169_224-172517-0_sp1.1 from training. Duration: 0.97275 2023-10-02 22:03:05,177 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.attention_skip_rate, batch_count=1034280.0, ans=0.0 2023-10-02 22:03:07,672 WARNING [train.py:1204] (2/4) Exclude cut with ID _15113_210_8254_1_1530842438579_3716279_76-232507-0_sp1.1 from training. Duration: 0.97275 2023-10-02 22:03:07,674 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_135-5501-0 from training. Duration: 0.98 2023-10-02 22:03:09,050 WARNING [train.py:1204] (2/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_563-347832-0_sp0.9 from training. Duration: 0.988875 2023-10-02 22:03:09,068 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6786_210_1388_1_1527839699_4194495_273-149841-0 from training. Duration: 0.92 2023-10-02 22:03:09,086 WARNING [train.py:1204] (2/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_172-215932-0 from training. Duration: 0.66 2023-10-02 22:03:09,233 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.0.conv_skip_rate, batch_count=1034280.0, ans=0.0 2023-10-02 22:03:09,313 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.ff3_skip_rate, batch_count=1034280.0, ans=0.0 2023-10-02 22:03:10,455 WARNING [train.py:1204] (2/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_939-132910-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 22:03:13,107 INFO [train.py:1046] (2/4) Epoch 30, batch 1100, loss[loss=0.1584, simple_loss=0.2386, pruned_loss=0.03909, over 24483.00 frames. ], tot_loss[loss=0.1635, simple_loss=0.241, pruned_loss=0.04301, over 4710973.52 frames. ], batch size: 63, lr: 3.42e-03, grad_scale: 8.0 2023-10-02 22:03:13,228 WARNING [train.py:1204] (2/4) Exclude cut with ID _49371_210_12071_1_1533952761021_3812190_16-246001-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 22:03:19,199 WARNING [train.py:1204] (2/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_680-189165-0_sp1.1 from training. Duration: 0.936375 2023-10-02 22:03:19,511 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=1034346.6666666666, ans=0.1 2023-10-02 22:03:23,841 WARNING [train.py:1204] (2/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_53-300544-0_sp1.1 from training. Duration: 0.6090625 2023-10-02 22:03:25,692 WARNING [train.py:1204] (2/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_540-275685-0_sp1.1 from training. Duration: 0.8090625 2023-10-02 22:03:26,981 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_38965_210_4852_1_1533174906_3484735_453-106579-0_sp1.1 from training. Duration: 0.92725 2023-10-02 22:03:27,026 WARNING [train.py:1204] (2/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_105-328174-0 from training. Duration: 0.76 2023-10-02 22:03:29,717 WARNING [train.py:1204] (2/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_279-126205-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 22:03:31,228 WARNING [train.py:1204] (2/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_5-55893-0_sp0.9 from training. Duration: 0.611125 2023-10-02 22:03:32,663 WARNING [train.py:1204] (2/4) Exclude cut with ID _13678_210_7497_1_1530530069226_3463170_141-10835-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 22:03:35,424 WARNING [train.py:1204] (2/4) Exclude cut with ID _22083_210_8254_1_1531702862439_3678970_201-85738-0_sp1.1 from training. Duration: 0.8181875 2023-10-02 22:03:36,727 WARNING [train.py:1204] (2/4) Exclude cut with ID _4697_210_2069_1_1526815801953_3823098_51-1049-0 from training. Duration: 0.75 2023-10-02 22:03:38,085 WARNING [train.py:1204] (2/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_206-10097-0_sp0.9 from training. Duration: 0.5444375 2023-10-02 22:03:38,179 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_4528_210_3273_1_1527076654_3276777_360-63661-0_sp1.1 from training. Duration: 0.92725 2023-10-02 22:03:39,484 WARNING [train.py:1204] (2/4) Exclude cut with ID _64086_210_7994_1_1535270417019_7435949_449-31567-0_sp1.1 from training. Duration: 0.8818125 2023-10-02 22:03:40,931 WARNING [train.py:1204] (2/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_245-174553-0_sp0.9 from training. Duration: 0.97775 2023-10-02 22:03:43,711 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6545_210_2040_1_1527774670_4245258_379-14182-0_sp1.1 from training. Duration: 0.72725 2023-10-02 22:03:47,803 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_3133_210_2087_1_1526106446_2324690_256-206070-0_sp1.1 from training. Duration: 0.963625 2023-10-02 22:03:48,797 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.out_combiner.scale_min, batch_count=1034480.0, ans=0.2 2023-10-02 22:03:51,080 WARNING [train.py:1204] (2/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_278-104676-0 from training. Duration: 0.98 2023-10-02 22:03:52,409 WARNING [train.py:1204] (2/4) Exclude cut with ID IC0671W0065-331908 from training. Duration: 0.896 2023-10-02 22:03:52,450 WARNING [train.py:1204] (2/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_244-129917-0_sp1.1 from training. Duration: 0.97275 2023-10-02 22:03:53,491 INFO [scaling.py:1022] (2/4) Whitening: name=encoder_embed.convnext.out_whiten, num_groups=1, num_channels=128, metric=4.19 vs. limit=5.0 2023-10-02 22:03:55,720 WARNING [train.py:1204] (2/4) Exclude cut with ID _7852_210_5219_1_1528286471316_8139400_1129-170857-0_sp1.1 from training. Duration: 0.97275 2023-10-02 22:03:57,762 WARNING [train.py:1204] (2/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_443-30034-0_sp0.9 from training. Duration: 0.711125 2023-10-02 22:03:57,791 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_31583_210_3852_1_1532595851_4024413_250-332999-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 22:03:57,903 WARNING [train.py:1204] (2/4) Exclude cut with ID _8772_210_1850_1_1528635595972_3664011_247-336448-0 from training. Duration: 0.74 2023-10-02 22:03:59,233 WARNING [train.py:1204] (2/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_567-135114-0_sp0.9 from training. Duration: 0.9666875 2023-10-02 22:04:00,505 WARNING [train.py:1204] (2/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_557-349534-0_sp1.1 from training. Duration: 0.82725 2023-10-02 22:04:00,517 WARNING [train.py:1204] (2/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_1341-26728-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 22:04:00,569 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_40222_210_6228_1_1533385298_4481939_375-328051-0_sp1.1 from training. Duration: 0.97275 2023-10-02 22:04:00,601 WARNING [train.py:1204] (2/4) Exclude cut with ID _4697_210_2069_1_1526815801953_3823098_276-96593-0 from training. Duration: 0.77 2023-10-02 22:04:06,074 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_53071_210_16554_1_1534490405_2954066_265-170472-0_sp0.9 from training. Duration: 0.92225 2023-10-02 22:04:07,379 WARNING [train.py:1204] (2/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_365-1492-0 from training. Duration: 0.91 2023-10-02 22:04:08,725 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.568e+02 1.832e+02 2.001e+02 2.230e+02 3.107e+02, threshold=4.001e+02, percent-clipped=0.0 2023-10-02 22:04:08,817 WARNING [train.py:1204] (2/4) Exclude cut with ID _66873_210_5517_1_1535454136506_4293370_654-52713-0_sp0.9 from training. Duration: 0.8555625 2023-10-02 22:04:12,951 WARNING [train.py:1204] (2/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_113-92608-0_sp1.1 from training. Duration: 0.4909375 2023-10-02 22:04:15,747 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_46013_210_7234_1_1533776362_3744032_50-141535-0 from training. Duration: 0.99 2023-10-02 22:04:15,755 WARNING [train.py:1204] (2/4) Exclude cut with ID _13942_210_9988_1_1530604906036_3733360_499-55676-0_sp1.1 from training. Duration: 0.563625 2023-10-02 22:04:18,489 WARNING [train.py:1204] (2/4) Exclude cut with ID _30800_210_22382_1_1532499347904_4515959_375-65143-0_sp1.1 from training. Duration: 0.97275 2023-10-02 22:04:20,446 WARNING [train.py:1204] (2/4) Exclude cut with ID _16756_210_4915_1_1531047803763_7104259_199-325608-0_sp1.1 from training. Duration: 0.92725 2023-10-02 22:04:20,486 WARNING [train.py:1204] (2/4) Exclude cut with ID _31649_210_6228_1_1532584608063_4091078_443-124391-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 22:04:21,154 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.3.encoder.layers.2.feed_forward1.out_whiten, num_groups=1, num_channels=512, metric=8.25 vs. limit=15.0 2023-10-02 22:04:21,523 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.0.layers.0.feed_forward1.out_whiten, num_groups=1, num_channels=192, metric=10.20 vs. limit=15.0 2023-10-02 22:04:21,891 WARNING [train.py:1204] (2/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_146-218763-0 from training. Duration: 0.86 2023-10-02 22:04:23,544 WARNING [train.py:1204] (2/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_567-135114-0_sp1.1 from training. Duration: 0.7909375 2023-10-02 22:04:23,588 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_26326_210_7925_1_1533002493_3592406_462-288299-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 22:04:24,995 WARNING [train.py:1204] (2/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_322-217220-0 from training. Duration: 0.86 2023-10-02 22:04:25,032 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_238-256668-0_sp1.1 from training. Duration: 0.62725 2023-10-02 22:04:26,785 INFO [train.py:1046] (2/4) Epoch 30, batch 1150, loss[loss=0.1383, simple_loss=0.2248, pruned_loss=0.02589, over 24347.00 frames. ], tot_loss[loss=0.1642, simple_loss=0.2417, pruned_loss=0.04333, over 4711369.00 frames. ], batch size: 61, lr: 3.42e-03, grad_scale: 4.0 2023-10-02 22:04:26,864 WARNING [train.py:1204] (2/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_371-18169-0 from training. Duration: 0.86 2023-10-02 22:04:28,187 WARNING [train.py:1204] (2/4) Exclude cut with ID _58480_210_12392_1_1534811121111_3646160_23-74512-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 22:04:28,198 WARNING [train.py:1204] (2/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_784-213099-0_sp1.1 from training. Duration: 0.8454375 2023-10-02 22:04:29,512 WARNING [train.py:1204] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_625-102955-0_sp1.1 from training. Duration: 0.7 2023-10-02 22:04:34,331 WARNING [train.py:1204] (2/4) Exclude cut with ID _49748_210_24116_1_1533986971737_3158910_176-174327-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 22:04:34,530 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.out_combiner.scale_min, batch_count=1034680.0, ans=0.2 2023-10-02 22:04:35,740 WARNING [train.py:1204] (2/4) Exclude cut with ID _50310_210_15488_1_1534068043345_3641080_432-96140-0_sp1.1 from training. Duration: 0.936375 2023-10-02 22:04:37,156 WARNING [train.py:1204] (2/4) Exclude cut with ID _40402_210_18219_1_1533522324115_2953150_76-14910-0_sp1.1 from training. Duration: 0.92725 2023-10-02 22:04:37,186 WARNING [train.py:1204] (2/4) Exclude cut with ID _21783_210_11357_1_1531742458730_3762679_40-19458-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 22:04:38,575 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_14056_210_4163_1_1530604483_7572827_46-93547-0 from training. Duration: 0.95 2023-10-02 22:04:39,859 WARNING [train.py:1204] (2/4) Exclude cut with ID _21631_210_2253_1_1531907880778_3160339_321-35325-0_sp1.1 from training. Duration: 0.963625 2023-10-02 22:04:41,379 WARNING [train.py:1204] (2/4) Exclude cut with ID _4779_210_2084_1_1527318022944_3914893_500-285013-0 from training. Duration: 0.69 2023-10-02 22:04:42,718 WARNING [train.py:1204] (2/4) Exclude cut with ID _43552_210_8913_1_1533726031059_3523429_301-93324-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 22:04:42,724 WARNING [train.py:1204] (2/4) Exclude cut with ID _6473_210_1741_1_1527901307396_3815380_108-17335-0_sp1.1 from training. Duration: 0.5909375 2023-10-02 22:04:42,923 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.0.balancer2.prob, batch_count=1034746.6666666666, ans=0.125 2023-10-02 22:04:48,276 WARNING [train.py:1204] (2/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_206-10097-0 from training. Duration: 0.49 2023-10-02 22:04:49,756 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_36756_210_7234_1_1533026991_966276_103-77513-0_sp1.1 from training. Duration: 0.97275 2023-10-02 22:04:54,294 WARNING [train.py:1204] (2/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_662-18193-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 22:04:54,347 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_26433_210_12108_1_1532157158_4056480_439-52438-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 22:04:55,703 WARNING [train.py:1204] (2/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_464-33931-0 from training. Duration: 0.54 2023-10-02 22:04:55,709 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_3474_210_2028_1_1526188513_4825246_145-309131-0_sp0.9 from training. Duration: 0.82225 2023-10-02 22:04:55,720 WARNING [train.py:1204] (2/4) Exclude cut with ID _9679_210_7497_1_1529129212906_4363929_446-308321-0_sp1.1 from training. Duration: 0.963625 2023-10-02 22:04:59,082 WARNING [train.py:1204] (2/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_662-275621-0 from training. Duration: 0.42 2023-10-02 22:05:01,047 WARNING [train.py:1204] (2/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_344-337148-0_sp1.1 from training. Duration: 0.97275 2023-10-02 22:05:02,447 WARNING [train.py:1204] (2/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_759-179071-0_sp1.1 from training. Duration: 0.92725 2023-10-02 22:05:02,593 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.ff2_skip_rate, batch_count=1034813.3333333334, ans=0.0 2023-10-02 22:05:08,509 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.5.encoder.layers.1.feed_forward1.out_whiten, num_groups=1, num_channels=256, metric=8.63 vs. limit=15.0 2023-10-02 22:05:12,084 WARNING [train.py:1204] (2/4) Exclude cut with ID _28783_210_7943_1_1532671164808_7155479_293-12188-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 22:05:16,266 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_17613_210_2151_1_1531137569_2366160_235-320304-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 22:05:17,621 WARNING [train.py:1204] (2/4) Exclude cut with ID _14349_210_8341_1_1530615710818_3442470_170-336541-0 from training. Duration: 0.86 2023-10-02 22:05:17,681 WARNING [train.py:1204] (2/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_375-218677-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 22:05:19,089 WARNING [train.py:1204] (2/4) Exclude cut with ID _28317_210_12742_1_1532480302381_8510325_829-202956-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 22:05:19,359 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=1034880.0, ans=0.1 2023-10-02 22:05:23,554 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9029W0436-509823 from training. Duration: 0.768 2023-10-02 22:05:24,914 WARNING [train.py:1204] (2/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_231-112214-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 22:05:33,445 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9059W0470-524883 from training. Duration: 0.3415625 2023-10-02 22:05:36,367 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_43266_210_1794_1_1533521789_4497218_206-79512-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 22:05:37,712 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_294-163793-0_sp0.9 from training. Duration: 0.911125 2023-10-02 22:05:37,744 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6290_210_3273_1_1527595224_1282787_115-87095-0_sp1.1 from training. Duration: 0.5 2023-10-02 22:05:39,089 WARNING [train.py:1204] (2/4) Exclude cut with ID _14100_210_8341_1_1530529320613_3678069_279-55760-0_sp1.1 from training. Duration: 0.8454375 2023-10-02 22:05:40,413 INFO [train.py:1046] (2/4) Epoch 30, batch 1200, loss[loss=0.2087, simple_loss=0.2748, pruned_loss=0.0713, over 19468.00 frames. ], tot_loss[loss=0.165, simple_loss=0.2429, pruned_loss=0.04353, over 4717172.68 frames. ], batch size: 388, lr: 3.42e-03, grad_scale: 8.0 2023-10-02 22:05:40,584 WARNING [train.py:1204] (2/4) Exclude cut with ID _14621_210_1925_1_1530686901185_3675420_419-234200-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 22:05:44,746 WARNING [train.py:1204] (2/4) Exclude cut with ID _60229_210_27767_1_1534838406158_3655350_353-213656-0_sp1.1 from training. Duration: 0.863625 2023-10-02 22:05:44,754 WARNING [train.py:1204] (2/4) Exclude cut with ID _48678_210_5663_1_1533988830947_3769199_306-255886-0_sp1.1 from training. Duration: 0.7 2023-10-02 22:05:46,175 WARNING [train.py:1204] (2/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_466-3492-0_sp1.1 from training. Duration: 0.97275 2023-10-02 22:05:46,175 WARNING [train.py:1204] (2/4) Exclude cut with ID _8755_210_1876_1_1528628559357_3258209_199-242167-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 22:05:46,243 WARNING [train.py:1204] (2/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_237-89684-0_sp1.1 from training. Duration: 0.87275 2023-10-02 22:05:47,488 WARNING [train.py:1204] (2/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_70-84121-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 22:05:48,300 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.2.encoder.layers.0.nonlin_attention.whiten1, num_groups=1, num_channels=288, metric=8.71 vs. limit=10.0 2023-10-02 22:05:48,868 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7544_210_6753_1_1528587722_3576686_172-171750-0_sp1.1 from training. Duration: 0.5909375 2023-10-02 22:05:50,761 WARNING [train.py:1204] (2/4) Exclude cut with ID _24271_210_12658_1_1531987270022_7190519_40-321822-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 22:05:52,079 WARNING [train.py:1204] (2/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_227-83612-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 22:05:53,486 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9156W0015-572508 from training. Duration: 0.896 2023-10-02 22:05:58,140 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_292-265928-0 from training. Duration: 0.51 2023-10-02 22:06:01,505 WARNING [train.py:1204] (2/4) Exclude cut with ID _14747_210_8341_1_1530685797463_3654089_147-131229-0_sp1.1 from training. Duration: 0.6909375 2023-10-02 22:06:04,878 WARNING [train.py:1204] (2/4) Exclude cut with ID _45297_210_2721_1_1533715351381_3264949_351-147136-0_sp0.9 from training. Duration: 0.9666875 2023-10-02 22:06:06,324 WARNING [train.py:1204] (2/4) Exclude cut with ID _5250_210_5219_1_1527249561806_8692110_1102-324568-0_sp1.1 from training. Duration: 0.97275 2023-10-02 22:06:07,705 WARNING [train.py:1204] (2/4) Exclude cut with ID _18516_210_9206_1_1531704653206_3692406_465-75446-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 22:06:07,706 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9203W0126-595623 from training. Duration: 0.896 2023-10-02 22:06:09,077 WARNING [train.py:1204] (2/4) Exclude cut with ID _55383_210_10635_1_1534468426172_2231669_139-328410-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 22:06:16,047 WARNING [train.py:1204] (2/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_166-246557-0_sp0.9 from training. Duration: 0.711125 2023-10-02 22:06:16,051 WARNING [train.py:1204] (2/4) Exclude cut with ID _30039_210_13270_1_1532484612900_2725150_239-304872-0_sp1.1 from training. Duration: 0.963625 2023-10-02 22:06:16,074 WARNING [train.py:1204] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_6-226750-0 from training. Duration: 0.94 2023-10-02 22:06:17,401 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_135-5501-0_sp1.1 from training. Duration: 0.8909375 2023-10-02 22:06:20,420 WARNING [train.py:1204] (2/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_359-118926-0 from training. Duration: 0.72 2023-10-02 22:06:26,257 WARNING [train.py:1204] (2/4) Exclude cut with ID _8064_210_5399_1_1528372299291_4282973_4-91037-0 from training. Duration: 0.88 2023-10-02 22:06:26,276 WARNING [train.py:1204] (2/4) Exclude cut with ID _6653_210_2629_1_1527919172441_6732679_415-288492-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 22:06:26,357 WARNING [train.py:1204] (2/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_544-287179-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 22:06:28,402 WARNING [train.py:1204] (2/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_122-32606-0_sp1.1 from training. Duration: 0.92725 2023-10-02 22:06:29,816 WARNING [train.py:1204] (2/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_121-284152-0_sp1.1 from training. Duration: 0.536375 2023-10-02 22:06:29,927 WARNING [train.py:1204] (2/4) Exclude cut with ID _44718_210_5816_1_1534050051869_6469030_350-299477-0_sp1.1 from training. Duration: 0.97275 2023-10-02 22:06:31,769 WARNING [train.py:1204] (2/4) Exclude cut with ID _6653_210_2629_1_1527919172441_6732679_1103-56445-0_sp0.9 from training. Duration: 0.811125 2023-10-02 22:06:31,835 WARNING [train.py:1204] (2/4) Exclude cut with ID _12369_210_4381_1_1530356078581_4388520_64-298917-0_sp0.9 from training. Duration: 0.97775 2023-10-02 22:06:31,888 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_2245_210_2715_1_1525511548_3540254_310-52483-0 from training. Duration: 0.88 2023-10-02 22:06:33,153 WARNING [train.py:1204] (2/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_401-158239-0_sp1.1 from training. Duration: 0.6454375 2023-10-02 22:06:33,194 WARNING [train.py:1204] (2/4) Exclude cut with ID _59502_210_12210_1_1535107024698_1835139_146-162495-0_sp1.1 from training. Duration: 0.836375 2023-10-02 22:06:33,203 WARNING [train.py:1204] (2/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_41-31157-0_sp0.9 from training. Duration: 0.6333125 2023-10-02 22:06:36,593 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_68717_210_3528_1_1535628683_1556759_95-36722-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 22:06:36,599 WARNING [train.py:1204] (2/4) Exclude cut with ID _30196_210_1664_1_1533708496522_7074140_654-162611-0_sp1.1 from training. Duration: 0.92725 2023-10-02 22:06:37,824 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.578e+02 1.850e+02 2.052e+02 2.209e+02 3.387e+02, threshold=4.104e+02, percent-clipped=0.0 2023-10-02 22:06:40,576 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_9302_210_2550_1_1528889962_4655416_638-301620-0_sp0.9 from training. Duration: 0.52225 2023-10-02 22:06:42,027 WARNING [train.py:1204] (2/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_478-162247-0_sp1.1 from training. Duration: 0.7454375 2023-10-02 22:06:44,838 WARNING [train.py:1204] (2/4) Exclude cut with ID _45297_210_2721_1_1533715351381_3264949_353-9541-0 from training. Duration: 0.97 2023-10-02 22:06:47,626 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9653W0190-675468 from training. Duration: 0.9935 2023-10-02 22:06:49,771 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.whiten.whitening_limit, batch_count=1035280.0, ans=12.0 2023-10-02 22:06:50,403 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_49730_210_13049_1_1534075294_1830676_214-84436-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 22:06:52,048 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.0.layers.0.whiten, num_groups=1, num_channels=192, metric=4.20 vs. limit=12.0 2023-10-02 22:06:52,386 WARNING [train.py:1204] (2/4) Exclude cut with ID _50093_210_4220_1_1534417141207_3432299_259-322252-0_sp1.1 from training. Duration: 0.836375 2023-10-02 22:06:53,799 WARNING [train.py:1204] (2/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_599-50220-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 22:06:55,165 INFO [train.py:1046] (2/4) Epoch 30, batch 1250, loss[loss=0.1779, simple_loss=0.2673, pruned_loss=0.0443, over 24330.00 frames. ], tot_loss[loss=0.1661, simple_loss=0.2439, pruned_loss=0.04417, over 4706478.38 frames. ], batch size: 74, lr: 3.42e-03, grad_scale: 8.0 2023-10-02 22:06:55,217 WARNING [train.py:1204] (2/4) Exclude cut with ID _21878_210_3525_1_1531738772002_7264330_174-109016-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 22:06:55,432 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.1.conv_module2.balancer2.prob, batch_count=1035346.6666666666, ans=0.125 2023-10-02 22:06:57,042 WARNING [train.py:1204] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_665-196155-0 from training. Duration: 0.67 2023-10-02 22:07:00,313 WARNING [train.py:1204] (2/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_656-188361-0_sp1.1 from training. Duration: 0.7909375 2023-10-02 22:07:01,711 WARNING [train.py:1204] (2/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_60-18123-0_sp1.1 from training. Duration: 0.8 2023-10-02 22:07:02,958 WARNING [train.py:1204] (2/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_535-14410-0 from training. Duration: 0.47 2023-10-02 22:07:04,843 WARNING [train.py:1204] (2/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_94-126128-0_sp1.1 from training. Duration: 0.7818125 2023-10-02 22:07:04,928 WARNING [train.py:1204] (2/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_356-31216-0_sp1.1 from training. Duration: 0.6818125 2023-10-02 22:07:10,377 WARNING [train.py:1204] (2/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_443-30034-0_sp1.1 from training. Duration: 0.5818125 2023-10-02 22:07:10,453 WARNING [train.py:1204] (2/4) Exclude cut with ID _26907_210_7234_1_1532221740694_3205263_337-2494-0_sp1.1 from training. Duration: 0.8 2023-10-02 22:07:11,834 WARNING [train.py:1204] (2/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_49-97158-0_sp0.9 from training. Duration: 0.8444375 2023-10-02 22:07:11,844 WARNING [train.py:1204] (2/4) Exclude cut with ID _7877_210_6014_1_1528286358099_3750136_553-170128-0_sp1.1 from training. Duration: 0.963625 2023-10-02 22:07:11,957 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.0.conv_module1.balancer2.min_positive, batch_count=1035413.3333333334, ans=0.05 2023-10-02 22:07:14,559 WARNING [train.py:1204] (2/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_202-293523-0_sp0.9 from training. Duration: 0.8 2023-10-02 22:07:18,723 WARNING [train.py:1204] (2/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_188-171673-0_sp0.9 from training. Duration: 0.6444375 2023-10-02 22:07:18,749 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_120-41668-0_sp1.1 from training. Duration: 0.636375 2023-10-02 22:07:18,762 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_40223_210_6228_1_1533298404_4812267_613-78769-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 22:07:21,370 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_53861_210_5399_1_1534422156_4272077_150-336899-0_sp1.1 from training. Duration: 0.963625 2023-10-02 22:07:22,060 WARNING [train.py:1204] (2/4) Exclude cut with ID _14459_210_2228_1_1531443641946_7516700_1146-56843-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 22:07:23,543 WARNING [train.py:1204] (2/4) Exclude cut with ID _39867_210_9826_1_1533553222030_9344259_1194-299928-0_sp1.1 from training. Duration: 0.92725 2023-10-02 22:07:24,964 WARNING [train.py:1204] (2/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_214-291369-0_sp0.9 from training. Duration: 0.7 2023-10-02 22:07:29,797 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.feed_forward2.hidden_balancer.prob, batch_count=1035480.0, ans=0.125 2023-10-02 22:07:30,961 WARNING [train.py:1204] (2/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_358-215558-0 from training. Duration: 0.93 2023-10-02 22:07:32,228 WARNING [train.py:1204] (2/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_577-287150-0_sp0.9 from training. Duration: 0.911125 2023-10-02 22:07:32,412 WARNING [train.py:1204] (2/4) Exclude cut with ID _60860_210_8254_1_1534896015875_3557169_229-187367-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 22:07:34,297 WARNING [train.py:1204] (2/4) Exclude cut with ID _6275_210_2084_1_1528110062213_7669049_857-298942-0 from training. Duration: 0.85 2023-10-02 22:07:34,341 WARNING [train.py:1204] (2/4) Exclude cut with ID _40137_210_12210_1_1533290484046_3795180_249-187059-0_sp1.1 from training. Duration: 0.8 2023-10-02 22:07:34,348 WARNING [train.py:1204] (2/4) Exclude cut with ID ID0171W0508-765603 from training. Duration: 0.541 2023-10-02 22:07:35,590 WARNING [train.py:1204] (2/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_521-219439-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 22:07:35,603 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_40223_210_6228_1_1533298404_4812267_741-302616-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 22:07:40,915 WARNING [train.py:1204] (2/4) Exclude cut with ID _65293_210_9070_1_1535545549765_4004210_438-154735-0_sp1.1 from training. Duration: 0.92725 2023-10-02 22:07:44,978 WARNING [train.py:1204] (2/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_564-132950-0_sp1.1 from training. Duration: 0.92725 2023-10-02 22:07:45,010 WARNING [train.py:1204] (2/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_291-270881-0_sp1.1 from training. Duration: 0.8818125 2023-10-02 22:07:46,439 WARNING [train.py:1204] (2/4) Exclude cut with ID _60229_210_27767_1_1534838406158_3655350_353-213656-0 from training. Duration: 0.95 2023-10-02 22:07:46,450 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_3133_210_2087_1_1526106446_2324690_243-143119-0 from training. Duration: 0.77 2023-10-02 22:07:46,473 WARNING [train.py:1204] (2/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_690-126306-0 from training. Duration: 0.78 2023-10-02 22:07:50,624 WARNING [train.py:1204] (2/4) Exclude cut with ID _30304_210_6319_1_1532476827261_7582060_347-95003-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 22:07:51,993 WARNING [train.py:1204] (2/4) Exclude cut with ID _2322_210_2667_1_1525604548971_3656299_440-80494-0 from training. Duration: 0.88 2023-10-02 22:07:52,008 WARNING [train.py:1204] (2/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_370-229434-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 22:07:55,194 WARNING [train.py:1204] (2/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_124-345840-0_sp1.1 from training. Duration: 0.47275 2023-10-02 22:07:55,210 WARNING [train.py:1204] (2/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_165-123581-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 22:07:56,669 WARNING [train.py:1204] (2/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_453-282588-0 from training. Duration: 0.98 2023-10-02 22:07:56,686 WARNING [train.py:1204] (2/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_299-34413-0_sp0.9 from training. Duration: 0.72225 2023-10-02 22:07:56,733 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_40036_210_13405_1_1533648647_1315388_48-146016-0_sp1.1 from training. Duration: 0.8454375 2023-10-02 22:07:56,766 WARNING [train.py:1204] (2/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_99-167392-0_sp0.9 from training. Duration: 0.67775 2023-10-02 22:07:57,011 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.bypass.skip_rate, batch_count=1035613.3333333334, ans=0.09899494936611666 2023-10-02 22:07:58,078 WARNING [train.py:1204] (2/4) Exclude cut with ID _48677_210_5663_1_1533902405919_3892989_37-207114-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 22:07:59,499 WARNING [train.py:1204] (2/4) Exclude cut with ID _29857_210_20034_1_1532395886753_3798160_335-163562-0 from training. Duration: 0.85 2023-10-02 22:08:02,699 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_36492_210_7927_1_1533261987_4790834_703-343351-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 22:08:04,110 WARNING [train.py:1204] (2/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_80-147301-0_sp0.9 from training. Duration: 0.9333125 2023-10-02 22:08:06,028 WARNING [train.py:1204] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_218-295097-0_sp0.9 from training. Duration: 0.8555625 2023-10-02 22:08:08,164 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_2295_210_2087_1_1525500993_2407863_116-238894-0_sp1.1 from training. Duration: 0.636375 2023-10-02 22:08:09,414 INFO [train.py:1046] (2/4) Epoch 30, batch 1300, loss[loss=0.1622, simple_loss=0.2309, pruned_loss=0.04677, over 23760.00 frames. ], tot_loss[loss=0.1663, simple_loss=0.244, pruned_loss=0.04429, over 4710724.68 frames. ], batch size: 232, lr: 3.42e-03, grad_scale: 8.0 2023-10-02 22:08:12,238 WARNING [train.py:1204] (2/4) Exclude cut with ID _3231_210_2106_1_1526205864934_6784220_697-171068-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 22:08:12,257 WARNING [train.py:1204] (2/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_102-77064-0 from training. Duration: 0.93 2023-10-02 22:08:16,309 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_13091_210_7925_1_1530609447_8987354_1186-261957-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 22:08:17,720 WARNING [train.py:1204] (2/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_91-277684-0_sp1.1 from training. Duration: 0.67275 2023-10-02 22:08:19,077 WARNING [train.py:1204] (2/4) Exclude cut with ID _31088_210_16446_1_1532520012284_3440919_159-313751-0_sp1.1 from training. Duration: 0.936375 2023-10-02 22:08:19,363 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=1035680.0, ans=0.1 2023-10-02 22:08:20,422 WARNING [train.py:1204] (2/4) Exclude cut with ID _42326_210_7770_1_1533814282531_3707269_227-203721-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 22:08:20,501 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_3531_210_3528_1_1526294203_5216456_309-227302-0_sp1.1 from training. Duration: 0.62725 2023-10-02 22:08:21,853 WARNING [train.py:1204] (2/4) Exclude cut with ID _14087_210_2253_1_1530529224512_3014029_226-242140-0 from training. Duration: 0.81 2023-10-02 22:08:25,453 WARNING [train.py:1204] (2/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_122-109023-0_sp1.1 from training. Duration: 0.6909375 2023-10-02 22:08:26,849 WARNING [train.py:1204] (2/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_660-211088-0_sp0.9 from training. Duration: 0.688875 2023-10-02 22:08:28,101 WARNING [train.py:1204] (2/4) Exclude cut with ID _39290_210_4220_1_1533862778760_3075118_92-311600-0 from training. Duration: 0.88 2023-10-02 22:08:28,851 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.3.encoder.layers.1.whiten, num_groups=1, num_channels=512, metric=4.72 vs. limit=12.0 2023-10-02 22:08:32,660 WARNING [train.py:1204] (2/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_390-339137-0_sp1.1 from training. Duration: 0.5090625 2023-10-02 22:08:34,179 WARNING [train.py:1204] (2/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_449-341946-0_sp1.1 from training. Duration: 0.963625 2023-10-02 22:08:36,202 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_68095_210_20185_1_1535718226_8888855_884-327007-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 22:08:36,420 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.nonlin_attention.balancer.prob, batch_count=1035746.6666666666, ans=0.125 2023-10-02 22:08:38,150 WARNING [train.py:1204] (2/4) Exclude cut with ID _42326_210_7770_1_1533814282531_3707269_308-222998-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 22:08:38,266 WARNING [train.py:1204] (2/4) Exclude cut with ID _40399_210_4353_1_1533305658194_3279406_344-238708-0_sp1.1 from training. Duration: 0.963625 2023-10-02 22:08:39,657 WARNING [train.py:1204] (2/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_87-131427-0_sp1.1 from training. Duration: 0.7090625 2023-10-02 22:08:40,968 WARNING [train.py:1204] (2/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_110-130961-0_sp0.9 from training. Duration: 0.52225 2023-10-02 22:08:41,015 WARNING [train.py:1204] (2/4) Exclude cut with ID _55153_210_2253_1_1534384786677_3162096_148-79022-0 from training. Duration: 0.96 2023-10-02 22:08:45,191 WARNING [train.py:1204] (2/4) Exclude cut with ID _9679_210_7497_1_1529129212906_4363929_512-313889-0_sp1.1 from training. Duration: 0.736375 2023-10-02 22:08:45,203 WARNING [train.py:1204] (2/4) Exclude cut with ID _7970_210_1883_1_1528455616296_3472230_25-178407-0_sp1.1 from training. Duration: 0.6454375 2023-10-02 22:08:45,320 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.1.attention_skip_rate, batch_count=1035813.3333333334, ans=0.0 2023-10-02 22:08:47,844 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_3133_210_2087_1_1526106446_2324690_246-125000-0 from training. Duration: 0.62 2023-10-02 22:08:47,905 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_15126_210_6753_1_1530778677_2591185_113-157651-0_sp0.9 from training. Duration: 0.7555625 2023-10-02 22:08:50,688 WARNING [train.py:1204] (2/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_105-63309-0_sp1.1 from training. Duration: 0.8909375 2023-10-02 22:08:53,939 WARNING [train.py:1204] (2/4) Exclude cut with ID _29877_210_6228_1_1532397791106_5276149_578-177271-0_sp1.1 from training. Duration: 0.936375 2023-10-02 22:08:53,995 WARNING [train.py:1204] (2/4) Exclude cut with ID _44831_210_13427_1_1533816011549_3689035_32-188147-0 from training. Duration: 0.96 2023-10-02 22:08:54,040 WARNING [train.py:1204] (2/4) Exclude cut with ID _12369_210_4381_1_1530356078581_4388520_300-254337-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 22:08:55,211 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_128-241008-0 from training. Duration: 0.94 2023-10-02 22:08:56,675 WARNING [train.py:1204] (2/4) Exclude cut with ID _21878_210_3525_1_1531738772002_7264330_53-279178-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 22:09:00,846 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_41452_210_7291_1_1533437687_3941281_273-334088-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 22:09:00,849 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_128-241008-0_sp1.1 from training. Duration: 0.8545625 2023-10-02 22:09:04,294 WARNING [train.py:1204] (2/4) Exclude cut with ID _8394_210_6702_1_1528590504316_3540189_379-49457-0 from training. Duration: 0.87 2023-10-02 22:09:04,517 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.attention_skip_rate, batch_count=1035880.0, ans=0.0 2023-10-02 22:09:05,529 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.577e+02 1.946e+02 2.321e+02 2.829e+02 4.906e+02, threshold=4.642e+02, percent-clipped=3.0 2023-10-02 22:09:05,621 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_105-114797-0 from training. Duration: 0.84 2023-10-02 22:09:05,933 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.conv_skip_rate, batch_count=1035880.0, ans=0.0 2023-10-02 22:09:07,405 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_14928_210_5632_1_1530773461_4714000_454-112198-0 from training. Duration: 0.66 2023-10-02 22:09:13,215 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_456-22141-0_sp1.1 from training. Duration: 0.8 2023-10-02 22:09:14,720 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_115-233929-0 from training. Duration: 0.72 2023-10-02 22:09:17,425 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_616-16795-0_sp1.1 from training. Duration: 0.963625 2023-10-02 22:09:20,536 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.feed_forward1.hidden_balancer.prob, batch_count=1035946.6666666666, ans=0.125 2023-10-02 22:09:23,157 INFO [train.py:1046] (2/4) Epoch 30, batch 1350, loss[loss=0.1612, simple_loss=0.2322, pruned_loss=0.04512, over 23916.00 frames. ], tot_loss[loss=0.1651, simple_loss=0.2428, pruned_loss=0.04374, over 4714343.78 frames. ], batch size: 195, lr: 3.42e-03, grad_scale: 8.0 2023-10-02 22:09:23,224 WARNING [train.py:1204] (2/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_448-212135-0 from training. Duration: 0.91 2023-10-02 22:09:27,810 WARNING [train.py:1204] (2/4) Exclude cut with ID _46683_210_9642_1_1533730913492_4591116_201-205677-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 22:09:29,286 WARNING [train.py:1204] (2/4) Exclude cut with ID _64086_210_7994_1_1535270417019_7435949_43-80483-0_sp1.1 from training. Duration: 0.97275 2023-10-02 22:09:29,442 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.conv_module2.balancer2.prob, batch_count=1036013.3333333334, ans=0.125 2023-10-02 22:09:33,393 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_240-134124-0_sp1.1 from training. Duration: 0.963625 2023-10-02 22:09:33,422 WARNING [train.py:1204] (2/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_907-120940-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 22:09:34,880 WARNING [train.py:1204] (2/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_848-280957-0_sp1.1 from training. Duration: 0.7909375 2023-10-02 22:09:36,772 WARNING [train.py:1204] (2/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_243-42116-0_sp0.9 from training. Duration: 0.811125 2023-10-02 22:09:40,066 WARNING [train.py:1204] (2/4) Exclude cut with ID _6694_210_4599_1_1527852637490_3965529_116-109398-0_sp0.9 from training. Duration: 0.811125 2023-10-02 22:09:41,377 WARNING [train.py:1204] (2/4) Exclude cut with ID _24314_210_4915_1_1532135078116_7170179_521-258868-0 from training. Duration: 0.79 2023-10-02 22:09:43,496 WARNING [train.py:1204] (2/4) Exclude cut with ID _2220_210_1819_1_1525509109898_3658680_50-99837-0_sp0.9 from training. Duration: 0.6 2023-10-02 22:09:43,548 WARNING [train.py:1204] (2/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_313-134930-0_sp1.1 from training. Duration: 0.8818125 2023-10-02 22:09:46,253 WARNING [train.py:1204] (2/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_125-144174-0 from training. Duration: 0.39 2023-10-02 22:09:46,316 WARNING [train.py:1204] (2/4) Exclude cut with ID _59288_210_5854_1_1535077757774_3412246_275-293897-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 22:09:46,865 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.4.encoder.layers.2.conv_module2.whiten, num_groups=1, num_channels=384, metric=3.61 vs. limit=15.0 2023-10-02 22:09:47,610 WARNING [train.py:1204] (2/4) Exclude cut with ID _40519_210_13794_1_1533715228655_7337390_722-52880-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 22:09:47,624 WARNING [train.py:1204] (2/4) Exclude cut with ID _7537_210_2084_1_1528502395431_3924359_128-268029-0 from training. Duration: 0.98 2023-10-02 22:09:48,947 WARNING [train.py:1204] (2/4) Exclude cut with ID _8021_210_7994_1_1528376451358_3653069_179-45206-0 from training. Duration: 0.86 2023-10-02 22:09:51,641 WARNING [train.py:1204] (2/4) Exclude cut with ID _13252_210_1925_1_1530671372047_3336909_88-276668-0 from training. Duration: 0.99 2023-10-02 22:09:51,749 WARNING [train.py:1204] (2/4) Exclude cut with ID _22613_210_5219_1_1532253599686_4090170_290-212862-0_sp1.1 from training. Duration: 0.97275 2023-10-02 22:09:53,064 WARNING [train.py:1204] (2/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_465-214726-0 from training. Duration: 0.86 2023-10-02 22:10:02,232 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.1.self_attn_weights.pos_emb_skip_rate, batch_count=1036146.6666666666, ans=0.0 2023-10-02 22:10:04,011 WARNING [train.py:1204] (2/4) Exclude cut with ID _56480_210_4220_1_1534679942079_3075409_138-36031-0_sp1.1 from training. Duration: 0.97275 2023-10-02 22:10:13,011 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=1036213.3333333334, ans=0.1 2023-10-02 22:10:14,309 WARNING [train.py:1204] (2/4) Exclude cut with ID _58605_210_12210_1_1534723195939_3904259_300-285878-0_sp1.1 from training. Duration: 0.97275 2023-10-02 22:10:15,761 WARNING [train.py:1204] (2/4) Exclude cut with ID _40901_210_5665_1_1533363789958_3856130_114-340229-0_sp1.1 from training. Duration: 0.936375 2023-10-02 22:10:15,812 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_489-230703-0 from training. Duration: 0.85 2023-10-02 22:10:18,596 WARNING [train.py:1204] (2/4) Exclude cut with ID _66873_210_5517_1_1535454136506_4293370_322-81243-0_sp1.1 from training. Duration: 0.936375 2023-10-02 22:10:19,939 WARNING [train.py:1204] (2/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_271-269084-0 from training. Duration: 0.83 2023-10-02 22:10:19,952 WARNING [train.py:1204] (2/4) Exclude cut with ID _13252_210_1925_1_1530671372047_3336909_166-314607-0_sp0.9 from training. Duration: 0.6 2023-10-02 22:10:21,339 WARNING [train.py:1204] (2/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_340-343302-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 22:10:24,100 WARNING [train.py:1204] (2/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_286-112015-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 22:10:25,647 WARNING [train.py:1204] (2/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_328-264776-0 from training. Duration: 0.67 2023-10-02 22:10:27,042 WARNING [train.py:1204] (2/4) Exclude cut with ID _4866_210_2577_1_1527404412284_4364659_256-306830-0_sp0.9 from training. Duration: 0.9666875 2023-10-02 22:10:30,496 WARNING [train.py:1204] (2/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_239-316802-0 from training. Duration: 0.69 2023-10-02 22:10:31,966 WARNING [train.py:1204] (2/4) Exclude cut with ID _29857_210_20034_1_1532395886753_3798160_279-185801-0 from training. Duration: 0.91 2023-10-02 22:10:37,768 INFO [train.py:1046] (2/4) Epoch 30, batch 1400, loss[loss=0.1701, simple_loss=0.2495, pruned_loss=0.04533, over 23238.00 frames. ], tot_loss[loss=0.1636, simple_loss=0.2413, pruned_loss=0.04297, over 4720083.10 frames. ], batch size: 105, lr: 3.42e-03, grad_scale: 8.0 2023-10-02 22:10:39,198 WARNING [train.py:1204] (2/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_656-188361-0 from training. Duration: 0.87 2023-10-02 22:10:39,938 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.1.encoder.layers.0.feed_forward1.out_whiten, num_groups=1, num_channels=256, metric=5.39 vs. limit=15.0 2023-10-02 22:10:41,054 WARNING [train.py:1204] (2/4) Exclude cut with ID _44718_210_5816_1_1534050051869_6469030_100-230287-0_sp1.1 from training. Duration: 0.936375 2023-10-02 22:10:41,270 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.feed_forward3.hidden_balancer.prob, batch_count=1036346.6666666666, ans=0.125 2023-10-02 22:10:43,201 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8275_210_4198_1_1528455320_3892571_276-215622-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 22:10:44,597 WARNING [train.py:1204] (2/4) Exclude cut with ID _30304_210_6319_1_1532476827261_7582060_698-309430-0_sp1.1 from training. Duration: 0.963625 2023-10-02 22:10:44,789 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.self_attn_weights.pos_emb_skip_rate, batch_count=1036346.6666666666, ans=0.0 2023-10-02 22:10:50,173 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_365-217119-0 from training. Duration: 0.63 2023-10-02 22:10:51,588 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7264_210_4938_1_1527939911_8132095_800-91995-0 from training. Duration: 0.86 2023-10-02 22:10:53,317 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.self_attn_weights.pos_emb_skip_rate, batch_count=1036413.3333333334, ans=0.0 2023-10-02 22:11:01,749 WARNING [train.py:1204] (2/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_49-97158-0_sp1.1 from training. Duration: 0.6909375 2023-10-02 22:11:03,200 WARNING [train.py:1204] (2/4) Exclude cut with ID _61132_210_9969_1_1535021792964_3755419_61-57720-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 22:11:04,645 WARNING [train.py:1204] (2/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_345-306832-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 22:11:04,686 WARNING [train.py:1204] (2/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_164-224052-0_sp0.9 from training. Duration: 0.77775 2023-10-02 22:11:07,901 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_34886_210_10633_1_1533203882_1755661_76-156096-0_sp1.1 from training. Duration: 0.7909375 2023-10-02 22:11:09,277 WARNING [train.py:1204] (2/4) Exclude cut with ID 4133-6541-0027-40495-0_sp1.1 from training. Duration: 0.9681875 2023-10-02 22:11:14,017 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.conv_skip_rate, batch_count=1036480.0, ans=0.0 2023-10-02 22:11:18,459 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_514-211165-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 22:11:19,825 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_41211_210_2544_1_1533348859_3702910_97-212695-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 22:11:23,791 WARNING [train.py:1204] (2/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_406-45086-0 from training. Duration: 0.8 2023-10-02 22:11:25,150 WARNING [train.py:1204] (2/4) Exclude cut with ID _65263_210_22356_1_1535763598691_3921037_49-222917-0_sp1.1 from training. Duration: 0.836375 2023-10-02 22:11:26,550 WARNING [train.py:1204] (2/4) Exclude cut with ID _4866_210_2577_1_1527404412284_4364659_352-157930-0_sp1.1 from training. Duration: 0.736375 2023-10-02 22:11:26,609 WARNING [train.py:1204] (2/4) Exclude cut with ID _4366_210_1388_1_1526792053633_3613819_231-211402-0_sp1.1 from training. Duration: 0.8545625 2023-10-02 22:11:26,643 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_98-21576-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 22:11:28,511 WARNING [train.py:1204] (2/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_348-127789-0_sp0.9 from training. Duration: 0.9444375 2023-10-02 22:11:28,525 WARNING [train.py:1204] (2/4) Exclude cut with ID _45871_210_5665_1_1533864614245_3296060_98-329557-0_sp1.1 from training. Duration: 0.97275 2023-10-02 22:11:29,439 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.0.layers.1.conv_module2.whiten, num_groups=1, num_channels=192, metric=9.52 vs. limit=15.0 2023-10-02 22:11:29,821 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6846_210_6753_1_1527850767_4295158_2-315073-0_sp1.1 from training. Duration: 0.8 2023-10-02 22:11:31,155 WARNING [train.py:1204] (2/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_145-137790-0 from training. Duration: 0.83 2023-10-02 22:11:32,412 WARNING [train.py:1204] (2/4) Exclude cut with ID _14603_210_4220_1_1530615688441_3097319_133-158308-0_sp0.9 from training. Duration: 0.9333125 2023-10-02 22:11:32,667 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.feed_forward1.out_proj.dropout_p, batch_count=1036546.6666666666, ans=0.1 2023-10-02 22:11:33,815 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.503e+02 1.837e+02 2.123e+02 2.555e+02 3.782e+02, threshold=4.246e+02, percent-clipped=0.0 2023-10-02 22:11:35,365 WARNING [train.py:1204] (2/4) Exclude cut with ID _14898_210_9206_1_1530766812377_4177190_320-11758-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 22:11:38,778 WARNING [train.py:1204] (2/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_406-45086-0_sp0.9 from training. Duration: 0.888875 2023-10-02 22:11:39,040 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.1.balancer1.prob, batch_count=1036613.3333333334, ans=0.125 2023-10-02 22:11:44,725 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_5438_210_1794_1_1527336687_2600009_104-288274-0 from training. Duration: 0.58 2023-10-02 22:11:46,588 WARNING [train.py:1204] (2/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_108-53929-0_sp0.9 from training. Duration: 0.6555625 2023-10-02 22:11:46,817 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.balancer2.prob, batch_count=1036613.3333333334, ans=0.125 2023-10-02 22:11:47,839 WARNING [train.py:1204] (2/4) Exclude cut with ID _30800_210_22382_1_1532499347904_4515959_444-286390-0_sp1.1 from training. Duration: 0.963625 2023-10-02 22:11:50,669 WARNING [train.py:1204] (2/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_125-144174-0_sp0.9 from training. Duration: 0.4333125 2023-10-02 22:11:51,967 INFO [train.py:1046] (2/4) Epoch 30, batch 1450, loss[loss=0.1635, simple_loss=0.2187, pruned_loss=0.05413, over 19117.00 frames. ], tot_loss[loss=0.1627, simple_loss=0.24, pruned_loss=0.0427, over 4708940.51 frames. ], batch size: 388, lr: 3.42e-03, grad_scale: 8.0 2023-10-02 22:11:52,011 WARNING [train.py:1204] (2/4) Exclude cut with ID _55209_210_1531_1_1534675828996_4597160_118-345621-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 22:11:53,369 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_45886_210_17024_1_1533709215_471063_7-79165-0_sp1.1 from training. Duration: 0.8909375 2023-10-02 22:11:54,983 WARNING [train.py:1204] (2/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_175-163894-0_sp0.9 from training. Duration: 0.82225 2023-10-02 22:11:57,046 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_50188_210_4198_1_1534503621_3646041_353-81682-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 22:11:57,055 WARNING [train.py:1204] (2/4) Exclude cut with ID _17875_210_2087_1_1531224012907_1821339_175-80565-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 22:11:57,064 WARNING [train.py:1204] (2/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_159-328157-0_sp1.1 from training. Duration: 0.47275 2023-10-02 22:12:02,250 WARNING [train.py:1204] (2/4) Exclude cut with ID _60057_210_10640_1_1534903065793_3333150_137-267579-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 22:12:02,300 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_92-46009-0_sp1.1 from training. Duration: 0.4909375 2023-10-02 22:12:03,649 WARNING [train.py:1204] (2/4) Exclude cut with ID _53203_210_2087_1_1534246218309_475590_48-68507-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 22:12:03,672 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_28211_210_6388_1_1532861506_4523224_128-328162-0 from training. Duration: 0.9 2023-10-02 22:12:04,991 WARNING [train.py:1204] (2/4) Exclude cut with ID _4697_210_2069_1_1526815801953_3823098_51-1049-0_sp1.1 from training. Duration: 0.6818125 2023-10-02 22:12:05,105 WARNING [train.py:1204] (2/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_383-316000-0 from training. Duration: 0.9 2023-10-02 22:12:06,438 WARNING [train.py:1204] (2/4) Exclude cut with ID _4779_210_2084_1_1527318022944_3914893_185-295153-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 22:12:06,517 WARNING [train.py:1204] (2/4) Exclude cut with ID _3231_210_2106_1_1526205864934_6784220_171-256004-0_sp0.9 from training. Duration: 0.9555625 2023-10-02 22:12:06,517 WARNING [train.py:1204] (2/4) Exclude cut with ID _41314_210_12651_1_1533459476543_3766460_87-272068-0 from training. Duration: 0.83 2023-10-02 22:12:08,625 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.bypass.skip_rate, batch_count=1036746.6666666666, ans=0.07 2023-10-02 22:12:09,703 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_164-152777-0_sp1.1 from training. Duration: 0.92725 2023-10-02 22:12:09,759 WARNING [train.py:1204] (2/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_18-130336-0_sp0.9 from training. Duration: 0.87775 2023-10-02 22:12:09,903 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.feed_forward1.hidden_balancer.prob, batch_count=1036746.6666666666, ans=0.125 2023-10-02 22:12:10,525 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.2.encoder.layers.0.conv_module2.whiten, num_groups=1, num_channels=384, metric=6.35 vs. limit=15.0 2023-10-02 22:12:11,043 WARNING [train.py:1204] (2/4) Exclude cut with ID _14886_210_4220_1_1530702020355_3516190_175-229073-0_sp1.1 from training. Duration: 0.4090625 2023-10-02 22:12:11,063 WARNING [train.py:1204] (2/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_791-45149-0_sp0.9 from training. Duration: 0.9555625 2023-10-02 22:12:12,500 WARNING [train.py:1204] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_468-189058-0_sp1.1 from training. Duration: 0.87275 2023-10-02 22:12:13,848 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8278_210_2550_1_1528637099_4220413_32-142780-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 22:12:17,301 WARNING [train.py:1204] (2/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_146-218763-0_sp0.9 from training. Duration: 0.9555625 2023-10-02 22:12:20,182 WARNING [train.py:1204] (2/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_348-127789-0_sp1.1 from training. Duration: 0.77275 2023-10-02 22:12:20,187 WARNING [train.py:1204] (2/4) Exclude cut with ID _47790_210_16187_1_1533862814066_4207519_288-285445-0_sp1.1 from training. Duration: 0.936375 2023-10-02 22:12:22,913 WARNING [train.py:1204] (2/4) Exclude cut with ID _14603_210_4220_1_1530615688441_3097319_308-181732-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 22:12:22,936 WARNING [train.py:1204] (2/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_895-117428-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 22:12:26,046 WARNING [train.py:1204] (2/4) Exclude cut with ID _9235_210_5219_1_1528891154253_8046120_387-159008-0_sp0.9 from training. Duration: 0.9555625 2023-10-02 22:12:26,067 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_37351_210_7925_1_1533012931_3812113_265-85780-0_sp0.9 from training. Duration: 0.97775 2023-10-02 22:12:26,088 WARNING [train.py:1204] (2/4) Exclude cut with ID _40551_210_10635_1_1533776424632_3416179_361-3964-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 22:12:27,350 WARNING [train.py:1204] (2/4) Exclude cut with ID _25882_210_3036_1_1532775665815_6479510_485-122742-0_sp1.1 from training. Duration: 0.97275 2023-10-02 22:12:28,815 WARNING [train.py:1204] (2/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_98-51676-0 from training. Duration: 0.73 2023-10-02 22:12:31,610 WARNING [train.py:1204] (2/4) Exclude cut with ID _30548_210_16040_1_1532608097130_3146969_371-280610-0_sp1.1 from training. Duration: 0.92725 2023-10-02 22:12:33,066 WARNING [train.py:1204] (2/4) Exclude cut with ID IC0671W0081-331924 from training. Duration: 0.896 2023-10-02 22:12:35,713 WARNING [train.py:1204] (2/4) Exclude cut with ID _40284_210_2084_1_1533513679814_3704279_380-160775-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 22:12:36,432 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.2.encoder.layers.2.self_attn2.whiten, num_groups=1, num_channels=384, metric=18.18 vs. limit=22.5 2023-10-02 22:12:37,705 WARNING [train.py:1204] (2/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_57-270037-0_sp1.1 from training. Duration: 0.7 2023-10-02 22:12:38,999 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_853-150237-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 22:12:39,083 WARNING [train.py:1204] (2/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_491-261923-0 from training. Duration: 0.98 2023-10-02 22:12:40,645 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.attention_skip_rate, batch_count=1036880.0, ans=0.0 2023-10-02 22:12:40,699 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.conv_skip_rate, batch_count=1036880.0, ans=0.0 2023-10-02 22:12:44,542 WARNING [train.py:1204] (2/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_836-244045-0_sp1.1 from training. Duration: 0.97275 2023-10-02 22:12:44,624 WARNING [train.py:1204] (2/4) Exclude cut with ID _14880_210_8895_1_1530680426655_3795259_289-212817-0 from training. Duration: 0.69 2023-10-02 22:12:44,837 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.conv_skip_rate, batch_count=1036880.0, ans=0.0 2023-10-02 22:12:48,246 WARNING [train.py:1204] (2/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_303-340446-0 from training. Duration: 0.9 2023-10-02 22:12:48,353 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_37152_210_9697_1_1533106573_3844188_418-41736-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 22:12:52,432 WARNING [train.py:1204] (2/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_371-18169-0_sp1.1 from training. Duration: 0.7818125 2023-10-02 22:12:53,024 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.4.encoder.layers.1.self_attn2.whiten, num_groups=1, num_channels=384, metric=12.71 vs. limit=22.5 2023-10-02 22:12:53,923 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_187-219984-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 22:12:54,034 WARNING [train.py:1204] (2/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_63-267585-0 from training. Duration: 0.9 2023-10-02 22:12:55,910 WARNING [train.py:1204] (2/4) Exclude cut with ID _27013_210_1389_1_1532437173240_3252649_222-125573-0 from training. Duration: 0.98 2023-10-02 22:12:57,221 WARNING [train.py:1204] (2/4) Exclude cut with ID _14821_210_8750_1_1530680503991_6829359_189-297451-0 from training. Duration: 0.55 2023-10-02 22:12:58,539 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_557-163391-0_sp1.1 from training. Duration: 0.97275 2023-10-02 22:12:59,896 WARNING [train.py:1204] (2/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_65-88242-0_sp1.1 from training. Duration: 0.5090625 2023-10-02 22:13:05,507 INFO [train.py:1046] (2/4) Epoch 30, batch 1500, loss[loss=0.1808, simple_loss=0.2589, pruned_loss=0.05135, over 24647.00 frames. ], tot_loss[loss=0.1631, simple_loss=0.2407, pruned_loss=0.04277, over 4710231.30 frames. ], batch size: 65, lr: 3.42e-03, grad_scale: 8.0 2023-10-02 22:13:08,856 WARNING [train.py:1204] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_218-295097-0 from training. Duration: 0.77 2023-10-02 22:13:10,169 WARNING [train.py:1204] (2/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_686-247600-0_sp1.1 from training. Duration: 0.836375 2023-10-02 22:13:10,173 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_397-147196-0_sp0.9 from training. Duration: 0.888875 2023-10-02 22:13:11,510 WARNING [train.py:1204] (2/4) Exclude cut with ID _53950_210_5816_1_1534417264919_3862951_237-136449-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 22:13:12,887 WARNING [train.py:1204] (2/4) Exclude cut with ID _3120_210_3525_1_1526036143112_4072980_74-178400-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 22:13:12,972 WARNING [train.py:1204] (2/4) Exclude cut with ID _5166_210_2084_1_1527073197160_4087993_309-222545-0_sp0.9 from training. Duration: 0.9333125 2023-10-02 22:13:13,150 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.feed_forward1.hidden_balancer.prob, batch_count=1037013.3333333334, ans=0.125 2023-10-02 22:13:14,282 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7544_210_6753_1_1528587722_3576686_172-171750-0 from training. Duration: 0.65 2023-10-02 22:13:14,386 WARNING [train.py:1204] (2/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_3-237434-0_sp0.9 from training. Duration: 0.8444375 2023-10-02 22:13:15,732 WARNING [train.py:1204] (2/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_101-293367-0_sp0.9 from training. Duration: 0.7 2023-10-02 22:13:15,740 WARNING [train.py:1204] (2/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_818-213552-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 22:13:15,830 WARNING [train.py:1204] (2/4) Exclude cut with ID _14349_210_8341_1_1530615710818_3442470_115-285975-0_sp1.1 from training. Duration: 0.7818125 2023-10-02 22:13:17,765 WARNING [train.py:1204] (2/4) Exclude cut with ID _30800_210_22382_1_1532499347904_4515959_291-309749-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 22:13:19,659 WARNING [train.py:1204] (2/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_715-3462-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 22:13:24,273 WARNING [train.py:1204] (2/4) Exclude cut with ID _59502_210_12210_1_1535107024698_1835139_13-331827-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 22:13:24,292 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_4528_210_3273_1_1527076654_3276777_342-168068-0 from training. Duration: 0.87 2023-10-02 22:13:25,638 WARNING [train.py:1204] (2/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_215-141059-0_sp0.9 from training. Duration: 0.688875 2023-10-02 22:13:26,894 WARNING [train.py:1204] (2/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_106-286130-0_sp1.1 from training. Duration: 0.8909375 2023-10-02 22:13:27,241 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.ff2_skip_rate, batch_count=1037080.0, ans=0.0 2023-10-02 22:13:28,305 WARNING [train.py:1204] (2/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_480-77655-0_sp1.1 from training. Duration: 0.97275 2023-10-02 22:13:29,846 WARNING [train.py:1204] (2/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_848-280957-0 from training. Duration: 0.87 2023-10-02 22:13:32,717 WARNING [train.py:1204] (2/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_122-109023-0 from training. Duration: 0.76 2023-10-02 22:13:35,194 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_11305_210_1794_1_1529750428_7178148_645-233480-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 22:13:35,255 WARNING [train.py:1204] (2/4) Exclude cut with ID _7562_210_2084_1_1528887767916_3946229_345-169156-0 from training. Duration: 0.58 2023-10-02 22:13:38,522 WARNING [train.py:1204] (2/4) Exclude cut with ID _7562_210_2084_1_1528887767916_3946229_345-169156-0_sp1.1 from training. Duration: 0.52725 2023-10-02 22:13:41,201 WARNING [train.py:1204] (2/4) Exclude cut with ID _13253_210_1925_1_1530757775819_3577873_253-246386-0_sp1.1 from training. Duration: 0.8181875 2023-10-02 22:13:41,270 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_136-264444-0_sp1.1 from training. Duration: 0.97275 2023-10-02 22:13:41,296 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_2168_210_2290_1_1525606920_1849400_136-222991-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 22:13:42,655 WARNING [train.py:1204] (2/4) Exclude cut with ID _7852_210_5219_1_1528286471316_8139400_242-105336-0 from training. Duration: 0.72 2023-10-02 22:13:42,896 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=1037146.6666666666, ans=0.1 2023-10-02 22:13:43,950 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_5960_210_1794_1_1527403865_3990999_515-2613-0_sp1.1 from training. Duration: 0.8 2023-10-02 22:13:43,973 WARNING [train.py:1204] (2/4) Exclude cut with ID _39688_210_19596_1_1533371626725_4154159_551-167834-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 22:13:44,036 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_43802_210_2290_1_1533516203_8829504_85-195240-0 from training. Duration: 0.87 2023-10-02 22:13:45,326 WARNING [train.py:1204] (2/4) Exclude cut with ID _63171_210_15488_1_1535104792794_3295110_113-113534-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 22:13:51,979 WARNING [train.py:1204] (2/4) Exclude cut with ID _8833_210_5219_1_1528592665210_7553059_319-349181-0_sp1.1 from training. Duration: 0.9 2023-10-02 22:13:51,980 WARNING [train.py:1204] (2/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_225-43381-0 from training. Duration: 0.94 2023-10-02 22:13:59,386 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_5959_210_1794_1_1527400400_3445726_412-44052-0_sp0.9 from training. Duration: 0.8555625 2023-10-02 22:14:00,850 WARNING [train.py:1204] (2/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_356-167816-0_sp1.1 from training. Duration: 0.6545625 2023-10-02 22:14:02,132 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.555e+02 1.861e+02 2.140e+02 2.435e+02 4.119e+02, threshold=4.280e+02, percent-clipped=0.0 2023-10-02 22:14:04,977 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9012W0112-501244 from training. Duration: 0.768 2023-10-02 22:14:05,031 WARNING [train.py:1204] (2/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_177-326952-0_sp1.1 from training. Duration: 0.92725 2023-10-02 22:14:05,035 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9016W0116-502789 from training. Duration: 0.896 2023-10-02 22:14:06,386 WARNING [train.py:1204] (2/4) Exclude cut with ID _56235_210_10640_1_1534557335235_4161099_557-204815-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 22:14:06,630 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.conv_module1.balancer1.prob, batch_count=1037280.0, ans=0.125 2023-10-02 22:14:07,755 WARNING [train.py:1204] (2/4) Exclude cut with ID _55857_210_2346_1_1534644007831_6898209_279-229469-0_sp1.1 from training. Duration: 0.936375 2023-10-02 22:14:09,033 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9029W0375-509764 from training. Duration: 0.896 2023-10-02 22:14:09,101 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_2295_210_2087_1_1525500993_2407863_234-83104-0_sp0.9 from training. Duration: 0.688875 2023-10-02 22:14:11,783 WARNING [train.py:1204] (2/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_266-137104-0 from training. Duration: 0.65 2023-10-02 22:14:15,086 WARNING [train.py:1204] (2/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_289-163927-0_sp1.1 from training. Duration: 0.92725 2023-10-02 22:14:18,146 WARNING [train.py:1204] (2/4) Exclude cut with ID _12733_210_8341_1_1530167424556_3646380_86-225314-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 22:14:18,159 WARNING [train.py:1204] (2/4) Exclude cut with ID _7877_210_6014_1_1528286358099_3750136_618-284064-0_sp1.1 from training. Duration: 0.92725 2023-10-02 22:14:18,207 WARNING [train.py:1204] (2/4) Exclude cut with ID _55844_210_2346_1_1534471212569_7625707_1017-31469-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 22:14:18,257 WARNING [train.py:1204] (2/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_617-241446-0_sp1.1 from training. Duration: 0.92725 2023-10-02 22:14:19,537 INFO [train.py:1046] (2/4) Epoch 30, batch 1550, loss[loss=0.1672, simple_loss=0.2469, pruned_loss=0.04372, over 23773.00 frames. ], tot_loss[loss=0.1643, simple_loss=0.2418, pruned_loss=0.04337, over 4701129.80 frames. ], batch size: 179, lr: 3.42e-03, grad_scale: 8.0 2023-10-02 22:14:19,616 WARNING [train.py:1204] (2/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_866-173440-0_sp1.1 from training. Duration: 0.8181875 2023-10-02 22:14:19,742 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_9460_210_4211_1_1528954587_10049667_480-260269-0 from training. Duration: 0.56 2023-10-02 22:14:21,147 WARNING [train.py:1204] (2/4) Exclude cut with ID _44659_210_2721_1_1534036120061_6614860_626-298884-0 from training. Duration: 0.97 2023-10-02 22:14:21,190 WARNING [train.py:1204] (2/4) Exclude cut with ID _53565_210_10635_1_1534337218335_3820259_287-18120-0_sp1.1 from training. Duration: 0.8818125 2023-10-02 22:14:22,513 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_791-197891-0 from training. Duration: 0.84 2023-10-02 22:14:24,357 WARNING [train.py:1204] (2/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_527-238990-0 from training. Duration: 0.9 2023-10-02 22:14:26,296 WARNING [train.py:1204] (2/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_449-65969-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 22:14:27,675 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_553-341452-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 22:14:27,716 WARNING [train.py:1204] (2/4) Exclude cut with ID _16909_210_5724_1_1531292079912_4118919_300-47574-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 22:14:27,728 WARNING [train.py:1204] (2/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_70-27732-0_sp1.1 from training. Duration: 0.8545625 2023-10-02 22:14:27,816 WARNING [train.py:1204] (2/4) Exclude cut with ID _44718_210_5816_1_1534050051869_6469030_727-28074-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 22:14:29,106 WARNING [train.py:1204] (2/4) Exclude cut with ID _55857_210_2346_1_1534644007831_6898209_357-123664-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 22:14:31,824 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9123W0004-555964 from training. Duration: 0.896 2023-10-02 22:14:31,851 WARNING [train.py:1204] (2/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_445-145208-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 22:14:31,874 WARNING [train.py:1204] (2/4) Exclude cut with ID _3675_210_4557_1_1526639934408_4182089_456-18305-0_sp1.1 from training. Duration: 0.6909375 2023-10-02 22:14:33,080 WARNING [train.py:1204] (2/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_1-99680-0_sp1.1 from training. Duration: 0.6090625 2023-10-02 22:14:34,720 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.bypass.skip_rate, batch_count=1037413.3333333334, ans=0.07 2023-10-02 22:14:35,778 WARNING [train.py:1204] (2/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_146-282400-0_sp0.9 from training. Duration: 0.788875 2023-10-02 22:14:35,797 WARNING [train.py:1204] (2/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_844-96989-0 from training. Duration: 0.71 2023-10-02 22:14:37,269 WARNING [train.py:1204] (2/4) Exclude cut with ID _29853_210_7497_1_1532505600851_6642110_128-146364-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 22:14:37,482 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.conv_module2.balancer1.prob, batch_count=1037413.3333333334, ans=0.125 2023-10-02 22:14:37,490 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.feed_forward1.out_proj.dropout_p, batch_count=1037413.3333333334, ans=0.1 2023-10-02 22:14:37,502 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.feed_forward3.hidden_balancer.prob, batch_count=1037413.3333333334, ans=0.125 2023-10-02 22:14:38,607 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_224-322394-0 from training. Duration: 0.66 2023-10-02 22:14:38,701 WARNING [train.py:1204] (2/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_191-59784-0 from training. Duration: 0.88 2023-10-02 22:14:38,706 WARNING [train.py:1204] (2/4) Exclude cut with ID _12556_210_8341_1_1530081024100_7327690_725-193477-0 from training. Duration: 0.98 2023-10-02 22:14:38,749 WARNING [train.py:1204] (2/4) Exclude cut with ID _5166_210_2084_1_1527073197160_4087993_523-79838-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 22:14:40,144 WARNING [train.py:1204] (2/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_398-334558-0_sp1.1 from training. Duration: 0.87275 2023-10-02 22:14:44,821 WARNING [train.py:1204] (2/4) Exclude cut with ID _6456_210_1534_1_1527681634212_3181049_171-36697-0_sp1.1 from training. Duration: 0.936375 2023-10-02 22:14:46,213 WARNING [train.py:1204] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_700-183549-0 from training. Duration: 0.68 2023-10-02 22:14:46,215 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8853_210_3273_1_1528633899_818968_51-187685-0 from training. Duration: 0.69 2023-10-02 22:14:50,513 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.ff2_skip_rate, batch_count=1037480.0, ans=0.0 2023-10-02 22:14:55,369 WARNING [train.py:1204] (2/4) Exclude cut with ID _7719_210_3525_1_1528456153163_4296960_333-110399-0_sp1.1 from training. Duration: 0.87275 2023-10-02 22:15:00,107 WARNING [train.py:1204] (2/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_1022-83740-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 22:15:00,125 WARNING [train.py:1204] (2/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_61-327062-0_sp0.9 from training. Duration: 0.9 2023-10-02 22:15:00,130 WARNING [train.py:1204] (2/4) Exclude cut with ID _47251_210_18628_1_1533814063128_4137250_214-88076-0_sp1.1 from training. Duration: 0.8 2023-10-02 22:15:01,398 WARNING [train.py:1204] (2/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_839-173017-0 from training. Duration: 0.41 2023-10-02 22:15:03,630 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.3.encoder.layers.3.conv_module1.whiten, num_groups=1, num_channels=512, metric=9.56 vs. limit=15.0 2023-10-02 22:15:04,456 WARNING [train.py:1204] (2/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_299-34413-0_sp1.1 from training. Duration: 0.5909375 2023-10-02 22:15:04,744 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.ff2_skip_rate, batch_count=1037546.6666666666, ans=0.0 2023-10-02 22:15:05,952 WARNING [train.py:1204] (2/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_664-159457-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 22:15:07,354 WARNING [train.py:1204] (2/4) Exclude cut with ID _66873_210_5517_1_1535454136506_4293370_483-291760-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 22:15:08,740 WARNING [train.py:1204] (2/4) Exclude cut with ID _30800_210_22382_1_1532499347904_4515959_287-255474-0_sp1.1 from training. Duration: 0.92725 2023-10-02 22:15:10,084 WARNING [train.py:1204] (2/4) Exclude cut with ID _48677_210_5663_1_1533902405919_3892989_111-11439-0_sp1.1 from training. Duration: 0.87275 2023-10-02 22:15:10,098 WARNING [train.py:1204] (2/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_399-156791-0 from training. Duration: 0.83 2023-10-02 22:15:10,143 WARNING [train.py:1204] (2/4) Exclude cut with ID _3992_210_1747_1_1526644816031_3731060_36-147855-0_sp0.9 from training. Duration: 0.8666875 2023-10-02 22:15:13,406 WARNING [train.py:1204] (2/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_442-320317-0_sp1.1 from training. Duration: 0.7454375 2023-10-02 22:15:13,423 WARNING [train.py:1204] (2/4) Exclude cut with ID _55429_210_10626_1_1534467588527_7174143_953-80868-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 22:15:13,947 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.5.encoder.layers.0.self_attn1.whiten, num_groups=1, num_channels=256, metric=12.28 vs. limit=22.5 2023-10-02 22:15:14,800 WARNING [train.py:1204] (2/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_159-328157-0_sp0.9 from training. Duration: 0.57775 2023-10-02 22:15:16,035 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9309W0197-640324 from training. Duration: 0.896 2023-10-02 22:15:16,235 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.conv_module2.balancer2.min_abs, batch_count=1037546.6666666666, ans=0.5 2023-10-02 22:15:18,784 WARNING [train.py:1204] (2/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_361-30822-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 22:15:19,380 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.5.encoder.layers.0.whiten, num_groups=1, num_channels=256, metric=2.10 vs. limit=12.0 2023-10-02 22:15:22,327 WARNING [train.py:1204] (2/4) Exclude cut with ID _5250_210_5219_1_1527249561806_8692110_636-251213-0 from training. Duration: 0.94 2023-10-02 22:15:28,284 WARNING [train.py:1204] (2/4) Exclude cut with ID _49728_210_12651_1_1534041034294_6788150_468-333827-0_sp1.1 from training. Duration: 0.963625 2023-10-02 22:15:30,061 WARNING [train.py:1204] (2/4) Exclude cut with ID _25882_210_3036_1_1532775665815_6479510_623-191759-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 22:15:30,120 WARNING [train.py:1204] (2/4) Exclude cut with ID _5189_210_4557_1_1527248319428_3769229_199-53526-0 from training. Duration: 0.42 2023-10-02 22:15:32,890 WARNING [train.py:1204] (2/4) Exclude cut with ID _6274_210_2084_1_1527937583837_7494020_32-205288-0_sp0.9 from training. Duration: 0.8666875 2023-10-02 22:15:33,000 WARNING [train.py:1204] (2/4) Exclude cut with ID _65263_210_22356_1_1535763598691_3921037_401-346646-0_sp1.1 from training. Duration: 0.963625 2023-10-02 22:15:33,004 WARNING [train.py:1204] (2/4) Exclude cut with ID _12555_210_8750_1_1530075594402_3780239_449-267986-0_sp1.1 from training. Duration: 0.7545625 2023-10-02 22:15:33,016 WARNING [train.py:1204] (2/4) Exclude cut with ID _69884_210_9771_1_1535846392818_4166169_430-201020-0_sp1.1 from training. Duration: 0.8818125 2023-10-02 22:15:34,333 INFO [train.py:1046] (2/4) Epoch 30, batch 1600, loss[loss=0.144, simple_loss=0.2303, pruned_loss=0.02889, over 24324.00 frames. ], tot_loss[loss=0.1652, simple_loss=0.2428, pruned_loss=0.04381, over 4705146.55 frames. ], batch size: 61, lr: 3.42e-03, grad_scale: 16.0 2023-10-02 22:15:34,413 WARNING [train.py:1204] (2/4) Exclude cut with ID _8394_210_6702_1_1528590504316_3540189_379-49457-0_sp1.1 from training. Duration: 0.7909375 2023-10-02 22:15:34,708 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.conv_skip_rate, batch_count=1037680.0, ans=0.0 2023-10-02 22:15:37,092 WARNING [train.py:1204] (2/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_200-105218-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 22:15:37,153 WARNING [train.py:1204] (2/4) Exclude cut with ID _3867_210_1883_1_1526641200575_4475830_319-349850-0 from training. Duration: 0.67 2023-10-02 22:15:39,741 WARNING [train.py:1204] (2/4) Exclude cut with ID _28317_210_12742_1_1532480302381_8510325_916-195848-0 from training. Duration: 0.92 2023-10-02 22:15:41,196 WARNING [train.py:1204] (2/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_436-186311-0 from training. Duration: 0.65 2023-10-02 22:15:43,944 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_3910_210_1794_1_1526777667_3747260_302-180617-0_sp1.1 from training. Duration: 0.97275 2023-10-02 22:15:45,712 WARNING [train.py:1204] (2/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_102-92751-0 from training. Duration: 0.99 2023-10-02 22:15:45,824 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.1.conv_skip_rate, batch_count=1037680.0, ans=0.0 2023-10-02 22:15:47,029 WARNING [train.py:1204] (2/4) Exclude cut with ID _51899_210_20034_1_1534129254466_3669220_265-215428-0_sp1.1 from training. Duration: 0.8909375 2023-10-02 22:15:49,757 WARNING [train.py:1204] (2/4) Exclude cut with ID _58583_210_5236_1_1534726835266_3574249_438-198993-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 22:15:54,470 WARNING [train.py:1204] (2/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_166-166990-0_sp1.1 from training. Duration: 0.87275 2023-10-02 22:15:57,173 WARNING [train.py:1204] (2/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_907-151398-0 from training. Duration: 0.37 2023-10-02 22:16:00,567 WARNING [train.py:1204] (2/4) Exclude cut with ID _38670_210_15379_1_1533448810659_4391059_452-256669-0_sp1.1 from training. Duration: 0.936375 2023-10-02 22:16:00,607 WARNING [train.py:1204] (2/4) Exclude cut with ID _48677_210_5663_1_1533902405919_3892989_408-40740-0 from training. Duration: 0.83 2023-10-02 22:16:02,428 WARNING [train.py:1204] (2/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_903-158756-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 22:16:02,474 WARNING [train.py:1204] (2/4) Exclude cut with ID _13252_210_1925_1_1530671372047_3336909_397-216497-0 from training. Duration: 0.96 2023-10-02 22:16:07,939 WARNING [train.py:1204] (2/4) Exclude cut with ID _55429_210_10626_1_1534467588527_7174143_97-335084-0 from training. Duration: 0.97 2023-10-02 22:16:14,981 WARNING [train.py:1204] (2/4) Exclude cut with ID _45329_210_7005_1_1534157951211_6774339_453-267730-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 22:16:16,353 WARNING [train.py:1204] (2/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_153-97441-0 from training. Duration: 0.56 2023-10-02 22:16:16,403 WARNING [train.py:1204] (2/4) Exclude cut with ID _40901_210_5665_1_1533363789958_3856130_312-329122-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 22:16:16,450 WARNING [train.py:1204] (2/4) Exclude cut with ID _30800_210_22382_1_1532499347904_4515959_97-126471-0_sp1.1 from training. Duration: 0.97275 2023-10-02 22:16:16,451 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_11305_210_1794_1_1529750428_7178148_257-312870-0_sp1.1 from training. Duration: 0.8 2023-10-02 22:16:17,213 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.1.encoder.layers.0.nonlin_attention.whiten2, num_groups=1, num_channels=256, metric=6.79 vs. limit=15.0 2023-10-02 22:16:19,777 WARNING [train.py:1204] (2/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_454-314901-0_sp0.9 from training. Duration: 0.511125 2023-10-02 22:16:24,346 WARNING [train.py:1204] (2/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_282-136641-0_sp1.1 from training. Duration: 0.3818125 2023-10-02 22:16:25,723 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_46013_210_7234_1_1533776362_3744032_493-160724-0_sp1.1 from training. Duration: 0.963625 2023-10-02 22:16:25,964 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.ff3_skip_rate, batch_count=1037880.0, ans=0.0 2023-10-02 22:16:27,032 WARNING [train.py:1204] (2/4) Exclude cut with ID _67831_210_12392_1_1535612464202_3092612_259-33442-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 22:16:27,092 WARNING [train.py:1204] (2/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_289-62384-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 22:16:27,143 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6545_210_2040_1_1527774670_4245258_379-14182-0_sp0.9 from training. Duration: 0.888875 2023-10-02 22:16:27,343 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.ff2_skip_rate, batch_count=1037880.0, ans=0.0 2023-10-02 22:16:28,557 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_548-294316-0_sp1.1 from training. Duration: 0.736375 2023-10-02 22:16:30,397 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.561e+02 1.857e+02 2.011e+02 2.284e+02 3.695e+02, threshold=4.022e+02, percent-clipped=0.0 2023-10-02 22:16:31,844 WARNING [train.py:1204] (2/4) Exclude cut with ID _6275_210_2084_1_1528110062213_7669049_857-298942-0_sp1.1 from training. Duration: 0.77275 2023-10-02 22:16:33,317 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6673_210_3273_1_1527767790_3173240_327-306185-0_sp1.1 from training. Duration: 0.7545625 2023-10-02 22:16:38,040 WARNING [train.py:1204] (2/4) Exclude cut with ID _61008_210_10633_1_1534852769198_6974370_814-107647-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 22:16:39,413 WARNING [train.py:1204] (2/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_116-332008-0_sp1.1 from training. Duration: 0.8909375 2023-10-02 22:16:40,906 WARNING [train.py:1204] (2/4) Exclude cut with ID _32704_210_8954_1_1533017488840_6747370_266-329755-0 from training. Duration: 0.89 2023-10-02 22:16:40,909 WARNING [train.py:1204] (2/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_271-269084-0_sp0.9 from training. Duration: 0.92225 2023-10-02 22:16:41,080 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.bypass_mid.scale_min, batch_count=1037946.6666666666, ans=0.2 2023-10-02 22:16:42,240 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_3171_210_2087_1_1526109084_2330007_66-260583-0 from training. Duration: 0.72 2023-10-02 22:16:46,835 WARNING [train.py:1204] (2/4) Exclude cut with ID _5250_210_5219_1_1527249561806_8692110_1326-276386-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 22:16:48,076 INFO [train.py:1046] (2/4) Epoch 30, batch 1650, loss[loss=0.1616, simple_loss=0.2382, pruned_loss=0.04246, over 23389.00 frames. ], tot_loss[loss=0.1657, simple_loss=0.2438, pruned_loss=0.0438, over 4694763.50 frames. ], batch size: 93, lr: 3.42e-03, grad_scale: 16.0 2023-10-02 22:16:49,466 WARNING [train.py:1204] (2/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_372-30302-0_sp1.1 from training. Duration: 0.7818125 2023-10-02 22:16:49,534 WARNING [train.py:1204] (2/4) Exclude cut with ID _25882_210_3036_1_1532775665815_6479510_631-257029-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 22:16:49,548 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_3691_210_3528_1_1526468169_4062053_355-334432-0 from training. Duration: 0.87 2023-10-02 22:16:50,828 WARNING [train.py:1204] (2/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_839-173017-0_sp1.1 from training. Duration: 0.37275 2023-10-02 22:16:50,836 WARNING [train.py:1204] (2/4) Exclude cut with ID _14998_210_5219_1_1530705582080_7474970_441-91466-0 from training. Duration: 0.97 2023-10-02 22:16:50,881 WARNING [train.py:1204] (2/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_108-53929-0 from training. Duration: 0.59 2023-10-02 22:16:52,409 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.conv_module1.balancer1.max_abs, batch_count=1038013.3333333334, ans=10.0 2023-10-02 22:16:56,833 WARNING [train.py:1204] (2/4) Exclude cut with ID _9389_210_1664_1_1529217337301_5289269_260-1942-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 22:16:56,894 WARNING [train.py:1204] (2/4) Exclude cut with ID _56423_210_5854_1_1534896061049_3165969_198-61547-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 22:16:58,235 WARNING [train.py:1204] (2/4) Exclude cut with ID _44778_210_2054_1_1533798127203_7443090_412-142122-0_sp1.1 from training. Duration: 0.92725 2023-10-02 22:16:58,252 WARNING [train.py:1204] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_88-341718-0_sp1.1 from training. Duration: 0.6 2023-10-02 22:16:59,685 WARNING [train.py:1204] (2/4) Exclude cut with ID _61079_210_4525_1_1534921008198_4011419_334-298697-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 22:17:01,074 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_5465_210_4211_1_1527227253_55357_8-2499-0_sp1.1 from training. Duration: 0.996375 2023-10-02 22:17:03,174 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_447-201565-0_sp1.1 from training. Duration: 0.97275 2023-10-02 22:17:04,442 WARNING [train.py:1204] (2/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_253-161789-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 22:17:04,445 WARNING [train.py:1204] (2/4) Exclude cut with ID _44304_210_15374_1_1533639352765_4613129_302-22084-0_sp0.9 from training. Duration: 0.9555625 2023-10-02 22:17:04,460 WARNING [train.py:1204] (2/4) Exclude cut with ID _26664_210_7770_1_1532258943280_3418270_187-80815-0_sp1.1 from training. Duration: 0.8454375 2023-10-02 22:17:04,536 WARNING [train.py:1204] (2/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_68-212801-0 from training. Duration: 0.73 2023-10-02 22:17:06,371 WARNING [train.py:1204] (2/4) Exclude cut with ID _29345_210_4381_1_1532689168263_3860169_149-257268-0 from training. Duration: 0.81 2023-10-02 22:17:12,038 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_213-103282-0_sp1.1 from training. Duration: 0.7090625 2023-10-02 22:17:14,721 WARNING [train.py:1204] (2/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_156-302675-0_sp1.1 from training. Duration: 0.5 2023-10-02 22:17:23,611 WARNING [train.py:1204] (2/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_125-322172-0 from training. Duration: 0.93 2023-10-02 22:17:23,670 WARNING [train.py:1204] (2/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_493-60474-0_sp1.1 from training. Duration: 0.963625 2023-10-02 22:17:25,257 WARNING [train.py:1204] (2/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_156-302675-0 from training. Duration: 0.55 2023-10-02 22:17:25,399 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.0.feed_forward1.out_proj.dropout_p, batch_count=1038146.6666666666, ans=0.1 2023-10-02 22:17:28,624 WARNING [train.py:1204] (2/4) Exclude cut with ID _41314_210_12651_1_1533459476543_3766460_123-267862-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 22:17:31,223 WARNING [train.py:1204] (2/4) Exclude cut with ID _28427_210_4220_1_1532581240967_3061069_201-143747-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 22:17:31,247 WARNING [train.py:1204] (2/4) Exclude cut with ID _8772_210_1850_1_1528635595972_3664011_96-12756-0_sp1.1 from training. Duration: 0.8909375 2023-10-02 22:17:32,626 WARNING [train.py:1204] (2/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_1347-145675-0_sp1.1 from training. Duration: 0.936375 2023-10-02 22:17:32,740 WARNING [train.py:1204] (2/4) Exclude cut with ID _9235_210_5219_1_1528891154253_8046120_387-159008-0_sp1.1 from training. Duration: 0.7818125 2023-10-02 22:17:32,760 WARNING [train.py:1204] (2/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_400-201896-0_sp1.1 from training. Duration: 0.963625 2023-10-02 22:17:36,179 WARNING [train.py:1204] (2/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_326-174612-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 22:17:36,217 WARNING [train.py:1204] (2/4) Exclude cut with ID _55844_210_2346_1_1534471212569_7625707_390-116233-0_sp1.1 from training. Duration: 0.963625 2023-10-02 22:17:37,562 WARNING [train.py:1204] (2/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_419-317062-0_sp1.1 from training. Duration: 0.82725 2023-10-02 22:17:38,729 WARNING [train.py:1204] (2/4) Exclude cut with ID _29877_210_6228_1_1532397791106_5276149_266-62291-0_sp0.9 from training. Duration: 0.9666875 2023-10-02 22:17:38,813 WARNING [train.py:1204] (2/4) Exclude cut with ID _7857_210_1534_1_1528286515745_6831470_431-338528-0_sp1.1 from training. Duration: 0.92725 2023-10-02 22:17:40,126 WARNING [train.py:1204] (2/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_87-131427-0_sp0.9 from training. Duration: 0.8666875 2023-10-02 22:17:44,166 WARNING [train.py:1204] (2/4) Exclude cut with ID _24055_210_12651_1_1532310907143_976979_92-59356-0_sp1.1 from training. Duration: 0.82725 2023-10-02 22:17:44,238 WARNING [train.py:1204] (2/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_96-290039-0 from training. Duration: 0.56 2023-10-02 22:17:45,681 WARNING [train.py:1204] (2/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_48-182155-0_sp0.9 from training. Duration: 0.9666875 2023-10-02 22:17:46,984 WARNING [train.py:1204] (2/4) Exclude cut with ID _4626_210_3532_1_1526817652717_3916859_446-206656-0 from training. Duration: 0.76 2023-10-02 22:17:47,077 WARNING [train.py:1204] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_168-32906-0 from training. Duration: 0.69 2023-10-02 22:17:47,102 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_3780_210_2087_1_1526467760_2130483_170-259044-0 from training. Duration: 0.7 2023-10-02 22:17:47,113 WARNING [train.py:1204] (2/4) Exclude cut with ID _65134_210_14763_1_1535335393985_3505368_153-216718-0_sp1.1 from training. Duration: 0.92725 2023-10-02 22:17:48,476 WARNING [train.py:1204] (2/4) Exclude cut with ID _64639_210_5816_1_1535367619437_3528000_461-329156-0_sp1.1 from training. Duration: 0.87275 2023-10-02 22:17:48,508 WARNING [train.py:1204] (2/4) Exclude cut with ID _21989_210_5854_1_1531899214931_3271360_126-347531-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 22:17:50,419 WARNING [train.py:1204] (2/4) Exclude cut with ID _7614_210_4915_1_1529326764092_4537359_96-162197-0_sp1.1 from training. Duration: 0.963625 2023-10-02 22:17:50,420 WARNING [train.py:1204] (2/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_1083-147536-0 from training. Duration: 0.97 2023-10-02 22:17:53,156 WARNING [train.py:1204] (2/4) Exclude cut with ID _64029_210_4381_1_1535178502772_3756459_275-16710-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 22:17:56,356 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7264_210_4938_1_1527939911_8132095_800-91995-0_sp0.9 from training. Duration: 0.9555625 2023-10-02 22:17:56,391 WARNING [train.py:1204] (2/4) Exclude cut with ID _3844_210_1390_1_1526727803582_2871209_50-193879-0_sp1.1 from training. Duration: 0.936375 2023-10-02 22:17:59,177 WARNING [train.py:1204] (2/4) Exclude cut with ID _46952_210_5986_1_1534230010357_3107379_232-344503-0 from training. Duration: 0.96 2023-10-02 22:18:01,892 INFO [train.py:1046] (2/4) Epoch 30, batch 1700, loss[loss=0.1613, simple_loss=0.2529, pruned_loss=0.03482, over 24560.00 frames. ], tot_loss[loss=0.1656, simple_loss=0.2434, pruned_loss=0.04387, over 4702970.15 frames. ], batch size: 71, lr: 3.42e-03, grad_scale: 16.0 2023-10-02 22:18:03,378 WARNING [train.py:1204] (2/4) Exclude cut with ID _25808_210_13292_1_1532343514562_7366109_206-14076-0_sp1.1 from training. Duration: 0.936375 2023-10-02 22:18:03,378 WARNING [train.py:1204] (2/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_55-31904-0_sp0.9 from training. Duration: 0.97775 2023-10-02 22:18:03,405 WARNING [train.py:1204] (2/4) Exclude cut with ID _14632_210_9988_1_1530691279392_3923200_161-129530-0 from training. Duration: 0.96 2023-10-02 22:18:05,094 WARNING [train.py:1204] (2/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_770-200430-0_sp1.1 from training. Duration: 0.8090625 2023-10-02 22:18:05,111 WARNING [train.py:1204] (2/4) Exclude cut with ID _6270_210_2084_1_1527850911573_3856420_26-277343-0_sp1.1 from training. Duration: 0.7454375 2023-10-02 22:18:05,112 WARNING [train.py:1204] (2/4) Exclude cut with ID _69884_210_9771_1_1535846392818_4166169_88-23776-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 22:18:07,871 WARNING [train.py:1204] (2/4) Exclude cut with ID _60860_210_8254_1_1534896015875_3557169_245-256666-0_sp1.1 from training. Duration: 0.8818125 2023-10-02 22:18:07,902 WARNING [train.py:1204] (2/4) Exclude cut with ID _5250_210_5219_1_1527249561806_8692110_636-251213-0_sp1.1 from training. Duration: 0.8545625 2023-10-02 22:18:07,918 WARNING [train.py:1204] (2/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_540-131164-0 from training. Duration: 0.92 2023-10-02 22:18:11,156 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_5931_210_2040_1_1527428481_4547999_321-193614-0_sp0.9 from training. Duration: 0.7444375 2023-10-02 22:18:11,367 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.conv_module1.balancer1.prob, batch_count=1038346.6666666666, ans=0.125 2023-10-02 22:18:14,016 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.conv_skip_rate, batch_count=1038346.6666666666, ans=0.0 2023-10-02 22:18:19,825 WARNING [train.py:1204] (2/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_373-225403-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 22:18:21,463 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.nonlin_attention.balancer.prob, batch_count=1038413.3333333334, ans=0.125 2023-10-02 22:18:22,486 WARNING [train.py:1204] (2/4) Exclude cut with ID _20622_210_3852_1_1531622427080_1093839_57-326027-0_sp1.1 from training. Duration: 0.97275 2023-10-02 22:18:27,961 WARNING [train.py:1204] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_605-257205-0_sp1.1 from training. Duration: 0.536375 2023-10-02 22:18:27,983 WARNING [train.py:1204] (2/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_442-320317-0_sp0.9 from training. Duration: 0.911125 2023-10-02 22:18:28,027 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_35361_210_4238_1_1533262810_7793476_675-87012-0_sp1.1 from training. Duration: 0.8090625 2023-10-02 22:18:30,050 WARNING [train.py:1204] (2/4) Exclude cut with ID _4184_210_2550_1_1526730192013_2219380_306-44736-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 22:18:31,378 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_124-113562-0 from training. Duration: 0.94 2023-10-02 22:18:32,526 INFO [scaling.py:1022] (2/4) Whitening: name=encoder_embed.out_whiten, num_groups=1, num_channels=192, metric=6.81 vs. limit=8.0 2023-10-02 22:18:32,824 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_2821_210_2715_1_1526108385_3763560_73-296673-0_sp0.9 from training. Duration: 0.92225 2023-10-02 22:18:32,839 WARNING [train.py:1204] (2/4) Exclude cut with ID _10301_210_5886_1_1529222275332_3910620_216-46959-0_sp1.1 from training. Duration: 0.92725 2023-10-02 22:18:33,137 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.conv_module2.balancer1.prob, batch_count=1038480.0, ans=0.125 2023-10-02 22:18:34,343 WARNING [train.py:1204] (2/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_91-277684-0_sp0.9 from training. Duration: 0.82225 2023-10-02 22:18:35,661 WARNING [train.py:1204] (2/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_351-294650-0_sp0.9 from training. Duration: 0.7 2023-10-02 22:18:39,033 WARNING [train.py:1204] (2/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_209-205365-0 from training. Duration: 0.79 2023-10-02 22:18:39,098 WARNING [train.py:1204] (2/4) Exclude cut with ID _14603_210_4220_1_1530615688441_3097319_97-325053-0 from training. Duration: 0.92 2023-10-02 22:18:41,627 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_49348_210_7927_1_1533954500_3691300_109-276430-0_sp1.1 from training. Duration: 0.92725 2023-10-02 22:18:43,011 WARNING [train.py:1204] (2/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_912-47473-0 from training. Duration: 0.68 2023-10-02 22:18:43,085 WARNING [train.py:1204] (2/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_96-320736-0_sp0.9 from training. Duration: 0.9555625 2023-10-02 22:18:50,633 WARNING [train.py:1204] (2/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_1252-235481-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 22:18:51,914 WARNING [train.py:1204] (2/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_813-261554-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 22:18:53,268 WARNING [train.py:1204] (2/4) Exclude cut with ID _21632_210_2253_1_1531994001765_3744140_110-274654-0_sp0.9 from training. Duration: 0.911125 2023-10-02 22:18:53,500 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.ff2_skip_rate, batch_count=1038546.6666666666, ans=0.0 2023-10-02 22:18:54,690 WARNING [train.py:1204] (2/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_381-317791-0_sp1.1 from training. Duration: 0.663625 2023-10-02 22:18:54,702 WARNING [train.py:1204] (2/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_51-117576-0_sp1.1 from training. Duration: 0.363625 2023-10-02 22:18:54,718 WARNING [train.py:1204] (2/4) Exclude cut with ID _42321_210_7770_1_1533468631352_4366895_104-215915-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 22:18:55,305 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.4.encoder.layers.1.conv_module2.whiten, num_groups=1, num_channels=384, metric=2.78 vs. limit=15.0 2023-10-02 22:18:57,824 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.602e+02 1.894e+02 2.111e+02 2.412e+02 3.601e+02, threshold=4.223e+02, percent-clipped=0.0 2023-10-02 22:18:57,953 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_40896_210_15710_1_1533433956_7100802_216-22824-0_sp1.1 from training. Duration: 0.92725 2023-10-02 22:18:57,954 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_14831_210_7927_1_1530700200_5583610_181-27467-0 from training. Duration: 0.91 2023-10-02 22:18:58,011 WARNING [train.py:1204] (2/4) Exclude cut with ID _30304_210_6319_1_1532476827261_7582060_1024-265645-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 22:18:58,013 WARNING [train.py:1204] (2/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_412-233904-0_sp1.1 from training. Duration: 0.936375 2023-10-02 22:18:59,310 WARNING [train.py:1204] (2/4) Exclude cut with ID _30191_210_1664_1_1533189915047_7196670_911-223461-0_sp1.1 from training. Duration: 0.92725 2023-10-02 22:18:59,318 WARNING [train.py:1204] (2/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_558-148266-0_sp1.1 from training. Duration: 0.963625 2023-10-02 22:19:00,801 WARNING [train.py:1204] (2/4) Exclude cut with ID _58480_210_12392_1_1534811121111_3646160_212-164495-0_sp1.1 from training. Duration: 0.936375 2023-10-02 22:19:00,802 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_454-276429-0_sp0.9 from training. Duration: 0.9666875 2023-10-02 22:19:02,250 WARNING [train.py:1204] (2/4) Exclude cut with ID _24736_210_3279_1_1532332894218_2838700_73-197086-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 22:19:02,299 WARNING [train.py:1204] (2/4) Exclude cut with ID _5294_210_1747_1_1527249643928_3696179_529-242298-0_sp1.1 from training. Duration: 0.863625 2023-10-02 22:19:02,327 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_53555_210_5632_1_1534595220_7232297_345-171438-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 22:19:07,001 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_28211_210_6388_1_1532861506_4523224_22-25766-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 22:19:08,896 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7002_210_1794_1_1527939683_5157286_163-87552-0 from training. Duration: 0.86 2023-10-02 22:19:10,245 WARNING [train.py:1204] (2/4) Exclude cut with ID _56927_210_9969_1_1534848970840_3695229_409-225129-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 22:19:11,790 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.bypass.skip_rate, batch_count=1038613.3333333334, ans=0.04949747468305833 2023-10-02 22:19:13,030 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_26192_210_7927_1_1532137269_1040115_21-219331-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 22:19:14,409 WARNING [train.py:1204] (2/4) Exclude cut with ID _15012_210_1968_1_1530856820429_5095350_662-210469-0 from training. Duration: 0.57 2023-10-02 22:19:15,694 INFO [train.py:1046] (2/4) Epoch 30, batch 1750, loss[loss=0.15, simple_loss=0.2218, pruned_loss=0.03913, over 23336.00 frames. ], tot_loss[loss=0.1641, simple_loss=0.2414, pruned_loss=0.04342, over 4688874.79 frames. ], batch size: 134, lr: 3.42e-03, grad_scale: 16.0 2023-10-02 22:19:21,162 WARNING [train.py:1204] (2/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_582-191016-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 22:19:22,572 WARNING [train.py:1204] (2/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_270-210032-0_sp1.1 from training. Duration: 0.963625 2023-10-02 22:19:22,604 WARNING [train.py:1204] (2/4) Exclude cut with ID _6789_210_1534_1_1527857937218_3118950_262-48853-0_sp0.9 from training. Duration: 0.62225 2023-10-02 22:19:23,116 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.4.encoder.layers.1.conv_module2.whiten, num_groups=1, num_channels=384, metric=4.00 vs. limit=15.0 2023-10-02 22:19:24,376 WARNING [train.py:1204] (2/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_847-41334-0 from training. Duration: 0.44 2023-10-02 22:19:24,407 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_40896_210_15710_1_1533433956_7100802_573-46532-0_sp1.1 from training. Duration: 0.92725 2023-10-02 22:19:26,445 WARNING [train.py:1204] (2/4) Exclude cut with ID _21881_210_3525_1_1531825242757_3798380_247-102591-0_sp1.1 from training. Duration: 0.8909375 2023-10-02 22:19:27,742 WARNING [train.py:1204] (2/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_200-21395-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 22:19:31,848 WARNING [train.py:1204] (2/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_711-15688-0 from training. Duration: 0.43 2023-10-02 22:19:34,531 WARNING [train.py:1204] (2/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_53-75025-0_sp1.1 from training. Duration: 0.963625 2023-10-02 22:19:36,035 WARNING [train.py:1204] (2/4) Exclude cut with ID _6194_210_1534_1_1527511826160_3941194_321-148641-0 from training. Duration: 0.64 2023-10-02 22:19:36,062 WARNING [train.py:1204] (2/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_337-294431-0_sp1.1 from training. Duration: 0.936375 2023-10-02 22:19:37,429 WARNING [train.py:1204] (2/4) Exclude cut with ID _21632_210_2253_1_1531994001765_3744140_110-274654-0_sp1.1 from training. Duration: 0.7454375 2023-10-02 22:19:40,776 WARNING [train.py:1204] (2/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_135-174080-0_sp1.1 from training. Duration: 0.4818125 2023-10-02 22:19:40,873 WARNING [train.py:1204] (2/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_21-174514-0 from training. Duration: 0.43 2023-10-02 22:19:42,427 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.bypass_mid.scale_min, batch_count=1038746.6666666666, ans=0.2 2023-10-02 22:19:43,547 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_38965_210_4852_1_1533174906_3484735_215-294589-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 22:19:43,578 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_548-294316-0 from training. Duration: 0.81 2023-10-02 22:19:51,126 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_103-84010-0_sp1.1 from training. Duration: 0.62725 2023-10-02 22:19:52,508 WARNING [train.py:1204] (2/4) Exclude cut with ID _18199_210_6109_1_1531475789864_4288790_295-198247-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 22:19:52,536 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_13757_210_3528_1_1530440682_4107239_103-313224-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 22:19:54,057 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.conv_module1.balancer2.prob, batch_count=1038813.3333333334, ans=0.125 2023-10-02 22:19:55,340 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_34886_210_10633_1_1533203882_1755661_67-294673-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 22:19:55,970 WARNING [train.py:1204] (2/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_350-101734-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 22:19:58,741 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_37357_210_17024_1_1533089504_2316121_204-153986-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 22:19:58,859 WARNING [train.py:1204] (2/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_673-115569-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 22:20:01,466 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_489-230703-0_sp0.9 from training. Duration: 0.9444375 2023-10-02 22:20:01,526 WARNING [train.py:1204] (2/4) Exclude cut with ID _5250_210_5219_1_1527249561806_8692110_575-102496-0_sp1.1 from training. Duration: 0.936375 2023-10-02 22:20:01,602 WARNING [train.py:1204] (2/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_669-93163-0 from training. Duration: 0.98 2023-10-02 22:20:02,969 WARNING [train.py:1204] (2/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_249-281349-0_sp0.9 from training. Duration: 0.9444375 2023-10-02 22:20:06,183 WARNING [train.py:1204] (2/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_106-30782-0 from training. Duration: 0.8 2023-10-02 22:20:06,255 WARNING [train.py:1204] (2/4) Exclude cut with ID _8772_210_1850_1_1528635595972_3664011_398-320049-0_sp1.1 from training. Duration: 0.8 2023-10-02 22:20:08,105 WARNING [train.py:1204] (2/4) Exclude cut with ID _44778_210_2054_1_1533798127203_7443090_516-151727-0_sp1.1 from training. Duration: 0.963625 2023-10-02 22:20:08,165 WARNING [train.py:1204] (2/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_325-62802-0_sp1.1 from training. Duration: 0.7909375 2023-10-02 22:20:13,492 WARNING [train.py:1204] (2/4) Exclude cut with ID _14368_210_4220_1_1530532714870_3595529_93-58002-0_sp0.9 from training. Duration: 0.6333125 2023-10-02 22:20:13,524 WARNING [train.py:1204] (2/4) Exclude cut with ID _4681_210_3525_1_1527251040321_1235713_123-24428-0_sp1.1 from training. Duration: 0.436375 2023-10-02 22:20:15,439 WARNING [train.py:1204] (2/4) Exclude cut with ID _5250_210_5219_1_1527249561806_8692110_1185-56473-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 22:20:16,880 WARNING [train.py:1204] (2/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_96-244299-0_sp1.1 from training. Duration: 0.8 2023-10-02 22:20:19,731 WARNING [train.py:1204] (2/4) Exclude cut with ID _59500_210_8254_1_1534809610680_3736009_104-72815-0_sp1.1 from training. Duration: 0.963625 2023-10-02 22:20:22,631 WARNING [train.py:1204] (2/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_468-283113-0_sp1.1 from training. Duration: 0.87275 2023-10-02 22:20:24,007 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6239_210_4211_1_1527765584_4306698_528-102186-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 22:20:25,894 WARNING [train.py:1204] (2/4) Exclude cut with ID _45297_210_2721_1_1533715351381_3264949_351-147136-0 from training. Duration: 0.87 2023-10-02 22:20:25,898 WARNING [train.py:1204] (2/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_520-51744-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 22:20:27,292 WARNING [train.py:1204] (2/4) Exclude cut with ID _5166_210_2084_1_1527073197160_4087993_310-114452-0_sp1.1 from training. Duration: 0.536375 2023-10-02 22:20:27,296 WARNING [train.py:1204] (2/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_450-165710-0_sp1.1 from training. Duration: 0.92725 2023-10-02 22:20:27,314 WARNING [train.py:1204] (2/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_280-120942-0_sp1.1 from training. Duration: 0.7 2023-10-02 22:20:27,342 WARNING [train.py:1204] (2/4) Exclude cut with ID _65585_210_12210_1_1535450451584_3764098_112-294301-0_sp0.9 from training. Duration: 0.97775 2023-10-02 22:20:28,679 WARNING [train.py:1204] (2/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_98-51676-0_sp0.9 from training. Duration: 0.811125 2023-10-02 22:20:30,047 INFO [train.py:1046] (2/4) Epoch 30, batch 1800, loss[loss=0.1732, simple_loss=0.2583, pruned_loss=0.04401, over 24454.00 frames. ], tot_loss[loss=0.1633, simple_loss=0.241, pruned_loss=0.04275, over 4706844.26 frames. ], batch size: 66, lr: 3.42e-03, grad_scale: 16.0 2023-10-02 22:20:31,406 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_33348_210_5632_1_1533086912_3324030_85-28583-0_sp1.1 from training. Duration: 0.7454375 2023-10-02 22:20:31,492 WARNING [train.py:1204] (2/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_336-208232-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 22:20:32,854 WARNING [train.py:1204] (2/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_339-104791-0_sp0.9 from training. Duration: 0.5444375 2023-10-02 22:20:33,091 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=1039013.3333333334, ans=0.1 2023-10-02 22:20:34,327 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.1.feed_forward1.out_proj.dropout_p, batch_count=1039013.3333333334, ans=0.1 2023-10-02 22:20:37,412 WARNING [train.py:1204] (2/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_559-29840-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 22:20:42,039 WARNING [train.py:1204] (2/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_117-290916-0_sp0.9 from training. Duration: 0.5666875 2023-10-02 22:20:43,415 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_43802_210_2290_1_1533516203_8829504_185-112289-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 22:20:44,917 WARNING [train.py:1204] (2/4) Exclude cut with ID _59256_210_5986_1_1535076085997_3589120_155-298797-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 22:20:48,252 WARNING [train.py:1204] (2/4) Exclude cut with ID _44659_210_2721_1_1534036120061_6614860_246-206531-0_sp1.1 from training. Duration: 0.92725 2023-10-02 22:20:48,293 WARNING [train.py:1204] (2/4) Exclude cut with ID _23818_210_6228_1_1531917063884_3676920_469-151944-0_sp1.1 from training. Duration: 0.92725 2023-10-02 22:20:49,651 WARNING [train.py:1204] (2/4) Exclude cut with ID _13252_210_1925_1_1530671372047_3336909_88-276668-0_sp1.1 from training. Duration: 0.9 2023-10-02 22:20:51,042 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_15101_210_7927_1_1530786415_5033274_320-327527-0_sp1.1 from training. Duration: 0.87275 2023-10-02 22:20:52,332 WARNING [train.py:1204] (2/4) Exclude cut with ID _14998_210_5219_1_1530705582080_7474970_446-112117-0 from training. Duration: 0.9 2023-10-02 22:20:53,781 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_30280_210_1881_1_1532872779_3660139_400-254159-0_sp1.1 from training. Duration: 0.936375 2023-10-02 22:20:56,491 WARNING [train.py:1204] (2/4) Exclude cut with ID _40284_210_2084_1_1533513679814_3704279_158-345341-0_sp1.1 from training. Duration: 0.936375 2023-10-02 22:21:02,551 WARNING [train.py:1204] (2/4) Exclude cut with ID 3033-130750-0096-55598-0 from training. Duration: 0.83 2023-10-02 22:21:03,945 WARNING [train.py:1204] (2/4) Exclude cut with ID _9389_210_1664_1_1529217337301_5289269_379-128794-0 from training. Duration: 0.85 2023-10-02 22:21:03,966 WARNING [train.py:1204] (2/4) Exclude cut with ID _3675_210_4557_1_1526639934408_4182089_456-18305-0 from training. Duration: 0.76 2023-10-02 22:21:03,993 WARNING [train.py:1204] (2/4) Exclude cut with ID _29853_210_7497_1_1532505600851_6642110_571-249460-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 22:21:06,664 WARNING [train.py:1204] (2/4) Exclude cut with ID _4068_210_2629_1_1526697350395_1546259_230-325568-0_sp1.1 from training. Duration: 0.92725 2023-10-02 22:21:06,665 WARNING [train.py:1204] (2/4) Exclude cut with ID _34415_210_8750_1_1533085213375_3451116_213-267570-0_sp1.1 from training. Duration: 0.963625 2023-10-02 22:21:06,735 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6767_210_3273_1_1527855783_1718932_142-127132-0_sp1.1 from training. Duration: 0.863625 2023-10-02 22:21:09,649 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.nonlin_attention.balancer.prob, batch_count=1039146.6666666666, ans=0.125 2023-10-02 22:21:12,805 WARNING [train.py:1204] (2/4) Exclude cut with ID IC0653W0001-322925 from training. Duration: 0.896 2023-10-02 22:21:14,689 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_4528_210_3273_1_1527076654_3276777_209-219804-0_sp1.1 from training. Duration: 0.62725 2023-10-02 22:21:17,315 WARNING [train.py:1204] (2/4) Exclude cut with ID _55844_210_2346_1_1534471212569_7625707_187-204971-0_sp1.1 from training. Duration: 0.936375 2023-10-02 22:21:17,586 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.conv_skip_rate, batch_count=1039213.3333333334, ans=0.0 2023-10-02 22:21:19,314 WARNING [train.py:1204] (2/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_369-325184-0 from training. Duration: 0.99 2023-10-02 22:21:19,346 WARNING [train.py:1204] (2/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_791-45149-0 from training. Duration: 0.86 2023-10-02 22:21:20,682 WARNING [train.py:1204] (2/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_317-211107-0_sp1.1 from training. Duration: 0.836375 2023-10-02 22:21:22,081 WARNING [train.py:1204] (2/4) Exclude cut with ID _30800_210_22382_1_1532499347904_4515959_488-330913-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 22:21:22,175 WARNING [train.py:1204] (2/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_506-42614-0_sp1.1 from training. Duration: 0.7181875 2023-10-02 22:21:22,710 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.5.encoder.layers.1.self_attn2.whiten, num_groups=1, num_channels=256, metric=12.39 vs. limit=22.5 2023-10-02 22:21:26,131 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.627e+02 1.833e+02 1.992e+02 2.212e+02 2.839e+02, threshold=3.984e+02, percent-clipped=0.0 2023-10-02 22:21:26,253 WARNING [train.py:1204] (2/4) Exclude cut with ID _8696_210_1876_1_1528545632440_3345058_209-54069-0 from training. Duration: 0.67 2023-10-02 22:21:32,562 WARNING [train.py:1204] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_215-202973-0_sp1.1 from training. Duration: 0.8 2023-10-02 22:21:33,891 WARNING [train.py:1204] (2/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_321-215742-0 from training. Duration: 0.5 2023-10-02 22:21:33,948 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_40896_210_15710_1_1533433956_7100802_428-32114-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 22:21:33,964 WARNING [train.py:1204] (2/4) Exclude cut with ID _27013_210_1389_1_1532437173240_3252649_467-263151-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 22:21:34,004 WARNING [train.py:1204] (2/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_734-129761-0_sp0.9 from training. Duration: 0.911125 2023-10-02 22:21:35,286 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_521-214276-0 from training. Duration: 0.44 2023-10-02 22:21:38,212 WARNING [train.py:1204] (2/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_80-147301-0_sp1.1 from training. Duration: 0.763625 2023-10-02 22:21:38,214 WARNING [train.py:1204] (2/4) Exclude cut with ID _9679_210_7497_1_1529129212906_4363929_56-307796-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 22:21:40,880 WARNING [train.py:1204] (2/4) Exclude cut with ID _14832_210_8750_1_1530691351539_3171348_1-54315-0 from training. Duration: 0.87 2023-10-02 22:21:40,881 WARNING [train.py:1204] (2/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_558-155750-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 22:21:41,135 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.bypass_mid.scale_min, batch_count=1039280.0, ans=0.2 2023-10-02 22:21:43,963 INFO [train.py:1046] (2/4) Epoch 30, batch 1850, loss[loss=0.1809, simple_loss=0.2699, pruned_loss=0.04591, over 24325.00 frames. ], tot_loss[loss=0.1634, simple_loss=0.2412, pruned_loss=0.04284, over 4708575.53 frames. ], batch size: 74, lr: 3.41e-03, grad_scale: 16.0 2023-10-02 22:21:44,023 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_142-309370-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 22:21:44,063 WARNING [train.py:1204] (2/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_18-46756-0_sp0.9 from training. Duration: 0.87775 2023-10-02 22:21:44,071 WARNING [train.py:1204] (2/4) Exclude cut with ID _61148_210_6897_1_1535094022726_4217610_279-212159-0_sp1.1 from training. Duration: 0.936375 2023-10-02 22:21:46,033 WARNING [train.py:1204] (2/4) Exclude cut with ID _21987_210_5854_1_1531821500482_3637038_164-2641-0_sp1.1 from training. Duration: 0.936375 2023-10-02 22:21:46,078 WARNING [train.py:1204] (2/4) Exclude cut with ID _8064_210_5399_1_1528372299291_4282973_395-340852-0_sp1.1 from training. Duration: 0.8454375 2023-10-02 22:21:48,828 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1077-184261-0_sp1.1 from training. Duration: 0.963625 2023-10-02 22:21:48,835 WARNING [train.py:1204] (2/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_1176-204992-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 22:21:52,073 WARNING [train.py:1204] (2/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_563-347832-0_sp1.1 from training. Duration: 0.8090625 2023-10-02 22:21:53,356 WARNING [train.py:1204] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_380-204756-0_sp0.9 from training. Duration: 0.9555625 2023-10-02 22:21:53,562 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.0.feed_forward3.hidden_balancer.prob, batch_count=1039346.6666666666, ans=0.125 2023-10-02 22:22:00,342 WARNING [train.py:1204] (2/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_212-10501-0_sp1.1 from training. Duration: 0.8545625 2023-10-02 22:22:00,373 WARNING [train.py:1204] (2/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_445-117398-0 from training. Duration: 0.89 2023-10-02 22:22:02,464 WARNING [train.py:1204] (2/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_29-280622-0 from training. Duration: 0.82 2023-10-02 22:22:05,214 WARNING [train.py:1204] (2/4) Exclude cut with ID _26664_210_7770_1_1532258943280_3418270_187-80815-0 from training. Duration: 0.93 2023-10-02 22:22:06,796 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.attention_skip_rate, batch_count=1039413.3333333334, ans=0.0 2023-10-02 22:22:06,880 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.ff2_skip_rate, batch_count=1039413.3333333334, ans=0.0 2023-10-02 22:22:07,916 WARNING [train.py:1204] (2/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_414-349640-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 22:22:07,945 WARNING [train.py:1204] (2/4) Exclude cut with ID _45021_210_9994_1_1533619751094_3330288_96-239010-0 from training. Duration: 0.91 2023-10-02 22:22:07,948 WARNING [train.py:1204] (2/4) Exclude cut with ID _5189_210_4557_1_1527248319428_3769229_199-53526-0_sp0.9 from training. Duration: 0.4666875 2023-10-02 22:22:19,826 WARNING [train.py:1204] (2/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_63-101996-0_sp1.1 from training. Duration: 0.8 2023-10-02 22:22:21,771 WARNING [train.py:1204] (2/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_199-86432-0 from training. Duration: 0.98 2023-10-02 22:22:24,552 WARNING [train.py:1204] (2/4) Exclude cut with ID _36399_210_16049_1_1532998771377_3698333_207-259013-0_sp1.1 from training. Duration: 0.92725 2023-10-02 22:22:26,018 WARNING [train.py:1204] (2/4) Exclude cut with ID _62574_210_2655_1_1535180407058_3933249_567-111756-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 22:22:28,826 WARNING [train.py:1204] (2/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_436-318113-0 from training. Duration: 0.75 2023-10-02 22:22:30,123 WARNING [train.py:1204] (2/4) Exclude cut with ID _53379_210_20034_1_1534222027363_3854654_43-305214-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 22:22:30,155 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_15126_210_6753_1_1530778677_2591185_113-157651-0_sp1.1 from training. Duration: 0.6181875 2023-10-02 22:22:31,547 WARNING [train.py:1204] (2/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_146-218763-0_sp1.1 from training. Duration: 0.7818125 2023-10-02 22:22:34,799 WARNING [train.py:1204] (2/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_714-310538-0_sp0.9 from training. Duration: 0.9555625 2023-10-02 22:22:36,383 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.attention_skip_rate, batch_count=1039546.6666666666, ans=0.0 2023-10-02 22:22:37,618 WARNING [train.py:1204] (2/4) Exclude cut with ID _24271_210_12658_1_1531987270022_7190519_709-16721-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 22:22:37,847 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.balancer1.prob, batch_count=1039546.6666666666, ans=0.125 2023-10-02 22:22:40,390 WARNING [train.py:1204] (2/4) Exclude cut with ID _5190_210_4557_1_1527244368799_373439_28-197030-0_sp0.9 from training. Duration: 0.8 2023-10-02 22:22:40,416 WARNING [train.py:1204] (2/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_267-197536-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 22:22:41,786 WARNING [train.py:1204] (2/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_658-282610-0_sp1.1 from training. Duration: 0.4818125 2023-10-02 22:22:41,795 WARNING [train.py:1204] (2/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_257-67050-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 22:22:42,096 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=1039613.3333333334, ans=0.1 2023-10-02 22:22:43,178 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_14906_210_7927_1_1530754190_9408633_961-53402-0_sp1.1 from training. Duration: 0.936375 2023-10-02 22:22:44,555 WARNING [train.py:1204] (2/4) Exclude cut with ID _60113_210_16271_1_1535164212481_4386079_490-280290-0_sp1.1 from training. Duration: 0.8909375 2023-10-02 22:22:45,991 WARNING [train.py:1204] (2/4) Exclude cut with ID _14100_210_8341_1_1530529320613_3678069_258-224599-0 from training. Duration: 0.77 2023-10-02 22:22:46,050 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6961_210_2087_1_1527937168_6815011_565-326898-0_sp1.1 from training. Duration: 0.936375 2023-10-02 22:22:50,454 WARNING [train.py:1204] (2/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_239-316802-0_sp1.1 from training. Duration: 0.62725 2023-10-02 22:22:50,583 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.bypass_mid.scale_min, batch_count=1039613.3333333334, ans=0.2 2023-10-02 22:22:52,204 WARNING [train.py:1204] (2/4) Exclude cut with ID _55857_210_2346_1_1534644007831_6898209_745-98391-0_sp1.1 from training. Duration: 0.6909375 2023-10-02 22:22:52,207 WARNING [train.py:1204] (2/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_154-135696-0 from training. Duration: 0.8 2023-10-02 22:22:52,209 WARNING [train.py:1204] (2/4) Exclude cut with ID _14368_210_4220_1_1530532714870_3595529_93-58002-0 from training. Duration: 0.57 2023-10-02 22:22:53,666 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9025W0005-507335 from training. Duration: 0.896 2023-10-02 22:22:55,585 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9029W0034-509435 from training. Duration: 0.64 2023-10-02 22:22:55,692 WARNING [train.py:1204] (2/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_76-203867-0_sp1.1 from training. Duration: 0.6545625 2023-10-02 22:22:56,928 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_46639_210_8341_1_1533726088_4495500_66-346325-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 22:22:56,935 WARNING [train.py:1204] (2/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_145-137790-0_sp0.9 from training. Duration: 0.92225 2023-10-02 22:22:56,964 WARNING [train.py:1204] (2/4) Exclude cut with ID _36522_210_8887_1_1533177030611_1772478_7-327299-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 22:22:58,251 INFO [train.py:1046] (2/4) Epoch 30, batch 1900, loss[loss=0.1707, simple_loss=0.2492, pruned_loss=0.04609, over 23368.00 frames. ], tot_loss[loss=0.1637, simple_loss=0.2416, pruned_loss=0.04294, over 4706976.73 frames. ], batch size: 93, lr: 3.41e-03, grad_scale: 16.0 2023-10-02 22:22:58,316 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9042W0009-515615 from training. Duration: 0.896 2023-10-02 22:22:58,325 WARNING [train.py:1204] (2/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_39-248870-0_sp0.9 from training. Duration: 0.7666875 2023-10-02 22:22:58,372 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_35441_210_16554_1_1533347572_4092680_362-242912-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 22:22:58,550 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.conv_module1.balancer2.min_abs, batch_count=1039680.0, ans=0.5 2023-10-02 22:22:59,806 WARNING [train.py:1204] (2/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_216-343007-0_sp1.1 from training. Duration: 0.6 2023-10-02 22:23:01,186 WARNING [train.py:1204] (2/4) Exclude cut with ID _3867_210_1883_1_1526641200575_4475830_272-215563-0_sp0.9 from training. Duration: 0.6333125 2023-10-02 22:23:01,277 WARNING [train.py:1204] (2/4) Exclude cut with ID _21783_210_11357_1_1531742458730_3762679_553-175603-0_sp1.1 from training. Duration: 0.92725 2023-10-02 22:23:02,450 WARNING [train.py:1204] (2/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_963-187109-0 from training. Duration: 0.99 2023-10-02 22:23:06,007 WARNING [train.py:1204] (2/4) Exclude cut with ID _44973_210_1511_1_1533794381163_3634770_495-211466-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 22:23:06,018 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9068W0002-529085 from training. Duration: 0.896 2023-10-02 22:23:06,031 WARNING [train.py:1204] (2/4) Exclude cut with ID _5294_210_1747_1_1527249643928_3696179_180-100713-0_sp1.1 from training. Duration: 0.736375 2023-10-02 22:23:07,379 WARNING [train.py:1204] (2/4) Exclude cut with ID _35799_210_5854_1_1533263487887_3529071_149-49689-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 22:23:10,241 WARNING [train.py:1204] (2/4) Exclude cut with ID _67725_210_27767_1_1535608756671_5105359_36-220794-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 22:23:11,849 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.conv_module2.balancer2.prob, batch_count=1039746.6666666666, ans=0.125 2023-10-02 22:23:12,926 WARNING [train.py:1204] (2/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_338-96520-0_sp1.1 from training. Duration: 0.8545625 2023-10-02 22:23:13,009 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9104W0004-547160 from training. Duration: 0.896 2023-10-02 22:23:14,315 WARNING [train.py:1204] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_330-327861-0 from training. Duration: 0.67 2023-10-02 22:23:15,674 WARNING [train.py:1204] (2/4) Exclude cut with ID _14087_210_2253_1_1530529224512_3014029_156-257607-0_sp0.9 from training. Duration: 0.92225 2023-10-02 22:23:17,145 WARNING [train.py:1204] (2/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_476-322050-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 22:23:17,154 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9118W0017-553385 from training. Duration: 0.896 2023-10-02 22:23:17,191 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9120W0028-554435 from training. Duration: 0.896 2023-10-02 22:23:20,603 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.conv_module2.balancer1.prob, batch_count=1039746.6666666666, ans=0.125 2023-10-02 22:23:21,840 WARNING [train.py:1204] (2/4) Exclude cut with ID _56524_210_9642_1_1534572011779_3619170_145-310842-0 from training. Duration: 0.99 2023-10-02 22:23:24,912 WARNING [train.py:1204] (2/4) Exclude cut with ID _51631_210_8254_1_1534129228420_4084794_270-222508-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 22:23:27,024 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.0.conv_skip_rate, batch_count=1039813.3333333334, ans=0.0 2023-10-02 22:23:28,256 WARNING [train.py:1204] (2/4) Exclude cut with ID _14268_210_9206_1_1530622771285_4714099_331-105351-0 from training. Duration: 0.65 2023-10-02 22:23:29,780 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=1039813.3333333334, ans=0.1 2023-10-02 22:23:30,917 WARNING [train.py:1204] (2/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_40-129756-0 from training. Duration: 0.81 2023-10-02 22:23:39,718 WARNING [train.py:1204] (2/4) Exclude cut with ID _3541_210_2477_1_1526475408709_3263973_231-104736-0 from training. Duration: 0.78 2023-10-02 22:23:41,149 WARNING [train.py:1204] (2/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_214-291369-0 from training. Duration: 0.63 2023-10-02 22:23:41,161 WARNING [train.py:1204] (2/4) Exclude cut with ID _62574_210_2655_1_1535180407058_3933249_369-84347-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 22:23:41,209 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9210W0111-599240 from training. Duration: 0.896 2023-10-02 22:23:41,213 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_92-246270-0 from training. Duration: 0.85 2023-10-02 22:23:42,577 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_37152_210_9697_1_1533106573_3844188_29-239070-0 from training. Duration: 0.98 2023-10-02 22:23:42,621 WARNING [train.py:1204] (2/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_239-22076-0 from training. Duration: 0.85 2023-10-02 22:23:42,623 WARNING [train.py:1204] (2/4) Exclude cut with ID _65962_210_29927_1_1535436026519_4174930_368-44639-0_sp1.1 from training. Duration: 0.97275 2023-10-02 22:23:45,527 WARNING [train.py:1204] (2/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_152-212533-0 from training. Duration: 0.97 2023-10-02 22:23:50,088 WARNING [train.py:1204] (2/4) Exclude cut with ID _14603_210_4220_1_1530615688441_3097319_90-31383-0_sp1.1 from training. Duration: 0.7909375 2023-10-02 22:23:51,604 WARNING [train.py:1204] (2/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_196-152868-0_sp1.1 from training. Duration: 0.92725 2023-10-02 22:23:51,605 WARNING [train.py:1204] (2/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_140-181176-0 from training. Duration: 0.92 2023-10-02 22:23:54,895 WARNING [train.py:1204] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_168-32906-0_sp0.9 from training. Duration: 0.7666875 2023-10-02 22:23:56,205 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.538e+02 1.917e+02 2.185e+02 2.667e+02 3.803e+02, threshold=4.369e+02, percent-clipped=0.0 2023-10-02 22:23:57,681 WARNING [train.py:1204] (2/4) Exclude cut with ID _14922_210_6228_1_1530707435273_3693310_13-165512-0 from training. Duration: 0.82 2023-10-02 22:23:57,722 WARNING [train.py:1204] (2/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_381-317791-0_sp0.9 from training. Duration: 0.811125 2023-10-02 22:24:05,794 WARNING [train.py:1204] (2/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_63-267585-0_sp1.1 from training. Duration: 0.8181875 2023-10-02 22:24:05,796 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_326-303945-0_sp1.1 from training. Duration: 0.9 2023-10-02 22:24:05,815 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_194-143381-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 22:24:07,160 WARNING [train.py:1204] (2/4) Exclude cut with ID _39290_210_4220_1_1533862778760_3075118_194-16667-0_sp1.1 from training. Duration: 0.963625 2023-10-02 22:24:07,247 WARNING [train.py:1204] (2/4) Exclude cut with ID _15012_210_1968_1_1530856820429_5095350_662-210469-0_sp0.9 from training. Duration: 0.6333125 2023-10-02 22:24:08,519 WARNING [train.py:1204] (2/4) Exclude cut with ID _4697_210_2069_1_1526815801953_3823098_68-219807-0_sp1.1 from training. Duration: 0.42725 2023-10-02 22:24:08,591 WARNING [train.py:1204] (2/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_321-22821-0_sp0.9 from training. Duration: 0.988875 2023-10-02 22:24:14,270 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_14056_210_4163_1_1530604483_7572827_1037-45317-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 22:24:14,271 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_403-39619-0_sp1.1 from training. Duration: 0.763625 2023-10-02 22:24:15,600 INFO [train.py:1046] (2/4) Epoch 30, batch 1950, loss[loss=0.1562, simple_loss=0.2447, pruned_loss=0.03384, over 24637.00 frames. ], tot_loss[loss=0.1648, simple_loss=0.2428, pruned_loss=0.04342, over 4709410.11 frames. ], batch size: 68, lr: 3.41e-03, grad_scale: 8.0 2023-10-02 22:24:15,771 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_20120_210_6753_1_1531643035_5482859_510-96996-0_sp0.9 from training. Duration: 0.9666875 2023-10-02 22:24:15,774 WARNING [train.py:1204] (2/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_225-180155-0_sp1.1 from training. Duration: 0.92725 2023-10-02 22:24:17,094 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_278-272612-0_sp0.9 from training. Duration: 0.811125 2023-10-02 22:24:19,724 WARNING [train.py:1204] (2/4) Exclude cut with ID _53713_210_24116_1_1534314487762_7416751_691-70244-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 22:24:21,180 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6788_210_1388_1_1527898698_4562021_91-197981-0_sp1.1 from training. Duration: 0.7545625 2023-10-02 22:24:24,590 WARNING [train.py:1204] (2/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_60-18123-0_sp0.9 from training. Duration: 0.97775 2023-10-02 22:24:24,629 WARNING [train.py:1204] (2/4) Exclude cut with ID _35799_210_5854_1_1533263487887_3529071_5-214617-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 22:24:24,633 WARNING [train.py:1204] (2/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_890-5117-0_sp1.1 from training. Duration: 0.7090625 2023-10-02 22:24:27,936 WARNING [train.py:1204] (2/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_660-211088-0 from training. Duration: 0.62 2023-10-02 22:24:28,006 WARNING [train.py:1204] (2/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_314-126819-0_sp0.9 from training. Duration: 0.5555625 2023-10-02 22:24:28,037 WARNING [train.py:1204] (2/4) Exclude cut with ID _21632_210_2253_1_1531994001765_3744140_362-66225-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 22:24:29,365 WARNING [train.py:1204] (2/4) Exclude cut with ID _61118_210_9527_1_1534897668884_3752630_173-258555-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 22:24:32,678 WARNING [train.py:1204] (2/4) Exclude cut with ID _7719_210_3525_1_1528456153163_4296960_88-250259-0_sp0.9 from training. Duration: 0.9333125 2023-10-02 22:24:32,716 WARNING [train.py:1204] (2/4) Exclude cut with ID _15140_210_9288_1_1530791388450_4880149_539-153708-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 22:24:32,747 WARNING [train.py:1204] (2/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_253-42544-0_sp1.1 from training. Duration: 0.97275 2023-10-02 22:24:35,486 WARNING [train.py:1204] (2/4) Exclude cut with ID _33640_210_2655_1_1533117605479_4855469_479-296877-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 22:24:37,517 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.self_attn_weights.pos_emb_skip_rate, batch_count=1040080.0, ans=0.0 2023-10-02 22:24:38,765 WARNING [train.py:1204] (2/4) Exclude cut with ID 3033-130750-0096-55598-0_sp1.1 from training. Duration: 0.7545625 2023-10-02 22:24:38,775 WARNING [train.py:1204] (2/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_191-41023-0_sp1.1 from training. Duration: 0.6454375 2023-10-02 22:24:40,149 WARNING [train.py:1204] (2/4) Exclude cut with ID _8365_210_2064_1_1528592311070_4677690_431-294102-0_sp1.1 from training. Duration: 0.8090625 2023-10-02 22:24:40,189 WARNING [train.py:1204] (2/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_893-296030-0_sp1.1 from training. Duration: 0.97275 2023-10-02 22:24:42,858 WARNING [train.py:1204] (2/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_401-343820-0_sp1.1 from training. Duration: 0.97275 2023-10-02 22:24:43,470 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.3.encoder.layers.1.feed_forward1.out_whiten, num_groups=1, num_channels=512, metric=10.39 vs. limit=15.0 2023-10-02 22:24:44,381 WARNING [train.py:1204] (2/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_127-566-0_sp1.1 from training. Duration: 0.763625 2023-10-02 22:24:44,382 WARNING [train.py:1204] (2/4) Exclude cut with ID _14100_210_8341_1_1530529320613_3678069_126-272847-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 22:24:44,395 WARNING [train.py:1204] (2/4) Exclude cut with ID _15133_210_6228_1_1530794512074_3080379_29-264458-0_sp1.1 from training. Duration: 0.52725 2023-10-02 22:24:44,395 WARNING [train.py:1204] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_127-119109-0 from training. Duration: 0.72 2023-10-02 22:24:45,774 WARNING [train.py:1204] (2/4) Exclude cut with ID _6537_210_1883_1_1527987559710_3597129_382-133062-0_sp1.1 from training. Duration: 0.6181875 2023-10-02 22:24:45,791 WARNING [train.py:1204] (2/4) Exclude cut with ID _29529_210_7234_1_1532393961455_3977460_391-219682-0_sp1.1 from training. Duration: 0.936375 2023-10-02 22:24:47,084 WARNING [train.py:1204] (2/4) Exclude cut with ID _37537_210_2253_1_1533171758568_3421949_96-137189-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 22:24:49,803 WARNING [train.py:1204] (2/4) Exclude cut with ID _58414_210_8254_1_1535421609162_7151182_15-276301-0_sp1.1 from training. Duration: 0.97275 2023-10-02 22:24:51,252 WARNING [train.py:1204] (2/4) Exclude cut with ID _59502_210_12210_1_1535107024698_1835139_110-289074-0_sp1.1 from training. Duration: 0.92725 2023-10-02 22:24:54,588 WARNING [train.py:1204] (2/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_367-61330-0_sp1.1 from training. Duration: 0.6818125 2023-10-02 22:24:57,856 WARNING [train.py:1204] (2/4) Exclude cut with ID _45297_210_2721_1_1533715351381_3264949_353-9541-0_sp1.1 from training. Duration: 0.8818125 2023-10-02 22:24:59,169 WARNING [train.py:1204] (2/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_85-138257-0_sp0.9 from training. Duration: 0.788875 2023-10-02 22:24:59,212 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_3787_210_2040_1_1526477544_5478586_603-120324-0 from training. Duration: 0.81 2023-10-02 22:24:59,257 WARNING [train.py:1204] (2/4) Exclude cut with ID _42863_210_9070_1_1533902398788_3696920_411-204131-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 22:24:59,455 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.feed_forward2.hidden_balancer.prob, batch_count=1040213.3333333334, ans=0.125 2023-10-02 22:25:02,477 WARNING [train.py:1204] (2/4) Exclude cut with ID _30196_210_1664_1_1533708496522_7074140_822-289909-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 22:25:04,372 WARNING [train.py:1204] (2/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_32-335103-0_sp0.9 from training. Duration: 0.911125 2023-10-02 22:25:05,732 WARNING [train.py:1204] (2/4) Exclude cut with ID _6493_210_1747_1_1528459210424_4012470_513-140967-0_sp0.9 from training. Duration: 0.87775 2023-10-02 22:25:13,851 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_47896_210_20508_1_1534492518_5958868_439-257766-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 22:25:13,922 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_1294-63152-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 22:25:16,654 WARNING [train.py:1204] (2/4) Exclude cut with ID _59170_210_6721_1_1534983633960_4203335_319-166512-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 22:25:18,071 WARNING [train.py:1204] (2/4) Exclude cut with ID _3541_210_2477_1_1526475408709_3263973_146-3470-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 22:25:20,785 WARNING [train.py:1204] (2/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_678-159307-0_sp1.1 from training. Duration: 0.77275 2023-10-02 22:25:22,102 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_343-48822-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 22:25:22,155 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_4692_210_3528_1_1526899755_4323254_107-44401-0 from training. Duration: 0.86 2023-10-02 22:25:22,159 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6291_210_3273_1_1527683649_1078498_74-292608-0_sp1.1 from training. Duration: 0.6545625 2023-10-02 22:25:23,514 WARNING [train.py:1204] (2/4) Exclude cut with ID _55644_210_11856_1_1534593599582_4103719_293-248694-0_sp1.1 from training. Duration: 0.97275 2023-10-02 22:25:23,739 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.conv_module2.balancer1.prob, batch_count=1040280.0, ans=0.125 2023-10-02 22:25:25,269 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_3172_210_2087_1_1526292929_3987865_197-97762-0 from training. Duration: 0.75 2023-10-02 22:25:26,790 WARNING [train.py:1204] (2/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_264-348896-0_sp0.9 from training. Duration: 0.9444375 2023-10-02 22:25:29,857 INFO [train.py:1046] (2/4) Epoch 30, batch 2000, loss[loss=0.164, simple_loss=0.2442, pruned_loss=0.04193, over 23368.00 frames. ], tot_loss[loss=0.1656, simple_loss=0.2433, pruned_loss=0.04395, over 4704099.71 frames. ], batch size: 119, lr: 3.41e-03, grad_scale: 16.0 2023-10-02 22:25:29,992 WARNING [train.py:1204] (2/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_209-205365-0_sp0.9 from training. Duration: 0.87775 2023-10-02 22:25:31,448 WARNING [train.py:1204] (2/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_222-198127-0_sp0.9 from training. Duration: 0.9333125 2023-10-02 22:25:32,127 WARNING [train.py:1204] (2/4) Exclude cut with ID _57572_210_5517_1_1534921219624_3809484_52-43530-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 22:25:33,398 WARNING [train.py:1204] (2/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_96-320736-0_sp1.1 from training. Duration: 0.7818125 2023-10-02 22:25:33,626 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.bypass.skip_rate, batch_count=1040346.6666666666, ans=0.07 2023-10-02 22:25:36,164 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_13091_210_7925_1_1530609447_8987354_1159-225508-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 22:25:36,296 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.1.attention_skip_rate, batch_count=1040346.6666666666, ans=0.0 2023-10-02 22:25:40,362 WARNING [train.py:1204] (2/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_237-44332-0 from training. Duration: 0.94 2023-10-02 22:25:40,406 WARNING [train.py:1204] (2/4) Exclude cut with ID _7970_210_1883_1_1528455616296_3472230_25-178407-0_sp0.9 from training. Duration: 0.788875 2023-10-02 22:25:42,074 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.nonlin_attention.balancer.prob, batch_count=1040346.6666666666, ans=0.125 2023-10-02 22:25:43,267 WARNING [train.py:1204] (2/4) Exclude cut with ID _29003_210_19596_1_1532507391720_3894098_357-257062-0_sp1.1 from training. Duration: 0.936375 2023-10-02 22:25:44,856 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_809-17156-0 from training. Duration: 0.73 2023-10-02 22:25:46,191 WARNING [train.py:1204] (2/4) Exclude cut with ID _7160_210_1876_1_1527940923231_3703799_302-150728-0_sp1.1 from training. Duration: 0.7090625 2023-10-02 22:25:47,459 WARNING [train.py:1204] (2/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_444-28041-0_sp0.9 from training. Duration: 0.9444375 2023-10-02 22:25:49,022 WARNING [train.py:1204] (2/4) Exclude cut with ID _52892_210_6721_1_1534378941259_4124129_278-251666-0_sp1.1 from training. Duration: 0.92725 2023-10-02 22:25:51,700 WARNING [train.py:1204] (2/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_115-326778-0 from training. Duration: 0.93 2023-10-02 22:25:53,038 WARNING [train.py:1204] (2/4) Exclude cut with ID _41391_210_7994_1_1533603771872_3638201_31-188961-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 22:25:54,421 WARNING [train.py:1204] (2/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_1073-6264-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 22:25:55,682 WARNING [train.py:1204] (2/4) Exclude cut with ID _66868_210_9527_1_1535509287954_4561140_435-259515-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 22:25:55,770 WARNING [train.py:1204] (2/4) Exclude cut with ID _14886_210_4220_1_1530702020355_3516190_159-73748-0 from training. Duration: 0.83 2023-10-02 22:25:57,106 WARNING [train.py:1204] (2/4) Exclude cut with ID _15150_210_8341_1_1530752411803_7256384_469-239509-0_sp1.1 from training. Duration: 0.6181875 2023-10-02 22:25:59,069 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.conv_module1.balancer1.prob, batch_count=1040480.0, ans=0.125 2023-10-02 22:26:00,143 WARNING [train.py:1204] (2/4) Exclude cut with ID _8064_210_5399_1_1528372299291_4282973_395-340852-0 from training. Duration: 0.93 2023-10-02 22:26:00,147 WARNING [train.py:1204] (2/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_138-187031-0_sp1.1 from training. Duration: 0.9 2023-10-02 22:26:01,713 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_13757_210_3528_1_1530440682_4107239_48-106347-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 22:26:05,136 WARNING [train.py:1204] (2/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_164-224052-0_sp1.1 from training. Duration: 0.636375 2023-10-02 22:26:05,140 WARNING [train.py:1204] (2/4) Exclude cut with ID _59288_210_5854_1_1535077757774_3412246_408-322648-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 22:26:05,185 WARNING [train.py:1204] (2/4) Exclude cut with ID _30191_210_1664_1_1533189915047_7196670_827-36359-0_sp1.1 from training. Duration: 0.963625 2023-10-02 22:26:06,512 WARNING [train.py:1204] (2/4) Exclude cut with ID _4680_210_3525_1_1527249757953_893386_75-78979-0_sp1.1 from training. Duration: 0.82725 2023-10-02 22:26:08,357 WARNING [train.py:1204] (2/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_383-165234-0 from training. Duration: 0.94 2023-10-02 22:26:09,878 WARNING [train.py:1204] (2/4) Exclude cut with ID _19896_210_1883_1_1531709022662_6649569_94-229927-0 from training. Duration: 0.97 2023-10-02 22:26:11,260 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6788_210_1388_1_1527898698_4562021_180-75288-0_sp1.1 from training. Duration: 0.9 2023-10-02 22:26:11,274 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_26433_210_12108_1_1532157158_4056480_54-321349-0_sp1.1 from training. Duration: 0.97275 2023-10-02 22:26:15,458 WARNING [train.py:1204] (2/4) Exclude cut with ID _56108_210_16380_1_1534505446159_3887959_265-161493-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 22:26:16,871 WARNING [train.py:1204] (2/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_395-111602-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 22:26:16,876 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_154-316450-0_sp0.9 from training. Duration: 0.8444375 2023-10-02 22:26:18,213 WARNING [train.py:1204] (2/4) Exclude cut with ID _43552_210_8913_1_1533726031059_3523429_381-116041-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 22:26:18,340 WARNING [train.py:1204] (2/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_65-104124-0_sp1.1 from training. Duration: 0.963625 2023-10-02 22:26:19,751 WARNING [train.py:1204] (2/4) Exclude cut with ID _6536_210_1883_1_1527850783339_3319069_169-23811-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 22:26:19,788 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_289-180904-0_sp0.9 from training. Duration: 0.8444375 2023-10-02 22:26:19,806 WARNING [train.py:1204] (2/4) Exclude cut with ID _56406_210_9288_1_1534899618664_3496149_226-242400-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 22:26:22,505 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_24899_210_16554_1_1532046422_3827462_276-216330-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 22:26:25,155 WARNING [train.py:1204] (2/4) Exclude cut with ID _29345_210_4381_1_1532689168263_3860169_198-174328-0_sp1.1 from training. Duration: 0.82725 2023-10-02 22:26:25,882 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.2.encoder.layers.0.self_attn_weights.whiten_keys, num_groups=4, num_channels=128, metric=4.36 vs. limit=6.0 2023-10-02 22:26:26,526 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.535e+02 2.011e+02 2.210e+02 2.489e+02 4.189e+02, threshold=4.420e+02, percent-clipped=0.0 2023-10-02 22:26:26,636 WARNING [train.py:1204] (2/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_338-96520-0 from training. Duration: 0.94 2023-10-02 22:26:31,355 WARNING [train.py:1204] (2/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_7-111954-0_sp1.1 from training. Duration: 0.5090625 2023-10-02 22:26:32,738 WARNING [train.py:1204] (2/4) Exclude cut with ID _36692_210_5665_1_1533608967420_398809_8-16261-0_sp1.1 from training. Duration: 0.97275 2023-10-02 22:26:34,599 WARNING [train.py:1204] (2/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_86-212657-0_sp1.1 from training. Duration: 0.97275 2023-10-02 22:26:34,606 WARNING [train.py:1204] (2/4) Exclude cut with ID _44659_210_2721_1_1534036120061_6614860_626-298884-0_sp1.1 from training. Duration: 0.8818125 2023-10-02 22:26:39,928 WARNING [train.py:1204] (2/4) Exclude cut with ID _49755_210_18053_1_1534581005846_3218709_120-257918-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 22:26:41,292 WARNING [train.py:1204] (2/4) Exclude cut with ID _21881_210_3525_1_1531825242757_3798380_400-113821-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 22:26:41,295 WARNING [train.py:1204] (2/4) Exclude cut with ID _21783_210_11357_1_1531742458730_3762679_387-236773-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 22:26:41,361 WARNING [train.py:1204] (2/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_613-232856-0_sp1.1 from training. Duration: 0.4909375 2023-10-02 22:26:41,380 WARNING [train.py:1204] (2/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_181-78632-0_sp0.9 from training. Duration: 0.8666875 2023-10-02 22:26:43,988 INFO [train.py:1046] (2/4) Epoch 30, batch 2050, loss[loss=0.1837, simple_loss=0.2657, pruned_loss=0.05088, over 24341.00 frames. ], tot_loss[loss=0.1655, simple_loss=0.2432, pruned_loss=0.04387, over 4693909.30 frames. ], batch size: 77, lr: 3.41e-03, grad_scale: 16.0 2023-10-02 22:26:44,115 WARNING [train.py:1204] (2/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_175-17485-0_sp1.1 from training. Duration: 0.97275 2023-10-02 22:26:45,428 WARNING [train.py:1204] (2/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_1004-183422-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 22:26:48,151 WARNING [train.py:1204] (2/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_32-65962-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 22:26:49,520 WARNING [train.py:1204] (2/4) Exclude cut with ID _15458_210_8341_1_1530858614263_3834410_40-298664-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 22:26:52,436 WARNING [train.py:1204] (2/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_380-261863-0_sp1.1 from training. Duration: 0.963625 2023-10-02 22:26:53,826 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_372-173012-0_sp1.1 from training. Duration: 0.863625 2023-10-02 22:26:53,874 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_57-82205-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 22:26:55,231 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_3691_210_3528_1_1526468169_4062053_355-334432-0_sp0.9 from training. Duration: 0.9666875 2023-10-02 22:26:55,426 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.feed_forward3.hidden_balancer.prob, batch_count=1040680.0, ans=0.125 2023-10-02 22:26:56,788 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.attention_skip_rate, batch_count=1040746.6666666666, ans=0.0 2023-10-02 22:26:58,289 WARNING [train.py:1204] (2/4) Exclude cut with ID _19023_210_4353_1_1531378892552_5820780_28-36615-0 from training. Duration: 0.97 2023-10-02 22:26:58,301 WARNING [train.py:1204] (2/4) Exclude cut with ID _29853_210_7497_1_1532505600851_6642110_286-251333-0_sp1.1 from training. Duration: 0.92725 2023-10-02 22:26:59,632 WARNING [train.py:1204] (2/4) Exclude cut with ID _31602_210_5854_1_1532677021456_4458250_518-80679-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 22:27:00,884 WARNING [train.py:1204] (2/4) Exclude cut with ID _14922_210_6228_1_1530707435273_3693310_38-187851-0_sp0.9 from training. Duration: 0.688875 2023-10-02 22:27:10,400 WARNING [train.py:1204] (2/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_39-248870-0_sp1.1 from training. Duration: 0.62725 2023-10-02 22:27:11,719 WARNING [train.py:1204] (2/4) Exclude cut with ID _16908_210_5724_1_1531119157248_3772700_154-31455-0_sp1.1 from training. Duration: 0.97275 2023-10-02 22:27:13,208 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8453_210_4211_1_1528502940_8562730_347-330673-0 from training. Duration: 0.72 2023-10-02 22:27:13,968 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.3.encoder.layers.0.feed_forward2.out_whiten, num_groups=1, num_channels=512, metric=4.30 vs. limit=15.0 2023-10-02 22:27:15,857 WARNING [train.py:1204] (2/4) Exclude cut with ID _30191_210_1664_1_1533189915047_7196670_674-204247-0_sp1.1 from training. Duration: 0.97275 2023-10-02 22:27:15,952 WARNING [train.py:1204] (2/4) Exclude cut with ID _8833_210_5219_1_1528592665210_7553059_329-125156-0 from training. Duration: 0.82 2023-10-02 22:27:15,984 WARNING [train.py:1204] (2/4) Exclude cut with ID _4779_210_2084_1_1527318022944_3914893_500-285013-0_sp1.1 from training. Duration: 0.62725 2023-10-02 22:27:18,841 WARNING [train.py:1204] (2/4) Exclude cut with ID _14421_210_8750_1_1530604795825_3542290_190-313578-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 22:27:20,310 WARNING [train.py:1204] (2/4) Exclude cut with ID _35799_210_5854_1_1533263487887_3529071_538-273574-0_sp1.1 from training. Duration: 0.936375 2023-10-02 22:27:21,636 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_14928_210_5632_1_1530773461_4714000_454-112198-0_sp1.1 from training. Duration: 0.6 2023-10-02 22:27:23,007 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_468-143126-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 22:27:24,349 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_4528_210_3273_1_1527076654_3276777_258-121681-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 22:27:25,680 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_397-132565-0_sp1.1 from training. Duration: 0.9 2023-10-02 22:27:25,699 WARNING [train.py:1204] (2/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_454-123002-0_sp0.9 from training. Duration: 0.7666875 2023-10-02 22:27:27,788 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder_embed.conv.5.prob, batch_count=1040880.0, ans=0.125 2023-10-02 22:27:29,108 WARNING [train.py:1204] (2/4) Exclude cut with ID _27052_210_5986_1_1532329197187_3585009_297-86890-0_sp1.1 from training. Duration: 0.936375 2023-10-02 22:27:32,400 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_51225_210_3601_1_1534400634_2327383_55-301844-0_sp1.1 from training. Duration: 0.8454375 2023-10-02 22:27:33,914 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6786_210_1388_1_1527839699_4194495_378-46070-0_sp0.9 from training. Duration: 0.8 2023-10-02 22:27:34,014 WARNING [train.py:1204] (2/4) Exclude cut with ID _42334_210_7770_1_1534206610396_3605880_55-19758-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 22:27:35,691 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.out_combiner.scale_min, batch_count=1040880.0, ans=0.2 2023-10-02 22:27:38,598 WARNING [train.py:1204] (2/4) Exclude cut with ID _14884_210_2252_1_1530707459689_5561250_114-201399-0_sp0.9 from training. Duration: 0.8444375 2023-10-02 22:27:42,874 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_99-251314-0_sp0.9 from training. Duration: 0.9666875 2023-10-02 22:27:44,254 WARNING [train.py:1204] (2/4) Exclude cut with ID _26528_210_9070_1_1532570410842_3851310_497-97218-0 from training. Duration: 0.86 2023-10-02 22:27:48,476 WARNING [train.py:1204] (2/4) Exclude cut with ID _55328_210_27767_1_1534580973260_4088170_230-317241-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 22:27:49,781 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_125-203105-0_sp0.9 from training. Duration: 0.888875 2023-10-02 22:27:52,349 WARNING [train.py:1204] (2/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_55-31904-0_sp1.1 from training. Duration: 0.8 2023-10-02 22:27:53,823 WARNING [train.py:1204] (2/4) Exclude cut with ID _8064_210_5399_1_1528372299291_4282973_18-174647-0 from training. Duration: 0.91 2023-10-02 22:27:56,639 INFO [train.py:1046] (2/4) Epoch 30, batch 2100, loss[loss=0.1635, simple_loss=0.2394, pruned_loss=0.04384, over 23460.00 frames. ], tot_loss[loss=0.1642, simple_loss=0.2416, pruned_loss=0.04339, over 4698109.89 frames. ], batch size: 134, lr: 3.41e-03, grad_scale: 16.0 2023-10-02 22:27:56,721 WARNING [train.py:1204] (2/4) Exclude cut with ID IC0165W0005-81726 from training. Duration: 0.952 2023-10-02 22:27:56,722 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_31583_210_3852_1_1532595851_4024413_148-171847-0_sp1.1 from training. Duration: 0.963625 2023-10-02 22:27:56,775 WARNING [train.py:1204] (2/4) Exclude cut with ID _21989_210_5854_1_1531899214931_3271360_116-138769-0_sp1.1 from training. Duration: 0.936375 2023-10-02 22:27:59,104 WARNING [train.py:1204] (2/4) Exclude cut with ID _66070_210_7640_1_1535441393096_7805310_489-292631-0_sp1.1 from training. Duration: 0.7181875 2023-10-02 22:28:00,369 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_20120_210_6753_1_1531643035_5482859_202-27960-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 22:28:00,377 WARNING [train.py:1204] (2/4) Exclude cut with ID _5294_210_1747_1_1527249643928_3696179_342-270278-0 from training. Duration: 0.55 2023-10-02 22:28:00,410 WARNING [train.py:1204] (2/4) Exclude cut with ID _38323_210_12108_1_1533121233369_3303049_416-11919-0 from training. Duration: 0.96 2023-10-02 22:28:01,812 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6545_210_2040_1_1527774670_4245258_407-48070-0_sp0.9 from training. Duration: 0.8444375 2023-10-02 22:28:04,555 WARNING [train.py:1204] (2/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_503-30719-0_sp1.1 from training. Duration: 0.8909375 2023-10-02 22:28:06,532 WARNING [train.py:1204] (2/4) Exclude cut with ID _49643_210_7989_1_1534640244224_3939031_234-326497-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 22:28:08,106 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.conv_module1.balancer1.prob, batch_count=1041013.3333333334, ans=0.125 2023-10-02 22:28:09,542 WARNING [train.py:1204] (2/4) Exclude cut with ID _14170_210_5219_1_1530525949868_4469650_301-77873-0_sp1.1 from training. Duration: 0.963625 2023-10-02 22:28:10,922 WARNING [train.py:1204] (2/4) Exclude cut with ID _45297_210_2721_1_1533715351381_3264949_152-154697-0_sp1.1 from training. Duration: 0.97275 2023-10-02 22:28:10,928 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_3371_210_1794_1_1526384462_5580777_396-339812-0 from training. Duration: 0.85 2023-10-02 22:28:12,296 WARNING [train.py:1204] (2/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_399-156791-0_sp1.1 from training. Duration: 0.7545625 2023-10-02 22:28:12,341 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7696_210_2040_1_1528207167_3716099_117-125434-0 from training. Duration: 0.76 2023-10-02 22:28:12,353 WARNING [train.py:1204] (2/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_96-244299-0 from training. Duration: 0.88 2023-10-02 22:28:13,828 WARNING [train.py:1204] (2/4) Exclude cut with ID _37892_210_19596_1_1533198677897_3964058_519-343462-0_sp1.1 from training. Duration: 0.92725 2023-10-02 22:28:13,859 WARNING [train.py:1204] (2/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_202-183922-0_sp1.1 from training. Duration: 0.77275 2023-10-02 22:28:13,864 WARNING [train.py:1204] (2/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_453-133983-0 from training. Duration: 0.78 2023-10-02 22:28:13,891 WARNING [train.py:1204] (2/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_178-39722-0_sp1.1 from training. Duration: 0.4454375 2023-10-02 22:28:19,313 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_15126_210_6753_1_1530778677_2591185_113-157651-0 from training. Duration: 0.68 2023-10-02 22:28:19,314 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_271-234962-0_sp1.1 from training. Duration: 0.7181875 2023-10-02 22:28:22,098 WARNING [train.py:1204] (2/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_195-64967-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 22:28:23,454 WARNING [train.py:1204] (2/4) Exclude cut with ID _43552_210_8913_1_1533726031059_3523429_365-215065-0_sp1.1 from training. Duration: 0.936375 2023-10-02 22:28:26,230 WARNING [train.py:1204] (2/4) Exclude cut with ID _31602_210_5854_1_1532677021456_4458250_249-96604-0_sp0.9 from training. Duration: 0.988875 2023-10-02 22:28:26,293 WARNING [train.py:1204] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_307-158462-0 from training. Duration: 0.69 2023-10-02 22:28:28,100 WARNING [train.py:1204] (2/4) Exclude cut with ID _30918_210_6400_1_1532754078600_3125920_407-223394-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 22:28:28,104 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6673_210_3273_1_1527767790_3173240_351-167954-0_sp0.9 from training. Duration: 0.5333125 2023-10-02 22:28:29,515 WARNING [train.py:1204] (2/4) Exclude cut with ID _4866_210_2577_1_1527404412284_4364659_256-306830-0 from training. Duration: 0.87 2023-10-02 22:28:29,561 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_17613_210_2151_1_1531137569_2366160_222-131118-0_sp1.1 from training. Duration: 0.963625 2023-10-02 22:28:29,579 WARNING [train.py:1204] (2/4) Exclude cut with ID _45560_210_21982_1_1533812603951_3584999_111-6376-0 from training. Duration: 0.84 2023-10-02 22:28:29,776 INFO [scaling.py:1118] (2/4) WithLoss: name=encoder.encoders.3.encoder.layers.1.self_attn_weights, loss-sum=0.000e+00 2023-10-02 22:28:31,322 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_216-73841-0 from training. Duration: 0.96 2023-10-02 22:28:31,365 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_14058_210_4163_1_1530690953_6327180_49-210687-0 from training. Duration: 0.94 2023-10-02 22:28:34,115 WARNING [train.py:1204] (2/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_506-42614-0_sp0.9 from training. Duration: 0.87775 2023-10-02 22:28:35,972 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_3133_210_2087_1_1526106446_2324690_25-332138-0_sp1.1 from training. Duration: 0.82725 2023-10-02 22:28:38,705 WARNING [train.py:1204] (2/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_589-9070-0_sp1.1 from training. Duration: 0.6454375 2023-10-02 22:28:40,615 WARNING [train.py:1204] (2/4) Exclude cut with ID _6194_210_1534_1_1527511826160_3941194_14-282876-0_sp1.1 from training. Duration: 0.6454375 2023-10-02 22:28:41,967 WARNING [train.py:1204] (2/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_1166-90654-0_sp1.1 from training. Duration: 0.92725 2023-10-02 22:28:42,064 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_14056_210_4163_1_1530604483_7572827_218-255858-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 22:28:42,066 WARNING [train.py:1204] (2/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_1285-256622-0 from training. Duration: 0.84 2023-10-02 22:28:43,321 WARNING [train.py:1204] (2/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_205-286261-0_sp1.1 from training. Duration: 0.963625 2023-10-02 22:28:43,337 WARNING [train.py:1204] (2/4) Exclude cut with ID _68777_210_4353_1_1535810390731_3117904_6-154119-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 22:28:43,393 WARNING [train.py:1204] (2/4) Exclude cut with ID _22982_210_1883_1_1531882790447_5449409_565-199393-0_sp1.1 from training. Duration: 0.92725 2023-10-02 22:28:43,403 WARNING [train.py:1204] (2/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_278-154492-0 from training. Duration: 0.76 2023-10-02 22:28:44,832 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_426-313557-0 from training. Duration: 0.53 2023-10-02 22:28:46,178 WARNING [train.py:1204] (2/4) Exclude cut with ID _41817_210_18835_1_1533434477160_7423049_555-293448-0 from training. Duration: 0.8 2023-10-02 22:28:50,013 WARNING [train.py:1204] (2/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_32-335103-0_sp1.1 from training. Duration: 0.7454375 2023-10-02 22:28:52,764 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_2655_210_2087_1_1525688959_3897400_202-147557-0_sp1.1 from training. Duration: 0.77275 2023-10-02 22:28:52,793 WARNING [train.py:1204] (2/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_158-250116-0 from training. Duration: 0.62 2023-10-02 22:28:53,983 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.627e+02 1.874e+02 2.097e+02 2.519e+02 4.862e+02, threshold=4.194e+02, percent-clipped=1.0 2023-10-02 22:28:55,780 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.ff3_skip_rate, batch_count=1041280.0, ans=0.0 2023-10-02 22:28:57,656 WARNING [train.py:1204] (2/4) Exclude cut with ID _55429_210_10626_1_1534467588527_7174143_528-88083-0_sp1.1 from training. Duration: 0.963625 2023-10-02 22:29:01,782 WARNING [train.py:1204] (2/4) Exclude cut with ID _8030_210_7497_1_1528444886781_3531089_32-31837-0_sp1.1 from training. Duration: 0.8818125 2023-10-02 22:29:01,815 WARNING [train.py:1204] (2/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_744-86168-0_sp1.1 from training. Duration: 0.97275 2023-10-02 22:29:01,826 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1113-7369-0_sp0.9 from training. Duration: 0.9444375 2023-10-02 22:29:01,841 WARNING [train.py:1204] (2/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_124-345840-0_sp0.9 from training. Duration: 0.57775 2023-10-02 22:29:03,694 WARNING [train.py:1204] (2/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_328-264776-0_sp0.9 from training. Duration: 0.7444375 2023-10-02 22:29:04,955 WARNING [train.py:1204] (2/4) Exclude cut with ID _18895_210_6228_1_1531357274234_3784930_469-166615-0_sp1.1 from training. Duration: 0.963625 2023-10-02 22:29:06,251 WARNING [train.py:1204] (2/4) Exclude cut with ID _8395_210_1531_1_1528597871523_4411579_431-132470-0_sp0.9 from training. Duration: 0.82225 2023-10-02 22:29:06,332 WARNING [train.py:1204] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_554-45617-0_sp1.1 from training. Duration: 0.9 2023-10-02 22:29:06,353 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_35361_210_4238_1_1533262810_7793476_524-188242-0_sp1.1 from training. Duration: 0.92725 2023-10-02 22:29:08,387 WARNING [train.py:1204] (2/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_938-205135-0 from training. Duration: 0.87 2023-10-02 22:29:09,875 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.ff2_skip_rate, batch_count=1041346.6666666666, ans=0.0 2023-10-02 22:29:11,601 INFO [train.py:1046] (2/4) Epoch 30, batch 2150, loss[loss=0.1642, simple_loss=0.2373, pruned_loss=0.04552, over 23863.00 frames. ], tot_loss[loss=0.1638, simple_loss=0.2414, pruned_loss=0.04309, over 4710781.53 frames. ], batch size: 179, lr: 3.41e-03, grad_scale: 16.0 2023-10-02 22:29:11,676 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_5931_210_2040_1_1527428481_4547999_321-193614-0 from training. Duration: 0.67 2023-10-02 22:29:11,693 WARNING [train.py:1204] (2/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_445-269521-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 22:29:14,439 WARNING [train.py:1204] (2/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_3-33116-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 22:29:14,440 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_4633_210_2040_1_1526824163_4145343_416-76117-0_sp1.1 from training. Duration: 0.863625 2023-10-02 22:29:15,709 WARNING [train.py:1204] (2/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_1033-274376-0_sp1.1 from training. Duration: 0.7909375 2023-10-02 22:29:15,746 WARNING [train.py:1204] (2/4) Exclude cut with ID _8772_210_1850_1_1528635595972_3664011_398-320049-0_sp0.9 from training. Duration: 0.97775 2023-10-02 22:29:16,407 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.4.encoder.layers.1.nonlin_attention.whiten1, num_groups=1, num_channels=288, metric=3.87 vs. limit=10.0 2023-10-02 22:29:21,287 WARNING [train.py:1204] (2/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_321-215742-0_sp0.9 from training. Duration: 0.5555625 2023-10-02 22:29:22,681 WARNING [train.py:1204] (2/4) Exclude cut with ID _28394_210_8913_1_1532430049518_3650118_272-190950-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 22:29:24,067 WARNING [train.py:1204] (2/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_31-299371-0_sp1.1 from training. Duration: 0.92725 2023-10-02 22:29:26,692 WARNING [train.py:1204] (2/4) Exclude cut with ID _55383_210_10635_1_1534468426172_2231669_134-128246-0_sp0.9 from training. Duration: 0.988875 2023-10-02 22:29:26,694 WARNING [train.py:1204] (2/4) Exclude cut with ID _40519_210_13794_1_1533715228655_7337390_887-9547-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 22:29:26,729 WARNING [train.py:1204] (2/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_310-109461-0_sp0.9 from training. Duration: 0.888875 2023-10-02 22:29:28,301 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_2009_210_2071_1_1525481134_4588937_258-250196-0_sp1.1 from training. Duration: 0.92725 2023-10-02 22:29:29,709 WARNING [train.py:1204] (2/4) Exclude cut with ID _9235_210_5219_1_1528891154253_8046120_1186-72702-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 22:29:29,711 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_14831_210_7927_1_1530700200_5583610_181-27467-0_sp1.1 from training. Duration: 0.82725 2023-10-02 22:29:32,981 WARNING [train.py:1204] (2/4) Exclude cut with ID _40402_210_18219_1_1533522324115_2953150_278-132109-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 22:29:33,004 WARNING [train.py:1204] (2/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_261-297503-0 from training. Duration: 0.99 2023-10-02 22:29:37,635 WARNING [train.py:1204] (2/4) Exclude cut with ID _19896_210_1883_1_1531709022662_6649569_313-265903-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 22:29:39,474 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_105-114797-0_sp1.1 from training. Duration: 0.763625 2023-10-02 22:29:40,796 WARNING [train.py:1204] (2/4) Exclude cut with ID _53755_210_8254_1_1534577312941_4707360_9-65843-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 22:29:42,623 WARNING [train.py:1204] (2/4) Exclude cut with ID _18516_210_9206_1_1531704653206_3692406_361-224549-0_sp1.1 from training. Duration: 0.97275 2023-10-02 22:29:42,659 WARNING [train.py:1204] (2/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_403-310975-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 22:29:43,963 WARNING [train.py:1204] (2/4) Exclude cut with ID _5444_210_2151_1_1527418808464_5312210_581-315411-0_sp0.9 from training. Duration: 0.688875 2023-10-02 22:29:43,999 WARNING [train.py:1204] (2/4) Exclude cut with ID _23825_210_13449_1_1531915313526_3497150_464-154904-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 22:29:44,009 WARNING [train.py:1204] (2/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_937-208281-0_sp0.9 from training. Duration: 0.9444375 2023-10-02 22:29:45,298 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_37839_210_7234_1_1533542462_3689303_158-181212-0_sp1.1 from training. Duration: 0.963625 2023-10-02 22:29:45,559 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.nonlin_attention.balancer.prob, batch_count=1041480.0, ans=0.125 2023-10-02 22:29:46,743 WARNING [train.py:1204] (2/4) Exclude cut with ID _8772_210_1850_1_1528635595972_3664011_398-320049-0 from training. Duration: 0.88 2023-10-02 22:29:48,191 WARNING [train.py:1204] (2/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_718-250907-0_sp1.1 from training. Duration: 0.62725 2023-10-02 22:29:48,278 WARNING [train.py:1204] (2/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_641-126395-0_sp1.1 from training. Duration: 0.92725 2023-10-02 22:29:48,448 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.conv_skip_rate, batch_count=1041480.0, ans=0.0 2023-10-02 22:29:49,620 WARNING [train.py:1204] (2/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_166-241493-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 22:29:49,711 WARNING [train.py:1204] (2/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_526-62079-0_sp0.9 from training. Duration: 0.7444375 2023-10-02 22:29:51,106 WARNING [train.py:1204] (2/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_439-275083-0_sp1.1 from training. Duration: 0.936375 2023-10-02 22:29:52,607 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.0.balancer1.prob, batch_count=1041480.0, ans=0.125 2023-10-02 22:29:53,745 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_66907_210_21581_1_1535615661_3590177_493-229032-0_sp1.1 from training. Duration: 0.92725 2023-10-02 22:29:53,800 WARNING [train.py:1204] (2/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_89-321955-0_sp1.1 from training. Duration: 0.72725 2023-10-02 22:29:56,295 WARNING [train.py:1204] (2/4) Exclude cut with ID _29857_210_20034_1_1532395886753_3798160_273-226195-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 22:29:56,302 WARNING [train.py:1204] (2/4) Exclude cut with ID _22826_210_9994_1_1531908201522_3279169_404-75889-0 from training. Duration: 0.89 2023-10-02 22:29:56,339 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_271-234962-0_sp0.9 from training. Duration: 0.87775 2023-10-02 22:29:59,046 WARNING [train.py:1204] (2/4) Exclude cut with ID _22961_210_8254_1_1531976366672_4278180_156-339685-0_sp1.1 from training. Duration: 0.97275 2023-10-02 22:29:59,086 WARNING [train.py:1204] (2/4) Exclude cut with ID _37892_210_19596_1_1533198677897_3964058_425-231927-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 22:29:59,358 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=1041546.6666666666, ans=0.1 2023-10-02 22:30:00,489 WARNING [train.py:1204] (2/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_102-288670-0_sp1.1 from training. Duration: 0.97275 2023-10-02 22:30:00,565 WARNING [train.py:1204] (2/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_283-220372-0_sp1.1 from training. Duration: 0.5090625 2023-10-02 22:30:02,393 WARNING [train.py:1204] (2/4) Exclude cut with ID _44304_210_15374_1_1533639352765_4613129_86-1699-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 22:30:02,460 WARNING [train.py:1204] (2/4) Exclude cut with ID _2891_210_2550_1_1525874398521_4427710_327-121590-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 22:30:02,466 WARNING [train.py:1204] (2/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_369-222060-0 from training. Duration: 0.71 2023-10-02 22:30:05,124 WARNING [train.py:1204] (2/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_762-109886-0 from training. Duration: 0.91 2023-10-02 22:30:05,154 WARNING [train.py:1204] (2/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_477-255782-0_sp0.9 from training. Duration: 0.911125 2023-10-02 22:30:05,189 WARNING [train.py:1204] (2/4) Exclude cut with ID IC0669W0061-330906 from training. Duration: 0.768 2023-10-02 22:30:05,227 WARNING [train.py:1204] (2/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_347-213071-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 22:30:05,258 WARNING [train.py:1204] (2/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_507-303370-0_sp1.1 from training. Duration: 0.87275 2023-10-02 22:30:07,076 WARNING [train.py:1204] (2/4) Exclude cut with ID _15012_210_1968_1_1530856820429_5095350_688-178849-0 from training. Duration: 0.64 2023-10-02 22:30:07,082 WARNING [train.py:1204] (2/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_28-70987-0_sp0.9 from training. Duration: 0.9666875 2023-10-02 22:30:07,084 WARNING [train.py:1204] (2/4) Exclude cut with ID _5910_210_1389_1_1527337972454_5050100_555-159815-0 from training. Duration: 0.98 2023-10-02 22:30:07,111 WARNING [train.py:1204] (2/4) Exclude cut with ID IC0679W0064-335871 from training. Duration: 0.768 2023-10-02 22:30:07,111 WARNING [train.py:1204] (2/4) Exclude cut with ID IC0679W0079-335886 from training. Duration: 0.896 2023-10-02 22:30:09,025 WARNING [train.py:1204] (2/4) Exclude cut with ID _5608_210_2106_1_1527415439089_6819768_909-82767-0 from training. Duration: 0.61 2023-10-02 22:30:10,432 WARNING [train.py:1204] (2/4) Exclude cut with ID _23038_210_9570_1_1532001584158_3652419_411-323967-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 22:30:10,480 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_40896_210_15710_1_1533433956_7100802_350-231748-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 22:30:10,490 WARNING [train.py:1204] (2/4) Exclude cut with ID _5289_210_2084_1_1527678202736_3779299_142-103317-0_sp0.9 from training. Duration: 0.9333125 2023-10-02 22:30:12,347 WARNING [train.py:1204] (2/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_314-144055-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 22:30:13,611 WARNING [train.py:1204] (2/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_177-236809-0_sp0.9 from training. Duration: 0.5444375 2023-10-02 22:30:14,985 WARNING [train.py:1204] (2/4) Exclude cut with ID _23038_210_9570_1_1532001584158_3652419_82-334733-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 22:30:14,993 WARNING [train.py:1204] (2/4) Exclude cut with ID _43552_210_8913_1_1533726031059_3523429_225-22526-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 22:30:23,228 WARNING [train.py:1204] (2/4) Exclude cut with ID _30327_210_16049_1_1532484161295_3924169_401-70819-0_sp1.1 from training. Duration: 0.7818125 2023-10-02 22:30:23,285 WARNING [train.py:1204] (2/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_538-32291-0 from training. Duration: 0.83 2023-10-02 22:30:24,667 INFO [train.py:1046] (2/4) Epoch 30, batch 2200, loss[loss=0.1623, simple_loss=0.2299, pruned_loss=0.04729, over 18070.00 frames. ], tot_loss[loss=0.1639, simple_loss=0.2415, pruned_loss=0.04314, over 4713358.52 frames. ], batch size: 39, lr: 3.41e-03, grad_scale: 16.0 2023-10-02 22:30:27,507 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_37357_210_17024_1_1533089504_2316121_258-211605-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 22:30:30,406 WARNING [train.py:1204] (2/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_550-335198-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 22:30:31,790 WARNING [train.py:1204] (2/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_98-741-0_sp0.9 from training. Duration: 0.888875 2023-10-02 22:30:31,846 WARNING [train.py:1204] (2/4) Exclude cut with ID _38323_210_12108_1_1533121233369_3303049_251-238988-0_sp1.1 from training. Duration: 0.92725 2023-10-02 22:30:34,927 WARNING [train.py:1204] (2/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_259-34038-0_sp0.9 from training. Duration: 0.82225 2023-10-02 22:30:38,138 WARNING [train.py:1204] (2/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_1060-180633-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 22:30:38,174 WARNING [train.py:1204] (2/4) Exclude cut with ID _8755_210_1876_1_1528628559357_3258209_211-37473-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 22:30:38,182 WARNING [train.py:1204] (2/4) Exclude cut with ID _4122_210_2084_1_1526713164417_4353280_528-206771-0 from training. Duration: 0.88 2023-10-02 22:30:41,250 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=1041746.6666666666, ans=0.1 2023-10-02 22:30:42,950 WARNING [train.py:1204] (2/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_64-248359-0 from training. Duration: 0.69 2023-10-02 22:30:44,434 WARNING [train.py:1204] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_402-253027-0_sp1.1 from training. Duration: 0.4909375 2023-10-02 22:30:49,951 WARNING [train.py:1204] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_625-102955-0 from training. Duration: 0.77 2023-10-02 22:30:52,840 WARNING [train.py:1204] (2/4) Exclude cut with ID _14998_210_5219_1_1530705582080_7474970_611-219324-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 22:30:54,220 WARNING [train.py:1204] (2/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_718-68505-0_sp0.9 from training. Duration: 0.988875 2023-10-02 22:30:54,294 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_353-182855-0_sp1.1 from training. Duration: 0.87275 2023-10-02 22:30:57,081 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_5960_210_1794_1_1527403865_3990999_515-2613-0_sp0.9 from training. Duration: 0.97775 2023-10-02 22:30:57,100 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_5575_210_1410_1_1527301305_2570170_342-265572-0 from training. Duration: 0.94 2023-10-02 22:30:59,810 WARNING [train.py:1204] (2/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_329-113173-0_sp0.9 from training. Duration: 0.688875 2023-10-02 22:31:01,614 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_15396_210_6890_1_1531308473_1938936_213-312506-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 22:31:03,112 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_521-214276-0_sp1.1 from training. Duration: 0.4 2023-10-02 22:31:05,861 WARNING [train.py:1204] (2/4) Exclude cut with ID _15026_210_8750_1_1530777602316_3291850_211-195043-0_sp1.1 from training. Duration: 0.72725 2023-10-02 22:31:07,797 WARNING [train.py:1204] (2/4) Exclude cut with ID _29343_210_4381_1_1532602757752_3674109_108-257494-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 22:31:10,489 WARNING [train.py:1204] (2/4) Exclude cut with ID _64414_210_17301_1_1535243252048_3757759_403-58151-0_sp1.1 from training. Duration: 0.963625 2023-10-02 22:31:12,301 WARNING [train.py:1204] (2/4) Exclude cut with ID _36512_210_6987_1_1533033143998_3126069_109-177787-0_sp1.1 from training. Duration: 0.92725 2023-10-02 22:31:14,150 WARNING [train.py:1204] (2/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_498-40153-0 from training. Duration: 0.63 2023-10-02 22:31:15,531 WARNING [train.py:1204] (2/4) Exclude cut with ID _56447_210_1389_1_1535074245910_3697189_161-334523-0_sp1.1 from training. Duration: 0.936375 2023-10-02 22:31:15,624 WARNING [train.py:1204] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_238-245655-0 from training. Duration: 0.81 2023-10-02 22:31:18,610 WARNING [train.py:1204] (2/4) Exclude cut with ID _64631_210_27767_1_1535349580068_4165160_108-43865-0_sp1.1 from training. Duration: 0.92725 2023-10-02 22:31:19,812 WARNING [train.py:1204] (2/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_243-42116-0_sp1.1 from training. Duration: 0.663625 2023-10-02 22:31:19,849 WARNING [train.py:1204] (2/4) Exclude cut with ID _66868_210_9527_1_1535509287954_4561140_594-132160-0_sp1.1 from training. Duration: 0.92725 2023-10-02 22:31:21,245 WARNING [train.py:1204] (2/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_48-338976-0_sp0.9 from training. Duration: 0.988875 2023-10-02 22:31:22,480 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.578e+02 1.843e+02 2.023e+02 2.325e+02 3.252e+02, threshold=4.046e+02, percent-clipped=0.0 2023-10-02 22:31:22,568 WARNING [train.py:1204] (2/4) Exclude cut with ID _5250_210_5219_1_1527249561806_8692110_137-100595-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 22:31:22,582 WARNING [train.py:1204] (2/4) Exclude cut with ID _49748_210_24116_1_1533986971737_3158910_12-271669-0_sp1.1 from training. Duration: 0.936375 2023-10-02 22:31:22,609 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_37357_210_17024_1_1533089504_2316121_206-80421-0_sp1.1 from training. Duration: 0.936375 2023-10-02 22:31:24,090 WARNING [train.py:1204] (2/4) Exclude cut with ID _7537_210_2084_1_1528502395431_3924359_113-199933-0_sp1.1 from training. Duration: 0.536375 2023-10-02 22:31:25,408 WARNING [train.py:1204] (2/4) Exclude cut with ID _55440_210_12651_1_1534496713364_2980520_31-59289-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 22:31:26,947 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.1.feed_forward1.out_proj.dropout_p, batch_count=1041946.6666666666, ans=0.1 2023-10-02 22:31:28,109 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1083-220122-0_sp0.9 from training. Duration: 0.5666875 2023-10-02 22:31:30,865 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_426-313557-0_sp1.1 from training. Duration: 0.4818125 2023-10-02 22:31:30,916 WARNING [train.py:1204] (2/4) Exclude cut with ID _35799_210_5854_1_1533263487887_3529071_324-94843-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 22:31:35,404 WARNING [train.py:1204] (2/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_264-108685-0_sp1.1 from training. Duration: 0.5 2023-10-02 22:31:35,500 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9012W0006-501141 from training. Duration: 0.896 2023-10-02 22:31:38,055 INFO [train.py:1046] (2/4) Epoch 30, batch 2250, loss[loss=0.1858, simple_loss=0.2733, pruned_loss=0.04916, over 24576.00 frames. ], tot_loss[loss=0.1641, simple_loss=0.2424, pruned_loss=0.04288, over 4712895.04 frames. ], batch size: 71, lr: 3.41e-03, grad_scale: 16.0 2023-10-02 22:31:38,119 WARNING [train.py:1204] (2/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_890-5117-0_sp0.9 from training. Duration: 0.8666875 2023-10-02 22:31:38,169 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9023W0008-506301 from training. Duration: 0.896 2023-10-02 22:31:39,536 WARNING [train.py:1204] (2/4) Exclude cut with ID _13916_210_9994_1_1530788341126_3763149_60-191556-0_sp0.9 from training. Duration: 0.67775 2023-10-02 22:31:40,901 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9029W0331-509721 from training. Duration: 0.896 2023-10-02 22:31:41,775 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.feed_forward2.hidden_balancer.prob, batch_count=1042013.3333333334, ans=0.125 2023-10-02 22:31:42,659 WARNING [train.py:1204] (2/4) Exclude cut with ID _35823_210_6917_1_1533187881914_7297830_567-108596-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 22:31:42,702 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7543_210_6753_1_1528544010_1110402_25-183860-0_sp1.1 from training. Duration: 0.42725 2023-10-02 22:31:44,223 WARNING [train.py:1204] (2/4) Exclude cut with ID _47278_210_12041_1_1533902418349_3576110_495-260035-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 22:31:46,173 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9051W0015-520281 from training. Duration: 0.9045625 2023-10-02 22:31:48,014 WARNING [train.py:1204] (2/4) Exclude cut with ID _14459_210_2228_1_1531443641946_7516700_707-66193-0_sp1.1 from training. Duration: 0.97275 2023-10-02 22:31:48,305 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.conv_module2.balancer1.min_positive, batch_count=1042013.3333333334, ans=0.025 2023-10-02 22:31:50,682 WARNING [train.py:1204] (2/4) Exclude cut with ID _14087_210_2253_1_1530529224512_3014029_219-224445-0_sp1.1 from training. Duration: 0.763625 2023-10-02 22:31:55,132 WARNING [train.py:1204] (2/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_345-167986-0_sp0.9 from training. Duration: 0.9333125 2023-10-02 22:31:56,461 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_34815_210_6388_1_1533381320_3571903_279-305565-0_sp1.1 from training. Duration: 0.836375 2023-10-02 22:31:59,212 WARNING [train.py:1204] (2/4) Exclude cut with ID _21987_210_5854_1_1531821500482_3637038_317-179212-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 22:31:59,260 WARNING [train.py:1204] (2/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_395-37222-0_sp1.1 from training. Duration: 0.6909375 2023-10-02 22:32:00,639 WARNING [train.py:1204] (2/4) Exclude cut with ID _8064_210_5399_1_1528372299291_4282973_168-50337-0_sp1.1 from training. Duration: 0.763625 2023-10-02 22:32:02,134 WARNING [train.py:1204] (2/4) Exclude cut with ID _60113_210_16271_1_1535164212481_4386079_559-167191-0 from training. Duration: 0.93 2023-10-02 22:32:02,143 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_47228_210_4983_1_1533780021_3672027_344-179642-0_sp1.1 from training. Duration: 0.92725 2023-10-02 22:32:02,166 WARNING [train.py:1204] (2/4) Exclude cut with ID _22961_210_8254_1_1531976366672_4278180_317-199546-0_sp0.9 from training. Duration: 0.9555625 2023-10-02 22:32:05,344 WARNING [train.py:1204] (2/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_141-89727-0 from training. Duration: 0.71 2023-10-02 22:32:05,596 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.bypass_mid.scale_min, batch_count=1042080.0, ans=0.2 2023-10-02 22:32:06,647 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_78-116075-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 22:32:06,658 WARNING [train.py:1204] (2/4) Exclude cut with ID _64086_210_7994_1_1535270417019_7435949_52-235756-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 22:32:07,966 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7615_210_4211_1_1528110965_4580160_72-235211-0_sp1.1 from training. Duration: 0.6909375 2023-10-02 22:32:12,590 WARNING [train.py:1204] (2/4) Exclude cut with ID _14069_210_5632_1_1530512692033_4803390_287-27950-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 22:32:12,727 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.1.self_attn_weights.pos_emb_skip_rate, batch_count=1042146.6666666666, ans=0.0 2023-10-02 22:32:15,276 WARNING [train.py:1204] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_74-48717-0_sp0.9 from training. Duration: 0.6444375 2023-10-02 22:32:15,300 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7544_210_6753_1_1528591327_1471426_172-348777-0_sp1.1 from training. Duration: 0.6 2023-10-02 22:32:16,685 WARNING [train.py:1204] (2/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_220-191080-0 from training. Duration: 0.98 2023-10-02 22:32:18,077 WARNING [train.py:1204] (2/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_1081-137641-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 22:32:20,062 WARNING [train.py:1204] (2/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_278-235459-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 22:32:24,190 WARNING [train.py:1204] (2/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_1181-77470-0_sp1.1 from training. Duration: 0.936375 2023-10-02 22:32:25,643 WARNING [train.py:1204] (2/4) Exclude cut with ID _60860_210_8254_1_1534896015875_3557169_198-5531-0_sp1.1 from training. Duration: 0.936375 2023-10-02 22:32:26,991 WARNING [train.py:1204] (2/4) Exclude cut with ID _12369_210_4381_1_1530356078581_4388520_17-289781-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 22:32:27,003 WARNING [train.py:1204] (2/4) Exclude cut with ID _50310_210_15488_1_1534068043345_3641080_146-199752-0_sp1.1 from training. Duration: 0.92725 2023-10-02 22:32:29,710 WARNING [train.py:1204] (2/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_969-213354-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 22:32:31,160 WARNING [train.py:1204] (2/4) Exclude cut with ID _70519_210_4163_1_1535961595994_7458230_66-344156-0_sp1.1 from training. Duration: 0.963625 2023-10-02 22:32:35,208 WARNING [train.py:1204] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_380-204756-0_sp1.1 from training. Duration: 0.7818125 2023-10-02 22:32:37,066 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_128-202104-0_sp0.9 from training. Duration: 0.7 2023-10-02 22:32:37,134 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder_embed.dropout.p, batch_count=1042280.0, ans=0.1 2023-10-02 22:32:38,660 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.bypass.scale_min, batch_count=1042280.0, ans=0.2 2023-10-02 22:32:41,153 WARNING [train.py:1204] (2/4) Exclude cut with ID _4697_210_2069_1_1526815801953_3823098_51-1049-0_sp0.9 from training. Duration: 0.8333125 2023-10-02 22:32:41,180 WARNING [train.py:1204] (2/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_86-66192-0_sp1.1 from training. Duration: 0.5 2023-10-02 22:32:42,555 WARNING [train.py:1204] (2/4) Exclude cut with ID _14069_210_5632_1_1530512692033_4803390_95-1896-0_sp1.1 from training. Duration: 0.8909375 2023-10-02 22:32:44,629 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.0.feed_forward1.hidden_balancer.prob, batch_count=1042280.0, ans=0.125 2023-10-02 22:32:47,674 WARNING [train.py:1204] (2/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_29-293896-0_sp1.1 from training. Duration: 0.5545625 2023-10-02 22:32:49,158 WARNING [train.py:1204] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_167-137255-0_sp0.9 from training. Duration: 0.711125 2023-10-02 22:32:49,159 WARNING [train.py:1204] (2/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_217-95056-0 from training. Duration: 0.98 2023-10-02 22:32:50,865 WARNING [train.py:1204] (2/4) Exclude cut with ID _47278_210_12041_1_1533902418349_3576110_97-266746-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 22:32:50,925 WARNING [train.py:1204] (2/4) Exclude cut with ID _14224_210_4381_1_1530529062037_4264010_380-343670-0_sp1.1 from training. Duration: 0.8 2023-10-02 22:32:52,151 INFO [train.py:1046] (2/4) Epoch 30, batch 2300, loss[loss=0.1459, simple_loss=0.2273, pruned_loss=0.03226, over 24603.00 frames. ], tot_loss[loss=0.1646, simple_loss=0.2427, pruned_loss=0.04321, over 4718235.89 frames. ], batch size: 60, lr: 3.41e-03, grad_scale: 16.0 2023-10-02 22:32:53,690 WARNING [train.py:1204] (2/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_107-332398-0 from training. Duration: 0.78 2023-10-02 22:32:55,131 WARNING [train.py:1204] (2/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_91-267696-0_sp1.1 from training. Duration: 0.7909375 2023-10-02 22:32:55,247 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.1.attention_skip_rate, batch_count=1042346.6666666666, ans=0.0 2023-10-02 22:32:56,359 WARNING [train.py:1204] (2/4) Exclude cut with ID _41298_210_9019_1_1533430371450_4822143_498-186993-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 22:33:01,956 WARNING [train.py:1204] (2/4) Exclude cut with ID _17956_210_4353_1_1531209568125_6979970_54-77610-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 22:33:01,989 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_40047_210_2290_1_1533290519_2374311_257-335162-0_sp1.1 from training. Duration: 0.863625 2023-10-02 22:33:03,415 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9653W0193-675471 from training. Duration: 0.9656875 2023-10-02 22:33:06,013 WARNING [train.py:1204] (2/4) Exclude cut with ID _25248_210_8750_1_1532062825941_5554180_243-240536-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 22:33:13,369 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_32360_210_4198_1_1532689160_3648436_60-51908-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 22:33:13,381 WARNING [train.py:1204] (2/4) Exclude cut with ID _4092_210_2151_1_1526643044449_3135759_227-213972-0_sp0.9 from training. Duration: 0.6 2023-10-02 22:33:14,435 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.5.encoder.layers.0.nonlin_attention.whiten2, num_groups=1, num_channels=256, metric=3.44 vs. limit=15.0 2023-10-02 22:33:15,120 WARNING [train.py:1204] (2/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_392-173500-0_sp1.1 from training. Duration: 0.97275 2023-10-02 22:33:15,162 WARNING [train.py:1204] (2/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_71-329150-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 22:33:15,164 WARNING [train.py:1204] (2/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_866-173440-0 from training. Duration: 0.9 2023-10-02 22:33:16,551 WARNING [train.py:1204] (2/4) Exclude cut with ID _14832_210_8750_1_1530691351539_3171348_1-54315-0_sp0.9 from training. Duration: 0.9666875 2023-10-02 22:33:19,699 WARNING [train.py:1204] (2/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_430-116139-0_sp1.1 from training. Duration: 0.77275 2023-10-02 22:33:19,725 WARNING [train.py:1204] (2/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_356-45054-0_sp0.9 from training. Duration: 0.9555625 2023-10-02 22:33:24,212 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_3531_210_3528_1_1526294203_5216456_309-227302-0_sp0.9 from training. Duration: 0.7666875 2023-10-02 22:33:26,999 WARNING [train.py:1204] (2/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_301-330455-0_sp1.1 from training. Duration: 0.72725 2023-10-02 22:33:28,542 WARNING [train.py:1204] (2/4) Exclude cut with ID _23557_210_8750_1_1531890106370_6846255_667-145945-0_sp1.1 from training. Duration: 0.92725 2023-10-02 22:33:32,680 WARNING [train.py:1204] (2/4) Exclude cut with ID _14860_210_4915_1_1530702020340_7343240_836-328489-0_sp1.1 from training. Duration: 0.8181875 2023-10-02 22:33:32,731 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_37152_210_9697_1_1533106573_3844188_416-333249-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 22:33:35,304 WARNING [train.py:1204] (2/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_538-32291-0_sp0.9 from training. Duration: 0.92225 2023-10-02 22:33:38,581 WARNING [train.py:1204] (2/4) Exclude cut with ID _32704_210_8954_1_1533017488840_6747370_965-176824-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 22:33:42,647 WARNING [train.py:1204] (2/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_86-19601-0_sp1.1 from training. Duration: 0.77275 2023-10-02 22:33:42,711 WARNING [train.py:1204] (2/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_436-318113-0_sp1.1 from training. Duration: 0.6818125 2023-10-02 22:33:42,743 WARNING [train.py:1204] (2/4) Exclude cut with ID _41817_210_18835_1_1533434477160_7423049_555-293448-0_sp0.9 from training. Duration: 0.888875 2023-10-02 22:33:44,701 WARNING [train.py:1204] (2/4) Exclude cut with ID _4697_210_2069_1_1526815801953_3823098_68-219807-0 from training. Duration: 0.47 2023-10-02 22:33:49,141 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.595e+02 1.877e+02 2.187e+02 2.420e+02 3.762e+02, threshold=4.375e+02, percent-clipped=0.0 2023-10-02 22:33:49,219 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7544_210_6753_1_1528591327_1471426_33-346031-0_sp1.1 from training. Duration: 0.5545625 2023-10-02 22:33:49,221 WARNING [train.py:1204] (2/4) Exclude cut with ID _49748_210_24116_1_1533986971737_3158910_263-52113-0_sp1.1 from training. Duration: 0.97275 2023-10-02 22:33:49,266 WARNING [train.py:1204] (2/4) Exclude cut with ID _64639_210_5816_1_1535367619437_3528000_340-24580-0_sp1.1 from training. Duration: 0.92725 2023-10-02 22:33:49,285 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_47228_210_4983_1_1533780021_3672027_479-114941-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 22:33:49,312 WARNING [train.py:1204] (2/4) Exclude cut with ID _44585_210_18628_1_1533605350387_4518199_506-119659-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 22:33:50,662 WARNING [train.py:1204] (2/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_314-126819-0_sp1.1 from training. Duration: 0.4545625 2023-10-02 22:33:50,665 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_2541_210_1794_1_1525590269_3084765_156-173713-0_sp0.9 from training. Duration: 0.9 2023-10-02 22:33:50,722 WARNING [train.py:1204] (2/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_108-255430-0 from training. Duration: 0.97 2023-10-02 22:33:50,731 WARNING [train.py:1204] (2/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_791-45149-0_sp1.1 from training. Duration: 0.7818125 2023-10-02 22:33:52,106 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_53378_210_17282_1_1534227054_1261425_10-118295-0_sp1.1 from training. Duration: 0.97275 2023-10-02 22:33:53,903 WARNING [train.py:1204] (2/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_148-158509-0 from training. Duration: 0.41 2023-10-02 22:33:58,144 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_34726_210_4525_1_1533019474_5030395_687-291305-0_sp1.1 from training. Duration: 0.936375 2023-10-02 22:34:02,490 WARNING [train.py:1204] (2/4) Exclude cut with ID _14860_210_4915_1_1530702020340_7343240_264-324537-0_sp1.1 from training. Duration: 0.963625 2023-10-02 22:34:05,038 INFO [train.py:1046] (2/4) Epoch 30, batch 2350, loss[loss=0.1725, simple_loss=0.2584, pruned_loss=0.04327, over 24034.00 frames. ], tot_loss[loss=0.1657, simple_loss=0.2437, pruned_loss=0.04385, over 4707202.36 frames. ], batch size: 80, lr: 3.41e-03, grad_scale: 16.0 2023-10-02 22:34:05,155 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_41251_210_15368_1_1533429767_4820695_591-46386-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 22:34:05,183 WARNING [train.py:1204] (2/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_86-19601-0_sp0.9 from training. Duration: 0.9444375 2023-10-02 22:34:06,475 WARNING [train.py:1204] (2/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_401-255491-0_sp0.9 from training. Duration: 0.77775 2023-10-02 22:34:06,613 WARNING [train.py:1204] (2/4) Exclude cut with ID _8696_210_1876_1_1528545632440_3345058_209-54069-0_sp1.1 from training. Duration: 0.6090625 2023-10-02 22:34:06,622 WARNING [train.py:1204] (2/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_369-325184-0_sp1.1 from training. Duration: 0.9 2023-10-02 22:34:06,855 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.conv_module1.balancer2.prob, batch_count=1042680.0, ans=0.125 2023-10-02 22:34:07,987 WARNING [train.py:1204] (2/4) Exclude cut with ID _14687_210_5854_1_1530703838259_3664889_115-239046-0_sp0.9 from training. Duration: 0.7444375 2023-10-02 22:34:08,054 WARNING [train.py:1204] (2/4) Exclude cut with ID _8064_210_5399_1_1528372299291_4282973_168-50337-0 from training. Duration: 0.84 2023-10-02 22:34:16,044 WARNING [train.py:1204] (2/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_909-64151-0_sp1.1 from training. Duration: 0.8545625 2023-10-02 22:34:16,058 WARNING [train.py:1204] (2/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_597-154640-0 from training. Duration: 0.9 2023-10-02 22:34:19,693 WARNING [train.py:1204] (2/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_126-274694-0 from training. Duration: 0.98 2023-10-02 22:34:19,926 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.self_attn_weights.pos_emb_skip_rate, batch_count=1042746.6666666666, ans=0.0 2023-10-02 22:34:22,620 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.1.conv_module2.balancer1.min_positive, batch_count=1042746.6666666666, ans=0.025 2023-10-02 22:34:24,224 WARNING [train.py:1204] (2/4) Exclude cut with ID _21783_210_11357_1_1531742458730_3762679_344-269786-0_sp1.1 from training. Duration: 0.97275 2023-10-02 22:34:25,664 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_37818_210_4238_1_1533620910_4720083_322-253154-0_sp1.1 from training. Duration: 0.92725 2023-10-02 22:34:25,667 WARNING [train.py:1204] (2/4) Exclude cut with ID _58895_210_8750_1_1535184081320_3649089_221-15823-0_sp1.1 from training. Duration: 0.92725 2023-10-02 22:34:25,683 WARNING [train.py:1204] (2/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_9-21737-0_sp1.1 from training. Duration: 0.8818125 2023-10-02 22:34:26,914 WARNING [train.py:1204] (2/4) Exclude cut with ID _14069_210_5632_1_1530512692033_4803390_522-341588-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 22:34:27,129 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.0.conv_skip_rate, batch_count=1042746.6666666666, ans=0.0 2023-10-02 22:34:28,350 WARNING [train.py:1204] (2/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_363-185703-0 from training. Duration: 0.95 2023-10-02 22:34:31,157 WARNING [train.py:1204] (2/4) Exclude cut with ID _41817_210_18835_1_1533434477160_7423049_265-76216-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 22:34:31,437 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.0.ff2_skip_rate, batch_count=1042746.6666666666, ans=0.0 2023-10-02 22:34:35,422 WARNING [train.py:1204] (2/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_841-341679-0 from training. Duration: 0.58 2023-10-02 22:34:36,707 WARNING [train.py:1204] (2/4) Exclude cut with ID _55429_210_10626_1_1534467588527_7174143_97-335084-0_sp1.1 from training. Duration: 0.8818125 2023-10-02 22:34:36,757 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder_embed.convnext.out_balancer.prob, batch_count=1042813.3333333334, ans=0.125 2023-10-02 22:34:41,326 WARNING [train.py:1204] (2/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_690-126306-0_sp0.9 from training. Duration: 0.8666875 2023-10-02 22:34:41,340 WARNING [train.py:1204] (2/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_102-92751-0_sp1.1 from training. Duration: 0.9 2023-10-02 22:34:42,754 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_3787_210_2040_1_1526477544_5478586_603-120324-0_sp1.1 from training. Duration: 0.736375 2023-10-02 22:34:44,580 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_33348_210_5632_1_1533086912_3324030_85-28583-0 from training. Duration: 0.82 2023-10-02 22:34:44,639 WARNING [train.py:1204] (2/4) Exclude cut with ID _60057_210_10640_1_1534903065793_3333150_362-230377-0_sp1.1 from training. Duration: 0.7909375 2023-10-02 22:34:48,025 WARNING [train.py:1204] (2/4) Exclude cut with ID _60113_210_16271_1_1535164212481_4386079_194-146486-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 22:34:48,045 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_53753_210_7234_1_1534320003_3594148_171-277668-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 22:34:48,085 WARNING [train.py:1204] (2/4) Exclude cut with ID _34423_210_8750_1_1533430798456_3330199_251-332788-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 22:34:52,151 WARNING [train.py:1204] (2/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_663-166973-0_sp0.9 from training. Duration: 0.888875 2023-10-02 22:34:53,613 WARNING [train.py:1204] (2/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_927-43827-0 from training. Duration: 0.74 2023-10-02 22:34:55,476 WARNING [train.py:1204] (2/4) Exclude cut with ID _28323_210_16187_1_1532480395667_4100300_306-74561-0_sp1.1 from training. Duration: 0.8545625 2023-10-02 22:34:56,913 WARNING [train.py:1204] (2/4) Exclude cut with ID _15037_210_2253_1_1530752294104_3267650_139-337989-0_sp1.1 from training. Duration: 0.92725 2023-10-02 22:34:56,927 WARNING [train.py:1204] (2/4) Exclude cut with ID _49039_210_6315_1_1533967537541_3846118_460-263670-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 22:34:58,286 WARNING [train.py:1204] (2/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_186-309161-0 from training. Duration: 0.54 2023-10-02 22:34:59,675 WARNING [train.py:1204] (2/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_87-278686-0_sp1.1 from training. Duration: 0.7 2023-10-02 22:35:01,343 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.ff2_skip_rate, batch_count=1042880.0, ans=0.0 2023-10-02 22:35:02,528 WARNING [train.py:1204] (2/4) Exclude cut with ID _27052_210_5986_1_1532329197187_3585009_314-106499-0 from training. Duration: 0.92 2023-10-02 22:35:02,549 WARNING [train.py:1204] (2/4) Exclude cut with ID _3465_210_3525_1_1526175667060_3574049_412-287526-0_sp0.9 from training. Duration: 0.8 2023-10-02 22:35:06,559 WARNING [train.py:1204] (2/4) Exclude cut with ID _22961_210_8254_1_1531976366672_4278180_317-199546-0 from training. Duration: 0.86 2023-10-02 22:35:11,361 WARNING [train.py:1204] (2/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_381-317791-0 from training. Duration: 0.73 2023-10-02 22:35:12,660 WARNING [train.py:1204] (2/4) Exclude cut with ID _44010_210_4525_1_1533884262964_4013620_560-310411-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 22:35:12,666 WARNING [train.py:1204] (2/4) Exclude cut with ID _5294_210_1747_1_1527249643928_3696179_342-270278-0_sp0.9 from training. Duration: 0.611125 2023-10-02 22:35:12,687 WARNING [train.py:1204] (2/4) Exclude cut with ID ID0473W0002-918951 from training. Duration: 0.924 2023-10-02 22:35:12,721 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1083-220122-0 from training. Duration: 0.51 2023-10-02 22:35:15,917 WARNING [train.py:1204] (2/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_138-154812-0 from training. Duration: 0.78 2023-10-02 22:35:17,424 WARNING [train.py:1204] (2/4) Exclude cut with ID _44982_210_1511_1_1534312820108_3726029_331-19420-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 22:35:17,670 INFO [scaling.py:1118] (2/4) WithLoss: name=encoder.encoders.3.encoder.layers.3.self_attn_weights, loss-sum=0.000e+00 2023-10-02 22:35:19,009 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.conv_skip_rate, batch_count=1043013.3333333334, ans=0.0 2023-10-02 22:35:20,518 INFO [train.py:1046] (2/4) Epoch 30, batch 2400, loss[loss=0.1696, simple_loss=0.2541, pruned_loss=0.04255, over 23801.00 frames. ], tot_loss[loss=0.1661, simple_loss=0.2436, pruned_loss=0.04432, over 4707689.22 frames. ], batch size: 85, lr: 3.41e-03, grad_scale: 32.0 2023-10-02 22:35:23,417 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_2655_210_2087_1_1525688959_3897400_202-147557-0_sp0.9 from training. Duration: 0.9444375 2023-10-02 22:35:26,170 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_23850_210_4238_1_1532232597_1632433_32-185135-0_sp0.9 from training. Duration: 0.9555625 2023-10-02 22:35:27,899 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_14056_210_4163_1_1530604483_7572827_46-93547-0_sp1.1 from training. Duration: 0.863625 2023-10-02 22:35:27,955 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7931_210_1794_1_1528594728_5564059_280-279560-0 from training. Duration: 0.98 2023-10-02 22:35:28,176 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.bypass.scale_min, batch_count=1043013.3333333334, ans=0.2 2023-10-02 22:35:29,357 WARNING [train.py:1204] (2/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_937-208281-0 from training. Duration: 0.85 2023-10-02 22:35:32,500 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.balancer1.prob, batch_count=1043013.3333333334, ans=0.125 2023-10-02 22:35:36,230 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_14928_210_5632_1_1530773461_4714000_454-112198-0_sp0.9 from training. Duration: 0.7333125 2023-10-02 22:35:36,231 WARNING [train.py:1204] (2/4) Exclude cut with ID _32704_210_8954_1_1533017488840_6747370_944-153333-0_sp1.1 from training. Duration: 0.963625 2023-10-02 22:35:36,455 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.1.balancer2.prob, batch_count=1043080.0, ans=0.125 2023-10-02 22:35:38,953 WARNING [train.py:1204] (2/4) Exclude cut with ID _14821_210_8750_1_1530680503991_6829359_2-227151-0 from training. Duration: 0.83 2023-10-02 22:35:38,978 WARNING [train.py:1204] (2/4) Exclude cut with ID _28461_210_2253_1_1532771775189_3041299_96-152158-0_sp1.1 from training. Duration: 0.77275 2023-10-02 22:35:39,041 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7837_210_1792_1_1528453445_5841305_654-54195-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 22:35:39,075 WARNING [train.py:1204] (2/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_344-152526-0 from training. Duration: 0.53 2023-10-02 22:35:42,564 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=1043080.0, ans=0.1 2023-10-02 22:35:45,646 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_30167_210_3528_1_1532428012_4772375_480-142230-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 22:35:46,929 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_160-218721-0 from training. Duration: 0.96 2023-10-02 22:35:50,035 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.conv_module2.balancer1.prob, batch_count=1043146.6666666666, ans=0.125 2023-10-02 22:35:51,289 WARNING [train.py:1204] (2/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_65-88242-0_sp0.9 from training. Duration: 0.62225 2023-10-02 22:35:54,665 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_34886_210_10633_1_1533203882_1755661_76-156096-0 from training. Duration: 0.87 2023-10-02 22:35:57,929 WARNING [train.py:1204] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_188-38388-0_sp1.1 from training. Duration: 0.92725 2023-10-02 22:35:59,134 WARNING [train.py:1204] (2/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_81-289335-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 22:36:03,425 WARNING [train.py:1204] (2/4) Exclude cut with ID _59493_210_17282_1_1535078419077_3225879_110-8778-0_sp1.1 from training. Duration: 0.936375 2023-10-02 22:36:03,464 WARNING [train.py:1204] (2/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_188-260011-0 from training. Duration: 0.73 2023-10-02 22:36:04,731 WARNING [train.py:1204] (2/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_453-133983-0_sp0.9 from training. Duration: 0.8666875 2023-10-02 22:36:12,188 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8054_210_7927_1_1528627721_4984094_553-17379-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 22:36:13,138 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=1043213.3333333334, ans=0.1 2023-10-02 22:36:13,965 WARNING [train.py:1204] (2/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_341-171072-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 22:36:15,646 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.feed_forward1.hidden_balancer.prob, batch_count=1043213.3333333334, ans=0.125 2023-10-02 22:36:16,714 WARNING [train.py:1204] (2/4) Exclude cut with ID _30800_210_22382_1_1532499347904_4515959_265-257286-0_sp1.1 from training. Duration: 0.963625 2023-10-02 22:36:18,083 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_403-39619-0_sp0.9 from training. Duration: 0.9333125 2023-10-02 22:36:18,089 WARNING [train.py:1204] (2/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_464-33931-0_sp0.9 from training. Duration: 0.6 2023-10-02 22:36:18,113 WARNING [train.py:1204] (2/4) Exclude cut with ID _60804_210_9642_1_1534831199859_7268299_632-143863-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 22:36:18,120 WARNING [train.py:1204] (2/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_431-310997-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 22:36:19,277 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.363e+02 1.815e+02 2.119e+02 2.399e+02 3.814e+02, threshold=4.238e+02, percent-clipped=0.0 2023-10-02 22:36:19,315 WARNING [train.py:1204] (2/4) Exclude cut with ID _31649_210_6228_1_1532584608063_4091078_114-263860-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 22:36:19,327 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6961_210_2087_1_1527937168_6815011_562-165518-0_sp0.9 from training. Duration: 0.8555625 2023-10-02 22:36:25,216 WARNING [train.py:1204] (2/4) Exclude cut with ID _27052_210_5986_1_1532329197187_3585009_338-325870-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 22:36:25,263 WARNING [train.py:1204] (2/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_469-94673-0_sp1.1 from training. Duration: 0.7181875 2023-10-02 22:36:25,282 WARNING [train.py:1204] (2/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_259-34038-0 from training. Duration: 0.74 2023-10-02 22:36:26,813 WARNING [train.py:1204] (2/4) Exclude cut with ID _14922_210_6228_1_1530707435273_3693310_388-129552-0 from training. Duration: 0.96 2023-10-02 22:36:28,407 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=1043280.0, ans=0.1 2023-10-02 22:36:29,568 WARNING [train.py:1204] (2/4) Exclude cut with ID _28394_210_8913_1_1532430049518_3650118_342-251455-0_sp1.1 from training. Duration: 0.936375 2023-10-02 22:36:29,587 WARNING [train.py:1204] (2/4) Exclude cut with ID _53731_210_7994_1_1534553979244_3757214_8-191390-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 22:36:29,625 WARNING [train.py:1204] (2/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_613-232856-0 from training. Duration: 0.54 2023-10-02 22:36:29,687 WARNING [train.py:1204] (2/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_557-349534-0 from training. Duration: 0.91 2023-10-02 22:36:31,460 WARNING [train.py:1204] (2/4) Exclude cut with ID _14224_210_4381_1_1530529062037_4264010_379-184223-0 from training. Duration: 0.85 2023-10-02 22:36:31,463 WARNING [train.py:1204] (2/4) Exclude cut with ID IC0125W0001-61807 from training. Duration: 0.966 2023-10-02 22:36:31,557 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_229-45300-0 from training. Duration: 0.57 2023-10-02 22:36:32,953 WARNING [train.py:1204] (2/4) Exclude cut with ID _30304_210_6319_1_1532476827261_7582060_1069-24743-0_sp1.1 from training. Duration: 0.92725 2023-10-02 22:36:34,343 INFO [train.py:1046] (2/4) Epoch 30, batch 2450, loss[loss=0.1731, simple_loss=0.2447, pruned_loss=0.05079, over 23710.00 frames. ], tot_loss[loss=0.1651, simple_loss=0.2426, pruned_loss=0.04379, over 4696633.51 frames. ], batch size: 149, lr: 3.41e-03, grad_scale: 16.0 2023-10-02 22:36:34,434 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_41452_210_7291_1_1533437687_3941281_323-172873-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 22:36:34,444 WARNING [train.py:1204] (2/4) Exclude cut with ID _37086_210_9038_1_1532999540891_7021250_802-226709-0_sp1.1 from training. Duration: 0.97275 2023-10-02 22:36:35,861 WARNING [train.py:1204] (2/4) Exclude cut with ID IC0144W0500-71767 from training. Duration: 0.931875 2023-10-02 22:36:37,244 WARNING [train.py:1204] (2/4) Exclude cut with ID _39846_210_12651_1_1533605310265_2115212_233-129699-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 22:36:38,511 WARNING [train.py:1204] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_574-45541-0_sp0.9 from training. Duration: 0.72225 2023-10-02 22:36:41,918 WARNING [train.py:1204] (2/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_372-298093-0_sp1.1 from training. Duration: 0.72725 2023-10-02 22:36:41,936 WARNING [train.py:1204] (2/4) Exclude cut with ID _62574_210_2655_1_1535180407058_3933249_34-26343-0_sp1.1 from training. Duration: 0.97275 2023-10-02 22:36:45,363 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_512-256238-0_sp1.1 from training. Duration: 0.963625 2023-10-02 22:36:45,368 WARNING [train.py:1204] (2/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_196-55502-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 22:36:46,718 WARNING [train.py:1204] (2/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_195-167011-0 from training. Duration: 0.75 2023-10-02 22:36:52,674 WARNING [train.py:1204] (2/4) Exclude cut with ID _13678_210_7497_1_1530530069226_3463170_345-230416-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 22:36:52,692 WARNING [train.py:1204] (2/4) Exclude cut with ID _51078_210_9070_1_1534161557424_3805250_225-327809-0_sp1.1 from training. Duration: 0.963625 2023-10-02 22:36:52,944 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=1043413.3333333334, ans=0.1 2023-10-02 22:36:54,350 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.bypass.scale_min, batch_count=1043413.3333333334, ans=0.2 2023-10-02 22:36:55,535 WARNING [train.py:1204] (2/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_18-189198-0_sp1.1 from training. Duration: 0.8181875 2023-10-02 22:36:55,547 WARNING [train.py:1204] (2/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_329-82235-0_sp1.1 from training. Duration: 0.7454375 2023-10-02 22:36:55,555 WARNING [train.py:1204] (2/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_726-270780-0_sp0.9 from training. Duration: 0.9555625 2023-10-02 22:36:55,599 WARNING [train.py:1204] (2/4) Exclude cut with ID _66873_210_5517_1_1535454136506_4293370_654-52713-0 from training. Duration: 0.77 2023-10-02 22:36:59,670 WARNING [train.py:1204] (2/4) Exclude cut with ID _20455_210_5219_1_1531565996775_8098100_191-16006-0_sp1.1 from training. Duration: 0.963625 2023-10-02 22:37:01,093 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_3171_210_2087_1_1526109084_2330007_66-260583-0_sp1.1 from training. Duration: 0.6545625 2023-10-02 22:37:02,863 WARNING [train.py:1204] (2/4) Exclude cut with ID _38670_210_15379_1_1533448810659_4391059_63-129006-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 22:37:03,107 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.balancer_ff2.min_abs, batch_count=1043480.0, ans=0.1 2023-10-02 22:37:05,671 WARNING [train.py:1204] (2/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_393-57888-0_sp0.9 from training. Duration: 0.711125 2023-10-02 22:37:05,707 WARNING [train.py:1204] (2/4) Exclude cut with ID _6275_210_2084_1_1528110062213_7669049_857-298942-0_sp0.9 from training. Duration: 0.9444375 2023-10-02 22:37:07,146 WARNING [train.py:1204] (2/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_239-22076-0_sp0.9 from training. Duration: 0.9444375 2023-10-02 22:37:07,185 WARNING [train.py:1204] (2/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_424-227947-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 22:37:10,433 WARNING [train.py:1204] (2/4) Exclude cut with ID _14069_210_5632_1_1530512692033_4803390_95-1896-0 from training. Duration: 0.98 2023-10-02 22:37:10,524 WARNING [train.py:1204] (2/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_96-244299-0_sp0.9 from training. Duration: 0.97775 2023-10-02 22:37:14,444 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.3.encoder.layers.2.self_attn1.whiten, num_groups=1, num_channels=512, metric=15.51 vs. limit=22.5 2023-10-02 22:37:17,964 WARNING [train.py:1204] (2/4) Exclude cut with ID _46683_210_9642_1_1533730913492_4591116_562-319825-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 22:37:19,396 WARNING [train.py:1204] (2/4) Exclude cut with ID _64639_210_5816_1_1535367619437_3528000_284-173474-0_sp1.1 from training. Duration: 0.963625 2023-10-02 22:37:20,661 WARNING [train.py:1204] (2/4) Exclude cut with ID _29529_210_7234_1_1532393961455_3977460_497-100133-0_sp1.1 from training. Duration: 0.936375 2023-10-02 22:37:20,697 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_12868_210_6753_1_1530701932_530275_18-76322-0_sp0.9 from training. Duration: 0.9666875 2023-10-02 22:37:20,734 WARNING [train.py:1204] (2/4) Exclude cut with ID _50093_210_4220_1_1534417141207_3432299_116-303447-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 22:37:22,480 WARNING [train.py:1204] (2/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_44-79427-0_sp1.1 from training. Duration: 0.8909375 2023-10-02 22:37:22,538 WARNING [train.py:1204] (2/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_887-249555-0 from training. Duration: 0.77 2023-10-02 22:37:24,681 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.1.encoder.layers.1.conv_module2.whiten, num_groups=1, num_channels=256, metric=11.91 vs. limit=15.0 2023-10-02 22:37:25,239 WARNING [train.py:1204] (2/4) Exclude cut with ID _28461_210_2253_1_1532771775189_3041299_96-152158-0_sp0.9 from training. Duration: 0.9444375 2023-10-02 22:37:26,606 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_14058_210_4163_1_1530690953_6327180_49-210687-0_sp1.1 from training. Duration: 0.8545625 2023-10-02 22:37:29,325 WARNING [train.py:1204] (2/4) Exclude cut with ID _40399_210_4353_1_1533305658194_3279406_351-127903-0_sp1.1 from training. Duration: 0.97275 2023-10-02 22:37:29,342 WARNING [train.py:1204] (2/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_184-206939-0_sp1.1 from training. Duration: 0.936375 2023-10-02 22:37:34,133 WARNING [train.py:1204] (2/4) Exclude cut with ID _14100_210_8341_1_1530529320613_3678069_258-224599-0_sp1.1 from training. Duration: 0.7 2023-10-02 22:37:34,151 WARNING [train.py:1204] (2/4) Exclude cut with ID _6493_210_1747_1_1528459210424_4012470_324-323854-0 from training. Duration: 0.92 2023-10-02 22:37:36,898 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_12868_210_6753_1_1530702479_3610564_151-325626-0_sp1.1 from training. Duration: 0.8818125 2023-10-02 22:37:36,967 WARNING [train.py:1204] (2/4) Exclude cut with ID _14528_210_8254_1_1530669700514_4306939_106-43145-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 22:37:36,988 WARNING [train.py:1204] (2/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_686-54383-0 from training. Duration: 0.87 2023-10-02 22:37:37,034 WARNING [train.py:1204] (2/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_801-102362-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 22:37:37,086 WARNING [train.py:1204] (2/4) Exclude cut with ID _48677_210_5663_1_1533902405919_3892989_408-40740-0_sp0.9 from training. Duration: 0.92225 2023-10-02 22:37:38,679 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.conv_module1.balancer1.prob, batch_count=1043613.3333333334, ans=0.125 2023-10-02 22:37:41,706 WARNING [train.py:1204] (2/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_448-212135-0_sp1.1 from training. Duration: 0.82725 2023-10-02 22:37:43,757 WARNING [train.py:1204] (2/4) Exclude cut with ID _59170_210_6721_1_1534983633960_4203335_320-124047-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 22:37:44,958 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_4692_210_3528_1_1526899755_4323254_107-44401-0_sp1.1 from training. Duration: 0.7818125 2023-10-02 22:37:47,862 WARNING [train.py:1204] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_731-2428-0 from training. Duration: 0.83 2023-10-02 22:37:47,940 WARNING [train.py:1204] (2/4) Exclude cut with ID _29857_210_20034_1_1532395886753_3798160_335-163562-0_sp1.1 from training. Duration: 0.77275 2023-10-02 22:37:49,178 INFO [train.py:1046] (2/4) Epoch 30, batch 2500, loss[loss=0.1603, simple_loss=0.2338, pruned_loss=0.04344, over 23727.00 frames. ], tot_loss[loss=0.1644, simple_loss=0.2414, pruned_loss=0.04365, over 4686661.34 frames. ], batch size: 164, lr: 3.41e-03, grad_scale: 16.0 2023-10-02 22:37:55,097 WARNING [train.py:1204] (2/4) Exclude cut with ID _14832_210_8750_1_1530691351539_3171348_171-275004-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 22:38:03,940 WARNING [train.py:1204] (2/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_105-328174-0_sp0.9 from training. Duration: 0.8444375 2023-10-02 22:38:05,288 WARNING [train.py:1204] (2/4) Exclude cut with ID _68965_210_1883_1_1535698962272_3497490_164-118421-0_sp1.1 from training. Duration: 0.936375 2023-10-02 22:38:05,363 WARNING [train.py:1204] (2/4) Exclude cut with ID _19896_210_1883_1_1531709022662_6649569_380-24170-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 22:38:05,367 WARNING [train.py:1204] (2/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_505-205270-0_sp1.1 from training. Duration: 0.3454375 2023-10-02 22:38:12,766 WARNING [train.py:1204] (2/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_427-208901-0_sp1.1 from training. Duration: 0.6181875 2023-10-02 22:38:12,818 WARNING [train.py:1204] (2/4) Exclude cut with ID _23818_210_6228_1_1531917063884_3676920_85-326916-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 22:38:12,998 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.conv_module2.balancer2.prob, batch_count=1043746.6666666666, ans=0.125 2023-10-02 22:38:13,984 WARNING [train.py:1204] (2/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_101-293367-0_sp1.1 from training. Duration: 0.57275 2023-10-02 22:38:14,002 WARNING [train.py:1204] (2/4) Exclude cut with ID _5409_210_1388_1_1527292850189_5865191_178-127970-0_sp1.1 from training. Duration: 0.5181875 2023-10-02 22:38:14,141 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.nonlin_attention.balancer.prob, batch_count=1043746.6666666666, ans=0.125 2023-10-02 22:38:15,354 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_403-39619-0 from training. Duration: 0.84 2023-10-02 22:38:15,449 WARNING [train.py:1204] (2/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_427-87841-0_sp1.1 from training. Duration: 0.963625 2023-10-02 22:38:15,737 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.bypass.skip_rate, batch_count=1043746.6666666666, ans=0.04949747468305833 2023-10-02 22:38:16,923 WARNING [train.py:1204] (2/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_286-68181-0_sp1.1 from training. Duration: 0.97275 2023-10-02 22:38:16,972 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6673_210_3273_1_1527767790_3173240_327-306185-0 from training. Duration: 0.83 2023-10-02 22:38:18,232 WARNING [train.py:1204] (2/4) Exclude cut with ID _3675_210_4557_1_1526639934408_4182089_106-203713-0_sp1.1 from training. Duration: 0.963625 2023-10-02 22:38:18,306 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_53071_210_16554_1_1534493434_1276796_130-36076-0 from training. Duration: 0.95 2023-10-02 22:38:18,336 WARNING [train.py:1204] (2/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_586-93054-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 22:38:24,322 WARNING [train.py:1204] (2/4) Exclude cut with ID _23825_210_13449_1_1531915313526_3497150_262-10571-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 22:38:25,739 WARNING [train.py:1204] (2/4) Exclude cut with ID _28783_210_7943_1_1532671164808_7155479_432-67885-0_sp1.1 from training. Duration: 0.97275 2023-10-02 22:38:26,021 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.self_attn_weights.pos_emb_skip_rate, batch_count=1043813.3333333334, ans=0.0 2023-10-02 22:38:28,564 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6287_210_3273_1_1527509506_2653716_182-199887-0_sp0.9 from training. Duration: 0.5666875 2023-10-02 22:38:28,618 WARNING [train.py:1204] (2/4) Exclude cut with ID _3231_210_2106_1_1526205864934_6784220_171-256004-0 from training. Duration: 0.86 2023-10-02 22:38:28,661 WARNING [train.py:1204] (2/4) Exclude cut with ID _69051_210_1531_1_1535885566901_3829538_332-92090-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 22:38:30,123 WARNING [train.py:1204] (2/4) Exclude cut with ID _3675_210_4557_1_1526639934408_4182089_104-324125-0_sp1.1 from training. Duration: 0.963625 2023-10-02 22:38:34,763 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_41764_210_2521_1_1533686110_4089313_501-2119-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 22:38:38,700 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_14056_210_4163_1_1530604483_7572827_56-100988-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 22:38:41,568 WARNING [train.py:1204] (2/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_398-239819-0_sp1.1 from training. Duration: 0.9 2023-10-02 22:38:46,966 WARNING [train.py:1204] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_74-48717-0_sp1.1 from training. Duration: 0.52725 2023-10-02 22:38:48,327 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.485e+02 1.895e+02 2.033e+02 2.318e+02 3.238e+02, threshold=4.066e+02, percent-clipped=0.0 2023-10-02 22:38:49,777 WARNING [train.py:1204] (2/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_177-49092-0 from training. Duration: 0.91 2023-10-02 22:38:49,807 WARNING [train.py:1204] (2/4) Exclude cut with ID _62337_210_12392_1_1535009273105_3412229_44-176353-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 22:38:49,817 WARNING [train.py:1204] (2/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_110-130961-0_sp1.1 from training. Duration: 0.42725 2023-10-02 22:38:50,080 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.feed_forward2.hidden_balancer.prob, batch_count=1043946.6666666666, ans=0.125 2023-10-02 22:38:51,252 WARNING [train.py:1204] (2/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_37-189262-0_sp1.1 from training. Duration: 0.8909375 2023-10-02 22:38:51,253 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_3172_210_2087_1_1526292929_3987865_197-97762-0_sp0.9 from training. Duration: 0.8333125 2023-10-02 22:38:53,820 WARNING [train.py:1204] (2/4) Exclude cut with ID IC0677W0065-334897 from training. Duration: 0.896 2023-10-02 22:38:53,821 WARNING [train.py:1204] (2/4) Exclude cut with ID IC0677W0080-334912 from training. Duration: 0.768 2023-10-02 22:38:53,834 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_120-41668-0 from training. Duration: 0.7 2023-10-02 22:38:56,568 WARNING [train.py:1204] (2/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_460-205287-0_sp1.1 from training. Duration: 0.963625 2023-10-02 22:38:58,554 WARNING [train.py:1204] (2/4) Exclude cut with ID _14632_210_9988_1_1530691279392_3923200_102-204477-0 from training. Duration: 0.97 2023-10-02 22:38:58,559 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_34815_210_6388_1_1533381320_3571903_279-305565-0 from training. Duration: 0.92 2023-10-02 22:38:59,107 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.3.encoder.layers.1.feed_forward3.out_whiten, num_groups=1, num_channels=512, metric=11.30 vs. limit=15.0 2023-10-02 22:39:00,169 WARNING [train.py:1204] (2/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_91-148831-0_sp1.1 from training. Duration: 0.9 2023-10-02 22:39:00,223 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6291_210_3273_1_1527683649_1078498_82-221196-0 from training. Duration: 0.76 2023-10-02 22:39:02,814 INFO [train.py:1046] (2/4) Epoch 30, batch 2550, loss[loss=0.1649, simple_loss=0.2363, pruned_loss=0.04674, over 23323.00 frames. ], tot_loss[loss=0.1645, simple_loss=0.2418, pruned_loss=0.04359, over 4694636.24 frames. ], batch size: 119, lr: 3.41e-03, grad_scale: 16.0 2023-10-02 22:39:04,808 WARNING [train.py:1204] (2/4) Exclude cut with ID _38020_210_5986_1_1533112252259_7006089_493-102745-0 from training. Duration: 0.98 2023-10-02 22:39:06,293 WARNING [train.py:1204] (2/4) Exclude cut with ID _61133_210_9969_1_1535196777227_3696409_40-93442-0_sp1.1 from training. Duration: 0.92725 2023-10-02 22:39:07,683 WARNING [train.py:1204] (2/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_45-258852-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 22:39:09,086 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_53071_210_16554_1_1534493434_1276796_130-36076-0_sp1.1 from training. Duration: 0.863625 2023-10-02 22:39:11,753 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8694_210_6400_1_1529065754_3260756_332-13553-0_sp1.1 from training. Duration: 0.92725 2023-10-02 22:39:11,831 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_405-339986-0 from training. Duration: 0.51 2023-10-02 22:39:13,103 WARNING [train.py:1204] (2/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_927-43827-0_sp1.1 from training. Duration: 0.67275 2023-10-02 22:39:18,383 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_397-147196-0 from training. Duration: 0.8 2023-10-02 22:39:18,494 WARNING [train.py:1204] (2/4) Exclude cut with ID 3557-8342-0013-54691-0_sp1.1 from training. Duration: 0.836375 2023-10-02 22:39:21,223 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_47438_210_5399_1_1533797471_4513356_626-127722-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 22:39:22,811 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.conv_module2.balancer1.min_positive, batch_count=1044080.0, ans=0.025 2023-10-02 22:39:23,919 WARNING [train.py:1204] (2/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_256-323883-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 22:39:23,929 WARNING [train.py:1204] (2/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_839-173017-0_sp0.9 from training. Duration: 0.4555625 2023-10-02 22:39:25,236 WARNING [train.py:1204] (2/4) Exclude cut with ID _5289_210_2084_1_1527678202736_3779299_392-261988-0_sp1.1 from training. Duration: 0.5909375 2023-10-02 22:39:25,281 WARNING [train.py:1204] (2/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_119-224384-0_sp1.1 from training. Duration: 0.936375 2023-10-02 22:39:25,312 WARNING [train.py:1204] (2/4) Exclude cut with ID _57523_210_1856_1_1534766294545_3732989_262-267933-0_sp1.1 from training. Duration: 0.963625 2023-10-02 22:39:26,847 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_35361_210_4238_1_1533262810_7793476_675-87012-0_sp0.9 from training. Duration: 0.988875 2023-10-02 22:39:26,867 WARNING [train.py:1204] (2/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_17-90541-0 from training. Duration: 0.94 2023-10-02 22:39:28,329 WARNING [train.py:1204] (2/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_535-14410-0_sp1.1 from training. Duration: 0.42725 2023-10-02 22:39:28,335 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_24673_210_6400_1_1531968633_778659_94-154511-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 22:39:28,346 WARNING [train.py:1204] (2/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_212-10501-0 from training. Duration: 0.94 2023-10-02 22:39:40,327 WARNING [train.py:1204] (2/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_383-285196-0_sp1.1 from training. Duration: 0.8818125 2023-10-02 22:39:43,255 WARNING [train.py:1204] (2/4) Exclude cut with ID _45560_210_21982_1_1533812603951_3584999_238-47020-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 22:39:44,889 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_21500_210_2054_1_1532149310_2716758_278-18053-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 22:39:44,896 WARNING [train.py:1204] (2/4) Exclude cut with ID _56235_210_10640_1_1534557335235_4161099_453-9626-0_sp1.1 from training. Duration: 0.936375 2023-10-02 22:39:46,322 WARNING [train.py:1204] (2/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_153-97441-0_sp1.1 from training. Duration: 0.5090625 2023-10-02 22:39:52,358 WARNING [train.py:1204] (2/4) Exclude cut with ID _67725_210_27767_1_1535608756671_5105359_431-120895-0_sp1.1 from training. Duration: 0.963625 2023-10-02 22:39:56,589 WARNING [train.py:1204] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_574-45541-0_sp1.1 from training. Duration: 0.5909375 2023-10-02 22:39:56,613 WARNING [train.py:1204] (2/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_162-103620-0_sp1.1 from training. Duration: 0.8454375 2023-10-02 22:39:56,631 WARNING [train.py:1204] (2/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_734-129761-0_sp1.1 from training. Duration: 0.7454375 2023-10-02 22:39:56,669 WARNING [train.py:1204] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_620-276793-0_sp1.1 from training. Duration: 0.536375 2023-10-02 22:39:57,907 WARNING [train.py:1204] (2/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_153-97441-0_sp0.9 from training. Duration: 0.62225 2023-10-02 22:39:59,742 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.conv_module2.whiten.whitening_limit, batch_count=1044213.3333333334, ans=15.0 2023-10-02 22:40:02,646 WARNING [train.py:1204] (2/4) Exclude cut with ID _12733_210_8341_1_1530167424556_3646380_523-85621-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 22:40:02,667 WARNING [train.py:1204] (2/4) Exclude cut with ID _64048_210_10626_1_1535198382802_3835179_352-179834-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 22:40:05,626 WARNING [train.py:1204] (2/4) Exclude cut with ID _55162_210_17071_1_1534408248583_3753480_43-242364-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 22:40:05,640 WARNING [train.py:1204] (2/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_661-264857-0 from training. Duration: 0.93 2023-10-02 22:40:05,646 WARNING [train.py:1204] (2/4) Exclude cut with ID _6274_210_2084_1_1527937583837_7494020_325-237800-0_sp1.1 from training. Duration: 0.97275 2023-10-02 22:40:05,695 WARNING [train.py:1204] (2/4) Exclude cut with ID _55328_210_27767_1_1534580973260_4088170_385-171459-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 22:40:07,002 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_3780_210_2087_1_1526467389_2145572_176-107140-0_sp1.1 from training. Duration: 0.62725 2023-10-02 22:40:08,743 WARNING [train.py:1204] (2/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_375-244976-0_sp1.1 from training. Duration: 0.6818125 2023-10-02 22:40:11,423 WARNING [train.py:1204] (2/4) Exclude cut with ID _35941_210_15379_1_1532995294515_4023089_309-33162-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 22:40:16,194 WARNING [train.py:1204] (2/4) Exclude cut with ID _14459_210_2228_1_1531443641946_7516700_1185-22038-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 22:40:17,447 INFO [train.py:1046] (2/4) Epoch 30, batch 2600, loss[loss=0.1774, simple_loss=0.2595, pruned_loss=0.04769, over 23968.00 frames. ], tot_loss[loss=0.165, simple_loss=0.2429, pruned_loss=0.04357, over 4703597.37 frames. ], batch size: 86, lr: 3.41e-03, grad_scale: 16.0 2023-10-02 22:40:19,335 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_1197-69460-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 22:40:22,081 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9012W0022-501157 from training. Duration: 0.896 2023-10-02 22:40:23,486 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9023W0009-506302 from training. Duration: 0.896 2023-10-02 22:40:23,503 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_2821_210_2715_1_1526108385_3763560_73-296673-0_sp1.1 from training. Duration: 0.7545625 2023-10-02 22:40:23,538 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9025W0113-507442 from training. Duration: 0.896 2023-10-02 22:40:24,873 WARNING [train.py:1204] (2/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_739-2098-0 from training. Duration: 0.92 2023-10-02 22:40:24,890 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9029W0332-509722 from training. Duration: 0.768 2023-10-02 22:40:27,682 WARNING [train.py:1204] (2/4) Exclude cut with ID _16908_210_5724_1_1531119157248_3772700_68-101953-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 22:40:29,095 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9042W0011-515617 from training. Duration: 0.768 2023-10-02 22:40:29,192 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_3019_210_2028_1_1526044245_3264865_191-72235-0 from training. Duration: 0.97 2023-10-02 22:40:31,103 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9052W0006-520792 from training. Duration: 0.896 2023-10-02 22:40:32,500 WARNING [train.py:1204] (2/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_245-174553-0_sp1.1 from training. Duration: 0.8 2023-10-02 22:40:33,947 WARNING [train.py:1204] (2/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_202-67017-0 from training. Duration: 0.82 2023-10-02 22:40:35,425 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.conv_skip_rate, batch_count=1044413.3333333334, ans=0.0 2023-10-02 22:40:36,556 WARNING [train.py:1204] (2/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_304-59227-0 from training. Duration: 0.77 2023-10-02 22:40:37,932 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_3133_210_2087_1_1526106446_2324690_246-125000-0_sp0.9 from training. Duration: 0.688875 2023-10-02 22:40:39,272 WARNING [train.py:1204] (2/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_243-42116-0 from training. Duration: 0.73 2023-10-02 22:40:41,266 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9087W0121-538492 from training. Duration: 0.896 2023-10-02 22:40:41,288 WARNING [train.py:1204] (2/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_120-76843-0 from training. Duration: 0.55 2023-10-02 22:40:47,513 WARNING [train.py:1204] (2/4) Exclude cut with ID _7852_210_5219_1_1528286471316_8139400_850-304621-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 22:40:47,534 WARNING [train.py:1204] (2/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_154-285415-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 22:40:47,545 WARNING [train.py:1204] (2/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_538-278194-0_sp1.1 from training. Duration: 0.92725 2023-10-02 22:40:47,546 WARNING [train.py:1204] (2/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_119-207162-0 from training. Duration: 0.71 2023-10-02 22:40:49,324 WARNING [train.py:1204] (2/4) Exclude cut with ID _2334_210_2667_1_1525608238880_4705129_459-104997-0_sp1.1 from training. Duration: 0.863625 2023-10-02 22:40:52,251 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9147W0019-567847 from training. Duration: 0.768 2023-10-02 22:40:59,305 WARNING [train.py:1204] (2/4) Exclude cut with ID _69884_210_9771_1_1535846392818_4166169_228-189439-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 22:40:59,329 WARNING [train.py:1204] (2/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_196-77172-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 22:41:00,700 WARNING [train.py:1204] (2/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_625-338546-0 from training. Duration: 0.88 2023-10-02 22:41:00,747 WARNING [train.py:1204] (2/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_193-327630-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 22:41:00,749 WARNING [train.py:1204] (2/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_391-24006-0_sp1.1 from training. Duration: 0.92725 2023-10-02 22:41:02,134 WARNING [train.py:1204] (2/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_197-28164-0 from training. Duration: 0.8 2023-10-02 22:41:02,905 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.2.encoder.layers.1.nonlin_attention.whiten2, num_groups=1, num_channels=384, metric=6.51 vs. limit=15.0 2023-10-02 22:41:04,879 WARNING [train.py:1204] (2/4) Exclude cut with ID _8395_210_1531_1_1528597871523_4411579_431-132470-0_sp1.1 from training. Duration: 0.67275 2023-10-02 22:41:04,909 WARNING [train.py:1204] (2/4) Exclude cut with ID _35350_210_16040_1_1533040191183_4075299_465-248326-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 22:41:07,672 WARNING [train.py:1204] (2/4) Exclude cut with ID _16756_210_4915_1_1531047803763_7104259_101-141922-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 22:41:11,075 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.bypass.scale_min, batch_count=1044546.6666666666, ans=0.2 2023-10-02 22:41:12,251 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9212W0005-600172 from training. Duration: 0.896 2023-10-02 22:41:12,273 WARNING [train.py:1204] (2/4) Exclude cut with ID _63186_210_2084_1_1535414479705_3908649_378-166820-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 22:41:12,286 WARNING [train.py:1204] (2/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_52-211683-0_sp1.1 from training. Duration: 0.8181875 2023-10-02 22:41:15,318 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.conv_module1.balancer1.prob, batch_count=1044613.3333333334, ans=0.125 2023-10-02 22:41:16,858 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.552e+02 1.854e+02 2.030e+02 2.260e+02 3.001e+02, threshold=4.060e+02, percent-clipped=0.0 2023-10-02 22:41:17,074 WARNING [train.py:1204] (2/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_1147-237729-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 22:41:20,171 WARNING [train.py:1204] (2/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_234-14174-0_sp0.9 from training. Duration: 0.911125 2023-10-02 22:41:20,191 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_454-276429-0 from training. Duration: 0.87 2023-10-02 22:41:20,265 WARNING [train.py:1204] (2/4) Exclude cut with ID _53713_210_24116_1_1534314487762_7416751_755-165313-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 22:41:23,039 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8278_210_2550_1_1528637099_4220413_436-317072-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 22:41:23,099 WARNING [train.py:1204] (2/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_28-324729-0_sp1.1 from training. Duration: 0.8545625 2023-10-02 22:41:28,655 WARNING [train.py:1204] (2/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_326-165900-0 from training. Duration: 0.69 2023-10-02 22:41:29,905 WARNING [train.py:1204] (2/4) Exclude cut with ID _53489_210_6228_1_1534298357169_3835970_96-30246-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 22:41:31,766 INFO [train.py:1046] (2/4) Epoch 30, batch 2650, loss[loss=0.1518, simple_loss=0.2393, pruned_loss=0.03218, over 24503.00 frames. ], tot_loss[loss=0.1655, simple_loss=0.2437, pruned_loss=0.04362, over 4719945.52 frames. ], batch size: 66, lr: 3.41e-03, grad_scale: 8.0 2023-10-02 22:41:31,829 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8275_210_4198_1_1528455320_3892571_491-17707-0_sp1.1 from training. Duration: 0.5818125 2023-10-02 22:41:34,913 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.bypass.skip_rate, batch_count=1044680.0, ans=0.07 2023-10-02 22:41:36,082 WARNING [train.py:1204] (2/4) Exclude cut with ID _14348_210_8341_1_1530599434996_3778640_240-232370-0 from training. Duration: 0.84 2023-10-02 22:41:36,095 WARNING [train.py:1204] (2/4) Exclude cut with ID _45012_210_12651_1_1533608435136_5371380_54-112623-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 22:41:37,443 WARNING [train.py:1204] (2/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_328-264776-0_sp1.1 from training. Duration: 0.6090625 2023-10-02 22:41:37,499 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9325W0001-648427 from training. Duration: 0.768 2023-10-02 22:41:37,515 WARNING [train.py:1204] (2/4) Exclude cut with ID _56480_210_4220_1_1534679942079_3075409_49-74717-0_sp1.1 from training. Duration: 0.936375 2023-10-02 22:41:40,675 WARNING [train.py:1204] (2/4) Exclude cut with ID _59500_210_8254_1_1534809610680_3736009_6-241869-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 22:41:42,088 WARNING [train.py:1204] (2/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_172-215932-0_sp0.9 from training. Duration: 0.7333125 2023-10-02 22:41:42,198 WARNING [train.py:1204] (2/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_17-90541-0_sp1.1 from training. Duration: 0.8545625 2023-10-02 22:41:44,917 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_43831_210_2158_1_1533725502_5872791_525-89447-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 22:41:46,329 WARNING [train.py:1204] (2/4) Exclude cut with ID _12302_210_5219_1_1529924446466_8107280_907-182949-0 from training. Duration: 0.92 2023-10-02 22:41:46,348 WARNING [train.py:1204] (2/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_402-102311-0_sp0.9 from training. Duration: 0.7444375 2023-10-02 22:41:46,543 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.conv_skip_rate, batch_count=1044746.6666666666, ans=0.0 2023-10-02 22:41:47,585 WARNING [train.py:1204] (2/4) Exclude cut with ID _8021_210_7994_1_1528376451358_3653069_179-45206-0_sp0.9 from training. Duration: 0.9555625 2023-10-02 22:41:49,790 WARNING [train.py:1204] (2/4) Exclude cut with ID _40137_210_12210_1_1533290484046_3795180_249-187059-0 from training. Duration: 0.88 2023-10-02 22:41:51,661 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9653W0194-675472 from training. Duration: 0.9658125 2023-10-02 22:41:54,404 WARNING [train.py:1204] (2/4) Exclude cut with ID _51332_210_9988_1_1534244371836_3801270_336-255604-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 22:41:57,088 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_20120_210_6753_1_1531643035_5482859_510-96996-0 from training. Duration: 0.87 2023-10-02 22:41:57,108 WARNING [train.py:1204] (2/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_1167-20437-0_sp1.1 from training. Duration: 0.92725 2023-10-02 22:41:57,159 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_3475_210_2028_1_1526193578_3622569_265-276625-0 from training. Duration: 0.76 2023-10-02 22:42:00,001 WARNING [train.py:1204] (2/4) Exclude cut with ID _14860_210_4915_1_1530702020340_7343240_470-172418-0_sp1.1 from training. Duration: 0.97275 2023-10-02 22:42:01,791 WARNING [train.py:1204] (2/4) Exclude cut with ID _5444_210_2151_1_1527418808464_5312210_581-315411-0_sp1.1 from training. Duration: 0.563625 2023-10-02 22:42:01,807 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_34450_210_9547_1_1533099443_4109050_519-342439-0_sp1.1 from training. Duration: 0.97275 2023-10-02 22:42:01,839 WARNING [train.py:1204] (2/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_50-175961-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 22:42:07,772 WARNING [train.py:1204] (2/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_372-6869-0 from training. Duration: 0.44 2023-10-02 22:42:07,785 WARNING [train.py:1204] (2/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_3-237434-0 from training. Duration: 0.76 2023-10-02 22:42:10,666 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.nonlin_attention.balancer.prob, batch_count=1044813.3333333334, ans=0.125 2023-10-02 22:42:11,826 WARNING [train.py:1204] (2/4) Exclude cut with ID _8772_210_1850_1_1528635595972_3664011_247-336448-0_sp1.1 from training. Duration: 0.67275 2023-10-02 22:42:13,510 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.conv_skip_rate, batch_count=1044813.3333333334, ans=0.0 2023-10-02 22:42:16,180 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_427-78097-0 from training. Duration: 0.8 2023-10-02 22:42:16,206 WARNING [train.py:1204] (2/4) Exclude cut with ID _69884_210_9771_1_1535846392818_4166169_466-252727-0_sp1.1 from training. Duration: 0.97275 2023-10-02 22:42:17,547 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_15972_210_6400_1_1530877934_3405692_316-35478-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 22:42:17,586 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_809-17156-0_sp0.9 from training. Duration: 0.811125 2023-10-02 22:42:18,859 WARNING [train.py:1204] (2/4) Exclude cut with ID _57572_210_5517_1_1534921219624_3809484_594-263474-0_sp1.1 from training. Duration: 0.936375 2023-10-02 22:42:18,899 WARNING [train.py:1204] (2/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_190-228204-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 22:42:20,268 WARNING [train.py:1204] (2/4) Exclude cut with ID _19514_210_9259_1_1531530071330_5084230_117-21193-0_sp1.1 from training. Duration: 0.936375 2023-10-02 22:42:22,366 WARNING [train.py:1204] (2/4) Exclude cut with ID _69584_210_5219_1_1535785133775_4044450_190-21315-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 22:42:23,776 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_53071_210_16554_1_1534493434_1276796_184-302050-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 22:42:24,444 WARNING [train.py:1204] (2/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_120-76843-0_sp1.1 from training. Duration: 0.5 2023-10-02 22:42:25,555 WARNING [train.py:1204] (2/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_318-162017-0_sp1.1 from training. Duration: 0.82725 2023-10-02 22:42:26,933 WARNING [train.py:1204] (2/4) Exclude cut with ID _46067_210_4220_1_1534125571066_3307450_79-46864-0_sp1.1 from training. Duration: 0.92725 2023-10-02 22:42:28,298 WARNING [train.py:1204] (2/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_358-215558-0_sp1.1 from training. Duration: 0.8454375 2023-10-02 22:42:29,602 WARNING [train.py:1204] (2/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_621-66764-0_sp1.1 from training. Duration: 0.92725 2023-10-02 22:42:29,714 WARNING [train.py:1204] (2/4) Exclude cut with ID _42863_210_9070_1_1533902398788_3696920_338-156851-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 22:42:29,736 WARNING [train.py:1204] (2/4) Exclude cut with ID _5289_210_2084_1_1527678202736_3779299_392-261988-0_sp0.9 from training. Duration: 0.72225 2023-10-02 22:42:33,873 WARNING [train.py:1204] (2/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_409-84682-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 22:42:33,951 WARNING [train.py:1204] (2/4) Exclude cut with ID _9235_210_5219_1_1528891154253_8046120_30-326681-0_sp0.9 from training. Duration: 0.92225 2023-10-02 22:42:33,955 WARNING [train.py:1204] (2/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_389-184695-0_sp1.1 from training. Duration: 0.92725 2023-10-02 22:42:35,360 WARNING [train.py:1204] (2/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_442-320317-0 from training. Duration: 0.82 2023-10-02 22:42:38,802 WARNING [train.py:1204] (2/4) Exclude cut with ID _44778_210_2054_1_1533798127203_7443090_756-29911-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 22:42:40,281 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_401-112036-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 22:42:42,103 WARNING [train.py:1204] (2/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_181-308827-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 22:42:43,425 WARNING [train.py:1204] (2/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_332-258725-0_sp1.1 from training. Duration: 0.963625 2023-10-02 22:42:44,786 WARNING [train.py:1204] (2/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_188-260011-0_sp0.9 from training. Duration: 0.811125 2023-10-02 22:42:44,837 WARNING [train.py:1204] (2/4) Exclude cut with ID _49039_210_6315_1_1533967537541_3846118_99-302239-0_sp1.1 from training. Duration: 0.963625 2023-10-02 22:42:45,377 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.4.encoder.layers.2.feed_forward3.out_whiten, num_groups=1, num_channels=384, metric=13.99 vs. limit=15.0 2023-10-02 22:42:46,101 INFO [train.py:1046] (2/4) Epoch 30, batch 2700, loss[loss=0.2112, simple_loss=0.279, pruned_loss=0.07166, over 19733.00 frames. ], tot_loss[loss=0.1665, simple_loss=0.2448, pruned_loss=0.04411, over 4714890.30 frames. ], batch size: 388, lr: 3.41e-03, grad_scale: 8.0 2023-10-02 22:42:47,547 WARNING [train.py:1204] (2/4) Exclude cut with ID _4122_210_2084_1_1526713164417_4353280_312-294828-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 22:42:47,553 WARNING [train.py:1204] (2/4) Exclude cut with ID _14922_210_6228_1_1530707435273_3693310_303-228738-0 from training. Duration: 0.84 2023-10-02 22:42:50,333 WARNING [train.py:1204] (2/4) Exclude cut with ID _14349_210_8341_1_1530615710818_3442470_115-285975-0_sp0.9 from training. Duration: 0.9555625 2023-10-02 22:42:52,194 WARNING [train.py:1204] (2/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_125-144174-0_sp1.1 from training. Duration: 0.3545625 2023-10-02 22:42:53,665 WARNING [train.py:1204] (2/4) Exclude cut with ID _53565_210_10635_1_1534337218335_3820259_351-54570-0_sp1.1 from training. Duration: 0.97275 2023-10-02 22:42:53,697 WARNING [train.py:1204] (2/4) Exclude cut with ID _28317_210_12742_1_1532480302381_8510325_410-40065-0_sp1.1 from training. Duration: 0.963625 2023-10-02 22:42:53,717 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_38965_210_4852_1_1533174906_3484735_167-339442-0_sp1.1 from training. Duration: 0.963625 2023-10-02 22:42:55,701 WARNING [train.py:1204] (2/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_265-164486-0_sp1.1 from training. Duration: 0.87275 2023-10-02 22:42:56,995 WARNING [train.py:1204] (2/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_725-182484-0_sp1.1 from training. Duration: 0.92725 2023-10-02 22:42:57,030 WARNING [train.py:1204] (2/4) Exclude cut with ID _28317_210_12742_1_1532480302381_8510325_607-213951-0_sp0.9 from training. Duration: 0.9333125 2023-10-02 22:42:57,044 WARNING [train.py:1204] (2/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_605-133395-0_sp1.1 from training. Duration: 0.6 2023-10-02 22:42:58,285 WARNING [train.py:1204] (2/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_351-294650-0 from training. Duration: 0.63 2023-10-02 22:42:58,345 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_14953_210_2151_1_1530701710_5616165_710-165400-0_sp1.1 from training. Duration: 0.7909375 2023-10-02 22:42:58,567 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.self_attn_weights.pos_emb_skip_rate, batch_count=1045013.3333333334, ans=0.0 2023-10-02 22:43:00,362 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.3.encoder.layers.1.self_attn1.whiten, num_groups=1, num_channels=512, metric=14.01 vs. limit=22.5 2023-10-02 22:43:01,002 WARNING [train.py:1204] (2/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_237-58433-0_sp1.1 from training. Duration: 0.67275 2023-10-02 22:43:01,097 WARNING [train.py:1204] (2/4) Exclude cut with ID _5190_210_4557_1_1527244368799_373439_28-197030-0_sp1.1 from training. Duration: 0.6545625 2023-10-02 22:43:02,487 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_743-327651-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 22:43:05,148 WARNING [train.py:1204] (2/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_76-203867-0_sp0.9 from training. Duration: 0.8 2023-10-02 22:43:06,477 WARNING [train.py:1204] (2/4) Exclude cut with ID _14603_210_4220_1_1530615688441_3097319_274-274798-0 from training. Duration: 0.98 2023-10-02 22:43:06,509 WARNING [train.py:1204] (2/4) Exclude cut with ID _14087_210_2253_1_1530529224512_3014029_226-242140-0_sp1.1 from training. Duration: 0.736375 2023-10-02 22:43:11,089 WARNING [train.py:1204] (2/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_352-59958-0_sp1.1 from training. Duration: 0.7818125 2023-10-02 22:43:11,094 WARNING [train.py:1204] (2/4) Exclude cut with ID _39411_210_5986_1_1533538825030_3343649_191-251819-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 22:43:17,342 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.feed_forward1.hidden_balancer.prob, batch_count=1045146.6666666666, ans=0.125 2023-10-02 22:43:18,502 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7046_210_6753_1_1527895337_6920917_642-106799-0_sp1.1 from training. Duration: 0.836375 2023-10-02 22:43:18,509 WARNING [train.py:1204] (2/4) Exclude cut with ID _22826_210_9994_1_1531908201522_3279169_412-3442-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 22:43:18,528 WARNING [train.py:1204] (2/4) Exclude cut with ID _60057_210_10640_1_1534903065793_3333150_171-181133-0_sp0.9 from training. Duration: 0.97775 2023-10-02 22:43:18,545 WARNING [train.py:1204] (2/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_154-135696-0_sp1.1 from training. Duration: 0.72725 2023-10-02 22:43:21,383 WARNING [train.py:1204] (2/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_148-186805-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 22:43:24,906 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.attention_skip_rate, batch_count=1045146.6666666666, ans=0.0 2023-10-02 22:43:26,018 WARNING [train.py:1204] (2/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_605-165641-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 22:43:26,024 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_174-277364-0_sp0.9 from training. Duration: 0.788875 2023-10-02 22:43:26,046 WARNING [train.py:1204] (2/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_89-321955-0_sp0.9 from training. Duration: 0.888875 2023-10-02 22:43:29,484 WARNING [train.py:1204] (2/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_438-105429-0_sp1.1 from training. Duration: 0.963625 2023-10-02 22:43:29,487 WARNING [train.py:1204] (2/4) Exclude cut with ID _14970_210_15709_1_1530773543493_1715730_187-20259-0_sp0.9 from training. Duration: 0.988875 2023-10-02 22:43:36,829 WARNING [train.py:1204] (2/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_385-193052-0_sp0.9 from training. Duration: 0.9555625 2023-10-02 22:43:36,890 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7113_210_1849_1_1527942685_3448768_194-311626-0_sp1.1 from training. Duration: 0.92725 2023-10-02 22:43:41,027 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_3780_210_2087_1_1526467389_2145572_176-107140-0_sp0.9 from training. Duration: 0.7666875 2023-10-02 22:43:41,029 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_36756_210_7234_1_1533026991_966276_123-213457-0_sp1.1 from training. Duration: 0.936375 2023-10-02 22:43:45,600 WARNING [train.py:1204] (2/4) Exclude cut with ID _23557_210_8750_1_1531890106370_6846255_456-244861-0_sp1.1 from training. Duration: 0.963625 2023-10-02 22:43:45,661 WARNING [train.py:1204] (2/4) Exclude cut with ID _37086_210_9038_1_1532999540891_7021250_242-349170-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 22:43:46,897 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.557e+02 1.911e+02 2.096e+02 2.404e+02 3.352e+02, threshold=4.192e+02, percent-clipped=0.0 2023-10-02 22:43:46,999 WARNING [train.py:1204] (2/4) Exclude cut with ID _41298_210_9019_1_1533430371450_4822143_471-281351-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 22:43:48,394 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_53378_210_17282_1_1534221071_5769509_572-126176-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 22:43:49,794 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_1507-263032-0_sp1.1 from training. Duration: 0.963625 2023-10-02 22:43:49,834 WARNING [train.py:1204] (2/4) Exclude cut with ID _13679_210_7497_1_1530534293662_3355098_411-186088-0_sp1.1 from training. Duration: 0.8545625 2023-10-02 22:43:52,506 WARNING [train.py:1204] (2/4) Exclude cut with ID _29345_210_4381_1_1532689168263_3860169_149-257268-0_sp1.1 from training. Duration: 0.736375 2023-10-02 22:43:54,403 WARNING [train.py:1204] (2/4) Exclude cut with ID _47790_210_16187_1_1533862814066_4207519_381-252014-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 22:43:54,416 WARNING [train.py:1204] (2/4) Exclude cut with ID _35799_210_5854_1_1533263487887_3529071_512-2717-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 22:43:58,029 WARNING [train.py:1204] (2/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_505-205270-0_sp0.9 from training. Duration: 0.42225 2023-10-02 22:43:58,289 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=1045280.0, ans=0.1 2023-10-02 22:43:59,326 WARNING [train.py:1204] (2/4) Exclude cut with ID _14998_210_5219_1_1530705582080_7474970_731-20117-0_sp1.1 from training. Duration: 0.936375 2023-10-02 22:44:00,638 INFO [train.py:1046] (2/4) Epoch 30, batch 2750, loss[loss=0.1483, simple_loss=0.2111, pruned_loss=0.04281, over 22732.00 frames. ], tot_loss[loss=0.1659, simple_loss=0.2439, pruned_loss=0.04392, over 4725042.38 frames. ], batch size: 322, lr: 3.40e-03, grad_scale: 8.0 2023-10-02 22:44:00,751 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_37351_210_7925_1_1533012931_3812113_265-85780-0_sp1.1 from training. Duration: 0.8 2023-10-02 22:44:00,757 WARNING [train.py:1204] (2/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_313-134930-0 from training. Duration: 0.97 2023-10-02 22:44:03,470 WARNING [train.py:1204] (2/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_374-138971-0 from training. Duration: 0.94 2023-10-02 22:44:03,511 WARNING [train.py:1204] (2/4) Exclude cut with ID _30304_210_6319_1_1532476827261_7582060_1227-287886-0_sp1.1 from training. Duration: 0.936375 2023-10-02 22:44:04,965 WARNING [train.py:1204] (2/4) Exclude cut with ID _59493_210_17282_1_1535078419077_3225879_332-217544-0_sp1.1 from training. Duration: 0.97275 2023-10-02 22:44:05,004 WARNING [train.py:1204] (2/4) Exclude cut with ID _23514_210_1389_1_1531832139785_3031439_361-119640-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 22:44:08,224 WARNING [train.py:1204] (2/4) Exclude cut with ID _69584_210_5219_1_1535785133775_4044450_142-118058-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 22:44:09,501 WARNING [train.py:1204] (2/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_178-177582-0_sp1.1 from training. Duration: 0.67275 2023-10-02 22:44:09,553 WARNING [train.py:1204] (2/4) Exclude cut with ID _25303_210_17519_1_1532050207576_3635970_41-195572-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 22:44:11,010 WARNING [train.py:1204] (2/4) Exclude cut with ID _55328_210_27767_1_1534580973260_4088170_329-341322-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 22:44:12,358 WARNING [train.py:1204] (2/4) Exclude cut with ID _5608_210_2106_1_1527415439089_6819768_909-82767-0_sp1.1 from training. Duration: 0.5545625 2023-10-02 22:44:12,393 WARNING [train.py:1204] (2/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_261-297503-0_sp1.1 from training. Duration: 0.9 2023-10-02 22:44:12,408 WARNING [train.py:1204] (2/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_459-291706-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 22:44:12,409 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7762_210_1794_1_1528199508_4057408_422-227034-0 from training. Duration: 0.73 2023-10-02 22:44:12,425 WARNING [train.py:1204] (2/4) Exclude cut with ID _3067_210_2667_1_1526212974640_4503168_250-285937-0_sp0.9 from training. Duration: 0.888875 2023-10-02 22:44:12,447 WARNING [train.py:1204] (2/4) Exclude cut with ID _66873_210_5517_1_1535454136506_4293370_332-253745-0_sp1.1 from training. Duration: 0.936375 2023-10-02 22:44:18,502 WARNING [train.py:1204] (2/4) Exclude cut with ID _13938_210_9988_1_1530518718978_3693960_304-112784-0 from training. Duration: 0.46 2023-10-02 22:44:18,636 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.self_attn_weights.pos_emb_skip_rate, batch_count=1045413.3333333334, ans=0.0 2023-10-02 22:44:21,249 WARNING [train.py:1204] (2/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_226-267957-0_sp1.1 from training. Duration: 0.92725 2023-10-02 22:44:21,290 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_41822_210_13270_1_1533955976_4379284_608-254567-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 22:44:21,354 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_124-113562-0_sp1.1 from training. Duration: 0.8545625 2023-10-02 22:44:22,651 WARNING [train.py:1204] (2/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_117-290916-0_sp1.1 from training. Duration: 0.463625 2023-10-02 22:44:22,730 WARNING [train.py:1204] (2/4) Exclude cut with ID _28427_210_4220_1_1532581240967_3061069_102-66533-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 22:44:25,990 WARNING [train.py:1204] (2/4) Exclude cut with ID _55383_210_10635_1_1534468426172_2231669_134-128246-0_sp1.1 from training. Duration: 0.8090625 2023-10-02 22:44:26,048 WARNING [train.py:1204] (2/4) Exclude cut with ID _45871_210_5665_1_1533864614245_3296060_49-308316-0_sp1.1 from training. Duration: 0.97275 2023-10-02 22:44:26,087 WARNING [train.py:1204] (2/4) Exclude cut with ID _57572_210_5517_1_1534921219624_3809484_176-199205-0_sp1.1 from training. Duration: 0.97275 2023-10-02 22:44:28,717 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.1.encoder.layers.0.self_attn2.whiten, num_groups=1, num_channels=256, metric=10.66 vs. limit=22.5 2023-10-02 22:44:30,785 WARNING [train.py:1204] (2/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_45-265684-0_sp1.1 from training. Duration: 0.6545625 2023-10-02 22:44:30,804 WARNING [train.py:1204] (2/4) Exclude cut with ID _15150_210_8341_1_1530752411803_7256384_469-239509-0_sp0.9 from training. Duration: 0.7555625 2023-10-02 22:44:30,854 WARNING [train.py:1204] (2/4) Exclude cut with ID _28461_210_2253_1_1532771775189_3041299_136-270644-0_sp1.1 from training. Duration: 0.8454375 2023-10-02 22:44:32,182 WARNING [train.py:1204] (2/4) Exclude cut with ID _69584_210_5219_1_1535785133775_4044450_302-47302-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 22:44:32,279 WARNING [train.py:1204] (2/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_166-128941-0_sp1.1 from training. Duration: 0.4909375 2023-10-02 22:44:39,390 WARNING [train.py:1204] (2/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_559-120504-0_sp1.1 from training. Duration: 0.97275 2023-10-02 22:44:42,002 WARNING [train.py:1204] (2/4) Exclude cut with ID _6456_210_1534_1_1527681634212_3181049_345-252877-0_sp1.1 from training. Duration: 0.5818125 2023-10-02 22:44:42,039 WARNING [train.py:1204] (2/4) Exclude cut with ID _28317_210_12742_1_1532480302381_8510325_902-233003-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 22:44:46,233 WARNING [train.py:1204] (2/4) Exclude cut with ID _36538_210_9994_1_1533204020363_3795620_204-261385-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 22:44:46,243 WARNING [train.py:1204] (2/4) Exclude cut with ID _14821_210_8750_1_1530680503991_6829359_189-297451-0_sp1.1 from training. Duration: 0.5 2023-10-02 22:44:47,544 WARNING [train.py:1204] (2/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_1211-226066-0_sp0.9 from training. Duration: 0.8444375 2023-10-02 22:44:52,230 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6786_210_1388_1_1527839699_4194495_273-149841-0_sp1.1 from training. Duration: 0.836375 2023-10-02 22:44:52,268 WARNING [train.py:1204] (2/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_380-162648-0_sp1.1 from training. Duration: 0.8545625 2023-10-02 22:44:52,268 WARNING [train.py:1204] (2/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_494-199538-0 from training. Duration: 0.75 2023-10-02 22:44:56,980 WARNING [train.py:1204] (2/4) Exclude cut with ID _49097_210_16049_1_1533952780207_4360979_276-87053-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 22:44:57,188 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.self_attn_weights.pos_emb_skip_rate, batch_count=1045546.6666666666, ans=0.0 2023-10-02 22:44:58,260 WARNING [train.py:1204] (2/4) Exclude cut with ID _5444_210_2151_1_1527418808464_5312210_726-27165-0 from training. Duration: 0.67 2023-10-02 22:45:01,282 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_128-202104-0_sp1.1 from training. Duration: 0.57275 2023-10-02 22:45:05,793 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_11349_210_6753_1_1529800861_1101207_26-65602-0_sp1.1 from training. Duration: 0.87275 2023-10-02 22:45:05,830 WARNING [train.py:1204] (2/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_159-328157-0 from training. Duration: 0.52 2023-10-02 22:45:07,135 WARNING [train.py:1204] (2/4) Exclude cut with ID _14069_210_5632_1_1530512692033_4803390_98-21934-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 22:45:08,602 WARNING [train.py:1204] (2/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_352-59958-0_sp0.9 from training. Duration: 0.9555625 2023-10-02 22:45:09,948 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6163_210_6400_1_1527506746_4263068_283-137811-0 from training. Duration: 0.77 2023-10-02 22:45:09,976 WARNING [train.py:1204] (2/4) Exclude cut with ID _21989_210_5854_1_1531899214931_3271360_133-44759-0_sp1.1 from training. Duration: 0.7 2023-10-02 22:45:12,742 WARNING [train.py:1204] (2/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_21-174514-0_sp0.9 from training. Duration: 0.47775 2023-10-02 22:45:12,776 WARNING [train.py:1204] (2/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_201-78811-0_sp1.1 from training. Duration: 0.963625 2023-10-02 22:45:12,803 WARNING [train.py:1204] (2/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_537-164630-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 22:45:14,071 INFO [train.py:1046] (2/4) Epoch 30, batch 2800, loss[loss=0.1609, simple_loss=0.2357, pruned_loss=0.04302, over 23540.00 frames. ], tot_loss[loss=0.1649, simple_loss=0.2423, pruned_loss=0.04378, over 4720322.75 frames. ], batch size: 120, lr: 3.40e-03, grad_scale: 16.0 2023-10-02 22:45:14,145 WARNING [train.py:1204] (2/4) Exclude cut with ID _14087_210_2253_1_1530529224512_3014029_156-257607-0 from training. Duration: 0.83 2023-10-02 22:45:14,156 WARNING [train.py:1204] (2/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_308-154450-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 22:45:14,204 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_45399_210_4852_1_1533624725_7579737_214-240574-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 22:45:15,616 WARNING [train.py:1204] (2/4) Exclude cut with ID _29303_210_2575_1_1532392237303_7410649_615-305517-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 22:45:17,075 WARNING [train.py:1204] (2/4) Exclude cut with ID IC0125W0002-61808 from training. Duration: 0.708 2023-10-02 22:45:17,076 WARNING [train.py:1204] (2/4) Exclude cut with ID IC0125W0017-61823 from training. Duration: 0.759 2023-10-02 22:45:21,106 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.4.encoder.layers.0.self_attn1.whiten, num_groups=1, num_channels=384, metric=14.49 vs. limit=22.5 2023-10-02 22:45:21,808 WARNING [train.py:1204] (2/4) Exclude cut with ID _53219_210_4525_1_1534834836008_4064825_4-145291-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 22:45:23,217 WARNING [train.py:1204] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_185-174805-0_sp1.1 from training. Duration: 0.6181875 2023-10-02 22:45:23,239 WARNING [train.py:1204] (2/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_635-193046-0_sp1.1 from training. Duration: 0.92725 2023-10-02 22:45:28,472 WARNING [train.py:1204] (2/4) Exclude cut with ID _46067_210_4220_1_1534125571066_3307450_321-258866-0_sp1.1 from training. Duration: 0.97275 2023-10-02 22:45:29,896 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8558_210_1410_1_1528505225_7613406_131-329524-0 from training. Duration: 0.79 2023-10-02 22:45:31,010 INFO [scaling.py:1022] (2/4) Whitening: name=encoder_embed.convnext.out_whiten, num_groups=1, num_channels=128, metric=4.52 vs. limit=5.0 2023-10-02 22:45:31,284 WARNING [train.py:1204] (2/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_847-41334-0_sp0.9 from training. Duration: 0.488875 2023-10-02 22:45:33,246 WARNING [train.py:1204] (2/4) Exclude cut with ID _14227_210_4381_1_1530701804286_3940259_125-275642-0 from training. Duration: 0.99 2023-10-02 22:45:34,644 WARNING [train.py:1204] (2/4) Exclude cut with ID _25248_210_8750_1_1532062825941_5554180_248-177255-0_sp1.1 from training. Duration: 0.963625 2023-10-02 22:45:34,692 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_40047_210_2290_1_1533290519_2374311_195-165693-0_sp1.1 from training. Duration: 0.8090625 2023-10-02 22:45:34,701 WARNING [train.py:1204] (2/4) Exclude cut with ID _23109_210_4381_1_1532150828777_901949_29-258672-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 22:45:38,885 WARNING [train.py:1204] (2/4) Exclude cut with ID _14821_210_8750_1_1530680503991_6829359_2-227151-0_sp1.1 from training. Duration: 0.7545625 2023-10-02 22:45:39,517 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.2.encoder.layers.2.nonlin_attention.whiten2, num_groups=1, num_channels=384, metric=4.93 vs. limit=15.0 2023-10-02 22:45:40,195 WARNING [train.py:1204] (2/4) Exclude cut with ID _49643_210_7989_1_1534640244224_3939031_565-53982-0_sp1.1 from training. Duration: 0.963625 2023-10-02 22:45:40,205 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7544_210_6753_1_1528591327_1471426_33-346031-0_sp0.9 from training. Duration: 0.67775 2023-10-02 22:45:41,530 WARNING [train.py:1204] (2/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_95-61782-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 22:45:44,901 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.5.encoder.layers.1.conv_module2.whiten, num_groups=1, num_channels=256, metric=4.84 vs. limit=15.0 2023-10-02 22:45:48,488 WARNING [train.py:1204] (2/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_686-54383-0_sp0.9 from training. Duration: 0.9666875 2023-10-02 22:45:51,740 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_14056_210_4163_1_1530604483_7572827_505-276823-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 22:45:54,427 WARNING [train.py:1204] (2/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_745-128443-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 22:45:54,484 WARNING [train.py:1204] (2/4) Exclude cut with ID _3844_210_1390_1_1526727803582_2871209_20-251664-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 22:45:55,834 WARNING [train.py:1204] (2/4) Exclude cut with ID _36538_210_9994_1_1533204020363_3795620_487-164628-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 22:45:59,470 WARNING [train.py:1204] (2/4) Exclude cut with ID _5166_210_2084_1_1527073197160_4087993_309-222545-0_sp1.1 from training. Duration: 0.763625 2023-10-02 22:45:59,500 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_3531_210_3528_1_1526294203_5216456_309-227302-0 from training. Duration: 0.69 2023-10-02 22:46:00,899 WARNING [train.py:1204] (2/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_42-192460-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 22:46:02,690 WARNING [train.py:1204] (2/4) Exclude cut with ID _22083_210_8254_1_1531702862439_3678970_49-234546-0_sp1.1 from training. Duration: 0.7545625 2023-10-02 22:46:02,695 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_427-78097-0_sp0.9 from training. Duration: 0.888875 2023-10-02 22:46:06,816 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_43802_210_2290_1_1533516203_8829504_378-205254-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 22:46:06,861 WARNING [train.py:1204] (2/4) Exclude cut with ID _45329_210_7005_1_1534157951211_6774339_672-314830-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 22:46:08,547 INFO [scaling.py:1118] (2/4) WithLoss: name=encoder.encoders.2.encoder.layers.2.self_attn_weights, loss-sum=0.000e+00 2023-10-02 22:46:09,689 WARNING [train.py:1204] (2/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_90-30673-0_sp1.1 from training. Duration: 0.763625 2023-10-02 22:46:12,378 WARNING [train.py:1204] (2/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_563-329943-0_sp1.1 from training. Duration: 0.9 2023-10-02 22:46:13,711 WARNING [train.py:1204] (2/4) Exclude cut with ID _21632_210_2253_1_1531994001765_3744140_154-84090-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 22:46:13,712 WARNING [train.py:1204] (2/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_103-336064-0_sp0.9 from training. Duration: 0.5666875 2023-10-02 22:46:13,751 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_43-71989-0_sp0.9 from training. Duration: 0.6333125 2023-10-02 22:46:13,807 WARNING [train.py:1204] (2/4) Exclude cut with ID _3867_210_1883_1_1526641200575_4475830_319-349850-0_sp0.9 from training. Duration: 0.7444375 2023-10-02 22:46:14,907 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.610e+02 1.953e+02 2.195e+02 2.519e+02 3.830e+02, threshold=4.390e+02, percent-clipped=0.0 2023-10-02 22:46:14,991 WARNING [train.py:1204] (2/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_629-179454-0_sp1.1 from training. Duration: 0.963625 2023-10-02 22:46:15,007 WARNING [train.py:1204] (2/4) Exclude cut with ID _65263_210_22356_1_1535763598691_3921037_49-222917-0 from training. Duration: 0.92 2023-10-02 22:46:15,035 WARNING [train.py:1204] (2/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_602-331772-0_sp1.1 from training. Duration: 0.936375 2023-10-02 22:46:17,687 WARNING [train.py:1204] (2/4) Exclude cut with ID _58414_210_8254_1_1535421609162_7151182_292-134978-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 22:46:17,720 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_38793_210_2544_1_1533176114_3849320_200-299292-0_sp1.1 from training. Duration: 0.936375 2023-10-02 22:46:19,145 WARNING [train.py:1204] (2/4) Exclude cut with ID _14970_210_15709_1_1530775547361_1966965_6-23328-0 from training. Duration: 0.86 2023-10-02 22:46:20,600 WARNING [train.py:1204] (2/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_386-225511-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 22:46:20,611 WARNING [train.py:1204] (2/4) Exclude cut with ID _38323_210_12108_1_1533121233369_3303049_416-11919-0_sp1.1 from training. Duration: 0.87275 2023-10-02 22:46:20,656 WARNING [train.py:1204] (2/4) Exclude cut with ID _4866_210_2577_1_1527404412284_4364659_256-306830-0_sp1.1 from training. Duration: 0.7909375 2023-10-02 22:46:22,155 WARNING [train.py:1204] (2/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_53-300544-0 from training. Duration: 0.67 2023-10-02 22:46:28,637 INFO [train.py:1046] (2/4) Epoch 30, batch 2850, loss[loss=0.1412, simple_loss=0.1907, pruned_loss=0.0458, over 19108.00 frames. ], tot_loss[loss=0.1642, simple_loss=0.2415, pruned_loss=0.04344, over 4718448.01 frames. ], batch size: 388, lr: 3.40e-03, grad_scale: 16.0 2023-10-02 22:46:28,777 WARNING [train.py:1204] (2/4) Exclude cut with ID _41314_210_12651_1_1533459476543_3766460_80-74184-0_sp1.1 from training. Duration: 0.7545625 2023-10-02 22:46:30,073 WARNING [train.py:1204] (2/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_332-256168-0_sp1.1 from training. Duration: 0.6818125 2023-10-02 22:46:30,120 WARNING [train.py:1204] (2/4) Exclude cut with ID _48836_210_12210_1_1534075848293_3904150_429-35475-0_sp1.1 from training. Duration: 0.8090625 2023-10-02 22:46:32,070 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_144-135833-0_sp1.1 from training. Duration: 0.97275 2023-10-02 22:46:35,376 WARNING [train.py:1204] (2/4) Exclude cut with ID _14224_210_4381_1_1530529062037_4264010_380-343670-0_sp0.9 from training. Duration: 0.97775 2023-10-02 22:46:35,397 WARNING [train.py:1204] (2/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_213-265174-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 22:46:35,427 WARNING [train.py:1204] (2/4) Exclude cut with ID _20455_210_5219_1_1531565996775_8098100_86-314283-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 22:46:38,235 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_41750_210_1960_1_1533520678_3948042_68-223813-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 22:46:39,612 WARNING [train.py:1204] (2/4) Exclude cut with ID _55153_210_2253_1_1534384786677_3162096_24-198429-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 22:46:42,381 WARNING [train.py:1204] (2/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_119-207162-0_sp0.9 from training. Duration: 0.788875 2023-10-02 22:46:42,443 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_12868_210_6753_1_1530702479_3610564_148-172861-0 from training. Duration: 0.5 2023-10-02 22:46:49,308 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_5522_210_3273_1_1527249606_3909096_318-203153-0 from training. Duration: 0.43 2023-10-02 22:46:49,315 WARNING [train.py:1204] (2/4) Exclude cut with ID _14884_210_2252_1_1530707459689_5561250_288-189949-0_sp1.1 from training. Duration: 0.97275 2023-10-02 22:46:50,807 WARNING [train.py:1204] (2/4) Exclude cut with ID _5294_210_1747_1_1527249643928_3696179_529-242298-0 from training. Duration: 0.95 2023-10-02 22:46:50,879 WARNING [train.py:1204] (2/4) Exclude cut with ID _36538_210_9994_1_1533204020363_3795620_503-110289-0_sp1.1 from training. Duration: 0.936375 2023-10-02 22:46:53,613 WARNING [train.py:1204] (2/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_385-193052-0 from training. Duration: 0.86 2023-10-02 22:46:54,997 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_456-22141-0 from training. Duration: 0.88 2023-10-02 22:46:56,816 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_38965_210_4852_1_1533174906_3484735_284-117840-0_sp1.1 from training. Duration: 0.936375 2023-10-02 22:47:07,690 WARNING [train.py:1204] (2/4) Exclude cut with ID _32704_210_8954_1_1533017488840_6747370_75-201625-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 22:47:09,105 WARNING [train.py:1204] (2/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_255-312654-0_sp1.1 from training. Duration: 0.82725 2023-10-02 22:47:09,117 WARNING [train.py:1204] (2/4) Exclude cut with ID _47251_210_18628_1_1533814063128_4137250_214-88076-0_sp0.9 from training. Duration: 0.97775 2023-10-02 22:47:09,192 WARNING [train.py:1204] (2/4) Exclude cut with ID _4092_210_2151_1_1526643044449_3135759_227-213972-0_sp1.1 from training. Duration: 0.4909375 2023-10-02 22:47:10,589 WARNING [train.py:1204] (2/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_912-47473-0_sp1.1 from training. Duration: 0.6181875 2023-10-02 22:47:10,610 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6767_210_3273_1_1527855783_1718932_173-256601-0_sp1.1 from training. Duration: 0.72725 2023-10-02 22:47:11,513 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.1.encoder.layers.0.conv_module2.whiten, num_groups=1, num_channels=256, metric=8.95 vs. limit=15.0 2023-10-02 22:47:12,005 WARNING [train.py:1204] (2/4) Exclude cut with ID _6274_210_2084_1_1527937583837_7494020_32-205288-0_sp1.1 from training. Duration: 0.7090625 2023-10-02 22:47:12,029 WARNING [train.py:1204] (2/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_39-248870-0 from training. Duration: 0.69 2023-10-02 22:47:14,761 WARNING [train.py:1204] (2/4) Exclude cut with ID _3541_210_2477_1_1526475408709_3263973_377-296839-0_sp1.1 from training. Duration: 0.5 2023-10-02 22:47:14,783 WARNING [train.py:1204] (2/4) Exclude cut with ID _13679_210_7497_1_1530534293662_3355098_413-56154-0_sp1.1 from training. Duration: 0.963625 2023-10-02 22:47:16,058 WARNING [train.py:1204] (2/4) Exclude cut with ID _14970_210_15709_1_1530775547361_1966965_201-84544-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 22:47:17,450 WARNING [train.py:1204] (2/4) Exclude cut with ID _21783_210_11357_1_1531742458730_3762679_105-95775-0_sp1.1 from training. Duration: 0.936375 2023-10-02 22:47:18,812 WARNING [train.py:1204] (2/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_131-191669-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 22:47:18,837 WARNING [train.py:1204] (2/4) Exclude cut with ID _14860_210_4915_1_1530702020340_7343240_695-4323-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 22:47:19,532 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.3.encoder.layers.1.self_attn1.whiten, num_groups=1, num_channels=512, metric=14.71 vs. limit=22.5 2023-10-02 22:47:20,229 WARNING [train.py:1204] (2/4) Exclude cut with ID _39867_210_9826_1_1533553222030_9344259_991-130565-0_sp1.1 from training. Duration: 0.97275 2023-10-02 22:47:24,318 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_50-3289-0_sp1.1 from training. Duration: 0.82725 2023-10-02 22:47:24,442 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_54-347670-0_sp1.1 from training. Duration: 0.92725 2023-10-02 22:47:25,770 WARNING [train.py:1204] (2/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_200-153445-0_sp1.1 from training. Duration: 0.936375 2023-10-02 22:47:27,828 WARNING [train.py:1204] (2/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_279-302911-0_sp1.1 from training. Duration: 0.97275 2023-10-02 22:47:29,350 WARNING [train.py:1204] (2/4) Exclude cut with ID _14886_210_4220_1_1530702020355_3516190_159-73748-0_sp0.9 from training. Duration: 0.92225 2023-10-02 22:47:34,563 WARNING [train.py:1204] (2/4) Exclude cut with ID _53379_210_20034_1_1534222027363_3854654_23-222715-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 22:47:36,507 WARNING [train.py:1204] (2/4) Exclude cut with ID _12555_210_8750_1_1530075594402_3780239_449-267986-0 from training. Duration: 0.83 2023-10-02 22:47:36,524 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7544_210_6753_1_1528587722_3576686_295-258479-0 from training. Duration: 0.84 2023-10-02 22:47:39,322 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7002_210_1794_1_1527935634_4034444_487-236501-0_sp0.9 from training. Duration: 0.7333125 2023-10-02 22:47:40,671 WARNING [train.py:1204] (2/4) Exclude cut with ID _21783_210_11357_1_1531742458730_3762679_244-19107-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 22:47:40,691 WARNING [train.py:1204] (2/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_390-339137-0 from training. Duration: 0.56 2023-10-02 22:47:40,740 WARNING [train.py:1204] (2/4) Exclude cut with ID _7562_210_2084_1_1528887767916_3946229_186-326422-0_sp1.1 from training. Duration: 0.763625 2023-10-02 22:47:41,909 WARNING [train.py:1204] (2/4) Exclude cut with ID _33640_210_2655_1_1533117605479_4855469_645-145235-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 22:47:41,925 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_25767_210_2161_1_1532307483_3867748_506-171777-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 22:47:41,951 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_5438_210_1794_1_1527336687_2600009_233-58072-0_sp1.1 from training. Duration: 0.7 2023-10-02 22:47:41,951 WARNING [train.py:1204] (2/4) Exclude cut with ID IC0669W0063-330908 from training. Duration: 0.896 2023-10-02 22:47:41,983 WARNING [train.py:1204] (2/4) Exclude cut with ID IC0671W0070-331913 from training. Duration: 0.896 2023-10-02 22:47:41,989 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_258-310570-0_sp1.1 from training. Duration: 0.7454375 2023-10-02 22:47:43,296 INFO [train.py:1046] (2/4) Epoch 30, batch 2900, loss[loss=0.145, simple_loss=0.2258, pruned_loss=0.03209, over 24339.00 frames. ], tot_loss[loss=0.1639, simple_loss=0.2413, pruned_loss=0.04328, over 4714792.23 frames. ], batch size: 61, lr: 3.40e-03, grad_scale: 8.0 2023-10-02 22:47:43,368 WARNING [train.py:1204] (2/4) Exclude cut with ID _9461_210_4557_1_1529060514726_3677451_162-177507-0_sp1.1 from training. Duration: 0.97275 2023-10-02 22:47:47,490 WARNING [train.py:1204] (2/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_26-175553-0_sp0.9 from training. Duration: 0.67775 2023-10-02 22:47:47,514 WARNING [train.py:1204] (2/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_7-150481-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 22:47:47,570 WARNING [train.py:1204] (2/4) Exclude cut with ID _23557_210_8750_1_1531890106370_6846255_72-273262-0_sp1.1 from training. Duration: 0.963625 2023-10-02 22:47:48,980 WARNING [train.py:1204] (2/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_367-61330-0 from training. Duration: 0.75 2023-10-02 22:47:51,885 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.bypass.scale_min, batch_count=1046346.6666666666, ans=0.2 2023-10-02 22:47:54,174 WARNING [train.py:1204] (2/4) Exclude cut with ID _39867_210_9826_1_1533553222030_9344259_1337-11141-0_sp1.1 from training. Duration: 0.936375 2023-10-02 22:47:54,199 WARNING [train.py:1204] (2/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_101-293367-0 from training. Duration: 0.63 2023-10-02 22:47:55,543 WARNING [train.py:1204] (2/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_10-100367-0 from training. Duration: 0.98 2023-10-02 22:47:57,088 WARNING [train.py:1204] (2/4) Exclude cut with ID _60057_210_10640_1_1534903065793_3333150_171-181133-0_sp1.1 from training. Duration: 0.8 2023-10-02 22:47:57,090 WARNING [train.py:1204] (2/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_844-96989-0_sp0.9 from training. Duration: 0.788875 2023-10-02 22:47:58,481 WARNING [train.py:1204] (2/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_301-144595-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 22:47:59,814 WARNING [train.py:1204] (2/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_115-147343-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 22:48:00,437 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.4.encoder.layers.2.feed_forward3.out_whiten, num_groups=1, num_channels=384, metric=12.77 vs. limit=15.0 2023-10-02 22:48:03,773 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_3171_210_2087_1_1526109084_2330007_37-64334-0_sp1.1 from training. Duration: 0.7454375 2023-10-02 22:48:05,029 WARNING [train.py:1204] (2/4) Exclude cut with ID _19514_210_9259_1_1531530071330_5084230_769-272914-0_sp1.1 from training. Duration: 0.936375 2023-10-02 22:48:06,480 WARNING [train.py:1204] (2/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_76-311963-0_sp0.9 from training. Duration: 0.77775 2023-10-02 22:48:07,766 WARNING [train.py:1204] (2/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_526-62079-0 from training. Duration: 0.67 2023-10-02 22:48:07,832 WARNING [train.py:1204] (2/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_7-111954-0_sp0.9 from training. Duration: 0.62225 2023-10-02 22:48:12,275 WARNING [train.py:1204] (2/4) Exclude cut with ID _57997_210_4163_1_1534766425889_3961049_360-279140-0_sp1.1 from training. Duration: 0.97275 2023-10-02 22:48:13,774 WARNING [train.py:1204] (2/4) Exclude cut with ID _4866_210_2577_1_1527404412284_4364659_133-1696-0 from training. Duration: 0.99 2023-10-02 22:48:13,837 WARNING [train.py:1204] (2/4) Exclude cut with ID _5190_210_4557_1_1527244368799_373439_28-197030-0 from training. Duration: 0.72 2023-10-02 22:48:16,635 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_53-39003-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 22:48:16,637 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6673_210_3273_1_1527767790_3173240_351-167954-0 from training. Duration: 0.48 2023-10-02 22:48:16,660 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_2822_210_2715_1_1526112586_1999587_165-36656-0_sp1.1 from training. Duration: 0.8090625 2023-10-02 22:48:18,105 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_328-264892-0_sp1.1 from training. Duration: 0.92725 2023-10-02 22:48:18,110 WARNING [train.py:1204] (2/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_29-293896-0_sp0.9 from training. Duration: 0.67775 2023-10-02 22:48:20,823 WARNING [train.py:1204] (2/4) Exclude cut with ID _65263_210_22356_1_1535763598691_3921037_465-55448-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 22:48:20,998 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.balancer1.prob, batch_count=1046480.0, ans=0.125 2023-10-02 22:48:22,109 WARNING [train.py:1204] (2/4) Exclude cut with ID _21989_210_5854_1_1531899214931_3271360_311-53106-0_sp1.1 from training. Duration: 0.97275 2023-10-02 22:48:24,929 WARNING [train.py:1204] (2/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_389-277915-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 22:48:26,424 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_69630_210_7234_1_1535791094_677837_63-277139-0_sp1.1 from training. Duration: 0.963625 2023-10-02 22:48:27,906 WARNING [train.py:1204] (2/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_85-138257-0 from training. Duration: 0.71 2023-10-02 22:48:28,495 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.3.encoder.layers.2.conv_module2.whiten, num_groups=1, num_channels=512, metric=10.00 vs. limit=15.0 2023-10-02 22:48:29,297 WARNING [train.py:1204] (2/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_686-247600-0 from training. Duration: 0.92 2023-10-02 22:48:29,303 WARNING [train.py:1204] (2/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_108-255430-0_sp1.1 from training. Duration: 0.8818125 2023-10-02 22:48:33,130 WARNING [train.py:1204] (2/4) Exclude cut with ID _14884_210_2252_1_1530707459689_5561250_114-201399-0_sp1.1 from training. Duration: 0.6909375 2023-10-02 22:48:36,373 WARNING [train.py:1204] (2/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_506-42614-0 from training. Duration: 0.79 2023-10-02 22:48:37,813 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_43802_210_2290_1_1533516203_8829504_85-195240-0_sp1.1 from training. Duration: 0.7909375 2023-10-02 22:48:43,746 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_141-36243-0_sp1.1 from training. Duration: 0.97275 2023-10-02 22:48:45,104 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.653e+02 1.984e+02 2.334e+02 2.809e+02 4.390e+02, threshold=4.669e+02, percent-clipped=1.0 2023-10-02 22:48:48,214 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.feed_forward1.out_proj.dropout_p, batch_count=1046613.3333333334, ans=0.1 2023-10-02 22:48:50,752 WARNING [train.py:1204] (2/4) Exclude cut with ID _7852_210_5219_1_1528286471316_8139400_358-287492-0_sp1.1 from training. Duration: 0.87275 2023-10-02 22:48:50,776 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_805-71497-0_sp0.9 from training. Duration: 0.788875 2023-10-02 22:48:52,087 WARNING [train.py:1204] (2/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_178-39722-0 from training. Duration: 0.49 2023-10-02 22:48:54,997 WARNING [train.py:1204] (2/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_428-181925-0_sp1.1 from training. Duration: 0.963625 2023-10-02 22:48:55,001 WARNING [train.py:1204] (2/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_770-200430-0 from training. Duration: 0.89 2023-10-02 22:48:55,045 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_1104-271009-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 22:48:55,097 WARNING [train.py:1204] (2/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_128-296581-0_sp0.9 from training. Duration: 0.82225 2023-10-02 22:48:55,281 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.conv_module1.balancer1.max_abs, batch_count=1046680.0, ans=10.0 2023-10-02 22:48:56,366 INFO [train.py:1046] (2/4) Epoch 30, batch 2950, loss[loss=0.1823, simple_loss=0.2515, pruned_loss=0.05654, over 23415.00 frames. ], tot_loss[loss=0.164, simple_loss=0.242, pruned_loss=0.04305, over 4724858.75 frames. ], batch size: 285, lr: 3.40e-03, grad_scale: 8.0 2023-10-02 22:49:02,442 WARNING [train.py:1204] (2/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_19-62511-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 22:49:02,960 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.5.encoder.layers.0.conv_module1.whiten, num_groups=1, num_channels=256, metric=7.32 vs. limit=15.0 2023-10-02 22:49:03,824 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_805-71497-0 from training. Duration: 0.71 2023-10-02 22:49:05,635 WARNING [train.py:1204] (2/4) Exclude cut with ID _44778_210_2054_1_1533798127203_7443090_573-152077-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 22:49:05,640 WARNING [train.py:1204] (2/4) Exclude cut with ID _42875_210_14856_1_1533704397487_4012433_393-177816-0_sp1.1 from training. Duration: 0.963625 2023-10-02 22:49:07,574 WARNING [train.py:1204] (2/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_278-35159-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 22:49:07,672 WARNING [train.py:1204] (2/4) Exclude cut with ID _15037_210_2253_1_1530752294104_3267650_13-130361-0_sp1.1 from training. Duration: 0.9 2023-10-02 22:49:10,297 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_3019_210_2028_1_1526044245_3264865_127-135615-0 from training. Duration: 0.94 2023-10-02 22:49:11,575 WARNING [train.py:1204] (2/4) Exclude cut with ID _28317_210_12742_1_1532480302381_8510325_607-213951-0 from training. Duration: 0.84 2023-10-02 22:49:11,635 WARNING [train.py:1204] (2/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_436-318113-0_sp0.9 from training. Duration: 0.8333125 2023-10-02 22:49:11,636 WARNING [train.py:1204] (2/4) Exclude cut with ID _13938_210_9988_1_1530518718978_3693960_148-152225-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 22:49:17,880 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_2541_210_1794_1_1525590269_3084765_156-173713-0_sp1.1 from training. Duration: 0.736375 2023-10-02 22:49:19,264 WARNING [train.py:1204] (2/4) Exclude cut with ID _28323_210_16187_1_1532480395667_4100300_531-24157-0_sp1.1 from training. Duration: 0.8909375 2023-10-02 22:49:20,663 WARNING [train.py:1204] (2/4) Exclude cut with ID _29303_210_2575_1_1532392237303_7410649_382-217492-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 22:49:21,905 WARNING [train.py:1204] (2/4) Exclude cut with ID _58895_210_8750_1_1535184081320_3649089_270-158013-0_sp1.1 from training. Duration: 0.92725 2023-10-02 22:49:24,711 WARNING [train.py:1204] (2/4) Exclude cut with ID _29277_210_3279_1_1532581109293_3173069_311-206227-0_sp1.1 from training. Duration: 0.936375 2023-10-02 22:49:24,723 WARNING [train.py:1204] (2/4) Exclude cut with ID _7160_210_1876_1_1527940923231_3703799_282-322611-0_sp0.9 from training. Duration: 0.9666875 2023-10-02 22:49:24,827 WARNING [train.py:1204] (2/4) Exclude cut with ID _53219_210_4525_1_1534834836008_4064825_562-32295-0_sp1.1 from training. Duration: 0.963625 2023-10-02 22:49:25,030 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.feed_forward3.hidden_balancer.prob, batch_count=1046813.3333333334, ans=0.125 2023-10-02 22:49:26,253 WARNING [train.py:1204] (2/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_408-87330-0_sp1.1 from training. Duration: 0.963625 2023-10-02 22:49:26,266 WARNING [train.py:1204] (2/4) Exclude cut with ID _59202_210_26984_1_1534816810612_3549174_137-318710-0_sp1.1 from training. Duration: 0.8090625 2023-10-02 22:49:28,938 WARNING [train.py:1204] (2/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_372-30302-0 from training. Duration: 0.86 2023-10-02 22:49:30,951 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.conv_module1.balancer2.prob, batch_count=1046813.3333333334, ans=0.125 2023-10-02 22:49:35,151 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7543_210_6753_1_1528541907_1399322_178-56269-0_sp1.1 from training. Duration: 0.95275 2023-10-02 22:49:35,179 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9109W0012-549758 from training. Duration: 0.512 2023-10-02 22:49:35,254 WARNING [train.py:1204] (2/4) Exclude cut with ID _5410_210_1388_1_1527397055795_1719020_39-112221-0_sp0.9 from training. Duration: 0.7666875 2023-10-02 22:49:35,410 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.balancer1.prob, batch_count=1046813.3333333334, ans=0.125 2023-10-02 22:49:37,955 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9121W0012-554933 from training. Duration: 0.896 2023-10-02 22:49:39,323 WARNING [train.py:1204] (2/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_476-272757-0 from training. Duration: 0.93 2023-10-02 22:49:39,343 WARNING [train.py:1204] (2/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_1085-171718-0_sp1.1 from training. Duration: 0.92725 2023-10-02 22:49:41,253 WARNING [train.py:1204] (2/4) Exclude cut with ID _8480_210_1874_1_1528589391205_6728330_309-139896-0_sp1.1 from training. Duration: 0.736375 2023-10-02 22:49:41,253 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9130W0113-559703 from training. Duration: 0.896 2023-10-02 22:49:41,265 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_292-265928-0_sp1.1 from training. Duration: 0.463625 2023-10-02 22:49:43,982 WARNING [train.py:1204] (2/4) Exclude cut with ID _8772_210_1850_1_1528635595972_3664011_96-12756-0 from training. Duration: 0.98 2023-10-02 22:49:45,810 WARNING [train.py:1204] (2/4) Exclude cut with ID _12556_210_8341_1_1530081024100_7327690_725-193477-0_sp1.1 from training. Duration: 0.8909375 2023-10-02 22:49:45,854 WARNING [train.py:1204] (2/4) Exclude cut with ID _5289_210_2084_1_1527678202736_3779299_142-103317-0_sp1.1 from training. Duration: 0.763625 2023-10-02 22:49:47,267 WARNING [train.py:1204] (2/4) Exclude cut with ID _45012_210_12651_1_1533608435136_5371380_206-51075-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 22:49:48,660 WARNING [train.py:1204] (2/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_176-56120-0_sp1.1 from training. Duration: 0.7545625 2023-10-02 22:49:48,686 WARNING [train.py:1204] (2/4) Exclude cut with ID _40284_210_2084_1_1533513679814_3704279_220-201864-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 22:49:50,019 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9165W0004-576638 from training. Duration: 0.896 2023-10-02 22:49:50,064 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_5438_210_1794_1_1527336687_2600009_218-343336-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 22:49:51,401 WARNING [train.py:1204] (2/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_318-162017-0 from training. Duration: 0.91 2023-10-02 22:49:55,678 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_49544_210_1794_1_1534379770_5262083_483-349298-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 22:49:57,006 WARNING [train.py:1204] (2/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_197-28164-0_sp0.9 from training. Duration: 0.888875 2023-10-02 22:49:57,068 WARNING [train.py:1204] (2/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_643-52007-0 from training. Duration: 0.57 2023-10-02 22:49:57,091 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_38965_210_4852_1_1533174906_3484735_403-332963-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 22:49:58,517 WARNING [train.py:1204] (2/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_340-174176-0 from training. Duration: 0.49 2023-10-02 22:50:01,243 WARNING [train.py:1204] (2/4) Exclude cut with ID _56708_210_13177_1_1535252351282_3422899_363-917-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 22:50:02,689 WARNING [train.py:1204] (2/4) Exclude cut with ID _19896_210_1883_1_1531709022662_6649569_525-159063-0_sp1.1 from training. Duration: 0.936375 2023-10-02 22:50:02,718 WARNING [train.py:1204] (2/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_430-116139-0_sp0.9 from training. Duration: 0.9444375 2023-10-02 22:50:06,159 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_53753_210_7234_1_1534320003_3594148_86-329250-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 22:50:06,169 WARNING [train.py:1204] (2/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_406-275649-0_sp1.1 from training. Duration: 0.3818125 2023-10-02 22:50:06,280 WARNING [train.py:1204] (2/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_1083-147536-0_sp1.1 from training. Duration: 0.8818125 2023-10-02 22:50:07,647 WARNING [train.py:1204] (2/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_54-311741-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 22:50:07,653 WARNING [train.py:1204] (2/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_237-58433-0_sp0.9 from training. Duration: 0.82225 2023-10-02 22:50:07,689 WARNING [train.py:1204] (2/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_98-51676-0_sp1.1 from training. Duration: 0.663625 2023-10-02 22:50:08,961 WARNING [train.py:1204] (2/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_386-270472-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 22:50:09,047 WARNING [train.py:1204] (2/4) Exclude cut with ID _19023_210_4353_1_1531378892552_5820780_83-254794-0_sp1.1 from training. Duration: 0.7818125 2023-10-02 22:50:10,826 INFO [train.py:1046] (2/4) Epoch 30, batch 3000, loss[loss=0.1514, simple_loss=0.2433, pruned_loss=0.02977, over 24440.00 frames. ], tot_loss[loss=0.1633, simple_loss=0.2419, pruned_loss=0.04238, over 4745586.79 frames. ], batch size: 69, lr: 3.40e-03, grad_scale: 8.0 2023-10-02 22:50:10,826 INFO [train.py:1069] (2/4) Computing validation loss 2023-10-02 22:50:22,553 INFO [train.py:1078] (2/4) Epoch 30, validation: loss=0.3782, simple_loss=0.2831, pruned_loss=0.2366, over 1125622.00 frames. 2023-10-02 22:50:22,554 INFO [train.py:1079] (2/4) Maximum memory allocated so far is 20967MB 2023-10-02 22:50:22,683 WARNING [train.py:1204] (2/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_314-93656-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 22:50:22,700 WARNING [train.py:1204] (2/4) Exclude cut with ID _14170_210_5219_1_1530525949868_4469650_336-213530-0 from training. Duration: 0.9 2023-10-02 22:50:24,105 WARNING [train.py:1204] (2/4) Exclude cut with ID _36686_210_5665_1_1533778225913_3646410_65-221120-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 22:50:25,533 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_26433_210_12108_1_1532157158_4056480_271-68178-0_sp1.1 from training. Duration: 0.92725 2023-10-02 22:50:25,574 WARNING [train.py:1204] (2/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_46-207599-0_sp0.9 from training. Duration: 0.711125 2023-10-02 22:50:30,151 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9303W0114-637133 from training. Duration: 0.896 2023-10-02 22:50:31,862 WARNING [train.py:1204] (2/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_490-207861-0 from training. Duration: 0.98 2023-10-02 22:50:33,313 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6767_210_3273_1_1527855783_1718932_173-256601-0_sp0.9 from training. Duration: 0.888875 2023-10-02 22:50:33,350 WARNING [train.py:1204] (2/4) Exclude cut with ID _13921_210_9994_1_1530841782272_3503520_327-69677-0_sp1.1 from training. Duration: 0.8181875 2023-10-02 22:50:33,397 WARNING [train.py:1204] (2/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_508-335369-0 from training. Duration: 0.48 2023-10-02 22:50:34,740 WARNING [train.py:1204] (2/4) Exclude cut with ID _64631_210_27767_1_1535349580068_4165160_24-46567-0_sp1.1 from training. Duration: 0.963625 2023-10-02 22:50:37,510 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.2.encoder.layers.1.nonlin_attention.whiten2, num_groups=1, num_channels=384, metric=5.16 vs. limit=15.0 2023-10-02 22:50:40,952 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_89-35748-0_sp1.1 from training. Duration: 0.7181875 2023-10-02 22:50:48,470 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_26433_210_12108_1_1532157158_4056480_578-262107-0_sp1.1 from training. Duration: 0.9 2023-10-02 22:50:55,090 WARNING [train.py:1204] (2/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_129-337802-0 from training. Duration: 0.72 2023-10-02 22:50:56,412 WARNING [train.py:1204] (2/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_356-167816-0_sp0.9 from training. Duration: 0.8 2023-10-02 22:50:59,171 WARNING [train.py:1204] (2/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_137-116442-0_sp1.1 from training. Duration: 0.7090625 2023-10-02 22:50:59,216 WARNING [train.py:1204] (2/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_358-172940-0_sp1.1 from training. Duration: 0.963625 2023-10-02 22:50:59,370 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.ff3_skip_rate, batch_count=1047146.6666666666, ans=0.0 2023-10-02 22:51:01,151 WARNING [train.py:1204] (2/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_668-344147-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 22:51:02,506 WARNING [train.py:1204] (2/4) Exclude cut with ID _62337_210_12392_1_1535009273105_3412229_255-157667-0_sp1.1 from training. Duration: 0.936375 2023-10-02 22:51:02,509 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_486-151131-0 from training. Duration: 0.85 2023-10-02 22:51:06,457 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_53638_210_21581_1_1534297319_5808449_885-80172-0 from training. Duration: 0.96 2023-10-02 22:51:07,840 WARNING [train.py:1204] (2/4) Exclude cut with ID _50093_210_4220_1_1534417141207_3432299_105-183067-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 22:51:07,891 WARNING [train.py:1204] (2/4) Exclude cut with ID _7562_210_2084_1_1528887767916_3946229_345-169156-0_sp0.9 from training. Duration: 0.6444375 2023-10-02 22:51:09,406 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.1.bypass.skip_rate, batch_count=1047213.3333333334, ans=0.035 2023-10-02 22:51:10,660 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_4528_210_3273_1_1527076654_3276777_209-219804-0_sp0.9 from training. Duration: 0.7666875 2023-10-02 22:51:10,708 WARNING [train.py:1204] (2/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_35-153886-0_sp0.9 from training. Duration: 0.9333125 2023-10-02 22:51:12,061 WARNING [train.py:1204] (2/4) Exclude cut with ID _30800_210_22382_1_1532499347904_4515959_332-121925-0_sp1.1 from training. Duration: 0.92725 2023-10-02 22:51:12,064 WARNING [train.py:1204] (2/4) Exclude cut with ID _59202_210_26984_1_1534816810612_3549174_495-65515-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 22:51:15,738 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.5.encoder.layers.1.nonlin_attention.whiten1, num_groups=1, num_channels=192, metric=2.65 vs. limit=10.0 2023-10-02 22:51:16,631 WARNING [train.py:1204] (2/4) Exclude cut with ID _6274_210_2084_1_1527937583837_7494020_769-168302-0_sp1.1 from training. Duration: 0.6454375 2023-10-02 22:51:17,906 WARNING [train.py:1204] (2/4) Exclude cut with ID _24271_210_12658_1_1531987270022_7190519_708-71345-0_sp1.1 from training. Duration: 0.936375 2023-10-02 22:51:17,907 WARNING [train.py:1204] (2/4) Exclude cut with ID 3033-130750-0096-55598-0_sp0.9 from training. Duration: 0.92225 2023-10-02 22:51:19,271 WARNING [train.py:1204] (2/4) Exclude cut with ID _14087_210_2253_1_1530529224512_3014029_219-224445-0_sp0.9 from training. Duration: 0.9333125 2023-10-02 22:51:20,751 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_174-277364-0 from training. Duration: 0.71 2023-10-02 22:51:20,830 WARNING [train.py:1204] (2/4) Exclude cut with ID _45021_210_9994_1_1533619751094_3330288_96-239010-0_sp1.1 from training. Duration: 0.82725 2023-10-02 22:51:21,093 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.balancer1.prob, batch_count=1047280.0, ans=0.125 2023-10-02 22:51:22,278 WARNING [train.py:1204] (2/4) Exclude cut with ID _31602_210_5854_1_1532677021456_4458250_489-305116-0_sp1.1 from training. Duration: 0.97275 2023-10-02 22:51:22,305 WARNING [train.py:1204] (2/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_269-154726-0_sp0.9 from training. Duration: 0.9555625 2023-10-02 22:51:24,943 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.563e+02 1.821e+02 2.047e+02 2.427e+02 4.890e+02, threshold=4.095e+02, percent-clipped=1.0 2023-10-02 22:51:25,260 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.bypass.skip_rate, batch_count=1047280.0, ans=0.07 2023-10-02 22:51:26,475 WARNING [train.py:1204] (2/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_126-54418-0_sp1.1 from training. Duration: 0.92725 2023-10-02 22:51:26,510 WARNING [train.py:1204] (2/4) Exclude cut with ID _42326_210_7770_1_1533814282531_3707269_182-224159-0_sp1.1 from training. Duration: 0.92725 2023-10-02 22:51:27,820 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7002_210_1794_1_1527939683_5157286_1-281264-0_sp0.9 from training. Duration: 0.488875 2023-10-02 22:51:29,132 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_86-16881-0 from training. Duration: 0.58 2023-10-02 22:51:29,166 WARNING [train.py:1204] (2/4) Exclude cut with ID _19514_210_9259_1_1531530071330_5084230_637-279094-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 22:51:30,392 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_26433_210_12108_1_1532157158_4056480_578-262107-0 from training. Duration: 0.99 2023-10-02 22:51:30,435 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6291_210_3273_1_1527683649_1078498_82-221196-0_sp1.1 from training. Duration: 0.6909375 2023-10-02 22:51:31,792 WARNING [train.py:1204] (2/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_402-102311-0 from training. Duration: 0.67 2023-10-02 22:51:32,042 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.conv_module2.balancer1.prob, batch_count=1047280.0, ans=0.125 2023-10-02 22:51:33,947 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1015-343884-0_sp1.1 from training. Duration: 0.563625 2023-10-02 22:51:35,303 WARNING [train.py:1204] (2/4) Exclude cut with ID _13684_210_7497_1_1530619354863_3598359_312-235606-0_sp1.1 from training. Duration: 0.5454375 2023-10-02 22:51:35,326 WARNING [train.py:1204] (2/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_55-31904-0 from training. Duration: 0.88 2023-10-02 22:51:35,398 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_3133_210_2087_1_1526106446_2324690_25-332138-0 from training. Duration: 0.91 2023-10-02 22:51:35,400 WARNING [train.py:1204] (2/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_912-282412-0_sp0.9 from training. Duration: 0.5444375 2023-10-02 22:51:35,461 WARNING [train.py:1204] (2/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_458-299435-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 22:51:37,306 INFO [train.py:1046] (2/4) Epoch 30, batch 3050, loss[loss=0.1638, simple_loss=0.2467, pruned_loss=0.04049, over 24291.00 frames. ], tot_loss[loss=0.1642, simple_loss=0.2427, pruned_loss=0.04282, over 4740981.49 frames. ], batch size: 61, lr: 3.40e-03, grad_scale: 8.0 2023-10-02 22:51:38,655 WARNING [train.py:1204] (2/4) Exclude cut with ID _69584_210_5219_1_1535785133775_4044450_275-88403-0_sp1.1 from training. Duration: 0.92725 2023-10-02 22:51:38,671 WARNING [train.py:1204] (2/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_841-341679-0_sp1.1 from training. Duration: 0.52725 2023-10-02 22:51:38,677 WARNING [train.py:1204] (2/4) Exclude cut with ID _41391_210_7994_1_1533603771872_3638201_1-54521-0_sp1.1 from training. Duration: 0.97275 2023-10-02 22:51:38,725 WARNING [train.py:1204] (2/4) Exclude cut with ID _56235_210_10640_1_1534557335235_4161099_495-186609-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 22:51:40,244 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.conv_module1.balancer2.prob, batch_count=1047346.6666666666, ans=0.125 2023-10-02 22:51:40,250 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=1047346.6666666666, ans=0.1 2023-10-02 22:51:41,838 WARNING [train.py:1204] (2/4) Exclude cut with ID _4681_210_3525_1_1527250689464_120889_15-126760-0 from training. Duration: 0.81 2023-10-02 22:51:44,578 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_3019_210_2028_1_1526044245_3264865_127-135615-0_sp1.1 from training. Duration: 0.8545625 2023-10-02 22:51:47,327 WARNING [train.py:1204] (2/4) Exclude cut with ID _30191_210_1664_1_1533189915047_7196670_915-21561-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 22:51:47,359 WARNING [train.py:1204] (2/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_6-214815-0_sp0.9 from training. Duration: 0.9444375 2023-10-02 22:51:50,061 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_45399_210_4852_1_1533624725_7579737_190-137494-0_sp1.1 from training. Duration: 0.97275 2023-10-02 22:51:50,272 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.nonlin_attention.balancer.prob, batch_count=1047413.3333333334, ans=0.125 2023-10-02 22:51:52,936 WARNING [train.py:1204] (2/4) Exclude cut with ID _41314_210_12651_1_1533459476543_3766460_207-132038-0 from training. Duration: 0.94 2023-10-02 22:51:58,571 WARNING [train.py:1204] (2/4) Exclude cut with ID _14603_210_4220_1_1530615688441_3097319_90-31383-0 from training. Duration: 0.87 2023-10-02 22:51:58,603 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_15517_210_6753_1_1530952293_1664795_119-31001-0 from training. Duration: 0.94 2023-10-02 22:51:58,641 WARNING [train.py:1204] (2/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_684-85342-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 22:52:02,610 WARNING [train.py:1204] (2/4) Exclude cut with ID _14603_210_4220_1_1530615688441_3097319_97-325053-0_sp1.1 from training. Duration: 0.836375 2023-10-02 22:52:06,575 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_68702_210_23689_1_1535799577_3420068_168-25150-0_sp1.1 from training. Duration: 0.97275 2023-10-02 22:52:06,588 WARNING [train.py:1204] (2/4) Exclude cut with ID _65134_210_14763_1_1535335393985_3505368_111-27984-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 22:52:06,633 WARNING [train.py:1204] (2/4) Exclude cut with ID _48678_210_5663_1_1533988830947_3769199_276-226230-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 22:52:11,067 WARNING [train.py:1204] (2/4) Exclude cut with ID _28427_210_4220_1_1532581240967_3061069_43-84928-0_sp1.1 from training. Duration: 0.9 2023-10-02 22:52:11,101 WARNING [train.py:1204] (2/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_158-250116-0_sp1.1 from training. Duration: 0.563625 2023-10-02 22:52:12,436 WARNING [train.py:1204] (2/4) Exclude cut with ID _66873_210_5517_1_1535454136506_4293370_279-332556-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 22:52:12,488 WARNING [train.py:1204] (2/4) Exclude cut with ID _42863_210_9070_1_1533902398788_3696920_488-230146-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 22:52:12,488 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_387-247159-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 22:52:13,855 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6729_210_2550_1_1528032481_1788725_261-42619-0_sp1.1 from training. Duration: 0.97275 2023-10-02 22:52:15,250 WARNING [train.py:1204] (2/4) Exclude cut with ID _19896_210_1883_1_1531709022662_6649569_831-44724-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 22:52:18,064 WARNING [train.py:1204] (2/4) Exclude cut with ID _55153_210_2253_1_1534384786677_3162096_280-285944-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 22:52:18,087 WARNING [train.py:1204] (2/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_448-4048-0 from training. Duration: 0.96 2023-10-02 22:52:19,565 WARNING [train.py:1204] (2/4) Exclude cut with ID _29678_210_7893_1_1532421042787_3906118_456-202548-0_sp1.1 from training. Duration: 0.97275 2023-10-02 22:52:19,581 WARNING [train.py:1204] (2/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_107-332398-0_sp1.1 from training. Duration: 0.7090625 2023-10-02 22:52:20,963 WARNING [train.py:1204] (2/4) Exclude cut with ID _13921_210_9994_1_1530838702800_3003429_46-149444-0_sp1.1 from training. Duration: 0.8545625 2023-10-02 22:52:21,002 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_257-20733-0_sp0.9 from training. Duration: 0.8444375 2023-10-02 22:52:22,326 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_53753_210_7234_1_1534320003_3594148_385-234809-0_sp1.1 from training. Duration: 0.963625 2023-10-02 22:52:23,719 WARNING [train.py:1204] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_292-215346-0_sp1.1 from training. Duration: 0.8818125 2023-10-02 22:52:29,081 WARNING [train.py:1204] (2/4) Exclude cut with ID _68965_210_1883_1_1535698962272_3497490_196-124268-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 22:52:29,112 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_23801_210_16223_1_1532066392_3294657_570-5439-0_sp1.1 from training. Duration: 0.8818125 2023-10-02 22:52:34,649 WARNING [train.py:1204] (2/4) Exclude cut with ID _5166_210_2084_1_1527073197160_4087993_349-6928-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 22:52:34,691 WARNING [train.py:1204] (2/4) Exclude cut with ID _60621_210_17071_1_1535013053530_3671739_226-255202-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 22:52:34,694 WARNING [train.py:1204] (2/4) Exclude cut with ID _41817_210_18835_1_1533434477160_7423049_448-130302-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 22:52:37,986 WARNING [train.py:1204] (2/4) Exclude cut with ID _6653_210_2629_1_1527919172441_6732679_758-143677-0_sp1.1 from training. Duration: 0.8909375 2023-10-02 22:52:38,040 WARNING [train.py:1204] (2/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_912-47473-0_sp0.9 from training. Duration: 0.7555625 2023-10-02 22:52:38,070 WARNING [train.py:1204] (2/4) Exclude cut with ID _6694_210_4599_1_1527852637490_3965529_356-169233-0_sp1.1 from training. Duration: 0.9 2023-10-02 22:52:39,869 WARNING [train.py:1204] (2/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_1-99680-0 from training. Duration: 0.67 2023-10-02 22:52:41,678 WARNING [train.py:1204] (2/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_199-86432-0_sp1.1 from training. Duration: 0.8909375 2023-10-02 22:52:42,971 WARNING [train.py:1204] (2/4) Exclude cut with ID _61148_210_6897_1_1535094022726_4217610_362-182430-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 22:52:43,047 WARNING [train.py:1204] (2/4) Exclude cut with ID _49376_210_5986_1_1533985278732_3370740_301-154763-0 from training. Duration: 0.98 2023-10-02 22:52:44,404 WARNING [train.py:1204] (2/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_572-185150-0_sp1.1 from training. Duration: 0.8818125 2023-10-02 22:52:44,576 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.1.conv_module1.balancer2.min_positive, batch_count=1047613.3333333334, ans=0.05 2023-10-02 22:52:49,791 INFO [train.py:1046] (2/4) Epoch 30, batch 3100, loss[loss=0.161, simple_loss=0.2418, pruned_loss=0.04015, over 24356.00 frames. ], tot_loss[loss=0.1643, simple_loss=0.2426, pruned_loss=0.04296, over 4734146.55 frames. ], batch size: 77, lr: 3.40e-03, grad_scale: 8.0 2023-10-02 22:52:49,908 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_47896_210_20508_1_1534492518_5958868_558-314572-0_sp1.1 from training. Duration: 0.8818125 2023-10-02 22:52:51,263 WARNING [train.py:1204] (2/4) Exclude cut with ID _45560_210_21982_1_1533812603951_3584999_111-6376-0_sp0.9 from training. Duration: 0.9333125 2023-10-02 22:52:51,586 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.feed_forward1.hidden_balancer.prob, batch_count=1047680.0, ans=0.125 2023-10-02 22:52:52,704 WARNING [train.py:1204] (2/4) Exclude cut with ID _14170_210_5219_1_1530525949868_4469650_412-317077-0_sp1.1 from training. Duration: 0.6818125 2023-10-02 22:52:54,069 WARNING [train.py:1204] (2/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_718-68505-0 from training. Duration: 0.89 2023-10-02 22:52:54,240 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.feed_forward2.hidden_balancer.prob, batch_count=1047680.0, ans=0.125 2023-10-02 22:52:56,810 WARNING [train.py:1204] (2/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_77-311404-0 from training. Duration: 0.94 2023-10-02 22:52:58,114 WARNING [train.py:1204] (2/4) Exclude cut with ID _60229_210_27767_1_1534838406158_3655350_264-279573-0 from training. Duration: 0.83 2023-10-02 22:53:00,741 WARNING [train.py:1204] (2/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_106-300985-0_sp1.1 from training. Duration: 0.8181875 2023-10-02 22:53:01,677 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.1.encoder.layers.0.whiten, num_groups=1, num_channels=256, metric=5.07 vs. limit=12.0 2023-10-02 22:53:04,022 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_4183_210_2087_1_1526729100_1869838_79-277189-0_sp1.1 from training. Duration: 0.963625 2023-10-02 22:53:04,038 WARNING [train.py:1204] (2/4) Exclude cut with ID _28427_210_4220_1_1532581240967_3061069_195-295268-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 22:53:06,168 WARNING [train.py:1204] (2/4) Exclude cut with ID _4092_210_2151_1_1526643044449_3135759_356-278694-0_sp0.9 from training. Duration: 0.52225 2023-10-02 22:53:10,288 WARNING [train.py:1204] (2/4) Exclude cut with ID _14459_210_2228_1_1531443641946_7516700_777-71446-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 22:53:15,125 WARNING [train.py:1204] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_520-126729-0 from training. Duration: 0.7 2023-10-02 22:53:19,217 WARNING [train.py:1204] (2/4) Exclude cut with ID _13684_210_7497_1_1530619354863_3598359_312-235606-0_sp0.9 from training. Duration: 0.6666875 2023-10-02 22:53:19,250 WARNING [train.py:1204] (2/4) Exclude cut with ID _37537_210_2253_1_1533171758568_3421949_283-349402-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 22:53:20,606 WARNING [train.py:1204] (2/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_29-117932-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 22:53:20,626 WARNING [train.py:1204] (2/4) Exclude cut with ID _29303_210_2575_1_1532392237303_7410649_721-283351-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 22:53:22,023 WARNING [train.py:1204] (2/4) Exclude cut with ID _14886_210_4220_1_1530702020355_3516190_175-229073-0_sp0.9 from training. Duration: 0.5 2023-10-02 22:53:22,146 WARNING [train.py:1204] (2/4) Exclude cut with ID _15458_210_8341_1_1530858614263_3834410_70-268830-0_sp1.1 from training. Duration: 0.97275 2023-10-02 22:53:22,162 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_2295_210_2087_1_1525500993_2407863_234-83104-0 from training. Duration: 0.62 2023-10-02 22:53:22,163 WARNING [train.py:1204] (2/4) Exclude cut with ID _22961_210_8254_1_1531976366672_4278180_317-199546-0_sp1.1 from training. Duration: 0.7818125 2023-10-02 22:53:23,630 WARNING [train.py:1204] (2/4) Exclude cut with ID _27894_210_12392_1_1532415496078_6902439_547-196022-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 22:53:25,050 WARNING [train.py:1204] (2/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_46-207599-0 from training. Duration: 0.64 2023-10-02 22:53:26,371 WARNING [train.py:1204] (2/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_426-326246-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 22:53:29,221 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.balancer1.prob, batch_count=1047813.3333333334, ans=0.125 2023-10-02 22:53:30,317 WARNING [train.py:1204] (2/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_445-117398-0_sp0.9 from training. Duration: 0.988875 2023-10-02 22:53:32,037 WARNING [train.py:1204] (2/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_563-347832-0 from training. Duration: 0.89 2023-10-02 22:53:32,125 WARNING [train.py:1204] (2/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_252-216674-0 from training. Duration: 0.54 2023-10-02 22:53:34,028 WARNING [train.py:1204] (2/4) Exclude cut with ID _59611_210_8895_1_1534741198240_3650361_187-212779-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 22:53:34,094 WARNING [train.py:1204] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_282-159367-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 22:53:36,834 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_68702_210_23689_1_1535799577_3420068_33-136883-0_sp1.1 from training. Duration: 0.936375 2023-10-02 22:53:36,844 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_47228_210_4983_1_1533780021_3672027_184-293706-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 22:53:36,862 WARNING [train.py:1204] (2/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_656-188361-0_sp0.9 from training. Duration: 0.9666875 2023-10-02 22:53:39,475 WARNING [train.py:1204] (2/4) Exclude cut with ID _4866_210_2577_1_1527404412284_4364659_352-157930-0_sp0.9 from training. Duration: 0.9 2023-10-02 22:53:39,476 WARNING [train.py:1204] (2/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_210-106047-0_sp1.1 from training. Duration: 0.8818125 2023-10-02 22:53:41,069 WARNING [train.py:1204] (2/4) Exclude cut with ID _9389_210_1664_1_1529217337301_5289269_289-213417-0_sp1.1 from training. Duration: 0.8090625 2023-10-02 22:53:41,101 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_3910_210_1794_1_1526777667_3747260_178-41186-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 22:53:41,105 WARNING [train.py:1204] (2/4) Exclude cut with ID _27052_210_5986_1_1532329197187_3585009_24-172263-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 22:53:41,108 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_5522_210_3273_1_1527249606_3909096_318-203153-0_sp1.1 from training. Duration: 0.3909375 2023-10-02 22:53:41,244 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.balancer1.prob, batch_count=1047880.0, ans=0.125 2023-10-02 22:53:43,027 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=1047880.0, ans=0.1 2023-10-02 22:53:44,359 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.1.conv_skip_rate, batch_count=1047880.0, ans=0.0 2023-10-02 22:53:45,586 WARNING [train.py:1204] (2/4) Exclude cut with ID _27894_210_12392_1_1532415496078_6902439_222-51982-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 22:53:45,814 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.conv_module1.balancer1.prob, batch_count=1047880.0, ans=0.125 2023-10-02 22:53:46,967 WARNING [train.py:1204] (2/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_87-131427-0 from training. Duration: 0.78 2023-10-02 22:53:48,414 WARNING [train.py:1204] (2/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_372-298093-0_sp0.9 from training. Duration: 0.888875 2023-10-02 22:53:48,460 WARNING [train.py:1204] (2/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_663-166973-0 from training. Duration: 0.8 2023-10-02 22:53:49,754 WARNING [train.py:1204] (2/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_429-177469-0_sp1.1 from training. Duration: 0.936375 2023-10-02 22:53:49,787 WARNING [train.py:1204] (2/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_724-172651-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 22:53:49,829 WARNING [train.py:1204] (2/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_103-336064-0 from training. Duration: 0.51 2023-10-02 22:53:51,049 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.572e+02 1.847e+02 2.082e+02 2.389e+02 3.344e+02, threshold=4.165e+02, percent-clipped=0.0 2023-10-02 22:53:52,741 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.nonlin_attention.balancer.prob, batch_count=1047946.6666666666, ans=0.125 2023-10-02 22:54:02,412 INFO [train.py:1046] (2/4) Epoch 30, batch 3150, loss[loss=0.1598, simple_loss=0.2516, pruned_loss=0.03399, over 24445.00 frames. ], tot_loss[loss=0.1632, simple_loss=0.2411, pruned_loss=0.04264, over 4719863.43 frames. ], batch size: 69, lr: 3.40e-03, grad_scale: 8.0 2023-10-02 22:54:02,490 WARNING [train.py:1204] (2/4) Exclude cut with ID _48677_210_5663_1_1533902405919_3892989_111-11439-0 from training. Duration: 0.96 2023-10-02 22:54:05,160 WARNING [train.py:1204] (2/4) Exclude cut with ID _8085_210_4557_1_1528455633493_3716100_95-103398-0_sp1.1 from training. Duration: 0.963625 2023-10-02 22:54:05,215 WARNING [train.py:1204] (2/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_1041-110837-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 22:54:07,782 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_50050_210_16223_1_1535435796_7290138_1155-121757-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 22:54:07,784 WARNING [train.py:1204] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_602-275260-0_sp0.9 from training. Duration: 0.911125 2023-10-02 22:54:07,853 WARNING [train.py:1204] (2/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_265-164486-0 from training. Duration: 0.96 2023-10-02 22:54:09,178 WARNING [train.py:1204] (2/4) Exclude cut with ID _46067_210_4220_1_1534125571066_3307450_142-229375-0_sp1.1 from training. Duration: 0.963625 2023-10-02 22:54:09,208 WARNING [train.py:1204] (2/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_310-26186-0_sp1.1 from training. Duration: 0.636375 2023-10-02 22:54:10,942 WARNING [train.py:1204] (2/4) Exclude cut with ID _28461_210_2253_1_1532771775189_3041299_136-270644-0 from training. Duration: 0.93 2023-10-02 22:54:12,308 WARNING [train.py:1204] (2/4) Exclude cut with ID _14860_210_4915_1_1530702020340_7343240_438-286372-0_sp1.1 from training. Duration: 0.936375 2023-10-02 22:54:14,192 WARNING [train.py:1204] (2/4) Exclude cut with ID IC0128W0004-63309 from training. Duration: 0.831 2023-10-02 22:54:18,214 WARNING [train.py:1204] (2/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_135-174080-0 from training. Duration: 0.53 2023-10-02 22:54:18,251 WARNING [train.py:1204] (2/4) Exclude cut with ID _41326_210_5854_1_1533778076509_3492080_277-59511-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 22:54:19,626 WARNING [train.py:1204] (2/4) Exclude cut with ID IC0144W0112-71379 from training. Duration: 0.992 2023-10-02 22:54:19,690 WARNING [train.py:1204] (2/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_711-15688-0_sp0.9 from training. Duration: 0.47775 2023-10-02 22:54:22,314 WARNING [train.py:1204] (2/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_237-332027-0 from training. Duration: 0.95 2023-10-02 22:54:22,364 WARNING [train.py:1204] (2/4) Exclude cut with ID _7970_210_1883_1_1528455616296_3472230_25-178407-0 from training. Duration: 0.71 2023-10-02 22:54:22,366 WARNING [train.py:1204] (2/4) Exclude cut with ID _8480_210_1874_1_1528589391205_6728330_309-139896-0 from training. Duration: 0.81 2023-10-02 22:54:22,381 WARNING [train.py:1204] (2/4) Exclude cut with ID _39571_210_13449_1_1533801584143_3651077_319-268877-0_sp1.1 from training. Duration: 0.936375 2023-10-02 22:54:22,385 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_43802_210_2290_1_1533516203_8829504_155-246880-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 22:54:23,751 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_566-122599-0_sp1.1 from training. Duration: 0.936375 2023-10-02 22:54:25,143 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_12868_210_6753_1_1530701932_530275_18-76322-0 from training. Duration: 0.87 2023-10-02 22:54:26,548 WARNING [train.py:1204] (2/4) Exclude cut with ID _56494_210_4220_1_1535284837625_3033169_159-106035-0_sp1.1 from training. Duration: 0.963625 2023-10-02 22:54:27,953 WARNING [train.py:1204] (2/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_98-276013-0_sp1.1 from training. Duration: 0.963625 2023-10-02 22:54:29,217 WARNING [train.py:1204] (2/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_330-216124-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 22:54:32,256 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_177-58005-0_sp0.9 from training. Duration: 0.688875 2023-10-02 22:54:35,044 WARNING [train.py:1204] (2/4) Exclude cut with ID _56235_210_10640_1_1534557335235_4161099_343-139898-0 from training. Duration: 0.96 2023-10-02 22:54:36,914 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6961_210_2087_1_1527937168_6815011_395-185060-0_sp1.1 from training. Duration: 0.863625 2023-10-02 22:54:39,851 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7002_210_1794_1_1527939683_5157286_184-229630-0_sp0.9 from training. Duration: 0.82225 2023-10-02 22:54:39,896 WARNING [train.py:1204] (2/4) Exclude cut with ID _59170_210_6721_1_1534983633960_4203335_153-18895-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 22:54:39,945 WARNING [train.py:1204] (2/4) Exclude cut with ID _49643_210_7989_1_1534640244224_3939031_598-161926-0 from training. Duration: 0.9 2023-10-02 22:54:43,310 WARNING [train.py:1204] (2/4) Exclude cut with ID _4068_210_2629_1_1526697350395_1546259_224-288693-0 from training. Duration: 0.59 2023-10-02 22:54:43,373 WARNING [train.py:1204] (2/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_718-68505-0_sp1.1 from training. Duration: 0.8090625 2023-10-02 22:54:44,696 WARNING [train.py:1204] (2/4) Exclude cut with ID _4068_210_2629_1_1526697350395_1546259_224-288693-0_sp0.9 from training. Duration: 0.6555625 2023-10-02 22:54:44,707 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_2296_210_2087_1_1525503448_3397335_57-185430-0_sp0.9 from training. Duration: 0.6666875 2023-10-02 22:54:44,772 WARNING [train.py:1204] (2/4) Exclude cut with ID _15458_210_8341_1_1530858614263_3834410_49-235472-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 22:54:44,775 WARNING [train.py:1204] (2/4) Exclude cut with ID _64135_210_2253_1_1535677267876_7608972_498-5039-0_sp0.9 from training. Duration: 0.8666875 2023-10-02 22:54:46,155 WARNING [train.py:1204] (2/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_283-220372-0_sp0.9 from training. Duration: 0.62225 2023-10-02 22:54:46,163 WARNING [train.py:1204] (2/4) Exclude cut with ID _6473_210_1741_1_1527901307396_3815380_108-17335-0_sp0.9 from training. Duration: 0.72225 2023-10-02 22:54:48,009 WARNING [train.py:1204] (2/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_113-92608-0 from training. Duration: 0.54 2023-10-02 22:54:48,050 WARNING [train.py:1204] (2/4) Exclude cut with ID _14170_210_5219_1_1530525949868_4469650_272-249985-0_sp1.1 from training. Duration: 0.6090625 2023-10-02 22:54:49,278 WARNING [train.py:1204] (2/4) Exclude cut with ID _50376_210_13401_1_1534031502101_4217740_518-187390-0_sp1.1 from training. Duration: 0.97275 2023-10-02 22:54:50,645 WARNING [train.py:1204] (2/4) Exclude cut with ID _59738_210_15379_1_1534849512562_3941470_165-248260-0_sp1.1 from training. Duration: 0.92725 2023-10-02 22:54:50,662 WARNING [train.py:1204] (2/4) Exclude cut with ID _23818_210_6228_1_1531917063884_3676920_400-343925-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 22:54:51,986 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6961_210_2087_1_1527937168_6815011_562-165518-0 from training. Duration: 0.77 2023-10-02 22:54:53,276 WARNING [train.py:1204] (2/4) Exclude cut with ID _24271_210_12658_1_1531987270022_7190519_289-207931-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 22:54:54,650 WARNING [train.py:1204] (2/4) Exclude cut with ID _14632_210_9988_1_1530691279392_3923200_332-105904-0 from training. Duration: 0.9 2023-10-02 22:54:54,668 WARNING [train.py:1204] (2/4) Exclude cut with ID _40402_210_18219_1_1533522324115_2953150_203-232438-0_sp1.1 from training. Duration: 0.97275 2023-10-02 22:54:56,121 WARNING [train.py:1204] (2/4) Exclude cut with ID _13252_210_1925_1_1530671372047_3336909_263-168388-0 from training. Duration: 0.86 2023-10-02 22:54:56,357 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.balancer1.prob, batch_count=1048213.3333333334, ans=0.125 2023-10-02 22:54:57,526 WARNING [train.py:1204] (2/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_174-205058-0 from training. Duration: 0.95 2023-10-02 22:54:58,930 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_94-195801-0_sp1.1 from training. Duration: 0.8545625 2023-10-02 22:54:58,951 WARNING [train.py:1204] (2/4) Exclude cut with ID _55153_210_2253_1_1534384786677_3162096_269-103204-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 22:55:00,521 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.ff3_skip_rate, batch_count=1048280.0, ans=0.0 2023-10-02 22:55:01,632 WARNING [train.py:1204] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_748-36150-0 from training. Duration: 0.91 2023-10-02 22:55:01,716 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_12868_210_6753_1_1530702479_3610564_148-172861-0_sp1.1 from training. Duration: 0.4545625 2023-10-02 22:55:03,005 WARNING [train.py:1204] (2/4) Exclude cut with ID _67499_210_17282_1_1535509805220_3692430_460-77730-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 22:55:04,528 WARNING [train.py:1204] (2/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_374-20753-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 22:55:06,450 WARNING [train.py:1204] (2/4) Exclude cut with ID _45329_210_7005_1_1534157951211_6774339_374-289983-0_sp1.1 from training. Duration: 0.97275 2023-10-02 22:55:08,326 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_34886_210_10633_1_1533203882_1755661_76-156096-0_sp0.9 from training. Duration: 0.9666875 2023-10-02 22:55:11,252 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_238-256668-0_sp0.9 from training. Duration: 0.7666875 2023-10-02 22:55:11,301 WARNING [train.py:1204] (2/4) Exclude cut with ID _46683_210_9642_1_1533730913492_4591116_489-146727-0_sp1.1 from training. Duration: 0.97275 2023-10-02 22:55:14,753 WARNING [train.py:1204] (2/4) Exclude cut with ID _13938_210_9988_1_1530518718978_3693960_304-112784-0_sp0.9 from training. Duration: 0.511125 2023-10-02 22:55:17,440 INFO [train.py:1046] (2/4) Epoch 30, batch 3200, loss[loss=0.1689, simple_loss=0.2418, pruned_loss=0.04804, over 23839.00 frames. ], tot_loss[loss=0.1631, simple_loss=0.2408, pruned_loss=0.04271, over 4721614.92 frames. ], batch size: 164, lr: 3.40e-03, grad_scale: 16.0 2023-10-02 22:55:18,872 WARNING [train.py:1204] (2/4) Exclude cut with ID _24271_210_12658_1_1531987270022_7190519_243-46549-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 22:55:18,877 WARNING [train.py:1204] (2/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_78-182912-0_sp1.1 from training. Duration: 0.536375 2023-10-02 22:55:22,890 WARNING [train.py:1204] (2/4) Exclude cut with ID _57572_210_5517_1_1534921219624_3809484_121-321334-0_sp1.1 from training. Duration: 0.97275 2023-10-02 22:55:24,267 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_40047_210_2290_1_1533290519_2374311_254-102149-0_sp1.1 from training. Duration: 0.963625 2023-10-02 22:55:24,268 WARNING [train.py:1204] (2/4) Exclude cut with ID _28461_210_2253_1_1532771775189_3041299_96-152158-0 from training. Duration: 0.85 2023-10-02 22:55:25,743 WARNING [train.py:1204] (2/4) Exclude cut with ID _29877_210_6228_1_1532397791106_5276149_161-102979-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 22:55:29,941 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_3787_210_2040_1_1526477544_5478586_603-120324-0_sp0.9 from training. Duration: 0.9 2023-10-02 22:55:34,510 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_41750_210_1960_1_1533520678_3948042_247-213456-0_sp1.1 from training. Duration: 0.97275 2023-10-02 22:55:38,171 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.ff3_skip_rate, batch_count=1048413.3333333334, ans=0.0 2023-10-02 22:55:42,210 WARNING [train.py:1204] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_127-119109-0_sp0.9 from training. Duration: 0.8 2023-10-02 22:55:45,760 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.conv_module1.balancer1.min_positive, batch_count=1048480.0, ans=0.025 2023-10-02 22:55:47,159 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.balancer1.prob, batch_count=1048480.0, ans=0.125 2023-10-02 22:55:52,881 WARNING [train.py:1204] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_215-202973-0 from training. Duration: 0.88 2023-10-02 22:55:52,970 WARNING [train.py:1204] (2/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_17-320491-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 22:55:55,844 WARNING [train.py:1204] (2/4) Exclude cut with ID _5166_210_2084_1_1527073197160_4087993_310-114452-0 from training. Duration: 0.59 2023-10-02 22:55:57,163 WARNING [train.py:1204] (2/4) Exclude cut with ID _15133_210_6228_1_1530794512074_3080379_29-264458-0_sp0.9 from training. Duration: 0.6444375 2023-10-02 22:55:59,976 WARNING [train.py:1204] (2/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_159-281310-0_sp1.1 from training. Duration: 0.863625 2023-10-02 22:56:01,249 WARNING [train.py:1204] (2/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_195-167011-0_sp0.9 from training. Duration: 0.8333125 2023-10-02 22:56:01,331 WARNING [train.py:1204] (2/4) Exclude cut with ID _14087_210_2253_1_1530529224512_3014029_156-257607-0_sp1.1 from training. Duration: 0.7545625 2023-10-02 22:56:06,456 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_2295_210_2087_1_1525500993_2407863_116-238894-0 from training. Duration: 0.7 2023-10-02 22:56:07,816 WARNING [train.py:1204] (2/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_321-335475-0_sp0.9 from training. Duration: 0.5 2023-10-02 22:56:10,472 WARNING [train.py:1204] (2/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_188-114464-0 from training. Duration: 0.82 2023-10-02 22:56:10,758 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.bypass_mid.scale_min, batch_count=1048546.6666666666, ans=0.2 2023-10-02 22:56:14,544 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_2592_210_2290_1_1525697025_4882925_157-70512-0 from training. Duration: 0.91 2023-10-02 22:56:14,681 WARNING [train.py:1204] (2/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_138-64210-0_sp1.1 from training. Duration: 0.663625 2023-10-02 22:56:19,393 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.579e+02 1.819e+02 2.030e+02 2.429e+02 3.151e+02, threshold=4.060e+02, percent-clipped=0.0 2023-10-02 22:56:19,567 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_11305_210_1794_1_1529750428_7178148_651-95702-0_sp1.1 from training. Duration: 0.963625 2023-10-02 22:56:21,345 WARNING [train.py:1204] (2/4) Exclude cut with ID _14224_210_4381_1_1530529062037_4264010_379-184223-0_sp0.9 from training. Duration: 0.9444375 2023-10-02 22:56:21,371 WARNING [train.py:1204] (2/4) Exclude cut with ID _8394_210_6702_1_1528590504316_3540189_381-125325-0_sp1.1 from training. Duration: 0.963625 2023-10-02 22:56:21,421 WARNING [train.py:1204] (2/4) Exclude cut with ID IC0622W0021-307659 from training. Duration: 0.896 2023-10-02 22:56:21,423 WARNING [train.py:1204] (2/4) Exclude cut with ID _7624_210_1531_1_1528109802198_4933101_418-349834-0_sp1.1 from training. Duration: 0.5454375 2023-10-02 22:56:24,243 WARNING [train.py:1204] (2/4) Exclude cut with ID _49728_210_12651_1_1534041034294_6788150_607-3416-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 22:56:26,923 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8275_210_4198_1_1528455320_3892571_491-17707-0 from training. Duration: 0.64 2023-10-02 22:56:26,974 WARNING [train.py:1204] (2/4) Exclude cut with ID _5287_210_2084_1_1527418883040_3878670_333-48508-0 from training. Duration: 0.6 2023-10-02 22:56:28,209 WARNING [train.py:1204] (2/4) Exclude cut with ID _8365_210_2064_1_1528592311070_4677690_371-122295-0 from training. Duration: 0.55 2023-10-02 22:56:29,736 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.ff2_skip_rate, batch_count=1048680.0, ans=0.0 2023-10-02 22:56:30,792 INFO [train.py:1046] (2/4) Epoch 30, batch 3250, loss[loss=0.1687, simple_loss=0.2573, pruned_loss=0.04006, over 24468.00 frames. ], tot_loss[loss=0.1633, simple_loss=0.2407, pruned_loss=0.04294, over 4714437.63 frames. ], batch size: 69, lr: 3.40e-03, grad_scale: 16.0 2023-10-02 22:56:30,872 WARNING [train.py:1204] (2/4) Exclude cut with ID _14860_210_4915_1_1530702020340_7343240_836-328489-0 from training. Duration: 0.9 2023-10-02 22:56:32,286 WARNING [train.py:1204] (2/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_1033-274376-0_sp0.9 from training. Duration: 0.9666875 2023-10-02 22:56:34,279 WARNING [train.py:1204] (2/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_475-127455-0_sp0.9 from training. Duration: 0.77775 2023-10-02 22:56:34,286 WARNING [train.py:1204] (2/4) Exclude cut with ID IC0671W0071-331914 from training. Duration: 0.896 2023-10-02 22:56:35,715 WARNING [train.py:1204] (2/4) Exclude cut with ID _30327_210_16049_1_1532484161295_3924169_2-1523-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 22:56:35,717 WARNING [train.py:1204] (2/4) Exclude cut with ID _24314_210_4915_1_1532135078116_7170179_460-208279-0_sp1.1 from training. Duration: 0.97275 2023-10-02 22:56:37,116 WARNING [train.py:1204] (2/4) Exclude cut with ID IC0679W0052-335859 from training. Duration: 0.64 2023-10-02 22:56:41,725 WARNING [train.py:1204] (2/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_3-237434-0_sp1.1 from training. Duration: 0.6909375 2023-10-02 22:56:43,196 WARNING [train.py:1204] (2/4) Exclude cut with ID _69884_210_9771_1_1535846392818_4166169_405-185283-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 22:56:50,645 WARNING [train.py:1204] (2/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_927-83684-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 22:56:50,653 WARNING [train.py:1204] (2/4) Exclude cut with ID _6694_210_4599_1_1527852637490_3965529_356-169233-0 from training. Duration: 0.99 2023-10-02 22:56:52,129 WARNING [train.py:1204] (2/4) Exclude cut with ID _19896_210_1883_1_1531709022662_6649569_562-233001-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 22:56:53,942 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_20120_210_6753_1_1531643035_5482859_354-78629-0_sp1.1 from training. Duration: 0.963625 2023-10-02 22:56:53,943 WARNING [train.py:1204] (2/4) Exclude cut with ID _60135_210_13427_1_1534935557922_3651319_96-168198-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 22:56:55,265 WARNING [train.py:1204] (2/4) Exclude cut with ID _31602_210_5854_1_1532677021456_4458250_249-96604-0_sp1.1 from training. Duration: 0.8090625 2023-10-02 22:56:55,299 WARNING [train.py:1204] (2/4) Exclude cut with ID _14100_210_8341_1_1530529320613_3678069_258-224599-0_sp0.9 from training. Duration: 0.8555625 2023-10-02 22:56:55,431 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.1.ff2_skip_rate, batch_count=1048746.6666666667, ans=0.0 2023-10-02 22:56:58,002 WARNING [train.py:1204] (2/4) Exclude cut with ID _19708_210_9206_1_1532136650733_4238364_215-160135-0_sp1.1 from training. Duration: 0.97275 2023-10-02 22:56:58,037 WARNING [train.py:1204] (2/4) Exclude cut with ID _21878_210_3525_1_1531738772002_7264330_113-18163-0_sp0.9 from training. Duration: 0.988875 2023-10-02 22:56:59,409 WARNING [train.py:1204] (2/4) Exclude cut with ID _64135_210_2253_1_1535677267876_7608972_552-337340-0_sp1.1 from training. Duration: 0.92725 2023-10-02 22:56:59,446 WARNING [train.py:1204] (2/4) Exclude cut with ID _67725_210_27767_1_1535608756671_5105359_10-12887-0_sp1.1 from training. Duration: 0.97275 2023-10-02 22:56:59,452 WARNING [train.py:1204] (2/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_401-284840-0_sp1.1 from training. Duration: 0.97275 2023-10-02 22:57:00,721 WARNING [train.py:1204] (2/4) Exclude cut with ID _44659_210_2721_1_1534036120061_6614860_483-141285-0_sp1.1 from training. Duration: 0.936375 2023-10-02 22:57:02,200 WARNING [train.py:1204] (2/4) Exclude cut with ID _9461_210_4557_1_1529060514726_3677451_358-332734-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 22:57:02,956 WARNING [train.py:1204] (2/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_788-298907-0_sp1.1 from training. Duration: 0.8090625 2023-10-02 22:57:05,556 WARNING [train.py:1204] (2/4) Exclude cut with ID _39411_210_5986_1_1533538825030_3343649_354-267196-0_sp1.1 from training. Duration: 0.92725 2023-10-02 22:57:05,584 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_14953_210_2151_1_1530701710_5616165_192-309792-0_sp1.1 from training. Duration: 0.97275 2023-10-02 22:57:06,956 WARNING [train.py:1204] (2/4) Exclude cut with ID _59170_210_6721_1_1534983633960_4203335_290-151255-0_sp1.1 from training. Duration: 0.92725 2023-10-02 22:57:06,989 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_5575_210_1410_1_1527301305_2570170_249-93769-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 22:57:06,999 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_46013_210_7234_1_1533776362_3744032_463-52378-0_sp1.1 from training. Duration: 0.7818125 2023-10-02 22:57:11,807 WARNING [train.py:1204] (2/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_580-93798-0 from training. Duration: 0.56 2023-10-02 22:57:13,093 WARNING [train.py:1204] (2/4) Exclude cut with ID _19514_210_9259_1_1531530071330_5084230_385-13762-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 22:57:13,097 WARNING [train.py:1204] (2/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_458-67321-0_sp1.1 from training. Duration: 0.8818125 2023-10-02 22:57:14,553 WARNING [train.py:1204] (2/4) Exclude cut with ID _41298_210_9019_1_1533430371450_4822143_188-154791-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 22:57:14,616 WARNING [train.py:1204] (2/4) Exclude cut with ID _15037_210_2253_1_1530752294104_3267650_296-192597-0_sp0.9 from training. Duration: 0.9 2023-10-02 22:57:21,891 WARNING [train.py:1204] (2/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_661-264857-0_sp1.1 from training. Duration: 0.8454375 2023-10-02 22:57:26,627 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_34008_210_10431_1_1533345424_8183247_662-317331-0_sp1.1 from training. Duration: 0.8545625 2023-10-02 22:57:26,658 WARNING [train.py:1204] (2/4) Exclude cut with ID _46502_210_13794_1_1533724452198_3657159_241-278329-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 22:57:26,658 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_11349_210_6753_1_1529800861_1101207_26-65602-0 from training. Duration: 0.96 2023-10-02 22:57:26,662 WARNING [train.py:1204] (2/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_448-4048-0_sp1.1 from training. Duration: 0.87275 2023-10-02 22:57:26,677 WARNING [train.py:1204] (2/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_662-275621-0_sp1.1 from training. Duration: 0.3818125 2023-10-02 22:57:27,992 WARNING [train.py:1204] (2/4) Exclude cut with ID _13938_210_9988_1_1530518718978_3693960_256-54454-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 22:57:30,691 WARNING [train.py:1204] (2/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_256-32662-0 from training. Duration: 0.9 2023-10-02 22:57:30,733 WARNING [train.py:1204] (2/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_326-17061-0 from training. Duration: 0.94 2023-10-02 22:57:30,775 WARNING [train.py:1204] (2/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_505-97987-0_sp1.1 from training. Duration: 0.7818125 2023-10-02 22:57:32,755 WARNING [train.py:1204] (2/4) Exclude cut with ID _66873_210_5517_1_1535454136506_4293370_316-294364-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 22:57:34,041 WARNING [train.py:1204] (2/4) Exclude cut with ID _19514_210_9259_1_1531530071330_5084230_762-231063-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 22:57:34,102 WARNING [train.py:1204] (2/4) Exclude cut with ID _4697_210_2069_1_1526815801953_3823098_68-219807-0_sp0.9 from training. Duration: 0.52225 2023-10-02 22:57:34,246 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.balancer2.prob, batch_count=1048946.6666666667, ans=0.125 2023-10-02 22:57:35,329 WARNING [train.py:1204] (2/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_479-158053-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 22:57:38,166 WARNING [train.py:1204] (2/4) Exclude cut with ID _66112_210_10635_1_1535590236305_3801484_83-38181-0_sp1.1 from training. Duration: 0.936375 2023-10-02 22:57:38,187 WARNING [train.py:1204] (2/4) Exclude cut with ID _55857_210_2346_1_1534644007831_6898209_255-146346-0_sp1.1 from training. Duration: 0.963625 2023-10-02 22:57:41,452 WARNING [train.py:1204] (2/4) Exclude cut with ID _48836_210_12210_1_1534075848293_3904150_429-35475-0 from training. Duration: 0.89 2023-10-02 22:57:41,469 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_50154_210_13731_1_1534750862_1080961_119-187163-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 22:57:44,146 WARNING [train.py:1204] (2/4) Exclude cut with ID _8793_210_6310_1_1528542985139_2310009_121-179782-0_sp0.9 from training. Duration: 0.888875 2023-10-02 22:57:44,149 WARNING [train.py:1204] (2/4) Exclude cut with ID _14884_210_2252_1_1530707459689_5561250_591-173406-0 from training. Duration: 0.96 2023-10-02 22:57:45,603 INFO [train.py:1046] (2/4) Epoch 30, batch 3300, loss[loss=0.177, simple_loss=0.2608, pruned_loss=0.04661, over 24350.00 frames. ], tot_loss[loss=0.1637, simple_loss=0.2413, pruned_loss=0.04301, over 4718668.64 frames. ], batch size: 77, lr: 3.40e-03, grad_scale: 8.0 2023-10-02 22:57:47,085 WARNING [train.py:1204] (2/4) Exclude cut with ID _15133_210_6228_1_1530794512074_3080379_54-198193-0_sp1.1 from training. Duration: 0.8545625 2023-10-02 22:57:47,094 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6545_210_2040_1_1527774670_4245258_407-48070-0 from training. Duration: 0.76 2023-10-02 22:57:48,985 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1032-236863-0 from training. Duration: 0.84 2023-10-02 22:57:49,065 WARNING [train.py:1204] (2/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_507-303370-0 from training. Duration: 0.96 2023-10-02 22:57:49,094 WARNING [train.py:1204] (2/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_164-282366-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 22:57:55,072 WARNING [train.py:1204] (2/4) Exclude cut with ID _39867_210_9826_1_1533553222030_9344259_312-158727-0_sp1.1 from training. Duration: 0.963625 2023-10-02 22:57:56,401 WARNING [train.py:1204] (2/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_174-205058-0_sp1.1 from training. Duration: 0.863625 2023-10-02 22:57:56,430 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_11305_210_1794_1_1529750428_7178148_166-237358-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 22:57:59,134 WARNING [train.py:1204] (2/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_26-175553-0_sp1.1 from training. Duration: 0.5545625 2023-10-02 22:57:59,167 WARNING [train.py:1204] (2/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_96-290039-0_sp1.1 from training. Duration: 0.5090625 2023-10-02 22:58:01,865 WARNING [train.py:1204] (2/4) Exclude cut with ID _53219_210_4525_1_1534834836008_4064825_261-101691-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 22:58:03,273 WARNING [train.py:1204] (2/4) Exclude cut with ID _24271_210_12658_1_1531987270022_7190519_96-293390-0_sp1.1 from training. Duration: 0.936375 2023-10-02 22:58:06,712 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_271-234962-0 from training. Duration: 0.79 2023-10-02 22:58:08,053 WARNING [train.py:1204] (2/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_136-30660-0_sp1.1 from training. Duration: 0.97275 2023-10-02 22:58:08,299 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.conv_module1.balancer2.prob, batch_count=1049080.0, ans=0.125 2023-10-02 22:58:09,262 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_20120_210_6753_1_1531643035_5482859_625-281046-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 22:58:10,570 WARNING [train.py:1204] (2/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_333-168193-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 22:58:10,624 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9029W0269-509664 from training. Duration: 0.896 2023-10-02 22:58:10,833 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.self_attn_weights.pos_emb_skip_rate, batch_count=1049080.0, ans=0.0 2023-10-02 22:58:11,996 WARNING [train.py:1204] (2/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_450-147393-0_sp1.1 from training. Duration: 0.92725 2023-10-02 22:58:13,789 WARNING [train.py:1204] (2/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_643-52007-0_sp1.1 from training. Duration: 0.5181875 2023-10-02 22:58:15,184 WARNING [train.py:1204] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_330-327861-0_sp0.9 from training. Duration: 0.7444375 2023-10-02 22:58:15,193 WARNING [train.py:1204] (2/4) Exclude cut with ID _47790_210_16187_1_1533862814066_4207519_260-317651-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 22:58:15,211 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9044W0010-516654 from training. Duration: 0.896 2023-10-02 22:58:19,369 WARNING [train.py:1204] (2/4) Exclude cut with ID _14087_210_2253_1_1530529224512_3014029_265-118378-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 22:58:19,383 WARNING [train.py:1204] (2/4) Exclude cut with ID _6194_210_1534_1_1527511826160_3941194_321-148641-0_sp0.9 from training. Duration: 0.711125 2023-10-02 22:58:20,860 WARNING [train.py:1204] (2/4) Exclude cut with ID _54871_210_22356_1_1534665798253_3856359_101-169176-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 22:58:20,863 WARNING [train.py:1204] (2/4) Exclude cut with ID 497-129325-0061-62254-0_sp1.1 from training. Duration: 0.97725 2023-10-02 22:58:22,772 WARNING [train.py:1204] (2/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_795-33252-0_sp1.1 from training. Duration: 0.336375 2023-10-02 22:58:22,790 WARNING [train.py:1204] (2/4) Exclude cut with ID _28317_210_12742_1_1532480302381_8510325_168-246176-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 22:58:24,158 WARNING [train.py:1204] (2/4) Exclude cut with ID _7160_210_1876_1_1527940923231_3703799_120-150943-0_sp1.1 from training. Duration: 0.736375 2023-10-02 22:58:26,891 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9084W0005-536844 from training. Duration: 0.768 2023-10-02 22:58:27,012 WARNING [train.py:1204] (2/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_234-260682-0 from training. Duration: 0.84 2023-10-02 22:58:27,061 WARNING [train.py:1204] (2/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_188-114464-0_sp0.9 from training. Duration: 0.911125 2023-10-02 22:58:30,340 WARNING [train.py:1204] (2/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_81-75849-0 from training. Duration: 0.39 2023-10-02 22:58:32,979 WARNING [train.py:1204] (2/4) Exclude cut with ID _14224_210_4381_1_1530529062037_4264010_379-184223-0_sp1.1 from training. Duration: 0.77275 2023-10-02 22:58:34,587 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.bypass.skip_rate, batch_count=1049213.3333333333, ans=0.07 2023-10-02 22:58:35,628 WARNING [train.py:1204] (2/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_454-123002-0_sp1.1 from training. Duration: 0.62725 2023-10-02 22:58:37,007 WARNING [train.py:1204] (2/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_887-249555-0_sp1.1 from training. Duration: 0.7 2023-10-02 22:58:38,442 WARNING [train.py:1204] (2/4) Exclude cut with ID _31088_210_16446_1_1532520012284_3440919_20-260902-0_sp1.1 from training. Duration: 0.97275 2023-10-02 22:58:40,424 WARNING [train.py:1204] (2/4) Exclude cut with ID _45329_210_7005_1_1534157951211_6774339_856-344574-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 22:58:40,426 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_36756_210_7234_1_1533026991_966276_4-164118-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 22:58:40,440 WARNING [train.py:1204] (2/4) Exclude cut with ID _8365_210_2064_1_1528592311070_4677690_371-122295-0_sp1.1 from training. Duration: 0.5 2023-10-02 22:58:41,857 WARNING [train.py:1204] (2/4) Exclude cut with ID _5910_210_1389_1_1527337972454_5050100_555-159815-0_sp1.1 from training. Duration: 0.8909375 2023-10-02 22:58:41,866 WARNING [train.py:1204] (2/4) Exclude cut with ID _37086_210_9038_1_1532999540891_7021250_469-147941-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 22:58:43,214 WARNING [train.py:1204] (2/4) Exclude cut with ID _3067_210_2667_1_1526212974640_4503168_459-173867-0_sp0.9 from training. Duration: 0.97775 2023-10-02 22:58:45,095 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9156W0006-572499 from training. Duration: 0.896 2023-10-02 22:58:46,458 WARNING [train.py:1204] (2/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_277-210798-0 from training. Duration: 0.55 2023-10-02 22:58:47,861 WARNING [train.py:1204] (2/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_103-336064-0_sp1.1 from training. Duration: 0.463625 2023-10-02 22:58:49,137 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.469e+02 1.888e+02 2.006e+02 2.224e+02 2.991e+02, threshold=4.012e+02, percent-clipped=0.0 2023-10-02 22:58:49,212 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_3414_210_1988_1_1526212718_4813195_331-290945-0_sp1.1 from training. Duration: 0.936375 2023-10-02 22:58:49,213 WARNING [train.py:1204] (2/4) Exclude cut with ID _67499_210_17282_1_1535509805220_3692430_365-29276-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 22:58:49,355 WARNING [train.py:1204] (2/4) Exclude cut with ID _14821_210_8750_1_1530680503991_6829359_69-250847-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 22:58:49,357 WARNING [train.py:1204] (2/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_1102-61301-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 22:58:49,523 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.balancer2.prob, batch_count=1049280.0, ans=0.125 2023-10-02 22:58:50,703 WARNING [train.py:1204] (2/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_166-246557-0_sp1.1 from training. Duration: 0.5818125 2023-10-02 22:58:50,754 WARNING [train.py:1204] (2/4) Exclude cut with ID _15351_210_10626_1_1530856635653_3576210_214-129918-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 22:58:50,765 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_120-41668-0_sp0.9 from training. Duration: 0.77775 2023-10-02 22:58:52,216 WARNING [train.py:1204] (2/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_27-72278-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 22:58:54,241 WARNING [train.py:1204] (2/4) Exclude cut with ID _24314_210_4915_1_1532135078116_7170179_521-258868-0_sp1.1 from training. Duration: 0.7181875 2023-10-02 22:58:58,914 WARNING [train.py:1204] (2/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_143-86344-0 from training. Duration: 0.77 2023-10-02 22:58:58,955 WARNING [train.py:1204] (2/4) Exclude cut with ID _53489_210_6228_1_1534298357169_3835970_465-157433-0_sp1.1 from training. Duration: 0.92725 2023-10-02 22:59:00,259 INFO [train.py:1046] (2/4) Epoch 30, batch 3350, loss[loss=0.1693, simple_loss=0.2575, pruned_loss=0.04056, over 24379.00 frames. ], tot_loss[loss=0.1643, simple_loss=0.2424, pruned_loss=0.04314, over 4721944.34 frames. ], batch size: 77, lr: 3.40e-03, grad_scale: 8.0 2023-10-02 22:59:00,330 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_248-319353-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 22:59:00,541 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.ff3_skip_rate, batch_count=1049346.6666666667, ans=0.0 2023-10-02 22:59:01,751 WARNING [train.py:1204] (2/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_102-77064-0_sp1.1 from training. Duration: 0.8454375 2023-10-02 22:59:01,775 WARNING [train.py:1204] (2/4) Exclude cut with ID _9389_210_1664_1_1529217337301_5289269_379-128794-0_sp1.1 from training. Duration: 0.77275 2023-10-02 22:59:03,156 WARNING [train.py:1204] (2/4) Exclude cut with ID _3231_210_2106_1_1526205864934_6784220_236-260835-0_sp1.1 from training. Duration: 0.97275 2023-10-02 22:59:04,585 WARNING [train.py:1204] (2/4) Exclude cut with ID _24271_210_12658_1_1531987270022_7190519_184-342363-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 22:59:04,586 WARNING [train.py:1204] (2/4) Exclude cut with ID _27894_210_12392_1_1532415496078_6902439_67-221663-0_sp1.1 from training. Duration: 0.92725 2023-10-02 22:59:08,773 WARNING [train.py:1204] (2/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_304-59227-0_sp1.1 from training. Duration: 0.7 2023-10-02 22:59:10,758 WARNING [train.py:1204] (2/4) Exclude cut with ID _58605_210_12210_1_1534723195939_3904259_458-42232-0_sp1.1 from training. Duration: 0.92725 2023-10-02 22:59:10,910 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.0.conv_skip_rate, batch_count=1049346.6666666667, ans=0.0 2023-10-02 22:59:12,131 WARNING [train.py:1204] (2/4) Exclude cut with ID _14922_210_6228_1_1530707435273_3693310_13-165512-0_sp0.9 from training. Duration: 0.911125 2023-10-02 22:59:14,851 WARNING [train.py:1204] (2/4) Exclude cut with ID _58602_210_2346_1_1534903210085_6908830_792-102241-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 22:59:14,984 WARNING [train.py:1204] (2/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_588-125165-0_sp0.9 from training. Duration: 0.92225 2023-10-02 22:59:16,410 WARNING [train.py:1204] (2/4) Exclude cut with ID _54218_210_12210_1_1534413646388_4409259_239-98429-0_sp1.1 from training. Duration: 0.97275 2023-10-02 22:59:18,299 WARNING [train.py:1204] (2/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_538-32291-0_sp1.1 from training. Duration: 0.7545625 2023-10-02 22:59:19,643 WARNING [train.py:1204] (2/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_419-317062-0 from training. Duration: 0.91 2023-10-02 22:59:19,744 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9309W0202-640329 from training. Duration: 0.896 2023-10-02 22:59:21,040 WARNING [train.py:1204] (2/4) Exclude cut with ID _3231_210_2106_1_1526205864934_6784220_419-313291-0_sp1.1 from training. Duration: 0.97275 2023-10-02 22:59:24,465 WARNING [train.py:1204] (2/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_619-344499-0 from training. Duration: 0.63 2023-10-02 22:59:24,477 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_278-272612-0 from training. Duration: 0.73 2023-10-02 22:59:24,559 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_103-84010-0_sp0.9 from training. Duration: 0.7666875 2023-10-02 22:59:24,580 WARNING [train.py:1204] (2/4) Exclude cut with ID _26528_210_9070_1_1532570410842_3851310_497-97218-0_sp1.1 from training. Duration: 0.7818125 2023-10-02 22:59:25,353 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.1.encoder.layers.0.nonlin_attention.whiten1, num_groups=1, num_channels=192, metric=4.54 vs. limit=10.0 2023-10-02 22:59:25,854 WARNING [train.py:1204] (2/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_262-84613-0_sp1.1 from training. Duration: 0.936375 2023-10-02 22:59:25,887 WARNING [train.py:1204] (2/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_387-131538-0 from training. Duration: 0.81 2023-10-02 22:59:25,928 WARNING [train.py:1204] (2/4) Exclude cut with ID _8030_210_7497_1_1528444886781_3531089_224-300489-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 22:59:25,951 WARNING [train.py:1204] (2/4) Exclude cut with ID _14349_210_8341_1_1530615710818_3442470_170-336541-0_sp0.9 from training. Duration: 0.9555625 2023-10-02 22:59:29,248 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_47069_210_15368_1_1533781267_4431320_180-31746-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 22:59:31,848 WARNING [train.py:1204] (2/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_447-7041-0_sp1.1 from training. Duration: 0.92725 2023-10-02 22:59:31,874 WARNING [train.py:1204] (2/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_2-55283-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 22:59:31,930 WARNING [train.py:1204] (2/4) Exclude cut with ID _12555_210_8750_1_1530075594402_3780239_70-181911-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 22:59:33,979 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.2.encoder.layers.2.self_attn1.whiten, num_groups=1, num_channels=384, metric=17.99 vs. limit=22.5 2023-10-02 22:59:34,652 WARNING [train.py:1204] (2/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_311-328022-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 22:59:36,095 WARNING [train.py:1204] (2/4) Exclude cut with ID _14528_210_8254_1_1530669700514_4306939_330-2987-0_sp1.1 from training. Duration: 0.92725 2023-10-02 22:59:36,135 WARNING [train.py:1204] (2/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_385-184858-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 22:59:40,834 WARNING [train.py:1204] (2/4) Exclude cut with ID _64414_210_17301_1_1535243252048_3757759_348-333360-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 22:59:42,058 WARNING [train.py:1204] (2/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_753-214751-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 22:59:43,500 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_17613_210_2151_1_1531137569_2366160_202-208013-0_sp1.1 from training. Duration: 0.92725 2023-10-02 22:59:43,516 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_43831_210_2158_1_1533725502_5872791_610-129300-0_sp1.1 from training. Duration: 0.936375 2023-10-02 22:59:44,906 WARNING [train.py:1204] (2/4) Exclude cut with ID _54218_210_12210_1_1534413646388_4409259_321-23852-0_sp1.1 from training. Duration: 0.936375 2023-10-02 22:59:46,334 WARNING [train.py:1204] (2/4) Exclude cut with ID _39411_210_5986_1_1533538825030_3343649_173-238248-0 from training. Duration: 0.83 2023-10-02 22:59:48,022 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_405-339986-0_sp0.9 from training. Duration: 0.5666875 2023-10-02 22:59:48,056 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_2009_210_2071_1_1525481134_4588937_385-302100-0 from training. Duration: 0.87 2023-10-02 22:59:48,085 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_161-276996-0_sp1.1 from training. Duration: 0.863625 2023-10-02 22:59:48,552 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.5.encoder.layers.1.conv_module1.whiten, num_groups=1, num_channels=256, metric=3.16 vs. limit=15.0 2023-10-02 22:59:49,503 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_177-58005-0 from training. Duration: 0.62 2023-10-02 22:59:50,013 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.5.encoder.layers.0.conv_module1.whiten, num_groups=1, num_channels=256, metric=6.68 vs. limit=15.0 2023-10-02 22:59:50,904 WARNING [train.py:1204] (2/4) Exclude cut with ID _64639_210_5816_1_1535367619437_3528000_146-343734-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 22:59:50,971 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder_embed.convnext.hidden_balancer.prob, batch_count=1049546.6666666667, ans=0.125 2023-10-02 22:59:52,833 WARNING [train.py:1204] (2/4) Exclude cut with ID _3845_210_1390_1_1526731326141_3523610_72-61441-0_sp1.1 from training. Duration: 0.92725 2023-10-02 22:59:57,844 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.conv_module2.balancer1.prob, batch_count=1049546.6666666667, ans=0.125 2023-10-02 23:00:00,333 WARNING [train.py:1204] (2/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_991-125028-0_sp1.1 from training. Duration: 0.936375 2023-10-02 23:00:00,402 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7544_210_6753_1_1528591327_1471426_172-348777-0 from training. Duration: 0.66 2023-10-02 23:00:01,713 WARNING [train.py:1204] (2/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_416-257730-0_sp1.1 from training. Duration: 0.5909375 2023-10-02 23:00:03,141 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1113-7369-0_sp1.1 from training. Duration: 0.77275 2023-10-02 23:00:04,535 WARNING [train.py:1204] (2/4) Exclude cut with ID _40402_210_18219_1_1533522324115_2953150_242-76943-0_sp1.1 from training. Duration: 0.8545625 2023-10-02 23:00:07,610 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=1049613.3333333333, ans=0.1 2023-10-02 23:00:10,105 WARNING [train.py:1204] (2/4) Exclude cut with ID _44718_210_5816_1_1534050051869_6469030_190-49648-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 23:00:12,012 WARNING [train.py:1204] (2/4) Exclude cut with ID _47251_210_18628_1_1533814063128_4137250_214-88076-0 from training. Duration: 0.88 2023-10-02 23:00:12,038 WARNING [train.py:1204] (2/4) Exclude cut with ID _5585_210_2667_1_1527418813324_8007340_89-332072-0_sp1.1 from training. Duration: 0.5818125 2023-10-02 23:00:13,465 WARNING [train.py:1204] (2/4) Exclude cut with ID _41817_210_18835_1_1533434477160_7423049_555-293448-0_sp1.1 from training. Duration: 0.72725 2023-10-02 23:00:14,795 INFO [train.py:1046] (2/4) Epoch 30, batch 3400, loss[loss=0.1807, simple_loss=0.2664, pruned_loss=0.04755, over 24360.00 frames. ], tot_loss[loss=0.165, simple_loss=0.2436, pruned_loss=0.04323, over 4723882.84 frames. ], batch size: 74, lr: 3.40e-03, grad_scale: 8.0 2023-10-02 23:00:14,877 WARNING [train.py:1204] (2/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_286-331314-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 23:00:14,944 WARNING [train.py:1204] (2/4) Exclude cut with ID _13921_210_9994_1_1530838702800_3003429_46-149444-0 from training. Duration: 0.94 2023-10-02 23:00:16,210 WARNING [train.py:1204] (2/4) Exclude cut with ID _44778_210_2054_1_1533798127203_7443090_768-43595-0_sp1.1 from training. Duration: 0.936375 2023-10-02 23:00:16,226 WARNING [train.py:1204] (2/4) Exclude cut with ID _35355_210_16040_1_1533212988977_4016229_74-190076-0 from training. Duration: 0.8 2023-10-02 23:00:17,595 WARNING [train.py:1204] (2/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_1007-316380-0_sp1.1 from training. Duration: 0.963625 2023-10-02 23:00:17,653 WARNING [train.py:1204] (2/4) Exclude cut with ID _23557_210_8750_1_1531890106370_6846255_150-283437-0_sp1.1 from training. Duration: 0.963625 2023-10-02 23:00:18,964 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_3474_210_2028_1_1526188513_4825246_145-309131-0_sp1.1 from training. Duration: 0.67275 2023-10-02 23:00:20,866 WARNING [train.py:1204] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_391-22148-0_sp1.1 from training. Duration: 0.7 2023-10-02 23:00:20,889 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_238-256668-0 from training. Duration: 0.69 2023-10-02 23:00:25,057 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.0.bypass.scale_min, batch_count=1049680.0, ans=0.2 2023-10-02 23:00:26,822 WARNING [train.py:1204] (2/4) Exclude cut with ID _8394_210_6702_1_1528590504316_3540189_372-198878-0 from training. Duration: 0.6 2023-10-02 23:00:26,839 WARNING [train.py:1204] (2/4) Exclude cut with ID ID0174W0006-766659 from training. Duration: 0.953 2023-10-02 23:00:26,860 WARNING [train.py:1204] (2/4) Exclude cut with ID _6274_210_2084_1_1527937583837_7494020_668-23019-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 23:00:29,998 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.conv_skip_rate, batch_count=1049746.6666666667, ans=0.0 2023-10-02 23:00:31,651 WARNING [train.py:1204] (2/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_322-74328-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 23:00:31,660 WARNING [train.py:1204] (2/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_872-264525-0_sp1.1 from training. Duration: 0.5909375 2023-10-02 23:00:32,993 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_53378_210_17282_1_1534221071_5769509_641-286201-0_sp1.1 from training. Duration: 0.97275 2023-10-02 23:00:34,225 WARNING [train.py:1204] (2/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_199-295611-0_sp1.1 from training. Duration: 0.5 2023-10-02 23:00:38,430 WARNING [train.py:1204] (2/4) Exclude cut with ID _5250_210_5219_1_1527249561806_8692110_1270-317971-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 23:00:39,807 WARNING [train.py:1204] (2/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_784-213099-0 from training. Duration: 0.93 2023-10-02 23:00:41,606 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.ff3_skip_rate, batch_count=1049746.6666666667, ans=0.0 2023-10-02 23:00:44,822 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_115-233929-0_sp0.9 from training. Duration: 0.8 2023-10-02 23:00:46,192 WARNING [train.py:1204] (2/4) Exclude cut with ID _54871_210_22356_1_1534665798253_3856359_256-223809-0_sp1.1 from training. Duration: 0.97275 2023-10-02 23:00:47,550 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_20120_210_6753_1_1531643035_5482859_285-87781-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 23:00:48,801 WARNING [train.py:1204] (2/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_619-344499-0_sp1.1 from training. Duration: 0.57275 2023-10-02 23:00:53,735 WARNING [train.py:1204] (2/4) Exclude cut with ID _60229_210_27767_1_1534838406158_3655350_264-279573-0_sp0.9 from training. Duration: 0.92225 2023-10-02 23:00:55,844 WARNING [train.py:1204] (2/4) Exclude cut with ID _7719_210_3525_1_1528456153163_4296960_333-110399-0 from training. Duration: 0.96 2023-10-02 23:01:01,829 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_50423_210_13270_1_1534128598_5021734_759-90833-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 23:01:01,883 WARNING [train.py:1204] (2/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_151-254798-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 23:01:03,156 WARNING [train.py:1204] (2/4) Exclude cut with ID _3465_210_3525_1_1526175667060_3574049_450-203637-0 from training. Duration: 0.92 2023-10-02 23:01:03,188 WARNING [train.py:1204] (2/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_673-36456-0_sp1.1 from training. Duration: 0.936375 2023-10-02 23:01:03,432 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.feed_forward3.hidden_balancer.prob, batch_count=1049880.0, ans=0.125 2023-10-02 23:01:04,492 WARNING [train.py:1204] (2/4) Exclude cut with ID _36538_210_9994_1_1533204020363_3795620_131-8685-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 23:01:04,555 WARNING [train.py:1204] (2/4) Exclude cut with ID _12556_210_8341_1_1530081024100_7327690_713-55626-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 23:01:05,850 WARNING [train.py:1204] (2/4) Exclude cut with ID _2322_210_2667_1_1525604548971_3656299_440-80494-0_sp0.9 from training. Duration: 0.97775 2023-10-02 23:01:07,339 WARNING [train.py:1204] (2/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_210-325081-0_sp1.1 from training. Duration: 0.97275 2023-10-02 23:01:11,440 WARNING [train.py:1204] (2/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_375-244976-0_sp0.9 from training. Duration: 0.8333125 2023-10-02 23:01:11,446 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_15517_210_6753_1_1530952293_1664795_119-31001-0_sp1.1 from training. Duration: 0.8545625 2023-10-02 23:01:16,992 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.2.encoder.layers.0.whiten, num_groups=1, num_channels=384, metric=5.87 vs. limit=12.0 2023-10-02 23:01:17,685 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_41452_210_7291_1_1533437687_3941281_319-42326-0_sp1.1 from training. Duration: 0.92725 2023-10-02 23:01:19,052 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.668e+02 1.969e+02 2.278e+02 2.618e+02 4.115e+02, threshold=4.556e+02, percent-clipped=1.0 2023-10-02 23:01:19,183 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_372-173012-0 from training. Duration: 0.95 2023-10-02 23:01:22,725 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.1.bypass_mid.scale_min, batch_count=1049946.6666666667, ans=0.2 2023-10-02 23:01:25,280 WARNING [train.py:1204] (2/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_41-31157-0_sp1.1 from training. Duration: 0.5181875 2023-10-02 23:01:29,865 INFO [train.py:1046] (2/4) Epoch 30, batch 3450, loss[loss=0.1653, simple_loss=0.2262, pruned_loss=0.05219, over 23413.00 frames. ], tot_loss[loss=0.1657, simple_loss=0.2442, pruned_loss=0.04363, over 4714654.85 frames. ], batch size: 285, lr: 3.40e-03, grad_scale: 8.0 2023-10-02 23:01:29,911 WARNING [train.py:1204] (2/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_91-212486-0 from training. Duration: 0.68 2023-10-02 23:01:33,104 WARNING [train.py:1204] (2/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_1267-272693-0 from training. Duration: 0.73 2023-10-02 23:01:33,135 WARNING [train.py:1204] (2/4) Exclude cut with ID _13938_210_9988_1_1530518718978_3693960_152-67046-0_sp1.1 from training. Duration: 0.936375 2023-10-02 23:01:34,587 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_4528_210_3273_1_1527076654_3276777_342-168068-0_sp1.1 from training. Duration: 0.7909375 2023-10-02 23:01:34,597 WARNING [train.py:1204] (2/4) Exclude cut with ID _7606_210_4915_1_1528894253277_5080690_127-177761-0 from training. Duration: 0.64 2023-10-02 23:01:35,939 WARNING [train.py:1204] (2/4) Exclude cut with ID _45329_210_7005_1_1534157951211_6774339_335-137972-0_sp1.1 from training. Duration: 0.92725 2023-10-02 23:01:38,760 WARNING [train.py:1204] (2/4) Exclude cut with ID _5166_210_2084_1_1527073197160_4087993_331-117829-0_sp1.1 from training. Duration: 0.563625 2023-10-02 23:01:44,877 WARNING [train.py:1204] (2/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_125-125818-0_sp1.1 from training. Duration: 0.87275 2023-10-02 23:01:46,117 WARNING [train.py:1204] (2/4) Exclude cut with ID _6789_210_1534_1_1527854471671_2870519_48-294897-0_sp1.1 from training. Duration: 0.963625 2023-10-02 23:01:47,489 WARNING [train.py:1204] (2/4) Exclude cut with ID _6694_210_4599_1_1527852637490_3965529_116-109398-0_sp1.1 from training. Duration: 0.663625 2023-10-02 23:01:47,499 WARNING [train.py:1204] (2/4) Exclude cut with ID _52892_210_6721_1_1534378941259_4124129_222-172685-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 23:01:48,995 WARNING [train.py:1204] (2/4) Exclude cut with ID _50655_210_18628_1_1534229942193_3772154_320-132275-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 23:01:53,763 WARNING [train.py:1204] (2/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_76-311963-0 from training. Duration: 0.7 2023-10-02 23:01:59,795 WARNING [train.py:1204] (2/4) Exclude cut with ID _6274_210_2084_1_1527937583837_7494020_32-205288-0 from training. Duration: 0.78 2023-10-02 23:01:59,812 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_86-16881-0_sp0.9 from training. Duration: 0.6444375 2023-10-02 23:01:59,845 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_271-297505-0_sp1.1 from training. Duration: 0.9 2023-10-02 23:02:01,453 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.conv_module1.balancer1.prob, batch_count=1050146.6666666667, ans=0.125 2023-10-02 23:02:02,607 WARNING [train.py:1204] (2/4) Exclude cut with ID _57572_210_5517_1_1534921219624_3809484_187-295293-0_sp1.1 from training. Duration: 0.963625 2023-10-02 23:02:02,867 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.conv_module1.balancer2.prob, batch_count=1050146.6666666667, ans=0.125 2023-10-02 23:02:07,110 WARNING [train.py:1204] (2/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_563-329943-0 from training. Duration: 0.99 2023-10-02 23:02:07,179 WARNING [train.py:1204] (2/4) Exclude cut with ID _7160_210_1876_1_1527940923231_3703799_302-150728-0_sp0.9 from training. Duration: 0.8666875 2023-10-02 23:02:07,400 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.ff3_skip_rate, batch_count=1050146.6666666667, ans=0.0 2023-10-02 23:02:07,410 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.balancer1.prob, batch_count=1050146.6666666667, ans=0.125 2023-10-02 23:02:10,112 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.1.balancer1.prob, batch_count=1050146.6666666667, ans=0.125 2023-10-02 23:02:10,172 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.ff3_skip_rate, batch_count=1050146.6666666667, ans=0.0 2023-10-02 23:02:11,279 WARNING [train.py:1204] (2/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_926-7526-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 23:02:11,318 WARNING [train.py:1204] (2/4) Exclude cut with ID _62574_210_2655_1_1535180407058_3933249_467-22348-0_sp1.1 from training. Duration: 0.936375 2023-10-02 23:02:13,242 WARNING [train.py:1204] (2/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_280-161633-0_sp0.9 from training. Duration: 0.811125 2023-10-02 23:02:13,421 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.nonlin_attention.balancer.prob, batch_count=1050213.3333333333, ans=0.125 2023-10-02 23:02:15,685 WARNING [train.py:1204] (2/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_385-193052-0_sp1.1 from training. Duration: 0.7818125 2023-10-02 23:02:17,118 WARNING [train.py:1204] (2/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_444-28041-0 from training. Duration: 0.85 2023-10-02 23:02:17,129 WARNING [train.py:1204] (2/4) Exclude cut with ID _66070_210_7640_1_1535441393096_7805310_341-249237-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 23:02:19,758 WARNING [train.py:1204] (2/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_425-269826-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 23:02:21,716 WARNING [train.py:1204] (2/4) Exclude cut with ID _53950_210_5816_1_1534417264919_3862951_124-185597-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 23:02:24,460 WARNING [train.py:1204] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_380-204756-0 from training. Duration: 0.86 2023-10-02 23:02:27,791 WARNING [train.py:1204] (2/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_282-257791-0_sp0.9 from training. Duration: 0.9555625 2023-10-02 23:02:32,086 WARNING [train.py:1204] (2/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_372-131163-0_sp1.1 from training. Duration: 0.8545625 2023-10-02 23:02:32,175 WARNING [train.py:1204] (2/4) Exclude cut with ID _28317_210_12742_1_1532480302381_8510325_447-233513-0_sp1.1 from training. Duration: 0.963625 2023-10-02 23:02:36,318 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_54-118333-0_sp1.1 from training. Duration: 0.97275 2023-10-02 23:02:41,194 WARNING [train.py:1204] (2/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_388-230239-0_sp1.1 from training. Duration: 0.963625 2023-10-02 23:02:41,827 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_40047_210_2290_1_1533290519_2374311_141-54639-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 23:02:41,893 WARNING [train.py:1204] (2/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_91-267696-0_sp0.9 from training. Duration: 0.9666875 2023-10-02 23:02:43,308 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_14056_210_4163_1_1530604483_7572827_532-137847-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 23:02:44,601 INFO [train.py:1046] (2/4) Epoch 30, batch 3500, loss[loss=0.178, simple_loss=0.2601, pruned_loss=0.04793, over 23297.00 frames. ], tot_loss[loss=0.1648, simple_loss=0.2424, pruned_loss=0.04363, over 4696946.75 frames. ], batch size: 105, lr: 3.40e-03, grad_scale: 8.0 2023-10-02 23:02:47,434 WARNING [train.py:1204] (2/4) Exclude cut with ID _8064_210_5399_1_1528372299291_4282973_155-283231-0_sp1.1 from training. Duration: 0.97275 2023-10-02 23:02:48,983 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_2245_210_2715_1_1525511548_3540254_310-52483-0_sp1.1 from training. Duration: 0.8 2023-10-02 23:02:50,278 WARNING [train.py:1204] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_185-174805-0 from training. Duration: 0.68 2023-10-02 23:02:53,467 WARNING [train.py:1204] (2/4) Exclude cut with ID _15012_210_1968_1_1530856820429_5095350_662-210469-0_sp1.1 from training. Duration: 0.5181875 2023-10-02 23:02:57,527 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_426-313557-0_sp0.9 from training. Duration: 0.588875 2023-10-02 23:02:59,450 WARNING [train.py:1204] (2/4) Exclude cut with ID _42326_210_7770_1_1533814282531_3707269_407-204343-0_sp1.1 from training. Duration: 0.97275 2023-10-02 23:02:59,464 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_258-310570-0 from training. Duration: 0.82 2023-10-02 23:03:02,397 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_4737_210_2040_1_1526998689_3293008_378-178650-0_sp1.1 from training. Duration: 0.863625 2023-10-02 23:03:02,481 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_47896_210_20508_1_1534492518_5958868_432-327451-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 23:03:03,840 WARNING [train.py:1204] (2/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_188-114464-0_sp1.1 from training. Duration: 0.7454375 2023-10-02 23:03:03,859 WARNING [train.py:1204] (2/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_798-106179-0_sp1.1 from training. Duration: 0.7545625 2023-10-02 23:03:03,902 WARNING [train.py:1204] (2/4) Exclude cut with ID _14368_210_4220_1_1530532714870_3595529_143-117421-0_sp1.1 from training. Duration: 0.52725 2023-10-02 23:03:05,255 WARNING [train.py:1204] (2/4) Exclude cut with ID _67499_210_17282_1_1535509805220_3692430_361-78786-0_sp1.1 from training. Duration: 0.963625 2023-10-02 23:03:05,298 WARNING [train.py:1204] (2/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_712-342563-0_sp1.1 from training. Duration: 0.8818125 2023-10-02 23:03:05,336 WARNING [train.py:1204] (2/4) Exclude cut with ID _9235_210_5219_1_1528891154253_8046120_387-159008-0 from training. Duration: 0.86 2023-10-02 23:03:09,097 WARNING [train.py:1204] (2/4) Exclude cut with ID _33640_210_2655_1_1533117605479_4855469_959-18820-0_sp1.1 from training. Duration: 0.963625 2023-10-02 23:03:10,323 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_12868_210_6753_1_1530702479_3610564_133-564-0_sp0.9 from training. Duration: 0.87775 2023-10-02 23:03:10,441 WARNING [train.py:1204] (2/4) Exclude cut with ID _23514_210_1389_1_1531832139785_3031439_12-29287-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 23:03:13,357 WARNING [train.py:1204] (2/4) Exclude cut with ID _23514_210_1389_1_1531832139785_3031439_489-185052-0_sp1.1 from training. Duration: 0.963625 2023-10-02 23:03:14,698 WARNING [train.py:1204] (2/4) Exclude cut with ID _5444_210_2151_1_1527418808464_5312210_634-194174-0 from training. Duration: 0.97 2023-10-02 23:03:14,716 WARNING [train.py:1204] (2/4) Exclude cut with ID _6275_210_2084_1_1528110062213_7669049_91-26919-0_sp1.1 from training. Duration: 0.8818125 2023-10-02 23:03:18,661 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_37351_210_7925_1_1533012931_3812113_516-82303-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 23:03:20,000 WARNING [train.py:1204] (2/4) Exclude cut with ID _41314_210_12651_1_1533459476543_3766460_80-74184-0_sp0.9 from training. Duration: 0.92225 2023-10-02 23:03:21,446 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_9460_210_4211_1_1528954587_10049667_495-202963-0_sp1.1 from training. Duration: 0.963625 2023-10-02 23:03:22,850 WARNING [train.py:1204] (2/4) Exclude cut with ID _14632_210_9988_1_1530691279392_3923200_332-105904-0_sp1.1 from training. Duration: 0.8181875 2023-10-02 23:03:24,151 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_40764_210_1925_1_1533882359_3752345_493-167886-0_sp1.1 from training. Duration: 0.92725 2023-10-02 23:03:26,117 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_196-289637-0 from training. Duration: 0.75 2023-10-02 23:03:26,260 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.1.balancer1.prob, batch_count=1050480.0, ans=0.125 2023-10-02 23:03:27,455 WARNING [train.py:1204] (2/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_26-175553-0 from training. Duration: 0.61 2023-10-02 23:03:27,501 WARNING [train.py:1204] (2/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_890-5117-0 from training. Duration: 0.78 2023-10-02 23:03:27,544 WARNING [train.py:1204] (2/4) Exclude cut with ID _41314_210_12651_1_1533459476543_3766460_206-306860-0_sp1.1 from training. Duration: 0.92725 2023-10-02 23:03:28,130 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.4.encoder.layers.0.feed_forward3.out_whiten, num_groups=1, num_channels=384, metric=13.69 vs. limit=15.0 2023-10-02 23:03:30,188 WARNING [train.py:1204] (2/4) Exclude cut with ID _25882_210_3036_1_1532775665815_6479510_570-79824-0_sp1.1 from training. Duration: 0.963625 2023-10-02 23:03:30,252 WARNING [train.py:1204] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_731-2428-0_sp1.1 from training. Duration: 0.7545625 2023-10-02 23:03:30,277 WARNING [train.py:1204] (2/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_138-154812-0_sp0.9 from training. Duration: 0.8666875 2023-10-02 23:03:32,991 WARNING [train.py:1204] (2/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_178-39722-0_sp0.9 from training. Duration: 0.5444375 2023-10-02 23:03:33,057 WARNING [train.py:1204] (2/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_1285-256622-0_sp0.9 from training. Duration: 0.9333125 2023-10-02 23:03:38,339 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_41428_210_2161_1_1533774184_3992532_65-271323-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 23:03:38,438 WARNING [train.py:1204] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_308-161067-0 from training. Duration: 0.66 2023-10-02 23:03:38,444 WARNING [train.py:1204] (2/4) Exclude cut with ID _3067_210_2667_1_1526212974640_4503168_459-173867-0 from training. Duration: 0.88 2023-10-02 23:03:38,445 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_2221_210_1988_1_1525597051_4935541_415-296882-0_sp1.1 from training. Duration: 0.7 2023-10-02 23:03:41,015 WARNING [train.py:1204] (2/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_354-192266-0_sp1.1 from training. Duration: 0.936375 2023-10-02 23:03:42,404 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_258-310570-0_sp0.9 from training. Duration: 0.911125 2023-10-02 23:03:43,804 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_548-141039-0_sp1.1 from training. Duration: 0.963625 2023-10-02 23:03:43,904 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder_embed.conv.5.prob, batch_count=1050613.3333333333, ans=0.125 2023-10-02 23:03:46,580 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_15101_210_7927_1_1530786415_5033274_320-327527-0 from training. Duration: 0.96 2023-10-02 23:03:46,637 WARNING [train.py:1204] (2/4) Exclude cut with ID _2895_210_2477_1_1525956785794_4063750_289-11475-0_sp0.9 from training. Duration: 0.911125 2023-10-02 23:03:47,844 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.503e+02 1.822e+02 2.025e+02 2.259e+02 3.457e+02, threshold=4.050e+02, percent-clipped=0.0 2023-10-02 23:03:48,033 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7543_210_6753_1_1528543318_4552_1-85072-0_sp1.1 from training. Duration: 0.7545625 2023-10-02 23:03:48,104 WARNING [train.py:1204] (2/4) Exclude cut with ID _64639_210_5816_1_1535367619437_3528000_461-329156-0 from training. Duration: 0.96 2023-10-02 23:03:51,263 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_211-339622-0 from training. Duration: 0.45 2023-10-02 23:03:54,536 WARNING [train.py:1204] (2/4) Exclude cut with ID _20455_210_5219_1_1531565996775_8098100_402-112739-0_sp1.1 from training. Duration: 0.963625 2023-10-02 23:03:56,067 WARNING [train.py:1204] (2/4) Exclude cut with ID _53489_210_6228_1_1534298357169_3835970_256-115565-0_sp1.1 from training. Duration: 0.936375 2023-10-02 23:03:56,091 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_26326_210_7925_1_1533002493_3592406_360-305260-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 23:03:57,362 WARNING [train.py:1204] (2/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_18-204383-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 23:03:58,746 INFO [train.py:1046] (2/4) Epoch 30, batch 3550, loss[loss=0.1616, simple_loss=0.2327, pruned_loss=0.04526, over 23850.00 frames. ], tot_loss[loss=0.1632, simple_loss=0.2403, pruned_loss=0.04306, over 4693882.40 frames. ], batch size: 195, lr: 3.40e-03, grad_scale: 8.0 2023-10-02 23:04:00,279 WARNING [train.py:1204] (2/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_306-232480-0_sp1.1 from training. Duration: 0.87275 2023-10-02 23:04:08,669 WARNING [train.py:1204] (2/4) Exclude cut with ID _41161_210_20034_1_1533431249657_3555314_116-299186-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 23:04:10,706 WARNING [train.py:1204] (2/4) Exclude cut with ID _13913_210_9994_1_1530579615187_3383897_473-277682-0_sp1.1 from training. Duration: 0.3545625 2023-10-02 23:04:13,919 WARNING [train.py:1204] (2/4) Exclude cut with ID _58895_210_8750_1_1535184081320_3649089_49-225702-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 23:04:13,992 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_3080_210_2071_1_1526086102_4365166_312-8726-0_sp1.1 from training. Duration: 0.82725 2023-10-02 23:04:15,383 WARNING [train.py:1204] (2/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_489-190175-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 23:04:16,645 WARNING [train.py:1204] (2/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_345-95717-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 23:04:16,663 WARNING [train.py:1204] (2/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_85-138257-0_sp1.1 from training. Duration: 0.6454375 2023-10-02 23:04:18,765 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.3.encoder.layers.1.conv_module1.whiten, num_groups=1, num_channels=512, metric=2.87 vs. limit=15.0 2023-10-02 23:04:19,564 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7046_210_6753_1_1527895337_6920917_783-278109-0_sp1.1 from training. Duration: 0.7 2023-10-02 23:04:19,592 WARNING [train.py:1204] (2/4) Exclude cut with ID _3465_210_3525_1_1526175667060_3574049_450-203637-0_sp1.1 from training. Duration: 0.836375 2023-10-02 23:04:20,947 WARNING [train.py:1204] (2/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_416-132893-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 23:04:20,970 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_2655_210_2087_1_1525688959_3897400_437-337690-0_sp0.9 from training. Duration: 0.7 2023-10-02 23:04:21,046 WARNING [train.py:1204] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_700-183549-0_sp1.1 from training. Duration: 0.6181875 2023-10-02 23:04:26,980 WARNING [train.py:1204] (2/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_191-59784-0_sp1.1 from training. Duration: 0.8 2023-10-02 23:04:26,998 WARNING [train.py:1204] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_218-295097-0_sp1.1 from training. Duration: 0.7 2023-10-02 23:04:29,700 WARNING [train.py:1204] (2/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_310-109461-0_sp1.1 from training. Duration: 0.72725 2023-10-02 23:04:29,704 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_33348_210_5632_1_1533086912_3324030_131-321149-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 23:04:29,756 WARNING [train.py:1204] (2/4) Exclude cut with ID _64135_210_2253_1_1535677267876_7608972_133-188266-0_sp1.1 from training. Duration: 0.736375 2023-10-02 23:04:29,781 WARNING [train.py:1204] (2/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_505-205270-0 from training. Duration: 0.38 2023-10-02 23:04:29,792 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_67223_210_4238_1_1535612120_2537431_102-88248-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 23:04:31,201 WARNING [train.py:1204] (2/4) Exclude cut with ID _54871_210_22356_1_1534665798253_3856359_60-235433-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 23:04:32,668 WARNING [train.py:1204] (2/4) Exclude cut with ID _5189_210_4557_1_1527248319428_3769229_199-53526-0_sp1.1 from training. Duration: 0.3818125 2023-10-02 23:04:38,227 WARNING [train.py:1204] (2/4) Exclude cut with ID _45329_210_7005_1_1534157951211_6774339_439-90979-0_sp1.1 from training. Duration: 0.963625 2023-10-02 23:04:38,290 WARNING [train.py:1204] (2/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_1162-132880-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 23:04:40,205 WARNING [train.py:1204] (2/4) Exclude cut with ID _56423_210_5854_1_1534896061049_3165969_220-22317-0_sp1.1 from training. Duration: 0.963625 2023-10-02 23:04:42,918 WARNING [train.py:1204] (2/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_909-64151-0 from training. Duration: 0.94 2023-10-02 23:04:42,979 WARNING [train.py:1204] (2/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_465-156500-0_sp0.9 from training. Duration: 0.8 2023-10-02 23:04:43,060 WARNING [train.py:1204] (2/4) Exclude cut with ID _44304_210_15374_1_1533639352765_4613129_302-22084-0 from training. Duration: 0.86 2023-10-02 23:04:44,896 WARNING [train.py:1204] (2/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_663-166973-0_sp1.1 from training. Duration: 0.72725 2023-10-02 23:04:47,668 WARNING [train.py:1204] (2/4) Exclude cut with ID _45329_210_7005_1_1534157951211_6774339_503-287883-0_sp0.9 from training. Duration: 0.97775 2023-10-02 23:04:47,683 WARNING [train.py:1204] (2/4) Exclude cut with ID _60057_210_10640_1_1534903065793_3333150_362-230377-0_sp0.9 from training. Duration: 0.9666875 2023-10-02 23:04:50,425 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_4183_210_2087_1_1526729100_1869838_143-280962-0 from training. Duration: 0.56 2023-10-02 23:04:50,504 WARNING [train.py:1204] (2/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_281-1313-0_sp1.1 from training. Duration: 0.936375 2023-10-02 23:04:56,938 WARNING [train.py:1204] (2/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_64-215118-0_sp1.1 from training. Duration: 0.936375 2023-10-02 23:04:58,352 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_2822_210_2715_1_1526112586_1999587_165-36656-0 from training. Duration: 0.89 2023-10-02 23:04:58,411 WARNING [train.py:1204] (2/4) Exclude cut with ID _22961_210_8254_1_1531976366672_4278180_402-208830-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 23:05:00,142 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.ff3_skip_rate, batch_count=1050946.6666666667, ans=0.0 2023-10-02 23:05:01,240 WARNING [train.py:1204] (2/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_98-193494-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 23:05:02,634 WARNING [train.py:1204] (2/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_383-285196-0 from training. Duration: 0.97 2023-10-02 23:05:05,658 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.conv_module2.balancer2.prob, batch_count=1050946.6666666667, ans=0.125 2023-10-02 23:05:08,001 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6767_210_3273_1_1527855783_1718932_142-127132-0 from training. Duration: 0.95 2023-10-02 23:05:08,039 WARNING [train.py:1204] (2/4) Exclude cut with ID _39867_210_9826_1_1533553222030_9344259_1018-339635-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 23:05:08,565 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.4.encoder.layers.1.nonlin_attention.whiten2, num_groups=1, num_channels=384, metric=3.19 vs. limit=15.0 2023-10-02 23:05:09,348 WARNING [train.py:1204] (2/4) Exclude cut with ID _14603_210_4220_1_1530615688441_3097319_274-274798-0_sp1.1 from training. Duration: 0.8909375 2023-10-02 23:05:11,144 WARNING [train.py:1204] (2/4) Exclude cut with ID _2334_210_2667_1_1525608238880_4705129_376-263393-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 23:05:12,530 INFO [train.py:1046] (2/4) Epoch 30, batch 3600, loss[loss=0.1666, simple_loss=0.2383, pruned_loss=0.04741, over 23482.00 frames. ], tot_loss[loss=0.1633, simple_loss=0.2407, pruned_loss=0.04289, over 4698575.24 frames. ], batch size: 285, lr: 3.40e-03, grad_scale: 16.0 2023-10-02 23:05:12,591 WARNING [train.py:1204] (2/4) Exclude cut with ID _29303_210_2575_1_1532392237303_7410649_363-344675-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 23:05:12,727 WARNING [train.py:1204] (2/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_58-68956-0_sp1.1 from training. Duration: 0.7545625 2023-10-02 23:05:17,258 WARNING [train.py:1204] (2/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_709-311571-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 23:05:17,376 WARNING [train.py:1204] (2/4) Exclude cut with ID _46067_210_4220_1_1534125571066_3307450_136-257407-0_sp1.1 from training. Duration: 0.963625 2023-10-02 23:05:19,223 WARNING [train.py:1204] (2/4) Exclude cut with ID _6493_210_1747_1_1528459210424_4012470_324-323854-0_sp1.1 from training. Duration: 0.836375 2023-10-02 23:05:19,283 WARNING [train.py:1204] (2/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_234-260682-0_sp1.1 from training. Duration: 0.763625 2023-10-02 23:05:20,554 WARNING [train.py:1204] (2/4) Exclude cut with ID _36634_210_1794_1_1533296352450_1399884_55-119451-0_sp1.1 from training. Duration: 0.963625 2023-10-02 23:05:20,557 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_40047_210_2290_1_1533290519_2374311_195-165693-0 from training. Duration: 0.89 2023-10-02 23:05:22,218 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_5522_210_3273_1_1527249606_3909096_366-172508-0_sp0.9 from training. Duration: 0.7333125 2023-10-02 23:05:24,032 WARNING [train.py:1204] (2/4) Exclude cut with ID _21878_210_3525_1_1531738772002_7264330_187-10504-0_sp1.1 from training. Duration: 0.963625 2023-10-02 23:05:26,766 WARNING [train.py:1204] (2/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_282-257791-0_sp1.1 from training. Duration: 0.7818125 2023-10-02 23:05:29,493 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_43831_210_2158_1_1533725502_5872791_467-288776-0_sp1.1 from training. Duration: 0.92725 2023-10-02 23:05:30,155 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.3.encoder.layers.1.self_attn2.whiten, num_groups=1, num_channels=512, metric=14.78 vs. limit=22.5 2023-10-02 23:05:30,850 WARNING [train.py:1204] (2/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_138-154812-0_sp1.1 from training. Duration: 0.7090625 2023-10-02 23:05:30,896 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_1154-185930-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 23:05:30,913 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_2655_210_2087_1_1525688959_3897400_52-279899-0 from training. Duration: 0.69 2023-10-02 23:05:32,290 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_141-89113-0_sp1.1 from training. Duration: 0.7818125 2023-10-02 23:05:33,752 WARNING [train.py:1204] (2/4) Exclude cut with ID _55134_210_21982_1_1534590028377_3882170_297-47310-0_sp1.1 from training. Duration: 0.963625 2023-10-02 23:05:33,989 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.ff3_skip_rate, batch_count=1051080.0, ans=0.0 2023-10-02 23:05:35,143 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_486-151131-0_sp1.1 from training. Duration: 0.77275 2023-10-02 23:05:36,555 WARNING [train.py:1204] (2/4) Exclude cut with ID _44778_210_2054_1_1533798127203_7443090_21-232980-0_sp1.1 from training. Duration: 0.936375 2023-10-02 23:05:39,682 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6165_210_3528_1_1527590982_4379841_338-106380-0_sp1.1 from training. Duration: 0.92725 2023-10-02 23:05:41,051 WARNING [train.py:1204] (2/4) Exclude cut with ID _55162_210_17071_1_1534408248583_3753480_355-257218-0_sp1.1 from training. Duration: 0.97275 2023-10-02 23:05:42,456 WARNING [train.py:1204] (2/4) Exclude cut with ID _2895_210_2477_1_1525956785794_4063750_289-11475-0 from training. Duration: 0.82 2023-10-02 23:05:42,812 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.nonlin_attention.balancer.prob, batch_count=1051146.6666666667, ans=0.125 2023-10-02 23:05:49,071 WARNING [train.py:1204] (2/4) Exclude cut with ID _16756_210_4915_1_1531047803763_7104259_839-183431-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 23:05:51,669 WARNING [train.py:1204] (2/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_427-208901-0_sp0.9 from training. Duration: 0.7555625 2023-10-02 23:05:53,541 WARNING [train.py:1204] (2/4) Exclude cut with ID _36512_210_6987_1_1533033143998_3126069_181-306500-0 from training. Duration: 0.95 2023-10-02 23:05:56,319 WARNING [train.py:1204] (2/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_28-70987-0_sp1.1 from training. Duration: 0.7909375 2023-10-02 23:05:56,455 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.0.self_attn_weights.pos_emb_skip_rate, batch_count=1051213.3333333333, ans=0.0 2023-10-02 23:05:57,094 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.2.encoder.layers.1.nonlin_attention.whiten1, num_groups=1, num_channels=288, metric=5.39 vs. limit=10.0 2023-10-02 23:05:59,362 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.0.balancer2.prob, batch_count=1051213.3333333333, ans=0.125 2023-10-02 23:06:00,582 WARNING [train.py:1204] (2/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_590-58996-0_sp1.1 from training. Duration: 0.936375 2023-10-02 23:06:01,062 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.5.encoder.layers.0.whiten, num_groups=1, num_channels=256, metric=2.43 vs. limit=12.0 2023-10-02 23:06:02,156 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.feed_forward2.hidden_balancer.prob, batch_count=1051213.3333333333, ans=0.125 2023-10-02 23:06:03,344 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_48730_210_5399_1_1533885886_2212769_108-47561-0_sp1.1 from training. Duration: 0.936375 2023-10-02 23:06:08,821 WARNING [train.py:1204] (2/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_401-158239-0_sp0.9 from training. Duration: 0.788875 2023-10-02 23:06:08,834 WARNING [train.py:1204] (2/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_278-154492-0_sp1.1 from training. Duration: 0.6909375 2023-10-02 23:06:08,839 WARNING [train.py:1204] (2/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_137-116442-0 from training. Duration: 0.78 2023-10-02 23:06:10,467 WARNING [train.py:1204] (2/4) Exclude cut with ID _3541_210_2477_1_1526475408709_3263973_189-317158-0 from training. Duration: 0.9 2023-10-02 23:06:10,688 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.nonlin_attention.balancer.prob, batch_count=1051280.0, ans=0.125 2023-10-02 23:06:11,308 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.nonlin_attention.balancer.prob, batch_count=1051280.0, ans=0.125 2023-10-02 23:06:12,300 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_47896_210_20508_1_1534492518_5958868_558-314572-0 from training. Duration: 0.97 2023-10-02 23:06:13,727 WARNING [train.py:1204] (2/4) Exclude cut with ID _66070_210_7640_1_1535441393096_7805310_290-40284-0_sp1.1 from training. Duration: 0.97275 2023-10-02 23:06:15,488 WARNING [train.py:1204] (2/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_110-224688-0_sp1.1 from training. Duration: 0.8818125 2023-10-02 23:06:16,665 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.604e+02 1.870e+02 2.077e+02 2.507e+02 3.555e+02, threshold=4.155e+02, percent-clipped=0.0 2023-10-02 23:06:16,774 WARNING [train.py:1204] (2/4) Exclude cut with ID _7719_210_3525_1_1528456153163_4296960_88-250259-0 from training. Duration: 0.84 2023-10-02 23:06:16,819 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8453_210_4211_1_1528502940_8562730_1157-114102-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 23:06:18,715 WARNING [train.py:1204] (2/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_317-175062-0_sp1.1 from training. Duration: 0.7454375 2023-10-02 23:06:18,729 WARNING [train.py:1204] (2/4) Exclude cut with ID _29857_210_20034_1_1532395886753_3798160_254-347254-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 23:06:20,056 WARNING [train.py:1204] (2/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_28-70987-0 from training. Duration: 0.87 2023-10-02 23:06:21,386 WARNING [train.py:1204] (2/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_321-335475-0 from training. Duration: 0.45 2023-10-02 23:06:21,601 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.conv_module2.balancer1.prob, batch_count=1051280.0, ans=0.125 2023-10-02 23:06:25,574 WARNING [train.py:1204] (2/4) Exclude cut with ID _40901_210_5665_1_1533363789958_3856130_356-107330-0_sp1.1 from training. Duration: 0.936375 2023-10-02 23:06:25,642 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_3780_210_2087_1_1526467389_2145572_176-107140-0 from training. Duration: 0.69 2023-10-02 23:06:27,010 INFO [train.py:1046] (2/4) Epoch 30, batch 3650, loss[loss=0.148, simple_loss=0.2396, pruned_loss=0.02816, over 24657.00 frames. ], tot_loss[loss=0.163, simple_loss=0.2411, pruned_loss=0.04252, over 4709876.34 frames. ], batch size: 68, lr: 3.40e-03, grad_scale: 16.0 2023-10-02 23:06:30,635 WARNING [train.py:1204] (2/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_80-135200-0 from training. Duration: 0.79 2023-10-02 23:06:32,034 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_33348_210_5632_1_1533086912_3324030_85-28583-0_sp0.9 from training. Duration: 0.911125 2023-10-02 23:06:34,897 WARNING [train.py:1204] (2/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_29-293896-0 from training. Duration: 0.61 2023-10-02 23:06:36,404 WARNING [train.py:1204] (2/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_194-252800-0 from training. Duration: 0.97 2023-10-02 23:06:39,268 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_360-52446-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 23:06:39,270 WARNING [train.py:1204] (2/4) Exclude cut with ID _9389_210_1664_1_1529217337301_5289269_289-213417-0_sp0.9 from training. Duration: 0.988875 2023-10-02 23:06:39,315 WARNING [train.py:1204] (2/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_143-86344-0_sp0.9 from training. Duration: 0.8555625 2023-10-02 23:06:39,488 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.conv_skip_rate, batch_count=1051346.6666666667, ans=0.0 2023-10-02 23:06:42,144 WARNING [train.py:1204] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_520-126729-0_sp0.9 from training. Duration: 0.77775 2023-10-02 23:06:42,171 WARNING [train.py:1204] (2/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_76-308663-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 23:06:42,390 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.balancer1.prob, batch_count=1051413.3333333333, ans=0.125 2023-10-02 23:06:44,161 WARNING [train.py:1204] (2/4) Exclude cut with ID _8021_210_7994_1_1528376451358_3653069_100-140231-0 from training. Duration: 0.67 2023-10-02 23:06:44,229 WARNING [train.py:1204] (2/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_387-131538-0_sp0.9 from training. Duration: 0.9 2023-10-02 23:06:45,302 WARNING [train.py:1204] (2/4) Exclude cut with ID _6275_210_2084_1_1528110062213_7669049_365-182054-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 23:06:45,353 WARNING [train.py:1204] (2/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_345-340690-0 from training. Duration: 0.93 2023-10-02 23:06:45,582 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.bypass.scale_min, batch_count=1051413.3333333333, ans=0.2 2023-10-02 23:06:47,121 WARNING [train.py:1204] (2/4) Exclude cut with ID _5409_210_1388_1_1527292850189_5865191_178-127970-0_sp0.9 from training. Duration: 0.6333125 2023-10-02 23:06:47,175 WARNING [train.py:1204] (2/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_352-12734-0_sp1.1 from training. Duration: 0.92725 2023-10-02 23:06:47,179 WARNING [train.py:1204] (2/4) Exclude cut with ID _65134_210_14763_1_1535335393985_3505368_121-129373-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 23:06:47,400 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.feed_forward3.hidden_balancer.prob, batch_count=1051413.3333333333, ans=0.125 2023-10-02 23:06:47,852 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.2.encoder.layers.2.self_attn1.whiten, num_groups=1, num_channels=384, metric=18.43 vs. limit=22.5 2023-10-02 23:06:50,359 WARNING [train.py:1204] (2/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_557-324258-0_sp1.1 from training. Duration: 0.736375 2023-10-02 23:06:50,599 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.ff2_skip_rate, batch_count=1051413.3333333333, ans=0.0 2023-10-02 23:06:51,812 WARNING [train.py:1204] (2/4) Exclude cut with ID _48678_210_5663_1_1533988830947_3769199_306-255886-0 from training. Duration: 0.77 2023-10-02 23:06:53,146 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7544_210_6753_1_1528587000_691468_87-178599-0 from training. Duration: 0.87 2023-10-02 23:06:54,509 WARNING [train.py:1204] (2/4) Exclude cut with ID _2941_210_2577_1_1526199673150_2489759_208-236402-0_sp1.1 from training. Duration: 0.963625 2023-10-02 23:06:55,945 WARNING [train.py:1204] (2/4) Exclude cut with ID _38670_210_15379_1_1533448810659_4391059_65-169397-0 from training. Duration: 0.97 2023-10-02 23:06:59,214 WARNING [train.py:1204] (2/4) Exclude cut with ID _30304_210_6319_1_1532476827261_7582060_306-328175-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 23:06:59,224 WARNING [train.py:1204] (2/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_212-149947-0_sp1.1 from training. Duration: 0.663625 2023-10-02 23:07:02,572 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.3.encoder.layers.1.self_attn2.whiten, num_groups=1, num_channels=512, metric=13.61 vs. limit=22.5 2023-10-02 23:07:04,783 WARNING [train.py:1204] (2/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_321-22821-0_sp1.1 from training. Duration: 0.8090625 2023-10-02 23:07:06,273 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.1.balancer1.prob, batch_count=1051480.0, ans=0.125 2023-10-02 23:07:07,481 WARNING [train.py:1204] (2/4) Exclude cut with ID _44010_210_4525_1_1533884262964_4013620_483-69110-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 23:07:07,497 WARNING [train.py:1204] (2/4) Exclude cut with ID _14922_210_6228_1_1530707435273_3693310_38-187851-0_sp1.1 from training. Duration: 0.563625 2023-10-02 23:07:08,889 WARNING [train.py:1204] (2/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_129-337802-0_sp0.9 from training. Duration: 0.8 2023-10-02 23:07:10,284 WARNING [train.py:1204] (2/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_268-87909-0_sp1.1 from training. Duration: 0.9 2023-10-02 23:07:11,739 WARNING [train.py:1204] (2/4) Exclude cut with ID _21878_210_3525_1_1531738772002_7264330_559-184862-0_sp1.1 from training. Duration: 0.8545625 2023-10-02 23:07:14,551 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_46032_210_14656_1_1533781412_4507969_237-120766-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 23:07:15,939 WARNING [train.py:1204] (2/4) Exclude cut with ID _28427_210_4220_1_1532581240967_3061069_330-170906-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 23:07:15,945 WARNING [train.py:1204] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_401-51578-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 23:07:17,902 WARNING [train.py:1204] (2/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_209-205365-0_sp1.1 from training. Duration: 0.7181875 2023-10-02 23:07:19,309 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_23801_210_16223_1_1532066392_3294657_57-242170-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 23:07:19,541 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.feed_forward3.hidden_balancer.prob, batch_count=1051546.6666666667, ans=0.125 2023-10-02 23:07:21,263 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_127-37371-0_sp1.1 from training. Duration: 0.936375 2023-10-02 23:07:25,495 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9171W0019-578905 from training. Duration: 0.896 2023-10-02 23:07:25,738 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.feed_forward1.hidden_balancer.prob, batch_count=1051613.3333333333, ans=0.125 2023-10-02 23:07:28,232 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_24605_210_1794_1_1532047863_4149755_349-49056-0_sp1.1 from training. Duration: 0.92725 2023-10-02 23:07:28,250 WARNING [train.py:1204] (2/4) Exclude cut with ID _12302_210_5219_1_1529924446466_8107280_897-21237-0_sp1.1 from training. Duration: 0.936375 2023-10-02 23:07:28,377 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.1.conv_skip_rate, batch_count=1051613.3333333333, ans=0.0 2023-10-02 23:07:29,629 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_9460_210_4211_1_1528954587_10049667_480-260269-0_sp0.9 from training. Duration: 0.62225 2023-10-02 23:07:31,576 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_348-152548-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 23:07:32,892 WARNING [train.py:1204] (2/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_301-330455-0_sp0.9 from training. Duration: 0.888875 2023-10-02 23:07:34,321 WARNING [train.py:1204] (2/4) Exclude cut with ID _61008_210_10633_1_1534852769198_6974370_345-91705-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 23:07:35,660 WARNING [train.py:1204] (2/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_304-273433-0 from training. Duration: 0.99 2023-10-02 23:07:35,663 WARNING [train.py:1204] (2/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_359-345972-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 23:07:38,463 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_2296_210_2087_1_1525503448_3397335_57-185430-0_sp1.1 from training. Duration: 0.5454375 2023-10-02 23:07:39,817 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_1256-120974-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 23:07:41,128 INFO [train.py:1046] (2/4) Epoch 30, batch 3700, loss[loss=0.1545, simple_loss=0.2317, pruned_loss=0.03864, over 24431.00 frames. ], tot_loss[loss=0.1633, simple_loss=0.242, pruned_loss=0.04236, over 4725567.51 frames. ], batch size: 58, lr: 3.39e-03, grad_scale: 16.0 2023-10-02 23:07:41,165 WARNING [train.py:1204] (2/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_372-30302-0_sp0.9 from training. Duration: 0.9555625 2023-10-02 23:07:42,573 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_42012_210_7927_1_1533439936_3871556_451-344262-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 23:07:42,574 WARNING [train.py:1204] (2/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_28-324729-0 from training. Duration: 0.94 2023-10-02 23:07:42,589 WARNING [train.py:1204] (2/4) Exclude cut with ID _64135_210_2253_1_1535677267876_7608972_313-18394-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 23:07:42,707 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.0.self_attn_weights.pos_emb_skip_rate, batch_count=1051680.0, ans=0.0 2023-10-02 23:07:43,900 WARNING [train.py:1204] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_620-276793-0_sp0.9 from training. Duration: 0.6555625 2023-10-02 23:07:43,927 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7544_210_6753_1_1528591327_1471426_172-348777-0_sp0.9 from training. Duration: 0.7333125 2023-10-02 23:07:47,932 WARNING [train.py:1204] (2/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_332-221057-0_sp0.9 from training. Duration: 0.7555625 2023-10-02 23:07:51,830 WARNING [train.py:1204] (2/4) Exclude cut with ID _19023_210_4353_1_1531378892552_5820780_5-300035-0_sp1.1 from training. Duration: 0.92725 2023-10-02 23:07:51,879 WARNING [train.py:1204] (2/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_343-309023-0_sp1.1 from training. Duration: 0.963625 2023-10-02 23:07:51,953 WARNING [train.py:1204] (2/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_37-41919-0_sp1.1 from training. Duration: 0.7909375 2023-10-02 23:07:52,238 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.bypass_mid.scale_min, batch_count=1051680.0, ans=0.2 2023-10-02 23:07:53,786 WARNING [train.py:1204] (2/4) Exclude cut with ID _3541_210_2477_1_1526475408709_3263973_41-182910-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 23:07:53,854 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_229-45300-0_sp0.9 from training. Duration: 0.6333125 2023-10-02 23:07:55,315 WARNING [train.py:1204] (2/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_40-214716-0_sp1.1 from training. Duration: 0.963625 2023-10-02 23:07:58,089 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9309W0203-640330 from training. Duration: 0.896 2023-10-02 23:08:00,864 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.conv_module2.balancer1.prob, batch_count=1051746.6666666667, ans=0.125 2023-10-02 23:08:04,041 WARNING [train.py:1204] (2/4) Exclude cut with ID _60229_210_27767_1_1534838406158_3655350_264-279573-0_sp1.1 from training. Duration: 0.7545625 2023-10-02 23:08:04,071 WARNING [train.py:1204] (2/4) Exclude cut with ID _8394_210_6702_1_1528590504316_3540189_372-198878-0_sp0.9 from training. Duration: 0.6666875 2023-10-02 23:08:05,418 WARNING [train.py:1204] (2/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_359-118926-0_sp1.1 from training. Duration: 0.6545625 2023-10-02 23:08:06,706 WARNING [train.py:1204] (2/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_85-65056-0 from training. Duration: 0.96 2023-10-02 23:08:06,739 WARNING [train.py:1204] (2/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_89-242335-0_sp1.1 from training. Duration: 0.863625 2023-10-02 23:08:09,653 WARNING [train.py:1204] (2/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_128-49444-0_sp1.1 from training. Duration: 0.936375 2023-10-02 23:08:09,903 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.self_attn_weights.pos_emb_skip_rate, batch_count=1051813.3333333333, ans=0.0 2023-10-02 23:08:10,859 WARNING [train.py:1204] (2/4) Exclude cut with ID _30327_210_16049_1_1532484161295_3924169_401-70819-0 from training. Duration: 0.86 2023-10-02 23:08:12,318 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_41822_210_13270_1_1533955976_4379284_260-256391-0_sp1.1 from training. Duration: 0.936375 2023-10-02 23:08:13,679 WARNING [train.py:1204] (2/4) Exclude cut with ID _67335_210_5219_1_1535525975652_4275229_599-247317-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 23:08:16,454 WARNING [train.py:1204] (2/4) Exclude cut with ID _29857_210_20034_1_1532395886753_3798160_209-239267-0_sp1.1 from training. Duration: 0.936375 2023-10-02 23:08:16,481 WARNING [train.py:1204] (2/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_46-207599-0_sp1.1 from training. Duration: 0.5818125 2023-10-02 23:08:18,487 WARNING [train.py:1204] (2/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_912-282412-0_sp1.1 from training. Duration: 0.4454375 2023-10-02 23:08:23,881 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_4183_210_2087_1_1526729100_1869838_141-166600-0_sp1.1 from training. Duration: 0.863625 2023-10-02 23:08:23,885 WARNING [train.py:1204] (2/4) Exclude cut with ID _15037_210_2253_1_1530752294104_3267650_296-192597-0 from training. Duration: 0.81 2023-10-02 23:08:23,933 WARNING [train.py:1204] (2/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_571-220562-0_sp1.1 from training. Duration: 0.963625 2023-10-02 23:08:25,231 WARNING [train.py:1204] (2/4) Exclude cut with ID _5287_210_2084_1_1527418883040_3878670_12-221186-0 from training. Duration: 0.98 2023-10-02 23:08:28,176 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.bypass.skip_rate, batch_count=1051880.0, ans=0.07 2023-10-02 23:08:29,283 WARNING [train.py:1204] (2/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_149-85602-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 23:08:30,576 WARNING [train.py:1204] (2/4) Exclude cut with ID _9235_210_5219_1_1528891154253_8046120_1003-101638-0_sp1.1 from training. Duration: 0.663625 2023-10-02 23:08:32,132 WARNING [train.py:1204] (2/4) Exclude cut with ID _55844_210_2346_1_1534471212569_7625707_1099-33689-0_sp1.1 from training. Duration: 0.92725 2023-10-02 23:08:33,864 WARNING [train.py:1204] (2/4) Exclude cut with ID _7606_210_4915_1_1528894253277_5080690_301-10732-0 from training. Duration: 0.81 2023-10-02 23:08:34,007 WARNING [train.py:1204] (2/4) Exclude cut with ID _22826_210_9994_1_1531908201522_3279169_403-34698-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 23:08:35,216 WARNING [train.py:1204] (2/4) Exclude cut with ID _8480_210_1874_1_1528589391205_6728330_755-112625-0_sp0.9 from training. Duration: 0.82225 2023-10-02 23:08:35,237 WARNING [train.py:1204] (2/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_527-238990-0_sp1.1 from training. Duration: 0.8181875 2023-10-02 23:08:35,266 WARNING [train.py:1204] (2/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_426-7328-0_sp1.1 from training. Duration: 0.92725 2023-10-02 23:08:38,170 WARNING [train.py:1204] (2/4) Exclude cut with ID _3541_210_2477_1_1526475408709_3263973_189-317158-0_sp1.1 from training. Duration: 0.8181875 2023-10-02 23:08:39,452 WARNING [train.py:1204] (2/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_162-103620-0 from training. Duration: 0.93 2023-10-02 23:08:39,555 WARNING [train.py:1204] (2/4) Exclude cut with ID _22083_210_8254_1_1531702862439_3678970_49-234546-0 from training. Duration: 0.83 2023-10-02 23:08:40,835 WARNING [train.py:1204] (2/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_370-331600-0_sp1.1 from training. Duration: 0.8545625 2023-10-02 23:08:40,847 WARNING [train.py:1204] (2/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_166-59042-0_sp1.1 from training. Duration: 0.97275 2023-10-02 23:08:42,251 WARNING [train.py:1204] (2/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_138-332773-0_sp1.1 from training. Duration: 0.736375 2023-10-02 23:08:42,314 WARNING [train.py:1204] (2/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_90-30673-0_sp0.9 from training. Duration: 0.9333125 2023-10-02 23:08:44,986 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.505e+02 1.850e+02 2.059e+02 2.330e+02 3.629e+02, threshold=4.118e+02, percent-clipped=0.0 2023-10-02 23:08:45,682 WARNING [train.py:1204] (2/4) Exclude cut with ID _27967_210_12140_1_1532588391859_8419130_737-41898-0_sp1.1 from training. Duration: 0.936375 2023-10-02 23:08:46,973 WARNING [train.py:1204] (2/4) Exclude cut with ID _8833_210_5219_1_1528592665210_7553059_329-125156-0_sp1.1 from training. Duration: 0.7454375 2023-10-02 23:08:47,079 WARNING [train.py:1204] (2/4) Exclude cut with ID _41817_210_18835_1_1533434477160_7423049_352-42537-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 23:08:50,331 WARNING [train.py:1204] (2/4) Exclude cut with ID _8365_210_2064_1_1528592311070_4677690_431-294102-0 from training. Duration: 0.89 2023-10-02 23:08:50,452 WARNING [train.py:1204] (2/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_454-314901-0_sp1.1 from training. Duration: 0.4181875 2023-10-02 23:08:53,895 WARNING [train.py:1204] (2/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_80-135200-0_sp0.9 from training. Duration: 0.87775 2023-10-02 23:08:53,945 WARNING [train.py:1204] (2/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_181-78632-0 from training. Duration: 0.78 2023-10-02 23:08:55,224 WARNING [train.py:1204] (2/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_300-27235-0_sp0.9 from training. Duration: 0.9666875 2023-10-02 23:08:56,508 INFO [train.py:1046] (2/4) Epoch 30, batch 3750, loss[loss=0.1515, simple_loss=0.2441, pruned_loss=0.02945, over 24669.00 frames. ], tot_loss[loss=0.165, simple_loss=0.2437, pruned_loss=0.0432, over 4713532.52 frames. ], batch size: 68, lr: 3.39e-03, grad_scale: 16.0 2023-10-02 23:08:56,585 WARNING [train.py:1204] (2/4) Exclude cut with ID _47790_210_16187_1_1533862814066_4207519_96-177176-0_sp1.1 from training. Duration: 0.97275 2023-10-02 23:08:58,145 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.attention_skip_rate, batch_count=1052013.3333333333, ans=0.0 2023-10-02 23:08:59,178 WARNING [train.py:1204] (2/4) Exclude cut with ID _35941_210_15379_1_1532995294515_4023089_246-348742-0_sp1.1 from training. Duration: 0.97275 2023-10-02 23:09:00,533 WARNING [train.py:1204] (2/4) Exclude cut with ID _38670_210_15379_1_1533448810659_4391059_65-169397-0_sp1.1 from training. Duration: 0.8818125 2023-10-02 23:09:00,762 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.attention_skip_rate, batch_count=1052013.3333333333, ans=0.0 2023-10-02 23:09:03,961 WARNING [train.py:1204] (2/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_1003-99642-0_sp1.1 from training. Duration: 0.963625 2023-10-02 23:09:08,007 WARNING [train.py:1204] (2/4) Exclude cut with ID _40402_210_18219_1_1533522324115_2953150_320-14078-0_sp1.1 from training. Duration: 0.863625 2023-10-02 23:09:08,749 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.2.encoder.layers.2.self_attn2.whiten, num_groups=1, num_channels=384, metric=17.86 vs. limit=22.5 2023-10-02 23:09:09,427 WARNING [train.py:1204] (2/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_367-61330-0_sp0.9 from training. Duration: 0.8333125 2023-10-02 23:09:10,961 WARNING [train.py:1204] (2/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_515-298514-0_sp1.1 from training. Duration: 0.92725 2023-10-02 23:09:15,051 WARNING [train.py:1204] (2/4) Exclude cut with ID _53731_210_7994_1_1534553979244_3757214_471-320815-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 23:09:16,362 WARNING [train.py:1204] (2/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_306-232480-0 from training. Duration: 0.96 2023-10-02 23:09:17,673 WARNING [train.py:1204] (2/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_478-162247-0_sp0.9 from training. Duration: 0.911125 2023-10-02 23:09:19,519 WARNING [train.py:1204] (2/4) Exclude cut with ID _56423_210_5854_1_1534896061049_3165969_345-284696-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 23:09:19,546 WARNING [train.py:1204] (2/4) Exclude cut with ID _44778_210_2054_1_1533798127203_7443090_600-122253-0_sp1.1 from training. Duration: 0.963625 2023-10-02 23:09:21,626 WARNING [train.py:1204] (2/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_199-295611-0 from training. Duration: 0.55 2023-10-02 23:09:25,862 WARNING [train.py:1204] (2/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_734-129761-0 from training. Duration: 0.82 2023-10-02 23:09:28,987 WARNING [train.py:1204] (2/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_349-169257-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 23:09:29,032 WARNING [train.py:1204] (2/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_202-67017-0_sp0.9 from training. Duration: 0.911125 2023-10-02 23:09:31,856 WARNING [train.py:1204] (2/4) Exclude cut with ID _16908_210_5724_1_1531119157248_3772700_61-258978-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 23:09:37,229 WARNING [train.py:1204] (2/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_172-156931-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 23:09:39,200 WARNING [train.py:1204] (2/4) Exclude cut with ID _4092_210_2151_1_1526643044449_3135759_356-278694-0_sp1.1 from training. Duration: 0.42725 2023-10-02 23:09:40,776 WARNING [train.py:1204] (2/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_116-332008-0 from training. Duration: 0.98 2023-10-02 23:09:43,615 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.attention_skip_rate, batch_count=1052213.3333333333, ans=0.0 2023-10-02 23:09:44,850 WARNING [train.py:1204] (2/4) Exclude cut with ID _29003_210_19596_1_1532507391720_3894098_358-114789-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 23:09:46,370 WARNING [train.py:1204] (2/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_152-212533-0_sp1.1 from training. Duration: 0.8818125 2023-10-02 23:09:47,775 WARNING [train.py:1204] (2/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_267-214116-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 23:09:49,816 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7543_210_6753_1_1528541907_1399322_165-323619-0_sp0.9 from training. Duration: 0.7666875 2023-10-02 23:09:51,451 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.bypass_mid.scale_min, batch_count=1052213.3333333333, ans=0.2 2023-10-02 23:09:55,661 WARNING [train.py:1204] (2/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_577-332600-0_sp0.9 from training. Duration: 0.6333125 2023-10-02 23:09:56,990 WARNING [train.py:1204] (2/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_342-100157-0_sp0.9 from training. Duration: 0.6 2023-10-02 23:09:58,425 WARNING [train.py:1204] (2/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_141-89727-0_sp1.1 from training. Duration: 0.6454375 2023-10-02 23:09:59,821 WARNING [train.py:1204] (2/4) Exclude cut with ID _3992_210_1747_1_1526644816031_3731060_107-251434-0_sp1.1 from training. Duration: 0.9 2023-10-02 23:10:01,867 WARNING [train.py:1204] (2/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_401-53423-0_sp1.1 from training. Duration: 0.5 2023-10-02 23:10:09,129 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7762_210_1794_1_1528199508_4057408_422-227034-0_sp1.1 from training. Duration: 0.663625 2023-10-02 23:10:10,384 INFO [train.py:1046] (2/4) Epoch 30, batch 3800, loss[loss=0.1677, simple_loss=0.2477, pruned_loss=0.04382, over 23235.00 frames. ], tot_loss[loss=0.1642, simple_loss=0.2428, pruned_loss=0.04281, over 4712807.82 frames. ], batch size: 105, lr: 3.39e-03, grad_scale: 16.0 2023-10-02 23:10:13,171 WARNING [train.py:1204] (2/4) Exclude cut with ID _4184_210_2550_1_1526730192013_2219380_102-250299-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 23:10:13,256 WARNING [train.py:1204] (2/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_253-308377-0_sp0.9 from training. Duration: 0.5444375 2023-10-02 23:10:14,573 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_271-297505-0 from training. Duration: 0.99 2023-10-02 23:10:14,682 WARNING [train.py:1204] (2/4) Exclude cut with ID _35941_210_15379_1_1532995294515_4023089_213-142973-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 23:10:17,416 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7544_210_6753_1_1528587722_3576686_162-63018-0_sp1.1 from training. Duration: 0.92725 2023-10-02 23:10:20,371 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_365-217119-0_sp0.9 from training. Duration: 0.7 2023-10-02 23:10:22,016 WARNING [train.py:1204] (2/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_662-275621-0_sp0.9 from training. Duration: 0.4666875 2023-10-02 23:10:22,018 WARNING [train.py:1204] (2/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_374-187555-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 23:10:22,099 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_289-180904-0_sp1.1 from training. Duration: 0.6909375 2023-10-02 23:10:22,267 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.attention_skip_rate, batch_count=1052346.6666666667, ans=0.0 2023-10-02 23:10:25,338 WARNING [train.py:1204] (2/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_189-47206-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 23:10:25,380 WARNING [train.py:1204] (2/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_234-14174-0_sp1.1 from training. Duration: 0.7454375 2023-10-02 23:10:26,699 WARNING [train.py:1204] (2/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_576-76621-0_sp1.1 from training. Duration: 0.963625 2023-10-02 23:10:27,076 INFO [scaling.py:1118] (2/4) WithLoss: name=encoder.encoders.4.encoder.layers.1.self_attn_weights, loss-sum=0.000e+00 2023-10-02 23:10:28,158 WARNING [train.py:1204] (2/4) Exclude cut with ID _13253_210_1925_1_1530757775819_3577873_253-246386-0 from training. Duration: 0.9 2023-10-02 23:10:29,688 WARNING [train.py:1204] (2/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_372-6869-0_sp1.1 from training. Duration: 0.4 2023-10-02 23:10:31,023 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_21434_210_4238_1_1531916339_1222252_22-54954-0_sp1.1 from training. Duration: 0.936375 2023-10-02 23:10:31,302 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.feed_forward3.hidden_balancer.prob, batch_count=1052413.3333333333, ans=0.125 2023-10-02 23:10:34,346 WARNING [train.py:1204] (2/4) Exclude cut with ID _29853_210_7497_1_1532505600851_6642110_492-170361-0_sp1.1 from training. Duration: 0.92725 2023-10-02 23:10:35,779 WARNING [train.py:1204] (2/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_505-97987-0_sp0.9 from training. Duration: 0.9555625 2023-10-02 23:10:37,554 WARNING [train.py:1204] (2/4) Exclude cut with ID _8365_210_2064_1_1528592311070_4677690_371-122295-0_sp0.9 from training. Duration: 0.611125 2023-10-02 23:10:38,938 WARNING [train.py:1204] (2/4) Exclude cut with ID _14268_210_9206_1_1530622771285_4714099_637-86630-0_sp1.1 from training. Duration: 0.463625 2023-10-02 23:10:38,949 WARNING [train.py:1204] (2/4) Exclude cut with ID _29857_210_20034_1_1532395886753_3798160_372-5678-0_sp1.1 from training. Duration: 0.963625 2023-10-02 23:10:40,448 WARNING [train.py:1204] (2/4) Exclude cut with ID _52892_210_6721_1_1534378941259_4124129_90-165108-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 23:10:41,898 WARNING [train.py:1204] (2/4) Exclude cut with ID _27052_210_5986_1_1532329197187_3585009_143-339198-0_sp1.1 from training. Duration: 0.963625 2023-10-02 23:10:47,366 WARNING [train.py:1204] (2/4) Exclude cut with ID _14268_210_9206_1_1530622771285_4714099_637-86630-0_sp0.9 from training. Duration: 0.5666875 2023-10-02 23:10:47,368 WARNING [train.py:1204] (2/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_401-255491-0 from training. Duration: 0.7 2023-10-02 23:10:50,112 WARNING [train.py:1204] (2/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_178-6946-0_sp1.1 from training. Duration: 0.97275 2023-10-02 23:10:56,860 WARNING [train.py:1204] (2/4) Exclude cut with ID _27485_210_4916_1_1532739191677_3668269_460-163885-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 23:11:03,465 WARNING [train.py:1204] (2/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_270-26053-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 23:11:04,967 WARNING [train.py:1204] (2/4) Exclude cut with ID _14170_210_5219_1_1530525949868_4469650_412-317077-0 from training. Duration: 0.75 2023-10-02 23:11:07,191 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.balancer1.prob, batch_count=1052546.6666666667, ans=0.125 2023-10-02 23:11:08,317 WARNING [train.py:1204] (2/4) Exclude cut with ID _5289_210_2084_1_1527678202736_3779299_142-103317-0 from training. Duration: 0.84 2023-10-02 23:11:08,363 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_150-143438-0_sp1.1 from training. Duration: 0.92725 2023-10-02 23:11:09,801 WARNING [train.py:1204] (2/4) Exclude cut with ID _27894_210_12392_1_1532415496078_6902439_668-269779-0_sp1.1 from training. Duration: 0.97275 2023-10-02 23:11:09,866 WARNING [train.py:1204] (2/4) Exclude cut with ID _18513_210_9206_1_1531618215223_3862620_56-222075-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 23:11:13,009 WARNING [train.py:1204] (2/4) Exclude cut with ID _4092_210_2151_1_1526643044449_3135759_356-278694-0 from training. Duration: 0.47 2023-10-02 23:11:15,491 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.551e+02 1.837e+02 2.037e+02 2.244e+02 3.093e+02, threshold=4.074e+02, percent-clipped=0.0 2023-10-02 23:11:16,882 WARNING [train.py:1204] (2/4) Exclude cut with ID _25808_210_13292_1_1532343514562_7366109_633-47524-0 from training. Duration: 0.99 2023-10-02 23:11:16,894 WARNING [train.py:1204] (2/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_872-264525-0 from training. Duration: 0.65 2023-10-02 23:11:16,923 WARNING [train.py:1204] (2/4) Exclude cut with ID _14632_210_9988_1_1530691279392_3923200_239-232545-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 23:11:17,008 WARNING [train.py:1204] (2/4) Exclude cut with ID _14349_210_8341_1_1530615710818_3442470_153-27432-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 23:11:20,005 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.bypass_mid.scale_min, batch_count=1052613.3333333333, ans=0.2 2023-10-02 23:11:21,177 WARNING [train.py:1204] (2/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_37-41919-0_sp0.9 from training. Duration: 0.9666875 2023-10-02 23:11:23,037 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_196-289637-0_sp0.9 from training. Duration: 0.8333125 2023-10-02 23:11:24,324 INFO [train.py:1046] (2/4) Epoch 30, batch 3850, loss[loss=0.1675, simple_loss=0.2622, pruned_loss=0.03639, over 24325.00 frames. ], tot_loss[loss=0.1642, simple_loss=0.2425, pruned_loss=0.04297, over 4716491.39 frames. ], batch size: 74, lr: 3.39e-03, grad_scale: 8.0 2023-10-02 23:11:28,106 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.1.encoder.layers.0.feed_forward1.out_whiten, num_groups=1, num_channels=256, metric=5.00 vs. limit=15.0 2023-10-02 23:11:30,473 WARNING [train.py:1204] (2/4) Exclude cut with ID _13678_210_7497_1_1530530069226_3463170_371-3221-0_sp1.1 from training. Duration: 0.8181875 2023-10-02 23:11:30,543 WARNING [train.py:1204] (2/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_557-324258-0 from training. Duration: 0.81 2023-10-02 23:11:31,965 WARNING [train.py:1204] (2/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_1012-279473-0_sp1.1 from training. Duration: 0.6090625 2023-10-02 23:11:32,042 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_66916_210_14656_1_1535725525_326566_7-92649-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 23:11:32,694 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.3.encoder.layers.3.self_attn1.whiten, num_groups=1, num_channels=512, metric=11.25 vs. limit=22.5 2023-10-02 23:11:32,839 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.2.encoder.layers.0.self_attn1.whiten, num_groups=1, num_channels=384, metric=16.90 vs. limit=22.5 2023-10-02 23:11:33,770 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.conv_module2.balancer1.min_positive, batch_count=1052680.0, ans=0.025 2023-10-02 23:11:34,912 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_9460_210_4211_1_1528954587_10049667_480-260269-0_sp1.1 from training. Duration: 0.5090625 2023-10-02 23:11:36,464 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_40896_210_15710_1_1533433956_7100802_121-155001-0_sp1.1 from training. Duration: 0.92725 2023-10-02 23:11:38,384 WARNING [train.py:1204] (2/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_188-171673-0_sp1.1 from training. Duration: 0.52725 2023-10-02 23:11:40,191 WARNING [train.py:1204] (2/4) Exclude cut with ID _47790_210_16187_1_1533862814066_4207519_2-39653-0 from training. Duration: 0.98 2023-10-02 23:11:47,065 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_23850_210_4238_1_1532232597_1632433_112-229012-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 23:11:48,509 WARNING [train.py:1204] (2/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_626-79528-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 23:11:51,269 WARNING [train.py:1204] (2/4) Exclude cut with ID _30304_210_6319_1_1532476827261_7582060_856-90052-0_sp1.1 from training. Duration: 0.963625 2023-10-02 23:11:51,316 WARNING [train.py:1204] (2/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_122-109023-0_sp0.9 from training. Duration: 0.8444375 2023-10-02 23:11:51,598 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.balancer1.prob, batch_count=1052746.6666666667, ans=0.125 2023-10-02 23:11:55,815 WARNING [train.py:1204] (2/4) Exclude cut with ID _64414_210_17301_1_1535243252048_3757759_37-70328-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 23:11:55,881 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_20120_210_6753_1_1531643035_5482859_622-344907-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 23:11:57,195 WARNING [train.py:1204] (2/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_410-321956-0_sp1.1 from training. Duration: 0.92725 2023-10-02 23:11:57,216 WARNING [train.py:1204] (2/4) Exclude cut with ID _7562_210_2084_1_1528887767916_3946229_186-326422-0_sp0.9 from training. Duration: 0.9333125 2023-10-02 23:11:58,594 WARNING [train.py:1204] (2/4) Exclude cut with ID _37537_210_2253_1_1533171758568_3421949_127-285192-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 23:11:59,907 WARNING [train.py:1204] (2/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_723-228168-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 23:12:01,759 WARNING [train.py:1204] (2/4) Exclude cut with ID _13678_210_7497_1_1530530069226_3463170_426-246984-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 23:12:01,773 WARNING [train.py:1204] (2/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_399-156791-0_sp0.9 from training. Duration: 0.92225 2023-10-02 23:12:01,831 WARNING [train.py:1204] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_391-22148-0 from training. Duration: 0.77 2023-10-02 23:12:01,864 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_3475_210_2028_1_1526193578_3622569_64-297072-0 from training. Duration: 0.91 2023-10-02 23:12:03,179 WARNING [train.py:1204] (2/4) Exclude cut with ID _17875_210_2087_1_1531224012907_1821339_225-280015-0_sp1.1 from training. Duration: 0.963625 2023-10-02 23:12:03,197 WARNING [train.py:1204] (2/4) Exclude cut with ID _50655_210_18628_1_1534229942193_3772154_325-246169-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 23:12:05,929 WARNING [train.py:1204] (2/4) Exclude cut with ID _29529_210_7234_1_1532393961455_3977460_597-257348-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 23:12:05,972 WARNING [train.py:1204] (2/4) Exclude cut with ID _58895_210_8750_1_1535184081320_3649089_77-173485-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 23:12:07,302 WARNING [train.py:1204] (2/4) Exclude cut with ID _6270_210_2084_1_1527850911573_3856420_26-277343-0 from training. Duration: 0.82 2023-10-02 23:12:09,375 WARNING [train.py:1204] (2/4) Exclude cut with ID _14368_210_4220_1_1530532714870_3595529_144-3287-0 from training. Duration: 0.67 2023-10-02 23:12:12,694 WARNING [train.py:1204] (2/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_293-145278-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 23:12:14,162 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_40047_210_2290_1_1533290519_2374311_257-335162-0 from training. Duration: 0.95 2023-10-02 23:12:14,316 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.0.self_attn_weights.pos_emb_skip_rate, batch_count=1052880.0, ans=0.0 2023-10-02 23:12:16,840 WARNING [train.py:1204] (2/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_619-344499-0_sp0.9 from training. Duration: 0.7 2023-10-02 23:12:17,045 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.0.feed_forward1.hidden_balancer.prob, batch_count=1052880.0, ans=0.125 2023-10-02 23:12:18,554 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.conv_skip_rate, batch_count=1052880.0, ans=0.0 2023-10-02 23:12:19,672 WARNING [train.py:1204] (2/4) Exclude cut with ID _19504_210_9994_1_1531648863242_3325220_456-315379-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 23:12:21,040 WARNING [train.py:1204] (2/4) Exclude cut with ID _55857_210_2346_1_1534644007831_6898209_631-221692-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 23:12:22,683 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.conv_module2.balancer2.prob, batch_count=1052946.6666666667, ans=0.125 2023-10-02 23:12:25,237 WARNING [train.py:1204] (2/4) Exclude cut with ID _64631_210_27767_1_1535349580068_4165160_324-3682-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 23:12:25,269 WARNING [train.py:1204] (2/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_317-211107-0 from training. Duration: 0.92 2023-10-02 23:12:28,543 WARNING [train.py:1204] (2/4) Exclude cut with ID _6789_210_1534_1_1527857937218_3118950_311-61262-0 from training. Duration: 0.65 2023-10-02 23:12:29,963 WARNING [train.py:1204] (2/4) Exclude cut with ID _28323_210_16187_1_1532480395667_4100300_365-68882-0_sp1.1 from training. Duration: 0.92725 2023-10-02 23:12:29,996 WARNING [train.py:1204] (2/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_816-181825-0_sp1.1 from training. Duration: 0.92725 2023-10-02 23:12:30,169 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.conv_module2.balancer1.min_positive, batch_count=1052946.6666666667, ans=0.025 2023-10-02 23:12:33,517 WARNING [train.py:1204] (2/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_132-269633-0_sp1.1 from training. Duration: 0.6181875 2023-10-02 23:12:33,528 WARNING [train.py:1204] (2/4) Exclude cut with ID _44585_210_18628_1_1533605350387_4518199_280-73534-0_sp1.1 from training. Duration: 0.8090625 2023-10-02 23:12:34,842 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_68095_210_20185_1_1535718226_8888855_1009-164755-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 23:12:36,121 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_40223_210_6228_1_1533298404_4812267_400-103813-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 23:12:36,122 WARNING [train.py:1204] (2/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_491-261923-0_sp1.1 from training. Duration: 0.8909375 2023-10-02 23:12:36,129 WARNING [train.py:1204] (2/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_712-342563-0 from training. Duration: 0.97 2023-10-02 23:12:37,681 WARNING [train.py:1204] (2/4) Exclude cut with ID _41161_210_20034_1_1533431249657_3555314_175-190032-0_sp1.1 from training. Duration: 0.963625 2023-10-02 23:12:38,007 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.conv_module2.balancer1.prob, batch_count=1053013.3333333333, ans=0.125 2023-10-02 23:12:39,443 INFO [train.py:1046] (2/4) Epoch 30, batch 3900, loss[loss=0.1647, simple_loss=0.2348, pruned_loss=0.04729, over 23358.00 frames. ], tot_loss[loss=0.1634, simple_loss=0.2416, pruned_loss=0.04254, over 4721959.81 frames. ], batch size: 285, lr: 3.39e-03, grad_scale: 8.0 2023-10-02 23:12:39,519 WARNING [train.py:1204] (2/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_788-298907-0 from training. Duration: 0.89 2023-10-02 23:12:39,537 WARNING [train.py:1204] (2/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_225-70961-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 23:12:39,543 WARNING [train.py:1204] (2/4) Exclude cut with ID _58583_210_5236_1_1534726835266_3574249_235-175994-0_sp1.1 from training. Duration: 0.92725 2023-10-02 23:12:40,970 WARNING [train.py:1204] (2/4) Exclude cut with ID _12302_210_5219_1_1529924446466_8107280_907-182949-0_sp1.1 from training. Duration: 0.836375 2023-10-02 23:12:41,014 WARNING [train.py:1204] (2/4) Exclude cut with ID _54871_210_22356_1_1534665798253_3856359_94-193692-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 23:12:44,284 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_4528_210_3273_1_1527076654_3276777_342-168068-0_sp0.9 from training. Duration: 0.9666875 2023-10-02 23:12:44,523 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.nonlin_attention.balancer.min_positive, batch_count=1053013.3333333333, ans=0.05 2023-10-02 23:12:45,624 WARNING [train.py:1204] (2/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_368-74239-0_sp1.1 from training. Duration: 0.92725 2023-10-02 23:12:45,625 WARNING [train.py:1204] (2/4) Exclude cut with ID _44831_210_13427_1_1533816011549_3689035_291-295928-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 23:12:46,949 WARNING [train.py:1204] (2/4) Exclude cut with ID _56406_210_9288_1_1534899618664_3496149_455-131997-0_sp1.1 from training. Duration: 0.97275 2023-10-02 23:12:46,957 WARNING [train.py:1204] (2/4) Exclude cut with ID _6473_210_1741_1_1527901307396_3815380_108-17335-0 from training. Duration: 0.65 2023-10-02 23:12:48,258 WARNING [train.py:1204] (2/4) Exclude cut with ID _14224_210_4381_1_1530529062037_4264010_286-135600-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 23:12:51,225 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_14056_210_4163_1_1530604483_7572827_928-321675-0_sp1.1 from training. Duration: 0.936375 2023-10-02 23:12:51,295 WARNING [train.py:1204] (2/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_81-300212-0_sp1.1 from training. Duration: 0.5454375 2023-10-02 23:12:51,324 WARNING [train.py:1204] (2/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_29-280622-0_sp0.9 from training. Duration: 0.911125 2023-10-02 23:12:52,683 WARNING [train.py:1204] (2/4) Exclude cut with ID _61079_210_4525_1_1534921008198_4011419_228-176447-0_sp1.1 from training. Duration: 0.936375 2023-10-02 23:12:54,261 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.conv_module1.balancer1.prob, batch_count=1053080.0, ans=0.125 2023-10-02 23:12:55,447 WARNING [train.py:1204] (2/4) Exclude cut with ID _8394_210_6702_1_1528590504316_3540189_372-198878-0_sp1.1 from training. Duration: 0.5454375 2023-10-02 23:12:55,457 WARNING [train.py:1204] (2/4) Exclude cut with ID _64521_210_14699_1_1535243427259_3815328_244-208725-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 23:12:56,853 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8275_210_4198_1_1528455320_3892571_491-17707-0_sp0.9 from training. Duration: 0.711125 2023-10-02 23:12:58,592 WARNING [train.py:1204] (2/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_375-245677-0 from training. Duration: 0.81 2023-10-02 23:12:58,600 WARNING [train.py:1204] (2/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_690-286055-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 23:12:59,972 WARNING [train.py:1204] (2/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_89-321955-0 from training. Duration: 0.8 2023-10-02 23:13:01,270 WARNING [train.py:1204] (2/4) Exclude cut with ID _18895_210_6228_1_1531357274234_3784930_213-98721-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 23:13:03,110 WARNING [train.py:1204] (2/4) Exclude cut with ID _19708_210_9206_1_1532136650733_4238364_347-65240-0 from training. Duration: 0.97 2023-10-02 23:13:04,576 WARNING [train.py:1204] (2/4) Exclude cut with ID _64135_210_2253_1_1535677267876_7608972_498-5039-0 from training. Duration: 0.78 2023-10-02 23:13:08,730 WARNING [train.py:1204] (2/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_146-276944-0_sp1.1 from training. Duration: 0.7545625 2023-10-02 23:13:08,826 WARNING [train.py:1204] (2/4) Exclude cut with ID _63171_210_15488_1_1535104792794_3295110_155-34488-0_sp1.1 from training. Duration: 0.97275 2023-10-02 23:13:10,142 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_5438_210_1794_1_1527336687_2600009_233-58072-0_sp0.9 from training. Duration: 0.8555625 2023-10-02 23:13:10,181 WARNING [train.py:1204] (2/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_63-101996-0_sp0.9 from training. Duration: 0.97775 2023-10-02 23:13:13,556 WARNING [train.py:1204] (2/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_271-269084-0_sp1.1 from training. Duration: 0.7545625 2023-10-02 23:13:16,667 WARNING [train.py:1204] (2/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_345-167986-0_sp1.1 from training. Duration: 0.763625 2023-10-02 23:13:18,022 WARNING [train.py:1204] (2/4) Exclude cut with ID _39290_210_4220_1_1533862778760_3075118_92-311600-0_sp1.1 from training. Duration: 0.8 2023-10-02 23:13:18,029 WARNING [train.py:1204] (2/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_209-65306-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 23:13:18,103 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_26433_210_12108_1_1532157158_4056480_317-335807-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 23:13:18,252 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.1.self_attn_weights.pos_emb_skip_rate, batch_count=1053146.6666666667, ans=0.0 2023-10-02 23:13:21,147 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.balancer1.prob, batch_count=1053146.6666666667, ans=0.125 2023-10-02 23:13:23,627 WARNING [train.py:1204] (2/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_238-94127-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 23:13:24,864 WARNING [train.py:1204] (2/4) Exclude cut with ID _41314_210_12651_1_1533459476543_3766460_207-132038-0_sp1.1 from training. Duration: 0.8545625 2023-10-02 23:13:31,678 WARNING [train.py:1204] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_167-137255-0_sp1.1 from training. Duration: 0.5818125 2023-10-02 23:13:34,324 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_2009_210_2071_1_1525481134_4588937_385-302100-0_sp0.9 from training. Duration: 0.9666875 2023-10-02 23:13:35,032 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.3.encoder.layers.3.conv_module1.whiten, num_groups=1, num_channels=512, metric=4.82 vs. limit=15.0 2023-10-02 23:13:44,609 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_15517_210_6753_1_1530954008_3487336_253-110617-0_sp1.1 from training. Duration: 0.936375 2023-10-02 23:13:45,995 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.451e+02 1.921e+02 2.166e+02 2.503e+02 3.662e+02, threshold=4.332e+02, percent-clipped=0.0 2023-10-02 23:13:47,400 WARNING [train.py:1204] (2/4) Exclude cut with ID _39290_210_4220_1_1533862778760_3075118_92-311600-0_sp0.9 from training. Duration: 0.97775 2023-10-02 23:13:47,449 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_42-142473-0 from training. Duration: 0.72 2023-10-02 23:13:47,480 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_2296_210_2087_1_1525503448_3397335_57-185430-0 from training. Duration: 0.6 2023-10-02 23:13:47,492 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_11305_210_1794_1_1529750428_7178148_257-312870-0_sp0.9 from training. Duration: 0.97775 2023-10-02 23:13:50,916 WARNING [train.py:1204] (2/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_222-198127-0 from training. Duration: 0.84 2023-10-02 23:13:52,297 WARNING [train.py:1204] (2/4) Exclude cut with ID _65263_210_22356_1_1535763598691_3921037_423-202699-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 23:13:52,484 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.feed_forward1.out_proj.dropout_p, batch_count=1053346.6666666667, ans=0.1 2023-10-02 23:13:53,560 INFO [train.py:1046] (2/4) Epoch 30, batch 3950, loss[loss=0.1607, simple_loss=0.2399, pruned_loss=0.04081, over 23686.00 frames. ], tot_loss[loss=0.1633, simple_loss=0.2425, pruned_loss=0.04207, over 4735292.83 frames. ], batch size: 149, lr: 3.39e-03, grad_scale: 4.0 2023-10-02 23:13:53,650 WARNING [train.py:1204] (2/4) Exclude cut with ID _15037_210_2253_1_1530752294104_3267650_13-130361-0 from training. Duration: 0.99 2023-10-02 23:13:59,136 WARNING [train.py:1204] (2/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_628-198080-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 23:14:00,507 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_3171_210_2087_1_1526109084_2330007_37-64334-0 from training. Duration: 0.82 2023-10-02 23:14:01,929 WARNING [train.py:1204] (2/4) Exclude cut with ID _5287_210_2084_1_1527418883040_3878670_12-221186-0_sp1.1 from training. Duration: 0.8909375 2023-10-02 23:14:02,196 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.conv_module2.balancer2.min_positive, batch_count=1053346.6666666667, ans=0.05 2023-10-02 23:14:04,009 WARNING [train.py:1204] (2/4) Exclude cut with ID _6653_210_2629_1_1527919172441_6732679_1103-56445-0_sp1.1 from training. Duration: 0.663625 2023-10-02 23:14:05,310 WARNING [train.py:1204] (2/4) Exclude cut with ID _65263_210_22356_1_1535763598691_3921037_169-140068-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 23:14:10,919 WARNING [train.py:1204] (2/4) Exclude cut with ID IC0671W0118-331961 from training. Duration: 0.896 2023-10-02 23:14:10,987 WARNING [train.py:1204] (2/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_402-102311-0_sp1.1 from training. Duration: 0.6090625 2023-10-02 23:14:11,006 WARNING [train.py:1204] (2/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_202-293523-0 from training. Duration: 0.72 2023-10-02 23:14:12,330 WARNING [train.py:1204] (2/4) Exclude cut with ID IC0679W0038-335846 from training. Duration: 0.768 2023-10-02 23:14:12,352 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_37357_210_17024_1_1533089504_2316121_79-198866-0_sp1.1 from training. Duration: 0.936375 2023-10-02 23:14:13,771 WARNING [train.py:1204] (2/4) Exclude cut with ID _7606_210_4915_1_1528894253277_5080690_292-271285-0_sp1.1 from training. Duration: 0.963625 2023-10-02 23:14:13,782 WARNING [train.py:1204] (2/4) Exclude cut with ID _3541_210_2477_1_1526475408709_3263973_377-296839-0_sp0.9 from training. Duration: 0.611125 2023-10-02 23:14:13,784 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_45886_210_17024_1_1533704917_2435333_58-131069-0_sp1.1 from training. Duration: 0.963625 2023-10-02 23:14:18,408 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6961_210_2087_1_1527937168_6815011_395-185060-0 from training. Duration: 0.95 2023-10-02 23:14:19,833 WARNING [train.py:1204] (2/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_304-266479-0_sp1.1 from training. Duration: 0.97275 2023-10-02 23:14:19,884 WARNING [train.py:1204] (2/4) Exclude cut with ID _3867_210_1883_1_1526641200575_4475830_319-349850-0_sp1.1 from training. Duration: 0.6090625 2023-10-02 23:14:19,892 WARNING [train.py:1204] (2/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_304-59227-0_sp0.9 from training. Duration: 0.8555625 2023-10-02 23:14:21,231 WARNING [train.py:1204] (2/4) Exclude cut with ID _8021_210_7994_1_1528376451358_3653069_100-140231-0_sp0.9 from training. Duration: 0.7444375 2023-10-02 23:14:22,542 WARNING [train.py:1204] (2/4) Exclude cut with ID _8833_210_5219_1_1528592665210_7553059_329-125156-0_sp0.9 from training. Duration: 0.911125 2023-10-02 23:14:25,768 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.4.encoder.layers.2.whiten, num_groups=1, num_channels=384, metric=3.02 vs. limit=12.0 2023-10-02 23:14:32,197 WARNING [train.py:1204] (2/4) Exclude cut with ID _14459_210_2228_1_1531443641946_7516700_792-287234-0_sp1.1 from training. Duration: 0.92725 2023-10-02 23:14:32,213 WARNING [train.py:1204] (2/4) Exclude cut with ID _19708_210_9206_1_1532136650733_4238364_347-65240-0_sp1.1 from training. Duration: 0.8818125 2023-10-02 23:14:36,344 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.nonlin_attention.balancer.min_positive, batch_count=1053546.6666666667, ans=0.05 2023-10-02 23:14:37,447 WARNING [train.py:1204] (2/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_70-27732-0 from training. Duration: 0.94 2023-10-02 23:14:43,047 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_397-132565-0 from training. Duration: 0.99 2023-10-02 23:14:43,050 WARNING [train.py:1204] (2/4) Exclude cut with ID _49728_210_12651_1_1534041034294_6788150_624-195301-0 from training. Duration: 0.85 2023-10-02 23:14:44,366 WARNING [train.py:1204] (2/4) Exclude cut with ID _55429_210_10626_1_1534467588527_7174143_377-169391-0_sp1.1 from training. Duration: 0.87275 2023-10-02 23:14:44,472 WARNING [train.py:1204] (2/4) Exclude cut with ID _49376_210_5986_1_1533985278732_3370740_354-344695-0_sp1.1 from training. Duration: 0.9 2023-10-02 23:14:50,828 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.bypass_mid.scale_min, batch_count=1053613.3333333333, ans=0.2 2023-10-02 23:14:51,910 WARNING [train.py:1204] (2/4) Exclude cut with ID _4697_210_2069_1_1526815801953_3823098_276-96593-0_sp1.1 from training. Duration: 0.7 2023-10-02 23:14:51,926 WARNING [train.py:1204] (2/4) Exclude cut with ID _15012_210_1968_1_1530856820429_5095350_688-178849-0_sp0.9 from training. Duration: 0.711125 2023-10-02 23:14:52,654 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.2.encoder.layers.2.self_attn1.whiten, num_groups=1, num_channels=384, metric=18.69 vs. limit=22.5 2023-10-02 23:14:53,207 WARNING [train.py:1204] (2/4) Exclude cut with ID _25565_210_12392_1_1532423009382_3952098_372-69023-0_sp1.1 from training. Duration: 0.936375 2023-10-02 23:14:53,242 WARNING [train.py:1204] (2/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_61-327062-0_sp1.1 from training. Duration: 0.736375 2023-10-02 23:14:53,271 WARNING [train.py:1204] (2/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_215-141059-0 from training. Duration: 0.62 2023-10-02 23:14:58,742 WARNING [train.py:1204] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_6-226750-0_sp1.1 from training. Duration: 0.8545625 2023-10-02 23:14:58,832 WARNING [train.py:1204] (2/4) Exclude cut with ID _14348_210_8341_1_1530599434996_3778640_240-232370-0_sp1.1 from training. Duration: 0.763625 2023-10-02 23:15:03,008 WARNING [train.py:1204] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_468-189058-0 from training. Duration: 0.96 2023-10-02 23:15:06,712 INFO [train.py:1046] (2/4) Epoch 30, batch 4000, loss[loss=0.1483, simple_loss=0.2305, pruned_loss=0.033, over 24576.00 frames. ], tot_loss[loss=0.1633, simple_loss=0.2426, pruned_loss=0.04205, over 4737760.16 frames. ], batch size: 60, lr: 3.39e-03, grad_scale: 8.0 2023-10-02 23:15:12,265 WARNING [train.py:1204] (2/4) Exclude cut with ID _21987_210_5854_1_1531821500482_3637038_247-93340-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 23:15:14,373 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.conv_module2.balancer2.prob, batch_count=1053680.0, ans=0.125 2023-10-02 23:15:18,879 WARNING [train.py:1204] (2/4) Exclude cut with ID _66868_210_9527_1_1535509287954_4561140_436-191602-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 23:15:24,260 WARNING [train.py:1204] (2/4) Exclude cut with ID _18895_210_6228_1_1531357274234_3784930_394-58203-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 23:15:25,555 WARNING [train.py:1204] (2/4) Exclude cut with ID _58605_210_12210_1_1534723195939_3904259_258-281498-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 23:15:25,583 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_33348_210_5632_1_1533086912_3324030_245-292752-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 23:15:25,601 WARNING [train.py:1204] (2/4) Exclude cut with ID _67831_210_12392_1_1535612464202_3092612_346-2148-0 from training. Duration: 0.93 2023-10-02 23:15:26,922 WARNING [train.py:1204] (2/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_369-222060-0_sp0.9 from training. Duration: 0.788875 2023-10-02 23:15:26,987 WARNING [train.py:1204] (2/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_245-174553-0 from training. Duration: 0.88 2023-10-02 23:15:26,992 WARNING [train.py:1204] (2/4) Exclude cut with ID _3541_210_2477_1_1526475408709_3263973_231-104736-0_sp1.1 from training. Duration: 0.7090625 2023-10-02 23:15:28,255 WARNING [train.py:1204] (2/4) Exclude cut with ID _44585_210_18628_1_1533605350387_4518199_280-73534-0 from training. Duration: 0.89 2023-10-02 23:15:29,708 WARNING [train.py:1204] (2/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_22-201476-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 23:15:32,422 WARNING [train.py:1204] (2/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_325-62802-0_sp0.9 from training. Duration: 0.9666875 2023-10-02 23:15:32,446 WARNING [train.py:1204] (2/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_199-274062-0_sp1.1 from training. Duration: 0.97275 2023-10-02 23:15:32,449 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_8-319957-0_sp1.1 from training. Duration: 0.87275 2023-10-02 23:15:32,486 WARNING [train.py:1204] (2/4) Exclude cut with ID _29343_210_4381_1_1532602757752_3674109_224-349726-0_sp1.1 from training. Duration: 0.936375 2023-10-02 23:15:32,490 WARNING [train.py:1204] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_520-126729-0_sp1.1 from training. Duration: 0.636375 2023-10-02 23:15:33,841 WARNING [train.py:1204] (2/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_239-22076-0_sp1.1 from training. Duration: 0.77275 2023-10-02 23:15:35,865 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9012W0011-501146 from training. Duration: 0.896 2023-10-02 23:15:36,995 WARNING [train.py:1204] (2/4) Exclude cut with ID _29857_210_20034_1_1532395886753_3798160_279-185801-0_sp1.1 from training. Duration: 0.82725 2023-10-02 23:15:38,954 WARNING [train.py:1204] (2/4) Exclude cut with ID _43552_210_8913_1_1533726031059_3523429_261-223933-0_sp1.1 from training. Duration: 0.92725 2023-10-02 23:15:40,616 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.conv_module1.balancer2.prob, batch_count=1053813.3333333333, ans=0.125 2023-10-02 23:15:41,697 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9029W0025-509426 from training. Duration: 0.896 2023-10-02 23:15:41,761 WARNING [train.py:1204] (2/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_337-181502-0_sp1.1 from training. Duration: 0.5909375 2023-10-02 23:15:41,781 WARNING [train.py:1204] (2/4) Exclude cut with ID _68635_210_9317_1_1535770813410_4315870_159-332599-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 23:15:47,841 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_161-276996-0 from training. Duration: 0.95 2023-10-02 23:15:49,144 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7088_210_1874_1_1527939776_1101113_127-68870-0_sp1.1 from training. Duration: 0.936375 2023-10-02 23:15:51,941 WARNING [train.py:1204] (2/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_322-217220-0_sp1.1 from training. Duration: 0.7818125 2023-10-02 23:15:52,004 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9072W0002-530636 from training. Duration: 0.768 2023-10-02 23:15:54,705 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_340-336408-0_sp1.1 from training. Duration: 0.8454375 2023-10-02 23:15:54,760 WARNING [train.py:1204] (2/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_395-37222-0 from training. Duration: 0.76 2023-10-02 23:15:54,769 WARNING [train.py:1204] (2/4) Exclude cut with ID _64099_210_17301_1_1535526977656_1578188_159-209156-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 23:15:57,448 WARNING [train.py:1204] (2/4) Exclude cut with ID _53950_210_5816_1_1534417264919_3862951_28-29681-0_sp1.1 from training. Duration: 0.92725 2023-10-02 23:15:58,961 WARNING [train.py:1204] (2/4) Exclude cut with ID _4681_210_3525_1_1527250689464_120889_15-126760-0_sp1.1 from training. Duration: 0.736375 2023-10-02 23:16:00,339 WARNING [train.py:1204] (2/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_242-3472-0_sp1.1 from training. Duration: 0.663625 2023-10-02 23:16:00,371 WARNING [train.py:1204] (2/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_436-186311-0_sp0.9 from training. Duration: 0.72225 2023-10-02 23:16:00,393 WARNING [train.py:1204] (2/4) Exclude cut with ID _3675_210_4557_1_1526639934408_4182089_452-172147-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 23:16:01,889 WARNING [train.py:1204] (2/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_80-147301-0 from training. Duration: 0.84 2023-10-02 23:16:01,913 WARNING [train.py:1204] (2/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_351-107588-0_sp1.1 from training. Duration: 0.92725 2023-10-02 23:16:03,334 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9111W0002-550781 from training. Duration: 0.896 2023-10-02 23:16:07,918 WARNING [train.py:1204] (2/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_465-156500-0_sp1.1 from training. Duration: 0.6545625 2023-10-02 23:16:12,362 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.564e+02 1.926e+02 2.122e+02 2.388e+02 3.312e+02, threshold=4.243e+02, percent-clipped=0.0 2023-10-02 23:16:12,436 WARNING [train.py:1204] (2/4) Exclude cut with ID _13938_210_9988_1_1530518718978_3693960_304-112784-0_sp1.1 from training. Duration: 0.4181875 2023-10-02 23:16:15,105 WARNING [train.py:1204] (2/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_202-67017-0_sp1.1 from training. Duration: 0.7454375 2023-10-02 23:16:15,152 WARNING [train.py:1204] (2/4) Exclude cut with ID _58414_210_8254_1_1535421609162_7151182_448-4358-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 23:16:15,843 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.2.encoder.layers.2.self_attn2.whiten, num_groups=1, num_channels=384, metric=20.88 vs. limit=22.5 2023-10-02 23:16:16,588 WARNING [train.py:1204] (2/4) Exclude cut with ID _66840_210_5219_1_1535452168426_4196349_203-243234-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 23:16:18,520 WARNING [train.py:1204] (2/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_201-213513-0_sp1.1 from training. Duration: 0.97275 2023-10-02 23:16:20,283 INFO [train.py:1046] (2/4) Epoch 30, batch 4050, loss[loss=0.1756, simple_loss=0.2604, pruned_loss=0.04539, over 24004.00 frames. ], tot_loss[loss=0.1639, simple_loss=0.2427, pruned_loss=0.04252, over 4729323.39 frames. ], batch size: 80, lr: 3.39e-03, grad_scale: 8.0 2023-10-02 23:16:23,162 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6291_210_3273_1_1527683649_1078498_117-187560-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 23:16:25,908 WARNING [train.py:1204] (2/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_135-174080-0_sp0.9 from training. Duration: 0.588875 2023-10-02 23:16:25,956 WARNING [train.py:1204] (2/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_45-265684-0 from training. Duration: 0.72 2023-10-02 23:16:28,637 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_435-313305-0_sp1.1 from training. Duration: 0.6454375 2023-10-02 23:16:28,671 WARNING [train.py:1204] (2/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_235-307337-0_sp1.1 from training. Duration: 0.8545625 2023-10-02 23:16:28,876 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.bypass.skip_rate, batch_count=1054013.3333333333, ans=0.09899494936611666 2023-10-02 23:16:30,008 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7762_210_1794_1_1528199508_4057408_419-125009-0_sp0.9 from training. Duration: 0.92225 2023-10-02 23:16:31,320 WARNING [train.py:1204] (2/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_215-141059-0_sp1.1 from training. Duration: 0.563625 2023-10-02 23:16:31,407 WARNING [train.py:1204] (2/4) Exclude cut with ID _27013_210_1389_1_1532437173240_3252649_222-125573-0_sp1.1 from training. Duration: 0.8909375 2023-10-02 23:16:34,411 WARNING [train.py:1204] (2/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_126-274694-0_sp1.1 from training. Duration: 0.8909375 2023-10-02 23:16:37,215 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_2592_210_2290_1_1525697025_4882925_157-70512-0_sp1.1 from training. Duration: 0.82725 2023-10-02 23:16:37,254 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_292-265928-0_sp0.9 from training. Duration: 0.5666875 2023-10-02 23:16:39,096 WARNING [train.py:1204] (2/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_1211-226066-0_sp1.1 from training. Duration: 0.6909375 2023-10-02 23:16:39,144 WARNING [train.py:1204] (2/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_394-265747-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 23:16:41,966 WARNING [train.py:1204] (2/4) Exclude cut with ID _48782_210_12658_1_1534467701544_2300422_239-67139-0_sp1.1 from training. Duration: 0.97275 2023-10-02 23:16:43,447 WARNING [train.py:1204] (2/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_329-113173-0_sp1.1 from training. Duration: 0.563625 2023-10-02 23:16:46,600 WARNING [train.py:1204] (2/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_81-75849-0_sp0.9 from training. Duration: 0.4333125 2023-10-02 23:16:49,341 WARNING [train.py:1204] (2/4) Exclude cut with ID _9235_210_5219_1_1528891154253_8046120_1003-101638-0 from training. Duration: 0.73 2023-10-02 23:16:49,371 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9309W0189-640316 from training. Duration: 0.64 2023-10-02 23:16:51,369 WARNING [train.py:1204] (2/4) Exclude cut with ID _45329_210_7005_1_1534157951211_6774339_503-287883-0_sp1.1 from training. Duration: 0.8 2023-10-02 23:16:52,890 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.conv_skip_rate, batch_count=1054146.6666666667, ans=0.0 2023-10-02 23:16:58,179 WARNING [train.py:1204] (2/4) Exclude cut with ID _8833_210_5219_1_1528592665210_7553059_611-80313-0 from training. Duration: 0.64 2023-10-02 23:16:59,521 WARNING [train.py:1204] (2/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_335-106626-0_sp1.1 from training. Duration: 0.963625 2023-10-02 23:17:02,309 WARNING [train.py:1204] (2/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_237-44332-0_sp1.1 from training. Duration: 0.8545625 2023-10-02 23:17:05,150 WARNING [train.py:1204] (2/4) Exclude cut with ID _46952_210_5986_1_1534230010357_3107379_178-147341-0_sp1.1 from training. Duration: 0.97275 2023-10-02 23:17:07,135 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_13091_210_7925_1_1530609447_8987354_946-154426-0_sp1.1 from training. Duration: 0.936375 2023-10-02 23:17:07,167 WARNING [train.py:1204] (2/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_326-17061-0_sp1.1 from training. Duration: 0.8545625 2023-10-02 23:17:08,649 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.bypass_mid.scale_min, batch_count=1054213.3333333333, ans=0.2 2023-10-02 23:17:09,854 WARNING [train.py:1204] (2/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_38-169920-0_sp1.1 from training. Duration: 0.82725 2023-10-02 23:17:10,025 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.feed_forward1.hidden_balancer.prob, batch_count=1054213.3333333333, ans=0.125 2023-10-02 23:17:13,930 WARNING [train.py:1204] (2/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_283-220372-0 from training. Duration: 0.56 2023-10-02 23:17:13,946 WARNING [train.py:1204] (2/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_252-216674-0_sp1.1 from training. Duration: 0.4909375 2023-10-02 23:17:15,346 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_202-91290-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 23:17:17,319 WARNING [train.py:1204] (2/4) Exclude cut with ID _41314_210_12651_1_1533459476543_3766460_80-74184-0 from training. Duration: 0.83 2023-10-02 23:17:20,820 WARNING [train.py:1204] (2/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_411-271358-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 23:17:28,102 WARNING [train.py:1204] (2/4) Exclude cut with ID _68777_210_4353_1_1535810390731_3117904_129-166012-0 from training. Duration: 0.85 2023-10-02 23:17:28,194 WARNING [train.py:1204] (2/4) Exclude cut with ID _44982_210_1511_1_1534312820108_3726029_410-109759-0_sp1.1 from training. Duration: 0.963625 2023-10-02 23:17:28,199 WARNING [train.py:1204] (2/4) Exclude cut with ID _14224_210_4381_1_1530529062037_4264010_364-22817-0_sp1.1 from training. Duration: 0.6454375 2023-10-02 23:17:28,470 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.balancer2.prob, batch_count=1054280.0, ans=0.125 2023-10-02 23:17:29,555 WARNING [train.py:1204] (2/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_89-242335-0 from training. Duration: 0.95 2023-10-02 23:17:30,930 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_12868_210_6753_1_1530702479_3610564_151-325626-0 from training. Duration: 0.97 2023-10-02 23:17:30,932 WARNING [train.py:1204] (2/4) Exclude cut with ID _6275_210_2084_1_1528110062213_7669049_786-264332-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 23:17:33,695 INFO [train.py:1046] (2/4) Epoch 30, batch 4100, loss[loss=0.1585, simple_loss=0.2449, pruned_loss=0.0361, over 24470.00 frames. ], tot_loss[loss=0.1644, simple_loss=0.2431, pruned_loss=0.04281, over 4718941.16 frames. ], batch size: 63, lr: 3.39e-03, grad_scale: 8.0 2023-10-02 23:17:33,750 WARNING [train.py:1204] (2/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_35-153886-0_sp1.1 from training. Duration: 0.763625 2023-10-02 23:17:35,176 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_24007_210_1794_1_1531918925_1663875_116-100916-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 23:17:35,192 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_162-262717-0_sp1.1 from training. Duration: 0.7545625 2023-10-02 23:17:36,830 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.feed_forward1.hidden_balancer.prob, batch_count=1054346.6666666667, ans=0.125 2023-10-02 23:17:41,268 WARNING [train.py:1204] (2/4) Exclude cut with ID _8263_210_1664_1_1528438953494_294100_40-204731-0 from training. Duration: 0.5 2023-10-02 23:17:41,508 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.conv_skip_rate, batch_count=1054346.6666666667, ans=0.0 2023-10-02 23:17:42,631 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_12868_210_6753_1_1530702479_3610564_416-293746-0 from training. Duration: 0.9 2023-10-02 23:17:46,002 WARNING [train.py:1204] (2/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_605-219732-0 from training. Duration: 0.94 2023-10-02 23:17:47,411 WARNING [train.py:1204] (2/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_454-123002-0 from training. Duration: 0.69 2023-10-02 23:17:47,417 WARNING [train.py:1204] (2/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_25-304273-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 23:17:47,481 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6860_210_4852_1_1527917457_3833327_451-39702-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 23:17:47,538 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder_embed.conv.8.prob, batch_count=1054413.3333333333, ans=0.125 2023-10-02 23:17:48,776 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_304-16505-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 23:17:48,790 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6291_210_3273_1_1527683649_1078498_4-266348-0_sp1.1 from training. Duration: 0.7090625 2023-10-02 23:17:48,873 WARNING [train.py:1204] (2/4) Exclude cut with ID ID0149W0001-753701 from training. Duration: 0.989 2023-10-02 23:17:52,165 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_41251_210_15368_1_1533429767_4820695_394-55393-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 23:17:53,485 WARNING [train.py:1204] (2/4) Exclude cut with ID _8793_210_6310_1_1528542985139_2310009_121-179782-0_sp1.1 from training. Duration: 0.72725 2023-10-02 23:17:53,499 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_2729_210_3528_1_1525863505_3882214_421-73047-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 23:17:54,858 WARNING [train.py:1204] (2/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_278-154492-0_sp0.9 from training. Duration: 0.8444375 2023-10-02 23:17:58,176 WARNING [train.py:1204] (2/4) Exclude cut with ID _14170_210_5219_1_1530525949868_4469650_412-317077-0_sp0.9 from training. Duration: 0.8333125 2023-10-02 23:17:58,456 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.conv_skip_rate, batch_count=1054413.3333333333, ans=0.0 2023-10-02 23:17:58,486 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.balancer1.prob, batch_count=1054413.3333333333, ans=0.125 2023-10-02 23:17:59,564 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_727-138317-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 23:17:59,611 WARNING [train.py:1204] (2/4) Exclude cut with ID _25808_210_13292_1_1532343514562_7366109_345-335048-0_sp1.1 from training. Duration: 0.92725 2023-10-02 23:17:59,626 WARNING [train.py:1204] (2/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_506-23200-0 from training. Duration: 0.97 2023-10-02 23:17:59,692 WARNING [train.py:1204] (2/4) Exclude cut with ID _44718_210_5816_1_1534050051869_6469030_643-245187-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 23:17:59,696 WARNING [train.py:1204] (2/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_42-186162-0_sp1.1 from training. Duration: 0.836375 2023-10-02 23:18:00,958 WARNING [train.py:1204] (2/4) Exclude cut with ID _3231_210_2106_1_1526205864934_6784220_171-256004-0_sp1.1 from training. Duration: 0.7818125 2023-10-02 23:18:00,990 WARNING [train.py:1204] (2/4) Exclude cut with ID _25914_210_6400_1_1532141317058_644740_43-248478-0_sp1.1 from training. Duration: 0.936375 2023-10-02 23:18:02,348 WARNING [train.py:1204] (2/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_577-287150-0 from training. Duration: 0.82 2023-10-02 23:18:05,155 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_46015_210_7234_1_1533863319_3087837_376-276068-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 23:18:06,469 WARNING [train.py:1204] (2/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_51-200220-0 from training. Duration: 0.56 2023-10-02 23:18:08,332 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_452-96101-0_sp1.1 from training. Duration: 0.963625 2023-10-02 23:18:11,040 WARNING [train.py:1204] (2/4) Exclude cut with ID _13252_210_1925_1_1530671372047_3336909_263-168388-0_sp1.1 from training. Duration: 0.7818125 2023-10-02 23:18:11,041 WARNING [train.py:1204] (2/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_469-94673-0 from training. Duration: 0.79 2023-10-02 23:18:11,152 WARNING [train.py:1204] (2/4) Exclude cut with ID _14922_210_6228_1_1530707435273_3693310_303-228738-0_sp1.1 from training. Duration: 0.763625 2023-10-02 23:18:11,185 WARNING [train.py:1204] (2/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_262-250826-0_sp1.1 from training. Duration: 0.9 2023-10-02 23:18:12,480 WARNING [train.py:1204] (2/4) Exclude cut with ID _68777_210_4353_1_1535810390731_3117904_129-166012-0_sp1.1 from training. Duration: 0.77275 2023-10-02 23:18:13,912 WARNING [train.py:1204] (2/4) Exclude cut with ID _58895_210_8750_1_1535184081320_3649089_265-323810-0 from training. Duration: 0.96 2023-10-02 23:18:13,999 WARNING [train.py:1204] (2/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_120-76843-0_sp0.9 from training. Duration: 0.611125 2023-10-02 23:18:16,030 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7544_210_6753_1_1528587722_3576686_295-258479-0_sp0.9 from training. Duration: 0.9333125 2023-10-02 23:18:18,714 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7046_210_6753_1_1527895337_6920917_783-278109-0 from training. Duration: 0.77 2023-10-02 23:18:20,691 WARNING [train.py:1204] (2/4) Exclude cut with ID _42326_210_7770_1_1533814282531_3707269_113-308323-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 23:18:20,728 WARNING [train.py:1204] (2/4) Exclude cut with ID _7852_210_5219_1_1528286471316_8139400_242-105336-0_sp0.9 from training. Duration: 0.8 2023-10-02 23:18:23,434 WARNING [train.py:1204] (2/4) Exclude cut with ID _56235_210_10640_1_1534557335235_4161099_114-277905-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 23:18:28,308 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_13118_210_7925_1_1530755612_7608297_42-66683-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 23:18:32,267 WARNING [train.py:1204] (2/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_901-79438-0_sp1.1 from training. Duration: 0.8545625 2023-10-02 23:18:33,644 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_30280_210_1881_1_1532872779_3660139_211-182194-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 23:18:39,857 WARNING [train.py:1204] (2/4) Exclude cut with ID _14898_210_9206_1_1530766812377_4177190_621-45732-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 23:18:39,865 WARNING [train.py:1204] (2/4) Exclude cut with ID _35350_210_16040_1_1533040191183_4075299_224-133775-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 23:18:41,141 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.653e+02 1.912e+02 2.226e+02 2.556e+02 3.636e+02, threshold=4.451e+02, percent-clipped=0.0 2023-10-02 23:18:43,834 WARNING [train.py:1204] (2/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_147-293311-0_sp1.1 from training. Duration: 0.8545625 2023-10-02 23:18:46,958 WARNING [train.py:1204] (2/4) Exclude cut with ID _12302_210_5219_1_1529924446466_8107280_835-279754-0_sp1.1 from training. Duration: 0.8181875 2023-10-02 23:18:48,256 INFO [train.py:1046] (2/4) Epoch 30, batch 4150, loss[loss=0.1557, simple_loss=0.2493, pruned_loss=0.03104, over 24305.00 frames. ], tot_loss[loss=0.1652, simple_loss=0.2437, pruned_loss=0.04335, over 4710940.18 frames. ], batch size: 74, lr: 3.39e-03, grad_scale: 8.0 2023-10-02 23:18:51,247 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8453_210_4211_1_1528502940_8562730_347-330673-0_sp0.9 from training. Duration: 0.8 2023-10-02 23:18:51,340 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_213-103282-0_sp0.9 from training. Duration: 0.8666875 2023-10-02 23:18:53,204 WARNING [train.py:1204] (2/4) Exclude cut with ID _49728_210_12651_1_1534041034294_6788150_66-43496-0_sp1.1 from training. Duration: 0.97275 2023-10-02 23:18:53,206 WARNING [train.py:1204] (2/4) Exclude cut with ID _19023_210_4353_1_1531378892552_5820780_28-36615-0_sp1.1 from training. Duration: 0.8818125 2023-10-02 23:18:54,665 WARNING [train.py:1204] (2/4) Exclude cut with ID _14349_210_8341_1_1530615710818_3442470_115-285975-0 from training. Duration: 0.86 2023-10-02 23:18:54,683 WARNING [train.py:1204] (2/4) Exclude cut with ID _12734_210_8341_1_1530183614296_3891929_236-47464-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 23:18:55,975 WARNING [train.py:1204] (2/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_177-236809-0 from training. Duration: 0.49 2023-10-02 23:18:57,366 WARNING [train.py:1204] (2/4) Exclude cut with ID _13678_210_7497_1_1530530069226_3463170_371-3221-0 from training. Duration: 0.9 2023-10-02 23:18:57,392 WARNING [train.py:1204] (2/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_48-338976-0 from training. Duration: 0.89 2023-10-02 23:18:58,824 WARNING [train.py:1204] (2/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_1167-309744-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 23:19:03,601 WARNING [train.py:1204] (2/4) Exclude cut with ID _30327_210_16049_1_1532484161295_3924169_401-70819-0_sp0.9 from training. Duration: 0.9555625 2023-10-02 23:19:03,617 WARNING [train.py:1204] (2/4) Exclude cut with ID _55134_210_21982_1_1534590028377_3882170_346-91022-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 23:19:06,793 WARNING [train.py:1204] (2/4) Exclude cut with ID _14459_210_2228_1_1531443641946_7516700_174-118801-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 23:19:06,856 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_269-278006-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 23:19:08,158 WARNING [train.py:1204] (2/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_390-339137-0_sp0.9 from training. Duration: 0.62225 2023-10-02 23:19:10,904 WARNING [train.py:1204] (2/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_253-308377-0_sp1.1 from training. Duration: 0.4454375 2023-10-02 23:19:10,919 WARNING [train.py:1204] (2/4) Exclude cut with ID _23825_210_13449_1_1531915313526_3497150_240-271367-0_sp1.1 from training. Duration: 0.8818125 2023-10-02 23:19:11,043 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1015-343884-0_sp0.9 from training. Duration: 0.688875 2023-10-02 23:19:17,049 WARNING [train.py:1204] (2/4) Exclude cut with ID _23514_210_1389_1_1531832139785_3031439_335-48210-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 23:19:19,881 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_309-65177-0_sp1.1 from training. Duration: 0.8 2023-10-02 23:19:21,768 WARNING [train.py:1204] (2/4) Exclude cut with ID _14368_210_4220_1_1530532714870_3595529_231-137892-0 from training. Duration: 0.99 2023-10-02 23:19:24,535 WARNING [train.py:1204] (2/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_57-270037-0 from training. Duration: 0.77 2023-10-02 23:19:24,540 WARNING [train.py:1204] (2/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_465-214726-0_sp1.1 from training. Duration: 0.7818125 2023-10-02 23:19:25,798 WARNING [train.py:1204] (2/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_106-300985-0 from training. Duration: 0.9 2023-10-02 23:19:25,805 WARNING [train.py:1204] (2/4) Exclude cut with ID _14632_210_9988_1_1530691279392_3923200_161-129530-0_sp1.1 from training. Duration: 0.87275 2023-10-02 23:19:25,834 WARNING [train.py:1204] (2/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_514-88370-0_sp1.1 from training. Duration: 0.963625 2023-10-02 23:19:28,608 WARNING [train.py:1204] (2/4) Exclude cut with ID _56495_210_4234_1_1534748080334_4136219_305-334505-0_sp1.1 from training. Duration: 0.92725 2023-10-02 23:19:29,962 WARNING [train.py:1204] (2/4) Exclude cut with ID _42875_210_14856_1_1533704397487_4012433_61-264914-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 23:19:32,908 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.0.conv_module1.balancer2.min_positive, batch_count=1054880.0, ans=0.05 2023-10-02 23:19:34,760 WARNING [train.py:1204] (2/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_91-148831-0 from training. Duration: 0.99 2023-10-02 23:19:34,982 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=1054880.0, ans=0.1 2023-10-02 23:19:37,539 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_435-313305-0_sp0.9 from training. Duration: 0.788875 2023-10-02 23:19:38,921 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.3.encoder.layers.1.feed_forward3.out_whiten, num_groups=1, num_channels=512, metric=10.33 vs. limit=15.0 2023-10-02 23:19:39,533 WARNING [train.py:1204] (2/4) Exclude cut with ID _16744_210_5986_1_1531724405384_3907567_242-111316-0_sp1.1 from training. Duration: 0.8454375 2023-10-02 23:19:39,601 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_257-20733-0 from training. Duration: 0.76 2023-10-02 23:19:39,643 WARNING [train.py:1204] (2/4) Exclude cut with ID _8064_210_5399_1_1528372299291_4282973_4-91037-0_sp1.1 from training. Duration: 0.8 2023-10-02 23:19:42,215 WARNING [train.py:1204] (2/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_3-276701-0 from training. Duration: 0.91 2023-10-02 23:19:43,618 WARNING [train.py:1204] (2/4) Exclude cut with ID _3992_210_1747_1_1526644816031_3731060_36-147855-0_sp1.1 from training. Duration: 0.7090625 2023-10-02 23:19:43,697 WARNING [train.py:1204] (2/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_1296-298671-0_sp1.1 from training. Duration: 0.963625 2023-10-02 23:19:45,064 WARNING [train.py:1204] (2/4) Exclude cut with ID _68965_210_1883_1_1535698962272_3497490_309-334593-0_sp1.1 from training. Duration: 0.92725 2023-10-02 23:19:46,406 WARNING [train.py:1204] (2/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_128-296581-0 from training. Duration: 0.74 2023-10-02 23:19:46,406 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_17613_210_2151_1_1531137569_2366160_170-299079-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 23:19:46,408 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6287_210_3273_1_1527509506_2653716_182-199887-0_sp1.1 from training. Duration: 0.463625 2023-10-02 23:19:49,607 WARNING [train.py:1204] (2/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_80-135200-0_sp1.1 from training. Duration: 0.7181875 2023-10-02 23:19:51,041 WARNING [train.py:1204] (2/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_34-183685-0 from training. Duration: 0.98 2023-10-02 23:19:51,064 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8278_210_2550_1_1528637099_4220413_418-210155-0_sp1.1 from training. Duration: 0.92725 2023-10-02 23:19:51,068 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8853_210_3273_1_1528633899_818968_51-187685-0_sp0.9 from training. Duration: 0.7666875 2023-10-02 23:19:51,081 WARNING [train.py:1204] (2/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_332-221057-0_sp1.1 from training. Duration: 0.6181875 2023-10-02 23:19:52,962 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_2221_210_1988_1_1525597051_4935541_415-296882-0 from training. Duration: 0.77 2023-10-02 23:19:52,997 WARNING [train.py:1204] (2/4) Exclude cut with ID _33640_210_2655_1_1533117605479_4855469_770-278185-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 23:19:54,296 WARNING [train.py:1204] (2/4) Exclude cut with ID _5287_210_2084_1_1527418883040_3878670_333-48508-0_sp1.1 from training. Duration: 0.5454375 2023-10-02 23:19:54,359 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_43802_210_2290_1_1533516203_8829504_142-276839-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 23:19:57,176 WARNING [train.py:1204] (2/4) Exclude cut with ID _28394_210_8913_1_1532430049518_3650118_126-139371-0_sp1.1 from training. Duration: 0.92725 2023-10-02 23:19:57,198 WARNING [train.py:1204] (2/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_76-203867-0 from training. Duration: 0.72 2023-10-02 23:19:57,243 WARNING [train.py:1204] (2/4) Exclude cut with ID _6194_210_1534_1_1527511826160_3941194_14-282876-0_sp0.9 from training. Duration: 0.788875 2023-10-02 23:20:00,143 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.balancer2.prob, batch_count=1054946.6666666667, ans=0.125 2023-10-02 23:20:02,501 INFO [train.py:1046] (2/4) Epoch 30, batch 4200, loss[loss=0.1385, simple_loss=0.2155, pruned_loss=0.03073, over 24317.00 frames. ], tot_loss[loss=0.1647, simple_loss=0.2428, pruned_loss=0.04324, over 4697407.75 frames. ], batch size: 56, lr: 3.39e-03, grad_scale: 8.0 2023-10-02 23:20:02,564 WARNING [train.py:1204] (2/4) Exclude cut with ID _35355_210_16040_1_1533212988977_4016229_74-190076-0_sp1.1 from training. Duration: 0.72725 2023-10-02 23:20:04,310 WARNING [train.py:1204] (2/4) Exclude cut with ID _33920_210_4220_1_1533257851500_3084399_198-334063-0 from training. Duration: 0.94 2023-10-02 23:20:05,737 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6291_210_3273_1_1527683649_1078498_4-266348-0_sp0.9 from training. Duration: 0.8666875 2023-10-02 23:20:07,579 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_23856_210_4238_1_1531994785_3087934_122-38075-0_sp1.1 from training. Duration: 0.936375 2023-10-02 23:20:10,231 WARNING [train.py:1204] (2/4) Exclude cut with ID _4697_210_2069_1_1526815801953_3823098_276-96593-0_sp0.9 from training. Duration: 0.8555625 2023-10-02 23:20:10,293 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_308-278734-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 23:20:10,294 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_5190_210_4557_1_1527245078_2965646_353-250603-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 23:20:10,514 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.ff3_skip_rate, batch_count=1055013.3333333333, ans=0.0 2023-10-02 23:20:12,885 WARNING [train.py:1204] (2/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_472-216663-0 from training. Duration: 0.92 2023-10-02 23:20:14,503 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=1055013.3333333333, ans=0.1 2023-10-02 23:20:15,769 WARNING [train.py:1204] (2/4) Exclude cut with ID _7537_210_2084_1_1528502395431_3924359_14-312906-0 from training. Duration: 0.84 2023-10-02 23:20:17,515 WARNING [train.py:1204] (2/4) Exclude cut with ID _29003_210_19596_1_1532507391720_3894098_559-236660-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 23:20:18,298 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.2.encoder.layers.1.conv_module1.whiten, num_groups=1, num_channels=384, metric=7.50 vs. limit=15.0 2023-10-02 23:20:18,989 WARNING [train.py:1204] (2/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_312-204798-0_sp1.1 from training. Duration: 0.8454375 2023-10-02 23:20:21,322 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.0.layers.0.conv_module2.whiten, num_groups=1, num_channels=192, metric=9.46 vs. limit=15.0 2023-10-02 23:20:21,710 WARNING [train.py:1204] (2/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_17-309070-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 23:20:24,875 WARNING [train.py:1204] (2/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_344-152526-0_sp0.9 from training. Duration: 0.588875 2023-10-02 23:20:25,751 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.1.encoder.layers.0.self_attn2.whiten, num_groups=1, num_channels=256, metric=10.57 vs. limit=22.5 2023-10-02 23:20:26,319 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_41452_210_7291_1_1533437687_3941281_405-11559-0_sp1.1 from training. Duration: 0.82725 2023-10-02 23:20:26,352 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_48730_210_5399_1_1533885886_2212769_157-124052-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 23:20:26,596 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.attention_skip_rate, batch_count=1055080.0, ans=0.0 2023-10-02 23:20:27,768 WARNING [train.py:1204] (2/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_415-78792-0 from training. Duration: 0.81 2023-10-02 23:20:27,772 WARNING [train.py:1204] (2/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_345-340690-0_sp1.1 from training. Duration: 0.8454375 2023-10-02 23:20:29,238 WARNING [train.py:1204] (2/4) Exclude cut with ID _23825_210_13449_1_1531915313526_3497150_124-8138-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 23:20:29,292 WARNING [train.py:1204] (2/4) Exclude cut with ID _49258_210_7994_1_1534122017863_3738861_194-24864-0_sp1.1 from training. Duration: 0.936375 2023-10-02 23:20:29,318 WARNING [train.py:1204] (2/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_369-222060-0_sp1.1 from training. Duration: 0.6454375 2023-10-02 23:20:30,720 WARNING [train.py:1204] (2/4) Exclude cut with ID _12369_210_4381_1_1530356078581_4388520_480-13146-0_sp0.9 from training. Duration: 0.9444375 2023-10-02 23:20:33,401 WARNING [train.py:1204] (2/4) Exclude cut with ID _13253_210_1925_1_1530757775819_3577873_191-144977-0 from training. Duration: 0.78 2023-10-02 23:20:33,424 WARNING [train.py:1204] (2/4) Exclude cut with ID _30304_210_6319_1_1532476827261_7582060_1128-140266-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 23:20:38,144 WARNING [train.py:1204] (2/4) Exclude cut with ID _8772_210_1850_1_1528635595972_3664011_247-336448-0_sp0.9 from training. Duration: 0.82225 2023-10-02 23:20:40,074 WARNING [train.py:1204] (2/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_318-59097-0_sp1.1 from training. Duration: 0.6909375 2023-10-02 23:20:42,549 WARNING [train.py:1204] (2/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_540-131164-0_sp1.1 from training. Duration: 0.836375 2023-10-02 23:20:42,648 WARNING [train.py:1204] (2/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_39-110836-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 23:20:44,069 WARNING [train.py:1204] (2/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_860-96198-0_sp1.1 from training. Duration: 0.663625 2023-10-02 23:20:44,077 WARNING [train.py:1204] (2/4) Exclude cut with ID _15133_210_6228_1_1530794512074_3080379_29-264458-0 from training. Duration: 0.58 2023-10-02 23:20:45,366 WARNING [train.py:1204] (2/4) Exclude cut with ID _55209_210_1531_1_1534675828996_4597160_797-348343-0_sp1.1 from training. Duration: 0.963625 2023-10-02 23:20:45,483 WARNING [train.py:1204] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_23-189234-0_sp1.1 from training. Duration: 0.7545625 2023-10-02 23:20:51,454 WARNING [train.py:1204] (2/4) Exclude cut with ID _5410_210_1388_1_1527397055795_1719020_39-112221-0_sp1.1 from training. Duration: 0.62725 2023-10-02 23:20:52,792 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_43802_210_2290_1_1533516203_8829504_100-348441-0_sp1.1 from training. Duration: 0.82725 2023-10-02 23:20:58,885 WARNING [train.py:1204] (2/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_143-86344-0_sp1.1 from training. Duration: 0.7 2023-10-02 23:21:01,792 WARNING [train.py:1204] (2/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_126-29104-0 from training. Duration: 0.85 2023-10-02 23:21:03,187 WARNING [train.py:1204] (2/4) Exclude cut with ID _39846_210_12651_1_1533603651992_1318030_138-31771-0_sp1.1 from training. Duration: 0.963625 2023-10-02 23:21:07,934 WARNING [train.py:1204] (2/4) Exclude cut with ID _2220_210_1819_1_1525509109898_3658680_50-99837-0_sp1.1 from training. Duration: 0.4909375 2023-10-02 23:21:09,202 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.479e+02 1.983e+02 2.287e+02 2.650e+02 3.812e+02, threshold=4.574e+02, percent-clipped=0.0 2023-10-02 23:21:09,312 WARNING [train.py:1204] (2/4) Exclude cut with ID _6274_210_2084_1_1527937583837_7494020_836-210550-0_sp1.1 from training. Duration: 0.97275 2023-10-02 23:21:11,955 WARNING [train.py:1204] (2/4) Exclude cut with ID _28323_210_16187_1_1532480395667_4100300_306-74561-0 from training. Duration: 0.94 2023-10-02 23:21:16,271 INFO [train.py:1046] (2/4) Epoch 30, batch 4250, loss[loss=0.1616, simple_loss=0.2473, pruned_loss=0.03799, over 23929.00 frames. ], tot_loss[loss=0.1643, simple_loss=0.2425, pruned_loss=0.04303, over 4712307.18 frames. ], batch size: 86, lr: 3.39e-03, grad_scale: 8.0 2023-10-02 23:21:16,381 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_177-58005-0_sp1.1 from training. Duration: 0.563625 2023-10-02 23:21:19,699 WARNING [train.py:1204] (2/4) Exclude cut with ID _54871_210_22356_1_1534665798253_3856359_19-342632-0_sp1.1 from training. Duration: 0.82725 2023-10-02 23:21:19,718 WARNING [train.py:1204] (2/4) Exclude cut with ID _35355_210_16040_1_1533212988977_4016229_74-190076-0_sp0.9 from training. Duration: 0.888875 2023-10-02 23:21:21,312 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.feed_forward1.out_proj.dropout_p, batch_count=1055346.6666666667, ans=0.1 2023-10-02 23:21:22,797 WARNING [train.py:1204] (2/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_285-97786-0_sp1.1 from training. Duration: 0.97275 2023-10-02 23:21:26,988 WARNING [train.py:1204] (2/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_337-181502-0_sp0.9 from training. Duration: 0.72225 2023-10-02 23:21:27,230 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.feed_forward1.out_proj.dropout_p, batch_count=1055346.6666666667, ans=0.1 2023-10-02 23:21:28,283 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_353-182855-0 from training. Duration: 0.96 2023-10-02 23:21:28,317 WARNING [train.py:1204] (2/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_409-327730-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 23:21:29,885 WARNING [train.py:1204] (2/4) Exclude cut with ID _8030_210_7497_1_1528444886781_3531089_373-19403-0_sp1.1 from training. Duration: 0.97275 2023-10-02 23:21:31,415 WARNING [train.py:1204] (2/4) Exclude cut with ID _6789_210_1534_1_1527857937218_3118950_231-340237-0_sp1.1 from training. Duration: 0.936375 2023-10-02 23:21:37,146 WARNING [train.py:1204] (2/4) Exclude cut with ID _6493_210_1747_1_1528459210424_4012470_536-57832-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 23:21:37,167 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7046_210_6753_1_1527895337_6920917_756-116002-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 23:21:39,879 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_37152_210_9697_1_1533106573_3844188_29-239070-0_sp1.1 from training. Duration: 0.8909375 2023-10-02 23:21:39,882 WARNING [train.py:1204] (2/4) Exclude cut with ID _19023_210_4353_1_1531378892552_5820780_61-334539-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 23:21:41,316 WARNING [train.py:1204] (2/4) Exclude cut with ID _14170_210_5219_1_1530525949868_4469650_159-28488-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 23:21:42,684 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_40896_210_15710_1_1533433956_7100802_764-71272-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 23:21:42,756 WARNING [train.py:1204] (2/4) Exclude cut with ID _44902_210_16187_1_1533888006062_3792078_258-178274-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 23:21:44,311 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.0.conv_module1.balancer1.prob, batch_count=1055480.0, ans=0.125 2023-10-02 23:21:45,505 WARNING [train.py:1204] (2/4) Exclude cut with ID _7537_210_2084_1_1528502395431_3924359_14-312906-0_sp1.1 from training. Duration: 0.763625 2023-10-02 23:21:45,681 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.conv_module1.balancer2.prob, batch_count=1055480.0, ans=0.125 2023-10-02 23:21:47,466 WARNING [train.py:1204] (2/4) Exclude cut with ID _49643_210_7989_1_1534640244224_3939031_674-116639-0_sp1.1 from training. Duration: 0.963625 2023-10-02 23:21:48,887 WARNING [train.py:1204] (2/4) Exclude cut with ID _22826_210_9994_1_1531908201522_3279169_415-161-0 from training. Duration: 0.98 2023-10-02 23:21:52,200 WARNING [train.py:1204] (2/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_264-348896-0 from training. Duration: 0.85 2023-10-02 23:21:52,207 WARNING [train.py:1204] (2/4) Exclude cut with ID _51899_210_20034_1_1534129254466_3669220_233-161492-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 23:21:52,272 WARNING [train.py:1204] (2/4) Exclude cut with ID _31255_210_13427_1_1532865619495_3815143_233-200023-0_sp1.1 from training. Duration: 0.936375 2023-10-02 23:21:52,297 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_23850_210_4238_1_1532232597_1632433_100-229879-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 23:21:55,046 WARNING [train.py:1204] (2/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_375-245677-0_sp0.9 from training. Duration: 0.9 2023-10-02 23:21:55,049 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_40223_210_6228_1_1533298404_4812267_355-317415-0_sp1.1 from training. Duration: 0.97275 2023-10-02 23:21:55,095 WARNING [train.py:1204] (2/4) Exclude cut with ID _9679_210_7497_1_1529129212906_4363929_235-81488-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 23:21:57,855 WARNING [train.py:1204] (2/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_172-215932-0_sp1.1 from training. Duration: 0.6 2023-10-02 23:21:57,909 WARNING [train.py:1204] (2/4) Exclude cut with ID _12369_210_4381_1_1530356078581_4388520_64-298917-0_sp1.1 from training. Duration: 0.8 2023-10-02 23:22:03,174 WARNING [train.py:1204] (2/4) Exclude cut with ID _38020_210_5986_1_1533112252259_7006089_131-53533-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 23:22:04,554 WARNING [train.py:1204] (2/4) Exclude cut with ID _42334_210_7770_1_1534206610396_3605880_392-48154-0_sp1.1 from training. Duration: 0.963625 2023-10-02 23:22:05,925 WARNING [train.py:1204] (2/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_337-181502-0 from training. Duration: 0.65 2023-10-02 23:22:05,932 WARNING [train.py:1204] (2/4) Exclude cut with ID _6267_210_2084_1_1527764658526_3894730_375-219887-0_sp0.9 from training. Duration: 0.7444375 2023-10-02 23:22:07,880 WARNING [train.py:1204] (2/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_60-146189-0 from training. Duration: 0.98 2023-10-02 23:22:09,309 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_3133_210_2087_1_1526106446_2324690_243-143119-0_sp1.1 from training. Duration: 0.7 2023-10-02 23:22:11,302 WARNING [train.py:1204] (2/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_1267-272693-0_sp0.9 from training. Duration: 0.811125 2023-10-02 23:22:12,736 WARNING [train.py:1204] (2/4) Exclude cut with ID _69884_210_9771_1_1535846392818_4166169_83-182645-0_sp1.1 from training. Duration: 0.97275 2023-10-02 23:22:12,763 WARNING [train.py:1204] (2/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_403-292260-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 23:22:12,955 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.nonlin_attention.balancer.prob, batch_count=1055546.6666666667, ans=0.125 2023-10-02 23:22:15,586 WARNING [train.py:1204] (2/4) Exclude cut with ID _55429_210_10626_1_1534467588527_7174143_377-169391-0 from training. Duration: 0.96 2023-10-02 23:22:15,771 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.1.bypass_mid.scale_min, batch_count=1055613.3333333333, ans=0.2 2023-10-02 23:22:16,986 WARNING [train.py:1204] (2/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_494-199538-0_sp1.1 from training. Duration: 0.6818125 2023-10-02 23:22:17,179 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.conv_skip_rate, batch_count=1055613.3333333333, ans=0.0 2023-10-02 23:22:18,844 WARNING [train.py:1204] (2/4) Exclude cut with ID _3067_210_2667_1_1526212974640_4503168_119-175776-0_sp1.1 from training. Duration: 0.67275 2023-10-02 23:22:22,263 WARNING [train.py:1204] (2/4) Exclude cut with ID _3231_210_2106_1_1526205864934_6784220_183-303839-0_sp1.1 from training. Duration: 0.97275 2023-10-02 23:22:25,021 WARNING [train.py:1204] (2/4) Exclude cut with ID _18513_210_9206_1_1531618215223_3862620_165-38958-0_sp1.1 from training. Duration: 0.963625 2023-10-02 23:22:26,401 WARNING [train.py:1204] (2/4) Exclude cut with ID _41314_210_12651_1_1533459476543_3766460_87-272068-0_sp1.1 from training. Duration: 0.7545625 2023-10-02 23:22:27,728 WARNING [train.py:1204] (2/4) Exclude cut with ID _3541_210_2477_1_1526475408709_3263973_353-95-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 23:22:29,155 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_2009_210_2071_1_1525481134_4588937_73-302813-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 23:22:30,515 INFO [train.py:1046] (2/4) Epoch 30, batch 4300, loss[loss=0.1568, simple_loss=0.2351, pruned_loss=0.03921, over 24435.00 frames. ], tot_loss[loss=0.1634, simple_loss=0.2417, pruned_loss=0.04258, over 4727989.54 frames. ], batch size: 58, lr: 3.39e-03, grad_scale: 8.0 2023-10-02 23:22:30,612 WARNING [train.py:1204] (2/4) Exclude cut with ID _64220_210_9527_1_1535250151772_4875259_88-95011-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 23:22:31,926 WARNING [train.py:1204] (2/4) Exclude cut with ID _44902_210_16187_1_1533888006062_3792078_160-12107-0_sp1.1 from training. Duration: 0.92725 2023-10-02 23:22:31,941 WARNING [train.py:1204] (2/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_547-5424-0 from training. Duration: 0.94 2023-10-02 23:22:33,388 WARNING [train.py:1204] (2/4) Exclude cut with ID _21783_210_11357_1_1531742458730_3762679_268-85656-0_sp1.1 from training. Duration: 0.936375 2023-10-02 23:22:33,529 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=1055680.0, ans=0.1 2023-10-02 23:22:38,799 WARNING [train.py:1204] (2/4) Exclude cut with ID _66210_210_12651_1_1535446812922_3365850_42-284985-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 23:22:38,838 WARNING [train.py:1204] (2/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_465-214726-0_sp0.9 from training. Duration: 0.9555625 2023-10-02 23:22:42,368 WARNING [train.py:1204] (2/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_50-95770-0_sp1.1 from training. Duration: 0.936375 2023-10-02 23:22:49,471 WARNING [train.py:1204] (2/4) Exclude cut with ID _30800_210_22382_1_1532499347904_4515959_269-77567-0_sp1.1 from training. Duration: 0.963625 2023-10-02 23:22:49,473 WARNING [train.py:1204] (2/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_7-111954-0 from training. Duration: 0.56 2023-10-02 23:22:51,334 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_43802_210_2290_1_1533516203_8829504_85-195240-0_sp0.9 from training. Duration: 0.9666875 2023-10-02 23:22:54,594 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_2822_210_2715_1_1526112586_1999587_165-36656-0_sp0.9 from training. Duration: 0.988875 2023-10-02 23:22:54,610 WARNING [train.py:1204] (2/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_181-78632-0_sp1.1 from training. Duration: 0.7090625 2023-10-02 23:22:54,627 WARNING [train.py:1204] (2/4) Exclude cut with ID IC0671W0044-331887 from training. Duration: 0.896 2023-10-02 23:22:58,763 WARNING [train.py:1204] (2/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_266-137104-0_sp1.1 from training. Duration: 0.5909375 2023-10-02 23:23:00,163 WARNING [train.py:1204] (2/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_137-116442-0_sp0.9 from training. Duration: 0.8666875 2023-10-02 23:23:02,875 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_23801_210_16223_1_1532066392_3294657_570-5439-0 from training. Duration: 0.97 2023-10-02 23:23:02,889 WARNING [train.py:1204] (2/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_29-280622-0_sp1.1 from training. Duration: 0.7454375 2023-10-02 23:23:02,917 WARNING [train.py:1204] (2/4) Exclude cut with ID 3557-8342-0013-54691-0 from training. Duration: 0.92 2023-10-02 23:23:05,587 WARNING [train.py:1204] (2/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_711-15688-0_sp1.1 from training. Duration: 0.3909375 2023-10-02 23:23:06,885 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_15126_210_6753_1_1530778677_2591185_120-277707-0_sp1.1 from training. Duration: 0.72725 2023-10-02 23:23:09,609 WARNING [train.py:1204] (2/4) Exclude cut with ID _14970_210_15709_1_1530775547361_1966965_8-169924-0_sp1.1 from training. Duration: 0.836375 2023-10-02 23:23:09,612 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_4692_210_3528_1_1526899755_4323254_107-44401-0_sp0.9 from training. Duration: 0.9555625 2023-10-02 23:23:09,699 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_3475_210_2028_1_1526193578_3622569_265-276625-0_sp0.9 from training. Duration: 0.8444375 2023-10-02 23:23:12,798 WARNING [train.py:1204] (2/4) Exclude cut with ID _55153_210_2253_1_1534384786677_3162096_148-79022-0_sp1.1 from training. Duration: 0.87275 2023-10-02 23:23:12,889 WARNING [train.py:1204] (2/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_292-36336-0_sp1.1 from training. Duration: 0.92725 2023-10-02 23:23:12,898 WARNING [train.py:1204] (2/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_329-82235-0 from training. Duration: 0.82 2023-10-02 23:23:14,800 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_11305_210_1794_1_1529750428_7178148_257-312870-0 from training. Duration: 0.88 2023-10-02 23:23:16,227 WARNING [train.py:1204] (2/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_436-4493-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 23:23:16,483 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.ff3_skip_rate, batch_count=1055880.0, ans=0.0 2023-10-02 23:23:20,303 WARNING [train.py:1204] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_321-54129-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 23:23:20,317 WARNING [train.py:1204] (2/4) Exclude cut with ID _5287_210_2084_1_1527418883040_3878670_333-48508-0_sp0.9 from training. Duration: 0.6666875 2023-10-02 23:23:20,343 WARNING [train.py:1204] (2/4) Exclude cut with ID _26528_210_9070_1_1532570410842_3851310_244-278158-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 23:23:22,122 WARNING [train.py:1204] (2/4) Exclude cut with ID _46952_210_5986_1_1534230010357_3107379_232-344503-0_sp1.1 from training. Duration: 0.87275 2023-10-02 23:23:22,139 WARNING [train.py:1204] (2/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_43-206757-0 from training. Duration: 0.75 2023-10-02 23:23:22,141 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_4183_210_2087_1_1526729100_1869838_141-166600-0 from training. Duration: 0.95 2023-10-02 23:23:23,446 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_296-287678-0 from training. Duration: 0.95 2023-10-02 23:23:23,523 WARNING [train.py:1204] (2/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_265-60343-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 23:23:23,548 WARNING [train.py:1204] (2/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_332-256168-0 from training. Duration: 0.75 2023-10-02 23:23:23,774 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.feed_forward1.out_proj.dropout_p, batch_count=1055880.0, ans=0.1 2023-10-02 23:23:24,859 WARNING [train.py:1204] (2/4) Exclude cut with ID _13679_210_7497_1_1530534293662_3355098_411-186088-0 from training. Duration: 0.94 2023-10-02 23:23:27,849 WARNING [train.py:1204] (2/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_458-126779-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 23:23:27,970 WARNING [train.py:1204] (2/4) Exclude cut with ID IC0803W0380-397572 from training. Duration: 0.032 2023-10-02 23:23:29,943 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_278-272612-0_sp1.1 from training. Duration: 0.663625 2023-10-02 23:23:32,666 WARNING [train.py:1204] (2/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_491-263101-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 23:23:32,678 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_53555_210_5632_1_1534595220_7232297_334-336778-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 23:23:35,320 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_50-3289-0 from training. Duration: 0.91 2023-10-02 23:23:36,592 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.619e+02 1.805e+02 2.042e+02 2.369e+02 3.407e+02, threshold=4.083e+02, percent-clipped=0.0 2023-10-02 23:23:36,697 WARNING [train.py:1204] (2/4) Exclude cut with ID _8699_210_2069_1_1528630346831_3960409_155-39766-0_sp0.9 from training. Duration: 0.8666875 2023-10-02 23:23:36,701 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_502-82145-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 23:23:36,751 WARNING [train.py:1204] (2/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_550-43132-0_sp1.1 from training. Duration: 0.97275 2023-10-02 23:23:38,104 WARNING [train.py:1204] (2/4) Exclude cut with ID _14998_210_5219_1_1530705582080_7474970_446-112117-0_sp1.1 from training. Duration: 0.8181875 2023-10-02 23:23:38,172 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_3691_210_3528_1_1526468169_4062053_355-334432-0_sp1.1 from training. Duration: 0.7909375 2023-10-02 23:23:39,595 WARNING [train.py:1204] (2/4) Exclude cut with ID _58605_210_12210_1_1534723195939_3904259_239-94720-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 23:23:42,490 WARNING [train.py:1204] (2/4) Exclude cut with ID _49376_210_5986_1_1533985278732_3370740_391-344072-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 23:23:42,570 WARNING [train.py:1204] (2/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_558-322014-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 23:23:42,609 WARNING [train.py:1204] (2/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_256-32662-0_sp1.1 from training. Duration: 0.8181875 2023-10-02 23:23:42,795 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.feed_forward1.out_proj.dropout_p, batch_count=1056013.3333333333, ans=0.1 2023-10-02 23:23:44,281 INFO [train.py:1046] (2/4) Epoch 30, batch 4350, loss[loss=0.163, simple_loss=0.2369, pruned_loss=0.0446, over 23466.00 frames. ], tot_loss[loss=0.1635, simple_loss=0.242, pruned_loss=0.04247, over 4737982.43 frames. ], batch size: 134, lr: 3.39e-03, grad_scale: 8.0 2023-10-02 23:23:50,336 WARNING [train.py:1204] (2/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_572-185150-0 from training. Duration: 0.97 2023-10-02 23:23:50,372 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_89-35748-0_sp0.9 from training. Duration: 0.87775 2023-10-02 23:23:56,476 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6291_210_3273_1_1527683649_1078498_88-66747-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 23:23:57,918 WARNING [train.py:1204] (2/4) Exclude cut with ID _19504_210_9994_1_1531648863242_3325220_357-237029-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 23:23:59,380 WARNING [train.py:1204] (2/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_302-2550-0_sp1.1 from training. Duration: 0.736375 2023-10-02 23:23:59,381 WARNING [train.py:1204] (2/4) Exclude cut with ID _9461_210_4557_1_1529060514726_3677451_59-194017-0_sp1.1 from training. Duration: 0.963625 2023-10-02 23:24:05,386 WARNING [train.py:1204] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_665-196155-0_sp1.1 from training. Duration: 0.6090625 2023-10-02 23:24:07,643 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.3.encoder.layers.0.nonlin_attention.whiten2, num_groups=1, num_channels=512, metric=6.69 vs. limit=15.0 2023-10-02 23:24:08,200 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_326-315007-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 23:24:09,685 WARNING [train.py:1204] (2/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_105-328174-0_sp1.1 from training. Duration: 0.6909375 2023-10-02 23:24:09,694 WARNING [train.py:1204] (2/4) Exclude cut with ID _56494_210_4220_1_1535284837625_3033169_106-263722-0_sp1.1 from training. Duration: 0.97275 2023-10-02 23:24:12,372 WARNING [train.py:1204] (2/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_770-200430-0_sp0.9 from training. Duration: 0.988875 2023-10-02 23:24:13,805 WARNING [train.py:1204] (2/4) Exclude cut with ID _65640_210_6310_1_1535368201445_3173250_209-13008-0_sp1.1 from training. Duration: 0.9 2023-10-02 23:24:15,807 WARNING [train.py:1204] (2/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_212-149947-0_sp0.9 from training. Duration: 0.811125 2023-10-02 23:24:16,034 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.feed_forward3.hidden_balancer.prob, batch_count=1056146.6666666667, ans=0.125 2023-10-02 23:24:16,052 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.ff3_skip_rate, batch_count=1056146.6666666667, ans=0.0 2023-10-02 23:24:20,029 WARNING [train.py:1204] (2/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_188-171673-0 from training. Duration: 0.58 2023-10-02 23:24:21,966 WARNING [train.py:1204] (2/4) Exclude cut with ID _58895_210_8750_1_1535184081320_3649089_286-281724-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 23:24:23,301 WARNING [train.py:1204] (2/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_16-254909-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 23:24:25,359 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.3.encoder.layers.2.nonlin_attention.whiten2, num_groups=1, num_channels=512, metric=5.92 vs. limit=15.0 2023-10-02 23:24:26,151 WARNING [train.py:1204] (2/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_52-95655-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 23:24:28,947 WARNING [train.py:1204] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_554-45617-0 from training. Duration: 0.99 2023-10-02 23:24:33,475 WARNING [train.py:1204] (2/4) Exclude cut with ID _14683_210_5219_1_1530698352608_3974859_473-344611-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 23:24:33,652 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.1.feed_forward1.hidden_balancer.prob, batch_count=1056213.3333333333, ans=0.125 2023-10-02 23:24:34,824 WARNING [train.py:1204] (2/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_17-25045-0_sp0.9 from training. Duration: 0.6555625 2023-10-02 23:24:38,959 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9072W0003-530637 from training. Duration: 0.768 2023-10-02 23:24:40,405 WARNING [train.py:1204] (2/4) Exclude cut with ID _38670_210_15379_1_1533448810659_4391059_76-74025-0_sp1.1 from training. Duration: 0.936375 2023-10-02 23:24:41,798 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6291_210_3273_1_1527683649_1078498_74-292608-0_sp0.9 from training. Duration: 0.8 2023-10-02 23:24:41,865 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9084W0025-536862 from training. Duration: 0.896 2023-10-02 23:24:41,929 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9087W0021-538392 from training. Duration: 0.768 2023-10-02 23:24:41,934 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7543_210_6753_1_1528543323_645515_56-163234-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 23:24:43,265 WARNING [train.py:1204] (2/4) Exclude cut with ID _45560_210_21982_1_1533812603951_3584999_44-332643-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 23:24:43,348 WARNING [train.py:1204] (2/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_1285-256622-0_sp1.1 from training. Duration: 0.763625 2023-10-02 23:24:43,585 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.self_attn_weights.pos_emb_skip_rate, batch_count=1056280.0, ans=0.0 2023-10-02 23:24:44,650 WARNING [train.py:1204] (2/4) Exclude cut with ID _52781_210_16187_1_1534297729099_1118149_141-325632-0_sp1.1 from training. Duration: 0.936375 2023-10-02 23:24:46,420 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_40896_210_15710_1_1533433956_7100802_1051-137631-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 23:24:47,101 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_13091_210_7925_1_1530609447_8987354_233-18333-0_sp1.1 from training. Duration: 0.97275 2023-10-02 23:24:50,994 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_34008_210_10431_1_1533345424_8183247_662-317331-0 from training. Duration: 0.94 2023-10-02 23:24:51,010 WARNING [train.py:1204] (2/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_291-248920-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 23:24:51,012 WARNING [train.py:1204] (2/4) Exclude cut with ID _65263_210_22356_1_1535763598691_3921037_274-288481-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 23:24:51,035 WARNING [train.py:1204] (2/4) Exclude cut with ID _5189_210_4557_1_1527248319428_3769229_233-168080-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 23:24:51,088 WARNING [train.py:1204] (2/4) Exclude cut with ID _54871_210_22356_1_1534665798253_3856359_135-184281-0 from training. Duration: 0.99 2023-10-02 23:24:51,308 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.nonlin_attention.balancer.prob, batch_count=1056280.0, ans=0.125 2023-10-02 23:24:53,113 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9123W0012-555972 from training. Duration: 0.896 2023-10-02 23:24:53,126 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9123W0118-556077 from training. Duration: 0.896 2023-10-02 23:24:53,135 WARNING [train.py:1204] (2/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_406-275649-0 from training. Duration: 0.42 2023-10-02 23:24:55,886 WARNING [train.py:1204] (2/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_309-13684-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 23:24:55,920 WARNING [train.py:1204] (2/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_129-337802-0_sp1.1 from training. Duration: 0.6545625 2023-10-02 23:24:55,943 WARNING [train.py:1204] (2/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_649-266825-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 23:24:57,278 WARNING [train.py:1204] (2/4) Exclude cut with ID _40519_210_13794_1_1533715228655_7337390_895-224742-0_sp1.1 from training. Duration: 0.92725 2023-10-02 23:24:58,423 INFO [train.py:1046] (2/4) Epoch 30, batch 4400, loss[loss=0.1753, simple_loss=0.2501, pruned_loss=0.05024, over 23813.00 frames. ], tot_loss[loss=0.1644, simple_loss=0.2433, pruned_loss=0.04278, over 4743049.16 frames. ], batch size: 164, lr: 3.39e-03, grad_scale: 16.0 2023-10-02 23:24:59,933 WARNING [train.py:1204] (2/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_234-14174-0 from training. Duration: 0.82 2023-10-02 23:25:01,287 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9156W0009-572502 from training. Duration: 0.768 2023-10-02 23:25:01,294 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_424-245529-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 23:25:04,763 WARNING [train.py:1204] (2/4) Exclude cut with ID _26528_210_9070_1_1532570410842_3851310_232-10149-0_sp1.1 from training. Duration: 0.963625 2023-10-02 23:25:04,771 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_17292_210_6753_1_1531125388_2943734_152-30410-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 23:25:06,173 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_35441_210_16554_1_1533347572_4092680_291-82075-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 23:25:07,491 WARNING [train.py:1204] (2/4) Exclude cut with ID _12369_210_4381_1_1530356078581_4388520_64-298917-0 from training. Duration: 0.88 2023-10-02 23:25:07,519 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_5618_210_1874_1_1527311008_5851_1-214501-0_sp0.9 from training. Duration: 0.7655625 2023-10-02 23:25:07,546 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_9460_210_4211_1_1528954587_10049667_572-37035-0 from training. Duration: 0.8 2023-10-02 23:25:07,569 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9193W0023-590322 from training. Duration: 0.896 2023-10-02 23:25:08,893 WARNING [train.py:1204] (2/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_206-10097-0_sp1.1 from training. Duration: 0.4454375 2023-10-02 23:25:08,906 WARNING [train.py:1204] (2/4) Exclude cut with ID _55134_210_21982_1_1534590028377_3882170_17-51731-0_sp1.1 from training. Duration: 0.963625 2023-10-02 23:25:11,595 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_14953_210_2151_1_1530701710_5616165_710-165400-0 from training. Duration: 0.87 2023-10-02 23:25:12,977 WARNING [train.py:1204] (2/4) Exclude cut with ID _53203_210_2087_1_1534246218309_475590_61-85777-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 23:25:13,105 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.0.conv_skip_rate, batch_count=1056413.3333333333, ans=0.0 2023-10-02 23:25:14,753 WARNING [train.py:1204] (2/4) Exclude cut with ID _55844_210_2346_1_1534471212569_7625707_1050-10453-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 23:25:14,763 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9218W0013-603297 from training. Duration: 0.896 2023-10-02 23:25:19,703 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_20120_210_6753_1_1531643035_5482859_267-102818-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 23:25:19,703 WARNING [train.py:1204] (2/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_191-41023-0 from training. Duration: 0.71 2023-10-02 23:25:19,747 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9231W0202-610227 from training. Duration: 0.896 2023-10-02 23:25:21,458 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.balancer1.prob, batch_count=1056413.3333333333, ans=0.125 2023-10-02 23:25:22,563 WARNING [train.py:1204] (2/4) Exclude cut with ID _6537_210_1883_1_1527987559710_3597129_382-133062-0 from training. Duration: 0.68 2023-10-02 23:25:24,296 WARNING [train.py:1204] (2/4) Exclude cut with ID _28427_210_4220_1_1532581240967_3061069_43-84928-0 from training. Duration: 0.99 2023-10-02 23:25:24,337 WARNING [train.py:1204] (2/4) Exclude cut with ID _59256_210_5986_1_1535076085997_3589120_121-237386-0 from training. Duration: 0.96 2023-10-02 23:25:24,357 WARNING [train.py:1204] (2/4) Exclude cut with ID _8480_210_1874_1_1528589391205_6728330_137-256986-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 23:25:25,694 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_9417_210_1794_1_1529235261_4614131_479-346963-0_sp1.1 from training. Duration: 0.936375 2023-10-02 23:25:26,979 WARNING [train.py:1204] (2/4) Exclude cut with ID _26474_210_9570_1_1532520022699_3623380_267-7362-0_sp1.1 from training. Duration: 0.936375 2023-10-02 23:25:27,035 WARNING [train.py:1204] (2/4) Exclude cut with ID _55328_210_27767_1_1534580973260_4088170_287-61272-0_sp1.1 from training. Duration: 0.97275 2023-10-02 23:25:27,263 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.conv_module1.balancer2.min_abs, batch_count=1056480.0, ans=0.5 2023-10-02 23:25:28,466 WARNING [train.py:1204] (2/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_589-9070-0 from training. Duration: 0.71 2023-10-02 23:25:28,473 WARNING [train.py:1204] (2/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_148-158509-0_sp1.1 from training. Duration: 0.37275 2023-10-02 23:25:29,871 WARNING [train.py:1204] (2/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_164-184548-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 23:25:32,976 WARNING [train.py:1204] (2/4) Exclude cut with ID _14970_210_15709_1_1530775547361_1966965_6-23328-0_sp1.1 from training. Duration: 0.7818125 2023-10-02 23:25:32,983 WARNING [train.py:1204] (2/4) Exclude cut with ID _29303_210_2575_1_1532392237303_7410649_612-165069-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 23:25:34,364 WARNING [train.py:1204] (2/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_383-321224-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 23:25:34,398 WARNING [train.py:1204] (2/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_283-160525-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 23:25:34,414 WARNING [train.py:1204] (2/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_505-97987-0 from training. Duration: 0.86 2023-10-02 23:25:35,774 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9309W0190-640317 from training. Duration: 0.768 2023-10-02 23:25:40,082 WARNING [train.py:1204] (2/4) Exclude cut with ID _50093_210_4220_1_1534417141207_3432299_280-297390-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 23:25:45,625 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_17292_210_6753_1_1531125388_2943734_320-71385-0_sp1.1 from training. Duration: 0.97275 2023-10-02 23:25:48,855 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_213-103282-0 from training. Duration: 0.78 2023-10-02 23:25:53,432 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7615_210_4211_1_1528110965_4580160_72-235211-0_sp0.9 from training. Duration: 0.8444375 2023-10-02 23:25:56,586 WARNING [train.py:1204] (2/4) Exclude cut with ID _14867_210_1968_1_1530684179890_3846969_173-320977-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 23:25:58,056 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7046_210_6753_1_1527895337_6920917_783-278109-0_sp0.9 from training. Duration: 0.8555625 2023-10-02 23:25:58,099 WARNING [train.py:1204] (2/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_90-30673-0 from training. Duration: 0.84 2023-10-02 23:25:58,113 WARNING [train.py:1204] (2/4) Exclude cut with ID _22826_210_9994_1_1531908201522_3279169_415-161-0_sp1.1 from training. Duration: 0.8909375 2023-10-02 23:25:58,123 WARNING [train.py:1204] (2/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_86-66192-0_sp0.9 from training. Duration: 0.611125 2023-10-02 23:25:58,128 WARNING [train.py:1204] (2/4) Exclude cut with ID _4626_210_3532_1_1526817652717_3916859_446-206656-0_sp1.1 from training. Duration: 0.6909375 2023-10-02 23:25:59,471 WARNING [train.py:1204] (2/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_166-128941-0_sp0.9 from training. Duration: 0.6 2023-10-02 23:26:02,220 WARNING [train.py:1204] (2/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_345-167986-0 from training. Duration: 0.84 2023-10-02 23:26:05,410 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.580e+02 1.891e+02 2.096e+02 2.482e+02 3.996e+02, threshold=4.193e+02, percent-clipped=0.0 2023-10-02 23:26:05,552 WARNING [train.py:1204] (2/4) Exclude cut with ID _6275_210_2084_1_1528110062213_7669049_973-198085-0 from training. Duration: 0.75 2023-10-02 23:26:06,831 WARNING [train.py:1204] (2/4) Exclude cut with ID _60057_210_10640_1_1534903065793_3333150_171-181133-0 from training. Duration: 0.88 2023-10-02 23:26:06,844 WARNING [train.py:1204] (2/4) Exclude cut with ID _19504_210_9994_1_1531648863242_3325220_336-196513-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 23:26:06,851 WARNING [train.py:1204] (2/4) Exclude cut with ID _6493_210_1747_1_1528459210424_4012470_513-140967-0 from training. Duration: 0.79 2023-10-02 23:26:08,232 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6673_210_3273_1_1527767790_3173240_327-306185-0_sp0.9 from training. Duration: 0.92225 2023-10-02 23:26:10,940 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1032-236863-0_sp1.1 from training. Duration: 0.763625 2023-10-02 23:26:12,189 INFO [train.py:1046] (2/4) Epoch 30, batch 4450, loss[loss=0.1845, simple_loss=0.2563, pruned_loss=0.05635, over 23729.00 frames. ], tot_loss[loss=0.1647, simple_loss=0.2437, pruned_loss=0.04287, over 4745003.73 frames. ], batch size: 232, lr: 3.39e-03, grad_scale: 16.0 2023-10-02 23:26:13,616 WARNING [train.py:1204] (2/4) Exclude cut with ID _14867_210_1968_1_1530684179890_3846969_413-29292-0 from training. Duration: 0.9 2023-10-02 23:26:16,455 WARNING [train.py:1204] (2/4) Exclude cut with ID _47251_210_18628_1_1533814063128_4137250_186-343322-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 23:26:16,709 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.feed_forward1.out_proj.dropout_p, batch_count=1056680.0, ans=0.1 2023-10-02 23:26:18,367 WARNING [train.py:1204] (2/4) Exclude cut with ID _56235_210_10640_1_1534557335235_4161099_624-46091-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 23:26:18,413 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_791-197891-0_sp0.9 from training. Duration: 0.9333125 2023-10-02 23:26:27,575 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_50050_210_16223_1_1535435796_7290138_786-288457-0_sp1.1 from training. Duration: 0.92725 2023-10-02 23:26:27,599 WARNING [train.py:1204] (2/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_213-227283-0_sp1.1 from training. Duration: 0.963625 2023-10-02 23:26:29,148 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.self_attn_weights.pos_emb_skip_rate, batch_count=1056746.6666666667, ans=0.0 2023-10-02 23:26:31,723 WARNING [train.py:1204] (2/4) Exclude cut with ID _42659_210_2070_1_1533607442765_3120380_58-341121-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 23:26:33,236 WARNING [train.py:1204] (2/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_506-23200-0_sp1.1 from training. Duration: 0.8818125 2023-10-02 23:26:33,402 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.conv_module2.balancer1.prob, batch_count=1056746.6666666667, ans=0.125 2023-10-02 23:26:34,112 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.0.layers.1.feed_forward3.out_whiten, num_groups=1, num_channels=192, metric=9.24 vs. limit=15.0 2023-10-02 23:26:36,461 WARNING [train.py:1204] (2/4) Exclude cut with ID _14170_210_5219_1_1530525949868_4469650_336-213530-0_sp1.1 from training. Duration: 0.8181875 2023-10-02 23:26:36,490 WARNING [train.py:1204] (2/4) Exclude cut with ID _64135_210_2253_1_1535677267876_7608972_180-5312-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 23:26:37,809 WARNING [train.py:1204] (2/4) Exclude cut with ID _12302_210_5219_1_1529924446466_8107280_835-279754-0 from training. Duration: 0.9 2023-10-02 23:26:37,811 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_20120_210_6753_1_1531643035_5482859_510-96996-0_sp1.1 from training. Duration: 0.7909375 2023-10-02 23:26:39,225 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_15517_210_6753_1_1530954008_3487336_216-187834-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 23:26:39,251 WARNING [train.py:1204] (2/4) Exclude cut with ID _19504_210_9994_1_1531648863242_3325220_414-113427-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 23:26:39,252 WARNING [train.py:1204] (2/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_264-108685-0_sp0.9 from training. Duration: 0.611125 2023-10-02 23:26:40,643 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_5438_210_1794_1_1527336687_2600009_104-288274-0_sp0.9 from training. Duration: 0.6444375 2023-10-02 23:26:46,480 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.conv_module1.balancer2.prob, batch_count=1056813.3333333333, ans=0.125 2023-10-02 23:26:47,470 WARNING [train.py:1204] (2/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_520-39496-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 23:26:47,514 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_37818_210_4238_1_1533620910_4720083_315-308565-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 23:26:47,614 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_2009_210_2071_1_1525481134_4588937_385-302100-0_sp1.1 from training. Duration: 0.7909375 2023-10-02 23:26:49,517 WARNING [train.py:1204] (2/4) Exclude cut with ID _58895_210_8750_1_1535184081320_3649089_277-53219-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 23:26:50,819 WARNING [train.py:1204] (2/4) Exclude cut with ID _26664_210_7770_1_1532258943280_3418270_121-111258-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 23:26:54,199 WARNING [train.py:1204] (2/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_99-167392-0_sp1.1 from training. Duration: 0.5545625 2023-10-02 23:26:55,470 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1113-7369-0 from training. Duration: 0.85 2023-10-02 23:26:55,485 WARNING [train.py:1204] (2/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_352-59958-0 from training. Duration: 0.86 2023-10-02 23:26:55,497 WARNING [train.py:1204] (2/4) Exclude cut with ID _14886_210_4220_1_1530702020355_3516190_159-73748-0_sp1.1 from training. Duration: 0.7545625 2023-10-02 23:26:57,002 WARNING [train.py:1204] (2/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_131-119569-0_sp1.1 from training. Duration: 0.92725 2023-10-02 23:26:59,025 WARNING [train.py:1204] (2/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_216-343007-0 from training. Duration: 0.66 2023-10-02 23:27:01,840 WARNING [train.py:1204] (2/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_113-92608-0_sp0.9 from training. Duration: 0.6 2023-10-02 23:27:05,313 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_40047_210_2290_1_1533290519_2374311_132-123271-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 23:27:05,357 WARNING [train.py:1204] (2/4) Exclude cut with ID _8793_210_6310_1_1528542985139_2310009_121-179782-0 from training. Duration: 0.8 2023-10-02 23:27:06,617 WARNING [train.py:1204] (2/4) Exclude cut with ID _43997_210_4525_1_1533538632555_5189329_283-218639-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 23:27:06,631 WARNING [train.py:1204] (2/4) Exclude cut with ID _49376_210_5986_1_1533985278732_3370740_297-78397-0_sp1.1 from training. Duration: 0.936375 2023-10-02 23:27:06,653 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_3475_210_2028_1_1526193578_3622569_377-166133-0_sp0.9 from training. Duration: 0.9555625 2023-10-02 23:27:06,660 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_520-213149-0_sp1.1 from training. Duration: 0.92725 2023-10-02 23:27:09,241 WARNING [train.py:1204] (2/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_567-144483-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 23:27:09,395 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder_embed.conv.5.prob, batch_count=1056880.0, ans=0.125 2023-10-02 23:27:10,784 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_4183_210_2087_1_1526729100_1869838_143-280962-0_sp0.9 from training. Duration: 0.62225 2023-10-02 23:27:10,830 WARNING [train.py:1204] (2/4) Exclude cut with ID _49643_210_7989_1_1534640244224_3939031_611-185515-0 from training. Duration: 0.94 2023-10-02 23:27:12,334 WARNING [train.py:1204] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_88-341718-0_sp0.9 from training. Duration: 0.7333125 2023-10-02 23:27:14,957 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_51225_210_3601_1_1534400634_2327383_49-235441-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 23:27:16,396 WARNING [train.py:1204] (2/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_240-294400-0_sp1.1 from training. Duration: 0.936375 2023-10-02 23:27:16,611 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=1056946.6666666667, ans=0.1 2023-10-02 23:27:17,730 WARNING [train.py:1204] (2/4) Exclude cut with ID _55429_210_10626_1_1534467588527_7174143_640-304825-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 23:27:17,756 WARNING [train.py:1204] (2/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_187-221566-0_sp1.1 from training. Duration: 0.3909375 2023-10-02 23:27:19,609 WARNING [train.py:1204] (2/4) Exclude cut with ID _7606_210_4915_1_1528894253277_5080690_127-177761-0_sp0.9 from training. Duration: 0.711125 2023-10-02 23:27:23,034 WARNING [train.py:1204] (2/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_430-116139-0 from training. Duration: 0.85 2023-10-02 23:27:24,485 WARNING [train.py:1204] (2/4) Exclude cut with ID _3675_210_4557_1_1526639934408_4182089_456-18305-0_sp0.9 from training. Duration: 0.8444375 2023-10-02 23:27:27,509 INFO [train.py:1046] (2/4) Epoch 30, batch 4500, loss[loss=0.1533, simple_loss=0.2429, pruned_loss=0.03192, over 24627.00 frames. ], tot_loss[loss=0.165, simple_loss=0.244, pruned_loss=0.043, over 4727896.90 frames. ], batch size: 68, lr: 3.39e-03, grad_scale: 8.0 2023-10-02 23:27:31,591 WARNING [train.py:1204] (2/4) Exclude cut with ID _18513_210_9206_1_1531618215223_3862620_402-5957-0_sp1.1 from training. Duration: 0.9 2023-10-02 23:27:31,783 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.feed_forward1.hidden_balancer.prob, batch_count=1057013.3333333333, ans=0.125 2023-10-02 23:27:33,012 WARNING [train.py:1204] (2/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_465-156500-0 from training. Duration: 0.72 2023-10-02 23:27:33,013 WARNING [train.py:1204] (2/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_714-310538-0 from training. Duration: 0.86 2023-10-02 23:27:34,578 WARNING [train.py:1204] (2/4) Exclude cut with ID _39411_210_5986_1_1533538825030_3343649_335-301187-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 23:27:40,511 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_41750_210_1960_1_1533520678_3948042_241-158616-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 23:27:40,550 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_46013_210_7234_1_1533776362_3744032_50-141535-0_sp1.1 from training. Duration: 0.9 2023-10-02 23:27:41,936 WARNING [train.py:1204] (2/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_43-206757-0_sp0.9 from training. Duration: 0.8333125 2023-10-02 23:27:41,993 WARNING [train.py:1204] (2/4) Exclude cut with ID _22961_210_8254_1_1531976366672_4278180_172-276296-0_sp1.1 from training. Duration: 0.97275 2023-10-02 23:27:42,015 WARNING [train.py:1204] (2/4) Exclude cut with ID _17956_210_4353_1_1531209568125_6979970_1305-301903-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 23:27:42,060 WARNING [train.py:1204] (2/4) Exclude cut with ID _28323_210_16187_1_1532480395667_4100300_604-44507-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 23:27:53,560 WARNING [train.py:1204] (2/4) Exclude cut with ID _23514_210_1389_1_1531832139785_3031439_88-337544-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 23:27:53,629 WARNING [train.py:1204] (2/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_932-3672-0_sp1.1 from training. Duration: 0.963625 2023-10-02 23:27:56,722 WARNING [train.py:1204] (2/4) Exclude cut with ID _46502_210_13794_1_1533724452198_3657159_207-109991-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 23:27:56,773 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_216-73841-0_sp1.1 from training. Duration: 0.87275 2023-10-02 23:27:58,593 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8453_210_4211_1_1528502940_8562730_347-330673-0_sp1.1 from training. Duration: 0.6545625 2023-10-02 23:28:04,260 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_4183_210_2087_1_1526729100_1869838_143-280962-0_sp1.1 from training. Duration: 0.5090625 2023-10-02 23:28:07,659 WARNING [train.py:1204] (2/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_325-113798-0_sp1.1 from training. Duration: 0.77275 2023-10-02 23:28:10,465 WARNING [train.py:1204] (2/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_91-212486-0_sp0.9 from training. Duration: 0.7555625 2023-10-02 23:28:13,326 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8776_210_2040_1_1528638837_4088036_244-278763-0_sp1.1 from training. Duration: 0.8181875 2023-10-02 23:28:14,658 WARNING [train.py:1204] (2/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_106-286130-0 from training. Duration: 0.98 2023-10-02 23:28:14,725 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_2746_210_1988_1_1525872816_3795627_372-205861-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 23:28:16,047 WARNING [train.py:1204] (2/4) Exclude cut with ID _30800_210_22382_1_1532499347904_4515959_240-54646-0_sp1.1 from training. Duration: 0.936375 2023-10-02 23:28:17,355 WARNING [train.py:1204] (2/4) Exclude cut with ID _59500_210_8254_1_1534809610680_3736009_162-180695-0_sp1.1 from training. Duration: 0.936375 2023-10-02 23:28:17,388 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7544_210_6753_1_1528591327_1471426_146-93357-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 23:28:20,660 WARNING [train.py:1204] (2/4) Exclude cut with ID _42657_210_2070_1_1533520654016_3327008_405-87432-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 23:28:21,915 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_4528_210_3273_1_1527076654_3276777_209-219804-0 from training. Duration: 0.69 2023-10-02 23:28:21,916 WARNING [train.py:1204] (2/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_78-182912-0_sp0.9 from training. Duration: 0.6555625 2023-10-02 23:28:21,930 WARNING [train.py:1204] (2/4) Exclude cut with ID _31255_210_13427_1_1532865619495_3815143_322-114998-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 23:28:26,765 WARNING [train.py:1204] (2/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_848-280957-0_sp0.9 from training. Duration: 0.9666875 2023-10-02 23:28:26,792 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7696_210_2040_1_1528207167_3716099_117-125434-0_sp0.9 from training. Duration: 0.8444375 2023-10-02 23:28:28,240 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_1002-109192-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 23:28:29,711 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.conv_skip_rate, batch_count=1057280.0, ans=0.0 2023-10-02 23:28:30,047 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.5.encoder.layers.1.conv_module1.whiten, num_groups=1, num_channels=256, metric=3.27 vs. limit=15.0 2023-10-02 23:28:31,008 WARNING [train.py:1204] (2/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_138-64210-0_sp0.9 from training. Duration: 0.811125 2023-10-02 23:28:31,024 WARNING [train.py:1204] (2/4) Exclude cut with ID _25808_210_13292_1_1532343514562_7366109_633-47524-0_sp1.1 from training. Duration: 0.9 2023-10-02 23:28:32,453 WARNING [train.py:1204] (2/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_297-5233-0 from training. Duration: 0.95 2023-10-02 23:28:35,779 WARNING [train.py:1204] (2/4) Exclude cut with ID _8394_210_6702_1_1528590504316_3540189_335-274780-0 from training. Duration: 0.78 2023-10-02 23:28:35,784 WARNING [train.py:1204] (2/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_301-330455-0 from training. Duration: 0.8 2023-10-02 23:28:35,952 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.self_attn_weights.pos_emb_skip_rate, batch_count=1057280.0, ans=0.0 2023-10-02 23:28:37,058 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.549e+02 1.840e+02 1.982e+02 2.300e+02 3.400e+02, threshold=3.964e+02, percent-clipped=0.0 2023-10-02 23:28:40,219 WARNING [train.py:1204] (2/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_383-205833-0 from training. Duration: 0.95 2023-10-02 23:28:41,532 INFO [train.py:1046] (2/4) Epoch 30, batch 4550, loss[loss=0.1645, simple_loss=0.2303, pruned_loss=0.04938, over 23752.00 frames. ], tot_loss[loss=0.1642, simple_loss=0.243, pruned_loss=0.04274, over 4720182.81 frames. ], batch size: 232, lr: 3.39e-03, grad_scale: 8.0 2023-10-02 23:28:43,068 WARNING [train.py:1204] (2/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_48-182155-0 from training. Duration: 0.87 2023-10-02 23:28:44,449 WARNING [train.py:1204] (2/4) Exclude cut with ID _14832_210_8750_1_1530691351539_3171348_1-54315-0_sp1.1 from training. Duration: 0.7909375 2023-10-02 23:28:45,946 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.0.attention_skip_rate, batch_count=1057346.6666666667, ans=0.0 2023-10-02 23:28:47,277 WARNING [train.py:1204] (2/4) Exclude cut with ID _44982_210_1511_1_1534312820108_3726029_413-100814-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 23:28:48,579 WARNING [train.py:1204] (2/4) Exclude cut with ID _3231_210_2106_1_1526205864934_6784220_104-261392-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 23:28:50,189 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.feed_forward1.out_proj.dropout_p, batch_count=1057346.6666666667, ans=0.1 2023-10-02 23:28:51,270 WARNING [train.py:1204] (2/4) Exclude cut with ID _24314_210_4915_1_1532135078116_7170179_1-13588-0_sp1.1 from training. Duration: 0.97275 2023-10-02 23:28:53,496 WARNING [train.py:1204] (2/4) Exclude cut with ID _44304_210_15374_1_1533639352765_4613129_141-8356-0_sp1.1 from training. Duration: 0.963625 2023-10-02 23:28:56,652 WARNING [train.py:1204] (2/4) Exclude cut with ID _37892_210_19596_1_1533198677897_3964058_374-335034-0_sp1.1 from training. Duration: 0.936375 2023-10-02 23:28:58,632 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_195-257512-0_sp0.9 from training. Duration: 0.7444375 2023-10-02 23:28:58,634 WARNING [train.py:1204] (2/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_1267-272693-0_sp1.1 from training. Duration: 0.663625 2023-10-02 23:28:58,635 WARNING [train.py:1204] (2/4) Exclude cut with ID _30196_210_1664_1_1533708496522_7074140_690-326155-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 23:29:01,390 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_68847_210_14656_1_1536070214_2586843_38-20717-0_sp1.1 from training. Duration: 0.97275 2023-10-02 23:29:01,424 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_99-251314-0_sp1.1 from training. Duration: 0.7909375 2023-10-02 23:29:04,148 WARNING [train.py:1204] (2/4) Exclude cut with ID _49376_210_5986_1_1533985278732_3370740_225-95630-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 23:29:06,048 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_2655_210_2087_1_1525688959_3897400_202-147557-0 from training. Duration: 0.85 2023-10-02 23:29:07,448 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_5618_210_1874_1_1527311008_5851_1-214501-0_sp1.1 from training. Duration: 0.626375 2023-10-02 23:29:07,511 WARNING [train.py:1204] (2/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_225-43381-0_sp1.1 from training. Duration: 0.8545625 2023-10-02 23:29:08,914 WARNING [train.py:1204] (2/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_242-3472-0 from training. Duration: 0.73 2023-10-02 23:29:10,413 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.feed_forward1.hidden_balancer.prob, batch_count=1057480.0, ans=0.125 2023-10-02 23:29:12,873 WARNING [train.py:1204] (2/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_339-104791-0 from training. Duration: 0.49 2023-10-02 23:29:14,184 WARNING [train.py:1204] (2/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_488-92261-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 23:29:16,981 WARNING [train.py:1204] (2/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_175-163894-0 from training. Duration: 0.74 2023-10-02 23:29:18,354 WARNING [train.py:1204] (2/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_41-234353-0_sp0.9 from training. Duration: 0.7555625 2023-10-02 23:29:21,152 WARNING [train.py:1204] (2/4) Exclude cut with ID _47251_210_18628_1_1533814063128_4137250_116-275927-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 23:29:22,820 WARNING [train.py:1204] (2/4) Exclude cut with ID _4697_210_2069_1_1526815801953_3823098_61-223324-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 23:29:22,840 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6961_210_2087_1_1527937168_6815011_562-165518-0_sp1.1 from training. Duration: 0.7 2023-10-02 23:29:25,951 WARNING [train.py:1204] (2/4) Exclude cut with ID _21989_210_5854_1_1531899214931_3271360_133-44759-0 from training. Duration: 0.77 2023-10-02 23:29:26,853 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.2.encoder.layers.2.self_attn2.whiten, num_groups=1, num_channels=384, metric=18.21 vs. limit=22.5 2023-10-02 23:29:27,900 WARNING [train.py:1204] (2/4) Exclude cut with ID _23557_210_8750_1_1531890106370_6846255_77-284923-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 23:29:30,680 WARNING [train.py:1204] (2/4) Exclude cut with ID _13678_210_7497_1_1530530069226_3463170_464-296127-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 23:29:30,694 WARNING [train.py:1204] (2/4) Exclude cut with ID _22613_210_5219_1_1532253599686_4090170_393-36663-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 23:29:30,784 WARNING [train.py:1204] (2/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_235-262124-0_sp0.9 from training. Duration: 0.7444375 2023-10-02 23:29:32,203 WARNING [train.py:1204] (2/4) Exclude cut with ID _8480_210_1874_1_1528589391205_6728330_755-112625-0 from training. Duration: 0.74 2023-10-02 23:29:33,446 WARNING [train.py:1204] (2/4) Exclude cut with ID _6456_210_1534_1_1527681634212_3181049_215-309359-0 from training. Duration: 0.88 2023-10-02 23:29:33,477 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_3019_210_2028_1_1526044245_3264865_191-72235-0_sp1.1 from training. Duration: 0.8818125 2023-10-02 23:29:34,912 WARNING [train.py:1204] (2/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_380-162648-0 from training. Duration: 0.94 2023-10-02 23:29:36,741 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_92-46009-0 from training. Duration: 0.54 2023-10-02 23:29:36,765 WARNING [train.py:1204] (2/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_53-300544-0_sp0.9 from training. Duration: 0.7444375 2023-10-02 23:29:39,504 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_68235_210_1925_1_1535863247_4484656_101-250943-0_sp1.1 from training. Duration: 0.97275 2023-10-02 23:29:39,521 WARNING [train.py:1204] (2/4) Exclude cut with ID _14970_210_15709_1_1530775547361_1966965_204-22182-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 23:29:39,602 WARNING [train.py:1204] (2/4) Exclude cut with ID _55153_210_2253_1_1534384786677_3162096_310-130092-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 23:29:40,944 WARNING [train.py:1204] (2/4) Exclude cut with ID _60113_210_16271_1_1535164212481_4386079_559-167191-0_sp1.1 from training. Duration: 0.8454375 2023-10-02 23:29:42,396 WARNING [train.py:1204] (2/4) Exclude cut with ID _14368_210_4220_1_1530532714870_3595529_93-58002-0_sp1.1 from training. Duration: 0.5181875 2023-10-02 23:29:42,449 WARNING [train.py:1204] (2/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_110-224688-0 from training. Duration: 0.97 2023-10-02 23:29:45,160 WARNING [train.py:1204] (2/4) Exclude cut with ID _40402_210_18219_1_1533522324115_2953150_178-178004-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 23:29:45,169 WARNING [train.py:1204] (2/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_321-335475-0_sp1.1 from training. Duration: 0.4090625 2023-10-02 23:29:46,494 WARNING [train.py:1204] (2/4) Exclude cut with ID _66863_210_18053_1_1535698758334_8590319_1092-271784-0 from training. Duration: 0.96 2023-10-02 23:29:46,509 WARNING [train.py:1204] (2/4) Exclude cut with ID _28317_210_12742_1_1532480302381_8510325_607-213951-0_sp1.1 from training. Duration: 0.763625 2023-10-02 23:29:46,518 WARNING [train.py:1204] (2/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_468-283113-0 from training. Duration: 0.96 2023-10-02 23:29:49,336 WARNING [train.py:1204] (2/4) Exclude cut with ID _14922_210_6228_1_1530707435273_3693310_13-165512-0_sp1.1 from training. Duration: 0.7454375 2023-10-02 23:29:49,357 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_69630_210_7234_1_1535788670_910989_56-169762-0_sp1.1 from training. Duration: 0.92725 2023-10-02 23:29:52,076 WARNING [train.py:1204] (2/4) Exclude cut with ID _67335_210_5219_1_1535525975652_4275229_121-80357-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 23:29:52,111 WARNING [train.py:1204] (2/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_126-12079-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 23:29:52,148 WARNING [train.py:1204] (2/4) Exclude cut with ID _5294_210_1747_1_1527249643928_3696179_342-270278-0_sp1.1 from training. Duration: 0.5 2023-10-02 23:29:53,906 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_30322_210_1925_1_1532843478_826817_35-328264-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 23:29:54,019 WARNING [train.py:1204] (2/4) Exclude cut with ID _8480_210_1874_1_1528589391205_6728330_755-112625-0_sp1.1 from training. Duration: 0.67275 2023-10-02 23:29:55,741 INFO [train.py:1046] (2/4) Epoch 30, batch 4600, loss[loss=0.1452, simple_loss=0.2286, pruned_loss=0.03092, over 24477.00 frames. ], tot_loss[loss=0.1627, simple_loss=0.2412, pruned_loss=0.04213, over 4718185.58 frames. ], batch size: 63, lr: 3.38e-03, grad_scale: 8.0 2023-10-02 23:29:57,358 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.1.nonlin_attention.balancer.prob, batch_count=1057680.0, ans=0.125 2023-10-02 23:29:59,150 WARNING [train.py:1204] (2/4) Exclude cut with ID _38670_210_15379_1_1533448810659_4391059_132-146412-0_sp1.1 from training. Duration: 0.97275 2023-10-02 23:29:59,225 WARNING [train.py:1204] (2/4) Exclude cut with ID _22020_210_2461_1_1531724164513_6326900_563-39691-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 23:30:01,842 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_9460_210_4211_1_1528954587_10049667_572-37035-0_sp1.1 from training. Duration: 0.72725 2023-10-02 23:30:01,854 WARNING [train.py:1204] (2/4) Exclude cut with ID _68777_210_4353_1_1535810390731_3117904_129-166012-0_sp0.9 from training. Duration: 0.9444375 2023-10-02 23:30:01,925 WARNING [train.py:1204] (2/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_487-91921-0_sp1.1 from training. Duration: 0.963625 2023-10-02 23:30:03,425 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_195-257512-0 from training. Duration: 0.67 2023-10-02 23:30:06,076 WARNING [train.py:1204] (2/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_188-260011-0_sp1.1 from training. Duration: 0.663625 2023-10-02 23:30:09,421 WARNING [train.py:1204] (2/4) Exclude cut with ID _8480_210_1874_1_1528589391205_6728330_309-139896-0_sp0.9 from training. Duration: 0.9 2023-10-02 23:30:09,483 WARNING [train.py:1204] (2/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_866-321248-0_sp1.1 from training. Duration: 0.963625 2023-10-02 23:30:11,039 WARNING [train.py:1204] (2/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_510-216513-0_sp1.1 from training. Duration: 0.97275 2023-10-02 23:30:19,242 WARNING [train.py:1204] (2/4) Exclude cut with ID _6653_210_2629_1_1527919172441_6732679_758-143677-0 from training. Duration: 0.98 2023-10-02 23:30:19,332 WARNING [train.py:1204] (2/4) Exclude cut with ID _55134_210_21982_1_1534590028377_3882170_50-214427-0_sp1.1 from training. Duration: 0.97275 2023-10-02 23:30:23,979 WARNING [train.py:1204] (2/4) Exclude cut with ID _31649_210_6228_1_1532584608063_4091078_482-301330-0_sp1.1 from training. Duration: 0.97275 2023-10-02 23:30:28,443 WARNING [train.py:1204] (2/4) Exclude cut with ID _35799_210_5854_1_1533263487887_3529071_490-326847-0_sp1.1 from training. Duration: 0.9 2023-10-02 23:30:28,451 WARNING [train.py:1204] (2/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_984-90716-0_sp1.1 from training. Duration: 0.963625 2023-10-02 23:30:33,268 WARNING [train.py:1204] (2/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_166-246557-0 from training. Duration: 0.64 2023-10-02 23:30:33,268 WARNING [train.py:1204] (2/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_51-200220-0_sp1.1 from training. Duration: 0.5090625 2023-10-02 23:30:33,350 WARNING [train.py:1204] (2/4) Exclude cut with ID _51382_210_23862_1_1534492801838_3650104_275-92796-0_sp1.1 from training. Duration: 0.936375 2023-10-02 23:30:38,871 WARNING [train.py:1204] (2/4) Exclude cut with ID _29853_210_7497_1_1532505600851_6642110_461-142829-0_sp1.1 from training. Duration: 0.97275 2023-10-02 23:30:38,917 WARNING [train.py:1204] (2/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_788-298907-0_sp0.9 from training. Duration: 0.988875 2023-10-02 23:30:42,103 WARNING [train.py:1204] (2/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_739-2098-0_sp1.1 from training. Duration: 0.836375 2023-10-02 23:30:43,617 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7543_210_6753_1_1528541907_1399322_165-323619-0 from training. Duration: 0.69 2023-10-02 23:30:45,042 WARNING [train.py:1204] (2/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_401-255491-0_sp1.1 from training. Duration: 0.636375 2023-10-02 23:30:49,117 WARNING [train.py:1204] (2/4) Exclude cut with ID _66873_210_5517_1_1535454136506_4293370_24-143283-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 23:30:50,440 WARNING [train.py:1204] (2/4) Exclude cut with ID _14459_210_2228_1_1531443641946_7516700_1221-280439-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 23:30:53,171 WARNING [train.py:1204] (2/4) Exclude cut with ID _59202_210_26984_1_1534816810612_3549174_469-213482-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 23:30:53,172 WARNING [train.py:1204] (2/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_148-158509-0_sp0.9 from training. Duration: 0.4555625 2023-10-02 23:30:53,207 WARNING [train.py:1204] (2/4) Exclude cut with ID _24314_210_4915_1_1532135078116_7170179_863-259095-0_sp1.1 from training. Duration: 0.97275 2023-10-02 23:30:54,975 WARNING [train.py:1204] (2/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_321-22821-0 from training. Duration: 0.89 2023-10-02 23:30:55,002 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_43831_210_2158_1_1533725502_5872791_55-163137-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 23:30:55,670 WARNING [train.py:1204] (2/4) Exclude cut with ID _27430_210_7770_1_1532604689375_1378590_26-25207-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 23:30:57,106 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_17613_210_2151_1_1531137569_2366160_223-316461-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 23:30:58,394 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_53679_210_4852_1_1534556726_4186141_541-213370-0_sp1.1 from training. Duration: 0.963625 2023-10-02 23:30:58,463 WARNING [train.py:1204] (2/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_369-295086-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 23:30:58,518 WARNING [train.py:1204] (2/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_91-267696-0 from training. Duration: 0.87 2023-10-02 23:31:00,350 WARNING [train.py:1204] (2/4) Exclude cut with ID _5294_210_1747_1_1527249643928_3696179_180-100713-0 from training. Duration: 0.81 2023-10-02 23:31:00,386 WARNING [train.py:1204] (2/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_32-335103-0 from training. Duration: 0.82 2023-10-02 23:31:00,391 WARNING [train.py:1204] (2/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_133-312727-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 23:31:01,818 WARNING [train.py:1204] (2/4) Exclude cut with ID _59288_210_5854_1_1535077757774_3412246_391-165934-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 23:31:01,867 WARNING [train.py:1204] (2/4) Exclude cut with ID _28323_210_16187_1_1532480395667_4100300_488-1281-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 23:31:03,183 WARNING [train.py:1204] (2/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_124-193207-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 23:31:04,900 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.feed_forward3.hidden_balancer.prob, batch_count=1057946.6666666667, ans=0.125 2023-10-02 23:31:05,782 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.399e+02 1.837e+02 2.032e+02 2.322e+02 3.938e+02, threshold=4.064e+02, percent-clipped=0.0 2023-10-02 23:31:10,361 INFO [train.py:1046] (2/4) Epoch 30, batch 4650, loss[loss=0.1576, simple_loss=0.2379, pruned_loss=0.03863, over 21562.00 frames. ], tot_loss[loss=0.1624, simple_loss=0.241, pruned_loss=0.04192, over 4719577.80 frames. ], batch size: 47, lr: 3.38e-03, grad_scale: 8.0 2023-10-02 23:31:13,234 WARNING [train.py:1204] (2/4) Exclude cut with ID _14087_210_2253_1_1530529224512_3014029_226-242140-0_sp0.9 from training. Duration: 0.9 2023-10-02 23:31:14,784 WARNING [train.py:1204] (2/4) Exclude cut with ID _29277_210_3279_1_1532581109293_3173069_392-3694-0_sp1.1 from training. Duration: 0.936375 2023-10-02 23:31:16,111 WARNING [train.py:1204] (2/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_563-329842-0_sp1.1 from training. Duration: 0.97275 2023-10-02 23:31:16,159 WARNING [train.py:1204] (2/4) Exclude cut with ID _58583_210_5236_1_1534726835266_3574249_391-87755-0_sp1.1 from training. Duration: 0.92725 2023-10-02 23:31:16,178 WARNING [train.py:1204] (2/4) Exclude cut with ID _28427_210_4220_1_1532581240967_3061069_281-237827-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 23:31:16,192 WARNING [train.py:1204] (2/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_44-147898-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 23:31:16,379 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.bypass.scale_min, batch_count=1058013.3333333333, ans=0.2 2023-10-02 23:31:17,576 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_24007_210_1794_1_1531918925_1663875_125-320772-0_sp1.1 from training. Duration: 0.97275 2023-10-02 23:31:20,308 WARNING [train.py:1204] (2/4) Exclude cut with ID _14368_210_4220_1_1530532714870_3595529_143-117421-0 from training. Duration: 0.58 2023-10-02 23:31:22,916 WARNING [train.py:1204] (2/4) Exclude cut with ID _69584_210_5219_1_1535785133775_4044450_325-7656-0_sp0.9 from training. Duration: 0.9555625 2023-10-02 23:31:26,814 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_37351_210_7925_1_1533012931_3812113_265-85780-0 from training. Duration: 0.88 2023-10-02 23:31:26,843 WARNING [train.py:1204] (2/4) Exclude cut with ID _18516_210_9206_1_1531704653206_3692406_331-237896-0_sp1.1 from training. Duration: 0.936375 2023-10-02 23:31:28,072 WARNING [train.py:1204] (2/4) Exclude cut with ID _14629_210_15709_1_1530604298182_956899_6-232695-0 from training. Duration: 0.94 2023-10-02 23:31:28,107 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7544_210_6753_1_1528587000_691468_87-178599-0_sp1.1 from training. Duration: 0.7909375 2023-10-02 23:31:28,159 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_326-303945-0 from training. Duration: 0.99 2023-10-02 23:31:28,177 WARNING [train.py:1204] (2/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_503-30719-0 from training. Duration: 0.98 2023-10-02 23:31:29,530 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_11305_210_1794_1_1529750428_7178148_299-344640-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 23:31:29,606 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_53071_210_16554_1_1534490405_2954066_265-170472-0_sp1.1 from training. Duration: 0.7545625 2023-10-02 23:31:32,279 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_174-277364-0_sp1.1 from training. Duration: 0.6454375 2023-10-02 23:31:33,709 WARNING [train.py:1204] (2/4) Exclude cut with ID _58414_210_8254_1_1535421609162_7151182_193-151884-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 23:31:33,738 WARNING [train.py:1204] (2/4) Exclude cut with ID IC0651W0104-322063 from training. Duration: 0.768 2023-10-02 23:31:36,488 WARNING [train.py:1204] (2/4) Exclude cut with ID _21881_210_3525_1_1531825242757_3798380_139-305982-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 23:31:37,812 WARNING [train.py:1204] (2/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_79-200392-0 from training. Duration: 0.97 2023-10-02 23:31:41,341 WARNING [train.py:1204] (2/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_451-280894-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 23:31:41,350 WARNING [train.py:1204] (2/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_304-273433-0_sp1.1 from training. Duration: 0.9 2023-10-02 23:31:41,436 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_5959_210_1794_1_1527400400_3445726_412-44052-0 from training. Duration: 0.77 2023-10-02 23:31:44,189 WARNING [train.py:1204] (2/4) Exclude cut with ID _4681_210_3525_1_1527251040321_1235713_148-26159-0_sp1.1 from training. Duration: 0.963625 2023-10-02 23:31:46,988 WARNING [train.py:1204] (2/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_887-249555-0_sp0.9 from training. Duration: 0.8555625 2023-10-02 23:31:49,862 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_25236_210_2158_1_1532051968_280567_37-6860-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 23:31:55,401 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=1058213.3333333333, ans=0.1 2023-10-02 23:31:57,560 WARNING [train.py:1204] (2/4) Exclude cut with ID _29277_210_3279_1_1532581109293_3173069_417-115527-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 23:31:59,031 WARNING [train.py:1204] (2/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_552-134748-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 23:31:59,099 WARNING [train.py:1204] (2/4) Exclude cut with ID _43176_210_8254_1_1533524417469_4174339_188-174143-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 23:32:00,457 WARNING [train.py:1204] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_127-119109-0_sp1.1 from training. Duration: 0.6545625 2023-10-02 23:32:01,873 WARNING [train.py:1204] (2/4) Exclude cut with ID _14922_210_6228_1_1530707435273_3693310_38-187851-0 from training. Duration: 0.62 2023-10-02 23:32:01,911 WARNING [train.py:1204] (2/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_588-125165-0 from training. Duration: 0.83 2023-10-02 23:32:03,324 WARNING [train.py:1204] (2/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_344-152526-0_sp1.1 from training. Duration: 0.4818125 2023-10-02 23:32:03,326 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7002_210_1794_1_1527935634_4034444_487-236501-0 from training. Duration: 0.66 2023-10-02 23:32:03,457 WARNING [train.py:1204] (2/4) Exclude cut with ID _23825_210_13449_1_1531915313526_3497150_379-16981-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 23:32:07,700 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.conv_module2.balancer2.prob, batch_count=1058280.0, ans=0.125 2023-10-02 23:32:10,185 WARNING [train.py:1204] (2/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_297-5233-0_sp1.1 from training. Duration: 0.863625 2023-10-02 23:32:10,189 WARNING [train.py:1204] (2/4) Exclude cut with ID _41298_210_9019_1_1533430371450_4822143_178-283538-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 23:32:10,227 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7046_210_6753_1_1527895337_6920917_642-106799-0 from training. Duration: 0.92 2023-10-02 23:32:11,536 WARNING [train.py:1204] (2/4) Exclude cut with ID _6274_210_2084_1_1527937583837_7494020_532-291952-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 23:32:12,973 WARNING [train.py:1204] (2/4) Exclude cut with ID _47278_210_12041_1_1533902418349_3576110_260-270468-0_sp1.1 from training. Duration: 0.92725 2023-10-02 23:32:12,979 WARNING [train.py:1204] (2/4) Exclude cut with ID _14368_210_4220_1_1530532714870_3595529_144-3287-0_sp0.9 from training. Duration: 0.7444375 2023-10-02 23:32:15,006 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_162-262717-0_sp0.9 from training. Duration: 0.92225 2023-10-02 23:32:16,403 WARNING [train.py:1204] (2/4) Exclude cut with ID _3465_210_3525_1_1526175667060_3574049_62-335552-0_sp0.9 from training. Duration: 0.9666875 2023-10-02 23:32:16,414 WARNING [train.py:1204] (2/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_726-272852-0_sp1.1 from training. Duration: 0.92725 2023-10-02 23:32:17,775 WARNING [train.py:1204] (2/4) Exclude cut with ID _14898_210_9206_1_1530766812377_4177190_26-135492-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 23:32:21,963 WARNING [train.py:1204] (2/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_200-95304-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 23:32:23,210 INFO [train.py:1046] (2/4) Epoch 30, batch 4700, loss[loss=0.1778, simple_loss=0.2474, pruned_loss=0.05411, over 23848.00 frames. ], tot_loss[loss=0.1634, simple_loss=0.242, pruned_loss=0.04237, over 4712166.78 frames. ], batch size: 179, lr: 3.38e-03, grad_scale: 8.0 2023-10-02 23:32:23,273 WARNING [train.py:1204] (2/4) Exclude cut with ID _45012_210_12651_1_1533608435136_5371380_240-37101-0_sp1.1 from training. Duration: 0.8454375 2023-10-02 23:32:23,281 WARNING [train.py:1204] (2/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_132-269633-0_sp0.9 from training. Duration: 0.7555625 2023-10-02 23:32:23,361 WARNING [train.py:1204] (2/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_907-151398-0_sp0.9 from training. Duration: 0.411125 2023-10-02 23:32:24,006 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.3.encoder.layers.1.feed_forward3.out_whiten, num_groups=1, num_channels=512, metric=9.27 vs. limit=15.0 2023-10-02 23:32:24,755 WARNING [train.py:1204] (2/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_45-265684-0_sp0.9 from training. Duration: 0.8 2023-10-02 23:32:26,692 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6291_210_3273_1_1527683649_1078498_74-292608-0 from training. Duration: 0.72 2023-10-02 23:32:34,035 WARNING [train.py:1204] (2/4) Exclude cut with ID _59202_210_26984_1_1534816810612_3549174_221-84819-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 23:32:35,403 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_33696_210_3852_1_1533082780_2663375_181-325595-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 23:32:35,448 WARNING [train.py:1204] (2/4) Exclude cut with ID _47251_210_18628_1_1533814063128_4137250_351-154105-0_sp1.1 from training. Duration: 0.963625 2023-10-02 23:32:36,850 WARNING [train.py:1204] (2/4) Exclude cut with ID _35501_210_9288_1_1532998507986_3192189_351-334740-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 23:32:38,274 WARNING [train.py:1204] (2/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_342-100157-0_sp1.1 from training. Duration: 0.4909375 2023-10-02 23:32:42,391 WARNING [train.py:1204] (2/4) Exclude cut with ID _6789_210_1534_1_1527857937218_3118950_262-48853-0 from training. Duration: 0.56 2023-10-02 23:32:43,737 WARNING [train.py:1204] (2/4) Exclude cut with ID _69884_210_9771_1_1535846392818_4166169_430-201020-0 from training. Duration: 0.97 2023-10-02 23:32:45,741 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_71-136518-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 23:32:45,832 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_28-291982-0_sp1.1 from training. Duration: 0.8818125 2023-10-02 23:32:45,858 WARNING [train.py:1204] (2/4) Exclude cut with ID _54218_210_12210_1_1534413646388_4409259_437-275326-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 23:32:49,945 WARNING [train.py:1204] (2/4) Exclude cut with ID _55134_210_21982_1_1534590028377_3882170_40-309139-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 23:32:55,987 WARNING [train.py:1204] (2/4) Exclude cut with ID _32704_210_8954_1_1533017488840_6747370_266-329755-0_sp1.1 from training. Duration: 0.8090625 2023-10-02 23:32:57,772 WARNING [train.py:1204] (2/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_21-174514-0_sp1.1 from training. Duration: 0.3909375 2023-10-02 23:32:59,817 WARNING [train.py:1204] (2/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_409-207378-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 23:33:05,434 WARNING [train.py:1204] (2/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_132-269633-0 from training. Duration: 0.68 2023-10-02 23:33:06,755 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_2245_210_2715_1_1525511548_3540254_310-52483-0_sp0.9 from training. Duration: 0.97775 2023-10-02 23:33:08,358 WARNING [train.py:1204] (2/4) Exclude cut with ID _27430_210_7770_1_1532604689375_1378590_47-77403-0_sp1.1 from training. Duration: 0.92725 2023-10-02 23:33:11,136 WARNING [train.py:1204] (2/4) Exclude cut with ID _7562_210_2084_1_1528887767916_3946229_186-326422-0 from training. Duration: 0.84 2023-10-02 23:33:12,521 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_14056_210_4163_1_1530604483_7572827_230-35993-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 23:33:16,772 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_23854_210_1794_1_1531969696_1329157_57-123491-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 23:33:18,543 WARNING [train.py:1204] (2/4) Exclude cut with ID _14747_210_8341_1_1530685797463_3654089_147-131229-0 from training. Duration: 0.76 2023-10-02 23:33:20,022 WARNING [train.py:1204] (2/4) Exclude cut with ID _30191_210_1664_1_1533189915047_7196670_430-19539-0_sp1.1 from training. Duration: 0.92725 2023-10-02 23:33:20,038 WARNING [train.py:1204] (2/4) Exclude cut with ID _67725_210_27767_1_1535608756671_5105359_44-50762-0_sp1.1 from training. Duration: 0.963625 2023-10-02 23:33:22,335 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.0.layers.0.nonlin_attention.whiten2, num_groups=1, num_channels=192, metric=4.76 vs. limit=15.0 2023-10-02 23:33:22,801 WARNING [train.py:1204] (2/4) Exclude cut with ID _27485_210_4916_1_1532739191677_3668269_217-161755-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 23:33:24,074 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_5931_210_2040_1_1527428481_4547999_321-193614-0_sp1.1 from training. Duration: 0.6090625 2023-10-02 23:33:24,090 WARNING [train.py:1204] (2/4) Exclude cut with ID _6267_210_2084_1_1527764658526_3894730_375-219887-0 from training. Duration: 0.67 2023-10-02 23:33:24,175 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9087W0022-538393 from training. Duration: 0.768 2023-10-02 23:33:27,338 WARNING [train.py:1204] (2/4) Exclude cut with ID _68777_210_4353_1_1535810390731_3117904_83-196459-0_sp1.1 from training. Duration: 0.963625 2023-10-02 23:33:28,816 WARNING [train.py:1204] (2/4) Exclude cut with ID _69884_210_9771_1_1535846392818_4166169_455-343812-0_sp1.1 from training. Duration: 0.92725 2023-10-02 23:33:28,817 WARNING [train.py:1204] (2/4) Exclude cut with ID _13916_210_9994_1_1530788341126_3763149_414-320174-0_sp1.1 from training. Duration: 0.92725 2023-10-02 23:33:28,821 WARNING [train.py:1204] (2/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_398-239819-0 from training. Duration: 0.99 2023-10-02 23:33:28,922 WARNING [train.py:1204] (2/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_199-232314-0_sp1.1 from training. Duration: 0.92725 2023-10-02 23:33:32,953 WARNING [train.py:1204] (2/4) Exclude cut with ID _2334_210_2667_1_1525608238880_4705129_459-104997-0 from training. Duration: 0.95 2023-10-02 23:33:34,175 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.532e+02 1.856e+02 2.084e+02 2.252e+02 3.247e+02, threshold=4.168e+02, percent-clipped=0.0 2023-10-02 23:33:35,820 WARNING [train.py:1204] (2/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_441-209395-0_sp1.1 from training. Duration: 0.936375 2023-10-02 23:33:36,067 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.self_attn_weights.pos_emb_skip_rate, batch_count=1058613.3333333333, ans=0.0 2023-10-02 23:33:37,128 WARNING [train.py:1204] (2/4) Exclude cut with ID _27052_210_5986_1_1532329197187_3585009_320-80393-0_sp1.1 from training. Duration: 0.97275 2023-10-02 23:33:38,556 INFO [train.py:1046] (2/4) Epoch 30, batch 4750, loss[loss=0.1761, simple_loss=0.2527, pruned_loss=0.04982, over 23760.00 frames. ], tot_loss[loss=0.1639, simple_loss=0.2424, pruned_loss=0.04267, over 4703321.98 frames. ], batch size: 179, lr: 3.38e-03, grad_scale: 8.0 2023-10-02 23:33:40,164 WARNING [train.py:1204] (2/4) Exclude cut with ID _21783_210_11357_1_1531742458730_3762679_89-217253-0_sp1.1 from training. Duration: 0.97275 2023-10-02 23:33:41,415 WARNING [train.py:1204] (2/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_453-282588-0_sp1.1 from training. Duration: 0.8909375 2023-10-02 23:33:42,820 WARNING [train.py:1204] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_88-341718-0 from training. Duration: 0.66 2023-10-02 23:33:42,872 WARNING [train.py:1204] (2/4) Exclude cut with ID _43552_210_8913_1_1533726031059_3523429_230-3521-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 23:33:46,880 WARNING [train.py:1204] (2/4) Exclude cut with ID _9679_210_7497_1_1529129212906_4363929_512-313889-0 from training. Duration: 0.81 2023-10-02 23:33:48,695 WARNING [train.py:1204] (2/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_194-252800-0_sp1.1 from training. Duration: 0.8818125 2023-10-02 23:33:48,730 WARNING [train.py:1204] (2/4) Exclude cut with ID _55209_210_1531_1_1534675828996_4597160_239-293541-0_sp1.1 from training. Duration: 0.963625 2023-10-02 23:33:50,095 WARNING [train.py:1204] (2/4) Exclude cut with ID _35799_210_5854_1_1533263487887_3529071_15-325131-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 23:33:54,313 WARNING [train.py:1204] (2/4) Exclude cut with ID _14100_210_8341_1_1530529320613_3678069_279-55760-0 from training. Duration: 0.93 2023-10-02 23:33:58,875 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_489-230703-0_sp1.1 from training. Duration: 0.77275 2023-10-02 23:34:02,122 WARNING [train.py:1204] (2/4) Exclude cut with ID _6694_210_4599_1_1527852637490_3965529_144-185051-0 from training. Duration: 0.96 2023-10-02 23:34:02,183 WARNING [train.py:1204] (2/4) Exclude cut with ID _28461_210_2253_1_1532771775189_3041299_1-228155-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 23:34:05,098 WARNING [train.py:1204] (2/4) Exclude cut with ID _32704_210_8954_1_1533017488840_6747370_686-252323-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 23:34:05,101 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_40896_210_15710_1_1533433956_7100802_1075-294633-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 23:34:06,424 WARNING [train.py:1204] (2/4) Exclude cut with ID _22083_210_8254_1_1531702862439_3678970_30-92879-0_sp1.1 from training. Duration: 0.97275 2023-10-02 23:34:06,500 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9245W0121-616888 from training. Duration: 0.896 2023-10-02 23:34:06,503 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6786_210_1388_1_1527839699_4194495_378-46070-0 from training. Duration: 0.72 2023-10-02 23:34:11,982 WARNING [train.py:1204] (2/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_317-175062-0 from training. Duration: 0.82 2023-10-02 23:34:13,395 WARNING [train.py:1204] (2/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_238-89390-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 23:34:16,125 WARNING [train.py:1204] (2/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_524-310811-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 23:34:17,091 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.0.layers.1.conv_module2.whiten, num_groups=1, num_channels=192, metric=10.00 vs. limit=15.0 2023-10-02 23:34:19,431 WARNING [train.py:1204] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_307-158462-0_sp0.9 from training. Duration: 0.7666875 2023-10-02 23:34:19,432 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9309W0191-640318 from training. Duration: 0.512 2023-10-02 23:34:19,436 WARNING [train.py:1204] (2/4) Exclude cut with ID _55153_210_2253_1_1534384786677_3162096_100-204617-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 23:34:19,644 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.attention_skip_rate, batch_count=1058813.3333333333, ans=0.0 2023-10-02 23:34:22,286 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.feed_forward3.hidden_balancer.prob, batch_count=1058880.0, ans=0.125 2023-10-02 23:34:22,356 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.conv_module1.balancer2.prob, batch_count=1058880.0, ans=0.125 2023-10-02 23:34:23,475 WARNING [train.py:1204] (2/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_112-149213-0_sp1.1 from training. Duration: 0.863625 2023-10-02 23:34:26,310 WARNING [train.py:1204] (2/4) Exclude cut with ID _67831_210_12392_1_1535612464202_3092612_346-2148-0_sp1.1 from training. Duration: 0.8454375 2023-10-02 23:34:27,756 WARNING [train.py:1204] (2/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_51-117576-0_sp0.9 from training. Duration: 0.4444375 2023-10-02 23:34:27,788 WARNING [train.py:1204] (2/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_300-27235-0 from training. Duration: 0.87 2023-10-02 23:34:27,826 WARNING [train.py:1204] (2/4) Exclude cut with ID _40399_210_4353_1_1533305658194_3279406_195-116825-0_sp1.1 from training. Duration: 0.97275 2023-10-02 23:34:29,719 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7002_210_1794_1_1527939683_5157286_163-87552-0_sp0.9 from training. Duration: 0.9555625 2023-10-02 23:34:29,765 WARNING [train.py:1204] (2/4) Exclude cut with ID _19514_210_9259_1_1531530071330_5084230_560-237597-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 23:34:31,133 WARNING [train.py:1204] (2/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_41-234353-0_sp1.1 from training. Duration: 0.6181875 2023-10-02 23:34:31,152 WARNING [train.py:1204] (2/4) Exclude cut with ID _6274_210_2084_1_1527937583837_7494020_769-168302-0 from training. Duration: 0.71 2023-10-02 23:34:35,086 WARNING [train.py:1204] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_574-45541-0 from training. Duration: 0.65 2023-10-02 23:34:36,455 WARNING [train.py:1204] (2/4) Exclude cut with ID _50310_210_15488_1_1534068043345_3641080_262-67120-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 23:34:39,207 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_53071_210_16554_1_1534493434_1276796_35-11927-0_sp1.1 from training. Duration: 0.963625 2023-10-02 23:34:39,209 WARNING [train.py:1204] (2/4) Exclude cut with ID _3992_210_1747_1_1526644816031_3731060_107-251434-0 from training. Duration: 0.99 2023-10-02 23:34:39,253 WARNING [train.py:1204] (2/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_412-54488-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 23:34:40,732 WARNING [train.py:1204] (2/4) Exclude cut with ID _18516_210_9206_1_1531704653206_3692406_77-258490-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 23:34:40,926 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.bypass.scale_min, batch_count=1058946.6666666667, ans=0.2 2023-10-02 23:34:42,063 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_40047_210_2290_1_1533290519_2374311_195-165693-0_sp0.9 from training. Duration: 0.988875 2023-10-02 23:34:42,105 WARNING [train.py:1204] (2/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_296-36971-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 23:34:43,511 WARNING [train.py:1204] (2/4) Exclude cut with ID _14970_210_15709_1_1530773543493_1715730_187-20259-0_sp1.1 from training. Duration: 0.8090625 2023-10-02 23:34:45,133 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_53373_210_5399_1_1534229471_4715695_135-305927-0_sp1.1 from training. Duration: 0.936375 2023-10-02 23:34:46,373 WARNING [train.py:1204] (2/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_60-18123-0 from training. Duration: 0.88 2023-10-02 23:34:46,456 WARNING [train.py:1204] (2/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_166-166990-0 from training. Duration: 0.96 2023-10-02 23:34:47,865 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_5438_210_1794_1_1527336687_2600009_233-58072-0 from training. Duration: 0.77 2023-10-02 23:34:50,691 WARNING [train.py:1204] (2/4) Exclude cut with ID _8833_210_5219_1_1528592665210_7553059_611-80313-0_sp0.9 from training. Duration: 0.711125 2023-10-02 23:34:50,715 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_53555_210_5632_1_1534595220_7232297_118-43239-0_sp1.1 from training. Duration: 0.936375 2023-10-02 23:34:52,584 INFO [train.py:1046] (2/4) Epoch 30, batch 4800, loss[loss=0.1923, simple_loss=0.2623, pruned_loss=0.06113, over 22785.00 frames. ], tot_loss[loss=0.1653, simple_loss=0.2437, pruned_loss=0.04348, over 4687318.69 frames. ], batch size: 322, lr: 3.38e-03, grad_scale: 16.0 2023-10-02 23:34:52,681 WARNING [train.py:1204] (2/4) Exclude cut with ID _12369_210_4381_1_1530356078581_4388520_480-13146-0 from training. Duration: 0.85 2023-10-02 23:34:56,911 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.balancer2.prob, batch_count=1059013.3333333333, ans=0.125 2023-10-02 23:35:00,092 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_892-28814-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 23:35:00,145 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_24899_210_16554_1_1532046422_3827462_160-21255-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 23:35:05,487 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_154-316450-0_sp1.1 from training. Duration: 0.6909375 2023-10-02 23:35:06,803 WARNING [train.py:1204] (2/4) Exclude cut with ID _30196_210_1664_1_1533708496522_7074140_1225-301232-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 23:35:08,143 WARNING [train.py:1204] (2/4) Exclude cut with ID _28783_210_7943_1_1532671164808_7155479_414-208100-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 23:35:08,195 WARNING [train.py:1204] (2/4) Exclude cut with ID _19023_210_4353_1_1531378892552_5820780_83-254794-0 from training. Duration: 0.86 2023-10-02 23:35:09,494 WARNING [train.py:1204] (2/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_591-83752-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 23:35:09,540 WARNING [train.py:1204] (2/4) Exclude cut with ID _29877_210_6228_1_1532397791106_5276149_266-62291-0_sp1.1 from training. Duration: 0.7909375 2023-10-02 23:35:10,895 WARNING [train.py:1204] (2/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_280-161633-0_sp1.1 from training. Duration: 0.663625 2023-10-02 23:35:15,035 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7543_210_6753_1_1528539468_333764_39-156634-0_sp1.1 from training. Duration: 0.97275 2023-10-02 23:35:16,426 WARNING [train.py:1204] (2/4) Exclude cut with ID _67499_210_17282_1_1535509805220_3692430_505-312248-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 23:35:16,449 WARNING [train.py:1204] (2/4) Exclude cut with ID _5294_210_1747_1_1527249643928_3696179_180-100713-0_sp0.9 from training. Duration: 0.9 2023-10-02 23:35:17,925 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.1.conv_module1.balancer1.prob, batch_count=1059080.0, ans=0.125 2023-10-02 23:35:19,117 WARNING [train.py:1204] (2/4) Exclude cut with ID _26528_210_9070_1_1532570410842_3851310_508-148132-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 23:35:19,127 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_211-339622-0_sp1.1 from training. Duration: 0.4090625 2023-10-02 23:35:19,143 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_40896_210_15710_1_1533433956_7100802_339-138356-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 23:35:20,445 WARNING [train.py:1204] (2/4) Exclude cut with ID _27060_210_5986_1_1532674838922_3522549_211-19933-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 23:35:20,717 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.balancer2.prob, batch_count=1059146.6666666667, ans=0.125 2023-10-02 23:35:23,288 WARNING [train.py:1204] (2/4) Exclude cut with ID _60229_210_27767_1_1534838406158_3655350_457-117923-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 23:35:26,479 WARNING [train.py:1204] (2/4) Exclude cut with ID _55429_210_10626_1_1534467588527_7174143_460-141439-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 23:35:27,818 WARNING [train.py:1204] (2/4) Exclude cut with ID _59170_210_6721_1_1534983633960_4203335_263-10780-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 23:35:27,830 WARNING [train.py:1204] (2/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_872-264525-0_sp0.9 from training. Duration: 0.72225 2023-10-02 23:35:29,228 WARNING [train.py:1204] (2/4) Exclude cut with ID _7624_210_1531_1_1528109802198_4933101_418-349834-0_sp0.9 from training. Duration: 0.6666875 2023-10-02 23:35:29,338 WARNING [train.py:1204] (2/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_27-53998-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 23:35:30,736 WARNING [train.py:1204] (2/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_202-183922-0 from training. Duration: 0.85 2023-10-02 23:35:30,748 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7002_210_1794_1_1527939683_5157286_1-281264-0 from training. Duration: 0.44 2023-10-02 23:35:32,748 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_53753_210_7234_1_1534320003_3594148_493-9314-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 23:35:32,768 WARNING [train.py:1204] (2/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_217-95056-0_sp1.1 from training. Duration: 0.8909375 2023-10-02 23:35:34,466 WARNING [train.py:1204] (2/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_406-45086-0_sp1.1 from training. Duration: 0.72725 2023-10-02 23:35:34,472 WARNING [train.py:1204] (2/4) Exclude cut with ID _31649_210_6228_1_1532584608063_4091078_505-213821-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 23:35:34,480 WARNING [train.py:1204] (2/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_317-175062-0_sp0.9 from training. Duration: 0.911125 2023-10-02 23:35:36,365 WARNING [train.py:1204] (2/4) Exclude cut with ID _5444_210_2151_1_1527418808464_5312210_726-27165-0_sp1.1 from training. Duration: 0.6090625 2023-10-02 23:35:36,409 WARNING [train.py:1204] (2/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_494-170437-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 23:35:40,577 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_474-64054-0_sp1.1 from training. Duration: 0.936375 2023-10-02 23:35:43,183 WARNING [train.py:1204] (2/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_521-7556-0_sp1.1 from training. Duration: 0.97275 2023-10-02 23:35:44,578 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_269-348505-0_sp1.1 from training. Duration: 0.92725 2023-10-02 23:35:47,321 WARNING [train.py:1204] (2/4) Exclude cut with ID _21878_210_3525_1_1531738772002_7264330_559-184862-0 from training. Duration: 0.94 2023-10-02 23:35:48,639 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_68702_210_23689_1_1535799577_3420068_92-76040-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 23:35:48,675 WARNING [train.py:1204] (2/4) Exclude cut with ID _22826_210_9994_1_1531908201522_3279169_227-102052-0_sp1.1 from training. Duration: 0.97275 2023-10-02 23:35:48,715 WARNING [train.py:1204] (2/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_718-250907-0_sp0.9 from training. Duration: 0.7666875 2023-10-02 23:35:50,120 WARNING [train.py:1204] (2/4) Exclude cut with ID _14069_210_5632_1_1530512692033_4803390_352-143891-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 23:35:54,143 WARNING [train.py:1204] (2/4) Exclude cut with ID _6267_210_2084_1_1527764658526_3894730_65-106602-0_sp1.1 from training. Duration: 0.936375 2023-10-02 23:35:54,214 WARNING [train.py:1204] (2/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_202-183922-0_sp0.9 from training. Duration: 0.9444375 2023-10-02 23:35:54,226 WARNING [train.py:1204] (2/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_465-268827-0_sp1.1 from training. Duration: 0.97275 2023-10-02 23:35:56,052 WARNING [train.py:1204] (2/4) Exclude cut with ID _12369_210_4381_1_1530356078581_4388520_58-29064-0_sp1.1 from training. Duration: 0.9 2023-10-02 23:35:56,106 WARNING [train.py:1204] (2/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_1-99680-0_sp0.9 from training. Duration: 0.7444375 2023-10-02 23:35:56,152 WARNING [train.py:1204] (2/4) Exclude cut with ID _13253_210_1925_1_1530757775819_3577873_191-144977-0_sp1.1 from training. Duration: 0.7090625 2023-10-02 23:36:00,103 WARNING [train.py:1204] (2/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_276-112684-0_sp1.1 from training. Duration: 0.92725 2023-10-02 23:36:00,122 WARNING [train.py:1204] (2/4) Exclude cut with ID _64639_210_5816_1_1535367619437_3528000_358-411-0_sp1.1 from training. Duration: 0.97275 2023-10-02 23:36:01,343 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.546e+02 1.838e+02 2.029e+02 2.262e+02 3.175e+02, threshold=4.059e+02, percent-clipped=0.0 2023-10-02 23:36:01,427 WARNING [train.py:1204] (2/4) Exclude cut with ID _31649_210_6228_1_1532584608063_4091078_106-21199-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 23:36:03,343 WARNING [train.py:1204] (2/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_901-79438-0 from training. Duration: 0.94 2023-10-02 23:36:04,813 WARNING [train.py:1204] (2/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_718-250907-0 from training. Duration: 0.69 2023-10-02 23:36:04,819 WARNING [train.py:1204] (2/4) Exclude cut with ID _5287_210_2084_1_1527418883040_3878670_363-194161-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 23:36:04,824 WARNING [train.py:1204] (2/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_586-321321-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 23:36:06,103 INFO [train.py:1046] (2/4) Epoch 30, batch 4850, loss[loss=0.1706, simple_loss=0.248, pruned_loss=0.04656, over 23299.00 frames. ], tot_loss[loss=0.1658, simple_loss=0.2437, pruned_loss=0.04392, over 4695061.01 frames. ], batch size: 93, lr: 3.38e-03, grad_scale: 16.0 2023-10-02 23:36:06,203 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_68900_210_2087_1_1535713157_3708529_164-292661-0_sp1.1 from training. Duration: 0.963625 2023-10-02 23:36:06,204 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_26433_210_12108_1_1532157158_4056480_114-144878-0_sp1.1 from training. Duration: 0.97275 2023-10-02 23:36:08,677 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.4.encoder.layers.2.self_attn_weights.whiten_keys, num_groups=4, num_channels=128, metric=1.71 vs. limit=6.0 2023-10-02 23:36:09,572 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_15517_210_6753_1_1530954008_3487336_96-341874-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 23:36:16,348 WARNING [train.py:1204] (2/4) Exclude cut with ID _14268_210_9206_1_1530622771285_4714099_637-86630-0 from training. Duration: 0.51 2023-10-02 23:36:17,717 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_28211_210_6388_1_1532861506_4523224_121-320092-0_sp1.1 from training. Duration: 0.92725 2023-10-02 23:36:21,915 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_141-89113-0_sp0.9 from training. Duration: 0.9555625 2023-10-02 23:36:23,239 WARNING [train.py:1204] (2/4) Exclude cut with ID _6789_210_1534_1_1527857937218_3118950_311-61262-0_sp1.1 from training. Duration: 0.5909375 2023-10-02 23:36:23,277 WARNING [train.py:1204] (2/4) Exclude cut with ID _58480_210_12392_1_1534811121111_3646160_389-200461-0_sp1.1 from training. Duration: 0.97275 2023-10-02 23:36:26,994 WARNING [train.py:1204] (2/4) Exclude cut with ID _41326_210_5854_1_1533778076509_3492080_28-156553-0_sp1.1 from training. Duration: 0.92725 2023-10-02 23:36:28,393 WARNING [train.py:1204] (2/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_577-287150-0_sp1.1 from training. Duration: 0.7454375 2023-10-02 23:36:29,678 WARNING [train.py:1204] (2/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_40-129756-0_sp1.1 from training. Duration: 0.736375 2023-10-02 23:36:29,688 WARNING [train.py:1204] (2/4) Exclude cut with ID _60113_210_16271_1_1535164212481_4386079_490-280290-0 from training. Duration: 0.98 2023-10-02 23:36:33,825 WARNING [train.py:1204] (2/4) Exclude cut with ID _58059_210_5854_1_1534901471149_3164808_34-39905-0_sp1.1 from training. Duration: 0.963625 2023-10-02 23:36:36,348 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_791-197891-0_sp1.1 from training. Duration: 0.763625 2023-10-02 23:36:36,370 WARNING [train.py:1204] (2/4) Exclude cut with ID _6789_210_1534_1_1527857937218_3118950_262-48853-0_sp1.1 from training. Duration: 0.5090625 2023-10-02 23:36:37,732 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_2655_210_2087_1_1525688959_3897400_52-279899-0_sp0.9 from training. Duration: 0.7666875 2023-10-02 23:36:37,736 WARNING [train.py:1204] (2/4) Exclude cut with ID _14884_210_2252_1_1530707459689_5561250_114-201399-0 from training. Duration: 0.76 2023-10-02 23:36:40,537 WARNING [train.py:1204] (2/4) Exclude cut with ID _19023_210_4353_1_1531378892552_5820780_83-254794-0_sp0.9 from training. Duration: 0.9555625 2023-10-02 23:36:40,552 WARNING [train.py:1204] (2/4) Exclude cut with ID _53489_210_6228_1_1534298357169_3835970_10-297919-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 23:36:46,476 WARNING [train.py:1204] (2/4) Exclude cut with ID _44585_210_18628_1_1533605350387_4518199_418-3055-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 23:36:46,496 WARNING [train.py:1204] (2/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_310-109461-0 from training. Duration: 0.8 2023-10-02 23:36:46,538 WARNING [train.py:1204] (2/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_235-262124-0 from training. Duration: 0.67 2023-10-02 23:36:46,780 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.feed_forward1.out_proj.dropout_p, batch_count=1059480.0, ans=0.1 2023-10-02 23:36:48,042 WARNING [train.py:1204] (2/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_195-167011-0_sp1.1 from training. Duration: 0.6818125 2023-10-02 23:36:54,874 WARNING [train.py:1204] (2/4) Exclude cut with ID _7852_210_5219_1_1528286471316_8139400_188-55162-0_sp1.1 from training. Duration: 0.936375 2023-10-02 23:36:56,215 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_35361_210_4238_1_1533262810_7793476_675-87012-0 from training. Duration: 0.89 2023-10-02 23:36:56,292 WARNING [train.py:1204] (2/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_465-81427-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 23:36:56,299 WARNING [train.py:1204] (2/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_597-154640-0_sp1.1 from training. Duration: 0.8181875 2023-10-02 23:36:57,703 WARNING [train.py:1204] (2/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_186-309161-0_sp0.9 from training. Duration: 0.6 2023-10-02 23:36:59,574 WARNING [train.py:1204] (2/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_546-119312-0 from training. Duration: 0.99 2023-10-02 23:36:59,578 WARNING [train.py:1204] (2/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_330-240060-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 23:37:02,167 WARNING [train.py:1204] (2/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_477-255782-0 from training. Duration: 0.82 2023-10-02 23:37:02,198 WARNING [train.py:1204] (2/4) Exclude cut with ID _14970_210_15709_1_1530773543493_1715730_150-248939-0_sp1.1 from training. Duration: 0.97275 2023-10-02 23:37:03,627 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_556-206615-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 23:37:05,511 WARNING [train.py:1204] (2/4) Exclude cut with ID _3465_210_3525_1_1526175667060_3574049_308-231735-0 from training. Duration: 0.73 2023-10-02 23:37:12,937 WARNING [train.py:1204] (2/4) Exclude cut with ID _27485_210_4916_1_1532739191677_3668269_66-236571-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 23:37:17,769 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_2221_210_1988_1_1525597051_4935541_415-296882-0_sp0.9 from training. Duration: 0.8555625 2023-10-02 23:37:19,055 WARNING [train.py:1204] (2/4) Exclude cut with ID _64521_210_14699_1_1535243427259_3815328_231-78630-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 23:37:20,377 INFO [train.py:1046] (2/4) Epoch 30, batch 4900, loss[loss=0.1533, simple_loss=0.2169, pruned_loss=0.04481, over 23607.00 frames. ], tot_loss[loss=0.1651, simple_loss=0.2427, pruned_loss=0.04378, over 4698010.30 frames. ], batch size: 256, lr: 3.38e-03, grad_scale: 16.0 2023-10-02 23:37:23,244 WARNING [train.py:1204] (2/4) Exclude cut with ID _13942_210_9988_1_1530604906036_3733360_499-55676-0 from training. Duration: 0.62 2023-10-02 23:37:23,247 WARNING [train.py:1204] (2/4) Exclude cut with ID _49376_210_5986_1_1533985278732_3370740_301-154763-0_sp1.1 from training. Duration: 0.8909375 2023-10-02 23:37:26,163 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.conv_skip_rate, batch_count=1059680.0, ans=0.0 2023-10-02 23:37:28,766 WARNING [train.py:1204] (2/4) Exclude cut with ID _27485_210_4916_1_1532739191677_3668269_255-216698-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 23:37:28,865 WARNING [train.py:1204] (2/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_137-334143-0_sp1.1 from training. Duration: 0.97275 2023-10-02 23:37:28,886 WARNING [train.py:1204] (2/4) Exclude cut with ID _6456_210_1534_1_1527681634212_3181049_215-309359-0_sp1.1 from training. Duration: 0.8 2023-10-02 23:37:32,099 WARNING [train.py:1204] (2/4) Exclude cut with ID _65640_210_6310_1_1535368201445_3173250_209-13008-0 from training. Duration: 0.99 2023-10-02 23:37:36,263 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.1.feed_forward1.hidden_balancer.prob, batch_count=1059746.6666666667, ans=0.125 2023-10-02 23:37:38,614 WARNING [train.py:1204] (2/4) Exclude cut with ID _12369_210_4381_1_1530356078581_4388520_58-29064-0 from training. Duration: 0.99 2023-10-02 23:37:38,789 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.bypass_mid.scale_min, batch_count=1059746.6666666667, ans=0.2 2023-10-02 23:37:42,728 WARNING [train.py:1204] (2/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_410-287857-0 from training. Duration: 0.93 2023-10-02 23:37:42,811 WARNING [train.py:1204] (2/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_153-153537-0 from training. Duration: 0.91 2023-10-02 23:37:42,848 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7544_210_6753_1_1528587722_3576686_172-171750-0_sp0.9 from training. Duration: 0.72225 2023-10-02 23:37:42,877 WARNING [train.py:1204] (2/4) Exclude cut with ID _64521_210_14699_1_1535243427259_3815328_464-183853-0_sp1.1 from training. Duration: 0.97275 2023-10-02 23:37:44,147 WARNING [train.py:1204] (2/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_385-210308-0_sp1.1 from training. Duration: 0.963625 2023-10-02 23:37:44,155 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_46013_210_7234_1_1533776362_3744032_354-261865-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 23:37:44,162 WARNING [train.py:1204] (2/4) Exclude cut with ID _3067_210_2667_1_1526212974640_4503168_250-285937-0_sp1.1 from training. Duration: 0.72725 2023-10-02 23:37:44,226 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_40036_210_13405_1_1533648647_1315388_48-146016-0 from training. Duration: 0.93 2023-10-02 23:37:47,538 WARNING [train.py:1204] (2/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_280-161633-0 from training. Duration: 0.73 2023-10-02 23:37:48,891 WARNING [train.py:1204] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_308-161067-0_sp0.9 from training. Duration: 0.7333125 2023-10-02 23:37:49,001 WARNING [train.py:1204] (2/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_68-212801-0_sp0.9 from training. Duration: 0.811125 2023-10-02 23:37:50,292 WARNING [train.py:1204] (2/4) Exclude cut with ID _6789_210_1534_1_1527857937218_3118950_311-61262-0_sp0.9 from training. Duration: 0.72225 2023-10-02 23:37:51,715 WARNING [train.py:1204] (2/4) Exclude cut with ID _14632_210_9988_1_1530691279392_3923200_102-204477-0_sp1.1 from training. Duration: 0.8818125 2023-10-02 23:37:53,081 WARNING [train.py:1204] (2/4) Exclude cut with ID _65242_210_18628_1_1535369393717_3823149_290-117517-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 23:37:55,625 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_69630_210_7234_1_1535788670_910989_35-337140-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 23:37:55,633 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_5618_210_1874_1_1527311008_5851_1-214501-0 from training. Duration: 0.689 2023-10-02 23:37:57,068 WARNING [train.py:1204] (2/4) Exclude cut with ID _8064_210_5399_1_1528372299291_4282973_168-50337-0_sp0.9 from training. Duration: 0.9333125 2023-10-02 23:37:58,890 WARNING [train.py:1204] (2/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_185-14438-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 23:37:58,914 WARNING [train.py:1204] (2/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_61-327062-0 from training. Duration: 0.81 2023-10-02 23:37:58,920 WARNING [train.py:1204] (2/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_98-741-0 from training. Duration: 0.8 2023-10-02 23:38:00,477 WARNING [train.py:1204] (2/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_312-204798-0 from training. Duration: 0.93 2023-10-02 23:38:03,144 WARNING [train.py:1204] (2/4) Exclude cut with ID _14998_210_5219_1_1530705582080_7474970_787-294416-0_sp0.9 from training. Duration: 0.911125 2023-10-02 23:38:05,160 WARNING [train.py:1204] (2/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_140-181176-0_sp1.1 from training. Duration: 0.836375 2023-10-02 23:38:05,196 WARNING [train.py:1204] (2/4) Exclude cut with ID _14687_210_5854_1_1530703838259_3664889_115-239046-0_sp1.1 from training. Duration: 0.6090625 2023-10-02 23:38:05,244 WARNING [train.py:1204] (2/4) Exclude cut with ID _56423_210_5854_1_1534896061049_3165969_370-230614-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 23:38:06,534 WARNING [train.py:1204] (2/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_406-275649-0_sp0.9 from training. Duration: 0.4666875 2023-10-02 23:38:06,551 WARNING [train.py:1204] (2/4) Exclude cut with ID _15150_210_8341_1_1530752411803_7256384_22-138274-0_sp1.1 from training. Duration: 0.87275 2023-10-02 23:38:08,314 WARNING [train.py:1204] (2/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_125-125818-0 from training. Duration: 0.96 2023-10-02 23:38:10,022 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.ff2_skip_rate, batch_count=1059880.0, ans=0.0 2023-10-02 23:38:11,128 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_4399_210_1874_1_1526705485_2400757_220-47974-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 23:38:12,474 WARNING [train.py:1204] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_308-161067-0_sp1.1 from training. Duration: 0.6 2023-10-02 23:38:13,880 WARNING [train.py:1204] (2/4) Exclude cut with ID _53489_210_6228_1_1534298357169_3835970_102-57645-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 23:38:18,628 WARNING [train.py:1204] (2/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_178-177582-0 from training. Duration: 0.74 2023-10-02 23:38:19,948 WARNING [train.py:1204] (2/4) Exclude cut with ID _14349_210_8341_1_1530615710818_3442470_170-336541-0_sp1.1 from training. Duration: 0.7818125 2023-10-02 23:38:20,015 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_521-214276-0_sp0.9 from training. Duration: 0.488875 2023-10-02 23:38:20,061 WARNING [train.py:1204] (2/4) Exclude cut with ID _66070_210_7640_1_1535441393096_7805310_489-292631-0 from training. Duration: 0.79 2023-10-02 23:38:27,040 WARNING [train.py:1204] (2/4) Exclude cut with ID _14069_210_5632_1_1530512692033_4803390_438-340023-0_sp1.1 from training. Duration: 0.936375 2023-10-02 23:38:28,292 WARNING [train.py:1204] (2/4) Exclude cut with ID _7852_210_5219_1_1528286471316_8139400_242-105336-0_sp1.1 from training. Duration: 0.6545625 2023-10-02 23:38:30,096 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.624e+02 1.877e+02 2.022e+02 2.304e+02 3.994e+02, threshold=4.045e+02, percent-clipped=0.0 2023-10-02 23:38:30,214 WARNING [train.py:1204] (2/4) Exclude cut with ID _3867_210_1883_1_1526641200575_4475830_272-215563-0 from training. Duration: 0.57 2023-10-02 23:38:30,222 WARNING [train.py:1204] (2/4) Exclude cut with ID _14368_210_4220_1_1530532714870_3595529_143-117421-0_sp0.9 from training. Duration: 0.6444375 2023-10-02 23:38:30,241 WARNING [train.py:1204] (2/4) Exclude cut with ID _9235_210_5219_1_1528891154253_8046120_30-326681-0_sp1.1 from training. Duration: 0.7545625 2023-10-02 23:38:31,768 WARNING [train.py:1204] (2/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_538-218448-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 23:38:34,462 INFO [train.py:1046] (2/4) Epoch 30, batch 4950, loss[loss=0.1582, simple_loss=0.2476, pruned_loss=0.0344, over 24670.00 frames. ], tot_loss[loss=0.1646, simple_loss=0.2422, pruned_loss=0.04352, over 4703331.31 frames. ], batch size: 73, lr: 3.38e-03, grad_scale: 16.0 2023-10-02 23:38:34,620 WARNING [train.py:1204] (2/4) Exclude cut with ID _33640_210_2655_1_1533117605479_4855469_60-192970-0_sp1.1 from training. Duration: 0.963625 2023-10-02 23:38:34,626 WARNING [train.py:1204] (2/4) Exclude cut with ID _65585_210_12210_1_1535450451584_3764098_112-294301-0_sp1.1 from training. Duration: 0.8 2023-10-02 23:38:34,649 WARNING [train.py:1204] (2/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_643-37412-0_sp1.1 from training. Duration: 0.936375 2023-10-02 23:38:35,915 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_9302_210_2550_1_1528889962_4655416_638-301620-0_sp1.1 from training. Duration: 0.42725 2023-10-02 23:38:37,852 WARNING [train.py:1204] (2/4) Exclude cut with ID _3867_210_1883_1_1526641200575_4475830_272-215563-0_sp1.1 from training. Duration: 0.5181875 2023-10-02 23:38:42,234 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_53679_210_4852_1_1534556726_4186141_358-154143-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 23:38:42,252 WARNING [train.py:1204] (2/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_841-341679-0_sp0.9 from training. Duration: 0.6444375 2023-10-02 23:38:42,441 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=1060013.3333333333, ans=0.1 2023-10-02 23:38:45,107 WARNING [train.py:1204] (2/4) Exclude cut with ID _7160_210_1876_1_1527940923231_3703799_302-150728-0 from training. Duration: 0.78 2023-10-02 23:38:45,135 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_125-203105-0 from training. Duration: 0.8 2023-10-02 23:38:45,157 WARNING [train.py:1204] (2/4) Exclude cut with ID _5585_210_2667_1_1527418813324_8007340_89-332072-0_sp0.9 from training. Duration: 0.711125 2023-10-02 23:38:46,620 WARNING [train.py:1204] (2/4) Exclude cut with ID _3992_210_1747_1_1526644816031_3731060_36-147855-0 from training. Duration: 0.78 2023-10-02 23:38:46,652 WARNING [train.py:1204] (2/4) Exclude cut with ID _59256_210_5986_1_1535076085997_3589120_172-239045-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 23:38:46,661 WARNING [train.py:1204] (2/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_472-216663-0_sp1.1 from training. Duration: 0.836375 2023-10-02 23:38:46,717 WARNING [train.py:1204] (2/4) Exclude cut with ID _36512_210_6987_1_1533033143998_3126069_181-306500-0_sp1.1 from training. Duration: 0.863625 2023-10-02 23:38:46,938 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.conv_skip_rate, batch_count=1060013.3333333333, ans=0.0 2023-10-02 23:38:48,613 WARNING [train.py:1204] (2/4) Exclude cut with ID _56524_210_9642_1_1534572011779_3619170_492-95008-0_sp1.1 from training. Duration: 0.97275 2023-10-02 23:38:49,975 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_33348_210_5632_1_1533086912_3324030_119-90326-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 23:38:50,125 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.attention_skip_rate, batch_count=1060080.0, ans=0.0 2023-10-02 23:38:51,329 WARNING [train.py:1204] (2/4) Exclude cut with ID _14368_210_4220_1_1530532714870_3595529_231-137892-0_sp1.1 from training. Duration: 0.9 2023-10-02 23:38:51,435 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8453_210_4211_1_1528502940_8562730_596-306820-0_sp1.1 from training. Duration: 0.8818125 2023-10-02 23:38:52,761 WARNING [train.py:1204] (2/4) Exclude cut with ID _27052_210_5986_1_1532329197187_3585009_50-242750-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 23:38:54,113 WARNING [train.py:1204] (2/4) Exclude cut with ID _13913_210_9994_1_1530579615187_3383897_358-98435-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 23:38:54,127 WARNING [train.py:1204] (2/4) Exclude cut with ID _27013_210_1389_1_1532437173240_3252649_399-229778-0_sp1.1 from training. Duration: 0.963625 2023-10-02 23:38:56,912 WARNING [train.py:1204] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_185-174805-0_sp0.9 from training. Duration: 0.7555625 2023-10-02 23:39:02,822 WARNING [train.py:1204] (2/4) Exclude cut with ID _65585_210_12210_1_1535450451584_3764098_196-221014-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 23:39:04,335 WARNING [train.py:1204] (2/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_202-293523-0_sp1.1 from training. Duration: 0.6545625 2023-10-02 23:39:05,700 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8275_210_4198_1_1528455320_3892571_213-127634-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 23:39:07,051 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_40896_210_15710_1_1533433956_7100802_921-231678-0_sp1.1 from training. Duration: 0.97275 2023-10-02 23:39:08,895 WARNING [train.py:1204] (2/4) Exclude cut with ID _38020_210_5986_1_1533112252259_7006089_493-102745-0_sp1.1 from training. Duration: 0.8909375 2023-10-02 23:39:09,003 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6846_210_6753_1_1527850767_4295158_2-315073-0 from training. Duration: 0.88 2023-10-02 23:39:10,393 WARNING [train.py:1204] (2/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_249-281349-0 from training. Duration: 0.85 2023-10-02 23:39:13,601 WARNING [train.py:1204] (2/4) Exclude cut with ID _18513_210_9206_1_1531618215223_3862620_314-83429-0_sp1.1 from training. Duration: 0.97275 2023-10-02 23:39:15,056 WARNING [train.py:1204] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_215-202973-0_sp0.9 from training. Duration: 0.97775 2023-10-02 23:39:15,067 WARNING [train.py:1204] (2/4) Exclude cut with ID _7719_210_3525_1_1528456153163_4296960_88-250259-0_sp1.1 from training. Duration: 0.763625 2023-10-02 23:39:16,418 WARNING [train.py:1204] (2/4) Exclude cut with ID _7606_210_4915_1_1528894253277_5080690_301-10732-0_sp1.1 from training. Duration: 0.736375 2023-10-02 23:39:16,427 WARNING [train.py:1204] (2/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_818-11907-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 23:39:17,925 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_5959_210_1794_1_1527400400_3445726_412-44052-0_sp1.1 from training. Duration: 0.7 2023-10-02 23:39:21,091 WARNING [train.py:1204] (2/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_6-279195-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 23:39:22,540 WARNING [train.py:1204] (2/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_252-216674-0_sp0.9 from training. Duration: 0.6 2023-10-02 23:39:23,958 WARNING [train.py:1204] (2/4) Exclude cut with ID _14368_210_4220_1_1530532714870_3595529_144-3287-0_sp1.1 from training. Duration: 0.6090625 2023-10-02 23:39:24,133 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.ff3_skip_rate, batch_count=1060213.3333333333, ans=0.0 2023-10-02 23:39:26,556 WARNING [train.py:1204] (2/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_342-3879-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 23:39:26,579 WARNING [train.py:1204] (2/4) Exclude cut with ID _33640_210_2655_1_1533117605479_4855469_584-65630-0_sp1.1 from training. Duration: 0.97275 2023-10-02 23:39:27,983 WARNING [train.py:1204] (2/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_37-189262-0 from training. Duration: 0.98 2023-10-02 23:39:28,019 WARNING [train.py:1204] (2/4) Exclude cut with ID _9389_210_1664_1_1529217337301_5289269_379-128794-0_sp0.9 from training. Duration: 0.9444375 2023-10-02 23:39:29,474 WARNING [train.py:1204] (2/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_119-207162-0_sp1.1 from training. Duration: 0.6454375 2023-10-02 23:39:32,945 WARNING [train.py:1204] (2/4) Exclude cut with ID _2941_210_2577_1_1526199673150_2489759_200-333643-0_sp1.1 from training. Duration: 0.936375 2023-10-02 23:39:34,410 WARNING [train.py:1204] (2/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_479-125611-0_sp1.1 from training. Duration: 0.87275 2023-10-02 23:39:34,426 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_3475_210_2028_1_1526193578_3622569_64-297072-0_sp1.1 from training. Duration: 0.82725 2023-10-02 23:39:35,741 WARNING [train.py:1204] (2/4) Exclude cut with ID _53565_210_10635_1_1534337218335_3820259_139-346291-0_sp1.1 from training. Duration: 0.97275 2023-10-02 23:39:35,788 WARNING [train.py:1204] (2/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_588-125165-0_sp1.1 from training. Duration: 0.7545625 2023-10-02 23:39:35,844 WARNING [train.py:1204] (2/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_938-205135-0_sp1.1 from training. Duration: 0.7909375 2023-10-02 23:39:39,147 WARNING [train.py:1204] (2/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_216-48707-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 23:39:40,433 WARNING [train.py:1204] (2/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_477-255782-0_sp1.1 from training. Duration: 0.7454375 2023-10-02 23:39:40,468 WARNING [train.py:1204] (2/4) Exclude cut with ID _16909_210_5724_1_1531292079912_4118919_583-92842-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 23:39:41,766 WARNING [train.py:1204] (2/4) Exclude cut with ID _60860_210_8254_1_1534896015875_3557169_245-256666-0 from training. Duration: 0.97 2023-10-02 23:39:44,950 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_45399_210_4852_1_1533624725_7579737_349-116173-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 23:39:48,818 INFO [train.py:1046] (2/4) Epoch 30, batch 5000, loss[loss=0.143, simple_loss=0.2263, pruned_loss=0.02987, over 24600.00 frames. ], tot_loss[loss=0.1639, simple_loss=0.2414, pruned_loss=0.04324, over 4709341.34 frames. ], batch size: 60, lr: 3.38e-03, grad_scale: 16.0 2023-10-02 23:39:50,427 WARNING [train.py:1204] (2/4) Exclude cut with ID _8699_210_2069_1_1528630346831_3960409_155-39766-0 from training. Duration: 0.78 2023-10-02 23:39:50,438 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_86-16881-0_sp1.1 from training. Duration: 0.52725 2023-10-02 23:39:57,977 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_14056_210_4163_1_1530604483_7572827_191-159384-0_sp1.1 from training. Duration: 0.97275 2023-10-02 23:39:57,984 WARNING [train.py:1204] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_307-158462-0_sp1.1 from training. Duration: 0.62725 2023-10-02 23:39:59,332 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7544_210_6753_1_1528591327_1471426_33-346031-0 from training. Duration: 0.61 2023-10-02 23:39:59,955 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.3.encoder.layers.1.conv_module2.whiten, num_groups=1, num_channels=512, metric=3.07 vs. limit=15.0 2023-10-02 23:40:00,673 WARNING [train.py:1204] (2/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_112-149213-0 from training. Duration: 0.95 2023-10-02 23:40:04,032 WARNING [train.py:1204] (2/4) Exclude cut with ID _55844_210_2346_1_1534471212569_7625707_925-191342-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 23:40:04,163 WARNING [train.py:1204] (2/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_87-278686-0 from training. Duration: 0.77 2023-10-02 23:40:05,431 WARNING [train.py:1204] (2/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_415-78792-0_sp1.1 from training. Duration: 0.736375 2023-10-02 23:40:05,447 WARNING [train.py:1204] (2/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_107-332398-0_sp0.9 from training. Duration: 0.8666875 2023-10-02 23:40:06,802 WARNING [train.py:1204] (2/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_72-251255-0 from training. Duration: 0.94 2023-10-02 23:40:06,832 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_431-227273-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 23:40:06,887 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7544_210_6753_1_1528587000_691468_87-178599-0_sp0.9 from training. Duration: 0.9666875 2023-10-02 23:40:08,214 WARNING [train.py:1204] (2/4) Exclude cut with ID _13916_210_9994_1_1530788341126_3763149_60-191556-0 from training. Duration: 0.61 2023-10-02 23:40:08,216 WARNING [train.py:1204] (2/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_746-164782-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 23:40:09,539 WARNING [train.py:1204] (2/4) Exclude cut with ID _33640_210_2655_1_1533117605479_4855469_596-21066-0_sp1.1 from training. Duration: 0.92725 2023-10-02 23:40:09,653 WARNING [train.py:1204] (2/4) Exclude cut with ID _40402_210_18219_1_1533522324115_2953150_320-14078-0 from training. Duration: 0.95 2023-10-02 23:40:10,949 WARNING [train.py:1204] (2/4) Exclude cut with ID _29877_210_6228_1_1532397791106_5276149_266-62291-0 from training. Duration: 0.87 2023-10-02 23:40:12,813 WARNING [train.py:1204] (2/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_176-56120-0_sp0.9 from training. Duration: 0.92225 2023-10-02 23:40:14,196 WARNING [train.py:1204] (2/4) Exclude cut with ID _6275_210_2084_1_1528110062213_7669049_91-26919-0 from training. Duration: 0.97 2023-10-02 23:40:14,210 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_224-322394-0_sp0.9 from training. Duration: 0.7333125 2023-10-02 23:40:14,254 WARNING [train.py:1204] (2/4) Exclude cut with ID _44982_210_1511_1_1534312820108_3726029_586-297059-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 23:40:14,299 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_196-289637-0_sp1.1 from training. Duration: 0.6818125 2023-10-02 23:40:14,300 WARNING [train.py:1204] (2/4) Exclude cut with ID _18513_210_9206_1_1531618215223_3862620_402-5957-0 from training. Duration: 0.99 2023-10-02 23:40:14,306 WARNING [train.py:1204] (2/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_81-300212-0 from training. Duration: 0.6 2023-10-02 23:40:14,509 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.attention_skip_rate, batch_count=1060413.3333333333, ans=0.0 2023-10-02 23:40:17,713 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.2.encoder.layers.0.feed_forward3.out_whiten, num_groups=1, num_channels=384, metric=13.87 vs. limit=15.0 2023-10-02 23:40:18,273 WARNING [train.py:1204] (2/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_166-128941-0 from training. Duration: 0.54 2023-10-02 23:40:18,288 WARNING [train.py:1204] (2/4) Exclude cut with ID _58059_210_5854_1_1534901471149_3164808_57-309660-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 23:40:18,362 WARNING [train.py:1204] (2/4) Exclude cut with ID _60621_210_17071_1_1535013053530_3671739_73-155814-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 23:40:19,759 WARNING [train.py:1204] (2/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_726-270780-0 from training. Duration: 0.86 2023-10-02 23:40:19,771 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8853_210_3273_1_1528633899_818968_51-187685-0_sp1.1 from training. Duration: 0.62725 2023-10-02 23:40:21,140 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_315-212794-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 23:40:21,229 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_53373_210_5399_1_1534229471_4715695_314-122795-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 23:40:22,608 WARNING [train.py:1204] (2/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_187-221566-0_sp0.9 from training. Duration: 0.47775 2023-10-02 23:40:25,714 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_12868_210_6753_1_1530702479_3610564_133-564-0 from training. Duration: 0.79 2023-10-02 23:40:25,765 WARNING [train.py:1204] (2/4) Exclude cut with ID _55153_210_2253_1_1534384786677_3162096_255-228344-0_sp1.1 from training. Duration: 0.936375 2023-10-02 23:40:27,279 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.nonlin_attention.balancer.prob, batch_count=1060480.0, ans=0.125 2023-10-02 23:40:28,353 WARNING [train.py:1204] (2/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_383-165234-0_sp1.1 from training. Duration: 0.8545625 2023-10-02 23:40:31,152 WARNING [train.py:1204] (2/4) Exclude cut with ID IC0677W0056-334889 from training. Duration: 0.768 2023-10-02 23:40:34,630 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_14953_210_2151_1_1530701710_5616165_710-165400-0_sp0.9 from training. Duration: 0.9666875 2023-10-02 23:40:34,718 WARNING [train.py:1204] (2/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_355-236205-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 23:40:34,719 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_34450_210_9547_1_1533099443_4109050_503-246893-0_sp1.1 from training. Duration: 0.963625 2023-10-02 23:40:34,879 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.1.conv_module2.balancer2.prob, batch_count=1060546.6666666667, ans=0.125 2023-10-02 23:40:37,385 WARNING [train.py:1204] (2/4) Exclude cut with ID _43552_210_8913_1_1533726031059_3523429_25-120161-0 from training. Duration: 0.98 2023-10-02 23:40:37,417 WARNING [train.py:1204] (2/4) Exclude cut with ID _66112_210_10635_1_1535590236305_3801484_205-311930-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 23:40:37,901 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.4.encoder.layers.2.conv_module2.whiten, num_groups=1, num_channels=384, metric=3.16 vs. limit=15.0 2023-10-02 23:40:38,717 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_48049_210_5632_1_1533864553_3519937_185-281218-0_sp1.1 from training. Duration: 0.92725 2023-10-02 23:40:38,759 WARNING [train.py:1204] (2/4) Exclude cut with ID _48782_210_12658_1_1534467701544_2300422_191-332415-0_sp1.1 from training. Duration: 0.97275 2023-10-02 23:40:40,084 WARNING [train.py:1204] (2/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_907-151398-0_sp1.1 from training. Duration: 0.336375 2023-10-02 23:40:41,268 WARNING [train.py:1204] (2/4) Exclude cut with ID _67499_210_17282_1_1535509805220_3692430_20-179346-0_sp1.1 from training. Duration: 0.8818125 2023-10-02 23:40:45,258 WARNING [train.py:1204] (2/4) Exclude cut with ID _14998_210_5219_1_1530705582080_7474970_441-91466-0_sp1.1 from training. Duration: 0.8818125 2023-10-02 23:40:46,537 WARNING [train.py:1204] (2/4) Exclude cut with ID _46681_210_9642_1_1533893005571_7603359_786-164007-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 23:40:50,777 WARNING [train.py:1204] (2/4) Exclude cut with ID _14970_210_15709_1_1530773543493_1715730_187-20259-0 from training. Duration: 0.89 2023-10-02 23:40:55,397 WARNING [train.py:1204] (2/4) Exclude cut with ID _12734_210_8341_1_1530183614296_3891929_383-339075-0_sp1.1 from training. Duration: 0.963625 2023-10-02 23:40:57,151 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.conv_skip_rate, batch_count=1060613.3333333333, ans=0.0 2023-10-02 23:40:58,137 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.575e+02 1.837e+02 2.079e+02 2.443e+02 4.073e+02, threshold=4.157e+02, percent-clipped=1.0 2023-10-02 23:41:02,686 INFO [train.py:1046] (2/4) Epoch 30, batch 5050, loss[loss=0.1701, simple_loss=0.2587, pruned_loss=0.0408, over 24689.00 frames. ], tot_loss[loss=0.1635, simple_loss=0.2416, pruned_loss=0.04274, over 4724077.16 frames. ], batch size: 73, lr: 3.38e-03, grad_scale: 16.0 2023-10-02 23:41:04,191 WARNING [train.py:1204] (2/4) Exclude cut with ID _37086_210_9038_1_1532999540891_7021250_834-322225-0_sp1.1 from training. Duration: 0.92725 2023-10-02 23:41:04,299 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_10987_210_4163_1_1529760555_4123902_56-290559-0_sp1.1 from training. Duration: 0.963625 2023-10-02 23:41:05,548 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_92-246270-0_sp0.9 from training. Duration: 0.9444375 2023-10-02 23:41:05,566 WARNING [train.py:1204] (2/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_56-13561-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 23:41:05,603 WARNING [train.py:1204] (2/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_393-57888-0_sp1.1 from training. Duration: 0.5818125 2023-10-02 23:41:06,803 WARNING [train.py:1204] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_168-32906-0_sp1.1 from training. Duration: 0.62725 2023-10-02 23:41:06,862 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_51225_210_3601_1_1534400634_2327383_127-207553-0_sp1.1 from training. Duration: 0.963625 2023-10-02 23:41:11,126 WARNING [train.py:1204] (2/4) Exclude cut with ID _58583_210_5236_1_1534726835266_3574249_494-203521-0_sp1.1 from training. Duration: 0.963625 2023-10-02 23:41:11,146 WARNING [train.py:1204] (2/4) Exclude cut with ID _49376_210_5986_1_1533985278732_3370740_354-344695-0 from training. Duration: 0.99 2023-10-02 23:41:11,783 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.4.encoder.layers.0.conv_module1.whiten, num_groups=1, num_channels=384, metric=4.62 vs. limit=15.0 2023-10-02 23:41:12,548 WARNING [train.py:1204] (2/4) Exclude cut with ID _8021_210_7994_1_1528376451358_3653069_179-45206-0_sp1.1 from training. Duration: 0.7818125 2023-10-02 23:41:15,800 WARNING [train.py:1204] (2/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_742-154017-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 23:41:17,695 WARNING [train.py:1204] (2/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_222-198127-0_sp1.1 from training. Duration: 0.763625 2023-10-02 23:41:17,737 WARNING [train.py:1204] (2/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_235-307337-0 from training. Duration: 0.94 2023-10-02 23:41:18,980 WARNING [train.py:1204] (2/4) Exclude cut with ID _8393_210_6702_1_1528505860249_3457328_453-176500-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 23:41:19,024 WARNING [train.py:1204] (2/4) Exclude cut with ID _53565_210_10635_1_1534337218335_3820259_102-179528-0_sp1.1 from training. Duration: 0.97275 2023-10-02 23:41:20,479 WARNING [train.py:1204] (2/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_43-206757-0_sp1.1 from training. Duration: 0.6818125 2023-10-02 23:41:21,833 WARNING [train.py:1204] (2/4) Exclude cut with ID _8394_210_6702_1_1528590504316_3540189_335-274780-0_sp1.1 from training. Duration: 0.7090625 2023-10-02 23:41:22,039 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.feed_forward1.out_proj.dropout_p, batch_count=1060746.6666666667, ans=0.1 2023-10-02 23:41:23,148 WARNING [train.py:1204] (2/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_128-296581-0_sp1.1 from training. Duration: 0.67275 2023-10-02 23:41:32,361 WARNING [train.py:1204] (2/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_111-112362-0 from training. Duration: 0.91 2023-10-02 23:41:32,402 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_5522_210_3273_1_1527249606_3909096_366-172508-0_sp1.1 from training. Duration: 0.6 2023-10-02 23:41:34,277 WARNING [train.py:1204] (2/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_47-167373-0_sp1.1 from training. Duration: 0.836375 2023-10-02 23:41:34,322 WARNING [train.py:1204] (2/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_41-234353-0 from training. Duration: 0.68 2023-10-02 23:41:34,347 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6291_210_3273_1_1527683649_1078498_82-221196-0_sp0.9 from training. Duration: 0.8444375 2023-10-02 23:41:34,449 WARNING [train.py:1204] (2/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_47-4814-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 23:41:35,724 WARNING [train.py:1204] (2/4) Exclude cut with ID _43541_210_10626_1_1533796233418_3635440_40-247993-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 23:41:37,150 WARNING [train.py:1204] (2/4) Exclude cut with ID _21989_210_5854_1_1531899214931_3271360_133-44759-0_sp0.9 from training. Duration: 0.8555625 2023-10-02 23:41:37,153 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_94-195801-0 from training. Duration: 0.94 2023-10-02 23:41:37,240 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_141-89113-0 from training. Duration: 0.86 2023-10-02 23:41:38,593 WARNING [train.py:1204] (2/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_683-284578-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 23:41:39,999 WARNING [train.py:1204] (2/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_6-214815-0_sp1.1 from training. Duration: 0.77275 2023-10-02 23:41:44,121 WARNING [train.py:1204] (2/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_556-212082-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 23:41:44,173 WARNING [train.py:1204] (2/4) Exclude cut with ID _44902_210_16187_1_1533888006062_3792078_149-67212-0 from training. Duration: 0.87 2023-10-02 23:41:47,375 WARNING [train.py:1204] (2/4) Exclude cut with ID _61148_210_6897_1_1535094022726_4217610_333-159440-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 23:41:48,817 WARNING [train.py:1204] (2/4) Exclude cut with ID _3120_210_3525_1_1526036143112_4072980_404-249994-0 from training. Duration: 0.99 2023-10-02 23:41:48,915 WARNING [train.py:1204] (2/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_234-260682-0_sp0.9 from training. Duration: 0.9333125 2023-10-02 23:41:48,934 WARNING [train.py:1204] (2/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_374-138971-0_sp1.1 from training. Duration: 0.8545625 2023-10-02 23:41:50,377 WARNING [train.py:1204] (2/4) Exclude cut with ID _53713_210_24116_1_1534314487762_7416751_743-302085-0_sp1.1 from training. Duration: 0.92725 2023-10-02 23:41:51,782 WARNING [train.py:1204] (2/4) Exclude cut with ID _18516_210_9206_1_1531704653206_3692406_198-334870-0_sp1.1 from training. Duration: 0.836375 2023-10-02 23:41:51,904 WARNING [train.py:1204] (2/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_395-73570-0_sp1.1 from training. Duration: 0.9 2023-10-02 23:41:55,041 WARNING [train.py:1204] (2/4) Exclude cut with ID _46683_210_9642_1_1533730913492_4591116_421-19547-0_sp1.1 from training. Duration: 0.936375 2023-10-02 23:41:55,085 WARNING [train.py:1204] (2/4) Exclude cut with ID _19896_210_1883_1_1531709022662_6649569_714-8668-0_sp1.1 from training. Duration: 0.963625 2023-10-02 23:41:56,390 WARNING [train.py:1204] (2/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_49-101072-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 23:41:56,404 WARNING [train.py:1204] (2/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_220-191080-0_sp1.1 from training. Duration: 0.8909375 2023-10-02 23:41:56,445 WARNING [train.py:1204] (2/4) Exclude cut with ID _3465_210_3525_1_1526175667060_3574049_62-335552-0 from training. Duration: 0.87 2023-10-02 23:41:56,640 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.bypass_mid.scale_min, batch_count=1060880.0, ans=0.2 2023-10-02 23:41:57,779 WARNING [train.py:1204] (2/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_111-112362-0_sp1.1 from training. Duration: 0.82725 2023-10-02 23:41:59,203 WARNING [train.py:1204] (2/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_318-59097-0_sp0.9 from training. Duration: 0.8444375 2023-10-02 23:42:05,254 WARNING [train.py:1204] (2/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_98-255710-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 23:42:05,260 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9042W0003-515609 from training. Duration: 0.896 2023-10-02 23:42:05,276 WARNING [train.py:1204] (2/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_158-250116-0_sp0.9 from training. Duration: 0.688875 2023-10-02 23:42:06,588 WARNING [train.py:1204] (2/4) Exclude cut with ID _26750_210_5665_1_1532226678442_4063230_7-346885-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 23:42:06,625 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_37839_210_7234_1_1533542462_3689303_357-227078-0_sp1.1 from training. Duration: 0.963625 2023-10-02 23:42:07,990 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9052W0119-520904 from training. Duration: 0.896 2023-10-02 23:42:09,397 WARNING [train.py:1204] (2/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_249-281349-0_sp1.1 from training. Duration: 0.77275 2023-10-02 23:42:09,402 WARNING [train.py:1204] (2/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_147-293311-0 from training. Duration: 0.94 2023-10-02 23:42:09,402 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_158-21166-0_sp1.1 from training. Duration: 0.963625 2023-10-02 23:42:12,439 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.ff2_skip_rate, batch_count=1060946.6666666667, ans=0.0 2023-10-02 23:42:13,579 WARNING [train.py:1204] (2/4) Exclude cut with ID _14100_210_8341_1_1530529320613_3678069_207-332194-0_sp1.1 from training. Duration: 0.92725 2023-10-02 23:42:13,614 WARNING [train.py:1204] (2/4) Exclude cut with ID _9389_210_1664_1_1529217337301_5289269_324-211031-0_sp1.1 from training. Duration: 0.963625 2023-10-02 23:42:13,625 WARNING [train.py:1204] (2/4) Exclude cut with ID _14603_210_4220_1_1530615688441_3097319_133-158308-0 from training. Duration: 0.84 2023-10-02 23:42:15,031 WARNING [train.py:1204] (2/4) Exclude cut with ID _13252_210_1925_1_1530671372047_3336909_166-314607-0 from training. Duration: 0.54 2023-10-02 23:42:15,193 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.0.nonlin_attention.balancer.prob, batch_count=1061013.3333333333, ans=0.125 2023-10-02 23:42:16,883 INFO [train.py:1046] (2/4) Epoch 30, batch 5100, loss[loss=0.1761, simple_loss=0.2442, pruned_loss=0.05396, over 23824.00 frames. ], tot_loss[loss=0.1644, simple_loss=0.243, pruned_loss=0.04289, over 4724862.24 frames. ], batch size: 212, lr: 3.38e-03, grad_scale: 8.0 2023-10-02 23:42:18,300 WARNING [train.py:1204] (2/4) Exclude cut with ID _32704_210_8954_1_1533017488840_6747370_649-79538-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 23:42:18,311 WARNING [train.py:1204] (2/4) Exclude cut with ID _28394_210_8913_1_1532430049518_3650118_343-115667-0_sp1.1 from training. Duration: 0.97275 2023-10-02 23:42:18,361 WARNING [train.py:1204] (2/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_87-278686-0_sp0.9 from training. Duration: 0.8555625 2023-10-02 23:42:19,874 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9109W0003-549749 from training. Duration: 0.768 2023-10-02 23:42:21,248 WARNING [train.py:1204] (2/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_126-29104-0_sp1.1 from training. Duration: 0.77275 2023-10-02 23:42:25,819 WARNING [train.py:1204] (2/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_325-113798-0 from training. Duration: 0.85 2023-10-02 23:42:25,875 WARNING [train.py:1204] (2/4) Exclude cut with ID _6694_210_4599_1_1527852637490_3965529_116-109398-0 from training. Duration: 0.73 2023-10-02 23:42:27,104 WARNING [train.py:1204] (2/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_46-57119-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 23:42:27,215 WARNING [train.py:1204] (2/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_546-119312-0_sp1.1 from training. Duration: 0.9 2023-10-02 23:42:30,424 WARNING [train.py:1204] (2/4) Exclude cut with ID _60057_210_10640_1_1534903065793_3333150_468-306939-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 23:42:31,809 WARNING [train.py:1204] (2/4) Exclude cut with ID _15150_210_8341_1_1530752411803_7256384_22-138274-0 from training. Duration: 0.96 2023-10-02 23:42:31,833 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_289-180904-0 from training. Duration: 0.76 2023-10-02 23:42:36,187 WARNING [train.py:1204] (2/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_64-133721-0_sp1.1 from training. Duration: 0.92725 2023-10-02 23:42:37,389 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_195-257512-0_sp1.1 from training. Duration: 0.6090625 2023-10-02 23:42:40,308 WARNING [train.py:1204] (2/4) Exclude cut with ID _66873_210_5517_1_1535454136506_4293370_704-101963-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 23:42:42,996 WARNING [train.py:1204] (2/4) Exclude cut with ID _4366_210_1388_1_1526792053633_3613819_231-211402-0 from training. Duration: 0.94 2023-10-02 23:42:44,341 WARNING [train.py:1204] (2/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_679-190385-0_sp1.1 from training. Duration: 0.97275 2023-10-02 23:42:46,280 WARNING [train.py:1204] (2/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_298-81459-0_sp1.1 from training. Duration: 0.963625 2023-10-02 23:42:47,312 WARNING [train.py:1204] (2/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_658-282610-0_sp0.9 from training. Duration: 0.588875 2023-10-02 23:42:50,720 WARNING [train.py:1204] (2/4) Exclude cut with ID _57572_210_5517_1_1534921219624_3809484_196-283734-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 23:42:52,119 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_53071_210_16554_1_1534490405_2954066_294-198554-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 23:42:52,125 WARNING [train.py:1204] (2/4) Exclude cut with ID _4866_210_2577_1_1527404412284_4364659_352-157930-0 from training. Duration: 0.81 2023-10-02 23:42:53,637 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9231W0113-610139 from training. Duration: 0.896 2023-10-02 23:42:54,957 WARNING [train.py:1204] (2/4) Exclude cut with ID _24271_210_12658_1_1531987270022_7190519_638-171906-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 23:42:56,279 WARNING [train.py:1204] (2/4) Exclude cut with ID _14170_210_5219_1_1530525949868_4469650_272-249985-0 from training. Duration: 0.67 2023-10-02 23:42:56,288 WARNING [train.py:1204] (2/4) Exclude cut with ID _64086_210_7994_1_1535270417019_7435949_449-31567-0 from training. Duration: 0.97 2023-10-02 23:42:58,406 WARNING [train.py:1204] (2/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_109-282568-0_sp1.1 from training. Duration: 0.97275 2023-10-02 23:42:59,948 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.conv_module2.balancer2.prob, batch_count=1061213.3333333333, ans=0.125 2023-10-02 23:43:04,694 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.conv_module1.balancer1.prob, batch_count=1061213.3333333333, ans=0.125 2023-10-02 23:43:07,299 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_121-45934-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 23:43:07,585 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.1.feed_forward1.out_proj.dropout_p, batch_count=1061213.3333333333, ans=0.1 2023-10-02 23:43:08,757 WARNING [train.py:1204] (2/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_314-126819-0 from training. Duration: 0.5 2023-10-02 23:43:08,787 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9309W0192-640319 from training. Duration: 0.512 2023-10-02 23:43:08,795 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9310W0003-640649 from training. Duration: 0.896 2023-10-02 23:43:11,460 WARNING [train.py:1204] (2/4) Exclude cut with ID _7160_210_1876_1_1527940923231_3703799_120-150943-0 from training. Duration: 0.81 2023-10-02 23:43:11,462 WARNING [train.py:1204] (2/4) Exclude cut with ID _40380_210_9059_1_1533380594759_4187380_6-33508-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 23:43:12,898 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.ff3_skip_rate, batch_count=1061213.3333333333, ans=0.0 2023-10-02 23:43:15,373 WARNING [train.py:1204] (2/4) Exclude cut with ID _59202_210_26984_1_1534816810612_3549174_137-318710-0 from training. Duration: 0.89 2023-10-02 23:43:18,871 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_435-313305-0 from training. Duration: 0.71 2023-10-02 23:43:20,880 WARNING [train.py:1204] (2/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_91-212486-0_sp1.1 from training. Duration: 0.6181875 2023-10-02 23:43:20,979 WARNING [train.py:1204] (2/4) Exclude cut with ID _14603_210_4220_1_1530615688441_3097319_133-158308-0_sp1.1 from training. Duration: 0.763625 2023-10-02 23:43:23,705 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_99-251314-0 from training. Duration: 0.87 2023-10-02 23:43:25,033 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_365-217119-0_sp1.1 from training. Duration: 0.57275 2023-10-02 23:43:25,070 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_3475_210_2028_1_1526193578_3622569_377-166133-0 from training. Duration: 0.86 2023-10-02 23:43:28,611 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.446e+02 1.879e+02 2.162e+02 2.700e+02 3.768e+02, threshold=4.325e+02, percent-clipped=0.0 2023-10-02 23:43:31,373 INFO [train.py:1046] (2/4) Epoch 30, batch 5150, loss[loss=0.1619, simple_loss=0.251, pruned_loss=0.0364, over 24638.00 frames. ], tot_loss[loss=0.1647, simple_loss=0.243, pruned_loss=0.04321, over 4725660.11 frames. ], batch size: 68, lr: 3.38e-03, grad_scale: 8.0 2023-10-02 23:43:32,809 WARNING [train.py:1204] (2/4) Exclude cut with ID _13916_210_9994_1_1530788341126_3763149_116-21034-0_sp1.1 from training. Duration: 0.87275 2023-10-02 23:43:32,830 WARNING [train.py:1204] (2/4) Exclude cut with ID _64220_210_9527_1_1535250151772_4875259_142-216831-0_sp1.1 from training. Duration: 0.936375 2023-10-02 23:43:32,830 WARNING [train.py:1204] (2/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_490-207861-0_sp1.1 from training. Duration: 0.8909375 2023-10-02 23:43:34,166 WARNING [train.py:1204] (2/4) Exclude cut with ID _41314_210_12651_1_1533459476543_3766460_87-272068-0_sp0.9 from training. Duration: 0.92225 2023-10-02 23:43:34,204 WARNING [train.py:1204] (2/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_18-46756-0_sp1.1 from training. Duration: 0.7181875 2023-10-02 23:43:36,115 WARNING [train.py:1204] (2/4) Exclude cut with ID _39411_210_5986_1_1533538825030_3343649_98-311335-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 23:43:36,185 WARNING [train.py:1204] (2/4) Exclude cut with ID _21878_210_3525_1_1531738772002_7264330_113-18163-0 from training. Duration: 0.89 2023-10-02 23:43:36,187 WARNING [train.py:1204] (2/4) Exclude cut with ID _6653_210_2629_1_1527919172441_6732679_1103-56445-0 from training. Duration: 0.73 2023-10-02 23:43:36,231 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_3474_210_2028_1_1526188513_4825246_145-309131-0 from training. Duration: 0.74 2023-10-02 23:43:36,247 WARNING [train.py:1204] (2/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_120-211630-0_sp1.1 from training. Duration: 0.836375 2023-10-02 23:43:36,258 WARNING [train.py:1204] (2/4) Exclude cut with ID _8030_210_7497_1_1528444886781_3531089_32-31837-0 from training. Duration: 0.97 2023-10-02 23:43:39,010 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_19949_210_2161_1_1531706337_928079_8-160956-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 23:43:39,067 WARNING [train.py:1204] (2/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_321-215742-0_sp1.1 from training. Duration: 0.4545625 2023-10-02 23:43:40,491 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_583-124650-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 23:43:40,672 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.self_attn_weights.pos_emb_skip_rate, batch_count=1061346.6666666667, ans=0.0 2023-10-02 23:43:41,952 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_30434_210_13049_1_1532519384_981746_164-182755-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 23:43:44,832 WARNING [train.py:1204] (2/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_339-104791-0_sp1.1 from training. Duration: 0.4454375 2023-10-02 23:43:44,850 WARNING [train.py:1204] (2/4) Exclude cut with ID _15026_210_8750_1_1530777602316_3291850_211-195043-0 from training. Duration: 0.8 2023-10-02 23:43:46,201 WARNING [train.py:1204] (2/4) Exclude cut with ID _29877_210_6228_1_1532397791106_5276149_487-182647-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 23:43:47,507 WARNING [train.py:1204] (2/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_127-566-0_sp0.9 from training. Duration: 0.9333125 2023-10-02 23:43:49,366 WARNING [train.py:1204] (2/4) Exclude cut with ID _8263_210_1664_1_1528439452891_970220_68-181805-0_sp0.9 from training. Duration: 0.711125 2023-10-02 23:43:49,368 WARNING [train.py:1204] (2/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_504-72513-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 23:43:49,386 WARNING [train.py:1204] (2/4) Exclude cut with ID _64639_210_5816_1_1535367619437_3528000_324-265233-0_sp1.1 from training. Duration: 0.963625 2023-10-02 23:43:50,839 WARNING [train.py:1204] (2/4) Exclude cut with ID _6456_210_1534_1_1527681634212_3181049_215-309359-0_sp0.9 from training. Duration: 0.97775 2023-10-02 23:43:50,844 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_115-233929-0_sp1.1 from training. Duration: 0.6545625 2023-10-02 23:43:50,900 WARNING [train.py:1204] (2/4) Exclude cut with ID _14087_210_2253_1_1530529224512_3014029_219-224445-0 from training. Duration: 0.84 2023-10-02 23:43:52,277 WARNING [train.py:1204] (2/4) Exclude cut with ID _14867_210_1968_1_1530684179890_3846969_413-29292-0_sp1.1 from training. Duration: 0.8181875 2023-10-02 23:43:52,324 WARNING [train.py:1204] (2/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_1012-279473-0_sp0.9 from training. Duration: 0.7444375 2023-10-02 23:43:53,781 WARNING [train.py:1204] (2/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_605-133395-0_sp0.9 from training. Duration: 0.7333125 2023-10-02 23:43:56,532 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_89-35748-0 from training. Duration: 0.79 2023-10-02 23:43:56,621 WARNING [train.py:1204] (2/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_476-272757-0_sp1.1 from training. Duration: 0.8454375 2023-10-02 23:43:56,826 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.conv_skip_rate, batch_count=1061413.3333333333, ans=0.0 2023-10-02 23:44:02,846 WARNING [train.py:1204] (2/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_449-15508-0_sp0.9 from training. Duration: 0.911125 2023-10-02 23:44:04,455 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=1061480.0, ans=0.1 2023-10-02 23:44:05,555 WARNING [train.py:1204] (2/4) Exclude cut with ID _29345_210_4381_1_1532689168263_3860169_198-174328-0 from training. Duration: 0.91 2023-10-02 23:44:07,150 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_15517_210_6753_1_1530952293_1664795_125-158157-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 23:44:15,332 WARNING [train.py:1204] (2/4) Exclude cut with ID _25565_210_12392_1_1532423009382_3952098_420-113180-0_sp1.1 from training. Duration: 0.963625 2023-10-02 23:44:16,675 WARNING [train.py:1204] (2/4) Exclude cut with ID _51471_210_1889_1_1534399391390_3094250_236-146773-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 23:44:19,594 WARNING [train.py:1204] (2/4) Exclude cut with ID _30039_210_13270_1_1532484612900_2725150_175-35012-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 23:44:21,419 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_23850_210_4238_1_1532232597_1632433_99-275843-0_sp1.1 from training. Duration: 0.936375 2023-10-02 23:44:21,608 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.1.ff2_skip_rate, batch_count=1061546.6666666667, ans=0.0 2023-10-02 23:44:24,656 WARNING [train.py:1204] (2/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_658-282610-0 from training. Duration: 0.53 2023-10-02 23:44:26,262 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_37818_210_4238_1_1533620910_4720083_234-65934-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 23:44:27,637 WARNING [train.py:1204] (2/4) Exclude cut with ID _3067_210_2667_1_1526212974640_4503168_459-173867-0_sp1.1 from training. Duration: 0.8 2023-10-02 23:44:27,666 WARNING [train.py:1204] (2/4) Exclude cut with ID _8696_210_1876_1_1528545632440_3345058_209-54069-0_sp0.9 from training. Duration: 0.7444375 2023-10-02 23:44:27,798 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.0.feed_forward1.hidden_balancer.prob, batch_count=1061546.6666666667, ans=0.125 2023-10-02 23:44:29,271 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.ff2_skip_rate, batch_count=1061613.3333333333, ans=0.0 2023-10-02 23:44:32,268 WARNING [train.py:1204] (2/4) Exclude cut with ID _62574_210_2655_1_1535180407058_3933249_420-236465-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 23:44:34,147 WARNING [train.py:1204] (2/4) Exclude cut with ID _46681_210_9642_1_1533893005571_7603359_333-4042-0_sp1.1 from training. Duration: 0.936375 2023-10-02 23:44:34,231 WARNING [train.py:1204] (2/4) Exclude cut with ID _3541_210_2477_1_1526475408709_3263973_377-296839-0 from training. Duration: 0.55 2023-10-02 23:44:38,413 WARNING [train.py:1204] (2/4) Exclude cut with ID _43997_210_4525_1_1533538632555_5189329_426-92599-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 23:44:41,043 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_43-71989-0_sp1.1 from training. Duration: 0.5181875 2023-10-02 23:44:43,852 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_2009_210_2071_1_1525481134_4588937_140-345530-0_sp1.1 from training. Duration: 0.963625 2023-10-02 23:44:43,870 WARNING [train.py:1204] (2/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_138-332773-0_sp0.9 from training. Duration: 0.9 2023-10-02 23:44:45,157 INFO [train.py:1046] (2/4) Epoch 30, batch 5200, loss[loss=0.1533, simple_loss=0.2249, pruned_loss=0.0408, over 23681.00 frames. ], tot_loss[loss=0.1648, simple_loss=0.2432, pruned_loss=0.04317, over 4735825.22 frames. ], batch size: 232, lr: 3.38e-03, grad_scale: 16.0 2023-10-02 23:44:45,219 WARNING [train.py:1204] (2/4) Exclude cut with ID _13942_210_9988_1_1530604906036_3733360_499-55676-0_sp0.9 from training. Duration: 0.688875 2023-10-02 23:44:45,235 WARNING [train.py:1204] (2/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_259-34038-0_sp1.1 from training. Duration: 0.67275 2023-10-02 23:44:45,254 WARNING [train.py:1204] (2/4) Exclude cut with ID _3120_210_3525_1_1526036143112_4072980_404-249994-0_sp1.1 from training. Duration: 0.9 2023-10-02 23:44:45,298 WARNING [train.py:1204] (2/4) Exclude cut with ID _40402_210_18219_1_1533522324115_2953150_222-108810-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 23:44:49,451 WARNING [train.py:1204] (2/4) Exclude cut with ID _22083_210_8254_1_1531702862439_3678970_49-234546-0_sp0.9 from training. Duration: 0.92225 2023-10-02 23:44:49,666 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.feed_forward3.hidden_balancer.prob, batch_count=1061680.0, ans=0.125 2023-10-02 23:44:50,861 WARNING [train.py:1204] (2/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_199-295611-0_sp0.9 from training. Duration: 0.611125 2023-10-02 23:44:52,376 WARNING [train.py:1204] (2/4) Exclude cut with ID _43552_210_8913_1_1533726031059_3523429_43-192945-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 23:44:52,942 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.4.encoder.layers.0.nonlin_attention.whiten2, num_groups=1, num_channels=384, metric=9.99 vs. limit=15.0 2023-10-02 23:44:57,606 WARNING [train.py:1204] (2/4) Exclude cut with ID _14224_210_4381_1_1530529062037_4264010_364-22817-0 from training. Duration: 0.71 2023-10-02 23:44:58,973 WARNING [train.py:1204] (2/4) Exclude cut with ID _14922_210_6228_1_1530707435273_3693310_388-129552-0_sp1.1 from training. Duration: 0.87275 2023-10-02 23:45:00,308 WARNING [train.py:1204] (2/4) Exclude cut with ID _6537_210_1883_1_1527987559710_3597129_190-280227-0_sp1.1 from training. Duration: 0.97275 2023-10-02 23:45:01,779 WARNING [train.py:1204] (2/4) Exclude cut with ID _55844_210_2346_1_1534471212569_7625707_398-56897-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 23:45:03,110 WARNING [train.py:1204] (2/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_48-182155-0_sp1.1 from training. Duration: 0.7909375 2023-10-02 23:45:03,139 WARNING [train.py:1204] (2/4) Exclude cut with ID _29529_210_7234_1_1532393961455_3977460_582-18084-0_sp1.1 from training. Duration: 0.97275 2023-10-02 23:45:03,259 WARNING [train.py:1204] (2/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_912-282412-0 from training. Duration: 0.49 2023-10-02 23:45:06,660 WARNING [train.py:1204] (2/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_332-256168-0_sp0.9 from training. Duration: 0.8333125 2023-10-02 23:45:06,723 WARNING [train.py:1204] (2/4) Exclude cut with ID _64220_210_9527_1_1535250151772_4875259_55-250624-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 23:45:09,445 WARNING [train.py:1204] (2/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_264-108685-0 from training. Duration: 0.55 2023-10-02 23:45:12,087 WARNING [train.py:1204] (2/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_613-232856-0_sp0.9 from training. Duration: 0.6 2023-10-02 23:45:13,381 WARNING [train.py:1204] (2/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_68-212801-0_sp1.1 from training. Duration: 0.663625 2023-10-02 23:45:13,431 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6290_210_3273_1_1527595224_1282787_115-87095-0 from training. Duration: 0.55 2023-10-02 23:45:13,466 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_3080_210_2071_1_1526086102_4365166_312-8726-0 from training. Duration: 0.91 2023-10-02 23:45:16,166 WARNING [train.py:1204] (2/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_105-63309-0 from training. Duration: 0.98 2023-10-02 23:45:17,479 WARNING [train.py:1204] (2/4) Exclude cut with ID _2941_210_2577_1_1526199673150_2489759_201-160779-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 23:45:17,483 WARNING [train.py:1204] (2/4) Exclude cut with ID ID0406W0226-885434 from training. Duration: 0.997 2023-10-02 23:45:17,491 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_17292_210_6753_1_1531125388_2943734_353-328201-0_sp1.1 from training. Duration: 0.97275 2023-10-02 23:45:18,410 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.0.layers.1.nonlin_attention.whiten2, num_groups=1, num_channels=192, metric=6.31 vs. limit=15.0 2023-10-02 23:45:18,868 WARNING [train.py:1204] (2/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_572-96029-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 23:45:18,903 WARNING [train.py:1204] (2/4) Exclude cut with ID _14970_210_15709_1_1530775547361_1966965_6-23328-0_sp0.9 from training. Duration: 0.9555625 2023-10-02 23:45:19,062 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.conv_module1.balancer2.prob, batch_count=1061813.3333333333, ans=0.125 2023-10-02 23:45:20,184 WARNING [train.py:1204] (2/4) Exclude cut with ID _45012_210_12651_1_1533608435136_5371380_240-37101-0 from training. Duration: 0.93 2023-10-02 23:45:20,246 WARNING [train.py:1204] (2/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_685-90183-0_sp1.1 from training. Duration: 0.936375 2023-10-02 23:45:20,455 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.balancer2.prob, batch_count=1061813.3333333333, ans=0.125 2023-10-02 23:45:23,624 WARNING [train.py:1204] (2/4) Exclude cut with ID _13252_210_1925_1_1530671372047_3336909_41-22245-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 23:45:23,789 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=1061813.3333333333, ans=0.1 2023-10-02 23:45:26,530 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_15126_210_6753_1_1530778677_2591185_120-277707-0 from training. Duration: 0.8 2023-10-02 23:45:26,564 WARNING [train.py:1204] (2/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_310-26186-0 from training. Duration: 0.7 2023-10-02 23:45:26,583 WARNING [train.py:1204] (2/4) Exclude cut with ID _14998_210_5219_1_1530705582080_7474970_787-294416-0 from training. Duration: 0.82 2023-10-02 23:45:32,002 WARNING [train.py:1204] (2/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_795-33252-0 from training. Duration: 0.37 2023-10-02 23:45:33,967 WARNING [train.py:1204] (2/4) Exclude cut with ID _7537_210_2084_1_1528502395431_3924359_113-199933-0_sp0.9 from training. Duration: 0.6555625 2023-10-02 23:45:38,874 WARNING [train.py:1204] (2/4) Exclude cut with ID _3465_210_3525_1_1526175667060_3574049_308-231735-0_sp0.9 from training. Duration: 0.811125 2023-10-02 23:45:38,893 WARNING [train.py:1204] (2/4) Exclude cut with ID _19504_210_9994_1_1531648863242_3325220_354-296765-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 23:45:40,295 WARNING [train.py:1204] (2/4) Exclude cut with ID _5409_210_1388_1_1527292850189_5865191_178-127970-0 from training. Duration: 0.57 2023-10-02 23:45:41,575 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_1466-248056-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 23:45:41,600 WARNING [train.py:1204] (2/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_535-14410-0_sp0.9 from training. Duration: 0.52225 2023-10-02 23:45:41,603 WARNING [train.py:1204] (2/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_747-129941-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 23:45:41,636 WARNING [train.py:1204] (2/4) Exclude cut with ID _15012_210_1968_1_1530856820429_5095350_688-178849-0_sp1.1 from training. Duration: 0.5818125 2023-10-02 23:45:45,664 WARNING [train.py:1204] (2/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_356-45054-0_sp1.1 from training. Duration: 0.7818125 2023-10-02 23:45:45,748 WARNING [train.py:1204] (2/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_146-276944-0_sp0.9 from training. Duration: 0.92225 2023-10-02 23:45:48,791 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.conv_module1.balancer2.min_positive, batch_count=1061946.6666666667, ans=0.05 2023-10-02 23:45:49,253 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.3.encoder.layers.1.whiten, num_groups=1, num_channels=512, metric=7.13 vs. limit=12.0 2023-10-02 23:45:49,947 WARNING [train.py:1204] (2/4) Exclude cut with ID _39204_210_4220_1_1533880547368_3191120_346-180306-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 23:45:51,308 WARNING [train.py:1204] (2/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_641-158017-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 23:45:51,309 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_19952_210_2161_1_1531875469_3851750_274-42137-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 23:45:55,889 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.546e+02 1.943e+02 2.112e+02 2.505e+02 3.885e+02, threshold=4.224e+02, percent-clipped=0.0 2023-10-02 23:45:56,005 WARNING [train.py:1204] (2/4) Exclude cut with ID _60804_210_9642_1_1534831199859_7268299_87-327819-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 23:45:57,765 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_9302_210_2550_1_1528889962_4655416_638-301620-0 from training. Duration: 0.47 2023-10-02 23:45:57,828 WARNING [train.py:1204] (2/4) Exclude cut with ID _44304_210_15374_1_1533639352765_4613129_302-22084-0_sp1.1 from training. Duration: 0.7818125 2023-10-02 23:45:57,839 WARNING [train.py:1204] (2/4) Exclude cut with ID _18513_210_9206_1_1531618215223_3862620_177-227063-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 23:45:59,154 INFO [train.py:1046] (2/4) Epoch 30, batch 5250, loss[loss=0.1684, simple_loss=0.2457, pruned_loss=0.0456, over 23448.00 frames. ], tot_loss[loss=0.1642, simple_loss=0.2423, pruned_loss=0.04302, over 4732840.19 frames. ], batch size: 93, lr: 3.38e-03, grad_scale: 16.0 2023-10-02 23:45:59,267 WARNING [train.py:1204] (2/4) Exclude cut with ID _41326_210_5854_1_1533778076509_3492080_510-45444-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 23:46:00,613 WARNING [train.py:1204] (2/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_76-311963-0_sp1.1 from training. Duration: 0.636375 2023-10-02 23:46:00,713 WARNING [train.py:1204] (2/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_156-302675-0_sp0.9 from training. Duration: 0.611125 2023-10-02 23:46:03,447 WARNING [train.py:1204] (2/4) Exclude cut with ID _53489_210_6228_1_1534298357169_3835970_367-148411-0_sp1.1 from training. Duration: 0.936375 2023-10-02 23:46:05,511 WARNING [train.py:1204] (2/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_389-144525-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 23:46:07,399 WARNING [train.py:1204] (2/4) Exclude cut with ID _14069_210_5632_1_1530512692033_4803390_158-238427-0_sp1.1 from training. Duration: 0.963625 2023-10-02 23:46:07,618 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.feed_forward2.hidden_balancer.prob, batch_count=1062013.3333333333, ans=0.125 2023-10-02 23:46:08,725 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_486-151131-0_sp0.9 from training. Duration: 0.9444375 2023-10-02 23:46:09,128 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=1062013.3333333333, ans=0.1 2023-10-02 23:46:11,595 WARNING [train.py:1204] (2/4) Exclude cut with ID _39846_210_12651_1_1533603651992_1318030_188-71831-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 23:46:12,194 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.3.encoder.layers.1.self_attn2.whiten, num_groups=1, num_channels=512, metric=15.69 vs. limit=22.5 2023-10-02 23:46:14,197 WARNING [train.py:1204] (2/4) Exclude cut with ID _39411_210_5986_1_1533538825030_3343649_173-238248-0_sp1.1 from training. Duration: 0.7545625 2023-10-02 23:46:16,954 WARNING [train.py:1204] (2/4) Exclude cut with ID _20684_210_6917_1_1531567751646_4203930_469-49081-0_sp1.1 from training. Duration: 0.92725 2023-10-02 23:46:17,115 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.nonlin_attention.balancer.prob, batch_count=1062080.0, ans=0.125 2023-10-02 23:46:18,594 INFO [scaling.py:1118] (2/4) WithLoss: name=encoder.encoders.4.encoder.layers.1.self_attn_weights, loss-sum=0.000e+00 2023-10-02 23:46:19,637 WARNING [train.py:1204] (2/4) Exclude cut with ID _8833_210_5219_1_1528592665210_7553059_611-80313-0_sp1.1 from training. Duration: 0.5818125 2023-10-02 23:46:19,764 WARNING [train.py:1204] (2/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_440-141811-0 from training. Duration: 0.95 2023-10-02 23:46:19,786 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_41211_210_2544_1_1533348859_3702910_236-223652-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 23:46:21,101 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_40222_210_6228_1_1533385298_4481939_220-163998-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 23:46:31,202 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.out_combiner.scale_min, batch_count=1062146.6666666667, ans=0.2 2023-10-02 23:46:45,696 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.bypass_mid.scale_min, batch_count=1062213.3333333333, ans=0.2 2023-10-02 23:46:45,789 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.bypass.skip_rate, batch_count=1062213.3333333333, ans=0.04949747468305833 2023-10-02 23:47:07,998 INFO [train.py:1046] (2/4) Epoch 30, batch 5300, loss[loss=0.1463, simple_loss=0.2259, pruned_loss=0.03331, over 24595.00 frames. ], tot_loss[loss=0.1632, simple_loss=0.2406, pruned_loss=0.04289, over 4713066.13 frames. ], batch size: 60, lr: 3.38e-03, grad_scale: 8.0 2023-10-02 23:47:15,954 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.bypass.skip_rate, batch_count=1062346.6666666667, ans=0.07 2023-10-02 23:47:17,326 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.conv_module2.balancer1.prob, batch_count=1062346.6666666667, ans=0.125 2023-10-02 23:47:21,849 WARNING [train.py:1204] (2/4) Exclude cut with ID _14629_210_15709_1_1530604298182_956899_6-232695-0_sp1.1 from training. Duration: 0.8545625 2023-10-02 23:47:21,870 WARNING [train.py:1204] (2/4) Exclude cut with ID _13921_210_9994_1_1530841782272_3503520_327-69677-0 from training. Duration: 0.9 2023-10-02 23:47:21,897 WARNING [train.py:1204] (2/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_372-298093-0 from training. Duration: 0.8 2023-10-02 23:47:21,910 WARNING [train.py:1204] (2/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_515-139111-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 23:47:22,478 WARNING [train.py:1204] (2/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_85-65056-0_sp1.1 from training. Duration: 0.87275 2023-10-02 23:47:22,519 WARNING [train.py:1204] (2/4) Exclude cut with ID _15113_210_8254_1_1530842438579_3716279_209-54533-0_sp1.1 from training. Duration: 0.87275 2023-10-02 23:47:22,567 WARNING [train.py:1204] (2/4) Exclude cut with ID _70447_210_12392_1_1535972499300_3160420_123-312838-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 23:47:22,594 WARNING [train.py:1204] (2/4) Exclude cut with ID _60113_210_16271_1_1535164212481_4386079_70-63596-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 23:47:22,622 WARNING [train.py:1204] (2/4) Exclude cut with ID _14632_210_9988_1_1530691279392_3923200_225-188509-0_sp1.1 from training. Duration: 0.963625 2023-10-02 23:47:22,669 WARNING [train.py:1204] (2/4) Exclude cut with ID _34734_210_4525_1_1533365977542_4448720_584-180532-0_sp1.1 from training. Duration: 0.97275 2023-10-02 23:47:22,682 WARNING [train.py:1204] (2/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_351-294650-0_sp1.1 from training. Duration: 0.57275 2023-10-02 23:47:23,001 WARNING [train.py:1204] (2/4) Exclude cut with ID _13252_210_1925_1_1530671372047_3336909_263-168388-0_sp0.9 from training. Duration: 0.9555625 2023-10-02 23:47:23,029 WARNING [train.py:1204] (2/4) Exclude cut with ID _22083_210_8254_1_1531702862439_3678970_201-85738-0 from training. Duration: 0.9 2023-10-02 23:47:23,120 WARNING [train.py:1204] (2/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_237-58433-0 from training. Duration: 0.74 2023-10-02 23:47:23,146 WARNING [train.py:1204] (2/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_262-250826-0 from training. Duration: 0.99 2023-10-02 23:47:23,247 WARNING [train.py:1204] (2/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_927-43827-0_sp0.9 from training. Duration: 0.82225 2023-10-02 23:47:23,278 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_2821_210_2715_1_1526108385_3763560_73-296673-0 from training. Duration: 0.83 2023-10-02 23:47:23,353 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6287_210_3273_1_1527509506_2653716_182-199887-0 from training. Duration: 0.51 2023-10-02 23:47:23,485 WARNING [train.py:1204] (2/4) Exclude cut with ID _70519_210_4163_1_1535961595994_7458230_899-80223-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 23:47:23,750 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_210-234949-0_sp1.1 from training. Duration: 0.97275 2023-10-02 23:47:23,788 WARNING [train.py:1204] (2/4) Exclude cut with ID _30196_210_1664_1_1533708496522_7074140_768-278176-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 23:47:23,873 WARNING [train.py:1204] (2/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_315-128283-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 23:47:23,958 WARNING [train.py:1204] (2/4) Exclude cut with ID _14886_210_4220_1_1530702020355_3516190_330-323941-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 23:47:24,582 WARNING [train.py:1204] (2/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_264-348896-0_sp1.1 from training. Duration: 0.77275 2023-10-02 23:47:24,603 WARNING [train.py:1204] (2/4) Exclude cut with ID _37086_210_9038_1_1532999540891_7021250_433-10563-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 23:47:24,638 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_53638_210_21581_1_1534297319_5808449_885-80172-0_sp1.1 from training. Duration: 0.87275 2023-10-02 23:47:24,723 WARNING [train.py:1204] (2/4) Exclude cut with ID _14623_210_1925_1_1530773354962_3632609_157-218771-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 23:47:24,732 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_1368-78165-0_sp1.1 from training. Duration: 0.97275 2023-10-02 23:47:24,736 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_309-65177-0_sp0.9 from training. Duration: 0.97775 2023-10-02 23:47:24,742 WARNING [train.py:1204] (2/4) Exclude cut with ID _21632_210_2253_1_1531994001765_3744140_291-138812-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 23:47:24,753 WARNING [train.py:1204] (2/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_669-93163-0_sp1.1 from training. Duration: 0.8909375 2023-10-02 23:47:25,274 WARNING [train.py:1204] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_292-215346-0 from training. Duration: 0.97 2023-10-02 23:47:25,299 WARNING [train.py:1204] (2/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_286-222073-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 23:47:25,601 WARNING [train.py:1204] (2/4) Exclude cut with ID _59256_210_5986_1_1535076085997_3589120_121-237386-0_sp1.1 from training. Duration: 0.87275 2023-10-02 23:47:25,615 WARNING [train.py:1204] (2/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_96-320736-0 from training. Duration: 0.86 2023-10-02 23:47:25,625 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_128-202104-0 from training. Duration: 0.63 2023-10-02 23:47:25,712 WARNING [train.py:1204] (2/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_242-3472-0_sp0.9 from training. Duration: 0.811125 2023-10-02 23:47:25,755 WARNING [train.py:1204] (2/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_154-74182-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 23:47:25,760 WARNING [train.py:1204] (2/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_187-221566-0 from training. Duration: 0.43 2023-10-02 23:47:25,925 WARNING [train.py:1204] (2/4) Exclude cut with ID _6194_210_1534_1_1527511826160_3941194_14-282876-0 from training. Duration: 0.71 2023-10-02 23:47:25,964 WARNING [train.py:1204] (2/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_660-211088-0_sp1.1 from training. Duration: 0.563625 2023-10-02 23:47:26,292 WARNING [train.py:1204] (2/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_526-62079-0_sp1.1 from training. Duration: 0.6090625 2023-10-02 23:47:26,435 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_92-246270-0_sp1.1 from training. Duration: 0.77275 2023-10-02 23:47:26,520 WARNING [train.py:1204] (2/4) Exclude cut with ID IC0287W0023-142350 from training. Duration: 0.956 2023-10-02 23:47:27,062 WARNING [train.py:1204] (2/4) Exclude cut with ID _58605_210_12210_1_1534723195939_3904259_48-269402-0 from training. Duration: 0.96 2023-10-02 23:47:27,076 WARNING [train.py:1204] (2/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_416-257730-0_sp0.9 from training. Duration: 0.72225 2023-10-02 23:47:27,147 WARNING [train.py:1204] (2/4) Exclude cut with ID _7877_210_6014_1_1528286358099_3750136_185-17516-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 23:47:27,246 WARNING [train.py:1204] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_23-189234-0 from training. Duration: 0.83 2023-10-02 23:47:27,262 WARNING [train.py:1204] (2/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_237-89684-0 from training. Duration: 0.96 2023-10-02 23:47:27,331 WARNING [train.py:1204] (2/4) Exclude cut with ID _13916_210_9994_1_1530788341126_3763149_116-21034-0 from training. Duration: 0.96 2023-10-02 23:47:27,494 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_3133_210_2087_1_1526106446_2324690_246-125000-0_sp1.1 from training. Duration: 0.563625 2023-10-02 23:47:31,431 INFO [train.py:1046] (2/4) Epoch 31, batch 0, loss[loss=0.1571, simple_loss=0.2391, pruned_loss=0.03751, over 23640.00 frames. ], tot_loss[loss=0.1571, simple_loss=0.2391, pruned_loss=0.03751, over 23640.00 frames. ], batch size: 149, lr: 3.32e-03, grad_scale: 16.0 2023-10-02 23:47:31,431 INFO [train.py:1069] (2/4) Computing validation loss 2023-10-02 23:47:41,638 INFO [zipformer.py:1853] (2/4) name=encoder.encoders.3.encoder.layers.1.self_attn_weights, attn_weights_entropy = tensor([4.3552, 3.0520, 3.3109, 3.4895, 3.1363, 3.7368, 3.4652, 3.8942], device='cuda:2') 2023-10-02 23:47:42,338 INFO [zipformer.py:1853] (2/4) name=encoder.encoders.1.encoder.layers.1.self_attn_weights, attn_weights_entropy = tensor([4.2879, 4.0220, 3.7281, 3.8341], device='cuda:2') 2023-10-02 23:47:43,361 INFO [train.py:1078] (2/4) Epoch 31, validation: loss=0.3244, simple_loss=0.2676, pruned_loss=0.1906, over 1125622.00 frames. 2023-10-02 23:47:43,361 INFO [train.py:1079] (2/4) Maximum memory allocated so far is 20967MB 2023-10-02 23:47:43,478 WARNING [train.py:1204] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_402-253027-0 from training. Duration: 0.54 2023-10-02 23:47:44,943 WARNING [train.py:1204] (2/4) Exclude cut with ID _59288_210_5854_1_1535077757774_3412246_158-289345-0_sp1.1 from training. Duration: 0.92725 2023-10-02 23:47:47,952 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_28211_210_6388_1_1532861506_4523224_128-328162-0_sp1.1 from training. Duration: 0.8181875 2023-10-02 23:47:52,468 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_788-278281-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 23:47:52,483 WARNING [train.py:1204] (2/4) Exclude cut with ID _49728_210_12651_1_1534041034294_6788150_624-195301-0_sp0.9 from training. Duration: 0.9444375 2023-10-02 23:47:53,842 WARNING [train.py:1204] (2/4) Exclude cut with ID _14860_210_4915_1_1530702020340_7343240_267-139775-0_sp1.1 from training. Duration: 0.963625 2023-10-02 23:47:53,909 WARNING [train.py:1204] (2/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_332-221057-0 from training. Duration: 0.68 2023-10-02 23:47:55,738 WARNING [train.py:1204] (2/4) Exclude cut with ID _40402_210_18219_1_1533522324115_2953150_242-76943-0 from training. Duration: 0.94 2023-10-02 23:47:58,653 WARNING [train.py:1204] (2/4) Exclude cut with ID _16747_210_5986_1_1531811544733_2749109_87-349587-0_sp1.1 from training. Duration: 0.963625 2023-10-02 23:47:58,716 WARNING [train.py:1204] (2/4) Exclude cut with ID _22982_210_1883_1_1531882790447_5449409_505-346452-0_sp1.1 from training. Duration: 0.963625 2023-10-02 23:48:01,537 WARNING [train.py:1204] (2/4) Exclude cut with ID _17956_210_4353_1_1531209568125_6979970_13-192107-0_sp1.1 from training. Duration: 0.963625 2023-10-02 23:48:01,564 WARNING [train.py:1204] (2/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_430-307666-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 23:48:02,804 WARNING [train.py:1204] (2/4) Exclude cut with ID _2895_210_2477_1_1525956785794_4063750_289-11475-0_sp1.1 from training. Duration: 0.7454375 2023-10-02 23:48:02,816 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_3910_210_1794_1_1526742306_334530_28-238909-0_sp0.9 from training. Duration: 0.9 2023-10-02 23:48:05,392 WARNING [train.py:1204] (2/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_18-130336-0 from training. Duration: 0.79 2023-10-02 23:48:06,769 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_548-294316-0_sp0.9 from training. Duration: 0.9 2023-10-02 23:48:15,583 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_42-142473-0_sp1.1 from training. Duration: 0.6545625 2023-10-02 23:48:15,590 WARNING [train.py:1204] (2/4) Exclude cut with ID _59170_210_6721_1_1534983633960_4203335_326-48066-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 23:48:18,322 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_45886_210_17024_1_1533709215_471063_7-79165-0 from training. Duration: 0.98 2023-10-02 23:48:22,865 WARNING [train.py:1204] (2/4) Exclude cut with ID _44585_210_18628_1_1533605350387_4518199_280-73534-0_sp0.9 from training. Duration: 0.988875 2023-10-02 23:48:22,871 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_3475_210_2028_1_1526193578_3622569_265-276625-0_sp1.1 from training. Duration: 0.6909375 2023-10-02 23:48:24,312 WARNING [train.py:1204] (2/4) Exclude cut with ID _64631_210_27767_1_1535349580068_4165160_88-225232-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 23:48:28,437 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_205-265518-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 23:48:33,151 WARNING [train.py:1204] (2/4) Exclude cut with ID _40402_210_18219_1_1533522324115_2953150_271-253734-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 23:48:33,484 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.balancer2.prob, batch_count=1062626.6666666667, ans=0.125 2023-10-02 23:48:36,390 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.bypass_mid.scale_min, batch_count=1062626.6666666667, ans=0.2 2023-10-02 23:48:37,456 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.553e+02 1.906e+02 2.085e+02 2.438e+02 4.411e+02, threshold=4.170e+02, percent-clipped=1.0 2023-10-02 23:48:37,556 WARNING [train.py:1204] (2/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_282-257791-0 from training. Duration: 0.86 2023-10-02 23:48:40,428 WARNING [train.py:1204] (2/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_356-45054-0 from training. Duration: 0.86 2023-10-02 23:48:40,467 WARNING [train.py:1204] (2/4) Exclude cut with ID _69584_210_5219_1_1535785133775_4044450_38-348116-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 23:48:40,468 WARNING [train.py:1204] (2/4) Exclude cut with ID _66763_210_17301_1_1535788667617_241750_21-328480-0_sp1.1 from training. Duration: 0.936375 2023-10-02 23:48:42,237 WARNING [train.py:1204] (2/4) Exclude cut with ID _26528_210_9070_1_1532570410842_3851310_497-97218-0_sp0.9 from training. Duration: 0.9555625 2023-10-02 23:48:42,287 WARNING [train.py:1204] (2/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_592-101075-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 23:48:44,850 WARNING [train.py:1204] (2/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_678-159307-0 from training. Duration: 0.85 2023-10-02 23:48:46,489 WARNING [train.py:1204] (2/4) Exclude cut with ID _28427_210_4220_1_1532581240967_3061069_166-319773-0_sp1.1 from training. Duration: 0.936375 2023-10-02 23:48:47,769 WARNING [train.py:1204] (2/4) Exclude cut with ID _29877_210_6228_1_1532397791106_5276149_606-224984-0_sp1.1 from training. Duration: 0.936375 2023-10-02 23:48:47,951 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.conv_module1.balancer2.min_positive, batch_count=1062693.3333333333, ans=0.05 2023-10-02 23:48:50,672 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_125-203105-0_sp1.1 from training. Duration: 0.72725 2023-10-02 23:48:53,316 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.self_attn_weights.pos_emb_skip_rate, batch_count=1062693.3333333333, ans=0.0 2023-10-02 23:48:55,923 WARNING [train.py:1204] (2/4) Exclude cut with ID IC0620W0113-306765 from training. Duration: 0.768 2023-10-02 23:48:57,137 INFO [train.py:1046] (2/4) Epoch 31, batch 50, loss[loss=0.1706, simple_loss=0.2452, pruned_loss=0.04798, over 23787.00 frames. ], tot_loss[loss=0.1653, simple_loss=0.2453, pruned_loss=0.04269, over 1075450.23 frames. ], batch size: 179, lr: 3.32e-03, grad_scale: 16.0 2023-10-02 23:48:57,224 WARNING [train.py:1204] (2/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_410-287857-0_sp1.1 from training. Duration: 0.8454375 2023-10-02 23:48:57,374 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.1.conv_module2.balancer2.prob, batch_count=1062760.0, ans=0.125 2023-10-02 23:48:57,494 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.conv_skip_rate, batch_count=1062760.0, ans=0.0 2023-10-02 23:49:00,491 WARNING [train.py:1204] (2/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_78-95260-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 23:49:00,636 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=1062760.0, ans=0.1 2023-10-02 23:49:03,245 WARNING [train.py:1204] (2/4) Exclude cut with ID _58059_210_5854_1_1534901471149_3164808_6-73080-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 23:49:03,267 WARNING [train.py:1204] (2/4) Exclude cut with ID _5166_210_2084_1_1527073197160_4087993_331-117829-0 from training. Duration: 0.62 2023-10-02 23:49:04,812 WARNING [train.py:1204] (2/4) Exclude cut with ID _14880_210_8895_1_1530680426655_3795259_289-212817-0_sp0.9 from training. Duration: 0.7666875 2023-10-02 23:49:04,845 WARNING [train.py:1204] (2/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_620-144779-0_sp1.1 from training. Duration: 0.92725 2023-10-02 23:49:06,203 WARNING [train.py:1204] (2/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_561-288575-0_sp1.1 from training. Duration: 0.963625 2023-10-02 23:49:07,562 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_22657_210_6890_1_1532086002_1637216_122-188128-0_sp1.1 from training. Duration: 0.963625 2023-10-02 23:49:10,226 WARNING [train.py:1204] (2/4) Exclude cut with ID _16756_210_4915_1_1531047803763_7104259_604-86607-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 23:49:13,002 WARNING [train.py:1204] (2/4) Exclude cut with ID _15150_210_8341_1_1530752411803_7256384_469-239509-0 from training. Duration: 0.68 2023-10-02 23:49:13,009 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_10987_210_4163_1_1529760555_4123902_302-87926-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 23:49:19,152 WARNING [train.py:1204] (2/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_469-94673-0_sp0.9 from training. Duration: 0.87775 2023-10-02 23:49:20,481 WARNING [train.py:1204] (2/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_798-106179-0 from training. Duration: 0.83 2023-10-02 23:49:21,935 WARNING [train.py:1204] (2/4) Exclude cut with ID _16744_210_5986_1_1531724405384_3907567_242-111316-0 from training. Duration: 0.93 2023-10-02 23:49:23,937 WARNING [train.py:1204] (2/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_938-205135-0_sp0.9 from training. Duration: 0.9666875 2023-10-02 23:49:25,763 WARNING [train.py:1204] (2/4) Exclude cut with ID _64135_210_2253_1_1535677267876_7608972_133-188266-0_sp0.9 from training. Duration: 0.9 2023-10-02 23:49:25,768 WARNING [train.py:1204] (2/4) Exclude cut with ID _44585_210_18628_1_1533605350387_4518199_623-346820-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 23:49:27,167 WARNING [train.py:1204] (2/4) Exclude cut with ID _42326_210_7770_1_1533814282531_3707269_454-266903-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 23:49:27,240 WARNING [train.py:1204] (2/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_5-55893-0_sp1.1 from training. Duration: 0.5 2023-10-02 23:49:28,549 WARNING [train.py:1204] (2/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_577-332600-0_sp1.1 from training. Duration: 0.5181875 2023-10-02 23:49:28,550 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_957-138882-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 23:49:35,333 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.3.encoder.layers.0.self_attn1.whiten, num_groups=1, num_channels=512, metric=17.34 vs. limit=22.5 2023-10-02 23:49:36,122 WARNING [train.py:1204] (2/4) Exclude cut with ID _12556_210_8341_1_1530081024100_7327690_219-38004-0_sp1.1 from training. Duration: 0.97275 2023-10-02 23:49:37,462 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_397-147196-0_sp1.1 from training. Duration: 0.72725 2023-10-02 23:49:37,479 WARNING [train.py:1204] (2/4) Exclude cut with ID _3541_210_2477_1_1526475408709_3263973_231-104736-0_sp0.9 from training. Duration: 0.8666875 2023-10-02 23:49:38,845 WARNING [train.py:1204] (2/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_398-334558-0 from training. Duration: 0.96 2023-10-02 23:49:39,066 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.conv_module1.balancer2.prob, batch_count=1062893.3333333333, ans=0.125 2023-10-02 23:49:40,218 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_257-20733-0_sp1.1 from training. Duration: 0.6909375 2023-10-02 23:49:41,566 WARNING [train.py:1204] (2/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_146-282400-0_sp1.1 from training. Duration: 0.6454375 2023-10-02 23:49:41,570 WARNING [train.py:1204] (2/4) Exclude cut with ID _51899_210_20034_1_1534129254466_3669220_265-215428-0 from training. Duration: 0.98 2023-10-02 23:49:41,632 WARNING [train.py:1204] (2/4) Exclude cut with ID _34423_210_8750_1_1533430798456_3330199_219-106541-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 23:49:43,154 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.bypass.skip_rate, batch_count=1062960.0, ans=0.07 2023-10-02 23:49:44,330 WARNING [train.py:1204] (2/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_458-67321-0 from training. Duration: 0.97 2023-10-02 23:49:50,546 WARNING [train.py:1204] (2/4) Exclude cut with ID _6653_210_2629_1_1527919172441_6732679_1051-182606-0_sp1.1 from training. Duration: 0.936375 2023-10-02 23:49:50,571 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_53555_210_5632_1_1534595220_7232297_603-4396-0_sp1.1 from training. Duration: 0.97275 2023-10-02 23:49:51,955 WARNING [train.py:1204] (2/4) Exclude cut with ID _54218_210_12210_1_1534413646388_4409259_159-144367-0_sp1.1 from training. Duration: 0.963625 2023-10-02 23:49:54,621 WARNING [train.py:1204] (2/4) Exclude cut with ID _7606_210_4915_1_1528894253277_5080690_301-10732-0_sp0.9 from training. Duration: 0.9 2023-10-02 23:49:54,624 WARNING [train.py:1204] (2/4) Exclude cut with ID _15026_210_8750_1_1530777602316_3291850_211-195043-0_sp0.9 from training. Duration: 0.888875 2023-10-02 23:49:56,642 WARNING [train.py:1204] (2/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_146-282400-0 from training. Duration: 0.71 2023-10-02 23:49:56,677 WARNING [train.py:1204] (2/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_65-88242-0 from training. Duration: 0.56 2023-10-02 23:49:58,046 WARNING [train.py:1204] (2/4) Exclude cut with ID _55857_210_2346_1_1534644007831_6898209_80-107757-0_sp1.1 from training. Duration: 0.963625 2023-10-02 23:49:58,085 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_15126_210_6753_1_1530778677_2591185_120-277707-0_sp0.9 from training. Duration: 0.888875 2023-10-02 23:49:59,481 WARNING [train.py:1204] (2/4) Exclude cut with ID _44778_210_2054_1_1533798127203_7443090_1033-168593-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 23:49:59,535 WARNING [train.py:1204] (2/4) Exclude cut with ID _46683_210_9642_1_1533730913492_4591116_475-30711-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 23:49:59,558 WARNING [train.py:1204] (2/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_253-308377-0 from training. Duration: 0.49 2023-10-02 23:50:00,894 WARNING [train.py:1204] (2/4) Exclude cut with ID _4680_210_3525_1_1527249757953_893386_75-78979-0 from training. Duration: 0.91 2023-10-02 23:50:02,163 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_5522_210_3273_1_1527249606_3909096_318-203153-0_sp0.9 from training. Duration: 0.47775 2023-10-02 23:50:03,955 WARNING [train.py:1204] (2/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_550-170116-0_sp1.1 from training. Duration: 0.92725 2023-10-02 23:50:03,988 WARNING [train.py:1204] (2/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_277-210798-0_sp0.9 from training. Duration: 0.611125 2023-10-02 23:50:04,180 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.conv_skip_rate, batch_count=1063026.6666666667, ans=0.0 2023-10-02 23:50:05,318 WARNING [train.py:1204] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_620-276793-0 from training. Duration: 0.59 2023-10-02 23:50:05,329 WARNING [train.py:1204] (2/4) Exclude cut with ID _3067_210_2667_1_1526212974640_4503168_119-175776-0 from training. Duration: 0.74 2023-10-02 23:50:05,396 WARNING [train.py:1204] (2/4) Exclude cut with ID _55209_210_1531_1_1534675828996_4597160_681-195827-0_sp1.1 from training. Duration: 0.92725 2023-10-02 23:50:06,766 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_427-78097-0_sp1.1 from training. Duration: 0.72725 2023-10-02 23:50:08,130 WARNING [train.py:1204] (2/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_96-290039-0_sp0.9 from training. Duration: 0.62225 2023-10-02 23:50:08,139 WARNING [train.py:1204] (2/4) Exclude cut with ID _45329_210_7005_1_1534157951211_6774339_287-292174-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 23:50:10,741 INFO [train.py:1046] (2/4) Epoch 31, batch 100, loss[loss=0.1558, simple_loss=0.2259, pruned_loss=0.04289, over 23748.00 frames. ], tot_loss[loss=0.1657, simple_loss=0.2451, pruned_loss=0.04321, over 1887494.65 frames. ], batch size: 164, lr: 3.32e-03, grad_scale: 8.0 2023-10-02 23:50:10,842 WARNING [train.py:1204] (2/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_200-292228-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 23:50:14,992 WARNING [train.py:1204] (2/4) Exclude cut with ID _69884_210_9771_1_1535846392818_4166169_291-194999-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 23:50:18,337 WARNING [train.py:1204] (2/4) Exclude cut with ID _58895_210_8750_1_1535184081320_3649089_265-323810-0_sp1.1 from training. Duration: 0.87275 2023-10-02 23:50:19,703 WARNING [train.py:1204] (2/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_58-68956-0 from training. Duration: 0.83 2023-10-02 23:50:19,708 WARNING [train.py:1204] (2/4) Exclude cut with ID _39867_210_9826_1_1533553222030_9344259_322-115153-0_sp1.1 from training. Duration: 0.963625 2023-10-02 23:50:23,848 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_3371_210_1794_1_1526384462_5580777_396-339812-0_sp1.1 from training. Duration: 0.77275 2023-10-02 23:50:25,095 WARNING [train.py:1204] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_731-2428-0_sp0.9 from training. Duration: 0.92225 2023-10-02 23:50:25,114 WARNING [train.py:1204] (2/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_98-741-0_sp1.1 from training. Duration: 0.72725 2023-10-02 23:50:25,116 WARNING [train.py:1204] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_238-245655-0_sp0.9 from training. Duration: 0.9 2023-10-02 23:50:25,144 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6788_210_1388_1_1527898698_4562021_91-197981-0_sp0.9 from training. Duration: 0.92225 2023-10-02 23:50:25,848 WARNING [train.py:1204] (2/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_860-96198-0 from training. Duration: 0.73 2023-10-02 23:50:28,533 WARNING [train.py:1204] (2/4) Exclude cut with ID _14268_210_9206_1_1530622771285_4714099_331-105351-0_sp0.9 from training. Duration: 0.72225 2023-10-02 23:50:28,565 WARNING [train.py:1204] (2/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_204-137133-0_sp1.1 from training. Duration: 0.92725 2023-10-02 23:50:30,382 WARNING [train.py:1204] (2/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_5-46496-0_sp1.1 from training. Duration: 0.936375 2023-10-02 23:50:30,390 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_160-218721-0_sp1.1 from training. Duration: 0.87275 2023-10-02 23:50:32,103 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.conv_skip_rate, batch_count=1063160.0, ans=0.0 2023-10-02 23:50:33,200 WARNING [train.py:1204] (2/4) Exclude cut with ID _45329_210_7005_1_1534157951211_6774339_503-287883-0 from training. Duration: 0.88 2023-10-02 23:50:33,276 WARNING [train.py:1204] (2/4) Exclude cut with ID _57572_210_5517_1_1534921219624_3809484_110-79134-0_sp1.1 from training. Duration: 0.92725 2023-10-02 23:50:34,616 WARNING [train.py:1204] (2/4) Exclude cut with ID _44585_210_18628_1_1533605350387_4518199_421-37641-0_sp1.1 from training. Duration: 0.936375 2023-10-02 23:50:36,517 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_42-142473-0_sp0.9 from training. Duration: 0.8 2023-10-02 23:50:39,261 WARNING [train.py:1204] (2/4) Exclude cut with ID _5166_210_2084_1_1527073197160_4087993_310-114452-0_sp0.9 from training. Duration: 0.6555625 2023-10-02 23:50:42,219 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9029W0120-509520 from training. Duration: 0.768 2023-10-02 23:50:42,232 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9029W0433-509820 from training. Duration: 0.896 2023-10-02 23:50:42,416 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.feed_forward2.hidden_balancer.prob, batch_count=1063226.6666666667, ans=0.125 2023-10-02 23:50:43,699 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_40222_210_6228_1_1533385298_4481939_163-334586-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 23:50:43,699 WARNING [train.py:1204] (2/4) Exclude cut with ID _48677_210_5663_1_1533902405919_3892989_408-40740-0_sp1.1 from training. Duration: 0.7545625 2023-10-02 23:50:48,818 WARNING [train.py:1204] (2/4) Exclude cut with ID _24314_210_4915_1_1532135078116_7170179_521-258868-0_sp0.9 from training. Duration: 0.87775 2023-10-02 23:50:51,588 WARNING [train.py:1204] (2/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_576-51198-0_sp1.1 from training. Duration: 0.92725 2023-10-02 23:50:53,511 WARNING [train.py:1204] (2/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_306-216871-0_sp1.1 from training. Duration: 0.97275 2023-10-02 23:50:56,489 WARNING [train.py:1204] (2/4) Exclude cut with ID _20684_210_6917_1_1531567751646_4203930_463-119982-0_sp1.1 from training. Duration: 0.97275 2023-10-02 23:50:57,797 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9084W0012-536850 from training. Duration: 0.64 2023-10-02 23:50:59,592 WARNING [train.py:1204] (2/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_847-41334-0_sp1.1 from training. Duration: 0.4 2023-10-02 23:51:04,183 WARNING [train.py:1204] (2/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_326-165900-0_sp1.1 from training. Duration: 0.62725 2023-10-02 23:51:06,054 WARNING [train.py:1204] (2/4) Exclude cut with ID _7537_210_2084_1_1528502395431_3924359_128-268029-0_sp1.1 from training. Duration: 0.8909375 2023-10-02 23:51:06,398 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.feed_forward2.hidden_balancer.prob, batch_count=1063293.3333333333, ans=0.125 2023-10-02 23:51:07,426 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.535e+02 1.861e+02 2.149e+02 2.468e+02 3.325e+02, threshold=4.297e+02, percent-clipped=0.0 2023-10-02 23:51:07,529 WARNING [train.py:1204] (2/4) Exclude cut with ID _67499_210_17282_1_1535509805220_3692430_333-155030-0_sp1.1 from training. Duration: 0.97275 2023-10-02 23:51:09,237 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.nonlin_attention.balancer.prob, batch_count=1063360.0, ans=0.125 2023-10-02 23:51:10,302 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_34008_210_10431_1_1533345424_8183247_588-257177-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 23:51:11,784 WARNING [train.py:1204] (2/4) Exclude cut with ID _25914_210_6400_1_1532141317058_644740_52-96808-0_sp1.1 from training. Duration: 0.82725 2023-10-02 23:51:13,192 WARNING [train.py:1204] (2/4) Exclude cut with ID _56494_210_4220_1_1535284837625_3033169_69-138150-0_sp1.1 from training. Duration: 0.8545625 2023-10-02 23:51:15,954 WARNING [train.py:1204] (2/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_19-143553-0_sp1.1 from training. Duration: 0.97275 2023-10-02 23:51:17,372 WARNING [train.py:1204] (2/4) Exclude cut with ID _21878_210_3525_1_1531738772002_7264330_582-130735-0_sp1.1 from training. Duration: 0.936375 2023-10-02 23:51:18,862 WARNING [train.py:1204] (2/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_943-201356-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 23:51:18,880 WARNING [train.py:1204] (2/4) Exclude cut with ID _3231_210_2106_1_1526205864934_6784220_422-229276-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 23:51:20,191 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_48049_210_5632_1_1533864553_3519937_73-198717-0_sp1.1 from training. Duration: 0.97275 2023-10-02 23:51:21,471 WARNING [train.py:1204] (2/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_329-285801-0 from training. Duration: 0.91 2023-10-02 23:51:21,489 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9171W0114-579000 from training. Duration: 0.896 2023-10-02 23:51:21,500 WARNING [train.py:1204] (2/4) Exclude cut with ID _3120_210_3525_1_1526036143112_4072980_210-132675-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 23:51:21,583 WARNING [train.py:1204] (2/4) Exclude cut with ID _55857_210_2346_1_1534644007831_6898209_745-98391-0_sp0.9 from training. Duration: 0.8444375 2023-10-02 23:51:22,977 WARNING [train.py:1204] (2/4) Exclude cut with ID _5250_210_5219_1_1527249561806_8692110_615-322035-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 23:51:22,981 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_36756_210_7234_1_1533026991_966276_119-315767-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 23:51:23,004 WARNING [train.py:1204] (2/4) Exclude cut with ID _13916_210_9994_1_1530788341126_3763149_60-191556-0_sp1.1 from training. Duration: 0.5545625 2023-10-02 23:51:23,025 WARNING [train.py:1204] (2/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_177-236809-0_sp1.1 from training. Duration: 0.4454375 2023-10-02 23:51:23,066 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7002_210_1794_1_1527939683_5157286_184-229630-0_sp1.1 from training. Duration: 0.67275 2023-10-02 23:51:24,290 INFO [train.py:1046] (2/4) Epoch 31, batch 150, loss[loss=0.1502, simple_loss=0.2342, pruned_loss=0.03307, over 24662.00 frames. ], tot_loss[loss=0.166, simple_loss=0.2454, pruned_loss=0.04331, over 2525318.04 frames. ], batch size: 65, lr: 3.32e-03, grad_scale: 8.0 2023-10-02 23:51:24,334 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_50428_210_5724_1_1534247532_4126418_464-18322-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 23:51:24,404 WARNING [train.py:1204] (2/4) Exclude cut with ID _35799_210_5854_1_1533263487887_3529071_466-131402-0_sp1.1 from training. Duration: 0.936375 2023-10-02 23:51:26,145 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_1259-132768-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 23:51:26,204 WARNING [train.py:1204] (2/4) Exclude cut with ID _6694_210_4599_1_1527852637490_3965529_144-185051-0_sp1.1 from training. Duration: 0.87275 2023-10-02 23:51:26,240 WARNING [train.py:1204] (2/4) Exclude cut with ID _64639_210_5816_1_1535367619437_3528000_59-133290-0_sp1.1 from training. Duration: 0.963625 2023-10-02 23:51:28,979 WARNING [train.py:1204] (2/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_223-49525-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 23:51:32,821 WARNING [train.py:1204] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_748-36150-0_sp1.1 from training. Duration: 0.82725 2023-10-02 23:51:32,821 WARNING [train.py:1204] (2/4) Exclude cut with ID _19896_210_1883_1_1531709022662_6649569_52-221627-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 23:51:32,876 WARNING [train.py:1204] (2/4) Exclude cut with ID _6789_210_1534_1_1527854471671_2870519_296-115766-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 23:51:35,675 WARNING [train.py:1204] (2/4) Exclude cut with ID _14624_210_1925_1_1530858690657_3642219_100-19871-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 23:51:35,706 WARNING [train.py:1204] (2/4) Exclude cut with ID _12555_210_8750_1_1530075594402_3780239_50-293675-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 23:51:38,949 WARNING [train.py:1204] (2/4) Exclude cut with ID _14880_210_8895_1_1530680426655_3795259_289-212817-0_sp1.1 from training. Duration: 0.62725 2023-10-02 23:51:39,010 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_14056_210_4163_1_1530604483_7572827_388-211626-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 23:51:43,123 WARNING [train.py:1204] (2/4) Exclude cut with ID _5289_210_2084_1_1527678202736_3779299_392-261988-0 from training. Duration: 0.65 2023-10-02 23:51:44,365 WARNING [train.py:1204] (2/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_124-345840-0 from training. Duration: 0.52 2023-10-02 23:51:44,370 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_2655_210_2087_1_1525688959_3897400_437-337690-0 from training. Duration: 0.63 2023-10-02 23:51:47,182 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_3780_210_2087_1_1526467391_239068_13-274633-0_sp1.1 from training. Duration: 0.92725 2023-10-02 23:51:47,186 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_805-71497-0_sp1.1 from training. Duration: 0.6454375 2023-10-02 23:51:48,531 WARNING [train.py:1204] (2/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_434-83000-0_sp1.1 from training. Duration: 0.8909375 2023-10-02 23:51:49,857 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_17613_210_2151_1_1531137569_2366160_285-219531-0_sp1.1 from training. Duration: 0.97275 2023-10-02 23:51:49,862 WARNING [train.py:1204] (2/4) Exclude cut with ID _56108_210_16380_1_1534505446159_3887959_22-37858-0_sp1.1 from training. Duration: 0.936375 2023-10-02 23:51:49,886 WARNING [train.py:1204] (2/4) Exclude cut with ID _28427_210_4220_1_1532581240967_3061069_200-88444-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 23:51:49,959 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_43802_210_2290_1_1533516203_8829504_77-279042-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 23:51:51,354 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9309W0193-640320 from training. Duration: 0.896 2023-10-02 23:51:54,145 WARNING [train.py:1204] (2/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_516-324055-0_sp1.1 from training. Duration: 0.936375 2023-10-02 23:52:00,332 WARNING [train.py:1204] (2/4) Exclude cut with ID _40094_210_16380_1_1533294005773_3641159_10-261240-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 23:52:03,687 WARNING [train.py:1204] (2/4) Exclude cut with ID _6267_210_2084_1_1527764658526_3894730_375-219887-0_sp1.1 from training. Duration: 0.6090625 2023-10-02 23:52:05,079 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6788_210_1388_1_1527898698_4562021_180-75288-0 from training. Duration: 0.99 2023-10-02 23:52:07,799 WARNING [train.py:1204] (2/4) Exclude cut with ID _8030_210_7497_1_1528444886781_3531089_262-86750-0_sp1.1 from training. Duration: 0.736375 2023-10-02 23:52:07,820 WARNING [train.py:1204] (2/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_934-325498-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 23:52:07,848 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_456-22141-0_sp0.9 from training. Duration: 0.97775 2023-10-02 23:52:10,252 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.0.layers.0.nonlin_attention.whiten2, num_groups=1, num_channels=192, metric=4.91 vs. limit=15.0 2023-10-02 23:52:11,190 WARNING [train.py:1204] (2/4) Exclude cut with ID _49643_210_7989_1_1534640244224_3939031_598-161926-0_sp1.1 from training. Duration: 0.8181875 2023-10-02 23:52:11,294 WARNING [train.py:1204] (2/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_345-49721-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 23:52:12,609 WARNING [train.py:1204] (2/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_440-141811-0_sp1.1 from training. Duration: 0.863625 2023-10-02 23:52:13,979 WARNING [train.py:1204] (2/4) Exclude cut with ID _64048_210_10626_1_1535198382802_3835179_394-212120-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 23:52:14,040 WARNING [train.py:1204] (2/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_91-277684-0 from training. Duration: 0.74 2023-10-02 23:52:18,262 WARNING [train.py:1204] (2/4) Exclude cut with ID _23038_210_9570_1_1532001584158_3652419_191-346343-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 23:52:18,314 WARNING [train.py:1204] (2/4) Exclude cut with ID _15140_210_9288_1_1530791388450_4880149_351-347974-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 23:52:19,636 WARNING [train.py:1204] (2/4) Exclude cut with ID _58605_210_12210_1_1534723195939_3904259_48-269402-0_sp1.1 from training. Duration: 0.87275 2023-10-02 23:52:19,642 WARNING [train.py:1204] (2/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_401-53423-0_sp0.9 from training. Duration: 0.611125 2023-10-02 23:52:20,952 WARNING [train.py:1204] (2/4) Exclude cut with ID _42659_210_2070_1_1533607442765_3120380_158-49004-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 23:52:22,382 WARNING [train.py:1204] (2/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_81-75849-0_sp1.1 from training. Duration: 0.3545625 2023-10-02 23:52:23,811 WARNING [train.py:1204] (2/4) Exclude cut with ID _27052_210_5986_1_1532329197187_3585009_314-106499-0_sp1.1 from training. Duration: 0.836375 2023-10-02 23:52:23,932 WARNING [train.py:1204] (2/4) Exclude cut with ID _4626_210_3532_1_1526817652717_3916859_446-206656-0_sp0.9 from training. Duration: 0.8444375 2023-10-02 23:52:25,321 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_53378_210_17282_1_1534221071_5769509_158-114960-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 23:52:28,462 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_3171_210_2087_1_1526109084_2330007_37-64334-0_sp0.9 from training. Duration: 0.911125 2023-10-02 23:52:28,478 WARNING [train.py:1204] (2/4) Exclude cut with ID _5410_210_1388_1_1527397055795_1719020_39-112221-0 from training. Duration: 0.69 2023-10-02 23:52:28,496 WARNING [train.py:1204] (2/4) Exclude cut with ID _8064_210_5399_1_1528372299291_4282973_4-91037-0_sp0.9 from training. Duration: 0.97775 2023-10-02 23:52:28,520 WARNING [train.py:1204] (2/4) Exclude cut with ID ID0079W0443-717795 from training. Duration: 0.541 2023-10-02 23:52:31,892 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.conv_skip_rate, batch_count=1063693.3333333333, ans=0.0 2023-10-02 23:52:33,043 WARNING [train.py:1204] (2/4) Exclude cut with ID _34734_210_4525_1_1533365977542_4448720_766-297928-0_sp1.1 from training. Duration: 0.936375 2023-10-02 23:52:34,973 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.4.encoder.layers.2.self_attn_weights.whiten_keys, num_groups=4, num_channels=128, metric=1.67 vs. limit=6.0 2023-10-02 23:52:37,740 WARNING [train.py:1204] (2/4) Exclude cut with ID _51471_210_1889_1_1534399391390_3094250_310-337793-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 23:52:37,756 WARNING [train.py:1204] (2/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_678-159307-0_sp0.9 from training. Duration: 0.9444375 2023-10-02 23:52:38,060 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.attention_skip_rate, batch_count=1063760.0, ans=0.0 2023-10-02 23:52:39,130 INFO [train.py:1046] (2/4) Epoch 31, batch 200, loss[loss=0.1802, simple_loss=0.2486, pruned_loss=0.0559, over 23776.00 frames. ], tot_loss[loss=0.1663, simple_loss=0.245, pruned_loss=0.04377, over 3008525.13 frames. ], batch size: 179, lr: 3.32e-03, grad_scale: 8.0 2023-10-02 23:52:40,565 WARNING [train.py:1204] (2/4) Exclude cut with ID _50093_210_4220_1_1534417141207_3432299_259-322252-0 from training. Duration: 0.92 2023-10-02 23:52:41,905 WARNING [train.py:1204] (2/4) Exclude cut with ID _47790_210_16187_1_1533862814066_4207519_67-103444-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 23:52:41,930 WARNING [train.py:1204] (2/4) Exclude cut with ID _40551_210_10635_1_1533776424632_3416179_311-247578-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 23:52:44,662 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_4633_210_2040_1_1526824163_4145343_416-76117-0 from training. Duration: 0.95 2023-10-02 23:52:46,113 WARNING [train.py:1204] (2/4) Exclude cut with ID _5166_210_2084_1_1527073197160_4087993_331-117829-0_sp0.9 from training. Duration: 0.688875 2023-10-02 23:52:47,495 WARNING [train.py:1204] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_299-247059-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 23:52:47,574 WARNING [train.py:1204] (2/4) Exclude cut with ID _64002_210_9570_1_1535198605157_38979_2-240845-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 23:52:51,863 WARNING [train.py:1204] (2/4) Exclude cut with ID _4681_210_3525_1_1527250689464_120889_15-126760-0_sp0.9 from training. Duration: 0.9 2023-10-02 23:52:51,890 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_21500_210_2054_1_1532149310_2716758_256-156611-0_sp1.1 from training. Duration: 0.936375 2023-10-02 23:52:51,913 WARNING [train.py:1204] (2/4) Exclude cut with ID _16908_210_5724_1_1531119157248_3772700_624-203729-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 23:53:04,530 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.ff2_skip_rate, batch_count=1063826.6666666667, ans=0.0 2023-10-02 23:53:11,931 WARNING [train.py:1204] (2/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_605-219732-0_sp1.1 from training. Duration: 0.8545625 2023-10-02 23:53:11,980 WARNING [train.py:1204] (2/4) Exclude cut with ID _44718_210_5816_1_1534050051869_6469030_380-167520-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 23:53:14,498 WARNING [train.py:1204] (2/4) Exclude cut with ID _19896_210_1883_1_1531709022662_6649569_94-229927-0_sp1.1 from training. Duration: 0.8818125 2023-10-02 23:53:14,568 WARNING [train.py:1204] (2/4) Exclude cut with ID _19514_210_9259_1_1531530071330_5084230_519-174300-0_sp1.1 from training. Duration: 0.963625 2023-10-02 23:53:15,881 WARNING [train.py:1204] (2/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_340-174176-0_sp0.9 from training. Duration: 0.5444375 2023-10-02 23:53:15,893 WARNING [train.py:1204] (2/4) Exclude cut with ID _8263_210_1664_1_1528439452891_970220_68-181805-0_sp1.1 from training. Duration: 0.5818125 2023-10-02 23:53:17,324 WARNING [train.py:1204] (2/4) Exclude cut with ID _3120_210_3525_1_1526036143112_4072980_348-257723-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 23:53:17,401 WARNING [train.py:1204] (2/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_453-133983-0_sp1.1 from training. Duration: 0.7090625 2023-10-02 23:53:18,766 WARNING [train.py:1204] (2/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_17-86060-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 23:53:18,802 WARNING [train.py:1204] (2/4) Exclude cut with ID _29877_210_6228_1_1532397791106_5276149_228-184657-0_sp1.1 from training. Duration: 0.97275 2023-10-02 23:53:20,161 WARNING [train.py:1204] (2/4) Exclude cut with ID _18516_210_9206_1_1531704653206_3692406_198-334870-0 from training. Duration: 0.92 2023-10-02 23:53:20,202 WARNING [train.py:1204] (2/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_340-174176-0_sp1.1 from training. Duration: 0.4454375 2023-10-02 23:53:21,489 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_3133_210_2087_1_1526106446_2324690_14-139186-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 23:53:25,575 WARNING [train.py:1204] (2/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_126-29104-0_sp0.9 from training. Duration: 0.9444375 2023-10-02 23:53:32,378 WARNING [train.py:1204] (2/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_84-10310-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 23:53:34,994 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.584e+02 1.946e+02 2.118e+02 2.313e+02 3.373e+02, threshold=4.235e+02, percent-clipped=0.0 2023-10-02 23:53:38,495 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_15517_210_6753_1_1530954008_3487336_79-273076-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 23:53:38,550 WARNING [train.py:1204] (2/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_371-18169-0_sp0.9 from training. Duration: 0.9555625 2023-10-02 23:53:44,505 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_68702_210_23689_1_1535799577_3420068_170-92788-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 23:53:45,951 WARNING [train.py:1204] (2/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_78-182912-0 from training. Duration: 0.59 2023-10-02 23:53:47,366 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_3019_210_2028_1_1526044245_3264865_297-12384-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 23:53:47,371 WARNING [train.py:1204] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_147-111137-0_sp1.1 from training. Duration: 0.863625 2023-10-02 23:53:48,637 WARNING [train.py:1204] (2/4) Exclude cut with ID _28323_210_16187_1_1532480395667_4100300_117-8859-0_sp1.1 from training. Duration: 0.97275 2023-10-02 23:53:48,727 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7762_210_1794_1_1528199508_4057408_419-125009-0_sp1.1 from training. Duration: 0.7545625 2023-10-02 23:53:50,011 WARNING [train.py:1204] (2/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_268-87909-0 from training. Duration: 0.99 2023-10-02 23:53:51,355 WARNING [train.py:1204] (2/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_118-311357-0_sp1.1 from training. Duration: 0.92725 2023-10-02 23:53:51,377 WARNING [train.py:1204] (2/4) Exclude cut with ID ID0377W0003-870225 from training. Duration: 0.852 2023-10-02 23:53:52,682 INFO [train.py:1046] (2/4) Epoch 31, batch 250, loss[loss=0.1576, simple_loss=0.2216, pruned_loss=0.04677, over 23374.00 frames. ], tot_loss[loss=0.1649, simple_loss=0.2432, pruned_loss=0.04329, over 3388734.52 frames. ], batch size: 285, lr: 3.32e-03, grad_scale: 8.0 2023-10-02 23:53:52,758 WARNING [train.py:1204] (2/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_56-5838-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 23:53:54,168 WARNING [train.py:1204] (2/4) Exclude cut with ID _14603_210_4220_1_1530615688441_3097319_90-31383-0_sp0.9 from training. Duration: 0.9666875 2023-10-02 23:53:54,415 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.conv_skip_rate, batch_count=1064093.3333333333, ans=0.0 2023-10-02 23:53:55,462 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_31583_210_3852_1_1532595851_4024413_58-86657-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 23:53:55,493 WARNING [train.py:1204] (2/4) Exclude cut with ID _30304_210_6319_1_1532476827261_7582060_344-113875-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 23:53:56,960 WARNING [train.py:1204] (2/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_278-104676-0_sp1.1 from training. Duration: 0.8909375 2023-10-02 23:53:56,986 WARNING [train.py:1204] (2/4) Exclude cut with ID _55844_210_2346_1_1534471212569_7625707_764-2387-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 23:53:57,196 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.balancer2.prob, batch_count=1064093.3333333333, ans=0.125 2023-10-02 23:54:00,118 WARNING [train.py:1204] (2/4) Exclude cut with ID _27430_210_7770_1_1532604689375_1378590_157-14963-0_sp1.1 from training. Duration: 0.963625 2023-10-02 23:54:03,435 WARNING [train.py:1204] (2/4) Exclude cut with ID _7160_210_1876_1_1527940923231_3703799_282-322611-0_sp1.1 from training. Duration: 0.7909375 2023-10-02 23:54:06,948 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.2.encoder.layers.0.whiten, num_groups=1, num_channels=384, metric=5.91 vs. limit=12.0 2023-10-02 23:54:07,775 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.conv_skip_rate, batch_count=1064160.0, ans=0.0 2023-10-02 23:54:16,354 WARNING [train.py:1204] (2/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_190-138640-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 23:54:17,801 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_33348_210_5632_1_1533086912_3324030_63-270922-0_sp1.1 from training. Duration: 0.97275 2023-10-02 23:54:17,841 WARNING [train.py:1204] (2/4) Exclude cut with ID _54871_210_22356_1_1534665798253_3856359_135-184281-0_sp1.1 from training. Duration: 0.9 2023-10-02 23:54:20,779 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.conv_module1.balancer1.prob, batch_count=1064226.6666666667, ans=0.125 2023-10-02 23:54:24,753 WARNING [train.py:1204] (2/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_277-210798-0_sp1.1 from training. Duration: 0.5 2023-10-02 23:54:24,803 WARNING [train.py:1204] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_402-253027-0_sp0.9 from training. Duration: 0.6 2023-10-02 23:54:26,153 WARNING [train.py:1204] (2/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_540-275685-0_sp0.9 from training. Duration: 0.988875 2023-10-02 23:54:26,198 WARNING [train.py:1204] (2/4) Exclude cut with ID _59493_210_17282_1_1535078419077_3225879_118-18960-0_sp1.1 from training. Duration: 0.936375 2023-10-02 23:54:27,635 WARNING [train.py:1204] (2/4) Exclude cut with ID _6194_210_1534_1_1527511826160_3941194_321-148641-0_sp1.1 from training. Duration: 0.5818125 2023-10-02 23:54:27,641 WARNING [train.py:1204] (2/4) Exclude cut with ID _61133_210_9969_1_1535196777227_3696409_333-270325-0_sp1.1 from training. Duration: 0.8454375 2023-10-02 23:54:28,935 WARNING [train.py:1204] (2/4) Exclude cut with ID _49376_210_5986_1_1533985278732_3370740_307-145221-0_sp1.1 from training. Duration: 0.936375 2023-10-02 23:54:29,070 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.1.balancer1.prob, batch_count=1064226.6666666667, ans=0.125 2023-10-02 23:54:32,097 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6846_210_6753_1_1527850767_4295158_2-315073-0_sp0.9 from training. Duration: 0.97775 2023-10-02 23:54:35,615 WARNING [train.py:1204] (2/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_17-25045-0 from training. Duration: 0.59 2023-10-02 23:54:35,627 WARNING [train.py:1204] (2/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_503-333510-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 23:54:37,090 WARNING [train.py:1204] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_238-245655-0_sp1.1 from training. Duration: 0.736375 2023-10-02 23:54:37,123 WARNING [train.py:1204] (2/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_329-82235-0_sp0.9 from training. Duration: 0.911125 2023-10-02 23:54:38,470 WARNING [train.py:1204] (2/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_325-113798-0_sp0.9 from training. Duration: 0.9444375 2023-10-02 23:54:39,785 WARNING [train.py:1204] (2/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_395-37222-0_sp0.9 from training. Duration: 0.8444375 2023-10-02 23:54:39,876 WARNING [train.py:1204] (2/4) Exclude cut with ID _64135_210_2253_1_1535677267876_7608972_498-5039-0_sp1.1 from training. Duration: 0.7090625 2023-10-02 23:54:39,895 WARNING [train.py:1204] (2/4) Exclude cut with ID _14922_210_6228_1_1530707435273_3693310_303-228738-0_sp0.9 from training. Duration: 0.9333125 2023-10-02 23:54:42,651 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_577-220899-0_sp1.1 from training. Duration: 0.92725 2023-10-02 23:54:44,548 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_3475_210_2028_1_1526193578_3622569_377-166133-0_sp1.1 from training. Duration: 0.7818125 2023-10-02 23:54:44,582 WARNING [train.py:1204] (2/4) Exclude cut with ID _49376_210_5986_1_1533985278732_3370740_272-313512-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 23:54:48,647 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7762_210_1794_1_1528199508_4057408_422-227034-0_sp0.9 from training. Duration: 0.811125 2023-10-02 23:54:50,233 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.1.bypass_mid.scale_min, batch_count=1064360.0, ans=0.2 2023-10-02 23:54:51,410 WARNING [train.py:1204] (2/4) Exclude cut with ID _18895_210_6228_1_1531357274234_3784930_74-341428-0_sp1.1 from training. Duration: 0.92725 2023-10-02 23:54:54,303 WARNING [train.py:1204] (2/4) Exclude cut with ID _44902_210_16187_1_1533888006062_3792078_149-67212-0_sp1.1 from training. Duration: 0.7909375 2023-10-02 23:54:58,384 WARNING [train.py:1204] (2/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_243-126020-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 23:55:00,974 WARNING [train.py:1204] (2/4) Exclude cut with ID _54218_210_12210_1_1534413646388_4409259_259-6561-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 23:55:05,075 INFO [train.py:1046] (2/4) Epoch 31, batch 300, loss[loss=0.1811, simple_loss=0.2551, pruned_loss=0.0535, over 24021.00 frames. ], tot_loss[loss=0.1635, simple_loss=0.2418, pruned_loss=0.04264, over 3679875.72 frames. ], batch size: 86, lr: 3.32e-03, grad_scale: 8.0 2023-10-02 23:55:05,173 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_41452_210_7291_1_1533437687_3941281_405-11559-0 from training. Duration: 0.91 2023-10-02 23:55:07,189 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_20120_210_6753_1_1531643035_5482859_759-225084-0_sp1.1 from training. Duration: 0.97275 2023-10-02 23:55:07,207 WARNING [train.py:1204] (2/4) Exclude cut with ID _14747_210_8341_1_1530685797463_3654089_147-131229-0_sp0.9 from training. Duration: 0.8444375 2023-10-02 23:55:09,160 WARNING [train.py:1204] (2/4) Exclude cut with ID _14867_210_1968_1_1530684179890_3846969_319-250868-0 from training. Duration: 0.99 2023-10-02 23:55:10,529 WARNING [train.py:1204] (2/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_580-93798-0_sp0.9 from training. Duration: 0.62225 2023-10-02 23:55:10,623 WARNING [train.py:1204] (2/4) Exclude cut with ID _9679_210_7497_1_1529129212906_4363929_512-313889-0_sp0.9 from training. Duration: 0.9 2023-10-02 23:55:10,639 WARNING [train.py:1204] (2/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_18-189198-0 from training. Duration: 0.9 2023-10-02 23:55:15,774 WARNING [train.py:1204] (2/4) Exclude cut with ID _49755_210_18053_1_1534581005846_3218709_137-64179-0_sp1.1 from training. Duration: 0.92725 2023-10-02 23:55:17,116 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_48049_210_5632_1_1533864553_3519937_306-283794-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 23:55:20,072 WARNING [train.py:1204] (2/4) Exclude cut with ID _43552_210_8913_1_1533726031059_3523429_25-120161-0_sp1.1 from training. Duration: 0.8909375 2023-10-02 23:55:20,128 WARNING [train.py:1204] (2/4) Exclude cut with ID _8833_210_5219_1_1528592665210_7553059_319-349181-0 from training. Duration: 0.99 2023-10-02 23:55:21,483 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_296-332325-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 23:55:22,987 WARNING [train.py:1204] (2/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_436-186311-0_sp1.1 from training. Duration: 0.5909375 2023-10-02 23:55:23,006 WARNING [train.py:1204] (2/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_348-127789-0 from training. Duration: 0.85 2023-10-02 23:55:23,012 WARNING [train.py:1204] (2/4) Exclude cut with ID _3992_210_1747_1_1526644816031_3731060_443-195183-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 23:55:25,130 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.3.encoder.layers.0.feed_forward3.out_whiten, num_groups=1, num_channels=512, metric=10.80 vs. limit=15.0 2023-10-02 23:55:27,095 WARNING [train.py:1204] (2/4) Exclude cut with ID _3465_210_3525_1_1526175667060_3574049_308-231735-0_sp1.1 from training. Duration: 0.663625 2023-10-02 23:55:30,117 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6786_210_1388_1_1527839699_4194495_378-46070-0_sp1.1 from training. Duration: 0.6545625 2023-10-02 23:55:30,357 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=1064493.3333333333, ans=0.1 2023-10-02 23:55:31,457 WARNING [train.py:1204] (2/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_63-101996-0 from training. Duration: 0.88 2023-10-02 23:55:33,019 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_2541_210_1794_1_1525590269_3084765_156-173713-0 from training. Duration: 0.81 2023-10-02 23:55:34,298 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_217-239796-0_sp1.1 from training. Duration: 0.963625 2023-10-02 23:55:37,952 WARNING [train.py:1204] (2/4) Exclude cut with ID _21878_210_3525_1_1531738772002_7264330_530-329400-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 23:55:39,370 WARNING [train.py:1204] (2/4) Exclude cut with ID _21783_210_11357_1_1531742458730_3762679_150-62920-0_sp1.1 from training. Duration: 0.963625 2023-10-02 23:55:39,370 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_5522_210_3273_1_1527249606_3909096_366-172508-0 from training. Duration: 0.66 2023-10-02 23:55:39,372 WARNING [train.py:1204] (2/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_303-340446-0_sp1.1 from training. Duration: 0.8181875 2023-10-02 23:55:42,028 WARNING [train.py:1204] (2/4) Exclude cut with ID _8699_210_2069_1_1528630346831_3960409_160-24408-0_sp1.1 from training. Duration: 0.82725 2023-10-02 23:55:45,274 WARNING [train.py:1204] (2/4) Exclude cut with ID _41314_210_12651_1_1533459476543_3766460_299-187801-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 23:55:45,301 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_34886_210_10633_1_1533203882_1755661_92-345288-0_sp1.1 from training. Duration: 0.97275 2023-10-02 23:55:47,205 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.conv_module1.balancer2.min_positive, batch_count=1064560.0, ans=0.05 2023-10-02 23:55:48,468 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_5438_210_1794_1_1527336687_2600009_104-288274-0_sp1.1 from training. Duration: 0.52725 2023-10-02 23:55:48,473 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_154-316450-0 from training. Duration: 0.76 2023-10-02 23:55:48,642 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.feed_forward1.out_proj.dropout_p, batch_count=1064560.0, ans=0.1 2023-10-02 23:55:49,003 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.4.encoder.layers.0.nonlin_attention.whiten2, num_groups=1, num_channels=384, metric=9.92 vs. limit=15.0 2023-10-02 23:55:49,794 WARNING [train.py:1204] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_23-189234-0_sp0.9 from training. Duration: 0.92225 2023-10-02 23:55:52,433 WARNING [train.py:1204] (2/4) Exclude cut with ID _4779_210_2084_1_1527318022944_3914893_346-168820-0_sp1.1 from training. Duration: 0.963625 2023-10-02 23:55:52,530 WARNING [train.py:1204] (2/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_5-55893-0 from training. Duration: 0.55 2023-10-02 23:55:52,799 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=1064626.6666666667, ans=0.1 2023-10-02 23:55:53,935 WARNING [train.py:1204] (2/4) Exclude cut with ID _20455_210_5219_1_1531565996775_8098100_180-24517-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 23:55:56,792 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_3133_210_2087_1_1526106446_2324690_243-143119-0_sp0.9 from training. Duration: 0.8555625 2023-10-02 23:55:59,470 WARNING [train.py:1204] (2/4) Exclude cut with ID _65585_210_12210_1_1535450451584_3764098_293-212045-0_sp1.1 from training. Duration: 0.92725 2023-10-02 23:55:59,474 WARNING [train.py:1204] (2/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_577-332600-0 from training. Duration: 0.57 2023-10-02 23:55:59,746 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.balancer2.prob, batch_count=1064626.6666666667, ans=0.125 2023-10-02 23:56:02,024 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.594e+02 1.918e+02 2.245e+02 2.640e+02 3.587e+02, threshold=4.490e+02, percent-clipped=0.0 2023-10-02 23:56:04,242 WARNING [train.py:1204] (2/4) Exclude cut with ID _30039_210_13270_1_1532484612900_2725150_104-289874-0_sp1.1 from training. Duration: 0.963625 2023-10-02 23:56:04,249 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_3172_210_2087_1_1526292929_3987865_197-97762-0_sp1.1 from training. Duration: 0.6818125 2023-10-02 23:56:05,666 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_256-150908-0_sp1.1 from training. Duration: 0.963625 2023-10-02 23:56:08,348 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_296-287678-0_sp1.1 from training. Duration: 0.863625 2023-10-02 23:56:08,390 WARNING [train.py:1204] (2/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_443-30034-0 from training. Duration: 0.64 2023-10-02 23:56:08,422 WARNING [train.py:1204] (2/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_494-199538-0_sp0.9 from training. Duration: 0.8333125 2023-10-02 23:56:08,467 WARNING [train.py:1204] (2/4) Exclude cut with ID _19708_210_9206_1_1532136650733_4238364_61-162325-0_sp1.1 from training. Duration: 0.97275 2023-10-02 23:56:11,092 WARNING [train.py:1204] (2/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_475-127455-0 from training. Duration: 0.7 2023-10-02 23:56:12,464 WARNING [train.py:1204] (2/4) Exclude cut with ID _48677_210_5663_1_1533902405919_3892989_188-11581-0_sp1.1 from training. Duration: 0.963625 2023-10-02 23:56:12,488 WARNING [train.py:1204] (2/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_136-8381-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 23:56:14,319 WARNING [train.py:1204] (2/4) Exclude cut with ID _55328_210_27767_1_1534580973260_4088170_270-229555-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 23:56:16,242 WARNING [train.py:1204] (2/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_372-266951-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 23:56:16,306 WARNING [train.py:1204] (2/4) Exclude cut with ID _47790_210_16187_1_1533862814066_4207519_14-242149-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 23:56:20,418 INFO [train.py:1046] (2/4) Epoch 31, batch 350, loss[loss=0.1559, simple_loss=0.226, pruned_loss=0.04289, over 23588.00 frames. ], tot_loss[loss=0.163, simple_loss=0.2408, pruned_loss=0.04261, over 3898221.73 frames. ], batch size: 256, lr: 3.32e-03, grad_scale: 8.0 2023-10-02 23:56:20,557 WARNING [train.py:1204] (2/4) Exclude cut with ID _7877_210_6014_1_1528286358099_3750136_159-76512-0_sp1.1 from training. Duration: 0.936375 2023-10-02 23:56:20,559 WARNING [train.py:1204] (2/4) Exclude cut with ID _8263_210_1664_1_1528438953494_294100_40-204731-0_sp1.1 from training. Duration: 0.4545625 2023-10-02 23:56:24,604 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8278_210_2550_1_1528637099_4220413_694-238413-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 23:56:24,865 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.balancer1.prob, batch_count=1064760.0, ans=0.125 2023-10-02 23:56:29,039 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.balancer2.prob, batch_count=1064760.0, ans=0.125 2023-10-02 23:56:29,145 INFO [scaling.py:1118] (2/4) WithLoss: name=encoder.encoders.3.encoder.layers.1.self_attn_weights, loss-sum=0.000e+00 2023-10-02 23:56:30,215 WARNING [train.py:1204] (2/4) Exclude cut with ID _47278_210_12041_1_1533902418349_3576110_421-172710-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 23:56:31,693 WARNING [train.py:1204] (2/4) Exclude cut with ID _14970_210_15709_1_1530773543493_1715730_215-282680-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 23:56:31,730 WARNING [train.py:1204] (2/4) Exclude cut with ID _64639_210_5816_1_1535367619437_3528000_306-149449-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 23:56:34,465 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_28-291982-0 from training. Duration: 0.97 2023-10-02 23:56:35,222 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.1.bypass.skip_rate, batch_count=1064826.6666666667, ans=0.035 2023-10-02 23:56:36,427 WARNING [train.py:1204] (2/4) Exclude cut with ID _55134_210_21982_1_1534590028377_3882170_412-15870-0_sp1.1 from training. Duration: 0.936375 2023-10-02 23:56:37,651 WARNING [train.py:1204] (2/4) Exclude cut with ID _13684_210_7497_1_1530619354863_3598359_312-235606-0 from training. Duration: 0.6 2023-10-02 23:56:40,430 WARNING [train.py:1204] (2/4) Exclude cut with ID _37112_210_2575_1_1532997007673_7268610_347-238993-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 23:56:40,476 WARNING [train.py:1204] (2/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_121-284152-0 from training. Duration: 0.59 2023-10-02 23:56:41,847 WARNING [train.py:1204] (2/4) Exclude cut with ID _2941_210_2577_1_1526199673150_2489759_228-2961-0_sp1.1 from training. Duration: 0.97275 2023-10-02 23:56:45,279 WARNING [train.py:1204] (2/4) Exclude cut with ID _28323_210_16187_1_1532480395667_4100300_531-24157-0 from training. Duration: 0.98 2023-10-02 23:56:47,245 WARNING [train.py:1204] (2/4) Exclude cut with ID _32704_210_8954_1_1533017488840_6747370_266-329755-0_sp0.9 from training. Duration: 0.988875 2023-10-02 23:56:47,366 WARNING [train.py:1204] (2/4) Exclude cut with ID _6653_210_2629_1_1527919172441_6732679_1038-146421-0_sp1.1 from training. Duration: 0.97275 2023-10-02 23:56:48,727 WARNING [train.py:1204] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_391-22148-0_sp0.9 from training. Duration: 0.8555625 2023-10-02 23:56:50,008 WARNING [train.py:1204] (2/4) Exclude cut with ID _64521_210_14699_1_1535243427259_3815328_543-341117-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 23:56:50,019 WARNING [train.py:1204] (2/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_52-168712-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 23:56:50,070 WARNING [train.py:1204] (2/4) Exclude cut with ID _33920_210_4220_1_1533257851500_3084399_198-334063-0_sp1.1 from training. Duration: 0.8545625 2023-10-02 23:56:50,089 WARNING [train.py:1204] (2/4) Exclude cut with ID _50093_210_4220_1_1534417141207_3432299_69-111987-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 23:56:51,355 WARNING [train.py:1204] (2/4) Exclude cut with ID _14821_210_8750_1_1530680503991_6829359_189-297451-0_sp0.9 from training. Duration: 0.611125 2023-10-02 23:56:53,903 WARNING [train.py:1204] (2/4) Exclude cut with ID _8030_210_7497_1_1528444886781_3531089_262-86750-0_sp0.9 from training. Duration: 0.9 2023-10-02 23:56:53,909 WARNING [train.py:1204] (2/4) Exclude cut with ID _24612_210_8913_1_1531998054193_3806359_294-191196-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 23:56:59,613 WARNING [train.py:1204] (2/4) Exclude cut with ID _16909_210_5724_1_1531292079912_4118919_707-342896-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 23:56:59,629 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_9460_210_4211_1_1528954587_10049667_572-37035-0_sp0.9 from training. Duration: 0.888875 2023-10-02 23:57:00,985 WARNING [train.py:1204] (2/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_762-109886-0_sp1.1 from training. Duration: 0.82725 2023-10-02 23:57:01,014 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_14953_210_2151_1_1530701710_5616165_519-326393-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 23:57:05,165 WARNING [train.py:1204] (2/4) Exclude cut with ID _25914_210_6400_1_1532141317058_644740_52-96808-0 from training. Duration: 0.91 2023-10-02 23:57:07,036 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_68095_210_20185_1_1535718226_8888855_1079-94525-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 23:57:11,299 WARNING [train.py:1204] (2/4) Exclude cut with ID _14860_210_4915_1_1530702020340_7343240_746-3647-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 23:57:11,300 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_70379_210_7925_1_1535941455_557455_49-101720-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 23:57:11,320 WARNING [train.py:1204] (2/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_8-307291-0_sp1.1 from training. Duration: 0.936375 2023-10-02 23:57:11,522 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.conv_skip_rate, batch_count=1064960.0, ans=0.0 2023-10-02 23:57:13,130 WARNING [train.py:1204] (2/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_117-290916-0 from training. Duration: 0.51 2023-10-02 23:57:16,545 WARNING [train.py:1204] (2/4) Exclude cut with ID _22020_210_2461_1_1531724164513_6326900_944-339840-0_sp1.1 from training. Duration: 0.963625 2023-10-02 23:57:17,940 WARNING [train.py:1204] (2/4) Exclude cut with ID IC0515W0043-254911 from training. Duration: 0.976 2023-10-02 23:57:18,037 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_3910_210_1794_1_1526742306_334530_28-238909-0 from training. Duration: 0.81 2023-10-02 23:57:18,051 WARNING [train.py:1204] (2/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_269-267275-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 23:57:22,118 WARNING [train.py:1204] (2/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_72-251255-0_sp1.1 from training. Duration: 0.8545625 2023-10-02 23:57:22,138 WARNING [train.py:1204] (2/4) Exclude cut with ID _14886_210_4220_1_1530702020355_3516190_175-229073-0 from training. Duration: 0.45 2023-10-02 23:57:24,914 WARNING [train.py:1204] (2/4) Exclude cut with ID _40519_210_13794_1_1533715228655_7337390_921-209452-0_sp1.1 from training. Duration: 0.963625 2023-10-02 23:57:27,696 WARNING [train.py:1204] (2/4) Exclude cut with ID _29857_210_20034_1_1532395886753_3798160_335-163562-0_sp0.9 from training. Duration: 0.9444375 2023-10-02 23:57:27,804 WARNING [train.py:1204] (2/4) Exclude cut with ID _58583_210_5236_1_1534726835266_3574249_384-58445-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 23:57:29,109 WARNING [train.py:1204] (2/4) Exclude cut with ID _44011_210_2046_1_1533722401691_3631310_96-315656-0_sp1.1 from training. Duration: 0.963625 2023-10-02 23:57:29,114 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_68484_210_1794_1_1535849804_7678097_549-170613-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 23:57:30,528 WARNING [train.py:1204] (2/4) Exclude cut with ID _37892_210_19596_1_1533198677897_3964058_214-148058-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 23:57:33,327 INFO [train.py:1046] (2/4) Epoch 31, batch 400, loss[loss=0.1567, simple_loss=0.2365, pruned_loss=0.03847, over 23187.00 frames. ], tot_loss[loss=0.1624, simple_loss=0.2405, pruned_loss=0.04209, over 4091552.45 frames. ], batch size: 119, lr: 3.32e-03, grad_scale: 16.0 2023-10-02 23:57:33,398 WARNING [train.py:1204] (2/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_415-78792-0_sp0.9 from training. Duration: 0.9 2023-10-02 23:57:34,797 WARNING [train.py:1204] (2/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_383-205833-0_sp1.1 from training. Duration: 0.863625 2023-10-02 23:57:36,158 WARNING [train.py:1204] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_167-137255-0 from training. Duration: 0.64 2023-10-02 23:57:36,172 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8275_210_4198_1_1528455320_3892571_484-154786-0_sp1.1 from training. Duration: 0.963625 2023-10-02 23:57:37,474 WARNING [train.py:1204] (2/4) Exclude cut with ID _40284_210_2084_1_1533513679814_3704279_396-319412-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 23:57:40,682 WARNING [train.py:1204] (2/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_280-120942-0_sp0.9 from training. Duration: 0.8555625 2023-10-02 23:57:41,953 WARNING [train.py:1204] (2/4) Exclude cut with ID _45630_210_10556_1_1533949174498_3902049_558-240889-0_sp1.1 from training. Duration: 0.92725 2023-10-02 23:57:43,393 WARNING [train.py:1204] (2/4) Exclude cut with ID _56235_210_10640_1_1534557335235_4161099_197-307458-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 23:57:45,355 WARNING [train.py:1204] (2/4) Exclude cut with ID _16909_210_5724_1_1531292079912_4118919_163-254843-0_sp1.1 from training. Duration: 0.92725 2023-10-02 23:57:46,917 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_103-84010-0 from training. Duration: 0.69 2023-10-02 23:57:48,358 WARNING [train.py:1204] (2/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_401-158239-0 from training. Duration: 0.71 2023-10-02 23:57:48,358 WARNING [train.py:1204] (2/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_899-233658-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 23:57:49,725 WARNING [train.py:1204] (2/4) Exclude cut with ID _23825_210_13449_1_1531915313526_3497150_240-271367-0 from training. Duration: 0.97 2023-10-02 23:57:51,035 WARNING [train.py:1204] (2/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_424-134619-0_sp1.1 from training. Duration: 0.92725 2023-10-02 23:57:53,853 WARNING [train.py:1204] (2/4) Exclude cut with ID _44831_210_13427_1_1533816011549_3689035_32-188147-0_sp1.1 from training. Duration: 0.87275 2023-10-02 23:57:53,856 WARNING [train.py:1204] (2/4) Exclude cut with ID _59170_210_6721_1_1534983633960_4203335_314-63879-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 23:57:53,876 WARNING [train.py:1204] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_602-275260-0 from training. Duration: 0.82 2023-10-02 23:57:53,915 WARNING [train.py:1204] (2/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_511-306767-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 23:57:53,939 WARNING [train.py:1204] (2/4) Exclude cut with ID _41817_210_18835_1_1533434477160_7423049_565-201775-0_sp1.1 from training. Duration: 0.92725 2023-10-02 23:57:53,947 WARNING [train.py:1204] (2/4) Exclude cut with ID _28317_210_12742_1_1532480302381_8510325_778-173151-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 23:57:55,303 WARNING [train.py:1204] (2/4) Exclude cut with ID _14998_210_5219_1_1530705582080_7474970_680-51001-0_sp1.1 from training. Duration: 0.963625 2023-10-02 23:57:58,199 WARNING [train.py:1204] (2/4) Exclude cut with ID IC0677W0059-334891 from training. Duration: 0.896 2023-10-02 23:57:58,258 WARNING [train.py:1204] (2/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_370-331600-0 from training. Duration: 0.94 2023-10-02 23:58:03,668 WARNING [train.py:1204] (2/4) Exclude cut with ID _66840_210_5219_1_1535452168426_4196349_446-240192-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 23:58:03,766 WARNING [train.py:1204] (2/4) Exclude cut with ID _65242_210_18628_1_1535369393717_3823149_38-6732-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 23:58:05,153 WARNING [train.py:1204] (2/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_302-2550-0 from training. Duration: 0.81 2023-10-02 23:58:06,544 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_43802_210_2290_1_1533516203_8829504_100-348441-0 from training. Duration: 0.91 2023-10-02 23:58:09,896 WARNING [train.py:1204] (2/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_329-285801-0_sp1.1 from training. Duration: 0.82725 2023-10-02 23:58:11,409 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_30167_210_3528_1_1532428012_4772375_343-334712-0_sp1.1 from training. Duration: 0.936375 2023-10-02 23:58:13,218 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=1065226.6666666667, ans=0.1 2023-10-02 23:58:16,553 INFO [scaling.py:1118] (2/4) WithLoss: name=encoder.encoders.5.encoder.layers.0.self_attn_weights, loss-sum=2.525e-03 2023-10-02 23:58:19,541 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7543_210_6753_1_1528543318_4552_1-85072-0 from training. Duration: 0.83 2023-10-02 23:58:22,250 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7002_210_1794_1_1527935634_4034444_487-236501-0_sp1.1 from training. Duration: 0.6 2023-10-02 23:58:23,604 WARNING [train.py:1204] (2/4) Exclude cut with ID _7160_210_1876_1_1527940923231_3703799_282-322611-0 from training. Duration: 0.87 2023-10-02 23:58:26,560 WARNING [train.py:1204] (2/4) Exclude cut with ID _67335_210_5219_1_1535525975652_4275229_570-262983-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 23:58:26,659 WARNING [train.py:1204] (2/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_625-338546-0_sp1.1 from training. Duration: 0.8 2023-10-02 23:58:27,992 WARNING [train.py:1204] (2/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_176-56120-0 from training. Duration: 0.83 2023-10-02 23:58:29,307 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.534e+02 1.935e+02 2.133e+02 2.545e+02 3.728e+02, threshold=4.266e+02, percent-clipped=0.0 2023-10-02 23:58:29,740 INFO [scaling.py:1118] (2/4) WithLoss: name=encoder.encoders.3.encoder.layers.3.self_attn_weights, loss-sum=0.000e+00 2023-10-02 23:58:30,753 WARNING [train.py:1204] (2/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_963-187109-0_sp1.1 from training. Duration: 0.9 2023-10-02 23:58:32,333 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.nonlin_attention.balancer.prob, batch_count=1065360.0, ans=0.125 2023-10-02 23:58:33,433 WARNING [train.py:1204] (2/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_121-284152-0_sp0.9 from training. Duration: 0.6555625 2023-10-02 23:58:34,783 WARNING [train.py:1204] (2/4) Exclude cut with ID _45021_210_9994_1_1533619751094_3330288_104-81711-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 23:58:37,466 WARNING [train.py:1204] (2/4) Exclude cut with ID _51382_210_23862_1_1534492801838_3650104_129-343914-0_sp1.1 from training. Duration: 0.936375 2023-10-02 23:58:39,330 WARNING [train.py:1204] (2/4) Exclude cut with ID _14970_210_15709_1_1530775547361_1966965_8-169924-0 from training. Duration: 0.92 2023-10-02 23:58:40,741 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_2295_210_2087_1_1525500993_2407863_234-83104-0_sp1.1 from training. Duration: 0.563625 2023-10-02 23:58:42,143 WARNING [train.py:1204] (2/4) Exclude cut with ID _14687_210_5854_1_1530703838259_3664889_115-239046-0 from training. Duration: 0.67 2023-10-02 23:58:44,076 WARNING [train.py:1204] (2/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_690-126306-0_sp1.1 from training. Duration: 0.7090625 2023-10-02 23:58:44,084 WARNING [train.py:1204] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_625-102955-0_sp0.9 from training. Duration: 0.8555625 2023-10-02 23:58:46,826 INFO [train.py:1046] (2/4) Epoch 31, batch 450, loss[loss=0.1538, simple_loss=0.2355, pruned_loss=0.03608, over 24304.00 frames. ], tot_loss[loss=0.1625, simple_loss=0.2409, pruned_loss=0.04205, over 4233013.28 frames. ], batch size: 61, lr: 3.32e-03, grad_scale: 16.0 2023-10-02 23:58:46,908 WARNING [train.py:1204] (2/4) Exclude cut with ID _9389_210_1664_1_1529217337301_5289269_289-213417-0 from training. Duration: 0.89 2023-10-02 23:58:48,743 WARNING [train.py:1204] (2/4) Exclude cut with ID _13253_210_1925_1_1530757775819_3577873_191-144977-0_sp0.9 from training. Duration: 0.8666875 2023-10-02 23:58:48,790 WARNING [train.py:1204] (2/4) Exclude cut with ID _21783_210_11357_1_1531742458730_3762679_325-155703-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 23:58:50,050 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_2295_210_2087_1_1525500993_2407863_116-238894-0_sp0.9 from training. Duration: 0.77775 2023-10-02 23:58:50,155 WARNING [train.py:1204] (2/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_94-126128-0 from training. Duration: 0.86 2023-10-02 23:58:51,931 WARNING [train.py:1204] (2/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_395-310220-0_sp0.9 from training. Duration: 0.988875 2023-10-02 23:58:53,287 WARNING [train.py:1204] (2/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_821-110852-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 23:58:53,352 WARNING [train.py:1204] (2/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_64-248359-0_sp1.1 from training. Duration: 0.62725 2023-10-02 23:58:53,369 WARNING [train.py:1204] (2/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_138-187031-0 from training. Duration: 0.99 2023-10-02 23:58:54,604 WARNING [train.py:1204] (2/4) Exclude cut with ID _4122_210_2084_1_1526713164417_4353280_528-206771-0_sp0.9 from training. Duration: 0.97775 2023-10-02 23:58:55,952 WARNING [train.py:1204] (2/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_844-96989-0_sp1.1 from training. Duration: 0.6454375 2023-10-02 23:58:57,316 WARNING [train.py:1204] (2/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_106-30782-0_sp1.1 from training. Duration: 0.72725 2023-10-02 23:59:05,544 WARNING [train.py:1204] (2/4) Exclude cut with ID _14683_210_5219_1_1530698352608_3974859_281-250040-0_sp1.1 from training. Duration: 0.936375 2023-10-02 23:59:05,585 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_46013_210_7234_1_1533776362_3744032_463-52378-0_sp0.9 from training. Duration: 0.9555625 2023-10-02 23:59:06,989 WARNING [train.py:1204] (2/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_299-34413-0 from training. Duration: 0.65 2023-10-02 23:59:07,039 WARNING [train.py:1204] (2/4) Exclude cut with ID _35799_210_5854_1_1533263487887_3529071_490-326847-0 from training. Duration: 0.99 2023-10-02 23:59:10,271 WARNING [train.py:1204] (2/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_589-9070-0_sp0.9 from training. Duration: 0.788875 2023-10-02 23:59:13,396 WARNING [train.py:1204] (2/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_884-29515-0_sp1.1 from training. Duration: 0.936375 2023-10-02 23:59:15,343 WARNING [train.py:1204] (2/4) Exclude cut with ID _15012_210_1968_1_1530856820429_5095350_474-119995-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 23:59:19,584 WARNING [train.py:1204] (2/4) Exclude cut with ID _65585_210_12210_1_1535450451584_3764098_211-97124-0_sp1.1 from training. Duration: 0.963625 2023-10-02 23:59:19,639 WARNING [train.py:1204] (2/4) Exclude cut with ID _33640_210_2655_1_1533117605479_4855469_666-224463-0_sp1.1 from training. Duration: 0.963625 2023-10-02 23:59:22,689 WARNING [train.py:1204] (2/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_35-153886-0 from training. Duration: 0.84 2023-10-02 23:59:22,735 WARNING [train.py:1204] (2/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_210-106047-0 from training. Duration: 0.97 2023-10-02 23:59:24,169 WARNING [train.py:1204] (2/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_479-125611-0 from training. Duration: 0.96 2023-10-02 23:59:25,439 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_9417_210_1794_1_1529235261_4614131_427-153963-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 23:59:26,761 WARNING [train.py:1204] (2/4) Exclude cut with ID _23818_210_6228_1_1531917063884_3676920_133-170571-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 23:59:26,825 WARNING [train.py:1204] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_602-275260-0_sp1.1 from training. Duration: 0.7454375 2023-10-02 23:59:28,206 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9029W0121-509521 from training. Duration: 0.896 2023-10-02 23:59:28,224 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9029W0434-509821 from training. Duration: 0.64 2023-10-02 23:59:28,241 WARNING [train.py:1204] (2/4) Exclude cut with ID _23038_210_9570_1_1532001584158_3652419_289-307034-0_sp1.1 from training. Duration: 0.936375 2023-10-02 23:59:29,679 WARNING [train.py:1204] (2/4) Exclude cut with ID _29345_210_4381_1_1532689168263_3860169_149-257268-0_sp0.9 from training. Duration: 0.9 2023-10-02 23:59:32,431 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_405-339986-0_sp1.1 from training. Duration: 0.463625 2023-10-02 23:59:32,653 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_2655_210_2087_1_1525688959_3897400_437-337690-0_sp1.1 from training. Duration: 0.57275 2023-10-02 23:59:34,008 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7543_210_6753_1_1528541907_1399322_165-323619-0_sp1.1 from training. Duration: 0.62725 2023-10-02 23:59:34,061 WARNING [train.py:1204] (2/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_508-335369-0_sp1.1 from training. Duration: 0.436375 2023-10-02 23:59:34,125 WARNING [train.py:1204] (2/4) Exclude cut with ID _7852_210_5219_1_1528286471316_8139400_358-287492-0 from training. Duration: 0.96 2023-10-02 23:59:36,898 WARNING [train.py:1204] (2/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_322-217220-0_sp0.9 from training. Duration: 0.9555625 2023-10-02 23:59:38,406 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_3171_210_2087_1_1526109084_2330007_66-260583-0_sp0.9 from training. Duration: 0.8 2023-10-02 23:59:40,170 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8558_210_1410_1_1528505225_7613406_131-329524-0_sp1.1 from training. Duration: 0.7181875 2023-10-02 23:59:42,048 WARNING [train.py:1204] (2/4) Exclude cut with ID _9235_210_5219_1_1528891154253_8046120_30-326681-0 from training. Duration: 0.83 2023-10-02 23:59:46,622 WARNING [train.py:1204] (2/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_58-68956-0_sp0.9 from training. Duration: 0.92225 2023-10-02 23:59:46,678 WARNING [train.py:1204] (2/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_255-312654-0 from training. Duration: 0.91 2023-10-02 23:59:46,862 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.conv_skip_rate, batch_count=1065693.3333333333, ans=0.0 2023-10-02 23:59:49,323 WARNING [train.py:1204] (2/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_99-167392-0 from training. Duration: 0.61 2023-10-02 23:59:50,681 WARNING [train.py:1204] (2/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_94-126128-0_sp0.9 from training. Duration: 0.9555625 2023-10-02 23:59:52,864 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.balancer2.prob, batch_count=1065693.3333333333, ans=0.125 2023-10-02 23:59:55,412 WARNING [train.py:1204] (2/4) Exclude cut with ID _68635_210_9317_1_1535770813410_4315870_187-192817-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 23:59:56,828 WARNING [train.py:1204] (2/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_277-204970-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 23:59:58,243 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6163_210_6400_1_1527506746_4263068_283-137811-0_sp0.9 from training. Duration: 0.8555625 2023-10-02 23:59:58,267 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9144W0121-566446 from training. Duration: 0.896 2023-10-02 23:59:58,406 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.feed_forward2.hidden_balancer.prob, batch_count=1065693.3333333333, ans=0.125 2023-10-03 00:00:00,933 INFO [train.py:1046] (2/4) Epoch 31, batch 500, loss[loss=0.156, simple_loss=0.2313, pruned_loss=0.04035, over 23300.00 frames. ], tot_loss[loss=0.1631, simple_loss=0.2416, pruned_loss=0.04227, over 4353843.92 frames. ], batch size: 119, lr: 3.32e-03, grad_scale: 8.0 2023-10-03 00:00:01,071 WARNING [train.py:1204] (2/4) Exclude cut with ID _49371_210_12071_1_1533952761021_3812190_351-315519-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 00:00:02,306 WARNING [train.py:1204] (2/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_115-326778-0_sp1.1 from training. Duration: 0.8454375 2023-10-03 00:00:02,370 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_17292_210_6753_1_1531125388_2943734_436-198965-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 00:00:02,386 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9165W0117-576751 from training. Duration: 0.896 2023-10-03 00:00:03,975 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.attention_skip_rate, batch_count=1065760.0, ans=0.0 2023-10-03 00:00:04,988 WARNING [train.py:1204] (2/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_212-149947-0 from training. Duration: 0.73 2023-10-03 00:00:04,996 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_13757_210_3528_1_1530440682_4107239_65-167544-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 00:00:06,551 WARNING [train.py:1204] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_700-183549-0_sp0.9 from training. Duration: 0.7555625 2023-10-03 00:00:11,190 WARNING [train.py:1204] (2/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_216-343007-0_sp0.9 from training. Duration: 0.7333125 2023-10-03 00:00:11,294 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7544_210_6753_1_1528587722_3576686_295-258479-0_sp1.1 from training. Duration: 0.763625 2023-10-03 00:00:14,425 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_365-287106-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 00:00:14,439 WARNING [train.py:1204] (2/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_300-3747-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 00:00:14,499 WARNING [train.py:1204] (2/4) Exclude cut with ID _14069_210_5632_1_1530512692033_4803390_487-303201-0_sp1.1 from training. Duration: 0.92725 2023-10-03 00:00:16,101 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=1065826.6666666667, ans=0.1 2023-10-03 00:00:17,547 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.feed_forward2.hidden_balancer.prob, batch_count=1065826.6666666667, ans=0.125 2023-10-03 00:00:21,448 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.ff2_skip_rate, batch_count=1065826.6666666667, ans=0.0 2023-10-03 00:00:27,275 WARNING [train.py:1204] (2/4) Exclude cut with ID _14459_210_2228_1_1531443641946_7516700_668-123026-0_sp1.1 from training. Duration: 0.936375 2023-10-03 00:00:27,311 WARNING [train.py:1204] (2/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_108-53929-0_sp1.1 from training. Duration: 0.536375 2023-10-03 00:00:27,339 WARNING [train.py:1204] (2/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_266-137104-0_sp0.9 from training. Duration: 0.72225 2023-10-03 00:00:27,371 WARNING [train.py:1204] (2/4) Exclude cut with ID _12302_210_5219_1_1529924446466_8107280_902-168619-0_sp1.1 from training. Duration: 0.936375 2023-10-03 00:00:28,668 WARNING [train.py:1204] (2/4) Exclude cut with ID _59502_210_12210_1_1535107024698_1835139_146-162495-0 from training. Duration: 0.92 2023-10-03 00:00:28,692 WARNING [train.py:1204] (2/4) Exclude cut with ID _3465_210_3525_1_1526175667060_3574049_412-287526-0_sp1.1 from training. Duration: 0.6545625 2023-10-03 00:00:32,789 WARNING [train.py:1204] (2/4) Exclude cut with ID _7160_210_1876_1_1527940923231_3703799_120-150943-0_sp0.9 from training. Duration: 0.9 2023-10-03 00:00:32,851 WARNING [train.py:1204] (2/4) Exclude cut with ID _15037_210_2253_1_1530752294104_3267650_296-192597-0_sp1.1 from training. Duration: 0.736375 2023-10-03 00:00:32,870 WARNING [train.py:1204] (2/4) Exclude cut with ID _13252_210_1925_1_1530671372047_3336909_397-216497-0_sp1.1 from training. Duration: 0.87275 2023-10-03 00:00:32,892 WARNING [train.py:1204] (2/4) Exclude cut with ID _40094_210_16380_1_1533294005773_3641159_76-269320-0_sp1.1 from training. Duration: 0.936375 2023-10-03 00:00:34,266 WARNING [train.py:1204] (2/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_449-15508-0 from training. Duration: 0.82 2023-10-03 00:00:37,111 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9309W0194-640321 from training. Duration: 0.64 2023-10-03 00:00:38,773 WARNING [train.py:1204] (2/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_284-312178-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 00:00:40,079 WARNING [train.py:1204] (2/4) Exclude cut with ID _29853_210_7497_1_1532505600851_6642110_401-55937-0_sp1.1 from training. Duration: 0.92725 2023-10-03 00:00:41,768 WARNING [train.py:1204] (2/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_716-24918-0_sp1.1 from training. Duration: 0.92725 2023-10-03 00:00:41,796 WARNING [train.py:1204] (2/4) Exclude cut with ID _19637_210_4220_1_1531537167676_3205060_314-305638-0_sp1.1 from training. Duration: 0.92725 2023-10-03 00:00:41,832 WARNING [train.py:1204] (2/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_475-127455-0_sp1.1 from training. Duration: 0.636375 2023-10-03 00:00:44,582 WARNING [train.py:1204] (2/4) Exclude cut with ID _5444_210_2151_1_1527418808464_5312210_581-315411-0 from training. Duration: 0.62 2023-10-03 00:00:47,227 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.3.encoder.layers.3.self_attn_weights.whiten_keys, num_groups=8, num_channels=256, metric=2.73 vs. limit=6.0 2023-10-03 00:00:47,984 WARNING [train.py:1204] (2/4) Exclude cut with ID _44902_210_16187_1_1533888006062_3792078_149-67212-0_sp0.9 from training. Duration: 0.9666875 2023-10-03 00:00:49,254 WARNING [train.py:1204] (2/4) Exclude cut with ID _31602_210_5854_1_1532677021456_4458250_63-63253-0_sp1.1 from training. Duration: 0.963625 2023-10-03 00:00:54,692 WARNING [train.py:1204] (2/4) Exclude cut with ID _55209_210_1531_1_1534675828996_4597160_660-203992-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 00:00:54,968 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.bypass.skip_rate, batch_count=1065960.0, ans=0.07 2023-10-03 00:00:57,866 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.647e+02 1.909e+02 2.161e+02 2.446e+02 3.681e+02, threshold=4.321e+02, percent-clipped=0.0 2023-10-03 00:00:58,020 WARNING [train.py:1204] (2/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_83-190624-0_sp1.1 from training. Duration: 0.92725 2023-10-03 00:00:58,313 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.conv_module2.balancer2.prob, batch_count=1066026.6666666667, ans=0.125 2023-10-03 00:01:03,548 WARNING [train.py:1204] (2/4) Exclude cut with ID _58605_210_12210_1_1534723195939_3904259_154-139635-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 00:01:06,388 WARNING [train.py:1204] (2/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_164-224052-0 from training. Duration: 0.7 2023-10-03 00:01:06,397 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_1126-189071-0_sp1.1 from training. Duration: 0.963625 2023-10-03 00:01:06,407 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_15396_210_6890_1_1531308473_1938936_310-214789-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 00:01:09,227 WARNING [train.py:1204] (2/4) Exclude cut with ID _8395_210_1531_1_1528597871523_4411579_431-132470-0 from training. Duration: 0.74 2023-10-03 00:01:10,479 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_3780_210_2087_1_1526467760_2130483_170-259044-0_sp0.9 from training. Duration: 0.77775 2023-10-03 00:01:11,923 WARNING [train.py:1204] (2/4) Exclude cut with ID _12733_210_8341_1_1530167424556_3646380_537-230640-0_sp1.1 from training. Duration: 0.963625 2023-10-03 00:01:13,782 INFO [train.py:1046] (2/4) Epoch 31, batch 550, loss[loss=0.1829, simple_loss=0.2548, pruned_loss=0.05547, over 23287.00 frames. ], tot_loss[loss=0.1636, simple_loss=0.242, pruned_loss=0.04263, over 4437720.88 frames. ], batch size: 285, lr: 3.32e-03, grad_scale: 8.0 2023-10-03 00:01:16,607 WARNING [train.py:1204] (2/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_159-281310-0 from training. Duration: 0.95 2023-10-03 00:01:17,974 WARNING [train.py:1204] (2/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_434-83000-0 from training. Duration: 0.98 2023-10-03 00:01:18,649 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_4405_210_1849_1_1526732205_3593939_493-32464-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 00:01:19,952 WARNING [train.py:1204] (2/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_342-100157-0 from training. Duration: 0.54 2023-10-03 00:01:20,028 WARNING [train.py:1204] (2/4) Exclude cut with ID _44778_210_2054_1_1533798127203_7443090_765-250133-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 00:01:20,033 WARNING [train.py:1204] (2/4) Exclude cut with ID _38670_210_15379_1_1533448810659_4391059_81-312245-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 00:01:20,115 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_50050_210_16223_1_1535435796_7290138_1273-247611-0_sp1.1 from training. Duration: 0.936375 2023-10-03 00:01:21,406 WARNING [train.py:1204] (2/4) Exclude cut with ID _44831_210_13427_1_1533816011549_3689035_399-18591-0_sp1.1 from training. Duration: 0.936375 2023-10-03 00:01:22,917 WARNING [train.py:1204] (2/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_557-324258-0_sp0.9 from training. Duration: 0.9 2023-10-03 00:01:24,206 WARNING [train.py:1204] (2/4) Exclude cut with ID _3465_210_3525_1_1526175667060_3574049_62-335552-0_sp1.1 from training. Duration: 0.7909375 2023-10-03 00:01:25,725 WARNING [train.py:1204] (2/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_673-264125-0_sp1.1 from training. Duration: 0.963625 2023-10-03 00:01:25,810 WARNING [train.py:1204] (2/4) Exclude cut with ID _8030_210_7497_1_1528444886781_3531089_262-86750-0 from training. Duration: 0.81 2023-10-03 00:01:25,830 WARNING [train.py:1204] (2/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_444-28041-0_sp1.1 from training. Duration: 0.77275 2023-10-03 00:01:27,699 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.5.encoder.layers.1.feed_forward1.out_whiten, num_groups=1, num_channels=256, metric=7.20 vs. limit=15.0 2023-10-03 00:01:29,177 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_34008_210_10431_1_1533345424_8183247_516-14858-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 00:01:30,433 WARNING [train.py:1204] (2/4) Exclude cut with ID _32704_210_8954_1_1533017488840_6747370_54-248218-0_sp1.1 from training. Duration: 0.936375 2023-10-03 00:01:31,927 WARNING [train.py:1204] (2/4) Exclude cut with ID _13253_210_1925_1_1530757775819_3577873_300-119782-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 00:01:33,144 WARNING [train.py:1204] (2/4) Exclude cut with ID _39290_210_4220_1_1533862778760_3075118_21-240703-0_sp1.1 from training. Duration: 0.936375 2023-10-03 00:01:37,233 WARNING [train.py:1204] (2/4) Exclude cut with ID 774-127930-0014-10412-0_sp1.1 from training. Duration: 0.95 2023-10-03 00:01:37,301 WARNING [train.py:1204] (2/4) Exclude cut with ID _67499_210_17282_1_1535509805220_3692430_20-179346-0 from training. Duration: 0.97 2023-10-03 00:01:37,400 WARNING [train.py:1204] (2/4) Exclude cut with ID _8365_210_2064_1_1528592311070_4677690_431-294102-0_sp0.9 from training. Duration: 0.988875 2023-10-03 00:01:44,096 WARNING [train.py:1204] (2/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_365-1492-0_sp1.1 from training. Duration: 0.82725 2023-10-03 00:01:45,403 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_105-114797-0_sp0.9 from training. Duration: 0.9333125 2023-10-03 00:01:45,518 WARNING [train.py:1204] (2/4) Exclude cut with ID _6274_210_2084_1_1527937583837_7494020_769-168302-0_sp0.9 from training. Duration: 0.788875 2023-10-03 00:01:48,699 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_40340_210_9024_1_1533433916_4132944_320-270277-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 00:01:48,705 WARNING [train.py:1204] (2/4) Exclude cut with ID ID0203W0003-781711 from training. Duration: 0.958 2023-10-03 00:01:50,031 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_30434_210_13049_1_1532519384_981746_52-12932-0_sp1.1 from training. Duration: 0.936375 2023-10-03 00:01:51,377 WARNING [train.py:1204] (2/4) Exclude cut with ID _8263_210_1664_1_1528438953494_294100_40-204731-0_sp0.9 from training. Duration: 0.5555625 2023-10-03 00:01:55,305 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1032-236863-0_sp0.9 from training. Duration: 0.9333125 2023-10-03 00:01:56,629 WARNING [train.py:1204] (2/4) Exclude cut with ID _8394_210_6702_1_1528590504316_3540189_335-274780-0_sp0.9 from training. Duration: 0.8666875 2023-10-03 00:01:56,637 WARNING [train.py:1204] (2/4) Exclude cut with ID _66873_210_5517_1_1535454136506_4293370_654-52713-0_sp1.1 from training. Duration: 0.7 2023-10-03 00:01:57,987 WARNING [train.py:1204] (2/4) Exclude cut with ID _5250_210_5219_1_1527249561806_8692110_742-267856-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 00:01:59,360 WARNING [train.py:1204] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_74-48717-0 from training. Duration: 0.58 2023-10-03 00:02:01,292 WARNING [train.py:1204] (2/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_18-46756-0 from training. Duration: 0.79 2023-10-03 00:02:02,673 WARNING [train.py:1204] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_612-56061-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 00:02:02,682 WARNING [train.py:1204] (2/4) Exclude cut with ID _56423_210_5854_1_1534896061049_3165969_412-2403-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 00:02:02,746 WARNING [train.py:1204] (2/4) Exclude cut with ID _62574_210_2655_1_1535180407058_3933249_120-196848-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 00:02:02,747 WARNING [train.py:1204] (2/4) Exclude cut with ID _40519_210_13794_1_1533715228655_7337390_916-51000-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 00:02:04,330 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.attention_skip_rate, batch_count=1066293.3333333333, ans=0.0 2023-10-03 00:02:06,999 WARNING [train.py:1204] (2/4) Exclude cut with ID _65263_210_22356_1_1535763598691_3921037_108-21902-0_sp1.1 from training. Duration: 0.9 2023-10-03 00:02:07,103 WARNING [train.py:1204] (2/4) Exclude cut with ID _4122_210_2084_1_1526713164417_4353280_528-206771-0_sp1.1 from training. Duration: 0.8 2023-10-03 00:02:09,844 WARNING [train.py:1204] (2/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_43-325657-0_sp1.1 from training. Duration: 0.92725 2023-10-03 00:02:11,212 WARNING [train.py:1204] (2/4) Exclude cut with ID _44659_210_2721_1_1534036120061_6614860_314-82041-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 00:02:11,266 WARNING [train.py:1204] (2/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_81-300212-0_sp0.9 from training. Duration: 0.6666875 2023-10-03 00:02:12,613 WARNING [train.py:1204] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_665-196155-0_sp0.9 from training. Duration: 0.7444375 2023-10-03 00:02:14,599 WARNING [train.py:1204] (2/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_140-336881-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 00:02:14,670 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6290_210_3273_1_1527595224_1282787_115-87095-0_sp0.9 from training. Duration: 0.611125 2023-10-03 00:02:16,600 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_53373_210_5399_1_1534229471_4715695_607-259685-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 00:02:18,019 WARNING [train.py:1204] (2/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_175-163894-0_sp1.1 from training. Duration: 0.67275 2023-10-03 00:02:18,036 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7002_210_1794_1_1527939683_5157286_1-281264-0_sp1.1 from training. Duration: 0.4 2023-10-03 00:02:25,290 WARNING [train.py:1204] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_605-257205-0 from training. Duration: 0.59 2023-10-03 00:02:25,596 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.conv_module2.balancer1.prob, batch_count=1066360.0, ans=0.125 2023-10-03 00:02:26,767 WARNING [train.py:1204] (2/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_454-314901-0 from training. Duration: 0.46 2023-10-03 00:02:28,022 INFO [train.py:1046] (2/4) Epoch 31, batch 600, loss[loss=0.1486, simple_loss=0.2315, pruned_loss=0.03282, over 24571.00 frames. ], tot_loss[loss=0.1635, simple_loss=0.2418, pruned_loss=0.0426, over 4488587.68 frames. ], batch size: 60, lr: 3.32e-03, grad_scale: 8.0 2023-10-03 00:02:29,620 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_46032_210_14656_1_1533781412_4507969_287-130422-0_sp1.1 from training. Duration: 0.97275 2023-10-03 00:02:29,642 WARNING [train.py:1204] (2/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_186-309161-0_sp1.1 from training. Duration: 0.4909375 2023-10-03 00:02:29,673 WARNING [train.py:1204] (2/4) Exclude cut with ID _42659_210_2070_1_1533607442765_3120380_45-342307-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 00:02:32,614 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.conv_module2.balancer1.prob, batch_count=1066426.6666666667, ans=0.125 2023-10-03 00:02:34,496 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=1066426.6666666667, ans=0.1 2023-10-03 00:02:35,675 WARNING [train.py:1204] (2/4) Exclude cut with ID _12555_210_8750_1_1530075594402_3780239_478-326818-0_sp1.1 from training. Duration: 0.963625 2023-10-03 00:02:39,565 WARNING [train.py:1204] (2/4) Exclude cut with ID _7606_210_4915_1_1528894253277_5080690_127-177761-0_sp1.1 from training. Duration: 0.5818125 2023-10-03 00:02:39,676 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6788_210_1388_1_1527898698_4562021_91-197981-0 from training. Duration: 0.83 2023-10-03 00:02:42,394 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8558_210_1410_1_1528505225_7613406_131-329524-0_sp0.9 from training. Duration: 0.87775 2023-10-03 00:02:43,822 WARNING [train.py:1204] (2/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_153-153537-0_sp1.1 from training. Duration: 0.82725 2023-10-03 00:02:45,739 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_17292_210_6753_1_1531125388_2943734_285-349105-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 00:02:49,863 WARNING [train.py:1204] (2/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_37-41919-0 from training. Duration: 0.87 2023-10-03 00:02:49,891 WARNING [train.py:1204] (2/4) Exclude cut with ID _60860_210_8254_1_1534896015875_3557169_172-294852-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 00:02:55,282 WARNING [train.py:1204] (2/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_146-276944-0 from training. Duration: 0.83 2023-10-03 00:02:57,967 WARNING [train.py:1204] (2/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_1254-44082-0_sp1.1 from training. Duration: 0.936375 2023-10-03 00:02:57,984 WARNING [train.py:1204] (2/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_1115-180350-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 00:02:59,413 WARNING [train.py:1204] (2/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_60-146189-0_sp1.1 from training. Duration: 0.8909375 2023-10-03 00:03:04,938 WARNING [train.py:1204] (2/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_517-153649-0_sp1.1 from training. Duration: 0.92725 2023-10-03 00:03:04,949 WARNING [train.py:1204] (2/4) Exclude cut with ID _39571_210_13449_1_1533801584143_3651077_37-266257-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 00:03:04,996 WARNING [train.py:1204] (2/4) Exclude cut with ID _6653_210_2629_1_1527919172441_6732679_527-276118-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 00:03:11,007 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7696_210_2040_1_1528207167_3716099_117-125434-0_sp1.1 from training. Duration: 0.6909375 2023-10-03 00:03:15,053 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_25767_210_2161_1_1532307483_3867748_478-156360-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 00:03:15,057 WARNING [train.py:1204] (2/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_177-49092-0_sp1.1 from training. Duration: 0.82725 2023-10-03 00:03:15,064 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_11305_210_1794_1_1529750428_7178148_680-317073-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 00:03:15,857 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.1.balancer2.prob, batch_count=1066626.6666666667, ans=0.125 2023-10-03 00:03:17,361 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.conv_skip_rate, batch_count=1066626.6666666667, ans=0.0 2023-10-03 00:03:20,029 WARNING [train.py:1204] (2/4) Exclude cut with ID _53565_210_10635_1_1534337218335_3820259_287-18120-0 from training. Duration: 0.97 2023-10-03 00:03:27,628 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.609e+02 1.826e+02 1.991e+02 2.219e+02 3.636e+02, threshold=3.983e+02, percent-clipped=0.0 2023-10-03 00:03:27,758 WARNING [train.py:1204] (2/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_498-40153-0_sp1.1 from training. Duration: 0.57275 2023-10-03 00:03:27,776 WARNING [train.py:1204] (2/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_555-63066-0_sp1.1 from training. Duration: 0.963625 2023-10-03 00:03:30,551 WARNING [train.py:1204] (2/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_395-310220-0 from training. Duration: 0.89 2023-10-03 00:03:32,041 WARNING [train.py:1204] (2/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_937-208281-0_sp1.1 from training. Duration: 0.77275 2023-10-03 00:03:33,472 WARNING [train.py:1204] (2/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_120-211630-0 from training. Duration: 0.92 2023-10-03 00:03:34,743 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_454-276429-0_sp1.1 from training. Duration: 0.7909375 2023-10-03 00:03:34,801 WARNING [train.py:1204] (2/4) Exclude cut with ID _22826_210_9994_1_1531908201522_3279169_404-75889-0_sp1.1 from training. Duration: 0.8090625 2023-10-03 00:03:42,237 WARNING [train.py:1204] (2/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_282-136641-0_sp0.9 from training. Duration: 0.4666875 2023-10-03 00:03:43,530 INFO [train.py:1046] (2/4) Epoch 31, batch 650, loss[loss=0.1736, simple_loss=0.2426, pruned_loss=0.05224, over 23854.00 frames. ], tot_loss[loss=0.1632, simple_loss=0.2403, pruned_loss=0.0431, over 4520079.25 frames. ], batch size: 195, lr: 3.31e-03, grad_scale: 8.0 2023-10-03 00:03:43,668 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_809-17156-0_sp1.1 from training. Duration: 0.663625 2023-10-03 00:03:45,163 WARNING [train.py:1204] (2/4) Exclude cut with ID _9235_210_5219_1_1528891154253_8046120_1003-101638-0_sp0.9 from training. Duration: 0.811125 2023-10-03 00:03:47,018 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.1.feed_forward2.hidden_balancer.prob, batch_count=1066760.0, ans=0.125 2023-10-03 00:03:48,781 WARNING [train.py:1204] (2/4) Exclude cut with ID _22826_210_9994_1_1531908201522_3279169_404-75889-0_sp0.9 from training. Duration: 0.988875 2023-10-03 00:03:50,696 WARNING [train.py:1204] (2/4) Exclude cut with ID _59288_210_5854_1_1535077757774_3412246_570-295255-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 00:03:52,261 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.conv_module1.balancer1.prob, batch_count=1066760.0, ans=0.125 2023-10-03 00:03:53,389 WARNING [train.py:1204] (2/4) Exclude cut with ID _3465_210_3525_1_1526175667060_3574049_412-287526-0 from training. Duration: 0.72 2023-10-03 00:03:53,459 WARNING [train.py:1204] (2/4) Exclude cut with ID _44010_210_4525_1_1533884262964_4013620_649-79655-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 00:03:57,736 WARNING [train.py:1204] (2/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_57-270037-0_sp0.9 from training. Duration: 0.8555625 2023-10-03 00:03:57,736 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_34815_210_6388_1_1533381320_3571903_302-255963-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 00:03:59,328 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.conv_skip_rate, batch_count=1066826.6666666667, ans=0.0 2023-10-03 00:04:01,847 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_47449_210_5399_1_1534076445_4006929_573-301852-0_sp1.1 from training. Duration: 0.97275 2023-10-03 00:04:03,342 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.conv_module1.balancer2.prob, batch_count=1066826.6666666667, ans=0.125 2023-10-03 00:04:06,032 WARNING [train.py:1204] (2/4) Exclude cut with ID 6709-74022-0004-86860-0_sp1.1 from training. Duration: 0.9409375 2023-10-03 00:04:08,120 WARNING [train.py:1204] (2/4) Exclude cut with ID _40519_210_13794_1_1533715228655_7337390_111-71991-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 00:04:08,179 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_30167_210_3528_1_1532428012_4772375_169-214186-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 00:04:12,332 WARNING [train.py:1204] (2/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_20-235313-0_sp1.1 from training. Duration: 0.963625 2023-10-03 00:04:12,378 WARNING [train.py:1204] (2/4) Exclude cut with ID _4681_210_3525_1_1527251040321_1235713_123-24428-0_sp0.9 from training. Duration: 0.5333125 2023-10-03 00:04:15,180 WARNING [train.py:1204] (2/4) Exclude cut with ID _31602_210_5854_1_1532677021456_4458250_444-198752-0_sp1.1 from training. Duration: 0.97275 2023-10-03 00:04:15,235 WARNING [train.py:1204] (2/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_9-279442-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 00:04:16,560 WARNING [train.py:1204] (2/4) Exclude cut with ID _6275_210_2084_1_1528110062213_7669049_973-198085-0_sp0.9 from training. Duration: 0.8333125 2023-10-03 00:04:18,488 WARNING [train.py:1204] (2/4) Exclude cut with ID _29853_210_7497_1_1532505600851_6642110_567-250221-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 00:04:19,866 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6545_210_2040_1_1527774670_4245258_407-48070-0_sp1.1 from training. Duration: 0.6909375 2023-10-03 00:04:20,446 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.4.encoder.layers.1.conv_module2.whiten, num_groups=1, num_channels=384, metric=3.97 vs. limit=15.0 2023-10-03 00:04:21,319 WARNING [train.py:1204] (2/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_224-343295-0_sp0.9 from training. Duration: 0.611125 2023-10-03 00:04:21,341 WARNING [train.py:1204] (2/4) Exclude cut with ID IC0125W0011-61817 from training. Duration: 0.88 2023-10-03 00:04:22,467 WARNING [train.py:1204] (2/4) Exclude cut with ID _60113_210_16271_1_1535164212481_4386079_435-154371-0_sp1.1 from training. Duration: 0.97275 2023-10-03 00:04:22,487 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_25836_210_16554_1_1532138167_53197_3-9394-0_sp1.1 from training. Duration: 0.92725 2023-10-03 00:04:25,376 WARNING [train.py:1204] (2/4) Exclude cut with ID _29343_210_4381_1_1532602757752_3674109_57-55125-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 00:04:25,463 WARNING [train.py:1204] (2/4) Exclude cut with ID _56235_210_10640_1_1534557335235_4161099_267-23560-0_sp1.1 from training. Duration: 0.936375 2023-10-03 00:04:25,486 WARNING [train.py:1204] (2/4) Exclude cut with ID _30196_210_1664_1_1533708496522_7074140_1150-278867-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 00:04:26,789 WARNING [train.py:1204] (2/4) Exclude cut with ID _6270_210_2084_1_1527850911573_3856420_26-277343-0_sp0.9 from training. Duration: 0.911125 2023-10-03 00:04:28,131 WARNING [train.py:1204] (2/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_325-62802-0 from training. Duration: 0.87 2023-10-03 00:04:28,202 WARNING [train.py:1204] (2/4) Exclude cut with ID _56524_210_9642_1_1534572011779_3619170_145-310842-0_sp1.1 from training. Duration: 0.9 2023-10-03 00:04:29,627 WARNING [train.py:1204] (2/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_860-96198-0_sp0.9 from training. Duration: 0.811125 2023-10-03 00:04:29,853 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.ff2_skip_rate, batch_count=1066960.0, ans=0.0 2023-10-03 00:04:31,009 WARNING [train.py:1204] (2/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_17-25045-0_sp1.1 from training. Duration: 0.536375 2023-10-03 00:04:32,264 WARNING [train.py:1204] (2/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_349-69727-0_sp1.1 from training. Duration: 0.936375 2023-10-03 00:04:33,600 WARNING [train.py:1204] (2/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_580-93798-0_sp1.1 from training. Duration: 0.5090625 2023-10-03 00:04:34,916 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7762_210_1794_1_1528199508_4057408_419-125009-0 from training. Duration: 0.83 2023-10-03 00:04:35,015 WARNING [train.py:1204] (2/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_356-167816-0 from training. Duration: 0.72 2023-10-03 00:04:35,046 WARNING [train.py:1204] (2/4) Exclude cut with ID _29345_210_4381_1_1532689168263_3860169_225-97100-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 00:04:35,050 WARNING [train.py:1204] (2/4) Exclude cut with ID _14227_210_4381_1_1530701804286_3940259_374-194712-0_sp1.1 from training. Duration: 0.92725 2023-10-03 00:04:36,396 WARNING [train.py:1204] (2/4) Exclude cut with ID _40137_210_12210_1_1533290484046_3795180_249-187059-0_sp0.9 from training. Duration: 0.97775 2023-10-03 00:04:36,428 WARNING [train.py:1204] (2/4) Exclude cut with ID _22826_210_9994_1_1531908201522_3279169_389-317194-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 00:04:37,898 WARNING [train.py:1204] (2/4) Exclude cut with ID _27485_210_4916_1_1532739191677_3668269_239-122918-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 00:04:45,203 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_2245_210_2715_1_1525511548_3540254_128-117609-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 00:04:46,639 WARNING [train.py:1204] (2/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_160-276178-0_sp1.1 from training. Duration: 0.963625 2023-10-03 00:04:46,722 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_31920_210_10633_1_1532860009_1686094_1-279735-0_sp1.1 from training. Duration: 0.97275 2023-10-03 00:04:50,544 WARNING [train.py:1204] (2/4) Exclude cut with ID _51078_210_9070_1_1534161557424_3805250_325-262383-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 00:04:50,572 WARNING [train.py:1204] (2/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_643-52007-0_sp0.9 from training. Duration: 0.6333125 2023-10-03 00:04:50,635 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_51937_210_5014_1_1534573610_4022025_518-299248-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 00:04:54,069 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.3.encoder.layers.1.conv_module1.whiten, num_groups=1, num_channels=512, metric=3.56 vs. limit=15.0 2023-10-03 00:04:56,585 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.feed_forward3.out_whiten.whitening_limit, batch_count=1067093.3333333333, ans=15.0 2023-10-03 00:04:57,403 INFO [train.py:1046] (2/4) Epoch 31, batch 700, loss[loss=0.1612, simple_loss=0.2538, pruned_loss=0.03424, over 24546.00 frames. ], tot_loss[loss=0.163, simple_loss=0.2405, pruned_loss=0.04276, over 4575977.77 frames. ], batch size: 71, lr: 3.31e-03, grad_scale: 8.0 2023-10-03 00:04:57,490 WARNING [train.py:1204] (2/4) Exclude cut with ID _13252_210_1925_1_1530671372047_3336909_166-314607-0_sp1.1 from training. Duration: 0.4909375 2023-10-03 00:04:57,494 WARNING [train.py:1204] (2/4) Exclude cut with ID _44778_210_2054_1_1533798127203_7443090_1087-282939-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 00:04:58,749 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_23850_210_4238_1_1532232597_1632433_32-185135-0_sp1.1 from training. Duration: 0.7818125 2023-10-03 00:04:58,786 WARNING [train.py:1204] (2/4) Exclude cut with ID _59500_210_8254_1_1534809610680_3736009_145-89359-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 00:05:02,897 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_340-336408-0 from training. Duration: 0.93 2023-10-03 00:05:04,258 WARNING [train.py:1204] (2/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_9-21737-0 from training. Duration: 0.97 2023-10-03 00:05:07,021 WARNING [train.py:1204] (2/4) Exclude cut with ID _2220_210_1819_1_1525509109898_3658680_50-99837-0 from training. Duration: 0.54 2023-10-03 00:05:07,090 WARNING [train.py:1204] (2/4) Exclude cut with ID _30304_210_6319_1_1532476827261_7582060_879-299350-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 00:05:08,529 WARNING [train.py:1204] (2/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_905-160074-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 00:05:08,662 WARNING [train.py:1204] (2/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_395-73570-0 from training. Duration: 0.99 2023-10-03 00:05:13,231 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7264_210_4938_1_1527939911_8132095_800-91995-0_sp1.1 from training. Duration: 0.7818125 2023-10-03 00:05:14,711 WARNING [train.py:1204] (2/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_77-311404-0_sp1.1 from training. Duration: 0.8545625 2023-10-03 00:05:16,694 WARNING [train.py:1204] (2/4) Exclude cut with ID _13679_210_7497_1_1530534293662_3355098_107-80772-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 00:05:19,996 WARNING [train.py:1204] (2/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_224-343295-0_sp1.1 from training. Duration: 0.5 2023-10-03 00:05:21,246 WARNING [train.py:1204] (2/4) Exclude cut with ID _54871_210_22356_1_1534665798253_3856359_156-253482-0_sp1.1 from training. Duration: 0.963625 2023-10-03 00:05:22,611 WARNING [train.py:1204] (2/4) Exclude cut with ID _49728_210_12651_1_1534041034294_6788150_309-125532-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 00:05:25,403 WARNING [train.py:1204] (2/4) Exclude cut with ID _13913_210_9994_1_1530579615187_3383897_473-277682-0_sp0.9 from training. Duration: 0.4333125 2023-10-03 00:05:25,407 WARNING [train.py:1204] (2/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_3-276701-0_sp1.1 from training. Duration: 0.82725 2023-10-03 00:05:26,793 WARNING [train.py:1204] (2/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_282-136641-0 from training. Duration: 0.42 2023-10-03 00:05:28,318 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.0.conv_module2.balancer1.prob, batch_count=1067226.6666666667, ans=0.125 2023-10-03 00:05:29,449 WARNING [train.py:1204] (2/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_127-566-0 from training. Duration: 0.84 2023-10-03 00:05:33,589 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6163_210_6400_1_1527506746_4263068_283-137811-0_sp1.1 from training. Duration: 0.7 2023-10-03 00:05:34,877 WARNING [train.py:1204] (2/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_34-183685-0_sp1.1 from training. Duration: 0.8909375 2023-10-03 00:05:35,010 WARNING [train.py:1204] (2/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_154-135696-0_sp0.9 from training. Duration: 0.888875 2023-10-03 00:05:35,229 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.conv_skip_rate, batch_count=1067226.6666666667, ans=0.0 2023-10-03 00:05:39,074 WARNING [train.py:1204] (2/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_676-51014-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 00:05:40,426 WARNING [train.py:1204] (2/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_38-169920-0 from training. Duration: 0.91 2023-10-03 00:05:47,039 WARNING [train.py:1204] (2/4) Exclude cut with ID _6653_210_2629_1_1527919172441_6732679_747-114465-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 00:05:47,088 WARNING [train.py:1204] (2/4) Exclude cut with ID _8021_210_7994_1_1528376451358_3653069_100-140231-0_sp1.1 from training. Duration: 0.6090625 2023-10-03 00:05:47,107 WARNING [train.py:1204] (2/4) Exclude cut with ID _64135_210_2253_1_1535677267876_7608972_133-188266-0 from training. Duration: 0.81 2023-10-03 00:05:50,533 WARNING [train.py:1204] (2/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_714-310538-0_sp1.1 from training. Duration: 0.7818125 2023-10-03 00:05:51,832 WARNING [train.py:1204] (2/4) Exclude cut with ID _39867_210_9826_1_1533553222030_9344259_1134-288498-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 00:05:54,704 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.495e+02 1.944e+02 2.173e+02 2.571e+02 3.380e+02, threshold=4.347e+02, percent-clipped=0.0 2023-10-03 00:05:54,879 WARNING [train.py:1204] (2/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_211-237289-0_sp1.1 from training. Duration: 0.936375 2023-10-03 00:06:00,294 WARNING [train.py:1204] (2/4) Exclude cut with ID _59202_210_26984_1_1534816810612_3549174_137-318710-0_sp0.9 from training. Duration: 0.988875 2023-10-03 00:06:00,327 WARNING [train.py:1204] (2/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_86-66192-0 from training. Duration: 0.55 2023-10-03 00:06:00,624 INFO [scaling.py:1118] (2/4) WithLoss: name=encoder.encoders.4.encoder.layers.2.self_attn_weights, loss-sum=0.000e+00 2023-10-03 00:06:03,008 WARNING [train.py:1204] (2/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_540-275685-0 from training. Duration: 0.89 2023-10-03 00:06:03,045 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8776_210_2040_1_1528638837_4088036_244-278763-0 from training. Duration: 0.9 2023-10-03 00:06:05,840 WARNING [train.py:1204] (2/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_9-219972-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 00:06:07,209 WARNING [train.py:1204] (2/4) Exclude cut with ID _44659_210_2721_1_1534036120061_6614860_656-283823-0_sp1.1 from training. Duration: 0.92725 2023-10-03 00:06:08,508 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_69630_210_7234_1_1535789601_661049_77-216385-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 00:06:08,665 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.bypass.scale_min, batch_count=1067426.6666666667, ans=0.2 2023-10-03 00:06:09,786 INFO [train.py:1046] (2/4) Epoch 31, batch 750, loss[loss=0.1663, simple_loss=0.2557, pruned_loss=0.03848, over 24551.00 frames. ], tot_loss[loss=0.1626, simple_loss=0.2406, pruned_loss=0.04228, over 4612120.28 frames. ], batch size: 71, lr: 3.31e-03, grad_scale: 8.0 2023-10-03 00:06:11,198 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_19952_210_2161_1_1531875469_3851750_210-308919-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 00:06:12,467 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_4737_210_2040_1_1526998689_3293008_378-178650-0 from training. Duration: 0.95 2023-10-03 00:06:15,724 WARNING [train.py:1204] (2/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_224-343295-0 from training. Duration: 0.55 2023-10-03 00:06:15,744 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7002_210_1794_1_1527939683_5157286_184-229630-0 from training. Duration: 0.74 2023-10-03 00:06:17,614 WARNING [train.py:1204] (2/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_6-214815-0 from training. Duration: 0.85 2023-10-03 00:06:17,691 WARNING [train.py:1204] (2/4) Exclude cut with ID _8699_210_2069_1_1528630346831_3960409_160-24408-0 from training. Duration: 0.91 2023-10-03 00:06:19,068 WARNING [train.py:1204] (2/4) Exclude cut with ID _5585_210_2667_1_1527418813324_8007340_89-332072-0 from training. Duration: 0.64 2023-10-03 00:06:19,108 WARNING [train.py:1204] (2/4) Exclude cut with ID _5250_210_5219_1_1527249561806_8692110_528-259800-0_sp1.1 from training. Duration: 0.963625 2023-10-03 00:06:20,432 WARNING [train.py:1204] (2/4) Exclude cut with ID _55383_210_10635_1_1534468426172_2231669_134-128246-0 from training. Duration: 0.89 2023-10-03 00:06:20,507 WARNING [train.py:1204] (2/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_400-338274-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 00:06:22,299 WARNING [train.py:1204] (2/4) Exclude cut with ID _14224_210_4381_1_1530529062037_4264010_364-22817-0_sp0.9 from training. Duration: 0.788875 2023-10-03 00:06:24,142 WARNING [train.py:1204] (2/4) Exclude cut with ID _42863_210_9070_1_1533902398788_3696920_331-291057-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 00:06:25,546 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_5436_210_1794_1_1527331317_2541494_229-211016-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 00:06:25,585 WARNING [train.py:1204] (2/4) Exclude cut with ID _6456_210_1534_1_1527681634212_3181049_345-252877-0_sp0.9 from training. Duration: 0.711125 2023-10-03 00:06:26,813 WARNING [train.py:1204] (2/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_96-81499-0_sp1.1 from training. Duration: 0.92725 2023-10-03 00:06:28,359 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_5575_210_1410_1_1527301305_2570170_342-265572-0_sp1.1 from training. Duration: 0.8545625 2023-10-03 00:06:28,538 INFO [scaling.py:1118] (2/4) WithLoss: name=encoder.encoders.1.encoder.layers.1.self_attn_weights, loss-sum=0.000e+00 2023-10-03 00:06:29,723 WARNING [train.py:1204] (2/4) Exclude cut with ID _5444_210_2151_1_1527418808464_5312210_726-27165-0_sp0.9 from training. Duration: 0.7444375 2023-10-03 00:06:31,025 WARNING [train.py:1204] (2/4) Exclude cut with ID _22083_210_8254_1_1531702862439_3678970_304-327750-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 00:06:33,839 WARNING [train.py:1204] (2/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_245-226199-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 00:06:33,868 WARNING [train.py:1204] (2/4) Exclude cut with ID _42875_210_14856_1_1533704397487_4012433_102-199499-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 00:06:33,936 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_51225_210_3601_1_1534400634_2327383_55-301844-0 from training. Duration: 0.93 2023-10-03 00:06:35,252 WARNING [train.py:1204] (2/4) Exclude cut with ID _28317_210_12742_1_1532480302381_8510325_916-195848-0_sp1.1 from training. Duration: 0.836375 2023-10-03 00:06:35,948 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.3.encoder.layers.1.nonlin_attention.whiten1, num_groups=1, num_channels=384, metric=4.79 vs. limit=10.0 2023-10-03 00:06:36,589 WARNING [train.py:1204] (2/4) Exclude cut with ID _44718_210_5816_1_1534050051869_6469030_497-311999-0_sp1.1 from training. Duration: 0.936375 2023-10-03 00:06:38,137 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_34886_210_10633_1_1533203882_1755661_91-194765-0_sp1.1 from training. Duration: 0.936375 2023-10-03 00:06:39,622 WARNING [train.py:1204] (2/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_310-26186-0_sp0.9 from training. Duration: 0.77775 2023-10-03 00:06:40,984 WARNING [train.py:1204] (2/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_416-257730-0 from training. Duration: 0.65 2023-10-03 00:06:40,990 WARNING [train.py:1204] (2/4) Exclude cut with ID _9389_210_1664_1_1529217337301_5289269_209-154760-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 00:06:42,436 WARNING [train.py:1204] (2/4) Exclude cut with ID _4092_210_2151_1_1526643044449_3135759_227-213972-0 from training. Duration: 0.54 2023-10-03 00:06:42,467 WARNING [train.py:1204] (2/4) Exclude cut with ID IC0679W0044-335852 from training. Duration: 0.768 2023-10-03 00:06:43,756 WARNING [train.py:1204] (2/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_42-186162-0 from training. Duration: 0.92 2023-10-03 00:06:43,763 WARNING [train.py:1204] (2/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_302-2550-0_sp0.9 from training. Duration: 0.9 2023-10-03 00:06:43,782 WARNING [train.py:1204] (2/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_356-31216-0_sp0.9 from training. Duration: 0.8333125 2023-10-03 00:06:47,651 WARNING [train.py:1204] (2/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_125-322172-0_sp1.1 from training. Duration: 0.8454375 2023-10-03 00:06:53,658 WARNING [train.py:1204] (2/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_141-89727-0_sp0.9 from training. Duration: 0.788875 2023-10-03 00:06:55,540 WARNING [train.py:1204] (2/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_647-337784-0_sp1.1 from training. Duration: 0.97275 2023-10-03 00:06:55,557 WARNING [train.py:1204] (2/4) Exclude cut with ID _6493_210_1747_1_1528459210424_4012470_513-140967-0_sp1.1 from training. Duration: 0.7181875 2023-10-03 00:06:57,075 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.0.bypass.skip_rate, batch_count=1067626.6666666667, ans=0.035 2023-10-03 00:06:58,259 WARNING [train.py:1204] (2/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_87-266334-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 00:07:00,828 WARNING [train.py:1204] (2/4) Exclude cut with ID _26528_210_9070_1_1532570410842_3851310_526-261338-0_sp1.1 from training. Duration: 0.92725 2023-10-03 00:07:00,862 WARNING [train.py:1204] (2/4) Exclude cut with ID _14224_210_4381_1_1530529062037_4264010_380-343670-0 from training. Duration: 0.88 2023-10-03 00:07:00,915 WARNING [train.py:1204] (2/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_449-15508-0_sp1.1 from training. Duration: 0.7454375 2023-10-03 00:07:02,325 WARNING [train.py:1204] (2/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_372-6869-0_sp0.9 from training. Duration: 0.488875 2023-10-03 00:07:02,392 WARNING [train.py:1204] (2/4) Exclude cut with ID _26907_210_7234_1_1532221740694_3205263_337-2494-0_sp0.9 from training. Duration: 0.97775 2023-10-03 00:07:03,954 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.1.bypass_mid.scale_min, batch_count=1067626.6666666667, ans=0.2 2023-10-03 00:07:04,104 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.ff2_skip_rate, batch_count=1067626.6666666667, ans=0.0 2023-10-03 00:07:05,301 WARNING [train.py:1204] (2/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_10-100367-0_sp1.1 from training. Duration: 0.8909375 2023-10-03 00:07:05,333 WARNING [train.py:1204] (2/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_47-167373-0 from training. Duration: 0.92 2023-10-03 00:07:05,375 WARNING [train.py:1204] (2/4) Exclude cut with ID _28323_210_16187_1_1532480395667_4100300_628-282179-0_sp1.1 from training. Duration: 0.97275 2023-10-03 00:07:06,971 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=1067626.6666666667, ans=0.1 2023-10-03 00:07:10,844 WARNING [train.py:1204] (2/4) Exclude cut with ID _20455_210_5219_1_1531565996775_8098100_213-280898-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 00:07:12,190 WARNING [train.py:1204] (2/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_383-316000-0_sp1.1 from training. Duration: 0.8181875 2023-10-03 00:07:12,220 WARNING [train.py:1204] (2/4) Exclude cut with ID _18513_210_9206_1_1531618215223_3862620_16-78180-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 00:07:13,655 WARNING [train.py:1204] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_330-327861-0_sp1.1 from training. Duration: 0.6090625 2023-10-03 00:07:18,858 WARNING [train.py:1204] (2/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_356-31216-0 from training. Duration: 0.75 2023-10-03 00:07:18,878 WARNING [train.py:1204] (2/4) Exclude cut with ID _25248_210_8750_1_1532062825941_5554180_255-148539-0_sp1.1 from training. Duration: 0.963625 2023-10-03 00:07:18,928 WARNING [train.py:1204] (2/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_269-154726-0_sp1.1 from training. Duration: 0.7818125 2023-10-03 00:07:19,179 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.ff2_skip_rate, batch_count=1067693.3333333333, ans=0.0 2023-10-03 00:07:20,321 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7002_210_1794_1_1527939683_5157286_163-87552-0_sp1.1 from training. Duration: 0.7818125 2023-10-03 00:07:20,345 WARNING [train.py:1204] (2/4) Exclude cut with ID _54871_210_22356_1_1534665798253_3856359_97-151896-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 00:07:23,734 WARNING [train.py:1204] (2/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_207-7026-0_sp1.1 from training. Duration: 0.97275 2023-10-03 00:07:25,398 INFO [train.py:1046] (2/4) Epoch 31, batch 800, loss[loss=0.1648, simple_loss=0.2504, pruned_loss=0.03955, over 24545.00 frames. ], tot_loss[loss=0.1625, simple_loss=0.2408, pruned_loss=0.04206, over 4637444.30 frames. ], batch size: 71, lr: 3.31e-03, grad_scale: 16.0 2023-10-03 00:07:25,460 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_92-46009-0_sp0.9 from training. Duration: 0.6 2023-10-03 00:07:31,055 WARNING [train.py:1204] (2/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_122-178847-0_sp1.1 from training. Duration: 0.97275 2023-10-03 00:07:31,059 WARNING [train.py:1204] (2/4) Exclude cut with ID _44778_210_2054_1_1533798127203_7443090_94-348956-0_sp1.1 from training. Duration: 0.92725 2023-10-03 00:07:32,489 WARNING [train.py:1204] (2/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_1018-244089-0_sp1.1 from training. Duration: 0.963625 2023-10-03 00:07:32,509 WARNING [train.py:1204] (2/4) Exclude cut with ID _56235_210_10640_1_1534557335235_4161099_87-118423-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 00:07:35,097 WARNING [train.py:1204] (2/4) Exclude cut with ID _53713_210_24116_1_1534314487762_7416751_504-212292-0_sp1.1 from training. Duration: 0.92725 2023-10-03 00:07:35,116 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_446-150327-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 00:07:36,436 WARNING [train.py:1204] (2/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_682-245989-0_sp1.1 from training. Duration: 0.92725 2023-10-03 00:07:39,990 WARNING [train.py:1204] (2/4) Exclude cut with ID _35723_210_1794_1_1533088341043_628250_8-234524-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 00:07:41,429 WARNING [train.py:1204] (2/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_64-248359-0_sp0.9 from training. Duration: 0.7666875 2023-10-03 00:07:44,135 WARNING [train.py:1204] (2/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_49-97158-0 from training. Duration: 0.76 2023-10-03 00:07:45,465 WARNING [train.py:1204] (2/4) Exclude cut with ID _24314_210_4915_1_1532135078116_7170179_743-915-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 00:07:46,933 WARNING [train.py:1204] (2/4) Exclude cut with ID _61194_210_18628_1_1535007555628_3750909_353-323468-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 00:07:46,968 WARNING [train.py:1204] (2/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_387-131538-0_sp1.1 from training. Duration: 0.736375 2023-10-03 00:07:48,658 WARNING [train.py:1204] (2/4) Exclude cut with ID _44424_210_18163_1_1533623394848_1213110_79-187603-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 00:07:48,703 WARNING [train.py:1204] (2/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_86-19601-0 from training. Duration: 0.85 2023-10-03 00:07:48,722 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_50050_210_16223_1_1535435796_7290138_103-283529-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 00:07:48,753 WARNING [train.py:1204] (2/4) Exclude cut with ID _13913_210_9994_1_1530579615187_3383897_473-277682-0 from training. Duration: 0.39 2023-10-03 00:07:52,903 WARNING [train.py:1204] (2/4) Exclude cut with ID _42326_210_7770_1_1533814282531_3707269_368-68029-0_sp1.1 from training. Duration: 0.92725 2023-10-03 00:07:53,147 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.nonlin_attention.balancer.prob, batch_count=1067893.3333333333, ans=0.125 2023-10-03 00:07:56,161 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_41428_210_2161_1_1533774184_3992532_446-339645-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 00:07:56,375 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.bypass_mid.scale_min, batch_count=1067893.3333333333, ans=0.2 2023-10-03 00:07:57,664 WARNING [train.py:1204] (2/4) Exclude cut with ID _69584_210_5219_1_1535785133775_4044450_325-7656-0_sp1.1 from training. Duration: 0.7818125 2023-10-03 00:07:57,671 WARNING [train.py:1204] (2/4) Exclude cut with ID _21632_210_2253_1_1531994001765_3744140_134-184387-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 00:07:59,839 WARNING [train.py:1204] (2/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_550-184707-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 00:07:59,854 WARNING [train.py:1204] (2/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_106-341780-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 00:08:03,871 WARNING [train.py:1204] (2/4) Exclude cut with ID _45871_210_5665_1_1533864614245_3296060_253-132552-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 00:08:05,021 WARNING [train.py:1204] (2/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_395-310220-0_sp1.1 from training. Duration: 0.8090625 2023-10-03 00:08:05,042 WARNING [train.py:1204] (2/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_795-33252-0_sp0.9 from training. Duration: 0.411125 2023-10-03 00:08:05,308 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.conv_module1.balancer1.prob, batch_count=1067893.3333333333, ans=0.125 2023-10-03 00:08:06,447 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9002W0003-495962 from training. Duration: 0.946 2023-10-03 00:08:06,610 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.nonlin_attention.balancer.prob, batch_count=1067893.3333333333, ans=0.125 2023-10-03 00:08:07,793 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_43-71989-0 from training. Duration: 0.57 2023-10-03 00:08:07,813 WARNING [train.py:1204] (2/4) Exclude cut with ID _8699_210_2069_1_1528630346831_3960409_155-39766-0_sp1.1 from training. Duration: 0.7090625 2023-10-03 00:08:07,832 WARNING [train.py:1204] (2/4) Exclude cut with ID _27894_210_12392_1_1532415496078_6902439_788-232684-0_sp1.1 from training. Duration: 0.97275 2023-10-03 00:08:09,261 WARNING [train.py:1204] (2/4) Exclude cut with ID _56494_210_4220_1_1535284837625_3033169_10-39296-0_sp1.1 from training. Duration: 0.92725 2023-10-03 00:08:09,290 WARNING [train.py:1204] (2/4) Exclude cut with ID _40901_210_5665_1_1533363789958_3856130_341-20819-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 00:08:09,582 INFO [scaling.py:1118] (2/4) WithLoss: name=encoder.encoders.5.encoder.layers.0.self_attn_weights, loss-sum=0.000e+00 2023-10-03 00:08:13,909 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9029W0435-509822 from training. Duration: 0.896 2023-10-03 00:08:15,204 WARNING [train.py:1204] (2/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_51-117576-0 from training. Duration: 0.4 2023-10-03 00:08:15,307 WARNING [train.py:1204] (2/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_197-28164-0_sp1.1 from training. Duration: 0.72725 2023-10-03 00:08:18,166 WARNING [train.py:1204] (2/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_145-137790-0_sp1.1 from training. Duration: 0.7545625 2023-10-03 00:08:22,714 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.490e+02 2.007e+02 2.299e+02 2.659e+02 4.036e+02, threshold=4.599e+02, percent-clipped=0.0 2023-10-03 00:08:24,205 WARNING [train.py:1204] (2/4) Exclude cut with ID _47790_210_16187_1_1533862814066_4207519_2-39653-0_sp1.1 from training. Duration: 0.8909375 2023-10-03 00:08:25,870 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.feed_forward3.hidden_balancer.prob, batch_count=1068026.6666666667, ans=0.125 2023-10-03 00:08:27,739 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_3780_210_2087_1_1526467760_2130483_186-208240-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 00:08:27,901 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.conv_module2.balancer2.prob, batch_count=1068026.6666666667, ans=0.125 2023-10-03 00:08:28,997 WARNING [train.py:1204] (2/4) Exclude cut with ID _24055_210_12651_1_1532310907143_976979_92-59356-0 from training. Duration: 0.91 2023-10-03 00:08:29,050 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_3910_210_1794_1_1526742306_334530_28-238909-0_sp1.1 from training. Duration: 0.736375 2023-10-03 00:08:32,423 WARNING [train.py:1204] (2/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_269-154726-0 from training. Duration: 0.86 2023-10-03 00:08:32,663 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.feed_forward1.hidden_balancer.prob, batch_count=1068026.6666666667, ans=0.125 2023-10-03 00:08:37,827 WARNING [train.py:1204] (2/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_326-165900-0_sp0.9 from training. Duration: 0.7666875 2023-10-03 00:08:39,168 INFO [train.py:1046] (2/4) Epoch 31, batch 850, loss[loss=0.1602, simple_loss=0.2448, pruned_loss=0.03778, over 24485.00 frames. ], tot_loss[loss=0.1628, simple_loss=0.2412, pruned_loss=0.04221, over 4663920.28 frames. ], batch size: 66, lr: 3.31e-03, grad_scale: 16.0 2023-10-03 00:08:39,301 WARNING [train.py:1204] (2/4) Exclude cut with ID _5294_210_1747_1_1527249643928_3696179_470-276077-0_sp1.1 from training. Duration: 0.963625 2023-10-03 00:08:40,586 WARNING [train.py:1204] (2/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_372-131163-0 from training. Duration: 0.94 2023-10-03 00:08:40,643 WARNING [train.py:1204] (2/4) Exclude cut with ID _45297_210_2721_1_1533715351381_3264949_351-147136-0_sp1.1 from training. Duration: 0.7909375 2023-10-03 00:08:41,927 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_1072-12671-0_sp1.1 from training. Duration: 0.92725 2023-10-03 00:08:43,293 WARNING [train.py:1204] (2/4) Exclude cut with ID _15133_210_6228_1_1530794512074_3080379_54-198193-0 from training. Duration: 0.94 2023-10-03 00:08:43,319 WARNING [train.py:1204] (2/4) Exclude cut with ID _30196_210_1664_1_1533708496522_7074140_1194-270988-0_sp1.1 from training. Duration: 0.97275 2023-10-03 00:08:44,698 WARNING [train.py:1204] (2/4) Exclude cut with ID _50310_210_15488_1_1534068043345_3641080_289-193637-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 00:08:46,720 WARNING [train.py:1204] (2/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_313-263495-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 00:08:48,144 WARNING [train.py:1204] (2/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_18-130336-0_sp1.1 from training. Duration: 0.7181875 2023-10-03 00:08:48,196 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_47896_210_20508_1_1534492518_5958868_646-287209-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 00:08:49,701 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_294-163793-0 from training. Duration: 0.82 2023-10-03 00:08:51,065 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_8-319957-0 from training. Duration: 0.96 2023-10-03 00:08:51,068 WARNING [train.py:1204] (2/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_1211-226066-0 from training. Duration: 0.76 2023-10-03 00:08:53,081 WARNING [train.py:1204] (2/4) Exclude cut with ID _4779_210_2084_1_1527318022944_3914893_500-285013-0_sp0.9 from training. Duration: 0.7666875 2023-10-03 00:08:53,101 WARNING [train.py:1204] (2/4) Exclude cut with ID _22826_210_9994_1_1531908201522_3279169_347-232268-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 00:08:54,539 WARNING [train.py:1204] (2/4) Exclude cut with ID _29877_210_6228_1_1532397791106_5276149_111-55056-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 00:08:54,568 WARNING [train.py:1204] (2/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_812-271646-0_sp1.1 from training. Duration: 0.92725 2023-10-03 00:08:55,327 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.2.encoder.layers.1.self_attn_weights.whiten_keys, num_groups=4, num_channels=128, metric=3.02 vs. limit=6.0 2023-10-03 00:08:55,884 WARNING [train.py:1204] (2/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_235-262124-0_sp1.1 from training. Duration: 0.6090625 2023-10-03 00:08:59,381 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.bypass_mid.scale_min, batch_count=1068160.0, ans=0.2 2023-10-03 00:09:00,948 WARNING [train.py:1204] (2/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_14-16530-0_sp1.1 from training. Duration: 0.97275 2023-10-03 00:09:00,980 WARNING [train.py:1204] (2/4) Exclude cut with ID _44718_210_5816_1_1534050051869_6469030_568-304512-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 00:09:01,000 WARNING [train.py:1204] (2/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_1033-274376-0 from training. Duration: 0.87 2023-10-03 00:09:01,197 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.1.ff2_skip_rate, batch_count=1068160.0, ans=0.0 2023-10-03 00:09:03,853 WARNING [train.py:1204] (2/4) Exclude cut with ID _4681_210_3525_1_1527251040321_1235713_123-24428-0 from training. Duration: 0.48 2023-10-03 00:09:09,414 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_37818_210_4238_1_1533620910_4720083_251-288764-0_sp1.1 from training. Duration: 0.97275 2023-10-03 00:09:09,489 WARNING [train.py:1204] (2/4) Exclude cut with ID _65585_210_12210_1_1535450451584_3764098_112-294301-0 from training. Duration: 0.88 2023-10-03 00:09:13,522 WARNING [train.py:1204] (2/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_329-113173-0 from training. Duration: 0.62 2023-10-03 00:09:13,620 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6767_210_3273_1_1527855783_1718932_173-256601-0 from training. Duration: 0.8 2023-10-03 00:09:15,103 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9266W0001-627677 from training. Duration: 0.896 2023-10-03 00:09:16,435 WARNING [train.py:1204] (2/4) Exclude cut with ID _49643_210_7989_1_1534640244224_3939031_611-185515-0_sp1.1 from training. Duration: 0.8545625 2023-10-03 00:09:16,439 WARNING [train.py:1204] (2/4) Exclude cut with ID _31649_210_6228_1_1532584608063_4091078_687-237771-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 00:09:16,456 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_12868_210_6753_1_1530702479_3610564_148-172861-0_sp0.9 from training. Duration: 0.5555625 2023-10-03 00:09:18,370 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_5129_210_6400_1_1527161005_3579054_107-69738-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 00:09:19,800 WARNING [train.py:1204] (2/4) Exclude cut with ID _40137_210_12210_1_1533290484046_3795180_36-301164-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 00:09:19,829 WARNING [train.py:1204] (2/4) Exclude cut with ID _7537_210_2084_1_1528502395431_3924359_113-199933-0 from training. Duration: 0.59 2023-10-03 00:09:22,992 WARNING [train.py:1204] (2/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_547-5424-0_sp1.1 from training. Duration: 0.8545625 2023-10-03 00:09:23,052 WARNING [train.py:1204] (2/4) Exclude cut with ID _43552_210_8913_1_1533726031059_3523429_147-284024-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 00:09:24,407 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_294-163793-0_sp1.1 from training. Duration: 0.7454375 2023-10-03 00:09:24,425 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_3780_210_2087_1_1526467760_2130483_170-259044-0_sp1.1 from training. Duration: 0.636375 2023-10-03 00:09:25,854 WARNING [train.py:1204] (2/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_726-270780-0_sp1.1 from training. Duration: 0.7818125 2023-10-03 00:09:27,696 WARNING [train.py:1204] (2/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_214-291369-0_sp1.1 from training. Duration: 0.57275 2023-10-03 00:09:27,753 WARNING [train.py:1204] (2/4) Exclude cut with ID _21881_210_3525_1_1531825242757_3798380_247-102591-0 from training. Duration: 0.98 2023-10-03 00:09:32,306 WARNING [train.py:1204] (2/4) Exclude cut with ID _8064_210_5399_1_1528372299291_4282973_18-174647-0_sp1.1 from training. Duration: 0.82725 2023-10-03 00:09:32,308 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_415-52611-0_sp1.1 from training. Duration: 0.936375 2023-10-03 00:09:33,604 WARNING [train.py:1204] (2/4) Exclude cut with ID _8394_210_6702_1_1528590504316_3540189_379-49457-0_sp0.9 from training. Duration: 0.9666875 2023-10-03 00:09:33,618 WARNING [train.py:1204] (2/4) Exclude cut with ID _64521_210_14699_1_1535243427259_3815328_368-195106-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 00:09:33,684 WARNING [train.py:1204] (2/4) Exclude cut with ID _28394_210_8913_1_1532430049518_3650118_51-334124-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 00:09:37,821 WARNING [train.py:1204] (2/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_317-161769-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 00:09:39,280 WARNING [train.py:1204] (2/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_40-129756-0_sp0.9 from training. Duration: 0.9 2023-10-03 00:09:41,902 WARNING [train.py:1204] (2/4) Exclude cut with ID _3067_210_2667_1_1526212974640_4503168_119-175776-0_sp0.9 from training. Duration: 0.82225 2023-10-03 00:09:41,953 WARNING [train.py:1204] (2/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_1004-304915-0_sp1.1 from training. Duration: 0.92725 2023-10-03 00:09:43,286 WARNING [train.py:1204] (2/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_359-118926-0_sp0.9 from training. Duration: 0.8 2023-10-03 00:09:50,419 WARNING [train.py:1204] (2/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_51-200220-0_sp0.9 from training. Duration: 0.62225 2023-10-03 00:09:52,251 WARNING [train.py:1204] (2/4) Exclude cut with ID _46683_210_9642_1_1533730913492_4591116_530-326202-0_sp1.1 from training. Duration: 0.936375 2023-10-03 00:09:52,307 WARNING [train.py:1204] (2/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_567-135114-0 from training. Duration: 0.87 2023-10-03 00:09:52,332 WARNING [train.py:1204] (2/4) Exclude cut with ID _66863_210_18053_1_1535698758334_8590319_1092-271784-0_sp1.1 from training. Duration: 0.87275 2023-10-03 00:09:53,674 INFO [train.py:1046] (2/4) Epoch 31, batch 900, loss[loss=0.1815, simple_loss=0.2557, pruned_loss=0.05365, over 22729.00 frames. ], tot_loss[loss=0.1645, simple_loss=0.2427, pruned_loss=0.04316, over 4672500.98 frames. ], batch size: 322, lr: 3.31e-03, grad_scale: 16.0 2023-10-03 00:09:53,729 WARNING [train.py:1204] (2/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_762-282614-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 00:09:55,474 WARNING [train.py:1204] (2/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_280-120942-0 from training. Duration: 0.77 2023-10-03 00:09:56,244 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.2.encoder.layers.1.conv_module1.whiten, num_groups=1, num_channels=384, metric=8.49 vs. limit=15.0 2023-10-03 00:09:58,252 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.2.encoder.layers.1.feed_forward2.out_whiten, num_groups=1, num_channels=384, metric=4.03 vs. limit=15.0 2023-10-03 00:10:03,413 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_31583_210_3852_1_1532595851_4024413_27-170931-0_sp1.1 from training. Duration: 0.963625 2023-10-03 00:10:06,039 WARNING [train.py:1204] (2/4) Exclude cut with ID _12369_210_4381_1_1530356078581_4388520_266-271628-0_sp1.1 from training. Duration: 0.92725 2023-10-03 00:10:07,349 WARNING [train.py:1204] (2/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_41-31157-0 from training. Duration: 0.57 2023-10-03 00:10:10,159 WARNING [train.py:1204] (2/4) Exclude cut with ID _21878_210_3525_1_1531738772002_7264330_113-18163-0_sp1.1 from training. Duration: 0.8090625 2023-10-03 00:10:10,193 WARNING [train.py:1204] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_147-111137-0 from training. Duration: 0.95 2023-10-03 00:10:11,642 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1083-220122-0_sp1.1 from training. Duration: 0.463625 2023-10-03 00:10:12,946 WARNING [train.py:1204] (2/4) Exclude cut with ID _14884_210_2252_1_1530707459689_5561250_591-173406-0_sp1.1 from training. Duration: 0.87275 2023-10-03 00:10:12,952 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_24677_210_1794_1_1532002693_4341296_149-303793-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 00:10:13,001 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_12868_210_6753_1_1530702479_3610564_133-564-0_sp1.1 from training. Duration: 0.7181875 2023-10-03 00:10:13,030 WARNING [train.py:1204] (2/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_625-338546-0_sp0.9 from training. Duration: 0.97775 2023-10-03 00:10:21,455 WARNING [train.py:1204] (2/4) Exclude cut with ID _25882_210_3036_1_1532775665815_6479510_624-320169-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 00:10:21,470 WARNING [train.py:1204] (2/4) Exclude cut with ID _20622_210_3852_1_1531622427080_1093839_127-325326-0_sp1.1 from training. Duration: 0.92725 2023-10-03 00:10:21,492 WARNING [train.py:1204] (2/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_464-33931-0_sp1.1 from training. Duration: 0.4909375 2023-10-03 00:10:26,320 WARNING [train.py:1204] (2/4) Exclude cut with ID _66112_210_10635_1_1535590236305_3801484_120-304588-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 00:10:30,961 WARNING [train.py:1204] (2/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_110-130961-0 from training. Duration: 0.47 2023-10-03 00:10:32,331 WARNING [train.py:1204] (2/4) Exclude cut with ID _16908_210_5724_1_1531119157248_3772700_64-54020-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 00:10:36,995 WARNING [train.py:1204] (2/4) Exclude cut with ID _4068_210_2629_1_1526697350395_1546259_224-288693-0_sp1.1 from training. Duration: 0.536375 2023-10-03 00:10:37,043 WARNING [train.py:1204] (2/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_191-41023-0_sp0.9 from training. Duration: 0.788875 2023-10-03 00:10:38,422 WARNING [train.py:1204] (2/4) Exclude cut with ID ID0205W0255-783002 from training. Duration: 0.958 2023-10-03 00:10:38,498 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_162-262717-0 from training. Duration: 0.83 2023-10-03 00:10:40,410 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.3.encoder.layers.1.self_attn_weights.whiten_keys, num_groups=8, num_channels=256, metric=4.21 vs. limit=6.0 2023-10-03 00:10:42,745 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_224-322394-0_sp1.1 from training. Duration: 0.6 2023-10-03 00:10:42,751 WARNING [train.py:1204] (2/4) Exclude cut with ID _4866_210_2577_1_1527404412284_4364659_133-1696-0_sp1.1 from training. Duration: 0.9 2023-10-03 00:10:44,025 WARNING [train.py:1204] (2/4) Exclude cut with ID _6275_210_2084_1_1528110062213_7669049_973-198085-0_sp1.1 from training. Duration: 0.6818125 2023-10-03 00:10:48,179 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.conv_module1.balancer1.prob, batch_count=1068626.6666666667, ans=0.125 2023-10-03 00:10:50,571 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.611e+02 1.883e+02 2.040e+02 2.401e+02 4.175e+02, threshold=4.079e+02, percent-clipped=0.0 2023-10-03 00:10:50,682 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_47449_210_5399_1_1534076445_4006929_410-237806-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 00:10:50,691 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7543_210_6753_1_1528543318_4552_1-85072-0_sp0.9 from training. Duration: 0.92225 2023-10-03 00:10:53,356 WARNING [train.py:1204] (2/4) Exclude cut with ID _65263_210_22356_1_1535763598691_3921037_108-21902-0 from training. Duration: 0.99 2023-10-03 00:10:53,361 WARNING [train.py:1204] (2/4) Exclude cut with ID _65585_210_12210_1_1535450451584_3764098_309-174818-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 00:10:56,829 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_309-65177-0 from training. Duration: 0.88 2023-10-03 00:10:58,838 WARNING [train.py:1204] (2/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_363-185703-0_sp1.1 from training. Duration: 0.863625 2023-10-03 00:10:58,871 WARNING [train.py:1204] (2/4) Exclude cut with ID _42326_210_7770_1_1533814282531_3707269_442-238692-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 00:11:00,829 WARNING [train.py:1204] (2/4) Exclude cut with ID _69884_210_9771_1_1535846392818_4166169_226-254446-0_sp1.1 from training. Duration: 0.936375 2023-10-03 00:11:00,852 WARNING [train.py:1204] (2/4) Exclude cut with ID _29853_210_7497_1_1532505600851_6642110_287-299903-0_sp1.1 from training. Duration: 0.963625 2023-10-03 00:11:06,915 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1015-343884-0 from training. Duration: 0.62 2023-10-03 00:11:06,978 WARNING [train.py:1204] (2/4) Exclude cut with ID ID0305W0008-833402 from training. Duration: 0.91 2023-10-03 00:11:08,233 INFO [train.py:1046] (2/4) Epoch 31, batch 950, loss[loss=0.1632, simple_loss=0.236, pruned_loss=0.04518, over 23202.00 frames. ], tot_loss[loss=0.1648, simple_loss=0.2428, pruned_loss=0.04342, over 4673710.26 frames. ], batch size: 105, lr: 3.31e-03, grad_scale: 16.0 2023-10-03 00:11:08,306 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6673_210_3273_1_1527767790_3173240_351-167954-0_sp1.1 from training. Duration: 0.436375 2023-10-03 00:11:08,318 WARNING [train.py:1204] (2/4) Exclude cut with ID _6456_210_1534_1_1527681634212_3181049_345-252877-0 from training. Duration: 0.64 2023-10-03 00:11:10,999 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_13-205182-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 00:11:13,814 WARNING [train.py:1204] (2/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_427-208901-0 from training. Duration: 0.68 2023-10-03 00:11:16,634 WARNING [train.py:1204] (2/4) Exclude cut with ID _49039_210_6315_1_1533967537541_3846118_9-148921-0_sp1.1 from training. Duration: 0.97275 2023-10-03 00:11:18,153 WARNING [train.py:1204] (2/4) Exclude cut with ID _59493_210_17282_1_1535078419077_3225879_143-70317-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 00:11:19,584 WARNING [train.py:1204] (2/4) Exclude cut with ID _27967_210_12140_1_1532588391859_8419130_1056-236369-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 00:11:19,633 WARNING [train.py:1204] (2/4) Exclude cut with ID _6537_210_1883_1_1527987559710_3597129_382-133062-0_sp0.9 from training. Duration: 0.7555625 2023-10-03 00:11:22,366 WARNING [train.py:1204] (2/4) Exclude cut with ID ID0376W0255-869957 from training. Duration: 0.9510625 2023-10-03 00:11:25,066 WARNING [train.py:1204] (2/4) Exclude cut with ID _49371_210_12071_1_1533952761021_3812190_483-240691-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 00:11:25,124 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_12868_210_6753_1_1530701932_530275_18-76322-0_sp1.1 from training. Duration: 0.7909375 2023-10-03 00:11:27,084 WARNING [train.py:1204] (2/4) Exclude cut with ID _53950_210_5816_1_1534417264919_3862951_367-26344-0_sp1.1 from training. Duration: 0.97275 2023-10-03 00:11:28,394 WARNING [train.py:1204] (2/4) Exclude cut with ID _5444_210_2151_1_1527418808464_5312210_634-194174-0_sp1.1 from training. Duration: 0.8818125 2023-10-03 00:11:28,420 WARNING [train.py:1204] (2/4) Exclude cut with ID _56494_210_4220_1_1535284837625_3033169_69-138150-0 from training. Duration: 0.94 2023-10-03 00:11:30,303 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7543_210_6753_1_1528544010_1110402_25-183860-0_sp0.9 from training. Duration: 0.52225 2023-10-03 00:11:30,409 WARNING [train.py:1204] (2/4) Exclude cut with ID _40284_210_2084_1_1533513679814_3704279_287-191918-0_sp1.1 from training. Duration: 0.963625 2023-10-03 00:11:32,548 WARNING [train.py:1204] (2/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_291-270881-0 from training. Duration: 0.97 2023-10-03 00:11:32,589 WARNING [train.py:1204] (2/4) Exclude cut with ID _12555_210_8750_1_1530075594402_3780239_449-267986-0_sp0.9 from training. Duration: 0.92225 2023-10-03 00:11:32,818 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.feed_forward3.hidden_balancer.prob, batch_count=1068826.6666666667, ans=0.125 2023-10-03 00:11:35,337 WARNING [train.py:1204] (2/4) Exclude cut with ID _3992_210_1747_1_1526644816031_3731060_268-64520-0_sp1.1 from training. Duration: 0.963625 2023-10-03 00:11:35,350 WARNING [train.py:1204] (2/4) Exclude cut with ID _39411_210_5986_1_1533538825030_3343649_173-238248-0_sp0.9 from training. Duration: 0.92225 2023-10-03 00:11:36,562 WARNING [train.py:1204] (2/4) Exclude cut with ID _32704_210_8954_1_1533017488840_6747370_828-313097-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 00:11:36,661 WARNING [train.py:1204] (2/4) Exclude cut with ID _26907_210_7234_1_1532221740694_3205263_337-2494-0 from training. Duration: 0.88 2023-10-03 00:11:39,306 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_229-45300-0_sp1.1 from training. Duration: 0.5181875 2023-10-03 00:11:40,684 WARNING [train.py:1204] (2/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_300-27235-0_sp1.1 from training. Duration: 0.7909375 2023-10-03 00:11:42,051 WARNING [train.py:1204] (2/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_48-338976-0_sp1.1 from training. Duration: 0.8090625 2023-10-03 00:11:47,662 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_13091_210_7925_1_1530609447_8987354_792-195353-0_sp1.1 from training. Duration: 0.936375 2023-10-03 00:11:47,668 WARNING [train.py:1204] (2/4) Exclude cut with ID _56524_210_9642_1_1534572011779_3619170_311-54500-0_sp1.1 from training. Duration: 0.97275 2023-10-03 00:11:50,487 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7543_210_6753_1_1528544010_1110402_25-183860-0 from training. Duration: 0.47 2023-10-03 00:11:54,474 WARNING [train.py:1204] (2/4) Exclude cut with ID 2411-132532-0017-82279-0_sp1.1 from training. Duration: 0.9681875 2023-10-03 00:11:54,475 WARNING [train.py:1204] (2/4) Exclude cut with ID _14170_210_5219_1_1530525949868_4469650_272-249985-0_sp0.9 from training. Duration: 0.7444375 2023-10-03 00:11:54,548 WARNING [train.py:1204] (2/4) Exclude cut with ID _55644_210_11856_1_1534593599582_4103719_320-229006-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 00:11:54,698 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.conv_module2.balancer1.prob, batch_count=1068960.0, ans=0.125 2023-10-03 00:11:55,316 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.0.layers.1.feed_forward3.out_whiten, num_groups=1, num_channels=192, metric=9.99 vs. limit=15.0 2023-10-03 00:11:56,451 WARNING [train.py:1204] (2/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_177-100554-0_sp1.1 from training. Duration: 0.963625 2023-10-03 00:11:56,453 WARNING [train.py:1204] (2/4) Exclude cut with ID _14998_210_5219_1_1530705582080_7474970_787-294416-0_sp1.1 from training. Duration: 0.7454375 2023-10-03 00:12:00,951 WARNING [train.py:1204] (2/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_318-59097-0 from training. Duration: 0.76 2023-10-03 00:12:01,026 WARNING [train.py:1204] (2/4) Exclude cut with ID _12369_210_4381_1_1530356078581_4388520_480-13146-0_sp1.1 from training. Duration: 0.77275 2023-10-03 00:12:03,013 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_469-338558-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 00:12:04,379 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_40896_210_15710_1_1533433956_7100802_367-126897-0_sp1.1 from training. Duration: 0.963625 2023-10-03 00:12:04,402 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6545_210_2040_1_1527774670_4245258_379-14182-0 from training. Duration: 0.8 2023-10-03 00:12:04,418 WARNING [train.py:1204] (2/4) Exclude cut with ID _21631_210_2253_1_1531907880778_3160339_197-79498-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 00:12:04,427 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_12868_210_6753_1_1530702479_3610564_416-293746-0_sp1.1 from training. Duration: 0.8181875 2023-10-03 00:12:04,647 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=1068960.0, ans=0.1 2023-10-03 00:12:05,827 WARNING [train.py:1204] (2/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_52-211683-0 from training. Duration: 0.9 2023-10-03 00:12:08,583 WARNING [train.py:1204] (2/4) Exclude cut with ID _48678_210_5663_1_1533988830947_3769199_306-255886-0_sp0.9 from training. Duration: 0.8555625 2023-10-03 00:12:11,406 WARNING [train.py:1204] (2/4) Exclude cut with ID _65134_210_14763_1_1535335393985_3505368_296-24733-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 00:12:15,510 WARNING [train.py:1204] (2/4) Exclude cut with ID _14898_210_9206_1_1530766812377_4177190_335-99364-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 00:12:16,860 WARNING [train.py:1204] (2/4) Exclude cut with ID _54871_210_22356_1_1534665798253_3856359_19-342632-0 from training. Duration: 0.91 2023-10-03 00:12:16,884 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_53071_210_16554_1_1534490405_2954066_265-170472-0 from training. Duration: 0.83 2023-10-03 00:12:17,092 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.nonlin_attention.balancer.prob, batch_count=1069026.6666666667, ans=0.125 2023-10-03 00:12:20,939 INFO [train.py:1046] (2/4) Epoch 31, batch 1000, loss[loss=0.1496, simple_loss=0.2246, pruned_loss=0.03727, over 23244.00 frames. ], tot_loss[loss=0.1639, simple_loss=0.2416, pruned_loss=0.04311, over 4678119.43 frames. ], batch size: 51, lr: 3.31e-03, grad_scale: 8.0 2023-10-03 00:12:20,984 WARNING [train.py:1204] (2/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_189-298172-0_sp1.1 from training. Duration: 0.963625 2023-10-03 00:12:24,298 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7615_210_4211_1_1528110965_4580160_72-235211-0 from training. Duration: 0.76 2023-10-03 00:12:24,344 WARNING [train.py:1204] (2/4) Exclude cut with ID _47790_210_16187_1_1533862814066_4207519_155-90874-0_sp1.1 from training. Duration: 0.936375 2023-10-03 00:12:28,982 WARNING [train.py:1204] (2/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_164-216310-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 00:12:30,389 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_46013_210_7234_1_1533776362_3744032_463-52378-0 from training. Duration: 0.86 2023-10-03 00:12:30,393 WARNING [train.py:1204] (2/4) Exclude cut with ID _61133_210_9969_1_1535196777227_3696409_333-270325-0 from training. Duration: 0.93 2023-10-03 00:12:36,533 WARNING [train.py:1204] (2/4) Exclude cut with ID _5172_210_1883_1_1527246062567_3577219_18-101380-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 00:12:36,548 WARNING [train.py:1204] (2/4) Exclude cut with ID _30800_210_22382_1_1532499347904_4515959_577-236858-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 00:12:37,977 WARNING [train.py:1204] (2/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_1006-177345-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 00:12:39,443 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6291_210_3273_1_1527683649_1078498_4-266348-0 from training. Duration: 0.78 2023-10-03 00:12:41,060 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.feed_forward1.out_proj.dropout_p, batch_count=1069160.0, ans=0.1 2023-10-03 00:12:42,171 WARNING [train.py:1204] (2/4) Exclude cut with ID _55857_210_2346_1_1534644007831_6898209_745-98391-0 from training. Duration: 0.76 2023-10-03 00:12:42,285 WARNING [train.py:1204] (2/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_375-244976-0 from training. Duration: 0.75 2023-10-03 00:12:43,681 WARNING [train.py:1204] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_4-248404-0_sp1.1 from training. Duration: 0.97275 2023-10-03 00:12:45,193 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8453_210_4211_1_1528502940_8562730_596-306820-0 from training. Duration: 0.97 2023-10-03 00:12:46,573 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_211-339622-0_sp0.9 from training. Duration: 0.5 2023-10-03 00:12:46,598 WARNING [train.py:1204] (2/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_393-57888-0 from training. Duration: 0.64 2023-10-03 00:12:48,003 WARNING [train.py:1204] (2/4) Exclude cut with ID _23825_210_13449_1_1531915313526_3497150_143-343575-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 00:12:48,060 WARNING [train.py:1204] (2/4) Exclude cut with ID _58605_210_12210_1_1534723195939_3904259_36-12256-0_sp1.1 from training. Duration: 0.936375 2023-10-03 00:12:54,269 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.feed_forward1.out_proj.dropout_p, batch_count=1069226.6666666667, ans=0.1 2023-10-03 00:12:58,107 WARNING [train.py:1204] (2/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_38-67795-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 00:12:59,570 WARNING [train.py:1204] (2/4) Exclude cut with ID _64631_210_27767_1_1535349580068_4165160_308-327832-0_sp1.1 from training. Duration: 0.92725 2023-10-03 00:13:01,635 WARNING [train.py:1204] (2/4) Exclude cut with ID _37086_210_9038_1_1532999540891_7021250_726-81176-0_sp1.1 from training. Duration: 0.936375 2023-10-03 00:13:01,700 WARNING [train.py:1204] (2/4) Exclude cut with ID _47790_210_16187_1_1533862814066_4207519_47-141521-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 00:13:02,630 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.0.layers.1.feed_forward1.out_whiten, num_groups=1, num_channels=192, metric=4.65 vs. limit=15.0 2023-10-03 00:13:02,946 WARNING [train.py:1204] (2/4) Exclude cut with ID _3067_210_2667_1_1526212974640_4503168_250-285937-0 from training. Duration: 0.8 2023-10-03 00:13:02,983 WARNING [train.py:1204] (2/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_239-24000-0_sp1.1 from training. Duration: 0.97275 2023-10-03 00:13:04,674 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_3371_210_1794_1_1526384462_5580777_396-339812-0_sp0.9 from training. Duration: 0.9444375 2023-10-03 00:13:04,732 WARNING [train.py:1204] (2/4) Exclude cut with ID _16909_210_5724_1_1531292079912_4118919_580-173454-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 00:13:06,102 WARNING [train.py:1204] (2/4) Exclude cut with ID IC0124W0002-61308 from training. Duration: 0.982 2023-10-03 00:13:07,693 WARNING [train.py:1204] (2/4) Exclude cut with ID _31602_210_5854_1_1532677021456_4458250_249-96604-0 from training. Duration: 0.89 2023-10-03 00:13:09,078 WARNING [train.py:1204] (2/4) Exclude cut with ID _69584_210_5219_1_1535785133775_4044450_325-7656-0 from training. Duration: 0.86 2023-10-03 00:13:11,785 WARNING [train.py:1204] (2/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_478-162247-0 from training. Duration: 0.82 2023-10-03 00:13:13,189 WARNING [train.py:1204] (2/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_686-54383-0_sp1.1 from training. Duration: 0.7909375 2023-10-03 00:13:18,928 WARNING [train.py:1204] (2/4) Exclude cut with ID _29303_210_2575_1_1532392237303_7410649_85-270092-0_sp1.1 from training. Duration: 0.936375 2023-10-03 00:13:19,161 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.balancer1.prob, batch_count=1069360.0, ans=0.125 2023-10-03 00:13:20,108 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.552e+02 1.823e+02 1.971e+02 2.145e+02 3.229e+02, threshold=3.942e+02, percent-clipped=0.0 2023-10-03 00:13:20,191 WARNING [train.py:1204] (2/4) Exclude cut with ID _2322_210_2667_1_1525604548971_3656299_440-80494-0_sp1.1 from training. Duration: 0.8 2023-10-03 00:13:20,229 WARNING [train.py:1204] (2/4) Exclude cut with ID _47346_210_9476_1_1533815971553_3536160_201-40644-0_sp1.1 from training. Duration: 0.936375 2023-10-03 00:13:21,596 WARNING [train.py:1204] (2/4) Exclude cut with ID _56235_210_10640_1_1534557335235_4161099_343-139898-0_sp1.1 from training. Duration: 0.87275 2023-10-03 00:13:24,384 WARNING [train.py:1204] (2/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_1012-279473-0 from training. Duration: 0.67 2023-10-03 00:13:26,294 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7931_210_1794_1_1528594728_5564059_280-279560-0_sp1.1 from training. Duration: 0.8909375 2023-10-03 00:13:26,354 WARNING [train.py:1204] (2/4) Exclude cut with ID _8263_210_1664_1_1528439452891_970220_68-181805-0 from training. Duration: 0.64 2023-10-03 00:13:28,276 WARNING [train.py:1204] (2/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_605-133395-0 from training. Duration: 0.66 2023-10-03 00:13:28,397 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_43-171109-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 00:13:28,401 WARNING [train.py:1204] (2/4) Exclude cut with ID _40094_210_16380_1_1533294005773_3641159_107-280739-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 00:13:31,195 WARNING [train.py:1204] (2/4) Exclude cut with ID _14821_210_8750_1_1530680503991_6829359_2-227151-0_sp0.9 from training. Duration: 0.92225 2023-10-03 00:13:32,630 WARNING [train.py:1204] (2/4) Exclude cut with ID _14268_210_9206_1_1530622771285_4714099_331-105351-0_sp1.1 from training. Duration: 0.5909375 2023-10-03 00:13:35,564 INFO [train.py:1046] (2/4) Epoch 31, batch 1050, loss[loss=0.165, simple_loss=0.2386, pruned_loss=0.04572, over 23714.00 frames. ], tot_loss[loss=0.1632, simple_loss=0.2409, pruned_loss=0.04274, over 4691114.96 frames. ], batch size: 232, lr: 3.31e-03, grad_scale: 8.0 2023-10-03 00:13:35,643 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_26326_210_7925_1_1533002493_3592406_451-82741-0_sp1.1 from training. Duration: 0.97275 2023-10-03 00:13:37,354 WARNING [train.py:1204] (2/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_191-59784-0_sp0.9 from training. Duration: 0.97775 2023-10-03 00:13:38,744 WARNING [train.py:1204] (2/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_445-117398-0_sp1.1 from training. Duration: 0.8090625 2023-10-03 00:13:41,426 WARNING [train.py:1204] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_605-257205-0_sp0.9 from training. Duration: 0.6555625 2023-10-03 00:13:42,913 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_50050_210_16223_1_1535435796_7290138_592-258841-0_sp1.1 from training. Duration: 0.936375 2023-10-03 00:13:44,317 WARNING [train.py:1204] (2/4) Exclude cut with ID _14348_210_8341_1_1530599434996_3778640_240-232370-0_sp0.9 from training. Duration: 0.9333125 2023-10-03 00:13:47,011 WARNING [train.py:1204] (2/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_239-316802-0_sp0.9 from training. Duration: 0.7666875 2023-10-03 00:13:48,465 WARNING [train.py:1204] (2/4) Exclude cut with ID _45560_210_21982_1_1533812603951_3584999_111-6376-0_sp1.1 from training. Duration: 0.763625 2023-10-03 00:13:49,935 WARNING [train.py:1204] (2/4) Exclude cut with ID _54189_210_16380_1_1534332670039_3911710_606-311481-0_sp1.1 from training. Duration: 0.963625 2023-10-03 00:13:50,120 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.conv_module1.balancer2.prob, batch_count=1069493.3333333333, ans=0.125 2023-10-03 00:13:51,305 WARNING [train.py:1204] (2/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_237-332027-0_sp1.1 from training. Duration: 0.863625 2023-10-03 00:13:52,585 WARNING [train.py:1204] (2/4) Exclude cut with ID _66070_210_7640_1_1535441393096_7805310_489-292631-0_sp0.9 from training. Duration: 0.87775 2023-10-03 00:13:52,663 WARNING [train.py:1204] (2/4) Exclude cut with ID _49728_210_12651_1_1534041034294_6788150_624-195301-0_sp1.1 from training. Duration: 0.77275 2023-10-03 00:13:52,889 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.feed_forward3.hidden_balancer.prob, batch_count=1069493.3333333333, ans=0.125 2023-10-03 00:13:53,953 WARNING [train.py:1204] (2/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_44-79427-0 from training. Duration: 0.98 2023-10-03 00:13:55,401 WARNING [train.py:1204] (2/4) Exclude cut with ID _35350_210_16040_1_1533040191183_4075299_490-198354-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 00:13:55,434 WARNING [train.py:1204] (2/4) Exclude cut with ID _5166_210_2084_1_1527073197160_4087993_309-222545-0 from training. Duration: 0.84 2023-10-03 00:13:57,468 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_13091_210_7925_1_1530609447_8987354_790-199469-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 00:13:57,474 WARNING [train.py:1204] (2/4) Exclude cut with ID _21632_210_2253_1_1531994001765_3744140_110-274654-0 from training. Duration: 0.82 2023-10-03 00:13:58,723 WARNING [train.py:1204] (2/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_498-40153-0_sp0.9 from training. Duration: 0.7 2023-10-03 00:14:04,705 WARNING [train.py:1204] (2/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_363-282909-0_sp1.1 from training. Duration: 0.936375 2023-10-03 00:14:04,759 WARNING [train.py:1204] (2/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_178-177582-0_sp0.9 from training. Duration: 0.82225 2023-10-03 00:14:04,776 WARNING [train.py:1204] (2/4) Exclude cut with ID _24612_210_8913_1_1531998054193_3806359_357-35487-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 00:14:07,045 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.attention_skip_rate, batch_count=1069560.0, ans=0.0 2023-10-03 00:14:07,594 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.2.encoder.layers.0.feed_forward1.out_whiten, num_groups=1, num_channels=384, metric=5.50 vs. limit=15.0 2023-10-03 00:14:08,066 WARNING [train.py:1204] (2/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_138-332773-0 from training. Duration: 0.81 2023-10-03 00:14:08,093 WARNING [train.py:1204] (2/4) Exclude cut with ID _7624_210_1531_1_1528109802198_4933101_418-349834-0 from training. Duration: 0.6 2023-10-03 00:14:08,122 WARNING [train.py:1204] (2/4) Exclude cut with ID _7537_210_2084_1_1528502395431_3924359_14-312906-0_sp0.9 from training. Duration: 0.9333125 2023-10-03 00:14:10,929 WARNING [train.py:1204] (2/4) Exclude cut with ID _60057_210_10640_1_1534903065793_3333150_362-230377-0 from training. Duration: 0.87 2023-10-03 00:14:11,627 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.2.encoder.layers.2.self_attn_weights.whiten_keys, num_groups=4, num_channels=128, metric=3.19 vs. limit=6.0 2023-10-03 00:14:13,684 WARNING [train.py:1204] (2/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_138-64210-0 from training. Duration: 0.73 2023-10-03 00:14:13,751 WARNING [train.py:1204] (2/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_454-339504-0_sp1.1 from training. Duration: 0.97275 2023-10-03 00:14:15,301 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.1.balancer2.prob, batch_count=1069560.0, ans=0.125 2023-10-03 00:14:17,849 WARNING [train.py:1204] (2/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_508-335369-0_sp0.9 from training. Duration: 0.5333125 2023-10-03 00:14:18,075 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.conv_skip_rate, batch_count=1069626.6666666667, ans=0.0 2023-10-03 00:14:19,191 WARNING [train.py:1204] (2/4) Exclude cut with ID _5608_210_2106_1_1527415439089_6819768_909-82767-0_sp0.9 from training. Duration: 0.67775 2023-10-03 00:14:20,517 WARNING [train.py:1204] (2/4) Exclude cut with ID _6694_210_4599_1_1527852637490_3965529_99-11611-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 00:14:20,570 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_2655_210_2087_1_1525688959_3897400_52-279899-0_sp1.1 from training. Duration: 0.62725 2023-10-03 00:14:26,131 WARNING [train.py:1204] (2/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_375-245677-0_sp1.1 from training. Duration: 0.736375 2023-10-03 00:14:30,108 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_23850_210_4238_1_1532232597_1632433_32-185135-0 from training. Duration: 0.86 2023-10-03 00:14:30,233 WARNING [train.py:1204] (2/4) Exclude cut with ID _15113_210_8254_1_1530842438579_3716279_209-54533-0 from training. Duration: 0.96 2023-10-03 00:14:31,572 WARNING [train.py:1204] (2/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_401-53423-0 from training. Duration: 0.55 2023-10-03 00:14:31,605 WARNING [train.py:1204] (2/4) Exclude cut with ID _14867_210_1968_1_1530684179890_3846969_319-250868-0_sp1.1 from training. Duration: 0.9 2023-10-03 00:14:31,618 WARNING [train.py:1204] (2/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_106-30782-0_sp0.9 from training. Duration: 0.888875 2023-10-03 00:14:31,819 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=1069626.6666666667, ans=0.1 2023-10-03 00:14:34,282 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_5960_210_1794_1_1527403865_3990999_515-2613-0 from training. Duration: 0.88 2023-10-03 00:14:37,625 WARNING [train.py:1204] (2/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_798-106179-0_sp0.9 from training. Duration: 0.92225 2023-10-03 00:14:37,841 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.feed_forward2.hidden_balancer.prob, batch_count=1069693.3333333333, ans=0.125 2023-10-03 00:14:40,901 WARNING [train.py:1204] (2/4) Exclude cut with ID _14227_210_4381_1_1530701804286_3940259_125-275642-0_sp1.1 from training. Duration: 0.9 2023-10-03 00:14:40,903 WARNING [train.py:1204] (2/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_79-200392-0_sp1.1 from training. Duration: 0.8818125 2023-10-03 00:14:40,941 WARNING [train.py:1204] (2/4) Exclude cut with ID _48836_210_12210_1_1534075848293_3904150_429-35475-0_sp0.9 from training. Duration: 0.988875 2023-10-03 00:14:42,162 WARNING [train.py:1204] (2/4) Exclude cut with ID _69884_210_9771_1_1535846392818_4166169_224-172517-0_sp1.1 from training. Duration: 0.97275 2023-10-03 00:14:44,998 WARNING [train.py:1204] (2/4) Exclude cut with ID _15113_210_8254_1_1530842438579_3716279_76-232507-0_sp1.1 from training. Duration: 0.97275 2023-10-03 00:14:45,002 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_135-5501-0 from training. Duration: 0.98 2023-10-03 00:14:46,516 WARNING [train.py:1204] (2/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_563-347832-0_sp0.9 from training. Duration: 0.988875 2023-10-03 00:14:46,527 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6786_210_1388_1_1527839699_4194495_273-149841-0 from training. Duration: 0.92 2023-10-03 00:14:46,549 WARNING [train.py:1204] (2/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_172-215932-0 from training. Duration: 0.66 2023-10-03 00:14:47,853 WARNING [train.py:1204] (2/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_939-132910-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 00:14:49,226 INFO [train.py:1046] (2/4) Epoch 31, batch 1100, loss[loss=0.1671, simple_loss=0.2394, pruned_loss=0.04741, over 23689.00 frames. ], tot_loss[loss=0.1625, simple_loss=0.2404, pruned_loss=0.04224, over 4697486.24 frames. ], batch size: 149, lr: 3.31e-03, grad_scale: 8.0 2023-10-03 00:14:50,547 WARNING [train.py:1204] (2/4) Exclude cut with ID _49371_210_12071_1_1533952761021_3812190_16-246001-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 00:14:56,056 WARNING [train.py:1204] (2/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_680-189165-0_sp1.1 from training. Duration: 0.936375 2023-10-03 00:14:59,613 WARNING [train.py:1204] (2/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_53-300544-0_sp1.1 from training. Duration: 0.6090625 2023-10-03 00:15:00,989 WARNING [train.py:1204] (2/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_540-275685-0_sp1.1 from training. Duration: 0.8090625 2023-10-03 00:15:01,019 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_38965_210_4852_1_1533174906_3484735_453-106579-0_sp1.1 from training. Duration: 0.92725 2023-10-03 00:15:01,058 WARNING [train.py:1204] (2/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_105-328174-0 from training. Duration: 0.76 2023-10-03 00:15:02,486 WARNING [train.py:1204] (2/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_279-126205-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 00:15:04,347 WARNING [train.py:1204] (2/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_5-55893-0_sp0.9 from training. Duration: 0.611125 2023-10-03 00:15:04,485 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.1.feed_forward1.out_proj.dropout_p, batch_count=1069826.6666666667, ans=0.1 2023-10-03 00:15:06,902 WARNING [train.py:1204] (2/4) Exclude cut with ID _13678_210_7497_1_1530530069226_3463170_141-10835-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 00:15:10,228 WARNING [train.py:1204] (2/4) Exclude cut with ID _22083_210_8254_1_1531702862439_3678970_201-85738-0_sp1.1 from training. Duration: 0.8181875 2023-10-03 00:15:10,247 WARNING [train.py:1204] (2/4) Exclude cut with ID _4697_210_2069_1_1526815801953_3823098_51-1049-0 from training. Duration: 0.75 2023-10-03 00:15:11,629 WARNING [train.py:1204] (2/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_206-10097-0_sp0.9 from training. Duration: 0.5444375 2023-10-03 00:15:12,990 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_4528_210_3273_1_1527076654_3276777_360-63661-0_sp1.1 from training. Duration: 0.92725 2023-10-03 00:15:12,998 WARNING [train.py:1204] (2/4) Exclude cut with ID _64086_210_7994_1_1535270417019_7435949_449-31567-0_sp1.1 from training. Duration: 0.8818125 2023-10-03 00:15:15,688 WARNING [train.py:1204] (2/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_245-174553-0_sp0.9 from training. Duration: 0.97775 2023-10-03 00:15:17,300 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6545_210_2040_1_1527774670_4245258_379-14182-0_sp1.1 from training. Duration: 0.72725 2023-10-03 00:15:21,383 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_3133_210_2087_1_1526106446_2324690_256-206070-0_sp1.1 from training. Duration: 0.963625 2023-10-03 00:15:21,597 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.conv_module1.balancer1.prob, batch_count=1069893.3333333333, ans=0.125 2023-10-03 00:15:24,094 WARNING [train.py:1204] (2/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_278-104676-0 from training. Duration: 0.98 2023-10-03 00:15:26,062 WARNING [train.py:1204] (2/4) Exclude cut with ID IC0671W0065-331908 from training. Duration: 0.896 2023-10-03 00:15:26,103 WARNING [train.py:1204] (2/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_244-129917-0_sp1.1 from training. Duration: 0.97275 2023-10-03 00:15:28,677 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.3.encoder.layers.2.self_attn_weights.whiten_keys, num_groups=8, num_channels=256, metric=3.91 vs. limit=6.0 2023-10-03 00:15:29,419 WARNING [train.py:1204] (2/4) Exclude cut with ID _7852_210_5219_1_1528286471316_8139400_1129-170857-0_sp1.1 from training. Duration: 0.97275 2023-10-03 00:15:29,497 WARNING [train.py:1204] (2/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_443-30034-0_sp0.9 from training. Duration: 0.711125 2023-10-03 00:15:30,820 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_31583_210_3852_1_1532595851_4024413_250-332999-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 00:15:31,078 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.conv_module1.balancer1.prob, batch_count=1069893.3333333333, ans=0.125 2023-10-03 00:15:32,268 WARNING [train.py:1204] (2/4) Exclude cut with ID _8772_210_1850_1_1528635595972_3664011_247-336448-0 from training. Duration: 0.74 2023-10-03 00:15:32,315 WARNING [train.py:1204] (2/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_567-135114-0_sp0.9 from training. Duration: 0.9666875 2023-10-03 00:15:32,319 WARNING [train.py:1204] (2/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_557-349534-0_sp1.1 from training. Duration: 0.82725 2023-10-03 00:15:32,338 WARNING [train.py:1204] (2/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_1341-26728-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 00:15:33,681 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_40222_210_6228_1_1533385298_4481939_375-328051-0_sp1.1 from training. Duration: 0.97275 2023-10-03 00:15:33,703 WARNING [train.py:1204] (2/4) Exclude cut with ID _4697_210_2069_1_1526815801953_3823098_276-96593-0 from training. Duration: 0.77 2023-10-03 00:15:41,428 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_53071_210_16554_1_1534490405_2954066_265-170472-0_sp0.9 from training. Duration: 0.92225 2023-10-03 00:15:41,451 WARNING [train.py:1204] (2/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_365-1492-0 from training. Duration: 0.91 2023-10-03 00:15:44,164 WARNING [train.py:1204] (2/4) Exclude cut with ID _66873_210_5517_1_1535454136506_4293370_654-52713-0_sp0.9 from training. Duration: 0.8555625 2023-10-03 00:15:48,044 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.612e+02 1.833e+02 1.997e+02 2.295e+02 4.959e+02, threshold=3.994e+02, percent-clipped=1.0 2023-10-03 00:15:48,172 WARNING [train.py:1204] (2/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_113-92608-0_sp1.1 from training. Duration: 0.4909375 2023-10-03 00:15:52,199 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_46013_210_7234_1_1533776362_3744032_50-141535-0 from training. Duration: 0.99 2023-10-03 00:15:52,208 WARNING [train.py:1204] (2/4) Exclude cut with ID _13942_210_9988_1_1530604906036_3733360_499-55676-0_sp1.1 from training. Duration: 0.563625 2023-10-03 00:15:53,613 WARNING [train.py:1204] (2/4) Exclude cut with ID _30800_210_22382_1_1532499347904_4515959_375-65143-0_sp1.1 from training. Duration: 0.97275 2023-10-03 00:15:55,059 WARNING [train.py:1204] (2/4) Exclude cut with ID _16756_210_4915_1_1531047803763_7104259_199-325608-0_sp1.1 from training. Duration: 0.92725 2023-10-03 00:15:55,088 WARNING [train.py:1204] (2/4) Exclude cut with ID _31649_210_6228_1_1532584608063_4091078_443-124391-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 00:15:56,404 WARNING [train.py:1204] (2/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_146-218763-0 from training. Duration: 0.86 2023-10-03 00:15:56,645 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.conv_module1.balancer1.min_positive, batch_count=1070026.6666666667, ans=0.025 2023-10-03 00:15:57,732 WARNING [train.py:1204] (2/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_567-135114-0_sp1.1 from training. Duration: 0.7909375 2023-10-03 00:15:57,770 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_26326_210_7925_1_1533002493_3592406_462-288299-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 00:15:59,752 WARNING [train.py:1204] (2/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_322-217220-0 from training. Duration: 0.86 2023-10-03 00:15:59,776 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_238-256668-0_sp1.1 from training. Duration: 0.62725 2023-10-03 00:15:59,817 WARNING [train.py:1204] (2/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_371-18169-0 from training. Duration: 0.86 2023-10-03 00:16:01,268 WARNING [train.py:1204] (2/4) Exclude cut with ID _58480_210_12392_1_1534811121111_3646160_23-74512-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 00:16:01,273 WARNING [train.py:1204] (2/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_784-213099-0_sp1.1 from training. Duration: 0.8454375 2023-10-03 00:16:01,365 WARNING [train.py:1204] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_625-102955-0_sp1.1 from training. Duration: 0.7 2023-10-03 00:16:02,548 INFO [train.py:1046] (2/4) Epoch 31, batch 1150, loss[loss=0.179, simple_loss=0.2548, pruned_loss=0.05163, over 23792.00 frames. ], tot_loss[loss=0.1634, simple_loss=0.2416, pruned_loss=0.0426, over 4702127.58 frames. ], batch size: 164, lr: 3.31e-03, grad_scale: 8.0 2023-10-03 00:16:06,004 WARNING [train.py:1204] (2/4) Exclude cut with ID _49748_210_24116_1_1533986971737_3158910_176-174327-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 00:16:08,743 WARNING [train.py:1204] (2/4) Exclude cut with ID _50310_210_15488_1_1534068043345_3641080_432-96140-0_sp1.1 from training. Duration: 0.936375 2023-10-03 00:16:08,954 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.conv_skip_rate, batch_count=1070093.3333333333, ans=0.0 2023-10-03 00:16:10,620 WARNING [train.py:1204] (2/4) Exclude cut with ID _40402_210_18219_1_1533522324115_2953150_76-14910-0_sp1.1 from training. Duration: 0.92725 2023-10-03 00:16:10,652 WARNING [train.py:1204] (2/4) Exclude cut with ID _21783_210_11357_1_1531742458730_3762679_40-19458-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 00:16:10,676 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_14056_210_4163_1_1530604483_7572827_46-93547-0 from training. Duration: 0.95 2023-10-03 00:16:12,019 WARNING [train.py:1204] (2/4) Exclude cut with ID _21631_210_2253_1_1531907880778_3160339_321-35325-0_sp1.1 from training. Duration: 0.963625 2023-10-03 00:16:13,611 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.0.ff3_skip_rate, batch_count=1070093.3333333333, ans=0.0 2023-10-03 00:16:14,825 WARNING [train.py:1204] (2/4) Exclude cut with ID _4779_210_2084_1_1527318022944_3914893_500-285013-0 from training. Duration: 0.69 2023-10-03 00:16:16,233 WARNING [train.py:1204] (2/4) Exclude cut with ID _43552_210_8913_1_1533726031059_3523429_301-93324-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 00:16:17,522 WARNING [train.py:1204] (2/4) Exclude cut with ID _6473_210_1741_1_1527901307396_3815380_108-17335-0_sp1.1 from training. Duration: 0.5909375 2023-10-03 00:16:23,137 WARNING [train.py:1204] (2/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_206-10097-0 from training. Duration: 0.49 2023-10-03 00:16:25,823 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_36756_210_7234_1_1533026991_966276_103-77513-0_sp1.1 from training. Duration: 0.97275 2023-10-03 00:16:27,293 WARNING [train.py:1204] (2/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_662-18193-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 00:16:29,156 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_26433_210_12108_1_1532157158_4056480_439-52438-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 00:16:30,462 WARNING [train.py:1204] (2/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_464-33931-0 from training. Duration: 0.54 2023-10-03 00:16:30,468 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_3474_210_2028_1_1526188513_4825246_145-309131-0_sp0.9 from training. Duration: 0.82225 2023-10-03 00:16:30,480 WARNING [train.py:1204] (2/4) Exclude cut with ID _9679_210_7497_1_1529129212906_4363929_446-308321-0_sp1.1 from training. Duration: 0.963625 2023-10-03 00:16:32,698 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.conv_skip_rate, batch_count=1070226.6666666667, ans=0.0 2023-10-03 00:16:33,921 WARNING [train.py:1204] (2/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_662-275621-0 from training. Duration: 0.42 2023-10-03 00:16:35,234 WARNING [train.py:1204] (2/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_344-337148-0_sp1.1 from training. Duration: 0.97275 2023-10-03 00:16:36,714 WARNING [train.py:1204] (2/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_759-179071-0_sp1.1 from training. Duration: 0.92725 2023-10-03 00:16:46,070 WARNING [train.py:1204] (2/4) Exclude cut with ID _28783_210_7943_1_1532671164808_7155479_293-12188-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 00:16:51,618 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_17613_210_2151_1_1531137569_2366160_235-320304-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 00:16:52,841 WARNING [train.py:1204] (2/4) Exclude cut with ID _14349_210_8341_1_1530615710818_3442470_170-336541-0 from training. Duration: 0.86 2023-10-03 00:16:52,915 WARNING [train.py:1204] (2/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_375-218677-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 00:16:54,225 WARNING [train.py:1204] (2/4) Exclude cut with ID _28317_210_12742_1_1532480302381_8510325_829-202956-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 00:17:00,276 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9029W0436-509823 from training. Duration: 0.768 2023-10-03 00:17:02,317 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.0.conv_skip_rate, batch_count=1070360.0, ans=0.0 2023-10-03 00:17:03,516 WARNING [train.py:1204] (2/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_231-112214-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 00:17:06,521 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.conv_module2.balancer2.prob, batch_count=1070360.0, ans=0.125 2023-10-03 00:17:10,754 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9059W0470-524883 from training. Duration: 0.3415625 2023-10-03 00:17:13,516 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_43266_210_1794_1_1533521789_4497218_206-79512-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 00:17:14,959 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_294-163793-0_sp0.9 from training. Duration: 0.911125 2023-10-03 00:17:14,985 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6290_210_3273_1_1527595224_1282787_115-87095-0_sp1.1 from training. Duration: 0.5 2023-10-03 00:17:16,186 INFO [train.py:1046] (2/4) Epoch 31, batch 1200, loss[loss=0.1606, simple_loss=0.2444, pruned_loss=0.03836, over 23632.00 frames. ], tot_loss[loss=0.1634, simple_loss=0.2419, pruned_loss=0.04246, over 4722610.25 frames. ], batch size: 94, lr: 3.31e-03, grad_scale: 16.0 2023-10-03 00:17:16,245 WARNING [train.py:1204] (2/4) Exclude cut with ID _14100_210_8341_1_1530529320613_3678069_279-55760-0_sp1.1 from training. Duration: 0.8454375 2023-10-03 00:17:18,108 WARNING [train.py:1204] (2/4) Exclude cut with ID _14621_210_1925_1_1530686901185_3675420_419-234200-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 00:17:23,683 WARNING [train.py:1204] (2/4) Exclude cut with ID _60229_210_27767_1_1534838406158_3655350_353-213656-0_sp1.1 from training. Duration: 0.863625 2023-10-03 00:17:23,692 WARNING [train.py:1204] (2/4) Exclude cut with ID _48678_210_5663_1_1533988830947_3769199_306-255886-0_sp1.1 from training. Duration: 0.7 2023-10-03 00:17:26,395 WARNING [train.py:1204] (2/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_466-3492-0_sp1.1 from training. Duration: 0.97275 2023-10-03 00:17:26,396 WARNING [train.py:1204] (2/4) Exclude cut with ID _8755_210_1876_1_1528628559357_3258209_199-242167-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 00:17:26,463 WARNING [train.py:1204] (2/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_237-89684-0_sp1.1 from training. Duration: 0.87275 2023-10-03 00:17:27,863 WARNING [train.py:1204] (2/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_70-84121-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 00:17:30,581 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7544_210_6753_1_1528587722_3576686_172-171750-0_sp1.1 from training. Duration: 0.5909375 2023-10-03 00:17:32,118 WARNING [train.py:1204] (2/4) Exclude cut with ID _24271_210_12658_1_1531987270022_7190519_40-321822-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 00:17:32,136 WARNING [train.py:1204] (2/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_227-83612-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 00:17:34,635 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9156W0015-572508 from training. Duration: 0.896 2023-10-03 00:17:36,091 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_292-265928-0 from training. Duration: 0.51 2023-10-03 00:17:38,842 WARNING [train.py:1204] (2/4) Exclude cut with ID _14747_210_8341_1_1530685797463_3654089_147-131229-0_sp1.1 from training. Duration: 0.6909375 2023-10-03 00:17:42,108 WARNING [train.py:1204] (2/4) Exclude cut with ID _45297_210_2721_1_1533715351381_3264949_351-147136-0_sp0.9 from training. Duration: 0.9666875 2023-10-03 00:17:44,862 WARNING [train.py:1204] (2/4) Exclude cut with ID _5250_210_5219_1_1527249561806_8692110_1102-324568-0_sp1.1 from training. Duration: 0.97275 2023-10-03 00:17:44,997 WARNING [train.py:1204] (2/4) Exclude cut with ID _18516_210_9206_1_1531704653206_3692406_465-75446-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 00:17:44,999 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9203W0126-595623 from training. Duration: 0.896 2023-10-03 00:17:47,515 WARNING [train.py:1204] (2/4) Exclude cut with ID _55383_210_10635_1_1534468426172_2231669_139-328410-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 00:17:53,735 WARNING [train.py:1204] (2/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_166-246557-0_sp0.9 from training. Duration: 0.711125 2023-10-03 00:17:53,739 WARNING [train.py:1204] (2/4) Exclude cut with ID _30039_210_13270_1_1532484612900_2725150_239-304872-0_sp1.1 from training. Duration: 0.963625 2023-10-03 00:17:53,776 WARNING [train.py:1204] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_6-226750-0 from training. Duration: 0.94 2023-10-03 00:17:55,126 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_135-5501-0_sp1.1 from training. Duration: 0.8909375 2023-10-03 00:17:57,972 WARNING [train.py:1204] (2/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_359-118926-0 from training. Duration: 0.72 2023-10-03 00:18:03,992 WARNING [train.py:1204] (2/4) Exclude cut with ID _8064_210_5399_1_1528372299291_4282973_4-91037-0 from training. Duration: 0.88 2023-10-03 00:18:04,016 WARNING [train.py:1204] (2/4) Exclude cut with ID _6653_210_2629_1_1527919172441_6732679_415-288492-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 00:18:05,937 WARNING [train.py:1204] (2/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_544-287179-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 00:18:07,377 WARNING [train.py:1204] (2/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_122-32606-0_sp1.1 from training. Duration: 0.92725 2023-10-03 00:18:07,425 WARNING [train.py:1204] (2/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_121-284152-0_sp1.1 from training. Duration: 0.536375 2023-10-03 00:18:08,811 WARNING [train.py:1204] (2/4) Exclude cut with ID _44718_210_5816_1_1534050051869_6469030_350-299477-0_sp1.1 from training. Duration: 0.97275 2023-10-03 00:18:08,831 WARNING [train.py:1204] (2/4) Exclude cut with ID _6653_210_2629_1_1527919172441_6732679_1103-56445-0_sp0.9 from training. Duration: 0.811125 2023-10-03 00:18:08,881 WARNING [train.py:1204] (2/4) Exclude cut with ID _12369_210_4381_1_1530356078581_4388520_64-298917-0_sp0.9 from training. Duration: 0.97775 2023-10-03 00:18:08,931 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_2245_210_2715_1_1525511548_3540254_310-52483-0 from training. Duration: 0.88 2023-10-03 00:18:10,394 WARNING [train.py:1204] (2/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_401-158239-0_sp1.1 from training. Duration: 0.6454375 2023-10-03 00:18:10,437 WARNING [train.py:1204] (2/4) Exclude cut with ID _59502_210_12210_1_1535107024698_1835139_146-162495-0_sp1.1 from training. Duration: 0.836375 2023-10-03 00:18:10,446 WARNING [train.py:1204] (2/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_41-31157-0_sp0.9 from training. Duration: 0.6333125 2023-10-03 00:18:13,125 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_68717_210_3528_1_1535628683_1556759_95-36722-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 00:18:13,132 WARNING [train.py:1204] (2/4) Exclude cut with ID _30196_210_1664_1_1533708496522_7074140_654-162611-0_sp1.1 from training. Duration: 0.92725 2023-10-03 00:18:16,339 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_9302_210_2550_1_1528889962_4655416_638-301620-0_sp0.9 from training. Duration: 0.52225 2023-10-03 00:18:17,598 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.596e+02 1.961e+02 2.154e+02 2.388e+02 3.166e+02, threshold=4.308e+02, percent-clipped=0.0 2023-10-03 00:18:17,770 WARNING [train.py:1204] (2/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_478-162247-0_sp1.1 from training. Duration: 0.7454375 2023-10-03 00:18:21,095 WARNING [train.py:1204] (2/4) Exclude cut with ID _45297_210_2721_1_1533715351381_3264949_353-9541-0 from training. Duration: 0.97 2023-10-03 00:18:24,008 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9653W0190-675468 from training. Duration: 0.9935 2023-10-03 00:18:24,126 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_49730_210_13049_1_1534075294_1830676_214-84436-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 00:18:26,810 WARNING [train.py:1204] (2/4) Exclude cut with ID _50093_210_4220_1_1534417141207_3432299_259-322252-0_sp1.1 from training. Duration: 0.836375 2023-10-03 00:18:28,573 WARNING [train.py:1204] (2/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_599-50220-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 00:18:29,921 WARNING [train.py:1204] (2/4) Exclude cut with ID _21878_210_3525_1_1531738772002_7264330_174-109016-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 00:18:31,672 INFO [train.py:1046] (2/4) Epoch 31, batch 1250, loss[loss=0.2298, simple_loss=0.2891, pruned_loss=0.08527, over 19540.00 frames. ], tot_loss[loss=0.165, simple_loss=0.2434, pruned_loss=0.04329, over 4706795.39 frames. ], batch size: 388, lr: 3.31e-03, grad_scale: 8.0 2023-10-03 00:18:31,820 WARNING [train.py:1204] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_665-196155-0 from training. Duration: 0.67 2023-10-03 00:18:35,925 WARNING [train.py:1204] (2/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_656-188361-0_sp1.1 from training. Duration: 0.7909375 2023-10-03 00:18:37,271 WARNING [train.py:1204] (2/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_60-18123-0_sp1.1 from training. Duration: 0.8 2023-10-03 00:18:37,332 WARNING [train.py:1204] (2/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_535-14410-0 from training. Duration: 0.47 2023-10-03 00:18:38,765 WARNING [train.py:1204] (2/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_94-126128-0_sp1.1 from training. Duration: 0.7818125 2023-10-03 00:18:38,851 WARNING [train.py:1204] (2/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_356-31216-0_sp1.1 from training. Duration: 0.6818125 2023-10-03 00:18:40,408 WARNING [train.py:1204] (2/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_443-30034-0_sp1.1 from training. Duration: 0.5818125 2023-10-03 00:18:42,394 WARNING [train.py:1204] (2/4) Exclude cut with ID _26907_210_7234_1_1532221740694_3205263_337-2494-0_sp1.1 from training. Duration: 0.8 2023-10-03 00:18:43,801 WARNING [train.py:1204] (2/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_49-97158-0_sp0.9 from training. Duration: 0.8444375 2023-10-03 00:18:43,812 WARNING [train.py:1204] (2/4) Exclude cut with ID _7877_210_6014_1_1528286358099_3750136_553-170128-0_sp1.1 from training. Duration: 0.963625 2023-10-03 00:18:46,391 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.3.encoder.layers.1.whiten, num_groups=1, num_channels=512, metric=7.96 vs. limit=12.0 2023-10-03 00:18:46,992 WARNING [train.py:1204] (2/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_202-293523-0_sp0.9 from training. Duration: 0.8 2023-10-03 00:18:49,669 WARNING [train.py:1204] (2/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_188-171673-0_sp0.9 from training. Duration: 0.6444375 2023-10-03 00:18:49,692 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_120-41668-0_sp1.1 from training. Duration: 0.636375 2023-10-03 00:18:49,697 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_40223_210_6228_1_1533298404_4812267_613-78769-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 00:18:49,800 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_53861_210_5399_1_1534422156_4272077_150-336899-0_sp1.1 from training. Duration: 0.963625 2023-10-03 00:18:51,155 WARNING [train.py:1204] (2/4) Exclude cut with ID _14459_210_2228_1_1531443641946_7516700_1146-56843-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 00:18:52,718 WARNING [train.py:1204] (2/4) Exclude cut with ID _39867_210_9826_1_1533553222030_9344259_1194-299928-0_sp1.1 from training. Duration: 0.92725 2023-10-03 00:18:52,789 WARNING [train.py:1204] (2/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_214-291369-0_sp0.9 from training. Duration: 0.7 2023-10-03 00:18:57,885 WARNING [train.py:1204] (2/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_358-215558-0 from training. Duration: 0.93 2023-10-03 00:18:57,955 WARNING [train.py:1204] (2/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_577-287150-0_sp0.9 from training. Duration: 0.911125 2023-10-03 00:19:01,931 WARNING [train.py:1204] (2/4) Exclude cut with ID _60860_210_8254_1_1534896015875_3557169_229-187367-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 00:19:03,244 WARNING [train.py:1204] (2/4) Exclude cut with ID _6275_210_2084_1_1528110062213_7669049_857-298942-0 from training. Duration: 0.85 2023-10-03 00:19:03,299 WARNING [train.py:1204] (2/4) Exclude cut with ID _40137_210_12210_1_1533290484046_3795180_249-187059-0_sp1.1 from training. Duration: 0.8 2023-10-03 00:19:03,314 WARNING [train.py:1204] (2/4) Exclude cut with ID ID0171W0508-765603 from training. Duration: 0.541 2023-10-03 00:19:03,340 WARNING [train.py:1204] (2/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_521-219439-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 00:19:04,665 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_40223_210_6228_1_1533298404_4812267_741-302616-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 00:19:08,040 WARNING [train.py:1204] (2/4) Exclude cut with ID _65293_210_9070_1_1535545549765_4004210_438-154735-0_sp1.1 from training. Duration: 0.92725 2023-10-03 00:19:10,867 WARNING [train.py:1204] (2/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_564-132950-0_sp1.1 from training. Duration: 0.92725 2023-10-03 00:19:10,913 WARNING [train.py:1204] (2/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_291-270881-0_sp1.1 from training. Duration: 0.8818125 2023-10-03 00:19:12,228 WARNING [train.py:1204] (2/4) Exclude cut with ID _60229_210_27767_1_1534838406158_3655350_353-213656-0 from training. Duration: 0.95 2023-10-03 00:19:12,240 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_3133_210_2087_1_1526106446_2324690_243-143119-0 from training. Duration: 0.77 2023-10-03 00:19:12,280 WARNING [train.py:1204] (2/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_690-126306-0 from training. Duration: 0.78 2023-10-03 00:19:14,481 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=1070893.3333333333, ans=0.1 2023-10-03 00:19:15,689 WARNING [train.py:1204] (2/4) Exclude cut with ID _30304_210_6319_1_1532476827261_7582060_347-95003-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 00:19:17,111 WARNING [train.py:1204] (2/4) Exclude cut with ID _2322_210_2667_1_1525604548971_3656299_440-80494-0 from training. Duration: 0.88 2023-10-03 00:19:17,126 WARNING [train.py:1204] (2/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_370-229434-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 00:19:19,920 WARNING [train.py:1204] (2/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_124-345840-0_sp1.1 from training. Duration: 0.47275 2023-10-03 00:19:19,928 WARNING [train.py:1204] (2/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_165-123581-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 00:19:22,697 WARNING [train.py:1204] (2/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_453-282588-0 from training. Duration: 0.98 2023-10-03 00:19:22,714 WARNING [train.py:1204] (2/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_299-34413-0_sp0.9 from training. Duration: 0.72225 2023-10-03 00:19:22,759 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_40036_210_13405_1_1533648647_1315388_48-146016-0_sp1.1 from training. Duration: 0.8454375 2023-10-03 00:19:24,050 WARNING [train.py:1204] (2/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_99-167392-0_sp0.9 from training. Duration: 0.67775 2023-10-03 00:19:24,109 WARNING [train.py:1204] (2/4) Exclude cut with ID _48677_210_5663_1_1533902405919_3892989_37-207114-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 00:19:27,373 WARNING [train.py:1204] (2/4) Exclude cut with ID _29857_210_20034_1_1532395886753_3798160_335-163562-0 from training. Duration: 0.85 2023-10-03 00:19:28,811 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_36492_210_7927_1_1533261987_4790834_703-343351-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 00:19:28,906 WARNING [train.py:1204] (2/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_80-147301-0_sp0.9 from training. Duration: 0.9333125 2023-10-03 00:19:30,268 WARNING [train.py:1204] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_218-295097-0_sp0.9 from training. Duration: 0.8555625 2023-10-03 00:19:33,147 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_2295_210_2087_1_1525500993_2407863_116-238894-0_sp1.1 from training. Duration: 0.636375 2023-10-03 00:19:35,756 WARNING [train.py:1204] (2/4) Exclude cut with ID _3231_210_2106_1_1526205864934_6784220_697-171068-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 00:19:37,442 WARNING [train.py:1204] (2/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_102-77064-0 from training. Duration: 0.93 2023-10-03 00:19:41,638 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_13091_210_7925_1_1530609447_8987354_1186-261957-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 00:19:42,982 WARNING [train.py:1204] (2/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_91-277684-0_sp1.1 from training. Duration: 0.67275 2023-10-03 00:19:44,943 WARNING [train.py:1204] (2/4) Exclude cut with ID _31088_210_16446_1_1532520012284_3440919_159-313751-0_sp1.1 from training. Duration: 0.936375 2023-10-03 00:19:45,244 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.conv_skip_rate, batch_count=1071093.3333333333, ans=0.0 2023-10-03 00:19:46,260 INFO [train.py:1046] (2/4) Epoch 31, batch 1300, loss[loss=0.1546, simple_loss=0.2208, pruned_loss=0.04418, over 23490.00 frames. ], tot_loss[loss=0.1654, simple_loss=0.2433, pruned_loss=0.04375, over 4702210.64 frames. ], batch size: 285, lr: 3.31e-03, grad_scale: 8.0 2023-10-03 00:19:46,335 WARNING [train.py:1204] (2/4) Exclude cut with ID _42326_210_7770_1_1533814282531_3707269_227-203721-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 00:19:47,692 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_3531_210_3528_1_1526294203_5216456_309-227302-0_sp1.1 from training. Duration: 0.62725 2023-10-03 00:19:47,749 WARNING [train.py:1204] (2/4) Exclude cut with ID _14087_210_2253_1_1530529224512_3014029_226-242140-0 from training. Duration: 0.81 2023-10-03 00:19:51,955 WARNING [train.py:1204] (2/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_122-109023-0_sp1.1 from training. Duration: 0.6909375 2023-10-03 00:19:53,223 WARNING [train.py:1204] (2/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_660-211088-0_sp0.9 from training. Duration: 0.688875 2023-10-03 00:19:55,061 WARNING [train.py:1204] (2/4) Exclude cut with ID _39290_210_4220_1_1533862778760_3075118_92-311600-0 from training. Duration: 0.88 2023-10-03 00:19:58,452 WARNING [train.py:1204] (2/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_390-339137-0_sp1.1 from training. Duration: 0.5090625 2023-10-03 00:20:02,551 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.bypass_mid.scale_min, batch_count=1071160.0, ans=0.2 2023-10-03 00:20:03,656 WARNING [train.py:1204] (2/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_449-341946-0_sp1.1 from training. Duration: 0.963625 2023-10-03 00:20:04,991 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_68095_210_20185_1_1535718226_8888855_884-327007-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 00:20:05,097 WARNING [train.py:1204] (2/4) Exclude cut with ID _42326_210_7770_1_1533814282531_3707269_308-222998-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 00:20:06,004 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.1.encoder.layers.0.nonlin_attention.whiten2, num_groups=1, num_channels=256, metric=5.35 vs. limit=15.0 2023-10-03 00:20:06,584 WARNING [train.py:1204] (2/4) Exclude cut with ID _40399_210_4353_1_1533305658194_3279406_344-238708-0_sp1.1 from training. Duration: 0.963625 2023-10-03 00:20:08,533 WARNING [train.py:1204] (2/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_87-131427-0_sp1.1 from training. Duration: 0.7090625 2023-10-03 00:20:08,584 WARNING [train.py:1204] (2/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_110-130961-0_sp0.9 from training. Duration: 0.52225 2023-10-03 00:20:08,626 WARNING [train.py:1204] (2/4) Exclude cut with ID _55153_210_2253_1_1534384786677_3162096_148-79022-0 from training. Duration: 0.96 2023-10-03 00:20:08,828 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.ff2_skip_rate, batch_count=1071160.0, ans=0.0 2023-10-03 00:20:12,850 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.feed_forward3.hidden_balancer.prob, batch_count=1071160.0, ans=0.125 2023-10-03 00:20:14,041 WARNING [train.py:1204] (2/4) Exclude cut with ID _9679_210_7497_1_1529129212906_4363929_512-313889-0_sp1.1 from training. Duration: 0.736375 2023-10-03 00:20:15,766 WARNING [train.py:1204] (2/4) Exclude cut with ID _7970_210_1883_1_1528455616296_3472230_25-178407-0_sp1.1 from training. Duration: 0.6454375 2023-10-03 00:20:15,970 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.conv_module2.balancer1.prob, batch_count=1071226.6666666667, ans=0.125 2023-10-03 00:20:17,141 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_3133_210_2087_1_1526106446_2324690_246-125000-0 from training. Duration: 0.62 2023-10-03 00:20:18,586 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_15126_210_6753_1_1530778677_2591185_113-157651-0_sp0.9 from training. Duration: 0.7555625 2023-10-03 00:20:20,018 WARNING [train.py:1204] (2/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_105-63309-0_sp1.1 from training. Duration: 0.8909375 2023-10-03 00:20:21,479 WARNING [train.py:1204] (2/4) Exclude cut with ID _29877_210_6228_1_1532397791106_5276149_578-177271-0_sp1.1 from training. Duration: 0.936375 2023-10-03 00:20:22,745 WARNING [train.py:1204] (2/4) Exclude cut with ID _44831_210_13427_1_1533816011549_3689035_32-188147-0 from training. Duration: 0.96 2023-10-03 00:20:22,789 WARNING [train.py:1204] (2/4) Exclude cut with ID _12369_210_4381_1_1530356078581_4388520_300-254337-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 00:20:22,811 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_128-241008-0 from training. Duration: 0.94 2023-10-03 00:20:24,453 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.feed_forward1.hidden_balancer.prob, batch_count=1071226.6666666667, ans=0.125 2023-10-03 00:20:25,554 WARNING [train.py:1204] (2/4) Exclude cut with ID _21878_210_3525_1_1531738772002_7264330_53-279178-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 00:20:29,317 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_41452_210_7291_1_1533437687_3941281_273-334088-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 00:20:29,319 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_128-241008-0_sp1.1 from training. Duration: 0.8545625 2023-10-03 00:20:33,446 WARNING [train.py:1204] (2/4) Exclude cut with ID _8394_210_6702_1_1528590504316_3540189_379-49457-0 from training. Duration: 0.87 2023-10-03 00:20:34,778 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_105-114797-0 from training. Duration: 0.84 2023-10-03 00:20:34,864 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_14928_210_5632_1_1530773461_4714000_454-112198-0 from training. Duration: 0.66 2023-10-03 00:20:34,956 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder_embed.dropout.p, batch_count=1071293.3333333333, ans=0.1 2023-10-03 00:20:39,053 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_456-22141-0_sp1.1 from training. Duration: 0.8 2023-10-03 00:20:40,674 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=1071293.3333333333, ans=0.1 2023-10-03 00:20:41,709 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_115-233929-0 from training. Duration: 0.72 2023-10-03 00:20:43,500 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_616-16795-0_sp1.1 from training. Duration: 0.963625 2023-10-03 00:20:46,131 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.518e+02 1.864e+02 2.122e+02 2.414e+02 4.284e+02, threshold=4.243e+02, percent-clipped=0.0 2023-10-03 00:20:51,271 WARNING [train.py:1204] (2/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_448-212135-0 from training. Duration: 0.91 2023-10-03 00:20:54,024 WARNING [train.py:1204] (2/4) Exclude cut with ID _46683_210_9642_1_1533730913492_4591116_201-205677-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 00:20:56,801 WARNING [train.py:1204] (2/4) Exclude cut with ID _64086_210_7994_1_1535270417019_7435949_43-80483-0_sp1.1 from training. Duration: 0.97275 2023-10-03 00:21:00,446 INFO [train.py:1046] (2/4) Epoch 31, batch 1350, loss[loss=0.1614, simple_loss=0.2316, pruned_loss=0.04561, over 23810.00 frames. ], tot_loss[loss=0.1639, simple_loss=0.2415, pruned_loss=0.04319, over 4705778.68 frames. ], batch size: 179, lr: 3.31e-03, grad_scale: 8.0 2023-10-03 00:21:01,891 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_240-134124-0_sp1.1 from training. Duration: 0.963625 2023-10-03 00:21:01,922 WARNING [train.py:1204] (2/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_907-120940-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 00:21:02,033 WARNING [train.py:1204] (2/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_848-280957-0_sp1.1 from training. Duration: 0.7909375 2023-10-03 00:21:03,342 WARNING [train.py:1204] (2/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_243-42116-0_sp0.9 from training. Duration: 0.811125 2023-10-03 00:21:06,321 WARNING [train.py:1204] (2/4) Exclude cut with ID _6694_210_4599_1_1527852637490_3965529_116-109398-0_sp0.9 from training. Duration: 0.811125 2023-10-03 00:21:06,433 WARNING [train.py:1204] (2/4) Exclude cut with ID _24314_210_4915_1_1532135078116_7170179_521-258868-0 from training. Duration: 0.79 2023-10-03 00:21:09,052 WARNING [train.py:1204] (2/4) Exclude cut with ID _2220_210_1819_1_1525509109898_3658680_50-99837-0_sp0.9 from training. Duration: 0.6 2023-10-03 00:21:09,087 WARNING [train.py:1204] (2/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_313-134930-0_sp1.1 from training. Duration: 0.8818125 2023-10-03 00:21:11,747 WARNING [train.py:1204] (2/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_125-144174-0 from training. Duration: 0.39 2023-10-03 00:21:11,813 WARNING [train.py:1204] (2/4) Exclude cut with ID _59288_210_5854_1_1535077757774_3412246_275-293897-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 00:21:12,001 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.conv_module1.balancer1.prob, batch_count=1071426.6666666667, ans=0.125 2023-10-03 00:21:15,018 WARNING [train.py:1204] (2/4) Exclude cut with ID _40519_210_13794_1_1533715228655_7337390_722-52880-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 00:21:15,044 WARNING [train.py:1204] (2/4) Exclude cut with ID _7537_210_2084_1_1528502395431_3924359_128-268029-0 from training. Duration: 0.98 2023-10-03 00:21:16,422 WARNING [train.py:1204] (2/4) Exclude cut with ID _8021_210_7994_1_1528376451358_3653069_179-45206-0 from training. Duration: 0.86 2023-10-03 00:21:18,423 WARNING [train.py:1204] (2/4) Exclude cut with ID _13252_210_1925_1_1530671372047_3336909_88-276668-0 from training. Duration: 0.99 2023-10-03 00:21:19,834 WARNING [train.py:1204] (2/4) Exclude cut with ID _22613_210_5219_1_1532253599686_4090170_290-212862-0_sp1.1 from training. Duration: 0.97275 2023-10-03 00:21:19,839 WARNING [train.py:1204] (2/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_465-214726-0 from training. Duration: 0.86 2023-10-03 00:21:21,621 INFO [scaling.py:1118] (2/4) WithLoss: name=encoder.encoders.4.encoder.layers.2.self_attn_weights, loss-sum=0.000e+00 2023-10-03 00:21:26,196 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.3.encoder.layers.0.self_attn2.whiten, num_groups=1, num_channels=512, metric=14.07 vs. limit=22.5 2023-10-03 00:21:32,038 WARNING [train.py:1204] (2/4) Exclude cut with ID _56480_210_4220_1_1534679942079_3075409_138-36031-0_sp1.1 from training. Duration: 0.97275 2023-10-03 00:21:40,467 WARNING [train.py:1204] (2/4) Exclude cut with ID _58605_210_12210_1_1534723195939_3904259_300-285878-0_sp1.1 from training. Duration: 0.97275 2023-10-03 00:21:40,495 WARNING [train.py:1204] (2/4) Exclude cut with ID _40901_210_5665_1_1533363789958_3856130_114-340229-0_sp1.1 from training. Duration: 0.936375 2023-10-03 00:21:40,535 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_489-230703-0 from training. Duration: 0.85 2023-10-03 00:21:45,011 WARNING [train.py:1204] (2/4) Exclude cut with ID _66873_210_5517_1_1535454136506_4293370_322-81243-0_sp1.1 from training. Duration: 0.936375 2023-10-03 00:21:46,344 WARNING [train.py:1204] (2/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_271-269084-0 from training. Duration: 0.83 2023-10-03 00:21:46,356 WARNING [train.py:1204] (2/4) Exclude cut with ID _13252_210_1925_1_1530671372047_3336909_166-314607-0_sp0.9 from training. Duration: 0.6 2023-10-03 00:21:46,427 WARNING [train.py:1204] (2/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_340-343302-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 00:21:49,707 WARNING [train.py:1204] (2/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_286-112015-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 00:21:51,247 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.conv_module2.balancer1.prob, batch_count=1071626.6666666667, ans=0.125 2023-10-03 00:21:52,536 WARNING [train.py:1204] (2/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_328-264776-0 from training. Duration: 0.67 2023-10-03 00:21:53,920 WARNING [train.py:1204] (2/4) Exclude cut with ID _4866_210_2577_1_1527404412284_4364659_256-306830-0_sp0.9 from training. Duration: 0.9666875 2023-10-03 00:21:56,925 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=1071626.6666666667, ans=0.1 2023-10-03 00:21:58,159 WARNING [train.py:1204] (2/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_239-316802-0 from training. Duration: 0.69 2023-10-03 00:21:58,769 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.4.encoder.layers.1.nonlin_attention.whiten1, num_groups=1, num_channels=288, metric=4.15 vs. limit=10.0 2023-10-03 00:22:01,964 WARNING [train.py:1204] (2/4) Exclude cut with ID _29857_210_20034_1_1532395886753_3798160_279-185801-0 from training. Duration: 0.91 2023-10-03 00:22:03,626 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=1071693.3333333333, ans=0.1 2023-10-03 00:22:06,063 WARNING [train.py:1204] (2/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_656-188361-0 from training. Duration: 0.87 2023-10-03 00:22:07,440 WARNING [train.py:1204] (2/4) Exclude cut with ID _44718_210_5816_1_1534050051869_6469030_100-230287-0_sp1.1 from training. Duration: 0.936375 2023-10-03 00:22:10,208 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8275_210_4198_1_1528455320_3892571_276-215622-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 00:22:11,643 WARNING [train.py:1204] (2/4) Exclude cut with ID _30304_210_6319_1_1532476827261_7582060_698-309430-0_sp1.1 from training. Duration: 0.963625 2023-10-03 00:22:14,809 INFO [train.py:1046] (2/4) Epoch 31, batch 1400, loss[loss=0.1375, simple_loss=0.2145, pruned_loss=0.03026, over 20962.00 frames. ], tot_loss[loss=0.1631, simple_loss=0.2406, pruned_loss=0.04274, over 4715235.76 frames. ], batch size: 46, lr: 3.31e-03, grad_scale: 8.0 2023-10-03 00:22:16,335 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_365-217119-0 from training. Duration: 0.63 2023-10-03 00:22:17,697 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7264_210_4938_1_1527939911_8132095_800-91995-0 from training. Duration: 0.86 2023-10-03 00:22:25,237 WARNING [train.py:1204] (2/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_49-97158-0_sp1.1 from training. Duration: 0.6909375 2023-10-03 00:22:26,787 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.1.feed_forward3.hidden_balancer.prob, batch_count=1071760.0, ans=0.125 2023-10-03 00:22:28,048 WARNING [train.py:1204] (2/4) Exclude cut with ID _61132_210_9969_1_1535021792964_3755419_61-57720-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 00:22:28,859 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.0.layers.1.self_attn2.whiten, num_groups=1, num_channels=192, metric=9.04 vs. limit=22.5 2023-10-03 00:22:32,526 WARNING [train.py:1204] (2/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_345-306832-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 00:22:32,566 WARNING [train.py:1204] (2/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_164-224052-0_sp0.9 from training. Duration: 0.77775 2023-10-03 00:22:35,971 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.conv_module2.balancer1.prob, batch_count=1071826.6666666667, ans=0.125 2023-10-03 00:22:37,309 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_34886_210_10633_1_1533203882_1755661_76-156096-0_sp1.1 from training. Duration: 0.7909375 2023-10-03 00:22:38,737 WARNING [train.py:1204] (2/4) Exclude cut with ID 4133-6541-0027-40495-0_sp1.1 from training. Duration: 0.9681875 2023-10-03 00:22:47,481 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_514-211165-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 00:22:48,675 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_41211_210_2544_1_1533348859_3702910_97-212695-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 00:22:51,631 WARNING [train.py:1204] (2/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_406-45086-0 from training. Duration: 0.8 2023-10-03 00:22:52,960 WARNING [train.py:1204] (2/4) Exclude cut with ID _65263_210_22356_1_1535763598691_3921037_49-222917-0_sp1.1 from training. Duration: 0.836375 2023-10-03 00:22:53,019 WARNING [train.py:1204] (2/4) Exclude cut with ID _4866_210_2577_1_1527404412284_4364659_352-157930-0_sp1.1 from training. Duration: 0.736375 2023-10-03 00:22:54,390 WARNING [train.py:1204] (2/4) Exclude cut with ID _4366_210_1388_1_1526792053633_3613819_231-211402-0_sp1.1 from training. Duration: 0.8545625 2023-10-03 00:22:54,442 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_98-21576-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 00:22:56,226 WARNING [train.py:1204] (2/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_348-127789-0_sp0.9 from training. Duration: 0.9444375 2023-10-03 00:22:56,256 WARNING [train.py:1204] (2/4) Exclude cut with ID _45871_210_5665_1_1533864614245_3296060_98-329557-0_sp1.1 from training. Duration: 0.97275 2023-10-03 00:22:56,428 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.conv_module2.balancer2.prob, batch_count=1071893.3333333333, ans=0.125 2023-10-03 00:22:57,514 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6846_210_6753_1_1527850767_4295158_2-315073-0_sp1.1 from training. Duration: 0.8 2023-10-03 00:22:58,784 WARNING [train.py:1204] (2/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_145-137790-0 from training. Duration: 0.83 2023-10-03 00:22:58,806 WARNING [train.py:1204] (2/4) Exclude cut with ID _14603_210_4220_1_1530615688441_3097319_133-158308-0_sp0.9 from training. Duration: 0.9333125 2023-10-03 00:23:02,942 WARNING [train.py:1204] (2/4) Exclude cut with ID _14898_210_9206_1_1530766812377_4177190_320-11758-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 00:23:05,093 WARNING [train.py:1204] (2/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_406-45086-0_sp0.9 from training. Duration: 0.888875 2023-10-03 00:23:11,620 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_5438_210_1794_1_1527336687_2600009_104-288274-0 from training. Duration: 0.58 2023-10-03 00:23:12,929 WARNING [train.py:1204] (2/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_108-53929-0_sp0.9 from training. Duration: 0.6555625 2023-10-03 00:23:13,000 WARNING [train.py:1204] (2/4) Exclude cut with ID _30800_210_22382_1_1532499347904_4515959_444-286390-0_sp1.1 from training. Duration: 0.963625 2023-10-03 00:23:16,199 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.555e+02 1.848e+02 2.033e+02 2.325e+02 3.935e+02, threshold=4.067e+02, percent-clipped=0.0 2023-10-03 00:23:16,338 WARNING [train.py:1204] (2/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_125-144174-0_sp0.9 from training. Duration: 0.4333125 2023-10-03 00:23:16,396 WARNING [train.py:1204] (2/4) Exclude cut with ID _55209_210_1531_1_1534675828996_4597160_118-345621-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 00:23:19,138 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_45886_210_17024_1_1533709215_471063_7-79165-0_sp1.1 from training. Duration: 0.8909375 2023-10-03 00:23:21,868 WARNING [train.py:1204] (2/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_175-163894-0_sp0.9 from training. Duration: 0.82225 2023-10-03 00:23:23,696 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_50188_210_4198_1_1534503621_3646041_353-81682-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 00:23:23,702 WARNING [train.py:1204] (2/4) Exclude cut with ID _17875_210_2087_1_1531224012907_1821339_175-80565-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 00:23:23,721 WARNING [train.py:1204] (2/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_159-328157-0_sp1.1 from training. Duration: 0.47275 2023-10-03 00:23:29,058 INFO [train.py:1046] (2/4) Epoch 31, batch 1450, loss[loss=0.1485, simple_loss=0.2268, pruned_loss=0.03509, over 24271.00 frames. ], tot_loss[loss=0.163, simple_loss=0.2408, pruned_loss=0.04262, over 4713805.35 frames. ], batch size: 56, lr: 3.31e-03, grad_scale: 8.0 2023-10-03 00:23:29,098 WARNING [train.py:1204] (2/4) Exclude cut with ID _60057_210_10640_1_1534903065793_3333150_137-267579-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 00:23:29,149 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_92-46009-0_sp1.1 from training. Duration: 0.4909375 2023-10-03 00:23:30,674 WARNING [train.py:1204] (2/4) Exclude cut with ID _53203_210_2087_1_1534246218309_475590_48-68507-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 00:23:30,689 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_28211_210_6388_1_1532861506_4523224_128-328162-0 from training. Duration: 0.9 2023-10-03 00:23:32,857 WARNING [train.py:1204] (2/4) Exclude cut with ID _4697_210_2069_1_1526815801953_3823098_51-1049-0_sp1.1 from training. Duration: 0.6818125 2023-10-03 00:23:34,013 WARNING [train.py:1204] (2/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_383-316000-0 from training. Duration: 0.9 2023-10-03 00:23:34,074 WARNING [train.py:1204] (2/4) Exclude cut with ID _4779_210_2084_1_1527318022944_3914893_185-295153-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 00:23:36,020 WARNING [train.py:1204] (2/4) Exclude cut with ID _3231_210_2106_1_1526205864934_6784220_171-256004-0_sp0.9 from training. Duration: 0.9555625 2023-10-03 00:23:36,021 WARNING [train.py:1204] (2/4) Exclude cut with ID _41314_210_12651_1_1533459476543_3766460_87-272068-0 from training. Duration: 0.83 2023-10-03 00:23:37,403 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_164-152777-0_sp1.1 from training. Duration: 0.92725 2023-10-03 00:23:38,722 WARNING [train.py:1204] (2/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_18-130336-0_sp0.9 from training. Duration: 0.87775 2023-10-03 00:23:38,785 WARNING [train.py:1204] (2/4) Exclude cut with ID _14886_210_4220_1_1530702020355_3516190_175-229073-0_sp1.1 from training. Duration: 0.4090625 2023-10-03 00:23:38,799 WARNING [train.py:1204] (2/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_791-45149-0_sp0.9 from training. Duration: 0.9555625 2023-10-03 00:23:41,489 WARNING [train.py:1204] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_468-189058-0_sp1.1 from training. Duration: 0.87275 2023-10-03 00:23:42,933 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8278_210_2550_1_1528637099_4220413_32-142780-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 00:23:45,794 WARNING [train.py:1204] (2/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_146-218763-0_sp0.9 from training. Duration: 0.9555625 2023-10-03 00:23:49,338 WARNING [train.py:1204] (2/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_348-127789-0_sp1.1 from training. Duration: 0.77275 2023-10-03 00:23:49,344 WARNING [train.py:1204] (2/4) Exclude cut with ID _47790_210_16187_1_1533862814066_4207519_288-285445-0_sp1.1 from training. Duration: 0.936375 2023-10-03 00:23:50,831 WARNING [train.py:1204] (2/4) Exclude cut with ID _14603_210_4220_1_1530615688441_3097319_308-181732-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 00:23:50,866 WARNING [train.py:1204] (2/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_895-117428-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 00:23:52,508 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.feed_forward2.hidden_balancer.prob, batch_count=1072160.0, ans=0.125 2023-10-03 00:23:53,500 WARNING [train.py:1204] (2/4) Exclude cut with ID _9235_210_5219_1_1528891154253_8046120_387-159008-0_sp0.9 from training. Duration: 0.9555625 2023-10-03 00:23:53,521 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_37351_210_7925_1_1533012931_3812113_265-85780-0_sp0.9 from training. Duration: 0.97775 2023-10-03 00:23:53,543 WARNING [train.py:1204] (2/4) Exclude cut with ID _40551_210_10635_1_1533776424632_3416179_361-3964-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 00:23:54,859 WARNING [train.py:1204] (2/4) Exclude cut with ID _25882_210_3036_1_1532775665815_6479510_485-122742-0_sp1.1 from training. Duration: 0.97275 2023-10-03 00:23:59,557 WARNING [train.py:1204] (2/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_98-51676-0 from training. Duration: 0.73 2023-10-03 00:24:02,323 WARNING [train.py:1204] (2/4) Exclude cut with ID _30548_210_16040_1_1532608097130_3146969_371-280610-0_sp1.1 from training. Duration: 0.92725 2023-10-03 00:24:05,187 WARNING [train.py:1204] (2/4) Exclude cut with ID IC0671W0081-331924 from training. Duration: 0.896 2023-10-03 00:24:08,950 WARNING [train.py:1204] (2/4) Exclude cut with ID _40284_210_2084_1_1533513679814_3704279_380-160775-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 00:24:10,364 WARNING [train.py:1204] (2/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_57-270037-0_sp1.1 from training. Duration: 0.7 2023-10-03 00:24:10,447 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_853-150237-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 00:24:11,778 WARNING [train.py:1204] (2/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_491-261923-0 from training. Duration: 0.98 2023-10-03 00:24:14,907 WARNING [train.py:1204] (2/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_836-244045-0_sp1.1 from training. Duration: 0.97275 2023-10-03 00:24:16,295 WARNING [train.py:1204] (2/4) Exclude cut with ID _14880_210_8895_1_1530680426655_3795259_289-212817-0 from training. Duration: 0.69 2023-10-03 00:24:16,545 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.bypass.scale_min, batch_count=1072293.3333333333, ans=0.2 2023-10-03 00:24:17,032 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.1.encoder.layers.1.feed_forward1.out_whiten, num_groups=1, num_channels=256, metric=6.22 vs. limit=15.0 2023-10-03 00:24:17,657 WARNING [train.py:1204] (2/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_303-340446-0 from training. Duration: 0.9 2023-10-03 00:24:17,945 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.feed_forward2.hidden_balancer.prob, batch_count=1072293.3333333333, ans=0.125 2023-10-03 00:24:19,462 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_37152_210_9697_1_1533106573_3844188_418-41736-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 00:24:22,275 WARNING [train.py:1204] (2/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_371-18169-0_sp1.1 from training. Duration: 0.7818125 2023-10-03 00:24:22,302 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_187-219984-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 00:24:23,659 WARNING [train.py:1204] (2/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_63-267585-0 from training. Duration: 0.9 2023-10-03 00:24:25,065 WARNING [train.py:1204] (2/4) Exclude cut with ID _27013_210_1389_1_1532437173240_3252649_222-125573-0 from training. Duration: 0.98 2023-10-03 00:24:26,456 WARNING [train.py:1204] (2/4) Exclude cut with ID _14821_210_8750_1_1530680503991_6829359_189-297451-0 from training. Duration: 0.55 2023-10-03 00:24:27,804 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_557-163391-0_sp1.1 from training. Duration: 0.97275 2023-10-03 00:24:27,886 WARNING [train.py:1204] (2/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_65-88242-0_sp1.1 from training. Duration: 0.5090625 2023-10-03 00:24:31,322 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.feed_forward3.hidden_balancer.prob, batch_count=1072360.0, ans=0.125 2023-10-03 00:24:38,121 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.feed_forward2.hidden_balancer.prob, batch_count=1072360.0, ans=0.125 2023-10-03 00:24:41,872 WARNING [train.py:1204] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_218-295097-0 from training. Duration: 0.77 2023-10-03 00:24:41,901 WARNING [train.py:1204] (2/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_686-247600-0_sp1.1 from training. Duration: 0.836375 2023-10-03 00:24:41,905 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_397-147196-0_sp0.9 from training. Duration: 0.888875 2023-10-03 00:24:43,314 WARNING [train.py:1204] (2/4) Exclude cut with ID _53950_210_5816_1_1534417264919_3862951_237-136449-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 00:24:43,386 WARNING [train.py:1204] (2/4) Exclude cut with ID _3120_210_3525_1_1526036143112_4072980_74-178400-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 00:24:44,682 INFO [train.py:1046] (2/4) Epoch 31, batch 1500, loss[loss=0.1642, simple_loss=0.2323, pruned_loss=0.04804, over 23637.00 frames. ], tot_loss[loss=0.1631, simple_loss=0.241, pruned_loss=0.0426, over 4711125.44 frames. ], batch size: 256, lr: 3.31e-03, grad_scale: 8.0 2023-10-03 00:24:44,751 WARNING [train.py:1204] (2/4) Exclude cut with ID _5166_210_2084_1_1527073197160_4087993_309-222545-0_sp0.9 from training. Duration: 0.9333125 2023-10-03 00:24:44,821 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7544_210_6753_1_1528587722_3576686_172-171750-0 from training. Duration: 0.65 2023-10-03 00:24:47,451 WARNING [train.py:1204] (2/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_3-237434-0_sp0.9 from training. Duration: 0.8444375 2023-10-03 00:24:47,492 WARNING [train.py:1204] (2/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_101-293367-0_sp0.9 from training. Duration: 0.7 2023-10-03 00:24:47,500 WARNING [train.py:1204] (2/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_818-213552-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 00:24:48,844 WARNING [train.py:1204] (2/4) Exclude cut with ID _14349_210_8341_1_1530615710818_3442470_115-285975-0_sp1.1 from training. Duration: 0.7818125 2023-10-03 00:24:51,505 WARNING [train.py:1204] (2/4) Exclude cut with ID _30800_210_22382_1_1532499347904_4515959_291-309749-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 00:24:51,589 WARNING [train.py:1204] (2/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_715-3462-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 00:24:56,378 WARNING [train.py:1204] (2/4) Exclude cut with ID _59502_210_12210_1_1535107024698_1835139_13-331827-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 00:24:56,387 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_4528_210_3273_1_1527076654_3276777_342-168068-0 from training. Duration: 0.87 2023-10-03 00:24:56,446 WARNING [train.py:1204] (2/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_215-141059-0_sp0.9 from training. Duration: 0.688875 2023-10-03 00:24:56,474 WARNING [train.py:1204] (2/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_106-286130-0_sp1.1 from training. Duration: 0.8909375 2023-10-03 00:24:57,895 WARNING [train.py:1204] (2/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_480-77655-0_sp1.1 from training. Duration: 0.97275 2023-10-03 00:24:59,710 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.4.encoder.layers.1.conv_module1.whiten, num_groups=1, num_channels=384, metric=3.20 vs. limit=15.0 2023-10-03 00:25:02,523 WARNING [train.py:1204] (2/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_848-280957-0 from training. Duration: 0.87 2023-10-03 00:25:02,843 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.conv_skip_rate, batch_count=1072493.3333333333, ans=0.0 2023-10-03 00:25:05,793 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.3.encoder.layers.2.feed_forward3.out_whiten, num_groups=1, num_channels=512, metric=9.18 vs. limit=15.0 2023-10-03 00:25:06,667 WARNING [train.py:1204] (2/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_122-109023-0 from training. Duration: 0.76 2023-10-03 00:25:08,586 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_11305_210_1794_1_1529750428_7178148_645-233480-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 00:25:09,893 WARNING [train.py:1204] (2/4) Exclude cut with ID _7562_210_2084_1_1528887767916_3946229_345-169156-0 from training. Duration: 0.58 2023-10-03 00:25:11,948 WARNING [train.py:1204] (2/4) Exclude cut with ID _7562_210_2084_1_1528887767916_3946229_345-169156-0_sp1.1 from training. Duration: 0.52725 2023-10-03 00:25:14,676 WARNING [train.py:1204] (2/4) Exclude cut with ID _13253_210_1925_1_1530757775819_3577873_253-246386-0_sp1.1 from training. Duration: 0.8181875 2023-10-03 00:25:14,755 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_136-264444-0_sp1.1 from training. Duration: 0.97275 2023-10-03 00:25:16,034 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_2168_210_2290_1_1525606920_1849400_136-222991-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 00:25:17,395 WARNING [train.py:1204] (2/4) Exclude cut with ID _7852_210_5219_1_1528286471316_8139400_242-105336-0 from training. Duration: 0.72 2023-10-03 00:25:17,437 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_5960_210_1794_1_1527403865_3990999_515-2613-0_sp1.1 from training. Duration: 0.8 2023-10-03 00:25:17,457 WARNING [train.py:1204] (2/4) Exclude cut with ID _39688_210_19596_1_1533371626725_4154159_551-167834-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 00:25:17,498 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_43802_210_2290_1_1533516203_8829504_85-195240-0 from training. Duration: 0.87 2023-10-03 00:25:18,857 WARNING [train.py:1204] (2/4) Exclude cut with ID _63171_210_15488_1_1535104792794_3295110_113-113534-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 00:25:23,155 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=1072560.0, ans=0.1 2023-10-03 00:25:24,284 WARNING [train.py:1204] (2/4) Exclude cut with ID _8833_210_5219_1_1528592665210_7553059_319-349181-0_sp1.1 from training. Duration: 0.9 2023-10-03 00:25:24,285 WARNING [train.py:1204] (2/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_225-43381-0 from training. Duration: 0.94 2023-10-03 00:25:24,531 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.nonlin_attention.balancer.prob, batch_count=1072560.0, ans=0.125 2023-10-03 00:25:29,149 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_5959_210_1794_1_1527400400_3445726_412-44052-0_sp0.9 from training. Duration: 0.8555625 2023-10-03 00:25:30,593 WARNING [train.py:1204] (2/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_356-167816-0_sp1.1 from training. Duration: 0.6545625 2023-10-03 00:25:35,375 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9012W0112-501244 from training. Duration: 0.768 2023-10-03 00:25:35,421 WARNING [train.py:1204] (2/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_177-326952-0_sp1.1 from training. Duration: 0.92725 2023-10-03 00:25:35,425 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9016W0116-502789 from training. Duration: 0.896 2023-10-03 00:25:38,215 WARNING [train.py:1204] (2/4) Exclude cut with ID _56235_210_10640_1_1534557335235_4161099_557-204815-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 00:25:39,620 WARNING [train.py:1204] (2/4) Exclude cut with ID _55857_210_2346_1_1534644007831_6898209_279-229469-0_sp1.1 from training. Duration: 0.936375 2023-10-03 00:25:40,297 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9029W0375-509764 from training. Duration: 0.896 2023-10-03 00:25:41,589 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_2295_210_2087_1_1525500993_2407863_234-83104-0_sp0.9 from training. Duration: 0.688875 2023-10-03 00:25:44,846 WARNING [train.py:1204] (2/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_266-137104-0 from training. Duration: 0.65 2023-10-03 00:25:46,098 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.574e+02 1.898e+02 2.100e+02 2.437e+02 3.214e+02, threshold=4.200e+02, percent-clipped=0.0 2023-10-03 00:25:46,226 WARNING [train.py:1204] (2/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_289-163927-0_sp1.1 from training. Duration: 0.92725 2023-10-03 00:25:49,061 WARNING [train.py:1204] (2/4) Exclude cut with ID _12733_210_8341_1_1530167424556_3646380_86-225314-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 00:25:49,074 WARNING [train.py:1204] (2/4) Exclude cut with ID _7877_210_6014_1_1528286358099_3750136_618-284064-0_sp1.1 from training. Duration: 0.92725 2023-10-03 00:25:49,115 WARNING [train.py:1204] (2/4) Exclude cut with ID _55844_210_2346_1_1534471212569_7625707_1017-31469-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 00:25:50,556 WARNING [train.py:1204] (2/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_617-241446-0_sp1.1 from training. Duration: 0.92725 2023-10-03 00:25:50,622 WARNING [train.py:1204] (2/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_866-173440-0_sp1.1 from training. Duration: 0.8181875 2023-10-03 00:25:53,205 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_9460_210_4211_1_1528954587_10049667_480-260269-0 from training. Duration: 0.56 2023-10-03 00:25:53,405 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.out_combiner.scale_min, batch_count=1072693.3333333333, ans=0.2 2023-10-03 00:25:55,076 WARNING [train.py:1204] (2/4) Exclude cut with ID _44659_210_2721_1_1534036120061_6614860_626-298884-0 from training. Duration: 0.97 2023-10-03 00:25:55,112 WARNING [train.py:1204] (2/4) Exclude cut with ID _53565_210_10635_1_1534337218335_3820259_287-18120-0_sp1.1 from training. Duration: 0.8818125 2023-10-03 00:25:55,170 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_791-197891-0 from training. Duration: 0.84 2023-10-03 00:25:56,520 WARNING [train.py:1204] (2/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_527-238990-0 from training. Duration: 0.9 2023-10-03 00:25:59,107 INFO [train.py:1046] (2/4) Epoch 31, batch 1550, loss[loss=0.1559, simple_loss=0.2372, pruned_loss=0.03733, over 23399.00 frames. ], tot_loss[loss=0.1629, simple_loss=0.241, pruned_loss=0.04238, over 4720203.87 frames. ], batch size: 106, lr: 3.31e-03, grad_scale: 8.0 2023-10-03 00:26:00,543 WARNING [train.py:1204] (2/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_449-65969-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 00:26:00,652 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_553-341452-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 00:26:01,946 WARNING [train.py:1204] (2/4) Exclude cut with ID _16909_210_5724_1_1531292079912_4118919_300-47574-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 00:26:01,966 WARNING [train.py:1204] (2/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_70-27732-0_sp1.1 from training. Duration: 0.8545625 2023-10-03 00:26:03,335 WARNING [train.py:1204] (2/4) Exclude cut with ID _44718_210_5816_1_1534050051869_6469030_727-28074-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 00:26:03,375 WARNING [train.py:1204] (2/4) Exclude cut with ID _55857_210_2346_1_1534644007831_6898209_357-123664-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 00:26:08,020 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9123W0004-555964 from training. Duration: 0.896 2023-10-03 00:26:08,040 WARNING [train.py:1204] (2/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_445-145208-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 00:26:08,064 WARNING [train.py:1204] (2/4) Exclude cut with ID _3675_210_4557_1_1526639934408_4182089_456-18305-0_sp1.1 from training. Duration: 0.6909375 2023-10-03 00:26:09,388 WARNING [train.py:1204] (2/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_1-99680-0_sp1.1 from training. Duration: 0.6090625 2023-10-03 00:26:10,900 WARNING [train.py:1204] (2/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_146-282400-0_sp0.9 from training. Duration: 0.788875 2023-10-03 00:26:12,174 WARNING [train.py:1204] (2/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_844-96989-0 from training. Duration: 0.71 2023-10-03 00:26:14,111 WARNING [train.py:1204] (2/4) Exclude cut with ID _29853_210_7497_1_1532505600851_6642110_128-146364-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 00:26:14,140 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_224-322394-0 from training. Duration: 0.66 2023-10-03 00:26:15,590 WARNING [train.py:1204] (2/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_191-59784-0 from training. Duration: 0.88 2023-10-03 00:26:15,595 WARNING [train.py:1204] (2/4) Exclude cut with ID _12556_210_8341_1_1530081024100_7327690_725-193477-0 from training. Duration: 0.98 2023-10-03 00:26:16,857 WARNING [train.py:1204] (2/4) Exclude cut with ID _5166_210_2084_1_1527073197160_4087993_523-79838-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 00:26:18,231 WARNING [train.py:1204] (2/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_398-334558-0_sp1.1 from training. Duration: 0.87275 2023-10-03 00:26:22,880 WARNING [train.py:1204] (2/4) Exclude cut with ID _6456_210_1534_1_1527681634212_3181049_171-36697-0_sp1.1 from training. Duration: 0.936375 2023-10-03 00:26:24,309 WARNING [train.py:1204] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_700-183549-0 from training. Duration: 0.68 2023-10-03 00:26:24,311 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8853_210_3273_1_1528633899_818968_51-187685-0 from training. Duration: 0.69 2023-10-03 00:26:27,215 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.ff2_skip_rate, batch_count=1072893.3333333333, ans=0.0 2023-10-03 00:26:33,099 WARNING [train.py:1204] (2/4) Exclude cut with ID _7719_210_3525_1_1528456153163_4296960_333-110399-0_sp1.1 from training. Duration: 0.87275 2023-10-03 00:26:37,150 WARNING [train.py:1204] (2/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_1022-83740-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 00:26:37,174 WARNING [train.py:1204] (2/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_61-327062-0_sp0.9 from training. Duration: 0.9 2023-10-03 00:26:37,180 WARNING [train.py:1204] (2/4) Exclude cut with ID _47251_210_18628_1_1533814063128_4137250_214-88076-0_sp1.1 from training. Duration: 0.8 2023-10-03 00:26:37,235 WARNING [train.py:1204] (2/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_839-173017-0 from training. Duration: 0.41 2023-10-03 00:26:37,486 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.feed_forward3.hidden_balancer.prob, batch_count=1072893.3333333333, ans=0.125 2023-10-03 00:26:38,759 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=1072893.3333333333, ans=0.1 2023-10-03 00:26:42,160 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.bypass.skip_rate, batch_count=1072960.0, ans=0.07 2023-10-03 00:26:43,393 WARNING [train.py:1204] (2/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_299-34413-0_sp1.1 from training. Duration: 0.5909375 2023-10-03 00:26:44,756 WARNING [train.py:1204] (2/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_664-159457-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 00:26:45,054 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.feed_forward3.hidden_balancer.prob, batch_count=1072960.0, ans=0.125 2023-10-03 00:26:46,096 WARNING [train.py:1204] (2/4) Exclude cut with ID _66873_210_5517_1_1535454136506_4293370_483-291760-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 00:26:50,615 WARNING [train.py:1204] (2/4) Exclude cut with ID _30800_210_22382_1_1532499347904_4515959_287-255474-0_sp1.1 from training. Duration: 0.92725 2023-10-03 00:26:50,822 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.conv_skip_rate, batch_count=1072960.0, ans=0.0 2023-10-03 00:26:52,425 WARNING [train.py:1204] (2/4) Exclude cut with ID _48677_210_5663_1_1533902405919_3892989_111-11439-0_sp1.1 from training. Duration: 0.87275 2023-10-03 00:26:52,432 WARNING [train.py:1204] (2/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_399-156791-0 from training. Duration: 0.83 2023-10-03 00:26:52,465 WARNING [train.py:1204] (2/4) Exclude cut with ID _3992_210_1747_1_1526644816031_3731060_36-147855-0_sp0.9 from training. Duration: 0.8666875 2023-10-03 00:26:53,989 WARNING [train.py:1204] (2/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_442-320317-0_sp1.1 from training. Duration: 0.7454375 2023-10-03 00:26:54,009 WARNING [train.py:1204] (2/4) Exclude cut with ID _55429_210_10626_1_1534467588527_7174143_953-80868-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 00:26:54,086 WARNING [train.py:1204] (2/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_159-328157-0_sp0.9 from training. Duration: 0.57775 2023-10-03 00:26:54,100 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9309W0197-640324 from training. Duration: 0.896 2023-10-03 00:26:57,038 WARNING [train.py:1204] (2/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_361-30822-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 00:27:01,498 WARNING [train.py:1204] (2/4) Exclude cut with ID _5250_210_5219_1_1527249561806_8692110_636-251213-0 from training. Duration: 0.94 2023-10-03 00:27:05,712 WARNING [train.py:1204] (2/4) Exclude cut with ID _49728_210_12651_1_1534041034294_6788150_468-333827-0_sp1.1 from training. Duration: 0.963625 2023-10-03 00:27:07,101 WARNING [train.py:1204] (2/4) Exclude cut with ID _25882_210_3036_1_1532775665815_6479510_623-191759-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 00:27:07,166 WARNING [train.py:1204] (2/4) Exclude cut with ID _5189_210_4557_1_1527248319428_3769229_199-53526-0 from training. Duration: 0.42 2023-10-03 00:27:09,813 WARNING [train.py:1204] (2/4) Exclude cut with ID _6274_210_2084_1_1527937583837_7494020_32-205288-0_sp0.9 from training. Duration: 0.8666875 2023-10-03 00:27:11,640 WARNING [train.py:1204] (2/4) Exclude cut with ID _65263_210_22356_1_1535763598691_3921037_401-346646-0_sp1.1 from training. Duration: 0.963625 2023-10-03 00:27:11,644 WARNING [train.py:1204] (2/4) Exclude cut with ID _12555_210_8750_1_1530075594402_3780239_449-267986-0_sp1.1 from training. Duration: 0.7545625 2023-10-03 00:27:11,656 WARNING [train.py:1204] (2/4) Exclude cut with ID _69884_210_9771_1_1535846392818_4166169_430-201020-0_sp1.1 from training. Duration: 0.8818125 2023-10-03 00:27:11,728 WARNING [train.py:1204] (2/4) Exclude cut with ID _8394_210_6702_1_1528590504316_3540189_379-49457-0_sp1.1 from training. Duration: 0.7909375 2023-10-03 00:27:13,100 INFO [train.py:1046] (2/4) Epoch 31, batch 1600, loss[loss=0.1729, simple_loss=0.2571, pruned_loss=0.04434, over 23593.00 frames. ], tot_loss[loss=0.1628, simple_loss=0.2412, pruned_loss=0.0422, over 4735136.89 frames. ], batch size: 85, lr: 3.30e-03, grad_scale: 16.0 2023-10-03 00:27:14,652 WARNING [train.py:1204] (2/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_200-105218-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 00:27:16,042 WARNING [train.py:1204] (2/4) Exclude cut with ID _3867_210_1883_1_1526641200575_4475830_319-349850-0 from training. Duration: 0.67 2023-10-03 00:27:16,125 WARNING [train.py:1204] (2/4) Exclude cut with ID _28317_210_12742_1_1532480302381_8510325_916-195848-0 from training. Duration: 0.92 2023-10-03 00:27:19,876 WARNING [train.py:1204] (2/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_436-186311-0 from training. Duration: 0.65 2023-10-03 00:27:22,382 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_3910_210_1794_1_1526777667_3747260_302-180617-0_sp1.1 from training. Duration: 0.97275 2023-10-03 00:27:23,722 WARNING [train.py:1204] (2/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_102-92751-0 from training. Duration: 0.99 2023-10-03 00:27:23,788 WARNING [train.py:1204] (2/4) Exclude cut with ID _51899_210_20034_1_1534129254466_3669220_265-215428-0_sp1.1 from training. Duration: 0.8909375 2023-10-03 00:27:24,408 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.4.encoder.layers.0.whiten, num_groups=1, num_channels=384, metric=4.68 vs. limit=12.0 2023-10-03 00:27:26,654 WARNING [train.py:1204] (2/4) Exclude cut with ID _58583_210_5236_1_1534726835266_3574249_438-198993-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 00:27:30,701 WARNING [train.py:1204] (2/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_166-166990-0_sp1.1 from training. Duration: 0.87275 2023-10-03 00:27:30,923 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.conv_module1.balancer1.prob, batch_count=1073160.0, ans=0.125 2023-10-03 00:27:35,180 WARNING [train.py:1204] (2/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_907-151398-0 from training. Duration: 0.37 2023-10-03 00:27:36,765 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.ff2_skip_rate, batch_count=1073160.0, ans=0.0 2023-10-03 00:27:37,941 WARNING [train.py:1204] (2/4) Exclude cut with ID _38670_210_15379_1_1533448810659_4391059_452-256669-0_sp1.1 from training. Duration: 0.936375 2023-10-03 00:27:37,981 WARNING [train.py:1204] (2/4) Exclude cut with ID _48677_210_5663_1_1533902405919_3892989_408-40740-0 from training. Duration: 0.83 2023-10-03 00:27:38,024 WARNING [train.py:1204] (2/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_903-158756-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 00:27:38,071 WARNING [train.py:1204] (2/4) Exclude cut with ID _13252_210_1925_1_1530671372047_3336909_397-216497-0 from training. Duration: 0.96 2023-10-03 00:27:45,193 WARNING [train.py:1204] (2/4) Exclude cut with ID _55429_210_10626_1_1534467588527_7174143_97-335084-0 from training. Duration: 0.97 2023-10-03 00:27:51,470 WARNING [train.py:1204] (2/4) Exclude cut with ID _45329_210_7005_1_1534157951211_6774339_453-267730-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 00:27:53,394 WARNING [train.py:1204] (2/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_153-97441-0 from training. Duration: 0.56 2023-10-03 00:27:53,452 WARNING [train.py:1204] (2/4) Exclude cut with ID _40901_210_5665_1_1533363789958_3856130_312-329122-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 00:27:53,494 WARNING [train.py:1204] (2/4) Exclude cut with ID _30800_210_22382_1_1532499347904_4515959_97-126471-0_sp1.1 from training. Duration: 0.97275 2023-10-03 00:27:53,495 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_11305_210_1794_1_1529750428_7178148_257-312870-0_sp1.1 from training. Duration: 0.8 2023-10-03 00:27:56,155 WARNING [train.py:1204] (2/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_454-314901-0_sp0.9 from training. Duration: 0.511125 2023-10-03 00:28:00,142 WARNING [train.py:1204] (2/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_282-136641-0_sp1.1 from training. Duration: 0.3818125 2023-10-03 00:28:03,446 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_46013_210_7234_1_1533776362_3744032_493-160724-0_sp1.1 from training. Duration: 0.963625 2023-10-03 00:28:04,721 WARNING [train.py:1204] (2/4) Exclude cut with ID _67831_210_12392_1_1535612464202_3092612_259-33442-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 00:28:04,780 WARNING [train.py:1204] (2/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_289-62384-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 00:28:04,844 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6545_210_2040_1_1527774670_4245258_379-14182-0_sp0.9 from training. Duration: 0.888875 2023-10-03 00:28:06,557 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_548-294316-0_sp1.1 from training. Duration: 0.736375 2023-10-03 00:28:07,964 WARNING [train.py:1204] (2/4) Exclude cut with ID _6275_210_2084_1_1528110062213_7669049_857-298942-0_sp1.1 from training. Duration: 0.77275 2023-10-03 00:28:09,357 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6673_210_3273_1_1527767790_3173240_327-306185-0_sp1.1 from training. Duration: 0.7545625 2023-10-03 00:28:13,238 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.519e+02 1.827e+02 1.985e+02 2.199e+02 3.882e+02, threshold=3.970e+02, percent-clipped=0.0 2023-10-03 00:28:14,751 WARNING [train.py:1204] (2/4) Exclude cut with ID _61008_210_10633_1_1534852769198_6974370_814-107647-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 00:28:16,647 WARNING [train.py:1204] (2/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_116-332008-0_sp1.1 from training. Duration: 0.8909375 2023-10-03 00:28:18,075 WARNING [train.py:1204] (2/4) Exclude cut with ID _32704_210_8954_1_1533017488840_6747370_266-329755-0 from training. Duration: 0.89 2023-10-03 00:28:18,078 WARNING [train.py:1204] (2/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_271-269084-0_sp0.9 from training. Duration: 0.92225 2023-10-03 00:28:18,149 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_3171_210_2087_1_1526109084_2330007_66-260583-0 from training. Duration: 0.72 2023-10-03 00:28:24,833 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=1073360.0, ans=0.1 2023-10-03 00:28:25,892 WARNING [train.py:1204] (2/4) Exclude cut with ID _5250_210_5219_1_1527249561806_8692110_1326-276386-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 00:28:26,198 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.conv_module2.balancer2.prob, batch_count=1073426.6666666667, ans=0.125 2023-10-03 00:28:27,286 INFO [train.py:1046] (2/4) Epoch 31, batch 1650, loss[loss=0.1804, simple_loss=0.2608, pruned_loss=0.05004, over 23956.00 frames. ], tot_loss[loss=0.1637, simple_loss=0.2426, pruned_loss=0.04244, over 4725662.00 frames. ], batch size: 80, lr: 3.30e-03, grad_scale: 16.0 2023-10-03 00:28:27,387 WARNING [train.py:1204] (2/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_372-30302-0_sp1.1 from training. Duration: 0.7818125 2023-10-03 00:28:27,445 WARNING [train.py:1204] (2/4) Exclude cut with ID _25882_210_3036_1_1532775665815_6479510_631-257029-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 00:28:27,453 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_3691_210_3528_1_1526468169_4062053_355-334432-0 from training. Duration: 0.87 2023-10-03 00:28:27,471 WARNING [train.py:1204] (2/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_839-173017-0_sp1.1 from training. Duration: 0.37275 2023-10-03 00:28:27,479 WARNING [train.py:1204] (2/4) Exclude cut with ID _14998_210_5219_1_1530705582080_7474970_441-91466-0 from training. Duration: 0.97 2023-10-03 00:28:28,765 WARNING [train.py:1204] (2/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_108-53929-0 from training. Duration: 0.59 2023-10-03 00:28:30,534 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.feed_forward1.out_proj.dropout_p, batch_count=1073426.6666666667, ans=0.1 2023-10-03 00:28:33,156 WARNING [train.py:1204] (2/4) Exclude cut with ID _9389_210_1664_1_1529217337301_5289269_260-1942-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 00:28:34,888 WARNING [train.py:1204] (2/4) Exclude cut with ID _56423_210_5854_1_1534896061049_3165969_198-61547-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 00:28:34,917 WARNING [train.py:1204] (2/4) Exclude cut with ID _44778_210_2054_1_1533798127203_7443090_412-142122-0_sp1.1 from training. Duration: 0.92725 2023-10-03 00:28:36,205 WARNING [train.py:1204] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_88-341718-0_sp1.1 from training. Duration: 0.6 2023-10-03 00:28:38,963 WARNING [train.py:1204] (2/4) Exclude cut with ID _61079_210_4525_1_1534921008198_4011419_334-298697-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 00:28:40,396 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_5465_210_4211_1_1527227253_55357_8-2499-0_sp1.1 from training. Duration: 0.996375 2023-10-03 00:28:41,914 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_447-201565-0_sp1.1 from training. Duration: 0.97275 2023-10-03 00:28:42,015 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.0.bypass_mid.scale_min, batch_count=1073493.3333333333, ans=0.2 2023-10-03 00:28:43,201 WARNING [train.py:1204] (2/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_253-161789-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 00:28:43,205 WARNING [train.py:1204] (2/4) Exclude cut with ID _44304_210_15374_1_1533639352765_4613129_302-22084-0_sp0.9 from training. Duration: 0.9555625 2023-10-03 00:28:43,224 WARNING [train.py:1204] (2/4) Exclude cut with ID _26664_210_7770_1_1532258943280_3418270_187-80815-0_sp1.1 from training. Duration: 0.8454375 2023-10-03 00:28:44,592 WARNING [train.py:1204] (2/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_68-212801-0 from training. Duration: 0.73 2023-10-03 00:28:44,629 WARNING [train.py:1204] (2/4) Exclude cut with ID _29345_210_4381_1_1532689168263_3860169_149-257268-0 from training. Duration: 0.81 2023-10-03 00:28:49,390 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_213-103282-0_sp1.1 from training. Duration: 0.7090625 2023-10-03 00:28:49,608 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.conv_module2.balancer1.prob, batch_count=1073493.3333333333, ans=0.125 2023-10-03 00:28:51,923 WARNING [train.py:1204] (2/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_156-302675-0_sp1.1 from training. Duration: 0.5 2023-10-03 00:29:00,005 WARNING [train.py:1204] (2/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_125-322172-0 from training. Duration: 0.93 2023-10-03 00:29:00,072 WARNING [train.py:1204] (2/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_493-60474-0_sp1.1 from training. Duration: 0.963625 2023-10-03 00:29:02,784 WARNING [train.py:1204] (2/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_156-302675-0 from training. Duration: 0.55 2023-10-03 00:29:04,303 WARNING [train.py:1204] (2/4) Exclude cut with ID _41314_210_12651_1_1533459476543_3766460_123-267862-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 00:29:07,459 WARNING [train.py:1204] (2/4) Exclude cut with ID _28427_210_4220_1_1532581240967_3061069_201-143747-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 00:29:07,489 WARNING [train.py:1204] (2/4) Exclude cut with ID _8772_210_1850_1_1528635595972_3664011_96-12756-0_sp1.1 from training. Duration: 0.8909375 2023-10-03 00:29:07,537 WARNING [train.py:1204] (2/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_1347-145675-0_sp1.1 from training. Duration: 0.936375 2023-10-03 00:29:10,159 WARNING [train.py:1204] (2/4) Exclude cut with ID _9235_210_5219_1_1528891154253_8046120_387-159008-0_sp1.1 from training. Duration: 0.7818125 2023-10-03 00:29:10,189 WARNING [train.py:1204] (2/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_400-201896-0_sp1.1 from training. Duration: 0.963625 2023-10-03 00:29:12,929 WARNING [train.py:1204] (2/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_326-174612-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 00:29:14,265 WARNING [train.py:1204] (2/4) Exclude cut with ID _55844_210_2346_1_1534471212569_7625707_390-116233-0_sp1.1 from training. Duration: 0.963625 2023-10-03 00:29:14,295 WARNING [train.py:1204] (2/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_419-317062-0_sp1.1 from training. Duration: 0.82725 2023-10-03 00:29:14,342 WARNING [train.py:1204] (2/4) Exclude cut with ID _29877_210_6228_1_1532397791106_5276149_266-62291-0_sp0.9 from training. Duration: 0.9666875 2023-10-03 00:29:16,139 WARNING [train.py:1204] (2/4) Exclude cut with ID _7857_210_1534_1_1528286515745_6831470_431-338528-0_sp1.1 from training. Duration: 0.92725 2023-10-03 00:29:16,225 WARNING [train.py:1204] (2/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_87-131427-0_sp0.9 from training. Duration: 0.8666875 2023-10-03 00:29:21,414 WARNING [train.py:1204] (2/4) Exclude cut with ID _24055_210_12651_1_1532310907143_976979_92-59356-0_sp1.1 from training. Duration: 0.82725 2023-10-03 00:29:21,500 WARNING [train.py:1204] (2/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_96-290039-0 from training. Duration: 0.56 2023-10-03 00:29:23,536 WARNING [train.py:1204] (2/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_48-182155-0_sp0.9 from training. Duration: 0.9666875 2023-10-03 00:29:23,587 WARNING [train.py:1204] (2/4) Exclude cut with ID _4626_210_3532_1_1526817652717_3916859_446-206656-0 from training. Duration: 0.76 2023-10-03 00:29:23,670 WARNING [train.py:1204] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_168-32906-0 from training. Duration: 0.69 2023-10-03 00:29:25,067 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_3780_210_2087_1_1526467760_2130483_170-259044-0 from training. Duration: 0.7 2023-10-03 00:29:25,078 WARNING [train.py:1204] (2/4) Exclude cut with ID _65134_210_14763_1_1535335393985_3505368_153-216718-0_sp1.1 from training. Duration: 0.92725 2023-10-03 00:29:26,518 WARNING [train.py:1204] (2/4) Exclude cut with ID _64639_210_5816_1_1535367619437_3528000_461-329156-0_sp1.1 from training. Duration: 0.87275 2023-10-03 00:29:26,551 WARNING [train.py:1204] (2/4) Exclude cut with ID _21989_210_5854_1_1531899214931_3271360_126-347531-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 00:29:26,609 WARNING [train.py:1204] (2/4) Exclude cut with ID _7614_210_4915_1_1529326764092_4537359_96-162197-0_sp1.1 from training. Duration: 0.963625 2023-10-03 00:29:26,610 WARNING [train.py:1204] (2/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_1083-147536-0 from training. Duration: 0.97 2023-10-03 00:29:31,243 WARNING [train.py:1204] (2/4) Exclude cut with ID _64029_210_4381_1_1535178502772_3756459_275-16710-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 00:29:32,690 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7264_210_4938_1_1527939911_8132095_800-91995-0_sp0.9 from training. Duration: 0.9555625 2023-10-03 00:29:33,996 WARNING [train.py:1204] (2/4) Exclude cut with ID _3844_210_1390_1_1526727803582_2871209_50-193879-0_sp1.1 from training. Duration: 0.936375 2023-10-03 00:29:35,422 WARNING [train.py:1204] (2/4) Exclude cut with ID _46952_210_5986_1_1534230010357_3107379_232-344503-0 from training. Duration: 0.96 2023-10-03 00:29:40,089 WARNING [train.py:1204] (2/4) Exclude cut with ID _25808_210_13292_1_1532343514562_7366109_206-14076-0_sp1.1 from training. Duration: 0.936375 2023-10-03 00:29:40,090 WARNING [train.py:1204] (2/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_55-31904-0_sp0.9 from training. Duration: 0.97775 2023-10-03 00:29:41,371 INFO [train.py:1046] (2/4) Epoch 31, batch 1700, loss[loss=0.1505, simple_loss=0.2351, pruned_loss=0.03293, over 24326.00 frames. ], tot_loss[loss=0.1629, simple_loss=0.2416, pruned_loss=0.04205, over 4712520.02 frames. ], batch size: 61, lr: 3.30e-03, grad_scale: 16.0 2023-10-03 00:29:41,428 WARNING [train.py:1204] (2/4) Exclude cut with ID _14632_210_9988_1_1530691279392_3923200_161-129530-0 from training. Duration: 0.96 2023-10-03 00:29:41,491 WARNING [train.py:1204] (2/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_770-200430-0_sp1.1 from training. Duration: 0.8090625 2023-10-03 00:29:41,500 WARNING [train.py:1204] (2/4) Exclude cut with ID _6270_210_2084_1_1527850911573_3856420_26-277343-0_sp1.1 from training. Duration: 0.7454375 2023-10-03 00:29:41,501 WARNING [train.py:1204] (2/4) Exclude cut with ID _69884_210_9771_1_1535846392818_4166169_88-23776-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 00:29:42,962 WARNING [train.py:1204] (2/4) Exclude cut with ID _60860_210_8254_1_1534896015875_3557169_245-256666-0_sp1.1 from training. Duration: 0.8818125 2023-10-03 00:29:42,995 WARNING [train.py:1204] (2/4) Exclude cut with ID _5250_210_5219_1_1527249561806_8692110_636-251213-0_sp1.1 from training. Duration: 0.8545625 2023-10-03 00:29:44,324 WARNING [train.py:1204] (2/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_540-131164-0 from training. Duration: 0.92 2023-10-03 00:29:46,327 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_5931_210_2040_1_1527428481_4547999_321-193614-0_sp0.9 from training. Duration: 0.7444375 2023-10-03 00:29:54,985 WARNING [train.py:1204] (2/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_373-225403-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 00:29:58,273 WARNING [train.py:1204] (2/4) Exclude cut with ID _20622_210_3852_1_1531622427080_1093839_57-326027-0_sp1.1 from training. Duration: 0.97275 2023-10-03 00:30:02,499 WARNING [train.py:1204] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_605-257205-0_sp1.1 from training. Duration: 0.536375 2023-10-03 00:30:02,533 WARNING [train.py:1204] (2/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_442-320317-0_sp0.9 from training. Duration: 0.911125 2023-10-03 00:30:03,856 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_35361_210_4238_1_1533262810_7793476_675-87012-0_sp1.1 from training. Duration: 0.8090625 2023-10-03 00:30:05,210 WARNING [train.py:1204] (2/4) Exclude cut with ID _4184_210_2550_1_1526730192013_2219380_306-44736-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 00:30:05,391 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.ff3_skip_rate, batch_count=1073826.6666666667, ans=0.0 2023-10-03 00:30:09,222 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_124-113562-0 from training. Duration: 0.94 2023-10-03 00:30:10,808 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_2821_210_2715_1_1526108385_3763560_73-296673-0_sp0.9 from training. Duration: 0.92225 2023-10-03 00:30:10,829 WARNING [train.py:1204] (2/4) Exclude cut with ID _10301_210_5886_1_1529222275332_3910620_216-46959-0_sp1.1 from training. Duration: 0.92725 2023-10-03 00:30:12,851 WARNING [train.py:1204] (2/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_91-277684-0_sp0.9 from training. Duration: 0.82225 2023-10-03 00:30:13,120 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.bypass.scale_min, batch_count=1073893.3333333333, ans=0.2 2023-10-03 00:30:14,266 WARNING [train.py:1204] (2/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_351-294650-0_sp0.9 from training. Duration: 0.7 2023-10-03 00:30:15,684 WARNING [train.py:1204] (2/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_209-205365-0 from training. Duration: 0.79 2023-10-03 00:30:16,960 WARNING [train.py:1204] (2/4) Exclude cut with ID _14603_210_4220_1_1530615688441_3097319_97-325053-0 from training. Duration: 0.92 2023-10-03 00:30:18,863 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_49348_210_7927_1_1533954500_3691300_109-276430-0_sp1.1 from training. Duration: 0.92725 2023-10-03 00:30:18,975 WARNING [train.py:1204] (2/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_912-47473-0 from training. Duration: 0.68 2023-10-03 00:30:20,294 WARNING [train.py:1204] (2/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_96-320736-0_sp0.9 from training. Duration: 0.9555625 2023-10-03 00:30:29,755 WARNING [train.py:1204] (2/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_1252-235481-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 00:30:29,851 WARNING [train.py:1204] (2/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_813-261554-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 00:30:31,212 WARNING [train.py:1204] (2/4) Exclude cut with ID _21632_210_2253_1_1531994001765_3744140_110-274654-0_sp0.9 from training. Duration: 0.911125 2023-10-03 00:30:32,638 WARNING [train.py:1204] (2/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_381-317791-0_sp1.1 from training. Duration: 0.663625 2023-10-03 00:30:33,959 WARNING [train.py:1204] (2/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_51-117576-0_sp1.1 from training. Duration: 0.363625 2023-10-03 00:30:33,976 WARNING [train.py:1204] (2/4) Exclude cut with ID _42321_210_7770_1_1533468631352_4366895_104-215915-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 00:30:35,419 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_40896_210_15710_1_1533433956_7100802_216-22824-0_sp1.1 from training. Duration: 0.92725 2023-10-03 00:30:35,420 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_14831_210_7927_1_1530700200_5583610_181-27467-0 from training. Duration: 0.91 2023-10-03 00:30:36,744 WARNING [train.py:1204] (2/4) Exclude cut with ID _30304_210_6319_1_1532476827261_7582060_1024-265645-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 00:30:36,747 WARNING [train.py:1204] (2/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_412-233904-0_sp1.1 from training. Duration: 0.936375 2023-10-03 00:30:36,770 WARNING [train.py:1204] (2/4) Exclude cut with ID _30191_210_1664_1_1533189915047_7196670_911-223461-0_sp1.1 from training. Duration: 0.92725 2023-10-03 00:30:36,778 WARNING [train.py:1204] (2/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_558-148266-0_sp1.1 from training. Duration: 0.963625 2023-10-03 00:30:39,544 WARNING [train.py:1204] (2/4) Exclude cut with ID _58480_210_12392_1_1534811121111_3646160_212-164495-0_sp1.1 from training. Duration: 0.936375 2023-10-03 00:30:39,546 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_454-276429-0_sp0.9 from training. Duration: 0.9666875 2023-10-03 00:30:39,626 WARNING [train.py:1204] (2/4) Exclude cut with ID _24736_210_3279_1_1532332894218_2838700_73-197086-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 00:30:40,428 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.2.encoder.layers.0.feed_forward2.out_whiten, num_groups=1, num_channels=384, metric=4.61 vs. limit=15.0 2023-10-03 00:30:41,017 WARNING [train.py:1204] (2/4) Exclude cut with ID _5294_210_1747_1_1527249643928_3696179_529-242298-0_sp1.1 from training. Duration: 0.863625 2023-10-03 00:30:41,048 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_53555_210_5632_1_1534595220_7232297_345-171438-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 00:30:42,285 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.498e+02 1.835e+02 2.052e+02 2.305e+02 3.998e+02, threshold=4.103e+02, percent-clipped=1.0 2023-10-03 00:30:45,710 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_28211_210_6388_1_1532861506_4523224_22-25766-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 00:30:47,530 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7002_210_1794_1_1527939683_5157286_163-87552-0 from training. Duration: 0.86 2023-10-03 00:30:50,347 WARNING [train.py:1204] (2/4) Exclude cut with ID _56927_210_9969_1_1534848970840_3695229_409-225129-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 00:30:51,767 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_26192_210_7927_1_1532137269_1040115_21-219331-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 00:30:54,441 WARNING [train.py:1204] (2/4) Exclude cut with ID _15012_210_1968_1_1530856820429_5095350_662-210469-0 from training. Duration: 0.57 2023-10-03 00:30:55,837 INFO [train.py:1046] (2/4) Epoch 31, batch 1750, loss[loss=0.1551, simple_loss=0.2019, pruned_loss=0.05415, over 19303.00 frames. ], tot_loss[loss=0.162, simple_loss=0.24, pruned_loss=0.04198, over 4701919.16 frames. ], batch size: 388, lr: 3.30e-03, grad_scale: 16.0 2023-10-03 00:30:56,818 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=1074093.3333333333, ans=0.1 2023-10-03 00:30:56,918 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.nonlin_attention.balancer.prob, batch_count=1074093.3333333333, ans=0.125 2023-10-03 00:30:57,976 WARNING [train.py:1204] (2/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_582-191016-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 00:30:58,170 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.1.balancer1.prob, batch_count=1074093.3333333333, ans=0.125 2023-10-03 00:30:58,195 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.ff3_skip_rate, batch_count=1074093.3333333333, ans=0.0 2023-10-03 00:31:00,209 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=1074093.3333333333, ans=0.1 2023-10-03 00:31:01,336 WARNING [train.py:1204] (2/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_270-210032-0_sp1.1 from training. Duration: 0.963625 2023-10-03 00:31:01,377 WARNING [train.py:1204] (2/4) Exclude cut with ID _6789_210_1534_1_1527857937218_3118950_262-48853-0_sp0.9 from training. Duration: 0.62225 2023-10-03 00:31:02,679 WARNING [train.py:1204] (2/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_847-41334-0 from training. Duration: 0.44 2023-10-03 00:31:02,704 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_40896_210_15710_1_1533433956_7100802_573-46532-0_sp1.1 from training. Duration: 0.92725 2023-10-03 00:31:05,382 WARNING [train.py:1204] (2/4) Exclude cut with ID _21881_210_3525_1_1531825242757_3798380_247-102591-0_sp1.1 from training. Duration: 0.8909375 2023-10-03 00:31:05,405 WARNING [train.py:1204] (2/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_200-21395-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 00:31:09,538 WARNING [train.py:1204] (2/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_711-15688-0 from training. Duration: 0.43 2023-10-03 00:31:12,383 WARNING [train.py:1204] (2/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_53-75025-0_sp1.1 from training. Duration: 0.963625 2023-10-03 00:31:13,812 WARNING [train.py:1204] (2/4) Exclude cut with ID _6194_210_1534_1_1527511826160_3941194_321-148641-0 from training. Duration: 0.64 2023-10-03 00:31:13,838 WARNING [train.py:1204] (2/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_337-294431-0_sp1.1 from training. Duration: 0.936375 2023-10-03 00:31:15,806 WARNING [train.py:1204] (2/4) Exclude cut with ID _21632_210_2253_1_1531994001765_3744140_110-274654-0_sp1.1 from training. Duration: 0.7454375 2023-10-03 00:31:19,105 WARNING [train.py:1204] (2/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_135-174080-0_sp1.1 from training. Duration: 0.4818125 2023-10-03 00:31:20,415 WARNING [train.py:1204] (2/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_21-174514-0 from training. Duration: 0.43 2023-10-03 00:31:20,502 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder_embed.conv.2.prob, batch_count=1074160.0, ans=0.125 2023-10-03 00:31:21,908 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_38965_210_4852_1_1533174906_3484735_215-294589-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 00:31:23,389 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_548-294316-0 from training. Duration: 0.81 2023-10-03 00:31:24,259 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.2.encoder.layers.0.whiten, num_groups=1, num_channels=384, metric=5.92 vs. limit=12.0 2023-10-03 00:31:31,480 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_103-84010-0_sp1.1 from training. Duration: 0.62725 2023-10-03 00:31:35,557 WARNING [train.py:1204] (2/4) Exclude cut with ID _18199_210_6109_1_1531475789864_4288790_295-198247-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 00:31:35,579 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_13757_210_3528_1_1530440682_4107239_103-313224-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 00:31:38,413 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_34886_210_10633_1_1533203882_1755661_67-294673-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 00:31:38,428 WARNING [train.py:1204] (2/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_350-101734-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 00:31:41,244 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_37357_210_17024_1_1533089504_2316121_204-153986-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 00:31:42,601 WARNING [train.py:1204] (2/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_673-115569-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 00:31:45,248 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_489-230703-0_sp0.9 from training. Duration: 0.9444375 2023-10-03 00:31:45,310 WARNING [train.py:1204] (2/4) Exclude cut with ID _5250_210_5219_1_1527249561806_8692110_575-102496-0_sp1.1 from training. Duration: 0.936375 2023-10-03 00:31:46,654 WARNING [train.py:1204] (2/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_669-93163-0 from training. Duration: 0.98 2023-10-03 00:31:48,769 WARNING [train.py:1204] (2/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_249-281349-0_sp0.9 from training. Duration: 0.9444375 2023-10-03 00:31:50,606 WARNING [train.py:1204] (2/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_106-30782-0 from training. Duration: 0.8 2023-10-03 00:31:50,663 WARNING [train.py:1204] (2/4) Exclude cut with ID _8772_210_1850_1_1528635595972_3664011_398-320049-0_sp1.1 from training. Duration: 0.8 2023-10-03 00:31:50,856 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.conv_module2.balancer1.prob, batch_count=1074293.3333333333, ans=0.125 2023-10-03 00:31:53,364 WARNING [train.py:1204] (2/4) Exclude cut with ID _44778_210_2054_1_1533798127203_7443090_516-151727-0_sp1.1 from training. Duration: 0.963625 2023-10-03 00:31:53,418 WARNING [train.py:1204] (2/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_325-62802-0_sp1.1 from training. Duration: 0.7909375 2023-10-03 00:31:57,863 WARNING [train.py:1204] (2/4) Exclude cut with ID _14368_210_4220_1_1530532714870_3595529_93-58002-0_sp0.9 from training. Duration: 0.6333125 2023-10-03 00:31:59,177 WARNING [train.py:1204] (2/4) Exclude cut with ID _4681_210_3525_1_1527251040321_1235713_123-24428-0_sp1.1 from training. Duration: 0.436375 2023-10-03 00:32:00,535 WARNING [train.py:1204] (2/4) Exclude cut with ID _5250_210_5219_1_1527249561806_8692110_1185-56473-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 00:32:00,657 WARNING [train.py:1204] (2/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_96-244299-0_sp1.1 from training. Duration: 0.8 2023-10-03 00:32:01,579 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.1.encoder.layers.0.feed_forward1.out_whiten, num_groups=1, num_channels=256, metric=4.87 vs. limit=15.0 2023-10-03 00:32:06,404 WARNING [train.py:1204] (2/4) Exclude cut with ID _59500_210_8254_1_1534809610680_3736009_104-72815-0_sp1.1 from training. Duration: 0.963625 2023-10-03 00:32:07,929 WARNING [train.py:1204] (2/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_468-283113-0_sp1.1 from training. Duration: 0.87275 2023-10-03 00:32:09,332 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6239_210_4211_1_1527765584_4306698_528-102186-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 00:32:10,695 INFO [train.py:1046] (2/4) Epoch 31, batch 1800, loss[loss=0.1356, simple_loss=0.2098, pruned_loss=0.03069, over 17127.00 frames. ], tot_loss[loss=0.1611, simple_loss=0.2395, pruned_loss=0.04141, over 4702921.66 frames. ], batch size: 37, lr: 3.30e-03, grad_scale: 16.0 2023-10-03 00:32:10,789 WARNING [train.py:1204] (2/4) Exclude cut with ID _45297_210_2721_1_1533715351381_3264949_351-147136-0 from training. Duration: 0.87 2023-10-03 00:32:10,794 WARNING [train.py:1204] (2/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_520-51744-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 00:32:12,188 WARNING [train.py:1204] (2/4) Exclude cut with ID _5166_210_2084_1_1527073197160_4087993_310-114452-0_sp1.1 from training. Duration: 0.536375 2023-10-03 00:32:12,191 WARNING [train.py:1204] (2/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_450-165710-0_sp1.1 from training. Duration: 0.92725 2023-10-03 00:32:12,201 WARNING [train.py:1204] (2/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_280-120942-0_sp1.1 from training. Duration: 0.7 2023-10-03 00:32:12,229 WARNING [train.py:1204] (2/4) Exclude cut with ID _65585_210_12210_1_1535450451584_3764098_112-294301-0_sp0.9 from training. Duration: 0.97775 2023-10-03 00:32:12,282 WARNING [train.py:1204] (2/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_98-51676-0_sp0.9 from training. Duration: 0.811125 2023-10-03 00:32:16,260 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_33348_210_5632_1_1533086912_3324030_85-28583-0_sp1.1 from training. Duration: 0.7454375 2023-10-03 00:32:17,471 WARNING [train.py:1204] (2/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_336-208232-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 00:32:18,870 WARNING [train.py:1204] (2/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_339-104791-0_sp0.9 from training. Duration: 0.5444375 2023-10-03 00:32:20,794 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.nonlin_attention.balancer.min_positive, batch_count=1074426.6666666667, ans=0.05 2023-10-03 00:32:23,332 WARNING [train.py:1204] (2/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_559-29840-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 00:32:26,490 WARNING [train.py:1204] (2/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_117-290916-0_sp0.9 from training. Duration: 0.5666875 2023-10-03 00:32:26,579 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_43802_210_2290_1_1533516203_8829504_185-112289-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 00:32:27,124 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.4.encoder.layers.0.self_attn1.whiten, num_groups=1, num_channels=384, metric=15.23 vs. limit=22.5 2023-10-03 00:32:29,942 WARNING [train.py:1204] (2/4) Exclude cut with ID _59256_210_5986_1_1535076085997_3589120_155-298797-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 00:32:31,388 WARNING [train.py:1204] (2/4) Exclude cut with ID _44659_210_2721_1_1534036120061_6614860_246-206531-0_sp1.1 from training. Duration: 0.92725 2023-10-03 00:32:31,586 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.conv_skip_rate, batch_count=1074493.3333333333, ans=0.0 2023-10-03 00:32:32,704 WARNING [train.py:1204] (2/4) Exclude cut with ID _23818_210_6228_1_1531917063884_3676920_469-151944-0_sp1.1 from training. Duration: 0.92725 2023-10-03 00:32:34,125 WARNING [train.py:1204] (2/4) Exclude cut with ID _13252_210_1925_1_1530671372047_3336909_88-276668-0_sp1.1 from training. Duration: 0.9 2023-10-03 00:32:36,217 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.attention_skip_rate, batch_count=1074493.3333333333, ans=0.0 2023-10-03 00:32:37,020 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_15101_210_7927_1_1530786415_5033274_320-327527-0_sp1.1 from training. Duration: 0.87275 2023-10-03 00:32:37,036 WARNING [train.py:1204] (2/4) Exclude cut with ID _14998_210_5219_1_1530705582080_7474970_446-112117-0 from training. Duration: 0.9 2023-10-03 00:32:37,098 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_30280_210_1881_1_1532872779_3660139_400-254159-0_sp1.1 from training. Duration: 0.936375 2023-10-03 00:32:37,270 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.feed_forward3.hidden_balancer.prob, batch_count=1074493.3333333333, ans=0.125 2023-10-03 00:32:37,329 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.conv_skip_rate, batch_count=1074493.3333333333, ans=0.0 2023-10-03 00:32:39,864 WARNING [train.py:1204] (2/4) Exclude cut with ID _40284_210_2084_1_1533513679814_3704279_158-345341-0_sp1.1 from training. Duration: 0.936375 2023-10-03 00:32:43,950 WARNING [train.py:1204] (2/4) Exclude cut with ID 3033-130750-0096-55598-0 from training. Duration: 0.83 2023-10-03 00:32:45,274 WARNING [train.py:1204] (2/4) Exclude cut with ID _9389_210_1664_1_1529217337301_5289269_379-128794-0 from training. Duration: 0.85 2023-10-03 00:32:45,289 WARNING [train.py:1204] (2/4) Exclude cut with ID _3675_210_4557_1_1526639934408_4182089_456-18305-0 from training. Duration: 0.76 2023-10-03 00:32:46,624 WARNING [train.py:1204] (2/4) Exclude cut with ID _29853_210_7497_1_1532505600851_6642110_571-249460-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 00:32:46,707 WARNING [train.py:1204] (2/4) Exclude cut with ID _4068_210_2629_1_1526697350395_1546259_230-325568-0_sp1.1 from training. Duration: 0.92725 2023-10-03 00:32:46,708 WARNING [train.py:1204] (2/4) Exclude cut with ID _34415_210_8750_1_1533085213375_3451116_213-267570-0_sp1.1 from training. Duration: 0.963625 2023-10-03 00:32:49,361 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6767_210_3273_1_1527855783_1718932_142-127132-0_sp1.1 from training. Duration: 0.863625 2023-10-03 00:32:52,978 WARNING [train.py:1204] (2/4) Exclude cut with ID IC0653W0001-322925 from training. Duration: 0.896 2023-10-03 00:32:54,742 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_4528_210_3273_1_1527076654_3276777_209-219804-0_sp1.1 from training. Duration: 0.62725 2023-10-03 00:32:56,187 WARNING [train.py:1204] (2/4) Exclude cut with ID _55844_210_2346_1_1534471212569_7625707_187-204971-0_sp1.1 from training. Duration: 0.936375 2023-10-03 00:32:59,196 WARNING [train.py:1204] (2/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_369-325184-0 from training. Duration: 0.99 2023-10-03 00:32:59,229 WARNING [train.py:1204] (2/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_791-45149-0 from training. Duration: 0.86 2023-10-03 00:33:00,379 WARNING [train.py:1204] (2/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_317-211107-0_sp1.1 from training. Duration: 0.836375 2023-10-03 00:33:00,470 WARNING [train.py:1204] (2/4) Exclude cut with ID _30800_210_22382_1_1532499347904_4515959_488-330913-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 00:33:00,632 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=1074626.6666666667, ans=0.1 2023-10-03 00:33:01,282 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.1.encoder.layers.1.conv_module1.whiten, num_groups=1, num_channels=256, metric=10.29 vs. limit=15.0 2023-10-03 00:33:01,852 WARNING [train.py:1204] (2/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_506-42614-0_sp1.1 from training. Duration: 0.7181875 2023-10-03 00:33:06,716 WARNING [train.py:1204] (2/4) Exclude cut with ID _8696_210_1876_1_1528545632440_3345058_209-54069-0 from training. Duration: 0.67 2023-10-03 00:33:10,966 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.533e+02 1.922e+02 2.138e+02 2.507e+02 4.896e+02, threshold=4.277e+02, percent-clipped=2.0 2023-10-03 00:33:12,466 WARNING [train.py:1204] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_215-202973-0_sp1.1 from training. Duration: 0.8 2023-10-03 00:33:13,796 WARNING [train.py:1204] (2/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_321-215742-0 from training. Duration: 0.5 2023-10-03 00:33:13,865 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_40896_210_15710_1_1533433956_7100802_428-32114-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 00:33:13,873 WARNING [train.py:1204] (2/4) Exclude cut with ID _27013_210_1389_1_1532437173240_3252649_467-263151-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 00:33:13,923 WARNING [train.py:1204] (2/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_734-129761-0_sp0.9 from training. Duration: 0.911125 2023-10-03 00:33:13,954 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_521-214276-0 from training. Duration: 0.44 2023-10-03 00:33:18,019 WARNING [train.py:1204] (2/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_80-147301-0_sp1.1 from training. Duration: 0.763625 2023-10-03 00:33:18,021 WARNING [train.py:1204] (2/4) Exclude cut with ID _9679_210_7497_1_1529129212906_4363929_56-307796-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 00:33:19,484 WARNING [train.py:1204] (2/4) Exclude cut with ID _14832_210_8750_1_1530691351539_3171348_1-54315-0 from training. Duration: 0.87 2023-10-03 00:33:19,485 WARNING [train.py:1204] (2/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_558-155750-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 00:33:19,970 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.4.encoder.layers.2.self_attn1.whiten, num_groups=1, num_channels=384, metric=16.12 vs. limit=22.5 2023-10-03 00:33:21,896 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.3.encoder.layers.2.feed_forward3.out_whiten, num_groups=1, num_channels=512, metric=7.05 vs. limit=15.0 2023-10-03 00:33:22,600 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_142-309370-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 00:33:22,636 WARNING [train.py:1204] (2/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_18-46756-0_sp0.9 from training. Duration: 0.87775 2023-10-03 00:33:24,466 INFO [train.py:1046] (2/4) Epoch 31, batch 1850, loss[loss=0.172, simple_loss=0.2576, pruned_loss=0.04315, over 24341.00 frames. ], tot_loss[loss=0.1613, simple_loss=0.2399, pruned_loss=0.04133, over 4708820.45 frames. ], batch size: 77, lr: 3.30e-03, grad_scale: 16.0 2023-10-03 00:33:24,525 WARNING [train.py:1204] (2/4) Exclude cut with ID _61148_210_6897_1_1535094022726_4217610_279-212159-0_sp1.1 from training. Duration: 0.936375 2023-10-03 00:33:24,627 WARNING [train.py:1204] (2/4) Exclude cut with ID _21987_210_5854_1_1531821500482_3637038_164-2641-0_sp1.1 from training. Duration: 0.936375 2023-10-03 00:33:25,842 WARNING [train.py:1204] (2/4) Exclude cut with ID _8064_210_5399_1_1528372299291_4282973_395-340852-0_sp1.1 from training. Duration: 0.8454375 2023-10-03 00:33:27,289 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1077-184261-0_sp1.1 from training. Duration: 0.963625 2023-10-03 00:33:27,297 WARNING [train.py:1204] (2/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_1176-204992-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 00:33:30,533 WARNING [train.py:1204] (2/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_563-347832-0_sp1.1 from training. Duration: 0.8090625 2023-10-03 00:33:30,613 WARNING [train.py:1204] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_380-204756-0_sp0.9 from training. Duration: 0.9555625 2023-10-03 00:33:30,878 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.conv_module2.balancer2.prob, batch_count=1074760.0, ans=0.125 2023-10-03 00:33:36,656 WARNING [train.py:1204] (2/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_212-10501-0_sp1.1 from training. Duration: 0.8545625 2023-10-03 00:33:36,679 WARNING [train.py:1204] (2/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_445-117398-0 from training. Duration: 0.89 2023-10-03 00:33:39,484 WARNING [train.py:1204] (2/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_29-280622-0 from training. Duration: 0.82 2023-10-03 00:33:42,115 WARNING [train.py:1204] (2/4) Exclude cut with ID _26664_210_7770_1_1532258943280_3418270_187-80815-0 from training. Duration: 0.93 2023-10-03 00:33:46,300 WARNING [train.py:1204] (2/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_414-349640-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 00:33:46,317 WARNING [train.py:1204] (2/4) Exclude cut with ID _45021_210_9994_1_1533619751094_3330288_96-239010-0 from training. Duration: 0.91 2023-10-03 00:33:46,327 WARNING [train.py:1204] (2/4) Exclude cut with ID _5189_210_4557_1_1527248319428_3769229_199-53526-0_sp0.9 from training. Duration: 0.4666875 2023-10-03 00:33:58,104 WARNING [train.py:1204] (2/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_63-101996-0_sp1.1 from training. Duration: 0.8 2023-10-03 00:33:59,451 WARNING [train.py:1204] (2/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_199-86432-0 from training. Duration: 0.98 2023-10-03 00:34:02,798 WARNING [train.py:1204] (2/4) Exclude cut with ID _36399_210_16049_1_1532998771377_3698333_207-259013-0_sp1.1 from training. Duration: 0.92725 2023-10-03 00:34:04,125 WARNING [train.py:1204] (2/4) Exclude cut with ID _62574_210_2655_1_1535180407058_3933249_567-111756-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 00:34:07,649 WARNING [train.py:1204] (2/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_436-318113-0 from training. Duration: 0.75 2023-10-03 00:34:07,695 WARNING [train.py:1204] (2/4) Exclude cut with ID _53379_210_20034_1_1534222027363_3854654_43-305214-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 00:34:07,716 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_15126_210_6753_1_1530778677_2591185_113-157651-0_sp1.1 from training. Duration: 0.6181875 2023-10-03 00:34:09,175 WARNING [train.py:1204] (2/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_146-218763-0_sp1.1 from training. Duration: 0.7818125 2023-10-03 00:34:09,837 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.3.encoder.layers.3.feed_forward1.out_whiten, num_groups=1, num_channels=512, metric=5.69 vs. limit=15.0 2023-10-03 00:34:10,740 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.conv_module1.balancer2.min_positive, batch_count=1074960.0, ans=0.05 2023-10-03 00:34:11,849 WARNING [train.py:1204] (2/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_714-310538-0_sp0.9 from training. Duration: 0.9555625 2023-10-03 00:34:13,435 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=1074960.0, ans=0.1 2023-10-03 00:34:14,488 WARNING [train.py:1204] (2/4) Exclude cut with ID _24271_210_12658_1_1531987270022_7190519_709-16721-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 00:34:15,995 WARNING [train.py:1204] (2/4) Exclude cut with ID _5190_210_4557_1_1527244368799_373439_28-197030-0_sp0.9 from training. Duration: 0.8 2023-10-03 00:34:17,284 WARNING [train.py:1204] (2/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_267-197536-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 00:34:17,316 WARNING [train.py:1204] (2/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_658-282610-0_sp1.1 from training. Duration: 0.4818125 2023-10-03 00:34:17,332 WARNING [train.py:1204] (2/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_257-67050-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 00:34:17,609 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.conv_module2.balancer2.min_abs, batch_count=1074960.0, ans=0.5 2023-10-03 00:34:18,018 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.2.encoder.layers.1.nonlin_attention.whiten1, num_groups=1, num_channels=288, metric=5.52 vs. limit=10.0 2023-10-03 00:34:18,716 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_14906_210_7927_1_1530754190_9408633_961-53402-0_sp1.1 from training. Duration: 0.936375 2023-10-03 00:34:21,440 WARNING [train.py:1204] (2/4) Exclude cut with ID _60113_210_16271_1_1535164212481_4386079_490-280290-0_sp1.1 from training. Duration: 0.8909375 2023-10-03 00:34:22,916 WARNING [train.py:1204] (2/4) Exclude cut with ID _14100_210_8341_1_1530529320613_3678069_258-224599-0 from training. Duration: 0.77 2023-10-03 00:34:22,977 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6961_210_2087_1_1527937168_6815011_565-326898-0_sp1.1 from training. Duration: 0.936375 2023-10-03 00:34:28,166 WARNING [train.py:1204] (2/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_239-316802-0_sp1.1 from training. Duration: 0.62725 2023-10-03 00:34:29,448 WARNING [train.py:1204] (2/4) Exclude cut with ID _55857_210_2346_1_1534644007831_6898209_745-98391-0_sp1.1 from training. Duration: 0.6909375 2023-10-03 00:34:29,451 WARNING [train.py:1204] (2/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_154-135696-0 from training. Duration: 0.8 2023-10-03 00:34:29,453 WARNING [train.py:1204] (2/4) Exclude cut with ID _14368_210_4220_1_1530532714870_3595529_93-58002-0 from training. Duration: 0.57 2023-10-03 00:34:30,818 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9025W0005-507335 from training. Duration: 0.896 2023-10-03 00:34:32,558 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9029W0034-509435 from training. Duration: 0.64 2023-10-03 00:34:33,944 WARNING [train.py:1204] (2/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_76-203867-0_sp1.1 from training. Duration: 0.6545625 2023-10-03 00:34:33,954 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_46639_210_8341_1_1533726088_4495500_66-346325-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 00:34:33,960 WARNING [train.py:1204] (2/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_145-137790-0_sp0.9 from training. Duration: 0.92225 2023-10-03 00:34:33,983 WARNING [train.py:1204] (2/4) Exclude cut with ID _36522_210_8887_1_1533177030611_1772478_7-327299-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 00:34:35,379 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9042W0009-515615 from training. Duration: 0.896 2023-10-03 00:34:35,394 WARNING [train.py:1204] (2/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_39-248870-0_sp0.9 from training. Duration: 0.7666875 2023-10-03 00:34:35,430 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_35441_210_16554_1_1533347572_4092680_362-242912-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 00:34:37,215 WARNING [train.py:1204] (2/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_216-343007-0_sp1.1 from training. Duration: 0.6 2023-10-03 00:34:37,290 WARNING [train.py:1204] (2/4) Exclude cut with ID _3867_210_1883_1_1526641200575_4475830_272-215563-0_sp0.9 from training. Duration: 0.6333125 2023-10-03 00:34:38,575 INFO [train.py:1046] (2/4) Epoch 31, batch 1900, loss[loss=0.1638, simple_loss=0.2407, pruned_loss=0.0434, over 23570.00 frames. ], tot_loss[loss=0.1632, simple_loss=0.2418, pruned_loss=0.04232, over 4704387.89 frames. ], batch size: 256, lr: 3.30e-03, grad_scale: 16.0 2023-10-03 00:34:38,652 WARNING [train.py:1204] (2/4) Exclude cut with ID _21783_210_11357_1_1531742458730_3762679_553-175603-0_sp1.1 from training. Duration: 0.92725 2023-10-03 00:34:38,671 WARNING [train.py:1204] (2/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_963-187109-0 from training. Duration: 0.99 2023-10-03 00:34:41,435 WARNING [train.py:1204] (2/4) Exclude cut with ID _44973_210_1511_1_1533794381163_3634770_495-211466-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 00:34:41,454 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9068W0002-529085 from training. Duration: 0.896 2023-10-03 00:34:41,478 WARNING [train.py:1204] (2/4) Exclude cut with ID _5294_210_1747_1_1527249643928_3696179_180-100713-0_sp1.1 from training. Duration: 0.736375 2023-10-03 00:34:42,756 WARNING [train.py:1204] (2/4) Exclude cut with ID _35799_210_5854_1_1533263487887_3529071_149-49689-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 00:34:48,249 WARNING [train.py:1204] (2/4) Exclude cut with ID _67725_210_27767_1_1535608756671_5105359_36-220794-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 00:34:49,626 WARNING [train.py:1204] (2/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_338-96520-0_sp1.1 from training. Duration: 0.8545625 2023-10-03 00:34:49,689 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9104W0004-547160 from training. Duration: 0.896 2023-10-03 00:34:51,061 WARNING [train.py:1204] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_330-327861-0 from training. Duration: 0.67 2023-10-03 00:34:53,071 WARNING [train.py:1204] (2/4) Exclude cut with ID _14087_210_2253_1_1530529224512_3014029_156-257607-0_sp0.9 from training. Duration: 0.92225 2023-10-03 00:34:53,134 WARNING [train.py:1204] (2/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_476-322050-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 00:34:54,558 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9118W0017-553385 from training. Duration: 0.896 2023-10-03 00:34:54,591 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9120W0028-554435 from training. Duration: 0.896 2023-10-03 00:34:56,251 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=1075160.0, ans=0.1 2023-10-03 00:34:57,383 WARNING [train.py:1204] (2/4) Exclude cut with ID _56524_210_9642_1_1534572011779_3619170_145-310842-0 from training. Duration: 0.99 2023-10-03 00:34:59,192 WARNING [train.py:1204] (2/4) Exclude cut with ID _51631_210_8254_1_1534129228420_4084794_270-222508-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 00:35:03,891 WARNING [train.py:1204] (2/4) Exclude cut with ID _14268_210_9206_1_1530622771285_4714099_331-105351-0 from training. Duration: 0.65 2023-10-03 00:35:05,284 WARNING [train.py:1204] (2/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_40-129756-0 from training. Duration: 0.81 2023-10-03 00:35:12,310 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.1.encoder.layers.1.nonlin_attention.whiten2, num_groups=1, num_channels=256, metric=7.65 vs. limit=15.0 2023-10-03 00:35:14,275 WARNING [train.py:1204] (2/4) Exclude cut with ID _3541_210_2477_1_1526475408709_3263973_231-104736-0 from training. Duration: 0.78 2023-10-03 00:35:16,994 WARNING [train.py:1204] (2/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_214-291369-0 from training. Duration: 0.63 2023-10-03 00:35:17,001 WARNING [train.py:1204] (2/4) Exclude cut with ID _62574_210_2655_1_1535180407058_3933249_369-84347-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 00:35:17,236 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.feed_forward2.hidden_balancer.prob, batch_count=1075226.6666666667, ans=0.125 2023-10-03 00:35:18,297 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9210W0111-599240 from training. Duration: 0.896 2023-10-03 00:35:18,302 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_92-246270-0 from training. Duration: 0.85 2023-10-03 00:35:18,332 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_37152_210_9697_1_1533106573_3844188_29-239070-0 from training. Duration: 0.98 2023-10-03 00:35:19,660 WARNING [train.py:1204] (2/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_239-22076-0 from training. Duration: 0.85 2023-10-03 00:35:19,662 WARNING [train.py:1204] (2/4) Exclude cut with ID _65962_210_29927_1_1535436026519_4174930_368-44639-0_sp1.1 from training. Duration: 0.97275 2023-10-03 00:35:24,460 WARNING [train.py:1204] (2/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_152-212533-0 from training. Duration: 0.97 2023-10-03 00:35:26,255 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.self_attn_weights.pos_emb_skip_rate, batch_count=1075293.3333333333, ans=0.0 2023-10-03 00:35:27,353 WARNING [train.py:1204] (2/4) Exclude cut with ID _14603_210_4220_1_1530615688441_3097319_90-31383-0_sp1.1 from training. Duration: 0.7909375 2023-10-03 00:35:27,672 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.balancer1.prob, batch_count=1075293.3333333333, ans=0.125 2023-10-03 00:35:30,130 WARNING [train.py:1204] (2/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_196-152868-0_sp1.1 from training. Duration: 0.92725 2023-10-03 00:35:30,132 WARNING [train.py:1204] (2/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_140-181176-0 from training. Duration: 0.92 2023-10-03 00:35:31,999 WARNING [train.py:1204] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_168-32906-0_sp0.9 from training. Duration: 0.7666875 2023-10-03 00:35:37,971 WARNING [train.py:1204] (2/4) Exclude cut with ID _14922_210_6228_1_1530707435273_3693310_13-165512-0 from training. Duration: 0.82 2023-10-03 00:35:39,194 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.518e+02 2.002e+02 2.260e+02 2.885e+02 4.012e+02, threshold=4.521e+02, percent-clipped=0.0 2023-10-03 00:35:39,261 WARNING [train.py:1204] (2/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_381-317791-0_sp0.9 from training. Duration: 0.811125 2023-10-03 00:35:42,900 WARNING [train.py:1204] (2/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_63-267585-0_sp1.1 from training. Duration: 0.8181875 2023-10-03 00:35:42,902 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_326-303945-0_sp1.1 from training. Duration: 0.9 2023-10-03 00:35:42,920 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_194-143381-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 00:35:43,158 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.conv_module1.balancer2.prob, batch_count=1075360.0, ans=0.125 2023-10-03 00:35:44,270 WARNING [train.py:1204] (2/4) Exclude cut with ID _39290_210_4220_1_1533862778760_3075118_194-16667-0_sp1.1 from training. Duration: 0.963625 2023-10-03 00:35:45,613 WARNING [train.py:1204] (2/4) Exclude cut with ID _15012_210_1968_1_1530856820429_5095350_662-210469-0_sp0.9 from training. Duration: 0.6333125 2023-10-03 00:35:45,652 WARNING [train.py:1204] (2/4) Exclude cut with ID _4697_210_2069_1_1526815801953_3823098_68-219807-0_sp1.1 from training. Duration: 0.42725 2023-10-03 00:35:45,710 WARNING [train.py:1204] (2/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_321-22821-0_sp0.9 from training. Duration: 0.988875 2023-10-03 00:35:48,364 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_14056_210_4163_1_1530604483_7572827_1037-45317-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 00:35:48,365 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_403-39619-0_sp1.1 from training. Duration: 0.763625 2023-10-03 00:35:51,185 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_20120_210_6753_1_1531643035_5482859_510-96996-0_sp0.9 from training. Duration: 0.9666875 2023-10-03 00:35:51,188 WARNING [train.py:1204] (2/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_225-180155-0_sp1.1 from training. Duration: 0.92725 2023-10-03 00:35:52,435 INFO [train.py:1046] (2/4) Epoch 31, batch 1950, loss[loss=0.2132, simple_loss=0.2803, pruned_loss=0.07299, over 19397.00 frames. ], tot_loss[loss=0.1637, simple_loss=0.2422, pruned_loss=0.04258, over 4704573.23 frames. ], batch size: 388, lr: 3.30e-03, grad_scale: 16.0 2023-10-03 00:35:52,479 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_278-272612-0_sp0.9 from training. Duration: 0.811125 2023-10-03 00:35:52,566 WARNING [train.py:1204] (2/4) Exclude cut with ID _53713_210_24116_1_1534314487762_7416751_691-70244-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 00:35:55,721 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6788_210_1388_1_1527898698_4562021_91-197981-0_sp1.1 from training. Duration: 0.7545625 2023-10-03 00:35:56,506 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.1.encoder.layers.1.feed_forward2.out_whiten, num_groups=1, num_channels=256, metric=10.28 vs. limit=15.0 2023-10-03 00:35:57,417 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.conv_module1.balancer1.prob, batch_count=1075426.6666666667, ans=0.125 2023-10-03 00:35:58,512 WARNING [train.py:1204] (2/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_60-18123-0_sp0.9 from training. Duration: 0.97775 2023-10-03 00:35:58,566 WARNING [train.py:1204] (2/4) Exclude cut with ID _35799_210_5854_1_1533263487887_3529071_5-214617-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 00:35:58,571 WARNING [train.py:1204] (2/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_890-5117-0_sp1.1 from training. Duration: 0.7090625 2023-10-03 00:36:01,328 WARNING [train.py:1204] (2/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_660-211088-0 from training. Duration: 0.62 2023-10-03 00:36:01,389 WARNING [train.py:1204] (2/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_314-126819-0_sp0.9 from training. Duration: 0.5555625 2023-10-03 00:36:01,428 WARNING [train.py:1204] (2/4) Exclude cut with ID _21632_210_2253_1_1531994001765_3744140_362-66225-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 00:36:01,699 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.feed_forward3.hidden_balancer.prob, batch_count=1075426.6666666667, ans=0.125 2023-10-03 00:36:03,299 WARNING [train.py:1204] (2/4) Exclude cut with ID _61118_210_9527_1_1534897668884_3752630_173-258555-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 00:36:05,990 WARNING [train.py:1204] (2/4) Exclude cut with ID _7719_210_3525_1_1528456153163_4296960_88-250259-0_sp0.9 from training. Duration: 0.9333125 2023-10-03 00:36:06,022 WARNING [train.py:1204] (2/4) Exclude cut with ID _15140_210_9288_1_1530791388450_4880149_539-153708-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 00:36:06,051 WARNING [train.py:1204] (2/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_253-42544-0_sp1.1 from training. Duration: 0.97275 2023-10-03 00:36:08,032 WARNING [train.py:1204] (2/4) Exclude cut with ID _33640_210_2655_1_1533117605479_4855469_479-296877-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 00:36:10,812 WARNING [train.py:1204] (2/4) Exclude cut with ID 3033-130750-0096-55598-0_sp1.1 from training. Duration: 0.7545625 2023-10-03 00:36:10,831 WARNING [train.py:1204] (2/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_191-41023-0_sp1.1 from training. Duration: 0.6454375 2023-10-03 00:36:10,850 WARNING [train.py:1204] (2/4) Exclude cut with ID _8365_210_2064_1_1528592311070_4677690_431-294102-0_sp1.1 from training. Duration: 0.8090625 2023-10-03 00:36:10,877 WARNING [train.py:1204] (2/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_893-296030-0_sp1.1 from training. Duration: 0.97275 2023-10-03 00:36:13,656 WARNING [train.py:1204] (2/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_401-343820-0_sp1.1 from training. Duration: 0.97275 2023-10-03 00:36:17,779 WARNING [train.py:1204] (2/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_127-566-0_sp1.1 from training. Duration: 0.763625 2023-10-03 00:36:17,780 WARNING [train.py:1204] (2/4) Exclude cut with ID _14100_210_8341_1_1530529320613_3678069_126-272847-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 00:36:17,799 WARNING [train.py:1204] (2/4) Exclude cut with ID _15133_210_6228_1_1530794512074_3080379_29-264458-0_sp1.1 from training. Duration: 0.52725 2023-10-03 00:36:17,800 WARNING [train.py:1204] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_127-119109-0 from training. Duration: 0.72 2023-10-03 00:36:19,216 WARNING [train.py:1204] (2/4) Exclude cut with ID _6537_210_1883_1_1527987559710_3597129_382-133062-0_sp1.1 from training. Duration: 0.6181875 2023-10-03 00:36:19,234 WARNING [train.py:1204] (2/4) Exclude cut with ID _29529_210_7234_1_1532393961455_3977460_391-219682-0_sp1.1 from training. Duration: 0.936375 2023-10-03 00:36:20,587 WARNING [train.py:1204] (2/4) Exclude cut with ID _37537_210_2253_1_1533171758568_3421949_96-137189-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 00:36:25,338 WARNING [train.py:1204] (2/4) Exclude cut with ID _58414_210_8254_1_1535421609162_7151182_15-276301-0_sp1.1 from training. Duration: 0.97275 2023-10-03 00:36:27,384 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.3.encoder.layers.2.feed_forward1.out_whiten, num_groups=1, num_channels=512, metric=7.54 vs. limit=15.0 2023-10-03 00:36:28,137 WARNING [train.py:1204] (2/4) Exclude cut with ID _59502_210_12210_1_1535107024698_1835139_110-289074-0_sp1.1 from training. Duration: 0.92725 2023-10-03 00:36:31,139 WARNING [train.py:1204] (2/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_367-61330-0_sp1.1 from training. Duration: 0.6818125 2023-10-03 00:36:34,303 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.5.encoder.layers.1.self_attn2.whiten, num_groups=1, num_channels=256, metric=12.00 vs. limit=22.5 2023-10-03 00:36:35,806 WARNING [train.py:1204] (2/4) Exclude cut with ID _45297_210_2721_1_1533715351381_3264949_353-9541-0_sp1.1 from training. Duration: 0.8818125 2023-10-03 00:36:35,848 WARNING [train.py:1204] (2/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_85-138257-0_sp0.9 from training. Duration: 0.788875 2023-10-03 00:36:37,597 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_3787_210_2040_1_1526477544_5478586_603-120324-0 from training. Duration: 0.81 2023-10-03 00:36:37,633 WARNING [train.py:1204] (2/4) Exclude cut with ID _42863_210_9070_1_1533902398788_3696920_411-204131-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 00:36:42,341 WARNING [train.py:1204] (2/4) Exclude cut with ID _30196_210_1664_1_1533708496522_7074140_822-289909-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 00:36:43,713 WARNING [train.py:1204] (2/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_32-335103-0_sp0.9 from training. Duration: 0.911125 2023-10-03 00:36:43,761 WARNING [train.py:1204] (2/4) Exclude cut with ID _6493_210_1747_1_1528459210424_4012470_513-140967-0_sp0.9 from training. Duration: 0.87775 2023-10-03 00:36:52,012 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_47896_210_20508_1_1534492518_5958868_439-257766-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 00:36:52,077 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_1294-63152-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 00:36:54,866 WARNING [train.py:1204] (2/4) Exclude cut with ID _59170_210_6721_1_1534983633960_4203335_319-166512-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 00:36:56,277 WARNING [train.py:1204] (2/4) Exclude cut with ID _3541_210_2477_1_1526475408709_3263973_146-3470-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 00:36:59,668 WARNING [train.py:1204] (2/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_678-159307-0_sp1.1 from training. Duration: 0.77275 2023-10-03 00:36:59,713 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_343-48822-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 00:36:59,764 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_4692_210_3528_1_1526899755_4323254_107-44401-0 from training. Duration: 0.86 2023-10-03 00:36:59,769 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6291_210_3273_1_1527683649_1078498_74-292608-0_sp1.1 from training. Duration: 0.6545625 2023-10-03 00:36:59,846 WARNING [train.py:1204] (2/4) Exclude cut with ID _55644_210_11856_1_1534593599582_4103719_293-248694-0_sp1.1 from training. Duration: 0.97275 2023-10-03 00:37:01,283 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_3172_210_2087_1_1526292929_3987865_197-97762-0 from training. Duration: 0.75 2023-10-03 00:37:02,734 WARNING [train.py:1204] (2/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_264-348896-0_sp0.9 from training. Duration: 0.9444375 2023-10-03 00:37:06,020 WARNING [train.py:1204] (2/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_209-205365-0_sp0.9 from training. Duration: 0.87775 2023-10-03 00:37:07,729 INFO [train.py:1046] (2/4) Epoch 31, batch 2000, loss[loss=0.1511, simple_loss=0.2192, pruned_loss=0.04144, over 22773.00 frames. ], tot_loss[loss=0.1645, simple_loss=0.2433, pruned_loss=0.04285, over 4701392.51 frames. ], batch size: 322, lr: 3.30e-03, grad_scale: 32.0 2023-10-03 00:37:07,813 WARNING [train.py:1204] (2/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_222-198127-0_sp0.9 from training. Duration: 0.9333125 2023-10-03 00:37:08,519 WARNING [train.py:1204] (2/4) Exclude cut with ID _57572_210_5517_1_1534921219624_3809484_52-43530-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 00:37:10,954 WARNING [train.py:1204] (2/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_96-320736-0_sp1.1 from training. Duration: 0.7818125 2023-10-03 00:37:12,319 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_13091_210_7925_1_1530609447_8987354_1159-225508-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 00:37:13,828 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.attention_skip_rate, batch_count=1075760.0, ans=0.0 2023-10-03 00:37:15,043 WARNING [train.py:1204] (2/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_237-44332-0 from training. Duration: 0.94 2023-10-03 00:37:16,348 WARNING [train.py:1204] (2/4) Exclude cut with ID _7970_210_1883_1_1528455616296_3472230_25-178407-0_sp0.9 from training. Duration: 0.788875 2023-10-03 00:37:20,388 WARNING [train.py:1204] (2/4) Exclude cut with ID _29003_210_19596_1_1532507391720_3894098_357-257062-0_sp1.1 from training. Duration: 0.936375 2023-10-03 00:37:21,950 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_809-17156-0 from training. Duration: 0.73 2023-10-03 00:37:23,305 WARNING [train.py:1204] (2/4) Exclude cut with ID _7160_210_1876_1_1527940923231_3703799_302-150728-0_sp1.1 from training. Duration: 0.7090625 2023-10-03 00:37:23,314 WARNING [train.py:1204] (2/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_444-28041-0_sp0.9 from training. Duration: 0.9444375 2023-10-03 00:37:24,808 WARNING [train.py:1204] (2/4) Exclude cut with ID _52892_210_6721_1_1534378941259_4124129_278-251666-0_sp1.1 from training. Duration: 0.92725 2023-10-03 00:37:26,224 WARNING [train.py:1204] (2/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_115-326778-0 from training. Duration: 0.93 2023-10-03 00:37:28,239 WARNING [train.py:1204] (2/4) Exclude cut with ID _41391_210_7994_1_1533603771872_3638201_31-188961-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 00:37:29,674 WARNING [train.py:1204] (2/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_1073-6264-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 00:37:29,694 WARNING [train.py:1204] (2/4) Exclude cut with ID _66868_210_9527_1_1535509287954_4561140_435-259515-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 00:37:29,783 WARNING [train.py:1204] (2/4) Exclude cut with ID _14886_210_4220_1_1530702020355_3516190_159-73748-0 from training. Duration: 0.83 2023-10-03 00:37:31,124 WARNING [train.py:1204] (2/4) Exclude cut with ID _15150_210_8341_1_1530752411803_7256384_469-239509-0_sp1.1 from training. Duration: 0.6181875 2023-10-03 00:37:32,589 WARNING [train.py:1204] (2/4) Exclude cut with ID _8064_210_5399_1_1528372299291_4282973_395-340852-0 from training. Duration: 0.93 2023-10-03 00:37:32,592 WARNING [train.py:1204] (2/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_138-187031-0_sp1.1 from training. Duration: 0.9 2023-10-03 00:37:35,506 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_13757_210_3528_1_1530440682_4107239_48-106347-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 00:37:37,397 WARNING [train.py:1204] (2/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_164-224052-0_sp1.1 from training. Duration: 0.636375 2023-10-03 00:37:37,407 WARNING [train.py:1204] (2/4) Exclude cut with ID _59288_210_5854_1_1535077757774_3412246_408-322648-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 00:37:37,441 WARNING [train.py:1204] (2/4) Exclude cut with ID _30191_210_1664_1_1533189915047_7196670_827-36359-0_sp1.1 from training. Duration: 0.963625 2023-10-03 00:37:39,322 WARNING [train.py:1204] (2/4) Exclude cut with ID _4680_210_3525_1_1527249757953_893386_75-78979-0_sp1.1 from training. Duration: 0.82725 2023-10-03 00:37:40,623 WARNING [train.py:1204] (2/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_383-165234-0 from training. Duration: 0.94 2023-10-03 00:37:43,438 WARNING [train.py:1204] (2/4) Exclude cut with ID _19896_210_1883_1_1531709022662_6649569_94-229927-0 from training. Duration: 0.97 2023-10-03 00:37:43,448 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6788_210_1388_1_1527898698_4562021_180-75288-0_sp1.1 from training. Duration: 0.9 2023-10-03 00:37:43,462 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_26433_210_12108_1_1532157158_4056480_54-321349-0_sp1.1 from training. Duration: 0.97275 2023-10-03 00:37:48,939 WARNING [train.py:1204] (2/4) Exclude cut with ID _56108_210_16380_1_1534505446159_3887959_265-161493-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 00:37:49,052 WARNING [train.py:1204] (2/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_395-111602-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 00:37:49,058 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_154-316450-0_sp0.9 from training. Duration: 0.8444375 2023-10-03 00:37:50,414 WARNING [train.py:1204] (2/4) Exclude cut with ID _43552_210_8913_1_1533726031059_3523429_381-116041-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 00:37:51,909 WARNING [train.py:1204] (2/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_65-104124-0_sp1.1 from training. Duration: 0.963625 2023-10-03 00:37:53,236 WARNING [train.py:1204] (2/4) Exclude cut with ID _6536_210_1883_1_1527850783339_3319069_169-23811-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 00:37:53,269 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_289-180904-0_sp0.9 from training. Duration: 0.8444375 2023-10-03 00:37:54,447 WARNING [train.py:1204] (2/4) Exclude cut with ID _56406_210_9288_1_1534899618664_3496149_226-242400-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 00:37:54,550 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_24899_210_16554_1_1532046422_3827462_276-216330-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 00:37:54,775 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.feed_forward1.hidden_balancer.prob, batch_count=1075960.0, ans=0.125 2023-10-03 00:37:57,847 WARNING [train.py:1204] (2/4) Exclude cut with ID _29345_210_4381_1_1532689168263_3860169_198-174328-0_sp1.1 from training. Duration: 0.82725 2023-10-03 00:37:59,139 WARNING [train.py:1204] (2/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_338-96520-0 from training. Duration: 0.94 2023-10-03 00:38:03,488 WARNING [train.py:1204] (2/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_7-111954-0_sp1.1 from training. Duration: 0.5090625 2023-10-03 00:38:04,707 WARNING [train.py:1204] (2/4) Exclude cut with ID _36692_210_5665_1_1533608967420_398809_8-16261-0_sp1.1 from training. Duration: 0.97275 2023-10-03 00:38:08,531 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.578e+02 1.852e+02 2.094e+02 2.367e+02 3.575e+02, threshold=4.187e+02, percent-clipped=0.0 2023-10-03 00:38:08,710 WARNING [train.py:1204] (2/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_86-212657-0_sp1.1 from training. Duration: 0.97275 2023-10-03 00:38:08,717 WARNING [train.py:1204] (2/4) Exclude cut with ID _44659_210_2721_1_1534036120061_6614860_626-298884-0_sp1.1 from training. Duration: 0.8818125 2023-10-03 00:38:11,883 WARNING [train.py:1204] (2/4) Exclude cut with ID _49755_210_18053_1_1534581005846_3218709_120-257918-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 00:38:13,278 WARNING [train.py:1204] (2/4) Exclude cut with ID _21881_210_3525_1_1531825242757_3798380_400-113821-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 00:38:13,280 WARNING [train.py:1204] (2/4) Exclude cut with ID _21783_210_11357_1_1531742458730_3762679_387-236773-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 00:38:14,613 WARNING [train.py:1204] (2/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_613-232856-0_sp1.1 from training. Duration: 0.4909375 2023-10-03 00:38:14,645 WARNING [train.py:1204] (2/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_181-78632-0_sp0.9 from training. Duration: 0.8666875 2023-10-03 00:38:14,840 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.bypass.skip_rate, batch_count=1076026.6666666667, ans=0.07 2023-10-03 00:38:17,249 WARNING [train.py:1204] (2/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_175-17485-0_sp1.1 from training. Duration: 0.97275 2023-10-03 00:38:17,324 WARNING [train.py:1204] (2/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_1004-183422-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 00:38:21,445 INFO [train.py:1046] (2/4) Epoch 31, batch 2050, loss[loss=0.145, simple_loss=0.1999, pruned_loss=0.04505, over 19629.00 frames. ], tot_loss[loss=0.1639, simple_loss=0.2422, pruned_loss=0.04277, over 4704337.06 frames. ], batch size: 388, lr: 3.30e-03, grad_scale: 32.0 2023-10-03 00:38:21,513 WARNING [train.py:1204] (2/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_32-65962-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 00:38:22,838 WARNING [train.py:1204] (2/4) Exclude cut with ID _15458_210_8341_1_1530858614263_3834410_40-298664-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 00:38:29,063 WARNING [train.py:1204] (2/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_380-261863-0_sp1.1 from training. Duration: 0.963625 2023-10-03 00:38:30,535 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_372-173012-0_sp1.1 from training. Duration: 0.863625 2023-10-03 00:38:31,875 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_57-82205-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 00:38:33,222 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_3691_210_3528_1_1526468169_4062053_355-334432-0_sp0.9 from training. Duration: 0.9666875 2023-10-03 00:38:34,641 WARNING [train.py:1204] (2/4) Exclude cut with ID _19023_210_4353_1_1531378892552_5820780_28-36615-0 from training. Duration: 0.97 2023-10-03 00:38:34,653 WARNING [train.py:1204] (2/4) Exclude cut with ID _29853_210_7497_1_1532505600851_6642110_286-251333-0_sp1.1 from training. Duration: 0.92725 2023-10-03 00:38:34,820 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.1.conv_module2.balancer2.prob, batch_count=1076160.0, ans=0.125 2023-10-03 00:38:34,828 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.feed_forward3.hidden_balancer.prob, batch_count=1076160.0, ans=0.125 2023-10-03 00:38:36,119 WARNING [train.py:1204] (2/4) Exclude cut with ID _31602_210_5854_1_1532677021456_4458250_518-80679-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 00:38:37,418 WARNING [train.py:1204] (2/4) Exclude cut with ID _14922_210_6228_1_1530707435273_3693310_38-187851-0_sp0.9 from training. Duration: 0.688875 2023-10-03 00:38:46,313 WARNING [train.py:1204] (2/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_39-248870-0_sp1.1 from training. Duration: 0.62725 2023-10-03 00:38:46,339 WARNING [train.py:1204] (2/4) Exclude cut with ID _16908_210_5724_1_1531119157248_3772700_154-31455-0_sp1.1 from training. Duration: 0.97275 2023-10-03 00:38:47,882 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.out_combiner.scale_min, batch_count=1076160.0, ans=0.2 2023-10-03 00:38:48,990 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8453_210_4211_1_1528502940_8562730_347-330673-0 from training. Duration: 0.72 2023-10-03 00:38:50,433 WARNING [train.py:1204] (2/4) Exclude cut with ID _30191_210_1664_1_1533189915047_7196670_674-204247-0_sp1.1 from training. Duration: 0.97275 2023-10-03 00:38:51,823 WARNING [train.py:1204] (2/4) Exclude cut with ID _8833_210_5219_1_1528592665210_7553059_329-125156-0 from training. Duration: 0.82 2023-10-03 00:38:51,858 WARNING [train.py:1204] (2/4) Exclude cut with ID _4779_210_2084_1_1527318022944_3914893_500-285013-0_sp1.1 from training. Duration: 0.62725 2023-10-03 00:38:54,725 WARNING [train.py:1204] (2/4) Exclude cut with ID _14421_210_8750_1_1530604795825_3542290_190-313578-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 00:38:55,514 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.2.encoder.layers.1.feed_forward1.out_whiten, num_groups=1, num_channels=384, metric=8.98 vs. limit=15.0 2023-10-03 00:38:56,335 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.conv_module1.balancer1.prob, batch_count=1076226.6666666667, ans=0.125 2023-10-03 00:38:57,516 WARNING [train.py:1204] (2/4) Exclude cut with ID _35799_210_5854_1_1533263487887_3529071_538-273574-0_sp1.1 from training. Duration: 0.936375 2023-10-03 00:38:59,292 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_14928_210_5632_1_1530773461_4714000_454-112198-0_sp1.1 from training. Duration: 0.6 2023-10-03 00:38:59,340 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_468-143126-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 00:39:00,714 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_4528_210_3273_1_1527076654_3276777_258-121681-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 00:39:02,215 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_397-132565-0_sp1.1 from training. Duration: 0.9 2023-10-03 00:39:02,233 WARNING [train.py:1204] (2/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_454-123002-0_sp0.9 from training. Duration: 0.7666875 2023-10-03 00:39:03,715 WARNING [train.py:1204] (2/4) Exclude cut with ID _27052_210_5986_1_1532329197187_3585009_297-86890-0_sp1.1 from training. Duration: 0.936375 2023-10-03 00:39:05,128 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_51225_210_3601_1_1534400634_2327383_55-301844-0_sp1.1 from training. Duration: 0.8454375 2023-10-03 00:39:06,499 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6786_210_1388_1_1527839699_4194495_378-46070-0_sp0.9 from training. Duration: 0.8 2023-10-03 00:39:09,130 WARNING [train.py:1204] (2/4) Exclude cut with ID _42334_210_7770_1_1534206610396_3605880_55-19758-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 00:39:13,865 WARNING [train.py:1204] (2/4) Exclude cut with ID _14884_210_2252_1_1530707459689_5561250_114-201399-0_sp0.9 from training. Duration: 0.8444375 2023-10-03 00:39:15,462 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_99-251314-0_sp0.9 from training. Duration: 0.9666875 2023-10-03 00:39:16,877 WARNING [train.py:1204] (2/4) Exclude cut with ID _26528_210_9070_1_1532570410842_3851310_497-97218-0 from training. Duration: 0.86 2023-10-03 00:39:22,291 WARNING [train.py:1204] (2/4) Exclude cut with ID _55328_210_27767_1_1534580973260_4088170_230-317241-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 00:39:22,353 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_125-203105-0_sp0.9 from training. Duration: 0.888875 2023-10-03 00:39:25,091 WARNING [train.py:1204] (2/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_55-31904-0_sp1.1 from training. Duration: 0.8 2023-10-03 00:39:27,041 WARNING [train.py:1204] (2/4) Exclude cut with ID _8064_210_5399_1_1528372299291_4282973_18-174647-0 from training. Duration: 0.91 2023-10-03 00:39:29,795 WARNING [train.py:1204] (2/4) Exclude cut with ID IC0165W0005-81726 from training. Duration: 0.952 2023-10-03 00:39:29,795 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_31583_210_3852_1_1532595851_4024413_148-171847-0_sp1.1 from training. Duration: 0.963625 2023-10-03 00:39:31,099 WARNING [train.py:1204] (2/4) Exclude cut with ID _21989_210_5854_1_1531899214931_3271360_116-138769-0_sp1.1 from training. Duration: 0.936375 2023-10-03 00:39:31,149 WARNING [train.py:1204] (2/4) Exclude cut with ID _66070_210_7640_1_1535441393096_7805310_489-292631-0_sp1.1 from training. Duration: 0.7181875 2023-10-03 00:39:32,502 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_20120_210_6753_1_1531643035_5482859_202-27960-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 00:39:33,763 INFO [train.py:1046] (2/4) Epoch 31, batch 2100, loss[loss=0.1483, simple_loss=0.2371, pruned_loss=0.02976, over 24630.00 frames. ], tot_loss[loss=0.1625, simple_loss=0.241, pruned_loss=0.04206, over 4701223.77 frames. ], batch size: 73, lr: 3.30e-03, grad_scale: 32.0 2023-10-03 00:39:33,798 WARNING [train.py:1204] (2/4) Exclude cut with ID _5294_210_1747_1_1527249643928_3696179_342-270278-0 from training. Duration: 0.55 2023-10-03 00:39:33,846 WARNING [train.py:1204] (2/4) Exclude cut with ID _38323_210_12108_1_1533121233369_3303049_416-11919-0 from training. Duration: 0.96 2023-10-03 00:39:33,985 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=1076426.6666666667, ans=0.1 2023-10-03 00:39:35,266 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6545_210_2040_1_1527774670_4245258_407-48070-0_sp0.9 from training. Duration: 0.8444375 2023-10-03 00:39:39,627 WARNING [train.py:1204] (2/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_503-30719-0_sp1.1 from training. Duration: 0.8909375 2023-10-03 00:39:39,885 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.balancer1.prob, batch_count=1076426.6666666667, ans=0.125 2023-10-03 00:39:40,939 WARNING [train.py:1204] (2/4) Exclude cut with ID _49643_210_7989_1_1534640244224_3939031_234-326497-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 00:39:42,453 WARNING [train.py:1204] (2/4) Exclude cut with ID _14170_210_5219_1_1530525949868_4469650_301-77873-0_sp1.1 from training. Duration: 0.963625 2023-10-03 00:39:43,806 WARNING [train.py:1204] (2/4) Exclude cut with ID _45297_210_2721_1_1533715351381_3264949_152-154697-0_sp1.1 from training. Duration: 0.97275 2023-10-03 00:39:43,813 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_3371_210_1794_1_1526384462_5580777_396-339812-0 from training. Duration: 0.85 2023-10-03 00:39:43,908 WARNING [train.py:1204] (2/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_399-156791-0_sp1.1 from training. Duration: 0.7545625 2023-10-03 00:39:43,951 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7696_210_2040_1_1528207167_3716099_117-125434-0 from training. Duration: 0.76 2023-10-03 00:39:43,955 WARNING [train.py:1204] (2/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_96-244299-0 from training. Duration: 0.88 2023-10-03 00:39:46,640 WARNING [train.py:1204] (2/4) Exclude cut with ID _37892_210_19596_1_1533198677897_3964058_519-343462-0_sp1.1 from training. Duration: 0.92725 2023-10-03 00:39:46,671 WARNING [train.py:1204] (2/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_202-183922-0_sp1.1 from training. Duration: 0.77275 2023-10-03 00:39:46,677 WARNING [train.py:1204] (2/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_453-133983-0 from training. Duration: 0.78 2023-10-03 00:39:46,718 WARNING [train.py:1204] (2/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_178-39722-0_sp1.1 from training. Duration: 0.4454375 2023-10-03 00:39:50,827 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.conv_module1.balancer1.prob, batch_count=1076493.3333333333, ans=0.125 2023-10-03 00:39:51,967 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_15126_210_6753_1_1530778677_2591185_113-157651-0 from training. Duration: 0.68 2023-10-03 00:39:51,968 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_271-234962-0_sp1.1 from training. Duration: 0.7181875 2023-10-03 00:39:54,711 WARNING [train.py:1204] (2/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_195-64967-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 00:39:56,055 WARNING [train.py:1204] (2/4) Exclude cut with ID _43552_210_8913_1_1533726031059_3523429_365-215065-0_sp1.1 from training. Duration: 0.936375 2023-10-03 00:39:56,249 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.1.feed_forward3.hidden_balancer.prob, batch_count=1076493.3333333333, ans=0.125 2023-10-03 00:39:56,365 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.ff2_skip_rate, batch_count=1076493.3333333333, ans=0.0 2023-10-03 00:39:59,324 WARNING [train.py:1204] (2/4) Exclude cut with ID _31602_210_5854_1_1532677021456_4458250_249-96604-0_sp0.9 from training. Duration: 0.988875 2023-10-03 00:39:59,383 WARNING [train.py:1204] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_307-158462-0 from training. Duration: 0.69 2023-10-03 00:40:00,682 WARNING [train.py:1204] (2/4) Exclude cut with ID _30918_210_6400_1_1532754078600_3125920_407-223394-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 00:40:00,693 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6673_210_3273_1_1527767790_3173240_351-167954-0_sp0.9 from training. Duration: 0.5333125 2023-10-03 00:40:02,153 WARNING [train.py:1204] (2/4) Exclude cut with ID _4866_210_2577_1_1527404412284_4364659_256-306830-0 from training. Duration: 0.87 2023-10-03 00:40:02,202 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_17613_210_2151_1_1531137569_2366160_222-131118-0_sp1.1 from training. Duration: 0.963625 2023-10-03 00:40:02,212 WARNING [train.py:1204] (2/4) Exclude cut with ID _45560_210_21982_1_1533812603951_3584999_111-6376-0 from training. Duration: 0.84 2023-10-03 00:40:03,491 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_216-73841-0 from training. Duration: 0.96 2023-10-03 00:40:05,167 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_14058_210_4163_1_1530690953_6327180_49-210687-0 from training. Duration: 0.94 2023-10-03 00:40:06,738 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.bypass_mid.scale_min, batch_count=1076560.0, ans=0.2 2023-10-03 00:40:07,833 WARNING [train.py:1204] (2/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_506-42614-0_sp0.9 from training. Duration: 0.87775 2023-10-03 00:40:09,777 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_3133_210_2087_1_1526106446_2324690_25-332138-0_sp1.1 from training. Duration: 0.82725 2023-10-03 00:40:12,625 WARNING [train.py:1204] (2/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_589-9070-0_sp1.1 from training. Duration: 0.6454375 2023-10-03 00:40:13,992 WARNING [train.py:1204] (2/4) Exclude cut with ID _6194_210_1534_1_1527511826160_3941194_14-282876-0_sp1.1 from training. Duration: 0.6454375 2023-10-03 00:40:14,102 WARNING [train.py:1204] (2/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_1166-90654-0_sp1.1 from training. Duration: 0.92725 2023-10-03 00:40:15,437 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_14056_210_4163_1_1530604483_7572827_218-255858-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 00:40:15,439 WARNING [train.py:1204] (2/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_1285-256622-0 from training. Duration: 0.84 2023-10-03 00:40:16,738 WARNING [train.py:1204] (2/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_205-286261-0_sp1.1 from training. Duration: 0.963625 2023-10-03 00:40:16,752 WARNING [train.py:1204] (2/4) Exclude cut with ID _68777_210_4353_1_1535810390731_3117904_6-154119-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 00:40:16,806 WARNING [train.py:1204] (2/4) Exclude cut with ID _22982_210_1883_1_1531882790447_5449409_565-199393-0_sp1.1 from training. Duration: 0.92725 2023-10-03 00:40:16,818 WARNING [train.py:1204] (2/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_278-154492-0 from training. Duration: 0.76 2023-10-03 00:40:18,258 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_426-313557-0 from training. Duration: 0.53 2023-10-03 00:40:19,586 WARNING [train.py:1204] (2/4) Exclude cut with ID _41817_210_18835_1_1533434477160_7423049_555-293448-0 from training. Duration: 0.8 2023-10-03 00:40:25,157 WARNING [train.py:1204] (2/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_32-335103-0_sp1.1 from training. Duration: 0.7454375 2023-10-03 00:40:28,102 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_2655_210_2087_1_1525688959_3897400_202-147557-0_sp1.1 from training. Duration: 0.77275 2023-10-03 00:40:28,128 WARNING [train.py:1204] (2/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_158-250116-0 from training. Duration: 0.62 2023-10-03 00:40:31,633 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.1.conv_module1.balancer1.prob, batch_count=1076693.3333333333, ans=0.125 2023-10-03 00:40:32,224 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.2.encoder.layers.0.nonlin_attention.whiten2, num_groups=1, num_channels=384, metric=8.00 vs. limit=15.0 2023-10-03 00:40:35,495 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.675e+02 2.012e+02 2.254e+02 2.840e+02 4.737e+02, threshold=4.507e+02, percent-clipped=3.0 2023-10-03 00:40:35,560 WARNING [train.py:1204] (2/4) Exclude cut with ID _55429_210_10626_1_1534467588527_7174143_528-88083-0_sp1.1 from training. Duration: 0.963625 2023-10-03 00:40:37,149 WARNING [train.py:1204] (2/4) Exclude cut with ID _8030_210_7497_1_1528444886781_3531089_32-31837-0_sp1.1 from training. Duration: 0.8818125 2023-10-03 00:40:37,177 WARNING [train.py:1204] (2/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_744-86168-0_sp1.1 from training. Duration: 0.97275 2023-10-03 00:40:37,192 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1113-7369-0_sp0.9 from training. Duration: 0.9444375 2023-10-03 00:40:38,459 WARNING [train.py:1204] (2/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_124-345840-0_sp0.9 from training. Duration: 0.57775 2023-10-03 00:40:38,517 WARNING [train.py:1204] (2/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_328-264776-0_sp0.9 from training. Duration: 0.7444375 2023-10-03 00:40:41,725 WARNING [train.py:1204] (2/4) Exclude cut with ID _18895_210_6228_1_1531357274234_3784930_469-166615-0_sp1.1 from training. Duration: 0.963625 2023-10-03 00:40:41,731 WARNING [train.py:1204] (2/4) Exclude cut with ID _8395_210_1531_1_1528597871523_4411579_431-132470-0_sp0.9 from training. Duration: 0.82225 2023-10-03 00:40:42,443 WARNING [train.py:1204] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_554-45617-0_sp1.1 from training. Duration: 0.9 2023-10-03 00:40:42,464 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_35361_210_4238_1_1533262810_7793476_524-188242-0_sp1.1 from training. Duration: 0.92725 2023-10-03 00:40:45,071 WARNING [train.py:1204] (2/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_938-205135-0 from training. Duration: 0.87 2023-10-03 00:40:46,511 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_5931_210_2040_1_1527428481_4547999_321-193614-0 from training. Duration: 0.67 2023-10-03 00:40:46,523 WARNING [train.py:1204] (2/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_445-269521-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 00:40:47,879 INFO [train.py:1046] (2/4) Epoch 31, batch 2150, loss[loss=0.1739, simple_loss=0.2467, pruned_loss=0.05058, over 23707.00 frames. ], tot_loss[loss=0.1624, simple_loss=0.2407, pruned_loss=0.04202, over 4705479.50 frames. ], batch size: 179, lr: 3.30e-03, grad_scale: 16.0 2023-10-03 00:40:49,335 WARNING [train.py:1204] (2/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_3-33116-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 00:40:49,336 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_4633_210_2040_1_1526824163_4145343_416-76117-0_sp1.1 from training. Duration: 0.863625 2023-10-03 00:40:49,370 WARNING [train.py:1204] (2/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_1033-274376-0_sp1.1 from training. Duration: 0.7909375 2023-10-03 00:40:49,501 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.1.nonlin_attention.balancer.prob, batch_count=1076760.0, ans=0.125 2023-10-03 00:40:50,675 WARNING [train.py:1204] (2/4) Exclude cut with ID _8772_210_1850_1_1528635595972_3664011_398-320049-0_sp0.9 from training. Duration: 0.97775 2023-10-03 00:40:50,970 INFO [scaling.py:1118] (2/4) WithLoss: name=encoder.encoders.2.encoder.layers.2.self_attn_weights, loss-sum=0.000e+00 2023-10-03 00:40:54,794 WARNING [train.py:1204] (2/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_321-215742-0_sp0.9 from training. Duration: 0.5555625 2023-10-03 00:40:56,146 WARNING [train.py:1204] (2/4) Exclude cut with ID _28394_210_8913_1_1532430049518_3650118_272-190950-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 00:40:57,543 WARNING [train.py:1204] (2/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_31-299371-0_sp1.1 from training. Duration: 0.92725 2023-10-03 00:40:58,949 WARNING [train.py:1204] (2/4) Exclude cut with ID _55383_210_10635_1_1534468426172_2231669_134-128246-0_sp0.9 from training. Duration: 0.988875 2023-10-03 00:40:58,952 WARNING [train.py:1204] (2/4) Exclude cut with ID _40519_210_13794_1_1533715228655_7337390_887-9547-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 00:40:58,986 WARNING [train.py:1204] (2/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_310-109461-0_sp0.9 from training. Duration: 0.888875 2023-10-03 00:41:02,212 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_2009_210_2071_1_1525481134_4588937_258-250196-0_sp1.1 from training. Duration: 0.92725 2023-10-03 00:41:03,509 WARNING [train.py:1204] (2/4) Exclude cut with ID _9235_210_5219_1_1528891154253_8046120_1186-72702-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 00:41:03,512 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_14831_210_7927_1_1530700200_5583610_181-27467-0_sp1.1 from training. Duration: 0.82725 2023-10-03 00:41:06,332 WARNING [train.py:1204] (2/4) Exclude cut with ID _40402_210_18219_1_1533522324115_2953150_278-132109-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 00:41:06,353 WARNING [train.py:1204] (2/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_261-297503-0 from training. Duration: 0.99 2023-10-03 00:41:08,032 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.balancer1.prob, batch_count=1076826.6666666667, ans=0.125 2023-10-03 00:41:11,711 WARNING [train.py:1204] (2/4) Exclude cut with ID _19896_210_1883_1_1531709022662_6649569_313-265903-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 00:41:11,810 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_105-114797-0_sp1.1 from training. Duration: 0.763625 2023-10-03 00:41:12,927 WARNING [train.py:1204] (2/4) Exclude cut with ID _53755_210_8254_1_1534577312941_4707360_9-65843-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 00:41:12,949 WARNING [train.py:1204] (2/4) Exclude cut with ID _18516_210_9206_1_1531704653206_3692406_361-224549-0_sp1.1 from training. Duration: 0.97275 2023-10-03 00:41:12,976 WARNING [train.py:1204] (2/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_403-310975-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 00:41:13,009 WARNING [train.py:1204] (2/4) Exclude cut with ID _5444_210_2151_1_1527418808464_5312210_581-315411-0_sp0.9 from training. Duration: 0.688875 2023-10-03 00:41:14,920 WARNING [train.py:1204] (2/4) Exclude cut with ID _23825_210_13449_1_1531915313526_3497150_464-154904-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 00:41:14,936 WARNING [train.py:1204] (2/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_937-208281-0_sp0.9 from training. Duration: 0.9444375 2023-10-03 00:41:16,272 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_37839_210_7234_1_1533542462_3689303_158-181212-0_sp1.1 from training. Duration: 0.963625 2023-10-03 00:41:19,036 WARNING [train.py:1204] (2/4) Exclude cut with ID _8772_210_1850_1_1528635595972_3664011_398-320049-0 from training. Duration: 0.88 2023-10-03 00:41:19,199 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.0.feed_forward1.out_proj.dropout_p, batch_count=1076893.3333333333, ans=0.1 2023-10-03 00:41:20,428 WARNING [train.py:1204] (2/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_718-250907-0_sp1.1 from training. Duration: 0.62725 2023-10-03 00:41:21,901 WARNING [train.py:1204] (2/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_641-126395-0_sp1.1 from training. Duration: 0.92725 2023-10-03 00:41:21,925 WARNING [train.py:1204] (2/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_166-241493-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 00:41:22,134 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.conv_skip_rate, batch_count=1076893.3333333333, ans=0.0 2023-10-03 00:41:23,264 WARNING [train.py:1204] (2/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_526-62079-0_sp0.9 from training. Duration: 0.7444375 2023-10-03 00:41:23,365 WARNING [train.py:1204] (2/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_439-275083-0_sp1.1 from training. Duration: 0.936375 2023-10-03 00:41:26,137 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_66907_210_21581_1_1535615661_3590177_493-229032-0_sp1.1 from training. Duration: 0.92725 2023-10-03 00:41:26,170 WARNING [train.py:1204] (2/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_89-321955-0_sp1.1 from training. Duration: 0.72725 2023-10-03 00:41:28,885 WARNING [train.py:1204] (2/4) Exclude cut with ID _29857_210_20034_1_1532395886753_3798160_273-226195-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 00:41:28,891 WARNING [train.py:1204] (2/4) Exclude cut with ID _22826_210_9994_1_1531908201522_3279169_404-75889-0 from training. Duration: 0.89 2023-10-03 00:41:28,933 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_271-234962-0_sp0.9 from training. Duration: 0.87775 2023-10-03 00:41:31,996 WARNING [train.py:1204] (2/4) Exclude cut with ID _22961_210_8254_1_1531976366672_4278180_156-339685-0_sp1.1 from training. Duration: 0.97275 2023-10-03 00:41:32,049 WARNING [train.py:1204] (2/4) Exclude cut with ID _37892_210_19596_1_1533198677897_3964058_425-231927-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 00:41:33,358 WARNING [train.py:1204] (2/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_102-288670-0_sp1.1 from training. Duration: 0.97275 2023-10-03 00:41:33,965 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.4.encoder.layers.1.conv_module1.whiten, num_groups=1, num_channels=384, metric=3.40 vs. limit=15.0 2023-10-03 00:41:34,798 WARNING [train.py:1204] (2/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_283-220372-0_sp1.1 from training. Duration: 0.5090625 2023-10-03 00:41:36,200 WARNING [train.py:1204] (2/4) Exclude cut with ID _44304_210_15374_1_1533639352765_4613129_86-1699-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 00:41:37,580 WARNING [train.py:1204] (2/4) Exclude cut with ID _2891_210_2550_1_1525874398521_4427710_327-121590-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 00:41:37,587 WARNING [train.py:1204] (2/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_369-222060-0 from training. Duration: 0.71 2023-10-03 00:41:40,357 WARNING [train.py:1204] (2/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_762-109886-0 from training. Duration: 0.91 2023-10-03 00:41:40,379 WARNING [train.py:1204] (2/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_477-255782-0_sp0.9 from training. Duration: 0.911125 2023-10-03 00:41:41,052 WARNING [train.py:1204] (2/4) Exclude cut with ID IC0669W0061-330906 from training. Duration: 0.768 2023-10-03 00:41:41,091 WARNING [train.py:1204] (2/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_347-213071-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 00:41:41,127 WARNING [train.py:1204] (2/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_507-303370-0_sp1.1 from training. Duration: 0.87275 2023-10-03 00:41:42,428 WARNING [train.py:1204] (2/4) Exclude cut with ID _15012_210_1968_1_1530856820429_5095350_688-178849-0 from training. Duration: 0.64 2023-10-03 00:41:42,441 WARNING [train.py:1204] (2/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_28-70987-0_sp0.9 from training. Duration: 0.9666875 2023-10-03 00:41:42,443 WARNING [train.py:1204] (2/4) Exclude cut with ID _5910_210_1389_1_1527337972454_5050100_555-159815-0 from training. Duration: 0.98 2023-10-03 00:41:42,465 WARNING [train.py:1204] (2/4) Exclude cut with ID IC0679W0064-335871 from training. Duration: 0.768 2023-10-03 00:41:42,466 WARNING [train.py:1204] (2/4) Exclude cut with ID IC0679W0079-335886 from training. Duration: 0.896 2023-10-03 00:41:42,498 WARNING [train.py:1204] (2/4) Exclude cut with ID _5608_210_2106_1_1527415439089_6819768_909-82767-0 from training. Duration: 0.61 2023-10-03 00:41:44,423 WARNING [train.py:1204] (2/4) Exclude cut with ID _23038_210_9570_1_1532001584158_3652419_411-323967-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 00:41:45,723 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_40896_210_15710_1_1533433956_7100802_350-231748-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 00:41:45,733 WARNING [train.py:1204] (2/4) Exclude cut with ID _5289_210_2084_1_1527678202736_3779299_142-103317-0_sp0.9 from training. Duration: 0.9333125 2023-10-03 00:41:45,813 WARNING [train.py:1204] (2/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_314-144055-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 00:41:47,755 WARNING [train.py:1204] (2/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_177-236809-0_sp0.9 from training. Duration: 0.5444375 2023-10-03 00:41:49,113 WARNING [train.py:1204] (2/4) Exclude cut with ID _23038_210_9570_1_1532001584158_3652419_82-334733-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 00:41:49,121 WARNING [train.py:1204] (2/4) Exclude cut with ID _43552_210_8913_1_1533726031059_3523429_225-22526-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 00:41:54,553 WARNING [train.py:1204] (2/4) Exclude cut with ID _30327_210_16049_1_1532484161295_3924169_401-70819-0_sp1.1 from training. Duration: 0.7818125 2023-10-03 00:41:55,843 WARNING [train.py:1204] (2/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_538-32291-0 from training. Duration: 0.83 2023-10-03 00:41:57,487 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.conv_module1.balancer1.prob, batch_count=1077026.6666666667, ans=0.125 2023-10-03 00:41:58,862 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.ff3_skip_rate, batch_count=1077026.6666666667, ans=0.0 2023-10-03 00:42:00,625 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_37357_210_17024_1_1533089504_2316121_258-211605-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 00:42:01,957 INFO [train.py:1046] (2/4) Epoch 31, batch 2200, loss[loss=0.1617, simple_loss=0.2528, pruned_loss=0.03531, over 24419.00 frames. ], tot_loss[loss=0.1631, simple_loss=0.2411, pruned_loss=0.04261, over 4689178.92 frames. ], batch size: 69, lr: 3.30e-03, grad_scale: 8.0 2023-10-03 00:42:03,640 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.conv_skip_rate, batch_count=1077093.3333333333, ans=0.0 2023-10-03 00:42:06,084 WARNING [train.py:1204] (2/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_550-335198-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 00:42:06,147 WARNING [train.py:1204] (2/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_98-741-0_sp0.9 from training. Duration: 0.888875 2023-10-03 00:42:06,203 WARNING [train.py:1204] (2/4) Exclude cut with ID _38323_210_12108_1_1533121233369_3303049_251-238988-0_sp1.1 from training. Duration: 0.92725 2023-10-03 00:42:07,573 WARNING [train.py:1204] (2/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_259-34038-0_sp0.9 from training. Duration: 0.82225 2023-10-03 00:42:10,795 WARNING [train.py:1204] (2/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_1060-180633-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 00:42:10,843 WARNING [train.py:1204] (2/4) Exclude cut with ID _8755_210_1876_1_1528628559357_3258209_211-37473-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 00:42:10,849 WARNING [train.py:1204] (2/4) Exclude cut with ID _4122_210_2084_1_1526713164417_4353280_528-206771-0 from training. Duration: 0.88 2023-10-03 00:42:14,161 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder_embed.conv.5.prob, batch_count=1077093.3333333333, ans=0.125 2023-10-03 00:42:16,841 WARNING [train.py:1204] (2/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_64-248359-0 from training. Duration: 0.69 2023-10-03 00:42:18,228 WARNING [train.py:1204] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_402-253027-0_sp1.1 from training. Duration: 0.4909375 2023-10-03 00:42:24,115 WARNING [train.py:1204] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_625-102955-0 from training. Duration: 0.77 2023-10-03 00:42:27,046 WARNING [train.py:1204] (2/4) Exclude cut with ID _14998_210_5219_1_1530705582080_7474970_611-219324-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 00:42:27,129 WARNING [train.py:1204] (2/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_718-68505-0_sp0.9 from training. Duration: 0.988875 2023-10-03 00:42:28,401 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_353-182855-0_sp1.1 from training. Duration: 0.87275 2023-10-03 00:42:31,634 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_5960_210_1794_1_1527403865_3990999_515-2613-0_sp0.9 from training. Duration: 0.97775 2023-10-03 00:42:32,918 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_5575_210_1410_1_1527301305_2570170_342-265572-0 from training. Duration: 0.94 2023-10-03 00:42:35,702 WARNING [train.py:1204] (2/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_329-113173-0_sp0.9 from training. Duration: 0.688875 2023-10-03 00:42:37,104 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_15396_210_6890_1_1531308473_1938936_213-312506-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 00:42:37,166 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_521-214276-0_sp1.1 from training. Duration: 0.4 2023-10-03 00:42:40,484 WARNING [train.py:1204] (2/4) Exclude cut with ID _15026_210_8750_1_1530777602316_3291850_211-195043-0_sp1.1 from training. Duration: 0.72725 2023-10-03 00:42:41,886 WARNING [train.py:1204] (2/4) Exclude cut with ID _29343_210_4381_1_1532602757752_3674109_108-257494-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 00:42:45,037 WARNING [train.py:1204] (2/4) Exclude cut with ID _64414_210_17301_1_1535243252048_3757759_403-58151-0_sp1.1 from training. Duration: 0.963625 2023-10-03 00:42:47,748 WARNING [train.py:1204] (2/4) Exclude cut with ID _36512_210_6987_1_1533033143998_3126069_109-177787-0_sp1.1 from training. Duration: 0.92725 2023-10-03 00:42:49,105 WARNING [train.py:1204] (2/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_498-40153-0 from training. Duration: 0.63 2023-10-03 00:42:49,194 WARNING [train.py:1204] (2/4) Exclude cut with ID _56447_210_1389_1_1535074245910_3697189_161-334523-0_sp1.1 from training. Duration: 0.936375 2023-10-03 00:42:50,659 WARNING [train.py:1204] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_238-245655-0 from training. Duration: 0.81 2023-10-03 00:42:52,139 WARNING [train.py:1204] (2/4) Exclude cut with ID _64631_210_27767_1_1535349580068_4165160_108-43865-0_sp1.1 from training. Duration: 0.92725 2023-10-03 00:42:52,144 WARNING [train.py:1204] (2/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_243-42116-0_sp1.1 from training. Duration: 0.663625 2023-10-03 00:42:52,184 WARNING [train.py:1204] (2/4) Exclude cut with ID _66868_210_9527_1_1535509287954_4561140_594-132160-0_sp1.1 from training. Duration: 0.92725 2023-10-03 00:42:55,673 WARNING [train.py:1204] (2/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_48-338976-0_sp0.9 from training. Duration: 0.988875 2023-10-03 00:42:55,718 WARNING [train.py:1204] (2/4) Exclude cut with ID _5250_210_5219_1_1527249561806_8692110_137-100595-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 00:42:55,724 WARNING [train.py:1204] (2/4) Exclude cut with ID _49748_210_24116_1_1533986971737_3158910_12-271669-0_sp1.1 from training. Duration: 0.936375 2023-10-03 00:42:55,747 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_37357_210_17024_1_1533089504_2316121_206-80421-0_sp1.1 from training. Duration: 0.936375 2023-10-03 00:42:55,860 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.1.feed_forward3.hidden_balancer.prob, batch_count=1077293.3333333333, ans=0.125 2023-10-03 00:42:55,912 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.bypass.skip_rate, batch_count=1077293.3333333333, ans=0.09899494936611666 2023-10-03 00:42:58,445 WARNING [train.py:1204] (2/4) Exclude cut with ID _7537_210_2084_1_1528502395431_3924359_113-199933-0_sp1.1 from training. Duration: 0.536375 2023-10-03 00:42:58,479 WARNING [train.py:1204] (2/4) Exclude cut with ID _55440_210_12651_1_1534496713364_2980520_31-59289-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 00:42:59,927 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1083-220122-0_sp0.9 from training. Duration: 0.5666875 2023-10-03 00:43:03,195 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_426-313557-0_sp1.1 from training. Duration: 0.4818125 2023-10-03 00:43:03,889 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.3.encoder.layers.0.self_attn2.whiten, num_groups=1, num_channels=512, metric=14.25 vs. limit=22.5 2023-10-03 00:43:04,587 WARNING [train.py:1204] (2/4) Exclude cut with ID _35799_210_5854_1_1533263487887_3529071_324-94843-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 00:43:05,899 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.536e+02 1.847e+02 1.989e+02 2.183e+02 3.187e+02, threshold=3.977e+02, percent-clipped=0.0 2023-10-03 00:43:08,529 WARNING [train.py:1204] (2/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_264-108685-0_sp1.1 from training. Duration: 0.5 2023-10-03 00:43:08,614 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9012W0006-501141 from training. Duration: 0.896 2023-10-03 00:43:11,855 WARNING [train.py:1204] (2/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_890-5117-0_sp0.9 from training. Duration: 0.8666875 2023-10-03 00:43:11,902 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9023W0008-506301 from training. Duration: 0.896 2023-10-03 00:43:13,290 WARNING [train.py:1204] (2/4) Exclude cut with ID _13916_210_9994_1_1530788341126_3763149_60-191556-0_sp0.9 from training. Duration: 0.67775 2023-10-03 00:43:14,621 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9029W0331-509721 from training. Duration: 0.896 2023-10-03 00:43:15,957 INFO [train.py:1046] (2/4) Epoch 31, batch 2250, loss[loss=0.163, simple_loss=0.2438, pruned_loss=0.04114, over 24671.00 frames. ], tot_loss[loss=0.1632, simple_loss=0.2414, pruned_loss=0.04246, over 4702305.25 frames. ], batch size: 65, lr: 3.30e-03, grad_scale: 8.0 2023-10-03 00:43:16,084 WARNING [train.py:1204] (2/4) Exclude cut with ID _35823_210_6917_1_1533187881914_7297830_567-108596-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 00:43:17,371 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7543_210_6753_1_1528544010_1110402_25-183860-0_sp1.1 from training. Duration: 0.42725 2023-10-03 00:43:19,254 WARNING [train.py:1204] (2/4) Exclude cut with ID _47278_210_12041_1_1533902418349_3576110_495-260035-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 00:43:20,687 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9051W0015-520281 from training. Duration: 0.9045625 2023-10-03 00:43:22,197 WARNING [train.py:1204] (2/4) Exclude cut with ID _14459_210_2228_1_1531443641946_7516700_707-66193-0_sp1.1 from training. Duration: 0.97275 2023-10-03 00:43:23,817 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.conv_module1.balancer2.min_abs, batch_count=1077426.6666666667, ans=0.5 2023-10-03 00:43:23,841 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.conv_module2.balancer2.prob, batch_count=1077426.6666666667, ans=0.125 2023-10-03 00:43:24,889 WARNING [train.py:1204] (2/4) Exclude cut with ID _14087_210_2253_1_1530529224512_3014029_219-224445-0_sp1.1 from training. Duration: 0.763625 2023-10-03 00:43:30,919 WARNING [train.py:1204] (2/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_345-167986-0_sp0.9 from training. Duration: 0.9333125 2023-10-03 00:43:31,025 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_34815_210_6388_1_1533381320_3571903_279-305565-0_sp1.1 from training. Duration: 0.836375 2023-10-03 00:43:35,057 WARNING [train.py:1204] (2/4) Exclude cut with ID _21987_210_5854_1_1531821500482_3637038_317-179212-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 00:43:35,115 WARNING [train.py:1204] (2/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_395-37222-0_sp1.1 from training. Duration: 0.6909375 2023-10-03 00:43:36,872 WARNING [train.py:1204] (2/4) Exclude cut with ID _8064_210_5399_1_1528372299291_4282973_168-50337-0_sp1.1 from training. Duration: 0.763625 2023-10-03 00:43:39,705 WARNING [train.py:1204] (2/4) Exclude cut with ID _60113_210_16271_1_1535164212481_4386079_559-167191-0 from training. Duration: 0.93 2023-10-03 00:43:39,721 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_47228_210_4983_1_1533780021_3672027_344-179642-0_sp1.1 from training. Duration: 0.92725 2023-10-03 00:43:40,994 WARNING [train.py:1204] (2/4) Exclude cut with ID _22961_210_8254_1_1531976366672_4278180_317-199546-0_sp0.9 from training. Duration: 0.9555625 2023-10-03 00:43:42,469 WARNING [train.py:1204] (2/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_141-89727-0 from training. Duration: 0.71 2023-10-03 00:43:42,531 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_78-116075-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 00:43:42,541 WARNING [train.py:1204] (2/4) Exclude cut with ID _64086_210_7994_1_1535270417019_7435949_52-235756-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 00:43:43,903 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7615_210_4211_1_1528110965_4580160_72-235211-0_sp1.1 from training. Duration: 0.6909375 2023-10-03 00:43:47,589 INFO [scaling.py:1118] (2/4) WithLoss: name=encoder.encoders.5.encoder.layers.0.self_attn_weights, loss-sum=0.000e+00 2023-10-03 00:43:48,725 WARNING [train.py:1204] (2/4) Exclude cut with ID _14069_210_5632_1_1530512692033_4803390_287-27950-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 00:43:48,820 WARNING [train.py:1204] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_74-48717-0_sp0.9 from training. Duration: 0.6444375 2023-10-03 00:43:50,084 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7544_210_6753_1_1528591327_1471426_172-348777-0_sp1.1 from training. Duration: 0.6 2023-10-03 00:43:52,062 WARNING [train.py:1204] (2/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_220-191080-0 from training. Duration: 0.98 2023-10-03 00:43:53,432 WARNING [train.py:1204] (2/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_1081-137641-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 00:43:53,603 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.conv_skip_rate, batch_count=1077560.0, ans=0.0 2023-10-03 00:43:54,893 WARNING [train.py:1204] (2/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_278-235459-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 00:43:59,556 WARNING [train.py:1204] (2/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_1181-77470-0_sp1.1 from training. Duration: 0.936375 2023-10-03 00:44:01,037 WARNING [train.py:1204] (2/4) Exclude cut with ID _60860_210_8254_1_1534896015875_3557169_198-5531-0_sp1.1 from training. Duration: 0.936375 2023-10-03 00:44:02,403 WARNING [train.py:1204] (2/4) Exclude cut with ID _12369_210_4381_1_1530356078581_4388520_17-289781-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 00:44:02,408 WARNING [train.py:1204] (2/4) Exclude cut with ID _50310_210_15488_1_1534068043345_3641080_146-199752-0_sp1.1 from training. Duration: 0.92725 2023-10-03 00:44:03,903 WARNING [train.py:1204] (2/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_969-213354-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 00:44:05,168 WARNING [train.py:1204] (2/4) Exclude cut with ID _70519_210_4163_1_1535961595994_7458230_66-344156-0_sp1.1 from training. Duration: 0.963625 2023-10-03 00:44:06,744 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.bypass.skip_rate, batch_count=1077626.6666666667, ans=0.07 2023-10-03 00:44:09,898 WARNING [train.py:1204] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_380-204756-0_sp1.1 from training. Duration: 0.7818125 2023-10-03 00:44:10,137 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.feed_forward2.hidden_balancer.prob, batch_count=1077626.6666666667, ans=0.125 2023-10-03 00:44:13,889 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_128-202104-0_sp0.9 from training. Duration: 0.7 2023-10-03 00:44:17,254 WARNING [train.py:1204] (2/4) Exclude cut with ID _4697_210_2069_1_1526815801953_3823098_51-1049-0_sp0.9 from training. Duration: 0.8333125 2023-10-03 00:44:17,271 WARNING [train.py:1204] (2/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_86-66192-0_sp1.1 from training. Duration: 0.5 2023-10-03 00:44:17,335 WARNING [train.py:1204] (2/4) Exclude cut with ID _14069_210_5632_1_1530512692033_4803390_95-1896-0_sp1.1 from training. Duration: 0.8909375 2023-10-03 00:44:23,366 WARNING [train.py:1204] (2/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_29-293896-0_sp1.1 from training. Duration: 0.5545625 2023-10-03 00:44:24,005 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.3.encoder.layers.3.feed_forward3.out_whiten, num_groups=1, num_channels=512, metric=12.30 vs. limit=15.0 2023-10-03 00:44:26,535 WARNING [train.py:1204] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_167-137255-0_sp0.9 from training. Duration: 0.711125 2023-10-03 00:44:26,536 WARNING [train.py:1204] (2/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_217-95056-0 from training. Duration: 0.98 2023-10-03 00:44:26,552 WARNING [train.py:1204] (2/4) Exclude cut with ID _47278_210_12041_1_1533902418349_3576110_97-266746-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 00:44:26,603 WARNING [train.py:1204] (2/4) Exclude cut with ID _14224_210_4381_1_1530529062037_4264010_380-343670-0_sp1.1 from training. Duration: 0.8 2023-10-03 00:44:29,518 WARNING [train.py:1204] (2/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_107-332398-0 from training. Duration: 0.78 2023-10-03 00:44:30,740 INFO [train.py:1046] (2/4) Epoch 31, batch 2300, loss[loss=0.155, simple_loss=0.2429, pruned_loss=0.03351, over 24445.00 frames. ], tot_loss[loss=0.1641, simple_loss=0.2426, pruned_loss=0.04282, over 4707460.35 frames. ], batch size: 63, lr: 3.30e-03, grad_scale: 8.0 2023-10-03 00:44:32,131 WARNING [train.py:1204] (2/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_91-267696-0_sp1.1 from training. Duration: 0.7909375 2023-10-03 00:44:32,156 WARNING [train.py:1204] (2/4) Exclude cut with ID _41298_210_9019_1_1533430371450_4822143_498-186993-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 00:44:39,622 WARNING [train.py:1204] (2/4) Exclude cut with ID _17956_210_4353_1_1531209568125_6979970_54-77610-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 00:44:41,066 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_40047_210_2290_1_1533290519_2374311_257-335162-0_sp1.1 from training. Duration: 0.863625 2023-10-03 00:44:42,476 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9653W0193-675471 from training. Duration: 0.9656875 2023-10-03 00:44:43,810 WARNING [train.py:1204] (2/4) Exclude cut with ID _25248_210_8750_1_1532062825941_5554180_243-240536-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 00:44:51,261 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_32360_210_4198_1_1532689160_3648436_60-51908-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 00:44:51,269 WARNING [train.py:1204] (2/4) Exclude cut with ID _4092_210_2151_1_1526643044449_3135759_227-213972-0_sp0.9 from training. Duration: 0.6 2023-10-03 00:44:51,313 WARNING [train.py:1204] (2/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_392-173500-0_sp1.1 from training. Duration: 0.97275 2023-10-03 00:44:51,341 WARNING [train.py:1204] (2/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_71-329150-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 00:44:51,343 WARNING [train.py:1204] (2/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_866-173440-0 from training. Duration: 0.9 2023-10-03 00:44:52,626 WARNING [train.py:1204] (2/4) Exclude cut with ID _14832_210_8750_1_1530691351539_3171348_1-54315-0_sp0.9 from training. Duration: 0.9666875 2023-10-03 00:44:55,881 WARNING [train.py:1204] (2/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_430-116139-0_sp1.1 from training. Duration: 0.77275 2023-10-03 00:44:57,174 WARNING [train.py:1204] (2/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_356-45054-0_sp0.9 from training. Duration: 0.9555625 2023-10-03 00:45:00,587 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_3531_210_3528_1_1526294203_5216456_309-227302-0_sp0.9 from training. Duration: 0.7666875 2023-10-03 00:45:01,351 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.2.encoder.layers.0.feed_forward3.out_whiten, num_groups=1, num_channels=384, metric=10.37 vs. limit=15.0 2023-10-03 00:45:03,318 WARNING [train.py:1204] (2/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_301-330455-0_sp1.1 from training. Duration: 0.72725 2023-10-03 00:45:06,103 WARNING [train.py:1204] (2/4) Exclude cut with ID _23557_210_8750_1_1531890106370_6846255_667-145945-0_sp1.1 from training. Duration: 0.92725 2023-10-03 00:45:06,453 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.conv_skip_rate, batch_count=1077893.3333333333, ans=0.0 2023-10-03 00:45:08,999 WARNING [train.py:1204] (2/4) Exclude cut with ID _14860_210_4915_1_1530702020340_7343240_836-328489-0_sp1.1 from training. Duration: 0.8181875 2023-10-03 00:45:10,772 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_37152_210_9697_1_1533106573_3844188_416-333249-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 00:45:12,218 WARNING [train.py:1204] (2/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_538-32291-0_sp0.9 from training. Duration: 0.92225 2023-10-03 00:45:14,943 WARNING [train.py:1204] (2/4) Exclude cut with ID _32704_210_8954_1_1533017488840_6747370_965-176824-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 00:45:19,796 WARNING [train.py:1204] (2/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_86-19601-0_sp1.1 from training. Duration: 0.77275 2023-10-03 00:45:20,974 WARNING [train.py:1204] (2/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_436-318113-0_sp1.1 from training. Duration: 0.6818125 2023-10-03 00:45:21,010 WARNING [train.py:1204] (2/4) Exclude cut with ID _41817_210_18835_1_1533434477160_7423049_555-293448-0_sp0.9 from training. Duration: 0.888875 2023-10-03 00:45:21,021 WARNING [train.py:1204] (2/4) Exclude cut with ID _4697_210_2069_1_1526815801953_3823098_68-219807-0 from training. Duration: 0.47 2023-10-03 00:45:25,542 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7544_210_6753_1_1528591327_1471426_33-346031-0_sp1.1 from training. Duration: 0.5545625 2023-10-03 00:45:26,776 WARNING [train.py:1204] (2/4) Exclude cut with ID _49748_210_24116_1_1533986971737_3158910_263-52113-0_sp1.1 from training. Duration: 0.97275 2023-10-03 00:45:26,830 WARNING [train.py:1204] (2/4) Exclude cut with ID _64639_210_5816_1_1535367619437_3528000_340-24580-0_sp1.1 from training. Duration: 0.92725 2023-10-03 00:45:26,844 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_47228_210_4983_1_1533780021_3672027_479-114941-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 00:45:26,879 WARNING [train.py:1204] (2/4) Exclude cut with ID _44585_210_18628_1_1533605350387_4518199_506-119659-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 00:45:28,212 WARNING [train.py:1204] (2/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_314-126819-0_sp1.1 from training. Duration: 0.4545625 2023-10-03 00:45:28,221 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_2541_210_1794_1_1525590269_3084765_156-173713-0_sp0.9 from training. Duration: 0.9 2023-10-03 00:45:30,176 WARNING [train.py:1204] (2/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_108-255430-0 from training. Duration: 0.97 2023-10-03 00:45:30,183 WARNING [train.py:1204] (2/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_791-45149-0_sp1.1 from training. Duration: 0.7818125 2023-10-03 00:45:30,189 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_53378_210_17282_1_1534227054_1261425_10-118295-0_sp1.1 from training. Duration: 0.97275 2023-10-03 00:45:30,239 WARNING [train.py:1204] (2/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_148-158509-0 from training. Duration: 0.41 2023-10-03 00:45:34,152 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.582e+02 1.897e+02 2.094e+02 2.362e+02 4.130e+02, threshold=4.187e+02, percent-clipped=1.0 2023-10-03 00:45:35,696 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_34726_210_4525_1_1533019474_5030395_687-291305-0_sp1.1 from training. Duration: 0.936375 2023-10-03 00:45:38,548 WARNING [train.py:1204] (2/4) Exclude cut with ID _14860_210_4915_1_1530702020340_7343240_264-324537-0_sp1.1 from training. Duration: 0.963625 2023-10-03 00:45:41,973 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_41251_210_15368_1_1533429767_4820695_591-46386-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 00:45:43,285 WARNING [train.py:1204] (2/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_86-19601-0_sp0.9 from training. Duration: 0.9444375 2023-10-03 00:45:43,307 WARNING [train.py:1204] (2/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_401-255491-0_sp0.9 from training. Duration: 0.77775 2023-10-03 00:45:43,584 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.balancer2.prob, batch_count=1078093.3333333333, ans=0.125 2023-10-03 00:45:44,634 INFO [train.py:1046] (2/4) Epoch 31, batch 2350, loss[loss=0.1623, simple_loss=0.2474, pruned_loss=0.03855, over 24620.00 frames. ], tot_loss[loss=0.1639, simple_loss=0.2428, pruned_loss=0.04248, over 4722501.02 frames. ], batch size: 68, lr: 3.30e-03, grad_scale: 8.0 2023-10-03 00:45:44,701 WARNING [train.py:1204] (2/4) Exclude cut with ID _8696_210_1876_1_1528545632440_3345058_209-54069-0_sp1.1 from training. Duration: 0.6090625 2023-10-03 00:45:44,715 WARNING [train.py:1204] (2/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_369-325184-0_sp1.1 from training. Duration: 0.9 2023-10-03 00:45:44,773 WARNING [train.py:1204] (2/4) Exclude cut with ID _14687_210_5854_1_1530703838259_3664889_115-239046-0_sp0.9 from training. Duration: 0.7444375 2023-10-03 00:45:46,179 WARNING [train.py:1204] (2/4) Exclude cut with ID _8064_210_5399_1_1528372299291_4282973_168-50337-0 from training. Duration: 0.84 2023-10-03 00:45:52,473 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.0.feed_forward1.hidden_balancer.prob, batch_count=1078093.3333333333, ans=0.125 2023-10-03 00:45:53,676 WARNING [train.py:1204] (2/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_909-64151-0_sp1.1 from training. Duration: 0.8545625 2023-10-03 00:45:53,687 WARNING [train.py:1204] (2/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_597-154640-0 from training. Duration: 0.9 2023-10-03 00:45:58,600 WARNING [train.py:1204] (2/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_126-274694-0 from training. Duration: 0.98 2023-10-03 00:46:02,665 WARNING [train.py:1204] (2/4) Exclude cut with ID _21783_210_11357_1_1531742458730_3762679_344-269786-0_sp1.1 from training. Duration: 0.97275 2023-10-03 00:46:05,405 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_37818_210_4238_1_1533620910_4720083_322-253154-0_sp1.1 from training. Duration: 0.92725 2023-10-03 00:46:05,410 WARNING [train.py:1204] (2/4) Exclude cut with ID _58895_210_8750_1_1535184081320_3649089_221-15823-0_sp1.1 from training. Duration: 0.92725 2023-10-03 00:46:05,434 WARNING [train.py:1204] (2/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_9-21737-0_sp1.1 from training. Duration: 0.8818125 2023-10-03 00:46:05,486 WARNING [train.py:1204] (2/4) Exclude cut with ID _14069_210_5632_1_1530512692033_4803390_522-341588-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 00:46:06,838 WARNING [train.py:1204] (2/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_363-185703-0 from training. Duration: 0.95 2023-10-03 00:46:09,748 WARNING [train.py:1204] (2/4) Exclude cut with ID _41817_210_18835_1_1533434477160_7423049_265-76216-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 00:46:13,912 WARNING [train.py:1204] (2/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_841-341679-0 from training. Duration: 0.58 2023-10-03 00:46:15,861 WARNING [train.py:1204] (2/4) Exclude cut with ID _55429_210_10626_1_1534467588527_7174143_97-335084-0_sp1.1 from training. Duration: 0.8818125 2023-10-03 00:46:20,006 WARNING [train.py:1204] (2/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_690-126306-0_sp0.9 from training. Duration: 0.8666875 2023-10-03 00:46:20,021 WARNING [train.py:1204] (2/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_102-92751-0_sp1.1 from training. Duration: 0.9 2023-10-03 00:46:21,994 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_3787_210_2040_1_1526477544_5478586_603-120324-0_sp1.1 from training. Duration: 0.736375 2023-10-03 00:46:24,036 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_33348_210_5632_1_1533086912_3324030_85-28583-0 from training. Duration: 0.82 2023-10-03 00:46:25,326 WARNING [train.py:1204] (2/4) Exclude cut with ID _60057_210_10640_1_1534903065793_3333150_362-230377-0_sp1.1 from training. Duration: 0.7909375 2023-10-03 00:46:26,796 WARNING [train.py:1204] (2/4) Exclude cut with ID _60113_210_16271_1_1535164212481_4386079_194-146486-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 00:46:26,815 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_53753_210_7234_1_1534320003_3594148_171-277668-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 00:46:26,843 WARNING [train.py:1204] (2/4) Exclude cut with ID _34423_210_8750_1_1533430798456_3330199_251-332788-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 00:46:31,383 WARNING [train.py:1204] (2/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_663-166973-0_sp0.9 from training. Duration: 0.888875 2023-10-03 00:46:32,762 WARNING [train.py:1204] (2/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_927-43827-0 from training. Duration: 0.74 2023-10-03 00:46:32,818 WARNING [train.py:1204] (2/4) Exclude cut with ID _28323_210_16187_1_1532480395667_4100300_306-74561-0_sp1.1 from training. Duration: 0.8545625 2023-10-03 00:46:34,382 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.feed_forward3.hidden_balancer.prob, batch_count=1078293.3333333333, ans=0.125 2023-10-03 00:46:35,596 WARNING [train.py:1204] (2/4) Exclude cut with ID _15037_210_2253_1_1530752294104_3267650_139-337989-0_sp1.1 from training. Duration: 0.92725 2023-10-03 00:46:35,613 WARNING [train.py:1204] (2/4) Exclude cut with ID _49039_210_6315_1_1533967537541_3846118_460-263670-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 00:46:37,062 WARNING [train.py:1204] (2/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_186-309161-0 from training. Duration: 0.54 2023-10-03 00:46:38,331 WARNING [train.py:1204] (2/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_87-278686-0_sp1.1 from training. Duration: 0.7 2023-10-03 00:46:39,721 WARNING [train.py:1204] (2/4) Exclude cut with ID _27052_210_5986_1_1532329197187_3585009_314-106499-0 from training. Duration: 0.92 2023-10-03 00:46:39,747 WARNING [train.py:1204] (2/4) Exclude cut with ID _3465_210_3525_1_1526175667060_3574049_412-287526-0_sp0.9 from training. Duration: 0.8 2023-10-03 00:46:42,881 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.bypass_mid.scale_min, batch_count=1078360.0, ans=0.2 2023-10-03 00:46:44,571 WARNING [train.py:1204] (2/4) Exclude cut with ID _22961_210_8254_1_1531976366672_4278180_317-199546-0 from training. Duration: 0.86 2023-10-03 00:46:47,334 WARNING [train.py:1204] (2/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_381-317791-0 from training. Duration: 0.73 2023-10-03 00:46:47,389 WARNING [train.py:1204] (2/4) Exclude cut with ID _44010_210_4525_1_1533884262964_4013620_560-310411-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 00:46:47,395 WARNING [train.py:1204] (2/4) Exclude cut with ID _5294_210_1747_1_1527249643928_3696179_342-270278-0_sp0.9 from training. Duration: 0.611125 2023-10-03 00:46:47,422 WARNING [train.py:1204] (2/4) Exclude cut with ID ID0473W0002-918951 from training. Duration: 0.924 2023-10-03 00:46:48,734 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1083-220122-0 from training. Duration: 0.51 2023-10-03 00:46:52,383 WARNING [train.py:1204] (2/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_138-154812-0 from training. Duration: 0.78 2023-10-03 00:46:56,574 WARNING [train.py:1204] (2/4) Exclude cut with ID _44982_210_1511_1_1534312820108_3726029_331-19420-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 00:46:58,646 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.3.encoder.layers.0.feed_forward3.out_whiten, num_groups=1, num_channels=512, metric=11.64 vs. limit=15.0 2023-10-03 00:46:59,196 INFO [train.py:1046] (2/4) Epoch 31, batch 2400, loss[loss=0.1426, simple_loss=0.224, pruned_loss=0.03064, over 24337.00 frames. ], tot_loss[loss=0.1643, simple_loss=0.2428, pruned_loss=0.04292, over 4719420.46 frames. ], batch size: 61, lr: 3.30e-03, grad_scale: 16.0 2023-10-03 00:46:59,386 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder_embed.conv.2.prob, batch_count=1078426.6666666667, ans=0.125 2023-10-03 00:47:01,195 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_2655_210_2087_1_1525688959_3897400_202-147557-0_sp0.9 from training. Duration: 0.9444375 2023-10-03 00:47:03,263 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.5.encoder.layers.1.feed_forward3.out_whiten, num_groups=1, num_channels=256, metric=11.90 vs. limit=15.0 2023-10-03 00:47:04,106 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_23850_210_4238_1_1532232597_1632433_32-185135-0_sp0.9 from training. Duration: 0.9555625 2023-10-03 00:47:05,491 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_14056_210_4163_1_1530604483_7572827_46-93547-0_sp1.1 from training. Duration: 0.863625 2023-10-03 00:47:06,766 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7931_210_1794_1_1528594728_5564059_280-279560-0 from training. Duration: 0.98 2023-10-03 00:47:06,811 WARNING [train.py:1204] (2/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_937-208281-0 from training. Duration: 0.85 2023-10-03 00:47:13,701 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_14928_210_5632_1_1530773461_4714000_454-112198-0_sp0.9 from training. Duration: 0.7333125 2023-10-03 00:47:14,985 WARNING [train.py:1204] (2/4) Exclude cut with ID _32704_210_8954_1_1533017488840_6747370_944-153333-0_sp1.1 from training. Duration: 0.963625 2023-10-03 00:47:16,796 WARNING [train.py:1204] (2/4) Exclude cut with ID _14821_210_8750_1_1530680503991_6829359_2-227151-0 from training. Duration: 0.83 2023-10-03 00:47:16,817 WARNING [train.py:1204] (2/4) Exclude cut with ID _28461_210_2253_1_1532771775189_3041299_96-152158-0_sp1.1 from training. Duration: 0.77275 2023-10-03 00:47:18,176 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7837_210_1792_1_1528453445_5841305_654-54195-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 00:47:19,528 WARNING [train.py:1204] (2/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_344-152526-0 from training. Duration: 0.53 2023-10-03 00:47:22,986 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_30167_210_3528_1_1532428012_4772375_480-142230-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 00:47:27,570 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_160-218721-0 from training. Duration: 0.96 2023-10-03 00:47:28,412 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.1.encoder.layers.1.conv_module2.whiten, num_groups=1, num_channels=256, metric=15.43 vs. limit=15.0 2023-10-03 00:47:30,383 WARNING [train.py:1204] (2/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_65-88242-0_sp0.9 from training. Duration: 0.62225 2023-10-03 00:47:34,977 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_34886_210_10633_1_1533203882_1755661_76-156096-0 from training. Duration: 0.87 2023-10-03 00:47:37,729 WARNING [train.py:1204] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_188-38388-0_sp1.1 from training. Duration: 0.92725 2023-10-03 00:47:39,166 WARNING [train.py:1204] (2/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_81-289335-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 00:47:42,027 WARNING [train.py:1204] (2/4) Exclude cut with ID _59493_210_17282_1_1535078419077_3225879_110-8778-0_sp1.1 from training. Duration: 0.936375 2023-10-03 00:47:43,356 WARNING [train.py:1204] (2/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_188-260011-0 from training. Duration: 0.73 2023-10-03 00:47:43,389 WARNING [train.py:1204] (2/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_453-133983-0_sp0.9 from training. Duration: 0.8666875 2023-10-03 00:47:49,374 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.3.encoder.layers.1.feed_forward2.out_whiten, num_groups=1, num_channels=512, metric=6.15 vs. limit=15.0 2023-10-03 00:47:50,105 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8054_210_7927_1_1528627721_4984094_553-17379-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 00:47:53,155 WARNING [train.py:1204] (2/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_341-171072-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 00:47:56,500 WARNING [train.py:1204] (2/4) Exclude cut with ID _30800_210_22382_1_1532499347904_4515959_265-257286-0_sp1.1 from training. Duration: 0.963625 2023-10-03 00:47:56,588 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_403-39619-0_sp0.9 from training. Duration: 0.9333125 2023-10-03 00:47:56,592 WARNING [train.py:1204] (2/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_464-33931-0_sp0.9 from training. Duration: 0.6 2023-10-03 00:47:56,743 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.bypass.skip_rate, batch_count=1078626.6666666667, ans=0.04949747468305833 2023-10-03 00:47:57,860 WARNING [train.py:1204] (2/4) Exclude cut with ID _60804_210_9642_1_1534831199859_7268299_632-143863-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 00:47:57,867 WARNING [train.py:1204] (2/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_431-310997-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 00:47:57,915 WARNING [train.py:1204] (2/4) Exclude cut with ID _31649_210_6228_1_1532584608063_4091078_114-263860-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 00:47:57,927 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6961_210_2087_1_1527937168_6815011_562-165518-0_sp0.9 from training. Duration: 0.8555625 2023-10-03 00:47:59,666 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.1.conv_module2.balancer1.max_abs, batch_count=1078693.3333333333, ans=10.0 2023-10-03 00:48:02,192 WARNING [train.py:1204] (2/4) Exclude cut with ID _27052_210_5986_1_1532329197187_3585009_338-325870-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 00:48:03,808 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.603e+02 1.898e+02 2.077e+02 2.404e+02 3.282e+02, threshold=4.153e+02, percent-clipped=0.0 2023-10-03 00:48:03,878 WARNING [train.py:1204] (2/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_469-94673-0_sp1.1 from training. Duration: 0.7181875 2023-10-03 00:48:03,898 WARNING [train.py:1204] (2/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_259-34038-0 from training. Duration: 0.74 2023-10-03 00:48:03,986 WARNING [train.py:1204] (2/4) Exclude cut with ID _14922_210_6228_1_1530707435273_3693310_388-129552-0 from training. Duration: 0.96 2023-10-03 00:48:06,660 WARNING [train.py:1204] (2/4) Exclude cut with ID _28394_210_8913_1_1532430049518_3650118_342-251455-0_sp1.1 from training. Duration: 0.936375 2023-10-03 00:48:06,678 WARNING [train.py:1204] (2/4) Exclude cut with ID _53731_210_7994_1_1534553979244_3757214_8-191390-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 00:48:08,036 WARNING [train.py:1204] (2/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_613-232856-0 from training. Duration: 0.54 2023-10-03 00:48:08,103 WARNING [train.py:1204] (2/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_557-349534-0 from training. Duration: 0.91 2023-10-03 00:48:09,404 WARNING [train.py:1204] (2/4) Exclude cut with ID _14224_210_4381_1_1530529062037_4264010_379-184223-0 from training. Duration: 0.85 2023-10-03 00:48:09,408 WARNING [train.py:1204] (2/4) Exclude cut with ID IC0125W0001-61807 from training. Duration: 0.966 2023-10-03 00:48:10,695 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_229-45300-0 from training. Duration: 0.57 2023-10-03 00:48:10,784 WARNING [train.py:1204] (2/4) Exclude cut with ID _30304_210_6319_1_1532476827261_7582060_1069-24743-0_sp1.1 from training. Duration: 0.92725 2023-10-03 00:48:10,951 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.attention_skip_rate, batch_count=1078693.3333333333, ans=0.0 2023-10-03 00:48:11,064 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.feed_forward2.hidden_balancer.prob, batch_count=1078693.3333333333, ans=0.125 2023-10-03 00:48:12,222 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_41452_210_7291_1_1533437687_3941281_323-172873-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 00:48:12,239 WARNING [train.py:1204] (2/4) Exclude cut with ID _37086_210_9038_1_1532999540891_7021250_802-226709-0_sp1.1 from training. Duration: 0.97275 2023-10-03 00:48:12,492 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.feed_forward1.out_proj.dropout_p, batch_count=1078760.0, ans=0.1 2023-10-03 00:48:13,486 INFO [train.py:1046] (2/4) Epoch 31, batch 2450, loss[loss=0.1615, simple_loss=0.235, pruned_loss=0.04403, over 23773.00 frames. ], tot_loss[loss=0.1626, simple_loss=0.2405, pruned_loss=0.04242, over 4704135.56 frames. ], batch size: 212, lr: 3.30e-03, grad_scale: 16.0 2023-10-03 00:48:13,621 WARNING [train.py:1204] (2/4) Exclude cut with ID IC0144W0500-71767 from training. Duration: 0.931875 2023-10-03 00:48:15,004 WARNING [train.py:1204] (2/4) Exclude cut with ID _39846_210_12651_1_1533605310265_2115212_233-129699-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 00:48:15,071 WARNING [train.py:1204] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_574-45541-0_sp0.9 from training. Duration: 0.72225 2023-10-03 00:48:17,237 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.bypass.skip_rate, batch_count=1078760.0, ans=0.09899494936611666 2023-10-03 00:48:18,137 WARNING [train.py:1204] (2/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_372-298093-0_sp1.1 from training. Duration: 0.72725 2023-10-03 00:48:18,148 WARNING [train.py:1204] (2/4) Exclude cut with ID _62574_210_2655_1_1535180407058_3933249_34-26343-0_sp1.1 from training. Duration: 0.97275 2023-10-03 00:48:22,793 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_512-256238-0_sp1.1 from training. Duration: 0.963625 2023-10-03 00:48:22,799 WARNING [train.py:1204] (2/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_196-55502-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 00:48:24,220 WARNING [train.py:1204] (2/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_195-167011-0 from training. Duration: 0.75 2023-10-03 00:48:27,551 WARNING [train.py:1204] (2/4) Exclude cut with ID _13678_210_7497_1_1530530069226_3463170_345-230416-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 00:48:27,567 WARNING [train.py:1204] (2/4) Exclude cut with ID _51078_210_9070_1_1534161557424_3805250_225-327809-0_sp1.1 from training. Duration: 0.963625 2023-10-03 00:48:31,690 WARNING [train.py:1204] (2/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_18-189198-0_sp1.1 from training. Duration: 0.8181875 2023-10-03 00:48:31,709 WARNING [train.py:1204] (2/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_329-82235-0_sp1.1 from training. Duration: 0.7454375 2023-10-03 00:48:31,724 WARNING [train.py:1204] (2/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_726-270780-0_sp0.9 from training. Duration: 0.9555625 2023-10-03 00:48:31,779 WARNING [train.py:1204] (2/4) Exclude cut with ID _66873_210_5517_1_1535454136506_4293370_654-52713-0 from training. Duration: 0.77 2023-10-03 00:48:34,992 WARNING [train.py:1204] (2/4) Exclude cut with ID _20455_210_5219_1_1531565996775_8098100_191-16006-0_sp1.1 from training. Duration: 0.963625 2023-10-03 00:48:36,518 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.conv_module2.balancer1.prob, batch_count=1078826.6666666667, ans=0.125 2023-10-03 00:48:37,620 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_3171_210_2087_1_1526109084_2330007_66-260583-0_sp1.1 from training. Duration: 0.6545625 2023-10-03 00:48:37,659 WARNING [train.py:1204] (2/4) Exclude cut with ID _38670_210_15379_1_1533448810659_4391059_63-129006-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 00:48:40,581 WARNING [train.py:1204] (2/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_393-57888-0_sp0.9 from training. Duration: 0.711125 2023-10-03 00:48:40,622 WARNING [train.py:1204] (2/4) Exclude cut with ID _6275_210_2084_1_1528110062213_7669049_857-298942-0_sp0.9 from training. Duration: 0.9444375 2023-10-03 00:48:40,774 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.feed_forward2.hidden_balancer.prob, batch_count=1078826.6666666667, ans=0.125 2023-10-03 00:48:42,710 WARNING [train.py:1204] (2/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_239-22076-0_sp0.9 from training. Duration: 0.9444375 2023-10-03 00:48:42,733 WARNING [train.py:1204] (2/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_424-227947-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 00:48:44,077 INFO [scaling.py:1118] (2/4) WithLoss: name=encoder.encoders.0.layers.1.self_attn_weights, loss-sum=0.000e+00 2023-10-03 00:48:45,303 WARNING [train.py:1204] (2/4) Exclude cut with ID _14069_210_5632_1_1530512692033_4803390_95-1896-0 from training. Duration: 0.98 2023-10-03 00:48:45,440 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.conv_module1.balancer2.prob, batch_count=1078893.3333333333, ans=0.125 2023-10-03 00:48:48,066 WARNING [train.py:1204] (2/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_96-244299-0_sp0.9 from training. Duration: 0.97775 2023-10-03 00:48:56,929 WARNING [train.py:1204] (2/4) Exclude cut with ID _46683_210_9642_1_1533730913492_4591116_562-319825-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 00:48:58,289 WARNING [train.py:1204] (2/4) Exclude cut with ID _64639_210_5816_1_1535367619437_3528000_284-173474-0_sp1.1 from training. Duration: 0.963625 2023-10-03 00:48:58,312 WARNING [train.py:1204] (2/4) Exclude cut with ID _29529_210_7234_1_1532393961455_3977460_497-100133-0_sp1.1 from training. Duration: 0.936375 2023-10-03 00:48:58,338 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_12868_210_6753_1_1530701932_530275_18-76322-0_sp0.9 from training. Duration: 0.9666875 2023-10-03 00:48:59,679 WARNING [train.py:1204] (2/4) Exclude cut with ID _50093_210_4220_1_1534417141207_3432299_116-303447-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 00:48:59,761 WARNING [train.py:1204] (2/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_44-79427-0_sp1.1 from training. Duration: 0.8909375 2023-10-03 00:49:01,142 WARNING [train.py:1204] (2/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_887-249555-0 from training. Duration: 0.77 2023-10-03 00:49:02,740 WARNING [train.py:1204] (2/4) Exclude cut with ID _28461_210_2253_1_1532771775189_3041299_96-152158-0_sp0.9 from training. Duration: 0.9444375 2023-10-03 00:49:04,067 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_14058_210_4163_1_1530690953_6327180_49-210687-0_sp1.1 from training. Duration: 0.8545625 2023-10-03 00:49:08,607 WARNING [train.py:1204] (2/4) Exclude cut with ID _40399_210_4353_1_1533305658194_3279406_351-127903-0_sp1.1 from training. Duration: 0.97275 2023-10-03 00:49:08,613 WARNING [train.py:1204] (2/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_184-206939-0_sp1.1 from training. Duration: 0.936375 2023-10-03 00:49:13,973 WARNING [train.py:1204] (2/4) Exclude cut with ID _14100_210_8341_1_1530529320613_3678069_258-224599-0_sp1.1 from training. Duration: 0.7 2023-10-03 00:49:13,996 WARNING [train.py:1204] (2/4) Exclude cut with ID _6493_210_1747_1_1528459210424_4012470_324-323854-0 from training. Duration: 0.92 2023-10-03 00:49:15,327 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_12868_210_6753_1_1530702479_3610564_151-325626-0_sp1.1 from training. Duration: 0.8818125 2023-10-03 00:49:15,399 WARNING [train.py:1204] (2/4) Exclude cut with ID _14528_210_8254_1_1530669700514_4306939_106-43145-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 00:49:15,428 WARNING [train.py:1204] (2/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_686-54383-0 from training. Duration: 0.87 2023-10-03 00:49:16,067 WARNING [train.py:1204] (2/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_801-102362-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 00:49:17,435 WARNING [train.py:1204] (2/4) Exclude cut with ID _48677_210_5663_1_1533902405919_3892989_408-40740-0_sp0.9 from training. Duration: 0.92225 2023-10-03 00:49:21,526 WARNING [train.py:1204] (2/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_448-212135-0_sp1.1 from training. Duration: 0.82725 2023-10-03 00:49:23,526 WARNING [train.py:1204] (2/4) Exclude cut with ID _59170_210_6721_1_1534983633960_4203335_320-124047-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 00:49:24,857 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_4692_210_3528_1_1526899755_4323254_107-44401-0_sp1.1 from training. Duration: 0.7818125 2023-10-03 00:49:27,727 INFO [train.py:1046] (2/4) Epoch 31, batch 2500, loss[loss=0.1621, simple_loss=0.2425, pruned_loss=0.04089, over 23622.00 frames. ], tot_loss[loss=0.1617, simple_loss=0.2391, pruned_loss=0.04218, over 4699851.32 frames. ], batch size: 149, lr: 3.30e-03, grad_scale: 16.0 2023-10-03 00:49:27,776 WARNING [train.py:1204] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_731-2428-0 from training. Duration: 0.83 2023-10-03 00:49:29,647 WARNING [train.py:1204] (2/4) Exclude cut with ID _29857_210_20034_1_1532395886753_3798160_335-163562-0_sp1.1 from training. Duration: 0.77275 2023-10-03 00:49:32,656 WARNING [train.py:1204] (2/4) Exclude cut with ID _14832_210_8750_1_1530691351539_3171348_171-275004-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 00:49:41,692 WARNING [train.py:1204] (2/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_105-328174-0_sp0.9 from training. Duration: 0.8444375 2023-10-03 00:49:41,723 WARNING [train.py:1204] (2/4) Exclude cut with ID _68965_210_1883_1_1535698962272_3497490_164-118421-0_sp1.1 from training. Duration: 0.936375 2023-10-03 00:49:43,040 WARNING [train.py:1204] (2/4) Exclude cut with ID _19896_210_1883_1_1531709022662_6649569_380-24170-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 00:49:43,044 WARNING [train.py:1204] (2/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_505-205270-0_sp1.1 from training. Duration: 0.3454375 2023-10-03 00:49:49,221 WARNING [train.py:1204] (2/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_427-208901-0_sp1.1 from training. Duration: 0.6181875 2023-10-03 00:49:50,534 WARNING [train.py:1204] (2/4) Exclude cut with ID _23818_210_6228_1_1531917063884_3676920_85-326916-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 00:49:50,618 WARNING [train.py:1204] (2/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_101-293367-0_sp1.1 from training. Duration: 0.57275 2023-10-03 00:49:50,632 WARNING [train.py:1204] (2/4) Exclude cut with ID _5409_210_1388_1_1527292850189_5865191_178-127970-0_sp1.1 from training. Duration: 0.5181875 2023-10-03 00:49:51,928 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_403-39619-0 from training. Duration: 0.84 2023-10-03 00:49:54,969 WARNING [train.py:1204] (2/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_427-87841-0_sp1.1 from training. Duration: 0.963625 2023-10-03 00:49:55,021 WARNING [train.py:1204] (2/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_286-68181-0_sp1.1 from training. Duration: 0.97275 2023-10-03 00:49:56,370 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6673_210_3273_1_1527767790_3173240_327-306185-0 from training. Duration: 0.83 2023-10-03 00:49:56,384 WARNING [train.py:1204] (2/4) Exclude cut with ID _3675_210_4557_1_1526639934408_4182089_106-203713-0_sp1.1 from training. Duration: 0.963625 2023-10-03 00:49:56,439 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_53071_210_16554_1_1534493434_1276796_130-36076-0 from training. Duration: 0.95 2023-10-03 00:49:56,459 WARNING [train.py:1204] (2/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_586-93054-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 00:50:01,219 WARNING [train.py:1204] (2/4) Exclude cut with ID _23825_210_13449_1_1531915313526_3497150_262-10571-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 00:50:02,582 WARNING [train.py:1204] (2/4) Exclude cut with ID _28783_210_7943_1_1532671164808_7155479_432-67885-0_sp1.1 from training. Duration: 0.97275 2023-10-03 00:50:04,127 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=1079226.6666666667, ans=0.1 2023-10-03 00:50:05,316 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6287_210_3273_1_1527509506_2653716_182-199887-0_sp0.9 from training. Duration: 0.5666875 2023-10-03 00:50:05,366 WARNING [train.py:1204] (2/4) Exclude cut with ID _3231_210_2106_1_1526205864934_6784220_171-256004-0 from training. Duration: 0.86 2023-10-03 00:50:07,334 WARNING [train.py:1204] (2/4) Exclude cut with ID _69051_210_1531_1_1535885566901_3829538_332-92090-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 00:50:08,615 WARNING [train.py:1204] (2/4) Exclude cut with ID _3675_210_4557_1_1526639934408_4182089_104-324125-0_sp1.1 from training. Duration: 0.963625 2023-10-03 00:50:12,704 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_41764_210_2521_1_1533686110_4089313_501-2119-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 00:50:15,575 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_14056_210_4163_1_1530604483_7572827_56-100988-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 00:50:20,242 WARNING [train.py:1204] (2/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_398-239819-0_sp1.1 from training. Duration: 0.9 2023-10-03 00:50:24,963 WARNING [train.py:1204] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_74-48717-0_sp1.1 from training. Duration: 0.52725 2023-10-03 00:50:27,660 WARNING [train.py:1204] (2/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_177-49092-0 from training. Duration: 0.91 2023-10-03 00:50:29,057 WARNING [train.py:1204] (2/4) Exclude cut with ID _62337_210_12392_1_1535009273105_3412229_44-176353-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 00:50:29,065 WARNING [train.py:1204] (2/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_110-130961-0_sp1.1 from training. Duration: 0.42725 2023-10-03 00:50:31,036 WARNING [train.py:1204] (2/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_37-189262-0_sp1.1 from training. Duration: 0.8909375 2023-10-03 00:50:31,037 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_3172_210_2087_1_1526292929_3987865_197-97762-0_sp0.9 from training. Duration: 0.8333125 2023-10-03 00:50:31,127 WARNING [train.py:1204] (2/4) Exclude cut with ID IC0677W0065-334897 from training. Duration: 0.896 2023-10-03 00:50:31,128 WARNING [train.py:1204] (2/4) Exclude cut with ID IC0677W0080-334912 from training. Duration: 0.768 2023-10-03 00:50:31,132 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_120-41668-0 from training. Duration: 0.7 2023-10-03 00:50:32,333 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.630e+02 1.760e+02 1.910e+02 2.096e+02 3.347e+02, threshold=3.821e+02, percent-clipped=0.0 2023-10-03 00:50:35,172 WARNING [train.py:1204] (2/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_460-205287-0_sp1.1 from training. Duration: 0.963625 2023-10-03 00:50:35,434 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.feed_forward3.hidden_balancer.prob, batch_count=1079360.0, ans=0.125 2023-10-03 00:50:36,645 WARNING [train.py:1204] (2/4) Exclude cut with ID _14632_210_9988_1_1530691279392_3923200_102-204477-0 from training. Duration: 0.97 2023-10-03 00:50:36,650 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_34815_210_6388_1_1533381320_3571903_279-305565-0 from training. Duration: 0.92 2023-10-03 00:50:36,725 WARNING [train.py:1204] (2/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_91-148831-0_sp1.1 from training. Duration: 0.9 2023-10-03 00:50:38,375 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6291_210_3273_1_1527683649_1078498_82-221196-0 from training. Duration: 0.76 2023-10-03 00:50:42,254 INFO [train.py:1046] (2/4) Epoch 31, batch 2550, loss[loss=0.1725, simple_loss=0.2573, pruned_loss=0.04391, over 23967.00 frames. ], tot_loss[loss=0.162, simple_loss=0.2396, pruned_loss=0.04217, over 4708946.13 frames. ], batch size: 86, lr: 3.30e-03, grad_scale: 16.0 2023-10-03 00:50:42,357 WARNING [train.py:1204] (2/4) Exclude cut with ID _38020_210_5986_1_1533112252259_7006089_493-102745-0 from training. Duration: 0.98 2023-10-03 00:50:45,102 WARNING [train.py:1204] (2/4) Exclude cut with ID _61133_210_9969_1_1535196777227_3696409_40-93442-0_sp1.1 from training. Duration: 0.92725 2023-10-03 00:50:45,643 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.self_attn1.whiten.whitening_limit, batch_count=1079426.6666666667, ans=22.5 2023-10-03 00:50:46,566 WARNING [train.py:1204] (2/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_45-258852-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 00:50:46,591 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_53071_210_16554_1_1534493434_1276796_130-36076-0_sp1.1 from training. Duration: 0.863625 2023-10-03 00:50:48,460 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8694_210_6400_1_1529065754_3260756_332-13553-0_sp1.1 from training. Duration: 0.92725 2023-10-03 00:50:48,537 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_405-339986-0 from training. Duration: 0.51 2023-10-03 00:50:48,787 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.nonlin_attention.balancer.max_positive, batch_count=1079426.6666666667, ans=0.95 2023-10-03 00:50:49,902 WARNING [train.py:1204] (2/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_927-43827-0_sp1.1 from training. Duration: 0.67275 2023-10-03 00:50:52,678 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_397-147196-0 from training. Duration: 0.8 2023-10-03 00:50:54,487 WARNING [train.py:1204] (2/4) Exclude cut with ID 3557-8342-0013-54691-0_sp1.1 from training. Duration: 0.836375 2023-10-03 00:50:55,922 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_47438_210_5399_1_1533797471_4513356_626-127722-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 00:50:58,637 WARNING [train.py:1204] (2/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_256-323883-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 00:50:58,642 WARNING [train.py:1204] (2/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_839-173017-0_sp0.9 from training. Duration: 0.4555625 2023-10-03 00:50:58,688 WARNING [train.py:1204] (2/4) Exclude cut with ID _5289_210_2084_1_1527678202736_3779299_392-261988-0_sp1.1 from training. Duration: 0.5909375 2023-10-03 00:50:58,721 WARNING [train.py:1204] (2/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_119-224384-0_sp1.1 from training. Duration: 0.936375 2023-10-03 00:51:00,036 WARNING [train.py:1204] (2/4) Exclude cut with ID _57523_210_1856_1_1534766294545_3732989_262-267933-0_sp1.1 from training. Duration: 0.963625 2023-10-03 00:51:03,231 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_35361_210_4238_1_1533262810_7793476_675-87012-0_sp0.9 from training. Duration: 0.988875 2023-10-03 00:51:03,257 WARNING [train.py:1204] (2/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_17-90541-0 from training. Duration: 0.94 2023-10-03 00:51:04,552 WARNING [train.py:1204] (2/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_535-14410-0_sp1.1 from training. Duration: 0.42725 2023-10-03 00:51:04,559 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_24673_210_6400_1_1531968633_778659_94-154511-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 00:51:04,563 WARNING [train.py:1204] (2/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_212-10501-0 from training. Duration: 0.94 2023-10-03 00:51:14,849 WARNING [train.py:1204] (2/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_383-285196-0_sp1.1 from training. Duration: 0.8818125 2023-10-03 00:51:19,615 WARNING [train.py:1204] (2/4) Exclude cut with ID _45560_210_21982_1_1533812603951_3584999_238-47020-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 00:51:19,631 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_21500_210_2054_1_1532149310_2716758_278-18053-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 00:51:20,945 WARNING [train.py:1204] (2/4) Exclude cut with ID _56235_210_10640_1_1534557335235_4161099_453-9626-0_sp1.1 from training. Duration: 0.936375 2023-10-03 00:51:21,039 WARNING [train.py:1204] (2/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_153-97441-0_sp1.1 from training. Duration: 0.5090625 2023-10-03 00:51:25,794 WARNING [train.py:1204] (2/4) Exclude cut with ID _67725_210_27767_1_1535608756671_5105359_431-120895-0_sp1.1 from training. Duration: 0.963625 2023-10-03 00:51:26,003 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.conv_module2.balancer2.prob, batch_count=1079626.6666666667, ans=0.125 2023-10-03 00:51:29,956 WARNING [train.py:1204] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_574-45541-0_sp1.1 from training. Duration: 0.5909375 2023-10-03 00:51:29,988 WARNING [train.py:1204] (2/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_162-103620-0_sp1.1 from training. Duration: 0.8454375 2023-10-03 00:51:29,999 WARNING [train.py:1204] (2/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_734-129761-0_sp1.1 from training. Duration: 0.7454375 2023-10-03 00:51:31,770 WARNING [train.py:1204] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_620-276793-0_sp1.1 from training. Duration: 0.536375 2023-10-03 00:51:31,818 WARNING [train.py:1204] (2/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_153-97441-0_sp0.9 from training. Duration: 0.62225 2023-10-03 00:51:34,652 WARNING [train.py:1204] (2/4) Exclude cut with ID _12733_210_8341_1_1530167424556_3646380_523-85621-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 00:51:35,965 WARNING [train.py:1204] (2/4) Exclude cut with ID _64048_210_10626_1_1535198382802_3835179_352-179834-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 00:51:39,472 WARNING [train.py:1204] (2/4) Exclude cut with ID _55162_210_17071_1_1534408248583_3753480_43-242364-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 00:51:39,488 WARNING [train.py:1204] (2/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_661-264857-0 from training. Duration: 0.93 2023-10-03 00:51:39,499 WARNING [train.py:1204] (2/4) Exclude cut with ID _6274_210_2084_1_1527937583837_7494020_325-237800-0_sp1.1 from training. Duration: 0.97275 2023-10-03 00:51:39,543 WARNING [train.py:1204] (2/4) Exclude cut with ID _55328_210_27767_1_1534580973260_4088170_385-171459-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 00:51:40,936 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_3780_210_2087_1_1526467389_2145572_176-107140-0_sp1.1 from training. Duration: 0.62725 2023-10-03 00:51:41,027 WARNING [train.py:1204] (2/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_375-244976-0_sp1.1 from training. Duration: 0.6818125 2023-10-03 00:51:42,442 WARNING [train.py:1204] (2/4) Exclude cut with ID _35941_210_15379_1_1532995294515_4023089_309-33162-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 00:51:50,106 WARNING [train.py:1204] (2/4) Exclude cut with ID _14459_210_2228_1_1531443641946_7516700_1185-22038-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 00:51:51,560 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_1197-69460-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 00:51:54,476 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9012W0022-501157 from training. Duration: 0.896 2023-10-03 00:51:57,152 INFO [train.py:1046] (2/4) Epoch 31, batch 2600, loss[loss=0.1562, simple_loss=0.2342, pruned_loss=0.0391, over 23578.00 frames. ], tot_loss[loss=0.1627, simple_loss=0.2405, pruned_loss=0.04244, over 4709817.40 frames. ], batch size: 149, lr: 3.29e-03, grad_scale: 16.0 2023-10-03 00:51:57,271 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9023W0009-506302 from training. Duration: 0.896 2023-10-03 00:51:57,286 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_2821_210_2715_1_1526108385_3763560_73-296673-0_sp1.1 from training. Duration: 0.7545625 2023-10-03 00:51:58,732 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9025W0113-507442 from training. Duration: 0.896 2023-10-03 00:52:00,153 WARNING [train.py:1204] (2/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_739-2098-0 from training. Duration: 0.92 2023-10-03 00:52:00,173 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9029W0332-509722 from training. Duration: 0.768 2023-10-03 00:52:02,247 WARNING [train.py:1204] (2/4) Exclude cut with ID _16908_210_5724_1_1531119157248_3772700_68-101953-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 00:52:02,263 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9042W0011-515617 from training. Duration: 0.768 2023-10-03 00:52:05,039 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_3019_210_2028_1_1526044245_3264865_191-72235-0 from training. Duration: 0.97 2023-10-03 00:52:06,430 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9052W0006-520792 from training. Duration: 0.896 2023-10-03 00:52:07,740 WARNING [train.py:1204] (2/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_245-174553-0_sp1.1 from training. Duration: 0.8 2023-10-03 00:52:09,159 WARNING [train.py:1204] (2/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_202-67017-0 from training. Duration: 0.82 2023-10-03 00:52:09,256 WARNING [train.py:1204] (2/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_304-59227-0 from training. Duration: 0.77 2023-10-03 00:52:10,619 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_3133_210_2087_1_1526106446_2324690_246-125000-0_sp0.9 from training. Duration: 0.688875 2023-10-03 00:52:12,435 WARNING [train.py:1204] (2/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_243-42116-0 from training. Duration: 0.73 2023-10-03 00:52:13,858 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9087W0121-538492 from training. Duration: 0.896 2023-10-03 00:52:13,872 WARNING [train.py:1204] (2/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_120-76843-0 from training. Duration: 0.55 2023-10-03 00:52:21,812 WARNING [train.py:1204] (2/4) Exclude cut with ID _7852_210_5219_1_1528286471316_8139400_850-304621-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 00:52:21,832 WARNING [train.py:1204] (2/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_154-285415-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 00:52:21,851 WARNING [train.py:1204] (2/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_538-278194-0_sp1.1 from training. Duration: 0.92725 2023-10-03 00:52:21,851 WARNING [train.py:1204] (2/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_119-207162-0 from training. Duration: 0.71 2023-10-03 00:52:23,364 WARNING [train.py:1204] (2/4) Exclude cut with ID _2334_210_2667_1_1525608238880_4705129_459-104997-0_sp1.1 from training. Duration: 0.863625 2023-10-03 00:52:26,620 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.4.encoder.layers.2.whiten, num_groups=1, num_channels=384, metric=2.71 vs. limit=12.0 2023-10-03 00:52:26,997 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.0.layers.1.feed_forward2.out_whiten, num_groups=1, num_channels=192, metric=4.22 vs. limit=15.0 2023-10-03 00:52:28,748 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9147W0019-567847 from training. Duration: 0.768 2023-10-03 00:52:34,816 WARNING [train.py:1204] (2/4) Exclude cut with ID _69884_210_9771_1_1535846392818_4166169_228-189439-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 00:52:34,848 WARNING [train.py:1204] (2/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_196-77172-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 00:52:36,173 WARNING [train.py:1204] (2/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_625-338546-0 from training. Duration: 0.88 2023-10-03 00:52:37,514 WARNING [train.py:1204] (2/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_193-327630-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 00:52:37,515 WARNING [train.py:1204] (2/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_391-24006-0_sp1.1 from training. Duration: 0.92725 2023-10-03 00:52:37,601 WARNING [train.py:1204] (2/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_197-28164-0 from training. Duration: 0.8 2023-10-03 00:52:40,436 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.0.conv_module1.balancer2.min_positive, batch_count=1079960.0, ans=0.05 2023-10-03 00:52:41,682 WARNING [train.py:1204] (2/4) Exclude cut with ID _8395_210_1531_1_1528597871523_4411579_431-132470-0_sp1.1 from training. Duration: 0.67275 2023-10-03 00:52:41,703 WARNING [train.py:1204] (2/4) Exclude cut with ID _35350_210_16040_1_1533040191183_4075299_465-248326-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 00:52:43,093 WARNING [train.py:1204] (2/4) Exclude cut with ID _16756_210_4915_1_1531047803763_7104259_101-141922-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 00:52:43,784 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.3.encoder.layers.0.nonlin_attention.whiten2, num_groups=1, num_channels=512, metric=7.62 vs. limit=15.0 2023-10-03 00:52:45,252 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.conv_module2.balancer1.prob, batch_count=1079960.0, ans=0.125 2023-10-03 00:52:47,160 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9212W0005-600172 from training. Duration: 0.896 2023-10-03 00:52:47,189 WARNING [train.py:1204] (2/4) Exclude cut with ID _63186_210_2084_1_1535414479705_3908649_378-166820-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 00:52:47,202 WARNING [train.py:1204] (2/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_52-211683-0_sp1.1 from training. Duration: 0.8181875 2023-10-03 00:52:52,005 WARNING [train.py:1204] (2/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_1147-237729-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 00:52:53,297 WARNING [train.py:1204] (2/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_234-14174-0_sp0.9 from training. Duration: 0.911125 2023-10-03 00:52:54,534 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_454-276429-0 from training. Duration: 0.87 2023-10-03 00:52:54,623 WARNING [train.py:1204] (2/4) Exclude cut with ID _53713_210_24116_1_1534314487762_7416751_755-165313-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 00:52:56,075 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8278_210_2550_1_1528637099_4220413_436-317072-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 00:52:56,136 WARNING [train.py:1204] (2/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_28-324729-0_sp1.1 from training. Duration: 0.8545625 2023-10-03 00:53:01,037 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.3.encoder.layers.3.feed_forward3.out_whiten, num_groups=1, num_channels=512, metric=12.22 vs. limit=15.0 2023-10-03 00:53:02,170 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.566e+02 1.914e+02 2.104e+02 2.405e+02 3.485e+02, threshold=4.207e+02, percent-clipped=0.0 2023-10-03 00:53:02,266 WARNING [train.py:1204] (2/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_326-165900-0 from training. Duration: 0.69 2023-10-03 00:53:03,622 WARNING [train.py:1204] (2/4) Exclude cut with ID _53489_210_6228_1_1534298357169_3835970_96-30246-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 00:53:05,083 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8275_210_4198_1_1528455320_3892571_491-17707-0_sp1.1 from training. Duration: 0.5818125 2023-10-03 00:53:08,030 WARNING [train.py:1204] (2/4) Exclude cut with ID _14348_210_8341_1_1530599434996_3778640_240-232370-0 from training. Duration: 0.84 2023-10-03 00:53:08,041 WARNING [train.py:1204] (2/4) Exclude cut with ID _45012_210_12651_1_1533608435136_5371380_54-112623-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 00:53:08,141 WARNING [train.py:1204] (2/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_328-264776-0_sp1.1 from training. Duration: 0.6090625 2023-10-03 00:53:09,607 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9325W0001-648427 from training. Duration: 0.768 2023-10-03 00:53:09,616 WARNING [train.py:1204] (2/4) Exclude cut with ID _56480_210_4220_1_1534679942079_3075409_49-74717-0_sp1.1 from training. Duration: 0.936375 2023-10-03 00:53:12,210 INFO [train.py:1046] (2/4) Epoch 31, batch 2650, loss[loss=0.1499, simple_loss=0.2249, pruned_loss=0.03743, over 23631.00 frames. ], tot_loss[loss=0.1634, simple_loss=0.2411, pruned_loss=0.04279, over 4707228.75 frames. ], batch size: 149, lr: 3.29e-03, grad_scale: 16.0 2023-10-03 00:53:12,337 WARNING [train.py:1204] (2/4) Exclude cut with ID _59500_210_8254_1_1534809610680_3736009_6-241869-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 00:53:12,562 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.out_combiner.scale_min, batch_count=1080093.3333333333, ans=0.2 2023-10-03 00:53:14,338 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder_embed.convnext.hidden_balancer.prob, batch_count=1080093.3333333333, ans=0.125 2023-10-03 00:53:15,736 WARNING [train.py:1204] (2/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_172-215932-0_sp0.9 from training. Duration: 0.7333125 2023-10-03 00:53:17,618 WARNING [train.py:1204] (2/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_17-90541-0_sp1.1 from training. Duration: 0.8545625 2023-10-03 00:53:19,074 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_43831_210_2158_1_1533725502_5872791_525-89447-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 00:53:20,483 WARNING [train.py:1204] (2/4) Exclude cut with ID _12302_210_5219_1_1529924446466_8107280_907-182949-0 from training. Duration: 0.92 2023-10-03 00:53:20,496 WARNING [train.py:1204] (2/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_402-102311-0_sp0.9 from training. Duration: 0.7444375 2023-10-03 00:53:20,536 WARNING [train.py:1204] (2/4) Exclude cut with ID _8021_210_7994_1_1528376451358_3653069_179-45206-0_sp0.9 from training. Duration: 0.9555625 2023-10-03 00:53:25,184 WARNING [train.py:1204] (2/4) Exclude cut with ID _40137_210_12210_1_1533290484046_3795180_249-187059-0 from training. Duration: 0.88 2023-10-03 00:53:25,271 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9653W0194-675472 from training. Duration: 0.9658125 2023-10-03 00:53:25,479 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.bypass.skip_rate, batch_count=1080093.3333333333, ans=0.04949747468305833 2023-10-03 00:53:27,942 WARNING [train.py:1204] (2/4) Exclude cut with ID _51332_210_9988_1_1534244371836_3801270_336-255604-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 00:53:29,413 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_20120_210_6753_1_1531643035_5482859_510-96996-0 from training. Duration: 0.87 2023-10-03 00:53:31,175 WARNING [train.py:1204] (2/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_1167-20437-0_sp1.1 from training. Duration: 0.92725 2023-10-03 00:53:31,230 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_3475_210_2028_1_1526193578_3622569_265-276625-0 from training. Duration: 0.76 2023-10-03 00:53:34,231 WARNING [train.py:1204] (2/4) Exclude cut with ID _14860_210_4915_1_1530702020340_7343240_470-172418-0_sp1.1 from training. Duration: 0.97275 2023-10-03 00:53:34,239 WARNING [train.py:1204] (2/4) Exclude cut with ID _5444_210_2151_1_1527418808464_5312210_581-315411-0_sp1.1 from training. Duration: 0.563625 2023-10-03 00:53:34,253 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_34450_210_9547_1_1533099443_4109050_519-342439-0_sp1.1 from training. Duration: 0.97275 2023-10-03 00:53:34,277 WARNING [train.py:1204] (2/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_50-175961-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 00:53:35,932 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.ff2_skip_rate, batch_count=1080160.0, ans=0.0 2023-10-03 00:53:39,639 WARNING [train.py:1204] (2/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_372-6869-0 from training. Duration: 0.44 2023-10-03 00:53:39,645 WARNING [train.py:1204] (2/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_3-237434-0 from training. Duration: 0.76 2023-10-03 00:53:41,254 WARNING [train.py:1204] (2/4) Exclude cut with ID _8772_210_1850_1_1528635595972_3664011_247-336448-0_sp1.1 from training. Duration: 0.67275 2023-10-03 00:53:44,101 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_427-78097-0 from training. Duration: 0.8 2023-10-03 00:53:44,118 WARNING [train.py:1204] (2/4) Exclude cut with ID _69884_210_9771_1_1535846392818_4166169_466-252727-0_sp1.1 from training. Duration: 0.97275 2023-10-03 00:53:46,068 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_15972_210_6400_1_1530877934_3405692_316-35478-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 00:53:46,106 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_809-17156-0_sp0.9 from training. Duration: 0.811125 2023-10-03 00:53:46,137 WARNING [train.py:1204] (2/4) Exclude cut with ID _57572_210_5517_1_1534921219624_3809484_594-263474-0_sp1.1 from training. Duration: 0.936375 2023-10-03 00:53:46,173 WARNING [train.py:1204] (2/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_190-228204-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 00:53:47,070 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.feed_forward2.hidden_balancer.prob, batch_count=1080226.6666666667, ans=0.125 2023-10-03 00:53:47,960 WARNING [train.py:1204] (2/4) Exclude cut with ID _19514_210_9259_1_1531530071330_5084230_117-21193-0_sp1.1 from training. Duration: 0.936375 2023-10-03 00:53:51,074 WARNING [train.py:1204] (2/4) Exclude cut with ID _69584_210_5219_1_1535785133775_4044450_190-21315-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 00:53:52,336 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_53071_210_16554_1_1534493434_1276796_184-302050-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 00:53:53,628 WARNING [train.py:1204] (2/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_120-76843-0_sp1.1 from training. Duration: 0.5 2023-10-03 00:53:54,967 WARNING [train.py:1204] (2/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_318-162017-0_sp1.1 from training. Duration: 0.82725 2023-10-03 00:53:56,433 WARNING [train.py:1204] (2/4) Exclude cut with ID _46067_210_4220_1_1534125571066_3307450_79-46864-0_sp1.1 from training. Duration: 0.92725 2023-10-03 00:53:57,769 WARNING [train.py:1204] (2/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_358-215558-0_sp1.1 from training. Duration: 0.8454375 2023-10-03 00:53:59,181 WARNING [train.py:1204] (2/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_621-66764-0_sp1.1 from training. Duration: 0.92725 2023-10-03 00:54:00,624 WARNING [train.py:1204] (2/4) Exclude cut with ID _42863_210_9070_1_1533902398788_3696920_338-156851-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 00:54:00,649 WARNING [train.py:1204] (2/4) Exclude cut with ID _5289_210_2084_1_1527678202736_3779299_392-261988-0_sp0.9 from training. Duration: 0.72225 2023-10-03 00:54:04,045 WARNING [train.py:1204] (2/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_409-84682-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 00:54:05,404 WARNING [train.py:1204] (2/4) Exclude cut with ID _9235_210_5219_1_1528891154253_8046120_30-326681-0_sp0.9 from training. Duration: 0.92225 2023-10-03 00:54:05,416 WARNING [train.py:1204] (2/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_389-184695-0_sp1.1 from training. Duration: 0.92725 2023-10-03 00:54:06,839 WARNING [train.py:1204] (2/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_442-320317-0 from training. Duration: 0.82 2023-10-03 00:54:12,132 WARNING [train.py:1204] (2/4) Exclude cut with ID _44778_210_2054_1_1533798127203_7443090_756-29911-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 00:54:14,836 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_401-112036-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 00:54:16,271 WARNING [train.py:1204] (2/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_181-308827-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 00:54:16,343 WARNING [train.py:1204] (2/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_332-258725-0_sp1.1 from training. Duration: 0.963625 2023-10-03 00:54:16,423 WARNING [train.py:1204] (2/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_188-260011-0_sp0.9 from training. Duration: 0.811125 2023-10-03 00:54:17,795 WARNING [train.py:1204] (2/4) Exclude cut with ID _49039_210_6315_1_1533967537541_3846118_99-302239-0_sp1.1 from training. Duration: 0.963625 2023-10-03 00:54:19,207 WARNING [train.py:1204] (2/4) Exclude cut with ID _4122_210_2084_1_1526713164417_4353280_312-294828-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 00:54:19,212 WARNING [train.py:1204] (2/4) Exclude cut with ID _14922_210_6228_1_1530707435273_3693310_303-228738-0 from training. Duration: 0.84 2023-10-03 00:54:23,033 WARNING [train.py:1204] (2/4) Exclude cut with ID _14349_210_8341_1_1530615710818_3442470_115-285975-0_sp0.9 from training. Duration: 0.9555625 2023-10-03 00:54:24,324 WARNING [train.py:1204] (2/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_125-144174-0_sp1.1 from training. Duration: 0.3545625 2023-10-03 00:54:25,758 WARNING [train.py:1204] (2/4) Exclude cut with ID _53565_210_10635_1_1534337218335_3820259_351-54570-0_sp1.1 from training. Duration: 0.97275 2023-10-03 00:54:25,783 WARNING [train.py:1204] (2/4) Exclude cut with ID _28317_210_12742_1_1532480302381_8510325_410-40065-0_sp1.1 from training. Duration: 0.963625 2023-10-03 00:54:26,998 INFO [train.py:1046] (2/4) Epoch 31, batch 2700, loss[loss=0.1984, simple_loss=0.262, pruned_loss=0.06745, over 19743.00 frames. ], tot_loss[loss=0.1639, simple_loss=0.242, pruned_loss=0.04288, over 4711258.21 frames. ], batch size: 389, lr: 3.29e-03, grad_scale: 16.0 2023-10-03 00:54:27,046 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_38965_210_4852_1_1533174906_3484735_167-339442-0_sp1.1 from training. Duration: 0.963625 2023-10-03 00:54:28,444 WARNING [train.py:1204] (2/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_265-164486-0_sp1.1 from training. Duration: 0.87275 2023-10-03 00:54:28,462 WARNING [train.py:1204] (2/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_725-182484-0_sp1.1 from training. Duration: 0.92725 2023-10-03 00:54:28,493 WARNING [train.py:1204] (2/4) Exclude cut with ID _28317_210_12742_1_1532480302381_8510325_607-213951-0_sp0.9 from training. Duration: 0.9333125 2023-10-03 00:54:29,703 WARNING [train.py:1204] (2/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_605-133395-0_sp1.1 from training. Duration: 0.6 2023-10-03 00:54:29,719 WARNING [train.py:1204] (2/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_351-294650-0 from training. Duration: 0.63 2023-10-03 00:54:29,785 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_14953_210_2151_1_1530701710_5616165_710-165400-0_sp1.1 from training. Duration: 0.7909375 2023-10-03 00:54:32,450 WARNING [train.py:1204] (2/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_237-58433-0_sp1.1 from training. Duration: 0.67275 2023-10-03 00:54:33,930 WARNING [train.py:1204] (2/4) Exclude cut with ID _5190_210_4557_1_1527244368799_373439_28-197030-0_sp1.1 from training. Duration: 0.6545625 2023-10-03 00:54:33,987 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_743-327651-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 00:54:37,162 WARNING [train.py:1204] (2/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_76-203867-0_sp0.9 from training. Duration: 0.8 2023-10-03 00:54:38,542 WARNING [train.py:1204] (2/4) Exclude cut with ID _14603_210_4220_1_1530615688441_3097319_274-274798-0 from training. Duration: 0.98 2023-10-03 00:54:39,824 WARNING [train.py:1204] (2/4) Exclude cut with ID _14087_210_2253_1_1530529224512_3014029_226-242140-0_sp1.1 from training. Duration: 0.736375 2023-10-03 00:54:44,052 WARNING [train.py:1204] (2/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_352-59958-0_sp1.1 from training. Duration: 0.7818125 2023-10-03 00:54:45,322 WARNING [train.py:1204] (2/4) Exclude cut with ID _39411_210_5986_1_1533538825030_3343649_191-251819-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 00:54:49,545 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7046_210_6753_1_1527895337_6920917_642-106799-0_sp1.1 from training. Duration: 0.836375 2023-10-03 00:54:49,551 WARNING [train.py:1204] (2/4) Exclude cut with ID _22826_210_9994_1_1531908201522_3279169_412-3442-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 00:54:49,569 WARNING [train.py:1204] (2/4) Exclude cut with ID _60057_210_10640_1_1534903065793_3333150_171-181133-0_sp0.9 from training. Duration: 0.97775 2023-10-03 00:54:49,577 WARNING [train.py:1204] (2/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_154-135696-0_sp1.1 from training. Duration: 0.72725 2023-10-03 00:54:53,290 WARNING [train.py:1204] (2/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_148-186805-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 00:54:54,751 WARNING [train.py:1204] (2/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_605-165641-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 00:54:56,681 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_174-277364-0_sp0.9 from training. Duration: 0.788875 2023-10-03 00:54:56,701 WARNING [train.py:1204] (2/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_89-321955-0_sp0.9 from training. Duration: 0.888875 2023-10-03 00:55:00,764 WARNING [train.py:1204] (2/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_438-105429-0_sp1.1 from training. Duration: 0.963625 2023-10-03 00:55:00,767 WARNING [train.py:1204] (2/4) Exclude cut with ID _14970_210_15709_1_1530773543493_1715730_187-20259-0_sp0.9 from training. Duration: 0.988875 2023-10-03 00:55:08,163 WARNING [train.py:1204] (2/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_385-193052-0_sp0.9 from training. Duration: 0.9555625 2023-10-03 00:55:08,216 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7113_210_1849_1_1527942685_3448768_194-311626-0_sp1.1 from training. Duration: 0.92725 2023-10-03 00:55:09,884 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.feed_forward2.hidden_balancer.prob, batch_count=1080626.6666666667, ans=0.125 2023-10-03 00:55:10,979 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_3780_210_2087_1_1526467389_2145572_176-107140-0_sp0.9 from training. Duration: 0.7666875 2023-10-03 00:55:10,980 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_36756_210_7234_1_1533026991_966276_123-213457-0_sp1.1 from training. Duration: 0.936375 2023-10-03 00:55:13,705 WARNING [train.py:1204] (2/4) Exclude cut with ID _23557_210_8750_1_1531890106370_6846255_456-244861-0_sp1.1 from training. Duration: 0.963625 2023-10-03 00:55:13,766 WARNING [train.py:1204] (2/4) Exclude cut with ID _37086_210_9038_1_1532999540891_7021250_242-349170-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 00:55:15,116 WARNING [train.py:1204] (2/4) Exclude cut with ID _41298_210_9019_1_1533430371450_4822143_471-281351-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 00:55:16,522 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_53378_210_17282_1_1534221071_5769509_572-126176-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 00:55:19,847 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_1507-263032-0_sp1.1 from training. Duration: 0.963625 2023-10-03 00:55:21,189 WARNING [train.py:1204] (2/4) Exclude cut with ID _13679_210_7497_1_1530534293662_3355098_411-186088-0_sp1.1 from training. Duration: 0.8545625 2023-10-03 00:55:24,446 WARNING [train.py:1204] (2/4) Exclude cut with ID _29345_210_4381_1_1532689168263_3860169_149-257268-0_sp1.1 from training. Duration: 0.736375 2023-10-03 00:55:25,939 WARNING [train.py:1204] (2/4) Exclude cut with ID _47790_210_16187_1_1533862814066_4207519_381-252014-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 00:55:25,945 WARNING [train.py:1204] (2/4) Exclude cut with ID _35799_210_5854_1_1533263487887_3529071_512-2717-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 00:55:28,684 WARNING [train.py:1204] (2/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_505-205270-0_sp0.9 from training. Duration: 0.42225 2023-10-03 00:55:30,053 WARNING [train.py:1204] (2/4) Exclude cut with ID _14998_210_5219_1_1530705582080_7474970_731-20117-0_sp1.1 from training. Duration: 0.936375 2023-10-03 00:55:31,496 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.607e+02 1.858e+02 1.974e+02 2.152e+02 3.142e+02, threshold=3.947e+02, percent-clipped=0.0 2023-10-03 00:55:31,599 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_37351_210_7925_1_1533012931_3812113_265-85780-0_sp1.1 from training. Duration: 0.8 2023-10-03 00:55:31,605 WARNING [train.py:1204] (2/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_313-134930-0 from training. Duration: 0.97 2023-10-03 00:55:32,983 WARNING [train.py:1204] (2/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_374-138971-0 from training. Duration: 0.94 2023-10-03 00:55:33,056 WARNING [train.py:1204] (2/4) Exclude cut with ID _30304_210_6319_1_1532476827261_7582060_1227-287886-0_sp1.1 from training. Duration: 0.936375 2023-10-03 00:55:33,194 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.conv_skip_rate, batch_count=1080693.3333333333, ans=0.0 2023-10-03 00:55:37,600 WARNING [train.py:1204] (2/4) Exclude cut with ID _59493_210_17282_1_1535078419077_3225879_332-217544-0_sp1.1 from training. Duration: 0.97275 2023-10-03 00:55:39,027 WARNING [train.py:1204] (2/4) Exclude cut with ID _23514_210_1389_1_1531832139785_3031439_361-119640-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 00:55:40,419 INFO [train.py:1046] (2/4) Epoch 31, batch 2750, loss[loss=0.1624, simple_loss=0.2499, pruned_loss=0.03744, over 24568.00 frames. ], tot_loss[loss=0.1631, simple_loss=0.2411, pruned_loss=0.04256, over 4715389.97 frames. ], batch size: 71, lr: 3.29e-03, grad_scale: 8.0 2023-10-03 00:55:41,856 WARNING [train.py:1204] (2/4) Exclude cut with ID _69584_210_5219_1_1535785133775_4044450_142-118058-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 00:55:41,879 WARNING [train.py:1204] (2/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_178-177582-0_sp1.1 from training. Duration: 0.67275 2023-10-03 00:55:41,932 WARNING [train.py:1204] (2/4) Exclude cut with ID _25303_210_17519_1_1532050207576_3635970_41-195572-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 00:55:44,802 WARNING [train.py:1204] (2/4) Exclude cut with ID _55328_210_27767_1_1534580973260_4088170_329-341322-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 00:55:46,155 WARNING [train.py:1204] (2/4) Exclude cut with ID _5608_210_2106_1_1527415439089_6819768_909-82767-0_sp1.1 from training. Duration: 0.5545625 2023-10-03 00:55:46,198 WARNING [train.py:1204] (2/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_261-297503-0_sp1.1 from training. Duration: 0.9 2023-10-03 00:55:46,206 WARNING [train.py:1204] (2/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_459-291706-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 00:55:46,207 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7762_210_1794_1_1528199508_4057408_422-227034-0 from training. Duration: 0.73 2023-10-03 00:55:46,219 WARNING [train.py:1204] (2/4) Exclude cut with ID _3067_210_2667_1_1526212974640_4503168_250-285937-0_sp0.9 from training. Duration: 0.888875 2023-10-03 00:55:46,234 WARNING [train.py:1204] (2/4) Exclude cut with ID _66873_210_5517_1_1535454136506_4293370_332-253745-0_sp1.1 from training. Duration: 0.936375 2023-10-03 00:55:53,625 WARNING [train.py:1204] (2/4) Exclude cut with ID _13938_210_9988_1_1530518718978_3693960_304-112784-0 from training. Duration: 0.46 2023-10-03 00:55:55,092 WARNING [train.py:1204] (2/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_226-267957-0_sp1.1 from training. Duration: 0.92725 2023-10-03 00:55:55,145 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_41822_210_13270_1_1533955976_4379284_608-254567-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 00:55:56,505 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_124-113562-0_sp1.1 from training. Duration: 0.8545625 2023-10-03 00:55:56,532 WARNING [train.py:1204] (2/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_117-290916-0_sp1.1 from training. Duration: 0.463625 2023-10-03 00:55:56,760 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.balancer1.prob, batch_count=1080826.6666666667, ans=0.125 2023-10-03 00:55:57,896 WARNING [train.py:1204] (2/4) Exclude cut with ID _28427_210_4220_1_1532581240967_3061069_102-66533-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 00:55:59,793 WARNING [train.py:1204] (2/4) Exclude cut with ID _55383_210_10635_1_1534468426172_2231669_134-128246-0_sp1.1 from training. Duration: 0.8090625 2023-10-03 00:55:59,844 WARNING [train.py:1204] (2/4) Exclude cut with ID _45871_210_5665_1_1533864614245_3296060_49-308316-0_sp1.1 from training. Duration: 0.97275 2023-10-03 00:56:01,096 WARNING [train.py:1204] (2/4) Exclude cut with ID _57572_210_5517_1_1534921219624_3809484_176-199205-0_sp1.1 from training. Duration: 0.97275 2023-10-03 00:56:04,498 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.3.encoder.layers.3.conv_module2.whiten, num_groups=1, num_channels=512, metric=4.48 vs. limit=15.0 2023-10-03 00:56:05,342 WARNING [train.py:1204] (2/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_45-265684-0_sp1.1 from training. Duration: 0.6545625 2023-10-03 00:56:05,361 WARNING [train.py:1204] (2/4) Exclude cut with ID _15150_210_8341_1_1530752411803_7256384_469-239509-0_sp0.9 from training. Duration: 0.7555625 2023-10-03 00:56:06,647 WARNING [train.py:1204] (2/4) Exclude cut with ID _28461_210_2253_1_1532771775189_3041299_136-270644-0_sp1.1 from training. Duration: 0.8454375 2023-10-03 00:56:08,025 WARNING [train.py:1204] (2/4) Exclude cut with ID _69584_210_5219_1_1535785133775_4044450_302-47302-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 00:56:08,118 WARNING [train.py:1204] (2/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_166-128941-0_sp1.1 from training. Duration: 0.4909375 2023-10-03 00:56:15,850 WARNING [train.py:1204] (2/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_559-120504-0_sp1.1 from training. Duration: 0.97275 2023-10-03 00:56:18,237 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.0.layers.0.feed_forward2.out_whiten, num_groups=1, num_channels=192, metric=5.35 vs. limit=15.0 2023-10-03 00:56:18,552 WARNING [train.py:1204] (2/4) Exclude cut with ID _6456_210_1534_1_1527681634212_3181049_345-252877-0_sp1.1 from training. Duration: 0.5818125 2023-10-03 00:56:18,574 WARNING [train.py:1204] (2/4) Exclude cut with ID _28317_210_12742_1_1532480302381_8510325_902-233003-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 00:56:23,161 WARNING [train.py:1204] (2/4) Exclude cut with ID _36538_210_9994_1_1533204020363_3795620_204-261385-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 00:56:23,171 WARNING [train.py:1204] (2/4) Exclude cut with ID _14821_210_8750_1_1530680503991_6829359_189-297451-0_sp1.1 from training. Duration: 0.5 2023-10-03 00:56:23,916 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.2.encoder.layers.2.self_attn1.whiten, num_groups=1, num_channels=384, metric=18.67 vs. limit=22.5 2023-10-03 00:56:24,486 WARNING [train.py:1204] (2/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_1211-226066-0_sp0.9 from training. Duration: 0.8444375 2023-10-03 00:56:27,467 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.1.encoder.layers.1.nonlin_attention.whiten1, num_groups=1, num_channels=192, metric=5.98 vs. limit=10.0 2023-10-03 00:56:30,716 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6786_210_1388_1_1527839699_4194495_273-149841-0_sp1.1 from training. Duration: 0.836375 2023-10-03 00:56:30,755 WARNING [train.py:1204] (2/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_380-162648-0_sp1.1 from training. Duration: 0.8545625 2023-10-03 00:56:30,756 WARNING [train.py:1204] (2/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_494-199538-0 from training. Duration: 0.75 2023-10-03 00:56:35,667 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.conv_module1.balancer2.min_abs, batch_count=1080960.0, ans=0.5 2023-10-03 00:56:36,730 WARNING [train.py:1204] (2/4) Exclude cut with ID _49097_210_16049_1_1533952780207_4360979_276-87053-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 00:56:38,104 WARNING [train.py:1204] (2/4) Exclude cut with ID _5444_210_2151_1_1527418808464_5312210_726-27165-0 from training. Duration: 0.67 2023-10-03 00:56:41,173 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_128-202104-0_sp1.1 from training. Duration: 0.57275 2023-10-03 00:56:44,282 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_11349_210_6753_1_1529800861_1101207_26-65602-0_sp1.1 from training. Duration: 0.87275 2023-10-03 00:56:44,306 WARNING [train.py:1204] (2/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_159-328157-0 from training. Duration: 0.52 2023-10-03 00:56:44,390 WARNING [train.py:1204] (2/4) Exclude cut with ID _14069_210_5632_1_1530512692033_4803390_98-21934-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 00:56:47,106 WARNING [train.py:1204] (2/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_352-59958-0_sp0.9 from training. Duration: 0.9555625 2023-10-03 00:56:48,359 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6163_210_6400_1_1527506746_4263068_283-137811-0 from training. Duration: 0.77 2023-10-03 00:56:48,388 WARNING [train.py:1204] (2/4) Exclude cut with ID _21989_210_5854_1_1531899214931_3271360_133-44759-0_sp1.1 from training. Duration: 0.7 2023-10-03 00:56:51,107 WARNING [train.py:1204] (2/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_21-174514-0_sp0.9 from training. Duration: 0.47775 2023-10-03 00:56:51,130 WARNING [train.py:1204] (2/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_201-78811-0_sp1.1 from training. Duration: 0.963625 2023-10-03 00:56:51,165 WARNING [train.py:1204] (2/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_537-164630-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 00:56:53,083 WARNING [train.py:1204] (2/4) Exclude cut with ID _14087_210_2253_1_1530529224512_3014029_156-257607-0 from training. Duration: 0.83 2023-10-03 00:56:53,093 WARNING [train.py:1204] (2/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_308-154450-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 00:56:54,345 INFO [train.py:1046] (2/4) Epoch 31, batch 2800, loss[loss=0.1537, simple_loss=0.246, pruned_loss=0.03069, over 24619.00 frames. ], tot_loss[loss=0.1622, simple_loss=0.2404, pruned_loss=0.04202, over 4727415.95 frames. ], batch size: 68, lr: 3.29e-03, grad_scale: 16.0 2023-10-03 00:56:54,397 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_45399_210_4852_1_1533624725_7579737_214-240574-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 00:56:55,920 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.conv_skip_rate, batch_count=1081093.3333333333, ans=0.0 2023-10-03 00:56:57,027 WARNING [train.py:1204] (2/4) Exclude cut with ID _29303_210_2575_1_1532392237303_7410649_615-305517-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 00:56:57,079 WARNING [train.py:1204] (2/4) Exclude cut with ID IC0125W0002-61808 from training. Duration: 0.708 2023-10-03 00:56:57,079 WARNING [train.py:1204] (2/4) Exclude cut with ID IC0125W0017-61823 from training. Duration: 0.759 2023-10-03 00:57:00,442 WARNING [train.py:1204] (2/4) Exclude cut with ID _53219_210_4525_1_1534834836008_4064825_4-145291-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 00:57:00,616 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.conv_module1.balancer1.prob, batch_count=1081093.3333333333, ans=0.125 2023-10-03 00:57:01,818 WARNING [train.py:1204] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_185-174805-0_sp1.1 from training. Duration: 0.6181875 2023-10-03 00:57:03,072 WARNING [train.py:1204] (2/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_635-193046-0_sp1.1 from training. Duration: 0.92725 2023-10-03 00:57:05,164 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.feed_forward1.out_proj.dropout_p, batch_count=1081093.3333333333, ans=0.1 2023-10-03 00:57:06,366 WARNING [train.py:1204] (2/4) Exclude cut with ID _46067_210_4220_1_1534125571066_3307450_321-258866-0_sp1.1 from training. Duration: 0.97275 2023-10-03 00:57:09,106 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8558_210_1410_1_1528505225_7613406_131-329524-0 from training. Duration: 0.79 2023-10-03 00:57:10,519 WARNING [train.py:1204] (2/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_847-41334-0_sp0.9 from training. Duration: 0.488875 2023-10-03 00:57:10,612 WARNING [train.py:1204] (2/4) Exclude cut with ID _14227_210_4381_1_1530701804286_3940259_125-275642-0 from training. Duration: 0.99 2023-10-03 00:57:12,004 WARNING [train.py:1204] (2/4) Exclude cut with ID _25248_210_8750_1_1532062825941_5554180_248-177255-0_sp1.1 from training. Duration: 0.963625 2023-10-03 00:57:12,042 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_40047_210_2290_1_1533290519_2374311_195-165693-0_sp1.1 from training. Duration: 0.8090625 2023-10-03 00:57:12,045 WARNING [train.py:1204] (2/4) Exclude cut with ID _23109_210_4381_1_1532150828777_901949_29-258672-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 00:57:16,777 WARNING [train.py:1204] (2/4) Exclude cut with ID _14821_210_8750_1_1530680503991_6829359_2-227151-0_sp1.1 from training. Duration: 0.7545625 2023-10-03 00:57:16,806 WARNING [train.py:1204] (2/4) Exclude cut with ID _49643_210_7989_1_1534640244224_3939031_565-53982-0_sp1.1 from training. Duration: 0.963625 2023-10-03 00:57:16,810 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7544_210_6753_1_1528591327_1471426_33-346031-0_sp0.9 from training. Duration: 0.67775 2023-10-03 00:57:18,097 WARNING [train.py:1204] (2/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_95-61782-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 00:57:26,848 WARNING [train.py:1204] (2/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_686-54383-0_sp0.9 from training. Duration: 0.9666875 2023-10-03 00:57:28,943 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_14056_210_4163_1_1530604483_7572827_505-276823-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 00:57:31,671 WARNING [train.py:1204] (2/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_745-128443-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 00:57:31,731 WARNING [train.py:1204] (2/4) Exclude cut with ID _3844_210_1390_1_1526727803582_2871209_20-251664-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 00:57:31,787 WARNING [train.py:1204] (2/4) Exclude cut with ID _36538_210_9994_1_1533204020363_3795620_487-164628-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 00:57:37,882 WARNING [train.py:1204] (2/4) Exclude cut with ID _5166_210_2084_1_1527073197160_4087993_309-222545-0_sp1.1 from training. Duration: 0.763625 2023-10-03 00:57:37,904 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_3531_210_3528_1_1526294203_5216456_309-227302-0 from training. Duration: 0.69 2023-10-03 00:57:37,957 WARNING [train.py:1204] (2/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_42-192460-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 00:57:39,422 WARNING [train.py:1204] (2/4) Exclude cut with ID _22083_210_8254_1_1531702862439_3678970_49-234546-0_sp1.1 from training. Duration: 0.7545625 2023-10-03 00:57:39,426 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_427-78097-0_sp0.9 from training. Duration: 0.888875 2023-10-03 00:57:45,186 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_43802_210_2290_1_1533516203_8829504_378-205254-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 00:57:45,237 WARNING [train.py:1204] (2/4) Exclude cut with ID _45329_210_7005_1_1534157951211_6774339_672-314830-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 00:57:49,472 WARNING [train.py:1204] (2/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_90-30673-0_sp1.1 from training. Duration: 0.763625 2023-10-03 00:57:50,875 WARNING [train.py:1204] (2/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_563-329943-0_sp1.1 from training. Duration: 0.9 2023-10-03 00:57:50,895 WARNING [train.py:1204] (2/4) Exclude cut with ID _21632_210_2253_1_1531994001765_3744140_154-84090-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 00:57:50,896 WARNING [train.py:1204] (2/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_103-336064-0_sp0.9 from training. Duration: 0.5666875 2023-10-03 00:57:50,939 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_43-71989-0_sp0.9 from training. Duration: 0.6333125 2023-10-03 00:57:52,282 WARNING [train.py:1204] (2/4) Exclude cut with ID _3867_210_1883_1_1526641200575_4475830_319-349850-0_sp0.9 from training. Duration: 0.7444375 2023-10-03 00:57:53,685 WARNING [train.py:1204] (2/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_629-179454-0_sp1.1 from training. Duration: 0.963625 2023-10-03 00:57:53,698 WARNING [train.py:1204] (2/4) Exclude cut with ID _65263_210_22356_1_1535763598691_3921037_49-222917-0 from training. Duration: 0.92 2023-10-03 00:57:54,989 WARNING [train.py:1204] (2/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_602-331772-0_sp1.1 from training. Duration: 0.936375 2023-10-03 00:57:55,077 WARNING [train.py:1204] (2/4) Exclude cut with ID _58414_210_8254_1_1535421609162_7151182_292-134978-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 00:57:56,416 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_38793_210_2544_1_1533176114_3849320_200-299292-0_sp1.1 from training. Duration: 0.936375 2023-10-03 00:57:58,237 WARNING [train.py:1204] (2/4) Exclude cut with ID _14970_210_15709_1_1530775547361_1966965_6-23328-0 from training. Duration: 0.86 2023-10-03 00:57:59,539 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.724e+02 1.999e+02 2.250e+02 2.864e+02 4.547e+02, threshold=4.500e+02, percent-clipped=1.0 2023-10-03 00:57:59,609 WARNING [train.py:1204] (2/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_386-225511-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 00:57:59,628 WARNING [train.py:1204] (2/4) Exclude cut with ID _38323_210_12108_1_1533121233369_3303049_416-11919-0_sp1.1 from training. Duration: 0.87275 2023-10-03 00:58:00,306 WARNING [train.py:1204] (2/4) Exclude cut with ID _4866_210_2577_1_1527404412284_4364659_256-306830-0_sp1.1 from training. Duration: 0.7909375 2023-10-03 00:58:01,626 WARNING [train.py:1204] (2/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_53-300544-0 from training. Duration: 0.67 2023-10-03 00:58:01,787 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.nonlin_attention.balancer.min_positive, batch_count=1081360.0, ans=0.05 2023-10-03 00:58:03,253 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.feed_forward2.hidden_balancer.prob, batch_count=1081360.0, ans=0.125 2023-10-03 00:58:07,161 WARNING [train.py:1204] (2/4) Exclude cut with ID _41314_210_12651_1_1533459476543_3766460_80-74184-0_sp1.1 from training. Duration: 0.7545625 2023-10-03 00:58:07,182 WARNING [train.py:1204] (2/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_332-256168-0_sp1.1 from training. Duration: 0.6818125 2023-10-03 00:58:08,892 INFO [train.py:1046] (2/4) Epoch 31, batch 2850, loss[loss=0.1578, simple_loss=0.2496, pruned_loss=0.03302, over 24314.00 frames. ], tot_loss[loss=0.1621, simple_loss=0.2404, pruned_loss=0.04192, over 4739848.25 frames. ], batch size: 74, lr: 3.29e-03, grad_scale: 16.0 2023-10-03 00:58:08,953 WARNING [train.py:1204] (2/4) Exclude cut with ID _48836_210_12210_1_1534075848293_3904150_429-35475-0_sp1.1 from training. Duration: 0.8090625 2023-10-03 00:58:10,441 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_144-135833-0_sp1.1 from training. Duration: 0.97275 2023-10-03 00:58:14,506 WARNING [train.py:1204] (2/4) Exclude cut with ID _14224_210_4381_1_1530529062037_4264010_380-343670-0_sp0.9 from training. Duration: 0.97775 2023-10-03 00:58:14,534 WARNING [train.py:1204] (2/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_213-265174-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 00:58:14,571 WARNING [train.py:1204] (2/4) Exclude cut with ID _20455_210_5219_1_1531565996775_8098100_86-314283-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 00:58:17,812 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_41750_210_1960_1_1533520678_3948042_68-223813-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 00:58:19,089 WARNING [train.py:1204] (2/4) Exclude cut with ID _55153_210_2253_1_1534384786677_3162096_24-198429-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 00:58:20,479 WARNING [train.py:1204] (2/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_119-207162-0_sp0.9 from training. Duration: 0.788875 2023-10-03 00:58:20,538 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_12868_210_6753_1_1530702479_3610564_148-172861-0 from training. Duration: 0.5 2023-10-03 00:58:24,956 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_5522_210_3273_1_1527249606_3909096_318-203153-0 from training. Duration: 0.43 2023-10-03 00:58:24,965 WARNING [train.py:1204] (2/4) Exclude cut with ID _14884_210_2252_1_1530707459689_5561250_288-189949-0_sp1.1 from training. Duration: 0.97275 2023-10-03 00:58:26,638 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.bypass_mid.scale_min, batch_count=1081493.3333333333, ans=0.2 2023-10-03 00:58:26,656 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=1081493.3333333333, ans=0.1 2023-10-03 00:58:28,301 WARNING [train.py:1204] (2/4) Exclude cut with ID _5294_210_1747_1_1527249643928_3696179_529-242298-0 from training. Duration: 0.95 2023-10-03 00:58:29,589 WARNING [train.py:1204] (2/4) Exclude cut with ID _36538_210_9994_1_1533204020363_3795620_503-110289-0_sp1.1 from training. Duration: 0.936375 2023-10-03 00:58:32,987 WARNING [train.py:1204] (2/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_385-193052-0 from training. Duration: 0.86 2023-10-03 00:58:34,263 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_456-22141-0 from training. Duration: 0.88 2023-10-03 00:58:34,372 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_38965_210_4852_1_1533174906_3484735_284-117840-0_sp1.1 from training. Duration: 0.936375 2023-10-03 00:58:46,151 WARNING [train.py:1204] (2/4) Exclude cut with ID _32704_210_8954_1_1533017488840_6747370_75-201625-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 00:58:47,957 WARNING [train.py:1204] (2/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_255-312654-0_sp1.1 from training. Duration: 0.82725 2023-10-03 00:58:47,967 WARNING [train.py:1204] (2/4) Exclude cut with ID _47251_210_18628_1_1533814063128_4137250_214-88076-0_sp0.9 from training. Duration: 0.97775 2023-10-03 00:58:49,370 WARNING [train.py:1204] (2/4) Exclude cut with ID _4092_210_2151_1_1526643044449_3135759_227-213972-0_sp1.1 from training. Duration: 0.4909375 2023-10-03 00:58:49,386 WARNING [train.py:1204] (2/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_912-47473-0_sp1.1 from training. Duration: 0.6181875 2023-10-03 00:58:49,412 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6767_210_3273_1_1527855783_1718932_173-256601-0_sp1.1 from training. Duration: 0.72725 2023-10-03 00:58:51,993 WARNING [train.py:1204] (2/4) Exclude cut with ID _6274_210_2084_1_1527937583837_7494020_32-205288-0_sp1.1 from training. Duration: 0.7090625 2023-10-03 00:58:52,032 WARNING [train.py:1204] (2/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_39-248870-0 from training. Duration: 0.69 2023-10-03 00:58:54,764 WARNING [train.py:1204] (2/4) Exclude cut with ID _3541_210_2477_1_1526475408709_3263973_377-296839-0_sp1.1 from training. Duration: 0.5 2023-10-03 00:58:54,787 WARNING [train.py:1204] (2/4) Exclude cut with ID _13679_210_7497_1_1530534293662_3355098_413-56154-0_sp1.1 from training. Duration: 0.963625 2023-10-03 00:58:54,838 WARNING [train.py:1204] (2/4) Exclude cut with ID _14970_210_15709_1_1530775547361_1966965_201-84544-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 00:58:56,088 WARNING [train.py:1204] (2/4) Exclude cut with ID _21783_210_11357_1_1531742458730_3762679_105-95775-0_sp1.1 from training. Duration: 0.936375 2023-10-03 00:58:59,392 WARNING [train.py:1204] (2/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_131-191669-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 00:58:59,434 WARNING [train.py:1204] (2/4) Exclude cut with ID _14860_210_4915_1_1530702020340_7343240_695-4323-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 00:59:00,785 WARNING [train.py:1204] (2/4) Exclude cut with ID _39867_210_9826_1_1533553222030_9344259_991-130565-0_sp1.1 from training. Duration: 0.97275 2023-10-03 00:59:02,219 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_50-3289-0_sp1.1 from training. Duration: 0.82725 2023-10-03 00:59:05,069 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_54-347670-0_sp1.1 from training. Duration: 0.92725 2023-10-03 00:59:05,124 WARNING [train.py:1204] (2/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_200-153445-0_sp1.1 from training. Duration: 0.936375 2023-10-03 00:59:07,063 WARNING [train.py:1204] (2/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_279-302911-0_sp1.1 from training. Duration: 0.97275 2023-10-03 00:59:09,670 WARNING [train.py:1204] (2/4) Exclude cut with ID _14886_210_4220_1_1530702020355_3516190_159-73748-0_sp0.9 from training. Duration: 0.92225 2023-10-03 00:59:14,396 WARNING [train.py:1204] (2/4) Exclude cut with ID _53379_210_20034_1_1534222027363_3854654_23-222715-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 00:59:14,493 WARNING [train.py:1204] (2/4) Exclude cut with ID _12555_210_8750_1_1530075594402_3780239_449-267986-0 from training. Duration: 0.83 2023-10-03 00:59:14,510 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7544_210_6753_1_1528587722_3576686_295-258479-0 from training. Duration: 0.84 2023-10-03 00:59:15,919 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7002_210_1794_1_1527935634_4034444_487-236501-0_sp0.9 from training. Duration: 0.7333125 2023-10-03 00:59:17,272 WARNING [train.py:1204] (2/4) Exclude cut with ID _21783_210_11357_1_1531742458730_3762679_244-19107-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 00:59:17,293 WARNING [train.py:1204] (2/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_390-339137-0 from training. Duration: 0.56 2023-10-03 00:59:19,183 WARNING [train.py:1204] (2/4) Exclude cut with ID _7562_210_2084_1_1528887767916_3946229_186-326422-0_sp1.1 from training. Duration: 0.763625 2023-10-03 00:59:20,574 WARNING [train.py:1204] (2/4) Exclude cut with ID _33640_210_2655_1_1533117605479_4855469_645-145235-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 00:59:20,592 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_25767_210_2161_1_1532307483_3867748_506-171777-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 00:59:20,611 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_5438_210_1794_1_1527336687_2600009_233-58072-0_sp1.1 from training. Duration: 0.7 2023-10-03 00:59:20,611 WARNING [train.py:1204] (2/4) Exclude cut with ID IC0669W0063-330908 from training. Duration: 0.896 2023-10-03 00:59:20,661 WARNING [train.py:1204] (2/4) Exclude cut with ID IC0671W0070-331913 from training. Duration: 0.896 2023-10-03 00:59:20,671 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_258-310570-0_sp1.1 from training. Duration: 0.7454375 2023-10-03 00:59:21,939 WARNING [train.py:1204] (2/4) Exclude cut with ID _9461_210_4557_1_1529060514726_3677451_162-177507-0_sp1.1 from training. Duration: 0.97275 2023-10-03 00:59:23,272 INFO [train.py:1046] (2/4) Epoch 31, batch 2900, loss[loss=0.1613, simple_loss=0.2464, pruned_loss=0.03814, over 24479.00 frames. ], tot_loss[loss=0.1618, simple_loss=0.2405, pruned_loss=0.04157, over 4749269.62 frames. ], batch size: 63, lr: 3.29e-03, grad_scale: 16.0 2023-10-03 00:59:26,220 WARNING [train.py:1204] (2/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_26-175553-0_sp0.9 from training. Duration: 0.67775 2023-10-03 00:59:26,244 WARNING [train.py:1204] (2/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_7-150481-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 00:59:26,289 WARNING [train.py:1204] (2/4) Exclude cut with ID _23557_210_8750_1_1531890106370_6846255_72-273262-0_sp1.1 from training. Duration: 0.963625 2023-10-03 00:59:27,759 WARNING [train.py:1204] (2/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_367-61330-0 from training. Duration: 0.75 2023-10-03 00:59:32,277 WARNING [train.py:1204] (2/4) Exclude cut with ID _39867_210_9826_1_1533553222030_9344259_1337-11141-0_sp1.1 from training. Duration: 0.936375 2023-10-03 00:59:32,310 WARNING [train.py:1204] (2/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_101-293367-0 from training. Duration: 0.63 2023-10-03 00:59:33,627 WARNING [train.py:1204] (2/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_10-100367-0 from training. Duration: 0.98 2023-10-03 00:59:34,920 WARNING [train.py:1204] (2/4) Exclude cut with ID _60057_210_10640_1_1534903065793_3333150_171-181133-0_sp1.1 from training. Duration: 0.8 2023-10-03 00:59:34,922 WARNING [train.py:1204] (2/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_844-96989-0_sp0.9 from training. Duration: 0.788875 2023-10-03 00:59:36,384 WARNING [train.py:1204] (2/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_301-144595-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 00:59:37,784 WARNING [train.py:1204] (2/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_115-147343-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 00:59:41,276 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_3171_210_2087_1_1526109084_2330007_37-64334-0_sp1.1 from training. Duration: 0.7454375 2023-10-03 00:59:42,547 WARNING [train.py:1204] (2/4) Exclude cut with ID _19514_210_9259_1_1531530071330_5084230_769-272914-0_sp1.1 from training. Duration: 0.936375 2023-10-03 00:59:45,709 WARNING [train.py:1204] (2/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_76-311963-0_sp0.9 from training. Duration: 0.77775 2023-10-03 00:59:45,737 WARNING [train.py:1204] (2/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_526-62079-0 from training. Duration: 0.67 2023-10-03 00:59:45,781 WARNING [train.py:1204] (2/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_7-111954-0_sp0.9 from training. Duration: 0.62225 2023-10-03 00:59:48,565 WARNING [train.py:1204] (2/4) Exclude cut with ID _57997_210_4163_1_1534766425889_3961049_360-279140-0_sp1.1 from training. Duration: 0.97275 2023-10-03 00:59:50,751 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.conv_skip_rate, batch_count=1081826.6666666667, ans=0.0 2023-10-03 00:59:51,967 WARNING [train.py:1204] (2/4) Exclude cut with ID _4866_210_2577_1_1527404412284_4364659_133-1696-0 from training. Duration: 0.99 2023-10-03 00:59:52,037 WARNING [train.py:1204] (2/4) Exclude cut with ID _5190_210_4557_1_1527244368799_373439_28-197030-0 from training. Duration: 0.72 2023-10-03 00:59:54,868 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_53-39003-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 00:59:54,871 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6673_210_3273_1_1527767790_3173240_351-167954-0 from training. Duration: 0.48 2023-10-03 00:59:54,901 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_2822_210_2715_1_1526112586_1999587_165-36656-0_sp1.1 from training. Duration: 0.8090625 2023-10-03 00:59:57,577 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_328-264892-0_sp1.1 from training. Duration: 0.92725 2023-10-03 00:59:57,581 WARNING [train.py:1204] (2/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_29-293896-0_sp0.9 from training. Duration: 0.67775 2023-10-03 01:00:00,339 WARNING [train.py:1204] (2/4) Exclude cut with ID _65263_210_22356_1_1535763598691_3921037_465-55448-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 01:00:02,216 WARNING [train.py:1204] (2/4) Exclude cut with ID _21989_210_5854_1_1531899214931_3271360_311-53106-0_sp1.1 from training. Duration: 0.97275 2023-10-03 01:00:02,406 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.0.balancer_ff3.min_abs, batch_count=1081893.3333333333, ans=0.2 2023-10-03 01:00:04,987 WARNING [train.py:1204] (2/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_389-277915-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 01:00:07,866 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_69630_210_7234_1_1535791094_677837_63-277139-0_sp1.1 from training. Duration: 0.963625 2023-10-03 01:00:08,677 WARNING [train.py:1204] (2/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_85-138257-0 from training. Duration: 0.71 2023-10-03 01:00:09,731 WARNING [train.py:1204] (2/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_686-247600-0 from training. Duration: 0.92 2023-10-03 01:00:09,736 WARNING [train.py:1204] (2/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_108-255430-0_sp1.1 from training. Duration: 0.8818125 2023-10-03 01:00:12,502 WARNING [train.py:1204] (2/4) Exclude cut with ID _14884_210_2252_1_1530707459689_5561250_114-201399-0_sp1.1 from training. Duration: 0.6909375 2023-10-03 01:00:15,156 WARNING [train.py:1204] (2/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_506-42614-0 from training. Duration: 0.79 2023-10-03 01:00:18,205 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_43802_210_2290_1_1533516203_8829504_85-195240-0_sp1.1 from training. Duration: 0.7909375 2023-10-03 01:00:21,866 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.1.encoder.layers.0.nonlin_attention.whiten1, num_groups=1, num_channels=192, metric=4.65 vs. limit=10.0 2023-10-03 01:00:22,834 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_141-36243-0_sp1.1 from training. Duration: 0.97275 2023-10-03 01:00:27,215 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.feed_forward3.hidden_balancer.prob, batch_count=1082026.6666666667, ans=0.125 2023-10-03 01:00:27,346 INFO [scaling.py:1118] (2/4) WithLoss: name=encoder.encoders.5.encoder.layers.0.self_attn_weights, loss-sum=0.000e+00 2023-10-03 01:00:28,269 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.525e+02 1.814e+02 1.971e+02 2.189e+02 2.896e+02, threshold=3.942e+02, percent-clipped=0.0 2023-10-03 01:00:28,621 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.balancer1.prob, batch_count=1082026.6666666667, ans=0.125 2023-10-03 01:00:29,781 WARNING [train.py:1204] (2/4) Exclude cut with ID _7852_210_5219_1_1528286471316_8139400_358-287492-0_sp1.1 from training. Duration: 0.87275 2023-10-03 01:00:29,812 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_805-71497-0_sp0.9 from training. Duration: 0.788875 2023-10-03 01:00:33,045 WARNING [train.py:1204] (2/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_178-39722-0 from training. Duration: 0.49 2023-10-03 01:00:35,911 WARNING [train.py:1204] (2/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_428-181925-0_sp1.1 from training. Duration: 0.963625 2023-10-03 01:00:35,915 WARNING [train.py:1204] (2/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_770-200430-0 from training. Duration: 0.89 2023-10-03 01:00:35,965 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_1104-271009-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 01:00:37,294 INFO [train.py:1046] (2/4) Epoch 31, batch 2950, loss[loss=0.1627, simple_loss=0.2524, pruned_loss=0.03649, over 24661.00 frames. ], tot_loss[loss=0.1627, simple_loss=0.2414, pruned_loss=0.04197, over 4744258.66 frames. ], batch size: 68, lr: 3.29e-03, grad_scale: 16.0 2023-10-03 01:00:37,344 WARNING [train.py:1204] (2/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_128-296581-0_sp0.9 from training. Duration: 0.82225 2023-10-03 01:00:40,999 WARNING [train.py:1204] (2/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_19-62511-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 01:00:43,557 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_805-71497-0 from training. Duration: 0.71 2023-10-03 01:00:45,012 WARNING [train.py:1204] (2/4) Exclude cut with ID _44778_210_2054_1_1533798127203_7443090_573-152077-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 01:00:45,017 WARNING [train.py:1204] (2/4) Exclude cut with ID _42875_210_14856_1_1533704397487_4012433_393-177816-0_sp1.1 from training. Duration: 0.963625 2023-10-03 01:00:46,332 WARNING [train.py:1204] (2/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_278-35159-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 01:00:48,144 WARNING [train.py:1204] (2/4) Exclude cut with ID _15037_210_2253_1_1530752294104_3267650_13-130361-0_sp1.1 from training. Duration: 0.9 2023-10-03 01:00:48,244 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_3019_210_2028_1_1526044245_3264865_127-135615-0 from training. Duration: 0.94 2023-10-03 01:00:49,578 WARNING [train.py:1204] (2/4) Exclude cut with ID _28317_210_12742_1_1532480302381_8510325_607-213951-0 from training. Duration: 0.84 2023-10-03 01:00:49,642 WARNING [train.py:1204] (2/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_436-318113-0_sp0.9 from training. Duration: 0.8333125 2023-10-03 01:00:49,643 WARNING [train.py:1204] (2/4) Exclude cut with ID _13938_210_9988_1_1530518718978_3693960_148-152225-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 01:00:54,550 INFO [scaling.py:1118] (2/4) WithLoss: name=encoder.encoders.3.encoder.layers.1.self_attn_weights, loss-sum=0.000e+00 2023-10-03 01:00:55,726 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_2541_210_1794_1_1525590269_3084765_156-173713-0_sp1.1 from training. Duration: 0.736375 2023-10-03 01:00:57,097 WARNING [train.py:1204] (2/4) Exclude cut with ID _28323_210_16187_1_1532480395667_4100300_531-24157-0_sp1.1 from training. Duration: 0.8909375 2023-10-03 01:00:59,683 WARNING [train.py:1204] (2/4) Exclude cut with ID _29303_210_2575_1_1532392237303_7410649_382-217492-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 01:00:59,718 WARNING [train.py:1204] (2/4) Exclude cut with ID _58895_210_8750_1_1535184081320_3649089_270-158013-0_sp1.1 from training. Duration: 0.92725 2023-10-03 01:01:03,065 WARNING [train.py:1204] (2/4) Exclude cut with ID _29277_210_3279_1_1532581109293_3173069_311-206227-0_sp1.1 from training. Duration: 0.936375 2023-10-03 01:01:03,075 WARNING [train.py:1204] (2/4) Exclude cut with ID _7160_210_1876_1_1527940923231_3703799_282-322611-0_sp0.9 from training. Duration: 0.9666875 2023-10-03 01:01:04,474 WARNING [train.py:1204] (2/4) Exclude cut with ID _53219_210_4525_1_1534834836008_4064825_562-32295-0_sp1.1 from training. Duration: 0.963625 2023-10-03 01:01:05,791 WARNING [train.py:1204] (2/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_408-87330-0_sp1.1 from training. Duration: 0.963625 2023-10-03 01:01:05,805 WARNING [train.py:1204] (2/4) Exclude cut with ID _59202_210_26984_1_1534816810612_3549174_137-318710-0_sp1.1 from training. Duration: 0.8090625 2023-10-03 01:01:05,969 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.feed_forward3.hidden_balancer.prob, batch_count=1082226.6666666667, ans=0.125 2023-10-03 01:01:08,595 WARNING [train.py:1204] (2/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_372-30302-0 from training. Duration: 0.86 2023-10-03 01:01:12,642 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.3.encoder.layers.3.self_attn_weights.whiten_keys, num_groups=8, num_channels=256, metric=2.85 vs. limit=6.0 2023-10-03 01:01:13,368 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7543_210_6753_1_1528541907_1399322_178-56269-0_sp1.1 from training. Duration: 0.95275 2023-10-03 01:01:13,385 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9109W0012-549758 from training. Duration: 0.512 2023-10-03 01:01:14,751 WARNING [train.py:1204] (2/4) Exclude cut with ID _5410_210_1388_1_1527397055795_1719020_39-112221-0_sp0.9 from training. Duration: 0.7666875 2023-10-03 01:01:16,313 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9121W0012-554933 from training. Duration: 0.896 2023-10-03 01:01:17,624 WARNING [train.py:1204] (2/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_476-272757-0 from training. Duration: 0.93 2023-10-03 01:01:17,644 WARNING [train.py:1204] (2/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_1085-171718-0_sp1.1 from training. Duration: 0.92725 2023-10-03 01:01:19,493 WARNING [train.py:1204] (2/4) Exclude cut with ID _8480_210_1874_1_1528589391205_6728330_309-139896-0_sp1.1 from training. Duration: 0.736375 2023-10-03 01:01:19,493 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9130W0113-559703 from training. Duration: 0.896 2023-10-03 01:01:19,505 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_292-265928-0_sp1.1 from training. Duration: 0.463625 2023-10-03 01:01:22,345 WARNING [train.py:1204] (2/4) Exclude cut with ID _8772_210_1850_1_1528635595972_3664011_96-12756-0 from training. Duration: 0.98 2023-10-03 01:01:23,652 WARNING [train.py:1204] (2/4) Exclude cut with ID _12556_210_8341_1_1530081024100_7327690_725-193477-0_sp1.1 from training. Duration: 0.8909375 2023-10-03 01:01:23,705 WARNING [train.py:1204] (2/4) Exclude cut with ID _5289_210_2084_1_1527678202736_3779299_142-103317-0_sp1.1 from training. Duration: 0.763625 2023-10-03 01:01:28,455 WARNING [train.py:1204] (2/4) Exclude cut with ID _45012_210_12651_1_1533608435136_5371380_206-51075-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 01:01:28,562 WARNING [train.py:1204] (2/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_176-56120-0_sp1.1 from training. Duration: 0.7545625 2023-10-03 01:01:29,904 WARNING [train.py:1204] (2/4) Exclude cut with ID _40284_210_2084_1_1533513679814_3704279_220-201864-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 01:01:29,933 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9165W0004-576638 from training. Duration: 0.896 2023-10-03 01:01:29,966 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_5438_210_1794_1_1527336687_2600009_218-343336-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 01:01:29,988 WARNING [train.py:1204] (2/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_318-162017-0 from training. Duration: 0.91 2023-10-03 01:01:37,326 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_49544_210_1794_1_1534379770_5262083_483-349298-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 01:01:37,400 WARNING [train.py:1204] (2/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_197-28164-0_sp0.9 from training. Duration: 0.888875 2023-10-03 01:01:38,778 WARNING [train.py:1204] (2/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_643-52007-0 from training. Duration: 0.57 2023-10-03 01:01:38,801 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_38965_210_4852_1_1533174906_3484735_403-332963-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 01:01:40,241 WARNING [train.py:1204] (2/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_340-174176-0 from training. Duration: 0.49 2023-10-03 01:01:41,687 WARNING [train.py:1204] (2/4) Exclude cut with ID _56708_210_13177_1_1535252351282_3422899_363-917-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 01:01:42,536 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.conv_module1.balancer1.prob, batch_count=1082360.0, ans=0.125 2023-10-03 01:01:43,649 WARNING [train.py:1204] (2/4) Exclude cut with ID _19896_210_1883_1_1531709022662_6649569_525-159063-0_sp1.1 from training. Duration: 0.936375 2023-10-03 01:01:45,007 WARNING [train.py:1204] (2/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_430-116139-0_sp0.9 from training. Duration: 0.9444375 2023-10-03 01:01:46,365 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_53753_210_7234_1_1534320003_3594148_86-329250-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 01:01:47,600 WARNING [train.py:1204] (2/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_406-275649-0_sp1.1 from training. Duration: 0.3818125 2023-10-03 01:01:48,914 WARNING [train.py:1204] (2/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_1083-147536-0_sp1.1 from training. Duration: 0.8818125 2023-10-03 01:01:48,977 WARNING [train.py:1204] (2/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_54-311741-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 01:01:48,982 WARNING [train.py:1204] (2/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_237-58433-0_sp0.9 from training. Duration: 0.82225 2023-10-03 01:01:50,788 WARNING [train.py:1204] (2/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_98-51676-0_sp1.1 from training. Duration: 0.663625 2023-10-03 01:01:50,843 WARNING [train.py:1204] (2/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_386-270472-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 01:01:52,121 INFO [train.py:1046] (2/4) Epoch 31, batch 3000, loss[loss=0.1596, simple_loss=0.2503, pruned_loss=0.03448, over 24641.00 frames. ], tot_loss[loss=0.1629, simple_loss=0.2423, pruned_loss=0.04182, over 4752915.85 frames. ], batch size: 68, lr: 3.29e-03, grad_scale: 16.0 2023-10-03 01:01:52,122 INFO [train.py:1069] (2/4) Computing validation loss 2023-10-03 01:02:04,996 INFO [train.py:1078] (2/4) Epoch 31, validation: loss=0.3333, simple_loss=0.2731, pruned_loss=0.1967, over 1125622.00 frames. 2023-10-03 01:02:04,997 INFO [train.py:1079] (2/4) Maximum memory allocated so far is 20967MB 2023-10-03 01:02:05,081 WARNING [train.py:1204] (2/4) Exclude cut with ID _19023_210_4353_1_1531378892552_5820780_83-254794-0_sp1.1 from training. Duration: 0.7818125 2023-10-03 01:02:05,851 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.2.encoder.layers.0.conv_module1.whiten, num_groups=1, num_channels=384, metric=7.12 vs. limit=15.0 2023-10-03 01:02:06,488 WARNING [train.py:1204] (2/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_314-93656-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 01:02:06,511 WARNING [train.py:1204] (2/4) Exclude cut with ID _14170_210_5219_1_1530525949868_4469650_336-213530-0 from training. Duration: 0.9 2023-10-03 01:02:06,597 WARNING [train.py:1204] (2/4) Exclude cut with ID _36686_210_5665_1_1533778225913_3646410_65-221120-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 01:02:09,249 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_26433_210_12108_1_1532157158_4056480_271-68178-0_sp1.1 from training. Duration: 0.92725 2023-10-03 01:02:09,293 WARNING [train.py:1204] (2/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_46-207599-0_sp0.9 from training. Duration: 0.711125 2023-10-03 01:02:13,967 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9303W0114-637133 from training. Duration: 0.896 2023-10-03 01:02:14,002 WARNING [train.py:1204] (2/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_490-207861-0 from training. Duration: 0.98 2023-10-03 01:02:16,634 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6767_210_3273_1_1527855783_1718932_173-256601-0_sp0.9 from training. Duration: 0.888875 2023-10-03 01:02:16,663 WARNING [train.py:1204] (2/4) Exclude cut with ID _13921_210_9994_1_1530841782272_3503520_327-69677-0_sp1.1 from training. Duration: 0.8181875 2023-10-03 01:02:16,734 WARNING [train.py:1204] (2/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_508-335369-0 from training. Duration: 0.48 2023-10-03 01:02:18,022 WARNING [train.py:1204] (2/4) Exclude cut with ID _64631_210_27767_1_1535349580068_4165160_24-46567-0_sp1.1 from training. Duration: 0.963625 2023-10-03 01:02:25,585 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_89-35748-0_sp1.1 from training. Duration: 0.7181875 2023-10-03 01:02:32,431 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.1.encoder.layers.1.feed_forward2.out_whiten, num_groups=1, num_channels=256, metric=4.69 vs. limit=15.0 2023-10-03 01:02:33,106 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_26433_210_12108_1_1532157158_4056480_578-262107-0_sp1.1 from training. Duration: 0.9 2023-10-03 01:02:39,321 WARNING [train.py:1204] (2/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_129-337802-0 from training. Duration: 0.72 2023-10-03 01:02:39,420 WARNING [train.py:1204] (2/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_356-167816-0_sp0.9 from training. Duration: 0.8 2023-10-03 01:02:39,597 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.0.feed_forward1.out_proj.dropout_p, batch_count=1082560.0, ans=0.1 2023-10-03 01:02:41,596 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.1.encoder.layers.1.feed_forward3.out_whiten, num_groups=1, num_channels=256, metric=7.98 vs. limit=15.0 2023-10-03 01:02:42,023 WARNING [train.py:1204] (2/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_137-116442-0_sp1.1 from training. Duration: 0.7090625 2023-10-03 01:02:42,055 WARNING [train.py:1204] (2/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_358-172940-0_sp1.1 from training. Duration: 0.963625 2023-10-03 01:02:42,083 WARNING [train.py:1204] (2/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_668-344147-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 01:02:45,471 WARNING [train.py:1204] (2/4) Exclude cut with ID _62337_210_12392_1_1535009273105_3412229_255-157667-0_sp1.1 from training. Duration: 0.936375 2023-10-03 01:02:45,475 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_486-151131-0 from training. Duration: 0.85 2023-10-03 01:02:46,776 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_53638_210_21581_1_1534297319_5808449_885-80172-0 from training. Duration: 0.96 2023-10-03 01:02:48,814 WARNING [train.py:1204] (2/4) Exclude cut with ID _50093_210_4220_1_1534417141207_3432299_105-183067-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 01:02:50,218 WARNING [train.py:1204] (2/4) Exclude cut with ID _7562_210_2084_1_1528887767916_3946229_345-169156-0_sp0.9 from training. Duration: 0.6444375 2023-10-03 01:02:50,367 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_4528_210_3273_1_1527076654_3276777_209-219804-0_sp0.9 from training. Duration: 0.7666875 2023-10-03 01:02:50,392 WARNING [train.py:1204] (2/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_35-153886-0_sp0.9 from training. Duration: 0.9333125 2023-10-03 01:02:51,814 WARNING [train.py:1204] (2/4) Exclude cut with ID _30800_210_22382_1_1532499347904_4515959_332-121925-0_sp1.1 from training. Duration: 0.92725 2023-10-03 01:02:51,816 WARNING [train.py:1204] (2/4) Exclude cut with ID _59202_210_26984_1_1534816810612_3549174_495-65515-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 01:02:54,538 WARNING [train.py:1204] (2/4) Exclude cut with ID _6274_210_2084_1_1527937583837_7494020_769-168302-0_sp1.1 from training. Duration: 0.6454375 2023-10-03 01:02:56,314 WARNING [train.py:1204] (2/4) Exclude cut with ID _24271_210_12658_1_1531987270022_7190519_708-71345-0_sp1.1 from training. Duration: 0.936375 2023-10-03 01:02:56,314 WARNING [train.py:1204] (2/4) Exclude cut with ID 3033-130750-0096-55598-0_sp0.9 from training. Duration: 0.92225 2023-10-03 01:02:57,700 WARNING [train.py:1204] (2/4) Exclude cut with ID _14087_210_2253_1_1530529224512_3014029_219-224445-0_sp0.9 from training. Duration: 0.9333125 2023-10-03 01:02:59,607 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_174-277364-0 from training. Duration: 0.71 2023-10-03 01:02:59,840 INFO [scaling.py:1118] (2/4) WithLoss: name=encoder.encoders.4.encoder.layers.1.self_attn_weights, loss-sum=0.000e+00 2023-10-03 01:03:00,925 WARNING [train.py:1204] (2/4) Exclude cut with ID _45021_210_9994_1_1533619751094_3330288_96-239010-0_sp1.1 from training. Duration: 0.82725 2023-10-03 01:03:02,163 WARNING [train.py:1204] (2/4) Exclude cut with ID _31602_210_5854_1_1532677021456_4458250_489-305116-0_sp1.1 from training. Duration: 0.97275 2023-10-03 01:03:02,181 WARNING [train.py:1204] (2/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_269-154726-0_sp0.9 from training. Duration: 0.9555625 2023-10-03 01:03:03,786 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.feed_forward3.hidden_balancer.prob, batch_count=1082693.3333333333, ans=0.125 2023-10-03 01:03:07,961 WARNING [train.py:1204] (2/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_126-54418-0_sp1.1 from training. Duration: 0.92725 2023-10-03 01:03:07,992 WARNING [train.py:1204] (2/4) Exclude cut with ID _42326_210_7770_1_1533814282531_3707269_182-224159-0_sp1.1 from training. Duration: 0.92725 2023-10-03 01:03:09,329 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7002_210_1794_1_1527939683_5157286_1-281264-0_sp0.9 from training. Duration: 0.488875 2023-10-03 01:03:09,364 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_86-16881-0 from training. Duration: 0.58 2023-10-03 01:03:09,400 WARNING [train.py:1204] (2/4) Exclude cut with ID _19514_210_9259_1_1531530071330_5084230_637-279094-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 01:03:10,613 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.466e+02 1.904e+02 2.048e+02 2.313e+02 4.264e+02, threshold=4.096e+02, percent-clipped=1.0 2023-10-03 01:03:10,699 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_26433_210_12108_1_1532157158_4056480_578-262107-0 from training. Duration: 0.99 2023-10-03 01:03:10,747 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6291_210_3273_1_1527683649_1078498_82-221196-0_sp1.1 from training. Duration: 0.6909375 2023-10-03 01:03:12,172 WARNING [train.py:1204] (2/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_402-102311-0 from training. Duration: 0.67 2023-10-03 01:03:13,645 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1015-343884-0_sp1.1 from training. Duration: 0.563625 2023-10-03 01:03:16,822 WARNING [train.py:1204] (2/4) Exclude cut with ID _13684_210_7497_1_1530619354863_3598359_312-235606-0_sp1.1 from training. Duration: 0.5454375 2023-10-03 01:03:16,854 WARNING [train.py:1204] (2/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_55-31904-0 from training. Duration: 0.88 2023-10-03 01:03:18,227 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_3133_210_2087_1_1526106446_2324690_25-332138-0 from training. Duration: 0.91 2023-10-03 01:03:18,228 WARNING [train.py:1204] (2/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_912-282412-0_sp0.9 from training. Duration: 0.5444375 2023-10-03 01:03:19,489 INFO [train.py:1046] (2/4) Epoch 31, batch 3050, loss[loss=0.2196, simple_loss=0.2854, pruned_loss=0.07689, over 19743.00 frames. ], tot_loss[loss=0.1638, simple_loss=0.2431, pruned_loss=0.04221, over 4747639.57 frames. ], batch size: 388, lr: 3.29e-03, grad_scale: 16.0 2023-10-03 01:03:19,539 WARNING [train.py:1204] (2/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_458-299435-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 01:03:19,628 WARNING [train.py:1204] (2/4) Exclude cut with ID _69584_210_5219_1_1535785133775_4044450_275-88403-0_sp1.1 from training. Duration: 0.92725 2023-10-03 01:03:19,636 WARNING [train.py:1204] (2/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_841-341679-0_sp1.1 from training. Duration: 0.52725 2023-10-03 01:03:19,642 WARNING [train.py:1204] (2/4) Exclude cut with ID _41391_210_7994_1_1533603771872_3638201_1-54521-0_sp1.1 from training. Duration: 0.97275 2023-10-03 01:03:20,975 WARNING [train.py:1204] (2/4) Exclude cut with ID _56235_210_10640_1_1534557335235_4161099_495-186609-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 01:03:24,992 WARNING [train.py:1204] (2/4) Exclude cut with ID _4681_210_3525_1_1527250689464_120889_15-126760-0 from training. Duration: 0.81 2023-10-03 01:03:27,732 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_3019_210_2028_1_1526044245_3264865_127-135615-0_sp1.1 from training. Duration: 0.8545625 2023-10-03 01:03:29,142 WARNING [train.py:1204] (2/4) Exclude cut with ID _30191_210_1664_1_1533189915047_7196670_915-21561-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 01:03:30,448 WARNING [train.py:1204] (2/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_6-214815-0_sp0.9 from training. Duration: 0.9444375 2023-10-03 01:03:31,913 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_45399_210_4852_1_1533624725_7579737_190-137494-0_sp1.1 from training. Duration: 0.97275 2023-10-03 01:03:34,758 WARNING [train.py:1204] (2/4) Exclude cut with ID _41314_210_12651_1_1533459476543_3766460_207-132038-0 from training. Duration: 0.94 2023-10-03 01:03:40,853 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.balancer2.prob, batch_count=1082826.6666666667, ans=0.125 2023-10-03 01:03:42,034 WARNING [train.py:1204] (2/4) Exclude cut with ID _14603_210_4220_1_1530615688441_3097319_90-31383-0 from training. Duration: 0.87 2023-10-03 01:03:42,075 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_15517_210_6753_1_1530952293_1664795_119-31001-0 from training. Duration: 0.94 2023-10-03 01:03:42,109 WARNING [train.py:1204] (2/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_684-85342-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 01:03:46,725 WARNING [train.py:1204] (2/4) Exclude cut with ID _14603_210_4220_1_1530615688441_3097319_97-325053-0_sp1.1 from training. Duration: 0.836375 2023-10-03 01:03:50,541 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_68702_210_23689_1_1535799577_3420068_168-25150-0_sp1.1 from training. Duration: 0.97275 2023-10-03 01:03:50,549 WARNING [train.py:1204] (2/4) Exclude cut with ID _65134_210_14763_1_1535335393985_3505368_111-27984-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 01:03:51,843 WARNING [train.py:1204] (2/4) Exclude cut with ID _48678_210_5663_1_1533988830947_3769199_276-226230-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 01:03:53,372 WARNING [train.py:1204] (2/4) Exclude cut with ID _28427_210_4220_1_1532581240967_3061069_43-84928-0_sp1.1 from training. Duration: 0.9 2023-10-03 01:03:53,993 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.3.encoder.layers.1.self_attn2.whiten, num_groups=1, num_channels=512, metric=15.41 vs. limit=22.5 2023-10-03 01:03:54,705 WARNING [train.py:1204] (2/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_158-250116-0_sp1.1 from training. Duration: 0.563625 2023-10-03 01:03:54,734 WARNING [train.py:1204] (2/4) Exclude cut with ID _66873_210_5517_1_1535454136506_4293370_279-332556-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 01:03:54,781 WARNING [train.py:1204] (2/4) Exclude cut with ID _42863_210_9070_1_1533902398788_3696920_488-230146-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 01:03:54,781 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_387-247159-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 01:03:56,745 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6729_210_2550_1_1528032481_1788725_261-42619-0_sp1.1 from training. Duration: 0.97275 2023-10-03 01:03:59,494 WARNING [train.py:1204] (2/4) Exclude cut with ID _19896_210_1883_1_1531709022662_6649569_831-44724-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 01:04:02,265 WARNING [train.py:1204] (2/4) Exclude cut with ID _55153_210_2253_1_1534384786677_3162096_280-285944-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 01:04:02,286 WARNING [train.py:1204] (2/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_448-4048-0 from training. Duration: 0.96 2023-10-03 01:04:02,348 WARNING [train.py:1204] (2/4) Exclude cut with ID _29678_210_7893_1_1532421042787_3906118_456-202548-0_sp1.1 from training. Duration: 0.97275 2023-10-03 01:04:02,360 WARNING [train.py:1204] (2/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_107-332398-0_sp1.1 from training. Duration: 0.7090625 2023-10-03 01:04:06,508 WARNING [train.py:1204] (2/4) Exclude cut with ID _13921_210_9994_1_1530838702800_3003429_46-149444-0_sp1.1 from training. Duration: 0.8545625 2023-10-03 01:04:07,813 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_257-20733-0_sp0.9 from training. Duration: 0.8444375 2023-10-03 01:04:07,878 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_53753_210_7234_1_1534320003_3594148_385-234809-0_sp1.1 from training. Duration: 0.963625 2023-10-03 01:04:07,929 WARNING [train.py:1204] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_292-215346-0_sp1.1 from training. Duration: 0.8818125 2023-10-03 01:04:08,152 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.conv_skip_rate, batch_count=1082960.0, ans=0.0 2023-10-03 01:04:13,937 WARNING [train.py:1204] (2/4) Exclude cut with ID _68965_210_1883_1_1535698962272_3497490_196-124268-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 01:04:13,968 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_23801_210_16223_1_1532066392_3294657_570-5439-0_sp1.1 from training. Duration: 0.8818125 2023-10-03 01:04:18,809 WARNING [train.py:1204] (2/4) Exclude cut with ID _5166_210_2084_1_1527073197160_4087993_349-6928-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 01:04:18,846 WARNING [train.py:1204] (2/4) Exclude cut with ID _60621_210_17071_1_1535013053530_3671739_226-255202-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 01:04:18,850 WARNING [train.py:1204] (2/4) Exclude cut with ID _41817_210_18835_1_1533434477160_7423049_448-130302-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 01:04:21,767 WARNING [train.py:1204] (2/4) Exclude cut with ID _6653_210_2629_1_1527919172441_6732679_758-143677-0_sp1.1 from training. Duration: 0.8909375 2023-10-03 01:04:21,815 WARNING [train.py:1204] (2/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_912-47473-0_sp0.9 from training. Duration: 0.7555625 2023-10-03 01:04:21,852 WARNING [train.py:1204] (2/4) Exclude cut with ID _6694_210_4599_1_1527852637490_3965529_356-169233-0_sp1.1 from training. Duration: 0.9 2023-10-03 01:04:23,130 WARNING [train.py:1204] (2/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_1-99680-0 from training. Duration: 0.67 2023-10-03 01:04:24,540 WARNING [train.py:1204] (2/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_199-86432-0_sp1.1 from training. Duration: 0.8909375 2023-10-03 01:04:24,557 WARNING [train.py:1204] (2/4) Exclude cut with ID _61148_210_6897_1_1535094022726_4217610_362-182430-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 01:04:26,400 WARNING [train.py:1204] (2/4) Exclude cut with ID _49376_210_5986_1_1533985278732_3370740_301-154763-0 from training. Duration: 0.98 2023-10-03 01:04:27,811 WARNING [train.py:1204] (2/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_572-185150-0_sp1.1 from training. Duration: 0.8818125 2023-10-03 01:04:32,081 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_47896_210_20508_1_1534492518_5958868_558-314572-0_sp1.1 from training. Duration: 0.8818125 2023-10-03 01:04:33,360 INFO [train.py:1046] (2/4) Epoch 31, batch 3100, loss[loss=0.1629, simple_loss=0.2416, pruned_loss=0.04212, over 23306.00 frames. ], tot_loss[loss=0.1633, simple_loss=0.2428, pruned_loss=0.04195, over 4754728.07 frames. ], batch size: 93, lr: 3.29e-03, grad_scale: 16.0 2023-10-03 01:04:33,440 WARNING [train.py:1204] (2/4) Exclude cut with ID _45560_210_21982_1_1533812603951_3584999_111-6376-0_sp0.9 from training. Duration: 0.9333125 2023-10-03 01:04:36,081 WARNING [train.py:1204] (2/4) Exclude cut with ID _14170_210_5219_1_1530525949868_4469650_412-317077-0_sp1.1 from training. Duration: 0.6818125 2023-10-03 01:04:37,506 WARNING [train.py:1204] (2/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_718-68505-0 from training. Duration: 0.89 2023-10-03 01:04:40,750 WARNING [train.py:1204] (2/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_77-311404-0 from training. Duration: 0.94 2023-10-03 01:04:40,830 WARNING [train.py:1204] (2/4) Exclude cut with ID _60229_210_27767_1_1534838406158_3655350_264-279573-0 from training. Duration: 0.83 2023-10-03 01:04:40,937 WARNING [train.py:1204] (2/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_106-300985-0_sp1.1 from training. Duration: 0.8181875 2023-10-03 01:04:45,175 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_4183_210_2087_1_1526729100_1869838_79-277189-0_sp1.1 from training. Duration: 0.963625 2023-10-03 01:04:45,185 WARNING [train.py:1204] (2/4) Exclude cut with ID _28427_210_4220_1_1532581240967_3061069_195-295268-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 01:04:48,433 WARNING [train.py:1204] (2/4) Exclude cut with ID _4092_210_2151_1_1526643044449_3135759_356-278694-0_sp0.9 from training. Duration: 0.52225 2023-10-03 01:04:52,467 WARNING [train.py:1204] (2/4) Exclude cut with ID _14459_210_2228_1_1531443641946_7516700_777-71446-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 01:04:57,596 WARNING [train.py:1204] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_520-126729-0 from training. Duration: 0.7 2023-10-03 01:04:58,581 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.1.self_attn_weights.whiten_keys.whitening_limit, batch_count=1083160.0, ans=6.0 2023-10-03 01:05:01,715 WARNING [train.py:1204] (2/4) Exclude cut with ID _13684_210_7497_1_1530619354863_3598359_312-235606-0_sp0.9 from training. Duration: 0.6666875 2023-10-03 01:05:01,758 WARNING [train.py:1204] (2/4) Exclude cut with ID _37537_210_2253_1_1533171758568_3421949_283-349402-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 01:05:01,793 WARNING [train.py:1204] (2/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_29-117932-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 01:05:01,827 WARNING [train.py:1204] (2/4) Exclude cut with ID _29303_210_2575_1_1532392237303_7410649_721-283351-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 01:05:03,196 WARNING [train.py:1204] (2/4) Exclude cut with ID _14886_210_4220_1_1530702020355_3516190_175-229073-0_sp0.9 from training. Duration: 0.5 2023-10-03 01:05:05,908 WARNING [train.py:1204] (2/4) Exclude cut with ID _15458_210_8341_1_1530858614263_3834410_70-268830-0_sp1.1 from training. Duration: 0.97275 2023-10-03 01:05:05,925 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_2295_210_2087_1_1525500993_2407863_234-83104-0 from training. Duration: 0.62 2023-10-03 01:05:05,926 WARNING [train.py:1204] (2/4) Exclude cut with ID _22961_210_8254_1_1531976366672_4278180_317-199546-0_sp1.1 from training. Duration: 0.7818125 2023-10-03 01:05:07,361 WARNING [train.py:1204] (2/4) Exclude cut with ID _27894_210_12392_1_1532415496078_6902439_547-196022-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 01:05:08,746 WARNING [train.py:1204] (2/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_46-207599-0 from training. Duration: 0.64 2023-10-03 01:05:10,185 WARNING [train.py:1204] (2/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_426-326246-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 01:05:15,439 WARNING [train.py:1204] (2/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_445-117398-0_sp0.9 from training. Duration: 0.988875 2023-10-03 01:05:15,504 WARNING [train.py:1204] (2/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_563-347832-0 from training. Duration: 0.89 2023-10-03 01:05:16,641 WARNING [train.py:1204] (2/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_252-216674-0 from training. Duration: 0.54 2023-10-03 01:05:16,722 WARNING [train.py:1204] (2/4) Exclude cut with ID _59611_210_8895_1_1534741198240_3650361_187-212779-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 01:05:18,045 WARNING [train.py:1204] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_282-159367-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 01:05:19,640 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_68702_210_23689_1_1535799577_3420068_33-136883-0_sp1.1 from training. Duration: 0.936375 2023-10-03 01:05:20,941 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_47228_210_4983_1_1533780021_3672027_184-293706-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 01:05:20,967 WARNING [train.py:1204] (2/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_656-188361-0_sp0.9 from training. Duration: 0.9666875 2023-10-03 01:05:21,077 WARNING [train.py:1204] (2/4) Exclude cut with ID _4866_210_2577_1_1527404412284_4364659_352-157930-0_sp0.9 from training. Duration: 0.9 2023-10-03 01:05:21,078 WARNING [train.py:1204] (2/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_210-106047-0_sp1.1 from training. Duration: 0.8818125 2023-10-03 01:05:24,328 WARNING [train.py:1204] (2/4) Exclude cut with ID _9389_210_1664_1_1529217337301_5289269_289-213417-0_sp1.1 from training. Duration: 0.8090625 2023-10-03 01:05:25,591 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_3910_210_1794_1_1526777667_3747260_178-41186-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 01:05:25,595 WARNING [train.py:1204] (2/4) Exclude cut with ID _27052_210_5986_1_1532329197187_3585009_24-172263-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 01:05:25,599 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_5522_210_3273_1_1527249606_3909096_318-203153-0_sp1.1 from training. Duration: 0.3909375 2023-10-03 01:05:27,518 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.3.encoder.layers.3.feed_forward1.out_whiten, num_groups=1, num_channels=512, metric=7.30 vs. limit=15.0 2023-10-03 01:05:30,104 WARNING [train.py:1204] (2/4) Exclude cut with ID _27894_210_12392_1_1532415496078_6902439_222-51982-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 01:05:31,517 WARNING [train.py:1204] (2/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_87-131427-0 from training. Duration: 0.78 2023-10-03 01:05:34,237 WARNING [train.py:1204] (2/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_372-298093-0_sp0.9 from training. Duration: 0.888875 2023-10-03 01:05:34,297 WARNING [train.py:1204] (2/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_663-166973-0 from training. Duration: 0.8 2023-10-03 01:05:35,529 WARNING [train.py:1204] (2/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_429-177469-0_sp1.1 from training. Duration: 0.936375 2023-10-03 01:05:35,557 WARNING [train.py:1204] (2/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_724-172651-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 01:05:35,607 WARNING [train.py:1204] (2/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_103-336064-0 from training. Duration: 0.51 2023-10-03 01:05:38,197 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.606e+02 1.903e+02 2.126e+02 2.520e+02 4.741e+02, threshold=4.252e+02, percent-clipped=3.0 2023-10-03 01:05:45,696 WARNING [train.py:1204] (2/4) Exclude cut with ID _48677_210_5663_1_1533902405919_3892989_111-11439-0 from training. Duration: 0.96 2023-10-03 01:05:46,967 INFO [train.py:1046] (2/4) Epoch 31, batch 3150, loss[loss=0.1655, simple_loss=0.256, pruned_loss=0.03743, over 24331.00 frames. ], tot_loss[loss=0.1637, simple_loss=0.2422, pruned_loss=0.04253, over 4737070.62 frames. ], batch size: 77, lr: 3.29e-03, grad_scale: 16.0 2023-10-03 01:05:48,464 WARNING [train.py:1204] (2/4) Exclude cut with ID _8085_210_4557_1_1528455633493_3716100_95-103398-0_sp1.1 from training. Duration: 0.963625 2023-10-03 01:05:48,719 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.out_combiner.scale_min, batch_count=1083426.6666666667, ans=0.2 2023-10-03 01:05:49,699 WARNING [train.py:1204] (2/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_1041-110837-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 01:05:51,095 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_50050_210_16223_1_1535435796_7290138_1155-121757-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 01:05:51,097 WARNING [train.py:1204] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_602-275260-0_sp0.9 from training. Duration: 0.911125 2023-10-03 01:05:52,494 WARNING [train.py:1204] (2/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_265-164486-0 from training. Duration: 0.96 2023-10-03 01:05:53,883 WARNING [train.py:1204] (2/4) Exclude cut with ID _46067_210_4220_1_1534125571066_3307450_142-229375-0_sp1.1 from training. Duration: 0.963625 2023-10-03 01:05:53,914 WARNING [train.py:1204] (2/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_310-26186-0_sp1.1 from training. Duration: 0.636375 2023-10-03 01:05:54,006 WARNING [train.py:1204] (2/4) Exclude cut with ID _28461_210_2253_1_1532771775189_3041299_136-270644-0 from training. Duration: 0.93 2023-10-03 01:05:55,925 WARNING [train.py:1204] (2/4) Exclude cut with ID _14860_210_4915_1_1530702020340_7343240_438-286372-0_sp1.1 from training. Duration: 0.936375 2023-10-03 01:05:59,220 WARNING [train.py:1204] (2/4) Exclude cut with ID IC0128W0004-63309 from training. Duration: 0.831 2023-10-03 01:06:00,748 WARNING [train.py:1204] (2/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_135-174080-0 from training. Duration: 0.53 2023-10-03 01:06:00,780 WARNING [train.py:1204] (2/4) Exclude cut with ID _41326_210_5854_1_1533778076509_3492080_277-59511-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 01:06:02,019 WARNING [train.py:1204] (2/4) Exclude cut with ID IC0144W0112-71379 from training. Duration: 0.992 2023-10-03 01:06:02,083 WARNING [train.py:1204] (2/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_711-15688-0_sp0.9 from training. Duration: 0.47775 2023-10-03 01:06:02,177 WARNING [train.py:1204] (2/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_237-332027-0 from training. Duration: 0.95 2023-10-03 01:06:03,434 WARNING [train.py:1204] (2/4) Exclude cut with ID _7970_210_1883_1_1528455616296_3472230_25-178407-0 from training. Duration: 0.71 2023-10-03 01:06:03,435 WARNING [train.py:1204] (2/4) Exclude cut with ID _8480_210_1874_1_1528589391205_6728330_309-139896-0 from training. Duration: 0.81 2023-10-03 01:06:03,461 WARNING [train.py:1204] (2/4) Exclude cut with ID _39571_210_13449_1_1533801584143_3651077_319-268877-0_sp1.1 from training. Duration: 0.936375 2023-10-03 01:06:03,465 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_43802_210_2290_1_1533516203_8829504_155-246880-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 01:06:04,822 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_566-122599-0_sp1.1 from training. Duration: 0.936375 2023-10-03 01:06:06,215 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_12868_210_6753_1_1530701932_530275_18-76322-0 from training. Duration: 0.87 2023-10-03 01:06:07,654 WARNING [train.py:1204] (2/4) Exclude cut with ID _56494_210_4220_1_1535284837625_3033169_159-106035-0_sp1.1 from training. Duration: 0.963625 2023-10-03 01:06:07,693 WARNING [train.py:1204] (2/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_98-276013-0_sp1.1 from training. Duration: 0.963625 2023-10-03 01:06:09,113 WARNING [train.py:1204] (2/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_330-216124-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 01:06:10,962 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_177-58005-0_sp0.9 from training. Duration: 0.688875 2023-10-03 01:06:15,676 WARNING [train.py:1204] (2/4) Exclude cut with ID _56235_210_10640_1_1534557335235_4161099_343-139898-0 from training. Duration: 0.96 2023-10-03 01:06:15,742 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6961_210_2087_1_1527937168_6815011_395-185060-0_sp1.1 from training. Duration: 0.863625 2023-10-03 01:06:17,228 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.self_attn_weights.pos_emb_skip_rate, batch_count=1083560.0, ans=0.0 2023-10-03 01:06:19,700 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7002_210_1794_1_1527939683_5157286_184-229630-0_sp0.9 from training. Duration: 0.82225 2023-10-03 01:06:20,971 WARNING [train.py:1204] (2/4) Exclude cut with ID _59170_210_6721_1_1534983633960_4203335_153-18895-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 01:06:22,723 WARNING [train.py:1204] (2/4) Exclude cut with ID _49643_210_7989_1_1534640244224_3939031_598-161926-0 from training. Duration: 0.9 2023-10-03 01:06:24,278 WARNING [train.py:1204] (2/4) Exclude cut with ID _4068_210_2629_1_1526697350395_1546259_224-288693-0 from training. Duration: 0.59 2023-10-03 01:06:25,551 WARNING [train.py:1204] (2/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_718-68505-0_sp1.1 from training. Duration: 0.8090625 2023-10-03 01:06:25,599 WARNING [train.py:1204] (2/4) Exclude cut with ID _4068_210_2629_1_1526697350395_1546259_224-288693-0_sp0.9 from training. Duration: 0.6555625 2023-10-03 01:06:25,610 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_2296_210_2087_1_1525503448_3397335_57-185430-0_sp0.9 from training. Duration: 0.6666875 2023-10-03 01:06:25,660 WARNING [train.py:1204] (2/4) Exclude cut with ID _15458_210_8341_1_1530858614263_3834410_49-235472-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 01:06:27,140 WARNING [train.py:1204] (2/4) Exclude cut with ID _64135_210_2253_1_1535677267876_7608972_498-5039-0_sp0.9 from training. Duration: 0.8666875 2023-10-03 01:06:28,539 WARNING [train.py:1204] (2/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_283-220372-0_sp0.9 from training. Duration: 0.62225 2023-10-03 01:06:28,548 WARNING [train.py:1204] (2/4) Exclude cut with ID _6473_210_1741_1_1527901307396_3815380_108-17335-0_sp0.9 from training. Duration: 0.72225 2023-10-03 01:06:28,645 WARNING [train.py:1204] (2/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_113-92608-0 from training. Duration: 0.54 2023-10-03 01:06:30,586 WARNING [train.py:1204] (2/4) Exclude cut with ID _14170_210_5219_1_1530525949868_4469650_272-249985-0_sp1.1 from training. Duration: 0.6090625 2023-10-03 01:06:30,595 WARNING [train.py:1204] (2/4) Exclude cut with ID _50376_210_13401_1_1534031502101_4217740_518-187390-0_sp1.1 from training. Duration: 0.97275 2023-10-03 01:06:33,178 WARNING [train.py:1204] (2/4) Exclude cut with ID _59738_210_15379_1_1534849512562_3941470_165-248260-0_sp1.1 from training. Duration: 0.92725 2023-10-03 01:06:33,195 WARNING [train.py:1204] (2/4) Exclude cut with ID _23818_210_6228_1_1531917063884_3676920_400-343925-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 01:06:33,258 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6961_210_2087_1_1527937168_6815011_562-165518-0 from training. Duration: 0.77 2023-10-03 01:06:34,484 WARNING [train.py:1204] (2/4) Exclude cut with ID _24271_210_12658_1_1531987270022_7190519_289-207931-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 01:06:35,889 WARNING [train.py:1204] (2/4) Exclude cut with ID _14632_210_9988_1_1530691279392_3923200_332-105904-0 from training. Duration: 0.9 2023-10-03 01:06:35,907 WARNING [train.py:1204] (2/4) Exclude cut with ID _40402_210_18219_1_1533522324115_2953150_203-232438-0_sp1.1 from training. Duration: 0.97275 2023-10-03 01:06:38,575 WARNING [train.py:1204] (2/4) Exclude cut with ID _13252_210_1925_1_1530671372047_3336909_263-168388-0 from training. Duration: 0.86 2023-10-03 01:06:38,639 WARNING [train.py:1204] (2/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_174-205058-0 from training. Duration: 0.95 2023-10-03 01:06:40,071 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_94-195801-0_sp1.1 from training. Duration: 0.8545625 2023-10-03 01:06:40,093 WARNING [train.py:1204] (2/4) Exclude cut with ID _55153_210_2253_1_1534384786677_3162096_269-103204-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 01:06:40,331 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.ff2_skip_rate, batch_count=1083626.6666666667, ans=0.0 2023-10-03 01:06:41,496 WARNING [train.py:1204] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_748-36150-0 from training. Duration: 0.91 2023-10-03 01:06:42,873 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_12868_210_6753_1_1530702479_3610564_148-172861-0_sp1.1 from training. Duration: 0.4545625 2023-10-03 01:06:44,208 WARNING [train.py:1204] (2/4) Exclude cut with ID _67499_210_17282_1_1535509805220_3692430_460-77730-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 01:06:47,487 WARNING [train.py:1204] (2/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_374-20753-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 01:06:48,038 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.4.encoder.layers.1.self_attn2.whiten, num_groups=1, num_channels=384, metric=11.60 vs. limit=22.5 2023-10-03 01:06:48,930 WARNING [train.py:1204] (2/4) Exclude cut with ID _45329_210_7005_1_1534157951211_6774339_374-289983-0_sp1.1 from training. Duration: 0.97275 2023-10-03 01:06:49,060 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.0.conv_skip_rate, batch_count=1083693.3333333333, ans=0.0 2023-10-03 01:06:50,320 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_34886_210_10633_1_1533203882_1755661_76-156096-0_sp0.9 from training. Duration: 0.9666875 2023-10-03 01:06:55,200 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_238-256668-0_sp0.9 from training. Duration: 0.7666875 2023-10-03 01:06:55,250 WARNING [train.py:1204] (2/4) Exclude cut with ID _46683_210_9642_1_1533730913492_4591116_489-146727-0_sp1.1 from training. Duration: 0.97275 2023-10-03 01:06:56,686 WARNING [train.py:1204] (2/4) Exclude cut with ID _13938_210_9988_1_1530518718978_3693960_304-112784-0_sp0.9 from training. Duration: 0.511125 2023-10-03 01:06:59,624 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.conv_module1.balancer2.min_abs, batch_count=1083760.0, ans=0.5 2023-10-03 01:07:00,743 INFO [train.py:1046] (2/4) Epoch 31, batch 3200, loss[loss=0.1568, simple_loss=0.2469, pruned_loss=0.03336, over 24463.00 frames. ], tot_loss[loss=0.1626, simple_loss=0.2408, pruned_loss=0.0422, over 4720021.56 frames. ], batch size: 69, lr: 3.29e-03, grad_scale: 32.0 2023-10-03 01:07:02,662 WARNING [train.py:1204] (2/4) Exclude cut with ID _24271_210_12658_1_1531987270022_7190519_243-46549-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 01:07:02,667 WARNING [train.py:1204] (2/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_78-182912-0_sp1.1 from training. Duration: 0.536375 2023-10-03 01:07:05,425 WARNING [train.py:1204] (2/4) Exclude cut with ID _57572_210_5517_1_1534921219624_3809484_121-321334-0_sp1.1 from training. Duration: 0.97275 2023-10-03 01:07:06,798 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_40047_210_2290_1_1533290519_2374311_254-102149-0_sp1.1 from training. Duration: 0.963625 2023-10-03 01:07:06,799 WARNING [train.py:1204] (2/4) Exclude cut with ID _28461_210_2253_1_1532771775189_3041299_96-152158-0 from training. Duration: 0.85 2023-10-03 01:07:07,437 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.3.encoder.layers.0.feed_forward3.out_whiten, num_groups=1, num_channels=512, metric=10.69 vs. limit=15.0 2023-10-03 01:07:09,612 WARNING [train.py:1204] (2/4) Exclude cut with ID _29877_210_6228_1_1532397791106_5276149_161-102979-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 01:07:14,345 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_3787_210_2040_1_1526477544_5478586_603-120324-0_sp0.9 from training. Duration: 0.9 2023-10-03 01:07:14,589 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.conv_module1.balancer2.min_positive, batch_count=1083826.6666666667, ans=0.05 2023-10-03 01:07:18,978 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_41750_210_1960_1_1533520678_3948042_247-213456-0_sp1.1 from training. Duration: 0.97275 2023-10-03 01:07:22,035 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.bypass.scale_min, batch_count=1083826.6666666667, ans=0.2 2023-10-03 01:07:26,583 WARNING [train.py:1204] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_127-119109-0_sp0.9 from training. Duration: 0.8 2023-10-03 01:07:36,736 WARNING [train.py:1204] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_215-202973-0 from training. Duration: 0.88 2023-10-03 01:07:38,126 WARNING [train.py:1204] (2/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_17-320491-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 01:07:39,598 WARNING [train.py:1204] (2/4) Exclude cut with ID _5166_210_2084_1_1527073197160_4087993_310-114452-0 from training. Duration: 0.59 2023-10-03 01:07:41,008 WARNING [train.py:1204] (2/4) Exclude cut with ID _15133_210_6228_1_1530794512074_3080379_29-264458-0_sp0.9 from training. Duration: 0.6444375 2023-10-03 01:07:44,213 WARNING [train.py:1204] (2/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_159-281310-0_sp1.1 from training. Duration: 0.863625 2023-10-03 01:07:44,225 WARNING [train.py:1204] (2/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_195-167011-0_sp0.9 from training. Duration: 0.8333125 2023-10-03 01:07:45,428 WARNING [train.py:1204] (2/4) Exclude cut with ID _14087_210_2253_1_1530529224512_3014029_156-257607-0_sp1.1 from training. Duration: 0.7545625 2023-10-03 01:07:48,779 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_2295_210_2087_1_1525500993_2407863_116-238894-0 from training. Duration: 0.7 2023-10-03 01:07:50,223 WARNING [train.py:1204] (2/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_321-335475-0_sp0.9 from training. Duration: 0.5 2023-10-03 01:07:52,893 WARNING [train.py:1204] (2/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_188-114464-0 from training. Duration: 0.82 2023-10-03 01:07:55,959 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_2592_210_2290_1_1525697025_4882925_157-70512-0 from training. Duration: 0.91 2023-10-03 01:07:56,196 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.conv_skip_rate, batch_count=1083960.0, ans=0.0 2023-10-03 01:07:57,408 WARNING [train.py:1204] (2/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_138-64210-0_sp1.1 from training. Duration: 0.663625 2023-10-03 01:08:02,331 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_11305_210_1794_1_1529750428_7178148_651-95702-0_sp1.1 from training. Duration: 0.963625 2023-10-03 01:08:04,114 WARNING [train.py:1204] (2/4) Exclude cut with ID _14224_210_4381_1_1530529062037_4264010_379-184223-0_sp0.9 from training. Duration: 0.9444375 2023-10-03 01:08:04,147 WARNING [train.py:1204] (2/4) Exclude cut with ID _8394_210_6702_1_1528590504316_3540189_381-125325-0_sp1.1 from training. Duration: 0.963625 2023-10-03 01:08:04,201 WARNING [train.py:1204] (2/4) Exclude cut with ID IC0622W0021-307659 from training. Duration: 0.896 2023-10-03 01:08:04,203 WARNING [train.py:1204] (2/4) Exclude cut with ID _7624_210_1531_1_1528109802198_4933101_418-349834-0_sp1.1 from training. Duration: 0.5454375 2023-10-03 01:08:06,917 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.538e+02 1.876e+02 2.218e+02 2.823e+02 4.927e+02, threshold=4.435e+02, percent-clipped=2.0 2023-10-03 01:08:08,301 WARNING [train.py:1204] (2/4) Exclude cut with ID _49728_210_12651_1_1534041034294_6788150_607-3416-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 01:08:09,734 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8275_210_4198_1_1528455320_3892571_491-17707-0 from training. Duration: 0.64 2023-10-03 01:08:09,783 WARNING [train.py:1204] (2/4) Exclude cut with ID _5287_210_2084_1_1527418883040_3878670_333-48508-0 from training. Duration: 0.6 2023-10-03 01:08:11,285 WARNING [train.py:1204] (2/4) Exclude cut with ID _8365_210_2064_1_1528592311070_4677690_371-122295-0 from training. Duration: 0.55 2023-10-03 01:08:11,391 WARNING [train.py:1204] (2/4) Exclude cut with ID _14860_210_4915_1_1530702020340_7343240_836-328489-0 from training. Duration: 0.9 2023-10-03 01:08:13,138 WARNING [train.py:1204] (2/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_1033-274376-0_sp0.9 from training. Duration: 0.9666875 2023-10-03 01:08:15,752 INFO [train.py:1046] (2/4) Epoch 31, batch 3250, loss[loss=0.1772, simple_loss=0.2486, pruned_loss=0.05286, over 23762.00 frames. ], tot_loss[loss=0.1624, simple_loss=0.2404, pruned_loss=0.04219, over 4713073.95 frames. ], batch size: 232, lr: 3.29e-03, grad_scale: 32.0 2023-10-03 01:08:17,668 WARNING [train.py:1204] (2/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_475-127455-0_sp0.9 from training. Duration: 0.77775 2023-10-03 01:08:17,682 WARNING [train.py:1204] (2/4) Exclude cut with ID IC0671W0071-331914 from training. Duration: 0.896 2023-10-03 01:08:17,700 WARNING [train.py:1204] (2/4) Exclude cut with ID _30327_210_16049_1_1532484161295_3924169_2-1523-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 01:08:17,702 WARNING [train.py:1204] (2/4) Exclude cut with ID _24314_210_4915_1_1532135078116_7170179_460-208279-0_sp1.1 from training. Duration: 0.97275 2023-10-03 01:08:17,803 WARNING [train.py:1204] (2/4) Exclude cut with ID IC0679W0052-335859 from training. Duration: 0.64 2023-10-03 01:08:22,044 WARNING [train.py:1204] (2/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_3-237434-0_sp1.1 from training. Duration: 0.6909375 2023-10-03 01:08:22,206 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.feed_forward1.hidden_balancer.prob, batch_count=1084093.3333333333, ans=0.125 2023-10-03 01:08:25,164 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.feed_forward1.out_whiten.whitening_limit, batch_count=1084093.3333333333, ans=15.0 2023-10-03 01:08:26,056 WARNING [train.py:1204] (2/4) Exclude cut with ID _69884_210_9771_1_1535846392818_4166169_405-185283-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 01:08:29,154 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.balancer1.prob, batch_count=1084160.0, ans=0.125 2023-10-03 01:08:32,448 WARNING [train.py:1204] (2/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_927-83684-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 01:08:32,459 WARNING [train.py:1204] (2/4) Exclude cut with ID _6694_210_4599_1_1527852637490_3965529_356-169233-0 from training. Duration: 0.99 2023-10-03 01:08:33,817 WARNING [train.py:1204] (2/4) Exclude cut with ID _19896_210_1883_1_1531709022662_6649569_562-233001-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 01:08:35,149 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_20120_210_6753_1_1531643035_5482859_354-78629-0_sp1.1 from training. Duration: 0.963625 2023-10-03 01:08:35,151 WARNING [train.py:1204] (2/4) Exclude cut with ID _60135_210_13427_1_1534935557922_3651319_96-168198-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 01:08:35,360 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.conv_module2.balancer1.prob, batch_count=1084160.0, ans=0.125 2023-10-03 01:08:36,950 WARNING [train.py:1204] (2/4) Exclude cut with ID _31602_210_5854_1_1532677021456_4458250_249-96604-0_sp1.1 from training. Duration: 0.8090625 2023-10-03 01:08:36,972 WARNING [train.py:1204] (2/4) Exclude cut with ID _14100_210_8341_1_1530529320613_3678069_258-224599-0_sp0.9 from training. Duration: 0.8555625 2023-10-03 01:08:38,459 WARNING [train.py:1204] (2/4) Exclude cut with ID _19708_210_9206_1_1532136650733_4238364_215-160135-0_sp1.1 from training. Duration: 0.97275 2023-10-03 01:08:39,742 WARNING [train.py:1204] (2/4) Exclude cut with ID _21878_210_3525_1_1531738772002_7264330_113-18163-0_sp0.9 from training. Duration: 0.988875 2023-10-03 01:08:39,777 WARNING [train.py:1204] (2/4) Exclude cut with ID _64135_210_2253_1_1535677267876_7608972_552-337340-0_sp1.1 from training. Duration: 0.92725 2023-10-03 01:08:39,811 WARNING [train.py:1204] (2/4) Exclude cut with ID _67725_210_27767_1_1535608756671_5105359_10-12887-0_sp1.1 from training. Duration: 0.97275 2023-10-03 01:08:39,821 WARNING [train.py:1204] (2/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_401-284840-0_sp1.1 from training. Duration: 0.97275 2023-10-03 01:08:39,855 WARNING [train.py:1204] (2/4) Exclude cut with ID _44659_210_2721_1_1534036120061_6614860_483-141285-0_sp1.1 from training. Duration: 0.936375 2023-10-03 01:08:44,998 WARNING [train.py:1204] (2/4) Exclude cut with ID _9461_210_4557_1_1529060514726_3677451_358-332734-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 01:08:45,104 WARNING [train.py:1204] (2/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_788-298907-0_sp1.1 from training. Duration: 0.8090625 2023-10-03 01:08:47,793 WARNING [train.py:1204] (2/4) Exclude cut with ID _39411_210_5986_1_1533538825030_3343649_354-267196-0_sp1.1 from training. Duration: 0.92725 2023-10-03 01:08:47,814 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_14953_210_2151_1_1530701710_5616165_192-309792-0_sp1.1 from training. Duration: 0.97275 2023-10-03 01:08:48,072 INFO [scaling.py:1118] (2/4) WithLoss: name=encoder.encoders.3.encoder.layers.3.self_attn_weights, loss-sum=0.000e+00 2023-10-03 01:08:49,184 WARNING [train.py:1204] (2/4) Exclude cut with ID _59170_210_6721_1_1534983633960_4203335_290-151255-0_sp1.1 from training. Duration: 0.92725 2023-10-03 01:08:50,460 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_5575_210_1410_1_1527301305_2570170_249-93769-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 01:08:50,477 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_46013_210_7234_1_1533776362_3744032_463-52378-0_sp1.1 from training. Duration: 0.7818125 2023-10-03 01:08:54,593 WARNING [train.py:1204] (2/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_580-93798-0 from training. Duration: 0.56 2023-10-03 01:08:54,641 WARNING [train.py:1204] (2/4) Exclude cut with ID _19514_210_9259_1_1531530071330_5084230_385-13762-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 01:08:54,646 WARNING [train.py:1204] (2/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_458-67321-0_sp1.1 from training. Duration: 0.8818125 2023-10-03 01:08:54,865 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.conv_skip_rate, batch_count=1084226.6666666667, ans=0.0 2023-10-03 01:08:57,414 WARNING [train.py:1204] (2/4) Exclude cut with ID _41298_210_9019_1_1533430371450_4822143_188-154791-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 01:08:58,779 WARNING [train.py:1204] (2/4) Exclude cut with ID _15037_210_2253_1_1530752294104_3267650_296-192597-0_sp0.9 from training. Duration: 0.9 2023-10-03 01:09:03,476 WARNING [train.py:1204] (2/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_661-264857-0_sp1.1 from training. Duration: 0.8454375 2023-10-03 01:09:08,397 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.1.conv_module1.balancer2.prob, batch_count=1084293.3333333333, ans=0.125 2023-10-03 01:09:12,151 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_34008_210_10431_1_1533345424_8183247_662-317331-0_sp1.1 from training. Duration: 0.8545625 2023-10-03 01:09:12,182 WARNING [train.py:1204] (2/4) Exclude cut with ID _46502_210_13794_1_1533724452198_3657159_241-278329-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 01:09:12,183 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_11349_210_6753_1_1529800861_1101207_26-65602-0 from training. Duration: 0.96 2023-10-03 01:09:12,186 WARNING [train.py:1204] (2/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_448-4048-0_sp1.1 from training. Duration: 0.87275 2023-10-03 01:09:12,193 WARNING [train.py:1204] (2/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_662-275621-0_sp1.1 from training. Duration: 0.3818125 2023-10-03 01:09:12,234 WARNING [train.py:1204] (2/4) Exclude cut with ID _13938_210_9988_1_1530518718978_3693960_256-54454-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 01:09:12,434 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.attention_skip_rate, batch_count=1084293.3333333333, ans=0.0 2023-10-03 01:09:12,449 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.bypass.scale_min, batch_count=1084293.3333333333, ans=0.2 2023-10-03 01:09:15,630 WARNING [train.py:1204] (2/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_256-32662-0 from training. Duration: 0.9 2023-10-03 01:09:15,678 WARNING [train.py:1204] (2/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_326-17061-0 from training. Duration: 0.94 2023-10-03 01:09:17,617 WARNING [train.py:1204] (2/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_505-97987-0_sp1.1 from training. Duration: 0.7818125 2023-10-03 01:09:17,704 WARNING [train.py:1204] (2/4) Exclude cut with ID _66873_210_5517_1_1535454136506_4293370_316-294364-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 01:09:19,030 WARNING [train.py:1204] (2/4) Exclude cut with ID _19514_210_9259_1_1531530071330_5084230_762-231063-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 01:09:19,094 WARNING [train.py:1204] (2/4) Exclude cut with ID _4697_210_2069_1_1526815801953_3823098_68-219807-0_sp0.9 from training. Duration: 0.52225 2023-10-03 01:09:19,135 WARNING [train.py:1204] (2/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_479-158053-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 01:09:24,573 WARNING [train.py:1204] (2/4) Exclude cut with ID _66112_210_10635_1_1535590236305_3801484_83-38181-0_sp1.1 from training. Duration: 0.936375 2023-10-03 01:09:24,587 WARNING [train.py:1204] (2/4) Exclude cut with ID _55857_210_2346_1_1534644007831_6898209_255-146346-0_sp1.1 from training. Duration: 0.963625 2023-10-03 01:09:25,977 WARNING [train.py:1204] (2/4) Exclude cut with ID _48836_210_12210_1_1534075848293_3904150_429-35475-0 from training. Duration: 0.89 2023-10-03 01:09:25,996 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_50154_210_13731_1_1534750862_1080961_119-187163-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 01:09:28,741 WARNING [train.py:1204] (2/4) Exclude cut with ID _8793_210_6310_1_1528542985139_2310009_121-179782-0_sp0.9 from training. Duration: 0.888875 2023-10-03 01:09:28,744 WARNING [train.py:1204] (2/4) Exclude cut with ID _14884_210_2252_1_1530707459689_5561250_591-173406-0 from training. Duration: 0.96 2023-10-03 01:09:30,107 INFO [train.py:1046] (2/4) Epoch 31, batch 3300, loss[loss=0.1633, simple_loss=0.2509, pruned_loss=0.03783, over 24689.00 frames. ], tot_loss[loss=0.1632, simple_loss=0.241, pruned_loss=0.04275, over 4697277.97 frames. ], batch size: 73, lr: 3.29e-03, grad_scale: 32.0 2023-10-03 01:09:31,513 WARNING [train.py:1204] (2/4) Exclude cut with ID _15133_210_6228_1_1530794512074_3080379_54-198193-0_sp1.1 from training. Duration: 0.8545625 2023-10-03 01:09:31,521 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6545_210_2040_1_1527774670_4245258_407-48070-0 from training. Duration: 0.76 2023-10-03 01:09:34,291 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1032-236863-0 from training. Duration: 0.84 2023-10-03 01:09:36,103 WARNING [train.py:1204] (2/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_507-303370-0 from training. Duration: 0.96 2023-10-03 01:09:36,130 WARNING [train.py:1204] (2/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_164-282366-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 01:09:36,380 INFO [scaling.py:1118] (2/4) WithLoss: name=encoder.encoders.2.encoder.layers.0.self_attn_weights, loss-sum=0.000e+00 2023-10-03 01:09:37,662 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.bypass.scale_min, batch_count=1084426.6666666667, ans=0.2 2023-10-03 01:09:40,712 WARNING [train.py:1204] (2/4) Exclude cut with ID _39867_210_9826_1_1533553222030_9344259_312-158727-0_sp1.1 from training. Duration: 0.963625 2023-10-03 01:09:42,086 WARNING [train.py:1204] (2/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_174-205058-0_sp1.1 from training. Duration: 0.863625 2023-10-03 01:09:42,115 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_11305_210_1794_1_1529750428_7178148_166-237358-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 01:09:43,499 WARNING [train.py:1204] (2/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_26-175553-0_sp1.1 from training. Duration: 0.5545625 2023-10-03 01:09:43,518 WARNING [train.py:1204] (2/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_96-290039-0_sp1.1 from training. Duration: 0.5090625 2023-10-03 01:09:45,004 WARNING [train.py:1204] (2/4) Exclude cut with ID _53219_210_4525_1_1534834836008_4064825_261-101691-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 01:09:46,885 WARNING [train.py:1204] (2/4) Exclude cut with ID _24271_210_12658_1_1531987270022_7190519_96-293390-0_sp1.1 from training. Duration: 0.936375 2023-10-03 01:09:50,271 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_271-234962-0 from training. Duration: 0.79 2023-10-03 01:09:51,646 WARNING [train.py:1204] (2/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_136-30660-0_sp1.1 from training. Duration: 0.97275 2023-10-03 01:09:53,056 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_20120_210_6753_1_1531643035_5482859_625-281046-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 01:09:54,371 WARNING [train.py:1204] (2/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_333-168193-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 01:09:54,439 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9029W0269-509664 from training. Duration: 0.896 2023-10-03 01:09:55,767 WARNING [train.py:1204] (2/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_450-147393-0_sp1.1 from training. Duration: 0.92725 2023-10-03 01:09:57,136 WARNING [train.py:1204] (2/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_643-52007-0_sp1.1 from training. Duration: 0.5181875 2023-10-03 01:09:57,201 WARNING [train.py:1204] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_330-327861-0_sp0.9 from training. Duration: 0.7444375 2023-10-03 01:09:57,210 WARNING [train.py:1204] (2/4) Exclude cut with ID _47790_210_16187_1_1533862814066_4207519_260-317651-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 01:09:57,225 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9044W0010-516654 from training. Duration: 0.896 2023-10-03 01:10:00,035 WARNING [train.py:1204] (2/4) Exclude cut with ID _14087_210_2253_1_1530529224512_3014029_265-118378-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 01:10:00,041 WARNING [train.py:1204] (2/4) Exclude cut with ID _6194_210_1534_1_1527511826160_3941194_321-148641-0_sp0.9 from training. Duration: 0.711125 2023-10-03 01:10:02,308 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.1.encoder.layers.1.self_attn_weights.whiten_keys, num_groups=4, num_channels=128, metric=2.66 vs. limit=6.0 2023-10-03 01:10:02,814 WARNING [train.py:1204] (2/4) Exclude cut with ID _54871_210_22356_1_1534665798253_3856359_101-169176-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 01:10:02,817 WARNING [train.py:1204] (2/4) Exclude cut with ID 497-129325-0061-62254-0_sp1.1 from training. Duration: 0.97725 2023-10-03 01:10:02,918 WARNING [train.py:1204] (2/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_795-33252-0_sp1.1 from training. Duration: 0.336375 2023-10-03 01:10:04,180 WARNING [train.py:1204] (2/4) Exclude cut with ID _28317_210_12742_1_1532480302381_8510325_168-246176-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 01:10:04,407 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.conv_module1.balancer1.prob, batch_count=1084560.0, ans=0.125 2023-10-03 01:10:06,072 WARNING [train.py:1204] (2/4) Exclude cut with ID _7160_210_1876_1_1527940923231_3703799_120-150943-0_sp1.1 from training. Duration: 0.736375 2023-10-03 01:10:07,500 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9084W0005-536844 from training. Duration: 0.768 2023-10-03 01:10:09,385 WARNING [train.py:1204] (2/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_234-260682-0 from training. Duration: 0.84 2023-10-03 01:10:10,633 WARNING [train.py:1204] (2/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_188-114464-0_sp0.9 from training. Duration: 0.911125 2023-10-03 01:10:13,391 WARNING [train.py:1204] (2/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_81-75849-0 from training. Duration: 0.39 2023-10-03 01:10:14,756 WARNING [train.py:1204] (2/4) Exclude cut with ID _14224_210_4381_1_1530529062037_4264010_379-184223-0_sp1.1 from training. Duration: 0.77275 2023-10-03 01:10:16,210 WARNING [train.py:1204] (2/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_454-123002-0_sp1.1 from training. Duration: 0.62725 2023-10-03 01:10:16,244 WARNING [train.py:1204] (2/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_887-249555-0_sp1.1 from training. Duration: 0.7 2023-10-03 01:10:16,359 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.0.conv_module1.balancer2.prob, batch_count=1084626.6666666667, ans=0.125 2023-10-03 01:10:17,790 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.self_attn_weights.pos_emb_skip_rate, batch_count=1084626.6666666667, ans=0.0 2023-10-03 01:10:20,607 WARNING [train.py:1204] (2/4) Exclude cut with ID _31088_210_16446_1_1532520012284_3440919_20-260902-0_sp1.1 from training. Duration: 0.97275 2023-10-03 01:10:21,926 WARNING [train.py:1204] (2/4) Exclude cut with ID _45329_210_7005_1_1534157951211_6774339_856-344574-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 01:10:21,928 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_36756_210_7234_1_1533026991_966276_4-164118-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 01:10:21,949 WARNING [train.py:1204] (2/4) Exclude cut with ID _8365_210_2064_1_1528592311070_4677690_371-122295-0_sp1.1 from training. Duration: 0.5 2023-10-03 01:10:23,445 WARNING [train.py:1204] (2/4) Exclude cut with ID _5910_210_1389_1_1527337972454_5050100_555-159815-0_sp1.1 from training. Duration: 0.8909375 2023-10-03 01:10:23,454 WARNING [train.py:1204] (2/4) Exclude cut with ID _37086_210_9038_1_1532999540891_7021250_469-147941-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 01:10:24,799 WARNING [train.py:1204] (2/4) Exclude cut with ID _3067_210_2667_1_1526212974640_4503168_459-173867-0_sp0.9 from training. Duration: 0.97775 2023-10-03 01:10:26,142 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9156W0006-572499 from training. Duration: 0.896 2023-10-03 01:10:26,363 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=1084626.6666666667, ans=0.1 2023-10-03 01:10:27,656 WARNING [train.py:1204] (2/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_277-210798-0 from training. Duration: 0.55 2023-10-03 01:10:30,278 WARNING [train.py:1204] (2/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_103-336064-0_sp1.1 from training. Duration: 0.463625 2023-10-03 01:10:30,327 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_3414_210_1988_1_1526212718_4813195_331-290945-0_sp1.1 from training. Duration: 0.936375 2023-10-03 01:10:30,328 WARNING [train.py:1204] (2/4) Exclude cut with ID _67499_210_17282_1_1535509805220_3692430_365-29276-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 01:10:31,764 WARNING [train.py:1204] (2/4) Exclude cut with ID _14821_210_8750_1_1530680503991_6829359_69-250847-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 01:10:31,766 WARNING [train.py:1204] (2/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_1102-61301-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 01:10:34,526 WARNING [train.py:1204] (2/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_166-246557-0_sp1.1 from training. Duration: 0.5818125 2023-10-03 01:10:34,571 WARNING [train.py:1204] (2/4) Exclude cut with ID _15351_210_10626_1_1530856635653_3576210_214-129918-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 01:10:34,583 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_120-41668-0_sp0.9 from training. Duration: 0.77775 2023-10-03 01:10:35,806 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.577e+02 1.832e+02 2.063e+02 2.275e+02 2.937e+02, threshold=4.126e+02, percent-clipped=0.0 2023-10-03 01:10:35,923 WARNING [train.py:1204] (2/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_27-72278-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 01:10:37,719 WARNING [train.py:1204] (2/4) Exclude cut with ID _24314_210_4915_1_1532135078116_7170179_521-258868-0_sp1.1 from training. Duration: 0.7181875 2023-10-03 01:10:41,130 WARNING [train.py:1204] (2/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_143-86344-0 from training. Duration: 0.77 2023-10-03 01:10:41,164 WARNING [train.py:1204] (2/4) Exclude cut with ID _53489_210_6228_1_1534298357169_3835970_465-157433-0_sp1.1 from training. Duration: 0.92725 2023-10-03 01:10:42,471 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_248-319353-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 01:10:43,754 INFO [train.py:1046] (2/4) Epoch 31, batch 3350, loss[loss=0.1677, simple_loss=0.2534, pruned_loss=0.04098, over 24369.00 frames. ], tot_loss[loss=0.1634, simple_loss=0.2419, pruned_loss=0.04244, over 4716540.34 frames. ], batch size: 77, lr: 3.29e-03, grad_scale: 16.0 2023-10-03 01:10:45,140 WARNING [train.py:1204] (2/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_102-77064-0_sp1.1 from training. Duration: 0.8454375 2023-10-03 01:10:45,172 WARNING [train.py:1204] (2/4) Exclude cut with ID _9389_210_1664_1_1529217337301_5289269_379-128794-0_sp1.1 from training. Duration: 0.77275 2023-10-03 01:10:45,375 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.balancer1.prob, batch_count=1084760.0, ans=0.125 2023-10-03 01:10:46,563 WARNING [train.py:1204] (2/4) Exclude cut with ID _3231_210_2106_1_1526205864934_6784220_236-260835-0_sp1.1 from training. Duration: 0.97275 2023-10-03 01:10:46,829 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.feed_forward1.hidden_balancer.prob, batch_count=1084760.0, ans=0.125 2023-10-03 01:10:47,931 WARNING [train.py:1204] (2/4) Exclude cut with ID _24271_210_12658_1_1531987270022_7190519_184-342363-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 01:10:47,932 WARNING [train.py:1204] (2/4) Exclude cut with ID _27894_210_12392_1_1532415496078_6902439_67-221663-0_sp1.1 from training. Duration: 0.92725 2023-10-03 01:10:51,353 WARNING [train.py:1204] (2/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_304-59227-0_sp1.1 from training. Duration: 0.7 2023-10-03 01:10:53,180 WARNING [train.py:1204] (2/4) Exclude cut with ID _58605_210_12210_1_1534723195939_3904259_458-42232-0_sp1.1 from training. Duration: 0.92725 2023-10-03 01:10:53,258 WARNING [train.py:1204] (2/4) Exclude cut with ID _14922_210_6228_1_1530707435273_3693310_13-165512-0_sp0.9 from training. Duration: 0.911125 2023-10-03 01:10:56,120 WARNING [train.py:1204] (2/4) Exclude cut with ID _58602_210_2346_1_1534903210085_6908830_792-102241-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 01:10:57,633 WARNING [train.py:1204] (2/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_588-125165-0_sp0.9 from training. Duration: 0.92225 2023-10-03 01:10:57,864 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.feed_forward3.hidden_balancer.prob, batch_count=1084826.6666666667, ans=0.125 2023-10-03 01:11:00,218 WARNING [train.py:1204] (2/4) Exclude cut with ID _54218_210_12210_1_1534413646388_4409259_239-98429-0_sp1.1 from training. Duration: 0.97275 2023-10-03 01:11:00,286 WARNING [train.py:1204] (2/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_538-32291-0_sp1.1 from training. Duration: 0.7545625 2023-10-03 01:11:01,630 WARNING [train.py:1204] (2/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_419-317062-0 from training. Duration: 0.91 2023-10-03 01:11:01,731 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9309W0202-640329 from training. Duration: 0.896 2023-10-03 01:11:01,756 WARNING [train.py:1204] (2/4) Exclude cut with ID _3231_210_2106_1_1526205864934_6784220_419-313291-0_sp1.1 from training. Duration: 0.97275 2023-10-03 01:11:05,986 WARNING [train.py:1204] (2/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_619-344499-0 from training. Duration: 0.63 2023-10-03 01:11:06,007 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_278-272612-0 from training. Duration: 0.73 2023-10-03 01:11:07,703 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_103-84010-0_sp0.9 from training. Duration: 0.7666875 2023-10-03 01:11:07,731 WARNING [train.py:1204] (2/4) Exclude cut with ID _26528_210_9070_1_1532570410842_3851310_497-97218-0_sp1.1 from training. Duration: 0.7818125 2023-10-03 01:11:09,052 WARNING [train.py:1204] (2/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_262-84613-0_sp1.1 from training. Duration: 0.936375 2023-10-03 01:11:09,197 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=1084826.6666666667, ans=0.1 2023-10-03 01:11:10,413 WARNING [train.py:1204] (2/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_387-131538-0 from training. Duration: 0.81 2023-10-03 01:11:10,445 WARNING [train.py:1204] (2/4) Exclude cut with ID _8030_210_7497_1_1528444886781_3531089_224-300489-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 01:11:10,473 WARNING [train.py:1204] (2/4) Exclude cut with ID _14349_210_8341_1_1530615710818_3442470_170-336541-0_sp0.9 from training. Duration: 0.9555625 2023-10-03 01:11:13,652 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_47069_210_15368_1_1533781267_4431320_180-31746-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 01:11:14,271 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.3.encoder.layers.0.conv_module1.whiten, num_groups=1, num_channels=512, metric=2.92 vs. limit=15.0 2023-10-03 01:11:15,064 WARNING [train.py:1204] (2/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_447-7041-0_sp1.1 from training. Duration: 0.92725 2023-10-03 01:11:16,319 WARNING [train.py:1204] (2/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_2-55283-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 01:11:16,377 WARNING [train.py:1204] (2/4) Exclude cut with ID _12555_210_8750_1_1530075594402_3780239_70-181911-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 01:11:20,512 WARNING [train.py:1204] (2/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_311-328022-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 01:11:22,558 WARNING [train.py:1204] (2/4) Exclude cut with ID _14528_210_8254_1_1530669700514_4306939_330-2987-0_sp1.1 from training. Duration: 0.92725 2023-10-03 01:11:22,607 WARNING [train.py:1204] (2/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_385-184858-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 01:11:24,096 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=1084893.3333333333, ans=0.1 2023-10-03 01:11:25,844 WARNING [train.py:1204] (2/4) Exclude cut with ID _64414_210_17301_1_1535243252048_3757759_348-333360-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 01:11:25,912 WARNING [train.py:1204] (2/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_753-214751-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 01:11:26,167 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.bypass.scale_min, batch_count=1084893.3333333333, ans=0.2 2023-10-03 01:11:27,402 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_17613_210_2151_1_1531137569_2366160_202-208013-0_sp1.1 from training. Duration: 0.92725 2023-10-03 01:11:27,411 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_43831_210_2158_1_1533725502_5872791_610-129300-0_sp1.1 from training. Duration: 0.936375 2023-10-03 01:11:27,902 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.5.encoder.layers.1.whiten, num_groups=1, num_channels=256, metric=2.04 vs. limit=12.0 2023-10-03 01:11:30,135 WARNING [train.py:1204] (2/4) Exclude cut with ID _54218_210_12210_1_1534413646388_4409259_321-23852-0_sp1.1 from training. Duration: 0.936375 2023-10-03 01:11:32,824 WARNING [train.py:1204] (2/4) Exclude cut with ID _39411_210_5986_1_1533538825030_3343649_173-238248-0 from training. Duration: 0.83 2023-10-03 01:11:32,837 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_405-339986-0_sp0.9 from training. Duration: 0.5666875 2023-10-03 01:11:34,163 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_2009_210_2071_1_1525481134_4588937_385-302100-0 from training. Duration: 0.87 2023-10-03 01:11:34,213 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_161-276996-0_sp1.1 from training. Duration: 0.863625 2023-10-03 01:11:35,565 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_177-58005-0 from training. Duration: 0.62 2023-10-03 01:11:35,723 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.conv_module2.balancer2.prob, batch_count=1084960.0, ans=0.125 2023-10-03 01:11:36,944 WARNING [train.py:1204] (2/4) Exclude cut with ID _64639_210_5816_1_1535367619437_3528000_146-343734-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 01:11:38,313 WARNING [train.py:1204] (2/4) Exclude cut with ID _3845_210_1390_1_1526731326141_3523610_72-61441-0_sp1.1 from training. Duration: 0.92725 2023-10-03 01:11:41,880 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.ff3_skip_rate, batch_count=1085026.6666666667, ans=0.0 2023-10-03 01:11:45,160 WARNING [train.py:1204] (2/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_991-125028-0_sp1.1 from training. Duration: 0.936375 2023-10-03 01:11:46,518 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7544_210_6753_1_1528591327_1471426_172-348777-0 from training. Duration: 0.66 2023-10-03 01:11:46,567 WARNING [train.py:1204] (2/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_416-257730-0_sp1.1 from training. Duration: 0.5909375 2023-10-03 01:11:47,906 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1113-7369-0_sp1.1 from training. Duration: 0.77275 2023-10-03 01:11:49,314 WARNING [train.py:1204] (2/4) Exclude cut with ID _40402_210_18219_1_1533522324115_2953150_242-76943-0_sp1.1 from training. Duration: 0.8545625 2023-10-03 01:11:52,796 WARNING [train.py:1204] (2/4) Exclude cut with ID _44718_210_5816_1_1534050051869_6469030_190-49648-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 01:11:55,371 WARNING [train.py:1204] (2/4) Exclude cut with ID _47251_210_18628_1_1533814063128_4137250_214-88076-0 from training. Duration: 0.88 2023-10-03 01:11:56,644 WARNING [train.py:1204] (2/4) Exclude cut with ID _5585_210_2667_1_1527418813324_8007340_89-332072-0_sp1.1 from training. Duration: 0.5818125 2023-10-03 01:11:56,683 WARNING [train.py:1204] (2/4) Exclude cut with ID _41817_210_18835_1_1533434477160_7423049_555-293448-0_sp1.1 from training. Duration: 0.72725 2023-10-03 01:11:58,001 INFO [train.py:1046] (2/4) Epoch 31, batch 3400, loss[loss=0.1526, simple_loss=0.2346, pruned_loss=0.03532, over 24072.00 frames. ], tot_loss[loss=0.1638, simple_loss=0.2425, pruned_loss=0.04257, over 4725192.90 frames. ], batch size: 80, lr: 3.29e-03, grad_scale: 16.0 2023-10-03 01:11:58,109 WARNING [train.py:1204] (2/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_286-331314-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 01:11:58,163 WARNING [train.py:1204] (2/4) Exclude cut with ID _13921_210_9994_1_1530838702800_3003429_46-149444-0 from training. Duration: 0.94 2023-10-03 01:11:59,640 WARNING [train.py:1204] (2/4) Exclude cut with ID _44778_210_2054_1_1533798127203_7443090_768-43595-0_sp1.1 from training. Duration: 0.936375 2023-10-03 01:12:00,922 WARNING [train.py:1204] (2/4) Exclude cut with ID _35355_210_16040_1_1533212988977_4016229_74-190076-0 from training. Duration: 0.8 2023-10-03 01:12:01,031 WARNING [train.py:1204] (2/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_1007-316380-0_sp1.1 from training. Duration: 0.963625 2023-10-03 01:12:02,351 WARNING [train.py:1204] (2/4) Exclude cut with ID _23557_210_8750_1_1531890106370_6846255_150-283437-0_sp1.1 from training. Duration: 0.963625 2023-10-03 01:12:03,153 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.1.encoder.layers.0.self_attn2.whiten, num_groups=1, num_channels=256, metric=11.35 vs. limit=22.5 2023-10-03 01:12:03,723 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_3474_210_2028_1_1526188513_4825246_145-309131-0_sp1.1 from training. Duration: 0.67275 2023-10-03 01:12:05,092 WARNING [train.py:1204] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_391-22148-0_sp1.1 from training. Duration: 0.7 2023-10-03 01:12:05,123 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_238-256668-0 from training. Duration: 0.69 2023-10-03 01:12:09,295 WARNING [train.py:1204] (2/4) Exclude cut with ID _8394_210_6702_1_1528590504316_3540189_372-198878-0 from training. Duration: 0.6 2023-10-03 01:12:09,305 WARNING [train.py:1204] (2/4) Exclude cut with ID ID0174W0006-766659 from training. Duration: 0.953 2023-10-03 01:12:09,321 WARNING [train.py:1204] (2/4) Exclude cut with ID _6274_210_2084_1_1527937583837_7494020_668-23019-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 01:12:13,912 WARNING [train.py:1204] (2/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_322-74328-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 01:12:13,918 WARNING [train.py:1204] (2/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_872-264525-0_sp1.1 from training. Duration: 0.5909375 2023-10-03 01:12:13,970 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_53378_210_17282_1_1534221071_5769509_641-286201-0_sp1.1 from training. Duration: 0.97275 2023-10-03 01:12:15,908 WARNING [train.py:1204] (2/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_199-295611-0_sp1.1 from training. Duration: 0.5 2023-10-03 01:12:21,423 WARNING [train.py:1204] (2/4) Exclude cut with ID _5250_210_5219_1_1527249561806_8692110_1270-317971-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 01:12:22,762 WARNING [train.py:1204] (2/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_784-213099-0 from training. Duration: 0.93 2023-10-03 01:12:27,985 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_115-233929-0_sp0.9 from training. Duration: 0.8 2023-10-03 01:12:29,432 WARNING [train.py:1204] (2/4) Exclude cut with ID _54871_210_22356_1_1534665798253_3856359_256-223809-0_sp1.1 from training. Duration: 0.97275 2023-10-03 01:12:30,827 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_20120_210_6753_1_1531643035_5482859_285-87781-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 01:12:32,169 WARNING [train.py:1204] (2/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_619-344499-0_sp1.1 from training. Duration: 0.57275 2023-10-03 01:12:32,461 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.bypass_mid.scale_min, batch_count=1085226.6666666667, ans=0.2 2023-10-03 01:12:37,740 WARNING [train.py:1204] (2/4) Exclude cut with ID _60229_210_27767_1_1534838406158_3655350_264-279573-0_sp0.9 from training. Duration: 0.92225 2023-10-03 01:12:40,583 WARNING [train.py:1204] (2/4) Exclude cut with ID _7719_210_3525_1_1528456153163_4296960_333-110399-0 from training. Duration: 0.96 2023-10-03 01:12:45,263 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_50423_210_13270_1_1534128598_5021734_759-90833-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 01:12:46,576 WARNING [train.py:1204] (2/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_151-254798-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 01:12:46,624 WARNING [train.py:1204] (2/4) Exclude cut with ID _3465_210_3525_1_1526175667060_3574049_450-203637-0 from training. Duration: 0.92 2023-10-03 01:12:48,431 WARNING [train.py:1204] (2/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_673-36456-0_sp1.1 from training. Duration: 0.936375 2023-10-03 01:12:48,468 WARNING [train.py:1204] (2/4) Exclude cut with ID _36538_210_9994_1_1533204020363_3795620_131-8685-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 01:12:48,529 WARNING [train.py:1204] (2/4) Exclude cut with ID _12556_210_8341_1_1530081024100_7327690_713-55626-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 01:12:49,812 WARNING [train.py:1204] (2/4) Exclude cut with ID _2322_210_2667_1_1525604548971_3656299_440-80494-0_sp0.9 from training. Duration: 0.97775 2023-10-03 01:12:51,430 WARNING [train.py:1204] (2/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_210-325081-0_sp1.1 from training. Duration: 0.97275 2023-10-03 01:12:52,140 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.3.encoder.layers.1.self_attn2.whiten, num_groups=1, num_channels=512, metric=14.59 vs. limit=22.5 2023-10-03 01:12:54,753 WARNING [train.py:1204] (2/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_375-244976-0_sp0.9 from training. Duration: 0.8333125 2023-10-03 01:12:54,759 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_15517_210_6753_1_1530952293_1664795_119-31001-0_sp1.1 from training. Duration: 0.8545625 2023-10-03 01:13:01,633 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_41452_210_7291_1_1533437687_3941281_319-42326-0_sp1.1 from training. Duration: 0.92725 2023-10-03 01:13:03,093 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_372-173012-0 from training. Duration: 0.95 2023-10-03 01:13:04,323 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.520e+02 1.916e+02 2.114e+02 2.433e+02 5.346e+02, threshold=4.228e+02, percent-clipped=1.0 2023-10-03 01:13:06,139 INFO [scaling.py:1118] (2/4) WithLoss: name=encoder.encoders.3.encoder.layers.2.self_attn_weights, loss-sum=0.000e+00 2023-10-03 01:13:06,183 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.nonlin_attention.balancer.prob, batch_count=1085360.0, ans=0.125 2023-10-03 01:13:07,342 WARNING [train.py:1204] (2/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_41-31157-0_sp1.1 from training. Duration: 0.5181875 2023-10-03 01:13:11,335 INFO [train.py:1046] (2/4) Epoch 31, batch 3450, loss[loss=0.1833, simple_loss=0.2483, pruned_loss=0.05917, over 20066.00 frames. ], tot_loss[loss=0.1645, simple_loss=0.2428, pruned_loss=0.04313, over 4712492.34 frames. ], batch size: 389, lr: 3.29e-03, grad_scale: 16.0 2023-10-03 01:13:11,456 WARNING [train.py:1204] (2/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_91-212486-0 from training. Duration: 0.68 2023-10-03 01:13:14,739 WARNING [train.py:1204] (2/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_1267-272693-0 from training. Duration: 0.73 2023-10-03 01:13:16,045 WARNING [train.py:1204] (2/4) Exclude cut with ID _13938_210_9988_1_1530518718978_3693960_152-67046-0_sp1.1 from training. Duration: 0.936375 2023-10-03 01:13:18,019 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_4528_210_3273_1_1527076654_3276777_342-168068-0_sp1.1 from training. Duration: 0.7909375 2023-10-03 01:13:18,028 WARNING [train.py:1204] (2/4) Exclude cut with ID _7606_210_4915_1_1528894253277_5080690_127-177761-0 from training. Duration: 0.64 2023-10-03 01:13:19,361 WARNING [train.py:1204] (2/4) Exclude cut with ID _45329_210_7005_1_1534157951211_6774339_335-137972-0_sp1.1 from training. Duration: 0.92725 2023-10-03 01:13:22,724 WARNING [train.py:1204] (2/4) Exclude cut with ID _5166_210_2084_1_1527073197160_4087993_331-117829-0_sp1.1 from training. Duration: 0.563625 2023-10-03 01:13:26,133 WARNING [train.py:1204] (2/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_125-125818-0_sp1.1 from training. Duration: 0.87275 2023-10-03 01:13:27,448 WARNING [train.py:1204] (2/4) Exclude cut with ID _6789_210_1534_1_1527854471671_2870519_48-294897-0_sp1.1 from training. Duration: 0.963625 2023-10-03 01:13:27,517 WARNING [train.py:1204] (2/4) Exclude cut with ID _6694_210_4599_1_1527852637490_3965529_116-109398-0_sp1.1 from training. Duration: 0.663625 2023-10-03 01:13:27,526 WARNING [train.py:1204] (2/4) Exclude cut with ID _52892_210_6721_1_1534378941259_4124129_222-172685-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 01:13:29,204 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.balancer1.prob, batch_count=1085493.3333333333, ans=0.125 2023-10-03 01:13:30,352 WARNING [train.py:1204] (2/4) Exclude cut with ID _50655_210_18628_1_1534229942193_3772154_320-132275-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 01:13:33,320 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.conv_module2.balancer1.prob, batch_count=1085493.3333333333, ans=0.125 2023-10-03 01:13:38,367 WARNING [train.py:1204] (2/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_76-311963-0 from training. Duration: 0.7 2023-10-03 01:13:42,675 WARNING [train.py:1204] (2/4) Exclude cut with ID _6274_210_2084_1_1527937583837_7494020_32-205288-0 from training. Duration: 0.78 2023-10-03 01:13:42,699 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_86-16881-0_sp0.9 from training. Duration: 0.6444375 2023-10-03 01:13:42,734 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_271-297505-0_sp1.1 from training. Duration: 0.9 2023-10-03 01:13:44,171 WARNING [train.py:1204] (2/4) Exclude cut with ID _57572_210_5517_1_1534921219624_3809484_187-295293-0_sp1.1 from training. Duration: 0.963625 2023-10-03 01:13:48,976 WARNING [train.py:1204] (2/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_563-329943-0 from training. Duration: 0.99 2023-10-03 01:13:49,046 WARNING [train.py:1204] (2/4) Exclude cut with ID _7160_210_1876_1_1527940923231_3703799_302-150728-0_sp0.9 from training. Duration: 0.8666875 2023-10-03 01:13:55,399 WARNING [train.py:1204] (2/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_926-7526-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 01:13:55,421 WARNING [train.py:1204] (2/4) Exclude cut with ID _62574_210_2655_1_1535180407058_3933249_467-22348-0_sp1.1 from training. Duration: 0.936375 2023-10-03 01:13:56,827 WARNING [train.py:1204] (2/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_280-161633-0_sp0.9 from training. Duration: 0.811125 2023-10-03 01:13:56,941 WARNING [train.py:1204] (2/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_385-193052-0_sp1.1 from training. Duration: 0.7818125 2023-10-03 01:13:58,339 WARNING [train.py:1204] (2/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_444-28041-0 from training. Duration: 0.85 2023-10-03 01:13:58,351 WARNING [train.py:1204] (2/4) Exclude cut with ID _66070_210_7640_1_1535441393096_7805310_341-249237-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 01:13:58,637 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.conv_module2.balancer1.prob, batch_count=1085626.6666666667, ans=0.125 2023-10-03 01:13:59,727 WARNING [train.py:1204] (2/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_425-269826-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 01:14:01,178 WARNING [train.py:1204] (2/4) Exclude cut with ID _53950_210_5816_1_1534417264919_3862951_124-185597-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 01:14:04,053 WARNING [train.py:1204] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_380-204756-0 from training. Duration: 0.86 2023-10-03 01:14:08,114 WARNING [train.py:1204] (2/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_282-257791-0_sp0.9 from training. Duration: 0.9555625 2023-10-03 01:14:11,605 WARNING [train.py:1204] (2/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_372-131163-0_sp1.1 from training. Duration: 0.8545625 2023-10-03 01:14:12,889 WARNING [train.py:1204] (2/4) Exclude cut with ID _28317_210_12742_1_1532480302381_8510325_447-233513-0_sp1.1 from training. Duration: 0.963625 2023-10-03 01:14:16,014 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_54-118333-0_sp1.1 from training. Duration: 0.97275 2023-10-03 01:14:20,131 WARNING [train.py:1204] (2/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_388-230239-0_sp1.1 from training. Duration: 0.963625 2023-10-03 01:14:21,970 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_40047_210_2290_1_1533290519_2374311_141-54639-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 01:14:22,028 WARNING [train.py:1204] (2/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_91-267696-0_sp0.9 from training. Duration: 0.9666875 2023-10-03 01:14:22,069 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_14056_210_4163_1_1530604483_7572827_532-137847-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 01:14:25,422 WARNING [train.py:1204] (2/4) Exclude cut with ID _8064_210_5399_1_1528372299291_4282973_155-283231-0_sp1.1 from training. Duration: 0.97275 2023-10-03 01:14:26,723 INFO [train.py:1046] (2/4) Epoch 31, batch 3500, loss[loss=0.1606, simple_loss=0.2472, pruned_loss=0.03704, over 24672.00 frames. ], tot_loss[loss=0.1633, simple_loss=0.2413, pruned_loss=0.04261, over 4720033.37 frames. ], batch size: 73, lr: 3.29e-03, grad_scale: 8.0 2023-10-03 01:14:28,287 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_2245_210_2715_1_1525511548_3540254_310-52483-0_sp1.1 from training. Duration: 0.8 2023-10-03 01:14:28,534 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.feed_forward3.hidden_balancer.prob, batch_count=1085760.0, ans=0.125 2023-10-03 01:14:29,629 WARNING [train.py:1204] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_185-174805-0 from training. Duration: 0.68 2023-10-03 01:14:31,071 WARNING [train.py:1204] (2/4) Exclude cut with ID _15012_210_1968_1_1530856820429_5095350_662-210469-0_sp1.1 from training. Duration: 0.5181875 2023-10-03 01:14:35,124 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_426-313557-0_sp0.9 from training. Duration: 0.588875 2023-10-03 01:14:35,301 WARNING [train.py:1204] (2/4) Exclude cut with ID _42326_210_7770_1_1533814282531_3707269_407-204343-0_sp1.1 from training. Duration: 0.97275 2023-10-03 01:14:35,308 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_258-310570-0 from training. Duration: 0.82 2023-10-03 01:14:42,539 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_4737_210_2040_1_1526998689_3293008_378-178650-0_sp1.1 from training. Duration: 0.863625 2023-10-03 01:14:43,880 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_47896_210_20508_1_1534492518_5958868_432-327451-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 01:14:43,969 WARNING [train.py:1204] (2/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_188-114464-0_sp1.1 from training. Duration: 0.7454375 2023-10-03 01:14:43,976 WARNING [train.py:1204] (2/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_798-106179-0_sp1.1 from training. Duration: 0.7545625 2023-10-03 01:14:45,701 WARNING [train.py:1204] (2/4) Exclude cut with ID _14368_210_4220_1_1530532714870_3595529_143-117421-0_sp1.1 from training. Duration: 0.52725 2023-10-03 01:14:45,738 WARNING [train.py:1204] (2/4) Exclude cut with ID _67499_210_17282_1_1535509805220_3692430_361-78786-0_sp1.1 from training. Duration: 0.963625 2023-10-03 01:14:45,776 WARNING [train.py:1204] (2/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_712-342563-0_sp1.1 from training. Duration: 0.8818125 2023-10-03 01:14:47,130 WARNING [train.py:1204] (2/4) Exclude cut with ID _9235_210_5219_1_1528891154253_8046120_387-159008-0 from training. Duration: 0.86 2023-10-03 01:14:48,626 WARNING [train.py:1204] (2/4) Exclude cut with ID _33640_210_2655_1_1533117605479_4855469_959-18820-0_sp1.1 from training. Duration: 0.963625 2023-10-03 01:14:48,667 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_12868_210_6753_1_1530702479_3610564_133-564-0_sp0.9 from training. Duration: 0.87775 2023-10-03 01:14:50,637 WARNING [train.py:1204] (2/4) Exclude cut with ID _23514_210_1389_1_1531832139785_3031439_12-29287-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 01:14:55,095 WARNING [train.py:1204] (2/4) Exclude cut with ID _23514_210_1389_1_1531832139785_3031439_489-185052-0_sp1.1 from training. Duration: 0.963625 2023-10-03 01:14:56,407 WARNING [train.py:1204] (2/4) Exclude cut with ID _5444_210_2151_1_1527418808464_5312210_634-194174-0 from training. Duration: 0.97 2023-10-03 01:14:56,434 WARNING [train.py:1204] (2/4) Exclude cut with ID _6275_210_2084_1_1528110062213_7669049_91-26919-0_sp1.1 from training. Duration: 0.8818125 2023-10-03 01:14:56,640 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.ff2_skip_rate, batch_count=1085893.3333333333, ans=0.0 2023-10-03 01:14:58,366 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.5.encoder.layers.1.self_attn1.whiten, num_groups=1, num_channels=256, metric=12.14 vs. limit=22.5 2023-10-03 01:14:59,276 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_37351_210_7925_1_1533012931_3812113_516-82303-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 01:14:59,475 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.balancer1.prob, batch_count=1085893.3333333333, ans=0.125 2023-10-03 01:15:00,774 WARNING [train.py:1204] (2/4) Exclude cut with ID _41314_210_12651_1_1533459476543_3766460_80-74184-0_sp0.9 from training. Duration: 0.92225 2023-10-03 01:15:02,014 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_9460_210_4211_1_1528954587_10049667_495-202963-0_sp1.1 from training. Duration: 0.963625 2023-10-03 01:15:03,494 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.ff2_skip_rate, batch_count=1085893.3333333333, ans=0.0 2023-10-03 01:15:04,631 WARNING [train.py:1204] (2/4) Exclude cut with ID _14632_210_9988_1_1530691279392_3923200_332-105904-0_sp1.1 from training. Duration: 0.8181875 2023-10-03 01:15:04,654 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_40764_210_1925_1_1533882359_3752345_493-167886-0_sp1.1 from training. Duration: 0.92725 2023-10-03 01:15:06,080 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_196-289637-0 from training. Duration: 0.75 2023-10-03 01:15:07,403 WARNING [train.py:1204] (2/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_26-175553-0 from training. Duration: 0.61 2023-10-03 01:15:07,449 WARNING [train.py:1204] (2/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_890-5117-0 from training. Duration: 0.78 2023-10-03 01:15:07,504 WARNING [train.py:1204] (2/4) Exclude cut with ID _41314_210_12651_1_1533459476543_3766460_206-306860-0_sp1.1 from training. Duration: 0.92725 2023-10-03 01:15:08,937 WARNING [train.py:1204] (2/4) Exclude cut with ID _25882_210_3036_1_1532775665815_6479510_570-79824-0_sp1.1 from training. Duration: 0.963625 2023-10-03 01:15:10,659 WARNING [train.py:1204] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_731-2428-0_sp1.1 from training. Duration: 0.7545625 2023-10-03 01:15:10,679 WARNING [train.py:1204] (2/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_138-154812-0_sp0.9 from training. Duration: 0.8666875 2023-10-03 01:15:13,389 WARNING [train.py:1204] (2/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_178-39722-0_sp0.9 from training. Duration: 0.5444375 2023-10-03 01:15:14,750 WARNING [train.py:1204] (2/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_1285-256622-0_sp0.9 from training. Duration: 0.9333125 2023-10-03 01:15:18,858 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_41428_210_2161_1_1533774184_3992532_65-271323-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 01:15:20,136 WARNING [train.py:1204] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_308-161067-0 from training. Duration: 0.66 2023-10-03 01:15:20,141 WARNING [train.py:1204] (2/4) Exclude cut with ID _3067_210_2667_1_1526212974640_4503168_459-173867-0 from training. Duration: 0.88 2023-10-03 01:15:20,143 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_2221_210_1988_1_1525597051_4935541_415-296882-0_sp1.1 from training. Duration: 0.7 2023-10-03 01:15:24,143 WARNING [train.py:1204] (2/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_354-192266-0_sp1.1 from training. Duration: 0.936375 2023-10-03 01:15:26,103 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_258-310570-0_sp0.9 from training. Duration: 0.911125 2023-10-03 01:15:27,568 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_548-141039-0_sp1.1 from training. Duration: 0.963625 2023-10-03 01:15:30,305 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_15101_210_7927_1_1530786415_5033274_320-327527-0 from training. Duration: 0.96 2023-10-03 01:15:30,360 WARNING [train.py:1204] (2/4) Exclude cut with ID _2895_210_2477_1_1525956785794_4063750_289-11475-0_sp0.9 from training. Duration: 0.911125 2023-10-03 01:15:31,725 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7543_210_6753_1_1528543318_4552_1-85072-0_sp1.1 from training. Duration: 0.7545625 2023-10-03 01:15:33,036 WARNING [train.py:1204] (2/4) Exclude cut with ID _64639_210_5816_1_1535367619437_3528000_461-329156-0 from training. Duration: 0.96 2023-10-03 01:15:34,359 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.601e+02 1.859e+02 2.081e+02 2.339e+02 4.872e+02, threshold=4.163e+02, percent-clipped=1.0 2023-10-03 01:15:34,475 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_211-339622-0 from training. Duration: 0.45 2023-10-03 01:15:35,870 WARNING [train.py:1204] (2/4) Exclude cut with ID _20455_210_5219_1_1531565996775_8098100_402-112739-0_sp1.1 from training. Duration: 0.963625 2023-10-03 01:15:37,267 WARNING [train.py:1204] (2/4) Exclude cut with ID _53489_210_6228_1_1534298357169_3835970_256-115565-0_sp1.1 from training. Duration: 0.936375 2023-10-03 01:15:37,282 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_26326_210_7925_1_1533002493_3592406_360-305260-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 01:15:38,623 WARNING [train.py:1204] (2/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_18-204383-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 01:15:39,947 INFO [train.py:1046] (2/4) Epoch 31, batch 3550, loss[loss=0.1624, simple_loss=0.228, pruned_loss=0.04837, over 22796.00 frames. ], tot_loss[loss=0.1618, simple_loss=0.2392, pruned_loss=0.04223, over 4699293.87 frames. ], batch size: 322, lr: 3.29e-03, grad_scale: 8.0 2023-10-03 01:15:41,467 WARNING [train.py:1204] (2/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_306-232480-0_sp1.1 from training. Duration: 0.87275 2023-10-03 01:15:50,819 WARNING [train.py:1204] (2/4) Exclude cut with ID _41161_210_20034_1_1533431249657_3555314_116-299186-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 01:15:52,770 WARNING [train.py:1204] (2/4) Exclude cut with ID _13913_210_9994_1_1530579615187_3383897_473-277682-0_sp1.1 from training. Duration: 0.3545625 2023-10-03 01:15:55,207 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.0.layers.0.feed_forward1.out_whiten, num_groups=1, num_channels=192, metric=9.51 vs. limit=15.0 2023-10-03 01:15:55,628 WARNING [train.py:1204] (2/4) Exclude cut with ID _58895_210_8750_1_1535184081320_3649089_49-225702-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 01:15:57,312 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_3080_210_2071_1_1526086102_4365166_312-8726-0_sp1.1 from training. Duration: 0.82725 2023-10-03 01:15:58,909 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.balancer_ff3.min_abs, batch_count=1086160.0, ans=0.2 2023-10-03 01:15:59,975 WARNING [train.py:1204] (2/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_489-190175-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 01:16:00,040 WARNING [train.py:1204] (2/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_345-95717-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 01:16:00,051 WARNING [train.py:1204] (2/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_85-138257-0_sp1.1 from training. Duration: 0.6454375 2023-10-03 01:16:01,575 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.conv_skip_rate, batch_count=1086160.0, ans=0.0 2023-10-03 01:16:03,322 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.3.encoder.layers.2.self_attn_weights.whiten_keys, num_groups=8, num_channels=256, metric=3.80 vs. limit=6.0 2023-10-03 01:16:03,977 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7046_210_6753_1_1527895337_6920917_783-278109-0_sp1.1 from training. Duration: 0.7 2023-10-03 01:16:05,326 WARNING [train.py:1204] (2/4) Exclude cut with ID _3465_210_3525_1_1526175667060_3574049_450-203637-0_sp1.1 from training. Duration: 0.836375 2023-10-03 01:16:05,357 WARNING [train.py:1204] (2/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_416-132893-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 01:16:05,372 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_2655_210_2087_1_1525688959_3897400_437-337690-0_sp0.9 from training. Duration: 0.7 2023-10-03 01:16:05,452 WARNING [train.py:1204] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_700-183549-0_sp1.1 from training. Duration: 0.6181875 2023-10-03 01:16:10,947 WARNING [train.py:1204] (2/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_191-59784-0_sp1.1 from training. Duration: 0.8 2023-10-03 01:16:10,965 WARNING [train.py:1204] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_218-295097-0_sp1.1 from training. Duration: 0.7 2023-10-03 01:16:12,942 WARNING [train.py:1204] (2/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_310-109461-0_sp1.1 from training. Duration: 0.72725 2023-10-03 01:16:12,946 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_33348_210_5632_1_1533086912_3324030_131-321149-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 01:16:12,996 WARNING [train.py:1204] (2/4) Exclude cut with ID _64135_210_2253_1_1535677267876_7608972_133-188266-0_sp1.1 from training. Duration: 0.736375 2023-10-03 01:16:13,015 WARNING [train.py:1204] (2/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_505-205270-0 from training. Duration: 0.38 2023-10-03 01:16:13,032 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_67223_210_4238_1_1535612120_2537431_102-88248-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 01:16:15,721 WARNING [train.py:1204] (2/4) Exclude cut with ID _54871_210_22356_1_1534665798253_3856359_60-235433-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 01:16:17,216 WARNING [train.py:1204] (2/4) Exclude cut with ID _5189_210_4557_1_1527248319428_3769229_199-53526-0_sp1.1 from training. Duration: 0.3818125 2023-10-03 01:16:21,507 WARNING [train.py:1204] (2/4) Exclude cut with ID _45329_210_7005_1_1534157951211_6774339_439-90979-0_sp1.1 from training. Duration: 0.963625 2023-10-03 01:16:21,571 WARNING [train.py:1204] (2/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_1162-132880-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 01:16:23,427 WARNING [train.py:1204] (2/4) Exclude cut with ID _56423_210_5854_1_1534896061049_3165969_220-22317-0_sp1.1 from training. Duration: 0.963625 2023-10-03 01:16:24,908 WARNING [train.py:1204] (2/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_909-64151-0 from training. Duration: 0.94 2023-10-03 01:16:26,402 WARNING [train.py:1204] (2/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_465-156500-0_sp0.9 from training. Duration: 0.8 2023-10-03 01:16:28,388 WARNING [train.py:1204] (2/4) Exclude cut with ID _44304_210_15374_1_1533639352765_4613129_302-22084-0 from training. Duration: 0.86 2023-10-03 01:16:28,436 WARNING [train.py:1204] (2/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_663-166973-0_sp1.1 from training. Duration: 0.72725 2023-10-03 01:16:29,761 WARNING [train.py:1204] (2/4) Exclude cut with ID _45329_210_7005_1_1534157951211_6774339_503-287883-0_sp0.9 from training. Duration: 0.97775 2023-10-03 01:16:29,784 WARNING [train.py:1204] (2/4) Exclude cut with ID _60057_210_10640_1_1534903065793_3333150_362-230377-0_sp0.9 from training. Duration: 0.9666875 2023-10-03 01:16:31,414 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.feed_forward1.out_proj.dropout_p, batch_count=1086293.3333333333, ans=0.1 2023-10-03 01:16:32,572 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_4183_210_2087_1_1526729100_1869838_143-280962-0 from training. Duration: 0.56 2023-10-03 01:16:33,927 WARNING [train.py:1204] (2/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_281-1313-0_sp1.1 from training. Duration: 0.936375 2023-10-03 01:16:39,347 WARNING [train.py:1204] (2/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_64-215118-0_sp1.1 from training. Duration: 0.936375 2023-10-03 01:16:39,401 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_2822_210_2715_1_1526112586_1999587_165-36656-0 from training. Duration: 0.89 2023-10-03 01:16:40,846 WARNING [train.py:1204] (2/4) Exclude cut with ID _22961_210_8254_1_1531976366672_4278180_402-208830-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 01:16:45,466 WARNING [train.py:1204] (2/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_98-193494-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 01:16:45,558 WARNING [train.py:1204] (2/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_383-285196-0 from training. Duration: 0.97 2023-10-03 01:16:50,744 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.2.encoder.layers.1.feed_forward2.out_whiten, num_groups=1, num_channels=384, metric=4.15 vs. limit=15.0 2023-10-03 01:16:52,016 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.3.encoder.layers.3.feed_forward1.out_whiten, num_groups=1, num_channels=512, metric=8.67 vs. limit=15.0 2023-10-03 01:16:52,775 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6767_210_3273_1_1527855783_1718932_142-127132-0 from training. Duration: 0.95 2023-10-03 01:16:52,813 WARNING [train.py:1204] (2/4) Exclude cut with ID _39867_210_9826_1_1533553222030_9344259_1018-339635-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 01:16:52,876 WARNING [train.py:1204] (2/4) Exclude cut with ID _14603_210_4220_1_1530615688441_3097319_274-274798-0_sp1.1 from training. Duration: 0.8909375 2023-10-03 01:16:54,680 INFO [train.py:1046] (2/4) Epoch 31, batch 3600, loss[loss=0.1666, simple_loss=0.24, pruned_loss=0.0466, over 23814.00 frames. ], tot_loss[loss=0.1615, simple_loss=0.2392, pruned_loss=0.0419, over 4722646.80 frames. ], batch size: 212, lr: 3.28e-03, grad_scale: 16.0 2023-10-03 01:16:56,175 WARNING [train.py:1204] (2/4) Exclude cut with ID _2334_210_2667_1_1525608238880_4705129_376-263393-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 01:16:56,210 WARNING [train.py:1204] (2/4) Exclude cut with ID _29303_210_2575_1_1532392237303_7410649_363-344675-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 01:16:56,296 WARNING [train.py:1204] (2/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_58-68956-0_sp1.1 from training. Duration: 0.7545625 2023-10-03 01:17:00,369 WARNING [train.py:1204] (2/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_709-311571-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 01:17:01,723 WARNING [train.py:1204] (2/4) Exclude cut with ID _46067_210_4220_1_1534125571066_3307450_136-257407-0_sp1.1 from training. Duration: 0.963625 2023-10-03 01:17:01,795 WARNING [train.py:1204] (2/4) Exclude cut with ID _6493_210_1747_1_1528459210424_4012470_324-323854-0_sp1.1 from training. Duration: 0.836375 2023-10-03 01:17:03,261 WARNING [train.py:1204] (2/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_234-260682-0_sp1.1 from training. Duration: 0.763625 2023-10-03 01:17:03,319 WARNING [train.py:1204] (2/4) Exclude cut with ID _36634_210_1794_1_1533296352450_1399884_55-119451-0_sp1.1 from training. Duration: 0.963625 2023-10-03 01:17:03,322 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_40047_210_2290_1_1533290519_2374311_195-165693-0 from training. Duration: 0.89 2023-10-03 01:17:04,814 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.0.conv_module1.balancer2.min_abs, batch_count=1086426.6666666667, ans=0.5 2023-10-03 01:17:06,149 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_5522_210_3273_1_1527249606_3909096_366-172508-0_sp0.9 from training. Duration: 0.7333125 2023-10-03 01:17:07,527 WARNING [train.py:1204] (2/4) Exclude cut with ID _21878_210_3525_1_1531738772002_7264330_187-10504-0_sp1.1 from training. Duration: 0.963625 2023-10-03 01:17:10,282 WARNING [train.py:1204] (2/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_282-257791-0_sp1.1 from training. Duration: 0.7818125 2023-10-03 01:17:11,975 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.attention_skip_rate, batch_count=1086493.3333333333, ans=0.0 2023-10-03 01:17:13,647 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_43831_210_2158_1_1533725502_5872791_467-288776-0_sp1.1 from training. Duration: 0.92725 2023-10-03 01:17:13,732 WARNING [train.py:1204] (2/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_138-154812-0_sp1.1 from training. Duration: 0.7090625 2023-10-03 01:17:15,128 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_1154-185930-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 01:17:15,153 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_2655_210_2087_1_1525688959_3897400_52-279899-0 from training. Duration: 0.69 2023-10-03 01:17:16,461 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_141-89113-0_sp1.1 from training. Duration: 0.7818125 2023-10-03 01:17:19,147 WARNING [train.py:1204] (2/4) Exclude cut with ID _55134_210_21982_1_1534590028377_3882170_297-47310-0_sp1.1 from training. Duration: 0.963625 2023-10-03 01:17:20,557 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_486-151131-0_sp1.1 from training. Duration: 0.77275 2023-10-03 01:17:22,422 WARNING [train.py:1204] (2/4) Exclude cut with ID _44778_210_2054_1_1533798127203_7443090_21-232980-0_sp1.1 from training. Duration: 0.936375 2023-10-03 01:17:24,384 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6165_210_3528_1_1527590982_4379841_338-106380-0_sp1.1 from training. Duration: 0.92725 2023-10-03 01:17:25,821 WARNING [train.py:1204] (2/4) Exclude cut with ID _55162_210_17071_1_1534408248583_3753480_355-257218-0_sp1.1 from training. Duration: 0.97275 2023-10-03 01:17:27,194 WARNING [train.py:1204] (2/4) Exclude cut with ID _2895_210_2477_1_1525956785794_4063750_289-11475-0 from training. Duration: 0.82 2023-10-03 01:17:32,886 WARNING [train.py:1204] (2/4) Exclude cut with ID _16756_210_4915_1_1531047803763_7104259_839-183431-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 01:17:35,435 WARNING [train.py:1204] (2/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_427-208901-0_sp0.9 from training. Duration: 0.7555625 2023-10-03 01:17:35,592 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.1.self_attn_weights.pos_emb_skip_rate, batch_count=1086560.0, ans=0.0 2023-10-03 01:17:36,827 WARNING [train.py:1204] (2/4) Exclude cut with ID _36512_210_6987_1_1533033143998_3126069_181-306500-0 from training. Duration: 0.95 2023-10-03 01:17:38,938 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.3.encoder.layers.0.feed_forward3.out_whiten, num_groups=1, num_channels=512, metric=11.20 vs. limit=15.0 2023-10-03 01:17:41,072 WARNING [train.py:1204] (2/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_28-70987-0_sp1.1 from training. Duration: 0.7909375 2023-10-03 01:17:45,396 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.1.nonlin_attention.balancer.prob, batch_count=1086626.6666666667, ans=0.125 2023-10-03 01:17:46,949 WARNING [train.py:1204] (2/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_590-58996-0_sp1.1 from training. Duration: 0.936375 2023-10-03 01:17:49,759 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_48730_210_5399_1_1533885886_2212769_108-47561-0_sp1.1 from training. Duration: 0.936375 2023-10-03 01:17:55,827 WARNING [train.py:1204] (2/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_401-158239-0_sp0.9 from training. Duration: 0.788875 2023-10-03 01:17:55,840 WARNING [train.py:1204] (2/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_278-154492-0_sp1.1 from training. Duration: 0.6909375 2023-10-03 01:17:57,514 WARNING [train.py:1204] (2/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_137-116442-0 from training. Duration: 0.78 2023-10-03 01:17:57,678 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.1.conv_skip_rate, batch_count=1086693.3333333333, ans=0.0 2023-10-03 01:17:59,512 WARNING [train.py:1204] (2/4) Exclude cut with ID _3541_210_2477_1_1526475408709_3263973_189-317158-0 from training. Duration: 0.9 2023-10-03 01:18:00,954 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_47896_210_20508_1_1534492518_5958868_558-314572-0 from training. Duration: 0.97 2023-10-03 01:18:02,194 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.587e+02 1.943e+02 2.241e+02 2.571e+02 3.664e+02, threshold=4.481e+02, percent-clipped=0.0 2023-10-03 01:18:03,618 WARNING [train.py:1204] (2/4) Exclude cut with ID _66070_210_7640_1_1535441393096_7805310_290-40284-0_sp1.1 from training. Duration: 0.97275 2023-10-03 01:18:03,662 WARNING [train.py:1204] (2/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_110-224688-0_sp1.1 from training. Duration: 0.8818125 2023-10-03 01:18:05,076 WARNING [train.py:1204] (2/4) Exclude cut with ID _7719_210_3525_1_1528456153163_4296960_88-250259-0 from training. Duration: 0.84 2023-10-03 01:18:05,119 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8453_210_4211_1_1528502940_8562730_1157-114102-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 01:18:05,150 WARNING [train.py:1204] (2/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_317-175062-0_sp1.1 from training. Duration: 0.7454375 2023-10-03 01:18:05,155 WARNING [train.py:1204] (2/4) Exclude cut with ID _29857_210_20034_1_1532395886753_3798160_254-347254-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 01:18:05,224 WARNING [train.py:1204] (2/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_28-70987-0 from training. Duration: 0.87 2023-10-03 01:18:06,726 WARNING [train.py:1204] (2/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_321-335475-0 from training. Duration: 0.45 2023-10-03 01:18:08,166 INFO [train.py:1046] (2/4) Epoch 31, batch 3650, loss[loss=0.1561, simple_loss=0.231, pruned_loss=0.04055, over 24445.00 frames. ], tot_loss[loss=0.1623, simple_loss=0.2402, pruned_loss=0.04218, over 4721828.36 frames. ], batch size: 58, lr: 3.28e-03, grad_scale: 8.0 2023-10-03 01:18:09,587 WARNING [train.py:1204] (2/4) Exclude cut with ID _40901_210_5665_1_1533363789958_3856130_356-107330-0_sp1.1 from training. Duration: 0.936375 2023-10-03 01:18:10,941 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_3780_210_2087_1_1526467389_2145572_176-107140-0 from training. Duration: 0.69 2023-10-03 01:18:15,105 WARNING [train.py:1204] (2/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_80-135200-0 from training. Duration: 0.79 2023-10-03 01:18:16,533 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_33348_210_5632_1_1533086912_3324030_85-28583-0_sp0.9 from training. Duration: 0.911125 2023-10-03 01:18:21,268 WARNING [train.py:1204] (2/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_29-293896-0 from training. Duration: 0.61 2023-10-03 01:18:22,669 WARNING [train.py:1204] (2/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_194-252800-0 from training. Duration: 0.97 2023-10-03 01:18:26,757 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_360-52446-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 01:18:26,759 WARNING [train.py:1204] (2/4) Exclude cut with ID _9389_210_1664_1_1529217337301_5289269_289-213417-0_sp0.9 from training. Duration: 0.988875 2023-10-03 01:18:26,806 WARNING [train.py:1204] (2/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_143-86344-0_sp0.9 from training. Duration: 0.8555625 2023-10-03 01:18:32,304 WARNING [train.py:1204] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_520-126729-0_sp0.9 from training. Duration: 0.77775 2023-10-03 01:18:32,322 WARNING [train.py:1204] (2/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_76-308663-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 01:18:32,382 WARNING [train.py:1204] (2/4) Exclude cut with ID _8021_210_7994_1_1528376451358_3653069_100-140231-0 from training. Duration: 0.67 2023-10-03 01:18:33,669 WARNING [train.py:1204] (2/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_387-131538-0_sp0.9 from training. Duration: 0.9 2023-10-03 01:18:33,703 WARNING [train.py:1204] (2/4) Exclude cut with ID _6275_210_2084_1_1528110062213_7669049_365-182054-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 01:18:34,974 WARNING [train.py:1204] (2/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_345-340690-0 from training. Duration: 0.93 2023-10-03 01:18:35,067 WARNING [train.py:1204] (2/4) Exclude cut with ID _5409_210_1388_1_1527292850189_5865191_178-127970-0_sp0.9 from training. Duration: 0.6333125 2023-10-03 01:18:36,327 WARNING [train.py:1204] (2/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_352-12734-0_sp1.1 from training. Duration: 0.92725 2023-10-03 01:18:36,330 WARNING [train.py:1204] (2/4) Exclude cut with ID _65134_210_14763_1_1535335393985_3505368_121-129373-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 01:18:37,788 WARNING [train.py:1204] (2/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_557-324258-0_sp1.1 from training. Duration: 0.736375 2023-10-03 01:18:40,427 WARNING [train.py:1204] (2/4) Exclude cut with ID _48678_210_5663_1_1533988830947_3769199_306-255886-0 from training. Duration: 0.77 2023-10-03 01:18:41,891 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7544_210_6753_1_1528587000_691468_87-178599-0 from training. Duration: 0.87 2023-10-03 01:18:43,225 WARNING [train.py:1204] (2/4) Exclude cut with ID _2941_210_2577_1_1526199673150_2489759_208-236402-0_sp1.1 from training. Duration: 0.963625 2023-10-03 01:18:44,593 WARNING [train.py:1204] (2/4) Exclude cut with ID _38670_210_15379_1_1533448810659_4391059_65-169397-0 from training. Duration: 0.97 2023-10-03 01:18:45,995 WARNING [train.py:1204] (2/4) Exclude cut with ID _30304_210_6319_1_1532476827261_7582060_306-328175-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 01:18:46,013 WARNING [train.py:1204] (2/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_212-149947-0_sp1.1 from training. Duration: 0.663625 2023-10-03 01:18:46,268 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.conv_module1.balancer2.prob, batch_count=1086893.3333333333, ans=0.125 2023-10-03 01:18:50,368 WARNING [train.py:1204] (2/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_321-22821-0_sp1.1 from training. Duration: 0.8090625 2023-10-03 01:18:50,606 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.bypass.scale_min, batch_count=1086960.0, ans=0.2 2023-10-03 01:18:52,208 WARNING [train.py:1204] (2/4) Exclude cut with ID _44010_210_4525_1_1533884262964_4013620_483-69110-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 01:18:52,216 WARNING [train.py:1204] (2/4) Exclude cut with ID _14922_210_6228_1_1530707435273_3693310_38-187851-0_sp1.1 from training. Duration: 0.563625 2023-10-03 01:18:53,605 WARNING [train.py:1204] (2/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_129-337802-0_sp0.9 from training. Duration: 0.8 2023-10-03 01:18:54,927 WARNING [train.py:1204] (2/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_268-87909-0_sp1.1 from training. Duration: 0.9 2023-10-03 01:18:57,637 WARNING [train.py:1204] (2/4) Exclude cut with ID _21878_210_3525_1_1531738772002_7264330_559-184862-0_sp1.1 from training. Duration: 0.8545625 2023-10-03 01:19:01,252 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_46032_210_14656_1_1533781412_4507969_237-120766-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 01:19:03,227 WARNING [train.py:1204] (2/4) Exclude cut with ID _28427_210_4220_1_1532581240967_3061069_330-170906-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 01:19:03,233 WARNING [train.py:1204] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_401-51578-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 01:19:03,470 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=1086960.0, ans=0.1 2023-10-03 01:19:05,906 WARNING [train.py:1204] (2/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_209-205365-0_sp1.1 from training. Duration: 0.7181875 2023-10-03 01:19:05,997 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_23801_210_16223_1_1532066392_3294657_57-242170-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 01:19:07,353 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_127-37371-0_sp1.1 from training. Duration: 0.936375 2023-10-03 01:19:11,584 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9171W0019-578905 from training. Duration: 0.896 2023-10-03 01:19:13,123 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_24605_210_1794_1_1532047863_4149755_349-49056-0_sp1.1 from training. Duration: 0.92725 2023-10-03 01:19:14,497 WARNING [train.py:1204] (2/4) Exclude cut with ID _12302_210_5219_1_1529924446466_8107280_897-21237-0_sp1.1 from training. Duration: 0.936375 2023-10-03 01:19:14,609 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_9460_210_4211_1_1528954587_10049667_480-260269-0_sp0.9 from training. Duration: 0.62225 2023-10-03 01:19:15,970 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_348-152548-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 01:19:17,260 WARNING [train.py:1204] (2/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_301-330455-0_sp0.9 from training. Duration: 0.888875 2023-10-03 01:19:18,735 WARNING [train.py:1204] (2/4) Exclude cut with ID _61008_210_10633_1_1534852769198_6974370_345-91705-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 01:19:20,676 WARNING [train.py:1204] (2/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_304-273433-0 from training. Duration: 0.99 2023-10-03 01:19:20,680 WARNING [train.py:1204] (2/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_359-345972-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 01:19:21,965 INFO [train.py:1046] (2/4) Epoch 31, batch 3700, loss[loss=0.1485, simple_loss=0.2329, pruned_loss=0.03203, over 24336.00 frames. ], tot_loss[loss=0.1628, simple_loss=0.241, pruned_loss=0.0423, over 4714325.01 frames. ], batch size: 61, lr: 3.28e-03, grad_scale: 8.0 2023-10-03 01:19:23,406 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_2296_210_2087_1_1525503448_3397335_57-185430-0_sp1.1 from training. Duration: 0.5454375 2023-10-03 01:19:26,136 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_1256-120974-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 01:19:26,203 WARNING [train.py:1204] (2/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_372-30302-0_sp0.9 from training. Duration: 0.9555625 2023-10-03 01:19:28,924 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_42012_210_7927_1_1533439936_3871556_451-344262-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 01:19:28,924 WARNING [train.py:1204] (2/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_28-324729-0 from training. Duration: 0.94 2023-10-03 01:19:28,932 WARNING [train.py:1204] (2/4) Exclude cut with ID _64135_210_2253_1_1535677267876_7608972_313-18394-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 01:19:31,276 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.5.encoder.layers.1.feed_forward3.out_whiten, num_groups=1, num_channels=256, metric=9.27 vs. limit=15.0 2023-10-03 01:19:32,173 WARNING [train.py:1204] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_620-276793-0_sp0.9 from training. Duration: 0.6555625 2023-10-03 01:19:32,212 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7544_210_6753_1_1528591327_1471426_172-348777-0_sp0.9 from training. Duration: 0.7333125 2023-10-03 01:19:36,823 WARNING [train.py:1204] (2/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_332-221057-0_sp0.9 from training. Duration: 0.7555625 2023-10-03 01:19:38,264 WARNING [train.py:1204] (2/4) Exclude cut with ID _19023_210_4353_1_1531378892552_5820780_5-300035-0_sp1.1 from training. Duration: 0.92725 2023-10-03 01:19:39,666 WARNING [train.py:1204] (2/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_343-309023-0_sp1.1 from training. Duration: 0.963625 2023-10-03 01:19:39,733 WARNING [train.py:1204] (2/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_37-41919-0_sp1.1 from training. Duration: 0.7909375 2023-10-03 01:19:39,756 WARNING [train.py:1204] (2/4) Exclude cut with ID _3541_210_2477_1_1526475408709_3263973_41-182910-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 01:19:41,157 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_229-45300-0_sp0.9 from training. Duration: 0.6333125 2023-10-03 01:19:42,641 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.conv_module2.balancer1.prob, batch_count=1087160.0, ans=0.125 2023-10-03 01:19:43,798 WARNING [train.py:1204] (2/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_40-214716-0_sp1.1 from training. Duration: 0.963625 2023-10-03 01:19:45,195 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9309W0203-640330 from training. Duration: 0.896 2023-10-03 01:19:49,473 WARNING [train.py:1204] (2/4) Exclude cut with ID _60229_210_27767_1_1534838406158_3655350_264-279573-0_sp1.1 from training. Duration: 0.7545625 2023-10-03 01:19:50,886 WARNING [train.py:1204] (2/4) Exclude cut with ID _8394_210_6702_1_1528590504316_3540189_372-198878-0_sp0.9 from training. Duration: 0.6666875 2023-10-03 01:19:52,290 WARNING [train.py:1204] (2/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_359-118926-0_sp1.1 from training. Duration: 0.6545625 2023-10-03 01:19:52,310 WARNING [train.py:1204] (2/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_85-65056-0 from training. Duration: 0.96 2023-10-03 01:19:52,330 WARNING [train.py:1204] (2/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_89-242335-0_sp1.1 from training. Duration: 0.863625 2023-10-03 01:19:55,683 WARNING [train.py:1204] (2/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_128-49444-0_sp1.1 from training. Duration: 0.936375 2023-10-03 01:19:55,761 WARNING [train.py:1204] (2/4) Exclude cut with ID _30327_210_16049_1_1532484161295_3924169_401-70819-0 from training. Duration: 0.86 2023-10-03 01:19:57,063 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_41822_210_13270_1_1533955976_4379284_260-256391-0_sp1.1 from training. Duration: 0.936375 2023-10-03 01:19:59,125 WARNING [train.py:1204] (2/4) Exclude cut with ID _67335_210_5219_1_1535525975652_4275229_599-247317-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 01:20:02,145 WARNING [train.py:1204] (2/4) Exclude cut with ID _29857_210_20034_1_1532395886753_3798160_209-239267-0_sp1.1 from training. Duration: 0.936375 2023-10-03 01:20:03,762 WARNING [train.py:1204] (2/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_46-207599-0_sp1.1 from training. Duration: 0.5818125 2023-10-03 01:20:06,445 WARNING [train.py:1204] (2/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_912-282412-0_sp1.1 from training. Duration: 0.4454375 2023-10-03 01:20:09,487 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_4183_210_2087_1_1526729100_1869838_141-166600-0_sp1.1 from training. Duration: 0.863625 2023-10-03 01:20:09,499 WARNING [train.py:1204] (2/4) Exclude cut with ID _15037_210_2253_1_1530752294104_3267650_296-192597-0 from training. Duration: 0.81 2023-10-03 01:20:09,543 WARNING [train.py:1204] (2/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_571-220562-0_sp1.1 from training. Duration: 0.963625 2023-10-03 01:20:10,800 WARNING [train.py:1204] (2/4) Exclude cut with ID _5287_210_2084_1_1527418883040_3878670_12-221186-0 from training. Duration: 0.98 2023-10-03 01:20:15,118 WARNING [train.py:1204] (2/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_149-85602-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 01:20:15,155 WARNING [train.py:1204] (2/4) Exclude cut with ID _9235_210_5219_1_1528891154253_8046120_1003-101638-0_sp1.1 from training. Duration: 0.663625 2023-10-03 01:20:17,172 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.3.encoder.layers.3.whiten, num_groups=1, num_channels=512, metric=3.84 vs. limit=12.0 2023-10-03 01:20:17,928 WARNING [train.py:1204] (2/4) Exclude cut with ID _55844_210_2346_1_1534471212569_7625707_1099-33689-0_sp1.1 from training. Duration: 0.92725 2023-10-03 01:20:18,503 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.self_attn2.whiten.whitening_limit, batch_count=1087293.3333333333, ans=22.5 2023-10-03 01:20:19,132 WARNING [train.py:1204] (2/4) Exclude cut with ID _7606_210_4915_1_1528894253277_5080690_301-10732-0 from training. Duration: 0.81 2023-10-03 01:20:20,546 WARNING [train.py:1204] (2/4) Exclude cut with ID _22826_210_9994_1_1531908201522_3279169_403-34698-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 01:20:20,551 WARNING [train.py:1204] (2/4) Exclude cut with ID _8480_210_1874_1_1528589391205_6728330_755-112625-0_sp0.9 from training. Duration: 0.82225 2023-10-03 01:20:20,569 WARNING [train.py:1204] (2/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_527-238990-0_sp1.1 from training. Duration: 0.8181875 2023-10-03 01:20:21,865 WARNING [train.py:1204] (2/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_426-7328-0_sp1.1 from training. Duration: 0.92725 2023-10-03 01:20:25,254 WARNING [train.py:1204] (2/4) Exclude cut with ID _3541_210_2477_1_1526475408709_3263973_189-317158-0_sp1.1 from training. Duration: 0.8181875 2023-10-03 01:20:26,616 WARNING [train.py:1204] (2/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_162-103620-0 from training. Duration: 0.93 2023-10-03 01:20:26,734 WARNING [train.py:1204] (2/4) Exclude cut with ID _22083_210_8254_1_1531702862439_3678970_49-234546-0 from training. Duration: 0.83 2023-10-03 01:20:26,786 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder_embed.convnext.layerdrop_rate, batch_count=1087360.0, ans=0.015 2023-10-03 01:20:28,075 WARNING [train.py:1204] (2/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_370-331600-0_sp1.1 from training. Duration: 0.8545625 2023-10-03 01:20:28,096 WARNING [train.py:1204] (2/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_166-59042-0_sp1.1 from training. Duration: 0.97275 2023-10-03 01:20:30,190 WARNING [train.py:1204] (2/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_138-332773-0_sp1.1 from training. Duration: 0.736375 2023-10-03 01:20:30,260 WARNING [train.py:1204] (2/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_90-30673-0_sp0.9 from training. Duration: 0.9333125 2023-10-03 01:20:31,038 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.1.encoder.layers.0.nonlin_attention.whiten2, num_groups=1, num_channels=256, metric=5.41 vs. limit=15.0 2023-10-03 01:20:32,845 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.518e+02 1.869e+02 2.113e+02 2.369e+02 2.929e+02, threshold=4.225e+02, percent-clipped=0.0 2023-10-03 01:20:34,163 WARNING [train.py:1204] (2/4) Exclude cut with ID _27967_210_12140_1_1532588391859_8419130_737-41898-0_sp1.1 from training. Duration: 0.936375 2023-10-03 01:20:34,268 WARNING [train.py:1204] (2/4) Exclude cut with ID _8833_210_5219_1_1528592665210_7553059_329-125156-0_sp1.1 from training. Duration: 0.7454375 2023-10-03 01:20:35,955 INFO [train.py:1046] (2/4) Epoch 31, batch 3750, loss[loss=0.1698, simple_loss=0.2582, pruned_loss=0.04066, over 24407.00 frames. ], tot_loss[loss=0.1648, simple_loss=0.243, pruned_loss=0.04334, over 4712908.16 frames. ], batch size: 69, lr: 3.28e-03, grad_scale: 8.0 2023-10-03 01:20:36,087 WARNING [train.py:1204] (2/4) Exclude cut with ID _41817_210_18835_1_1533434477160_7423049_352-42537-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 01:20:38,768 WARNING [train.py:1204] (2/4) Exclude cut with ID _8365_210_2064_1_1528592311070_4677690_431-294102-0 from training. Duration: 0.89 2023-10-03 01:20:39,773 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.0.layers.1.feed_forward2.out_whiten, num_groups=1, num_channels=192, metric=4.16 vs. limit=15.0 2023-10-03 01:20:40,252 WARNING [train.py:1204] (2/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_454-314901-0_sp1.1 from training. Duration: 0.4181875 2023-10-03 01:20:41,652 WARNING [train.py:1204] (2/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_80-135200-0_sp0.9 from training. Duration: 0.87775 2023-10-03 01:20:42,985 WARNING [train.py:1204] (2/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_181-78632-0 from training. Duration: 0.78 2023-10-03 01:20:43,049 WARNING [train.py:1204] (2/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_300-27235-0_sp0.9 from training. Duration: 0.9666875 2023-10-03 01:20:44,428 WARNING [train.py:1204] (2/4) Exclude cut with ID _47790_210_16187_1_1533862814066_4207519_96-177176-0_sp1.1 from training. Duration: 0.97275 2023-10-03 01:20:45,988 WARNING [train.py:1204] (2/4) Exclude cut with ID _35941_210_15379_1_1532995294515_4023089_246-348742-0_sp1.1 from training. Duration: 0.97275 2023-10-03 01:20:47,261 WARNING [train.py:1204] (2/4) Exclude cut with ID _38670_210_15379_1_1533448810659_4391059_65-169397-0_sp1.1 from training. Duration: 0.8818125 2023-10-03 01:20:50,000 WARNING [train.py:1204] (2/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_1003-99642-0_sp1.1 from training. Duration: 0.963625 2023-10-03 01:20:52,786 WARNING [train.py:1204] (2/4) Exclude cut with ID _40402_210_18219_1_1533522324115_2953150_320-14078-0_sp1.1 from training. Duration: 0.863625 2023-10-03 01:20:54,634 WARNING [train.py:1204] (2/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_367-61330-0_sp0.9 from training. Duration: 0.8333125 2023-10-03 01:20:56,039 WARNING [train.py:1204] (2/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_515-298514-0_sp1.1 from training. Duration: 0.92725 2023-10-03 01:20:56,267 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.feed_forward1.out_proj.dropout_p, batch_count=1087493.3333333333, ans=0.1 2023-10-03 01:20:58,590 WARNING [train.py:1204] (2/4) Exclude cut with ID _53731_210_7994_1_1534553979244_3757214_471-320815-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 01:20:59,435 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=1087493.3333333333, ans=0.1 2023-10-03 01:21:00,589 WARNING [train.py:1204] (2/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_306-232480-0 from training. Duration: 0.96 2023-10-03 01:21:01,886 WARNING [train.py:1204] (2/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_478-162247-0_sp0.9 from training. Duration: 0.911125 2023-10-03 01:21:03,910 WARNING [train.py:1204] (2/4) Exclude cut with ID _56423_210_5854_1_1534896061049_3165969_345-284696-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 01:21:03,945 WARNING [train.py:1204] (2/4) Exclude cut with ID _44778_210_2054_1_1533798127203_7443090_600-122253-0_sp1.1 from training. Duration: 0.963625 2023-10-03 01:21:04,154 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=1087560.0, ans=0.1 2023-10-03 01:21:05,606 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.bypass_mid.scale_min, batch_count=1087560.0, ans=0.2 2023-10-03 01:21:07,324 WARNING [train.py:1204] (2/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_199-295611-0 from training. Duration: 0.55 2023-10-03 01:21:08,782 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder_embed.conv.5.prob, batch_count=1087560.0, ans=0.125 2023-10-03 01:21:10,084 WARNING [train.py:1204] (2/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_734-129761-0 from training. Duration: 0.82 2023-10-03 01:21:11,533 WARNING [train.py:1204] (2/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_349-169257-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 01:21:12,898 WARNING [train.py:1204] (2/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_202-67017-0_sp0.9 from training. Duration: 0.911125 2023-10-03 01:21:14,276 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder_embed.conv.8.prob, batch_count=1087560.0, ans=0.125 2023-10-03 01:21:14,462 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.ff3_skip_rate, batch_count=1087560.0, ans=0.0 2023-10-03 01:21:15,591 WARNING [train.py:1204] (2/4) Exclude cut with ID _16908_210_5724_1_1531119157248_3772700_61-258978-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 01:21:18,596 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.conv_skip_rate, batch_count=1087626.6666666667, ans=0.0 2023-10-03 01:21:21,087 WARNING [train.py:1204] (2/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_172-156931-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 01:21:21,188 WARNING [train.py:1204] (2/4) Exclude cut with ID _4092_210_2151_1_1526643044449_3135759_356-278694-0_sp1.1 from training. Duration: 0.42725 2023-10-03 01:21:23,537 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.0.layers.0.conv_module1.whiten, num_groups=1, num_channels=192, metric=12.19 vs. limit=15.0 2023-10-03 01:21:24,039 WARNING [train.py:1204] (2/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_116-332008-0 from training. Duration: 0.98 2023-10-03 01:21:27,241 WARNING [train.py:1204] (2/4) Exclude cut with ID _29003_210_19596_1_1532507391720_3894098_358-114789-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 01:21:30,700 WARNING [train.py:1204] (2/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_152-212533-0_sp1.1 from training. Duration: 0.8818125 2023-10-03 01:21:32,018 WARNING [train.py:1204] (2/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_267-214116-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 01:21:34,760 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7543_210_6753_1_1528541907_1399322_165-323619-0_sp0.9 from training. Duration: 0.7666875 2023-10-03 01:21:38,630 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.4.encoder.layers.1.feed_forward2.out_whiten, num_groups=1, num_channels=384, metric=14.66 vs. limit=15.0 2023-10-03 01:21:39,461 WARNING [train.py:1204] (2/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_577-332600-0_sp0.9 from training. Duration: 0.6333125 2023-10-03 01:21:40,833 WARNING [train.py:1204] (2/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_342-100157-0_sp0.9 from training. Duration: 0.6 2023-10-03 01:21:43,541 WARNING [train.py:1204] (2/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_141-89727-0_sp1.1 from training. Duration: 0.6454375 2023-10-03 01:21:43,652 WARNING [train.py:1204] (2/4) Exclude cut with ID _3992_210_1747_1_1526644816031_3731060_107-251434-0_sp1.1 from training. Duration: 0.9 2023-10-03 01:21:47,721 WARNING [train.py:1204] (2/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_401-53423-0_sp1.1 from training. Duration: 0.5 2023-10-03 01:21:49,173 INFO [train.py:1046] (2/4) Epoch 31, batch 3800, loss[loss=0.1509, simple_loss=0.2405, pruned_loss=0.03065, over 24439.00 frames. ], tot_loss[loss=0.1659, simple_loss=0.2446, pruned_loss=0.04364, over 4704076.30 frames. ], batch size: 63, lr: 3.28e-03, grad_scale: 8.0 2023-10-03 01:21:49,434 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.0.conv_module1.balancer1.prob, batch_count=1087760.0, ans=0.125 2023-10-03 01:21:49,531 INFO [scaling.py:1118] (2/4) WithLoss: name=encoder.encoders.3.encoder.layers.0.self_attn_weights, loss-sum=0.000e+00 2023-10-03 01:21:53,920 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7762_210_1794_1_1528199508_4057408_422-227034-0_sp1.1 from training. Duration: 0.663625 2023-10-03 01:21:58,083 WARNING [train.py:1204] (2/4) Exclude cut with ID _4184_210_2550_1_1526730192013_2219380_102-250299-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 01:21:59,438 WARNING [train.py:1204] (2/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_253-308377-0_sp0.9 from training. Duration: 0.5444375 2023-10-03 01:21:59,508 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_271-297505-0 from training. Duration: 0.99 2023-10-03 01:22:01,136 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.balancer1.prob, batch_count=1087760.0, ans=0.125 2023-10-03 01:22:02,453 WARNING [train.py:1204] (2/4) Exclude cut with ID _35941_210_15379_1_1532995294515_4023089_213-142973-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 01:22:04,440 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7544_210_6753_1_1528587722_3576686_162-63018-0_sp1.1 from training. Duration: 0.92725 2023-10-03 01:22:05,816 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_365-217119-0_sp0.9 from training. Duration: 0.7 2023-10-03 01:22:06,658 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.2.encoder.layers.1.self_attn_weights.whiten_keys, num_groups=4, num_channels=128, metric=2.92 vs. limit=6.0 2023-10-03 01:22:07,299 WARNING [train.py:1204] (2/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_662-275621-0_sp0.9 from training. Duration: 0.4666875 2023-10-03 01:22:07,301 WARNING [train.py:1204] (2/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_374-187555-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 01:22:07,373 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_289-180904-0_sp1.1 from training. Duration: 0.6909375 2023-10-03 01:22:09,272 WARNING [train.py:1204] (2/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_189-47206-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 01:22:09,309 WARNING [train.py:1204] (2/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_234-14174-0_sp1.1 from training. Duration: 0.7454375 2023-10-03 01:22:11,374 WARNING [train.py:1204] (2/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_576-76621-0_sp1.1 from training. Duration: 0.963625 2023-10-03 01:22:11,459 WARNING [train.py:1204] (2/4) Exclude cut with ID _13253_210_1925_1_1530757775819_3577873_253-246386-0 from training. Duration: 0.9 2023-10-03 01:22:15,662 WARNING [train.py:1204] (2/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_372-6869-0_sp1.1 from training. Duration: 0.4 2023-10-03 01:22:16,916 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_21434_210_4238_1_1531916339_1222252_22-54954-0_sp1.1 from training. Duration: 0.936375 2023-10-03 01:22:17,154 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.conv_skip_rate, batch_count=1087826.6666666667, ans=0.0 2023-10-03 01:22:18,385 WARNING [train.py:1204] (2/4) Exclude cut with ID _29853_210_7497_1_1532505600851_6642110_492-170361-0_sp1.1 from training. Duration: 0.92725 2023-10-03 01:22:19,073 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.3.encoder.layers.3.conv_module2.whiten, num_groups=1, num_channels=512, metric=4.13 vs. limit=15.0 2023-10-03 01:22:19,858 WARNING [train.py:1204] (2/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_505-97987-0_sp0.9 from training. Duration: 0.9555625 2023-10-03 01:22:20,394 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.4.encoder.layers.2.conv_module1.whiten, num_groups=1, num_channels=384, metric=2.61 vs. limit=15.0 2023-10-03 01:22:21,170 WARNING [train.py:1204] (2/4) Exclude cut with ID _8365_210_2064_1_1528592311070_4677690_371-122295-0_sp0.9 from training. Duration: 0.611125 2023-10-03 01:22:22,554 WARNING [train.py:1204] (2/4) Exclude cut with ID _14268_210_9206_1_1530622771285_4714099_637-86630-0_sp1.1 from training. Duration: 0.463625 2023-10-03 01:22:22,566 WARNING [train.py:1204] (2/4) Exclude cut with ID _29857_210_20034_1_1532395886753_3798160_372-5678-0_sp1.1 from training. Duration: 0.963625 2023-10-03 01:22:25,858 WARNING [train.py:1204] (2/4) Exclude cut with ID _52892_210_6721_1_1534378941259_4124129_90-165108-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 01:22:25,940 WARNING [train.py:1204] (2/4) Exclude cut with ID _27052_210_5986_1_1532329197187_3585009_143-339198-0_sp1.1 from training. Duration: 0.963625 2023-10-03 01:22:30,196 WARNING [train.py:1204] (2/4) Exclude cut with ID _14268_210_9206_1_1530622771285_4714099_637-86630-0_sp0.9 from training. Duration: 0.5666875 2023-10-03 01:22:30,199 WARNING [train.py:1204] (2/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_401-255491-0 from training. Duration: 0.7 2023-10-03 01:22:31,566 WARNING [train.py:1204] (2/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_178-6946-0_sp1.1 from training. Duration: 0.97275 2023-10-03 01:22:40,133 WARNING [train.py:1204] (2/4) Exclude cut with ID _27485_210_4916_1_1532739191677_3668269_460-163885-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 01:22:44,618 WARNING [train.py:1204] (2/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_270-26053-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 01:22:47,255 WARNING [train.py:1204] (2/4) Exclude cut with ID _14170_210_5219_1_1530525949868_4469650_412-317077-0 from training. Duration: 0.75 2023-10-03 01:22:50,060 WARNING [train.py:1204] (2/4) Exclude cut with ID _5289_210_2084_1_1527678202736_3779299_142-103317-0 from training. Duration: 0.84 2023-10-03 01:22:50,112 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_150-143438-0_sp1.1 from training. Duration: 0.92725 2023-10-03 01:22:52,883 WARNING [train.py:1204] (2/4) Exclude cut with ID _27894_210_12392_1_1532415496078_6902439_668-269779-0_sp1.1 from training. Duration: 0.97275 2023-10-03 01:22:52,954 WARNING [train.py:1204] (2/4) Exclude cut with ID _18513_210_9206_1_1531618215223_3862620_56-222075-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 01:22:54,343 WARNING [train.py:1204] (2/4) Exclude cut with ID _4092_210_2151_1_1526643044449_3135759_356-278694-0 from training. Duration: 0.47 2023-10-03 01:22:57,646 WARNING [train.py:1204] (2/4) Exclude cut with ID _25808_210_13292_1_1532343514562_7366109_633-47524-0 from training. Duration: 0.99 2023-10-03 01:22:57,662 WARNING [train.py:1204] (2/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_872-264525-0 from training. Duration: 0.65 2023-10-03 01:22:58,973 WARNING [train.py:1204] (2/4) Exclude cut with ID _14632_210_9988_1_1530691279392_3923200_239-232545-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 01:22:59,061 WARNING [train.py:1204] (2/4) Exclude cut with ID _14349_210_8341_1_1530615710818_3442470_153-27432-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 01:23:01,717 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.598e+02 1.890e+02 2.060e+02 2.285e+02 4.176e+02, threshold=4.120e+02, percent-clipped=0.0 2023-10-03 01:23:02,202 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.feed_forward2.hidden_balancer.prob, batch_count=1088026.6666666667, ans=0.125 2023-10-03 01:23:03,333 WARNING [train.py:1204] (2/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_37-41919-0_sp0.9 from training. Duration: 0.9666875 2023-10-03 01:23:03,391 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_196-289637-0_sp0.9 from training. Duration: 0.8333125 2023-10-03 01:23:03,511 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.0.feed_forward1.out_proj.dropout_p, batch_count=1088093.3333333333, ans=0.1 2023-10-03 01:23:04,717 INFO [train.py:1046] (2/4) Epoch 31, batch 3850, loss[loss=0.1714, simple_loss=0.26, pruned_loss=0.04135, over 24551.00 frames. ], tot_loss[loss=0.1648, simple_loss=0.2433, pruned_loss=0.04313, over 4706077.52 frames. ], batch size: 71, lr: 3.28e-03, grad_scale: 8.0 2023-10-03 01:23:06,860 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.balancer2.prob, batch_count=1088093.3333333333, ans=0.125 2023-10-03 01:23:06,962 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.out_combiner.scale_min, batch_count=1088093.3333333333, ans=0.2 2023-10-03 01:23:09,867 WARNING [train.py:1204] (2/4) Exclude cut with ID _13678_210_7497_1_1530530069226_3463170_371-3221-0_sp1.1 from training. Duration: 0.8181875 2023-10-03 01:23:09,939 WARNING [train.py:1204] (2/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_557-324258-0 from training. Duration: 0.81 2023-10-03 01:23:13,189 WARNING [train.py:1204] (2/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_1012-279473-0_sp1.1 from training. Duration: 0.6090625 2023-10-03 01:23:13,279 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_66916_210_14656_1_1535725525_326566_7-92649-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 01:23:14,092 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.1.encoder.layers.0.feed_forward1.out_whiten, num_groups=1, num_channels=256, metric=5.09 vs. limit=15.0 2023-10-03 01:23:17,321 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_9460_210_4211_1_1528954587_10049667_480-260269-0_sp1.1 from training. Duration: 0.5090625 2023-10-03 01:23:17,970 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.feed_forward3.out_whiten.whitening_limit, batch_count=1088093.3333333333, ans=15.0 2023-10-03 01:23:18,706 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_40896_210_15710_1_1533433956_7100802_121-155001-0_sp1.1 from training. Duration: 0.92725 2023-10-03 01:23:20,232 WARNING [train.py:1204] (2/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_188-171673-0_sp1.1 from training. Duration: 0.52725 2023-10-03 01:23:21,576 WARNING [train.py:1204] (2/4) Exclude cut with ID _47790_210_16187_1_1533862814066_4207519_2-39653-0 from training. Duration: 0.98 2023-10-03 01:23:27,639 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_23850_210_4238_1_1532232597_1632433_112-229012-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 01:23:27,776 WARNING [train.py:1204] (2/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_626-79528-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 01:23:29,140 WARNING [train.py:1204] (2/4) Exclude cut with ID _30304_210_6319_1_1532476827261_7582060_856-90052-0_sp1.1 from training. Duration: 0.963625 2023-10-03 01:23:30,572 WARNING [train.py:1204] (2/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_122-109023-0_sp0.9 from training. Duration: 0.8444375 2023-10-03 01:23:33,675 WARNING [train.py:1204] (2/4) Exclude cut with ID _64414_210_17301_1_1535243252048_3757759_37-70328-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 01:23:34,937 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_20120_210_6753_1_1531643035_5482859_622-344907-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 01:23:34,996 WARNING [train.py:1204] (2/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_410-321956-0_sp1.1 from training. Duration: 0.92725 2023-10-03 01:23:35,015 WARNING [train.py:1204] (2/4) Exclude cut with ID _7562_210_2084_1_1528887767916_3946229_186-326422-0_sp0.9 from training. Duration: 0.9333125 2023-10-03 01:23:36,386 WARNING [train.py:1204] (2/4) Exclude cut with ID _37537_210_2253_1_1533171758568_3421949_127-285192-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 01:23:37,738 WARNING [train.py:1204] (2/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_723-228168-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 01:23:39,683 WARNING [train.py:1204] (2/4) Exclude cut with ID _13678_210_7497_1_1530530069226_3463170_426-246984-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 01:23:40,899 WARNING [train.py:1204] (2/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_399-156791-0_sp0.9 from training. Duration: 0.92225 2023-10-03 01:23:40,958 WARNING [train.py:1204] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_391-22148-0 from training. Duration: 0.77 2023-10-03 01:23:40,982 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_3475_210_2028_1_1526193578_3622569_64-297072-0 from training. Duration: 0.91 2023-10-03 01:23:41,259 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.attention_skip_rate, batch_count=1088226.6666666667, ans=0.0 2023-10-03 01:23:43,066 WARNING [train.py:1204] (2/4) Exclude cut with ID _17875_210_2087_1_1531224012907_1821339_225-280015-0_sp1.1 from training. Duration: 0.963625 2023-10-03 01:23:43,085 WARNING [train.py:1204] (2/4) Exclude cut with ID _50655_210_18628_1_1534229942193_3772154_325-246169-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 01:23:45,758 WARNING [train.py:1204] (2/4) Exclude cut with ID _29529_210_7234_1_1532393961455_3977460_597-257348-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 01:23:47,130 WARNING [train.py:1204] (2/4) Exclude cut with ID _58895_210_8750_1_1535184081320_3649089_77-173485-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 01:23:47,157 WARNING [train.py:1204] (2/4) Exclude cut with ID _6270_210_2084_1_1527850911573_3856420_26-277343-0 from training. Duration: 0.82 2023-10-03 01:23:50,069 WARNING [train.py:1204] (2/4) Exclude cut with ID _14368_210_4220_1_1530532714870_3595529_144-3287-0 from training. Duration: 0.67 2023-10-03 01:23:51,486 WARNING [train.py:1204] (2/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_293-145278-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 01:23:53,417 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_40047_210_2290_1_1533290519_2374311_257-335162-0 from training. Duration: 0.95 2023-10-03 01:23:54,901 WARNING [train.py:1204] (2/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_619-344499-0_sp0.9 from training. Duration: 0.7 2023-10-03 01:23:59,071 WARNING [train.py:1204] (2/4) Exclude cut with ID _19504_210_9994_1_1531648863242_3325220_456-315379-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 01:24:00,554 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=1088293.3333333333, ans=0.1 2023-10-03 01:24:01,743 WARNING [train.py:1204] (2/4) Exclude cut with ID _55857_210_2346_1_1534644007831_6898209_631-221692-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 01:24:03,887 WARNING [train.py:1204] (2/4) Exclude cut with ID _64631_210_27767_1_1535349580068_4165160_324-3682-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 01:24:03,907 WARNING [train.py:1204] (2/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_317-211107-0 from training. Duration: 0.92 2023-10-03 01:24:06,627 WARNING [train.py:1204] (2/4) Exclude cut with ID _6789_210_1534_1_1527857937218_3118950_311-61262-0 from training. Duration: 0.65 2023-10-03 01:24:09,320 WARNING [train.py:1204] (2/4) Exclude cut with ID _28323_210_16187_1_1532480395667_4100300_365-68882-0_sp1.1 from training. Duration: 0.92725 2023-10-03 01:24:09,356 WARNING [train.py:1204] (2/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_816-181825-0_sp1.1 from training. Duration: 0.92725 2023-10-03 01:24:12,720 WARNING [train.py:1204] (2/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_132-269633-0_sp1.1 from training. Duration: 0.6181875 2023-10-03 01:24:12,738 WARNING [train.py:1204] (2/4) Exclude cut with ID _44585_210_18628_1_1533605350387_4518199_280-73534-0_sp1.1 from training. Duration: 0.8090625 2023-10-03 01:24:14,138 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_68095_210_20185_1_1535718226_8888855_1009-164755-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 01:24:14,208 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_40223_210_6228_1_1533298404_4812267_400-103813-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 01:24:14,209 WARNING [train.py:1204] (2/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_491-261923-0_sp1.1 from training. Duration: 0.8909375 2023-10-03 01:24:14,214 WARNING [train.py:1204] (2/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_712-342563-0 from training. Duration: 0.97 2023-10-03 01:24:14,345 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.bypass.scale_min, batch_count=1088360.0, ans=0.2 2023-10-03 01:24:15,587 WARNING [train.py:1204] (2/4) Exclude cut with ID _41161_210_20034_1_1533431249657_3555314_175-190032-0_sp1.1 from training. Duration: 0.963625 2023-10-03 01:24:16,938 WARNING [train.py:1204] (2/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_788-298907-0 from training. Duration: 0.89 2023-10-03 01:24:16,962 WARNING [train.py:1204] (2/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_225-70961-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 01:24:16,975 WARNING [train.py:1204] (2/4) Exclude cut with ID _58583_210_5236_1_1534726835266_3574249_235-175994-0_sp1.1 from training. Duration: 0.92725 2023-10-03 01:24:18,481 WARNING [train.py:1204] (2/4) Exclude cut with ID _12302_210_5219_1_1529924446466_8107280_907-182949-0_sp1.1 from training. Duration: 0.836375 2023-10-03 01:24:19,596 INFO [train.py:1046] (2/4) Epoch 31, batch 3900, loss[loss=0.1468, simple_loss=0.2216, pruned_loss=0.03601, over 23485.00 frames. ], tot_loss[loss=0.1636, simple_loss=0.2416, pruned_loss=0.04284, over 4691068.54 frames. ], batch size: 134, lr: 3.28e-03, grad_scale: 8.0 2023-10-03 01:24:19,676 WARNING [train.py:1204] (2/4) Exclude cut with ID _54871_210_22356_1_1534665798253_3856359_94-193692-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 01:24:22,184 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_4528_210_3273_1_1527076654_3276777_342-168068-0_sp0.9 from training. Duration: 0.9666875 2023-10-03 01:24:22,236 WARNING [train.py:1204] (2/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_368-74239-0_sp1.1 from training. Duration: 0.92725 2023-10-03 01:24:22,238 WARNING [train.py:1204] (2/4) Exclude cut with ID _44831_210_13427_1_1533816011549_3689035_291-295928-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 01:24:24,149 WARNING [train.py:1204] (2/4) Exclude cut with ID _56406_210_9288_1_1534899618664_3496149_455-131997-0_sp1.1 from training. Duration: 0.97275 2023-10-03 01:24:24,163 WARNING [train.py:1204] (2/4) Exclude cut with ID _6473_210_1741_1_1527901307396_3815380_108-17335-0 from training. Duration: 0.65 2023-10-03 01:24:24,200 WARNING [train.py:1204] (2/4) Exclude cut with ID _14224_210_4381_1_1530529062037_4264010_286-135600-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 01:24:25,888 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.ff2_skip_rate, batch_count=1088426.6666666667, ans=0.0 2023-10-03 01:24:27,261 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.balancer2.prob, batch_count=1088426.6666666667, ans=0.125 2023-10-03 01:24:28,387 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_14056_210_4163_1_1530604483_7572827_928-321675-0_sp1.1 from training. Duration: 0.936375 2023-10-03 01:24:28,467 WARNING [train.py:1204] (2/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_81-300212-0_sp1.1 from training. Duration: 0.5454375 2023-10-03 01:24:28,502 WARNING [train.py:1204] (2/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_29-280622-0_sp0.9 from training. Duration: 0.911125 2023-10-03 01:24:29,855 WARNING [train.py:1204] (2/4) Exclude cut with ID _61079_210_4525_1_1534921008198_4011419_228-176447-0_sp1.1 from training. Duration: 0.936375 2023-10-03 01:24:30,014 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.0.feed_forward1.out_proj.dropout_p, batch_count=1088426.6666666667, ans=0.1 2023-10-03 01:24:31,936 WARNING [train.py:1204] (2/4) Exclude cut with ID _8394_210_6702_1_1528590504316_3540189_372-198878-0_sp1.1 from training. Duration: 0.5454375 2023-10-03 01:24:32,966 WARNING [train.py:1204] (2/4) Exclude cut with ID _64521_210_14699_1_1535243427259_3815328_244-208725-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 01:24:33,097 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8275_210_4198_1_1528455320_3892571_491-17707-0_sp0.9 from training. Duration: 0.711125 2023-10-03 01:24:34,679 WARNING [train.py:1204] (2/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_375-245677-0 from training. Duration: 0.81 2023-10-03 01:24:34,686 WARNING [train.py:1204] (2/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_690-286055-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 01:24:36,067 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.0.balancer2.prob, batch_count=1088493.3333333333, ans=0.125 2023-10-03 01:24:37,253 WARNING [train.py:1204] (2/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_89-321955-0 from training. Duration: 0.8 2023-10-03 01:24:37,297 WARNING [train.py:1204] (2/4) Exclude cut with ID _18895_210_6228_1_1531357274234_3784930_213-98721-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 01:24:38,618 WARNING [train.py:1204] (2/4) Exclude cut with ID _19708_210_9206_1_1532136650733_4238364_347-65240-0 from training. Duration: 0.97 2023-10-03 01:24:39,969 WARNING [train.py:1204] (2/4) Exclude cut with ID _64135_210_2253_1_1535677267876_7608972_498-5039-0 from training. Duration: 0.78 2023-10-03 01:24:46,635 WARNING [train.py:1204] (2/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_146-276944-0_sp1.1 from training. Duration: 0.7545625 2023-10-03 01:24:48,001 WARNING [train.py:1204] (2/4) Exclude cut with ID _63171_210_15488_1_1535104792794_3295110_155-34488-0_sp1.1 from training. Duration: 0.97275 2023-10-03 01:24:48,014 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_5438_210_1794_1_1527336687_2600009_233-58072-0_sp0.9 from training. Duration: 0.8555625 2023-10-03 01:24:49,354 WARNING [train.py:1204] (2/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_63-101996-0_sp0.9 from training. Duration: 0.97775 2023-10-03 01:24:53,633 WARNING [train.py:1204] (2/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_271-269084-0_sp1.1 from training. Duration: 0.7545625 2023-10-03 01:24:55,040 WARNING [train.py:1204] (2/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_345-167986-0_sp1.1 from training. Duration: 0.763625 2023-10-03 01:24:56,591 WARNING [train.py:1204] (2/4) Exclude cut with ID _39290_210_4220_1_1533862778760_3075118_92-311600-0_sp1.1 from training. Duration: 0.8 2023-10-03 01:24:58,294 WARNING [train.py:1204] (2/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_209-65306-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 01:24:58,373 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_26433_210_12108_1_1532157158_4056480_317-335807-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 01:25:02,691 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.conv_module2.balancer1.max_abs, batch_count=1088626.6666666667, ans=10.0 2023-10-03 01:25:03,770 WARNING [train.py:1204] (2/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_238-94127-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 01:25:03,831 WARNING [train.py:1204] (2/4) Exclude cut with ID _41314_210_12651_1_1533459476543_3766460_207-132038-0_sp1.1 from training. Duration: 0.8545625 2023-10-03 01:25:09,979 WARNING [train.py:1204] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_167-137255-0_sp1.1 from training. Duration: 0.5818125 2023-10-03 01:25:10,142 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.nonlin_attention.balancer.min_positive, batch_count=1088626.6666666667, ans=0.05 2023-10-03 01:25:11,366 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_2009_210_2071_1_1525481134_4588937_385-302100-0_sp0.9 from training. Duration: 0.9666875 2023-10-03 01:25:22,073 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_15517_210_6753_1_1530954008_3487336_253-110617-0_sp1.1 from training. Duration: 0.936375 2023-10-03 01:25:23,577 WARNING [train.py:1204] (2/4) Exclude cut with ID _39290_210_4220_1_1533862778760_3075118_92-311600-0_sp0.9 from training. Duration: 0.97775 2023-10-03 01:25:23,619 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_42-142473-0 from training. Duration: 0.72 2023-10-03 01:25:24,229 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.3.encoder.layers.2.conv_module2.whiten, num_groups=1, num_channels=512, metric=5.97 vs. limit=15.0 2023-10-03 01:25:24,907 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_2296_210_2087_1_1525503448_3397335_57-185430-0 from training. Duration: 0.6 2023-10-03 01:25:24,922 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_11305_210_1794_1_1529750428_7178148_257-312870-0_sp0.9 from training. Duration: 0.97775 2023-10-03 01:25:26,304 WARNING [train.py:1204] (2/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_222-198127-0 from training. Duration: 0.84 2023-10-03 01:25:28,107 WARNING [train.py:1204] (2/4) Exclude cut with ID _65263_210_22356_1_1535763598691_3921037_423-202699-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 01:25:28,186 WARNING [train.py:1204] (2/4) Exclude cut with ID _15037_210_2253_1_1530752294104_3267650_13-130361-0 from training. Duration: 0.99 2023-10-03 01:25:30,792 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.539e+02 1.803e+02 2.022e+02 2.316e+02 4.391e+02, threshold=4.043e+02, percent-clipped=1.0 2023-10-03 01:25:33,534 INFO [train.py:1046] (2/4) Epoch 31, batch 3950, loss[loss=0.1691, simple_loss=0.2482, pruned_loss=0.04503, over 23821.00 frames. ], tot_loss[loss=0.1631, simple_loss=0.2411, pruned_loss=0.0426, over 4688506.71 frames. ], batch size: 85, lr: 3.28e-03, grad_scale: 8.0 2023-10-03 01:25:35,139 WARNING [train.py:1204] (2/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_628-198080-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 01:25:37,255 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_3171_210_2087_1_1526109084_2330007_37-64334-0 from training. Duration: 0.82 2023-10-03 01:25:38,487 WARNING [train.py:1204] (2/4) Exclude cut with ID _5287_210_2084_1_1527418883040_3878670_12-221186-0_sp1.1 from training. Duration: 0.8909375 2023-10-03 01:25:39,948 WARNING [train.py:1204] (2/4) Exclude cut with ID _6653_210_2629_1_1527919172441_6732679_1103-56445-0_sp1.1 from training. Duration: 0.663625 2023-10-03 01:25:42,691 WARNING [train.py:1204] (2/4) Exclude cut with ID _65263_210_22356_1_1535763598691_3921037_169-140068-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 01:25:47,926 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.conv_module1.balancer2.min_abs, batch_count=1088826.6666666667, ans=0.5 2023-10-03 01:25:49,178 WARNING [train.py:1204] (2/4) Exclude cut with ID IC0671W0118-331961 from training. Duration: 0.896 2023-10-03 01:25:50,416 WARNING [train.py:1204] (2/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_402-102311-0_sp1.1 from training. Duration: 0.6090625 2023-10-03 01:25:50,434 WARNING [train.py:1204] (2/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_202-293523-0 from training. Duration: 0.72 2023-10-03 01:25:51,758 WARNING [train.py:1204] (2/4) Exclude cut with ID IC0679W0038-335846 from training. Duration: 0.768 2023-10-03 01:25:51,779 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_37357_210_17024_1_1533089504_2316121_79-198866-0_sp1.1 from training. Duration: 0.936375 2023-10-03 01:25:52,087 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.feed_forward3.hidden_balancer.prob, batch_count=1088826.6666666667, ans=0.125 2023-10-03 01:25:54,554 WARNING [train.py:1204] (2/4) Exclude cut with ID _7606_210_4915_1_1528894253277_5080690_292-271285-0_sp1.1 from training. Duration: 0.963625 2023-10-03 01:25:54,565 WARNING [train.py:1204] (2/4) Exclude cut with ID _3541_210_2477_1_1526475408709_3263973_377-296839-0_sp0.9 from training. Duration: 0.611125 2023-10-03 01:25:54,567 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_45886_210_17024_1_1533704917_2435333_58-131069-0_sp1.1 from training. Duration: 0.963625 2023-10-03 01:25:56,065 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6961_210_2087_1_1527937168_6815011_395-185060-0 from training. Duration: 0.95 2023-10-03 01:25:59,482 WARNING [train.py:1204] (2/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_304-266479-0_sp1.1 from training. Duration: 0.97275 2023-10-03 01:25:59,543 WARNING [train.py:1204] (2/4) Exclude cut with ID _3867_210_1883_1_1526641200575_4475830_319-349850-0_sp1.1 from training. Duration: 0.6090625 2023-10-03 01:26:00,782 WARNING [train.py:1204] (2/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_304-59227-0_sp0.9 from training. Duration: 0.8555625 2023-10-03 01:26:00,849 WARNING [train.py:1204] (2/4) Exclude cut with ID _8021_210_7994_1_1528376451358_3653069_100-140231-0_sp0.9 from training. Duration: 0.7444375 2023-10-03 01:26:02,136 WARNING [train.py:1204] (2/4) Exclude cut with ID _8833_210_5219_1_1528592665210_7553059_329-125156-0_sp0.9 from training. Duration: 0.911125 2023-10-03 01:26:12,535 WARNING [train.py:1204] (2/4) Exclude cut with ID _14459_210_2228_1_1531443641946_7516700_792-287234-0_sp1.1 from training. Duration: 0.92725 2023-10-03 01:26:12,566 WARNING [train.py:1204] (2/4) Exclude cut with ID _19708_210_9206_1_1532136650733_4238364_347-65240-0_sp1.1 from training. Duration: 0.8818125 2023-10-03 01:26:17,373 WARNING [train.py:1204] (2/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_70-27732-0 from training. Duration: 0.94 2023-10-03 01:26:23,645 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_397-132565-0 from training. Duration: 0.99 2023-10-03 01:26:23,655 WARNING [train.py:1204] (2/4) Exclude cut with ID _49728_210_12651_1_1534041034294_6788150_624-195301-0 from training. Duration: 0.85 2023-10-03 01:26:24,875 WARNING [train.py:1204] (2/4) Exclude cut with ID _55429_210_10626_1_1534467588527_7174143_377-169391-0_sp1.1 from training. Duration: 0.87275 2023-10-03 01:26:26,281 WARNING [train.py:1204] (2/4) Exclude cut with ID _49376_210_5986_1_1533985278732_3370740_354-344695-0_sp1.1 from training. Duration: 0.9 2023-10-03 01:26:31,232 WARNING [train.py:1204] (2/4) Exclude cut with ID _4697_210_2069_1_1526815801953_3823098_276-96593-0_sp1.1 from training. Duration: 0.7 2023-10-03 01:26:31,240 WARNING [train.py:1204] (2/4) Exclude cut with ID _15012_210_1968_1_1530856820429_5095350_688-178849-0_sp0.9 from training. Duration: 0.711125 2023-10-03 01:26:32,572 WARNING [train.py:1204] (2/4) Exclude cut with ID _25565_210_12392_1_1532423009382_3952098_372-69023-0_sp1.1 from training. Duration: 0.936375 2023-10-03 01:26:32,607 WARNING [train.py:1204] (2/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_61-327062-0_sp1.1 from training. Duration: 0.736375 2023-10-03 01:26:32,635 WARNING [train.py:1204] (2/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_215-141059-0 from training. Duration: 0.62 2023-10-03 01:26:35,396 WARNING [train.py:1204] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_6-226750-0_sp1.1 from training. Duration: 0.8545625 2023-10-03 01:26:38,053 WARNING [train.py:1204] (2/4) Exclude cut with ID _14348_210_8341_1_1530599434996_3778640_240-232370-0_sp1.1 from training. Duration: 0.763625 2023-10-03 01:26:41,357 WARNING [train.py:1204] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_468-189058-0 from training. Duration: 0.96 2023-10-03 01:26:49,252 INFO [train.py:1046] (2/4) Epoch 31, batch 4000, loss[loss=0.1355, simple_loss=0.2092, pruned_loss=0.03094, over 24327.00 frames. ], tot_loss[loss=0.163, simple_loss=0.2412, pruned_loss=0.0424, over 4698346.18 frames. ], batch size: 56, lr: 3.28e-03, grad_scale: 16.0 2023-10-03 01:26:49,546 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.feed_forward1.out_proj.dropout_p, batch_count=1089093.3333333333, ans=0.1 2023-10-03 01:26:50,652 WARNING [train.py:1204] (2/4) Exclude cut with ID _21987_210_5854_1_1531821500482_3637038_247-93340-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 01:26:56,336 WARNING [train.py:1204] (2/4) Exclude cut with ID _66868_210_9527_1_1535509287954_4561140_436-191602-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 01:27:01,031 WARNING [train.py:1204] (2/4) Exclude cut with ID _18895_210_6228_1_1531357274234_3784930_394-58203-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 01:27:02,333 WARNING [train.py:1204] (2/4) Exclude cut with ID _58605_210_12210_1_1534723195939_3904259_258-281498-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 01:27:03,612 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_33348_210_5632_1_1533086912_3324030_245-292752-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 01:27:03,647 WARNING [train.py:1204] (2/4) Exclude cut with ID _67831_210_12392_1_1535612464202_3092612_346-2148-0 from training. Duration: 0.93 2023-10-03 01:27:04,967 WARNING [train.py:1204] (2/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_369-222060-0_sp0.9 from training. Duration: 0.788875 2023-10-03 01:27:05,019 WARNING [train.py:1204] (2/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_245-174553-0 from training. Duration: 0.88 2023-10-03 01:27:05,024 WARNING [train.py:1204] (2/4) Exclude cut with ID _3541_210_2477_1_1526475408709_3263973_231-104736-0_sp1.1 from training. Duration: 0.7090625 2023-10-03 01:27:05,039 WARNING [train.py:1204] (2/4) Exclude cut with ID _44585_210_18628_1_1533605350387_4518199_280-73534-0 from training. Duration: 0.89 2023-10-03 01:27:06,432 WARNING [train.py:1204] (2/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_22-201476-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 01:27:09,288 WARNING [train.py:1204] (2/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_325-62802-0_sp0.9 from training. Duration: 0.9666875 2023-10-03 01:27:09,300 WARNING [train.py:1204] (2/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_199-274062-0_sp1.1 from training. Duration: 0.97275 2023-10-03 01:27:09,303 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_8-319957-0_sp1.1 from training. Duration: 0.87275 2023-10-03 01:27:09,339 WARNING [train.py:1204] (2/4) Exclude cut with ID _29343_210_4381_1_1532602757752_3674109_224-349726-0_sp1.1 from training. Duration: 0.936375 2023-10-03 01:27:09,344 WARNING [train.py:1204] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_520-126729-0_sp1.1 from training. Duration: 0.636375 2023-10-03 01:27:09,518 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.conv_module1.balancer2.prob, batch_count=1089160.0, ans=0.125 2023-10-03 01:27:11,328 WARNING [train.py:1204] (2/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_239-22076-0_sp1.1 from training. Duration: 0.77275 2023-10-03 01:27:12,216 INFO [scaling.py:1118] (2/4) WithLoss: name=encoder.encoders.5.encoder.layers.1.self_attn_weights, loss-sum=0.000e+00 2023-10-03 01:27:13,139 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9012W0011-501146 from training. Duration: 0.896 2023-10-03 01:27:14,492 WARNING [train.py:1204] (2/4) Exclude cut with ID _29857_210_20034_1_1532395886753_3798160_279-185801-0_sp1.1 from training. Duration: 0.82725 2023-10-03 01:27:14,558 WARNING [train.py:1204] (2/4) Exclude cut with ID _43552_210_8913_1_1533726031059_3523429_261-223933-0_sp1.1 from training. Duration: 0.92725 2023-10-03 01:27:18,876 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9029W0025-509426 from training. Duration: 0.896 2023-10-03 01:27:20,181 WARNING [train.py:1204] (2/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_337-181502-0_sp1.1 from training. Duration: 0.5909375 2023-10-03 01:27:20,203 WARNING [train.py:1204] (2/4) Exclude cut with ID _68635_210_9317_1_1535770813410_4315870_159-332599-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 01:27:25,895 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.attention_skip_rate, batch_count=1089226.6666666667, ans=0.0 2023-10-03 01:27:27,061 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_161-276996-0 from training. Duration: 0.95 2023-10-03 01:27:27,108 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7088_210_1874_1_1527939776_1101113_127-68870-0_sp1.1 from training. Duration: 0.936375 2023-10-03 01:27:28,565 WARNING [train.py:1204] (2/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_322-217220-0_sp1.1 from training. Duration: 0.7818125 2023-10-03 01:27:29,900 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9072W0002-530636 from training. Duration: 0.768 2023-10-03 01:27:31,768 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_340-336408-0_sp1.1 from training. Duration: 0.8454375 2023-10-03 01:27:31,839 WARNING [train.py:1204] (2/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_395-37222-0 from training. Duration: 0.76 2023-10-03 01:27:31,842 WARNING [train.py:1204] (2/4) Exclude cut with ID _64099_210_17301_1_1535526977656_1578188_159-209156-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 01:27:33,186 WARNING [train.py:1204] (2/4) Exclude cut with ID _53950_210_5816_1_1534417264919_3862951_28-29681-0_sp1.1 from training. Duration: 0.92725 2023-10-03 01:27:33,272 WARNING [train.py:1204] (2/4) Exclude cut with ID _4681_210_3525_1_1527250689464_120889_15-126760-0_sp1.1 from training. Duration: 0.736375 2023-10-03 01:27:34,725 WARNING [train.py:1204] (2/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_242-3472-0_sp1.1 from training. Duration: 0.663625 2023-10-03 01:27:34,898 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.feed_forward2.hidden_balancer.prob, batch_count=1089293.3333333333, ans=0.125 2023-10-03 01:27:36,012 WARNING [train.py:1204] (2/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_436-186311-0_sp0.9 from training. Duration: 0.72225 2023-10-03 01:27:36,028 WARNING [train.py:1204] (2/4) Exclude cut with ID _3675_210_4557_1_1526639934408_4182089_452-172147-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 01:27:37,396 WARNING [train.py:1204] (2/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_80-147301-0 from training. Duration: 0.84 2023-10-03 01:27:37,418 WARNING [train.py:1204] (2/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_351-107588-0_sp1.1 from training. Duration: 0.92725 2023-10-03 01:27:40,036 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9111W0002-550781 from training. Duration: 0.896 2023-10-03 01:27:44,986 WARNING [train.py:1204] (2/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_465-156500-0_sp1.1 from training. Duration: 0.6545625 2023-10-03 01:27:47,814 WARNING [train.py:1204] (2/4) Exclude cut with ID _13938_210_9988_1_1530518718978_3693960_304-112784-0_sp1.1 from training. Duration: 0.4181875 2023-10-03 01:27:49,893 WARNING [train.py:1204] (2/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_202-67017-0_sp1.1 from training. Duration: 0.7454375 2023-10-03 01:27:51,164 WARNING [train.py:1204] (2/4) Exclude cut with ID _58414_210_8254_1_1535421609162_7151182_448-4358-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 01:27:51,236 WARNING [train.py:1204] (2/4) Exclude cut with ID _66840_210_5219_1_1535452168426_4196349_203-243234-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 01:27:52,711 WARNING [train.py:1204] (2/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_201-213513-0_sp1.1 from training. Duration: 0.97275 2023-10-03 01:27:58,296 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6291_210_3273_1_1527683649_1078498_117-187560-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 01:27:59,512 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.615e+02 1.852e+02 2.002e+02 2.226e+02 3.079e+02, threshold=4.003e+02, percent-clipped=0.0 2023-10-03 01:27:59,699 WARNING [train.py:1204] (2/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_135-174080-0_sp0.9 from training. Duration: 0.588875 2023-10-03 01:28:01,215 WARNING [train.py:1204] (2/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_45-265684-0 from training. Duration: 0.72 2023-10-03 01:28:02,494 INFO [train.py:1046] (2/4) Epoch 31, batch 4050, loss[loss=0.1692, simple_loss=0.2467, pruned_loss=0.04588, over 23315.00 frames. ], tot_loss[loss=0.1636, simple_loss=0.242, pruned_loss=0.0426, over 4705685.12 frames. ], batch size: 105, lr: 3.28e-03, grad_scale: 16.0 2023-10-03 01:28:04,590 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_435-313305-0_sp1.1 from training. Duration: 0.6454375 2023-10-03 01:28:04,615 WARNING [train.py:1204] (2/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_235-307337-0_sp1.1 from training. Duration: 0.8545625 2023-10-03 01:28:05,955 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7762_210_1794_1_1528199508_4057408_419-125009-0_sp0.9 from training. Duration: 0.92225 2023-10-03 01:28:06,042 WARNING [train.py:1204] (2/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_215-141059-0_sp1.1 from training. Duration: 0.563625 2023-10-03 01:28:07,418 WARNING [train.py:1204] (2/4) Exclude cut with ID _27013_210_1389_1_1532437173240_3252649_222-125573-0_sp1.1 from training. Duration: 0.8909375 2023-10-03 01:28:09,790 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.1.encoder.layers.0.conv_module2.whiten, num_groups=1, num_channels=256, metric=6.61 vs. limit=15.0 2023-10-03 01:28:11,597 WARNING [train.py:1204] (2/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_126-274694-0_sp1.1 from training. Duration: 0.8909375 2023-10-03 01:28:13,039 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder_embed.convnext.layerdrop_rate, batch_count=1089426.6666666667, ans=0.015 2023-10-03 01:28:14,764 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_2592_210_2290_1_1525697025_4882925_157-70512-0_sp1.1 from training. Duration: 0.82725 2023-10-03 01:28:14,829 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_292-265928-0_sp0.9 from training. Duration: 0.5666875 2023-10-03 01:28:18,125 WARNING [train.py:1204] (2/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_1211-226066-0_sp1.1 from training. Duration: 0.6909375 2023-10-03 01:28:18,182 WARNING [train.py:1204] (2/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_394-265747-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 01:28:22,735 WARNING [train.py:1204] (2/4) Exclude cut with ID _48782_210_12658_1_1534467701544_2300422_239-67139-0_sp1.1 from training. Duration: 0.97275 2023-10-03 01:28:24,053 WARNING [train.py:1204] (2/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_329-113173-0_sp1.1 from training. Duration: 0.563625 2023-10-03 01:28:24,329 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.ff3_skip_rate, batch_count=1089493.3333333333, ans=0.0 2023-10-03 01:28:26,225 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.0.layers.1.whiten, num_groups=1, num_channels=192, metric=4.02 vs. limit=12.0 2023-10-03 01:28:26,947 WARNING [train.py:1204] (2/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_81-75849-0_sp0.9 from training. Duration: 0.4333125 2023-10-03 01:28:28,360 WARNING [train.py:1204] (2/4) Exclude cut with ID _9235_210_5219_1_1528891154253_8046120_1003-101638-0 from training. Duration: 0.73 2023-10-03 01:28:29,696 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9309W0189-640316 from training. Duration: 0.64 2023-10-03 01:28:31,228 WARNING [train.py:1204] (2/4) Exclude cut with ID _45329_210_7005_1_1534157951211_6774339_503-287883-0_sp1.1 from training. Duration: 0.8 2023-10-03 01:28:38,604 WARNING [train.py:1204] (2/4) Exclude cut with ID _8833_210_5219_1_1528592665210_7553059_611-80313-0 from training. Duration: 0.64 2023-10-03 01:28:39,996 WARNING [train.py:1204] (2/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_335-106626-0_sp1.1 from training. Duration: 0.963625 2023-10-03 01:28:42,792 WARNING [train.py:1204] (2/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_237-44332-0_sp1.1 from training. Duration: 0.8545625 2023-10-03 01:28:45,718 WARNING [train.py:1204] (2/4) Exclude cut with ID _46952_210_5986_1_1534230010357_3107379_178-147341-0_sp1.1 from training. Duration: 0.97275 2023-10-03 01:28:45,775 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_13091_210_7925_1_1530609447_8987354_946-154426-0_sp1.1 from training. Duration: 0.936375 2023-10-03 01:28:45,791 WARNING [train.py:1204] (2/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_326-17061-0_sp1.1 from training. Duration: 0.8545625 2023-10-03 01:28:49,392 WARNING [train.py:1204] (2/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_38-169920-0_sp1.1 from training. Duration: 0.82725 2023-10-03 01:28:50,894 WARNING [train.py:1204] (2/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_283-220372-0 from training. Duration: 0.56 2023-10-03 01:28:50,902 WARNING [train.py:1204] (2/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_252-216674-0_sp1.1 from training. Duration: 0.4909375 2023-10-03 01:28:50,997 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder_embed.conv.8.prob, batch_count=1089626.6666666667, ans=0.125 2023-10-03 01:28:52,891 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_202-91290-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 01:28:55,509 WARNING [train.py:1204] (2/4) Exclude cut with ID _41314_210_12651_1_1533459476543_3766460_80-74184-0 from training. Duration: 0.83 2023-10-03 01:28:59,498 WARNING [train.py:1204] (2/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_411-271358-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 01:29:06,526 WARNING [train.py:1204] (2/4) Exclude cut with ID _68777_210_4353_1_1535810390731_3117904_129-166012-0 from training. Duration: 0.85 2023-10-03 01:29:06,605 WARNING [train.py:1204] (2/4) Exclude cut with ID _44982_210_1511_1_1534312820108_3726029_410-109759-0_sp1.1 from training. Duration: 0.963625 2023-10-03 01:29:06,610 WARNING [train.py:1204] (2/4) Exclude cut with ID _14224_210_4381_1_1530529062037_4264010_364-22817-0_sp1.1 from training. Duration: 0.6454375 2023-10-03 01:29:09,844 WARNING [train.py:1204] (2/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_89-242335-0 from training. Duration: 0.95 2023-10-03 01:29:09,852 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_12868_210_6753_1_1530702479_3610564_151-325626-0 from training. Duration: 0.97 2023-10-03 01:29:09,853 WARNING [train.py:1204] (2/4) Exclude cut with ID _6275_210_2084_1_1528110062213_7669049_786-264332-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 01:29:12,513 WARNING [train.py:1204] (2/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_35-153886-0_sp1.1 from training. Duration: 0.763625 2023-10-03 01:29:13,887 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_24007_210_1794_1_1531918925_1663875_116-100916-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 01:29:15,192 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_162-262717-0_sp1.1 from training. Duration: 0.7545625 2023-10-03 01:29:16,973 INFO [train.py:1046] (2/4) Epoch 31, batch 4100, loss[loss=0.1518, simple_loss=0.2257, pruned_loss=0.03895, over 23595.00 frames. ], tot_loss[loss=0.1645, simple_loss=0.2431, pruned_loss=0.04295, over 4709078.87 frames. ], batch size: 120, lr: 3.28e-03, grad_scale: 16.0 2023-10-03 01:29:21,664 WARNING [train.py:1204] (2/4) Exclude cut with ID _8263_210_1664_1_1528438953494_294100_40-204731-0 from training. Duration: 0.5 2023-10-03 01:29:21,884 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.bypass.skip_rate, batch_count=1089760.0, ans=0.04949747468305833 2023-10-03 01:29:23,013 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_12868_210_6753_1_1530702479_3610564_416-293746-0 from training. Duration: 0.9 2023-10-03 01:29:25,118 WARNING [train.py:1204] (2/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_605-219732-0 from training. Duration: 0.94 2023-10-03 01:29:25,421 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.conv_module2.balancer2.min_abs, batch_count=1089760.0, ans=0.5 2023-10-03 01:29:26,466 WARNING [train.py:1204] (2/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_454-123002-0 from training. Duration: 0.69 2023-10-03 01:29:26,472 WARNING [train.py:1204] (2/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_25-304273-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 01:29:26,521 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6860_210_4852_1_1527917457_3833327_451-39702-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 01:29:26,556 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_304-16505-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 01:29:26,568 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6291_210_3273_1_1527683649_1078498_4-266348-0_sp1.1 from training. Duration: 0.7090625 2023-10-03 01:29:26,636 WARNING [train.py:1204] (2/4) Exclude cut with ID ID0149W0001-753701 from training. Duration: 0.989 2023-10-03 01:29:29,389 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_41251_210_15368_1_1533429767_4820695_394-55393-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 01:29:32,016 WARNING [train.py:1204] (2/4) Exclude cut with ID _8793_210_6310_1_1528542985139_2310009_121-179782-0_sp1.1 from training. Duration: 0.72725 2023-10-03 01:29:32,031 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_2729_210_3528_1_1525863505_3882214_421-73047-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 01:29:32,102 WARNING [train.py:1204] (2/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_278-154492-0_sp0.9 from training. Duration: 0.8444375 2023-10-03 01:29:36,146 WARNING [train.py:1204] (2/4) Exclude cut with ID _14170_210_5219_1_1530525949868_4469650_412-317077-0_sp0.9 from training. Duration: 0.8333125 2023-10-03 01:29:36,361 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.balancer1.prob, batch_count=1089826.6666666667, ans=0.125 2023-10-03 01:29:37,459 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_727-138317-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 01:29:37,502 WARNING [train.py:1204] (2/4) Exclude cut with ID _25808_210_13292_1_1532343514562_7366109_345-335048-0_sp1.1 from training. Duration: 0.92725 2023-10-03 01:29:37,522 WARNING [train.py:1204] (2/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_506-23200-0 from training. Duration: 0.97 2023-10-03 01:29:37,674 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.balancer2.prob, batch_count=1089826.6666666667, ans=0.125 2023-10-03 01:29:40,536 WARNING [train.py:1204] (2/4) Exclude cut with ID _44718_210_5816_1_1534050051869_6469030_643-245187-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 01:29:40,540 WARNING [train.py:1204] (2/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_42-186162-0_sp1.1 from training. Duration: 0.836375 2023-10-03 01:29:40,552 WARNING [train.py:1204] (2/4) Exclude cut with ID _3231_210_2106_1_1526205864934_6784220_171-256004-0_sp1.1 from training. Duration: 0.7818125 2023-10-03 01:29:40,577 WARNING [train.py:1204] (2/4) Exclude cut with ID _25914_210_6400_1_1532141317058_644740_43-248478-0_sp1.1 from training. Duration: 0.936375 2023-10-03 01:29:41,927 WARNING [train.py:1204] (2/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_577-287150-0 from training. Duration: 0.82 2023-10-03 01:29:42,667 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.3.encoder.layers.1.self_attn2.whiten, num_groups=1, num_channels=512, metric=14.23 vs. limit=22.5 2023-10-03 01:29:43,422 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_46015_210_7234_1_1533863319_3087837_376-276068-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 01:29:43,608 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.conv_skip_rate, batch_count=1089826.6666666667, ans=0.0 2023-10-03 01:29:44,836 WARNING [train.py:1204] (2/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_51-200220-0 from training. Duration: 0.56 2023-10-03 01:29:46,747 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_452-96101-0_sp1.1 from training. Duration: 0.963625 2023-10-03 01:29:49,888 WARNING [train.py:1204] (2/4) Exclude cut with ID _13252_210_1925_1_1530671372047_3336909_263-168388-0_sp1.1 from training. Duration: 0.7818125 2023-10-03 01:29:49,889 WARNING [train.py:1204] (2/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_469-94673-0 from training. Duration: 0.79 2023-10-03 01:29:51,310 WARNING [train.py:1204] (2/4) Exclude cut with ID _14922_210_6228_1_1530707435273_3693310_303-228738-0_sp1.1 from training. Duration: 0.763625 2023-10-03 01:29:52,558 WARNING [train.py:1204] (2/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_262-250826-0_sp1.1 from training. Duration: 0.9 2023-10-03 01:29:52,592 WARNING [train.py:1204] (2/4) Exclude cut with ID _68777_210_4353_1_1535810390731_3117904_129-166012-0_sp1.1 from training. Duration: 0.77275 2023-10-03 01:29:55,339 WARNING [train.py:1204] (2/4) Exclude cut with ID _58895_210_8750_1_1535184081320_3649089_265-323810-0 from training. Duration: 0.96 2023-10-03 01:29:57,251 WARNING [train.py:1204] (2/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_120-76843-0_sp0.9 from training. Duration: 0.611125 2023-10-03 01:29:58,683 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7544_210_6753_1_1528587722_3576686_295-258479-0_sp0.9 from training. Duration: 0.9333125 2023-10-03 01:30:01,361 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7046_210_6753_1_1527895337_6920917_783-278109-0 from training. Duration: 0.77 2023-10-03 01:30:01,412 WARNING [train.py:1204] (2/4) Exclude cut with ID _42326_210_7770_1_1533814282531_3707269_113-308323-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 01:30:01,445 WARNING [train.py:1204] (2/4) Exclude cut with ID _7852_210_5219_1_1528286471316_8139400_242-105336-0_sp0.9 from training. Duration: 0.8 2023-10-03 01:30:04,262 WARNING [train.py:1204] (2/4) Exclude cut with ID _56235_210_10640_1_1534557335235_4161099_114-277905-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 01:30:05,914 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.feed_forward1.hidden_balancer.prob, batch_count=1089960.0, ans=0.125 2023-10-03 01:30:08,345 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_13118_210_7925_1_1530755612_7608297_42-66683-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 01:30:11,924 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=1089960.0, ans=0.1 2023-10-03 01:30:13,147 WARNING [train.py:1204] (2/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_901-79438-0_sp1.1 from training. Duration: 0.8545625 2023-10-03 01:30:13,190 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_30280_210_1881_1_1532872779_3660139_211-182194-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 01:30:17,531 WARNING [train.py:1204] (2/4) Exclude cut with ID _14898_210_9206_1_1530766812377_4177190_621-45732-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 01:30:17,750 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.conv_module2.balancer2.prob, batch_count=1090026.6666666667, ans=0.125 2023-10-03 01:30:19,483 WARNING [train.py:1204] (2/4) Exclude cut with ID _35350_210_16040_1_1533040191183_4075299_224-133775-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 01:30:22,899 WARNING [train.py:1204] (2/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_147-293311-0_sp1.1 from training. Duration: 0.8545625 2023-10-03 01:30:23,955 INFO [scaling.py:1022] (2/4) Whitening: name=encoder_embed.convnext.out_whiten, num_groups=1, num_channels=128, metric=4.58 vs. limit=5.0 2023-10-03 01:30:25,513 WARNING [train.py:1204] (2/4) Exclude cut with ID _12302_210_5219_1_1529924446466_8107280_835-279754-0_sp1.1 from training. Duration: 0.8181875 2023-10-03 01:30:28,708 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.505e+02 1.780e+02 1.987e+02 2.300e+02 3.252e+02, threshold=3.974e+02, percent-clipped=0.0 2023-10-03 01:30:28,840 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8453_210_4211_1_1528502940_8562730_347-330673-0_sp0.9 from training. Duration: 0.8 2023-10-03 01:30:30,200 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_213-103282-0_sp0.9 from training. Duration: 0.8666875 2023-10-03 01:30:30,493 INFO [scaling.py:1118] (2/4) WithLoss: name=encoder.encoders.4.encoder.layers.0.self_attn_weights, loss-sum=0.000e+00 2023-10-03 01:30:31,465 INFO [train.py:1046] (2/4) Epoch 31, batch 4150, loss[loss=0.1801, simple_loss=0.2627, pruned_loss=0.04872, over 24019.00 frames. ], tot_loss[loss=0.1641, simple_loss=0.2428, pruned_loss=0.04274, over 4715791.08 frames. ], batch size: 80, lr: 3.28e-03, grad_scale: 16.0 2023-10-03 01:30:31,530 WARNING [train.py:1204] (2/4) Exclude cut with ID _49728_210_12651_1_1534041034294_6788150_66-43496-0_sp1.1 from training. Duration: 0.97275 2023-10-03 01:30:31,532 WARNING [train.py:1204] (2/4) Exclude cut with ID _19023_210_4353_1_1531378892552_5820780_28-36615-0_sp1.1 from training. Duration: 0.8818125 2023-10-03 01:30:34,344 WARNING [train.py:1204] (2/4) Exclude cut with ID _14349_210_8341_1_1530615710818_3442470_115-285975-0 from training. Duration: 0.86 2023-10-03 01:30:34,375 WARNING [train.py:1204] (2/4) Exclude cut with ID _12734_210_8341_1_1530183614296_3891929_236-47464-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 01:30:35,665 WARNING [train.py:1204] (2/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_177-236809-0 from training. Duration: 0.49 2023-10-03 01:30:36,973 WARNING [train.py:1204] (2/4) Exclude cut with ID _13678_210_7497_1_1530530069226_3463170_371-3221-0 from training. Duration: 0.9 2023-10-03 01:30:36,988 WARNING [train.py:1204] (2/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_48-338976-0 from training. Duration: 0.89 2023-10-03 01:30:38,729 WARNING [train.py:1204] (2/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_1167-309744-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 01:30:42,896 WARNING [train.py:1204] (2/4) Exclude cut with ID _30327_210_16049_1_1532484161295_3924169_401-70819-0_sp0.9 from training. Duration: 0.9555625 2023-10-03 01:30:42,905 WARNING [train.py:1204] (2/4) Exclude cut with ID _55134_210_21982_1_1534590028377_3882170_346-91022-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 01:30:45,852 WARNING [train.py:1204] (2/4) Exclude cut with ID _14459_210_2228_1_1531443641946_7516700_174-118801-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 01:30:45,917 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_269-278006-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 01:30:47,321 WARNING [train.py:1204] (2/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_390-339137-0_sp0.9 from training. Duration: 0.62225 2023-10-03 01:30:50,562 WARNING [train.py:1204] (2/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_253-308377-0_sp1.1 from training. Duration: 0.4454375 2023-10-03 01:30:51,785 WARNING [train.py:1204] (2/4) Exclude cut with ID _23825_210_13449_1_1531915313526_3497150_240-271367-0_sp1.1 from training. Duration: 0.8818125 2023-10-03 01:30:51,871 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1015-343884-0_sp0.9 from training. Duration: 0.688875 2023-10-03 01:30:53,290 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.1.encoder.layers.1.self_attn_weights.whiten_keys, num_groups=4, num_channels=128, metric=2.67 vs. limit=6.0 2023-10-03 01:30:56,617 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.bypass.scale_min, batch_count=1090160.0, ans=0.2 2023-10-03 01:30:57,779 WARNING [train.py:1204] (2/4) Exclude cut with ID _23514_210_1389_1_1531832139785_3031439_335-48210-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 01:31:02,337 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_309-65177-0_sp1.1 from training. Duration: 0.8 2023-10-03 01:31:03,676 WARNING [train.py:1204] (2/4) Exclude cut with ID _14368_210_4220_1_1530532714870_3595529_231-137892-0 from training. Duration: 0.99 2023-10-03 01:31:05,166 WARNING [train.py:1204] (2/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_57-270037-0 from training. Duration: 0.77 2023-10-03 01:31:05,170 WARNING [train.py:1204] (2/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_465-214726-0_sp1.1 from training. Duration: 0.7818125 2023-10-03 01:31:06,663 WARNING [train.py:1204] (2/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_106-300985-0 from training. Duration: 0.9 2023-10-03 01:31:06,668 WARNING [train.py:1204] (2/4) Exclude cut with ID _14632_210_9988_1_1530691279392_3923200_161-129530-0_sp1.1 from training. Duration: 0.87275 2023-10-03 01:31:06,684 WARNING [train.py:1204] (2/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_514-88370-0_sp1.1 from training. Duration: 0.963625 2023-10-03 01:31:09,291 WARNING [train.py:1204] (2/4) Exclude cut with ID _56495_210_4234_1_1534748080334_4136219_305-334505-0_sp1.1 from training. Duration: 0.92725 2023-10-03 01:31:11,235 WARNING [train.py:1204] (2/4) Exclude cut with ID _42875_210_14856_1_1533704397487_4012433_61-264914-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 01:31:12,762 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.conv_module1.balancer2.prob, batch_count=1090226.6666666667, ans=0.125 2023-10-03 01:31:13,913 WARNING [train.py:1204] (2/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_91-148831-0 from training. Duration: 0.99 2023-10-03 01:31:16,667 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_435-313305-0_sp0.9 from training. Duration: 0.788875 2023-10-03 01:31:18,224 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.out_combiner.scale_min, batch_count=1090293.3333333333, ans=0.2 2023-10-03 01:31:19,409 WARNING [train.py:1204] (2/4) Exclude cut with ID _16744_210_5986_1_1531724405384_3907567_242-111316-0_sp1.1 from training. Duration: 0.8454375 2023-10-03 01:31:19,474 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_257-20733-0 from training. Duration: 0.76 2023-10-03 01:31:19,521 WARNING [train.py:1204] (2/4) Exclude cut with ID _8064_210_5399_1_1528372299291_4282973_4-91037-0_sp1.1 from training. Duration: 0.8 2023-10-03 01:31:20,827 WARNING [train.py:1204] (2/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_3-276701-0 from training. Duration: 0.91 2023-10-03 01:31:22,827 WARNING [train.py:1204] (2/4) Exclude cut with ID _3992_210_1747_1_1526644816031_3731060_36-147855-0_sp1.1 from training. Duration: 0.7090625 2023-10-03 01:31:24,217 WARNING [train.py:1204] (2/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_1296-298671-0_sp1.1 from training. Duration: 0.963625 2023-10-03 01:31:25,633 WARNING [train.py:1204] (2/4) Exclude cut with ID _68965_210_1883_1_1535698962272_3497490_309-334593-0_sp1.1 from training. Duration: 0.92725 2023-10-03 01:31:26,995 WARNING [train.py:1204] (2/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_128-296581-0 from training. Duration: 0.74 2023-10-03 01:31:26,995 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_17613_210_2151_1_1531137569_2366160_170-299079-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 01:31:26,997 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6287_210_3273_1_1527509506_2653716_182-199887-0_sp1.1 from training. Duration: 0.463625 2023-10-03 01:31:29,761 WARNING [train.py:1204] (2/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_80-135200-0_sp1.1 from training. Duration: 0.7181875 2023-10-03 01:31:33,117 WARNING [train.py:1204] (2/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_34-183685-0 from training. Duration: 0.98 2023-10-03 01:31:33,135 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8278_210_2550_1_1528637099_4220413_418-210155-0_sp1.1 from training. Duration: 0.92725 2023-10-03 01:31:33,138 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8853_210_3273_1_1528633899_818968_51-187685-0_sp0.9 from training. Duration: 0.7666875 2023-10-03 01:31:33,153 WARNING [train.py:1204] (2/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_332-221057-0_sp1.1 from training. Duration: 0.6181875 2023-10-03 01:31:34,466 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_2221_210_1988_1_1525597051_4935541_415-296882-0 from training. Duration: 0.77 2023-10-03 01:31:34,493 WARNING [train.py:1204] (2/4) Exclude cut with ID _33640_210_2655_1_1533117605479_4855469_770-278185-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 01:31:35,840 WARNING [train.py:1204] (2/4) Exclude cut with ID _5287_210_2084_1_1527418883040_3878670_333-48508-0_sp1.1 from training. Duration: 0.5454375 2023-10-03 01:31:35,906 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_43802_210_2290_1_1533516203_8829504_142-276839-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 01:31:37,305 WARNING [train.py:1204] (2/4) Exclude cut with ID _28394_210_8913_1_1532430049518_3650118_126-139371-0_sp1.1 from training. Duration: 0.92725 2023-10-03 01:31:37,319 WARNING [train.py:1204] (2/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_76-203867-0 from training. Duration: 0.72 2023-10-03 01:31:38,624 WARNING [train.py:1204] (2/4) Exclude cut with ID _6194_210_1534_1_1527511826160_3941194_14-282876-0_sp0.9 from training. Duration: 0.788875 2023-10-03 01:31:42,140 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.1.feed_forward2.hidden_balancer.prob, batch_count=1090360.0, ans=0.125 2023-10-03 01:31:43,410 WARNING [train.py:1204] (2/4) Exclude cut with ID _35355_210_16040_1_1533212988977_4016229_74-190076-0_sp1.1 from training. Duration: 0.72725 2023-10-03 01:31:44,712 INFO [train.py:1046] (2/4) Epoch 31, batch 4200, loss[loss=0.1517, simple_loss=0.2064, pruned_loss=0.04849, over 19467.00 frames. ], tot_loss[loss=0.1629, simple_loss=0.2409, pruned_loss=0.04242, over 4712255.96 frames. ], batch size: 388, lr: 3.28e-03, grad_scale: 16.0 2023-10-03 01:31:44,886 WARNING [train.py:1204] (2/4) Exclude cut with ID _33920_210_4220_1_1533257851500_3084399_198-334063-0 from training. Duration: 0.94 2023-10-03 01:31:47,670 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6291_210_3273_1_1527683649_1078498_4-266348-0_sp0.9 from training. Duration: 0.8666875 2023-10-03 01:31:49,142 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_23856_210_4238_1_1531994785_3087934_122-38075-0_sp1.1 from training. Duration: 0.936375 2023-10-03 01:31:51,107 WARNING [train.py:1204] (2/4) Exclude cut with ID _4697_210_2069_1_1526815801953_3823098_276-96593-0_sp0.9 from training. Duration: 0.8555625 2023-10-03 01:31:51,160 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_308-278734-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 01:31:52,289 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_5190_210_4557_1_1527245078_2965646_353-250603-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 01:31:54,227 WARNING [train.py:1204] (2/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_472-216663-0 from training. Duration: 0.92 2023-10-03 01:31:57,068 WARNING [train.py:1204] (2/4) Exclude cut with ID _7537_210_2084_1_1528502395431_3924359_14-312906-0 from training. Duration: 0.84 2023-10-03 01:31:58,353 WARNING [train.py:1204] (2/4) Exclude cut with ID _29003_210_19596_1_1532507391720_3894098_559-236660-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 01:31:59,862 WARNING [train.py:1204] (2/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_312-204798-0_sp1.1 from training. Duration: 0.8454375 2023-10-03 01:32:03,894 WARNING [train.py:1204] (2/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_17-309070-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 01:32:05,471 INFO [scaling.py:1118] (2/4) WithLoss: name=encoder.encoders.1.encoder.layers.0.self_attn_weights, loss-sum=0.000e+00 2023-10-03 01:32:07,213 WARNING [train.py:1204] (2/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_344-152526-0_sp0.9 from training. Duration: 0.588875 2023-10-03 01:32:08,594 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_41452_210_7291_1_1533437687_3941281_405-11559-0_sp1.1 from training. Duration: 0.82725 2023-10-03 01:32:08,634 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_48730_210_5399_1_1533885886_2212769_157-124052-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 01:32:09,966 WARNING [train.py:1204] (2/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_415-78792-0 from training. Duration: 0.81 2023-10-03 01:32:09,970 WARNING [train.py:1204] (2/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_345-340690-0_sp1.1 from training. Duration: 0.8454375 2023-10-03 01:32:10,091 WARNING [train.py:1204] (2/4) Exclude cut with ID _23825_210_13449_1_1531915313526_3497150_124-8138-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 01:32:10,132 WARNING [train.py:1204] (2/4) Exclude cut with ID _49258_210_7994_1_1534122017863_3738861_194-24864-0_sp1.1 from training. Duration: 0.936375 2023-10-03 01:32:11,970 WARNING [train.py:1204] (2/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_369-222060-0_sp1.1 from training. Duration: 0.6454375 2023-10-03 01:32:13,384 WARNING [train.py:1204] (2/4) Exclude cut with ID _12369_210_4381_1_1530356078581_4388520_480-13146-0_sp0.9 from training. Duration: 0.9444375 2023-10-03 01:32:14,841 WARNING [train.py:1204] (2/4) Exclude cut with ID _13253_210_1925_1_1530757775819_3577873_191-144977-0 from training. Duration: 0.78 2023-10-03 01:32:14,857 WARNING [train.py:1204] (2/4) Exclude cut with ID _30304_210_6319_1_1532476827261_7582060_1128-140266-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 01:32:15,035 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.conv_skip_rate, batch_count=1090560.0, ans=0.0 2023-10-03 01:32:20,334 WARNING [train.py:1204] (2/4) Exclude cut with ID _8772_210_1850_1_1528635595972_3664011_247-336448-0_sp0.9 from training. Duration: 0.82225 2023-10-03 01:32:20,421 WARNING [train.py:1204] (2/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_318-59097-0_sp1.1 from training. Duration: 0.6909375 2023-10-03 01:32:20,605 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.conv_skip_rate, batch_count=1090560.0, ans=0.0 2023-10-03 01:32:21,879 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.balancer1.prob, batch_count=1090560.0, ans=0.125 2023-10-03 01:32:23,057 WARNING [train.py:1204] (2/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_540-131164-0_sp1.1 from training. Duration: 0.836375 2023-10-03 01:32:24,986 WARNING [train.py:1204] (2/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_39-110836-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 01:32:27,767 WARNING [train.py:1204] (2/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_860-96198-0_sp1.1 from training. Duration: 0.663625 2023-10-03 01:32:27,781 WARNING [train.py:1204] (2/4) Exclude cut with ID _15133_210_6228_1_1530794512074_3080379_29-264458-0 from training. Duration: 0.58 2023-10-03 01:32:29,133 WARNING [train.py:1204] (2/4) Exclude cut with ID _55209_210_1531_1_1534675828996_4597160_797-348343-0_sp1.1 from training. Duration: 0.963625 2023-10-03 01:32:29,233 WARNING [train.py:1204] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_23-189234-0_sp1.1 from training. Duration: 0.7545625 2023-10-03 01:32:33,496 WARNING [train.py:1204] (2/4) Exclude cut with ID _5410_210_1388_1_1527397055795_1719020_39-112221-0_sp1.1 from training. Duration: 0.62725 2023-10-03 01:32:34,905 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_43802_210_2290_1_1533516203_8829504_100-348441-0_sp1.1 from training. Duration: 0.82725 2023-10-03 01:32:41,211 WARNING [train.py:1204] (2/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_143-86344-0_sp1.1 from training. Duration: 0.7 2023-10-03 01:32:41,539 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.ff2_skip_rate, batch_count=1090626.6666666667, ans=0.0 2023-10-03 01:32:43,122 WARNING [train.py:1204] (2/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_126-29104-0 from training. Duration: 0.85 2023-10-03 01:32:45,808 WARNING [train.py:1204] (2/4) Exclude cut with ID _39846_210_12651_1_1533603651992_1318030_138-31771-0_sp1.1 from training. Duration: 0.963625 2023-10-03 01:32:49,929 WARNING [train.py:1204] (2/4) Exclude cut with ID _2220_210_1819_1_1525509109898_3658680_50-99837-0_sp1.1 from training. Duration: 0.4909375 2023-10-03 01:32:51,332 WARNING [train.py:1204] (2/4) Exclude cut with ID _6274_210_2084_1_1527937583837_7494020_836-210550-0_sp1.1 from training. Duration: 0.97275 2023-10-03 01:32:53,275 WARNING [train.py:1204] (2/4) Exclude cut with ID _28323_210_16187_1_1532480395667_4100300_306-74561-0 from training. Duration: 0.94 2023-10-03 01:32:56,584 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.598e+02 1.884e+02 2.053e+02 2.259e+02 3.350e+02, threshold=4.107e+02, percent-clipped=0.0 2023-10-03 01:32:59,317 INFO [train.py:1046] (2/4) Epoch 31, batch 4250, loss[loss=0.1769, simple_loss=0.2567, pruned_loss=0.04851, over 24009.00 frames. ], tot_loss[loss=0.1617, simple_loss=0.239, pruned_loss=0.04218, over 4691467.17 frames. ], batch size: 86, lr: 3.28e-03, grad_scale: 16.0 2023-10-03 01:32:59,394 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_177-58005-0_sp1.1 from training. Duration: 0.563625 2023-10-03 01:32:59,623 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.feed_forward1.out_proj.dropout_p, batch_count=1090760.0, ans=0.1 2023-10-03 01:33:01,363 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.4.encoder.layers.1.whiten, num_groups=1, num_channels=384, metric=4.78 vs. limit=12.0 2023-10-03 01:33:03,597 WARNING [train.py:1204] (2/4) Exclude cut with ID _54871_210_22356_1_1534665798253_3856359_19-342632-0_sp1.1 from training. Duration: 0.82725 2023-10-03 01:33:03,610 WARNING [train.py:1204] (2/4) Exclude cut with ID _35355_210_16040_1_1533212988977_4016229_74-190076-0_sp0.9 from training. Duration: 0.888875 2023-10-03 01:33:04,138 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.4.encoder.layers.0.feed_forward1.out_whiten, num_groups=1, num_channels=384, metric=5.42 vs. limit=15.0 2023-10-03 01:33:06,365 WARNING [train.py:1204] (2/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_285-97786-0_sp1.1 from training. Duration: 0.97275 2023-10-03 01:33:07,075 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.3.encoder.layers.3.self_attn2.whiten, num_groups=1, num_channels=512, metric=11.69 vs. limit=22.5 2023-10-03 01:33:09,883 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.2.encoder.layers.2.nonlin_attention.whiten1, num_groups=1, num_channels=288, metric=4.47 vs. limit=10.0 2023-10-03 01:33:11,756 WARNING [train.py:1204] (2/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_337-181502-0_sp0.9 from training. Duration: 0.72225 2023-10-03 01:33:11,795 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_353-182855-0 from training. Duration: 0.96 2023-10-03 01:33:11,826 WARNING [train.py:1204] (2/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_409-327730-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 01:33:12,021 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.attention_skip_rate, batch_count=1090826.6666666667, ans=0.0 2023-10-03 01:33:15,040 WARNING [train.py:1204] (2/4) Exclude cut with ID _8030_210_7497_1_1528444886781_3531089_373-19403-0_sp1.1 from training. Duration: 0.97275 2023-10-03 01:33:18,401 WARNING [train.py:1204] (2/4) Exclude cut with ID _6789_210_1534_1_1527857937218_3118950_231-340237-0_sp1.1 from training. Duration: 0.936375 2023-10-03 01:33:21,830 WARNING [train.py:1204] (2/4) Exclude cut with ID _6493_210_1747_1_1528459210424_4012470_536-57832-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 01:33:21,843 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7046_210_6753_1_1527895337_6920917_756-116002-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 01:33:23,222 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_37152_210_9697_1_1533106573_3844188_29-239070-0_sp1.1 from training. Duration: 0.8909375 2023-10-03 01:33:23,224 WARNING [train.py:1204] (2/4) Exclude cut with ID _19023_210_4353_1_1531378892552_5820780_61-334539-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 01:33:24,620 WARNING [train.py:1204] (2/4) Exclude cut with ID _14170_210_5219_1_1530525949868_4469650_159-28488-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 01:33:24,777 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.conv_module2.balancer2.prob, batch_count=1090826.6666666667, ans=0.125 2023-10-03 01:33:25,998 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_40896_210_15710_1_1533433956_7100802_764-71272-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 01:33:26,084 WARNING [train.py:1204] (2/4) Exclude cut with ID _44902_210_16187_1_1533888006062_3792078_258-178274-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 01:33:28,648 WARNING [train.py:1204] (2/4) Exclude cut with ID _7537_210_2084_1_1528502395431_3924359_14-312906-0_sp1.1 from training. Duration: 0.763625 2023-10-03 01:33:30,038 WARNING [train.py:1204] (2/4) Exclude cut with ID _49643_210_7989_1_1534640244224_3939031_674-116639-0_sp1.1 from training. Duration: 0.963625 2023-10-03 01:33:31,348 WARNING [train.py:1204] (2/4) Exclude cut with ID _22826_210_9994_1_1531908201522_3279169_415-161-0 from training. Duration: 0.98 2023-10-03 01:33:34,329 WARNING [train.py:1204] (2/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_264-348896-0 from training. Duration: 0.85 2023-10-03 01:33:34,345 WARNING [train.py:1204] (2/4) Exclude cut with ID _51899_210_20034_1_1534129254466_3669220_233-161492-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 01:33:35,708 WARNING [train.py:1204] (2/4) Exclude cut with ID _31255_210_13427_1_1532865619495_3815143_233-200023-0_sp1.1 from training. Duration: 0.936375 2023-10-03 01:33:35,731 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_23850_210_4238_1_1532232597_1632433_100-229879-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 01:33:37,087 WARNING [train.py:1204] (2/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_375-245677-0_sp0.9 from training. Duration: 0.9 2023-10-03 01:33:37,090 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_40223_210_6228_1_1533298404_4812267_355-317415-0_sp1.1 from training. Duration: 0.97275 2023-10-03 01:33:37,132 WARNING [train.py:1204] (2/4) Exclude cut with ID _9679_210_7497_1_1529129212906_4363929_235-81488-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 01:33:39,938 WARNING [train.py:1204] (2/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_172-215932-0_sp1.1 from training. Duration: 0.6 2023-10-03 01:33:41,331 WARNING [train.py:1204] (2/4) Exclude cut with ID _12369_210_4381_1_1530356078581_4388520_64-298917-0_sp1.1 from training. Duration: 0.8 2023-10-03 01:33:46,335 WARNING [train.py:1204] (2/4) Exclude cut with ID _38020_210_5986_1_1533112252259_7006089_131-53533-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 01:33:49,008 WARNING [train.py:1204] (2/4) Exclude cut with ID _42334_210_7770_1_1534206610396_3605880_392-48154-0_sp1.1 from training. Duration: 0.963625 2023-10-03 01:33:49,075 WARNING [train.py:1204] (2/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_337-181502-0 from training. Duration: 0.65 2023-10-03 01:33:49,082 WARNING [train.py:1204] (2/4) Exclude cut with ID _6267_210_2084_1_1527764658526_3894730_375-219887-0_sp0.9 from training. Duration: 0.7444375 2023-10-03 01:33:50,427 WARNING [train.py:1204] (2/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_60-146189-0 from training. Duration: 0.98 2023-10-03 01:33:52,397 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_3133_210_2087_1_1526106446_2324690_243-143119-0_sp1.1 from training. Duration: 0.7 2023-10-03 01:33:53,692 WARNING [train.py:1204] (2/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_1267-272693-0_sp0.9 from training. Duration: 0.811125 2023-10-03 01:33:55,128 WARNING [train.py:1204] (2/4) Exclude cut with ID _69884_210_9771_1_1535846392818_4166169_83-182645-0_sp1.1 from training. Duration: 0.97275 2023-10-03 01:33:55,154 WARNING [train.py:1204] (2/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_403-292260-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 01:33:57,916 WARNING [train.py:1204] (2/4) Exclude cut with ID _55429_210_10626_1_1534467588527_7174143_377-169391-0 from training. Duration: 0.96 2023-10-03 01:33:59,291 WARNING [train.py:1204] (2/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_494-199538-0_sp1.1 from training. Duration: 0.6818125 2023-10-03 01:33:59,350 WARNING [train.py:1204] (2/4) Exclude cut with ID _3067_210_2667_1_1526212974640_4503168_119-175776-0_sp1.1 from training. Duration: 0.67275 2023-10-03 01:34:01,752 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.1.encoder.layers.0.conv_module1.whiten, num_groups=1, num_channels=256, metric=11.08 vs. limit=15.0 2023-10-03 01:34:03,588 WARNING [train.py:1204] (2/4) Exclude cut with ID _3231_210_2106_1_1526205864934_6784220_183-303839-0_sp1.1 from training. Duration: 0.97275 2023-10-03 01:34:06,273 WARNING [train.py:1204] (2/4) Exclude cut with ID _18513_210_9206_1_1531618215223_3862620_165-38958-0_sp1.1 from training. Duration: 0.963625 2023-10-03 01:34:07,642 WARNING [train.py:1204] (2/4) Exclude cut with ID _41314_210_12651_1_1533459476543_3766460_87-272068-0_sp1.1 from training. Duration: 0.7545625 2023-10-03 01:34:08,979 WARNING [train.py:1204] (2/4) Exclude cut with ID _3541_210_2477_1_1526475408709_3263973_353-95-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 01:34:09,245 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.out_combiner.scale_min, batch_count=1091026.6666666667, ans=0.2 2023-10-03 01:34:11,556 INFO [train.py:1046] (2/4) Epoch 31, batch 4300, loss[loss=0.1626, simple_loss=0.2387, pruned_loss=0.04322, over 23427.00 frames. ], tot_loss[loss=0.1614, simple_loss=0.2386, pruned_loss=0.0421, over 4686341.82 frames. ], batch size: 285, lr: 3.28e-03, grad_scale: 16.0 2023-10-03 01:34:11,609 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_2009_210_2071_1_1525481134_4588937_73-302813-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 01:34:11,709 WARNING [train.py:1204] (2/4) Exclude cut with ID _64220_210_9527_1_1535250151772_4875259_88-95011-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 01:34:13,650 WARNING [train.py:1204] (2/4) Exclude cut with ID _44902_210_16187_1_1533888006062_3792078_160-12107-0_sp1.1 from training. Duration: 0.92725 2023-10-03 01:34:13,656 WARNING [train.py:1204] (2/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_547-5424-0 from training. Duration: 0.94 2023-10-03 01:34:14,977 WARNING [train.py:1204] (2/4) Exclude cut with ID _21783_210_11357_1_1531742458730_3762679_268-85656-0_sp1.1 from training. Duration: 0.936375 2023-10-03 01:34:21,180 WARNING [train.py:1204] (2/4) Exclude cut with ID _66210_210_12651_1_1535446812922_3365850_42-284985-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 01:34:21,202 WARNING [train.py:1204] (2/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_465-214726-0_sp0.9 from training. Duration: 0.9555625 2023-10-03 01:34:24,747 WARNING [train.py:1204] (2/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_50-95770-0_sp1.1 from training. Duration: 0.936375 2023-10-03 01:34:31,874 WARNING [train.py:1204] (2/4) Exclude cut with ID _30800_210_22382_1_1532499347904_4515959_269-77567-0_sp1.1 from training. Duration: 0.963625 2023-10-03 01:34:31,876 WARNING [train.py:1204] (2/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_7-111954-0 from training. Duration: 0.56 2023-10-03 01:34:33,219 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_43802_210_2290_1_1533516203_8829504_85-195240-0_sp0.9 from training. Duration: 0.9666875 2023-10-03 01:34:34,653 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_2822_210_2715_1_1526112586_1999587_165-36656-0_sp0.9 from training. Duration: 0.988875 2023-10-03 01:34:34,669 WARNING [train.py:1204] (2/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_181-78632-0_sp1.1 from training. Duration: 0.7090625 2023-10-03 01:34:34,694 WARNING [train.py:1204] (2/4) Exclude cut with ID IC0671W0044-331887 from training. Duration: 0.896 2023-10-03 01:34:37,533 WARNING [train.py:1204] (2/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_266-137104-0_sp1.1 from training. Duration: 0.5909375 2023-10-03 01:34:40,038 WARNING [train.py:1204] (2/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_137-116442-0_sp0.9 from training. Duration: 0.8666875 2023-10-03 01:34:42,883 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_23801_210_16223_1_1532066392_3294657_570-5439-0 from training. Duration: 0.97 2023-10-03 01:34:42,910 WARNING [train.py:1204] (2/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_29-280622-0_sp1.1 from training. Duration: 0.7454375 2023-10-03 01:34:44,516 WARNING [train.py:1204] (2/4) Exclude cut with ID 3557-8342-0013-54691-0 from training. Duration: 0.92 2023-10-03 01:34:47,330 WARNING [train.py:1204] (2/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_711-15688-0_sp1.1 from training. Duration: 0.3909375 2023-10-03 01:34:47,467 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_15126_210_6753_1_1530778677_2591185_120-277707-0_sp1.1 from training. Duration: 0.72725 2023-10-03 01:34:50,703 WARNING [train.py:1204] (2/4) Exclude cut with ID _14970_210_15709_1_1530775547361_1966965_8-169924-0_sp1.1 from training. Duration: 0.836375 2023-10-03 01:34:50,705 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_4692_210_3528_1_1526899755_4323254_107-44401-0_sp0.9 from training. Duration: 0.9555625 2023-10-03 01:34:52,087 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_3475_210_2028_1_1526193578_3622569_265-276625-0_sp0.9 from training. Duration: 0.8444375 2023-10-03 01:34:54,070 WARNING [train.py:1204] (2/4) Exclude cut with ID _55153_210_2253_1_1534384786677_3162096_148-79022-0_sp1.1 from training. Duration: 0.87275 2023-10-03 01:34:55,205 WARNING [train.py:1204] (2/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_292-36336-0_sp1.1 from training. Duration: 0.92725 2023-10-03 01:34:55,220 WARNING [train.py:1204] (2/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_329-82235-0 from training. Duration: 0.82 2023-10-03 01:34:56,634 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_11305_210_1794_1_1529750428_7178148_257-312870-0 from training. Duration: 0.88 2023-10-03 01:34:58,138 WARNING [train.py:1204] (2/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_436-4493-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 01:35:00,801 WARNING [train.py:1204] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_321-54129-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 01:35:00,808 WARNING [train.py:1204] (2/4) Exclude cut with ID _5287_210_2084_1_1527418883040_3878670_333-48508-0_sp0.9 from training. Duration: 0.6666875 2023-10-03 01:35:00,822 WARNING [train.py:1204] (2/4) Exclude cut with ID _26528_210_9070_1_1532570410842_3851310_244-278158-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 01:35:00,863 WARNING [train.py:1204] (2/4) Exclude cut with ID _46952_210_5986_1_1534230010357_3107379_232-344503-0_sp1.1 from training. Duration: 0.87275 2023-10-03 01:35:00,872 WARNING [train.py:1204] (2/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_43-206757-0 from training. Duration: 0.75 2023-10-03 01:35:00,874 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_4183_210_2087_1_1526729100_1869838_141-166600-0 from training. Duration: 0.95 2023-10-03 01:35:02,199 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_296-287678-0 from training. Duration: 0.95 2023-10-03 01:35:03,433 WARNING [train.py:1204] (2/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_265-60343-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 01:35:03,467 WARNING [train.py:1204] (2/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_332-256168-0 from training. Duration: 0.75 2023-10-03 01:35:03,504 WARNING [train.py:1204] (2/4) Exclude cut with ID _13679_210_7497_1_1530534293662_3355098_411-186088-0 from training. Duration: 0.94 2023-10-03 01:35:07,567 WARNING [train.py:1204] (2/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_458-126779-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 01:35:08,987 WARNING [train.py:1204] (2/4) Exclude cut with ID IC0803W0380-397572 from training. Duration: 0.032 2023-10-03 01:35:10,353 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_278-272612-0_sp1.1 from training. Duration: 0.663625 2023-10-03 01:35:11,776 WARNING [train.py:1204] (2/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_491-263101-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 01:35:11,780 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_53555_210_5632_1_1534595220_7232297_334-336778-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 01:35:15,046 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_50-3289-0 from training. Duration: 0.91 2023-10-03 01:35:15,106 WARNING [train.py:1204] (2/4) Exclude cut with ID _8699_210_2069_1_1528630346831_3960409_155-39766-0_sp0.9 from training. Duration: 0.8666875 2023-10-03 01:35:15,110 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_502-82145-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 01:35:15,159 WARNING [train.py:1204] (2/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_550-43132-0_sp1.1 from training. Duration: 0.97275 2023-10-03 01:35:16,328 WARNING [train.py:1204] (2/4) Exclude cut with ID _14998_210_5219_1_1530705582080_7474970_446-112117-0_sp1.1 from training. Duration: 0.8181875 2023-10-03 01:35:16,403 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_3691_210_3528_1_1526468169_4062053_355-334432-0_sp1.1 from training. Duration: 0.7909375 2023-10-03 01:35:19,076 WARNING [train.py:1204] (2/4) Exclude cut with ID _58605_210_12210_1_1534723195939_3904259_239-94720-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 01:35:22,330 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.517e+02 1.858e+02 1.987e+02 2.158e+02 3.474e+02, threshold=3.975e+02, percent-clipped=0.0 2023-10-03 01:35:22,393 WARNING [train.py:1204] (2/4) Exclude cut with ID _49376_210_5986_1_1533985278732_3370740_391-344072-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 01:35:24,308 WARNING [train.py:1204] (2/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_558-322014-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 01:35:24,337 WARNING [train.py:1204] (2/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_256-32662-0_sp1.1 from training. Duration: 0.8181875 2023-10-03 01:35:25,699 INFO [train.py:1046] (2/4) Epoch 31, batch 4350, loss[loss=0.1581, simple_loss=0.2344, pruned_loss=0.0409, over 23302.00 frames. ], tot_loss[loss=0.1621, simple_loss=0.2398, pruned_loss=0.04216, over 4690490.19 frames. ], batch size: 119, lr: 3.28e-03, grad_scale: 16.0 2023-10-03 01:35:29,060 WARNING [train.py:1204] (2/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_572-185150-0 from training. Duration: 0.97 2023-10-03 01:35:29,099 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_89-35748-0_sp0.9 from training. Duration: 0.87775 2023-10-03 01:35:33,395 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6291_210_3273_1_1527683649_1078498_88-66747-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 01:35:34,822 WARNING [train.py:1204] (2/4) Exclude cut with ID _19504_210_9994_1_1531648863242_3325220_357-237029-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 01:35:36,262 WARNING [train.py:1204] (2/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_302-2550-0_sp1.1 from training. Duration: 0.736375 2023-10-03 01:35:36,262 WARNING [train.py:1204] (2/4) Exclude cut with ID _9461_210_4557_1_1529060514726_3677451_59-194017-0_sp1.1 from training. Duration: 0.963625 2023-10-03 01:35:36,418 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.1.feed_forward1.out_proj.dropout_p, batch_count=1091426.6666666667, ans=0.1 2023-10-03 01:35:40,331 WARNING [train.py:1204] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_665-196155-0_sp1.1 from training. Duration: 0.6090625 2023-10-03 01:35:43,464 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_326-315007-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 01:35:46,186 WARNING [train.py:1204] (2/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_105-328174-0_sp1.1 from training. Duration: 0.6909375 2023-10-03 01:35:46,194 WARNING [train.py:1204] (2/4) Exclude cut with ID _56494_210_4220_1_1535284837625_3033169_106-263722-0_sp1.1 from training. Duration: 0.97275 2023-10-03 01:35:47,818 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.balancer1.prob, batch_count=1091493.3333333333, ans=0.125 2023-10-03 01:35:48,833 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.3.encoder.layers.3.self_attn2.whiten, num_groups=1, num_channels=512, metric=11.43 vs. limit=22.5 2023-10-03 01:35:49,429 WARNING [train.py:1204] (2/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_770-200430-0_sp0.9 from training. Duration: 0.988875 2023-10-03 01:35:51,457 WARNING [train.py:1204] (2/4) Exclude cut with ID _65640_210_6310_1_1535368201445_3173250_209-13008-0_sp1.1 from training. Duration: 0.9 2023-10-03 01:35:54,088 WARNING [train.py:1204] (2/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_212-149947-0_sp0.9 from training. Duration: 0.811125 2023-10-03 01:35:58,935 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.conv_module2.balancer1.max_abs, batch_count=1091560.0, ans=10.0 2023-10-03 01:36:00,194 WARNING [train.py:1204] (2/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_188-171673-0 from training. Duration: 0.58 2023-10-03 01:36:01,510 WARNING [train.py:1204] (2/4) Exclude cut with ID _58895_210_8750_1_1535184081320_3649089_286-281724-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 01:36:01,587 WARNING [train.py:1204] (2/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_16-254909-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 01:36:07,053 WARNING [train.py:1204] (2/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_52-95655-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 01:36:08,544 WARNING [train.py:1204] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_554-45617-0 from training. Duration: 0.99 2023-10-03 01:36:11,965 WARNING [train.py:1204] (2/4) Exclude cut with ID _14683_210_5219_1_1530698352608_3974859_473-344611-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 01:36:13,401 WARNING [train.py:1204] (2/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_17-25045-0_sp0.9 from training. Duration: 0.6555625 2023-10-03 01:36:17,591 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9072W0003-530637 from training. Duration: 0.768 2023-10-03 01:36:18,975 WARNING [train.py:1204] (2/4) Exclude cut with ID _38670_210_15379_1_1533448810659_4391059_76-74025-0_sp1.1 from training. Duration: 0.936375 2023-10-03 01:36:19,016 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6291_210_3273_1_1527683649_1078498_74-292608-0_sp0.9 from training. Duration: 0.8 2023-10-03 01:36:19,082 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9084W0025-536862 from training. Duration: 0.896 2023-10-03 01:36:20,398 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9087W0021-538392 from training. Duration: 0.768 2023-10-03 01:36:20,403 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7543_210_6753_1_1528543323_645515_56-163234-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 01:36:20,435 WARNING [train.py:1204] (2/4) Exclude cut with ID _45560_210_21982_1_1533812603951_3584999_44-332643-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 01:36:22,541 WARNING [train.py:1204] (2/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_1285-256622-0_sp1.1 from training. Duration: 0.763625 2023-10-03 01:36:23,705 WARNING [train.py:1204] (2/4) Exclude cut with ID _52781_210_16187_1_1534297729099_1118149_141-325632-0_sp1.1 from training. Duration: 0.936375 2023-10-03 01:36:23,770 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_40896_210_15710_1_1533433956_7100802_1051-137631-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 01:36:24,388 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.4.encoder.layers.0.conv_module1.whiten, num_groups=1, num_channels=384, metric=4.70 vs. limit=15.0 2023-10-03 01:36:25,142 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_13091_210_7925_1_1530609447_8987354_233-18333-0_sp1.1 from training. Duration: 0.97275 2023-10-03 01:36:28,288 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_34008_210_10431_1_1533345424_8183247_662-317331-0 from training. Duration: 0.94 2023-10-03 01:36:28,294 WARNING [train.py:1204] (2/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_291-248920-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 01:36:28,296 WARNING [train.py:1204] (2/4) Exclude cut with ID _65263_210_22356_1_1535763598691_3921037_274-288481-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 01:36:28,325 WARNING [train.py:1204] (2/4) Exclude cut with ID _5189_210_4557_1_1527248319428_3769229_233-168080-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 01:36:29,683 WARNING [train.py:1204] (2/4) Exclude cut with ID _54871_210_22356_1_1534665798253_3856359_135-184281-0 from training. Duration: 0.99 2023-10-03 01:36:29,810 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9123W0012-555972 from training. Duration: 0.896 2023-10-03 01:36:29,814 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9123W0118-556077 from training. Duration: 0.896 2023-10-03 01:36:31,071 WARNING [train.py:1204] (2/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_406-275649-0 from training. Duration: 0.42 2023-10-03 01:36:33,956 WARNING [train.py:1204] (2/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_309-13684-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 01:36:33,987 WARNING [train.py:1204] (2/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_129-337802-0_sp1.1 from training. Duration: 0.6545625 2023-10-03 01:36:34,007 WARNING [train.py:1204] (2/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_649-266825-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 01:36:35,282 WARNING [train.py:1204] (2/4) Exclude cut with ID _40519_210_13794_1_1533715228655_7337390_895-224742-0_sp1.1 from training. Duration: 0.92725 2023-10-03 01:36:36,651 WARNING [train.py:1204] (2/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_234-14174-0 from training. Duration: 0.82 2023-10-03 01:36:38,124 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9156W0009-572502 from training. Duration: 0.768 2023-10-03 01:36:38,141 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_424-245529-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 01:36:39,841 INFO [train.py:1046] (2/4) Epoch 31, batch 4400, loss[loss=0.1609, simple_loss=0.241, pruned_loss=0.04044, over 24308.00 frames. ], tot_loss[loss=0.1628, simple_loss=0.2407, pruned_loss=0.0424, over 4699878.57 frames. ], batch size: 61, lr: 3.28e-03, grad_scale: 32.0 2023-10-03 01:36:42,743 WARNING [train.py:1204] (2/4) Exclude cut with ID _26528_210_9070_1_1532570410842_3851310_232-10149-0_sp1.1 from training. Duration: 0.963625 2023-10-03 01:36:42,751 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_17292_210_6753_1_1531125388_2943734_152-30410-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 01:36:45,486 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_35441_210_16554_1_1533347572_4092680_291-82075-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 01:36:46,938 WARNING [train.py:1204] (2/4) Exclude cut with ID _12369_210_4381_1_1530356078581_4388520_64-298917-0 from training. Duration: 0.88 2023-10-03 01:36:48,252 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_5618_210_1874_1_1527311008_5851_1-214501-0_sp0.9 from training. Duration: 0.7655625 2023-10-03 01:36:48,291 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_9460_210_4211_1_1528954587_10049667_572-37035-0 from training. Duration: 0.8 2023-10-03 01:36:48,319 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9193W0023-590322 from training. Duration: 0.896 2023-10-03 01:36:49,664 WARNING [train.py:1204] (2/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_206-10097-0_sp1.1 from training. Duration: 0.4454375 2023-10-03 01:36:49,684 WARNING [train.py:1204] (2/4) Exclude cut with ID _55134_210_21982_1_1534590028377_3882170_17-51731-0_sp1.1 from training. Duration: 0.963625 2023-10-03 01:36:51,141 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_14953_210_2151_1_1530701710_5616165_710-165400-0 from training. Duration: 0.87 2023-10-03 01:36:52,491 WARNING [train.py:1204] (2/4) Exclude cut with ID _53203_210_2087_1_1534246218309_475590_61-85777-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 01:36:54,348 WARNING [train.py:1204] (2/4) Exclude cut with ID _55844_210_2346_1_1534471212569_7625707_1050-10453-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 01:36:54,358 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9218W0013-603297 from training. Duration: 0.896 2023-10-03 01:36:58,206 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_20120_210_6753_1_1531643035_5482859_267-102818-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 01:36:58,207 WARNING [train.py:1204] (2/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_191-41023-0 from training. Duration: 0.71 2023-10-03 01:36:58,257 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9231W0202-610227 from training. Duration: 0.896 2023-10-03 01:37:02,291 WARNING [train.py:1204] (2/4) Exclude cut with ID _6537_210_1883_1_1527987559710_3597129_382-133062-0 from training. Duration: 0.68 2023-10-03 01:37:03,681 WARNING [train.py:1204] (2/4) Exclude cut with ID _28427_210_4220_1_1532581240967_3061069_43-84928-0 from training. Duration: 0.99 2023-10-03 01:37:04,962 WARNING [train.py:1204] (2/4) Exclude cut with ID _59256_210_5986_1_1535076085997_3589120_121-237386-0 from training. Duration: 0.96 2023-10-03 01:37:04,993 WARNING [train.py:1204] (2/4) Exclude cut with ID _8480_210_1874_1_1528589391205_6728330_137-256986-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 01:37:05,080 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_9417_210_1794_1_1529235261_4614131_479-346963-0_sp1.1 from training. Duration: 0.936375 2023-10-03 01:37:06,316 WARNING [train.py:1204] (2/4) Exclude cut with ID _26474_210_9570_1_1532520022699_3623380_267-7362-0_sp1.1 from training. Duration: 0.936375 2023-10-03 01:37:06,396 WARNING [train.py:1204] (2/4) Exclude cut with ID _55328_210_27767_1_1534580973260_4088170_287-61272-0_sp1.1 from training. Duration: 0.97275 2023-10-03 01:37:07,935 WARNING [train.py:1204] (2/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_589-9070-0 from training. Duration: 0.71 2023-10-03 01:37:07,948 WARNING [train.py:1204] (2/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_148-158509-0_sp1.1 from training. Duration: 0.37275 2023-10-03 01:37:08,215 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.attention_skip_rate, batch_count=1091893.3333333333, ans=0.0 2023-10-03 01:37:09,342 WARNING [train.py:1204] (2/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_164-184548-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 01:37:12,495 WARNING [train.py:1204] (2/4) Exclude cut with ID _14970_210_15709_1_1530775547361_1966965_6-23328-0_sp1.1 from training. Duration: 0.7818125 2023-10-03 01:37:12,500 WARNING [train.py:1204] (2/4) Exclude cut with ID _29303_210_2575_1_1532392237303_7410649_612-165069-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 01:37:13,727 WARNING [train.py:1204] (2/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_383-321224-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 01:37:13,770 WARNING [train.py:1204] (2/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_283-160525-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 01:37:13,788 WARNING [train.py:1204] (2/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_505-97987-0 from training. Duration: 0.86 2023-10-03 01:37:15,187 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9309W0190-640317 from training. Duration: 0.768 2023-10-03 01:37:19,424 WARNING [train.py:1204] (2/4) Exclude cut with ID _50093_210_4220_1_1534417141207_3432299_280-297390-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 01:37:25,021 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_17292_210_6753_1_1531125388_2943734_320-71385-0_sp1.1 from training. Duration: 0.97275 2023-10-03 01:37:27,028 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_213-103282-0 from training. Duration: 0.78 2023-10-03 01:37:30,160 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7615_210_4211_1_1528110965_4580160_72-235211-0_sp0.9 from training. Duration: 0.8444375 2023-10-03 01:37:33,850 WARNING [train.py:1204] (2/4) Exclude cut with ID _14867_210_1968_1_1530684179890_3846969_173-320977-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 01:37:36,609 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7046_210_6753_1_1527895337_6920917_783-278109-0_sp0.9 from training. Duration: 0.8555625 2023-10-03 01:37:36,645 WARNING [train.py:1204] (2/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_90-30673-0 from training. Duration: 0.84 2023-10-03 01:37:36,660 WARNING [train.py:1204] (2/4) Exclude cut with ID _22826_210_9994_1_1531908201522_3279169_415-161-0_sp1.1 from training. Duration: 0.8909375 2023-10-03 01:37:37,929 WARNING [train.py:1204] (2/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_86-66192-0_sp0.9 from training. Duration: 0.611125 2023-10-03 01:37:37,939 WARNING [train.py:1204] (2/4) Exclude cut with ID _4626_210_3532_1_1526817652717_3916859_446-206656-0_sp1.1 from training. Duration: 0.6909375 2023-10-03 01:37:39,325 WARNING [train.py:1204] (2/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_166-128941-0_sp0.9 from training. Duration: 0.6 2023-10-03 01:37:40,957 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.conv_module2.balancer2.prob, batch_count=1092026.6666666667, ans=0.125 2023-10-03 01:37:44,137 WARNING [train.py:1204] (2/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_345-167986-0 from training. Duration: 0.84 2023-10-03 01:37:44,434 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.feed_forward3.hidden_balancer.prob, batch_count=1092026.6666666667, ans=0.125 2023-10-03 01:37:45,607 WARNING [train.py:1204] (2/4) Exclude cut with ID _6275_210_2084_1_1528110062213_7669049_973-198085-0 from training. Duration: 0.75 2023-10-03 01:37:46,859 WARNING [train.py:1204] (2/4) Exclude cut with ID _60057_210_10640_1_1534903065793_3333150_171-181133-0 from training. Duration: 0.88 2023-10-03 01:37:46,873 WARNING [train.py:1204] (2/4) Exclude cut with ID _19504_210_9994_1_1531648863242_3325220_336-196513-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 01:37:46,888 WARNING [train.py:1204] (2/4) Exclude cut with ID _6493_210_1747_1_1528459210424_4012470_513-140967-0 from training. Duration: 0.79 2023-10-03 01:37:48,225 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6673_210_3273_1_1527767790_3173240_327-306185-0_sp0.9 from training. Duration: 0.92225 2023-10-03 01:37:50,997 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.506e+02 1.843e+02 2.055e+02 2.239e+02 3.074e+02, threshold=4.110e+02, percent-clipped=0.0 2023-10-03 01:37:51,128 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1032-236863-0_sp1.1 from training. Duration: 0.763625 2023-10-03 01:37:52,586 WARNING [train.py:1204] (2/4) Exclude cut with ID _14867_210_1968_1_1530684179890_3846969_413-29292-0 from training. Duration: 0.9 2023-10-03 01:37:52,760 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.feed_forward1.out_proj.dropout_p, batch_count=1092093.3333333333, ans=0.1 2023-10-03 01:37:53,809 INFO [train.py:1046] (2/4) Epoch 31, batch 4450, loss[loss=0.1753, simple_loss=0.2449, pruned_loss=0.05286, over 23617.00 frames. ], tot_loss[loss=0.1645, simple_loss=0.2423, pruned_loss=0.0433, over 4695236.59 frames. ], batch size: 256, lr: 3.28e-03, grad_scale: 32.0 2023-10-03 01:37:57,121 WARNING [train.py:1204] (2/4) Exclude cut with ID _47251_210_18628_1_1533814063128_4137250_186-343322-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 01:37:59,988 WARNING [train.py:1204] (2/4) Exclude cut with ID _56235_210_10640_1_1534557335235_4161099_624-46091-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 01:38:00,029 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_791-197891-0_sp0.9 from training. Duration: 0.9333125 2023-10-03 01:38:00,249 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.conv_skip_rate, batch_count=1092093.3333333333, ans=0.0 2023-10-03 01:38:07,873 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_50050_210_16223_1_1535435796_7290138_786-288457-0_sp1.1 from training. Duration: 0.92725 2023-10-03 01:38:07,898 WARNING [train.py:1204] (2/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_213-227283-0_sp1.1 from training. Duration: 0.963625 2023-10-03 01:38:10,651 WARNING [train.py:1204] (2/4) Exclude cut with ID _42659_210_2070_1_1533607442765_3120380_58-341121-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 01:38:13,402 WARNING [train.py:1204] (2/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_506-23200-0_sp1.1 from training. Duration: 0.8818125 2023-10-03 01:38:15,450 WARNING [train.py:1204] (2/4) Exclude cut with ID _14170_210_5219_1_1530525949868_4469650_336-213530-0_sp1.1 from training. Duration: 0.8181875 2023-10-03 01:38:15,476 WARNING [train.py:1204] (2/4) Exclude cut with ID _64135_210_2253_1_1535677267876_7608972_180-5312-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 01:38:18,218 WARNING [train.py:1204] (2/4) Exclude cut with ID _12302_210_5219_1_1529924446466_8107280_835-279754-0 from training. Duration: 0.9 2023-10-03 01:38:18,220 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_20120_210_6753_1_1531643035_5482859_510-96996-0_sp1.1 from training. Duration: 0.7909375 2023-10-03 01:38:18,299 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_15517_210_6753_1_1530954008_3487336_216-187834-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 01:38:19,644 WARNING [train.py:1204] (2/4) Exclude cut with ID _19504_210_9994_1_1531648863242_3325220_414-113427-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 01:38:19,645 WARNING [train.py:1204] (2/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_264-108685-0_sp0.9 from training. Duration: 0.611125 2023-10-03 01:38:22,298 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_5438_210_1794_1_1527336687_2600009_104-288274-0_sp0.9 from training. Duration: 0.6444375 2023-10-03 01:38:26,840 WARNING [train.py:1204] (2/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_520-39496-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 01:38:26,883 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_37818_210_4238_1_1533620910_4720083_315-308565-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 01:38:28,332 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_2009_210_2071_1_1525481134_4588937_385-302100-0_sp1.1 from training. Duration: 0.7909375 2023-10-03 01:38:28,377 WARNING [train.py:1204] (2/4) Exclude cut with ID _58895_210_8750_1_1535184081320_3649089_277-53219-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 01:38:29,664 WARNING [train.py:1204] (2/4) Exclude cut with ID _26664_210_7770_1_1532258943280_3418270_121-111258-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 01:38:32,625 WARNING [train.py:1204] (2/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_99-167392-0_sp1.1 from training. Duration: 0.5545625 2023-10-03 01:38:34,475 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1113-7369-0 from training. Duration: 0.85 2023-10-03 01:38:34,492 WARNING [train.py:1204] (2/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_352-59958-0 from training. Duration: 0.86 2023-10-03 01:38:34,503 WARNING [train.py:1204] (2/4) Exclude cut with ID _14886_210_4220_1_1530702020355_3516190_159-73748-0_sp1.1 from training. Duration: 0.7545625 2023-10-03 01:38:37,117 WARNING [train.py:1204] (2/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_131-119569-0_sp1.1 from training. Duration: 0.92725 2023-10-03 01:38:38,482 WARNING [train.py:1204] (2/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_216-343007-0 from training. Duration: 0.66 2023-10-03 01:38:42,530 WARNING [train.py:1204] (2/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_113-92608-0_sp0.9 from training. Duration: 0.6 2023-10-03 01:38:42,744 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.ff3_skip_rate, batch_count=1092293.3333333333, ans=0.0 2023-10-03 01:38:45,373 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_40047_210_2290_1_1533290519_2374311_132-123271-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 01:38:45,417 WARNING [train.py:1204] (2/4) Exclude cut with ID _8793_210_6310_1_1528542985139_2310009_121-179782-0 from training. Duration: 0.8 2023-10-03 01:38:47,178 WARNING [train.py:1204] (2/4) Exclude cut with ID _43997_210_4525_1_1533538632555_5189329_283-218639-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 01:38:47,181 WARNING [train.py:1204] (2/4) Exclude cut with ID _49376_210_5986_1_1533985278732_3370740_297-78397-0_sp1.1 from training. Duration: 0.936375 2023-10-03 01:38:47,196 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_3475_210_2028_1_1526193578_3622569_377-166133-0_sp0.9 from training. Duration: 0.9555625 2023-10-03 01:38:47,203 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_520-213149-0_sp1.1 from training. Duration: 0.92725 2023-10-03 01:38:48,651 WARNING [train.py:1204] (2/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_567-144483-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 01:38:51,585 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_4183_210_2087_1_1526729100_1869838_143-280962-0_sp0.9 from training. Duration: 0.62225 2023-10-03 01:38:52,796 WARNING [train.py:1204] (2/4) Exclude cut with ID _49643_210_7989_1_1534640244224_3939031_611-185515-0 from training. Duration: 0.94 2023-10-03 01:38:54,713 WARNING [train.py:1204] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_88-341718-0_sp0.9 from training. Duration: 0.7333125 2023-10-03 01:38:56,214 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_51225_210_3601_1_1534400634_2327383_49-235441-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 01:38:57,675 WARNING [train.py:1204] (2/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_240-294400-0_sp1.1 from training. Duration: 0.936375 2023-10-03 01:39:00,282 WARNING [train.py:1204] (2/4) Exclude cut with ID _55429_210_10626_1_1534467588527_7174143_640-304825-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 01:39:00,305 WARNING [train.py:1204] (2/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_187-221566-0_sp1.1 from training. Duration: 0.3909375 2023-10-03 01:39:01,845 WARNING [train.py:1204] (2/4) Exclude cut with ID _7606_210_4915_1_1528894253277_5080690_127-177761-0_sp0.9 from training. Duration: 0.711125 2023-10-03 01:39:04,486 WARNING [train.py:1204] (2/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_430-116139-0 from training. Duration: 0.85 2023-10-03 01:39:06,555 WARNING [train.py:1204] (2/4) Exclude cut with ID _3675_210_4557_1_1526639934408_4182089_456-18305-0_sp0.9 from training. Duration: 0.8444375 2023-10-03 01:39:07,816 INFO [train.py:1046] (2/4) Epoch 31, batch 4500, loss[loss=0.1607, simple_loss=0.2485, pruned_loss=0.03645, over 24682.00 frames. ], tot_loss[loss=0.1647, simple_loss=0.2426, pruned_loss=0.04341, over 4693110.66 frames. ], batch size: 73, lr: 3.28e-03, grad_scale: 16.0 2023-10-03 01:39:09,482 INFO [scaling.py:1118] (2/4) WithLoss: name=encoder.encoders.5.encoder.layers.0.self_attn_weights, loss-sum=7.946e-03 2023-10-03 01:39:10,853 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.ff3_skip_rate, batch_count=1092426.6666666667, ans=0.0 2023-10-03 01:39:11,996 WARNING [train.py:1204] (2/4) Exclude cut with ID _18513_210_9206_1_1531618215223_3862620_402-5957-0_sp1.1 from training. Duration: 0.9 2023-10-03 01:39:13,407 WARNING [train.py:1204] (2/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_465-156500-0 from training. Duration: 0.72 2023-10-03 01:39:13,409 WARNING [train.py:1204] (2/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_714-310538-0 from training. Duration: 0.86 2023-10-03 01:39:14,069 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.4.encoder.layers.0.whiten, num_groups=1, num_channels=384, metric=4.59 vs. limit=12.0 2023-10-03 01:39:14,847 WARNING [train.py:1204] (2/4) Exclude cut with ID _39411_210_5986_1_1533538825030_3343649_335-301187-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 01:39:19,423 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_41750_210_1960_1_1533520678_3948042_241-158616-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 01:39:20,806 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_46013_210_7234_1_1533776362_3744032_50-141535-0_sp1.1 from training. Duration: 0.9 2023-10-03 01:39:20,878 WARNING [train.py:1204] (2/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_43-206757-0_sp0.9 from training. Duration: 0.8333125 2023-10-03 01:39:22,174 WARNING [train.py:1204] (2/4) Exclude cut with ID _22961_210_8254_1_1531976366672_4278180_172-276296-0_sp1.1 from training. Duration: 0.97275 2023-10-03 01:39:22,201 WARNING [train.py:1204] (2/4) Exclude cut with ID _17956_210_4353_1_1531209568125_6979970_1305-301903-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 01:39:22,240 WARNING [train.py:1204] (2/4) Exclude cut with ID _28323_210_16187_1_1532480395667_4100300_604-44507-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 01:39:33,647 WARNING [train.py:1204] (2/4) Exclude cut with ID _23514_210_1389_1_1531832139785_3031439_88-337544-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 01:39:35,203 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.balancer2.prob, batch_count=1092560.0, ans=0.125 2023-10-03 01:39:36,260 WARNING [train.py:1204] (2/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_932-3672-0_sp1.1 from training. Duration: 0.963625 2023-10-03 01:39:36,422 WARNING [train.py:1204] (2/4) Exclude cut with ID _46502_210_13794_1_1533724452198_3657159_207-109991-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 01:39:38,371 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_216-73841-0_sp1.1 from training. Duration: 0.87275 2023-10-03 01:39:40,134 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8453_210_4211_1_1528502940_8562730_347-330673-0_sp1.1 from training. Duration: 0.6545625 2023-10-03 01:39:44,373 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_4183_210_2087_1_1526729100_1869838_143-280962-0_sp1.1 from training. Duration: 0.5090625 2023-10-03 01:39:47,739 WARNING [train.py:1204] (2/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_325-113798-0_sp1.1 from training. Duration: 0.77275 2023-10-03 01:39:50,123 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.1.encoder.layers.0.self_attn_weights.whiten_keys, num_groups=4, num_channels=128, metric=4.20 vs. limit=6.0 2023-10-03 01:39:52,012 WARNING [train.py:1204] (2/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_91-212486-0_sp0.9 from training. Duration: 0.7555625 2023-10-03 01:39:54,707 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8776_210_2040_1_1528638837_4088036_244-278763-0_sp1.1 from training. Duration: 0.8181875 2023-10-03 01:39:56,691 WARNING [train.py:1204] (2/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_106-286130-0 from training. Duration: 0.98 2023-10-03 01:39:56,761 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_2746_210_1988_1_1525872816_3795627_372-205861-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 01:39:57,924 WARNING [train.py:1204] (2/4) Exclude cut with ID _30800_210_22382_1_1532499347904_4515959_240-54646-0_sp1.1 from training. Duration: 0.936375 2023-10-03 01:39:59,358 WARNING [train.py:1204] (2/4) Exclude cut with ID _59500_210_8254_1_1534809610680_3736009_162-180695-0_sp1.1 from training. Duration: 0.936375 2023-10-03 01:39:59,594 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.balancer2.prob, batch_count=1092626.6666666667, ans=0.125 2023-10-03 01:40:00,664 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7544_210_6753_1_1528591327_1471426_146-93357-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 01:40:00,843 WARNING [train.py:1204] (2/4) Exclude cut with ID _42657_210_2070_1_1533520654016_3327008_405-87432-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 01:40:00,867 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_4528_210_3273_1_1527076654_3276777_209-219804-0 from training. Duration: 0.69 2023-10-03 01:40:00,868 WARNING [train.py:1204] (2/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_78-182912-0_sp0.9 from training. Duration: 0.6555625 2023-10-03 01:40:01,032 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.conv_module1.balancer2.prob, batch_count=1092626.6666666667, ans=0.125 2023-10-03 01:40:01,101 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.conv_module1.balancer2.prob, batch_count=1092626.6666666667, ans=0.125 2023-10-03 01:40:01,480 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.3.encoder.layers.3.conv_module1.whiten, num_groups=1, num_channels=512, metric=4.38 vs. limit=15.0 2023-10-03 01:40:02,204 WARNING [train.py:1204] (2/4) Exclude cut with ID _31255_210_13427_1_1532865619495_3815143_322-114998-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 01:40:08,201 WARNING [train.py:1204] (2/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_848-280957-0_sp0.9 from training. Duration: 0.9666875 2023-10-03 01:40:08,223 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7696_210_2040_1_1528207167_3716099_117-125434-0_sp0.9 from training. Duration: 0.8444375 2023-10-03 01:40:11,488 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_1002-109192-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 01:40:13,122 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.self_attn_weights.pos_emb_skip_rate, batch_count=1092693.3333333333, ans=0.0 2023-10-03 01:40:14,219 WARNING [train.py:1204] (2/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_138-64210-0_sp0.9 from training. Duration: 0.811125 2023-10-03 01:40:14,234 WARNING [train.py:1204] (2/4) Exclude cut with ID _25808_210_13292_1_1532343514562_7366109_633-47524-0_sp1.1 from training. Duration: 0.9 2023-10-03 01:40:15,603 WARNING [train.py:1204] (2/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_297-5233-0 from training. Duration: 0.95 2023-10-03 01:40:17,490 WARNING [train.py:1204] (2/4) Exclude cut with ID _8394_210_6702_1_1528590504316_3540189_335-274780-0 from training. Duration: 0.78 2023-10-03 01:40:17,495 WARNING [train.py:1204] (2/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_301-330455-0 from training. Duration: 0.8 2023-10-03 01:40:20,177 WARNING [train.py:1204] (2/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_383-205833-0 from training. Duration: 0.95 2023-10-03 01:40:21,551 INFO [train.py:1046] (2/4) Epoch 31, batch 4550, loss[loss=0.1556, simple_loss=0.2359, pruned_loss=0.0376, over 24511.00 frames. ], tot_loss[loss=0.1637, simple_loss=0.2416, pruned_loss=0.04291, over 4701748.00 frames. ], batch size: 63, lr: 3.28e-03, grad_scale: 4.0 2023-10-03 01:40:22,882 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.645e+02 1.951e+02 2.109e+02 2.572e+02 4.097e+02, threshold=4.218e+02, percent-clipped=0.0 2023-10-03 01:40:23,007 WARNING [train.py:1204] (2/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_48-182155-0 from training. Duration: 0.87 2023-10-03 01:40:24,371 WARNING [train.py:1204] (2/4) Exclude cut with ID _14832_210_8750_1_1530691351539_3171348_1-54315-0_sp1.1 from training. Duration: 0.7909375 2023-10-03 01:40:27,087 WARNING [train.py:1204] (2/4) Exclude cut with ID _44982_210_1511_1_1534312820108_3726029_413-100814-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 01:40:27,142 WARNING [train.py:1204] (2/4) Exclude cut with ID _3231_210_2106_1_1526205864934_6784220_104-261392-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 01:40:30,391 WARNING [train.py:1204] (2/4) Exclude cut with ID _24314_210_4915_1_1532135078116_7170179_1-13588-0_sp1.1 from training. Duration: 0.97275 2023-10-03 01:40:34,448 WARNING [train.py:1204] (2/4) Exclude cut with ID _44304_210_15374_1_1533639352765_4613129_141-8356-0_sp1.1 from training. Duration: 0.963625 2023-10-03 01:40:37,801 WARNING [train.py:1204] (2/4) Exclude cut with ID _37892_210_19596_1_1533198677897_3964058_374-335034-0_sp1.1 from training. Duration: 0.936375 2023-10-03 01:40:37,946 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_195-257512-0_sp0.9 from training. Duration: 0.7444375 2023-10-03 01:40:37,949 WARNING [train.py:1204] (2/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_1267-272693-0_sp1.1 from training. Duration: 0.663625 2023-10-03 01:40:37,949 WARNING [train.py:1204] (2/4) Exclude cut with ID _30196_210_1664_1_1533708496522_7074140_690-326155-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 01:40:40,608 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_68847_210_14656_1_1536070214_2586843_38-20717-0_sp1.1 from training. Duration: 0.97275 2023-10-03 01:40:41,951 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_99-251314-0_sp1.1 from training. Duration: 0.7909375 2023-10-03 01:40:43,884 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.3.encoder.layers.2.feed_forward2.out_whiten, num_groups=1, num_channels=512, metric=12.95 vs. limit=15.0 2023-10-03 01:40:44,607 WARNING [train.py:1204] (2/4) Exclude cut with ID _49376_210_5986_1_1533985278732_3370740_225-95630-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 01:40:46,666 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_2655_210_2087_1_1525688959_3897400_202-147557-0 from training. Duration: 0.85 2023-10-03 01:40:48,021 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_5618_210_1874_1_1527311008_5851_1-214501-0_sp1.1 from training. Duration: 0.626375 2023-10-03 01:40:49,301 WARNING [train.py:1204] (2/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_225-43381-0_sp1.1 from training. Duration: 0.8545625 2023-10-03 01:40:49,399 WARNING [train.py:1204] (2/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_242-3472-0 from training. Duration: 0.73 2023-10-03 01:40:50,928 WARNING [train.py:1204] (2/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_339-104791-0 from training. Duration: 0.49 2023-10-03 01:40:52,170 WARNING [train.py:1204] (2/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_488-92261-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 01:40:55,677 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.2.encoder.layers.1.whiten, num_groups=1, num_channels=384, metric=7.81 vs. limit=12.0 2023-10-03 01:40:56,210 WARNING [train.py:1204] (2/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_175-163894-0 from training. Duration: 0.74 2023-10-03 01:40:58,213 WARNING [train.py:1204] (2/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_41-234353-0_sp0.9 from training. Duration: 0.7555625 2023-10-03 01:41:00,935 WARNING [train.py:1204] (2/4) Exclude cut with ID _47251_210_18628_1_1533814063128_4137250_116-275927-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 01:41:00,972 WARNING [train.py:1204] (2/4) Exclude cut with ID _4697_210_2069_1_1526815801953_3823098_61-223324-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 01:41:00,986 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6961_210_2087_1_1527937168_6815011_562-165518-0_sp1.1 from training. Duration: 0.7 2023-10-03 01:41:03,780 WARNING [train.py:1204] (2/4) Exclude cut with ID _21989_210_5854_1_1531899214931_3271360_133-44759-0 from training. Duration: 0.77 2023-10-03 01:41:05,234 WARNING [train.py:1204] (2/4) Exclude cut with ID _23557_210_8750_1_1531890106370_6846255_77-284923-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 01:41:07,268 WARNING [train.py:1204] (2/4) Exclude cut with ID _13678_210_7497_1_1530530069226_3463170_464-296127-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 01:41:08,532 WARNING [train.py:1204] (2/4) Exclude cut with ID _22613_210_5219_1_1532253599686_4090170_393-36663-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 01:41:09,920 WARNING [train.py:1204] (2/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_235-262124-0_sp0.9 from training. Duration: 0.7444375 2023-10-03 01:41:12,533 WARNING [train.py:1204] (2/4) Exclude cut with ID _8480_210_1874_1_1528589391205_6728330_755-112625-0 from training. Duration: 0.74 2023-10-03 01:41:12,584 WARNING [train.py:1204] (2/4) Exclude cut with ID _6456_210_1534_1_1527681634212_3181049_215-309359-0 from training. Duration: 0.88 2023-10-03 01:41:12,610 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_3019_210_2028_1_1526044245_3264865_191-72235-0_sp1.1 from training. Duration: 0.8818125 2023-10-03 01:41:13,964 WARNING [train.py:1204] (2/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_380-162648-0 from training. Duration: 0.94 2023-10-03 01:41:16,009 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_92-46009-0 from training. Duration: 0.54 2023-10-03 01:41:16,025 WARNING [train.py:1204] (2/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_53-300544-0_sp0.9 from training. Duration: 0.7444375 2023-10-03 01:41:17,387 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_68235_210_1925_1_1535863247_4484656_101-250943-0_sp1.1 from training. Duration: 0.97275 2023-10-03 01:41:17,406 WARNING [train.py:1204] (2/4) Exclude cut with ID _14970_210_15709_1_1530775547361_1966965_204-22182-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 01:41:18,816 WARNING [train.py:1204] (2/4) Exclude cut with ID _55153_210_2253_1_1534384786677_3162096_310-130092-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 01:41:18,835 WARNING [train.py:1204] (2/4) Exclude cut with ID _60113_210_16271_1_1535164212481_4386079_559-167191-0_sp1.1 from training. Duration: 0.8454375 2023-10-03 01:41:20,277 WARNING [train.py:1204] (2/4) Exclude cut with ID _14368_210_4220_1_1530532714870_3595529_93-58002-0_sp1.1 from training. Duration: 0.5181875 2023-10-03 01:41:21,525 WARNING [train.py:1204] (2/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_110-224688-0 from training. Duration: 0.97 2023-10-03 01:41:22,937 WARNING [train.py:1204] (2/4) Exclude cut with ID _40402_210_18219_1_1533522324115_2953150_178-178004-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 01:41:22,954 WARNING [train.py:1204] (2/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_321-335475-0_sp1.1 from training. Duration: 0.4090625 2023-10-03 01:41:23,011 WARNING [train.py:1204] (2/4) Exclude cut with ID _66863_210_18053_1_1535698758334_8590319_1092-271784-0 from training. Duration: 0.96 2023-10-03 01:41:24,290 WARNING [train.py:1204] (2/4) Exclude cut with ID _28317_210_12742_1_1532480302381_8510325_607-213951-0_sp1.1 from training. Duration: 0.763625 2023-10-03 01:41:24,299 WARNING [train.py:1204] (2/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_468-283113-0 from training. Duration: 0.96 2023-10-03 01:41:27,070 WARNING [train.py:1204] (2/4) Exclude cut with ID _14922_210_6228_1_1530707435273_3693310_13-165512-0_sp1.1 from training. Duration: 0.7454375 2023-10-03 01:41:27,089 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_69630_210_7234_1_1535788670_910989_56-169762-0_sp1.1 from training. Duration: 0.92725 2023-10-03 01:41:30,294 WARNING [train.py:1204] (2/4) Exclude cut with ID _67335_210_5219_1_1535525975652_4275229_121-80357-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 01:41:30,343 WARNING [train.py:1204] (2/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_126-12079-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 01:41:30,374 WARNING [train.py:1204] (2/4) Exclude cut with ID _5294_210_1747_1_1527249643928_3696179_342-270278-0_sp1.1 from training. Duration: 0.5 2023-10-03 01:41:33,126 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_30322_210_1925_1_1532843478_826817_35-328264-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 01:41:34,923 INFO [train.py:1046] (2/4) Epoch 31, batch 4600, loss[loss=0.1666, simple_loss=0.2541, pruned_loss=0.03958, over 24595.00 frames. ], tot_loss[loss=0.1625, simple_loss=0.2405, pruned_loss=0.0423, over 4713901.98 frames. ], batch size: 71, lr: 3.27e-03, grad_scale: 8.0 2023-10-03 01:41:34,959 WARNING [train.py:1204] (2/4) Exclude cut with ID _8480_210_1874_1_1528589391205_6728330_755-112625-0_sp1.1 from training. Duration: 0.67275 2023-10-03 01:41:37,010 WARNING [train.py:1204] (2/4) Exclude cut with ID _38670_210_15379_1_1533448810659_4391059_132-146412-0_sp1.1 from training. Duration: 0.97275 2023-10-03 01:41:37,086 WARNING [train.py:1204] (2/4) Exclude cut with ID _22020_210_2461_1_1531724164513_6326900_563-39691-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 01:41:40,069 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.feed_forward2.hidden_balancer.prob, batch_count=1093093.3333333333, ans=0.125 2023-10-03 01:41:40,985 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_9460_210_4211_1_1528954587_10049667_572-37035-0_sp1.1 from training. Duration: 0.72725 2023-10-03 01:41:40,996 WARNING [train.py:1204] (2/4) Exclude cut with ID _68777_210_4353_1_1535810390731_3117904_129-166012-0_sp0.9 from training. Duration: 0.9444375 2023-10-03 01:41:42,471 WARNING [train.py:1204] (2/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_487-91921-0_sp1.1 from training. Duration: 0.963625 2023-10-03 01:41:44,402 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_195-257512-0 from training. Duration: 0.67 2023-10-03 01:41:44,614 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.conv_module2.balancer2.prob, batch_count=1093093.3333333333, ans=0.125 2023-10-03 01:41:45,791 WARNING [train.py:1204] (2/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_188-260011-0_sp1.1 from training. Duration: 0.663625 2023-10-03 01:41:47,343 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.conv_module2.balancer2.prob, batch_count=1093093.3333333333, ans=0.125 2023-10-03 01:41:49,895 WARNING [train.py:1204] (2/4) Exclude cut with ID _8480_210_1874_1_1528589391205_6728330_309-139896-0_sp0.9 from training. Duration: 0.9 2023-10-03 01:41:50,348 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.4.encoder.layers.2.self_attn2.whiten, num_groups=1, num_channels=384, metric=11.74 vs. limit=22.5 2023-10-03 01:41:51,232 WARNING [train.py:1204] (2/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_866-321248-0_sp1.1 from training. Duration: 0.963625 2023-10-03 01:41:52,737 WARNING [train.py:1204] (2/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_510-216513-0_sp1.1 from training. Duration: 0.97275 2023-10-03 01:41:58,175 WARNING [train.py:1204] (2/4) Exclude cut with ID _6653_210_2629_1_1527919172441_6732679_758-143677-0 from training. Duration: 0.98 2023-10-03 01:41:58,253 WARNING [train.py:1204] (2/4) Exclude cut with ID _55134_210_21982_1_1534590028377_3882170_50-214427-0_sp1.1 from training. Duration: 0.97275 2023-10-03 01:41:59,750 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.1.balancer1.prob, batch_count=1093160.0, ans=0.125 2023-10-03 01:42:02,816 WARNING [train.py:1204] (2/4) Exclude cut with ID _31649_210_6228_1_1532584608063_4091078_482-301330-0_sp1.1 from training. Duration: 0.97275 2023-10-03 01:42:04,338 WARNING [train.py:1204] (2/4) Exclude cut with ID _35799_210_5854_1_1533263487887_3529071_490-326847-0_sp1.1 from training. Duration: 0.9 2023-10-03 01:42:04,351 WARNING [train.py:1204] (2/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_984-90716-0_sp1.1 from training. Duration: 0.963625 2023-10-03 01:42:12,275 WARNING [train.py:1204] (2/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_166-246557-0 from training. Duration: 0.64 2023-10-03 01:42:12,275 WARNING [train.py:1204] (2/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_51-200220-0_sp1.1 from training. Duration: 0.5090625 2023-10-03 01:42:12,345 WARNING [train.py:1204] (2/4) Exclude cut with ID _51382_210_23862_1_1534492801838_3650104_275-92796-0_sp1.1 from training. Duration: 0.936375 2023-10-03 01:42:18,481 WARNING [train.py:1204] (2/4) Exclude cut with ID _29853_210_7497_1_1532505600851_6642110_461-142829-0_sp1.1 from training. Duration: 0.97275 2023-10-03 01:42:18,519 WARNING [train.py:1204] (2/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_788-298907-0_sp0.9 from training. Duration: 0.988875 2023-10-03 01:42:21,169 WARNING [train.py:1204] (2/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_739-2098-0_sp1.1 from training. Duration: 0.836375 2023-10-03 01:42:21,411 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.conv_module1.balancer1.prob, batch_count=1093293.3333333333, ans=0.125 2023-10-03 01:42:23,978 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7543_210_6753_1_1528541907_1399322_165-323619-0 from training. Duration: 0.69 2023-10-03 01:42:24,070 WARNING [train.py:1204] (2/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_401-255491-0_sp1.1 from training. Duration: 0.636375 2023-10-03 01:42:26,836 WARNING [train.py:1204] (2/4) Exclude cut with ID _66873_210_5517_1_1535454136506_4293370_24-143283-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 01:42:30,692 WARNING [train.py:1204] (2/4) Exclude cut with ID _14459_210_2228_1_1531443641946_7516700_1221-280439-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 01:42:32,226 WARNING [train.py:1204] (2/4) Exclude cut with ID _59202_210_26984_1_1534816810612_3549174_469-213482-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 01:42:32,227 WARNING [train.py:1204] (2/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_148-158509-0_sp0.9 from training. Duration: 0.4555625 2023-10-03 01:42:33,553 WARNING [train.py:1204] (2/4) Exclude cut with ID _24314_210_4915_1_1532135078116_7170179_863-259095-0_sp1.1 from training. Duration: 0.97275 2023-10-03 01:42:34,882 WARNING [train.py:1204] (2/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_321-22821-0 from training. Duration: 0.89 2023-10-03 01:42:35,541 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_43831_210_2158_1_1533725502_5872791_55-163137-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 01:42:35,586 WARNING [train.py:1204] (2/4) Exclude cut with ID _27430_210_7770_1_1532604689375_1378590_26-25207-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 01:42:36,909 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.bypass.skip_rate, batch_count=1093360.0, ans=0.04949747468305833 2023-10-03 01:42:38,623 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_17613_210_2151_1_1531137569_2366160_223-316461-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 01:42:38,697 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_53679_210_4852_1_1534556726_4186141_541-213370-0_sp1.1 from training. Duration: 0.963625 2023-10-03 01:42:40,032 WARNING [train.py:1204] (2/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_369-295086-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 01:42:41,289 WARNING [train.py:1204] (2/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_91-267696-0 from training. Duration: 0.87 2023-10-03 01:42:41,335 WARNING [train.py:1204] (2/4) Exclude cut with ID _5294_210_1747_1_1527249643928_3696179_180-100713-0 from training. Duration: 0.81 2023-10-03 01:42:43,154 WARNING [train.py:1204] (2/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_32-335103-0 from training. Duration: 0.82 2023-10-03 01:42:43,168 WARNING [train.py:1204] (2/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_133-312727-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 01:42:44,575 WARNING [train.py:1204] (2/4) Exclude cut with ID _59288_210_5854_1_1535077757774_3412246_391-165934-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 01:42:45,812 WARNING [train.py:1204] (2/4) Exclude cut with ID _28323_210_16187_1_1532480395667_4100300_488-1281-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 01:42:47,173 WARNING [train.py:1204] (2/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_124-193207-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 01:42:51,987 INFO [train.py:1046] (2/4) Epoch 31, batch 4650, loss[loss=0.1715, simple_loss=0.253, pruned_loss=0.04499, over 23395.00 frames. ], tot_loss[loss=0.1624, simple_loss=0.2405, pruned_loss=0.04216, over 4723014.39 frames. ], batch size: 119, lr: 3.27e-03, grad_scale: 8.0 2023-10-03 01:42:53,307 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.466e+02 1.809e+02 1.983e+02 2.209e+02 6.032e+02, threshold=3.967e+02, percent-clipped=1.0 2023-10-03 01:42:56,266 WARNING [train.py:1204] (2/4) Exclude cut with ID _14087_210_2253_1_1530529224512_3014029_226-242140-0_sp0.9 from training. Duration: 0.9 2023-10-03 01:42:57,856 WARNING [train.py:1204] (2/4) Exclude cut with ID _29277_210_3279_1_1532581109293_3173069_392-3694-0_sp1.1 from training. Duration: 0.936375 2023-10-03 01:42:57,883 WARNING [train.py:1204] (2/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_563-329842-0_sp1.1 from training. Duration: 0.97275 2023-10-03 01:42:59,155 WARNING [train.py:1204] (2/4) Exclude cut with ID _58583_210_5236_1_1534726835266_3574249_391-87755-0_sp1.1 from training. Duration: 0.92725 2023-10-03 01:42:59,173 WARNING [train.py:1204] (2/4) Exclude cut with ID _28427_210_4220_1_1532581240967_3061069_281-237827-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 01:42:59,195 WARNING [train.py:1204] (2/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_44-147898-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 01:42:59,270 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_24007_210_1794_1_1531918925_1663875_125-320772-0_sp1.1 from training. Duration: 0.97275 2023-10-03 01:43:03,324 WARNING [train.py:1204] (2/4) Exclude cut with ID _14368_210_4220_1_1530532714870_3595529_143-117421-0 from training. Duration: 0.58 2023-10-03 01:43:06,045 WARNING [train.py:1204] (2/4) Exclude cut with ID _69584_210_5219_1_1535785133775_4044450_325-7656-0_sp0.9 from training. Duration: 0.9555625 2023-10-03 01:43:09,026 INFO [scaling.py:1022] (2/4) Whitening: name=encoder_embed.convnext.out_whiten, num_groups=1, num_channels=128, metric=4.80 vs. limit=5.0 2023-10-03 01:43:09,326 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_37351_210_7925_1_1533012931_3812113_265-85780-0 from training. Duration: 0.88 2023-10-03 01:43:09,345 WARNING [train.py:1204] (2/4) Exclude cut with ID _18516_210_9206_1_1531704653206_3692406_331-237896-0_sp1.1 from training. Duration: 0.936375 2023-10-03 01:43:11,356 WARNING [train.py:1204] (2/4) Exclude cut with ID _14629_210_15709_1_1530604298182_956899_6-232695-0 from training. Duration: 0.94 2023-10-03 01:43:11,375 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7544_210_6753_1_1528587000_691468_87-178599-0_sp1.1 from training. Duration: 0.7909375 2023-10-03 01:43:12,696 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_326-303945-0 from training. Duration: 0.99 2023-10-03 01:43:12,719 WARNING [train.py:1204] (2/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_503-30719-0 from training. Duration: 0.98 2023-10-03 01:43:12,736 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_11305_210_1794_1_1529750428_7178148_299-344640-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 01:43:12,789 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_53071_210_16554_1_1534490405_2954066_265-170472-0_sp1.1 from training. Duration: 0.7545625 2023-10-03 01:43:16,121 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_174-277364-0_sp1.1 from training. Duration: 0.6454375 2023-10-03 01:43:16,228 WARNING [train.py:1204] (2/4) Exclude cut with ID _58414_210_8254_1_1535421609162_7151182_193-151884-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 01:43:18,233 WARNING [train.py:1204] (2/4) Exclude cut with ID IC0651W0104-322063 from training. Duration: 0.768 2023-10-03 01:43:21,120 WARNING [train.py:1204] (2/4) Exclude cut with ID _21881_210_3525_1_1531825242757_3798380_139-305982-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 01:43:22,555 WARNING [train.py:1204] (2/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_79-200392-0 from training. Duration: 0.97 2023-10-03 01:43:24,120 WARNING [train.py:1204] (2/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_451-280894-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 01:43:24,129 WARNING [train.py:1204] (2/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_304-273433-0_sp1.1 from training. Duration: 0.9 2023-10-03 01:43:25,423 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_5959_210_1794_1_1527400400_3445726_412-44052-0 from training. Duration: 0.77 2023-10-03 01:43:26,817 WARNING [train.py:1204] (2/4) Exclude cut with ID _4681_210_3525_1_1527251040321_1235713_148-26159-0_sp1.1 from training. Duration: 0.963625 2023-10-03 01:43:26,990 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.conv_module1.balancer2.prob, batch_count=1093560.0, ans=0.125 2023-10-03 01:43:29,567 WARNING [train.py:1204] (2/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_887-249555-0_sp0.9 from training. Duration: 0.8555625 2023-10-03 01:43:33,792 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_25236_210_2158_1_1532051968_280567_37-6860-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 01:43:39,713 WARNING [train.py:1204] (2/4) Exclude cut with ID _29277_210_3279_1_1532581109293_3173069_417-115527-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 01:43:41,228 WARNING [train.py:1204] (2/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_552-134748-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 01:43:41,284 WARNING [train.py:1204] (2/4) Exclude cut with ID _43176_210_8254_1_1533524417469_4174339_188-174143-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 01:43:41,323 WARNING [train.py:1204] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_127-119109-0_sp1.1 from training. Duration: 0.6545625 2023-10-03 01:43:41,442 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.attention_skip_rate, batch_count=1093626.6666666667, ans=0.0 2023-10-03 01:43:44,490 WARNING [train.py:1204] (2/4) Exclude cut with ID _14922_210_6228_1_1530707435273_3693310_38-187851-0 from training. Duration: 0.62 2023-10-03 01:43:46,471 WARNING [train.py:1204] (2/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_588-125165-0 from training. Duration: 0.83 2023-10-03 01:43:46,546 WARNING [train.py:1204] (2/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_344-152526-0_sp1.1 from training. Duration: 0.4818125 2023-10-03 01:43:46,548 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7002_210_1794_1_1527935634_4034444_487-236501-0 from training. Duration: 0.66 2023-10-03 01:43:48,010 WARNING [train.py:1204] (2/4) Exclude cut with ID _23825_210_13449_1_1531915313526_3497150_379-16981-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 01:43:50,048 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.out_combiner.scale_min, batch_count=1093626.6666666667, ans=0.2 2023-10-03 01:43:52,741 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.conv_skip_rate, batch_count=1093693.3333333333, ans=0.0 2023-10-03 01:43:52,782 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.conv_module1.balancer1.prob, batch_count=1093693.3333333333, ans=0.125 2023-10-03 01:43:52,798 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.nonlin_attention.balancer.prob, batch_count=1093693.3333333333, ans=0.125 2023-10-03 01:43:53,943 WARNING [train.py:1204] (2/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_297-5233-0_sp1.1 from training. Duration: 0.863625 2023-10-03 01:43:55,217 WARNING [train.py:1204] (2/4) Exclude cut with ID _41298_210_9019_1_1533430371450_4822143_178-283538-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 01:43:55,251 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7046_210_6753_1_1527895337_6920917_642-106799-0 from training. Duration: 0.92 2023-10-03 01:43:55,264 WARNING [train.py:1204] (2/4) Exclude cut with ID _6274_210_2084_1_1527937583837_7494020_532-291952-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 01:43:56,659 WARNING [train.py:1204] (2/4) Exclude cut with ID _47278_210_12041_1_1533902418349_3576110_260-270468-0_sp1.1 from training. Duration: 0.92725 2023-10-03 01:43:56,664 WARNING [train.py:1204] (2/4) Exclude cut with ID _14368_210_4220_1_1530532714870_3595529_144-3287-0_sp0.9 from training. Duration: 0.7444375 2023-10-03 01:43:56,762 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_162-262717-0_sp0.9 from training. Duration: 0.92225 2023-10-03 01:43:59,580 WARNING [train.py:1204] (2/4) Exclude cut with ID _3465_210_3525_1_1526175667060_3574049_62-335552-0_sp0.9 from training. Duration: 0.9666875 2023-10-03 01:44:00,834 WARNING [train.py:1204] (2/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_726-272852-0_sp1.1 from training. Duration: 0.92725 2023-10-03 01:44:00,912 WARNING [train.py:1204] (2/4) Exclude cut with ID _14898_210_9206_1_1530766812377_4177190_26-135492-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 01:44:04,886 WARNING [train.py:1204] (2/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_200-95304-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 01:44:04,922 WARNING [train.py:1204] (2/4) Exclude cut with ID _45012_210_12651_1_1533608435136_5371380_240-37101-0_sp1.1 from training. Duration: 0.8454375 2023-10-03 01:44:04,932 WARNING [train.py:1204] (2/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_132-269633-0_sp0.9 from training. Duration: 0.7555625 2023-10-03 01:44:06,270 INFO [train.py:1046] (2/4) Epoch 31, batch 4700, loss[loss=0.1595, simple_loss=0.2387, pruned_loss=0.04013, over 24462.00 frames. ], tot_loss[loss=0.1629, simple_loss=0.2411, pruned_loss=0.04241, over 4719923.32 frames. ], batch size: 58, lr: 3.27e-03, grad_scale: 8.0 2023-10-03 01:44:06,335 WARNING [train.py:1204] (2/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_907-151398-0_sp0.9 from training. Duration: 0.411125 2023-10-03 01:44:06,432 WARNING [train.py:1204] (2/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_45-265684-0_sp0.9 from training. Duration: 0.8 2023-10-03 01:44:08,207 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6291_210_3273_1_1527683649_1078498_74-292608-0 from training. Duration: 0.72 2023-10-03 01:44:17,105 WARNING [train.py:1204] (2/4) Exclude cut with ID _59202_210_26984_1_1534816810612_3549174_221-84819-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 01:44:18,568 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_33696_210_3852_1_1533082780_2663375_181-325595-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 01:44:19,741 WARNING [train.py:1204] (2/4) Exclude cut with ID _47251_210_18628_1_1533814063128_4137250_351-154105-0_sp1.1 from training. Duration: 0.963625 2023-10-03 01:44:22,959 WARNING [train.py:1204] (2/4) Exclude cut with ID _35501_210_9288_1_1532998507986_3192189_351-334740-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 01:44:23,076 WARNING [train.py:1204] (2/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_342-100157-0_sp1.1 from training. Duration: 0.4909375 2023-10-03 01:44:27,255 WARNING [train.py:1204] (2/4) Exclude cut with ID _6789_210_1534_1_1527857937218_3118950_262-48853-0 from training. Duration: 0.56 2023-10-03 01:44:27,292 WARNING [train.py:1204] (2/4) Exclude cut with ID _69884_210_9771_1_1535846392818_4166169_430-201020-0 from training. Duration: 0.97 2023-10-03 01:44:28,734 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_71-136518-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 01:44:30,146 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_28-291982-0_sp1.1 from training. Duration: 0.8818125 2023-10-03 01:44:31,439 WARNING [train.py:1204] (2/4) Exclude cut with ID _54218_210_12210_1_1534413646388_4409259_437-275326-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 01:44:34,188 WARNING [train.py:1204] (2/4) Exclude cut with ID _55134_210_21982_1_1534590028377_3882170_40-309139-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 01:44:38,472 WARNING [train.py:1204] (2/4) Exclude cut with ID _32704_210_8954_1_1533017488840_6747370_266-329755-0_sp1.1 from training. Duration: 0.8090625 2023-10-03 01:44:39,796 WARNING [train.py:1204] (2/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_21-174514-0_sp1.1 from training. Duration: 0.3909375 2023-10-03 01:44:41,664 WARNING [train.py:1204] (2/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_409-207378-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 01:44:43,320 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.feed_forward1.out_proj.dropout_p, batch_count=1093893.3333333333, ans=0.1 2023-10-03 01:44:49,011 WARNING [train.py:1204] (2/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_132-269633-0 from training. Duration: 0.68 2023-10-03 01:44:50,853 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_2245_210_2715_1_1525511548_3540254_310-52483-0_sp0.9 from training. Duration: 0.97775 2023-10-03 01:44:52,328 WARNING [train.py:1204] (2/4) Exclude cut with ID _27430_210_7770_1_1532604689375_1378590_47-77403-0_sp1.1 from training. Duration: 0.92725 2023-10-03 01:44:55,189 WARNING [train.py:1204] (2/4) Exclude cut with ID _7562_210_2084_1_1528887767916_3946229_186-326422-0 from training. Duration: 0.84 2023-10-03 01:44:56,618 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_14056_210_4163_1_1530604483_7572827_230-35993-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 01:45:00,535 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_23854_210_1794_1_1531969696_1329157_57-123491-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 01:45:01,915 WARNING [train.py:1204] (2/4) Exclude cut with ID _14747_210_8341_1_1530685797463_3654089_147-131229-0 from training. Duration: 0.76 2023-10-03 01:45:02,015 WARNING [train.py:1204] (2/4) Exclude cut with ID _30191_210_1664_1_1533189915047_7196670_430-19539-0_sp1.1 from training. Duration: 0.92725 2023-10-03 01:45:02,027 WARNING [train.py:1204] (2/4) Exclude cut with ID _67725_210_27767_1_1535608756671_5105359_44-50762-0_sp1.1 from training. Duration: 0.963625 2023-10-03 01:45:04,842 WARNING [train.py:1204] (2/4) Exclude cut with ID _27485_210_4916_1_1532739191677_3668269_217-161755-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 01:45:04,886 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_5931_210_2040_1_1527428481_4547999_321-193614-0_sp1.1 from training. Duration: 0.6090625 2023-10-03 01:45:06,169 WARNING [train.py:1204] (2/4) Exclude cut with ID _6267_210_2084_1_1527764658526_3894730_375-219887-0 from training. Duration: 0.67 2023-10-03 01:45:06,244 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9087W0022-538393 from training. Duration: 0.768 2023-10-03 01:45:06,475 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.conv_module1.balancer1.min_positive, batch_count=1094026.6666666667, ans=0.025 2023-10-03 01:45:07,654 WARNING [train.py:1204] (2/4) Exclude cut with ID _68777_210_4353_1_1535810390731_3117904_83-196459-0_sp1.1 from training. Duration: 0.963625 2023-10-03 01:45:10,845 WARNING [train.py:1204] (2/4) Exclude cut with ID _69884_210_9771_1_1535846392818_4166169_455-343812-0_sp1.1 from training. Duration: 0.92725 2023-10-03 01:45:10,845 WARNING [train.py:1204] (2/4) Exclude cut with ID _13916_210_9994_1_1530788341126_3763149_414-320174-0_sp1.1 from training. Duration: 0.92725 2023-10-03 01:45:10,856 WARNING [train.py:1204] (2/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_398-239819-0 from training. Duration: 0.99 2023-10-03 01:45:12,099 WARNING [train.py:1204] (2/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_199-232314-0_sp1.1 from training. Duration: 0.92725 2023-10-03 01:45:15,422 WARNING [train.py:1204] (2/4) Exclude cut with ID _2334_210_2667_1_1525608238880_4705129_459-104997-0 from training. Duration: 0.95 2023-10-03 01:45:18,194 WARNING [train.py:1204] (2/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_441-209395-0_sp1.1 from training. Duration: 0.936375 2023-10-03 01:45:18,266 WARNING [train.py:1204] (2/4) Exclude cut with ID _27052_210_5986_1_1532329197187_3585009_320-80393-0_sp1.1 from training. Duration: 0.97275 2023-10-03 01:45:20,095 INFO [train.py:1046] (2/4) Epoch 31, batch 4750, loss[loss=0.1551, simple_loss=0.2458, pruned_loss=0.03218, over 24610.00 frames. ], tot_loss[loss=0.163, simple_loss=0.2416, pruned_loss=0.04221, over 4737031.42 frames. ], batch size: 73, lr: 3.27e-03, grad_scale: 8.0 2023-10-03 01:45:21,361 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.603e+02 1.883e+02 2.081e+02 2.313e+02 2.638e+02, threshold=4.163e+02, percent-clipped=0.0 2023-10-03 01:45:22,958 WARNING [train.py:1204] (2/4) Exclude cut with ID _21783_210_11357_1_1531742458730_3762679_89-217253-0_sp1.1 from training. Duration: 0.97275 2023-10-03 01:45:22,972 WARNING [train.py:1204] (2/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_453-282588-0_sp1.1 from training. Duration: 0.8909375 2023-10-03 01:45:23,215 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.bypass.scale_min, batch_count=1094093.3333333333, ans=0.2 2023-10-03 01:45:24,341 WARNING [train.py:1204] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_88-341718-0 from training. Duration: 0.66 2023-10-03 01:45:25,651 WARNING [train.py:1204] (2/4) Exclude cut with ID _43552_210_8913_1_1533726031059_3523429_230-3521-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 01:45:28,620 WARNING [train.py:1204] (2/4) Exclude cut with ID _9679_210_7497_1_1529129212906_4363929_512-313889-0 from training. Duration: 0.81 2023-10-03 01:45:31,344 WARNING [train.py:1204] (2/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_194-252800-0_sp1.1 from training. Duration: 0.8818125 2023-10-03 01:45:31,365 WARNING [train.py:1204] (2/4) Exclude cut with ID _55209_210_1531_1_1534675828996_4597160_239-293541-0_sp1.1 from training. Duration: 0.963625 2023-10-03 01:45:31,411 WARNING [train.py:1204] (2/4) Exclude cut with ID _35799_210_5854_1_1533263487887_3529071_15-325131-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 01:45:34,436 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.self_attn_weights.pos_emb_skip_rate, batch_count=1094160.0, ans=0.0 2023-10-03 01:45:36,856 WARNING [train.py:1204] (2/4) Exclude cut with ID _14100_210_8341_1_1530529320613_3678069_279-55760-0 from training. Duration: 0.93 2023-10-03 01:45:41,351 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_489-230703-0_sp1.1 from training. Duration: 0.77275 2023-10-03 01:45:42,823 WARNING [train.py:1204] (2/4) Exclude cut with ID _6694_210_4599_1_1527852637490_3965529_144-185051-0 from training. Duration: 0.96 2023-10-03 01:45:42,882 WARNING [train.py:1204] (2/4) Exclude cut with ID _28461_210_2253_1_1532771775189_3041299_1-228155-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 01:45:48,055 WARNING [train.py:1204] (2/4) Exclude cut with ID _32704_210_8954_1_1533017488840_6747370_686-252323-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 01:45:48,057 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_40896_210_15710_1_1533433956_7100802_1075-294633-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 01:45:48,070 WARNING [train.py:1204] (2/4) Exclude cut with ID _22083_210_8254_1_1531702862439_3678970_30-92879-0_sp1.1 from training. Duration: 0.97275 2023-10-03 01:45:48,155 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9245W0121-616888 from training. Duration: 0.896 2023-10-03 01:45:48,157 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6786_210_1388_1_1527839699_4194495_378-46070-0 from training. Duration: 0.72 2023-10-03 01:45:54,145 WARNING [train.py:1204] (2/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_317-175062-0 from training. Duration: 0.82 2023-10-03 01:45:56,887 WARNING [train.py:1204] (2/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_238-89390-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 01:45:59,572 WARNING [train.py:1204] (2/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_524-310811-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 01:46:01,115 WARNING [train.py:1204] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_307-158462-0_sp0.9 from training. Duration: 0.7666875 2023-10-03 01:46:01,116 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9309W0191-640318 from training. Duration: 0.512 2023-10-03 01:46:01,120 WARNING [train.py:1204] (2/4) Exclude cut with ID _55153_210_2253_1_1534384786677_3162096_100-204617-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 01:46:05,305 WARNING [train.py:1204] (2/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_112-149213-0_sp1.1 from training. Duration: 0.863625 2023-10-03 01:46:06,785 WARNING [train.py:1204] (2/4) Exclude cut with ID _67831_210_12392_1_1535612464202_3092612_346-2148-0_sp1.1 from training. Duration: 0.8454375 2023-10-03 01:46:09,502 WARNING [train.py:1204] (2/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_51-117576-0_sp0.9 from training. Duration: 0.4444375 2023-10-03 01:46:09,531 WARNING [train.py:1204] (2/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_300-27235-0 from training. Duration: 0.87 2023-10-03 01:46:10,846 WARNING [train.py:1204] (2/4) Exclude cut with ID _40399_210_4353_1_1533305658194_3279406_195-116825-0_sp1.1 from training. Duration: 0.97275 2023-10-03 01:46:10,882 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7002_210_1794_1_1527939683_5157286_163-87552-0_sp0.9 from training. Duration: 0.9555625 2023-10-03 01:46:12,888 WARNING [train.py:1204] (2/4) Exclude cut with ID _19514_210_9259_1_1531530071330_5084230_560-237597-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 01:46:12,947 WARNING [train.py:1204] (2/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_41-234353-0_sp1.1 from training. Duration: 0.6181875 2023-10-03 01:46:12,984 WARNING [train.py:1204] (2/4) Exclude cut with ID _6274_210_2084_1_1527937583837_7494020_769-168302-0 from training. Duration: 0.71 2023-10-03 01:46:14,336 WARNING [train.py:1204] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_574-45541-0 from training. Duration: 0.65 2023-10-03 01:46:14,596 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.feed_forward3.hidden_balancer.prob, batch_count=1094293.3333333333, ans=0.125 2023-10-03 01:46:18,772 WARNING [train.py:1204] (2/4) Exclude cut with ID _50310_210_15488_1_1534068043345_3641080_262-67120-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 01:46:19,013 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.balancer1.prob, batch_count=1094360.0, ans=0.125 2023-10-03 01:46:21,955 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_53071_210_16554_1_1534493434_1276796_35-11927-0_sp1.1 from training. Duration: 0.963625 2023-10-03 01:46:21,957 WARNING [train.py:1204] (2/4) Exclude cut with ID _3992_210_1747_1_1526644816031_3731060_107-251434-0 from training. Duration: 0.99 2023-10-03 01:46:23,866 WARNING [train.py:1204] (2/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_412-54488-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 01:46:25,230 WARNING [train.py:1204] (2/4) Exclude cut with ID _18516_210_9206_1_1531704653206_3692406_77-258490-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 01:46:26,582 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_40047_210_2290_1_1533290519_2374311_195-165693-0_sp0.9 from training. Duration: 0.988875 2023-10-03 01:46:26,634 WARNING [train.py:1204] (2/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_296-36971-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 01:46:27,953 WARNING [train.py:1204] (2/4) Exclude cut with ID _14970_210_15709_1_1530773543493_1715730_187-20259-0_sp1.1 from training. Duration: 0.8090625 2023-10-03 01:46:29,452 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_53373_210_5399_1_1534229471_4715695_135-305927-0_sp1.1 from training. Duration: 0.936375 2023-10-03 01:46:30,736 WARNING [train.py:1204] (2/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_60-18123-0 from training. Duration: 0.88 2023-10-03 01:46:32,059 WARNING [train.py:1204] (2/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_166-166990-0 from training. Duration: 0.96 2023-10-03 01:46:33,411 INFO [train.py:1046] (2/4) Epoch 31, batch 4800, loss[loss=0.1852, simple_loss=0.2539, pruned_loss=0.05826, over 23752.00 frames. ], tot_loss[loss=0.1639, simple_loss=0.2427, pruned_loss=0.04258, over 4735580.36 frames. ], batch size: 164, lr: 3.27e-03, grad_scale: 16.0 2023-10-03 01:46:33,472 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_5438_210_1794_1_1527336687_2600009_233-58072-0 from training. Duration: 0.77 2023-10-03 01:46:33,649 WARNING [train.py:1204] (2/4) Exclude cut with ID _8833_210_5219_1_1528592665210_7553059_611-80313-0_sp0.9 from training. Duration: 0.711125 2023-10-03 01:46:34,833 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_53555_210_5632_1_1534595220_7232297_118-43239-0_sp1.1 from training. Duration: 0.936375 2023-10-03 01:46:34,916 WARNING [train.py:1204] (2/4) Exclude cut with ID _12369_210_4381_1_1530356078581_4388520_480-13146-0 from training. Duration: 0.85 2023-10-03 01:46:36,730 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.feed_forward3.hidden_balancer.prob, batch_count=1094426.6666666667, ans=0.125 2023-10-03 01:46:38,085 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.0.nonlin_attention.balancer.prob, batch_count=1094426.6666666667, ans=0.125 2023-10-03 01:46:38,100 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.1.bypass.scale_min, batch_count=1094426.6666666667, ans=0.2 2023-10-03 01:46:39,256 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_892-28814-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 01:46:39,301 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_24899_210_16554_1_1532046422_3827462_160-21255-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 01:46:45,190 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_154-316450-0_sp1.1 from training. Duration: 0.6909375 2023-10-03 01:46:46,458 WARNING [train.py:1204] (2/4) Exclude cut with ID _30196_210_1664_1_1533708496522_7074140_1225-301232-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 01:46:46,490 WARNING [train.py:1204] (2/4) Exclude cut with ID _28783_210_7943_1_1532671164808_7155479_414-208100-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 01:46:46,539 WARNING [train.py:1204] (2/4) Exclude cut with ID _19023_210_4353_1_1531378892552_5820780_83-254794-0 from training. Duration: 0.86 2023-10-03 01:46:47,052 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.4.encoder.layers.1.whiten, num_groups=1, num_channels=384, metric=4.69 vs. limit=12.0 2023-10-03 01:46:48,660 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.conv_module2.balancer2.min_abs, batch_count=1094493.3333333333, ans=0.5 2023-10-03 01:46:50,236 WARNING [train.py:1204] (2/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_591-83752-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 01:46:50,269 WARNING [train.py:1204] (2/4) Exclude cut with ID _29877_210_6228_1_1532397791106_5276149_266-62291-0_sp1.1 from training. Duration: 0.7909375 2023-10-03 01:46:52,066 WARNING [train.py:1204] (2/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_280-161633-0_sp1.1 from training. Duration: 0.663625 2023-10-03 01:46:54,935 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7543_210_6753_1_1528539468_333764_39-156634-0_sp1.1 from training. Duration: 0.97275 2023-10-03 01:46:57,899 WARNING [train.py:1204] (2/4) Exclude cut with ID _67499_210_17282_1_1535509805220_3692430_505-312248-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 01:46:57,929 WARNING [train.py:1204] (2/4) Exclude cut with ID _5294_210_1747_1_1527249643928_3696179_180-100713-0_sp0.9 from training. Duration: 0.9 2023-10-03 01:46:58,011 WARNING [train.py:1204] (2/4) Exclude cut with ID _26528_210_9070_1_1532570410842_3851310_508-148132-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 01:46:58,021 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_211-339622-0_sp1.1 from training. Duration: 0.4090625 2023-10-03 01:46:58,033 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_40896_210_15710_1_1533433956_7100802_339-138356-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 01:46:58,208 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.conv_module1.balancer2.prob, batch_count=1094493.3333333333, ans=0.125 2023-10-03 01:46:59,507 WARNING [train.py:1204] (2/4) Exclude cut with ID _27060_210_5986_1_1532674838922_3522549_211-19933-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 01:47:00,933 WARNING [train.py:1204] (2/4) Exclude cut with ID _60229_210_27767_1_1534838406158_3655350_457-117923-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 01:47:03,649 WARNING [train.py:1204] (2/4) Exclude cut with ID _55429_210_10626_1_1534467588527_7174143_460-141439-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 01:47:04,974 WARNING [train.py:1204] (2/4) Exclude cut with ID _59170_210_6721_1_1534983633960_4203335_263-10780-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 01:47:06,221 WARNING [train.py:1204] (2/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_872-264525-0_sp0.9 from training. Duration: 0.72225 2023-10-03 01:47:07,500 WARNING [train.py:1204] (2/4) Exclude cut with ID _7624_210_1531_1_1528109802198_4933101_418-349834-0_sp0.9 from training. Duration: 0.6666875 2023-10-03 01:47:07,701 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.balancer_ff3.min_abs, batch_count=1094560.0, ans=0.2 2023-10-03 01:47:10,188 WARNING [train.py:1204] (2/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_27-53998-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 01:47:10,313 WARNING [train.py:1204] (2/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_202-183922-0 from training. Duration: 0.85 2023-10-03 01:47:10,332 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7002_210_1794_1_1527939683_5157286_1-281264-0 from training. Duration: 0.44 2023-10-03 01:47:11,696 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_53753_210_7234_1_1534320003_3594148_493-9314-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 01:47:12,958 WARNING [train.py:1204] (2/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_217-95056-0_sp1.1 from training. Duration: 0.8909375 2023-10-03 01:47:13,007 WARNING [train.py:1204] (2/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_406-45086-0_sp1.1 from training. Duration: 0.72725 2023-10-03 01:47:13,019 WARNING [train.py:1204] (2/4) Exclude cut with ID _31649_210_6228_1_1532584608063_4091078_505-213821-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 01:47:13,029 WARNING [train.py:1204] (2/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_317-175062-0_sp0.9 from training. Duration: 0.911125 2023-10-03 01:47:16,273 WARNING [train.py:1204] (2/4) Exclude cut with ID _5444_210_2151_1_1527418808464_5312210_726-27165-0_sp1.1 from training. Duration: 0.6090625 2023-10-03 01:47:16,311 WARNING [train.py:1204] (2/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_494-170437-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 01:47:18,416 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder_embed.conv.2.prob, batch_count=1094626.6666666667, ans=0.125 2023-10-03 01:47:21,526 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_474-64054-0_sp1.1 from training. Duration: 0.936375 2023-10-03 01:47:24,745 WARNING [train.py:1204] (2/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_521-7556-0_sp1.1 from training. Duration: 0.97275 2023-10-03 01:47:24,862 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_269-348505-0_sp1.1 from training. Duration: 0.92725 2023-10-03 01:47:29,219 WARNING [train.py:1204] (2/4) Exclude cut with ID _21878_210_3525_1_1531738772002_7264330_559-184862-0 from training. Duration: 0.94 2023-10-03 01:47:29,247 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_68702_210_23689_1_1535799577_3420068_92-76040-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 01:47:30,599 WARNING [train.py:1204] (2/4) Exclude cut with ID _22826_210_9994_1_1531908201522_3279169_227-102052-0_sp1.1 from training. Duration: 0.97275 2023-10-03 01:47:30,629 WARNING [train.py:1204] (2/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_718-250907-0_sp0.9 from training. Duration: 0.7666875 2023-10-03 01:47:30,703 WARNING [train.py:1204] (2/4) Exclude cut with ID _14069_210_5632_1_1530512692033_4803390_352-143891-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 01:47:34,960 WARNING [train.py:1204] (2/4) Exclude cut with ID _6267_210_2084_1_1527764658526_3894730_65-106602-0_sp1.1 from training. Duration: 0.936375 2023-10-03 01:47:37,579 WARNING [train.py:1204] (2/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_202-183922-0_sp0.9 from training. Duration: 0.9444375 2023-10-03 01:47:37,594 WARNING [train.py:1204] (2/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_465-268827-0_sp1.1 from training. Duration: 0.97275 2023-10-03 01:47:37,646 WARNING [train.py:1204] (2/4) Exclude cut with ID _12369_210_4381_1_1530356078581_4388520_58-29064-0_sp1.1 from training. Duration: 0.9 2023-10-03 01:47:37,675 WARNING [train.py:1204] (2/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_1-99680-0_sp0.9 from training. Duration: 0.7444375 2023-10-03 01:47:39,030 WARNING [train.py:1204] (2/4) Exclude cut with ID _13253_210_1925_1_1530757775819_3577873_191-144977-0_sp1.1 from training. Duration: 0.7090625 2023-10-03 01:47:42,991 WARNING [train.py:1204] (2/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_276-112684-0_sp1.1 from training. Duration: 0.92725 2023-10-03 01:47:43,001 WARNING [train.py:1204] (2/4) Exclude cut with ID _64639_210_5816_1_1535367619437_3528000_358-411-0_sp1.1 from training. Duration: 0.97275 2023-10-03 01:47:43,008 WARNING [train.py:1204] (2/4) Exclude cut with ID _31649_210_6228_1_1532584608063_4091078_106-21199-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 01:47:44,370 WARNING [train.py:1204] (2/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_901-79438-0 from training. Duration: 0.94 2023-10-03 01:47:46,481 WARNING [train.py:1204] (2/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_718-250907-0 from training. Duration: 0.69 2023-10-03 01:47:46,487 WARNING [train.py:1204] (2/4) Exclude cut with ID _5287_210_2084_1_1527418883040_3878670_363-194161-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 01:47:46,490 WARNING [train.py:1204] (2/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_586-321321-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 01:47:47,763 INFO [train.py:1046] (2/4) Epoch 31, batch 4850, loss[loss=0.1528, simple_loss=0.2437, pruned_loss=0.03091, over 24406.00 frames. ], tot_loss[loss=0.1638, simple_loss=0.2426, pruned_loss=0.04246, over 4726936.70 frames. ], batch size: 69, lr: 3.27e-03, grad_scale: 16.0 2023-10-03 01:47:47,866 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_68900_210_2087_1_1535713157_3708529_164-292661-0_sp1.1 from training. Duration: 0.963625 2023-10-03 01:47:47,867 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_26433_210_12108_1_1532157158_4056480_114-144878-0_sp1.1 from training. Duration: 0.97275 2023-10-03 01:47:49,139 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.492e+02 1.923e+02 2.074e+02 2.370e+02 4.081e+02, threshold=4.148e+02, percent-clipped=0.0 2023-10-03 01:47:51,125 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_15517_210_6753_1_1530954008_3487336_96-341874-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 01:47:58,060 WARNING [train.py:1204] (2/4) Exclude cut with ID _14268_210_9206_1_1530622771285_4714099_637-86630-0 from training. Duration: 0.51 2023-10-03 01:47:59,422 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_28211_210_6388_1_1532861506_4523224_121-320092-0_sp1.1 from training. Duration: 0.92725 2023-10-03 01:48:04,968 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_141-89113-0_sp0.9 from training. Duration: 0.9555625 2023-10-03 01:48:06,278 WARNING [train.py:1204] (2/4) Exclude cut with ID _6789_210_1534_1_1527857937218_3118950_311-61262-0_sp1.1 from training. Duration: 0.5909375 2023-10-03 01:48:06,299 WARNING [train.py:1204] (2/4) Exclude cut with ID _58480_210_12392_1_1534811121111_3646160_389-200461-0_sp1.1 from training. Duration: 0.97275 2023-10-03 01:48:08,999 WARNING [train.py:1204] (2/4) Exclude cut with ID _41326_210_5854_1_1533778076509_3492080_28-156553-0_sp1.1 from training. Duration: 0.92725 2023-10-03 01:48:10,400 WARNING [train.py:1204] (2/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_577-287150-0_sp1.1 from training. Duration: 0.7454375 2023-10-03 01:48:11,692 WARNING [train.py:1204] (2/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_40-129756-0_sp1.1 from training. Duration: 0.736375 2023-10-03 01:48:11,702 WARNING [train.py:1204] (2/4) Exclude cut with ID _60113_210_16271_1_1535164212481_4386079_490-280290-0 from training. Duration: 0.98 2023-10-03 01:48:14,527 WARNING [train.py:1204] (2/4) Exclude cut with ID _58059_210_5854_1_1534901471149_3164808_34-39905-0_sp1.1 from training. Duration: 0.963625 2023-10-03 01:48:17,877 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_791-197891-0_sp1.1 from training. Duration: 0.763625 2023-10-03 01:48:17,893 WARNING [train.py:1204] (2/4) Exclude cut with ID _6789_210_1534_1_1527857937218_3118950_262-48853-0_sp1.1 from training. Duration: 0.5090625 2023-10-03 01:48:17,969 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_2655_210_2087_1_1525688959_3897400_52-279899-0_sp0.9 from training. Duration: 0.7666875 2023-10-03 01:48:17,973 WARNING [train.py:1204] (2/4) Exclude cut with ID _14884_210_2252_1_1530707459689_5561250_114-201399-0 from training. Duration: 0.76 2023-10-03 01:48:22,176 WARNING [train.py:1204] (2/4) Exclude cut with ID _19023_210_4353_1_1531378892552_5820780_83-254794-0_sp0.9 from training. Duration: 0.9555625 2023-10-03 01:48:22,200 WARNING [train.py:1204] (2/4) Exclude cut with ID _53489_210_6228_1_1534298357169_3835970_10-297919-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 01:48:26,215 WARNING [train.py:1204] (2/4) Exclude cut with ID _44585_210_18628_1_1533605350387_4518199_418-3055-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 01:48:26,232 WARNING [train.py:1204] (2/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_310-109461-0 from training. Duration: 0.8 2023-10-03 01:48:27,545 WARNING [train.py:1204] (2/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_235-262124-0 from training. Duration: 0.67 2023-10-03 01:48:27,618 WARNING [train.py:1204] (2/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_195-167011-0_sp1.1 from training. Duration: 0.6818125 2023-10-03 01:48:33,394 WARNING [train.py:1204] (2/4) Exclude cut with ID _7852_210_5219_1_1528286471316_8139400_188-55162-0_sp1.1 from training. Duration: 0.936375 2023-10-03 01:48:34,866 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_35361_210_4238_1_1533262810_7793476_675-87012-0 from training. Duration: 0.89 2023-10-03 01:48:34,954 WARNING [train.py:1204] (2/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_465-81427-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 01:48:36,217 WARNING [train.py:1204] (2/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_597-154640-0_sp1.1 from training. Duration: 0.8181875 2023-10-03 01:48:37,666 WARNING [train.py:1204] (2/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_186-309161-0_sp0.9 from training. Duration: 0.6 2023-10-03 01:48:38,085 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.5.encoder.layers.1.feed_forward3.out_whiten, num_groups=1, num_channels=256, metric=11.46 vs. limit=15.0 2023-10-03 01:48:38,881 WARNING [train.py:1204] (2/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_546-119312-0 from training. Duration: 0.99 2023-10-03 01:48:38,885 WARNING [train.py:1204] (2/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_330-240060-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 01:48:40,438 WARNING [train.py:1204] (2/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_477-255782-0 from training. Duration: 0.82 2023-10-03 01:48:40,460 WARNING [train.py:1204] (2/4) Exclude cut with ID _14970_210_15709_1_1530773543493_1715730_150-248939-0_sp1.1 from training. Duration: 0.97275 2023-10-03 01:48:43,095 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_556-206615-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 01:48:43,145 WARNING [train.py:1204] (2/4) Exclude cut with ID _3465_210_3525_1_1526175667060_3574049_308-231735-0 from training. Duration: 0.73 2023-10-03 01:48:52,402 WARNING [train.py:1204] (2/4) Exclude cut with ID _27485_210_4916_1_1532739191677_3668269_66-236571-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 01:48:58,002 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.4.encoder.layers.2.whiten, num_groups=1, num_channels=384, metric=3.64 vs. limit=12.0 2023-10-03 01:49:01,669 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_2221_210_1988_1_1525597051_4935541_415-296882-0_sp0.9 from training. Duration: 0.8555625 2023-10-03 01:49:01,677 WARNING [train.py:1204] (2/4) Exclude cut with ID _64521_210_14699_1_1535243427259_3815328_231-78630-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 01:49:03,074 INFO [train.py:1046] (2/4) Epoch 31, batch 4900, loss[loss=0.1401, simple_loss=0.2176, pruned_loss=0.03127, over 24601.00 frames. ], tot_loss[loss=0.163, simple_loss=0.242, pruned_loss=0.04198, over 4721920.26 frames. ], batch size: 60, lr: 3.27e-03, grad_scale: 16.0 2023-10-03 01:49:05,944 WARNING [train.py:1204] (2/4) Exclude cut with ID _13942_210_9988_1_1530604906036_3733360_499-55676-0 from training. Duration: 0.62 2023-10-03 01:49:05,947 WARNING [train.py:1204] (2/4) Exclude cut with ID _49376_210_5986_1_1533985278732_3370740_301-154763-0_sp1.1 from training. Duration: 0.8909375 2023-10-03 01:49:11,599 WARNING [train.py:1204] (2/4) Exclude cut with ID _27485_210_4916_1_1532739191677_3668269_255-216698-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 01:49:13,028 WARNING [train.py:1204] (2/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_137-334143-0_sp1.1 from training. Duration: 0.97275 2023-10-03 01:49:13,046 WARNING [train.py:1204] (2/4) Exclude cut with ID _6456_210_1534_1_1527681634212_3181049_215-309359-0_sp1.1 from training. Duration: 0.8 2023-10-03 01:49:14,442 WARNING [train.py:1204] (2/4) Exclude cut with ID _65640_210_6310_1_1535368201445_3173250_209-13008-0 from training. Duration: 0.99 2023-10-03 01:49:18,905 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.self_attn_weights.pos_emb_skip_rate, batch_count=1095160.0, ans=0.0 2023-10-03 01:49:20,047 WARNING [train.py:1204] (2/4) Exclude cut with ID _12369_210_4381_1_1530356078581_4388520_58-29064-0 from training. Duration: 0.99 2023-10-03 01:49:20,365 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.bypass.skip_rate, batch_count=1095160.0, ans=0.07 2023-10-03 01:49:22,818 WARNING [train.py:1204] (2/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_410-287857-0 from training. Duration: 0.93 2023-10-03 01:49:24,714 WARNING [train.py:1204] (2/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_153-153537-0 from training. Duration: 0.91 2023-10-03 01:49:24,751 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7544_210_6753_1_1528587722_3576686_172-171750-0_sp0.9 from training. Duration: 0.72225 2023-10-03 01:49:24,778 WARNING [train.py:1204] (2/4) Exclude cut with ID _64521_210_14699_1_1535243427259_3815328_464-183853-0_sp1.1 from training. Duration: 0.97275 2023-10-03 01:49:24,793 WARNING [train.py:1204] (2/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_385-210308-0_sp1.1 from training. Duration: 0.963625 2023-10-03 01:49:24,807 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_46013_210_7234_1_1533776362_3744032_354-261865-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 01:49:24,813 WARNING [train.py:1204] (2/4) Exclude cut with ID _3067_210_2667_1_1526212974640_4503168_250-285937-0_sp1.1 from training. Duration: 0.72725 2023-10-03 01:49:26,171 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_40036_210_13405_1_1533648647_1315388_48-146016-0 from training. Duration: 0.93 2023-10-03 01:49:29,022 WARNING [train.py:1204] (2/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_280-161633-0 from training. Duration: 0.73 2023-10-03 01:49:30,313 WARNING [train.py:1204] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_308-161067-0_sp0.9 from training. Duration: 0.7333125 2023-10-03 01:49:30,406 WARNING [train.py:1204] (2/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_68-212801-0_sp0.9 from training. Duration: 0.811125 2023-10-03 01:49:31,606 WARNING [train.py:1204] (2/4) Exclude cut with ID _6789_210_1534_1_1527857937218_3118950_311-61262-0_sp0.9 from training. Duration: 0.72225 2023-10-03 01:49:33,038 WARNING [train.py:1204] (2/4) Exclude cut with ID _14632_210_9988_1_1530691279392_3923200_102-204477-0_sp1.1 from training. Duration: 0.8818125 2023-10-03 01:49:34,433 WARNING [train.py:1204] (2/4) Exclude cut with ID _65242_210_18628_1_1535369393717_3823149_290-117517-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 01:49:35,937 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_69630_210_7234_1_1535788670_910989_35-337140-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 01:49:35,946 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_5618_210_1874_1_1527311008_5851_1-214501-0 from training. Duration: 0.689 2023-10-03 01:49:37,321 WARNING [train.py:1204] (2/4) Exclude cut with ID _8064_210_5399_1_1528372299291_4282973_168-50337-0_sp0.9 from training. Duration: 0.9333125 2023-10-03 01:49:38,661 WARNING [train.py:1204] (2/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_185-14438-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 01:49:38,675 WARNING [train.py:1204] (2/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_61-327062-0 from training. Duration: 0.81 2023-10-03 01:49:38,688 WARNING [train.py:1204] (2/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_98-741-0 from training. Duration: 0.8 2023-10-03 01:49:43,041 WARNING [train.py:1204] (2/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_312-204798-0 from training. Duration: 0.93 2023-10-03 01:49:44,405 WARNING [train.py:1204] (2/4) Exclude cut with ID _14998_210_5219_1_1530705582080_7474970_787-294416-0_sp0.9 from training. Duration: 0.911125 2023-10-03 01:49:45,855 WARNING [train.py:1204] (2/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_140-181176-0_sp1.1 from training. Duration: 0.836375 2023-10-03 01:49:45,884 WARNING [train.py:1204] (2/4) Exclude cut with ID _14687_210_5854_1_1530703838259_3664889_115-239046-0_sp1.1 from training. Duration: 0.6090625 2023-10-03 01:49:47,256 WARNING [train.py:1204] (2/4) Exclude cut with ID _56423_210_5854_1_1534896061049_3165969_370-230614-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 01:49:47,301 WARNING [train.py:1204] (2/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_406-275649-0_sp0.9 from training. Duration: 0.4666875 2023-10-03 01:49:47,315 WARNING [train.py:1204] (2/4) Exclude cut with ID _15150_210_8341_1_1530752411803_7256384_22-138274-0_sp1.1 from training. Duration: 0.87275 2023-10-03 01:49:48,729 WARNING [train.py:1204] (2/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_125-125818-0 from training. Duration: 0.96 2023-10-03 01:49:50,156 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_4399_210_1874_1_1526705485_2400757_220-47974-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 01:49:52,134 WARNING [train.py:1204] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_308-161067-0_sp1.1 from training. Duration: 0.6 2023-10-03 01:49:55,132 WARNING [train.py:1204] (2/4) Exclude cut with ID _53489_210_6228_1_1534298357169_3835970_102-57645-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 01:49:55,418 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.nonlin_attention.balancer.prob, batch_count=1095293.3333333333, ans=0.125 2023-10-03 01:49:56,622 WARNING [train.py:1204] (2/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_178-177582-0 from training. Duration: 0.74 2023-10-03 01:49:58,563 WARNING [train.py:1204] (2/4) Exclude cut with ID _14349_210_8341_1_1530615710818_3442470_170-336541-0_sp1.1 from training. Duration: 0.7818125 2023-10-03 01:49:58,630 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_521-214276-0_sp0.9 from training. Duration: 0.488875 2023-10-03 01:49:58,664 WARNING [train.py:1204] (2/4) Exclude cut with ID _66070_210_7640_1_1535441393096_7805310_489-292631-0 from training. Duration: 0.79 2023-10-03 01:50:04,223 WARNING [train.py:1204] (2/4) Exclude cut with ID _14069_210_5632_1_1530512692033_4803390_438-340023-0_sp1.1 from training. Duration: 0.936375 2023-10-03 01:50:06,886 WARNING [train.py:1204] (2/4) Exclude cut with ID _7852_210_5219_1_1528286471316_8139400_242-105336-0_sp1.1 from training. Duration: 0.6545625 2023-10-03 01:50:08,302 WARNING [train.py:1204] (2/4) Exclude cut with ID _3867_210_1883_1_1526641200575_4475830_272-215563-0 from training. Duration: 0.57 2023-10-03 01:50:08,310 WARNING [train.py:1204] (2/4) Exclude cut with ID _14368_210_4220_1_1530532714870_3595529_143-117421-0_sp0.9 from training. Duration: 0.6444375 2023-10-03 01:50:08,316 WARNING [train.py:1204] (2/4) Exclude cut with ID _9235_210_5219_1_1528891154253_8046120_30-326681-0_sp1.1 from training. Duration: 0.7545625 2023-10-03 01:50:09,690 WARNING [train.py:1204] (2/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_538-218448-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 01:50:13,907 WARNING [train.py:1204] (2/4) Exclude cut with ID _33640_210_2655_1_1533117605479_4855469_60-192970-0_sp1.1 from training. Duration: 0.963625 2023-10-03 01:50:13,914 WARNING [train.py:1204] (2/4) Exclude cut with ID _65585_210_12210_1_1535450451584_3764098_112-294301-0_sp1.1 from training. Duration: 0.8 2023-10-03 01:50:13,942 WARNING [train.py:1204] (2/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_643-37412-0_sp1.1 from training. Duration: 0.936375 2023-10-03 01:50:15,108 INFO [train.py:1046] (2/4) Epoch 31, batch 4950, loss[loss=0.1671, simple_loss=0.2528, pruned_loss=0.04069, over 24329.00 frames. ], tot_loss[loss=0.1623, simple_loss=0.2409, pruned_loss=0.04184, over 4712199.68 frames. ], batch size: 74, lr: 3.27e-03, grad_scale: 16.0 2023-10-03 01:50:15,167 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_9302_210_2550_1_1528889962_4655416_638-301620-0_sp1.1 from training. Duration: 0.42725 2023-10-03 01:50:15,394 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.bypass.skip_rate, batch_count=1095426.6666666667, ans=0.09899494936611666 2023-10-03 01:50:16,403 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.656e+02 1.910e+02 2.098e+02 2.377e+02 3.455e+02, threshold=4.195e+02, percent-clipped=0.0 2023-10-03 01:50:16,563 WARNING [train.py:1204] (2/4) Exclude cut with ID _3867_210_1883_1_1526641200575_4475830_272-215563-0_sp1.1 from training. Duration: 0.5181875 2023-10-03 01:50:19,311 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_53679_210_4852_1_1534556726_4186141_358-154143-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 01:50:19,322 WARNING [train.py:1204] (2/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_841-341679-0_sp0.9 from training. Duration: 0.6444375 2023-10-03 01:50:21,448 WARNING [train.py:1204] (2/4) Exclude cut with ID _7160_210_1876_1_1527940923231_3703799_302-150728-0 from training. Duration: 0.78 2023-10-03 01:50:22,756 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_125-203105-0 from training. Duration: 0.8 2023-10-03 01:50:22,770 WARNING [train.py:1204] (2/4) Exclude cut with ID _5585_210_2667_1_1527418813324_8007340_89-332072-0_sp0.9 from training. Duration: 0.711125 2023-10-03 01:50:22,856 WARNING [train.py:1204] (2/4) Exclude cut with ID _3992_210_1747_1_1526644816031_3731060_36-147855-0 from training. Duration: 0.78 2023-10-03 01:50:22,873 WARNING [train.py:1204] (2/4) Exclude cut with ID _59256_210_5986_1_1535076085997_3589120_172-239045-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 01:50:22,882 WARNING [train.py:1204] (2/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_472-216663-0_sp1.1 from training. Duration: 0.836375 2023-10-03 01:50:24,780 WARNING [train.py:1204] (2/4) Exclude cut with ID _36512_210_6987_1_1533033143998_3126069_181-306500-0_sp1.1 from training. Duration: 0.863625 2023-10-03 01:50:24,799 WARNING [train.py:1204] (2/4) Exclude cut with ID _56524_210_9642_1_1534572011779_3619170_492-95008-0_sp1.1 from training. Duration: 0.97275 2023-10-03 01:50:27,399 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_33348_210_5632_1_1533086912_3324030_119-90326-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 01:50:27,433 WARNING [train.py:1204] (2/4) Exclude cut with ID _14368_210_4220_1_1530532714870_3595529_231-137892-0_sp1.1 from training. Duration: 0.9 2023-10-03 01:50:28,811 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8453_210_4211_1_1528502940_8562730_596-306820-0_sp1.1 from training. Duration: 0.8818125 2023-10-03 01:50:30,678 WARNING [train.py:1204] (2/4) Exclude cut with ID _27052_210_5986_1_1532329197187_3585009_50-242750-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 01:50:32,207 WARNING [train.py:1204] (2/4) Exclude cut with ID _13913_210_9994_1_1530579615187_3383897_358-98435-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 01:50:32,225 WARNING [train.py:1204] (2/4) Exclude cut with ID _27013_210_1389_1_1532437173240_3252649_399-229778-0_sp1.1 from training. Duration: 0.963625 2023-10-03 01:50:35,096 WARNING [train.py:1204] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_185-174805-0_sp0.9 from training. Duration: 0.7555625 2023-10-03 01:50:39,139 WARNING [train.py:1204] (2/4) Exclude cut with ID _65585_210_12210_1_1535450451584_3764098_196-221014-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 01:50:41,828 WARNING [train.py:1204] (2/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_202-293523-0_sp1.1 from training. Duration: 0.6545625 2023-10-03 01:50:41,950 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8275_210_4198_1_1528455320_3892571_213-127634-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 01:50:43,272 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_40896_210_15710_1_1533433956_7100802_921-231678-0_sp1.1 from training. Duration: 0.97275 2023-10-03 01:50:44,658 WARNING [train.py:1204] (2/4) Exclude cut with ID _38020_210_5986_1_1533112252259_7006089_493-102745-0_sp1.1 from training. Duration: 0.8909375 2023-10-03 01:50:44,740 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6846_210_6753_1_1527850767_4295158_2-315073-0 from training. Duration: 0.88 2023-10-03 01:50:46,192 WARNING [train.py:1204] (2/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_249-281349-0 from training. Duration: 0.85 2023-10-03 01:50:48,332 WARNING [train.py:1204] (2/4) Exclude cut with ID _18513_210_9206_1_1531618215223_3862620_314-83429-0_sp1.1 from training. Duration: 0.97275 2023-10-03 01:50:51,234 WARNING [train.py:1204] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_215-202973-0_sp0.9 from training. Duration: 0.97775 2023-10-03 01:50:51,250 WARNING [train.py:1204] (2/4) Exclude cut with ID _7719_210_3525_1_1528456153163_4296960_88-250259-0_sp1.1 from training. Duration: 0.763625 2023-10-03 01:50:52,602 WARNING [train.py:1204] (2/4) Exclude cut with ID _7606_210_4915_1_1528894253277_5080690_301-10732-0_sp1.1 from training. Duration: 0.736375 2023-10-03 01:50:52,614 WARNING [train.py:1204] (2/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_818-11907-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 01:50:54,348 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_5959_210_1794_1_1527400400_3445726_412-44052-0_sp1.1 from training. Duration: 0.7 2023-10-03 01:50:55,747 WARNING [train.py:1204] (2/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_6-279195-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 01:50:57,279 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.conv_module2.balancer2.prob, batch_count=1095560.0, ans=0.125 2023-10-03 01:50:58,485 WARNING [train.py:1204] (2/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_252-216674-0_sp0.9 from training. Duration: 0.6 2023-10-03 01:51:00,019 WARNING [train.py:1204] (2/4) Exclude cut with ID _14368_210_4220_1_1530532714870_3595529_144-3287-0_sp1.1 from training. Duration: 0.6090625 2023-10-03 01:51:02,002 WARNING [train.py:1204] (2/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_342-3879-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 01:51:02,033 WARNING [train.py:1204] (2/4) Exclude cut with ID _33640_210_2655_1_1533117605479_4855469_584-65630-0_sp1.1 from training. Duration: 0.97275 2023-10-03 01:51:03,408 WARNING [train.py:1204] (2/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_37-189262-0 from training. Duration: 0.98 2023-10-03 01:51:03,450 WARNING [train.py:1204] (2/4) Exclude cut with ID _9389_210_1664_1_1529217337301_5289269_379-128794-0_sp0.9 from training. Duration: 0.9444375 2023-10-03 01:51:06,122 WARNING [train.py:1204] (2/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_119-207162-0_sp1.1 from training. Duration: 0.6454375 2023-10-03 01:51:07,663 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.bypass.skip_rate, batch_count=1095626.6666666667, ans=0.04949747468305833 2023-10-03 01:51:10,289 WARNING [train.py:1204] (2/4) Exclude cut with ID _2941_210_2577_1_1526199673150_2489759_200-333643-0_sp1.1 from training. Duration: 0.936375 2023-10-03 01:51:11,698 WARNING [train.py:1204] (2/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_479-125611-0_sp1.1 from training. Duration: 0.87275 2023-10-03 01:51:11,708 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_3475_210_2028_1_1526193578_3622569_64-297072-0_sp1.1 from training. Duration: 0.82725 2023-10-03 01:51:11,746 WARNING [train.py:1204] (2/4) Exclude cut with ID _53565_210_10635_1_1534337218335_3820259_139-346291-0_sp1.1 from training. Duration: 0.97275 2023-10-03 01:51:11,791 WARNING [train.py:1204] (2/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_588-125165-0_sp1.1 from training. Duration: 0.7545625 2023-10-03 01:51:13,156 WARNING [train.py:1204] (2/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_938-205135-0_sp1.1 from training. Duration: 0.7909375 2023-10-03 01:51:15,845 WARNING [train.py:1204] (2/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_216-48707-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 01:51:15,907 WARNING [train.py:1204] (2/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_477-255782-0_sp1.1 from training. Duration: 0.7454375 2023-10-03 01:51:15,941 WARNING [train.py:1204] (2/4) Exclude cut with ID _16909_210_5724_1_1531292079912_4118919_583-92842-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 01:51:17,237 WARNING [train.py:1204] (2/4) Exclude cut with ID _60860_210_8254_1_1534896015875_3557169_245-256666-0 from training. Duration: 0.97 2023-10-03 01:51:17,417 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.bypass.scale_min, batch_count=1095693.3333333333, ans=0.2 2023-10-03 01:51:21,933 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_45399_210_4852_1_1533624725_7579737_349-116173-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 01:51:26,501 WARNING [train.py:1204] (2/4) Exclude cut with ID _8699_210_2069_1_1528630346831_3960409_155-39766-0 from training. Duration: 0.78 2023-10-03 01:51:26,518 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_86-16881-0_sp1.1 from training. Duration: 0.52725 2023-10-03 01:51:29,245 INFO [train.py:1046] (2/4) Epoch 31, batch 5000, loss[loss=0.1676, simple_loss=0.2436, pruned_loss=0.04576, over 23699.00 frames. ], tot_loss[loss=0.1615, simple_loss=0.2399, pruned_loss=0.04156, over 4698328.16 frames. ], batch size: 232, lr: 3.27e-03, grad_scale: 16.0 2023-10-03 01:51:32,640 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_14056_210_4163_1_1530604483_7572827_191-159384-0_sp1.1 from training. Duration: 0.97275 2023-10-03 01:51:32,648 WARNING [train.py:1204] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_307-158462-0_sp1.1 from training. Duration: 0.62725 2023-10-03 01:51:34,150 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7544_210_6753_1_1528591327_1471426_33-346031-0 from training. Duration: 0.61 2023-10-03 01:51:35,499 WARNING [train.py:1204] (2/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_112-149213-0 from training. Duration: 0.95 2023-10-03 01:51:36,855 WARNING [train.py:1204] (2/4) Exclude cut with ID _55844_210_2346_1_1534471212569_7625707_925-191342-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 01:51:39,505 WARNING [train.py:1204] (2/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_87-278686-0 from training. Duration: 0.77 2023-10-03 01:51:39,539 WARNING [train.py:1204] (2/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_415-78792-0_sp1.1 from training. Duration: 0.736375 2023-10-03 01:51:39,558 WARNING [train.py:1204] (2/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_107-332398-0_sp0.9 from training. Duration: 0.8666875 2023-10-03 01:51:40,958 WARNING [train.py:1204] (2/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_72-251255-0 from training. Duration: 0.94 2023-10-03 01:51:40,980 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_431-227273-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 01:51:41,042 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7544_210_6753_1_1528587000_691468_87-178599-0_sp0.9 from training. Duration: 0.9666875 2023-10-03 01:51:41,225 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.1.feed_forward2.hidden_balancer.prob, batch_count=1095760.0, ans=0.125 2023-10-03 01:51:42,368 WARNING [train.py:1204] (2/4) Exclude cut with ID _13916_210_9994_1_1530788341126_3763149_60-191556-0 from training. Duration: 0.61 2023-10-03 01:51:42,370 WARNING [train.py:1204] (2/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_746-164782-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 01:51:42,415 WARNING [train.py:1204] (2/4) Exclude cut with ID _33640_210_2655_1_1533117605479_4855469_596-21066-0_sp1.1 from training. Duration: 0.92725 2023-10-03 01:51:43,893 WARNING [train.py:1204] (2/4) Exclude cut with ID _40402_210_18219_1_1533522324115_2953150_320-14078-0 from training. Duration: 0.95 2023-10-03 01:51:43,931 WARNING [train.py:1204] (2/4) Exclude cut with ID _29877_210_6228_1_1532397791106_5276149_266-62291-0 from training. Duration: 0.87 2023-10-03 01:51:45,315 WARNING [train.py:1204] (2/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_176-56120-0_sp0.9 from training. Duration: 0.92225 2023-10-03 01:51:46,608 WARNING [train.py:1204] (2/4) Exclude cut with ID _6275_210_2084_1_1528110062213_7669049_91-26919-0 from training. Duration: 0.97 2023-10-03 01:51:46,615 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_224-322394-0_sp0.9 from training. Duration: 0.7333125 2023-10-03 01:51:46,662 WARNING [train.py:1204] (2/4) Exclude cut with ID _44982_210_1511_1_1534312820108_3726029_586-297059-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 01:51:47,988 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_196-289637-0_sp1.1 from training. Duration: 0.6818125 2023-10-03 01:51:47,990 WARNING [train.py:1204] (2/4) Exclude cut with ID _18513_210_9206_1_1531618215223_3862620_402-5957-0 from training. Duration: 0.99 2023-10-03 01:51:47,998 WARNING [train.py:1204] (2/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_81-300212-0 from training. Duration: 0.6 2023-10-03 01:51:50,016 WARNING [train.py:1204] (2/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_166-128941-0 from training. Duration: 0.54 2023-10-03 01:51:50,032 WARNING [train.py:1204] (2/4) Exclude cut with ID _58059_210_5854_1_1534901471149_3164808_57-309660-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 01:51:50,303 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=1095826.6666666667, ans=0.1 2023-10-03 01:51:51,424 WARNING [train.py:1204] (2/4) Exclude cut with ID _60621_210_17071_1_1535013053530_3671739_73-155814-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 01:51:51,527 WARNING [train.py:1204] (2/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_726-270780-0 from training. Duration: 0.86 2023-10-03 01:51:53,216 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8853_210_3273_1_1528633899_818968_51-187685-0_sp1.1 from training. Duration: 0.62725 2023-10-03 01:51:54,589 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_315-212794-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 01:51:56,017 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_53373_210_5399_1_1534229471_4715695_314-122795-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 01:51:57,358 WARNING [train.py:1204] (2/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_187-221566-0_sp0.9 from training. Duration: 0.47775 2023-10-03 01:52:00,084 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_12868_210_6753_1_1530702479_3610564_133-564-0 from training. Duration: 0.79 2023-10-03 01:52:00,121 WARNING [train.py:1204] (2/4) Exclude cut with ID _55153_210_2253_1_1534384786677_3162096_255-228344-0_sp1.1 from training. Duration: 0.936375 2023-10-03 01:52:01,568 WARNING [train.py:1204] (2/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_383-165234-0_sp1.1 from training. Duration: 0.8545625 2023-10-03 01:52:04,412 WARNING [train.py:1204] (2/4) Exclude cut with ID IC0677W0056-334889 from training. Duration: 0.768 2023-10-03 01:52:06,450 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_14953_210_2151_1_1530701710_5616165_710-165400-0_sp0.9 from training. Duration: 0.9666875 2023-10-03 01:52:06,551 WARNING [train.py:1204] (2/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_355-236205-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 01:52:06,553 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_34450_210_9547_1_1533099443_4109050_503-246893-0_sp1.1 from training. Duration: 0.963625 2023-10-03 01:52:10,607 WARNING [train.py:1204] (2/4) Exclude cut with ID _43552_210_8913_1_1533726031059_3523429_25-120161-0 from training. Duration: 0.98 2023-10-03 01:52:10,637 WARNING [train.py:1204] (2/4) Exclude cut with ID _66112_210_10635_1_1535590236305_3801484_205-311930-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 01:52:10,664 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_48049_210_5632_1_1533864553_3519937_185-281218-0_sp1.1 from training. Duration: 0.92725 2023-10-03 01:52:11,876 WARNING [train.py:1204] (2/4) Exclude cut with ID _48782_210_12658_1_1534467701544_2300422_191-332415-0_sp1.1 from training. Duration: 0.97275 2023-10-03 01:52:13,383 WARNING [train.py:1204] (2/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_907-151398-0_sp1.1 from training. Duration: 0.336375 2023-10-03 01:52:13,432 WARNING [train.py:1204] (2/4) Exclude cut with ID _67499_210_17282_1_1535509805220_3692430_20-179346-0_sp1.1 from training. Duration: 0.8818125 2023-10-03 01:52:16,271 WARNING [train.py:1204] (2/4) Exclude cut with ID _14998_210_5219_1_1530705582080_7474970_441-91466-0_sp1.1 from training. Duration: 0.8818125 2023-10-03 01:52:18,187 WARNING [train.py:1204] (2/4) Exclude cut with ID _46681_210_9642_1_1533893005571_7603359_786-164007-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 01:52:24,468 WARNING [train.py:1204] (2/4) Exclude cut with ID _14970_210_15709_1_1530773543493_1715730_187-20259-0 from training. Duration: 0.89 2023-10-03 01:52:28,717 WARNING [train.py:1204] (2/4) Exclude cut with ID _12734_210_8341_1_1530183614296_3891929_383-339075-0_sp1.1 from training. Duration: 0.963625 2023-10-03 01:52:38,737 WARNING [train.py:1204] (2/4) Exclude cut with ID _37086_210_9038_1_1532999540891_7021250_834-322225-0_sp1.1 from training. Duration: 0.92725 2023-10-03 01:52:38,850 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_10987_210_4163_1_1529760555_4123902_56-290559-0_sp1.1 from training. Duration: 0.963625 2023-10-03 01:52:38,857 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_92-246270-0_sp0.9 from training. Duration: 0.9444375 2023-10-03 01:52:38,875 WARNING [train.py:1204] (2/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_56-13561-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 01:52:40,130 WARNING [train.py:1204] (2/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_393-57888-0_sp1.1 from training. Duration: 0.5818125 2023-10-03 01:52:40,151 WARNING [train.py:1204] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_168-32906-0_sp1.1 from training. Duration: 0.62725 2023-10-03 01:52:40,200 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_51225_210_3601_1_1534400634_2327383_127-207553-0_sp1.1 from training. Duration: 0.963625 2023-10-03 01:52:42,979 INFO [train.py:1046] (2/4) Epoch 31, batch 5050, loss[loss=0.1491, simple_loss=0.2392, pruned_loss=0.02947, over 24487.00 frames. ], tot_loss[loss=0.1626, simple_loss=0.2412, pruned_loss=0.04201, over 4699496.80 frames. ], batch size: 66, lr: 3.27e-03, grad_scale: 16.0 2023-10-03 01:52:43,096 WARNING [train.py:1204] (2/4) Exclude cut with ID _58583_210_5236_1_1534726835266_3574249_494-203521-0_sp1.1 from training. Duration: 0.963625 2023-10-03 01:52:43,120 WARNING [train.py:1204] (2/4) Exclude cut with ID _49376_210_5986_1_1533985278732_3370740_354-344695-0 from training. Duration: 0.99 2023-10-03 01:52:44,300 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.516e+02 1.838e+02 2.026e+02 2.267e+02 4.820e+02, threshold=4.051e+02, percent-clipped=1.0 2023-10-03 01:52:44,464 WARNING [train.py:1204] (2/4) Exclude cut with ID _8021_210_7994_1_1528376451358_3653069_179-45206-0_sp1.1 from training. Duration: 0.7818125 2023-10-03 01:52:46,010 WARNING [train.py:1204] (2/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_742-154017-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 01:52:47,903 WARNING [train.py:1204] (2/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_222-198127-0_sp1.1 from training. Duration: 0.763625 2023-10-03 01:52:49,110 WARNING [train.py:1204] (2/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_235-307337-0 from training. Duration: 0.94 2023-10-03 01:52:49,173 WARNING [train.py:1204] (2/4) Exclude cut with ID _8393_210_6702_1_1528505860249_3457328_453-176500-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 01:52:50,639 WARNING [train.py:1204] (2/4) Exclude cut with ID _53565_210_10635_1_1534337218335_3820259_102-179528-0_sp1.1 from training. Duration: 0.97275 2023-10-03 01:52:53,347 WARNING [train.py:1204] (2/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_43-206757-0_sp1.1 from training. Duration: 0.6818125 2023-10-03 01:52:55,274 WARNING [train.py:1204] (2/4) Exclude cut with ID _8394_210_6702_1_1528590504316_3540189_335-274780-0_sp1.1 from training. Duration: 0.7090625 2023-10-03 01:52:55,315 WARNING [train.py:1204] (2/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_128-296581-0_sp1.1 from training. Duration: 0.67275 2023-10-03 01:53:03,670 WARNING [train.py:1204] (2/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_111-112362-0 from training. Duration: 0.91 2023-10-03 01:53:03,718 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_5522_210_3273_1_1527249606_3909096_366-172508-0_sp1.1 from training. Duration: 0.6 2023-10-03 01:53:05,100 WARNING [train.py:1204] (2/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_47-167373-0_sp1.1 from training. Duration: 0.836375 2023-10-03 01:53:05,139 WARNING [train.py:1204] (2/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_41-234353-0 from training. Duration: 0.68 2023-10-03 01:53:05,171 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6291_210_3273_1_1527683649_1078498_82-221196-0_sp0.9 from training. Duration: 0.8444375 2023-10-03 01:53:06,882 WARNING [train.py:1204] (2/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_47-4814-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 01:53:06,917 WARNING [train.py:1204] (2/4) Exclude cut with ID _43541_210_10626_1_1533796233418_3635440_40-247993-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 01:53:06,958 WARNING [train.py:1204] (2/4) Exclude cut with ID _21989_210_5854_1_1531899214931_3271360_133-44759-0_sp0.9 from training. Duration: 0.8555625 2023-10-03 01:53:06,961 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_94-195801-0 from training. Duration: 0.94 2023-10-03 01:53:07,237 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.conv_module2.balancer2.prob, batch_count=1096160.0, ans=0.125 2023-10-03 01:53:08,343 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_141-89113-0 from training. Duration: 0.86 2023-10-03 01:53:09,255 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.0.layers.0.conv_module2.whiten, num_groups=1, num_channels=192, metric=10.84 vs. limit=15.0 2023-10-03 01:53:09,709 WARNING [train.py:1204] (2/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_683-284578-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 01:53:11,166 WARNING [train.py:1204] (2/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_6-214815-0_sp1.1 from training. Duration: 0.77275 2023-10-03 01:53:11,481 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.bypass.skip_rate, batch_count=1096226.6666666667, ans=0.04949747468305833 2023-10-03 01:53:13,822 WARNING [train.py:1204] (2/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_556-212082-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 01:53:15,204 WARNING [train.py:1204] (2/4) Exclude cut with ID _44902_210_16187_1_1533888006062_3792078_149-67212-0 from training. Duration: 0.87 2023-10-03 01:53:15,328 WARNING [train.py:1204] (2/4) Exclude cut with ID _61148_210_6897_1_1535094022726_4217610_333-159440-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 01:53:19,003 WARNING [train.py:1204] (2/4) Exclude cut with ID _3120_210_3525_1_1526036143112_4072980_404-249994-0 from training. Duration: 0.99 2023-10-03 01:53:20,394 WARNING [train.py:1204] (2/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_234-260682-0_sp0.9 from training. Duration: 0.9333125 2023-10-03 01:53:20,427 WARNING [train.py:1204] (2/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_374-138971-0_sp1.1 from training. Duration: 0.8545625 2023-10-03 01:53:21,024 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.3.encoder.layers.2.nonlin_attention.whiten2, num_groups=1, num_channels=512, metric=7.16 vs. limit=15.0 2023-10-03 01:53:21,862 WARNING [train.py:1204] (2/4) Exclude cut with ID _53713_210_24116_1_1534314487762_7416751_743-302085-0_sp1.1 from training. Duration: 0.92725 2023-10-03 01:53:21,930 WARNING [train.py:1204] (2/4) Exclude cut with ID _18516_210_9206_1_1531704653206_3692406_198-334870-0_sp1.1 from training. Duration: 0.836375 2023-10-03 01:53:24,619 WARNING [train.py:1204] (2/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_395-73570-0_sp1.1 from training. Duration: 0.9 2023-10-03 01:53:26,516 WARNING [train.py:1204] (2/4) Exclude cut with ID _46683_210_9642_1_1533730913492_4591116_421-19547-0_sp1.1 from training. Duration: 0.936375 2023-10-03 01:53:27,842 WARNING [train.py:1204] (2/4) Exclude cut with ID _19896_210_1883_1_1531709022662_6649569_714-8668-0_sp1.1 from training. Duration: 0.963625 2023-10-03 01:53:27,870 WARNING [train.py:1204] (2/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_49-101072-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 01:53:27,879 WARNING [train.py:1204] (2/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_220-191080-0_sp1.1 from training. Duration: 0.8909375 2023-10-03 01:53:29,043 WARNING [train.py:1204] (2/4) Exclude cut with ID _3465_210_3525_1_1526175667060_3574049_62-335552-0 from training. Duration: 0.87 2023-10-03 01:53:29,139 WARNING [train.py:1204] (2/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_111-112362-0_sp1.1 from training. Duration: 0.82725 2023-10-03 01:53:30,629 WARNING [train.py:1204] (2/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_318-59097-0_sp0.9 from training. Duration: 0.8444375 2023-10-03 01:53:32,617 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.4.encoder.layers.0.self_attn2.whiten, num_groups=1, num_channels=384, metric=14.26 vs. limit=22.5 2023-10-03 01:53:34,085 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.1.encoder.layers.1.feed_forward3.out_whiten, num_groups=1, num_channels=256, metric=7.96 vs. limit=15.0 2023-10-03 01:53:35,298 WARNING [train.py:1204] (2/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_98-255710-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 01:53:35,312 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9042W0003-515609 from training. Duration: 0.896 2023-10-03 01:53:35,327 WARNING [train.py:1204] (2/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_158-250116-0_sp0.9 from training. Duration: 0.688875 2023-10-03 01:53:36,737 WARNING [train.py:1204] (2/4) Exclude cut with ID _26750_210_5665_1_1532226678442_4063230_7-346885-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 01:53:38,036 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_37839_210_7234_1_1533542462_3689303_357-227078-0_sp1.1 from training. Duration: 0.963625 2023-10-03 01:53:38,080 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9052W0119-520904 from training. Duration: 0.896 2023-10-03 01:53:40,788 WARNING [train.py:1204] (2/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_249-281349-0_sp1.1 from training. Duration: 0.77275 2023-10-03 01:53:40,800 WARNING [train.py:1204] (2/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_147-293311-0 from training. Duration: 0.94 2023-10-03 01:53:40,801 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_158-21166-0_sp1.1 from training. Duration: 0.963625 2023-10-03 01:53:43,615 WARNING [train.py:1204] (2/4) Exclude cut with ID _14100_210_8341_1_1530529320613_3678069_207-332194-0_sp1.1 from training. Duration: 0.92725 2023-10-03 01:53:44,981 WARNING [train.py:1204] (2/4) Exclude cut with ID _9389_210_1664_1_1529217337301_5289269_324-211031-0_sp1.1 from training. Duration: 0.963625 2023-10-03 01:53:44,993 WARNING [train.py:1204] (2/4) Exclude cut with ID _14603_210_4220_1_1530615688441_3097319_133-158308-0 from training. Duration: 0.84 2023-10-03 01:53:45,285 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.conv_module2.balancer1.min_positive, batch_count=1096360.0, ans=0.025 2023-10-03 01:53:46,429 WARNING [train.py:1204] (2/4) Exclude cut with ID _13252_210_1925_1_1530671372047_3336909_166-314607-0 from training. Duration: 0.54 2023-10-03 01:53:48,944 WARNING [train.py:1204] (2/4) Exclude cut with ID _32704_210_8954_1_1533017488840_6747370_649-79538-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 01:53:50,099 WARNING [train.py:1204] (2/4) Exclude cut with ID _28394_210_8913_1_1532430049518_3650118_343-115667-0_sp1.1 from training. Duration: 0.97275 2023-10-03 01:53:50,152 WARNING [train.py:1204] (2/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_87-278686-0_sp0.9 from training. Duration: 0.8555625 2023-10-03 01:53:54,184 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9109W0003-549749 from training. Duration: 0.768 2023-10-03 01:53:55,619 WARNING [train.py:1204] (2/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_126-29104-0_sp1.1 from training. Duration: 0.77275 2023-10-03 01:53:56,876 INFO [train.py:1046] (2/4) Epoch 31, batch 5100, loss[loss=0.1459, simple_loss=0.222, pruned_loss=0.03491, over 24261.00 frames. ], tot_loss[loss=0.1635, simple_loss=0.2419, pruned_loss=0.04255, over 4696312.62 frames. ], batch size: 56, lr: 3.27e-03, grad_scale: 16.0 2023-10-03 01:53:58,348 WARNING [train.py:1204] (2/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_325-113798-0 from training. Duration: 0.85 2023-10-03 01:53:58,408 WARNING [train.py:1204] (2/4) Exclude cut with ID _6694_210_4599_1_1527852637490_3965529_116-109398-0 from training. Duration: 0.73 2023-10-03 01:54:00,231 WARNING [train.py:1204] (2/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_46-57119-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 01:54:01,669 WARNING [train.py:1204] (2/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_546-119312-0_sp1.1 from training. Duration: 0.9 2023-10-03 01:54:04,303 WARNING [train.py:1204] (2/4) Exclude cut with ID _60057_210_10640_1_1534903065793_3333150_468-306939-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 01:54:04,343 WARNING [train.py:1204] (2/4) Exclude cut with ID _15150_210_8341_1_1530752411803_7256384_22-138274-0 from training. Duration: 0.96 2023-10-03 01:54:04,357 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_289-180904-0 from training. Duration: 0.76 2023-10-03 01:54:10,384 WARNING [train.py:1204] (2/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_64-133721-0_sp1.1 from training. Duration: 0.92725 2023-10-03 01:54:11,703 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_195-257512-0_sp1.1 from training. Duration: 0.6090625 2023-10-03 01:54:14,464 WARNING [train.py:1204] (2/4) Exclude cut with ID _66873_210_5517_1_1535454136506_4293370_704-101963-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 01:54:16,005 WARNING [train.py:1204] (2/4) Exclude cut with ID _4366_210_1388_1_1526792053633_3613819_231-211402-0 from training. Duration: 0.94 2023-10-03 01:54:17,370 WARNING [train.py:1204] (2/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_679-190385-0_sp1.1 from training. Duration: 0.97275 2023-10-03 01:54:17,613 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.bypass_mid.scale_min, batch_count=1096493.3333333333, ans=0.2 2023-10-03 01:54:18,842 WARNING [train.py:1204] (2/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_298-81459-0_sp1.1 from training. Duration: 0.963625 2023-10-03 01:54:20,100 WARNING [train.py:1204] (2/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_658-282610-0_sp0.9 from training. Duration: 0.588875 2023-10-03 01:54:22,164 WARNING [train.py:1204] (2/4) Exclude cut with ID _57572_210_5517_1_1534921219624_3809484_196-283734-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 01:54:23,517 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_53071_210_16554_1_1534490405_2954066_294-198554-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 01:54:23,523 WARNING [train.py:1204] (2/4) Exclude cut with ID _4866_210_2577_1_1527404412284_4364659_352-157930-0 from training. Duration: 0.81 2023-10-03 01:54:27,534 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9231W0113-610139 from training. Duration: 0.896 2023-10-03 01:54:27,585 WARNING [train.py:1204] (2/4) Exclude cut with ID _24271_210_12658_1_1531987270022_7190519_638-171906-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 01:54:27,622 WARNING [train.py:1204] (2/4) Exclude cut with ID _14170_210_5219_1_1530525949868_4469650_272-249985-0 from training. Duration: 0.67 2023-10-03 01:54:27,638 WARNING [train.py:1204] (2/4) Exclude cut with ID _64086_210_7994_1_1535270417019_7435949_449-31567-0 from training. Duration: 0.97 2023-10-03 01:54:32,041 WARNING [train.py:1204] (2/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_109-282568-0_sp1.1 from training. Duration: 0.97275 2023-10-03 01:54:35,009 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.1.conv_module1.balancer2.prob, batch_count=1096560.0, ans=0.125 2023-10-03 01:54:39,654 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_121-45934-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 01:54:43,671 WARNING [train.py:1204] (2/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_314-126819-0 from training. Duration: 0.5 2023-10-03 01:54:43,711 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9309W0192-640319 from training. Duration: 0.512 2023-10-03 01:54:43,720 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9310W0003-640649 from training. Duration: 0.896 2023-10-03 01:54:46,390 WARNING [train.py:1204] (2/4) Exclude cut with ID _7160_210_1876_1_1527940923231_3703799_120-150943-0 from training. Duration: 0.81 2023-10-03 01:54:46,392 WARNING [train.py:1204] (2/4) Exclude cut with ID _40380_210_9059_1_1533380594759_4187380_6-33508-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 01:54:49,289 WARNING [train.py:1204] (2/4) Exclude cut with ID _59202_210_26984_1_1534816810612_3549174_137-318710-0 from training. Duration: 0.89 2023-10-03 01:54:52,280 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_435-313305-0 from training. Duration: 0.71 2023-10-03 01:54:54,747 WARNING [train.py:1204] (2/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_91-212486-0_sp1.1 from training. Duration: 0.6181875 2023-10-03 01:54:57,481 WARNING [train.py:1204] (2/4) Exclude cut with ID _14603_210_4220_1_1530615688441_3097319_133-158308-0_sp1.1 from training. Duration: 0.763625 2023-10-03 01:54:58,890 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_99-251314-0 from training. Duration: 0.87 2023-10-03 01:55:00,260 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_365-217119-0_sp1.1 from training. Duration: 0.57275 2023-10-03 01:55:01,546 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_3475_210_2028_1_1526193578_3622569_377-166133-0 from training. Duration: 0.86 2023-10-03 01:55:06,532 WARNING [train.py:1204] (2/4) Exclude cut with ID _13916_210_9994_1_1530788341126_3763149_116-21034-0_sp1.1 from training. Duration: 0.87275 2023-10-03 01:55:06,552 WARNING [train.py:1204] (2/4) Exclude cut with ID _64220_210_9527_1_1535250151772_4875259_142-216831-0_sp1.1 from training. Duration: 0.936375 2023-10-03 01:55:06,552 WARNING [train.py:1204] (2/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_490-207861-0_sp1.1 from training. Duration: 0.8909375 2023-10-03 01:55:06,602 WARNING [train.py:1204] (2/4) Exclude cut with ID _41314_210_12651_1_1533459476543_3766460_87-272068-0_sp0.9 from training. Duration: 0.92225 2023-10-03 01:55:07,940 WARNING [train.py:1204] (2/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_18-46756-0_sp1.1 from training. Duration: 0.7181875 2023-10-03 01:55:08,007 WARNING [train.py:1204] (2/4) Exclude cut with ID _39411_210_5986_1_1533538825030_3343649_98-311335-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 01:55:09,834 WARNING [train.py:1204] (2/4) Exclude cut with ID _21878_210_3525_1_1531738772002_7264330_113-18163-0 from training. Duration: 0.89 2023-10-03 01:55:09,836 WARNING [train.py:1204] (2/4) Exclude cut with ID _6653_210_2629_1_1527919172441_6732679_1103-56445-0 from training. Duration: 0.73 2023-10-03 01:55:11,126 INFO [train.py:1046] (2/4) Epoch 31, batch 5150, loss[loss=0.1576, simple_loss=0.2366, pruned_loss=0.03927, over 23603.00 frames. ], tot_loss[loss=0.1633, simple_loss=0.2423, pruned_loss=0.04217, over 4714790.55 frames. ], batch size: 135, lr: 3.27e-03, grad_scale: 16.0 2023-10-03 01:55:11,176 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_3474_210_2028_1_1526188513_4825246_145-309131-0 from training. Duration: 0.74 2023-10-03 01:55:11,195 WARNING [train.py:1204] (2/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_120-211630-0_sp1.1 from training. Duration: 0.836375 2023-10-03 01:55:11,206 WARNING [train.py:1204] (2/4) Exclude cut with ID _8030_210_7497_1_1528444886781_3531089_32-31837-0 from training. Duration: 0.97 2023-10-03 01:55:11,323 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_19949_210_2161_1_1531706337_928079_8-160956-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 01:55:12,447 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.640e+02 1.869e+02 2.053e+02 2.257e+02 3.083e+02, threshold=4.107e+02, percent-clipped=0.0 2023-10-03 01:55:12,520 WARNING [train.py:1204] (2/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_321-215742-0_sp1.1 from training. Duration: 0.4545625 2023-10-03 01:55:13,956 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_583-124650-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 01:55:16,664 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_30434_210_13049_1_1532519384_981746_164-182755-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 01:55:18,537 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.4.encoder.layers.2.conv_module1.whiten, num_groups=1, num_channels=384, metric=2.67 vs. limit=15.0 2023-10-03 01:55:22,197 WARNING [train.py:1204] (2/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_339-104791-0_sp1.1 from training. Duration: 0.4454375 2023-10-03 01:55:23,550 WARNING [train.py:1204] (2/4) Exclude cut with ID _15026_210_8750_1_1530777602316_3291850_211-195043-0 from training. Duration: 0.8 2023-10-03 01:55:24,514 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.0.layers.1.self_attn2.whiten, num_groups=1, num_channels=192, metric=10.41 vs. limit=22.5 2023-10-03 01:55:24,915 WARNING [train.py:1204] (2/4) Exclude cut with ID _29877_210_6228_1_1532397791106_5276149_487-182647-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 01:55:24,951 WARNING [train.py:1204] (2/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_127-566-0_sp0.9 from training. Duration: 0.9333125 2023-10-03 01:55:28,205 WARNING [train.py:1204] (2/4) Exclude cut with ID _8263_210_1664_1_1528439452891_970220_68-181805-0_sp0.9 from training. Duration: 0.711125 2023-10-03 01:55:28,207 WARNING [train.py:1204] (2/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_504-72513-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 01:55:28,216 WARNING [train.py:1204] (2/4) Exclude cut with ID _64639_210_5816_1_1535367619437_3528000_324-265233-0_sp1.1 from training. Duration: 0.963625 2023-10-03 01:55:28,914 WARNING [train.py:1204] (2/4) Exclude cut with ID _6456_210_1534_1_1527681634212_3181049_215-309359-0_sp0.9 from training. Duration: 0.97775 2023-10-03 01:55:30,209 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_115-233929-0_sp1.1 from training. Duration: 0.6545625 2023-10-03 01:55:30,252 WARNING [train.py:1204] (2/4) Exclude cut with ID _14087_210_2253_1_1530529224512_3014029_219-224445-0 from training. Duration: 0.84 2023-10-03 01:55:30,447 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.attention_skip_rate, batch_count=1096826.6666666667, ans=0.0 2023-10-03 01:55:31,633 WARNING [train.py:1204] (2/4) Exclude cut with ID _14867_210_1968_1_1530684179890_3846969_413-29292-0_sp1.1 from training. Duration: 0.8181875 2023-10-03 01:55:31,680 WARNING [train.py:1204] (2/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_1012-279473-0_sp0.9 from training. Duration: 0.7444375 2023-10-03 01:55:34,368 WARNING [train.py:1204] (2/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_605-133395-0_sp0.9 from training. Duration: 0.7333125 2023-10-03 01:55:37,654 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_89-35748-0 from training. Duration: 0.79 2023-10-03 01:55:39,157 WARNING [train.py:1204] (2/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_476-272757-0_sp1.1 from training. Duration: 0.8454375 2023-10-03 01:55:42,164 WARNING [train.py:1204] (2/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_449-15508-0_sp0.9 from training. Duration: 0.911125 2023-10-03 01:55:44,055 WARNING [train.py:1204] (2/4) Exclude cut with ID _29345_210_4381_1_1532689168263_3860169_198-174328-0 from training. Duration: 0.91 2023-10-03 01:55:45,591 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.feed_forward2.hidden_balancer.prob, batch_count=1096893.3333333333, ans=0.125 2023-10-03 01:55:48,141 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_15517_210_6753_1_1530952293_1664795_125-158157-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 01:55:52,402 WARNING [train.py:1204] (2/4) Exclude cut with ID _25565_210_12392_1_1532423009382_3952098_420-113180-0_sp1.1 from training. Duration: 0.963625 2023-10-03 01:55:53,825 WARNING [train.py:1204] (2/4) Exclude cut with ID _51471_210_1889_1_1534399391390_3094250_236-146773-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 01:55:56,573 WARNING [train.py:1204] (2/4) Exclude cut with ID _30039_210_13270_1_1532484612900_2725150_175-35012-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 01:55:56,607 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_23850_210_4238_1_1532232597_1632433_99-275843-0_sp1.1 from training. Duration: 0.936375 2023-10-03 01:56:00,283 WARNING [train.py:1204] (2/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_658-282610-0 from training. Duration: 0.53 2023-10-03 01:56:04,988 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_37818_210_4238_1_1533620910_4720083_234-65934-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 01:56:05,058 WARNING [train.py:1204] (2/4) Exclude cut with ID _3067_210_2667_1_1526212974640_4503168_459-173867-0_sp1.1 from training. Duration: 0.8 2023-10-03 01:56:05,090 WARNING [train.py:1204] (2/4) Exclude cut with ID _8696_210_1876_1_1528545632440_3345058_209-54069-0_sp0.9 from training. Duration: 0.7444375 2023-10-03 01:56:07,883 WARNING [train.py:1204] (2/4) Exclude cut with ID _62574_210_2655_1_1535180407058_3933249_420-236465-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 01:56:09,253 WARNING [train.py:1204] (2/4) Exclude cut with ID _46681_210_9642_1_1533893005571_7603359_333-4042-0_sp1.1 from training. Duration: 0.936375 2023-10-03 01:56:10,607 WARNING [train.py:1204] (2/4) Exclude cut with ID _3541_210_2477_1_1526475408709_3263973_377-296839-0 from training. Duration: 0.55 2023-10-03 01:56:13,828 WARNING [train.py:1204] (2/4) Exclude cut with ID _43997_210_4525_1_1533538632555_5189329_426-92599-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 01:56:15,208 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_43-71989-0_sp1.1 from training. Duration: 0.5181875 2023-10-03 01:56:16,629 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_2009_210_2071_1_1525481134_4588937_140-345530-0_sp1.1 from training. Duration: 0.963625 2023-10-03 01:56:18,030 WARNING [train.py:1204] (2/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_138-332773-0_sp0.9 from training. Duration: 0.9 2023-10-03 01:56:18,106 WARNING [train.py:1204] (2/4) Exclude cut with ID _13942_210_9988_1_1530604906036_3733360_499-55676-0_sp0.9 from training. Duration: 0.688875 2023-10-03 01:56:18,130 WARNING [train.py:1204] (2/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_259-34038-0_sp1.1 from training. Duration: 0.67275 2023-10-03 01:56:18,139 WARNING [train.py:1204] (2/4) Exclude cut with ID _3120_210_3525_1_1526036143112_4072980_404-249994-0_sp1.1 from training. Duration: 0.9 2023-10-03 01:56:19,382 WARNING [train.py:1204] (2/4) Exclude cut with ID _40402_210_18219_1_1533522324115_2953150_222-108810-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 01:56:21,122 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.attention_skip_rate, batch_count=1097026.6666666667, ans=0.0 2023-10-03 01:56:22,230 WARNING [train.py:1204] (2/4) Exclude cut with ID _22083_210_8254_1_1531702862439_3678970_49-234546-0_sp0.9 from training. Duration: 0.92225 2023-10-03 01:56:23,628 WARNING [train.py:1204] (2/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_199-295611-0_sp0.9 from training. Duration: 0.611125 2023-10-03 01:56:24,997 INFO [train.py:1046] (2/4) Epoch 31, batch 5200, loss[loss=0.1687, simple_loss=0.235, pruned_loss=0.05121, over 23720.00 frames. ], tot_loss[loss=0.1632, simple_loss=0.2421, pruned_loss=0.04215, over 4712884.63 frames. ], batch size: 232, lr: 3.27e-03, grad_scale: 32.0 2023-10-03 01:56:26,363 WARNING [train.py:1204] (2/4) Exclude cut with ID _43552_210_8913_1_1533726031059_3523429_43-192945-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 01:56:31,530 WARNING [train.py:1204] (2/4) Exclude cut with ID _14224_210_4381_1_1530529062037_4264010_364-22817-0 from training. Duration: 0.71 2023-10-03 01:56:32,110 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.4.encoder.layers.1.nonlin_attention.whiten2, num_groups=1, num_channels=384, metric=3.17 vs. limit=15.0 2023-10-03 01:56:33,507 WARNING [train.py:1204] (2/4) Exclude cut with ID _14922_210_6228_1_1530707435273_3693310_388-129552-0_sp1.1 from training. Duration: 0.87275 2023-10-03 01:56:33,560 WARNING [train.py:1204] (2/4) Exclude cut with ID _6537_210_1883_1_1527987559710_3597129_190-280227-0_sp1.1 from training. Duration: 0.97275 2023-10-03 01:56:36,339 WARNING [train.py:1204] (2/4) Exclude cut with ID _55844_210_2346_1_1534471212569_7625707_398-56897-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 01:56:37,710 WARNING [train.py:1204] (2/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_48-182155-0_sp1.1 from training. Duration: 0.7909375 2023-10-03 01:56:39,020 WARNING [train.py:1204] (2/4) Exclude cut with ID _29529_210_7234_1_1532393961455_3977460_582-18084-0_sp1.1 from training. Duration: 0.97275 2023-10-03 01:56:39,148 WARNING [train.py:1204] (2/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_912-282412-0 from training. Duration: 0.49 2023-10-03 01:56:42,003 WARNING [train.py:1204] (2/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_332-256168-0_sp0.9 from training. Duration: 0.8333125 2023-10-03 01:56:43,803 WARNING [train.py:1204] (2/4) Exclude cut with ID _64220_210_9527_1_1535250151772_4875259_55-250624-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 01:56:44,669 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.2.encoder.layers.1.feed_forward1.out_whiten, num_groups=1, num_channels=384, metric=6.38 vs. limit=15.0 2023-10-03 01:56:45,297 WARNING [train.py:1204] (2/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_264-108685-0 from training. Duration: 0.55 2023-10-03 01:56:48,269 WARNING [train.py:1204] (2/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_613-232856-0_sp0.9 from training. Duration: 0.6 2023-10-03 01:56:49,610 WARNING [train.py:1204] (2/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_68-212801-0_sp1.1 from training. Duration: 0.663625 2023-10-03 01:56:49,659 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6290_210_3273_1_1527595224_1282787_115-87095-0 from training. Duration: 0.55 2023-10-03 01:56:49,697 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_3080_210_2071_1_1526086102_4365166_312-8726-0 from training. Duration: 0.91 2023-10-03 01:56:52,403 WARNING [train.py:1204] (2/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_105-63309-0 from training. Duration: 0.98 2023-10-03 01:56:52,489 WARNING [train.py:1204] (2/4) Exclude cut with ID _2941_210_2577_1_1526199673150_2489759_201-160779-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 01:56:52,493 WARNING [train.py:1204] (2/4) Exclude cut with ID ID0406W0226-885434 from training. Duration: 0.997 2023-10-03 01:56:52,508 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_17292_210_6753_1_1531125388_2943734_353-328201-0_sp1.1 from training. Duration: 0.97275 2023-10-03 01:56:53,928 WARNING [train.py:1204] (2/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_572-96029-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 01:56:53,961 WARNING [train.py:1204] (2/4) Exclude cut with ID _14970_210_15709_1_1530775547361_1966965_6-23328-0_sp0.9 from training. Duration: 0.9555625 2023-10-03 01:56:55,323 WARNING [train.py:1204] (2/4) Exclude cut with ID _45012_210_12651_1_1533608435136_5371380_240-37101-0 from training. Duration: 0.93 2023-10-03 01:56:56,689 WARNING [train.py:1204] (2/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_685-90183-0_sp1.1 from training. Duration: 0.936375 2023-10-03 01:57:01,162 WARNING [train.py:1204] (2/4) Exclude cut with ID _13252_210_1925_1_1530671372047_3336909_41-22245-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 01:57:03,362 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_15126_210_6753_1_1530778677_2591185_120-277707-0 from training. Duration: 0.8 2023-10-03 01:57:03,398 WARNING [train.py:1204] (2/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_310-26186-0 from training. Duration: 0.7 2023-10-03 01:57:03,422 WARNING [train.py:1204] (2/4) Exclude cut with ID _14998_210_5219_1_1530705582080_7474970_787-294416-0 from training. Duration: 0.82 2023-10-03 01:57:07,706 WARNING [train.py:1204] (2/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_795-33252-0 from training. Duration: 0.37 2023-10-03 01:57:07,735 WARNING [train.py:1204] (2/4) Exclude cut with ID _7537_210_2084_1_1528502395431_3924359_113-199933-0_sp0.9 from training. Duration: 0.6555625 2023-10-03 01:57:14,402 WARNING [train.py:1204] (2/4) Exclude cut with ID _3465_210_3525_1_1526175667060_3574049_308-231735-0_sp0.9 from training. Duration: 0.811125 2023-10-03 01:57:14,430 WARNING [train.py:1204] (2/4) Exclude cut with ID _19504_210_9994_1_1531648863242_3325220_354-296765-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 01:57:14,605 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.bypass_mid.scale_min, batch_count=1097293.3333333333, ans=0.2 2023-10-03 01:57:16,251 WARNING [train.py:1204] (2/4) Exclude cut with ID _5409_210_1388_1_1527292850189_5865191_178-127970-0 from training. Duration: 0.57 2023-10-03 01:57:17,514 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_1466-248056-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 01:57:17,548 WARNING [train.py:1204] (2/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_535-14410-0_sp0.9 from training. Duration: 0.52225 2023-10-03 01:57:17,551 WARNING [train.py:1204] (2/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_747-129941-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 01:57:18,982 WARNING [train.py:1204] (2/4) Exclude cut with ID _15012_210_1968_1_1530856820429_5095350_688-178849-0_sp1.1 from training. Duration: 0.5818125 2023-10-03 01:57:21,806 WARNING [train.py:1204] (2/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_356-45054-0_sp1.1 from training. Duration: 0.7818125 2023-10-03 01:57:23,160 WARNING [train.py:1204] (2/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_146-276944-0_sp0.9 from training. Duration: 0.92225 2023-10-03 01:57:24,685 WARNING [train.py:1204] (2/4) Exclude cut with ID _39204_210_4220_1_1533880547368_3191120_346-180306-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 01:57:26,060 WARNING [train.py:1204] (2/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_641-158017-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 01:57:26,061 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_19952_210_2161_1_1531875469_3851750_274-42137-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 01:57:33,518 WARNING [train.py:1204] (2/4) Exclude cut with ID _60804_210_9642_1_1534831199859_7268299_87-327819-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 01:57:35,405 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_9302_210_2550_1_1528889962_4655416_638-301620-0 from training. Duration: 0.47 2023-10-03 01:57:35,487 WARNING [train.py:1204] (2/4) Exclude cut with ID _44304_210_15374_1_1533639352765_4613129_302-22084-0_sp1.1 from training. Duration: 0.7818125 2023-10-03 01:57:35,498 WARNING [train.py:1204] (2/4) Exclude cut with ID _18513_210_9206_1_1531618215223_3862620_177-227063-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 01:57:36,818 WARNING [train.py:1204] (2/4) Exclude cut with ID _41326_210_5854_1_1533778076509_3492080_510-45444-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 01:57:38,200 WARNING [train.py:1204] (2/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_76-311963-0_sp1.1 from training. Duration: 0.636375 2023-10-03 01:57:38,490 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.bypass_mid.scale_min, batch_count=1097426.6666666667, ans=0.2 2023-10-03 01:57:39,484 INFO [train.py:1046] (2/4) Epoch 31, batch 5250, loss[loss=0.1691, simple_loss=0.2538, pruned_loss=0.04219, over 24458.00 frames. ], tot_loss[loss=0.163, simple_loss=0.2419, pruned_loss=0.04205, over 4720448.70 frames. ], batch size: 66, lr: 3.27e-03, grad_scale: 32.0 2023-10-03 01:57:39,570 WARNING [train.py:1204] (2/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_156-302675-0_sp0.9 from training. Duration: 0.611125 2023-10-03 01:57:40,850 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.573e+02 1.931e+02 2.126e+02 2.506e+02 4.070e+02, threshold=4.253e+02, percent-clipped=0.0 2023-10-03 01:57:42,337 WARNING [train.py:1204] (2/4) Exclude cut with ID _53489_210_6228_1_1534298357169_3835970_367-148411-0_sp1.1 from training. Duration: 0.936375 2023-10-03 01:57:42,562 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.feed_forward1.out_proj.dropout_p, batch_count=1097426.6666666667, ans=0.1 2023-10-03 01:57:43,768 WARNING [train.py:1204] (2/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_389-144525-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 01:57:43,966 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.bypass.scale_min, batch_count=1097426.6666666667, ans=0.2 2023-10-03 01:57:45,093 WARNING [train.py:1204] (2/4) Exclude cut with ID _14069_210_5632_1_1530512692033_4803390_158-238427-0_sp1.1 from training. Duration: 0.963625 2023-10-03 01:57:45,191 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_486-151131-0_sp0.9 from training. Duration: 0.9444375 2023-10-03 01:57:45,398 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.feed_forward1.out_proj.dropout_p, batch_count=1097426.6666666667, ans=0.1 2023-10-03 01:57:52,372 WARNING [train.py:1204] (2/4) Exclude cut with ID _39846_210_12651_1_1533603651992_1318030_188-71831-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 01:57:53,784 WARNING [train.py:1204] (2/4) Exclude cut with ID _39411_210_5986_1_1533538825030_3343649_173-238248-0_sp1.1 from training. Duration: 0.7545625 2023-10-03 01:57:56,643 WARNING [train.py:1204] (2/4) Exclude cut with ID _20684_210_6917_1_1531567751646_4203930_469-49081-0_sp1.1 from training. Duration: 0.92725 2023-10-03 01:57:57,954 WARNING [train.py:1204] (2/4) Exclude cut with ID _8833_210_5219_1_1528592665210_7553059_611-80313-0_sp1.1 from training. Duration: 0.5818125 2023-10-03 01:57:59,393 WARNING [train.py:1204] (2/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_440-141811-0 from training. Duration: 0.95 2023-10-03 01:57:59,413 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_41211_210_2544_1_1533348859_3702910_236-223652-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 01:58:02,142 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_40222_210_6228_1_1533385298_4481939_220-163998-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 01:58:16,682 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.nonlin_attention.balancer.prob, batch_count=1097560.0, ans=0.125 2023-10-03 01:58:21,811 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.feed_forward2.hidden_balancer.prob, batch_count=1097626.6666666667, ans=0.125 2023-10-03 01:58:24,354 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.conv_skip_rate, batch_count=1097626.6666666667, ans=0.0 2023-10-03 01:58:27,170 INFO [scaling.py:1118] (2/4) WithLoss: name=encoder.encoders.5.encoder.layers.0.self_attn_weights, loss-sum=1.057e-02 2023-10-03 01:58:34,906 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.ff2_skip_rate, batch_count=1097693.3333333333, ans=0.0 2023-10-03 01:58:45,494 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.feed_forward3.hidden_balancer.prob, batch_count=1097693.3333333333, ans=0.125 2023-10-03 01:58:47,928 INFO [train.py:1046] (2/4) Epoch 31, batch 5300, loss[loss=0.1363, simple_loss=0.2203, pruned_loss=0.02613, over 20453.00 frames. ], tot_loss[loss=0.1626, simple_loss=0.2415, pruned_loss=0.04191, over 4711820.28 frames. ], batch size: 44, lr: 3.27e-03, grad_scale: 32.0 2023-10-03 01:59:02,127 WARNING [train.py:1204] (2/4) Exclude cut with ID _14629_210_15709_1_1530604298182_956899_6-232695-0_sp1.1 from training. Duration: 0.8545625 2023-10-03 01:59:02,162 WARNING [train.py:1204] (2/4) Exclude cut with ID _13921_210_9994_1_1530841782272_3503520_327-69677-0 from training. Duration: 0.9 2023-10-03 01:59:02,186 WARNING [train.py:1204] (2/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_372-298093-0 from training. Duration: 0.8 2023-10-03 01:59:02,199 WARNING [train.py:1204] (2/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_515-139111-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 01:59:02,450 WARNING [train.py:1204] (2/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_85-65056-0_sp1.1 from training. Duration: 0.87275 2023-10-03 01:59:02,494 WARNING [train.py:1204] (2/4) Exclude cut with ID _15113_210_8254_1_1530842438579_3716279_209-54533-0_sp1.1 from training. Duration: 0.87275 2023-10-03 01:59:02,545 WARNING [train.py:1204] (2/4) Exclude cut with ID _70447_210_12392_1_1535972499300_3160420_123-312838-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 01:59:02,573 WARNING [train.py:1204] (2/4) Exclude cut with ID _60113_210_16271_1_1535164212481_4386079_70-63596-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 01:59:02,969 WARNING [train.py:1204] (2/4) Exclude cut with ID _14632_210_9988_1_1530691279392_3923200_225-188509-0_sp1.1 from training. Duration: 0.963625 2023-10-03 01:59:03,020 WARNING [train.py:1204] (2/4) Exclude cut with ID _34734_210_4525_1_1533365977542_4448720_584-180532-0_sp1.1 from training. Duration: 0.97275 2023-10-03 01:59:03,034 WARNING [train.py:1204] (2/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_351-294650-0_sp1.1 from training. Duration: 0.57275 2023-10-03 01:59:03,368 WARNING [train.py:1204] (2/4) Exclude cut with ID _13252_210_1925_1_1530671372047_3336909_263-168388-0_sp0.9 from training. Duration: 0.9555625 2023-10-03 01:59:03,395 WARNING [train.py:1204] (2/4) Exclude cut with ID _22083_210_8254_1_1531702862439_3678970_201-85738-0 from training. Duration: 0.9 2023-10-03 01:59:03,487 WARNING [train.py:1204] (2/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_237-58433-0 from training. Duration: 0.74 2023-10-03 01:59:03,514 WARNING [train.py:1204] (2/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_262-250826-0 from training. Duration: 0.99 2023-10-03 01:59:03,616 WARNING [train.py:1204] (2/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_927-43827-0_sp0.9 from training. Duration: 0.82225 2023-10-03 01:59:03,646 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_2821_210_2715_1_1526108385_3763560_73-296673-0 from training. Duration: 0.83 2023-10-03 01:59:03,721 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6287_210_3273_1_1527509506_2653716_182-199887-0 from training. Duration: 0.51 2023-10-03 01:59:03,852 WARNING [train.py:1204] (2/4) Exclude cut with ID _70519_210_4163_1_1535961595994_7458230_899-80223-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 01:59:04,109 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_210-234949-0_sp1.1 from training. Duration: 0.97275 2023-10-03 01:59:04,147 WARNING [train.py:1204] (2/4) Exclude cut with ID _30196_210_1664_1_1533708496522_7074140_768-278176-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 01:59:04,229 WARNING [train.py:1204] (2/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_315-128283-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 01:59:04,308 WARNING [train.py:1204] (2/4) Exclude cut with ID _14886_210_4220_1_1530702020355_3516190_330-323941-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 01:59:04,571 WARNING [train.py:1204] (2/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_264-348896-0_sp1.1 from training. Duration: 0.77275 2023-10-03 01:59:04,595 WARNING [train.py:1204] (2/4) Exclude cut with ID _37086_210_9038_1_1532999540891_7021250_433-10563-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 01:59:04,637 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_53638_210_21581_1_1534297319_5808449_885-80172-0_sp1.1 from training. Duration: 0.87275 2023-10-03 01:59:04,735 WARNING [train.py:1204] (2/4) Exclude cut with ID _14623_210_1925_1_1530773354962_3632609_157-218771-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 01:59:04,751 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_1368-78165-0_sp1.1 from training. Duration: 0.97275 2023-10-03 01:59:05,167 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_309-65177-0_sp0.9 from training. Duration: 0.97775 2023-10-03 01:59:05,173 WARNING [train.py:1204] (2/4) Exclude cut with ID _21632_210_2253_1_1531994001765_3744140_291-138812-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 01:59:05,186 WARNING [train.py:1204] (2/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_669-93163-0_sp1.1 from training. Duration: 0.8909375 2023-10-03 01:59:05,739 WARNING [train.py:1204] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_292-215346-0 from training. Duration: 0.97 2023-10-03 01:59:05,763 WARNING [train.py:1204] (2/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_286-222073-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 01:59:06,066 WARNING [train.py:1204] (2/4) Exclude cut with ID _59256_210_5986_1_1535076085997_3589120_121-237386-0_sp1.1 from training. Duration: 0.87275 2023-10-03 01:59:06,079 WARNING [train.py:1204] (2/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_96-320736-0 from training. Duration: 0.86 2023-10-03 01:59:06,088 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_128-202104-0 from training. Duration: 0.63 2023-10-03 01:59:06,172 WARNING [train.py:1204] (2/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_242-3472-0_sp0.9 from training. Duration: 0.811125 2023-10-03 01:59:06,217 WARNING [train.py:1204] (2/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_154-74182-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 01:59:06,221 WARNING [train.py:1204] (2/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_187-221566-0 from training. Duration: 0.43 2023-10-03 01:59:06,387 WARNING [train.py:1204] (2/4) Exclude cut with ID _6194_210_1534_1_1527511826160_3941194_14-282876-0 from training. Duration: 0.71 2023-10-03 01:59:06,427 WARNING [train.py:1204] (2/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_660-211088-0_sp1.1 from training. Duration: 0.563625 2023-10-03 01:59:06,776 WARNING [train.py:1204] (2/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_526-62079-0_sp1.1 from training. Duration: 0.6090625 2023-10-03 01:59:06,933 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_92-246270-0_sp1.1 from training. Duration: 0.77275 2023-10-03 01:59:07,026 WARNING [train.py:1204] (2/4) Exclude cut with ID IC0287W0023-142350 from training. Duration: 0.956 2023-10-03 01:59:07,107 WARNING [train.py:1204] (2/4) Exclude cut with ID _58605_210_12210_1_1534723195939_3904259_48-269402-0 from training. Duration: 0.96 2023-10-03 01:59:07,122 WARNING [train.py:1204] (2/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_416-257730-0_sp0.9 from training. Duration: 0.72225 2023-10-03 01:59:07,200 WARNING [train.py:1204] (2/4) Exclude cut with ID _7877_210_6014_1_1528286358099_3750136_185-17516-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 01:59:07,305 WARNING [train.py:1204] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_23-189234-0 from training. Duration: 0.83 2023-10-03 01:59:07,321 WARNING [train.py:1204] (2/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_237-89684-0 from training. Duration: 0.96 2023-10-03 01:59:07,877 WARNING [train.py:1204] (2/4) Exclude cut with ID _13916_210_9994_1_1530788341126_3763149_116-21034-0 from training. Duration: 0.96 2023-10-03 01:59:08,006 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_3133_210_2087_1_1526106446_2324690_246-125000-0_sp1.1 from training. Duration: 0.563625 2023-10-03 01:59:14,801 INFO [train.py:1046] (2/4) Epoch 32, batch 0, loss[loss=0.1724, simple_loss=0.2445, pruned_loss=0.05009, over 23571.00 frames. ], tot_loss[loss=0.1724, simple_loss=0.2445, pruned_loss=0.05009, over 23571.00 frames. ], batch size: 256, lr: 3.21e-03, grad_scale: 32.0 2023-10-03 01:59:14,801 INFO [train.py:1069] (2/4) Computing validation loss 2023-10-03 01:59:26,544 INFO [train.py:1078] (2/4) Epoch 32, validation: loss=0.3377, simple_loss=0.28, pruned_loss=0.1977, over 1125622.00 frames. 2023-10-03 01:59:26,544 INFO [train.py:1079] (2/4) Maximum memory allocated so far is 20967MB 2023-10-03 01:59:26,586 WARNING [train.py:1204] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_402-253027-0 from training. Duration: 0.54 2023-10-03 01:59:26,664 WARNING [train.py:1204] (2/4) Exclude cut with ID _59288_210_5854_1_1535077757774_3412246_158-289345-0_sp1.1 from training. Duration: 0.92725 2023-10-03 01:59:29,252 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_28211_210_6388_1_1532861506_4523224_128-328162-0_sp1.1 from training. Duration: 0.8181875 2023-10-03 01:59:29,455 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.0.conv_module2.balancer2.prob, batch_count=1097840.0, ans=0.125 2023-10-03 01:59:34,019 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_788-278281-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 01:59:34,027 WARNING [train.py:1204] (2/4) Exclude cut with ID _49728_210_12651_1_1534041034294_6788150_624-195301-0_sp0.9 from training. Duration: 0.9444375 2023-10-03 01:59:34,073 WARNING [train.py:1204] (2/4) Exclude cut with ID _14860_210_4915_1_1530702020340_7343240_267-139775-0_sp1.1 from training. Duration: 0.963625 2023-10-03 01:59:35,372 WARNING [train.py:1204] (2/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_332-221057-0 from training. Duration: 0.68 2023-10-03 01:59:35,555 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.conv_skip_rate, batch_count=1097840.0, ans=0.0 2023-10-03 01:59:36,810 WARNING [train.py:1204] (2/4) Exclude cut with ID _40402_210_18219_1_1533522324115_2953150_242-76943-0 from training. Duration: 0.94 2023-10-03 01:59:38,273 WARNING [train.py:1204] (2/4) Exclude cut with ID _16747_210_5986_1_1531811544733_2749109_87-349587-0_sp1.1 from training. Duration: 0.963625 2023-10-03 01:59:39,703 WARNING [train.py:1204] (2/4) Exclude cut with ID _22982_210_1883_1_1531882790447_5449409_505-346452-0_sp1.1 from training. Duration: 0.963625 2023-10-03 01:59:41,342 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.conv_module2.balancer1.prob, batch_count=1097906.6666666667, ans=0.125 2023-10-03 01:59:42,518 WARNING [train.py:1204] (2/4) Exclude cut with ID _17956_210_4353_1_1531209568125_6979970_13-192107-0_sp1.1 from training. Duration: 0.963625 2023-10-03 01:59:42,562 WARNING [train.py:1204] (2/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_430-307666-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 01:59:43,846 WARNING [train.py:1204] (2/4) Exclude cut with ID _2895_210_2477_1_1525956785794_4063750_289-11475-0_sp1.1 from training. Duration: 0.7454375 2023-10-03 01:59:43,857 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_3910_210_1794_1_1526742306_334530_28-238909-0_sp0.9 from training. Duration: 0.9 2023-10-03 01:59:46,526 WARNING [train.py:1204] (2/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_18-130336-0 from training. Duration: 0.79 2023-10-03 01:59:48,043 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_548-294316-0_sp0.9 from training. Duration: 0.9 2023-10-03 01:59:56,778 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_42-142473-0_sp1.1 from training. Duration: 0.6545625 2023-10-03 01:59:56,784 WARNING [train.py:1204] (2/4) Exclude cut with ID _59170_210_6721_1_1534983633960_4203335_326-48066-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 01:59:58,148 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_45886_210_17024_1_1533709215_471063_7-79165-0 from training. Duration: 0.98 2023-10-03 02:00:02,834 WARNING [train.py:1204] (2/4) Exclude cut with ID _44585_210_18628_1_1533605350387_4518199_280-73534-0_sp0.9 from training. Duration: 0.988875 2023-10-03 02:00:02,848 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_3475_210_2028_1_1526193578_3622569_265-276625-0_sp1.1 from training. Duration: 0.6909375 2023-10-03 02:00:03,113 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.feed_forward2.hidden_balancer.prob, batch_count=1097973.3333333333, ans=0.125 2023-10-03 02:00:04,851 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.conv_skip_rate, batch_count=1097973.3333333333, ans=0.0 2023-10-03 02:00:06,755 WARNING [train.py:1204] (2/4) Exclude cut with ID _64631_210_27767_1_1535349580068_4165160_88-225232-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 02:00:09,407 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_205-265518-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 02:00:11,140 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.conv_module1.balancer2.prob, batch_count=1098040.0, ans=0.125 2023-10-03 02:00:13,619 WARNING [train.py:1204] (2/4) Exclude cut with ID _40402_210_18219_1_1533522324115_2953150_271-253734-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 02:00:15,164 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.self_attn_weights.pos_emb_skip_rate, batch_count=1098040.0, ans=0.0 2023-10-03 02:00:18,971 WARNING [train.py:1204] (2/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_282-257791-0 from training. Duration: 0.86 2023-10-03 02:00:19,235 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.balancer1.prob, batch_count=1098040.0, ans=0.125 2023-10-03 02:00:20,693 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.conv_module2.balancer1.prob, batch_count=1098040.0, ans=0.125 2023-10-03 02:00:21,778 WARNING [train.py:1204] (2/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_356-45054-0 from training. Duration: 0.86 2023-10-03 02:00:21,820 WARNING [train.py:1204] (2/4) Exclude cut with ID _69584_210_5219_1_1535785133775_4044450_38-348116-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 02:00:21,821 WARNING [train.py:1204] (2/4) Exclude cut with ID _66763_210_17301_1_1535788667617_241750_21-328480-0_sp1.1 from training. Duration: 0.936375 2023-10-03 02:00:23,171 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.590e+02 1.938e+02 2.224e+02 2.523e+02 4.024e+02, threshold=4.448e+02, percent-clipped=0.0 2023-10-03 02:00:23,326 WARNING [train.py:1204] (2/4) Exclude cut with ID _26528_210_9070_1_1532570410842_3851310_497-97218-0_sp0.9 from training. Duration: 0.9555625 2023-10-03 02:00:23,388 WARNING [train.py:1204] (2/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_592-101075-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 02:00:24,811 WARNING [train.py:1204] (2/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_678-159307-0 from training. Duration: 0.85 2023-10-03 02:00:28,078 WARNING [train.py:1204] (2/4) Exclude cut with ID _28427_210_4220_1_1532581240967_3061069_166-319773-0_sp1.1 from training. Duration: 0.936375 2023-10-03 02:00:31,193 WARNING [train.py:1204] (2/4) Exclude cut with ID _29877_210_6228_1_1532397791106_5276149_606-224984-0_sp1.1 from training. Duration: 0.936375 2023-10-03 02:00:34,062 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.nonlin_attention.balancer.prob, batch_count=1098106.6666666667, ans=0.125 2023-10-03 02:00:35,294 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_125-203105-0_sp1.1 from training. Duration: 0.72725 2023-10-03 02:00:36,983 INFO [scaling.py:1118] (2/4) WithLoss: name=encoder.encoders.5.encoder.layers.0.self_attn_weights, loss-sum=0.000e+00 2023-10-03 02:00:38,112 WARNING [train.py:1204] (2/4) Exclude cut with ID IC0620W0113-306765 from training. Duration: 0.768 2023-10-03 02:00:39,941 INFO [train.py:1046] (2/4) Epoch 32, batch 50, loss[loss=0.1467, simple_loss=0.2272, pruned_loss=0.03316, over 24584.00 frames. ], tot_loss[loss=0.1646, simple_loss=0.2444, pruned_loss=0.04238, over 1077036.18 frames. ], batch size: 60, lr: 3.21e-03, grad_scale: 32.0 2023-10-03 02:00:40,021 WARNING [train.py:1204] (2/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_410-287857-0_sp1.1 from training. Duration: 0.8454375 2023-10-03 02:00:44,177 WARNING [train.py:1204] (2/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_78-95260-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 02:00:45,472 WARNING [train.py:1204] (2/4) Exclude cut with ID _58059_210_5854_1_1534901471149_3164808_6-73080-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 02:00:45,494 WARNING [train.py:1204] (2/4) Exclude cut with ID _5166_210_2084_1_1527073197160_4087993_331-117829-0 from training. Duration: 0.62 2023-10-03 02:00:46,891 WARNING [train.py:1204] (2/4) Exclude cut with ID _14880_210_8895_1_1530680426655_3795259_289-212817-0_sp0.9 from training. Duration: 0.7666875 2023-10-03 02:00:46,925 WARNING [train.py:1204] (2/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_620-144779-0_sp1.1 from training. Duration: 0.92725 2023-10-03 02:00:48,225 WARNING [train.py:1204] (2/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_561-288575-0_sp1.1 from training. Duration: 0.963625 2023-10-03 02:00:50,896 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_22657_210_6890_1_1532086002_1637216_122-188128-0_sp1.1 from training. Duration: 0.963625 2023-10-03 02:00:52,388 WARNING [train.py:1204] (2/4) Exclude cut with ID _16756_210_4915_1_1531047803763_7104259_604-86607-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 02:00:56,409 WARNING [train.py:1204] (2/4) Exclude cut with ID _15150_210_8341_1_1530752411803_7256384_469-239509-0 from training. Duration: 0.68 2023-10-03 02:00:56,417 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_10987_210_4163_1_1529760555_4123902_302-87926-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 02:01:01,268 WARNING [train.py:1204] (2/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_469-94673-0_sp0.9 from training. Duration: 0.87775 2023-10-03 02:01:02,631 WARNING [train.py:1204] (2/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_798-106179-0 from training. Duration: 0.83 2023-10-03 02:01:05,717 WARNING [train.py:1204] (2/4) Exclude cut with ID _16744_210_5986_1_1531724405384_3907567_242-111316-0 from training. Duration: 0.93 2023-10-03 02:01:07,222 WARNING [train.py:1204] (2/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_938-205135-0_sp0.9 from training. Duration: 0.9666875 2023-10-03 02:01:08,737 WARNING [train.py:1204] (2/4) Exclude cut with ID _64135_210_2253_1_1535677267876_7608972_133-188266-0_sp0.9 from training. Duration: 0.9 2023-10-03 02:01:08,741 WARNING [train.py:1204] (2/4) Exclude cut with ID _44585_210_18628_1_1533605350387_4518199_623-346820-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 02:01:10,172 WARNING [train.py:1204] (2/4) Exclude cut with ID _42326_210_7770_1_1533814282531_3707269_454-266903-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 02:01:11,958 WARNING [train.py:1204] (2/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_5-55893-0_sp1.1 from training. Duration: 0.5 2023-10-03 02:01:13,418 WARNING [train.py:1204] (2/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_577-332600-0_sp1.1 from training. Duration: 0.5181875 2023-10-03 02:01:13,419 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_957-138882-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 02:01:20,459 WARNING [train.py:1204] (2/4) Exclude cut with ID _12556_210_8341_1_1530081024100_7327690_219-38004-0_sp1.1 from training. Duration: 0.97275 2023-10-03 02:01:20,551 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_397-147196-0_sp1.1 from training. Duration: 0.72725 2023-10-03 02:01:20,764 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.out_combiner.scale_min, batch_count=1098306.6666666667, ans=0.2 2023-10-03 02:01:21,886 WARNING [train.py:1204] (2/4) Exclude cut with ID _3541_210_2477_1_1526475408709_3263973_231-104736-0_sp0.9 from training. Duration: 0.8666875 2023-10-03 02:01:21,960 WARNING [train.py:1204] (2/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_398-334558-0 from training. Duration: 0.96 2023-10-03 02:01:22,236 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.conv_skip_rate, batch_count=1098373.3333333333, ans=0.0 2023-10-03 02:01:24,638 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_257-20733-0_sp1.1 from training. Duration: 0.6909375 2023-10-03 02:01:24,712 WARNING [train.py:1204] (2/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_146-282400-0_sp1.1 from training. Duration: 0.6454375 2023-10-03 02:01:24,718 WARNING [train.py:1204] (2/4) Exclude cut with ID _51899_210_20034_1_1534129254466_3669220_265-215428-0 from training. Duration: 0.98 2023-10-03 02:01:25,983 WARNING [train.py:1204] (2/4) Exclude cut with ID _34423_210_8750_1_1533430798456_3330199_219-106541-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 02:01:27,517 WARNING [train.py:1204] (2/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_458-67321-0 from training. Duration: 0.97 2023-10-03 02:01:33,602 WARNING [train.py:1204] (2/4) Exclude cut with ID _6653_210_2629_1_1527919172441_6732679_1051-182606-0_sp1.1 from training. Duration: 0.936375 2023-10-03 02:01:33,624 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_53555_210_5632_1_1534595220_7232297_603-4396-0_sp1.1 from training. Duration: 0.97275 2023-10-03 02:01:35,043 WARNING [train.py:1204] (2/4) Exclude cut with ID _54218_210_12210_1_1534413646388_4409259_159-144367-0_sp1.1 from training. Duration: 0.963625 2023-10-03 02:01:38,122 WARNING [train.py:1204] (2/4) Exclude cut with ID _7606_210_4915_1_1528894253277_5080690_301-10732-0_sp0.9 from training. Duration: 0.9 2023-10-03 02:01:38,125 WARNING [train.py:1204] (2/4) Exclude cut with ID _15026_210_8750_1_1530777602316_3291850_211-195043-0_sp0.9 from training. Duration: 0.888875 2023-10-03 02:01:39,663 WARNING [train.py:1204] (2/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_146-282400-0 from training. Duration: 0.71 2023-10-03 02:01:39,697 WARNING [train.py:1204] (2/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_65-88242-0 from training. Duration: 0.56 2023-10-03 02:01:41,499 WARNING [train.py:1204] (2/4) Exclude cut with ID _55857_210_2346_1_1534644007831_6898209_80-107757-0_sp1.1 from training. Duration: 0.963625 2023-10-03 02:01:42,885 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_15126_210_6753_1_1530778677_2591185_120-277707-0_sp0.9 from training. Duration: 0.888875 2023-10-03 02:01:45,859 WARNING [train.py:1204] (2/4) Exclude cut with ID _44778_210_2054_1_1533798127203_7443090_1033-168593-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 02:01:45,918 WARNING [train.py:1204] (2/4) Exclude cut with ID _46683_210_9642_1_1533730913492_4591116_475-30711-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 02:01:47,107 WARNING [train.py:1204] (2/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_253-308377-0 from training. Duration: 0.49 2023-10-03 02:01:47,166 WARNING [train.py:1204] (2/4) Exclude cut with ID _4680_210_3525_1_1527249757953_893386_75-78979-0 from training. Duration: 0.91 2023-10-03 02:01:48,619 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_5522_210_3273_1_1527249606_3909096_318-203153-0_sp0.9 from training. Duration: 0.47775 2023-10-03 02:01:49,916 WARNING [train.py:1204] (2/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_550-170116-0_sp1.1 from training. Duration: 0.92725 2023-10-03 02:01:49,951 WARNING [train.py:1204] (2/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_277-210798-0_sp0.9 from training. Duration: 0.611125 2023-10-03 02:01:50,563 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.4.encoder.layers.1.conv_module1.whiten, num_groups=1, num_channels=384, metric=3.07 vs. limit=15.0 2023-10-03 02:01:51,302 WARNING [train.py:1204] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_620-276793-0 from training. Duration: 0.59 2023-10-03 02:01:51,319 WARNING [train.py:1204] (2/4) Exclude cut with ID _3067_210_2667_1_1526212974640_4503168_119-175776-0 from training. Duration: 0.74 2023-10-03 02:01:52,587 INFO [train.py:1046] (2/4) Epoch 32, batch 100, loss[loss=0.1564, simple_loss=0.2446, pruned_loss=0.03414, over 24669.00 frames. ], tot_loss[loss=0.1626, simple_loss=0.2431, pruned_loss=0.0411, over 1900283.11 frames. ], batch size: 68, lr: 3.21e-03, grad_scale: 32.0 2023-10-03 02:01:52,613 WARNING [train.py:1204] (2/4) Exclude cut with ID _55209_210_1531_1_1534675828996_4597160_681-195827-0_sp1.1 from training. Duration: 0.92725 2023-10-03 02:01:52,682 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_427-78097-0_sp1.1 from training. Duration: 0.72725 2023-10-03 02:01:55,323 WARNING [train.py:1204] (2/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_96-290039-0_sp0.9 from training. Duration: 0.62225 2023-10-03 02:01:55,340 WARNING [train.py:1204] (2/4) Exclude cut with ID _45329_210_7005_1_1534157951211_6774339_287-292174-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 02:01:58,162 WARNING [train.py:1204] (2/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_200-292228-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 02:02:00,982 WARNING [train.py:1204] (2/4) Exclude cut with ID _69884_210_9771_1_1535846392818_4166169_291-194999-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 02:02:05,704 WARNING [train.py:1204] (2/4) Exclude cut with ID _58895_210_8750_1_1535184081320_3649089_265-323810-0_sp1.1 from training. Duration: 0.87275 2023-10-03 02:02:07,228 WARNING [train.py:1204] (2/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_58-68956-0 from training. Duration: 0.83 2023-10-03 02:02:07,233 WARNING [train.py:1204] (2/4) Exclude cut with ID _39867_210_9826_1_1533553222030_9344259_322-115153-0_sp1.1 from training. Duration: 0.963625 2023-10-03 02:02:12,632 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_3371_210_1794_1_1526384462_5580777_396-339812-0_sp1.1 from training. Duration: 0.77275 2023-10-03 02:02:12,643 WARNING [train.py:1204] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_731-2428-0_sp0.9 from training. Duration: 0.92225 2023-10-03 02:02:12,669 WARNING [train.py:1204] (2/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_98-741-0_sp1.1 from training. Duration: 0.72725 2023-10-03 02:02:12,678 WARNING [train.py:1204] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_238-245655-0_sp0.9 from training. Duration: 0.9 2023-10-03 02:02:13,914 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6788_210_1388_1_1527898698_4562021_91-197981-0_sp0.9 from training. Duration: 0.92225 2023-10-03 02:02:14,024 WARNING [train.py:1204] (2/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_860-96198-0 from training. Duration: 0.73 2023-10-03 02:02:14,685 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.3.encoder.layers.0.whiten, num_groups=1, num_channels=512, metric=5.65 vs. limit=12.0 2023-10-03 02:02:17,325 WARNING [train.py:1204] (2/4) Exclude cut with ID _14268_210_9206_1_1530622771285_4714099_331-105351-0_sp0.9 from training. Duration: 0.72225 2023-10-03 02:02:17,362 WARNING [train.py:1204] (2/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_204-137133-0_sp1.1 from training. Duration: 0.92725 2023-10-03 02:02:18,726 WARNING [train.py:1204] (2/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_5-46496-0_sp1.1 from training. Duration: 0.936375 2023-10-03 02:02:18,741 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_160-218721-0_sp1.1 from training. Duration: 0.87275 2023-10-03 02:02:20,952 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.ff2_skip_rate, batch_count=1098640.0, ans=0.0 2023-10-03 02:02:22,085 WARNING [train.py:1204] (2/4) Exclude cut with ID _45329_210_7005_1_1534157951211_6774339_503-287883-0 from training. Duration: 0.88 2023-10-03 02:02:22,172 WARNING [train.py:1204] (2/4) Exclude cut with ID _57572_210_5517_1_1534921219624_3809484_110-79134-0_sp1.1 from training. Duration: 0.92725 2023-10-03 02:02:23,474 WARNING [train.py:1204] (2/4) Exclude cut with ID _44585_210_18628_1_1533605350387_4518199_421-37641-0_sp1.1 from training. Duration: 0.936375 2023-10-03 02:02:23,555 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_42-142473-0_sp0.9 from training. Duration: 0.8 2023-10-03 02:02:25,212 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.ff3_skip_rate, batch_count=1098640.0, ans=0.0 2023-10-03 02:02:26,326 WARNING [train.py:1204] (2/4) Exclude cut with ID _5166_210_2084_1_1527073197160_4087993_310-114452-0_sp0.9 from training. Duration: 0.6555625 2023-10-03 02:02:29,494 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9029W0120-509520 from training. Duration: 0.768 2023-10-03 02:02:29,509 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9029W0433-509820 from training. Duration: 0.896 2023-10-03 02:02:30,939 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_40222_210_6228_1_1533385298_4481939_163-334586-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 02:02:30,939 WARNING [train.py:1204] (2/4) Exclude cut with ID _48677_210_5663_1_1533902405919_3892989_408-40740-0_sp1.1 from training. Duration: 0.7545625 2023-10-03 02:02:35,014 WARNING [train.py:1204] (2/4) Exclude cut with ID _24314_210_4915_1_1532135078116_7170179_521-258868-0_sp0.9 from training. Duration: 0.87775 2023-10-03 02:02:38,153 WARNING [train.py:1204] (2/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_576-51198-0_sp1.1 from training. Duration: 0.92725 2023-10-03 02:02:39,494 WARNING [train.py:1204] (2/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_306-216871-0_sp1.1 from training. Duration: 0.97275 2023-10-03 02:02:41,175 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.bypass.scale_min, batch_count=1098706.6666666667, ans=0.2 2023-10-03 02:02:43,648 WARNING [train.py:1204] (2/4) Exclude cut with ID _20684_210_6917_1_1531567751646_4203930_463-119982-0_sp1.1 from training. Duration: 0.97275 2023-10-03 02:02:45,378 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9084W0012-536850 from training. Duration: 0.64 2023-10-03 02:02:47,224 WARNING [train.py:1204] (2/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_847-41334-0_sp1.1 from training. Duration: 0.4 2023-10-03 02:02:49,848 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.559e+02 1.824e+02 1.996e+02 2.210e+02 3.286e+02, threshold=3.993e+02, percent-clipped=0.0 2023-10-03 02:02:51,268 WARNING [train.py:1204] (2/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_326-165900-0_sp1.1 from training. Duration: 0.62725 2023-10-03 02:02:52,602 WARNING [train.py:1204] (2/4) Exclude cut with ID _7537_210_2084_1_1528502395431_3924359_128-268029-0_sp1.1 from training. Duration: 0.8909375 2023-10-03 02:02:54,571 WARNING [train.py:1204] (2/4) Exclude cut with ID _67499_210_17282_1_1535509805220_3692430_333-155030-0_sp1.1 from training. Duration: 0.97275 2023-10-03 02:02:57,397 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_34008_210_10431_1_1533345424_8183247_588-257177-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 02:02:58,925 WARNING [train.py:1204] (2/4) Exclude cut with ID _25914_210_6400_1_1532141317058_644740_52-96808-0_sp1.1 from training. Duration: 0.82725 2023-10-03 02:02:59,078 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.conv_skip_rate, batch_count=1098773.3333333333, ans=0.0 2023-10-03 02:03:00,387 WARNING [train.py:1204] (2/4) Exclude cut with ID _56494_210_4220_1_1535284837625_3033169_69-138150-0_sp1.1 from training. Duration: 0.8545625 2023-10-03 02:03:03,050 WARNING [train.py:1204] (2/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_19-143553-0_sp1.1 from training. Duration: 0.97275 2023-10-03 02:03:04,283 WARNING [train.py:1204] (2/4) Exclude cut with ID _21878_210_3525_1_1531738772002_7264330_582-130735-0_sp1.1 from training. Duration: 0.936375 2023-10-03 02:03:04,401 WARNING [train.py:1204] (2/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_943-201356-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 02:03:04,411 WARNING [train.py:1204] (2/4) Exclude cut with ID _3231_210_2106_1_1526205864934_6784220_422-229276-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 02:03:04,427 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_48049_210_5632_1_1533864553_3519937_73-198717-0_sp1.1 from training. Duration: 0.97275 2023-10-03 02:03:05,706 INFO [train.py:1046] (2/4) Epoch 32, batch 150, loss[loss=0.1584, simple_loss=0.2365, pruned_loss=0.04011, over 23630.00 frames. ], tot_loss[loss=0.1635, simple_loss=0.2433, pruned_loss=0.04178, over 2541636.19 frames. ], batch size: 149, lr: 3.21e-03, grad_scale: 16.0 2023-10-03 02:03:05,818 WARNING [train.py:1204] (2/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_329-285801-0 from training. Duration: 0.91 2023-10-03 02:03:05,840 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9171W0114-579000 from training. Duration: 0.896 2023-10-03 02:03:07,566 WARNING [train.py:1204] (2/4) Exclude cut with ID _3120_210_3525_1_1526036143112_4072980_210-132675-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 02:03:07,653 WARNING [train.py:1204] (2/4) Exclude cut with ID _55857_210_2346_1_1534644007831_6898209_745-98391-0_sp0.9 from training. Duration: 0.8444375 2023-10-03 02:03:07,778 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.conv_module1.balancer2.prob, batch_count=1098840.0, ans=0.125 2023-10-03 02:03:08,938 WARNING [train.py:1204] (2/4) Exclude cut with ID _5250_210_5219_1_1527249561806_8692110_615-322035-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 02:03:08,943 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_36756_210_7234_1_1533026991_966276_119-315767-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 02:03:08,965 WARNING [train.py:1204] (2/4) Exclude cut with ID _13916_210_9994_1_1530788341126_3763149_60-191556-0_sp1.1 from training. Duration: 0.5545625 2023-10-03 02:03:08,979 WARNING [train.py:1204] (2/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_177-236809-0_sp1.1 from training. Duration: 0.4454375 2023-10-03 02:03:09,022 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7002_210_1794_1_1527939683_5157286_184-229630-0_sp1.1 from training. Duration: 0.67275 2023-10-03 02:03:10,262 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_50428_210_5724_1_1534247532_4126418_464-18322-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 02:03:10,337 WARNING [train.py:1204] (2/4) Exclude cut with ID _35799_210_5854_1_1533263487887_3529071_466-131402-0_sp1.1 from training. Duration: 0.936375 2023-10-03 02:03:11,689 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_1259-132768-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 02:03:11,737 WARNING [train.py:1204] (2/4) Exclude cut with ID _6694_210_4599_1_1527852637490_3965529_144-185051-0_sp1.1 from training. Duration: 0.87275 2023-10-03 02:03:13,119 WARNING [train.py:1204] (2/4) Exclude cut with ID _64639_210_5816_1_1535367619437_3528000_59-133290-0_sp1.1 from training. Duration: 0.963625 2023-10-03 02:03:15,159 WARNING [train.py:1204] (2/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_223-49525-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 02:03:18,347 WARNING [train.py:1204] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_748-36150-0_sp1.1 from training. Duration: 0.82725 2023-10-03 02:03:18,348 WARNING [train.py:1204] (2/4) Exclude cut with ID _19896_210_1883_1_1531709022662_6649569_52-221627-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 02:03:18,406 WARNING [train.py:1204] (2/4) Exclude cut with ID _6789_210_1534_1_1527854471671_2870519_296-115766-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 02:03:21,073 WARNING [train.py:1204] (2/4) Exclude cut with ID _14624_210_1925_1_1530858690657_3642219_100-19871-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 02:03:21,119 WARNING [train.py:1204] (2/4) Exclude cut with ID _12555_210_8750_1_1530075594402_3780239_50-293675-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 02:03:24,250 WARNING [train.py:1204] (2/4) Exclude cut with ID _14880_210_8895_1_1530680426655_3795259_289-212817-0_sp1.1 from training. Duration: 0.62725 2023-10-03 02:03:24,304 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_14056_210_4163_1_1530604483_7572827_388-211626-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 02:03:29,545 WARNING [train.py:1204] (2/4) Exclude cut with ID _5289_210_2084_1_1527678202736_3779299_392-261988-0 from training. Duration: 0.65 2023-10-03 02:03:29,553 WARNING [train.py:1204] (2/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_124-345840-0 from training. Duration: 0.52 2023-10-03 02:03:29,563 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_2655_210_2087_1_1525688959_3897400_437-337690-0 from training. Duration: 0.63 2023-10-03 02:03:32,331 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_3780_210_2087_1_1526467391_239068_13-274633-0_sp1.1 from training. Duration: 0.92725 2023-10-03 02:03:32,338 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_805-71497-0_sp1.1 from training. Duration: 0.6454375 2023-10-03 02:03:33,726 WARNING [train.py:1204] (2/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_434-83000-0_sp1.1 from training. Duration: 0.8909375 2023-10-03 02:03:35,144 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_17613_210_2151_1_1531137569_2366160_285-219531-0_sp1.1 from training. Duration: 0.97275 2023-10-03 02:03:35,149 WARNING [train.py:1204] (2/4) Exclude cut with ID _56108_210_16380_1_1534505446159_3887959_22-37858-0_sp1.1 from training. Duration: 0.936375 2023-10-03 02:03:35,188 WARNING [train.py:1204] (2/4) Exclude cut with ID _28427_210_4220_1_1532581240967_3061069_200-88444-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 02:03:36,489 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_43802_210_2290_1_1533516203_8829504_77-279042-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 02:03:39,780 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9309W0193-640320 from training. Duration: 0.896 2023-10-03 02:03:41,150 WARNING [train.py:1204] (2/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_516-324055-0_sp1.1 from training. Duration: 0.936375 2023-10-03 02:03:45,856 WARNING [train.py:1204] (2/4) Exclude cut with ID _40094_210_16380_1_1533294005773_3641159_10-261240-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 02:03:49,306 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.bypass.skip_rate, batch_count=1099040.0, ans=0.07 2023-10-03 02:03:50,416 WARNING [train.py:1204] (2/4) Exclude cut with ID _6267_210_2084_1_1527764658526_3894730_375-219887-0_sp1.1 from training. Duration: 0.6090625 2023-10-03 02:03:51,730 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6788_210_1388_1_1527898698_4562021_180-75288-0 from training. Duration: 0.99 2023-10-03 02:03:55,790 WARNING [train.py:1204] (2/4) Exclude cut with ID _8030_210_7497_1_1528444886781_3531089_262-86750-0_sp1.1 from training. Duration: 0.736375 2023-10-03 02:03:55,811 WARNING [train.py:1204] (2/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_934-325498-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 02:03:55,846 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_456-22141-0_sp0.9 from training. Duration: 0.97775 2023-10-03 02:03:57,791 WARNING [train.py:1204] (2/4) Exclude cut with ID _49643_210_7989_1_1534640244224_3939031_598-161926-0_sp1.1 from training. Duration: 0.8181875 2023-10-03 02:04:00,674 WARNING [train.py:1204] (2/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_345-49721-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 02:04:00,744 WARNING [train.py:1204] (2/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_440-141811-0_sp1.1 from training. Duration: 0.863625 2023-10-03 02:04:03,480 WARNING [train.py:1204] (2/4) Exclude cut with ID _64048_210_10626_1_1535198382802_3835179_394-212120-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 02:04:03,527 WARNING [train.py:1204] (2/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_91-277684-0 from training. Duration: 0.74 2023-10-03 02:04:07,755 WARNING [train.py:1204] (2/4) Exclude cut with ID _23038_210_9570_1_1532001584158_3652419_191-346343-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 02:04:09,099 WARNING [train.py:1204] (2/4) Exclude cut with ID _15140_210_9288_1_1530791388450_4880149_351-347974-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 02:04:09,136 WARNING [train.py:1204] (2/4) Exclude cut with ID _58605_210_12210_1_1534723195939_3904259_48-269402-0_sp1.1 from training. Duration: 0.87275 2023-10-03 02:04:09,149 WARNING [train.py:1204] (2/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_401-53423-0_sp0.9 from training. Duration: 0.611125 2023-10-03 02:04:10,554 WARNING [train.py:1204] (2/4) Exclude cut with ID _42659_210_2070_1_1533607442765_3120380_158-49004-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 02:04:12,481 WARNING [train.py:1204] (2/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_81-75849-0_sp1.1 from training. Duration: 0.3545625 2023-10-03 02:04:15,321 WARNING [train.py:1204] (2/4) Exclude cut with ID _27052_210_5986_1_1532329197187_3585009_314-106499-0_sp1.1 from training. Duration: 0.836375 2023-10-03 02:04:15,424 WARNING [train.py:1204] (2/4) Exclude cut with ID _4626_210_3532_1_1526817652717_3916859_446-206656-0_sp0.9 from training. Duration: 0.8444375 2023-10-03 02:04:16,767 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_53378_210_17282_1_1534221071_5769509_158-114960-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 02:04:16,998 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.bypass_mid.scale_min, batch_count=1099106.6666666667, ans=0.2 2023-10-03 02:04:18,807 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_3171_210_2087_1_1526109084_2330007_37-64334-0_sp0.9 from training. Duration: 0.911125 2023-10-03 02:04:20,037 INFO [train.py:1046] (2/4) Epoch 32, batch 200, loss[loss=0.1631, simple_loss=0.2472, pruned_loss=0.03949, over 23492.00 frames. ], tot_loss[loss=0.1645, simple_loss=0.2444, pruned_loss=0.04229, over 3029048.73 frames. ], batch size: 93, lr: 3.21e-03, grad_scale: 16.0 2023-10-03 02:04:20,100 WARNING [train.py:1204] (2/4) Exclude cut with ID _5410_210_1388_1_1527397055795_1719020_39-112221-0 from training. Duration: 0.69 2023-10-03 02:04:20,125 WARNING [train.py:1204] (2/4) Exclude cut with ID _8064_210_5399_1_1528372299291_4282973_4-91037-0_sp0.9 from training. Duration: 0.97775 2023-10-03 02:04:20,145 WARNING [train.py:1204] (2/4) Exclude cut with ID ID0079W0443-717795 from training. Duration: 0.541 2023-10-03 02:04:24,779 WARNING [train.py:1204] (2/4) Exclude cut with ID _34734_210_4525_1_1533365977542_4448720_766-297928-0_sp1.1 from training. Duration: 0.936375 2023-10-03 02:04:26,984 WARNING [train.py:1204] (2/4) Exclude cut with ID _51471_210_1889_1_1534399391390_3094250_310-337793-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 02:04:28,077 WARNING [train.py:1204] (2/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_678-159307-0_sp0.9 from training. Duration: 0.9444375 2023-10-03 02:04:30,815 WARNING [train.py:1204] (2/4) Exclude cut with ID _50093_210_4220_1_1534417141207_3432299_259-322252-0 from training. Duration: 0.92 2023-10-03 02:04:32,081 WARNING [train.py:1204] (2/4) Exclude cut with ID _47790_210_16187_1_1533862814066_4207519_67-103444-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 02:04:32,107 WARNING [train.py:1204] (2/4) Exclude cut with ID _40551_210_10635_1_1533776424632_3416179_311-247578-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 02:04:34,895 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_4633_210_2040_1_1526824163_4145343_416-76117-0 from training. Duration: 0.95 2023-10-03 02:04:36,271 WARNING [train.py:1204] (2/4) Exclude cut with ID _5166_210_2084_1_1527073197160_4087993_331-117829-0_sp0.9 from training. Duration: 0.688875 2023-10-03 02:04:37,673 WARNING [train.py:1204] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_299-247059-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 02:04:38,983 WARNING [train.py:1204] (2/4) Exclude cut with ID _64002_210_9570_1_1535198605157_38979_2-240845-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 02:04:42,369 WARNING [train.py:1204] (2/4) Exclude cut with ID _4681_210_3525_1_1527250689464_120889_15-126760-0_sp0.9 from training. Duration: 0.9 2023-10-03 02:04:42,386 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_21500_210_2054_1_1532149310_2716758_256-156611-0_sp1.1 from training. Duration: 0.936375 2023-10-03 02:04:43,678 WARNING [train.py:1204] (2/4) Exclude cut with ID _16908_210_5724_1_1531119157248_3772700_624-203729-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 02:04:45,562 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.bypass_mid.scale_min, batch_count=1099240.0, ans=0.2 2023-10-03 02:04:46,938 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.bypass.skip_rate, batch_count=1099240.0, ans=0.07 2023-10-03 02:05:00,479 WARNING [train.py:1204] (2/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_605-219732-0_sp1.1 from training. Duration: 0.8545625 2023-10-03 02:05:00,517 WARNING [train.py:1204] (2/4) Exclude cut with ID _44718_210_5816_1_1534050051869_6469030_380-167520-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 02:05:01,875 WARNING [train.py:1204] (2/4) Exclude cut with ID _19896_210_1883_1_1531709022662_6649569_94-229927-0_sp1.1 from training. Duration: 0.8818125 2023-10-03 02:05:03,187 WARNING [train.py:1204] (2/4) Exclude cut with ID _19514_210_9259_1_1531530071330_5084230_519-174300-0_sp1.1 from training. Duration: 0.963625 2023-10-03 02:05:03,253 WARNING [train.py:1204] (2/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_340-174176-0_sp0.9 from training. Duration: 0.5444375 2023-10-03 02:05:03,256 WARNING [train.py:1204] (2/4) Exclude cut with ID _8263_210_1664_1_1528439452891_970220_68-181805-0_sp1.1 from training. Duration: 0.5818125 2023-10-03 02:05:04,624 WARNING [train.py:1204] (2/4) Exclude cut with ID _3120_210_3525_1_1526036143112_4072980_348-257723-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 02:05:05,949 WARNING [train.py:1204] (2/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_453-133983-0_sp1.1 from training. Duration: 0.7090625 2023-10-03 02:05:06,638 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.3.encoder.layers.2.feed_forward1.out_whiten, num_groups=1, num_channels=512, metric=7.66 vs. limit=15.0 2023-10-03 02:05:07,310 WARNING [train.py:1204] (2/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_17-86060-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 02:05:07,330 WARNING [train.py:1204] (2/4) Exclude cut with ID _29877_210_6228_1_1532397791106_5276149_228-184657-0_sp1.1 from training. Duration: 0.97275 2023-10-03 02:05:07,437 WARNING [train.py:1204] (2/4) Exclude cut with ID _18516_210_9206_1_1531704653206_3692406_198-334870-0 from training. Duration: 0.92 2023-10-03 02:05:09,222 WARNING [train.py:1204] (2/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_340-174176-0_sp1.1 from training. Duration: 0.4454375 2023-10-03 02:05:09,234 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_3133_210_2087_1_1526106446_2324690_14-139186-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 02:05:12,123 WARNING [train.py:1204] (2/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_126-29104-0_sp0.9 from training. Duration: 0.9444375 2023-10-03 02:05:15,803 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.0.layers.1.conv_module1.whiten, num_groups=1, num_channels=192, metric=8.68 vs. limit=15.0 2023-10-03 02:05:16,405 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.conv_module2.balancer2.min_abs, batch_count=1099373.3333333333, ans=0.5 2023-10-03 02:05:16,468 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=1099373.3333333333, ans=0.1 2023-10-03 02:05:17,786 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.ff2_skip_rate, batch_count=1099440.0, ans=0.0 2023-10-03 02:05:18,854 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.373e+02 1.798e+02 1.948e+02 2.252e+02 2.874e+02, threshold=3.895e+02, percent-clipped=0.0 2023-10-03 02:05:18,944 WARNING [train.py:1204] (2/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_84-10310-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 02:05:22,304 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.0.attention_skip_rate, batch_count=1099440.0, ans=0.0 2023-10-03 02:05:25,611 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_15517_210_6753_1_1530954008_3487336_79-273076-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 02:05:27,486 WARNING [train.py:1204] (2/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_371-18169-0_sp0.9 from training. Duration: 0.9555625 2023-10-03 02:05:33,043 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_68702_210_23689_1_1535799577_3420068_170-92788-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 02:05:34,320 INFO [train.py:1046] (2/4) Epoch 32, batch 250, loss[loss=0.1583, simple_loss=0.2214, pruned_loss=0.04758, over 23787.00 frames. ], tot_loss[loss=0.1647, simple_loss=0.2437, pruned_loss=0.04285, over 3400827.12 frames. ], batch size: 164, lr: 3.21e-03, grad_scale: 16.0 2023-10-03 02:05:34,465 WARNING [train.py:1204] (2/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_78-182912-0 from training. Duration: 0.59 2023-10-03 02:05:35,810 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_3019_210_2028_1_1526044245_3264865_297-12384-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 02:05:35,815 WARNING [train.py:1204] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_147-111137-0_sp1.1 from training. Duration: 0.863625 2023-10-03 02:05:35,820 WARNING [train.py:1204] (2/4) Exclude cut with ID _28323_210_16187_1_1532480395667_4100300_117-8859-0_sp1.1 from training. Duration: 0.97275 2023-10-03 02:05:37,228 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7762_210_1794_1_1528199508_4057408_419-125009-0_sp1.1 from training. Duration: 0.7545625 2023-10-03 02:05:37,319 WARNING [train.py:1204] (2/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_268-87909-0 from training. Duration: 0.99 2023-10-03 02:05:39,116 WARNING [train.py:1204] (2/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_118-311357-0_sp1.1 from training. Duration: 0.92725 2023-10-03 02:05:39,138 WARNING [train.py:1204] (2/4) Exclude cut with ID ID0377W0003-870225 from training. Duration: 0.852 2023-10-03 02:05:40,594 WARNING [train.py:1204] (2/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_56-5838-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 02:05:43,252 WARNING [train.py:1204] (2/4) Exclude cut with ID _14603_210_4220_1_1530615688441_3097319_90-31383-0_sp0.9 from training. Duration: 0.9666875 2023-10-03 02:05:44,536 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_31583_210_3852_1_1532595851_4024413_58-86657-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 02:05:44,574 WARNING [train.py:1204] (2/4) Exclude cut with ID _30304_210_6319_1_1532476827261_7582060_344-113875-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 02:05:46,429 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.4.encoder.layers.2.self_attn1.whiten, num_groups=1, num_channels=384, metric=14.92 vs. limit=22.5 2023-10-03 02:05:47,330 WARNING [train.py:1204] (2/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_278-104676-0_sp1.1 from training. Duration: 0.8909375 2023-10-03 02:05:47,365 WARNING [train.py:1204] (2/4) Exclude cut with ID _55844_210_2346_1_1534471212569_7625707_764-2387-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 02:05:48,774 WARNING [train.py:1204] (2/4) Exclude cut with ID _27430_210_7770_1_1532604689375_1378590_157-14963-0_sp1.1 from training. Duration: 0.963625 2023-10-03 02:05:51,509 WARNING [train.py:1204] (2/4) Exclude cut with ID _7160_210_1876_1_1527940923231_3703799_282-322611-0_sp1.1 from training. Duration: 0.7909375 2023-10-03 02:06:01,503 WARNING [train.py:1204] (2/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_190-138640-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 02:06:02,939 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_33348_210_5632_1_1533086912_3324030_63-270922-0_sp1.1 from training. Duration: 0.97275 2023-10-03 02:06:04,369 WARNING [train.py:1204] (2/4) Exclude cut with ID _54871_210_22356_1_1534665798253_3856359_135-184281-0_sp1.1 from training. Duration: 0.9 2023-10-03 02:06:10,433 WARNING [train.py:1204] (2/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_277-210798-0_sp1.1 from training. Duration: 0.5 2023-10-03 02:06:11,752 WARNING [train.py:1204] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_402-253027-0_sp0.9 from training. Duration: 0.6 2023-10-03 02:06:13,104 WARNING [train.py:1204] (2/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_540-275685-0_sp0.9 from training. Duration: 0.988875 2023-10-03 02:06:14,446 WARNING [train.py:1204] (2/4) Exclude cut with ID _59493_210_17282_1_1535078419077_3225879_118-18960-0_sp1.1 from training. Duration: 0.936375 2023-10-03 02:06:15,868 WARNING [train.py:1204] (2/4) Exclude cut with ID _6194_210_1534_1_1527511826160_3941194_321-148641-0_sp1.1 from training. Duration: 0.5818125 2023-10-03 02:06:15,875 WARNING [train.py:1204] (2/4) Exclude cut with ID _61133_210_9969_1_1535196777227_3696409_333-270325-0_sp1.1 from training. Duration: 0.8454375 2023-10-03 02:06:15,921 WARNING [train.py:1204] (2/4) Exclude cut with ID _49376_210_5986_1_1533985278732_3370740_307-145221-0_sp1.1 from training. Duration: 0.936375 2023-10-03 02:06:16,057 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6846_210_6753_1_1527850767_4295158_2-315073-0_sp0.9 from training. Duration: 0.97775 2023-10-03 02:06:16,222 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.conv_module2.balancer2.min_abs, batch_count=1099640.0, ans=0.5 2023-10-03 02:06:18,795 WARNING [train.py:1204] (2/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_17-25045-0 from training. Duration: 0.59 2023-10-03 02:06:20,189 WARNING [train.py:1204] (2/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_503-333510-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 02:06:21,603 WARNING [train.py:1204] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_238-245655-0_sp1.1 from training. Duration: 0.736375 2023-10-03 02:06:21,634 WARNING [train.py:1204] (2/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_329-82235-0_sp0.9 from training. Duration: 0.911125 2023-10-03 02:06:21,638 WARNING [train.py:1204] (2/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_325-113798-0_sp0.9 from training. Duration: 0.9444375 2023-10-03 02:06:21,708 WARNING [train.py:1204] (2/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_395-37222-0_sp0.9 from training. Duration: 0.8444375 2023-10-03 02:06:23,448 WARNING [train.py:1204] (2/4) Exclude cut with ID _64135_210_2253_1_1535677267876_7608972_498-5039-0_sp1.1 from training. Duration: 0.7090625 2023-10-03 02:06:23,464 WARNING [train.py:1204] (2/4) Exclude cut with ID _14922_210_6228_1_1530707435273_3693310_303-228738-0_sp0.9 from training. Duration: 0.9333125 2023-10-03 02:06:25,491 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_577-220899-0_sp1.1 from training. Duration: 0.92725 2023-10-03 02:06:26,855 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_3475_210_2028_1_1526193578_3622569_377-166133-0_sp1.1 from training. Duration: 0.7818125 2023-10-03 02:06:28,098 WARNING [train.py:1204] (2/4) Exclude cut with ID _49376_210_5986_1_1533985278732_3370740_272-313512-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 02:06:30,123 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.4.encoder.layers.0.nonlin_attention.whiten2, num_groups=1, num_channels=384, metric=8.77 vs. limit=15.0 2023-10-03 02:06:30,982 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7762_210_1794_1_1528199508_4057408_422-227034-0_sp0.9 from training. Duration: 0.811125 2023-10-03 02:06:35,024 WARNING [train.py:1204] (2/4) Exclude cut with ID _18895_210_6228_1_1531357274234_3784930_74-341428-0_sp1.1 from training. Duration: 0.92725 2023-10-03 02:06:36,515 WARNING [train.py:1204] (2/4) Exclude cut with ID _44902_210_16187_1_1533888006062_3792078_149-67212-0_sp1.1 from training. Duration: 0.7909375 2023-10-03 02:06:42,419 WARNING [train.py:1204] (2/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_243-126020-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 02:06:43,825 WARNING [train.py:1204] (2/4) Exclude cut with ID _54218_210_12210_1_1534413646388_4409259_259-6561-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 02:06:45,361 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_41452_210_7291_1_1533437687_3941281_405-11559-0 from training. Duration: 0.91 2023-10-03 02:06:46,809 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_20120_210_6753_1_1531643035_5482859_759-225084-0_sp1.1 from training. Duration: 0.97275 2023-10-03 02:06:46,823 WARNING [train.py:1204] (2/4) Exclude cut with ID _14747_210_8341_1_1530685797463_3654089_147-131229-0_sp0.9 from training. Duration: 0.8444375 2023-10-03 02:06:48,126 INFO [train.py:1046] (2/4) Epoch 32, batch 300, loss[loss=0.171, simple_loss=0.2367, pruned_loss=0.05268, over 23722.00 frames. ], tot_loss[loss=0.163, simple_loss=0.2409, pruned_loss=0.04254, over 3675447.13 frames. ], batch size: 179, lr: 3.21e-03, grad_scale: 16.0 2023-10-03 02:06:48,255 WARNING [train.py:1204] (2/4) Exclude cut with ID _14867_210_1968_1_1530684179890_3846969_319-250868-0 from training. Duration: 0.99 2023-10-03 02:06:49,479 WARNING [train.py:1204] (2/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_580-93798-0_sp0.9 from training. Duration: 0.62225 2023-10-03 02:06:51,938 WARNING [train.py:1204] (2/4) Exclude cut with ID _9679_210_7497_1_1529129212906_4363929_512-313889-0_sp0.9 from training. Duration: 0.9 2023-10-03 02:06:51,946 WARNING [train.py:1204] (2/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_18-189198-0 from training. Duration: 0.9 2023-10-03 02:06:54,596 WARNING [train.py:1204] (2/4) Exclude cut with ID _49755_210_18053_1_1534581005846_3218709_137-64179-0_sp1.1 from training. Duration: 0.92725 2023-10-03 02:06:56,418 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_48049_210_5632_1_1533864553_3519937_306-283794-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 02:06:59,115 WARNING [train.py:1204] (2/4) Exclude cut with ID _43552_210_8913_1_1533726031059_3523429_25-120161-0_sp1.1 from training. Duration: 0.8909375 2023-10-03 02:06:59,175 WARNING [train.py:1204] (2/4) Exclude cut with ID _8833_210_5219_1_1528592665210_7553059_319-349181-0 from training. Duration: 0.99 2023-10-03 02:07:00,564 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_296-332325-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 02:07:00,649 WARNING [train.py:1204] (2/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_436-186311-0_sp1.1 from training. Duration: 0.5909375 2023-10-03 02:07:00,668 WARNING [train.py:1204] (2/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_348-127789-0 from training. Duration: 0.85 2023-10-03 02:07:00,672 WARNING [train.py:1204] (2/4) Exclude cut with ID _3992_210_1747_1_1526644816031_3731060_443-195183-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 02:07:02,123 INFO [scaling.py:1118] (2/4) WithLoss: name=encoder.encoders.0.layers.1.self_attn_weights, loss-sum=0.000e+00 2023-10-03 02:07:02,127 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.1.conv_module1.balancer1.prob, batch_count=1099906.6666666667, ans=0.125 2023-10-03 02:07:05,955 WARNING [train.py:1204] (2/4) Exclude cut with ID _3465_210_3525_1_1526175667060_3574049_308-231735-0_sp1.1 from training. Duration: 0.663625 2023-10-03 02:07:10,648 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6786_210_1388_1_1527839699_4194495_378-46070-0_sp1.1 from training. Duration: 0.6545625 2023-10-03 02:07:10,705 WARNING [train.py:1204] (2/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_63-101996-0 from training. Duration: 0.88 2023-10-03 02:07:13,471 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_2541_210_1794_1_1525590269_3084765_156-173713-0 from training. Duration: 0.81 2023-10-03 02:07:13,520 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_217-239796-0_sp1.1 from training. Duration: 0.963625 2023-10-03 02:07:14,890 WARNING [train.py:1204] (2/4) Exclude cut with ID _21878_210_3525_1_1531738772002_7264330_530-329400-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 02:07:18,274 WARNING [train.py:1204] (2/4) Exclude cut with ID _21783_210_11357_1_1531742458730_3762679_150-62920-0_sp1.1 from training. Duration: 0.963625 2023-10-03 02:07:18,274 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_5522_210_3273_1_1527249606_3909096_366-172508-0 from training. Duration: 0.66 2023-10-03 02:07:18,276 WARNING [train.py:1204] (2/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_303-340446-0_sp1.1 from training. Duration: 0.8181875 2023-10-03 02:07:19,515 WARNING [train.py:1204] (2/4) Exclude cut with ID _8699_210_2069_1_1528630346831_3960409_160-24408-0_sp1.1 from training. Duration: 0.82725 2023-10-03 02:07:21,381 WARNING [train.py:1204] (2/4) Exclude cut with ID _41314_210_12651_1_1533459476543_3766460_299-187801-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 02:07:21,406 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_34886_210_10633_1_1533203882_1755661_92-345288-0_sp1.1 from training. Duration: 0.97275 2023-10-03 02:07:27,517 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_5438_210_1794_1_1527336687_2600009_104-288274-0_sp1.1 from training. Duration: 0.52725 2023-10-03 02:07:27,521 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_154-316450-0 from training. Duration: 0.76 2023-10-03 02:07:27,591 WARNING [train.py:1204] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_23-189234-0_sp0.9 from training. Duration: 0.92225 2023-10-03 02:07:30,334 WARNING [train.py:1204] (2/4) Exclude cut with ID _4779_210_2084_1_1527318022944_3914893_346-168820-0_sp1.1 from training. Duration: 0.963625 2023-10-03 02:07:31,757 WARNING [train.py:1204] (2/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_5-55893-0 from training. Duration: 0.55 2023-10-03 02:07:31,834 WARNING [train.py:1204] (2/4) Exclude cut with ID _20455_210_5219_1_1531565996775_8098100_180-24517-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 02:07:34,640 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_3133_210_2087_1_1526106446_2324690_243-143119-0_sp0.9 from training. Duration: 0.8555625 2023-10-03 02:07:37,954 WARNING [train.py:1204] (2/4) Exclude cut with ID _65585_210_12210_1_1535450451584_3764098_293-212045-0_sp1.1 from training. Duration: 0.92725 2023-10-03 02:07:37,958 WARNING [train.py:1204] (2/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_577-332600-0 from training. Duration: 0.57 2023-10-03 02:07:40,916 WARNING [train.py:1204] (2/4) Exclude cut with ID _30039_210_13270_1_1532484612900_2725150_104-289874-0_sp1.1 from training. Duration: 0.963625 2023-10-03 02:07:40,923 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_3172_210_2087_1_1526292929_3987865_197-97762-0_sp1.1 from training. Duration: 0.6818125 2023-10-03 02:07:43,650 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_256-150908-0_sp1.1 from training. Duration: 0.963625 2023-10-03 02:07:45,543 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_296-287678-0_sp1.1 from training. Duration: 0.863625 2023-10-03 02:07:45,575 WARNING [train.py:1204] (2/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_443-30034-0 from training. Duration: 0.64 2023-10-03 02:07:45,601 WARNING [train.py:1204] (2/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_494-199538-0_sp0.9 from training. Duration: 0.8333125 2023-10-03 02:07:46,920 WARNING [train.py:1204] (2/4) Exclude cut with ID _19708_210_9206_1_1532136650733_4238364_61-162325-0_sp1.1 from training. Duration: 0.97275 2023-10-03 02:07:47,017 WARNING [train.py:1204] (2/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_475-127455-0 from training. Duration: 0.7 2023-10-03 02:07:50,053 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.593e+02 1.863e+02 2.088e+02 2.387e+02 3.568e+02, threshold=4.176e+02, percent-clipped=0.0 2023-10-03 02:07:50,186 WARNING [train.py:1204] (2/4) Exclude cut with ID _48677_210_5663_1_1533902405919_3892989_188-11581-0_sp1.1 from training. Duration: 0.963625 2023-10-03 02:07:50,209 WARNING [train.py:1204] (2/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_136-8381-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 02:07:51,009 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.2.encoder.layers.0.whiten, num_groups=1, num_channels=384, metric=5.59 vs. limit=12.0 2023-10-03 02:07:51,595 WARNING [train.py:1204] (2/4) Exclude cut with ID _55328_210_27767_1_1534580973260_4088170_270-229555-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 02:07:53,289 WARNING [train.py:1204] (2/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_372-266951-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 02:07:54,758 WARNING [train.py:1204] (2/4) Exclude cut with ID _47790_210_16187_1_1533862814066_4207519_14-242149-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 02:07:56,358 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.0.ff2_skip_rate, batch_count=1100106.6666666667, ans=0.0 2023-10-03 02:08:00,491 WARNING [train.py:1204] (2/4) Exclude cut with ID _7877_210_6014_1_1528286358099_3750136_159-76512-0_sp1.1 from training. Duration: 0.936375 2023-10-03 02:08:00,493 WARNING [train.py:1204] (2/4) Exclude cut with ID _8263_210_1664_1_1528438953494_294100_40-204731-0_sp1.1 from training. Duration: 0.4545625 2023-10-03 02:08:02,547 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.2.encoder.layers.2.whiten, num_groups=1, num_channels=384, metric=4.52 vs. limit=12.0 2023-10-03 02:08:03,151 INFO [train.py:1046] (2/4) Epoch 32, batch 350, loss[loss=0.1571, simple_loss=0.231, pruned_loss=0.04159, over 23838.00 frames. ], tot_loss[loss=0.1616, simple_loss=0.2386, pruned_loss=0.04227, over 3886693.61 frames. ], batch size: 212, lr: 3.21e-03, grad_scale: 8.0 2023-10-03 02:08:03,219 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8278_210_2550_1_1528637099_4220413_694-238413-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 02:08:10,663 WARNING [train.py:1204] (2/4) Exclude cut with ID _47278_210_12041_1_1533902418349_3576110_421-172710-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 02:08:12,183 WARNING [train.py:1204] (2/4) Exclude cut with ID _14970_210_15709_1_1530773543493_1715730_215-282680-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 02:08:13,477 WARNING [train.py:1204] (2/4) Exclude cut with ID _64639_210_5816_1_1535367619437_3528000_306-149449-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 02:08:14,967 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_28-291982-0 from training. Duration: 0.97 2023-10-03 02:08:16,881 WARNING [train.py:1204] (2/4) Exclude cut with ID _55134_210_21982_1_1534590028377_3882170_412-15870-0_sp1.1 from training. Duration: 0.936375 2023-10-03 02:08:16,917 WARNING [train.py:1204] (2/4) Exclude cut with ID _13684_210_7497_1_1530619354863_3598359_312-235606-0 from training. Duration: 0.6 2023-10-03 02:08:20,985 WARNING [train.py:1204] (2/4) Exclude cut with ID _37112_210_2575_1_1532997007673_7268610_347-238993-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 02:08:21,031 WARNING [train.py:1204] (2/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_121-284152-0 from training. Duration: 0.59 2023-10-03 02:08:21,099 WARNING [train.py:1204] (2/4) Exclude cut with ID _2941_210_2577_1_1526199673150_2489759_228-2961-0_sp1.1 from training. Duration: 0.97275 2023-10-03 02:08:24,296 WARNING [train.py:1204] (2/4) Exclude cut with ID _28323_210_16187_1_1532480395667_4100300_531-24157-0 from training. Duration: 0.98 2023-10-03 02:08:25,779 WARNING [train.py:1204] (2/4) Exclude cut with ID _32704_210_8954_1_1533017488840_6747370_266-329755-0_sp0.9 from training. Duration: 0.988875 2023-10-03 02:08:28,930 WARNING [train.py:1204] (2/4) Exclude cut with ID _6653_210_2629_1_1527919172441_6732679_1038-146421-0_sp1.1 from training. Duration: 0.97275 2023-10-03 02:08:29,000 WARNING [train.py:1204] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_391-22148-0_sp0.9 from training. Duration: 0.8555625 2023-10-03 02:08:29,088 WARNING [train.py:1204] (2/4) Exclude cut with ID _64521_210_14699_1_1535243427259_3815328_543-341117-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 02:08:29,099 WARNING [train.py:1204] (2/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_52-168712-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 02:08:30,489 WARNING [train.py:1204] (2/4) Exclude cut with ID _33920_210_4220_1_1533257851500_3084399_198-334063-0_sp1.1 from training. Duration: 0.8545625 2023-10-03 02:08:30,503 WARNING [train.py:1204] (2/4) Exclude cut with ID _50093_210_4220_1_1534417141207_3432299_69-111987-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 02:08:30,545 WARNING [train.py:1204] (2/4) Exclude cut with ID _14821_210_8750_1_1530680503991_6829359_189-297451-0_sp0.9 from training. Duration: 0.611125 2023-10-03 02:08:31,947 WARNING [train.py:1204] (2/4) Exclude cut with ID _8030_210_7497_1_1528444886781_3531089_262-86750-0_sp0.9 from training. Duration: 0.9 2023-10-03 02:08:31,955 WARNING [train.py:1204] (2/4) Exclude cut with ID _24612_210_8913_1_1531998054193_3806359_294-191196-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 02:08:38,114 WARNING [train.py:1204] (2/4) Exclude cut with ID _16909_210_5724_1_1531292079912_4118919_707-342896-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 02:08:39,438 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_9460_210_4211_1_1528954587_10049667_572-37035-0_sp0.9 from training. Duration: 0.888875 2023-10-03 02:08:39,495 WARNING [train.py:1204] (2/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_762-109886-0_sp1.1 from training. Duration: 0.82725 2023-10-03 02:08:40,806 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_14953_210_2151_1_1530701710_5616165_519-326393-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 02:08:44,890 WARNING [train.py:1204] (2/4) Exclude cut with ID _25914_210_6400_1_1532141317058_644740_52-96808-0 from training. Duration: 0.91 2023-10-03 02:08:44,894 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_68095_210_20185_1_1535718226_8888855_1079-94525-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 02:08:51,102 WARNING [train.py:1204] (2/4) Exclude cut with ID _14860_210_4915_1_1530702020340_7343240_746-3647-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 02:08:51,103 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_70379_210_7925_1_1535941455_557455_49-101720-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 02:08:51,123 WARNING [train.py:1204] (2/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_8-307291-0_sp1.1 from training. Duration: 0.936375 2023-10-03 02:08:52,597 WARNING [train.py:1204] (2/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_117-290916-0 from training. Duration: 0.51 2023-10-03 02:08:54,018 WARNING [train.py:1204] (2/4) Exclude cut with ID _22020_210_2461_1_1531724164513_6326900_944-339840-0_sp1.1 from training. Duration: 0.963625 2023-10-03 02:08:55,852 WARNING [train.py:1204] (2/4) Exclude cut with ID IC0515W0043-254911 from training. Duration: 0.976 2023-10-03 02:08:57,693 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_3910_210_1794_1_1526742306_334530_28-238909-0 from training. Duration: 0.81 2023-10-03 02:08:57,716 WARNING [train.py:1204] (2/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_269-267275-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 02:09:01,751 WARNING [train.py:1204] (2/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_72-251255-0_sp1.1 from training. Duration: 0.8545625 2023-10-03 02:09:01,763 WARNING [train.py:1204] (2/4) Exclude cut with ID _14886_210_4220_1_1530702020355_3516190_175-229073-0 from training. Duration: 0.45 2023-10-03 02:09:04,571 WARNING [train.py:1204] (2/4) Exclude cut with ID _40519_210_13794_1_1533715228655_7337390_921-209452-0_sp1.1 from training. Duration: 0.963625 2023-10-03 02:09:06,085 WARNING [train.py:1204] (2/4) Exclude cut with ID _29857_210_20034_1_1532395886753_3798160_335-163562-0_sp0.9 from training. Duration: 0.9444375 2023-10-03 02:09:09,175 WARNING [train.py:1204] (2/4) Exclude cut with ID _58583_210_5236_1_1534726835266_3574249_384-58445-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 02:09:10,533 WARNING [train.py:1204] (2/4) Exclude cut with ID _44011_210_2046_1_1533722401691_3631310_96-315656-0_sp1.1 from training. Duration: 0.963625 2023-10-03 02:09:10,545 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_68484_210_1794_1_1535849804_7678097_549-170613-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 02:09:12,109 WARNING [train.py:1204] (2/4) Exclude cut with ID _37892_210_19596_1_1533198677897_3964058_214-148058-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 02:09:13,642 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.balancer_na.min_abs, batch_count=1100440.0, ans=0.02 2023-10-03 02:09:14,851 WARNING [train.py:1204] (2/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_415-78792-0_sp0.9 from training. Duration: 0.9 2023-10-03 02:09:16,405 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.nonlin_attention.balancer.max_positive, batch_count=1100506.6666666667, ans=0.95 2023-10-03 02:09:17,452 INFO [train.py:1046] (2/4) Epoch 32, batch 400, loss[loss=0.1793, simple_loss=0.2503, pruned_loss=0.05418, over 23730.00 frames. ], tot_loss[loss=0.1618, simple_loss=0.2393, pruned_loss=0.04209, over 4069905.29 frames. ], batch size: 179, lr: 3.21e-03, grad_scale: 16.0 2023-10-03 02:09:17,578 WARNING [train.py:1204] (2/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_383-205833-0_sp1.1 from training. Duration: 0.863625 2023-10-03 02:09:19,547 WARNING [train.py:1204] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_167-137255-0 from training. Duration: 0.64 2023-10-03 02:09:19,555 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8275_210_4198_1_1528455320_3892571_484-154786-0_sp1.1 from training. Duration: 0.963625 2023-10-03 02:09:21,052 WARNING [train.py:1204] (2/4) Exclude cut with ID _40284_210_2084_1_1533513679814_3704279_396-319412-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 02:09:23,716 WARNING [train.py:1204] (2/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_280-120942-0_sp0.9 from training. Duration: 0.8555625 2023-10-03 02:09:23,781 WARNING [train.py:1204] (2/4) Exclude cut with ID _45630_210_10556_1_1533949174498_3902049_558-240889-0_sp1.1 from training. Duration: 0.92725 2023-10-03 02:09:26,492 WARNING [train.py:1204] (2/4) Exclude cut with ID _56235_210_10640_1_1534557335235_4161099_197-307458-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 02:09:27,904 WARNING [train.py:1204] (2/4) Exclude cut with ID _16909_210_5724_1_1531292079912_4118919_163-254843-0_sp1.1 from training. Duration: 0.92725 2023-10-03 02:09:28,035 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_103-84010-0 from training. Duration: 0.69 2023-10-03 02:09:28,261 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.out_combiner.scale_min, batch_count=1100506.6666666667, ans=0.2 2023-10-03 02:09:29,956 WARNING [train.py:1204] (2/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_401-158239-0 from training. Duration: 0.71 2023-10-03 02:09:29,957 WARNING [train.py:1204] (2/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_899-233658-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 02:09:30,072 WARNING [train.py:1204] (2/4) Exclude cut with ID _23825_210_13449_1_1531915313526_3497150_240-271367-0 from training. Duration: 0.97 2023-10-03 02:09:31,403 WARNING [train.py:1204] (2/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_424-134619-0_sp1.1 from training. Duration: 0.92725 2023-10-03 02:09:32,993 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.conv_module1.balancer2.prob, batch_count=1100573.3333333333, ans=0.125 2023-10-03 02:09:34,149 WARNING [train.py:1204] (2/4) Exclude cut with ID _44831_210_13427_1_1533816011549_3689035_32-188147-0_sp1.1 from training. Duration: 0.87275 2023-10-03 02:09:34,152 WARNING [train.py:1204] (2/4) Exclude cut with ID _59170_210_6721_1_1534983633960_4203335_314-63879-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 02:09:34,166 WARNING [train.py:1204] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_602-275260-0 from training. Duration: 0.82 2023-10-03 02:09:35,496 WARNING [train.py:1204] (2/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_511-306767-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 02:09:35,528 WARNING [train.py:1204] (2/4) Exclude cut with ID _41817_210_18835_1_1533434477160_7423049_565-201775-0_sp1.1 from training. Duration: 0.92725 2023-10-03 02:09:35,540 WARNING [train.py:1204] (2/4) Exclude cut with ID _28317_210_12742_1_1532480302381_8510325_778-173151-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 02:09:35,574 WARNING [train.py:1204] (2/4) Exclude cut with ID _14998_210_5219_1_1530705582080_7474970_680-51001-0_sp1.1 from training. Duration: 0.963625 2023-10-03 02:09:40,196 WARNING [train.py:1204] (2/4) Exclude cut with ID IC0677W0059-334891 from training. Duration: 0.896 2023-10-03 02:09:40,256 WARNING [train.py:1204] (2/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_370-331600-0 from training. Duration: 0.94 2023-10-03 02:09:45,762 WARNING [train.py:1204] (2/4) Exclude cut with ID _66840_210_5219_1_1535452168426_4196349_446-240192-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 02:09:47,292 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.attention_skip_rate, batch_count=1100640.0, ans=0.0 2023-10-03 02:09:48,427 WARNING [train.py:1204] (2/4) Exclude cut with ID _65242_210_18628_1_1535369393717_3823149_38-6732-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 02:09:49,765 WARNING [train.py:1204] (2/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_302-2550-0 from training. Duration: 0.81 2023-10-03 02:09:49,859 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_43802_210_2290_1_1533516203_8829504_100-348441-0 from training. Duration: 0.91 2023-10-03 02:09:54,450 WARNING [train.py:1204] (2/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_329-285801-0_sp1.1 from training. Duration: 0.82725 2023-10-03 02:09:54,605 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_30167_210_3528_1_1532428012_4772375_343-334712-0_sp1.1 from training. Duration: 0.936375 2023-10-03 02:10:01,783 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7543_210_6753_1_1528543318_4552_1-85072-0 from training. Duration: 0.83 2023-10-03 02:10:05,100 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7002_210_1794_1_1527935634_4034444_487-236501-0_sp1.1 from training. Duration: 0.6 2023-10-03 02:10:05,211 WARNING [train.py:1204] (2/4) Exclude cut with ID _7160_210_1876_1_1527940923231_3703799_282-322611-0 from training. Duration: 0.87 2023-10-03 02:10:06,653 WARNING [train.py:1204] (2/4) Exclude cut with ID _67335_210_5219_1_1535525975652_4275229_570-262983-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 02:10:09,402 WARNING [train.py:1204] (2/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_625-338546-0_sp1.1 from training. Duration: 0.8 2023-10-03 02:10:09,430 WARNING [train.py:1204] (2/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_176-56120-0 from training. Duration: 0.83 2023-10-03 02:10:13,895 WARNING [train.py:1204] (2/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_963-187109-0_sp1.1 from training. Duration: 0.9 2023-10-03 02:10:15,353 WARNING [train.py:1204] (2/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_121-284152-0_sp0.9 from training. Duration: 0.6555625 2023-10-03 02:10:16,735 WARNING [train.py:1204] (2/4) Exclude cut with ID _45021_210_9994_1_1533619751094_3330288_104-81711-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 02:10:17,021 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.bypass.skip_rate, batch_count=1100773.3333333333, ans=0.07 2023-10-03 02:10:19,454 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.686e+02 1.948e+02 2.204e+02 2.746e+02 3.868e+02, threshold=4.409e+02, percent-clipped=0.0 2023-10-03 02:10:19,520 WARNING [train.py:1204] (2/4) Exclude cut with ID _51382_210_23862_1_1534492801838_3650104_129-343914-0_sp1.1 from training. Duration: 0.936375 2023-10-03 02:10:19,560 WARNING [train.py:1204] (2/4) Exclude cut with ID _14970_210_15709_1_1530775547361_1966965_8-169924-0 from training. Duration: 0.92 2023-10-03 02:10:21,440 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_2295_210_2087_1_1525500993_2407863_234-83104-0_sp1.1 from training. Duration: 0.563625 2023-10-03 02:10:22,647 WARNING [train.py:1204] (2/4) Exclude cut with ID _14687_210_5854_1_1530703838259_3664889_115-239046-0 from training. Duration: 0.67 2023-10-03 02:10:24,218 WARNING [train.py:1204] (2/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_690-126306-0_sp1.1 from training. Duration: 0.7090625 2023-10-03 02:10:24,226 WARNING [train.py:1204] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_625-102955-0_sp0.9 from training. Duration: 0.8555625 2023-10-03 02:10:25,762 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.conv_module1.balancer2.prob, batch_count=1100773.3333333333, ans=0.125 2023-10-03 02:10:28,716 WARNING [train.py:1204] (2/4) Exclude cut with ID _9389_210_1664_1_1529217337301_5289269_289-213417-0 from training. Duration: 0.89 2023-10-03 02:10:30,120 WARNING [train.py:1204] (2/4) Exclude cut with ID _13253_210_1925_1_1530757775819_3577873_191-144977-0_sp0.9 from training. Duration: 0.8666875 2023-10-03 02:10:31,425 INFO [train.py:1046] (2/4) Epoch 32, batch 450, loss[loss=0.1366, simple_loss=0.2211, pruned_loss=0.02612, over 24632.00 frames. ], tot_loss[loss=0.1619, simple_loss=0.2401, pruned_loss=0.04188, over 4211836.40 frames. ], batch size: 60, lr: 3.21e-03, grad_scale: 8.0 2023-10-03 02:10:31,466 WARNING [train.py:1204] (2/4) Exclude cut with ID _21783_210_11357_1_1531742458730_3762679_325-155703-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 02:10:31,495 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_2295_210_2087_1_1525500993_2407863_116-238894-0_sp0.9 from training. Duration: 0.77775 2023-10-03 02:10:32,902 WARNING [train.py:1204] (2/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_94-126128-0 from training. Duration: 0.86 2023-10-03 02:10:32,918 WARNING [train.py:1204] (2/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_395-310220-0_sp0.9 from training. Duration: 0.988875 2023-10-03 02:10:34,272 WARNING [train.py:1204] (2/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_821-110852-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 02:10:34,325 WARNING [train.py:1204] (2/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_64-248359-0_sp1.1 from training. Duration: 0.62725 2023-10-03 02:10:34,341 WARNING [train.py:1204] (2/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_138-187031-0 from training. Duration: 0.99 2023-10-03 02:10:36,298 WARNING [train.py:1204] (2/4) Exclude cut with ID _4122_210_2084_1_1526713164417_4353280_528-206771-0_sp0.9 from training. Duration: 0.97775 2023-10-03 02:10:36,396 WARNING [train.py:1204] (2/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_844-96989-0_sp1.1 from training. Duration: 0.6454375 2023-10-03 02:10:37,783 WARNING [train.py:1204] (2/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_106-30782-0_sp1.1 from training. Duration: 0.72725 2023-10-03 02:10:43,204 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.3.encoder.layers.2.nonlin_attention.whiten1, num_groups=1, num_channels=384, metric=5.18 vs. limit=10.0 2023-10-03 02:10:45,399 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.feed_forward1.hidden_balancer.prob, batch_count=1100906.6666666667, ans=0.125 2023-10-03 02:10:47,793 WARNING [train.py:1204] (2/4) Exclude cut with ID _14683_210_5219_1_1530698352608_3974859_281-250040-0_sp1.1 from training. Duration: 0.936375 2023-10-03 02:10:47,837 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_46013_210_7234_1_1533776362_3744032_463-52378-0_sp0.9 from training. Duration: 0.9555625 2023-10-03 02:10:49,258 WARNING [train.py:1204] (2/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_299-34413-0 from training. Duration: 0.65 2023-10-03 02:10:49,504 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.attention_skip_rate, batch_count=1100906.6666666667, ans=0.0 2023-10-03 02:10:50,597 WARNING [train.py:1204] (2/4) Exclude cut with ID _35799_210_5854_1_1533263487887_3529071_490-326847-0 from training. Duration: 0.99 2023-10-03 02:10:54,030 WARNING [train.py:1204] (2/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_589-9070-0_sp0.9 from training. Duration: 0.788875 2023-10-03 02:10:56,610 WARNING [train.py:1204] (2/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_884-29515-0_sp1.1 from training. Duration: 0.936375 2023-10-03 02:10:58,108 WARNING [train.py:1204] (2/4) Exclude cut with ID _15012_210_1968_1_1530856820429_5095350_474-119995-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 02:11:02,557 WARNING [train.py:1204] (2/4) Exclude cut with ID _65585_210_12210_1_1535450451584_3764098_211-97124-0_sp1.1 from training. Duration: 0.963625 2023-10-03 02:11:04,513 WARNING [train.py:1204] (2/4) Exclude cut with ID _33640_210_2655_1_1533117605479_4855469_666-224463-0_sp1.1 from training. Duration: 0.963625 2023-10-03 02:11:05,936 WARNING [train.py:1204] (2/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_35-153886-0 from training. Duration: 0.84 2023-10-03 02:11:07,267 WARNING [train.py:1204] (2/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_210-106047-0 from training. Duration: 0.97 2023-10-03 02:11:08,606 WARNING [train.py:1204] (2/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_479-125611-0 from training. Duration: 0.96 2023-10-03 02:11:09,927 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_9417_210_1794_1_1529235261_4614131_427-153963-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 02:11:10,013 WARNING [train.py:1204] (2/4) Exclude cut with ID _23818_210_6228_1_1531917063884_3676920_133-170571-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 02:11:11,387 WARNING [train.py:1204] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_602-275260-0_sp1.1 from training. Duration: 0.7454375 2023-10-03 02:11:13,360 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9029W0121-509521 from training. Duration: 0.896 2023-10-03 02:11:13,369 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9029W0434-509821 from training. Duration: 0.64 2023-10-03 02:11:13,394 WARNING [train.py:1204] (2/4) Exclude cut with ID _23038_210_9570_1_1532001584158_3652419_289-307034-0_sp1.1 from training. Duration: 0.936375 2023-10-03 02:11:14,763 WARNING [train.py:1204] (2/4) Exclude cut with ID _29345_210_4381_1_1532689168263_3860169_149-257268-0_sp0.9 from training. Duration: 0.9 2023-10-03 02:11:14,839 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_405-339986-0_sp1.1 from training. Duration: 0.463625 2023-10-03 02:11:18,937 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_2655_210_2087_1_1525688959_3897400_437-337690-0_sp1.1 from training. Duration: 0.57275 2023-10-03 02:11:18,973 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7543_210_6753_1_1528541907_1399322_165-323619-0_sp1.1 from training. Duration: 0.62725 2023-10-03 02:11:19,024 WARNING [train.py:1204] (2/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_508-335369-0_sp1.1 from training. Duration: 0.436375 2023-10-03 02:11:19,079 WARNING [train.py:1204] (2/4) Exclude cut with ID _7852_210_5219_1_1528286471316_8139400_358-287492-0 from training. Duration: 0.96 2023-10-03 02:11:21,713 WARNING [train.py:1204] (2/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_322-217220-0_sp0.9 from training. Duration: 0.9555625 2023-10-03 02:11:23,855 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_3171_210_2087_1_1526109084_2330007_66-260583-0_sp0.9 from training. Duration: 0.8 2023-10-03 02:11:25,049 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8558_210_1410_1_1528505225_7613406_131-329524-0_sp1.1 from training. Duration: 0.7181875 2023-10-03 02:11:26,407 WARNING [train.py:1204] (2/4) Exclude cut with ID _9235_210_5219_1_1528891154253_8046120_30-326681-0 from training. Duration: 0.83 2023-10-03 02:11:26,636 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.conv_module1.balancer2.prob, batch_count=1101040.0, ans=0.125 2023-10-03 02:11:29,383 WARNING [train.py:1204] (2/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_58-68956-0_sp0.9 from training. Duration: 0.92225 2023-10-03 02:11:30,717 WARNING [train.py:1204] (2/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_255-312654-0 from training. Duration: 0.91 2023-10-03 02:11:32,562 WARNING [train.py:1204] (2/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_99-167392-0 from training. Duration: 0.61 2023-10-03 02:11:33,925 WARNING [train.py:1204] (2/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_94-126128-0_sp0.9 from training. Duration: 0.9555625 2023-10-03 02:11:40,195 WARNING [train.py:1204] (2/4) Exclude cut with ID _68635_210_9317_1_1535770813410_4315870_187-192817-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 02:11:41,591 WARNING [train.py:1204] (2/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_277-204970-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 02:11:42,935 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6163_210_6400_1_1527506746_4263068_283-137811-0_sp0.9 from training. Duration: 0.8555625 2023-10-03 02:11:42,960 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9144W0121-566446 from training. Duration: 0.896 2023-10-03 02:11:45,609 INFO [train.py:1046] (2/4) Epoch 32, batch 500, loss[loss=0.1659, simple_loss=0.2632, pruned_loss=0.03428, over 24626.00 frames. ], tot_loss[loss=0.1633, simple_loss=0.2418, pruned_loss=0.04241, over 4331547.74 frames. ], batch size: 73, lr: 3.21e-03, grad_scale: 8.0 2023-10-03 02:11:45,765 WARNING [train.py:1204] (2/4) Exclude cut with ID _49371_210_12071_1_1533952761021_3812190_351-315519-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 02:11:47,481 WARNING [train.py:1204] (2/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_115-326778-0_sp1.1 from training. Duration: 0.8454375 2023-10-03 02:11:47,532 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_17292_210_6753_1_1531125388_2943734_436-198965-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 02:11:47,541 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9165W0117-576751 from training. Duration: 0.896 2023-10-03 02:11:49,009 WARNING [train.py:1204] (2/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_212-149947-0 from training. Duration: 0.73 2023-10-03 02:11:49,024 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_13757_210_3528_1_1530440682_4107239_65-167544-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 02:11:53,024 WARNING [train.py:1204] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_700-183549-0_sp0.9 from training. Duration: 0.7555625 2023-10-03 02:11:53,337 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.ff3_skip_rate, batch_count=1101173.3333333333, ans=0.0 2023-10-03 02:11:55,258 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.conv_module1.balancer2.prob, batch_count=1101173.3333333333, ans=0.125 2023-10-03 02:11:57,592 WARNING [train.py:1204] (2/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_216-343007-0_sp0.9 from training. Duration: 0.7333125 2023-10-03 02:11:57,697 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7544_210_6753_1_1528587722_3576686_295-258479-0_sp1.1 from training. Duration: 0.763625 2023-10-03 02:12:00,503 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_365-287106-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 02:12:00,515 WARNING [train.py:1204] (2/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_300-3747-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 02:12:01,836 WARNING [train.py:1204] (2/4) Exclude cut with ID _14069_210_5632_1_1530512692033_4803390_487-303201-0_sp1.1 from training. Duration: 0.92725 2023-10-03 02:12:04,498 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.4.encoder.layers.0.nonlin_attention.whiten1, num_groups=1, num_channels=288, metric=8.14 vs. limit=10.0 2023-10-03 02:12:07,396 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.bypass.skip_rate, batch_count=1101240.0, ans=0.07 2023-10-03 02:12:12,439 WARNING [train.py:1204] (2/4) Exclude cut with ID _14459_210_2228_1_1531443641946_7516700_668-123026-0_sp1.1 from training. Duration: 0.936375 2023-10-03 02:12:12,461 WARNING [train.py:1204] (2/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_108-53929-0_sp1.1 from training. Duration: 0.536375 2023-10-03 02:12:12,497 WARNING [train.py:1204] (2/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_266-137104-0_sp0.9 from training. Duration: 0.72225 2023-10-03 02:12:12,541 WARNING [train.py:1204] (2/4) Exclude cut with ID _12302_210_5219_1_1529924446466_8107280_902-168619-0_sp1.1 from training. Duration: 0.936375 2023-10-03 02:12:13,822 WARNING [train.py:1204] (2/4) Exclude cut with ID _59502_210_12210_1_1535107024698_1835139_146-162495-0 from training. Duration: 0.92 2023-10-03 02:12:13,845 WARNING [train.py:1204] (2/4) Exclude cut with ID _3465_210_3525_1_1526175667060_3574049_412-287526-0_sp1.1 from training. Duration: 0.6545625 2023-10-03 02:12:17,918 WARNING [train.py:1204] (2/4) Exclude cut with ID _7160_210_1876_1_1527940923231_3703799_120-150943-0_sp0.9 from training. Duration: 0.9 2023-10-03 02:12:17,974 WARNING [train.py:1204] (2/4) Exclude cut with ID _15037_210_2253_1_1530752294104_3267650_296-192597-0_sp1.1 from training. Duration: 0.736375 2023-10-03 02:12:19,863 WARNING [train.py:1204] (2/4) Exclude cut with ID _13252_210_1925_1_1530671372047_3336909_397-216497-0_sp1.1 from training. Duration: 0.87275 2023-10-03 02:12:19,892 WARNING [train.py:1204] (2/4) Exclude cut with ID _40094_210_16380_1_1533294005773_3641159_76-269320-0_sp1.1 from training. Duration: 0.936375 2023-10-03 02:12:21,506 WARNING [train.py:1204] (2/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_449-15508-0 from training. Duration: 0.82 2023-10-03 02:12:24,346 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9309W0194-640321 from training. Duration: 0.64 2023-10-03 02:12:25,885 WARNING [train.py:1204] (2/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_284-312178-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 02:12:27,314 WARNING [train.py:1204] (2/4) Exclude cut with ID _29853_210_7497_1_1532505600851_6642110_401-55937-0_sp1.1 from training. Duration: 0.92725 2023-10-03 02:12:27,385 WARNING [train.py:1204] (2/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_716-24918-0_sp1.1 from training. Duration: 0.92725 2023-10-03 02:12:29,124 WARNING [train.py:1204] (2/4) Exclude cut with ID _19637_210_4220_1_1531537167676_3205060_314-305638-0_sp1.1 from training. Duration: 0.92725 2023-10-03 02:12:29,166 WARNING [train.py:1204] (2/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_475-127455-0_sp1.1 from training. Duration: 0.636375 2023-10-03 02:12:30,591 WARNING [train.py:1204] (2/4) Exclude cut with ID _5444_210_2151_1_1527418808464_5312210_581-315411-0 from training. Duration: 0.62 2023-10-03 02:12:32,103 WARNING [train.py:1204] (2/4) Exclude cut with ID _44902_210_16187_1_1533888006062_3792078_149-67212-0_sp0.9 from training. Duration: 0.9666875 2023-10-03 02:12:33,966 WARNING [train.py:1204] (2/4) Exclude cut with ID _31602_210_5854_1_1532677021456_4458250_63-63253-0_sp1.1 from training. Duration: 0.963625 2023-10-03 02:12:37,958 WARNING [train.py:1204] (2/4) Exclude cut with ID _55209_210_1531_1_1534675828996_4597160_660-203992-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 02:12:41,221 WARNING [train.py:1204] (2/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_83-190624-0_sp1.1 from training. Duration: 0.92725 2023-10-03 02:12:44,334 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.conv_module2.balancer2.prob, batch_count=1101440.0, ans=0.125 2023-10-03 02:12:46,707 WARNING [train.py:1204] (2/4) Exclude cut with ID _58605_210_12210_1_1534723195939_3904259_154-139635-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 02:12:48,037 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.569e+02 1.888e+02 2.126e+02 2.426e+02 3.415e+02, threshold=4.253e+02, percent-clipped=0.0 2023-10-03 02:12:48,426 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.conv_skip_rate, batch_count=1101440.0, ans=0.0 2023-10-03 02:12:51,432 WARNING [train.py:1204] (2/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_164-224052-0 from training. Duration: 0.7 2023-10-03 02:12:51,448 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_1126-189071-0_sp1.1 from training. Duration: 0.963625 2023-10-03 02:12:51,467 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_15396_210_6890_1_1531308473_1938936_310-214789-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 02:12:55,509 WARNING [train.py:1204] (2/4) Exclude cut with ID _8395_210_1531_1_1528597871523_4411579_431-132470-0 from training. Duration: 0.74 2023-10-03 02:12:56,822 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_3780_210_2087_1_1526467760_2130483_170-259044-0_sp0.9 from training. Duration: 0.77775 2023-10-03 02:12:58,226 WARNING [train.py:1204] (2/4) Exclude cut with ID _12733_210_8341_1_1530167424556_3646380_537-230640-0_sp1.1 from training. Duration: 0.963625 2023-10-03 02:13:00,072 INFO [train.py:1046] (2/4) Epoch 32, batch 550, loss[loss=0.1826, simple_loss=0.2682, pruned_loss=0.04845, over 23678.00 frames. ], tot_loss[loss=0.1641, simple_loss=0.2427, pruned_loss=0.04272, over 4417991.49 frames. ], batch size: 85, lr: 3.21e-03, grad_scale: 8.0 2023-10-03 02:13:02,859 WARNING [train.py:1204] (2/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_159-281310-0 from training. Duration: 0.95 2023-10-03 02:13:05,551 WARNING [train.py:1204] (2/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_434-83000-0 from training. Duration: 0.98 2023-10-03 02:13:05,578 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_4405_210_1849_1_1526732205_3593939_493-32464-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 02:13:05,586 WARNING [train.py:1204] (2/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_342-100157-0 from training. Duration: 0.54 2023-10-03 02:13:05,642 WARNING [train.py:1204] (2/4) Exclude cut with ID _44778_210_2054_1_1533798127203_7443090_765-250133-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 02:13:05,646 WARNING [train.py:1204] (2/4) Exclude cut with ID _38670_210_15379_1_1533448810659_4391059_81-312245-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 02:13:07,070 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_50050_210_16223_1_1535435796_7290138_1273-247611-0_sp1.1 from training. Duration: 0.936375 2023-10-03 02:13:08,819 WARNING [train.py:1204] (2/4) Exclude cut with ID _44831_210_13427_1_1533816011549_3689035_399-18591-0_sp1.1 from training. Duration: 0.936375 2023-10-03 02:13:08,834 WARNING [train.py:1204] (2/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_557-324258-0_sp0.9 from training. Duration: 0.9 2023-10-03 02:13:10,260 WARNING [train.py:1204] (2/4) Exclude cut with ID _3465_210_3525_1_1526175667060_3574049_62-335552-0_sp1.1 from training. Duration: 0.7909375 2023-10-03 02:13:11,585 WARNING [train.py:1204] (2/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_673-264125-0_sp1.1 from training. Duration: 0.963625 2023-10-03 02:13:13,535 WARNING [train.py:1204] (2/4) Exclude cut with ID _8030_210_7497_1_1528444886781_3531089_262-86750-0 from training. Duration: 0.81 2023-10-03 02:13:13,550 WARNING [train.py:1204] (2/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_444-28041-0_sp1.1 from training. Duration: 0.77275 2023-10-03 02:13:19,191 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_34008_210_10431_1_1533345424_8183247_516-14858-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 02:13:19,211 WARNING [train.py:1204] (2/4) Exclude cut with ID _32704_210_8954_1_1533017488840_6747370_54-248218-0_sp1.1 from training. Duration: 0.936375 2023-10-03 02:13:22,035 WARNING [train.py:1204] (2/4) Exclude cut with ID _13253_210_1925_1_1530757775819_3577873_300-119782-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 02:13:22,109 WARNING [train.py:1204] (2/4) Exclude cut with ID _39290_210_4220_1_1533862778760_3075118_21-240703-0_sp1.1 from training. Duration: 0.936375 2023-10-03 02:13:25,628 WARNING [train.py:1204] (2/4) Exclude cut with ID 774-127930-0014-10412-0_sp1.1 from training. Duration: 0.95 2023-10-03 02:13:27,022 WARNING [train.py:1204] (2/4) Exclude cut with ID _67499_210_17282_1_1535509805220_3692430_20-179346-0 from training. Duration: 0.97 2023-10-03 02:13:29,716 WARNING [train.py:1204] (2/4) Exclude cut with ID _8365_210_2064_1_1528592311070_4677690_431-294102-0_sp0.9 from training. Duration: 0.988875 2023-10-03 02:13:34,492 WARNING [train.py:1204] (2/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_365-1492-0_sp1.1 from training. Duration: 0.82725 2023-10-03 02:13:34,530 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_105-114797-0_sp0.9 from training. Duration: 0.9333125 2023-10-03 02:13:37,222 WARNING [train.py:1204] (2/4) Exclude cut with ID _6274_210_2084_1_1527937583837_7494020_769-168302-0_sp0.9 from training. Duration: 0.788875 2023-10-03 02:13:39,969 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_40340_210_9024_1_1533433916_4132944_320-270277-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 02:13:39,982 WARNING [train.py:1204] (2/4) Exclude cut with ID ID0203W0003-781711 from training. Duration: 0.958 2023-10-03 02:13:41,811 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_30434_210_13049_1_1532519384_981746_52-12932-0_sp1.1 from training. Duration: 0.936375 2023-10-03 02:13:42,862 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.0.layers.0.self_attn2.whiten, num_groups=1, num_channels=192, metric=12.30 vs. limit=22.5 2023-10-03 02:13:43,198 WARNING [train.py:1204] (2/4) Exclude cut with ID _8263_210_1664_1_1528438953494_294100_40-204731-0_sp0.9 from training. Duration: 0.5555625 2023-10-03 02:13:44,621 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1032-236863-0_sp0.9 from training. Duration: 0.9333125 2023-10-03 02:13:46,547 WARNING [train.py:1204] (2/4) Exclude cut with ID _8394_210_6702_1_1528590504316_3540189_335-274780-0_sp0.9 from training. Duration: 0.8666875 2023-10-03 02:13:46,556 WARNING [train.py:1204] (2/4) Exclude cut with ID _66873_210_5517_1_1535454136506_4293370_654-52713-0_sp1.1 from training. Duration: 0.7 2023-10-03 02:13:46,643 WARNING [train.py:1204] (2/4) Exclude cut with ID _5250_210_5219_1_1527249561806_8692110_742-267856-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 02:13:48,045 WARNING [train.py:1204] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_74-48717-0 from training. Duration: 0.58 2023-10-03 02:13:49,362 WARNING [train.py:1204] (2/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_18-46756-0 from training. Duration: 0.79 2023-10-03 02:13:49,427 WARNING [train.py:1204] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_612-56061-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 02:13:49,545 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.1.feed_forward3.hidden_balancer.prob, batch_count=1101706.6666666667, ans=0.125 2023-10-03 02:13:50,729 WARNING [train.py:1204] (2/4) Exclude cut with ID _56423_210_5854_1_1534896061049_3165969_412-2403-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 02:13:50,793 WARNING [train.py:1204] (2/4) Exclude cut with ID _62574_210_2655_1_1535180407058_3933249_120-196848-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 02:13:50,793 WARNING [train.py:1204] (2/4) Exclude cut with ID _40519_210_13794_1_1533715228655_7337390_916-51000-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 02:13:55,267 WARNING [train.py:1204] (2/4) Exclude cut with ID _65263_210_22356_1_1535763598691_3921037_108-21902-0_sp1.1 from training. Duration: 0.9 2023-10-03 02:13:55,356 WARNING [train.py:1204] (2/4) Exclude cut with ID _4122_210_2084_1_1526713164417_4353280_528-206771-0_sp1.1 from training. Duration: 0.8 2023-10-03 02:13:58,117 WARNING [train.py:1204] (2/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_43-325657-0_sp1.1 from training. Duration: 0.92725 2023-10-03 02:13:59,450 WARNING [train.py:1204] (2/4) Exclude cut with ID _44659_210_2721_1_1534036120061_6614860_314-82041-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 02:13:59,517 WARNING [train.py:1204] (2/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_81-300212-0_sp0.9 from training. Duration: 0.6666875 2023-10-03 02:13:59,584 WARNING [train.py:1204] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_665-196155-0_sp0.9 from training. Duration: 0.7444375 2023-10-03 02:14:01,631 WARNING [train.py:1204] (2/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_140-336881-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 02:14:02,756 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6290_210_3273_1_1527595224_1282787_115-87095-0_sp0.9 from training. Duration: 0.611125 2023-10-03 02:14:02,811 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_53373_210_5399_1_1534229471_4715695_607-259685-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 02:14:04,284 WARNING [train.py:1204] (2/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_175-163894-0_sp1.1 from training. Duration: 0.67275 2023-10-03 02:14:04,300 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7002_210_1794_1_1527939683_5157286_1-281264-0_sp1.1 from training. Duration: 0.4 2023-10-03 02:14:11,489 WARNING [train.py:1204] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_605-257205-0 from training. Duration: 0.59 2023-10-03 02:14:13,015 WARNING [train.py:1204] (2/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_454-314901-0 from training. Duration: 0.46 2023-10-03 02:14:14,385 INFO [train.py:1046] (2/4) Epoch 32, batch 600, loss[loss=0.1362, simple_loss=0.2192, pruned_loss=0.02658, over 24463.00 frames. ], tot_loss[loss=0.1646, simple_loss=0.2432, pruned_loss=0.04303, over 4491263.22 frames. ], batch size: 63, lr: 3.21e-03, grad_scale: 8.0 2023-10-03 02:14:16,271 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_46032_210_14656_1_1533781412_4507969_287-130422-0_sp1.1 from training. Duration: 0.97275 2023-10-03 02:14:16,288 WARNING [train.py:1204] (2/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_186-309161-0_sp1.1 from training. Duration: 0.4909375 2023-10-03 02:14:16,319 WARNING [train.py:1204] (2/4) Exclude cut with ID _42659_210_2070_1_1533607442765_3120380_45-342307-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 02:14:16,568 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.conv_module2.balancer2.prob, batch_count=1101840.0, ans=0.125 2023-10-03 02:14:17,966 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.ff3_skip_rate, batch_count=1101840.0, ans=0.0 2023-10-03 02:14:22,459 WARNING [train.py:1204] (2/4) Exclude cut with ID _12555_210_8750_1_1530075594402_3780239_478-326818-0_sp1.1 from training. Duration: 0.963625 2023-10-03 02:14:25,207 WARNING [train.py:1204] (2/4) Exclude cut with ID _7606_210_4915_1_1528894253277_5080690_127-177761-0_sp1.1 from training. Duration: 0.5818125 2023-10-03 02:14:26,570 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6788_210_1388_1_1527898698_4562021_91-197981-0 from training. Duration: 0.83 2023-10-03 02:14:27,965 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8558_210_1410_1_1528505225_7613406_131-329524-0_sp0.9 from training. Duration: 0.87775 2023-10-03 02:14:29,448 WARNING [train.py:1204] (2/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_153-153537-0_sp1.1 from training. Duration: 0.82725 2023-10-03 02:14:31,734 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=1101906.6666666667, ans=0.1 2023-10-03 02:14:32,780 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_17292_210_6753_1_1531125388_2943734_285-349105-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 02:14:35,469 WARNING [train.py:1204] (2/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_37-41919-0 from training. Duration: 0.87 2023-10-03 02:14:35,513 WARNING [train.py:1204] (2/4) Exclude cut with ID _60860_210_8254_1_1534896015875_3557169_172-294852-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 02:14:41,503 WARNING [train.py:1204] (2/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_146-276944-0 from training. Duration: 0.83 2023-10-03 02:14:45,565 WARNING [train.py:1204] (2/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_1254-44082-0_sp1.1 from training. Duration: 0.936375 2023-10-03 02:14:45,577 WARNING [train.py:1204] (2/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_1115-180350-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 02:14:45,605 WARNING [train.py:1204] (2/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_60-146189-0_sp1.1 from training. Duration: 0.8909375 2023-10-03 02:14:50,444 WARNING [train.py:1204] (2/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_517-153649-0_sp1.1 from training. Duration: 0.92725 2023-10-03 02:14:50,455 WARNING [train.py:1204] (2/4) Exclude cut with ID _39571_210_13449_1_1533801584143_3651077_37-266257-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 02:14:50,499 WARNING [train.py:1204] (2/4) Exclude cut with ID _6653_210_2629_1_1527919172441_6732679_527-276118-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 02:14:59,090 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7696_210_2040_1_1528207167_3716099_117-125434-0_sp1.1 from training. Duration: 0.6909375 2023-10-03 02:15:03,619 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_25767_210_2161_1_1532307483_3867748_478-156360-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 02:15:03,630 WARNING [train.py:1204] (2/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_177-49092-0_sp1.1 from training. Duration: 0.82725 2023-10-03 02:15:03,637 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_11305_210_1794_1_1529750428_7178148_680-317073-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 02:15:10,575 WARNING [train.py:1204] (2/4) Exclude cut with ID _53565_210_10635_1_1534337218335_3820259_287-18120-0 from training. Duration: 0.97 2023-10-03 02:15:15,488 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.ff2_skip_rate, batch_count=1102106.6666666667, ans=0.0 2023-10-03 02:15:16,459 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.517e+02 1.923e+02 2.160e+02 2.483e+02 3.446e+02, threshold=4.321e+02, percent-clipped=0.0 2023-10-03 02:15:16,566 WARNING [train.py:1204] (2/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_498-40153-0_sp1.1 from training. Duration: 0.57275 2023-10-03 02:15:16,583 WARNING [train.py:1204] (2/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_555-63066-0_sp1.1 from training. Duration: 0.963625 2023-10-03 02:15:18,184 WARNING [train.py:1204] (2/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_395-310220-0 from training. Duration: 0.89 2023-10-03 02:15:19,560 WARNING [train.py:1204] (2/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_937-208281-0_sp1.1 from training. Duration: 0.77275 2023-10-03 02:15:22,802 WARNING [train.py:1204] (2/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_120-211630-0 from training. Duration: 0.92 2023-10-03 02:15:22,828 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_454-276429-0_sp1.1 from training. Duration: 0.7909375 2023-10-03 02:15:22,872 WARNING [train.py:1204] (2/4) Exclude cut with ID _22826_210_9994_1_1531908201522_3279169_404-75889-0_sp1.1 from training. Duration: 0.8090625 2023-10-03 02:15:24,825 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.balancer2.prob, batch_count=1102106.6666666667, ans=0.125 2023-10-03 02:15:28,697 INFO [train.py:1046] (2/4) Epoch 32, batch 650, loss[loss=0.1662, simple_loss=0.2479, pruned_loss=0.04228, over 23949.00 frames. ], tot_loss[loss=0.1632, simple_loss=0.2413, pruned_loss=0.04253, over 4523454.68 frames. ], batch size: 86, lr: 3.21e-03, grad_scale: 8.0 2023-10-03 02:15:28,800 WARNING [train.py:1204] (2/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_282-136641-0_sp0.9 from training. Duration: 0.4666875 2023-10-03 02:15:30,174 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_809-17156-0_sp1.1 from training. Duration: 0.663625 2023-10-03 02:15:32,219 WARNING [train.py:1204] (2/4) Exclude cut with ID _9235_210_5219_1_1528891154253_8046120_1003-101638-0_sp0.9 from training. Duration: 0.811125 2023-10-03 02:15:33,588 WARNING [train.py:1204] (2/4) Exclude cut with ID _22826_210_9994_1_1531908201522_3279169_404-75889-0_sp0.9 from training. Duration: 0.988875 2023-10-03 02:15:33,763 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.bypass.skip_rate, batch_count=1102173.3333333333, ans=0.07 2023-10-03 02:15:33,803 INFO [scaling.py:1118] (2/4) WithLoss: name=encoder.encoders.3.encoder.layers.0.self_attn_weights, loss-sum=0.000e+00 2023-10-03 02:15:36,180 WARNING [train.py:1204] (2/4) Exclude cut with ID _59288_210_5854_1_1535077757774_3412246_570-295255-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 02:15:38,938 WARNING [train.py:1204] (2/4) Exclude cut with ID _3465_210_3525_1_1526175667060_3574049_412-287526-0 from training. Duration: 0.72 2023-10-03 02:15:39,018 WARNING [train.py:1204] (2/4) Exclude cut with ID _44010_210_4525_1_1533884262964_4013620_649-79655-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 02:15:43,695 WARNING [train.py:1204] (2/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_57-270037-0_sp0.9 from training. Duration: 0.8555625 2023-10-03 02:15:43,695 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_34815_210_6388_1_1533381320_3571903_302-255963-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 02:15:47,815 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_47449_210_5399_1_1534076445_4006929_573-301852-0_sp1.1 from training. Duration: 0.97275 2023-10-03 02:15:51,030 WARNING [train.py:1204] (2/4) Exclude cut with ID 6709-74022-0004-86860-0_sp1.1 from training. Duration: 0.9409375 2023-10-03 02:15:53,769 WARNING [train.py:1204] (2/4) Exclude cut with ID _40519_210_13794_1_1533715228655_7337390_111-71991-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 02:15:53,822 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_30167_210_3528_1_1532428012_4772375_169-214186-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 02:15:58,473 WARNING [train.py:1204] (2/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_20-235313-0_sp1.1 from training. Duration: 0.963625 2023-10-03 02:15:58,523 WARNING [train.py:1204] (2/4) Exclude cut with ID _4681_210_3525_1_1527251040321_1235713_123-24428-0_sp0.9 from training. Duration: 0.5333125 2023-10-03 02:16:00,583 WARNING [train.py:1204] (2/4) Exclude cut with ID _31602_210_5854_1_1532677021456_4458250_444-198752-0_sp1.1 from training. Duration: 0.97275 2023-10-03 02:16:01,762 WARNING [train.py:1204] (2/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_9-279442-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 02:16:01,819 WARNING [train.py:1204] (2/4) Exclude cut with ID _6275_210_2084_1_1528110062213_7669049_973-198085-0_sp0.9 from training. Duration: 0.8333125 2023-10-03 02:16:03,134 WARNING [train.py:1204] (2/4) Exclude cut with ID _29853_210_7497_1_1532505600851_6642110_567-250221-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 02:16:04,527 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6545_210_2040_1_1527774670_4245258_407-48070-0_sp1.1 from training. Duration: 0.6909375 2023-10-03 02:16:05,893 WARNING [train.py:1204] (2/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_224-343295-0_sp0.9 from training. Duration: 0.611125 2023-10-03 02:16:05,908 WARNING [train.py:1204] (2/4) Exclude cut with ID IC0125W0011-61817 from training. Duration: 0.88 2023-10-03 02:16:05,924 WARNING [train.py:1204] (2/4) Exclude cut with ID _60113_210_16271_1_1535164212481_4386079_435-154371-0_sp1.1 from training. Duration: 0.97275 2023-10-03 02:16:05,936 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_25836_210_16554_1_1532138167_53197_3-9394-0_sp1.1 from training. Duration: 0.92725 2023-10-03 02:16:08,681 WARNING [train.py:1204] (2/4) Exclude cut with ID _29343_210_4381_1_1532602757752_3674109_57-55125-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 02:16:11,360 WARNING [train.py:1204] (2/4) Exclude cut with ID _56235_210_10640_1_1534557335235_4161099_267-23560-0_sp1.1 from training. Duration: 0.936375 2023-10-03 02:16:11,389 WARNING [train.py:1204] (2/4) Exclude cut with ID _30196_210_1664_1_1533708496522_7074140_1150-278867-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 02:16:11,442 WARNING [train.py:1204] (2/4) Exclude cut with ID _6270_210_2084_1_1527850911573_3856420_26-277343-0_sp0.9 from training. Duration: 0.911125 2023-10-03 02:16:11,505 WARNING [train.py:1204] (2/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_325-62802-0 from training. Duration: 0.87 2023-10-03 02:16:13,234 WARNING [train.py:1204] (2/4) Exclude cut with ID _56524_210_9642_1_1534572011779_3619170_145-310842-0_sp1.1 from training. Duration: 0.9 2023-10-03 02:16:13,273 WARNING [train.py:1204] (2/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_860-96198-0_sp0.9 from training. Duration: 0.811125 2023-10-03 02:16:14,501 WARNING [train.py:1204] (2/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_17-25045-0_sp1.1 from training. Duration: 0.536375 2023-10-03 02:16:14,534 WARNING [train.py:1204] (2/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_349-69727-0_sp1.1 from training. Duration: 0.936375 2023-10-03 02:16:14,745 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.conv_skip_rate, batch_count=1102373.3333333333, ans=0.0 2023-10-03 02:16:15,944 WARNING [train.py:1204] (2/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_580-93798-0_sp1.1 from training. Duration: 0.5090625 2023-10-03 02:16:19,056 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7762_210_1794_1_1528199508_4057408_419-125009-0 from training. Duration: 0.83 2023-10-03 02:16:20,438 WARNING [train.py:1204] (2/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_356-167816-0 from training. Duration: 0.72 2023-10-03 02:16:20,465 WARNING [train.py:1204] (2/4) Exclude cut with ID _29345_210_4381_1_1532689168263_3860169_225-97100-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 02:16:20,469 WARNING [train.py:1204] (2/4) Exclude cut with ID _14227_210_4381_1_1530701804286_3940259_374-194712-0_sp1.1 from training. Duration: 0.92725 2023-10-03 02:16:21,856 WARNING [train.py:1204] (2/4) Exclude cut with ID _40137_210_12210_1_1533290484046_3795180_249-187059-0_sp0.9 from training. Duration: 0.97775 2023-10-03 02:16:21,877 WARNING [train.py:1204] (2/4) Exclude cut with ID _22826_210_9994_1_1531908201522_3279169_389-317194-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 02:16:21,993 WARNING [train.py:1204] (2/4) Exclude cut with ID _27485_210_4916_1_1532739191677_3668269_239-122918-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 02:16:28,601 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_2245_210_2715_1_1525511548_3540254_128-117609-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 02:16:28,636 WARNING [train.py:1204] (2/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_160-276178-0_sp1.1 from training. Duration: 0.963625 2023-10-03 02:16:29,984 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_31920_210_10633_1_1532860009_1686094_1-279735-0_sp1.1 from training. Duration: 0.97275 2023-10-03 02:16:31,614 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.self_attn_weights.pos_emb_skip_rate, batch_count=1102440.0, ans=0.0 2023-10-03 02:16:34,031 WARNING [train.py:1204] (2/4) Exclude cut with ID _51078_210_9070_1_1534161557424_3805250_325-262383-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 02:16:34,059 WARNING [train.py:1204] (2/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_643-52007-0_sp0.9 from training. Duration: 0.6333125 2023-10-03 02:16:34,109 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_51937_210_5014_1_1534573610_4022025_518-299248-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 02:16:35,646 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=1102440.0, ans=0.1 2023-10-03 02:16:40,988 WARNING [train.py:1204] (2/4) Exclude cut with ID _13252_210_1925_1_1530671372047_3336909_166-314607-0_sp1.1 from training. Duration: 0.4909375 2023-10-03 02:16:40,992 WARNING [train.py:1204] (2/4) Exclude cut with ID _44778_210_2054_1_1533798127203_7443090_1087-282939-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 02:16:41,261 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.bypass_mid.scale_min, batch_count=1102506.6666666667, ans=0.2 2023-10-03 02:16:41,671 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.3.encoder.layers.2.whiten, num_groups=1, num_channels=512, metric=3.86 vs. limit=12.0 2023-10-03 02:16:42,281 INFO [train.py:1046] (2/4) Epoch 32, batch 700, loss[loss=0.1657, simple_loss=0.2524, pruned_loss=0.03944, over 24402.00 frames. ], tot_loss[loss=0.1616, simple_loss=0.2392, pruned_loss=0.04204, over 4546963.66 frames. ], batch size: 77, lr: 3.21e-03, grad_scale: 8.0 2023-10-03 02:16:42,323 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_23850_210_4238_1_1532232597_1632433_32-185135-0_sp1.1 from training. Duration: 0.7818125 2023-10-03 02:16:42,353 WARNING [train.py:1204] (2/4) Exclude cut with ID _59500_210_8254_1_1534809610680_3736009_145-89359-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 02:16:48,441 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_340-336408-0 from training. Duration: 0.93 2023-10-03 02:16:48,514 WARNING [train.py:1204] (2/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_9-21737-0 from training. Duration: 0.97 2023-10-03 02:16:50,993 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.4.encoder.layers.1.nonlin_attention.whiten2, num_groups=1, num_channels=384, metric=3.24 vs. limit=15.0 2023-10-03 02:16:51,836 WARNING [train.py:1204] (2/4) Exclude cut with ID _2220_210_1819_1_1525509109898_3658680_50-99837-0 from training. Duration: 0.54 2023-10-03 02:16:51,907 WARNING [train.py:1204] (2/4) Exclude cut with ID _30304_210_6319_1_1532476827261_7582060_879-299350-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 02:16:53,323 WARNING [train.py:1204] (2/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_905-160074-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 02:16:54,781 WARNING [train.py:1204] (2/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_395-73570-0 from training. Duration: 0.99 2023-10-03 02:16:55,288 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.5.encoder.layers.1.self_attn1.whiten, num_groups=1, num_channels=256, metric=13.36 vs. limit=22.5 2023-10-03 02:17:00,069 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7264_210_4938_1_1527939911_8132095_800-91995-0_sp1.1 from training. Duration: 0.7818125 2023-10-03 02:17:02,892 WARNING [train.py:1204] (2/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_77-311404-0_sp1.1 from training. Duration: 0.8545625 2023-10-03 02:17:02,999 WARNING [train.py:1204] (2/4) Exclude cut with ID _13679_210_7497_1_1530534293662_3355098_107-80772-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 02:17:03,508 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.4.encoder.layers.2.nonlin_attention.whiten1, num_groups=1, num_channels=288, metric=4.17 vs. limit=10.0 2023-10-03 02:17:05,780 WARNING [train.py:1204] (2/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_224-343295-0_sp1.1 from training. Duration: 0.5 2023-10-03 02:17:05,815 WARNING [train.py:1204] (2/4) Exclude cut with ID _54871_210_22356_1_1534665798253_3856359_156-253482-0_sp1.1 from training. Duration: 0.963625 2023-10-03 02:17:08,535 WARNING [train.py:1204] (2/4) Exclude cut with ID _49728_210_12651_1_1534041034294_6788150_309-125532-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 02:17:10,018 WARNING [train.py:1204] (2/4) Exclude cut with ID _13913_210_9994_1_1530579615187_3383897_473-277682-0_sp0.9 from training. Duration: 0.4333125 2023-10-03 02:17:11,321 WARNING [train.py:1204] (2/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_3-276701-0_sp1.1 from training. Duration: 0.82725 2023-10-03 02:17:11,547 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.conv_module1.balancer2.min_positive, batch_count=1102640.0, ans=0.05 2023-10-03 02:17:12,769 WARNING [train.py:1204] (2/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_282-136641-0 from training. Duration: 0.42 2023-10-03 02:17:14,099 WARNING [train.py:1204] (2/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_127-566-0 from training. Duration: 0.84 2023-10-03 02:17:16,245 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.feed_forward1.out_proj.dropout_p, batch_count=1102640.0, ans=0.1 2023-10-03 02:17:17,800 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6163_210_6400_1_1527506746_4263068_283-137811-0_sp1.1 from training. Duration: 0.7 2023-10-03 02:17:19,380 WARNING [train.py:1204] (2/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_34-183685-0_sp1.1 from training. Duration: 0.8909375 2023-10-03 02:17:20,784 WARNING [train.py:1204] (2/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_154-135696-0_sp0.9 from training. Duration: 0.888875 2023-10-03 02:17:23,572 WARNING [train.py:1204] (2/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_676-51014-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 02:17:23,629 WARNING [train.py:1204] (2/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_38-169920-0 from training. Duration: 0.91 2023-10-03 02:17:28,732 WARNING [train.py:1204] (2/4) Exclude cut with ID _6653_210_2629_1_1527919172441_6732679_747-114465-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 02:17:28,781 WARNING [train.py:1204] (2/4) Exclude cut with ID _8021_210_7994_1_1528376451358_3653069_100-140231-0_sp1.1 from training. Duration: 0.6090625 2023-10-03 02:17:28,801 WARNING [train.py:1204] (2/4) Exclude cut with ID _64135_210_2253_1_1535677267876_7608972_133-188266-0 from training. Duration: 0.81 2023-10-03 02:17:32,067 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.5.encoder.layers.0.feed_forward3.out_whiten, num_groups=1, num_channels=256, metric=8.64 vs. limit=15.0 2023-10-03 02:17:32,849 WARNING [train.py:1204] (2/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_714-310538-0_sp1.1 from training. Duration: 0.7818125 2023-10-03 02:17:34,251 WARNING [train.py:1204] (2/4) Exclude cut with ID _39867_210_9826_1_1533553222030_9344259_1134-288498-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 02:17:34,524 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.conv_module1.balancer1.prob, batch_count=1102706.6666666667, ans=0.125 2023-10-03 02:17:36,969 WARNING [train.py:1204] (2/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_211-237289-0_sp1.1 from training. Duration: 0.936375 2023-10-03 02:17:42,394 WARNING [train.py:1204] (2/4) Exclude cut with ID _59202_210_26984_1_1534816810612_3549174_137-318710-0_sp0.9 from training. Duration: 0.988875 2023-10-03 02:17:43,706 WARNING [train.py:1204] (2/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_86-66192-0 from training. Duration: 0.55 2023-10-03 02:17:45,707 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.564e+02 1.878e+02 2.011e+02 2.266e+02 3.115e+02, threshold=4.021e+02, percent-clipped=0.0 2023-10-03 02:17:45,885 WARNING [train.py:1204] (2/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_540-275685-0 from training. Duration: 0.89 2023-10-03 02:17:45,916 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8776_210_2040_1_1528638837_4088036_244-278763-0 from training. Duration: 0.9 2023-10-03 02:17:49,043 WARNING [train.py:1204] (2/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_9-219972-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 02:17:50,477 WARNING [train.py:1204] (2/4) Exclude cut with ID _44659_210_2721_1_1534036120061_6614860_656-283823-0_sp1.1 from training. Duration: 0.92725 2023-10-03 02:17:51,866 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_69630_210_7234_1_1535789601_661049_77-216385-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 02:17:53,292 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_19952_210_2161_1_1531875469_3851750_210-308919-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 02:17:53,298 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_4737_210_2040_1_1526998689_3293008_378-178650-0 from training. Duration: 0.95 2023-10-03 02:17:58,505 INFO [train.py:1046] (2/4) Epoch 32, batch 750, loss[loss=0.1629, simple_loss=0.2357, pruned_loss=0.04507, over 23695.00 frames. ], tot_loss[loss=0.1609, simple_loss=0.238, pruned_loss=0.04187, over 4578720.62 frames. ], batch size: 232, lr: 3.21e-03, grad_scale: 8.0 2023-10-03 02:17:58,622 WARNING [train.py:1204] (2/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_224-343295-0 from training. Duration: 0.55 2023-10-03 02:17:59,884 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7002_210_1794_1_1527939683_5157286_184-229630-0 from training. Duration: 0.74 2023-10-03 02:17:59,907 WARNING [train.py:1204] (2/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_6-214815-0 from training. Duration: 0.85 2023-10-03 02:17:59,999 WARNING [train.py:1204] (2/4) Exclude cut with ID _8699_210_2069_1_1528630346831_3960409_160-24408-0 from training. Duration: 0.91 2023-10-03 02:18:01,244 WARNING [train.py:1204] (2/4) Exclude cut with ID _5585_210_2667_1_1527418813324_8007340_89-332072-0 from training. Duration: 0.64 2023-10-03 02:18:01,294 WARNING [train.py:1204] (2/4) Exclude cut with ID _5250_210_5219_1_1527249561806_8692110_528-259800-0_sp1.1 from training. Duration: 0.963625 2023-10-03 02:18:02,548 WARNING [train.py:1204] (2/4) Exclude cut with ID _55383_210_10635_1_1534468426172_2231669_134-128246-0 from training. Duration: 0.89 2023-10-03 02:18:03,993 WARNING [train.py:1204] (2/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_400-338274-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 02:18:05,262 WARNING [train.py:1204] (2/4) Exclude cut with ID _14224_210_4381_1_1530529062037_4264010_364-22817-0_sp0.9 from training. Duration: 0.788875 2023-10-03 02:18:06,735 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=1102840.0, ans=0.1 2023-10-03 02:18:07,872 WARNING [train.py:1204] (2/4) Exclude cut with ID _42863_210_9070_1_1533902398788_3696920_331-291057-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 02:18:09,352 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_5436_210_1794_1_1527331317_2541494_229-211016-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 02:18:09,390 WARNING [train.py:1204] (2/4) Exclude cut with ID _6456_210_1534_1_1527681634212_3181049_345-252877-0_sp0.9 from training. Duration: 0.711125 2023-10-03 02:18:09,420 WARNING [train.py:1204] (2/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_96-81499-0_sp1.1 from training. Duration: 0.92725 2023-10-03 02:18:11,075 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.conv_module2.balancer1.max_abs, batch_count=1102906.6666666667, ans=10.0 2023-10-03 02:18:12,246 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_5575_210_1410_1_1527301305_2570170_342-265572-0_sp1.1 from training. Duration: 0.8545625 2023-10-03 02:18:13,696 WARNING [train.py:1204] (2/4) Exclude cut with ID _5444_210_2151_1_1527418808464_5312210_726-27165-0_sp0.9 from training. Duration: 0.7444375 2023-10-03 02:18:15,168 WARNING [train.py:1204] (2/4) Exclude cut with ID _22083_210_8254_1_1531702862439_3678970_304-327750-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 02:18:18,427 WARNING [train.py:1204] (2/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_245-226199-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 02:18:18,464 WARNING [train.py:1204] (2/4) Exclude cut with ID _42875_210_14856_1_1533704397487_4012433_102-199499-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 02:18:20,299 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_51225_210_3601_1_1534400634_2327383_55-301844-0 from training. Duration: 0.93 2023-10-03 02:18:20,408 WARNING [train.py:1204] (2/4) Exclude cut with ID _28317_210_12742_1_1532480302381_8510325_916-195848-0_sp1.1 from training. Duration: 0.836375 2023-10-03 02:18:21,766 WARNING [train.py:1204] (2/4) Exclude cut with ID _44718_210_5816_1_1534050051869_6469030_497-311999-0_sp1.1 from training. Duration: 0.936375 2023-10-03 02:18:23,212 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_34886_210_10633_1_1533203882_1755661_91-194765-0_sp1.1 from training. Duration: 0.936375 2023-10-03 02:18:24,639 WARNING [train.py:1204] (2/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_310-26186-0_sp0.9 from training. Duration: 0.77775 2023-10-03 02:18:26,635 WARNING [train.py:1204] (2/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_416-257730-0 from training. Duration: 0.65 2023-10-03 02:18:26,641 WARNING [train.py:1204] (2/4) Exclude cut with ID _9389_210_1664_1_1529217337301_5289269_209-154760-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 02:18:28,066 WARNING [train.py:1204] (2/4) Exclude cut with ID _4092_210_2151_1_1526643044449_3135759_227-213972-0 from training. Duration: 0.54 2023-10-03 02:18:28,097 WARNING [train.py:1204] (2/4) Exclude cut with ID IC0679W0044-335852 from training. Duration: 0.768 2023-10-03 02:18:28,165 WARNING [train.py:1204] (2/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_42-186162-0 from training. Duration: 0.92 2023-10-03 02:18:28,173 WARNING [train.py:1204] (2/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_302-2550-0_sp0.9 from training. Duration: 0.9 2023-10-03 02:18:28,190 WARNING [train.py:1204] (2/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_356-31216-0_sp0.9 from training. Duration: 0.8333125 2023-10-03 02:18:30,813 WARNING [train.py:1204] (2/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_125-322172-0_sp1.1 from training. Duration: 0.8454375 2023-10-03 02:18:38,063 WARNING [train.py:1204] (2/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_141-89727-0_sp0.9 from training. Duration: 0.788875 2023-10-03 02:18:38,081 WARNING [train.py:1204] (2/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_647-337784-0_sp1.1 from training. Duration: 0.97275 2023-10-03 02:18:39,406 WARNING [train.py:1204] (2/4) Exclude cut with ID _6493_210_1747_1_1528459210424_4012470_513-140967-0_sp1.1 from training. Duration: 0.7181875 2023-10-03 02:18:42,307 WARNING [train.py:1204] (2/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_87-266334-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 02:18:43,677 WARNING [train.py:1204] (2/4) Exclude cut with ID _26528_210_9070_1_1532570410842_3851310_526-261338-0_sp1.1 from training. Duration: 0.92725 2023-10-03 02:18:43,718 WARNING [train.py:1204] (2/4) Exclude cut with ID _14224_210_4381_1_1530529062037_4264010_380-343670-0 from training. Duration: 0.88 2023-10-03 02:18:43,765 WARNING [train.py:1204] (2/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_449-15508-0_sp1.1 from training. Duration: 0.7454375 2023-10-03 02:18:45,613 WARNING [train.py:1204] (2/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_372-6869-0_sp0.9 from training. Duration: 0.488875 2023-10-03 02:18:45,668 WARNING [train.py:1204] (2/4) Exclude cut with ID _26907_210_7234_1_1532221740694_3205263_337-2494-0_sp0.9 from training. Duration: 0.97775 2023-10-03 02:18:48,441 WARNING [train.py:1204] (2/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_10-100367-0_sp1.1 from training. Duration: 0.8909375 2023-10-03 02:18:49,177 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.2.encoder.layers.2.self_attn_weights.whiten_keys, num_groups=4, num_channels=128, metric=3.26 vs. limit=6.0 2023-10-03 02:18:49,730 WARNING [train.py:1204] (2/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_47-167373-0 from training. Duration: 0.92 2023-10-03 02:18:49,793 WARNING [train.py:1204] (2/4) Exclude cut with ID _28323_210_16187_1_1532480395667_4100300_628-282179-0_sp1.1 from training. Duration: 0.97275 2023-10-03 02:18:54,326 WARNING [train.py:1204] (2/4) Exclude cut with ID _20455_210_5219_1_1531565996775_8098100_213-280898-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 02:18:55,701 WARNING [train.py:1204] (2/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_383-316000-0_sp1.1 from training. Duration: 0.8181875 2023-10-03 02:18:56,344 WARNING [train.py:1204] (2/4) Exclude cut with ID _18513_210_9206_1_1531618215223_3862620_16-78180-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 02:18:59,048 WARNING [train.py:1204] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_330-327861-0_sp1.1 from training. Duration: 0.6090625 2023-10-03 02:19:01,799 WARNING [train.py:1204] (2/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_356-31216-0 from training. Duration: 0.75 2023-10-03 02:19:01,828 WARNING [train.py:1204] (2/4) Exclude cut with ID _25248_210_8750_1_1532062825941_5554180_255-148539-0_sp1.1 from training. Duration: 0.963625 2023-10-03 02:19:01,869 WARNING [train.py:1204] (2/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_269-154726-0_sp1.1 from training. Duration: 0.7818125 2023-10-03 02:19:06,429 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7002_210_1794_1_1527939683_5157286_163-87552-0_sp1.1 from training. Duration: 0.7818125 2023-10-03 02:19:06,471 WARNING [train.py:1204] (2/4) Exclude cut with ID _54871_210_22356_1_1534665798253_3856359_97-151896-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 02:19:06,806 INFO [scaling.py:1118] (2/4) WithLoss: name=encoder.encoders.5.encoder.layers.0.self_attn_weights, loss-sum=0.000e+00 2023-10-03 02:19:09,200 WARNING [train.py:1204] (2/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_207-7026-0_sp1.1 from training. Duration: 0.97275 2023-10-03 02:19:09,227 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_92-46009-0_sp0.9 from training. Duration: 0.6 2023-10-03 02:19:11,916 INFO [train.py:1046] (2/4) Epoch 32, batch 800, loss[loss=0.1703, simple_loss=0.2447, pruned_loss=0.04792, over 23640.00 frames. ], tot_loss[loss=0.1615, simple_loss=0.2391, pruned_loss=0.04193, over 4609985.99 frames. ], batch size: 256, lr: 3.21e-03, grad_scale: 16.0 2023-10-03 02:19:15,133 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.nonlin_attention.balancer.max_positive, batch_count=1103173.3333333333, ans=0.95 2023-10-03 02:19:16,926 WARNING [train.py:1204] (2/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_122-178847-0_sp1.1 from training. Duration: 0.97275 2023-10-03 02:19:16,930 WARNING [train.py:1204] (2/4) Exclude cut with ID _44778_210_2054_1_1533798127203_7443090_94-348956-0_sp1.1 from training. Duration: 0.92725 2023-10-03 02:19:18,356 WARNING [train.py:1204] (2/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_1018-244089-0_sp1.1 from training. Duration: 0.963625 2023-10-03 02:19:18,373 WARNING [train.py:1204] (2/4) Exclude cut with ID _56235_210_10640_1_1534557335235_4161099_87-118423-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 02:19:19,741 WARNING [train.py:1204] (2/4) Exclude cut with ID _53713_210_24116_1_1534314487762_7416751_504-212292-0_sp1.1 from training. Duration: 0.92725 2023-10-03 02:19:19,762 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_446-150327-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 02:19:21,005 WARNING [train.py:1204] (2/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_682-245989-0_sp1.1 from training. Duration: 0.92725 2023-10-03 02:19:24,357 WARNING [train.py:1204] (2/4) Exclude cut with ID _35723_210_1794_1_1533088341043_628250_8-234524-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 02:19:26,181 WARNING [train.py:1204] (2/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_64-248359-0_sp0.9 from training. Duration: 0.7666875 2023-10-03 02:19:26,473 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.nonlin_attention.balancer.prob, batch_count=1103240.0, ans=0.125 2023-10-03 02:19:28,920 WARNING [train.py:1204] (2/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_49-97158-0 from training. Duration: 0.76 2023-10-03 02:19:28,970 WARNING [train.py:1204] (2/4) Exclude cut with ID _24314_210_4915_1_1532135078116_7170179_743-915-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 02:19:30,390 WARNING [train.py:1204] (2/4) Exclude cut with ID _61194_210_18628_1_1535007555628_3750909_353-323468-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 02:19:30,419 WARNING [train.py:1204] (2/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_387-131538-0_sp1.1 from training. Duration: 0.736375 2023-10-03 02:19:30,455 WARNING [train.py:1204] (2/4) Exclude cut with ID _44424_210_18163_1_1533623394848_1213110_79-187603-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 02:19:30,483 WARNING [train.py:1204] (2/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_86-19601-0 from training. Duration: 0.85 2023-10-03 02:19:30,502 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_50050_210_16223_1_1535435796_7290138_103-283529-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 02:19:32,457 WARNING [train.py:1204] (2/4) Exclude cut with ID _13913_210_9994_1_1530579615187_3383897_473-277682-0 from training. Duration: 0.39 2023-10-03 02:19:33,901 WARNING [train.py:1204] (2/4) Exclude cut with ID _42326_210_7770_1_1533814282531_3707269_368-68029-0_sp1.1 from training. Duration: 0.92725 2023-10-03 02:19:35,623 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.feed_forward3.hidden_balancer.prob, batch_count=1103240.0, ans=0.125 2023-10-03 02:19:36,621 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_41428_210_2161_1_1533774184_3992532_446-339645-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 02:19:39,397 WARNING [train.py:1204] (2/4) Exclude cut with ID _69584_210_5219_1_1535785133775_4044450_325-7656-0_sp1.1 from training. Duration: 0.7818125 2023-10-03 02:19:39,405 WARNING [train.py:1204] (2/4) Exclude cut with ID _21632_210_2253_1_1531994001765_3744140_134-184387-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 02:19:42,054 WARNING [train.py:1204] (2/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_550-184707-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 02:19:42,077 WARNING [train.py:1204] (2/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_106-341780-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 02:19:48,117 WARNING [train.py:1204] (2/4) Exclude cut with ID _45871_210_5665_1_1533864614245_3296060_253-132552-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 02:19:48,157 WARNING [train.py:1204] (2/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_395-310220-0_sp1.1 from training. Duration: 0.8090625 2023-10-03 02:19:48,172 WARNING [train.py:1204] (2/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_795-33252-0_sp0.9 from training. Duration: 0.411125 2023-10-03 02:19:49,795 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.0.feed_forward1.out_proj.dropout_p, batch_count=1103306.6666666667, ans=0.1 2023-10-03 02:19:51,044 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9002W0003-495962 from training. Duration: 0.946 2023-10-03 02:19:51,262 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.feed_forward2.hidden_balancer.prob, batch_count=1103306.6666666667, ans=0.125 2023-10-03 02:19:52,431 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_43-71989-0 from training. Duration: 0.57 2023-10-03 02:19:52,450 WARNING [train.py:1204] (2/4) Exclude cut with ID _8699_210_2069_1_1528630346831_3960409_155-39766-0_sp1.1 from training. Duration: 0.7090625 2023-10-03 02:19:52,469 WARNING [train.py:1204] (2/4) Exclude cut with ID _27894_210_12392_1_1532415496078_6902439_788-232684-0_sp1.1 from training. Duration: 0.97275 2023-10-03 02:19:54,410 WARNING [train.py:1204] (2/4) Exclude cut with ID _56494_210_4220_1_1535284837625_3033169_10-39296-0_sp1.1 from training. Duration: 0.92725 2023-10-03 02:19:54,428 WARNING [train.py:1204] (2/4) Exclude cut with ID _40901_210_5665_1_1533363789958_3856130_341-20819-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 02:20:00,530 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9029W0435-509822 from training. Duration: 0.896 2023-10-03 02:20:00,577 WARNING [train.py:1204] (2/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_51-117576-0 from training. Duration: 0.4 2023-10-03 02:20:03,328 WARNING [train.py:1204] (2/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_197-28164-0_sp1.1 from training. Duration: 0.72725 2023-10-03 02:20:06,596 WARNING [train.py:1204] (2/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_145-137790-0_sp1.1 from training. Duration: 0.7545625 2023-10-03 02:20:09,374 WARNING [train.py:1204] (2/4) Exclude cut with ID _47790_210_16187_1_1533862814066_4207519_2-39653-0_sp1.1 from training. Duration: 0.8909375 2023-10-03 02:20:12,252 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_3780_210_2087_1_1526467760_2130483_186-208240-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 02:20:13,669 WARNING [train.py:1204] (2/4) Exclude cut with ID _24055_210_12651_1_1532310907143_976979_92-59356-0 from training. Duration: 0.91 2023-10-03 02:20:13,715 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_3910_210_1794_1_1526742306_334530_28-238909-0_sp1.1 from training. Duration: 0.736375 2023-10-03 02:20:14,953 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.624e+02 1.903e+02 2.081e+02 2.304e+02 3.546e+02, threshold=4.162e+02, percent-clipped=0.0 2023-10-03 02:20:17,683 WARNING [train.py:1204] (2/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_269-154726-0 from training. Duration: 0.86 2023-10-03 02:20:20,758 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.feed_forward3.hidden_balancer.prob, batch_count=1103440.0, ans=0.125 2023-10-03 02:20:22,551 WARNING [train.py:1204] (2/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_326-165900-0_sp0.9 from training. Duration: 0.7666875 2023-10-03 02:20:25,438 WARNING [train.py:1204] (2/4) Exclude cut with ID _5294_210_1747_1_1527249643928_3696179_470-276077-0_sp1.1 from training. Duration: 0.963625 2023-10-03 02:20:25,486 WARNING [train.py:1204] (2/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_372-131163-0 from training. Duration: 0.94 2023-10-03 02:20:25,707 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.conv_module2.balancer2.prob, batch_count=1103506.6666666667, ans=0.125 2023-10-03 02:20:26,700 INFO [train.py:1046] (2/4) Epoch 32, batch 850, loss[loss=0.1444, simple_loss=0.2305, pruned_loss=0.02917, over 24505.00 frames. ], tot_loss[loss=0.1628, simple_loss=0.2403, pruned_loss=0.04266, over 4626387.55 frames. ], batch size: 63, lr: 3.21e-03, grad_scale: 16.0 2023-10-03 02:20:27,416 WARNING [train.py:1204] (2/4) Exclude cut with ID _45297_210_2721_1_1533715351381_3264949_351-147136-0_sp1.1 from training. Duration: 0.7909375 2023-10-03 02:20:27,654 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.conv_skip_rate, batch_count=1103506.6666666667, ans=0.0 2023-10-03 02:20:28,581 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_1072-12671-0_sp1.1 from training. Duration: 0.92725 2023-10-03 02:20:28,678 WARNING [train.py:1204] (2/4) Exclude cut with ID _15133_210_6228_1_1530794512074_3080379_54-198193-0 from training. Duration: 0.94 2023-10-03 02:20:28,696 WARNING [train.py:1204] (2/4) Exclude cut with ID _30196_210_1664_1_1533708496522_7074140_1194-270988-0_sp1.1 from training. Duration: 0.97275 2023-10-03 02:20:30,563 WARNING [train.py:1204] (2/4) Exclude cut with ID _50310_210_15488_1_1534068043345_3641080_289-193637-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 02:20:31,914 WARNING [train.py:1204] (2/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_313-263495-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 02:20:34,601 WARNING [train.py:1204] (2/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_18-130336-0_sp1.1 from training. Duration: 0.7181875 2023-10-03 02:20:34,659 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_47896_210_20508_1_1534492518_5958868_646-287209-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 02:20:34,740 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_294-163793-0 from training. Duration: 0.82 2023-10-03 02:20:36,021 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_8-319957-0 from training. Duration: 0.96 2023-10-03 02:20:36,024 WARNING [train.py:1204] (2/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_1211-226066-0 from training. Duration: 0.76 2023-10-03 02:20:37,947 WARNING [train.py:1204] (2/4) Exclude cut with ID _4779_210_2084_1_1527318022944_3914893_500-285013-0_sp0.9 from training. Duration: 0.7666875 2023-10-03 02:20:37,978 WARNING [train.py:1204] (2/4) Exclude cut with ID _22826_210_9994_1_1531908201522_3279169_347-232268-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 02:20:40,595 WARNING [train.py:1204] (2/4) Exclude cut with ID _29877_210_6228_1_1532397791106_5276149_111-55056-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 02:20:40,633 WARNING [train.py:1204] (2/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_812-271646-0_sp1.1 from training. Duration: 0.92725 2023-10-03 02:20:40,656 WARNING [train.py:1204] (2/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_235-262124-0_sp1.1 from training. Duration: 0.6090625 2023-10-03 02:20:44,842 WARNING [train.py:1204] (2/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_14-16530-0_sp1.1 from training. Duration: 0.97275 2023-10-03 02:20:46,132 WARNING [train.py:1204] (2/4) Exclude cut with ID _44718_210_5816_1_1534050051869_6469030_568-304512-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 02:20:46,163 WARNING [train.py:1204] (2/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_1033-274376-0 from training. Duration: 0.87 2023-10-03 02:20:50,808 WARNING [train.py:1204] (2/4) Exclude cut with ID _4681_210_3525_1_1527251040321_1235713_123-24428-0 from training. Duration: 0.48 2023-10-03 02:20:53,573 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_37818_210_4238_1_1533620910_4720083_251-288764-0_sp1.1 from training. Duration: 0.97275 2023-10-03 02:20:54,950 WARNING [train.py:1204] (2/4) Exclude cut with ID _65585_210_12210_1_1535450451584_3764098_112-294301-0 from training. Duration: 0.88 2023-10-03 02:20:59,422 WARNING [train.py:1204] (2/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_329-113173-0 from training. Duration: 0.62 2023-10-03 02:20:59,512 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6767_210_3273_1_1527855783_1718932_173-256601-0 from training. Duration: 0.8 2023-10-03 02:21:01,364 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.nonlin_attention.balancer.prob, batch_count=1103640.0, ans=0.125 2023-10-03 02:21:03,919 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9266W0001-627677 from training. Duration: 0.896 2023-10-03 02:21:03,944 WARNING [train.py:1204] (2/4) Exclude cut with ID _49643_210_7989_1_1534640244224_3939031_611-185515-0_sp1.1 from training. Duration: 0.8545625 2023-10-03 02:21:03,948 WARNING [train.py:1204] (2/4) Exclude cut with ID _31649_210_6228_1_1532584608063_4091078_687-237771-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 02:21:03,959 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_12868_210_6753_1_1530702479_3610564_148-172861-0_sp0.9 from training. Duration: 0.5555625 2023-10-03 02:21:06,813 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_5129_210_6400_1_1527161005_3579054_107-69738-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 02:21:08,222 WARNING [train.py:1204] (2/4) Exclude cut with ID _40137_210_12210_1_1533290484046_3795180_36-301164-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 02:21:08,244 WARNING [train.py:1204] (2/4) Exclude cut with ID _7537_210_2084_1_1528502395431_3924359_113-199933-0 from training. Duration: 0.59 2023-10-03 02:21:11,487 WARNING [train.py:1204] (2/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_547-5424-0_sp1.1 from training. Duration: 0.8545625 2023-10-03 02:21:11,545 WARNING [train.py:1204] (2/4) Exclude cut with ID _43552_210_8913_1_1533726031059_3523429_147-284024-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 02:21:12,866 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_294-163793-0_sp1.1 from training. Duration: 0.7454375 2023-10-03 02:21:12,902 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_3780_210_2087_1_1526467760_2130483_170-259044-0_sp1.1 from training. Duration: 0.636375 2023-10-03 02:21:15,692 WARNING [train.py:1204] (2/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_726-270780-0_sp1.1 from training. Duration: 0.7818125 2023-10-03 02:21:15,781 WARNING [train.py:1204] (2/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_214-291369-0_sp1.1 from training. Duration: 0.57275 2023-10-03 02:21:17,026 WARNING [train.py:1204] (2/4) Exclude cut with ID _21881_210_3525_1_1531825242757_3798380_247-102591-0 from training. Duration: 0.98 2023-10-03 02:21:18,607 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.balancer2.prob, batch_count=1103706.6666666667, ans=0.125 2023-10-03 02:21:21,215 WARNING [train.py:1204] (2/4) Exclude cut with ID _8064_210_5399_1_1528372299291_4282973_18-174647-0_sp1.1 from training. Duration: 0.82725 2023-10-03 02:21:21,218 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_415-52611-0_sp1.1 from training. Duration: 0.936375 2023-10-03 02:21:21,284 WARNING [train.py:1204] (2/4) Exclude cut with ID _8394_210_6702_1_1528590504316_3540189_379-49457-0_sp0.9 from training. Duration: 0.9666875 2023-10-03 02:21:21,298 WARNING [train.py:1204] (2/4) Exclude cut with ID _64521_210_14699_1_1535243427259_3815328_368-195106-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 02:21:22,593 WARNING [train.py:1204] (2/4) Exclude cut with ID _28394_210_8913_1_1532430049518_3650118_51-334124-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 02:21:24,772 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.2.encoder.layers.1.feed_forward3.out_whiten, num_groups=1, num_channels=384, metric=8.47 vs. limit=15.0 2023-10-03 02:21:25,773 WARNING [train.py:1204] (2/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_317-161769-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 02:21:27,210 WARNING [train.py:1204] (2/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_40-129756-0_sp0.9 from training. Duration: 0.9 2023-10-03 02:21:28,948 WARNING [train.py:1204] (2/4) Exclude cut with ID _3067_210_2667_1_1526212974640_4503168_119-175776-0_sp0.9 from training. Duration: 0.82225 2023-10-03 02:21:30,254 WARNING [train.py:1204] (2/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_1004-304915-0_sp1.1 from training. Duration: 0.92725 2023-10-03 02:21:31,662 WARNING [train.py:1204] (2/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_359-118926-0_sp0.9 from training. Duration: 0.8 2023-10-03 02:21:36,447 WARNING [train.py:1204] (2/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_51-200220-0_sp0.9 from training. Duration: 0.62225 2023-10-03 02:21:37,783 WARNING [train.py:1204] (2/4) Exclude cut with ID _46683_210_9642_1_1533730913492_4591116_530-326202-0_sp1.1 from training. Duration: 0.936375 2023-10-03 02:21:39,168 WARNING [train.py:1204] (2/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_567-135114-0 from training. Duration: 0.87 2023-10-03 02:21:39,194 WARNING [train.py:1204] (2/4) Exclude cut with ID _66863_210_18053_1_1535698758334_8590319_1092-271784-0_sp1.1 from training. Duration: 0.87275 2023-10-03 02:21:39,913 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.3.encoder.layers.0.self_attn2.whiten, num_groups=1, num_channels=512, metric=14.96 vs. limit=22.5 2023-10-03 02:21:40,472 INFO [train.py:1046] (2/4) Epoch 32, batch 900, loss[loss=0.1545, simple_loss=0.2464, pruned_loss=0.03125, over 24444.00 frames. ], tot_loss[loss=0.1637, simple_loss=0.2413, pruned_loss=0.04302, over 4655221.72 frames. ], batch size: 69, lr: 3.21e-03, grad_scale: 8.0 2023-10-03 02:21:40,514 WARNING [train.py:1204] (2/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_762-282614-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 02:21:43,665 WARNING [train.py:1204] (2/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_280-120942-0 from training. Duration: 0.77 2023-10-03 02:21:47,869 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_31583_210_3852_1_1532595851_4024413_27-170931-0_sp1.1 from training. Duration: 0.963625 2023-10-03 02:21:50,676 WARNING [train.py:1204] (2/4) Exclude cut with ID _12369_210_4381_1_1530356078581_4388520_266-271628-0_sp1.1 from training. Duration: 0.92725 2023-10-03 02:21:51,994 WARNING [train.py:1204] (2/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_41-31157-0 from training. Duration: 0.57 2023-10-03 02:21:52,209 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.conv_skip_rate, batch_count=1103840.0, ans=0.0 2023-10-03 02:21:55,265 WARNING [train.py:1204] (2/4) Exclude cut with ID _21878_210_3525_1_1531738772002_7264330_113-18163-0_sp1.1 from training. Duration: 0.8090625 2023-10-03 02:21:56,585 WARNING [train.py:1204] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_147-111137-0 from training. Duration: 0.95 2023-10-03 02:21:56,660 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1083-220122-0_sp1.1 from training. Duration: 0.463625 2023-10-03 02:21:58,051 WARNING [train.py:1204] (2/4) Exclude cut with ID _14884_210_2252_1_1530707459689_5561250_591-173406-0_sp1.1 from training. Duration: 0.87275 2023-10-03 02:21:58,057 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_24677_210_1794_1_1532002693_4341296_149-303793-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 02:21:58,121 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_12868_210_6753_1_1530702479_3610564_133-564-0_sp1.1 from training. Duration: 0.7181875 2023-10-03 02:21:59,581 WARNING [train.py:1204] (2/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_625-338546-0_sp0.9 from training. Duration: 0.97775 2023-10-03 02:22:07,647 WARNING [train.py:1204] (2/4) Exclude cut with ID _25882_210_3036_1_1532775665815_6479510_624-320169-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 02:22:07,661 WARNING [train.py:1204] (2/4) Exclude cut with ID _20622_210_3852_1_1531622427080_1093839_127-325326-0_sp1.1 from training. Duration: 0.92725 2023-10-03 02:22:07,692 WARNING [train.py:1204] (2/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_464-33931-0_sp1.1 from training. Duration: 0.4909375 2023-10-03 02:22:11,175 WARNING [train.py:1204] (2/4) Exclude cut with ID _66112_210_10635_1_1535590236305_3801484_120-304588-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 02:22:16,674 WARNING [train.py:1204] (2/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_110-130961-0 from training. Duration: 0.47 2023-10-03 02:22:16,815 WARNING [train.py:1204] (2/4) Exclude cut with ID _16908_210_5724_1_1531119157248_3772700_64-54020-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 02:22:22,289 WARNING [train.py:1204] (2/4) Exclude cut with ID _4068_210_2629_1_1526697350395_1546259_224-288693-0_sp1.1 from training. Duration: 0.536375 2023-10-03 02:22:22,348 WARNING [train.py:1204] (2/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_191-41023-0_sp0.9 from training. Duration: 0.788875 2023-10-03 02:22:23,677 WARNING [train.py:1204] (2/4) Exclude cut with ID ID0205W0255-783002 from training. Duration: 0.958 2023-10-03 02:22:23,839 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.conv_module1.balancer2.min_positive, batch_count=1104040.0, ans=0.05 2023-10-03 02:22:25,511 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_162-262717-0 from training. Duration: 0.83 2023-10-03 02:22:31,539 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_224-322394-0_sp1.1 from training. Duration: 0.6 2023-10-03 02:22:32,796 WARNING [train.py:1204] (2/4) Exclude cut with ID _4866_210_2577_1_1527404412284_4364659_133-1696-0_sp1.1 from training. Duration: 0.9 2023-10-03 02:22:32,857 WARNING [train.py:1204] (2/4) Exclude cut with ID _6275_210_2084_1_1528110062213_7669049_973-198085-0_sp1.1 from training. Duration: 0.6818125 2023-10-03 02:22:38,547 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_47449_210_5399_1_1534076445_4006929_410-237806-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 02:22:38,559 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7543_210_6753_1_1528543318_4552_1-85072-0_sp0.9 from training. Duration: 0.92225 2023-10-03 02:22:40,553 WARNING [train.py:1204] (2/4) Exclude cut with ID _65263_210_22356_1_1535763598691_3921037_108-21902-0 from training. Duration: 0.99 2023-10-03 02:22:40,557 WARNING [train.py:1204] (2/4) Exclude cut with ID _65585_210_12210_1_1535450451584_3764098_309-174818-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 02:22:43,790 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_309-65177-0 from training. Duration: 0.88 2023-10-03 02:22:45,151 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.563e+02 1.827e+02 2.004e+02 2.233e+02 3.058e+02, threshold=4.009e+02, percent-clipped=0.0 2023-10-03 02:22:45,279 WARNING [train.py:1204] (2/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_363-185703-0_sp1.1 from training. Duration: 0.863625 2023-10-03 02:22:46,589 WARNING [train.py:1204] (2/4) Exclude cut with ID _42326_210_7770_1_1533814282531_3707269_442-238692-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 02:22:47,968 WARNING [train.py:1204] (2/4) Exclude cut with ID _69884_210_9771_1_1535846392818_4166169_226-254446-0_sp1.1 from training. Duration: 0.936375 2023-10-03 02:22:47,992 WARNING [train.py:1204] (2/4) Exclude cut with ID _29853_210_7497_1_1532505600851_6642110_287-299903-0_sp1.1 from training. Duration: 0.963625 2023-10-03 02:22:50,828 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1015-343884-0 from training. Duration: 0.62 2023-10-03 02:22:52,055 WARNING [train.py:1204] (2/4) Exclude cut with ID ID0305W0008-833402 from training. Duration: 0.91 2023-10-03 02:22:53,499 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6673_210_3273_1_1527767790_3173240_351-167954-0_sp1.1 from training. Duration: 0.436375 2023-10-03 02:22:53,514 WARNING [train.py:1204] (2/4) Exclude cut with ID _6456_210_1534_1_1527681634212_3181049_345-252877-0 from training. Duration: 0.64 2023-10-03 02:22:54,747 INFO [train.py:1046] (2/4) Epoch 32, batch 950, loss[loss=0.167, simple_loss=0.2352, pruned_loss=0.04942, over 23895.00 frames. ], tot_loss[loss=0.1641, simple_loss=0.2423, pruned_loss=0.04297, over 4669841.90 frames. ], batch size: 196, lr: 3.21e-03, grad_scale: 8.0 2023-10-03 02:22:56,624 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_13-205182-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 02:22:59,395 WARNING [train.py:1204] (2/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_427-208901-0 from training. Duration: 0.68 2023-10-03 02:23:05,357 WARNING [train.py:1204] (2/4) Exclude cut with ID _49039_210_6315_1_1533967537541_3846118_9-148921-0_sp1.1 from training. Duration: 0.97275 2023-10-03 02:23:06,923 WARNING [train.py:1204] (2/4) Exclude cut with ID _59493_210_17282_1_1535078419077_3225879_143-70317-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 02:23:06,958 WARNING [train.py:1204] (2/4) Exclude cut with ID _27967_210_12140_1_1532588391859_8419130_1056-236369-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 02:23:07,006 WARNING [train.py:1204] (2/4) Exclude cut with ID _6537_210_1883_1_1527987559710_3597129_382-133062-0_sp0.9 from training. Duration: 0.7555625 2023-10-03 02:23:09,840 WARNING [train.py:1204] (2/4) Exclude cut with ID ID0376W0255-869957 from training. Duration: 0.9510625 2023-10-03 02:23:14,466 WARNING [train.py:1204] (2/4) Exclude cut with ID _49371_210_12071_1_1533952761021_3812190_483-240691-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 02:23:14,534 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_12868_210_6753_1_1530701932_530275_18-76322-0_sp1.1 from training. Duration: 0.7909375 2023-10-03 02:23:14,687 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.1.feed_forward1.out_proj.dropout_p, batch_count=1104240.0, ans=0.1 2023-10-03 02:23:15,972 WARNING [train.py:1204] (2/4) Exclude cut with ID _53950_210_5816_1_1534417264919_3862951_367-26344-0_sp1.1 from training. Duration: 0.97275 2023-10-03 02:23:15,992 WARNING [train.py:1204] (2/4) Exclude cut with ID _5444_210_2151_1_1527418808464_5312210_634-194174-0_sp1.1 from training. Duration: 0.8818125 2023-10-03 02:23:16,017 WARNING [train.py:1204] (2/4) Exclude cut with ID _56494_210_4220_1_1535284837625_3033169_69-138150-0 from training. Duration: 0.94 2023-10-03 02:23:17,422 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7543_210_6753_1_1528544010_1110402_25-183860-0_sp0.9 from training. Duration: 0.52225 2023-10-03 02:23:20,143 WARNING [train.py:1204] (2/4) Exclude cut with ID _40284_210_2084_1_1533513679814_3704279_287-191918-0_sp1.1 from training. Duration: 0.963625 2023-10-03 02:23:21,540 WARNING [train.py:1204] (2/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_291-270881-0 from training. Duration: 0.97 2023-10-03 02:23:22,886 WARNING [train.py:1204] (2/4) Exclude cut with ID _12555_210_8750_1_1530075594402_3780239_449-267986-0_sp0.9 from training. Duration: 0.92225 2023-10-03 02:23:26,303 WARNING [train.py:1204] (2/4) Exclude cut with ID _3992_210_1747_1_1526644816031_3731060_268-64520-0_sp1.1 from training. Duration: 0.963625 2023-10-03 02:23:26,310 WARNING [train.py:1204] (2/4) Exclude cut with ID _39411_210_5986_1_1533538825030_3343649_173-238248-0_sp0.9 from training. Duration: 0.92225 2023-10-03 02:23:26,338 WARNING [train.py:1204] (2/4) Exclude cut with ID _32704_210_8954_1_1533017488840_6747370_828-313097-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 02:23:27,699 WARNING [train.py:1204] (2/4) Exclude cut with ID _26907_210_7234_1_1532221740694_3205263_337-2494-0 from training. Duration: 0.88 2023-10-03 02:23:30,276 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_229-45300-0_sp1.1 from training. Duration: 0.5181875 2023-10-03 02:23:30,443 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.attention_skip_rate, batch_count=1104306.6666666667, ans=0.0 2023-10-03 02:23:32,152 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.4.encoder.layers.2.self_attn1.whiten, num_groups=1, num_channels=384, metric=12.70 vs. limit=22.5 2023-10-03 02:23:32,949 WARNING [train.py:1204] (2/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_300-27235-0_sp1.1 from training. Duration: 0.7909375 2023-10-03 02:23:34,897 WARNING [train.py:1204] (2/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_48-338976-0_sp1.1 from training. Duration: 0.8090625 2023-10-03 02:23:35,067 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder_embed.convnext.layerdrop_rate, batch_count=1104306.6666666667, ans=0.015 2023-10-03 02:23:35,567 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.4.encoder.layers.2.self_attn1.whiten, num_groups=1, num_channels=384, metric=12.15 vs. limit=22.5 2023-10-03 02:23:37,833 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_13091_210_7925_1_1530609447_8987354_792-195353-0_sp1.1 from training. Duration: 0.936375 2023-10-03 02:23:37,838 WARNING [train.py:1204] (2/4) Exclude cut with ID _56524_210_9642_1_1534572011779_3619170_311-54500-0_sp1.1 from training. Duration: 0.97275 2023-10-03 02:23:41,867 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7543_210_6753_1_1528544010_1110402_25-183860-0 from training. Duration: 0.47 2023-10-03 02:23:44,905 WARNING [train.py:1204] (2/4) Exclude cut with ID 2411-132532-0017-82279-0_sp1.1 from training. Duration: 0.9681875 2023-10-03 02:23:44,906 WARNING [train.py:1204] (2/4) Exclude cut with ID _14170_210_5219_1_1530525949868_4469650_272-249985-0_sp0.9 from training. Duration: 0.7444375 2023-10-03 02:23:44,963 WARNING [train.py:1204] (2/4) Exclude cut with ID _55644_210_11856_1_1534593599582_4103719_320-229006-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 02:23:45,009 WARNING [train.py:1204] (2/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_177-100554-0_sp1.1 from training. Duration: 0.963625 2023-10-03 02:23:45,011 WARNING [train.py:1204] (2/4) Exclude cut with ID _14998_210_5219_1_1530705582080_7474970_787-294416-0_sp1.1 from training. Duration: 0.7454375 2023-10-03 02:23:49,831 WARNING [train.py:1204] (2/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_318-59097-0 from training. Duration: 0.76 2023-10-03 02:23:49,901 WARNING [train.py:1204] (2/4) Exclude cut with ID _12369_210_4381_1_1530356078581_4388520_480-13146-0_sp1.1 from training. Duration: 0.77275 2023-10-03 02:23:51,513 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.bypass.scale_min, batch_count=1104373.3333333333, ans=0.2 2023-10-03 02:23:52,646 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_469-338558-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 02:23:52,703 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_40896_210_15710_1_1533433956_7100802_367-126897-0_sp1.1 from training. Duration: 0.963625 2023-10-03 02:23:52,722 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6545_210_2040_1_1527774670_4245258_379-14182-0 from training. Duration: 0.8 2023-10-03 02:23:52,734 WARNING [train.py:1204] (2/4) Exclude cut with ID _21631_210_2253_1_1531907880778_3160339_197-79498-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 02:23:52,738 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_12868_210_6753_1_1530702479_3610564_416-293746-0_sp1.1 from training. Duration: 0.8181875 2023-10-03 02:23:52,786 WARNING [train.py:1204] (2/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_52-211683-0 from training. Duration: 0.9 2023-10-03 02:23:53,005 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.balancer1.prob, batch_count=1104440.0, ans=0.125 2023-10-03 02:23:57,369 WARNING [train.py:1204] (2/4) Exclude cut with ID _48678_210_5663_1_1533988830947_3769199_306-255886-0_sp0.9 from training. Duration: 0.8555625 2023-10-03 02:23:58,913 WARNING [train.py:1204] (2/4) Exclude cut with ID _65134_210_14763_1_1535335393985_3505368_296-24733-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 02:23:59,581 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.3.encoder.layers.2.conv_module2.whiten, num_groups=1, num_channels=512, metric=3.97 vs. limit=15.0 2023-10-03 02:24:04,742 WARNING [train.py:1204] (2/4) Exclude cut with ID _14898_210_9206_1_1530766812377_4177190_335-99364-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 02:24:06,118 WARNING [train.py:1204] (2/4) Exclude cut with ID _54871_210_22356_1_1534665798253_3856359_19-342632-0 from training. Duration: 0.91 2023-10-03 02:24:06,141 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_53071_210_16554_1_1534490405_2954066_265-170472-0 from training. Duration: 0.83 2023-10-03 02:24:08,735 INFO [train.py:1046] (2/4) Epoch 32, batch 1000, loss[loss=0.1606, simple_loss=0.2288, pruned_loss=0.04619, over 23733.00 frames. ], tot_loss[loss=0.1637, simple_loss=0.2416, pruned_loss=0.04288, over 4679069.75 frames. ], batch size: 232, lr: 3.21e-03, grad_scale: 8.0 2023-10-03 02:24:08,931 WARNING [train.py:1204] (2/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_189-298172-0_sp1.1 from training. Duration: 0.963625 2023-10-03 02:24:09,164 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.balancer1.prob, batch_count=1104506.6666666667, ans=0.125 2023-10-03 02:24:13,486 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7615_210_4211_1_1528110965_4580160_72-235211-0 from training. Duration: 0.76 2023-10-03 02:24:14,834 WARNING [train.py:1204] (2/4) Exclude cut with ID _47790_210_16187_1_1533862814066_4207519_155-90874-0_sp1.1 from training. Duration: 0.936375 2023-10-03 02:24:15,085 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.ff2_skip_rate, batch_count=1104506.6666666667, ans=0.0 2023-10-03 02:24:18,042 WARNING [train.py:1204] (2/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_164-216310-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 02:24:18,635 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.3.encoder.layers.3.nonlin_attention.whiten2, num_groups=1, num_channels=512, metric=7.03 vs. limit=15.0 2023-10-03 02:24:19,404 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_46013_210_7234_1_1533776362_3744032_463-52378-0 from training. Duration: 0.86 2023-10-03 02:24:19,407 WARNING [train.py:1204] (2/4) Exclude cut with ID _61133_210_9969_1_1535196777227_3696409_333-270325-0 from training. Duration: 0.93 2023-10-03 02:24:23,664 WARNING [train.py:1204] (2/4) Exclude cut with ID _5172_210_1883_1_1527246062567_3577219_18-101380-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 02:24:23,671 WARNING [train.py:1204] (2/4) Exclude cut with ID _30800_210_22382_1_1532499347904_4515959_577-236858-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 02:24:25,457 WARNING [train.py:1204] (2/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_1006-177345-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 02:24:29,639 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6291_210_3273_1_1527683649_1078498_4-266348-0 from training. Duration: 0.78 2023-10-03 02:24:32,552 WARNING [train.py:1204] (2/4) Exclude cut with ID _55857_210_2346_1_1534644007831_6898209_745-98391-0 from training. Duration: 0.76 2023-10-03 02:24:34,552 WARNING [train.py:1204] (2/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_375-244976-0 from training. Duration: 0.75 2023-10-03 02:24:34,595 WARNING [train.py:1204] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_4-248404-0_sp1.1 from training. Duration: 0.97275 2023-10-03 02:24:36,024 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8453_210_4211_1_1528502940_8562730_596-306820-0 from training. Duration: 0.97 2023-10-03 02:24:37,439 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_211-339622-0_sp0.9 from training. Duration: 0.5 2023-10-03 02:24:37,469 WARNING [train.py:1204] (2/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_393-57888-0 from training. Duration: 0.64 2023-10-03 02:24:38,854 WARNING [train.py:1204] (2/4) Exclude cut with ID _23825_210_13449_1_1531915313526_3497150_143-343575-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 02:24:38,901 WARNING [train.py:1204] (2/4) Exclude cut with ID _58605_210_12210_1_1534723195939_3904259_36-12256-0_sp1.1 from training. Duration: 0.936375 2023-10-03 02:24:40,408 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.bypass.skip_rate, batch_count=1104640.0, ans=0.04949747468305833 2023-10-03 02:24:48,362 WARNING [train.py:1204] (2/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_38-67795-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 02:24:49,674 WARNING [train.py:1204] (2/4) Exclude cut with ID _64631_210_27767_1_1535349580068_4165160_308-327832-0_sp1.1 from training. Duration: 0.92725 2023-10-03 02:24:49,733 WARNING [train.py:1204] (2/4) Exclude cut with ID _37086_210_9038_1_1532999540891_7021250_726-81176-0_sp1.1 from training. Duration: 0.936375 2023-10-03 02:24:51,005 WARNING [train.py:1204] (2/4) Exclude cut with ID _47790_210_16187_1_1533862814066_4207519_47-141521-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 02:24:51,009 WARNING [train.py:1204] (2/4) Exclude cut with ID _3067_210_2667_1_1526212974640_4503168_250-285937-0 from training. Duration: 0.8 2023-10-03 02:24:51,046 WARNING [train.py:1204] (2/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_239-24000-0_sp1.1 from training. Duration: 0.97275 2023-10-03 02:24:52,496 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_3371_210_1794_1_1526384462_5580777_396-339812-0_sp0.9 from training. Duration: 0.9444375 2023-10-03 02:24:53,786 WARNING [train.py:1204] (2/4) Exclude cut with ID _16909_210_5724_1_1531292079912_4118919_580-173454-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 02:24:55,209 WARNING [train.py:1204] (2/4) Exclude cut with ID IC0124W0002-61308 from training. Duration: 0.982 2023-10-03 02:24:58,560 WARNING [train.py:1204] (2/4) Exclude cut with ID _31602_210_5854_1_1532677021456_4458250_249-96604-0 from training. Duration: 0.89 2023-10-03 02:24:58,633 WARNING [train.py:1204] (2/4) Exclude cut with ID _69584_210_5219_1_1535785133775_4044450_325-7656-0 from training. Duration: 0.86 2023-10-03 02:25:00,028 WARNING [train.py:1204] (2/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_478-162247-0 from training. Duration: 0.82 2023-10-03 02:25:00,257 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=1104706.6666666667, ans=0.1 2023-10-03 02:25:01,525 WARNING [train.py:1204] (2/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_686-54383-0_sp1.1 from training. Duration: 0.7909375 2023-10-03 02:25:07,637 WARNING [train.py:1204] (2/4) Exclude cut with ID _29303_210_2575_1_1532392237303_7410649_85-270092-0_sp1.1 from training. Duration: 0.936375 2023-10-03 02:25:07,647 WARNING [train.py:1204] (2/4) Exclude cut with ID _2322_210_2667_1_1525604548971_3656299_440-80494-0_sp1.1 from training. Duration: 0.8 2023-10-03 02:25:08,953 WARNING [train.py:1204] (2/4) Exclude cut with ID _47346_210_9476_1_1533815971553_3536160_201-40644-0_sp1.1 from training. Duration: 0.936375 2023-10-03 02:25:09,046 WARNING [train.py:1204] (2/4) Exclude cut with ID _56235_210_10640_1_1534557335235_4161099_343-139898-0_sp1.1 from training. Duration: 0.87275 2023-10-03 02:25:10,414 WARNING [train.py:1204] (2/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_1012-279473-0 from training. Duration: 0.67 2023-10-03 02:25:11,798 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7931_210_1794_1_1528594728_5564059_280-279560-0_sp1.1 from training. Duration: 0.8909375 2023-10-03 02:25:13,047 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.623e+02 1.852e+02 2.033e+02 2.255e+02 3.341e+02, threshold=4.066e+02, percent-clipped=0.0 2023-10-03 02:25:13,119 WARNING [train.py:1204] (2/4) Exclude cut with ID _8263_210_1664_1_1528439452891_970220_68-181805-0 from training. Duration: 0.64 2023-10-03 02:25:13,171 WARNING [train.py:1204] (2/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_605-133395-0 from training. Duration: 0.66 2023-10-03 02:25:13,261 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_43-171109-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 02:25:13,264 WARNING [train.py:1204] (2/4) Exclude cut with ID _40094_210_16380_1_1533294005773_3641159_107-280739-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 02:25:18,143 WARNING [train.py:1204] (2/4) Exclude cut with ID _14821_210_8750_1_1530680503991_6829359_2-227151-0_sp0.9 from training. Duration: 0.92225 2023-10-03 02:25:19,595 WARNING [train.py:1204] (2/4) Exclude cut with ID _14268_210_9206_1_1530622771285_4714099_331-105351-0_sp1.1 from training. Duration: 0.5909375 2023-10-03 02:25:22,220 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_26326_210_7925_1_1533002493_3592406_451-82741-0_sp1.1 from training. Duration: 0.97275 2023-10-03 02:25:23,460 INFO [train.py:1046] (2/4) Epoch 32, batch 1050, loss[loss=0.1542, simple_loss=0.2276, pruned_loss=0.04036, over 23775.00 frames. ], tot_loss[loss=0.163, simple_loss=0.2407, pruned_loss=0.04271, over 4687994.82 frames. ], batch size: 212, lr: 3.20e-03, grad_scale: 8.0 2023-10-03 02:25:24,918 WARNING [train.py:1204] (2/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_191-59784-0_sp0.9 from training. Duration: 0.97775 2023-10-03 02:25:26,892 WARNING [train.py:1204] (2/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_445-117398-0_sp1.1 from training. Duration: 0.8090625 2023-10-03 02:25:28,350 WARNING [train.py:1204] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_605-257205-0_sp0.9 from training. Duration: 0.6555625 2023-10-03 02:25:29,718 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_50050_210_16223_1_1535435796_7290138_592-258841-0_sp1.1 from training. Duration: 0.936375 2023-10-03 02:25:31,743 WARNING [train.py:1204] (2/4) Exclude cut with ID _14348_210_8341_1_1530599434996_3778640_240-232370-0_sp0.9 from training. Duration: 0.9333125 2023-10-03 02:25:33,141 WARNING [train.py:1204] (2/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_239-316802-0_sp0.9 from training. Duration: 0.7666875 2023-10-03 02:25:35,768 WARNING [train.py:1204] (2/4) Exclude cut with ID _45560_210_21982_1_1533812603951_3584999_111-6376-0_sp1.1 from training. Duration: 0.763625 2023-10-03 02:25:38,544 WARNING [train.py:1204] (2/4) Exclude cut with ID _54189_210_16380_1_1534332670039_3911710_606-311481-0_sp1.1 from training. Duration: 0.963625 2023-10-03 02:25:39,873 WARNING [train.py:1204] (2/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_237-332027-0_sp1.1 from training. Duration: 0.863625 2023-10-03 02:25:39,903 WARNING [train.py:1204] (2/4) Exclude cut with ID _66070_210_7640_1_1535441393096_7805310_489-292631-0_sp0.9 from training. Duration: 0.87775 2023-10-03 02:25:41,216 WARNING [train.py:1204] (2/4) Exclude cut with ID _49728_210_12651_1_1534041034294_6788150_624-195301-0_sp1.1 from training. Duration: 0.77275 2023-10-03 02:25:41,259 WARNING [train.py:1204] (2/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_44-79427-0 from training. Duration: 0.98 2023-10-03 02:25:42,577 WARNING [train.py:1204] (2/4) Exclude cut with ID _35350_210_16040_1_1533040191183_4075299_490-198354-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 02:25:43,926 WARNING [train.py:1204] (2/4) Exclude cut with ID _5166_210_2084_1_1527073197160_4087993_309-222545-0 from training. Duration: 0.84 2023-10-03 02:25:46,702 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_13091_210_7925_1_1530609447_8987354_790-199469-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 02:25:46,709 WARNING [train.py:1204] (2/4) Exclude cut with ID _21632_210_2253_1_1531994001765_3744140_110-274654-0 from training. Duration: 0.82 2023-10-03 02:25:46,715 WARNING [train.py:1204] (2/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_498-40153-0_sp0.9 from training. Duration: 0.7 2023-10-03 02:25:49,045 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.conv_skip_rate, batch_count=1104906.6666666667, ans=0.0 2023-10-03 02:25:52,916 WARNING [train.py:1204] (2/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_363-282909-0_sp1.1 from training. Duration: 0.936375 2023-10-03 02:25:54,286 WARNING [train.py:1204] (2/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_178-177582-0_sp0.9 from training. Duration: 0.82225 2023-10-03 02:25:54,297 WARNING [train.py:1204] (2/4) Exclude cut with ID _24612_210_8913_1_1531998054193_3806359_357-35487-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 02:25:55,765 WARNING [train.py:1204] (2/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_138-332773-0 from training. Duration: 0.81 2023-10-03 02:25:55,788 WARNING [train.py:1204] (2/4) Exclude cut with ID _7624_210_1531_1_1528109802198_4933101_418-349834-0 from training. Duration: 0.6 2023-10-03 02:25:57,653 WARNING [train.py:1204] (2/4) Exclude cut with ID _7537_210_2084_1_1528502395431_3924359_14-312906-0_sp0.9 from training. Duration: 0.9333125 2023-10-03 02:26:00,403 WARNING [train.py:1204] (2/4) Exclude cut with ID _60057_210_10640_1_1534903065793_3333150_362-230377-0 from training. Duration: 0.87 2023-10-03 02:26:03,620 WARNING [train.py:1204] (2/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_138-64210-0 from training. Duration: 0.73 2023-10-03 02:26:03,685 WARNING [train.py:1204] (2/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_454-339504-0_sp1.1 from training. Duration: 0.97275 2023-10-03 02:26:07,867 WARNING [train.py:1204] (2/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_508-335369-0_sp0.9 from training. Duration: 0.5333125 2023-10-03 02:26:09,350 WARNING [train.py:1204] (2/4) Exclude cut with ID _5608_210_2106_1_1527415439089_6819768_909-82767-0_sp0.9 from training. Duration: 0.67775 2023-10-03 02:26:09,374 WARNING [train.py:1204] (2/4) Exclude cut with ID _6694_210_4599_1_1527852637490_3965529_99-11611-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 02:26:10,691 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_2655_210_2087_1_1525688959_3897400_52-279899-0_sp1.1 from training. Duration: 0.62725 2023-10-03 02:26:13,707 WARNING [train.py:1204] (2/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_375-245677-0_sp1.1 from training. Duration: 0.736375 2023-10-03 02:26:13,985 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.self_attn_weights.pos_emb_skip_rate, batch_count=1105040.0, ans=0.0 2023-10-03 02:26:16,976 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_23850_210_4238_1_1532232597_1632433_32-185135-0 from training. Duration: 0.86 2023-10-03 02:26:17,191 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.feed_forward1.hidden_balancer.prob, batch_count=1105040.0, ans=0.125 2023-10-03 02:26:18,807 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.bypass.skip_rate, batch_count=1105040.0, ans=0.04949747468305833 2023-10-03 02:26:19,872 WARNING [train.py:1204] (2/4) Exclude cut with ID _15113_210_8254_1_1530842438579_3716279_209-54533-0 from training. Duration: 0.96 2023-10-03 02:26:19,911 WARNING [train.py:1204] (2/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_401-53423-0 from training. Duration: 0.55 2023-10-03 02:26:19,948 WARNING [train.py:1204] (2/4) Exclude cut with ID _14867_210_1968_1_1530684179890_3846969_319-250868-0_sp1.1 from training. Duration: 0.9 2023-10-03 02:26:19,961 WARNING [train.py:1204] (2/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_106-30782-0_sp0.9 from training. Duration: 0.888875 2023-10-03 02:26:21,419 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_5960_210_1794_1_1527403865_3990999_515-2613-0 from training. Duration: 0.88 2023-10-03 02:26:25,577 WARNING [train.py:1204] (2/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_798-106179-0_sp0.9 from training. Duration: 0.92225 2023-10-03 02:26:27,570 WARNING [train.py:1204] (2/4) Exclude cut with ID _14227_210_4381_1_1530701804286_3940259_125-275642-0_sp1.1 from training. Duration: 0.9 2023-10-03 02:26:27,580 WARNING [train.py:1204] (2/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_79-200392-0_sp1.1 from training. Duration: 0.8818125 2023-10-03 02:26:28,895 WARNING [train.py:1204] (2/4) Exclude cut with ID _48836_210_12210_1_1534075848293_3904150_429-35475-0_sp0.9 from training. Duration: 0.988875 2023-10-03 02:26:28,910 WARNING [train.py:1204] (2/4) Exclude cut with ID _69884_210_9771_1_1535846392818_4166169_224-172517-0_sp1.1 from training. Duration: 0.97275 2023-10-03 02:26:31,683 WARNING [train.py:1204] (2/4) Exclude cut with ID _15113_210_8254_1_1530842438579_3716279_76-232507-0_sp1.1 from training. Duration: 0.97275 2023-10-03 02:26:31,685 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_135-5501-0 from training. Duration: 0.98 2023-10-03 02:26:33,723 WARNING [train.py:1204] (2/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_563-347832-0_sp0.9 from training. Duration: 0.988875 2023-10-03 02:26:33,740 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6786_210_1388_1_1527839699_4194495_273-149841-0 from training. Duration: 0.92 2023-10-03 02:26:34,909 WARNING [train.py:1204] (2/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_172-215932-0 from training. Duration: 0.66 2023-10-03 02:26:36,171 WARNING [train.py:1204] (2/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_939-132910-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 02:26:36,350 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.conv_skip_rate, batch_count=1105173.3333333333, ans=0.0 2023-10-03 02:26:37,351 INFO [train.py:1046] (2/4) Epoch 32, batch 1100, loss[loss=0.1499, simple_loss=0.2294, pruned_loss=0.03525, over 24620.00 frames. ], tot_loss[loss=0.1625, simple_loss=0.2404, pruned_loss=0.04231, over 4696122.44 frames. ], batch size: 60, lr: 3.20e-03, grad_scale: 8.0 2023-10-03 02:26:40,171 WARNING [train.py:1204] (2/4) Exclude cut with ID _49371_210_12071_1_1533952761021_3812190_16-246001-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 02:26:44,412 WARNING [train.py:1204] (2/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_680-189165-0_sp1.1 from training. Duration: 0.936375 2023-10-03 02:26:44,682 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.conv_module1.balancer1.min_positive, batch_count=1105173.3333333333, ans=0.025 2023-10-03 02:26:49,238 WARNING [train.py:1204] (2/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_53-300544-0_sp1.1 from training. Duration: 0.6090625 2023-10-03 02:26:49,449 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.conv_module2.balancer1.min_positive, batch_count=1105173.3333333333, ans=0.025 2023-10-03 02:26:51,021 WARNING [train.py:1204] (2/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_540-275685-0_sp1.1 from training. Duration: 0.8090625 2023-10-03 02:26:51,044 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_38965_210_4852_1_1533174906_3484735_453-106579-0_sp1.1 from training. Duration: 0.92725 2023-10-03 02:26:52,306 WARNING [train.py:1204] (2/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_105-328174-0 from training. Duration: 0.76 2023-10-03 02:26:53,686 WARNING [train.py:1204] (2/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_279-126205-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 02:26:55,268 WARNING [train.py:1204] (2/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_5-55893-0_sp0.9 from training. Duration: 0.611125 2023-10-03 02:26:56,702 WARNING [train.py:1204] (2/4) Exclude cut with ID _13678_210_7497_1_1530530069226_3463170_141-10835-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 02:27:01,281 WARNING [train.py:1204] (2/4) Exclude cut with ID _22083_210_8254_1_1531702862439_3678970_201-85738-0_sp1.1 from training. Duration: 0.8181875 2023-10-03 02:27:01,307 WARNING [train.py:1204] (2/4) Exclude cut with ID _4697_210_2069_1_1526815801953_3823098_51-1049-0 from training. Duration: 0.75 2023-10-03 02:27:02,568 WARNING [train.py:1204] (2/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_206-10097-0_sp0.9 from training. Duration: 0.5444375 2023-10-03 02:27:03,950 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_4528_210_3273_1_1527076654_3276777_360-63661-0_sp1.1 from training. Duration: 0.92725 2023-10-03 02:27:03,965 WARNING [train.py:1204] (2/4) Exclude cut with ID _64086_210_7994_1_1535270417019_7435949_449-31567-0_sp1.1 from training. Duration: 0.8818125 2023-10-03 02:27:07,313 WARNING [train.py:1204] (2/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_245-174553-0_sp0.9 from training. Duration: 0.97775 2023-10-03 02:27:08,815 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6545_210_2040_1_1527774670_4245258_379-14182-0_sp1.1 from training. Duration: 0.72725 2023-10-03 02:27:13,083 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_3133_210_2087_1_1526106446_2324690_256-206070-0_sp1.1 from training. Duration: 0.963625 2023-10-03 02:27:15,997 WARNING [train.py:1204] (2/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_278-104676-0 from training. Duration: 0.98 2023-10-03 02:27:17,331 WARNING [train.py:1204] (2/4) Exclude cut with ID IC0671W0065-331908 from training. Duration: 0.896 2023-10-03 02:27:17,371 WARNING [train.py:1204] (2/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_244-129917-0_sp1.1 from training. Duration: 0.97275 2023-10-03 02:27:20,624 WARNING [train.py:1204] (2/4) Exclude cut with ID _7852_210_5219_1_1528286471316_8139400_1129-170857-0_sp1.1 from training. Duration: 0.97275 2023-10-03 02:27:20,693 WARNING [train.py:1204] (2/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_443-30034-0_sp0.9 from training. Duration: 0.711125 2023-10-03 02:27:22,015 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_31583_210_3852_1_1532595851_4024413_250-332999-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 02:27:23,374 WARNING [train.py:1204] (2/4) Exclude cut with ID _8772_210_1850_1_1528635595972_3664011_247-336448-0 from training. Duration: 0.74 2023-10-03 02:27:23,431 WARNING [train.py:1204] (2/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_567-135114-0_sp0.9 from training. Duration: 0.9666875 2023-10-03 02:27:23,436 WARNING [train.py:1204] (2/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_557-349534-0_sp1.1 from training. Duration: 0.82725 2023-10-03 02:27:25,293 WARNING [train.py:1204] (2/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_1341-26728-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 02:27:25,345 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_40222_210_6228_1_1533385298_4481939_375-328051-0_sp1.1 from training. Duration: 0.97275 2023-10-03 02:27:25,365 WARNING [train.py:1204] (2/4) Exclude cut with ID _4697_210_2069_1_1526815801953_3823098_276-96593-0 from training. Duration: 0.77 2023-10-03 02:27:29,732 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_53071_210_16554_1_1534490405_2954066_265-170472-0_sp0.9 from training. Duration: 0.92225 2023-10-03 02:27:29,762 WARNING [train.py:1204] (2/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_365-1492-0 from training. Duration: 0.91 2023-10-03 02:27:32,915 WARNING [train.py:1204] (2/4) Exclude cut with ID _66873_210_5517_1_1535454136506_4293370_654-52713-0_sp0.9 from training. Duration: 0.8555625 2023-10-03 02:27:36,198 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.ff3_skip_rate, batch_count=1105440.0, ans=0.0 2023-10-03 02:27:38,626 WARNING [train.py:1204] (2/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_113-92608-0_sp1.1 from training. Duration: 0.4909375 2023-10-03 02:27:41,292 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.521e+02 1.860e+02 2.079e+02 2.474e+02 4.878e+02, threshold=4.158e+02, percent-clipped=1.0 2023-10-03 02:27:41,373 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_46013_210_7234_1_1533776362_3744032_50-141535-0 from training. Duration: 0.99 2023-10-03 02:27:41,384 WARNING [train.py:1204] (2/4) Exclude cut with ID _13942_210_9988_1_1530604906036_3733360_499-55676-0_sp1.1 from training. Duration: 0.563625 2023-10-03 02:27:42,792 WARNING [train.py:1204] (2/4) Exclude cut with ID _30800_210_22382_1_1532499347904_4515959_375-65143-0_sp1.1 from training. Duration: 0.97275 2023-10-03 02:27:42,950 WARNING [train.py:1204] (2/4) Exclude cut with ID _16756_210_4915_1_1531047803763_7104259_199-325608-0_sp1.1 from training. Duration: 0.92725 2023-10-03 02:27:44,289 WARNING [train.py:1204] (2/4) Exclude cut with ID _31649_210_6228_1_1532584608063_4091078_443-124391-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 02:27:45,757 WARNING [train.py:1204] (2/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_146-218763-0 from training. Duration: 0.86 2023-10-03 02:27:47,029 WARNING [train.py:1204] (2/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_567-135114-0_sp1.1 from training. Duration: 0.7909375 2023-10-03 02:27:47,079 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_26326_210_7925_1_1533002493_3592406_462-288299-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 02:27:48,996 WARNING [train.py:1204] (2/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_322-217220-0 from training. Duration: 0.86 2023-10-03 02:27:50,267 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_238-256668-0_sp1.1 from training. Duration: 0.62725 2023-10-03 02:27:50,311 WARNING [train.py:1204] (2/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_371-18169-0 from training. Duration: 0.86 2023-10-03 02:27:51,626 INFO [train.py:1046] (2/4) Epoch 32, batch 1150, loss[loss=0.1668, simple_loss=0.252, pruned_loss=0.0408, over 24524.00 frames. ], tot_loss[loss=0.1626, simple_loss=0.2407, pruned_loss=0.04222, over 4703291.59 frames. ], batch size: 66, lr: 3.20e-03, grad_scale: 8.0 2023-10-03 02:27:51,703 WARNING [train.py:1204] (2/4) Exclude cut with ID _58480_210_12392_1_1534811121111_3646160_23-74512-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 02:27:51,708 WARNING [train.py:1204] (2/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_784-213099-0_sp1.1 from training. Duration: 0.8454375 2023-10-03 02:27:53,035 WARNING [train.py:1204] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_625-102955-0_sp1.1 from training. Duration: 0.7 2023-10-03 02:27:57,720 WARNING [train.py:1204] (2/4) Exclude cut with ID _49748_210_24116_1_1533986971737_3158910_176-174327-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 02:27:59,150 WARNING [train.py:1204] (2/4) Exclude cut with ID _50310_210_15488_1_1534068043345_3641080_432-96140-0_sp1.1 from training. Duration: 0.936375 2023-10-03 02:28:01,937 WARNING [train.py:1204] (2/4) Exclude cut with ID _40402_210_18219_1_1533522324115_2953150_76-14910-0_sp1.1 from training. Duration: 0.92725 2023-10-03 02:28:01,960 WARNING [train.py:1204] (2/4) Exclude cut with ID _21783_210_11357_1_1531742458730_3762679_40-19458-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 02:28:02,005 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_14056_210_4163_1_1530604483_7572827_46-93547-0 from training. Duration: 0.95 2023-10-03 02:28:03,779 WARNING [train.py:1204] (2/4) Exclude cut with ID _21631_210_2253_1_1531907880778_3160339_321-35325-0_sp1.1 from training. Duration: 0.963625 2023-10-03 02:28:05,262 WARNING [train.py:1204] (2/4) Exclude cut with ID _4779_210_2084_1_1527318022944_3914893_500-285013-0 from training. Duration: 0.69 2023-10-03 02:28:06,691 WARNING [train.py:1204] (2/4) Exclude cut with ID _43552_210_8913_1_1533726031059_3523429_301-93324-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 02:28:06,697 WARNING [train.py:1204] (2/4) Exclude cut with ID _6473_210_1741_1_1527901307396_3815380_108-17335-0_sp1.1 from training. Duration: 0.5909375 2023-10-03 02:28:10,117 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.0.ff3_skip_rate, batch_count=1105573.3333333333, ans=0.0 2023-10-03 02:28:11,403 WARNING [train.py:1204] (2/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_206-10097-0 from training. Duration: 0.49 2023-10-03 02:28:14,181 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_36756_210_7234_1_1533026991_966276_103-77513-0_sp1.1 from training. Duration: 0.97275 2023-10-03 02:28:17,470 WARNING [train.py:1204] (2/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_662-18193-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 02:28:17,542 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_26433_210_12108_1_1532157158_4056480_439-52438-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 02:28:18,735 WARNING [train.py:1204] (2/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_464-33931-0 from training. Duration: 0.54 2023-10-03 02:28:18,741 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_3474_210_2028_1_1526188513_4825246_145-309131-0_sp0.9 from training. Duration: 0.82225 2023-10-03 02:28:18,752 WARNING [train.py:1204] (2/4) Exclude cut with ID _9679_210_7497_1_1529129212906_4363929_446-308321-0_sp1.1 from training. Duration: 0.963625 2023-10-03 02:28:21,607 WARNING [train.py:1204] (2/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_662-275621-0 from training. Duration: 0.42 2023-10-03 02:28:23,092 WARNING [train.py:1204] (2/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_344-337148-0_sp1.1 from training. Duration: 0.97275 2023-10-03 02:28:24,480 WARNING [train.py:1204] (2/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_759-179071-0_sp1.1 from training. Duration: 0.92725 2023-10-03 02:28:29,501 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=1105640.0, ans=0.1 2023-10-03 02:28:35,788 WARNING [train.py:1204] (2/4) Exclude cut with ID _28783_210_7943_1_1532671164808_7155479_293-12188-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 02:28:36,103 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.conv_module2.balancer2.prob, batch_count=1105706.6666666667, ans=0.125 2023-10-03 02:28:36,133 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.feed_forward2.hidden_balancer.prob, batch_count=1105706.6666666667, ans=0.125 2023-10-03 02:28:38,788 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.conv_skip_rate, batch_count=1105706.6666666667, ans=0.0 2023-10-03 02:28:41,226 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_17613_210_2151_1_1531137569_2366160_235-320304-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 02:28:41,241 WARNING [train.py:1204] (2/4) Exclude cut with ID _14349_210_8341_1_1530615710818_3442470_170-336541-0 from training. Duration: 0.86 2023-10-03 02:28:42,514 WARNING [train.py:1204] (2/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_375-218677-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 02:28:42,558 WARNING [train.py:1204] (2/4) Exclude cut with ID _28317_210_12742_1_1532480302381_8510325_829-202956-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 02:28:49,783 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9029W0436-509823 from training. Duration: 0.768 2023-10-03 02:28:49,904 WARNING [train.py:1204] (2/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_231-112214-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 02:28:55,496 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9059W0470-524883 from training. Duration: 0.3415625 2023-10-03 02:28:58,803 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_43266_210_1794_1_1533521789_4497218_206-79512-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 02:28:58,889 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_294-163793-0_sp0.9 from training. Duration: 0.911125 2023-10-03 02:29:00,121 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6290_210_3273_1_1527595224_1282787_115-87095-0_sp1.1 from training. Duration: 0.5 2023-10-03 02:29:00,163 WARNING [train.py:1204] (2/4) Exclude cut with ID _14100_210_8341_1_1530529320613_3678069_279-55760-0_sp1.1 from training. Duration: 0.8454375 2023-10-03 02:29:04,067 WARNING [train.py:1204] (2/4) Exclude cut with ID _14621_210_1925_1_1530686901185_3675420_419-234200-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 02:29:06,560 INFO [train.py:1046] (2/4) Epoch 32, batch 1200, loss[loss=0.1601, simple_loss=0.2409, pruned_loss=0.03971, over 24577.00 frames. ], tot_loss[loss=0.163, simple_loss=0.2409, pruned_loss=0.04249, over 4708196.77 frames. ], batch size: 60, lr: 3.20e-03, grad_scale: 16.0 2023-10-03 02:29:08,040 WARNING [train.py:1204] (2/4) Exclude cut with ID _60229_210_27767_1_1534838406158_3655350_353-213656-0_sp1.1 from training. Duration: 0.863625 2023-10-03 02:29:08,048 WARNING [train.py:1204] (2/4) Exclude cut with ID _48678_210_5663_1_1533988830947_3769199_306-255886-0_sp1.1 from training. Duration: 0.7 2023-10-03 02:29:08,231 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.ff2_skip_rate, batch_count=1105840.0, ans=0.0 2023-10-03 02:29:10,876 WARNING [train.py:1204] (2/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_466-3492-0_sp1.1 from training. Duration: 0.97275 2023-10-03 02:29:10,877 WARNING [train.py:1204] (2/4) Exclude cut with ID _8755_210_1876_1_1528628559357_3258209_199-242167-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 02:29:10,940 WARNING [train.py:1204] (2/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_237-89684-0_sp1.1 from training. Duration: 0.87275 2023-10-03 02:29:13,625 WARNING [train.py:1204] (2/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_70-84121-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 02:29:15,062 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7544_210_6753_1_1528587722_3576686_172-171750-0_sp1.1 from training. Duration: 0.5909375 2023-10-03 02:29:16,365 WARNING [train.py:1204] (2/4) Exclude cut with ID _24271_210_12658_1_1531987270022_7190519_40-321822-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 02:29:16,383 WARNING [train.py:1204] (2/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_227-83612-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 02:29:18,448 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=1105840.0, ans=0.1 2023-10-03 02:29:19,602 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9156W0015-572508 from training. Duration: 0.896 2023-10-03 02:29:22,323 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_292-265928-0 from training. Duration: 0.51 2023-10-03 02:29:27,628 WARNING [train.py:1204] (2/4) Exclude cut with ID _14747_210_8341_1_1530685797463_3654089_147-131229-0_sp1.1 from training. Duration: 0.6909375 2023-10-03 02:29:30,801 WARNING [train.py:1204] (2/4) Exclude cut with ID _45297_210_2721_1_1533715351381_3264949_351-147136-0_sp0.9 from training. Duration: 0.9666875 2023-10-03 02:29:32,229 WARNING [train.py:1204] (2/4) Exclude cut with ID _5250_210_5219_1_1527249561806_8692110_1102-324568-0_sp1.1 from training. Duration: 0.97275 2023-10-03 02:29:33,590 WARNING [train.py:1204] (2/4) Exclude cut with ID _18516_210_9206_1_1531704653206_3692406_465-75446-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 02:29:33,591 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9203W0126-595623 from training. Duration: 0.896 2023-10-03 02:29:35,019 WARNING [train.py:1204] (2/4) Exclude cut with ID _55383_210_10635_1_1534468426172_2231669_139-328410-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 02:29:36,670 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.bypass_mid.scale_min, batch_count=1105973.3333333333, ans=0.2 2023-10-03 02:29:42,849 WARNING [train.py:1204] (2/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_166-246557-0_sp0.9 from training. Duration: 0.711125 2023-10-03 02:29:42,853 WARNING [train.py:1204] (2/4) Exclude cut with ID _30039_210_13270_1_1532484612900_2725150_239-304872-0_sp1.1 from training. Duration: 0.963625 2023-10-03 02:29:42,876 WARNING [train.py:1204] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_6-226750-0 from training. Duration: 0.94 2023-10-03 02:29:42,962 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_135-5501-0_sp1.1 from training. Duration: 0.8909375 2023-10-03 02:29:45,744 WARNING [train.py:1204] (2/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_359-118926-0 from training. Duration: 0.72 2023-10-03 02:29:50,313 WARNING [train.py:1204] (2/4) Exclude cut with ID _8064_210_5399_1_1528372299291_4282973_4-91037-0 from training. Duration: 0.88 2023-10-03 02:29:50,335 WARNING [train.py:1204] (2/4) Exclude cut with ID _6653_210_2629_1_1527919172441_6732679_415-288492-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 02:29:52,972 WARNING [train.py:1204] (2/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_544-287179-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 02:29:53,083 WARNING [train.py:1204] (2/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_122-32606-0_sp1.1 from training. Duration: 0.92725 2023-10-03 02:29:54,299 WARNING [train.py:1204] (2/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_121-284152-0_sp1.1 from training. Duration: 0.536375 2023-10-03 02:29:55,673 WARNING [train.py:1204] (2/4) Exclude cut with ID _44718_210_5816_1_1534050051869_6469030_350-299477-0_sp1.1 from training. Duration: 0.97275 2023-10-03 02:29:55,684 WARNING [train.py:1204] (2/4) Exclude cut with ID _6653_210_2629_1_1527919172441_6732679_1103-56445-0_sp0.9 from training. Duration: 0.811125 2023-10-03 02:29:58,307 WARNING [train.py:1204] (2/4) Exclude cut with ID _12369_210_4381_1_1530356078581_4388520_64-298917-0_sp0.9 from training. Duration: 0.97775 2023-10-03 02:29:58,356 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_2245_210_2715_1_1525511548_3540254_310-52483-0 from training. Duration: 0.88 2023-10-03 02:29:58,404 WARNING [train.py:1204] (2/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_401-158239-0_sp1.1 from training. Duration: 0.6454375 2023-10-03 02:29:59,754 WARNING [train.py:1204] (2/4) Exclude cut with ID _59502_210_12210_1_1535107024698_1835139_146-162495-0_sp1.1 from training. Duration: 0.836375 2023-10-03 02:29:59,772 WARNING [train.py:1204] (2/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_41-31157-0_sp0.9 from training. Duration: 0.6333125 2023-10-03 02:30:02,526 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_68717_210_3528_1_1535628683_1556759_95-36722-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 02:30:02,533 WARNING [train.py:1204] (2/4) Exclude cut with ID _30196_210_1664_1_1533708496522_7074140_654-162611-0_sp1.1 from training. Duration: 0.92725 2023-10-03 02:30:07,302 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_9302_210_2550_1_1528889962_4655416_638-301620-0_sp0.9 from training. Duration: 0.52225 2023-10-03 02:30:09,989 WARNING [train.py:1204] (2/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_478-162247-0_sp1.1 from training. Duration: 0.7454375 2023-10-03 02:30:11,106 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.554e+02 1.987e+02 2.198e+02 2.518e+02 3.756e+02, threshold=4.395e+02, percent-clipped=0.0 2023-10-03 02:30:12,496 WARNING [train.py:1204] (2/4) Exclude cut with ID _45297_210_2721_1_1533715351381_3264949_353-9541-0 from training. Duration: 0.97 2023-10-03 02:30:12,994 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.5.encoder.layers.0.feed_forward1.out_whiten, num_groups=1, num_channels=256, metric=9.65 vs. limit=15.0 2023-10-03 02:30:15,359 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9653W0190-675468 from training. Duration: 0.9935 2023-10-03 02:30:16,752 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_49730_210_13049_1_1534075294_1830676_214-84436-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 02:30:19,256 INFO [train.py:1046] (2/4) Epoch 32, batch 1250, loss[loss=0.2384, simple_loss=0.2996, pruned_loss=0.08861, over 19562.00 frames. ], tot_loss[loss=0.1641, simple_loss=0.242, pruned_loss=0.04306, over 4706014.20 frames. ], batch size: 388, lr: 3.20e-03, grad_scale: 8.0 2023-10-03 02:30:19,358 WARNING [train.py:1204] (2/4) Exclude cut with ID _50093_210_4220_1_1534417141207_3432299_259-322252-0_sp1.1 from training. Duration: 0.836375 2023-10-03 02:30:20,755 WARNING [train.py:1204] (2/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_599-50220-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 02:30:22,118 WARNING [train.py:1204] (2/4) Exclude cut with ID _21878_210_3525_1_1531738772002_7264330_174-109016-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 02:30:24,101 WARNING [train.py:1204] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_665-196155-0 from training. Duration: 0.67 2023-10-03 02:30:27,030 WARNING [train.py:1204] (2/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_656-188361-0_sp1.1 from training. Duration: 0.7909375 2023-10-03 02:30:29,653 WARNING [train.py:1204] (2/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_60-18123-0_sp1.1 from training. Duration: 0.8 2023-10-03 02:30:29,720 WARNING [train.py:1204] (2/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_535-14410-0 from training. Duration: 0.47 2023-10-03 02:30:30,989 WARNING [train.py:1204] (2/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_94-126128-0_sp1.1 from training. Duration: 0.7818125 2023-10-03 02:30:32,330 WARNING [train.py:1204] (2/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_356-31216-0_sp1.1 from training. Duration: 0.6818125 2023-10-03 02:30:37,577 WARNING [train.py:1204] (2/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_443-30034-0_sp1.1 from training. Duration: 0.5818125 2023-10-03 02:30:37,649 WARNING [train.py:1204] (2/4) Exclude cut with ID _26907_210_7234_1_1532221740694_3205263_337-2494-0_sp1.1 from training. Duration: 0.8 2023-10-03 02:30:38,895 WARNING [train.py:1204] (2/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_49-97158-0_sp0.9 from training. Duration: 0.8444375 2023-10-03 02:30:38,911 WARNING [train.py:1204] (2/4) Exclude cut with ID _7877_210_6014_1_1528286358099_3750136_553-170128-0_sp1.1 from training. Duration: 0.963625 2023-10-03 02:30:42,276 WARNING [train.py:1204] (2/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_202-293523-0_sp0.9 from training. Duration: 0.8 2023-10-03 02:30:45,080 WARNING [train.py:1204] (2/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_188-171673-0_sp0.9 from training. Duration: 0.6444375 2023-10-03 02:30:45,094 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_120-41668-0_sp1.1 from training. Duration: 0.636375 2023-10-03 02:30:45,099 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_40223_210_6228_1_1533298404_4812267_613-78769-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 02:30:47,807 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_53861_210_5399_1_1534422156_4272077_150-336899-0_sp1.1 from training. Duration: 0.963625 2023-10-03 02:30:47,868 WARNING [train.py:1204] (2/4) Exclude cut with ID _14459_210_2228_1_1531443641946_7516700_1146-56843-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 02:30:49,430 WARNING [train.py:1204] (2/4) Exclude cut with ID _39867_210_9826_1_1533553222030_9344259_1194-299928-0_sp1.1 from training. Duration: 0.92725 2023-10-03 02:30:50,937 WARNING [train.py:1204] (2/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_214-291369-0_sp0.9 from training. Duration: 0.7 2023-10-03 02:30:57,088 WARNING [train.py:1204] (2/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_358-215558-0 from training. Duration: 0.93 2023-10-03 02:30:57,145 WARNING [train.py:1204] (2/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_577-287150-0_sp0.9 from training. Duration: 0.911125 2023-10-03 02:30:59,986 WARNING [train.py:1204] (2/4) Exclude cut with ID _60860_210_8254_1_1534896015875_3557169_229-187367-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 02:31:01,429 WARNING [train.py:1204] (2/4) Exclude cut with ID _6275_210_2084_1_1528110062213_7669049_857-298942-0 from training. Duration: 0.85 2023-10-03 02:31:01,476 WARNING [train.py:1204] (2/4) Exclude cut with ID _40137_210_12210_1_1533290484046_3795180_249-187059-0_sp1.1 from training. Duration: 0.8 2023-10-03 02:31:01,483 WARNING [train.py:1204] (2/4) Exclude cut with ID ID0171W0508-765603 from training. Duration: 0.541 2023-10-03 02:31:01,506 WARNING [train.py:1204] (2/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_521-219439-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 02:31:01,511 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_40223_210_6228_1_1533298404_4812267_741-302616-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 02:31:07,753 WARNING [train.py:1204] (2/4) Exclude cut with ID _65293_210_9070_1_1535545549765_4004210_438-154735-0_sp1.1 from training. Duration: 0.92725 2023-10-03 02:31:10,557 WARNING [train.py:1204] (2/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_564-132950-0_sp1.1 from training. Duration: 0.92725 2023-10-03 02:31:10,794 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.conv_module1.balancer1.max_abs, batch_count=1106373.3333333333, ans=10.0 2023-10-03 02:31:11,808 WARNING [train.py:1204] (2/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_291-270881-0_sp1.1 from training. Duration: 0.8818125 2023-10-03 02:31:11,922 WARNING [train.py:1204] (2/4) Exclude cut with ID _60229_210_27767_1_1534838406158_3655350_353-213656-0 from training. Duration: 0.95 2023-10-03 02:31:11,933 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_3133_210_2087_1_1526106446_2324690_243-143119-0 from training. Duration: 0.77 2023-10-03 02:31:11,953 WARNING [train.py:1204] (2/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_690-126306-0 from training. Duration: 0.78 2023-10-03 02:31:15,248 WARNING [train.py:1204] (2/4) Exclude cut with ID _30304_210_6319_1_1532476827261_7582060_347-95003-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 02:31:16,730 WARNING [train.py:1204] (2/4) Exclude cut with ID _2322_210_2667_1_1525604548971_3656299_440-80494-0 from training. Duration: 0.88 2023-10-03 02:31:17,967 WARNING [train.py:1204] (2/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_370-229434-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 02:31:20,621 WARNING [train.py:1204] (2/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_124-345840-0_sp1.1 from training. Duration: 0.47275 2023-10-03 02:31:21,911 WARNING [train.py:1204] (2/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_165-123581-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 02:31:23,388 WARNING [train.py:1204] (2/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_453-282588-0 from training. Duration: 0.98 2023-10-03 02:31:24,727 WARNING [train.py:1204] (2/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_299-34413-0_sp0.9 from training. Duration: 0.72225 2023-10-03 02:31:24,777 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_40036_210_13405_1_1533648647_1315388_48-146016-0_sp1.1 from training. Duration: 0.8454375 2023-10-03 02:31:26,157 WARNING [train.py:1204] (2/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_99-167392-0_sp0.9 from training. Duration: 0.67775 2023-10-03 02:31:27,580 WARNING [train.py:1204] (2/4) Exclude cut with ID _48677_210_5663_1_1533902405919_3892989_37-207114-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 02:31:29,524 WARNING [train.py:1204] (2/4) Exclude cut with ID _29857_210_20034_1_1532395886753_3798160_335-163562-0 from training. Duration: 0.85 2023-10-03 02:31:31,052 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.0.ff3_skip_rate, batch_count=1106440.0, ans=0.0 2023-10-03 02:31:32,313 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_36492_210_7927_1_1533261987_4790834_703-343351-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 02:31:32,391 WARNING [train.py:1204] (2/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_80-147301-0_sp0.9 from training. Duration: 0.9333125 2023-10-03 02:31:33,838 INFO [train.py:1046] (2/4) Epoch 32, batch 1300, loss[loss=0.1685, simple_loss=0.2561, pruned_loss=0.04048, over 24574.00 frames. ], tot_loss[loss=0.1652, simple_loss=0.2432, pruned_loss=0.04361, over 4699734.32 frames. ], batch size: 71, lr: 3.20e-03, grad_scale: 8.0 2023-10-03 02:31:33,890 WARNING [train.py:1204] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_218-295097-0_sp0.9 from training. Duration: 0.8555625 2023-10-03 02:31:35,452 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_2295_210_2087_1_1525500993_2407863_116-238894-0_sp1.1 from training. Duration: 0.636375 2023-10-03 02:31:38,199 WARNING [train.py:1204] (2/4) Exclude cut with ID _3231_210_2106_1_1526205864934_6784220_697-171068-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 02:31:39,993 WARNING [train.py:1204] (2/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_102-77064-0 from training. Duration: 0.93 2023-10-03 02:31:44,616 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_13091_210_7925_1_1530609447_8987354_1186-261957-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 02:31:44,751 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.feed_forward3.hidden_balancer.prob, batch_count=1106506.6666666667, ans=0.125 2023-10-03 02:31:46,514 WARNING [train.py:1204] (2/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_91-277684-0_sp1.1 from training. Duration: 0.67275 2023-10-03 02:31:47,822 WARNING [train.py:1204] (2/4) Exclude cut with ID _31088_210_16446_1_1532520012284_3440919_159-313751-0_sp1.1 from training. Duration: 0.936375 2023-10-03 02:31:49,128 WARNING [train.py:1204] (2/4) Exclude cut with ID _42326_210_7770_1_1533814282531_3707269_227-203721-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 02:31:49,202 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_3531_210_3528_1_1526294203_5216456_309-227302-0_sp1.1 from training. Duration: 0.62725 2023-10-03 02:31:50,601 WARNING [train.py:1204] (2/4) Exclude cut with ID _14087_210_2253_1_1530529224512_3014029_226-242140-0 from training. Duration: 0.81 2023-10-03 02:31:51,287 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.4.encoder.layers.2.feed_forward3.out_whiten, num_groups=1, num_channels=384, metric=14.61 vs. limit=15.0 2023-10-03 02:31:54,870 WARNING [train.py:1204] (2/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_122-109023-0_sp1.1 from training. Duration: 0.6909375 2023-10-03 02:31:55,108 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.balancer1.prob, batch_count=1106573.3333333333, ans=0.125 2023-10-03 02:31:56,673 WARNING [train.py:1204] (2/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_660-211088-0_sp0.9 from training. Duration: 0.688875 2023-10-03 02:31:56,764 WARNING [train.py:1204] (2/4) Exclude cut with ID _39290_210_4220_1_1533862778760_3075118_92-311600-0 from training. Duration: 0.88 2023-10-03 02:31:58,311 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.conv_module2.balancer2.prob, batch_count=1106573.3333333333, ans=0.125 2023-10-03 02:31:59,627 WARNING [train.py:1204] (2/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_390-339137-0_sp1.1 from training. Duration: 0.5090625 2023-10-03 02:32:01,267 INFO [scaling.py:1118] (2/4) WithLoss: name=encoder.encoders.4.encoder.layers.2.self_attn_weights, loss-sum=0.000e+00 2023-10-03 02:32:03,728 WARNING [train.py:1204] (2/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_449-341946-0_sp1.1 from training. Duration: 0.963625 2023-10-03 02:32:05,061 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_68095_210_20185_1_1535718226_8888855_884-327007-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 02:32:05,152 WARNING [train.py:1204] (2/4) Exclude cut with ID _42326_210_7770_1_1533814282531_3707269_308-222998-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 02:32:05,329 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=1106640.0, ans=0.1 2023-10-03 02:32:07,822 WARNING [train.py:1204] (2/4) Exclude cut with ID _40399_210_4353_1_1533305658194_3279406_344-238708-0_sp1.1 from training. Duration: 0.963625 2023-10-03 02:32:07,876 WARNING [train.py:1204] (2/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_87-131427-0_sp1.1 from training. Duration: 0.7090625 2023-10-03 02:32:08,560 WARNING [train.py:1204] (2/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_110-130961-0_sp0.9 from training. Duration: 0.52225 2023-10-03 02:32:09,219 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.3.encoder.layers.1.conv_module2.whiten, num_groups=1, num_channels=512, metric=2.81 vs. limit=15.0 2023-10-03 02:32:09,817 WARNING [train.py:1204] (2/4) Exclude cut with ID _55153_210_2253_1_1534384786677_3162096_148-79022-0 from training. Duration: 0.96 2023-10-03 02:32:15,032 WARNING [train.py:1204] (2/4) Exclude cut with ID _9679_210_7497_1_1529129212906_4363929_512-313889-0_sp1.1 from training. Duration: 0.736375 2023-10-03 02:32:15,178 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.conv_module1.balancer2.min_abs, batch_count=1106640.0, ans=0.5 2023-10-03 02:32:16,331 WARNING [train.py:1204] (2/4) Exclude cut with ID _7970_210_1883_1_1528455616296_3472230_25-178407-0_sp1.1 from training. Duration: 0.6454375 2023-10-03 02:32:17,695 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_3133_210_2087_1_1526106446_2324690_246-125000-0 from training. Duration: 0.62 2023-10-03 02:32:17,939 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.balancer2.prob, batch_count=1106706.6666666667, ans=0.125 2023-10-03 02:32:18,936 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_15126_210_6753_1_1530778677_2591185_113-157651-0_sp0.9 from training. Duration: 0.7555625 2023-10-03 02:32:19,150 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.feed_forward3.hidden_balancer.prob, batch_count=1106706.6666666667, ans=0.125 2023-10-03 02:32:20,384 WARNING [train.py:1204] (2/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_105-63309-0_sp1.1 from training. Duration: 0.8909375 2023-10-03 02:32:21,913 WARNING [train.py:1204] (2/4) Exclude cut with ID _29877_210_6228_1_1532397791106_5276149_578-177271-0_sp1.1 from training. Duration: 0.936375 2023-10-03 02:32:23,316 WARNING [train.py:1204] (2/4) Exclude cut with ID _44831_210_13427_1_1533816011549_3689035_32-188147-0 from training. Duration: 0.96 2023-10-03 02:32:23,358 WARNING [train.py:1204] (2/4) Exclude cut with ID _12369_210_4381_1_1530356078581_4388520_300-254337-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 02:32:23,378 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_128-241008-0 from training. Duration: 0.94 2023-10-03 02:32:24,829 WARNING [train.py:1204] (2/4) Exclude cut with ID _21878_210_3525_1_1531738772002_7264330_53-279178-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 02:32:28,220 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_41452_210_7291_1_1533437687_3941281_273-334088-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 02:32:28,223 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_128-241008-0_sp1.1 from training. Duration: 0.8545625 2023-10-03 02:32:28,453 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.nonlin_attention.balancer.prob, batch_count=1106706.6666666667, ans=0.125 2023-10-03 02:32:29,990 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.5.encoder.layers.1.self_attn2.whiten, num_groups=1, num_channels=256, metric=12.48 vs. limit=22.5 2023-10-03 02:32:31,255 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.5.encoder.layers.1.nonlin_attention.whiten1, num_groups=1, num_channels=192, metric=2.79 vs. limit=10.0 2023-10-03 02:32:33,464 WARNING [train.py:1204] (2/4) Exclude cut with ID _8394_210_6702_1_1528590504316_3540189_379-49457-0 from training. Duration: 0.87 2023-10-03 02:32:34,701 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_105-114797-0 from training. Duration: 0.84 2023-10-03 02:32:34,794 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_14928_210_5632_1_1530773461_4714000_454-112198-0 from training. Duration: 0.66 2023-10-03 02:32:37,718 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_456-22141-0_sp1.1 from training. Duration: 0.8 2023-10-03 02:32:39,958 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.bypass.skip_rate, batch_count=1106773.3333333333, ans=0.09899494936611666 2023-10-03 02:32:41,087 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.516e+02 1.861e+02 2.113e+02 2.556e+02 3.728e+02, threshold=4.226e+02, percent-clipped=0.0 2023-10-03 02:32:41,240 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_115-233929-0 from training. Duration: 0.72 2023-10-03 02:32:42,662 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_616-16795-0_sp1.1 from training. Duration: 0.963625 2023-10-03 02:32:48,760 INFO [train.py:1046] (2/4) Epoch 32, batch 1350, loss[loss=0.1548, simple_loss=0.2313, pruned_loss=0.03917, over 21131.00 frames. ], tot_loss[loss=0.1644, simple_loss=0.2421, pruned_loss=0.04333, over 4688399.62 frames. ], batch size: 46, lr: 3.20e-03, grad_scale: 8.0 2023-10-03 02:32:50,299 WARNING [train.py:1204] (2/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_448-212135-0 from training. Duration: 0.91 2023-10-03 02:32:53,086 WARNING [train.py:1204] (2/4) Exclude cut with ID _46683_210_9642_1_1533730913492_4591116_201-205677-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 02:32:54,521 WARNING [train.py:1204] (2/4) Exclude cut with ID _64086_210_7994_1_1535270417019_7435949_43-80483-0_sp1.1 from training. Duration: 0.97275 2023-10-03 02:32:56,024 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.ff2_skip_rate, batch_count=1106840.0, ans=0.0 2023-10-03 02:32:58,920 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_240-134124-0_sp1.1 from training. Duration: 0.963625 2023-10-03 02:32:58,949 WARNING [train.py:1204] (2/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_907-120940-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 02:33:00,332 WARNING [train.py:1204] (2/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_848-280957-0_sp1.1 from training. Duration: 0.7909375 2023-10-03 02:33:00,368 WARNING [train.py:1204] (2/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_243-42116-0_sp0.9 from training. Duration: 0.811125 2023-10-03 02:33:03,261 WARNING [train.py:1204] (2/4) Exclude cut with ID _6694_210_4599_1_1527852637490_3965529_116-109398-0_sp0.9 from training. Duration: 0.811125 2023-10-03 02:33:04,626 WARNING [train.py:1204] (2/4) Exclude cut with ID _24314_210_4915_1_1532135078116_7170179_521-258868-0 from training. Duration: 0.79 2023-10-03 02:33:05,991 WARNING [train.py:1204] (2/4) Exclude cut with ID _2220_210_1819_1_1525509109898_3658680_50-99837-0_sp0.9 from training. Duration: 0.6 2023-10-03 02:33:07,479 WARNING [train.py:1204] (2/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_313-134930-0_sp1.1 from training. Duration: 0.8818125 2023-10-03 02:33:12,129 WARNING [train.py:1204] (2/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_125-144174-0 from training. Duration: 0.39 2023-10-03 02:33:12,188 WARNING [train.py:1204] (2/4) Exclude cut with ID _59288_210_5854_1_1535077757774_3412246_275-293897-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 02:33:14,233 WARNING [train.py:1204] (2/4) Exclude cut with ID _40519_210_13794_1_1533715228655_7337390_722-52880-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 02:33:14,253 WARNING [train.py:1204] (2/4) Exclude cut with ID _7537_210_2084_1_1528502395431_3924359_128-268029-0 from training. Duration: 0.98 2023-10-03 02:33:17,658 WARNING [train.py:1204] (2/4) Exclude cut with ID _8021_210_7994_1_1528376451358_3653069_179-45206-0 from training. Duration: 0.86 2023-10-03 02:33:19,089 WARNING [train.py:1204] (2/4) Exclude cut with ID _13252_210_1925_1_1530671372047_3336909_88-276668-0 from training. Duration: 0.99 2023-10-03 02:33:21,783 WARNING [train.py:1204] (2/4) Exclude cut with ID _22613_210_5219_1_1532253599686_4090170_290-212862-0_sp1.1 from training. Duration: 0.97275 2023-10-03 02:33:21,788 WARNING [train.py:1204] (2/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_465-214726-0 from training. Duration: 0.86 2023-10-03 02:33:29,344 WARNING [train.py:1204] (2/4) Exclude cut with ID _56480_210_4220_1_1534679942079_3075409_138-36031-0_sp1.1 from training. Duration: 0.97275 2023-10-03 02:33:37,488 WARNING [train.py:1204] (2/4) Exclude cut with ID _58605_210_12210_1_1534723195939_3904259_300-285878-0_sp1.1 from training. Duration: 0.97275 2023-10-03 02:33:37,516 WARNING [train.py:1204] (2/4) Exclude cut with ID _40901_210_5665_1_1533363789958_3856130_114-340229-0_sp1.1 from training. Duration: 0.936375 2023-10-03 02:33:37,545 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_489-230703-0 from training. Duration: 0.85 2023-10-03 02:33:42,086 WARNING [train.py:1204] (2/4) Exclude cut with ID _66873_210_5517_1_1535454136506_4293370_322-81243-0_sp1.1 from training. Duration: 0.936375 2023-10-03 02:33:43,879 WARNING [train.py:1204] (2/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_271-269084-0 from training. Duration: 0.83 2023-10-03 02:33:43,898 WARNING [train.py:1204] (2/4) Exclude cut with ID _13252_210_1925_1_1530671372047_3336909_166-314607-0_sp0.9 from training. Duration: 0.6 2023-10-03 02:33:45,286 WARNING [train.py:1204] (2/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_340-343302-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 02:33:45,562 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.conv_module2.balancer1.max_abs, batch_count=1107040.0, ans=10.0 2023-10-03 02:33:48,702 WARNING [train.py:1204] (2/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_286-112015-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 02:33:50,058 WARNING [train.py:1204] (2/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_328-264776-0 from training. Duration: 0.67 2023-10-03 02:33:51,404 WARNING [train.py:1204] (2/4) Exclude cut with ID _4866_210_2577_1_1527404412284_4364659_256-306830-0_sp0.9 from training. Duration: 0.9666875 2023-10-03 02:33:54,358 WARNING [train.py:1204] (2/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_239-316802-0 from training. Duration: 0.69 2023-10-03 02:33:54,569 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.nonlin_attention.balancer.prob, batch_count=1107106.6666666667, ans=0.125 2023-10-03 02:33:57,521 WARNING [train.py:1204] (2/4) Exclude cut with ID _29857_210_20034_1_1532395886753_3798160_279-185801-0 from training. Duration: 0.91 2023-10-03 02:33:57,818 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.conv_module1.balancer1.prob, batch_count=1107106.6666666667, ans=0.125 2023-10-03 02:33:58,346 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.2.encoder.layers.1.whiten, num_groups=1, num_channels=384, metric=4.72 vs. limit=12.0 2023-10-03 02:34:00,541 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.conv_module1.balancer2.prob, batch_count=1107106.6666666667, ans=0.125 2023-10-03 02:34:01,719 WARNING [train.py:1204] (2/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_656-188361-0 from training. Duration: 0.87 2023-10-03 02:34:01,992 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.balancer2.prob, batch_count=1107173.3333333333, ans=0.125 2023-10-03 02:34:03,004 INFO [train.py:1046] (2/4) Epoch 32, batch 1400, loss[loss=0.1743, simple_loss=0.2549, pruned_loss=0.04685, over 23271.00 frames. ], tot_loss[loss=0.1637, simple_loss=0.2417, pruned_loss=0.04281, over 4695127.07 frames. ], batch size: 105, lr: 3.20e-03, grad_scale: 8.0 2023-10-03 02:34:03,134 WARNING [train.py:1204] (2/4) Exclude cut with ID _44718_210_5816_1_1534050051869_6469030_100-230287-0_sp1.1 from training. Duration: 0.936375 2023-10-03 02:34:04,778 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.attention_skip_rate, batch_count=1107173.3333333333, ans=0.0 2023-10-03 02:34:07,265 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8275_210_4198_1_1528455320_3892571_276-215622-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 02:34:07,315 WARNING [train.py:1204] (2/4) Exclude cut with ID _30304_210_6319_1_1532476827261_7582060_698-309430-0_sp1.1 from training. Duration: 0.963625 2023-10-03 02:34:09,468 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.bypass.scale_min, batch_count=1107173.3333333333, ans=0.2 2023-10-03 02:34:12,034 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_365-217119-0 from training. Duration: 0.63 2023-10-03 02:34:13,529 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7264_210_4938_1_1527939911_8132095_800-91995-0 from training. Duration: 0.86 2023-10-03 02:34:20,536 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.3.encoder.layers.3.self_attn_weights.whiten_keys, num_groups=8, num_channels=256, metric=2.73 vs. limit=6.0 2023-10-03 02:34:25,875 WARNING [train.py:1204] (2/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_49-97158-0_sp1.1 from training. Duration: 0.6909375 2023-10-03 02:34:26,098 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=1107240.0, ans=0.1 2023-10-03 02:34:27,325 WARNING [train.py:1204] (2/4) Exclude cut with ID _61132_210_9969_1_1535021792964_3755419_61-57720-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 02:34:28,760 WARNING [train.py:1204] (2/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_345-306832-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 02:34:28,792 WARNING [train.py:1204] (2/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_164-224052-0_sp0.9 from training. Duration: 0.77775 2023-10-03 02:34:31,843 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_34886_210_10633_1_1533203882_1755661_76-156096-0_sp1.1 from training. Duration: 0.7909375 2023-10-03 02:34:33,237 WARNING [train.py:1204] (2/4) Exclude cut with ID 4133-6541-0027-40495-0_sp1.1 from training. Duration: 0.9681875 2023-10-03 02:34:36,462 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.3.encoder.layers.2.self_attn1.whiten, num_groups=1, num_channels=512, metric=12.95 vs. limit=22.5 2023-10-03 02:34:41,865 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_514-211165-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 02:34:41,926 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_41211_210_2544_1_1533348859_3702910_97-212695-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 02:34:47,770 WARNING [train.py:1204] (2/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_406-45086-0 from training. Duration: 0.8 2023-10-03 02:34:47,832 WARNING [train.py:1204] (2/4) Exclude cut with ID _65263_210_22356_1_1535763598691_3921037_49-222917-0_sp1.1 from training. Duration: 0.836375 2023-10-03 02:34:49,634 WARNING [train.py:1204] (2/4) Exclude cut with ID _4866_210_2577_1_1527404412284_4364659_352-157930-0_sp1.1 from training. Duration: 0.736375 2023-10-03 02:34:49,703 WARNING [train.py:1204] (2/4) Exclude cut with ID _4366_210_1388_1_1526792053633_3613819_231-211402-0_sp1.1 from training. Duration: 0.8545625 2023-10-03 02:34:49,741 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_98-21576-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 02:34:49,921 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.self_attn_weights.pos_emb_skip_rate, batch_count=1107373.3333333333, ans=0.0 2023-10-03 02:34:51,099 WARNING [train.py:1204] (2/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_348-127789-0_sp0.9 from training. Duration: 0.9444375 2023-10-03 02:34:51,116 WARNING [train.py:1204] (2/4) Exclude cut with ID _45871_210_5665_1_1533864614245_3296060_98-329557-0_sp1.1 from training. Duration: 0.97275 2023-10-03 02:34:52,402 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6846_210_6753_1_1527850767_4295158_2-315073-0_sp1.1 from training. Duration: 0.8 2023-10-03 02:34:53,851 WARNING [train.py:1204] (2/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_145-137790-0 from training. Duration: 0.83 2023-10-03 02:34:53,866 WARNING [train.py:1204] (2/4) Exclude cut with ID _14603_210_4220_1_1530615688441_3097319_133-158308-0_sp0.9 from training. Duration: 0.9333125 2023-10-03 02:34:56,683 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.feed_forward1.hidden_balancer.prob, batch_count=1107373.3333333333, ans=0.125 2023-10-03 02:34:58,270 WARNING [train.py:1204] (2/4) Exclude cut with ID _14898_210_9206_1_1530766812377_4177190_320-11758-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 02:34:58,416 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.0.attention_skip_rate, batch_count=1107373.3333333333, ans=0.0 2023-10-03 02:35:02,417 WARNING [train.py:1204] (2/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_406-45086-0_sp0.9 from training. Duration: 0.888875 2023-10-03 02:35:05,407 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.1.balancer2.prob, batch_count=1107440.0, ans=0.125 2023-10-03 02:35:09,404 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_5438_210_1794_1_1527336687_2600009_104-288274-0 from training. Duration: 0.58 2023-10-03 02:35:09,681 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.ff2_skip_rate, batch_count=1107440.0, ans=0.0 2023-10-03 02:35:10,700 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.575e+02 1.787e+02 1.940e+02 2.244e+02 3.961e+02, threshold=3.881e+02, percent-clipped=0.0 2023-10-03 02:35:11,402 WARNING [train.py:1204] (2/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_108-53929-0_sp0.9 from training. Duration: 0.6555625 2023-10-03 02:35:11,465 WARNING [train.py:1204] (2/4) Exclude cut with ID _30800_210_22382_1_1532499347904_4515959_444-286390-0_sp1.1 from training. Duration: 0.963625 2023-10-03 02:35:14,037 WARNING [train.py:1204] (2/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_125-144174-0_sp0.9 from training. Duration: 0.4333125 2023-10-03 02:35:14,105 WARNING [train.py:1204] (2/4) Exclude cut with ID _55209_210_1531_1_1534675828996_4597160_118-345621-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 02:35:16,730 INFO [train.py:1046] (2/4) Epoch 32, batch 1450, loss[loss=0.1458, simple_loss=0.2254, pruned_loss=0.03304, over 23640.00 frames. ], tot_loss[loss=0.1624, simple_loss=0.2403, pruned_loss=0.04224, over 4700186.36 frames. ], batch size: 149, lr: 3.20e-03, grad_scale: 4.0 2023-10-03 02:35:16,813 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_45886_210_17024_1_1533709215_471063_7-79165-0_sp1.1 from training. Duration: 0.8909375 2023-10-03 02:35:17,414 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.feed_forward3.out_whiten.whitening_limit, batch_count=1107506.6666666667, ans=15.0 2023-10-03 02:35:20,639 WARNING [train.py:1204] (2/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_175-163894-0_sp0.9 from training. Duration: 0.82225 2023-10-03 02:35:22,056 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_50188_210_4198_1_1534503621_3646041_353-81682-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 02:35:22,065 WARNING [train.py:1204] (2/4) Exclude cut with ID _17875_210_2087_1_1531224012907_1821339_175-80565-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 02:35:22,074 WARNING [train.py:1204] (2/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_159-328157-0_sp1.1 from training. Duration: 0.47275 2023-10-03 02:35:22,573 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.5.encoder.layers.0.whiten, num_groups=1, num_channels=256, metric=2.28 vs. limit=12.0 2023-10-03 02:35:22,837 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.2.encoder.layers.0.nonlin_attention.whiten2, num_groups=1, num_channels=384, metric=7.85 vs. limit=15.0 2023-10-03 02:35:25,143 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.bypass_mid.scale_min, batch_count=1107506.6666666667, ans=0.2 2023-10-03 02:35:26,330 WARNING [train.py:1204] (2/4) Exclude cut with ID _60057_210_10640_1_1534903065793_3333150_137-267579-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 02:35:27,686 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_92-46009-0_sp1.1 from training. Duration: 0.4909375 2023-10-03 02:35:29,057 WARNING [train.py:1204] (2/4) Exclude cut with ID _53203_210_2087_1_1534246218309_475590_48-68507-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 02:35:30,838 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_28211_210_6388_1_1532861506_4523224_128-328162-0 from training. Duration: 0.9 2023-10-03 02:35:30,911 WARNING [train.py:1204] (2/4) Exclude cut with ID _4697_210_2069_1_1526815801953_3823098_51-1049-0_sp1.1 from training. Duration: 0.6818125 2023-10-03 02:35:31,064 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.0.bypass.scale_min, batch_count=1107573.3333333333, ans=0.2 2023-10-03 02:35:32,207 WARNING [train.py:1204] (2/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_383-316000-0 from training. Duration: 0.9 2023-10-03 02:35:32,261 WARNING [train.py:1204] (2/4) Exclude cut with ID _4779_210_2084_1_1527318022944_3914893_185-295153-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 02:35:32,555 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.bypass_mid.scale_min, batch_count=1107573.3333333333, ans=0.2 2023-10-03 02:35:33,582 WARNING [train.py:1204] (2/4) Exclude cut with ID _3231_210_2106_1_1526205864934_6784220_171-256004-0_sp0.9 from training. Duration: 0.9555625 2023-10-03 02:35:33,583 WARNING [train.py:1204] (2/4) Exclude cut with ID _41314_210_12651_1_1533459476543_3766460_87-272068-0 from training. Duration: 0.83 2023-10-03 02:35:33,691 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_164-152777-0_sp1.1 from training. Duration: 0.92725 2023-10-03 02:35:34,519 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.1.encoder.layers.0.conv_module2.whiten, num_groups=1, num_channels=256, metric=5.62 vs. limit=15.0 2023-10-03 02:35:34,997 WARNING [train.py:1204] (2/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_18-130336-0_sp0.9 from training. Duration: 0.87775 2023-10-03 02:35:35,047 WARNING [train.py:1204] (2/4) Exclude cut with ID _14886_210_4220_1_1530702020355_3516190_175-229073-0_sp1.1 from training. Duration: 0.4090625 2023-10-03 02:35:35,059 WARNING [train.py:1204] (2/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_791-45149-0_sp0.9 from training. Duration: 0.9555625 2023-10-03 02:35:36,458 WARNING [train.py:1204] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_468-189058-0_sp1.1 from training. Duration: 0.87275 2023-10-03 02:35:37,869 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8278_210_2550_1_1528637099_4220413_32-142780-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 02:35:41,786 WARNING [train.py:1204] (2/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_146-218763-0_sp0.9 from training. Duration: 0.9555625 2023-10-03 02:35:43,373 INFO [scaling.py:1118] (2/4) WithLoss: name=encoder.encoders.5.encoder.layers.0.self_attn_weights, loss-sum=0.000e+00 2023-10-03 02:35:46,619 WARNING [train.py:1204] (2/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_348-127789-0_sp1.1 from training. Duration: 0.77275 2023-10-03 02:35:47,963 WARNING [train.py:1204] (2/4) Exclude cut with ID _47790_210_16187_1_1533862814066_4207519_288-285445-0_sp1.1 from training. Duration: 0.936375 2023-10-03 02:35:49,827 WARNING [train.py:1204] (2/4) Exclude cut with ID _14603_210_4220_1_1530615688441_3097319_308-181732-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 02:35:49,861 WARNING [train.py:1204] (2/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_895-117428-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 02:35:51,821 WARNING [train.py:1204] (2/4) Exclude cut with ID _9235_210_5219_1_1528891154253_8046120_387-159008-0_sp0.9 from training. Duration: 0.9555625 2023-10-03 02:35:51,833 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_37351_210_7925_1_1533012931_3812113_265-85780-0_sp0.9 from training. Duration: 0.97775 2023-10-03 02:35:51,853 WARNING [train.py:1204] (2/4) Exclude cut with ID _40551_210_10635_1_1533776424632_3416179_361-3964-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 02:35:53,120 WARNING [train.py:1204] (2/4) Exclude cut with ID _25882_210_3036_1_1532775665815_6479510_485-122742-0_sp1.1 from training. Duration: 0.97275 2023-10-03 02:35:57,335 WARNING [train.py:1204] (2/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_98-51676-0 from training. Duration: 0.73 2023-10-03 02:35:58,790 WARNING [train.py:1204] (2/4) Exclude cut with ID _30548_210_16040_1_1532608097130_3146969_371-280610-0_sp1.1 from training. Duration: 0.92725 2023-10-03 02:36:01,544 WARNING [train.py:1204] (2/4) Exclude cut with ID IC0671W0081-331924 from training. Duration: 0.896 2023-10-03 02:36:04,707 WARNING [train.py:1204] (2/4) Exclude cut with ID _40284_210_2084_1_1533513679814_3704279_380-160775-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 02:36:06,121 WARNING [train.py:1204] (2/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_57-270037-0_sp1.1 from training. Duration: 0.7 2023-10-03 02:36:07,464 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_853-150237-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 02:36:08,852 WARNING [train.py:1204] (2/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_491-261923-0 from training. Duration: 0.98 2023-10-03 02:36:12,955 WARNING [train.py:1204] (2/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_836-244045-0_sp1.1 from training. Duration: 0.97275 2023-10-03 02:36:14,389 WARNING [train.py:1204] (2/4) Exclude cut with ID _14880_210_8895_1_1530680426655_3795259_289-212817-0 from training. Duration: 0.69 2023-10-03 02:36:15,803 WARNING [train.py:1204] (2/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_303-340446-0 from training. Duration: 0.9 2023-10-03 02:36:15,896 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_37152_210_9697_1_1533106573_3844188_418-41736-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 02:36:16,089 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=1107773.3333333333, ans=0.1 2023-10-03 02:36:19,334 WARNING [train.py:1204] (2/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_371-18169-0_sp1.1 from training. Duration: 0.7818125 2023-10-03 02:36:21,132 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_187-219984-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 02:36:21,248 WARNING [train.py:1204] (2/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_63-267585-0 from training. Duration: 0.9 2023-10-03 02:36:22,653 WARNING [train.py:1204] (2/4) Exclude cut with ID _27013_210_1389_1_1532437173240_3252649_222-125573-0 from training. Duration: 0.98 2023-10-03 02:36:23,964 WARNING [train.py:1204] (2/4) Exclude cut with ID _14821_210_8750_1_1530680503991_6829359_189-297451-0 from training. Duration: 0.55 2023-10-03 02:36:25,367 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_557-163391-0_sp1.1 from training. Duration: 0.97275 2023-10-03 02:36:25,458 WARNING [train.py:1204] (2/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_65-88242-0_sp1.1 from training. Duration: 0.5090625 2023-10-03 02:36:28,412 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.feed_forward1.hidden_balancer.prob, batch_count=1107773.3333333333, ans=0.125 2023-10-03 02:36:30,827 INFO [train.py:1046] (2/4) Epoch 32, batch 1500, loss[loss=0.175, simple_loss=0.2476, pruned_loss=0.05117, over 22766.00 frames. ], tot_loss[loss=0.1628, simple_loss=0.2414, pruned_loss=0.0421, over 4712931.70 frames. ], batch size: 322, lr: 3.20e-03, grad_scale: 8.0 2023-10-03 02:36:31,082 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=1107840.0, ans=0.1 2023-10-03 02:36:35,679 WARNING [train.py:1204] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_218-295097-0 from training. Duration: 0.77 2023-10-03 02:36:35,702 WARNING [train.py:1204] (2/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_686-247600-0_sp1.1 from training. Duration: 0.836375 2023-10-03 02:36:35,706 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_397-147196-0_sp0.9 from training. Duration: 0.888875 2023-10-03 02:36:37,037 WARNING [train.py:1204] (2/4) Exclude cut with ID _53950_210_5816_1_1534417264919_3862951_237-136449-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 02:36:38,507 WARNING [train.py:1204] (2/4) Exclude cut with ID _3120_210_3525_1_1526036143112_4072980_74-178400-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 02:36:39,855 WARNING [train.py:1204] (2/4) Exclude cut with ID _5166_210_2084_1_1527073197160_4087993_309-222545-0_sp0.9 from training. Duration: 0.9333125 2023-10-03 02:36:39,937 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7544_210_6753_1_1528587722_3576686_172-171750-0 from training. Duration: 0.65 2023-10-03 02:36:41,334 WARNING [train.py:1204] (2/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_3-237434-0_sp0.9 from training. Duration: 0.8444375 2023-10-03 02:36:42,637 WARNING [train.py:1204] (2/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_101-293367-0_sp0.9 from training. Duration: 0.7 2023-10-03 02:36:42,644 WARNING [train.py:1204] (2/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_818-213552-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 02:36:42,722 WARNING [train.py:1204] (2/4) Exclude cut with ID _14349_210_8341_1_1530615710818_3442470_115-285975-0_sp1.1 from training. Duration: 0.7818125 2023-10-03 02:36:45,523 WARNING [train.py:1204] (2/4) Exclude cut with ID _30800_210_22382_1_1532499347904_4515959_291-309749-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 02:36:45,615 WARNING [train.py:1204] (2/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_715-3462-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 02:36:52,705 WARNING [train.py:1204] (2/4) Exclude cut with ID _59502_210_12210_1_1535107024698_1835139_13-331827-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 02:36:52,733 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_4528_210_3273_1_1527076654_3276777_342-168068-0 from training. Duration: 0.87 2023-10-03 02:36:52,805 WARNING [train.py:1204] (2/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_215-141059-0_sp0.9 from training. Duration: 0.688875 2023-10-03 02:36:54,059 WARNING [train.py:1204] (2/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_106-286130-0_sp1.1 from training. Duration: 0.8909375 2023-10-03 02:36:54,145 WARNING [train.py:1204] (2/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_480-77655-0_sp1.1 from training. Duration: 0.97275 2023-10-03 02:36:55,836 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.balancer2.prob, batch_count=1107906.6666666667, ans=0.125 2023-10-03 02:36:58,386 WARNING [train.py:1204] (2/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_848-280957-0 from training. Duration: 0.87 2023-10-03 02:37:02,308 WARNING [train.py:1204] (2/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_122-109023-0 from training. Duration: 0.76 2023-10-03 02:37:04,439 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_11305_210_1794_1_1529750428_7178148_645-233480-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 02:37:04,513 WARNING [train.py:1204] (2/4) Exclude cut with ID _7562_210_2084_1_1528887767916_3946229_345-169156-0 from training. Duration: 0.58 2023-10-03 02:37:06,031 WARNING [train.py:1204] (2/4) Exclude cut with ID _7562_210_2084_1_1528887767916_3946229_345-169156-0_sp1.1 from training. Duration: 0.52725 2023-10-03 02:37:09,034 WARNING [train.py:1204] (2/4) Exclude cut with ID _13253_210_1925_1_1530757775819_3577873_253-246386-0_sp1.1 from training. Duration: 0.8181875 2023-10-03 02:37:10,357 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_136-264444-0_sp1.1 from training. Duration: 0.97275 2023-10-03 02:37:10,378 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_2168_210_2290_1_1525606920_1849400_136-222991-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 02:37:11,748 WARNING [train.py:1204] (2/4) Exclude cut with ID _7852_210_5219_1_1528286471316_8139400_242-105336-0 from training. Duration: 0.72 2023-10-03 02:37:11,791 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_5960_210_1794_1_1527403865_3990999_515-2613-0_sp1.1 from training. Duration: 0.8 2023-10-03 02:37:11,808 WARNING [train.py:1204] (2/4) Exclude cut with ID _39688_210_19596_1_1533371626725_4154159_551-167834-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 02:37:13,077 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_43802_210_2290_1_1533516203_8829504_85-195240-0 from training. Duration: 0.87 2023-10-03 02:37:14,418 WARNING [train.py:1204] (2/4) Exclude cut with ID _63171_210_15488_1_1535104792794_3295110_113-113534-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 02:37:17,426 WARNING [train.py:1204] (2/4) Exclude cut with ID _8833_210_5219_1_1528592665210_7553059_319-349181-0_sp1.1 from training. Duration: 0.9 2023-10-03 02:37:17,427 WARNING [train.py:1204] (2/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_225-43381-0 from training. Duration: 0.94 2023-10-03 02:37:25,603 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_5959_210_1794_1_1527400400_3445726_412-44052-0_sp0.9 from training. Duration: 0.8555625 2023-10-03 02:37:25,709 WARNING [train.py:1204] (2/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_356-167816-0_sp1.1 from training. Duration: 0.6545625 2023-10-03 02:37:27,275 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.ff2_skip_rate, batch_count=1108040.0, ans=0.0 2023-10-03 02:37:29,729 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9012W0112-501244 from training. Duration: 0.768 2023-10-03 02:37:31,126 WARNING [train.py:1204] (2/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_177-326952-0_sp1.1 from training. Duration: 0.92725 2023-10-03 02:37:31,130 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9016W0116-502789 from training. Duration: 0.896 2023-10-03 02:37:32,461 WARNING [train.py:1204] (2/4) Exclude cut with ID _56235_210_10640_1_1534557335235_4161099_557-204815-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 02:37:33,818 WARNING [train.py:1204] (2/4) Exclude cut with ID _55857_210_2346_1_1534644007831_6898209_279-229469-0_sp1.1 from training. Duration: 0.936375 2023-10-03 02:37:33,889 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9029W0375-509764 from training. Duration: 0.896 2023-10-03 02:37:35,282 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_2295_210_2087_1_1525500993_2407863_234-83104-0_sp0.9 from training. Duration: 0.688875 2023-10-03 02:37:38,778 WARNING [train.py:1204] (2/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_266-137104-0 from training. Duration: 0.65 2023-10-03 02:37:39,010 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.self_attn_weights.pos_emb_skip_rate, batch_count=1108106.6666666667, ans=0.0 2023-10-03 02:37:40,020 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.551e+02 1.890e+02 2.007e+02 2.185e+02 3.461e+02, threshold=4.014e+02, percent-clipped=0.0 2023-10-03 02:37:41,378 WARNING [train.py:1204] (2/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_289-163927-0_sp1.1 from training. Duration: 0.92725 2023-10-03 02:37:41,597 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.0.balancer2.prob, batch_count=1108106.6666666667, ans=0.125 2023-10-03 02:37:44,222 WARNING [train.py:1204] (2/4) Exclude cut with ID _12733_210_8341_1_1530167424556_3646380_86-225314-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 02:37:44,242 WARNING [train.py:1204] (2/4) Exclude cut with ID _7877_210_6014_1_1528286358099_3750136_618-284064-0_sp1.1 from training. Duration: 0.92725 2023-10-03 02:37:45,542 INFO [train.py:1046] (2/4) Epoch 32, batch 1550, loss[loss=0.1569, simple_loss=0.2397, pruned_loss=0.03704, over 24548.00 frames. ], tot_loss[loss=0.1631, simple_loss=0.2419, pruned_loss=0.04212, over 4699488.12 frames. ], batch size: 66, lr: 3.20e-03, grad_scale: 8.0 2023-10-03 02:37:45,599 WARNING [train.py:1204] (2/4) Exclude cut with ID _55844_210_2346_1_1534471212569_7625707_1017-31469-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 02:37:45,647 WARNING [train.py:1204] (2/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_617-241446-0_sp1.1 from training. Duration: 0.92725 2023-10-03 02:37:46,960 WARNING [train.py:1204] (2/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_866-173440-0_sp1.1 from training. Duration: 0.8181875 2023-10-03 02:37:48,306 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_9460_210_4211_1_1528954587_10049667_480-260269-0 from training. Duration: 0.56 2023-10-03 02:37:48,339 WARNING [train.py:1204] (2/4) Exclude cut with ID _44659_210_2721_1_1534036120061_6614860_626-298884-0 from training. Duration: 0.97 2023-10-03 02:37:48,376 WARNING [train.py:1204] (2/4) Exclude cut with ID _53565_210_10635_1_1534337218335_3820259_287-18120-0_sp1.1 from training. Duration: 0.8818125 2023-10-03 02:37:49,705 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_791-197891-0 from training. Duration: 0.84 2023-10-03 02:37:50,382 WARNING [train.py:1204] (2/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_527-238990-0 from training. Duration: 0.9 2023-10-03 02:37:50,646 INFO [scaling.py:1118] (2/4) WithLoss: name=encoder.encoders.3.encoder.layers.0.self_attn_weights, loss-sum=0.000e+00 2023-10-03 02:37:53,090 WARNING [train.py:1204] (2/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_449-65969-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 02:37:54,466 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_553-341452-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 02:37:54,511 WARNING [train.py:1204] (2/4) Exclude cut with ID _16909_210_5724_1_1531292079912_4118919_300-47574-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 02:37:54,529 WARNING [train.py:1204] (2/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_70-27732-0_sp1.1 from training. Duration: 0.8545625 2023-10-03 02:37:57,508 WARNING [train.py:1204] (2/4) Exclude cut with ID _44718_210_5816_1_1534050051869_6469030_727-28074-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 02:37:57,560 WARNING [train.py:1204] (2/4) Exclude cut with ID _55857_210_2346_1_1534644007831_6898209_357-123664-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 02:38:00,229 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9123W0004-555964 from training. Duration: 0.896 2023-10-03 02:38:00,250 WARNING [train.py:1204] (2/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_445-145208-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 02:38:00,295 WARNING [train.py:1204] (2/4) Exclude cut with ID _3675_210_4557_1_1526639934408_4182089_456-18305-0_sp1.1 from training. Duration: 0.6909375 2023-10-03 02:38:01,645 WARNING [train.py:1204] (2/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_1-99680-0_sp1.1 from training. Duration: 0.6090625 2023-10-03 02:38:01,988 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.feed_forward3.hidden_balancer.prob, batch_count=1108240.0, ans=0.125 2023-10-03 02:38:03,109 WARNING [train.py:1204] (2/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_146-282400-0_sp0.9 from training. Duration: 0.788875 2023-10-03 02:38:03,129 WARNING [train.py:1204] (2/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_844-96989-0 from training. Duration: 0.71 2023-10-03 02:38:04,582 WARNING [train.py:1204] (2/4) Exclude cut with ID _29853_210_7497_1_1532505600851_6642110_128-146364-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 02:38:04,604 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_224-322394-0 from training. Duration: 0.66 2023-10-03 02:38:06,380 WARNING [train.py:1204] (2/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_191-59784-0 from training. Duration: 0.88 2023-10-03 02:38:06,386 WARNING [train.py:1204] (2/4) Exclude cut with ID _12556_210_8341_1_1530081024100_7327690_725-193477-0 from training. Duration: 0.98 2023-10-03 02:38:06,426 WARNING [train.py:1204] (2/4) Exclude cut with ID _5166_210_2084_1_1527073197160_4087993_523-79838-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 02:38:07,908 WARNING [train.py:1204] (2/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_398-334558-0_sp1.1 from training. Duration: 0.87275 2023-10-03 02:38:10,716 WARNING [train.py:1204] (2/4) Exclude cut with ID _6456_210_1534_1_1527681634212_3181049_171-36697-0_sp1.1 from training. Duration: 0.936375 2023-10-03 02:38:13,450 WARNING [train.py:1204] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_700-183549-0 from training. Duration: 0.68 2023-10-03 02:38:13,452 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8853_210_3273_1_1528633899_818968_51-187685-0 from training. Duration: 0.69 2023-10-03 02:38:19,150 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.feed_forward1.out_proj.dropout_p, batch_count=1108306.6666666667, ans=0.1 2023-10-03 02:38:22,264 WARNING [train.py:1204] (2/4) Exclude cut with ID _7719_210_3525_1_1528456153163_4296960_333-110399-0_sp1.1 from training. Duration: 0.87275 2023-10-03 02:38:25,756 WARNING [train.py:1204] (2/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_1022-83740-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 02:38:25,787 WARNING [train.py:1204] (2/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_61-327062-0_sp0.9 from training. Duration: 0.9 2023-10-03 02:38:25,793 WARNING [train.py:1204] (2/4) Exclude cut with ID _47251_210_18628_1_1533814063128_4137250_214-88076-0_sp1.1 from training. Duration: 0.8 2023-10-03 02:38:27,123 WARNING [train.py:1204] (2/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_839-173017-0 from training. Duration: 0.41 2023-10-03 02:38:31,400 WARNING [train.py:1204] (2/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_299-34413-0_sp1.1 from training. Duration: 0.5909375 2023-10-03 02:38:32,865 WARNING [train.py:1204] (2/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_664-159457-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 02:38:34,386 WARNING [train.py:1204] (2/4) Exclude cut with ID _66873_210_5517_1_1535454136506_4293370_483-291760-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 02:38:36,330 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.3.encoder.layers.3.conv_module2.whiten, num_groups=1, num_channels=512, metric=4.36 vs. limit=15.0 2023-10-03 02:38:38,815 WARNING [train.py:1204] (2/4) Exclude cut with ID _30800_210_22382_1_1532499347904_4515959_287-255474-0_sp1.1 from training. Duration: 0.92725 2023-10-03 02:38:38,848 WARNING [train.py:1204] (2/4) Exclude cut with ID _48677_210_5663_1_1533902405919_3892989_111-11439-0_sp1.1 from training. Duration: 0.87275 2023-10-03 02:38:38,868 WARNING [train.py:1204] (2/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_399-156791-0 from training. Duration: 0.83 2023-10-03 02:38:38,910 WARNING [train.py:1204] (2/4) Exclude cut with ID _3992_210_1747_1_1526644816031_3731060_36-147855-0_sp0.9 from training. Duration: 0.8666875 2023-10-03 02:38:40,263 WARNING [train.py:1204] (2/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_442-320317-0_sp1.1 from training. Duration: 0.7454375 2023-10-03 02:38:41,561 WARNING [train.py:1204] (2/4) Exclude cut with ID _55429_210_10626_1_1534467588527_7174143_953-80868-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 02:38:42,930 WARNING [train.py:1204] (2/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_159-328157-0_sp0.9 from training. Duration: 0.57775 2023-10-03 02:38:42,943 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9309W0197-640324 from training. Duration: 0.896 2023-10-03 02:38:43,130 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=1108440.0, ans=0.1 2023-10-03 02:38:44,456 WARNING [train.py:1204] (2/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_361-30822-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 02:38:46,112 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.self_attn_weights.pos_emb_skip_rate, batch_count=1108440.0, ans=0.0 2023-10-03 02:38:48,542 WARNING [train.py:1204] (2/4) Exclude cut with ID _5250_210_5219_1_1527249561806_8692110_636-251213-0 from training. Duration: 0.94 2023-10-03 02:38:53,657 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.3.encoder.layers.0.whiten, num_groups=1, num_channels=512, metric=5.77 vs. limit=12.0 2023-10-03 02:38:54,993 WARNING [train.py:1204] (2/4) Exclude cut with ID _49728_210_12651_1_1534041034294_6788150_468-333827-0_sp1.1 from training. Duration: 0.963625 2023-10-03 02:38:56,997 WARNING [train.py:1204] (2/4) Exclude cut with ID _25882_210_3036_1_1532775665815_6479510_623-191759-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 02:38:58,330 WARNING [train.py:1204] (2/4) Exclude cut with ID _5189_210_4557_1_1527248319428_3769229_199-53526-0 from training. Duration: 0.42 2023-10-03 02:38:59,591 INFO [train.py:1046] (2/4) Epoch 32, batch 1600, loss[loss=0.1803, simple_loss=0.2606, pruned_loss=0.05002, over 24324.00 frames. ], tot_loss[loss=0.1636, simple_loss=0.2428, pruned_loss=0.04226, over 4708181.73 frames. ], batch size: 77, lr: 3.20e-03, grad_scale: 16.0 2023-10-03 02:38:59,714 WARNING [train.py:1204] (2/4) Exclude cut with ID _6274_210_2084_1_1527937583837_7494020_32-205288-0_sp0.9 from training. Duration: 0.8666875 2023-10-03 02:39:01,092 WARNING [train.py:1204] (2/4) Exclude cut with ID _65263_210_22356_1_1535763598691_3921037_401-346646-0_sp1.1 from training. Duration: 0.963625 2023-10-03 02:39:01,103 WARNING [train.py:1204] (2/4) Exclude cut with ID _12555_210_8750_1_1530075594402_3780239_449-267986-0_sp1.1 from training. Duration: 0.7545625 2023-10-03 02:39:01,120 WARNING [train.py:1204] (2/4) Exclude cut with ID _69884_210_9771_1_1535846392818_4166169_430-201020-0_sp1.1 from training. Duration: 0.8818125 2023-10-03 02:39:02,523 WARNING [train.py:1204] (2/4) Exclude cut with ID _8394_210_6702_1_1528590504316_3540189_379-49457-0_sp1.1 from training. Duration: 0.7909375 2023-10-03 02:39:04,010 WARNING [train.py:1204] (2/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_200-105218-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 02:39:04,061 WARNING [train.py:1204] (2/4) Exclude cut with ID _3867_210_1883_1_1526641200575_4475830_319-349850-0 from training. Duration: 0.67 2023-10-03 02:39:05,492 WARNING [train.py:1204] (2/4) Exclude cut with ID _28317_210_12742_1_1532480302381_8510325_916-195848-0 from training. Duration: 0.92 2023-10-03 02:39:08,197 WARNING [train.py:1204] (2/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_436-186311-0 from training. Duration: 0.65 2023-10-03 02:39:10,098 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_3910_210_1794_1_1526777667_3747260_302-180617-0_sp1.1 from training. Duration: 0.97275 2023-10-03 02:39:11,501 WARNING [train.py:1204] (2/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_102-92751-0 from training. Duration: 0.99 2023-10-03 02:39:11,553 WARNING [train.py:1204] (2/4) Exclude cut with ID _51899_210_20034_1_1534129254466_3669220_265-215428-0_sp1.1 from training. Duration: 0.8909375 2023-10-03 02:39:13,023 WARNING [train.py:1204] (2/4) Exclude cut with ID _58583_210_5236_1_1534726835266_3574249_438-198993-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 02:39:13,287 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.conv_module1.balancer2.prob, batch_count=1108573.3333333333, ans=0.125 2023-10-03 02:39:17,160 WARNING [train.py:1204] (2/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_166-166990-0_sp1.1 from training. Duration: 0.87275 2023-10-03 02:39:19,659 WARNING [train.py:1204] (2/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_907-151398-0 from training. Duration: 0.37 2023-10-03 02:39:24,764 WARNING [train.py:1204] (2/4) Exclude cut with ID _38670_210_15379_1_1533448810659_4391059_452-256669-0_sp1.1 from training. Duration: 0.936375 2023-10-03 02:39:25,021 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.bypass_mid.scale_min, batch_count=1108573.3333333333, ans=0.2 2023-10-03 02:39:26,109 WARNING [train.py:1204] (2/4) Exclude cut with ID _48677_210_5663_1_1533902405919_3892989_408-40740-0 from training. Duration: 0.83 2023-10-03 02:39:26,149 WARNING [train.py:1204] (2/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_903-158756-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 02:39:27,427 WARNING [train.py:1204] (2/4) Exclude cut with ID _13252_210_1925_1_1530671372047_3336909_397-216497-0 from training. Duration: 0.96 2023-10-03 02:39:31,620 WARNING [train.py:1204] (2/4) Exclude cut with ID _55429_210_10626_1_1534467588527_7174143_97-335084-0 from training. Duration: 0.97 2023-10-03 02:39:35,807 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.bypass.skip_rate, batch_count=1108640.0, ans=0.07 2023-10-03 02:39:41,958 WARNING [train.py:1204] (2/4) Exclude cut with ID _45329_210_7005_1_1534157951211_6774339_453-267730-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 02:39:43,327 WARNING [train.py:1204] (2/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_153-97441-0 from training. Duration: 0.56 2023-10-03 02:39:43,378 WARNING [train.py:1204] (2/4) Exclude cut with ID _40901_210_5665_1_1533363789958_3856130_312-329122-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 02:39:44,687 WARNING [train.py:1204] (2/4) Exclude cut with ID _30800_210_22382_1_1532499347904_4515959_97-126471-0_sp1.1 from training. Duration: 0.97275 2023-10-03 02:39:44,687 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_11305_210_1794_1_1529750428_7178148_257-312870-0_sp1.1 from training. Duration: 0.8 2023-10-03 02:39:47,504 WARNING [train.py:1204] (2/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_454-314901-0_sp0.9 from training. Duration: 0.511125 2023-10-03 02:39:50,192 WARNING [train.py:1204] (2/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_282-136641-0_sp1.1 from training. Duration: 0.3818125 2023-10-03 02:39:51,620 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_46013_210_7234_1_1533776362_3744032_493-160724-0_sp1.1 from training. Duration: 0.963625 2023-10-03 02:39:52,965 WARNING [train.py:1204] (2/4) Exclude cut with ID _67831_210_12392_1_1535612464202_3092612_259-33442-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 02:39:53,031 WARNING [train.py:1204] (2/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_289-62384-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 02:39:53,215 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.conv_module1.balancer1.prob, batch_count=1108706.6666666667, ans=0.125 2023-10-03 02:39:54,406 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6545_210_2040_1_1527774670_4245258_379-14182-0_sp0.9 from training. Duration: 0.888875 2023-10-03 02:39:58,288 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_548-294316-0_sp1.1 from training. Duration: 0.736375 2023-10-03 02:39:58,369 WARNING [train.py:1204] (2/4) Exclude cut with ID _6275_210_2084_1_1528110062213_7669049_857-298942-0_sp1.1 from training. Duration: 0.77275 2023-10-03 02:40:00,388 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6673_210_3273_1_1527767790_3173240_327-306185-0_sp1.1 from training. Duration: 0.7545625 2023-10-03 02:40:04,686 WARNING [train.py:1204] (2/4) Exclude cut with ID _61008_210_10633_1_1534852769198_6974370_814-107647-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 02:40:05,966 WARNING [train.py:1204] (2/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_116-332008-0_sp1.1 from training. Duration: 0.8909375 2023-10-03 02:40:07,257 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.636e+02 1.902e+02 2.151e+02 2.646e+02 3.941e+02, threshold=4.301e+02, percent-clipped=0.0 2023-10-03 02:40:07,440 WARNING [train.py:1204] (2/4) Exclude cut with ID _32704_210_8954_1_1533017488840_6747370_266-329755-0 from training. Duration: 0.89 2023-10-03 02:40:07,443 WARNING [train.py:1204] (2/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_271-269084-0_sp0.9 from training. Duration: 0.92225 2023-10-03 02:40:08,848 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_3171_210_2087_1_1526109084_2330007_66-260583-0 from training. Duration: 0.72 2023-10-03 02:40:12,322 WARNING [train.py:1204] (2/4) Exclude cut with ID _5250_210_5219_1_1527249561806_8692110_1326-276386-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 02:40:13,650 INFO [train.py:1046] (2/4) Epoch 32, batch 1650, loss[loss=0.1562, simple_loss=0.2338, pruned_loss=0.03933, over 23695.00 frames. ], tot_loss[loss=0.1632, simple_loss=0.2427, pruned_loss=0.04182, over 4722208.99 frames. ], batch size: 149, lr: 3.20e-03, grad_scale: 16.0 2023-10-03 02:40:13,751 WARNING [train.py:1204] (2/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_372-30302-0_sp1.1 from training. Duration: 0.7818125 2023-10-03 02:40:13,823 WARNING [train.py:1204] (2/4) Exclude cut with ID _25882_210_3036_1_1532775665815_6479510_631-257029-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 02:40:13,831 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_3691_210_3528_1_1526468169_4062053_355-334432-0 from training. Duration: 0.87 2023-10-03 02:40:15,079 WARNING [train.py:1204] (2/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_839-173017-0_sp1.1 from training. Duration: 0.37275 2023-10-03 02:40:15,087 WARNING [train.py:1204] (2/4) Exclude cut with ID _14998_210_5219_1_1530705582080_7474970_441-91466-0 from training. Duration: 0.97 2023-10-03 02:40:15,128 WARNING [train.py:1204] (2/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_108-53929-0 from training. Duration: 0.59 2023-10-03 02:40:15,354 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.nonlin_attention.balancer.prob, batch_count=1108840.0, ans=0.125 2023-10-03 02:40:17,944 WARNING [train.py:1204] (2/4) Exclude cut with ID _9389_210_1664_1_1529217337301_5289269_260-1942-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 02:40:19,298 WARNING [train.py:1204] (2/4) Exclude cut with ID _56423_210_5854_1_1534896061049_3165969_198-61547-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 02:40:19,335 WARNING [train.py:1204] (2/4) Exclude cut with ID _44778_210_2054_1_1533798127203_7443090_412-142122-0_sp1.1 from training. Duration: 0.92725 2023-10-03 02:40:20,559 WARNING [train.py:1204] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_88-341718-0_sp1.1 from training. Duration: 0.6 2023-10-03 02:40:23,793 WARNING [train.py:1204] (2/4) Exclude cut with ID _61079_210_4525_1_1534921008198_4011419_334-298697-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 02:40:25,821 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_5465_210_4211_1_1527227253_55357_8-2499-0_sp1.1 from training. Duration: 0.996375 2023-10-03 02:40:27,847 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_447-201565-0_sp1.1 from training. Duration: 0.97275 2023-10-03 02:40:27,852 WARNING [train.py:1204] (2/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_253-161789-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 02:40:27,855 WARNING [train.py:1204] (2/4) Exclude cut with ID _44304_210_15374_1_1533639352765_4613129_302-22084-0_sp0.9 from training. Duration: 0.9555625 2023-10-03 02:40:27,863 WARNING [train.py:1204] (2/4) Exclude cut with ID _26664_210_7770_1_1532258943280_3418270_187-80815-0_sp1.1 from training. Duration: 0.8454375 2023-10-03 02:40:29,232 WARNING [train.py:1204] (2/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_68-212801-0 from training. Duration: 0.73 2023-10-03 02:40:30,570 WARNING [train.py:1204] (2/4) Exclude cut with ID _29345_210_4381_1_1532689168263_3860169_149-257268-0 from training. Duration: 0.81 2023-10-03 02:40:30,853 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.conv_module2.balancer1.prob, batch_count=1108906.6666666667, ans=0.125 2023-10-03 02:40:34,788 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_213-103282-0_sp1.1 from training. Duration: 0.7090625 2023-10-03 02:40:36,287 WARNING [train.py:1204] (2/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_156-302675-0_sp1.1 from training. Duration: 0.5 2023-10-03 02:40:36,867 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.4.encoder.layers.2.nonlin_attention.whiten2, num_groups=1, num_channels=384, metric=3.59 vs. limit=15.0 2023-10-03 02:40:45,234 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder_embed.dropout.p, batch_count=1108973.3333333333, ans=0.1 2023-10-03 02:40:46,486 WARNING [train.py:1204] (2/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_125-322172-0 from training. Duration: 0.93 2023-10-03 02:40:47,857 WARNING [train.py:1204] (2/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_493-60474-0_sp1.1 from training. Duration: 0.963625 2023-10-03 02:40:50,596 WARNING [train.py:1204] (2/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_156-302675-0 from training. Duration: 0.55 2023-10-03 02:40:53,417 WARNING [train.py:1204] (2/4) Exclude cut with ID _41314_210_12651_1_1533459476543_3766460_123-267862-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 02:40:55,258 WARNING [train.py:1204] (2/4) Exclude cut with ID _28427_210_4220_1_1532581240967_3061069_201-143747-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 02:40:55,292 WARNING [train.py:1204] (2/4) Exclude cut with ID _8772_210_1850_1_1528635595972_3664011_96-12756-0_sp1.1 from training. Duration: 0.8909375 2023-10-03 02:40:55,330 WARNING [train.py:1204] (2/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_1347-145675-0_sp1.1 from training. Duration: 0.936375 2023-10-03 02:40:57,301 WARNING [train.py:1204] (2/4) Exclude cut with ID _9235_210_5219_1_1528891154253_8046120_387-159008-0_sp1.1 from training. Duration: 0.7818125 2023-10-03 02:40:57,327 WARNING [train.py:1204] (2/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_400-201896-0_sp1.1 from training. Duration: 0.963625 2023-10-03 02:41:00,094 WARNING [train.py:1204] (2/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_326-174612-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 02:41:01,405 WARNING [train.py:1204] (2/4) Exclude cut with ID _55844_210_2346_1_1534471212569_7625707_390-116233-0_sp1.1 from training. Duration: 0.963625 2023-10-03 02:41:01,431 WARNING [train.py:1204] (2/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_419-317062-0_sp1.1 from training. Duration: 0.82725 2023-10-03 02:41:01,486 WARNING [train.py:1204] (2/4) Exclude cut with ID _29877_210_6228_1_1532397791106_5276149_266-62291-0_sp0.9 from training. Duration: 0.9666875 2023-10-03 02:41:02,763 WARNING [train.py:1204] (2/4) Exclude cut with ID _7857_210_1534_1_1528286515745_6831470_431-338528-0_sp1.1 from training. Duration: 0.92725 2023-10-03 02:41:04,179 WARNING [train.py:1204] (2/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_87-131427-0_sp0.9 from training. Duration: 0.8666875 2023-10-03 02:41:06,973 WARNING [train.py:1204] (2/4) Exclude cut with ID _24055_210_12651_1_1532310907143_976979_92-59356-0_sp1.1 from training. Duration: 0.82725 2023-10-03 02:41:08,324 WARNING [train.py:1204] (2/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_96-290039-0 from training. Duration: 0.56 2023-10-03 02:41:09,743 WARNING [train.py:1204] (2/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_48-182155-0_sp0.9 from training. Duration: 0.9666875 2023-10-03 02:41:09,775 WARNING [train.py:1204] (2/4) Exclude cut with ID _4626_210_3532_1_1526817652717_3916859_446-206656-0 from training. Duration: 0.76 2023-10-03 02:41:11,164 WARNING [train.py:1204] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_168-32906-0 from training. Duration: 0.69 2023-10-03 02:41:11,182 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_3780_210_2087_1_1526467760_2130483_170-259044-0 from training. Duration: 0.7 2023-10-03 02:41:11,193 WARNING [train.py:1204] (2/4) Exclude cut with ID _65134_210_14763_1_1535335393985_3505368_153-216718-0_sp1.1 from training. Duration: 0.92725 2023-10-03 02:41:13,217 WARNING [train.py:1204] (2/4) Exclude cut with ID _64639_210_5816_1_1535367619437_3528000_461-329156-0_sp1.1 from training. Duration: 0.87275 2023-10-03 02:41:14,444 WARNING [train.py:1204] (2/4) Exclude cut with ID _21989_210_5854_1_1531899214931_3271360_126-347531-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 02:41:14,483 WARNING [train.py:1204] (2/4) Exclude cut with ID _7614_210_4915_1_1529326764092_4537359_96-162197-0_sp1.1 from training. Duration: 0.963625 2023-10-03 02:41:14,484 WARNING [train.py:1204] (2/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_1083-147536-0 from training. Duration: 0.97 2023-10-03 02:41:17,565 WARNING [train.py:1204] (2/4) Exclude cut with ID _64029_210_4381_1_1535178502772_3756459_275-16710-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 02:41:20,158 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7264_210_4938_1_1527939911_8132095_800-91995-0_sp0.9 from training. Duration: 0.9555625 2023-10-03 02:41:20,200 WARNING [train.py:1204] (2/4) Exclude cut with ID _3844_210_1390_1_1526727803582_2871209_50-193879-0_sp1.1 from training. Duration: 0.936375 2023-10-03 02:41:22,999 WARNING [train.py:1204] (2/4) Exclude cut with ID _46952_210_5986_1_1534230010357_3107379_232-344503-0 from training. Duration: 0.96 2023-10-03 02:41:26,281 WARNING [train.py:1204] (2/4) Exclude cut with ID _25808_210_13292_1_1532343514562_7366109_206-14076-0_sp1.1 from training. Duration: 0.936375 2023-10-03 02:41:26,282 WARNING [train.py:1204] (2/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_55-31904-0_sp0.9 from training. Duration: 0.97775 2023-10-03 02:41:28,037 INFO [train.py:1046] (2/4) Epoch 32, batch 1700, loss[loss=0.1463, simple_loss=0.2078, pruned_loss=0.04241, over 22752.00 frames. ], tot_loss[loss=0.1625, simple_loss=0.2415, pruned_loss=0.04176, over 4725720.92 frames. ], batch size: 322, lr: 3.20e-03, grad_scale: 16.0 2023-10-03 02:41:28,089 WARNING [train.py:1204] (2/4) Exclude cut with ID _14632_210_9988_1_1530691279392_3923200_161-129530-0 from training. Duration: 0.96 2023-10-03 02:41:28,178 WARNING [train.py:1204] (2/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_770-200430-0_sp1.1 from training. Duration: 0.8090625 2023-10-03 02:41:28,190 WARNING [train.py:1204] (2/4) Exclude cut with ID _6270_210_2084_1_1527850911573_3856420_26-277343-0_sp1.1 from training. Duration: 0.7454375 2023-10-03 02:41:28,191 WARNING [train.py:1204] (2/4) Exclude cut with ID _69884_210_9771_1_1535846392818_4166169_88-23776-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 02:41:30,914 WARNING [train.py:1204] (2/4) Exclude cut with ID _60860_210_8254_1_1534896015875_3557169_245-256666-0_sp1.1 from training. Duration: 0.8818125 2023-10-03 02:41:32,227 WARNING [train.py:1204] (2/4) Exclude cut with ID _5250_210_5219_1_1527249561806_8692110_636-251213-0_sp1.1 from training. Duration: 0.8545625 2023-10-03 02:41:32,256 WARNING [train.py:1204] (2/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_540-131164-0 from training. Duration: 0.92 2023-10-03 02:41:35,002 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_5931_210_2040_1_1527428481_4547999_321-193614-0_sp0.9 from training. Duration: 0.7444375 2023-10-03 02:41:43,283 WARNING [train.py:1204] (2/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_373-225403-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 02:41:45,356 WARNING [train.py:1204] (2/4) Exclude cut with ID _20622_210_3852_1_1531622427080_1093839_57-326027-0_sp1.1 from training. Duration: 0.97275 2023-10-03 02:41:52,132 WARNING [train.py:1204] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_605-257205-0_sp1.1 from training. Duration: 0.536375 2023-10-03 02:41:52,154 WARNING [train.py:1204] (2/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_442-320317-0_sp0.9 from training. Duration: 0.911125 2023-10-03 02:41:52,190 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_35361_210_4238_1_1533262810_7793476_675-87012-0_sp1.1 from training. Duration: 0.8090625 2023-10-03 02:41:52,229 WARNING [train.py:1204] (2/4) Exclude cut with ID _4184_210_2550_1_1526730192013_2219380_306-44736-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 02:41:53,611 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_124-113562-0 from training. Duration: 0.94 2023-10-03 02:41:55,143 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_2821_210_2715_1_1526108385_3763560_73-296673-0_sp0.9 from training. Duration: 0.92225 2023-10-03 02:41:55,160 WARNING [train.py:1204] (2/4) Exclude cut with ID _10301_210_5886_1_1529222275332_3910620_216-46959-0_sp1.1 from training. Duration: 0.92725 2023-10-03 02:41:58,295 WARNING [train.py:1204] (2/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_91-277684-0_sp0.9 from training. Duration: 0.82225 2023-10-03 02:42:00,325 WARNING [train.py:1204] (2/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_351-294650-0_sp0.9 from training. Duration: 0.7 2023-10-03 02:42:01,762 WARNING [train.py:1204] (2/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_209-205365-0 from training. Duration: 0.79 2023-10-03 02:42:01,834 WARNING [train.py:1204] (2/4) Exclude cut with ID _14603_210_4220_1_1530615688441_3097319_97-325053-0 from training. Duration: 0.92 2023-10-03 02:42:03,508 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_49348_210_7927_1_1533954500_3691300_109-276430-0_sp1.1 from training. Duration: 0.92725 2023-10-03 02:42:05,023 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.bypass.skip_rate, batch_count=1109306.6666666667, ans=0.07 2023-10-03 02:42:06,083 WARNING [train.py:1204] (2/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_912-47473-0 from training. Duration: 0.68 2023-10-03 02:42:06,157 WARNING [train.py:1204] (2/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_96-320736-0_sp0.9 from training. Duration: 0.9555625 2023-10-03 02:42:13,565 WARNING [train.py:1204] (2/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_1252-235481-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 02:42:14,939 WARNING [train.py:1204] (2/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_813-261554-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 02:42:15,006 WARNING [train.py:1204] (2/4) Exclude cut with ID _21632_210_2253_1_1531994001765_3744140_110-274654-0_sp0.9 from training. Duration: 0.911125 2023-10-03 02:42:17,780 WARNING [train.py:1204] (2/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_381-317791-0_sp1.1 from training. Duration: 0.663625 2023-10-03 02:42:19,100 WARNING [train.py:1204] (2/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_51-117576-0_sp1.1 from training. Duration: 0.363625 2023-10-03 02:42:19,125 WARNING [train.py:1204] (2/4) Exclude cut with ID _42321_210_7770_1_1533468631352_4366895_104-215915-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 02:42:19,362 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.conv_module1.balancer1.prob, batch_count=1109373.3333333333, ans=0.125 2023-10-03 02:42:21,778 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_40896_210_15710_1_1533433956_7100802_216-22824-0_sp1.1 from training. Duration: 0.92725 2023-10-03 02:42:21,779 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_14831_210_7927_1_1530700200_5583610_181-27467-0 from training. Duration: 0.91 2023-10-03 02:42:23,080 WARNING [train.py:1204] (2/4) Exclude cut with ID _30304_210_6319_1_1532476827261_7582060_1024-265645-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 02:42:23,090 WARNING [train.py:1204] (2/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_412-233904-0_sp1.1 from training. Duration: 0.936375 2023-10-03 02:42:23,116 WARNING [train.py:1204] (2/4) Exclude cut with ID _30191_210_1664_1_1533189915047_7196670_911-223461-0_sp1.1 from training. Duration: 0.92725 2023-10-03 02:42:23,124 WARNING [train.py:1204] (2/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_558-148266-0_sp1.1 from training. Duration: 0.963625 2023-10-03 02:42:26,331 WARNING [train.py:1204] (2/4) Exclude cut with ID _58480_210_12392_1_1534811121111_3646160_212-164495-0_sp1.1 from training. Duration: 0.936375 2023-10-03 02:42:26,333 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_454-276429-0_sp0.9 from training. Duration: 0.9666875 2023-10-03 02:42:27,673 WARNING [train.py:1204] (2/4) Exclude cut with ID _24736_210_3279_1_1532332894218_2838700_73-197086-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 02:42:27,735 WARNING [train.py:1204] (2/4) Exclude cut with ID _5294_210_1747_1_1527249643928_3696179_529-242298-0_sp1.1 from training. Duration: 0.863625 2023-10-03 02:42:29,597 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_53555_210_5632_1_1534595220_7232297_345-171438-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 02:42:34,259 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_28211_210_6388_1_1532861506_4523224_22-25766-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 02:42:35,629 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7002_210_1794_1_1527939683_5157286_163-87552-0 from training. Duration: 0.86 2023-10-03 02:42:36,918 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.594e+02 1.849e+02 2.096e+02 2.324e+02 3.909e+02, threshold=4.193e+02, percent-clipped=0.0 2023-10-03 02:42:37,068 WARNING [train.py:1204] (2/4) Exclude cut with ID _56927_210_9969_1_1534848970840_3695229_409-225129-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 02:42:38,511 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_26192_210_7927_1_1532137269_1040115_21-219331-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 02:42:39,996 WARNING [train.py:1204] (2/4) Exclude cut with ID _15012_210_1968_1_1530856820429_5095350_662-210469-0 from training. Duration: 0.57 2023-10-03 02:42:41,315 INFO [train.py:1046] (2/4) Epoch 32, batch 1750, loss[loss=0.1687, simple_loss=0.2377, pruned_loss=0.04986, over 23709.00 frames. ], tot_loss[loss=0.1618, simple_loss=0.2404, pruned_loss=0.04165, over 4731651.04 frames. ], batch size: 179, lr: 3.20e-03, grad_scale: 8.0 2023-10-03 02:42:43,757 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.2.encoder.layers.0.nonlin_attention.whiten1, num_groups=1, num_channels=288, metric=6.86 vs. limit=10.0 2023-10-03 02:42:44,917 WARNING [train.py:1204] (2/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_582-191016-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 02:42:46,255 WARNING [train.py:1204] (2/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_270-210032-0_sp1.1 from training. Duration: 0.963625 2023-10-03 02:42:47,536 WARNING [train.py:1204] (2/4) Exclude cut with ID _6789_210_1534_1_1527857937218_3118950_262-48853-0_sp0.9 from training. Duration: 0.62225 2023-10-03 02:42:47,591 WARNING [train.py:1204] (2/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_847-41334-0 from training. Duration: 0.44 2023-10-03 02:42:47,623 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_40896_210_15710_1_1533433956_7100802_573-46532-0_sp1.1 from training. Duration: 0.92725 2023-10-03 02:42:50,353 WARNING [train.py:1204] (2/4) Exclude cut with ID _21881_210_3525_1_1531825242757_3798380_247-102591-0_sp1.1 from training. Duration: 0.8909375 2023-10-03 02:42:50,383 WARNING [train.py:1204] (2/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_200-21395-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 02:42:54,723 WARNING [train.py:1204] (2/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_711-15688-0 from training. Duration: 0.43 2023-10-03 02:42:58,130 WARNING [train.py:1204] (2/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_53-75025-0_sp1.1 from training. Duration: 0.963625 2023-10-03 02:43:00,799 WARNING [train.py:1204] (2/4) Exclude cut with ID _6194_210_1534_1_1527511826160_3941194_321-148641-0 from training. Duration: 0.64 2023-10-03 02:43:00,837 WARNING [train.py:1204] (2/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_337-294431-0_sp1.1 from training. Duration: 0.936375 2023-10-03 02:43:02,747 WARNING [train.py:1204] (2/4) Exclude cut with ID _21632_210_2253_1_1531994001765_3744140_110-274654-0_sp1.1 from training. Duration: 0.7454375 2023-10-03 02:43:04,208 WARNING [train.py:1204] (2/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_135-174080-0_sp1.1 from training. Duration: 0.4818125 2023-10-03 02:43:04,475 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.bypass.scale_min, batch_count=1109573.3333333333, ans=0.2 2023-10-03 02:43:05,579 WARNING [train.py:1204] (2/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_21-174514-0 from training. Duration: 0.43 2023-10-03 02:43:08,335 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_38965_210_4852_1_1533174906_3484735_215-294589-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 02:43:08,365 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_548-294316-0 from training. Duration: 0.81 2023-10-03 02:43:15,719 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_103-84010-0_sp1.1 from training. Duration: 0.62725 2023-10-03 02:43:18,452 WARNING [train.py:1204] (2/4) Exclude cut with ID _18199_210_6109_1_1531475789864_4288790_295-198247-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 02:43:18,482 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_13757_210_3528_1_1530440682_4107239_103-313224-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 02:43:22,662 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_34886_210_10633_1_1533203882_1755661_67-294673-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 02:43:22,678 WARNING [train.py:1204] (2/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_350-101734-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 02:43:24,080 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_37357_210_17024_1_1533089504_2316121_204-153986-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 02:43:27,245 WARNING [train.py:1204] (2/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_673-115569-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 02:43:27,493 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.0.conv_skip_rate, batch_count=1109706.6666666667, ans=0.0 2023-10-03 02:43:29,157 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_489-230703-0_sp0.9 from training. Duration: 0.9444375 2023-10-03 02:43:29,234 WARNING [train.py:1204] (2/4) Exclude cut with ID _5250_210_5219_1_1527249561806_8692110_575-102496-0_sp1.1 from training. Duration: 0.936375 2023-10-03 02:43:30,593 WARNING [train.py:1204] (2/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_669-93163-0 from training. Duration: 0.98 2023-10-03 02:43:32,732 WARNING [train.py:1204] (2/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_249-281349-0_sp0.9 from training. Duration: 0.9444375 2023-10-03 02:43:35,343 WARNING [train.py:1204] (2/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_106-30782-0 from training. Duration: 0.8 2023-10-03 02:43:35,414 WARNING [train.py:1204] (2/4) Exclude cut with ID _8772_210_1850_1_1528635595972_3664011_398-320049-0_sp1.1 from training. Duration: 0.8 2023-10-03 02:43:36,865 WARNING [train.py:1204] (2/4) Exclude cut with ID _44778_210_2054_1_1533798127203_7443090_516-151727-0_sp1.1 from training. Duration: 0.963625 2023-10-03 02:43:38,177 WARNING [train.py:1204] (2/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_325-62802-0_sp1.1 from training. Duration: 0.7909375 2023-10-03 02:43:41,168 WARNING [train.py:1204] (2/4) Exclude cut with ID _14368_210_4220_1_1530532714870_3595529_93-58002-0_sp0.9 from training. Duration: 0.6333125 2023-10-03 02:43:42,526 WARNING [train.py:1204] (2/4) Exclude cut with ID _4681_210_3525_1_1527251040321_1235713_123-24428-0_sp1.1 from training. Duration: 0.436375 2023-10-03 02:43:42,581 WARNING [train.py:1204] (2/4) Exclude cut with ID _5250_210_5219_1_1527249561806_8692110_1185-56473-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 02:43:44,464 WARNING [train.py:1204] (2/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_96-244299-0_sp1.1 from training. Duration: 0.8 2023-10-03 02:43:47,330 WARNING [train.py:1204] (2/4) Exclude cut with ID _59500_210_8254_1_1534809610680_3736009_104-72815-0_sp1.1 from training. Duration: 0.963625 2023-10-03 02:43:49,688 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.0.layers.0.self_attn1.whiten, num_groups=1, num_channels=192, metric=11.26 vs. limit=22.5 2023-10-03 02:43:50,021 WARNING [train.py:1204] (2/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_468-283113-0_sp1.1 from training. Duration: 0.87275 2023-10-03 02:43:50,139 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6239_210_4211_1_1527765584_4306698_528-102186-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 02:43:51,493 WARNING [train.py:1204] (2/4) Exclude cut with ID _45297_210_2721_1_1533715351381_3264949_351-147136-0 from training. Duration: 0.87 2023-10-03 02:43:51,497 WARNING [train.py:1204] (2/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_520-51744-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 02:43:52,843 WARNING [train.py:1204] (2/4) Exclude cut with ID _5166_210_2084_1_1527073197160_4087993_310-114452-0_sp1.1 from training. Duration: 0.536375 2023-10-03 02:43:52,846 WARNING [train.py:1204] (2/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_450-165710-0_sp1.1 from training. Duration: 0.92725 2023-10-03 02:43:52,856 WARNING [train.py:1204] (2/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_280-120942-0_sp1.1 from training. Duration: 0.7 2023-10-03 02:43:52,877 WARNING [train.py:1204] (2/4) Exclude cut with ID _65585_210_12210_1_1535450451584_3764098_112-294301-0_sp0.9 from training. Duration: 0.97775 2023-10-03 02:43:54,259 WARNING [train.py:1204] (2/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_98-51676-0_sp0.9 from training. Duration: 0.811125 2023-10-03 02:43:56,226 INFO [train.py:1046] (2/4) Epoch 32, batch 1800, loss[loss=0.1526, simple_loss=0.2273, pruned_loss=0.03898, over 23413.00 frames. ], tot_loss[loss=0.1615, simple_loss=0.2402, pruned_loss=0.04142, over 4723527.82 frames. ], batch size: 285, lr: 3.20e-03, grad_scale: 8.0 2023-10-03 02:43:57,761 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_33348_210_5632_1_1533086912_3324030_85-28583-0_sp1.1 from training. Duration: 0.7454375 2023-10-03 02:43:59,563 WARNING [train.py:1204] (2/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_336-208232-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 02:44:00,913 WARNING [train.py:1204] (2/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_339-104791-0_sp0.9 from training. Duration: 0.5444375 2023-10-03 02:44:03,644 WARNING [train.py:1204] (2/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_559-29840-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 02:44:05,836 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.conv_module2.balancer1.prob, batch_count=1109840.0, ans=0.125 2023-10-03 02:44:06,990 WARNING [train.py:1204] (2/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_117-290916-0_sp0.9 from training. Duration: 0.5666875 2023-10-03 02:44:07,184 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.self_attn_weights.pos_emb_skip_rate, batch_count=1109840.0, ans=0.0 2023-10-03 02:44:08,363 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_43802_210_2290_1_1533516203_8829504_185-112289-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 02:44:12,421 WARNING [train.py:1204] (2/4) Exclude cut with ID _59256_210_5986_1_1535076085997_3589120_155-298797-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 02:44:15,080 WARNING [train.py:1204] (2/4) Exclude cut with ID _44659_210_2721_1_1534036120061_6614860_246-206531-0_sp1.1 from training. Duration: 0.92725 2023-10-03 02:44:15,128 WARNING [train.py:1204] (2/4) Exclude cut with ID _23818_210_6228_1_1531917063884_3676920_469-151944-0_sp1.1 from training. Duration: 0.92725 2023-10-03 02:44:16,919 WARNING [train.py:1204] (2/4) Exclude cut with ID _13252_210_1925_1_1530671372047_3336909_88-276668-0_sp1.1 from training. Duration: 0.9 2023-10-03 02:44:18,350 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_15101_210_7927_1_1530786415_5033274_320-327527-0_sp1.1 from training. Duration: 0.87275 2023-10-03 02:44:18,358 WARNING [train.py:1204] (2/4) Exclude cut with ID _14998_210_5219_1_1530705582080_7474970_446-112117-0 from training. Duration: 0.9 2023-10-03 02:44:18,820 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.5.encoder.layers.1.conv_module1.whiten, num_groups=1, num_channels=256, metric=3.29 vs. limit=15.0 2023-10-03 02:44:19,679 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_30280_210_1881_1_1532872779_3660139_400-254159-0_sp1.1 from training. Duration: 0.936375 2023-10-03 02:44:21,083 WARNING [train.py:1204] (2/4) Exclude cut with ID _40284_210_2084_1_1533513679814_3704279_158-345341-0_sp1.1 from training. Duration: 0.936375 2023-10-03 02:44:24,628 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.4.encoder.layers.0.nonlin_attention.whiten2, num_groups=1, num_channels=384, metric=10.25 vs. limit=15.0 2023-10-03 02:44:25,335 WARNING [train.py:1204] (2/4) Exclude cut with ID 3033-130750-0096-55598-0 from training. Duration: 0.83 2023-10-03 02:44:25,490 WARNING [train.py:1204] (2/4) Exclude cut with ID _9389_210_1664_1_1529217337301_5289269_379-128794-0 from training. Duration: 0.85 2023-10-03 02:44:25,504 WARNING [train.py:1204] (2/4) Exclude cut with ID _3675_210_4557_1_1526639934408_4182089_456-18305-0 from training. Duration: 0.76 2023-10-03 02:44:27,311 WARNING [train.py:1204] (2/4) Exclude cut with ID _29853_210_7497_1_1532505600851_6642110_571-249460-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 02:44:28,753 WARNING [train.py:1204] (2/4) Exclude cut with ID _4068_210_2629_1_1526697350395_1546259_230-325568-0_sp1.1 from training. Duration: 0.92725 2023-10-03 02:44:28,754 WARNING [train.py:1204] (2/4) Exclude cut with ID _34415_210_8750_1_1533085213375_3451116_213-267570-0_sp1.1 from training. Duration: 0.963625 2023-10-03 02:44:28,862 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6767_210_3273_1_1527855783_1718932_142-127132-0_sp1.1 from training. Duration: 0.863625 2023-10-03 02:44:29,685 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.1.encoder.layers.1.self_attn_weights.whiten_keys, num_groups=4, num_channels=128, metric=2.81 vs. limit=6.0 2023-10-03 02:44:36,612 WARNING [train.py:1204] (2/4) Exclude cut with ID IC0653W0001-322925 from training. Duration: 0.896 2023-10-03 02:44:37,899 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_4528_210_3273_1_1527076654_3276777_209-219804-0_sp1.1 from training. Duration: 0.62725 2023-10-03 02:44:39,235 WARNING [train.py:1204] (2/4) Exclude cut with ID _55844_210_2346_1_1534471212569_7625707_187-204971-0_sp1.1 from training. Duration: 0.936375 2023-10-03 02:44:40,699 WARNING [train.py:1204] (2/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_369-325184-0 from training. Duration: 0.99 2023-10-03 02:44:42,118 WARNING [train.py:1204] (2/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_791-45149-0 from training. Duration: 0.86 2023-10-03 02:44:42,173 WARNING [train.py:1204] (2/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_317-211107-0_sp1.1 from training. Duration: 0.836375 2023-10-03 02:44:42,788 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.4.encoder.layers.0.nonlin_attention.whiten1, num_groups=1, num_channels=288, metric=9.00 vs. limit=10.0 2023-10-03 02:44:43,684 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=1110040.0, ans=0.1 2023-10-03 02:44:44,859 WARNING [train.py:1204] (2/4) Exclude cut with ID _30800_210_22382_1_1532499347904_4515959_488-330913-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 02:44:45,082 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.bypass.skip_rate, batch_count=1110040.0, ans=0.09899494936611666 2023-10-03 02:44:46,271 WARNING [train.py:1204] (2/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_506-42614-0_sp1.1 from training. Duration: 0.7181875 2023-10-03 02:44:49,746 WARNING [train.py:1204] (2/4) Exclude cut with ID _8696_210_1876_1_1528545632440_3345058_209-54069-0 from training. Duration: 0.67 2023-10-03 02:44:55,269 WARNING [train.py:1204] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_215-202973-0_sp1.1 from training. Duration: 0.8 2023-10-03 02:44:55,346 WARNING [train.py:1204] (2/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_321-215742-0 from training. Duration: 0.5 2023-10-03 02:44:56,636 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_40896_210_15710_1_1533433956_7100802_428-32114-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 02:44:56,650 WARNING [train.py:1204] (2/4) Exclude cut with ID _27013_210_1389_1_1532437173240_3252649_467-263151-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 02:44:56,689 WARNING [train.py:1204] (2/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_734-129761-0_sp0.9 from training. Duration: 0.911125 2023-10-03 02:44:58,542 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_521-214276-0 from training. Duration: 0.44 2023-10-03 02:45:01,753 WARNING [train.py:1204] (2/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_80-147301-0_sp1.1 from training. Duration: 0.763625 2023-10-03 02:45:01,755 WARNING [train.py:1204] (2/4) Exclude cut with ID _9679_210_7497_1_1529129212906_4363929_56-307796-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 02:45:02,133 INFO [scaling.py:1118] (2/4) WithLoss: name=encoder.encoders.5.encoder.layers.1.self_attn_weights, loss-sum=0.000e+00 2023-10-03 02:45:05,135 WARNING [train.py:1204] (2/4) Exclude cut with ID _14832_210_8750_1_1530691351539_3171348_1-54315-0 from training. Duration: 0.87 2023-10-03 02:45:05,135 WARNING [train.py:1204] (2/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_558-155750-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 02:45:06,183 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.547e+02 1.827e+02 1.959e+02 2.120e+02 2.855e+02, threshold=3.918e+02, percent-clipped=0.0 2023-10-03 02:45:07,576 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_142-309370-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 02:45:07,600 WARNING [train.py:1204] (2/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_18-46756-0_sp0.9 from training. Duration: 0.87775 2023-10-03 02:45:07,606 WARNING [train.py:1204] (2/4) Exclude cut with ID _61148_210_6897_1_1535094022726_4217610_279-212159-0_sp1.1 from training. Duration: 0.936375 2023-10-03 02:45:09,121 WARNING [train.py:1204] (2/4) Exclude cut with ID _21987_210_5854_1_1531821500482_3637038_164-2641-0_sp1.1 from training. Duration: 0.936375 2023-10-03 02:45:09,159 WARNING [train.py:1204] (2/4) Exclude cut with ID _8064_210_5399_1_1528372299291_4282973_395-340852-0_sp1.1 from training. Duration: 0.8454375 2023-10-03 02:45:10,449 INFO [train.py:1046] (2/4) Epoch 32, batch 1850, loss[loss=0.1542, simple_loss=0.2396, pruned_loss=0.03441, over 24443.00 frames. ], tot_loss[loss=0.1617, simple_loss=0.2407, pruned_loss=0.04141, over 4731592.33 frames. ], batch size: 63, lr: 3.20e-03, grad_scale: 8.0 2023-10-03 02:45:11,827 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1077-184261-0_sp1.1 from training. Duration: 0.963625 2023-10-03 02:45:11,836 WARNING [train.py:1204] (2/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_1176-204992-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 02:45:15,894 WARNING [train.py:1204] (2/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_563-347832-0_sp1.1 from training. Duration: 0.8090625 2023-10-03 02:45:15,962 WARNING [train.py:1204] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_380-204756-0_sp0.9 from training. Duration: 0.9555625 2023-10-03 02:45:16,235 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.balancer2.prob, batch_count=1110173.3333333333, ans=0.125 2023-10-03 02:45:22,324 WARNING [train.py:1204] (2/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_212-10501-0_sp1.1 from training. Duration: 0.8545625 2023-10-03 02:45:22,358 WARNING [train.py:1204] (2/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_445-117398-0 from training. Duration: 0.89 2023-10-03 02:45:26,578 WARNING [train.py:1204] (2/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_29-280622-0 from training. Duration: 0.82 2023-10-03 02:45:30,485 WARNING [train.py:1204] (2/4) Exclude cut with ID _26664_210_7770_1_1532258943280_3418270_187-80815-0 from training. Duration: 0.93 2023-10-03 02:45:35,782 WARNING [train.py:1204] (2/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_414-349640-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 02:45:35,809 WARNING [train.py:1204] (2/4) Exclude cut with ID _45021_210_9994_1_1533619751094_3330288_96-239010-0 from training. Duration: 0.91 2023-10-03 02:45:35,812 WARNING [train.py:1204] (2/4) Exclude cut with ID _5189_210_4557_1_1527248319428_3769229_199-53526-0_sp0.9 from training. Duration: 0.4666875 2023-10-03 02:45:40,783 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder_embed.conv.2.prob, batch_count=1110306.6666666667, ans=0.125 2023-10-03 02:45:44,961 WARNING [train.py:1204] (2/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_63-101996-0_sp1.1 from training. Duration: 0.8 2023-10-03 02:45:47,654 WARNING [train.py:1204] (2/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_199-86432-0 from training. Duration: 0.98 2023-10-03 02:45:50,538 WARNING [train.py:1204] (2/4) Exclude cut with ID _36399_210_16049_1_1532998771377_3698333_207-259013-0_sp1.1 from training. Duration: 0.92725 2023-10-03 02:45:51,880 WARNING [train.py:1204] (2/4) Exclude cut with ID _62574_210_2655_1_1535180407058_3933249_567-111756-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 02:45:55,284 WARNING [train.py:1204] (2/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_436-318113-0 from training. Duration: 0.75 2023-10-03 02:45:55,324 WARNING [train.py:1204] (2/4) Exclude cut with ID _53379_210_20034_1_1534222027363_3854654_43-305214-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 02:45:56,605 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_15126_210_6753_1_1530778677_2591185_113-157651-0_sp1.1 from training. Duration: 0.6181875 2023-10-03 02:45:56,712 WARNING [train.py:1204] (2/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_146-218763-0_sp1.1 from training. Duration: 0.7818125 2023-10-03 02:45:58,164 WARNING [train.py:1204] (2/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_714-310538-0_sp0.9 from training. Duration: 0.9555625 2023-10-03 02:46:00,832 WARNING [train.py:1204] (2/4) Exclude cut with ID _24271_210_12658_1_1531987270022_7190519_709-16721-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 02:46:05,088 WARNING [train.py:1204] (2/4) Exclude cut with ID _5190_210_4557_1_1527244368799_373439_28-197030-0_sp0.9 from training. Duration: 0.8 2023-10-03 02:46:05,136 WARNING [train.py:1204] (2/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_267-197536-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 02:46:06,793 WARNING [train.py:1204] (2/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_658-282610-0_sp1.1 from training. Duration: 0.4818125 2023-10-03 02:46:06,801 WARNING [train.py:1204] (2/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_257-67050-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 02:46:08,688 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_14906_210_7927_1_1530754190_9408633_961-53402-0_sp1.1 from training. Duration: 0.936375 2023-10-03 02:46:10,557 WARNING [train.py:1204] (2/4) Exclude cut with ID _60113_210_16271_1_1535164212481_4386079_490-280290-0_sp1.1 from training. Duration: 0.8909375 2023-10-03 02:46:13,405 WARNING [train.py:1204] (2/4) Exclude cut with ID _14100_210_8341_1_1530529320613_3678069_258-224599-0 from training. Duration: 0.77 2023-10-03 02:46:14,725 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6961_210_2087_1_1527937168_6815011_565-326898-0_sp1.1 from training. Duration: 0.936375 2023-10-03 02:46:17,476 WARNING [train.py:1204] (2/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_239-316802-0_sp1.1 from training. Duration: 0.62725 2023-10-03 02:46:17,542 WARNING [train.py:1204] (2/4) Exclude cut with ID _55857_210_2346_1_1534644007831_6898209_745-98391-0_sp1.1 from training. Duration: 0.6909375 2023-10-03 02:46:17,546 WARNING [train.py:1204] (2/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_154-135696-0 from training. Duration: 0.8 2023-10-03 02:46:17,548 WARNING [train.py:1204] (2/4) Exclude cut with ID _14368_210_4220_1_1530532714870_3595529_93-58002-0 from training. Duration: 0.57 2023-10-03 02:46:20,144 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9025W0005-507335 from training. Duration: 0.896 2023-10-03 02:46:20,220 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9029W0034-509435 from training. Duration: 0.64 2023-10-03 02:46:21,603 WARNING [train.py:1204] (2/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_76-203867-0_sp1.1 from training. Duration: 0.6545625 2023-10-03 02:46:21,612 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_46639_210_8341_1_1533726088_4495500_66-346325-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 02:46:21,618 WARNING [train.py:1204] (2/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_145-137790-0_sp0.9 from training. Duration: 0.92225 2023-10-03 02:46:21,639 WARNING [train.py:1204] (2/4) Exclude cut with ID _36522_210_8887_1_1533177030611_1772478_7-327299-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 02:46:23,012 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9042W0009-515615 from training. Duration: 0.896 2023-10-03 02:46:24,260 INFO [train.py:1046] (2/4) Epoch 32, batch 1900, loss[loss=0.1591, simple_loss=0.2318, pruned_loss=0.04319, over 23676.00 frames. ], tot_loss[loss=0.1625, simple_loss=0.2417, pruned_loss=0.04167, over 4727038.57 frames. ], batch size: 164, lr: 3.20e-03, grad_scale: 8.0 2023-10-03 02:46:24,304 WARNING [train.py:1204] (2/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_39-248870-0_sp0.9 from training. Duration: 0.7666875 2023-10-03 02:46:24,342 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_35441_210_16554_1_1533347572_4092680_362-242912-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 02:46:26,211 WARNING [train.py:1204] (2/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_216-343007-0_sp1.1 from training. Duration: 0.6 2023-10-03 02:46:27,554 WARNING [train.py:1204] (2/4) Exclude cut with ID _3867_210_1883_1_1526641200575_4475830_272-215563-0_sp0.9 from training. Duration: 0.6333125 2023-10-03 02:46:27,734 INFO [scaling.py:1118] (2/4) WithLoss: name=encoder.encoders.0.layers.0.self_attn_weights, loss-sum=0.000e+00 2023-10-03 02:46:27,772 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.bypass.scale_min, batch_count=1110506.6666666667, ans=0.2 2023-10-03 02:46:28,926 WARNING [train.py:1204] (2/4) Exclude cut with ID _21783_210_11357_1_1531742458730_3762679_553-175603-0_sp1.1 from training. Duration: 0.92725 2023-10-03 02:46:28,936 WARNING [train.py:1204] (2/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_963-187109-0 from training. Duration: 0.99 2023-10-03 02:46:30,409 WARNING [train.py:1204] (2/4) Exclude cut with ID _44973_210_1511_1_1533794381163_3634770_495-211466-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 02:46:30,419 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9068W0002-529085 from training. Duration: 0.896 2023-10-03 02:46:30,440 WARNING [train.py:1204] (2/4) Exclude cut with ID _5294_210_1747_1_1527249643928_3696179_180-100713-0_sp1.1 from training. Duration: 0.736375 2023-10-03 02:46:31,799 WARNING [train.py:1204] (2/4) Exclude cut with ID _35799_210_5854_1_1533263487887_3529071_149-49689-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 02:46:32,784 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.0.layers.0.nonlin_attention.whiten2, num_groups=1, num_channels=192, metric=5.13 vs. limit=15.0 2023-10-03 02:46:36,095 WARNING [train.py:1204] (2/4) Exclude cut with ID _67725_210_27767_1_1535608756671_5105359_36-220794-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 02:46:40,889 WARNING [train.py:1204] (2/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_338-96520-0_sp1.1 from training. Duration: 0.8545625 2023-10-03 02:46:40,971 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9104W0004-547160 from training. Duration: 0.896 2023-10-03 02:46:42,763 WARNING [train.py:1204] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_330-327861-0 from training. Duration: 0.67 2023-10-03 02:46:44,076 WARNING [train.py:1204] (2/4) Exclude cut with ID _14087_210_2253_1_1530529224512_3014029_156-257607-0_sp0.9 from training. Duration: 0.92225 2023-10-03 02:46:44,121 WARNING [train.py:1204] (2/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_476-322050-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 02:46:44,131 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9118W0017-553385 from training. Duration: 0.896 2023-10-03 02:46:44,166 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9120W0028-554435 from training. Duration: 0.896 2023-10-03 02:46:48,393 WARNING [train.py:1204] (2/4) Exclude cut with ID _56524_210_9642_1_1534572011779_3619170_145-310842-0 from training. Duration: 0.99 2023-10-03 02:46:49,761 WARNING [train.py:1204] (2/4) Exclude cut with ID _51631_210_8254_1_1534129228420_4084794_270-222508-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 02:46:53,965 WARNING [train.py:1204] (2/4) Exclude cut with ID _14268_210_9206_1_1530622771285_4714099_331-105351-0 from training. Duration: 0.65 2023-10-03 02:46:57,241 WARNING [train.py:1204] (2/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_40-129756-0 from training. Duration: 0.81 2023-10-03 02:47:06,045 WARNING [train.py:1204] (2/4) Exclude cut with ID _3541_210_2477_1_1526475408709_3263973_231-104736-0 from training. Duration: 0.78 2023-10-03 02:47:08,748 WARNING [train.py:1204] (2/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_214-291369-0 from training. Duration: 0.63 2023-10-03 02:47:08,755 WARNING [train.py:1204] (2/4) Exclude cut with ID _62574_210_2655_1_1535180407058_3933249_369-84347-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 02:47:08,806 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9210W0111-599240 from training. Duration: 0.896 2023-10-03 02:47:08,810 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_92-246270-0 from training. Duration: 0.85 2023-10-03 02:47:08,844 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_37152_210_9697_1_1533106573_3844188_29-239070-0 from training. Duration: 0.98 2023-10-03 02:47:10,242 WARNING [train.py:1204] (2/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_239-22076-0 from training. Duration: 0.85 2023-10-03 02:47:10,245 WARNING [train.py:1204] (2/4) Exclude cut with ID _65962_210_29927_1_1535436026519_4174930_368-44639-0_sp1.1 from training. Duration: 0.97275 2023-10-03 02:47:15,337 WARNING [train.py:1204] (2/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_152-212533-0 from training. Duration: 0.97 2023-10-03 02:47:16,860 WARNING [train.py:1204] (2/4) Exclude cut with ID _14603_210_4220_1_1530615688441_3097319_90-31383-0_sp1.1 from training. Duration: 0.7909375 2023-10-03 02:47:20,838 WARNING [train.py:1204] (2/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_196-152868-0_sp1.1 from training. Duration: 0.92725 2023-10-03 02:47:20,840 WARNING [train.py:1204] (2/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_140-181176-0 from training. Duration: 0.92 2023-10-03 02:47:23,662 WARNING [train.py:1204] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_168-32906-0_sp0.9 from training. Duration: 0.7666875 2023-10-03 02:47:26,546 WARNING [train.py:1204] (2/4) Exclude cut with ID _14922_210_6228_1_1530707435273_3693310_13-165512-0 from training. Duration: 0.82 2023-10-03 02:47:26,579 WARNING [train.py:1204] (2/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_381-317791-0_sp0.9 from training. Duration: 0.811125 2023-10-03 02:47:31,200 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.0.feed_forward2.hidden_balancer.prob, batch_count=1110773.3333333333, ans=0.125 2023-10-03 02:47:33,854 WARNING [train.py:1204] (2/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_63-267585-0_sp1.1 from training. Duration: 0.8181875 2023-10-03 02:47:33,856 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_326-303945-0_sp1.1 from training. Duration: 0.9 2023-10-03 02:47:33,873 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_194-143381-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 02:47:35,106 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.618e+02 1.918e+02 2.080e+02 2.358e+02 3.129e+02, threshold=4.160e+02, percent-clipped=0.0 2023-10-03 02:47:35,206 WARNING [train.py:1204] (2/4) Exclude cut with ID _39290_210_4220_1_1533862778760_3075118_194-16667-0_sp1.1 from training. Duration: 0.963625 2023-10-03 02:47:36,529 WARNING [train.py:1204] (2/4) Exclude cut with ID _15012_210_1968_1_1530856820429_5095350_662-210469-0_sp0.9 from training. Duration: 0.6333125 2023-10-03 02:47:36,561 WARNING [train.py:1204] (2/4) Exclude cut with ID _4697_210_2069_1_1526815801953_3823098_68-219807-0_sp1.1 from training. Duration: 0.42725 2023-10-03 02:47:37,801 INFO [train.py:1046] (2/4) Epoch 32, batch 1950, loss[loss=0.1599, simple_loss=0.2455, pruned_loss=0.03709, over 24356.00 frames. ], tot_loss[loss=0.1629, simple_loss=0.2423, pruned_loss=0.04172, over 4739382.06 frames. ], batch size: 77, lr: 3.20e-03, grad_scale: 8.0 2023-10-03 02:47:37,868 WARNING [train.py:1204] (2/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_321-22821-0_sp0.9 from training. Duration: 0.988875 2023-10-03 02:47:39,361 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_14056_210_4163_1_1530604483_7572827_1037-45317-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 02:47:39,363 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_403-39619-0_sp1.1 from training. Duration: 0.763625 2023-10-03 02:47:44,535 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_20120_210_6753_1_1531643035_5482859_510-96996-0_sp0.9 from training. Duration: 0.9666875 2023-10-03 02:47:44,538 WARNING [train.py:1204] (2/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_225-180155-0_sp1.1 from training. Duration: 0.92725 2023-10-03 02:47:44,577 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_278-272612-0_sp0.9 from training. Duration: 0.811125 2023-10-03 02:47:46,002 WARNING [train.py:1204] (2/4) Exclude cut with ID _53713_210_24116_1_1534314487762_7416751_691-70244-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 02:47:48,090 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6788_210_1388_1_1527898698_4562021_91-197981-0_sp1.1 from training. Duration: 0.7545625 2023-10-03 02:47:49,539 WARNING [train.py:1204] (2/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_60-18123-0_sp0.9 from training. Duration: 0.97775 2023-10-03 02:47:50,977 WARNING [train.py:1204] (2/4) Exclude cut with ID _35799_210_5854_1_1533263487887_3529071_5-214617-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 02:47:50,982 WARNING [train.py:1204] (2/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_890-5117-0_sp1.1 from training. Duration: 0.7090625 2023-10-03 02:47:54,825 WARNING [train.py:1204] (2/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_660-211088-0 from training. Duration: 0.62 2023-10-03 02:47:54,892 WARNING [train.py:1204] (2/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_314-126819-0_sp0.9 from training. Duration: 0.5555625 2023-10-03 02:47:54,915 WARNING [train.py:1204] (2/4) Exclude cut with ID _21632_210_2253_1_1531994001765_3744140_362-66225-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 02:47:56,244 WARNING [train.py:1204] (2/4) Exclude cut with ID _61118_210_9527_1_1534897668884_3752630_173-258555-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 02:47:57,728 WARNING [train.py:1204] (2/4) Exclude cut with ID _7719_210_3525_1_1528456153163_4296960_88-250259-0_sp0.9 from training. Duration: 0.9333125 2023-10-03 02:47:59,028 WARNING [train.py:1204] (2/4) Exclude cut with ID _15140_210_9288_1_1530791388450_4880149_539-153708-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 02:47:59,058 WARNING [train.py:1204] (2/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_253-42544-0_sp1.1 from training. Duration: 0.97275 2023-10-03 02:48:00,888 WARNING [train.py:1204] (2/4) Exclude cut with ID _33640_210_2655_1_1533117605479_4855469_479-296877-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 02:48:03,685 WARNING [train.py:1204] (2/4) Exclude cut with ID 3033-130750-0096-55598-0_sp1.1 from training. Duration: 0.7545625 2023-10-03 02:48:03,702 WARNING [train.py:1204] (2/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_191-41023-0_sp1.1 from training. Duration: 0.6454375 2023-10-03 02:48:03,722 WARNING [train.py:1204] (2/4) Exclude cut with ID _8365_210_2064_1_1528592311070_4677690_431-294102-0_sp1.1 from training. Duration: 0.8090625 2023-10-03 02:48:05,021 WARNING [train.py:1204] (2/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_893-296030-0_sp1.1 from training. Duration: 0.97275 2023-10-03 02:48:05,219 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.conv_module2.balancer2.prob, batch_count=1110906.6666666667, ans=0.125 2023-10-03 02:48:07,734 WARNING [train.py:1204] (2/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_401-343820-0_sp1.1 from training. Duration: 0.97275 2023-10-03 02:48:09,302 WARNING [train.py:1204] (2/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_127-566-0_sp1.1 from training. Duration: 0.763625 2023-10-03 02:48:09,303 WARNING [train.py:1204] (2/4) Exclude cut with ID _14100_210_8341_1_1530529320613_3678069_126-272847-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 02:48:09,327 WARNING [train.py:1204] (2/4) Exclude cut with ID _15133_210_6228_1_1530794512074_3080379_29-264458-0_sp1.1 from training. Duration: 0.52725 2023-10-03 02:48:09,327 WARNING [train.py:1204] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_127-119109-0 from training. Duration: 0.72 2023-10-03 02:48:10,717 WARNING [train.py:1204] (2/4) Exclude cut with ID _6537_210_1883_1_1527987559710_3597129_382-133062-0_sp1.1 from training. Duration: 0.6181875 2023-10-03 02:48:10,727 WARNING [train.py:1204] (2/4) Exclude cut with ID _29529_210_7234_1_1532393961455_3977460_391-219682-0_sp1.1 from training. Duration: 0.936375 2023-10-03 02:48:10,772 WARNING [train.py:1204] (2/4) Exclude cut with ID _37537_210_2253_1_1533171758568_3421949_96-137189-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 02:48:14,128 WARNING [train.py:1204] (2/4) Exclude cut with ID _58414_210_8254_1_1535421609162_7151182_15-276301-0_sp1.1 from training. Duration: 0.97275 2023-10-03 02:48:17,260 WARNING [train.py:1204] (2/4) Exclude cut with ID _59502_210_12210_1_1535107024698_1835139_110-289074-0_sp1.1 from training. Duration: 0.92725 2023-10-03 02:48:21,158 WARNING [train.py:1204] (2/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_367-61330-0_sp1.1 from training. Duration: 0.6818125 2023-10-03 02:48:23,826 WARNING [train.py:1204] (2/4) Exclude cut with ID _45297_210_2721_1_1533715351381_3264949_353-9541-0_sp1.1 from training. Duration: 0.8818125 2023-10-03 02:48:23,855 WARNING [train.py:1204] (2/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_85-138257-0_sp0.9 from training. Duration: 0.788875 2023-10-03 02:48:25,157 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_3787_210_2040_1_1526477544_5478586_603-120324-0 from training. Duration: 0.81 2023-10-03 02:48:25,212 WARNING [train.py:1204] (2/4) Exclude cut with ID _42863_210_9070_1_1533902398788_3696920_411-204131-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 02:48:28,116 WARNING [train.py:1204] (2/4) Exclude cut with ID _30196_210_1664_1_1533708496522_7074140_822-289909-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 02:48:30,054 WARNING [train.py:1204] (2/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_32-335103-0_sp0.9 from training. Duration: 0.911125 2023-10-03 02:48:30,318 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.attention_skip_rate, batch_count=1111040.0, ans=0.0 2023-10-03 02:48:31,375 WARNING [train.py:1204] (2/4) Exclude cut with ID _6493_210_1747_1_1528459210424_4012470_513-140967-0_sp0.9 from training. Duration: 0.87775 2023-10-03 02:48:37,061 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_47896_210_20508_1_1534492518_5958868_439-257766-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 02:48:38,455 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_1294-63152-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 02:48:42,932 WARNING [train.py:1204] (2/4) Exclude cut with ID _59170_210_6721_1_1534983633960_4203335_319-166512-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 02:48:44,354 WARNING [train.py:1204] (2/4) Exclude cut with ID _3541_210_2477_1_1526475408709_3263973_146-3470-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 02:48:46,491 WARNING [train.py:1204] (2/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_678-159307-0_sp1.1 from training. Duration: 0.77275 2023-10-03 02:48:46,531 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_343-48822-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 02:48:47,815 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_4692_210_3528_1_1526899755_4323254_107-44401-0 from training. Duration: 0.86 2023-10-03 02:48:47,828 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6291_210_3273_1_1527683649_1078498_74-292608-0_sp1.1 from training. Duration: 0.6545625 2023-10-03 02:48:49,183 WARNING [train.py:1204] (2/4) Exclude cut with ID _55644_210_11856_1_1534593599582_4103719_293-248694-0_sp1.1 from training. Duration: 0.97275 2023-10-03 02:48:49,369 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.feed_forward1.hidden_balancer.prob, batch_count=1111106.6666666667, ans=0.125 2023-10-03 02:48:49,482 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.feed_forward2.hidden_balancer.prob, batch_count=1111106.6666666667, ans=0.125 2023-10-03 02:48:50,480 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_3172_210_2087_1_1526292929_3987865_197-97762-0 from training. Duration: 0.75 2023-10-03 02:48:51,917 INFO [train.py:1046] (2/4) Epoch 32, batch 2000, loss[loss=0.151, simple_loss=0.2225, pruned_loss=0.03977, over 19196.00 frames. ], tot_loss[loss=0.1635, simple_loss=0.2428, pruned_loss=0.04215, over 4719583.23 frames. ], batch size: 42, lr: 3.20e-03, grad_scale: 16.0 2023-10-03 02:48:51,996 WARNING [train.py:1204] (2/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_264-348896-0_sp0.9 from training. Duration: 0.9444375 2023-10-03 02:48:54,808 WARNING [train.py:1204] (2/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_209-205365-0_sp0.9 from training. Duration: 0.87775 2023-10-03 02:48:54,884 WARNING [train.py:1204] (2/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_222-198127-0_sp0.9 from training. Duration: 0.9333125 2023-10-03 02:48:56,177 WARNING [train.py:1204] (2/4) Exclude cut with ID _57572_210_5517_1_1534921219624_3809484_52-43530-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 02:48:58,042 WARNING [train.py:1204] (2/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_96-320736-0_sp1.1 from training. Duration: 0.7818125 2023-10-03 02:48:59,429 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_13091_210_7925_1_1530609447_8987354_1159-225508-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 02:49:02,313 WARNING [train.py:1204] (2/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_237-44332-0 from training. Duration: 0.94 2023-10-03 02:49:03,669 WARNING [train.py:1204] (2/4) Exclude cut with ID _7970_210_1883_1_1528455616296_3472230_25-178407-0_sp0.9 from training. Duration: 0.788875 2023-10-03 02:49:06,426 WARNING [train.py:1204] (2/4) Exclude cut with ID _29003_210_19596_1_1532507391720_3894098_357-257062-0_sp1.1 from training. Duration: 0.936375 2023-10-03 02:49:07,846 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_809-17156-0 from training. Duration: 0.73 2023-10-03 02:49:09,216 WARNING [train.py:1204] (2/4) Exclude cut with ID _7160_210_1876_1_1527940923231_3703799_302-150728-0_sp1.1 from training. Duration: 0.7090625 2023-10-03 02:49:09,224 WARNING [train.py:1204] (2/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_444-28041-0_sp0.9 from training. Duration: 0.9444375 2023-10-03 02:49:15,795 WARNING [train.py:1204] (2/4) Exclude cut with ID _52892_210_6721_1_1534378941259_4124129_278-251666-0_sp1.1 from training. Duration: 0.92725 2023-10-03 02:49:15,882 WARNING [train.py:1204] (2/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_115-326778-0 from training. Duration: 0.93 2023-10-03 02:49:15,976 WARNING [train.py:1204] (2/4) Exclude cut with ID _41391_210_7994_1_1533603771872_3638201_31-188961-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 02:49:17,881 WARNING [train.py:1204] (2/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_1073-6264-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 02:49:17,904 WARNING [train.py:1204] (2/4) Exclude cut with ID _66868_210_9527_1_1535509287954_4561140_435-259515-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 02:49:19,324 WARNING [train.py:1204] (2/4) Exclude cut with ID _14886_210_4220_1_1530702020355_3516190_159-73748-0 from training. Duration: 0.83 2023-10-03 02:49:20,650 WARNING [train.py:1204] (2/4) Exclude cut with ID _15150_210_8341_1_1530752411803_7256384_469-239509-0_sp1.1 from training. Duration: 0.6181875 2023-10-03 02:49:21,427 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.3.encoder.layers.3.whiten, num_groups=1, num_channels=512, metric=3.65 vs. limit=12.0 2023-10-03 02:49:22,119 WARNING [train.py:1204] (2/4) Exclude cut with ID _8064_210_5399_1_1528372299291_4282973_395-340852-0 from training. Duration: 0.93 2023-10-03 02:49:22,122 WARNING [train.py:1204] (2/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_138-187031-0_sp1.1 from training. Duration: 0.9 2023-10-03 02:49:24,873 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_13757_210_3528_1_1530440682_4107239_48-106347-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 02:49:24,936 WARNING [train.py:1204] (2/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_164-224052-0_sp1.1 from training. Duration: 0.636375 2023-10-03 02:49:24,941 WARNING [train.py:1204] (2/4) Exclude cut with ID _59288_210_5854_1_1535077757774_3412246_408-322648-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 02:49:24,992 WARNING [train.py:1204] (2/4) Exclude cut with ID _30191_210_1664_1_1533189915047_7196670_827-36359-0_sp1.1 from training. Duration: 0.963625 2023-10-03 02:49:25,178 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.ff2_skip_rate, batch_count=1111306.6666666667, ans=0.0 2023-10-03 02:49:28,290 WARNING [train.py:1204] (2/4) Exclude cut with ID _4680_210_3525_1_1527249757953_893386_75-78979-0_sp1.1 from training. Duration: 0.82725 2023-10-03 02:49:28,344 WARNING [train.py:1204] (2/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_383-165234-0 from training. Duration: 0.94 2023-10-03 02:49:31,143 WARNING [train.py:1204] (2/4) Exclude cut with ID _19896_210_1883_1_1531709022662_6649569_94-229927-0 from training. Duration: 0.97 2023-10-03 02:49:31,149 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6788_210_1388_1_1527898698_4562021_180-75288-0_sp1.1 from training. Duration: 0.9 2023-10-03 02:49:31,157 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_26433_210_12108_1_1532157158_4056480_54-321349-0_sp1.1 from training. Duration: 0.97275 2023-10-03 02:49:33,167 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.5.encoder.layers.0.feed_forward2.out_whiten, num_groups=1, num_channels=256, metric=7.22 vs. limit=15.0 2023-10-03 02:49:36,770 WARNING [train.py:1204] (2/4) Exclude cut with ID _56108_210_16380_1_1534505446159_3887959_265-161493-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 02:49:38,120 WARNING [train.py:1204] (2/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_395-111602-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 02:49:38,126 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_154-316450-0_sp0.9 from training. Duration: 0.8444375 2023-10-03 02:49:39,394 WARNING [train.py:1204] (2/4) Exclude cut with ID _43552_210_8913_1_1533726031059_3523429_381-116041-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 02:49:41,398 WARNING [train.py:1204] (2/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_65-104124-0_sp1.1 from training. Duration: 0.963625 2023-10-03 02:49:41,445 WARNING [train.py:1204] (2/4) Exclude cut with ID _6536_210_1883_1_1527850783339_3319069_169-23811-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 02:49:42,639 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_289-180904-0_sp0.9 from training. Duration: 0.8444375 2023-10-03 02:49:42,653 WARNING [train.py:1204] (2/4) Exclude cut with ID _56406_210_9288_1_1534899618664_3496149_226-242400-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 02:49:44,157 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_24899_210_16554_1_1532046422_3827462_276-216330-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 02:49:47,634 WARNING [train.py:1204] (2/4) Exclude cut with ID _29345_210_4381_1_1532689168263_3860169_198-174328-0_sp1.1 from training. Duration: 0.82725 2023-10-03 02:49:48,946 WARNING [train.py:1204] (2/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_338-96520-0 from training. Duration: 0.94 2023-10-03 02:49:50,743 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.5.encoder.layers.0.feed_forward1.out_whiten, num_groups=1, num_channels=256, metric=6.84 vs. limit=15.0 2023-10-03 02:49:51,909 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.bypass.scale_min, batch_count=1111440.0, ans=0.2 2023-10-03 02:49:52,959 WARNING [train.py:1204] (2/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_7-111954-0_sp1.1 from training. Duration: 0.5090625 2023-10-03 02:49:54,332 WARNING [train.py:1204] (2/4) Exclude cut with ID _36692_210_5665_1_1533608967420_398809_8-16261-0_sp1.1 from training. Duration: 0.97275 2023-10-03 02:49:56,035 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.conv_skip_rate, batch_count=1111440.0, ans=0.0 2023-10-03 02:49:57,159 WARNING [train.py:1204] (2/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_86-212657-0_sp1.1 from training. Duration: 0.97275 2023-10-03 02:49:57,166 WARNING [train.py:1204] (2/4) Exclude cut with ID _44659_210_2721_1_1534036120061_6614860_626-298884-0_sp1.1 from training. Duration: 0.8818125 2023-10-03 02:49:59,966 WARNING [train.py:1204] (2/4) Exclude cut with ID _49755_210_18053_1_1534581005846_3218709_120-257918-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 02:50:02,004 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.bypass.scale_min, batch_count=1111440.0, ans=0.2 2023-10-03 02:50:03,073 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.550e+02 2.023e+02 2.252e+02 2.571e+02 3.525e+02, threshold=4.504e+02, percent-clipped=0.0 2023-10-03 02:50:03,188 WARNING [train.py:1204] (2/4) Exclude cut with ID _21881_210_3525_1_1531825242757_3798380_400-113821-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 02:50:03,191 WARNING [train.py:1204] (2/4) Exclude cut with ID _21783_210_11357_1_1531742458730_3762679_387-236773-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 02:50:03,262 WARNING [train.py:1204] (2/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_613-232856-0_sp1.1 from training. Duration: 0.4909375 2023-10-03 02:50:04,707 WARNING [train.py:1204] (2/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_181-78632-0_sp0.9 from training. Duration: 0.8666875 2023-10-03 02:50:05,926 INFO [train.py:1046] (2/4) Epoch 32, batch 2050, loss[loss=0.1542, simple_loss=0.2359, pruned_loss=0.03628, over 24458.00 frames. ], tot_loss[loss=0.1635, simple_loss=0.242, pruned_loss=0.04251, over 4709167.33 frames. ], batch size: 63, lr: 3.20e-03, grad_scale: 16.0 2023-10-03 02:50:06,075 WARNING [train.py:1204] (2/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_175-17485-0_sp1.1 from training. Duration: 0.97275 2023-10-03 02:50:07,422 WARNING [train.py:1204] (2/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_1004-183422-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 02:50:08,852 WARNING [train.py:1204] (2/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_32-65962-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 02:50:10,893 WARNING [train.py:1204] (2/4) Exclude cut with ID _15458_210_8341_1_1530858614263_3834410_40-298664-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 02:50:15,780 WARNING [train.py:1204] (2/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_380-261863-0_sp1.1 from training. Duration: 0.963625 2023-10-03 02:50:17,201 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_372-173012-0_sp1.1 from training. Duration: 0.863625 2023-10-03 02:50:18,532 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_57-82205-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 02:50:19,819 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_3691_210_3528_1_1526468169_4062053_355-334432-0_sp0.9 from training. Duration: 0.9666875 2023-10-03 02:50:21,282 WARNING [train.py:1204] (2/4) Exclude cut with ID _19023_210_4353_1_1531378892552_5820780_28-36615-0 from training. Duration: 0.97 2023-10-03 02:50:21,287 WARNING [train.py:1204] (2/4) Exclude cut with ID _29853_210_7497_1_1532505600851_6642110_286-251333-0_sp1.1 from training. Duration: 0.92725 2023-10-03 02:50:22,898 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.nonlin_attention.balancer.min_positive, batch_count=1111573.3333333333, ans=0.05 2023-10-03 02:50:24,057 WARNING [train.py:1204] (2/4) Exclude cut with ID _31602_210_5854_1_1532677021456_4458250_518-80679-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 02:50:25,366 WARNING [train.py:1204] (2/4) Exclude cut with ID _14922_210_6228_1_1530707435273_3693310_38-187851-0_sp0.9 from training. Duration: 0.688875 2023-10-03 02:50:32,316 WARNING [train.py:1204] (2/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_39-248870-0_sp1.1 from training. Duration: 0.62725 2023-10-03 02:50:32,334 WARNING [train.py:1204] (2/4) Exclude cut with ID _16908_210_5724_1_1531119157248_3772700_154-31455-0_sp1.1 from training. Duration: 0.97275 2023-10-03 02:50:35,472 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8453_210_4211_1_1528502940_8562730_347-330673-0 from training. Duration: 0.72 2023-10-03 02:50:35,613 WARNING [train.py:1204] (2/4) Exclude cut with ID _30191_210_1664_1_1533189915047_7196670_674-204247-0_sp1.1 from training. Duration: 0.97275 2023-10-03 02:50:37,056 WARNING [train.py:1204] (2/4) Exclude cut with ID _8833_210_5219_1_1528592665210_7553059_329-125156-0 from training. Duration: 0.82 2023-10-03 02:50:38,394 WARNING [train.py:1204] (2/4) Exclude cut with ID _4779_210_2084_1_1527318022944_3914893_500-285013-0_sp1.1 from training. Duration: 0.62725 2023-10-03 02:50:39,875 WARNING [train.py:1204] (2/4) Exclude cut with ID _14421_210_8750_1_1530604795825_3542290_190-313578-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 02:50:43,430 WARNING [train.py:1204] (2/4) Exclude cut with ID _35799_210_5854_1_1533263487887_3529071_538-273574-0_sp1.1 from training. Duration: 0.936375 2023-10-03 02:50:44,788 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_14928_210_5632_1_1530773461_4714000_454-112198-0_sp1.1 from training. Duration: 0.6 2023-10-03 02:50:44,828 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_468-143126-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 02:50:46,149 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_4528_210_3273_1_1527076654_3276777_258-121681-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 02:50:48,030 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_397-132565-0_sp1.1 from training. Duration: 0.9 2023-10-03 02:50:48,056 WARNING [train.py:1204] (2/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_454-123002-0_sp0.9 from training. Duration: 0.7666875 2023-10-03 02:50:49,631 WARNING [train.py:1204] (2/4) Exclude cut with ID _27052_210_5986_1_1532329197187_3585009_297-86890-0_sp1.1 from training. Duration: 0.936375 2023-10-03 02:50:52,342 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_51225_210_3601_1_1534400634_2327383_55-301844-0_sp1.1 from training. Duration: 0.8454375 2023-10-03 02:50:54,989 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6786_210_1388_1_1527839699_4194495_378-46070-0_sp0.9 from training. Duration: 0.8 2023-10-03 02:50:56,357 WARNING [train.py:1204] (2/4) Exclude cut with ID _42334_210_7770_1_1534206610396_3605880_55-19758-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 02:51:00,514 WARNING [train.py:1204] (2/4) Exclude cut with ID _14884_210_2252_1_1530707459689_5561250_114-201399-0_sp0.9 from training. Duration: 0.8444375 2023-10-03 02:51:05,336 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_99-251314-0_sp0.9 from training. Duration: 0.9666875 2023-10-03 02:51:06,581 WARNING [train.py:1204] (2/4) Exclude cut with ID _26528_210_9070_1_1532570410842_3851310_497-97218-0 from training. Duration: 0.86 2023-10-03 02:51:11,459 WARNING [train.py:1204] (2/4) Exclude cut with ID _55328_210_27767_1_1534580973260_4088170_230-317241-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 02:51:11,591 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.1.bypass.scale_min, batch_count=1111773.3333333333, ans=0.2 2023-10-03 02:51:12,789 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_125-203105-0_sp0.9 from training. Duration: 0.888875 2023-10-03 02:51:14,738 WARNING [train.py:1204] (2/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_55-31904-0_sp1.1 from training. Duration: 0.8 2023-10-03 02:51:16,101 WARNING [train.py:1204] (2/4) Exclude cut with ID _8064_210_5399_1_1528372299291_4282973_18-174647-0 from training. Duration: 0.91 2023-10-03 02:51:19,411 WARNING [train.py:1204] (2/4) Exclude cut with ID IC0165W0005-81726 from training. Duration: 0.952 2023-10-03 02:51:19,411 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_31583_210_3852_1_1532595851_4024413_148-171847-0_sp1.1 from training. Duration: 0.963625 2023-10-03 02:51:20,584 INFO [train.py:1046] (2/4) Epoch 32, batch 2100, loss[loss=0.1453, simple_loss=0.1959, pruned_loss=0.0474, over 19321.00 frames. ], tot_loss[loss=0.1623, simple_loss=0.2403, pruned_loss=0.04214, over 4702567.58 frames. ], batch size: 388, lr: 3.19e-03, grad_scale: 8.0 2023-10-03 02:51:20,663 WARNING [train.py:1204] (2/4) Exclude cut with ID _21989_210_5854_1_1531899214931_3271360_116-138769-0_sp1.1 from training. Duration: 0.936375 2023-10-03 02:51:20,708 WARNING [train.py:1204] (2/4) Exclude cut with ID _66070_210_7640_1_1535441393096_7805310_489-292631-0_sp1.1 from training. Duration: 0.7181875 2023-10-03 02:51:22,105 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_20120_210_6753_1_1531643035_5482859_202-27960-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 02:51:22,113 WARNING [train.py:1204] (2/4) Exclude cut with ID _5294_210_1747_1_1527249643928_3696179_342-270278-0 from training. Duration: 0.55 2023-10-03 02:51:22,149 WARNING [train.py:1204] (2/4) Exclude cut with ID _38323_210_12108_1_1533121233369_3303049_416-11919-0 from training. Duration: 0.96 2023-10-03 02:51:24,718 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6545_210_2040_1_1527774670_4245258_407-48070-0_sp0.9 from training. Duration: 0.8444375 2023-10-03 02:51:27,481 WARNING [train.py:1204] (2/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_503-30719-0_sp1.1 from training. Duration: 0.8909375 2023-10-03 02:51:27,548 WARNING [train.py:1204] (2/4) Exclude cut with ID _49643_210_7989_1_1534640244224_3939031_234-326497-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 02:51:27,825 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.conv_module1.balancer2.prob, batch_count=1111840.0, ans=0.125 2023-10-03 02:51:30,293 WARNING [train.py:1204] (2/4) Exclude cut with ID _14170_210_5219_1_1530525949868_4469650_301-77873-0_sp1.1 from training. Duration: 0.963625 2023-10-03 02:51:31,650 WARNING [train.py:1204] (2/4) Exclude cut with ID _45297_210_2721_1_1533715351381_3264949_152-154697-0_sp1.1 from training. Duration: 0.97275 2023-10-03 02:51:31,657 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_3371_210_1794_1_1526384462_5580777_396-339812-0 from training. Duration: 0.85 2023-10-03 02:51:31,754 WARNING [train.py:1204] (2/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_399-156791-0_sp1.1 from training. Duration: 0.7545625 2023-10-03 02:51:32,611 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.0.layers.1.feed_forward1.out_whiten, num_groups=1, num_channels=192, metric=4.64 vs. limit=15.0 2023-10-03 02:51:32,955 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7696_210_2040_1_1528207167_3716099_117-125434-0 from training. Duration: 0.76 2023-10-03 02:51:32,959 WARNING [train.py:1204] (2/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_96-244299-0 from training. Duration: 0.88 2023-10-03 02:51:34,407 WARNING [train.py:1204] (2/4) Exclude cut with ID _37892_210_19596_1_1533198677897_3964058_519-343462-0_sp1.1 from training. Duration: 0.92725 2023-10-03 02:51:36,159 WARNING [train.py:1204] (2/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_202-183922-0_sp1.1 from training. Duration: 0.77275 2023-10-03 02:51:36,165 WARNING [train.py:1204] (2/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_453-133983-0 from training. Duration: 0.78 2023-10-03 02:51:36,206 WARNING [train.py:1204] (2/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_178-39722-0_sp1.1 from training. Duration: 0.4454375 2023-10-03 02:51:42,111 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_15126_210_6753_1_1530778677_2591185_113-157651-0 from training. Duration: 0.68 2023-10-03 02:51:42,112 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_271-234962-0_sp1.1 from training. Duration: 0.7181875 2023-10-03 02:51:44,171 WARNING [train.py:1204] (2/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_195-64967-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 02:51:45,502 WARNING [train.py:1204] (2/4) Exclude cut with ID _43552_210_8913_1_1533726031059_3523429_365-215065-0_sp1.1 from training. Duration: 0.936375 2023-10-03 02:51:45,854 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.conv_skip_rate, batch_count=1111906.6666666667, ans=0.0 2023-10-03 02:51:47,046 WARNING [train.py:1204] (2/4) Exclude cut with ID _31602_210_5854_1_1532677021456_4458250_249-96604-0_sp0.9 from training. Duration: 0.988875 2023-10-03 02:51:48,889 WARNING [train.py:1204] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_307-158462-0 from training. Duration: 0.69 2023-10-03 02:51:50,363 WARNING [train.py:1204] (2/4) Exclude cut with ID _30918_210_6400_1_1532754078600_3125920_407-223394-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 02:51:50,367 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6673_210_3273_1_1527767790_3173240_351-167954-0_sp0.9 from training. Duration: 0.5333125 2023-10-03 02:51:51,705 WARNING [train.py:1204] (2/4) Exclude cut with ID _4866_210_2577_1_1527404412284_4364659_256-306830-0 from training. Duration: 0.87 2023-10-03 02:51:53,053 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_17613_210_2151_1_1531137569_2366160_222-131118-0_sp1.1 from training. Duration: 0.963625 2023-10-03 02:51:53,071 WARNING [train.py:1204] (2/4) Exclude cut with ID _45560_210_21982_1_1533812603951_3584999_111-6376-0 from training. Duration: 0.84 2023-10-03 02:51:53,091 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_216-73841-0 from training. Duration: 0.96 2023-10-03 02:51:53,134 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_14058_210_4163_1_1530690953_6327180_49-210687-0 from training. Duration: 0.94 2023-10-03 02:51:56,041 WARNING [train.py:1204] (2/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_506-42614-0_sp0.9 from training. Duration: 0.87775 2023-10-03 02:51:57,446 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_3133_210_2087_1_1526106446_2324690_25-332138-0_sp1.1 from training. Duration: 0.82725 2023-10-03 02:51:58,965 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.0.conv_module1.balancer2.min_abs, batch_count=1111973.3333333333, ans=0.5 2023-10-03 02:52:00,139 WARNING [train.py:1204] (2/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_589-9070-0_sp1.1 from training. Duration: 0.6454375 2023-10-03 02:52:00,253 WARNING [train.py:1204] (2/4) Exclude cut with ID _6194_210_1534_1_1527511826160_3941194_14-282876-0_sp1.1 from training. Duration: 0.6454375 2023-10-03 02:52:01,775 WARNING [train.py:1204] (2/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_1166-90654-0_sp1.1 from training. Duration: 0.92725 2023-10-03 02:52:04,547 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_14056_210_4163_1_1530604483_7572827_218-255858-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 02:52:04,548 WARNING [train.py:1204] (2/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_1285-256622-0 from training. Duration: 0.84 2023-10-03 02:52:04,568 WARNING [train.py:1204] (2/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_205-286261-0_sp1.1 from training. Duration: 0.963625 2023-10-03 02:52:04,592 WARNING [train.py:1204] (2/4) Exclude cut with ID _68777_210_4353_1_1535810390731_3117904_6-154119-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 02:52:06,364 WARNING [train.py:1204] (2/4) Exclude cut with ID _22982_210_1883_1_1531882790447_5449409_565-199393-0_sp1.1 from training. Duration: 0.92725 2023-10-03 02:52:06,381 WARNING [train.py:1204] (2/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_278-154492-0 from training. Duration: 0.76 2023-10-03 02:52:09,518 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_426-313557-0 from training. Duration: 0.53 2023-10-03 02:52:09,582 WARNING [train.py:1204] (2/4) Exclude cut with ID _41817_210_18835_1_1533434477160_7423049_555-293448-0 from training. Duration: 0.8 2023-10-03 02:52:14,215 WARNING [train.py:1204] (2/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_32-335103-0_sp1.1 from training. Duration: 0.7454375 2023-10-03 02:52:16,916 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_2655_210_2087_1_1525688959_3897400_202-147557-0_sp1.1 from training. Duration: 0.77275 2023-10-03 02:52:18,022 WARNING [train.py:1204] (2/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_158-250116-0 from training. Duration: 0.62 2023-10-03 02:52:24,019 WARNING [train.py:1204] (2/4) Exclude cut with ID _55429_210_10626_1_1534467588527_7174143_528-88083-0_sp1.1 from training. Duration: 0.963625 2023-10-03 02:52:26,726 WARNING [train.py:1204] (2/4) Exclude cut with ID _8030_210_7497_1_1528444886781_3531089_32-31837-0_sp1.1 from training. Duration: 0.8818125 2023-10-03 02:52:26,755 WARNING [train.py:1204] (2/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_744-86168-0_sp1.1 from training. Duration: 0.97275 2023-10-03 02:52:26,764 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1113-7369-0_sp0.9 from training. Duration: 0.9444375 2023-10-03 02:52:26,780 WARNING [train.py:1204] (2/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_124-345840-0_sp0.9 from training. Duration: 0.57775 2023-10-03 02:52:26,832 WARNING [train.py:1204] (2/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_328-264776-0_sp0.9 from training. Duration: 0.7444375 2023-10-03 02:52:28,248 WARNING [train.py:1204] (2/4) Exclude cut with ID _18895_210_6228_1_1531357274234_3784930_469-166615-0_sp1.1 from training. Duration: 0.963625 2023-10-03 02:52:29,535 WARNING [train.py:1204] (2/4) Exclude cut with ID _8395_210_1531_1_1528597871523_4411579_431-132470-0_sp0.9 from training. Duration: 0.82225 2023-10-03 02:52:29,615 WARNING [train.py:1204] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_554-45617-0_sp1.1 from training. Duration: 0.9 2023-10-03 02:52:29,637 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_35361_210_4238_1_1533262810_7793476_524-188242-0_sp1.1 from training. Duration: 0.92725 2023-10-03 02:52:30,983 WARNING [train.py:1204] (2/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_938-205135-0 from training. Duration: 0.87 2023-10-03 02:52:32,422 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_5931_210_2040_1_1527428481_4547999_321-193614-0 from training. Duration: 0.67 2023-10-03 02:52:33,714 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.605e+02 1.908e+02 2.112e+02 2.525e+02 3.507e+02, threshold=4.223e+02, percent-clipped=0.0 2023-10-03 02:52:33,739 INFO [train.py:1046] (2/4) Epoch 32, batch 2150, loss[loss=0.1555, simple_loss=0.2294, pruned_loss=0.04083, over 23781.00 frames. ], tot_loss[loss=0.1614, simple_loss=0.2391, pruned_loss=0.04182, over 4704509.12 frames. ], batch size: 164, lr: 3.19e-03, grad_scale: 4.0 2023-10-03 02:52:33,805 WARNING [train.py:1204] (2/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_445-269521-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 02:52:35,257 WARNING [train.py:1204] (2/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_3-33116-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 02:52:35,258 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_4633_210_2040_1_1526824163_4145343_416-76117-0_sp1.1 from training. Duration: 0.863625 2023-10-03 02:52:35,287 WARNING [train.py:1204] (2/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_1033-274376-0_sp1.1 from training. Duration: 0.7909375 2023-10-03 02:52:36,601 WARNING [train.py:1204] (2/4) Exclude cut with ID _8772_210_1850_1_1528635595972_3664011_398-320049-0_sp0.9 from training. Duration: 0.97775 2023-10-03 02:52:43,360 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.bypass_mid.scale_min, batch_count=1112173.3333333333, ans=0.2 2023-10-03 02:52:44,546 WARNING [train.py:1204] (2/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_321-215742-0_sp0.9 from training. Duration: 0.5555625 2023-10-03 02:52:46,747 WARNING [train.py:1204] (2/4) Exclude cut with ID _28394_210_8913_1_1532430049518_3650118_272-190950-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 02:52:48,074 WARNING [train.py:1204] (2/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_31-299371-0_sp1.1 from training. Duration: 0.92725 2023-10-03 02:52:49,543 WARNING [train.py:1204] (2/4) Exclude cut with ID _55383_210_10635_1_1534468426172_2231669_134-128246-0_sp0.9 from training. Duration: 0.988875 2023-10-03 02:52:49,545 WARNING [train.py:1204] (2/4) Exclude cut with ID _40519_210_13794_1_1533715228655_7337390_887-9547-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 02:52:49,581 WARNING [train.py:1204] (2/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_310-109461-0_sp0.9 from training. Duration: 0.888875 2023-10-03 02:52:52,326 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_2009_210_2071_1_1525481134_4588937_258-250196-0_sp1.1 from training. Duration: 0.92725 2023-10-03 02:52:52,389 WARNING [train.py:1204] (2/4) Exclude cut with ID _9235_210_5219_1_1528891154253_8046120_1186-72702-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 02:52:52,391 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_14831_210_7927_1_1530700200_5583610_181-27467-0_sp1.1 from training. Duration: 0.82725 2023-10-03 02:52:52,701 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.self_attn_weights.pos_emb_skip_rate, batch_count=1112240.0, ans=0.0 2023-10-03 02:52:55,627 WARNING [train.py:1204] (2/4) Exclude cut with ID _40402_210_18219_1_1533522324115_2953150_278-132109-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 02:52:55,662 WARNING [train.py:1204] (2/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_261-297503-0 from training. Duration: 0.99 2023-10-03 02:53:00,869 WARNING [train.py:1204] (2/4) Exclude cut with ID _19896_210_1883_1_1531709022662_6649569_313-265903-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 02:53:02,280 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_105-114797-0_sp1.1 from training. Duration: 0.763625 2023-10-03 02:53:03,698 WARNING [train.py:1204] (2/4) Exclude cut with ID _53755_210_8254_1_1534577312941_4707360_9-65843-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 02:53:03,728 WARNING [train.py:1204] (2/4) Exclude cut with ID _18516_210_9206_1_1531704653206_3692406_361-224549-0_sp1.1 from training. Duration: 0.97275 2023-10-03 02:53:03,762 WARNING [train.py:1204] (2/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_403-310975-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 02:53:05,108 WARNING [train.py:1204] (2/4) Exclude cut with ID _5444_210_2151_1_1527418808464_5312210_581-315411-0_sp0.9 from training. Duration: 0.688875 2023-10-03 02:53:05,158 WARNING [train.py:1204] (2/4) Exclude cut with ID _23825_210_13449_1_1531915313526_3497150_464-154904-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 02:53:05,166 WARNING [train.py:1204] (2/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_937-208281-0_sp0.9 from training. Duration: 0.9444375 2023-10-03 02:53:06,466 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_37839_210_7234_1_1533542462_3689303_158-181212-0_sp1.1 from training. Duration: 0.963625 2023-10-03 02:53:06,551 WARNING [train.py:1204] (2/4) Exclude cut with ID _8772_210_1850_1_1528635595972_3664011_398-320049-0 from training. Duration: 0.88 2023-10-03 02:53:08,548 WARNING [train.py:1204] (2/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_718-250907-0_sp1.1 from training. Duration: 0.62725 2023-10-03 02:53:09,886 WARNING [train.py:1204] (2/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_641-126395-0_sp1.1 from training. Duration: 0.92725 2023-10-03 02:53:11,187 WARNING [train.py:1204] (2/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_166-241493-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 02:53:11,274 WARNING [train.py:1204] (2/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_526-62079-0_sp0.9 from training. Duration: 0.7444375 2023-10-03 02:53:13,215 WARNING [train.py:1204] (2/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_439-275083-0_sp1.1 from training. Duration: 0.936375 2023-10-03 02:53:17,563 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_66907_210_21581_1_1535615661_3590177_493-229032-0_sp1.1 from training. Duration: 0.92725 2023-10-03 02:53:17,590 WARNING [train.py:1204] (2/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_89-321955-0_sp1.1 from training. Duration: 0.72725 2023-10-03 02:53:17,908 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=1112373.3333333333, ans=0.1 2023-10-03 02:53:18,839 WARNING [train.py:1204] (2/4) Exclude cut with ID _29857_210_20034_1_1532395886753_3798160_273-226195-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 02:53:18,846 WARNING [train.py:1204] (2/4) Exclude cut with ID _22826_210_9994_1_1531908201522_3279169_404-75889-0 from training. Duration: 0.89 2023-10-03 02:53:18,887 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_271-234962-0_sp0.9 from training. Duration: 0.87775 2023-10-03 02:53:21,733 WARNING [train.py:1204] (2/4) Exclude cut with ID _22961_210_8254_1_1531976366672_4278180_156-339685-0_sp1.1 from training. Duration: 0.97275 2023-10-03 02:53:21,779 WARNING [train.py:1204] (2/4) Exclude cut with ID _37892_210_19596_1_1533198677897_3964058_425-231927-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 02:53:23,163 WARNING [train.py:1204] (2/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_102-288670-0_sp1.1 from training. Duration: 0.97275 2023-10-03 02:53:24,514 WARNING [train.py:1204] (2/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_283-220372-0_sp1.1 from training. Duration: 0.5090625 2023-10-03 02:53:24,597 WARNING [train.py:1204] (2/4) Exclude cut with ID _44304_210_15374_1_1533639352765_4613129_86-1699-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 02:53:25,993 WARNING [train.py:1204] (2/4) Exclude cut with ID _2891_210_2550_1_1525874398521_4427710_327-121590-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 02:53:26,003 WARNING [train.py:1204] (2/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_369-222060-0 from training. Duration: 0.71 2023-10-03 02:53:27,960 WARNING [train.py:1204] (2/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_762-109886-0 from training. Duration: 0.91 2023-10-03 02:53:29,258 WARNING [train.py:1204] (2/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_477-255782-0_sp0.9 from training. Duration: 0.911125 2023-10-03 02:53:29,298 WARNING [train.py:1204] (2/4) Exclude cut with ID IC0669W0061-330906 from training. Duration: 0.768 2023-10-03 02:53:29,348 WARNING [train.py:1204] (2/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_347-213071-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 02:53:29,384 WARNING [train.py:1204] (2/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_507-303370-0_sp1.1 from training. Duration: 0.87275 2023-10-03 02:53:30,782 WARNING [train.py:1204] (2/4) Exclude cut with ID _15012_210_1968_1_1530856820429_5095350_688-178849-0 from training. Duration: 0.64 2023-10-03 02:53:30,788 WARNING [train.py:1204] (2/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_28-70987-0_sp0.9 from training. Duration: 0.9666875 2023-10-03 02:53:30,791 WARNING [train.py:1204] (2/4) Exclude cut with ID _5910_210_1389_1_1527337972454_5050100_555-159815-0 from training. Duration: 0.98 2023-10-03 02:53:30,820 WARNING [train.py:1204] (2/4) Exclude cut with ID IC0679W0064-335871 from training. Duration: 0.768 2023-10-03 02:53:30,820 WARNING [train.py:1204] (2/4) Exclude cut with ID IC0679W0079-335886 from training. Duration: 0.896 2023-10-03 02:53:32,171 WARNING [train.py:1204] (2/4) Exclude cut with ID _5608_210_2106_1_1527415439089_6819768_909-82767-0 from training. Duration: 0.61 2023-10-03 02:53:33,650 WARNING [train.py:1204] (2/4) Exclude cut with ID _23038_210_9570_1_1532001584158_3652419_411-323967-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 02:53:33,705 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_40896_210_15710_1_1533433956_7100802_350-231748-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 02:53:33,715 WARNING [train.py:1204] (2/4) Exclude cut with ID _5289_210_2084_1_1527678202736_3779299_142-103317-0_sp0.9 from training. Duration: 0.9333125 2023-10-03 02:53:35,038 WARNING [train.py:1204] (2/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_314-144055-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 02:53:36,235 WARNING [train.py:1204] (2/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_177-236809-0_sp0.9 from training. Duration: 0.5444375 2023-10-03 02:53:37,647 WARNING [train.py:1204] (2/4) Exclude cut with ID _23038_210_9570_1_1532001584158_3652419_82-334733-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 02:53:37,661 WARNING [train.py:1204] (2/4) Exclude cut with ID _43552_210_8913_1_1533726031059_3523429_225-22526-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 02:53:46,755 WARNING [train.py:1204] (2/4) Exclude cut with ID _30327_210_16049_1_1532484161295_3924169_401-70819-0_sp1.1 from training. Duration: 0.7818125 2023-10-03 02:53:48,049 INFO [train.py:1046] (2/4) Epoch 32, batch 2200, loss[loss=0.1734, simple_loss=0.2496, pruned_loss=0.04854, over 23501.00 frames. ], tot_loss[loss=0.1617, simple_loss=0.2399, pruned_loss=0.04172, over 4720662.55 frames. ], batch size: 120, lr: 3.19e-03, grad_scale: 8.0 2023-10-03 02:53:48,146 WARNING [train.py:1204] (2/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_538-32291-0 from training. Duration: 0.83 2023-10-03 02:53:51,052 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_37357_210_17024_1_1533089504_2316121_258-211605-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 02:53:53,962 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.0.bypass.skip_rate, batch_count=1112506.6666666667, ans=0.035 2023-10-03 02:53:56,445 WARNING [train.py:1204] (2/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_550-335198-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 02:53:56,508 WARNING [train.py:1204] (2/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_98-741-0_sp0.9 from training. Duration: 0.888875 2023-10-03 02:53:57,738 WARNING [train.py:1204] (2/4) Exclude cut with ID _38323_210_12108_1_1533121233369_3303049_251-238988-0_sp1.1 from training. Duration: 0.92725 2023-10-03 02:53:58,591 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.2.encoder.layers.0.conv_module2.whiten, num_groups=1, num_channels=384, metric=5.04 vs. limit=15.0 2023-10-03 02:53:59,556 WARNING [train.py:1204] (2/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_259-34038-0_sp0.9 from training. Duration: 0.82225 2023-10-03 02:54:02,357 WARNING [train.py:1204] (2/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_1060-180633-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 02:54:02,394 WARNING [train.py:1204] (2/4) Exclude cut with ID _8755_210_1876_1_1528628559357_3258209_211-37473-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 02:54:02,399 WARNING [train.py:1204] (2/4) Exclude cut with ID _4122_210_2084_1_1526713164417_4353280_528-206771-0 from training. Duration: 0.88 2023-10-03 02:54:06,604 WARNING [train.py:1204] (2/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_64-248359-0 from training. Duration: 0.69 2023-10-03 02:54:08,016 WARNING [train.py:1204] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_402-253027-0_sp1.1 from training. Duration: 0.4909375 2023-10-03 02:54:14,115 WARNING [train.py:1204] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_625-102955-0 from training. Duration: 0.77 2023-10-03 02:54:17,611 WARNING [train.py:1204] (2/4) Exclude cut with ID _14998_210_5219_1_1530705582080_7474970_611-219324-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 02:54:18,989 WARNING [train.py:1204] (2/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_718-68505-0_sp0.9 from training. Duration: 0.988875 2023-10-03 02:54:20,307 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_353-182855-0_sp1.1 from training. Duration: 0.87275 2023-10-03 02:54:21,817 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=1112640.0, ans=0.1 2023-10-03 02:54:23,010 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_5960_210_1794_1_1527403865_3990999_515-2613-0_sp0.9 from training. Duration: 0.97775 2023-10-03 02:54:24,333 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_5575_210_1410_1_1527301305_2570170_342-265572-0 from training. Duration: 0.94 2023-10-03 02:54:24,600 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.feed_forward1.hidden_balancer.prob, batch_count=1112640.0, ans=0.125 2023-10-03 02:54:27,133 WARNING [train.py:1204] (2/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_329-113173-0_sp0.9 from training. Duration: 0.688875 2023-10-03 02:54:28,494 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_15396_210_6890_1_1531308473_1938936_213-312506-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 02:54:28,552 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_521-214276-0_sp1.1 from training. Duration: 0.4 2023-10-03 02:54:30,234 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.conv_skip_rate, batch_count=1112706.6666666667, ans=0.0 2023-10-03 02:54:31,385 WARNING [train.py:1204] (2/4) Exclude cut with ID _15026_210_8750_1_1530777602316_3291850_211-195043-0_sp1.1 from training. Duration: 0.72725 2023-10-03 02:54:34,700 WARNING [train.py:1204] (2/4) Exclude cut with ID _29343_210_4381_1_1532602757752_3674109_108-257494-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 02:54:35,988 WARNING [train.py:1204] (2/4) Exclude cut with ID _64414_210_17301_1_1535243252048_3757759_403-58151-0_sp1.1 from training. Duration: 0.963625 2023-10-03 02:54:36,086 WARNING [train.py:1204] (2/4) Exclude cut with ID _36512_210_6987_1_1533033143998_3126069_109-177787-0_sp1.1 from training. Duration: 0.92725 2023-10-03 02:54:38,850 WARNING [train.py:1204] (2/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_498-40153-0 from training. Duration: 0.63 2023-10-03 02:54:40,153 WARNING [train.py:1204] (2/4) Exclude cut with ID _56447_210_1389_1_1535074245910_3697189_161-334523-0_sp1.1 from training. Duration: 0.936375 2023-10-03 02:54:40,260 WARNING [train.py:1204] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_238-245655-0 from training. Duration: 0.81 2023-10-03 02:54:43,594 WARNING [train.py:1204] (2/4) Exclude cut with ID _64631_210_27767_1_1535349580068_4165160_108-43865-0_sp1.1 from training. Duration: 0.92725 2023-10-03 02:54:43,599 WARNING [train.py:1204] (2/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_243-42116-0_sp1.1 from training. Duration: 0.663625 2023-10-03 02:54:43,629 WARNING [train.py:1204] (2/4) Exclude cut with ID _66868_210_9527_1_1535509287954_4561140_594-132160-0_sp1.1 from training. Duration: 0.92725 2023-10-03 02:54:45,674 WARNING [train.py:1204] (2/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_48-338976-0_sp0.9 from training. Duration: 0.988875 2023-10-03 02:54:46,975 WARNING [train.py:1204] (2/4) Exclude cut with ID _5250_210_5219_1_1527249561806_8692110_137-100595-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 02:54:46,981 WARNING [train.py:1204] (2/4) Exclude cut with ID _49748_210_24116_1_1533986971737_3158910_12-271669-0_sp1.1 from training. Duration: 0.936375 2023-10-03 02:54:47,000 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_37357_210_17024_1_1533089504_2316121_206-80421-0_sp1.1 from training. Duration: 0.936375 2023-10-03 02:54:48,296 WARNING [train.py:1204] (2/4) Exclude cut with ID _7537_210_2084_1_1528502395431_3924359_113-199933-0_sp1.1 from training. Duration: 0.536375 2023-10-03 02:54:50,217 WARNING [train.py:1204] (2/4) Exclude cut with ID _55440_210_12651_1_1534496713364_2980520_31-59289-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 02:54:51,625 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1083-220122-0_sp0.9 from training. Duration: 0.5666875 2023-10-03 02:54:51,748 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=1112773.3333333333, ans=0.1 2023-10-03 02:54:53,148 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=1112773.3333333333, ans=0.1 2023-10-03 02:54:54,402 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_426-313557-0_sp1.1 from training. Duration: 0.4818125 2023-10-03 02:54:55,781 WARNING [train.py:1204] (2/4) Exclude cut with ID _35799_210_5854_1_1533263487887_3529071_324-94843-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 02:54:58,584 WARNING [train.py:1204] (2/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_264-108685-0_sp1.1 from training. Duration: 0.5 2023-10-03 02:54:58,683 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9012W0006-501141 from training. Duration: 0.896 2023-10-03 02:55:00,046 WARNING [train.py:1204] (2/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_890-5117-0_sp0.9 from training. Duration: 0.8666875 2023-10-03 02:55:01,288 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.569e+02 1.834e+02 1.970e+02 2.169e+02 2.586e+02, threshold=3.939e+02, percent-clipped=0.0 2023-10-03 02:55:01,314 INFO [train.py:1046] (2/4) Epoch 32, batch 2250, loss[loss=0.1741, simple_loss=0.258, pruned_loss=0.04509, over 23287.00 frames. ], tot_loss[loss=0.1622, simple_loss=0.2408, pruned_loss=0.04184, over 4729684.31 frames. ], batch size: 93, lr: 3.19e-03, grad_scale: 8.0 2023-10-03 02:55:01,397 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9023W0008-506301 from training. Duration: 0.896 2023-10-03 02:55:03,152 WARNING [train.py:1204] (2/4) Exclude cut with ID _13916_210_9994_1_1530788341126_3763149_60-191556-0_sp0.9 from training. Duration: 0.67775 2023-10-03 02:55:03,207 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9029W0331-509721 from training. Duration: 0.896 2023-10-03 02:55:04,765 WARNING [train.py:1204] (2/4) Exclude cut with ID _35823_210_6917_1_1533187881914_7297830_567-108596-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 02:55:04,813 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7543_210_6753_1_1528544010_1110402_25-183860-0_sp1.1 from training. Duration: 0.42725 2023-10-03 02:55:06,077 WARNING [train.py:1204] (2/4) Exclude cut with ID _47278_210_12041_1_1533902418349_3576110_495-260035-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 02:55:07,525 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9051W0015-520281 from training. Duration: 0.9045625 2023-10-03 02:55:07,633 WARNING [train.py:1204] (2/4) Exclude cut with ID _14459_210_2228_1_1531443641946_7516700_707-66193-0_sp1.1 from training. Duration: 0.97275 2023-10-03 02:55:10,364 WARNING [train.py:1204] (2/4) Exclude cut with ID _14087_210_2253_1_1530529224512_3014029_219-224445-0_sp1.1 from training. Duration: 0.763625 2023-10-03 02:55:16,624 WARNING [train.py:1204] (2/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_345-167986-0_sp0.9 from training. Duration: 0.9333125 2023-10-03 02:55:17,999 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_34815_210_6388_1_1533381320_3571903_279-305565-0_sp1.1 from training. Duration: 0.836375 2023-10-03 02:55:21,259 WARNING [train.py:1204] (2/4) Exclude cut with ID _21987_210_5854_1_1531821500482_3637038_317-179212-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 02:55:22,606 WARNING [train.py:1204] (2/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_395-37222-0_sp1.1 from training. Duration: 0.6909375 2023-10-03 02:55:22,679 WARNING [train.py:1204] (2/4) Exclude cut with ID _8064_210_5399_1_1528372299291_4282973_168-50337-0_sp1.1 from training. Duration: 0.763625 2023-10-03 02:55:25,458 WARNING [train.py:1204] (2/4) Exclude cut with ID _60113_210_16271_1_1535164212481_4386079_559-167191-0 from training. Duration: 0.93 2023-10-03 02:55:26,704 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_47228_210_4983_1_1533780021_3672027_344-179642-0_sp1.1 from training. Duration: 0.92725 2023-10-03 02:55:26,742 WARNING [train.py:1204] (2/4) Exclude cut with ID _22961_210_8254_1_1531976366672_4278180_317-199546-0_sp0.9 from training. Duration: 0.9555625 2023-10-03 02:55:28,197 WARNING [train.py:1204] (2/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_141-89727-0 from training. Duration: 0.71 2023-10-03 02:55:29,649 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_78-116075-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 02:55:29,660 WARNING [train.py:1204] (2/4) Exclude cut with ID _64086_210_7994_1_1535270417019_7435949_52-235756-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 02:55:31,093 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7615_210_4211_1_1528110965_4580160_72-235211-0_sp1.1 from training. Duration: 0.6909375 2023-10-03 02:55:31,373 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.bypass.scale_min, batch_count=1112973.3333333333, ans=0.2 2023-10-03 02:55:35,752 WARNING [train.py:1204] (2/4) Exclude cut with ID _14069_210_5632_1_1530512692033_4803390_287-27950-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 02:55:35,843 WARNING [train.py:1204] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_74-48717-0_sp0.9 from training. Duration: 0.6444375 2023-10-03 02:55:37,160 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7544_210_6753_1_1528591327_1471426_172-348777-0_sp1.1 from training. Duration: 0.6 2023-10-03 02:55:37,265 WARNING [train.py:1204] (2/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_220-191080-0 from training. Duration: 0.98 2023-10-03 02:55:38,660 WARNING [train.py:1204] (2/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_1081-137641-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 02:55:41,338 WARNING [train.py:1204] (2/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_278-235459-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 02:55:46,364 WARNING [train.py:1204] (2/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_1181-77470-0_sp1.1 from training. Duration: 0.936375 2023-10-03 02:55:47,049 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.4.encoder.layers.1.nonlin_attention.whiten2, num_groups=1, num_channels=384, metric=3.30 vs. limit=15.0 2023-10-03 02:55:47,805 WARNING [train.py:1204] (2/4) Exclude cut with ID _60860_210_8254_1_1534896015875_3557169_198-5531-0_sp1.1 from training. Duration: 0.936375 2023-10-03 02:55:49,179 WARNING [train.py:1204] (2/4) Exclude cut with ID _12369_210_4381_1_1530356078581_4388520_17-289781-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 02:55:49,184 WARNING [train.py:1204] (2/4) Exclude cut with ID _50310_210_15488_1_1534068043345_3641080_146-199752-0_sp1.1 from training. Duration: 0.92725 2023-10-03 02:55:52,508 WARNING [train.py:1204] (2/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_969-213354-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 02:55:53,874 WARNING [train.py:1204] (2/4) Exclude cut with ID _70519_210_4163_1_1535961595994_7458230_66-344156-0_sp1.1 from training. Duration: 0.963625 2023-10-03 02:55:54,031 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.conv_module1.balancer2.prob, batch_count=1113040.0, ans=0.125 2023-10-03 02:55:58,063 WARNING [train.py:1204] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_380-204756-0_sp1.1 from training. Duration: 0.7818125 2023-10-03 02:55:59,541 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_128-202104-0_sp0.9 from training. Duration: 0.7 2023-10-03 02:56:03,735 WARNING [train.py:1204] (2/4) Exclude cut with ID _4697_210_2069_1_1526815801953_3823098_51-1049-0_sp0.9 from training. Duration: 0.8333125 2023-10-03 02:56:03,753 WARNING [train.py:1204] (2/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_86-66192-0_sp1.1 from training. Duration: 0.5 2023-10-03 02:56:03,814 WARNING [train.py:1204] (2/4) Exclude cut with ID _14069_210_5632_1_1530512692033_4803390_95-1896-0_sp1.1 from training. Duration: 0.8909375 2023-10-03 02:56:07,289 WARNING [train.py:1204] (2/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_29-293896-0_sp1.1 from training. Duration: 0.5545625 2023-10-03 02:56:08,867 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.conv_module1.balancer1.prob, batch_count=1113106.6666666667, ans=0.125 2023-10-03 02:56:11,182 WARNING [train.py:1204] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_167-137255-0_sp0.9 from training. Duration: 0.711125 2023-10-03 02:56:11,182 WARNING [train.py:1204] (2/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_217-95056-0 from training. Duration: 0.98 2023-10-03 02:56:11,196 WARNING [train.py:1204] (2/4) Exclude cut with ID _47278_210_12041_1_1533902418349_3576110_97-266746-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 02:56:11,242 WARNING [train.py:1204] (2/4) Exclude cut with ID _14224_210_4381_1_1530529062037_4264010_380-343670-0_sp1.1 from training. Duration: 0.8 2023-10-03 02:56:14,406 WARNING [train.py:1204] (2/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_107-332398-0 from training. Duration: 0.78 2023-10-03 02:56:15,766 INFO [train.py:1046] (2/4) Epoch 32, batch 2300, loss[loss=0.1445, simple_loss=0.2257, pruned_loss=0.03162, over 20790.00 frames. ], tot_loss[loss=0.1633, simple_loss=0.2417, pruned_loss=0.04247, over 4712385.82 frames. ], batch size: 45, lr: 3.19e-03, grad_scale: 8.0 2023-10-03 02:56:19,156 WARNING [train.py:1204] (2/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_91-267696-0_sp1.1 from training. Duration: 0.7909375 2023-10-03 02:56:19,176 WARNING [train.py:1204] (2/4) Exclude cut with ID _41298_210_9019_1_1533430371450_4822143_498-186993-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 02:56:23,483 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.0.layers.0.self_attn_weights.whiten_keys, num_groups=4, num_channels=128, metric=2.76 vs. limit=6.0 2023-10-03 02:56:25,357 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.conv_skip_rate, batch_count=1113173.3333333333, ans=0.0 2023-10-03 02:56:26,459 WARNING [train.py:1204] (2/4) Exclude cut with ID _17956_210_4353_1_1531209568125_6979970_54-77610-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 02:56:26,490 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_40047_210_2290_1_1533290519_2374311_257-335162-0_sp1.1 from training. Duration: 0.863625 2023-10-03 02:56:27,996 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9653W0193-675471 from training. Duration: 0.9656875 2023-10-03 02:56:30,630 WARNING [train.py:1204] (2/4) Exclude cut with ID _25248_210_8750_1_1532062825941_5554180_243-240536-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 02:56:35,011 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_32360_210_4198_1_1532689160_3648436_60-51908-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 02:56:35,026 WARNING [train.py:1204] (2/4) Exclude cut with ID _4092_210_2151_1_1526643044449_3135759_227-213972-0_sp0.9 from training. Duration: 0.6 2023-10-03 02:56:36,360 WARNING [train.py:1204] (2/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_392-173500-0_sp1.1 from training. Duration: 0.97275 2023-10-03 02:56:36,404 WARNING [train.py:1204] (2/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_71-329150-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 02:56:36,406 WARNING [train.py:1204] (2/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_866-173440-0 from training. Duration: 0.9 2023-10-03 02:56:38,200 WARNING [train.py:1204] (2/4) Exclude cut with ID _14832_210_8750_1_1530691351539_3171348_1-54315-0_sp0.9 from training. Duration: 0.9666875 2023-10-03 02:56:40,858 WARNING [train.py:1204] (2/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_430-116139-0_sp1.1 from training. Duration: 0.77275 2023-10-03 02:56:40,893 WARNING [train.py:1204] (2/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_356-45054-0_sp0.9 from training. Duration: 0.9555625 2023-10-03 02:56:44,266 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_3531_210_3528_1_1526294203_5216456_309-227302-0_sp0.9 from training. Duration: 0.7666875 2023-10-03 02:56:48,243 WARNING [train.py:1204] (2/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_301-330455-0_sp1.1 from training. Duration: 0.72725 2023-10-03 02:56:53,408 WARNING [train.py:1204] (2/4) Exclude cut with ID _23557_210_8750_1_1531890106370_6846255_667-145945-0_sp1.1 from training. Duration: 0.92725 2023-10-03 02:56:55,046 WARNING [train.py:1204] (2/4) Exclude cut with ID _14860_210_4915_1_1530702020340_7343240_836-328489-0_sp1.1 from training. Duration: 0.8181875 2023-10-03 02:56:56,395 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_37152_210_9697_1_1533106573_3844188_416-333249-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 02:56:59,176 WARNING [train.py:1204] (2/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_538-32291-0_sp0.9 from training. Duration: 0.92225 2023-10-03 02:57:01,831 WARNING [train.py:1204] (2/4) Exclude cut with ID _32704_210_8954_1_1533017488840_6747370_965-176824-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 02:57:03,311 WARNING [train.py:1204] (2/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_86-19601-0_sp1.1 from training. Duration: 0.77275 2023-10-03 02:57:04,709 WARNING [train.py:1204] (2/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_436-318113-0_sp1.1 from training. Duration: 0.6818125 2023-10-03 02:57:04,755 WARNING [train.py:1204] (2/4) Exclude cut with ID _41817_210_18835_1_1533434477160_7423049_555-293448-0_sp0.9 from training. Duration: 0.888875 2023-10-03 02:57:04,770 WARNING [train.py:1204] (2/4) Exclude cut with ID _4697_210_2069_1_1526815801953_3823098_68-219807-0 from training. Duration: 0.47 2023-10-03 02:57:06,747 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.nonlin_attention.whiten1.whitening_limit, batch_count=1113373.3333333333, ans=10.0 2023-10-03 02:57:08,972 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7544_210_6753_1_1528591327_1471426_33-346031-0_sp1.1 from training. Duration: 0.5545625 2023-10-03 02:57:08,974 WARNING [train.py:1204] (2/4) Exclude cut with ID _49748_210_24116_1_1533986971737_3158910_263-52113-0_sp1.1 from training. Duration: 0.97275 2023-10-03 02:57:10,710 WARNING [train.py:1204] (2/4) Exclude cut with ID _64639_210_5816_1_1535367619437_3528000_340-24580-0_sp1.1 from training. Duration: 0.92725 2023-10-03 02:57:10,724 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_47228_210_4983_1_1533780021_3672027_479-114941-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 02:57:10,746 WARNING [train.py:1204] (2/4) Exclude cut with ID _44585_210_18628_1_1533605350387_4518199_506-119659-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 02:57:11,073 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.conv_skip_rate, batch_count=1113373.3333333333, ans=0.0 2023-10-03 02:57:12,473 WARNING [train.py:1204] (2/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_314-126819-0_sp1.1 from training. Duration: 0.4545625 2023-10-03 02:57:12,477 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_2541_210_1794_1_1525590269_3084765_156-173713-0_sp0.9 from training. Duration: 0.9 2023-10-03 02:57:13,845 WARNING [train.py:1204] (2/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_108-255430-0 from training. Duration: 0.97 2023-10-03 02:57:13,852 WARNING [train.py:1204] (2/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_791-45149-0_sp1.1 from training. Duration: 0.7818125 2023-10-03 02:57:13,857 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_53378_210_17282_1_1534227054_1261425_10-118295-0_sp1.1 from training. Duration: 0.97275 2023-10-03 02:57:13,907 WARNING [train.py:1204] (2/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_148-158509-0 from training. Duration: 0.41 2023-10-03 02:57:21,461 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_34726_210_4525_1_1533019474_5030395_687-291305-0_sp1.1 from training. Duration: 0.936375 2023-10-03 02:57:23,538 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.conv_skip_rate, batch_count=1113440.0, ans=0.0 2023-10-03 02:57:26,095 WARNING [train.py:1204] (2/4) Exclude cut with ID _14860_210_4915_1_1530702020340_7343240_264-324537-0_sp1.1 from training. Duration: 0.963625 2023-10-03 02:57:28,888 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_41251_210_15368_1_1533429767_4820695_591-46386-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 02:57:30,093 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.534e+02 1.927e+02 2.149e+02 2.530e+02 4.352e+02, threshold=4.298e+02, percent-clipped=1.0 2023-10-03 02:57:30,118 INFO [train.py:1046] (2/4) Epoch 32, batch 2350, loss[loss=0.1784, simple_loss=0.2547, pruned_loss=0.05104, over 23391.00 frames. ], tot_loss[loss=0.1639, simple_loss=0.2424, pruned_loss=0.04267, over 4713141.17 frames. ], batch size: 119, lr: 3.19e-03, grad_scale: 8.0 2023-10-03 02:57:30,165 WARNING [train.py:1204] (2/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_86-19601-0_sp0.9 from training. Duration: 0.9444375 2023-10-03 02:57:30,185 WARNING [train.py:1204] (2/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_401-255491-0_sp0.9 from training. Duration: 0.77775 2023-10-03 02:57:31,655 WARNING [train.py:1204] (2/4) Exclude cut with ID _8696_210_1876_1_1528545632440_3345058_209-54069-0_sp1.1 from training. Duration: 0.6090625 2023-10-03 02:57:31,663 WARNING [train.py:1204] (2/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_369-325184-0_sp1.1 from training. Duration: 0.9 2023-10-03 02:57:31,732 WARNING [train.py:1204] (2/4) Exclude cut with ID _14687_210_5854_1_1530703838259_3664889_115-239046-0_sp0.9 from training. Duration: 0.7444375 2023-10-03 02:57:31,772 WARNING [train.py:1204] (2/4) Exclude cut with ID _8064_210_5399_1_1528372299291_4282973_168-50337-0 from training. Duration: 0.84 2023-10-03 02:57:37,383 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.0.conv_module1.balancer1.prob, batch_count=1113506.6666666667, ans=0.125 2023-10-03 02:57:39,849 WARNING [train.py:1204] (2/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_909-64151-0_sp1.1 from training. Duration: 0.8545625 2023-10-03 02:57:39,859 WARNING [train.py:1204] (2/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_597-154640-0 from training. Duration: 0.9 2023-10-03 02:57:44,717 WARNING [train.py:1204] (2/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_126-274694-0 from training. Duration: 0.98 2023-10-03 02:57:46,165 WARNING [train.py:1204] (2/4) Exclude cut with ID _21783_210_11357_1_1531742458730_3762679_344-269786-0_sp1.1 from training. Duration: 0.97275 2023-10-03 02:57:47,669 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_37818_210_4238_1_1533620910_4720083_322-253154-0_sp1.1 from training. Duration: 0.92725 2023-10-03 02:57:47,672 WARNING [train.py:1204] (2/4) Exclude cut with ID _58895_210_8750_1_1535184081320_3649089_221-15823-0_sp1.1 from training. Duration: 0.92725 2023-10-03 02:57:47,681 WARNING [train.py:1204] (2/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_9-21737-0_sp1.1 from training. Duration: 0.8818125 2023-10-03 02:57:47,718 WARNING [train.py:1204] (2/4) Exclude cut with ID _14069_210_5632_1_1530512692033_4803390_522-341588-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 02:57:47,947 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.bypass.scale_min, batch_count=1113573.3333333333, ans=0.2 2023-10-03 02:57:49,688 WARNING [train.py:1204] (2/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_363-185703-0 from training. Duration: 0.95 2023-10-03 02:57:52,972 WARNING [train.py:1204] (2/4) Exclude cut with ID _41817_210_18835_1_1533434477160_7423049_265-76216-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 02:57:56,932 WARNING [train.py:1204] (2/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_841-341679-0 from training. Duration: 0.58 2023-10-03 02:58:00,121 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.4.encoder.layers.2.nonlin_attention.whiten1, num_groups=1, num_channels=288, metric=4.04 vs. limit=10.0 2023-10-03 02:58:00,882 WARNING [train.py:1204] (2/4) Exclude cut with ID _55429_210_10626_1_1534467588527_7174143_97-335084-0_sp1.1 from training. Duration: 0.8818125 2023-10-03 02:58:03,790 WARNING [train.py:1204] (2/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_690-126306-0_sp0.9 from training. Duration: 0.8666875 2023-10-03 02:58:03,799 WARNING [train.py:1204] (2/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_102-92751-0_sp1.1 from training. Duration: 0.9 2023-10-03 02:58:05,232 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_3787_210_2040_1_1526477544_5478586_603-120324-0_sp1.1 from training. Duration: 0.736375 2023-10-03 02:58:06,672 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_33348_210_5632_1_1533086912_3324030_85-28583-0 from training. Duration: 0.82 2023-10-03 02:58:06,721 WARNING [train.py:1204] (2/4) Exclude cut with ID _60057_210_10640_1_1534903065793_3333150_362-230377-0_sp1.1 from training. Duration: 0.7909375 2023-10-03 02:58:08,176 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.conv_module1.balancer1.min_positive, batch_count=1113640.0, ans=0.025 2023-10-03 02:58:09,935 WARNING [train.py:1204] (2/4) Exclude cut with ID _60113_210_16271_1_1535164212481_4386079_194-146486-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 02:58:09,948 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_53753_210_7234_1_1534320003_3594148_171-277668-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 02:58:11,255 WARNING [train.py:1204] (2/4) Exclude cut with ID _34423_210_8750_1_1533430798456_3330199_251-332788-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 02:58:13,124 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.4.encoder.layers.1.conv_module1.whiten, num_groups=1, num_channels=384, metric=4.98 vs. limit=15.0 2023-10-03 02:58:15,781 WARNING [train.py:1204] (2/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_663-166973-0_sp0.9 from training. Duration: 0.888875 2023-10-03 02:58:16,092 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.attention_skip_rate, batch_count=1113706.6666666667, ans=0.0 2023-10-03 02:58:17,246 WARNING [train.py:1204] (2/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_927-43827-0 from training. Duration: 0.74 2023-10-03 02:58:17,290 WARNING [train.py:1204] (2/4) Exclude cut with ID _28323_210_16187_1_1532480395667_4100300_306-74561-0_sp1.1 from training. Duration: 0.8545625 2023-10-03 02:58:18,772 WARNING [train.py:1204] (2/4) Exclude cut with ID _15037_210_2253_1_1530752294104_3267650_139-337989-0_sp1.1 from training. Duration: 0.92725 2023-10-03 02:58:20,243 WARNING [train.py:1204] (2/4) Exclude cut with ID _49039_210_6315_1_1533967537541_3846118_460-263670-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 02:58:22,756 WARNING [train.py:1204] (2/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_186-309161-0 from training. Duration: 0.54 2023-10-03 02:58:23,021 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.self_attn_weights.pos_emb_skip_rate, batch_count=1113706.6666666667, ans=0.0 2023-10-03 02:58:24,023 WARNING [train.py:1204] (2/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_87-278686-0_sp1.1 from training. Duration: 0.7 2023-10-03 02:58:24,910 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.2.encoder.layers.0.self_attn2.whiten, num_groups=1, num_channels=384, metric=24.55 vs. limit=22.5 2023-10-03 02:58:26,807 WARNING [train.py:1204] (2/4) Exclude cut with ID _27052_210_5986_1_1532329197187_3585009_314-106499-0 from training. Duration: 0.92 2023-10-03 02:58:26,825 WARNING [train.py:1204] (2/4) Exclude cut with ID _3465_210_3525_1_1526175667060_3574049_412-287526-0_sp0.9 from training. Duration: 0.8 2023-10-03 02:58:27,066 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.nonlin_attention.balancer.prob, batch_count=1113706.6666666667, ans=0.125 2023-10-03 02:58:30,937 WARNING [train.py:1204] (2/4) Exclude cut with ID _22961_210_8254_1_1531976366672_4278180_317-199546-0 from training. Duration: 0.86 2023-10-03 02:58:33,738 WARNING [train.py:1204] (2/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_381-317791-0 from training. Duration: 0.73 2023-10-03 02:58:35,179 WARNING [train.py:1204] (2/4) Exclude cut with ID _44010_210_4525_1_1533884262964_4013620_560-310411-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 02:58:35,191 WARNING [train.py:1204] (2/4) Exclude cut with ID _5294_210_1747_1_1527249643928_3696179_342-270278-0_sp0.9 from training. Duration: 0.611125 2023-10-03 02:58:35,216 WARNING [train.py:1204] (2/4) Exclude cut with ID ID0473W0002-918951 from training. Duration: 0.924 2023-10-03 02:58:35,234 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1083-220122-0 from training. Duration: 0.51 2023-10-03 02:58:37,924 WARNING [train.py:1204] (2/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_138-154812-0 from training. Duration: 0.78 2023-10-03 02:58:41,105 WARNING [train.py:1204] (2/4) Exclude cut with ID _44982_210_1511_1_1534312820108_3726029_331-19420-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 02:58:43,816 INFO [train.py:1046] (2/4) Epoch 32, batch 2400, loss[loss=0.1748, simple_loss=0.2437, pruned_loss=0.0529, over 23774.00 frames. ], tot_loss[loss=0.1636, simple_loss=0.2417, pruned_loss=0.04272, over 4708071.13 frames. ], batch size: 212, lr: 3.19e-03, grad_scale: 16.0 2023-10-03 02:58:43,995 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_2655_210_2087_1_1525688959_3897400_202-147557-0_sp0.9 from training. Duration: 0.9444375 2023-10-03 02:58:48,625 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_23850_210_4238_1_1532232597_1632433_32-185135-0_sp0.9 from training. Duration: 0.9555625 2023-10-03 02:58:50,071 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_14056_210_4163_1_1530604483_7572827_46-93547-0_sp1.1 from training. Duration: 0.863625 2023-10-03 02:58:50,763 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7931_210_1794_1_1528594728_5564059_280-279560-0 from training. Duration: 0.98 2023-10-03 02:58:51,868 WARNING [train.py:1204] (2/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_937-208281-0 from training. Duration: 0.85 2023-10-03 02:58:53,933 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.conv_module1.balancer2.prob, batch_count=1113840.0, ans=0.125 2023-10-03 02:58:55,187 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.conv_module2.balancer2.prob, batch_count=1113840.0, ans=0.125 2023-10-03 02:58:57,045 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.3.encoder.layers.1.conv_module1.whiten, num_groups=1, num_channels=512, metric=9.19 vs. limit=15.0 2023-10-03 02:58:58,083 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.self_attn_weights.pos_emb_skip_rate, batch_count=1113906.6666666667, ans=0.0 2023-10-03 02:58:59,307 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_14928_210_5632_1_1530773461_4714000_454-112198-0_sp0.9 from training. Duration: 0.7333125 2023-10-03 02:58:59,308 WARNING [train.py:1204] (2/4) Exclude cut with ID _32704_210_8954_1_1533017488840_6747370_944-153333-0_sp1.1 from training. Duration: 0.963625 2023-10-03 02:59:00,760 WARNING [train.py:1204] (2/4) Exclude cut with ID _14821_210_8750_1_1530680503991_6829359_2-227151-0 from training. Duration: 0.83 2023-10-03 02:59:02,070 WARNING [train.py:1204] (2/4) Exclude cut with ID _28461_210_2253_1_1532771775189_3041299_96-152158-0_sp1.1 from training. Duration: 0.77275 2023-10-03 02:59:02,139 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7837_210_1792_1_1528453445_5841305_654-54195-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 02:59:02,177 WARNING [train.py:1204] (2/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_344-152526-0 from training. Duration: 0.53 2023-10-03 02:59:02,793 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.4.encoder.layers.0.conv_module2.whiten, num_groups=1, num_channels=384, metric=7.42 vs. limit=15.0 2023-10-03 02:59:07,803 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_30167_210_3528_1_1532428012_4772375_480-142230-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 02:59:10,545 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_160-218721-0 from training. Duration: 0.96 2023-10-03 02:59:14,160 WARNING [train.py:1204] (2/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_65-88242-0_sp0.9 from training. Duration: 0.62225 2023-10-03 02:59:18,817 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_34886_210_10633_1_1533203882_1755661_76-156096-0 from training. Duration: 0.87 2023-10-03 02:59:20,314 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.0.ff2_skip_rate, batch_count=1113973.3333333333, ans=0.0 2023-10-03 02:59:21,995 WARNING [train.py:1204] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_188-38388-0_sp1.1 from training. Duration: 0.92725 2023-10-03 02:59:24,124 WARNING [train.py:1204] (2/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_81-289335-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 02:59:28,316 WARNING [train.py:1204] (2/4) Exclude cut with ID _59493_210_17282_1_1535078419077_3225879_110-8778-0_sp1.1 from training. Duration: 0.936375 2023-10-03 02:59:28,360 WARNING [train.py:1204] (2/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_188-260011-0 from training. Duration: 0.73 2023-10-03 02:59:28,394 WARNING [train.py:1204] (2/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_453-133983-0_sp0.9 from training. Duration: 0.8666875 2023-10-03 02:59:36,573 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8054_210_7927_1_1528627721_4984094_553-17379-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 02:59:36,843 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.feed_forward3.hidden_balancer.prob, batch_count=1114040.0, ans=0.125 2023-10-03 02:59:38,155 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.ff3_skip_rate, batch_count=1114040.0, ans=0.0 2023-10-03 02:59:39,321 WARNING [train.py:1204] (2/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_341-171072-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 02:59:40,849 WARNING [train.py:1204] (2/4) Exclude cut with ID _30800_210_22382_1_1532499347904_4515959_265-257286-0_sp1.1 from training. Duration: 0.963625 2023-10-03 02:59:42,236 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_403-39619-0_sp0.9 from training. Duration: 0.9333125 2023-10-03 02:59:42,240 WARNING [train.py:1204] (2/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_464-33931-0_sp0.9 from training. Duration: 0.6 2023-10-03 02:59:42,259 WARNING [train.py:1204] (2/4) Exclude cut with ID _60804_210_9642_1_1534831199859_7268299_632-143863-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 02:59:42,265 WARNING [train.py:1204] (2/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_431-310997-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 02:59:44,166 WARNING [train.py:1204] (2/4) Exclude cut with ID _31649_210_6228_1_1532584608063_4091078_114-263860-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 02:59:44,177 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6961_210_2087_1_1527937168_6815011_562-165518-0_sp0.9 from training. Duration: 0.8555625 2023-10-03 02:59:45,866 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.bypass.scale_min, batch_count=1114106.6666666667, ans=0.2 2023-10-03 02:59:48,331 WARNING [train.py:1204] (2/4) Exclude cut with ID _27052_210_5986_1_1532329197187_3585009_338-325870-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 02:59:49,575 WARNING [train.py:1204] (2/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_469-94673-0_sp1.1 from training. Duration: 0.7181875 2023-10-03 02:59:50,207 WARNING [train.py:1204] (2/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_259-34038-0 from training. Duration: 0.74 2023-10-03 02:59:51,333 WARNING [train.py:1204] (2/4) Exclude cut with ID _14922_210_6228_1_1530707435273_3693310_388-129552-0 from training. Duration: 0.96 2023-10-03 02:59:53,242 WARNING [train.py:1204] (2/4) Exclude cut with ID _28394_210_8913_1_1532430049518_3650118_342-251455-0_sp1.1 from training. Duration: 0.936375 2023-10-03 02:59:53,273 WARNING [train.py:1204] (2/4) Exclude cut with ID _53731_210_7994_1_1534553979244_3757214_8-191390-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 02:59:54,618 WARNING [train.py:1204] (2/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_613-232856-0 from training. Duration: 0.54 2023-10-03 02:59:55,979 WARNING [train.py:1204] (2/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_557-349534-0 from training. Duration: 0.91 2023-10-03 02:59:55,987 WARNING [train.py:1204] (2/4) Exclude cut with ID _14224_210_4381_1_1530529062037_4264010_379-184223-0 from training. Duration: 0.85 2023-10-03 02:59:55,991 WARNING [train.py:1204] (2/4) Exclude cut with ID IC0125W0001-61807 from training. Duration: 0.966 2023-10-03 02:59:56,071 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_229-45300-0 from training. Duration: 0.57 2023-10-03 02:59:57,832 WARNING [train.py:1204] (2/4) Exclude cut with ID _30304_210_6319_1_1532476827261_7582060_1069-24743-0_sp1.1 from training. Duration: 0.92725 2023-10-03 02:59:59,077 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.535e+02 1.835e+02 1.995e+02 2.171e+02 3.013e+02, threshold=3.990e+02, percent-clipped=0.0 2023-10-03 02:59:59,102 INFO [train.py:1046] (2/4) Epoch 32, batch 2450, loss[loss=0.1531, simple_loss=0.2404, pruned_loss=0.03287, over 24485.00 frames. ], tot_loss[loss=0.1622, simple_loss=0.2398, pruned_loss=0.04229, over 4697174.64 frames. ], batch size: 66, lr: 3.19e-03, grad_scale: 16.0 2023-10-03 02:59:59,208 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_41452_210_7291_1_1533437687_3941281_323-172873-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 03:00:00,596 WARNING [train.py:1204] (2/4) Exclude cut with ID _37086_210_9038_1_1532999540891_7021250_802-226709-0_sp1.1 from training. Duration: 0.97275 2023-10-03 03:00:01,355 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.2.encoder.layers.2.whiten, num_groups=1, num_channels=384, metric=3.49 vs. limit=12.0 2023-10-03 03:00:01,968 WARNING [train.py:1204] (2/4) Exclude cut with ID IC0144W0500-71767 from training. Duration: 0.931875 2023-10-03 03:00:02,034 WARNING [train.py:1204] (2/4) Exclude cut with ID _39846_210_12651_1_1533605310265_2115212_233-129699-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 03:00:03,206 WARNING [train.py:1204] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_574-45541-0_sp0.9 from training. Duration: 0.72225 2023-10-03 03:00:04,778 WARNING [train.py:1204] (2/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_372-298093-0_sp1.1 from training. Duration: 0.72725 2023-10-03 03:00:06,165 WARNING [train.py:1204] (2/4) Exclude cut with ID _62574_210_2655_1_1535180407058_3933249_34-26343-0_sp1.1 from training. Duration: 0.97275 2023-10-03 03:00:09,002 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_512-256238-0_sp1.1 from training. Duration: 0.963625 2023-10-03 03:00:09,007 WARNING [train.py:1204] (2/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_196-55502-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 03:00:09,100 WARNING [train.py:1204] (2/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_195-167011-0 from training. Duration: 0.75 2023-10-03 03:00:13,300 WARNING [train.py:1204] (2/4) Exclude cut with ID _13678_210_7497_1_1530530069226_3463170_345-230416-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 03:00:13,310 WARNING [train.py:1204] (2/4) Exclude cut with ID _51078_210_9070_1_1534161557424_3805250_225-327809-0_sp1.1 from training. Duration: 0.963625 2023-10-03 03:00:18,441 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.feed_forward2.hidden_balancer.prob, batch_count=1114240.0, ans=0.125 2023-10-03 03:00:19,527 WARNING [train.py:1204] (2/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_18-189198-0_sp1.1 from training. Duration: 0.8181875 2023-10-03 03:00:19,540 WARNING [train.py:1204] (2/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_329-82235-0_sp1.1 from training. Duration: 0.7454375 2023-10-03 03:00:19,550 WARNING [train.py:1204] (2/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_726-270780-0_sp0.9 from training. Duration: 0.9555625 2023-10-03 03:00:19,578 WARNING [train.py:1204] (2/4) Exclude cut with ID _66873_210_5517_1_1535454136506_4293370_654-52713-0 from training. Duration: 0.77 2023-10-03 03:00:23,661 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.feed_forward3.hidden_balancer.prob, batch_count=1114240.0, ans=0.125 2023-10-03 03:00:25,313 WARNING [train.py:1204] (2/4) Exclude cut with ID _20455_210_5219_1_1531565996775_8098100_191-16006-0_sp1.1 from training. Duration: 0.963625 2023-10-03 03:00:26,728 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_3171_210_2087_1_1526109084_2330007_66-260583-0_sp1.1 from training. Duration: 0.6545625 2023-10-03 03:00:26,792 WARNING [train.py:1204] (2/4) Exclude cut with ID _38670_210_15379_1_1533448810659_4391059_63-129006-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 03:00:30,823 WARNING [train.py:1204] (2/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_393-57888-0_sp0.9 from training. Duration: 0.711125 2023-10-03 03:00:30,866 WARNING [train.py:1204] (2/4) Exclude cut with ID _6275_210_2084_1_1528110062213_7669049_857-298942-0_sp0.9 from training. Duration: 0.9444375 2023-10-03 03:00:32,343 WARNING [train.py:1204] (2/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_239-22076-0_sp0.9 from training. Duration: 0.9444375 2023-10-03 03:00:32,384 WARNING [train.py:1204] (2/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_424-227947-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 03:00:35,059 WARNING [train.py:1204] (2/4) Exclude cut with ID _14069_210_5632_1_1530512692033_4803390_95-1896-0 from training. Duration: 0.98 2023-10-03 03:00:35,135 WARNING [train.py:1204] (2/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_96-244299-0_sp0.9 from training. Duration: 0.97775 2023-10-03 03:00:43,482 WARNING [train.py:1204] (2/4) Exclude cut with ID _46683_210_9642_1_1533730913492_4591116_562-319825-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 03:00:44,829 WARNING [train.py:1204] (2/4) Exclude cut with ID _64639_210_5816_1_1535367619437_3528000_284-173474-0_sp1.1 from training. Duration: 0.963625 2023-10-03 03:00:44,854 WARNING [train.py:1204] (2/4) Exclude cut with ID _29529_210_7234_1_1532393961455_3977460_497-100133-0_sp1.1 from training. Duration: 0.936375 2023-10-03 03:00:46,155 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_12868_210_6753_1_1530701932_530275_18-76322-0_sp0.9 from training. Duration: 0.9666875 2023-10-03 03:00:46,187 WARNING [train.py:1204] (2/4) Exclude cut with ID _50093_210_4220_1_1534417141207_3432299_116-303447-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 03:00:46,291 WARNING [train.py:1204] (2/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_44-79427-0_sp1.1 from training. Duration: 0.8909375 2023-10-03 03:00:48,139 WARNING [train.py:1204] (2/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_887-249555-0 from training. Duration: 0.77 2023-10-03 03:00:51,049 WARNING [train.py:1204] (2/4) Exclude cut with ID _28461_210_2253_1_1532771775189_3041299_96-152158-0_sp0.9 from training. Duration: 0.9444375 2023-10-03 03:00:51,088 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_14058_210_4163_1_1530690953_6327180_49-210687-0_sp1.1 from training. Duration: 0.8545625 2023-10-03 03:00:54,710 WARNING [train.py:1204] (2/4) Exclude cut with ID _40399_210_4353_1_1533305658194_3279406_351-127903-0_sp1.1 from training. Duration: 0.97275 2023-10-03 03:00:54,715 WARNING [train.py:1204] (2/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_184-206939-0_sp1.1 from training. Duration: 0.936375 2023-10-03 03:01:00,754 WARNING [train.py:1204] (2/4) Exclude cut with ID _14100_210_8341_1_1530529320613_3678069_258-224599-0_sp1.1 from training. Duration: 0.7 2023-10-03 03:01:00,769 WARNING [train.py:1204] (2/4) Exclude cut with ID _6493_210_1747_1_1528459210424_4012470_324-323854-0 from training. Duration: 0.92 2023-10-03 03:01:00,851 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_12868_210_6753_1_1530702479_3610564_151-325626-0_sp1.1 from training. Duration: 0.8818125 2023-10-03 03:01:02,121 WARNING [train.py:1204] (2/4) Exclude cut with ID _14528_210_8254_1_1530669700514_4306939_106-43145-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 03:01:02,135 WARNING [train.py:1204] (2/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_686-54383-0 from training. Duration: 0.87 2023-10-03 03:01:02,183 WARNING [train.py:1204] (2/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_801-102362-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 03:01:03,581 WARNING [train.py:1204] (2/4) Exclude cut with ID _48677_210_5663_1_1533902405919_3892989_408-40740-0_sp0.9 from training. Duration: 0.92225 2023-10-03 03:01:06,433 WARNING [train.py:1204] (2/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_448-212135-0_sp1.1 from training. Duration: 0.82725 2023-10-03 03:01:07,913 WARNING [train.py:1204] (2/4) Exclude cut with ID _59170_210_6721_1_1534983633960_4203335_320-124047-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 03:01:09,312 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_4692_210_3528_1_1526899755_4323254_107-44401-0_sp1.1 from training. Duration: 0.7818125 2023-10-03 03:01:12,253 WARNING [train.py:1204] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_731-2428-0 from training. Duration: 0.83 2023-10-03 03:01:13,439 INFO [train.py:1046] (2/4) Epoch 32, batch 2500, loss[loss=0.1688, simple_loss=0.2443, pruned_loss=0.04667, over 23485.00 frames. ], tot_loss[loss=0.1617, simple_loss=0.2394, pruned_loss=0.04205, over 4700084.72 frames. ], batch size: 134, lr: 3.19e-03, grad_scale: 8.0 2023-10-03 03:01:13,516 WARNING [train.py:1204] (2/4) Exclude cut with ID _29857_210_20034_1_1532395886753_3798160_335-163562-0_sp1.1 from training. Duration: 0.77275 2023-10-03 03:01:18,261 WARNING [train.py:1204] (2/4) Exclude cut with ID _14832_210_8750_1_1530691351539_3171348_171-275004-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 03:01:20,392 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.0.ff2_skip_rate, batch_count=1114506.6666666667, ans=0.0 2023-10-03 03:01:26,831 WARNING [train.py:1204] (2/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_105-328174-0_sp0.9 from training. Duration: 0.8444375 2023-10-03 03:01:26,856 WARNING [train.py:1204] (2/4) Exclude cut with ID _68965_210_1883_1_1535698962272_3497490_164-118421-0_sp1.1 from training. Duration: 0.936375 2023-10-03 03:01:26,945 WARNING [train.py:1204] (2/4) Exclude cut with ID _19896_210_1883_1_1531709022662_6649569_380-24170-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 03:01:27,375 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.5.encoder.layers.1.self_attn_weights.whiten_keys, num_groups=4, num_channels=128, metric=1.92 vs. limit=6.0 2023-10-03 03:01:28,201 WARNING [train.py:1204] (2/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_505-205270-0_sp1.1 from training. Duration: 0.3454375 2023-10-03 03:01:31,155 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.conv_skip_rate, batch_count=1114573.3333333333, ans=0.0 2023-10-03 03:01:35,131 WARNING [train.py:1204] (2/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_427-208901-0_sp1.1 from training. Duration: 0.6181875 2023-10-03 03:01:36,563 WARNING [train.py:1204] (2/4) Exclude cut with ID _23818_210_6228_1_1531917063884_3676920_85-326916-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 03:01:36,666 WARNING [train.py:1204] (2/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_101-293367-0_sp1.1 from training. Duration: 0.57275 2023-10-03 03:01:36,674 WARNING [train.py:1204] (2/4) Exclude cut with ID _5409_210_1388_1_1527292850189_5865191_178-127970-0_sp1.1 from training. Duration: 0.5181875 2023-10-03 03:01:38,009 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_403-39619-0 from training. Duration: 0.84 2023-10-03 03:01:39,351 WARNING [train.py:1204] (2/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_427-87841-0_sp1.1 from training. Duration: 0.963625 2023-10-03 03:01:40,695 WARNING [train.py:1204] (2/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_286-68181-0_sp1.1 from training. Duration: 0.97275 2023-10-03 03:01:40,751 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6673_210_3273_1_1527767790_3173240_327-306185-0 from training. Duration: 0.83 2023-10-03 03:01:42,024 WARNING [train.py:1204] (2/4) Exclude cut with ID _3675_210_4557_1_1526639934408_4182089_106-203713-0_sp1.1 from training. Duration: 0.963625 2023-10-03 03:01:42,093 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_53071_210_16554_1_1534493434_1276796_130-36076-0 from training. Duration: 0.95 2023-10-03 03:01:42,113 WARNING [train.py:1204] (2/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_586-93054-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 03:01:46,250 WARNING [train.py:1204] (2/4) Exclude cut with ID _23825_210_13449_1_1531915313526_3497150_262-10571-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 03:01:48,111 WARNING [train.py:1204] (2/4) Exclude cut with ID _28783_210_7943_1_1532671164808_7155479_432-67885-0_sp1.1 from training. Duration: 0.97275 2023-10-03 03:01:49,818 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6287_210_3273_1_1527509506_2653716_182-199887-0_sp0.9 from training. Duration: 0.5666875 2023-10-03 03:01:51,298 WARNING [train.py:1204] (2/4) Exclude cut with ID _3231_210_2106_1_1526205864934_6784220_171-256004-0 from training. Duration: 0.86 2023-10-03 03:01:51,359 WARNING [train.py:1204] (2/4) Exclude cut with ID _69051_210_1531_1_1535885566901_3829538_332-92090-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 03:01:52,781 WARNING [train.py:1204] (2/4) Exclude cut with ID _3675_210_4557_1_1526639934408_4182089_104-324125-0_sp1.1 from training. Duration: 0.963625 2023-10-03 03:01:57,965 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_41764_210_2521_1_1533686110_4089313_501-2119-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 03:02:02,088 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_14056_210_4163_1_1530604483_7572827_56-100988-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 03:02:04,924 WARNING [train.py:1204] (2/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_398-239819-0_sp1.1 from training. Duration: 0.9 2023-10-03 03:02:10,574 WARNING [train.py:1204] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_74-48717-0_sp1.1 from training. Duration: 0.52725 2023-10-03 03:02:12,008 WARNING [train.py:1204] (2/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_177-49092-0 from training. Duration: 0.91 2023-10-03 03:02:12,032 WARNING [train.py:1204] (2/4) Exclude cut with ID _62337_210_12392_1_1535009273105_3412229_44-176353-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 03:02:12,046 WARNING [train.py:1204] (2/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_110-130961-0_sp1.1 from training. Duration: 0.42725 2023-10-03 03:02:14,750 WARNING [train.py:1204] (2/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_37-189262-0_sp1.1 from training. Duration: 0.8909375 2023-10-03 03:02:14,751 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_3172_210_2087_1_1526292929_3987865_197-97762-0_sp0.9 from training. Duration: 0.8333125 2023-10-03 03:02:16,188 WARNING [train.py:1204] (2/4) Exclude cut with ID IC0677W0065-334897 from training. Duration: 0.896 2023-10-03 03:02:16,188 WARNING [train.py:1204] (2/4) Exclude cut with ID IC0677W0080-334912 from training. Duration: 0.768 2023-10-03 03:02:16,201 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_120-41668-0 from training. Duration: 0.7 2023-10-03 03:02:16,385 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.conv_module1.balancer1.prob, batch_count=1114773.3333333333, ans=0.125 2023-10-03 03:02:17,843 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.conv_module1.balancer1.prob, batch_count=1114773.3333333333, ans=0.125 2023-10-03 03:02:18,242 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.4.encoder.layers.0.feed_forward2.out_whiten, num_groups=1, num_channels=384, metric=6.58 vs. limit=15.0 2023-10-03 03:02:19,720 WARNING [train.py:1204] (2/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_460-205287-0_sp1.1 from training. Duration: 0.963625 2023-10-03 03:02:21,691 WARNING [train.py:1204] (2/4) Exclude cut with ID _14632_210_9988_1_1530691279392_3923200_102-204477-0 from training. Duration: 0.97 2023-10-03 03:02:21,703 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_34815_210_6388_1_1533381320_3571903_279-305565-0 from training. Duration: 0.92 2023-10-03 03:02:23,022 WARNING [train.py:1204] (2/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_91-148831-0_sp1.1 from training. Duration: 0.9 2023-10-03 03:02:23,095 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6291_210_3273_1_1527683649_1078498_82-221196-0 from training. Duration: 0.76 2023-10-03 03:02:27,606 WARNING [train.py:1204] (2/4) Exclude cut with ID _38020_210_5986_1_1533112252259_7006089_493-102745-0 from training. Duration: 0.98 2023-10-03 03:02:29,564 INFO [train.py:1046] (2/4) Epoch 32, batch 2550, loss[loss=0.163, simple_loss=0.2538, pruned_loss=0.03605, over 24646.00 frames. ], tot_loss[loss=0.1622, simple_loss=0.2402, pruned_loss=0.04203, over 4709663.00 frames. ], batch size: 73, lr: 3.19e-03, grad_scale: 8.0 2023-10-03 03:02:30,927 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.572e+02 1.835e+02 1.976e+02 2.166e+02 3.435e+02, threshold=3.953e+02, percent-clipped=0.0 2023-10-03 03:02:31,014 WARNING [train.py:1204] (2/4) Exclude cut with ID _61133_210_9969_1_1535196777227_3696409_40-93442-0_sp1.1 from training. Duration: 0.92725 2023-10-03 03:02:31,139 WARNING [train.py:1204] (2/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_45-258852-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 03:02:32,011 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.0.layers.1.feed_forward2.out_whiten, num_groups=1, num_channels=192, metric=4.98 vs. limit=15.0 2023-10-03 03:02:32,420 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_53071_210_16554_1_1534493434_1276796_130-36076-0_sp1.1 from training. Duration: 0.863625 2023-10-03 03:02:34,022 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.ff2_skip_rate, batch_count=1114840.0, ans=0.0 2023-10-03 03:02:35,223 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8694_210_6400_1_1529065754_3260756_332-13553-0_sp1.1 from training. Duration: 0.92725 2023-10-03 03:02:35,298 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_405-339986-0 from training. Duration: 0.51 2023-10-03 03:02:36,606 WARNING [train.py:1204] (2/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_927-43827-0_sp1.1 from training. Duration: 0.67275 2023-10-03 03:02:38,358 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.conv_module2.balancer1.prob, batch_count=1114840.0, ans=0.125 2023-10-03 03:02:39,569 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_397-147196-0 from training. Duration: 0.8 2023-10-03 03:02:40,951 WARNING [train.py:1204] (2/4) Exclude cut with ID 3557-8342-0013-54691-0_sp1.1 from training. Duration: 0.836375 2023-10-03 03:02:43,767 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_47438_210_5399_1_1533797471_4513356_626-127722-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 03:02:46,509 WARNING [train.py:1204] (2/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_256-323883-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 03:02:46,514 WARNING [train.py:1204] (2/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_839-173017-0_sp0.9 from training. Duration: 0.4555625 2023-10-03 03:02:46,547 WARNING [train.py:1204] (2/4) Exclude cut with ID _5289_210_2084_1_1527678202736_3779299_392-261988-0_sp1.1 from training. Duration: 0.5909375 2023-10-03 03:02:47,949 WARNING [train.py:1204] (2/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_119-224384-0_sp1.1 from training. Duration: 0.936375 2023-10-03 03:02:47,980 WARNING [train.py:1204] (2/4) Exclude cut with ID _57523_210_1856_1_1534766294545_3732989_262-267933-0_sp1.1 from training. Duration: 0.963625 2023-10-03 03:02:49,411 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_35361_210_4238_1_1533262810_7793476_675-87012-0_sp0.9 from training. Duration: 0.988875 2023-10-03 03:02:49,436 WARNING [train.py:1204] (2/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_17-90541-0 from training. Duration: 0.94 2023-10-03 03:02:49,466 WARNING [train.py:1204] (2/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_535-14410-0_sp1.1 from training. Duration: 0.42725 2023-10-03 03:02:49,479 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_24673_210_6400_1_1531968633_778659_94-154511-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 03:02:49,482 WARNING [train.py:1204] (2/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_212-10501-0 from training. Duration: 0.94 2023-10-03 03:03:03,158 WARNING [train.py:1204] (2/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_383-285196-0_sp1.1 from training. Duration: 0.8818125 2023-10-03 03:03:07,161 WARNING [train.py:1204] (2/4) Exclude cut with ID _45560_210_21982_1_1533812603951_3584999_238-47020-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 03:03:07,177 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_21500_210_2054_1_1532149310_2716758_278-18053-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 03:03:07,190 WARNING [train.py:1204] (2/4) Exclude cut with ID _56235_210_10640_1_1534557335235_4161099_453-9626-0_sp1.1 from training. Duration: 0.936375 2023-10-03 03:03:08,697 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.attention_skip_rate, batch_count=1114973.3333333333, ans=0.0 2023-10-03 03:03:09,864 WARNING [train.py:1204] (2/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_153-97441-0_sp1.1 from training. Duration: 0.5090625 2023-10-03 03:03:15,426 WARNING [train.py:1204] (2/4) Exclude cut with ID _67725_210_27767_1_1535608756671_5105359_431-120895-0_sp1.1 from training. Duration: 0.963625 2023-10-03 03:03:19,518 WARNING [train.py:1204] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_574-45541-0_sp1.1 from training. Duration: 0.5909375 2023-10-03 03:03:19,534 WARNING [train.py:1204] (2/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_162-103620-0_sp1.1 from training. Duration: 0.8454375 2023-10-03 03:03:19,545 WARNING [train.py:1204] (2/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_734-129761-0_sp1.1 from training. Duration: 0.7454375 2023-10-03 03:03:21,339 WARNING [train.py:1204] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_620-276793-0_sp1.1 from training. Duration: 0.536375 2023-10-03 03:03:21,372 WARNING [train.py:1204] (2/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_153-97441-0_sp0.9 from training. Duration: 0.62225 2023-10-03 03:03:23,478 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.0.layers.1.feed_forward2.out_whiten, num_groups=1, num_channels=192, metric=4.38 vs. limit=15.0 2023-10-03 03:03:25,862 WARNING [train.py:1204] (2/4) Exclude cut with ID _12733_210_8341_1_1530167424556_3646380_523-85621-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 03:03:25,881 WARNING [train.py:1204] (2/4) Exclude cut with ID _64048_210_10626_1_1535198382802_3835179_352-179834-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 03:03:26,967 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.0.layers.0.nonlin_attention.whiten2, num_groups=1, num_channels=192, metric=5.19 vs. limit=15.0 2023-10-03 03:03:30,171 WARNING [train.py:1204] (2/4) Exclude cut with ID _55162_210_17071_1_1534408248583_3753480_43-242364-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 03:03:30,194 WARNING [train.py:1204] (2/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_661-264857-0 from training. Duration: 0.93 2023-10-03 03:03:30,208 WARNING [train.py:1204] (2/4) Exclude cut with ID _6274_210_2084_1_1527937583837_7494020_325-237800-0_sp1.1 from training. Duration: 0.97275 2023-10-03 03:03:32,174 WARNING [train.py:1204] (2/4) Exclude cut with ID _55328_210_27767_1_1534580973260_4088170_385-171459-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 03:03:32,248 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_3780_210_2087_1_1526467389_2145572_176-107140-0_sp1.1 from training. Duration: 0.62725 2023-10-03 03:03:34,157 WARNING [train.py:1204] (2/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_375-244976-0_sp1.1 from training. Duration: 0.6818125 2023-10-03 03:03:36,848 WARNING [train.py:1204] (2/4) Exclude cut with ID _35941_210_15379_1_1532995294515_4023089_309-33162-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 03:03:41,054 WARNING [train.py:1204] (2/4) Exclude cut with ID _14459_210_2228_1_1531443641946_7516700_1185-22038-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 03:03:41,193 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_1197-69460-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 03:03:42,495 INFO [train.py:1046] (2/4) Epoch 32, batch 2600, loss[loss=0.1601, simple_loss=0.24, pruned_loss=0.04005, over 24308.00 frames. ], tot_loss[loss=0.162, simple_loss=0.2408, pruned_loss=0.04165, over 4716779.86 frames. ], batch size: 61, lr: 3.19e-03, grad_scale: 8.0 2023-10-03 03:03:43,908 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9012W0022-501157 from training. Duration: 0.896 2023-10-03 03:03:45,412 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9023W0009-506302 from training. Duration: 0.896 2023-10-03 03:03:45,433 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_2821_210_2715_1_1526108385_3763560_73-296673-0_sp1.1 from training. Duration: 0.7545625 2023-10-03 03:03:45,468 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9025W0113-507442 from training. Duration: 0.896 2023-10-03 03:03:46,828 WARNING [train.py:1204] (2/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_739-2098-0 from training. Duration: 0.92 2023-10-03 03:03:48,075 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9029W0332-509722 from training. Duration: 0.768 2023-10-03 03:03:49,512 WARNING [train.py:1204] (2/4) Exclude cut with ID _16908_210_5724_1_1531119157248_3772700_68-101953-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 03:03:49,646 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.ff2_skip_rate, batch_count=1115173.3333333333, ans=0.0 2023-10-03 03:03:51,312 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9042W0011-515617 from training. Duration: 0.768 2023-10-03 03:03:51,411 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_3019_210_2028_1_1526044245_3264865_191-72235-0 from training. Duration: 0.97 2023-10-03 03:03:53,267 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9052W0006-520792 from training. Duration: 0.896 2023-10-03 03:03:54,707 WARNING [train.py:1204] (2/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_245-174553-0_sp1.1 from training. Duration: 0.8 2023-10-03 03:03:54,948 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.feed_forward3.hidden_balancer.prob, batch_count=1115173.3333333333, ans=0.125 2023-10-03 03:03:56,229 WARNING [train.py:1204] (2/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_202-67017-0 from training. Duration: 0.82 2023-10-03 03:03:57,632 WARNING [train.py:1204] (2/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_304-59227-0 from training. Duration: 0.77 2023-10-03 03:03:58,352 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_3133_210_2087_1_1526106446_2324690_246-125000-0_sp0.9 from training. Duration: 0.688875 2023-10-03 03:03:59,516 WARNING [train.py:1204] (2/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_243-42116-0 from training. Duration: 0.73 2023-10-03 03:04:02,684 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9087W0121-538492 from training. Duration: 0.896 2023-10-03 03:04:02,707 WARNING [train.py:1204] (2/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_120-76843-0 from training. Duration: 0.55 2023-10-03 03:04:08,385 WARNING [train.py:1204] (2/4) Exclude cut with ID _7852_210_5219_1_1528286471316_8139400_850-304621-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 03:04:08,399 WARNING [train.py:1204] (2/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_154-285415-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 03:04:08,420 WARNING [train.py:1204] (2/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_538-278194-0_sp1.1 from training. Duration: 0.92725 2023-10-03 03:04:08,421 WARNING [train.py:1204] (2/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_119-207162-0 from training. Duration: 0.71 2023-10-03 03:04:11,063 WARNING [train.py:1204] (2/4) Exclude cut with ID _2334_210_2667_1_1525608238880_4705129_459-104997-0_sp1.1 from training. Duration: 0.863625 2023-10-03 03:04:15,320 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9147W0019-567847 from training. Duration: 0.768 2023-10-03 03:04:24,076 WARNING [train.py:1204] (2/4) Exclude cut with ID _69884_210_9771_1_1535846392818_4166169_228-189439-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 03:04:24,102 WARNING [train.py:1204] (2/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_196-77172-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 03:04:24,162 WARNING [train.py:1204] (2/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_625-338546-0 from training. Duration: 0.88 2023-10-03 03:04:26,072 WARNING [train.py:1204] (2/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_193-327630-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 03:04:26,073 WARNING [train.py:1204] (2/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_391-24006-0_sp1.1 from training. Duration: 0.92725 2023-10-03 03:04:27,558 WARNING [train.py:1204] (2/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_197-28164-0 from training. Duration: 0.8 2023-10-03 03:04:29,689 WARNING [train.py:1204] (2/4) Exclude cut with ID _8395_210_1531_1_1528597871523_4411579_431-132470-0_sp1.1 from training. Duration: 0.67275 2023-10-03 03:04:29,718 WARNING [train.py:1204] (2/4) Exclude cut with ID _35350_210_16040_1_1533040191183_4075299_465-248326-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 03:04:31,103 WARNING [train.py:1204] (2/4) Exclude cut with ID _16756_210_4915_1_1531047803763_7104259_101-141922-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 03:04:35,761 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9212W0005-600172 from training. Duration: 0.896 2023-10-03 03:04:35,791 WARNING [train.py:1204] (2/4) Exclude cut with ID _63186_210_2084_1_1535414479705_3908649_378-166820-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 03:04:35,811 WARNING [train.py:1204] (2/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_52-211683-0_sp1.1 from training. Duration: 0.8181875 2023-10-03 03:04:35,932 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=1115373.3333333333, ans=0.1 2023-10-03 03:04:39,010 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.5.encoder.layers.0.feed_forward3.out_whiten, num_groups=1, num_channels=256, metric=12.19 vs. limit=15.0 2023-10-03 03:04:41,400 WARNING [train.py:1204] (2/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_1147-237729-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 03:04:42,746 WARNING [train.py:1204] (2/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_234-14174-0_sp0.9 from training. Duration: 0.911125 2023-10-03 03:04:42,758 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_454-276429-0 from training. Duration: 0.87 2023-10-03 03:04:44,207 WARNING [train.py:1204] (2/4) Exclude cut with ID _53713_210_24116_1_1534314487762_7416751_755-165313-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 03:04:46,901 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8278_210_2550_1_1528637099_4220413_436-317072-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 03:04:46,970 WARNING [train.py:1204] (2/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_28-324729-0_sp1.1 from training. Duration: 0.8545625 2023-10-03 03:04:52,540 WARNING [train.py:1204] (2/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_326-165900-0 from training. Duration: 0.69 2023-10-03 03:04:54,437 WARNING [train.py:1204] (2/4) Exclude cut with ID _53489_210_6228_1_1534298357169_3835970_96-30246-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 03:04:55,849 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8275_210_4198_1_1528455320_3892571_491-17707-0_sp1.1 from training. Duration: 0.5818125 2023-10-03 03:04:57,805 INFO [train.py:1046] (2/4) Epoch 32, batch 2650, loss[loss=0.1671, simple_loss=0.2504, pruned_loss=0.04189, over 23985.00 frames. ], tot_loss[loss=0.1632, simple_loss=0.2418, pruned_loss=0.04229, over 4713232.24 frames. ], batch size: 80, lr: 3.19e-03, grad_scale: 8.0 2023-10-03 03:04:59,100 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.432e+02 1.872e+02 2.008e+02 2.203e+02 2.987e+02, threshold=4.015e+02, percent-clipped=0.0 2023-10-03 03:05:01,051 WARNING [train.py:1204] (2/4) Exclude cut with ID _14348_210_8341_1_1530599434996_3778640_240-232370-0 from training. Duration: 0.84 2023-10-03 03:05:01,063 WARNING [train.py:1204] (2/4) Exclude cut with ID _45012_210_12651_1_1533608435136_5371380_54-112623-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 03:05:01,160 WARNING [train.py:1204] (2/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_328-264776-0_sp1.1 from training. Duration: 0.6090625 2023-10-03 03:05:02,472 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9325W0001-648427 from training. Duration: 0.768 2023-10-03 03:05:02,481 WARNING [train.py:1204] (2/4) Exclude cut with ID _56480_210_4220_1_1534679942079_3075409_49-74717-0_sp1.1 from training. Duration: 0.936375 2023-10-03 03:05:03,964 WARNING [train.py:1204] (2/4) Exclude cut with ID _59500_210_8254_1_1534809610680_3736009_6-241869-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 03:05:04,585 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.4.encoder.layers.0.self_attn2.whiten, num_groups=1, num_channels=384, metric=14.96 vs. limit=22.5 2023-10-03 03:05:05,504 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.feed_forward2.hidden_balancer.prob, batch_count=1115506.6666666667, ans=0.125 2023-10-03 03:05:06,308 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.0.layers.0.self_attn1.whiten, num_groups=1, num_channels=192, metric=10.83 vs. limit=22.5 2023-10-03 03:05:07,033 WARNING [train.py:1204] (2/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_172-215932-0_sp0.9 from training. Duration: 0.7333125 2023-10-03 03:05:07,128 WARNING [train.py:1204] (2/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_17-90541-0_sp1.1 from training. Duration: 0.8545625 2023-10-03 03:05:09,751 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_43831_210_2158_1_1533725502_5872791_525-89447-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 03:05:11,119 WARNING [train.py:1204] (2/4) Exclude cut with ID _12302_210_5219_1_1529924446466_8107280_907-182949-0 from training. Duration: 0.92 2023-10-03 03:05:11,139 WARNING [train.py:1204] (2/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_402-102311-0_sp0.9 from training. Duration: 0.7444375 2023-10-03 03:05:12,448 WARNING [train.py:1204] (2/4) Exclude cut with ID _8021_210_7994_1_1528376451358_3653069_179-45206-0_sp0.9 from training. Duration: 0.9555625 2023-10-03 03:05:13,821 WARNING [train.py:1204] (2/4) Exclude cut with ID _40137_210_12210_1_1533290484046_3795180_249-187059-0 from training. Duration: 0.88 2023-10-03 03:05:15,242 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9653W0194-675472 from training. Duration: 0.9658125 2023-10-03 03:05:16,849 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.feed_forward1.hidden_balancer.prob, batch_count=1115573.3333333333, ans=0.125 2023-10-03 03:05:17,997 WARNING [train.py:1204] (2/4) Exclude cut with ID _51332_210_9988_1_1534244371836_3801270_336-255604-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 03:05:18,212 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.conv_module2.balancer2.prob, batch_count=1115573.3333333333, ans=0.125 2023-10-03 03:05:19,396 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_20120_210_6753_1_1531643035_5482859_510-96996-0 from training. Duration: 0.87 2023-10-03 03:05:19,408 WARNING [train.py:1204] (2/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_1167-20437-0_sp1.1 from training. Duration: 0.92725 2023-10-03 03:05:20,065 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.2.encoder.layers.0.nonlin_attention.whiten1, num_groups=1, num_channels=288, metric=8.37 vs. limit=10.0 2023-10-03 03:05:20,739 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_3475_210_2028_1_1526193578_3622569_265-276625-0 from training. Duration: 0.76 2023-10-03 03:05:24,139 WARNING [train.py:1204] (2/4) Exclude cut with ID _14860_210_4915_1_1530702020340_7343240_470-172418-0_sp1.1 from training. Duration: 0.97275 2023-10-03 03:05:24,147 WARNING [train.py:1204] (2/4) Exclude cut with ID _5444_210_2151_1_1527418808464_5312210_581-315411-0_sp1.1 from training. Duration: 0.563625 2023-10-03 03:05:24,161 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_34450_210_9547_1_1533099443_4109050_519-342439-0_sp1.1 from training. Duration: 0.97275 2023-10-03 03:05:24,191 WARNING [train.py:1204] (2/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_50-175961-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 03:05:28,906 WARNING [train.py:1204] (2/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_372-6869-0 from training. Duration: 0.44 2023-10-03 03:05:28,912 WARNING [train.py:1204] (2/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_3-237434-0 from training. Duration: 0.76 2023-10-03 03:05:31,648 WARNING [train.py:1204] (2/4) Exclude cut with ID _8772_210_1850_1_1528635595972_3664011_247-336448-0_sp1.1 from training. Duration: 0.67275 2023-10-03 03:05:31,852 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.conv_module1.balancer1.prob, batch_count=1115640.0, ans=0.125 2023-10-03 03:05:36,262 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_427-78097-0 from training. Duration: 0.8 2023-10-03 03:05:36,282 WARNING [train.py:1204] (2/4) Exclude cut with ID _69884_210_9771_1_1535846392818_4166169_466-252727-0_sp1.1 from training. Duration: 0.97275 2023-10-03 03:05:38,952 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_15972_210_6400_1_1530877934_3405692_316-35478-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 03:05:38,991 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_809-17156-0_sp0.9 from training. Duration: 0.811125 2023-10-03 03:05:39,036 WARNING [train.py:1204] (2/4) Exclude cut with ID _57572_210_5517_1_1534921219624_3809484_594-263474-0_sp1.1 from training. Duration: 0.936375 2023-10-03 03:05:40,394 WARNING [train.py:1204] (2/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_190-228204-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 03:05:41,788 WARNING [train.py:1204] (2/4) Exclude cut with ID _19514_210_9259_1_1531530071330_5084230_117-21193-0_sp1.1 from training. Duration: 0.936375 2023-10-03 03:05:43,127 WARNING [train.py:1204] (2/4) Exclude cut with ID _69584_210_5219_1_1535785133775_4044450_190-21315-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 03:05:44,510 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_53071_210_16554_1_1534493434_1276796_184-302050-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 03:05:45,808 WARNING [train.py:1204] (2/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_120-76843-0_sp1.1 from training. Duration: 0.5 2023-10-03 03:05:47,278 WARNING [train.py:1204] (2/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_318-162017-0_sp1.1 from training. Duration: 0.82725 2023-10-03 03:05:48,698 WARNING [train.py:1204] (2/4) Exclude cut with ID _46067_210_4220_1_1534125571066_3307450_79-46864-0_sp1.1 from training. Duration: 0.92725 2023-10-03 03:05:48,750 WARNING [train.py:1204] (2/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_358-215558-0_sp1.1 from training. Duration: 0.8454375 2023-10-03 03:05:50,145 WARNING [train.py:1204] (2/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_621-66764-0_sp1.1 from training. Duration: 0.92725 2023-10-03 03:05:50,707 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.5.encoder.layers.0.whiten, num_groups=1, num_channels=256, metric=2.39 vs. limit=12.0 2023-10-03 03:05:51,504 WARNING [train.py:1204] (2/4) Exclude cut with ID _42863_210_9070_1_1533902398788_3696920_338-156851-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 03:05:51,535 WARNING [train.py:1204] (2/4) Exclude cut with ID _5289_210_2084_1_1527678202736_3779299_392-261988-0_sp0.9 from training. Duration: 0.72225 2023-10-03 03:05:54,803 WARNING [train.py:1204] (2/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_409-84682-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 03:05:56,161 WARNING [train.py:1204] (2/4) Exclude cut with ID _9235_210_5219_1_1528891154253_8046120_30-326681-0_sp0.9 from training. Duration: 0.92225 2023-10-03 03:05:56,165 WARNING [train.py:1204] (2/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_389-184695-0_sp1.1 from training. Duration: 0.92725 2023-10-03 03:05:56,195 WARNING [train.py:1204] (2/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_442-320317-0 from training. Duration: 0.82 2023-10-03 03:06:00,928 WARNING [train.py:1204] (2/4) Exclude cut with ID _44778_210_2054_1_1533798127203_7443090_756-29911-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 03:06:02,837 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_401-112036-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 03:06:04,211 WARNING [train.py:1204] (2/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_181-308827-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 03:06:05,470 WARNING [train.py:1204] (2/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_332-258725-0_sp1.1 from training. Duration: 0.963625 2023-10-03 03:06:05,545 WARNING [train.py:1204] (2/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_188-260011-0_sp0.9 from training. Duration: 0.811125 2023-10-03 03:06:05,590 WARNING [train.py:1204] (2/4) Exclude cut with ID _49039_210_6315_1_1533967537541_3846118_99-302239-0_sp1.1 from training. Duration: 0.963625 2023-10-03 03:06:07,650 WARNING [train.py:1204] (2/4) Exclude cut with ID _4122_210_2084_1_1526713164417_4353280_312-294828-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 03:06:07,656 WARNING [train.py:1204] (2/4) Exclude cut with ID _14922_210_6228_1_1530707435273_3693310_303-228738-0 from training. Duration: 0.84 2023-10-03 03:06:10,446 WARNING [train.py:1204] (2/4) Exclude cut with ID _14349_210_8341_1_1530615710818_3442470_115-285975-0_sp0.9 from training. Duration: 0.9555625 2023-10-03 03:06:11,711 INFO [train.py:1046] (2/4) Epoch 32, batch 2700, loss[loss=0.1366, simple_loss=0.2198, pruned_loss=0.02674, over 24329.00 frames. ], tot_loss[loss=0.1636, simple_loss=0.2424, pruned_loss=0.04238, over 4713792.03 frames. ], batch size: 61, lr: 3.19e-03, grad_scale: 8.0 2023-10-03 03:06:11,888 WARNING [train.py:1204] (2/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_125-144174-0_sp1.1 from training. Duration: 0.3545625 2023-10-03 03:06:14,784 WARNING [train.py:1204] (2/4) Exclude cut with ID _53565_210_10635_1_1534337218335_3820259_351-54570-0_sp1.1 from training. Duration: 0.97275 2023-10-03 03:06:14,817 WARNING [train.py:1204] (2/4) Exclude cut with ID _28317_210_12742_1_1532480302381_8510325_410-40065-0_sp1.1 from training. Duration: 0.963625 2023-10-03 03:06:14,845 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_38965_210_4852_1_1533174906_3484735_167-339442-0_sp1.1 from training. Duration: 0.963625 2023-10-03 03:06:14,971 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.attention_skip_rate, batch_count=1115840.0, ans=0.0 2023-10-03 03:06:16,190 WARNING [train.py:1204] (2/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_265-164486-0_sp1.1 from training. Duration: 0.87275 2023-10-03 03:06:16,200 WARNING [train.py:1204] (2/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_725-182484-0_sp1.1 from training. Duration: 0.92725 2023-10-03 03:06:16,231 WARNING [train.py:1204] (2/4) Exclude cut with ID _28317_210_12742_1_1532480302381_8510325_607-213951-0_sp0.9 from training. Duration: 0.9333125 2023-10-03 03:06:16,245 WARNING [train.py:1204] (2/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_605-133395-0_sp1.1 from training. Duration: 0.6 2023-10-03 03:06:16,260 WARNING [train.py:1204] (2/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_351-294650-0 from training. Duration: 0.63 2023-10-03 03:06:17,660 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_14953_210_2151_1_1530701710_5616165_710-165400-0_sp1.1 from training. Duration: 0.7909375 2023-10-03 03:06:20,278 WARNING [train.py:1204] (2/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_237-58433-0_sp1.1 from training. Duration: 0.67275 2023-10-03 03:06:21,693 WARNING [train.py:1204] (2/4) Exclude cut with ID _5190_210_4557_1_1527244368799_373439_28-197030-0_sp1.1 from training. Duration: 0.6545625 2023-10-03 03:06:22,950 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_743-327651-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 03:06:26,299 WARNING [train.py:1204] (2/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_76-203867-0_sp0.9 from training. Duration: 0.8 2023-10-03 03:06:27,624 WARNING [train.py:1204] (2/4) Exclude cut with ID _14603_210_4220_1_1530615688441_3097319_274-274798-0 from training. Duration: 0.98 2023-10-03 03:06:27,669 WARNING [train.py:1204] (2/4) Exclude cut with ID _14087_210_2253_1_1530529224512_3014029_226-242140-0_sp1.1 from training. Duration: 0.736375 2023-10-03 03:06:34,267 WARNING [train.py:1204] (2/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_352-59958-0_sp1.1 from training. Duration: 0.7818125 2023-10-03 03:06:34,271 WARNING [train.py:1204] (2/4) Exclude cut with ID _39411_210_5986_1_1533538825030_3343649_191-251819-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 03:06:39,891 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7046_210_6753_1_1527895337_6920917_642-106799-0_sp1.1 from training. Duration: 0.836375 2023-10-03 03:06:39,898 WARNING [train.py:1204] (2/4) Exclude cut with ID _22826_210_9994_1_1531908201522_3279169_412-3442-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 03:06:39,934 WARNING [train.py:1204] (2/4) Exclude cut with ID _60057_210_10640_1_1534903065793_3333150_171-181133-0_sp0.9 from training. Duration: 0.97775 2023-10-03 03:06:41,892 WARNING [train.py:1204] (2/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_154-135696-0_sp1.1 from training. Duration: 0.72725 2023-10-03 03:06:43,407 WARNING [train.py:1204] (2/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_148-186805-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 03:06:46,329 WARNING [train.py:1204] (2/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_605-165641-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 03:06:46,336 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_174-277364-0_sp0.9 from training. Duration: 0.788875 2023-10-03 03:06:47,977 WARNING [train.py:1204] (2/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_89-321955-0_sp0.9 from training. Duration: 0.888875 2023-10-03 03:06:49,527 WARNING [train.py:1204] (2/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_438-105429-0_sp1.1 from training. Duration: 0.963625 2023-10-03 03:06:49,531 WARNING [train.py:1204] (2/4) Exclude cut with ID _14970_210_15709_1_1530773543493_1715730_187-20259-0_sp0.9 from training. Duration: 0.988875 2023-10-03 03:06:56,499 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.conv_module2.balancer2.prob, batch_count=1116040.0, ans=0.125 2023-10-03 03:06:59,573 WARNING [train.py:1204] (2/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_385-193052-0_sp0.9 from training. Duration: 0.9555625 2023-10-03 03:07:00,220 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7113_210_1849_1_1527942685_3448768_194-311626-0_sp1.1 from training. Duration: 0.92725 2023-10-03 03:07:03,103 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.bypass_mid.scale_min, batch_count=1116040.0, ans=0.2 2023-10-03 03:07:04,355 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_3780_210_2087_1_1526467389_2145572_176-107140-0_sp0.9 from training. Duration: 0.7666875 2023-10-03 03:07:04,357 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_36756_210_7234_1_1533026991_966276_123-213457-0_sp1.1 from training. Duration: 0.936375 2023-10-03 03:07:08,566 WARNING [train.py:1204] (2/4) Exclude cut with ID _23557_210_8750_1_1531890106370_6846255_456-244861-0_sp1.1 from training. Duration: 0.963625 2023-10-03 03:07:09,911 WARNING [train.py:1204] (2/4) Exclude cut with ID _37086_210_9038_1_1532999540891_7021250_242-349170-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 03:07:09,964 WARNING [train.py:1204] (2/4) Exclude cut with ID _41298_210_9019_1_1533430371450_4822143_471-281351-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 03:07:11,360 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_53378_210_17282_1_1534221071_5769509_572-126176-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 03:07:11,616 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.conv_module1.balancer2.prob, batch_count=1116106.6666666667, ans=0.125 2023-10-03 03:07:12,818 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_1507-263032-0_sp1.1 from training. Duration: 0.963625 2023-10-03 03:07:12,856 WARNING [train.py:1204] (2/4) Exclude cut with ID _13679_210_7497_1_1530534293662_3355098_411-186088-0_sp1.1 from training. Duration: 0.8545625 2023-10-03 03:07:16,102 WARNING [train.py:1204] (2/4) Exclude cut with ID _29345_210_4381_1_1532689168263_3860169_149-257268-0_sp1.1 from training. Duration: 0.736375 2023-10-03 03:07:17,455 WARNING [train.py:1204] (2/4) Exclude cut with ID _47790_210_16187_1_1533862814066_4207519_381-252014-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 03:07:17,463 WARNING [train.py:1204] (2/4) Exclude cut with ID _35799_210_5854_1_1533263487887_3529071_512-2717-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 03:07:18,947 WARNING [train.py:1204] (2/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_505-205270-0_sp0.9 from training. Duration: 0.42225 2023-10-03 03:07:20,240 WARNING [train.py:1204] (2/4) Exclude cut with ID _14998_210_5219_1_1530705582080_7474970_731-20117-0_sp1.1 from training. Duration: 0.936375 2023-10-03 03:07:20,391 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.1.conv_skip_rate, batch_count=1116106.6666666667, ans=0.0 2023-10-03 03:07:23,069 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_37351_210_7925_1_1533012931_3812113_265-85780-0_sp1.1 from training. Duration: 0.8 2023-10-03 03:07:23,076 WARNING [train.py:1204] (2/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_313-134930-0 from training. Duration: 0.97 2023-10-03 03:07:24,492 WARNING [train.py:1204] (2/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_374-138971-0 from training. Duration: 0.94 2023-10-03 03:07:24,526 WARNING [train.py:1204] (2/4) Exclude cut with ID _30304_210_6319_1_1532476827261_7582060_1227-287886-0_sp1.1 from training. Duration: 0.936375 2023-10-03 03:07:25,771 INFO [train.py:1046] (2/4) Epoch 32, batch 2750, loss[loss=0.159, simple_loss=0.236, pruned_loss=0.04097, over 23627.00 frames. ], tot_loss[loss=0.164, simple_loss=0.2431, pruned_loss=0.04248, over 4711622.04 frames. ], batch size: 149, lr: 3.19e-03, grad_scale: 8.0 2023-10-03 03:07:27,189 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.552e+02 1.923e+02 2.045e+02 2.291e+02 3.532e+02, threshold=4.091e+02, percent-clipped=0.0 2023-10-03 03:07:27,385 WARNING [train.py:1204] (2/4) Exclude cut with ID _59493_210_17282_1_1535078419077_3225879_332-217544-0_sp1.1 from training. Duration: 0.97275 2023-10-03 03:07:27,421 WARNING [train.py:1204] (2/4) Exclude cut with ID _23514_210_1389_1_1531832139785_3031439_361-119640-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 03:07:30,740 WARNING [train.py:1204] (2/4) Exclude cut with ID _69584_210_5219_1_1535785133775_4044450_142-118058-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 03:07:30,768 WARNING [train.py:1204] (2/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_178-177582-0_sp1.1 from training. Duration: 0.67275 2023-10-03 03:07:32,566 WARNING [train.py:1204] (2/4) Exclude cut with ID _25303_210_17519_1_1532050207576_3635970_41-195572-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 03:07:35,228 WARNING [train.py:1204] (2/4) Exclude cut with ID _55328_210_27767_1_1534580973260_4088170_329-341322-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 03:07:35,263 WARNING [train.py:1204] (2/4) Exclude cut with ID _5608_210_2106_1_1527415439089_6819768_909-82767-0_sp1.1 from training. Duration: 0.5545625 2023-10-03 03:07:36,656 WARNING [train.py:1204] (2/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_261-297503-0_sp1.1 from training. Duration: 0.9 2023-10-03 03:07:36,663 WARNING [train.py:1204] (2/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_459-291706-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 03:07:36,663 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7762_210_1794_1_1528199508_4057408_422-227034-0 from training. Duration: 0.73 2023-10-03 03:07:36,680 WARNING [train.py:1204] (2/4) Exclude cut with ID _3067_210_2667_1_1526212974640_4503168_250-285937-0_sp0.9 from training. Duration: 0.888875 2023-10-03 03:07:36,694 WARNING [train.py:1204] (2/4) Exclude cut with ID _66873_210_5517_1_1535454136506_4293370_332-253745-0_sp1.1 from training. Duration: 0.936375 2023-10-03 03:07:43,254 WARNING [train.py:1204] (2/4) Exclude cut with ID _13938_210_9988_1_1530518718978_3693960_304-112784-0 from training. Duration: 0.46 2023-10-03 03:07:43,408 WARNING [train.py:1204] (2/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_226-267957-0_sp1.1 from training. Duration: 0.92725 2023-10-03 03:07:44,793 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_41822_210_13270_1_1533955976_4379284_608-254567-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 03:07:46,734 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_124-113562-0_sp1.1 from training. Duration: 0.8545625 2023-10-03 03:07:46,770 WARNING [train.py:1204] (2/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_117-290916-0_sp1.1 from training. Duration: 0.463625 2023-10-03 03:07:46,836 WARNING [train.py:1204] (2/4) Exclude cut with ID _28427_210_4220_1_1532581240967_3061069_102-66533-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 03:07:48,140 WARNING [train.py:1204] (2/4) Exclude cut with ID _55383_210_10635_1_1534468426172_2231669_134-128246-0_sp1.1 from training. Duration: 0.8090625 2023-10-03 03:07:48,187 WARNING [train.py:1204] (2/4) Exclude cut with ID _45871_210_5665_1_1533864614245_3296060_49-308316-0_sp1.1 from training. Duration: 0.97275 2023-10-03 03:07:48,235 WARNING [train.py:1204] (2/4) Exclude cut with ID _57572_210_5517_1_1534921219624_3809484_176-199205-0_sp1.1 from training. Duration: 0.97275 2023-10-03 03:07:52,383 WARNING [train.py:1204] (2/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_45-265684-0_sp1.1 from training. Duration: 0.6545625 2023-10-03 03:07:52,402 WARNING [train.py:1204] (2/4) Exclude cut with ID _15150_210_8341_1_1530752411803_7256384_469-239509-0_sp0.9 from training. Duration: 0.7555625 2023-10-03 03:07:52,458 WARNING [train.py:1204] (2/4) Exclude cut with ID _28461_210_2253_1_1532771775189_3041299_136-270644-0_sp1.1 from training. Duration: 0.8454375 2023-10-03 03:07:53,820 WARNING [train.py:1204] (2/4) Exclude cut with ID _69584_210_5219_1_1535785133775_4044450_302-47302-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 03:07:55,160 WARNING [train.py:1204] (2/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_166-128941-0_sp1.1 from training. Duration: 0.4909375 2023-10-03 03:08:05,097 WARNING [train.py:1204] (2/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_559-120504-0_sp1.1 from training. Duration: 0.97275 2023-10-03 03:08:07,669 WARNING [train.py:1204] (2/4) Exclude cut with ID _6456_210_1534_1_1527681634212_3181049_345-252877-0_sp1.1 from training. Duration: 0.5818125 2023-10-03 03:08:07,693 WARNING [train.py:1204] (2/4) Exclude cut with ID _28317_210_12742_1_1532480302381_8510325_902-233003-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 03:08:11,864 WARNING [train.py:1204] (2/4) Exclude cut with ID _36538_210_9994_1_1533204020363_3795620_204-261385-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 03:08:11,869 WARNING [train.py:1204] (2/4) Exclude cut with ID _14821_210_8750_1_1530680503991_6829359_189-297451-0_sp1.1 from training. Duration: 0.5 2023-10-03 03:08:11,914 WARNING [train.py:1204] (2/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_1211-226066-0_sp0.9 from training. Duration: 0.8444375 2023-10-03 03:08:14,848 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.0.balancer_na.min_abs, batch_count=1116373.3333333333, ans=0.02 2023-10-03 03:08:18,015 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6786_210_1388_1_1527839699_4194495_273-149841-0_sp1.1 from training. Duration: 0.836375 2023-10-03 03:08:18,061 WARNING [train.py:1204] (2/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_380-162648-0_sp1.1 from training. Duration: 0.8545625 2023-10-03 03:08:18,062 WARNING [train.py:1204] (2/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_494-199538-0 from training. Duration: 0.75 2023-10-03 03:08:19,587 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.0.feed_forward3.hidden_balancer.prob, batch_count=1116373.3333333333, ans=0.125 2023-10-03 03:08:22,201 WARNING [train.py:1204] (2/4) Exclude cut with ID _49097_210_16049_1_1533952780207_4360979_276-87053-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 03:08:23,587 WARNING [train.py:1204] (2/4) Exclude cut with ID _5444_210_2151_1_1527418808464_5312210_726-27165-0 from training. Duration: 0.67 2023-10-03 03:08:28,860 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_128-202104-0_sp1.1 from training. Duration: 0.57275 2023-10-03 03:08:32,073 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_11349_210_6753_1_1529800861_1101207_26-65602-0_sp1.1 from training. Duration: 0.87275 2023-10-03 03:08:32,103 WARNING [train.py:1204] (2/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_159-328157-0 from training. Duration: 0.52 2023-10-03 03:08:33,456 WARNING [train.py:1204] (2/4) Exclude cut with ID _14069_210_5632_1_1530512692033_4803390_98-21934-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 03:08:35,927 WARNING [train.py:1204] (2/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_352-59958-0_sp0.9 from training. Duration: 0.9555625 2023-10-03 03:08:37,264 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6163_210_6400_1_1527506746_4263068_283-137811-0 from training. Duration: 0.77 2023-10-03 03:08:37,301 WARNING [train.py:1204] (2/4) Exclude cut with ID _21989_210_5854_1_1531899214931_3271360_133-44759-0_sp1.1 from training. Duration: 0.7 2023-10-03 03:08:37,561 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.ff3_skip_rate, batch_count=1116440.0, ans=0.0 2023-10-03 03:08:40,181 INFO [train.py:1046] (2/4) Epoch 32, batch 2800, loss[loss=0.1401, simple_loss=0.2256, pruned_loss=0.02728, over 15635.00 frames. ], tot_loss[loss=0.163, simple_loss=0.2419, pruned_loss=0.04205, over 4703640.02 frames. ], batch size: 33, lr: 3.19e-03, grad_scale: 16.0 2023-10-03 03:08:40,300 WARNING [train.py:1204] (2/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_21-174514-0_sp0.9 from training. Duration: 0.47775 2023-10-03 03:08:40,341 WARNING [train.py:1204] (2/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_201-78811-0_sp1.1 from training. Duration: 0.963625 2023-10-03 03:08:40,379 WARNING [train.py:1204] (2/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_537-164630-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 03:08:41,645 WARNING [train.py:1204] (2/4) Exclude cut with ID _14087_210_2253_1_1530529224512_3014029_156-257607-0 from training. Duration: 0.83 2023-10-03 03:08:41,663 WARNING [train.py:1204] (2/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_308-154450-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 03:08:41,706 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_45399_210_4852_1_1533624725_7579737_214-240574-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 03:08:44,489 WARNING [train.py:1204] (2/4) Exclude cut with ID _29303_210_2575_1_1532392237303_7410649_615-305517-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 03:08:44,547 WARNING [train.py:1204] (2/4) Exclude cut with ID IC0125W0002-61808 from training. Duration: 0.708 2023-10-03 03:08:44,548 WARNING [train.py:1204] (2/4) Exclude cut with ID IC0125W0017-61823 from training. Duration: 0.759 2023-10-03 03:08:50,327 WARNING [train.py:1204] (2/4) Exclude cut with ID _53219_210_4525_1_1534834836008_4064825_4-145291-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 03:08:51,723 WARNING [train.py:1204] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_185-174805-0_sp1.1 from training. Duration: 0.6181875 2023-10-03 03:08:51,738 WARNING [train.py:1204] (2/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_635-193046-0_sp1.1 from training. Duration: 0.92725 2023-10-03 03:08:54,711 WARNING [train.py:1204] (2/4) Exclude cut with ID _46067_210_4220_1_1534125571066_3307450_321-258866-0_sp1.1 from training. Duration: 0.97275 2023-10-03 03:08:57,462 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8558_210_1410_1_1528505225_7613406_131-329524-0 from training. Duration: 0.79 2023-10-03 03:08:57,596 WARNING [train.py:1204] (2/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_847-41334-0_sp0.9 from training. Duration: 0.488875 2023-10-03 03:08:59,024 WARNING [train.py:1204] (2/4) Exclude cut with ID _14227_210_4381_1_1530701804286_3940259_125-275642-0 from training. Duration: 0.99 2023-10-03 03:09:01,644 WARNING [train.py:1204] (2/4) Exclude cut with ID _25248_210_8750_1_1532062825941_5554180_248-177255-0_sp1.1 from training. Duration: 0.963625 2023-10-03 03:09:01,697 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_40047_210_2290_1_1533290519_2374311_195-165693-0_sp1.1 from training. Duration: 0.8090625 2023-10-03 03:09:01,701 WARNING [train.py:1204] (2/4) Exclude cut with ID _23109_210_4381_1_1532150828777_901949_29-258672-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 03:09:03,733 WARNING [train.py:1204] (2/4) Exclude cut with ID _14821_210_8750_1_1530680503991_6829359_2-227151-0_sp1.1 from training. Duration: 0.7545625 2023-10-03 03:09:05,044 WARNING [train.py:1204] (2/4) Exclude cut with ID _49643_210_7989_1_1534640244224_3939031_565-53982-0_sp1.1 from training. Duration: 0.963625 2023-10-03 03:09:05,048 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7544_210_6753_1_1528591327_1471426_33-346031-0_sp0.9 from training. Duration: 0.67775 2023-10-03 03:09:05,128 WARNING [train.py:1204] (2/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_95-61782-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 03:09:13,158 WARNING [train.py:1204] (2/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_686-54383-0_sp0.9 from training. Duration: 0.9666875 2023-10-03 03:09:15,794 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_14056_210_4163_1_1530604483_7572827_505-276823-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 03:09:17,713 WARNING [train.py:1204] (2/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_745-128443-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 03:09:17,763 WARNING [train.py:1204] (2/4) Exclude cut with ID _3844_210_1390_1_1526727803582_2871209_20-251664-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 03:09:19,149 WARNING [train.py:1204] (2/4) Exclude cut with ID _36538_210_9994_1_1533204020363_3795620_487-164628-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 03:09:23,299 WARNING [train.py:1204] (2/4) Exclude cut with ID _5166_210_2084_1_1527073197160_4087993_309-222545-0_sp1.1 from training. Duration: 0.763625 2023-10-03 03:09:23,312 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_3531_210_3528_1_1526294203_5216456_309-227302-0 from training. Duration: 0.69 2023-10-03 03:09:24,724 WARNING [train.py:1204] (2/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_42-192460-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 03:09:24,808 WARNING [train.py:1204] (2/4) Exclude cut with ID _22083_210_8254_1_1531702862439_3678970_49-234546-0_sp1.1 from training. Duration: 0.7545625 2023-10-03 03:09:24,813 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_427-78097-0_sp0.9 from training. Duration: 0.888875 2023-10-03 03:09:28,999 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_43802_210_2290_1_1533516203_8829504_378-205254-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 03:09:29,046 WARNING [train.py:1204] (2/4) Exclude cut with ID _45329_210_7005_1_1534157951211_6774339_672-314830-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 03:09:33,526 WARNING [train.py:1204] (2/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_90-30673-0_sp1.1 from training. Duration: 0.763625 2023-10-03 03:09:36,781 WARNING [train.py:1204] (2/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_563-329943-0_sp1.1 from training. Duration: 0.9 2023-10-03 03:09:36,802 WARNING [train.py:1204] (2/4) Exclude cut with ID _21632_210_2253_1_1531994001765_3744140_154-84090-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 03:09:36,803 WARNING [train.py:1204] (2/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_103-336064-0_sp0.9 from training. Duration: 0.5666875 2023-10-03 03:09:36,850 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_43-71989-0_sp0.9 from training. Duration: 0.6333125 2023-10-03 03:09:38,702 WARNING [train.py:1204] (2/4) Exclude cut with ID _3867_210_1883_1_1526641200575_4475830_319-349850-0_sp0.9 from training. Duration: 0.7444375 2023-10-03 03:09:40,126 WARNING [train.py:1204] (2/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_629-179454-0_sp1.1 from training. Duration: 0.963625 2023-10-03 03:09:40,141 WARNING [train.py:1204] (2/4) Exclude cut with ID _65263_210_22356_1_1535763598691_3921037_49-222917-0 from training. Duration: 0.92 2023-10-03 03:09:40,168 WARNING [train.py:1204] (2/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_602-331772-0_sp1.1 from training. Duration: 0.936375 2023-10-03 03:09:42,877 WARNING [train.py:1204] (2/4) Exclude cut with ID _58414_210_8254_1_1535421609162_7151182_292-134978-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 03:09:42,903 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_38793_210_2544_1_1533176114_3849320_200-299292-0_sp1.1 from training. Duration: 0.936375 2023-10-03 03:09:43,012 WARNING [train.py:1204] (2/4) Exclude cut with ID _14970_210_15709_1_1530775547361_1966965_6-23328-0 from training. Duration: 0.86 2023-10-03 03:09:44,324 WARNING [train.py:1204] (2/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_386-225511-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 03:09:45,597 WARNING [train.py:1204] (2/4) Exclude cut with ID _38323_210_12108_1_1533121233369_3303049_416-11919-0_sp1.1 from training. Duration: 0.87275 2023-10-03 03:09:45,652 WARNING [train.py:1204] (2/4) Exclude cut with ID _4866_210_2577_1_1527404412284_4364659_256-306830-0_sp1.1 from training. Duration: 0.7909375 2023-10-03 03:09:48,303 WARNING [train.py:1204] (2/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_53-300544-0 from training. Duration: 0.67 2023-10-03 03:09:52,868 WARNING [train.py:1204] (2/4) Exclude cut with ID _41314_210_12651_1_1533459476543_3766460_80-74184-0_sp1.1 from training. Duration: 0.7545625 2023-10-03 03:09:52,896 WARNING [train.py:1204] (2/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_332-256168-0_sp1.1 from training. Duration: 0.6818125 2023-10-03 03:09:54,144 INFO [train.py:1046] (2/4) Epoch 32, batch 2850, loss[loss=0.1693, simple_loss=0.2493, pruned_loss=0.04471, over 24364.00 frames. ], tot_loss[loss=0.1618, simple_loss=0.2406, pruned_loss=0.04151, over 4713581.66 frames. ], batch size: 77, lr: 3.19e-03, grad_scale: 16.0 2023-10-03 03:09:54,217 WARNING [train.py:1204] (2/4) Exclude cut with ID _48836_210_12210_1_1534075848293_3904150_429-35475-0_sp1.1 from training. Duration: 0.8090625 2023-10-03 03:09:55,419 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.553e+02 1.850e+02 1.983e+02 2.213e+02 2.652e+02, threshold=3.967e+02, percent-clipped=0.0 2023-10-03 03:09:56,867 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_144-135833-0_sp1.1 from training. Duration: 0.97275 2023-10-03 03:09:57,127 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=1116840.0, ans=0.1 2023-10-03 03:10:00,953 WARNING [train.py:1204] (2/4) Exclude cut with ID _14224_210_4381_1_1530529062037_4264010_380-343670-0_sp0.9 from training. Duration: 0.97775 2023-10-03 03:10:00,976 WARNING [train.py:1204] (2/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_213-265174-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 03:10:01,008 WARNING [train.py:1204] (2/4) Exclude cut with ID _20455_210_5219_1_1531565996775_8098100_86-314283-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 03:10:02,930 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_41750_210_1960_1_1533520678_3948042_68-223813-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 03:10:03,041 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.1.feed_forward1.out_proj.dropout_p, batch_count=1116840.0, ans=0.1 2023-10-03 03:10:03,068 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.conv_module2.balancer1.prob, batch_count=1116840.0, ans=0.125 2023-10-03 03:10:03,159 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.bypass.skip_rate, batch_count=1116840.0, ans=0.07 2023-10-03 03:10:04,218 WARNING [train.py:1204] (2/4) Exclude cut with ID _55153_210_2253_1_1534384786677_3162096_24-198429-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 03:10:05,586 WARNING [train.py:1204] (2/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_119-207162-0_sp0.9 from training. Duration: 0.788875 2023-10-03 03:10:06,903 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_12868_210_6753_1_1530702479_3610564_148-172861-0 from training. Duration: 0.5 2023-10-03 03:10:13,576 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_5522_210_3273_1_1527249606_3909096_318-203153-0 from training. Duration: 0.43 2023-10-03 03:10:13,590 WARNING [train.py:1204] (2/4) Exclude cut with ID _14884_210_2252_1_1530707459689_5561250_288-189949-0_sp1.1 from training. Duration: 0.97275 2023-10-03 03:10:13,801 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=1116906.6666666667, ans=0.1 2023-10-03 03:10:15,028 WARNING [train.py:1204] (2/4) Exclude cut with ID _5294_210_1747_1_1527249643928_3696179_529-242298-0 from training. Duration: 0.95 2023-10-03 03:10:16,337 WARNING [train.py:1204] (2/4) Exclude cut with ID _36538_210_9994_1_1533204020363_3795620_503-110289-0_sp1.1 from training. Duration: 0.936375 2023-10-03 03:10:17,843 WARNING [train.py:1204] (2/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_385-193052-0 from training. Duration: 0.86 2023-10-03 03:10:17,899 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_456-22141-0 from training. Duration: 0.88 2023-10-03 03:10:21,185 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_38965_210_4852_1_1533174906_3484735_284-117840-0_sp1.1 from training. Duration: 0.936375 2023-10-03 03:10:29,542 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.nonlin_attention.balancer.prob, batch_count=1116973.3333333333, ans=0.125 2023-10-03 03:10:31,293 WARNING [train.py:1204] (2/4) Exclude cut with ID _32704_210_8954_1_1533017488840_6747370_75-201625-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 03:10:32,726 WARNING [train.py:1204] (2/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_255-312654-0_sp1.1 from training. Duration: 0.82725 2023-10-03 03:10:32,751 WARNING [train.py:1204] (2/4) Exclude cut with ID _47251_210_18628_1_1533814063128_4137250_214-88076-0_sp0.9 from training. Duration: 0.97775 2023-10-03 03:10:34,056 WARNING [train.py:1204] (2/4) Exclude cut with ID _4092_210_2151_1_1526643044449_3135759_227-213972-0_sp1.1 from training. Duration: 0.4909375 2023-10-03 03:10:34,073 WARNING [train.py:1204] (2/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_912-47473-0_sp1.1 from training. Duration: 0.6181875 2023-10-03 03:10:34,101 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6767_210_3273_1_1527855783_1718932_173-256601-0_sp1.1 from training. Duration: 0.72725 2023-10-03 03:10:36,833 WARNING [train.py:1204] (2/4) Exclude cut with ID _6274_210_2084_1_1527937583837_7494020_32-205288-0_sp1.1 from training. Duration: 0.7090625 2023-10-03 03:10:36,870 WARNING [train.py:1204] (2/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_39-248870-0 from training. Duration: 0.69 2023-10-03 03:10:38,794 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.3.encoder.layers.1.self_attn_weights.whiten_keys, num_groups=8, num_channels=256, metric=4.22 vs. limit=6.0 2023-10-03 03:10:39,457 WARNING [train.py:1204] (2/4) Exclude cut with ID _3541_210_2477_1_1526475408709_3263973_377-296839-0_sp1.1 from training. Duration: 0.5 2023-10-03 03:10:39,472 WARNING [train.py:1204] (2/4) Exclude cut with ID _13679_210_7497_1_1530534293662_3355098_413-56154-0_sp1.1 from training. Duration: 0.963625 2023-10-03 03:10:41,345 WARNING [train.py:1204] (2/4) Exclude cut with ID _14970_210_15709_1_1530775547361_1966965_201-84544-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 03:10:43,187 WARNING [train.py:1204] (2/4) Exclude cut with ID _21783_210_11357_1_1531742458730_3762679_105-95775-0_sp1.1 from training. Duration: 0.936375 2023-10-03 03:10:45,917 WARNING [train.py:1204] (2/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_131-191669-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 03:10:45,943 WARNING [train.py:1204] (2/4) Exclude cut with ID _14860_210_4915_1_1530702020340_7343240_695-4323-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 03:10:47,360 WARNING [train.py:1204] (2/4) Exclude cut with ID _39867_210_9826_1_1533553222030_9344259_991-130565-0_sp1.1 from training. Duration: 0.97275 2023-10-03 03:10:48,744 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_50-3289-0_sp1.1 from training. Duration: 0.82725 2023-10-03 03:10:51,822 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_54-347670-0_sp1.1 from training. Duration: 0.92725 2023-10-03 03:10:51,887 WARNING [train.py:1204] (2/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_200-153445-0_sp1.1 from training. Duration: 0.936375 2023-10-03 03:10:53,230 WARNING [train.py:1204] (2/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_279-302911-0_sp1.1 from training. Duration: 0.97275 2023-10-03 03:10:53,443 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.0.feed_forward2.hidden_balancer.prob, batch_count=1117106.6666666667, ans=0.125 2023-10-03 03:10:53,574 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.conv_module2.balancer2.prob, batch_count=1117106.6666666667, ans=0.125 2023-10-03 03:10:54,681 WARNING [train.py:1204] (2/4) Exclude cut with ID _14886_210_4220_1_1530702020355_3516190_159-73748-0_sp0.9 from training. Duration: 0.92225 2023-10-03 03:10:57,761 INFO [scaling.py:1118] (2/4) WithLoss: name=encoder.encoders.5.encoder.layers.1.self_attn_weights, loss-sum=0.000e+00 2023-10-03 03:11:00,163 WARNING [train.py:1204] (2/4) Exclude cut with ID _53379_210_20034_1_1534222027363_3854654_23-222715-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 03:11:01,586 WARNING [train.py:1204] (2/4) Exclude cut with ID _12555_210_8750_1_1530075594402_3780239_449-267986-0 from training. Duration: 0.83 2023-10-03 03:11:01,605 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7544_210_6753_1_1528587722_3576686_295-258479-0 from training. Duration: 0.84 2023-10-03 03:11:01,844 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.bypass.scale_min, batch_count=1117106.6666666667, ans=0.2 2023-10-03 03:11:03,046 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7002_210_1794_1_1527935634_4034444_487-236501-0_sp0.9 from training. Duration: 0.7333125 2023-10-03 03:11:03,098 WARNING [train.py:1204] (2/4) Exclude cut with ID _21783_210_11357_1_1531742458730_3762679_244-19107-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 03:11:03,111 WARNING [train.py:1204] (2/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_390-339137-0 from training. Duration: 0.56 2023-10-03 03:11:05,086 WARNING [train.py:1204] (2/4) Exclude cut with ID _7562_210_2084_1_1528887767916_3946229_186-326422-0_sp1.1 from training. Duration: 0.763625 2023-10-03 03:11:05,145 WARNING [train.py:1204] (2/4) Exclude cut with ID _33640_210_2655_1_1533117605479_4855469_645-145235-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 03:11:05,164 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_25767_210_2161_1_1532307483_3867748_506-171777-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 03:11:05,189 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_5438_210_1794_1_1527336687_2600009_233-58072-0_sp1.1 from training. Duration: 0.7 2023-10-03 03:11:05,190 WARNING [train.py:1204] (2/4) Exclude cut with ID IC0669W0063-330908 from training. Duration: 0.896 2023-10-03 03:11:06,482 WARNING [train.py:1204] (2/4) Exclude cut with ID IC0671W0070-331913 from training. Duration: 0.896 2023-10-03 03:11:06,485 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_258-310570-0_sp1.1 from training. Duration: 0.7454375 2023-10-03 03:11:07,739 INFO [train.py:1046] (2/4) Epoch 32, batch 2900, loss[loss=0.1898, simple_loss=0.2608, pruned_loss=0.05935, over 23840.00 frames. ], tot_loss[loss=0.1619, simple_loss=0.2407, pruned_loss=0.04152, over 4726215.69 frames. ], batch size: 179, lr: 3.19e-03, grad_scale: 16.0 2023-10-03 03:11:07,796 WARNING [train.py:1204] (2/4) Exclude cut with ID _9461_210_4557_1_1529060514726_3677451_162-177507-0_sp1.1 from training. Duration: 0.97275 2023-10-03 03:11:11,808 WARNING [train.py:1204] (2/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_26-175553-0_sp0.9 from training. Duration: 0.67775 2023-10-03 03:11:11,825 WARNING [train.py:1204] (2/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_7-150481-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 03:11:11,884 WARNING [train.py:1204] (2/4) Exclude cut with ID _23557_210_8750_1_1531890106370_6846255_72-273262-0_sp1.1 from training. Duration: 0.963625 2023-10-03 03:11:13,771 WARNING [train.py:1204] (2/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_367-61330-0 from training. Duration: 0.75 2023-10-03 03:11:17,267 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.self_attn_weights.pos_emb_skip_rate, batch_count=1117173.3333333333, ans=0.0 2023-10-03 03:11:17,293 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.feed_forward2.hidden_balancer.prob, batch_count=1117173.3333333333, ans=0.125 2023-10-03 03:11:18,322 WARNING [train.py:1204] (2/4) Exclude cut with ID _39867_210_9826_1_1533553222030_9344259_1337-11141-0_sp1.1 from training. Duration: 0.936375 2023-10-03 03:11:18,349 WARNING [train.py:1204] (2/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_101-293367-0 from training. Duration: 0.63 2023-10-03 03:11:18,415 WARNING [train.py:1204] (2/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_10-100367-0 from training. Duration: 0.98 2023-10-03 03:11:19,965 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=1117173.3333333333, ans=0.1 2023-10-03 03:11:21,219 WARNING [train.py:1204] (2/4) Exclude cut with ID _60057_210_10640_1_1534903065793_3333150_171-181133-0_sp1.1 from training. Duration: 0.8 2023-10-03 03:11:21,222 WARNING [train.py:1204] (2/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_844-96989-0_sp0.9 from training. Duration: 0.788875 2023-10-03 03:11:21,977 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.1.encoder.layers.1.feed_forward1.out_whiten, num_groups=1, num_channels=256, metric=5.74 vs. limit=15.0 2023-10-03 03:11:23,924 WARNING [train.py:1204] (2/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_301-144595-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 03:11:25,762 WARNING [train.py:1204] (2/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_115-147343-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 03:11:28,637 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_3171_210_2087_1_1526109084_2330007_37-64334-0_sp1.1 from training. Duration: 0.7454375 2023-10-03 03:11:28,682 WARNING [train.py:1204] (2/4) Exclude cut with ID _19514_210_9259_1_1531530071330_5084230_769-272914-0_sp1.1 from training. Duration: 0.936375 2023-10-03 03:11:30,773 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.2.encoder.layers.2.nonlin_attention.whiten2, num_groups=1, num_channels=384, metric=4.33 vs. limit=15.0 2023-10-03 03:11:31,433 WARNING [train.py:1204] (2/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_76-311963-0_sp0.9 from training. Duration: 0.77775 2023-10-03 03:11:32,791 WARNING [train.py:1204] (2/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_526-62079-0 from training. Duration: 0.67 2023-10-03 03:11:32,833 WARNING [train.py:1204] (2/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_7-111954-0_sp0.9 from training. Duration: 0.62225 2023-10-03 03:11:36,110 WARNING [train.py:1204] (2/4) Exclude cut with ID _57997_210_4163_1_1534766425889_3961049_360-279140-0_sp1.1 from training. Duration: 0.97275 2023-10-03 03:11:37,555 WARNING [train.py:1204] (2/4) Exclude cut with ID _4866_210_2577_1_1527404412284_4364659_133-1696-0 from training. Duration: 0.99 2023-10-03 03:11:38,853 WARNING [train.py:1204] (2/4) Exclude cut with ID _5190_210_4557_1_1527244368799_373439_28-197030-0 from training. Duration: 0.72 2023-10-03 03:11:39,143 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.feed_forward1.out_proj.dropout_p, batch_count=1117306.6666666667, ans=0.1 2023-10-03 03:11:41,855 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_53-39003-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 03:11:41,858 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6673_210_3273_1_1527767790_3173240_351-167954-0 from training. Duration: 0.48 2023-10-03 03:11:41,873 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_2822_210_2715_1_1526112586_1999587_165-36656-0_sp1.1 from training. Duration: 0.8090625 2023-10-03 03:11:43,874 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_328-264892-0_sp1.1 from training. Duration: 0.92725 2023-10-03 03:11:43,879 WARNING [train.py:1204] (2/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_29-293896-0_sp0.9 from training. Duration: 0.67775 2023-10-03 03:11:46,579 WARNING [train.py:1204] (2/4) Exclude cut with ID _65263_210_22356_1_1535763598691_3921037_465-55448-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 03:11:48,662 WARNING [train.py:1204] (2/4) Exclude cut with ID _21989_210_5854_1_1531899214931_3271360_311-53106-0_sp1.1 from training. Duration: 0.97275 2023-10-03 03:11:49,993 WARNING [train.py:1204] (2/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_389-277915-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 03:11:51,407 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_69630_210_7234_1_1535791094_677837_63-277139-0_sp1.1 from training. Duration: 0.963625 2023-10-03 03:11:54,189 WARNING [train.py:1204] (2/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_85-138257-0 from training. Duration: 0.71 2023-10-03 03:11:54,212 WARNING [train.py:1204] (2/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_686-247600-0 from training. Duration: 0.92 2023-10-03 03:11:54,217 WARNING [train.py:1204] (2/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_108-255430-0_sp1.1 from training. Duration: 0.8818125 2023-10-03 03:11:58,735 WARNING [train.py:1204] (2/4) Exclude cut with ID _14884_210_2252_1_1530707459689_5561250_114-201399-0_sp1.1 from training. Duration: 0.6909375 2023-10-03 03:12:01,443 WARNING [train.py:1204] (2/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_506-42614-0 from training. Duration: 0.79 2023-10-03 03:12:03,441 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_43802_210_2290_1_1533516203_8829504_85-195240-0_sp1.1 from training. Duration: 0.7909375 2023-10-03 03:12:09,043 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_141-36243-0_sp1.1 from training. Duration: 0.97275 2023-10-03 03:12:13,918 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.conv_skip_rate, batch_count=1117440.0, ans=0.0 2023-10-03 03:12:13,994 INFO [scaling.py:1118] (2/4) WithLoss: name=encoder.encoders.3.encoder.layers.2.self_attn_weights, loss-sum=0.000e+00 2023-10-03 03:12:16,504 WARNING [train.py:1204] (2/4) Exclude cut with ID _7852_210_5219_1_1528286471316_8139400_358-287492-0_sp1.1 from training. Duration: 0.87275 2023-10-03 03:12:16,535 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_805-71497-0_sp0.9 from training. Duration: 0.788875 2023-10-03 03:12:17,843 WARNING [train.py:1204] (2/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_178-39722-0 from training. Duration: 0.49 2023-10-03 03:12:22,464 INFO [train.py:1046] (2/4) Epoch 32, batch 2950, loss[loss=0.1648, simple_loss=0.2474, pruned_loss=0.04111, over 24480.00 frames. ], tot_loss[loss=0.1619, simple_loss=0.2412, pruned_loss=0.04134, over 4737580.00 frames. ], batch size: 69, lr: 3.19e-03, grad_scale: 16.0 2023-10-03 03:12:22,548 WARNING [train.py:1204] (2/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_428-181925-0_sp1.1 from training. Duration: 0.963625 2023-10-03 03:12:22,559 WARNING [train.py:1204] (2/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_770-200430-0 from training. Duration: 0.89 2023-10-03 03:12:23,829 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.550e+02 1.841e+02 2.014e+02 2.273e+02 4.138e+02, threshold=4.027e+02, percent-clipped=1.0 2023-10-03 03:12:23,907 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_1104-271009-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 03:12:23,944 WARNING [train.py:1204] (2/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_128-296581-0_sp0.9 from training. Duration: 0.82225 2023-10-03 03:12:28,363 WARNING [train.py:1204] (2/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_19-62511-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 03:12:31,502 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_805-71497-0 from training. Duration: 0.71 2023-10-03 03:12:31,570 WARNING [train.py:1204] (2/4) Exclude cut with ID _44778_210_2054_1_1533798127203_7443090_573-152077-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 03:12:32,862 WARNING [train.py:1204] (2/4) Exclude cut with ID _42875_210_14856_1_1533704397487_4012433_393-177816-0_sp1.1 from training. Duration: 0.963625 2023-10-03 03:12:34,764 WARNING [train.py:1204] (2/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_278-35159-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 03:12:34,873 WARNING [train.py:1204] (2/4) Exclude cut with ID _15037_210_2253_1_1530752294104_3267650_13-130361-0_sp1.1 from training. Duration: 0.9 2023-10-03 03:12:37,527 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_3019_210_2028_1_1526044245_3264865_127-135615-0 from training. Duration: 0.94 2023-10-03 03:12:37,588 WARNING [train.py:1204] (2/4) Exclude cut with ID _28317_210_12742_1_1532480302381_8510325_607-213951-0 from training. Duration: 0.84 2023-10-03 03:12:37,826 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.nonlin_attention.balancer.prob, batch_count=1117573.3333333333, ans=0.125 2023-10-03 03:12:38,887 WARNING [train.py:1204] (2/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_436-318113-0_sp0.9 from training. Duration: 0.8333125 2023-10-03 03:12:38,888 WARNING [train.py:1204] (2/4) Exclude cut with ID _13938_210_9988_1_1530518718978_3693960_148-152225-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 03:12:44,524 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_2541_210_1794_1_1525590269_3084765_156-173713-0_sp1.1 from training. Duration: 0.736375 2023-10-03 03:12:46,381 WARNING [train.py:1204] (2/4) Exclude cut with ID _28323_210_16187_1_1532480395667_4100300_531-24157-0_sp1.1 from training. Duration: 0.8909375 2023-10-03 03:12:47,777 WARNING [train.py:1204] (2/4) Exclude cut with ID _29303_210_2575_1_1532392237303_7410649_382-217492-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 03:12:49,102 WARNING [train.py:1204] (2/4) Exclude cut with ID _58895_210_8750_1_1535184081320_3649089_270-158013-0_sp1.1 from training. Duration: 0.92725 2023-10-03 03:12:52,434 WARNING [train.py:1204] (2/4) Exclude cut with ID _29277_210_3279_1_1532581109293_3173069_311-206227-0_sp1.1 from training. Duration: 0.936375 2023-10-03 03:12:52,467 WARNING [train.py:1204] (2/4) Exclude cut with ID _7160_210_1876_1_1527940923231_3703799_282-322611-0_sp0.9 from training. Duration: 0.9666875 2023-10-03 03:12:53,818 WARNING [train.py:1204] (2/4) Exclude cut with ID _53219_210_4525_1_1534834836008_4064825_562-32295-0_sp1.1 from training. Duration: 0.963625 2023-10-03 03:12:55,219 WARNING [train.py:1204] (2/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_408-87330-0_sp1.1 from training. Duration: 0.963625 2023-10-03 03:12:55,226 WARNING [train.py:1204] (2/4) Exclude cut with ID _59202_210_26984_1_1534816810612_3549174_137-318710-0_sp1.1 from training. Duration: 0.8090625 2023-10-03 03:12:56,670 WARNING [train.py:1204] (2/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_372-30302-0 from training. Duration: 0.86 2023-10-03 03:13:02,365 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7543_210_6753_1_1528541907_1399322_178-56269-0_sp1.1 from training. Duration: 0.95275 2023-10-03 03:13:02,383 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9109W0012-549758 from training. Duration: 0.512 2023-10-03 03:13:02,452 WARNING [train.py:1204] (2/4) Exclude cut with ID _5410_210_1388_1_1527397055795_1719020_39-112221-0_sp0.9 from training. Duration: 0.7666875 2023-10-03 03:13:04,555 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9121W0012-554933 from training. Duration: 0.896 2023-10-03 03:13:05,947 WARNING [train.py:1204] (2/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_476-272757-0 from training. Duration: 0.93 2023-10-03 03:13:05,975 WARNING [train.py:1204] (2/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_1085-171718-0_sp1.1 from training. Duration: 0.92725 2023-10-03 03:13:07,309 WARNING [train.py:1204] (2/4) Exclude cut with ID _8480_210_1874_1_1528589391205_6728330_309-139896-0_sp1.1 from training. Duration: 0.736375 2023-10-03 03:13:07,309 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9130W0113-559703 from training. Duration: 0.896 2023-10-03 03:13:07,321 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_292-265928-0_sp1.1 from training. Duration: 0.463625 2023-10-03 03:13:10,254 WARNING [train.py:1204] (2/4) Exclude cut with ID _8772_210_1850_1_1528635595972_3664011_96-12756-0 from training. Duration: 0.98 2023-10-03 03:13:11,569 WARNING [train.py:1204] (2/4) Exclude cut with ID _12556_210_8341_1_1530081024100_7327690_725-193477-0_sp1.1 from training. Duration: 0.8909375 2023-10-03 03:13:11,626 WARNING [train.py:1204] (2/4) Exclude cut with ID _5289_210_2084_1_1527678202736_3779299_142-103317-0_sp1.1 from training. Duration: 0.763625 2023-10-03 03:13:12,997 WARNING [train.py:1204] (2/4) Exclude cut with ID _45012_210_12651_1_1533608435136_5371380_206-51075-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 03:13:14,395 WARNING [train.py:1204] (2/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_176-56120-0_sp1.1 from training. Duration: 0.7545625 2023-10-03 03:13:14,419 WARNING [train.py:1204] (2/4) Exclude cut with ID _40284_210_2084_1_1533513679814_3704279_220-201864-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 03:13:15,807 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9165W0004-576638 from training. Duration: 0.896 2023-10-03 03:13:15,846 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_5438_210_1794_1_1527336687_2600009_218-343336-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 03:13:17,702 WARNING [train.py:1204] (2/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_318-162017-0 from training. Duration: 0.91 2023-10-03 03:13:19,735 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.3.encoder.layers.3.self_attn1.whiten, num_groups=1, num_channels=512, metric=11.97 vs. limit=22.5 2023-10-03 03:13:21,416 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_49544_210_1794_1_1534379770_5262083_483-349298-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 03:13:22,604 WARNING [train.py:1204] (2/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_197-28164-0_sp0.9 from training. Duration: 0.888875 2023-10-03 03:13:22,658 WARNING [train.py:1204] (2/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_643-52007-0 from training. Duration: 0.57 2023-10-03 03:13:22,683 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_38965_210_4852_1_1533174906_3484735_403-332963-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 03:13:24,081 WARNING [train.py:1204] (2/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_340-174176-0 from training. Duration: 0.49 2023-10-03 03:13:24,402 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.conv_skip_rate, batch_count=1117773.3333333333, ans=0.0 2023-10-03 03:13:26,912 WARNING [train.py:1204] (2/4) Exclude cut with ID _56708_210_13177_1_1535252351282_3422899_363-917-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 03:13:28,209 WARNING [train.py:1204] (2/4) Exclude cut with ID _19896_210_1883_1_1531709022662_6649569_525-159063-0_sp1.1 from training. Duration: 0.936375 2023-10-03 03:13:28,243 WARNING [train.py:1204] (2/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_430-116139-0_sp0.9 from training. Duration: 0.9444375 2023-10-03 03:13:29,635 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_53753_210_7234_1_1534320003_3594148_86-329250-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 03:13:29,647 WARNING [train.py:1204] (2/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_406-275649-0_sp1.1 from training. Duration: 0.3818125 2023-10-03 03:13:31,269 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.conv_module2.balancer2.prob, batch_count=1117773.3333333333, ans=0.125 2023-10-03 03:13:32,858 WARNING [train.py:1204] (2/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_1083-147536-0_sp1.1 from training. Duration: 0.8818125 2023-10-03 03:13:32,927 WARNING [train.py:1204] (2/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_54-311741-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 03:13:32,931 WARNING [train.py:1204] (2/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_237-58433-0_sp0.9 from training. Duration: 0.82225 2023-10-03 03:13:34,805 WARNING [train.py:1204] (2/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_98-51676-0_sp1.1 from training. Duration: 0.663625 2023-10-03 03:13:34,866 WARNING [train.py:1204] (2/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_386-270472-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 03:13:36,212 WARNING [train.py:1204] (2/4) Exclude cut with ID _19023_210_4353_1_1531378892552_5820780_83-254794-0_sp1.1 from training. Duration: 0.7818125 2023-10-03 03:13:37,482 INFO [train.py:1046] (2/4) Epoch 32, batch 3000, loss[loss=0.171, simple_loss=0.2613, pruned_loss=0.04033, over 24307.00 frames. ], tot_loss[loss=0.1628, simple_loss=0.2419, pruned_loss=0.04187, over 4724398.99 frames. ], batch size: 74, lr: 3.19e-03, grad_scale: 8.0 2023-10-03 03:13:37,482 INFO [train.py:1069] (2/4) Computing validation loss 2023-10-03 03:13:43,086 INFO [zipformer.py:1853] (2/4) name=encoder.encoders.0.layers.0.self_attn_weights, attn_weights_entropy = tensor([4.4364, 4.0040, 4.2322, 4.4069], device='cuda:2') 2023-10-03 03:13:49,426 INFO [train.py:1078] (2/4) Epoch 32, validation: loss=0.3583, simple_loss=0.2851, pruned_loss=0.2157, over 1125622.00 frames. 2023-10-03 03:13:49,427 INFO [train.py:1079] (2/4) Maximum memory allocated so far is 20967MB 2023-10-03 03:13:49,546 WARNING [train.py:1204] (2/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_314-93656-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 03:13:49,564 WARNING [train.py:1204] (2/4) Exclude cut with ID _14170_210_5219_1_1530525949868_4469650_336-213530-0 from training. Duration: 0.9 2023-10-03 03:13:50,971 WARNING [train.py:1204] (2/4) Exclude cut with ID _36686_210_5665_1_1533778225913_3646410_65-221120-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 03:13:53,696 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_26433_210_12108_1_1532157158_4056480_271-68178-0_sp1.1 from training. Duration: 0.92725 2023-10-03 03:13:53,731 WARNING [train.py:1204] (2/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_46-207599-0_sp0.9 from training. Duration: 0.711125 2023-10-03 03:13:56,440 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9303W0114-637133 from training. Duration: 0.896 2023-10-03 03:13:56,470 WARNING [train.py:1204] (2/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_490-207861-0 from training. Duration: 0.98 2023-10-03 03:13:59,705 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6767_210_3273_1_1527855783_1718932_173-256601-0_sp0.9 from training. Duration: 0.888875 2023-10-03 03:14:01,057 WARNING [train.py:1204] (2/4) Exclude cut with ID _13921_210_9994_1_1530841782272_3503520_327-69677-0_sp1.1 from training. Duration: 0.8181875 2023-10-03 03:14:01,120 WARNING [train.py:1204] (2/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_508-335369-0 from training. Duration: 0.48 2023-10-03 03:14:01,144 WARNING [train.py:1204] (2/4) Exclude cut with ID _64631_210_27767_1_1535349580068_4165160_24-46567-0_sp1.1 from training. Duration: 0.963625 2023-10-03 03:14:07,477 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.bypass.scale_min, batch_count=1117906.6666666667, ans=0.2 2023-10-03 03:14:08,489 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_89-35748-0_sp1.1 from training. Duration: 0.7181875 2023-10-03 03:14:10,257 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=1117906.6666666667, ans=0.1 2023-10-03 03:14:14,403 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.bypass_mid.scale_min, batch_count=1117906.6666666667, ans=0.2 2023-10-03 03:14:16,023 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_26433_210_12108_1_1532157158_4056480_578-262107-0_sp1.1 from training. Duration: 0.9 2023-10-03 03:14:22,087 WARNING [train.py:1204] (2/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_129-337802-0 from training. Duration: 0.72 2023-10-03 03:14:23,480 WARNING [train.py:1204] (2/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_356-167816-0_sp0.9 from training. Duration: 0.8 2023-10-03 03:14:24,962 WARNING [train.py:1204] (2/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_137-116442-0_sp1.1 from training. Duration: 0.7090625 2023-10-03 03:14:26,372 WARNING [train.py:1204] (2/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_358-172940-0_sp1.1 from training. Duration: 0.963625 2023-10-03 03:14:26,407 WARNING [train.py:1204] (2/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_668-344147-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 03:14:27,819 WARNING [train.py:1204] (2/4) Exclude cut with ID _62337_210_12392_1_1535009273105_3412229_255-157667-0_sp1.1 from training. Duration: 0.936375 2023-10-03 03:14:27,822 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_486-151131-0 from training. Duration: 0.85 2023-10-03 03:14:31,209 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_53638_210_21581_1_1534297319_5808449_885-80172-0 from training. Duration: 0.96 2023-10-03 03:14:32,561 WARNING [train.py:1204] (2/4) Exclude cut with ID _50093_210_4220_1_1534417141207_3432299_105-183067-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 03:14:32,621 WARNING [train.py:1204] (2/4) Exclude cut with ID _7562_210_2084_1_1528887767916_3946229_345-169156-0_sp0.9 from training. Duration: 0.6444375 2023-10-03 03:14:34,485 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_4528_210_3273_1_1527076654_3276777_209-219804-0_sp0.9 from training. Duration: 0.7666875 2023-10-03 03:14:35,721 WARNING [train.py:1204] (2/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_35-153886-0_sp0.9 from training. Duration: 0.9333125 2023-10-03 03:14:35,770 WARNING [train.py:1204] (2/4) Exclude cut with ID _30800_210_22382_1_1532499347904_4515959_332-121925-0_sp1.1 from training. Duration: 0.92725 2023-10-03 03:14:35,782 WARNING [train.py:1204] (2/4) Exclude cut with ID _59202_210_26984_1_1534816810612_3549174_495-65515-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 03:14:38,581 WARNING [train.py:1204] (2/4) Exclude cut with ID _6274_210_2084_1_1527937583837_7494020_769-168302-0_sp1.1 from training. Duration: 0.6454375 2023-10-03 03:14:38,616 WARNING [train.py:1204] (2/4) Exclude cut with ID _24271_210_12658_1_1531987270022_7190519_708-71345-0_sp1.1 from training. Duration: 0.936375 2023-10-03 03:14:38,616 WARNING [train.py:1204] (2/4) Exclude cut with ID 3033-130750-0096-55598-0_sp0.9 from training. Duration: 0.92225 2023-10-03 03:14:40,121 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.0.conv_skip_rate, batch_count=1118040.0, ans=0.0 2023-10-03 03:14:41,329 WARNING [train.py:1204] (2/4) Exclude cut with ID _14087_210_2253_1_1530529224512_3014029_219-224445-0_sp0.9 from training. Duration: 0.9333125 2023-10-03 03:14:42,812 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_174-277364-0 from training. Duration: 0.71 2023-10-03 03:14:42,890 WARNING [train.py:1204] (2/4) Exclude cut with ID _45021_210_9994_1_1533619751094_3330288_96-239010-0_sp1.1 from training. Duration: 0.82725 2023-10-03 03:14:43,068 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.attention_skip_rate, batch_count=1118040.0, ans=0.0 2023-10-03 03:14:44,118 WARNING [train.py:1204] (2/4) Exclude cut with ID _31602_210_5854_1_1532677021456_4458250_489-305116-0_sp1.1 from training. Duration: 0.97275 2023-10-03 03:14:44,137 WARNING [train.py:1204] (2/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_269-154726-0_sp0.9 from training. Duration: 0.9555625 2023-10-03 03:14:48,260 WARNING [train.py:1204] (2/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_126-54418-0_sp1.1 from training. Duration: 0.92725 2023-10-03 03:14:48,301 WARNING [train.py:1204] (2/4) Exclude cut with ID _42326_210_7770_1_1533814282531_3707269_182-224159-0_sp1.1 from training. Duration: 0.92725 2023-10-03 03:14:49,699 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7002_210_1794_1_1527939683_5157286_1-281264-0_sp0.9 from training. Duration: 0.488875 2023-10-03 03:14:49,731 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_86-16881-0 from training. Duration: 0.58 2023-10-03 03:14:51,068 WARNING [train.py:1204] (2/4) Exclude cut with ID _19514_210_9259_1_1531530071330_5084230_637-279094-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 03:14:51,107 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_26433_210_12108_1_1532157158_4056480_578-262107-0 from training. Duration: 0.99 2023-10-03 03:14:52,412 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6291_210_3273_1_1527683649_1078498_82-221196-0_sp1.1 from training. Duration: 0.6909375 2023-10-03 03:14:53,867 WARNING [train.py:1204] (2/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_402-102311-0 from training. Duration: 0.67 2023-10-03 03:14:54,626 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.3.encoder.layers.2.self_attn2.whiten, num_groups=1, num_channels=512, metric=11.40 vs. limit=22.5 2023-10-03 03:14:55,347 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1015-343884-0_sp1.1 from training. Duration: 0.563625 2023-10-03 03:14:57,200 WARNING [train.py:1204] (2/4) Exclude cut with ID _13684_210_7497_1_1530619354863_3598359_312-235606-0_sp1.1 from training. Duration: 0.5454375 2023-10-03 03:14:57,228 WARNING [train.py:1204] (2/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_55-31904-0 from training. Duration: 0.88 2023-10-03 03:14:57,427 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.ff3_skip_rate, batch_count=1118106.6666666667, ans=0.0 2023-10-03 03:14:57,486 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.conv_skip_rate, batch_count=1118106.6666666667, ans=0.0 2023-10-03 03:14:58,631 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_3133_210_2087_1_1526106446_2324690_25-332138-0 from training. Duration: 0.91 2023-10-03 03:14:58,633 WARNING [train.py:1204] (2/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_912-282412-0_sp0.9 from training. Duration: 0.5444375 2023-10-03 03:15:00,013 WARNING [train.py:1204] (2/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_458-299435-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 03:15:01,968 WARNING [train.py:1204] (2/4) Exclude cut with ID _69584_210_5219_1_1535785133775_4044450_275-88403-0_sp1.1 from training. Duration: 0.92725 2023-10-03 03:15:01,976 WARNING [train.py:1204] (2/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_841-341679-0_sp1.1 from training. Duration: 0.52725 2023-10-03 03:15:01,983 WARNING [train.py:1204] (2/4) Exclude cut with ID _41391_210_7994_1_1533603771872_3638201_1-54521-0_sp1.1 from training. Duration: 0.97275 2023-10-03 03:15:03,323 WARNING [train.py:1204] (2/4) Exclude cut with ID _56235_210_10640_1_1534557335235_4161099_495-186609-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 03:15:04,550 INFO [train.py:1046] (2/4) Epoch 32, batch 3050, loss[loss=0.2329, simple_loss=0.299, pruned_loss=0.08338, over 19776.00 frames. ], tot_loss[loss=0.1647, simple_loss=0.2435, pruned_loss=0.04288, over 4714493.85 frames. ], batch size: 388, lr: 3.19e-03, grad_scale: 8.0 2023-10-03 03:15:06,038 WARNING [train.py:1204] (2/4) Exclude cut with ID _4681_210_3525_1_1527250689464_120889_15-126760-0 from training. Duration: 0.81 2023-10-03 03:15:07,346 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.534e+02 1.893e+02 2.072e+02 2.427e+02 3.731e+02, threshold=4.145e+02, percent-clipped=0.0 2023-10-03 03:15:07,498 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_3019_210_2028_1_1526044245_3264865_127-135615-0_sp1.1 from training. Duration: 0.8545625 2023-10-03 03:15:10,233 WARNING [train.py:1204] (2/4) Exclude cut with ID _30191_210_1664_1_1533189915047_7196670_915-21561-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 03:15:10,272 WARNING [train.py:1204] (2/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_6-214815-0_sp0.9 from training. Duration: 0.9444375 2023-10-03 03:15:15,675 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_45399_210_4852_1_1533624725_7579737_190-137494-0_sp1.1 from training. Duration: 0.97275 2023-10-03 03:15:17,641 WARNING [train.py:1204] (2/4) Exclude cut with ID _41314_210_12651_1_1533459476543_3766460_207-132038-0 from training. Duration: 0.94 2023-10-03 03:15:22,714 WARNING [train.py:1204] (2/4) Exclude cut with ID _14603_210_4220_1_1530615688441_3097319_90-31383-0 from training. Duration: 0.87 2023-10-03 03:15:23,982 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_15517_210_6753_1_1530952293_1664795_119-31001-0 from training. Duration: 0.94 2023-10-03 03:15:24,017 WARNING [train.py:1204] (2/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_684-85342-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 03:15:27,982 WARNING [train.py:1204] (2/4) Exclude cut with ID _14603_210_4220_1_1530615688441_3097319_97-325053-0_sp1.1 from training. Duration: 0.836375 2023-10-03 03:15:31,294 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_68702_210_23689_1_1535799577_3420068_168-25150-0_sp1.1 from training. Duration: 0.97275 2023-10-03 03:15:31,308 WARNING [train.py:1204] (2/4) Exclude cut with ID _65134_210_14763_1_1535335393985_3505368_111-27984-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 03:15:32,718 WARNING [train.py:1204] (2/4) Exclude cut with ID _48678_210_5663_1_1533988830947_3769199_276-226230-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 03:15:35,579 WARNING [train.py:1204] (2/4) Exclude cut with ID _28427_210_4220_1_1532581240967_3061069_43-84928-0_sp1.1 from training. Duration: 0.9 2023-10-03 03:15:35,607 WARNING [train.py:1204] (2/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_158-250116-0_sp1.1 from training. Duration: 0.563625 2023-10-03 03:15:35,637 WARNING [train.py:1204] (2/4) Exclude cut with ID _66873_210_5517_1_1535454136506_4293370_279-332556-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 03:15:35,678 WARNING [train.py:1204] (2/4) Exclude cut with ID _42863_210_9070_1_1533902398788_3696920_488-230146-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 03:15:35,678 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_387-247159-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 03:15:37,010 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6729_210_2550_1_1528032481_1788725_261-42619-0_sp1.1 from training. Duration: 0.97275 2023-10-03 03:15:40,131 WARNING [train.py:1204] (2/4) Exclude cut with ID _19896_210_1883_1_1531709022662_6649569_831-44724-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 03:15:42,867 WARNING [train.py:1204] (2/4) Exclude cut with ID _55153_210_2253_1_1534384786677_3162096_280-285944-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 03:15:42,891 WARNING [train.py:1204] (2/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_448-4048-0 from training. Duration: 0.96 2023-10-03 03:15:42,939 WARNING [train.py:1204] (2/4) Exclude cut with ID _29678_210_7893_1_1532421042787_3906118_456-202548-0_sp1.1 from training. Duration: 0.97275 2023-10-03 03:15:42,951 WARNING [train.py:1204] (2/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_107-332398-0_sp1.1 from training. Duration: 0.7090625 2023-10-03 03:15:47,129 WARNING [train.py:1204] (2/4) Exclude cut with ID _13921_210_9994_1_1530838702800_3003429_46-149444-0_sp1.1 from training. Duration: 0.8545625 2023-10-03 03:15:47,160 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_257-20733-0_sp0.9 from training. Duration: 0.8444375 2023-10-03 03:15:47,223 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_53753_210_7234_1_1534320003_3594148_385-234809-0_sp1.1 from training. Duration: 0.963625 2023-10-03 03:15:49,181 WARNING [train.py:1204] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_292-215346-0_sp1.1 from training. Duration: 0.8818125 2023-10-03 03:15:52,150 WARNING [train.py:1204] (2/4) Exclude cut with ID _68965_210_1883_1_1535698962272_3497490_196-124268-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 03:15:53,464 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_23801_210_16223_1_1532066392_3294657_570-5439-0_sp1.1 from training. Duration: 0.8818125 2023-10-03 03:15:55,043 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.feed_forward3.hidden_balancer.prob, batch_count=1118373.3333333333, ans=0.125 2023-10-03 03:15:59,042 WARNING [train.py:1204] (2/4) Exclude cut with ID _5166_210_2084_1_1527073197160_4087993_349-6928-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 03:15:59,092 WARNING [train.py:1204] (2/4) Exclude cut with ID _60621_210_17071_1_1535013053530_3671739_226-255202-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 03:15:59,096 WARNING [train.py:1204] (2/4) Exclude cut with ID _41817_210_18835_1_1533434477160_7423049_448-130302-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 03:16:02,318 WARNING [train.py:1204] (2/4) Exclude cut with ID _6653_210_2629_1_1527919172441_6732679_758-143677-0_sp1.1 from training. Duration: 0.8909375 2023-10-03 03:16:02,360 WARNING [train.py:1204] (2/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_912-47473-0_sp0.9 from training. Duration: 0.7555625 2023-10-03 03:16:02,389 WARNING [train.py:1204] (2/4) Exclude cut with ID _6694_210_4599_1_1527852637490_3965529_356-169233-0_sp1.1 from training. Duration: 0.9 2023-10-03 03:16:03,842 WARNING [train.py:1204] (2/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_1-99680-0 from training. Duration: 0.67 2023-10-03 03:16:05,225 WARNING [train.py:1204] (2/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_199-86432-0_sp1.1 from training. Duration: 0.8909375 2023-10-03 03:16:07,145 WARNING [train.py:1204] (2/4) Exclude cut with ID _61148_210_6897_1_1535094022726_4217610_362-182430-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 03:16:08,513 WARNING [train.py:1204] (2/4) Exclude cut with ID _49376_210_5986_1_1533985278732_3370740_301-154763-0 from training. Duration: 0.98 2023-10-03 03:16:09,859 WARNING [train.py:1204] (2/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_572-185150-0_sp1.1 from training. Duration: 0.8818125 2023-10-03 03:16:15,472 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_47896_210_20508_1_1534492518_5958868_558-314572-0_sp1.1 from training. Duration: 0.8818125 2023-10-03 03:16:16,910 WARNING [train.py:1204] (2/4) Exclude cut with ID _45560_210_21982_1_1533812603951_3584999_111-6376-0_sp0.9 from training. Duration: 0.9333125 2023-10-03 03:16:18,174 INFO [train.py:1046] (2/4) Epoch 32, batch 3100, loss[loss=0.1745, simple_loss=0.2558, pruned_loss=0.04658, over 23338.00 frames. ], tot_loss[loss=0.1643, simple_loss=0.2428, pruned_loss=0.04288, over 4720758.97 frames. ], batch size: 93, lr: 3.19e-03, grad_scale: 8.0 2023-10-03 03:16:21,635 WARNING [train.py:1204] (2/4) Exclude cut with ID _14170_210_5219_1_1530525949868_4469650_412-317077-0_sp1.1 from training. Duration: 0.6818125 2023-10-03 03:16:23,700 WARNING [train.py:1204] (2/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_718-68505-0 from training. Duration: 0.89 2023-10-03 03:16:25,191 WARNING [train.py:1204] (2/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_77-311404-0 from training. Duration: 0.94 2023-10-03 03:16:26,564 WARNING [train.py:1204] (2/4) Exclude cut with ID _60229_210_27767_1_1534838406158_3655350_264-279573-0 from training. Duration: 0.83 2023-10-03 03:16:27,872 WARNING [train.py:1204] (2/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_106-300985-0_sp1.1 from training. Duration: 0.8181875 2023-10-03 03:16:32,560 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_4183_210_2087_1_1526729100_1869838_79-277189-0_sp1.1 from training. Duration: 0.963625 2023-10-03 03:16:32,577 WARNING [train.py:1204] (2/4) Exclude cut with ID _28427_210_4220_1_1532581240967_3061069_195-295268-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 03:16:32,895 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.balancer2.prob, batch_count=1118573.3333333333, ans=0.125 2023-10-03 03:16:34,138 WARNING [train.py:1204] (2/4) Exclude cut with ID _4092_210_2151_1_1526643044449_3135759_356-278694-0_sp0.9 from training. Duration: 0.52225 2023-10-03 03:16:38,211 WARNING [train.py:1204] (2/4) Exclude cut with ID _14459_210_2228_1_1531443641946_7516700_777-71446-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 03:16:42,877 WARNING [train.py:1204] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_520-126729-0 from training. Duration: 0.7 2023-10-03 03:16:47,005 WARNING [train.py:1204] (2/4) Exclude cut with ID _13684_210_7497_1_1530619354863_3598359_312-235606-0_sp0.9 from training. Duration: 0.6666875 2023-10-03 03:16:48,305 WARNING [train.py:1204] (2/4) Exclude cut with ID _37537_210_2253_1_1533171758568_3421949_283-349402-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 03:16:48,351 WARNING [train.py:1204] (2/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_29-117932-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 03:16:49,597 WARNING [train.py:1204] (2/4) Exclude cut with ID _29303_210_2575_1_1532392237303_7410649_721-283351-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 03:16:51,001 WARNING [train.py:1204] (2/4) Exclude cut with ID _14886_210_4220_1_1530702020355_3516190_175-229073-0_sp0.9 from training. Duration: 0.5 2023-10-03 03:16:54,453 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.0.layers.0.self_attn_weights.whiten_keys, num_groups=4, num_channels=128, metric=3.01 vs. limit=6.0 2023-10-03 03:16:54,690 WARNING [train.py:1204] (2/4) Exclude cut with ID _15458_210_8341_1_1530858614263_3834410_70-268830-0_sp1.1 from training. Duration: 0.97275 2023-10-03 03:16:54,707 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_2295_210_2087_1_1525500993_2407863_234-83104-0 from training. Duration: 0.62 2023-10-03 03:16:54,708 WARNING [train.py:1204] (2/4) Exclude cut with ID _22961_210_8254_1_1531976366672_4278180_317-199546-0_sp1.1 from training. Duration: 0.7818125 2023-10-03 03:16:56,106 WARNING [train.py:1204] (2/4) Exclude cut with ID _27894_210_12392_1_1532415496078_6902439_547-196022-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 03:16:57,505 WARNING [train.py:1204] (2/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_46-207599-0 from training. Duration: 0.64 2023-10-03 03:16:58,876 WARNING [train.py:1204] (2/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_426-326246-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 03:17:01,701 WARNING [train.py:1204] (2/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_445-117398-0_sp0.9 from training. Duration: 0.988875 2023-10-03 03:17:03,078 WARNING [train.py:1204] (2/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_563-347832-0 from training. Duration: 0.89 2023-10-03 03:17:05,102 WARNING [train.py:1204] (2/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_252-216674-0 from training. Duration: 0.54 2023-10-03 03:17:05,184 WARNING [train.py:1204] (2/4) Exclude cut with ID _59611_210_8895_1_1534741198240_3650361_187-212779-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 03:17:06,647 WARNING [train.py:1204] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_282-159367-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 03:17:08,034 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_68702_210_23689_1_1535799577_3420068_33-136883-0_sp1.1 from training. Duration: 0.936375 2023-10-03 03:17:08,051 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_47228_210_4983_1_1533780021_3672027_184-293706-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 03:17:08,076 WARNING [train.py:1204] (2/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_656-188361-0_sp0.9 from training. Duration: 0.9666875 2023-10-03 03:17:09,464 WARNING [train.py:1204] (2/4) Exclude cut with ID _4866_210_2577_1_1527404412284_4364659_352-157930-0_sp0.9 from training. Duration: 0.9 2023-10-03 03:17:09,465 WARNING [train.py:1204] (2/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_210-106047-0_sp1.1 from training. Duration: 0.8818125 2023-10-03 03:17:10,899 WARNING [train.py:1204] (2/4) Exclude cut with ID _9389_210_1664_1_1529217337301_5289269_289-213417-0_sp1.1 from training. Duration: 0.8090625 2023-10-03 03:17:12,251 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_3910_210_1794_1_1526777667_3747260_178-41186-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 03:17:12,255 WARNING [train.py:1204] (2/4) Exclude cut with ID _27052_210_5986_1_1532329197187_3585009_24-172263-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 03:17:12,259 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_5522_210_3273_1_1527249606_3909096_318-203153-0_sp1.1 from training. Duration: 0.3909375 2023-10-03 03:17:15,773 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.self_attn_weights.pos_emb_skip_rate, batch_count=1118706.6666666667, ans=0.0 2023-10-03 03:17:16,767 WARNING [train.py:1204] (2/4) Exclude cut with ID _27894_210_12392_1_1532415496078_6902439_222-51982-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 03:17:18,172 WARNING [train.py:1204] (2/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_87-131427-0 from training. Duration: 0.78 2023-10-03 03:17:19,590 WARNING [train.py:1204] (2/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_372-298093-0_sp0.9 from training. Duration: 0.888875 2023-10-03 03:17:20,928 WARNING [train.py:1204] (2/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_663-166973-0 from training. Duration: 0.8 2023-10-03 03:17:20,964 WARNING [train.py:1204] (2/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_429-177469-0_sp1.1 from training. Duration: 0.936375 2023-10-03 03:17:20,991 WARNING [train.py:1204] (2/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_724-172651-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 03:17:22,299 WARNING [train.py:1204] (2/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_103-336064-0 from training. Duration: 0.51 2023-10-03 03:17:24,577 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.conv_module2.balancer2.prob, batch_count=1118773.3333333333, ans=0.125 2023-10-03 03:17:31,718 WARNING [train.py:1204] (2/4) Exclude cut with ID _48677_210_5663_1_1533902405919_3892989_111-11439-0 from training. Duration: 0.96 2023-10-03 03:17:33,070 INFO [train.py:1046] (2/4) Epoch 32, batch 3150, loss[loss=0.152, simple_loss=0.215, pruned_loss=0.04444, over 23652.00 frames. ], tot_loss[loss=0.1628, simple_loss=0.2408, pruned_loss=0.04238, over 4711044.68 frames. ], batch size: 256, lr: 3.18e-03, grad_scale: 8.0 2023-10-03 03:17:35,019 WARNING [train.py:1204] (2/4) Exclude cut with ID _8085_210_4557_1_1528455633493_3716100_95-103398-0_sp1.1 from training. Duration: 0.963625 2023-10-03 03:17:35,079 WARNING [train.py:1204] (2/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_1041-110837-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 03:17:36,439 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.500e+02 1.898e+02 2.081e+02 2.464e+02 4.773e+02, threshold=4.162e+02, percent-clipped=1.0 2023-10-03 03:17:36,609 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_50050_210_16223_1_1535435796_7290138_1155-121757-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 03:17:36,618 WARNING [train.py:1204] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_602-275260-0_sp0.9 from training. Duration: 0.911125 2023-10-03 03:17:37,808 WARNING [train.py:1204] (2/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_265-164486-0 from training. Duration: 0.96 2023-10-03 03:17:37,907 WARNING [train.py:1204] (2/4) Exclude cut with ID _46067_210_4220_1_1534125571066_3307450_142-229375-0_sp1.1 from training. Duration: 0.963625 2023-10-03 03:17:37,939 WARNING [train.py:1204] (2/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_310-26186-0_sp1.1 from training. Duration: 0.636375 2023-10-03 03:17:40,614 WARNING [train.py:1204] (2/4) Exclude cut with ID _28461_210_2253_1_1532771775189_3041299_136-270644-0 from training. Duration: 0.93 2023-10-03 03:17:42,138 WARNING [train.py:1204] (2/4) Exclude cut with ID _14860_210_4915_1_1530702020340_7343240_438-286372-0_sp1.1 from training. Duration: 0.936375 2023-10-03 03:17:45,383 WARNING [train.py:1204] (2/4) Exclude cut with ID IC0128W0004-63309 from training. Duration: 0.831 2023-10-03 03:17:46,824 WARNING [train.py:1204] (2/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_135-174080-0 from training. Duration: 0.53 2023-10-03 03:17:48,191 WARNING [train.py:1204] (2/4) Exclude cut with ID _41326_210_5854_1_1533778076509_3492080_277-59511-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 03:17:48,262 WARNING [train.py:1204] (2/4) Exclude cut with ID IC0144W0112-71379 from training. Duration: 0.992 2023-10-03 03:17:49,612 WARNING [train.py:1204] (2/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_711-15688-0_sp0.9 from training. Duration: 0.47775 2023-10-03 03:17:51,189 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.ff3_skip_rate, batch_count=1118906.6666666667, ans=0.0 2023-10-03 03:17:52,310 WARNING [train.py:1204] (2/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_237-332027-0 from training. Duration: 0.95 2023-10-03 03:17:52,352 WARNING [train.py:1204] (2/4) Exclude cut with ID _7970_210_1883_1_1528455616296_3472230_25-178407-0 from training. Duration: 0.71 2023-10-03 03:17:52,354 WARNING [train.py:1204] (2/4) Exclude cut with ID _8480_210_1874_1_1528589391205_6728330_309-139896-0 from training. Duration: 0.81 2023-10-03 03:17:52,380 WARNING [train.py:1204] (2/4) Exclude cut with ID _39571_210_13449_1_1533801584143_3651077_319-268877-0_sp1.1 from training. Duration: 0.936375 2023-10-03 03:17:52,383 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_43802_210_2290_1_1533516203_8829504_155-246880-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 03:17:53,786 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_566-122599-0_sp1.1 from training. Duration: 0.936375 2023-10-03 03:17:57,034 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_12868_210_6753_1_1530701932_530275_18-76322-0 from training. Duration: 0.87 2023-10-03 03:17:58,342 WARNING [train.py:1204] (2/4) Exclude cut with ID _56494_210_4220_1_1535284837625_3033169_159-106035-0_sp1.1 from training. Duration: 0.963625 2023-10-03 03:17:58,373 WARNING [train.py:1204] (2/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_98-276013-0_sp1.1 from training. Duration: 0.963625 2023-10-03 03:17:58,424 WARNING [train.py:1204] (2/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_330-216124-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 03:17:59,945 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_177-58005-0_sp0.9 from training. Duration: 0.688875 2023-10-03 03:18:01,687 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.self_attn_weights.pos_emb_skip_rate, batch_count=1118973.3333333333, ans=0.0 2023-10-03 03:18:03,319 WARNING [train.py:1204] (2/4) Exclude cut with ID _56235_210_10640_1_1534557335235_4161099_343-139898-0 from training. Duration: 0.96 2023-10-03 03:18:04,671 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6961_210_2087_1_1527937168_6815011_395-185060-0_sp1.1 from training. Duration: 0.863625 2023-10-03 03:18:06,120 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7002_210_1794_1_1527939683_5157286_184-229630-0_sp0.9 from training. Duration: 0.82225 2023-10-03 03:18:07,318 WARNING [train.py:1204] (2/4) Exclude cut with ID _59170_210_6721_1_1534983633960_4203335_153-18895-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 03:18:07,370 WARNING [train.py:1204] (2/4) Exclude cut with ID _49643_210_7989_1_1534640244224_3939031_598-161926-0 from training. Duration: 0.9 2023-10-03 03:18:07,608 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.feed_forward1.hidden_balancer.prob, batch_count=1118973.3333333333, ans=0.125 2023-10-03 03:18:10,150 WARNING [train.py:1204] (2/4) Exclude cut with ID _4068_210_2629_1_1526697350395_1546259_224-288693-0 from training. Duration: 0.59 2023-10-03 03:18:11,520 WARNING [train.py:1204] (2/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_718-68505-0_sp1.1 from training. Duration: 0.8090625 2023-10-03 03:18:11,569 WARNING [train.py:1204] (2/4) Exclude cut with ID _4068_210_2629_1_1526697350395_1546259_224-288693-0_sp0.9 from training. Duration: 0.6555625 2023-10-03 03:18:11,582 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_2296_210_2087_1_1525503448_3397335_57-185430-0_sp0.9 from training. Duration: 0.6666875 2023-10-03 03:18:12,903 WARNING [train.py:1204] (2/4) Exclude cut with ID _15458_210_8341_1_1530858614263_3834410_49-235472-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 03:18:12,906 WARNING [train.py:1204] (2/4) Exclude cut with ID _64135_210_2253_1_1535677267876_7608972_498-5039-0_sp0.9 from training. Duration: 0.8666875 2023-10-03 03:18:14,890 WARNING [train.py:1204] (2/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_283-220372-0_sp0.9 from training. Duration: 0.62225 2023-10-03 03:18:14,899 WARNING [train.py:1204] (2/4) Exclude cut with ID _6473_210_1741_1_1527901307396_3815380_108-17335-0_sp0.9 from training. Duration: 0.72225 2023-10-03 03:18:14,999 WARNING [train.py:1204] (2/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_113-92608-0 from training. Duration: 0.54 2023-10-03 03:18:16,240 WARNING [train.py:1204] (2/4) Exclude cut with ID _14170_210_5219_1_1530525949868_4469650_272-249985-0_sp1.1 from training. Duration: 0.6090625 2023-10-03 03:18:16,250 WARNING [train.py:1204] (2/4) Exclude cut with ID _50376_210_13401_1_1534031502101_4217740_518-187390-0_sp1.1 from training. Duration: 0.97275 2023-10-03 03:18:16,440 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.bypass_mid.scale_min, batch_count=1119040.0, ans=0.2 2023-10-03 03:18:17,669 WARNING [train.py:1204] (2/4) Exclude cut with ID _59738_210_15379_1_1534849512562_3941470_165-248260-0_sp1.1 from training. Duration: 0.92725 2023-10-03 03:18:17,687 WARNING [train.py:1204] (2/4) Exclude cut with ID _23818_210_6228_1_1531917063884_3676920_400-343925-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 03:18:18,987 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6961_210_2087_1_1527937168_6815011_562-165518-0 from training. Duration: 0.77 2023-10-03 03:18:19,034 WARNING [train.py:1204] (2/4) Exclude cut with ID _24271_210_12658_1_1531987270022_7190519_289-207931-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 03:18:20,478 WARNING [train.py:1204] (2/4) Exclude cut with ID _14632_210_9988_1_1530691279392_3923200_332-105904-0 from training. Duration: 0.9 2023-10-03 03:18:20,491 WARNING [train.py:1204] (2/4) Exclude cut with ID _40402_210_18219_1_1533522324115_2953150_203-232438-0_sp1.1 from training. Duration: 0.97275 2023-10-03 03:18:22,448 WARNING [train.py:1204] (2/4) Exclude cut with ID _13252_210_1925_1_1530671372047_3336909_263-168388-0 from training. Duration: 0.86 2023-10-03 03:18:23,321 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=1119040.0, ans=0.1 2023-10-03 03:18:24,444 WARNING [train.py:1204] (2/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_174-205058-0 from training. Duration: 0.95 2023-10-03 03:18:25,883 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_94-195801-0_sp1.1 from training. Duration: 0.8545625 2023-10-03 03:18:25,918 WARNING [train.py:1204] (2/4) Exclude cut with ID _55153_210_2253_1_1534384786677_3162096_269-103204-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 03:18:27,188 WARNING [train.py:1204] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_748-36150-0 from training. Duration: 0.91 2023-10-03 03:18:28,599 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_12868_210_6753_1_1530702479_3610564_148-172861-0_sp1.1 from training. Duration: 0.4545625 2023-10-03 03:18:28,663 WARNING [train.py:1204] (2/4) Exclude cut with ID _67499_210_17282_1_1535509805220_3692430_460-77730-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 03:18:31,424 WARNING [train.py:1204] (2/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_374-20753-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 03:18:33,325 WARNING [train.py:1204] (2/4) Exclude cut with ID _45329_210_7005_1_1534157951211_6774339_374-289983-0_sp1.1 from training. Duration: 0.97275 2023-10-03 03:18:33,370 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_34886_210_10633_1_1533203882_1755661_76-156096-0_sp0.9 from training. Duration: 0.9666875 2023-10-03 03:18:37,532 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_238-256668-0_sp0.9 from training. Duration: 0.7666875 2023-10-03 03:18:38,812 WARNING [train.py:1204] (2/4) Exclude cut with ID _46683_210_9642_1_1533730913492_4591116_489-146727-0_sp1.1 from training. Duration: 0.97275 2023-10-03 03:18:40,203 WARNING [train.py:1204] (2/4) Exclude cut with ID _13938_210_9988_1_1530518718978_3693960_304-112784-0_sp0.9 from training. Duration: 0.511125 2023-10-03 03:18:43,441 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.4.encoder.layers.0.self_attn_weights.whiten_keys, num_groups=4, num_channels=128, metric=1.64 vs. limit=6.0 2023-10-03 03:18:46,175 WARNING [train.py:1204] (2/4) Exclude cut with ID _24271_210_12658_1_1531987270022_7190519_243-46549-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 03:18:46,180 WARNING [train.py:1204] (2/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_78-182912-0_sp1.1 from training. Duration: 0.536375 2023-10-03 03:18:47,542 INFO [train.py:1046] (2/4) Epoch 32, batch 3200, loss[loss=0.1573, simple_loss=0.2453, pruned_loss=0.03463, over 24593.00 frames. ], tot_loss[loss=0.1618, simple_loss=0.2397, pruned_loss=0.0419, over 4717470.92 frames. ], batch size: 71, lr: 3.18e-03, grad_scale: 16.0 2023-10-03 03:18:50,401 WARNING [train.py:1204] (2/4) Exclude cut with ID _57572_210_5517_1_1534921219624_3809484_121-321334-0_sp1.1 from training. Duration: 0.97275 2023-10-03 03:18:51,803 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_40047_210_2290_1_1533290519_2374311_254-102149-0_sp1.1 from training. Duration: 0.963625 2023-10-03 03:18:51,804 WARNING [train.py:1204] (2/4) Exclude cut with ID _28461_210_2253_1_1532771775189_3041299_96-152158-0 from training. Duration: 0.85 2023-10-03 03:18:55,603 WARNING [train.py:1204] (2/4) Exclude cut with ID _29877_210_6228_1_1532397791106_5276149_161-102979-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 03:18:59,824 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_3787_210_2040_1_1526477544_5478586_603-120324-0_sp0.9 from training. Duration: 0.9 2023-10-03 03:19:03,271 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_41750_210_1960_1_1533520678_3948042_247-213456-0_sp1.1 from training. Duration: 0.97275 2023-10-03 03:19:08,001 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.3.encoder.layers.0.nonlin_attention.whiten1, num_groups=1, num_channels=384, metric=6.83 vs. limit=10.0 2023-10-03 03:19:11,505 WARNING [train.py:1204] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_127-119109-0_sp0.9 from training. Duration: 0.8 2023-10-03 03:19:17,667 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.feed_forward1.hidden_balancer.prob, batch_count=1119306.6666666667, ans=0.125 2023-10-03 03:19:21,513 WARNING [train.py:1204] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_215-202973-0 from training. Duration: 0.88 2023-10-03 03:19:22,876 WARNING [train.py:1204] (2/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_17-320491-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 03:19:24,472 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=1119306.6666666667, ans=0.1 2023-10-03 03:19:26,215 WARNING [train.py:1204] (2/4) Exclude cut with ID _5166_210_2084_1_1527073197160_4087993_310-114452-0 from training. Duration: 0.59 2023-10-03 03:19:27,673 WARNING [train.py:1204] (2/4) Exclude cut with ID _15133_210_6228_1_1530794512074_3080379_29-264458-0_sp0.9 from training. Duration: 0.6444375 2023-10-03 03:19:28,706 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.attention_skip_rate, batch_count=1119306.6666666667, ans=0.0 2023-10-03 03:19:29,842 WARNING [train.py:1204] (2/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_159-281310-0_sp1.1 from training. Duration: 0.863625 2023-10-03 03:19:31,153 WARNING [train.py:1204] (2/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_195-167011-0_sp0.9 from training. Duration: 0.8333125 2023-10-03 03:19:31,231 WARNING [train.py:1204] (2/4) Exclude cut with ID _14087_210_2253_1_1530529224512_3014029_156-257607-0_sp1.1 from training. Duration: 0.7545625 2023-10-03 03:19:31,417 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.balancer2.prob, batch_count=1119373.3333333333, ans=0.125 2023-10-03 03:19:32,193 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.0.layers.0.self_attn1.whiten, num_groups=1, num_channels=192, metric=10.31 vs. limit=22.5 2023-10-03 03:19:35,761 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_2295_210_2087_1_1525500993_2407863_116-238894-0 from training. Duration: 0.7 2023-10-03 03:19:35,917 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.1.feed_forward1.out_proj.dropout_p, batch_count=1119373.3333333333, ans=0.1 2023-10-03 03:19:37,130 WARNING [train.py:1204] (2/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_321-335475-0_sp0.9 from training. Duration: 0.5 2023-10-03 03:19:38,531 WARNING [train.py:1204] (2/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_188-114464-0 from training. Duration: 0.82 2023-10-03 03:19:39,999 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_2592_210_2290_1_1525697025_4882925_157-70512-0 from training. Duration: 0.91 2023-10-03 03:19:42,864 WARNING [train.py:1204] (2/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_138-64210-0_sp1.1 from training. Duration: 0.663625 2023-10-03 03:19:48,363 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_11305_210_1794_1_1529750428_7178148_651-95702-0_sp1.1 from training. Duration: 0.963625 2023-10-03 03:19:48,385 WARNING [train.py:1204] (2/4) Exclude cut with ID _14224_210_4381_1_1530529062037_4264010_379-184223-0_sp0.9 from training. Duration: 0.9444375 2023-10-03 03:19:48,420 WARNING [train.py:1204] (2/4) Exclude cut with ID _8394_210_6702_1_1528590504316_3540189_381-125325-0_sp1.1 from training. Duration: 0.963625 2023-10-03 03:19:50,330 WARNING [train.py:1204] (2/4) Exclude cut with ID IC0622W0021-307659 from training. Duration: 0.896 2023-10-03 03:19:50,332 WARNING [train.py:1204] (2/4) Exclude cut with ID _7624_210_1531_1_1528109802198_4933101_418-349834-0_sp1.1 from training. Duration: 0.5454375 2023-10-03 03:19:53,318 WARNING [train.py:1204] (2/4) Exclude cut with ID _49728_210_12651_1_1534041034294_6788150_607-3416-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 03:19:55,281 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8275_210_4198_1_1528455320_3892571_491-17707-0 from training. Duration: 0.64 2023-10-03 03:19:55,330 WARNING [train.py:1204] (2/4) Exclude cut with ID _5287_210_2084_1_1527418883040_3878670_333-48508-0 from training. Duration: 0.6 2023-10-03 03:19:57,269 WARNING [train.py:1204] (2/4) Exclude cut with ID _8365_210_2064_1_1528592311070_4677690_371-122295-0 from training. Duration: 0.55 2023-10-03 03:19:58,469 WARNING [train.py:1204] (2/4) Exclude cut with ID _14860_210_4915_1_1530702020340_7343240_836-328489-0 from training. Duration: 0.9 2023-10-03 03:19:59,879 WARNING [train.py:1204] (2/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_1033-274376-0_sp0.9 from training. Duration: 0.9666875 2023-10-03 03:20:02,527 INFO [train.py:1046] (2/4) Epoch 32, batch 3250, loss[loss=0.1689, simple_loss=0.2412, pruned_loss=0.0483, over 23567.00 frames. ], tot_loss[loss=0.1618, simple_loss=0.24, pruned_loss=0.04182, over 4717171.16 frames. ], batch size: 256, lr: 3.18e-03, grad_scale: 16.0 2023-10-03 03:20:04,437 WARNING [train.py:1204] (2/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_475-127455-0_sp0.9 from training. Duration: 0.77775 2023-10-03 03:20:04,443 WARNING [train.py:1204] (2/4) Exclude cut with ID IC0671W0071-331914 from training. Duration: 0.896 2023-10-03 03:20:04,479 WARNING [train.py:1204] (2/4) Exclude cut with ID _30327_210_16049_1_1532484161295_3924169_2-1523-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 03:20:04,481 WARNING [train.py:1204] (2/4) Exclude cut with ID _24314_210_4915_1_1532135078116_7170179_460-208279-0_sp1.1 from training. Duration: 0.97275 2023-10-03 03:20:05,812 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.570e+02 1.990e+02 2.302e+02 2.522e+02 3.377e+02, threshold=4.604e+02, percent-clipped=0.0 2023-10-03 03:20:05,970 WARNING [train.py:1204] (2/4) Exclude cut with ID IC0679W0052-335859 from training. Duration: 0.64 2023-10-03 03:20:10,085 WARNING [train.py:1204] (2/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_3-237434-0_sp1.1 from training. Duration: 0.6909375 2023-10-03 03:20:14,219 WARNING [train.py:1204] (2/4) Exclude cut with ID _69884_210_9771_1_1535846392818_4166169_405-185283-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 03:20:17,685 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.4.encoder.layers.2.self_attn_weights.whiten_keys, num_groups=4, num_channels=128, metric=2.16 vs. limit=6.0 2023-10-03 03:20:17,722 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.4.encoder.layers.0.feed_forward1.out_whiten, num_groups=1, num_channels=384, metric=5.58 vs. limit=15.0 2023-10-03 03:20:18,566 WARNING [train.py:1204] (2/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_927-83684-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 03:20:18,575 WARNING [train.py:1204] (2/4) Exclude cut with ID _6694_210_4599_1_1527852637490_3965529_356-169233-0 from training. Duration: 0.99 2023-10-03 03:20:19,933 WARNING [train.py:1204] (2/4) Exclude cut with ID _19896_210_1883_1_1531709022662_6649569_562-233001-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 03:20:19,964 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_20120_210_6753_1_1531643035_5482859_354-78629-0_sp1.1 from training. Duration: 0.963625 2023-10-03 03:20:19,966 WARNING [train.py:1204] (2/4) Exclude cut with ID _60135_210_13427_1_1534935557922_3651319_96-168198-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 03:20:21,828 WARNING [train.py:1204] (2/4) Exclude cut with ID _31602_210_5854_1_1532677021456_4458250_249-96604-0_sp1.1 from training. Duration: 0.8090625 2023-10-03 03:20:21,850 WARNING [train.py:1204] (2/4) Exclude cut with ID _14100_210_8341_1_1530529320613_3678069_258-224599-0_sp0.9 from training. Duration: 0.8555625 2023-10-03 03:20:23,207 WARNING [train.py:1204] (2/4) Exclude cut with ID _19708_210_9206_1_1532136650733_4238364_215-160135-0_sp1.1 from training. Duration: 0.97275 2023-10-03 03:20:23,339 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.conv_skip_rate, batch_count=1119573.3333333333, ans=0.0 2023-10-03 03:20:24,984 WARNING [train.py:1204] (2/4) Exclude cut with ID _21878_210_3525_1_1531738772002_7264330_113-18163-0_sp0.9 from training. Duration: 0.988875 2023-10-03 03:20:25,037 WARNING [train.py:1204] (2/4) Exclude cut with ID _64135_210_2253_1_1535677267876_7608972_552-337340-0_sp1.1 from training. Duration: 0.92725 2023-10-03 03:20:25,060 WARNING [train.py:1204] (2/4) Exclude cut with ID _67725_210_27767_1_1535608756671_5105359_10-12887-0_sp1.1 from training. Duration: 0.97275 2023-10-03 03:20:25,073 WARNING [train.py:1204] (2/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_401-284840-0_sp1.1 from training. Duration: 0.97275 2023-10-03 03:20:26,386 WARNING [train.py:1204] (2/4) Exclude cut with ID _44659_210_2721_1_1534036120061_6614860_483-141285-0_sp1.1 from training. Duration: 0.936375 2023-10-03 03:20:30,805 WARNING [train.py:1204] (2/4) Exclude cut with ID _9461_210_4557_1_1529060514726_3677451_358-332734-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 03:20:30,919 WARNING [train.py:1204] (2/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_788-298907-0_sp1.1 from training. Duration: 0.8090625 2023-10-03 03:20:33,677 WARNING [train.py:1204] (2/4) Exclude cut with ID _39411_210_5986_1_1533538825030_3343649_354-267196-0_sp1.1 from training. Duration: 0.92725 2023-10-03 03:20:33,705 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_14953_210_2151_1_1530701710_5616165_192-309792-0_sp1.1 from training. Duration: 0.97275 2023-10-03 03:20:35,684 WARNING [train.py:1204] (2/4) Exclude cut with ID _59170_210_6721_1_1534983633960_4203335_290-151255-0_sp1.1 from training. Duration: 0.92725 2023-10-03 03:20:37,020 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_5575_210_1410_1_1527301305_2570170_249-93769-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 03:20:37,030 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_46013_210_7234_1_1533776362_3744032_463-52378-0_sp1.1 from training. Duration: 0.7818125 2023-10-03 03:20:41,323 WARNING [train.py:1204] (2/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_580-93798-0 from training. Duration: 0.56 2023-10-03 03:20:42,621 WARNING [train.py:1204] (2/4) Exclude cut with ID _19514_210_9259_1_1531530071330_5084230_385-13762-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 03:20:42,634 WARNING [train.py:1204] (2/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_458-67321-0_sp1.1 from training. Duration: 0.8818125 2023-10-03 03:20:42,846 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.conv_module1.balancer1.prob, batch_count=1119640.0, ans=0.125 2023-10-03 03:20:42,926 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.conv_skip_rate, batch_count=1119640.0, ans=0.0 2023-10-03 03:20:43,974 WARNING [train.py:1204] (2/4) Exclude cut with ID _41298_210_9019_1_1533430371450_4822143_188-154791-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 03:20:44,064 WARNING [train.py:1204] (2/4) Exclude cut with ID _15037_210_2253_1_1530752294104_3267650_296-192597-0_sp0.9 from training. Duration: 0.9 2023-10-03 03:20:48,432 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.conv_module1.balancer2.prob, batch_count=1119706.6666666667, ans=0.125 2023-10-03 03:20:50,852 WARNING [train.py:1204] (2/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_661-264857-0_sp1.1 from training. Duration: 0.8454375 2023-10-03 03:20:54,016 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.bypass_mid.scale_min, batch_count=1119706.6666666667, ans=0.2 2023-10-03 03:20:58,826 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_34008_210_10431_1_1533345424_8183247_662-317331-0_sp1.1 from training. Duration: 0.8545625 2023-10-03 03:20:58,848 WARNING [train.py:1204] (2/4) Exclude cut with ID _46502_210_13794_1_1533724452198_3657159_241-278329-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 03:20:58,848 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_11349_210_6753_1_1529800861_1101207_26-65602-0 from training. Duration: 0.96 2023-10-03 03:20:58,852 WARNING [train.py:1204] (2/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_448-4048-0_sp1.1 from training. Duration: 0.87275 2023-10-03 03:20:58,866 WARNING [train.py:1204] (2/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_662-275621-0_sp1.1 from training. Duration: 0.3818125 2023-10-03 03:20:58,900 WARNING [train.py:1204] (2/4) Exclude cut with ID _13938_210_9988_1_1530518718978_3693960_256-54454-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 03:21:02,059 WARNING [train.py:1204] (2/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_256-32662-0 from training. Duration: 0.9 2023-10-03 03:21:02,098 WARNING [train.py:1204] (2/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_326-17061-0 from training. Duration: 0.94 2023-10-03 03:21:02,125 WARNING [train.py:1204] (2/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_505-97987-0_sp1.1 from training. Duration: 0.7818125 2023-10-03 03:21:03,566 WARNING [train.py:1204] (2/4) Exclude cut with ID _66873_210_5517_1_1535454136506_4293370_316-294364-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 03:21:05,032 WARNING [train.py:1204] (2/4) Exclude cut with ID _19514_210_9259_1_1531530071330_5084230_762-231063-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 03:21:05,085 WARNING [train.py:1204] (2/4) Exclude cut with ID _4697_210_2069_1_1526815801953_3823098_68-219807-0_sp0.9 from training. Duration: 0.52225 2023-10-03 03:21:05,132 WARNING [train.py:1204] (2/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_479-158053-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 03:21:08,623 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=1119773.3333333333, ans=0.1 2023-10-03 03:21:09,786 WARNING [train.py:1204] (2/4) Exclude cut with ID _66112_210_10635_1_1535590236305_3801484_83-38181-0_sp1.1 from training. Duration: 0.936375 2023-10-03 03:21:09,800 WARNING [train.py:1204] (2/4) Exclude cut with ID _55857_210_2346_1_1534644007831_6898209_255-146346-0_sp1.1 from training. Duration: 0.963625 2023-10-03 03:21:11,253 WARNING [train.py:1204] (2/4) Exclude cut with ID _48836_210_12210_1_1534075848293_3904150_429-35475-0 from training. Duration: 0.89 2023-10-03 03:21:11,272 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_50154_210_13731_1_1534750862_1080961_119-187163-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 03:21:14,046 WARNING [train.py:1204] (2/4) Exclude cut with ID _8793_210_6310_1_1528542985139_2310009_121-179782-0_sp0.9 from training. Duration: 0.888875 2023-10-03 03:21:14,049 WARNING [train.py:1204] (2/4) Exclude cut with ID _14884_210_2252_1_1530707459689_5561250_591-173406-0 from training. Duration: 0.96 2023-10-03 03:21:15,526 WARNING [train.py:1204] (2/4) Exclude cut with ID _15133_210_6228_1_1530794512074_3080379_54-198193-0_sp1.1 from training. Duration: 0.8545625 2023-10-03 03:21:15,538 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6545_210_2040_1_1527774670_4245258_407-48070-0 from training. Duration: 0.76 2023-10-03 03:21:16,787 INFO [train.py:1046] (2/4) Epoch 32, batch 3300, loss[loss=0.1719, simple_loss=0.2515, pruned_loss=0.04613, over 23243.00 frames. ], tot_loss[loss=0.1625, simple_loss=0.2408, pruned_loss=0.0421, over 4722999.84 frames. ], batch size: 93, lr: 3.18e-03, grad_scale: 16.0 2023-10-03 03:21:17,023 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.ff2_skip_rate, batch_count=1119840.0, ans=0.0 2023-10-03 03:21:18,192 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1032-236863-0 from training. Duration: 0.84 2023-10-03 03:21:19,632 WARNING [train.py:1204] (2/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_507-303370-0 from training. Duration: 0.96 2023-10-03 03:21:19,660 WARNING [train.py:1204] (2/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_164-282366-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 03:21:22,391 WARNING [train.py:1204] (2/4) Exclude cut with ID _39867_210_9826_1_1533553222030_9344259_312-158727-0_sp1.1 from training. Duration: 0.963625 2023-10-03 03:21:23,738 WARNING [train.py:1204] (2/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_174-205058-0_sp1.1 from training. Duration: 0.863625 2023-10-03 03:21:23,766 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_11305_210_1794_1_1529750428_7178148_166-237358-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 03:21:25,592 WARNING [train.py:1204] (2/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_26-175553-0_sp1.1 from training. Duration: 0.5545625 2023-10-03 03:21:25,610 WARNING [train.py:1204] (2/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_96-290039-0_sp1.1 from training. Duration: 0.5090625 2023-10-03 03:21:29,046 WARNING [train.py:1204] (2/4) Exclude cut with ID _53219_210_4525_1_1534834836008_4064825_261-101691-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 03:21:32,256 WARNING [train.py:1204] (2/4) Exclude cut with ID _24271_210_12658_1_1531987270022_7190519_96-293390-0_sp1.1 from training. Duration: 0.936375 2023-10-03 03:21:32,485 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.conv_skip_rate, batch_count=1119906.6666666667, ans=0.0 2023-10-03 03:21:35,028 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_271-234962-0 from training. Duration: 0.79 2023-10-03 03:21:35,097 WARNING [train.py:1204] (2/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_136-30660-0_sp1.1 from training. Duration: 0.97275 2023-10-03 03:21:36,536 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_20120_210_6753_1_1531643035_5482859_625-281046-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 03:21:38,454 WARNING [train.py:1204] (2/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_333-168193-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 03:21:38,500 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9029W0269-509664 from training. Duration: 0.896 2023-10-03 03:21:39,816 WARNING [train.py:1204] (2/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_450-147393-0_sp1.1 from training. Duration: 0.92725 2023-10-03 03:21:39,879 WARNING [train.py:1204] (2/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_643-52007-0_sp1.1 from training. Duration: 0.5181875 2023-10-03 03:21:41,208 WARNING [train.py:1204] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_330-327861-0_sp0.9 from training. Duration: 0.7444375 2023-10-03 03:21:41,219 WARNING [train.py:1204] (2/4) Exclude cut with ID _47790_210_16187_1_1533862814066_4207519_260-317651-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 03:21:42,486 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9044W0010-516654 from training. Duration: 0.896 2023-10-03 03:21:45,268 WARNING [train.py:1204] (2/4) Exclude cut with ID _14087_210_2253_1_1530529224512_3014029_265-118378-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 03:21:45,273 WARNING [train.py:1204] (2/4) Exclude cut with ID _6194_210_1534_1_1527511826160_3941194_321-148641-0_sp0.9 from training. Duration: 0.711125 2023-10-03 03:21:48,052 WARNING [train.py:1204] (2/4) Exclude cut with ID _54871_210_22356_1_1534665798253_3856359_101-169176-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 03:21:48,055 WARNING [train.py:1204] (2/4) Exclude cut with ID 497-129325-0061-62254-0_sp1.1 from training. Duration: 0.97725 2023-10-03 03:21:49,502 WARNING [train.py:1204] (2/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_795-33252-0_sp1.1 from training. Duration: 0.336375 2023-10-03 03:21:49,522 WARNING [train.py:1204] (2/4) Exclude cut with ID _28317_210_12742_1_1532480302381_8510325_168-246176-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 03:21:50,903 WARNING [train.py:1204] (2/4) Exclude cut with ID _7160_210_1876_1_1527940923231_3703799_120-150943-0_sp1.1 from training. Duration: 0.736375 2023-10-03 03:21:54,770 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9084W0005-536844 from training. Duration: 0.768 2023-10-03 03:21:57,432 WARNING [train.py:1204] (2/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_234-260682-0 from training. Duration: 0.84 2023-10-03 03:21:57,475 WARNING [train.py:1204] (2/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_188-114464-0_sp0.9 from training. Duration: 0.911125 2023-10-03 03:22:00,772 WARNING [train.py:1204] (2/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_81-75849-0 from training. Duration: 0.39 2023-10-03 03:22:01,419 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.3.encoder.layers.1.feed_forward1.out_whiten, num_groups=1, num_channels=512, metric=9.66 vs. limit=15.0 2023-10-03 03:22:03,302 WARNING [train.py:1204] (2/4) Exclude cut with ID _14224_210_4381_1_1530529062037_4264010_379-184223-0_sp1.1 from training. Duration: 0.77275 2023-10-03 03:22:06,027 WARNING [train.py:1204] (2/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_454-123002-0_sp1.1 from training. Duration: 0.62725 2023-10-03 03:22:06,073 WARNING [train.py:1204] (2/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_887-249555-0_sp1.1 from training. Duration: 0.7 2023-10-03 03:22:07,553 WARNING [train.py:1204] (2/4) Exclude cut with ID _31088_210_16446_1_1532520012284_3440919_20-260902-0_sp1.1 from training. Duration: 0.97275 2023-10-03 03:22:07,794 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.bypass.skip_rate, batch_count=1120040.0, ans=0.07 2023-10-03 03:22:08,773 WARNING [train.py:1204] (2/4) Exclude cut with ID _45329_210_7005_1_1534157951211_6774339_856-344574-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 03:22:08,775 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_36756_210_7234_1_1533026991_966276_4-164118-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 03:22:08,790 WARNING [train.py:1204] (2/4) Exclude cut with ID _8365_210_2064_1_1528592311070_4677690_371-122295-0_sp1.1 from training. Duration: 0.5 2023-10-03 03:22:10,182 WARNING [train.py:1204] (2/4) Exclude cut with ID _5910_210_1389_1_1527337972454_5050100_555-159815-0_sp1.1 from training. Duration: 0.8909375 2023-10-03 03:22:11,883 WARNING [train.py:1204] (2/4) Exclude cut with ID _37086_210_9038_1_1532999540891_7021250_469-147941-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 03:22:13,190 WARNING [train.py:1204] (2/4) Exclude cut with ID _3067_210_2667_1_1526212974640_4503168_459-173867-0_sp0.9 from training. Duration: 0.97775 2023-10-03 03:22:14,564 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9156W0006-572499 from training. Duration: 0.896 2023-10-03 03:22:15,921 WARNING [train.py:1204] (2/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_277-210798-0 from training. Duration: 0.55 2023-10-03 03:22:17,418 WARNING [train.py:1204] (2/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_103-336064-0_sp1.1 from training. Duration: 0.463625 2023-10-03 03:22:18,788 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_3414_210_1988_1_1526212718_4813195_331-290945-0_sp1.1 from training. Duration: 0.936375 2023-10-03 03:22:18,789 WARNING [train.py:1204] (2/4) Exclude cut with ID _67499_210_17282_1_1535509805220_3692430_365-29276-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 03:22:20,249 WARNING [train.py:1204] (2/4) Exclude cut with ID _14821_210_8750_1_1530680503991_6829359_69-250847-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 03:22:20,250 WARNING [train.py:1204] (2/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_1102-61301-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 03:22:21,618 WARNING [train.py:1204] (2/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_166-246557-0_sp1.1 from training. Duration: 0.5818125 2023-10-03 03:22:21,665 WARNING [train.py:1204] (2/4) Exclude cut with ID _15351_210_10626_1_1530856635653_3576210_214-129918-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 03:22:21,687 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_120-41668-0_sp0.9 from training. Duration: 0.77775 2023-10-03 03:22:23,075 WARNING [train.py:1204] (2/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_27-72278-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 03:22:23,271 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.conv_skip_rate, batch_count=1120106.6666666667, ans=0.0 2023-10-03 03:22:25,831 WARNING [train.py:1204] (2/4) Exclude cut with ID _24314_210_4915_1_1532135078116_7170179_521-258868-0_sp1.1 from training. Duration: 0.7181875 2023-10-03 03:22:29,903 WARNING [train.py:1204] (2/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_143-86344-0 from training. Duration: 0.77 2023-10-03 03:22:29,931 WARNING [train.py:1204] (2/4) Exclude cut with ID _53489_210_6228_1_1534298357169_3835970_465-157433-0_sp1.1 from training. Duration: 0.92725 2023-10-03 03:22:30,022 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_248-319353-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 03:22:30,643 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.4.encoder.layers.1.conv_module1.whiten, num_groups=1, num_channels=384, metric=3.32 vs. limit=15.0 2023-10-03 03:22:32,025 WARNING [train.py:1204] (2/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_102-77064-0_sp1.1 from training. Duration: 0.8454375 2023-10-03 03:22:33,240 INFO [train.py:1046] (2/4) Epoch 32, batch 3350, loss[loss=0.1614, simple_loss=0.2512, pruned_loss=0.03578, over 24042.00 frames. ], tot_loss[loss=0.1636, simple_loss=0.2418, pruned_loss=0.0427, over 4719819.06 frames. ], batch size: 80, lr: 3.18e-03, grad_scale: 16.0 2023-10-03 03:22:33,289 WARNING [train.py:1204] (2/4) Exclude cut with ID _9389_210_1664_1_1529217337301_5289269_379-128794-0_sp1.1 from training. Duration: 0.77275 2023-10-03 03:22:34,645 WARNING [train.py:1204] (2/4) Exclude cut with ID _3231_210_2106_1_1526205864934_6784220_236-260835-0_sp1.1 from training. Duration: 0.97275 2023-10-03 03:22:36,515 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.570e+02 1.816e+02 1.964e+02 2.229e+02 3.119e+02, threshold=3.928e+02, percent-clipped=0.0 2023-10-03 03:22:36,634 WARNING [train.py:1204] (2/4) Exclude cut with ID _24271_210_12658_1_1531987270022_7190519_184-342363-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 03:22:36,635 WARNING [train.py:1204] (2/4) Exclude cut with ID _27894_210_12392_1_1532415496078_6902439_67-221663-0_sp1.1 from training. Duration: 0.92725 2023-10-03 03:22:39,340 WARNING [train.py:1204] (2/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_304-59227-0_sp1.1 from training. Duration: 0.7 2023-10-03 03:22:40,724 WARNING [train.py:1204] (2/4) Exclude cut with ID _58605_210_12210_1_1534723195939_3904259_458-42232-0_sp1.1 from training. Duration: 0.92725 2023-10-03 03:22:43,433 WARNING [train.py:1204] (2/4) Exclude cut with ID _14922_210_6228_1_1530707435273_3693310_13-165512-0_sp0.9 from training. Duration: 0.911125 2023-10-03 03:22:46,137 WARNING [train.py:1204] (2/4) Exclude cut with ID _58602_210_2346_1_1534903210085_6908830_792-102241-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 03:22:48,200 WARNING [train.py:1204] (2/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_588-125165-0_sp0.9 from training. Duration: 0.92225 2023-10-03 03:22:48,304 WARNING [train.py:1204] (2/4) Exclude cut with ID _54218_210_12210_1_1534413646388_4409259_239-98429-0_sp1.1 from training. Duration: 0.97275 2023-10-03 03:22:49,706 WARNING [train.py:1204] (2/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_538-32291-0_sp1.1 from training. Duration: 0.7545625 2023-10-03 03:22:49,795 WARNING [train.py:1204] (2/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_419-317062-0 from training. Duration: 0.91 2023-10-03 03:22:51,256 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9309W0202-640329 from training. Duration: 0.896 2023-10-03 03:22:52,578 WARNING [train.py:1204] (2/4) Exclude cut with ID _3231_210_2106_1_1526205864934_6784220_419-313291-0_sp1.1 from training. Duration: 0.97275 2023-10-03 03:22:52,887 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.feed_forward1.out_proj.dropout_p, batch_count=1120240.0, ans=0.1 2023-10-03 03:22:54,142 WARNING [train.py:1204] (2/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_619-344499-0 from training. Duration: 0.63 2023-10-03 03:22:54,152 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_278-272612-0 from training. Duration: 0.73 2023-10-03 03:22:55,548 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_103-84010-0_sp0.9 from training. Duration: 0.7666875 2023-10-03 03:22:56,807 WARNING [train.py:1204] (2/4) Exclude cut with ID _26528_210_9070_1_1532570410842_3851310_497-97218-0_sp1.1 from training. Duration: 0.7818125 2023-10-03 03:22:56,894 WARNING [train.py:1204] (2/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_262-84613-0_sp1.1 from training. Duration: 0.936375 2023-10-03 03:22:58,280 WARNING [train.py:1204] (2/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_387-131538-0 from training. Duration: 0.81 2023-10-03 03:22:58,317 WARNING [train.py:1204] (2/4) Exclude cut with ID _8030_210_7497_1_1528444886781_3531089_224-300489-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 03:22:58,336 WARNING [train.py:1204] (2/4) Exclude cut with ID _14349_210_8341_1_1530615710818_3442470_170-336541-0_sp0.9 from training. Duration: 0.9555625 2023-10-03 03:22:59,860 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.nonlin_attention.balancer.prob, batch_count=1120240.0, ans=0.125 2023-10-03 03:23:01,029 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_47069_210_15368_1_1533781267_4431320_180-31746-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 03:23:02,976 WARNING [train.py:1204] (2/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_447-7041-0_sp1.1 from training. Duration: 0.92725 2023-10-03 03:23:04,271 WARNING [train.py:1204] (2/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_2-55283-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 03:23:04,331 WARNING [train.py:1204] (2/4) Exclude cut with ID _12555_210_8750_1_1530075594402_3780239_70-181911-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 03:23:07,778 WARNING [train.py:1204] (2/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_311-328022-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 03:23:09,214 WARNING [train.py:1204] (2/4) Exclude cut with ID _14528_210_8254_1_1530669700514_4306939_330-2987-0_sp1.1 from training. Duration: 0.92725 2023-10-03 03:23:10,578 WARNING [train.py:1204] (2/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_385-184858-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 03:23:10,629 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder_embed.conv.5.prob, batch_count=1120306.6666666667, ans=0.125 2023-10-03 03:23:14,674 WARNING [train.py:1204] (2/4) Exclude cut with ID _64414_210_17301_1_1535243252048_3757759_348-333360-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 03:23:16,095 WARNING [train.py:1204] (2/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_753-214751-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 03:23:17,553 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_17613_210_2151_1_1531137569_2366160_202-208013-0_sp1.1 from training. Duration: 0.92725 2023-10-03 03:23:17,562 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_43831_210_2158_1_1533725502_5872791_610-129300-0_sp1.1 from training. Duration: 0.936375 2023-10-03 03:23:20,750 WARNING [train.py:1204] (2/4) Exclude cut with ID _54218_210_12210_1_1534413646388_4409259_321-23852-0_sp1.1 from training. Duration: 0.936375 2023-10-03 03:23:20,891 WARNING [train.py:1204] (2/4) Exclude cut with ID _39411_210_5986_1_1533538825030_3343649_173-238248-0 from training. Duration: 0.83 2023-10-03 03:23:20,898 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_405-339986-0_sp0.9 from training. Duration: 0.5666875 2023-10-03 03:23:20,930 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_2009_210_2071_1_1525481134_4588937_385-302100-0 from training. Duration: 0.87 2023-10-03 03:23:20,961 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_161-276996-0_sp1.1 from training. Duration: 0.863625 2023-10-03 03:23:23,797 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_177-58005-0 from training. Duration: 0.62 2023-10-03 03:23:23,883 WARNING [train.py:1204] (2/4) Exclude cut with ID _64639_210_5816_1_1535367619437_3528000_146-343734-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 03:23:25,249 WARNING [train.py:1204] (2/4) Exclude cut with ID _3845_210_1390_1_1526731326141_3523610_72-61441-0_sp1.1 from training. Duration: 0.92725 2023-10-03 03:23:32,780 WARNING [train.py:1204] (2/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_991-125028-0_sp1.1 from training. Duration: 0.936375 2023-10-03 03:23:32,845 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7544_210_6753_1_1528591327_1471426_172-348777-0 from training. Duration: 0.66 2023-10-03 03:23:32,881 WARNING [train.py:1204] (2/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_416-257730-0_sp1.1 from training. Duration: 0.5909375 2023-10-03 03:23:34,630 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1113-7369-0_sp1.1 from training. Duration: 0.77275 2023-10-03 03:23:36,517 WARNING [train.py:1204] (2/4) Exclude cut with ID _40402_210_18219_1_1533522324115_2953150_242-76943-0_sp1.1 from training. Duration: 0.8545625 2023-10-03 03:23:41,802 WARNING [train.py:1204] (2/4) Exclude cut with ID _44718_210_5816_1_1534050051869_6469030_190-49648-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 03:23:43,247 WARNING [train.py:1204] (2/4) Exclude cut with ID _47251_210_18628_1_1533814063128_4137250_214-88076-0 from training. Duration: 0.88 2023-10-03 03:23:44,542 WARNING [train.py:1204] (2/4) Exclude cut with ID _5585_210_2667_1_1527418813324_8007340_89-332072-0_sp1.1 from training. Duration: 0.5818125 2023-10-03 03:23:44,574 WARNING [train.py:1204] (2/4) Exclude cut with ID _41817_210_18835_1_1533434477160_7423049_555-293448-0_sp1.1 from training. Duration: 0.72725 2023-10-03 03:23:47,645 INFO [train.py:1046] (2/4) Epoch 32, batch 3400, loss[loss=0.1665, simple_loss=0.2516, pruned_loss=0.04064, over 24117.00 frames. ], tot_loss[loss=0.164, simple_loss=0.2422, pruned_loss=0.04289, over 4715308.51 frames. ], batch size: 80, lr: 3.18e-03, grad_scale: 16.0 2023-10-03 03:23:47,705 WARNING [train.py:1204] (2/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_286-331314-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 03:23:47,748 WARNING [train.py:1204] (2/4) Exclude cut with ID _13921_210_9994_1_1530838702800_3003429_46-149444-0 from training. Duration: 0.94 2023-10-03 03:23:47,815 WARNING [train.py:1204] (2/4) Exclude cut with ID _44778_210_2054_1_1533798127203_7443090_768-43595-0_sp1.1 from training. Duration: 0.936375 2023-10-03 03:23:47,831 WARNING [train.py:1204] (2/4) Exclude cut with ID _35355_210_16040_1_1533212988977_4016229_74-190076-0 from training. Duration: 0.8 2023-10-03 03:23:49,270 WARNING [train.py:1204] (2/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_1007-316380-0_sp1.1 from training. Duration: 0.963625 2023-10-03 03:23:50,634 WARNING [train.py:1204] (2/4) Exclude cut with ID _23557_210_8750_1_1531890106370_6846255_150-283437-0_sp1.1 from training. Duration: 0.963625 2023-10-03 03:23:51,973 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_3474_210_2028_1_1526188513_4825246_145-309131-0_sp1.1 from training. Duration: 0.67275 2023-10-03 03:23:52,070 WARNING [train.py:1204] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_391-22148-0_sp1.1 from training. Duration: 0.7 2023-10-03 03:23:52,086 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_238-256668-0 from training. Duration: 0.69 2023-10-03 03:23:58,864 WARNING [train.py:1204] (2/4) Exclude cut with ID _8394_210_6702_1_1528590504316_3540189_372-198878-0 from training. Duration: 0.6 2023-10-03 03:23:58,875 WARNING [train.py:1204] (2/4) Exclude cut with ID ID0174W0006-766659 from training. Duration: 0.953 2023-10-03 03:23:58,900 WARNING [train.py:1204] (2/4) Exclude cut with ID _6274_210_2084_1_1527937583837_7494020_668-23019-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 03:24:01,823 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=1120573.3333333333, ans=0.1 2023-10-03 03:24:02,967 WARNING [train.py:1204] (2/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_322-74328-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 03:24:02,974 WARNING [train.py:1204] (2/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_872-264525-0_sp1.1 from training. Duration: 0.5909375 2023-10-03 03:24:03,028 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_53378_210_17282_1_1534221071_5769509_641-286201-0_sp1.1 from training. Duration: 0.97275 2023-10-03 03:24:04,406 WARNING [train.py:1204] (2/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_199-295611-0_sp1.1 from training. Duration: 0.5 2023-10-03 03:24:09,051 WARNING [train.py:1204] (2/4) Exclude cut with ID _5250_210_5219_1_1527249561806_8692110_1270-317971-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 03:24:10,474 WARNING [train.py:1204] (2/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_784-213099-0 from training. Duration: 0.93 2023-10-03 03:24:10,676 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.feed_forward1.hidden_balancer.prob, batch_count=1120573.3333333333, ans=0.125 2023-10-03 03:24:11,986 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.conv_module1.balancer2.prob, batch_count=1120573.3333333333, ans=0.125 2023-10-03 03:24:14,615 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_115-233929-0_sp0.9 from training. Duration: 0.8 2023-10-03 03:24:16,083 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.nonlin_attention.balancer.prob, batch_count=1120640.0, ans=0.125 2023-10-03 03:24:19,040 WARNING [train.py:1204] (2/4) Exclude cut with ID _54871_210_22356_1_1534665798253_3856359_256-223809-0_sp1.1 from training. Duration: 0.97275 2023-10-03 03:24:19,096 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_20120_210_6753_1_1531643035_5482859_285-87781-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 03:24:20,425 WARNING [train.py:1204] (2/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_619-344499-0_sp1.1 from training. Duration: 0.57275 2023-10-03 03:24:27,249 WARNING [train.py:1204] (2/4) Exclude cut with ID _60229_210_27767_1_1534838406158_3655350_264-279573-0_sp0.9 from training. Duration: 0.92225 2023-10-03 03:24:31,319 WARNING [train.py:1204] (2/4) Exclude cut with ID _7719_210_3525_1_1528456153163_4296960_333-110399-0 from training. Duration: 0.96 2023-10-03 03:24:35,690 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_50423_210_13270_1_1534128598_5021734_759-90833-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 03:24:37,048 WARNING [train.py:1204] (2/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_151-254798-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 03:24:37,080 WARNING [train.py:1204] (2/4) Exclude cut with ID _3465_210_3525_1_1526175667060_3574049_450-203637-0 from training. Duration: 0.92 2023-10-03 03:24:37,105 WARNING [train.py:1204] (2/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_673-36456-0_sp1.1 from training. Duration: 0.936375 2023-10-03 03:24:39,105 WARNING [train.py:1204] (2/4) Exclude cut with ID _36538_210_9994_1_1533204020363_3795620_131-8685-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 03:24:39,156 WARNING [train.py:1204] (2/4) Exclude cut with ID _12556_210_8341_1_1530081024100_7327690_713-55626-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 03:24:39,217 WARNING [train.py:1204] (2/4) Exclude cut with ID _2322_210_2667_1_1525604548971_3656299_440-80494-0_sp0.9 from training. Duration: 0.97775 2023-10-03 03:24:42,710 WARNING [train.py:1204] (2/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_210-325081-0_sp1.1 from training. Duration: 0.97275 2023-10-03 03:24:45,562 WARNING [train.py:1204] (2/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_375-244976-0_sp0.9 from training. Duration: 0.8333125 2023-10-03 03:24:45,568 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_15517_210_6753_1_1530952293_1664795_119-31001-0_sp1.1 from training. Duration: 0.8545625 2023-10-03 03:24:50,423 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_41452_210_7291_1_1533437687_3941281_319-42326-0_sp1.1 from training. Duration: 0.92725 2023-10-03 03:24:51,857 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_372-173012-0 from training. Duration: 0.95 2023-10-03 03:24:57,320 WARNING [train.py:1204] (2/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_41-31157-0_sp1.1 from training. Duration: 0.5181875 2023-10-03 03:24:57,509 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.self_attn_weights.pos_emb_skip_rate, batch_count=1120773.3333333333, ans=0.0 2023-10-03 03:24:58,021 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.3.encoder.layers.1.feed_forward2.out_whiten, num_groups=1, num_channels=512, metric=5.37 vs. limit=15.0 2023-10-03 03:24:59,005 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.nonlin_attention.balancer.prob, batch_count=1120773.3333333333, ans=0.125 2023-10-03 03:25:00,219 WARNING [train.py:1204] (2/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_91-212486-0 from training. Duration: 0.68 2023-10-03 03:25:01,503 INFO [train.py:1046] (2/4) Epoch 32, batch 3450, loss[loss=0.1825, simple_loss=0.248, pruned_loss=0.05851, over 23555.00 frames. ], tot_loss[loss=0.1631, simple_loss=0.2413, pruned_loss=0.04247, over 4711942.11 frames. ], batch size: 134, lr: 3.18e-03, grad_scale: 8.0 2023-10-03 03:25:03,062 WARNING [train.py:1204] (2/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_1267-272693-0 from training. Duration: 0.73 2023-10-03 03:25:03,098 WARNING [train.py:1204] (2/4) Exclude cut with ID _13938_210_9988_1_1530518718978_3693960_152-67046-0_sp1.1 from training. Duration: 0.936375 2023-10-03 03:25:04,472 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_4528_210_3273_1_1527076654_3276777_342-168068-0_sp1.1 from training. Duration: 0.7909375 2023-10-03 03:25:04,480 WARNING [train.py:1204] (2/4) Exclude cut with ID _7606_210_4915_1_1528894253277_5080690_127-177761-0 from training. Duration: 0.64 2023-10-03 03:25:04,543 WARNING [train.py:1204] (2/4) Exclude cut with ID _45329_210_7005_1_1534157951211_6774339_335-137972-0_sp1.1 from training. Duration: 0.92725 2023-10-03 03:25:06,245 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.512e+02 1.880e+02 2.016e+02 2.211e+02 2.960e+02, threshold=4.032e+02, percent-clipped=0.0 2023-10-03 03:25:07,932 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.feed_forward1.out_proj.dropout_p, batch_count=1120840.0, ans=0.1 2023-10-03 03:25:09,427 WARNING [train.py:1204] (2/4) Exclude cut with ID _5166_210_2084_1_1527073197160_4087993_331-117829-0_sp1.1 from training. Duration: 0.563625 2023-10-03 03:25:14,649 WARNING [train.py:1204] (2/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_125-125818-0_sp1.1 from training. Duration: 0.87275 2023-10-03 03:25:16,020 WARNING [train.py:1204] (2/4) Exclude cut with ID _6789_210_1534_1_1527854471671_2870519_48-294897-0_sp1.1 from training. Duration: 0.963625 2023-10-03 03:25:16,192 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.bypass.skip_rate, batch_count=1120906.6666666667, ans=0.07 2023-10-03 03:25:17,371 WARNING [train.py:1204] (2/4) Exclude cut with ID _6694_210_4599_1_1527852637490_3965529_116-109398-0_sp1.1 from training. Duration: 0.663625 2023-10-03 03:25:17,380 WARNING [train.py:1204] (2/4) Exclude cut with ID _52892_210_6721_1_1534378941259_4124129_222-172685-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 03:25:20,147 WARNING [train.py:1204] (2/4) Exclude cut with ID _50655_210_18628_1_1534229942193_3772154_320-132275-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 03:25:26,933 WARNING [train.py:1204] (2/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_76-311963-0 from training. Duration: 0.7 2023-10-03 03:25:31,198 WARNING [train.py:1204] (2/4) Exclude cut with ID _6274_210_2084_1_1527937583837_7494020_32-205288-0 from training. Duration: 0.78 2023-10-03 03:25:31,218 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_86-16881-0_sp0.9 from training. Duration: 0.6444375 2023-10-03 03:25:32,568 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_271-297505-0_sp1.1 from training. Duration: 0.9 2023-10-03 03:25:32,674 WARNING [train.py:1204] (2/4) Exclude cut with ID _57572_210_5517_1_1534921219624_3809484_187-295293-0_sp1.1 from training. Duration: 0.963625 2023-10-03 03:25:38,518 WARNING [train.py:1204] (2/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_563-329943-0 from training. Duration: 0.99 2023-10-03 03:25:39,888 WARNING [train.py:1204] (2/4) Exclude cut with ID _7160_210_1876_1_1527940923231_3703799_302-150728-0_sp0.9 from training. Duration: 0.8666875 2023-10-03 03:25:43,206 WARNING [train.py:1204] (2/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_926-7526-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 03:25:43,236 WARNING [train.py:1204] (2/4) Exclude cut with ID _62574_210_2655_1_1535180407058_3933249_467-22348-0_sp1.1 from training. Duration: 0.936375 2023-10-03 03:25:45,188 WARNING [train.py:1204] (2/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_280-161633-0_sp0.9 from training. Duration: 0.811125 2023-10-03 03:25:46,624 WARNING [train.py:1204] (2/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_385-193052-0_sp1.1 from training. Duration: 0.7818125 2023-10-03 03:25:48,035 WARNING [train.py:1204] (2/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_444-28041-0 from training. Duration: 0.85 2023-10-03 03:25:48,043 WARNING [train.py:1204] (2/4) Exclude cut with ID _66070_210_7640_1_1535441393096_7805310_341-249237-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 03:25:50,654 WARNING [train.py:1204] (2/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_425-269826-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 03:25:52,179 WARNING [train.py:1204] (2/4) Exclude cut with ID _53950_210_5816_1_1534417264919_3862951_124-185597-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 03:25:54,908 WARNING [train.py:1204] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_380-204756-0 from training. Duration: 0.86 2023-10-03 03:25:57,610 WARNING [train.py:1204] (2/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_282-257791-0_sp0.9 from training. Duration: 0.9555625 2023-10-03 03:25:57,846 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.bypass_mid.scale_min, batch_count=1121040.0, ans=0.2 2023-10-03 03:26:02,940 WARNING [train.py:1204] (2/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_372-131163-0_sp1.1 from training. Duration: 0.8545625 2023-10-03 03:26:04,783 WARNING [train.py:1204] (2/4) Exclude cut with ID _28317_210_12742_1_1532480302381_8510325_447-233513-0_sp1.1 from training. Duration: 0.963625 2023-10-03 03:26:07,590 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_54-118333-0_sp1.1 from training. Duration: 0.97275 2023-10-03 03:26:11,798 WARNING [train.py:1204] (2/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_388-230239-0_sp1.1 from training. Duration: 0.963625 2023-10-03 03:26:11,810 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_40047_210_2290_1_1533290519_2374311_141-54639-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 03:26:11,883 WARNING [train.py:1204] (2/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_91-267696-0_sp0.9 from training. Duration: 0.9666875 2023-10-03 03:26:11,913 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_14056_210_4163_1_1530604483_7572827_532-137847-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 03:26:12,742 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.attention_skip_rate, batch_count=1121106.6666666667, ans=0.0 2023-10-03 03:26:15,034 INFO [train.py:1046] (2/4) Epoch 32, batch 3500, loss[loss=0.1542, simple_loss=0.2183, pruned_loss=0.04504, over 23736.00 frames. ], tot_loss[loss=0.1623, simple_loss=0.2402, pruned_loss=0.04216, over 4717518.42 frames. ], batch size: 232, lr: 3.18e-03, grad_scale: 8.0 2023-10-03 03:26:19,005 WARNING [train.py:1204] (2/4) Exclude cut with ID _8064_210_5399_1_1528372299291_4282973_155-283231-0_sp1.1 from training. Duration: 0.97275 2023-10-03 03:26:21,806 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_2245_210_2715_1_1525511548_3540254_310-52483-0_sp1.1 from training. Duration: 0.8 2023-10-03 03:26:21,873 WARNING [train.py:1204] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_185-174805-0 from training. Duration: 0.68 2023-10-03 03:26:23,232 WARNING [train.py:1204] (2/4) Exclude cut with ID _15012_210_1968_1_1530856820429_5095350_662-210469-0_sp1.1 from training. Duration: 0.5181875 2023-10-03 03:26:26,140 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_426-313557-0_sp0.9 from training. Duration: 0.588875 2023-10-03 03:26:28,794 WARNING [train.py:1204] (2/4) Exclude cut with ID _42326_210_7770_1_1533814282531_3707269_407-204343-0_sp1.1 from training. Duration: 0.97275 2023-10-03 03:26:28,804 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_258-310570-0 from training. Duration: 0.82 2023-10-03 03:26:33,059 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_4737_210_2040_1_1526998689_3293008_378-178650-0_sp1.1 from training. Duration: 0.863625 2023-10-03 03:26:33,234 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.self_attn_weights.pos_emb_skip_rate, batch_count=1121240.0, ans=0.0 2023-10-03 03:26:34,970 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_47896_210_20508_1_1534492518_5958868_432-327451-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 03:26:36,399 WARNING [train.py:1204] (2/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_188-114464-0_sp1.1 from training. Duration: 0.7454375 2023-10-03 03:26:36,406 WARNING [train.py:1204] (2/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_798-106179-0_sp1.1 from training. Duration: 0.7545625 2023-10-03 03:26:37,757 WARNING [train.py:1204] (2/4) Exclude cut with ID _14368_210_4220_1_1530532714870_3595529_143-117421-0_sp1.1 from training. Duration: 0.52725 2023-10-03 03:26:37,801 WARNING [train.py:1204] (2/4) Exclude cut with ID _67499_210_17282_1_1535509805220_3692430_361-78786-0_sp1.1 from training. Duration: 0.963625 2023-10-03 03:26:38,632 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.1.encoder.layers.1.feed_forward1.out_whiten, num_groups=1, num_channels=256, metric=6.06 vs. limit=15.0 2023-10-03 03:26:39,139 WARNING [train.py:1204] (2/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_712-342563-0_sp1.1 from training. Duration: 0.8818125 2023-10-03 03:26:39,188 WARNING [train.py:1204] (2/4) Exclude cut with ID _9235_210_5219_1_1528891154253_8046120_387-159008-0 from training. Duration: 0.86 2023-10-03 03:26:40,463 WARNING [train.py:1204] (2/4) Exclude cut with ID _33640_210_2655_1_1533117605479_4855469_959-18820-0_sp1.1 from training. Duration: 0.963625 2023-10-03 03:26:40,491 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_12868_210_6753_1_1530702479_3610564_133-564-0_sp0.9 from training. Duration: 0.87775 2023-10-03 03:26:42,480 WARNING [train.py:1204] (2/4) Exclude cut with ID _23514_210_1389_1_1531832139785_3031439_12-29287-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 03:26:45,812 WARNING [train.py:1204] (2/4) Exclude cut with ID _23514_210_1389_1_1531832139785_3031439_489-185052-0_sp1.1 from training. Duration: 0.963625 2023-10-03 03:26:47,086 WARNING [train.py:1204] (2/4) Exclude cut with ID _5444_210_2151_1_1527418808464_5312210_634-194174-0 from training. Duration: 0.97 2023-10-03 03:26:47,104 WARNING [train.py:1204] (2/4) Exclude cut with ID _6275_210_2084_1_1528110062213_7669049_91-26919-0_sp1.1 from training. Duration: 0.8818125 2023-10-03 03:26:50,323 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_37351_210_7925_1_1533012931_3812113_516-82303-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 03:26:51,680 WARNING [train.py:1204] (2/4) Exclude cut with ID _41314_210_12651_1_1533459476543_3766460_80-74184-0_sp0.9 from training. Duration: 0.92225 2023-10-03 03:26:51,890 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.conv_module1.balancer2.min_abs, batch_count=1121306.6666666667, ans=0.5 2023-10-03 03:26:52,993 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_9460_210_4211_1_1528954587_10049667_495-202963-0_sp1.1 from training. Duration: 0.963625 2023-10-03 03:26:54,377 WARNING [train.py:1204] (2/4) Exclude cut with ID _14632_210_9988_1_1530691279392_3923200_332-105904-0_sp1.1 from training. Duration: 0.8181875 2023-10-03 03:26:54,392 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_40764_210_1925_1_1533882359_3752345_493-167886-0_sp1.1 from training. Duration: 0.92725 2023-10-03 03:26:55,773 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_196-289637-0 from training. Duration: 0.75 2023-10-03 03:26:55,851 WARNING [train.py:1204] (2/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_26-175553-0 from training. Duration: 0.61 2023-10-03 03:26:57,162 WARNING [train.py:1204] (2/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_890-5117-0 from training. Duration: 0.78 2023-10-03 03:26:57,220 WARNING [train.py:1204] (2/4) Exclude cut with ID _41314_210_12651_1_1533459476543_3766460_206-306860-0_sp1.1 from training. Duration: 0.92725 2023-10-03 03:26:58,567 WARNING [train.py:1204] (2/4) Exclude cut with ID _25882_210_3036_1_1532775665815_6479510_570-79824-0_sp1.1 from training. Duration: 0.963625 2023-10-03 03:26:59,988 WARNING [train.py:1204] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_731-2428-0_sp1.1 from training. Duration: 0.7545625 2023-10-03 03:27:00,015 WARNING [train.py:1204] (2/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_138-154812-0_sp0.9 from training. Duration: 0.8666875 2023-10-03 03:27:03,276 WARNING [train.py:1204] (2/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_178-39722-0_sp0.9 from training. Duration: 0.5444375 2023-10-03 03:27:04,609 WARNING [train.py:1204] (2/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_1285-256622-0_sp0.9 from training. Duration: 0.9333125 2023-10-03 03:27:08,902 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_41428_210_2161_1_1533774184_3992532_65-271323-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 03:27:08,983 WARNING [train.py:1204] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_308-161067-0 from training. Duration: 0.66 2023-10-03 03:27:08,988 WARNING [train.py:1204] (2/4) Exclude cut with ID _3067_210_2667_1_1526212974640_4503168_459-173867-0 from training. Duration: 0.88 2023-10-03 03:27:08,989 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_2221_210_1988_1_1525597051_4935541_415-296882-0_sp1.1 from training. Duration: 0.7 2023-10-03 03:27:12,305 WARNING [train.py:1204] (2/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_354-192266-0_sp1.1 from training. Duration: 0.936375 2023-10-03 03:27:12,387 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_258-310570-0_sp0.9 from training. Duration: 0.911125 2023-10-03 03:27:15,491 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_548-141039-0_sp1.1 from training. Duration: 0.963625 2023-10-03 03:27:16,876 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_15101_210_7927_1_1530786415_5033274_320-327527-0 from training. Duration: 0.96 2023-10-03 03:27:16,929 WARNING [train.py:1204] (2/4) Exclude cut with ID _2895_210_2477_1_1525956785794_4063750_289-11475-0_sp0.9 from training. Duration: 0.911125 2023-10-03 03:27:19,655 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7543_210_6753_1_1528543318_4552_1-85072-0_sp1.1 from training. Duration: 0.7545625 2023-10-03 03:27:19,740 WARNING [train.py:1204] (2/4) Exclude cut with ID _64639_210_5816_1_1535367619437_3528000_461-329156-0 from training. Duration: 0.96 2023-10-03 03:27:22,443 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_211-339622-0 from training. Duration: 0.45 2023-10-03 03:27:23,872 WARNING [train.py:1204] (2/4) Exclude cut with ID _20455_210_5219_1_1531565996775_8098100_402-112739-0_sp1.1 from training. Duration: 0.963625 2023-10-03 03:27:24,074 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.feed_forward2.hidden_balancer.prob, batch_count=1121440.0, ans=0.125 2023-10-03 03:27:25,270 WARNING [train.py:1204] (2/4) Exclude cut with ID _53489_210_6228_1_1534298357169_3835970_256-115565-0_sp1.1 from training. Duration: 0.936375 2023-10-03 03:27:25,295 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_26326_210_7925_1_1533002493_3592406_360-305260-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 03:27:25,325 WARNING [train.py:1204] (2/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_18-204383-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 03:27:27,951 INFO [train.py:1046] (2/4) Epoch 32, batch 3550, loss[loss=0.1664, simple_loss=0.2367, pruned_loss=0.04802, over 23823.00 frames. ], tot_loss[loss=0.161, simple_loss=0.2388, pruned_loss=0.04154, over 4721541.58 frames. ], batch size: 179, lr: 3.18e-03, grad_scale: 8.0 2023-10-03 03:27:29,340 WARNING [train.py:1204] (2/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_306-232480-0_sp1.1 from training. Duration: 0.87275 2023-10-03 03:27:32,573 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.544e+02 1.881e+02 2.109e+02 2.515e+02 3.801e+02, threshold=4.218e+02, percent-clipped=0.0 2023-10-03 03:27:37,025 WARNING [train.py:1204] (2/4) Exclude cut with ID _41161_210_20034_1_1533431249657_3555314_116-299186-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 03:27:38,361 WARNING [train.py:1204] (2/4) Exclude cut with ID _13913_210_9994_1_1530579615187_3383897_473-277682-0_sp1.1 from training. Duration: 0.3545625 2023-10-03 03:27:44,150 WARNING [train.py:1204] (2/4) Exclude cut with ID _58895_210_8750_1_1535184081320_3649089_49-225702-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 03:27:44,225 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_3080_210_2071_1_1526086102_4365166_312-8726-0_sp1.1 from training. Duration: 0.82725 2023-10-03 03:27:46,658 WARNING [train.py:1204] (2/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_489-190175-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 03:27:46,733 WARNING [train.py:1204] (2/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_345-95717-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 03:27:46,743 WARNING [train.py:1204] (2/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_85-138257-0_sp1.1 from training. Duration: 0.6454375 2023-10-03 03:27:49,636 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7046_210_6753_1_1527895337_6920917_783-278109-0_sp1.1 from training. Duration: 0.7 2023-10-03 03:27:49,667 WARNING [train.py:1204] (2/4) Exclude cut with ID _3465_210_3525_1_1526175667060_3574049_450-203637-0_sp1.1 from training. Duration: 0.836375 2023-10-03 03:27:50,945 WARNING [train.py:1204] (2/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_416-132893-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 03:27:50,967 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_2655_210_2087_1_1525688959_3897400_437-337690-0_sp0.9 from training. Duration: 0.7 2023-10-03 03:27:52,351 WARNING [train.py:1204] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_700-183549-0_sp1.1 from training. Duration: 0.6181875 2023-10-03 03:27:56,465 WARNING [train.py:1204] (2/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_191-59784-0_sp1.1 from training. Duration: 0.8 2023-10-03 03:27:56,477 WARNING [train.py:1204] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_218-295097-0_sp1.1 from training. Duration: 0.7 2023-10-03 03:27:56,672 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.nonlin_attention.balancer.prob, batch_count=1121640.0, ans=0.125 2023-10-03 03:27:57,773 WARNING [train.py:1204] (2/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_310-109461-0_sp1.1 from training. Duration: 0.72725 2023-10-03 03:27:57,778 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_33348_210_5632_1_1533086912_3324030_131-321149-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 03:27:59,105 WARNING [train.py:1204] (2/4) Exclude cut with ID _64135_210_2253_1_1535677267876_7608972_133-188266-0_sp1.1 from training. Duration: 0.736375 2023-10-03 03:27:59,133 WARNING [train.py:1204] (2/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_505-205270-0 from training. Duration: 0.38 2023-10-03 03:27:59,143 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_67223_210_4238_1_1535612120_2537431_102-88248-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 03:28:00,978 WARNING [train.py:1204] (2/4) Exclude cut with ID _54871_210_22356_1_1534665798253_3856359_60-235433-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 03:28:01,056 WARNING [train.py:1204] (2/4) Exclude cut with ID _5189_210_4557_1_1527248319428_3769229_199-53526-0_sp1.1 from training. Duration: 0.3818125 2023-10-03 03:28:05,231 WARNING [train.py:1204] (2/4) Exclude cut with ID _45329_210_7005_1_1534157951211_6774339_439-90979-0_sp1.1 from training. Duration: 0.963625 2023-10-03 03:28:06,535 WARNING [train.py:1204] (2/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_1162-132880-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 03:28:06,617 WARNING [train.py:1204] (2/4) Exclude cut with ID _56423_210_5854_1_1534896061049_3165969_220-22317-0_sp1.1 from training. Duration: 0.963625 2023-10-03 03:28:09,383 WARNING [train.py:1204] (2/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_909-64151-0 from training. Duration: 0.94 2023-10-03 03:28:09,582 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.conv_module2.balancer1.prob, batch_count=1121640.0, ans=0.125 2023-10-03 03:28:11,278 WARNING [train.py:1204] (2/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_465-156500-0_sp0.9 from training. Duration: 0.8 2023-10-03 03:28:13,130 WARNING [train.py:1204] (2/4) Exclude cut with ID _44304_210_15374_1_1533639352765_4613129_302-22084-0 from training. Duration: 0.86 2023-10-03 03:28:14,993 WARNING [train.py:1204] (2/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_663-166973-0_sp1.1 from training. Duration: 0.72725 2023-10-03 03:28:16,399 WARNING [train.py:1204] (2/4) Exclude cut with ID _45329_210_7005_1_1534157951211_6774339_503-287883-0_sp0.9 from training. Duration: 0.97775 2023-10-03 03:28:16,422 WARNING [train.py:1204] (2/4) Exclude cut with ID _60057_210_10640_1_1534903065793_3333150_362-230377-0_sp0.9 from training. Duration: 0.9666875 2023-10-03 03:28:19,302 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_4183_210_2087_1_1526729100_1869838_143-280962-0 from training. Duration: 0.56 2023-10-03 03:28:20,656 WARNING [train.py:1204] (2/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_281-1313-0_sp1.1 from training. Duration: 0.936375 2023-10-03 03:28:26,364 WARNING [train.py:1204] (2/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_64-215118-0_sp1.1 from training. Duration: 0.936375 2023-10-03 03:28:26,435 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_2822_210_2715_1_1526112586_1999587_165-36656-0 from training. Duration: 0.89 2023-10-03 03:28:26,500 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder_embed.conv.2.prob, batch_count=1121773.3333333333, ans=0.125 2023-10-03 03:28:27,755 WARNING [train.py:1204] (2/4) Exclude cut with ID _22961_210_8254_1_1531976366672_4278180_402-208830-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 03:28:29,204 WARNING [train.py:1204] (2/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_98-193494-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 03:28:31,145 WARNING [train.py:1204] (2/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_383-285196-0 from training. Duration: 0.97 2023-10-03 03:28:36,798 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6767_210_3273_1_1527855783_1718932_142-127132-0 from training. Duration: 0.95 2023-10-03 03:28:38,155 WARNING [train.py:1204] (2/4) Exclude cut with ID _39867_210_9826_1_1533553222030_9344259_1018-339635-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 03:28:38,218 WARNING [train.py:1204] (2/4) Exclude cut with ID _14603_210_4220_1_1530615688441_3097319_274-274798-0_sp1.1 from training. Duration: 0.8909375 2023-10-03 03:28:41,415 WARNING [train.py:1204] (2/4) Exclude cut with ID _2334_210_2667_1_1525608238880_4705129_376-263393-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 03:28:41,461 WARNING [train.py:1204] (2/4) Exclude cut with ID _29303_210_2575_1_1532392237303_7410649_363-344675-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 03:28:42,743 INFO [train.py:1046] (2/4) Epoch 32, batch 3600, loss[loss=0.1741, simple_loss=0.2613, pruned_loss=0.04343, over 24407.00 frames. ], tot_loss[loss=0.1614, simple_loss=0.2392, pruned_loss=0.04177, over 4712254.58 frames. ], batch size: 77, lr: 3.18e-03, grad_scale: 16.0 2023-10-03 03:28:43,544 WARNING [train.py:1204] (2/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_58-68956-0_sp1.1 from training. Duration: 0.7545625 2023-10-03 03:28:48,209 WARNING [train.py:1204] (2/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_709-311571-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 03:28:49,610 WARNING [train.py:1204] (2/4) Exclude cut with ID _46067_210_4220_1_1534125571066_3307450_136-257407-0_sp1.1 from training. Duration: 0.963625 2023-10-03 03:28:49,732 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.1.conv_skip_rate, batch_count=1121840.0, ans=0.0 2023-10-03 03:28:51,032 WARNING [train.py:1204] (2/4) Exclude cut with ID _6493_210_1747_1_1528459210424_4012470_324-323854-0_sp1.1 from training. Duration: 0.836375 2023-10-03 03:28:52,391 WARNING [train.py:1204] (2/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_234-260682-0_sp1.1 from training. Duration: 0.763625 2023-10-03 03:28:52,444 WARNING [train.py:1204] (2/4) Exclude cut with ID _36634_210_1794_1_1533296352450_1399884_55-119451-0_sp1.1 from training. Duration: 0.963625 2023-10-03 03:28:52,448 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_40047_210_2290_1_1533290519_2374311_195-165693-0 from training. Duration: 0.89 2023-10-03 03:28:57,735 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_5522_210_3273_1_1527249606_3909096_366-172508-0_sp0.9 from training. Duration: 0.7333125 2023-10-03 03:28:57,813 WARNING [train.py:1204] (2/4) Exclude cut with ID _21878_210_3525_1_1531738772002_7264330_187-10504-0_sp1.1 from training. Duration: 0.963625 2023-10-03 03:29:00,610 WARNING [train.py:1204] (2/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_282-257791-0_sp1.1 from training. Duration: 0.7818125 2023-10-03 03:29:02,097 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_43831_210_2158_1_1533725502_5872791_467-288776-0_sp1.1 from training. Duration: 0.92725 2023-10-03 03:29:04,052 WARNING [train.py:1204] (2/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_138-154812-0_sp1.1 from training. Duration: 0.7090625 2023-10-03 03:29:04,115 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_1154-185930-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 03:29:04,133 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_2655_210_2087_1_1525688959_3897400_52-279899-0 from training. Duration: 0.69 2023-10-03 03:29:04,296 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.feed_forward2.hidden_balancer.prob, batch_count=1121906.6666666667, ans=0.125 2023-10-03 03:29:05,333 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_141-89113-0_sp1.1 from training. Duration: 0.7818125 2023-10-03 03:29:06,799 WARNING [train.py:1204] (2/4) Exclude cut with ID _55134_210_21982_1_1534590028377_3882170_297-47310-0_sp1.1 from training. Duration: 0.963625 2023-10-03 03:29:08,180 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_486-151131-0_sp1.1 from training. Duration: 0.77275 2023-10-03 03:29:09,567 WARNING [train.py:1204] (2/4) Exclude cut with ID _44778_210_2054_1_1533798127203_7443090_21-232980-0_sp1.1 from training. Duration: 0.936375 2023-10-03 03:29:12,606 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6165_210_3528_1_1527590982_4379841_338-106380-0_sp1.1 from training. Duration: 0.92725 2023-10-03 03:29:13,275 WARNING [train.py:1204] (2/4) Exclude cut with ID _55162_210_17071_1_1534408248583_3753480_355-257218-0_sp1.1 from training. Duration: 0.97275 2023-10-03 03:29:14,447 WARNING [train.py:1204] (2/4) Exclude cut with ID _2895_210_2477_1_1525956785794_4063750_289-11475-0 from training. Duration: 0.82 2023-10-03 03:29:20,567 WARNING [train.py:1204] (2/4) Exclude cut with ID _16756_210_4915_1_1531047803763_7104259_839-183431-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 03:29:21,879 WARNING [train.py:1204] (2/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_427-208901-0_sp0.9 from training. Duration: 0.7555625 2023-10-03 03:29:21,926 WARNING [train.py:1204] (2/4) Exclude cut with ID _36512_210_6987_1_1533033143998_3126069_181-306500-0 from training. Duration: 0.95 2023-10-03 03:29:26,012 WARNING [train.py:1204] (2/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_28-70987-0_sp1.1 from training. Duration: 0.7909375 2023-10-03 03:29:32,746 WARNING [train.py:1204] (2/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_590-58996-0_sp1.1 from training. Duration: 0.936375 2023-10-03 03:29:34,219 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_48730_210_5399_1_1533885886_2212769_108-47561-0_sp1.1 from training. Duration: 0.936375 2023-10-03 03:29:40,806 WARNING [train.py:1204] (2/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_401-158239-0_sp0.9 from training. Duration: 0.788875 2023-10-03 03:29:40,832 WARNING [train.py:1204] (2/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_278-154492-0_sp1.1 from training. Duration: 0.6909375 2023-10-03 03:29:40,840 WARNING [train.py:1204] (2/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_137-116442-0 from training. Duration: 0.78 2023-10-03 03:29:42,257 WARNING [train.py:1204] (2/4) Exclude cut with ID _3541_210_2477_1_1526475408709_3263973_189-317158-0 from training. Duration: 0.9 2023-10-03 03:29:42,355 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_47896_210_20508_1_1534492518_5958868_558-314572-0 from training. Duration: 0.97 2023-10-03 03:29:45,534 WARNING [train.py:1204] (2/4) Exclude cut with ID _66070_210_7640_1_1535441393096_7805310_290-40284-0_sp1.1 from training. Duration: 0.97275 2023-10-03 03:29:45,576 WARNING [train.py:1204] (2/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_110-224688-0_sp1.1 from training. Duration: 0.8818125 2023-10-03 03:29:46,205 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.3.encoder.layers.3.nonlin_attention.whiten1, num_groups=1, num_channels=384, metric=5.37 vs. limit=10.0 2023-10-03 03:29:47,191 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.bypass.skip_rate, batch_count=1122106.6666666667, ans=0.07 2023-10-03 03:29:48,678 WARNING [train.py:1204] (2/4) Exclude cut with ID _7719_210_3525_1_1528456153163_4296960_88-250259-0 from training. Duration: 0.84 2023-10-03 03:29:48,723 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8453_210_4211_1_1528502940_8562730_1157-114102-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 03:29:50,092 WARNING [train.py:1204] (2/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_317-175062-0_sp1.1 from training. Duration: 0.7454375 2023-10-03 03:29:50,098 WARNING [train.py:1204] (2/4) Exclude cut with ID _29857_210_20034_1_1532395886753_3798160_254-347254-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 03:29:51,416 WARNING [train.py:1204] (2/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_28-70987-0 from training. Duration: 0.87 2023-10-03 03:29:52,684 WARNING [train.py:1204] (2/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_321-335475-0 from training. Duration: 0.45 2023-10-03 03:29:55,809 WARNING [train.py:1204] (2/4) Exclude cut with ID _40901_210_5665_1_1533363789958_3856130_356-107330-0_sp1.1 from training. Duration: 0.936375 2023-10-03 03:29:57,113 INFO [train.py:1046] (2/4) Epoch 32, batch 3650, loss[loss=0.1644, simple_loss=0.2566, pruned_loss=0.03605, over 24431.00 frames. ], tot_loss[loss=0.1618, simple_loss=0.2398, pruned_loss=0.0419, over 4728243.18 frames. ], batch size: 69, lr: 3.18e-03, grad_scale: 8.0 2023-10-03 03:29:57,194 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_3780_210_2087_1_1526467389_2145572_176-107140-0 from training. Duration: 0.69 2023-10-03 03:30:02,616 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.489e+02 1.895e+02 2.042e+02 2.308e+02 4.121e+02, threshold=4.085e+02, percent-clipped=0.0 2023-10-03 03:30:02,729 WARNING [train.py:1204] (2/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_80-135200-0 from training. Duration: 0.79 2023-10-03 03:30:03,001 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.bypass.scale_min, batch_count=1122173.3333333333, ans=0.2 2023-10-03 03:30:04,118 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_33348_210_5632_1_1533086912_3324030_85-28583-0_sp0.9 from training. Duration: 0.911125 2023-10-03 03:30:07,256 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.balancer2.prob, batch_count=1122173.3333333333, ans=0.125 2023-10-03 03:30:08,426 WARNING [train.py:1204] (2/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_29-293896-0 from training. Duration: 0.61 2023-10-03 03:30:09,885 WARNING [train.py:1204] (2/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_194-252800-0 from training. Duration: 0.97 2023-10-03 03:30:14,480 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_360-52446-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 03:30:14,482 WARNING [train.py:1204] (2/4) Exclude cut with ID _9389_210_1664_1_1529217337301_5289269_289-213417-0_sp0.9 from training. Duration: 0.988875 2023-10-03 03:30:14,543 WARNING [train.py:1204] (2/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_143-86344-0_sp0.9 from training. Duration: 0.8555625 2023-10-03 03:30:17,904 WARNING [train.py:1204] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_520-126729-0_sp0.9 from training. Duration: 0.77775 2023-10-03 03:30:17,923 WARNING [train.py:1204] (2/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_76-308663-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 03:30:19,747 WARNING [train.py:1204] (2/4) Exclude cut with ID _8021_210_7994_1_1528376451358_3653069_100-140231-0 from training. Duration: 0.67 2023-10-03 03:30:19,819 WARNING [train.py:1204] (2/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_387-131538-0_sp0.9 from training. Duration: 0.9 2023-10-03 03:30:19,842 WARNING [train.py:1204] (2/4) Exclude cut with ID _6275_210_2084_1_1528110062213_7669049_365-182054-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 03:30:21,142 WARNING [train.py:1204] (2/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_345-340690-0 from training. Duration: 0.93 2023-10-03 03:30:22,484 WARNING [train.py:1204] (2/4) Exclude cut with ID _5409_210_1388_1_1527292850189_5865191_178-127970-0_sp0.9 from training. Duration: 0.6333125 2023-10-03 03:30:22,539 WARNING [train.py:1204] (2/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_352-12734-0_sp1.1 from training. Duration: 0.92725 2023-10-03 03:30:22,542 WARNING [train.py:1204] (2/4) Exclude cut with ID _65134_210_14763_1_1535335393985_3505368_121-129373-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 03:30:24,556 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder_embed.conv.2.prob, batch_count=1122240.0, ans=0.125 2023-10-03 03:30:25,945 WARNING [train.py:1204] (2/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_557-324258-0_sp1.1 from training. Duration: 0.736375 2023-10-03 03:30:28,673 WARNING [train.py:1204] (2/4) Exclude cut with ID _48678_210_5663_1_1533988830947_3769199_306-255886-0 from training. Duration: 0.77 2023-10-03 03:30:28,770 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7544_210_6753_1_1528587000_691468_87-178599-0 from training. Duration: 0.87 2023-10-03 03:30:30,049 WARNING [train.py:1204] (2/4) Exclude cut with ID _2941_210_2577_1_1526199673150_2489759_208-236402-0_sp1.1 from training. Duration: 0.963625 2023-10-03 03:30:31,524 WARNING [train.py:1204] (2/4) Exclude cut with ID _38670_210_15379_1_1533448810659_4391059_65-169397-0 from training. Duration: 0.97 2023-10-03 03:30:32,862 WARNING [train.py:1204] (2/4) Exclude cut with ID _30304_210_6319_1_1532476827261_7582060_306-328175-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 03:30:32,872 WARNING [train.py:1204] (2/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_212-149947-0_sp1.1 from training. Duration: 0.663625 2023-10-03 03:30:35,886 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.nonlin_attention.balancer.prob, batch_count=1122306.6666666667, ans=0.125 2023-10-03 03:30:37,120 WARNING [train.py:1204] (2/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_321-22821-0_sp1.1 from training. Duration: 0.8090625 2023-10-03 03:30:39,766 WARNING [train.py:1204] (2/4) Exclude cut with ID _44010_210_4525_1_1533884262964_4013620_483-69110-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 03:30:39,782 WARNING [train.py:1204] (2/4) Exclude cut with ID _14922_210_6228_1_1530707435273_3693310_38-187851-0_sp1.1 from training. Duration: 0.563625 2023-10-03 03:30:41,121 WARNING [train.py:1204] (2/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_129-337802-0_sp0.9 from training. Duration: 0.8 2023-10-03 03:30:41,404 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.feed_forward3.hidden_balancer.prob, batch_count=1122373.3333333333, ans=0.125 2023-10-03 03:30:43,076 WARNING [train.py:1204] (2/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_268-87909-0_sp1.1 from training. Duration: 0.9 2023-10-03 03:30:45,863 WARNING [train.py:1204] (2/4) Exclude cut with ID _21878_210_3525_1_1531738772002_7264330_559-184862-0_sp1.1 from training. Duration: 0.8545625 2023-10-03 03:30:49,228 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_46032_210_14656_1_1533781412_4507969_237-120766-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 03:30:50,572 WARNING [train.py:1204] (2/4) Exclude cut with ID _28427_210_4220_1_1532581240967_3061069_330-170906-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 03:30:50,578 WARNING [train.py:1204] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_401-51578-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 03:30:51,930 WARNING [train.py:1204] (2/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_209-205365-0_sp1.1 from training. Duration: 0.7181875 2023-10-03 03:30:53,270 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_23801_210_16223_1_1532066392_3294657_57-242170-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 03:30:53,323 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_127-37371-0_sp1.1 from training. Duration: 0.936375 2023-10-03 03:30:54,852 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.1.feed_forward1.out_proj.dropout_p, batch_count=1122440.0, ans=0.1 2023-10-03 03:31:00,732 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9171W0019-578905 from training. Duration: 0.896 2023-10-03 03:31:03,564 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_24605_210_1794_1_1532047863_4149755_349-49056-0_sp1.1 from training. Duration: 0.92725 2023-10-03 03:31:04,756 WARNING [train.py:1204] (2/4) Exclude cut with ID _12302_210_5219_1_1529924446466_8107280_897-21237-0_sp1.1 from training. Duration: 0.936375 2023-10-03 03:31:04,852 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_9460_210_4211_1_1528954587_10049667_480-260269-0_sp0.9 from training. Duration: 0.62225 2023-10-03 03:31:04,894 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_348-152548-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 03:31:06,242 WARNING [train.py:1204] (2/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_301-330455-0_sp0.9 from training. Duration: 0.888875 2023-10-03 03:31:07,617 WARNING [train.py:1204] (2/4) Exclude cut with ID _61008_210_10633_1_1534852769198_6974370_345-91705-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 03:31:09,037 WARNING [train.py:1204] (2/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_304-273433-0 from training. Duration: 0.99 2023-10-03 03:31:09,047 WARNING [train.py:1204] (2/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_359-345972-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 03:31:10,289 INFO [train.py:1046] (2/4) Epoch 32, batch 3700, loss[loss=0.1772, simple_loss=0.2508, pruned_loss=0.05175, over 23771.00 frames. ], tot_loss[loss=0.1625, simple_loss=0.2407, pruned_loss=0.04213, over 4730567.17 frames. ], batch size: 212, lr: 3.18e-03, grad_scale: 8.0 2023-10-03 03:31:13,195 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_2296_210_2087_1_1525503448_3397335_57-185430-0_sp1.1 from training. Duration: 0.5454375 2023-10-03 03:31:13,401 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.self_attn_weights.pos_emb_skip_rate, batch_count=1122506.6666666667, ans=0.0 2023-10-03 03:31:14,575 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_1256-120974-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 03:31:16,285 WARNING [train.py:1204] (2/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_372-30302-0_sp0.9 from training. Duration: 0.9555625 2023-10-03 03:31:17,741 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_42012_210_7927_1_1533439936_3871556_451-344262-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 03:31:17,742 WARNING [train.py:1204] (2/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_28-324729-0 from training. Duration: 0.94 2023-10-03 03:31:17,750 WARNING [train.py:1204] (2/4) Exclude cut with ID _64135_210_2253_1_1535677267876_7608972_313-18394-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 03:31:19,963 INFO [scaling.py:1118] (2/4) WithLoss: name=encoder.encoders.5.encoder.layers.1.self_attn_weights, loss-sum=0.000e+00 2023-10-03 03:31:21,719 WARNING [train.py:1204] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_620-276793-0_sp0.9 from training. Duration: 0.6555625 2023-10-03 03:31:21,757 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7544_210_6753_1_1528591327_1471426_172-348777-0_sp0.9 from training. Duration: 0.7333125 2023-10-03 03:31:25,618 WARNING [train.py:1204] (2/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_332-221057-0_sp0.9 from training. Duration: 0.7555625 2023-10-03 03:31:28,418 WARNING [train.py:1204] (2/4) Exclude cut with ID _19023_210_4353_1_1531378892552_5820780_5-300035-0_sp1.1 from training. Duration: 0.92725 2023-10-03 03:31:28,470 WARNING [train.py:1204] (2/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_343-309023-0_sp1.1 from training. Duration: 0.963625 2023-10-03 03:31:29,901 WARNING [train.py:1204] (2/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_37-41919-0_sp1.1 from training. Duration: 0.7909375 2023-10-03 03:31:29,926 WARNING [train.py:1204] (2/4) Exclude cut with ID _3541_210_2477_1_1526475408709_3263973_41-182910-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 03:31:29,987 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_229-45300-0_sp0.9 from training. Duration: 0.6333125 2023-10-03 03:31:32,662 WARNING [train.py:1204] (2/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_40-214716-0_sp1.1 from training. Duration: 0.963625 2023-10-03 03:31:34,455 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9309W0203-640330 from training. Duration: 0.896 2023-10-03 03:31:42,807 WARNING [train.py:1204] (2/4) Exclude cut with ID _60229_210_27767_1_1534838406158_3655350_264-279573-0_sp1.1 from training. Duration: 0.7545625 2023-10-03 03:31:42,851 WARNING [train.py:1204] (2/4) Exclude cut with ID _8394_210_6702_1_1528590504316_3540189_372-198878-0_sp0.9 from training. Duration: 0.6666875 2023-10-03 03:31:43,158 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.conv_module2.balancer2.prob, batch_count=1122640.0, ans=0.125 2023-10-03 03:31:44,195 WARNING [train.py:1204] (2/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_359-118926-0_sp1.1 from training. Duration: 0.6545625 2023-10-03 03:31:44,213 WARNING [train.py:1204] (2/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_85-65056-0 from training. Duration: 0.96 2023-10-03 03:31:44,232 WARNING [train.py:1204] (2/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_89-242335-0_sp1.1 from training. Duration: 0.863625 2023-10-03 03:31:49,136 WARNING [train.py:1204] (2/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_128-49444-0_sp1.1 from training. Duration: 0.936375 2023-10-03 03:31:49,210 WARNING [train.py:1204] (2/4) Exclude cut with ID _30327_210_16049_1_1532484161295_3924169_401-70819-0 from training. Duration: 0.86 2023-10-03 03:31:50,567 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_41822_210_13270_1_1533955976_4379284_260-256391-0_sp1.1 from training. Duration: 0.936375 2023-10-03 03:31:51,997 WARNING [train.py:1204] (2/4) Exclude cut with ID _67335_210_5219_1_1535525975652_4275229_599-247317-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 03:31:55,393 WARNING [train.py:1204] (2/4) Exclude cut with ID _29857_210_20034_1_1532395886753_3798160_209-239267-0_sp1.1 from training. Duration: 0.936375 2023-10-03 03:31:56,621 WARNING [train.py:1204] (2/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_46-207599-0_sp1.1 from training. Duration: 0.5818125 2023-10-03 03:31:57,942 WARNING [train.py:1204] (2/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_912-282412-0_sp1.1 from training. Duration: 0.4454375 2023-10-03 03:32:00,685 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_4183_210_2087_1_1526729100_1869838_141-166600-0_sp1.1 from training. Duration: 0.863625 2023-10-03 03:32:00,695 WARNING [train.py:1204] (2/4) Exclude cut with ID _15037_210_2253_1_1530752294104_3267650_296-192597-0 from training. Duration: 0.81 2023-10-03 03:32:02,005 WARNING [train.py:1204] (2/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_571-220562-0_sp1.1 from training. Duration: 0.963625 2023-10-03 03:32:02,024 WARNING [train.py:1204] (2/4) Exclude cut with ID _5287_210_2084_1_1527418883040_3878670_12-221186-0 from training. Duration: 0.98 2023-10-03 03:32:07,715 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.0.layers.1.nonlin_attention.whiten1, num_groups=1, num_channels=144, metric=5.26 vs. limit=10.0 2023-10-03 03:32:08,072 WARNING [train.py:1204] (2/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_149-85602-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 03:32:08,102 WARNING [train.py:1204] (2/4) Exclude cut with ID _9235_210_5219_1_1528891154253_8046120_1003-101638-0_sp1.1 from training. Duration: 0.663625 2023-10-03 03:32:09,727 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.ff2_skip_rate, batch_count=1122773.3333333333, ans=0.0 2023-10-03 03:32:10,890 WARNING [train.py:1204] (2/4) Exclude cut with ID _55844_210_2346_1_1534471212569_7625707_1099-33689-0_sp1.1 from training. Duration: 0.92725 2023-10-03 03:32:10,931 WARNING [train.py:1204] (2/4) Exclude cut with ID _7606_210_4915_1_1528894253277_5080690_301-10732-0 from training. Duration: 0.81 2023-10-03 03:32:12,366 WARNING [train.py:1204] (2/4) Exclude cut with ID _22826_210_9994_1_1531908201522_3279169_403-34698-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 03:32:12,372 WARNING [train.py:1204] (2/4) Exclude cut with ID _8480_210_1874_1_1528589391205_6728330_755-112625-0_sp0.9 from training. Duration: 0.82225 2023-10-03 03:32:13,658 WARNING [train.py:1204] (2/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_527-238990-0_sp1.1 from training. Duration: 0.8181875 2023-10-03 03:32:13,690 WARNING [train.py:1204] (2/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_426-7328-0_sp1.1 from training. Duration: 0.92725 2023-10-03 03:32:16,576 WARNING [train.py:1204] (2/4) Exclude cut with ID _3541_210_2477_1_1526475408709_3263973_189-317158-0_sp1.1 from training. Duration: 0.8181875 2023-10-03 03:32:17,964 WARNING [train.py:1204] (2/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_162-103620-0 from training. Duration: 0.93 2023-10-03 03:32:19,918 WARNING [train.py:1204] (2/4) Exclude cut with ID _22083_210_8254_1_1531702862439_3678970_49-234546-0 from training. Duration: 0.83 2023-10-03 03:32:19,957 WARNING [train.py:1204] (2/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_370-331600-0_sp1.1 from training. Duration: 0.8545625 2023-10-03 03:32:19,980 WARNING [train.py:1204] (2/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_166-59042-0_sp1.1 from training. Duration: 0.97275 2023-10-03 03:32:21,898 WARNING [train.py:1204] (2/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_138-332773-0_sp1.1 from training. Duration: 0.736375 2023-10-03 03:32:23,206 WARNING [train.py:1204] (2/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_90-30673-0_sp0.9 from training. Duration: 0.9333125 2023-10-03 03:32:24,554 INFO [train.py:1046] (2/4) Epoch 32, batch 3750, loss[loss=0.1745, simple_loss=0.2664, pruned_loss=0.04126, over 24296.00 frames. ], tot_loss[loss=0.1638, simple_loss=0.2421, pruned_loss=0.04276, over 4718330.45 frames. ], batch size: 74, lr: 3.18e-03, grad_scale: 8.0 2023-10-03 03:32:25,964 WARNING [train.py:1204] (2/4) Exclude cut with ID _27967_210_12140_1_1532588391859_8419130_737-41898-0_sp1.1 from training. Duration: 0.936375 2023-10-03 03:32:27,325 WARNING [train.py:1204] (2/4) Exclude cut with ID _8833_210_5219_1_1528592665210_7553059_329-125156-0_sp1.1 from training. Duration: 0.7454375 2023-10-03 03:32:27,428 WARNING [train.py:1204] (2/4) Exclude cut with ID _41817_210_18835_1_1533434477160_7423049_352-42537-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 03:32:30,020 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.594e+02 1.897e+02 2.092e+02 2.385e+02 3.379e+02, threshold=4.184e+02, percent-clipped=0.0 2023-10-03 03:32:30,119 WARNING [train.py:1204] (2/4) Exclude cut with ID _8365_210_2064_1_1528592311070_4677690_431-294102-0 from training. Duration: 0.89 2023-10-03 03:32:31,592 WARNING [train.py:1204] (2/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_454-314901-0_sp1.1 from training. Duration: 0.4181875 2023-10-03 03:32:33,028 WARNING [train.py:1204] (2/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_80-135200-0_sp0.9 from training. Duration: 0.87775 2023-10-03 03:32:34,403 WARNING [train.py:1204] (2/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_181-78632-0 from training. Duration: 0.78 2023-10-03 03:32:34,456 WARNING [train.py:1204] (2/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_300-27235-0_sp0.9 from training. Duration: 0.9666875 2023-10-03 03:32:35,782 WARNING [train.py:1204] (2/4) Exclude cut with ID _47790_210_16187_1_1533862814066_4207519_96-177176-0_sp1.1 from training. Duration: 0.97275 2023-10-03 03:32:37,631 WARNING [train.py:1204] (2/4) Exclude cut with ID _35941_210_15379_1_1532995294515_4023089_246-348742-0_sp1.1 from training. Duration: 0.97275 2023-10-03 03:32:39,150 WARNING [train.py:1204] (2/4) Exclude cut with ID _38670_210_15379_1_1533448810659_4391059_65-169397-0_sp1.1 from training. Duration: 0.8818125 2023-10-03 03:32:41,954 WARNING [train.py:1204] (2/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_1003-99642-0_sp1.1 from training. Duration: 0.963625 2023-10-03 03:32:44,747 WARNING [train.py:1204] (2/4) Exclude cut with ID _40402_210_18219_1_1533522324115_2953150_320-14078-0_sp1.1 from training. Duration: 0.863625 2023-10-03 03:32:46,122 WARNING [train.py:1204] (2/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_367-61330-0_sp0.9 from training. Duration: 0.8333125 2023-10-03 03:32:47,996 WARNING [train.py:1204] (2/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_515-298514-0_sp1.1 from training. Duration: 0.92725 2023-10-03 03:32:49,436 WARNING [train.py:1204] (2/4) Exclude cut with ID _53731_210_7994_1_1534553979244_3757214_471-320815-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 03:32:51,553 WARNING [train.py:1204] (2/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_306-232480-0 from training. Duration: 0.96 2023-10-03 03:32:52,883 WARNING [train.py:1204] (2/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_478-162247-0_sp0.9 from training. Duration: 0.911125 2023-10-03 03:32:54,695 WARNING [train.py:1204] (2/4) Exclude cut with ID _56423_210_5854_1_1534896061049_3165969_345-284696-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 03:32:54,721 WARNING [train.py:1204] (2/4) Exclude cut with ID _44778_210_2054_1_1533798127203_7443090_600-122253-0_sp1.1 from training. Duration: 0.963625 2023-10-03 03:32:58,768 WARNING [train.py:1204] (2/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_199-295611-0 from training. Duration: 0.55 2023-10-03 03:33:01,573 WARNING [train.py:1204] (2/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_734-129761-0 from training. Duration: 0.82 2023-10-03 03:33:02,922 WARNING [train.py:1204] (2/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_349-169257-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 03:33:02,955 WARNING [train.py:1204] (2/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_202-67017-0_sp0.9 from training. Duration: 0.911125 2023-10-03 03:33:05,619 WARNING [train.py:1204] (2/4) Exclude cut with ID _16908_210_5724_1_1531119157248_3772700_61-258978-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 03:33:10,445 WARNING [train.py:1204] (2/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_172-156931-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 03:33:11,871 WARNING [train.py:1204] (2/4) Exclude cut with ID _4092_210_2151_1_1526643044449_3135759_356-278694-0_sp1.1 from training. Duration: 0.42725 2023-10-03 03:33:14,763 WARNING [train.py:1204] (2/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_116-332008-0 from training. Duration: 0.98 2023-10-03 03:33:16,337 WARNING [train.py:1204] (2/4) Exclude cut with ID _29003_210_19596_1_1532507391720_3894098_358-114789-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 03:33:20,317 WARNING [train.py:1204] (2/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_152-212533-0_sp1.1 from training. Duration: 0.8818125 2023-10-03 03:33:20,445 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.conv_skip_rate, batch_count=1123040.0, ans=0.0 2023-10-03 03:33:21,599 WARNING [train.py:1204] (2/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_267-214116-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 03:33:25,639 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7543_210_6753_1_1528541907_1399322_165-323619-0_sp0.9 from training. Duration: 0.7666875 2023-10-03 03:33:29,112 WARNING [train.py:1204] (2/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_577-332600-0_sp0.9 from training. Duration: 0.6333125 2023-10-03 03:33:31,809 WARNING [train.py:1204] (2/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_342-100157-0_sp0.9 from training. Duration: 0.6 2023-10-03 03:33:33,171 WARNING [train.py:1204] (2/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_141-89727-0_sp1.1 from training. Duration: 0.6454375 2023-10-03 03:33:33,274 WARNING [train.py:1204] (2/4) Exclude cut with ID _3992_210_1747_1_1526644816031_3731060_107-251434-0_sp1.1 from training. Duration: 0.9 2023-10-03 03:33:35,969 WARNING [train.py:1204] (2/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_401-53423-0_sp1.1 from training. Duration: 0.5 2023-10-03 03:33:36,193 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.conv_module2.balancer2.prob, batch_count=1123106.6666666667, ans=0.125 2023-10-03 03:33:39,275 INFO [train.py:1046] (2/4) Epoch 32, batch 3800, loss[loss=0.1446, simple_loss=0.223, pruned_loss=0.03315, over 24400.00 frames. ], tot_loss[loss=0.1635, simple_loss=0.242, pruned_loss=0.04248, over 4732783.28 frames. ], batch size: 58, lr: 3.18e-03, grad_scale: 8.0 2023-10-03 03:33:39,620 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=1123173.3333333333, ans=0.1 2023-10-03 03:33:40,924 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.ff3_skip_rate, batch_count=1123173.3333333333, ans=0.0 2023-10-03 03:33:42,269 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7762_210_1794_1_1528199508_4057408_422-227034-0_sp1.1 from training. Duration: 0.663625 2023-10-03 03:33:46,404 WARNING [train.py:1204] (2/4) Exclude cut with ID _4184_210_2550_1_1526730192013_2219380_102-250299-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 03:33:46,574 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.0.conv_module2.balancer2.prob, batch_count=1123173.3333333333, ans=0.125 2023-10-03 03:33:47,795 WARNING [train.py:1204] (2/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_253-308377-0_sp0.9 from training. Duration: 0.5444375 2023-10-03 03:33:47,861 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_271-297505-0 from training. Duration: 0.99 2023-10-03 03:33:49,767 WARNING [train.py:1204] (2/4) Exclude cut with ID _35941_210_15379_1_1532995294515_4023089_213-142973-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 03:33:51,232 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7544_210_6753_1_1528587722_3576686_162-63018-0_sp1.1 from training. Duration: 0.92725 2023-10-03 03:33:51,314 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_365-217119-0_sp0.9 from training. Duration: 0.7 2023-10-03 03:33:54,383 WARNING [train.py:1204] (2/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_662-275621-0_sp0.9 from training. Duration: 0.4666875 2023-10-03 03:33:54,385 WARNING [train.py:1204] (2/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_374-187555-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 03:33:55,672 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_289-180904-0_sp1.1 from training. Duration: 0.6909375 2023-10-03 03:33:57,105 WARNING [train.py:1204] (2/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_189-47206-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 03:33:58,419 WARNING [train.py:1204] (2/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_234-14174-0_sp1.1 from training. Duration: 0.7454375 2023-10-03 03:33:58,462 WARNING [train.py:1204] (2/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_576-76621-0_sp1.1 from training. Duration: 0.963625 2023-10-03 03:33:59,861 WARNING [train.py:1204] (2/4) Exclude cut with ID _13253_210_1925_1_1530757775819_3577873_253-246386-0 from training. Duration: 0.9 2023-10-03 03:34:01,366 WARNING [train.py:1204] (2/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_372-6869-0_sp1.1 from training. Duration: 0.4 2023-10-03 03:34:02,675 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_21434_210_4238_1_1531916339_1222252_22-54954-0_sp1.1 from training. Duration: 0.936375 2023-10-03 03:34:04,305 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.nonlin_attention.balancer.prob, batch_count=1123240.0, ans=0.125 2023-10-03 03:34:05,367 WARNING [train.py:1204] (2/4) Exclude cut with ID _29853_210_7497_1_1532505600851_6642110_492-170361-0_sp1.1 from training. Duration: 0.92725 2023-10-03 03:34:08,590 WARNING [train.py:1204] (2/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_505-97987-0_sp0.9 from training. Duration: 0.9555625 2023-10-03 03:34:08,643 WARNING [train.py:1204] (2/4) Exclude cut with ID _8365_210_2064_1_1528592311070_4677690_371-122295-0_sp0.9 from training. Duration: 0.611125 2023-10-03 03:34:10,114 WARNING [train.py:1204] (2/4) Exclude cut with ID _14268_210_9206_1_1530622771285_4714099_637-86630-0_sp1.1 from training. Duration: 0.463625 2023-10-03 03:34:11,269 WARNING [train.py:1204] (2/4) Exclude cut with ID _29857_210_20034_1_1532395886753_3798160_372-5678-0_sp1.1 from training. Duration: 0.963625 2023-10-03 03:34:12,699 WARNING [train.py:1204] (2/4) Exclude cut with ID _52892_210_6721_1_1534378941259_4124129_90-165108-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 03:34:13,426 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.2.encoder.layers.1.conv_module2.whiten, num_groups=1, num_channels=384, metric=4.02 vs. limit=15.0 2023-10-03 03:34:14,051 WARNING [train.py:1204] (2/4) Exclude cut with ID _27052_210_5986_1_1532329197187_3585009_143-339198-0_sp1.1 from training. Duration: 0.963625 2023-10-03 03:34:14,218 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=1123306.6666666667, ans=0.1 2023-10-03 03:34:18,925 WARNING [train.py:1204] (2/4) Exclude cut with ID _14268_210_9206_1_1530622771285_4714099_637-86630-0_sp0.9 from training. Duration: 0.5666875 2023-10-03 03:34:18,928 WARNING [train.py:1204] (2/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_401-255491-0 from training. Duration: 0.7 2023-10-03 03:34:22,283 WARNING [train.py:1204] (2/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_178-6946-0_sp1.1 from training. Duration: 0.97275 2023-10-03 03:34:26,937 WARNING [train.py:1204] (2/4) Exclude cut with ID _27485_210_4916_1_1532739191677_3668269_460-163885-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 03:34:32,610 WARNING [train.py:1204] (2/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_270-26053-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 03:34:33,998 WARNING [train.py:1204] (2/4) Exclude cut with ID _14170_210_5219_1_1530525949868_4469650_412-317077-0 from training. Duration: 0.75 2023-10-03 03:34:36,672 WARNING [train.py:1204] (2/4) Exclude cut with ID _5289_210_2084_1_1527678202736_3779299_142-103317-0 from training. Duration: 0.84 2023-10-03 03:34:36,724 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_150-143438-0_sp1.1 from training. Duration: 0.92725 2023-10-03 03:34:38,687 WARNING [train.py:1204] (2/4) Exclude cut with ID _27894_210_12392_1_1532415496078_6902439_668-269779-0_sp1.1 from training. Duration: 0.97275 2023-10-03 03:34:38,737 WARNING [train.py:1204] (2/4) Exclude cut with ID _18513_210_9206_1_1531618215223_3862620_56-222075-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 03:34:41,452 WARNING [train.py:1204] (2/4) Exclude cut with ID _4092_210_2151_1_1526643044449_3135759_356-278694-0 from training. Duration: 0.47 2023-10-03 03:34:44,351 WARNING [train.py:1204] (2/4) Exclude cut with ID _25808_210_13292_1_1532343514562_7366109_633-47524-0 from training. Duration: 0.99 2023-10-03 03:34:44,361 WARNING [train.py:1204] (2/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_872-264525-0 from training. Duration: 0.65 2023-10-03 03:34:44,400 WARNING [train.py:1204] (2/4) Exclude cut with ID _14632_210_9988_1_1530691279392_3923200_239-232545-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 03:34:45,788 WARNING [train.py:1204] (2/4) Exclude cut with ID _14349_210_8341_1_1530615710818_3442470_153-27432-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 03:34:51,683 WARNING [train.py:1204] (2/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_37-41919-0_sp0.9 from training. Duration: 0.9666875 2023-10-03 03:34:52,638 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.0.layers.0.conv_module2.whiten, num_groups=1, num_channels=192, metric=9.78 vs. limit=15.0 2023-10-03 03:34:52,992 INFO [train.py:1046] (2/4) Epoch 32, batch 3850, loss[loss=0.1456, simple_loss=0.2296, pruned_loss=0.03081, over 24504.00 frames. ], tot_loss[loss=0.1628, simple_loss=0.2409, pruned_loss=0.04236, over 4722876.59 frames. ], batch size: 63, lr: 3.18e-03, grad_scale: 8.0 2023-10-03 03:34:53,070 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_196-289637-0_sp0.9 from training. Duration: 0.8333125 2023-10-03 03:34:56,680 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.1.attention_skip_rate, batch_count=1123506.6666666667, ans=0.0 2023-10-03 03:34:57,898 WARNING [train.py:1204] (2/4) Exclude cut with ID _13678_210_7497_1_1530530069226_3463170_371-3221-0_sp1.1 from training. Duration: 0.8181875 2023-10-03 03:34:57,983 WARNING [train.py:1204] (2/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_557-324258-0 from training. Duration: 0.81 2023-10-03 03:34:59,364 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.545e+02 1.881e+02 2.039e+02 2.318e+02 3.209e+02, threshold=4.078e+02, percent-clipped=0.0 2023-10-03 03:34:59,520 WARNING [train.py:1204] (2/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_1012-279473-0_sp1.1 from training. Duration: 0.6090625 2023-10-03 03:35:00,869 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_66916_210_14656_1_1535725525_326566_7-92649-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 03:35:05,187 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_9460_210_4211_1_1528954587_10049667_480-260269-0_sp1.1 from training. Duration: 0.5090625 2023-10-03 03:35:08,401 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_40896_210_15710_1_1533433956_7100802_121-155001-0_sp1.1 from training. Duration: 0.92725 2023-10-03 03:35:10,410 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.3.encoder.layers.1.feed_forward3.out_whiten, num_groups=1, num_channels=512, metric=9.50 vs. limit=15.0 2023-10-03 03:35:11,163 WARNING [train.py:1204] (2/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_188-171673-0_sp1.1 from training. Duration: 0.52725 2023-10-03 03:35:12,535 WARNING [train.py:1204] (2/4) Exclude cut with ID _47790_210_16187_1_1533862814066_4207519_2-39653-0 from training. Duration: 0.98 2023-10-03 03:35:12,793 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.feed_forward1.hidden_balancer.prob, batch_count=1123573.3333333333, ans=0.125 2023-10-03 03:35:16,866 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_23850_210_4238_1_1532232597_1632433_112-229012-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 03:35:18,333 WARNING [train.py:1204] (2/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_626-79528-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 03:35:21,592 WARNING [train.py:1204] (2/4) Exclude cut with ID _30304_210_6319_1_1532476827261_7582060_856-90052-0_sp1.1 from training. Duration: 0.963625 2023-10-03 03:35:21,633 WARNING [train.py:1204] (2/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_122-109023-0_sp0.9 from training. Duration: 0.8444375 2023-10-03 03:35:25,925 WARNING [train.py:1204] (2/4) Exclude cut with ID _64414_210_17301_1_1535243252048_3757759_37-70328-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 03:35:26,001 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_20120_210_6753_1_1531643035_5482859_622-344907-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 03:35:26,834 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.feed_forward2.hidden_balancer.prob, batch_count=1123640.0, ans=0.125 2023-10-03 03:35:27,939 WARNING [train.py:1204] (2/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_410-321956-0_sp1.1 from training. Duration: 0.92725 2023-10-03 03:35:27,952 WARNING [train.py:1204] (2/4) Exclude cut with ID _7562_210_2084_1_1528887767916_3946229_186-326422-0_sp0.9 from training. Duration: 0.9333125 2023-10-03 03:35:29,319 WARNING [train.py:1204] (2/4) Exclude cut with ID _37537_210_2253_1_1533171758568_3421949_127-285192-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 03:35:30,802 WARNING [train.py:1204] (2/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_723-228168-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 03:35:32,156 WARNING [train.py:1204] (2/4) Exclude cut with ID _13678_210_7497_1_1530530069226_3463170_426-246984-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 03:35:32,176 WARNING [train.py:1204] (2/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_399-156791-0_sp0.9 from training. Duration: 0.92225 2023-10-03 03:35:33,524 WARNING [train.py:1204] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_391-22148-0 from training. Duration: 0.77 2023-10-03 03:35:33,554 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_3475_210_2028_1_1526193578_3622569_64-297072-0 from training. Duration: 0.91 2023-10-03 03:35:34,885 WARNING [train.py:1204] (2/4) Exclude cut with ID _17875_210_2087_1_1531224012907_1821339_225-280015-0_sp1.1 from training. Duration: 0.963625 2023-10-03 03:35:34,918 WARNING [train.py:1204] (2/4) Exclude cut with ID _50655_210_18628_1_1534229942193_3772154_325-246169-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 03:35:36,486 WARNING [train.py:1204] (2/4) Exclude cut with ID _29529_210_7234_1_1532393961455_3977460_597-257348-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 03:35:37,776 WARNING [train.py:1204] (2/4) Exclude cut with ID _58895_210_8750_1_1535184081320_3649089_77-173485-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 03:35:37,795 WARNING [train.py:1204] (2/4) Exclude cut with ID _6270_210_2084_1_1527850911573_3856420_26-277343-0 from training. Duration: 0.82 2023-10-03 03:35:39,351 WARNING [train.py:1204] (2/4) Exclude cut with ID _14368_210_4220_1_1530532714870_3595529_144-3287-0 from training. Duration: 0.67 2023-10-03 03:35:41,371 WARNING [train.py:1204] (2/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_293-145278-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 03:35:43,009 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.1.ff3_skip_rate, batch_count=1123706.6666666667, ans=0.0 2023-10-03 03:35:44,136 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_40047_210_2290_1_1533290519_2374311_257-335162-0 from training. Duration: 0.95 2023-10-03 03:35:44,278 WARNING [train.py:1204] (2/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_619-344499-0_sp0.9 from training. Duration: 0.7 2023-10-03 03:35:45,130 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.1.encoder.layers.0.self_attn2.whiten, num_groups=1, num_channels=256, metric=12.38 vs. limit=22.5 2023-10-03 03:35:49,634 WARNING [train.py:1204] (2/4) Exclude cut with ID _19504_210_9994_1_1531648863242_3325220_456-315379-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 03:35:50,961 WARNING [train.py:1204] (2/4) Exclude cut with ID _55857_210_2346_1_1534644007831_6898209_631-221692-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 03:35:53,375 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.2.encoder.layers.1.conv_module1.whiten, num_groups=1, num_channels=384, metric=12.52 vs. limit=15.0 2023-10-03 03:35:54,584 WARNING [train.py:1204] (2/4) Exclude cut with ID _64631_210_27767_1_1535349580068_4165160_324-3682-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 03:35:54,615 WARNING [train.py:1204] (2/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_317-211107-0 from training. Duration: 0.92 2023-10-03 03:35:58,494 WARNING [train.py:1204] (2/4) Exclude cut with ID _6789_210_1534_1_1527857937218_3118950_311-61262-0 from training. Duration: 0.65 2023-10-03 03:36:01,145 WARNING [train.py:1204] (2/4) Exclude cut with ID _28323_210_16187_1_1532480395667_4100300_365-68882-0_sp1.1 from training. Duration: 0.92725 2023-10-03 03:36:02,474 WARNING [train.py:1204] (2/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_816-181825-0_sp1.1 from training. Duration: 0.92725 2023-10-03 03:36:03,942 WARNING [train.py:1204] (2/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_132-269633-0_sp1.1 from training. Duration: 0.6181875 2023-10-03 03:36:03,960 WARNING [train.py:1204] (2/4) Exclude cut with ID _44585_210_18628_1_1533605350387_4518199_280-73534-0_sp1.1 from training. Duration: 0.8090625 2023-10-03 03:36:04,014 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_68095_210_20185_1_1535718226_8888855_1009-164755-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 03:36:05,395 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_40223_210_6228_1_1533298404_4812267_400-103813-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 03:36:05,396 WARNING [train.py:1204] (2/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_491-261923-0_sp1.1 from training. Duration: 0.8909375 2023-10-03 03:36:05,403 WARNING [train.py:1204] (2/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_712-342563-0 from training. Duration: 0.97 2023-10-03 03:36:05,503 WARNING [train.py:1204] (2/4) Exclude cut with ID _41161_210_20034_1_1533431249657_3555314_175-190032-0_sp1.1 from training. Duration: 0.963625 2023-10-03 03:36:06,867 WARNING [train.py:1204] (2/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_788-298907-0 from training. Duration: 0.89 2023-10-03 03:36:08,139 INFO [train.py:1046] (2/4) Epoch 32, batch 3900, loss[loss=0.1445, simple_loss=0.2289, pruned_loss=0.03009, over 24326.00 frames. ], tot_loss[loss=0.1619, simple_loss=0.2402, pruned_loss=0.04185, over 4722607.71 frames. ], batch size: 61, lr: 3.18e-03, grad_scale: 8.0 2023-10-03 03:36:08,194 WARNING [train.py:1204] (2/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_225-70961-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 03:36:08,205 WARNING [train.py:1204] (2/4) Exclude cut with ID _58583_210_5236_1_1534726835266_3574249_235-175994-0_sp1.1 from training. Duration: 0.92725 2023-10-03 03:36:09,643 WARNING [train.py:1204] (2/4) Exclude cut with ID _12302_210_5219_1_1529924446466_8107280_907-182949-0_sp1.1 from training. Duration: 0.836375 2023-10-03 03:36:09,673 WARNING [train.py:1204] (2/4) Exclude cut with ID _54871_210_22356_1_1534665798253_3856359_94-193692-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 03:36:09,760 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_4528_210_3273_1_1527076654_3276777_342-168068-0_sp0.9 from training. Duration: 0.9666875 2023-10-03 03:36:11,495 WARNING [train.py:1204] (2/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_368-74239-0_sp1.1 from training. Duration: 0.92725 2023-10-03 03:36:11,496 WARNING [train.py:1204] (2/4) Exclude cut with ID _44831_210_13427_1_1533816011549_3689035_291-295928-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 03:36:12,796 WARNING [train.py:1204] (2/4) Exclude cut with ID _56406_210_9288_1_1534899618664_3496149_455-131997-0_sp1.1 from training. Duration: 0.97275 2023-10-03 03:36:12,804 WARNING [train.py:1204] (2/4) Exclude cut with ID _6473_210_1741_1_1527901307396_3815380_108-17335-0 from training. Duration: 0.65 2023-10-03 03:36:12,842 WARNING [train.py:1204] (2/4) Exclude cut with ID _14224_210_4381_1_1530529062037_4264010_286-135600-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 03:36:16,952 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_14056_210_4163_1_1530604483_7572827_928-321675-0_sp1.1 from training. Duration: 0.936375 2023-10-03 03:36:17,029 WARNING [train.py:1204] (2/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_81-300212-0_sp1.1 from training. Duration: 0.5454375 2023-10-03 03:36:18,333 WARNING [train.py:1204] (2/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_29-280622-0_sp0.9 from training. Duration: 0.911125 2023-10-03 03:36:19,629 WARNING [train.py:1204] (2/4) Exclude cut with ID _61079_210_4525_1_1534921008198_4011419_228-176447-0_sp1.1 from training. Duration: 0.936375 2023-10-03 03:36:22,412 WARNING [train.py:1204] (2/4) Exclude cut with ID _8394_210_6702_1_1528590504316_3540189_372-198878-0_sp1.1 from training. Duration: 0.5454375 2023-10-03 03:36:22,423 WARNING [train.py:1204] (2/4) Exclude cut with ID _64521_210_14699_1_1535243427259_3815328_244-208725-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 03:36:24,381 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8275_210_4198_1_1528455320_3892571_491-17707-0_sp0.9 from training. Duration: 0.711125 2023-10-03 03:36:24,591 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.conv_module2.balancer2.min_abs, batch_count=1123906.6666666667, ans=0.5 2023-10-03 03:36:26,379 WARNING [train.py:1204] (2/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_375-245677-0 from training. Duration: 0.81 2023-10-03 03:36:26,386 WARNING [train.py:1204] (2/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_690-286055-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 03:36:27,719 WARNING [train.py:1204] (2/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_89-321955-0 from training. Duration: 0.8 2023-10-03 03:36:27,767 WARNING [train.py:1204] (2/4) Exclude cut with ID _18895_210_6228_1_1531357274234_3784930_213-98721-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 03:36:29,089 WARNING [train.py:1204] (2/4) Exclude cut with ID _19708_210_9206_1_1532136650733_4238364_347-65240-0 from training. Duration: 0.97 2023-10-03 03:36:30,333 WARNING [train.py:1204] (2/4) Exclude cut with ID _64135_210_2253_1_1535677267876_7608972_498-5039-0 from training. Duration: 0.78 2023-10-03 03:36:33,314 WARNING [train.py:1204] (2/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_146-276944-0_sp1.1 from training. Duration: 0.7545625 2023-10-03 03:36:34,807 WARNING [train.py:1204] (2/4) Exclude cut with ID _63171_210_15488_1_1535104792794_3295110_155-34488-0_sp1.1 from training. Duration: 0.97275 2023-10-03 03:36:34,829 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_5438_210_1794_1_1527336687_2600009_233-58072-0_sp0.9 from training. Duration: 0.8555625 2023-10-03 03:36:34,872 WARNING [train.py:1204] (2/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_63-101996-0_sp0.9 from training. Duration: 0.97775 2023-10-03 03:36:37,666 WARNING [train.py:1204] (2/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_271-269084-0_sp1.1 from training. Duration: 0.7545625 2023-10-03 03:36:39,578 WARNING [train.py:1204] (2/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_345-167986-0_sp1.1 from training. Duration: 0.763625 2023-10-03 03:36:42,354 WARNING [train.py:1204] (2/4) Exclude cut with ID _39290_210_4220_1_1533862778760_3075118_92-311600-0_sp1.1 from training. Duration: 0.8 2023-10-03 03:36:42,360 WARNING [train.py:1204] (2/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_209-65306-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 03:36:43,843 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_26433_210_12108_1_1532157158_4056480_317-335807-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 03:36:48,288 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.bypass_mid.scale_min, batch_count=1123973.3333333333, ans=0.2 2023-10-03 03:36:48,309 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=1123973.3333333333, ans=0.1 2023-10-03 03:36:49,360 WARNING [train.py:1204] (2/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_238-94127-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 03:36:49,410 WARNING [train.py:1204] (2/4) Exclude cut with ID _41314_210_12651_1_1533459476543_3766460_207-132038-0_sp1.1 from training. Duration: 0.8545625 2023-10-03 03:36:56,070 WARNING [train.py:1204] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_167-137255-0_sp1.1 from training. Duration: 0.5818125 2023-10-03 03:36:57,528 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_2009_210_2071_1_1525481134_4588937_385-302100-0_sp0.9 from training. Duration: 0.9666875 2023-10-03 03:36:57,719 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.self_attn_weights.pos_emb_skip_rate, batch_count=1124040.0, ans=0.0 2023-10-03 03:37:01,932 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.nonlin_attention.balancer.max_positive, batch_count=1124040.0, ans=0.95 2023-10-03 03:37:07,285 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_15517_210_6753_1_1530954008_3487336_253-110617-0_sp1.1 from training. Duration: 0.936375 2023-10-03 03:37:10,640 WARNING [train.py:1204] (2/4) Exclude cut with ID _39290_210_4220_1_1533862778760_3075118_92-311600-0_sp0.9 from training. Duration: 0.97775 2023-10-03 03:37:10,687 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_42-142473-0 from training. Duration: 0.72 2023-10-03 03:37:10,713 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_2296_210_2087_1_1525503448_3397335_57-185430-0 from training. Duration: 0.6 2023-10-03 03:37:10,727 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_11305_210_1794_1_1529750428_7178148_257-312870-0_sp0.9 from training. Duration: 0.97775 2023-10-03 03:37:12,164 WARNING [train.py:1204] (2/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_222-198127-0 from training. Duration: 0.84 2023-10-03 03:37:13,537 WARNING [train.py:1204] (2/4) Exclude cut with ID _65263_210_22356_1_1535763598691_3921037_423-202699-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 03:37:14,837 WARNING [train.py:1204] (2/4) Exclude cut with ID _15037_210_2253_1_1530752294104_3267650_13-130361-0 from training. Duration: 0.99 2023-10-03 03:37:20,510 WARNING [train.py:1204] (2/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_628-198080-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 03:37:21,743 INFO [train.py:1046] (2/4) Epoch 32, batch 3950, loss[loss=0.1675, simple_loss=0.2575, pruned_loss=0.03873, over 24557.00 frames. ], tot_loss[loss=0.1616, simple_loss=0.2394, pruned_loss=0.04187, over 4709226.96 frames. ], batch size: 71, lr: 3.18e-03, grad_scale: 8.0 2023-10-03 03:37:22,473 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_3171_210_2087_1_1526109084_2330007_37-64334-0 from training. Duration: 0.82 2023-10-03 03:37:22,518 WARNING [train.py:1204] (2/4) Exclude cut with ID _5287_210_2084_1_1527418883040_3878670_12-221186-0_sp1.1 from training. Duration: 0.8909375 2023-10-03 03:37:25,200 WARNING [train.py:1204] (2/4) Exclude cut with ID _6653_210_2629_1_1527919172441_6732679_1103-56445-0_sp1.1 from training. Duration: 0.663625 2023-10-03 03:37:26,665 WARNING [train.py:1204] (2/4) Exclude cut with ID _65263_210_22356_1_1535763598691_3921037_169-140068-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 03:37:28,013 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.596e+02 1.850e+02 2.026e+02 2.281e+02 3.100e+02, threshold=4.052e+02, percent-clipped=0.0 2023-10-03 03:37:30,888 WARNING [train.py:1204] (2/4) Exclude cut with ID IC0671W0118-331961 from training. Duration: 0.896 2023-10-03 03:37:30,942 WARNING [train.py:1204] (2/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_402-102311-0_sp1.1 from training. Duration: 0.6090625 2023-10-03 03:37:32,294 WARNING [train.py:1204] (2/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_202-293523-0 from training. Duration: 0.72 2023-10-03 03:37:32,349 WARNING [train.py:1204] (2/4) Exclude cut with ID IC0679W0038-335846 from training. Duration: 0.768 2023-10-03 03:37:32,383 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_37357_210_17024_1_1533089504_2316121_79-198866-0_sp1.1 from training. Duration: 0.936375 2023-10-03 03:37:36,414 WARNING [train.py:1204] (2/4) Exclude cut with ID _7606_210_4915_1_1528894253277_5080690_292-271285-0_sp1.1 from training. Duration: 0.963625 2023-10-03 03:37:36,425 WARNING [train.py:1204] (2/4) Exclude cut with ID _3541_210_2477_1_1526475408709_3263973_377-296839-0_sp0.9 from training. Duration: 0.611125 2023-10-03 03:37:36,434 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_45886_210_17024_1_1533704917_2435333_58-131069-0_sp1.1 from training. Duration: 0.963625 2023-10-03 03:37:37,831 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6961_210_2087_1_1527937168_6815011_395-185060-0 from training. Duration: 0.95 2023-10-03 03:37:38,571 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.0.layers.1.self_attn1.whiten, num_groups=1, num_channels=192, metric=10.66 vs. limit=22.5 2023-10-03 03:37:40,941 WARNING [train.py:1204] (2/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_304-266479-0_sp1.1 from training. Duration: 0.97275 2023-10-03 03:37:41,002 WARNING [train.py:1204] (2/4) Exclude cut with ID _3867_210_1883_1_1526641200575_4475830_319-349850-0_sp1.1 from training. Duration: 0.6090625 2023-10-03 03:37:41,011 WARNING [train.py:1204] (2/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_304-59227-0_sp0.9 from training. Duration: 0.8555625 2023-10-03 03:37:42,307 WARNING [train.py:1204] (2/4) Exclude cut with ID _8021_210_7994_1_1528376451358_3653069_100-140231-0_sp0.9 from training. Duration: 0.7444375 2023-10-03 03:37:42,558 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.ff3_skip_rate, batch_count=1124240.0, ans=0.0 2023-10-03 03:37:43,683 WARNING [train.py:1204] (2/4) Exclude cut with ID _8833_210_5219_1_1528592665210_7553059_329-125156-0_sp0.9 from training. Duration: 0.911125 2023-10-03 03:37:56,345 WARNING [train.py:1204] (2/4) Exclude cut with ID _14459_210_2228_1_1531443641946_7516700_792-287234-0_sp1.1 from training. Duration: 0.92725 2023-10-03 03:37:56,361 WARNING [train.py:1204] (2/4) Exclude cut with ID _19708_210_9206_1_1532136650733_4238364_347-65240-0_sp1.1 from training. Duration: 0.8818125 2023-10-03 03:38:01,917 WARNING [train.py:1204] (2/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_70-27732-0 from training. Duration: 0.94 2023-10-03 03:38:06,906 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.2.encoder.layers.1.feed_forward3.out_whiten, num_groups=1, num_channels=384, metric=7.00 vs. limit=15.0 2023-10-03 03:38:07,475 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_397-132565-0 from training. Duration: 0.99 2023-10-03 03:38:07,478 WARNING [train.py:1204] (2/4) Exclude cut with ID _49728_210_12651_1_1534041034294_6788150_624-195301-0 from training. Duration: 0.85 2023-10-03 03:38:07,520 WARNING [train.py:1204] (2/4) Exclude cut with ID _55429_210_10626_1_1534467588527_7174143_377-169391-0_sp1.1 from training. Duration: 0.87275 2023-10-03 03:38:07,724 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.ff3_skip_rate, batch_count=1124373.3333333333, ans=0.0 2023-10-03 03:38:08,885 WARNING [train.py:1204] (2/4) Exclude cut with ID _49376_210_5986_1_1533985278732_3370740_354-344695-0_sp1.1 from training. Duration: 0.9 2023-10-03 03:38:13,719 WARNING [train.py:1204] (2/4) Exclude cut with ID _4697_210_2069_1_1526815801953_3823098_276-96593-0_sp1.1 from training. Duration: 0.7 2023-10-03 03:38:15,030 WARNING [train.py:1204] (2/4) Exclude cut with ID _15012_210_1968_1_1530856820429_5095350_688-178849-0_sp0.9 from training. Duration: 0.711125 2023-10-03 03:38:15,067 WARNING [train.py:1204] (2/4) Exclude cut with ID _25565_210_12392_1_1532423009382_3952098_372-69023-0_sp1.1 from training. Duration: 0.936375 2023-10-03 03:38:15,101 WARNING [train.py:1204] (2/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_61-327062-0_sp1.1 from training. Duration: 0.736375 2023-10-03 03:38:15,145 WARNING [train.py:1204] (2/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_215-141059-0 from training. Duration: 0.62 2023-10-03 03:38:19,302 WARNING [train.py:1204] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_6-226750-0_sp1.1 from training. Duration: 0.8545625 2023-10-03 03:38:21,053 WARNING [train.py:1204] (2/4) Exclude cut with ID _14348_210_8341_1_1530599434996_3778640_240-232370-0_sp1.1 from training. Duration: 0.763625 2023-10-03 03:38:26,124 WARNING [train.py:1204] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_468-189058-0 from training. Duration: 0.96 2023-10-03 03:38:34,531 WARNING [train.py:1204] (2/4) Exclude cut with ID _21987_210_5854_1_1531821500482_3637038_247-93340-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 03:38:35,757 INFO [train.py:1046] (2/4) Epoch 32, batch 4000, loss[loss=0.1772, simple_loss=0.243, pruned_loss=0.05572, over 23758.00 frames. ], tot_loss[loss=0.1617, simple_loss=0.2399, pruned_loss=0.04175, over 4711812.47 frames. ], batch size: 212, lr: 3.18e-03, grad_scale: 16.0 2023-10-03 03:38:36,008 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.0.conv_module2.balancer1.prob, batch_count=1124506.6666666667, ans=0.125 2023-10-03 03:38:38,690 WARNING [train.py:1204] (2/4) Exclude cut with ID _66868_210_9527_1_1535509287954_4561140_436-191602-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 03:38:44,685 WARNING [train.py:1204] (2/4) Exclude cut with ID _18895_210_6228_1_1531357274234_3784930_394-58203-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 03:38:44,737 WARNING [train.py:1204] (2/4) Exclude cut with ID _58605_210_12210_1_1534723195939_3904259_258-281498-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 03:38:44,762 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_33348_210_5632_1_1533086912_3324030_245-292752-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 03:38:45,938 WARNING [train.py:1204] (2/4) Exclude cut with ID _67831_210_12392_1_1535612464202_3092612_346-2148-0 from training. Duration: 0.93 2023-10-03 03:38:46,002 WARNING [train.py:1204] (2/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_369-222060-0_sp0.9 from training. Duration: 0.788875 2023-10-03 03:38:47,303 WARNING [train.py:1204] (2/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_245-174553-0 from training. Duration: 0.88 2023-10-03 03:38:47,308 WARNING [train.py:1204] (2/4) Exclude cut with ID _3541_210_2477_1_1526475408709_3263973_231-104736-0_sp1.1 from training. Duration: 0.7090625 2023-10-03 03:38:47,320 WARNING [train.py:1204] (2/4) Exclude cut with ID _44585_210_18628_1_1533605350387_4518199_280-73534-0 from training. Duration: 0.89 2023-10-03 03:38:50,600 WARNING [train.py:1204] (2/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_22-201476-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 03:38:55,385 WARNING [train.py:1204] (2/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_325-62802-0_sp0.9 from training. Duration: 0.9666875 2023-10-03 03:38:55,398 WARNING [train.py:1204] (2/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_199-274062-0_sp1.1 from training. Duration: 0.97275 2023-10-03 03:38:55,400 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_8-319957-0_sp1.1 from training. Duration: 0.87275 2023-10-03 03:38:55,422 WARNING [train.py:1204] (2/4) Exclude cut with ID _29343_210_4381_1_1532602757752_3674109_224-349726-0_sp1.1 from training. Duration: 0.936375 2023-10-03 03:38:55,427 WARNING [train.py:1204] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_520-126729-0_sp1.1 from training. Duration: 0.636375 2023-10-03 03:38:55,552 WARNING [train.py:1204] (2/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_239-22076-0_sp1.1 from training. Duration: 0.77275 2023-10-03 03:38:56,943 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9012W0011-501146 from training. Duration: 0.896 2023-10-03 03:38:58,348 WARNING [train.py:1204] (2/4) Exclude cut with ID _29857_210_20034_1_1532395886753_3798160_279-185801-0_sp1.1 from training. Duration: 0.82725 2023-10-03 03:38:58,394 WARNING [train.py:1204] (2/4) Exclude cut with ID _43552_210_8913_1_1533726031059_3523429_261-223933-0_sp1.1 from training. Duration: 0.92725 2023-10-03 03:39:02,488 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9029W0025-509426 from training. Duration: 0.896 2023-10-03 03:39:02,570 WARNING [train.py:1204] (2/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_337-181502-0_sp1.1 from training. Duration: 0.5909375 2023-10-03 03:39:02,583 WARNING [train.py:1204] (2/4) Exclude cut with ID _68635_210_9317_1_1535770813410_4315870_159-332599-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 03:39:07,991 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_161-276996-0 from training. Duration: 0.95 2023-10-03 03:39:08,043 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7088_210_1874_1_1527939776_1101113_127-68870-0_sp1.1 from training. Duration: 0.936375 2023-10-03 03:39:10,821 WARNING [train.py:1204] (2/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_322-217220-0_sp1.1 from training. Duration: 0.7818125 2023-10-03 03:39:12,347 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9072W0002-530636 from training. Duration: 0.768 2023-10-03 03:39:14,097 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_340-336408-0_sp1.1 from training. Duration: 0.8454375 2023-10-03 03:39:14,249 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.balancer2.prob, batch_count=1124640.0, ans=0.125 2023-10-03 03:39:15,384 WARNING [train.py:1204] (2/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_395-37222-0 from training. Duration: 0.76 2023-10-03 03:39:15,387 WARNING [train.py:1204] (2/4) Exclude cut with ID _64099_210_17301_1_1535526977656_1578188_159-209156-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 03:39:16,807 WARNING [train.py:1204] (2/4) Exclude cut with ID _53950_210_5816_1_1534417264919_3862951_28-29681-0_sp1.1 from training. Duration: 0.92725 2023-10-03 03:39:16,959 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.conv_skip_rate, batch_count=1124640.0, ans=0.0 2023-10-03 03:39:18,139 WARNING [train.py:1204] (2/4) Exclude cut with ID _4681_210_3525_1_1527250689464_120889_15-126760-0_sp1.1 from training. Duration: 0.736375 2023-10-03 03:39:19,502 WARNING [train.py:1204] (2/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_242-3472-0_sp1.1 from training. Duration: 0.663625 2023-10-03 03:39:19,544 WARNING [train.py:1204] (2/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_436-186311-0_sp0.9 from training. Duration: 0.72225 2023-10-03 03:39:21,249 WARNING [train.py:1204] (2/4) Exclude cut with ID _3675_210_4557_1_1526639934408_4182089_452-172147-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 03:39:22,672 WARNING [train.py:1204] (2/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_80-147301-0 from training. Duration: 0.84 2023-10-03 03:39:22,710 WARNING [train.py:1204] (2/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_351-107588-0_sp1.1 from training. Duration: 0.92725 2023-10-03 03:39:24,568 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9111W0002-550781 from training. Duration: 0.896 2023-10-03 03:39:27,040 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.3.encoder.layers.1.conv_module2.whiten, num_groups=1, num_channels=512, metric=2.98 vs. limit=15.0 2023-10-03 03:39:29,123 WARNING [train.py:1204] (2/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_465-156500-0_sp1.1 from training. Duration: 0.6545625 2023-10-03 03:39:33,157 WARNING [train.py:1204] (2/4) Exclude cut with ID _13938_210_9988_1_1530518718978_3693960_304-112784-0_sp1.1 from training. Duration: 0.4181875 2023-10-03 03:39:34,553 WARNING [train.py:1204] (2/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_202-67017-0_sp1.1 from training. Duration: 0.7454375 2023-10-03 03:39:34,605 WARNING [train.py:1204] (2/4) Exclude cut with ID _58414_210_8254_1_1535421609162_7151182_448-4358-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 03:39:36,030 WARNING [train.py:1204] (2/4) Exclude cut with ID _66840_210_5219_1_1535452168426_4196349_203-243234-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 03:39:37,396 WARNING [train.py:1204] (2/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_201-213513-0_sp1.1 from training. Duration: 0.97275 2023-10-03 03:39:41,578 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6291_210_3273_1_1527683649_1078498_117-187560-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 03:39:44,134 WARNING [train.py:1204] (2/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_135-174080-0_sp0.9 from training. Duration: 0.588875 2023-10-03 03:39:45,488 WARNING [train.py:1204] (2/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_45-265684-0 from training. Duration: 0.72 2023-10-03 03:39:46,977 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_435-313305-0_sp1.1 from training. Duration: 0.6454375 2023-10-03 03:39:46,998 WARNING [train.py:1204] (2/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_235-307337-0_sp1.1 from training. Duration: 0.8545625 2023-10-03 03:39:47,081 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7762_210_1794_1_1528199508_4057408_419-125009-0_sp0.9 from training. Duration: 0.92225 2023-10-03 03:39:49,022 INFO [train.py:1046] (2/4) Epoch 32, batch 4050, loss[loss=0.1468, simple_loss=0.2274, pruned_loss=0.03304, over 24312.00 frames. ], tot_loss[loss=0.1621, simple_loss=0.2405, pruned_loss=0.04185, over 4704379.88 frames. ], batch size: 61, lr: 3.18e-03, grad_scale: 16.0 2023-10-03 03:39:49,114 WARNING [train.py:1204] (2/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_215-141059-0_sp1.1 from training. Duration: 0.563625 2023-10-03 03:39:50,451 WARNING [train.py:1204] (2/4) Exclude cut with ID _27013_210_1389_1_1532437173240_3252649_222-125573-0_sp1.1 from training. Duration: 0.8909375 2023-10-03 03:39:53,910 WARNING [train.py:1204] (2/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_126-274694-0_sp1.1 from training. Duration: 0.8909375 2023-10-03 03:39:55,673 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.536e+02 1.778e+02 1.970e+02 2.195e+02 3.325e+02, threshold=3.940e+02, percent-clipped=0.0 2023-10-03 03:39:55,996 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.feed_forward2.hidden_balancer.prob, batch_count=1124840.0, ans=0.125 2023-10-03 03:39:57,257 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_2592_210_2290_1_1525697025_4882925_157-70512-0_sp1.1 from training. Duration: 0.82725 2023-10-03 03:39:58,573 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_292-265928-0_sp0.9 from training. Duration: 0.5666875 2023-10-03 03:40:01,344 WARNING [train.py:1204] (2/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_1211-226066-0_sp1.1 from training. Duration: 0.6909375 2023-10-03 03:40:01,385 WARNING [train.py:1204] (2/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_394-265747-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 03:40:04,619 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.5.encoder.layers.0.feed_forward3.out_whiten, num_groups=1, num_channels=256, metric=8.91 vs. limit=15.0 2023-10-03 03:40:05,522 WARNING [train.py:1204] (2/4) Exclude cut with ID _48782_210_12658_1_1534467701544_2300422_239-67139-0_sp1.1 from training. Duration: 0.97275 2023-10-03 03:40:06,948 WARNING [train.py:1204] (2/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_329-113173-0_sp1.1 from training. Duration: 0.563625 2023-10-03 03:40:08,418 WARNING [train.py:1204] (2/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_81-75849-0_sp0.9 from training. Duration: 0.4333125 2023-10-03 03:40:11,244 WARNING [train.py:1204] (2/4) Exclude cut with ID _9235_210_5219_1_1528891154253_8046120_1003-101638-0 from training. Duration: 0.73 2023-10-03 03:40:11,274 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9309W0189-640316 from training. Duration: 0.64 2023-10-03 03:40:13,800 WARNING [train.py:1204] (2/4) Exclude cut with ID _45329_210_7005_1_1534157951211_6774339_503-287883-0_sp1.1 from training. Duration: 0.8 2023-10-03 03:40:21,236 WARNING [train.py:1204] (2/4) Exclude cut with ID _8833_210_5219_1_1528592665210_7553059_611-80313-0 from training. Duration: 0.64 2023-10-03 03:40:23,119 WARNING [train.py:1204] (2/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_335-106626-0_sp1.1 from training. Duration: 0.963625 2023-10-03 03:40:25,936 WARNING [train.py:1204] (2/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_237-44332-0_sp1.1 from training. Duration: 0.8545625 2023-10-03 03:40:29,284 WARNING [train.py:1204] (2/4) Exclude cut with ID _46952_210_5986_1_1534230010357_3107379_178-147341-0_sp1.1 from training. Duration: 0.97275 2023-10-03 03:40:29,499 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=1124973.3333333333, ans=0.1 2023-10-03 03:40:31,315 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_13091_210_7925_1_1530609447_8987354_946-154426-0_sp1.1 from training. Duration: 0.936375 2023-10-03 03:40:31,330 WARNING [train.py:1204] (2/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_326-17061-0_sp1.1 from training. Duration: 0.8545625 2023-10-03 03:40:35,351 WARNING [train.py:1204] (2/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_38-169920-0_sp1.1 from training. Duration: 0.82725 2023-10-03 03:40:38,164 WARNING [train.py:1204] (2/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_283-220372-0 from training. Duration: 0.56 2023-10-03 03:40:38,180 WARNING [train.py:1204] (2/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_252-216674-0_sp1.1 from training. Duration: 0.4909375 2023-10-03 03:40:39,529 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_202-91290-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 03:40:40,972 WARNING [train.py:1204] (2/4) Exclude cut with ID _41314_210_12651_1_1533459476543_3766460_80-74184-0 from training. Duration: 0.83 2023-10-03 03:40:43,774 WARNING [train.py:1204] (2/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_411-271358-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 03:40:49,414 WARNING [train.py:1204] (2/4) Exclude cut with ID _68777_210_4353_1_1535810390731_3117904_129-166012-0 from training. Duration: 0.85 2023-10-03 03:40:52,559 WARNING [train.py:1204] (2/4) Exclude cut with ID _44982_210_1511_1_1534312820108_3726029_410-109759-0_sp1.1 from training. Duration: 0.963625 2023-10-03 03:40:52,570 WARNING [train.py:1204] (2/4) Exclude cut with ID _14224_210_4381_1_1530529062037_4264010_364-22817-0_sp1.1 from training. Duration: 0.6454375 2023-10-03 03:40:53,953 WARNING [train.py:1204] (2/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_89-242335-0 from training. Duration: 0.95 2023-10-03 03:40:53,963 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_12868_210_6753_1_1530702479_3610564_151-325626-0 from training. Duration: 0.97 2023-10-03 03:40:53,964 WARNING [train.py:1204] (2/4) Exclude cut with ID _6275_210_2084_1_1528110062213_7669049_786-264332-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 03:40:57,086 WARNING [train.py:1204] (2/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_35-153886-0_sp1.1 from training. Duration: 0.763625 2023-10-03 03:40:57,810 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_24007_210_1794_1_1531918925_1663875_116-100916-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 03:40:58,903 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_162-262717-0_sp1.1 from training. Duration: 0.7545625 2023-10-03 03:41:03,125 INFO [train.py:1046] (2/4) Epoch 32, batch 4100, loss[loss=0.1603, simple_loss=0.251, pruned_loss=0.03481, over 24611.00 frames. ], tot_loss[loss=0.1635, simple_loss=0.2422, pruned_loss=0.04243, over 4704068.93 frames. ], batch size: 68, lr: 3.18e-03, grad_scale: 16.0 2023-10-03 03:41:05,343 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.feed_forward3.hidden_balancer.prob, batch_count=1125173.3333333333, ans=0.125 2023-10-03 03:41:06,482 WARNING [train.py:1204] (2/4) Exclude cut with ID _8263_210_1664_1_1528438953494_294100_40-204731-0 from training. Duration: 0.5 2023-10-03 03:41:07,882 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_12868_210_6753_1_1530702479_3610564_416-293746-0 from training. Duration: 0.9 2023-10-03 03:41:08,468 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.4.encoder.layers.0.self_attn1.whiten, num_groups=1, num_channels=384, metric=13.71 vs. limit=22.5 2023-10-03 03:41:09,269 WARNING [train.py:1204] (2/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_605-219732-0 from training. Duration: 0.94 2023-10-03 03:41:10,621 WARNING [train.py:1204] (2/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_454-123002-0 from training. Duration: 0.69 2023-10-03 03:41:10,626 WARNING [train.py:1204] (2/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_25-304273-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 03:41:11,923 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6860_210_4852_1_1527917457_3833327_451-39702-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 03:41:11,962 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_304-16505-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 03:41:11,982 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6291_210_3273_1_1527683649_1078498_4-266348-0_sp1.1 from training. Duration: 0.7090625 2023-10-03 03:41:12,925 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.0.layers.1.self_attn1.whiten, num_groups=1, num_channels=192, metric=12.92 vs. limit=22.5 2023-10-03 03:41:14,655 WARNING [train.py:1204] (2/4) Exclude cut with ID ID0149W0001-753701 from training. Duration: 0.989 2023-10-03 03:41:17,393 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_41251_210_15368_1_1533429767_4820695_394-55393-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 03:41:17,475 WARNING [train.py:1204] (2/4) Exclude cut with ID _8793_210_6310_1_1528542985139_2310009_121-179782-0_sp1.1 from training. Duration: 0.72725 2023-10-03 03:41:17,489 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_2729_210_3528_1_1525863505_3882214_421-73047-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 03:41:18,859 WARNING [train.py:1204] (2/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_278-154492-0_sp0.9 from training. Duration: 0.8444375 2023-10-03 03:41:24,630 WARNING [train.py:1204] (2/4) Exclude cut with ID _14170_210_5219_1_1530525949868_4469650_412-317077-0_sp0.9 from training. Duration: 0.8333125 2023-10-03 03:41:24,713 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_727-138317-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 03:41:24,747 WARNING [train.py:1204] (2/4) Exclude cut with ID _25808_210_13292_1_1532343514562_7366109_345-335048-0_sp1.1 from training. Duration: 0.92725 2023-10-03 03:41:24,769 WARNING [train.py:1204] (2/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_506-23200-0 from training. Duration: 0.97 2023-10-03 03:41:27,587 WARNING [train.py:1204] (2/4) Exclude cut with ID _44718_210_5816_1_1534050051869_6469030_643-245187-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 03:41:27,591 WARNING [train.py:1204] (2/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_42-186162-0_sp1.1 from training. Duration: 0.836375 2023-10-03 03:41:27,611 WARNING [train.py:1204] (2/4) Exclude cut with ID _3231_210_2106_1_1526205864934_6784220_171-256004-0_sp1.1 from training. Duration: 0.7818125 2023-10-03 03:41:27,641 WARNING [train.py:1204] (2/4) Exclude cut with ID _25914_210_6400_1_1532141317058_644740_43-248478-0_sp1.1 from training. Duration: 0.936375 2023-10-03 03:41:27,686 WARNING [train.py:1204] (2/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_577-287150-0 from training. Duration: 0.82 2023-10-03 03:41:27,923 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.ff3_skip_rate, batch_count=1125240.0, ans=0.0 2023-10-03 03:41:31,612 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_46015_210_7234_1_1533863319_3087837_376-276068-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 03:41:31,713 WARNING [train.py:1204] (2/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_51-200220-0 from training. Duration: 0.56 2023-10-03 03:41:33,117 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_452-96101-0_sp1.1 from training. Duration: 0.963625 2023-10-03 03:41:36,383 WARNING [train.py:1204] (2/4) Exclude cut with ID _13252_210_1925_1_1530671372047_3336909_263-168388-0_sp1.1 from training. Duration: 0.7818125 2023-10-03 03:41:36,384 WARNING [train.py:1204] (2/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_469-94673-0 from training. Duration: 0.79 2023-10-03 03:41:37,813 WARNING [train.py:1204] (2/4) Exclude cut with ID _14922_210_6228_1_1530707435273_3693310_303-228738-0_sp1.1 from training. Duration: 0.763625 2023-10-03 03:41:37,859 WARNING [train.py:1204] (2/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_262-250826-0_sp1.1 from training. Duration: 0.9 2023-10-03 03:41:37,888 WARNING [train.py:1204] (2/4) Exclude cut with ID _68777_210_4353_1_1535810390731_3117904_129-166012-0_sp1.1 from training. Duration: 0.77275 2023-10-03 03:41:40,923 WARNING [train.py:1204] (2/4) Exclude cut with ID _58895_210_8750_1_1535184081320_3649089_265-323810-0 from training. Duration: 0.96 2023-10-03 03:41:41,017 WARNING [train.py:1204] (2/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_120-76843-0_sp0.9 from training. Duration: 0.611125 2023-10-03 03:41:42,366 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7544_210_6753_1_1528587722_3576686_295-258479-0_sp0.9 from training. Duration: 0.9333125 2023-10-03 03:41:42,682 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=1125306.6666666667, ans=0.1 2023-10-03 03:41:43,836 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7046_210_6753_1_1527895337_6920917_783-278109-0 from training. Duration: 0.77 2023-10-03 03:41:45,127 WARNING [train.py:1204] (2/4) Exclude cut with ID _42326_210_7770_1_1533814282531_3707269_113-308323-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 03:41:45,167 WARNING [train.py:1204] (2/4) Exclude cut with ID _7852_210_5219_1_1528286471316_8139400_242-105336-0_sp0.9 from training. Duration: 0.8 2023-10-03 03:41:47,838 WARNING [train.py:1204] (2/4) Exclude cut with ID _56235_210_10640_1_1534557335235_4161099_114-277905-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 03:41:50,806 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_13118_210_7925_1_1530755612_7608297_42-66683-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 03:41:52,293 WARNING [train.py:1204] (2/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_901-79438-0_sp1.1 from training. Duration: 0.8545625 2023-10-03 03:41:54,116 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_30280_210_1881_1_1532872779_3660139_211-182194-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 03:41:56,383 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.5.encoder.layers.1.feed_forward1.out_whiten, num_groups=1, num_channels=256, metric=7.96 vs. limit=15.0 2023-10-03 03:42:00,730 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.self_attn_weights.pos_emb_skip_rate, batch_count=1125373.3333333333, ans=0.0 2023-10-03 03:42:03,785 WARNING [train.py:1204] (2/4) Exclude cut with ID _14898_210_9206_1_1530766812377_4177190_621-45732-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 03:42:03,793 WARNING [train.py:1204] (2/4) Exclude cut with ID _35350_210_16040_1_1533040191183_4075299_224-133775-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 03:42:06,855 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.feed_forward1.out_proj.dropout_p, batch_count=1125440.0, ans=0.1 2023-10-03 03:42:09,296 WARNING [train.py:1204] (2/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_147-293311-0_sp1.1 from training. Duration: 0.8545625 2023-10-03 03:42:10,722 WARNING [train.py:1204] (2/4) Exclude cut with ID _12302_210_5219_1_1529924446466_8107280_835-279754-0_sp1.1 from training. Duration: 0.8181875 2023-10-03 03:42:16,097 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8453_210_4211_1_1528502940_8562730_347-330673-0_sp0.9 from training. Duration: 0.8 2023-10-03 03:42:16,196 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_213-103282-0_sp0.9 from training. Duration: 0.8666875 2023-10-03 03:42:17,478 INFO [train.py:1046] (2/4) Epoch 32, batch 4150, loss[loss=0.167, simple_loss=0.2224, pruned_loss=0.05585, over 19623.00 frames. ], tot_loss[loss=0.1635, simple_loss=0.242, pruned_loss=0.04251, over 4697787.00 frames. ], batch size: 388, lr: 3.18e-03, grad_scale: 16.0 2023-10-03 03:42:17,566 WARNING [train.py:1204] (2/4) Exclude cut with ID _49728_210_12651_1_1534041034294_6788150_66-43496-0_sp1.1 from training. Duration: 0.97275 2023-10-03 03:42:17,568 WARNING [train.py:1204] (2/4) Exclude cut with ID _19023_210_4353_1_1531378892552_5820780_28-36615-0_sp1.1 from training. Duration: 0.8818125 2023-10-03 03:42:20,351 WARNING [train.py:1204] (2/4) Exclude cut with ID _14349_210_8341_1_1530615710818_3442470_115-285975-0 from training. Duration: 0.86 2023-10-03 03:42:20,379 WARNING [train.py:1204] (2/4) Exclude cut with ID _12734_210_8341_1_1530183614296_3891929_236-47464-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 03:42:20,439 WARNING [train.py:1204] (2/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_177-236809-0 from training. Duration: 0.49 2023-10-03 03:42:21,905 WARNING [train.py:1204] (2/4) Exclude cut with ID _13678_210_7497_1_1530530069226_3463170_371-3221-0 from training. Duration: 0.9 2023-10-03 03:42:21,927 WARNING [train.py:1204] (2/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_48-338976-0 from training. Duration: 0.89 2023-10-03 03:42:23,094 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.523e+02 1.897e+02 2.094e+02 2.297e+02 3.189e+02, threshold=4.189e+02, percent-clipped=0.0 2023-10-03 03:42:23,257 WARNING [train.py:1204] (2/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_1167-309744-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 03:42:27,864 WARNING [train.py:1204] (2/4) Exclude cut with ID _30327_210_16049_1_1532484161295_3924169_401-70819-0_sp0.9 from training. Duration: 0.9555625 2023-10-03 03:42:27,877 WARNING [train.py:1204] (2/4) Exclude cut with ID _55134_210_21982_1_1534590028377_3882170_346-91022-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 03:42:31,861 WARNING [train.py:1204] (2/4) Exclude cut with ID _14459_210_2228_1_1531443641946_7516700_174-118801-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 03:42:33,010 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_269-278006-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 03:42:34,881 WARNING [train.py:1204] (2/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_390-339137-0_sp0.9 from training. Duration: 0.62225 2023-10-03 03:42:36,316 WARNING [train.py:1204] (2/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_253-308377-0_sp1.1 from training. Duration: 0.4454375 2023-10-03 03:42:36,327 WARNING [train.py:1204] (2/4) Exclude cut with ID _23825_210_13449_1_1531915313526_3497150_240-271367-0_sp1.1 from training. Duration: 0.8818125 2023-10-03 03:42:37,624 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1015-343884-0_sp0.9 from training. Duration: 0.688875 2023-10-03 03:42:40,491 WARNING [train.py:1204] (2/4) Exclude cut with ID _23514_210_1389_1_1531832139785_3031439_335-48210-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 03:42:40,756 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.conv_module2.balancer1.prob, batch_count=1125573.3333333333, ans=0.125 2023-10-03 03:42:43,419 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_309-65177-0_sp1.1 from training. Duration: 0.8 2023-10-03 03:42:43,592 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.conv_module1.balancer1.max_abs, batch_count=1125573.3333333333, ans=10.0 2023-10-03 03:42:44,818 WARNING [train.py:1204] (2/4) Exclude cut with ID _14368_210_4220_1_1530532714870_3595529_231-137892-0 from training. Duration: 0.99 2023-10-03 03:42:47,603 WARNING [train.py:1204] (2/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_57-270037-0 from training. Duration: 0.77 2023-10-03 03:42:47,610 WARNING [train.py:1204] (2/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_465-214726-0_sp1.1 from training. Duration: 0.7818125 2023-10-03 03:42:49,058 WARNING [train.py:1204] (2/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_106-300985-0 from training. Duration: 0.9 2023-10-03 03:42:49,063 WARNING [train.py:1204] (2/4) Exclude cut with ID _14632_210_9988_1_1530691279392_3923200_161-129530-0_sp1.1 from training. Duration: 0.87275 2023-10-03 03:42:49,076 WARNING [train.py:1204] (2/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_514-88370-0_sp1.1 from training. Duration: 0.963625 2023-10-03 03:42:50,628 WARNING [train.py:1204] (2/4) Exclude cut with ID _56495_210_4234_1_1534748080334_4136219_305-334505-0_sp1.1 from training. Duration: 0.92725 2023-10-03 03:42:51,980 WARNING [train.py:1204] (2/4) Exclude cut with ID _42875_210_14856_1_1533704397487_4012433_61-264914-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 03:42:55,251 WARNING [train.py:1204] (2/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_91-148831-0 from training. Duration: 0.99 2023-10-03 03:42:55,869 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.4.encoder.layers.1.conv_module1.whiten, num_groups=1, num_channels=384, metric=3.29 vs. limit=15.0 2023-10-03 03:42:58,687 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_435-313305-0_sp0.9 from training. Duration: 0.788875 2023-10-03 03:43:00,076 WARNING [train.py:1204] (2/4) Exclude cut with ID _16744_210_5986_1_1531724405384_3907567_242-111316-0_sp1.1 from training. Duration: 0.8454375 2023-10-03 03:43:02,094 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_257-20733-0 from training. Duration: 0.76 2023-10-03 03:43:02,158 WARNING [train.py:1204] (2/4) Exclude cut with ID _8064_210_5399_1_1528372299291_4282973_4-91037-0_sp1.1 from training. Duration: 0.8 2023-10-03 03:43:02,469 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.feed_forward3.hidden_balancer.prob, batch_count=1125706.6666666667, ans=0.125 2023-10-03 03:43:04,877 WARNING [train.py:1204] (2/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_3-276701-0 from training. Duration: 0.91 2023-10-03 03:43:06,262 WARNING [train.py:1204] (2/4) Exclude cut with ID _3992_210_1747_1_1526644816031_3731060_36-147855-0_sp1.1 from training. Duration: 0.7090625 2023-10-03 03:43:07,772 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=1125706.6666666667, ans=0.1 2023-10-03 03:43:08,878 WARNING [train.py:1204] (2/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_1296-298671-0_sp1.1 from training. Duration: 0.963625 2023-10-03 03:43:10,260 WARNING [train.py:1204] (2/4) Exclude cut with ID _68965_210_1883_1_1535698962272_3497490_309-334593-0_sp1.1 from training. Duration: 0.92725 2023-10-03 03:43:11,648 WARNING [train.py:1204] (2/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_128-296581-0 from training. Duration: 0.74 2023-10-03 03:43:11,648 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_17613_210_2151_1_1531137569_2366160_170-299079-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 03:43:11,650 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6287_210_3273_1_1527509506_2653716_182-199887-0_sp1.1 from training. Duration: 0.463625 2023-10-03 03:43:11,797 WARNING [train.py:1204] (2/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_80-135200-0_sp1.1 from training. Duration: 0.7181875 2023-10-03 03:43:11,948 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.ff2_skip_rate, batch_count=1125706.6666666667, ans=0.0 2023-10-03 03:43:14,574 WARNING [train.py:1204] (2/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_34-183685-0 from training. Duration: 0.98 2023-10-03 03:43:14,589 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8278_210_2550_1_1528637099_4220413_418-210155-0_sp1.1 from training. Duration: 0.92725 2023-10-03 03:43:14,592 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8853_210_3273_1_1528633899_818968_51-187685-0_sp0.9 from training. Duration: 0.7666875 2023-10-03 03:43:14,606 WARNING [train.py:1204] (2/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_332-221057-0_sp1.1 from training. Duration: 0.6181875 2023-10-03 03:43:17,265 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_2221_210_1988_1_1525597051_4935541_415-296882-0 from training. Duration: 0.77 2023-10-03 03:43:17,291 WARNING [train.py:1204] (2/4) Exclude cut with ID _33640_210_2655_1_1533117605479_4855469_770-278185-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 03:43:17,325 WARNING [train.py:1204] (2/4) Exclude cut with ID _5287_210_2084_1_1527418883040_3878670_333-48508-0_sp1.1 from training. Duration: 0.5454375 2023-10-03 03:43:17,374 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_43802_210_2290_1_1533516203_8829504_142-276839-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 03:43:18,641 WARNING [train.py:1204] (2/4) Exclude cut with ID _28394_210_8913_1_1532430049518_3650118_126-139371-0_sp1.1 from training. Duration: 0.92725 2023-10-03 03:43:18,664 WARNING [train.py:1204] (2/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_76-203867-0 from training. Duration: 0.72 2023-10-03 03:43:20,007 WARNING [train.py:1204] (2/4) Exclude cut with ID _6194_210_1534_1_1527511826160_3941194_14-282876-0_sp0.9 from training. Duration: 0.788875 2023-10-03 03:43:24,157 WARNING [train.py:1204] (2/4) Exclude cut with ID _35355_210_16040_1_1533212988977_4016229_74-190076-0_sp1.1 from training. Duration: 0.72725 2023-10-03 03:43:27,305 WARNING [train.py:1204] (2/4) Exclude cut with ID _33920_210_4220_1_1533257851500_3084399_198-334063-0 from training. Duration: 0.94 2023-10-03 03:43:27,542 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=1125773.3333333333, ans=0.1 2023-10-03 03:43:30,466 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6291_210_3273_1_1527683649_1078498_4-266348-0_sp0.9 from training. Duration: 0.8666875 2023-10-03 03:43:31,723 INFO [train.py:1046] (2/4) Epoch 32, batch 4200, loss[loss=0.1382, simple_loss=0.1961, pruned_loss=0.04014, over 23494.00 frames. ], tot_loss[loss=0.1621, simple_loss=0.2404, pruned_loss=0.04195, over 4701891.49 frames. ], batch size: 285, lr: 3.17e-03, grad_scale: 16.0 2023-10-03 03:43:31,883 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_23856_210_4238_1_1531994785_3087934_122-38075-0_sp1.1 from training. Duration: 0.936375 2023-10-03 03:43:33,836 WARNING [train.py:1204] (2/4) Exclude cut with ID _4697_210_2069_1_1526815801953_3823098_276-96593-0_sp0.9 from training. Duration: 0.8555625 2023-10-03 03:43:35,189 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_308-278734-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 03:43:35,198 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_5190_210_4557_1_1527245078_2965646_353-250603-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 03:43:37,034 WARNING [train.py:1204] (2/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_472-216663-0 from training. Duration: 0.92 2023-10-03 03:43:39,923 WARNING [train.py:1204] (2/4) Exclude cut with ID _7537_210_2084_1_1528502395431_3924359_14-312906-0 from training. Duration: 0.84 2023-10-03 03:43:41,350 WARNING [train.py:1204] (2/4) Exclude cut with ID _29003_210_19596_1_1532507391720_3894098_559-236660-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 03:43:44,171 WARNING [train.py:1204] (2/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_312-204798-0_sp1.1 from training. Duration: 0.8454375 2023-10-03 03:43:46,922 WARNING [train.py:1204] (2/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_17-309070-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 03:43:49,691 WARNING [train.py:1204] (2/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_344-152526-0_sp0.9 from training. Duration: 0.588875 2023-10-03 03:43:51,149 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_41452_210_7291_1_1533437687_3941281_405-11559-0_sp1.1 from training. Duration: 0.82725 2023-10-03 03:43:51,175 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_48730_210_5399_1_1533885886_2212769_157-124052-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 03:43:51,318 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.1.conv_module2.balancer2.prob, batch_count=1125906.6666666667, ans=0.125 2023-10-03 03:43:51,427 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.conv_skip_rate, batch_count=1125906.6666666667, ans=0.0 2023-10-03 03:43:52,566 WARNING [train.py:1204] (2/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_415-78792-0 from training. Duration: 0.81 2023-10-03 03:43:52,570 WARNING [train.py:1204] (2/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_345-340690-0_sp1.1 from training. Duration: 0.8454375 2023-10-03 03:43:53,932 WARNING [train.py:1204] (2/4) Exclude cut with ID _23825_210_13449_1_1531915313526_3497150_124-8138-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 03:43:55,254 WARNING [train.py:1204] (2/4) Exclude cut with ID _49258_210_7994_1_1534122017863_3738861_194-24864-0_sp1.1 from training. Duration: 0.936375 2023-10-03 03:43:55,280 WARNING [train.py:1204] (2/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_369-222060-0_sp1.1 from training. Duration: 0.6454375 2023-10-03 03:43:56,666 WARNING [train.py:1204] (2/4) Exclude cut with ID _12369_210_4381_1_1530356078581_4388520_480-13146-0_sp0.9 from training. Duration: 0.9444375 2023-10-03 03:43:59,899 WARNING [train.py:1204] (2/4) Exclude cut with ID _13253_210_1925_1_1530757775819_3577873_191-144977-0 from training. Duration: 0.78 2023-10-03 03:43:59,930 WARNING [train.py:1204] (2/4) Exclude cut with ID _30304_210_6319_1_1532476827261_7582060_1128-140266-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 03:44:04,714 WARNING [train.py:1204] (2/4) Exclude cut with ID _8772_210_1850_1_1528635595972_3664011_247-336448-0_sp0.9 from training. Duration: 0.82225 2023-10-03 03:44:06,115 WARNING [train.py:1204] (2/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_318-59097-0_sp1.1 from training. Duration: 0.6909375 2023-10-03 03:44:08,618 WARNING [train.py:1204] (2/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_540-131164-0_sp1.1 from training. Duration: 0.836375 2023-10-03 03:44:09,817 WARNING [train.py:1204] (2/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_39-110836-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 03:44:12,493 WARNING [train.py:1204] (2/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_860-96198-0_sp1.1 from training. Duration: 0.663625 2023-10-03 03:44:12,501 WARNING [train.py:1204] (2/4) Exclude cut with ID _15133_210_6228_1_1530794512074_3080379_29-264458-0 from training. Duration: 0.58 2023-10-03 03:44:12,523 WARNING [train.py:1204] (2/4) Exclude cut with ID _55209_210_1531_1_1534675828996_4597160_797-348343-0_sp1.1 from training. Duration: 0.963625 2023-10-03 03:44:14,000 WARNING [train.py:1204] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_23-189234-0_sp1.1 from training. Duration: 0.7545625 2023-10-03 03:44:19,554 WARNING [train.py:1204] (2/4) Exclude cut with ID _5410_210_1388_1_1527397055795_1719020_39-112221-0_sp1.1 from training. Duration: 0.62725 2023-10-03 03:44:20,952 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_43802_210_2290_1_1533516203_8829504_100-348441-0_sp1.1 from training. Duration: 0.82725 2023-10-03 03:44:25,152 WARNING [train.py:1204] (2/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_143-86344-0_sp1.1 from training. Duration: 0.7 2023-10-03 03:44:27,353 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.2.encoder.layers.0.self_attn_weights.whiten_keys, num_groups=4, num_channels=128, metric=4.44 vs. limit=6.0 2023-10-03 03:44:27,902 WARNING [train.py:1204] (2/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_126-29104-0 from training. Duration: 0.85 2023-10-03 03:44:29,261 WARNING [train.py:1204] (2/4) Exclude cut with ID _39846_210_12651_1_1533603651992_1318030_138-31771-0_sp1.1 from training. Duration: 0.963625 2023-10-03 03:44:35,815 WARNING [train.py:1204] (2/4) Exclude cut with ID _2220_210_1819_1_1525509109898_3658680_50-99837-0_sp1.1 from training. Duration: 0.4909375 2023-10-03 03:44:36,002 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.conv_module2.balancer1.prob, batch_count=1126106.6666666667, ans=0.125 2023-10-03 03:44:37,133 WARNING [train.py:1204] (2/4) Exclude cut with ID _6274_210_2084_1_1527937583837_7494020_836-210550-0_sp1.1 from training. Duration: 0.97275 2023-10-03 03:44:39,015 WARNING [train.py:1204] (2/4) Exclude cut with ID _28323_210_16187_1_1532480395667_4100300_306-74561-0 from training. Duration: 0.94 2023-10-03 03:44:40,688 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=1126106.6666666667, ans=0.1 2023-10-03 03:44:41,909 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_177-58005-0_sp1.1 from training. Duration: 0.563625 2023-10-03 03:44:45,802 INFO [train.py:1046] (2/4) Epoch 32, batch 4250, loss[loss=0.1527, simple_loss=0.2239, pruned_loss=0.04076, over 23447.00 frames. ], tot_loss[loss=0.1614, simple_loss=0.2396, pruned_loss=0.04161, over 4691698.48 frames. ], batch size: 285, lr: 3.17e-03, grad_scale: 16.0 2023-10-03 03:44:46,176 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.bypass.skip_rate, batch_count=1126173.3333333333, ans=0.07 2023-10-03 03:44:47,238 WARNING [train.py:1204] (2/4) Exclude cut with ID _54871_210_22356_1_1534665798253_3856359_19-342632-0_sp1.1 from training. Duration: 0.82725 2023-10-03 03:44:47,252 WARNING [train.py:1204] (2/4) Exclude cut with ID _35355_210_16040_1_1533212988977_4016229_74-190076-0_sp0.9 from training. Duration: 0.888875 2023-10-03 03:44:50,078 WARNING [train.py:1204] (2/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_285-97786-0_sp1.1 from training. Duration: 0.97275 2023-10-03 03:44:51,397 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.553e+02 1.847e+02 2.009e+02 2.181e+02 2.689e+02, threshold=4.019e+02, percent-clipped=0.0 2023-10-03 03:44:52,988 WARNING [train.py:1204] (2/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_337-181502-0_sp0.9 from training. Duration: 0.72225 2023-10-03 03:44:54,205 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_353-182855-0 from training. Duration: 0.96 2023-10-03 03:44:54,232 WARNING [train.py:1204] (2/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_409-327730-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 03:44:56,876 WARNING [train.py:1204] (2/4) Exclude cut with ID _8030_210_7497_1_1528444886781_3531089_373-19403-0_sp1.1 from training. Duration: 0.97275 2023-10-03 03:44:59,890 WARNING [train.py:1204] (2/4) Exclude cut with ID _6789_210_1534_1_1527857937218_3118950_231-340237-0_sp1.1 from training. Duration: 0.936375 2023-10-03 03:45:04,665 WARNING [train.py:1204] (2/4) Exclude cut with ID _6493_210_1747_1_1528459210424_4012470_536-57832-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 03:45:06,669 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7046_210_6753_1_1527895337_6920917_756-116002-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 03:45:09,386 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_37152_210_9697_1_1533106573_3844188_29-239070-0_sp1.1 from training. Duration: 0.8909375 2023-10-03 03:45:09,395 WARNING [train.py:1204] (2/4) Exclude cut with ID _19023_210_4353_1_1531378892552_5820780_61-334539-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 03:45:10,751 WARNING [train.py:1204] (2/4) Exclude cut with ID _14170_210_5219_1_1530525949868_4469650_159-28488-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 03:45:11,051 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.bypass.skip_rate, batch_count=1126240.0, ans=0.04949747468305833 2023-10-03 03:45:12,137 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_40896_210_15710_1_1533433956_7100802_764-71272-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 03:45:12,309 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.0.feed_forward2.hidden_balancer.prob, batch_count=1126240.0, ans=0.125 2023-10-03 03:45:13,436 WARNING [train.py:1204] (2/4) Exclude cut with ID _44902_210_16187_1_1533888006062_3792078_258-178274-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 03:45:15,033 WARNING [train.py:1204] (2/4) Exclude cut with ID _7537_210_2084_1_1528502395431_3924359_14-312906-0_sp1.1 from training. Duration: 0.763625 2023-10-03 03:45:16,268 WARNING [train.py:1204] (2/4) Exclude cut with ID _49643_210_7989_1_1534640244224_3939031_674-116639-0_sp1.1 from training. Duration: 0.963625 2023-10-03 03:45:17,732 WARNING [train.py:1204] (2/4) Exclude cut with ID _22826_210_9994_1_1531908201522_3279169_415-161-0 from training. Duration: 0.98 2023-10-03 03:45:21,840 WARNING [train.py:1204] (2/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_264-348896-0 from training. Duration: 0.85 2023-10-03 03:45:21,848 WARNING [train.py:1204] (2/4) Exclude cut with ID _51899_210_20034_1_1534129254466_3669220_233-161492-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 03:45:23,192 WARNING [train.py:1204] (2/4) Exclude cut with ID _31255_210_13427_1_1532865619495_3815143_233-200023-0_sp1.1 from training. Duration: 0.936375 2023-10-03 03:45:23,207 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_23850_210_4238_1_1532232597_1632433_100-229879-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 03:45:24,504 WARNING [train.py:1204] (2/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_375-245677-0_sp0.9 from training. Duration: 0.9 2023-10-03 03:45:24,507 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_40223_210_6228_1_1533298404_4812267_355-317415-0_sp1.1 from training. Duration: 0.97275 2023-10-03 03:45:24,550 WARNING [train.py:1204] (2/4) Exclude cut with ID _9679_210_7497_1_1529129212906_4363929_235-81488-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 03:45:27,832 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.5.encoder.layers.1.nonlin_attention.whiten1, num_groups=1, num_channels=192, metric=2.69 vs. limit=10.0 2023-10-03 03:45:28,888 WARNING [train.py:1204] (2/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_172-215932-0_sp1.1 from training. Duration: 0.6 2023-10-03 03:45:30,162 WARNING [train.py:1204] (2/4) Exclude cut with ID _12369_210_4381_1_1530356078581_4388520_64-298917-0_sp1.1 from training. Duration: 0.8 2023-10-03 03:45:35,034 WARNING [train.py:1204] (2/4) Exclude cut with ID _38020_210_5986_1_1533112252259_7006089_131-53533-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 03:45:36,380 WARNING [train.py:1204] (2/4) Exclude cut with ID _42334_210_7770_1_1534206610396_3605880_392-48154-0_sp1.1 from training. Duration: 0.963625 2023-10-03 03:45:37,807 WARNING [train.py:1204] (2/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_337-181502-0 from training. Duration: 0.65 2023-10-03 03:45:37,817 WARNING [train.py:1204] (2/4) Exclude cut with ID _6267_210_2084_1_1527764658526_3894730_375-219887-0_sp0.9 from training. Duration: 0.7444375 2023-10-03 03:45:37,910 WARNING [train.py:1204] (2/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_60-146189-0 from training. Duration: 0.98 2023-10-03 03:45:39,203 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_3133_210_2087_1_1526106446_2324690_243-143119-0_sp1.1 from training. Duration: 0.7 2023-10-03 03:45:42,020 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.feed_forward3.hidden_balancer.prob, batch_count=1126373.3333333333, ans=0.125 2023-10-03 03:45:42,920 WARNING [train.py:1204] (2/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_1267-272693-0_sp0.9 from training. Duration: 0.811125 2023-10-03 03:45:43,036 WARNING [train.py:1204] (2/4) Exclude cut with ID _69884_210_9771_1_1535846392818_4166169_83-182645-0_sp1.1 from training. Duration: 0.97275 2023-10-03 03:45:43,063 WARNING [train.py:1204] (2/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_403-292260-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 03:45:45,823 WARNING [train.py:1204] (2/4) Exclude cut with ID _55429_210_10626_1_1534467588527_7174143_377-169391-0 from training. Duration: 0.96 2023-10-03 03:45:45,943 WARNING [train.py:1204] (2/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_494-199538-0_sp1.1 from training. Duration: 0.6818125 2023-10-03 03:45:47,398 WARNING [train.py:1204] (2/4) Exclude cut with ID _3067_210_2667_1_1526212974640_4503168_119-175776-0_sp1.1 from training. Duration: 0.67275 2023-10-03 03:45:52,622 WARNING [train.py:1204] (2/4) Exclude cut with ID _3231_210_2106_1_1526205864934_6784220_183-303839-0_sp1.1 from training. Duration: 0.97275 2023-10-03 03:45:55,585 WARNING [train.py:1204] (2/4) Exclude cut with ID _18513_210_9206_1_1531618215223_3862620_165-38958-0_sp1.1 from training. Duration: 0.963625 2023-10-03 03:45:55,690 WARNING [train.py:1204] (2/4) Exclude cut with ID _41314_210_12651_1_1533459476543_3766460_87-272068-0_sp1.1 from training. Duration: 0.7545625 2023-10-03 03:45:55,915 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.conv_module1.balancer1.prob, batch_count=1126440.0, ans=0.125 2023-10-03 03:45:56,969 WARNING [train.py:1204] (2/4) Exclude cut with ID _3541_210_2477_1_1526475408709_3263973_353-95-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 03:45:58,286 INFO [train.py:1046] (2/4) Epoch 32, batch 4300, loss[loss=0.1654, simple_loss=0.2401, pruned_loss=0.04538, over 23649.00 frames. ], tot_loss[loss=0.1609, simple_loss=0.2396, pruned_loss=0.04112, over 4709777.72 frames. ], batch size: 256, lr: 3.17e-03, grad_scale: 16.0 2023-10-03 03:45:58,391 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_2009_210_2071_1_1525481134_4588937_73-302813-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 03:45:59,810 WARNING [train.py:1204] (2/4) Exclude cut with ID _64220_210_9527_1_1535250151772_4875259_88-95011-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 03:46:01,201 WARNING [train.py:1204] (2/4) Exclude cut with ID _44902_210_16187_1_1533888006062_3792078_160-12107-0_sp1.1 from training. Duration: 0.92725 2023-10-03 03:46:01,207 WARNING [train.py:1204] (2/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_547-5424-0 from training. Duration: 0.94 2023-10-03 03:46:04,145 WARNING [train.py:1204] (2/4) Exclude cut with ID _21783_210_11357_1_1531742458730_3762679_268-85656-0_sp1.1 from training. Duration: 0.936375 2023-10-03 03:46:08,667 WARNING [train.py:1204] (2/4) Exclude cut with ID _66210_210_12651_1_1535446812922_3365850_42-284985-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 03:46:08,695 WARNING [train.py:1204] (2/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_465-214726-0_sp0.9 from training. Duration: 0.9555625 2023-10-03 03:46:11,700 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.0.bypass.skip_rate, batch_count=1126573.3333333333, ans=0.035 2023-10-03 03:46:14,236 WARNING [train.py:1204] (2/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_50-95770-0_sp1.1 from training. Duration: 0.936375 2023-10-03 03:46:19,651 WARNING [train.py:1204] (2/4) Exclude cut with ID _30800_210_22382_1_1532499347904_4515959_269-77567-0_sp1.1 from training. Duration: 0.963625 2023-10-03 03:46:19,660 WARNING [train.py:1204] (2/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_7-111954-0 from training. Duration: 0.56 2023-10-03 03:46:20,938 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_43802_210_2290_1_1533516203_8829504_85-195240-0_sp0.9 from training. Duration: 0.9666875 2023-10-03 03:46:22,392 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_2822_210_2715_1_1526112586_1999587_165-36656-0_sp0.9 from training. Duration: 0.988875 2023-10-03 03:46:22,415 WARNING [train.py:1204] (2/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_181-78632-0_sp1.1 from training. Duration: 0.7090625 2023-10-03 03:46:22,425 WARNING [train.py:1204] (2/4) Exclude cut with ID IC0671W0044-331887 from training. Duration: 0.896 2023-10-03 03:46:25,279 WARNING [train.py:1204] (2/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_266-137104-0_sp1.1 from training. Duration: 0.5909375 2023-10-03 03:46:26,643 WARNING [train.py:1204] (2/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_137-116442-0_sp0.9 from training. Duration: 0.8666875 2023-10-03 03:46:29,473 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_23801_210_16223_1_1532066392_3294657_570-5439-0 from training. Duration: 0.97 2023-10-03 03:46:29,485 WARNING [train.py:1204] (2/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_29-280622-0_sp1.1 from training. Duration: 0.7454375 2023-10-03 03:46:29,522 WARNING [train.py:1204] (2/4) Exclude cut with ID 3557-8342-0013-54691-0 from training. Duration: 0.92 2023-10-03 03:46:31,226 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.feed_forward1.out_proj.dropout_p, batch_count=1126640.0, ans=0.1 2023-10-03 03:46:32,241 WARNING [train.py:1204] (2/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_711-15688-0_sp1.1 from training. Duration: 0.3909375 2023-10-03 03:46:34,582 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_15126_210_6753_1_1530778677_2591185_120-277707-0_sp1.1 from training. Duration: 0.72725 2023-10-03 03:46:37,710 WARNING [train.py:1204] (2/4) Exclude cut with ID _14970_210_15709_1_1530775547361_1966965_8-169924-0_sp1.1 from training. Duration: 0.836375 2023-10-03 03:46:37,712 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_4692_210_3528_1_1526899755_4323254_107-44401-0_sp0.9 from training. Duration: 0.9555625 2023-10-03 03:46:39,058 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_3475_210_2028_1_1526193578_3622569_265-276625-0_sp0.9 from training. Duration: 0.8444375 2023-10-03 03:46:39,164 WARNING [train.py:1204] (2/4) Exclude cut with ID _55153_210_2253_1_1534384786677_3162096_148-79022-0_sp1.1 from training. Duration: 0.87275 2023-10-03 03:46:41,225 WARNING [train.py:1204] (2/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_292-36336-0_sp1.1 from training. Duration: 0.92725 2023-10-03 03:46:42,531 WARNING [train.py:1204] (2/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_329-82235-0 from training. Duration: 0.82 2023-10-03 03:46:42,620 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_11305_210_1794_1_1529750428_7178148_257-312870-0 from training. Duration: 0.88 2023-10-03 03:46:42,824 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.bypass_mid.scale_min, batch_count=1126706.6666666667, ans=0.2 2023-10-03 03:46:44,897 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.conv_module1.balancer1.prob, batch_count=1126706.6666666667, ans=0.125 2023-10-03 03:46:45,738 WARNING [train.py:1204] (2/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_436-4493-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 03:46:47,231 WARNING [train.py:1204] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_321-54129-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 03:46:47,245 WARNING [train.py:1204] (2/4) Exclude cut with ID _5287_210_2084_1_1527418883040_3878670_333-48508-0_sp0.9 from training. Duration: 0.6666875 2023-10-03 03:46:48,468 WARNING [train.py:1204] (2/4) Exclude cut with ID _26528_210_9070_1_1532570410842_3851310_244-278158-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 03:46:48,507 WARNING [train.py:1204] (2/4) Exclude cut with ID _46952_210_5986_1_1534230010357_3107379_232-344503-0_sp1.1 from training. Duration: 0.87275 2023-10-03 03:46:48,524 WARNING [train.py:1204] (2/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_43-206757-0 from training. Duration: 0.75 2023-10-03 03:46:48,525 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_4183_210_2087_1_1526729100_1869838_141-166600-0 from training. Duration: 0.95 2023-10-03 03:46:49,215 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.3.encoder.layers.1.conv_module2.whiten, num_groups=1, num_channels=512, metric=2.83 vs. limit=15.0 2023-10-03 03:46:49,933 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_296-287678-0 from training. Duration: 0.95 2023-10-03 03:46:51,217 WARNING [train.py:1204] (2/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_265-60343-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 03:46:51,259 WARNING [train.py:1204] (2/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_332-256168-0 from training. Duration: 0.75 2023-10-03 03:46:51,286 WARNING [train.py:1204] (2/4) Exclude cut with ID _13679_210_7497_1_1530534293662_3355098_411-186088-0 from training. Duration: 0.94 2023-10-03 03:46:55,323 WARNING [train.py:1204] (2/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_458-126779-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 03:46:56,808 WARNING [train.py:1204] (2/4) Exclude cut with ID IC0803W0380-397572 from training. Duration: 0.032 2023-10-03 03:46:56,864 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_278-272612-0_sp1.1 from training. Duration: 0.663625 2023-10-03 03:46:59,524 WARNING [train.py:1204] (2/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_491-263101-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 03:46:59,529 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_53555_210_5632_1_1534595220_7232297_334-336778-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 03:47:02,259 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_50-3289-0 from training. Duration: 0.91 2023-10-03 03:47:03,575 WARNING [train.py:1204] (2/4) Exclude cut with ID _8699_210_2069_1_1528630346831_3960409_155-39766-0_sp0.9 from training. Duration: 0.8666875 2023-10-03 03:47:03,594 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_502-82145-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 03:47:03,641 WARNING [train.py:1204] (2/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_550-43132-0_sp1.1 from training. Duration: 0.97275 2023-10-03 03:47:03,662 WARNING [train.py:1204] (2/4) Exclude cut with ID _14998_210_5219_1_1530705582080_7474970_446-112117-0_sp1.1 from training. Duration: 0.8181875 2023-10-03 03:47:05,012 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_3691_210_3528_1_1526468169_4062053_355-334432-0_sp1.1 from training. Duration: 0.7909375 2023-10-03 03:47:06,373 WARNING [train.py:1204] (2/4) Exclude cut with ID _58605_210_12210_1_1534723195939_3904259_239-94720-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 03:47:08,952 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.4.encoder.layers.0.conv_module1.whiten, num_groups=1, num_channels=384, metric=4.40 vs. limit=15.0 2023-10-03 03:47:10,261 WARNING [train.py:1204] (2/4) Exclude cut with ID _49376_210_5986_1_1533985278732_3370740_391-344072-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 03:47:12,234 WARNING [train.py:1204] (2/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_558-322014-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 03:47:12,273 WARNING [train.py:1204] (2/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_256-32662-0_sp1.1 from training. Duration: 0.8181875 2023-10-03 03:47:12,473 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.balancer2.prob, batch_count=1126840.0, ans=0.125 2023-10-03 03:47:13,583 INFO [train.py:1046] (2/4) Epoch 32, batch 4350, loss[loss=0.1632, simple_loss=0.2365, pruned_loss=0.04492, over 23675.00 frames. ], tot_loss[loss=0.1617, simple_loss=0.2404, pruned_loss=0.04152, over 4704737.51 frames. ], batch size: 149, lr: 3.17e-03, grad_scale: 16.0 2023-10-03 03:47:17,114 WARNING [train.py:1204] (2/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_572-185150-0 from training. Duration: 0.97 2023-10-03 03:47:17,153 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_89-35748-0_sp0.9 from training. Duration: 0.87775 2023-10-03 03:47:19,714 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.552e+02 1.807e+02 2.015e+02 2.247e+02 3.972e+02, threshold=4.030e+02, percent-clipped=0.0 2023-10-03 03:47:21,353 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6291_210_3273_1_1527683649_1078498_88-66747-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 03:47:21,583 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.balancer2.prob, batch_count=1126840.0, ans=0.125 2023-10-03 03:47:22,771 WARNING [train.py:1204] (2/4) Exclude cut with ID _19504_210_9994_1_1531648863242_3325220_357-237029-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 03:47:25,498 WARNING [train.py:1204] (2/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_302-2550-0_sp1.1 from training. Duration: 0.736375 2023-10-03 03:47:25,499 WARNING [train.py:1204] (2/4) Exclude cut with ID _9461_210_4557_1_1529060514726_3677451_59-194017-0_sp1.1 from training. Duration: 0.963625 2023-10-03 03:47:29,444 WARNING [train.py:1204] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_665-196155-0_sp1.1 from training. Duration: 0.6090625 2023-10-03 03:47:32,288 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_326-315007-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 03:47:36,236 WARNING [train.py:1204] (2/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_105-328174-0_sp1.1 from training. Duration: 0.6909375 2023-10-03 03:47:36,251 WARNING [train.py:1204] (2/4) Exclude cut with ID _56494_210_4220_1_1535284837625_3033169_106-263722-0_sp1.1 from training. Duration: 0.97275 2023-10-03 03:47:39,639 WARNING [train.py:1204] (2/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_770-200430-0_sp0.9 from training. Duration: 0.988875 2023-10-03 03:47:39,815 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.ff3_skip_rate, batch_count=1126906.6666666667, ans=0.0 2023-10-03 03:47:42,988 WARNING [train.py:1204] (2/4) Exclude cut with ID _65640_210_6310_1_1535368201445_3173250_209-13008-0_sp1.1 from training. Duration: 0.9 2023-10-03 03:47:44,410 WARNING [train.py:1204] (2/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_212-149947-0_sp0.9 from training. Duration: 0.811125 2023-10-03 03:47:49,132 WARNING [train.py:1204] (2/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_188-171673-0 from training. Duration: 0.58 2023-10-03 03:47:49,197 WARNING [train.py:1204] (2/4) Exclude cut with ID _58895_210_8750_1_1535184081320_3649089_286-281724-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 03:47:49,279 WARNING [train.py:1204] (2/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_16-254909-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 03:47:53,576 WARNING [train.py:1204] (2/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_52-95655-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 03:47:55,050 WARNING [train.py:1204] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_554-45617-0 from training. Duration: 0.99 2023-10-03 03:47:57,647 WARNING [train.py:1204] (2/4) Exclude cut with ID _14683_210_5219_1_1530698352608_3974859_473-344611-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 03:47:59,006 WARNING [train.py:1204] (2/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_17-25045-0_sp0.9 from training. Duration: 0.6555625 2023-10-03 03:48:01,843 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9072W0003-530637 from training. Duration: 0.768 2023-10-03 03:48:03,220 WARNING [train.py:1204] (2/4) Exclude cut with ID _38670_210_15379_1_1533448810659_4391059_76-74025-0_sp1.1 from training. Duration: 0.936375 2023-10-03 03:48:04,565 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6291_210_3273_1_1527683649_1078498_74-292608-0_sp0.9 from training. Duration: 0.8 2023-10-03 03:48:06,391 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9084W0025-536862 from training. Duration: 0.896 2023-10-03 03:48:07,739 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9087W0021-538392 from training. Duration: 0.768 2023-10-03 03:48:07,745 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7543_210_6753_1_1528543323_645515_56-163234-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 03:48:07,781 WARNING [train.py:1204] (2/4) Exclude cut with ID _45560_210_21982_1_1533812603951_3584999_44-332643-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 03:48:09,652 WARNING [train.py:1204] (2/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_1285-256622-0_sp1.1 from training. Duration: 0.763625 2023-10-03 03:48:09,716 WARNING [train.py:1204] (2/4) Exclude cut with ID _52781_210_16187_1_1534297729099_1118149_141-325632-0_sp1.1 from training. Duration: 0.936375 2023-10-03 03:48:11,592 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_40896_210_15710_1_1533433956_7100802_1051-137631-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 03:48:11,626 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_13091_210_7925_1_1530609447_8987354_233-18333-0_sp1.1 from training. Duration: 0.97275 2023-10-03 03:48:14,919 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_34008_210_10431_1_1533345424_8183247_662-317331-0 from training. Duration: 0.94 2023-10-03 03:48:14,926 WARNING [train.py:1204] (2/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_291-248920-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 03:48:14,927 WARNING [train.py:1204] (2/4) Exclude cut with ID _65263_210_22356_1_1535763598691_3921037_274-288481-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 03:48:14,955 WARNING [train.py:1204] (2/4) Exclude cut with ID _5189_210_4557_1_1527248319428_3769229_233-168080-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 03:48:16,278 WARNING [train.py:1204] (2/4) Exclude cut with ID _54871_210_22356_1_1534665798253_3856359_135-184281-0 from training. Duration: 0.99 2023-10-03 03:48:16,381 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9123W0012-555972 from training. Duration: 0.896 2023-10-03 03:48:16,387 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9123W0118-556077 from training. Duration: 0.896 2023-10-03 03:48:17,664 WARNING [train.py:1204] (2/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_406-275649-0 from training. Duration: 0.42 2023-10-03 03:48:19,331 INFO [scaling.py:1118] (2/4) WithLoss: name=encoder.encoders.2.encoder.layers.2.self_attn_weights, loss-sum=0.000e+00 2023-10-03 03:48:20,489 WARNING [train.py:1204] (2/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_309-13684-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 03:48:20,508 WARNING [train.py:1204] (2/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_129-337802-0_sp1.1 from training. Duration: 0.6545625 2023-10-03 03:48:20,527 WARNING [train.py:1204] (2/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_649-266825-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 03:48:21,777 WARNING [train.py:1204] (2/4) Exclude cut with ID _40519_210_13794_1_1533715228655_7337390_895-224742-0_sp1.1 from training. Duration: 0.92725 2023-10-03 03:48:24,590 WARNING [train.py:1204] (2/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_234-14174-0 from training. Duration: 0.82 2023-10-03 03:48:27,280 INFO [train.py:1046] (2/4) Epoch 32, batch 4400, loss[loss=0.1425, simple_loss=0.2203, pruned_loss=0.03236, over 21416.00 frames. ], tot_loss[loss=0.1627, simple_loss=0.2412, pruned_loss=0.04214, over 4698770.14 frames. ], batch size: 47, lr: 3.17e-03, grad_scale: 32.0 2023-10-03 03:48:27,331 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9156W0009-572502 from training. Duration: 0.768 2023-10-03 03:48:27,338 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_424-245529-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 03:48:30,249 WARNING [train.py:1204] (2/4) Exclude cut with ID _26528_210_9070_1_1532570410842_3851310_232-10149-0_sp1.1 from training. Duration: 0.963625 2023-10-03 03:48:30,265 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_17292_210_6753_1_1531125388_2943734_152-30410-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 03:48:31,644 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_35441_210_16554_1_1533347572_4092680_291-82075-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 03:48:32,335 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.3.encoder.layers.3.self_attn_weights.whiten_keys, num_groups=8, num_channels=256, metric=2.66 vs. limit=6.0 2023-10-03 03:48:33,030 WARNING [train.py:1204] (2/4) Exclude cut with ID _12369_210_4381_1_1530356078581_4388520_64-298917-0 from training. Duration: 0.88 2023-10-03 03:48:33,056 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_5618_210_1874_1_1527311008_5851_1-214501-0_sp0.9 from training. Duration: 0.7655625 2023-10-03 03:48:33,083 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_9460_210_4211_1_1528954587_10049667_572-37035-0 from training. Duration: 0.8 2023-10-03 03:48:33,108 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9193W0023-590322 from training. Duration: 0.896 2023-10-03 03:48:34,450 WARNING [train.py:1204] (2/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_206-10097-0_sp1.1 from training. Duration: 0.4454375 2023-10-03 03:48:34,462 WARNING [train.py:1204] (2/4) Exclude cut with ID _55134_210_21982_1_1534590028377_3882170_17-51731-0_sp1.1 from training. Duration: 0.963625 2023-10-03 03:48:37,603 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_14953_210_2151_1_1530701710_5616165_710-165400-0 from training. Duration: 0.87 2023-10-03 03:48:39,113 WARNING [train.py:1204] (2/4) Exclude cut with ID _53203_210_2087_1_1534246218309_475590_61-85777-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 03:48:39,201 WARNING [train.py:1204] (2/4) Exclude cut with ID _55844_210_2346_1_1534471212569_7625707_1050-10453-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 03:48:39,213 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9218W0013-603297 from training. Duration: 0.896 2023-10-03 03:48:41,413 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.ff2_skip_rate, batch_count=1127240.0, ans=0.0 2023-10-03 03:48:42,535 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_20120_210_6753_1_1531643035_5482859_267-102818-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 03:48:42,536 WARNING [train.py:1204] (2/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_191-41023-0 from training. Duration: 0.71 2023-10-03 03:48:44,283 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9231W0202-610227 from training. Duration: 0.896 2023-10-03 03:48:47,171 WARNING [train.py:1204] (2/4) Exclude cut with ID _6537_210_1883_1_1527987559710_3597129_382-133062-0 from training. Duration: 0.68 2023-10-03 03:48:47,321 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.bypass_mid.scale_min, batch_count=1127240.0, ans=0.2 2023-10-03 03:48:48,487 WARNING [train.py:1204] (2/4) Exclude cut with ID _28427_210_4220_1_1532581240967_3061069_43-84928-0 from training. Duration: 0.99 2023-10-03 03:48:48,529 WARNING [train.py:1204] (2/4) Exclude cut with ID _59256_210_5986_1_1535076085997_3589120_121-237386-0 from training. Duration: 0.96 2023-10-03 03:48:48,688 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.conv_module2.balancer1.prob, batch_count=1127240.0, ans=0.125 2023-10-03 03:48:49,937 WARNING [train.py:1204] (2/4) Exclude cut with ID _8480_210_1874_1_1528589391205_6728330_137-256986-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 03:48:50,025 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_9417_210_1794_1_1529235261_4614131_479-346963-0_sp1.1 from training. Duration: 0.936375 2023-10-03 03:48:51,389 WARNING [train.py:1204] (2/4) Exclude cut with ID _26474_210_9570_1_1532520022699_3623380_267-7362-0_sp1.1 from training. Duration: 0.936375 2023-10-03 03:48:52,799 WARNING [train.py:1204] (2/4) Exclude cut with ID _55328_210_27767_1_1534580973260_4088170_287-61272-0_sp1.1 from training. Duration: 0.97275 2023-10-03 03:48:54,273 WARNING [train.py:1204] (2/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_589-9070-0 from training. Duration: 0.71 2023-10-03 03:48:54,280 WARNING [train.py:1204] (2/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_148-158509-0_sp1.1 from training. Duration: 0.37275 2023-10-03 03:48:55,615 WARNING [train.py:1204] (2/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_164-184548-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 03:48:57,168 WARNING [train.py:1204] (2/4) Exclude cut with ID _14970_210_15709_1_1530775547361_1966965_6-23328-0_sp1.1 from training. Duration: 0.7818125 2023-10-03 03:48:57,173 WARNING [train.py:1204] (2/4) Exclude cut with ID _29303_210_2575_1_1532392237303_7410649_612-165069-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 03:48:58,641 WARNING [train.py:1204] (2/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_383-321224-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 03:48:58,690 WARNING [train.py:1204] (2/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_283-160525-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 03:48:59,951 WARNING [train.py:1204] (2/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_505-97987-0 from training. Duration: 0.86 2023-10-03 03:49:00,039 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9309W0190-640317 from training. Duration: 0.768 2023-10-03 03:49:02,142 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.0.layers.0.self_attn2.whiten, num_groups=1, num_channels=192, metric=10.49 vs. limit=22.5 2023-10-03 03:49:03,952 WARNING [train.py:1204] (2/4) Exclude cut with ID _50093_210_4220_1_1534417141207_3432299_280-297390-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 03:49:10,115 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_17292_210_6753_1_1531125388_2943734_320-71385-0_sp1.1 from training. Duration: 0.97275 2023-10-03 03:49:13,361 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_213-103282-0 from training. Duration: 0.78 2023-10-03 03:49:16,793 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7615_210_4211_1_1528110965_4580160_72-235211-0_sp0.9 from training. Duration: 0.8444375 2023-10-03 03:49:20,036 WARNING [train.py:1204] (2/4) Exclude cut with ID _14867_210_1968_1_1530684179890_3846969_173-320977-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 03:49:21,633 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7046_210_6753_1_1527895337_6920917_783-278109-0_sp0.9 from training. Duration: 0.8555625 2023-10-03 03:49:21,679 WARNING [train.py:1204] (2/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_90-30673-0 from training. Duration: 0.84 2023-10-03 03:49:22,999 WARNING [train.py:1204] (2/4) Exclude cut with ID _22826_210_9994_1_1531908201522_3279169_415-161-0_sp1.1 from training. Duration: 0.8909375 2023-10-03 03:49:23,018 WARNING [train.py:1204] (2/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_86-66192-0_sp0.9 from training. Duration: 0.611125 2023-10-03 03:49:23,020 WARNING [train.py:1204] (2/4) Exclude cut with ID _4626_210_3532_1_1526817652717_3916859_446-206656-0_sp1.1 from training. Duration: 0.6909375 2023-10-03 03:49:23,082 WARNING [train.py:1204] (2/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_166-128941-0_sp0.9 from training. Duration: 0.6 2023-10-03 03:49:27,103 WARNING [train.py:1204] (2/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_345-167986-0 from training. Duration: 0.84 2023-10-03 03:49:29,795 WARNING [train.py:1204] (2/4) Exclude cut with ID _6275_210_2084_1_1528110062213_7669049_973-198085-0 from training. Duration: 0.75 2023-10-03 03:49:29,877 WARNING [train.py:1204] (2/4) Exclude cut with ID _60057_210_10640_1_1534903065793_3333150_171-181133-0 from training. Duration: 0.88 2023-10-03 03:49:31,279 WARNING [train.py:1204] (2/4) Exclude cut with ID _19504_210_9994_1_1531648863242_3325220_336-196513-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 03:49:31,286 WARNING [train.py:1204] (2/4) Exclude cut with ID _6493_210_1747_1_1528459210424_4012470_513-140967-0 from training. Duration: 0.79 2023-10-03 03:49:32,663 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6673_210_3273_1_1527767790_3173240_327-306185-0_sp0.9 from training. Duration: 0.92225 2023-10-03 03:49:32,860 INFO [scaling.py:1118] (2/4) WithLoss: name=encoder.encoders.0.layers.1.self_attn_weights, loss-sum=0.000e+00 2023-10-03 03:49:35,439 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1032-236863-0_sp1.1 from training. Duration: 0.763625 2023-10-03 03:49:38,709 WARNING [train.py:1204] (2/4) Exclude cut with ID _14867_210_1968_1_1530684179890_3846969_413-29292-0 from training. Duration: 0.9 2023-10-03 03:49:41,424 INFO [train.py:1046] (2/4) Epoch 32, batch 4450, loss[loss=0.1554, simple_loss=0.2363, pruned_loss=0.03726, over 24612.00 frames. ], tot_loss[loss=0.1637, simple_loss=0.242, pruned_loss=0.04268, over 4696732.05 frames. ], batch size: 60, lr: 3.17e-03, grad_scale: 32.0 2023-10-03 03:49:41,525 WARNING [train.py:1204] (2/4) Exclude cut with ID _47251_210_18628_1_1533814063128_4137250_186-343322-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 03:49:41,745 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.conv_skip_rate, batch_count=1127506.6666666667, ans=0.0 2023-10-03 03:49:45,866 WARNING [train.py:1204] (2/4) Exclude cut with ID _56235_210_10640_1_1534557335235_4161099_624-46091-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 03:49:45,915 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_791-197891-0_sp0.9 from training. Duration: 0.9333125 2023-10-03 03:49:47,470 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.515e+02 1.837e+02 2.024e+02 2.337e+02 3.195e+02, threshold=4.048e+02, percent-clipped=0.0 2023-10-03 03:49:52,448 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_50050_210_16223_1_1535435796_7290138_786-288457-0_sp1.1 from training. Duration: 0.92725 2023-10-03 03:49:53,757 WARNING [train.py:1204] (2/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_213-227283-0_sp1.1 from training. Duration: 0.963625 2023-10-03 03:49:57,835 WARNING [train.py:1204] (2/4) Exclude cut with ID _42659_210_2070_1_1533607442765_3120380_58-341121-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 03:49:59,373 WARNING [train.py:1204] (2/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_506-23200-0_sp1.1 from training. Duration: 0.8818125 2023-10-03 03:50:02,134 WARNING [train.py:1204] (2/4) Exclude cut with ID _14170_210_5219_1_1530525949868_4469650_336-213530-0_sp1.1 from training. Duration: 0.8181875 2023-10-03 03:50:02,168 WARNING [train.py:1204] (2/4) Exclude cut with ID _64135_210_2253_1_1535677267876_7608972_180-5312-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 03:50:02,246 WARNING [train.py:1204] (2/4) Exclude cut with ID _12302_210_5219_1_1529924446466_8107280_835-279754-0 from training. Duration: 0.9 2023-10-03 03:50:02,248 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_20120_210_6753_1_1531643035_5482859_510-96996-0_sp1.1 from training. Duration: 0.7909375 2023-10-03 03:50:03,662 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_15517_210_6753_1_1530954008_3487336_216-187834-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 03:50:03,687 WARNING [train.py:1204] (2/4) Exclude cut with ID _19504_210_9994_1_1531648863242_3325220_414-113427-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 03:50:03,688 WARNING [train.py:1204] (2/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_264-108685-0_sp0.9 from training. Duration: 0.611125 2023-10-03 03:50:03,893 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.feed_forward1.hidden_balancer.prob, batch_count=1127573.3333333333, ans=0.125 2023-10-03 03:50:06,364 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_5438_210_1794_1_1527336687_2600009_104-288274-0_sp0.9 from training. Duration: 0.6444375 2023-10-03 03:50:08,161 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.attention_skip_rate, batch_count=1127573.3333333333, ans=0.0 2023-10-03 03:50:09,282 WARNING [train.py:1204] (2/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_520-39496-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 03:50:10,660 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_37818_210_4238_1_1533620910_4720083_315-308565-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 03:50:12,568 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_2009_210_2071_1_1525481134_4588937_385-302100-0_sp1.1 from training. Duration: 0.7909375 2023-10-03 03:50:12,616 WARNING [train.py:1204] (2/4) Exclude cut with ID _58895_210_8750_1_1535184081320_3649089_277-53219-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 03:50:14,075 WARNING [train.py:1204] (2/4) Exclude cut with ID _26664_210_7770_1_1532258943280_3418270_121-111258-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 03:50:16,113 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.conv_module1.balancer1.prob, batch_count=1127640.0, ans=0.125 2023-10-03 03:50:18,207 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=1127640.0, ans=0.1 2023-10-03 03:50:19,311 WARNING [train.py:1204] (2/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_99-167392-0_sp1.1 from training. Duration: 0.5545625 2023-10-03 03:50:19,398 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1113-7369-0 from training. Duration: 0.85 2023-10-03 03:50:19,412 WARNING [train.py:1204] (2/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_352-59958-0 from training. Duration: 0.86 2023-10-03 03:50:19,417 WARNING [train.py:1204] (2/4) Exclude cut with ID _14886_210_4220_1_1530702020355_3516190_159-73748-0_sp1.1 from training. Duration: 0.7545625 2023-10-03 03:50:22,306 WARNING [train.py:1204] (2/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_131-119569-0_sp1.1 from training. Duration: 0.92725 2023-10-03 03:50:23,674 WARNING [train.py:1204] (2/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_216-343007-0 from training. Duration: 0.66 2023-10-03 03:50:26,376 WARNING [train.py:1204] (2/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_113-92608-0_sp0.9 from training. Duration: 0.6 2023-10-03 03:50:31,556 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_40047_210_2290_1_1533290519_2374311_132-123271-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 03:50:31,618 WARNING [train.py:1204] (2/4) Exclude cut with ID _8793_210_6310_1_1528542985139_2310009_121-179782-0 from training. Duration: 0.8 2023-10-03 03:50:31,641 WARNING [train.py:1204] (2/4) Exclude cut with ID _43997_210_4525_1_1533538632555_5189329_283-218639-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 03:50:31,647 WARNING [train.py:1204] (2/4) Exclude cut with ID _49376_210_5986_1_1533985278732_3370740_297-78397-0_sp1.1 from training. Duration: 0.936375 2023-10-03 03:50:31,661 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_3475_210_2028_1_1526193578_3622569_377-166133-0_sp0.9 from training. Duration: 0.9555625 2023-10-03 03:50:31,667 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_520-213149-0_sp1.1 from training. Duration: 0.92725 2023-10-03 03:50:33,079 WARNING [train.py:1204] (2/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_567-144483-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 03:50:33,210 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.1.ff3_skip_rate, batch_count=1127706.6666666667, ans=0.0 2023-10-03 03:50:33,240 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.conv_skip_rate, batch_count=1127706.6666666667, ans=0.0 2023-10-03 03:50:35,909 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_4183_210_2087_1_1526729100_1869838_143-280962-0_sp0.9 from training. Duration: 0.62225 2023-10-03 03:50:36,554 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.3.encoder.layers.3.nonlin_attention.whiten1, num_groups=1, num_channels=384, metric=5.15 vs. limit=10.0 2023-10-03 03:50:37,211 WARNING [train.py:1204] (2/4) Exclude cut with ID _49643_210_7989_1_1534640244224_3939031_611-185515-0 from training. Duration: 0.94 2023-10-03 03:50:37,483 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=1127706.6666666667, ans=0.1 2023-10-03 03:50:38,617 WARNING [train.py:1204] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_88-341718-0_sp0.9 from training. Duration: 0.7333125 2023-10-03 03:50:38,746 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_51225_210_3601_1_1534400634_2327383_49-235441-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 03:50:39,368 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.feed_forward3.out_whiten.whitening_limit, batch_count=1127773.3333333333, ans=15.0 2023-10-03 03:50:40,763 WARNING [train.py:1204] (2/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_240-294400-0_sp1.1 from training. Duration: 0.936375 2023-10-03 03:50:42,030 WARNING [train.py:1204] (2/4) Exclude cut with ID _55429_210_10626_1_1534467588527_7174143_640-304825-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 03:50:42,053 WARNING [train.py:1204] (2/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_187-221566-0_sp1.1 from training. Duration: 0.3909375 2023-10-03 03:50:42,262 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.feed_forward1.hidden_balancer.prob, batch_count=1127773.3333333333, ans=0.125 2023-10-03 03:50:45,134 WARNING [train.py:1204] (2/4) Exclude cut with ID _7606_210_4915_1_1528894253277_5080690_127-177761-0_sp0.9 from training. Duration: 0.711125 2023-10-03 03:50:48,474 WARNING [train.py:1204] (2/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_430-116139-0 from training. Duration: 0.85 2023-10-03 03:50:49,957 WARNING [train.py:1204] (2/4) Exclude cut with ID _3675_210_4557_1_1526639934408_4182089_456-18305-0_sp0.9 from training. Duration: 0.8444375 2023-10-03 03:50:54,113 WARNING [train.py:1204] (2/4) Exclude cut with ID _18513_210_9206_1_1531618215223_3862620_402-5957-0_sp1.1 from training. Duration: 0.9 2023-10-03 03:50:54,277 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.conv_module2.balancer2.prob, batch_count=1127840.0, ans=0.125 2023-10-03 03:50:55,309 INFO [train.py:1046] (2/4) Epoch 32, batch 4500, loss[loss=0.1485, simple_loss=0.2113, pruned_loss=0.04285, over 23660.00 frames. ], tot_loss[loss=0.1636, simple_loss=0.2424, pruned_loss=0.04238, over 4708221.13 frames. ], batch size: 256, lr: 3.17e-03, grad_scale: 16.0 2023-10-03 03:50:56,770 WARNING [train.py:1204] (2/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_465-156500-0 from training. Duration: 0.72 2023-10-03 03:50:56,771 WARNING [train.py:1204] (2/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_714-310538-0 from training. Duration: 0.86 2023-10-03 03:50:57,020 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.bypass.skip_rate, batch_count=1127840.0, ans=0.09899494936611666 2023-10-03 03:50:58,236 WARNING [train.py:1204] (2/4) Exclude cut with ID _39411_210_5986_1_1533538825030_3343649_335-301187-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 03:50:59,788 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.conv_module1.balancer1.prob, batch_count=1127840.0, ans=0.125 2023-10-03 03:51:03,666 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_41750_210_1960_1_1533520678_3948042_241-158616-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 03:51:04,947 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_46013_210_7234_1_1533776362_3744032_50-141535-0_sp1.1 from training. Duration: 0.9 2023-10-03 03:51:05,015 WARNING [train.py:1204] (2/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_43-206757-0_sp0.9 from training. Duration: 0.8333125 2023-10-03 03:51:06,349 WARNING [train.py:1204] (2/4) Exclude cut with ID _22961_210_8254_1_1531976366672_4278180_172-276296-0_sp1.1 from training. Duration: 0.97275 2023-10-03 03:51:06,384 WARNING [train.py:1204] (2/4) Exclude cut with ID _17956_210_4353_1_1531209568125_6979970_1305-301903-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 03:51:06,422 WARNING [train.py:1204] (2/4) Exclude cut with ID _28323_210_16187_1_1532480395667_4100300_604-44507-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 03:51:16,157 WARNING [train.py:1204] (2/4) Exclude cut with ID _23514_210_1389_1_1531832139785_3031439_88-337544-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 03:51:17,393 WARNING [train.py:1204] (2/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_932-3672-0_sp1.1 from training. Duration: 0.963625 2023-10-03 03:51:20,502 WARNING [train.py:1204] (2/4) Exclude cut with ID _46502_210_13794_1_1533724452198_3657159_207-109991-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 03:51:20,566 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_216-73841-0_sp1.1 from training. Duration: 0.87275 2023-10-03 03:51:22,061 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8453_210_4211_1_1528502940_8562730_347-330673-0_sp1.1 from training. Duration: 0.6545625 2023-10-03 03:51:22,537 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.5.encoder.layers.0.nonlin_attention.whiten2, num_groups=1, num_channels=256, metric=4.06 vs. limit=15.0 2023-10-03 03:51:26,491 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.ff3_skip_rate, batch_count=1127973.3333333333, ans=0.0 2023-10-03 03:51:29,033 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_4183_210_2087_1_1526729100_1869838_143-280962-0_sp1.1 from training. Duration: 0.5090625 2023-10-03 03:51:33,396 WARNING [train.py:1204] (2/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_325-113798-0_sp1.1 from training. Duration: 0.77275 2023-10-03 03:51:36,180 WARNING [train.py:1204] (2/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_91-212486-0_sp0.9 from training. Duration: 0.7555625 2023-10-03 03:51:38,917 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8776_210_2040_1_1528638837_4088036_244-278763-0_sp1.1 from training. Duration: 0.8181875 2023-10-03 03:51:38,958 WARNING [train.py:1204] (2/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_106-286130-0 from training. Duration: 0.98 2023-10-03 03:51:40,138 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_2746_210_1988_1_1525872816_3795627_372-205861-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 03:51:40,321 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.self_attn_weights.pos_emb_skip_rate, batch_count=1128040.0, ans=0.0 2023-10-03 03:51:41,393 WARNING [train.py:1204] (2/4) Exclude cut with ID _30800_210_22382_1_1532499347904_4515959_240-54646-0_sp1.1 from training. Duration: 0.936375 2023-10-03 03:51:43,394 WARNING [train.py:1204] (2/4) Exclude cut with ID _59500_210_8254_1_1534809610680_3736009_162-180695-0_sp1.1 from training. Duration: 0.936375 2023-10-03 03:51:43,430 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7544_210_6753_1_1528591327_1471426_146-93357-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 03:51:45,413 WARNING [train.py:1204] (2/4) Exclude cut with ID _42657_210_2070_1_1533520654016_3327008_405-87432-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 03:51:46,729 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_4528_210_3273_1_1527076654_3276777_209-219804-0 from training. Duration: 0.69 2023-10-03 03:51:46,730 WARNING [train.py:1204] (2/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_78-182912-0_sp0.9 from training. Duration: 0.6555625 2023-10-03 03:51:46,736 WARNING [train.py:1204] (2/4) Exclude cut with ID _31255_210_13427_1_1532865619495_3815143_322-114998-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 03:51:50,343 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.conv_skip_rate, batch_count=1128040.0, ans=0.0 2023-10-03 03:51:51,390 WARNING [train.py:1204] (2/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_848-280957-0_sp0.9 from training. Duration: 0.9666875 2023-10-03 03:51:51,416 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7696_210_2040_1_1528207167_3716099_117-125434-0_sp0.9 from training. Duration: 0.8444375 2023-10-03 03:51:52,263 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.2.encoder.layers.0.conv_module1.whiten, num_groups=1, num_channels=384, metric=5.91 vs. limit=15.0 2023-10-03 03:51:55,453 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_1002-109192-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 03:51:56,926 WARNING [train.py:1204] (2/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_138-64210-0_sp0.9 from training. Duration: 0.811125 2023-10-03 03:51:56,949 WARNING [train.py:1204] (2/4) Exclude cut with ID _25808_210_13292_1_1532343514562_7366109_633-47524-0_sp1.1 from training. Duration: 0.9 2023-10-03 03:51:58,482 WARNING [train.py:1204] (2/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_297-5233-0 from training. Duration: 0.95 2023-10-03 03:51:59,939 WARNING [train.py:1204] (2/4) Exclude cut with ID _8394_210_6702_1_1528590504316_3540189_335-274780-0 from training. Duration: 0.78 2023-10-03 03:51:59,944 WARNING [train.py:1204] (2/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_301-330455-0 from training. Duration: 0.8 2023-10-03 03:52:02,686 WARNING [train.py:1204] (2/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_383-205833-0 from training. Duration: 0.95 2023-10-03 03:52:05,473 WARNING [train.py:1204] (2/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_48-182155-0 from training. Duration: 0.87 2023-10-03 03:52:06,810 WARNING [train.py:1204] (2/4) Exclude cut with ID _14832_210_8750_1_1530691351539_3171348_1-54315-0_sp1.1 from training. Duration: 0.7909375 2023-10-03 03:52:08,136 INFO [train.py:1046] (2/4) Epoch 32, batch 4550, loss[loss=0.1768, simple_loss=0.2633, pruned_loss=0.04512, over 24430.00 frames. ], tot_loss[loss=0.1629, simple_loss=0.241, pruned_loss=0.04237, over 4702521.67 frames. ], batch size: 69, lr: 3.17e-03, grad_scale: 16.0 2023-10-03 03:52:10,919 WARNING [train.py:1204] (2/4) Exclude cut with ID _44982_210_1511_1_1534312820108_3726029_413-100814-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 03:52:12,713 WARNING [train.py:1204] (2/4) Exclude cut with ID _3231_210_2106_1_1526205864934_6784220_104-261392-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 03:52:14,234 WARNING [train.py:1204] (2/4) Exclude cut with ID _24314_210_4915_1_1532135078116_7170179_1-13588-0_sp1.1 from training. Duration: 0.97275 2023-10-03 03:52:15,514 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.558e+02 1.941e+02 2.112e+02 2.362e+02 4.046e+02, threshold=4.225e+02, percent-clipped=0.0 2023-10-03 03:52:19,429 WARNING [train.py:1204] (2/4) Exclude cut with ID _44304_210_15374_1_1533639352765_4613129_141-8356-0_sp1.1 from training. Duration: 0.963625 2023-10-03 03:52:22,656 WARNING [train.py:1204] (2/4) Exclude cut with ID _37892_210_19596_1_1533198677897_3964058_374-335034-0_sp1.1 from training. Duration: 0.936375 2023-10-03 03:52:24,092 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_195-257512-0_sp0.9 from training. Duration: 0.7444375 2023-10-03 03:52:24,094 WARNING [train.py:1204] (2/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_1267-272693-0_sp1.1 from training. Duration: 0.663625 2023-10-03 03:52:24,095 WARNING [train.py:1204] (2/4) Exclude cut with ID _30196_210_1664_1_1533708496522_7074140_690-326155-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 03:52:26,846 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_68847_210_14656_1_1536070214_2586843_38-20717-0_sp1.1 from training. Duration: 0.97275 2023-10-03 03:52:26,888 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_99-251314-0_sp1.1 from training. Duration: 0.7909375 2023-10-03 03:52:30,960 WARNING [train.py:1204] (2/4) Exclude cut with ID _49376_210_5986_1_1533985278732_3370740_225-95630-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 03:52:35,049 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_2655_210_2087_1_1525688959_3897400_202-147557-0 from training. Duration: 0.85 2023-10-03 03:52:35,114 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_5618_210_1874_1_1527311008_5851_1-214501-0_sp1.1 from training. Duration: 0.626375 2023-10-03 03:52:36,486 WARNING [train.py:1204] (2/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_225-43381-0_sp1.1 from training. Duration: 0.8545625 2023-10-03 03:52:39,258 WARNING [train.py:1204] (2/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_242-3472-0 from training. Duration: 0.73 2023-10-03 03:52:40,789 WARNING [train.py:1204] (2/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_339-104791-0 from training. Duration: 0.49 2023-10-03 03:52:40,866 WARNING [train.py:1204] (2/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_488-92261-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 03:52:43,622 WARNING [train.py:1204] (2/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_175-163894-0 from training. Duration: 0.74 2023-10-03 03:52:45,649 WARNING [train.py:1204] (2/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_41-234353-0_sp0.9 from training. Duration: 0.7555625 2023-10-03 03:52:48,911 WARNING [train.py:1204] (2/4) Exclude cut with ID _47251_210_18628_1_1533814063128_4137250_116-275927-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 03:52:48,943 WARNING [train.py:1204] (2/4) Exclude cut with ID _4697_210_2069_1_1526815801953_3823098_61-223324-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 03:52:48,962 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6961_210_2087_1_1527937168_6815011_562-165518-0_sp1.1 from training. Duration: 0.7 2023-10-03 03:52:50,249 WARNING [train.py:1204] (2/4) Exclude cut with ID _21989_210_5854_1_1531899214931_3271360_133-44759-0 from training. Duration: 0.77 2023-10-03 03:52:52,348 WARNING [train.py:1204] (2/4) Exclude cut with ID _23557_210_8750_1_1531890106370_6846255_77-284923-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 03:52:55,126 WARNING [train.py:1204] (2/4) Exclude cut with ID _13678_210_7497_1_1530530069226_3463170_464-296127-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 03:52:55,152 WARNING [train.py:1204] (2/4) Exclude cut with ID _22613_210_5219_1_1532253599686_4090170_393-36663-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 03:52:56,616 WARNING [train.py:1204] (2/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_235-262124-0_sp0.9 from training. Duration: 0.7444375 2023-10-03 03:52:57,923 WARNING [train.py:1204] (2/4) Exclude cut with ID _8480_210_1874_1_1528589391205_6728330_755-112625-0 from training. Duration: 0.74 2023-10-03 03:52:57,981 WARNING [train.py:1204] (2/4) Exclude cut with ID _6456_210_1534_1_1527681634212_3181049_215-309359-0 from training. Duration: 0.88 2023-10-03 03:52:59,304 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_3019_210_2028_1_1526044245_3264865_191-72235-0_sp1.1 from training. Duration: 0.8818125 2023-10-03 03:52:59,378 WARNING [train.py:1204] (2/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_380-162648-0 from training. Duration: 0.94 2023-10-03 03:53:00,831 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_92-46009-0 from training. Duration: 0.54 2023-10-03 03:53:00,847 WARNING [train.py:1204] (2/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_53-300544-0_sp0.9 from training. Duration: 0.7444375 2023-10-03 03:53:01,064 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.feed_forward1.hidden_balancer.prob, batch_count=1128373.3333333333, ans=0.125 2023-10-03 03:53:02,294 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_68235_210_1925_1_1535863247_4484656_101-250943-0_sp1.1 from training. Duration: 0.97275 2023-10-03 03:53:02,304 WARNING [train.py:1204] (2/4) Exclude cut with ID _14970_210_15709_1_1530775547361_1966965_204-22182-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 03:53:03,660 WARNING [train.py:1204] (2/4) Exclude cut with ID _55153_210_2253_1_1534384786677_3162096_310-130092-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 03:53:03,670 WARNING [train.py:1204] (2/4) Exclude cut with ID _60113_210_16271_1_1535164212481_4386079_559-167191-0_sp1.1 from training. Duration: 0.8454375 2023-10-03 03:53:04,998 WARNING [train.py:1204] (2/4) Exclude cut with ID _14368_210_4220_1_1530532714870_3595529_93-58002-0_sp1.1 from training. Duration: 0.5181875 2023-10-03 03:53:05,052 WARNING [train.py:1204] (2/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_110-224688-0 from training. Duration: 0.97 2023-10-03 03:53:05,370 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.ff2_skip_rate, batch_count=1128373.3333333333, ans=0.0 2023-10-03 03:53:06,437 WARNING [train.py:1204] (2/4) Exclude cut with ID _40402_210_18219_1_1533522324115_2953150_178-178004-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 03:53:06,453 WARNING [train.py:1204] (2/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_321-335475-0_sp1.1 from training. Duration: 0.4090625 2023-10-03 03:53:07,711 WARNING [train.py:1204] (2/4) Exclude cut with ID _66863_210_18053_1_1535698758334_8590319_1092-271784-0 from training. Duration: 0.96 2023-10-03 03:53:07,718 WARNING [train.py:1204] (2/4) Exclude cut with ID _28317_210_12742_1_1532480302381_8510325_607-213951-0_sp1.1 from training. Duration: 0.763625 2023-10-03 03:53:07,726 WARNING [train.py:1204] (2/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_468-283113-0 from training. Duration: 0.96 2023-10-03 03:53:10,536 WARNING [train.py:1204] (2/4) Exclude cut with ID _14922_210_6228_1_1530707435273_3693310_13-165512-0_sp1.1 from training. Duration: 0.7454375 2023-10-03 03:53:10,547 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_69630_210_7234_1_1535788670_910989_56-169762-0_sp1.1 from training. Duration: 0.92725 2023-10-03 03:53:14,951 WARNING [train.py:1204] (2/4) Exclude cut with ID _67335_210_5219_1_1535525975652_4275229_121-80357-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 03:53:15,012 WARNING [train.py:1204] (2/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_126-12079-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 03:53:17,000 WARNING [train.py:1204] (2/4) Exclude cut with ID _5294_210_1747_1_1527249643928_3696179_342-270278-0_sp1.1 from training. Duration: 0.5 2023-10-03 03:53:17,116 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_30322_210_1925_1_1532843478_826817_35-328264-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 03:53:19,698 WARNING [train.py:1204] (2/4) Exclude cut with ID _8480_210_1874_1_1528589391205_6728330_755-112625-0_sp1.1 from training. Duration: 0.67275 2023-10-03 03:53:21,656 WARNING [train.py:1204] (2/4) Exclude cut with ID _38670_210_15379_1_1533448810659_4391059_132-146412-0_sp1.1 from training. Duration: 0.97275 2023-10-03 03:53:22,887 INFO [train.py:1046] (2/4) Epoch 32, batch 4600, loss[loss=0.1495, simple_loss=0.2411, pruned_loss=0.02899, over 24469.00 frames. ], tot_loss[loss=0.1616, simple_loss=0.2395, pruned_loss=0.04183, over 4702188.69 frames. ], batch size: 66, lr: 3.17e-03, grad_scale: 16.0 2023-10-03 03:53:22,996 WARNING [train.py:1204] (2/4) Exclude cut with ID _22020_210_2461_1_1531724164513_6326900_563-39691-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 03:53:25,890 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_9460_210_4211_1_1528954587_10049667_572-37035-0_sp1.1 from training. Duration: 0.72725 2023-10-03 03:53:26,117 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.nonlin_attention.balancer.prob, batch_count=1128506.6666666667, ans=0.125 2023-10-03 03:53:27,171 WARNING [train.py:1204] (2/4) Exclude cut with ID _68777_210_4353_1_1535810390731_3117904_129-166012-0_sp0.9 from training. Duration: 0.9444375 2023-10-03 03:53:28,508 WARNING [train.py:1204] (2/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_487-91921-0_sp1.1 from training. Duration: 0.963625 2023-10-03 03:53:29,860 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_195-257512-0 from training. Duration: 0.67 2023-10-03 03:53:31,283 WARNING [train.py:1204] (2/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_188-260011-0_sp1.1 from training. Duration: 0.663625 2023-10-03 03:53:35,346 WARNING [train.py:1204] (2/4) Exclude cut with ID _8480_210_1874_1_1528589391205_6728330_309-139896-0_sp0.9 from training. Duration: 0.9 2023-10-03 03:53:35,571 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.conv_skip_rate, batch_count=1128573.3333333333, ans=0.0 2023-10-03 03:53:36,692 WARNING [train.py:1204] (2/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_866-321248-0_sp1.1 from training. Duration: 0.963625 2023-10-03 03:53:38,168 WARNING [train.py:1204] (2/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_510-216513-0_sp1.1 from training. Duration: 0.97275 2023-10-03 03:53:44,692 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.1.encoder.layers.0.conv_module1.whiten, num_groups=1, num_channels=256, metric=14.93 vs. limit=15.0 2023-10-03 03:53:45,203 WARNING [train.py:1204] (2/4) Exclude cut with ID _6653_210_2629_1_1527919172441_6732679_758-143677-0 from training. Duration: 0.98 2023-10-03 03:53:45,282 WARNING [train.py:1204] (2/4) Exclude cut with ID _55134_210_21982_1_1534590028377_3882170_50-214427-0_sp1.1 from training. Duration: 0.97275 2023-10-03 03:53:49,016 WARNING [train.py:1204] (2/4) Exclude cut with ID _31649_210_6228_1_1532584608063_4091078_482-301330-0_sp1.1 from training. Duration: 0.97275 2023-10-03 03:53:52,323 WARNING [train.py:1204] (2/4) Exclude cut with ID _35799_210_5854_1_1533263487887_3529071_490-326847-0_sp1.1 from training. Duration: 0.9 2023-10-03 03:53:52,330 WARNING [train.py:1204] (2/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_984-90716-0_sp1.1 from training. Duration: 0.963625 2023-10-03 03:53:57,416 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.3.encoder.layers.3.nonlin_attention.whiten2, num_groups=1, num_channels=512, metric=4.95 vs. limit=15.0 2023-10-03 03:53:58,166 WARNING [train.py:1204] (2/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_166-246557-0 from training. Duration: 0.64 2023-10-03 03:53:58,167 WARNING [train.py:1204] (2/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_51-200220-0_sp1.1 from training. Duration: 0.5090625 2023-10-03 03:53:59,529 WARNING [train.py:1204] (2/4) Exclude cut with ID _51382_210_23862_1_1534492801838_3650104_275-92796-0_sp1.1 from training. Duration: 0.936375 2023-10-03 03:54:03,676 WARNING [train.py:1204] (2/4) Exclude cut with ID _29853_210_7497_1_1532505600851_6642110_461-142829-0_sp1.1 from training. Duration: 0.97275 2023-10-03 03:54:03,721 WARNING [train.py:1204] (2/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_788-298907-0_sp0.9 from training. Duration: 0.988875 2023-10-03 03:54:06,317 WARNING [train.py:1204] (2/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_739-2098-0_sp1.1 from training. Duration: 0.836375 2023-10-03 03:54:07,856 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7543_210_6753_1_1528541907_1399322_165-323619-0 from training. Duration: 0.69 2023-10-03 03:54:10,473 WARNING [train.py:1204] (2/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_401-255491-0_sp1.1 from training. Duration: 0.636375 2023-10-03 03:54:11,989 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.0.ff2_skip_rate, batch_count=1128706.6666666667, ans=0.0 2023-10-03 03:54:13,302 WARNING [train.py:1204] (2/4) Exclude cut with ID _66873_210_5517_1_1535454136506_4293370_24-143283-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 03:54:16,007 WARNING [train.py:1204] (2/4) Exclude cut with ID _14459_210_2228_1_1531443641946_7516700_1221-280439-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 03:54:18,636 WARNING [train.py:1204] (2/4) Exclude cut with ID _59202_210_26984_1_1534816810612_3549174_469-213482-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 03:54:19,895 WARNING [train.py:1204] (2/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_148-158509-0_sp0.9 from training. Duration: 0.4555625 2023-10-03 03:54:19,924 WARNING [train.py:1204] (2/4) Exclude cut with ID _24314_210_4915_1_1532135078116_7170179_863-259095-0_sp1.1 from training. Duration: 0.97275 2023-10-03 03:54:19,994 WARNING [train.py:1204] (2/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_321-22821-0 from training. Duration: 0.89 2023-10-03 03:54:20,013 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_43831_210_2158_1_1533725502_5872791_55-163137-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 03:54:20,058 WARNING [train.py:1204] (2/4) Exclude cut with ID _27430_210_7770_1_1532604689375_1378590_26-25207-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 03:54:21,411 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_17613_210_2151_1_1531137569_2366160_223-316461-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 03:54:23,171 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_53679_210_4852_1_1534556726_4186141_541-213370-0_sp1.1 from training. Duration: 0.963625 2023-10-03 03:54:24,626 WARNING [train.py:1204] (2/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_369-295086-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 03:54:24,680 WARNING [train.py:1204] (2/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_91-267696-0 from training. Duration: 0.87 2023-10-03 03:54:26,727 WARNING [train.py:1204] (2/4) Exclude cut with ID _5294_210_1747_1_1527249643928_3696179_180-100713-0 from training. Duration: 0.81 2023-10-03 03:54:26,764 WARNING [train.py:1204] (2/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_32-335103-0 from training. Duration: 0.82 2023-10-03 03:54:26,770 WARNING [train.py:1204] (2/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_133-312727-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 03:54:28,052 WARNING [train.py:1204] (2/4) Exclude cut with ID _59288_210_5854_1_1535077757774_3412246_391-165934-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 03:54:28,323 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.balancer1.prob, batch_count=1128773.3333333333, ans=0.125 2023-10-03 03:54:29,362 WARNING [train.py:1204] (2/4) Exclude cut with ID _28323_210_16187_1_1532480395667_4100300_488-1281-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 03:54:30,893 WARNING [train.py:1204] (2/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_124-193207-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 03:54:35,486 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.attention_skip_rate, batch_count=1128840.0, ans=0.0 2023-10-03 03:54:36,605 INFO [train.py:1046] (2/4) Epoch 32, batch 4650, loss[loss=0.147, simple_loss=0.2257, pruned_loss=0.0342, over 20232.00 frames. ], tot_loss[loss=0.1612, simple_loss=0.2392, pruned_loss=0.0416, over 4699531.66 frames. ], batch size: 44, lr: 3.17e-03, grad_scale: 16.0 2023-10-03 03:54:38,168 WARNING [train.py:1204] (2/4) Exclude cut with ID _14087_210_2253_1_1530529224512_3014029_226-242140-0_sp0.9 from training. Duration: 0.9 2023-10-03 03:54:40,943 WARNING [train.py:1204] (2/4) Exclude cut with ID _29277_210_3279_1_1532581109293_3173069_392-3694-0_sp1.1 from training. Duration: 0.936375 2023-10-03 03:54:40,980 WARNING [train.py:1204] (2/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_563-329842-0_sp1.1 from training. Duration: 0.97275 2023-10-03 03:54:42,328 WARNING [train.py:1204] (2/4) Exclude cut with ID _58583_210_5236_1_1534726835266_3574249_391-87755-0_sp1.1 from training. Duration: 0.92725 2023-10-03 03:54:43,535 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.563e+02 1.926e+02 2.147e+02 2.476e+02 3.690e+02, threshold=4.294e+02, percent-clipped=0.0 2023-10-03 03:54:43,606 WARNING [train.py:1204] (2/4) Exclude cut with ID _28427_210_4220_1_1532581240967_3061069_281-237827-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 03:54:43,628 WARNING [train.py:1204] (2/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_44-147898-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 03:54:44,969 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_24007_210_1794_1_1531918925_1663875_125-320772-0_sp1.1 from training. Duration: 0.97275 2023-10-03 03:54:47,225 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.conv_module1.balancer1.prob, batch_count=1128840.0, ans=0.125 2023-10-03 03:54:48,379 WARNING [train.py:1204] (2/4) Exclude cut with ID _14368_210_4220_1_1530532714870_3595529_143-117421-0 from training. Duration: 0.58 2023-10-03 03:54:50,590 WARNING [train.py:1204] (2/4) Exclude cut with ID _69584_210_5219_1_1535785133775_4044450_325-7656-0_sp0.9 from training. Duration: 0.9555625 2023-10-03 03:54:50,848 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.self_attn_weights.pos_emb_skip_rate, batch_count=1128906.6666666667, ans=0.0 2023-10-03 03:54:51,778 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_37351_210_7925_1_1533012931_3812113_265-85780-0 from training. Duration: 0.88 2023-10-03 03:54:53,141 WARNING [train.py:1204] (2/4) Exclude cut with ID _18516_210_9206_1_1531704653206_3692406_331-237896-0_sp1.1 from training. Duration: 0.936375 2023-10-03 03:54:55,171 WARNING [train.py:1204] (2/4) Exclude cut with ID _14629_210_15709_1_1530604298182_956899_6-232695-0 from training. Duration: 0.94 2023-10-03 03:54:55,201 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7544_210_6753_1_1528587000_691468_87-178599-0_sp1.1 from training. Duration: 0.7909375 2023-10-03 03:54:55,250 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_326-303945-0 from training. Duration: 0.99 2023-10-03 03:54:55,264 WARNING [train.py:1204] (2/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_503-30719-0 from training. Duration: 0.98 2023-10-03 03:54:55,272 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_11305_210_1794_1_1529750428_7178148_299-344640-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 03:54:57,101 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_53071_210_16554_1_1534490405_2954066_265-170472-0_sp1.1 from training. Duration: 0.7545625 2023-10-03 03:54:59,851 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_174-277364-0_sp1.1 from training. Duration: 0.6454375 2023-10-03 03:55:01,213 WARNING [train.py:1204] (2/4) Exclude cut with ID _58414_210_8254_1_1535421609162_7151182_193-151884-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 03:55:02,501 WARNING [train.py:1204] (2/4) Exclude cut with ID IC0651W0104-322063 from training. Duration: 0.768 2023-10-03 03:55:05,156 WARNING [train.py:1204] (2/4) Exclude cut with ID _21881_210_3525_1_1531825242757_3798380_139-305982-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 03:55:06,562 WARNING [train.py:1204] (2/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_79-200392-0 from training. Duration: 0.97 2023-10-03 03:55:10,553 WARNING [train.py:1204] (2/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_451-280894-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 03:55:10,568 WARNING [train.py:1204] (2/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_304-273433-0_sp1.1 from training. Duration: 0.9 2023-10-03 03:55:10,721 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=1128973.3333333333, ans=0.1 2023-10-03 03:55:11,978 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_5959_210_1794_1_1527400400_3445726_412-44052-0 from training. Duration: 0.77 2023-10-03 03:55:13,388 WARNING [train.py:1204] (2/4) Exclude cut with ID _4681_210_3525_1_1527251040321_1235713_148-26159-0_sp1.1 from training. Duration: 0.963625 2023-10-03 03:55:16,164 WARNING [train.py:1204] (2/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_887-249555-0_sp0.9 from training. Duration: 0.8555625 2023-10-03 03:55:20,661 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_25236_210_2158_1_1532051968_280567_37-6860-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 03:55:25,297 WARNING [train.py:1204] (2/4) Exclude cut with ID _29277_210_3279_1_1532581109293_3173069_417-115527-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 03:55:26,731 WARNING [train.py:1204] (2/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_552-134748-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 03:55:28,125 WARNING [train.py:1204] (2/4) Exclude cut with ID _43176_210_8254_1_1533524417469_4174339_188-174143-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 03:55:28,167 WARNING [train.py:1204] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_127-119109-0_sp1.1 from training. Duration: 0.6545625 2023-10-03 03:55:30,146 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder_embed.convnext.layerdrop_rate, batch_count=1129040.0, ans=0.015 2023-10-03 03:55:33,291 WARNING [train.py:1204] (2/4) Exclude cut with ID _14922_210_6228_1_1530707435273_3693310_38-187851-0 from training. Duration: 0.62 2023-10-03 03:55:33,331 WARNING [train.py:1204] (2/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_588-125165-0 from training. Duration: 0.83 2023-10-03 03:55:33,410 WARNING [train.py:1204] (2/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_344-152526-0_sp1.1 from training. Duration: 0.4818125 2023-10-03 03:55:33,411 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7002_210_1794_1_1527935634_4034444_487-236501-0 from training. Duration: 0.66 2023-10-03 03:55:36,090 WARNING [train.py:1204] (2/4) Exclude cut with ID _23825_210_13449_1_1531915313526_3497150_379-16981-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 03:55:40,426 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=1129106.6666666667, ans=0.1 2023-10-03 03:55:40,488 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.balancer_na.min_abs, batch_count=1129106.6666666667, ans=0.02 2023-10-03 03:55:43,047 WARNING [train.py:1204] (2/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_297-5233-0_sp1.1 from training. Duration: 0.863625 2023-10-03 03:55:43,051 WARNING [train.py:1204] (2/4) Exclude cut with ID _41298_210_9019_1_1533430371450_4822143_178-283538-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 03:55:43,084 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7046_210_6753_1_1527895337_6920917_642-106799-0 from training. Duration: 0.92 2023-10-03 03:55:43,097 WARNING [train.py:1204] (2/4) Exclude cut with ID _6274_210_2084_1_1527937583837_7494020_532-291952-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 03:55:44,507 WARNING [train.py:1204] (2/4) Exclude cut with ID _47278_210_12041_1_1533902418349_3576110_260-270468-0_sp1.1 from training. Duration: 0.92725 2023-10-03 03:55:44,513 WARNING [train.py:1204] (2/4) Exclude cut with ID _14368_210_4220_1_1530532714870_3595529_144-3287-0_sp0.9 from training. Duration: 0.7444375 2023-10-03 03:55:45,963 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_162-262717-0_sp0.9 from training. Duration: 0.92225 2023-10-03 03:55:47,356 WARNING [train.py:1204] (2/4) Exclude cut with ID _3465_210_3525_1_1526175667060_3574049_62-335552-0_sp0.9 from training. Duration: 0.9666875 2023-10-03 03:55:47,361 WARNING [train.py:1204] (2/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_726-272852-0_sp1.1 from training. Duration: 0.92725 2023-10-03 03:55:47,538 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=1129106.6666666667, ans=0.1 2023-10-03 03:55:49,323 WARNING [train.py:1204] (2/4) Exclude cut with ID _14898_210_9206_1_1530766812377_4177190_26-135492-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 03:55:50,641 INFO [train.py:1046] (2/4) Epoch 32, batch 4700, loss[loss=0.1766, simple_loss=0.2485, pruned_loss=0.05231, over 22798.00 frames. ], tot_loss[loss=0.1616, simple_loss=0.2404, pruned_loss=0.0414, over 4719250.76 frames. ], batch size: 322, lr: 3.17e-03, grad_scale: 8.0 2023-10-03 03:55:50,993 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.balancer1.prob, batch_count=1129173.3333333333, ans=0.125 2023-10-03 03:55:52,258 WARNING [train.py:1204] (2/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_200-95304-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 03:55:52,305 WARNING [train.py:1204] (2/4) Exclude cut with ID _45012_210_12651_1_1533608435136_5371380_240-37101-0_sp1.1 from training. Duration: 0.8454375 2023-10-03 03:55:52,313 WARNING [train.py:1204] (2/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_132-269633-0_sp0.9 from training. Duration: 0.7555625 2023-10-03 03:55:53,673 WARNING [train.py:1204] (2/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_907-151398-0_sp0.9 from training. Duration: 0.411125 2023-10-03 03:55:55,035 WARNING [train.py:1204] (2/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_45-265684-0_sp0.9 from training. Duration: 0.8 2023-10-03 03:55:55,123 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6291_210_3273_1_1527683649_1078498_74-292608-0 from training. Duration: 0.72 2023-10-03 03:56:01,591 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.conv_module1.balancer1.prob, batch_count=1129173.3333333333, ans=0.125 2023-10-03 03:56:04,769 WARNING [train.py:1204] (2/4) Exclude cut with ID _59202_210_26984_1_1534816810612_3549174_221-84819-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 03:56:06,170 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_33696_210_3852_1_1533082780_2663375_181-325595-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 03:56:06,221 WARNING [train.py:1204] (2/4) Exclude cut with ID _47251_210_18628_1_1533814063128_4137250_351-154105-0_sp1.1 from training. Duration: 0.963625 2023-10-03 03:56:07,600 WARNING [train.py:1204] (2/4) Exclude cut with ID _35501_210_9288_1_1532998507986_3192189_351-334740-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 03:56:09,031 WARNING [train.py:1204] (2/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_342-100157-0_sp1.1 from training. Duration: 0.4909375 2023-10-03 03:56:12,514 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.2.encoder.layers.0.feed_forward1.out_whiten, num_groups=1, num_channels=384, metric=7.20 vs. limit=15.0 2023-10-03 03:56:13,320 WARNING [train.py:1204] (2/4) Exclude cut with ID _6789_210_1534_1_1527857937218_3118950_262-48853-0 from training. Duration: 0.56 2023-10-03 03:56:13,357 WARNING [train.py:1204] (2/4) Exclude cut with ID _69884_210_9771_1_1535846392818_4166169_430-201020-0 from training. Duration: 0.97 2023-10-03 03:56:17,305 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_71-136518-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 03:56:17,392 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_28-291982-0_sp1.1 from training. Duration: 0.8818125 2023-10-03 03:56:18,802 WARNING [train.py:1204] (2/4) Exclude cut with ID _54218_210_12210_1_1534413646388_4409259_437-275326-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 03:56:19,507 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.3.encoder.layers.2.conv_module2.whiten, num_groups=1, num_channels=512, metric=13.12 vs. limit=15.0 2023-10-03 03:56:22,183 WARNING [train.py:1204] (2/4) Exclude cut with ID _55134_210_21982_1_1534590028377_3882170_40-309139-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 03:56:28,564 WARNING [train.py:1204] (2/4) Exclude cut with ID _32704_210_8954_1_1533017488840_6747370_266-329755-0_sp1.1 from training. Duration: 0.8090625 2023-10-03 03:56:28,644 WARNING [train.py:1204] (2/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_21-174514-0_sp1.1 from training. Duration: 0.3909375 2023-10-03 03:56:31,263 WARNING [train.py:1204] (2/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_409-207378-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 03:56:35,996 WARNING [train.py:1204] (2/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_132-269633-0 from training. Duration: 0.68 2023-10-03 03:56:37,363 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_2245_210_2715_1_1525511548_3540254_310-52483-0_sp0.9 from training. Duration: 0.97775 2023-10-03 03:56:38,775 WARNING [train.py:1204] (2/4) Exclude cut with ID _27430_210_7770_1_1532604689375_1378590_47-77403-0_sp1.1 from training. Duration: 0.92725 2023-10-03 03:56:41,567 WARNING [train.py:1204] (2/4) Exclude cut with ID _7562_210_2084_1_1528887767916_3946229_186-326422-0 from training. Duration: 0.84 2023-10-03 03:56:42,991 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_14056_210_4163_1_1530604483_7572827_230-35993-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 03:56:47,161 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_23854_210_1794_1_1531969696_1329157_57-123491-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 03:56:47,200 WARNING [train.py:1204] (2/4) Exclude cut with ID _14747_210_8341_1_1530685797463_3654089_147-131229-0 from training. Duration: 0.76 2023-10-03 03:56:49,130 WARNING [train.py:1204] (2/4) Exclude cut with ID _30191_210_1664_1_1533189915047_7196670_430-19539-0_sp1.1 from training. Duration: 0.92725 2023-10-03 03:56:49,151 WARNING [train.py:1204] (2/4) Exclude cut with ID _67725_210_27767_1_1535608756671_5105359_44-50762-0_sp1.1 from training. Duration: 0.963625 2023-10-03 03:56:49,295 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.0.conv_skip_rate, batch_count=1129440.0, ans=0.0 2023-10-03 03:56:53,097 WARNING [train.py:1204] (2/4) Exclude cut with ID _27485_210_4916_1_1532739191677_3668269_217-161755-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 03:56:53,147 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_5931_210_2040_1_1527428481_4547999_321-193614-0_sp1.1 from training. Duration: 0.6090625 2023-10-03 03:56:54,481 WARNING [train.py:1204] (2/4) Exclude cut with ID _6267_210_2084_1_1527764658526_3894730_375-219887-0 from training. Duration: 0.67 2023-10-03 03:56:55,671 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9087W0022-538393 from training. Duration: 0.768 2023-10-03 03:56:57,536 WARNING [train.py:1204] (2/4) Exclude cut with ID _68777_210_4353_1_1535810390731_3117904_83-196459-0_sp1.1 from training. Duration: 0.963625 2023-10-03 03:56:57,762 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.conv_module2.balancer1.prob, batch_count=1129440.0, ans=0.125 2023-10-03 03:56:58,964 WARNING [train.py:1204] (2/4) Exclude cut with ID _69884_210_9771_1_1535846392818_4166169_455-343812-0_sp1.1 from training. Duration: 0.92725 2023-10-03 03:56:58,965 WARNING [train.py:1204] (2/4) Exclude cut with ID _13916_210_9994_1_1530788341126_3763149_414-320174-0_sp1.1 from training. Duration: 0.92725 2023-10-03 03:56:58,968 WARNING [train.py:1204] (2/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_398-239819-0 from training. Duration: 0.99 2023-10-03 03:57:00,424 WARNING [train.py:1204] (2/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_199-232314-0_sp1.1 from training. Duration: 0.92725 2023-10-03 03:57:03,240 WARNING [train.py:1204] (2/4) Exclude cut with ID _2334_210_2667_1_1525608238880_4705129_459-104997-0 from training. Duration: 0.95 2023-10-03 03:57:05,058 INFO [train.py:1046] (2/4) Epoch 32, batch 4750, loss[loss=0.1707, simple_loss=0.255, pruned_loss=0.04323, over 24534.00 frames. ], tot_loss[loss=0.1629, simple_loss=0.2418, pruned_loss=0.042, over 4711214.00 frames. ], batch size: 71, lr: 3.17e-03, grad_scale: 8.0 2023-10-03 03:57:07,177 WARNING [train.py:1204] (2/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_441-209395-0_sp1.1 from training. Duration: 0.936375 2023-10-03 03:57:09,786 WARNING [train.py:1204] (2/4) Exclude cut with ID _27052_210_5986_1_1532329197187_3585009_320-80393-0_sp1.1 from training. Duration: 0.97275 2023-10-03 03:57:14,007 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.533e+02 1.934e+02 2.171e+02 2.465e+02 4.386e+02, threshold=4.342e+02, percent-clipped=1.0 2023-10-03 03:57:14,108 WARNING [train.py:1204] (2/4) Exclude cut with ID _21783_210_11357_1_1531742458730_3762679_89-217253-0_sp1.1 from training. Duration: 0.97275 2023-10-03 03:57:14,135 WARNING [train.py:1204] (2/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_453-282588-0_sp1.1 from training. Duration: 0.8909375 2023-10-03 03:57:15,569 WARNING [train.py:1204] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_88-341718-0 from training. Duration: 0.66 2023-10-03 03:57:15,599 WARNING [train.py:1204] (2/4) Exclude cut with ID _43552_210_8913_1_1533726031059_3523429_230-3521-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 03:57:18,613 WARNING [train.py:1204] (2/4) Exclude cut with ID _9679_210_7497_1_1529129212906_4363929_512-313889-0 from training. Duration: 0.81 2023-10-03 03:57:19,956 WARNING [train.py:1204] (2/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_194-252800-0_sp1.1 from training. Duration: 0.8818125 2023-10-03 03:57:19,974 WARNING [train.py:1204] (2/4) Exclude cut with ID _55209_210_1531_1_1534675828996_4597160_239-293541-0_sp1.1 from training. Duration: 0.963625 2023-10-03 03:57:20,013 WARNING [train.py:1204] (2/4) Exclude cut with ID _35799_210_5854_1_1533263487887_3529071_15-325131-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 03:57:24,735 WARNING [train.py:1204] (2/4) Exclude cut with ID _14100_210_8341_1_1530529320613_3678069_279-55760-0 from training. Duration: 0.93 2023-10-03 03:57:29,344 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_489-230703-0_sp1.1 from training. Duration: 0.77275 2023-10-03 03:57:29,596 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.ff2_skip_rate, batch_count=1129573.3333333333, ans=0.0 2023-10-03 03:57:30,695 WARNING [train.py:1204] (2/4) Exclude cut with ID _6694_210_4599_1_1527852637490_3965529_144-185051-0 from training. Duration: 0.96 2023-10-03 03:57:30,756 WARNING [train.py:1204] (2/4) Exclude cut with ID _28461_210_2253_1_1532771775189_3041299_1-228155-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 03:57:33,936 WARNING [train.py:1204] (2/4) Exclude cut with ID _32704_210_8954_1_1533017488840_6747370_686-252323-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 03:57:33,938 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_40896_210_15710_1_1533433956_7100802_1075-294633-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 03:57:35,220 WARNING [train.py:1204] (2/4) Exclude cut with ID _22083_210_8254_1_1531702862439_3678970_30-92879-0_sp1.1 from training. Duration: 0.97275 2023-10-03 03:57:35,312 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9245W0121-616888 from training. Duration: 0.896 2023-10-03 03:57:35,315 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6786_210_1388_1_1527839699_4194495_378-46070-0 from training. Duration: 0.72 2023-10-03 03:57:42,543 WARNING [train.py:1204] (2/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_317-175062-0 from training. Duration: 0.82 2023-10-03 03:57:43,936 WARNING [train.py:1204] (2/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_238-89390-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 03:57:45,352 WARNING [train.py:1204] (2/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_524-310811-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 03:57:46,852 WARNING [train.py:1204] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_307-158462-0_sp0.9 from training. Duration: 0.7666875 2023-10-03 03:57:46,853 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9309W0191-640318 from training. Duration: 0.512 2023-10-03 03:57:46,857 WARNING [train.py:1204] (2/4) Exclude cut with ID _55153_210_2253_1_1534384786677_3162096_100-204617-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 03:57:51,444 WARNING [train.py:1204] (2/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_112-149213-0_sp1.1 from training. Duration: 0.863625 2023-10-03 03:57:52,854 WARNING [train.py:1204] (2/4) Exclude cut with ID _67831_210_12392_1_1535612464202_3092612_346-2148-0_sp1.1 from training. Duration: 0.8454375 2023-10-03 03:57:55,610 WARNING [train.py:1204] (2/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_51-117576-0_sp0.9 from training. Duration: 0.4444375 2023-10-03 03:57:55,638 WARNING [train.py:1204] (2/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_300-27235-0 from training. Duration: 0.87 2023-10-03 03:57:56,973 WARNING [train.py:1204] (2/4) Exclude cut with ID _40399_210_4353_1_1533305658194_3279406_195-116825-0_sp1.1 from training. Duration: 0.97275 2023-10-03 03:57:57,006 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7002_210_1794_1_1527939683_5157286_163-87552-0_sp0.9 from training. Duration: 0.9555625 2023-10-03 03:57:58,885 WARNING [train.py:1204] (2/4) Exclude cut with ID _19514_210_9259_1_1531530071330_5084230_560-237597-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 03:57:58,962 WARNING [train.py:1204] (2/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_41-234353-0_sp1.1 from training. Duration: 0.6181875 2023-10-03 03:57:58,982 WARNING [train.py:1204] (2/4) Exclude cut with ID _6274_210_2084_1_1527937583837_7494020_769-168302-0 from training. Duration: 0.71 2023-10-03 03:58:01,772 WARNING [train.py:1204] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_574-45541-0 from training. Duration: 0.65 2023-10-03 03:58:05,068 WARNING [train.py:1204] (2/4) Exclude cut with ID _50310_210_15488_1_1534068043345_3641080_262-67120-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 03:58:05,211 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.conv_skip_rate, batch_count=1129773.3333333333, ans=0.0 2023-10-03 03:58:09,089 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_53071_210_16554_1_1534493434_1276796_35-11927-0_sp1.1 from training. Duration: 0.963625 2023-10-03 03:58:09,091 WARNING [train.py:1204] (2/4) Exclude cut with ID _3992_210_1747_1_1526644816031_3731060_107-251434-0 from training. Duration: 0.99 2023-10-03 03:58:09,143 WARNING [train.py:1204] (2/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_412-54488-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 03:58:10,579 WARNING [train.py:1204] (2/4) Exclude cut with ID _18516_210_9206_1_1531704653206_3692406_77-258490-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 03:58:11,075 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.5.encoder.layers.0.whiten, num_groups=1, num_channels=256, metric=2.18 vs. limit=12.0 2023-10-03 03:58:13,811 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_40047_210_2290_1_1533290519_2374311_195-165693-0_sp0.9 from training. Duration: 0.988875 2023-10-03 03:58:13,865 WARNING [train.py:1204] (2/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_296-36971-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 03:58:15,301 WARNING [train.py:1204] (2/4) Exclude cut with ID _14970_210_15709_1_1530773543493_1715730_187-20259-0_sp1.1 from training. Duration: 0.8090625 2023-10-03 03:58:16,932 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=1129773.3333333333, ans=0.1 2023-10-03 03:58:17,627 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.1.encoder.layers.0.feed_forward1.out_whiten, num_groups=1, num_channels=256, metric=5.30 vs. limit=15.0 2023-10-03 03:58:18,055 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_53373_210_5399_1_1534229471_4715695_135-305927-0_sp1.1 from training. Duration: 0.936375 2023-10-03 03:58:18,086 WARNING [train.py:1204] (2/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_60-18123-0 from training. Duration: 0.88 2023-10-03 03:58:18,152 WARNING [train.py:1204] (2/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_166-166990-0 from training. Duration: 0.96 2023-10-03 03:58:19,443 INFO [train.py:1046] (2/4) Epoch 32, batch 4800, loss[loss=0.1742, simple_loss=0.2515, pruned_loss=0.04847, over 23324.00 frames. ], tot_loss[loss=0.1632, simple_loss=0.2422, pruned_loss=0.04207, over 4717247.89 frames. ], batch size: 93, lr: 3.17e-03, grad_scale: 16.0 2023-10-03 03:58:19,565 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_5438_210_1794_1_1527336687_2600009_233-58072-0 from training. Duration: 0.77 2023-10-03 03:58:21,346 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.4.encoder.layers.1.conv_module1.whiten, num_groups=1, num_channels=384, metric=4.98 vs. limit=15.0 2023-10-03 03:58:24,080 WARNING [train.py:1204] (2/4) Exclude cut with ID _8833_210_5219_1_1528592665210_7553059_611-80313-0_sp0.9 from training. Duration: 0.711125 2023-10-03 03:58:24,103 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_53555_210_5632_1_1534595220_7232297_118-43239-0_sp1.1 from training. Duration: 0.936375 2023-10-03 03:58:24,191 WARNING [train.py:1204] (2/4) Exclude cut with ID _12369_210_4381_1_1530356078581_4388520_480-13146-0 from training. Duration: 0.85 2023-10-03 03:58:28,429 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_892-28814-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 03:58:29,787 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_24899_210_16554_1_1532046422_3827462_160-21255-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 03:58:34,985 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_154-316450-0_sp1.1 from training. Duration: 0.6909375 2023-10-03 03:58:36,330 WARNING [train.py:1204] (2/4) Exclude cut with ID _30196_210_1664_1_1533708496522_7074140_1225-301232-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 03:58:37,704 WARNING [train.py:1204] (2/4) Exclude cut with ID _28783_210_7943_1_1532671164808_7155479_414-208100-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 03:58:39,065 WARNING [train.py:1204] (2/4) Exclude cut with ID _19023_210_4353_1_1531378892552_5820780_83-254794-0 from training. Duration: 0.86 2023-10-03 03:58:40,512 WARNING [train.py:1204] (2/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_591-83752-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 03:58:40,560 WARNING [train.py:1204] (2/4) Exclude cut with ID _29877_210_6228_1_1532397791106_5276149_266-62291-0_sp1.1 from training. Duration: 0.7909375 2023-10-03 03:58:42,109 WARNING [train.py:1204] (2/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_280-161633-0_sp1.1 from training. Duration: 0.663625 2023-10-03 03:58:46,553 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7543_210_6753_1_1528539468_333764_39-156634-0_sp1.1 from training. Duration: 0.97275 2023-10-03 03:58:47,115 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.4.encoder.layers.1.whiten, num_groups=1, num_channels=384, metric=3.95 vs. limit=12.0 2023-10-03 03:58:47,895 WARNING [train.py:1204] (2/4) Exclude cut with ID _67499_210_17282_1_1535509805220_3692430_505-312248-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 03:58:47,926 WARNING [train.py:1204] (2/4) Exclude cut with ID _5294_210_1747_1_1527249643928_3696179_180-100713-0_sp0.9 from training. Duration: 0.9 2023-10-03 03:58:49,288 WARNING [train.py:1204] (2/4) Exclude cut with ID _26528_210_9070_1_1532570410842_3851310_508-148132-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 03:58:49,300 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_211-339622-0_sp1.1 from training. Duration: 0.4090625 2023-10-03 03:58:49,313 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_40896_210_15710_1_1533433956_7100802_339-138356-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 03:58:49,394 WARNING [train.py:1204] (2/4) Exclude cut with ID _27060_210_5986_1_1532674838922_3522549_211-19933-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 03:58:51,086 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.conv_skip_rate, batch_count=1129973.3333333333, ans=0.0 2023-10-03 03:58:52,192 WARNING [train.py:1204] (2/4) Exclude cut with ID _60229_210_27767_1_1534838406158_3655350_457-117923-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 03:58:54,227 WARNING [train.py:1204] (2/4) Exclude cut with ID _55429_210_10626_1_1534467588527_7174143_460-141439-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 03:58:56,836 WARNING [train.py:1204] (2/4) Exclude cut with ID _59170_210_6721_1_1534983633960_4203335_263-10780-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 03:58:56,848 WARNING [train.py:1204] (2/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_872-264525-0_sp0.9 from training. Duration: 0.72225 2023-10-03 03:58:56,922 WARNING [train.py:1204] (2/4) Exclude cut with ID _7624_210_1531_1_1528109802198_4933101_418-349834-0_sp0.9 from training. Duration: 0.6666875 2023-10-03 03:58:57,010 WARNING [train.py:1204] (2/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_27-53998-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 03:58:58,366 WARNING [train.py:1204] (2/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_202-183922-0 from training. Duration: 0.85 2023-10-03 03:58:58,378 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7002_210_1794_1_1527939683_5157286_1-281264-0 from training. Duration: 0.44 2023-10-03 03:58:59,796 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_53753_210_7234_1_1534320003_3594148_493-9314-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 03:59:01,154 WARNING [train.py:1204] (2/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_217-95056-0_sp1.1 from training. Duration: 0.8909375 2023-10-03 03:59:01,197 WARNING [train.py:1204] (2/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_406-45086-0_sp1.1 from training. Duration: 0.72725 2023-10-03 03:59:01,203 WARNING [train.py:1204] (2/4) Exclude cut with ID _31649_210_6228_1_1532584608063_4091078_505-213821-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 03:59:01,217 WARNING [train.py:1204] (2/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_317-175062-0_sp0.9 from training. Duration: 0.911125 2023-10-03 03:59:03,363 WARNING [train.py:1204] (2/4) Exclude cut with ID _5444_210_2151_1_1527418808464_5312210_726-27165-0_sp1.1 from training. Duration: 0.6090625 2023-10-03 03:59:04,550 WARNING [train.py:1204] (2/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_494-170437-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 03:59:07,431 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_474-64054-0_sp1.1 from training. Duration: 0.936375 2023-10-03 03:59:08,830 WARNING [train.py:1204] (2/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_521-7556-0_sp1.1 from training. Duration: 0.97275 2023-10-03 03:59:09,034 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.conv_module2.balancer1.min_positive, batch_count=1130040.0, ans=0.025 2023-10-03 03:59:10,218 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_269-348505-0_sp1.1 from training. Duration: 0.92725 2023-10-03 03:59:16,138 WARNING [train.py:1204] (2/4) Exclude cut with ID _21878_210_3525_1_1531738772002_7264330_559-184862-0 from training. Duration: 0.94 2023-10-03 03:59:16,178 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_68702_210_23689_1_1535799577_3420068_92-76040-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 03:59:17,464 WARNING [train.py:1204] (2/4) Exclude cut with ID _22826_210_9994_1_1531908201522_3279169_227-102052-0_sp1.1 from training. Duration: 0.97275 2023-10-03 03:59:17,505 WARNING [train.py:1204] (2/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_718-250907-0_sp0.9 from training. Duration: 0.7666875 2023-10-03 03:59:17,646 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.0.feed_forward1.out_proj.dropout_p, batch_count=1130106.6666666667, ans=0.1 2023-10-03 03:59:18,909 WARNING [train.py:1204] (2/4) Exclude cut with ID _14069_210_5632_1_1530512692033_4803390_352-143891-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 03:59:23,566 WARNING [train.py:1204] (2/4) Exclude cut with ID _6267_210_2084_1_1527764658526_3894730_65-106602-0_sp1.1 from training. Duration: 0.936375 2023-10-03 03:59:23,636 WARNING [train.py:1204] (2/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_202-183922-0_sp0.9 from training. Duration: 0.9444375 2023-10-03 03:59:23,643 WARNING [train.py:1204] (2/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_465-268827-0_sp1.1 from training. Duration: 0.97275 2023-10-03 03:59:24,951 WARNING [train.py:1204] (2/4) Exclude cut with ID _12369_210_4381_1_1530356078581_4388520_58-29064-0_sp1.1 from training. Duration: 0.9 2023-10-03 03:59:24,989 WARNING [train.py:1204] (2/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_1-99680-0_sp0.9 from training. Duration: 0.7444375 2023-10-03 03:59:26,285 WARNING [train.py:1204] (2/4) Exclude cut with ID _13253_210_1925_1_1530757775819_3577873_191-144977-0_sp1.1 from training. Duration: 0.7090625 2023-10-03 03:59:26,530 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.conv_module2.balancer1.prob, batch_count=1130106.6666666667, ans=0.125 2023-10-03 03:59:29,111 WARNING [train.py:1204] (2/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_276-112684-0_sp1.1 from training. Duration: 0.92725 2023-10-03 03:59:29,119 WARNING [train.py:1204] (2/4) Exclude cut with ID _64639_210_5816_1_1535367619437_3528000_358-411-0_sp1.1 from training. Duration: 0.97275 2023-10-03 03:59:29,127 WARNING [train.py:1204] (2/4) Exclude cut with ID _31649_210_6228_1_1532584608063_4091078_106-21199-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 03:59:30,442 WARNING [train.py:1204] (2/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_901-79438-0 from training. Duration: 0.94 2023-10-03 03:59:33,561 INFO [train.py:1046] (2/4) Epoch 32, batch 4850, loss[loss=0.1722, simple_loss=0.2537, pruned_loss=0.04541, over 23165.00 frames. ], tot_loss[loss=0.1641, simple_loss=0.2432, pruned_loss=0.04251, over 4713036.12 frames. ], batch size: 93, lr: 3.17e-03, grad_scale: 16.0 2023-10-03 03:59:34,307 WARNING [train.py:1204] (2/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_718-250907-0 from training. Duration: 0.69 2023-10-03 03:59:34,313 WARNING [train.py:1204] (2/4) Exclude cut with ID _5287_210_2084_1_1527418883040_3878670_363-194161-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 03:59:34,316 WARNING [train.py:1204] (2/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_586-321321-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 03:59:35,487 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_68900_210_2087_1_1535713157_3708529_164-292661-0_sp1.1 from training. Duration: 0.963625 2023-10-03 03:59:35,488 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_26433_210_12108_1_1532157158_4056480_114-144878-0_sp1.1 from training. Duration: 0.97275 2023-10-03 03:59:38,316 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_15517_210_6753_1_1530954008_3487336_96-341874-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 03:59:38,525 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.nonlin_attention.balancer.prob, batch_count=1130173.3333333333, ans=0.125 2023-10-03 03:59:42,445 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.589e+02 1.847e+02 2.113e+02 2.344e+02 3.781e+02, threshold=4.226e+02, percent-clipped=0.0 2023-10-03 03:59:44,399 WARNING [train.py:1204] (2/4) Exclude cut with ID _14268_210_9206_1_1530622771285_4714099_637-86630-0 from training. Duration: 0.51 2023-10-03 03:59:45,929 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.self_attn_weights.pos_emb_skip_rate, batch_count=1130173.3333333333, ans=0.0 2023-10-03 03:59:47,130 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_28211_210_6388_1_1532861506_4523224_121-320092-0_sp1.1 from training. Duration: 0.92725 2023-10-03 03:59:51,338 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_141-89113-0_sp0.9 from training. Duration: 0.9555625 2023-10-03 03:59:52,749 WARNING [train.py:1204] (2/4) Exclude cut with ID _6789_210_1534_1_1527857937218_3118950_311-61262-0_sp1.1 from training. Duration: 0.5909375 2023-10-03 03:59:52,780 WARNING [train.py:1204] (2/4) Exclude cut with ID _58480_210_12392_1_1534811121111_3646160_389-200461-0_sp1.1 from training. Duration: 0.97275 2023-10-03 03:59:55,531 WARNING [train.py:1204] (2/4) Exclude cut with ID _41326_210_5854_1_1533778076509_3492080_28-156553-0_sp1.1 from training. Duration: 0.92725 2023-10-03 03:59:57,403 WARNING [train.py:1204] (2/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_577-287150-0_sp1.1 from training. Duration: 0.7454375 2023-10-03 03:59:58,790 WARNING [train.py:1204] (2/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_40-129756-0_sp1.1 from training. Duration: 0.736375 2023-10-03 03:59:58,809 WARNING [train.py:1204] (2/4) Exclude cut with ID _60113_210_16271_1_1535164212481_4386079_490-280290-0 from training. Duration: 0.98 2023-10-03 04:00:02,685 WARNING [train.py:1204] (2/4) Exclude cut with ID _58059_210_5854_1_1534901471149_3164808_34-39905-0_sp1.1 from training. Duration: 0.963625 2023-10-03 04:00:02,892 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.0.bypass_mid.scale_min, batch_count=1130306.6666666667, ans=0.2 2023-10-03 04:00:06,720 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_791-197891-0_sp1.1 from training. Duration: 0.763625 2023-10-03 04:00:06,734 WARNING [train.py:1204] (2/4) Exclude cut with ID _6789_210_1534_1_1527857937218_3118950_262-48853-0_sp1.1 from training. Duration: 0.5090625 2023-10-03 04:00:06,793 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_2655_210_2087_1_1525688959_3897400_52-279899-0_sp0.9 from training. Duration: 0.7666875 2023-10-03 04:00:06,797 WARNING [train.py:1204] (2/4) Exclude cut with ID _14884_210_2252_1_1530707459689_5561250_114-201399-0 from training. Duration: 0.76 2023-10-03 04:00:09,576 WARNING [train.py:1204] (2/4) Exclude cut with ID _19023_210_4353_1_1531378892552_5820780_83-254794-0_sp0.9 from training. Duration: 0.9555625 2023-10-03 04:00:09,598 WARNING [train.py:1204] (2/4) Exclude cut with ID _53489_210_6228_1_1534298357169_3835970_10-297919-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 04:00:12,551 WARNING [train.py:1204] (2/4) Exclude cut with ID _44585_210_18628_1_1533605350387_4518199_418-3055-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 04:00:12,562 WARNING [train.py:1204] (2/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_310-109461-0 from training. Duration: 0.8 2023-10-03 04:00:13,902 WARNING [train.py:1204] (2/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_235-262124-0 from training. Duration: 0.67 2023-10-03 04:00:15,827 WARNING [train.py:1204] (2/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_195-167011-0_sp1.1 from training. Duration: 0.6818125 2023-10-03 04:00:17,503 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.ff3_skip_rate, batch_count=1130373.3333333333, ans=0.0 2023-10-03 04:00:22,872 WARNING [train.py:1204] (2/4) Exclude cut with ID _7852_210_5219_1_1528286471316_8139400_188-55162-0_sp1.1 from training. Duration: 0.936375 2023-10-03 04:00:22,920 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_35361_210_4238_1_1533262810_7793476_675-87012-0 from training. Duration: 0.89 2023-10-03 04:00:23,122 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.feed_forward3.hidden_balancer.prob, batch_count=1130373.3333333333, ans=0.125 2023-10-03 04:00:24,287 WARNING [train.py:1204] (2/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_465-81427-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 04:00:24,301 WARNING [train.py:1204] (2/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_597-154640-0_sp1.1 from training. Duration: 0.8181875 2023-10-03 04:00:25,692 WARNING [train.py:1204] (2/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_186-309161-0_sp0.9 from training. Duration: 0.6 2023-10-03 04:00:25,912 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.feed_forward1.hidden_balancer.prob, batch_count=1130373.3333333333, ans=0.125 2023-10-03 04:00:27,009 WARNING [train.py:1204] (2/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_546-119312-0 from training. Duration: 0.99 2023-10-03 04:00:27,013 WARNING [train.py:1204] (2/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_330-240060-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 04:00:27,086 WARNING [train.py:1204] (2/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_477-255782-0 from training. Duration: 0.82 2023-10-03 04:00:28,966 WARNING [train.py:1204] (2/4) Exclude cut with ID _14970_210_15709_1_1530773543493_1715730_150-248939-0_sp1.1 from training. Duration: 0.97275 2023-10-03 04:00:30,263 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_556-206615-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 04:00:30,325 WARNING [train.py:1204] (2/4) Exclude cut with ID _3465_210_3525_1_1526175667060_3574049_308-231735-0 from training. Duration: 0.73 2023-10-03 04:00:39,836 WARNING [train.py:1204] (2/4) Exclude cut with ID _27485_210_4916_1_1532739191677_3668269_66-236571-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 04:00:43,954 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_2221_210_1988_1_1525597051_4935541_415-296882-0_sp0.9 from training. Duration: 0.8555625 2023-10-03 04:00:43,961 WARNING [train.py:1204] (2/4) Exclude cut with ID _64521_210_14699_1_1535243427259_3815328_231-78630-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 04:00:48,770 INFO [train.py:1046] (2/4) Epoch 32, batch 4900, loss[loss=0.1591, simple_loss=0.2418, pruned_loss=0.03819, over 24436.00 frames. ], tot_loss[loss=0.1633, simple_loss=0.2422, pruned_loss=0.04222, over 4709296.68 frames. ], batch size: 69, lr: 3.17e-03, grad_scale: 16.0 2023-10-03 04:00:50,287 WARNING [train.py:1204] (2/4) Exclude cut with ID _13942_210_9988_1_1530604906036_3733360_499-55676-0 from training. Duration: 0.62 2023-10-03 04:00:50,289 WARNING [train.py:1204] (2/4) Exclude cut with ID _49376_210_5986_1_1533985278732_3370740_301-154763-0_sp1.1 from training. Duration: 0.8909375 2023-10-03 04:00:54,461 WARNING [train.py:1204] (2/4) Exclude cut with ID _27485_210_4916_1_1532739191677_3668269_255-216698-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 04:00:54,681 INFO [scaling.py:1118] (2/4) WithLoss: name=encoder.encoders.2.encoder.layers.0.self_attn_weights, loss-sum=0.000e+00 2023-10-03 04:00:57,129 WARNING [train.py:1204] (2/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_137-334143-0_sp1.1 from training. Duration: 0.97275 2023-10-03 04:00:57,172 WARNING [train.py:1204] (2/4) Exclude cut with ID _6456_210_1534_1_1527681634212_3181049_215-309359-0_sp1.1 from training. Duration: 0.8 2023-10-03 04:00:58,616 WARNING [train.py:1204] (2/4) Exclude cut with ID _65640_210_6310_1_1535368201445_3173250_209-13008-0 from training. Duration: 0.99 2023-10-03 04:01:05,173 WARNING [train.py:1204] (2/4) Exclude cut with ID _12369_210_4381_1_1530356078581_4388520_58-29064-0 from training. Duration: 0.99 2023-10-03 04:01:05,368 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.conv_skip_rate, batch_count=1130573.3333333333, ans=0.0 2023-10-03 04:01:09,079 WARNING [train.py:1204] (2/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_410-287857-0 from training. Duration: 0.93 2023-10-03 04:01:10,904 WARNING [train.py:1204] (2/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_153-153537-0 from training. Duration: 0.91 2023-10-03 04:01:10,934 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7544_210_6753_1_1528587722_3576686_172-171750-0_sp0.9 from training. Duration: 0.72225 2023-10-03 04:01:10,968 WARNING [train.py:1204] (2/4) Exclude cut with ID _64521_210_14699_1_1535243427259_3815328_464-183853-0_sp1.1 from training. Duration: 0.97275 2023-10-03 04:01:10,980 WARNING [train.py:1204] (2/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_385-210308-0_sp1.1 from training. Duration: 0.963625 2023-10-03 04:01:10,987 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_46013_210_7234_1_1533776362_3744032_354-261865-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 04:01:11,001 WARNING [train.py:1204] (2/4) Exclude cut with ID _3067_210_2667_1_1526212974640_4503168_250-285937-0_sp1.1 from training. Duration: 0.72725 2023-10-03 04:01:11,051 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_40036_210_13405_1_1533648647_1315388_48-146016-0 from training. Duration: 0.93 2023-10-03 04:01:12,840 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.5.encoder.layers.1.self_attn_weights.whiten_keys, num_groups=4, num_channels=128, metric=1.53 vs. limit=6.0 2023-10-03 04:01:13,759 WARNING [train.py:1204] (2/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_280-161633-0 from training. Duration: 0.73 2023-10-03 04:01:14,446 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.2.encoder.layers.0.self_attn2.whiten, num_groups=1, num_channels=384, metric=15.87 vs. limit=22.5 2023-10-03 04:01:15,016 WARNING [train.py:1204] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_308-161067-0_sp0.9 from training. Duration: 0.7333125 2023-10-03 04:01:15,114 WARNING [train.py:1204] (2/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_68-212801-0_sp0.9 from training. Duration: 0.811125 2023-10-03 04:01:15,829 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.2.encoder.layers.2.conv_module1.whiten, num_groups=1, num_channels=384, metric=6.09 vs. limit=15.0 2023-10-03 04:01:17,092 WARNING [train.py:1204] (2/4) Exclude cut with ID _6789_210_1534_1_1527857937218_3118950_311-61262-0_sp0.9 from training. Duration: 0.72225 2023-10-03 04:01:19,616 WARNING [train.py:1204] (2/4) Exclude cut with ID _14632_210_9988_1_1530691279392_3923200_102-204477-0_sp1.1 from training. Duration: 0.8818125 2023-10-03 04:01:19,687 WARNING [train.py:1204] (2/4) Exclude cut with ID _65242_210_18628_1_1535369393717_3823149_290-117517-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 04:01:22,280 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_69630_210_7234_1_1535788670_910989_35-337140-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 04:01:22,290 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_5618_210_1874_1_1527311008_5851_1-214501-0 from training. Duration: 0.689 2023-10-03 04:01:24,015 WARNING [train.py:1204] (2/4) Exclude cut with ID _8064_210_5399_1_1528372299291_4282973_168-50337-0_sp0.9 from training. Duration: 0.9333125 2023-10-03 04:01:25,390 WARNING [train.py:1204] (2/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_185-14438-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 04:01:25,409 WARNING [train.py:1204] (2/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_61-327062-0 from training. Duration: 0.81 2023-10-03 04:01:25,414 WARNING [train.py:1204] (2/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_98-741-0 from training. Duration: 0.8 2023-10-03 04:01:26,012 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.2.encoder.layers.1.self_attn2.whiten, num_groups=1, num_channels=384, metric=16.67 vs. limit=22.5 2023-10-03 04:01:29,855 WARNING [train.py:1204] (2/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_312-204798-0 from training. Duration: 0.93 2023-10-03 04:01:31,638 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.conv_skip_rate, batch_count=1130706.6666666667, ans=0.0 2023-10-03 04:01:32,664 WARNING [train.py:1204] (2/4) Exclude cut with ID _14998_210_5219_1_1530705582080_7474970_787-294416-0_sp0.9 from training. Duration: 0.911125 2023-10-03 04:01:34,085 WARNING [train.py:1204] (2/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_140-181176-0_sp1.1 from training. Duration: 0.836375 2023-10-03 04:01:34,117 WARNING [train.py:1204] (2/4) Exclude cut with ID _14687_210_5854_1_1530703838259_3664889_115-239046-0_sp1.1 from training. Duration: 0.6090625 2023-10-03 04:01:35,926 WARNING [train.py:1204] (2/4) Exclude cut with ID _56423_210_5854_1_1534896061049_3165969_370-230614-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 04:01:35,980 WARNING [train.py:1204] (2/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_406-275649-0_sp0.9 from training. Duration: 0.4666875 2023-10-03 04:01:35,991 WARNING [train.py:1204] (2/4) Exclude cut with ID _15150_210_8341_1_1530752411803_7256384_22-138274-0_sp1.1 from training. Duration: 0.87275 2023-10-03 04:01:37,391 WARNING [train.py:1204] (2/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_125-125818-0 from training. Duration: 0.96 2023-10-03 04:01:38,915 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_4399_210_1874_1_1526705485_2400757_220-47974-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 04:01:39,049 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.1.balancer2.prob, batch_count=1130706.6666666667, ans=0.125 2023-10-03 04:01:41,617 WARNING [train.py:1204] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_308-161067-0_sp1.1 from training. Duration: 0.6 2023-10-03 04:01:43,604 WARNING [train.py:1204] (2/4) Exclude cut with ID _53489_210_6228_1_1534298357169_3835970_102-57645-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 04:01:46,336 WARNING [train.py:1204] (2/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_178-177582-0 from training. Duration: 0.74 2023-10-03 04:01:46,402 WARNING [train.py:1204] (2/4) Exclude cut with ID _14349_210_8341_1_1530615710818_3442470_170-336541-0_sp1.1 from training. Duration: 0.7818125 2023-10-03 04:01:46,474 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_521-214276-0_sp0.9 from training. Duration: 0.488875 2023-10-03 04:01:46,511 WARNING [train.py:1204] (2/4) Exclude cut with ID _66070_210_7640_1_1535441393096_7805310_489-292631-0 from training. Duration: 0.79 2023-10-03 04:01:53,716 WARNING [train.py:1204] (2/4) Exclude cut with ID _14069_210_5632_1_1530512692033_4803390_438-340023-0_sp1.1 from training. Duration: 0.936375 2023-10-03 04:01:55,063 WARNING [train.py:1204] (2/4) Exclude cut with ID _7852_210_5219_1_1528286471316_8139400_242-105336-0_sp1.1 from training. Duration: 0.6545625 2023-10-03 04:01:55,181 WARNING [train.py:1204] (2/4) Exclude cut with ID _3867_210_1883_1_1526641200575_4475830_272-215563-0 from training. Duration: 0.57 2023-10-03 04:01:55,189 WARNING [train.py:1204] (2/4) Exclude cut with ID _14368_210_4220_1_1530532714870_3595529_143-117421-0_sp0.9 from training. Duration: 0.6444375 2023-10-03 04:01:55,199 WARNING [train.py:1204] (2/4) Exclude cut with ID _9235_210_5219_1_1528891154253_8046120_30-326681-0_sp1.1 from training. Duration: 0.7545625 2023-10-03 04:01:57,929 WARNING [train.py:1204] (2/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_538-218448-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 04:02:00,007 WARNING [train.py:1204] (2/4) Exclude cut with ID _33640_210_2655_1_1533117605479_4855469_60-192970-0_sp1.1 from training. Duration: 0.963625 2023-10-03 04:02:00,182 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.conv_module2.balancer1.prob, batch_count=1130773.3333333333, ans=0.125 2023-10-03 04:02:00,218 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.conv_module1.balancer2.prob, batch_count=1130773.3333333333, ans=0.125 2023-10-03 04:02:01,274 WARNING [train.py:1204] (2/4) Exclude cut with ID _65585_210_12210_1_1535450451584_3764098_112-294301-0_sp1.1 from training. Duration: 0.8 2023-10-03 04:02:01,312 WARNING [train.py:1204] (2/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_643-37412-0_sp1.1 from training. Duration: 0.936375 2023-10-03 04:02:01,343 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_9302_210_2550_1_1528889962_4655416_638-301620-0_sp1.1 from training. Duration: 0.42725 2023-10-03 04:02:01,499 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.attention_skip_rate, batch_count=1130840.0, ans=0.0 2023-10-03 04:02:02,562 INFO [train.py:1046] (2/4) Epoch 32, batch 4950, loss[loss=0.166, simple_loss=0.2491, pruned_loss=0.04149, over 23608.00 frames. ], tot_loss[loss=0.1619, simple_loss=0.2407, pruned_loss=0.04155, over 4698930.83 frames. ], batch size: 94, lr: 3.17e-03, grad_scale: 16.0 2023-10-03 04:02:02,712 WARNING [train.py:1204] (2/4) Exclude cut with ID _3867_210_1883_1_1526641200575_4475830_272-215563-0_sp1.1 from training. Duration: 0.5181875 2023-10-03 04:02:07,345 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_53679_210_4852_1_1534556726_4186141_358-154143-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 04:02:07,366 WARNING [train.py:1204] (2/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_841-341679-0_sp0.9 from training. Duration: 0.6444375 2023-10-03 04:02:07,616 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.feed_forward1.hidden_balancer.prob, batch_count=1130840.0, ans=0.125 2023-10-03 04:02:08,894 WARNING [train.py:1204] (2/4) Exclude cut with ID _7160_210_1876_1_1527940923231_3703799_302-150728-0 from training. Duration: 0.78 2023-10-03 04:02:08,920 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_125-203105-0 from training. Duration: 0.8 2023-10-03 04:02:10,694 WARNING [train.py:1204] (2/4) Exclude cut with ID _5585_210_2667_1_1527418813324_8007340_89-332072-0_sp0.9 from training. Duration: 0.711125 2023-10-03 04:02:10,765 WARNING [train.py:1204] (2/4) Exclude cut with ID _3992_210_1747_1_1526644816031_3731060_36-147855-0 from training. Duration: 0.78 2023-10-03 04:02:10,784 WARNING [train.py:1204] (2/4) Exclude cut with ID _59256_210_5986_1_1535076085997_3589120_172-239045-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 04:02:10,800 WARNING [train.py:1204] (2/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_472-216663-0_sp1.1 from training. Duration: 0.836375 2023-10-03 04:02:11,964 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.595e+02 1.926e+02 2.133e+02 2.555e+02 3.988e+02, threshold=4.267e+02, percent-clipped=0.0 2023-10-03 04:02:12,079 WARNING [train.py:1204] (2/4) Exclude cut with ID _36512_210_6987_1_1533033143998_3126069_181-306500-0_sp1.1 from training. Duration: 0.863625 2023-10-03 04:02:12,095 WARNING [train.py:1204] (2/4) Exclude cut with ID _56524_210_9642_1_1534572011779_3619170_492-95008-0_sp1.1 from training. Duration: 0.97275 2023-10-03 04:02:13,510 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_33348_210_5632_1_1533086912_3324030_119-90326-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 04:02:13,545 WARNING [train.py:1204] (2/4) Exclude cut with ID _14368_210_4220_1_1530532714870_3595529_231-137892-0_sp1.1 from training. Duration: 0.9 2023-10-03 04:02:14,301 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.2.encoder.layers.2.self_attn_weights.whiten_keys, num_groups=4, num_channels=128, metric=3.21 vs. limit=6.0 2023-10-03 04:02:16,293 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8453_210_4211_1_1528502940_8562730_596-306820-0_sp1.1 from training. Duration: 0.8818125 2023-10-03 04:02:18,160 WARNING [train.py:1204] (2/4) Exclude cut with ID _27052_210_5986_1_1532329197187_3585009_50-242750-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 04:02:18,704 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.5.encoder.layers.0.conv_module1.whiten, num_groups=1, num_channels=256, metric=8.43 vs. limit=15.0 2023-10-03 04:02:19,539 WARNING [train.py:1204] (2/4) Exclude cut with ID _13913_210_9994_1_1530579615187_3383897_358-98435-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 04:02:19,558 WARNING [train.py:1204] (2/4) Exclude cut with ID _27013_210_1389_1_1532437173240_3252649_399-229778-0_sp1.1 from training. Duration: 0.963625 2023-10-03 04:02:22,403 WARNING [train.py:1204] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_185-174805-0_sp0.9 from training. Duration: 0.7555625 2023-10-03 04:02:29,617 WARNING [train.py:1204] (2/4) Exclude cut with ID _65585_210_12210_1_1535450451584_3764098_196-221014-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 04:02:31,050 WARNING [train.py:1204] (2/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_202-293523-0_sp1.1 from training. Duration: 0.6545625 2023-10-03 04:02:32,503 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8275_210_4198_1_1528455320_3892571_213-127634-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 04:02:32,547 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_40896_210_15710_1_1533433956_7100802_921-231678-0_sp1.1 from training. Duration: 0.97275 2023-10-03 04:02:35,219 WARNING [train.py:1204] (2/4) Exclude cut with ID _38020_210_5986_1_1533112252259_7006089_493-102745-0_sp1.1 from training. Duration: 0.8909375 2023-10-03 04:02:36,996 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6846_210_6753_1_1527850767_4295158_2-315073-0 from training. Duration: 0.88 2023-10-03 04:02:37,062 WARNING [train.py:1204] (2/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_249-281349-0 from training. Duration: 0.85 2023-10-03 04:02:39,835 WARNING [train.py:1204] (2/4) Exclude cut with ID _18513_210_9206_1_1531618215223_3862620_314-83429-0_sp1.1 from training. Duration: 0.97275 2023-10-03 04:02:41,256 WARNING [train.py:1204] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_215-202973-0_sp0.9 from training. Duration: 0.97775 2023-10-03 04:02:41,268 WARNING [train.py:1204] (2/4) Exclude cut with ID _7719_210_3525_1_1528456153163_4296960_88-250259-0_sp1.1 from training. Duration: 0.763625 2023-10-03 04:02:44,346 WARNING [train.py:1204] (2/4) Exclude cut with ID _7606_210_4915_1_1528894253277_5080690_301-10732-0_sp1.1 from training. Duration: 0.736375 2023-10-03 04:02:44,356 WARNING [train.py:1204] (2/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_818-11907-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 04:02:44,426 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_5959_210_1794_1_1527400400_3445726_412-44052-0_sp1.1 from training. Duration: 0.7 2023-10-03 04:02:45,868 WARNING [train.py:1204] (2/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_6-279195-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 04:02:48,529 WARNING [train.py:1204] (2/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_252-216674-0_sp0.9 from training. Duration: 0.6 2023-10-03 04:02:50,513 WARNING [train.py:1204] (2/4) Exclude cut with ID _14368_210_4220_1_1530532714870_3595529_144-3287-0_sp1.1 from training. Duration: 0.6090625 2023-10-03 04:02:53,182 WARNING [train.py:1204] (2/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_342-3879-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 04:02:53,207 WARNING [train.py:1204] (2/4) Exclude cut with ID _33640_210_2655_1_1533117605479_4855469_584-65630-0_sp1.1 from training. Duration: 0.97275 2023-10-03 04:02:53,280 WARNING [train.py:1204] (2/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_37-189262-0 from training. Duration: 0.98 2023-10-03 04:02:53,307 WARNING [train.py:1204] (2/4) Exclude cut with ID _9389_210_1664_1_1529217337301_5289269_379-128794-0_sp0.9 from training. Duration: 0.9444375 2023-10-03 04:02:56,099 WARNING [train.py:1204] (2/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_119-207162-0_sp1.1 from training. Duration: 0.6454375 2023-10-03 04:02:57,698 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.conv_module2.balancer1.prob, batch_count=1131040.0, ans=0.125 2023-10-03 04:03:00,081 WARNING [train.py:1204] (2/4) Exclude cut with ID _2941_210_2577_1_1526199673150_2489759_200-333643-0_sp1.1 from training. Duration: 0.936375 2023-10-03 04:03:01,977 WARNING [train.py:1204] (2/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_479-125611-0_sp1.1 from training. Duration: 0.87275 2023-10-03 04:03:01,987 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_3475_210_2028_1_1526193578_3622569_64-297072-0_sp1.1 from training. Duration: 0.82725 2023-10-03 04:03:02,029 WARNING [train.py:1204] (2/4) Exclude cut with ID _53565_210_10635_1_1534337218335_3820259_139-346291-0_sp1.1 from training. Duration: 0.97275 2023-10-03 04:03:02,064 WARNING [train.py:1204] (2/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_588-125165-0_sp1.1 from training. Duration: 0.7545625 2023-10-03 04:03:03,497 WARNING [train.py:1204] (2/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_938-205135-0_sp1.1 from training. Duration: 0.7909375 2023-10-03 04:03:04,924 WARNING [train.py:1204] (2/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_216-48707-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 04:03:06,200 WARNING [train.py:1204] (2/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_477-255782-0_sp1.1 from training. Duration: 0.7454375 2023-10-03 04:03:06,237 WARNING [train.py:1204] (2/4) Exclude cut with ID _16909_210_5724_1_1531292079912_4118919_583-92842-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 04:03:06,920 WARNING [train.py:1204] (2/4) Exclude cut with ID _60860_210_8254_1_1534896015875_3557169_245-256666-0 from training. Duration: 0.97 2023-10-03 04:03:10,934 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_45399_210_4852_1_1533624725_7579737_349-116173-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 04:03:15,716 WARNING [train.py:1204] (2/4) Exclude cut with ID _8699_210_2069_1_1528630346831_3960409_155-39766-0 from training. Duration: 0.78 2023-10-03 04:03:15,729 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_86-16881-0_sp1.1 from training. Duration: 0.52725 2023-10-03 04:03:17,015 INFO [train.py:1046] (2/4) Epoch 32, batch 5000, loss[loss=0.144, simple_loss=0.2239, pruned_loss=0.03207, over 24587.00 frames. ], tot_loss[loss=0.1607, simple_loss=0.2392, pruned_loss=0.04112, over 4698435.86 frames. ], batch size: 60, lr: 3.17e-03, grad_scale: 16.0 2023-10-03 04:03:21,775 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_14056_210_4163_1_1530604483_7572827_191-159384-0_sp1.1 from training. Duration: 0.97275 2023-10-03 04:03:21,783 WARNING [train.py:1204] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_307-158462-0_sp1.1 from training. Duration: 0.62725 2023-10-03 04:03:23,156 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7544_210_6753_1_1528591327_1471426_33-346031-0 from training. Duration: 0.61 2023-10-03 04:03:24,551 WARNING [train.py:1204] (2/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_112-149213-0 from training. Duration: 0.95 2023-10-03 04:03:25,988 WARNING [train.py:1204] (2/4) Exclude cut with ID _55844_210_2346_1_1534471212569_7625707_925-191342-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 04:03:27,393 WARNING [train.py:1204] (2/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_87-278686-0 from training. Duration: 0.77 2023-10-03 04:03:27,540 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.conv_module2.balancer1.prob, batch_count=1131173.3333333333, ans=0.125 2023-10-03 04:03:28,724 WARNING [train.py:1204] (2/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_415-78792-0_sp1.1 from training. Duration: 0.736375 2023-10-03 04:03:28,735 WARNING [train.py:1204] (2/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_107-332398-0_sp0.9 from training. Duration: 0.8666875 2023-10-03 04:03:28,822 WARNING [train.py:1204] (2/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_72-251255-0 from training. Duration: 0.94 2023-10-03 04:03:30,068 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_431-227273-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 04:03:30,132 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7544_210_6753_1_1528587000_691468_87-178599-0_sp0.9 from training. Duration: 0.9666875 2023-10-03 04:03:31,517 WARNING [train.py:1204] (2/4) Exclude cut with ID _13916_210_9994_1_1530788341126_3763149_60-191556-0 from training. Duration: 0.61 2023-10-03 04:03:31,519 WARNING [train.py:1204] (2/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_746-164782-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 04:03:31,557 WARNING [train.py:1204] (2/4) Exclude cut with ID _33640_210_2655_1_1533117605479_4855469_596-21066-0_sp1.1 from training. Duration: 0.92725 2023-10-03 04:03:34,867 WARNING [train.py:1204] (2/4) Exclude cut with ID _40402_210_18219_1_1533522324115_2953150_320-14078-0 from training. Duration: 0.95 2023-10-03 04:03:36,271 WARNING [train.py:1204] (2/4) Exclude cut with ID _29877_210_6228_1_1532397791106_5276149_266-62291-0 from training. Duration: 0.87 2023-10-03 04:03:36,341 WARNING [train.py:1204] (2/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_176-56120-0_sp0.9 from training. Duration: 0.92225 2023-10-03 04:03:36,383 WARNING [train.py:1204] (2/4) Exclude cut with ID _6275_210_2084_1_1528110062213_7669049_91-26919-0 from training. Duration: 0.97 2023-10-03 04:03:38,234 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_224-322394-0_sp0.9 from training. Duration: 0.7333125 2023-10-03 04:03:38,288 WARNING [train.py:1204] (2/4) Exclude cut with ID _44982_210_1511_1_1534312820108_3726029_586-297059-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 04:03:40,270 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_196-289637-0_sp1.1 from training. Duration: 0.6818125 2023-10-03 04:03:40,272 WARNING [train.py:1204] (2/4) Exclude cut with ID _18513_210_9206_1_1531618215223_3862620_402-5957-0 from training. Duration: 0.99 2023-10-03 04:03:40,278 WARNING [train.py:1204] (2/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_81-300212-0 from training. Duration: 0.6 2023-10-03 04:03:41,684 WARNING [train.py:1204] (2/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_166-128941-0 from training. Duration: 0.54 2023-10-03 04:03:41,707 WARNING [train.py:1204] (2/4) Exclude cut with ID _58059_210_5854_1_1534901471149_3164808_57-309660-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 04:03:43,075 WARNING [train.py:1204] (2/4) Exclude cut with ID _60621_210_17071_1_1535013053530_3671739_73-155814-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 04:03:44,477 WARNING [train.py:1204] (2/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_726-270780-0 from training. Duration: 0.86 2023-10-03 04:03:44,498 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8853_210_3273_1_1528633899_818968_51-187685-0_sp1.1 from training. Duration: 0.62725 2023-10-03 04:03:45,031 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.5.encoder.layers.1.nonlin_attention.whiten1, num_groups=1, num_channels=192, metric=2.77 vs. limit=10.0 2023-10-03 04:03:45,871 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_315-212794-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 04:03:47,235 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_53373_210_5399_1_1534229471_4715695_314-122795-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 04:03:47,343 WARNING [train.py:1204] (2/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_187-221566-0_sp0.9 from training. Duration: 0.47775 2023-10-03 04:03:48,742 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_12868_210_6753_1_1530702479_3610564_133-564-0 from training. Duration: 0.79 2023-10-03 04:03:48,776 WARNING [train.py:1204] (2/4) Exclude cut with ID _55153_210_2253_1_1534384786677_3162096_255-228344-0_sp1.1 from training. Duration: 0.936375 2023-10-03 04:03:48,881 WARNING [train.py:1204] (2/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_383-165234-0_sp1.1 from training. Duration: 0.8545625 2023-10-03 04:03:49,023 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.feed_forward3.hidden_balancer.prob, batch_count=1131306.6666666667, ans=0.125 2023-10-03 04:03:52,157 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.feed_forward3.hidden_balancer.prob, batch_count=1131306.6666666667, ans=0.125 2023-10-03 04:03:54,580 WARNING [train.py:1204] (2/4) Exclude cut with ID IC0677W0056-334889 from training. Duration: 0.768 2023-10-03 04:03:56,783 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.1.encoder.layers.1.self_attn1.whiten, num_groups=1, num_channels=256, metric=12.57 vs. limit=22.5 2023-10-03 04:03:58,781 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_14953_210_2151_1_1530701710_5616165_710-165400-0_sp0.9 from training. Duration: 0.9666875 2023-10-03 04:03:58,887 WARNING [train.py:1204] (2/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_355-236205-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 04:03:58,889 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_34450_210_9547_1_1533099443_4109050_503-246893-0_sp1.1 from training. Duration: 0.963625 2023-10-03 04:04:01,575 WARNING [train.py:1204] (2/4) Exclude cut with ID _43552_210_8913_1_1533726031059_3523429_25-120161-0 from training. Duration: 0.98 2023-10-03 04:04:01,607 WARNING [train.py:1204] (2/4) Exclude cut with ID _66112_210_10635_1_1535590236305_3801484_205-311930-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 04:04:01,639 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_48049_210_5632_1_1533864553_3519937_185-281218-0_sp1.1 from training. Duration: 0.92725 2023-10-03 04:04:01,677 WARNING [train.py:1204] (2/4) Exclude cut with ID _48782_210_12658_1_1534467701544_2300422_191-332415-0_sp1.1 from training. Duration: 0.97275 2023-10-03 04:04:02,439 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.2.encoder.layers.2.feed_forward3.out_whiten, num_groups=1, num_channels=384, metric=9.35 vs. limit=15.0 2023-10-03 04:04:03,738 WARNING [train.py:1204] (2/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_907-151398-0_sp1.1 from training. Duration: 0.336375 2023-10-03 04:04:05,171 WARNING [train.py:1204] (2/4) Exclude cut with ID _67499_210_17282_1_1535509805220_3692430_20-179346-0_sp1.1 from training. Duration: 0.8818125 2023-10-03 04:04:08,359 WARNING [train.py:1204] (2/4) Exclude cut with ID _14998_210_5219_1_1530705582080_7474970_441-91466-0_sp1.1 from training. Duration: 0.8818125 2023-10-03 04:04:09,680 WARNING [train.py:1204] (2/4) Exclude cut with ID _46681_210_9642_1_1533893005571_7603359_786-164007-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 04:04:14,034 WARNING [train.py:1204] (2/4) Exclude cut with ID _14970_210_15709_1_1530773543493_1715730_187-20259-0 from training. Duration: 0.89 2023-10-03 04:04:18,131 WARNING [train.py:1204] (2/4) Exclude cut with ID _12734_210_8341_1_1530183614296_3891929_383-339075-0_sp1.1 from training. Duration: 0.963625 2023-10-03 04:04:18,300 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.1.balancer2.prob, batch_count=1131440.0, ans=0.125 2023-10-03 04:04:25,860 WARNING [train.py:1204] (2/4) Exclude cut with ID _37086_210_9038_1_1532999540891_7021250_834-322225-0_sp1.1 from training. Duration: 0.92725 2023-10-03 04:04:27,223 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_10987_210_4163_1_1529760555_4123902_56-290559-0_sp1.1 from training. Duration: 0.963625 2023-10-03 04:04:27,229 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_92-246270-0_sp0.9 from training. Duration: 0.9444375 2023-10-03 04:04:27,253 WARNING [train.py:1204] (2/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_56-13561-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 04:04:28,509 WARNING [train.py:1204] (2/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_393-57888-0_sp1.1 from training. Duration: 0.5818125 2023-10-03 04:04:28,547 WARNING [train.py:1204] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_168-32906-0_sp1.1 from training. Duration: 0.62725 2023-10-03 04:04:29,804 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_51225_210_3601_1_1534400634_2327383_127-207553-0_sp1.1 from training. Duration: 0.963625 2023-10-03 04:04:31,091 INFO [train.py:1046] (2/4) Epoch 32, batch 5050, loss[loss=0.213, simple_loss=0.2783, pruned_loss=0.07383, over 19188.00 frames. ], tot_loss[loss=0.1619, simple_loss=0.2406, pruned_loss=0.04157, over 4703403.41 frames. ], batch size: 388, lr: 3.17e-03, grad_scale: 16.0 2023-10-03 04:04:32,624 WARNING [train.py:1204] (2/4) Exclude cut with ID _58583_210_5236_1_1534726835266_3574249_494-203521-0_sp1.1 from training. Duration: 0.963625 2023-10-03 04:04:34,345 WARNING [train.py:1204] (2/4) Exclude cut with ID _49376_210_5986_1_1533985278732_3370740_354-344695-0 from training. Duration: 0.99 2023-10-03 04:04:34,437 WARNING [train.py:1204] (2/4) Exclude cut with ID _8021_210_7994_1_1528376451358_3653069_179-45206-0_sp1.1 from training. Duration: 0.7818125 2023-10-03 04:04:37,779 WARNING [train.py:1204] (2/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_742-154017-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 04:04:38,730 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.1.encoder.layers.1.self_attn1.whiten, num_groups=1, num_channels=256, metric=14.13 vs. limit=22.5 2023-10-03 04:04:39,301 WARNING [train.py:1204] (2/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_222-198127-0_sp1.1 from training. Duration: 0.763625 2023-10-03 04:04:39,354 WARNING [train.py:1204] (2/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_235-307337-0 from training. Duration: 0.94 2023-10-03 04:04:40,632 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.512e+02 1.847e+02 2.039e+02 2.357e+02 3.411e+02, threshold=4.078e+02, percent-clipped=0.0 2023-10-03 04:04:40,722 WARNING [train.py:1204] (2/4) Exclude cut with ID _8393_210_6702_1_1528505860249_3457328_453-176500-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 04:04:40,767 WARNING [train.py:1204] (2/4) Exclude cut with ID _53565_210_10635_1_1534337218335_3820259_102-179528-0_sp1.1 from training. Duration: 0.97275 2023-10-03 04:04:43,489 WARNING [train.py:1204] (2/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_43-206757-0_sp1.1 from training. Duration: 0.6818125 2023-10-03 04:04:43,577 WARNING [train.py:1204] (2/4) Exclude cut with ID _8394_210_6702_1_1528590504316_3540189_335-274780-0_sp1.1 from training. Duration: 0.7090625 2023-10-03 04:04:43,754 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.balancer1.prob, batch_count=1131506.6666666667, ans=0.125 2023-10-03 04:04:44,873 WARNING [train.py:1204] (2/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_128-296581-0_sp1.1 from training. Duration: 0.67275 2023-10-03 04:04:50,580 WARNING [train.py:1204] (2/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_111-112362-0 from training. Duration: 0.91 2023-10-03 04:04:51,868 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_5522_210_3273_1_1527249606_3909096_366-172508-0_sp1.1 from training. Duration: 0.6 2023-10-03 04:04:53,631 WARNING [train.py:1204] (2/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_47-167373-0_sp1.1 from training. Duration: 0.836375 2023-10-03 04:04:53,674 WARNING [train.py:1204] (2/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_41-234353-0 from training. Duration: 0.68 2023-10-03 04:04:53,711 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6291_210_3273_1_1527683649_1078498_82-221196-0_sp0.9 from training. Duration: 0.8444375 2023-10-03 04:04:53,867 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.conv_module2.balancer2.prob, batch_count=1131573.3333333333, ans=0.125 2023-10-03 04:04:53,945 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.feed_forward3.hidden_balancer.prob, batch_count=1131573.3333333333, ans=0.125 2023-10-03 04:04:55,126 WARNING [train.py:1204] (2/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_47-4814-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 04:04:55,149 WARNING [train.py:1204] (2/4) Exclude cut with ID _43541_210_10626_1_1533796233418_3635440_40-247993-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 04:04:56,475 WARNING [train.py:1204] (2/4) Exclude cut with ID _21989_210_5854_1_1531899214931_3271360_133-44759-0_sp0.9 from training. Duration: 0.8555625 2023-10-03 04:04:56,478 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_94-195801-0 from training. Duration: 0.94 2023-10-03 04:04:56,560 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_141-89113-0 from training. Duration: 0.86 2023-10-03 04:04:57,928 WARNING [train.py:1204] (2/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_683-284578-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 04:05:00,584 WARNING [train.py:1204] (2/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_6-214815-0_sp1.1 from training. Duration: 0.77275 2023-10-03 04:05:03,838 WARNING [train.py:1204] (2/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_556-212082-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 04:05:05,768 WARNING [train.py:1204] (2/4) Exclude cut with ID _44902_210_16187_1_1533888006062_3792078_149-67212-0 from training. Duration: 0.87 2023-10-03 04:05:06,947 WARNING [train.py:1204] (2/4) Exclude cut with ID _61148_210_6897_1_1535094022726_4217610_333-159440-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 04:05:10,472 WARNING [train.py:1204] (2/4) Exclude cut with ID _3120_210_3525_1_1526036143112_4072980_404-249994-0 from training. Duration: 0.99 2023-10-03 04:05:13,161 WARNING [train.py:1204] (2/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_234-260682-0_sp0.9 from training. Duration: 0.9333125 2023-10-03 04:05:13,182 WARNING [train.py:1204] (2/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_374-138971-0_sp1.1 from training. Duration: 0.8545625 2023-10-03 04:05:13,253 WARNING [train.py:1204] (2/4) Exclude cut with ID _53713_210_24116_1_1534314487762_7416751_743-302085-0_sp1.1 from training. Duration: 0.92725 2023-10-03 04:05:14,643 WARNING [train.py:1204] (2/4) Exclude cut with ID _18516_210_9206_1_1531704653206_3692406_198-334870-0_sp1.1 from training. Duration: 0.836375 2023-10-03 04:05:16,451 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.5.encoder.layers.1.feed_forward3.out_whiten, num_groups=1, num_channels=256, metric=10.16 vs. limit=15.0 2023-10-03 04:05:17,327 WARNING [train.py:1204] (2/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_395-73570-0_sp1.1 from training. Duration: 0.9 2023-10-03 04:05:18,910 WARNING [train.py:1204] (2/4) Exclude cut with ID _46683_210_9642_1_1533730913492_4591116_421-19547-0_sp1.1 from training. Duration: 0.936375 2023-10-03 04:05:20,249 WARNING [train.py:1204] (2/4) Exclude cut with ID _19896_210_1883_1_1531709022662_6649569_714-8668-0_sp1.1 from training. Duration: 0.963625 2023-10-03 04:05:20,273 WARNING [train.py:1204] (2/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_49-101072-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 04:05:20,288 WARNING [train.py:1204] (2/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_220-191080-0_sp1.1 from training. Duration: 0.8909375 2023-10-03 04:05:21,570 WARNING [train.py:1204] (2/4) Exclude cut with ID _3465_210_3525_1_1526175667060_3574049_62-335552-0 from training. Duration: 0.87 2023-10-03 04:05:21,652 WARNING [train.py:1204] (2/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_111-112362-0_sp1.1 from training. Duration: 0.82725 2023-10-03 04:05:23,060 WARNING [train.py:1204] (2/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_318-59097-0_sp0.9 from training. Duration: 0.8444375 2023-10-03 04:05:24,707 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.conv_module2.balancer1.prob, batch_count=1131706.6666666667, ans=0.125 2023-10-03 04:05:26,644 WARNING [train.py:1204] (2/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_98-255710-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 04:05:26,652 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9042W0003-515609 from training. Duration: 0.896 2023-10-03 04:05:27,909 WARNING [train.py:1204] (2/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_158-250116-0_sp0.9 from training. Duration: 0.688875 2023-10-03 04:05:28,023 WARNING [train.py:1204] (2/4) Exclude cut with ID _26750_210_5665_1_1532226678442_4063230_7-346885-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 04:05:28,293 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.feed_forward1.out_proj.dropout_p, batch_count=1131706.6666666667, ans=0.1 2023-10-03 04:05:29,374 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_37839_210_7234_1_1533542462_3689303_357-227078-0_sp1.1 from training. Duration: 0.963625 2023-10-03 04:05:29,402 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9052W0119-520904 from training. Duration: 0.896 2023-10-03 04:05:31,027 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.conv_skip_rate, batch_count=1131773.3333333333, ans=0.0 2023-10-03 04:05:32,167 WARNING [train.py:1204] (2/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_249-281349-0_sp1.1 from training. Duration: 0.77275 2023-10-03 04:05:32,181 WARNING [train.py:1204] (2/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_147-293311-0 from training. Duration: 0.94 2023-10-03 04:05:32,182 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_158-21166-0_sp1.1 from training. Duration: 0.963625 2023-10-03 04:05:36,205 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.3.encoder.layers.1.nonlin_attention.whiten1, num_groups=1, num_channels=384, metric=4.63 vs. limit=10.0 2023-10-03 04:05:36,900 WARNING [train.py:1204] (2/4) Exclude cut with ID _14100_210_8341_1_1530529320613_3678069_207-332194-0_sp1.1 from training. Duration: 0.92725 2023-10-03 04:05:38,840 WARNING [train.py:1204] (2/4) Exclude cut with ID _9389_210_1664_1_1529217337301_5289269_324-211031-0_sp1.1 from training. Duration: 0.963625 2023-10-03 04:05:38,854 WARNING [train.py:1204] (2/4) Exclude cut with ID _14603_210_4220_1_1530615688441_3097319_133-158308-0 from training. Duration: 0.84 2023-10-03 04:05:38,973 WARNING [train.py:1204] (2/4) Exclude cut with ID _13252_210_1925_1_1530671372047_3336909_166-314607-0 from training. Duration: 0.54 2023-10-03 04:05:41,535 WARNING [train.py:1204] (2/4) Exclude cut with ID _32704_210_8954_1_1533017488840_6747370_649-79538-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 04:05:41,551 WARNING [train.py:1204] (2/4) Exclude cut with ID _28394_210_8913_1_1532430049518_3650118_343-115667-0_sp1.1 from training. Duration: 0.97275 2023-10-03 04:05:41,599 WARNING [train.py:1204] (2/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_87-278686-0_sp0.9 from training. Duration: 0.8555625 2023-10-03 04:05:44,732 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9109W0003-549749 from training. Duration: 0.768 2023-10-03 04:05:45,975 INFO [train.py:1046] (2/4) Epoch 32, batch 5100, loss[loss=0.1766, simple_loss=0.2439, pruned_loss=0.05466, over 23630.00 frames. ], tot_loss[loss=0.1628, simple_loss=0.2415, pruned_loss=0.04205, over 4703634.60 frames. ], batch size: 256, lr: 3.17e-03, grad_scale: 16.0 2023-10-03 04:05:46,168 WARNING [train.py:1204] (2/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_126-29104-0_sp1.1 from training. Duration: 0.77275 2023-10-03 04:05:47,777 INFO [scaling.py:1118] (2/4) WithLoss: name=encoder.encoders.4.encoder.layers.1.self_attn_weights, loss-sum=0.000e+00 2023-10-03 04:05:48,834 WARNING [train.py:1204] (2/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_325-113798-0 from training. Duration: 0.85 2023-10-03 04:05:48,879 WARNING [train.py:1204] (2/4) Exclude cut with ID _6694_210_4599_1_1527852637490_3965529_116-109398-0 from training. Duration: 0.73 2023-10-03 04:05:50,218 WARNING [train.py:1204] (2/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_46-57119-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 04:05:52,947 WARNING [train.py:1204] (2/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_546-119312-0_sp1.1 from training. Duration: 0.9 2023-10-03 04:05:53,179 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.0.conv_skip_rate, batch_count=1131840.0, ans=0.0 2023-10-03 04:05:54,371 WARNING [train.py:1204] (2/4) Exclude cut with ID _60057_210_10640_1_1534903065793_3333150_468-306939-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 04:05:55,666 WARNING [train.py:1204] (2/4) Exclude cut with ID _15150_210_8341_1_1530752411803_7256384_22-138274-0 from training. Duration: 0.96 2023-10-03 04:05:55,681 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_289-180904-0 from training. Duration: 0.76 2023-10-03 04:05:59,159 WARNING [train.py:1204] (2/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_64-133721-0_sp1.1 from training. Duration: 0.92725 2023-10-03 04:05:59,209 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_195-257512-0_sp1.1 from training. Duration: 0.6090625 2023-10-03 04:06:02,184 WARNING [train.py:1204] (2/4) Exclude cut with ID _66873_210_5517_1_1535454136506_4293370_704-101963-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 04:06:06,250 WARNING [train.py:1204] (2/4) Exclude cut with ID _4366_210_1388_1_1526792053633_3613819_231-211402-0 from training. Duration: 0.94 2023-10-03 04:06:07,319 WARNING [train.py:1204] (2/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_679-190385-0_sp1.1 from training. Duration: 0.97275 2023-10-03 04:06:08,705 WARNING [train.py:1204] (2/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_298-81459-0_sp1.1 from training. Duration: 0.963625 2023-10-03 04:06:08,722 WARNING [train.py:1204] (2/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_658-282610-0_sp0.9 from training. Duration: 0.588875 2023-10-03 04:06:11,790 WARNING [train.py:1204] (2/4) Exclude cut with ID _57572_210_5517_1_1534921219624_3809484_196-283734-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 04:06:11,876 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_53071_210_16554_1_1534490405_2954066_294-198554-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 04:06:11,881 WARNING [train.py:1204] (2/4) Exclude cut with ID _4866_210_2577_1_1527404412284_4364659_352-157930-0 from training. Duration: 0.81 2023-10-03 04:06:14,638 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9231W0113-610139 from training. Duration: 0.896 2023-10-03 04:06:14,698 WARNING [train.py:1204] (2/4) Exclude cut with ID _24271_210_12658_1_1531987270022_7190519_638-171906-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 04:06:14,737 WARNING [train.py:1204] (2/4) Exclude cut with ID _14170_210_5219_1_1530525949868_4469650_272-249985-0 from training. Duration: 0.67 2023-10-03 04:06:14,746 WARNING [train.py:1204] (2/4) Exclude cut with ID _64086_210_7994_1_1535270417019_7435949_449-31567-0 from training. Duration: 0.97 2023-10-03 04:06:17,722 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.conv_module2.balancer1.prob, batch_count=1131973.3333333333, ans=0.125 2023-10-03 04:06:20,183 WARNING [train.py:1204] (2/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_109-282568-0_sp1.1 from training. Duration: 0.97275 2023-10-03 04:06:27,423 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_121-45934-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 04:06:30,498 WARNING [train.py:1204] (2/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_314-126819-0 from training. Duration: 0.5 2023-10-03 04:06:30,533 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9309W0192-640319 from training. Duration: 0.512 2023-10-03 04:06:30,540 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9310W0003-640649 from training. Duration: 0.896 2023-10-03 04:06:33,290 WARNING [train.py:1204] (2/4) Exclude cut with ID _7160_210_1876_1_1527940923231_3703799_120-150943-0 from training. Duration: 0.81 2023-10-03 04:06:33,291 WARNING [train.py:1204] (2/4) Exclude cut with ID _40380_210_9059_1_1533380594759_4187380_6-33508-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 04:06:33,536 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.bypass.skip_rate, batch_count=1132040.0, ans=0.07 2023-10-03 04:06:37,368 WARNING [train.py:1204] (2/4) Exclude cut with ID _59202_210_26984_1_1534816810612_3549174_137-318710-0 from training. Duration: 0.89 2023-10-03 04:06:41,945 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_435-313305-0 from training. Duration: 0.71 2023-10-03 04:06:44,621 WARNING [train.py:1204] (2/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_91-212486-0_sp1.1 from training. Duration: 0.6181875 2023-10-03 04:06:46,062 WARNING [train.py:1204] (2/4) Exclude cut with ID _14603_210_4220_1_1530615688441_3097319_133-158308-0_sp1.1 from training. Duration: 0.763625 2023-10-03 04:06:48,855 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_99-251314-0 from training. Duration: 0.87 2023-10-03 04:06:53,053 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_365-217119-0_sp1.1 from training. Duration: 0.57275 2023-10-03 04:06:53,102 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_3475_210_2028_1_1526193578_3622569_377-166133-0 from training. Duration: 0.86 2023-10-03 04:06:56,561 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.4.encoder.layers.2.self_attn2.whiten, num_groups=1, num_channels=384, metric=11.56 vs. limit=22.5 2023-10-03 04:06:58,743 WARNING [train.py:1204] (2/4) Exclude cut with ID _13916_210_9994_1_1530788341126_3763149_116-21034-0_sp1.1 from training. Duration: 0.87275 2023-10-03 04:06:58,760 WARNING [train.py:1204] (2/4) Exclude cut with ID _64220_210_9527_1_1535250151772_4875259_142-216831-0_sp1.1 from training. Duration: 0.936375 2023-10-03 04:06:58,761 WARNING [train.py:1204] (2/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_490-207861-0_sp1.1 from training. Duration: 0.8909375 2023-10-03 04:07:00,022 INFO [train.py:1046] (2/4) Epoch 32, batch 5150, loss[loss=0.1743, simple_loss=0.2553, pruned_loss=0.04662, over 23385.00 frames. ], tot_loss[loss=0.1637, simple_loss=0.2426, pruned_loss=0.04242, over 4704293.92 frames. ], batch size: 93, lr: 3.17e-03, grad_scale: 16.0 2023-10-03 04:07:00,100 WARNING [train.py:1204] (2/4) Exclude cut with ID _41314_210_12651_1_1533459476543_3766460_87-272068-0_sp0.9 from training. Duration: 0.92225 2023-10-03 04:07:00,120 WARNING [train.py:1204] (2/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_18-46756-0_sp1.1 from training. Duration: 0.7181875 2023-10-03 04:07:01,509 WARNING [train.py:1204] (2/4) Exclude cut with ID _39411_210_5986_1_1533538825030_3343649_98-311335-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 04:07:01,569 WARNING [train.py:1204] (2/4) Exclude cut with ID _21878_210_3525_1_1531738772002_7264330_113-18163-0 from training. Duration: 0.89 2023-10-03 04:07:01,579 WARNING [train.py:1204] (2/4) Exclude cut with ID _6653_210_2629_1_1527919172441_6732679_1103-56445-0 from training. Duration: 0.73 2023-10-03 04:07:02,939 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_3474_210_2028_1_1526188513_4825246_145-309131-0 from training. Duration: 0.74 2023-10-03 04:07:02,957 WARNING [train.py:1204] (2/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_120-211630-0_sp1.1 from training. Duration: 0.836375 2023-10-03 04:07:02,973 WARNING [train.py:1204] (2/4) Exclude cut with ID _8030_210_7497_1_1528444886781_3531089_32-31837-0 from training. Duration: 0.97 2023-10-03 04:07:04,353 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_19949_210_2161_1_1531706337_928079_8-160956-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 04:07:04,400 WARNING [train.py:1204] (2/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_321-215742-0_sp1.1 from training. Duration: 0.4545625 2023-10-03 04:07:06,352 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_583-124650-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 04:07:07,755 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_30434_210_13049_1_1532519384_981746_164-182755-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 04:07:09,079 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.570e+02 1.942e+02 2.192e+02 2.524e+02 4.905e+02, threshold=4.384e+02, percent-clipped=1.0 2023-10-03 04:07:11,357 WARNING [train.py:1204] (2/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_339-104791-0_sp1.1 from training. Duration: 0.4454375 2023-10-03 04:07:12,542 WARNING [train.py:1204] (2/4) Exclude cut with ID _15026_210_8750_1_1530777602316_3291850_211-195043-0 from training. Duration: 0.8 2023-10-03 04:07:12,640 WARNING [train.py:1204] (2/4) Exclude cut with ID _29877_210_6228_1_1532397791106_5276149_487-182647-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 04:07:12,672 WARNING [train.py:1204] (2/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_127-566-0_sp0.9 from training. Duration: 0.9333125 2023-10-03 04:07:15,649 WARNING [train.py:1204] (2/4) Exclude cut with ID _8263_210_1664_1_1528439452891_970220_68-181805-0_sp0.9 from training. Duration: 0.711125 2023-10-03 04:07:15,651 WARNING [train.py:1204] (2/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_504-72513-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 04:07:15,667 WARNING [train.py:1204] (2/4) Exclude cut with ID _64639_210_5816_1_1535367619437_3528000_324-265233-0_sp1.1 from training. Duration: 0.963625 2023-10-03 04:07:16,979 WARNING [train.py:1204] (2/4) Exclude cut with ID _6456_210_1534_1_1527681634212_3181049_215-309359-0_sp0.9 from training. Duration: 0.97775 2023-10-03 04:07:16,983 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_115-233929-0_sp1.1 from training. Duration: 0.6545625 2023-10-03 04:07:17,035 WARNING [train.py:1204] (2/4) Exclude cut with ID _14087_210_2253_1_1530529224512_3014029_219-224445-0 from training. Duration: 0.84 2023-10-03 04:07:18,507 WARNING [train.py:1204] (2/4) Exclude cut with ID _14867_210_1968_1_1530684179890_3846969_413-29292-0_sp1.1 from training. Duration: 0.8181875 2023-10-03 04:07:18,544 WARNING [train.py:1204] (2/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_1012-279473-0_sp0.9 from training. Duration: 0.7444375 2023-10-03 04:07:20,061 WARNING [train.py:1204] (2/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_605-133395-0_sp0.9 from training. Duration: 0.7333125 2023-10-03 04:07:22,737 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_89-35748-0 from training. Duration: 0.79 2023-10-03 04:07:24,119 WARNING [train.py:1204] (2/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_476-272757-0_sp1.1 from training. Duration: 0.8454375 2023-10-03 04:07:29,620 WARNING [train.py:1204] (2/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_449-15508-0_sp0.9 from training. Duration: 0.911125 2023-10-03 04:07:31,133 WARNING [train.py:1204] (2/4) Exclude cut with ID _29345_210_4381_1_1532689168263_3860169_198-174328-0 from training. Duration: 0.91 2023-10-03 04:07:35,714 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_15517_210_6753_1_1530952293_1664795_125-158157-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 04:07:41,773 WARNING [train.py:1204] (2/4) Exclude cut with ID _25565_210_12392_1_1532423009382_3952098_420-113180-0_sp1.1 from training. Duration: 0.963625 2023-10-03 04:07:43,116 WARNING [train.py:1204] (2/4) Exclude cut with ID _51471_210_1889_1_1534399391390_3094250_236-146773-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 04:07:43,476 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.bypass_mid.scale_min, batch_count=1132373.3333333333, ans=0.2 2023-10-03 04:07:47,887 WARNING [train.py:1204] (2/4) Exclude cut with ID _30039_210_13270_1_1532484612900_2725150_175-35012-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 04:07:47,921 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_23850_210_4238_1_1532232597_1632433_99-275843-0_sp1.1 from training. Duration: 0.936375 2023-10-03 04:07:48,159 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=1132373.3333333333, ans=0.1 2023-10-03 04:07:50,070 WARNING [train.py:1204] (2/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_658-282610-0 from training. Duration: 0.53 2023-10-03 04:07:52,904 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_37818_210_4238_1_1533620910_4720083_234-65934-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 04:07:54,372 WARNING [train.py:1204] (2/4) Exclude cut with ID _3067_210_2667_1_1526212974640_4503168_459-173867-0_sp1.1 from training. Duration: 0.8 2023-10-03 04:07:54,392 WARNING [train.py:1204] (2/4) Exclude cut with ID _8696_210_1876_1_1528545632440_3345058_209-54069-0_sp0.9 from training. Duration: 0.7444375 2023-10-03 04:07:56,048 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.attention_skip_rate, batch_count=1132373.3333333333, ans=0.0 2023-10-03 04:07:57,231 WARNING [train.py:1204] (2/4) Exclude cut with ID _62574_210_2655_1_1535180407058_3933249_420-236465-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 04:07:57,327 WARNING [train.py:1204] (2/4) Exclude cut with ID _46681_210_9642_1_1533893005571_7603359_333-4042-0_sp1.1 from training. Duration: 0.936375 2023-10-03 04:07:59,914 WARNING [train.py:1204] (2/4) Exclude cut with ID _3541_210_2477_1_1526475408709_3263973_377-296839-0 from training. Duration: 0.55 2023-10-03 04:08:00,486 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.4.encoder.layers.0.feed_forward2.out_whiten, num_groups=1, num_channels=384, metric=10.04 vs. limit=15.0 2023-10-03 04:08:04,211 WARNING [train.py:1204] (2/4) Exclude cut with ID _43997_210_4525_1_1533538632555_5189329_426-92599-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 04:08:05,587 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_43-71989-0_sp1.1 from training. Duration: 0.5181875 2023-10-03 04:08:08,861 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_2009_210_2071_1_1525481134_4588937_140-345530-0_sp1.1 from training. Duration: 0.963625 2023-10-03 04:08:08,872 WARNING [train.py:1204] (2/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_138-332773-0_sp0.9 from training. Duration: 0.9 2023-10-03 04:08:10,231 WARNING [train.py:1204] (2/4) Exclude cut with ID _13942_210_9988_1_1530604906036_3733360_499-55676-0_sp0.9 from training. Duration: 0.688875 2023-10-03 04:08:10,253 WARNING [train.py:1204] (2/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_259-34038-0_sp1.1 from training. Duration: 0.67275 2023-10-03 04:08:10,262 WARNING [train.py:1204] (2/4) Exclude cut with ID _3120_210_3525_1_1526036143112_4072980_404-249994-0_sp1.1 from training. Duration: 0.9 2023-10-03 04:08:10,293 WARNING [train.py:1204] (2/4) Exclude cut with ID _40402_210_18219_1_1533522324115_2953150_222-108810-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 04:08:14,902 INFO [train.py:1046] (2/4) Epoch 32, batch 5200, loss[loss=0.1679, simple_loss=0.2424, pruned_loss=0.04668, over 15376.00 frames. ], tot_loss[loss=0.1645, simple_loss=0.2432, pruned_loss=0.04286, over 4696849.47 frames. ], batch size: 33, lr: 3.17e-03, grad_scale: 32.0 2023-10-03 04:08:15,087 WARNING [train.py:1204] (2/4) Exclude cut with ID _22083_210_8254_1_1531702862439_3678970_49-234546-0_sp0.9 from training. Duration: 0.92225 2023-10-03 04:08:16,812 WARNING [train.py:1204] (2/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_199-295611-0_sp0.9 from training. Duration: 0.611125 2023-10-03 04:08:19,440 WARNING [train.py:1204] (2/4) Exclude cut with ID _43552_210_8913_1_1533726031059_3523429_43-192945-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 04:08:24,173 WARNING [train.py:1204] (2/4) Exclude cut with ID _14224_210_4381_1_1530529062037_4264010_364-22817-0 from training. Duration: 0.71 2023-10-03 04:08:25,435 WARNING [train.py:1204] (2/4) Exclude cut with ID _14922_210_6228_1_1530707435273_3693310_388-129552-0_sp1.1 from training. Duration: 0.87275 2023-10-03 04:08:25,615 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.bypass_mid.scale_min, batch_count=1132506.6666666667, ans=0.2 2023-10-03 04:08:26,811 WARNING [train.py:1204] (2/4) Exclude cut with ID _6537_210_1883_1_1527987559710_3597129_190-280227-0_sp1.1 from training. Duration: 0.97275 2023-10-03 04:08:28,226 WARNING [train.py:1204] (2/4) Exclude cut with ID _55844_210_2346_1_1534471212569_7625707_398-56897-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 04:08:29,631 WARNING [train.py:1204] (2/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_48-182155-0_sp1.1 from training. Duration: 0.7909375 2023-10-03 04:08:29,651 WARNING [train.py:1204] (2/4) Exclude cut with ID _29529_210_7234_1_1532393961455_3977460_582-18084-0_sp1.1 from training. Duration: 0.97275 2023-10-03 04:08:32,441 WARNING [train.py:1204] (2/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_912-282412-0 from training. Duration: 0.49 2023-10-03 04:08:35,170 WARNING [train.py:1204] (2/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_332-256168-0_sp0.9 from training. Duration: 0.8333125 2023-10-03 04:08:35,221 WARNING [train.py:1204] (2/4) Exclude cut with ID _64220_210_9527_1_1535250151772_4875259_55-250624-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 04:08:36,594 WARNING [train.py:1204] (2/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_264-108685-0 from training. Duration: 0.55 2023-10-03 04:08:39,900 WARNING [train.py:1204] (2/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_613-232856-0_sp0.9 from training. Duration: 0.6 2023-10-03 04:08:39,983 WARNING [train.py:1204] (2/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_68-212801-0_sp1.1 from training. Duration: 0.663625 2023-10-03 04:08:41,418 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6290_210_3273_1_1527595224_1282787_115-87095-0 from training. Duration: 0.55 2023-10-03 04:08:41,465 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_3080_210_2071_1_1526086102_4365166_312-8726-0 from training. Duration: 0.91 2023-10-03 04:08:44,239 WARNING [train.py:1204] (2/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_105-63309-0 from training. Duration: 0.98 2023-10-03 04:08:46,311 WARNING [train.py:1204] (2/4) Exclude cut with ID _2941_210_2577_1_1526199673150_2489759_201-160779-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 04:08:46,314 WARNING [train.py:1204] (2/4) Exclude cut with ID ID0406W0226-885434 from training. Duration: 0.997 2023-10-03 04:08:46,321 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_17292_210_6753_1_1531125388_2943734_353-328201-0_sp1.1 from training. Duration: 0.97275 2023-10-03 04:08:47,658 WARNING [train.py:1204] (2/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_572-96029-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 04:08:47,687 WARNING [train.py:1204] (2/4) Exclude cut with ID _14970_210_15709_1_1530775547361_1966965_6-23328-0_sp0.9 from training. Duration: 0.9555625 2023-10-03 04:08:48,986 WARNING [train.py:1204] (2/4) Exclude cut with ID _45012_210_12651_1_1533608435136_5371380_240-37101-0 from training. Duration: 0.93 2023-10-03 04:08:49,046 WARNING [train.py:1204] (2/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_685-90183-0_sp1.1 from training. Duration: 0.936375 2023-10-03 04:08:50,520 WARNING [train.py:1204] (2/4) Exclude cut with ID _13252_210_1925_1_1530671372047_3336909_41-22245-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 04:08:53,919 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_15126_210_6753_1_1530778677_2591185_120-277707-0 from training. Duration: 0.8 2023-10-03 04:08:55,240 WARNING [train.py:1204] (2/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_310-26186-0 from training. Duration: 0.7 2023-10-03 04:08:55,261 WARNING [train.py:1204] (2/4) Exclude cut with ID _14998_210_5219_1_1530705582080_7474970_787-294416-0 from training. Duration: 0.82 2023-10-03 04:09:00,660 WARNING [train.py:1204] (2/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_795-33252-0 from training. Duration: 0.37 2023-10-03 04:09:00,700 WARNING [train.py:1204] (2/4) Exclude cut with ID _7537_210_2084_1_1528502395431_3924359_113-199933-0_sp0.9 from training. Duration: 0.6555625 2023-10-03 04:09:06,302 WARNING [train.py:1204] (2/4) Exclude cut with ID _3465_210_3525_1_1526175667060_3574049_308-231735-0_sp0.9 from training. Duration: 0.811125 2023-10-03 04:09:06,335 WARNING [train.py:1204] (2/4) Exclude cut with ID _19504_210_9994_1_1531648863242_3325220_354-296765-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 04:09:07,714 WARNING [train.py:1204] (2/4) Exclude cut with ID _5409_210_1388_1_1527292850189_5865191_178-127970-0 from training. Duration: 0.57 2023-10-03 04:09:07,751 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_1466-248056-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 04:09:08,964 WARNING [train.py:1204] (2/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_535-14410-0_sp0.9 from training. Duration: 0.52225 2023-10-03 04:09:08,967 WARNING [train.py:1204] (2/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_747-129941-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 04:09:09,023 WARNING [train.py:1204] (2/4) Exclude cut with ID _15012_210_1968_1_1530856820429_5095350_688-178849-0_sp1.1 from training. Duration: 0.5818125 2023-10-03 04:09:09,208 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.0.feed_forward1.out_proj.dropout_p, batch_count=1132706.6666666667, ans=0.1 2023-10-03 04:09:09,664 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.3.encoder.layers.3.nonlin_attention.whiten2, num_groups=1, num_channels=512, metric=5.13 vs. limit=15.0 2023-10-03 04:09:13,559 WARNING [train.py:1204] (2/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_356-45054-0_sp1.1 from training. Duration: 0.7818125 2023-10-03 04:09:13,655 WARNING [train.py:1204] (2/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_146-276944-0_sp0.9 from training. Duration: 0.92225 2023-10-03 04:09:18,301 WARNING [train.py:1204] (2/4) Exclude cut with ID _39204_210_4220_1_1533880547368_3191120_346-180306-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 04:09:18,391 WARNING [train.py:1204] (2/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_641-158017-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 04:09:18,391 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_19952_210_2161_1_1531875469_3851750_274-42137-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 04:09:24,305 WARNING [train.py:1204] (2/4) Exclude cut with ID _60804_210_9642_1_1534831199859_7268299_87-327819-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 04:09:25,707 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_9302_210_2550_1_1528889962_4655416_638-301620-0 from training. Duration: 0.47 2023-10-03 04:09:25,766 WARNING [train.py:1204] (2/4) Exclude cut with ID _44304_210_15374_1_1533639352765_4613129_302-22084-0_sp1.1 from training. Duration: 0.7818125 2023-10-03 04:09:25,776 WARNING [train.py:1204] (2/4) Exclude cut with ID _18513_210_9206_1_1531618215223_3862620_177-227063-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 04:09:25,993 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.conv_module1.balancer1.max_abs, batch_count=1132773.3333333333, ans=10.0 2023-10-03 04:09:27,080 WARNING [train.py:1204] (2/4) Exclude cut with ID _41326_210_5854_1_1533778076509_3492080_510-45444-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 04:09:28,318 INFO [train.py:1046] (2/4) Epoch 32, batch 5250, loss[loss=0.1668, simple_loss=0.2435, pruned_loss=0.04498, over 19417.00 frames. ], tot_loss[loss=0.1638, simple_loss=0.2423, pruned_loss=0.0426, over 4686572.34 frames. ], batch size: 42, lr: 3.17e-03, grad_scale: 16.0 2023-10-03 04:09:28,370 WARNING [train.py:1204] (2/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_76-311963-0_sp1.1 from training. Duration: 0.636375 2023-10-03 04:09:28,477 WARNING [train.py:1204] (2/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_156-302675-0_sp0.9 from training. Duration: 0.611125 2023-10-03 04:09:31,335 WARNING [train.py:1204] (2/4) Exclude cut with ID _53489_210_6228_1_1534298357169_3835970_367-148411-0_sp1.1 from training. Duration: 0.936375 2023-10-03 04:09:34,010 WARNING [train.py:1204] (2/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_389-144525-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 04:09:34,068 WARNING [train.py:1204] (2/4) Exclude cut with ID _14069_210_5632_1_1530512692033_4803390_158-238427-0_sp1.1 from training. Duration: 0.963625 2023-10-03 04:09:35,446 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_486-151131-0_sp0.9 from training. Duration: 0.9444375 2023-10-03 04:09:38,124 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.558e+02 1.839e+02 2.059e+02 2.239e+02 2.945e+02, threshold=4.119e+02, percent-clipped=0.0 2023-10-03 04:09:39,709 WARNING [train.py:1204] (2/4) Exclude cut with ID _39846_210_12651_1_1533603651992_1318030_188-71831-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 04:09:41,267 WARNING [train.py:1204] (2/4) Exclude cut with ID _39411_210_5986_1_1533538825030_3343649_173-238248-0_sp1.1 from training. Duration: 0.7545625 2023-10-03 04:09:44,471 WARNING [train.py:1204] (2/4) Exclude cut with ID _20684_210_6917_1_1531567751646_4203930_469-49081-0_sp1.1 from training. Duration: 0.92725 2023-10-03 04:09:46,393 WARNING [train.py:1204] (2/4) Exclude cut with ID _8833_210_5219_1_1528592665210_7553059_611-80313-0_sp1.1 from training. Duration: 0.5818125 2023-10-03 04:09:48,418 WARNING [train.py:1204] (2/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_440-141811-0 from training. Duration: 0.95 2023-10-03 04:09:49,844 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_41211_210_2544_1_1533348859_3702910_236-223652-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 04:09:49,926 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_40222_210_6228_1_1533385298_4481939_220-163998-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 04:10:04,910 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.1.conv_module1.balancer2.prob, batch_count=1132973.3333333333, ans=0.125 2023-10-03 04:10:36,640 INFO [train.py:1046] (2/4) Epoch 32, batch 5300, loss[loss=0.1481, simple_loss=0.2259, pruned_loss=0.03519, over 18473.00 frames. ], tot_loss[loss=0.1627, simple_loss=0.2408, pruned_loss=0.0423, over 4678842.36 frames. ], batch size: 40, lr: 3.16e-03, grad_scale: 16.0 2023-10-03 04:10:40,857 INFO [scaling.py:1118] (2/4) WithLoss: name=encoder.encoders.0.layers.1.self_attn_weights, loss-sum=0.000e+00 2023-10-03 04:10:40,906 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.self_attn_weights.pos_emb_skip_rate, batch_count=1133173.3333333333, ans=0.0 2023-10-03 04:10:51,060 WARNING [train.py:1204] (2/4) Exclude cut with ID _14629_210_15709_1_1530604298182_956899_6-232695-0_sp1.1 from training. Duration: 0.8545625 2023-10-03 04:10:51,080 WARNING [train.py:1204] (2/4) Exclude cut with ID _13921_210_9994_1_1530841782272_3503520_327-69677-0 from training. Duration: 0.9 2023-10-03 04:10:51,107 WARNING [train.py:1204] (2/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_372-298093-0 from training. Duration: 0.8 2023-10-03 04:10:51,122 WARNING [train.py:1204] (2/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_515-139111-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 04:10:51,380 WARNING [train.py:1204] (2/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_85-65056-0_sp1.1 from training. Duration: 0.87275 2023-10-03 04:10:51,424 WARNING [train.py:1204] (2/4) Exclude cut with ID _15113_210_8254_1_1530842438579_3716279_209-54533-0_sp1.1 from training. Duration: 0.87275 2023-10-03 04:10:51,475 WARNING [train.py:1204] (2/4) Exclude cut with ID _70447_210_12392_1_1535972499300_3160420_123-312838-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 04:10:51,504 WARNING [train.py:1204] (2/4) Exclude cut with ID _60113_210_16271_1_1535164212481_4386079_70-63596-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 04:10:51,534 WARNING [train.py:1204] (2/4) Exclude cut with ID _14632_210_9988_1_1530691279392_3923200_225-188509-0_sp1.1 from training. Duration: 0.963625 2023-10-03 04:10:51,583 WARNING [train.py:1204] (2/4) Exclude cut with ID _34734_210_4525_1_1533365977542_4448720_584-180532-0_sp1.1 from training. Duration: 0.97275 2023-10-03 04:10:51,597 WARNING [train.py:1204] (2/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_351-294650-0_sp1.1 from training. Duration: 0.57275 2023-10-03 04:10:51,941 WARNING [train.py:1204] (2/4) Exclude cut with ID _13252_210_1925_1_1530671372047_3336909_263-168388-0_sp0.9 from training. Duration: 0.9555625 2023-10-03 04:10:51,968 WARNING [train.py:1204] (2/4) Exclude cut with ID _22083_210_8254_1_1531702862439_3678970_201-85738-0 from training. Duration: 0.9 2023-10-03 04:10:52,063 WARNING [train.py:1204] (2/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_237-58433-0 from training. Duration: 0.74 2023-10-03 04:10:52,093 WARNING [train.py:1204] (2/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_262-250826-0 from training. Duration: 0.99 2023-10-03 04:10:52,196 WARNING [train.py:1204] (2/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_927-43827-0_sp0.9 from training. Duration: 0.82225 2023-10-03 04:10:52,227 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_2821_210_2715_1_1526108385_3763560_73-296673-0 from training. Duration: 0.83 2023-10-03 04:10:52,304 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6287_210_3273_1_1527509506_2653716_182-199887-0 from training. Duration: 0.51 2023-10-03 04:10:52,826 WARNING [train.py:1204] (2/4) Exclude cut with ID _70519_210_4163_1_1535961595994_7458230_899-80223-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 04:10:53,094 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_210-234949-0_sp1.1 from training. Duration: 0.97275 2023-10-03 04:10:53,133 WARNING [train.py:1204] (2/4) Exclude cut with ID _30196_210_1664_1_1533708496522_7074140_768-278176-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 04:10:53,216 WARNING [train.py:1204] (2/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_315-128283-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 04:10:53,301 WARNING [train.py:1204] (2/4) Exclude cut with ID _14886_210_4220_1_1530702020355_3516190_330-323941-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 04:10:53,588 WARNING [train.py:1204] (2/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_264-348896-0_sp1.1 from training. Duration: 0.77275 2023-10-03 04:10:53,611 WARNING [train.py:1204] (2/4) Exclude cut with ID _37086_210_9038_1_1532999540891_7021250_433-10563-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 04:10:53,648 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_53638_210_21581_1_1534297319_5808449_885-80172-0_sp1.1 from training. Duration: 0.87275 2023-10-03 04:10:53,740 WARNING [train.py:1204] (2/4) Exclude cut with ID _14623_210_1925_1_1530773354962_3632609_157-218771-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 04:10:53,750 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_1368-78165-0_sp1.1 from training. Duration: 0.97275 2023-10-03 04:10:53,754 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_309-65177-0_sp0.9 from training. Duration: 0.97775 2023-10-03 04:10:53,760 WARNING [train.py:1204] (2/4) Exclude cut with ID _21632_210_2253_1_1531994001765_3744140_291-138812-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 04:10:53,772 WARNING [train.py:1204] (2/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_669-93163-0_sp1.1 from training. Duration: 0.8909375 2023-10-03 04:10:54,319 WARNING [train.py:1204] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_292-215346-0 from training. Duration: 0.97 2023-10-03 04:10:54,344 WARNING [train.py:1204] (2/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_286-222073-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 04:10:54,651 WARNING [train.py:1204] (2/4) Exclude cut with ID _59256_210_5986_1_1535076085997_3589120_121-237386-0_sp1.1 from training. Duration: 0.87275 2023-10-03 04:10:54,665 WARNING [train.py:1204] (2/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_96-320736-0 from training. Duration: 0.86 2023-10-03 04:10:54,675 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_128-202104-0 from training. Duration: 0.63 2023-10-03 04:10:54,763 WARNING [train.py:1204] (2/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_242-3472-0_sp0.9 from training. Duration: 0.811125 2023-10-03 04:10:54,813 WARNING [train.py:1204] (2/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_154-74182-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 04:10:54,818 WARNING [train.py:1204] (2/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_187-221566-0 from training. Duration: 0.43 2023-10-03 04:10:55,474 WARNING [train.py:1204] (2/4) Exclude cut with ID _6194_210_1534_1_1527511826160_3941194_14-282876-0 from training. Duration: 0.71 2023-10-03 04:10:55,515 WARNING [train.py:1204] (2/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_660-211088-0_sp1.1 from training. Duration: 0.563625 2023-10-03 04:10:55,850 WARNING [train.py:1204] (2/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_526-62079-0_sp1.1 from training. Duration: 0.6090625 2023-10-03 04:10:55,996 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_92-246270-0_sp1.1 from training. Duration: 0.77275 2023-10-03 04:10:56,082 WARNING [train.py:1204] (2/4) Exclude cut with ID IC0287W0023-142350 from training. Duration: 0.956 2023-10-03 04:10:56,154 WARNING [train.py:1204] (2/4) Exclude cut with ID _58605_210_12210_1_1534723195939_3904259_48-269402-0 from training. Duration: 0.96 2023-10-03 04:10:56,168 WARNING [train.py:1204] (2/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_416-257730-0_sp0.9 from training. Duration: 0.72225 2023-10-03 04:10:56,240 WARNING [train.py:1204] (2/4) Exclude cut with ID _7877_210_6014_1_1528286358099_3750136_185-17516-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 04:10:56,338 WARNING [train.py:1204] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_23-189234-0 from training. Duration: 0.83 2023-10-03 04:10:56,353 WARNING [train.py:1204] (2/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_237-89684-0 from training. Duration: 0.96 2023-10-03 04:10:56,418 WARNING [train.py:1204] (2/4) Exclude cut with ID _13916_210_9994_1_1530788341126_3763149_116-21034-0 from training. Duration: 0.96 2023-10-03 04:10:56,558 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_3133_210_2087_1_1526106446_2324690_246-125000-0_sp1.1 from training. Duration: 0.563625 2023-10-03 04:10:58,388 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.0.feed_forward1.out_proj.dropout_p, batch_count=1133253.3333333333, ans=0.1 2023-10-03 04:11:03,406 INFO [train.py:1046] (2/4) Epoch 33, batch 0, loss[loss=0.1638, simple_loss=0.2365, pruned_loss=0.04554, over 23967.00 frames. ], tot_loss[loss=0.1638, simple_loss=0.2365, pruned_loss=0.04554, over 23967.00 frames. ], batch size: 195, lr: 3.12e-03, grad_scale: 32.0 2023-10-03 04:11:03,407 INFO [train.py:1069] (2/4) Computing validation loss 2023-10-03 04:11:15,266 INFO [train.py:1078] (2/4) Epoch 33, validation: loss=0.326, simple_loss=0.2728, pruned_loss=0.1896, over 1125622.00 frames. 2023-10-03 04:11:15,267 INFO [train.py:1079] (2/4) Maximum memory allocated so far is 20967MB 2023-10-03 04:11:16,713 WARNING [train.py:1204] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_402-253027-0 from training. Duration: 0.54 2023-10-03 04:11:16,765 WARNING [train.py:1204] (2/4) Exclude cut with ID _59288_210_5854_1_1535077757774_3412246_158-289345-0_sp1.1 from training. Duration: 0.92725 2023-10-03 04:11:18,644 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_28211_210_6388_1_1532861506_4523224_128-328162-0_sp1.1 from training. Duration: 0.8181875 2023-10-03 04:11:21,582 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.nonlin_attention.balancer.prob, batch_count=1133253.3333333333, ans=0.125 2023-10-03 04:11:22,832 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_788-278281-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 04:11:22,839 WARNING [train.py:1204] (2/4) Exclude cut with ID _49728_210_12651_1_1534041034294_6788150_624-195301-0_sp0.9 from training. Duration: 0.9444375 2023-10-03 04:11:24,253 WARNING [train.py:1204] (2/4) Exclude cut with ID _14860_210_4915_1_1530702020340_7343240_267-139775-0_sp1.1 from training. Duration: 0.963625 2023-10-03 04:11:24,317 WARNING [train.py:1204] (2/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_332-221057-0 from training. Duration: 0.68 2023-10-03 04:11:25,689 WARNING [train.py:1204] (2/4) Exclude cut with ID _40402_210_18219_1_1533522324115_2953150_242-76943-0 from training. Duration: 0.94 2023-10-03 04:11:27,085 WARNING [train.py:1204] (2/4) Exclude cut with ID _16747_210_5986_1_1531811544733_2749109_87-349587-0_sp1.1 from training. Duration: 0.963625 2023-10-03 04:11:27,157 WARNING [train.py:1204] (2/4) Exclude cut with ID _22982_210_1883_1_1531882790447_5449409_505-346452-0_sp1.1 from training. Duration: 0.963625 2023-10-03 04:11:31,129 WARNING [train.py:1204] (2/4) Exclude cut with ID _17956_210_4353_1_1531209568125_6979970_13-192107-0_sp1.1 from training. Duration: 0.963625 2023-10-03 04:11:31,176 WARNING [train.py:1204] (2/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_430-307666-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 04:11:32,857 WARNING [train.py:1204] (2/4) Exclude cut with ID _2895_210_2477_1_1525956785794_4063750_289-11475-0_sp1.1 from training. Duration: 0.7454375 2023-10-03 04:11:32,877 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_3910_210_1794_1_1526742306_334530_28-238909-0_sp0.9 from training. Duration: 0.9 2023-10-03 04:11:34,246 WARNING [train.py:1204] (2/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_18-130336-0 from training. Duration: 0.79 2023-10-03 04:11:35,703 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_548-294316-0_sp0.9 from training. Duration: 0.9 2023-10-03 04:11:42,620 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_42-142473-0_sp1.1 from training. Duration: 0.6545625 2023-10-03 04:11:42,626 WARNING [train.py:1204] (2/4) Exclude cut with ID _59170_210_6721_1_1534983633960_4203335_326-48066-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 04:11:44,550 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.4.encoder.layers.0.self_attn1.whiten, num_groups=1, num_channels=384, metric=14.14 vs. limit=22.5 2023-10-03 04:11:45,325 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_45886_210_17024_1_1533709215_471063_7-79165-0 from training. Duration: 0.98 2023-10-03 04:11:48,637 WARNING [train.py:1204] (2/4) Exclude cut with ID _44585_210_18628_1_1533605350387_4518199_280-73534-0_sp0.9 from training. Duration: 0.988875 2023-10-03 04:11:48,644 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_3475_210_2028_1_1526193578_3622569_265-276625-0_sp1.1 from training. Duration: 0.6909375 2023-10-03 04:11:51,323 WARNING [train.py:1204] (2/4) Exclude cut with ID _64631_210_27767_1_1535349580068_4165160_88-225232-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 04:11:51,571 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=1133386.6666666667, ans=0.1 2023-10-03 04:11:55,449 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_205-265518-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 04:11:59,601 WARNING [train.py:1204] (2/4) Exclude cut with ID _40402_210_18219_1_1533522324115_2953150_271-253734-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 04:12:05,231 WARNING [train.py:1204] (2/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_282-257791-0 from training. Duration: 0.86 2023-10-03 04:12:06,849 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=1133453.3333333333, ans=0.1 2023-10-03 04:12:07,990 WARNING [train.py:1204] (2/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_356-45054-0 from training. Duration: 0.86 2023-10-03 04:12:09,364 WARNING [train.py:1204] (2/4) Exclude cut with ID _69584_210_5219_1_1535785133775_4044450_38-348116-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 04:12:09,373 WARNING [train.py:1204] (2/4) Exclude cut with ID _66763_210_17301_1_1535788667617_241750_21-328480-0_sp1.1 from training. Duration: 0.936375 2023-10-03 04:12:11,280 WARNING [train.py:1204] (2/4) Exclude cut with ID _26528_210_9070_1_1532570410842_3851310_497-97218-0_sp0.9 from training. Duration: 0.9555625 2023-10-03 04:12:11,327 WARNING [train.py:1204] (2/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_592-101075-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 04:12:13,253 WARNING [train.py:1204] (2/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_678-159307-0 from training. Duration: 0.85 2023-10-03 04:12:14,799 WARNING [train.py:1204] (2/4) Exclude cut with ID _28427_210_4220_1_1532581240967_3061069_166-319773-0_sp1.1 from training. Duration: 0.936375 2023-10-03 04:12:16,205 WARNING [train.py:1204] (2/4) Exclude cut with ID _29877_210_6228_1_1532397791106_5276149_606-224984-0_sp1.1 from training. Duration: 0.936375 2023-10-03 04:12:20,990 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_125-203105-0_sp1.1 from training. Duration: 0.72725 2023-10-03 04:12:22,656 WARNING [train.py:1204] (2/4) Exclude cut with ID IC0620W0113-306765 from training. Duration: 0.768 2023-10-03 04:12:23,923 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.507e+02 1.812e+02 1.985e+02 2.280e+02 3.382e+02, threshold=3.970e+02, percent-clipped=0.0 2023-10-03 04:12:24,053 WARNING [train.py:1204] (2/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_410-287857-0_sp1.1 from training. Duration: 0.8454375 2023-10-03 04:12:26,754 WARNING [train.py:1204] (2/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_78-95260-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 04:12:28,416 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=1133586.6666666667, ans=0.1 2023-10-03 04:12:29,396 INFO [train.py:1046] (2/4) Epoch 33, batch 50, loss[loss=0.159, simple_loss=0.2493, pruned_loss=0.03432, over 24316.00 frames. ], tot_loss[loss=0.1605, simple_loss=0.2401, pruned_loss=0.04043, over 1069831.41 frames. ], batch size: 74, lr: 3.11e-03, grad_scale: 16.0 2023-10-03 04:12:29,439 WARNING [train.py:1204] (2/4) Exclude cut with ID _58059_210_5854_1_1534901471149_3164808_6-73080-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 04:12:29,460 WARNING [train.py:1204] (2/4) Exclude cut with ID _5166_210_2084_1_1527073197160_4087993_331-117829-0 from training. Duration: 0.62 2023-10-03 04:12:29,525 WARNING [train.py:1204] (2/4) Exclude cut with ID _14880_210_8895_1_1530680426655_3795259_289-212817-0_sp0.9 from training. Duration: 0.7666875 2023-10-03 04:12:29,714 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.bypass_mid.scale_min, batch_count=1133586.6666666667, ans=0.2 2023-10-03 04:12:30,854 WARNING [train.py:1204] (2/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_620-144779-0_sp1.1 from training. Duration: 0.92725 2023-10-03 04:12:31,176 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.bypass_mid.scale_min, batch_count=1133586.6666666667, ans=0.2 2023-10-03 04:12:32,235 WARNING [train.py:1204] (2/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_561-288575-0_sp1.1 from training. Duration: 0.963625 2023-10-03 04:12:32,319 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_22657_210_6890_1_1532086002_1637216_122-188128-0_sp1.1 from training. Duration: 0.963625 2023-10-03 04:12:35,006 WARNING [train.py:1204] (2/4) Exclude cut with ID _16756_210_4915_1_1531047803763_7104259_604-86607-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 04:12:36,958 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.3.encoder.layers.1.feed_forward1.out_whiten, num_groups=1, num_channels=512, metric=10.39 vs. limit=15.0 2023-10-03 04:12:37,811 WARNING [train.py:1204] (2/4) Exclude cut with ID _15150_210_8341_1_1530752411803_7256384_469-239509-0 from training. Duration: 0.68 2023-10-03 04:12:37,818 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_10987_210_4163_1_1529760555_4123902_302-87926-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 04:12:41,803 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.ff2_skip_rate, batch_count=1133586.6666666667, ans=0.0 2023-10-03 04:12:44,760 WARNING [train.py:1204] (2/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_469-94673-0_sp0.9 from training. Duration: 0.87775 2023-10-03 04:12:47,463 WARNING [train.py:1204] (2/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_798-106179-0 from training. Duration: 0.83 2023-10-03 04:12:48,875 WARNING [train.py:1204] (2/4) Exclude cut with ID _16744_210_5986_1_1531724405384_3907567_242-111316-0 from training. Duration: 0.93 2023-10-03 04:12:50,895 WARNING [train.py:1204] (2/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_938-205135-0_sp0.9 from training. Duration: 0.9666875 2023-10-03 04:12:52,275 WARNING [train.py:1204] (2/4) Exclude cut with ID _64135_210_2253_1_1535677267876_7608972_133-188266-0_sp0.9 from training. Duration: 0.9 2023-10-03 04:12:52,285 WARNING [train.py:1204] (2/4) Exclude cut with ID _44585_210_18628_1_1533605350387_4518199_623-346820-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 04:12:52,516 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.conv_skip_rate, batch_count=1133653.3333333333, ans=0.0 2023-10-03 04:12:53,575 WARNING [train.py:1204] (2/4) Exclude cut with ID _42326_210_7770_1_1533814282531_3707269_454-266903-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 04:12:54,903 WARNING [train.py:1204] (2/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_5-55893-0_sp1.1 from training. Duration: 0.5 2023-10-03 04:12:54,953 WARNING [train.py:1204] (2/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_577-332600-0_sp1.1 from training. Duration: 0.5181875 2023-10-03 04:12:54,954 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_957-138882-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 04:12:58,036 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.attention_skip_rate, batch_count=1133720.0, ans=0.0 2023-10-03 04:13:02,212 WARNING [train.py:1204] (2/4) Exclude cut with ID _12556_210_8341_1_1530081024100_7327690_219-38004-0_sp1.1 from training. Duration: 0.97275 2023-10-03 04:13:02,457 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.bypass_mid.scale_min, batch_count=1133720.0, ans=0.2 2023-10-03 04:13:03,502 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_397-147196-0_sp1.1 from training. Duration: 0.72725 2023-10-03 04:13:03,512 WARNING [train.py:1204] (2/4) Exclude cut with ID _3541_210_2477_1_1526475408709_3263973_231-104736-0_sp0.9 from training. Duration: 0.8666875 2023-10-03 04:13:04,890 WARNING [train.py:1204] (2/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_398-334558-0 from training. Duration: 0.96 2023-10-03 04:13:06,347 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_257-20733-0_sp1.1 from training. Duration: 0.6909375 2023-10-03 04:13:07,648 WARNING [train.py:1204] (2/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_146-282400-0_sp1.1 from training. Duration: 0.6454375 2023-10-03 04:13:07,652 WARNING [train.py:1204] (2/4) Exclude cut with ID _51899_210_20034_1_1534129254466_3669220_265-215428-0 from training. Duration: 0.98 2023-10-03 04:13:07,835 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.feed_forward2.hidden_balancer.prob, batch_count=1133720.0, ans=0.125 2023-10-03 04:13:08,989 WARNING [train.py:1204] (2/4) Exclude cut with ID _34423_210_8750_1_1533430798456_3330199_219-106541-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 04:13:10,440 WARNING [train.py:1204] (2/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_458-67321-0 from training. Duration: 0.97 2023-10-03 04:13:18,980 WARNING [train.py:1204] (2/4) Exclude cut with ID _6653_210_2629_1_1527919172441_6732679_1051-182606-0_sp1.1 from training. Duration: 0.936375 2023-10-03 04:13:20,321 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_53555_210_5632_1_1534595220_7232297_603-4396-0_sp1.1 from training. Duration: 0.97275 2023-10-03 04:13:20,421 WARNING [train.py:1204] (2/4) Exclude cut with ID _54218_210_12210_1_1534413646388_4409259_159-144367-0_sp1.1 from training. Duration: 0.963625 2023-10-03 04:13:21,785 WARNING [train.py:1204] (2/4) Exclude cut with ID _7606_210_4915_1_1528894253277_5080690_301-10732-0_sp0.9 from training. Duration: 0.9 2023-10-03 04:13:21,788 WARNING [train.py:1204] (2/4) Exclude cut with ID _15026_210_8750_1_1530777602316_3291850_211-195043-0_sp0.9 from training. Duration: 0.888875 2023-10-03 04:13:25,189 WARNING [train.py:1204] (2/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_146-282400-0 from training. Duration: 0.71 2023-10-03 04:13:25,216 WARNING [train.py:1204] (2/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_65-88242-0 from training. Duration: 0.56 2023-10-03 04:13:26,534 WARNING [train.py:1204] (2/4) Exclude cut with ID _55857_210_2346_1_1534644007831_6898209_80-107757-0_sp1.1 from training. Duration: 0.963625 2023-10-03 04:13:27,848 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_15126_210_6753_1_1530778677_2591185_120-277707-0_sp0.9 from training. Duration: 0.888875 2023-10-03 04:13:28,416 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.4.encoder.layers.1.feed_forward2.out_whiten, num_groups=1, num_channels=384, metric=12.90 vs. limit=15.0 2023-10-03 04:13:29,288 WARNING [train.py:1204] (2/4) Exclude cut with ID _44778_210_2054_1_1533798127203_7443090_1033-168593-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 04:13:29,339 WARNING [train.py:1204] (2/4) Exclude cut with ID _46683_210_9642_1_1533730913492_4591116_475-30711-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 04:13:30,666 WARNING [train.py:1204] (2/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_253-308377-0 from training. Duration: 0.49 2023-10-03 04:13:30,733 WARNING [train.py:1204] (2/4) Exclude cut with ID _4680_210_3525_1_1527249757953_893386_75-78979-0 from training. Duration: 0.91 2023-10-03 04:13:32,150 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_5522_210_3273_1_1527249606_3909096_318-203153-0_sp0.9 from training. Duration: 0.47775 2023-10-03 04:13:33,545 WARNING [train.py:1204] (2/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_550-170116-0_sp1.1 from training. Duration: 0.92725 2023-10-03 04:13:34,880 WARNING [train.py:1204] (2/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_277-210798-0_sp0.9 from training. Duration: 0.611125 2023-10-03 04:13:34,950 WARNING [train.py:1204] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_620-276793-0 from training. Duration: 0.59 2023-10-03 04:13:34,960 WARNING [train.py:1204] (2/4) Exclude cut with ID _3067_210_2667_1_1526212974640_4503168_119-175776-0 from training. Duration: 0.74 2023-10-03 04:13:37,538 WARNING [train.py:1204] (2/4) Exclude cut with ID _55209_210_1531_1_1534675828996_4597160_681-195827-0_sp1.1 from training. Duration: 0.92725 2023-10-03 04:13:37,609 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_427-78097-0_sp1.1 from training. Duration: 0.72725 2023-10-03 04:13:40,495 WARNING [train.py:1204] (2/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_96-290039-0_sp0.9 from training. Duration: 0.62225 2023-10-03 04:13:40,507 WARNING [train.py:1204] (2/4) Exclude cut with ID _45329_210_7005_1_1534157951211_6774339_287-292174-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 04:13:42,000 WARNING [train.py:1204] (2/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_200-292228-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 04:13:43,216 INFO [train.py:1046] (2/4) Epoch 33, batch 100, loss[loss=0.1774, simple_loss=0.2574, pruned_loss=0.04868, over 24036.00 frames. ], tot_loss[loss=0.1622, simple_loss=0.2413, pruned_loss=0.04149, over 1875158.92 frames. ], batch size: 80, lr: 3.11e-03, grad_scale: 16.0 2023-10-03 04:13:44,707 WARNING [train.py:1204] (2/4) Exclude cut with ID _69884_210_9771_1_1535846392818_4166169_291-194999-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 04:13:48,129 WARNING [train.py:1204] (2/4) Exclude cut with ID _58895_210_8750_1_1535184081320_3649089_265-323810-0_sp1.1 from training. Duration: 0.87275 2023-10-03 04:13:50,862 WARNING [train.py:1204] (2/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_58-68956-0 from training. Duration: 0.83 2023-10-03 04:13:50,875 WARNING [train.py:1204] (2/4) Exclude cut with ID _39867_210_9826_1_1533553222030_9344259_322-115153-0_sp1.1 from training. Duration: 0.963625 2023-10-03 04:13:55,944 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_3371_210_1794_1_1526384462_5580777_396-339812-0_sp1.1 from training. Duration: 0.77275 2023-10-03 04:13:55,957 WARNING [train.py:1204] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_731-2428-0_sp0.9 from training. Duration: 0.92225 2023-10-03 04:13:55,993 WARNING [train.py:1204] (2/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_98-741-0_sp1.1 from training. Duration: 0.72725 2023-10-03 04:13:55,995 WARNING [train.py:1204] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_238-245655-0_sp0.9 from training. Duration: 0.9 2023-10-03 04:13:56,025 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6788_210_1388_1_1527898698_4562021_91-197981-0_sp0.9 from training. Duration: 0.92225 2023-10-03 04:13:57,469 WARNING [train.py:1204] (2/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_860-96198-0 from training. Duration: 0.73 2023-10-03 04:13:57,616 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.conv_module1.balancer1.prob, batch_count=1133986.6666666667, ans=0.125 2023-10-03 04:13:57,728 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.conv_skip_rate, batch_count=1133986.6666666667, ans=0.0 2023-10-03 04:14:00,280 WARNING [train.py:1204] (2/4) Exclude cut with ID _14268_210_9206_1_1530622771285_4714099_331-105351-0_sp0.9 from training. Duration: 0.72225 2023-10-03 04:14:00,302 WARNING [train.py:1204] (2/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_204-137133-0_sp1.1 from training. Duration: 0.92725 2023-10-03 04:14:00,320 WARNING [train.py:1204] (2/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_5-46496-0_sp1.1 from training. Duration: 0.936375 2023-10-03 04:14:00,339 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_160-218721-0_sp1.1 from training. Duration: 0.87275 2023-10-03 04:14:00,524 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.conv_skip_rate, batch_count=1133986.6666666667, ans=0.0 2023-10-03 04:14:00,815 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.5.encoder.layers.1.self_attn_weights.whiten_keys, num_groups=4, num_channels=128, metric=1.78 vs. limit=6.0 2023-10-03 04:14:03,273 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.ff2_skip_rate, batch_count=1133986.6666666667, ans=0.0 2023-10-03 04:14:04,326 WARNING [train.py:1204] (2/4) Exclude cut with ID _45329_210_7005_1_1534157951211_6774339_503-287883-0 from training. Duration: 0.88 2023-10-03 04:14:04,401 WARNING [train.py:1204] (2/4) Exclude cut with ID _57572_210_5517_1_1534921219624_3809484_110-79134-0_sp1.1 from training. Duration: 0.92725 2023-10-03 04:14:05,747 WARNING [train.py:1204] (2/4) Exclude cut with ID _44585_210_18628_1_1533605350387_4518199_421-37641-0_sp1.1 from training. Duration: 0.936375 2023-10-03 04:14:07,029 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_42-142473-0_sp0.9 from training. Duration: 0.8 2023-10-03 04:14:08,574 WARNING [train.py:1204] (2/4) Exclude cut with ID _5166_210_2084_1_1527073197160_4087993_310-114452-0_sp0.9 from training. Duration: 0.6555625 2023-10-03 04:14:12,705 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9029W0120-509520 from training. Duration: 0.768 2023-10-03 04:14:12,735 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9029W0433-509820 from training. Duration: 0.896 2023-10-03 04:14:14,294 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_40222_210_6228_1_1533385298_4481939_163-334586-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 04:14:14,294 WARNING [train.py:1204] (2/4) Exclude cut with ID _48677_210_5663_1_1533902405919_3892989_408-40740-0_sp1.1 from training. Duration: 0.7545625 2023-10-03 04:14:17,756 WARNING [train.py:1204] (2/4) Exclude cut with ID _24314_210_4915_1_1532135078116_7170179_521-258868-0_sp0.9 from training. Duration: 0.87775 2023-10-03 04:14:19,063 WARNING [train.py:1204] (2/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_576-51198-0_sp1.1 from training. Duration: 0.92725 2023-10-03 04:14:19,306 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.bypass.skip_rate, batch_count=1134053.3333333333, ans=0.07 2023-10-03 04:14:20,492 WARNING [train.py:1204] (2/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_306-216871-0_sp1.1 from training. Duration: 0.97275 2023-10-03 04:14:25,053 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.feed_forward1.hidden_balancer.prob, batch_count=1134053.3333333333, ans=0.125 2023-10-03 04:14:26,237 WARNING [train.py:1204] (2/4) Exclude cut with ID _20684_210_6917_1_1531567751646_4203930_463-119982-0_sp1.1 from training. Duration: 0.97275 2023-10-03 04:14:27,989 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9084W0012-536850 from training. Duration: 0.64 2023-10-03 04:14:29,371 WARNING [train.py:1204] (2/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_847-41334-0_sp1.1 from training. Duration: 0.4 2023-10-03 04:14:33,561 WARNING [train.py:1204] (2/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_326-165900-0_sp1.1 from training. Duration: 0.62725 2023-10-03 04:14:36,175 WARNING [train.py:1204] (2/4) Exclude cut with ID _7537_210_2084_1_1528502395431_3924359_128-268029-0_sp1.1 from training. Duration: 0.8909375 2023-10-03 04:14:37,642 WARNING [train.py:1204] (2/4) Exclude cut with ID _67499_210_17282_1_1535509805220_3692430_333-155030-0_sp1.1 from training. Duration: 0.97275 2023-10-03 04:14:40,516 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_34008_210_10431_1_1533345424_8183247_588-257177-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 04:14:41,982 WARNING [train.py:1204] (2/4) Exclude cut with ID _25914_210_6400_1_1532141317058_644740_52-96808-0_sp1.1 from training. Duration: 0.82725 2023-10-03 04:14:42,712 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.3.encoder.layers.0.self_attn2.whiten, num_groups=1, num_channels=512, metric=13.43 vs. limit=22.5 2023-10-03 04:14:43,364 WARNING [train.py:1204] (2/4) Exclude cut with ID _56494_210_4220_1_1535284837625_3033169_69-138150-0_sp1.1 from training. Duration: 0.8545625 2023-10-03 04:14:44,783 WARNING [train.py:1204] (2/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_19-143553-0_sp1.1 from training. Duration: 0.97275 2023-10-03 04:14:46,753 WARNING [train.py:1204] (2/4) Exclude cut with ID _21878_210_3525_1_1531738772002_7264330_582-130735-0_sp1.1 from training. Duration: 0.936375 2023-10-03 04:14:46,977 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.balancer1.prob, batch_count=1134186.6666666667, ans=0.125 2023-10-03 04:14:48,014 WARNING [train.py:1204] (2/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_943-201356-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 04:14:48,023 WARNING [train.py:1204] (2/4) Exclude cut with ID _3231_210_2106_1_1526205864934_6784220_422-229276-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 04:14:48,041 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_48049_210_5632_1_1533864553_3519937_73-198717-0_sp1.1 from training. Duration: 0.97275 2023-10-03 04:14:49,405 WARNING [train.py:1204] (2/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_329-285801-0 from training. Duration: 0.91 2023-10-03 04:14:49,427 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9171W0114-579000 from training. Duration: 0.896 2023-10-03 04:14:50,692 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.594e+02 1.866e+02 1.991e+02 2.230e+02 3.082e+02, threshold=3.982e+02, percent-clipped=0.0 2023-10-03 04:14:50,769 WARNING [train.py:1204] (2/4) Exclude cut with ID _3120_210_3525_1_1526036143112_4072980_210-132675-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 04:14:52,598 WARNING [train.py:1204] (2/4) Exclude cut with ID _55857_210_2346_1_1534644007831_6898209_745-98391-0_sp0.9 from training. Duration: 0.8444375 2023-10-03 04:14:53,844 WARNING [train.py:1204] (2/4) Exclude cut with ID _5250_210_5219_1_1527249561806_8692110_615-322035-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 04:14:53,848 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_36756_210_7234_1_1533026991_966276_119-315767-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 04:14:53,864 WARNING [train.py:1204] (2/4) Exclude cut with ID _13916_210_9994_1_1530788341126_3763149_60-191556-0_sp1.1 from training. Duration: 0.5545625 2023-10-03 04:14:53,892 WARNING [train.py:1204] (2/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_177-236809-0_sp1.1 from training. Duration: 0.4454375 2023-10-03 04:14:53,937 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7002_210_1794_1_1527939683_5157286_184-229630-0_sp1.1 from training. Duration: 0.67275 2023-10-03 04:14:53,942 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_50428_210_5724_1_1534247532_4126418_464-18322-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 04:14:55,326 WARNING [train.py:1204] (2/4) Exclude cut with ID _35799_210_5854_1_1533263487887_3529071_466-131402-0_sp1.1 from training. Duration: 0.936375 2023-10-03 04:14:57,219 INFO [train.py:1046] (2/4) Epoch 33, batch 150, loss[loss=0.158, simple_loss=0.247, pruned_loss=0.03449, over 24467.00 frames. ], tot_loss[loss=0.1637, simple_loss=0.2431, pruned_loss=0.04212, over 2510434.73 frames. ], batch size: 69, lr: 3.11e-03, grad_scale: 16.0 2023-10-03 04:14:57,265 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_1259-132768-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 04:14:57,329 WARNING [train.py:1204] (2/4) Exclude cut with ID _6694_210_4599_1_1527852637490_3965529_144-185051-0_sp1.1 from training. Duration: 0.87275 2023-10-03 04:14:57,376 WARNING [train.py:1204] (2/4) Exclude cut with ID _64639_210_5816_1_1535367619437_3528000_59-133290-0_sp1.1 from training. Duration: 0.963625 2023-10-03 04:14:58,783 WARNING [train.py:1204] (2/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_223-49525-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 04:15:03,413 WARNING [train.py:1204] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_748-36150-0_sp1.1 from training. Duration: 0.82725 2023-10-03 04:15:03,414 WARNING [train.py:1204] (2/4) Exclude cut with ID _19896_210_1883_1_1531709022662_6649569_52-221627-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 04:15:03,467 WARNING [train.py:1204] (2/4) Exclude cut with ID _6789_210_1534_1_1527854471671_2870519_296-115766-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 04:15:06,286 WARNING [train.py:1204] (2/4) Exclude cut with ID _14624_210_1925_1_1530858690657_3642219_100-19871-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 04:15:07,590 WARNING [train.py:1204] (2/4) Exclude cut with ID _12555_210_8750_1_1530075594402_3780239_50-293675-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 04:15:09,042 WARNING [train.py:1204] (2/4) Exclude cut with ID _14880_210_8895_1_1530680426655_3795259_289-212817-0_sp1.1 from training. Duration: 0.62725 2023-10-03 04:15:10,360 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_14056_210_4163_1_1530604483_7572827_388-211626-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 04:15:13,342 WARNING [train.py:1204] (2/4) Exclude cut with ID _5289_210_2084_1_1527678202736_3779299_392-261988-0 from training. Duration: 0.65 2023-10-03 04:15:13,355 WARNING [train.py:1204] (2/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_124-345840-0 from training. Duration: 0.52 2023-10-03 04:15:13,360 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_2655_210_2087_1_1525688959_3897400_437-337690-0 from training. Duration: 0.63 2023-10-03 04:15:14,964 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.bypass_mid.scale_min, batch_count=1134320.0, ans=0.2 2023-10-03 04:15:17,385 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_3780_210_2087_1_1526467391_239068_13-274633-0_sp1.1 from training. Duration: 0.92725 2023-10-03 04:15:17,389 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_805-71497-0_sp1.1 from training. Duration: 0.6454375 2023-10-03 04:15:17,470 WARNING [train.py:1204] (2/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_434-83000-0_sp1.1 from training. Duration: 0.8909375 2023-10-03 04:15:18,876 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_17613_210_2151_1_1531137569_2366160_285-219531-0_sp1.1 from training. Duration: 0.97275 2023-10-03 04:15:18,888 WARNING [train.py:1204] (2/4) Exclude cut with ID _56108_210_16380_1_1534505446159_3887959_22-37858-0_sp1.1 from training. Duration: 0.936375 2023-10-03 04:15:19,645 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.feed_forward2.hidden_balancer.prob, batch_count=1134320.0, ans=0.125 2023-10-03 04:15:20,716 WARNING [train.py:1204] (2/4) Exclude cut with ID _28427_210_4220_1_1532581240967_3061069_200-88444-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 04:15:21,979 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_43802_210_2290_1_1533516203_8829504_77-279042-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 04:15:24,558 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9309W0193-640320 from training. Duration: 0.896 2023-10-03 04:15:24,702 WARNING [train.py:1204] (2/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_516-324055-0_sp1.1 from training. Duration: 0.936375 2023-10-03 04:15:30,942 WARNING [train.py:1204] (2/4) Exclude cut with ID _40094_210_16380_1_1533294005773_3641159_10-261240-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 04:15:32,903 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.4.encoder.layers.2.self_attn1.whiten, num_groups=1, num_channels=384, metric=11.95 vs. limit=22.5 2023-10-03 04:15:35,136 WARNING [train.py:1204] (2/4) Exclude cut with ID _6267_210_2084_1_1527764658526_3894730_375-219887-0_sp1.1 from training. Duration: 0.6090625 2023-10-03 04:15:36,497 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6788_210_1388_1_1527898698_4562021_180-75288-0 from training. Duration: 0.99 2023-10-03 04:15:36,754 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=1134386.6666666667, ans=0.1 2023-10-03 04:15:40,699 WARNING [train.py:1204] (2/4) Exclude cut with ID _8030_210_7497_1_1528444886781_3531089_262-86750-0_sp1.1 from training. Duration: 0.736375 2023-10-03 04:15:40,719 WARNING [train.py:1204] (2/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_934-325498-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 04:15:40,745 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_456-22141-0_sp0.9 from training. Duration: 0.97775 2023-10-03 04:15:42,236 WARNING [train.py:1204] (2/4) Exclude cut with ID _49643_210_7989_1_1534640244224_3939031_598-161926-0_sp1.1 from training. Duration: 0.8181875 2023-10-03 04:15:43,735 WARNING [train.py:1204] (2/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_345-49721-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 04:15:45,088 WARNING [train.py:1204] (2/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_440-141811-0_sp1.1 from training. Duration: 0.863625 2023-10-03 04:15:46,428 WARNING [train.py:1204] (2/4) Exclude cut with ID _64048_210_10626_1_1535198382802_3835179_394-212120-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 04:15:46,478 WARNING [train.py:1204] (2/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_91-277684-0 from training. Duration: 0.74 2023-10-03 04:15:52,466 WARNING [train.py:1204] (2/4) Exclude cut with ID _23038_210_9570_1_1532001584158_3652419_191-346343-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 04:15:52,528 WARNING [train.py:1204] (2/4) Exclude cut with ID _15140_210_9288_1_1530791388450_4880149_351-347974-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 04:15:54,318 WARNING [train.py:1204] (2/4) Exclude cut with ID _58605_210_12210_1_1534723195939_3904259_48-269402-0_sp1.1 from training. Duration: 0.87275 2023-10-03 04:15:54,325 WARNING [train.py:1204] (2/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_401-53423-0_sp0.9 from training. Duration: 0.611125 2023-10-03 04:15:55,785 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.1.conv_skip_rate, batch_count=1134520.0, ans=0.0 2023-10-03 04:15:57,032 WARNING [train.py:1204] (2/4) Exclude cut with ID _42659_210_2070_1_1533607442765_3120380_158-49004-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 04:15:57,182 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.1.bypass_mid.scale_min, batch_count=1134520.0, ans=0.2 2023-10-03 04:15:58,423 WARNING [train.py:1204] (2/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_81-75849-0_sp1.1 from training. Duration: 0.3545625 2023-10-03 04:16:00,027 WARNING [train.py:1204] (2/4) Exclude cut with ID _27052_210_5986_1_1532329197187_3585009_314-106499-0_sp1.1 from training. Duration: 0.836375 2023-10-03 04:16:01,419 WARNING [train.py:1204] (2/4) Exclude cut with ID _4626_210_3532_1_1526817652717_3916859_446-206656-0_sp0.9 from training. Duration: 0.8444375 2023-10-03 04:16:02,783 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_53378_210_17282_1_1534221071_5769509_158-114960-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 04:16:02,918 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_3171_210_2087_1_1526109084_2330007_37-64334-0_sp0.9 from training. Duration: 0.911125 2023-10-03 04:16:04,835 WARNING [train.py:1204] (2/4) Exclude cut with ID _5410_210_1388_1_1527397055795_1719020_39-112221-0 from training. Duration: 0.69 2023-10-03 04:16:04,860 WARNING [train.py:1204] (2/4) Exclude cut with ID _8064_210_5399_1_1528372299291_4282973_4-91037-0_sp0.9 from training. Duration: 0.97775 2023-10-03 04:16:04,883 WARNING [train.py:1204] (2/4) Exclude cut with ID ID0079W0443-717795 from training. Duration: 0.541 2023-10-03 04:16:07,595 WARNING [train.py:1204] (2/4) Exclude cut with ID _34734_210_4525_1_1533365977542_4448720_766-297928-0_sp1.1 from training. Duration: 0.936375 2023-10-03 04:16:10,289 INFO [train.py:1046] (2/4) Epoch 33, batch 200, loss[loss=0.1641, simple_loss=0.2406, pruned_loss=0.04377, over 23571.00 frames. ], tot_loss[loss=0.1653, simple_loss=0.2449, pruned_loss=0.0429, over 2996880.70 frames. ], batch size: 120, lr: 3.11e-03, grad_scale: 16.0 2023-10-03 04:16:10,410 WARNING [train.py:1204] (2/4) Exclude cut with ID _51471_210_1889_1_1534399391390_3094250_310-337793-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 04:16:10,427 WARNING [train.py:1204] (2/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_678-159307-0_sp0.9 from training. Duration: 0.9444375 2023-10-03 04:16:13,171 WARNING [train.py:1204] (2/4) Exclude cut with ID _50093_210_4220_1_1534417141207_3432299_259-322252-0 from training. Duration: 0.92 2023-10-03 04:16:14,510 WARNING [train.py:1204] (2/4) Exclude cut with ID _47790_210_16187_1_1533862814066_4207519_67-103444-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 04:16:15,807 WARNING [train.py:1204] (2/4) Exclude cut with ID _40551_210_10635_1_1533776424632_3416179_311-247578-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 04:16:16,030 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.nonlin_attention.balancer.prob, batch_count=1134586.6666666667, ans=0.125 2023-10-03 04:16:17,479 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.attention_skip_rate, batch_count=1134586.6666666667, ans=0.0 2023-10-03 04:16:19,737 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_4633_210_2040_1_1526824163_4145343_416-76117-0 from training. Duration: 0.95 2023-10-03 04:16:21,643 WARNING [train.py:1204] (2/4) Exclude cut with ID _5166_210_2084_1_1527073197160_4087993_331-117829-0_sp0.9 from training. Duration: 0.688875 2023-10-03 04:16:23,167 WARNING [train.py:1204] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_299-247059-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 04:16:25,117 WARNING [train.py:1204] (2/4) Exclude cut with ID _64002_210_9570_1_1535198605157_38979_2-240845-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 04:16:28,326 WARNING [train.py:1204] (2/4) Exclude cut with ID _4681_210_3525_1_1527250689464_120889_15-126760-0_sp0.9 from training. Duration: 0.9 2023-10-03 04:16:28,345 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_21500_210_2054_1_1532149310_2716758_256-156611-0_sp1.1 from training. Duration: 0.936375 2023-10-03 04:16:28,367 WARNING [train.py:1204] (2/4) Exclude cut with ID _16908_210_5724_1_1531119157248_3772700_624-203729-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 04:16:46,342 WARNING [train.py:1204] (2/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_605-219732-0_sp1.1 from training. Duration: 0.8545625 2023-10-03 04:16:47,633 WARNING [train.py:1204] (2/4) Exclude cut with ID _44718_210_5816_1_1534050051869_6469030_380-167520-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 04:16:47,703 WARNING [train.py:1204] (2/4) Exclude cut with ID _19896_210_1883_1_1531709022662_6649569_94-229927-0_sp1.1 from training. Duration: 0.8818125 2023-10-03 04:16:49,163 WARNING [train.py:1204] (2/4) Exclude cut with ID _19514_210_9259_1_1531530071330_5084230_519-174300-0_sp1.1 from training. Duration: 0.963625 2023-10-03 04:16:50,469 WARNING [train.py:1204] (2/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_340-174176-0_sp0.9 from training. Duration: 0.5444375 2023-10-03 04:16:50,473 WARNING [train.py:1204] (2/4) Exclude cut with ID _8263_210_1664_1_1528439452891_970220_68-181805-0_sp1.1 from training. Duration: 0.5818125 2023-10-03 04:16:53,549 WARNING [train.py:1204] (2/4) Exclude cut with ID _3120_210_3525_1_1526036143112_4072980_348-257723-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 04:16:54,928 WARNING [train.py:1204] (2/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_453-133983-0_sp1.1 from training. Duration: 0.7090625 2023-10-03 04:16:55,061 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.1.feed_forward2.hidden_balancer.prob, batch_count=1134786.6666666667, ans=0.125 2023-10-03 04:16:56,829 WARNING [train.py:1204] (2/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_17-86060-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 04:16:56,865 WARNING [train.py:1204] (2/4) Exclude cut with ID _29877_210_6228_1_1532397791106_5276149_228-184657-0_sp1.1 from training. Duration: 0.97275 2023-10-03 04:16:58,220 WARNING [train.py:1204] (2/4) Exclude cut with ID _18516_210_9206_1_1531704653206_3692406_198-334870-0 from training. Duration: 0.92 2023-10-03 04:16:58,271 WARNING [train.py:1204] (2/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_340-174176-0_sp1.1 from training. Duration: 0.4454375 2023-10-03 04:16:58,291 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_3133_210_2087_1_1526106446_2324690_14-139186-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 04:16:59,026 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.3.encoder.layers.0.feed_forward3.out_whiten, num_groups=1, num_channels=512, metric=14.79 vs. limit=15.0 2023-10-03 04:17:02,561 WARNING [train.py:1204] (2/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_126-29104-0_sp0.9 from training. Duration: 0.9444375 2023-10-03 04:17:06,783 WARNING [train.py:1204] (2/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_84-10310-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 04:17:12,969 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_15517_210_6753_1_1530954008_3487336_79-273076-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 04:17:13,532 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.4.encoder.layers.1.conv_module2.whiten, num_groups=1, num_channels=384, metric=2.95 vs. limit=15.0 2023-10-03 04:17:14,243 WARNING [train.py:1204] (2/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_371-18169-0_sp0.9 from training. Duration: 0.9555625 2023-10-03 04:17:17,211 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.conv_skip_rate, batch_count=1134853.3333333333, ans=0.0 2023-10-03 04:17:18,227 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.589e+02 1.915e+02 2.133e+02 2.434e+02 3.393e+02, threshold=4.265e+02, percent-clipped=0.0 2023-10-03 04:17:18,532 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.bypass.skip_rate, batch_count=1134853.3333333333, ans=0.04949747468305833 2023-10-03 04:17:19,737 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_68702_210_23689_1_1535799577_3420068_170-92788-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 04:17:23,066 WARNING [train.py:1204] (2/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_78-182912-0 from training. Duration: 0.59 2023-10-03 04:17:23,118 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_3019_210_2028_1_1526044245_3264865_297-12384-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 04:17:23,129 WARNING [train.py:1204] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_147-111137-0_sp1.1 from training. Duration: 0.863625 2023-10-03 04:17:23,134 WARNING [train.py:1204] (2/4) Exclude cut with ID _28323_210_16187_1_1532480395667_4100300_117-8859-0_sp1.1 from training. Duration: 0.97275 2023-10-03 04:17:24,352 INFO [train.py:1046] (2/4) Epoch 33, batch 250, loss[loss=0.1685, simple_loss=0.253, pruned_loss=0.04201, over 24092.00 frames. ], tot_loss[loss=0.1647, simple_loss=0.2439, pruned_loss=0.04275, over 3387859.21 frames. ], batch size: 80, lr: 3.11e-03, grad_scale: 16.0 2023-10-03 04:17:26,141 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7762_210_1794_1_1528199508_4057408_419-125009-0_sp1.1 from training. Duration: 0.7545625 2023-10-03 04:17:26,243 WARNING [train.py:1204] (2/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_268-87909-0 from training. Duration: 0.99 2023-10-03 04:17:27,609 WARNING [train.py:1204] (2/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_118-311357-0_sp1.1 from training. Duration: 0.92725 2023-10-03 04:17:27,623 WARNING [train.py:1204] (2/4) Exclude cut with ID ID0377W0003-870225 from training. Duration: 0.852 2023-10-03 04:17:29,081 WARNING [train.py:1204] (2/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_56-5838-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 04:17:29,176 WARNING [train.py:1204] (2/4) Exclude cut with ID _14603_210_4220_1_1530615688441_3097319_90-31383-0_sp0.9 from training. Duration: 0.9666875 2023-10-03 04:17:29,254 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_31583_210_3852_1_1532595851_4024413_58-86657-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 04:17:30,572 WARNING [train.py:1204] (2/4) Exclude cut with ID _30304_210_6319_1_1532476827261_7582060_344-113875-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 04:17:32,013 WARNING [train.py:1204] (2/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_278-104676-0_sp1.1 from training. Duration: 0.8909375 2023-10-03 04:17:33,201 WARNING [train.py:1204] (2/4) Exclude cut with ID _55844_210_2346_1_1534471212569_7625707_764-2387-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 04:17:34,657 WARNING [train.py:1204] (2/4) Exclude cut with ID _27430_210_7770_1_1532604689375_1378590_157-14963-0_sp1.1 from training. Duration: 0.963625 2023-10-03 04:17:37,383 WARNING [train.py:1204] (2/4) Exclude cut with ID _7160_210_1876_1_1527940923231_3703799_282-322611-0_sp1.1 from training. Duration: 0.7909375 2023-10-03 04:17:37,538 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.1.feed_forward3.hidden_balancer.prob, batch_count=1134986.6666666667, ans=0.125 2023-10-03 04:17:44,537 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.conv_module2.balancer2.prob, batch_count=1134986.6666666667, ans=0.125 2023-10-03 04:17:44,553 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.conv_module2.balancer1.prob, batch_count=1134986.6666666667, ans=0.125 2023-10-03 04:17:48,679 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.conv_module2.balancer1.max_abs, batch_count=1134986.6666666667, ans=10.0 2023-10-03 04:17:50,136 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.nonlin_attention.balancer.prob, batch_count=1134986.6666666667, ans=0.125 2023-10-03 04:17:51,312 WARNING [train.py:1204] (2/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_190-138640-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 04:17:53,286 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_33348_210_5632_1_1533086912_3324030_63-270922-0_sp1.1 from training. Duration: 0.97275 2023-10-03 04:17:54,611 WARNING [train.py:1204] (2/4) Exclude cut with ID _54871_210_22356_1_1534665798253_3856359_135-184281-0_sp1.1 from training. Duration: 0.9 2023-10-03 04:17:59,263 WARNING [train.py:1204] (2/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_277-210798-0_sp1.1 from training. Duration: 0.5 2023-10-03 04:18:00,639 WARNING [train.py:1204] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_402-253027-0_sp0.9 from training. Duration: 0.6 2023-10-03 04:18:01,966 WARNING [train.py:1204] (2/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_540-275685-0_sp0.9 from training. Duration: 0.988875 2023-10-03 04:18:02,000 WARNING [train.py:1204] (2/4) Exclude cut with ID _59493_210_17282_1_1535078419077_3225879_118-18960-0_sp1.1 from training. Duration: 0.936375 2023-10-03 04:18:03,472 WARNING [train.py:1204] (2/4) Exclude cut with ID _6194_210_1534_1_1527511826160_3941194_321-148641-0_sp1.1 from training. Duration: 0.5818125 2023-10-03 04:18:03,478 WARNING [train.py:1204] (2/4) Exclude cut with ID _61133_210_9969_1_1535196777227_3696409_333-270325-0_sp1.1 from training. Duration: 0.8454375 2023-10-03 04:18:04,823 WARNING [train.py:1204] (2/4) Exclude cut with ID _49376_210_5986_1_1533985278732_3370740_307-145221-0_sp1.1 from training. Duration: 0.936375 2023-10-03 04:18:04,962 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6846_210_6753_1_1527850767_4295158_2-315073-0_sp0.9 from training. Duration: 0.97775 2023-10-03 04:18:07,655 WARNING [train.py:1204] (2/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_17-25045-0 from training. Duration: 0.59 2023-10-03 04:18:07,676 WARNING [train.py:1204] (2/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_503-333510-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 04:18:09,070 WARNING [train.py:1204] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_238-245655-0_sp1.1 from training. Duration: 0.736375 2023-10-03 04:18:10,341 WARNING [train.py:1204] (2/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_329-82235-0_sp0.9 from training. Duration: 0.911125 2023-10-03 04:18:10,346 WARNING [train.py:1204] (2/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_325-113798-0_sp0.9 from training. Duration: 0.9444375 2023-10-03 04:18:12,256 WARNING [train.py:1204] (2/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_395-37222-0_sp0.9 from training. Duration: 0.8444375 2023-10-03 04:18:13,630 WARNING [train.py:1204] (2/4) Exclude cut with ID _64135_210_2253_1_1535677267876_7608972_498-5039-0_sp1.1 from training. Duration: 0.7090625 2023-10-03 04:18:13,646 WARNING [train.py:1204] (2/4) Exclude cut with ID _14922_210_6228_1_1530707435273_3693310_303-228738-0_sp0.9 from training. Duration: 0.9333125 2023-10-03 04:18:15,595 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_577-220899-0_sp1.1 from training. Duration: 0.92725 2023-10-03 04:18:15,701 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_3475_210_2028_1_1526193578_3622569_377-166133-0_sp1.1 from training. Duration: 0.7818125 2023-10-03 04:18:17,060 WARNING [train.py:1204] (2/4) Exclude cut with ID _49376_210_5986_1_1533985278732_3370740_272-313512-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 04:18:21,255 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7762_210_1794_1_1528199508_4057408_422-227034-0_sp0.9 from training. Duration: 0.811125 2023-10-03 04:18:23,113 WARNING [train.py:1204] (2/4) Exclude cut with ID _18895_210_6228_1_1531357274234_3784930_74-341428-0_sp1.1 from training. Duration: 0.92725 2023-10-03 04:18:23,315 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.conv_module2.balancer1.min_positive, batch_count=1135186.6666666667, ans=0.025 2023-10-03 04:18:25,900 WARNING [train.py:1204] (2/4) Exclude cut with ID _44902_210_16187_1_1533888006062_3792078_149-67212-0_sp1.1 from training. Duration: 0.7909375 2023-10-03 04:18:28,456 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.2.encoder.layers.2.self_attn_weights.whiten_keys, num_groups=4, num_channels=128, metric=3.12 vs. limit=6.0 2023-10-03 04:18:31,990 WARNING [train.py:1204] (2/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_243-126020-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 04:18:32,119 WARNING [train.py:1204] (2/4) Exclude cut with ID _54218_210_12210_1_1534413646388_4409259_259-6561-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 04:18:36,240 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_41452_210_7291_1_1533437687_3941281_405-11559-0 from training. Duration: 0.91 2023-10-03 04:18:37,550 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_20120_210_6753_1_1531643035_5482859_759-225084-0_sp1.1 from training. Duration: 0.97275 2023-10-03 04:18:37,563 WARNING [train.py:1204] (2/4) Exclude cut with ID _14747_210_8341_1_1530685797463_3654089_147-131229-0_sp0.9 from training. Duration: 0.8444375 2023-10-03 04:18:38,775 INFO [train.py:1046] (2/4) Epoch 33, batch 300, loss[loss=0.159, simple_loss=0.229, pruned_loss=0.04453, over 23786.00 frames. ], tot_loss[loss=0.1625, simple_loss=0.241, pruned_loss=0.042, over 3678178.98 frames. ], batch size: 212, lr: 3.11e-03, grad_scale: 16.0 2023-10-03 04:18:38,904 WARNING [train.py:1204] (2/4) Exclude cut with ID _14867_210_1968_1_1530684179890_3846969_319-250868-0 from training. Duration: 0.99 2023-10-03 04:18:38,927 WARNING [train.py:1204] (2/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_580-93798-0_sp0.9 from training. Duration: 0.62225 2023-10-03 04:18:42,173 WARNING [train.py:1204] (2/4) Exclude cut with ID _9679_210_7497_1_1529129212906_4363929_512-313889-0_sp0.9 from training. Duration: 0.9 2023-10-03 04:18:42,191 WARNING [train.py:1204] (2/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_18-189198-0 from training. Duration: 0.9 2023-10-03 04:18:46,721 WARNING [train.py:1204] (2/4) Exclude cut with ID _49755_210_18053_1_1534581005846_3218709_137-64179-0_sp1.1 from training. Duration: 0.92725 2023-10-03 04:18:46,905 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.conv_module2.balancer1.prob, batch_count=1135253.3333333333, ans=0.125 2023-10-03 04:18:48,127 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_48049_210_5632_1_1533864553_3519937_306-283794-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 04:18:51,005 WARNING [train.py:1204] (2/4) Exclude cut with ID _43552_210_8913_1_1533726031059_3523429_25-120161-0_sp1.1 from training. Duration: 0.8909375 2023-10-03 04:18:52,392 WARNING [train.py:1204] (2/4) Exclude cut with ID _8833_210_5219_1_1528592665210_7553059_319-349181-0 from training. Duration: 0.99 2023-10-03 04:18:52,492 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_296-332325-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 04:18:55,062 WARNING [train.py:1204] (2/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_436-186311-0_sp1.1 from training. Duration: 0.5909375 2023-10-03 04:18:55,081 WARNING [train.py:1204] (2/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_348-127789-0 from training. Duration: 0.85 2023-10-03 04:18:55,092 WARNING [train.py:1204] (2/4) Exclude cut with ID _3992_210_1747_1_1526644816031_3731060_443-195183-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 04:18:58,616 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.1.feed_forward1.out_proj.dropout_p, batch_count=1135320.0, ans=0.1 2023-10-03 04:19:00,405 WARNING [train.py:1204] (2/4) Exclude cut with ID _3465_210_3525_1_1526175667060_3574049_308-231735-0_sp1.1 from training. Duration: 0.663625 2023-10-03 04:19:04,479 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6786_210_1388_1_1527839699_4194495_378-46070-0_sp1.1 from training. Duration: 0.6545625 2023-10-03 04:19:04,536 WARNING [train.py:1204] (2/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_63-101996-0 from training. Duration: 0.88 2023-10-03 04:19:04,697 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.feed_forward1.out_proj.dropout_p, batch_count=1135320.0, ans=0.1 2023-10-03 04:19:06,142 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.bypass.skip_rate, batch_count=1135320.0, ans=0.04949747468305833 2023-10-03 04:19:06,154 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.feed_forward2.hidden_balancer.prob, batch_count=1135320.0, ans=0.125 2023-10-03 04:19:08,650 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_2541_210_1794_1_1525590269_3084765_156-173713-0 from training. Duration: 0.81 2023-10-03 04:19:08,689 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_217-239796-0_sp1.1 from training. Duration: 0.963625 2023-10-03 04:19:10,195 WARNING [train.py:1204] (2/4) Exclude cut with ID _21878_210_3525_1_1531738772002_7264330_530-329400-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 04:19:12,044 WARNING [train.py:1204] (2/4) Exclude cut with ID _21783_210_11357_1_1531742458730_3762679_150-62920-0_sp1.1 from training. Duration: 0.963625 2023-10-03 04:19:12,044 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_5522_210_3273_1_1527249606_3909096_366-172508-0 from training. Duration: 0.66 2023-10-03 04:19:13,330 WARNING [train.py:1204] (2/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_303-340446-0_sp1.1 from training. Duration: 0.8181875 2023-10-03 04:19:14,796 WARNING [train.py:1204] (2/4) Exclude cut with ID _8699_210_2069_1_1528630346831_3960409_160-24408-0_sp1.1 from training. Duration: 0.82725 2023-10-03 04:19:14,942 WARNING [train.py:1204] (2/4) Exclude cut with ID _41314_210_12651_1_1533459476543_3766460_299-187801-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 04:19:16,303 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_34886_210_10633_1_1533203882_1755661_92-345288-0_sp1.1 from training. Duration: 0.97275 2023-10-03 04:19:16,519 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.feed_forward2.hidden_balancer.prob, batch_count=1135386.6666666667, ans=0.125 2023-10-03 04:19:19,640 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_5438_210_1794_1_1527336687_2600009_104-288274-0_sp1.1 from training. Duration: 0.52725 2023-10-03 04:19:19,644 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_154-316450-0 from training. Duration: 0.76 2023-10-03 04:19:20,913 WARNING [train.py:1204] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_23-189234-0_sp0.9 from training. Duration: 0.92225 2023-10-03 04:19:22,381 WARNING [train.py:1204] (2/4) Exclude cut with ID _4779_210_2084_1_1527318022944_3914893_346-168820-0_sp1.1 from training. Duration: 0.963625 2023-10-03 04:19:25,185 WARNING [train.py:1204] (2/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_5-55893-0 from training. Duration: 0.55 2023-10-03 04:19:26,506 WARNING [train.py:1204] (2/4) Exclude cut with ID _20455_210_5219_1_1531565996775_8098100_180-24517-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 04:19:31,281 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_3133_210_2087_1_1526106446_2324690_243-143119-0_sp0.9 from training. Duration: 0.8555625 2023-10-03 04:19:31,493 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.balancer2.prob, batch_count=1135453.3333333333, ans=0.125 2023-10-03 04:19:33,932 WARNING [train.py:1204] (2/4) Exclude cut with ID _65585_210_12210_1_1535450451584_3764098_293-212045-0_sp1.1 from training. Duration: 0.92725 2023-10-03 04:19:33,937 WARNING [train.py:1204] (2/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_577-332600-0 from training. Duration: 0.57 2023-10-03 04:19:37,913 WARNING [train.py:1204] (2/4) Exclude cut with ID _30039_210_13270_1_1532484612900_2725150_104-289874-0_sp1.1 from training. Duration: 0.963625 2023-10-03 04:19:37,922 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_3172_210_2087_1_1526292929_3987865_197-97762-0_sp1.1 from training. Duration: 0.6818125 2023-10-03 04:19:41,399 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_256-150908-0_sp1.1 from training. Duration: 0.963625 2023-10-03 04:19:42,764 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_296-287678-0_sp1.1 from training. Duration: 0.863625 2023-10-03 04:19:42,811 WARNING [train.py:1204] (2/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_443-30034-0 from training. Duration: 0.64 2023-10-03 04:19:44,149 WARNING [train.py:1204] (2/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_494-199538-0_sp0.9 from training. Duration: 0.8333125 2023-10-03 04:19:44,212 WARNING [train.py:1204] (2/4) Exclude cut with ID _19708_210_9206_1_1532136650733_4238364_61-162325-0_sp1.1 from training. Duration: 0.97275 2023-10-03 04:19:45,620 WARNING [train.py:1204] (2/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_475-127455-0 from training. Duration: 0.7 2023-10-03 04:19:46,840 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.474e+02 1.972e+02 2.281e+02 2.681e+02 3.966e+02, threshold=4.562e+02, percent-clipped=0.0 2023-10-03 04:19:46,979 WARNING [train.py:1204] (2/4) Exclude cut with ID _48677_210_5663_1_1533902405919_3892989_188-11581-0_sp1.1 from training. Duration: 0.963625 2023-10-03 04:19:47,005 WARNING [train.py:1204] (2/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_136-8381-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 04:19:49,605 WARNING [train.py:1204] (2/4) Exclude cut with ID _55328_210_27767_1_1534580973260_4088170_270-229555-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 04:19:49,633 WARNING [train.py:1204] (2/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_372-266951-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 04:19:51,385 WARNING [train.py:1204] (2/4) Exclude cut with ID _47790_210_16187_1_1533862814066_4207519_14-242149-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 04:19:52,672 INFO [train.py:1046] (2/4) Epoch 33, batch 350, loss[loss=0.1569, simple_loss=0.2524, pruned_loss=0.03073, over 24348.00 frames. ], tot_loss[loss=0.1617, simple_loss=0.2403, pruned_loss=0.04158, over 3916791.15 frames. ], batch size: 74, lr: 3.11e-03, grad_scale: 16.0 2023-10-03 04:19:55,489 WARNING [train.py:1204] (2/4) Exclude cut with ID _7877_210_6014_1_1528286358099_3750136_159-76512-0_sp1.1 from training. Duration: 0.936375 2023-10-03 04:19:55,491 WARNING [train.py:1204] (2/4) Exclude cut with ID _8263_210_1664_1_1528438953494_294100_40-204731-0_sp1.1 from training. Duration: 0.4545625 2023-10-03 04:19:56,225 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.3.encoder.layers.2.self_attn2.whiten, num_groups=1, num_channels=512, metric=10.45 vs. limit=22.5 2023-10-03 04:19:58,936 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8278_210_2550_1_1528637099_4220413_694-238413-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 04:20:03,041 WARNING [train.py:1204] (2/4) Exclude cut with ID _47278_210_12041_1_1533902418349_3576110_421-172710-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 04:20:04,584 WARNING [train.py:1204] (2/4) Exclude cut with ID _14970_210_15709_1_1530773543493_1715730_215-282680-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 04:20:04,619 WARNING [train.py:1204] (2/4) Exclude cut with ID _64639_210_5816_1_1535367619437_3528000_306-149449-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 04:20:06,182 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.balancer2.prob, batch_count=1135653.3333333333, ans=0.125 2023-10-03 04:20:07,385 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_28-291982-0 from training. Duration: 0.97 2023-10-03 04:20:08,808 WARNING [train.py:1204] (2/4) Exclude cut with ID _55134_210_21982_1_1534590028377_3882170_412-15870-0_sp1.1 from training. Duration: 0.936375 2023-10-03 04:20:08,832 WARNING [train.py:1204] (2/4) Exclude cut with ID _13684_210_7497_1_1530619354863_3598359_312-235606-0 from training. Duration: 0.6 2023-10-03 04:20:12,035 WARNING [train.py:1204] (2/4) Exclude cut with ID _37112_210_2575_1_1532997007673_7268610_347-238993-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 04:20:12,074 WARNING [train.py:1204] (2/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_121-284152-0 from training. Duration: 0.59 2023-10-03 04:20:13,344 WARNING [train.py:1204] (2/4) Exclude cut with ID _2941_210_2577_1_1526199673150_2489759_228-2961-0_sp1.1 from training. Duration: 0.97275 2023-10-03 04:20:17,387 WARNING [train.py:1204] (2/4) Exclude cut with ID _28323_210_16187_1_1532480395667_4100300_531-24157-0 from training. Duration: 0.98 2023-10-03 04:20:18,857 WARNING [train.py:1204] (2/4) Exclude cut with ID _32704_210_8954_1_1533017488840_6747370_266-329755-0_sp0.9 from training. Duration: 0.988875 2023-10-03 04:20:22,262 WARNING [train.py:1204] (2/4) Exclude cut with ID _6653_210_2629_1_1527919172441_6732679_1038-146421-0_sp1.1 from training. Duration: 0.97275 2023-10-03 04:20:22,335 WARNING [train.py:1204] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_391-22148-0_sp0.9 from training. Duration: 0.8555625 2023-10-03 04:20:22,444 WARNING [train.py:1204] (2/4) Exclude cut with ID _64521_210_14699_1_1535243427259_3815328_543-341117-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 04:20:23,822 WARNING [train.py:1204] (2/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_52-168712-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 04:20:23,868 WARNING [train.py:1204] (2/4) Exclude cut with ID _33920_210_4220_1_1533257851500_3084399_198-334063-0_sp1.1 from training. Duration: 0.8545625 2023-10-03 04:20:25,119 WARNING [train.py:1204] (2/4) Exclude cut with ID _50093_210_4220_1_1534417141207_3432299_69-111987-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 04:20:25,152 WARNING [train.py:1204] (2/4) Exclude cut with ID _14821_210_8750_1_1530680503991_6829359_189-297451-0_sp0.9 from training. Duration: 0.611125 2023-10-03 04:20:26,580 WARNING [train.py:1204] (2/4) Exclude cut with ID _8030_210_7497_1_1528444886781_3531089_262-86750-0_sp0.9 from training. Duration: 0.9 2023-10-03 04:20:26,591 WARNING [train.py:1204] (2/4) Exclude cut with ID _24612_210_8913_1_1531998054193_3806359_294-191196-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 04:20:34,501 WARNING [train.py:1204] (2/4) Exclude cut with ID _16909_210_5724_1_1531292079912_4118919_707-342896-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 04:20:34,522 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_9460_210_4211_1_1528954587_10049667_572-37035-0_sp0.9 from training. Duration: 0.888875 2023-10-03 04:20:35,865 WARNING [train.py:1204] (2/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_762-109886-0_sp1.1 from training. Duration: 0.82725 2023-10-03 04:20:35,892 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_14953_210_2151_1_1530701710_5616165_519-326393-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 04:20:39,357 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=1135786.6666666667, ans=0.1 2023-10-03 04:20:40,508 WARNING [train.py:1204] (2/4) Exclude cut with ID _25914_210_6400_1_1532141317058_644740_52-96808-0 from training. Duration: 0.91 2023-10-03 04:20:40,511 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_68095_210_20185_1_1535718226_8888855_1079-94525-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 04:20:44,656 WARNING [train.py:1204] (2/4) Exclude cut with ID _14860_210_4915_1_1530702020340_7343240_746-3647-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 04:20:44,656 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_70379_210_7925_1_1535941455_557455_49-101720-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 04:20:44,675 WARNING [train.py:1204] (2/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_8-307291-0_sp1.1 from training. Duration: 0.936375 2023-10-03 04:20:47,375 WARNING [train.py:1204] (2/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_117-290916-0 from training. Duration: 0.51 2023-10-03 04:20:50,183 WARNING [train.py:1204] (2/4) Exclude cut with ID _22020_210_2461_1_1531724164513_6326900_944-339840-0_sp1.1 from training. Duration: 0.963625 2023-10-03 04:20:50,270 WARNING [train.py:1204] (2/4) Exclude cut with ID IC0515W0043-254911 from training. Duration: 0.976 2023-10-03 04:20:51,619 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_3910_210_1794_1_1526742306_334530_28-238909-0 from training. Duration: 0.81 2023-10-03 04:20:51,639 WARNING [train.py:1204] (2/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_269-267275-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 04:20:51,861 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.1.ff3_skip_rate, batch_count=1135853.3333333333, ans=0.0 2023-10-03 04:20:53,172 WARNING [train.py:1204] (2/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_72-251255-0_sp1.1 from training. Duration: 0.8545625 2023-10-03 04:20:53,184 WARNING [train.py:1204] (2/4) Exclude cut with ID _14886_210_4220_1_1530702020355_3516190_175-229073-0 from training. Duration: 0.45 2023-10-03 04:20:58,206 WARNING [train.py:1204] (2/4) Exclude cut with ID _40519_210_13794_1_1533715228655_7337390_921-209452-0_sp1.1 from training. Duration: 0.963625 2023-10-03 04:20:59,492 WARNING [train.py:1204] (2/4) Exclude cut with ID _29857_210_20034_1_1532395886753_3798160_335-163562-0_sp0.9 from training. Duration: 0.9444375 2023-10-03 04:21:00,861 WARNING [train.py:1204] (2/4) Exclude cut with ID _58583_210_5236_1_1534726835266_3574249_384-58445-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 04:21:02,653 WARNING [train.py:1204] (2/4) Exclude cut with ID _44011_210_2046_1_1533722401691_3631310_96-315656-0_sp1.1 from training. Duration: 0.963625 2023-10-03 04:21:02,657 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_68484_210_1794_1_1535849804_7678097_549-170613-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 04:21:05,360 WARNING [train.py:1204] (2/4) Exclude cut with ID _37892_210_19596_1_1533198677897_3964058_214-148058-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 04:21:06,771 INFO [train.py:1046] (2/4) Epoch 33, batch 400, loss[loss=0.1597, simple_loss=0.2471, pruned_loss=0.03621, over 24444.00 frames. ], tot_loss[loss=0.1614, simple_loss=0.2399, pruned_loss=0.04151, over 4091572.17 frames. ], batch size: 63, lr: 3.11e-03, grad_scale: 32.0 2023-10-03 04:21:08,255 WARNING [train.py:1204] (2/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_415-78792-0_sp0.9 from training. Duration: 0.9 2023-10-03 04:21:11,535 WARNING [train.py:1204] (2/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_383-205833-0_sp1.1 from training. Duration: 0.863625 2023-10-03 04:21:11,617 WARNING [train.py:1204] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_167-137255-0 from training. Duration: 0.64 2023-10-03 04:21:11,625 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8275_210_4198_1_1528455320_3892571_484-154786-0_sp1.1 from training. Duration: 0.963625 2023-10-03 04:21:11,675 WARNING [train.py:1204] (2/4) Exclude cut with ID _40284_210_2084_1_1533513679814_3704279_396-319412-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 04:21:13,133 WARNING [train.py:1204] (2/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_280-120942-0_sp0.9 from training. Duration: 0.8555625 2023-10-03 04:21:13,197 WARNING [train.py:1204] (2/4) Exclude cut with ID _45630_210_10556_1_1533949174498_3902049_558-240889-0_sp1.1 from training. Duration: 0.92725 2023-10-03 04:21:15,882 WARNING [train.py:1204] (2/4) Exclude cut with ID _56235_210_10640_1_1534557335235_4161099_197-307458-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 04:21:17,444 WARNING [train.py:1204] (2/4) Exclude cut with ID _16909_210_5724_1_1531292079912_4118919_163-254843-0_sp1.1 from training. Duration: 0.92725 2023-10-03 04:21:17,570 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_103-84010-0 from training. Duration: 0.69 2023-10-03 04:21:18,812 WARNING [train.py:1204] (2/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_401-158239-0 from training. Duration: 0.71 2023-10-03 04:21:18,813 WARNING [train.py:1204] (2/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_899-233658-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 04:21:20,229 WARNING [train.py:1204] (2/4) Exclude cut with ID _23825_210_13449_1_1531915313526_3497150_240-271367-0 from training. Duration: 0.97 2023-10-03 04:21:20,278 WARNING [train.py:1204] (2/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_424-134619-0_sp1.1 from training. Duration: 0.92725 2023-10-03 04:21:21,814 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.conv_skip_rate, batch_count=1135986.6666666667, ans=0.0 2023-10-03 04:21:25,578 WARNING [train.py:1204] (2/4) Exclude cut with ID _44831_210_13427_1_1533816011549_3689035_32-188147-0_sp1.1 from training. Duration: 0.87275 2023-10-03 04:21:25,581 WARNING [train.py:1204] (2/4) Exclude cut with ID _59170_210_6721_1_1534983633960_4203335_314-63879-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 04:21:25,595 WARNING [train.py:1204] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_602-275260-0 from training. Duration: 0.82 2023-10-03 04:21:26,710 WARNING [train.py:1204] (2/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_511-306767-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 04:21:26,742 WARNING [train.py:1204] (2/4) Exclude cut with ID _41817_210_18835_1_1533434477160_7423049_565-201775-0_sp1.1 from training. Duration: 0.92725 2023-10-03 04:21:26,751 WARNING [train.py:1204] (2/4) Exclude cut with ID _28317_210_12742_1_1532480302381_8510325_778-173151-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 04:21:26,873 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.0.conv_module2.balancer1.prob, batch_count=1135986.6666666667, ans=0.125 2023-10-03 04:21:27,988 WARNING [train.py:1204] (2/4) Exclude cut with ID _14998_210_5219_1_1530705582080_7474970_680-51001-0_sp1.1 from training. Duration: 0.963625 2023-10-03 04:21:29,351 WARNING [train.py:1204] (2/4) Exclude cut with ID IC0677W0059-334891 from training. Duration: 0.896 2023-10-03 04:21:30,729 WARNING [train.py:1204] (2/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_370-331600-0 from training. Duration: 0.94 2023-10-03 04:21:34,252 WARNING [train.py:1204] (2/4) Exclude cut with ID _66840_210_5219_1_1535452168426_4196349_446-240192-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 04:21:34,497 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.feed_forward3.hidden_balancer.prob, batch_count=1135986.6666666667, ans=0.125 2023-10-03 04:21:35,580 WARNING [train.py:1204] (2/4) Exclude cut with ID _65242_210_18628_1_1535369393717_3823149_38-6732-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 04:21:36,907 WARNING [train.py:1204] (2/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_302-2550-0 from training. Duration: 0.81 2023-10-03 04:21:38,758 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_43802_210_2290_1_1533516203_8829504_100-348441-0 from training. Duration: 0.91 2023-10-03 04:21:40,276 WARNING [train.py:1204] (2/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_329-285801-0_sp1.1 from training. Duration: 0.82725 2023-10-03 04:21:42,987 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_30167_210_3528_1_1532428012_4772375_343-334712-0_sp1.1 from training. Duration: 0.936375 2023-10-03 04:21:51,179 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7543_210_6753_1_1528543318_4552_1-85072-0 from training. Duration: 0.83 2023-10-03 04:21:54,040 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7002_210_1794_1_1527935634_4034444_487-236501-0_sp1.1 from training. Duration: 0.6 2023-10-03 04:21:55,407 WARNING [train.py:1204] (2/4) Exclude cut with ID _7160_210_1876_1_1527940923231_3703799_282-322611-0 from training. Duration: 0.87 2023-10-03 04:21:58,826 WARNING [train.py:1204] (2/4) Exclude cut with ID _67335_210_5219_1_1535525975652_4275229_570-262983-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 04:21:59,048 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.nonlin_attention.balancer.prob, batch_count=1136120.0, ans=0.125 2023-10-03 04:22:00,247 WARNING [train.py:1204] (2/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_625-338546-0_sp1.1 from training. Duration: 0.8 2023-10-03 04:22:00,279 WARNING [train.py:1204] (2/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_176-56120-0 from training. Duration: 0.83 2023-10-03 04:22:00,432 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.0.feed_forward1.hidden_balancer.prob, batch_count=1136120.0, ans=0.125 2023-10-03 04:22:04,505 WARNING [train.py:1204] (2/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_963-187109-0_sp1.1 from training. Duration: 0.9 2023-10-03 04:22:07,755 WARNING [train.py:1204] (2/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_121-284152-0_sp0.9 from training. Duration: 0.6555625 2023-10-03 04:22:09,234 WARNING [train.py:1204] (2/4) Exclude cut with ID _45021_210_9994_1_1533619751094_3330288_104-81711-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 04:22:11,237 WARNING [train.py:1204] (2/4) Exclude cut with ID _51382_210_23862_1_1534492801838_3650104_129-343914-0_sp1.1 from training. Duration: 0.936375 2023-10-03 04:22:12,602 WARNING [train.py:1204] (2/4) Exclude cut with ID _14970_210_15709_1_1530775547361_1966965_8-169924-0 from training. Duration: 0.92 2023-10-03 04:22:14,029 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_2295_210_2087_1_1525500993_2407863_234-83104-0_sp1.1 from training. Duration: 0.563625 2023-10-03 04:22:15,379 WARNING [train.py:1204] (2/4) Exclude cut with ID _14687_210_5854_1_1530703838259_3664889_115-239046-0 from training. Duration: 0.67 2023-10-03 04:22:15,720 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.feed_forward1.out_proj.dropout_p, batch_count=1136186.6666666667, ans=0.1 2023-10-03 04:22:16,277 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.2.encoder.layers.1.nonlin_attention.whiten2, num_groups=1, num_channels=384, metric=5.59 vs. limit=15.0 2023-10-03 04:22:16,715 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.539e+02 1.810e+02 1.929e+02 2.053e+02 2.839e+02, threshold=3.858e+02, percent-clipped=0.0 2023-10-03 04:22:16,883 WARNING [train.py:1204] (2/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_690-126306-0_sp1.1 from training. Duration: 0.7090625 2023-10-03 04:22:16,897 WARNING [train.py:1204] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_625-102955-0_sp0.9 from training. Duration: 0.8555625 2023-10-03 04:22:19,606 WARNING [train.py:1204] (2/4) Exclude cut with ID _9389_210_1664_1_1529217337301_5289269_289-213417-0 from training. Duration: 0.89 2023-10-03 04:22:20,934 INFO [train.py:1046] (2/4) Epoch 33, batch 450, loss[loss=0.1634, simple_loss=0.2389, pruned_loss=0.04397, over 23512.00 frames. ], tot_loss[loss=0.1615, simple_loss=0.2399, pruned_loss=0.04154, over 4216528.69 frames. ], batch size: 285, lr: 3.11e-03, grad_scale: 16.0 2023-10-03 04:22:21,003 WARNING [train.py:1204] (2/4) Exclude cut with ID _13253_210_1925_1_1530757775819_3577873_191-144977-0_sp0.9 from training. Duration: 0.8666875 2023-10-03 04:22:22,308 WARNING [train.py:1204] (2/4) Exclude cut with ID _21783_210_11357_1_1531742458730_3762679_325-155703-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 04:22:22,342 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_2295_210_2087_1_1525500993_2407863_116-238894-0_sp0.9 from training. Duration: 0.77775 2023-10-03 04:22:23,571 WARNING [train.py:1204] (2/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_94-126128-0 from training. Duration: 0.86 2023-10-03 04:22:23,594 WARNING [train.py:1204] (2/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_395-310220-0_sp0.9 from training. Duration: 0.988875 2023-10-03 04:22:23,741 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.1.conv_skip_rate, batch_count=1136253.3333333333, ans=0.0 2023-10-03 04:22:25,000 WARNING [train.py:1204] (2/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_821-110852-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 04:22:25,053 WARNING [train.py:1204] (2/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_64-248359-0_sp1.1 from training. Duration: 0.62725 2023-10-03 04:22:25,062 WARNING [train.py:1204] (2/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_138-187031-0 from training. Duration: 0.99 2023-10-03 04:22:25,099 WARNING [train.py:1204] (2/4) Exclude cut with ID _4122_210_2084_1_1526713164417_4353280_528-206771-0_sp0.9 from training. Duration: 0.97775 2023-10-03 04:22:27,824 WARNING [train.py:1204] (2/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_844-96989-0_sp1.1 from training. Duration: 0.6454375 2023-10-03 04:22:29,874 WARNING [train.py:1204] (2/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_106-30782-0_sp1.1 from training. Duration: 0.72725 2023-10-03 04:22:29,985 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.0.ff3_skip_rate, batch_count=1136253.3333333333, ans=0.0 2023-10-03 04:22:35,815 INFO [scaling.py:1118] (2/4) WithLoss: name=encoder.encoders.3.encoder.layers.0.self_attn_weights, loss-sum=0.000e+00 2023-10-03 04:22:39,037 WARNING [train.py:1204] (2/4) Exclude cut with ID _14683_210_5219_1_1530698352608_3974859_281-250040-0_sp1.1 from training. Duration: 0.936375 2023-10-03 04:22:39,065 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_46013_210_7234_1_1533776362_3744032_463-52378-0_sp0.9 from training. Duration: 0.9555625 2023-10-03 04:22:42,333 WARNING [train.py:1204] (2/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_299-34413-0 from training. Duration: 0.65 2023-10-03 04:22:43,715 WARNING [train.py:1204] (2/4) Exclude cut with ID _35799_210_5854_1_1533263487887_3529071_490-326847-0 from training. Duration: 0.99 2023-10-03 04:22:46,577 WARNING [train.py:1204] (2/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_589-9070-0_sp0.9 from training. Duration: 0.788875 2023-10-03 04:22:49,415 WARNING [train.py:1204] (2/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_884-29515-0_sp1.1 from training. Duration: 0.936375 2023-10-03 04:22:50,884 WARNING [train.py:1204] (2/4) Exclude cut with ID _15012_210_1968_1_1530856820429_5095350_474-119995-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 04:22:53,798 WARNING [train.py:1204] (2/4) Exclude cut with ID _65585_210_12210_1_1535450451584_3764098_211-97124-0_sp1.1 from training. Duration: 0.963625 2023-10-03 04:22:53,862 WARNING [train.py:1204] (2/4) Exclude cut with ID _33640_210_2655_1_1533117605479_4855469_666-224463-0_sp1.1 from training. Duration: 0.963625 2023-10-03 04:22:56,682 WARNING [train.py:1204] (2/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_35-153886-0 from training. Duration: 0.84 2023-10-03 04:22:57,952 WARNING [train.py:1204] (2/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_210-106047-0 from training. Duration: 0.97 2023-10-03 04:22:58,089 WARNING [train.py:1204] (2/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_479-125611-0 from training. Duration: 0.96 2023-10-03 04:22:59,353 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_9417_210_1794_1_1529235261_4614131_427-153963-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 04:23:00,064 WARNING [train.py:1204] (2/4) Exclude cut with ID _23818_210_6228_1_1531917063884_3676920_133-170571-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 04:23:01,365 WARNING [train.py:1204] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_602-275260-0_sp1.1 from training. Duration: 0.7454375 2023-10-03 04:23:02,804 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9029W0121-509521 from training. Duration: 0.896 2023-10-03 04:23:02,812 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9029W0434-509821 from training. Duration: 0.64 2023-10-03 04:23:04,081 WARNING [train.py:1204] (2/4) Exclude cut with ID _23038_210_9570_1_1532001584158_3652419_289-307034-0_sp1.1 from training. Duration: 0.936375 2023-10-03 04:23:05,550 WARNING [train.py:1204] (2/4) Exclude cut with ID _29345_210_4381_1_1532689168263_3860169_149-257268-0_sp0.9 from training. Duration: 0.9 2023-10-03 04:23:07,382 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_405-339986-0_sp1.1 from training. Duration: 0.463625 2023-10-03 04:23:10,239 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_2655_210_2087_1_1525688959_3897400_437-337690-0_sp1.1 from training. Duration: 0.57275 2023-10-03 04:23:10,264 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7543_210_6753_1_1528541907_1399322_165-323619-0_sp1.1 from training. Duration: 0.62725 2023-10-03 04:23:10,320 WARNING [train.py:1204] (2/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_508-335369-0_sp1.1 from training. Duration: 0.436375 2023-10-03 04:23:12,329 WARNING [train.py:1204] (2/4) Exclude cut with ID _7852_210_5219_1_1528286471316_8139400_358-287492-0 from training. Duration: 0.96 2023-10-03 04:23:15,038 WARNING [train.py:1204] (2/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_322-217220-0_sp0.9 from training. Duration: 0.9555625 2023-10-03 04:23:15,686 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.3.encoder.layers.1.conv_module2.whiten, num_groups=1, num_channels=512, metric=3.04 vs. limit=15.0 2023-10-03 04:23:17,698 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_3171_210_2087_1_1526109084_2330007_66-260583-0_sp0.9 from training. Duration: 0.8 2023-10-03 04:23:17,735 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8558_210_1410_1_1528505225_7613406_131-329524-0_sp1.1 from training. Duration: 0.7181875 2023-10-03 04:23:19,082 WARNING [train.py:1204] (2/4) Exclude cut with ID _9235_210_5219_1_1528891154253_8046120_30-326681-0 from training. Duration: 0.83 2023-10-03 04:23:20,619 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.0.attention_skip_rate, batch_count=1136520.0, ans=0.0 2023-10-03 04:23:21,932 WARNING [train.py:1204] (2/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_58-68956-0_sp0.9 from training. Duration: 0.92225 2023-10-03 04:23:23,373 WARNING [train.py:1204] (2/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_255-312654-0 from training. Duration: 0.91 2023-10-03 04:23:24,720 WARNING [train.py:1204] (2/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_99-167392-0 from training. Duration: 0.61 2023-10-03 04:23:25,006 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.ff3_skip_rate, batch_count=1136520.0, ans=0.0 2023-10-03 04:23:26,110 WARNING [train.py:1204] (2/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_94-126128-0_sp0.9 from training. Duration: 0.9555625 2023-10-03 04:23:30,338 WARNING [train.py:1204] (2/4) Exclude cut with ID _68635_210_9317_1_1535770813410_4315870_187-192817-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 04:23:30,620 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.nonlin_attention.balancer.prob, batch_count=1136520.0, ans=0.125 2023-10-03 04:23:31,768 WARNING [train.py:1204] (2/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_277-204970-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 04:23:33,856 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.conv_skip_rate, batch_count=1136586.6666666667, ans=0.0 2023-10-03 04:23:35,412 INFO [train.py:1046] (2/4) Epoch 33, batch 500, loss[loss=0.1701, simple_loss=0.2544, pruned_loss=0.04295, over 24091.00 frames. ], tot_loss[loss=0.1613, simple_loss=0.2401, pruned_loss=0.04127, over 4329963.34 frames. ], batch size: 80, lr: 3.11e-03, grad_scale: 16.0 2023-10-03 04:23:35,465 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6163_210_6400_1_1527506746_4263068_283-137811-0_sp0.9 from training. Duration: 0.8555625 2023-10-03 04:23:35,501 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9144W0121-566446 from training. Duration: 0.896 2023-10-03 04:23:36,498 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.1.encoder.layers.0.feed_forward1.out_whiten, num_groups=1, num_channels=256, metric=9.77 vs. limit=15.0 2023-10-03 04:23:38,430 WARNING [train.py:1204] (2/4) Exclude cut with ID _49371_210_12071_1_1533952761021_3812190_351-315519-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 04:23:39,820 WARNING [train.py:1204] (2/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_115-326778-0_sp1.1 from training. Duration: 0.8454375 2023-10-03 04:23:41,228 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_17292_210_6753_1_1531125388_2943734_436-198965-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 04:23:41,237 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9165W0117-576751 from training. Duration: 0.896 2023-10-03 04:23:41,495 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.conv_module2.balancer2.prob, batch_count=1136586.6666666667, ans=0.125 2023-10-03 04:23:42,995 WARNING [train.py:1204] (2/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_212-149947-0 from training. Duration: 0.73 2023-10-03 04:23:43,004 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_13757_210_3528_1_1530440682_4107239_65-167544-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 04:23:46,438 WARNING [train.py:1204] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_700-183549-0_sp0.9 from training. Duration: 0.7555625 2023-10-03 04:23:50,500 WARNING [train.py:1204] (2/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_216-343007-0_sp0.9 from training. Duration: 0.7333125 2023-10-03 04:23:51,971 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7544_210_6753_1_1528587722_3576686_295-258479-0_sp1.1 from training. Duration: 0.763625 2023-10-03 04:23:53,378 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_365-287106-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 04:23:53,393 WARNING [train.py:1204] (2/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_300-3747-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 04:23:53,451 WARNING [train.py:1204] (2/4) Exclude cut with ID _14069_210_5632_1_1530512692033_4803390_487-303201-0_sp1.1 from training. Duration: 0.92725 2023-10-03 04:24:03,232 WARNING [train.py:1204] (2/4) Exclude cut with ID _14459_210_2228_1_1531443641946_7516700_668-123026-0_sp1.1 from training. Duration: 0.936375 2023-10-03 04:24:03,263 WARNING [train.py:1204] (2/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_108-53929-0_sp1.1 from training. Duration: 0.536375 2023-10-03 04:24:04,526 WARNING [train.py:1204] (2/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_266-137104-0_sp0.9 from training. Duration: 0.72225 2023-10-03 04:24:04,560 WARNING [train.py:1204] (2/4) Exclude cut with ID _12302_210_5219_1_1529924446466_8107280_902-168619-0_sp1.1 from training. Duration: 0.936375 2023-10-03 04:24:04,589 WARNING [train.py:1204] (2/4) Exclude cut with ID _59502_210_12210_1_1535107024698_1835139_146-162495-0 from training. Duration: 0.92 2023-10-03 04:24:05,281 WARNING [train.py:1204] (2/4) Exclude cut with ID _3465_210_3525_1_1526175667060_3574049_412-287526-0_sp1.1 from training. Duration: 0.6545625 2023-10-03 04:24:09,272 WARNING [train.py:1204] (2/4) Exclude cut with ID _7160_210_1876_1_1527940923231_3703799_120-150943-0_sp0.9 from training. Duration: 0.9 2023-10-03 04:24:09,322 WARNING [train.py:1204] (2/4) Exclude cut with ID _15037_210_2253_1_1530752294104_3267650_296-192597-0_sp1.1 from training. Duration: 0.736375 2023-10-03 04:24:11,220 WARNING [train.py:1204] (2/4) Exclude cut with ID _13252_210_1925_1_1530671372047_3336909_397-216497-0_sp1.1 from training. Duration: 0.87275 2023-10-03 04:24:11,242 WARNING [train.py:1204] (2/4) Exclude cut with ID _40094_210_16380_1_1533294005773_3641159_76-269320-0_sp1.1 from training. Duration: 0.936375 2023-10-03 04:24:11,300 WARNING [train.py:1204] (2/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_449-15508-0 from training. Duration: 0.82 2023-10-03 04:24:15,646 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9309W0194-640321 from training. Duration: 0.64 2023-10-03 04:24:17,763 WARNING [train.py:1204] (2/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_284-312178-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 04:24:19,157 WARNING [train.py:1204] (2/4) Exclude cut with ID _29853_210_7497_1_1532505600851_6642110_401-55937-0_sp1.1 from training. Duration: 0.92725 2023-10-03 04:24:19,332 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.balancer2.prob, batch_count=1136786.6666666667, ans=0.125 2023-10-03 04:24:20,506 WARNING [train.py:1204] (2/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_716-24918-0_sp1.1 from training. Duration: 0.92725 2023-10-03 04:24:20,535 WARNING [train.py:1204] (2/4) Exclude cut with ID _19637_210_4220_1_1531537167676_3205060_314-305638-0_sp1.1 from training. Duration: 0.92725 2023-10-03 04:24:21,889 WARNING [train.py:1204] (2/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_475-127455-0_sp1.1 from training. Duration: 0.636375 2023-10-03 04:24:23,282 WARNING [train.py:1204] (2/4) Exclude cut with ID _5444_210_2151_1_1527418808464_5312210_581-315411-0 from training. Duration: 0.62 2023-10-03 04:24:25,082 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.4.encoder.layers.2.conv_module1.whiten, num_groups=1, num_channels=384, metric=2.62 vs. limit=15.0 2023-10-03 04:24:25,994 WARNING [train.py:1204] (2/4) Exclude cut with ID _44902_210_16187_1_1533888006062_3792078_149-67212-0_sp0.9 from training. Duration: 0.9666875 2023-10-03 04:24:27,383 WARNING [train.py:1204] (2/4) Exclude cut with ID _31602_210_5854_1_1532677021456_4458250_63-63253-0_sp1.1 from training. Duration: 0.963625 2023-10-03 04:24:31,636 WARNING [train.py:1204] (2/4) Exclude cut with ID _55209_210_1531_1_1534675828996_4597160_660-203992-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 04:24:34,444 WARNING [train.py:1204] (2/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_83-190624-0_sp1.1 from training. Duration: 0.92725 2023-10-03 04:24:39,158 WARNING [train.py:1204] (2/4) Exclude cut with ID _58605_210_12210_1_1534723195939_3904259_154-139635-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 04:24:42,494 WARNING [train.py:1204] (2/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_164-224052-0 from training. Duration: 0.7 2023-10-03 04:24:42,510 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_1126-189071-0_sp1.1 from training. Duration: 0.963625 2023-10-03 04:24:42,521 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_15396_210_6890_1_1531308473_1938936_310-214789-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 04:24:42,729 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.bypass_mid.scale_min, batch_count=1136853.3333333333, ans=0.2 2023-10-03 04:24:45,301 WARNING [train.py:1204] (2/4) Exclude cut with ID _8395_210_1531_1_1528597871523_4411579_431-132470-0 from training. Duration: 0.74 2023-10-03 04:24:46,561 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.535e+02 1.881e+02 2.075e+02 2.361e+02 3.441e+02, threshold=4.149e+02, percent-clipped=0.0 2023-10-03 04:24:46,667 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_3780_210_2087_1_1526467760_2130483_170-259044-0_sp0.9 from training. Duration: 0.77775 2023-10-03 04:24:47,388 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.3.encoder.layers.1.feed_forward2.out_whiten, num_groups=1, num_channels=512, metric=6.16 vs. limit=15.0 2023-10-03 04:24:48,627 WARNING [train.py:1204] (2/4) Exclude cut with ID _12733_210_8341_1_1530167424556_3646380_537-230640-0_sp1.1 from training. Duration: 0.963625 2023-10-03 04:24:50,005 INFO [train.py:1046] (2/4) Epoch 33, batch 550, loss[loss=0.1651, simple_loss=0.2569, pruned_loss=0.0367, over 24334.00 frames. ], tot_loss[loss=0.1618, simple_loss=0.2407, pruned_loss=0.04144, over 4421145.63 frames. ], batch size: 74, lr: 3.11e-03, grad_scale: 8.0 2023-10-03 04:24:52,861 WARNING [train.py:1204] (2/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_159-281310-0 from training. Duration: 0.95 2023-10-03 04:24:55,538 WARNING [train.py:1204] (2/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_434-83000-0 from training. Duration: 0.98 2023-10-03 04:24:55,565 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_4405_210_1849_1_1526732205_3593939_493-32464-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 04:24:55,574 WARNING [train.py:1204] (2/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_342-100157-0 from training. Duration: 0.54 2023-10-03 04:24:57,018 WARNING [train.py:1204] (2/4) Exclude cut with ID _44778_210_2054_1_1533798127203_7443090_765-250133-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 04:24:57,024 WARNING [train.py:1204] (2/4) Exclude cut with ID _38670_210_15379_1_1533448810659_4391059_81-312245-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 04:24:58,398 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_50050_210_16223_1_1535435796_7290138_1273-247611-0_sp1.1 from training. Duration: 0.936375 2023-10-03 04:24:58,459 WARNING [train.py:1204] (2/4) Exclude cut with ID _44831_210_13427_1_1533816011549_3689035_399-18591-0_sp1.1 from training. Duration: 0.936375 2023-10-03 04:24:58,474 WARNING [train.py:1204] (2/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_557-324258-0_sp0.9 from training. Duration: 0.9 2023-10-03 04:24:59,868 WARNING [train.py:1204] (2/4) Exclude cut with ID _3465_210_3525_1_1526175667060_3574049_62-335552-0_sp1.1 from training. Duration: 0.7909375 2023-10-03 04:25:01,297 WARNING [train.py:1204] (2/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_673-264125-0_sp1.1 from training. Duration: 0.963625 2023-10-03 04:25:02,676 WARNING [train.py:1204] (2/4) Exclude cut with ID _8030_210_7497_1_1528444886781_3531089_262-86750-0 from training. Duration: 0.81 2023-10-03 04:25:04,100 WARNING [train.py:1204] (2/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_444-28041-0_sp1.1 from training. Duration: 0.77275 2023-10-03 04:25:06,914 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_34008_210_10431_1_1533345424_8183247_516-14858-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 04:25:06,926 WARNING [train.py:1204] (2/4) Exclude cut with ID _32704_210_8954_1_1533017488840_6747370_54-248218-0_sp1.1 from training. Duration: 0.936375 2023-10-03 04:25:10,192 WARNING [train.py:1204] (2/4) Exclude cut with ID _13253_210_1925_1_1530757775819_3577873_300-119782-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 04:25:10,267 WARNING [train.py:1204] (2/4) Exclude cut with ID _39290_210_4220_1_1533862778760_3075118_21-240703-0_sp1.1 from training. Duration: 0.936375 2023-10-03 04:25:10,457 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.ff2_skip_rate, batch_count=1136986.6666666667, ans=0.0 2023-10-03 04:25:14,832 WARNING [train.py:1204] (2/4) Exclude cut with ID 774-127930-0014-10412-0_sp1.1 from training. Duration: 0.95 2023-10-03 04:25:14,894 WARNING [train.py:1204] (2/4) Exclude cut with ID _67499_210_17282_1_1535509805220_3692430_20-179346-0 from training. Duration: 0.97 2023-10-03 04:25:18,029 WARNING [train.py:1204] (2/4) Exclude cut with ID _8365_210_2064_1_1528592311070_4677690_431-294102-0_sp0.9 from training. Duration: 0.988875 2023-10-03 04:25:22,153 WARNING [train.py:1204] (2/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_365-1492-0_sp1.1 from training. Duration: 0.82725 2023-10-03 04:25:22,180 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_105-114797-0_sp0.9 from training. Duration: 0.9333125 2023-10-03 04:25:23,599 WARNING [train.py:1204] (2/4) Exclude cut with ID _6274_210_2084_1_1527937583837_7494020_769-168302-0_sp0.9 from training. Duration: 0.788875 2023-10-03 04:25:26,495 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_40340_210_9024_1_1533433916_4132944_320-270277-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 04:25:26,503 WARNING [train.py:1204] (2/4) Exclude cut with ID ID0203W0003-781711 from training. Duration: 0.958 2023-10-03 04:25:26,570 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_30434_210_13049_1_1532519384_981746_52-12932-0_sp1.1 from training. Duration: 0.936375 2023-10-03 04:25:27,925 WARNING [train.py:1204] (2/4) Exclude cut with ID _8263_210_1664_1_1528438953494_294100_40-204731-0_sp0.9 from training. Duration: 0.5555625 2023-10-03 04:25:28,887 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.0.layers.0.self_attn1.whiten, num_groups=1, num_channels=192, metric=10.66 vs. limit=22.5 2023-10-03 04:25:31,859 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1032-236863-0_sp0.9 from training. Duration: 0.9333125 2023-10-03 04:25:31,915 WARNING [train.py:1204] (2/4) Exclude cut with ID _8394_210_6702_1_1528590504316_3540189_335-274780-0_sp0.9 from training. Duration: 0.8666875 2023-10-03 04:25:31,929 WARNING [train.py:1204] (2/4) Exclude cut with ID _66873_210_5517_1_1535454136506_4293370_654-52713-0_sp1.1 from training. Duration: 0.7 2023-10-03 04:25:33,368 WARNING [train.py:1204] (2/4) Exclude cut with ID _5250_210_5219_1_1527249561806_8692110_742-267856-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 04:25:34,824 WARNING [train.py:1204] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_74-48717-0 from training. Duration: 0.58 2023-10-03 04:25:36,177 WARNING [train.py:1204] (2/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_18-46756-0 from training. Duration: 0.79 2023-10-03 04:25:37,548 WARNING [train.py:1204] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_612-56061-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 04:25:37,565 WARNING [train.py:1204] (2/4) Exclude cut with ID _56423_210_5854_1_1534896061049_3165969_412-2403-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 04:25:39,576 WARNING [train.py:1204] (2/4) Exclude cut with ID _62574_210_2655_1_1535180407058_3933249_120-196848-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 04:25:39,577 WARNING [train.py:1204] (2/4) Exclude cut with ID _40519_210_13794_1_1533715228655_7337390_916-51000-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 04:25:43,692 WARNING [train.py:1204] (2/4) Exclude cut with ID _65263_210_22356_1_1535763598691_3921037_108-21902-0_sp1.1 from training. Duration: 0.9 2023-10-03 04:25:45,098 WARNING [train.py:1204] (2/4) Exclude cut with ID _4122_210_2084_1_1526713164417_4353280_528-206771-0_sp1.1 from training. Duration: 0.8 2023-10-03 04:25:46,586 WARNING [train.py:1204] (2/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_43-325657-0_sp1.1 from training. Duration: 0.92725 2023-10-03 04:25:46,635 WARNING [train.py:1204] (2/4) Exclude cut with ID _44659_210_2721_1_1534036120061_6614860_314-82041-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 04:25:48,592 WARNING [train.py:1204] (2/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_81-300212-0_sp0.9 from training. Duration: 0.6666875 2023-10-03 04:25:49,955 WARNING [train.py:1204] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_665-196155-0_sp0.9 from training. Duration: 0.7444375 2023-10-03 04:25:51,338 WARNING [train.py:1204] (2/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_140-336881-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 04:25:52,655 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6290_210_3273_1_1527595224_1282787_115-87095-0_sp0.9 from training. Duration: 0.611125 2023-10-03 04:25:53,958 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_53373_210_5399_1_1534229471_4715695_607-259685-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 04:25:54,058 WARNING [train.py:1204] (2/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_175-163894-0_sp1.1 from training. Duration: 0.67275 2023-10-03 04:25:55,466 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7002_210_1794_1_1527939683_5157286_1-281264-0_sp1.1 from training. Duration: 0.4 2023-10-03 04:25:59,658 WARNING [train.py:1204] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_605-257205-0 from training. Duration: 0.59 2023-10-03 04:26:02,184 INFO [train.py:1046] (2/4) Epoch 33, batch 600, loss[loss=0.1658, simple_loss=0.2347, pruned_loss=0.04848, over 23811.00 frames. ], tot_loss[loss=0.1625, simple_loss=0.2416, pruned_loss=0.04168, over 4489628.55 frames. ], batch size: 179, lr: 3.11e-03, grad_scale: 8.0 2023-10-03 04:26:03,594 WARNING [train.py:1204] (2/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_454-314901-0 from training. Duration: 0.46 2023-10-03 04:26:06,212 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_46032_210_14656_1_1533781412_4507969_287-130422-0_sp1.1 from training. Duration: 0.97275 2023-10-03 04:26:06,227 WARNING [train.py:1204] (2/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_186-309161-0_sp1.1 from training. Duration: 0.4909375 2023-10-03 04:26:06,270 WARNING [train.py:1204] (2/4) Exclude cut with ID _42659_210_2070_1_1533607442765_3120380_45-342307-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 04:26:13,140 WARNING [train.py:1204] (2/4) Exclude cut with ID _12555_210_8750_1_1530075594402_3780239_478-326818-0_sp1.1 from training. Duration: 0.963625 2023-10-03 04:26:14,538 WARNING [train.py:1204] (2/4) Exclude cut with ID _7606_210_4915_1_1528894253277_5080690_127-177761-0_sp1.1 from training. Duration: 0.5818125 2023-10-03 04:26:14,683 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.conv_module1.balancer1.prob, batch_count=1137253.3333333333, ans=0.125 2023-10-03 04:26:17,795 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6788_210_1388_1_1527898698_4562021_91-197981-0 from training. Duration: 0.83 2023-10-03 04:26:19,251 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8558_210_1410_1_1528505225_7613406_131-329524-0_sp0.9 from training. Duration: 0.87775 2023-10-03 04:26:20,946 WARNING [train.py:1204] (2/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_153-153537-0_sp1.1 from training. Duration: 0.82725 2023-10-03 04:26:22,421 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_17292_210_6753_1_1531125388_2943734_285-349105-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 04:26:23,797 WARNING [train.py:1204] (2/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_37-41919-0 from training. Duration: 0.87 2023-10-03 04:26:23,830 WARNING [train.py:1204] (2/4) Exclude cut with ID _60860_210_8254_1_1534896015875_3557169_172-294852-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 04:26:29,562 WARNING [train.py:1204] (2/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_146-276944-0 from training. Duration: 0.83 2023-10-03 04:26:32,422 WARNING [train.py:1204] (2/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_1254-44082-0_sp1.1 from training. Duration: 0.936375 2023-10-03 04:26:32,439 WARNING [train.py:1204] (2/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_1115-180350-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 04:26:33,698 WARNING [train.py:1204] (2/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_60-146189-0_sp1.1 from training. Duration: 0.8909375 2023-10-03 04:26:37,864 WARNING [train.py:1204] (2/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_517-153649-0_sp1.1 from training. Duration: 0.92725 2023-10-03 04:26:37,879 WARNING [train.py:1204] (2/4) Exclude cut with ID _39571_210_13449_1_1533801584143_3651077_37-266257-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 04:26:39,181 WARNING [train.py:1204] (2/4) Exclude cut with ID _6653_210_2629_1_1527919172441_6732679_527-276118-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 04:26:47,961 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7696_210_2040_1_1528207167_3716099_117-125434-0_sp1.1 from training. Duration: 0.6909375 2023-10-03 04:26:52,235 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_25767_210_2161_1_1532307483_3867748_478-156360-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 04:26:52,239 WARNING [train.py:1204] (2/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_177-49092-0_sp1.1 from training. Duration: 0.82725 2023-10-03 04:26:52,246 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_11305_210_1794_1_1529750428_7178148_680-317073-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 04:26:57,823 WARNING [train.py:1204] (2/4) Exclude cut with ID _53565_210_10635_1_1534337218335_3820259_287-18120-0 from training. Duration: 0.97 2023-10-03 04:27:02,017 WARNING [train.py:1204] (2/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_498-40153-0_sp1.1 from training. Duration: 0.57275 2023-10-03 04:27:02,795 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.2.encoder.layers.2.conv_module1.whiten, num_groups=1, num_channels=384, metric=5.11 vs. limit=15.0 2023-10-03 04:27:03,292 WARNING [train.py:1204] (2/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_555-63066-0_sp1.1 from training. Duration: 0.963625 2023-10-03 04:27:06,063 WARNING [train.py:1204] (2/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_395-310220-0 from training. Duration: 0.89 2023-10-03 04:27:06,137 WARNING [train.py:1204] (2/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_937-208281-0_sp1.1 from training. Duration: 0.77275 2023-10-03 04:27:08,865 WARNING [train.py:1204] (2/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_120-211630-0 from training. Duration: 0.92 2023-10-03 04:27:10,562 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_454-276429-0_sp1.1 from training. Duration: 0.7909375 2023-10-03 04:27:10,607 WARNING [train.py:1204] (2/4) Exclude cut with ID _22826_210_9994_1_1531908201522_3279169_404-75889-0_sp1.1 from training. Duration: 0.8090625 2023-10-03 04:27:14,269 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.607e+02 1.854e+02 2.113e+02 2.384e+02 3.554e+02, threshold=4.226e+02, percent-clipped=0.0 2023-10-03 04:27:17,648 INFO [train.py:1046] (2/4) Epoch 33, batch 650, loss[loss=0.1634, simple_loss=0.2553, pruned_loss=0.0357, over 24298.00 frames. ], tot_loss[loss=0.1625, simple_loss=0.241, pruned_loss=0.04195, over 4539202.84 frames. ], batch size: 74, lr: 3.11e-03, grad_scale: 8.0 2023-10-03 04:27:17,715 WARNING [train.py:1204] (2/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_282-136641-0_sp0.9 from training. Duration: 0.4666875 2023-10-03 04:27:19,336 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.ff2_skip_rate, batch_count=1137586.6666666667, ans=0.0 2023-10-03 04:27:20,395 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_809-17156-0_sp1.1 from training. Duration: 0.663625 2023-10-03 04:27:21,863 WARNING [train.py:1204] (2/4) Exclude cut with ID _9235_210_5219_1_1528891154253_8046120_1003-101638-0_sp0.9 from training. Duration: 0.811125 2023-10-03 04:27:23,268 WARNING [train.py:1204] (2/4) Exclude cut with ID _22826_210_9994_1_1531908201522_3279169_404-75889-0_sp0.9 from training. Duration: 0.988875 2023-10-03 04:27:24,019 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.3.encoder.layers.0.nonlin_attention.whiten2, num_groups=1, num_channels=512, metric=6.09 vs. limit=15.0 2023-10-03 04:27:24,670 WARNING [train.py:1204] (2/4) Exclude cut with ID _59288_210_5854_1_1535077757774_3412246_570-295255-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 04:27:27,416 WARNING [train.py:1204] (2/4) Exclude cut with ID _3465_210_3525_1_1526175667060_3574049_412-287526-0 from training. Duration: 0.72 2023-10-03 04:27:28,747 WARNING [train.py:1204] (2/4) Exclude cut with ID _44010_210_4525_1_1533884262964_4013620_649-79655-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 04:27:30,351 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.bypass_mid.scale_min, batch_count=1137653.3333333333, ans=0.2 2023-10-03 04:27:34,302 WARNING [train.py:1204] (2/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_57-270037-0_sp0.9 from training. Duration: 0.8555625 2023-10-03 04:27:34,302 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_34815_210_6388_1_1533381320_3571903_302-255963-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 04:27:37,144 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_47449_210_5399_1_1534076445_4006929_573-301852-0_sp1.1 from training. Duration: 0.97275 2023-10-03 04:27:40,570 WARNING [train.py:1204] (2/4) Exclude cut with ID 6709-74022-0004-86860-0_sp1.1 from training. Duration: 0.9409375 2023-10-03 04:27:40,797 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.balancer1.prob, batch_count=1137653.3333333333, ans=0.125 2023-10-03 04:27:42,025 WARNING [train.py:1204] (2/4) Exclude cut with ID _40519_210_13794_1_1533715228655_7337390_111-71991-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 04:27:42,076 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_30167_210_3528_1_1532428012_4772375_169-214186-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 04:27:44,082 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.conv_module1.balancer1.prob, batch_count=1137653.3333333333, ans=0.125 2023-10-03 04:27:46,478 WARNING [train.py:1204] (2/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_20-235313-0_sp1.1 from training. Duration: 0.963625 2023-10-03 04:27:46,530 WARNING [train.py:1204] (2/4) Exclude cut with ID _4681_210_3525_1_1527251040321_1235713_123-24428-0_sp0.9 from training. Duration: 0.5333125 2023-10-03 04:27:48,392 WARNING [train.py:1204] (2/4) Exclude cut with ID _31602_210_5854_1_1532677021456_4458250_444-198752-0_sp1.1 from training. Duration: 0.97275 2023-10-03 04:27:48,435 WARNING [train.py:1204] (2/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_9-279442-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 04:27:48,479 WARNING [train.py:1204] (2/4) Exclude cut with ID _6275_210_2084_1_1528110062213_7669049_973-198085-0_sp0.9 from training. Duration: 0.8333125 2023-10-03 04:27:49,805 WARNING [train.py:1204] (2/4) Exclude cut with ID _29853_210_7497_1_1532505600851_6642110_567-250221-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 04:27:52,549 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6545_210_2040_1_1527774670_4245258_407-48070-0_sp1.1 from training. Duration: 0.6909375 2023-10-03 04:27:55,230 WARNING [train.py:1204] (2/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_224-343295-0_sp0.9 from training. Duration: 0.611125 2023-10-03 04:27:55,244 WARNING [train.py:1204] (2/4) Exclude cut with ID IC0125W0011-61817 from training. Duration: 0.88 2023-10-03 04:27:55,253 WARNING [train.py:1204] (2/4) Exclude cut with ID _60113_210_16271_1_1535164212481_4386079_435-154371-0_sp1.1 from training. Duration: 0.97275 2023-10-03 04:27:55,272 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_25836_210_16554_1_1532138167_53197_3-9394-0_sp1.1 from training. Duration: 0.92725 2023-10-03 04:27:59,574 WARNING [train.py:1204] (2/4) Exclude cut with ID _29343_210_4381_1_1532602757752_3674109_57-55125-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 04:27:59,665 WARNING [train.py:1204] (2/4) Exclude cut with ID _56235_210_10640_1_1534557335235_4161099_267-23560-0_sp1.1 from training. Duration: 0.936375 2023-10-03 04:27:59,687 WARNING [train.py:1204] (2/4) Exclude cut with ID _30196_210_1664_1_1533708496522_7074140_1150-278867-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 04:28:01,020 WARNING [train.py:1204] (2/4) Exclude cut with ID _6270_210_2084_1_1527850911573_3856420_26-277343-0_sp0.9 from training. Duration: 0.911125 2023-10-03 04:28:02,260 WARNING [train.py:1204] (2/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_325-62802-0 from training. Duration: 0.87 2023-10-03 04:28:02,339 WARNING [train.py:1204] (2/4) Exclude cut with ID _56524_210_9642_1_1534572011779_3619170_145-310842-0_sp1.1 from training. Duration: 0.9 2023-10-03 04:28:03,749 WARNING [train.py:1204] (2/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_860-96198-0_sp0.9 from training. Duration: 0.811125 2023-10-03 04:28:03,832 WARNING [train.py:1204] (2/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_17-25045-0_sp1.1 from training. Duration: 0.536375 2023-10-03 04:28:03,853 WARNING [train.py:1204] (2/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_349-69727-0_sp1.1 from training. Duration: 0.936375 2023-10-03 04:28:06,520 WARNING [train.py:1204] (2/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_580-93798-0_sp1.1 from training. Duration: 0.5090625 2023-10-03 04:28:06,631 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7762_210_1794_1_1528199508_4057408_419-125009-0 from training. Duration: 0.83 2023-10-03 04:28:08,661 WARNING [train.py:1204] (2/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_356-167816-0 from training. Duration: 0.72 2023-10-03 04:28:08,683 WARNING [train.py:1204] (2/4) Exclude cut with ID _29345_210_4381_1_1532689168263_3860169_225-97100-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 04:28:09,889 WARNING [train.py:1204] (2/4) Exclude cut with ID _14227_210_4381_1_1530701804286_3940259_374-194712-0_sp1.1 from training. Duration: 0.92725 2023-10-03 04:28:09,928 WARNING [train.py:1204] (2/4) Exclude cut with ID _40137_210_12210_1_1533290484046_3795180_249-187059-0_sp0.9 from training. Duration: 0.97775 2023-10-03 04:28:09,947 WARNING [train.py:1204] (2/4) Exclude cut with ID _22826_210_9994_1_1531908201522_3279169_389-317194-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 04:28:11,929 WARNING [train.py:1204] (2/4) Exclude cut with ID _27485_210_4916_1_1532739191677_3668269_239-122918-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 04:28:17,954 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_2245_210_2715_1_1525511548_3540254_128-117609-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 04:28:18,104 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.balancer1.prob, batch_count=1137853.3333333333, ans=0.125 2023-10-03 04:28:18,177 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.conv_module2.balancer2.prob, batch_count=1137853.3333333333, ans=0.125 2023-10-03 04:28:19,261 WARNING [train.py:1204] (2/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_160-276178-0_sp1.1 from training. Duration: 0.963625 2023-10-03 04:28:21,243 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_31920_210_10633_1_1532860009_1686094_1-279735-0_sp1.1 from training. Duration: 0.97275 2023-10-03 04:28:22,741 WARNING [train.py:1204] (2/4) Exclude cut with ID _51078_210_9070_1_1534161557424_3805250_325-262383-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 04:28:22,768 WARNING [train.py:1204] (2/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_643-52007-0_sp0.9 from training. Duration: 0.6333125 2023-10-03 04:28:24,141 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_51937_210_5014_1_1534573610_4022025_518-299248-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 04:28:31,196 INFO [train.py:1046] (2/4) Epoch 33, batch 700, loss[loss=0.1564, simple_loss=0.2389, pruned_loss=0.03693, over 23602.00 frames. ], tot_loss[loss=0.1612, simple_loss=0.239, pruned_loss=0.04168, over 4552783.27 frames. ], batch size: 135, lr: 3.11e-03, grad_scale: 8.0 2023-10-03 04:28:31,251 WARNING [train.py:1204] (2/4) Exclude cut with ID _13252_210_1925_1_1530671372047_3336909_166-314607-0_sp1.1 from training. Duration: 0.4909375 2023-10-03 04:28:31,255 WARNING [train.py:1204] (2/4) Exclude cut with ID _44778_210_2054_1_1533798127203_7443090_1087-282939-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 04:28:31,295 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_23850_210_4238_1_1532232597_1632433_32-185135-0_sp1.1 from training. Duration: 0.7818125 2023-10-03 04:28:31,322 WARNING [train.py:1204] (2/4) Exclude cut with ID _59500_210_8254_1_1534809610680_3736009_145-89359-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 04:28:32,918 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.feed_forward3.hidden_balancer.prob, batch_count=1137920.0, ans=0.125 2023-10-03 04:28:35,609 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=1137920.0, ans=0.1 2023-10-03 04:28:35,610 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.balancer2.prob, batch_count=1137920.0, ans=0.125 2023-10-03 04:28:36,707 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_340-336408-0 from training. Duration: 0.93 2023-10-03 04:28:36,779 WARNING [train.py:1204] (2/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_9-21737-0 from training. Duration: 0.97 2023-10-03 04:28:40,115 WARNING [train.py:1204] (2/4) Exclude cut with ID _2220_210_1819_1_1525509109898_3658680_50-99837-0 from training. Duration: 0.54 2023-10-03 04:28:41,427 WARNING [train.py:1204] (2/4) Exclude cut with ID _30304_210_6319_1_1532476827261_7582060_879-299350-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 04:28:44,653 WARNING [train.py:1204] (2/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_905-160074-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 04:28:44,765 WARNING [train.py:1204] (2/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_395-73570-0 from training. Duration: 0.99 2023-10-03 04:28:49,669 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7264_210_4938_1_1527939911_8132095_800-91995-0_sp1.1 from training. Duration: 0.7818125 2023-10-03 04:28:50,993 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder_embed.dropout.p, batch_count=1137986.6666666667, ans=0.1 2023-10-03 04:28:52,280 WARNING [train.py:1204] (2/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_77-311404-0_sp1.1 from training. Duration: 0.8545625 2023-10-03 04:28:54,313 WARNING [train.py:1204] (2/4) Exclude cut with ID _13679_210_7497_1_1530534293662_3355098_107-80772-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 04:28:55,601 WARNING [train.py:1204] (2/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_224-343295-0_sp1.1 from training. Duration: 0.5 2023-10-03 04:28:55,638 WARNING [train.py:1204] (2/4) Exclude cut with ID _54871_210_22356_1_1534665798253_3856359_156-253482-0_sp1.1 from training. Duration: 0.963625 2023-10-03 04:28:58,392 WARNING [train.py:1204] (2/4) Exclude cut with ID _49728_210_12651_1_1534041034294_6788150_309-125532-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 04:28:59,932 WARNING [train.py:1204] (2/4) Exclude cut with ID _13913_210_9994_1_1530579615187_3383897_473-277682-0_sp0.9 from training. Duration: 0.4333125 2023-10-03 04:28:59,936 WARNING [train.py:1204] (2/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_3-276701-0_sp1.1 from training. Duration: 0.82725 2023-10-03 04:29:02,526 WARNING [train.py:1204] (2/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_282-136641-0 from training. Duration: 0.42 2023-10-03 04:29:04,131 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.conv_module1.balancer2.prob, batch_count=1138053.3333333333, ans=0.125 2023-10-03 04:29:05,362 WARNING [train.py:1204] (2/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_127-566-0 from training. Duration: 0.84 2023-10-03 04:29:08,205 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6163_210_6400_1_1527506746_4263068_283-137811-0_sp1.1 from training. Duration: 0.7 2023-10-03 04:29:08,232 WARNING [train.py:1204] (2/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_34-183685-0_sp1.1 from training. Duration: 0.8909375 2023-10-03 04:29:10,192 WARNING [train.py:1204] (2/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_154-135696-0_sp0.9 from training. Duration: 0.888875 2023-10-03 04:29:14,814 WARNING [train.py:1204] (2/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_676-51014-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 04:29:15,701 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.0.layers.0.feed_forward2.out_whiten, num_groups=1, num_channels=192, metric=11.43 vs. limit=15.0 2023-10-03 04:29:16,186 WARNING [train.py:1204] (2/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_38-169920-0 from training. Duration: 0.91 2023-10-03 04:29:19,415 WARNING [train.py:1204] (2/4) Exclude cut with ID _6653_210_2629_1_1527919172441_6732679_747-114465-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 04:29:19,466 WARNING [train.py:1204] (2/4) Exclude cut with ID _8021_210_7994_1_1528376451358_3653069_100-140231-0_sp1.1 from training. Duration: 0.6090625 2023-10-03 04:29:19,483 WARNING [train.py:1204] (2/4) Exclude cut with ID _64135_210_2253_1_1535677267876_7608972_133-188266-0 from training. Duration: 0.81 2023-10-03 04:29:23,670 WARNING [train.py:1204] (2/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_714-310538-0_sp1.1 from training. Duration: 0.7818125 2023-10-03 04:29:25,519 WARNING [train.py:1204] (2/4) Exclude cut with ID _39867_210_9826_1_1533553222030_9344259_1134-288498-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 04:29:28,151 WARNING [train.py:1204] (2/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_211-237289-0_sp1.1 from training. Duration: 0.936375 2023-10-03 04:29:30,544 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.1.encoder.layers.0.nonlin_attention.whiten2, num_groups=1, num_channels=256, metric=5.66 vs. limit=15.0 2023-10-03 04:29:33,757 WARNING [train.py:1204] (2/4) Exclude cut with ID _59202_210_26984_1_1534816810612_3549174_137-318710-0_sp0.9 from training. Duration: 0.988875 2023-10-03 04:29:33,780 WARNING [train.py:1204] (2/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_86-66192-0 from training. Duration: 0.55 2023-10-03 04:29:37,909 WARNING [train.py:1204] (2/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_540-275685-0 from training. Duration: 0.89 2023-10-03 04:29:37,945 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8776_210_2040_1_1528638837_4088036_244-278763-0 from training. Duration: 0.9 2023-10-03 04:29:41,204 WARNING [train.py:1204] (2/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_9-219972-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 04:29:41,342 WARNING [train.py:1204] (2/4) Exclude cut with ID _44659_210_2721_1_1534036120061_6614860_656-283823-0_sp1.1 from training. Duration: 0.92725 2023-10-03 04:29:43,030 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.496e+02 1.966e+02 2.295e+02 2.647e+02 3.706e+02, threshold=4.591e+02, percent-clipped=0.0 2023-10-03 04:29:43,151 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_69630_210_7234_1_1535789601_661049_77-216385-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 04:29:44,586 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_19952_210_2161_1_1531875469_3851750_210-308919-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 04:29:45,849 INFO [train.py:1046] (2/4) Epoch 33, batch 750, loss[loss=0.1743, simple_loss=0.2586, pruned_loss=0.045, over 24003.00 frames. ], tot_loss[loss=0.1607, simple_loss=0.2387, pruned_loss=0.04136, over 4580606.51 frames. ], batch size: 80, lr: 3.11e-03, grad_scale: 8.0 2023-10-03 04:29:45,902 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_4737_210_2040_1_1526998689_3293008_378-178650-0 from training. Duration: 0.95 2023-10-03 04:29:48,758 WARNING [train.py:1204] (2/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_224-343295-0 from training. Duration: 0.55 2023-10-03 04:29:48,783 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7002_210_1794_1_1527939683_5157286_184-229630-0 from training. Duration: 0.74 2023-10-03 04:29:48,814 WARNING [train.py:1204] (2/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_6-214815-0 from training. Duration: 0.85 2023-10-03 04:29:50,696 WARNING [train.py:1204] (2/4) Exclude cut with ID _8699_210_2069_1_1528630346831_3960409_160-24408-0 from training. Duration: 0.91 2023-10-03 04:29:50,737 WARNING [train.py:1204] (2/4) Exclude cut with ID _5585_210_2667_1_1527418813324_8007340_89-332072-0 from training. Duration: 0.64 2023-10-03 04:29:52,078 WARNING [train.py:1204] (2/4) Exclude cut with ID _5250_210_5219_1_1527249561806_8692110_528-259800-0_sp1.1 from training. Duration: 0.963625 2023-10-03 04:29:53,501 WARNING [train.py:1204] (2/4) Exclude cut with ID _55383_210_10635_1_1534468426172_2231669_134-128246-0 from training. Duration: 0.89 2023-10-03 04:29:53,573 WARNING [train.py:1204] (2/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_400-338274-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 04:29:55,439 WARNING [train.py:1204] (2/4) Exclude cut with ID _14224_210_4381_1_1530529062037_4264010_364-22817-0_sp0.9 from training. Duration: 0.788875 2023-10-03 04:29:56,827 WARNING [train.py:1204] (2/4) Exclude cut with ID _42863_210_9070_1_1533902398788_3696920_331-291057-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 04:29:58,251 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_5436_210_1794_1_1527331317_2541494_229-211016-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 04:29:58,281 WARNING [train.py:1204] (2/4) Exclude cut with ID _6456_210_1534_1_1527681634212_3181049_345-252877-0_sp0.9 from training. Duration: 0.711125 2023-10-03 04:29:59,615 WARNING [train.py:1204] (2/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_96-81499-0_sp1.1 from training. Duration: 0.92725 2023-10-03 04:30:02,327 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_5575_210_1410_1_1527301305_2570170_342-265572-0_sp1.1 from training. Duration: 0.8545625 2023-10-03 04:30:02,415 WARNING [train.py:1204] (2/4) Exclude cut with ID _5444_210_2151_1_1527418808464_5312210_726-27165-0_sp0.9 from training. Duration: 0.7444375 2023-10-03 04:30:03,974 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.bypass.skip_rate, batch_count=1138320.0, ans=0.07 2023-10-03 04:30:05,102 WARNING [train.py:1204] (2/4) Exclude cut with ID _22083_210_8254_1_1531702862439_3678970_304-327750-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 04:30:06,553 WARNING [train.py:1204] (2/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_245-226199-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 04:30:08,543 WARNING [train.py:1204] (2/4) Exclude cut with ID _42875_210_14856_1_1533704397487_4012433_102-199499-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 04:30:08,608 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_51225_210_3601_1_1534400634_2327383_55-301844-0 from training. Duration: 0.93 2023-10-03 04:30:08,701 WARNING [train.py:1204] (2/4) Exclude cut with ID _28317_210_12742_1_1532480302381_8510325_916-195848-0_sp1.1 from training. Duration: 0.836375 2023-10-03 04:30:09,196 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.4.encoder.layers.2.self_attn1.whiten, num_groups=1, num_channels=384, metric=15.76 vs. limit=22.5 2023-10-03 04:30:11,214 WARNING [train.py:1204] (2/4) Exclude cut with ID _44718_210_5816_1_1534050051869_6469030_497-311999-0_sp1.1 from training. Duration: 0.936375 2023-10-03 04:30:12,616 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_34886_210_10633_1_1533203882_1755661_91-194765-0_sp1.1 from training. Duration: 0.936375 2023-10-03 04:30:12,701 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder_embed.conv.2.prob, batch_count=1138320.0, ans=0.125 2023-10-03 04:30:13,871 WARNING [train.py:1204] (2/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_310-26186-0_sp0.9 from training. Duration: 0.77775 2023-10-03 04:30:15,674 WARNING [train.py:1204] (2/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_416-257730-0 from training. Duration: 0.65 2023-10-03 04:30:15,680 WARNING [train.py:1204] (2/4) Exclude cut with ID _9389_210_1664_1_1529217337301_5289269_209-154760-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 04:30:18,423 WARNING [train.py:1204] (2/4) Exclude cut with ID _4092_210_2151_1_1526643044449_3135759_227-213972-0 from training. Duration: 0.54 2023-10-03 04:30:18,452 WARNING [train.py:1204] (2/4) Exclude cut with ID IC0679W0044-335852 from training. Duration: 0.768 2023-10-03 04:30:19,727 WARNING [train.py:1204] (2/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_42-186162-0 from training. Duration: 0.92 2023-10-03 04:30:19,741 WARNING [train.py:1204] (2/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_302-2550-0_sp0.9 from training. Duration: 0.9 2023-10-03 04:30:19,761 WARNING [train.py:1204] (2/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_356-31216-0_sp0.9 from training. Duration: 0.8333125 2023-10-03 04:30:19,986 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=1138386.6666666667, ans=0.1 2023-10-03 04:30:21,784 WARNING [train.py:1204] (2/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_125-322172-0_sp1.1 from training. Duration: 0.8454375 2023-10-03 04:30:28,227 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.0.layers.0.feed_forward2.out_whiten, num_groups=1, num_channels=192, metric=6.92 vs. limit=15.0 2023-10-03 04:30:29,155 WARNING [train.py:1204] (2/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_141-89727-0_sp0.9 from training. Duration: 0.788875 2023-10-03 04:30:29,171 WARNING [train.py:1204] (2/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_647-337784-0_sp1.1 from training. Duration: 0.97275 2023-10-03 04:30:29,188 WARNING [train.py:1204] (2/4) Exclude cut with ID _6493_210_1747_1_1528459210424_4012470_513-140967-0_sp1.1 from training. Duration: 0.7181875 2023-10-03 04:30:31,927 WARNING [train.py:1204] (2/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_87-266334-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 04:30:33,306 WARNING [train.py:1204] (2/4) Exclude cut with ID _26528_210_9070_1_1532570410842_3851310_526-261338-0_sp1.1 from training. Duration: 0.92725 2023-10-03 04:30:33,356 WARNING [train.py:1204] (2/4) Exclude cut with ID _14224_210_4381_1_1530529062037_4264010_380-343670-0 from training. Duration: 0.88 2023-10-03 04:30:34,666 WARNING [train.py:1204] (2/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_449-15508-0_sp1.1 from training. Duration: 0.7454375 2023-10-03 04:30:34,784 WARNING [train.py:1204] (2/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_372-6869-0_sp0.9 from training. Duration: 0.488875 2023-10-03 04:30:34,832 WARNING [train.py:1204] (2/4) Exclude cut with ID _26907_210_7234_1_1532221740694_3205263_337-2494-0_sp0.9 from training. Duration: 0.97775 2023-10-03 04:30:38,063 WARNING [train.py:1204] (2/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_10-100367-0_sp1.1 from training. Duration: 0.8909375 2023-10-03 04:30:39,362 WARNING [train.py:1204] (2/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_47-167373-0 from training. Duration: 0.92 2023-10-03 04:30:40,748 WARNING [train.py:1204] (2/4) Exclude cut with ID _28323_210_16187_1_1532480395667_4100300_628-282179-0_sp1.1 from training. Duration: 0.97275 2023-10-03 04:30:43,606 WARNING [train.py:1204] (2/4) Exclude cut with ID _20455_210_5219_1_1531565996775_8098100_213-280898-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 04:30:46,789 WARNING [train.py:1204] (2/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_383-316000-0_sp1.1 from training. Duration: 0.8181875 2023-10-03 04:30:46,817 WARNING [train.py:1204] (2/4) Exclude cut with ID _18513_210_9206_1_1531618215223_3862620_16-78180-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 04:30:49,525 WARNING [train.py:1204] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_330-327861-0_sp1.1 from training. Duration: 0.6090625 2023-10-03 04:30:52,815 WARNING [train.py:1204] (2/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_356-31216-0 from training. Duration: 0.75 2023-10-03 04:30:52,834 WARNING [train.py:1204] (2/4) Exclude cut with ID _25248_210_8750_1_1532062825941_5554180_255-148539-0_sp1.1 from training. Duration: 0.963625 2023-10-03 04:30:54,169 WARNING [train.py:1204] (2/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_269-154726-0_sp1.1 from training. Duration: 0.7818125 2023-10-03 04:30:56,879 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7002_210_1794_1_1527939683_5157286_163-87552-0_sp1.1 from training. Duration: 0.7818125 2023-10-03 04:30:56,911 WARNING [train.py:1204] (2/4) Exclude cut with ID _54871_210_22356_1_1534665798253_3856359_97-151896-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 04:31:00,091 INFO [train.py:1046] (2/4) Epoch 33, batch 800, loss[loss=0.1554, simple_loss=0.2487, pruned_loss=0.03103, over 24623.00 frames. ], tot_loss[loss=0.1616, simple_loss=0.2396, pruned_loss=0.04185, over 4603297.82 frames. ], batch size: 73, lr: 3.11e-03, grad_scale: 16.0 2023-10-03 04:31:00,158 WARNING [train.py:1204] (2/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_207-7026-0_sp1.1 from training. Duration: 0.97275 2023-10-03 04:31:00,179 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_92-46009-0_sp0.9 from training. Duration: 0.6 2023-10-03 04:31:05,931 WARNING [train.py:1204] (2/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_122-178847-0_sp1.1 from training. Duration: 0.97275 2023-10-03 04:31:05,935 WARNING [train.py:1204] (2/4) Exclude cut with ID _44778_210_2054_1_1533798127203_7443090_94-348956-0_sp1.1 from training. Duration: 0.92725 2023-10-03 04:31:07,968 WARNING [train.py:1204] (2/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_1018-244089-0_sp1.1 from training. Duration: 0.963625 2023-10-03 04:31:07,986 WARNING [train.py:1204] (2/4) Exclude cut with ID _56235_210_10640_1_1534557335235_4161099_87-118423-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 04:31:10,390 WARNING [train.py:1204] (2/4) Exclude cut with ID _53713_210_24116_1_1534314487762_7416751_504-212292-0_sp1.1 from training. Duration: 0.92725 2023-10-03 04:31:10,418 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_446-150327-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 04:31:11,878 WARNING [train.py:1204] (2/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_682-245989-0_sp1.1 from training. Duration: 0.92725 2023-10-03 04:31:14,618 WARNING [train.py:1204] (2/4) Exclude cut with ID _35723_210_1794_1_1533088341043_628250_8-234524-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 04:31:16,477 WARNING [train.py:1204] (2/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_64-248359-0_sp0.9 from training. Duration: 0.7666875 2023-10-03 04:31:16,729 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.conv_module1.balancer2.prob, batch_count=1138653.3333333333, ans=0.125 2023-10-03 04:31:19,420 WARNING [train.py:1204] (2/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_49-97158-0 from training. Duration: 0.76 2023-10-03 04:31:19,479 WARNING [train.py:1204] (2/4) Exclude cut with ID _24314_210_4915_1_1532135078116_7170179_743-915-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 04:31:20,778 WARNING [train.py:1204] (2/4) Exclude cut with ID _61194_210_18628_1_1535007555628_3750909_353-323468-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 04:31:20,801 WARNING [train.py:1204] (2/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_387-131538-0_sp1.1 from training. Duration: 0.736375 2023-10-03 04:31:20,829 WARNING [train.py:1204] (2/4) Exclude cut with ID _44424_210_18163_1_1533623394848_1213110_79-187603-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 04:31:22,211 WARNING [train.py:1204] (2/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_86-19601-0 from training. Duration: 0.85 2023-10-03 04:31:22,238 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_50050_210_16223_1_1535435796_7290138_103-283529-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 04:31:22,267 WARNING [train.py:1204] (2/4) Exclude cut with ID _13913_210_9994_1_1530579615187_3383897_473-277682-0 from training. Duration: 0.39 2023-10-03 04:31:25,610 WARNING [train.py:1204] (2/4) Exclude cut with ID _42326_210_7770_1_1533814282531_3707269_368-68029-0_sp1.1 from training. Duration: 0.92725 2023-10-03 04:31:26,990 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_41428_210_2161_1_1533774184_3992532_446-339645-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 04:31:29,013 WARNING [train.py:1204] (2/4) Exclude cut with ID _69584_210_5219_1_1535785133775_4044450_325-7656-0_sp1.1 from training. Duration: 0.7818125 2023-10-03 04:31:29,021 WARNING [train.py:1204] (2/4) Exclude cut with ID _21632_210_2253_1_1531994001765_3744140_134-184387-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 04:31:31,537 WARNING [train.py:1204] (2/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_550-184707-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 04:31:32,808 WARNING [train.py:1204] (2/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_106-341780-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 04:31:36,233 WARNING [train.py:1204] (2/4) Exclude cut with ID _45871_210_5665_1_1533864614245_3296060_253-132552-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 04:31:37,537 WARNING [train.py:1204] (2/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_395-310220-0_sp1.1 from training. Duration: 0.8090625 2023-10-03 04:31:37,560 WARNING [train.py:1204] (2/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_795-33252-0_sp0.9 from training. Duration: 0.411125 2023-10-03 04:31:39,014 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9002W0003-495962 from training. Duration: 0.946 2023-10-03 04:31:39,037 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_43-71989-0 from training. Duration: 0.57 2023-10-03 04:31:39,056 WARNING [train.py:1204] (2/4) Exclude cut with ID _8699_210_2069_1_1528630346831_3960409_155-39766-0_sp1.1 from training. Duration: 0.7090625 2023-10-03 04:31:40,312 WARNING [train.py:1204] (2/4) Exclude cut with ID _27894_210_12392_1_1532415496078_6902439_788-232684-0_sp1.1 from training. Duration: 0.97275 2023-10-03 04:31:40,526 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.self_attn_weights.pos_emb_skip_rate, batch_count=1138720.0, ans=0.0 2023-10-03 04:31:41,721 WARNING [train.py:1204] (2/4) Exclude cut with ID _56494_210_4220_1_1535284837625_3033169_10-39296-0_sp1.1 from training. Duration: 0.92725 2023-10-03 04:31:41,741 WARNING [train.py:1204] (2/4) Exclude cut with ID _40901_210_5665_1_1533363789958_3856130_341-20819-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 04:31:44,710 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.feed_forward1.out_proj.dropout_p, batch_count=1138786.6666666667, ans=0.1 2023-10-03 04:31:46,563 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9029W0435-509822 from training. Duration: 0.896 2023-10-03 04:31:47,912 WARNING [train.py:1204] (2/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_51-117576-0 from training. Duration: 0.4 2023-10-03 04:31:48,146 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.balancer2.prob, batch_count=1138786.6666666667, ans=0.125 2023-10-03 04:31:49,277 WARNING [train.py:1204] (2/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_197-28164-0_sp1.1 from training. Duration: 0.72725 2023-10-03 04:31:50,695 WARNING [train.py:1204] (2/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_145-137790-0_sp1.1 from training. Duration: 0.7545625 2023-10-03 04:31:54,043 WARNING [train.py:1204] (2/4) Exclude cut with ID _47790_210_16187_1_1533862814066_4207519_2-39653-0_sp1.1 from training. Duration: 0.8909375 2023-10-03 04:31:58,169 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_3780_210_2087_1_1526467760_2130483_186-208240-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 04:31:59,597 WARNING [train.py:1204] (2/4) Exclude cut with ID _24055_210_12651_1_1532310907143_976979_92-59356-0 from training. Duration: 0.91 2023-10-03 04:32:00,983 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_3910_210_1794_1_1526742306_334530_28-238909-0_sp1.1 from training. Duration: 0.736375 2023-10-03 04:32:04,204 WARNING [train.py:1204] (2/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_269-154726-0 from training. Duration: 0.86 2023-10-03 04:32:10,180 WARNING [train.py:1204] (2/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_326-165900-0_sp0.9 from training. Duration: 0.7666875 2023-10-03 04:32:11,441 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.520e+02 1.861e+02 2.064e+02 2.320e+02 3.416e+02, threshold=4.128e+02, percent-clipped=0.0 2023-10-03 04:32:11,762 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.nonlin_attention.balancer.prob, batch_count=1138853.3333333333, ans=0.125 2023-10-03 04:32:12,911 WARNING [train.py:1204] (2/4) Exclude cut with ID _5294_210_1747_1_1527249643928_3696179_470-276077-0_sp1.1 from training. Duration: 0.963625 2023-10-03 04:32:12,957 WARNING [train.py:1204] (2/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_372-131163-0 from training. Duration: 0.94 2023-10-03 04:32:14,284 INFO [train.py:1046] (2/4) Epoch 33, batch 850, loss[loss=0.1763, simple_loss=0.2641, pruned_loss=0.04423, over 24372.00 frames. ], tot_loss[loss=0.1619, simple_loss=0.2404, pruned_loss=0.04169, over 4645495.74 frames. ], batch size: 77, lr: 3.11e-03, grad_scale: 16.0 2023-10-03 04:32:14,342 WARNING [train.py:1204] (2/4) Exclude cut with ID _45297_210_2721_1_1533715351381_3264949_351-147136-0_sp1.1 from training. Duration: 0.7909375 2023-10-03 04:32:14,412 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_1072-12671-0_sp1.1 from training. Duration: 0.92725 2023-10-03 04:32:15,633 WARNING [train.py:1204] (2/4) Exclude cut with ID _15133_210_6228_1_1530794512074_3080379_54-198193-0 from training. Duration: 0.94 2023-10-03 04:32:15,658 WARNING [train.py:1204] (2/4) Exclude cut with ID _30196_210_1664_1_1533708496522_7074140_1194-270988-0_sp1.1 from training. Duration: 0.97275 2023-10-03 04:32:17,593 WARNING [train.py:1204] (2/4) Exclude cut with ID _50310_210_15488_1_1534068043345_3641080_289-193637-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 04:32:17,694 WARNING [train.py:1204] (2/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_313-263495-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 04:32:19,062 WARNING [train.py:1204] (2/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_18-130336-0_sp1.1 from training. Duration: 0.7181875 2023-10-03 04:32:20,413 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_47896_210_20508_1_1534492518_5958868_646-287209-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 04:32:21,803 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_294-163793-0 from training. Duration: 0.82 2023-10-03 04:32:21,840 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_8-319957-0 from training. Duration: 0.96 2023-10-03 04:32:21,842 WARNING [train.py:1204] (2/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_1211-226066-0 from training. Duration: 0.76 2023-10-03 04:32:23,709 WARNING [train.py:1204] (2/4) Exclude cut with ID _4779_210_2084_1_1527318022944_3914893_500-285013-0_sp0.9 from training. Duration: 0.7666875 2023-10-03 04:32:23,747 WARNING [train.py:1204] (2/4) Exclude cut with ID _22826_210_9994_1_1531908201522_3279169_347-232268-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 04:32:25,191 WARNING [train.py:1204] (2/4) Exclude cut with ID _29877_210_6228_1_1532397791106_5276149_111-55056-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 04:32:26,530 WARNING [train.py:1204] (2/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_812-271646-0_sp1.1 from training. Duration: 0.92725 2023-10-03 04:32:26,555 WARNING [train.py:1204] (2/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_235-262124-0_sp1.1 from training. Duration: 0.6090625 2023-10-03 04:32:30,718 WARNING [train.py:1204] (2/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_14-16530-0_sp1.1 from training. Duration: 0.97275 2023-10-03 04:32:30,753 WARNING [train.py:1204] (2/4) Exclude cut with ID _44718_210_5816_1_1534050051869_6469030_568-304512-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 04:32:30,780 WARNING [train.py:1204] (2/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_1033-274376-0 from training. Duration: 0.87 2023-10-03 04:32:34,188 WARNING [train.py:1204] (2/4) Exclude cut with ID _4681_210_3525_1_1527251040321_1235713_123-24428-0 from training. Duration: 0.48 2023-10-03 04:32:37,365 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_37818_210_4238_1_1533620910_4720083_251-288764-0_sp1.1 from training. Duration: 0.97275 2023-10-03 04:32:38,684 WARNING [train.py:1204] (2/4) Exclude cut with ID _65585_210_12210_1_1535450451584_3764098_112-294301-0 from training. Duration: 0.88 2023-10-03 04:32:41,435 WARNING [train.py:1204] (2/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_329-113173-0 from training. Duration: 0.62 2023-10-03 04:32:41,531 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6767_210_3273_1_1527855783_1718932_173-256601-0 from training. Duration: 0.8 2023-10-03 04:32:42,164 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.3.encoder.layers.3.feed_forward3.out_whiten, num_groups=1, num_channels=512, metric=13.05 vs. limit=15.0 2023-10-03 04:32:44,346 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9266W0001-627677 from training. Duration: 0.896 2023-10-03 04:32:44,356 WARNING [train.py:1204] (2/4) Exclude cut with ID _49643_210_7989_1_1534640244224_3939031_611-185515-0_sp1.1 from training. Duration: 0.8545625 2023-10-03 04:32:46,252 WARNING [train.py:1204] (2/4) Exclude cut with ID _31649_210_6228_1_1532584608063_4091078_687-237771-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 04:32:46,263 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_12868_210_6753_1_1530702479_3610564_148-172861-0_sp0.9 from training. Duration: 0.5555625 2023-10-03 04:32:49,007 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_5129_210_6400_1_1527161005_3579054_107-69738-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 04:32:50,395 WARNING [train.py:1204] (2/4) Exclude cut with ID _40137_210_12210_1_1533290484046_3795180_36-301164-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 04:32:50,418 WARNING [train.py:1204] (2/4) Exclude cut with ID _7537_210_2084_1_1528502395431_3924359_113-199933-0 from training. Duration: 0.59 2023-10-03 04:32:53,760 WARNING [train.py:1204] (2/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_547-5424-0_sp1.1 from training. Duration: 0.8545625 2023-10-03 04:32:55,108 WARNING [train.py:1204] (2/4) Exclude cut with ID _43552_210_8913_1_1533726031059_3523429_147-284024-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 04:32:55,183 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_294-163793-0_sp1.1 from training. Duration: 0.7454375 2023-10-03 04:32:55,211 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_3780_210_2087_1_1526467760_2130483_170-259044-0_sp1.1 from training. Duration: 0.636375 2023-10-03 04:32:56,616 WARNING [train.py:1204] (2/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_726-270780-0_sp1.1 from training. Duration: 0.7818125 2023-10-03 04:32:57,956 WARNING [train.py:1204] (2/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_214-291369-0_sp1.1 from training. Duration: 0.57275 2023-10-03 04:32:58,001 WARNING [train.py:1204] (2/4) Exclude cut with ID _21881_210_3525_1_1531825242757_3798380_247-102591-0 from training. Duration: 0.98 2023-10-03 04:33:02,148 WARNING [train.py:1204] (2/4) Exclude cut with ID _8064_210_5399_1_1528372299291_4282973_18-174647-0_sp1.1 from training. Duration: 0.82725 2023-10-03 04:33:02,159 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_415-52611-0_sp1.1 from training. Duration: 0.936375 2023-10-03 04:33:02,219 WARNING [train.py:1204] (2/4) Exclude cut with ID _8394_210_6702_1_1528590504316_3540189_379-49457-0_sp0.9 from training. Duration: 0.9666875 2023-10-03 04:33:04,026 WARNING [train.py:1204] (2/4) Exclude cut with ID _64521_210_14699_1_1535243427259_3815328_368-195106-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 04:33:04,111 WARNING [train.py:1204] (2/4) Exclude cut with ID _28394_210_8913_1_1532430049518_3650118_51-334124-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 04:33:05,621 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.1.conv_module1.balancer1.prob, batch_count=1139120.0, ans=0.125 2023-10-03 04:33:06,763 WARNING [train.py:1204] (2/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_317-161769-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 04:33:07,526 WARNING [train.py:1204] (2/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_40-129756-0_sp0.9 from training. Duration: 0.9 2023-10-03 04:33:08,892 WARNING [train.py:1204] (2/4) Exclude cut with ID _3067_210_2667_1_1526212974640_4503168_119-175776-0_sp0.9 from training. Duration: 0.82225 2023-10-03 04:33:08,948 WARNING [train.py:1204] (2/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_1004-304915-0_sp1.1 from training. Duration: 0.92725 2023-10-03 04:33:10,223 WARNING [train.py:1204] (2/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_359-118926-0_sp0.9 from training. Duration: 0.8 2023-10-03 04:33:18,914 WARNING [train.py:1204] (2/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_51-200220-0_sp0.9 from training. Duration: 0.62225 2023-10-03 04:33:21,571 WARNING [train.py:1204] (2/4) Exclude cut with ID _46683_210_9642_1_1533730913492_4591116_530-326202-0_sp1.1 from training. Duration: 0.936375 2023-10-03 04:33:21,623 WARNING [train.py:1204] (2/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_567-135114-0 from training. Duration: 0.87 2023-10-03 04:33:21,662 WARNING [train.py:1204] (2/4) Exclude cut with ID _66863_210_18053_1_1535698758334_8590319_1092-271784-0_sp1.1 from training. Duration: 0.87275 2023-10-03 04:33:21,672 WARNING [train.py:1204] (2/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_762-282614-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 04:33:24,848 WARNING [train.py:1204] (2/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_280-120942-0 from training. Duration: 0.77 2023-10-03 04:33:28,981 INFO [train.py:1046] (2/4) Epoch 33, batch 900, loss[loss=0.1801, simple_loss=0.2509, pruned_loss=0.05468, over 23880.00 frames. ], tot_loss[loss=0.1626, simple_loss=0.2418, pruned_loss=0.04174, over 4663555.84 frames. ], batch size: 195, lr: 3.11e-03, grad_scale: 8.0 2023-10-03 04:33:29,121 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_31583_210_3852_1_1532595851_4024413_27-170931-0_sp1.1 from training. Duration: 0.963625 2023-10-03 04:33:30,724 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.feed_forward1.out_proj.dropout_p, batch_count=1139253.3333333333, ans=0.1 2023-10-03 04:33:33,096 WARNING [train.py:1204] (2/4) Exclude cut with ID _12369_210_4381_1_1530356078581_4388520_266-271628-0_sp1.1 from training. Duration: 0.92725 2023-10-03 04:33:33,131 WARNING [train.py:1204] (2/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_41-31157-0 from training. Duration: 0.57 2023-10-03 04:33:36,466 WARNING [train.py:1204] (2/4) Exclude cut with ID _21878_210_3525_1_1531738772002_7264330_113-18163-0_sp1.1 from training. Duration: 0.8090625 2023-10-03 04:33:36,509 WARNING [train.py:1204] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_147-111137-0 from training. Duration: 0.95 2023-10-03 04:33:37,871 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1083-220122-0_sp1.1 from training. Duration: 0.463625 2023-10-03 04:33:37,970 WARNING [train.py:1204] (2/4) Exclude cut with ID _14884_210_2252_1_1530707459689_5561250_591-173406-0_sp1.1 from training. Duration: 0.87275 2023-10-03 04:33:37,976 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_24677_210_1794_1_1532002693_4341296_149-303793-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 04:33:39,421 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_12868_210_6753_1_1530702479_3610564_133-564-0_sp1.1 from training. Duration: 0.7181875 2023-10-03 04:33:39,448 WARNING [train.py:1204] (2/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_625-338546-0_sp0.9 from training. Duration: 0.97775 2023-10-03 04:33:40,335 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.conv_module2.balancer2.prob, batch_count=1139253.3333333333, ans=0.125 2023-10-03 04:33:47,556 WARNING [train.py:1204] (2/4) Exclude cut with ID _25882_210_3036_1_1532775665815_6479510_624-320169-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 04:33:47,562 WARNING [train.py:1204] (2/4) Exclude cut with ID _20622_210_3852_1_1531622427080_1093839_127-325326-0_sp1.1 from training. Duration: 0.92725 2023-10-03 04:33:48,865 WARNING [train.py:1204] (2/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_464-33931-0_sp1.1 from training. Duration: 0.4909375 2023-10-03 04:33:52,197 WARNING [train.py:1204] (2/4) Exclude cut with ID _66112_210_10635_1_1535590236305_3801484_120-304588-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 04:33:53,765 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.bypass.skip_rate, batch_count=1139320.0, ans=0.04949747468305833 2023-10-03 04:33:56,235 WARNING [train.py:1204] (2/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_110-130961-0 from training. Duration: 0.47 2023-10-03 04:33:58,913 WARNING [train.py:1204] (2/4) Exclude cut with ID _16908_210_5724_1_1531119157248_3772700_64-54020-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 04:33:59,409 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.4.encoder.layers.2.nonlin_attention.whiten2, num_groups=1, num_channels=384, metric=4.12 vs. limit=15.0 2023-10-03 04:34:01,912 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.feed_forward2.hidden_balancer.prob, batch_count=1139386.6666666667, ans=0.125 2023-10-03 04:34:03,253 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.balancer2.prob, batch_count=1139386.6666666667, ans=0.125 2023-10-03 04:34:06,331 WARNING [train.py:1204] (2/4) Exclude cut with ID _4068_210_2629_1_1526697350395_1546259_224-288693-0_sp1.1 from training. Duration: 0.536375 2023-10-03 04:34:06,392 WARNING [train.py:1204] (2/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_191-41023-0_sp0.9 from training. Duration: 0.788875 2023-10-03 04:34:07,740 WARNING [train.py:1204] (2/4) Exclude cut with ID ID0205W0255-783002 from training. Duration: 0.958 2023-10-03 04:34:07,819 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_162-262717-0 from training. Duration: 0.83 2023-10-03 04:34:12,739 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_224-322394-0_sp1.1 from training. Duration: 0.6 2023-10-03 04:34:12,745 WARNING [train.py:1204] (2/4) Exclude cut with ID _4866_210_2577_1_1527404412284_4364659_133-1696-0_sp1.1 from training. Duration: 0.9 2023-10-03 04:34:14,160 WARNING [train.py:1204] (2/4) Exclude cut with ID _6275_210_2084_1_1528110062213_7669049_973-198085-0_sp1.1 from training. Duration: 0.6818125 2023-10-03 04:34:19,031 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.bypass.scale_min, batch_count=1139453.3333333333, ans=0.2 2023-10-03 04:34:20,157 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_47449_210_5399_1_1534076445_4006929_410-237806-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 04:34:20,167 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7543_210_6753_1_1528543318_4552_1-85072-0_sp0.9 from training. Duration: 0.92225 2023-10-03 04:34:21,626 WARNING [train.py:1204] (2/4) Exclude cut with ID _65263_210_22356_1_1535763598691_3921037_108-21902-0 from training. Duration: 0.99 2023-10-03 04:34:21,631 WARNING [train.py:1204] (2/4) Exclude cut with ID _65585_210_12210_1_1535450451584_3764098_309-174818-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 04:34:21,753 INFO [scaling.py:1118] (2/4) WithLoss: name=encoder.encoders.0.layers.0.self_attn_weights, loss-sum=0.000e+00 2023-10-03 04:34:25,031 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_309-65177-0 from training. Duration: 0.88 2023-10-03 04:34:27,650 WARNING [train.py:1204] (2/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_363-185703-0_sp1.1 from training. Duration: 0.863625 2023-10-03 04:34:27,698 WARNING [train.py:1204] (2/4) Exclude cut with ID _42326_210_7770_1_1533814282531_3707269_442-238692-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 04:34:30,422 WARNING [train.py:1204] (2/4) Exclude cut with ID _69884_210_9771_1_1535846392818_4166169_226-254446-0_sp1.1 from training. Duration: 0.936375 2023-10-03 04:34:30,447 WARNING [train.py:1204] (2/4) Exclude cut with ID _29853_210_7497_1_1532505600851_6642110_287-299903-0_sp1.1 from training. Duration: 0.963625 2023-10-03 04:34:34,637 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1015-343884-0 from training. Duration: 0.62 2023-10-03 04:34:34,670 WARNING [train.py:1204] (2/4) Exclude cut with ID ID0305W0008-833402 from training. Duration: 0.91 2023-10-03 04:34:37,855 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6673_210_3273_1_1527767790_3173240_351-167954-0_sp1.1 from training. Duration: 0.436375 2023-10-03 04:34:37,868 WARNING [train.py:1204] (2/4) Exclude cut with ID _6456_210_1534_1_1527681634212_3181049_345-252877-0 from training. Duration: 0.64 2023-10-03 04:34:40,527 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_13-205182-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 04:34:43,679 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.638e+02 1.900e+02 2.085e+02 2.299e+02 3.582e+02, threshold=4.170e+02, percent-clipped=0.0 2023-10-03 04:34:43,718 INFO [train.py:1046] (2/4) Epoch 33, batch 950, loss[loss=0.151, simple_loss=0.2167, pruned_loss=0.0427, over 22787.00 frames. ], tot_loss[loss=0.1625, simple_loss=0.2411, pruned_loss=0.04193, over 4672517.87 frames. ], batch size: 322, lr: 3.11e-03, grad_scale: 4.0 2023-10-03 04:34:45,284 WARNING [train.py:1204] (2/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_427-208901-0 from training. Duration: 0.68 2023-10-03 04:34:50,202 WARNING [train.py:1204] (2/4) Exclude cut with ID _49039_210_6315_1_1533967537541_3846118_9-148921-0_sp1.1 from training. Duration: 0.97275 2023-10-03 04:34:52,994 WARNING [train.py:1204] (2/4) Exclude cut with ID _59493_210_17282_1_1535078419077_3225879_143-70317-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 04:34:54,263 WARNING [train.py:1204] (2/4) Exclude cut with ID _27967_210_12140_1_1532588391859_8419130_1056-236369-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 04:34:54,313 WARNING [train.py:1204] (2/4) Exclude cut with ID _6537_210_1883_1_1527987559710_3597129_382-133062-0_sp0.9 from training. Duration: 0.7555625 2023-10-03 04:34:55,796 WARNING [train.py:1204] (2/4) Exclude cut with ID ID0376W0255-869957 from training. Duration: 0.9510625 2023-10-03 04:35:00,341 WARNING [train.py:1204] (2/4) Exclude cut with ID _49371_210_12071_1_1533952761021_3812190_483-240691-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 04:35:01,762 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_12868_210_6753_1_1530701932_530275_18-76322-0_sp1.1 from training. Duration: 0.7909375 2023-10-03 04:35:01,821 WARNING [train.py:1204] (2/4) Exclude cut with ID _53950_210_5816_1_1534417264919_3862951_367-26344-0_sp1.1 from training. Duration: 0.97275 2023-10-03 04:35:03,165 WARNING [train.py:1204] (2/4) Exclude cut with ID _5444_210_2151_1_1527418808464_5312210_634-194174-0_sp1.1 from training. Duration: 0.8818125 2023-10-03 04:35:03,190 WARNING [train.py:1204] (2/4) Exclude cut with ID _56494_210_4220_1_1535284837625_3033169_69-138150-0 from training. Duration: 0.94 2023-10-03 04:35:03,286 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7543_210_6753_1_1528544010_1110402_25-183860-0_sp0.9 from training. Duration: 0.52225 2023-10-03 04:35:04,689 WARNING [train.py:1204] (2/4) Exclude cut with ID _40284_210_2084_1_1533513679814_3704279_287-191918-0_sp1.1 from training. Duration: 0.963625 2023-10-03 04:35:06,090 WARNING [train.py:1204] (2/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_291-270881-0 from training. Duration: 0.97 2023-10-03 04:35:07,549 WARNING [train.py:1204] (2/4) Exclude cut with ID _12555_210_8750_1_1530075594402_3780239_449-267986-0_sp0.9 from training. Duration: 0.92225 2023-10-03 04:35:09,113 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.1.bypass_mid.scale_min, batch_count=1139653.3333333333, ans=0.2 2023-10-03 04:35:12,045 WARNING [train.py:1204] (2/4) Exclude cut with ID _3992_210_1747_1_1526644816031_3731060_268-64520-0_sp1.1 from training. Duration: 0.963625 2023-10-03 04:35:12,059 WARNING [train.py:1204] (2/4) Exclude cut with ID _39411_210_5986_1_1533538825030_3343649_173-238248-0_sp0.9 from training. Duration: 0.92225 2023-10-03 04:35:12,088 WARNING [train.py:1204] (2/4) Exclude cut with ID _32704_210_8954_1_1533017488840_6747370_828-313097-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 04:35:13,440 WARNING [train.py:1204] (2/4) Exclude cut with ID _26907_210_7234_1_1532221740694_3205263_337-2494-0 from training. Duration: 0.88 2023-10-03 04:35:16,540 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_229-45300-0_sp1.1 from training. Duration: 0.5181875 2023-10-03 04:35:17,972 WARNING [train.py:1204] (2/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_300-27235-0_sp1.1 from training. Duration: 0.7909375 2023-10-03 04:35:19,329 WARNING [train.py:1204] (2/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_48-338976-0_sp1.1 from training. Duration: 0.8090625 2023-10-03 04:35:24,193 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_13091_210_7925_1_1530609447_8987354_792-195353-0_sp1.1 from training. Duration: 0.936375 2023-10-03 04:35:24,198 WARNING [train.py:1204] (2/4) Exclude cut with ID _56524_210_9642_1_1534572011779_3619170_311-54500-0_sp1.1 from training. Duration: 0.97275 2023-10-03 04:35:26,952 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7543_210_6753_1_1528544010_1110402_25-183860-0 from training. Duration: 0.47 2023-10-03 04:35:27,185 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.0.bypass_mid.scale_min, batch_count=1139786.6666666667, ans=0.2 2023-10-03 04:35:28,373 WARNING [train.py:1204] (2/4) Exclude cut with ID 2411-132532-0017-82279-0_sp1.1 from training. Duration: 0.9681875 2023-10-03 04:35:28,374 WARNING [train.py:1204] (2/4) Exclude cut with ID _14170_210_5219_1_1530525949868_4469650_272-249985-0_sp0.9 from training. Duration: 0.7444375 2023-10-03 04:35:29,775 WARNING [train.py:1204] (2/4) Exclude cut with ID _55644_210_11856_1_1534593599582_4103719_320-229006-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 04:35:29,832 WARNING [train.py:1204] (2/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_177-100554-0_sp1.1 from training. Duration: 0.963625 2023-10-03 04:35:29,834 WARNING [train.py:1204] (2/4) Exclude cut with ID _14998_210_5219_1_1530705582080_7474970_787-294416-0_sp1.1 from training. Duration: 0.7454375 2023-10-03 04:35:34,247 WARNING [train.py:1204] (2/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_318-59097-0 from training. Duration: 0.76 2023-10-03 04:35:34,320 WARNING [train.py:1204] (2/4) Exclude cut with ID _12369_210_4381_1_1530356078581_4388520_480-13146-0_sp1.1 from training. Duration: 0.77275 2023-10-03 04:35:38,404 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_469-338558-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 04:35:38,457 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_40896_210_15710_1_1533433956_7100802_367-126897-0_sp1.1 from training. Duration: 0.963625 2023-10-03 04:35:38,483 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6545_210_2040_1_1527774670_4245258_379-14182-0 from training. Duration: 0.8 2023-10-03 04:35:38,495 WARNING [train.py:1204] (2/4) Exclude cut with ID _21631_210_2253_1_1531907880778_3160339_197-79498-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 04:35:38,499 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_12868_210_6753_1_1530702479_3610564_416-293746-0_sp1.1 from training. Duration: 0.8181875 2023-10-03 04:35:39,850 WARNING [train.py:1204] (2/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_52-211683-0 from training. Duration: 0.9 2023-10-03 04:35:41,746 INFO [scaling.py:1118] (2/4) WithLoss: name=encoder.encoders.5.encoder.layers.0.self_attn_weights, loss-sum=0.000e+00 2023-10-03 04:35:43,315 WARNING [train.py:1204] (2/4) Exclude cut with ID _48678_210_5663_1_1533988830947_3769199_306-255886-0_sp0.9 from training. Duration: 0.8555625 2023-10-03 04:35:45,446 WARNING [train.py:1204] (2/4) Exclude cut with ID _65134_210_14763_1_1535335393985_3505368_296-24733-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 04:35:47,675 INFO [scaling.py:1022] (2/4) Whitening: name=encoder_embed.convnext.out_whiten, num_groups=1, num_channels=128, metric=4.78 vs. limit=5.0 2023-10-03 04:35:50,788 WARNING [train.py:1204] (2/4) Exclude cut with ID _14898_210_9206_1_1530766812377_4177190_335-99364-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 04:35:50,872 WARNING [train.py:1204] (2/4) Exclude cut with ID _54871_210_22356_1_1534665798253_3856359_19-342632-0 from training. Duration: 0.91 2023-10-03 04:35:52,764 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_53071_210_16554_1_1534490405_2954066_265-170472-0 from training. Duration: 0.83 2023-10-03 04:35:55,558 WARNING [train.py:1204] (2/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_189-298172-0_sp1.1 from training. Duration: 0.963625 2023-10-03 04:35:58,276 INFO [train.py:1046] (2/4) Epoch 33, batch 1000, loss[loss=0.1429, simple_loss=0.2223, pruned_loss=0.03174, over 24478.00 frames. ], tot_loss[loss=0.1619, simple_loss=0.2402, pruned_loss=0.04175, over 4662999.57 frames. ], batch size: 58, lr: 3.11e-03, grad_scale: 8.0 2023-10-03 04:35:58,909 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.4.encoder.layers.2.feed_forward3.out_whiten, num_groups=1, num_channels=384, metric=13.07 vs. limit=15.0 2023-10-03 04:35:59,702 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7615_210_4211_1_1528110965_4580160_72-235211-0 from training. Duration: 0.76 2023-10-03 04:36:01,564 WARNING [train.py:1204] (2/4) Exclude cut with ID _47790_210_16187_1_1533862814066_4207519_155-90874-0_sp1.1 from training. Duration: 0.936375 2023-10-03 04:36:05,749 WARNING [train.py:1204] (2/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_164-216310-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 04:36:06,008 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.bypass.scale_min, batch_count=1139920.0, ans=0.2 2023-10-03 04:36:07,165 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_46013_210_7234_1_1533776362_3744032_463-52378-0 from training. Duration: 0.86 2023-10-03 04:36:07,169 WARNING [train.py:1204] (2/4) Exclude cut with ID _61133_210_9969_1_1535196777227_3696409_333-270325-0 from training. Duration: 0.93 2023-10-03 04:36:07,260 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder_embed.convnext.layerdrop_rate, batch_count=1139920.0, ans=0.015 2023-10-03 04:36:11,390 WARNING [train.py:1204] (2/4) Exclude cut with ID _5172_210_1883_1_1527246062567_3577219_18-101380-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 04:36:11,397 WARNING [train.py:1204] (2/4) Exclude cut with ID _30800_210_22382_1_1532499347904_4515959_577-236858-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 04:36:14,466 WARNING [train.py:1204] (2/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_1006-177345-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 04:36:17,332 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6291_210_3273_1_1527683649_1078498_4-266348-0 from training. Duration: 0.78 2023-10-03 04:36:19,419 WARNING [train.py:1204] (2/4) Exclude cut with ID _55857_210_2346_1_1534644007831_6898209_745-98391-0 from training. Duration: 0.76 2023-10-03 04:36:21,109 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.balancer1.prob, batch_count=1139986.6666666667, ans=0.125 2023-10-03 04:36:22,225 WARNING [train.py:1204] (2/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_375-244976-0 from training. Duration: 0.75 2023-10-03 04:36:22,281 WARNING [train.py:1204] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_4-248404-0_sp1.1 from training. Duration: 0.97275 2023-10-03 04:36:24,335 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8453_210_4211_1_1528502940_8562730_596-306820-0 from training. Duration: 0.97 2023-10-03 04:36:25,705 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_211-339622-0_sp0.9 from training. Duration: 0.5 2023-10-03 04:36:25,745 WARNING [train.py:1204] (2/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_393-57888-0 from training. Duration: 0.64 2023-10-03 04:36:28,402 WARNING [train.py:1204] (2/4) Exclude cut with ID _23825_210_13449_1_1531915313526_3497150_143-343575-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 04:36:28,454 WARNING [train.py:1204] (2/4) Exclude cut with ID _58605_210_12210_1_1534723195939_3904259_36-12256-0_sp1.1 from training. Duration: 0.936375 2023-10-03 04:36:37,445 WARNING [train.py:1204] (2/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_38-67795-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 04:36:37,515 WARNING [train.py:1204] (2/4) Exclude cut with ID _64631_210_27767_1_1535349580068_4165160_308-327832-0_sp1.1 from training. Duration: 0.92725 2023-10-03 04:36:38,866 WARNING [train.py:1204] (2/4) Exclude cut with ID _37086_210_9038_1_1532999540891_7021250_726-81176-0_sp1.1 from training. Duration: 0.936375 2023-10-03 04:36:38,925 WARNING [train.py:1204] (2/4) Exclude cut with ID _47790_210_16187_1_1533862814066_4207519_47-141521-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 04:36:40,177 WARNING [train.py:1204] (2/4) Exclude cut with ID _3067_210_2667_1_1526212974640_4503168_250-285937-0 from training. Duration: 0.8 2023-10-03 04:36:40,208 WARNING [train.py:1204] (2/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_239-24000-0_sp1.1 from training. Duration: 0.97275 2023-10-03 04:36:41,555 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_3371_210_1794_1_1526384462_5580777_396-339812-0_sp0.9 from training. Duration: 0.9444375 2023-10-03 04:36:41,604 WARNING [train.py:1204] (2/4) Exclude cut with ID _16909_210_5724_1_1531292079912_4118919_580-173454-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 04:36:42,998 WARNING [train.py:1204] (2/4) Exclude cut with ID IC0124W0002-61308 from training. Duration: 0.982 2023-10-03 04:36:45,683 WARNING [train.py:1204] (2/4) Exclude cut with ID _31602_210_5854_1_1532677021456_4458250_249-96604-0 from training. Duration: 0.89 2023-10-03 04:36:47,152 WARNING [train.py:1204] (2/4) Exclude cut with ID _69584_210_5219_1_1535785133775_4044450_325-7656-0 from training. Duration: 0.86 2023-10-03 04:36:49,096 WARNING [train.py:1204] (2/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_478-162247-0 from training. Duration: 0.82 2023-10-03 04:36:52,313 WARNING [train.py:1204] (2/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_686-54383-0_sp1.1 from training. Duration: 0.7909375 2023-10-03 04:36:58,629 WARNING [train.py:1204] (2/4) Exclude cut with ID _29303_210_2575_1_1532392237303_7410649_85-270092-0_sp1.1 from training. Duration: 0.936375 2023-10-03 04:36:58,646 WARNING [train.py:1204] (2/4) Exclude cut with ID _2322_210_2667_1_1525604548971_3656299_440-80494-0_sp1.1 from training. Duration: 0.8 2023-10-03 04:36:59,938 WARNING [train.py:1204] (2/4) Exclude cut with ID _47346_210_9476_1_1533815971553_3536160_201-40644-0_sp1.1 from training. Duration: 0.936375 2023-10-03 04:37:00,034 WARNING [train.py:1204] (2/4) Exclude cut with ID _56235_210_10640_1_1534557335235_4161099_343-139898-0_sp1.1 from training. Duration: 0.87275 2023-10-03 04:37:01,353 WARNING [train.py:1204] (2/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_1012-279473-0 from training. Duration: 0.67 2023-10-03 04:37:02,744 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7931_210_1794_1_1528594728_5564059_280-279560-0_sp1.1 from training. Duration: 0.8909375 2023-10-03 04:37:02,791 WARNING [train.py:1204] (2/4) Exclude cut with ID _8263_210_1664_1_1528439452891_970220_68-181805-0 from training. Duration: 0.64 2023-10-03 04:37:04,564 WARNING [train.py:1204] (2/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_605-133395-0 from training. Duration: 0.66 2023-10-03 04:37:05,952 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_43-171109-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 04:37:05,960 WARNING [train.py:1204] (2/4) Exclude cut with ID _40094_210_16380_1_1533294005773_3641159_107-280739-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 04:37:08,619 WARNING [train.py:1204] (2/4) Exclude cut with ID _14821_210_8750_1_1530680503991_6829359_2-227151-0_sp0.9 from training. Duration: 0.92225 2023-10-03 04:37:10,086 WARNING [train.py:1204] (2/4) Exclude cut with ID _14268_210_9206_1_1530622771285_4714099_331-105351-0_sp1.1 from training. Duration: 0.5909375 2023-10-03 04:37:12,721 INFO [train.py:1046] (2/4) Epoch 33, batch 1050, loss[loss=0.17, simple_loss=0.2399, pruned_loss=0.05009, over 23730.00 frames. ], tot_loss[loss=0.1611, simple_loss=0.2391, pruned_loss=0.0416, over 4670188.45 frames. ], batch size: 164, lr: 3.11e-03, grad_scale: 4.0 2023-10-03 04:37:12,802 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_26326_210_7925_1_1533002493_3592406_451-82741-0_sp1.1 from training. Duration: 0.97275 2023-10-03 04:37:14,199 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.551e+02 1.886e+02 2.130e+02 2.502e+02 4.211e+02, threshold=4.261e+02, percent-clipped=1.0 2023-10-03 04:37:14,407 WARNING [train.py:1204] (2/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_191-59784-0_sp0.9 from training. Duration: 0.97775 2023-10-03 04:37:15,893 WARNING [train.py:1204] (2/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_445-117398-0_sp1.1 from training. Duration: 0.8090625 2023-10-03 04:37:19,020 WARNING [train.py:1204] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_605-257205-0_sp0.9 from training. Duration: 0.6555625 2023-10-03 04:37:19,079 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_50050_210_16223_1_1535435796_7290138_592-258841-0_sp1.1 from training. Duration: 0.936375 2023-10-03 04:37:20,448 WARNING [train.py:1204] (2/4) Exclude cut with ID _14348_210_8341_1_1530599434996_3778640_240-232370-0_sp0.9 from training. Duration: 0.9333125 2023-10-03 04:37:23,518 WARNING [train.py:1204] (2/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_239-316802-0_sp0.9 from training. Duration: 0.7666875 2023-10-03 04:37:25,373 WARNING [train.py:1204] (2/4) Exclude cut with ID _45560_210_21982_1_1533812603951_3584999_111-6376-0_sp1.1 from training. Duration: 0.763625 2023-10-03 04:37:28,112 WARNING [train.py:1204] (2/4) Exclude cut with ID _54189_210_16380_1_1534332670039_3911710_606-311481-0_sp1.1 from training. Duration: 0.963625 2023-10-03 04:37:28,192 WARNING [train.py:1204] (2/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_237-332027-0_sp1.1 from training. Duration: 0.863625 2023-10-03 04:37:28,218 WARNING [train.py:1204] (2/4) Exclude cut with ID _66070_210_7640_1_1535441393096_7805310_489-292631-0_sp0.9 from training. Duration: 0.87775 2023-10-03 04:37:29,596 WARNING [train.py:1204] (2/4) Exclude cut with ID _49728_210_12651_1_1534041034294_6788150_624-195301-0_sp1.1 from training. Duration: 0.77275 2023-10-03 04:37:29,644 WARNING [train.py:1204] (2/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_44-79427-0 from training. Duration: 0.98 2023-10-03 04:37:31,036 WARNING [train.py:1204] (2/4) Exclude cut with ID _35350_210_16040_1_1533040191183_4075299_490-198354-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 04:37:31,073 WARNING [train.py:1204] (2/4) Exclude cut with ID _5166_210_2084_1_1527073197160_4087993_309-222545-0 from training. Duration: 0.84 2023-10-03 04:37:32,567 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_13091_210_7925_1_1530609447_8987354_790-199469-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 04:37:32,572 WARNING [train.py:1204] (2/4) Exclude cut with ID _21632_210_2253_1_1531994001765_3744140_110-274654-0 from training. Duration: 0.82 2023-10-03 04:37:32,584 WARNING [train.py:1204] (2/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_498-40153-0_sp0.9 from training. Duration: 0.7 2023-10-03 04:37:38,588 WARNING [train.py:1204] (2/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_363-282909-0_sp1.1 from training. Duration: 0.936375 2023-10-03 04:37:39,890 WARNING [train.py:1204] (2/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_178-177582-0_sp0.9 from training. Duration: 0.82225 2023-10-03 04:37:39,910 WARNING [train.py:1204] (2/4) Exclude cut with ID _24612_210_8913_1_1531998054193_3806359_357-35487-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 04:37:42,607 WARNING [train.py:1204] (2/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_138-332773-0 from training. Duration: 0.81 2023-10-03 04:37:42,631 WARNING [train.py:1204] (2/4) Exclude cut with ID _7624_210_1531_1_1528109802198_4933101_418-349834-0 from training. Duration: 0.6 2023-10-03 04:37:42,656 WARNING [train.py:1204] (2/4) Exclude cut with ID _7537_210_2084_1_1528502395431_3924359_14-312906-0_sp0.9 from training. Duration: 0.9333125 2023-10-03 04:37:46,023 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.4.encoder.layers.2.feed_forward1.out_whiten, num_groups=1, num_channels=384, metric=8.16 vs. limit=15.0 2023-10-03 04:37:46,866 WARNING [train.py:1204] (2/4) Exclude cut with ID _60057_210_10640_1_1534903065793_3333150_362-230377-0 from training. Duration: 0.87 2023-10-03 04:37:50,214 WARNING [train.py:1204] (2/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_138-64210-0 from training. Duration: 0.73 2023-10-03 04:37:52,054 WARNING [train.py:1204] (2/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_454-339504-0_sp1.1 from training. Duration: 0.97275 2023-10-03 04:37:55,333 WARNING [train.py:1204] (2/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_508-335369-0_sp0.9 from training. Duration: 0.5333125 2023-10-03 04:37:58,047 WARNING [train.py:1204] (2/4) Exclude cut with ID _5608_210_2106_1_1527415439089_6819768_909-82767-0_sp0.9 from training. Duration: 0.67775 2023-10-03 04:37:58,081 WARNING [train.py:1204] (2/4) Exclude cut with ID _6694_210_4599_1_1527852637490_3965529_99-11611-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 04:37:58,143 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_2655_210_2087_1_1525688959_3897400_52-279899-0_sp1.1 from training. Duration: 0.62725 2023-10-03 04:38:02,436 WARNING [train.py:1204] (2/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_375-245677-0_sp1.1 from training. Duration: 0.736375 2023-10-03 04:38:05,629 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_23850_210_4238_1_1532232597_1632433_32-185135-0 from training. Duration: 0.86 2023-10-03 04:38:08,364 WARNING [train.py:1204] (2/4) Exclude cut with ID _15113_210_8254_1_1530842438579_3716279_209-54533-0 from training. Duration: 0.96 2023-10-03 04:38:08,403 WARNING [train.py:1204] (2/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_401-53423-0 from training. Duration: 0.55 2023-10-03 04:38:08,438 WARNING [train.py:1204] (2/4) Exclude cut with ID _14867_210_1968_1_1530684179890_3846969_319-250868-0_sp1.1 from training. Duration: 0.9 2023-10-03 04:38:08,459 WARNING [train.py:1204] (2/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_106-30782-0_sp0.9 from training. Duration: 0.888875 2023-10-03 04:38:11,091 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_5960_210_1794_1_1527403865_3990999_515-2613-0 from training. Duration: 0.88 2023-10-03 04:38:11,552 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.5.encoder.layers.0.conv_module2.whiten, num_groups=1, num_channels=256, metric=4.03 vs. limit=15.0 2023-10-03 04:38:15,337 WARNING [train.py:1204] (2/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_798-106179-0_sp0.9 from training. Duration: 0.92225 2023-10-03 04:38:18,046 WARNING [train.py:1204] (2/4) Exclude cut with ID _14227_210_4381_1_1530701804286_3940259_125-275642-0_sp1.1 from training. Duration: 0.9 2023-10-03 04:38:18,048 WARNING [train.py:1204] (2/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_79-200392-0_sp1.1 from training. Duration: 0.8818125 2023-10-03 04:38:19,478 WARNING [train.py:1204] (2/4) Exclude cut with ID _48836_210_12210_1_1534075848293_3904150_429-35475-0_sp0.9 from training. Duration: 0.988875 2023-10-03 04:38:19,505 WARNING [train.py:1204] (2/4) Exclude cut with ID _69884_210_9771_1_1535846392818_4166169_224-172517-0_sp1.1 from training. Duration: 0.97275 2023-10-03 04:38:22,726 WARNING [train.py:1204] (2/4) Exclude cut with ID _15113_210_8254_1_1530842438579_3716279_76-232507-0_sp1.1 from training. Duration: 0.97275 2023-10-03 04:38:22,728 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_135-5501-0 from training. Duration: 0.98 2023-10-03 04:38:24,158 WARNING [train.py:1204] (2/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_563-347832-0_sp0.9 from training. Duration: 0.988875 2023-10-03 04:38:24,168 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6786_210_1388_1_1527839699_4194495_273-149841-0 from training. Duration: 0.92 2023-10-03 04:38:25,975 WARNING [train.py:1204] (2/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_172-215932-0 from training. Duration: 0.66 2023-10-03 04:38:26,037 WARNING [train.py:1204] (2/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_939-132910-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 04:38:27,222 INFO [train.py:1046] (2/4) Epoch 33, batch 1100, loss[loss=0.1694, simple_loss=0.2377, pruned_loss=0.0505, over 22814.00 frames. ], tot_loss[loss=0.1605, simple_loss=0.2389, pruned_loss=0.04101, over 4678842.81 frames. ], batch size: 322, lr: 3.11e-03, grad_scale: 8.0 2023-10-03 04:38:30,401 WARNING [train.py:1204] (2/4) Exclude cut with ID _49371_210_12071_1_1533952761021_3812190_16-246001-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 04:38:31,965 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.feed_forward2.hidden_balancer.prob, batch_count=1140586.6666666667, ans=0.125 2023-10-03 04:38:34,556 WARNING [train.py:1204] (2/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_680-189165-0_sp1.1 from training. Duration: 0.936375 2023-10-03 04:38:39,182 WARNING [train.py:1204] (2/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_53-300544-0_sp1.1 from training. Duration: 0.6090625 2023-10-03 04:38:40,519 WARNING [train.py:1204] (2/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_540-275685-0_sp1.1 from training. Duration: 0.8090625 2023-10-03 04:38:40,537 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_38965_210_4852_1_1533174906_3484735_453-106579-0_sp1.1 from training. Duration: 0.92725 2023-10-03 04:38:41,903 WARNING [train.py:1204] (2/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_105-328174-0 from training. Duration: 0.76 2023-10-03 04:38:43,290 WARNING [train.py:1204] (2/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_279-126205-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 04:38:44,760 WARNING [train.py:1204] (2/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_5-55893-0_sp0.9 from training. Duration: 0.611125 2023-10-03 04:38:47,520 WARNING [train.py:1204] (2/4) Exclude cut with ID _13678_210_7497_1_1530530069226_3463170_141-10835-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 04:38:49,129 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.balancer2.prob, batch_count=1140653.3333333333, ans=0.125 2023-10-03 04:38:50,219 WARNING [train.py:1204] (2/4) Exclude cut with ID _22083_210_8254_1_1531702862439_3678970_201-85738-0_sp1.1 from training. Duration: 0.8181875 2023-10-03 04:38:50,252 WARNING [train.py:1204] (2/4) Exclude cut with ID _4697_210_2069_1_1526815801953_3823098_51-1049-0 from training. Duration: 0.75 2023-10-03 04:38:52,190 WARNING [train.py:1204] (2/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_206-10097-0_sp0.9 from training. Duration: 0.5444375 2023-10-03 04:38:53,316 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_4528_210_3273_1_1527076654_3276777_360-63661-0_sp1.1 from training. Duration: 0.92725 2023-10-03 04:38:53,330 WARNING [train.py:1204] (2/4) Exclude cut with ID _64086_210_7994_1_1535270417019_7435949_449-31567-0_sp1.1 from training. Duration: 0.8818125 2023-10-03 04:38:56,028 WARNING [train.py:1204] (2/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_245-174553-0_sp0.9 from training. Duration: 0.97775 2023-10-03 04:38:57,561 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6545_210_2040_1_1527774670_4245258_379-14182-0_sp1.1 from training. Duration: 0.72725 2023-10-03 04:39:04,171 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_3133_210_2087_1_1526106446_2324690_256-206070-0_sp1.1 from training. Duration: 0.963625 2023-10-03 04:39:06,478 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.1.encoder.layers.1.feed_forward1.out_whiten, num_groups=1, num_channels=256, metric=5.90 vs. limit=15.0 2023-10-03 04:39:07,587 WARNING [train.py:1204] (2/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_278-104676-0 from training. Duration: 0.98 2023-10-03 04:39:07,657 WARNING [train.py:1204] (2/4) Exclude cut with ID IC0671W0065-331908 from training. Duration: 0.896 2023-10-03 04:39:07,698 WARNING [train.py:1204] (2/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_244-129917-0_sp1.1 from training. Duration: 0.97275 2023-10-03 04:39:09,056 WARNING [train.py:1204] (2/4) Exclude cut with ID _7852_210_5219_1_1528286471316_8139400_1129-170857-0_sp1.1 from training. Duration: 0.97275 2023-10-03 04:39:10,404 WARNING [train.py:1204] (2/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_443-30034-0_sp0.9 from training. Duration: 0.711125 2023-10-03 04:39:10,440 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_31583_210_3852_1_1532595851_4024413_250-332999-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 04:39:11,979 WARNING [train.py:1204] (2/4) Exclude cut with ID _8772_210_1850_1_1528635595972_3664011_247-336448-0 from training. Duration: 0.74 2023-10-03 04:39:13,335 WARNING [train.py:1204] (2/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_567-135114-0_sp0.9 from training. Duration: 0.9666875 2023-10-03 04:39:13,339 WARNING [train.py:1204] (2/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_557-349534-0_sp1.1 from training. Duration: 0.82725 2023-10-03 04:39:13,355 WARNING [train.py:1204] (2/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_1341-26728-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 04:39:14,635 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_40222_210_6228_1_1533385298_4481939_375-328051-0_sp1.1 from training. Duration: 0.97275 2023-10-03 04:39:14,657 WARNING [train.py:1204] (2/4) Exclude cut with ID _4697_210_2069_1_1526815801953_3823098_276-96593-0 from training. Duration: 0.77 2023-10-03 04:39:16,235 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.conv_skip_rate, batch_count=1140786.6666666667, ans=0.0 2023-10-03 04:39:20,145 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_53071_210_16554_1_1534490405_2954066_265-170472-0_sp0.9 from training. Duration: 0.92225 2023-10-03 04:39:20,171 WARNING [train.py:1204] (2/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_365-1492-0 from training. Duration: 0.91 2023-10-03 04:39:21,462 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.3.encoder.layers.2.nonlin_attention.whiten2, num_groups=1, num_channels=512, metric=5.47 vs. limit=15.0 2023-10-03 04:39:23,453 WARNING [train.py:1204] (2/4) Exclude cut with ID _66873_210_5517_1_1535454136506_4293370_654-52713-0_sp0.9 from training. Duration: 0.8555625 2023-10-03 04:39:23,732 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.feed_forward2.hidden_balancer.prob, batch_count=1140786.6666666667, ans=0.125 2023-10-03 04:39:27,482 WARNING [train.py:1204] (2/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_113-92608-0_sp1.1 from training. Duration: 0.4909375 2023-10-03 04:39:28,966 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_46013_210_7234_1_1533776362_3744032_50-141535-0 from training. Duration: 0.99 2023-10-03 04:39:28,974 WARNING [train.py:1204] (2/4) Exclude cut with ID _13942_210_9988_1_1530604906036_3733360_499-55676-0_sp1.1 from training. Duration: 0.563625 2023-10-03 04:39:30,970 WARNING [train.py:1204] (2/4) Exclude cut with ID _30800_210_22382_1_1532499347904_4515959_375-65143-0_sp1.1 from training. Duration: 0.97275 2023-10-03 04:39:33,903 WARNING [train.py:1204] (2/4) Exclude cut with ID _16756_210_4915_1_1531047803763_7104259_199-325608-0_sp1.1 from training. Duration: 0.92725 2023-10-03 04:39:33,932 WARNING [train.py:1204] (2/4) Exclude cut with ID _31649_210_6228_1_1532584608063_4091078_443-124391-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 04:39:37,118 WARNING [train.py:1204] (2/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_146-218763-0 from training. Duration: 0.86 2023-10-03 04:39:38,432 WARNING [train.py:1204] (2/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_567-135114-0_sp1.1 from training. Duration: 0.7909375 2023-10-03 04:39:38,466 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_26326_210_7925_1_1533002493_3592406_462-288299-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 04:39:39,855 WARNING [train.py:1204] (2/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_322-217220-0 from training. Duration: 0.86 2023-10-03 04:39:39,893 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_238-256668-0_sp1.1 from training. Duration: 0.62725 2023-10-03 04:39:39,939 WARNING [train.py:1204] (2/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_371-18169-0 from training. Duration: 0.86 2023-10-03 04:39:41,297 INFO [train.py:1046] (2/4) Epoch 33, batch 1150, loss[loss=0.1497, simple_loss=0.2276, pruned_loss=0.03591, over 24648.00 frames. ], tot_loss[loss=0.1609, simple_loss=0.2394, pruned_loss=0.04118, over 4686284.06 frames. ], batch size: 60, lr: 3.10e-03, grad_scale: 8.0 2023-10-03 04:39:41,356 WARNING [train.py:1204] (2/4) Exclude cut with ID _58480_210_12392_1_1534811121111_3646160_23-74512-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 04:39:41,361 WARNING [train.py:1204] (2/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_784-213099-0_sp1.1 from training. Duration: 0.8454375 2023-10-03 04:39:41,449 WARNING [train.py:1204] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_625-102955-0_sp1.1 from training. Duration: 0.7 2023-10-03 04:39:42,042 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.4.encoder.layers.1.feed_forward3.out_whiten, num_groups=1, num_channels=384, metric=11.48 vs. limit=15.0 2023-10-03 04:39:42,688 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.574e+02 1.826e+02 2.024e+02 2.251e+02 4.261e+02, threshold=4.048e+02, percent-clipped=0.0 2023-10-03 04:39:45,624 WARNING [train.py:1204] (2/4) Exclude cut with ID _49748_210_24116_1_1533986971737_3158910_176-174327-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 04:39:46,967 WARNING [train.py:1204] (2/4) Exclude cut with ID _50310_210_15488_1_1534068043345_3641080_432-96140-0_sp1.1 from training. Duration: 0.936375 2023-10-03 04:39:48,419 WARNING [train.py:1204] (2/4) Exclude cut with ID _40402_210_18219_1_1533522324115_2953150_76-14910-0_sp1.1 from training. Duration: 0.92725 2023-10-03 04:39:48,449 WARNING [train.py:1204] (2/4) Exclude cut with ID _21783_210_11357_1_1531742458730_3762679_40-19458-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 04:39:49,763 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_14056_210_4163_1_1530604483_7572827_46-93547-0 from training. Duration: 0.95 2023-10-03 04:39:49,807 WARNING [train.py:1204] (2/4) Exclude cut with ID _21631_210_2253_1_1531907880778_3160339_321-35325-0_sp1.1 from training. Duration: 0.963625 2023-10-03 04:39:54,355 WARNING [train.py:1204] (2/4) Exclude cut with ID _4779_210_2084_1_1527318022944_3914893_500-285013-0 from training. Duration: 0.69 2023-10-03 04:39:54,441 WARNING [train.py:1204] (2/4) Exclude cut with ID _43552_210_8913_1_1533726031059_3523429_301-93324-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 04:39:54,446 WARNING [train.py:1204] (2/4) Exclude cut with ID _6473_210_1741_1_1527901307396_3815380_108-17335-0_sp1.1 from training. Duration: 0.5909375 2023-10-03 04:39:55,989 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.conv_module1.balancer2.prob, batch_count=1140986.6666666667, ans=0.125 2023-10-03 04:40:00,086 WARNING [train.py:1204] (2/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_206-10097-0 from training. Duration: 0.49 2023-10-03 04:40:03,394 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_36756_210_7234_1_1533026991_966276_103-77513-0_sp1.1 from training. Duration: 0.97275 2023-10-03 04:40:07,592 WARNING [train.py:1204] (2/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_662-18193-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 04:40:07,652 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_26433_210_12108_1_1532157158_4056480_439-52438-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 04:40:07,703 WARNING [train.py:1204] (2/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_464-33931-0 from training. Duration: 0.54 2023-10-03 04:40:07,709 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_3474_210_2028_1_1526188513_4825246_145-309131-0_sp0.9 from training. Duration: 0.82225 2023-10-03 04:40:09,051 WARNING [train.py:1204] (2/4) Exclude cut with ID _9679_210_7497_1_1529129212906_4363929_446-308321-0_sp1.1 from training. Duration: 0.963625 2023-10-03 04:40:10,610 WARNING [train.py:1204] (2/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_662-275621-0 from training. Duration: 0.42 2023-10-03 04:40:11,940 WARNING [train.py:1204] (2/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_344-337148-0_sp1.1 from training. Duration: 0.97275 2023-10-03 04:40:13,348 WARNING [train.py:1204] (2/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_759-179071-0_sp1.1 from training. Duration: 0.92725 2023-10-03 04:40:20,445 WARNING [train.py:1204] (2/4) Exclude cut with ID _28783_210_7943_1_1532671164808_7155479_293-12188-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 04:40:27,958 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_17613_210_2151_1_1531137569_2366160_235-320304-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 04:40:28,003 WARNING [train.py:1204] (2/4) Exclude cut with ID _14349_210_8341_1_1530615710818_3442470_170-336541-0 from training. Duration: 0.86 2023-10-03 04:40:29,737 WARNING [train.py:1204] (2/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_375-218677-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 04:40:31,109 WARNING [train.py:1204] (2/4) Exclude cut with ID _28317_210_12742_1_1532480302381_8510325_829-202956-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 04:40:31,972 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.2.encoder.layers.0.nonlin_attention.whiten1, num_groups=1, num_channels=288, metric=8.15 vs. limit=10.0 2023-10-03 04:40:34,899 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9029W0436-509823 from training. Duration: 0.768 2023-10-03 04:40:36,303 WARNING [train.py:1204] (2/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_231-112214-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 04:40:42,007 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.attention_skip_rate, batch_count=1141186.6666666667, ans=0.0 2023-10-03 04:40:43,105 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9059W0470-524883 from training. Duration: 0.3415625 2023-10-03 04:40:47,195 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_43266_210_1794_1_1533521789_4497218_206-79512-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 04:40:47,441 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.ff2_skip_rate, batch_count=1141186.6666666667, ans=0.0 2023-10-03 04:40:48,607 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_294-163793-0_sp0.9 from training. Duration: 0.911125 2023-10-03 04:40:48,634 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6290_210_3273_1_1527595224_1282787_115-87095-0_sp1.1 from training. Duration: 0.5 2023-10-03 04:40:48,673 WARNING [train.py:1204] (2/4) Exclude cut with ID _14100_210_8341_1_1530529320613_3678069_279-55760-0_sp1.1 from training. Duration: 0.8454375 2023-10-03 04:40:51,484 WARNING [train.py:1204] (2/4) Exclude cut with ID _14621_210_1925_1_1530686901185_3675420_419-234200-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 04:40:54,735 INFO [train.py:1046] (2/4) Epoch 33, batch 1200, loss[loss=0.178, simple_loss=0.2602, pruned_loss=0.0479, over 24008.00 frames. ], tot_loss[loss=0.1617, simple_loss=0.2404, pruned_loss=0.04148, over 4690605.81 frames. ], batch size: 80, lr: 3.10e-03, grad_scale: 16.0 2023-10-03 04:40:56,265 WARNING [train.py:1204] (2/4) Exclude cut with ID _60229_210_27767_1_1534838406158_3655350_353-213656-0_sp1.1 from training. Duration: 0.863625 2023-10-03 04:40:56,279 WARNING [train.py:1204] (2/4) Exclude cut with ID _48678_210_5663_1_1533988830947_3769199_306-255886-0_sp1.1 from training. Duration: 0.7 2023-10-03 04:40:57,640 WARNING [train.py:1204] (2/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_466-3492-0_sp1.1 from training. Duration: 0.97275 2023-10-03 04:40:57,640 WARNING [train.py:1204] (2/4) Exclude cut with ID _8755_210_1876_1_1528628559357_3258209_199-242167-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 04:40:57,703 WARNING [train.py:1204] (2/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_237-89684-0_sp1.1 from training. Duration: 0.87275 2023-10-03 04:40:59,100 WARNING [train.py:1204] (2/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_70-84121-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 04:40:59,324 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.attention_skip_rate, batch_count=1141253.3333333333, ans=0.0 2023-10-03 04:41:01,016 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7544_210_6753_1_1528587722_3576686_172-171750-0_sp1.1 from training. Duration: 0.5909375 2023-10-03 04:41:04,055 WARNING [train.py:1204] (2/4) Exclude cut with ID _24271_210_12658_1_1531987270022_7190519_40-321822-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 04:41:04,088 WARNING [train.py:1204] (2/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_227-83612-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 04:41:06,657 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9156W0015-572508 from training. Duration: 0.896 2023-10-03 04:41:09,392 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_292-265928-0 from training. Duration: 0.51 2023-10-03 04:41:13,618 WARNING [train.py:1204] (2/4) Exclude cut with ID _14747_210_8341_1_1530685797463_3654089_147-131229-0_sp1.1 from training. Duration: 0.6909375 2023-10-03 04:41:13,839 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.conv_skip_rate, batch_count=1141320.0, ans=0.0 2023-10-03 04:41:15,081 WARNING [train.py:1204] (2/4) Exclude cut with ID _45297_210_2721_1_1533715351381_3264949_351-147136-0_sp0.9 from training. Duration: 0.9666875 2023-10-03 04:41:15,347 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.conv_skip_rate, batch_count=1141320.0, ans=0.0 2023-10-03 04:41:17,831 WARNING [train.py:1204] (2/4) Exclude cut with ID _5250_210_5219_1_1527249561806_8692110_1102-324568-0_sp1.1 from training. Duration: 0.97275 2023-10-03 04:41:18,763 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.0.layers.1.feed_forward3.out_whiten, num_groups=1, num_channels=192, metric=9.40 vs. limit=15.0 2023-10-03 04:41:21,747 WARNING [train.py:1204] (2/4) Exclude cut with ID _18516_210_9206_1_1531704653206_3692406_465-75446-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 04:41:21,748 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9203W0126-595623 from training. Duration: 0.896 2023-10-03 04:41:21,830 WARNING [train.py:1204] (2/4) Exclude cut with ID _55383_210_10635_1_1534468426172_2231669_139-328410-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 04:41:29,384 WARNING [train.py:1204] (2/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_166-246557-0_sp0.9 from training. Duration: 0.711125 2023-10-03 04:41:29,388 WARNING [train.py:1204] (2/4) Exclude cut with ID _30039_210_13270_1_1532484612900_2725150_239-304872-0_sp1.1 from training. Duration: 0.963625 2023-10-03 04:41:30,650 WARNING [train.py:1204] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_6-226750-0 from training. Duration: 0.94 2023-10-03 04:41:30,721 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_135-5501-0_sp1.1 from training. Duration: 0.8909375 2023-10-03 04:41:33,596 WARNING [train.py:1204] (2/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_359-118926-0 from training. Duration: 0.72 2023-10-03 04:41:39,603 WARNING [train.py:1204] (2/4) Exclude cut with ID _8064_210_5399_1_1528372299291_4282973_4-91037-0 from training. Duration: 0.88 2023-10-03 04:41:39,620 WARNING [train.py:1204] (2/4) Exclude cut with ID _6653_210_2629_1_1527919172441_6732679_415-288492-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 04:41:41,040 WARNING [train.py:1204] (2/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_544-287179-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 04:41:42,519 WARNING [train.py:1204] (2/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_122-32606-0_sp1.1 from training. Duration: 0.92725 2023-10-03 04:41:42,577 WARNING [train.py:1204] (2/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_121-284152-0_sp1.1 from training. Duration: 0.536375 2023-10-03 04:41:44,014 WARNING [train.py:1204] (2/4) Exclude cut with ID _44718_210_5816_1_1534050051869_6469030_350-299477-0_sp1.1 from training. Duration: 0.97275 2023-10-03 04:41:44,031 WARNING [train.py:1204] (2/4) Exclude cut with ID _6653_210_2629_1_1527919172441_6732679_1103-56445-0_sp0.9 from training. Duration: 0.811125 2023-10-03 04:41:45,361 WARNING [train.py:1204] (2/4) Exclude cut with ID _12369_210_4381_1_1530356078581_4388520_64-298917-0_sp0.9 from training. Duration: 0.97775 2023-10-03 04:41:46,705 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_2245_210_2715_1_1525511548_3540254_310-52483-0 from training. Duration: 0.88 2023-10-03 04:41:46,766 WARNING [train.py:1204] (2/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_401-158239-0_sp1.1 from training. Duration: 0.6454375 2023-10-03 04:41:46,798 WARNING [train.py:1204] (2/4) Exclude cut with ID _59502_210_12210_1_1535107024698_1835139_146-162495-0_sp1.1 from training. Duration: 0.836375 2023-10-03 04:41:48,007 WARNING [train.py:1204] (2/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_41-31157-0_sp0.9 from training. Duration: 0.6333125 2023-10-03 04:41:50,703 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_68717_210_3528_1_1535628683_1556759_95-36722-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 04:41:50,712 WARNING [train.py:1204] (2/4) Exclude cut with ID _30196_210_1664_1_1533708496522_7074140_654-162611-0_sp1.1 from training. Duration: 0.92725 2023-10-03 04:41:52,370 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.feed_forward2.hidden_balancer.prob, batch_count=1141520.0, ans=0.125 2023-10-03 04:41:53,622 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.ff3_skip_rate, batch_count=1141520.0, ans=0.0 2023-10-03 04:41:54,830 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_9302_210_2550_1_1528889962_4655416_638-301620-0_sp0.9 from training. Duration: 0.52225 2023-10-03 04:41:58,025 WARNING [train.py:1204] (2/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_478-162247-0_sp1.1 from training. Duration: 0.7454375 2023-10-03 04:42:00,813 WARNING [train.py:1204] (2/4) Exclude cut with ID _45297_210_2721_1_1533715351381_3264949_353-9541-0 from training. Duration: 0.97 2023-10-03 04:42:01,091 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.bypass.scale_min, batch_count=1141520.0, ans=0.2 2023-10-03 04:42:03,642 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9653W0190-675468 from training. Duration: 0.9935 2023-10-03 04:42:06,911 INFO [train.py:1046] (2/4) Epoch 33, batch 1250, loss[loss=0.1451, simple_loss=0.2201, pruned_loss=0.03507, over 24309.00 frames. ], tot_loss[loss=0.1623, simple_loss=0.2412, pruned_loss=0.04168, over 4710499.38 frames. ], batch size: 56, lr: 3.10e-03, grad_scale: 16.0 2023-10-03 04:42:06,957 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_49730_210_13049_1_1534075294_1830676_214-84436-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 04:42:08,772 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.511e+02 1.967e+02 2.213e+02 2.630e+02 3.265e+02, threshold=4.425e+02, percent-clipped=0.0 2023-10-03 04:42:08,908 WARNING [train.py:1204] (2/4) Exclude cut with ID _50093_210_4220_1_1534417141207_3432299_259-322252-0_sp1.1 from training. Duration: 0.836375 2023-10-03 04:42:11,974 WARNING [train.py:1204] (2/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_599-50220-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 04:42:12,079 WARNING [train.py:1204] (2/4) Exclude cut with ID _21878_210_3525_1_1531738772002_7264330_174-109016-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 04:42:14,840 WARNING [train.py:1204] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_665-196155-0 from training. Duration: 0.67 2023-10-03 04:42:17,658 WARNING [train.py:1204] (2/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_656-188361-0_sp1.1 from training. Duration: 0.7909375 2023-10-03 04:42:18,334 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.3.encoder.layers.3.conv_module2.whiten, num_groups=1, num_channels=512, metric=5.75 vs. limit=15.0 2023-10-03 04:42:19,107 WARNING [train.py:1204] (2/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_60-18123-0_sp1.1 from training. Duration: 0.8 2023-10-03 04:42:19,167 WARNING [train.py:1204] (2/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_535-14410-0 from training. Duration: 0.47 2023-10-03 04:42:20,638 WARNING [train.py:1204] (2/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_94-126128-0_sp1.1 from training. Duration: 0.7818125 2023-10-03 04:42:21,972 WARNING [train.py:1204] (2/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_356-31216-0_sp1.1 from training. Duration: 0.6818125 2023-10-03 04:42:26,143 WARNING [train.py:1204] (2/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_443-30034-0_sp1.1 from training. Duration: 0.5818125 2023-10-03 04:42:28,020 WARNING [train.py:1204] (2/4) Exclude cut with ID _26907_210_7234_1_1532221740694_3205263_337-2494-0_sp1.1 from training. Duration: 0.8 2023-10-03 04:42:28,111 WARNING [train.py:1204] (2/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_49-97158-0_sp0.9 from training. Duration: 0.8444375 2023-10-03 04:42:28,123 WARNING [train.py:1204] (2/4) Exclude cut with ID _7877_210_6014_1_1528286358099_3750136_553-170128-0_sp1.1 from training. Duration: 0.963625 2023-10-03 04:42:30,665 WARNING [train.py:1204] (2/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_202-293523-0_sp0.9 from training. Duration: 0.8 2023-10-03 04:42:32,269 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.feed_forward1.out_proj.dropout_p, batch_count=1141653.3333333333, ans=0.1 2023-10-03 04:42:33,524 WARNING [train.py:1204] (2/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_188-171673-0_sp0.9 from training. Duration: 0.6444375 2023-10-03 04:42:33,544 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_120-41668-0_sp1.1 from training. Duration: 0.636375 2023-10-03 04:42:33,555 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_40223_210_6228_1_1533298404_4812267_613-78769-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 04:42:36,206 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_53861_210_5399_1_1534422156_4272077_150-336899-0_sp1.1 from training. Duration: 0.963625 2023-10-03 04:42:36,263 WARNING [train.py:1204] (2/4) Exclude cut with ID _14459_210_2228_1_1531443641946_7516700_1146-56843-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 04:42:38,371 WARNING [train.py:1204] (2/4) Exclude cut with ID _39867_210_9826_1_1533553222030_9344259_1194-299928-0_sp1.1 from training. Duration: 0.92725 2023-10-03 04:42:40,212 WARNING [train.py:1204] (2/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_214-291369-0_sp0.9 from training. Duration: 0.7 2023-10-03 04:42:45,686 WARNING [train.py:1204] (2/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_358-215558-0 from training. Duration: 0.93 2023-10-03 04:42:46,965 WARNING [train.py:1204] (2/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_577-287150-0_sp0.9 from training. Duration: 0.911125 2023-10-03 04:42:48,561 WARNING [train.py:1204] (2/4) Exclude cut with ID _60860_210_8254_1_1534896015875_3557169_229-187367-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 04:42:49,829 WARNING [train.py:1204] (2/4) Exclude cut with ID _6275_210_2084_1_1528110062213_7669049_857-298942-0 from training. Duration: 0.85 2023-10-03 04:42:49,887 WARNING [train.py:1204] (2/4) Exclude cut with ID _40137_210_12210_1_1533290484046_3795180_249-187059-0_sp1.1 from training. Duration: 0.8 2023-10-03 04:42:51,161 WARNING [train.py:1204] (2/4) Exclude cut with ID ID0171W0508-765603 from training. Duration: 0.541 2023-10-03 04:42:51,192 WARNING [train.py:1204] (2/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_521-219439-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 04:42:51,198 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_40223_210_6228_1_1533298404_4812267_741-302616-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 04:42:53,950 WARNING [train.py:1204] (2/4) Exclude cut with ID _65293_210_9070_1_1535545549765_4004210_438-154735-0_sp1.1 from training. Duration: 0.92725 2023-10-03 04:42:57,212 WARNING [train.py:1204] (2/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_564-132950-0_sp1.1 from training. Duration: 0.92725 2023-10-03 04:42:58,551 WARNING [train.py:1204] (2/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_291-270881-0_sp1.1 from training. Duration: 0.8818125 2023-10-03 04:42:58,776 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.conv_module1.balancer1.prob, batch_count=1141786.6666666667, ans=0.125 2023-10-03 04:42:59,840 WARNING [train.py:1204] (2/4) Exclude cut with ID _60229_210_27767_1_1534838406158_3655350_353-213656-0 from training. Duration: 0.95 2023-10-03 04:42:59,853 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_3133_210_2087_1_1526106446_2324690_243-143119-0 from training. Duration: 0.77 2023-10-03 04:42:59,885 WARNING [train.py:1204] (2/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_690-126306-0 from training. Duration: 0.78 2023-10-03 04:43:05,190 WARNING [train.py:1204] (2/4) Exclude cut with ID _30304_210_6319_1_1532476827261_7582060_347-95003-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 04:43:05,298 WARNING [train.py:1204] (2/4) Exclude cut with ID _2322_210_2667_1_1525604548971_3656299_440-80494-0 from training. Duration: 0.88 2023-10-03 04:43:05,313 WARNING [train.py:1204] (2/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_370-229434-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 04:43:07,335 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.3.encoder.layers.3.feed_forward2.out_whiten, num_groups=1, num_channels=512, metric=5.56 vs. limit=15.0 2023-10-03 04:43:08,554 WARNING [train.py:1204] (2/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_124-345840-0_sp1.1 from training. Duration: 0.47275 2023-10-03 04:43:08,563 WARNING [train.py:1204] (2/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_165-123581-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 04:43:10,567 WARNING [train.py:1204] (2/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_453-282588-0 from training. Duration: 0.98 2023-10-03 04:43:10,592 WARNING [train.py:1204] (2/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_299-34413-0_sp0.9 from training. Duration: 0.72225 2023-10-03 04:43:10,635 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_40036_210_13405_1_1533648647_1315388_48-146016-0_sp1.1 from training. Duration: 0.8454375 2023-10-03 04:43:12,589 WARNING [train.py:1204] (2/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_99-167392-0_sp0.9 from training. Duration: 0.67775 2023-10-03 04:43:13,990 WARNING [train.py:1204] (2/4) Exclude cut with ID _48677_210_5663_1_1533902405919_3892989_37-207114-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 04:43:15,443 WARNING [train.py:1204] (2/4) Exclude cut with ID _29857_210_20034_1_1532395886753_3798160_335-163562-0 from training. Duration: 0.85 2023-10-03 04:43:15,697 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.bypass.skip_rate, batch_count=1141853.3333333333, ans=0.07 2023-10-03 04:43:16,875 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_36492_210_7927_1_1533261987_4790834_703-343351-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 04:43:18,214 WARNING [train.py:1204] (2/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_80-147301-0_sp0.9 from training. Duration: 0.9333125 2023-10-03 04:43:18,302 WARNING [train.py:1204] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_218-295097-0_sp0.9 from training. Duration: 0.8555625 2023-10-03 04:43:19,787 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_2295_210_2087_1_1525500993_2407863_116-238894-0_sp1.1 from training. Duration: 0.636375 2023-10-03 04:43:21,092 INFO [train.py:1046] (2/4) Epoch 33, batch 1300, loss[loss=0.1588, simple_loss=0.2325, pruned_loss=0.04256, over 24428.00 frames. ], tot_loss[loss=0.1635, simple_loss=0.2427, pruned_loss=0.0422, over 4700982.65 frames. ], batch size: 58, lr: 3.10e-03, grad_scale: 16.0 2023-10-03 04:43:22,578 WARNING [train.py:1204] (2/4) Exclude cut with ID _3231_210_2106_1_1526205864934_6784220_697-171068-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 04:43:22,612 WARNING [train.py:1204] (2/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_102-77064-0 from training. Duration: 0.93 2023-10-03 04:43:28,512 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_13091_210_7925_1_1530609447_8987354_1186-261957-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 04:43:29,964 WARNING [train.py:1204] (2/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_91-277684-0_sp1.1 from training. Duration: 0.67275 2023-10-03 04:43:31,304 WARNING [train.py:1204] (2/4) Exclude cut with ID _31088_210_16446_1_1532520012284_3440919_159-313751-0_sp1.1 from training. Duration: 0.936375 2023-10-03 04:43:32,748 WARNING [train.py:1204] (2/4) Exclude cut with ID _42326_210_7770_1_1533814282531_3707269_227-203721-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 04:43:34,074 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_3531_210_3528_1_1526294203_5216456_309-227302-0_sp1.1 from training. Duration: 0.62725 2023-10-03 04:43:34,140 WARNING [train.py:1204] (2/4) Exclude cut with ID _14087_210_2253_1_1530529224512_3014029_226-242140-0 from training. Duration: 0.81 2023-10-03 04:43:38,867 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.conv_module1.balancer1.prob, batch_count=1141986.6666666667, ans=0.125 2023-10-03 04:43:40,611 WARNING [train.py:1204] (2/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_122-109023-0_sp1.1 from training. Duration: 0.6909375 2023-10-03 04:43:42,575 WARNING [train.py:1204] (2/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_660-211088-0_sp0.9 from training. Duration: 0.688875 2023-10-03 04:43:44,013 WARNING [train.py:1204] (2/4) Exclude cut with ID _39290_210_4220_1_1533862778760_3075118_92-311600-0 from training. Duration: 0.88 2023-10-03 04:43:46,775 WARNING [train.py:1204] (2/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_390-339137-0_sp1.1 from training. Duration: 0.5090625 2023-10-03 04:43:50,769 WARNING [train.py:1204] (2/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_449-341946-0_sp1.1 from training. Duration: 0.963625 2023-10-03 04:43:52,128 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_68095_210_20185_1_1535718226_8888855_884-327007-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 04:43:53,560 WARNING [train.py:1204] (2/4) Exclude cut with ID _42326_210_7770_1_1533814282531_3707269_308-222998-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 04:43:54,965 WARNING [train.py:1204] (2/4) Exclude cut with ID _40399_210_4353_1_1533305658194_3279406_344-238708-0_sp1.1 from training. Duration: 0.963625 2023-10-03 04:43:56,375 WARNING [train.py:1204] (2/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_87-131427-0_sp1.1 from training. Duration: 0.7090625 2023-10-03 04:43:56,429 WARNING [train.py:1204] (2/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_110-130961-0_sp0.9 from training. Duration: 0.52225 2023-10-03 04:43:56,469 WARNING [train.py:1204] (2/4) Exclude cut with ID _55153_210_2253_1_1534384786677_3162096_148-79022-0 from training. Duration: 0.96 2023-10-03 04:44:01,107 WARNING [train.py:1204] (2/4) Exclude cut with ID _9679_210_7497_1_1529129212906_4363929_512-313889-0_sp1.1 from training. Duration: 0.736375 2023-10-03 04:44:01,119 WARNING [train.py:1204] (2/4) Exclude cut with ID _7970_210_1883_1_1528455616296_3472230_25-178407-0_sp1.1 from training. Duration: 0.6454375 2023-10-03 04:44:02,472 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_3133_210_2087_1_1526106446_2324690_246-125000-0 from training. Duration: 0.62 2023-10-03 04:44:02,541 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_15126_210_6753_1_1530778677_2591185_113-157651-0_sp0.9 from training. Duration: 0.7555625 2023-10-03 04:44:03,830 WARNING [train.py:1204] (2/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_105-63309-0_sp1.1 from training. Duration: 0.8909375 2023-10-03 04:44:07,060 WARNING [train.py:1204] (2/4) Exclude cut with ID _29877_210_6228_1_1532397791106_5276149_578-177271-0_sp1.1 from training. Duration: 0.936375 2023-10-03 04:44:07,109 WARNING [train.py:1204] (2/4) Exclude cut with ID _44831_210_13427_1_1533816011549_3689035_32-188147-0 from training. Duration: 0.96 2023-10-03 04:44:08,399 WARNING [train.py:1204] (2/4) Exclude cut with ID _12369_210_4381_1_1530356078581_4388520_300-254337-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 04:44:08,411 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_128-241008-0 from training. Duration: 0.94 2023-10-03 04:44:11,611 WARNING [train.py:1204] (2/4) Exclude cut with ID _21878_210_3525_1_1531738772002_7264330_53-279178-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 04:44:15,295 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_41452_210_7291_1_1533437687_3941281_273-334088-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 04:44:15,304 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_128-241008-0_sp1.1 from training. Duration: 0.8545625 2023-10-03 04:44:19,453 WARNING [train.py:1204] (2/4) Exclude cut with ID _8394_210_6702_1_1528590504316_3540189_379-49457-0 from training. Duration: 0.87 2023-10-03 04:44:20,776 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_105-114797-0 from training. Duration: 0.84 2023-10-03 04:44:20,863 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_14928_210_5632_1_1530773461_4714000_454-112198-0 from training. Duration: 0.66 2023-10-03 04:44:20,970 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.0.bypass.skip_rate, batch_count=1142186.6666666667, ans=0.035 2023-10-03 04:44:25,029 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_456-22141-0_sp1.1 from training. Duration: 0.8 2023-10-03 04:44:27,859 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_115-233929-0 from training. Duration: 0.72 2023-10-03 04:44:29,947 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_616-16795-0_sp1.1 from training. Duration: 0.963625 2023-10-03 04:44:35,323 INFO [train.py:1046] (2/4) Epoch 33, batch 1350, loss[loss=0.1512, simple_loss=0.2296, pruned_loss=0.03638, over 23459.00 frames. ], tot_loss[loss=0.1621, simple_loss=0.2412, pruned_loss=0.04152, over 4708104.69 frames. ], batch size: 119, lr: 3.10e-03, grad_scale: 16.0 2023-10-03 04:44:35,425 WARNING [train.py:1204] (2/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_448-212135-0 from training. Duration: 0.91 2023-10-03 04:44:36,746 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.510e+02 1.912e+02 2.066e+02 2.352e+02 3.364e+02, threshold=4.132e+02, percent-clipped=0.0 2023-10-03 04:44:38,724 WARNING [train.py:1204] (2/4) Exclude cut with ID _46683_210_9642_1_1533730913492_4591116_201-205677-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 04:44:43,254 WARNING [train.py:1204] (2/4) Exclude cut with ID _64086_210_7994_1_1535270417019_7435949_43-80483-0_sp1.1 from training. Duration: 0.97275 2023-10-03 04:44:46,636 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_240-134124-0_sp1.1 from training. Duration: 0.963625 2023-10-03 04:44:46,657 WARNING [train.py:1204] (2/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_907-120940-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 04:44:48,001 WARNING [train.py:1204] (2/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_848-280957-0_sp1.1 from training. Duration: 0.7909375 2023-10-03 04:44:48,040 WARNING [train.py:1204] (2/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_243-42116-0_sp0.9 from training. Duration: 0.811125 2023-10-03 04:44:52,074 WARNING [train.py:1204] (2/4) Exclude cut with ID _6694_210_4599_1_1527852637490_3965529_116-109398-0_sp0.9 from training. Duration: 0.811125 2023-10-03 04:44:53,481 WARNING [train.py:1204] (2/4) Exclude cut with ID _24314_210_4915_1_1532135078116_7170179_521-258868-0 from training. Duration: 0.79 2023-10-03 04:44:54,923 WARNING [train.py:1204] (2/4) Exclude cut with ID _2220_210_1819_1_1525509109898_3658680_50-99837-0_sp0.9 from training. Duration: 0.6 2023-10-03 04:44:54,970 WARNING [train.py:1204] (2/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_313-134930-0_sp1.1 from training. Duration: 0.8818125 2023-10-03 04:44:57,689 WARNING [train.py:1204] (2/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_125-144174-0 from training. Duration: 0.39 2023-10-03 04:44:59,048 WARNING [train.py:1204] (2/4) Exclude cut with ID _59288_210_5854_1_1535077757774_3412246_275-293897-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 04:45:00,391 WARNING [train.py:1204] (2/4) Exclude cut with ID _40519_210_13794_1_1533715228655_7337390_722-52880-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 04:45:00,405 WARNING [train.py:1204] (2/4) Exclude cut with ID _7537_210_2084_1_1528502395431_3924359_128-268029-0 from training. Duration: 0.98 2023-10-03 04:45:02,144 WARNING [train.py:1204] (2/4) Exclude cut with ID _8021_210_7994_1_1528376451358_3653069_179-45206-0 from training. Duration: 0.86 2023-10-03 04:45:03,584 WARNING [train.py:1204] (2/4) Exclude cut with ID _13252_210_1925_1_1530671372047_3336909_88-276668-0 from training. Duration: 0.99 2023-10-03 04:45:04,906 WARNING [train.py:1204] (2/4) Exclude cut with ID _22613_210_5219_1_1532253599686_4090170_290-212862-0_sp1.1 from training. Duration: 0.97275 2023-10-03 04:45:04,919 WARNING [train.py:1204] (2/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_465-214726-0 from training. Duration: 0.86 2023-10-03 04:45:15,915 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.ff3_skip_rate, batch_count=1142386.6666666667, ans=0.0 2023-10-03 04:45:17,505 WARNING [train.py:1204] (2/4) Exclude cut with ID _56480_210_4220_1_1534679942079_3075409_138-36031-0_sp1.1 from training. Duration: 0.97275 2023-10-03 04:45:27,016 WARNING [train.py:1204] (2/4) Exclude cut with ID _58605_210_12210_1_1534723195939_3904259_300-285878-0_sp1.1 from training. Duration: 0.97275 2023-10-03 04:45:27,049 WARNING [train.py:1204] (2/4) Exclude cut with ID _40901_210_5665_1_1533363789958_3856130_114-340229-0_sp1.1 from training. Duration: 0.936375 2023-10-03 04:45:27,091 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_489-230703-0 from training. Duration: 0.85 2023-10-03 04:45:31,182 WARNING [train.py:1204] (2/4) Exclude cut with ID _66873_210_5517_1_1535454136506_4293370_322-81243-0_sp1.1 from training. Duration: 0.936375 2023-10-03 04:45:32,543 WARNING [train.py:1204] (2/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_271-269084-0 from training. Duration: 0.83 2023-10-03 04:45:32,554 WARNING [train.py:1204] (2/4) Exclude cut with ID _13252_210_1925_1_1530671372047_3336909_166-314607-0_sp0.9 from training. Duration: 0.6 2023-10-03 04:45:32,838 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=1142520.0, ans=0.1 2023-10-03 04:45:33,960 WARNING [train.py:1204] (2/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_340-343302-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 04:45:35,841 WARNING [train.py:1204] (2/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_286-112015-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 04:45:38,673 WARNING [train.py:1204] (2/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_328-264776-0 from training. Duration: 0.67 2023-10-03 04:45:38,783 WARNING [train.py:1204] (2/4) Exclude cut with ID _4866_210_2577_1_1527404412284_4364659_256-306830-0_sp0.9 from training. Duration: 0.9666875 2023-10-03 04:45:43,548 WARNING [train.py:1204] (2/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_239-316802-0 from training. Duration: 0.69 2023-10-03 04:45:45,062 WARNING [train.py:1204] (2/4) Exclude cut with ID _29857_210_20034_1_1532395886753_3798160_279-185801-0 from training. Duration: 0.91 2023-10-03 04:45:49,684 INFO [train.py:1046] (2/4) Epoch 33, batch 1400, loss[loss=0.1528, simple_loss=0.2365, pruned_loss=0.03458, over 24319.00 frames. ], tot_loss[loss=0.1616, simple_loss=0.2407, pruned_loss=0.04129, over 4716752.30 frames. ], batch size: 61, lr: 3.10e-03, grad_scale: 16.0 2023-10-03 04:45:49,790 WARNING [train.py:1204] (2/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_656-188361-0 from training. Duration: 0.87 2023-10-03 04:45:52,431 WARNING [train.py:1204] (2/4) Exclude cut with ID _44718_210_5816_1_1534050051869_6469030_100-230287-0_sp1.1 from training. Duration: 0.936375 2023-10-03 04:45:53,955 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8275_210_4198_1_1528455320_3892571_276-215622-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 04:45:55,302 WARNING [train.py:1204] (2/4) Exclude cut with ID _30304_210_6319_1_1532476827261_7582060_698-309430-0_sp1.1 from training. Duration: 0.963625 2023-10-03 04:45:55,581 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.bypass.scale_min, batch_count=1142586.6666666667, ans=0.2 2023-10-03 04:45:58,537 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.balancer1.prob, batch_count=1142586.6666666667, ans=0.125 2023-10-03 04:45:59,715 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_365-217119-0 from training. Duration: 0.63 2023-10-03 04:46:00,048 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.feed_forward3.hidden_balancer.prob, batch_count=1142586.6666666667, ans=0.125 2023-10-03 04:46:01,024 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7264_210_4938_1_1527939911_8132095_800-91995-0 from training. Duration: 0.86 2023-10-03 04:46:09,709 INFO [scaling.py:1118] (2/4) WithLoss: name=encoder.encoders.3.encoder.layers.0.self_attn_weights, loss-sum=0.000e+00 2023-10-03 04:46:10,922 WARNING [train.py:1204] (2/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_49-97158-0_sp1.1 from training. Duration: 0.6909375 2023-10-03 04:46:13,604 WARNING [train.py:1204] (2/4) Exclude cut with ID _61132_210_9969_1_1535021792964_3755419_61-57720-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 04:46:13,756 WARNING [train.py:1204] (2/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_345-306832-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 04:46:15,134 WARNING [train.py:1204] (2/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_164-224052-0_sp0.9 from training. Duration: 0.77775 2023-10-03 04:46:15,930 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.2.encoder.layers.1.feed_forward2.out_whiten, num_groups=1, num_channels=384, metric=4.06 vs. limit=15.0 2023-10-03 04:46:19,667 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_34886_210_10633_1_1533203882_1755661_76-156096-0_sp1.1 from training. Duration: 0.7909375 2023-10-03 04:46:22,478 WARNING [train.py:1204] (2/4) Exclude cut with ID 4133-6541-0027-40495-0_sp1.1 from training. Duration: 0.9681875 2023-10-03 04:46:29,849 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.attention_skip_rate, batch_count=1142720.0, ans=0.0 2023-10-03 04:46:31,036 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_514-211165-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 04:46:31,095 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_41211_210_2544_1_1533348859_3702910_97-212695-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 04:46:33,929 WARNING [train.py:1204] (2/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_406-45086-0 from training. Duration: 0.8 2023-10-03 04:46:35,311 WARNING [train.py:1204] (2/4) Exclude cut with ID _65263_210_22356_1_1535763598691_3921037_49-222917-0_sp1.1 from training. Duration: 0.836375 2023-10-03 04:46:35,982 WARNING [train.py:1204] (2/4) Exclude cut with ID _4866_210_2577_1_1527404412284_4364659_352-157930-0_sp1.1 from training. Duration: 0.736375 2023-10-03 04:46:37,111 WARNING [train.py:1204] (2/4) Exclude cut with ID _4366_210_1388_1_1526792053633_3613819_231-211402-0_sp1.1 from training. Duration: 0.8545625 2023-10-03 04:46:37,145 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_98-21576-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 04:46:38,472 WARNING [train.py:1204] (2/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_348-127789-0_sp0.9 from training. Duration: 0.9444375 2023-10-03 04:46:38,495 WARNING [train.py:1204] (2/4) Exclude cut with ID _45871_210_5665_1_1533864614245_3296060_98-329557-0_sp1.1 from training. Duration: 0.97275 2023-10-03 04:46:40,485 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6846_210_6753_1_1527850767_4295158_2-315073-0_sp1.1 from training. Duration: 0.8 2023-10-03 04:46:41,816 WARNING [train.py:1204] (2/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_145-137790-0 from training. Duration: 0.83 2023-10-03 04:46:43,120 WARNING [train.py:1204] (2/4) Exclude cut with ID _14603_210_4220_1_1530615688441_3097319_133-158308-0_sp0.9 from training. Duration: 0.9333125 2023-10-03 04:46:47,333 WARNING [train.py:1204] (2/4) Exclude cut with ID _14898_210_9206_1_1530766812377_4177190_320-11758-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 04:46:50,180 WARNING [train.py:1204] (2/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_406-45086-0_sp0.9 from training. Duration: 0.888875 2023-10-03 04:46:57,543 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_5438_210_1794_1_1527336687_2600009_104-288274-0 from training. Duration: 0.58 2023-10-03 04:46:57,616 WARNING [train.py:1204] (2/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_108-53929-0_sp0.9 from training. Duration: 0.6555625 2023-10-03 04:46:58,976 WARNING [train.py:1204] (2/4) Exclude cut with ID _30800_210_22382_1_1532499347904_4515959_444-286390-0_sp1.1 from training. Duration: 0.963625 2023-10-03 04:47:01,791 WARNING [train.py:1204] (2/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_125-144174-0_sp0.9 from training. Duration: 0.4333125 2023-10-03 04:47:03,138 INFO [train.py:1046] (2/4) Epoch 33, batch 1450, loss[loss=0.1449, simple_loss=0.2246, pruned_loss=0.03264, over 24368.00 frames. ], tot_loss[loss=0.1605, simple_loss=0.2394, pruned_loss=0.04076, over 4728803.76 frames. ], batch size: 56, lr: 3.10e-03, grad_scale: 8.0 2023-10-03 04:47:03,179 WARNING [train.py:1204] (2/4) Exclude cut with ID _55209_210_1531_1_1534675828996_4597160_118-345621-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 04:47:04,601 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_45886_210_17024_1_1533709215_471063_7-79165-0_sp1.1 from training. Duration: 0.8909375 2023-10-03 04:47:05,799 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.538e+02 1.841e+02 1.972e+02 2.239e+02 2.970e+02, threshold=3.944e+02, percent-clipped=0.0 2023-10-03 04:47:07,887 WARNING [train.py:1204] (2/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_175-163894-0_sp0.9 from training. Duration: 0.82225 2023-10-03 04:47:09,266 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_50188_210_4198_1_1534503621_3646041_353-81682-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 04:47:09,271 WARNING [train.py:1204] (2/4) Exclude cut with ID _17875_210_2087_1_1531224012907_1821339_175-80565-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 04:47:09,288 WARNING [train.py:1204] (2/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_159-328157-0_sp1.1 from training. Duration: 0.47275 2023-10-03 04:47:09,503 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.ff3_skip_rate, batch_count=1142920.0, ans=0.0 2023-10-03 04:47:13,855 WARNING [train.py:1204] (2/4) Exclude cut with ID _60057_210_10640_1_1534903065793_3333150_137-267579-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 04:47:13,920 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_92-46009-0_sp1.1 from training. Duration: 0.4909375 2023-10-03 04:47:14,156 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.bypass.scale_min, batch_count=1142920.0, ans=0.2 2023-10-03 04:47:16,566 WARNING [train.py:1204] (2/4) Exclude cut with ID _53203_210_2087_1_1534246218309_475590_48-68507-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 04:47:16,589 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_28211_210_6388_1_1532861506_4523224_128-328162-0 from training. Duration: 0.9 2023-10-03 04:47:17,931 WARNING [train.py:1204] (2/4) Exclude cut with ID _4697_210_2069_1_1526815801953_3823098_51-1049-0_sp1.1 from training. Duration: 0.6818125 2023-10-03 04:47:18,016 WARNING [train.py:1204] (2/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_383-316000-0 from training. Duration: 0.9 2023-10-03 04:47:18,156 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.0.feed_forward3.hidden_balancer.prob, batch_count=1142986.6666666667, ans=0.125 2023-10-03 04:47:19,349 WARNING [train.py:1204] (2/4) Exclude cut with ID _4779_210_2084_1_1527318022944_3914893_185-295153-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 04:47:19,426 WARNING [train.py:1204] (2/4) Exclude cut with ID _3231_210_2106_1_1526205864934_6784220_171-256004-0_sp0.9 from training. Duration: 0.9555625 2023-10-03 04:47:19,427 WARNING [train.py:1204] (2/4) Exclude cut with ID _41314_210_12651_1_1533459476543_3766460_87-272068-0 from training. Duration: 0.83 2023-10-03 04:47:21,360 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_164-152777-0_sp1.1 from training. Duration: 0.92725 2023-10-03 04:47:22,662 WARNING [train.py:1204] (2/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_18-130336-0_sp0.9 from training. Duration: 0.87775 2023-10-03 04:47:22,715 WARNING [train.py:1204] (2/4) Exclude cut with ID _14886_210_4220_1_1530702020355_3516190_175-229073-0_sp1.1 from training. Duration: 0.4090625 2023-10-03 04:47:22,733 WARNING [train.py:1204] (2/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_791-45149-0_sp0.9 from training. Duration: 0.9555625 2023-10-03 04:47:24,283 WARNING [train.py:1204] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_468-189058-0_sp1.1 from training. Duration: 0.87275 2023-10-03 04:47:25,678 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8278_210_2550_1_1528637099_4220413_32-142780-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 04:47:28,421 WARNING [train.py:1204] (2/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_146-218763-0_sp0.9 from training. Duration: 0.9555625 2023-10-03 04:47:31,292 WARNING [train.py:1204] (2/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_348-127789-0_sp1.1 from training. Duration: 0.77275 2023-10-03 04:47:31,305 WARNING [train.py:1204] (2/4) Exclude cut with ID _47790_210_16187_1_1533862814066_4207519_288-285445-0_sp1.1 from training. Duration: 0.936375 2023-10-03 04:47:32,725 WARNING [train.py:1204] (2/4) Exclude cut with ID _14603_210_4220_1_1530615688441_3097319_308-181732-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 04:47:32,758 WARNING [train.py:1204] (2/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_895-117428-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 04:47:34,189 WARNING [train.py:1204] (2/4) Exclude cut with ID _9235_210_5219_1_1528891154253_8046120_387-159008-0_sp0.9 from training. Duration: 0.9555625 2023-10-03 04:47:34,208 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_37351_210_7925_1_1533012931_3812113_265-85780-0_sp0.9 from training. Duration: 0.97775 2023-10-03 04:47:35,472 WARNING [train.py:1204] (2/4) Exclude cut with ID _40551_210_10635_1_1533776424632_3416179_361-3964-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 04:47:35,522 WARNING [train.py:1204] (2/4) Exclude cut with ID _25882_210_3036_1_1532775665815_6479510_485-122742-0_sp1.1 from training. Duration: 0.97275 2023-10-03 04:47:41,899 WARNING [train.py:1204] (2/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_98-51676-0 from training. Duration: 0.73 2023-10-03 04:47:43,362 WARNING [train.py:1204] (2/4) Exclude cut with ID _30548_210_16040_1_1532608097130_3146969_371-280610-0_sp1.1 from training. Duration: 0.92725 2023-10-03 04:47:46,179 WARNING [train.py:1204] (2/4) Exclude cut with ID IC0671W0081-331924 from training. Duration: 0.896 2023-10-03 04:47:47,627 WARNING [train.py:1204] (2/4) Exclude cut with ID _40284_210_2084_1_1533513679814_3704279_380-160775-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 04:47:48,406 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.2.encoder.layers.1.conv_module1.whiten, num_groups=1, num_channels=384, metric=7.96 vs. limit=15.0 2023-10-03 04:47:48,913 WARNING [train.py:1204] (2/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_57-270037-0_sp1.1 from training. Duration: 0.7 2023-10-03 04:47:50,374 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_853-150237-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 04:47:52,240 WARNING [train.py:1204] (2/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_491-261923-0 from training. Duration: 0.98 2023-10-03 04:47:56,300 WARNING [train.py:1204] (2/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_836-244045-0_sp1.1 from training. Duration: 0.97275 2023-10-03 04:47:57,701 WARNING [train.py:1204] (2/4) Exclude cut with ID _14880_210_8895_1_1530680426655_3795259_289-212817-0 from training. Duration: 0.69 2023-10-03 04:47:57,880 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.bypass_mid.scale_min, batch_count=1143120.0, ans=0.2 2023-10-03 04:47:59,061 WARNING [train.py:1204] (2/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_303-340446-0 from training. Duration: 0.9 2023-10-03 04:47:59,161 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_37152_210_9697_1_1533106573_3844188_418-41736-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 04:48:03,135 WARNING [train.py:1204] (2/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_371-18169-0_sp1.1 from training. Duration: 0.7818125 2023-10-03 04:48:03,167 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_187-219984-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 04:48:03,413 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.conv_module1.balancer1.prob, batch_count=1143186.6666666667, ans=0.125 2023-10-03 04:48:05,796 WARNING [train.py:1204] (2/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_63-267585-0 from training. Duration: 0.9 2023-10-03 04:48:07,706 WARNING [train.py:1204] (2/4) Exclude cut with ID _27013_210_1389_1_1532437173240_3252649_222-125573-0 from training. Duration: 0.98 2023-10-03 04:48:07,759 WARNING [train.py:1204] (2/4) Exclude cut with ID _14821_210_8750_1_1530680503991_6829359_189-297451-0 from training. Duration: 0.55 2023-10-03 04:48:11,032 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_557-163391-0_sp1.1 from training. Duration: 0.97275 2023-10-03 04:48:12,941 WARNING [train.py:1204] (2/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_65-88242-0_sp1.1 from training. Duration: 0.5090625 2023-10-03 04:48:17,072 INFO [train.py:1046] (2/4) Epoch 33, batch 1500, loss[loss=0.1663, simple_loss=0.2535, pruned_loss=0.03953, over 24298.00 frames. ], tot_loss[loss=0.1608, simple_loss=0.2398, pruned_loss=0.04086, over 4740932.59 frames. ], batch size: 74, lr: 3.10e-03, grad_scale: 8.0 2023-10-03 04:48:22,784 WARNING [train.py:1204] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_218-295097-0 from training. Duration: 0.77 2023-10-03 04:48:24,126 WARNING [train.py:1204] (2/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_686-247600-0_sp1.1 from training. Duration: 0.836375 2023-10-03 04:48:24,129 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_397-147196-0_sp0.9 from training. Duration: 0.888875 2023-10-03 04:48:25,562 WARNING [train.py:1204] (2/4) Exclude cut with ID _53950_210_5816_1_1534417264919_3862951_237-136449-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 04:48:25,628 WARNING [train.py:1204] (2/4) Exclude cut with ID _3120_210_3525_1_1526036143112_4072980_74-178400-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 04:48:27,553 WARNING [train.py:1204] (2/4) Exclude cut with ID _5166_210_2084_1_1527073197160_4087993_309-222545-0_sp0.9 from training. Duration: 0.9333125 2023-10-03 04:48:27,649 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7544_210_6753_1_1528587722_3576686_172-171750-0 from training. Duration: 0.65 2023-10-03 04:48:29,059 WARNING [train.py:1204] (2/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_3-237434-0_sp0.9 from training. Duration: 0.8444375 2023-10-03 04:48:30,376 WARNING [train.py:1204] (2/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_101-293367-0_sp0.9 from training. Duration: 0.7 2023-10-03 04:48:30,383 WARNING [train.py:1204] (2/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_818-213552-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 04:48:31,732 WARNING [train.py:1204] (2/4) Exclude cut with ID _14349_210_8341_1_1530615710818_3442470_115-285975-0_sp1.1 from training. Duration: 0.7818125 2023-10-03 04:48:33,217 WARNING [train.py:1204] (2/4) Exclude cut with ID _30800_210_22382_1_1532499347904_4515959_291-309749-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 04:48:33,321 WARNING [train.py:1204] (2/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_715-3462-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 04:48:39,328 WARNING [train.py:1204] (2/4) Exclude cut with ID _59502_210_12210_1_1535107024698_1835139_13-331827-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 04:48:39,338 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_4528_210_3273_1_1527076654_3276777_342-168068-0 from training. Duration: 0.87 2023-10-03 04:48:39,401 WARNING [train.py:1204] (2/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_215-141059-0_sp0.9 from training. Duration: 0.688875 2023-10-03 04:48:39,431 WARNING [train.py:1204] (2/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_106-286130-0_sp1.1 from training. Duration: 0.8909375 2023-10-03 04:48:41,355 WARNING [train.py:1204] (2/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_480-77655-0_sp1.1 from training. Duration: 0.97275 2023-10-03 04:48:42,663 WARNING [train.py:1204] (2/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_848-280957-0 from training. Duration: 0.87 2023-10-03 04:48:47,385 WARNING [train.py:1204] (2/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_122-109023-0 from training. Duration: 0.76 2023-10-03 04:48:48,752 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_11305_210_1794_1_1529750428_7178148_645-233480-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 04:48:49,980 WARNING [train.py:1204] (2/4) Exclude cut with ID _7562_210_2084_1_1528887767916_3946229_345-169156-0 from training. Duration: 0.58 2023-10-03 04:48:51,441 WARNING [train.py:1204] (2/4) Exclude cut with ID _7562_210_2084_1_1528887767916_3946229_345-169156-0_sp1.1 from training. Duration: 0.52725 2023-10-03 04:48:52,843 WARNING [train.py:1204] (2/4) Exclude cut with ID _13253_210_1925_1_1530757775819_3577873_253-246386-0_sp1.1 from training. Duration: 0.8181875 2023-10-03 04:48:52,900 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_136-264444-0_sp1.1 from training. Duration: 0.97275 2023-10-03 04:48:52,923 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_2168_210_2290_1_1525606920_1849400_136-222991-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 04:48:55,591 WARNING [train.py:1204] (2/4) Exclude cut with ID _7852_210_5219_1_1528286471316_8139400_242-105336-0 from training. Duration: 0.72 2023-10-03 04:48:55,646 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_5960_210_1794_1_1527403865_3990999_515-2613-0_sp1.1 from training. Duration: 0.8 2023-10-03 04:48:57,303 WARNING [train.py:1204] (2/4) Exclude cut with ID _39688_210_19596_1_1533371626725_4154159_551-167834-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 04:48:57,359 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_43802_210_2290_1_1533516203_8829504_85-195240-0 from training. Duration: 0.87 2023-10-03 04:48:57,413 WARNING [train.py:1204] (2/4) Exclude cut with ID _63171_210_15488_1_1535104792794_3295110_113-113534-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 04:49:01,678 WARNING [train.py:1204] (2/4) Exclude cut with ID _8833_210_5219_1_1528592665210_7553059_319-349181-0_sp1.1 from training. Duration: 0.9 2023-10-03 04:49:01,679 WARNING [train.py:1204] (2/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_225-43381-0 from training. Duration: 0.94 2023-10-03 04:49:06,299 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_5959_210_1794_1_1527400400_3445726_412-44052-0_sp0.9 from training. Duration: 0.8555625 2023-10-03 04:49:06,418 WARNING [train.py:1204] (2/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_356-167816-0_sp1.1 from training. Duration: 0.6545625 2023-10-03 04:49:09,850 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=1143453.3333333333, ans=0.1 2023-10-03 04:49:11,411 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9012W0112-501244 from training. Duration: 0.768 2023-10-03 04:49:11,470 WARNING [train.py:1204] (2/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_177-326952-0_sp1.1 from training. Duration: 0.92725 2023-10-03 04:49:11,473 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9016W0116-502789 from training. Duration: 0.896 2023-10-03 04:49:12,853 WARNING [train.py:1204] (2/4) Exclude cut with ID _56235_210_10640_1_1534557335235_4161099_557-204815-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 04:49:15,544 WARNING [train.py:1204] (2/4) Exclude cut with ID _55857_210_2346_1_1534644007831_6898209_279-229469-0_sp1.1 from training. Duration: 0.936375 2023-10-03 04:49:15,615 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9029W0375-509764 from training. Duration: 0.896 2023-10-03 04:49:17,010 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_2295_210_2087_1_1525500993_2407863_234-83104-0_sp0.9 from training. Duration: 0.688875 2023-10-03 04:49:17,265 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.bypass.skip_rate, batch_count=1143520.0, ans=0.09899494936611666 2023-10-03 04:49:19,754 WARNING [train.py:1204] (2/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_266-137104-0 from training. Duration: 0.65 2023-10-03 04:49:21,221 WARNING [train.py:1204] (2/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_289-163927-0_sp1.1 from training. Duration: 0.92725 2023-10-03 04:49:24,401 WARNING [train.py:1204] (2/4) Exclude cut with ID _12733_210_8341_1_1530167424556_3646380_86-225314-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 04:49:24,428 WARNING [train.py:1204] (2/4) Exclude cut with ID _7877_210_6014_1_1528286358099_3750136_618-284064-0_sp1.1 from training. Duration: 0.92725 2023-10-03 04:49:25,840 WARNING [train.py:1204] (2/4) Exclude cut with ID _55844_210_2346_1_1534471212569_7625707_1017-31469-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 04:49:25,882 WARNING [train.py:1204] (2/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_617-241446-0_sp1.1 from training. Duration: 0.92725 2023-10-03 04:49:27,201 WARNING [train.py:1204] (2/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_866-173440-0_sp1.1 from training. Duration: 0.8181875 2023-10-03 04:49:28,644 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_9460_210_4211_1_1528954587_10049667_480-260269-0 from training. Duration: 0.56 2023-10-03 04:49:29,973 WARNING [train.py:1204] (2/4) Exclude cut with ID _44659_210_2721_1_1534036120061_6614860_626-298884-0 from training. Duration: 0.97 2023-10-03 04:49:30,010 WARNING [train.py:1204] (2/4) Exclude cut with ID _53565_210_10635_1_1534337218335_3820259_287-18120-0_sp1.1 from training. Duration: 0.8818125 2023-10-03 04:49:30,077 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_791-197891-0 from training. Duration: 0.84 2023-10-03 04:49:31,405 INFO [train.py:1046] (2/4) Epoch 33, batch 1550, loss[loss=0.196, simple_loss=0.2765, pruned_loss=0.05769, over 23972.00 frames. ], tot_loss[loss=0.1624, simple_loss=0.2411, pruned_loss=0.04183, over 4727866.52 frames. ], batch size: 80, lr: 3.10e-03, grad_scale: 8.0 2023-10-03 04:49:31,480 WARNING [train.py:1204] (2/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_527-238990-0 from training. Duration: 0.9 2023-10-03 04:49:31,795 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.attention_skip_rate, batch_count=1143586.6666666667, ans=0.0 2023-10-03 04:49:34,159 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.691e+02 1.973e+02 2.319e+02 2.706e+02 3.781e+02, threshold=4.639e+02, percent-clipped=0.0 2023-10-03 04:49:34,227 WARNING [train.py:1204] (2/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_449-65969-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 04:49:34,422 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.conv_module1.balancer1.prob, batch_count=1143586.6666666667, ans=0.125 2023-10-03 04:49:35,601 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_553-341452-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 04:49:36,835 WARNING [train.py:1204] (2/4) Exclude cut with ID _16909_210_5724_1_1531292079912_4118919_300-47574-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 04:49:36,847 WARNING [train.py:1204] (2/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_70-27732-0_sp1.1 from training. Duration: 0.8545625 2023-10-03 04:49:38,252 WARNING [train.py:1204] (2/4) Exclude cut with ID _44718_210_5816_1_1534050051869_6469030_727-28074-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 04:49:38,309 WARNING [train.py:1204] (2/4) Exclude cut with ID _55857_210_2346_1_1534644007831_6898209_357-123664-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 04:49:42,320 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9123W0004-555964 from training. Duration: 0.896 2023-10-03 04:49:42,361 WARNING [train.py:1204] (2/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_445-145208-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 04:49:42,552 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.conv_module1.balancer1.max_abs, batch_count=1143586.6666666667, ans=10.0 2023-10-03 04:49:44,217 WARNING [train.py:1204] (2/4) Exclude cut with ID _3675_210_4557_1_1526639934408_4182089_456-18305-0_sp1.1 from training. Duration: 0.6909375 2023-10-03 04:49:44,299 WARNING [train.py:1204] (2/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_1-99680-0_sp1.1 from training. Duration: 0.6090625 2023-10-03 04:49:47,124 WARNING [train.py:1204] (2/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_146-282400-0_sp0.9 from training. Duration: 0.788875 2023-10-03 04:49:48,408 WARNING [train.py:1204] (2/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_844-96989-0 from training. Duration: 0.71 2023-10-03 04:49:49,824 WARNING [train.py:1204] (2/4) Exclude cut with ID _29853_210_7497_1_1532505600851_6642110_128-146364-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 04:49:49,855 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_224-322394-0 from training. Duration: 0.66 2023-10-03 04:49:51,263 WARNING [train.py:1204] (2/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_191-59784-0 from training. Duration: 0.88 2023-10-03 04:49:51,268 WARNING [train.py:1204] (2/4) Exclude cut with ID _12556_210_8341_1_1530081024100_7327690_725-193477-0 from training. Duration: 0.98 2023-10-03 04:49:51,313 WARNING [train.py:1204] (2/4) Exclude cut with ID _5166_210_2084_1_1527073197160_4087993_523-79838-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 04:49:51,396 WARNING [train.py:1204] (2/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_398-334558-0_sp1.1 from training. Duration: 0.87275 2023-10-03 04:49:54,260 WARNING [train.py:1204] (2/4) Exclude cut with ID _6456_210_1534_1_1527681634212_3181049_171-36697-0_sp1.1 from training. Duration: 0.936375 2023-10-03 04:49:57,486 WARNING [train.py:1204] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_700-183549-0 from training. Duration: 0.68 2023-10-03 04:49:57,488 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8853_210_3273_1_1528633899_818968_51-187685-0 from training. Duration: 0.69 2023-10-03 04:49:59,197 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.ff3_skip_rate, batch_count=1143653.3333333333, ans=0.0 2023-10-03 04:50:03,149 WARNING [train.py:1204] (2/4) Exclude cut with ID _7719_210_3525_1_1528456153163_4296960_333-110399-0_sp1.1 from training. Duration: 0.87275 2023-10-03 04:50:07,156 WARNING [train.py:1204] (2/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_1022-83740-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 04:50:07,178 WARNING [train.py:1204] (2/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_61-327062-0_sp0.9 from training. Duration: 0.9 2023-10-03 04:50:07,190 WARNING [train.py:1204] (2/4) Exclude cut with ID _47251_210_18628_1_1533814063128_4137250_214-88076-0_sp1.1 from training. Duration: 0.8 2023-10-03 04:50:08,626 WARNING [train.py:1204] (2/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_839-173017-0 from training. Duration: 0.41 2023-10-03 04:50:15,326 WARNING [train.py:1204] (2/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_299-34413-0_sp1.1 from training. Duration: 0.5909375 2023-10-03 04:50:15,435 WARNING [train.py:1204] (2/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_664-159457-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 04:50:18,173 WARNING [train.py:1204] (2/4) Exclude cut with ID _66873_210_5517_1_1535454136506_4293370_483-291760-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 04:50:19,592 WARNING [train.py:1204] (2/4) Exclude cut with ID _30800_210_22382_1_1532499347904_4515959_287-255474-0_sp1.1 from training. Duration: 0.92725 2023-10-03 04:50:19,637 WARNING [train.py:1204] (2/4) Exclude cut with ID _48677_210_5663_1_1533902405919_3892989_111-11439-0_sp1.1 from training. Duration: 0.87275 2023-10-03 04:50:19,643 WARNING [train.py:1204] (2/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_399-156791-0 from training. Duration: 0.83 2023-10-03 04:50:19,671 WARNING [train.py:1204] (2/4) Exclude cut with ID _3992_210_1747_1_1526644816031_3731060_36-147855-0_sp0.9 from training. Duration: 0.8666875 2023-10-03 04:50:22,411 WARNING [train.py:1204] (2/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_442-320317-0_sp1.1 from training. Duration: 0.7454375 2023-10-03 04:50:22,436 WARNING [train.py:1204] (2/4) Exclude cut with ID _55429_210_10626_1_1534467588527_7174143_953-80868-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 04:50:22,501 WARNING [train.py:1204] (2/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_159-328157-0_sp0.9 from training. Duration: 0.57775 2023-10-03 04:50:22,510 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9309W0197-640324 from training. Duration: 0.896 2023-10-03 04:50:25,275 WARNING [train.py:1204] (2/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_361-30822-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 04:50:25,849 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.4.encoder.layers.2.feed_forward1.out_whiten, num_groups=1, num_channels=384, metric=12.89 vs. limit=15.0 2023-10-03 04:50:31,203 WARNING [train.py:1204] (2/4) Exclude cut with ID _5250_210_5219_1_1527249561806_8692110_636-251213-0 from training. Duration: 0.94 2023-10-03 04:50:32,791 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.bypass.skip_rate, batch_count=1143853.3333333333, ans=0.07 2023-10-03 04:50:34,160 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.ff3_skip_rate, batch_count=1143853.3333333333, ans=0.0 2023-10-03 04:50:37,366 WARNING [train.py:1204] (2/4) Exclude cut with ID _49728_210_12651_1_1534041034294_6788150_468-333827-0_sp1.1 from training. Duration: 0.963625 2023-10-03 04:50:38,695 WARNING [train.py:1204] (2/4) Exclude cut with ID _25882_210_3036_1_1532775665815_6479510_623-191759-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 04:50:38,755 WARNING [train.py:1204] (2/4) Exclude cut with ID _5189_210_4557_1_1527248319428_3769229_199-53526-0 from training. Duration: 0.42 2023-10-03 04:50:41,431 WARNING [train.py:1204] (2/4) Exclude cut with ID _6274_210_2084_1_1527937583837_7494020_32-205288-0_sp0.9 from training. Duration: 0.8666875 2023-10-03 04:50:41,524 WARNING [train.py:1204] (2/4) Exclude cut with ID _65263_210_22356_1_1535763598691_3921037_401-346646-0_sp1.1 from training. Duration: 0.963625 2023-10-03 04:50:41,528 WARNING [train.py:1204] (2/4) Exclude cut with ID _12555_210_8750_1_1530075594402_3780239_449-267986-0_sp1.1 from training. Duration: 0.7545625 2023-10-03 04:50:41,541 WARNING [train.py:1204] (2/4) Exclude cut with ID _69884_210_9771_1_1535846392818_4166169_430-201020-0_sp1.1 from training. Duration: 0.8818125 2023-10-03 04:50:41,610 WARNING [train.py:1204] (2/4) Exclude cut with ID _8394_210_6702_1_1528590504316_3540189_379-49457-0_sp1.1 from training. Duration: 0.7909375 2023-10-03 04:50:45,495 WARNING [train.py:1204] (2/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_200-105218-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 04:50:45,550 WARNING [train.py:1204] (2/4) Exclude cut with ID _3867_210_1883_1_1526641200575_4475830_319-349850-0 from training. Duration: 0.67 2023-10-03 04:50:46,764 INFO [train.py:1046] (2/4) Epoch 33, batch 1600, loss[loss=0.2187, simple_loss=0.2827, pruned_loss=0.07737, over 19755.00 frames. ], tot_loss[loss=0.1637, simple_loss=0.2421, pruned_loss=0.04263, over 4715611.84 frames. ], batch size: 388, lr: 3.10e-03, grad_scale: 16.0 2023-10-03 04:50:46,870 WARNING [train.py:1204] (2/4) Exclude cut with ID _28317_210_12742_1_1532480302381_8510325_916-195848-0 from training. Duration: 0.92 2023-10-03 04:50:48,233 WARNING [train.py:1204] (2/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_436-186311-0 from training. Duration: 0.65 2023-10-03 04:50:50,974 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_3910_210_1794_1_1526777667_3747260_302-180617-0_sp1.1 from training. Duration: 0.97275 2023-10-03 04:50:51,617 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.3.encoder.layers.0.nonlin_attention.whiten1, num_groups=1, num_channels=384, metric=7.37 vs. limit=10.0 2023-10-03 04:50:52,347 WARNING [train.py:1204] (2/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_102-92751-0 from training. Duration: 0.99 2023-10-03 04:50:52,404 WARNING [train.py:1204] (2/4) Exclude cut with ID _51899_210_20034_1_1534129254466_3669220_265-215428-0_sp1.1 from training. Duration: 0.8909375 2023-10-03 04:50:54,983 WARNING [train.py:1204] (2/4) Exclude cut with ID _58583_210_5236_1_1534726835266_3574249_438-198993-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 04:50:59,845 WARNING [train.py:1204] (2/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_166-166990-0_sp1.1 from training. Duration: 0.87275 2023-10-03 04:51:04,094 WARNING [train.py:1204] (2/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_907-151398-0 from training. Duration: 0.37 2023-10-03 04:51:06,813 WARNING [train.py:1204] (2/4) Exclude cut with ID _38670_210_15379_1_1533448810659_4391059_452-256669-0_sp1.1 from training. Duration: 0.936375 2023-10-03 04:51:08,670 WARNING [train.py:1204] (2/4) Exclude cut with ID _48677_210_5663_1_1533902405919_3892989_408-40740-0 from training. Duration: 0.83 2023-10-03 04:51:08,702 WARNING [train.py:1204] (2/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_903-158756-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 04:51:10,038 WARNING [train.py:1204] (2/4) Exclude cut with ID _13252_210_1925_1_1530671372047_3336909_397-216497-0 from training. Duration: 0.96 2023-10-03 04:51:14,781 WARNING [train.py:1204] (2/4) Exclude cut with ID _55429_210_10626_1_1534467588527_7174143_97-335084-0 from training. Duration: 0.97 2023-10-03 04:51:21,134 WARNING [train.py:1204] (2/4) Exclude cut with ID _45329_210_7005_1_1534157951211_6774339_453-267730-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 04:51:21,215 WARNING [train.py:1204] (2/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_153-97441-0 from training. Duration: 0.56 2023-10-03 04:51:22,626 WARNING [train.py:1204] (2/4) Exclude cut with ID _40901_210_5665_1_1533363789958_3856130_312-329122-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 04:51:22,669 WARNING [train.py:1204] (2/4) Exclude cut with ID _30800_210_22382_1_1532499347904_4515959_97-126471-0_sp1.1 from training. Duration: 0.97275 2023-10-03 04:51:22,669 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_11305_210_1794_1_1529750428_7178148_257-312870-0_sp1.1 from training. Duration: 0.8 2023-10-03 04:51:24,123 WARNING [train.py:1204] (2/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_454-314901-0_sp0.9 from training. Duration: 0.511125 2023-10-03 04:51:26,896 WARNING [train.py:1204] (2/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_282-136641-0_sp1.1 from training. Duration: 0.3818125 2023-10-03 04:51:30,015 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_46013_210_7234_1_1533776362_3744032_493-160724-0_sp1.1 from training. Duration: 0.963625 2023-10-03 04:51:30,071 WARNING [train.py:1204] (2/4) Exclude cut with ID _67831_210_12392_1_1535612464202_3092612_259-33442-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 04:51:30,123 WARNING [train.py:1204] (2/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_289-62384-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 04:51:31,440 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6545_210_2040_1_1527774670_4245258_379-14182-0_sp0.9 from training. Duration: 0.888875 2023-10-03 04:51:34,255 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_548-294316-0_sp1.1 from training. Duration: 0.736375 2023-10-03 04:51:35,384 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.4.encoder.layers.2.nonlin_attention.whiten1, num_groups=1, num_channels=288, metric=5.42 vs. limit=10.0 2023-10-03 04:51:35,946 WARNING [train.py:1204] (2/4) Exclude cut with ID _6275_210_2084_1_1528110062213_7669049_857-298942-0_sp1.1 from training. Duration: 0.77275 2023-10-03 04:51:37,299 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6673_210_3273_1_1527767790_3173240_327-306185-0_sp1.1 from training. Duration: 0.7545625 2023-10-03 04:51:43,479 WARNING [train.py:1204] (2/4) Exclude cut with ID _61008_210_10633_1_1534852769198_6974370_814-107647-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 04:51:45,311 WARNING [train.py:1204] (2/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_116-332008-0_sp1.1 from training. Duration: 0.8909375 2023-10-03 04:51:46,789 WARNING [train.py:1204] (2/4) Exclude cut with ID _32704_210_8954_1_1533017488840_6747370_266-329755-0 from training. Duration: 0.89 2023-10-03 04:51:46,800 WARNING [train.py:1204] (2/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_271-269084-0_sp0.9 from training. Duration: 0.92225 2023-10-03 04:51:48,151 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_3171_210_2087_1_1526109084_2330007_66-260583-0 from training. Duration: 0.72 2023-10-03 04:51:48,466 INFO [scaling.py:1118] (2/4) WithLoss: name=encoder.encoders.3.encoder.layers.3.self_attn_weights, loss-sum=0.000e+00 2023-10-03 04:51:50,977 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.1.conv_skip_rate, batch_count=1144186.6666666667, ans=0.0 2023-10-03 04:51:53,538 WARNING [train.py:1204] (2/4) Exclude cut with ID _5250_210_5219_1_1527249561806_8692110_1326-276386-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 04:51:54,912 WARNING [train.py:1204] (2/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_372-30302-0_sp1.1 from training. Duration: 0.7818125 2023-10-03 04:51:56,318 WARNING [train.py:1204] (2/4) Exclude cut with ID _25882_210_3036_1_1532775665815_6479510_631-257029-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 04:51:56,326 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_3691_210_3528_1_1526468169_4062053_355-334432-0 from training. Duration: 0.87 2023-10-03 04:51:57,576 WARNING [train.py:1204] (2/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_839-173017-0_sp1.1 from training. Duration: 0.37275 2023-10-03 04:51:57,584 WARNING [train.py:1204] (2/4) Exclude cut with ID _14998_210_5219_1_1530705582080_7474970_441-91466-0 from training. Duration: 0.97 2023-10-03 04:51:57,615 WARNING [train.py:1204] (2/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_108-53929-0 from training. Duration: 0.59 2023-10-03 04:52:00,266 INFO [train.py:1046] (2/4) Epoch 33, batch 1650, loss[loss=0.1501, simple_loss=0.2322, pruned_loss=0.03396, over 24487.00 frames. ], tot_loss[loss=0.1638, simple_loss=0.2423, pruned_loss=0.04259, over 4720298.40 frames. ], batch size: 63, lr: 3.10e-03, grad_scale: 16.0 2023-10-03 04:52:00,735 INFO [scaling.py:1118] (2/4) WithLoss: name=encoder.encoders.4.encoder.layers.2.self_attn_weights, loss-sum=0.000e+00 2023-10-03 04:52:01,789 WARNING [train.py:1204] (2/4) Exclude cut with ID _9389_210_1664_1_1529217337301_5289269_260-1942-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 04:52:01,855 WARNING [train.py:1204] (2/4) Exclude cut with ID _56423_210_5854_1_1534896061049_3165969_198-61547-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 04:52:01,880 WARNING [train.py:1204] (2/4) Exclude cut with ID _44778_210_2054_1_1533798127203_7443090_412-142122-0_sp1.1 from training. Duration: 0.92725 2023-10-03 04:52:01,905 WARNING [train.py:1204] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_88-341718-0_sp1.1 from training. Duration: 0.6 2023-10-03 04:52:02,238 INFO [scaling.py:1118] (2/4) WithLoss: name=encoder.encoders.3.encoder.layers.2.self_attn_weights, loss-sum=0.000e+00 2023-10-03 04:52:03,800 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.551e+02 1.880e+02 2.031e+02 2.196e+02 3.045e+02, threshold=4.062e+02, percent-clipped=0.0 2023-10-03 04:52:05,223 WARNING [train.py:1204] (2/4) Exclude cut with ID _61079_210_4525_1_1534921008198_4011419_334-298697-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 04:52:06,802 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_5465_210_4211_1_1527227253_55357_8-2499-0_sp1.1 from training. Duration: 0.996375 2023-10-03 04:52:08,741 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_447-201565-0_sp1.1 from training. Duration: 0.97275 2023-10-03 04:52:08,747 WARNING [train.py:1204] (2/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_253-161789-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 04:52:08,750 WARNING [train.py:1204] (2/4) Exclude cut with ID _44304_210_15374_1_1533639352765_4613129_302-22084-0_sp0.9 from training. Duration: 0.9555625 2023-10-03 04:52:08,759 WARNING [train.py:1204] (2/4) Exclude cut with ID _26664_210_7770_1_1532258943280_3418270_187-80815-0_sp1.1 from training. Duration: 0.8454375 2023-10-03 04:52:10,088 WARNING [train.py:1204] (2/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_68-212801-0 from training. Duration: 0.73 2023-10-03 04:52:10,127 WARNING [train.py:1204] (2/4) Exclude cut with ID _29345_210_4381_1_1532689168263_3860169_149-257268-0 from training. Duration: 0.81 2023-10-03 04:52:15,679 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.bypass.scale_min, batch_count=1144320.0, ans=0.2 2023-10-03 04:52:16,848 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_213-103282-0_sp1.1 from training. Duration: 0.7090625 2023-10-03 04:52:18,320 WARNING [train.py:1204] (2/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_156-302675-0_sp1.1 from training. Duration: 0.5 2023-10-03 04:52:18,910 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.4.encoder.layers.2.feed_forward1.out_whiten, num_groups=1, num_channels=384, metric=8.81 vs. limit=15.0 2023-10-03 04:52:26,765 WARNING [train.py:1204] (2/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_125-322172-0 from training. Duration: 0.93 2023-10-03 04:52:26,831 WARNING [train.py:1204] (2/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_493-60474-0_sp1.1 from training. Duration: 0.963625 2023-10-03 04:52:29,499 WARNING [train.py:1204] (2/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_156-302675-0 from training. Duration: 0.55 2023-10-03 04:52:29,784 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.conv_module2.balancer2.prob, batch_count=1144386.6666666667, ans=0.125 2023-10-03 04:52:32,230 WARNING [train.py:1204] (2/4) Exclude cut with ID _41314_210_12651_1_1533459476543_3766460_123-267862-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 04:52:36,903 WARNING [train.py:1204] (2/4) Exclude cut with ID _28427_210_4220_1_1532581240967_3061069_201-143747-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 04:52:36,929 WARNING [train.py:1204] (2/4) Exclude cut with ID _8772_210_1850_1_1528635595972_3664011_96-12756-0_sp1.1 from training. Duration: 0.8909375 2023-10-03 04:52:36,964 WARNING [train.py:1204] (2/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_1347-145675-0_sp1.1 from training. Duration: 0.936375 2023-10-03 04:52:37,053 WARNING [train.py:1204] (2/4) Exclude cut with ID _9235_210_5219_1_1528891154253_8046120_387-159008-0_sp1.1 from training. Duration: 0.7818125 2023-10-03 04:52:37,076 WARNING [train.py:1204] (2/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_400-201896-0_sp1.1 from training. Duration: 0.963625 2023-10-03 04:52:41,621 WARNING [train.py:1204] (2/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_326-174612-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 04:52:41,668 WARNING [train.py:1204] (2/4) Exclude cut with ID _55844_210_2346_1_1534471212569_7625707_390-116233-0_sp1.1 from training. Duration: 0.963625 2023-10-03 04:52:43,460 WARNING [train.py:1204] (2/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_419-317062-0_sp1.1 from training. Duration: 0.82725 2023-10-03 04:52:43,512 WARNING [train.py:1204] (2/4) Exclude cut with ID _29877_210_6228_1_1532397791106_5276149_266-62291-0_sp0.9 from training. Duration: 0.9666875 2023-10-03 04:52:44,893 WARNING [train.py:1204] (2/4) Exclude cut with ID _7857_210_1534_1_1528286515745_6831470_431-338528-0_sp1.1 from training. Duration: 0.92725 2023-10-03 04:52:44,974 WARNING [train.py:1204] (2/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_87-131427-0_sp0.9 from training. Duration: 0.8666875 2023-10-03 04:52:48,270 WARNING [train.py:1204] (2/4) Exclude cut with ID _24055_210_12651_1_1532310907143_976979_92-59356-0_sp1.1 from training. Duration: 0.82725 2023-10-03 04:52:49,678 WARNING [train.py:1204] (2/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_96-290039-0 from training. Duration: 0.56 2023-10-03 04:52:51,094 WARNING [train.py:1204] (2/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_48-182155-0_sp0.9 from training. Duration: 0.9666875 2023-10-03 04:52:51,138 WARNING [train.py:1204] (2/4) Exclude cut with ID _4626_210_3532_1_1526817652717_3916859_446-206656-0 from training. Duration: 0.76 2023-10-03 04:52:52,603 WARNING [train.py:1204] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_168-32906-0 from training. Duration: 0.69 2023-10-03 04:52:52,626 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_3780_210_2087_1_1526467760_2130483_170-259044-0 from training. Duration: 0.7 2023-10-03 04:52:52,637 WARNING [train.py:1204] (2/4) Exclude cut with ID _65134_210_14763_1_1535335393985_3505368_153-216718-0_sp1.1 from training. Duration: 0.92725 2023-10-03 04:52:53,968 WARNING [train.py:1204] (2/4) Exclude cut with ID _64639_210_5816_1_1535367619437_3528000_461-329156-0_sp1.1 from training. Duration: 0.87275 2023-10-03 04:52:54,006 WARNING [train.py:1204] (2/4) Exclude cut with ID _21989_210_5854_1_1531899214931_3271360_126-347531-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 04:52:55,330 WARNING [train.py:1204] (2/4) Exclude cut with ID _7614_210_4915_1_1529326764092_4537359_96-162197-0_sp1.1 from training. Duration: 0.963625 2023-10-03 04:52:55,331 WARNING [train.py:1204] (2/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_1083-147536-0 from training. Duration: 0.97 2023-10-03 04:52:58,141 WARNING [train.py:1204] (2/4) Exclude cut with ID _64029_210_4381_1_1535178502772_3756459_275-16710-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 04:53:00,892 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7264_210_4938_1_1527939911_8132095_800-91995-0_sp0.9 from training. Duration: 0.9555625 2023-10-03 04:53:00,926 WARNING [train.py:1204] (2/4) Exclude cut with ID _3844_210_1390_1_1526727803582_2871209_50-193879-0_sp1.1 from training. Duration: 0.936375 2023-10-03 04:53:04,906 WARNING [train.py:1204] (2/4) Exclude cut with ID _46952_210_5986_1_1534230010357_3107379_232-344503-0 from training. Duration: 0.96 2023-10-03 04:53:08,479 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.bypass_mid.scale_min, batch_count=1144520.0, ans=0.2 2023-10-03 04:53:09,657 WARNING [train.py:1204] (2/4) Exclude cut with ID _25808_210_13292_1_1532343514562_7366109_206-14076-0_sp1.1 from training. Duration: 0.936375 2023-10-03 04:53:09,658 WARNING [train.py:1204] (2/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_55-31904-0_sp0.9 from training. Duration: 0.97775 2023-10-03 04:53:09,682 WARNING [train.py:1204] (2/4) Exclude cut with ID _14632_210_9988_1_1530691279392_3923200_161-129530-0 from training. Duration: 0.96 2023-10-03 04:53:09,728 WARNING [train.py:1204] (2/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_770-200430-0_sp1.1 from training. Duration: 0.8090625 2023-10-03 04:53:09,743 WARNING [train.py:1204] (2/4) Exclude cut with ID _6270_210_2084_1_1527850911573_3856420_26-277343-0_sp1.1 from training. Duration: 0.7454375 2023-10-03 04:53:09,744 WARNING [train.py:1204] (2/4) Exclude cut with ID _69884_210_9771_1_1535846392818_4166169_88-23776-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 04:53:11,141 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.1.encoder.layers.0.whiten, num_groups=1, num_channels=256, metric=5.06 vs. limit=12.0 2023-10-03 04:53:14,901 WARNING [train.py:1204] (2/4) Exclude cut with ID _60860_210_8254_1_1534896015875_3557169_245-256666-0_sp1.1 from training. Duration: 0.8818125 2023-10-03 04:53:14,930 WARNING [train.py:1204] (2/4) Exclude cut with ID _5250_210_5219_1_1527249561806_8692110_636-251213-0_sp1.1 from training. Duration: 0.8545625 2023-10-03 04:53:14,944 WARNING [train.py:1204] (2/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_540-131164-0 from training. Duration: 0.92 2023-10-03 04:53:15,249 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.feed_forward1.out_proj.dropout_p, batch_count=1144586.6666666667, ans=0.1 2023-10-03 04:53:16,261 INFO [train.py:1046] (2/4) Epoch 33, batch 1700, loss[loss=0.1606, simple_loss=0.2369, pruned_loss=0.04221, over 23496.00 frames. ], tot_loss[loss=0.1627, simple_loss=0.2414, pruned_loss=0.04204, over 4728431.00 frames. ], batch size: 134, lr: 3.10e-03, grad_scale: 16.0 2023-10-03 04:53:16,578 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.0.feed_forward1.out_proj.dropout_p, batch_count=1144586.6666666667, ans=0.1 2023-10-03 04:53:18,179 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_5931_210_2040_1_1527428481_4547999_321-193614-0_sp0.9 from training. Duration: 0.7444375 2023-10-03 04:53:19,808 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.conv_module2.balancer2.prob, batch_count=1144586.6666666667, ans=0.125 2023-10-03 04:53:26,287 WARNING [train.py:1204] (2/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_373-225403-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 04:53:26,502 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.self_attn_weights.pos_emb_skip_rate, batch_count=1144586.6666666667, ans=0.0 2023-10-03 04:53:27,844 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.bypass.scale_min, batch_count=1144586.6666666667, ans=0.2 2023-10-03 04:53:28,871 WARNING [train.py:1204] (2/4) Exclude cut with ID _20622_210_3852_1_1531622427080_1093839_57-326027-0_sp1.1 from training. Duration: 0.97275 2023-10-03 04:53:34,512 WARNING [train.py:1204] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_605-257205-0_sp1.1 from training. Duration: 0.536375 2023-10-03 04:53:34,533 WARNING [train.py:1204] (2/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_442-320317-0_sp0.9 from training. Duration: 0.911125 2023-10-03 04:53:35,900 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_35361_210_4238_1_1533262810_7793476_675-87012-0_sp1.1 from training. Duration: 0.8090625 2023-10-03 04:53:35,928 WARNING [train.py:1204] (2/4) Exclude cut with ID _4184_210_2550_1_1526730192013_2219380_306-44736-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 04:53:37,839 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_124-113562-0 from training. Duration: 0.94 2023-10-03 04:53:40,577 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_2821_210_2715_1_1526108385_3763560_73-296673-0_sp0.9 from training. Duration: 0.92225 2023-10-03 04:53:40,593 WARNING [train.py:1204] (2/4) Exclude cut with ID _10301_210_5886_1_1529222275332_3910620_216-46959-0_sp1.1 from training. Duration: 0.92725 2023-10-03 04:53:42,045 WARNING [train.py:1204] (2/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_91-277684-0_sp0.9 from training. Duration: 0.82225 2023-10-03 04:53:44,049 WARNING [train.py:1204] (2/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_351-294650-0_sp0.9 from training. Duration: 0.7 2023-10-03 04:53:45,502 WARNING [train.py:1204] (2/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_209-205365-0 from training. Duration: 0.79 2023-10-03 04:53:45,559 WARNING [train.py:1204] (2/4) Exclude cut with ID _14603_210_4220_1_1530615688441_3097319_97-325053-0 from training. Duration: 0.92 2023-10-03 04:53:47,422 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_49348_210_7927_1_1533954500_3691300_109-276430-0_sp1.1 from training. Duration: 0.92725 2023-10-03 04:53:49,264 WARNING [train.py:1204] (2/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_912-47473-0 from training. Duration: 0.68 2023-10-03 04:53:50,639 WARNING [train.py:1204] (2/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_96-320736-0_sp0.9 from training. Duration: 0.9555625 2023-10-03 04:54:00,211 WARNING [train.py:1204] (2/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_1252-235481-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 04:54:01,665 WARNING [train.py:1204] (2/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_813-261554-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 04:54:03,102 WARNING [train.py:1204] (2/4) Exclude cut with ID _21632_210_2253_1_1531994001765_3744140_110-274654-0_sp0.9 from training. Duration: 0.911125 2023-10-03 04:54:04,465 WARNING [train.py:1204] (2/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_381-317791-0_sp1.1 from training. Duration: 0.663625 2023-10-03 04:54:04,478 WARNING [train.py:1204] (2/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_51-117576-0_sp1.1 from training. Duration: 0.363625 2023-10-03 04:54:04,495 WARNING [train.py:1204] (2/4) Exclude cut with ID _42321_210_7770_1_1533468631352_4366895_104-215915-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 04:54:05,907 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_40896_210_15710_1_1533433956_7100802_216-22824-0_sp1.1 from training. Duration: 0.92725 2023-10-03 04:54:05,908 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_14831_210_7927_1_1530700200_5583610_181-27467-0 from training. Duration: 0.91 2023-10-03 04:54:07,242 WARNING [train.py:1204] (2/4) Exclude cut with ID _30304_210_6319_1_1532476827261_7582060_1024-265645-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 04:54:07,245 WARNING [train.py:1204] (2/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_412-233904-0_sp1.1 from training. Duration: 0.936375 2023-10-03 04:54:07,262 WARNING [train.py:1204] (2/4) Exclude cut with ID _30191_210_1664_1_1533189915047_7196670_911-223461-0_sp1.1 from training. Duration: 0.92725 2023-10-03 04:54:07,280 WARNING [train.py:1204] (2/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_558-148266-0_sp1.1 from training. Duration: 0.963625 2023-10-03 04:54:11,760 WARNING [train.py:1204] (2/4) Exclude cut with ID _58480_210_12392_1_1534811121111_3646160_212-164495-0_sp1.1 from training. Duration: 0.936375 2023-10-03 04:54:11,762 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_454-276429-0_sp0.9 from training. Duration: 0.9666875 2023-10-03 04:54:11,856 WARNING [train.py:1204] (2/4) Exclude cut with ID _24736_210_3279_1_1532332894218_2838700_73-197086-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 04:54:13,192 WARNING [train.py:1204] (2/4) Exclude cut with ID _5294_210_1747_1_1527249643928_3696179_529-242298-0_sp1.1 from training. Duration: 0.863625 2023-10-03 04:54:13,240 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_53555_210_5632_1_1534595220_7232297_345-171438-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 04:54:18,383 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_28211_210_6388_1_1532861506_4523224_22-25766-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 04:54:19,809 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7002_210_1794_1_1527939683_5157286_163-87552-0 from training. Duration: 0.86 2023-10-03 04:54:19,996 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.bypass.skip_rate, batch_count=1144853.3333333333, ans=0.09899494936611666 2023-10-03 04:54:22,634 WARNING [train.py:1204] (2/4) Exclude cut with ID _56927_210_9969_1_1534848970840_3695229_409-225129-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 04:54:24,611 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_26192_210_7927_1_1532137269_1040115_21-219331-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 04:54:26,067 WARNING [train.py:1204] (2/4) Exclude cut with ID _15012_210_1968_1_1530856820429_5095350_662-210469-0 from training. Duration: 0.57 2023-10-03 04:54:30,276 INFO [train.py:1046] (2/4) Epoch 33, batch 1750, loss[loss=0.1514, simple_loss=0.2277, pruned_loss=0.03757, over 23463.00 frames. ], tot_loss[loss=0.1614, simple_loss=0.2404, pruned_loss=0.04125, over 4726015.32 frames. ], batch size: 134, lr: 3.10e-03, grad_scale: 16.0 2023-10-03 04:54:30,380 WARNING [train.py:1204] (2/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_582-191016-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 04:54:30,995 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.3.encoder.layers.2.self_attn2.whiten, num_groups=1, num_channels=512, metric=11.56 vs. limit=22.5 2023-10-03 04:54:32,922 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.645e+02 1.900e+02 2.115e+02 2.470e+02 3.706e+02, threshold=4.230e+02, percent-clipped=0.0 2023-10-03 04:54:33,043 WARNING [train.py:1204] (2/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_270-210032-0_sp1.1 from training. Duration: 0.963625 2023-10-03 04:54:33,088 WARNING [train.py:1204] (2/4) Exclude cut with ID _6789_210_1534_1_1527857937218_3118950_262-48853-0_sp0.9 from training. Duration: 0.62225 2023-10-03 04:54:34,517 WARNING [train.py:1204] (2/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_847-41334-0 from training. Duration: 0.44 2023-10-03 04:54:34,548 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_40896_210_15710_1_1533433956_7100802_573-46532-0_sp1.1 from training. Duration: 0.92725 2023-10-03 04:54:36,213 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.nonlin_attention.balancer.prob, batch_count=1144920.0, ans=0.125 2023-10-03 04:54:37,348 WARNING [train.py:1204] (2/4) Exclude cut with ID _21881_210_3525_1_1531825242757_3798380_247-102591-0_sp1.1 from training. Duration: 0.8909375 2023-10-03 04:54:37,382 WARNING [train.py:1204] (2/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_200-21395-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 04:54:42,131 WARNING [train.py:1204] (2/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_711-15688-0 from training. Duration: 0.43 2023-10-03 04:54:44,998 WARNING [train.py:1204] (2/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_53-75025-0_sp1.1 from training. Duration: 0.963625 2023-10-03 04:54:45,197 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.bypass.skip_rate, batch_count=1144986.6666666667, ans=0.07 2023-10-03 04:54:47,729 WARNING [train.py:1204] (2/4) Exclude cut with ID _6194_210_1534_1_1527511826160_3941194_321-148641-0 from training. Duration: 0.64 2023-10-03 04:54:47,753 WARNING [train.py:1204] (2/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_337-294431-0_sp1.1 from training. Duration: 0.936375 2023-10-03 04:54:49,152 WARNING [train.py:1204] (2/4) Exclude cut with ID _21632_210_2253_1_1531994001765_3744140_110-274654-0_sp1.1 from training. Duration: 0.7454375 2023-10-03 04:54:53,084 WARNING [train.py:1204] (2/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_135-174080-0_sp1.1 from training. Duration: 0.4818125 2023-10-03 04:54:53,189 WARNING [train.py:1204] (2/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_21-174514-0 from training. Duration: 0.43 2023-10-03 04:54:55,981 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_38965_210_4852_1_1533174906_3484735_215-294589-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 04:54:56,010 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_548-294316-0 from training. Duration: 0.81 2023-10-03 04:54:59,252 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.5.encoder.layers.1.whiten, num_groups=1, num_channels=256, metric=1.99 vs. limit=12.0 2023-10-03 04:55:02,998 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_103-84010-0_sp1.1 from training. Duration: 0.62725 2023-10-03 04:55:04,502 WARNING [train.py:1204] (2/4) Exclude cut with ID _18199_210_6109_1_1531475789864_4288790_295-198247-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 04:55:04,523 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_13757_210_3528_1_1530440682_4107239_103-313224-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 04:55:07,304 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_34886_210_10633_1_1533203882_1755661_67-294673-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 04:55:07,313 WARNING [train.py:1204] (2/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_350-101734-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 04:55:08,856 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.feed_forward1.hidden_balancer.prob, batch_count=1145053.3333333333, ans=0.125 2023-10-03 04:55:10,083 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_37357_210_17024_1_1533089504_2316121_204-153986-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 04:55:11,498 WARNING [train.py:1204] (2/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_673-115569-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 04:55:13,589 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_489-230703-0_sp0.9 from training. Duration: 0.9444375 2023-10-03 04:55:13,653 WARNING [train.py:1204] (2/4) Exclude cut with ID _5250_210_5219_1_1527249561806_8692110_575-102496-0_sp1.1 from training. Duration: 0.936375 2023-10-03 04:55:15,011 WARNING [train.py:1204] (2/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_669-93163-0 from training. Duration: 0.98 2023-10-03 04:55:17,659 WARNING [train.py:1204] (2/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_249-281349-0_sp0.9 from training. Duration: 0.9444375 2023-10-03 04:55:19,690 WARNING [train.py:1204] (2/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_106-30782-0 from training. Duration: 0.8 2023-10-03 04:55:21,472 WARNING [train.py:1204] (2/4) Exclude cut with ID _8772_210_1850_1_1528635595972_3664011_398-320049-0_sp1.1 from training. Duration: 0.8 2023-10-03 04:55:22,901 WARNING [train.py:1204] (2/4) Exclude cut with ID _44778_210_2054_1_1533798127203_7443090_516-151727-0_sp1.1 from training. Duration: 0.963625 2023-10-03 04:55:24,229 WARNING [train.py:1204] (2/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_325-62802-0_sp1.1 from training. Duration: 0.7909375 2023-10-03 04:55:28,344 WARNING [train.py:1204] (2/4) Exclude cut with ID _14368_210_4220_1_1530532714870_3595529_93-58002-0_sp0.9 from training. Duration: 0.6333125 2023-10-03 04:55:28,381 WARNING [train.py:1204] (2/4) Exclude cut with ID _4681_210_3525_1_1527251040321_1235713_123-24428-0_sp1.1 from training. Duration: 0.436375 2023-10-03 04:55:29,600 WARNING [train.py:1204] (2/4) Exclude cut with ID _5250_210_5219_1_1527249561806_8692110_1185-56473-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 04:55:29,722 WARNING [train.py:1204] (2/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_96-244299-0_sp1.1 from training. Duration: 0.8 2023-10-03 04:55:34,088 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=1145186.6666666667, ans=0.1 2023-10-03 04:55:35,226 WARNING [train.py:1204] (2/4) Exclude cut with ID _59500_210_8254_1_1534809610680_3736009_104-72815-0_sp1.1 from training. Duration: 0.963625 2023-10-03 04:55:37,922 WARNING [train.py:1204] (2/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_468-283113-0_sp1.1 from training. Duration: 0.87275 2023-10-03 04:55:39,348 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6239_210_4211_1_1527765584_4306698_528-102186-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 04:55:40,131 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.1.encoder.layers.1.self_attn2.whiten, num_groups=1, num_channels=256, metric=10.97 vs. limit=22.5 2023-10-03 04:55:40,645 WARNING [train.py:1204] (2/4) Exclude cut with ID _45297_210_2721_1_1533715351381_3264949_351-147136-0 from training. Duration: 0.87 2023-10-03 04:55:40,649 WARNING [train.py:1204] (2/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_520-51744-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 04:55:40,752 WARNING [train.py:1204] (2/4) Exclude cut with ID _5166_210_2084_1_1527073197160_4087993_310-114452-0_sp1.1 from training. Duration: 0.536375 2023-10-03 04:55:40,755 WARNING [train.py:1204] (2/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_450-165710-0_sp1.1 from training. Duration: 0.92725 2023-10-03 04:55:40,765 WARNING [train.py:1204] (2/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_280-120942-0_sp1.1 from training. Duration: 0.7 2023-10-03 04:55:40,784 WARNING [train.py:1204] (2/4) Exclude cut with ID _65585_210_12210_1_1535450451584_3764098_112-294301-0_sp0.9 from training. Duration: 0.97775 2023-10-03 04:55:42,075 WARNING [train.py:1204] (2/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_98-51676-0_sp0.9 from training. Duration: 0.811125 2023-10-03 04:55:43,418 INFO [train.py:1046] (2/4) Epoch 33, batch 1800, loss[loss=0.1751, simple_loss=0.262, pruned_loss=0.04408, over 24375.00 frames. ], tot_loss[loss=0.1612, simple_loss=0.2402, pruned_loss=0.04109, over 4721714.11 frames. ], batch size: 77, lr: 3.10e-03, grad_scale: 16.0 2023-10-03 04:55:43,687 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.balancer2.prob, batch_count=1145253.3333333333, ans=0.125 2023-10-03 04:55:43,810 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.feed_forward1.hidden_balancer.prob, batch_count=1145253.3333333333, ans=0.125 2023-10-03 04:55:45,447 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_33348_210_5632_1_1533086912_3324030_85-28583-0_sp1.1 from training. Duration: 0.7454375 2023-10-03 04:55:45,546 WARNING [train.py:1204] (2/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_336-208232-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 04:55:48,720 WARNING [train.py:1204] (2/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_339-104791-0_sp0.9 from training. Duration: 0.5444375 2023-10-03 04:55:48,909 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=1145253.3333333333, ans=0.1 2023-10-03 04:55:50,018 WARNING [train.py:1204] (2/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_559-29840-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 04:55:52,143 WARNING [train.py:1204] (2/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_117-290916-0_sp0.9 from training. Duration: 0.5666875 2023-10-03 04:55:53,565 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_43802_210_2290_1_1533516203_8829504_185-112289-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 04:55:56,279 WARNING [train.py:1204] (2/4) Exclude cut with ID _59256_210_5986_1_1535076085997_3589120_155-298797-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 04:55:59,013 WARNING [train.py:1204] (2/4) Exclude cut with ID _44659_210_2721_1_1534036120061_6614860_246-206531-0_sp1.1 from training. Duration: 0.92725 2023-10-03 04:55:59,064 WARNING [train.py:1204] (2/4) Exclude cut with ID _23818_210_6228_1_1531917063884_3676920_469-151944-0_sp1.1 from training. Duration: 0.92725 2023-10-03 04:56:00,453 WARNING [train.py:1204] (2/4) Exclude cut with ID _13252_210_1925_1_1530671372047_3336909_88-276668-0_sp1.1 from training. Duration: 0.9 2023-10-03 04:56:00,662 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.bypass.scale_min, batch_count=1145320.0, ans=0.2 2023-10-03 04:56:03,365 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_15101_210_7927_1_1530786415_5033274_320-327527-0_sp1.1 from training. Duration: 0.87275 2023-10-03 04:56:03,373 WARNING [train.py:1204] (2/4) Exclude cut with ID _14998_210_5219_1_1530705582080_7474970_446-112117-0 from training. Duration: 0.9 2023-10-03 04:56:03,432 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_30280_210_1881_1_1532872779_3660139_400-254159-0_sp1.1 from training. Duration: 0.936375 2023-10-03 04:56:07,489 WARNING [train.py:1204] (2/4) Exclude cut with ID _40284_210_2084_1_1533513679814_3704279_158-345341-0_sp1.1 from training. Duration: 0.936375 2023-10-03 04:56:10,534 WARNING [train.py:1204] (2/4) Exclude cut with ID 3033-130750-0096-55598-0 from training. Duration: 0.83 2023-10-03 04:56:12,055 WARNING [train.py:1204] (2/4) Exclude cut with ID _9389_210_1664_1_1529217337301_5289269_379-128794-0 from training. Duration: 0.85 2023-10-03 04:56:12,071 WARNING [train.py:1204] (2/4) Exclude cut with ID _3675_210_4557_1_1526639934408_4182089_456-18305-0 from training. Duration: 0.76 2023-10-03 04:56:13,411 WARNING [train.py:1204] (2/4) Exclude cut with ID _29853_210_7497_1_1532505600851_6642110_571-249460-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 04:56:13,515 WARNING [train.py:1204] (2/4) Exclude cut with ID _4068_210_2629_1_1526697350395_1546259_230-325568-0_sp1.1 from training. Duration: 0.92725 2023-10-03 04:56:13,516 WARNING [train.py:1204] (2/4) Exclude cut with ID _34415_210_8750_1_1533085213375_3451116_213-267570-0_sp1.1 from training. Duration: 0.963625 2023-10-03 04:56:15,347 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6767_210_3273_1_1527855783_1718932_142-127132-0_sp1.1 from training. Duration: 0.863625 2023-10-03 04:56:24,632 WARNING [train.py:1204] (2/4) Exclude cut with ID IC0653W0001-322925 from training. Duration: 0.896 2023-10-03 04:56:26,544 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_4528_210_3273_1_1527076654_3276777_209-219804-0_sp1.1 from training. Duration: 0.62725 2023-10-03 04:56:26,665 WARNING [train.py:1204] (2/4) Exclude cut with ID _55844_210_2346_1_1534471212569_7625707_187-204971-0_sp1.1 from training. Duration: 0.936375 2023-10-03 04:56:26,861 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.conv_module1.balancer1.prob, batch_count=1145386.6666666667, ans=0.125 2023-10-03 04:56:28,124 WARNING [train.py:1204] (2/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_369-325184-0 from training. Duration: 0.99 2023-10-03 04:56:28,153 WARNING [train.py:1204] (2/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_791-45149-0 from training. Duration: 0.86 2023-10-03 04:56:28,212 WARNING [train.py:1204] (2/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_317-211107-0_sp1.1 from training. Duration: 0.836375 2023-10-03 04:56:29,618 WARNING [train.py:1204] (2/4) Exclude cut with ID _30800_210_22382_1_1532499347904_4515959_488-330913-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 04:56:31,059 WARNING [train.py:1204] (2/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_506-42614-0_sp1.1 from training. Duration: 0.7181875 2023-10-03 04:56:31,791 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.1.encoder.layers.1.whiten, num_groups=1, num_channels=256, metric=3.88 vs. limit=12.0 2023-10-03 04:56:35,238 WARNING [train.py:1204] (2/4) Exclude cut with ID _8696_210_1876_1_1528545632440_3345058_209-54069-0 from training. Duration: 0.67 2023-10-03 04:56:39,575 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.ff3_skip_rate, batch_count=1145453.3333333333, ans=0.0 2023-10-03 04:56:40,701 WARNING [train.py:1204] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_215-202973-0_sp1.1 from training. Duration: 0.8 2023-10-03 04:56:42,149 WARNING [train.py:1204] (2/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_321-215742-0 from training. Duration: 0.5 2023-10-03 04:56:42,191 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_40896_210_15710_1_1533433956_7100802_428-32114-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 04:56:42,199 WARNING [train.py:1204] (2/4) Exclude cut with ID _27013_210_1389_1_1532437173240_3252649_467-263151-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 04:56:43,535 WARNING [train.py:1204] (2/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_734-129761-0_sp0.9 from training. Duration: 0.911125 2023-10-03 04:56:44,834 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_521-214276-0 from training. Duration: 0.44 2023-10-03 04:56:46,981 WARNING [train.py:1204] (2/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_80-147301-0_sp1.1 from training. Duration: 0.763625 2023-10-03 04:56:46,983 WARNING [train.py:1204] (2/4) Exclude cut with ID _9679_210_7497_1_1529129212906_4363929_56-307796-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 04:56:50,616 WARNING [train.py:1204] (2/4) Exclude cut with ID _14832_210_8750_1_1530691351539_3171348_1-54315-0 from training. Duration: 0.87 2023-10-03 04:56:50,617 WARNING [train.py:1204] (2/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_558-155750-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 04:56:52,051 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_142-309370-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 04:56:52,077 WARNING [train.py:1204] (2/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_18-46756-0_sp0.9 from training. Duration: 0.87775 2023-10-03 04:56:52,083 WARNING [train.py:1204] (2/4) Exclude cut with ID _61148_210_6897_1_1535094022726_4217610_279-212159-0_sp1.1 from training. Duration: 0.936375 2023-10-03 04:56:53,624 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.bypass.scale_min, batch_count=1145520.0, ans=0.2 2023-10-03 04:56:54,735 WARNING [train.py:1204] (2/4) Exclude cut with ID _21987_210_5854_1_1531821500482_3637038_164-2641-0_sp1.1 from training. Duration: 0.936375 2023-10-03 04:56:54,788 WARNING [train.py:1204] (2/4) Exclude cut with ID _8064_210_5399_1_1528372299291_4282973_395-340852-0_sp1.1 from training. Duration: 0.8454375 2023-10-03 04:56:57,515 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1077-184261-0_sp1.1 from training. Duration: 0.963625 2023-10-03 04:56:57,531 WARNING [train.py:1204] (2/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_1176-204992-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 04:56:59,219 INFO [train.py:1046] (2/4) Epoch 33, batch 1850, loss[loss=0.1952, simple_loss=0.2677, pruned_loss=0.06134, over 19286.00 frames. ], tot_loss[loss=0.1623, simple_loss=0.2407, pruned_loss=0.04195, over 4698971.01 frames. ], batch size: 388, lr: 3.10e-03, grad_scale: 8.0 2023-10-03 04:57:00,615 WARNING [train.py:1204] (2/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_563-347832-0_sp1.1 from training. Duration: 0.8090625 2023-10-03 04:57:00,679 WARNING [train.py:1204] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_380-204756-0_sp0.9 from training. Duration: 0.9555625 2023-10-03 04:57:03,504 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.522e+02 1.866e+02 2.062e+02 2.280e+02 4.556e+02, threshold=4.123e+02, percent-clipped=1.0 2023-10-03 04:57:06,266 WARNING [train.py:1204] (2/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_212-10501-0_sp1.1 from training. Duration: 0.8545625 2023-10-03 04:57:06,302 WARNING [train.py:1204] (2/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_445-117398-0 from training. Duration: 0.89 2023-10-03 04:57:08,923 WARNING [train.py:1204] (2/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_29-280622-0 from training. Duration: 0.82 2023-10-03 04:57:11,711 WARNING [train.py:1204] (2/4) Exclude cut with ID _26664_210_7770_1_1532258943280_3418270_187-80815-0 from training. Duration: 0.93 2023-10-03 04:57:16,314 WARNING [train.py:1204] (2/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_414-349640-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 04:57:16,334 WARNING [train.py:1204] (2/4) Exclude cut with ID _45021_210_9994_1_1533619751094_3330288_96-239010-0 from training. Duration: 0.91 2023-10-03 04:57:16,336 WARNING [train.py:1204] (2/4) Exclude cut with ID _5189_210_4557_1_1527248319428_3769229_199-53526-0_sp0.9 from training. Duration: 0.4666875 2023-10-03 04:57:19,951 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.conv_module1.balancer1.prob, batch_count=1145653.3333333333, ans=0.125 2023-10-03 04:57:26,548 WARNING [train.py:1204] (2/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_63-101996-0_sp1.1 from training. Duration: 0.8 2023-10-03 04:57:26,668 WARNING [train.py:1204] (2/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_199-86432-0 from training. Duration: 0.98 2023-10-03 04:57:30,229 WARNING [train.py:1204] (2/4) Exclude cut with ID _36399_210_16049_1_1532998771377_3698333_207-259013-0_sp1.1 from training. Duration: 0.92725 2023-10-03 04:57:30,258 WARNING [train.py:1204] (2/4) Exclude cut with ID _62574_210_2655_1_1535180407058_3933249_567-111756-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 04:57:35,744 WARNING [train.py:1204] (2/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_436-318113-0 from training. Duration: 0.75 2023-10-03 04:57:35,788 WARNING [train.py:1204] (2/4) Exclude cut with ID _53379_210_20034_1_1534222027363_3854654_43-305214-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 04:57:35,809 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_15126_210_6753_1_1530778677_2591185_113-157651-0_sp1.1 from training. Duration: 0.6181875 2023-10-03 04:57:38,513 WARNING [train.py:1204] (2/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_146-218763-0_sp1.1 from training. Duration: 0.7818125 2023-10-03 04:57:41,348 WARNING [train.py:1204] (2/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_714-310538-0_sp0.9 from training. Duration: 0.9555625 2023-10-03 04:57:42,820 WARNING [train.py:1204] (2/4) Exclude cut with ID _24271_210_12658_1_1531987270022_7190519_709-16721-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 04:57:45,780 WARNING [train.py:1204] (2/4) Exclude cut with ID _5190_210_4557_1_1527244368799_373439_28-197030-0_sp0.9 from training. Duration: 0.8 2023-10-03 04:57:45,817 WARNING [train.py:1204] (2/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_267-197536-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 04:57:45,841 WARNING [train.py:1204] (2/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_658-282610-0_sp1.1 from training. Duration: 0.4818125 2023-10-03 04:57:45,858 WARNING [train.py:1204] (2/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_257-67050-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 04:57:48,854 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_14906_210_7927_1_1530754190_9408633_961-53402-0_sp1.1 from training. Duration: 0.936375 2023-10-03 04:57:50,757 WARNING [train.py:1204] (2/4) Exclude cut with ID _60113_210_16271_1_1535164212481_4386079_490-280290-0_sp1.1 from training. Duration: 0.8909375 2023-10-03 04:57:52,623 WARNING [train.py:1204] (2/4) Exclude cut with ID _14100_210_8341_1_1530529320613_3678069_258-224599-0 from training. Duration: 0.77 2023-10-03 04:57:52,850 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.attention_skip_rate, batch_count=1145786.6666666667, ans=0.0 2023-10-03 04:57:53,941 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6961_210_2087_1_1527937168_6815011_565-326898-0_sp1.1 from training. Duration: 0.936375 2023-10-03 04:57:56,731 WARNING [train.py:1204] (2/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_239-316802-0_sp1.1 from training. Duration: 0.62725 2023-10-03 04:57:58,011 WARNING [train.py:1204] (2/4) Exclude cut with ID _55857_210_2346_1_1534644007831_6898209_745-98391-0_sp1.1 from training. Duration: 0.6909375 2023-10-03 04:57:58,015 WARNING [train.py:1204] (2/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_154-135696-0 from training. Duration: 0.8 2023-10-03 04:57:58,017 WARNING [train.py:1204] (2/4) Exclude cut with ID _14368_210_4220_1_1530532714870_3595529_93-58002-0 from training. Duration: 0.57 2023-10-03 04:57:59,921 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9025W0005-507335 from training. Duration: 0.896 2023-10-03 04:58:01,353 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9029W0034-509435 from training. Duration: 0.64 2023-10-03 04:58:01,571 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.nonlin_attention.balancer.prob, batch_count=1145853.3333333333, ans=0.125 2023-10-03 04:58:04,138 WARNING [train.py:1204] (2/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_76-203867-0_sp1.1 from training. Duration: 0.6545625 2023-10-03 04:58:04,147 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_46639_210_8341_1_1533726088_4495500_66-346325-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 04:58:04,156 WARNING [train.py:1204] (2/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_145-137790-0_sp0.9 from training. Duration: 0.92225 2023-10-03 04:58:04,185 WARNING [train.py:1204] (2/4) Exclude cut with ID _36522_210_8887_1_1533177030611_1772478_7-327299-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 04:58:04,250 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9042W0009-515615 from training. Duration: 0.896 2023-10-03 04:58:04,258 WARNING [train.py:1204] (2/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_39-248870-0_sp0.9 from training. Duration: 0.7666875 2023-10-03 04:58:04,300 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_35441_210_16554_1_1533347572_4092680_362-242912-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 04:58:05,649 WARNING [train.py:1204] (2/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_216-343007-0_sp1.1 from training. Duration: 0.6 2023-10-03 04:58:05,880 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.conv_module1.balancer2.min_positive, batch_count=1145853.3333333333, ans=0.05 2023-10-03 04:58:07,004 WARNING [train.py:1204] (2/4) Exclude cut with ID _3867_210_1883_1_1526641200575_4475830_272-215563-0_sp0.9 from training. Duration: 0.6333125 2023-10-03 04:58:08,523 WARNING [train.py:1204] (2/4) Exclude cut with ID _21783_210_11357_1_1531742458730_3762679_553-175603-0_sp1.1 from training. Duration: 0.92725 2023-10-03 04:58:09,760 WARNING [train.py:1204] (2/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_963-187109-0 from training. Duration: 0.99 2023-10-03 04:58:11,146 WARNING [train.py:1204] (2/4) Exclude cut with ID _44973_210_1511_1_1533794381163_3634770_495-211466-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 04:58:11,159 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9068W0002-529085 from training. Duration: 0.896 2023-10-03 04:58:11,173 WARNING [train.py:1204] (2/4) Exclude cut with ID _5294_210_1747_1_1527249643928_3696179_180-100713-0_sp1.1 from training. Duration: 0.736375 2023-10-03 04:58:12,484 INFO [train.py:1046] (2/4) Epoch 33, batch 1900, loss[loss=0.1541, simple_loss=0.2355, pruned_loss=0.03634, over 24473.00 frames. ], tot_loss[loss=0.1629, simple_loss=0.2416, pruned_loss=0.04208, over 4694552.69 frames. ], batch size: 63, lr: 3.10e-03, grad_scale: 8.0 2023-10-03 04:58:12,554 WARNING [train.py:1204] (2/4) Exclude cut with ID _35799_210_5854_1_1533263487887_3529071_149-49689-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 04:58:16,822 WARNING [train.py:1204] (2/4) Exclude cut with ID _67725_210_27767_1_1535608756671_5105359_36-220794-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 04:58:21,152 WARNING [train.py:1204] (2/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_338-96520-0_sp1.1 from training. Duration: 0.8545625 2023-10-03 04:58:23,140 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9104W0004-547160 from training. Duration: 0.896 2023-10-03 04:58:23,225 WARNING [train.py:1204] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_330-327861-0 from training. Duration: 0.67 2023-10-03 04:58:25,263 WARNING [train.py:1204] (2/4) Exclude cut with ID _14087_210_2253_1_1530529224512_3014029_156-257607-0_sp0.9 from training. Duration: 0.92225 2023-10-03 04:58:26,524 WARNING [train.py:1204] (2/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_476-322050-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 04:58:26,540 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9118W0017-553385 from training. Duration: 0.896 2023-10-03 04:58:26,567 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9120W0028-554435 from training. Duration: 0.896 2023-10-03 04:58:30,787 WARNING [train.py:1204] (2/4) Exclude cut with ID _56524_210_9642_1_1534572011779_3619170_145-310842-0 from training. Duration: 0.99 2023-10-03 04:58:32,757 WARNING [train.py:1204] (2/4) Exclude cut with ID _51631_210_8254_1_1534129228420_4084794_270-222508-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 04:58:36,883 WARNING [train.py:1204] (2/4) Exclude cut with ID _14268_210_9206_1_1530622771285_4714099_331-105351-0 from training. Duration: 0.65 2023-10-03 04:58:38,273 WARNING [train.py:1204] (2/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_40-129756-0 from training. Duration: 0.81 2023-10-03 04:58:41,832 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.2.encoder.layers.2.feed_forward1.out_whiten, num_groups=1, num_channels=384, metric=5.97 vs. limit=15.0 2023-10-03 04:58:43,940 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.feed_forward1.out_proj.dropout_p, batch_count=1146053.3333333333, ans=0.1 2023-10-03 04:58:46,547 WARNING [train.py:1204] (2/4) Exclude cut with ID _3541_210_2477_1_1526475408709_3263973_231-104736-0 from training. Duration: 0.78 2023-10-03 04:58:48,015 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.self_attn_weights.pos_emb_skip_rate, batch_count=1146053.3333333333, ans=0.0 2023-10-03 04:58:49,270 WARNING [train.py:1204] (2/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_214-291369-0 from training. Duration: 0.63 2023-10-03 04:58:49,276 WARNING [train.py:1204] (2/4) Exclude cut with ID _62574_210_2655_1_1535180407058_3933249_369-84347-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 04:58:49,969 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9210W0111-599240 from training. Duration: 0.896 2023-10-03 04:58:51,193 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_92-246270-0 from training. Duration: 0.85 2023-10-03 04:58:51,232 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_37152_210_9697_1_1533106573_3844188_29-239070-0 from training. Duration: 0.98 2023-10-03 04:58:52,555 WARNING [train.py:1204] (2/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_239-22076-0 from training. Duration: 0.85 2023-10-03 04:58:52,566 WARNING [train.py:1204] (2/4) Exclude cut with ID _65962_210_29927_1_1535436026519_4174930_368-44639-0_sp1.1 from training. Duration: 0.97275 2023-10-03 04:58:58,882 WARNING [train.py:1204] (2/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_152-212533-0 from training. Duration: 0.97 2023-10-03 04:59:00,388 WARNING [train.py:1204] (2/4) Exclude cut with ID _14603_210_4220_1_1530615688441_3097319_90-31383-0_sp1.1 from training. Duration: 0.7909375 2023-10-03 04:59:05,026 WARNING [train.py:1204] (2/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_196-152868-0_sp1.1 from training. Duration: 0.92725 2023-10-03 04:59:05,028 WARNING [train.py:1204] (2/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_140-181176-0 from training. Duration: 0.92 2023-10-03 04:59:06,464 WARNING [train.py:1204] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_168-32906-0_sp0.9 from training. Duration: 0.7666875 2023-10-03 04:59:09,861 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.3.encoder.layers.1.self_attn1.whiten, num_groups=1, num_channels=512, metric=15.51 vs. limit=22.5 2023-10-03 04:59:10,498 WARNING [train.py:1204] (2/4) Exclude cut with ID _14922_210_6228_1_1530707435273_3693310_13-165512-0 from training. Duration: 0.82 2023-10-03 04:59:10,533 WARNING [train.py:1204] (2/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_381-317791-0_sp0.9 from training. Duration: 0.811125 2023-10-03 04:59:14,873 WARNING [train.py:1204] (2/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_63-267585-0_sp1.1 from training. Duration: 0.8181875 2023-10-03 04:59:14,875 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_326-303945-0_sp1.1 from training. Duration: 0.9 2023-10-03 04:59:16,185 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_194-143381-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 04:59:16,269 WARNING [train.py:1204] (2/4) Exclude cut with ID _39290_210_4220_1_1533862778760_3075118_194-16667-0_sp1.1 from training. Duration: 0.963625 2023-10-03 04:59:19,017 WARNING [train.py:1204] (2/4) Exclude cut with ID _15012_210_1968_1_1530856820429_5095350_662-210469-0_sp0.9 from training. Duration: 0.6333125 2023-10-03 04:59:19,044 WARNING [train.py:1204] (2/4) Exclude cut with ID _4697_210_2069_1_1526815801953_3823098_68-219807-0_sp1.1 from training. Duration: 0.42725 2023-10-03 04:59:19,114 WARNING [train.py:1204] (2/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_321-22821-0_sp0.9 from training. Duration: 0.988875 2023-10-03 04:59:20,662 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.bypass_mid.scale_min, batch_count=1146186.6666666667, ans=0.2 2023-10-03 04:59:23,847 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_14056_210_4163_1_1530604483_7572827_1037-45317-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 04:59:23,848 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_403-39619-0_sp1.1 from training. Duration: 0.763625 2023-10-03 04:59:25,290 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_20120_210_6753_1_1531643035_5482859_510-96996-0_sp0.9 from training. Duration: 0.9666875 2023-10-03 04:59:25,292 WARNING [train.py:1204] (2/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_225-180155-0_sp1.1 from training. Duration: 0.92725 2023-10-03 04:59:25,327 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_278-272612-0_sp0.9 from training. Duration: 0.811125 2023-10-03 04:59:26,641 INFO [train.py:1046] (2/4) Epoch 33, batch 1950, loss[loss=0.1526, simple_loss=0.2299, pruned_loss=0.03766, over 19769.00 frames. ], tot_loss[loss=0.1633, simple_loss=0.2423, pruned_loss=0.04214, over 4700755.07 frames. ], batch size: 43, lr: 3.10e-03, grad_scale: 8.0 2023-10-03 04:59:26,748 WARNING [train.py:1204] (2/4) Exclude cut with ID _53713_210_24116_1_1534314487762_7416751_691-70244-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 04:59:30,875 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6788_210_1388_1_1527898698_4562021_91-197981-0_sp1.1 from training. Duration: 0.7545625 2023-10-03 04:59:32,077 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.539e+02 1.922e+02 2.146e+02 2.746e+02 4.413e+02, threshold=4.292e+02, percent-clipped=1.0 2023-10-03 04:59:33,548 WARNING [train.py:1204] (2/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_60-18123-0_sp0.9 from training. Duration: 0.97775 2023-10-03 04:59:33,594 WARNING [train.py:1204] (2/4) Exclude cut with ID _35799_210_5854_1_1533263487887_3529071_5-214617-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 04:59:33,605 WARNING [train.py:1204] (2/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_890-5117-0_sp1.1 from training. Duration: 0.7090625 2023-10-03 04:59:35,012 WARNING [train.py:1204] (2/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_660-211088-0 from training. Duration: 0.62 2023-10-03 04:59:36,415 WARNING [train.py:1204] (2/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_314-126819-0_sp0.9 from training. Duration: 0.5555625 2023-10-03 04:59:36,457 WARNING [train.py:1204] (2/4) Exclude cut with ID _21632_210_2253_1_1531994001765_3744140_362-66225-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 04:59:36,545 WARNING [train.py:1204] (2/4) Exclude cut with ID _61118_210_9527_1_1534897668884_3752630_173-258555-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 04:59:38,417 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder_embed.dropout.p, batch_count=1146253.3333333333, ans=0.1 2023-10-03 04:59:39,984 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.0.feed_forward1.out_proj.dropout_p, batch_count=1146253.3333333333, ans=0.1 2023-10-03 04:59:41,193 WARNING [train.py:1204] (2/4) Exclude cut with ID _7719_210_3525_1_1528456153163_4296960_88-250259-0_sp0.9 from training. Duration: 0.9333125 2023-10-03 04:59:41,230 WARNING [train.py:1204] (2/4) Exclude cut with ID _15140_210_9288_1_1530791388450_4880149_539-153708-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 04:59:41,267 WARNING [train.py:1204] (2/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_253-42544-0_sp1.1 from training. Duration: 0.97275 2023-10-03 04:59:43,919 WARNING [train.py:1204] (2/4) Exclude cut with ID _33640_210_2655_1_1533117605479_4855469_479-296877-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 04:59:46,783 WARNING [train.py:1204] (2/4) Exclude cut with ID 3033-130750-0096-55598-0_sp1.1 from training. Duration: 0.7545625 2023-10-03 04:59:46,802 WARNING [train.py:1204] (2/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_191-41023-0_sp1.1 from training. Duration: 0.6454375 2023-10-03 04:59:46,817 WARNING [train.py:1204] (2/4) Exclude cut with ID _8365_210_2064_1_1528592311070_4677690_431-294102-0_sp1.1 from training. Duration: 0.8090625 2023-10-03 04:59:46,846 WARNING [train.py:1204] (2/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_893-296030-0_sp1.1 from training. Duration: 0.97275 2023-10-03 04:59:50,972 WARNING [train.py:1204] (2/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_401-343820-0_sp1.1 from training. Duration: 0.97275 2023-10-03 04:59:51,291 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.feed_forward2.hidden_balancer.prob, batch_count=1146320.0, ans=0.125 2023-10-03 04:59:54,326 WARNING [train.py:1204] (2/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_127-566-0_sp1.1 from training. Duration: 0.763625 2023-10-03 04:59:54,327 WARNING [train.py:1204] (2/4) Exclude cut with ID _14100_210_8341_1_1530529320613_3678069_126-272847-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 04:59:54,341 WARNING [train.py:1204] (2/4) Exclude cut with ID _15133_210_6228_1_1530794512074_3080379_29-264458-0_sp1.1 from training. Duration: 0.52725 2023-10-03 04:59:54,342 WARNING [train.py:1204] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_127-119109-0 from training. Duration: 0.72 2023-10-03 04:59:55,789 WARNING [train.py:1204] (2/4) Exclude cut with ID _6537_210_1883_1_1527987559710_3597129_382-133062-0_sp1.1 from training. Duration: 0.6181875 2023-10-03 04:59:55,799 WARNING [train.py:1204] (2/4) Exclude cut with ID _29529_210_7234_1_1532393961455_3977460_391-219682-0_sp1.1 from training. Duration: 0.936375 2023-10-03 04:59:55,865 WARNING [train.py:1204] (2/4) Exclude cut with ID _37537_210_2253_1_1533171758568_3421949_96-137189-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 05:00:00,370 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.3.encoder.layers.0.self_attn2.whiten, num_groups=1, num_channels=512, metric=13.14 vs. limit=22.5 2023-10-03 05:00:03,492 WARNING [train.py:1204] (2/4) Exclude cut with ID _58414_210_8254_1_1535421609162_7151182_15-276301-0_sp1.1 from training. Duration: 0.97275 2023-10-03 05:00:04,937 WARNING [train.py:1204] (2/4) Exclude cut with ID _59502_210_12210_1_1535107024698_1835139_110-289074-0_sp1.1 from training. Duration: 0.92725 2023-10-03 05:00:07,780 WARNING [train.py:1204] (2/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_367-61330-0_sp1.1 from training. Duration: 0.6818125 2023-10-03 05:00:12,351 WARNING [train.py:1204] (2/4) Exclude cut with ID _45297_210_2721_1_1533715351381_3264949_353-9541-0_sp1.1 from training. Duration: 0.8818125 2023-10-03 05:00:12,393 WARNING [train.py:1204] (2/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_85-138257-0_sp0.9 from training. Duration: 0.788875 2023-10-03 05:00:12,420 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_3787_210_2040_1_1526477544_5478586_603-120324-0 from training. Duration: 0.81 2023-10-03 05:00:13,708 WARNING [train.py:1204] (2/4) Exclude cut with ID _42863_210_9070_1_1533902398788_3696920_411-204131-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 05:00:14,028 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.conv_module2.balancer2.min_positive, batch_count=1146453.3333333333, ans=0.05 2023-10-03 05:00:17,915 WARNING [train.py:1204] (2/4) Exclude cut with ID _30196_210_1664_1_1533708496522_7074140_822-289909-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 05:00:19,287 WARNING [train.py:1204] (2/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_32-335103-0_sp0.9 from training. Duration: 0.911125 2023-10-03 05:00:19,339 WARNING [train.py:1204] (2/4) Exclude cut with ID _6493_210_1747_1_1528459210424_4012470_513-140967-0_sp0.9 from training. Duration: 0.87775 2023-10-03 05:00:26,380 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_47896_210_20508_1_1534492518_5958868_439-257766-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 05:00:27,057 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_1294-63152-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 05:00:29,584 WARNING [train.py:1204] (2/4) Exclude cut with ID _59170_210_6721_1_1534983633960_4203335_319-166512-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 05:00:32,916 WARNING [train.py:1204] (2/4) Exclude cut with ID _3541_210_2477_1_1526475408709_3263973_146-3470-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 05:00:34,552 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.conv_skip_rate, batch_count=1146520.0, ans=0.0 2023-10-03 05:00:35,622 WARNING [train.py:1204] (2/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_678-159307-0_sp1.1 from training. Duration: 0.77275 2023-10-03 05:00:35,654 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_343-48822-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 05:00:37,590 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_4692_210_3528_1_1526899755_4323254_107-44401-0 from training. Duration: 0.86 2023-10-03 05:00:37,595 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6291_210_3273_1_1527683649_1078498_74-292608-0_sp1.1 from training. Duration: 0.6545625 2023-10-03 05:00:39,008 WARNING [train.py:1204] (2/4) Exclude cut with ID _55644_210_11856_1_1534593599582_4103719_293-248694-0_sp1.1 from training. Duration: 0.97275 2023-10-03 05:00:40,379 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_3172_210_2087_1_1526292929_3987865_197-97762-0 from training. Duration: 0.75 2023-10-03 05:00:41,749 INFO [train.py:1046] (2/4) Epoch 33, batch 2000, loss[loss=0.1685, simple_loss=0.2537, pruned_loss=0.04165, over 23785.00 frames. ], tot_loss[loss=0.1635, simple_loss=0.2426, pruned_loss=0.04221, over 4707679.96 frames. ], batch size: 85, lr: 3.10e-03, grad_scale: 16.0 2023-10-03 05:00:41,848 WARNING [train.py:1204] (2/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_264-348896-0_sp0.9 from training. Duration: 0.9444375 2023-10-03 05:00:43,998 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.4.encoder.layers.0.self_attn2.whiten, num_groups=1, num_channels=384, metric=12.62 vs. limit=22.5 2023-10-03 05:00:45,334 WARNING [train.py:1204] (2/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_209-205365-0_sp0.9 from training. Duration: 0.87775 2023-10-03 05:00:46,766 WARNING [train.py:1204] (2/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_222-198127-0_sp0.9 from training. Duration: 0.9333125 2023-10-03 05:00:46,809 WARNING [train.py:1204] (2/4) Exclude cut with ID _57572_210_5517_1_1534921219624_3809484_52-43530-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 05:00:48,227 WARNING [train.py:1204] (2/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_96-320736-0_sp1.1 from training. Duration: 0.7818125 2023-10-03 05:00:49,662 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_13091_210_7925_1_1530609447_8987354_1159-225508-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 05:00:49,946 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.self_attn_weights.pos_emb_skip_rate, batch_count=1146586.6666666667, ans=0.0 2023-10-03 05:00:51,117 WARNING [train.py:1204] (2/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_237-44332-0 from training. Duration: 0.94 2023-10-03 05:00:51,164 WARNING [train.py:1204] (2/4) Exclude cut with ID _7970_210_1883_1_1528455616296_3472230_25-178407-0_sp0.9 from training. Duration: 0.788875 2023-10-03 05:00:54,624 WARNING [train.py:1204] (2/4) Exclude cut with ID _29003_210_19596_1_1532507391720_3894098_357-257062-0_sp1.1 from training. Duration: 0.936375 2023-10-03 05:00:58,353 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_809-17156-0 from training. Duration: 0.73 2023-10-03 05:00:58,460 WARNING [train.py:1204] (2/4) Exclude cut with ID _7160_210_1876_1_1527940923231_3703799_302-150728-0_sp1.1 from training. Duration: 0.7090625 2023-10-03 05:01:02,307 WARNING [train.py:1204] (2/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_444-28041-0_sp0.9 from training. Duration: 0.9444375 2023-10-03 05:01:04,322 WARNING [train.py:1204] (2/4) Exclude cut with ID _52892_210_6721_1_1534378941259_4124129_278-251666-0_sp1.1 from training. Duration: 0.92725 2023-10-03 05:01:05,775 WARNING [train.py:1204] (2/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_115-326778-0 from training. Duration: 0.93 2023-10-03 05:01:05,886 WARNING [train.py:1204] (2/4) Exclude cut with ID _41391_210_7994_1_1533603771872_3638201_31-188961-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 05:01:08,514 WARNING [train.py:1204] (2/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_1073-6264-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 05:01:08,556 WARNING [train.py:1204] (2/4) Exclude cut with ID _66868_210_9527_1_1535509287954_4561140_435-259515-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 05:01:10,477 WARNING [train.py:1204] (2/4) Exclude cut with ID _14886_210_4220_1_1530702020355_3516190_159-73748-0 from training. Duration: 0.83 2023-10-03 05:01:11,735 WARNING [train.py:1204] (2/4) Exclude cut with ID _15150_210_8341_1_1530752411803_7256384_469-239509-0_sp1.1 from training. Duration: 0.6181875 2023-10-03 05:01:12,047 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.feed_forward1.out_proj.dropout_p, batch_count=1146653.3333333333, ans=0.1 2023-10-03 05:01:13,108 WARNING [train.py:1204] (2/4) Exclude cut with ID _8064_210_5399_1_1528372299291_4282973_395-340852-0 from training. Duration: 0.93 2023-10-03 05:01:13,112 WARNING [train.py:1204] (2/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_138-187031-0_sp1.1 from training. Duration: 0.9 2023-10-03 05:01:15,369 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_13757_210_3528_1_1530440682_4107239_48-106347-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 05:01:16,711 WARNING [train.py:1204] (2/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_164-224052-0_sp1.1 from training. Duration: 0.636375 2023-10-03 05:01:16,721 WARNING [train.py:1204] (2/4) Exclude cut with ID _59288_210_5854_1_1535077757774_3412246_408-322648-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 05:01:18,057 WARNING [train.py:1204] (2/4) Exclude cut with ID _30191_210_1664_1_1533189915047_7196670_827-36359-0_sp1.1 from training. Duration: 0.963625 2023-10-03 05:01:18,140 WARNING [train.py:1204] (2/4) Exclude cut with ID _4680_210_3525_1_1527249757953_893386_75-78979-0_sp1.1 from training. Duration: 0.82725 2023-10-03 05:01:19,553 WARNING [train.py:1204] (2/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_383-165234-0 from training. Duration: 0.94 2023-10-03 05:01:19,736 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.0.ff3_skip_rate, batch_count=1146720.0, ans=0.0 2023-10-03 05:01:22,351 WARNING [train.py:1204] (2/4) Exclude cut with ID _19896_210_1883_1_1531709022662_6649569_94-229927-0 from training. Duration: 0.97 2023-10-03 05:01:22,358 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6788_210_1388_1_1527898698_4562021_180-75288-0_sp1.1 from training. Duration: 0.9 2023-10-03 05:01:22,372 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_26433_210_12108_1_1532157158_4056480_54-321349-0_sp1.1 from training. Duration: 0.97275 2023-10-03 05:01:26,721 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.conv_module1.balancer1.max_abs, batch_count=1146720.0, ans=10.0 2023-10-03 05:01:27,723 WARNING [train.py:1204] (2/4) Exclude cut with ID _56108_210_16380_1_1534505446159_3887959_265-161493-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 05:01:28,452 WARNING [train.py:1204] (2/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_395-111602-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 05:01:28,458 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_154-316450-0_sp0.9 from training. Duration: 0.8444375 2023-10-03 05:01:29,521 WARNING [train.py:1204] (2/4) Exclude cut with ID _43552_210_8913_1_1533726031059_3523429_381-116041-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 05:01:30,917 WARNING [train.py:1204] (2/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_65-104124-0_sp1.1 from training. Duration: 0.963625 2023-10-03 05:01:32,673 WARNING [train.py:1204] (2/4) Exclude cut with ID _6536_210_1883_1_1527850783339_3319069_169-23811-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 05:01:32,714 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_289-180904-0_sp0.9 from training. Duration: 0.8444375 2023-10-03 05:01:32,729 WARNING [train.py:1204] (2/4) Exclude cut with ID _56406_210_9288_1_1534899618664_3496149_226-242400-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 05:01:34,102 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_24899_210_16554_1_1532046422_3827462_276-216330-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 05:01:36,886 WARNING [train.py:1204] (2/4) Exclude cut with ID _29345_210_4381_1_1532689168263_3860169_198-174328-0_sp1.1 from training. Duration: 0.82725 2023-10-03 05:01:38,251 WARNING [train.py:1204] (2/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_338-96520-0 from training. Duration: 0.94 2023-10-03 05:01:44,185 WARNING [train.py:1204] (2/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_7-111954-0_sp1.1 from training. Duration: 0.5090625 2023-10-03 05:01:45,587 WARNING [train.py:1204] (2/4) Exclude cut with ID _36692_210_5665_1_1533608967420_398809_8-16261-0_sp1.1 from training. Duration: 0.97275 2023-10-03 05:01:45,830 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.balancer1.prob, batch_count=1146853.3333333333, ans=0.125 2023-10-03 05:01:48,869 WARNING [train.py:1204] (2/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_86-212657-0_sp1.1 from training. Duration: 0.97275 2023-10-03 05:01:48,878 WARNING [train.py:1204] (2/4) Exclude cut with ID _44659_210_2721_1_1534036120061_6614860_626-298884-0_sp1.1 from training. Duration: 0.8818125 2023-10-03 05:01:52,956 WARNING [train.py:1204] (2/4) Exclude cut with ID _49755_210_18053_1_1534581005846_3218709_120-257918-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 05:01:55,602 WARNING [train.py:1204] (2/4) Exclude cut with ID _21881_210_3525_1_1531825242757_3798380_400-113821-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 05:01:55,605 WARNING [train.py:1204] (2/4) Exclude cut with ID _21783_210_11357_1_1531742458730_3762679_387-236773-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 05:01:55,682 WARNING [train.py:1204] (2/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_613-232856-0_sp1.1 from training. Duration: 0.4909375 2023-10-03 05:01:56,856 WARNING [train.py:1204] (2/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_181-78632-0_sp0.9 from training. Duration: 0.8666875 2023-10-03 05:01:58,182 INFO [train.py:1046] (2/4) Epoch 33, batch 2050, loss[loss=0.1496, simple_loss=0.2266, pruned_loss=0.03636, over 21016.00 frames. ], tot_loss[loss=0.1632, simple_loss=0.2422, pruned_loss=0.04206, over 4714393.62 frames. ], batch size: 46, lr: 3.10e-03, grad_scale: 16.0 2023-10-03 05:01:59,618 WARNING [train.py:1204] (2/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_175-17485-0_sp1.1 from training. Duration: 0.97275 2023-10-03 05:01:59,688 WARNING [train.py:1204] (2/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_1004-183422-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 05:01:59,804 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.1.conv_skip_rate, batch_count=1146920.0, ans=0.0 2023-10-03 05:02:02,865 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.461e+02 1.906e+02 2.037e+02 2.269e+02 3.118e+02, threshold=4.074e+02, percent-clipped=0.0 2023-10-03 05:02:02,979 WARNING [train.py:1204] (2/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_32-65962-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 05:02:03,047 WARNING [train.py:1204] (2/4) Exclude cut with ID _15458_210_8341_1_1530858614263_3834410_40-298664-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 05:02:08,659 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.0.layers.0.self_attn1.whiten, num_groups=1, num_channels=192, metric=11.29 vs. limit=22.5 2023-10-03 05:02:08,993 WARNING [train.py:1204] (2/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_380-261863-0_sp1.1 from training. Duration: 0.963625 2023-10-03 05:02:11,830 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_372-173012-0_sp1.1 from training. Duration: 0.863625 2023-10-03 05:02:11,888 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_57-82205-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 05:02:13,701 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_3691_210_3528_1_1526468169_4062053_355-334432-0_sp0.9 from training. Duration: 0.9666875 2023-10-03 05:02:15,111 WARNING [train.py:1204] (2/4) Exclude cut with ID _19023_210_4353_1_1531378892552_5820780_28-36615-0 from training. Duration: 0.97 2023-10-03 05:02:15,116 WARNING [train.py:1204] (2/4) Exclude cut with ID _29853_210_7497_1_1532505600851_6642110_286-251333-0_sp1.1 from training. Duration: 0.92725 2023-10-03 05:02:15,214 WARNING [train.py:1204] (2/4) Exclude cut with ID _31602_210_5854_1_1532677021456_4458250_518-80679-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 05:02:17,073 WARNING [train.py:1204] (2/4) Exclude cut with ID _14922_210_6228_1_1530707435273_3693310_38-187851-0_sp0.9 from training. Duration: 0.688875 2023-10-03 05:02:17,332 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.balancer2.prob, batch_count=1146986.6666666667, ans=0.125 2023-10-03 05:02:18,685 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.conv_module2.balancer1.prob, batch_count=1146986.6666666667, ans=0.125 2023-10-03 05:02:27,878 WARNING [train.py:1204] (2/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_39-248870-0_sp1.1 from training. Duration: 0.62725 2023-10-03 05:02:27,901 WARNING [train.py:1204] (2/4) Exclude cut with ID _16908_210_5724_1_1531119157248_3772700_154-31455-0_sp1.1 from training. Duration: 0.97275 2023-10-03 05:02:29,366 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8453_210_4211_1_1528502940_8562730_347-330673-0 from training. Duration: 0.72 2023-10-03 05:02:30,804 WARNING [train.py:1204] (2/4) Exclude cut with ID _30191_210_1664_1_1533189915047_7196670_674-204247-0_sp1.1 from training. Duration: 0.97275 2023-10-03 05:02:32,142 WARNING [train.py:1204] (2/4) Exclude cut with ID _8833_210_5219_1_1528592665210_7553059_329-125156-0 from training. Duration: 0.82 2023-10-03 05:02:32,180 WARNING [train.py:1204] (2/4) Exclude cut with ID _4779_210_2084_1_1527318022944_3914893_500-285013-0_sp1.1 from training. Duration: 0.62725 2023-10-03 05:02:35,427 WARNING [train.py:1204] (2/4) Exclude cut with ID _14421_210_8750_1_1530604795825_3542290_190-313578-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 05:02:38,602 WARNING [train.py:1204] (2/4) Exclude cut with ID _35799_210_5854_1_1533263487887_3529071_538-273574-0_sp1.1 from training. Duration: 0.936375 2023-10-03 05:02:39,996 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_14928_210_5632_1_1530773461_4714000_454-112198-0_sp1.1 from training. Duration: 0.6 2023-10-03 05:02:40,039 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_468-143126-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 05:02:41,542 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_4528_210_3273_1_1527076654_3276777_258-121681-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 05:02:43,352 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_397-132565-0_sp1.1 from training. Duration: 0.9 2023-10-03 05:02:43,371 WARNING [train.py:1204] (2/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_454-123002-0_sp0.9 from training. Duration: 0.7666875 2023-10-03 05:02:46,165 WARNING [train.py:1204] (2/4) Exclude cut with ID _27052_210_5986_1_1532329197187_3585009_297-86890-0_sp1.1 from training. Duration: 0.936375 2023-10-03 05:02:48,084 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_51225_210_3601_1_1534400634_2327383_55-301844-0_sp1.1 from training. Duration: 0.8454375 2023-10-03 05:02:49,527 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6786_210_1388_1_1527839699_4194495_378-46070-0_sp0.9 from training. Duration: 0.8 2023-10-03 05:02:50,970 WARNING [train.py:1204] (2/4) Exclude cut with ID _42334_210_7770_1_1534206610396_3605880_55-19758-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 05:02:53,753 WARNING [train.py:1204] (2/4) Exclude cut with ID _14884_210_2252_1_1530707459689_5561250_114-201399-0_sp0.9 from training. Duration: 0.8444375 2023-10-03 05:02:58,020 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_99-251314-0_sp0.9 from training. Duration: 0.9666875 2023-10-03 05:02:59,367 WARNING [train.py:1204] (2/4) Exclude cut with ID _26528_210_9070_1_1532570410842_3851310_497-97218-0 from training. Duration: 0.86 2023-10-03 05:03:01,007 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.bypass.scale_min, batch_count=1147186.6666666667, ans=0.2 2023-10-03 05:03:01,065 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=1147186.6666666667, ans=0.1 2023-10-03 05:03:02,318 WARNING [train.py:1204] (2/4) Exclude cut with ID _55328_210_27767_1_1534580973260_4088170_230-317241-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 05:03:04,307 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_125-203105-0_sp0.9 from training. Duration: 0.888875 2023-10-03 05:03:05,722 WARNING [train.py:1204] (2/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_55-31904-0_sp1.1 from training. Duration: 0.8 2023-10-03 05:03:07,270 WARNING [train.py:1204] (2/4) Exclude cut with ID _8064_210_5399_1_1528372299291_4282973_18-174647-0 from training. Duration: 0.91 2023-10-03 05:03:10,472 WARNING [train.py:1204] (2/4) Exclude cut with ID IC0165W0005-81726 from training. Duration: 0.952 2023-10-03 05:03:10,472 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_31583_210_3852_1_1532595851_4024413_148-171847-0_sp1.1 from training. Duration: 0.963625 2023-10-03 05:03:10,520 WARNING [train.py:1204] (2/4) Exclude cut with ID _21989_210_5854_1_1531899214931_3271360_116-138769-0_sp1.1 from training. Duration: 0.936375 2023-10-03 05:03:11,902 WARNING [train.py:1204] (2/4) Exclude cut with ID _66070_210_7640_1_1535441393096_7805310_489-292631-0_sp1.1 from training. Duration: 0.7181875 2023-10-03 05:03:13,690 INFO [train.py:1046] (2/4) Epoch 33, batch 2100, loss[loss=0.1612, simple_loss=0.2466, pruned_loss=0.03792, over 24684.00 frames. ], tot_loss[loss=0.1613, simple_loss=0.2401, pruned_loss=0.04127, over 4708767.68 frames. ], batch size: 65, lr: 3.10e-03, grad_scale: 16.0 2023-10-03 05:03:13,771 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_20120_210_6753_1_1531643035_5482859_202-27960-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 05:03:13,786 WARNING [train.py:1204] (2/4) Exclude cut with ID _5294_210_1747_1_1527249643928_3696179_342-270278-0 from training. Duration: 0.55 2023-10-03 05:03:15,161 WARNING [train.py:1204] (2/4) Exclude cut with ID _38323_210_12108_1_1533121233369_3303049_416-11919-0 from training. Duration: 0.96 2023-10-03 05:03:16,542 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6545_210_2040_1_1527774670_4245258_407-48070-0_sp0.9 from training. Duration: 0.8444375 2023-10-03 05:03:19,306 WARNING [train.py:1204] (2/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_503-30719-0_sp1.1 from training. Duration: 0.8909375 2023-10-03 05:03:19,369 WARNING [train.py:1204] (2/4) Exclude cut with ID _49643_210_7989_1_1534640244224_3939031_234-326497-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 05:03:22,171 WARNING [train.py:1204] (2/4) Exclude cut with ID _14170_210_5219_1_1530525949868_4469650_301-77873-0_sp1.1 from training. Duration: 0.963625 2023-10-03 05:03:23,062 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.1.encoder.layers.0.feed_forward3.out_whiten, num_groups=1, num_channels=256, metric=12.48 vs. limit=15.0 2023-10-03 05:03:23,604 WARNING [train.py:1204] (2/4) Exclude cut with ID _45297_210_2721_1_1533715351381_3264949_152-154697-0_sp1.1 from training. Duration: 0.97275 2023-10-03 05:03:23,619 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_3371_210_1794_1_1526384462_5580777_396-339812-0 from training. Duration: 0.85 2023-10-03 05:03:25,022 WARNING [train.py:1204] (2/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_399-156791-0_sp1.1 from training. Duration: 0.7545625 2023-10-03 05:03:25,068 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7696_210_2040_1_1528207167_3716099_117-125434-0 from training. Duration: 0.76 2023-10-03 05:03:25,079 WARNING [train.py:1204] (2/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_96-244299-0 from training. Duration: 0.88 2023-10-03 05:03:27,838 WARNING [train.py:1204] (2/4) Exclude cut with ID _37892_210_19596_1_1533198677897_3964058_519-343462-0_sp1.1 from training. Duration: 0.92725 2023-10-03 05:03:27,868 WARNING [train.py:1204] (2/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_202-183922-0_sp1.1 from training. Duration: 0.77275 2023-10-03 05:03:27,874 WARNING [train.py:1204] (2/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_453-133983-0 from training. Duration: 0.78 2023-10-03 05:03:27,904 WARNING [train.py:1204] (2/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_178-39722-0_sp1.1 from training. Duration: 0.4454375 2023-10-03 05:03:35,239 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_15126_210_6753_1_1530778677_2591185_113-157651-0 from training. Duration: 0.68 2023-10-03 05:03:35,239 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_271-234962-0_sp1.1 from training. Duration: 0.7181875 2023-10-03 05:03:38,433 WARNING [train.py:1204] (2/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_195-64967-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 05:03:38,470 WARNING [train.py:1204] (2/4) Exclude cut with ID _43552_210_8913_1_1533726031059_3523429_365-215065-0_sp1.1 from training. Duration: 0.936375 2023-10-03 05:03:38,684 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.1.conv_module2.balancer1.prob, batch_count=1147320.0, ans=0.125 2023-10-03 05:03:40,028 WARNING [train.py:1204] (2/4) Exclude cut with ID _31602_210_5854_1_1532677021456_4458250_249-96604-0_sp0.9 from training. Duration: 0.988875 2023-10-03 05:03:41,512 WARNING [train.py:1204] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_307-158462-0 from training. Duration: 0.69 2023-10-03 05:03:43,300 WARNING [train.py:1204] (2/4) Exclude cut with ID _30918_210_6400_1_1532754078600_3125920_407-223394-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 05:03:43,303 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6673_210_3273_1_1527767790_3173240_351-167954-0_sp0.9 from training. Duration: 0.5333125 2023-10-03 05:03:44,695 WARNING [train.py:1204] (2/4) Exclude cut with ID _4866_210_2577_1_1527404412284_4364659_256-306830-0 from training. Duration: 0.87 2023-10-03 05:03:46,524 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_17613_210_2151_1_1531137569_2366160_222-131118-0_sp1.1 from training. Duration: 0.963625 2023-10-03 05:03:46,536 WARNING [train.py:1204] (2/4) Exclude cut with ID _45560_210_21982_1_1533812603951_3584999_111-6376-0 from training. Duration: 0.84 2023-10-03 05:03:46,573 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_216-73841-0 from training. Duration: 0.96 2023-10-03 05:03:47,945 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_14058_210_4163_1_1530690953_6327180_49-210687-0 from training. Duration: 0.94 2023-10-03 05:03:49,385 WARNING [train.py:1204] (2/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_506-42614-0_sp0.9 from training. Duration: 0.87775 2023-10-03 05:03:49,513 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_3133_210_2087_1_1526106446_2324690_25-332138-0_sp1.1 from training. Duration: 0.82725 2023-10-03 05:03:49,675 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.conv_module2.balancer1.prob, batch_count=1147386.6666666667, ans=0.125 2023-10-03 05:03:52,232 WARNING [train.py:1204] (2/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_589-9070-0_sp1.1 from training. Duration: 0.6454375 2023-10-03 05:03:53,621 WARNING [train.py:1204] (2/4) Exclude cut with ID _6194_210_1534_1_1527511826160_3941194_14-282876-0_sp1.1 from training. Duration: 0.6454375 2023-10-03 05:03:55,019 WARNING [train.py:1204] (2/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_1166-90654-0_sp1.1 from training. Duration: 0.92725 2023-10-03 05:03:56,385 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_14056_210_4163_1_1530604483_7572827_218-255858-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 05:03:56,387 WARNING [train.py:1204] (2/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_1285-256622-0 from training. Duration: 0.84 2023-10-03 05:03:56,415 WARNING [train.py:1204] (2/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_205-286261-0_sp1.1 from training. Duration: 0.963625 2023-10-03 05:03:56,429 WARNING [train.py:1204] (2/4) Exclude cut with ID _68777_210_4353_1_1535810390731_3117904_6-154119-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 05:03:57,786 WARNING [train.py:1204] (2/4) Exclude cut with ID _22982_210_1883_1_1531882790447_5449409_565-199393-0_sp1.1 from training. Duration: 0.92725 2023-10-03 05:03:57,796 WARNING [train.py:1204] (2/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_278-154492-0 from training. Duration: 0.76 2023-10-03 05:03:59,135 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_426-313557-0 from training. Duration: 0.53 2023-10-03 05:04:01,073 WARNING [train.py:1204] (2/4) Exclude cut with ID _41817_210_18835_1_1533434477160_7423049_555-293448-0 from training. Duration: 0.8 2023-10-03 05:04:04,085 WARNING [train.py:1204] (2/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_32-335103-0_sp1.1 from training. Duration: 0.7454375 2023-10-03 05:04:07,481 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_2655_210_2087_1_1525688959_3897400_202-147557-0_sp1.1 from training. Duration: 0.77275 2023-10-03 05:04:07,506 WARNING [train.py:1204] (2/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_158-250116-0 from training. Duration: 0.62 2023-10-03 05:04:13,409 WARNING [train.py:1204] (2/4) Exclude cut with ID _55429_210_10626_1_1534467588527_7174143_528-88083-0_sp1.1 from training. Duration: 0.963625 2023-10-03 05:04:14,961 WARNING [train.py:1204] (2/4) Exclude cut with ID _8030_210_7497_1_1528444886781_3531089_32-31837-0_sp1.1 from training. Duration: 0.8818125 2023-10-03 05:04:14,989 WARNING [train.py:1204] (2/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_744-86168-0_sp1.1 from training. Duration: 0.97275 2023-10-03 05:04:14,998 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1113-7369-0_sp0.9 from training. Duration: 0.9444375 2023-10-03 05:04:15,018 WARNING [train.py:1204] (2/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_124-345840-0_sp0.9 from training. Duration: 0.57775 2023-10-03 05:04:16,814 WARNING [train.py:1204] (2/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_328-264776-0_sp0.9 from training. Duration: 0.7444375 2023-10-03 05:04:18,262 WARNING [train.py:1204] (2/4) Exclude cut with ID _18895_210_6228_1_1531357274234_3784930_469-166615-0_sp1.1 from training. Duration: 0.963625 2023-10-03 05:04:18,269 WARNING [train.py:1204] (2/4) Exclude cut with ID _8395_210_1531_1_1528597871523_4411579_431-132470-0_sp0.9 from training. Duration: 0.82225 2023-10-03 05:04:20,849 WARNING [train.py:1204] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_554-45617-0_sp1.1 from training. Duration: 0.9 2023-10-03 05:04:20,879 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_35361_210_4238_1_1533262810_7793476_524-188242-0_sp1.1 from training. Duration: 0.92725 2023-10-03 05:04:22,282 WARNING [train.py:1204] (2/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_938-205135-0 from training. Duration: 0.87 2023-10-03 05:04:23,662 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_5931_210_2040_1_1527428481_4547999_321-193614-0 from training. Duration: 0.67 2023-10-03 05:04:23,675 WARNING [train.py:1204] (2/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_445-269521-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 05:04:25,123 WARNING [train.py:1204] (2/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_3-33116-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 05:04:25,124 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_4633_210_2040_1_1526824163_4145343_416-76117-0_sp1.1 from training. Duration: 0.863625 2023-10-03 05:04:26,438 WARNING [train.py:1204] (2/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_1033-274376-0_sp1.1 from training. Duration: 0.7909375 2023-10-03 05:04:26,477 WARNING [train.py:1204] (2/4) Exclude cut with ID _8772_210_1850_1_1528635595972_3664011_398-320049-0_sp0.9 from training. Duration: 0.97775 2023-10-03 05:04:27,906 INFO [train.py:1046] (2/4) Epoch 33, batch 2150, loss[loss=0.1541, simple_loss=0.2315, pruned_loss=0.03836, over 18272.00 frames. ], tot_loss[loss=0.1606, simple_loss=0.2389, pruned_loss=0.04115, over 4698181.13 frames. ], batch size: 39, lr: 3.10e-03, grad_scale: 8.0 2023-10-03 05:04:31,580 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.1.bypass.skip_rate, batch_count=1147586.6666666667, ans=0.035 2023-10-03 05:04:32,647 WARNING [train.py:1204] (2/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_321-215742-0_sp0.9 from training. Duration: 0.5555625 2023-10-03 05:04:33,937 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.520e+02 1.872e+02 2.028e+02 2.280e+02 3.324e+02, threshold=4.056e+02, percent-clipped=0.0 2023-10-03 05:04:34,011 WARNING [train.py:1204] (2/4) Exclude cut with ID _28394_210_8913_1_1532430049518_3650118_272-190950-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 05:04:35,912 WARNING [train.py:1204] (2/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_31-299371-0_sp1.1 from training. Duration: 0.92725 2023-10-03 05:04:37,368 WARNING [train.py:1204] (2/4) Exclude cut with ID _55383_210_10635_1_1534468426172_2231669_134-128246-0_sp0.9 from training. Duration: 0.988875 2023-10-03 05:04:37,371 WARNING [train.py:1204] (2/4) Exclude cut with ID _40519_210_13794_1_1533715228655_7337390_887-9547-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 05:04:38,756 WARNING [train.py:1204] (2/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_310-109461-0_sp0.9 from training. Duration: 0.888875 2023-10-03 05:04:40,284 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_2009_210_2071_1_1525481134_4588937_258-250196-0_sp1.1 from training. Duration: 0.92725 2023-10-03 05:04:40,346 WARNING [train.py:1204] (2/4) Exclude cut with ID _9235_210_5219_1_1528891154253_8046120_1186-72702-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 05:04:40,349 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_14831_210_7927_1_1530700200_5583610_181-27467-0_sp1.1 from training. Duration: 0.82725 2023-10-03 05:04:44,785 WARNING [train.py:1204] (2/4) Exclude cut with ID _40402_210_18219_1_1533522324115_2953150_278-132109-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 05:04:44,819 WARNING [train.py:1204] (2/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_261-297503-0 from training. Duration: 0.99 2023-10-03 05:04:48,708 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.3.encoder.layers.0.feed_forward3.out_whiten, num_groups=1, num_channels=512, metric=14.32 vs. limit=15.0 2023-10-03 05:04:49,480 WARNING [train.py:1204] (2/4) Exclude cut with ID _19896_210_1883_1_1531709022662_6649569_313-265903-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 05:04:50,821 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_105-114797-0_sp1.1 from training. Duration: 0.763625 2023-10-03 05:04:52,192 WARNING [train.py:1204] (2/4) Exclude cut with ID _53755_210_8254_1_1534577312941_4707360_9-65843-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 05:04:52,227 WARNING [train.py:1204] (2/4) Exclude cut with ID _18516_210_9206_1_1531704653206_3692406_361-224549-0_sp1.1 from training. Duration: 0.97275 2023-10-03 05:04:52,256 WARNING [train.py:1204] (2/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_403-310975-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 05:04:52,419 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.feed_forward1.hidden_balancer.prob, batch_count=1147653.3333333333, ans=0.125 2023-10-03 05:04:53,485 WARNING [train.py:1204] (2/4) Exclude cut with ID _5444_210_2151_1_1527418808464_5312210_581-315411-0_sp0.9 from training. Duration: 0.688875 2023-10-03 05:04:53,529 WARNING [train.py:1204] (2/4) Exclude cut with ID _23825_210_13449_1_1531915313526_3497150_464-154904-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 05:04:53,545 WARNING [train.py:1204] (2/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_937-208281-0_sp0.9 from training. Duration: 0.9444375 2023-10-03 05:04:54,821 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_37839_210_7234_1_1533542462_3689303_158-181212-0_sp1.1 from training. Duration: 0.963625 2023-10-03 05:04:54,897 WARNING [train.py:1204] (2/4) Exclude cut with ID _8772_210_1850_1_1528635595972_3664011_398-320049-0 from training. Duration: 0.88 2023-10-03 05:04:56,304 WARNING [train.py:1204] (2/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_718-250907-0_sp1.1 from training. Duration: 0.62725 2023-10-03 05:04:57,681 WARNING [train.py:1204] (2/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_641-126395-0_sp1.1 from training. Duration: 0.92725 2023-10-03 05:04:58,998 WARNING [train.py:1204] (2/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_166-241493-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 05:05:00,075 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.5.encoder.layers.0.nonlin_attention.whiten2, num_groups=1, num_channels=256, metric=3.42 vs. limit=15.0 2023-10-03 05:05:00,775 WARNING [train.py:1204] (2/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_526-62079-0_sp0.9 from training. Duration: 0.7444375 2023-10-03 05:05:02,190 WARNING [train.py:1204] (2/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_439-275083-0_sp1.1 from training. Duration: 0.936375 2023-10-03 05:05:05,094 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_66907_210_21581_1_1535615661_3590177_493-229032-0_sp1.1 from training. Duration: 0.92725 2023-10-03 05:05:06,771 WARNING [train.py:1204] (2/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_89-321955-0_sp1.1 from training. Duration: 0.72725 2023-10-03 05:05:07,002 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.conv_module2.balancer2.prob, batch_count=1147720.0, ans=0.125 2023-10-03 05:05:08,127 WARNING [train.py:1204] (2/4) Exclude cut with ID _29857_210_20034_1_1532395886753_3798160_273-226195-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 05:05:08,134 WARNING [train.py:1204] (2/4) Exclude cut with ID _22826_210_9994_1_1531908201522_3279169_404-75889-0 from training. Duration: 0.89 2023-10-03 05:05:08,160 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_271-234962-0_sp0.9 from training. Duration: 0.87775 2023-10-03 05:05:09,656 WARNING [train.py:1204] (2/4) Exclude cut with ID _22961_210_8254_1_1531976366672_4278180_156-339685-0_sp1.1 from training. Duration: 0.97275 2023-10-03 05:05:10,987 WARNING [train.py:1204] (2/4) Exclude cut with ID _37892_210_19596_1_1533198677897_3964058_425-231927-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 05:05:12,363 WARNING [train.py:1204] (2/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_102-288670-0_sp1.1 from training. Duration: 0.97275 2023-10-03 05:05:12,451 WARNING [train.py:1204] (2/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_283-220372-0_sp1.1 from training. Duration: 0.5090625 2023-10-03 05:05:13,804 WARNING [train.py:1204] (2/4) Exclude cut with ID _44304_210_15374_1_1533639352765_4613129_86-1699-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 05:05:15,718 WARNING [train.py:1204] (2/4) Exclude cut with ID _2891_210_2550_1_1525874398521_4427710_327-121590-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 05:05:15,725 WARNING [train.py:1204] (2/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_369-222060-0 from training. Duration: 0.71 2023-10-03 05:05:17,811 WARNING [train.py:1204] (2/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_762-109886-0 from training. Duration: 0.91 2023-10-03 05:05:17,834 WARNING [train.py:1204] (2/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_477-255782-0_sp0.9 from training. Duration: 0.911125 2023-10-03 05:05:19,160 WARNING [train.py:1204] (2/4) Exclude cut with ID IC0669W0061-330906 from training. Duration: 0.768 2023-10-03 05:05:19,211 WARNING [train.py:1204] (2/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_347-213071-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 05:05:19,239 WARNING [train.py:1204] (2/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_507-303370-0_sp1.1 from training. Duration: 0.87275 2023-10-03 05:05:20,632 WARNING [train.py:1204] (2/4) Exclude cut with ID _15012_210_1968_1_1530856820429_5095350_688-178849-0 from training. Duration: 0.64 2023-10-03 05:05:20,637 WARNING [train.py:1204] (2/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_28-70987-0_sp0.9 from training. Duration: 0.9666875 2023-10-03 05:05:20,640 WARNING [train.py:1204] (2/4) Exclude cut with ID _5910_210_1389_1_1527337972454_5050100_555-159815-0 from training. Duration: 0.98 2023-10-03 05:05:20,667 WARNING [train.py:1204] (2/4) Exclude cut with ID IC0679W0064-335871 from training. Duration: 0.768 2023-10-03 05:05:20,668 WARNING [train.py:1204] (2/4) Exclude cut with ID IC0679W0079-335886 from training. Duration: 0.896 2023-10-03 05:05:20,701 WARNING [train.py:1204] (2/4) Exclude cut with ID _5608_210_2106_1_1527415439089_6819768_909-82767-0 from training. Duration: 0.61 2023-10-03 05:05:22,141 WARNING [train.py:1204] (2/4) Exclude cut with ID _23038_210_9570_1_1532001584158_3652419_411-323967-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 05:05:22,211 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_40896_210_15710_1_1533433956_7100802_350-231748-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 05:05:23,599 WARNING [train.py:1204] (2/4) Exclude cut with ID _5289_210_2084_1_1527678202736_3779299_142-103317-0_sp0.9 from training. Duration: 0.9333125 2023-10-03 05:05:24,926 WARNING [train.py:1204] (2/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_314-144055-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 05:05:24,985 WARNING [train.py:1204] (2/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_177-236809-0_sp0.9 from training. Duration: 0.5444375 2023-10-03 05:05:26,377 WARNING [train.py:1204] (2/4) Exclude cut with ID _23038_210_9570_1_1532001584158_3652419_82-334733-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 05:05:26,385 WARNING [train.py:1204] (2/4) Exclude cut with ID _43552_210_8913_1_1533726031059_3523429_225-22526-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 05:05:26,593 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.conv_module1.balancer2.prob, batch_count=1147853.3333333333, ans=0.125 2023-10-03 05:05:37,117 WARNING [train.py:1204] (2/4) Exclude cut with ID _30327_210_16049_1_1532484161295_3924169_401-70819-0_sp1.1 from training. Duration: 0.7818125 2023-10-03 05:05:37,170 WARNING [train.py:1204] (2/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_538-32291-0 from training. Duration: 0.83 2023-10-03 05:05:37,886 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.3.encoder.layers.0.feed_forward1.out_whiten, num_groups=1, num_channels=512, metric=7.94 vs. limit=15.0 2023-10-03 05:05:41,297 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_37357_210_17024_1_1533089504_2316121_258-211605-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 05:05:42,625 INFO [train.py:1046] (2/4) Epoch 33, batch 2200, loss[loss=0.1706, simple_loss=0.2447, pruned_loss=0.04823, over 23705.00 frames. ], tot_loss[loss=0.161, simple_loss=0.2395, pruned_loss=0.04127, over 4709244.44 frames. ], batch size: 232, lr: 3.10e-03, grad_scale: 8.0 2023-10-03 05:05:44,331 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=1147920.0, ans=0.1 2023-10-03 05:05:45,356 WARNING [train.py:1204] (2/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_550-335198-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 05:05:47,144 WARNING [train.py:1204] (2/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_98-741-0_sp0.9 from training. Duration: 0.888875 2023-10-03 05:05:47,194 WARNING [train.py:1204] (2/4) Exclude cut with ID _38323_210_12108_1_1533121233369_3303049_251-238988-0_sp1.1 from training. Duration: 0.92725 2023-10-03 05:05:48,567 WARNING [train.py:1204] (2/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_259-34038-0_sp0.9 from training. Duration: 0.82225 2023-10-03 05:05:51,877 WARNING [train.py:1204] (2/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_1060-180633-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 05:05:51,914 WARNING [train.py:1204] (2/4) Exclude cut with ID _8755_210_1876_1_1528628559357_3258209_211-37473-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 05:05:51,919 WARNING [train.py:1204] (2/4) Exclude cut with ID _4122_210_2084_1_1526713164417_4353280_528-206771-0 from training. Duration: 0.88 2023-10-03 05:05:58,761 WARNING [train.py:1204] (2/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_64-248359-0 from training. Duration: 0.69 2023-10-03 05:06:00,493 WARNING [train.py:1204] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_402-253027-0_sp1.1 from training. Duration: 0.4909375 2023-10-03 05:06:04,770 WARNING [train.py:1204] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_625-102955-0 from training. Duration: 0.77 2023-10-03 05:06:08,459 WARNING [train.py:1204] (2/4) Exclude cut with ID _14998_210_5219_1_1530705582080_7474970_611-219324-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 05:06:08,531 WARNING [train.py:1204] (2/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_718-68505-0_sp0.9 from training. Duration: 0.988875 2023-10-03 05:06:09,970 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_353-182855-0_sp1.1 from training. Duration: 0.87275 2023-10-03 05:06:13,971 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_5960_210_1794_1_1527403865_3990999_515-2613-0_sp0.9 from training. Duration: 0.97775 2023-10-03 05:06:14,001 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_5575_210_1410_1_1527301305_2570170_342-265572-0 from training. Duration: 0.94 2023-10-03 05:06:16,957 WARNING [train.py:1204] (2/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_329-113173-0_sp0.9 from training. Duration: 0.688875 2023-10-03 05:06:18,697 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_15396_210_6890_1_1531308473_1938936_213-312506-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 05:06:20,045 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_521-214276-0_sp1.1 from training. Duration: 0.4 2023-10-03 05:06:22,814 WARNING [train.py:1204] (2/4) Exclude cut with ID _15026_210_8750_1_1530777602316_3291850_211-195043-0_sp1.1 from training. Duration: 0.72725 2023-10-03 05:06:24,649 WARNING [train.py:1204] (2/4) Exclude cut with ID _29343_210_4381_1_1532602757752_3674109_108-257494-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 05:06:26,065 WARNING [train.py:1204] (2/4) Exclude cut with ID _64414_210_17301_1_1535243252048_3757759_403-58151-0_sp1.1 from training. Duration: 0.963625 2023-10-03 05:06:27,487 WARNING [train.py:1204] (2/4) Exclude cut with ID _36512_210_6987_1_1533033143998_3126069_109-177787-0_sp1.1 from training. Duration: 0.92725 2023-10-03 05:06:28,834 WARNING [train.py:1204] (2/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_498-40153-0 from training. Duration: 0.63 2023-10-03 05:06:30,154 WARNING [train.py:1204] (2/4) Exclude cut with ID _56447_210_1389_1_1535074245910_3697189_161-334523-0_sp1.1 from training. Duration: 0.936375 2023-10-03 05:06:31,558 WARNING [train.py:1204] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_238-245655-0 from training. Duration: 0.81 2023-10-03 05:06:34,325 WARNING [train.py:1204] (2/4) Exclude cut with ID _64631_210_27767_1_1535349580068_4165160_108-43865-0_sp1.1 from training. Duration: 0.92725 2023-10-03 05:06:34,339 WARNING [train.py:1204] (2/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_243-42116-0_sp1.1 from training. Duration: 0.663625 2023-10-03 05:06:34,372 WARNING [train.py:1204] (2/4) Exclude cut with ID _66868_210_9527_1_1535509287954_4561140_594-132160-0_sp1.1 from training. Duration: 0.92725 2023-10-03 05:06:37,434 WARNING [train.py:1204] (2/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_48-338976-0_sp0.9 from training. Duration: 0.988875 2023-10-03 05:06:37,481 WARNING [train.py:1204] (2/4) Exclude cut with ID _5250_210_5219_1_1527249561806_8692110_137-100595-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 05:06:37,727 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.bypass.scale_min, batch_count=1148120.0, ans=0.2 2023-10-03 05:06:38,839 WARNING [train.py:1204] (2/4) Exclude cut with ID _49748_210_24116_1_1533986971737_3158910_12-271669-0_sp1.1 from training. Duration: 0.936375 2023-10-03 05:06:38,866 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_37357_210_17024_1_1533089504_2316121_206-80421-0_sp1.1 from training. Duration: 0.936375 2023-10-03 05:06:40,220 WARNING [train.py:1204] (2/4) Exclude cut with ID _7537_210_2084_1_1528502395431_3924359_113-199933-0_sp1.1 from training. Duration: 0.536375 2023-10-03 05:06:41,537 WARNING [train.py:1204] (2/4) Exclude cut with ID _55440_210_12651_1_1534496713364_2980520_31-59289-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 05:06:43,627 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1083-220122-0_sp0.9 from training. Duration: 0.5666875 2023-10-03 05:06:46,474 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_426-313557-0_sp1.1 from training. Duration: 0.4818125 2023-10-03 05:06:46,528 WARNING [train.py:1204] (2/4) Exclude cut with ID _35799_210_5854_1_1533263487887_3529071_324-94843-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 05:06:47,967 WARNING [train.py:1204] (2/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_264-108685-0_sp1.1 from training. Duration: 0.5 2023-10-03 05:06:49,793 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9012W0006-501141 from training. Duration: 0.896 2023-10-03 05:06:51,176 WARNING [train.py:1204] (2/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_890-5117-0_sp0.9 from training. Duration: 0.8666875 2023-10-03 05:06:52,524 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9023W0008-506301 from training. Duration: 0.896 2023-10-03 05:06:54,347 WARNING [train.py:1204] (2/4) Exclude cut with ID _13916_210_9994_1_1530788341126_3763149_60-191556-0_sp0.9 from training. Duration: 0.67775 2023-10-03 05:06:54,404 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9029W0331-509721 from training. Duration: 0.896 2023-10-03 05:06:55,723 WARNING [train.py:1204] (2/4) Exclude cut with ID _35823_210_6917_1_1533187881914_7297830_567-108596-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 05:06:56,980 INFO [train.py:1046] (2/4) Epoch 33, batch 2250, loss[loss=0.1602, simple_loss=0.2453, pruned_loss=0.0376, over 24440.00 frames. ], tot_loss[loss=0.1612, simple_loss=0.2398, pruned_loss=0.04127, over 4725113.69 frames. ], batch size: 69, lr: 3.09e-03, grad_scale: 8.0 2023-10-03 05:06:57,045 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7543_210_6753_1_1528544010_1110402_25-183860-0_sp1.1 from training. Duration: 0.42725 2023-10-03 05:06:58,432 WARNING [train.py:1204] (2/4) Exclude cut with ID _47278_210_12041_1_1533902418349_3576110_495-260035-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 05:06:59,852 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9051W0015-520281 from training. Duration: 0.9045625 2023-10-03 05:07:01,255 WARNING [train.py:1204] (2/4) Exclude cut with ID _14459_210_2228_1_1531443641946_7516700_707-66193-0_sp1.1 from training. Duration: 0.97275 2023-10-03 05:07:02,536 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.483e+02 1.842e+02 2.039e+02 2.201e+02 2.888e+02, threshold=4.078e+02, percent-clipped=0.0 2023-10-03 05:07:02,694 WARNING [train.py:1204] (2/4) Exclude cut with ID _14087_210_2253_1_1530529224512_3014029_219-224445-0_sp1.1 from training. Duration: 0.763625 2023-10-03 05:07:08,089 WARNING [train.py:1204] (2/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_345-167986-0_sp0.9 from training. Duration: 0.9333125 2023-10-03 05:07:11,291 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_34815_210_6388_1_1533381320_3571903_279-305565-0_sp1.1 from training. Duration: 0.836375 2023-10-03 05:07:12,787 WARNING [train.py:1204] (2/4) Exclude cut with ID _21987_210_5854_1_1531821500482_3637038_317-179212-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 05:07:14,752 WARNING [train.py:1204] (2/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_395-37222-0_sp1.1 from training. Duration: 0.6909375 2023-10-03 05:07:16,120 WARNING [train.py:1204] (2/4) Exclude cut with ID _8064_210_5399_1_1528372299291_4282973_168-50337-0_sp1.1 from training. Duration: 0.763625 2023-10-03 05:07:18,895 WARNING [train.py:1204] (2/4) Exclude cut with ID _60113_210_16271_1_1535164212481_4386079_559-167191-0 from training. Duration: 0.93 2023-10-03 05:07:18,913 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_47228_210_4983_1_1533780021_3672027_344-179642-0_sp1.1 from training. Duration: 0.92725 2023-10-03 05:07:18,949 WARNING [train.py:1204] (2/4) Exclude cut with ID _22961_210_8254_1_1531976366672_4278180_317-199546-0_sp0.9 from training. Duration: 0.9555625 2023-10-03 05:07:21,997 WARNING [train.py:1204] (2/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_141-89727-0 from training. Duration: 0.71 2023-10-03 05:07:23,309 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_78-116075-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 05:07:23,326 WARNING [train.py:1204] (2/4) Exclude cut with ID _64086_210_7994_1_1535270417019_7435949_52-235756-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 05:07:26,003 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7615_210_4211_1_1528110965_4580160_72-235211-0_sp1.1 from training. Duration: 0.6909375 2023-10-03 05:07:30,802 WARNING [train.py:1204] (2/4) Exclude cut with ID _14069_210_5632_1_1530512692033_4803390_287-27950-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 05:07:32,167 WARNING [train.py:1204] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_74-48717-0_sp0.9 from training. Duration: 0.6444375 2023-10-03 05:07:32,197 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7544_210_6753_1_1528591327_1471426_172-348777-0_sp1.1 from training. Duration: 0.6 2023-10-03 05:07:33,550 WARNING [train.py:1204] (2/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_220-191080-0 from training. Duration: 0.98 2023-10-03 05:07:34,920 WARNING [train.py:1204] (2/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_1081-137641-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 05:07:36,481 WARNING [train.py:1204] (2/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_278-235459-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 05:07:39,384 WARNING [train.py:1204] (2/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_1181-77470-0_sp1.1 from training. Duration: 0.936375 2023-10-03 05:07:41,278 WARNING [train.py:1204] (2/4) Exclude cut with ID _60860_210_8254_1_1534896015875_3557169_198-5531-0_sp1.1 from training. Duration: 0.936375 2023-10-03 05:07:42,634 WARNING [train.py:1204] (2/4) Exclude cut with ID _12369_210_4381_1_1530356078581_4388520_17-289781-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 05:07:42,639 WARNING [train.py:1204] (2/4) Exclude cut with ID _50310_210_15488_1_1534068043345_3641080_146-199752-0_sp1.1 from training. Duration: 0.92725 2023-10-03 05:07:45,479 WARNING [train.py:1204] (2/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_969-213354-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 05:07:46,697 WARNING [train.py:1204] (2/4) Exclude cut with ID _70519_210_4163_1_1535961595994_7458230_66-344156-0_sp1.1 from training. Duration: 0.963625 2023-10-03 05:07:51,378 WARNING [train.py:1204] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_380-204756-0_sp1.1 from training. Duration: 0.7818125 2023-10-03 05:07:54,539 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_128-202104-0_sp0.9 from training. Duration: 0.7 2023-10-03 05:08:00,176 WARNING [train.py:1204] (2/4) Exclude cut with ID _4697_210_2069_1_1526815801953_3823098_51-1049-0_sp0.9 from training. Duration: 0.8333125 2023-10-03 05:08:00,203 WARNING [train.py:1204] (2/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_86-66192-0_sp1.1 from training. Duration: 0.5 2023-10-03 05:08:02,159 WARNING [train.py:1204] (2/4) Exclude cut with ID _14069_210_5632_1_1530512692033_4803390_95-1896-0_sp1.1 from training. Duration: 0.8909375 2023-10-03 05:08:02,307 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder_embed.conv.2.prob, batch_count=1148520.0, ans=0.125 2023-10-03 05:08:06,316 WARNING [train.py:1204] (2/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_29-293896-0_sp1.1 from training. Duration: 0.5545625 2023-10-03 05:08:08,990 WARNING [train.py:1204] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_167-137255-0_sp0.9 from training. Duration: 0.711125 2023-10-03 05:08:08,991 WARNING [train.py:1204] (2/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_217-95056-0 from training. Duration: 0.98 2023-10-03 05:08:09,006 WARNING [train.py:1204] (2/4) Exclude cut with ID _47278_210_12041_1_1533902418349_3576110_97-266746-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 05:08:09,053 WARNING [train.py:1204] (2/4) Exclude cut with ID _14224_210_4381_1_1530529062037_4264010_380-343670-0_sp1.1 from training. Duration: 0.8 2023-10-03 05:08:10,334 INFO [train.py:1046] (2/4) Epoch 33, batch 2300, loss[loss=0.135, simple_loss=0.2162, pruned_loss=0.02693, over 24402.00 frames. ], tot_loss[loss=0.1624, simple_loss=0.2411, pruned_loss=0.04189, over 4712718.38 frames. ], batch size: 58, lr: 3.09e-03, grad_scale: 8.0 2023-10-03 05:08:12,539 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.2.encoder.layers.1.self_attn2.whiten, num_groups=1, num_channels=384, metric=17.59 vs. limit=22.5 2023-10-03 05:08:13,133 WARNING [train.py:1204] (2/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_107-332398-0 from training. Duration: 0.78 2023-10-03 05:08:15,268 WARNING [train.py:1204] (2/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_91-267696-0_sp1.1 from training. Duration: 0.7909375 2023-10-03 05:08:15,294 WARNING [train.py:1204] (2/4) Exclude cut with ID _41298_210_9019_1_1533430371450_4822143_498-186993-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 05:08:21,341 WARNING [train.py:1204] (2/4) Exclude cut with ID _17956_210_4353_1_1531209568125_6979970_54-77610-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 05:08:22,651 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_40047_210_2290_1_1533290519_2374311_257-335162-0_sp1.1 from training. Duration: 0.863625 2023-10-03 05:08:22,798 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9653W0193-675471 from training. Duration: 0.9656875 2023-10-03 05:08:26,068 WARNING [train.py:1204] (2/4) Exclude cut with ID _25248_210_8750_1_1532062825941_5554180_243-240536-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 05:08:34,233 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_32360_210_4198_1_1532689160_3648436_60-51908-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 05:08:34,242 WARNING [train.py:1204] (2/4) Exclude cut with ID _4092_210_2151_1_1526643044449_3135759_227-213972-0_sp0.9 from training. Duration: 0.6 2023-10-03 05:08:34,274 WARNING [train.py:1204] (2/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_392-173500-0_sp1.1 from training. Duration: 0.97275 2023-10-03 05:08:36,110 WARNING [train.py:1204] (2/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_71-329150-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 05:08:36,112 WARNING [train.py:1204] (2/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_866-173440-0 from training. Duration: 0.9 2023-10-03 05:08:36,197 WARNING [train.py:1204] (2/4) Exclude cut with ID _14832_210_8750_1_1530691351539_3171348_1-54315-0_sp0.9 from training. Duration: 0.9666875 2023-10-03 05:08:37,662 WARNING [train.py:1204] (2/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_430-116139-0_sp1.1 from training. Duration: 0.77275 2023-10-03 05:08:38,924 WARNING [train.py:1204] (2/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_356-45054-0_sp0.9 from training. Duration: 0.9555625 2023-10-03 05:08:39,129 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.feed_forward1.hidden_balancer.prob, batch_count=1148720.0, ans=0.125 2023-10-03 05:08:41,785 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_3531_210_3528_1_1526294203_5216456_309-227302-0_sp0.9 from training. Duration: 0.7666875 2023-10-03 05:08:43,477 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.conv_module2.balancer2.prob, batch_count=1148720.0, ans=0.125 2023-10-03 05:08:44,511 WARNING [train.py:1204] (2/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_301-330455-0_sp1.1 from training. Duration: 0.72725 2023-10-03 05:08:47,824 WARNING [train.py:1204] (2/4) Exclude cut with ID _23557_210_8750_1_1531890106370_6846255_667-145945-0_sp1.1 from training. Duration: 0.92725 2023-10-03 05:08:51,494 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.bypass.skip_rate, batch_count=1148720.0, ans=0.09899494936611666 2023-10-03 05:08:52,560 WARNING [train.py:1204] (2/4) Exclude cut with ID _14860_210_4915_1_1530702020340_7343240_836-328489-0_sp1.1 from training. Duration: 0.8181875 2023-10-03 05:08:54,474 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_37152_210_9697_1_1533106573_3844188_416-333249-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 05:08:57,191 WARNING [train.py:1204] (2/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_538-32291-0_sp0.9 from training. Duration: 0.92225 2023-10-03 05:08:58,719 WARNING [train.py:1204] (2/4) Exclude cut with ID _32704_210_8954_1_1533017488840_6747370_965-176824-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 05:09:01,518 WARNING [train.py:1204] (2/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_86-19601-0_sp1.1 from training. Duration: 0.77275 2023-10-03 05:09:01,584 WARNING [train.py:1204] (2/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_436-318113-0_sp1.1 from training. Duration: 0.6818125 2023-10-03 05:09:02,874 WARNING [train.py:1204] (2/4) Exclude cut with ID _41817_210_18835_1_1533434477160_7423049_555-293448-0_sp0.9 from training. Duration: 0.888875 2023-10-03 05:09:02,885 WARNING [train.py:1204] (2/4) Exclude cut with ID _4697_210_2069_1_1526815801953_3823098_68-219807-0 from training. Duration: 0.47 2023-10-03 05:09:05,913 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7544_210_6753_1_1528591327_1471426_33-346031-0_sp1.1 from training. Duration: 0.5545625 2023-10-03 05:09:05,915 WARNING [train.py:1204] (2/4) Exclude cut with ID _49748_210_24116_1_1533986971737_3158910_263-52113-0_sp1.1 from training. Duration: 0.97275 2023-10-03 05:09:07,831 WARNING [train.py:1204] (2/4) Exclude cut with ID _64639_210_5816_1_1535367619437_3528000_340-24580-0_sp1.1 from training. Duration: 0.92725 2023-10-03 05:09:07,860 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_47228_210_4983_1_1533780021_3672027_479-114941-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 05:09:09,092 WARNING [train.py:1204] (2/4) Exclude cut with ID _44585_210_18628_1_1533605350387_4518199_506-119659-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 05:09:10,381 WARNING [train.py:1204] (2/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_314-126819-0_sp1.1 from training. Duration: 0.4545625 2023-10-03 05:09:10,384 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_2541_210_1794_1_1525590269_3084765_156-173713-0_sp0.9 from training. Duration: 0.9 2023-10-03 05:09:10,446 WARNING [train.py:1204] (2/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_108-255430-0 from training. Duration: 0.97 2023-10-03 05:09:10,454 WARNING [train.py:1204] (2/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_791-45149-0_sp1.1 from training. Duration: 0.7818125 2023-10-03 05:09:10,459 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_53378_210_17282_1_1534227054_1261425_10-118295-0_sp1.1 from training. Duration: 0.97275 2023-10-03 05:09:11,777 WARNING [train.py:1204] (2/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_148-158509-0 from training. Duration: 0.41 2023-10-03 05:09:12,016 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.nonlin_attention.balancer.prob, batch_count=1148853.3333333333, ans=0.125 2023-10-03 05:09:18,037 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_34726_210_4525_1_1533019474_5030395_687-291305-0_sp1.1 from training. Duration: 0.936375 2023-10-03 05:09:22,031 WARNING [train.py:1204] (2/4) Exclude cut with ID _14860_210_4915_1_1530702020340_7343240_264-324537-0_sp1.1 from training. Duration: 0.963625 2023-10-03 05:09:25,770 INFO [train.py:1046] (2/4) Epoch 33, batch 2350, loss[loss=0.1497, simple_loss=0.231, pruned_loss=0.03416, over 23322.00 frames. ], tot_loss[loss=0.1624, simple_loss=0.2411, pruned_loss=0.04185, over 4710503.77 frames. ], batch size: 119, lr: 3.09e-03, grad_scale: 8.0 2023-10-03 05:09:27,246 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_41251_210_15368_1_1533429767_4820695_591-46386-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 05:09:27,277 WARNING [train.py:1204] (2/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_86-19601-0_sp0.9 from training. Duration: 0.9444375 2023-10-03 05:09:27,296 WARNING [train.py:1204] (2/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_401-255491-0_sp0.9 from training. Duration: 0.77775 2023-10-03 05:09:27,493 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.conv_module1.balancer2.min_abs, batch_count=1148920.0, ans=0.5 2023-10-03 05:09:27,541 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.balancer1.prob, batch_count=1148920.0, ans=0.125 2023-10-03 05:09:28,700 WARNING [train.py:1204] (2/4) Exclude cut with ID _8696_210_1876_1_1528545632440_3345058_209-54069-0_sp1.1 from training. Duration: 0.6090625 2023-10-03 05:09:28,709 WARNING [train.py:1204] (2/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_369-325184-0_sp1.1 from training. Duration: 0.9 2023-10-03 05:09:28,778 WARNING [train.py:1204] (2/4) Exclude cut with ID _14687_210_5854_1_1530703838259_3664889_115-239046-0_sp0.9 from training. Duration: 0.7444375 2023-10-03 05:09:28,815 WARNING [train.py:1204] (2/4) Exclude cut with ID _8064_210_5399_1_1528372299291_4282973_168-50337-0 from training. Duration: 0.84 2023-10-03 05:09:31,607 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.560e+02 1.933e+02 2.128e+02 2.511e+02 4.744e+02, threshold=4.255e+02, percent-clipped=2.0 2023-10-03 05:09:34,601 WARNING [train.py:1204] (2/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_909-64151-0_sp1.1 from training. Duration: 0.8545625 2023-10-03 05:09:34,612 WARNING [train.py:1204] (2/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_597-154640-0 from training. Duration: 0.9 2023-10-03 05:09:37,559 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.bypass.scale_min, batch_count=1148920.0, ans=0.2 2023-10-03 05:09:40,653 WARNING [train.py:1204] (2/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_126-274694-0 from training. Duration: 0.98 2023-10-03 05:09:43,483 WARNING [train.py:1204] (2/4) Exclude cut with ID _21783_210_11357_1_1531742458730_3762679_344-269786-0_sp1.1 from training. Duration: 0.97275 2023-10-03 05:09:46,226 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_37818_210_4238_1_1533620910_4720083_322-253154-0_sp1.1 from training. Duration: 0.92725 2023-10-03 05:09:46,229 WARNING [train.py:1204] (2/4) Exclude cut with ID _58895_210_8750_1_1535184081320_3649089_221-15823-0_sp1.1 from training. Duration: 0.92725 2023-10-03 05:09:46,239 WARNING [train.py:1204] (2/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_9-21737-0_sp1.1 from training. Duration: 0.8818125 2023-10-03 05:09:46,281 WARNING [train.py:1204] (2/4) Exclude cut with ID _14069_210_5632_1_1530512692033_4803390_522-341588-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 05:09:48,245 WARNING [train.py:1204] (2/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_363-185703-0 from training. Duration: 0.95 2023-10-03 05:09:51,364 WARNING [train.py:1204] (2/4) Exclude cut with ID _41817_210_18835_1_1533434477160_7423049_265-76216-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 05:09:54,776 WARNING [train.py:1204] (2/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_841-341679-0 from training. Duration: 0.58 2023-10-03 05:09:55,576 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.1.encoder.layers.1.whiten, num_groups=1, num_channels=256, metric=3.92 vs. limit=12.0 2023-10-03 05:09:57,439 WARNING [train.py:1204] (2/4) Exclude cut with ID _55429_210_10626_1_1534467588527_7174143_97-335084-0_sp1.1 from training. Duration: 0.8818125 2023-10-03 05:10:00,186 WARNING [train.py:1204] (2/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_690-126306-0_sp0.9 from training. Duration: 0.8666875 2023-10-03 05:10:00,195 WARNING [train.py:1204] (2/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_102-92751-0_sp1.1 from training. Duration: 0.9 2023-10-03 05:10:02,970 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_3787_210_2040_1_1526477544_5478586_603-120324-0_sp1.1 from training. Duration: 0.736375 2023-10-03 05:10:04,363 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_33348_210_5632_1_1533086912_3324030_85-28583-0 from training. Duration: 0.82 2023-10-03 05:10:05,694 WARNING [train.py:1204] (2/4) Exclude cut with ID _60057_210_10640_1_1534903065793_3333150_362-230377-0_sp1.1 from training. Duration: 0.7909375 2023-10-03 05:10:07,062 WARNING [train.py:1204] (2/4) Exclude cut with ID _60113_210_16271_1_1535164212481_4386079_194-146486-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 05:10:07,079 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_53753_210_7234_1_1534320003_3594148_171-277668-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 05:10:08,389 WARNING [train.py:1204] (2/4) Exclude cut with ID _34423_210_8750_1_1533430798456_3330199_251-332788-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 05:10:11,690 WARNING [train.py:1204] (2/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_663-166973-0_sp0.9 from training. Duration: 0.888875 2023-10-03 05:10:14,504 WARNING [train.py:1204] (2/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_927-43827-0 from training. Duration: 0.74 2023-10-03 05:10:14,558 WARNING [train.py:1204] (2/4) Exclude cut with ID _28323_210_16187_1_1532480395667_4100300_306-74561-0_sp1.1 from training. Duration: 0.8545625 2023-10-03 05:10:14,917 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.conv_skip_rate, batch_count=1149120.0, ans=0.0 2023-10-03 05:10:16,002 WARNING [train.py:1204] (2/4) Exclude cut with ID _15037_210_2253_1_1530752294104_3267650_139-337989-0_sp1.1 from training. Duration: 0.92725 2023-10-03 05:10:16,024 WARNING [train.py:1204] (2/4) Exclude cut with ID _49039_210_6315_1_1533967537541_3846118_460-263670-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 05:10:19,361 WARNING [train.py:1204] (2/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_186-309161-0 from training. Duration: 0.54 2023-10-03 05:10:19,432 WARNING [train.py:1204] (2/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_87-278686-0_sp1.1 from training. Duration: 0.7 2023-10-03 05:10:19,730 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.feed_forward3.hidden_balancer.prob, batch_count=1149120.0, ans=0.125 2023-10-03 05:10:20,985 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.conv_module2.balancer2.prob, batch_count=1149120.0, ans=0.125 2023-10-03 05:10:22,520 WARNING [train.py:1204] (2/4) Exclude cut with ID _27052_210_5986_1_1532329197187_3585009_314-106499-0 from training. Duration: 0.92 2023-10-03 05:10:22,549 WARNING [train.py:1204] (2/4) Exclude cut with ID _3465_210_3525_1_1526175667060_3574049_412-287526-0_sp0.9 from training. Duration: 0.8 2023-10-03 05:10:27,208 WARNING [train.py:1204] (2/4) Exclude cut with ID _22961_210_8254_1_1531976366672_4278180_317-199546-0 from training. Duration: 0.86 2023-10-03 05:10:31,253 WARNING [train.py:1204] (2/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_381-317791-0 from training. Duration: 0.73 2023-10-03 05:10:31,305 WARNING [train.py:1204] (2/4) Exclude cut with ID _44010_210_4525_1_1533884262964_4013620_560-310411-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 05:10:31,312 WARNING [train.py:1204] (2/4) Exclude cut with ID _5294_210_1747_1_1527249643928_3696179_342-270278-0_sp0.9 from training. Duration: 0.611125 2023-10-03 05:10:31,339 WARNING [train.py:1204] (2/4) Exclude cut with ID ID0473W0002-918951 from training. Duration: 0.924 2023-10-03 05:10:31,356 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1083-220122-0 from training. Duration: 0.51 2023-10-03 05:10:34,219 WARNING [train.py:1204] (2/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_138-154812-0 from training. Duration: 0.78 2023-10-03 05:10:34,391 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.1.feed_forward1.out_proj.dropout_p, batch_count=1149186.6666666667, ans=0.1 2023-10-03 05:10:36,999 WARNING [train.py:1204] (2/4) Exclude cut with ID _44982_210_1511_1_1534312820108_3726029_331-19420-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 05:10:39,595 INFO [train.py:1046] (2/4) Epoch 33, batch 2400, loss[loss=0.1723, simple_loss=0.2537, pruned_loss=0.04549, over 23181.00 frames. ], tot_loss[loss=0.1625, simple_loss=0.241, pruned_loss=0.042, over 4709650.25 frames. ], batch size: 105, lr: 3.09e-03, grad_scale: 16.0 2023-10-03 05:10:41,571 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_2655_210_2087_1_1525688959_3897400_202-147557-0_sp0.9 from training. Duration: 0.9444375 2023-10-03 05:10:45,795 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_23850_210_4238_1_1532232597_1632433_32-185135-0_sp0.9 from training. Duration: 0.9555625 2023-10-03 05:10:46,668 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.1.encoder.layers.1.feed_forward1.out_whiten, num_groups=1, num_channels=256, metric=6.81 vs. limit=15.0 2023-10-03 05:10:47,180 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_14056_210_4163_1_1530604483_7572827_46-93547-0_sp1.1 from training. Duration: 0.863625 2023-10-03 05:10:49,043 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7931_210_1794_1_1528594728_5564059_280-279560-0 from training. Duration: 0.98 2023-10-03 05:10:49,073 WARNING [train.py:1204] (2/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_937-208281-0 from training. Duration: 0.85 2023-10-03 05:10:53,499 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.bypass.scale_min, batch_count=1149320.0, ans=0.2 2023-10-03 05:10:55,223 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_14928_210_5632_1_1530773461_4714000_454-112198-0_sp0.9 from training. Duration: 0.7333125 2023-10-03 05:10:55,224 WARNING [train.py:1204] (2/4) Exclude cut with ID _32704_210_8954_1_1533017488840_6747370_944-153333-0_sp1.1 from training. Duration: 0.963625 2023-10-03 05:10:55,968 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.2.encoder.layers.1.whiten, num_groups=1, num_channels=384, metric=4.58 vs. limit=12.0 2023-10-03 05:10:56,823 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.bypass_mid.scale_min, batch_count=1149320.0, ans=0.2 2023-10-03 05:10:56,862 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.conv_module1.balancer2.prob, batch_count=1149320.0, ans=0.125 2023-10-03 05:10:58,273 WARNING [train.py:1204] (2/4) Exclude cut with ID _14821_210_8750_1_1530680503991_6829359_2-227151-0 from training. Duration: 0.83 2023-10-03 05:10:59,556 WARNING [train.py:1204] (2/4) Exclude cut with ID _28461_210_2253_1_1532771775189_3041299_96-152158-0_sp1.1 from training. Duration: 0.77275 2023-10-03 05:10:59,621 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7837_210_1792_1_1528453445_5841305_654-54195-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 05:11:01,071 WARNING [train.py:1204] (2/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_344-152526-0 from training. Duration: 0.53 2023-10-03 05:11:04,000 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_30167_210_3528_1_1532428012_4772375_480-142230-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 05:11:05,356 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_160-218721-0 from training. Duration: 0.96 2023-10-03 05:11:09,457 WARNING [train.py:1204] (2/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_65-88242-0_sp0.9 from training. Duration: 0.62225 2023-10-03 05:11:11,481 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.bypass.scale_min, batch_count=1149386.6666666667, ans=0.2 2023-10-03 05:11:14,177 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_34886_210_10633_1_1533203882_1755661_76-156096-0 from training. Duration: 0.87 2023-10-03 05:11:16,992 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.nonlin_attention.balancer.prob, batch_count=1149386.6666666667, ans=0.125 2023-10-03 05:11:18,546 WARNING [train.py:1204] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_188-38388-0_sp1.1 from training. Duration: 0.92725 2023-10-03 05:11:18,653 WARNING [train.py:1204] (2/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_81-289335-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 05:11:24,531 WARNING [train.py:1204] (2/4) Exclude cut with ID _59493_210_17282_1_1535078419077_3225879_110-8778-0_sp1.1 from training. Duration: 0.936375 2023-10-03 05:11:24,584 WARNING [train.py:1204] (2/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_188-260011-0 from training. Duration: 0.73 2023-10-03 05:11:25,891 WARNING [train.py:1204] (2/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_453-133983-0_sp0.9 from training. Duration: 0.8666875 2023-10-03 05:11:30,582 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8054_210_7927_1_1528627721_4984094_553-17379-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 05:11:33,338 WARNING [train.py:1204] (2/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_341-171072-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 05:11:36,069 WARNING [train.py:1204] (2/4) Exclude cut with ID _30800_210_22382_1_1532499347904_4515959_265-257286-0_sp1.1 from training. Duration: 0.963625 2023-10-03 05:11:37,537 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_403-39619-0_sp0.9 from training. Duration: 0.9333125 2023-10-03 05:11:37,542 WARNING [train.py:1204] (2/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_464-33931-0_sp0.9 from training. Duration: 0.6 2023-10-03 05:11:37,565 WARNING [train.py:1204] (2/4) Exclude cut with ID _60804_210_9642_1_1534831199859_7268299_632-143863-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 05:11:37,572 WARNING [train.py:1204] (2/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_431-310997-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 05:11:38,965 WARNING [train.py:1204] (2/4) Exclude cut with ID _31649_210_6228_1_1532584608063_4091078_114-263860-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 05:11:38,989 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6961_210_2087_1_1527937168_6815011_562-165518-0_sp0.9 from training. Duration: 0.8555625 2023-10-03 05:11:43,732 WARNING [train.py:1204] (2/4) Exclude cut with ID _27052_210_5986_1_1532329197187_3585009_338-325870-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 05:11:43,780 WARNING [train.py:1204] (2/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_469-94673-0_sp1.1 from training. Duration: 0.7181875 2023-10-03 05:11:43,808 WARNING [train.py:1204] (2/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_259-34038-0 from training. Duration: 0.74 2023-10-03 05:11:46,580 WARNING [train.py:1204] (2/4) Exclude cut with ID _14922_210_6228_1_1530707435273_3693310_388-129552-0 from training. Duration: 0.96 2023-10-03 05:11:48,487 WARNING [train.py:1204] (2/4) Exclude cut with ID _28394_210_8913_1_1532430049518_3650118_342-251455-0_sp1.1 from training. Duration: 0.936375 2023-10-03 05:11:49,827 WARNING [train.py:1204] (2/4) Exclude cut with ID _53731_210_7994_1_1534553979244_3757214_8-191390-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 05:11:49,861 WARNING [train.py:1204] (2/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_613-232856-0 from training. Duration: 0.54 2023-10-03 05:11:49,938 WARNING [train.py:1204] (2/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_557-349534-0 from training. Duration: 0.91 2023-10-03 05:11:51,177 WARNING [train.py:1204] (2/4) Exclude cut with ID _14224_210_4381_1_1530529062037_4264010_379-184223-0 from training. Duration: 0.85 2023-10-03 05:11:51,181 WARNING [train.py:1204] (2/4) Exclude cut with ID IC0125W0001-61807 from training. Duration: 0.966 2023-10-03 05:11:52,643 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_229-45300-0 from training. Duration: 0.57 2023-10-03 05:11:52,715 WARNING [train.py:1204] (2/4) Exclude cut with ID _30304_210_6319_1_1532476827261_7582060_1069-24743-0_sp1.1 from training. Duration: 0.92725 2023-10-03 05:11:54,459 INFO [train.py:1046] (2/4) Epoch 33, batch 2450, loss[loss=0.1561, simple_loss=0.2372, pruned_loss=0.03748, over 24456.00 frames. ], tot_loss[loss=0.1616, simple_loss=0.2396, pruned_loss=0.04179, over 4702704.28 frames. ], batch size: 63, lr: 3.09e-03, grad_scale: 8.0 2023-10-03 05:11:54,520 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_41452_210_7291_1_1533437687_3941281_323-172873-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 05:11:54,539 WARNING [train.py:1204] (2/4) Exclude cut with ID _37086_210_9038_1_1532999540891_7021250_802-226709-0_sp1.1 from training. Duration: 0.97275 2023-10-03 05:11:54,624 WARNING [train.py:1204] (2/4) Exclude cut with ID IC0144W0500-71767 from training. Duration: 0.931875 2023-10-03 05:11:56,411 WARNING [train.py:1204] (2/4) Exclude cut with ID _39846_210_12651_1_1533605310265_2115212_233-129699-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 05:11:57,759 WARNING [train.py:1204] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_574-45541-0_sp0.9 from training. Duration: 0.72225 2023-10-03 05:11:59,237 WARNING [train.py:1204] (2/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_372-298093-0_sp1.1 from training. Duration: 0.72725 2023-10-03 05:11:59,249 WARNING [train.py:1204] (2/4) Exclude cut with ID _62574_210_2655_1_1535180407058_3933249_34-26343-0_sp1.1 from training. Duration: 0.97275 2023-10-03 05:12:00,783 INFO [scaling.py:1118] (2/4) WithLoss: name=encoder.encoders.3.encoder.layers.1.self_attn_weights, loss-sum=0.000e+00 2023-10-03 05:12:00,810 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.self_attn_weights.pos_emb_skip_rate, batch_count=1149586.6666666667, ans=0.0 2023-10-03 05:12:01,673 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.543e+02 1.804e+02 1.981e+02 2.295e+02 3.038e+02, threshold=3.963e+02, percent-clipped=0.0 2023-10-03 05:12:03,262 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_512-256238-0_sp1.1 from training. Duration: 0.963625 2023-10-03 05:12:03,267 WARNING [train.py:1204] (2/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_196-55502-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 05:12:04,583 WARNING [train.py:1204] (2/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_195-167011-0 from training. Duration: 0.75 2023-10-03 05:12:08,714 WARNING [train.py:1204] (2/4) Exclude cut with ID _13678_210_7497_1_1530530069226_3463170_345-230416-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 05:12:08,725 WARNING [train.py:1204] (2/4) Exclude cut with ID _51078_210_9070_1_1534161557424_3805250_225-327809-0_sp1.1 from training. Duration: 0.963625 2023-10-03 05:12:10,428 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.conv_module2.balancer1.prob, batch_count=1149653.3333333333, ans=0.125 2023-10-03 05:12:11,672 WARNING [train.py:1204] (2/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_18-189198-0_sp1.1 from training. Duration: 0.8181875 2023-10-03 05:12:11,683 WARNING [train.py:1204] (2/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_329-82235-0_sp1.1 from training. Duration: 0.7454375 2023-10-03 05:12:11,691 WARNING [train.py:1204] (2/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_726-270780-0_sp0.9 from training. Duration: 0.9555625 2023-10-03 05:12:13,382 WARNING [train.py:1204] (2/4) Exclude cut with ID _66873_210_5517_1_1535454136506_4293370_654-52713-0 from training. Duration: 0.77 2023-10-03 05:12:16,797 WARNING [train.py:1204] (2/4) Exclude cut with ID _20455_210_5219_1_1531565996775_8098100_191-16006-0_sp1.1 from training. Duration: 0.963625 2023-10-03 05:12:18,242 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_3171_210_2087_1_1526109084_2330007_66-260583-0_sp1.1 from training. Duration: 0.6545625 2023-10-03 05:12:18,289 WARNING [train.py:1204] (2/4) Exclude cut with ID _38670_210_15379_1_1533448810659_4391059_63-129006-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 05:12:21,635 WARNING [train.py:1204] (2/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_393-57888-0_sp0.9 from training. Duration: 0.711125 2023-10-03 05:12:21,679 WARNING [train.py:1204] (2/4) Exclude cut with ID _6275_210_2084_1_1528110062213_7669049_857-298942-0_sp0.9 from training. Duration: 0.9444375 2023-10-03 05:12:24,378 WARNING [train.py:1204] (2/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_239-22076-0_sp0.9 from training. Duration: 0.9444375 2023-10-03 05:12:24,401 WARNING [train.py:1204] (2/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_424-227947-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 05:12:26,350 WARNING [train.py:1204] (2/4) Exclude cut with ID _14069_210_5632_1_1530512692033_4803390_95-1896-0 from training. Duration: 0.98 2023-10-03 05:12:27,691 WARNING [train.py:1204] (2/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_96-244299-0_sp0.9 from training. Duration: 0.97775 2023-10-03 05:12:29,266 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.feed_forward1.out_proj.dropout_p, batch_count=1149720.0, ans=0.1 2023-10-03 05:12:33,357 WARNING [train.py:1204] (2/4) Exclude cut with ID _46683_210_9642_1_1533730913492_4591116_562-319825-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 05:12:36,079 WARNING [train.py:1204] (2/4) Exclude cut with ID _64639_210_5816_1_1535367619437_3528000_284-173474-0_sp1.1 from training. Duration: 0.963625 2023-10-03 05:12:36,109 WARNING [train.py:1204] (2/4) Exclude cut with ID _29529_210_7234_1_1532393961455_3977460_497-100133-0_sp1.1 from training. Duration: 0.936375 2023-10-03 05:12:36,145 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_12868_210_6753_1_1530701932_530275_18-76322-0_sp0.9 from training. Duration: 0.9666875 2023-10-03 05:12:37,502 WARNING [train.py:1204] (2/4) Exclude cut with ID _50093_210_4220_1_1534417141207_3432299_116-303447-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 05:12:37,740 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=1149786.6666666667, ans=0.1 2023-10-03 05:12:38,377 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.0.layers.1.conv_module2.whiten, num_groups=1, num_channels=192, metric=11.44 vs. limit=15.0 2023-10-03 05:12:38,889 WARNING [train.py:1204] (2/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_44-79427-0_sp1.1 from training. Duration: 0.8909375 2023-10-03 05:12:38,943 WARNING [train.py:1204] (2/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_887-249555-0 from training. Duration: 0.77 2023-10-03 05:12:42,153 WARNING [train.py:1204] (2/4) Exclude cut with ID _28461_210_2253_1_1532771775189_3041299_96-152158-0_sp0.9 from training. Duration: 0.9444375 2023-10-03 05:12:42,195 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_14058_210_4163_1_1530690953_6327180_49-210687-0_sp1.1 from training. Duration: 0.8545625 2023-10-03 05:12:45,510 WARNING [train.py:1204] (2/4) Exclude cut with ID _40399_210_4353_1_1533305658194_3279406_351-127903-0_sp1.1 from training. Duration: 0.97275 2023-10-03 05:12:46,824 WARNING [train.py:1204] (2/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_184-206939-0_sp1.1 from training. Duration: 0.936375 2023-10-03 05:12:52,950 WARNING [train.py:1204] (2/4) Exclude cut with ID _14100_210_8341_1_1530529320613_3678069_258-224599-0_sp1.1 from training. Duration: 0.7 2023-10-03 05:12:52,973 WARNING [train.py:1204] (2/4) Exclude cut with ID _6493_210_1747_1_1528459210424_4012470_324-323854-0 from training. Duration: 0.92 2023-10-03 05:12:54,496 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_12868_210_6753_1_1530702479_3610564_151-325626-0_sp1.1 from training. Duration: 0.8818125 2023-10-03 05:12:55,829 WARNING [train.py:1204] (2/4) Exclude cut with ID _14528_210_8254_1_1530669700514_4306939_106-43145-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 05:12:55,843 WARNING [train.py:1204] (2/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_686-54383-0 from training. Duration: 0.87 2023-10-03 05:12:55,892 WARNING [train.py:1204] (2/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_801-102362-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 05:12:55,963 WARNING [train.py:1204] (2/4) Exclude cut with ID _48677_210_5663_1_1533902405919_3892989_408-40740-0_sp0.9 from training. Duration: 0.92225 2023-10-03 05:12:59,273 WARNING [train.py:1204] (2/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_448-212135-0_sp1.1 from training. Duration: 0.82725 2023-10-03 05:13:01,959 WARNING [train.py:1204] (2/4) Exclude cut with ID _59170_210_6721_1_1534983633960_4203335_320-124047-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 05:13:03,399 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_4692_210_3528_1_1526899755_4323254_107-44401-0_sp1.1 from training. Duration: 0.7818125 2023-10-03 05:13:07,419 WARNING [train.py:1204] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_731-2428-0 from training. Duration: 0.83 2023-10-03 05:13:07,527 WARNING [train.py:1204] (2/4) Exclude cut with ID _29857_210_20034_1_1532395886753_3798160_335-163562-0_sp1.1 from training. Duration: 0.77275 2023-10-03 05:13:08,730 INFO [train.py:1046] (2/4) Epoch 33, batch 2500, loss[loss=0.1336, simple_loss=0.2123, pruned_loss=0.0274, over 24312.00 frames. ], tot_loss[loss=0.1607, simple_loss=0.2385, pruned_loss=0.04147, over 4698102.47 frames. ], batch size: 56, lr: 3.09e-03, grad_scale: 8.0 2023-10-03 05:13:11,788 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.conv_module1.balancer1.prob, batch_count=1149920.0, ans=0.125 2023-10-03 05:13:14,751 WARNING [train.py:1204] (2/4) Exclude cut with ID _14832_210_8750_1_1530691351539_3171348_171-275004-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 05:13:21,714 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.0.layers.0.conv_module2.whiten, num_groups=1, num_channels=192, metric=11.14 vs. limit=15.0 2023-10-03 05:13:22,064 WARNING [train.py:1204] (2/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_105-328174-0_sp0.9 from training. Duration: 0.8444375 2023-10-03 05:13:22,105 WARNING [train.py:1204] (2/4) Exclude cut with ID _68965_210_1883_1_1535698962272_3497490_164-118421-0_sp1.1 from training. Duration: 0.936375 2023-10-03 05:13:23,542 WARNING [train.py:1204] (2/4) Exclude cut with ID _19896_210_1883_1_1531709022662_6649569_380-24170-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 05:13:23,555 WARNING [train.py:1204] (2/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_505-205270-0_sp1.1 from training. Duration: 0.3454375 2023-10-03 05:13:29,049 WARNING [train.py:1204] (2/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_427-208901-0_sp1.1 from training. Duration: 0.6181875 2023-10-03 05:13:29,256 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.nonlin_attention.balancer.prob, batch_count=1149986.6666666667, ans=0.125 2023-10-03 05:13:30,436 WARNING [train.py:1204] (2/4) Exclude cut with ID _23818_210_6228_1_1531917063884_3676920_85-326916-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 05:13:32,925 WARNING [train.py:1204] (2/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_101-293367-0_sp1.1 from training. Duration: 0.57275 2023-10-03 05:13:32,934 WARNING [train.py:1204] (2/4) Exclude cut with ID _5409_210_1388_1_1527292850189_5865191_178-127970-0_sp1.1 from training. Duration: 0.5181875 2023-10-03 05:13:34,313 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_403-39619-0 from training. Duration: 0.84 2023-10-03 05:13:35,688 WARNING [train.py:1204] (2/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_427-87841-0_sp1.1 from training. Duration: 0.963625 2023-10-03 05:13:35,729 WARNING [train.py:1204] (2/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_286-68181-0_sp1.1 from training. Duration: 0.97275 2023-10-03 05:13:35,769 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6673_210_3273_1_1527767790_3173240_327-306185-0 from training. Duration: 0.83 2023-10-03 05:13:35,784 WARNING [train.py:1204] (2/4) Exclude cut with ID _3675_210_4557_1_1526639934408_4182089_106-203713-0_sp1.1 from training. Duration: 0.963625 2023-10-03 05:13:37,150 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_53071_210_16554_1_1534493434_1276796_130-36076-0 from training. Duration: 0.95 2023-10-03 05:13:37,180 WARNING [train.py:1204] (2/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_586-93054-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 05:13:40,831 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.3.encoder.layers.3.feed_forward2.out_whiten, num_groups=1, num_channels=512, metric=6.45 vs. limit=15.0 2023-10-03 05:13:41,460 WARNING [train.py:1204] (2/4) Exclude cut with ID _23825_210_13449_1_1531915313526_3497150_262-10571-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 05:13:42,679 WARNING [train.py:1204] (2/4) Exclude cut with ID _28783_210_7943_1_1532671164808_7155479_432-67885-0_sp1.1 from training. Duration: 0.97275 2023-10-03 05:13:46,520 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6287_210_3273_1_1527509506_2653716_182-199887-0_sp0.9 from training. Duration: 0.5666875 2023-10-03 05:13:46,572 WARNING [train.py:1204] (2/4) Exclude cut with ID _3231_210_2106_1_1526205864934_6784220_171-256004-0 from training. Duration: 0.86 2023-10-03 05:13:47,845 WARNING [train.py:1204] (2/4) Exclude cut with ID _69051_210_1531_1_1535885566901_3829538_332-92090-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 05:13:49,302 WARNING [train.py:1204] (2/4) Exclude cut with ID _3675_210_4557_1_1526639934408_4182089_104-324125-0_sp1.1 from training. Duration: 0.963625 2023-10-03 05:13:51,971 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_41764_210_2521_1_1533686110_4089313_501-2119-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 05:13:57,180 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_14056_210_4163_1_1530604483_7572827_56-100988-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 05:13:58,710 WARNING [train.py:1204] (2/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_398-239819-0_sp1.1 from training. Duration: 0.9 2023-10-03 05:14:02,950 WARNING [train.py:1204] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_74-48717-0_sp1.1 from training. Duration: 0.52725 2023-10-03 05:14:07,212 WARNING [train.py:1204] (2/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_177-49092-0 from training. Duration: 0.91 2023-10-03 05:14:07,229 WARNING [train.py:1204] (2/4) Exclude cut with ID _62337_210_12392_1_1535009273105_3412229_44-176353-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 05:14:07,238 WARNING [train.py:1204] (2/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_110-130961-0_sp1.1 from training. Duration: 0.42725 2023-10-03 05:14:08,551 WARNING [train.py:1204] (2/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_37-189262-0_sp1.1 from training. Duration: 0.8909375 2023-10-03 05:14:08,558 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_3172_210_2087_1_1526292929_3987865_197-97762-0_sp0.9 from training. Duration: 0.8333125 2023-10-03 05:14:09,922 WARNING [train.py:1204] (2/4) Exclude cut with ID IC0677W0065-334897 from training. Duration: 0.896 2023-10-03 05:14:09,923 WARNING [train.py:1204] (2/4) Exclude cut with ID IC0677W0080-334912 from training. Duration: 0.768 2023-10-03 05:14:09,928 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_120-41668-0 from training. Duration: 0.7 2023-10-03 05:14:12,753 WARNING [train.py:1204] (2/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_460-205287-0_sp1.1 from training. Duration: 0.963625 2023-10-03 05:14:14,863 WARNING [train.py:1204] (2/4) Exclude cut with ID _14632_210_9988_1_1530691279392_3923200_102-204477-0 from training. Duration: 0.97 2023-10-03 05:14:14,868 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_34815_210_6388_1_1533381320_3571903_279-305565-0 from training. Duration: 0.92 2023-10-03 05:14:16,134 WARNING [train.py:1204] (2/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_91-148831-0_sp1.1 from training. Duration: 0.9 2023-10-03 05:14:16,195 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6291_210_3273_1_1527683649_1078498_82-221196-0 from training. Duration: 0.76 2023-10-03 05:14:17,237 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.0.layers.0.feed_forward3.out_whiten, num_groups=1, num_channels=192, metric=10.10 vs. limit=15.0 2023-10-03 05:14:19,576 WARNING [train.py:1204] (2/4) Exclude cut with ID _38020_210_5986_1_1533112252259_7006089_493-102745-0 from training. Duration: 0.98 2023-10-03 05:14:21,125 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.bypass_mid.scale_min, batch_count=1150186.6666666667, ans=0.2 2023-10-03 05:14:22,348 WARNING [train.py:1204] (2/4) Exclude cut with ID _61133_210_9969_1_1535196777227_3696409_40-93442-0_sp1.1 from training. Duration: 0.92725 2023-10-03 05:14:23,572 INFO [train.py:1046] (2/4) Epoch 33, batch 2550, loss[loss=0.203, simple_loss=0.261, pruned_loss=0.07246, over 19281.00 frames. ], tot_loss[loss=0.1617, simple_loss=0.2397, pruned_loss=0.0418, over 4694996.25 frames. ], batch size: 388, lr: 3.09e-03, grad_scale: 8.0 2023-10-03 05:14:25,004 WARNING [train.py:1204] (2/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_45-258852-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 05:14:25,049 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_53071_210_16554_1_1534493434_1276796_130-36076-0_sp1.1 from training. Duration: 0.863625 2023-10-03 05:14:28,284 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8694_210_6400_1_1529065754_3260756_332-13553-0_sp1.1 from training. Duration: 0.92725 2023-10-03 05:14:28,906 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.4.encoder.layers.0.nonlin_attention.whiten1, num_groups=1, num_channels=288, metric=7.73 vs. limit=10.0 2023-10-03 05:14:29,758 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_405-339986-0 from training. Duration: 0.51 2023-10-03 05:14:29,830 WARNING [train.py:1204] (2/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_927-43827-0_sp1.1 from training. Duration: 0.67275 2023-10-03 05:14:30,145 INFO [scaling.py:1118] (2/4) WithLoss: name=encoder.encoders.5.encoder.layers.1.self_attn_weights, loss-sum=0.000e+00 2023-10-03 05:14:31,058 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.577e+02 1.976e+02 2.259e+02 2.608e+02 3.805e+02, threshold=4.518e+02, percent-clipped=0.0 2023-10-03 05:14:33,922 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_397-147196-0 from training. Duration: 0.8 2023-10-03 05:14:35,352 WARNING [train.py:1204] (2/4) Exclude cut with ID 3557-8342-0013-54691-0_sp1.1 from training. Duration: 0.836375 2023-10-03 05:14:38,032 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_47438_210_5399_1_1533797471_4513356_626-127722-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 05:14:39,518 WARNING [train.py:1204] (2/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_256-323883-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 05:14:40,763 WARNING [train.py:1204] (2/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_839-173017-0_sp0.9 from training. Duration: 0.4555625 2023-10-03 05:14:40,790 WARNING [train.py:1204] (2/4) Exclude cut with ID _5289_210_2084_1_1527678202736_3779299_392-261988-0_sp1.1 from training. Duration: 0.5909375 2023-10-03 05:14:40,834 WARNING [train.py:1204] (2/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_119-224384-0_sp1.1 from training. Duration: 0.936375 2023-10-03 05:14:40,873 WARNING [train.py:1204] (2/4) Exclude cut with ID _57523_210_1856_1_1534766294545_3732989_262-267933-0_sp1.1 from training. Duration: 0.963625 2023-10-03 05:14:43,518 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_35361_210_4238_1_1533262810_7793476_675-87012-0_sp0.9 from training. Duration: 0.988875 2023-10-03 05:14:43,542 WARNING [train.py:1204] (2/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_17-90541-0 from training. Duration: 0.94 2023-10-03 05:14:44,927 WARNING [train.py:1204] (2/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_535-14410-0_sp1.1 from training. Duration: 0.42725 2023-10-03 05:14:44,932 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_24673_210_6400_1_1531968633_778659_94-154511-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 05:14:44,941 WARNING [train.py:1204] (2/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_212-10501-0 from training. Duration: 0.94 2023-10-03 05:14:48,699 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.conv_module1.balancer1.min_positive, batch_count=1150320.0, ans=0.025 2023-10-03 05:14:57,888 WARNING [train.py:1204] (2/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_383-285196-0_sp1.1 from training. Duration: 0.8818125 2023-10-03 05:15:02,587 WARNING [train.py:1204] (2/4) Exclude cut with ID _45560_210_21982_1_1533812603951_3584999_238-47020-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 05:15:02,595 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_21500_210_2054_1_1532149310_2716758_278-18053-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 05:15:02,602 WARNING [train.py:1204] (2/4) Exclude cut with ID _56235_210_10640_1_1534557335235_4161099_453-9626-0_sp1.1 from training. Duration: 0.936375 2023-10-03 05:15:03,980 WARNING [train.py:1204] (2/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_153-97441-0_sp1.1 from training. Duration: 0.5090625 2023-10-03 05:15:10,910 WARNING [train.py:1204] (2/4) Exclude cut with ID _67725_210_27767_1_1535608756671_5105359_431-120895-0_sp1.1 from training. Duration: 0.963625 2023-10-03 05:15:13,634 WARNING [train.py:1204] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_574-45541-0_sp1.1 from training. Duration: 0.5909375 2023-10-03 05:15:13,649 WARNING [train.py:1204] (2/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_162-103620-0_sp1.1 from training. Duration: 0.8454375 2023-10-03 05:15:13,666 WARNING [train.py:1204] (2/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_734-129761-0_sp1.1 from training. Duration: 0.7454375 2023-10-03 05:15:13,863 INFO [scaling.py:1118] (2/4) WithLoss: name=encoder.encoders.1.encoder.layers.1.self_attn_weights, loss-sum=0.000e+00 2023-10-03 05:15:13,904 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.self_attn_weights.pos_emb_skip_rate, batch_count=1150453.3333333333, ans=0.0 2023-10-03 05:15:14,995 WARNING [train.py:1204] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_620-276793-0_sp1.1 from training. Duration: 0.536375 2023-10-03 05:15:15,035 WARNING [train.py:1204] (2/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_153-97441-0_sp0.9 from training. Duration: 0.62225 2023-10-03 05:15:16,572 WARNING [train.py:1204] (2/4) Exclude cut with ID _12733_210_8341_1_1530167424556_3646380_523-85621-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 05:15:16,591 WARNING [train.py:1204] (2/4) Exclude cut with ID _64048_210_10626_1_1535198382802_3835179_352-179834-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 05:15:17,479 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=1150453.3333333333, ans=0.1 2023-10-03 05:15:22,828 WARNING [train.py:1204] (2/4) Exclude cut with ID _55162_210_17071_1_1534408248583_3753480_43-242364-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 05:15:22,844 WARNING [train.py:1204] (2/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_661-264857-0 from training. Duration: 0.93 2023-10-03 05:15:22,852 WARNING [train.py:1204] (2/4) Exclude cut with ID _6274_210_2084_1_1527937583837_7494020_325-237800-0_sp1.1 from training. Duration: 0.97275 2023-10-03 05:15:22,909 WARNING [train.py:1204] (2/4) Exclude cut with ID _55328_210_27767_1_1534580973260_4088170_385-171459-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 05:15:22,984 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_3780_210_2087_1_1526467389_2145572_176-107140-0_sp1.1 from training. Duration: 0.62725 2023-10-03 05:15:24,351 WARNING [train.py:1204] (2/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_375-244976-0_sp1.1 from training. Duration: 0.6818125 2023-10-03 05:15:26,974 WARNING [train.py:1204] (2/4) Exclude cut with ID _35941_210_15379_1_1532995294515_4023089_309-33162-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 05:15:34,484 WARNING [train.py:1204] (2/4) Exclude cut with ID _14459_210_2228_1_1531443641946_7516700_1185-22038-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 05:15:35,905 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_1197-69460-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 05:15:37,242 INFO [train.py:1046] (2/4) Epoch 33, batch 2600, loss[loss=0.1579, simple_loss=0.2296, pruned_loss=0.04312, over 23339.00 frames. ], tot_loss[loss=0.1619, simple_loss=0.2404, pruned_loss=0.04166, over 4715759.01 frames. ], batch size: 119, lr: 3.09e-03, grad_scale: 8.0 2023-10-03 05:15:37,402 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9012W0022-501157 from training. Duration: 0.896 2023-10-03 05:15:40,324 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9023W0009-506302 from training. Duration: 0.896 2023-10-03 05:15:40,353 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_2821_210_2715_1_1526108385_3763560_73-296673-0_sp1.1 from training. Duration: 0.7545625 2023-10-03 05:15:41,659 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9025W0113-507442 from training. Duration: 0.896 2023-10-03 05:15:41,740 WARNING [train.py:1204] (2/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_739-2098-0 from training. Duration: 0.92 2023-10-03 05:15:42,977 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9029W0332-509722 from training. Duration: 0.768 2023-10-03 05:15:45,741 WARNING [train.py:1204] (2/4) Exclude cut with ID _16908_210_5724_1_1531119157248_3772700_68-101953-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 05:15:45,762 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9042W0011-515617 from training. Duration: 0.768 2023-10-03 05:15:47,243 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.1.feed_forward2.hidden_balancer.prob, batch_count=1150586.6666666667, ans=0.125 2023-10-03 05:15:48,419 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_3019_210_2028_1_1526044245_3264865_191-72235-0 from training. Duration: 0.97 2023-10-03 05:15:50,363 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9052W0006-520792 from training. Duration: 0.896 2023-10-03 05:15:51,792 WARNING [train.py:1204] (2/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_245-174553-0_sp1.1 from training. Duration: 0.8 2023-10-03 05:15:53,842 WARNING [train.py:1204] (2/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_202-67017-0 from training. Duration: 0.82 2023-10-03 05:15:55,501 WARNING [train.py:1204] (2/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_304-59227-0 from training. Duration: 0.77 2023-10-03 05:15:56,911 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_3133_210_2087_1_1526106446_2324690_246-125000-0_sp0.9 from training. Duration: 0.688875 2023-10-03 05:15:58,192 WARNING [train.py:1204] (2/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_243-42116-0 from training. Duration: 0.73 2023-10-03 05:15:58,359 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9087W0121-538492 from training. Duration: 0.896 2023-10-03 05:16:00,226 WARNING [train.py:1204] (2/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_120-76843-0 from training. Duration: 0.55 2023-10-03 05:16:04,043 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.balancer2.prob, batch_count=1150653.3333333333, ans=0.125 2023-10-03 05:16:06,511 WARNING [train.py:1204] (2/4) Exclude cut with ID _7852_210_5219_1_1528286471316_8139400_850-304621-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 05:16:06,531 WARNING [train.py:1204] (2/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_154-285415-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 05:16:06,549 WARNING [train.py:1204] (2/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_538-278194-0_sp1.1 from training. Duration: 0.92725 2023-10-03 05:16:06,550 WARNING [train.py:1204] (2/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_119-207162-0 from training. Duration: 0.71 2023-10-03 05:16:07,979 WARNING [train.py:1204] (2/4) Exclude cut with ID _2334_210_2667_1_1525608238880_4705129_459-104997-0_sp1.1 from training. Duration: 0.863625 2023-10-03 05:16:14,755 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9147W0019-567847 from training. Duration: 0.768 2023-10-03 05:16:19,192 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.1.conv_module1.balancer2.prob, batch_count=1150720.0, ans=0.125 2023-10-03 05:16:20,951 WARNING [train.py:1204] (2/4) Exclude cut with ID _69884_210_9771_1_1535846392818_4166169_228-189439-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 05:16:20,982 WARNING [train.py:1204] (2/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_196-77172-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 05:16:22,371 WARNING [train.py:1204] (2/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_625-338546-0 from training. Duration: 0.88 2023-10-03 05:16:22,417 WARNING [train.py:1204] (2/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_193-327630-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 05:16:22,418 WARNING [train.py:1204] (2/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_391-24006-0_sp1.1 from training. Duration: 0.92725 2023-10-03 05:16:23,877 WARNING [train.py:1204] (2/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_197-28164-0 from training. Duration: 0.8 2023-10-03 05:16:25,380 WARNING [train.py:1204] (2/4) Exclude cut with ID _8395_210_1531_1_1528597871523_4411579_431-132470-0_sp1.1 from training. Duration: 0.67275 2023-10-03 05:16:25,398 WARNING [train.py:1204] (2/4) Exclude cut with ID _35350_210_16040_1_1533040191183_4075299_465-248326-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 05:16:27,189 WARNING [train.py:1204] (2/4) Exclude cut with ID _16756_210_4915_1_1531047803763_7104259_101-141922-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 05:16:31,888 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9212W0005-600172 from training. Duration: 0.896 2023-10-03 05:16:33,095 WARNING [train.py:1204] (2/4) Exclude cut with ID _63186_210_2084_1_1535414479705_3908649_378-166820-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 05:16:33,116 WARNING [train.py:1204] (2/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_52-211683-0_sp1.1 from training. Duration: 0.8181875 2023-10-03 05:16:37,881 WARNING [train.py:1204] (2/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_1147-237729-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 05:16:38,000 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.feed_forward1.hidden_balancer.prob, batch_count=1150853.3333333333, ans=0.125 2023-10-03 05:16:40,612 WARNING [train.py:1204] (2/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_234-14174-0_sp0.9 from training. Duration: 0.911125 2023-10-03 05:16:40,632 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_454-276429-0 from training. Duration: 0.87 2023-10-03 05:16:41,935 WARNING [train.py:1204] (2/4) Exclude cut with ID _53713_210_24116_1_1534314487762_7416751_755-165313-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 05:16:43,315 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8278_210_2550_1_1528637099_4220413_436-317072-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 05:16:44,644 WARNING [train.py:1204] (2/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_28-324729-0_sp1.1 from training. Duration: 0.8545625 2023-10-03 05:16:48,965 WARNING [train.py:1204] (2/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_326-165900-0 from training. Duration: 0.69 2023-10-03 05:16:50,281 WARNING [train.py:1204] (2/4) Exclude cut with ID _53489_210_6228_1_1534298357169_3835970_96-30246-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 05:16:51,598 INFO [train.py:1046] (2/4) Epoch 33, batch 2650, loss[loss=0.1757, simple_loss=0.2621, pruned_loss=0.04458, over 24024.00 frames. ], tot_loss[loss=0.1622, simple_loss=0.241, pruned_loss=0.04166, over 4730245.61 frames. ], batch size: 86, lr: 3.09e-03, grad_scale: 8.0 2023-10-03 05:16:52,331 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8275_210_4198_1_1528455320_3892571_491-17707-0_sp1.1 from training. Duration: 0.5818125 2023-10-03 05:16:52,573 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.balancer2.prob, batch_count=1150920.0, ans=0.125 2023-10-03 05:16:55,130 WARNING [train.py:1204] (2/4) Exclude cut with ID _14348_210_8341_1_1530599434996_3778640_240-232370-0 from training. Duration: 0.84 2023-10-03 05:16:55,140 WARNING [train.py:1204] (2/4) Exclude cut with ID _45012_210_12651_1_1533608435136_5371380_54-112623-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 05:16:57,132 WARNING [train.py:1204] (2/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_328-264776-0_sp1.1 from training. Duration: 0.6090625 2023-10-03 05:16:58,424 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9325W0001-648427 from training. Duration: 0.768 2023-10-03 05:16:58,433 WARNING [train.py:1204] (2/4) Exclude cut with ID _56480_210_4220_1_1534679942079_3075409_49-74717-0_sp1.1 from training. Duration: 0.936375 2023-10-03 05:16:59,706 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.600e+02 1.879e+02 2.016e+02 2.278e+02 3.478e+02, threshold=4.033e+02, percent-clipped=0.0 2023-10-03 05:16:59,846 WARNING [train.py:1204] (2/4) Exclude cut with ID _59500_210_8254_1_1534809610680_3736009_6-241869-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 05:17:02,969 WARNING [train.py:1204] (2/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_172-215932-0_sp0.9 from training. Duration: 0.7333125 2023-10-03 05:17:04,411 WARNING [train.py:1204] (2/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_17-90541-0_sp1.1 from training. Duration: 0.8545625 2023-10-03 05:17:05,790 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_43831_210_2158_1_1533725502_5872791_525-89447-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 05:17:07,600 WARNING [train.py:1204] (2/4) Exclude cut with ID _12302_210_5219_1_1529924446466_8107280_907-182949-0 from training. Duration: 0.92 2023-10-03 05:17:07,620 WARNING [train.py:1204] (2/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_402-102311-0_sp0.9 from training. Duration: 0.7444375 2023-10-03 05:17:08,934 WARNING [train.py:1204] (2/4) Exclude cut with ID _8021_210_7994_1_1528376451358_3653069_179-45206-0_sp0.9 from training. Duration: 0.9555625 2023-10-03 05:17:11,715 WARNING [train.py:1204] (2/4) Exclude cut with ID _40137_210_12210_1_1533290484046_3795180_249-187059-0 from training. Duration: 0.88 2023-10-03 05:17:14,325 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9653W0194-675472 from training. Duration: 0.9658125 2023-10-03 05:17:15,786 WARNING [train.py:1204] (2/4) Exclude cut with ID _51332_210_9988_1_1534244371836_3801270_336-255604-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 05:17:17,185 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_20120_210_6753_1_1531643035_5482859_510-96996-0 from training. Duration: 0.87 2023-10-03 05:17:17,197 WARNING [train.py:1204] (2/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_1167-20437-0_sp1.1 from training. Duration: 0.92725 2023-10-03 05:17:18,513 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_3475_210_2028_1_1526193578_3622569_265-276625-0 from training. Duration: 0.76 2023-10-03 05:17:21,445 WARNING [train.py:1204] (2/4) Exclude cut with ID _14860_210_4915_1_1530702020340_7343240_470-172418-0_sp1.1 from training. Duration: 0.97275 2023-10-03 05:17:21,453 WARNING [train.py:1204] (2/4) Exclude cut with ID _5444_210_2151_1_1527418808464_5312210_581-315411-0_sp1.1 from training. Duration: 0.563625 2023-10-03 05:17:21,475 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_34450_210_9547_1_1533099443_4109050_519-342439-0_sp1.1 from training. Duration: 0.97275 2023-10-03 05:17:23,395 WARNING [train.py:1204] (2/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_50-175961-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 05:17:28,006 WARNING [train.py:1204] (2/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_372-6869-0 from training. Duration: 0.44 2023-10-03 05:17:29,239 WARNING [train.py:1204] (2/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_3-237434-0 from training. Duration: 0.76 2023-10-03 05:17:31,910 WARNING [train.py:1204] (2/4) Exclude cut with ID _8772_210_1850_1_1528635595972_3664011_247-336448-0_sp1.1 from training. Duration: 0.67275 2023-10-03 05:17:32,190 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.conv_module2.balancer2.prob, batch_count=1151053.3333333333, ans=0.125 2023-10-03 05:17:36,569 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_427-78097-0 from training. Duration: 0.8 2023-10-03 05:17:36,584 WARNING [train.py:1204] (2/4) Exclude cut with ID _69884_210_9771_1_1535846392818_4166169_466-252727-0_sp1.1 from training. Duration: 0.97275 2023-10-03 05:17:36,654 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_15972_210_6400_1_1530877934_3405692_316-35478-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 05:17:38,486 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_809-17156-0_sp0.9 from training. Duration: 0.811125 2023-10-03 05:17:38,516 WARNING [train.py:1204] (2/4) Exclude cut with ID _57572_210_5517_1_1534921219624_3809484_594-263474-0_sp1.1 from training. Duration: 0.936375 2023-10-03 05:17:38,556 WARNING [train.py:1204] (2/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_190-228204-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 05:17:40,057 WARNING [train.py:1204] (2/4) Exclude cut with ID _19514_210_9259_1_1531530071330_5084230_117-21193-0_sp1.1 from training. Duration: 0.936375 2023-10-03 05:17:41,462 WARNING [train.py:1204] (2/4) Exclude cut with ID _69584_210_5219_1_1535785133775_4044450_190-21315-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 05:17:42,921 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_53071_210_16554_1_1534493434_1276796_184-302050-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 05:17:44,169 WARNING [train.py:1204] (2/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_120-76843-0_sp1.1 from training. Duration: 0.5 2023-10-03 05:17:45,518 WARNING [train.py:1204] (2/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_318-162017-0_sp1.1 from training. Duration: 0.82725 2023-10-03 05:17:46,912 WARNING [train.py:1204] (2/4) Exclude cut with ID _46067_210_4220_1_1534125571066_3307450_79-46864-0_sp1.1 from training. Duration: 0.92725 2023-10-03 05:17:48,208 WARNING [train.py:1204] (2/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_358-215558-0_sp1.1 from training. Duration: 0.8454375 2023-10-03 05:17:48,281 WARNING [train.py:1204] (2/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_621-66764-0_sp1.1 from training. Duration: 0.92725 2023-10-03 05:17:51,061 WARNING [train.py:1204] (2/4) Exclude cut with ID _42863_210_9070_1_1533902398788_3696920_338-156851-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 05:17:51,091 WARNING [train.py:1204] (2/4) Exclude cut with ID _5289_210_2084_1_1527678202736_3779299_392-261988-0_sp0.9 from training. Duration: 0.72225 2023-10-03 05:17:53,886 WARNING [train.py:1204] (2/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_409-84682-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 05:17:55,184 WARNING [train.py:1204] (2/4) Exclude cut with ID _9235_210_5219_1_1528891154253_8046120_30-326681-0_sp0.9 from training. Duration: 0.92225 2023-10-03 05:17:55,188 WARNING [train.py:1204] (2/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_389-184695-0_sp1.1 from training. Duration: 0.92725 2023-10-03 05:17:55,219 WARNING [train.py:1204] (2/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_442-320317-0 from training. Duration: 0.82 2023-10-03 05:17:58,569 WARNING [train.py:1204] (2/4) Exclude cut with ID _44778_210_2054_1_1533798127203_7443090_756-29911-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 05:18:00,029 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_401-112036-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 05:18:01,807 WARNING [train.py:1204] (2/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_181-308827-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 05:18:03,115 WARNING [train.py:1204] (2/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_332-258725-0_sp1.1 from training. Duration: 0.963625 2023-10-03 05:18:04,884 WARNING [train.py:1204] (2/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_188-260011-0_sp0.9 from training. Duration: 0.811125 2023-10-03 05:18:04,943 WARNING [train.py:1204] (2/4) Exclude cut with ID _49039_210_6315_1_1533967537541_3846118_99-302239-0_sp1.1 from training. Duration: 0.963625 2023-10-03 05:18:06,232 INFO [train.py:1046] (2/4) Epoch 33, batch 2700, loss[loss=0.1446, simple_loss=0.2156, pruned_loss=0.0368, over 22688.00 frames. ], tot_loss[loss=0.1624, simple_loss=0.2415, pruned_loss=0.04164, over 4727944.64 frames. ], batch size: 322, lr: 3.09e-03, grad_scale: 8.0 2023-10-03 05:18:06,341 WARNING [train.py:1204] (2/4) Exclude cut with ID _4122_210_2084_1_1526713164417_4353280_312-294828-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 05:18:06,348 WARNING [train.py:1204] (2/4) Exclude cut with ID _14922_210_6228_1_1530707435273_3693310_303-228738-0 from training. Duration: 0.84 2023-10-03 05:18:10,989 WARNING [train.py:1204] (2/4) Exclude cut with ID _14349_210_8341_1_1530615710818_3442470_115-285975-0_sp0.9 from training. Duration: 0.9555625 2023-10-03 05:18:13,768 WARNING [train.py:1204] (2/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_125-144174-0_sp1.1 from training. Duration: 0.3545625 2023-10-03 05:18:15,250 WARNING [train.py:1204] (2/4) Exclude cut with ID _53565_210_10635_1_1534337218335_3820259_351-54570-0_sp1.1 from training. Duration: 0.97275 2023-10-03 05:18:15,284 WARNING [train.py:1204] (2/4) Exclude cut with ID _28317_210_12742_1_1532480302381_8510325_410-40065-0_sp1.1 from training. Duration: 0.963625 2023-10-03 05:18:15,304 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_38965_210_4852_1_1533174906_3484735_167-339442-0_sp1.1 from training. Duration: 0.963625 2023-10-03 05:18:16,593 WARNING [train.py:1204] (2/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_265-164486-0_sp1.1 from training. Duration: 0.87275 2023-10-03 05:18:16,610 WARNING [train.py:1204] (2/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_725-182484-0_sp1.1 from training. Duration: 0.92725 2023-10-03 05:18:16,639 WARNING [train.py:1204] (2/4) Exclude cut with ID _28317_210_12742_1_1532480302381_8510325_607-213951-0_sp0.9 from training. Duration: 0.9333125 2023-10-03 05:18:16,659 WARNING [train.py:1204] (2/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_605-133395-0_sp1.1 from training. Duration: 0.6 2023-10-03 05:18:17,981 WARNING [train.py:1204] (2/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_351-294650-0 from training. Duration: 0.63 2023-10-03 05:18:18,060 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_14953_210_2151_1_1530701710_5616165_710-165400-0_sp1.1 from training. Duration: 0.7909375 2023-10-03 05:18:19,439 WARNING [train.py:1204] (2/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_237-58433-0_sp1.1 from training. Duration: 0.67275 2023-10-03 05:18:20,864 WARNING [train.py:1204] (2/4) Exclude cut with ID _5190_210_4557_1_1527244368799_373439_28-197030-0_sp1.1 from training. Duration: 0.6545625 2023-10-03 05:18:22,086 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_743-327651-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 05:18:24,213 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.1.encoder.layers.1.feed_forward2.out_whiten, num_groups=1, num_channels=256, metric=4.14 vs. limit=15.0 2023-10-03 05:18:24,723 WARNING [train.py:1204] (2/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_76-203867-0_sp0.9 from training. Duration: 0.8 2023-10-03 05:18:26,615 WARNING [train.py:1204] (2/4) Exclude cut with ID _14603_210_4220_1_1530615688441_3097319_274-274798-0 from training. Duration: 0.98 2023-10-03 05:18:27,947 WARNING [train.py:1204] (2/4) Exclude cut with ID _14087_210_2253_1_1530529224512_3014029_226-242140-0_sp1.1 from training. Duration: 0.736375 2023-10-03 05:18:30,748 WARNING [train.py:1204] (2/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_352-59958-0_sp1.1 from training. Duration: 0.7818125 2023-10-03 05:18:30,752 WARNING [train.py:1204] (2/4) Exclude cut with ID _39411_210_5986_1_1533538825030_3343649_191-251819-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 05:18:36,861 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7046_210_6753_1_1527895337_6920917_642-106799-0_sp1.1 from training. Duration: 0.836375 2023-10-03 05:18:36,868 WARNING [train.py:1204] (2/4) Exclude cut with ID _22826_210_9994_1_1531908201522_3279169_412-3442-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 05:18:38,220 WARNING [train.py:1204] (2/4) Exclude cut with ID _60057_210_10640_1_1534903065793_3333150_171-181133-0_sp0.9 from training. Duration: 0.97775 2023-10-03 05:18:38,237 WARNING [train.py:1204] (2/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_154-135696-0_sp1.1 from training. Duration: 0.72725 2023-10-03 05:18:41,055 WARNING [train.py:1204] (2/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_148-186805-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 05:18:42,485 WARNING [train.py:1204] (2/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_605-165641-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 05:18:42,491 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_174-277364-0_sp0.9 from training. Duration: 0.788875 2023-10-03 05:18:43,822 WARNING [train.py:1204] (2/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_89-321955-0_sp0.9 from training. Duration: 0.888875 2023-10-03 05:18:48,013 WARNING [train.py:1204] (2/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_438-105429-0_sp1.1 from training. Duration: 0.963625 2023-10-03 05:18:48,016 WARNING [train.py:1204] (2/4) Exclude cut with ID _14970_210_15709_1_1530773543493_1715730_187-20259-0_sp0.9 from training. Duration: 0.988875 2023-10-03 05:18:54,997 WARNING [train.py:1204] (2/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_385-193052-0_sp0.9 from training. Duration: 0.9555625 2023-10-03 05:18:55,044 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7113_210_1849_1_1527942685_3448768_194-311626-0_sp1.1 from training. Duration: 0.92725 2023-10-03 05:18:59,795 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_3780_210_2087_1_1526467389_2145572_176-107140-0_sp0.9 from training. Duration: 0.7666875 2023-10-03 05:18:59,797 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_36756_210_7234_1_1533026991_966276_123-213457-0_sp1.1 from training. Duration: 0.936375 2023-10-03 05:19:03,161 WARNING [train.py:1204] (2/4) Exclude cut with ID _23557_210_8750_1_1531890106370_6846255_456-244861-0_sp1.1 from training. Duration: 0.963625 2023-10-03 05:19:04,537 WARNING [train.py:1204] (2/4) Exclude cut with ID _37086_210_9038_1_1532999540891_7021250_242-349170-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 05:19:04,599 WARNING [train.py:1204] (2/4) Exclude cut with ID _41298_210_9019_1_1533430371450_4822143_471-281351-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 05:19:06,478 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_53378_210_17282_1_1534221071_5769509_572-126176-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 05:19:08,400 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_1507-263032-0_sp1.1 from training. Duration: 0.963625 2023-10-03 05:19:09,630 WARNING [train.py:1204] (2/4) Exclude cut with ID _13679_210_7497_1_1530534293662_3355098_411-186088-0_sp1.1 from training. Duration: 0.8545625 2023-10-03 05:19:11,177 WARNING [train.py:1204] (2/4) Exclude cut with ID _29345_210_4381_1_1532689168263_3860169_149-257268-0_sp1.1 from training. Duration: 0.736375 2023-10-03 05:19:12,521 WARNING [train.py:1204] (2/4) Exclude cut with ID _47790_210_16187_1_1533862814066_4207519_381-252014-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 05:19:12,527 WARNING [train.py:1204] (2/4) Exclude cut with ID _35799_210_5854_1_1533263487887_3529071_512-2717-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 05:19:15,281 WARNING [train.py:1204] (2/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_505-205270-0_sp0.9 from training. Duration: 0.42225 2023-10-03 05:19:16,616 WARNING [train.py:1204] (2/4) Exclude cut with ID _14998_210_5219_1_1530705582080_7474970_731-20117-0_sp1.1 from training. Duration: 0.936375 2023-10-03 05:19:17,994 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_37351_210_7925_1_1533012931_3812113_265-85780-0_sp1.1 from training. Duration: 0.8 2023-10-03 05:19:19,325 INFO [train.py:1046] (2/4) Epoch 33, batch 2750, loss[loss=0.1471, simple_loss=0.2282, pruned_loss=0.03294, over 16839.00 frames. ], tot_loss[loss=0.1621, simple_loss=0.2412, pruned_loss=0.04153, over 4725046.13 frames. ], batch size: 36, lr: 3.09e-03, grad_scale: 8.0 2023-10-03 05:19:19,369 WARNING [train.py:1204] (2/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_313-134930-0 from training. Duration: 0.97 2023-10-03 05:19:19,500 WARNING [train.py:1204] (2/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_374-138971-0 from training. Duration: 0.94 2023-10-03 05:19:20,826 WARNING [train.py:1204] (2/4) Exclude cut with ID _30304_210_6319_1_1532476827261_7582060_1227-287886-0_sp1.1 from training. Duration: 0.936375 2023-10-03 05:19:21,084 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.feed_forward2.hidden_balancer.prob, batch_count=1151586.6666666667, ans=0.125 2023-10-03 05:19:23,617 WARNING [train.py:1204] (2/4) Exclude cut with ID _59493_210_17282_1_1535078419077_3225879_332-217544-0_sp1.1 from training. Duration: 0.97275 2023-10-03 05:19:23,653 WARNING [train.py:1204] (2/4) Exclude cut with ID _23514_210_1389_1_1531832139785_3031439_361-119640-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 05:19:26,306 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.539e+02 2.013e+02 2.192e+02 2.661e+02 5.400e+02, threshold=4.383e+02, percent-clipped=1.0 2023-10-03 05:19:26,401 WARNING [train.py:1204] (2/4) Exclude cut with ID _69584_210_5219_1_1535785133775_4044450_142-118058-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 05:19:26,424 WARNING [train.py:1204] (2/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_178-177582-0_sp1.1 from training. Duration: 0.67275 2023-10-03 05:19:26,468 WARNING [train.py:1204] (2/4) Exclude cut with ID _25303_210_17519_1_1532050207576_3635970_41-195572-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 05:19:29,817 WARNING [train.py:1204] (2/4) Exclude cut with ID _55328_210_27767_1_1534580973260_4088170_329-341322-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 05:19:29,862 WARNING [train.py:1204] (2/4) Exclude cut with ID _5608_210_2106_1_1527415439089_6819768_909-82767-0_sp1.1 from training. Duration: 0.5545625 2023-10-03 05:19:31,643 WARNING [train.py:1204] (2/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_261-297503-0_sp1.1 from training. Duration: 0.9 2023-10-03 05:19:31,649 WARNING [train.py:1204] (2/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_459-291706-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 05:19:31,650 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7762_210_1794_1_1528199508_4057408_422-227034-0 from training. Duration: 0.73 2023-10-03 05:19:31,659 WARNING [train.py:1204] (2/4) Exclude cut with ID _3067_210_2667_1_1526212974640_4503168_250-285937-0_sp0.9 from training. Duration: 0.888875 2023-10-03 05:19:31,675 WARNING [train.py:1204] (2/4) Exclude cut with ID _66873_210_5517_1_1535454136506_4293370_332-253745-0_sp1.1 from training. Duration: 0.936375 2023-10-03 05:19:36,036 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.0.conv_module2.balancer2.prob, batch_count=1151653.3333333333, ans=0.125 2023-10-03 05:19:39,609 WARNING [train.py:1204] (2/4) Exclude cut with ID _13938_210_9988_1_1530518718978_3693960_304-112784-0 from training. Duration: 0.46 2023-10-03 05:19:41,047 WARNING [train.py:1204] (2/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_226-267957-0_sp1.1 from training. Duration: 0.92725 2023-10-03 05:19:41,099 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_41822_210_13270_1_1533955976_4379284_608-254567-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 05:19:42,431 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_124-113562-0_sp1.1 from training. Duration: 0.8545625 2023-10-03 05:19:42,458 WARNING [train.py:1204] (2/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_117-290916-0_sp1.1 from training. Duration: 0.463625 2023-10-03 05:19:43,788 WARNING [train.py:1204] (2/4) Exclude cut with ID _28427_210_4220_1_1532581240967_3061069_102-66533-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 05:19:45,122 WARNING [train.py:1204] (2/4) Exclude cut with ID _55383_210_10635_1_1534468426172_2231669_134-128246-0_sp1.1 from training. Duration: 0.8090625 2023-10-03 05:19:45,179 WARNING [train.py:1204] (2/4) Exclude cut with ID _45871_210_5665_1_1533864614245_3296060_49-308316-0_sp1.1 from training. Duration: 0.97275 2023-10-03 05:19:46,476 WARNING [train.py:1204] (2/4) Exclude cut with ID _57572_210_5517_1_1534921219624_3809484_176-199205-0_sp1.1 from training. Duration: 0.97275 2023-10-03 05:19:47,064 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.3.encoder.layers.3.feed_forward3.out_whiten, num_groups=1, num_channels=512, metric=14.28 vs. limit=15.0 2023-10-03 05:19:50,626 WARNING [train.py:1204] (2/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_45-265684-0_sp1.1 from training. Duration: 0.6545625 2023-10-03 05:19:50,650 WARNING [train.py:1204] (2/4) Exclude cut with ID _15150_210_8341_1_1530752411803_7256384_469-239509-0_sp0.9 from training. Duration: 0.7555625 2023-10-03 05:19:50,691 WARNING [train.py:1204] (2/4) Exclude cut with ID _28461_210_2253_1_1532771775189_3041299_136-270644-0_sp1.1 from training. Duration: 0.8454375 2023-10-03 05:19:52,001 WARNING [train.py:1204] (2/4) Exclude cut with ID _69584_210_5219_1_1535785133775_4044450_302-47302-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 05:19:52,268 INFO [scaling.py:1118] (2/4) WithLoss: name=encoder.encoders.4.encoder.layers.1.self_attn_weights, loss-sum=0.000e+00 2023-10-03 05:19:53,379 WARNING [train.py:1204] (2/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_166-128941-0_sp1.1 from training. Duration: 0.4909375 2023-10-03 05:19:57,797 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=1151720.0, ans=0.1 2023-10-03 05:19:59,547 WARNING [train.py:1204] (2/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_559-120504-0_sp1.1 from training. Duration: 0.97275 2023-10-03 05:20:00,792 WARNING [train.py:1204] (2/4) Exclude cut with ID _6456_210_1534_1_1527681634212_3181049_345-252877-0_sp1.1 from training. Duration: 0.5818125 2023-10-03 05:20:00,823 WARNING [train.py:1204] (2/4) Exclude cut with ID _28317_210_12742_1_1532480302381_8510325_902-233003-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 05:20:05,483 WARNING [train.py:1204] (2/4) Exclude cut with ID _36538_210_9994_1_1533204020363_3795620_204-261385-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 05:20:05,488 WARNING [train.py:1204] (2/4) Exclude cut with ID _14821_210_8750_1_1530680503991_6829359_189-297451-0_sp1.1 from training. Duration: 0.5 2023-10-03 05:20:05,534 WARNING [train.py:1204] (2/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_1211-226066-0_sp0.9 from training. Duration: 0.8444375 2023-10-03 05:20:07,280 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.bypass.scale_min, batch_count=1151786.6666666667, ans=0.2 2023-10-03 05:20:11,609 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6786_210_1388_1_1527839699_4194495_273-149841-0_sp1.1 from training. Duration: 0.836375 2023-10-03 05:20:11,654 WARNING [train.py:1204] (2/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_380-162648-0_sp1.1 from training. Duration: 0.8545625 2023-10-03 05:20:11,655 WARNING [train.py:1204] (2/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_494-199538-0 from training. Duration: 0.75 2023-10-03 05:20:15,861 WARNING [train.py:1204] (2/4) Exclude cut with ID _49097_210_16049_1_1533952780207_4360979_276-87053-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 05:20:18,533 WARNING [train.py:1204] (2/4) Exclude cut with ID _5444_210_2151_1_1527418808464_5312210_726-27165-0 from training. Duration: 0.67 2023-10-03 05:20:22,748 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_128-202104-0_sp1.1 from training. Duration: 0.57275 2023-10-03 05:20:24,128 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_11349_210_6753_1_1529800861_1101207_26-65602-0_sp1.1 from training. Duration: 0.87275 2023-10-03 05:20:25,402 WARNING [train.py:1204] (2/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_159-328157-0 from training. Duration: 0.52 2023-10-03 05:20:26,081 WARNING [train.py:1204] (2/4) Exclude cut with ID _14069_210_5632_1_1530512692033_4803390_98-21934-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 05:20:28,495 WARNING [train.py:1204] (2/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_352-59958-0_sp0.9 from training. Duration: 0.9555625 2023-10-03 05:20:28,548 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6163_210_6400_1_1527506746_4263068_283-137811-0 from training. Duration: 0.77 2023-10-03 05:20:28,575 WARNING [train.py:1204] (2/4) Exclude cut with ID _21989_210_5854_1_1531899214931_3271360_133-44759-0_sp1.1 from training. Duration: 0.7 2023-10-03 05:20:31,954 WARNING [train.py:1204] (2/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_21-174514-0_sp0.9 from training. Duration: 0.47775 2023-10-03 05:20:31,979 WARNING [train.py:1204] (2/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_201-78811-0_sp1.1 from training. Duration: 0.963625 2023-10-03 05:20:33,188 INFO [train.py:1046] (2/4) Epoch 33, batch 2800, loss[loss=0.1576, simple_loss=0.2452, pruned_loss=0.03496, over 24597.00 frames. ], tot_loss[loss=0.161, simple_loss=0.2395, pruned_loss=0.04121, over 4704200.20 frames. ], batch size: 68, lr: 3.09e-03, grad_scale: 16.0 2023-10-03 05:20:33,266 WARNING [train.py:1204] (2/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_537-164630-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 05:20:33,323 WARNING [train.py:1204] (2/4) Exclude cut with ID _14087_210_2253_1_1530529224512_3014029_156-257607-0 from training. Duration: 0.83 2023-10-03 05:20:33,340 WARNING [train.py:1204] (2/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_308-154450-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 05:20:33,378 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_45399_210_4852_1_1533624725_7579737_214-240574-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 05:20:33,603 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=1151920.0, ans=0.1 2023-10-03 05:20:36,496 WARNING [train.py:1204] (2/4) Exclude cut with ID _29303_210_2575_1_1532392237303_7410649_615-305517-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 05:20:36,540 WARNING [train.py:1204] (2/4) Exclude cut with ID IC0125W0002-61808 from training. Duration: 0.708 2023-10-03 05:20:36,541 WARNING [train.py:1204] (2/4) Exclude cut with ID IC0125W0017-61823 from training. Duration: 0.759 2023-10-03 05:20:41,100 WARNING [train.py:1204] (2/4) Exclude cut with ID _53219_210_4525_1_1534834836008_4064825_4-145291-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 05:20:41,224 WARNING [train.py:1204] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_185-174805-0_sp1.1 from training. Duration: 0.6181875 2023-10-03 05:20:41,346 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.1.nonlin_attention.balancer.prob, batch_count=1151920.0, ans=0.125 2023-10-03 05:20:42,541 WARNING [train.py:1204] (2/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_635-193046-0_sp1.1 from training. Duration: 0.92725 2023-10-03 05:20:45,282 WARNING [train.py:1204] (2/4) Exclude cut with ID _46067_210_4220_1_1534125571066_3307450_321-258866-0_sp1.1 from training. Duration: 0.97275 2023-10-03 05:20:46,699 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8558_210_1410_1_1528505225_7613406_131-329524-0 from training. Duration: 0.79 2023-10-03 05:20:49,373 WARNING [train.py:1204] (2/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_847-41334-0_sp0.9 from training. Duration: 0.488875 2023-10-03 05:20:51,026 WARNING [train.py:1204] (2/4) Exclude cut with ID _14227_210_4381_1_1530701804286_3940259_125-275642-0 from training. Duration: 0.99 2023-10-03 05:20:51,263 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.balancer1.prob, batch_count=1151986.6666666667, ans=0.125 2023-10-03 05:20:52,312 WARNING [train.py:1204] (2/4) Exclude cut with ID _25248_210_8750_1_1532062825941_5554180_248-177255-0_sp1.1 from training. Duration: 0.963625 2023-10-03 05:20:52,361 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_40047_210_2290_1_1533290519_2374311_195-165693-0_sp1.1 from training. Duration: 0.8090625 2023-10-03 05:20:52,370 WARNING [train.py:1204] (2/4) Exclude cut with ID _23109_210_4381_1_1532150828777_901949_29-258672-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 05:20:56,826 WARNING [train.py:1204] (2/4) Exclude cut with ID _14821_210_8750_1_1530680503991_6829359_2-227151-0_sp1.1 from training. Duration: 0.7545625 2023-10-03 05:20:58,116 WARNING [train.py:1204] (2/4) Exclude cut with ID _49643_210_7989_1_1534640244224_3939031_565-53982-0_sp1.1 from training. Duration: 0.963625 2023-10-03 05:20:58,119 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7544_210_6753_1_1528591327_1471426_33-346031-0_sp0.9 from training. Duration: 0.67775 2023-10-03 05:20:59,590 WARNING [train.py:1204] (2/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_95-61782-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 05:21:08,213 WARNING [train.py:1204] (2/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_686-54383-0_sp0.9 from training. Duration: 0.9666875 2023-10-03 05:21:11,327 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_14056_210_4163_1_1530604483_7572827_505-276823-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 05:21:11,480 WARNING [train.py:1204] (2/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_745-128443-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 05:21:13,399 WARNING [train.py:1204] (2/4) Exclude cut with ID _3844_210_1390_1_1526727803582_2871209_20-251664-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 05:21:14,770 WARNING [train.py:1204] (2/4) Exclude cut with ID _36538_210_9994_1_1533204020363_3795620_487-164628-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 05:21:19,149 WARNING [train.py:1204] (2/4) Exclude cut with ID _5166_210_2084_1_1527073197160_4087993_309-222545-0_sp1.1 from training. Duration: 0.763625 2023-10-03 05:21:19,163 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_3531_210_3528_1_1526294203_5216456_309-227302-0 from training. Duration: 0.69 2023-10-03 05:21:19,221 WARNING [train.py:1204] (2/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_42-192460-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 05:21:20,567 WARNING [train.py:1204] (2/4) Exclude cut with ID _22083_210_8254_1_1531702862439_3678970_49-234546-0_sp1.1 from training. Duration: 0.7545625 2023-10-03 05:21:20,571 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_427-78097-0_sp0.9 from training. Duration: 0.888875 2023-10-03 05:21:21,345 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.2.encoder.layers.2.conv_module2.whiten, num_groups=1, num_channels=384, metric=5.32 vs. limit=15.0 2023-10-03 05:21:24,647 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_43802_210_2290_1_1533516203_8829504_378-205254-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 05:21:24,698 WARNING [train.py:1204] (2/4) Exclude cut with ID _45329_210_7005_1_1534157951211_6774339_672-314830-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 05:21:26,820 WARNING [train.py:1204] (2/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_90-30673-0_sp1.1 from training. Duration: 0.763625 2023-10-03 05:21:29,360 WARNING [train.py:1204] (2/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_563-329943-0_sp1.1 from training. Duration: 0.9 2023-10-03 05:21:29,376 WARNING [train.py:1204] (2/4) Exclude cut with ID _21632_210_2253_1_1531994001765_3744140_154-84090-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 05:21:29,377 WARNING [train.py:1204] (2/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_103-336064-0_sp0.9 from training. Duration: 0.5666875 2023-10-03 05:21:29,713 INFO [scaling.py:1118] (2/4) WithLoss: name=encoder.encoders.3.encoder.layers.3.self_attn_weights, loss-sum=0.000e+00 2023-10-03 05:21:31,243 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_43-71989-0_sp0.9 from training. Duration: 0.6333125 2023-10-03 05:21:31,297 WARNING [train.py:1204] (2/4) Exclude cut with ID _3867_210_1883_1_1526641200575_4475830_319-349850-0_sp0.9 from training. Duration: 0.7444375 2023-10-03 05:21:31,366 WARNING [train.py:1204] (2/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_629-179454-0_sp1.1 from training. Duration: 0.963625 2023-10-03 05:21:31,374 WARNING [train.py:1204] (2/4) Exclude cut with ID _65263_210_22356_1_1535763598691_3921037_49-222917-0 from training. Duration: 0.92 2023-10-03 05:21:31,401 WARNING [train.py:1204] (2/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_602-331772-0_sp1.1 from training. Duration: 0.936375 2023-10-03 05:21:32,750 WARNING [train.py:1204] (2/4) Exclude cut with ID _58414_210_8254_1_1535421609162_7151182_292-134978-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 05:21:34,067 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_38793_210_2544_1_1533176114_3849320_200-299292-0_sp1.1 from training. Duration: 0.936375 2023-10-03 05:21:34,178 WARNING [train.py:1204] (2/4) Exclude cut with ID _14970_210_15709_1_1530775547361_1966965_6-23328-0 from training. Duration: 0.86 2023-10-03 05:21:35,512 WARNING [train.py:1204] (2/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_386-225511-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 05:21:35,530 WARNING [train.py:1204] (2/4) Exclude cut with ID _38323_210_12108_1_1533121233369_3303049_416-11919-0_sp1.1 from training. Duration: 0.87275 2023-10-03 05:21:36,822 WARNING [train.py:1204] (2/4) Exclude cut with ID _4866_210_2577_1_1527404412284_4364659_256-306830-0_sp1.1 from training. Duration: 0.7909375 2023-10-03 05:21:37,061 INFO [scaling.py:1118] (2/4) WithLoss: name=encoder.encoders.1.encoder.layers.1.self_attn_weights, loss-sum=0.000e+00 2023-10-03 05:21:38,628 WARNING [train.py:1204] (2/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_53-300544-0 from training. Duration: 0.67 2023-10-03 05:21:38,815 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.balancer2.prob, batch_count=1152186.6666666667, ans=0.125 2023-10-03 05:21:44,737 WARNING [train.py:1204] (2/4) Exclude cut with ID _41314_210_12651_1_1533459476543_3766460_80-74184-0_sp1.1 from training. Duration: 0.7545625 2023-10-03 05:21:44,755 WARNING [train.py:1204] (2/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_332-256168-0_sp1.1 from training. Duration: 0.6818125 2023-10-03 05:21:46,256 WARNING [train.py:1204] (2/4) Exclude cut with ID _48836_210_12210_1_1534075848293_3904150_429-35475-0_sp1.1 from training. Duration: 0.8090625 2023-10-03 05:21:47,509 INFO [train.py:1046] (2/4) Epoch 33, batch 2850, loss[loss=0.1716, simple_loss=0.2632, pruned_loss=0.03993, over 24335.00 frames. ], tot_loss[loss=0.1598, simple_loss=0.2383, pruned_loss=0.0406, over 4707885.29 frames. ], batch size: 74, lr: 3.09e-03, grad_scale: 16.0 2023-10-03 05:21:47,593 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_144-135833-0_sp1.1 from training. Duration: 0.97275 2023-10-03 05:21:51,916 WARNING [train.py:1204] (2/4) Exclude cut with ID _14224_210_4381_1_1530529062037_4264010_380-343670-0_sp0.9 from training. Duration: 0.97775 2023-10-03 05:21:51,936 WARNING [train.py:1204] (2/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_213-265174-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 05:21:51,966 WARNING [train.py:1204] (2/4) Exclude cut with ID _20455_210_5219_1_1531565996775_8098100_86-314283-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 05:21:54,429 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.545e+02 1.804e+02 2.039e+02 2.498e+02 3.555e+02, threshold=4.078e+02, percent-clipped=0.0 2023-10-03 05:21:54,617 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_41750_210_1960_1_1533520678_3948042_68-223813-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 05:21:54,676 WARNING [train.py:1204] (2/4) Exclude cut with ID _55153_210_2253_1_1534384786677_3162096_24-198429-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 05:21:57,346 WARNING [train.py:1204] (2/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_119-207162-0_sp0.9 from training. Duration: 0.788875 2023-10-03 05:21:57,396 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_12868_210_6753_1_1530702479_3610564_148-172861-0 from training. Duration: 0.5 2023-10-03 05:22:05,488 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_5522_210_3273_1_1527249606_3909096_318-203153-0 from training. Duration: 0.43 2023-10-03 05:22:05,494 WARNING [train.py:1204] (2/4) Exclude cut with ID _14884_210_2252_1_1530707459689_5561250_288-189949-0_sp1.1 from training. Duration: 0.97275 2023-10-03 05:22:06,967 WARNING [train.py:1204] (2/4) Exclude cut with ID _5294_210_1747_1_1527249643928_3696179_529-242298-0 from training. Duration: 0.95 2023-10-03 05:22:08,176 WARNING [train.py:1204] (2/4) Exclude cut with ID _36538_210_9994_1_1533204020363_3795620_503-110289-0_sp1.1 from training. Duration: 0.936375 2023-10-03 05:22:08,407 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.feed_forward1.hidden_balancer.prob, batch_count=1152320.0, ans=0.125 2023-10-03 05:22:10,110 WARNING [train.py:1204] (2/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_385-193052-0 from training. Duration: 0.86 2023-10-03 05:22:10,164 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_456-22141-0 from training. Duration: 0.88 2023-10-03 05:22:12,891 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_38965_210_4852_1_1533174906_3484735_284-117840-0_sp1.1 from training. Duration: 0.936375 2023-10-03 05:22:25,760 WARNING [train.py:1204] (2/4) Exclude cut with ID _32704_210_8954_1_1533017488840_6747370_75-201625-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 05:22:27,334 WARNING [train.py:1204] (2/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_255-312654-0_sp1.1 from training. Duration: 0.82725 2023-10-03 05:22:27,344 WARNING [train.py:1204] (2/4) Exclude cut with ID _47251_210_18628_1_1533814063128_4137250_214-88076-0_sp0.9 from training. Duration: 0.97775 2023-10-03 05:22:28,688 WARNING [train.py:1204] (2/4) Exclude cut with ID _4092_210_2151_1_1526643044449_3135759_227-213972-0_sp1.1 from training. Duration: 0.4909375 2023-10-03 05:22:28,703 WARNING [train.py:1204] (2/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_912-47473-0_sp1.1 from training. Duration: 0.6181875 2023-10-03 05:22:28,730 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6767_210_3273_1_1527855783_1718932_173-256601-0_sp1.1 from training. Duration: 0.72725 2023-10-03 05:22:30,763 WARNING [train.py:1204] (2/4) Exclude cut with ID _6274_210_2084_1_1527937583837_7494020_32-205288-0_sp1.1 from training. Duration: 0.7090625 2023-10-03 05:22:30,799 WARNING [train.py:1204] (2/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_39-248870-0 from training. Duration: 0.69 2023-10-03 05:22:32,359 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.balancer1.prob, batch_count=1152453.3333333333, ans=0.125 2023-10-03 05:22:33,539 WARNING [train.py:1204] (2/4) Exclude cut with ID _3541_210_2477_1_1526475408709_3263973_377-296839-0_sp1.1 from training. Duration: 0.5 2023-10-03 05:22:34,193 WARNING [train.py:1204] (2/4) Exclude cut with ID _13679_210_7497_1_1530534293662_3355098_413-56154-0_sp1.1 from training. Duration: 0.963625 2023-10-03 05:22:35,502 WARNING [train.py:1204] (2/4) Exclude cut with ID _14970_210_15709_1_1530775547361_1966965_201-84544-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 05:22:35,552 WARNING [train.py:1204] (2/4) Exclude cut with ID _21783_210_11357_1_1531742458730_3762679_105-95775-0_sp1.1 from training. Duration: 0.936375 2023-10-03 05:22:36,862 WARNING [train.py:1204] (2/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_131-191669-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 05:22:36,895 WARNING [train.py:1204] (2/4) Exclude cut with ID _14860_210_4915_1_1530702020340_7343240_695-4323-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 05:22:38,241 WARNING [train.py:1204] (2/4) Exclude cut with ID _39867_210_9826_1_1533553222030_9344259_991-130565-0_sp1.1 from training. Duration: 0.97275 2023-10-03 05:22:40,976 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_50-3289-0_sp1.1 from training. Duration: 0.82725 2023-10-03 05:22:42,853 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_54-347670-0_sp1.1 from training. Duration: 0.92725 2023-10-03 05:22:44,110 WARNING [train.py:1204] (2/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_200-153445-0_sp1.1 from training. Duration: 0.936375 2023-10-03 05:22:45,958 WARNING [train.py:1204] (2/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_279-302911-0_sp1.1 from training. Duration: 0.97275 2023-10-03 05:22:47,332 WARNING [train.py:1204] (2/4) Exclude cut with ID _14886_210_4220_1_1530702020355_3516190_159-73748-0_sp0.9 from training. Duration: 0.92225 2023-10-03 05:22:52,786 WARNING [train.py:1204] (2/4) Exclude cut with ID _53379_210_20034_1_1534222027363_3854654_23-222715-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 05:22:54,190 WARNING [train.py:1204] (2/4) Exclude cut with ID _12555_210_8750_1_1530075594402_3780239_449-267986-0 from training. Duration: 0.83 2023-10-03 05:22:54,209 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7544_210_6753_1_1528587722_3576686_295-258479-0 from training. Duration: 0.84 2023-10-03 05:22:55,711 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7002_210_1794_1_1527935634_4034444_487-236501-0_sp0.9 from training. Duration: 0.7333125 2023-10-03 05:22:55,756 WARNING [train.py:1204] (2/4) Exclude cut with ID _21783_210_11357_1_1531742458730_3762679_244-19107-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 05:22:55,770 WARNING [train.py:1204] (2/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_390-339137-0 from training. Duration: 0.56 2023-10-03 05:22:57,238 WARNING [train.py:1204] (2/4) Exclude cut with ID _7562_210_2084_1_1528887767916_3946229_186-326422-0_sp1.1 from training. Duration: 0.763625 2023-10-03 05:22:57,296 WARNING [train.py:1204] (2/4) Exclude cut with ID _33640_210_2655_1_1533117605479_4855469_645-145235-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 05:22:57,312 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_25767_210_2161_1_1532307483_3867748_506-171777-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 05:22:57,337 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_5438_210_1794_1_1527336687_2600009_233-58072-0_sp1.1 from training. Duration: 0.7 2023-10-03 05:22:57,337 WARNING [train.py:1204] (2/4) Exclude cut with ID IC0669W0063-330908 from training. Duration: 0.896 2023-10-03 05:22:57,377 WARNING [train.py:1204] (2/4) Exclude cut with ID IC0671W0070-331913 from training. Duration: 0.896 2023-10-03 05:22:57,380 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_258-310570-0_sp1.1 from training. Duration: 0.7454375 2023-10-03 05:22:58,664 WARNING [train.py:1204] (2/4) Exclude cut with ID _9461_210_4557_1_1529060514726_3677451_162-177507-0_sp1.1 from training. Duration: 0.97275 2023-10-03 05:23:01,884 INFO [train.py:1046] (2/4) Epoch 33, batch 2900, loss[loss=0.149, simple_loss=0.223, pruned_loss=0.03749, over 23421.00 frames. ], tot_loss[loss=0.159, simple_loss=0.2383, pruned_loss=0.03986, over 4722910.80 frames. ], batch size: 285, lr: 3.09e-03, grad_scale: 16.0 2023-10-03 05:23:04,630 WARNING [train.py:1204] (2/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_26-175553-0_sp0.9 from training. Duration: 0.67775 2023-10-03 05:23:04,655 WARNING [train.py:1204] (2/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_7-150481-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 05:23:06,621 WARNING [train.py:1204] (2/4) Exclude cut with ID _23557_210_8750_1_1531890106370_6846255_72-273262-0_sp1.1 from training. Duration: 0.963625 2023-10-03 05:23:06,712 WARNING [train.py:1204] (2/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_367-61330-0 from training. Duration: 0.75 2023-10-03 05:23:10,890 WARNING [train.py:1204] (2/4) Exclude cut with ID _39867_210_9826_1_1533553222030_9344259_1337-11141-0_sp1.1 from training. Duration: 0.936375 2023-10-03 05:23:10,927 WARNING [train.py:1204] (2/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_101-293367-0 from training. Duration: 0.63 2023-10-03 05:23:12,189 WARNING [train.py:1204] (2/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_10-100367-0 from training. Duration: 0.98 2023-10-03 05:23:13,578 WARNING [train.py:1204] (2/4) Exclude cut with ID _60057_210_10640_1_1534903065793_3333150_171-181133-0_sp1.1 from training. Duration: 0.8 2023-10-03 05:23:13,581 WARNING [train.py:1204] (2/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_844-96989-0_sp0.9 from training. Duration: 0.788875 2023-10-03 05:23:15,643 WARNING [train.py:1204] (2/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_301-144595-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 05:23:17,113 WARNING [train.py:1204] (2/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_115-147343-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 05:23:20,629 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=1152653.3333333333, ans=0.1 2023-10-03 05:23:21,681 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_3171_210_2087_1_1526109084_2330007_37-64334-0_sp1.1 from training. Duration: 0.7454375 2023-10-03 05:23:21,723 WARNING [train.py:1204] (2/4) Exclude cut with ID _19514_210_9259_1_1531530071330_5084230_769-272914-0_sp1.1 from training. Duration: 0.936375 2023-10-03 05:23:24,394 WARNING [train.py:1204] (2/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_76-311963-0_sp0.9 from training. Duration: 0.77775 2023-10-03 05:23:24,429 WARNING [train.py:1204] (2/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_526-62079-0 from training. Duration: 0.67 2023-10-03 05:23:24,482 WARNING [train.py:1204] (2/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_7-111954-0_sp0.9 from training. Duration: 0.62225 2023-10-03 05:23:27,173 WARNING [train.py:1204] (2/4) Exclude cut with ID _57997_210_4163_1_1534766425889_3961049_360-279140-0_sp1.1 from training. Duration: 0.97275 2023-10-03 05:23:27,904 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.4.encoder.layers.0.nonlin_attention.whiten2, num_groups=1, num_channels=384, metric=10.24 vs. limit=15.0 2023-10-03 05:23:28,678 WARNING [train.py:1204] (2/4) Exclude cut with ID _4866_210_2577_1_1527404412284_4364659_133-1696-0 from training. Duration: 0.99 2023-10-03 05:23:29,936 WARNING [train.py:1204] (2/4) Exclude cut with ID _5190_210_4557_1_1527244368799_373439_28-197030-0 from training. Duration: 0.72 2023-10-03 05:23:31,515 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_53-39003-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 05:23:31,667 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=1152720.0, ans=0.1 2023-10-03 05:23:32,797 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6673_210_3273_1_1527767790_3173240_351-167954-0 from training. Duration: 0.48 2023-10-03 05:23:32,812 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_2822_210_2715_1_1526112586_1999587_165-36656-0_sp1.1 from training. Duration: 0.8090625 2023-10-03 05:23:34,805 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_328-264892-0_sp1.1 from training. Duration: 0.92725 2023-10-03 05:23:34,809 WARNING [train.py:1204] (2/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_29-293896-0_sp0.9 from training. Duration: 0.67775 2023-10-03 05:23:37,999 WARNING [train.py:1204] (2/4) Exclude cut with ID _65263_210_22356_1_1535763598691_3921037_465-55448-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 05:23:39,339 WARNING [train.py:1204] (2/4) Exclude cut with ID _21989_210_5854_1_1531899214931_3271360_311-53106-0_sp1.1 from training. Duration: 0.97275 2023-10-03 05:23:42,190 WARNING [train.py:1204] (2/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_389-277915-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 05:23:46,630 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_69630_210_7234_1_1535791094_677837_63-277139-0_sp1.1 from training. Duration: 0.963625 2023-10-03 05:23:48,029 WARNING [train.py:1204] (2/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_85-138257-0 from training. Duration: 0.71 2023-10-03 05:23:48,044 WARNING [train.py:1204] (2/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_686-247600-0 from training. Duration: 0.92 2023-10-03 05:23:48,049 WARNING [train.py:1204] (2/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_108-255430-0_sp1.1 from training. Duration: 0.8818125 2023-10-03 05:23:52,798 WARNING [train.py:1204] (2/4) Exclude cut with ID _14884_210_2252_1_1530707459689_5561250_114-201399-0_sp1.1 from training. Duration: 0.6909375 2023-10-03 05:23:54,265 WARNING [train.py:1204] (2/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_506-42614-0 from training. Duration: 0.79 2023-10-03 05:23:55,660 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_43802_210_2290_1_1533516203_8829504_85-195240-0_sp1.1 from training. Duration: 0.7909375 2023-10-03 05:23:59,901 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_141-36243-0_sp1.1 from training. Duration: 0.97275 2023-10-03 05:24:07,442 WARNING [train.py:1204] (2/4) Exclude cut with ID _7852_210_5219_1_1528286471316_8139400_358-287492-0_sp1.1 from training. Duration: 0.87275 2023-10-03 05:24:07,475 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_805-71497-0_sp0.9 from training. Duration: 0.788875 2023-10-03 05:24:08,799 WARNING [train.py:1204] (2/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_178-39722-0 from training. Duration: 0.49 2023-10-03 05:24:11,431 WARNING [train.py:1204] (2/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_428-181925-0_sp1.1 from training. Duration: 0.963625 2023-10-03 05:24:11,444 WARNING [train.py:1204] (2/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_770-200430-0 from training. Duration: 0.89 2023-10-03 05:24:11,487 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_1104-271009-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 05:24:13,160 WARNING [train.py:1204] (2/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_128-296581-0_sp0.9 from training. Duration: 0.82225 2023-10-03 05:24:15,899 INFO [train.py:1046] (2/4) Epoch 33, batch 2950, loss[loss=0.1376, simple_loss=0.2121, pruned_loss=0.0316, over 22030.00 frames. ], tot_loss[loss=0.1602, simple_loss=0.2392, pruned_loss=0.04054, over 4712928.36 frames. ], batch size: 48, lr: 3.09e-03, grad_scale: 16.0 2023-10-03 05:24:18,673 WARNING [train.py:1204] (2/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_19-62511-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 05:24:20,625 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_805-71497-0 from training. Duration: 0.71 2023-10-03 05:24:20,709 WARNING [train.py:1204] (2/4) Exclude cut with ID _44778_210_2054_1_1533798127203_7443090_573-152077-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 05:24:20,714 WARNING [train.py:1204] (2/4) Exclude cut with ID _42875_210_14856_1_1533704397487_4012433_393-177816-0_sp1.1 from training. Duration: 0.963625 2023-10-03 05:24:22,132 WARNING [train.py:1204] (2/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_278-35159-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 05:24:23,326 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.464e+02 1.788e+02 1.939e+02 2.108e+02 3.552e+02, threshold=3.878e+02, percent-clipped=0.0 2023-10-03 05:24:23,471 WARNING [train.py:1204] (2/4) Exclude cut with ID _15037_210_2253_1_1530752294104_3267650_13-130361-0_sp1.1 from training. Duration: 0.9 2023-10-03 05:24:24,863 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_3019_210_2028_1_1526044245_3264865_127-135615-0 from training. Duration: 0.94 2023-10-03 05:24:26,071 WARNING [train.py:1204] (2/4) Exclude cut with ID _28317_210_12742_1_1532480302381_8510325_607-213951-0 from training. Duration: 0.84 2023-10-03 05:24:26,137 WARNING [train.py:1204] (2/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_436-318113-0_sp0.9 from training. Duration: 0.8333125 2023-10-03 05:24:26,138 WARNING [train.py:1204] (2/4) Exclude cut with ID _13938_210_9988_1_1530518718978_3693960_148-152225-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 05:24:31,641 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_2541_210_1794_1_1525590269_3084765_156-173713-0_sp1.1 from training. Duration: 0.736375 2023-10-03 05:24:33,062 WARNING [train.py:1204] (2/4) Exclude cut with ID _28323_210_16187_1_1532480395667_4100300_531-24157-0_sp1.1 from training. Duration: 0.8909375 2023-10-03 05:24:36,332 WARNING [train.py:1204] (2/4) Exclude cut with ID _29303_210_2575_1_1532392237303_7410649_382-217492-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 05:24:36,376 WARNING [train.py:1204] (2/4) Exclude cut with ID _58895_210_8750_1_1535184081320_3649089_270-158013-0_sp1.1 from training. Duration: 0.92725 2023-10-03 05:24:37,283 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.1.encoder.layers.0.feed_forward1.out_whiten, num_groups=1, num_channels=256, metric=5.09 vs. limit=15.0 2023-10-03 05:24:39,782 WARNING [train.py:1204] (2/4) Exclude cut with ID _29277_210_3279_1_1532581109293_3173069_311-206227-0_sp1.1 from training. Duration: 0.936375 2023-10-03 05:24:39,793 WARNING [train.py:1204] (2/4) Exclude cut with ID _7160_210_1876_1_1527940923231_3703799_282-322611-0_sp0.9 from training. Duration: 0.9666875 2023-10-03 05:24:41,176 WARNING [train.py:1204] (2/4) Exclude cut with ID _53219_210_4525_1_1534834836008_4064825_562-32295-0_sp1.1 from training. Duration: 0.963625 2023-10-03 05:24:42,542 WARNING [train.py:1204] (2/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_408-87330-0_sp1.1 from training. Duration: 0.963625 2023-10-03 05:24:42,551 WARNING [train.py:1204] (2/4) Exclude cut with ID _59202_210_26984_1_1534816810612_3549174_137-318710-0_sp1.1 from training. Duration: 0.8090625 2023-10-03 05:24:45,807 WARNING [train.py:1204] (2/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_372-30302-0 from training. Duration: 0.86 2023-10-03 05:24:51,751 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7543_210_6753_1_1528541907_1399322_178-56269-0_sp1.1 from training. Duration: 0.95275 2023-10-03 05:24:51,768 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9109W0012-549758 from training. Duration: 0.512 2023-10-03 05:24:53,177 WARNING [train.py:1204] (2/4) Exclude cut with ID _5410_210_1388_1_1527397055795_1719020_39-112221-0_sp0.9 from training. Duration: 0.7666875 2023-10-03 05:24:56,001 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9121W0012-554933 from training. Duration: 0.896 2023-10-03 05:24:56,089 WARNING [train.py:1204] (2/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_476-272757-0 from training. Duration: 0.93 2023-10-03 05:24:57,289 WARNING [train.py:1204] (2/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_1085-171718-0_sp1.1 from training. Duration: 0.92725 2023-10-03 05:24:57,341 WARNING [train.py:1204] (2/4) Exclude cut with ID _8480_210_1874_1_1528589391205_6728330_309-139896-0_sp1.1 from training. Duration: 0.736375 2023-10-03 05:24:57,342 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9130W0113-559703 from training. Duration: 0.896 2023-10-03 05:24:57,346 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_292-265928-0_sp1.1 from training. Duration: 0.463625 2023-10-03 05:24:58,868 WARNING [train.py:1204] (2/4) Exclude cut with ID _8772_210_1850_1_1528635595972_3664011_96-12756-0 from training. Duration: 0.98 2023-10-03 05:25:00,183 WARNING [train.py:1204] (2/4) Exclude cut with ID _12556_210_8341_1_1530081024100_7327690_725-193477-0_sp1.1 from training. Duration: 0.8909375 2023-10-03 05:25:01,517 WARNING [train.py:1204] (2/4) Exclude cut with ID _5289_210_2084_1_1527678202736_3779299_142-103317-0_sp1.1 from training. Duration: 0.763625 2023-10-03 05:25:03,031 WARNING [train.py:1204] (2/4) Exclude cut with ID _45012_210_12651_1_1533608435136_5371380_206-51075-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 05:25:04,320 WARNING [train.py:1204] (2/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_176-56120-0_sp1.1 from training. Duration: 0.7545625 2023-10-03 05:25:04,366 WARNING [train.py:1204] (2/4) Exclude cut with ID _40284_210_2084_1_1533513679814_3704279_220-201864-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 05:25:04,400 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9165W0004-576638 from training. Duration: 0.896 2023-10-03 05:25:04,445 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_5438_210_1794_1_1527336687_2600009_218-343336-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 05:25:05,774 WARNING [train.py:1204] (2/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_318-162017-0 from training. Duration: 0.91 2023-10-03 05:25:11,834 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_49544_210_1794_1_1534379770_5262083_483-349298-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 05:25:12,118 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.conv_module2.balancer1.prob, batch_count=1153120.0, ans=0.125 2023-10-03 05:25:13,189 WARNING [train.py:1204] (2/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_197-28164-0_sp0.9 from training. Duration: 0.888875 2023-10-03 05:25:13,243 WARNING [train.py:1204] (2/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_643-52007-0 from training. Duration: 0.57 2023-10-03 05:25:13,271 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_38965_210_4852_1_1533174906_3484735_403-332963-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 05:25:14,662 WARNING [train.py:1204] (2/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_340-174176-0 from training. Duration: 0.49 2023-10-03 05:25:16,834 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.1.encoder.layers.1.self_attn_weights.whiten_keys, num_groups=4, num_channels=128, metric=2.28 vs. limit=6.0 2023-10-03 05:25:17,840 WARNING [train.py:1204] (2/4) Exclude cut with ID _56708_210_13177_1_1535252351282_3422899_363-917-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 05:25:20,915 WARNING [train.py:1204] (2/4) Exclude cut with ID _19896_210_1883_1_1531709022662_6649569_525-159063-0_sp1.1 from training. Duration: 0.936375 2023-10-03 05:25:20,963 WARNING [train.py:1204] (2/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_430-116139-0_sp0.9 from training. Duration: 0.9444375 2023-10-03 05:25:21,235 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.balancer1.prob, batch_count=1153186.6666666667, ans=0.125 2023-10-03 05:25:22,341 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_53753_210_7234_1_1534320003_3594148_86-329250-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 05:25:22,357 WARNING [train.py:1204] (2/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_406-275649-0_sp1.1 from training. Duration: 0.3818125 2023-10-03 05:25:25,026 WARNING [train.py:1204] (2/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_1083-147536-0_sp1.1 from training. Duration: 0.8818125 2023-10-03 05:25:25,088 WARNING [train.py:1204] (2/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_54-311741-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 05:25:25,093 WARNING [train.py:1204] (2/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_237-58433-0_sp0.9 from training. Duration: 0.82225 2023-10-03 05:25:26,494 WARNING [train.py:1204] (2/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_98-51676-0_sp1.1 from training. Duration: 0.663625 2023-10-03 05:25:26,558 WARNING [train.py:1204] (2/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_386-270472-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 05:25:27,920 WARNING [train.py:1204] (2/4) Exclude cut with ID _19023_210_4353_1_1531378892552_5820780_83-254794-0_sp1.1 from training. Duration: 0.7818125 2023-10-03 05:25:29,292 INFO [train.py:1046] (2/4) Epoch 33, batch 3000, loss[loss=0.1845, simple_loss=0.2651, pruned_loss=0.05197, over 23570.00 frames. ], tot_loss[loss=0.1614, simple_loss=0.2405, pruned_loss=0.04111, over 4715345.64 frames. ], batch size: 85, lr: 3.09e-03, grad_scale: 16.0 2023-10-03 05:25:29,293 INFO [train.py:1069] (2/4) Computing validation loss 2023-10-03 05:25:41,083 INFO [train.py:1078] (2/4) Epoch 33, validation: loss=0.3581, simple_loss=0.2789, pruned_loss=0.2187, over 1125622.00 frames. 2023-10-03 05:25:41,084 INFO [train.py:1079] (2/4) Maximum memory allocated so far is 20967MB 2023-10-03 05:25:41,191 WARNING [train.py:1204] (2/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_314-93656-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 05:25:41,202 WARNING [train.py:1204] (2/4) Exclude cut with ID _14170_210_5219_1_1530525949868_4469650_336-213530-0 from training. Duration: 0.9 2023-10-03 05:25:43,843 WARNING [train.py:1204] (2/4) Exclude cut with ID _36686_210_5665_1_1533778225913_3646410_65-221120-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 05:25:45,977 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_26433_210_12108_1_1532157158_4056480_271-68178-0_sp1.1 from training. Duration: 0.92725 2023-10-03 05:25:46,029 WARNING [train.py:1204] (2/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_46-207599-0_sp0.9 from training. Duration: 0.711125 2023-10-03 05:25:48,985 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9303W0114-637133 from training. Duration: 0.896 2023-10-03 05:25:49,025 WARNING [train.py:1204] (2/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_490-207861-0 from training. Duration: 0.98 2023-10-03 05:25:52,352 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6767_210_3273_1_1527855783_1718932_173-256601-0_sp0.9 from training. Duration: 0.888875 2023-10-03 05:25:53,627 WARNING [train.py:1204] (2/4) Exclude cut with ID _13921_210_9994_1_1530841782272_3503520_327-69677-0_sp1.1 from training. Duration: 0.8181875 2023-10-03 05:25:53,683 WARNING [train.py:1204] (2/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_508-335369-0 from training. Duration: 0.48 2023-10-03 05:25:53,712 WARNING [train.py:1204] (2/4) Exclude cut with ID _64631_210_27767_1_1535349580068_4165160_24-46567-0_sp1.1 from training. Duration: 0.963625 2023-10-03 05:25:59,747 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_89-35748-0_sp1.1 from training. Duration: 0.7181875 2023-10-03 05:26:08,619 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.bypass.scale_min, batch_count=1153320.0, ans=0.2 2023-10-03 05:26:11,193 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_26433_210_12108_1_1532157158_4056480_578-262107-0_sp1.1 from training. Duration: 0.9 2023-10-03 05:26:19,024 WARNING [train.py:1204] (2/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_129-337802-0 from training. Duration: 0.72 2023-10-03 05:26:19,133 WARNING [train.py:1204] (2/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_356-167816-0_sp0.9 from training. Duration: 0.8 2023-10-03 05:26:21,892 WARNING [train.py:1204] (2/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_137-116442-0_sp1.1 from training. Duration: 0.7090625 2023-10-03 05:26:21,938 WARNING [train.py:1204] (2/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_358-172940-0_sp1.1 from training. Duration: 0.963625 2023-10-03 05:26:21,963 WARNING [train.py:1204] (2/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_668-344147-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 05:26:25,114 WARNING [train.py:1204] (2/4) Exclude cut with ID _62337_210_12392_1_1535009273105_3412229_255-157667-0_sp1.1 from training. Duration: 0.936375 2023-10-03 05:26:25,117 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_486-151131-0 from training. Duration: 0.85 2023-10-03 05:26:26,601 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_53638_210_21581_1_1534297319_5808449_885-80172-0 from training. Duration: 0.96 2023-10-03 05:26:27,957 WARNING [train.py:1204] (2/4) Exclude cut with ID _50093_210_4220_1_1534417141207_3432299_105-183067-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 05:26:29,299 WARNING [train.py:1204] (2/4) Exclude cut with ID _7562_210_2084_1_1528887767916_3946229_345-169156-0_sp0.9 from training. Duration: 0.6444375 2023-10-03 05:26:30,722 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_4528_210_3273_1_1527076654_3276777_209-219804-0_sp0.9 from training. Duration: 0.7666875 2023-10-03 05:26:32,172 WARNING [train.py:1204] (2/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_35-153886-0_sp0.9 from training. Duration: 0.9333125 2023-10-03 05:26:32,237 WARNING [train.py:1204] (2/4) Exclude cut with ID _30800_210_22382_1_1532499347904_4515959_332-121925-0_sp1.1 from training. Duration: 0.92725 2023-10-03 05:26:32,239 WARNING [train.py:1204] (2/4) Exclude cut with ID _59202_210_26984_1_1534816810612_3549174_495-65515-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 05:26:36,429 WARNING [train.py:1204] (2/4) Exclude cut with ID _6274_210_2084_1_1527937583837_7494020_769-168302-0_sp1.1 from training. Duration: 0.6454375 2023-10-03 05:26:36,474 WARNING [train.py:1204] (2/4) Exclude cut with ID _24271_210_12658_1_1531987270022_7190519_708-71345-0_sp1.1 from training. Duration: 0.936375 2023-10-03 05:26:36,474 WARNING [train.py:1204] (2/4) Exclude cut with ID 3033-130750-0096-55598-0_sp0.9 from training. Duration: 0.92225 2023-10-03 05:26:39,627 WARNING [train.py:1204] (2/4) Exclude cut with ID _14087_210_2253_1_1530529224512_3014029_219-224445-0_sp0.9 from training. Duration: 0.9333125 2023-10-03 05:26:41,114 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_174-277364-0 from training. Duration: 0.71 2023-10-03 05:26:41,200 WARNING [train.py:1204] (2/4) Exclude cut with ID _45021_210_9994_1_1533619751094_3330288_96-239010-0_sp1.1 from training. Duration: 0.82725 2023-10-03 05:26:41,237 WARNING [train.py:1204] (2/4) Exclude cut with ID _31602_210_5854_1_1532677021456_4458250_489-305116-0_sp1.1 from training. Duration: 0.97275 2023-10-03 05:26:42,588 WARNING [train.py:1204] (2/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_269-154726-0_sp0.9 from training. Duration: 0.9555625 2023-10-03 05:26:45,394 WARNING [train.py:1204] (2/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_126-54418-0_sp1.1 from training. Duration: 0.92725 2023-10-03 05:26:45,434 WARNING [train.py:1204] (2/4) Exclude cut with ID _42326_210_7770_1_1533814282531_3707269_182-224159-0_sp1.1 from training. Duration: 0.92725 2023-10-03 05:26:47,270 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7002_210_1794_1_1527939683_5157286_1-281264-0_sp0.9 from training. Duration: 0.488875 2023-10-03 05:26:47,311 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_86-16881-0 from training. Duration: 0.58 2023-10-03 05:26:48,581 WARNING [train.py:1204] (2/4) Exclude cut with ID _19514_210_9259_1_1531530071330_5084230_637-279094-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 05:26:48,615 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_26433_210_12108_1_1532157158_4056480_578-262107-0 from training. Duration: 0.99 2023-10-03 05:26:48,660 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6291_210_3273_1_1527683649_1078498_82-221196-0_sp1.1 from training. Duration: 0.6909375 2023-10-03 05:26:50,639 WARNING [train.py:1204] (2/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_402-102311-0 from training. Duration: 0.67 2023-10-03 05:26:53,257 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1015-343884-0_sp1.1 from training. Duration: 0.563625 2023-10-03 05:26:53,339 WARNING [train.py:1204] (2/4) Exclude cut with ID _13684_210_7497_1_1530619354863_3598359_312-235606-0_sp1.1 from training. Duration: 0.5454375 2023-10-03 05:26:55,213 WARNING [train.py:1204] (2/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_55-31904-0 from training. Duration: 0.88 2023-10-03 05:26:55,296 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_3133_210_2087_1_1526106446_2324690_25-332138-0 from training. Duration: 0.91 2023-10-03 05:26:55,298 WARNING [train.py:1204] (2/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_912-282412-0_sp0.9 from training. Duration: 0.5444375 2023-10-03 05:26:56,570 INFO [train.py:1046] (2/4) Epoch 33, batch 3050, loss[loss=0.1378, simple_loss=0.2174, pruned_loss=0.02908, over 24422.00 frames. ], tot_loss[loss=0.1632, simple_loss=0.2421, pruned_loss=0.04212, over 4713289.75 frames. ], batch size: 58, lr: 3.09e-03, grad_scale: 16.0 2023-10-03 05:26:56,686 WARNING [train.py:1204] (2/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_458-299435-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 05:26:58,147 WARNING [train.py:1204] (2/4) Exclude cut with ID _69584_210_5219_1_1535785133775_4044450_275-88403-0_sp1.1 from training. Duration: 0.92725 2023-10-03 05:26:58,163 WARNING [train.py:1204] (2/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_841-341679-0_sp1.1 from training. Duration: 0.52725 2023-10-03 05:26:58,170 WARNING [train.py:1204] (2/4) Exclude cut with ID _41391_210_7994_1_1533603771872_3638201_1-54521-0_sp1.1 from training. Duration: 0.97275 2023-10-03 05:26:58,220 WARNING [train.py:1204] (2/4) Exclude cut with ID _56235_210_10640_1_1534557335235_4161099_495-186609-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 05:27:00,983 WARNING [train.py:1204] (2/4) Exclude cut with ID _4681_210_3525_1_1527250689464_120889_15-126760-0 from training. Duration: 0.81 2023-10-03 05:27:02,459 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_3019_210_2028_1_1526044245_3264865_127-135615-0_sp1.1 from training. Duration: 0.8545625 2023-10-03 05:27:03,729 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.626e+02 1.901e+02 2.048e+02 2.289e+02 3.124e+02, threshold=4.095e+02, percent-clipped=0.0 2023-10-03 05:27:05,675 WARNING [train.py:1204] (2/4) Exclude cut with ID _30191_210_1664_1_1533189915047_7196670_915-21561-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 05:27:05,725 WARNING [train.py:1204] (2/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_6-214815-0_sp0.9 from training. Duration: 0.9444375 2023-10-03 05:27:08,327 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_45399_210_4852_1_1533624725_7579737_190-137494-0_sp1.1 from training. Duration: 0.97275 2023-10-03 05:27:12,255 WARNING [train.py:1204] (2/4) Exclude cut with ID _41314_210_12651_1_1533459476543_3766460_207-132038-0 from training. Duration: 0.94 2023-10-03 05:27:18,273 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.0.attention_skip_rate, batch_count=1153653.3333333333, ans=0.0 2023-10-03 05:27:19,551 WARNING [train.py:1204] (2/4) Exclude cut with ID _14603_210_4220_1_1530615688441_3097319_90-31383-0 from training. Duration: 0.87 2023-10-03 05:27:21,389 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_15517_210_6753_1_1530952293_1664795_119-31001-0 from training. Duration: 0.94 2023-10-03 05:27:21,427 WARNING [train.py:1204] (2/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_684-85342-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 05:27:24,235 WARNING [train.py:1204] (2/4) Exclude cut with ID _14603_210_4220_1_1530615688441_3097319_97-325053-0_sp1.1 from training. Duration: 0.836375 2023-10-03 05:27:27,526 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_68702_210_23689_1_1535799577_3420068_168-25150-0_sp1.1 from training. Duration: 0.97275 2023-10-03 05:27:27,542 WARNING [train.py:1204] (2/4) Exclude cut with ID _65134_210_14763_1_1535335393985_3505368_111-27984-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 05:27:27,597 WARNING [train.py:1204] (2/4) Exclude cut with ID _48678_210_5663_1_1533988830947_3769199_276-226230-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 05:27:30,379 WARNING [train.py:1204] (2/4) Exclude cut with ID _28427_210_4220_1_1532581240967_3061069_43-84928-0_sp1.1 from training. Duration: 0.9 2023-10-03 05:27:30,410 WARNING [train.py:1204] (2/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_158-250116-0_sp1.1 from training. Duration: 0.563625 2023-10-03 05:27:30,432 WARNING [train.py:1204] (2/4) Exclude cut with ID _66873_210_5517_1_1535454136506_4293370_279-332556-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 05:27:31,809 WARNING [train.py:1204] (2/4) Exclude cut with ID _42863_210_9070_1_1533902398788_3696920_488-230146-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 05:27:31,809 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_387-247159-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 05:27:32,050 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.conv_module1.balancer1.prob, batch_count=1153720.0, ans=0.125 2023-10-03 05:27:33,123 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6729_210_2550_1_1528032481_1788725_261-42619-0_sp1.1 from training. Duration: 0.97275 2023-10-03 05:27:34,539 WARNING [train.py:1204] (2/4) Exclude cut with ID _19896_210_1883_1_1531709022662_6649569_831-44724-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 05:27:40,347 WARNING [train.py:1204] (2/4) Exclude cut with ID _55153_210_2253_1_1534384786677_3162096_280-285944-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 05:27:40,376 WARNING [train.py:1204] (2/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_448-4048-0 from training. Duration: 0.96 2023-10-03 05:27:40,418 WARNING [train.py:1204] (2/4) Exclude cut with ID _29678_210_7893_1_1532421042787_3906118_456-202548-0_sp1.1 from training. Duration: 0.97275 2023-10-03 05:27:40,436 WARNING [train.py:1204] (2/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_107-332398-0_sp1.1 from training. Duration: 0.7090625 2023-10-03 05:27:42,779 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.0.layers.1.nonlin_attention.whiten1, num_groups=1, num_channels=144, metric=5.14 vs. limit=10.0 2023-10-03 05:27:43,358 WARNING [train.py:1204] (2/4) Exclude cut with ID _13921_210_9994_1_1530838702800_3003429_46-149444-0_sp1.1 from training. Duration: 0.8545625 2023-10-03 05:27:44,677 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_257-20733-0_sp0.9 from training. Duration: 0.8444375 2023-10-03 05:27:44,725 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_53753_210_7234_1_1534320003_3594148_385-234809-0_sp1.1 from training. Duration: 0.963625 2023-10-03 05:27:44,778 WARNING [train.py:1204] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_292-215346-0_sp1.1 from training. Duration: 0.8818125 2023-10-03 05:27:48,234 WARNING [train.py:1204] (2/4) Exclude cut with ID _68965_210_1883_1_1535698962272_3497490_196-124268-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 05:27:50,098 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_23801_210_16223_1_1532066392_3294657_570-5439-0_sp1.1 from training. Duration: 0.8818125 2023-10-03 05:27:54,351 WARNING [train.py:1204] (2/4) Exclude cut with ID _5166_210_2084_1_1527073197160_4087993_349-6928-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 05:27:55,682 WARNING [train.py:1204] (2/4) Exclude cut with ID _60621_210_17071_1_1535013053530_3671739_226-255202-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 05:27:55,686 WARNING [train.py:1204] (2/4) Exclude cut with ID _41817_210_18835_1_1533434477160_7423049_448-130302-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 05:27:57,673 WARNING [train.py:1204] (2/4) Exclude cut with ID _6653_210_2629_1_1527919172441_6732679_758-143677-0_sp1.1 from training. Duration: 0.8909375 2023-10-03 05:27:57,708 WARNING [train.py:1204] (2/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_912-47473-0_sp0.9 from training. Duration: 0.7555625 2023-10-03 05:27:57,953 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.attention_skip_rate, batch_count=1153853.3333333333, ans=0.0 2023-10-03 05:27:59,001 WARNING [train.py:1204] (2/4) Exclude cut with ID _6694_210_4599_1_1527852637490_3965529_356-169233-0_sp1.1 from training. Duration: 0.9 2023-10-03 05:28:00,392 WARNING [train.py:1204] (2/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_1-99680-0 from training. Duration: 0.67 2023-10-03 05:28:01,684 WARNING [train.py:1204] (2/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_199-86432-0_sp1.1 from training. Duration: 0.8909375 2023-10-03 05:28:01,713 WARNING [train.py:1204] (2/4) Exclude cut with ID _61148_210_6897_1_1535094022726_4217610_362-182430-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 05:28:01,783 WARNING [train.py:1204] (2/4) Exclude cut with ID _49376_210_5986_1_1533985278732_3370740_301-154763-0 from training. Duration: 0.98 2023-10-03 05:28:04,531 WARNING [train.py:1204] (2/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_572-185150-0_sp1.1 from training. Duration: 0.8818125 2023-10-03 05:28:08,067 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.conv_skip_rate, batch_count=1153853.3333333333, ans=0.0 2023-10-03 05:28:09,143 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_47896_210_20508_1_1534492518_5958868_558-314572-0_sp1.1 from training. Duration: 0.8818125 2023-10-03 05:28:10,397 INFO [train.py:1046] (2/4) Epoch 33, batch 3100, loss[loss=0.1586, simple_loss=0.2475, pruned_loss=0.03479, over 24645.00 frames. ], tot_loss[loss=0.1623, simple_loss=0.241, pruned_loss=0.0418, over 4713254.32 frames. ], batch size: 73, lr: 3.09e-03, grad_scale: 16.0 2023-10-03 05:28:10,486 WARNING [train.py:1204] (2/4) Exclude cut with ID _45560_210_21982_1_1533812603951_3584999_111-6376-0_sp0.9 from training. Duration: 0.9333125 2023-10-03 05:28:14,438 WARNING [train.py:1204] (2/4) Exclude cut with ID _14170_210_5219_1_1530525949868_4469650_412-317077-0_sp1.1 from training. Duration: 0.6818125 2023-10-03 05:28:14,780 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.ff3_skip_rate, batch_count=1153920.0, ans=0.0 2023-10-03 05:28:15,881 WARNING [train.py:1204] (2/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_718-68505-0 from training. Duration: 0.89 2023-10-03 05:28:19,162 WARNING [train.py:1204] (2/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_77-311404-0 from training. Duration: 0.94 2023-10-03 05:28:19,236 WARNING [train.py:1204] (2/4) Exclude cut with ID _60229_210_27767_1_1534838406158_3655350_264-279573-0 from training. Duration: 0.83 2023-10-03 05:28:21,203 WARNING [train.py:1204] (2/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_106-300985-0_sp1.1 from training. Duration: 0.8181875 2023-10-03 05:28:22,805 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.conv_module1.balancer2.min_abs, batch_count=1153920.0, ans=0.5 2023-10-03 05:28:23,898 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_4183_210_2087_1_1526729100_1869838_79-277189-0_sp1.1 from training. Duration: 0.963625 2023-10-03 05:28:23,917 WARNING [train.py:1204] (2/4) Exclude cut with ID _28427_210_4220_1_1532581240967_3061069_195-295268-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 05:28:26,823 WARNING [train.py:1204] (2/4) Exclude cut with ID _4092_210_2151_1_1526643044449_3135759_356-278694-0_sp0.9 from training. Duration: 0.52225 2023-10-03 05:28:30,039 WARNING [train.py:1204] (2/4) Exclude cut with ID _14459_210_2228_1_1531443641946_7516700_777-71446-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 05:28:35,501 WARNING [train.py:1204] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_520-126729-0 from training. Duration: 0.7 2023-10-03 05:28:40,241 WARNING [train.py:1204] (2/4) Exclude cut with ID _13684_210_7497_1_1530619354863_3598359_312-235606-0_sp0.9 from training. Duration: 0.6666875 2023-10-03 05:28:40,287 WARNING [train.py:1204] (2/4) Exclude cut with ID _37537_210_2253_1_1533171758568_3421949_283-349402-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 05:28:41,771 WARNING [train.py:1204] (2/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_29-117932-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 05:28:41,793 WARNING [train.py:1204] (2/4) Exclude cut with ID _29303_210_2575_1_1532392237303_7410649_721-283351-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 05:28:41,866 WARNING [train.py:1204] (2/4) Exclude cut with ID _14886_210_4220_1_1530702020355_3516190_175-229073-0_sp0.9 from training. Duration: 0.5 2023-10-03 05:28:44,612 WARNING [train.py:1204] (2/4) Exclude cut with ID _15458_210_8341_1_1530858614263_3834410_70-268830-0_sp1.1 from training. Duration: 0.97275 2023-10-03 05:28:44,634 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_2295_210_2087_1_1525500993_2407863_234-83104-0 from training. Duration: 0.62 2023-10-03 05:28:44,635 WARNING [train.py:1204] (2/4) Exclude cut with ID _22961_210_8254_1_1531976366672_4278180_317-199546-0_sp1.1 from training. Duration: 0.7818125 2023-10-03 05:28:46,035 WARNING [train.py:1204] (2/4) Exclude cut with ID _27894_210_12392_1_1532415496078_6902439_547-196022-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 05:28:47,891 WARNING [train.py:1204] (2/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_46-207599-0 from training. Duration: 0.64 2023-10-03 05:28:49,257 WARNING [train.py:1204] (2/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_426-326246-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 05:28:53,890 WARNING [train.py:1204] (2/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_445-117398-0_sp0.9 from training. Duration: 0.988875 2023-10-03 05:28:55,226 WARNING [train.py:1204] (2/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_563-347832-0 from training. Duration: 0.89 2023-10-03 05:28:55,314 WARNING [train.py:1204] (2/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_252-216674-0 from training. Duration: 0.54 2023-10-03 05:28:56,604 WARNING [train.py:1204] (2/4) Exclude cut with ID _59611_210_8895_1_1534741198240_3650361_187-212779-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 05:28:57,947 WARNING [train.py:1204] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_282-159367-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 05:28:59,367 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_68702_210_23689_1_1535799577_3420068_33-136883-0_sp1.1 from training. Duration: 0.936375 2023-10-03 05:28:59,378 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_47228_210_4983_1_1533780021_3672027_184-293706-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 05:28:59,398 WARNING [train.py:1204] (2/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_656-188361-0_sp0.9 from training. Duration: 0.9666875 2023-10-03 05:29:00,838 WARNING [train.py:1204] (2/4) Exclude cut with ID _4866_210_2577_1_1527404412284_4364659_352-157930-0_sp0.9 from training. Duration: 0.9 2023-10-03 05:29:00,839 WARNING [train.py:1204] (2/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_210-106047-0_sp1.1 from training. Duration: 0.8818125 2023-10-03 05:29:03,913 WARNING [train.py:1204] (2/4) Exclude cut with ID _9389_210_1664_1_1529217337301_5289269_289-213417-0_sp1.1 from training. Duration: 0.8090625 2023-10-03 05:29:03,941 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_3910_210_1794_1_1526777667_3747260_178-41186-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 05:29:03,945 WARNING [train.py:1204] (2/4) Exclude cut with ID _27052_210_5986_1_1532329197187_3585009_24-172263-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 05:29:03,949 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_5522_210_3273_1_1527249606_3909096_318-203153-0_sp1.1 from training. Duration: 0.3909375 2023-10-03 05:29:07,377 WARNING [train.py:1204] (2/4) Exclude cut with ID _27894_210_12392_1_1532415496078_6902439_222-51982-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 05:29:08,733 WARNING [train.py:1204] (2/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_87-131427-0 from training. Duration: 0.78 2023-10-03 05:29:11,498 WARNING [train.py:1204] (2/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_372-298093-0_sp0.9 from training. Duration: 0.888875 2023-10-03 05:29:11,547 WARNING [train.py:1204] (2/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_663-166973-0 from training. Duration: 0.8 2023-10-03 05:29:12,849 WARNING [train.py:1204] (2/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_429-177469-0_sp1.1 from training. Duration: 0.936375 2023-10-03 05:29:12,889 WARNING [train.py:1204] (2/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_724-172651-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 05:29:12,927 WARNING [train.py:1204] (2/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_103-336064-0 from training. Duration: 0.51 2023-10-03 05:29:23,556 WARNING [train.py:1204] (2/4) Exclude cut with ID _48677_210_5663_1_1533902405919_3892989_111-11439-0 from training. Duration: 0.96 2023-10-03 05:29:24,782 INFO [train.py:1046] (2/4) Epoch 33, batch 3150, loss[loss=0.1519, simple_loss=0.227, pruned_loss=0.03836, over 24302.00 frames. ], tot_loss[loss=0.1612, simple_loss=0.2397, pruned_loss=0.04139, over 4709998.46 frames. ], batch size: 56, lr: 3.09e-03, grad_scale: 16.0 2023-10-03 05:29:25,002 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=1154253.3333333333, ans=0.1 2023-10-03 05:29:26,283 WARNING [train.py:1204] (2/4) Exclude cut with ID _8085_210_4557_1_1528455633493_3716100_95-103398-0_sp1.1 from training. Duration: 0.963625 2023-10-03 05:29:27,591 WARNING [train.py:1204] (2/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_1041-110837-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 05:29:29,089 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_50050_210_16223_1_1535435796_7290138_1155-121757-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 05:29:29,091 WARNING [train.py:1204] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_602-275260-0_sp0.9 from training. Duration: 0.911125 2023-10-03 05:29:29,144 WARNING [train.py:1204] (2/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_265-164486-0 from training. Duration: 0.96 2023-10-03 05:29:30,485 WARNING [train.py:1204] (2/4) Exclude cut with ID _46067_210_4220_1_1534125571066_3307450_142-229375-0_sp1.1 from training. Duration: 0.963625 2023-10-03 05:29:31,691 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.544e+02 1.823e+02 1.977e+02 2.153e+02 2.836e+02, threshold=3.954e+02, percent-clipped=0.0 2023-10-03 05:29:31,769 WARNING [train.py:1204] (2/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_310-26186-0_sp1.1 from training. Duration: 0.636375 2023-10-03 05:29:32,642 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.1.encoder.layers.0.feed_forward1.out_whiten, num_groups=1, num_channels=256, metric=5.29 vs. limit=15.0 2023-10-03 05:29:33,516 WARNING [train.py:1204] (2/4) Exclude cut with ID _28461_210_2253_1_1532771775189_3041299_136-270644-0 from training. Duration: 0.93 2023-10-03 05:29:35,519 WARNING [train.py:1204] (2/4) Exclude cut with ID _14860_210_4915_1_1530702020340_7343240_438-286372-0_sp1.1 from training. Duration: 0.936375 2023-10-03 05:29:38,069 WARNING [train.py:1204] (2/4) Exclude cut with ID IC0128W0004-63309 from training. Duration: 0.831 2023-10-03 05:29:39,520 WARNING [train.py:1204] (2/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_135-174080-0 from training. Duration: 0.53 2023-10-03 05:29:39,561 WARNING [train.py:1204] (2/4) Exclude cut with ID _41326_210_5854_1_1533778076509_3492080_277-59511-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 05:29:40,873 WARNING [train.py:1204] (2/4) Exclude cut with ID IC0144W0112-71379 from training. Duration: 0.992 2023-10-03 05:29:41,175 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=1154320.0, ans=0.1 2023-10-03 05:29:42,235 WARNING [train.py:1204] (2/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_711-15688-0_sp0.9 from training. Duration: 0.47775 2023-10-03 05:29:42,338 WARNING [train.py:1204] (2/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_237-332027-0 from training. Duration: 0.95 2023-10-03 05:29:43,662 WARNING [train.py:1204] (2/4) Exclude cut with ID _7970_210_1883_1_1528455616296_3472230_25-178407-0 from training. Duration: 0.71 2023-10-03 05:29:43,664 WARNING [train.py:1204] (2/4) Exclude cut with ID _8480_210_1874_1_1528589391205_6728330_309-139896-0 from training. Duration: 0.81 2023-10-03 05:29:43,680 WARNING [train.py:1204] (2/4) Exclude cut with ID _39571_210_13449_1_1533801584143_3651077_319-268877-0_sp1.1 from training. Duration: 0.936375 2023-10-03 05:29:43,683 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_43802_210_2290_1_1533516203_8829504_155-246880-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 05:29:43,775 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_566-122599-0_sp1.1 from training. Duration: 0.936375 2023-10-03 05:29:45,156 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_12868_210_6753_1_1530701932_530275_18-76322-0 from training. Duration: 0.87 2023-10-03 05:29:46,915 WARNING [train.py:1204] (2/4) Exclude cut with ID _56494_210_4220_1_1535284837625_3033169_159-106035-0_sp1.1 from training. Duration: 0.963625 2023-10-03 05:29:46,953 WARNING [train.py:1204] (2/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_98-276013-0_sp1.1 from training. Duration: 0.963625 2023-10-03 05:29:48,304 WARNING [train.py:1204] (2/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_330-216124-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 05:29:49,760 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_177-58005-0_sp0.9 from training. Duration: 0.688875 2023-10-03 05:29:54,568 WARNING [train.py:1204] (2/4) Exclude cut with ID _56235_210_10640_1_1534557335235_4161099_343-139898-0 from training. Duration: 0.96 2023-10-03 05:29:55,879 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6961_210_2087_1_1527937168_6815011_395-185060-0_sp1.1 from training. Duration: 0.863625 2023-10-03 05:29:56,218 INFO [scaling.py:1118] (2/4) WithLoss: name=encoder.encoders.3.encoder.layers.3.self_attn_weights, loss-sum=0.000e+00 2023-10-03 05:29:57,301 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7002_210_1794_1_1527939683_5157286_184-229630-0_sp0.9 from training. Duration: 0.82225 2023-10-03 05:29:58,619 WARNING [train.py:1204] (2/4) Exclude cut with ID _59170_210_6721_1_1534983633960_4203335_153-18895-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 05:29:58,654 WARNING [train.py:1204] (2/4) Exclude cut with ID _49643_210_7989_1_1534640244224_3939031_598-161926-0 from training. Duration: 0.9 2023-10-03 05:30:01,383 WARNING [train.py:1204] (2/4) Exclude cut with ID _4068_210_2629_1_1526697350395_1546259_224-288693-0 from training. Duration: 0.59 2023-10-03 05:30:01,436 WARNING [train.py:1204] (2/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_718-68505-0_sp1.1 from training. Duration: 0.8090625 2023-10-03 05:30:03,414 WARNING [train.py:1204] (2/4) Exclude cut with ID _4068_210_2629_1_1526697350395_1546259_224-288693-0_sp0.9 from training. Duration: 0.6555625 2023-10-03 05:30:03,433 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_2296_210_2087_1_1525503448_3397335_57-185430-0_sp0.9 from training. Duration: 0.6666875 2023-10-03 05:30:04,788 WARNING [train.py:1204] (2/4) Exclude cut with ID _15458_210_8341_1_1530858614263_3834410_49-235472-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 05:30:04,790 WARNING [train.py:1204] (2/4) Exclude cut with ID _64135_210_2253_1_1535677267876_7608972_498-5039-0_sp0.9 from training. Duration: 0.8666875 2023-10-03 05:30:06,830 WARNING [train.py:1204] (2/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_283-220372-0_sp0.9 from training. Duration: 0.62225 2023-10-03 05:30:06,839 WARNING [train.py:1204] (2/4) Exclude cut with ID _6473_210_1741_1_1527901307396_3815380_108-17335-0_sp0.9 from training. Duration: 0.72225 2023-10-03 05:30:08,228 WARNING [train.py:1204] (2/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_113-92608-0 from training. Duration: 0.54 2023-10-03 05:30:09,528 WARNING [train.py:1204] (2/4) Exclude cut with ID _14170_210_5219_1_1530525949868_4469650_272-249985-0_sp1.1 from training. Duration: 0.6090625 2023-10-03 05:30:09,543 WARNING [train.py:1204] (2/4) Exclude cut with ID _50376_210_13401_1_1534031502101_4217740_518-187390-0_sp1.1 from training. Duration: 0.97275 2023-10-03 05:30:10,976 WARNING [train.py:1204] (2/4) Exclude cut with ID _59738_210_15379_1_1534849512562_3941470_165-248260-0_sp1.1 from training. Duration: 0.92725 2023-10-03 05:30:10,992 WARNING [train.py:1204] (2/4) Exclude cut with ID _23818_210_6228_1_1531917063884_3676920_400-343925-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 05:30:12,337 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6961_210_2087_1_1527937168_6815011_562-165518-0 from training. Duration: 0.77 2023-10-03 05:30:12,401 WARNING [train.py:1204] (2/4) Exclude cut with ID _24271_210_12658_1_1531987270022_7190519_289-207931-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 05:30:15,621 WARNING [train.py:1204] (2/4) Exclude cut with ID _14632_210_9988_1_1530691279392_3923200_332-105904-0 from training. Duration: 0.9 2023-10-03 05:30:15,640 WARNING [train.py:1204] (2/4) Exclude cut with ID _40402_210_18219_1_1533522324115_2953150_203-232438-0_sp1.1 from training. Duration: 0.97275 2023-10-03 05:30:17,002 WARNING [train.py:1204] (2/4) Exclude cut with ID _13252_210_1925_1_1530671372047_3336909_263-168388-0 from training. Duration: 0.86 2023-10-03 05:30:18,248 WARNING [train.py:1204] (2/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_174-205058-0 from training. Duration: 0.95 2023-10-03 05:30:18,366 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_94-195801-0_sp1.1 from training. Duration: 0.8545625 2023-10-03 05:30:19,715 WARNING [train.py:1204] (2/4) Exclude cut with ID _55153_210_2253_1_1534384786677_3162096_269-103204-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 05:30:21,097 WARNING [train.py:1204] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_748-36150-0 from training. Duration: 0.91 2023-10-03 05:30:22,477 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_12868_210_6753_1_1530702479_3610564_148-172861-0_sp1.1 from training. Duration: 0.4545625 2023-10-03 05:30:22,527 WARNING [train.py:1204] (2/4) Exclude cut with ID _67499_210_17282_1_1535509805220_3692430_460-77730-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 05:30:24,366 WARNING [train.py:1204] (2/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_374-20753-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 05:30:25,762 WARNING [train.py:1204] (2/4) Exclude cut with ID _45329_210_7005_1_1534157951211_6774339_374-289983-0_sp1.1 from training. Duration: 0.97275 2023-10-03 05:30:25,797 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_34886_210_10633_1_1533203882_1755661_76-156096-0_sp0.9 from training. Duration: 0.9666875 2023-10-03 05:30:31,316 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_238-256668-0_sp0.9 from training. Duration: 0.7666875 2023-10-03 05:30:32,582 WARNING [train.py:1204] (2/4) Exclude cut with ID _46683_210_9642_1_1533730913492_4591116_489-146727-0_sp1.1 from training. Duration: 0.97275 2023-10-03 05:30:34,132 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.ff3_skip_rate, batch_count=1154520.0, ans=0.0 2023-10-03 05:30:35,355 WARNING [train.py:1204] (2/4) Exclude cut with ID _13938_210_9988_1_1530518718978_3693960_304-112784-0_sp0.9 from training. Duration: 0.511125 2023-10-03 05:30:39,156 INFO [train.py:1046] (2/4) Epoch 33, batch 3200, loss[loss=0.1387, simple_loss=0.2203, pruned_loss=0.02853, over 24419.00 frames. ], tot_loss[loss=0.1605, simple_loss=0.2391, pruned_loss=0.04097, over 4721562.67 frames. ], batch size: 58, lr: 3.09e-03, grad_scale: 32.0 2023-10-03 05:30:40,721 WARNING [train.py:1204] (2/4) Exclude cut with ID _24271_210_12658_1_1531987270022_7190519_243-46549-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 05:30:40,726 WARNING [train.py:1204] (2/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_78-182912-0_sp1.1 from training. Duration: 0.536375 2023-10-03 05:30:43,511 WARNING [train.py:1204] (2/4) Exclude cut with ID _57572_210_5517_1_1534921219624_3809484_121-321334-0_sp1.1 from training. Duration: 0.97275 2023-10-03 05:30:45,298 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_40047_210_2290_1_1533290519_2374311_254-102149-0_sp1.1 from training. Duration: 0.963625 2023-10-03 05:30:45,298 WARNING [train.py:1204] (2/4) Exclude cut with ID _28461_210_2253_1_1532771775189_3041299_96-152158-0 from training. Duration: 0.85 2023-10-03 05:30:46,776 WARNING [train.py:1204] (2/4) Exclude cut with ID _29877_210_6228_1_1532397791106_5276149_161-102979-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 05:30:50,786 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_3787_210_2040_1_1526477544_5478586_603-120324-0_sp0.9 from training. Duration: 0.9 2023-10-03 05:30:55,400 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_41750_210_1960_1_1533520678_3948042_247-213456-0_sp1.1 from training. Duration: 0.97275 2023-10-03 05:31:04,156 WARNING [train.py:1204] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_127-119109-0_sp0.9 from training. Duration: 0.8 2023-10-03 05:31:08,474 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.feed_forward3.hidden_balancer.prob, batch_count=1154720.0, ans=0.125 2023-10-03 05:31:13,211 WARNING [train.py:1204] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_215-202973-0 from training. Duration: 0.88 2023-10-03 05:31:14,613 WARNING [train.py:1204] (2/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_17-320491-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 05:31:16,638 WARNING [train.py:1204] (2/4) Exclude cut with ID _5166_210_2084_1_1527073197160_4087993_310-114452-0 from training. Duration: 0.59 2023-10-03 05:31:17,953 WARNING [train.py:1204] (2/4) Exclude cut with ID _15133_210_6228_1_1530794512074_3080379_29-264458-0_sp0.9 from training. Duration: 0.6444375 2023-10-03 05:31:20,726 WARNING [train.py:1204] (2/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_159-281310-0_sp1.1 from training. Duration: 0.863625 2023-10-03 05:31:22,367 WARNING [train.py:1204] (2/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_195-167011-0_sp0.9 from training. Duration: 0.8333125 2023-10-03 05:31:23,823 WARNING [train.py:1204] (2/4) Exclude cut with ID _14087_210_2253_1_1530529224512_3014029_156-257607-0_sp1.1 from training. Duration: 0.7545625 2023-10-03 05:31:26,751 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_2295_210_2087_1_1525500993_2407863_116-238894-0 from training. Duration: 0.7 2023-10-03 05:31:28,101 WARNING [train.py:1204] (2/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_321-335475-0_sp0.9 from training. Duration: 0.5 2023-10-03 05:31:28,352 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.bypass_mid.scale_min, batch_count=1154786.6666666667, ans=0.2 2023-10-03 05:31:29,525 WARNING [train.py:1204] (2/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_188-114464-0 from training. Duration: 0.82 2023-10-03 05:31:32,373 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_2592_210_2290_1_1525697025_4882925_157-70512-0 from training. Duration: 0.91 2023-10-03 05:31:33,940 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.feed_forward3.hidden_balancer.prob, batch_count=1154786.6666666667, ans=0.125 2023-10-03 05:31:35,021 WARNING [train.py:1204] (2/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_138-64210-0_sp1.1 from training. Duration: 0.663625 2023-10-03 05:31:41,260 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_11305_210_1794_1_1529750428_7178148_651-95702-0_sp1.1 from training. Duration: 0.963625 2023-10-03 05:31:41,282 WARNING [train.py:1204] (2/4) Exclude cut with ID _14224_210_4381_1_1530529062037_4264010_379-184223-0_sp0.9 from training. Duration: 0.9444375 2023-10-03 05:31:41,308 WARNING [train.py:1204] (2/4) Exclude cut with ID _8394_210_6702_1_1528590504316_3540189_381-125325-0_sp1.1 from training. Duration: 0.963625 2023-10-03 05:31:42,665 WARNING [train.py:1204] (2/4) Exclude cut with ID IC0622W0021-307659 from training. Duration: 0.896 2023-10-03 05:31:42,667 WARNING [train.py:1204] (2/4) Exclude cut with ID _7624_210_1531_1_1528109802198_4933101_418-349834-0_sp1.1 from training. Duration: 0.5454375 2023-10-03 05:31:45,381 WARNING [train.py:1204] (2/4) Exclude cut with ID _49728_210_12651_1_1534041034294_6788150_607-3416-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 05:31:47,246 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8275_210_4198_1_1528455320_3892571_491-17707-0 from training. Duration: 0.64 2023-10-03 05:31:47,282 WARNING [train.py:1204] (2/4) Exclude cut with ID _5287_210_2084_1_1527418883040_3878670_333-48508-0 from training. Duration: 0.6 2023-10-03 05:31:49,864 WARNING [train.py:1204] (2/4) Exclude cut with ID _8365_210_2064_1_1528592311070_4677690_371-122295-0 from training. Duration: 0.55 2023-10-03 05:31:51,237 WARNING [train.py:1204] (2/4) Exclude cut with ID _14860_210_4915_1_1530702020340_7343240_836-328489-0 from training. Duration: 0.9 2023-10-03 05:31:51,366 WARNING [train.py:1204] (2/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_1033-274376-0_sp0.9 from training. Duration: 0.9666875 2023-10-03 05:31:53,012 INFO [train.py:1046] (2/4) Epoch 33, batch 3250, loss[loss=0.1705, simple_loss=0.2463, pruned_loss=0.04737, over 23860.00 frames. ], tot_loss[loss=0.1606, simple_loss=0.239, pruned_loss=0.04109, over 4712460.97 frames. ], batch size: 195, lr: 3.09e-03, grad_scale: 16.0 2023-10-03 05:31:53,355 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.nonlin_attention.balancer.prob, batch_count=1154920.0, ans=0.125 2023-10-03 05:31:54,681 WARNING [train.py:1204] (2/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_475-127455-0_sp0.9 from training. Duration: 0.77775 2023-10-03 05:31:54,688 WARNING [train.py:1204] (2/4) Exclude cut with ID IC0671W0071-331914 from training. Duration: 0.896 2023-10-03 05:31:54,931 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.feed_forward2.hidden_balancer.prob, batch_count=1154920.0, ans=0.125 2023-10-03 05:31:55,971 WARNING [train.py:1204] (2/4) Exclude cut with ID _30327_210_16049_1_1532484161295_3924169_2-1523-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 05:31:55,974 WARNING [train.py:1204] (2/4) Exclude cut with ID _24314_210_4915_1_1532135078116_7170179_460-208279-0_sp1.1 from training. Duration: 0.97275 2023-10-03 05:31:57,356 WARNING [train.py:1204] (2/4) Exclude cut with ID IC0679W0052-335859 from training. Duration: 0.64 2023-10-03 05:32:00,272 WARNING [train.py:1204] (2/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_3-237434-0_sp1.1 from training. Duration: 0.6909375 2023-10-03 05:32:01,520 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.561e+02 1.860e+02 2.002e+02 2.246e+02 3.741e+02, threshold=4.004e+02, percent-clipped=0.0 2023-10-03 05:32:02,997 WARNING [train.py:1204] (2/4) Exclude cut with ID _69884_210_9771_1_1535846392818_4166169_405-185283-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 05:32:09,147 WARNING [train.py:1204] (2/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_927-83684-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 05:32:09,159 WARNING [train.py:1204] (2/4) Exclude cut with ID _6694_210_4599_1_1527852637490_3965529_356-169233-0 from training. Duration: 0.99 2023-10-03 05:32:10,998 WARNING [train.py:1204] (2/4) Exclude cut with ID _19896_210_1883_1_1531709022662_6649569_562-233001-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 05:32:11,052 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_20120_210_6753_1_1531643035_5482859_354-78629-0_sp1.1 from training. Duration: 0.963625 2023-10-03 05:32:11,053 WARNING [train.py:1204] (2/4) Exclude cut with ID _60135_210_13427_1_1534935557922_3651319_96-168198-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 05:32:12,442 WARNING [train.py:1204] (2/4) Exclude cut with ID _31602_210_5854_1_1532677021456_4458250_249-96604-0_sp1.1 from training. Duration: 0.8090625 2023-10-03 05:32:14,186 WARNING [train.py:1204] (2/4) Exclude cut with ID _14100_210_8341_1_1530529320613_3678069_258-224599-0_sp0.9 from training. Duration: 0.8555625 2023-10-03 05:32:15,682 WARNING [train.py:1204] (2/4) Exclude cut with ID _19708_210_9206_1_1532136650733_4238364_215-160135-0_sp1.1 from training. Duration: 0.97275 2023-10-03 05:32:15,702 WARNING [train.py:1204] (2/4) Exclude cut with ID _21878_210_3525_1_1531738772002_7264330_113-18163-0_sp0.9 from training. Duration: 0.988875 2023-10-03 05:32:15,741 WARNING [train.py:1204] (2/4) Exclude cut with ID _64135_210_2253_1_1535677267876_7608972_552-337340-0_sp1.1 from training. Duration: 0.92725 2023-10-03 05:32:17,013 WARNING [train.py:1204] (2/4) Exclude cut with ID _67725_210_27767_1_1535608756671_5105359_10-12887-0_sp1.1 from training. Duration: 0.97275 2023-10-03 05:32:17,026 WARNING [train.py:1204] (2/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_401-284840-0_sp1.1 from training. Duration: 0.97275 2023-10-03 05:32:17,051 WARNING [train.py:1204] (2/4) Exclude cut with ID _44659_210_2721_1_1534036120061_6614860_483-141285-0_sp1.1 from training. Duration: 0.936375 2023-10-03 05:32:19,890 WARNING [train.py:1204] (2/4) Exclude cut with ID _9461_210_4557_1_1529060514726_3677451_358-332734-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 05:32:22,638 WARNING [train.py:1204] (2/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_788-298907-0_sp1.1 from training. Duration: 0.8090625 2023-10-03 05:32:24,608 WARNING [train.py:1204] (2/4) Exclude cut with ID _39411_210_5986_1_1533538825030_3343649_354-267196-0_sp1.1 from training. Duration: 0.92725 2023-10-03 05:32:24,635 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_14953_210_2151_1_1530701710_5616165_192-309792-0_sp1.1 from training. Duration: 0.97275 2023-10-03 05:32:27,388 WARNING [train.py:1204] (2/4) Exclude cut with ID _59170_210_6721_1_1534983633960_4203335_290-151255-0_sp1.1 from training. Duration: 0.92725 2023-10-03 05:32:27,413 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_5575_210_1410_1_1527301305_2570170_249-93769-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 05:32:27,424 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_46013_210_7234_1_1533776362_3744032_463-52378-0_sp1.1 from training. Duration: 0.7818125 2023-10-03 05:32:31,770 WARNING [train.py:1204] (2/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_580-93798-0 from training. Duration: 0.56 2023-10-03 05:32:31,833 WARNING [train.py:1204] (2/4) Exclude cut with ID _19514_210_9259_1_1531530071330_5084230_385-13762-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 05:32:31,838 WARNING [train.py:1204] (2/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_458-67321-0_sp1.1 from training. Duration: 0.8818125 2023-10-03 05:32:34,368 WARNING [train.py:1204] (2/4) Exclude cut with ID _41298_210_9019_1_1533430371450_4822143_188-154791-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 05:32:34,433 WARNING [train.py:1204] (2/4) Exclude cut with ID _15037_210_2253_1_1530752294104_3267650_296-192597-0_sp0.9 from training. Duration: 0.9 2023-10-03 05:32:36,054 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.conv_module2.balancer2.prob, batch_count=1155120.0, ans=0.125 2023-10-03 05:32:40,368 WARNING [train.py:1204] (2/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_661-264857-0_sp1.1 from training. Duration: 0.8454375 2023-10-03 05:32:49,512 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_34008_210_10431_1_1533345424_8183247_662-317331-0_sp1.1 from training. Duration: 0.8545625 2023-10-03 05:32:49,543 WARNING [train.py:1204] (2/4) Exclude cut with ID _46502_210_13794_1_1533724452198_3657159_241-278329-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 05:32:49,557 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_11349_210_6753_1_1529800861_1101207_26-65602-0 from training. Duration: 0.96 2023-10-03 05:32:49,561 WARNING [train.py:1204] (2/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_448-4048-0_sp1.1 from training. Duration: 0.87275 2023-10-03 05:32:49,568 WARNING [train.py:1204] (2/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_662-275621-0_sp1.1 from training. Duration: 0.3818125 2023-10-03 05:32:50,901 WARNING [train.py:1204] (2/4) Exclude cut with ID _13938_210_9988_1_1530518718978_3693960_256-54454-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 05:32:51,155 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.conv_skip_rate, batch_count=1155186.6666666667, ans=0.0 2023-10-03 05:32:52,440 WARNING [train.py:1204] (2/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_256-32662-0 from training. Duration: 0.9 2023-10-03 05:32:53,718 WARNING [train.py:1204] (2/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_326-17061-0 from training. Duration: 0.94 2023-10-03 05:32:53,770 WARNING [train.py:1204] (2/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_505-97987-0_sp1.1 from training. Duration: 0.7818125 2023-10-03 05:32:55,628 WARNING [train.py:1204] (2/4) Exclude cut with ID _66873_210_5517_1_1535454136506_4293370_316-294364-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 05:32:56,983 WARNING [train.py:1204] (2/4) Exclude cut with ID _19514_210_9259_1_1531530071330_5084230_762-231063-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 05:32:57,051 WARNING [train.py:1204] (2/4) Exclude cut with ID _4697_210_2069_1_1526815801953_3823098_68-219807-0_sp0.9 from training. Duration: 0.52225 2023-10-03 05:32:57,085 WARNING [train.py:1204] (2/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_479-158053-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 05:32:59,965 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.feed_forward2.hidden_balancer.prob, batch_count=1155186.6666666667, ans=0.125 2023-10-03 05:33:01,142 WARNING [train.py:1204] (2/4) Exclude cut with ID _66112_210_10635_1_1535590236305_3801484_83-38181-0_sp1.1 from training. Duration: 0.936375 2023-10-03 05:33:01,156 WARNING [train.py:1204] (2/4) Exclude cut with ID _55857_210_2346_1_1534644007831_6898209_255-146346-0_sp1.1 from training. Duration: 0.963625 2023-10-03 05:33:03,746 WARNING [train.py:1204] (2/4) Exclude cut with ID _48836_210_12210_1_1534075848293_3904150_429-35475-0 from training. Duration: 0.89 2023-10-03 05:33:03,757 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_50154_210_13731_1_1534750862_1080961_119-187163-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 05:33:06,445 INFO [train.py:1046] (2/4) Epoch 33, batch 3300, loss[loss=0.151, simple_loss=0.2322, pruned_loss=0.03494, over 24532.00 frames. ], tot_loss[loss=0.1611, simple_loss=0.2399, pruned_loss=0.04118, over 4723436.64 frames. ], batch size: 60, lr: 3.09e-03, grad_scale: 16.0 2023-10-03 05:33:06,495 WARNING [train.py:1204] (2/4) Exclude cut with ID _8793_210_6310_1_1528542985139_2310009_121-179782-0_sp0.9 from training. Duration: 0.888875 2023-10-03 05:33:06,505 WARNING [train.py:1204] (2/4) Exclude cut with ID _14884_210_2252_1_1530707459689_5561250_591-173406-0 from training. Duration: 0.96 2023-10-03 05:33:09,321 WARNING [train.py:1204] (2/4) Exclude cut with ID _15133_210_6228_1_1530794512074_3080379_54-198193-0_sp1.1 from training. Duration: 0.8545625 2023-10-03 05:33:09,337 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6545_210_2040_1_1527774670_4245258_407-48070-0 from training. Duration: 0.76 2023-10-03 05:33:12,157 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1032-236863-0 from training. Duration: 0.84 2023-10-03 05:33:12,234 WARNING [train.py:1204] (2/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_507-303370-0 from training. Duration: 0.96 2023-10-03 05:33:12,925 WARNING [train.py:1204] (2/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_164-282366-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 05:33:15,631 WARNING [train.py:1204] (2/4) Exclude cut with ID _39867_210_9826_1_1533553222030_9344259_312-158727-0_sp1.1 from training. Duration: 0.963625 2023-10-03 05:33:17,592 WARNING [train.py:1204] (2/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_174-205058-0_sp1.1 from training. Duration: 0.863625 2023-10-03 05:33:17,612 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_11305_210_1794_1_1529750428_7178148_166-237358-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 05:33:20,784 WARNING [train.py:1204] (2/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_26-175553-0_sp1.1 from training. Duration: 0.5545625 2023-10-03 05:33:20,809 WARNING [train.py:1204] (2/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_96-290039-0_sp1.1 from training. Duration: 0.5090625 2023-10-03 05:33:22,295 WARNING [train.py:1204] (2/4) Exclude cut with ID _53219_210_4525_1_1534834836008_4064825_261-101691-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 05:33:22,398 WARNING [train.py:1204] (2/4) Exclude cut with ID _24271_210_12658_1_1531987270022_7190519_96-293390-0_sp1.1 from training. Duration: 0.936375 2023-10-03 05:33:27,163 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_271-234962-0 from training. Duration: 0.79 2023-10-03 05:33:28,514 WARNING [train.py:1204] (2/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_136-30660-0_sp1.1 from training. Duration: 0.97275 2023-10-03 05:33:28,538 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_20120_210_6753_1_1531643035_5482859_625-281046-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 05:33:29,872 WARNING [train.py:1204] (2/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_333-168193-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 05:33:31,334 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9029W0269-509664 from training. Duration: 0.896 2023-10-03 05:33:31,423 WARNING [train.py:1204] (2/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_450-147393-0_sp1.1 from training. Duration: 0.92725 2023-10-03 05:33:32,784 WARNING [train.py:1204] (2/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_643-52007-0_sp1.1 from training. Duration: 0.5181875 2023-10-03 05:33:32,860 WARNING [train.py:1204] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_330-327861-0_sp0.9 from training. Duration: 0.7444375 2023-10-03 05:33:32,869 WARNING [train.py:1204] (2/4) Exclude cut with ID _47790_210_16187_1_1533862814066_4207519_260-317651-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 05:33:34,160 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9044W0010-516654 from training. Duration: 0.896 2023-10-03 05:33:36,930 WARNING [train.py:1204] (2/4) Exclude cut with ID _14087_210_2253_1_1530529224512_3014029_265-118378-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 05:33:36,937 WARNING [train.py:1204] (2/4) Exclude cut with ID _6194_210_1534_1_1527511826160_3941194_321-148641-0_sp0.9 from training. Duration: 0.711125 2023-10-03 05:33:38,384 WARNING [train.py:1204] (2/4) Exclude cut with ID _54871_210_22356_1_1534665798253_3856359_101-169176-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 05:33:38,394 WARNING [train.py:1204] (2/4) Exclude cut with ID 497-129325-0061-62254-0_sp1.1 from training. Duration: 0.97725 2023-10-03 05:33:40,604 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.out_combiner.scale_min, batch_count=1155386.6666666667, ans=0.2 2023-10-03 05:33:41,436 WARNING [train.py:1204] (2/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_795-33252-0_sp1.1 from training. Duration: 0.336375 2023-10-03 05:33:41,449 WARNING [train.py:1204] (2/4) Exclude cut with ID _28317_210_12742_1_1532480302381_8510325_168-246176-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 05:33:42,799 WARNING [train.py:1204] (2/4) Exclude cut with ID _7160_210_1876_1_1527940923231_3703799_120-150943-0_sp1.1 from training. Duration: 0.736375 2023-10-03 05:33:44,202 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9084W0005-536844 from training. Duration: 0.768 2023-10-03 05:33:45,719 WARNING [train.py:1204] (2/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_234-260682-0 from training. Duration: 0.84 2023-10-03 05:33:47,074 WARNING [train.py:1204] (2/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_188-114464-0_sp0.9 from training. Duration: 0.911125 2023-10-03 05:33:49,216 WARNING [train.py:1204] (2/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_81-75849-0 from training. Duration: 0.39 2023-10-03 05:33:52,248 WARNING [train.py:1204] (2/4) Exclude cut with ID _14224_210_4381_1_1530529062037_4264010_379-184223-0_sp1.1 from training. Duration: 0.77275 2023-10-03 05:33:53,723 WARNING [train.py:1204] (2/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_454-123002-0_sp1.1 from training. Duration: 0.62725 2023-10-03 05:33:53,773 WARNING [train.py:1204] (2/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_887-249555-0_sp1.1 from training. Duration: 0.7 2023-10-03 05:33:56,543 WARNING [train.py:1204] (2/4) Exclude cut with ID _31088_210_16446_1_1532520012284_3440919_20-260902-0_sp1.1 from training. Duration: 0.97275 2023-10-03 05:33:58,333 WARNING [train.py:1204] (2/4) Exclude cut with ID _45329_210_7005_1_1534157951211_6774339_856-344574-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 05:33:58,335 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_36756_210_7234_1_1533026991_966276_4-164118-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 05:33:58,356 WARNING [train.py:1204] (2/4) Exclude cut with ID _8365_210_2064_1_1528592311070_4677690_371-122295-0_sp1.1 from training. Duration: 0.5 2023-10-03 05:34:01,078 WARNING [train.py:1204] (2/4) Exclude cut with ID _5910_210_1389_1_1527337972454_5050100_555-159815-0_sp1.1 from training. Duration: 0.8909375 2023-10-03 05:34:01,087 WARNING [train.py:1204] (2/4) Exclude cut with ID _37086_210_9038_1_1532999540891_7021250_469-147941-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 05:34:02,430 WARNING [train.py:1204] (2/4) Exclude cut with ID _3067_210_2667_1_1526212974640_4503168_459-173867-0_sp0.9 from training. Duration: 0.97775 2023-10-03 05:34:02,542 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9156W0006-572499 from training. Duration: 0.896 2023-10-03 05:34:03,342 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.1.encoder.layers.1.conv_module2.whiten, num_groups=1, num_channels=256, metric=10.67 vs. limit=15.0 2023-10-03 05:34:03,859 WARNING [train.py:1204] (2/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_277-210798-0 from training. Duration: 0.55 2023-10-03 05:34:05,286 WARNING [train.py:1204] (2/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_103-336064-0_sp1.1 from training. Duration: 0.463625 2023-10-03 05:34:06,580 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_3414_210_1988_1_1526212718_4813195_331-290945-0_sp1.1 from training. Duration: 0.936375 2023-10-03 05:34:06,581 WARNING [train.py:1204] (2/4) Exclude cut with ID _67499_210_17282_1_1535509805220_3692430_365-29276-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 05:34:07,951 WARNING [train.py:1204] (2/4) Exclude cut with ID _14821_210_8750_1_1530680503991_6829359_69-250847-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 05:34:07,952 WARNING [train.py:1204] (2/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_1102-61301-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 05:34:09,343 WARNING [train.py:1204] (2/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_166-246557-0_sp1.1 from training. Duration: 0.5818125 2023-10-03 05:34:09,383 WARNING [train.py:1204] (2/4) Exclude cut with ID _15351_210_10626_1_1530856635653_3576210_214-129918-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 05:34:10,017 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_120-41668-0_sp0.9 from training. Duration: 0.77775 2023-10-03 05:34:11,388 WARNING [train.py:1204] (2/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_27-72278-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 05:34:12,637 WARNING [train.py:1204] (2/4) Exclude cut with ID _24314_210_4915_1_1532135078116_7170179_521-258868-0_sp1.1 from training. Duration: 0.7181875 2023-10-03 05:34:14,066 WARNING [train.py:1204] (2/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_143-86344-0 from training. Duration: 0.77 2023-10-03 05:34:15,351 WARNING [train.py:1204] (2/4) Exclude cut with ID _53489_210_6228_1_1534298357169_3835970_465-157433-0_sp1.1 from training. Duration: 0.92725 2023-10-03 05:34:15,441 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_248-319353-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 05:34:18,001 WARNING [train.py:1204] (2/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_102-77064-0_sp1.1 from training. Duration: 0.8454375 2023-10-03 05:34:18,042 WARNING [train.py:1204] (2/4) Exclude cut with ID _9389_210_1664_1_1529217337301_5289269_379-128794-0_sp1.1 from training. Duration: 0.77275 2023-10-03 05:34:18,142 WARNING [train.py:1204] (2/4) Exclude cut with ID _3231_210_2106_1_1526205864934_6784220_236-260835-0_sp1.1 from training. Duration: 0.97275 2023-10-03 05:34:21,348 INFO [train.py:1046] (2/4) Epoch 33, batch 3350, loss[loss=0.1967, simple_loss=0.2611, pruned_loss=0.06619, over 19243.00 frames. ], tot_loss[loss=0.1622, simple_loss=0.2408, pruned_loss=0.04179, over 4715230.13 frames. ], batch size: 388, lr: 3.09e-03, grad_scale: 16.0 2023-10-03 05:34:21,446 WARNING [train.py:1204] (2/4) Exclude cut with ID _24271_210_12658_1_1531987270022_7190519_184-342363-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 05:34:21,447 WARNING [train.py:1204] (2/4) Exclude cut with ID _27894_210_12392_1_1532415496078_6902439_67-221663-0_sp1.1 from training. Duration: 0.92725 2023-10-03 05:34:24,753 WARNING [train.py:1204] (2/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_304-59227-0_sp1.1 from training. Duration: 0.7 2023-10-03 05:34:27,440 WARNING [train.py:1204] (2/4) Exclude cut with ID _58605_210_12210_1_1534723195939_3904259_458-42232-0_sp1.1 from training. Duration: 0.92725 2023-10-03 05:34:28,800 WARNING [train.py:1204] (2/4) Exclude cut with ID _14922_210_6228_1_1530707435273_3693310_13-165512-0_sp0.9 from training. Duration: 0.911125 2023-10-03 05:34:29,117 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.conv_module1.balancer1.min_positive, batch_count=1155586.6666666667, ans=0.025 2023-10-03 05:34:30,134 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.535e+02 1.952e+02 2.092e+02 2.361e+02 3.355e+02, threshold=4.183e+02, percent-clipped=0.0 2023-10-03 05:34:30,293 WARNING [train.py:1204] (2/4) Exclude cut with ID _58602_210_2346_1_1534903210085_6908830_792-102241-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 05:34:31,688 WARNING [train.py:1204] (2/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_588-125165-0_sp0.9 from training. Duration: 0.92225 2023-10-03 05:34:33,052 WARNING [train.py:1204] (2/4) Exclude cut with ID _54218_210_12210_1_1534413646388_4409259_239-98429-0_sp1.1 from training. Duration: 0.97275 2023-10-03 05:34:34,358 WARNING [train.py:1204] (2/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_538-32291-0_sp1.1 from training. Duration: 0.7545625 2023-10-03 05:34:35,756 WARNING [train.py:1204] (2/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_419-317062-0 from training. Duration: 0.91 2023-10-03 05:34:37,137 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9309W0202-640329 from training. Duration: 0.896 2023-10-03 05:34:38,365 WARNING [train.py:1204] (2/4) Exclude cut with ID _3231_210_2106_1_1526205864934_6784220_419-313291-0_sp1.1 from training. Duration: 0.97275 2023-10-03 05:34:41,874 WARNING [train.py:1204] (2/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_619-344499-0 from training. Duration: 0.63 2023-10-03 05:34:41,887 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_278-272612-0 from training. Duration: 0.73 2023-10-03 05:34:43,153 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_103-84010-0_sp0.9 from training. Duration: 0.7666875 2023-10-03 05:34:43,173 WARNING [train.py:1204] (2/4) Exclude cut with ID _26528_210_9070_1_1532570410842_3851310_497-97218-0_sp1.1 from training. Duration: 0.7818125 2023-10-03 05:34:43,271 WARNING [train.py:1204] (2/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_262-84613-0_sp1.1 from training. Duration: 0.936375 2023-10-03 05:34:44,523 WARNING [train.py:1204] (2/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_387-131538-0 from training. Duration: 0.81 2023-10-03 05:34:44,557 WARNING [train.py:1204] (2/4) Exclude cut with ID _8030_210_7497_1_1528444886781_3531089_224-300489-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 05:34:44,573 WARNING [train.py:1204] (2/4) Exclude cut with ID _14349_210_8341_1_1530615710818_3442470_170-336541-0_sp0.9 from training. Duration: 0.9555625 2023-10-03 05:34:46,016 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_47069_210_15368_1_1533781267_4431320_180-31746-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 05:34:48,659 WARNING [train.py:1204] (2/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_447-7041-0_sp1.1 from training. Duration: 0.92725 2023-10-03 05:34:48,692 WARNING [train.py:1204] (2/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_2-55283-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 05:34:49,997 WARNING [train.py:1204] (2/4) Exclude cut with ID _12555_210_8750_1_1530075594402_3780239_70-181911-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 05:34:52,214 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.self_attn_weights.pos_emb_skip_rate, batch_count=1155720.0, ans=0.0 2023-10-03 05:34:53,910 WARNING [train.py:1204] (2/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_311-328022-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 05:34:57,447 WARNING [train.py:1204] (2/4) Exclude cut with ID _14528_210_8254_1_1530669700514_4306939_330-2987-0_sp1.1 from training. Duration: 0.92725 2023-10-03 05:34:57,495 WARNING [train.py:1204] (2/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_385-184858-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 05:34:59,463 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.3.encoder.layers.1.conv_module2.whiten, num_groups=1, num_channels=512, metric=4.55 vs. limit=15.0 2023-10-03 05:35:01,624 WARNING [train.py:1204] (2/4) Exclude cut with ID _64414_210_17301_1_1535243252048_3757759_348-333360-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 05:35:02,935 WARNING [train.py:1204] (2/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_753-214751-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 05:35:04,382 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_17613_210_2151_1_1531137569_2366160_202-208013-0_sp1.1 from training. Duration: 0.92725 2023-10-03 05:35:04,397 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_43831_210_2158_1_1533725502_5872791_610-129300-0_sp1.1 from training. Duration: 0.936375 2023-10-03 05:35:06,969 WARNING [train.py:1204] (2/4) Exclude cut with ID _54218_210_12210_1_1534413646388_4409259_321-23852-0_sp1.1 from training. Duration: 0.936375 2023-10-03 05:35:09,696 WARNING [train.py:1204] (2/4) Exclude cut with ID _39411_210_5986_1_1533538825030_3343649_173-238248-0 from training. Duration: 0.83 2023-10-03 05:35:09,703 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_405-339986-0_sp0.9 from training. Duration: 0.5666875 2023-10-03 05:35:09,740 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_2009_210_2071_1_1525481134_4588937_385-302100-0 from training. Duration: 0.87 2023-10-03 05:35:09,773 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_161-276996-0_sp1.1 from training. Duration: 0.863625 2023-10-03 05:35:11,119 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_177-58005-0 from training. Duration: 0.62 2023-10-03 05:35:12,581 WARNING [train.py:1204] (2/4) Exclude cut with ID _64639_210_5816_1_1535367619437_3528000_146-343734-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 05:35:14,469 WARNING [train.py:1204] (2/4) Exclude cut with ID _3845_210_1390_1_1526731326141_3523610_72-61441-0_sp1.1 from training. Duration: 0.92725 2023-10-03 05:35:21,261 WARNING [train.py:1204] (2/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_991-125028-0_sp1.1 from training. Duration: 0.936375 2023-10-03 05:35:22,569 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7544_210_6753_1_1528591327_1471426_172-348777-0 from training. Duration: 0.66 2023-10-03 05:35:22,616 WARNING [train.py:1204] (2/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_416-257730-0_sp1.1 from training. Duration: 0.5909375 2023-10-03 05:35:22,684 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1113-7369-0_sp1.1 from training. Duration: 0.77275 2023-10-03 05:35:24,451 WARNING [train.py:1204] (2/4) Exclude cut with ID _40402_210_18219_1_1533522324115_2953150_242-76943-0_sp1.1 from training. Duration: 0.8545625 2023-10-03 05:35:29,656 WARNING [train.py:1204] (2/4) Exclude cut with ID _44718_210_5816_1_1534050051869_6469030_190-49648-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 05:35:31,070 WARNING [train.py:1204] (2/4) Exclude cut with ID _47251_210_18628_1_1533814063128_4137250_214-88076-0 from training. Duration: 0.88 2023-10-03 05:35:31,104 WARNING [train.py:1204] (2/4) Exclude cut with ID _5585_210_2667_1_1527418813324_8007340_89-332072-0_sp1.1 from training. Duration: 0.5818125 2023-10-03 05:35:32,400 WARNING [train.py:1204] (2/4) Exclude cut with ID _41817_210_18835_1_1533434477160_7423049_555-293448-0_sp1.1 from training. Duration: 0.72725 2023-10-03 05:35:33,787 WARNING [train.py:1204] (2/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_286-331314-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 05:35:33,835 WARNING [train.py:1204] (2/4) Exclude cut with ID _13921_210_9994_1_1530838702800_3003429_46-149444-0 from training. Duration: 0.94 2023-10-03 05:35:33,890 WARNING [train.py:1204] (2/4) Exclude cut with ID _44778_210_2054_1_1533798127203_7443090_768-43595-0_sp1.1 from training. Duration: 0.936375 2023-10-03 05:35:33,898 WARNING [train.py:1204] (2/4) Exclude cut with ID _35355_210_16040_1_1533212988977_4016229_74-190076-0 from training. Duration: 0.8 2023-10-03 05:35:35,250 INFO [train.py:1046] (2/4) Epoch 33, batch 3400, loss[loss=0.1776, simple_loss=0.2489, pruned_loss=0.05316, over 23761.00 frames. ], tot_loss[loss=0.1632, simple_loss=0.242, pruned_loss=0.04222, over 4721083.70 frames. ], batch size: 179, lr: 3.08e-03, grad_scale: 16.0 2023-10-03 05:35:36,550 WARNING [train.py:1204] (2/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_1007-316380-0_sp1.1 from training. Duration: 0.963625 2023-10-03 05:35:36,594 WARNING [train.py:1204] (2/4) Exclude cut with ID _23557_210_8750_1_1531890106370_6846255_150-283437-0_sp1.1 from training. Duration: 0.963625 2023-10-03 05:35:36,638 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_3474_210_2028_1_1526188513_4825246_145-309131-0_sp1.1 from training. Duration: 0.67275 2023-10-03 05:35:37,971 WARNING [train.py:1204] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_391-22148-0_sp1.1 from training. Duration: 0.7 2023-10-03 05:35:39,326 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_238-256668-0 from training. Duration: 0.69 2023-10-03 05:35:44,049 WARNING [train.py:1204] (2/4) Exclude cut with ID _8394_210_6702_1_1528590504316_3540189_372-198878-0 from training. Duration: 0.6 2023-10-03 05:35:44,059 WARNING [train.py:1204] (2/4) Exclude cut with ID ID0174W0006-766659 from training. Duration: 0.953 2023-10-03 05:35:44,075 WARNING [train.py:1204] (2/4) Exclude cut with ID _6274_210_2084_1_1527937583837_7494020_668-23019-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 05:35:48,445 WARNING [train.py:1204] (2/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_322-74328-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 05:35:48,460 WARNING [train.py:1204] (2/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_872-264525-0_sp1.1 from training. Duration: 0.5909375 2023-10-03 05:35:48,501 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_53378_210_17282_1_1534221071_5769509_641-286201-0_sp1.1 from training. Duration: 0.97275 2023-10-03 05:35:49,873 WARNING [train.py:1204] (2/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_199-295611-0_sp1.1 from training. Duration: 0.5 2023-10-03 05:35:50,716 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.2.encoder.layers.0.whiten, num_groups=1, num_channels=384, metric=5.58 vs. limit=12.0 2023-10-03 05:35:51,465 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.conv_module1.balancer1.prob, batch_count=1155986.6666666667, ans=0.125 2023-10-03 05:35:53,197 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=1155986.6666666667, ans=0.1 2023-10-03 05:35:56,738 WARNING [train.py:1204] (2/4) Exclude cut with ID _5250_210_5219_1_1527249561806_8692110_1270-317971-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 05:35:58,624 WARNING [train.py:1204] (2/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_784-213099-0 from training. Duration: 0.93 2023-10-03 05:36:00,317 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.feed_forward3.hidden_balancer.prob, batch_count=1155986.6666666667, ans=0.125 2023-10-03 05:36:04,264 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_115-233929-0_sp0.9 from training. Duration: 0.8 2023-10-03 05:36:04,393 WARNING [train.py:1204] (2/4) Exclude cut with ID _54871_210_22356_1_1534665798253_3856359_256-223809-0_sp1.1 from training. Duration: 0.97275 2023-10-03 05:36:05,708 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_20120_210_6753_1_1531643035_5482859_285-87781-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 05:36:07,036 WARNING [train.py:1204] (2/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_619-344499-0_sp1.1 from training. Duration: 0.57275 2023-10-03 05:36:12,637 WARNING [train.py:1204] (2/4) Exclude cut with ID _60229_210_27767_1_1534838406158_3655350_264-279573-0_sp0.9 from training. Duration: 0.92225 2023-10-03 05:36:15,486 WARNING [train.py:1204] (2/4) Exclude cut with ID _7719_210_3525_1_1528456153163_4296960_333-110399-0 from training. Duration: 0.96 2023-10-03 05:36:17,603 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.out_combiner.scale_min, batch_count=1156053.3333333333, ans=0.2 2023-10-03 05:36:21,515 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_50423_210_13270_1_1534128598_5021734_759-90833-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 05:36:22,823 WARNING [train.py:1204] (2/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_151-254798-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 05:36:22,862 WARNING [train.py:1204] (2/4) Exclude cut with ID _3465_210_3525_1_1526175667060_3574049_450-203637-0 from training. Duration: 0.92 2023-10-03 05:36:22,898 WARNING [train.py:1204] (2/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_673-36456-0_sp1.1 from training. Duration: 0.936375 2023-10-03 05:36:22,943 WARNING [train.py:1204] (2/4) Exclude cut with ID _36538_210_9994_1_1533204020363_3795620_131-8685-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 05:36:24,301 WARNING [train.py:1204] (2/4) Exclude cut with ID _12556_210_8341_1_1530081024100_7327690_713-55626-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 05:36:25,635 WARNING [train.py:1204] (2/4) Exclude cut with ID _2322_210_2667_1_1525604548971_3656299_440-80494-0_sp0.9 from training. Duration: 0.97775 2023-10-03 05:36:27,668 WARNING [train.py:1204] (2/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_210-325081-0_sp1.1 from training. Duration: 0.97275 2023-10-03 05:36:32,418 WARNING [train.py:1204] (2/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_375-244976-0_sp0.9 from training. Duration: 0.8333125 2023-10-03 05:36:32,425 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_15517_210_6753_1_1530952293_1664795_119-31001-0_sp1.1 from training. Duration: 0.8545625 2023-10-03 05:36:36,607 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_41452_210_7291_1_1533437687_3941281_319-42326-0_sp1.1 from training. Duration: 0.92725 2023-10-03 05:36:37,972 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_372-173012-0 from training. Duration: 0.95 2023-10-03 05:36:38,223 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.feed_forward3.hidden_balancer.prob, batch_count=1156186.6666666667, ans=0.125 2023-10-03 05:36:43,318 WARNING [train.py:1204] (2/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_41-31157-0_sp1.1 from training. Duration: 0.5181875 2023-10-03 05:36:45,484 WARNING [train.py:1204] (2/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_91-212486-0 from training. Duration: 0.68 2023-10-03 05:36:48,408 WARNING [train.py:1204] (2/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_1267-272693-0 from training. Duration: 0.73 2023-10-03 05:36:49,747 INFO [train.py:1046] (2/4) Epoch 33, batch 3450, loss[loss=0.1546, simple_loss=0.2382, pruned_loss=0.03547, over 24306.00 frames. ], tot_loss[loss=0.1629, simple_loss=0.2417, pruned_loss=0.04206, over 4720668.74 frames. ], batch size: 61, lr: 3.08e-03, grad_scale: 16.0 2023-10-03 05:36:49,810 WARNING [train.py:1204] (2/4) Exclude cut with ID _13938_210_9988_1_1530518718978_3693960_152-67046-0_sp1.1 from training. Duration: 0.936375 2023-10-03 05:36:50,495 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.3.encoder.layers.1.self_attn_weights.whiten_keys, num_groups=8, num_channels=256, metric=4.25 vs. limit=6.0 2023-10-03 05:36:52,518 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_4528_210_3273_1_1527076654_3276777_342-168068-0_sp1.1 from training. Duration: 0.7909375 2023-10-03 05:36:52,535 WARNING [train.py:1204] (2/4) Exclude cut with ID _7606_210_4915_1_1528894253277_5080690_127-177761-0 from training. Duration: 0.64 2023-10-03 05:36:52,605 WARNING [train.py:1204] (2/4) Exclude cut with ID _45329_210_7005_1_1534157951211_6774339_335-137972-0_sp1.1 from training. Duration: 0.92725 2023-10-03 05:36:57,612 WARNING [train.py:1204] (2/4) Exclude cut with ID _5166_210_2084_1_1527073197160_4087993_331-117829-0_sp1.1 from training. Duration: 0.563625 2023-10-03 05:36:57,788 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.bypass_mid.scale_min, batch_count=1156253.3333333333, ans=0.2 2023-10-03 05:36:59,362 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.487e+02 1.812e+02 1.985e+02 2.188e+02 3.320e+02, threshold=3.971e+02, percent-clipped=0.0 2023-10-03 05:37:00,950 WARNING [train.py:1204] (2/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_125-125818-0_sp1.1 from training. Duration: 0.87275 2023-10-03 05:37:02,370 WARNING [train.py:1204] (2/4) Exclude cut with ID _6789_210_1534_1_1527854471671_2870519_48-294897-0_sp1.1 from training. Duration: 0.963625 2023-10-03 05:37:03,792 WARNING [train.py:1204] (2/4) Exclude cut with ID _6694_210_4599_1_1527852637490_3965529_116-109398-0_sp1.1 from training. Duration: 0.663625 2023-10-03 05:37:03,801 WARNING [train.py:1204] (2/4) Exclude cut with ID _52892_210_6721_1_1534378941259_4124129_222-172685-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 05:37:06,484 WARNING [train.py:1204] (2/4) Exclude cut with ID _50655_210_18628_1_1534229942193_3772154_320-132275-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 05:37:06,680 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.conv_module1.balancer2.prob, batch_count=1156320.0, ans=0.125 2023-10-03 05:37:13,190 WARNING [train.py:1204] (2/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_76-311963-0 from training. Duration: 0.7 2023-10-03 05:37:17,997 WARNING [train.py:1204] (2/4) Exclude cut with ID _6274_210_2084_1_1527937583837_7494020_32-205288-0 from training. Duration: 0.78 2023-10-03 05:37:18,025 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_86-16881-0_sp0.9 from training. Duration: 0.6444375 2023-10-03 05:37:18,174 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=1156386.6666666667, ans=0.1 2023-10-03 05:37:19,345 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_271-297505-0_sp1.1 from training. Duration: 0.9 2023-10-03 05:37:20,772 WARNING [train.py:1204] (2/4) Exclude cut with ID _57572_210_5517_1_1534921219624_3809484_187-295293-0_sp1.1 from training. Duration: 0.963625 2023-10-03 05:37:26,266 WARNING [train.py:1204] (2/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_563-329943-0 from training. Duration: 0.99 2023-10-03 05:37:26,330 WARNING [train.py:1204] (2/4) Exclude cut with ID _7160_210_1876_1_1527940923231_3703799_302-150728-0_sp0.9 from training. Duration: 0.8666875 2023-10-03 05:37:31,050 WARNING [train.py:1204] (2/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_926-7526-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 05:37:31,087 WARNING [train.py:1204] (2/4) Exclude cut with ID _62574_210_2655_1_1535180407058_3933249_467-22348-0_sp1.1 from training. Duration: 0.936375 2023-10-03 05:37:32,479 WARNING [train.py:1204] (2/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_280-161633-0_sp0.9 from training. Duration: 0.811125 2023-10-03 05:37:33,891 WARNING [train.py:1204] (2/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_385-193052-0_sp1.1 from training. Duration: 0.7818125 2023-10-03 05:37:35,307 WARNING [train.py:1204] (2/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_444-28041-0 from training. Duration: 0.85 2023-10-03 05:37:35,314 WARNING [train.py:1204] (2/4) Exclude cut with ID _66070_210_7640_1_1535441393096_7805310_341-249237-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 05:37:36,612 WARNING [train.py:1204] (2/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_425-269826-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 05:37:39,262 WARNING [train.py:1204] (2/4) Exclude cut with ID _53950_210_5816_1_1534417264919_3862951_124-185597-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 05:37:41,981 WARNING [train.py:1204] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_380-204756-0 from training. Duration: 0.86 2023-10-03 05:37:45,463 WARNING [train.py:1204] (2/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_282-257791-0_sp0.9 from training. Duration: 0.9555625 2023-10-03 05:37:49,694 WARNING [train.py:1204] (2/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_372-131163-0_sp1.1 from training. Duration: 0.8545625 2023-10-03 05:37:52,370 WARNING [train.py:1204] (2/4) Exclude cut with ID _28317_210_12742_1_1532480302381_8510325_447-233513-0_sp1.1 from training. Duration: 0.963625 2023-10-03 05:37:53,818 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_54-118333-0_sp1.1 from training. Duration: 0.97275 2023-10-03 05:38:00,361 WARNING [train.py:1204] (2/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_388-230239-0_sp1.1 from training. Duration: 0.963625 2023-10-03 05:38:00,381 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_40047_210_2290_1_1533290519_2374311_141-54639-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 05:38:00,451 WARNING [train.py:1204] (2/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_91-267696-0_sp0.9 from training. Duration: 0.9666875 2023-10-03 05:38:00,486 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_14056_210_4163_1_1530604483_7572827_532-137847-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 05:38:02,054 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.conv_module2.balancer2.prob, batch_count=1156586.6666666667, ans=0.125 2023-10-03 05:38:03,157 INFO [train.py:1046] (2/4) Epoch 33, batch 3500, loss[loss=0.1628, simple_loss=0.2533, pruned_loss=0.03615, over 24621.00 frames. ], tot_loss[loss=0.1616, simple_loss=0.2397, pruned_loss=0.04171, over 4713803.31 frames. ], batch size: 68, lr: 3.08e-03, grad_scale: 16.0 2023-10-03 05:38:05,910 WARNING [train.py:1204] (2/4) Exclude cut with ID _8064_210_5399_1_1528372299291_4282973_155-283231-0_sp1.1 from training. Duration: 0.97275 2023-10-03 05:38:08,726 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_2245_210_2715_1_1525511548_3540254_310-52483-0_sp1.1 from training. Duration: 0.8 2023-10-03 05:38:08,784 WARNING [train.py:1204] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_185-174805-0 from training. Duration: 0.68 2023-10-03 05:38:11,464 WARNING [train.py:1204] (2/4) Exclude cut with ID _15012_210_1968_1_1530856820429_5095350_662-210469-0_sp1.1 from training. Duration: 0.5181875 2023-10-03 05:38:13,109 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.1.ff2_skip_rate, batch_count=1156586.6666666667, ans=0.0 2023-10-03 05:38:14,851 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_426-313557-0_sp0.9 from training. Duration: 0.588875 2023-10-03 05:38:16,109 WARNING [train.py:1204] (2/4) Exclude cut with ID _42326_210_7770_1_1533814282531_3707269_407-204343-0_sp1.1 from training. Duration: 0.97275 2023-10-03 05:38:16,126 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_258-310570-0 from training. Duration: 0.82 2023-10-03 05:38:20,412 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_4737_210_2040_1_1526998689_3293008_378-178650-0_sp1.1 from training. Duration: 0.863625 2023-10-03 05:38:20,495 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_47896_210_20508_1_1534492518_5958868_432-327451-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 05:38:21,824 WARNING [train.py:1204] (2/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_188-114464-0_sp1.1 from training. Duration: 0.7454375 2023-10-03 05:38:21,831 WARNING [train.py:1204] (2/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_798-106179-0_sp1.1 from training. Duration: 0.7545625 2023-10-03 05:38:23,242 WARNING [train.py:1204] (2/4) Exclude cut with ID _14368_210_4220_1_1530532714870_3595529_143-117421-0_sp1.1 from training. Duration: 0.52725 2023-10-03 05:38:23,279 WARNING [train.py:1204] (2/4) Exclude cut with ID _67499_210_17282_1_1535509805220_3692430_361-78786-0_sp1.1 from training. Duration: 0.963625 2023-10-03 05:38:23,317 WARNING [train.py:1204] (2/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_712-342563-0_sp1.1 from training. Duration: 0.8818125 2023-10-03 05:38:23,453 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=1156653.3333333333, ans=0.1 2023-10-03 05:38:24,620 WARNING [train.py:1204] (2/4) Exclude cut with ID _9235_210_5219_1_1528891154253_8046120_387-159008-0 from training. Duration: 0.86 2023-10-03 05:38:27,887 WARNING [train.py:1204] (2/4) Exclude cut with ID _33640_210_2655_1_1533117605479_4855469_959-18820-0_sp1.1 from training. Duration: 0.963625 2023-10-03 05:38:27,922 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_12868_210_6753_1_1530702479_3610564_133-564-0_sp0.9 from training. Duration: 0.87775 2023-10-03 05:38:30,627 WARNING [train.py:1204] (2/4) Exclude cut with ID _23514_210_1389_1_1531832139785_3031439_12-29287-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 05:38:34,060 WARNING [train.py:1204] (2/4) Exclude cut with ID _23514_210_1389_1_1531832139785_3031439_489-185052-0_sp1.1 from training. Duration: 0.963625 2023-10-03 05:38:35,376 WARNING [train.py:1204] (2/4) Exclude cut with ID _5444_210_2151_1_1527418808464_5312210_634-194174-0 from training. Duration: 0.97 2023-10-03 05:38:35,402 WARNING [train.py:1204] (2/4) Exclude cut with ID _6275_210_2084_1_1528110062213_7669049_91-26919-0_sp1.1 from training. Duration: 0.8818125 2023-10-03 05:38:38,154 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_37351_210_7925_1_1533012931_3812113_516-82303-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 05:38:38,251 WARNING [train.py:1204] (2/4) Exclude cut with ID _41314_210_12651_1_1533459476543_3766460_80-74184-0_sp0.9 from training. Duration: 0.92225 2023-10-03 05:38:39,537 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_9460_210_4211_1_1528954587_10049667_495-202963-0_sp1.1 from training. Duration: 0.963625 2023-10-03 05:38:40,950 WARNING [train.py:1204] (2/4) Exclude cut with ID _14632_210_9988_1_1530691279392_3923200_332-105904-0_sp1.1 from training. Duration: 0.8181875 2023-10-03 05:38:40,966 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_40764_210_1925_1_1533882359_3752345_493-167886-0_sp1.1 from training. Duration: 0.92725 2023-10-03 05:38:42,452 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_196-289637-0 from training. Duration: 0.75 2023-10-03 05:38:45,774 WARNING [train.py:1204] (2/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_26-175553-0 from training. Duration: 0.61 2023-10-03 05:38:45,826 WARNING [train.py:1204] (2/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_890-5117-0 from training. Duration: 0.78 2023-10-03 05:38:45,874 WARNING [train.py:1204] (2/4) Exclude cut with ID _41314_210_12651_1_1533459476543_3766460_206-306860-0_sp1.1 from training. Duration: 0.92725 2023-10-03 05:38:47,314 WARNING [train.py:1204] (2/4) Exclude cut with ID _25882_210_3036_1_1532775665815_6479510_570-79824-0_sp1.1 from training. Duration: 0.963625 2023-10-03 05:38:48,679 WARNING [train.py:1204] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_731-2428-0_sp1.1 from training. Duration: 0.7545625 2023-10-03 05:38:48,708 WARNING [train.py:1204] (2/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_138-154812-0_sp0.9 from training. Duration: 0.8666875 2023-10-03 05:38:48,946 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.out_combiner.scale_min, batch_count=1156786.6666666667, ans=0.2 2023-10-03 05:38:52,773 WARNING [train.py:1204] (2/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_178-39722-0_sp0.9 from training. Duration: 0.5444375 2023-10-03 05:38:52,831 WARNING [train.py:1204] (2/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_1285-256622-0_sp0.9 from training. Duration: 0.9333125 2023-10-03 05:38:58,261 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_41428_210_2161_1_1533774184_3992532_65-271323-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 05:39:00,287 WARNING [train.py:1204] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_308-161067-0 from training. Duration: 0.66 2023-10-03 05:39:00,292 WARNING [train.py:1204] (2/4) Exclude cut with ID _3067_210_2667_1_1526212974640_4503168_459-173867-0 from training. Duration: 0.88 2023-10-03 05:39:02,137 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_2221_210_1988_1_1525597051_4935541_415-296882-0_sp1.1 from training. Duration: 0.7 2023-10-03 05:39:04,924 WARNING [train.py:1204] (2/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_354-192266-0_sp1.1 from training. Duration: 0.936375 2023-10-03 05:39:04,993 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_258-310570-0_sp0.9 from training. Duration: 0.911125 2023-10-03 05:39:07,584 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_548-141039-0_sp1.1 from training. Duration: 0.963625 2023-10-03 05:39:07,899 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.conv_module1.balancer1.min_positive, batch_count=1156853.3333333333, ans=0.025 2023-10-03 05:39:10,462 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_15101_210_7927_1_1530786415_5033274_320-327527-0 from training. Duration: 0.96 2023-10-03 05:39:11,781 WARNING [train.py:1204] (2/4) Exclude cut with ID _2895_210_2477_1_1525956785794_4063750_289-11475-0_sp0.9 from training. Duration: 0.911125 2023-10-03 05:39:11,930 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7543_210_6753_1_1528543318_4552_1-85072-0_sp1.1 from training. Duration: 0.7545625 2023-10-03 05:39:13,302 WARNING [train.py:1204] (2/4) Exclude cut with ID _64639_210_5816_1_1535367619437_3528000_461-329156-0 from training. Duration: 0.96 2023-10-03 05:39:13,462 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.feed_forward1.hidden_balancer.prob, batch_count=1156853.3333333333, ans=0.125 2023-10-03 05:39:13,945 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.2.encoder.layers.2.self_attn1.whiten, num_groups=1, num_channels=384, metric=18.17 vs. limit=22.5 2023-10-03 05:39:14,662 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_211-339622-0 from training. Duration: 0.45 2023-10-03 05:39:15,999 INFO [train.py:1046] (2/4) Epoch 33, batch 3550, loss[loss=0.1572, simple_loss=0.2393, pruned_loss=0.03751, over 24478.00 frames. ], tot_loss[loss=0.1603, simple_loss=0.2388, pruned_loss=0.04094, over 4712287.44 frames. ], batch size: 63, lr: 3.08e-03, grad_scale: 16.0 2023-10-03 05:39:16,931 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.feed_forward1.hidden_balancer.prob, batch_count=1156920.0, ans=0.125 2023-10-03 05:39:18,070 WARNING [train.py:1204] (2/4) Exclude cut with ID _20455_210_5219_1_1531565996775_8098100_402-112739-0_sp1.1 from training. Duration: 0.963625 2023-10-03 05:39:19,452 WARNING [train.py:1204] (2/4) Exclude cut with ID _53489_210_6228_1_1534298357169_3835970_256-115565-0_sp1.1 from training. Duration: 0.936375 2023-10-03 05:39:19,470 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_26326_210_7925_1_1533002493_3592406_360-305260-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 05:39:19,498 WARNING [train.py:1204] (2/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_18-204383-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 05:39:23,563 WARNING [train.py:1204] (2/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_306-232480-0_sp1.1 from training. Duration: 0.87275 2023-10-03 05:39:24,903 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.443e+02 1.866e+02 2.004e+02 2.192e+02 2.799e+02, threshold=4.007e+02, percent-clipped=0.0 2023-10-03 05:39:25,237 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.nonlin_attention.balancer.prob, batch_count=1156920.0, ans=0.125 2023-10-03 05:39:31,230 WARNING [train.py:1204] (2/4) Exclude cut with ID _41161_210_20034_1_1533431249657_3555314_116-299186-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 05:39:31,337 WARNING [train.py:1204] (2/4) Exclude cut with ID _13913_210_9994_1_1530579615187_3383897_473-277682-0_sp1.1 from training. Duration: 0.3545625 2023-10-03 05:39:36,500 WARNING [train.py:1204] (2/4) Exclude cut with ID _58895_210_8750_1_1535184081320_3649089_49-225702-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 05:39:36,571 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_3080_210_2071_1_1526086102_4365166_312-8726-0_sp1.1 from training. Duration: 0.82725 2023-10-03 05:39:38,011 WARNING [train.py:1204] (2/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_489-190175-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 05:39:38,510 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.5.encoder.layers.0.nonlin_attention.whiten2, num_groups=1, num_channels=256, metric=4.24 vs. limit=15.0 2023-10-03 05:39:39,333 WARNING [train.py:1204] (2/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_345-95717-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 05:39:39,350 WARNING [train.py:1204] (2/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_85-138257-0_sp1.1 from training. Duration: 0.6454375 2023-10-03 05:39:42,257 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7046_210_6753_1_1527895337_6920917_783-278109-0_sp1.1 from training. Duration: 0.7 2023-10-03 05:39:42,284 WARNING [train.py:1204] (2/4) Exclude cut with ID _3465_210_3525_1_1526175667060_3574049_450-203637-0_sp1.1 from training. Duration: 0.836375 2023-10-03 05:39:43,620 WARNING [train.py:1204] (2/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_416-132893-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 05:39:43,643 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_2655_210_2087_1_1525688959_3897400_437-337690-0_sp0.9 from training. Duration: 0.7 2023-10-03 05:39:44,983 WARNING [train.py:1204] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_700-183549-0_sp1.1 from training. Duration: 0.6181875 2023-10-03 05:39:48,509 WARNING [train.py:1204] (2/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_191-59784-0_sp1.1 from training. Duration: 0.8 2023-10-03 05:39:48,527 WARNING [train.py:1204] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_218-295097-0_sp1.1 from training. Duration: 0.7 2023-10-03 05:39:49,820 WARNING [train.py:1204] (2/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_310-109461-0_sp1.1 from training. Duration: 0.72725 2023-10-03 05:39:49,824 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_33348_210_5632_1_1533086912_3324030_131-321149-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 05:39:49,875 WARNING [train.py:1204] (2/4) Exclude cut with ID _64135_210_2253_1_1535677267876_7608972_133-188266-0_sp1.1 from training. Duration: 0.736375 2023-10-03 05:39:49,892 WARNING [train.py:1204] (2/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_505-205270-0 from training. Duration: 0.38 2023-10-03 05:39:51,142 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_67223_210_4238_1_1535612120_2537431_102-88248-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 05:39:52,516 WARNING [train.py:1204] (2/4) Exclude cut with ID _54871_210_22356_1_1534665798253_3856359_60-235433-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 05:39:52,604 WARNING [train.py:1204] (2/4) Exclude cut with ID _5189_210_4557_1_1527248319428_3769229_199-53526-0_sp1.1 from training. Duration: 0.3818125 2023-10-03 05:39:58,143 WARNING [train.py:1204] (2/4) Exclude cut with ID _45329_210_7005_1_1534157951211_6774339_439-90979-0_sp1.1 from training. Duration: 0.963625 2023-10-03 05:39:59,469 WARNING [train.py:1204] (2/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_1162-132880-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 05:40:01,347 WARNING [train.py:1204] (2/4) Exclude cut with ID _56423_210_5854_1_1534896061049_3165969_220-22317-0_sp1.1 from training. Duration: 0.963625 2023-10-03 05:40:02,869 WARNING [train.py:1204] (2/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_909-64151-0 from training. Duration: 0.94 2023-10-03 05:40:02,931 WARNING [train.py:1204] (2/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_465-156500-0_sp0.9 from training. Duration: 0.8 2023-10-03 05:40:04,826 WARNING [train.py:1204] (2/4) Exclude cut with ID _44304_210_15374_1_1533639352765_4613129_302-22084-0 from training. Duration: 0.86 2023-10-03 05:40:04,875 WARNING [train.py:1204] (2/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_663-166973-0_sp1.1 from training. Duration: 0.72725 2023-10-03 05:40:07,646 WARNING [train.py:1204] (2/4) Exclude cut with ID _45329_210_7005_1_1534157951211_6774339_503-287883-0_sp0.9 from training. Duration: 0.97775 2023-10-03 05:40:07,671 WARNING [train.py:1204] (2/4) Exclude cut with ID _60057_210_10640_1_1534903065793_3333150_362-230377-0_sp0.9 from training. Duration: 0.9666875 2023-10-03 05:40:10,356 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_4183_210_2087_1_1526729100_1869838_143-280962-0 from training. Duration: 0.56 2023-10-03 05:40:10,449 WARNING [train.py:1204] (2/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_281-1313-0_sp1.1 from training. Duration: 0.936375 2023-10-03 05:40:16,475 WARNING [train.py:1204] (2/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_64-215118-0_sp1.1 from training. Duration: 0.936375 2023-10-03 05:40:16,530 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_2822_210_2715_1_1526112586_1999587_165-36656-0 from training. Duration: 0.89 2023-10-03 05:40:16,581 WARNING [train.py:1204] (2/4) Exclude cut with ID _22961_210_8254_1_1531976366672_4278180_402-208830-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 05:40:20,693 WARNING [train.py:1204] (2/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_98-193494-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 05:40:22,044 WARNING [train.py:1204] (2/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_383-285196-0 from training. Duration: 0.97 2023-10-03 05:40:23,722 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.nonlin_attention.balancer.prob, batch_count=1157186.6666666667, ans=0.125 2023-10-03 05:40:26,413 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6767_210_3273_1_1527855783_1718932_142-127132-0 from training. Duration: 0.95 2023-10-03 05:40:27,666 WARNING [train.py:1204] (2/4) Exclude cut with ID _39867_210_9826_1_1533553222030_9344259_1018-339635-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 05:40:27,727 WARNING [train.py:1204] (2/4) Exclude cut with ID _14603_210_4220_1_1530615688441_3097319_274-274798-0_sp1.1 from training. Duration: 0.8909375 2023-10-03 05:40:30,809 INFO [train.py:1046] (2/4) Epoch 33, batch 3600, loss[loss=0.1738, simple_loss=0.2562, pruned_loss=0.04568, over 24033.00 frames. ], tot_loss[loss=0.1603, simple_loss=0.2387, pruned_loss=0.04098, over 4707303.01 frames. ], batch size: 80, lr: 3.08e-03, grad_scale: 16.0 2023-10-03 05:40:30,905 WARNING [train.py:1204] (2/4) Exclude cut with ID _2334_210_2667_1_1525608238880_4705129_376-263393-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 05:40:30,959 WARNING [train.py:1204] (2/4) Exclude cut with ID _29303_210_2575_1_1532392237303_7410649_363-344675-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 05:40:32,303 WARNING [train.py:1204] (2/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_58-68956-0_sp1.1 from training. Duration: 0.7545625 2023-10-03 05:40:34,306 INFO [scaling.py:1118] (2/4) WithLoss: name=encoder.encoders.3.encoder.layers.2.self_attn_weights, loss-sum=0.000e+00 2023-10-03 05:40:36,016 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.3.encoder.layers.1.nonlin_attention.whiten1, num_groups=1, num_channels=384, metric=4.95 vs. limit=10.0 2023-10-03 05:40:36,731 WARNING [train.py:1204] (2/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_709-311571-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 05:40:38,139 WARNING [train.py:1204] (2/4) Exclude cut with ID _46067_210_4220_1_1534125571066_3307450_136-257407-0_sp1.1 from training. Duration: 0.963625 2023-10-03 05:40:39,667 WARNING [train.py:1204] (2/4) Exclude cut with ID _6493_210_1747_1_1528459210424_4012470_324-323854-0_sp1.1 from training. Duration: 0.836375 2023-10-03 05:40:41,050 WARNING [train.py:1204] (2/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_234-260682-0_sp1.1 from training. Duration: 0.763625 2023-10-03 05:40:41,732 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.3.encoder.layers.0.feed_forward3.out_whiten, num_groups=1, num_channels=512, metric=10.62 vs. limit=15.0 2023-10-03 05:40:42,399 WARNING [train.py:1204] (2/4) Exclude cut with ID _36634_210_1794_1_1533296352450_1399884_55-119451-0_sp1.1 from training. Duration: 0.963625 2023-10-03 05:40:42,402 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_40047_210_2290_1_1533290519_2374311_195-165693-0 from training. Duration: 0.89 2023-10-03 05:40:45,122 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_5522_210_3273_1_1527249606_3909096_366-172508-0_sp0.9 from training. Duration: 0.7333125 2023-10-03 05:40:45,217 WARNING [train.py:1204] (2/4) Exclude cut with ID _21878_210_3525_1_1531738772002_7264330_187-10504-0_sp1.1 from training. Duration: 0.963625 2023-10-03 05:40:50,095 WARNING [train.py:1204] (2/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_282-257791-0_sp1.1 from training. Duration: 0.7818125 2023-10-03 05:40:52,887 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_43831_210_2158_1_1533725502_5872791_467-288776-0_sp1.1 from training. Duration: 0.92725 2023-10-03 05:40:52,983 WARNING [train.py:1204] (2/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_138-154812-0_sp1.1 from training. Duration: 0.7090625 2023-10-03 05:40:54,421 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_1154-185930-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 05:40:54,447 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_2655_210_2087_1_1525688959_3897400_52-279899-0 from training. Duration: 0.69 2023-10-03 05:40:55,809 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_141-89113-0_sp1.1 from training. Duration: 0.7818125 2023-10-03 05:40:57,251 WARNING [train.py:1204] (2/4) Exclude cut with ID _55134_210_21982_1_1534590028377_3882170_297-47310-0_sp1.1 from training. Duration: 0.963625 2023-10-03 05:40:57,321 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_486-151131-0_sp1.1 from training. Duration: 0.77275 2023-10-03 05:41:00,595 WARNING [train.py:1204] (2/4) Exclude cut with ID _44778_210_2054_1_1533798127203_7443090_21-232980-0_sp1.1 from training. Duration: 0.936375 2023-10-03 05:41:03,250 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6165_210_3528_1_1527590982_4379841_338-106380-0_sp1.1 from training. Duration: 0.92725 2023-10-03 05:41:03,327 WARNING [train.py:1204] (2/4) Exclude cut with ID _55162_210_17071_1_1534408248583_3753480_355-257218-0_sp1.1 from training. Duration: 0.97275 2023-10-03 05:41:03,507 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.conv_module2.balancer2.prob, batch_count=1157386.6666666667, ans=0.125 2023-10-03 05:41:03,802 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.5.encoder.layers.0.self_attn_weights.whiten_keys, num_groups=4, num_channels=128, metric=1.98 vs. limit=6.0 2023-10-03 05:41:04,665 WARNING [train.py:1204] (2/4) Exclude cut with ID _2895_210_2477_1_1525956785794_4063750_289-11475-0 from training. Duration: 0.82 2023-10-03 05:41:08,598 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.4.encoder.layers.1.feed_forward3.out_whiten, num_groups=1, num_channels=384, metric=12.58 vs. limit=15.0 2023-10-03 05:41:10,885 WARNING [train.py:1204] (2/4) Exclude cut with ID _16756_210_4915_1_1531047803763_7104259_839-183431-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 05:41:11,142 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.feed_forward2.hidden_balancer.prob, batch_count=1157386.6666666667, ans=0.125 2023-10-03 05:41:11,772 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.0.layers.1.nonlin_attention.whiten1, num_groups=1, num_channels=144, metric=4.92 vs. limit=10.0 2023-10-03 05:41:13,641 WARNING [train.py:1204] (2/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_427-208901-0_sp0.9 from training. Duration: 0.7555625 2023-10-03 05:41:13,690 WARNING [train.py:1204] (2/4) Exclude cut with ID _36512_210_6987_1_1533033143998_3126069_181-306500-0 from training. Duration: 0.95 2023-10-03 05:41:17,179 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.3.encoder.layers.3.self_attn_weights.whiten_keys, num_groups=8, num_channels=256, metric=2.70 vs. limit=6.0 2023-10-03 05:41:17,908 WARNING [train.py:1204] (2/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_28-70987-0_sp1.1 from training. Duration: 0.7909375 2023-10-03 05:41:23,977 WARNING [train.py:1204] (2/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_590-58996-0_sp1.1 from training. Duration: 0.936375 2023-10-03 05:41:26,686 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_48730_210_5399_1_1533885886_2212769_108-47561-0_sp1.1 from training. Duration: 0.936375 2023-10-03 05:41:34,719 WARNING [train.py:1204] (2/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_401-158239-0_sp0.9 from training. Duration: 0.788875 2023-10-03 05:41:34,741 WARNING [train.py:1204] (2/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_278-154492-0_sp1.1 from training. Duration: 0.6909375 2023-10-03 05:41:34,746 WARNING [train.py:1204] (2/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_137-116442-0 from training. Duration: 0.78 2023-10-03 05:41:36,122 WARNING [train.py:1204] (2/4) Exclude cut with ID _3541_210_2477_1_1526475408709_3263973_189-317158-0 from training. Duration: 0.9 2023-10-03 05:41:37,960 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_47896_210_20508_1_1534492518_5958868_558-314572-0 from training. Duration: 0.97 2023-10-03 05:41:39,367 WARNING [train.py:1204] (2/4) Exclude cut with ID _66070_210_7640_1_1535441393096_7805310_290-40284-0_sp1.1 from training. Duration: 0.97275 2023-10-03 05:41:40,630 WARNING [train.py:1204] (2/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_110-224688-0_sp1.1 from training. Duration: 0.8818125 2023-10-03 05:41:41,987 WARNING [train.py:1204] (2/4) Exclude cut with ID _7719_210_3525_1_1528456153163_4296960_88-250259-0 from training. Duration: 0.84 2023-10-03 05:41:42,045 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8453_210_4211_1_1528502940_8562730_1157-114102-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 05:41:42,077 WARNING [train.py:1204] (2/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_317-175062-0_sp1.1 from training. Duration: 0.7454375 2023-10-03 05:41:42,083 WARNING [train.py:1204] (2/4) Exclude cut with ID _29857_210_20034_1_1532395886753_3798160_254-347254-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 05:41:43,454 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=1157586.6666666667, ans=0.1 2023-10-03 05:41:44,534 INFO [train.py:1046] (2/4) Epoch 33, batch 3650, loss[loss=0.1503, simple_loss=0.2293, pruned_loss=0.03569, over 24309.00 frames. ], tot_loss[loss=0.16, simple_loss=0.2389, pruned_loss=0.0406, over 4723226.35 frames. ], batch size: 61, lr: 3.08e-03, grad_scale: 16.0 2023-10-03 05:41:44,586 WARNING [train.py:1204] (2/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_28-70987-0 from training. Duration: 0.87 2023-10-03 05:41:45,943 WARNING [train.py:1204] (2/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_321-335475-0 from training. Duration: 0.45 2023-10-03 05:41:48,730 WARNING [train.py:1204] (2/4) Exclude cut with ID _40901_210_5665_1_1533363789958_3856130_356-107330-0_sp1.1 from training. Duration: 0.936375 2023-10-03 05:41:50,097 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_3780_210_2087_1_1526467389_2145572_176-107140-0 from training. Duration: 0.69 2023-10-03 05:41:54,152 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.623e+02 1.881e+02 2.096e+02 2.323e+02 3.707e+02, threshold=4.192e+02, percent-clipped=0.0 2023-10-03 05:41:55,537 WARNING [train.py:1204] (2/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_80-135200-0 from training. Duration: 0.79 2023-10-03 05:41:55,641 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_33348_210_5632_1_1533086912_3324030_85-28583-0_sp0.9 from training. Duration: 0.911125 2023-10-03 05:41:57,776 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=1157653.3333333333, ans=0.1 2023-10-03 05:41:59,119 WARNING [train.py:1204] (2/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_29-293896-0 from training. Duration: 0.61 2023-10-03 05:42:01,947 WARNING [train.py:1204] (2/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_194-252800-0 from training. Duration: 0.97 2023-10-03 05:42:06,706 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_360-52446-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 05:42:06,707 WARNING [train.py:1204] (2/4) Exclude cut with ID _9389_210_1664_1_1529217337301_5289269_289-213417-0_sp0.9 from training. Duration: 0.988875 2023-10-03 05:42:06,746 WARNING [train.py:1204] (2/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_143-86344-0_sp0.9 from training. Duration: 0.8555625 2023-10-03 05:42:07,443 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.3.encoder.layers.2.whiten, num_groups=1, num_channels=512, metric=3.91 vs. limit=12.0 2023-10-03 05:42:10,318 WARNING [train.py:1204] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_520-126729-0_sp0.9 from training. Duration: 0.77775 2023-10-03 05:42:10,514 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.nonlin_attention.balancer.max_positive, batch_count=1157653.3333333333, ans=0.95 2023-10-03 05:42:10,536 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.conv_module2.balancer2.prob, batch_count=1157653.3333333333, ans=0.125 2023-10-03 05:42:11,623 WARNING [train.py:1204] (2/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_76-308663-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 05:42:11,690 WARNING [train.py:1204] (2/4) Exclude cut with ID _8021_210_7994_1_1528376451358_3653069_100-140231-0 from training. Duration: 0.67 2023-10-03 05:42:13,657 WARNING [train.py:1204] (2/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_387-131538-0_sp0.9 from training. Duration: 0.9 2023-10-03 05:42:13,707 WARNING [train.py:1204] (2/4) Exclude cut with ID _6275_210_2084_1_1528110062213_7669049_365-182054-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 05:42:13,785 WARNING [train.py:1204] (2/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_345-340690-0 from training. Duration: 0.93 2023-10-03 05:42:15,220 WARNING [train.py:1204] (2/4) Exclude cut with ID _5409_210_1388_1_1527292850189_5865191_178-127970-0_sp0.9 from training. Duration: 0.6333125 2023-10-03 05:42:16,624 WARNING [train.py:1204] (2/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_352-12734-0_sp1.1 from training. Duration: 0.92725 2023-10-03 05:42:16,627 WARNING [train.py:1204] (2/4) Exclude cut with ID _65134_210_14763_1_1535335393985_3505368_121-129373-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 05:42:18,040 WARNING [train.py:1204] (2/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_557-324258-0_sp1.1 from training. Duration: 0.736375 2023-10-03 05:42:19,443 WARNING [train.py:1204] (2/4) Exclude cut with ID _48678_210_5663_1_1533988830947_3769199_306-255886-0 from training. Duration: 0.77 2023-10-03 05:42:20,903 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7544_210_6753_1_1528587000_691468_87-178599-0 from training. Duration: 0.87 2023-10-03 05:42:22,292 WARNING [train.py:1204] (2/4) Exclude cut with ID _2941_210_2577_1_1526199673150_2489759_208-236402-0_sp1.1 from training. Duration: 0.963625 2023-10-03 05:42:24,887 WARNING [train.py:1204] (2/4) Exclude cut with ID _38670_210_15379_1_1533448810659_4391059_65-169397-0 from training. Duration: 0.97 2023-10-03 05:42:26,232 WARNING [train.py:1204] (2/4) Exclude cut with ID _30304_210_6319_1_1532476827261_7582060_306-328175-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 05:42:26,241 WARNING [train.py:1204] (2/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_212-149947-0_sp1.1 from training. Duration: 0.663625 2023-10-03 05:42:27,133 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.1.balancer2.prob, batch_count=1157720.0, ans=0.125 2023-10-03 05:42:29,106 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.2.encoder.layers.0.nonlin_attention.whiten2, num_groups=1, num_channels=384, metric=7.14 vs. limit=15.0 2023-10-03 05:42:29,769 WARNING [train.py:1204] (2/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_321-22821-0_sp1.1 from training. Duration: 0.8090625 2023-10-03 05:42:29,986 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.feed_forward3.hidden_balancer.prob, batch_count=1157786.6666666667, ans=0.125 2023-10-03 05:42:32,443 WARNING [train.py:1204] (2/4) Exclude cut with ID _44010_210_4525_1_1533884262964_4013620_483-69110-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 05:42:32,452 WARNING [train.py:1204] (2/4) Exclude cut with ID _14922_210_6228_1_1530707435273_3693310_38-187851-0_sp1.1 from training. Duration: 0.563625 2023-10-03 05:42:35,050 WARNING [train.py:1204] (2/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_129-337802-0_sp0.9 from training. Duration: 0.8 2023-10-03 05:42:35,110 WARNING [train.py:1204] (2/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_268-87909-0_sp1.1 from training. Duration: 0.9 2023-10-03 05:42:38,474 WARNING [train.py:1204] (2/4) Exclude cut with ID _21878_210_3525_1_1531738772002_7264330_559-184862-0_sp1.1 from training. Duration: 0.8545625 2023-10-03 05:42:41,810 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_46032_210_14656_1_1533781412_4507969_237-120766-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 05:42:41,903 WARNING [train.py:1204] (2/4) Exclude cut with ID _28427_210_4220_1_1532581240967_3061069_330-170906-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 05:42:41,908 WARNING [train.py:1204] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_401-51578-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 05:42:43,294 WARNING [train.py:1204] (2/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_209-205365-0_sp1.1 from training. Duration: 0.7181875 2023-10-03 05:42:45,159 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_23801_210_16223_1_1532066392_3294657_57-242170-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 05:42:45,212 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_127-37371-0_sp1.1 from training. Duration: 0.936375 2023-10-03 05:42:51,912 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9171W0019-578905 from training. Duration: 0.896 2023-10-03 05:42:54,638 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_24605_210_1794_1_1532047863_4149755_349-49056-0_sp1.1 from training. Duration: 0.92725 2023-10-03 05:42:54,657 WARNING [train.py:1204] (2/4) Exclude cut with ID _12302_210_5219_1_1529924446466_8107280_897-21237-0_sp1.1 from training. Duration: 0.936375 2023-10-03 05:42:56,065 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_9460_210_4211_1_1528954587_10049667_480-260269-0_sp0.9 from training. Duration: 0.62225 2023-10-03 05:42:56,108 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_348-152548-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 05:42:57,525 WARNING [train.py:1204] (2/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_301-330455-0_sp0.9 from training. Duration: 0.888875 2023-10-03 05:42:58,794 INFO [train.py:1046] (2/4) Epoch 33, batch 3700, loss[loss=0.1544, simple_loss=0.2463, pruned_loss=0.03124, over 24468.00 frames. ], tot_loss[loss=0.1606, simple_loss=0.2398, pruned_loss=0.04068, over 4720853.23 frames. ], batch size: 69, lr: 3.08e-03, grad_scale: 16.0 2023-10-03 05:42:59,509 WARNING [train.py:1204] (2/4) Exclude cut with ID _61008_210_10633_1_1534852769198_6974370_345-91705-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 05:43:01,073 WARNING [train.py:1204] (2/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_304-273433-0 from training. Duration: 0.99 2023-10-03 05:43:01,076 WARNING [train.py:1204] (2/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_359-345972-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 05:43:03,794 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_2296_210_2087_1_1525503448_3397335_57-185430-0_sp1.1 from training. Duration: 0.5454375 2023-10-03 05:43:06,431 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_1256-120974-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 05:43:06,501 WARNING [train.py:1204] (2/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_372-30302-0_sp0.9 from training. Duration: 0.9555625 2023-10-03 05:43:08,578 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.3.encoder.layers.3.self_attn2.whiten, num_groups=1, num_channels=512, metric=11.26 vs. limit=22.5 2023-10-03 05:43:09,854 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_42012_210_7927_1_1533439936_3871556_451-344262-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 05:43:09,855 WARNING [train.py:1204] (2/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_28-324729-0 from training. Duration: 0.94 2023-10-03 05:43:09,866 WARNING [train.py:1204] (2/4) Exclude cut with ID _64135_210_2253_1_1535677267876_7608972_313-18394-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 05:43:10,090 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.feed_forward1.out_proj.dropout_p, batch_count=1157920.0, ans=0.1 2023-10-03 05:43:11,186 WARNING [train.py:1204] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_620-276793-0_sp0.9 from training. Duration: 0.6555625 2023-10-03 05:43:11,219 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7544_210_6753_1_1528591327_1471426_172-348777-0_sp0.9 from training. Duration: 0.7333125 2023-10-03 05:43:11,507 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.conv_module1.balancer1.prob, batch_count=1157920.0, ans=0.125 2023-10-03 05:43:13,559 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.4.encoder.layers.2.conv_module1.whiten, num_groups=1, num_channels=384, metric=2.61 vs. limit=15.0 2023-10-03 05:43:14,423 WARNING [train.py:1204] (2/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_332-221057-0_sp0.9 from training. Duration: 0.7555625 2023-10-03 05:43:16,612 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.conv_module1.balancer1.prob, batch_count=1157986.6666666667, ans=0.125 2023-10-03 05:43:17,960 WARNING [train.py:1204] (2/4) Exclude cut with ID _19023_210_4353_1_1531378892552_5820780_5-300035-0_sp1.1 from training. Duration: 0.92725 2023-10-03 05:43:18,010 WARNING [train.py:1204] (2/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_343-309023-0_sp1.1 from training. Duration: 0.963625 2023-10-03 05:43:19,277 WARNING [train.py:1204] (2/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_37-41919-0_sp1.1 from training. Duration: 0.7909375 2023-10-03 05:43:19,308 WARNING [train.py:1204] (2/4) Exclude cut with ID _3541_210_2477_1_1526475408709_3263973_41-182910-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 05:43:20,698 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_229-45300-0_sp0.9 from training. Duration: 0.6333125 2023-10-03 05:43:22,150 WARNING [train.py:1204] (2/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_40-214716-0_sp1.1 from training. Duration: 0.963625 2023-10-03 05:43:24,828 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9309W0203-640330 from training. Duration: 0.896 2023-10-03 05:43:30,608 WARNING [train.py:1204] (2/4) Exclude cut with ID _60229_210_27767_1_1534838406158_3655350_264-279573-0_sp1.1 from training. Duration: 0.7545625 2023-10-03 05:43:32,468 WARNING [train.py:1204] (2/4) Exclude cut with ID _8394_210_6702_1_1528590504316_3540189_372-198878-0_sp0.9 from training. Duration: 0.6666875 2023-10-03 05:43:32,564 WARNING [train.py:1204] (2/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_359-118926-0_sp1.1 from training. Duration: 0.6545625 2023-10-03 05:43:32,575 WARNING [train.py:1204] (2/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_85-65056-0 from training. Duration: 0.96 2023-10-03 05:43:32,798 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.balancer_ff2.min_abs, batch_count=1158053.3333333333, ans=0.1 2023-10-03 05:43:33,842 WARNING [train.py:1204] (2/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_89-242335-0_sp1.1 from training. Duration: 0.863625 2023-10-03 05:43:35,363 WARNING [train.py:1204] (2/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_128-49444-0_sp1.1 from training. Duration: 0.936375 2023-10-03 05:43:36,723 WARNING [train.py:1204] (2/4) Exclude cut with ID _30327_210_16049_1_1532484161295_3924169_401-70819-0 from training. Duration: 0.86 2023-10-03 05:43:36,806 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_41822_210_13270_1_1533955976_4379284_260-256391-0_sp1.1 from training. Duration: 0.936375 2023-10-03 05:43:39,979 WARNING [train.py:1204] (2/4) Exclude cut with ID _67335_210_5219_1_1535525975652_4275229_599-247317-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 05:43:42,751 WARNING [train.py:1204] (2/4) Exclude cut with ID _29857_210_20034_1_1532395886753_3798160_209-239267-0_sp1.1 from training. Duration: 0.936375 2023-10-03 05:43:42,769 WARNING [train.py:1204] (2/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_46-207599-0_sp1.1 from training. Duration: 0.5818125 2023-10-03 05:43:44,774 WARNING [train.py:1204] (2/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_912-282412-0_sp1.1 from training. Duration: 0.4454375 2023-10-03 05:43:49,041 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_4183_210_2087_1_1526729100_1869838_141-166600-0_sp1.1 from training. Duration: 0.863625 2023-10-03 05:43:49,046 WARNING [train.py:1204] (2/4) Exclude cut with ID _15037_210_2253_1_1530752294104_3267650_296-192597-0 from training. Duration: 0.81 2023-10-03 05:43:50,252 WARNING [train.py:1204] (2/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_571-220562-0_sp1.1 from training. Duration: 0.963625 2023-10-03 05:43:50,277 WARNING [train.py:1204] (2/4) Exclude cut with ID _5287_210_2084_1_1527418883040_3878670_12-221186-0 from training. Duration: 0.98 2023-10-03 05:43:54,553 WARNING [train.py:1204] (2/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_149-85602-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 05:43:54,582 WARNING [train.py:1204] (2/4) Exclude cut with ID _9235_210_5219_1_1528891154253_8046120_1003-101638-0_sp1.1 from training. Duration: 0.663625 2023-10-03 05:43:56,224 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.0.nonlin_attention.balancer.prob, batch_count=1158120.0, ans=0.125 2023-10-03 05:43:57,486 WARNING [train.py:1204] (2/4) Exclude cut with ID _55844_210_2346_1_1534471212569_7625707_1099-33689-0_sp1.1 from training. Duration: 0.92725 2023-10-03 05:43:57,527 WARNING [train.py:1204] (2/4) Exclude cut with ID _7606_210_4915_1_1528894253277_5080690_301-10732-0 from training. Duration: 0.81 2023-10-03 05:43:58,967 WARNING [train.py:1204] (2/4) Exclude cut with ID _22826_210_9994_1_1531908201522_3279169_403-34698-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 05:43:59,582 WARNING [train.py:1204] (2/4) Exclude cut with ID _8480_210_1874_1_1528589391205_6728330_755-112625-0_sp0.9 from training. Duration: 0.82225 2023-10-03 05:44:00,845 WARNING [train.py:1204] (2/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_527-238990-0_sp1.1 from training. Duration: 0.8181875 2023-10-03 05:44:00,879 WARNING [train.py:1204] (2/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_426-7328-0_sp1.1 from training. Duration: 0.92725 2023-10-03 05:44:01,016 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.1.feed_forward1.out_proj.dropout_p, batch_count=1158186.6666666667, ans=0.1 2023-10-03 05:44:04,965 WARNING [train.py:1204] (2/4) Exclude cut with ID _3541_210_2477_1_1526475408709_3263973_189-317158-0_sp1.1 from training. Duration: 0.8181875 2023-10-03 05:44:06,286 WARNING [train.py:1204] (2/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_162-103620-0 from training. Duration: 0.93 2023-10-03 05:44:06,388 WARNING [train.py:1204] (2/4) Exclude cut with ID _22083_210_8254_1_1531702862439_3678970_49-234546-0 from training. Duration: 0.83 2023-10-03 05:44:06,428 WARNING [train.py:1204] (2/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_370-331600-0_sp1.1 from training. Duration: 0.8545625 2023-10-03 05:44:06,440 WARNING [train.py:1204] (2/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_166-59042-0_sp1.1 from training. Duration: 0.97275 2023-10-03 05:44:08,433 WARNING [train.py:1204] (2/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_138-332773-0_sp1.1 from training. Duration: 0.736375 2023-10-03 05:44:08,632 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.feed_forward2.hidden_balancer.prob, batch_count=1158186.6666666667, ans=0.125 2023-10-03 05:44:09,815 WARNING [train.py:1204] (2/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_90-30673-0_sp0.9 from training. Duration: 0.9333125 2023-10-03 05:44:11,216 WARNING [train.py:1204] (2/4) Exclude cut with ID _27967_210_12140_1_1532588391859_8419130_737-41898-0_sp1.1 from training. Duration: 0.936375 2023-10-03 05:44:13,200 WARNING [train.py:1204] (2/4) Exclude cut with ID _8833_210_5219_1_1528592665210_7553059_329-125156-0_sp1.1 from training. Duration: 0.7454375 2023-10-03 05:44:13,283 WARNING [train.py:1204] (2/4) Exclude cut with ID _41817_210_18835_1_1533434477160_7423049_352-42537-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 05:44:15,089 INFO [train.py:1046] (2/4) Epoch 33, batch 3750, loss[loss=0.2314, simple_loss=0.2906, pruned_loss=0.08608, over 19846.00 frames. ], tot_loss[loss=0.162, simple_loss=0.2414, pruned_loss=0.04127, over 4721852.68 frames. ], batch size: 389, lr: 3.08e-03, grad_scale: 16.0 2023-10-03 05:44:16,551 WARNING [train.py:1204] (2/4) Exclude cut with ID _8365_210_2064_1_1528592311070_4677690_431-294102-0 from training. Duration: 0.89 2023-10-03 05:44:17,912 WARNING [train.py:1204] (2/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_454-314901-0_sp1.1 from training. Duration: 0.4181875 2023-10-03 05:44:20,688 WARNING [train.py:1204] (2/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_80-135200-0_sp0.9 from training. Duration: 0.87775 2023-10-03 05:44:22,019 WARNING [train.py:1204] (2/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_181-78632-0 from training. Duration: 0.78 2023-10-03 05:44:22,079 WARNING [train.py:1204] (2/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_300-27235-0_sp0.9 from training. Duration: 0.9666875 2023-10-03 05:44:23,417 WARNING [train.py:1204] (2/4) Exclude cut with ID _47790_210_16187_1_1533862814066_4207519_96-177176-0_sp1.1 from training. Duration: 0.97275 2023-10-03 05:44:23,509 WARNING [train.py:1204] (2/4) Exclude cut with ID _35941_210_15379_1_1532995294515_4023089_246-348742-0_sp1.1 from training. Duration: 0.97275 2023-10-03 05:44:24,744 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.547e+02 1.972e+02 2.162e+02 2.375e+02 3.648e+02, threshold=4.325e+02, percent-clipped=0.0 2023-10-03 05:44:26,140 WARNING [train.py:1204] (2/4) Exclude cut with ID _38670_210_15379_1_1533448810659_4391059_65-169397-0_sp1.1 from training. Duration: 0.8818125 2023-10-03 05:44:29,556 WARNING [train.py:1204] (2/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_1003-99642-0_sp1.1 from training. Duration: 0.963625 2023-10-03 05:44:33,676 WARNING [train.py:1204] (2/4) Exclude cut with ID _40402_210_18219_1_1533522324115_2953150_320-14078-0_sp1.1 from training. Duration: 0.863625 2023-10-03 05:44:35,212 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.balancer1.prob, batch_count=1158320.0, ans=0.125 2023-10-03 05:44:36,352 WARNING [train.py:1204] (2/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_367-61330-0_sp0.9 from training. Duration: 0.8333125 2023-10-03 05:44:37,798 WARNING [train.py:1204] (2/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_515-298514-0_sp1.1 from training. Duration: 0.92725 2023-10-03 05:44:41,179 WARNING [train.py:1204] (2/4) Exclude cut with ID _53731_210_7994_1_1534553979244_3757214_471-320815-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 05:44:41,241 WARNING [train.py:1204] (2/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_306-232480-0 from training. Duration: 0.96 2023-10-03 05:44:41,479 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.nonlin_attention.balancer.prob, batch_count=1158320.0, ans=0.125 2023-10-03 05:44:42,123 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.1.encoder.layers.0.nonlin_attention.whiten2, num_groups=1, num_channels=256, metric=5.75 vs. limit=15.0 2023-10-03 05:44:42,643 WARNING [train.py:1204] (2/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_478-162247-0_sp0.9 from training. Duration: 0.911125 2023-10-03 05:44:44,171 WARNING [train.py:1204] (2/4) Exclude cut with ID _56423_210_5854_1_1534896061049_3165969_345-284696-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 05:44:44,212 WARNING [train.py:1204] (2/4) Exclude cut with ID _44778_210_2054_1_1533798127203_7443090_600-122253-0_sp1.1 from training. Duration: 0.963625 2023-10-03 05:44:44,399 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.conv_module1.balancer1.prob, batch_count=1158386.6666666667, ans=0.125 2023-10-03 05:44:44,400 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.conv_module1.balancer2.prob, batch_count=1158386.6666666667, ans=0.125 2023-10-03 05:44:47,613 WARNING [train.py:1204] (2/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_199-295611-0 from training. Duration: 0.55 2023-10-03 05:44:52,408 WARNING [train.py:1204] (2/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_734-129761-0 from training. Duration: 0.82 2023-10-03 05:44:53,829 WARNING [train.py:1204] (2/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_349-169257-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 05:44:53,860 WARNING [train.py:1204] (2/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_202-67017-0_sp0.9 from training. Duration: 0.911125 2023-10-03 05:44:56,591 WARNING [train.py:1204] (2/4) Exclude cut with ID _16908_210_5724_1_1531119157248_3772700_61-258978-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 05:44:59,485 WARNING [train.py:1204] (2/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_172-156931-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 05:45:01,313 WARNING [train.py:1204] (2/4) Exclude cut with ID _4092_210_2151_1_1526643044449_3135759_356-278694-0_sp1.1 from training. Duration: 0.42725 2023-10-03 05:45:02,844 WARNING [train.py:1204] (2/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_116-332008-0 from training. Duration: 0.98 2023-10-03 05:45:03,536 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.2.encoder.layers.0.self_attn2.whiten, num_groups=1, num_channels=384, metric=19.75 vs. limit=22.5 2023-10-03 05:45:05,626 WARNING [train.py:1204] (2/4) Exclude cut with ID _29003_210_19596_1_1532507391720_3894098_358-114789-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 05:45:09,742 WARNING [train.py:1204] (2/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_152-212533-0_sp1.1 from training. Duration: 0.8818125 2023-10-03 05:45:11,518 WARNING [train.py:1204] (2/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_267-214116-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 05:45:14,338 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7543_210_6753_1_1528541907_1399322_165-323619-0_sp0.9 from training. Duration: 0.7666875 2023-10-03 05:45:17,691 WARNING [train.py:1204] (2/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_577-332600-0_sp0.9 from training. Duration: 0.6333125 2023-10-03 05:45:20,917 WARNING [train.py:1204] (2/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_342-100157-0_sp0.9 from training. Duration: 0.6 2023-10-03 05:45:23,674 WARNING [train.py:1204] (2/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_141-89727-0_sp1.1 from training. Duration: 0.6454375 2023-10-03 05:45:23,767 WARNING [train.py:1204] (2/4) Exclude cut with ID _3992_210_1747_1_1526644816031_3731060_107-251434-0_sp1.1 from training. Duration: 0.9 2023-10-03 05:45:23,949 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.balancer1.prob, batch_count=1158520.0, ans=0.125 2023-10-03 05:45:25,222 WARNING [train.py:1204] (2/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_401-53423-0_sp1.1 from training. Duration: 0.5 2023-10-03 05:45:25,675 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.nonlin_attention.whiten2.whitening_limit, batch_count=1158520.0, ans=15.0 2023-10-03 05:45:29,737 INFO [train.py:1046] (2/4) Epoch 33, batch 3800, loss[loss=0.1622, simple_loss=0.2385, pruned_loss=0.04294, over 23525.00 frames. ], tot_loss[loss=0.1628, simple_loss=0.2417, pruned_loss=0.04196, over 4700959.65 frames. ], batch size: 134, lr: 3.08e-03, grad_scale: 16.0 2023-10-03 05:45:33,851 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7762_210_1794_1_1528199508_4057408_422-227034-0_sp1.1 from training. Duration: 0.663625 2023-10-03 05:45:34,142 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.bypass.scale_min, batch_count=1158586.6666666667, ans=0.2 2023-10-03 05:45:36,729 WARNING [train.py:1204] (2/4) Exclude cut with ID _4184_210_2550_1_1526730192013_2219380_102-250299-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 05:45:36,794 WARNING [train.py:1204] (2/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_253-308377-0_sp0.9 from training. Duration: 0.5444375 2023-10-03 05:45:36,959 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.bypass_mid.scale_min, batch_count=1158586.6666666667, ans=0.2 2023-10-03 05:45:38,263 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_271-297505-0 from training. Duration: 0.99 2023-10-03 05:45:38,370 WARNING [train.py:1204] (2/4) Exclude cut with ID _35941_210_15379_1_1532995294515_4023089_213-142973-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 05:45:41,125 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7544_210_6753_1_1528587722_3576686_162-63018-0_sp1.1 from training. Duration: 0.92725 2023-10-03 05:45:41,205 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_365-217119-0_sp0.9 from training. Duration: 0.7 2023-10-03 05:45:44,168 WARNING [train.py:1204] (2/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_662-275621-0_sp0.9 from training. Duration: 0.4666875 2023-10-03 05:45:44,169 WARNING [train.py:1204] (2/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_374-187555-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 05:45:46,036 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_289-180904-0_sp1.1 from training. Duration: 0.6909375 2023-10-03 05:45:47,766 WARNING [train.py:1204] (2/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_189-47206-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 05:45:47,799 WARNING [train.py:1204] (2/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_234-14174-0_sp1.1 from training. Duration: 0.7454375 2023-10-03 05:45:47,824 WARNING [train.py:1204] (2/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_576-76621-0_sp1.1 from training. Duration: 0.963625 2023-10-03 05:45:51,028 WARNING [train.py:1204] (2/4) Exclude cut with ID _13253_210_1925_1_1530757775819_3577873_253-246386-0 from training. Duration: 0.9 2023-10-03 05:45:55,028 WARNING [train.py:1204] (2/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_372-6869-0_sp1.1 from training. Duration: 0.4 2023-10-03 05:45:55,075 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_21434_210_4238_1_1531916339_1222252_22-54954-0_sp1.1 from training. Duration: 0.936375 2023-10-03 05:45:56,488 WARNING [train.py:1204] (2/4) Exclude cut with ID _29853_210_7497_1_1532505600851_6642110_492-170361-0_sp1.1 from training. Duration: 0.92725 2023-10-03 05:45:59,673 WARNING [train.py:1204] (2/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_505-97987-0_sp0.9 from training. Duration: 0.9555625 2023-10-03 05:45:59,715 WARNING [train.py:1204] (2/4) Exclude cut with ID _8365_210_2064_1_1528592311070_4677690_371-122295-0_sp0.9 from training. Duration: 0.611125 2023-10-03 05:46:02,119 WARNING [train.py:1204] (2/4) Exclude cut with ID _14268_210_9206_1_1530622771285_4714099_637-86630-0_sp1.1 from training. Duration: 0.463625 2023-10-03 05:46:02,138 WARNING [train.py:1204] (2/4) Exclude cut with ID _29857_210_20034_1_1532395886753_3798160_372-5678-0_sp1.1 from training. Duration: 0.963625 2023-10-03 05:46:03,530 WARNING [train.py:1204] (2/4) Exclude cut with ID _52892_210_6721_1_1534378941259_4124129_90-165108-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 05:46:04,898 WARNING [train.py:1204] (2/4) Exclude cut with ID _27052_210_5986_1_1532329197187_3585009_143-339198-0_sp1.1 from training. Duration: 0.963625 2023-10-03 05:46:05,148 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.self_attn_weights.pos_emb_skip_rate, batch_count=1158720.0, ans=0.0 2023-10-03 05:46:09,115 WARNING [train.py:1204] (2/4) Exclude cut with ID _14268_210_9206_1_1530622771285_4714099_637-86630-0_sp0.9 from training. Duration: 0.5666875 2023-10-03 05:46:09,117 WARNING [train.py:1204] (2/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_401-255491-0 from training. Duration: 0.7 2023-10-03 05:46:10,622 WARNING [train.py:1204] (2/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_178-6946-0_sp1.1 from training. Duration: 0.97275 2023-10-03 05:46:17,937 WARNING [train.py:1204] (2/4) Exclude cut with ID _27485_210_4916_1_1532739191677_3668269_460-163885-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 05:46:22,694 WARNING [train.py:1204] (2/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_270-26053-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 05:46:25,277 WARNING [train.py:1204] (2/4) Exclude cut with ID _14170_210_5219_1_1530525949868_4469650_412-317077-0 from training. Duration: 0.75 2023-10-03 05:46:25,425 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.0.balancer_ff2.min_abs, batch_count=1158786.6666666667, ans=0.1 2023-10-03 05:46:26,639 WARNING [train.py:1204] (2/4) Exclude cut with ID _5289_210_2084_1_1527678202736_3779299_142-103317-0 from training. Duration: 0.84 2023-10-03 05:46:27,945 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_150-143438-0_sp1.1 from training. Duration: 0.92725 2023-10-03 05:46:29,414 WARNING [train.py:1204] (2/4) Exclude cut with ID _27894_210_12392_1_1532415496078_6902439_668-269779-0_sp1.1 from training. Duration: 0.97275 2023-10-03 05:46:29,471 WARNING [train.py:1204] (2/4) Exclude cut with ID _18513_210_9206_1_1531618215223_3862620_56-222075-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 05:46:31,494 WARNING [train.py:1204] (2/4) Exclude cut with ID _4092_210_2151_1_1526643044449_3135759_356-278694-0 from training. Duration: 0.47 2023-10-03 05:46:34,351 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.1.balancer_na.min_abs, batch_count=1158853.3333333333, ans=0.02 2023-10-03 05:46:35,504 WARNING [train.py:1204] (2/4) Exclude cut with ID _25808_210_13292_1_1532343514562_7366109_633-47524-0 from training. Duration: 0.99 2023-10-03 05:46:35,521 WARNING [train.py:1204] (2/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_872-264525-0 from training. Duration: 0.65 2023-10-03 05:46:35,550 WARNING [train.py:1204] (2/4) Exclude cut with ID _14632_210_9988_1_1530691279392_3923200_239-232545-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 05:46:36,884 WARNING [train.py:1204] (2/4) Exclude cut with ID _14349_210_8341_1_1530615710818_3442470_153-27432-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 05:46:41,313 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.self_attn_weights.pos_emb_skip_rate, batch_count=1158920.0, ans=0.0 2023-10-03 05:46:42,946 INFO [train.py:1046] (2/4) Epoch 33, batch 3850, loss[loss=0.1541, simple_loss=0.2366, pruned_loss=0.03584, over 24485.00 frames. ], tot_loss[loss=0.1618, simple_loss=0.2406, pruned_loss=0.0415, over 4702471.56 frames. ], batch size: 63, lr: 3.08e-03, grad_scale: 16.0 2023-10-03 05:46:43,104 WARNING [train.py:1204] (2/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_37-41919-0_sp0.9 from training. Duration: 0.9666875 2023-10-03 05:46:44,521 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_196-289637-0_sp0.9 from training. Duration: 0.8333125 2023-10-03 05:46:48,234 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.3.encoder.layers.1.nonlin_attention.whiten2, num_groups=1, num_channels=512, metric=7.34 vs. limit=15.0 2023-10-03 05:46:49,428 WARNING [train.py:1204] (2/4) Exclude cut with ID _13678_210_7497_1_1530530069226_3463170_371-3221-0_sp1.1 from training. Duration: 0.8181875 2023-10-03 05:46:49,494 WARNING [train.py:1204] (2/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_557-324258-0 from training. Duration: 0.81 2023-10-03 05:46:51,495 WARNING [train.py:1204] (2/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_1012-279473-0_sp1.1 from training. Duration: 0.6090625 2023-10-03 05:46:51,572 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_66916_210_14656_1_1535725525_326566_7-92649-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 05:46:54,125 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.495e+02 1.876e+02 2.130e+02 2.387e+02 3.615e+02, threshold=4.261e+02, percent-clipped=0.0 2023-10-03 05:46:54,326 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_9460_210_4211_1_1528954587_10049667_480-260269-0_sp1.1 from training. Duration: 0.5090625 2023-10-03 05:46:57,158 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_40896_210_15710_1_1533433956_7100802_121-155001-0_sp1.1 from training. Duration: 0.92725 2023-10-03 05:46:57,360 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.conv_module2.balancer1.prob, batch_count=1158986.6666666667, ans=0.125 2023-10-03 05:47:00,989 WARNING [train.py:1204] (2/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_188-171673-0_sp1.1 from training. Duration: 0.52725 2023-10-03 05:47:01,071 WARNING [train.py:1204] (2/4) Exclude cut with ID _47790_210_16187_1_1533862814066_4207519_2-39653-0 from training. Duration: 0.98 2023-10-03 05:47:07,055 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_23850_210_4238_1_1532232597_1632433_112-229012-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 05:47:08,384 WARNING [train.py:1204] (2/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_626-79528-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 05:47:11,169 WARNING [train.py:1204] (2/4) Exclude cut with ID _30304_210_6319_1_1532476827261_7582060_856-90052-0_sp1.1 from training. Duration: 0.963625 2023-10-03 05:47:11,208 WARNING [train.py:1204] (2/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_122-109023-0_sp0.9 from training. Duration: 0.8444375 2023-10-03 05:47:13,147 WARNING [train.py:1204] (2/4) Exclude cut with ID _64414_210_17301_1_1535243252048_3757759_37-70328-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 05:47:13,201 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_20120_210_6753_1_1531643035_5482859_622-344907-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 05:47:14,526 WARNING [train.py:1204] (2/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_410-321956-0_sp1.1 from training. Duration: 0.92725 2023-10-03 05:47:14,538 WARNING [train.py:1204] (2/4) Exclude cut with ID _7562_210_2084_1_1528887767916_3946229_186-326422-0_sp0.9 from training. Duration: 0.9333125 2023-10-03 05:47:15,997 WARNING [train.py:1204] (2/4) Exclude cut with ID _37537_210_2253_1_1533171758568_3421949_127-285192-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 05:47:17,783 WARNING [train.py:1204] (2/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_723-228168-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 05:47:18,036 INFO [scaling.py:1118] (2/4) WithLoss: name=encoder.encoders.3.encoder.layers.0.self_attn_weights, loss-sum=0.000e+00 2023-10-03 05:47:19,540 WARNING [train.py:1204] (2/4) Exclude cut with ID _13678_210_7497_1_1530530069226_3463170_426-246984-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 05:47:19,563 WARNING [train.py:1204] (2/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_399-156791-0_sp0.9 from training. Duration: 0.92225 2023-10-03 05:47:19,617 WARNING [train.py:1204] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_391-22148-0 from training. Duration: 0.77 2023-10-03 05:47:20,934 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_3475_210_2028_1_1526193578_3622569_64-297072-0 from training. Duration: 0.91 2023-10-03 05:47:22,392 WARNING [train.py:1204] (2/4) Exclude cut with ID _17875_210_2087_1_1531224012907_1821339_225-280015-0_sp1.1 from training. Duration: 0.963625 2023-10-03 05:47:22,409 WARNING [train.py:1204] (2/4) Exclude cut with ID _50655_210_18628_1_1534229942193_3772154_325-246169-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 05:47:25,064 WARNING [train.py:1204] (2/4) Exclude cut with ID _29529_210_7234_1_1532393961455_3977460_597-257348-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 05:47:25,116 WARNING [train.py:1204] (2/4) Exclude cut with ID _58895_210_8750_1_1535184081320_3649089_77-173485-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 05:47:25,135 WARNING [train.py:1204] (2/4) Exclude cut with ID _6270_210_2084_1_1527850911573_3856420_26-277343-0 from training. Duration: 0.82 2023-10-03 05:47:26,534 WARNING [train.py:1204] (2/4) Exclude cut with ID _14368_210_4220_1_1530532714870_3595529_144-3287-0 from training. Duration: 0.67 2023-10-03 05:47:27,936 WARNING [train.py:1204] (2/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_293-145278-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 05:47:29,739 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_40047_210_2290_1_1533290519_2374311_257-335162-0 from training. Duration: 0.95 2023-10-03 05:47:32,442 WARNING [train.py:1204] (2/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_619-344499-0_sp0.9 from training. Duration: 0.7 2023-10-03 05:47:36,608 WARNING [train.py:1204] (2/4) Exclude cut with ID _19504_210_9994_1_1531648863242_3325220_456-315379-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 05:47:37,464 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.1.encoder.layers.0.conv_module2.whiten, num_groups=1, num_channels=256, metric=5.86 vs. limit=15.0 2023-10-03 05:47:38,001 WARNING [train.py:1204] (2/4) Exclude cut with ID _55857_210_2346_1_1534644007831_6898209_631-221692-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 05:47:39,575 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.bypass_mid.scale_min, batch_count=1159120.0, ans=0.2 2023-10-03 05:47:42,124 WARNING [train.py:1204] (2/4) Exclude cut with ID _64631_210_27767_1_1535349580068_4165160_324-3682-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 05:47:42,153 WARNING [train.py:1204] (2/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_317-211107-0 from training. Duration: 0.92 2023-10-03 05:47:44,225 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.balancer2.prob, batch_count=1159186.6666666667, ans=0.125 2023-10-03 05:47:45,430 WARNING [train.py:1204] (2/4) Exclude cut with ID _6789_210_1534_1_1527857937218_3118950_311-61262-0 from training. Duration: 0.65 2023-10-03 05:47:45,644 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.bypass.scale_min, batch_count=1159186.6666666667, ans=0.2 2023-10-03 05:47:46,844 WARNING [train.py:1204] (2/4) Exclude cut with ID _28323_210_16187_1_1532480395667_4100300_365-68882-0_sp1.1 from training. Duration: 0.92725 2023-10-03 05:47:46,891 WARNING [train.py:1204] (2/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_816-181825-0_sp1.1 from training. Duration: 0.92725 2023-10-03 05:47:51,356 WARNING [train.py:1204] (2/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_132-269633-0_sp1.1 from training. Duration: 0.6181875 2023-10-03 05:47:51,373 WARNING [train.py:1204] (2/4) Exclude cut with ID _44585_210_18628_1_1533605350387_4518199_280-73534-0_sp1.1 from training. Duration: 0.8090625 2023-10-03 05:47:51,414 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_68095_210_20185_1_1535718226_8888855_1009-164755-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 05:47:52,687 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_40223_210_6228_1_1533298404_4812267_400-103813-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 05:47:52,688 WARNING [train.py:1204] (2/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_491-261923-0_sp1.1 from training. Duration: 0.8909375 2023-10-03 05:47:52,693 WARNING [train.py:1204] (2/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_712-342563-0 from training. Duration: 0.97 2023-10-03 05:47:52,770 WARNING [train.py:1204] (2/4) Exclude cut with ID _41161_210_20034_1_1533431249657_3555314_175-190032-0_sp1.1 from training. Duration: 0.963625 2023-10-03 05:47:55,490 WARNING [train.py:1204] (2/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_788-298907-0 from training. Duration: 0.89 2023-10-03 05:47:55,518 WARNING [train.py:1204] (2/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_225-70961-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 05:47:55,524 WARNING [train.py:1204] (2/4) Exclude cut with ID _58583_210_5236_1_1534726835266_3574249_235-175994-0_sp1.1 from training. Duration: 0.92725 2023-10-03 05:47:56,813 INFO [train.py:1046] (2/4) Epoch 33, batch 3900, loss[loss=0.1596, simple_loss=0.2474, pruned_loss=0.03585, over 24626.00 frames. ], tot_loss[loss=0.1601, simple_loss=0.2385, pruned_loss=0.04087, over 4699197.72 frames. ], batch size: 68, lr: 3.08e-03, grad_scale: 16.0 2023-10-03 05:47:56,950 WARNING [train.py:1204] (2/4) Exclude cut with ID _12302_210_5219_1_1529924446466_8107280_907-182949-0_sp1.1 from training. Duration: 0.836375 2023-10-03 05:47:58,273 WARNING [train.py:1204] (2/4) Exclude cut with ID _54871_210_22356_1_1534665798253_3856359_94-193692-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 05:48:00,147 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_4528_210_3273_1_1527076654_3276777_342-168068-0_sp0.9 from training. Duration: 0.9666875 2023-10-03 05:48:00,189 WARNING [train.py:1204] (2/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_368-74239-0_sp1.1 from training. Duration: 0.92725 2023-10-03 05:48:00,190 WARNING [train.py:1204] (2/4) Exclude cut with ID _44831_210_13427_1_1533816011549_3689035_291-295928-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 05:48:00,253 WARNING [train.py:1204] (2/4) Exclude cut with ID _56406_210_9288_1_1534899618664_3496149_455-131997-0_sp1.1 from training. Duration: 0.97275 2023-10-03 05:48:00,260 WARNING [train.py:1204] (2/4) Exclude cut with ID _6473_210_1741_1_1527901307396_3815380_108-17335-0 from training. Duration: 0.65 2023-10-03 05:48:01,718 WARNING [train.py:1204] (2/4) Exclude cut with ID _14224_210_4381_1_1530529062037_4264010_286-135600-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 05:48:04,493 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_14056_210_4163_1_1530604483_7572827_928-321675-0_sp1.1 from training. Duration: 0.936375 2023-10-03 05:48:05,970 WARNING [train.py:1204] (2/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_81-300212-0_sp1.1 from training. Duration: 0.5454375 2023-10-03 05:48:07,305 WARNING [train.py:1204] (2/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_29-280622-0_sp0.9 from training. Duration: 0.911125 2023-10-03 05:48:07,391 WARNING [train.py:1204] (2/4) Exclude cut with ID _61079_210_4525_1_1534921008198_4011419_228-176447-0_sp1.1 from training. Duration: 0.936375 2023-10-03 05:48:08,866 WARNING [train.py:1204] (2/4) Exclude cut with ID _8394_210_6702_1_1528590504316_3540189_372-198878-0_sp1.1 from training. Duration: 0.5454375 2023-10-03 05:48:08,886 WARNING [train.py:1204] (2/4) Exclude cut with ID _64521_210_14699_1_1535243427259_3815328_244-208725-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 05:48:10,197 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8275_210_4198_1_1528455320_3892571_491-17707-0_sp0.9 from training. Duration: 0.711125 2023-10-03 05:48:12,022 WARNING [train.py:1204] (2/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_375-245677-0 from training. Duration: 0.81 2023-10-03 05:48:12,029 WARNING [train.py:1204] (2/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_690-286055-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 05:48:14,874 WARNING [train.py:1204] (2/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_89-321955-0 from training. Duration: 0.8 2023-10-03 05:48:14,903 WARNING [train.py:1204] (2/4) Exclude cut with ID _18895_210_6228_1_1531357274234_3784930_213-98721-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 05:48:16,147 WARNING [train.py:1204] (2/4) Exclude cut with ID _19708_210_9206_1_1532136650733_4238364_347-65240-0 from training. Duration: 0.97 2023-10-03 05:48:19,423 WARNING [train.py:1204] (2/4) Exclude cut with ID _64135_210_2253_1_1535677267876_7608972_498-5039-0 from training. Duration: 0.78 2023-10-03 05:48:24,234 WARNING [train.py:1204] (2/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_146-276944-0_sp1.1 from training. Duration: 0.7545625 2023-10-03 05:48:25,543 WARNING [train.py:1204] (2/4) Exclude cut with ID _63171_210_15488_1_1535104792794_3295110_155-34488-0_sp1.1 from training. Duration: 0.97275 2023-10-03 05:48:25,556 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_5438_210_1794_1_1527336687_2600009_233-58072-0_sp0.9 from training. Duration: 0.8555625 2023-10-03 05:48:25,614 WARNING [train.py:1204] (2/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_63-101996-0_sp0.9 from training. Duration: 0.97775 2023-10-03 05:48:30,223 WARNING [train.py:1204] (2/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_271-269084-0_sp1.1 from training. Duration: 0.7545625 2023-10-03 05:48:31,605 WARNING [train.py:1204] (2/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_345-167986-0_sp1.1 from training. Duration: 0.763625 2023-10-03 05:48:35,558 WARNING [train.py:1204] (2/4) Exclude cut with ID _39290_210_4220_1_1533862778760_3075118_92-311600-0_sp1.1 from training. Duration: 0.8 2023-10-03 05:48:35,564 WARNING [train.py:1204] (2/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_209-65306-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 05:48:35,644 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_26433_210_12108_1_1532157158_4056480_317-335807-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 05:48:41,238 WARNING [train.py:1204] (2/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_238-94127-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 05:48:41,517 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.attention_skip_rate, batch_count=1159453.3333333333, ans=0.0 2023-10-03 05:48:42,497 WARNING [train.py:1204] (2/4) Exclude cut with ID _41314_210_12651_1_1533459476543_3766460_207-132038-0_sp1.1 from training. Duration: 0.8545625 2023-10-03 05:48:48,823 WARNING [train.py:1204] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_167-137255-0_sp1.1 from training. Duration: 0.5818125 2023-10-03 05:48:50,176 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_2009_210_2071_1_1525481134_4588937_385-302100-0_sp0.9 from training. Duration: 0.9666875 2023-10-03 05:48:52,446 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.conv_module1.balancer2.prob, batch_count=1159453.3333333333, ans=0.125 2023-10-03 05:48:55,797 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.feed_forward3.hidden_balancer.prob, batch_count=1159520.0, ans=0.125 2023-10-03 05:48:58,985 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.2.encoder.layers.2.self_attn_weights.whiten_keys, num_groups=4, num_channels=128, metric=3.10 vs. limit=6.0 2023-10-03 05:48:59,685 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.attention_skip_rate, batch_count=1159520.0, ans=0.0 2023-10-03 05:49:00,979 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_15517_210_6753_1_1530954008_3487336_253-110617-0_sp1.1 from training. Duration: 0.936375 2023-10-03 05:49:01,306 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.bypass.skip_rate, batch_count=1159520.0, ans=0.07 2023-10-03 05:49:03,050 WARNING [train.py:1204] (2/4) Exclude cut with ID _39290_210_4220_1_1533862778760_3075118_92-311600-0_sp0.9 from training. Duration: 0.97775 2023-10-03 05:49:03,085 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_42-142473-0 from training. Duration: 0.72 2023-10-03 05:49:04,257 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_2296_210_2087_1_1525503448_3397335_57-185430-0 from training. Duration: 0.6 2023-10-03 05:49:04,276 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_11305_210_1794_1_1529750428_7178148_257-312870-0_sp0.9 from training. Duration: 0.97775 2023-10-03 05:49:05,614 WARNING [train.py:1204] (2/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_222-198127-0 from training. Duration: 0.84 2023-10-03 05:49:06,990 WARNING [train.py:1204] (2/4) Exclude cut with ID _65263_210_22356_1_1535763598691_3921037_423-202699-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 05:49:07,063 WARNING [train.py:1204] (2/4) Exclude cut with ID _15037_210_2253_1_1530752294104_3267650_13-130361-0 from training. Duration: 0.99 2023-10-03 05:49:11,130 INFO [train.py:1046] (2/4) Epoch 33, batch 3950, loss[loss=0.1591, simple_loss=0.2523, pruned_loss=0.03291, over 24325.00 frames. ], tot_loss[loss=0.1601, simple_loss=0.2388, pruned_loss=0.04074, over 4710077.71 frames. ], batch size: 74, lr: 3.08e-03, grad_scale: 16.0 2023-10-03 05:49:11,463 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.feed_forward1.hidden_balancer.prob, batch_count=1159586.6666666667, ans=0.125 2023-10-03 05:49:12,778 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=1159586.6666666667, ans=0.1 2023-10-03 05:49:14,052 WARNING [train.py:1204] (2/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_628-198080-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 05:49:15,855 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_3171_210_2087_1_1526109084_2330007_37-64334-0 from training. Duration: 0.82 2023-10-03 05:49:17,149 WARNING [train.py:1204] (2/4) Exclude cut with ID _5287_210_2084_1_1527418883040_3878670_12-221186-0_sp1.1 from training. Duration: 0.8909375 2023-10-03 05:49:17,448 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.conv_module1.balancer2.min_abs, batch_count=1159586.6666666667, ans=0.5 2023-10-03 05:49:19,811 WARNING [train.py:1204] (2/4) Exclude cut with ID _6653_210_2629_1_1527919172441_6732679_1103-56445-0_sp1.1 from training. Duration: 0.663625 2023-10-03 05:49:19,939 WARNING [train.py:1204] (2/4) Exclude cut with ID _65263_210_22356_1_1535763598691_3921037_169-140068-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 05:49:21,204 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.601e+02 1.827e+02 2.038e+02 2.266e+02 3.676e+02, threshold=4.077e+02, percent-clipped=0.0 2023-10-03 05:49:26,349 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.5.encoder.layers.0.self_attn_weights.whiten_keys, num_groups=4, num_channels=128, metric=1.86 vs. limit=6.0 2023-10-03 05:49:27,640 WARNING [train.py:1204] (2/4) Exclude cut with ID IC0671W0118-331961 from training. Duration: 0.896 2023-10-03 05:49:28,935 WARNING [train.py:1204] (2/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_402-102311-0_sp1.1 from training. Duration: 0.6090625 2023-10-03 05:49:28,962 WARNING [train.py:1204] (2/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_202-293523-0 from training. Duration: 0.72 2023-10-03 05:49:30,329 WARNING [train.py:1204] (2/4) Exclude cut with ID IC0679W0038-335846 from training. Duration: 0.768 2023-10-03 05:49:30,359 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_37357_210_17024_1_1533089504_2316121_79-198866-0_sp1.1 from training. Duration: 0.936375 2023-10-03 05:49:33,196 WARNING [train.py:1204] (2/4) Exclude cut with ID _7606_210_4915_1_1528894253277_5080690_292-271285-0_sp1.1 from training. Duration: 0.963625 2023-10-03 05:49:33,214 WARNING [train.py:1204] (2/4) Exclude cut with ID _3541_210_2477_1_1526475408709_3263973_377-296839-0_sp0.9 from training. Duration: 0.611125 2023-10-03 05:49:33,217 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_45886_210_17024_1_1533704917_2435333_58-131069-0_sp1.1 from training. Duration: 0.963625 2023-10-03 05:49:35,166 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6961_210_2087_1_1527937168_6815011_395-185060-0 from training. Duration: 0.95 2023-10-03 05:49:36,624 WARNING [train.py:1204] (2/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_304-266479-0_sp1.1 from training. Duration: 0.97275 2023-10-03 05:49:38,053 WARNING [train.py:1204] (2/4) Exclude cut with ID _3867_210_1883_1_1526641200575_4475830_319-349850-0_sp1.1 from training. Duration: 0.6090625 2023-10-03 05:49:38,064 WARNING [train.py:1204] (2/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_304-59227-0_sp0.9 from training. Duration: 0.8555625 2023-10-03 05:49:39,456 WARNING [train.py:1204] (2/4) Exclude cut with ID _8021_210_7994_1_1528376451358_3653069_100-140231-0_sp0.9 from training. Duration: 0.7444375 2023-10-03 05:49:39,519 WARNING [train.py:1204] (2/4) Exclude cut with ID _8833_210_5219_1_1528592665210_7553059_329-125156-0_sp0.9 from training. Duration: 0.911125 2023-10-03 05:49:48,459 WARNING [train.py:1204] (2/4) Exclude cut with ID _14459_210_2228_1_1531443641946_7516700_792-287234-0_sp1.1 from training. Duration: 0.92725 2023-10-03 05:49:48,476 WARNING [train.py:1204] (2/4) Exclude cut with ID _19708_210_9206_1_1532136650733_4238364_347-65240-0_sp1.1 from training. Duration: 0.8818125 2023-10-03 05:49:52,754 WARNING [train.py:1204] (2/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_70-27732-0 from training. Duration: 0.94 2023-10-03 05:49:59,194 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_397-132565-0 from training. Duration: 0.99 2023-10-03 05:49:59,197 WARNING [train.py:1204] (2/4) Exclude cut with ID _49728_210_12651_1_1534041034294_6788150_624-195301-0 from training. Duration: 0.85 2023-10-03 05:50:00,493 WARNING [train.py:1204] (2/4) Exclude cut with ID _55429_210_10626_1_1534467588527_7174143_377-169391-0_sp1.1 from training. Duration: 0.87275 2023-10-03 05:50:00,600 WARNING [train.py:1204] (2/4) Exclude cut with ID _49376_210_5986_1_1533985278732_3370740_354-344695-0_sp1.1 from training. Duration: 0.9 2023-10-03 05:50:08,039 WARNING [train.py:1204] (2/4) Exclude cut with ID _4697_210_2069_1_1526815801953_3823098_276-96593-0_sp1.1 from training. Duration: 0.7 2023-10-03 05:50:08,048 WARNING [train.py:1204] (2/4) Exclude cut with ID _15012_210_1968_1_1530856820429_5095350_688-178849-0_sp0.9 from training. Duration: 0.711125 2023-10-03 05:50:08,083 WARNING [train.py:1204] (2/4) Exclude cut with ID _25565_210_12392_1_1532423009382_3952098_372-69023-0_sp1.1 from training. Duration: 0.936375 2023-10-03 05:50:08,117 WARNING [train.py:1204] (2/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_61-327062-0_sp1.1 from training. Duration: 0.736375 2023-10-03 05:50:08,148 WARNING [train.py:1204] (2/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_215-141059-0 from training. Duration: 0.62 2023-10-03 05:50:12,854 WARNING [train.py:1204] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_6-226750-0_sp1.1 from training. Duration: 0.8545625 2023-10-03 05:50:14,180 WARNING [train.py:1204] (2/4) Exclude cut with ID _14348_210_8341_1_1530599434996_3778640_240-232370-0_sp1.1 from training. Duration: 0.763625 2023-10-03 05:50:18,290 WARNING [train.py:1204] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_468-189058-0 from training. Duration: 0.96 2023-10-03 05:50:24,046 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.conv_module2.balancer1.prob, batch_count=1159920.0, ans=0.125 2023-10-03 05:50:25,604 INFO [train.py:1046] (2/4) Epoch 33, batch 4000, loss[loss=0.1454, simple_loss=0.2198, pruned_loss=0.03547, over 24468.00 frames. ], tot_loss[loss=0.1604, simple_loss=0.2391, pruned_loss=0.04081, over 4712929.74 frames. ], batch size: 58, lr: 3.08e-03, grad_scale: 16.0 2023-10-03 05:50:28,509 WARNING [train.py:1204] (2/4) Exclude cut with ID _21987_210_5854_1_1531821500482_3637038_247-93340-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 05:50:35,294 WARNING [train.py:1204] (2/4) Exclude cut with ID _66868_210_9527_1_1535509287954_4561140_436-191602-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 05:50:42,722 WARNING [train.py:1204] (2/4) Exclude cut with ID _18895_210_6228_1_1531357274234_3784930_394-58203-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 05:50:42,769 WARNING [train.py:1204] (2/4) Exclude cut with ID _58605_210_12210_1_1534723195939_3904259_258-281498-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 05:50:42,802 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_33348_210_5632_1_1533086912_3324030_245-292752-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 05:50:44,687 WARNING [train.py:1204] (2/4) Exclude cut with ID _67831_210_12392_1_1535612464202_3092612_346-2148-0 from training. Duration: 0.93 2023-10-03 05:50:44,741 WARNING [train.py:1204] (2/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_369-222060-0_sp0.9 from training. Duration: 0.788875 2023-10-03 05:50:46,044 WARNING [train.py:1204] (2/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_245-174553-0 from training. Duration: 0.88 2023-10-03 05:50:46,050 WARNING [train.py:1204] (2/4) Exclude cut with ID _3541_210_2477_1_1526475408709_3263973_231-104736-0_sp1.1 from training. Duration: 0.7090625 2023-10-03 05:50:46,057 WARNING [train.py:1204] (2/4) Exclude cut with ID _44585_210_18628_1_1533605350387_4518199_280-73534-0 from training. Duration: 0.89 2023-10-03 05:50:46,450 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=1159986.6666666667, ans=0.1 2023-10-03 05:50:47,581 WARNING [train.py:1204] (2/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_22-201476-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 05:50:50,399 WARNING [train.py:1204] (2/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_325-62802-0_sp0.9 from training. Duration: 0.9666875 2023-10-03 05:50:50,418 WARNING [train.py:1204] (2/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_199-274062-0_sp1.1 from training. Duration: 0.97275 2023-10-03 05:50:50,421 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_8-319957-0_sp1.1 from training. Duration: 0.87275 2023-10-03 05:50:51,708 WARNING [train.py:1204] (2/4) Exclude cut with ID _29343_210_4381_1_1532602757752_3674109_224-349726-0_sp1.1 from training. Duration: 0.936375 2023-10-03 05:50:51,713 WARNING [train.py:1204] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_520-126729-0_sp1.1 from training. Duration: 0.636375 2023-10-03 05:50:53,124 WARNING [train.py:1204] (2/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_239-22076-0_sp1.1 from training. Duration: 0.77275 2023-10-03 05:50:55,751 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9012W0011-501146 from training. Duration: 0.896 2023-10-03 05:50:55,834 WARNING [train.py:1204] (2/4) Exclude cut with ID _29857_210_20034_1_1532395886753_3798160_279-185801-0_sp1.1 from training. Duration: 0.82725 2023-10-03 05:50:57,176 WARNING [train.py:1204] (2/4) Exclude cut with ID _43552_210_8913_1_1533726031059_3523429_261-223933-0_sp1.1 from training. Duration: 0.92725 2023-10-03 05:50:59,233 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9029W0025-509426 from training. Duration: 0.896 2023-10-03 05:50:59,296 WARNING [train.py:1204] (2/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_337-181502-0_sp1.1 from training. Duration: 0.5909375 2023-10-03 05:50:59,309 WARNING [train.py:1204] (2/4) Exclude cut with ID _68635_210_9317_1_1535770813410_4315870_159-332599-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 05:51:04,723 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_161-276996-0 from training. Duration: 0.95 2023-10-03 05:51:05,935 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7088_210_1874_1_1527939776_1101113_127-68870-0_sp1.1 from training. Duration: 0.936375 2023-10-03 05:51:07,936 WARNING [train.py:1204] (2/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_322-217220-0_sp1.1 from training. Duration: 0.7818125 2023-10-03 05:51:09,257 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9072W0002-530636 from training. Duration: 0.768 2023-10-03 05:51:10,576 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_340-336408-0_sp1.1 from training. Duration: 0.8454375 2023-10-03 05:51:11,945 WARNING [train.py:1204] (2/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_395-37222-0 from training. Duration: 0.76 2023-10-03 05:51:11,948 WARNING [train.py:1204] (2/4) Exclude cut with ID _64099_210_17301_1_1535526977656_1578188_159-209156-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 05:51:12,031 WARNING [train.py:1204] (2/4) Exclude cut with ID _53950_210_5816_1_1534417264919_3862951_28-29681-0_sp1.1 from training. Duration: 0.92725 2023-10-03 05:51:13,332 WARNING [train.py:1204] (2/4) Exclude cut with ID _4681_210_3525_1_1527250689464_120889_15-126760-0_sp1.1 from training. Duration: 0.736375 2023-10-03 05:51:14,782 WARNING [train.py:1204] (2/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_242-3472-0_sp1.1 from training. Duration: 0.663625 2023-10-03 05:51:14,813 WARNING [train.py:1204] (2/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_436-186311-0_sp0.9 from training. Duration: 0.72225 2023-10-03 05:51:16,606 WARNING [train.py:1204] (2/4) Exclude cut with ID _3675_210_4557_1_1526639934408_4182089_452-172147-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 05:51:16,864 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.conv_skip_rate, batch_count=1160120.0, ans=0.0 2023-10-03 05:51:18,701 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.2.encoder.layers.0.conv_module2.whiten, num_groups=1, num_channels=384, metric=10.96 vs. limit=15.0 2023-10-03 05:51:19,222 WARNING [train.py:1204] (2/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_80-147301-0 from training. Duration: 0.84 2023-10-03 05:51:19,253 WARNING [train.py:1204] (2/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_351-107588-0_sp1.1 from training. Duration: 0.92725 2023-10-03 05:51:20,689 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9111W0002-550781 from training. Duration: 0.896 2023-10-03 05:51:23,761 WARNING [train.py:1204] (2/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_465-156500-0_sp1.1 from training. Duration: 0.6545625 2023-10-03 05:51:26,623 WARNING [train.py:1204] (2/4) Exclude cut with ID _13938_210_9988_1_1530518718978_3693960_304-112784-0_sp1.1 from training. Duration: 0.4181875 2023-10-03 05:51:28,113 WARNING [train.py:1204] (2/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_202-67017-0_sp1.1 from training. Duration: 0.7454375 2023-10-03 05:51:30,079 WARNING [train.py:1204] (2/4) Exclude cut with ID _58414_210_8254_1_1535421609162_7151182_448-4358-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 05:51:30,157 WARNING [train.py:1204] (2/4) Exclude cut with ID _66840_210_5219_1_1535452168426_4196349_203-243234-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 05:51:32,779 WARNING [train.py:1204] (2/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_201-213513-0_sp1.1 from training. Duration: 0.97275 2023-10-03 05:51:37,438 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6291_210_3273_1_1527683649_1078498_117-187560-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 05:51:38,826 INFO [train.py:1046] (2/4) Epoch 33, batch 4050, loss[loss=0.1608, simple_loss=0.2529, pruned_loss=0.03429, over 24642.00 frames. ], tot_loss[loss=0.1614, simple_loss=0.2399, pruned_loss=0.04147, over 4701624.53 frames. ], batch size: 73, lr: 3.08e-03, grad_scale: 16.0 2023-10-03 05:51:38,914 WARNING [train.py:1204] (2/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_135-174080-0_sp0.9 from training. Duration: 0.588875 2023-10-03 05:51:40,216 WARNING [train.py:1204] (2/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_45-265684-0 from training. Duration: 0.72 2023-10-03 05:51:41,595 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_435-313305-0_sp1.1 from training. Duration: 0.6454375 2023-10-03 05:51:42,868 WARNING [train.py:1204] (2/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_235-307337-0_sp1.1 from training. Duration: 0.8545625 2023-10-03 05:51:42,941 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7762_210_1794_1_1528199508_4057408_419-125009-0_sp0.9 from training. Duration: 0.92225 2023-10-03 05:51:44,346 WARNING [train.py:1204] (2/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_215-141059-0_sp1.1 from training. Duration: 0.563625 2023-10-03 05:51:45,765 WARNING [train.py:1204] (2/4) Exclude cut with ID _27013_210_1389_1_1532437173240_3252649_222-125573-0_sp1.1 from training. Duration: 0.8909375 2023-10-03 05:51:49,099 WARNING [train.py:1204] (2/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_126-274694-0_sp1.1 from training. Duration: 0.8909375 2023-10-03 05:51:50,386 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.579e+02 1.813e+02 1.958e+02 2.207e+02 3.335e+02, threshold=3.917e+02, percent-clipped=0.0 2023-10-03 05:51:53,183 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_2592_210_2290_1_1525697025_4882925_157-70512-0_sp1.1 from training. Duration: 0.82725 2023-10-03 05:51:53,233 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_292-265928-0_sp0.9 from training. Duration: 0.5666875 2023-10-03 05:51:54,592 WARNING [train.py:1204] (2/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_1211-226066-0_sp1.1 from training. Duration: 0.6909375 2023-10-03 05:51:54,639 WARNING [train.py:1204] (2/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_394-265747-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 05:51:58,905 WARNING [train.py:1204] (2/4) Exclude cut with ID _48782_210_12658_1_1534467701544_2300422_239-67139-0_sp1.1 from training. Duration: 0.97275 2023-10-03 05:52:00,326 WARNING [train.py:1204] (2/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_329-113173-0_sp1.1 from training. Duration: 0.563625 2023-10-03 05:52:05,504 WARNING [train.py:1204] (2/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_81-75849-0_sp0.9 from training. Duration: 0.4333125 2023-10-03 05:52:07,491 WARNING [train.py:1204] (2/4) Exclude cut with ID _9235_210_5219_1_1528891154253_8046120_1003-101638-0 from training. Duration: 0.73 2023-10-03 05:52:07,527 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9309W0189-640316 from training. Duration: 0.64 2023-10-03 05:52:10,283 WARNING [train.py:1204] (2/4) Exclude cut with ID _45329_210_7005_1_1534157951211_6774339_503-287883-0_sp1.1 from training. Duration: 0.8 2023-10-03 05:52:11,857 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.self_attn_weights.pos_emb_skip_rate, batch_count=1160386.6666666667, ans=0.0 2023-10-03 05:52:17,089 WARNING [train.py:1204] (2/4) Exclude cut with ID _8833_210_5219_1_1528592665210_7553059_611-80313-0 from training. Duration: 0.64 2023-10-03 05:52:18,375 WARNING [train.py:1204] (2/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_335-106626-0_sp1.1 from training. Duration: 0.963625 2023-10-03 05:52:22,865 WARNING [train.py:1204] (2/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_237-44332-0_sp1.1 from training. Duration: 0.8545625 2023-10-03 05:52:25,601 WARNING [train.py:1204] (2/4) Exclude cut with ID _46952_210_5986_1_1534230010357_3107379_178-147341-0_sp1.1 from training. Duration: 0.97275 2023-10-03 05:52:25,633 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_13091_210_7925_1_1530609447_8987354_946-154426-0_sp1.1 from training. Duration: 0.936375 2023-10-03 05:52:25,656 WARNING [train.py:1204] (2/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_326-17061-0_sp1.1 from training. Duration: 0.8545625 2023-10-03 05:52:29,735 WARNING [train.py:1204] (2/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_38-169920-0_sp1.1 from training. Duration: 0.82725 2023-10-03 05:52:32,444 WARNING [train.py:1204] (2/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_283-220372-0 from training. Duration: 0.56 2023-10-03 05:52:33,825 WARNING [train.py:1204] (2/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_252-216674-0_sp1.1 from training. Duration: 0.4909375 2023-10-03 05:52:35,241 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_202-91290-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 05:52:35,348 WARNING [train.py:1204] (2/4) Exclude cut with ID _41314_210_12651_1_1533459476543_3766460_80-74184-0 from training. Duration: 0.83 2023-10-03 05:52:40,271 WARNING [train.py:1204] (2/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_411-271358-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 05:52:46,333 WARNING [train.py:1204] (2/4) Exclude cut with ID _68777_210_4353_1_1535810390731_3117904_129-166012-0 from training. Duration: 0.85 2023-10-03 05:52:46,412 WARNING [train.py:1204] (2/4) Exclude cut with ID _44982_210_1511_1_1534312820108_3726029_410-109759-0_sp1.1 from training. Duration: 0.963625 2023-10-03 05:52:46,418 WARNING [train.py:1204] (2/4) Exclude cut with ID _14224_210_4381_1_1530529062037_4264010_364-22817-0_sp1.1 from training. Duration: 0.6454375 2023-10-03 05:52:49,077 WARNING [train.py:1204] (2/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_89-242335-0 from training. Duration: 0.95 2023-10-03 05:52:49,097 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_12868_210_6753_1_1530702479_3610564_151-325626-0 from training. Duration: 0.97 2023-10-03 05:52:49,098 WARNING [train.py:1204] (2/4) Exclude cut with ID _6275_210_2084_1_1528110062213_7669049_786-264332-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 05:52:51,149 WARNING [train.py:1204] (2/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_35-153886-0_sp1.1 from training. Duration: 0.763625 2023-10-03 05:52:52,500 INFO [train.py:1046] (2/4) Epoch 33, batch 4100, loss[loss=0.1421, simple_loss=0.2225, pruned_loss=0.03084, over 21184.00 frames. ], tot_loss[loss=0.1611, simple_loss=0.2401, pruned_loss=0.04105, over 4701722.83 frames. ], batch size: 46, lr: 3.08e-03, grad_scale: 16.0 2023-10-03 05:52:52,592 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_24007_210_1794_1_1531918925_1663875_116-100916-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 05:52:52,614 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_162-262717-0_sp1.1 from training. Duration: 0.7545625 2023-10-03 05:52:58,319 WARNING [train.py:1204] (2/4) Exclude cut with ID _8263_210_1664_1_1528438953494_294100_40-204731-0 from training. Duration: 0.5 2023-10-03 05:52:59,711 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_12868_210_6753_1_1530702479_3610564_416-293746-0 from training. Duration: 0.9 2023-10-03 05:53:02,479 WARNING [train.py:1204] (2/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_605-219732-0 from training. Duration: 0.94 2023-10-03 05:53:02,566 WARNING [train.py:1204] (2/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_454-123002-0 from training. Duration: 0.69 2023-10-03 05:53:02,571 WARNING [train.py:1204] (2/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_25-304273-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 05:53:03,047 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.4.encoder.layers.2.feed_forward2.out_whiten, num_groups=1, num_channels=384, metric=10.80 vs. limit=15.0 2023-10-03 05:53:04,393 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6860_210_4852_1_1527917457_3833327_451-39702-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 05:53:05,722 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_304-16505-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 05:53:05,742 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6291_210_3273_1_1527683649_1078498_4-266348-0_sp1.1 from training. Duration: 0.7090625 2023-10-03 05:53:07,718 WARNING [train.py:1204] (2/4) Exclude cut with ID ID0149W0001-753701 from training. Duration: 0.989 2023-10-03 05:53:09,101 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_41251_210_15368_1_1533429767_4820695_394-55393-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 05:53:11,127 WARNING [train.py:1204] (2/4) Exclude cut with ID _8793_210_6310_1_1528542985139_2310009_121-179782-0_sp1.1 from training. Duration: 0.72725 2023-10-03 05:53:11,141 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_2729_210_3528_1_1525863505_3882214_421-73047-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 05:53:12,408 WARNING [train.py:1204] (2/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_278-154492-0_sp0.9 from training. Duration: 0.8444375 2023-10-03 05:53:15,177 WARNING [train.py:1204] (2/4) Exclude cut with ID _14170_210_5219_1_1530525949868_4469650_412-317077-0_sp0.9 from training. Duration: 0.8333125 2023-10-03 05:53:16,528 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_727-138317-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 05:53:17,864 WARNING [train.py:1204] (2/4) Exclude cut with ID _25808_210_13292_1_1532343514562_7366109_345-335048-0_sp1.1 from training. Duration: 0.92725 2023-10-03 05:53:17,879 WARNING [train.py:1204] (2/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_506-23200-0 from training. Duration: 0.97 2023-10-03 05:53:19,236 WARNING [train.py:1204] (2/4) Exclude cut with ID _44718_210_5816_1_1534050051869_6469030_643-245187-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 05:53:19,241 WARNING [train.py:1204] (2/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_42-186162-0_sp1.1 from training. Duration: 0.836375 2023-10-03 05:53:19,256 WARNING [train.py:1204] (2/4) Exclude cut with ID _3231_210_2106_1_1526205864934_6784220_171-256004-0_sp1.1 from training. Duration: 0.7818125 2023-10-03 05:53:19,275 WARNING [train.py:1204] (2/4) Exclude cut with ID _25914_210_6400_1_1532141317058_644740_43-248478-0_sp1.1 from training. Duration: 0.936375 2023-10-03 05:53:20,602 WARNING [train.py:1204] (2/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_577-287150-0 from training. Duration: 0.82 2023-10-03 05:53:23,858 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_46015_210_7234_1_1533863319_3087837_376-276068-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 05:53:25,211 WARNING [train.py:1204] (2/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_51-200220-0 from training. Duration: 0.56 2023-10-03 05:53:26,581 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_452-96101-0_sp1.1 from training. Duration: 0.963625 2023-10-03 05:53:29,389 WARNING [train.py:1204] (2/4) Exclude cut with ID _13252_210_1925_1_1530671372047_3336909_263-168388-0_sp1.1 from training. Duration: 0.7818125 2023-10-03 05:53:29,390 WARNING [train.py:1204] (2/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_469-94673-0 from training. Duration: 0.79 2023-10-03 05:53:30,815 WARNING [train.py:1204] (2/4) Exclude cut with ID _14922_210_6228_1_1530707435273_3693310_303-228738-0_sp1.1 from training. Duration: 0.763625 2023-10-03 05:53:32,163 WARNING [train.py:1204] (2/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_262-250826-0_sp1.1 from training. Duration: 0.9 2023-10-03 05:53:32,196 WARNING [train.py:1204] (2/4) Exclude cut with ID _68777_210_4353_1_1535810390731_3117904_129-166012-0_sp1.1 from training. Duration: 0.77275 2023-10-03 05:53:34,920 WARNING [train.py:1204] (2/4) Exclude cut with ID _58895_210_8750_1_1535184081320_3649089_265-323810-0 from training. Duration: 0.96 2023-10-03 05:53:36,413 WARNING [train.py:1204] (2/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_120-76843-0_sp0.9 from training. Duration: 0.611125 2023-10-03 05:53:36,471 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7544_210_6753_1_1528587722_3576686_295-258479-0_sp0.9 from training. Duration: 0.9333125 2023-10-03 05:53:40,209 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7046_210_6753_1_1527895337_6920917_783-278109-0 from training. Duration: 0.77 2023-10-03 05:53:40,247 WARNING [train.py:1204] (2/4) Exclude cut with ID _42326_210_7770_1_1533814282531_3707269_113-308323-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 05:53:41,598 WARNING [train.py:1204] (2/4) Exclude cut with ID _7852_210_5219_1_1528286471316_8139400_242-105336-0_sp0.9 from training. Duration: 0.8 2023-10-03 05:53:44,825 WARNING [train.py:1204] (2/4) Exclude cut with ID _56235_210_10640_1_1534557335235_4161099_114-277905-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 05:53:48,914 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_13118_210_7925_1_1530755612_7608297_42-66683-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 05:53:50,472 WARNING [train.py:1204] (2/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_901-79438-0_sp1.1 from training. Duration: 0.8545625 2023-10-03 05:53:51,785 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_30280_210_1881_1_1532872779_3660139_211-182194-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 05:53:57,983 WARNING [train.py:1204] (2/4) Exclude cut with ID _14898_210_9206_1_1530766812377_4177190_621-45732-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 05:53:57,990 WARNING [train.py:1204] (2/4) Exclude cut with ID _35350_210_16040_1_1533040191183_4075299_224-133775-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 05:54:02,052 WARNING [train.py:1204] (2/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_147-293311-0_sp1.1 from training. Duration: 0.8545625 2023-10-03 05:54:04,949 WARNING [train.py:1204] (2/4) Exclude cut with ID _12302_210_5219_1_1529924446466_8107280_835-279754-0_sp1.1 from training. Duration: 0.8181875 2023-10-03 05:54:06,927 INFO [train.py:1046] (2/4) Epoch 33, batch 4150, loss[loss=0.1519, simple_loss=0.2375, pruned_loss=0.03319, over 24449.00 frames. ], tot_loss[loss=0.1624, simple_loss=0.241, pruned_loss=0.04194, over 4685849.41 frames. ], batch size: 66, lr: 3.08e-03, grad_scale: 16.0 2023-10-03 05:54:07,088 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8453_210_4211_1_1528502940_8562730_347-330673-0_sp0.9 from training. Duration: 0.8 2023-10-03 05:54:07,273 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.bypass_mid.scale_min, batch_count=1160920.0, ans=0.2 2023-10-03 05:54:09,108 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_213-103282-0_sp0.9 from training. Duration: 0.8666875 2023-10-03 05:54:11,098 WARNING [train.py:1204] (2/4) Exclude cut with ID _49728_210_12651_1_1534041034294_6788150_66-43496-0_sp1.1 from training. Duration: 0.97275 2023-10-03 05:54:11,107 WARNING [train.py:1204] (2/4) Exclude cut with ID _19023_210_4353_1_1531378892552_5820780_28-36615-0_sp1.1 from training. Duration: 0.8818125 2023-10-03 05:54:14,936 WARNING [train.py:1204] (2/4) Exclude cut with ID _14349_210_8341_1_1530615710818_3442470_115-285975-0 from training. Duration: 0.86 2023-10-03 05:54:14,967 WARNING [train.py:1204] (2/4) Exclude cut with ID _12734_210_8341_1_1530183614296_3891929_236-47464-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 05:54:15,031 WARNING [train.py:1204] (2/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_177-236809-0 from training. Duration: 0.49 2023-10-03 05:54:16,439 WARNING [train.py:1204] (2/4) Exclude cut with ID _13678_210_7497_1_1530530069226_3463170_371-3221-0 from training. Duration: 0.9 2023-10-03 05:54:16,464 WARNING [train.py:1204] (2/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_48-338976-0 from training. Duration: 0.89 2023-10-03 05:54:17,793 WARNING [train.py:1204] (2/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_1167-309744-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 05:54:20,402 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.491e+02 1.882e+02 2.112e+02 2.535e+02 4.235e+02, threshold=4.223e+02, percent-clipped=3.0 2023-10-03 05:54:21,876 WARNING [train.py:1204] (2/4) Exclude cut with ID _30327_210_16049_1_1532484161295_3924169_401-70819-0_sp0.9 from training. Duration: 0.9555625 2023-10-03 05:54:21,884 WARNING [train.py:1204] (2/4) Exclude cut with ID _55134_210_21982_1_1534590028377_3882170_346-91022-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 05:54:25,085 WARNING [train.py:1204] (2/4) Exclude cut with ID _14459_210_2228_1_1531443641946_7516700_174-118801-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 05:54:25,149 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_269-278006-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 05:54:26,455 WARNING [train.py:1204] (2/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_390-339137-0_sp0.9 from training. Duration: 0.62225 2023-10-03 05:54:27,934 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.1.bypass_mid.scale_min, batch_count=1160986.6666666667, ans=0.2 2023-10-03 05:54:29,102 WARNING [train.py:1204] (2/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_253-308377-0_sp1.1 from training. Duration: 0.4454375 2023-10-03 05:54:29,115 WARNING [train.py:1204] (2/4) Exclude cut with ID _23825_210_13449_1_1531915313526_3497150_240-271367-0_sp1.1 from training. Duration: 0.8818125 2023-10-03 05:54:30,562 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1015-343884-0_sp0.9 from training. Duration: 0.688875 2023-10-03 05:54:34,756 WARNING [train.py:1204] (2/4) Exclude cut with ID _23514_210_1389_1_1531832139785_3031439_335-48210-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 05:54:36,873 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.bypass_mid.scale_min, batch_count=1161053.3333333333, ans=0.2 2023-10-03 05:54:38,091 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_309-65177-0_sp1.1 from training. Duration: 0.8 2023-10-03 05:54:39,384 WARNING [train.py:1204] (2/4) Exclude cut with ID _14368_210_4220_1_1530532714870_3595529_231-137892-0 from training. Duration: 0.99 2023-10-03 05:54:41,288 WARNING [train.py:1204] (2/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_57-270037-0 from training. Duration: 0.77 2023-10-03 05:54:41,292 WARNING [train.py:1204] (2/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_465-214726-0_sp1.1 from training. Duration: 0.7818125 2023-10-03 05:54:43,254 WARNING [train.py:1204] (2/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_106-300985-0 from training. Duration: 0.9 2023-10-03 05:54:43,267 WARNING [train.py:1204] (2/4) Exclude cut with ID _14632_210_9988_1_1530691279392_3923200_161-129530-0_sp1.1 from training. Duration: 0.87275 2023-10-03 05:54:43,279 WARNING [train.py:1204] (2/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_514-88370-0_sp1.1 from training. Duration: 0.963625 2023-10-03 05:54:46,066 WARNING [train.py:1204] (2/4) Exclude cut with ID _56495_210_4234_1_1534748080334_4136219_305-334505-0_sp1.1 from training. Duration: 0.92725 2023-10-03 05:54:47,358 WARNING [train.py:1204] (2/4) Exclude cut with ID _42875_210_14856_1_1533704397487_4012433_61-264914-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 05:54:50,122 WARNING [train.py:1204] (2/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_91-148831-0 from training. Duration: 0.99 2023-10-03 05:54:53,407 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_435-313305-0_sp0.9 from training. Duration: 0.788875 2023-10-03 05:54:54,824 WARNING [train.py:1204] (2/4) Exclude cut with ID _16744_210_5986_1_1531724405384_3907567_242-111316-0_sp1.1 from training. Duration: 0.8454375 2023-10-03 05:54:56,254 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_257-20733-0 from training. Duration: 0.76 2023-10-03 05:54:57,583 WARNING [train.py:1204] (2/4) Exclude cut with ID _8064_210_5399_1_1528372299291_4282973_4-91037-0_sp1.1 from training. Duration: 0.8 2023-10-03 05:54:57,664 WARNING [train.py:1204] (2/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_3-276701-0 from training. Duration: 0.91 2023-10-03 05:55:00,331 WARNING [train.py:1204] (2/4) Exclude cut with ID _3992_210_1747_1_1526644816031_3731060_36-147855-0_sp1.1 from training. Duration: 0.7090625 2023-10-03 05:55:00,404 WARNING [train.py:1204] (2/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_1296-298671-0_sp1.1 from training. Duration: 0.963625 2023-10-03 05:55:01,837 WARNING [train.py:1204] (2/4) Exclude cut with ID _68965_210_1883_1_1535698962272_3497490_309-334593-0_sp1.1 from training. Duration: 0.92725 2023-10-03 05:55:03,221 WARNING [train.py:1204] (2/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_128-296581-0 from training. Duration: 0.74 2023-10-03 05:55:03,222 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_17613_210_2151_1_1531137569_2366160_170-299079-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 05:55:03,224 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6287_210_3273_1_1527509506_2653716_182-199887-0_sp1.1 from training. Duration: 0.463625 2023-10-03 05:55:05,182 WARNING [train.py:1204] (2/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_80-135200-0_sp1.1 from training. Duration: 0.7181875 2023-10-03 05:55:08,016 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.conv_module1.balancer2.prob, batch_count=1161186.6666666667, ans=0.125 2023-10-03 05:55:09,133 WARNING [train.py:1204] (2/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_34-183685-0 from training. Duration: 0.98 2023-10-03 05:55:09,156 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8278_210_2550_1_1528637099_4220413_418-210155-0_sp1.1 from training. Duration: 0.92725 2023-10-03 05:55:09,160 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8853_210_3273_1_1528633899_818968_51-187685-0_sp0.9 from training. Duration: 0.7666875 2023-10-03 05:55:11,104 WARNING [train.py:1204] (2/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_332-221057-0_sp1.1 from training. Duration: 0.6181875 2023-10-03 05:55:11,179 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_2221_210_1988_1_1525597051_4935541_415-296882-0 from training. Duration: 0.77 2023-10-03 05:55:11,199 WARNING [train.py:1204] (2/4) Exclude cut with ID _33640_210_2655_1_1533117605479_4855469_770-278185-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 05:55:11,240 WARNING [train.py:1204] (2/4) Exclude cut with ID _5287_210_2084_1_1527418883040_3878670_333-48508-0_sp1.1 from training. Duration: 0.5454375 2023-10-03 05:55:13,254 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_43802_210_2290_1_1533516203_8829504_142-276839-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 05:55:14,646 WARNING [train.py:1204] (2/4) Exclude cut with ID _28394_210_8913_1_1532430049518_3650118_126-139371-0_sp1.1 from training. Duration: 0.92725 2023-10-03 05:55:14,674 WARNING [train.py:1204] (2/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_76-203867-0 from training. Duration: 0.72 2023-10-03 05:55:14,711 WARNING [train.py:1204] (2/4) Exclude cut with ID _6194_210_1534_1_1527511826160_3941194_14-282876-0_sp0.9 from training. Duration: 0.788875 2023-10-03 05:55:19,016 WARNING [train.py:1204] (2/4) Exclude cut with ID _35355_210_16040_1_1533212988977_4016229_74-190076-0_sp1.1 from training. Duration: 0.72725 2023-10-03 05:55:21,534 INFO [train.py:1046] (2/4) Epoch 33, batch 4200, loss[loss=0.1557, simple_loss=0.2375, pruned_loss=0.03698, over 24673.00 frames. ], tot_loss[loss=0.1617, simple_loss=0.24, pruned_loss=0.04164, over 4702799.02 frames. ], batch size: 65, lr: 3.08e-03, grad_scale: 8.0 2023-10-03 05:55:21,624 WARNING [train.py:1204] (2/4) Exclude cut with ID _33920_210_4220_1_1533257851500_3084399_198-334063-0 from training. Duration: 0.94 2023-10-03 05:55:23,528 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6291_210_3273_1_1527683649_1078498_4-266348-0_sp0.9 from training. Duration: 0.8666875 2023-10-03 05:55:24,962 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_23856_210_4238_1_1531994785_3087934_122-38075-0_sp1.1 from training. Duration: 0.936375 2023-10-03 05:55:26,384 WARNING [train.py:1204] (2/4) Exclude cut with ID _4697_210_2069_1_1526815801953_3823098_276-96593-0_sp0.9 from training. Duration: 0.8555625 2023-10-03 05:55:26,441 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_308-278734-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 05:55:26,444 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_5190_210_4557_1_1527245078_2965646_353-250603-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 05:55:29,212 WARNING [train.py:1204] (2/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_472-216663-0 from training. Duration: 0.92 2023-10-03 05:55:32,093 WARNING [train.py:1204] (2/4) Exclude cut with ID _7537_210_2084_1_1528502395431_3924359_14-312906-0 from training. Duration: 0.84 2023-10-03 05:55:32,261 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.nonlin_attention.balancer.prob, batch_count=1161253.3333333333, ans=0.125 2023-10-03 05:55:33,368 WARNING [train.py:1204] (2/4) Exclude cut with ID _29003_210_19596_1_1532507391720_3894098_559-236660-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 05:55:34,745 WARNING [train.py:1204] (2/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_312-204798-0_sp1.1 from training. Duration: 0.8454375 2023-10-03 05:55:37,977 WARNING [train.py:1204] (2/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_17-309070-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 05:55:39,752 WARNING [train.py:1204] (2/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_344-152526-0_sp0.9 from training. Duration: 0.588875 2023-10-03 05:55:43,168 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_41452_210_7291_1_1533437687_3941281_405-11559-0_sp1.1 from training. Duration: 0.82725 2023-10-03 05:55:43,191 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_48730_210_5399_1_1533885886_2212769_157-124052-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 05:55:43,251 WARNING [train.py:1204] (2/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_415-78792-0 from training. Duration: 0.81 2023-10-03 05:55:43,255 WARNING [train.py:1204] (2/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_345-340690-0_sp1.1 from training. Duration: 0.8454375 2023-10-03 05:55:44,661 WARNING [train.py:1204] (2/4) Exclude cut with ID _23825_210_13449_1_1531915313526_3497150_124-8138-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 05:55:46,047 WARNING [train.py:1204] (2/4) Exclude cut with ID _49258_210_7994_1_1534122017863_3738861_194-24864-0_sp1.1 from training. Duration: 0.936375 2023-10-03 05:55:46,068 WARNING [train.py:1204] (2/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_369-222060-0_sp1.1 from training. Duration: 0.6454375 2023-10-03 05:55:47,403 WARNING [train.py:1204] (2/4) Exclude cut with ID _12369_210_4381_1_1530356078581_4388520_480-13146-0_sp0.9 from training. Duration: 0.9444375 2023-10-03 05:55:50,274 WARNING [train.py:1204] (2/4) Exclude cut with ID _13253_210_1925_1_1530757775819_3577873_191-144977-0 from training. Duration: 0.78 2023-10-03 05:55:50,300 WARNING [train.py:1204] (2/4) Exclude cut with ID _30304_210_6319_1_1532476827261_7582060_1128-140266-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 05:55:56,355 WARNING [train.py:1204] (2/4) Exclude cut with ID _8772_210_1850_1_1528635595972_3664011_247-336448-0_sp0.9 from training. Duration: 0.82225 2023-10-03 05:55:56,438 WARNING [train.py:1204] (2/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_318-59097-0_sp1.1 from training. Duration: 0.6909375 2023-10-03 05:55:59,124 WARNING [train.py:1204] (2/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_540-131164-0_sp1.1 from training. Duration: 0.836375 2023-10-03 05:56:00,521 WARNING [train.py:1204] (2/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_39-110836-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 05:56:03,271 WARNING [train.py:1204] (2/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_860-96198-0_sp1.1 from training. Duration: 0.663625 2023-10-03 05:56:03,287 WARNING [train.py:1204] (2/4) Exclude cut with ID _15133_210_6228_1_1530794512074_3080379_29-264458-0 from training. Duration: 0.58 2023-10-03 05:56:03,308 WARNING [train.py:1204] (2/4) Exclude cut with ID _55209_210_1531_1_1534675828996_4597160_797-348343-0_sp1.1 from training. Duration: 0.963625 2023-10-03 05:56:03,621 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.out_combiner.scale_min, batch_count=1161386.6666666667, ans=0.2 2023-10-03 05:56:04,749 WARNING [train.py:1204] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_23-189234-0_sp1.1 from training. Duration: 0.7545625 2023-10-03 05:56:09,468 WARNING [train.py:1204] (2/4) Exclude cut with ID _5410_210_1388_1_1527397055795_1719020_39-112221-0_sp1.1 from training. Duration: 0.62725 2023-10-03 05:56:09,723 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.conv_skip_rate, batch_count=1161453.3333333333, ans=0.0 2023-10-03 05:56:10,810 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_43802_210_2290_1_1533516203_8829504_100-348441-0_sp1.1 from training. Duration: 0.82725 2023-10-03 05:56:11,047 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.feed_forward1.hidden_balancer.prob, batch_count=1161453.3333333333, ans=0.125 2023-10-03 05:56:17,614 WARNING [train.py:1204] (2/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_143-86344-0_sp1.1 from training. Duration: 0.7 2023-10-03 05:56:20,332 WARNING [train.py:1204] (2/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_126-29104-0 from training. Duration: 0.85 2023-10-03 05:56:21,853 WARNING [train.py:1204] (2/4) Exclude cut with ID _39846_210_12651_1_1533603651992_1318030_138-31771-0_sp1.1 from training. Duration: 0.963625 2023-10-03 05:56:26,585 WARNING [train.py:1204] (2/4) Exclude cut with ID _2220_210_1819_1_1525509109898_3658680_50-99837-0_sp1.1 from training. Duration: 0.4909375 2023-10-03 05:56:27,880 WARNING [train.py:1204] (2/4) Exclude cut with ID _6274_210_2084_1_1527937583837_7494020_836-210550-0_sp1.1 from training. Duration: 0.97275 2023-10-03 05:56:29,309 WARNING [train.py:1204] (2/4) Exclude cut with ID _28323_210_16187_1_1532480395667_4100300_306-74561-0 from training. Duration: 0.94 2023-10-03 05:56:33,437 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_177-58005-0_sp1.1 from training. Duration: 0.563625 2023-10-03 05:56:36,460 INFO [train.py:1046] (2/4) Epoch 33, batch 4250, loss[loss=0.1516, simple_loss=0.2309, pruned_loss=0.03617, over 23565.00 frames. ], tot_loss[loss=0.1606, simple_loss=0.2385, pruned_loss=0.04131, over 4684903.98 frames. ], batch size: 256, lr: 3.08e-03, grad_scale: 8.0 2023-10-03 05:56:37,943 WARNING [train.py:1204] (2/4) Exclude cut with ID _54871_210_22356_1_1534665798253_3856359_19-342632-0_sp1.1 from training. Duration: 0.82725 2023-10-03 05:56:37,958 WARNING [train.py:1204] (2/4) Exclude cut with ID _35355_210_16040_1_1533212988977_4016229_74-190076-0_sp0.9 from training. Duration: 0.888875 2023-10-03 05:56:40,606 WARNING [train.py:1204] (2/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_285-97786-0_sp1.1 from training. Duration: 0.97275 2023-10-03 05:56:44,039 WARNING [train.py:1204] (2/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_337-181502-0_sp0.9 from training. Duration: 0.72225 2023-10-03 05:56:45,412 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_353-182855-0 from training. Duration: 0.96 2023-10-03 05:56:45,453 WARNING [train.py:1204] (2/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_409-327730-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 05:56:48,123 WARNING [train.py:1204] (2/4) Exclude cut with ID _8030_210_7497_1_1528444886781_3531089_373-19403-0_sp1.1 from training. Duration: 0.97275 2023-10-03 05:56:49,884 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.543e+02 1.839e+02 2.000e+02 2.260e+02 3.065e+02, threshold=4.000e+02, percent-clipped=0.0 2023-10-03 05:56:51,432 WARNING [train.py:1204] (2/4) Exclude cut with ID _6789_210_1534_1_1527857937218_3118950_231-340237-0_sp1.1 from training. Duration: 0.936375 2023-10-03 05:56:51,600 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.1.bypass.skip_rate, batch_count=1161653.3333333333, ans=0.035 2023-10-03 05:56:56,093 WARNING [train.py:1204] (2/4) Exclude cut with ID _6493_210_1747_1_1528459210424_4012470_536-57832-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 05:56:56,105 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7046_210_6753_1_1527895337_6920917_756-116002-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 05:56:57,533 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_37152_210_9697_1_1533106573_3844188_29-239070-0_sp1.1 from training. Duration: 0.8909375 2023-10-03 05:56:57,535 WARNING [train.py:1204] (2/4) Exclude cut with ID _19023_210_4353_1_1531378892552_5820780_61-334539-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 05:56:58,958 WARNING [train.py:1204] (2/4) Exclude cut with ID _14170_210_5219_1_1530525949868_4469650_159-28488-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 05:57:00,414 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_40896_210_15710_1_1533433956_7100802_764-71272-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 05:57:01,685 WARNING [train.py:1204] (2/4) Exclude cut with ID _44902_210_16187_1_1533888006062_3792078_258-178274-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 05:57:04,330 WARNING [train.py:1204] (2/4) Exclude cut with ID _7537_210_2084_1_1528502395431_3924359_14-312906-0_sp1.1 from training. Duration: 0.763625 2023-10-03 05:57:04,417 WARNING [train.py:1204] (2/4) Exclude cut with ID _49643_210_7989_1_1534640244224_3939031_674-116639-0_sp1.1 from training. Duration: 0.963625 2023-10-03 05:57:06,171 WARNING [train.py:1204] (2/4) Exclude cut with ID _22826_210_9994_1_1531908201522_3279169_415-161-0 from training. Duration: 0.98 2023-10-03 05:57:08,949 WARNING [train.py:1204] (2/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_264-348896-0 from training. Duration: 0.85 2023-10-03 05:57:10,263 WARNING [train.py:1204] (2/4) Exclude cut with ID _51899_210_20034_1_1534129254466_3669220_233-161492-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 05:57:12,036 WARNING [train.py:1204] (2/4) Exclude cut with ID _31255_210_13427_1_1532865619495_3815143_233-200023-0_sp1.1 from training. Duration: 0.936375 2023-10-03 05:57:12,061 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_23850_210_4238_1_1532232597_1632433_100-229879-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 05:57:13,512 WARNING [train.py:1204] (2/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_375-245677-0_sp0.9 from training. Duration: 0.9 2023-10-03 05:57:13,516 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_40223_210_6228_1_1533298404_4812267_355-317415-0_sp1.1 from training. Duration: 0.97275 2023-10-03 05:57:13,561 WARNING [train.py:1204] (2/4) Exclude cut with ID _9679_210_7497_1_1529129212906_4363929_235-81488-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 05:57:17,627 WARNING [train.py:1204] (2/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_172-215932-0_sp1.1 from training. Duration: 0.6 2023-10-03 05:57:17,696 WARNING [train.py:1204] (2/4) Exclude cut with ID _12369_210_4381_1_1530356078581_4388520_64-298917-0_sp1.1 from training. Duration: 0.8 2023-10-03 05:57:23,982 WARNING [train.py:1204] (2/4) Exclude cut with ID _38020_210_5986_1_1533112252259_7006089_131-53533-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 05:57:24,103 WARNING [train.py:1204] (2/4) Exclude cut with ID _42334_210_7770_1_1534206610396_3605880_392-48154-0_sp1.1 from training. Duration: 0.963625 2023-10-03 05:57:25,443 WARNING [train.py:1204] (2/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_337-181502-0 from training. Duration: 0.65 2023-10-03 05:57:25,458 WARNING [train.py:1204] (2/4) Exclude cut with ID _6267_210_2084_1_1527764658526_3894730_375-219887-0_sp0.9 from training. Duration: 0.7444375 2023-10-03 05:57:26,822 WARNING [train.py:1204] (2/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_60-146189-0 from training. Duration: 0.98 2023-10-03 05:57:28,176 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_3133_210_2087_1_1526106446_2324690_243-143119-0_sp1.1 from training. Duration: 0.7 2023-10-03 05:57:29,765 WARNING [train.py:1204] (2/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_1267-272693-0_sp0.9 from training. Duration: 0.811125 2023-10-03 05:57:31,156 WARNING [train.py:1204] (2/4) Exclude cut with ID _69884_210_9771_1_1535846392818_4166169_83-182645-0_sp1.1 from training. Duration: 0.97275 2023-10-03 05:57:31,180 WARNING [train.py:1204] (2/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_403-292260-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 05:57:33,907 WARNING [train.py:1204] (2/4) Exclude cut with ID _55429_210_10626_1_1534467588527_7174143_377-169391-0 from training. Duration: 0.96 2023-10-03 05:57:34,021 WARNING [train.py:1204] (2/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_494-199538-0_sp1.1 from training. Duration: 0.6818125 2023-10-03 05:57:35,340 WARNING [train.py:1204] (2/4) Exclude cut with ID _3067_210_2667_1_1526212974640_4503168_119-175776-0_sp1.1 from training. Duration: 0.67275 2023-10-03 05:57:38,292 WARNING [train.py:1204] (2/4) Exclude cut with ID _3231_210_2106_1_1526205864934_6784220_183-303839-0_sp1.1 from training. Duration: 0.97275 2023-10-03 05:57:41,390 WARNING [train.py:1204] (2/4) Exclude cut with ID _18513_210_9206_1_1531618215223_3862620_165-38958-0_sp1.1 from training. Duration: 0.963625 2023-10-03 05:57:42,873 WARNING [train.py:1204] (2/4) Exclude cut with ID _41314_210_12651_1_1533459476543_3766460_87-272068-0_sp1.1 from training. Duration: 0.7545625 2023-10-03 05:57:44,646 WARNING [train.py:1204] (2/4) Exclude cut with ID _3541_210_2477_1_1526475408709_3263973_353-95-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 05:57:46,068 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_2009_210_2071_1_1525481134_4588937_73-302813-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 05:57:47,604 WARNING [train.py:1204] (2/4) Exclude cut with ID _64220_210_9527_1_1535250151772_4875259_88-95011-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 05:57:47,658 WARNING [train.py:1204] (2/4) Exclude cut with ID _44902_210_16187_1_1533888006062_3792078_160-12107-0_sp1.1 from training. Duration: 0.92725 2023-10-03 05:57:47,664 WARNING [train.py:1204] (2/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_547-5424-0 from training. Duration: 0.94 2023-10-03 05:57:47,913 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.feed_forward1.hidden_balancer.prob, batch_count=1161853.3333333333, ans=0.125 2023-10-03 05:57:49,544 WARNING [train.py:1204] (2/4) Exclude cut with ID _21783_210_11357_1_1531742458730_3762679_268-85656-0_sp1.1 from training. Duration: 0.936375 2023-10-03 05:57:49,742 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.conv_skip_rate, batch_count=1161920.0, ans=0.0 2023-10-03 05:57:50,779 INFO [train.py:1046] (2/4) Epoch 33, batch 4300, loss[loss=0.1717, simple_loss=0.2549, pruned_loss=0.0442, over 23203.00 frames. ], tot_loss[loss=0.1603, simple_loss=0.2385, pruned_loss=0.04107, over 4703511.22 frames. ], batch size: 105, lr: 3.08e-03, grad_scale: 8.0 2023-10-03 05:57:51,304 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.5.encoder.layers.1.whiten, num_groups=1, num_channels=256, metric=2.02 vs. limit=12.0 2023-10-03 05:57:54,274 WARNING [train.py:1204] (2/4) Exclude cut with ID _66210_210_12651_1_1535446812922_3365850_42-284985-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 05:57:54,298 WARNING [train.py:1204] (2/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_465-214726-0_sp0.9 from training. Duration: 0.9555625 2023-10-03 05:57:55,786 WARNING [train.py:1204] (2/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_50-95770-0_sp1.1 from training. Duration: 0.936375 2023-10-03 05:57:56,517 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.3.encoder.layers.0.self_attn_weights.whiten_keys, num_groups=8, num_channels=256, metric=4.63 vs. limit=6.0 2023-10-03 05:57:57,392 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=1161920.0, ans=0.1 2023-10-03 05:58:04,121 WARNING [train.py:1204] (2/4) Exclude cut with ID _30800_210_22382_1_1532499347904_4515959_269-77567-0_sp1.1 from training. Duration: 0.963625 2023-10-03 05:58:04,123 WARNING [train.py:1204] (2/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_7-111954-0 from training. Duration: 0.56 2023-10-03 05:58:05,532 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_43802_210_2290_1_1533516203_8829504_85-195240-0_sp0.9 from training. Duration: 0.9666875 2023-10-03 05:58:06,930 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_2822_210_2715_1_1526112586_1999587_165-36656-0_sp0.9 from training. Duration: 0.988875 2023-10-03 05:58:06,952 WARNING [train.py:1204] (2/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_181-78632-0_sp1.1 from training. Duration: 0.7090625 2023-10-03 05:58:06,971 WARNING [train.py:1204] (2/4) Exclude cut with ID IC0671W0044-331887 from training. Duration: 0.896 2023-10-03 05:58:10,241 WARNING [train.py:1204] (2/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_266-137104-0_sp1.1 from training. Duration: 0.5909375 2023-10-03 05:58:11,720 WARNING [train.py:1204] (2/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_137-116442-0_sp0.9 from training. Duration: 0.8666875 2023-10-03 05:58:14,769 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_23801_210_16223_1_1532066392_3294657_570-5439-0 from training. Duration: 0.97 2023-10-03 05:58:14,780 WARNING [train.py:1204] (2/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_29-280622-0_sp1.1 from training. Duration: 0.7454375 2023-10-03 05:58:14,806 WARNING [train.py:1204] (2/4) Exclude cut with ID 3557-8342-0013-54691-0 from training. Duration: 0.92 2023-10-03 05:58:17,932 WARNING [train.py:1204] (2/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_711-15688-0_sp1.1 from training. Duration: 0.3909375 2023-10-03 05:58:19,361 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_15126_210_6753_1_1530778677_2591185_120-277707-0_sp1.1 from training. Duration: 0.72725 2023-10-03 05:58:22,484 WARNING [train.py:1204] (2/4) Exclude cut with ID _14970_210_15709_1_1530775547361_1966965_8-169924-0_sp1.1 from training. Duration: 0.836375 2023-10-03 05:58:22,485 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_4692_210_3528_1_1526899755_4323254_107-44401-0_sp0.9 from training. Duration: 0.9555625 2023-10-03 05:58:23,974 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_3475_210_2028_1_1526193578_3622569_265-276625-0_sp0.9 from training. Duration: 0.8444375 2023-10-03 05:58:25,470 WARNING [train.py:1204] (2/4) Exclude cut with ID _55153_210_2253_1_1534384786677_3162096_148-79022-0_sp1.1 from training. Duration: 0.87275 2023-10-03 05:58:26,749 WARNING [train.py:1204] (2/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_292-36336-0_sp1.1 from training. Duration: 0.92725 2023-10-03 05:58:26,758 WARNING [train.py:1204] (2/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_329-82235-0 from training. Duration: 0.82 2023-10-03 05:58:28,129 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_11305_210_1794_1_1529750428_7178148_257-312870-0 from training. Duration: 0.88 2023-10-03 05:58:30,876 WARNING [train.py:1204] (2/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_436-4493-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 05:58:33,593 WARNING [train.py:1204] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_321-54129-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 05:58:33,600 WARNING [train.py:1204] (2/4) Exclude cut with ID _5287_210_2084_1_1527418883040_3878670_333-48508-0_sp0.9 from training. Duration: 0.6666875 2023-10-03 05:58:33,623 WARNING [train.py:1204] (2/4) Exclude cut with ID _26528_210_9070_1_1532570410842_3851310_244-278158-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 05:58:33,648 WARNING [train.py:1204] (2/4) Exclude cut with ID _46952_210_5986_1_1534230010357_3107379_232-344503-0_sp1.1 from training. Duration: 0.87275 2023-10-03 05:58:33,659 WARNING [train.py:1204] (2/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_43-206757-0 from training. Duration: 0.75 2023-10-03 05:58:33,668 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_4183_210_2087_1_1526729100_1869838_141-166600-0 from training. Duration: 0.95 2023-10-03 05:58:33,711 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_296-287678-0 from training. Duration: 0.95 2023-10-03 05:58:35,063 WARNING [train.py:1204] (2/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_265-60343-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 05:58:35,100 WARNING [train.py:1204] (2/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_332-256168-0 from training. Duration: 0.75 2023-10-03 05:58:36,367 WARNING [train.py:1204] (2/4) Exclude cut with ID _13679_210_7497_1_1530534293662_3355098_411-186088-0 from training. Duration: 0.94 2023-10-03 05:58:40,867 WARNING [train.py:1204] (2/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_458-126779-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 05:58:42,274 WARNING [train.py:1204] (2/4) Exclude cut with ID IC0803W0380-397572 from training. Duration: 0.032 2023-10-03 05:58:43,627 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_278-272612-0_sp1.1 from training. Duration: 0.663625 2023-10-03 05:58:45,439 WARNING [train.py:1204] (2/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_491-263101-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 05:58:45,444 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_53555_210_5632_1_1534595220_7232297_334-336778-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 05:58:47,004 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_50-3289-0 from training. Duration: 0.91 2023-10-03 05:58:48,874 WARNING [train.py:1204] (2/4) Exclude cut with ID _8699_210_2069_1_1528630346831_3960409_155-39766-0_sp0.9 from training. Duration: 0.8666875 2023-10-03 05:58:48,878 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_502-82145-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 05:58:50,205 WARNING [train.py:1204] (2/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_550-43132-0_sp1.1 from training. Duration: 0.97275 2023-10-03 05:58:50,225 WARNING [train.py:1204] (2/4) Exclude cut with ID _14998_210_5219_1_1530705582080_7474970_446-112117-0_sp1.1 from training. Duration: 0.8181875 2023-10-03 05:58:50,284 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_3691_210_3528_1_1526468169_4062053_355-334432-0_sp1.1 from training. Duration: 0.7909375 2023-10-03 05:58:53,604 WARNING [train.py:1204] (2/4) Exclude cut with ID _58605_210_12210_1_1534723195939_3904259_239-94720-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 05:58:53,913 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.ff2_skip_rate, batch_count=1162186.6666666667, ans=0.0 2023-10-03 05:58:56,326 WARNING [train.py:1204] (2/4) Exclude cut with ID _49376_210_5986_1_1533985278732_3370740_391-344072-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 05:58:56,405 WARNING [train.py:1204] (2/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_558-322014-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 05:58:56,433 WARNING [train.py:1204] (2/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_256-32662-0_sp1.1 from training. Duration: 0.8181875 2023-10-03 05:59:02,116 WARNING [train.py:1204] (2/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_572-185150-0 from training. Duration: 0.97 2023-10-03 05:59:02,248 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.balancer1.prob, batch_count=1162186.6666666667, ans=0.125 2023-10-03 05:59:03,387 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_89-35748-0_sp0.9 from training. Duration: 0.87775 2023-10-03 05:59:03,634 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.ff2_skip_rate, batch_count=1162253.3333333333, ans=0.0 2023-10-03 05:59:04,716 INFO [train.py:1046] (2/4) Epoch 33, batch 4350, loss[loss=0.1515, simple_loss=0.2347, pruned_loss=0.0341, over 24431.00 frames. ], tot_loss[loss=0.161, simple_loss=0.2393, pruned_loss=0.04137, over 4706856.38 frames. ], batch size: 69, lr: 3.08e-03, grad_scale: 8.0 2023-10-03 05:59:06,246 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6291_210_3273_1_1527683649_1078498_88-66747-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 05:59:07,789 WARNING [train.py:1204] (2/4) Exclude cut with ID _19504_210_9994_1_1531648863242_3325220_357-237029-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 05:59:10,845 WARNING [train.py:1204] (2/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_302-2550-0_sp1.1 from training. Duration: 0.736375 2023-10-03 05:59:10,845 WARNING [train.py:1204] (2/4) Exclude cut with ID _9461_210_4557_1_1529060514726_3677451_59-194017-0_sp1.1 from training. Duration: 0.963625 2023-10-03 05:59:15,622 WARNING [train.py:1204] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_665-196155-0_sp1.1 from training. Duration: 0.6090625 2023-10-03 05:59:18,255 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.618e+02 1.918e+02 2.252e+02 2.552e+02 4.017e+02, threshold=4.505e+02, percent-clipped=1.0 2023-10-03 05:59:19,726 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_326-315007-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 05:59:19,792 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder_embed.convnext.layerdrop_rate, batch_count=1162320.0, ans=0.015 2023-10-03 05:59:23,059 WARNING [train.py:1204] (2/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_105-328174-0_sp1.1 from training. Duration: 0.6909375 2023-10-03 05:59:23,067 WARNING [train.py:1204] (2/4) Exclude cut with ID _56494_210_4220_1_1535284837625_3033169_106-263722-0_sp1.1 from training. Duration: 0.97275 2023-10-03 05:59:25,933 WARNING [train.py:1204] (2/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_770-200430-0_sp0.9 from training. Duration: 0.988875 2023-10-03 05:59:27,362 WARNING [train.py:1204] (2/4) Exclude cut with ID _65640_210_6310_1_1535368201445_3173250_209-13008-0_sp1.1 from training. Duration: 0.9 2023-10-03 05:59:28,773 WARNING [train.py:1204] (2/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_212-149947-0_sp0.9 from training. Duration: 0.811125 2023-10-03 05:59:33,166 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.feed_forward1.hidden_balancer.prob, batch_count=1162386.6666666667, ans=0.125 2023-10-03 05:59:34,237 WARNING [train.py:1204] (2/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_188-171673-0 from training. Duration: 0.58 2023-10-03 05:59:34,310 WARNING [train.py:1204] (2/4) Exclude cut with ID _58895_210_8750_1_1535184081320_3649089_286-281724-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 05:59:35,655 WARNING [train.py:1204] (2/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_16-254909-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 05:59:38,550 WARNING [train.py:1204] (2/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_52-95655-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 05:59:43,075 WARNING [train.py:1204] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_554-45617-0 from training. Duration: 0.99 2023-10-03 05:59:46,515 WARNING [train.py:1204] (2/4) Exclude cut with ID _14683_210_5219_1_1530698352608_3974859_473-344611-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 05:59:47,850 WARNING [train.py:1204] (2/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_17-25045-0_sp0.9 from training. Duration: 0.6555625 2023-10-03 05:59:52,024 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9072W0003-530637 from training. Duration: 0.768 2023-10-03 05:59:52,123 WARNING [train.py:1204] (2/4) Exclude cut with ID _38670_210_15379_1_1533448810659_4391059_76-74025-0_sp1.1 from training. Duration: 0.936375 2023-10-03 05:59:52,162 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6291_210_3273_1_1527683649_1078498_74-292608-0_sp0.9 from training. Duration: 0.8 2023-10-03 05:59:54,027 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9084W0025-536862 from training. Duration: 0.896 2023-10-03 05:59:54,088 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9087W0021-538392 from training. Duration: 0.768 2023-10-03 05:59:54,093 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7543_210_6753_1_1528543323_645515_56-163234-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 05:59:54,241 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.conv_module1.balancer1.prob, batch_count=1162453.3333333333, ans=0.125 2023-10-03 05:59:55,363 WARNING [train.py:1204] (2/4) Exclude cut with ID _45560_210_21982_1_1533812603951_3584999_44-332643-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 05:59:55,433 WARNING [train.py:1204] (2/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_1285-256622-0_sp1.1 from training. Duration: 0.763625 2023-10-03 05:59:56,739 WARNING [train.py:1204] (2/4) Exclude cut with ID _52781_210_16187_1_1534297729099_1118149_141-325632-0_sp1.1 from training. Duration: 0.936375 2023-10-03 05:59:57,452 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.3.encoder.layers.2.nonlin_attention.whiten1, num_groups=1, num_channels=384, metric=4.24 vs. limit=10.0 2023-10-03 05:59:58,095 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_40896_210_15710_1_1533433956_7100802_1051-137631-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 05:59:58,142 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_13091_210_7925_1_1530609447_8987354_233-18333-0_sp1.1 from training. Duration: 0.97275 2023-10-03 06:00:02,142 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_34008_210_10431_1_1533345424_8183247_662-317331-0 from training. Duration: 0.94 2023-10-03 06:00:02,148 WARNING [train.py:1204] (2/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_291-248920-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 06:00:02,150 WARNING [train.py:1204] (2/4) Exclude cut with ID _65263_210_22356_1_1535763598691_3921037_274-288481-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 06:00:02,170 WARNING [train.py:1204] (2/4) Exclude cut with ID _5189_210_4557_1_1527248319428_3769229_233-168080-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 06:00:02,239 WARNING [train.py:1204] (2/4) Exclude cut with ID _54871_210_22356_1_1534665798253_3856359_135-184281-0 from training. Duration: 0.99 2023-10-03 06:00:04,978 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9123W0012-555972 from training. Duration: 0.896 2023-10-03 06:00:04,984 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9123W0118-556077 from training. Duration: 0.896 2023-10-03 06:00:05,002 WARNING [train.py:1204] (2/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_406-275649-0 from training. Duration: 0.42 2023-10-03 06:00:07,859 WARNING [train.py:1204] (2/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_309-13684-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 06:00:07,881 WARNING [train.py:1204] (2/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_129-337802-0_sp1.1 from training. Duration: 0.6545625 2023-10-03 06:00:07,907 WARNING [train.py:1204] (2/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_649-266825-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 06:00:09,210 WARNING [train.py:1204] (2/4) Exclude cut with ID _40519_210_13794_1_1533715228655_7337390_895-224742-0_sp1.1 from training. Duration: 0.92725 2023-10-03 06:00:10,621 WARNING [train.py:1204] (2/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_234-14174-0 from training. Duration: 0.82 2023-10-03 06:00:13,346 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9156W0009-572502 from training. Duration: 0.768 2023-10-03 06:00:13,353 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_424-245529-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 06:00:17,213 WARNING [train.py:1204] (2/4) Exclude cut with ID _26528_210_9070_1_1532570410842_3851310_232-10149-0_sp1.1 from training. Duration: 0.963625 2023-10-03 06:00:17,223 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_17292_210_6753_1_1531125388_2943734_152-30410-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 06:00:18,434 INFO [train.py:1046] (2/4) Epoch 33, batch 4400, loss[loss=0.1641, simple_loss=0.2526, pruned_loss=0.03782, over 24652.00 frames. ], tot_loss[loss=0.1615, simple_loss=0.2403, pruned_loss=0.04137, over 4718962.74 frames. ], batch size: 73, lr: 3.08e-03, grad_scale: 16.0 2023-10-03 06:00:19,914 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_35441_210_16554_1_1533347572_4092680_291-82075-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 06:00:21,957 WARNING [train.py:1204] (2/4) Exclude cut with ID _12369_210_4381_1_1530356078581_4388520_64-298917-0 from training. Duration: 0.88 2023-10-03 06:00:23,133 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_5618_210_1874_1_1527311008_5851_1-214501-0_sp0.9 from training. Duration: 0.7655625 2023-10-03 06:00:23,179 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_9460_210_4211_1_1528954587_10049667_572-37035-0 from training. Duration: 0.8 2023-10-03 06:00:23,206 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9193W0023-590322 from training. Duration: 0.896 2023-10-03 06:00:24,474 WARNING [train.py:1204] (2/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_206-10097-0_sp1.1 from training. Duration: 0.4454375 2023-10-03 06:00:24,502 WARNING [train.py:1204] (2/4) Exclude cut with ID _55134_210_21982_1_1534590028377_3882170_17-51731-0_sp1.1 from training. Duration: 0.963625 2023-10-03 06:00:27,800 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_14953_210_2151_1_1530701710_5616165_710-165400-0 from training. Duration: 0.87 2023-10-03 06:00:29,162 WARNING [train.py:1204] (2/4) Exclude cut with ID _53203_210_2087_1_1534246218309_475590_61-85777-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 06:00:30,485 WARNING [train.py:1204] (2/4) Exclude cut with ID _55844_210_2346_1_1534471212569_7625707_1050-10453-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 06:00:30,507 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9218W0013-603297 from training. Duration: 0.896 2023-10-03 06:00:34,578 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_20120_210_6753_1_1531643035_5482859_267-102818-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 06:00:34,579 WARNING [train.py:1204] (2/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_191-41023-0 from training. Duration: 0.71 2023-10-03 06:00:36,122 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9231W0202-610227 from training. Duration: 0.896 2023-10-03 06:00:39,040 WARNING [train.py:1204] (2/4) Exclude cut with ID _6537_210_1883_1_1527987559710_3597129_382-133062-0 from training. Duration: 0.68 2023-10-03 06:00:39,276 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.conv_module2.balancer2.prob, batch_count=1162653.3333333333, ans=0.125 2023-10-03 06:00:40,383 WARNING [train.py:1204] (2/4) Exclude cut with ID _28427_210_4220_1_1532581240967_3061069_43-84928-0 from training. Duration: 0.99 2023-10-03 06:00:41,754 WARNING [train.py:1204] (2/4) Exclude cut with ID _59256_210_5986_1_1535076085997_3589120_121-237386-0 from training. Duration: 0.96 2023-10-03 06:00:41,784 WARNING [train.py:1204] (2/4) Exclude cut with ID _8480_210_1874_1_1528589391205_6728330_137-256986-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 06:00:41,870 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_9417_210_1794_1_1529235261_4614131_479-346963-0_sp1.1 from training. Duration: 0.936375 2023-10-03 06:00:42,093 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.balancer2.prob, batch_count=1162653.3333333333, ans=0.125 2023-10-03 06:00:43,261 WARNING [train.py:1204] (2/4) Exclude cut with ID _26474_210_9570_1_1532520022699_3623380_267-7362-0_sp1.1 from training. Duration: 0.936375 2023-10-03 06:00:43,340 WARNING [train.py:1204] (2/4) Exclude cut with ID _55328_210_27767_1_1534580973260_4088170_287-61272-0_sp1.1 from training. Duration: 0.97275 2023-10-03 06:00:46,633 WARNING [train.py:1204] (2/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_589-9070-0 from training. Duration: 0.71 2023-10-03 06:00:46,640 WARNING [train.py:1204] (2/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_148-158509-0_sp1.1 from training. Duration: 0.37275 2023-10-03 06:00:46,705 WARNING [train.py:1204] (2/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_164-184548-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 06:00:48,106 WARNING [train.py:1204] (2/4) Exclude cut with ID _14970_210_15709_1_1530775547361_1966965_6-23328-0_sp1.1 from training. Duration: 0.7818125 2023-10-03 06:00:48,112 WARNING [train.py:1204] (2/4) Exclude cut with ID _29303_210_2575_1_1532392237303_7410649_612-165069-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 06:00:50,753 WARNING [train.py:1204] (2/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_383-321224-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 06:00:50,779 WARNING [train.py:1204] (2/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_283-160525-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 06:00:50,800 WARNING [train.py:1204] (2/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_505-97987-0 from training. Duration: 0.86 2023-10-03 06:00:50,862 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9309W0190-640317 from training. Duration: 0.768 2023-10-03 06:00:56,857 WARNING [train.py:1204] (2/4) Exclude cut with ID _50093_210_4220_1_1534417141207_3432299_280-297390-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 06:01:02,065 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_17292_210_6753_1_1531125388_2943734_320-71385-0_sp1.1 from training. Duration: 0.97275 2023-10-03 06:01:04,747 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_213-103282-0 from training. Duration: 0.78 2023-10-03 06:01:07,438 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7615_210_4211_1_1528110965_4580160_72-235211-0_sp0.9 from training. Duration: 0.8444375 2023-10-03 06:01:08,949 WARNING [train.py:1204] (2/4) Exclude cut with ID _14867_210_1968_1_1530684179890_3846969_173-320977-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 06:01:11,052 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.1.encoder.layers.1.self_attn1.whiten, num_groups=1, num_channels=256, metric=10.95 vs. limit=22.5 2023-10-03 06:01:11,625 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7046_210_6753_1_1527895337_6920917_783-278109-0_sp0.9 from training. Duration: 0.8555625 2023-10-03 06:01:11,674 WARNING [train.py:1204] (2/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_90-30673-0 from training. Duration: 0.84 2023-10-03 06:01:11,687 WARNING [train.py:1204] (2/4) Exclude cut with ID _22826_210_9994_1_1531908201522_3279169_415-161-0_sp1.1 from training. Duration: 0.8909375 2023-10-03 06:01:11,697 WARNING [train.py:1204] (2/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_86-66192-0_sp0.9 from training. Duration: 0.611125 2023-10-03 06:01:11,699 WARNING [train.py:1204] (2/4) Exclude cut with ID _4626_210_3532_1_1526817652717_3916859_446-206656-0_sp1.1 from training. Duration: 0.6909375 2023-10-03 06:01:13,029 WARNING [train.py:1204] (2/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_166-128941-0_sp0.9 from training. Duration: 0.6 2023-10-03 06:01:18,974 WARNING [train.py:1204] (2/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_345-167986-0 from training. Duration: 0.84 2023-10-03 06:01:20,962 WARNING [train.py:1204] (2/4) Exclude cut with ID _6275_210_2084_1_1528110062213_7669049_973-198085-0 from training. Duration: 0.75 2023-10-03 06:01:21,032 WARNING [train.py:1204] (2/4) Exclude cut with ID _60057_210_10640_1_1534903065793_3333150_171-181133-0 from training. Duration: 0.88 2023-10-03 06:01:21,045 WARNING [train.py:1204] (2/4) Exclude cut with ID _19504_210_9994_1_1531648863242_3325220_336-196513-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 06:01:21,052 WARNING [train.py:1204] (2/4) Exclude cut with ID _6493_210_1747_1_1528459210424_4012470_513-140967-0 from training. Duration: 0.79 2023-10-03 06:01:21,125 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6673_210_3273_1_1527767790_3173240_327-306185-0_sp0.9 from training. Duration: 0.92225 2023-10-03 06:01:21,293 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.conv_skip_rate, batch_count=1162853.3333333333, ans=0.0 2023-10-03 06:01:26,575 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1032-236863-0_sp1.1 from training. Duration: 0.763625 2023-10-03 06:01:28,491 WARNING [train.py:1204] (2/4) Exclude cut with ID _14867_210_1968_1_1530684179890_3846969_413-29292-0 from training. Duration: 0.9 2023-10-03 06:01:30,563 WARNING [train.py:1204] (2/4) Exclude cut with ID _47251_210_18628_1_1533814063128_4137250_186-343322-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 06:01:33,217 INFO [train.py:1046] (2/4) Epoch 33, batch 4450, loss[loss=0.1677, simple_loss=0.2582, pruned_loss=0.03864, over 24359.00 frames. ], tot_loss[loss=0.1625, simple_loss=0.2414, pruned_loss=0.0418, over 4715995.05 frames. ], batch size: 77, lr: 3.08e-03, grad_scale: 8.0 2023-10-03 06:01:33,286 WARNING [train.py:1204] (2/4) Exclude cut with ID _56235_210_10640_1_1534557335235_4161099_624-46091-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 06:01:33,337 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_791-197891-0_sp0.9 from training. Duration: 0.9333125 2023-10-03 06:01:37,616 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.attention_skip_rate, batch_count=1162920.0, ans=0.0 2023-10-03 06:01:40,252 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_50050_210_16223_1_1535435796_7290138_786-288457-0_sp1.1 from training. Duration: 0.92725 2023-10-03 06:01:40,429 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.bypass.scale_min, batch_count=1162920.0, ans=0.2 2023-10-03 06:01:41,551 WARNING [train.py:1204] (2/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_213-227283-0_sp1.1 from training. Duration: 0.963625 2023-10-03 06:01:43,150 WARNING [train.py:1204] (2/4) Exclude cut with ID _42659_210_2070_1_1533607442765_3120380_58-341121-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 06:01:45,004 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.4.encoder.layers.2.self_attn_weights.whiten_keys, num_groups=4, num_channels=128, metric=1.68 vs. limit=6.0 2023-10-03 06:01:46,556 WARNING [train.py:1204] (2/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_506-23200-0_sp1.1 from training. Duration: 0.8818125 2023-10-03 06:01:47,588 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.493e+02 1.835e+02 1.974e+02 2.226e+02 3.952e+02, threshold=3.948e+02, percent-clipped=0.0 2023-10-03 06:01:49,118 WARNING [train.py:1204] (2/4) Exclude cut with ID _14170_210_5219_1_1530525949868_4469650_336-213530-0_sp1.1 from training. Duration: 0.8181875 2023-10-03 06:01:51,056 WARNING [train.py:1204] (2/4) Exclude cut with ID _64135_210_2253_1_1535677267876_7608972_180-5312-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 06:01:52,402 WARNING [train.py:1204] (2/4) Exclude cut with ID _12302_210_5219_1_1529924446466_8107280_835-279754-0 from training. Duration: 0.9 2023-10-03 06:01:52,404 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_20120_210_6753_1_1531643035_5482859_510-96996-0_sp1.1 from training. Duration: 0.7909375 2023-10-03 06:01:53,725 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_15517_210_6753_1_1530954008_3487336_216-187834-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 06:01:53,759 WARNING [train.py:1204] (2/4) Exclude cut with ID _19504_210_9994_1_1531648863242_3325220_414-113427-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 06:01:53,761 WARNING [train.py:1204] (2/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_264-108685-0_sp0.9 from training. Duration: 0.611125 2023-10-03 06:01:55,188 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_5438_210_1794_1_1527336687_2600009_104-288274-0_sp0.9 from training. Duration: 0.6444375 2023-10-03 06:01:59,884 WARNING [train.py:1204] (2/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_520-39496-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 06:01:59,928 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_37818_210_4238_1_1533620910_4720083_315-308565-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 06:02:01,297 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_2009_210_2071_1_1525481134_4588937_385-302100-0_sp1.1 from training. Duration: 0.7909375 2023-10-03 06:02:01,342 WARNING [train.py:1204] (2/4) Exclude cut with ID _58895_210_8750_1_1535184081320_3649089_277-53219-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 06:02:02,661 WARNING [train.py:1204] (2/4) Exclude cut with ID _26664_210_7770_1_1532258943280_3418270_121-111258-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 06:02:07,388 WARNING [train.py:1204] (2/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_99-167392-0_sp1.1 from training. Duration: 0.5545625 2023-10-03 06:02:08,740 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1113-7369-0 from training. Duration: 0.85 2023-10-03 06:02:08,756 WARNING [train.py:1204] (2/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_352-59958-0 from training. Duration: 0.86 2023-10-03 06:02:08,760 WARNING [train.py:1204] (2/4) Exclude cut with ID _14886_210_4220_1_1530702020355_3516190_159-73748-0_sp1.1 from training. Duration: 0.7545625 2023-10-03 06:02:11,569 WARNING [train.py:1204] (2/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_131-119569-0_sp1.1 from training. Duration: 0.92725 2023-10-03 06:02:12,924 WARNING [train.py:1204] (2/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_216-343007-0 from training. Duration: 0.66 2023-10-03 06:02:15,025 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.2.encoder.layers.1.self_attn1.whiten, num_groups=1, num_channels=384, metric=19.63 vs. limit=22.5 2023-10-03 06:02:16,989 WARNING [train.py:1204] (2/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_113-92608-0_sp0.9 from training. Duration: 0.6 2023-10-03 06:02:19,071 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.feed_forward2.hidden_balancer.prob, batch_count=1163120.0, ans=0.125 2023-10-03 06:02:22,274 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_40047_210_2290_1_1533290519_2374311_132-123271-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 06:02:22,339 WARNING [train.py:1204] (2/4) Exclude cut with ID _8793_210_6310_1_1528542985139_2310009_121-179782-0 from training. Duration: 0.8 2023-10-03 06:02:22,354 WARNING [train.py:1204] (2/4) Exclude cut with ID _43997_210_4525_1_1533538632555_5189329_283-218639-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 06:02:22,357 WARNING [train.py:1204] (2/4) Exclude cut with ID _49376_210_5986_1_1533985278732_3370740_297-78397-0_sp1.1 from training. Duration: 0.936375 2023-10-03 06:02:22,372 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_3475_210_2028_1_1526193578_3622569_377-166133-0_sp0.9 from training. Duration: 0.9555625 2023-10-03 06:02:22,386 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_520-213149-0_sp1.1 from training. Duration: 0.92725 2023-10-03 06:02:25,103 WARNING [train.py:1204] (2/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_567-144483-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 06:02:28,359 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_4183_210_2087_1_1526729100_1869838_143-280962-0_sp0.9 from training. Duration: 0.62225 2023-10-03 06:02:29,678 WARNING [train.py:1204] (2/4) Exclude cut with ID _49643_210_7989_1_1534640244224_3939031_611-185515-0 from training. Duration: 0.94 2023-10-03 06:02:31,158 WARNING [train.py:1204] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_88-341718-0_sp0.9 from training. Duration: 0.7333125 2023-10-03 06:02:32,639 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_51225_210_3601_1_1534400634_2327383_49-235441-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 06:02:35,411 WARNING [train.py:1204] (2/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_240-294400-0_sp1.1 from training. Duration: 0.936375 2023-10-03 06:02:36,782 WARNING [train.py:1204] (2/4) Exclude cut with ID _55429_210_10626_1_1534467588527_7174143_640-304825-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 06:02:36,802 WARNING [train.py:1204] (2/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_187-221566-0_sp1.1 from training. Duration: 0.3909375 2023-10-03 06:02:40,174 WARNING [train.py:1204] (2/4) Exclude cut with ID _7606_210_4915_1_1528894253277_5080690_127-177761-0_sp0.9 from training. Duration: 0.711125 2023-10-03 06:02:43,040 WARNING [train.py:1204] (2/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_430-116139-0 from training. Duration: 0.85 2023-10-03 06:02:44,402 WARNING [train.py:1204] (2/4) Exclude cut with ID _3675_210_4557_1_1526639934408_4182089_456-18305-0_sp0.9 from training. Duration: 0.8444375 2023-10-03 06:02:47,072 INFO [train.py:1046] (2/4) Epoch 33, batch 4500, loss[loss=0.1606, simple_loss=0.2367, pruned_loss=0.04226, over 23579.00 frames. ], tot_loss[loss=0.1625, simple_loss=0.2409, pruned_loss=0.04201, over 4709604.41 frames. ], batch size: 134, lr: 3.07e-03, grad_scale: 8.0 2023-10-03 06:02:48,602 WARNING [train.py:1204] (2/4) Exclude cut with ID _18513_210_9206_1_1531618215223_3862620_402-5957-0_sp1.1 from training. Duration: 0.9 2023-10-03 06:02:50,049 WARNING [train.py:1204] (2/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_465-156500-0 from training. Duration: 0.72 2023-10-03 06:02:50,050 WARNING [train.py:1204] (2/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_714-310538-0 from training. Duration: 0.86 2023-10-03 06:02:53,469 WARNING [train.py:1204] (2/4) Exclude cut with ID _39411_210_5986_1_1533538825030_3343649_335-301187-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 06:02:58,194 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_41750_210_1960_1_1533520678_3948042_241-158616-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 06:02:58,233 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_46013_210_7234_1_1533776362_3744032_50-141535-0_sp1.1 from training. Duration: 0.9 2023-10-03 06:03:00,092 WARNING [train.py:1204] (2/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_43-206757-0_sp0.9 from training. Duration: 0.8333125 2023-10-03 06:03:00,146 WARNING [train.py:1204] (2/4) Exclude cut with ID _22961_210_8254_1_1531976366672_4278180_172-276296-0_sp1.1 from training. Duration: 0.97275 2023-10-03 06:03:01,464 WARNING [train.py:1204] (2/4) Exclude cut with ID _17956_210_4353_1_1531209568125_6979970_1305-301903-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 06:03:01,506 WARNING [train.py:1204] (2/4) Exclude cut with ID _28323_210_16187_1_1532480395667_4100300_604-44507-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 06:03:13,158 WARNING [train.py:1204] (2/4) Exclude cut with ID _23514_210_1389_1_1531832139785_3031439_88-337544-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 06:03:13,226 WARNING [train.py:1204] (2/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_932-3672-0_sp1.1 from training. Duration: 0.963625 2023-10-03 06:03:16,012 WARNING [train.py:1204] (2/4) Exclude cut with ID _46502_210_13794_1_1533724452198_3657159_207-109991-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 06:03:16,070 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_216-73841-0_sp1.1 from training. Duration: 0.87275 2023-10-03 06:03:18,558 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8453_210_4211_1_1528502940_8562730_347-330673-0_sp1.1 from training. Duration: 0.6545625 2023-10-03 06:03:23,377 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_4183_210_2087_1_1526729100_1869838_143-280962-0_sp1.1 from training. Duration: 0.5090625 2023-10-03 06:03:28,048 WARNING [train.py:1204] (2/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_325-113798-0_sp1.1 from training. Duration: 0.77275 2023-10-03 06:03:32,790 WARNING [train.py:1204] (2/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_91-212486-0_sp0.9 from training. Duration: 0.7555625 2023-10-03 06:03:36,912 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8776_210_2040_1_1528638837_4088036_244-278763-0_sp1.1 from training. Duration: 0.8181875 2023-10-03 06:03:36,951 WARNING [train.py:1204] (2/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_106-286130-0 from training. Duration: 0.98 2023-10-03 06:03:38,318 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_2746_210_1988_1_1525872816_3795627_372-205861-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 06:03:39,686 WARNING [train.py:1204] (2/4) Exclude cut with ID _30800_210_22382_1_1532499347904_4515959_240-54646-0_sp1.1 from training. Duration: 0.936375 2023-10-03 06:03:39,830 WARNING [train.py:1204] (2/4) Exclude cut with ID _59500_210_8254_1_1534809610680_3736009_162-180695-0_sp1.1 from training. Duration: 0.936375 2023-10-03 06:03:41,172 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7544_210_6753_1_1528591327_1471426_146-93357-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 06:03:42,723 WARNING [train.py:1204] (2/4) Exclude cut with ID _42657_210_2070_1_1533520654016_3327008_405-87432-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 06:03:42,756 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_4528_210_3273_1_1527076654_3276777_209-219804-0 from training. Duration: 0.69 2023-10-03 06:03:42,757 WARNING [train.py:1204] (2/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_78-182912-0_sp0.9 from training. Duration: 0.6555625 2023-10-03 06:03:42,768 WARNING [train.py:1204] (2/4) Exclude cut with ID _31255_210_13427_1_1532865619495_3815143_322-114998-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 06:03:44,205 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.nonlin_attention.balancer.prob, batch_count=1163453.3333333333, ans=0.125 2023-10-03 06:03:50,015 WARNING [train.py:1204] (2/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_848-280957-0_sp0.9 from training. Duration: 0.9666875 2023-10-03 06:03:50,043 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7696_210_2040_1_1528207167_3716099_117-125434-0_sp0.9 from training. Duration: 0.8444375 2023-10-03 06:03:51,538 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_1002-109192-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 06:03:54,645 WARNING [train.py:1204] (2/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_138-64210-0_sp0.9 from training. Duration: 0.811125 2023-10-03 06:03:54,660 WARNING [train.py:1204] (2/4) Exclude cut with ID _25808_210_13292_1_1532343514562_7366109_633-47524-0_sp1.1 from training. Duration: 0.9 2023-10-03 06:03:54,780 WARNING [train.py:1204] (2/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_297-5233-0 from training. Duration: 0.95 2023-10-03 06:03:57,559 WARNING [train.py:1204] (2/4) Exclude cut with ID _8394_210_6702_1_1528590504316_3540189_335-274780-0 from training. Duration: 0.78 2023-10-03 06:03:57,564 WARNING [train.py:1204] (2/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_301-330455-0 from training. Duration: 0.8 2023-10-03 06:04:01,002 WARNING [train.py:1204] (2/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_383-205833-0 from training. Duration: 0.95 2023-10-03 06:04:02,233 INFO [train.py:1046] (2/4) Epoch 33, batch 4550, loss[loss=0.1519, simple_loss=0.2137, pruned_loss=0.0451, over 23554.00 frames. ], tot_loss[loss=0.1613, simple_loss=0.2394, pruned_loss=0.04157, over 4703714.39 frames. ], batch size: 256, lr: 3.07e-03, grad_scale: 8.0 2023-10-03 06:04:05,597 WARNING [train.py:1204] (2/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_48-182155-0 from training. Duration: 0.87 2023-10-03 06:04:05,677 WARNING [train.py:1204] (2/4) Exclude cut with ID _14832_210_8750_1_1530691351539_3171348_1-54315-0_sp1.1 from training. Duration: 0.7909375 2023-10-03 06:04:05,847 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.feed_forward2.hidden_balancer.prob, batch_count=1163586.6666666667, ans=0.125 2023-10-03 06:04:09,723 WARNING [train.py:1204] (2/4) Exclude cut with ID _44982_210_1511_1_1534312820108_3726029_413-100814-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 06:04:09,770 WARNING [train.py:1204] (2/4) Exclude cut with ID _3231_210_2106_1_1526205864934_6784220_104-261392-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 06:04:11,174 WARNING [train.py:1204] (2/4) Exclude cut with ID _24314_210_4915_1_1532135078116_7170179_1-13588-0_sp1.1 from training. Duration: 0.97275 2023-10-03 06:04:12,767 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.attention_skip_rate, batch_count=1163586.6666666667, ans=0.0 2023-10-03 06:04:15,301 WARNING [train.py:1204] (2/4) Exclude cut with ID _44304_210_15374_1_1533639352765_4613129_141-8356-0_sp1.1 from training. Duration: 0.963625 2023-10-03 06:04:16,477 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.540e+02 1.864e+02 2.150e+02 2.511e+02 3.546e+02, threshold=4.299e+02, percent-clipped=0.0 2023-10-03 06:04:16,643 WARNING [train.py:1204] (2/4) Exclude cut with ID _37892_210_19596_1_1533198677897_3964058_374-335034-0_sp1.1 from training. Duration: 0.936375 2023-10-03 06:04:18,452 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_195-257512-0_sp0.9 from training. Duration: 0.7444375 2023-10-03 06:04:18,454 WARNING [train.py:1204] (2/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_1267-272693-0_sp1.1 from training. Duration: 0.663625 2023-10-03 06:04:18,455 WARNING [train.py:1204] (2/4) Exclude cut with ID _30196_210_1664_1_1533708496522_7074140_690-326155-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 06:04:21,823 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_68847_210_14656_1_1536070214_2586843_38-20717-0_sp1.1 from training. Duration: 0.97275 2023-10-03 06:04:21,882 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_99-251314-0_sp1.1 from training. Duration: 0.7909375 2023-10-03 06:04:24,389 WARNING [train.py:1204] (2/4) Exclude cut with ID _49376_210_5986_1_1533985278732_3370740_225-95630-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 06:04:27,254 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_2655_210_2087_1_1525688959_3897400_202-147557-0 from training. Duration: 0.85 2023-10-03 06:04:27,313 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_5618_210_1874_1_1527311008_5851_1-214501-0_sp1.1 from training. Duration: 0.626375 2023-10-03 06:04:29,060 WARNING [train.py:1204] (2/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_225-43381-0_sp1.1 from training. Duration: 0.8545625 2023-10-03 06:04:30,371 WARNING [train.py:1204] (2/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_242-3472-0 from training. Duration: 0.73 2023-10-03 06:04:33,709 WARNING [train.py:1204] (2/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_339-104791-0 from training. Duration: 0.49 2023-10-03 06:04:33,784 WARNING [train.py:1204] (2/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_488-92261-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 06:04:37,692 WARNING [train.py:1204] (2/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_175-163894-0 from training. Duration: 0.74 2023-10-03 06:04:39,066 WARNING [train.py:1204] (2/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_41-234353-0_sp0.9 from training. Duration: 0.7555625 2023-10-03 06:04:41,939 WARNING [train.py:1204] (2/4) Exclude cut with ID _47251_210_18628_1_1533814063128_4137250_116-275927-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 06:04:43,312 WARNING [train.py:1204] (2/4) Exclude cut with ID _4697_210_2069_1_1526815801953_3823098_61-223324-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 06:04:43,331 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6961_210_2087_1_1527937168_6815011_562-165518-0_sp1.1 from training. Duration: 0.7 2023-10-03 06:04:44,803 WARNING [train.py:1204] (2/4) Exclude cut with ID _21989_210_5854_1_1531899214931_3271360_133-44759-0 from training. Duration: 0.77 2023-10-03 06:04:47,524 WARNING [train.py:1204] (2/4) Exclude cut with ID _23557_210_8750_1_1531890106370_6846255_77-284923-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 06:04:49,534 WARNING [train.py:1204] (2/4) Exclude cut with ID _13678_210_7497_1_1530530069226_3463170_464-296127-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 06:04:50,818 WARNING [train.py:1204] (2/4) Exclude cut with ID _22613_210_5219_1_1532253599686_4090170_393-36663-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 06:04:50,914 WARNING [train.py:1204] (2/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_235-262124-0_sp0.9 from training. Duration: 0.7444375 2023-10-03 06:04:52,305 WARNING [train.py:1204] (2/4) Exclude cut with ID _8480_210_1874_1_1528589391205_6728330_755-112625-0 from training. Duration: 0.74 2023-10-03 06:04:52,346 WARNING [train.py:1204] (2/4) Exclude cut with ID _6456_210_1534_1_1527681634212_3181049_215-309359-0 from training. Duration: 0.88 2023-10-03 06:04:53,649 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_3019_210_2028_1_1526044245_3264865_191-72235-0_sp1.1 from training. Duration: 0.8818125 2023-10-03 06:04:53,721 WARNING [train.py:1204] (2/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_380-162648-0 from training. Duration: 0.94 2023-10-03 06:04:55,148 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_92-46009-0 from training. Duration: 0.54 2023-10-03 06:04:55,165 WARNING [train.py:1204] (2/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_53-300544-0_sp0.9 from training. Duration: 0.7444375 2023-10-03 06:04:57,136 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_68235_210_1925_1_1535863247_4484656_101-250943-0_sp1.1 from training. Duration: 0.97275 2023-10-03 06:04:57,148 WARNING [train.py:1204] (2/4) Exclude cut with ID _14970_210_15709_1_1530775547361_1966965_204-22182-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 06:04:58,588 WARNING [train.py:1204] (2/4) Exclude cut with ID _55153_210_2253_1_1534384786677_3162096_310-130092-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 06:04:58,604 WARNING [train.py:1204] (2/4) Exclude cut with ID _60113_210_16271_1_1535164212481_4386079_559-167191-0_sp1.1 from training. Duration: 0.8454375 2023-10-03 06:04:59,999 WARNING [train.py:1204] (2/4) Exclude cut with ID _14368_210_4220_1_1530532714870_3595529_93-58002-0_sp1.1 from training. Duration: 0.5181875 2023-10-03 06:05:00,151 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.conv_module1.balancer2.prob, batch_count=1163853.3333333333, ans=0.125 2023-10-03 06:05:01,700 WARNING [train.py:1204] (2/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_110-224688-0 from training. Duration: 0.97 2023-10-03 06:05:03,068 WARNING [train.py:1204] (2/4) Exclude cut with ID _40402_210_18219_1_1533522324115_2953150_178-178004-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 06:05:03,077 WARNING [train.py:1204] (2/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_321-335475-0_sp1.1 from training. Duration: 0.4090625 2023-10-03 06:05:04,546 WARNING [train.py:1204] (2/4) Exclude cut with ID _66863_210_18053_1_1535698758334_8590319_1092-271784-0 from training. Duration: 0.96 2023-10-03 06:05:04,553 WARNING [train.py:1204] (2/4) Exclude cut with ID _28317_210_12742_1_1532480302381_8510325_607-213951-0_sp1.1 from training. Duration: 0.763625 2023-10-03 06:05:04,561 WARNING [train.py:1204] (2/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_468-283113-0 from training. Duration: 0.96 2023-10-03 06:05:07,371 WARNING [train.py:1204] (2/4) Exclude cut with ID _14922_210_6228_1_1530707435273_3693310_13-165512-0_sp1.1 from training. Duration: 0.7454375 2023-10-03 06:05:08,658 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_69630_210_7234_1_1535788670_910989_56-169762-0_sp1.1 from training. Duration: 0.92725 2023-10-03 06:05:09,667 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.0.layers.1.self_attn2.whiten, num_groups=1, num_channels=192, metric=10.45 vs. limit=22.5 2023-10-03 06:05:10,126 WARNING [train.py:1204] (2/4) Exclude cut with ID _67335_210_5219_1_1535525975652_4275229_121-80357-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 06:05:10,178 WARNING [train.py:1204] (2/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_126-12079-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 06:05:10,214 WARNING [train.py:1204] (2/4) Exclude cut with ID _5294_210_1747_1_1527249643928_3696179_342-270278-0_sp1.1 from training. Duration: 0.5 2023-10-03 06:05:11,648 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_30322_210_1925_1_1532843478_826817_35-328264-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 06:05:13,026 WARNING [train.py:1204] (2/4) Exclude cut with ID _8480_210_1874_1_1528589391205_6728330_755-112625-0_sp1.1 from training. Duration: 0.67275 2023-10-03 06:05:15,550 INFO [train.py:1046] (2/4) Epoch 33, batch 4600, loss[loss=0.1648, simple_loss=0.2514, pruned_loss=0.03913, over 24028.00 frames. ], tot_loss[loss=0.1606, simple_loss=0.2382, pruned_loss=0.04153, over 4705532.07 frames. ], batch size: 80, lr: 3.07e-03, grad_scale: 8.0 2023-10-03 06:05:17,592 WARNING [train.py:1204] (2/4) Exclude cut with ID _38670_210_15379_1_1533448810659_4391059_132-146412-0_sp1.1 from training. Duration: 0.97275 2023-10-03 06:05:17,671 WARNING [train.py:1204] (2/4) Exclude cut with ID _22020_210_2461_1_1531724164513_6326900_563-39691-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 06:05:19,039 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_9460_210_4211_1_1528954587_10049667_572-37035-0_sp1.1 from training. Duration: 0.72725 2023-10-03 06:05:19,062 WARNING [train.py:1204] (2/4) Exclude cut with ID _68777_210_4353_1_1535810390731_3117904_129-166012-0_sp0.9 from training. Duration: 0.9444375 2023-10-03 06:05:20,422 WARNING [train.py:1204] (2/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_487-91921-0_sp1.1 from training. Duration: 0.963625 2023-10-03 06:05:21,840 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_195-257512-0 from training. Duration: 0.67 2023-10-03 06:05:23,252 WARNING [train.py:1204] (2/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_188-260011-0_sp1.1 from training. Duration: 0.663625 2023-10-03 06:05:29,156 WARNING [train.py:1204] (2/4) Exclude cut with ID _8480_210_1874_1_1528589391205_6728330_309-139896-0_sp0.9 from training. Duration: 0.9 2023-10-03 06:05:29,216 WARNING [train.py:1204] (2/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_866-321248-0_sp1.1 from training. Duration: 0.963625 2023-10-03 06:05:32,505 WARNING [train.py:1204] (2/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_510-216513-0_sp1.1 from training. Duration: 0.97275 2023-10-03 06:05:38,315 WARNING [train.py:1204] (2/4) Exclude cut with ID _6653_210_2629_1_1527919172441_6732679_758-143677-0 from training. Duration: 0.98 2023-10-03 06:05:40,907 WARNING [train.py:1204] (2/4) Exclude cut with ID _55134_210_21982_1_1534590028377_3882170_50-214427-0_sp1.1 from training. Duration: 0.97275 2023-10-03 06:05:42,399 WARNING [train.py:1204] (2/4) Exclude cut with ID _31649_210_6228_1_1532584608063_4091078_482-301330-0_sp1.1 from training. Duration: 0.97275 2023-10-03 06:05:45,245 WARNING [train.py:1204] (2/4) Exclude cut with ID _35799_210_5854_1_1533263487887_3529071_490-326847-0_sp1.1 from training. Duration: 0.9 2023-10-03 06:05:45,252 WARNING [train.py:1204] (2/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_984-90716-0_sp1.1 from training. Duration: 0.963625 2023-10-03 06:05:50,049 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.conv_skip_rate, batch_count=1164053.3333333333, ans=0.0 2023-10-03 06:05:51,529 WARNING [train.py:1204] (2/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_166-246557-0 from training. Duration: 0.64 2023-10-03 06:05:51,529 WARNING [train.py:1204] (2/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_51-200220-0_sp1.1 from training. Duration: 0.5090625 2023-10-03 06:05:51,606 WARNING [train.py:1204] (2/4) Exclude cut with ID _51382_210_23862_1_1534492801838_3650104_275-92796-0_sp1.1 from training. Duration: 0.936375 2023-10-03 06:05:57,602 WARNING [train.py:1204] (2/4) Exclude cut with ID _29853_210_7497_1_1532505600851_6642110_461-142829-0_sp1.1 from training. Duration: 0.97275 2023-10-03 06:05:57,658 WARNING [train.py:1204] (2/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_788-298907-0_sp0.9 from training. Duration: 0.988875 2023-10-03 06:05:59,066 WARNING [train.py:1204] (2/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_739-2098-0_sp1.1 from training. Duration: 0.836375 2023-10-03 06:06:03,643 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7543_210_6753_1_1528541907_1399322_165-323619-0 from training. Duration: 0.69 2023-10-03 06:06:05,012 WARNING [train.py:1204] (2/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_401-255491-0_sp1.1 from training. Duration: 0.636375 2023-10-03 06:06:07,928 WARNING [train.py:1204] (2/4) Exclude cut with ID _66873_210_5517_1_1535454136506_4293370_24-143283-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 06:06:09,312 WARNING [train.py:1204] (2/4) Exclude cut with ID _14459_210_2228_1_1531443641946_7516700_1221-280439-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 06:06:10,772 WARNING [train.py:1204] (2/4) Exclude cut with ID _59202_210_26984_1_1534816810612_3549174_469-213482-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 06:06:10,773 WARNING [train.py:1204] (2/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_148-158509-0_sp0.9 from training. Duration: 0.4555625 2023-10-03 06:06:12,146 WARNING [train.py:1204] (2/4) Exclude cut with ID _24314_210_4915_1_1532135078116_7170179_863-259095-0_sp1.1 from training. Duration: 0.97275 2023-10-03 06:06:12,226 WARNING [train.py:1204] (2/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_321-22821-0 from training. Duration: 0.89 2023-10-03 06:06:12,250 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_43831_210_2158_1_1533725502_5872791_55-163137-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 06:06:13,728 WARNING [train.py:1204] (2/4) Exclude cut with ID _27430_210_7770_1_1532604689375_1378590_26-25207-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 06:06:15,089 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_17613_210_2151_1_1531137569_2366160_223-316461-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 06:06:15,153 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_53679_210_4852_1_1534556726_4186141_541-213370-0_sp1.1 from training. Duration: 0.963625 2023-10-03 06:06:16,504 WARNING [train.py:1204] (2/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_369-295086-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 06:06:16,563 WARNING [train.py:1204] (2/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_91-267696-0 from training. Duration: 0.87 2023-10-03 06:06:17,990 WARNING [train.py:1204] (2/4) Exclude cut with ID _5294_210_1747_1_1527249643928_3696179_180-100713-0 from training. Duration: 0.81 2023-10-03 06:06:18,656 WARNING [train.py:1204] (2/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_32-335103-0 from training. Duration: 0.82 2023-10-03 06:06:18,662 WARNING [train.py:1204] (2/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_133-312727-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 06:06:20,016 WARNING [train.py:1204] (2/4) Exclude cut with ID _59288_210_5854_1_1535077757774_3412246_391-165934-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 06:06:21,367 WARNING [train.py:1204] (2/4) Exclude cut with ID _28323_210_16187_1_1532480395667_4100300_488-1281-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 06:06:23,189 WARNING [train.py:1204] (2/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_124-193207-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 06:06:30,850 INFO [train.py:1046] (2/4) Epoch 33, batch 4650, loss[loss=0.1651, simple_loss=0.2317, pruned_loss=0.04931, over 23591.00 frames. ], tot_loss[loss=0.1602, simple_loss=0.2378, pruned_loss=0.04136, over 4713287.69 frames. ], batch size: 232, lr: 3.07e-03, grad_scale: 8.0 2023-10-03 06:06:32,530 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=1164253.3333333333, ans=0.1 2023-10-03 06:06:34,283 WARNING [train.py:1204] (2/4) Exclude cut with ID _14087_210_2253_1_1530529224512_3014029_226-242140-0_sp0.9 from training. Duration: 0.9 2023-10-03 06:06:36,959 WARNING [train.py:1204] (2/4) Exclude cut with ID _29277_210_3279_1_1532581109293_3173069_392-3694-0_sp1.1 from training. Duration: 0.936375 2023-10-03 06:06:36,996 WARNING [train.py:1204] (2/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_563-329842-0_sp1.1 from training. Duration: 0.97275 2023-10-03 06:06:37,043 WARNING [train.py:1204] (2/4) Exclude cut with ID _58583_210_5236_1_1534726835266_3574249_391-87755-0_sp1.1 from training. Duration: 0.92725 2023-10-03 06:06:38,344 WARNING [train.py:1204] (2/4) Exclude cut with ID _28427_210_4220_1_1532581240967_3061069_281-237827-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 06:06:38,360 WARNING [train.py:1204] (2/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_44-147898-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 06:06:39,721 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_24007_210_1794_1_1531918925_1663875_125-320772-0_sp1.1 from training. Duration: 0.97275 2023-10-03 06:06:42,517 WARNING [train.py:1204] (2/4) Exclude cut with ID _14368_210_4220_1_1530532714870_3595529_143-117421-0 from training. Duration: 0.58 2023-10-03 06:06:42,733 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.feed_forward1.hidden_balancer.prob, batch_count=1164253.3333333333, ans=0.125 2023-10-03 06:06:45,056 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.591e+02 1.854e+02 2.077e+02 2.382e+02 3.489e+02, threshold=4.154e+02, percent-clipped=0.0 2023-10-03 06:06:45,239 WARNING [train.py:1204] (2/4) Exclude cut with ID _69584_210_5219_1_1535785133775_4044450_325-7656-0_sp0.9 from training. Duration: 0.9555625 2023-10-03 06:06:46,619 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_37351_210_7925_1_1533012931_3812113_265-85780-0 from training. Duration: 0.88 2023-10-03 06:06:47,950 WARNING [train.py:1204] (2/4) Exclude cut with ID _18516_210_9206_1_1531704653206_3692406_331-237896-0_sp1.1 from training. Duration: 0.936375 2023-10-03 06:06:49,324 WARNING [train.py:1204] (2/4) Exclude cut with ID _14629_210_15709_1_1530604298182_956899_6-232695-0 from training. Duration: 0.94 2023-10-03 06:06:49,356 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7544_210_6753_1_1528587000_691468_87-178599-0_sp1.1 from training. Duration: 0.7909375 2023-10-03 06:06:50,677 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_326-303945-0 from training. Duration: 0.99 2023-10-03 06:06:50,692 WARNING [train.py:1204] (2/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_503-30719-0 from training. Duration: 0.98 2023-10-03 06:06:50,699 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_11305_210_1794_1_1529750428_7178148_299-344640-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 06:06:50,765 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_53071_210_16554_1_1534490405_2954066_265-170472-0_sp1.1 from training. Duration: 0.7545625 2023-10-03 06:06:55,253 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_174-277364-0_sp1.1 from training. Duration: 0.6454375 2023-10-03 06:06:57,272 WARNING [train.py:1204] (2/4) Exclude cut with ID _58414_210_8254_1_1535421609162_7151182_193-151884-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 06:06:57,290 WARNING [train.py:1204] (2/4) Exclude cut with ID IC0651W0104-322063 from training. Duration: 0.768 2023-10-03 06:06:57,630 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.feed_forward2.hidden_balancer.prob, batch_count=1164320.0, ans=0.125 2023-10-03 06:06:58,787 WARNING [train.py:1204] (2/4) Exclude cut with ID _21881_210_3525_1_1531825242757_3798380_139-305982-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 06:06:58,984 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.conv_module2.balancer2.prob, batch_count=1164386.6666666667, ans=0.125 2023-10-03 06:07:00,690 WARNING [train.py:1204] (2/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_79-200392-0 from training. Duration: 0.97 2023-10-03 06:07:02,262 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.attention_skip_rate, batch_count=1164386.6666666667, ans=0.0 2023-10-03 06:07:03,410 WARNING [train.py:1204] (2/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_451-280894-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 06:07:03,424 WARNING [train.py:1204] (2/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_304-273433-0_sp1.1 from training. Duration: 0.9 2023-10-03 06:07:04,846 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_5959_210_1794_1_1527400400_3445726_412-44052-0 from training. Duration: 0.77 2023-10-03 06:07:04,954 WARNING [train.py:1204] (2/4) Exclude cut with ID _4681_210_3525_1_1527251040321_1235713_148-26159-0_sp1.1 from training. Duration: 0.963625 2023-10-03 06:07:09,646 WARNING [train.py:1204] (2/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_887-249555-0_sp0.9 from training. Duration: 0.8555625 2023-10-03 06:07:12,402 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_25236_210_2158_1_1532051968_280567_37-6860-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 06:07:15,341 WARNING [train.py:1204] (2/4) Exclude cut with ID _29277_210_3279_1_1532581109293_3173069_417-115527-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 06:07:16,705 WARNING [train.py:1204] (2/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_552-134748-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 06:07:18,095 WARNING [train.py:1204] (2/4) Exclude cut with ID _43176_210_8254_1_1533524417469_4174339_188-174143-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 06:07:18,125 WARNING [train.py:1204] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_127-119109-0_sp1.1 from training. Duration: 0.6545625 2023-10-03 06:07:20,398 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.bypass.scale_min, batch_count=1164453.3333333333, ans=0.2 2023-10-03 06:07:21,245 WARNING [train.py:1204] (2/4) Exclude cut with ID _14922_210_6228_1_1530707435273_3693310_38-187851-0 from training. Duration: 0.62 2023-10-03 06:07:21,286 WARNING [train.py:1204] (2/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_588-125165-0 from training. Duration: 0.83 2023-10-03 06:07:22,674 WARNING [train.py:1204] (2/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_344-152526-0_sp1.1 from training. Duration: 0.4818125 2023-10-03 06:07:22,676 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7002_210_1794_1_1527935634_4034444_487-236501-0 from training. Duration: 0.66 2023-10-03 06:07:24,091 WARNING [train.py:1204] (2/4) Exclude cut with ID _23825_210_13449_1_1531915313526_3497150_379-16981-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 06:07:30,604 WARNING [train.py:1204] (2/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_297-5233-0_sp1.1 from training. Duration: 0.863625 2023-10-03 06:07:30,608 WARNING [train.py:1204] (2/4) Exclude cut with ID _41298_210_9019_1_1533430371450_4822143_178-283538-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 06:07:31,899 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7046_210_6753_1_1527895337_6920917_642-106799-0 from training. Duration: 0.92 2023-10-03 06:07:31,919 WARNING [train.py:1204] (2/4) Exclude cut with ID _6274_210_2084_1_1527937583837_7494020_532-291952-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 06:07:33,763 WARNING [train.py:1204] (2/4) Exclude cut with ID _47278_210_12041_1_1533902418349_3576110_260-270468-0_sp1.1 from training. Duration: 0.92725 2023-10-03 06:07:33,775 WARNING [train.py:1204] (2/4) Exclude cut with ID _14368_210_4220_1_1530532714870_3595529_144-3287-0_sp0.9 from training. Duration: 0.7444375 2023-10-03 06:07:36,357 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_162-262717-0_sp0.9 from training. Duration: 0.92225 2023-10-03 06:07:37,879 WARNING [train.py:1204] (2/4) Exclude cut with ID _3465_210_3525_1_1526175667060_3574049_62-335552-0_sp0.9 from training. Duration: 0.9666875 2023-10-03 06:07:37,884 WARNING [train.py:1204] (2/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_726-272852-0_sp1.1 from training. Duration: 0.92725 2023-10-03 06:07:39,261 WARNING [train.py:1204] (2/4) Exclude cut with ID _14898_210_9206_1_1530766812377_4177190_26-135492-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 06:07:39,492 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.bypass.skip_rate, batch_count=1164520.0, ans=0.09899494936611666 2023-10-03 06:07:42,064 WARNING [train.py:1204] (2/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_200-95304-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 06:07:42,106 WARNING [train.py:1204] (2/4) Exclude cut with ID _45012_210_12651_1_1533608435136_5371380_240-37101-0_sp1.1 from training. Duration: 0.8454375 2023-10-03 06:07:43,461 WARNING [train.py:1204] (2/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_132-269633-0_sp0.9 from training. Duration: 0.7555625 2023-10-03 06:07:43,552 WARNING [train.py:1204] (2/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_907-151398-0_sp0.9 from training. Duration: 0.411125 2023-10-03 06:07:44,798 INFO [train.py:1046] (2/4) Epoch 33, batch 4700, loss[loss=0.1722, simple_loss=0.2429, pruned_loss=0.05081, over 23595.00 frames. ], tot_loss[loss=0.1608, simple_loss=0.2389, pruned_loss=0.04137, over 4706846.78 frames. ], batch size: 256, lr: 3.07e-03, grad_scale: 8.0 2023-10-03 06:07:44,923 WARNING [train.py:1204] (2/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_45-265684-0_sp0.9 from training. Duration: 0.8 2023-10-03 06:07:45,105 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=1164586.6666666667, ans=0.1 2023-10-03 06:07:46,264 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6291_210_3273_1_1527683649_1078498_74-292608-0 from training. Duration: 0.72 2023-10-03 06:07:55,141 WARNING [train.py:1204] (2/4) Exclude cut with ID _59202_210_26984_1_1534816810612_3549174_221-84819-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 06:07:55,217 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_33696_210_3852_1_1533082780_2663375_181-325595-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 06:07:56,535 WARNING [train.py:1204] (2/4) Exclude cut with ID _47251_210_18628_1_1533814063128_4137250_351-154105-0_sp1.1 from training. Duration: 0.963625 2023-10-03 06:07:57,908 WARNING [train.py:1204] (2/4) Exclude cut with ID _35501_210_9288_1_1532998507986_3192189_351-334740-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 06:07:59,753 WARNING [train.py:1204] (2/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_342-100157-0_sp1.1 from training. Duration: 0.4909375 2023-10-03 06:08:05,748 WARNING [train.py:1204] (2/4) Exclude cut with ID _6789_210_1534_1_1527857937218_3118950_262-48853-0 from training. Duration: 0.56 2023-10-03 06:08:05,788 WARNING [train.py:1204] (2/4) Exclude cut with ID _69884_210_9771_1_1535846392818_4166169_430-201020-0 from training. Duration: 0.97 2023-10-03 06:08:09,232 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_71-136518-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 06:08:09,306 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_28-291982-0_sp1.1 from training. Duration: 0.8818125 2023-10-03 06:08:09,328 WARNING [train.py:1204] (2/4) Exclude cut with ID _54218_210_12210_1_1534413646388_4409259_437-275326-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 06:08:14,584 WARNING [train.py:1204] (2/4) Exclude cut with ID _55134_210_21982_1_1534590028377_3882170_40-309139-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 06:08:18,825 WARNING [train.py:1204] (2/4) Exclude cut with ID _32704_210_8954_1_1533017488840_6747370_266-329755-0_sp1.1 from training. Duration: 0.8090625 2023-10-03 06:08:18,951 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.0.feed_forward2.hidden_balancer.prob, batch_count=1164720.0, ans=0.125 2023-10-03 06:08:20,278 WARNING [train.py:1204] (2/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_21-174514-0_sp1.1 from training. Duration: 0.3909375 2023-10-03 06:08:23,027 WARNING [train.py:1204] (2/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_409-207378-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 06:08:29,148 WARNING [train.py:1204] (2/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_132-269633-0 from training. Duration: 0.68 2023-10-03 06:08:30,689 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_2245_210_2715_1_1525511548_3540254_310-52483-0_sp0.9 from training. Duration: 0.97775 2023-10-03 06:08:32,583 WARNING [train.py:1204] (2/4) Exclude cut with ID _27430_210_7770_1_1532604689375_1378590_47-77403-0_sp1.1 from training. Duration: 0.92725 2023-10-03 06:08:36,088 WARNING [train.py:1204] (2/4) Exclude cut with ID _7562_210_2084_1_1528887767916_3946229_186-326422-0 from training. Duration: 0.84 2023-10-03 06:08:37,506 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_14056_210_4163_1_1530604483_7572827_230-35993-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 06:08:43,325 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_23854_210_1794_1_1531969696_1329157_57-123491-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 06:08:43,369 WARNING [train.py:1204] (2/4) Exclude cut with ID _14747_210_8341_1_1530685797463_3654089_147-131229-0 from training. Duration: 0.76 2023-10-03 06:08:43,475 WARNING [train.py:1204] (2/4) Exclude cut with ID _30191_210_1664_1_1533189915047_7196670_430-19539-0_sp1.1 from training. Duration: 0.92725 2023-10-03 06:08:44,734 WARNING [train.py:1204] (2/4) Exclude cut with ID _67725_210_27767_1_1535608756671_5105359_44-50762-0_sp1.1 from training. Duration: 0.963625 2023-10-03 06:08:46,255 WARNING [train.py:1204] (2/4) Exclude cut with ID _27485_210_4916_1_1532739191677_3668269_217-161755-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 06:08:47,497 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_5931_210_2040_1_1527428481_4547999_321-193614-0_sp1.1 from training. Duration: 0.6090625 2023-10-03 06:08:47,514 WARNING [train.py:1204] (2/4) Exclude cut with ID _6267_210_2084_1_1527764658526_3894730_375-219887-0 from training. Duration: 0.67 2023-10-03 06:08:47,596 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9087W0022-538393 from training. Duration: 0.768 2023-10-03 06:08:49,039 WARNING [train.py:1204] (2/4) Exclude cut with ID _68777_210_4353_1_1535810390731_3117904_83-196459-0_sp1.1 from training. Duration: 0.963625 2023-10-03 06:08:50,330 WARNING [train.py:1204] (2/4) Exclude cut with ID _69884_210_9771_1_1535846392818_4166169_455-343812-0_sp1.1 from training. Duration: 0.92725 2023-10-03 06:08:50,331 WARNING [train.py:1204] (2/4) Exclude cut with ID _13916_210_9994_1_1530788341126_3763149_414-320174-0_sp1.1 from training. Duration: 0.92725 2023-10-03 06:08:50,334 WARNING [train.py:1204] (2/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_398-239819-0 from training. Duration: 0.99 2023-10-03 06:08:51,640 WARNING [train.py:1204] (2/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_199-232314-0_sp1.1 from training. Duration: 0.92725 2023-10-03 06:08:56,142 WARNING [train.py:1204] (2/4) Exclude cut with ID _2334_210_2667_1_1525608238880_4705129_459-104997-0 from training. Duration: 0.95 2023-10-03 06:08:56,845 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.3.encoder.layers.2.nonlin_attention.whiten2, num_groups=1, num_channels=512, metric=5.45 vs. limit=15.0 2023-10-03 06:08:58,782 INFO [train.py:1046] (2/4) Epoch 33, batch 4750, loss[loss=0.1645, simple_loss=0.2395, pruned_loss=0.04472, over 23807.00 frames. ], tot_loss[loss=0.1617, simple_loss=0.2401, pruned_loss=0.04164, over 4702787.47 frames. ], batch size: 212, lr: 3.07e-03, grad_scale: 8.0 2023-10-03 06:08:58,923 WARNING [train.py:1204] (2/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_441-209395-0_sp1.1 from training. Duration: 0.936375 2023-10-03 06:09:00,392 WARNING [train.py:1204] (2/4) Exclude cut with ID _27052_210_5986_1_1532329197187_3585009_320-80393-0_sp1.1 from training. Duration: 0.97275 2023-10-03 06:09:05,466 WARNING [train.py:1204] (2/4) Exclude cut with ID _21783_210_11357_1_1531742458730_3762679_89-217253-0_sp1.1 from training. Duration: 0.97275 2023-10-03 06:09:05,490 WARNING [train.py:1204] (2/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_453-282588-0_sp1.1 from training. Duration: 0.8909375 2023-10-03 06:09:06,830 WARNING [train.py:1204] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_88-341718-0 from training. Duration: 0.66 2023-10-03 06:09:08,142 WARNING [train.py:1204] (2/4) Exclude cut with ID _43552_210_8913_1_1533726031059_3523429_230-3521-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 06:09:09,656 WARNING [train.py:1204] (2/4) Exclude cut with ID _9679_210_7497_1_1529129212906_4363929_512-313889-0 from training. Duration: 0.81 2023-10-03 06:09:12,980 WARNING [train.py:1204] (2/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_194-252800-0_sp1.1 from training. Duration: 0.8818125 2023-10-03 06:09:13,006 WARNING [train.py:1204] (2/4) Exclude cut with ID _55209_210_1531_1_1534675828996_4597160_239-293541-0_sp1.1 from training. Duration: 0.963625 2023-10-03 06:09:14,318 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.573e+02 1.898e+02 2.045e+02 2.268e+02 3.285e+02, threshold=4.090e+02, percent-clipped=0.0 2023-10-03 06:09:14,431 WARNING [train.py:1204] (2/4) Exclude cut with ID _35799_210_5854_1_1533263487887_3529071_15-325131-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 06:09:19,891 WARNING [train.py:1204] (2/4) Exclude cut with ID _14100_210_8341_1_1530529320613_3678069_279-55760-0 from training. Duration: 0.93 2023-10-03 06:09:25,406 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_489-230703-0_sp1.1 from training. Duration: 0.77275 2023-10-03 06:09:26,786 WARNING [train.py:1204] (2/4) Exclude cut with ID _6694_210_4599_1_1527852637490_3965529_144-185051-0 from training. Duration: 0.96 2023-10-03 06:09:28,212 WARNING [train.py:1204] (2/4) Exclude cut with ID _28461_210_2253_1_1532771775189_3041299_1-228155-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 06:09:31,580 WARNING [train.py:1204] (2/4) Exclude cut with ID _32704_210_8954_1_1533017488840_6747370_686-252323-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 06:09:31,583 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_40896_210_15710_1_1533433956_7100802_1075-294633-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 06:09:32,902 WARNING [train.py:1204] (2/4) Exclude cut with ID _22083_210_8254_1_1531702862439_3678970_30-92879-0_sp1.1 from training. Duration: 0.97275 2023-10-03 06:09:34,292 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9245W0121-616888 from training. Duration: 0.896 2023-10-03 06:09:34,295 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6786_210_1388_1_1527839699_4194495_378-46070-0 from training. Duration: 0.72 2023-10-03 06:09:38,287 WARNING [train.py:1204] (2/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_317-175062-0 from training. Duration: 0.82 2023-10-03 06:09:40,993 WARNING [train.py:1204] (2/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_238-89390-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 06:09:42,393 WARNING [train.py:1204] (2/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_524-310811-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 06:09:44,997 WARNING [train.py:1204] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_307-158462-0_sp0.9 from training. Duration: 0.7666875 2023-10-03 06:09:44,998 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9309W0191-640318 from training. Duration: 0.512 2023-10-03 06:09:45,002 WARNING [train.py:1204] (2/4) Exclude cut with ID _55153_210_2253_1_1534384786677_3162096_100-204617-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 06:09:48,304 WARNING [train.py:1204] (2/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_112-149213-0_sp1.1 from training. Duration: 0.863625 2023-10-03 06:09:51,029 WARNING [train.py:1204] (2/4) Exclude cut with ID _67831_210_12392_1_1535612464202_3092612_346-2148-0_sp1.1 from training. Duration: 0.8454375 2023-10-03 06:09:51,324 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.nonlin_attention.balancer.prob, batch_count=1165120.0, ans=0.125 2023-10-03 06:09:52,567 WARNING [train.py:1204] (2/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_51-117576-0_sp0.9 from training. Duration: 0.4444375 2023-10-03 06:09:52,588 WARNING [train.py:1204] (2/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_300-27235-0 from training. Duration: 0.87 2023-10-03 06:09:52,631 WARNING [train.py:1204] (2/4) Exclude cut with ID _40399_210_4353_1_1533305658194_3279406_195-116825-0_sp1.1 from training. Duration: 0.97275 2023-10-03 06:09:52,653 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7002_210_1794_1_1527939683_5157286_163-87552-0_sp0.9 from training. Duration: 0.9555625 2023-10-03 06:09:54,014 WARNING [train.py:1204] (2/4) Exclude cut with ID _19514_210_9259_1_1531530071330_5084230_560-237597-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 06:09:55,354 WARNING [train.py:1204] (2/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_41-234353-0_sp1.1 from training. Duration: 0.6181875 2023-10-03 06:09:55,387 WARNING [train.py:1204] (2/4) Exclude cut with ID _6274_210_2084_1_1527937583837_7494020_769-168302-0 from training. Duration: 0.71 2023-10-03 06:09:58,231 WARNING [train.py:1204] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_574-45541-0 from training. Duration: 0.65 2023-10-03 06:10:00,030 WARNING [train.py:1204] (2/4) Exclude cut with ID _50310_210_15488_1_1534068043345_3641080_262-67120-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 06:10:02,629 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_53071_210_16554_1_1534493434_1276796_35-11927-0_sp1.1 from training. Duration: 0.963625 2023-10-03 06:10:02,631 WARNING [train.py:1204] (2/4) Exclude cut with ID _3992_210_1747_1_1526644816031_3731060_107-251434-0 from training. Duration: 0.99 2023-10-03 06:10:02,689 WARNING [train.py:1204] (2/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_412-54488-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 06:10:04,064 WARNING [train.py:1204] (2/4) Exclude cut with ID _18516_210_9206_1_1531704653206_3692406_77-258490-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 06:10:07,183 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_40047_210_2290_1_1533290519_2374311_195-165693-0_sp0.9 from training. Duration: 0.988875 2023-10-03 06:10:07,223 WARNING [train.py:1204] (2/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_296-36971-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 06:10:07,288 WARNING [train.py:1204] (2/4) Exclude cut with ID _14970_210_15709_1_1530773543493_1715730_187-20259-0_sp1.1 from training. Duration: 0.8090625 2023-10-03 06:10:11,946 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_53373_210_5399_1_1534229471_4715695_135-305927-0_sp1.1 from training. Duration: 0.936375 2023-10-03 06:10:11,968 WARNING [train.py:1204] (2/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_60-18123-0 from training. Duration: 0.88 2023-10-03 06:10:13,187 INFO [train.py:1046] (2/4) Epoch 33, batch 4800, loss[loss=0.136, simple_loss=0.2143, pruned_loss=0.02891, over 24337.00 frames. ], tot_loss[loss=0.1625, simple_loss=0.241, pruned_loss=0.04203, over 4714844.47 frames. ], batch size: 56, lr: 3.07e-03, grad_scale: 16.0 2023-10-03 06:10:13,263 WARNING [train.py:1204] (2/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_166-166990-0 from training. Duration: 0.96 2023-10-03 06:10:14,581 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_5438_210_1794_1_1527336687_2600009_233-58072-0 from training. Duration: 0.77 2023-10-03 06:10:17,815 WARNING [train.py:1204] (2/4) Exclude cut with ID _8833_210_5219_1_1528592665210_7553059_611-80313-0_sp0.9 from training. Duration: 0.711125 2023-10-03 06:10:17,845 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_53555_210_5632_1_1534595220_7232297_118-43239-0_sp1.1 from training. Duration: 0.936375 2023-10-03 06:10:19,250 WARNING [train.py:1204] (2/4) Exclude cut with ID _12369_210_4381_1_1530356078581_4388520_480-13146-0 from training. Duration: 0.85 2023-10-03 06:10:24,815 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_892-28814-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 06:10:24,874 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_24899_210_16554_1_1532046422_3827462_160-21255-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 06:10:29,039 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_154-316450-0_sp1.1 from training. Duration: 0.6909375 2023-10-03 06:10:29,782 WARNING [train.py:1204] (2/4) Exclude cut with ID _30196_210_1664_1_1533708496522_7074140_1225-301232-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 06:10:29,807 WARNING [train.py:1204] (2/4) Exclude cut with ID _28783_210_7943_1_1532671164808_7155479_414-208100-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 06:10:31,182 WARNING [train.py:1204] (2/4) Exclude cut with ID _19023_210_4353_1_1531378892552_5820780_83-254794-0 from training. Duration: 0.86 2023-10-03 06:10:31,243 WARNING [train.py:1204] (2/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_591-83752-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 06:10:32,559 WARNING [train.py:1204] (2/4) Exclude cut with ID _29877_210_6228_1_1532397791106_5276149_266-62291-0_sp1.1 from training. Duration: 0.7909375 2023-10-03 06:10:33,896 WARNING [train.py:1204] (2/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_280-161633-0_sp1.1 from training. Duration: 0.663625 2023-10-03 06:10:37,175 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7543_210_6753_1_1528539468_333764_39-156634-0_sp1.1 from training. Duration: 0.97275 2023-10-03 06:10:38,555 WARNING [train.py:1204] (2/4) Exclude cut with ID _67499_210_17282_1_1535509805220_3692430_505-312248-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 06:10:38,584 WARNING [train.py:1204] (2/4) Exclude cut with ID _5294_210_1747_1_1527249643928_3696179_180-100713-0_sp0.9 from training. Duration: 0.9 2023-10-03 06:10:41,782 WARNING [train.py:1204] (2/4) Exclude cut with ID _26528_210_9070_1_1532570410842_3851310_508-148132-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 06:10:41,793 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_211-339622-0_sp1.1 from training. Duration: 0.4090625 2023-10-03 06:10:41,805 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_40896_210_15710_1_1533433956_7100802_339-138356-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 06:10:43,086 WARNING [train.py:1204] (2/4) Exclude cut with ID _27060_210_5986_1_1532674838922_3522549_211-19933-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 06:10:44,631 WARNING [train.py:1204] (2/4) Exclude cut with ID _60229_210_27767_1_1534838406158_3655350_457-117923-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 06:10:47,908 WARNING [train.py:1204] (2/4) Exclude cut with ID _55429_210_10626_1_1534467588527_7174143_460-141439-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 06:10:49,286 WARNING [train.py:1204] (2/4) Exclude cut with ID _59170_210_6721_1_1534983633960_4203335_263-10780-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 06:10:49,298 WARNING [train.py:1204] (2/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_872-264525-0_sp0.9 from training. Duration: 0.72225 2023-10-03 06:10:49,377 WARNING [train.py:1204] (2/4) Exclude cut with ID _7624_210_1531_1_1528109802198_4933101_418-349834-0_sp0.9 from training. Duration: 0.6666875 2023-10-03 06:10:50,050 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.3.encoder.layers.0.nonlin_attention.whiten1, num_groups=1, num_channels=384, metric=7.10 vs. limit=10.0 2023-10-03 06:10:50,818 WARNING [train.py:1204] (2/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_27-53998-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 06:10:52,241 WARNING [train.py:1204] (2/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_202-183922-0 from training. Duration: 0.85 2023-10-03 06:10:52,258 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7002_210_1794_1_1527939683_5157286_1-281264-0 from training. Duration: 0.44 2023-10-03 06:10:54,932 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_53753_210_7234_1_1534320003_3594148_493-9314-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 06:10:54,955 WARNING [train.py:1204] (2/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_217-95056-0_sp1.1 from training. Duration: 0.8909375 2023-10-03 06:10:54,998 WARNING [train.py:1204] (2/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_406-45086-0_sp1.1 from training. Duration: 0.72725 2023-10-03 06:10:55,005 WARNING [train.py:1204] (2/4) Exclude cut with ID _31649_210_6228_1_1532584608063_4091078_505-213821-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 06:10:55,012 WARNING [train.py:1204] (2/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_317-175062-0_sp0.9 from training. Duration: 0.911125 2023-10-03 06:10:57,068 WARNING [train.py:1204] (2/4) Exclude cut with ID _5444_210_2151_1_1527418808464_5312210_726-27165-0_sp1.1 from training. Duration: 0.6090625 2023-10-03 06:10:58,434 WARNING [train.py:1204] (2/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_494-170437-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 06:11:02,515 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_474-64054-0_sp1.1 from training. Duration: 0.936375 2023-10-03 06:11:05,123 WARNING [train.py:1204] (2/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_521-7556-0_sp1.1 from training. Duration: 0.97275 2023-10-03 06:11:05,236 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_269-348505-0_sp1.1 from training. Duration: 0.92725 2023-10-03 06:11:09,932 WARNING [train.py:1204] (2/4) Exclude cut with ID _21878_210_3525_1_1531738772002_7264330_559-184862-0 from training. Duration: 0.94 2023-10-03 06:11:09,966 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_68702_210_23689_1_1535799577_3420068_92-76040-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 06:11:10,002 WARNING [train.py:1204] (2/4) Exclude cut with ID _22826_210_9994_1_1531908201522_3279169_227-102052-0_sp1.1 from training. Duration: 0.97275 2023-10-03 06:11:11,732 WARNING [train.py:1204] (2/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_718-250907-0_sp0.9 from training. Duration: 0.7666875 2023-10-03 06:11:11,791 WARNING [train.py:1204] (2/4) Exclude cut with ID _14069_210_5632_1_1530512692033_4803390_352-143891-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 06:11:15,847 WARNING [train.py:1204] (2/4) Exclude cut with ID _6267_210_2084_1_1527764658526_3894730_65-106602-0_sp1.1 from training. Duration: 0.936375 2023-10-03 06:11:17,141 WARNING [train.py:1204] (2/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_202-183922-0_sp0.9 from training. Duration: 0.9444375 2023-10-03 06:11:17,154 WARNING [train.py:1204] (2/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_465-268827-0_sp1.1 from training. Duration: 0.97275 2023-10-03 06:11:18,832 WARNING [train.py:1204] (2/4) Exclude cut with ID _12369_210_4381_1_1530356078581_4388520_58-29064-0_sp1.1 from training. Duration: 0.9 2023-10-03 06:11:18,860 WARNING [train.py:1204] (2/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_1-99680-0_sp0.9 from training. Duration: 0.7444375 2023-10-03 06:11:20,170 WARNING [train.py:1204] (2/4) Exclude cut with ID _13253_210_1925_1_1530757775819_3577873_191-144977-0_sp1.1 from training. Duration: 0.7090625 2023-10-03 06:11:24,446 WARNING [train.py:1204] (2/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_276-112684-0_sp1.1 from training. Duration: 0.92725 2023-10-03 06:11:24,455 WARNING [train.py:1204] (2/4) Exclude cut with ID _64639_210_5816_1_1535367619437_3528000_358-411-0_sp1.1 from training. Duration: 0.97275 2023-10-03 06:11:24,468 WARNING [train.py:1204] (2/4) Exclude cut with ID _31649_210_6228_1_1532584608063_4091078_106-21199-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 06:11:25,865 WARNING [train.py:1204] (2/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_901-79438-0 from training. Duration: 0.94 2023-10-03 06:11:27,681 INFO [train.py:1046] (2/4) Epoch 33, batch 4850, loss[loss=0.1525, simple_loss=0.2337, pruned_loss=0.03567, over 23714.00 frames. ], tot_loss[loss=0.1634, simple_loss=0.2417, pruned_loss=0.0425, over 4703983.63 frames. ], batch size: 149, lr: 3.07e-03, grad_scale: 16.0 2023-10-03 06:11:29,160 WARNING [train.py:1204] (2/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_718-250907-0 from training. Duration: 0.69 2023-10-03 06:11:29,166 WARNING [train.py:1204] (2/4) Exclude cut with ID _5287_210_2084_1_1527418883040_3878670_363-194161-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 06:11:29,169 WARNING [train.py:1204] (2/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_586-321321-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 06:11:29,245 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_68900_210_2087_1_1535713157_3708529_164-292661-0_sp1.1 from training. Duration: 0.963625 2023-10-03 06:11:29,246 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_26433_210_12108_1_1532157158_4056480_114-144878-0_sp1.1 from training. Duration: 0.97275 2023-10-03 06:11:32,060 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_15517_210_6753_1_1530954008_3487336_96-341874-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 06:11:38,785 WARNING [train.py:1204] (2/4) Exclude cut with ID _14268_210_9206_1_1530622771285_4714099_637-86630-0 from training. Duration: 0.51 2023-10-03 06:11:38,889 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_28211_210_6388_1_1532861506_4523224_121-320092-0_sp1.1 from training. Duration: 0.92725 2023-10-03 06:11:42,842 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.571e+02 1.890e+02 2.112e+02 2.331e+02 3.669e+02, threshold=4.223e+02, percent-clipped=0.0 2023-10-03 06:11:44,417 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_141-89113-0_sp0.9 from training. Duration: 0.9555625 2023-10-03 06:11:45,654 WARNING [train.py:1204] (2/4) Exclude cut with ID _6789_210_1534_1_1527857937218_3118950_311-61262-0_sp1.1 from training. Duration: 0.5909375 2023-10-03 06:11:45,685 WARNING [train.py:1204] (2/4) Exclude cut with ID _58480_210_12392_1_1534811121111_3646160_389-200461-0_sp1.1 from training. Duration: 0.97275 2023-10-03 06:11:50,318 WARNING [train.py:1204] (2/4) Exclude cut with ID _41326_210_5854_1_1533778076509_3492080_28-156553-0_sp1.1 from training. Duration: 0.92725 2023-10-03 06:11:50,399 WARNING [train.py:1204] (2/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_577-287150-0_sp1.1 from training. Duration: 0.7454375 2023-10-03 06:11:51,747 WARNING [train.py:1204] (2/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_40-129756-0_sp1.1 from training. Duration: 0.736375 2023-10-03 06:11:51,757 WARNING [train.py:1204] (2/4) Exclude cut with ID _60113_210_16271_1_1535164212481_4386079_490-280290-0 from training. Duration: 0.98 2023-10-03 06:11:52,008 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.balancer1.prob, batch_count=1165653.3333333333, ans=0.125 2023-10-03 06:11:52,026 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.bypass.scale_min, batch_count=1165653.3333333333, ans=0.2 2023-10-03 06:11:54,597 WARNING [train.py:1204] (2/4) Exclude cut with ID _58059_210_5854_1_1534901471149_3164808_34-39905-0_sp1.1 from training. Duration: 0.963625 2023-10-03 06:11:57,450 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_791-197891-0_sp1.1 from training. Duration: 0.763625 2023-10-03 06:11:59,171 WARNING [train.py:1204] (2/4) Exclude cut with ID _6789_210_1534_1_1527857937218_3118950_262-48853-0_sp1.1 from training. Duration: 0.5090625 2023-10-03 06:12:00,488 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_2655_210_2087_1_1525688959_3897400_52-279899-0_sp0.9 from training. Duration: 0.7666875 2023-10-03 06:12:00,492 WARNING [train.py:1204] (2/4) Exclude cut with ID _14884_210_2252_1_1530707459689_5561250_114-201399-0 from training. Duration: 0.76 2023-10-03 06:12:01,972 WARNING [train.py:1204] (2/4) Exclude cut with ID _19023_210_4353_1_1531378892552_5820780_83-254794-0_sp0.9 from training. Duration: 0.9555625 2023-10-03 06:12:01,987 WARNING [train.py:1204] (2/4) Exclude cut with ID _53489_210_6228_1_1534298357169_3835970_10-297919-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 06:12:05,039 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.conv_module2.balancer1.prob, batch_count=1165720.0, ans=0.125 2023-10-03 06:12:06,174 WARNING [train.py:1204] (2/4) Exclude cut with ID _44585_210_18628_1_1533605350387_4518199_418-3055-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 06:12:06,196 WARNING [train.py:1204] (2/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_310-109461-0 from training. Duration: 0.8 2023-10-03 06:12:06,243 WARNING [train.py:1204] (2/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_235-262124-0 from training. Duration: 0.67 2023-10-03 06:12:07,550 WARNING [train.py:1204] (2/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_195-167011-0_sp1.1 from training. Duration: 0.6818125 2023-10-03 06:12:11,138 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.balancer2.prob, batch_count=1165786.6666666667, ans=0.125 2023-10-03 06:12:15,558 WARNING [train.py:1204] (2/4) Exclude cut with ID _7852_210_5219_1_1528286471316_8139400_188-55162-0_sp1.1 from training. Duration: 0.936375 2023-10-03 06:12:15,608 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_35361_210_4238_1_1533262810_7793476_675-87012-0 from training. Duration: 0.89 2023-10-03 06:12:17,286 WARNING [train.py:1204] (2/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_465-81427-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 06:12:17,293 WARNING [train.py:1204] (2/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_597-154640-0_sp1.1 from training. Duration: 0.8181875 2023-10-03 06:12:18,752 WARNING [train.py:1204] (2/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_186-309161-0_sp0.9 from training. Duration: 0.6 2023-10-03 06:12:18,842 WARNING [train.py:1204] (2/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_546-119312-0 from training. Duration: 0.99 2023-10-03 06:12:18,846 WARNING [train.py:1204] (2/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_330-240060-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 06:12:20,146 WARNING [train.py:1204] (2/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_477-255782-0 from training. Duration: 0.82 2023-10-03 06:12:20,170 WARNING [train.py:1204] (2/4) Exclude cut with ID _14970_210_15709_1_1530773543493_1715730_150-248939-0_sp1.1 from training. Duration: 0.97275 2023-10-03 06:12:21,669 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_556-206615-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 06:12:24,208 WARNING [train.py:1204] (2/4) Exclude cut with ID _3465_210_3525_1_1526175667060_3574049_308-231735-0 from training. Duration: 0.73 2023-10-03 06:12:30,328 WARNING [train.py:1204] (2/4) Exclude cut with ID _27485_210_4916_1_1532739191677_3668269_66-236571-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 06:12:34,584 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_2221_210_1988_1_1525597051_4935541_415-296882-0_sp0.9 from training. Duration: 0.8555625 2023-10-03 06:12:34,592 WARNING [train.py:1204] (2/4) Exclude cut with ID _64521_210_14699_1_1535243427259_3815328_231-78630-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 06:12:40,540 WARNING [train.py:1204] (2/4) Exclude cut with ID _13942_210_9988_1_1530604906036_3733360_499-55676-0 from training. Duration: 0.62 2023-10-03 06:12:40,545 WARNING [train.py:1204] (2/4) Exclude cut with ID _49376_210_5986_1_1533985278732_3370740_301-154763-0_sp1.1 from training. Duration: 0.8909375 2023-10-03 06:12:42,280 INFO [train.py:1046] (2/4) Epoch 33, batch 4900, loss[loss=0.1684, simple_loss=0.2508, pruned_loss=0.04296, over 24369.00 frames. ], tot_loss[loss=0.162, simple_loss=0.2408, pruned_loss=0.04162, over 4713792.97 frames. ], batch size: 77, lr: 3.07e-03, grad_scale: 16.0 2023-10-03 06:12:47,014 WARNING [train.py:1204] (2/4) Exclude cut with ID _27485_210_4916_1_1532739191677_3668269_255-216698-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 06:12:48,365 WARNING [train.py:1204] (2/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_137-334143-0_sp1.1 from training. Duration: 0.97275 2023-10-03 06:12:48,385 WARNING [train.py:1204] (2/4) Exclude cut with ID _6456_210_1534_1_1527681634212_3181049_215-309359-0_sp1.1 from training. Duration: 0.8 2023-10-03 06:12:52,518 WARNING [train.py:1204] (2/4) Exclude cut with ID _65640_210_6310_1_1535368201445_3173250_209-13008-0 from training. Duration: 0.99 2023-10-03 06:12:56,622 WARNING [train.py:1204] (2/4) Exclude cut with ID _12369_210_4381_1_1530356078581_4388520_58-29064-0 from training. Duration: 0.99 2023-10-03 06:13:01,103 WARNING [train.py:1204] (2/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_410-287857-0 from training. Duration: 0.93 2023-10-03 06:13:01,178 WARNING [train.py:1204] (2/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_153-153537-0 from training. Duration: 0.91 2023-10-03 06:13:01,220 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7544_210_6753_1_1528587722_3576686_172-171750-0_sp0.9 from training. Duration: 0.72225 2023-10-03 06:13:01,249 WARNING [train.py:1204] (2/4) Exclude cut with ID _64521_210_14699_1_1535243427259_3815328_464-183853-0_sp1.1 from training. Duration: 0.97275 2023-10-03 06:13:02,571 WARNING [train.py:1204] (2/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_385-210308-0_sp1.1 from training. Duration: 0.963625 2023-10-03 06:13:02,579 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_46013_210_7234_1_1533776362_3744032_354-261865-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 06:13:02,587 WARNING [train.py:1204] (2/4) Exclude cut with ID _3067_210_2667_1_1526212974640_4503168_250-285937-0_sp1.1 from training. Duration: 0.72725 2023-10-03 06:13:02,665 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_40036_210_13405_1_1533648647_1315388_48-146016-0 from training. Duration: 0.93 2023-10-03 06:13:05,588 WARNING [train.py:1204] (2/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_280-161633-0 from training. Duration: 0.73 2023-10-03 06:13:06,876 WARNING [train.py:1204] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_308-161067-0_sp0.9 from training. Duration: 0.7333125 2023-10-03 06:13:08,784 WARNING [train.py:1204] (2/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_68-212801-0_sp0.9 from training. Duration: 0.811125 2023-10-03 06:13:10,157 WARNING [train.py:1204] (2/4) Exclude cut with ID _6789_210_1534_1_1527857937218_3118950_311-61262-0_sp0.9 from training. Duration: 0.72225 2023-10-03 06:13:11,640 WARNING [train.py:1204] (2/4) Exclude cut with ID _14632_210_9988_1_1530691279392_3923200_102-204477-0_sp1.1 from training. Duration: 0.8818125 2023-10-03 06:13:12,962 WARNING [train.py:1204] (2/4) Exclude cut with ID _65242_210_18628_1_1535369393717_3823149_290-117517-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 06:13:14,797 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_69630_210_7234_1_1535788670_910989_35-337140-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 06:13:14,805 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_5618_210_1874_1_1527311008_5851_1-214501-0 from training. Duration: 0.689 2023-10-03 06:13:17,028 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=1166053.3333333333, ans=0.1 2023-10-03 06:13:18,169 WARNING [train.py:1204] (2/4) Exclude cut with ID _8064_210_5399_1_1528372299291_4282973_168-50337-0_sp0.9 from training. Duration: 0.9333125 2023-10-03 06:13:18,253 WARNING [train.py:1204] (2/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_185-14438-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 06:13:18,273 WARNING [train.py:1204] (2/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_61-327062-0 from training. Duration: 0.81 2023-10-03 06:13:18,278 WARNING [train.py:1204] (2/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_98-741-0 from training. Duration: 0.8 2023-10-03 06:13:21,050 WARNING [train.py:1204] (2/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_312-204798-0 from training. Duration: 0.93 2023-10-03 06:13:23,655 WARNING [train.py:1204] (2/4) Exclude cut with ID _14998_210_5219_1_1530705582080_7474970_787-294416-0_sp0.9 from training. Duration: 0.911125 2023-10-03 06:13:23,747 WARNING [train.py:1204] (2/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_140-181176-0_sp1.1 from training. Duration: 0.836375 2023-10-03 06:13:23,775 WARNING [train.py:1204] (2/4) Exclude cut with ID _14687_210_5854_1_1530703838259_3664889_115-239046-0_sp1.1 from training. Duration: 0.6090625 2023-10-03 06:13:25,118 WARNING [train.py:1204] (2/4) Exclude cut with ID _56423_210_5854_1_1534896061049_3165969_370-230614-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 06:13:25,157 WARNING [train.py:1204] (2/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_406-275649-0_sp0.9 from training. Duration: 0.4666875 2023-10-03 06:13:25,178 WARNING [train.py:1204] (2/4) Exclude cut with ID _15150_210_8341_1_1530752411803_7256384_22-138274-0_sp1.1 from training. Duration: 0.87275 2023-10-03 06:13:25,228 WARNING [train.py:1204] (2/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_125-125818-0 from training. Duration: 0.96 2023-10-03 06:13:29,186 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_4399_210_1874_1_1526705485_2400757_220-47974-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 06:13:32,554 WARNING [train.py:1204] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_308-161067-0_sp1.1 from training. Duration: 0.6 2023-10-03 06:13:35,351 WARNING [train.py:1204] (2/4) Exclude cut with ID _53489_210_6228_1_1534298357169_3835970_102-57645-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 06:13:38,145 WARNING [train.py:1204] (2/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_178-177582-0 from training. Duration: 0.74 2023-10-03 06:13:38,211 WARNING [train.py:1204] (2/4) Exclude cut with ID _14349_210_8341_1_1530615710818_3442470_170-336541-0_sp1.1 from training. Duration: 0.7818125 2023-10-03 06:13:39,563 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_521-214276-0_sp0.9 from training. Duration: 0.488875 2023-10-03 06:13:39,613 WARNING [train.py:1204] (2/4) Exclude cut with ID _66070_210_7640_1_1535441393096_7805310_489-292631-0 from training. Duration: 0.79 2023-10-03 06:13:46,398 WARNING [train.py:1204] (2/4) Exclude cut with ID _14069_210_5632_1_1530512692033_4803390_438-340023-0_sp1.1 from training. Duration: 0.936375 2023-10-03 06:13:46,490 WARNING [train.py:1204] (2/4) Exclude cut with ID _7852_210_5219_1_1528286471316_8139400_242-105336-0_sp1.1 from training. Duration: 0.6545625 2023-10-03 06:13:48,431 WARNING [train.py:1204] (2/4) Exclude cut with ID _3867_210_1883_1_1526641200575_4475830_272-215563-0 from training. Duration: 0.57 2023-10-03 06:13:48,439 WARNING [train.py:1204] (2/4) Exclude cut with ID _14368_210_4220_1_1530532714870_3595529_143-117421-0_sp0.9 from training. Duration: 0.6444375 2023-10-03 06:13:49,830 WARNING [train.py:1204] (2/4) Exclude cut with ID _9235_210_5219_1_1528891154253_8046120_30-326681-0_sp1.1 from training. Duration: 0.7545625 2023-10-03 06:13:51,211 WARNING [train.py:1204] (2/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_538-218448-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 06:13:52,020 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.1.encoder.layers.1.nonlin_attention.whiten1, num_groups=1, num_channels=192, metric=6.38 vs. limit=10.0 2023-10-03 06:13:54,299 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.conv_module2.balancer1.prob, batch_count=1166186.6666666667, ans=0.125 2023-10-03 06:13:55,420 WARNING [train.py:1204] (2/4) Exclude cut with ID _33640_210_2655_1_1533117605479_4855469_60-192970-0_sp1.1 from training. Duration: 0.963625 2023-10-03 06:13:55,427 WARNING [train.py:1204] (2/4) Exclude cut with ID _65585_210_12210_1_1535450451584_3764098_112-294301-0_sp1.1 from training. Duration: 0.8 2023-10-03 06:13:55,450 WARNING [train.py:1204] (2/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_643-37412-0_sp1.1 from training. Duration: 0.936375 2023-10-03 06:13:55,480 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_9302_210_2550_1_1528889962_4655416_638-301620-0_sp1.1 from training. Duration: 0.42725 2023-10-03 06:13:56,758 INFO [train.py:1046] (2/4) Epoch 33, batch 4950, loss[loss=0.1609, simple_loss=0.2523, pruned_loss=0.03479, over 24648.00 frames. ], tot_loss[loss=0.1613, simple_loss=0.2398, pruned_loss=0.04137, over 4708517.91 frames. ], batch size: 73, lr: 3.07e-03, grad_scale: 16.0 2023-10-03 06:13:56,819 WARNING [train.py:1204] (2/4) Exclude cut with ID _3867_210_1883_1_1526641200575_4475830_272-215563-0_sp1.1 from training. Duration: 0.5181875 2023-10-03 06:13:59,663 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_53679_210_4852_1_1534556726_4186141_358-154143-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 06:13:59,675 WARNING [train.py:1204] (2/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_841-341679-0_sp0.9 from training. Duration: 0.6444375 2023-10-03 06:14:01,135 WARNING [train.py:1204] (2/4) Exclude cut with ID _7160_210_1876_1_1527940923231_3703799_302-150728-0 from training. Duration: 0.78 2023-10-03 06:14:03,035 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_125-203105-0 from training. Duration: 0.8 2023-10-03 06:14:03,057 WARNING [train.py:1204] (2/4) Exclude cut with ID _5585_210_2667_1_1527418813324_8007340_89-332072-0_sp0.9 from training. Duration: 0.711125 2023-10-03 06:14:04,412 WARNING [train.py:1204] (2/4) Exclude cut with ID _3992_210_1747_1_1526644816031_3731060_36-147855-0 from training. Duration: 0.78 2023-10-03 06:14:04,429 WARNING [train.py:1204] (2/4) Exclude cut with ID _59256_210_5986_1_1535076085997_3589120_172-239045-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 06:14:04,437 WARNING [train.py:1204] (2/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_472-216663-0_sp1.1 from training. Duration: 0.836375 2023-10-03 06:14:04,495 WARNING [train.py:1204] (2/4) Exclude cut with ID _36512_210_6987_1_1533033143998_3126069_181-306500-0_sp1.1 from training. Duration: 0.863625 2023-10-03 06:14:04,515 WARNING [train.py:1204] (2/4) Exclude cut with ID _56524_210_9642_1_1534572011779_3619170_492-95008-0_sp1.1 from training. Duration: 0.97275 2023-10-03 06:14:07,286 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_33348_210_5632_1_1533086912_3324030_119-90326-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 06:14:07,337 WARNING [train.py:1204] (2/4) Exclude cut with ID _14368_210_4220_1_1530532714870_3595529_231-137892-0_sp1.1 from training. Duration: 0.9 2023-10-03 06:14:10,385 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8453_210_4211_1_1528502940_8562730_596-306820-0_sp1.1 from training. Duration: 0.8818125 2023-10-03 06:14:10,469 WARNING [train.py:1204] (2/4) Exclude cut with ID _27052_210_5986_1_1532329197187_3585009_50-242750-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 06:14:10,737 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.attention_skip_rate, batch_count=1166320.0, ans=0.0 2023-10-03 06:14:11,692 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.624e+02 1.885e+02 2.044e+02 2.302e+02 2.905e+02, threshold=4.089e+02, percent-clipped=0.0 2023-10-03 06:14:11,896 WARNING [train.py:1204] (2/4) Exclude cut with ID _13913_210_9994_1_1530579615187_3383897_358-98435-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 06:14:13,203 WARNING [train.py:1204] (2/4) Exclude cut with ID _27013_210_1389_1_1532437173240_3252649_399-229778-0_sp1.1 from training. Duration: 0.963625 2023-10-03 06:14:14,344 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.0.layers.0.self_attn1.whiten, num_groups=1, num_channels=192, metric=11.63 vs. limit=22.5 2023-10-03 06:14:15,215 WARNING [train.py:1204] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_185-174805-0_sp0.9 from training. Duration: 0.7555625 2023-10-03 06:14:19,975 WARNING [train.py:1204] (2/4) Exclude cut with ID _65585_210_12210_1_1535450451584_3764098_196-221014-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 06:14:21,357 WARNING [train.py:1204] (2/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_202-293523-0_sp1.1 from training. Duration: 0.6545625 2023-10-03 06:14:24,159 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8275_210_4198_1_1528455320_3892571_213-127634-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 06:14:24,200 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_40896_210_15710_1_1533433956_7100802_921-231678-0_sp1.1 from training. Duration: 0.97275 2023-10-03 06:14:25,627 WARNING [train.py:1204] (2/4) Exclude cut with ID _38020_210_5986_1_1533112252259_7006089_493-102745-0_sp1.1 from training. Duration: 0.8909375 2023-10-03 06:14:27,053 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6846_210_6753_1_1527850767_4295158_2-315073-0 from training. Duration: 0.88 2023-10-03 06:14:27,106 WARNING [train.py:1204] (2/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_249-281349-0 from training. Duration: 0.85 2023-10-03 06:14:29,046 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.4.encoder.layers.2.feed_forward1.out_whiten, num_groups=1, num_channels=384, metric=9.49 vs. limit=15.0 2023-10-03 06:14:29,801 WARNING [train.py:1204] (2/4) Exclude cut with ID _18513_210_9206_1_1531618215223_3862620_314-83429-0_sp1.1 from training. Duration: 0.97275 2023-10-03 06:14:31,243 WARNING [train.py:1204] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_215-202973-0_sp0.9 from training. Duration: 0.97775 2023-10-03 06:14:31,252 WARNING [train.py:1204] (2/4) Exclude cut with ID _7719_210_3525_1_1528456153163_4296960_88-250259-0_sp1.1 from training. Duration: 0.763625 2023-10-03 06:14:33,175 WARNING [train.py:1204] (2/4) Exclude cut with ID _7606_210_4915_1_1528894253277_5080690_301-10732-0_sp1.1 from training. Duration: 0.736375 2023-10-03 06:14:33,184 WARNING [train.py:1204] (2/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_818-11907-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 06:14:34,336 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_5959_210_1794_1_1527400400_3445726_412-44052-0_sp1.1 from training. Duration: 0.7 2023-10-03 06:14:35,689 WARNING [train.py:1204] (2/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_6-279195-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 06:14:37,167 WARNING [train.py:1204] (2/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_252-216674-0_sp0.9 from training. Duration: 0.6 2023-10-03 06:14:39,057 WARNING [train.py:1204] (2/4) Exclude cut with ID _14368_210_4220_1_1530532714870_3595529_144-3287-0_sp1.1 from training. Duration: 0.6090625 2023-10-03 06:14:40,639 WARNING [train.py:1204] (2/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_342-3879-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 06:14:40,691 WARNING [train.py:1204] (2/4) Exclude cut with ID _33640_210_2655_1_1533117605479_4855469_584-65630-0_sp1.1 from training. Duration: 0.97275 2023-10-03 06:14:41,945 WARNING [train.py:1204] (2/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_37-189262-0 from training. Duration: 0.98 2023-10-03 06:14:41,991 WARNING [train.py:1204] (2/4) Exclude cut with ID _9389_210_1664_1_1529217337301_5289269_379-128794-0_sp0.9 from training. Duration: 0.9444375 2023-10-03 06:14:43,751 WARNING [train.py:1204] (2/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_119-207162-0_sp1.1 from training. Duration: 0.6454375 2023-10-03 06:14:44,525 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.1.encoder.layers.1.feed_forward1.out_whiten, num_groups=1, num_channels=256, metric=7.11 vs. limit=15.0 2023-10-03 06:14:49,621 WARNING [train.py:1204] (2/4) Exclude cut with ID _2941_210_2577_1_1526199673150_2489759_200-333643-0_sp1.1 from training. Duration: 0.936375 2023-10-03 06:14:50,998 WARNING [train.py:1204] (2/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_479-125611-0_sp1.1 from training. Duration: 0.87275 2023-10-03 06:14:51,008 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_3475_210_2028_1_1526193578_3622569_64-297072-0_sp1.1 from training. Duration: 0.82725 2023-10-03 06:14:51,052 WARNING [train.py:1204] (2/4) Exclude cut with ID _53565_210_10635_1_1534337218335_3820259_139-346291-0_sp1.1 from training. Duration: 0.97275 2023-10-03 06:14:51,102 WARNING [train.py:1204] (2/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_588-125165-0_sp1.1 from training. Duration: 0.7545625 2023-10-03 06:14:52,432 WARNING [train.py:1204] (2/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_938-205135-0_sp1.1 from training. Duration: 0.7909375 2023-10-03 06:14:53,928 WARNING [train.py:1204] (2/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_216-48707-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 06:14:55,243 WARNING [train.py:1204] (2/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_477-255782-0_sp1.1 from training. Duration: 0.7454375 2023-10-03 06:14:55,278 WARNING [train.py:1204] (2/4) Exclude cut with ID _16909_210_5724_1_1531292079912_4118919_583-92842-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 06:14:56,667 WARNING [train.py:1204] (2/4) Exclude cut with ID _60860_210_8254_1_1534896015875_3557169_245-256666-0 from training. Duration: 0.97 2023-10-03 06:14:59,768 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.feed_forward3.hidden_balancer.prob, batch_count=1166520.0, ans=0.125 2023-10-03 06:15:01,472 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_45399_210_4852_1_1533624725_7579737_349-116173-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 06:15:04,254 WARNING [train.py:1204] (2/4) Exclude cut with ID _8699_210_2069_1_1528630346831_3960409_155-39766-0 from training. Duration: 0.78 2023-10-03 06:15:05,559 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_86-16881-0_sp1.1 from training. Duration: 0.52725 2023-10-03 06:15:10,300 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_14056_210_4163_1_1530604483_7572827_191-159384-0_sp1.1 from training. Duration: 0.97275 2023-10-03 06:15:11,584 INFO [train.py:1046] (2/4) Epoch 33, batch 5000, loss[loss=0.1435, simple_loss=0.2272, pruned_loss=0.02991, over 24294.00 frames. ], tot_loss[loss=0.1606, simple_loss=0.2388, pruned_loss=0.04121, over 4693395.69 frames. ], batch size: 61, lr: 3.07e-03, grad_scale: 16.0 2023-10-03 06:15:11,631 WARNING [train.py:1204] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_307-158462-0_sp1.1 from training. Duration: 0.62725 2023-10-03 06:15:13,021 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7544_210_6753_1_1528591327_1471426_33-346031-0 from training. Duration: 0.61 2023-10-03 06:15:14,359 WARNING [train.py:1204] (2/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_112-149213-0 from training. Duration: 0.95 2023-10-03 06:15:16,169 WARNING [train.py:1204] (2/4) Exclude cut with ID _55844_210_2346_1_1534471212569_7625707_925-191342-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 06:15:17,546 WARNING [train.py:1204] (2/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_87-278686-0 from training. Duration: 0.77 2023-10-03 06:15:17,575 WARNING [train.py:1204] (2/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_415-78792-0_sp1.1 from training. Duration: 0.736375 2023-10-03 06:15:17,585 WARNING [train.py:1204] (2/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_107-332398-0_sp0.9 from training. Duration: 0.8666875 2023-10-03 06:15:19,367 WARNING [train.py:1204] (2/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_72-251255-0 from training. Duration: 0.94 2023-10-03 06:15:19,397 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_431-227273-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 06:15:19,447 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7544_210_6753_1_1528587000_691468_87-178599-0_sp0.9 from training. Duration: 0.9666875 2023-10-03 06:15:20,808 WARNING [train.py:1204] (2/4) Exclude cut with ID _13916_210_9994_1_1530788341126_3763149_60-191556-0 from training. Duration: 0.61 2023-10-03 06:15:20,810 WARNING [train.py:1204] (2/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_746-164782-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 06:15:20,847 WARNING [train.py:1204] (2/4) Exclude cut with ID _33640_210_2655_1_1533117605479_4855469_596-21066-0_sp1.1 from training. Duration: 0.92725 2023-10-03 06:15:23,669 WARNING [train.py:1204] (2/4) Exclude cut with ID _40402_210_18219_1_1533522324115_2953150_320-14078-0 from training. Duration: 0.95 2023-10-03 06:15:23,703 WARNING [train.py:1204] (2/4) Exclude cut with ID _29877_210_6228_1_1532397791106_5276149_266-62291-0 from training. Duration: 0.87 2023-10-03 06:15:23,767 WARNING [train.py:1204] (2/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_176-56120-0_sp0.9 from training. Duration: 0.92225 2023-10-03 06:15:25,027 WARNING [train.py:1204] (2/4) Exclude cut with ID _6275_210_2084_1_1528110062213_7669049_91-26919-0 from training. Duration: 0.97 2023-10-03 06:15:25,034 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_224-322394-0_sp0.9 from training. Duration: 0.7333125 2023-10-03 06:15:25,078 WARNING [train.py:1204] (2/4) Exclude cut with ID _44982_210_1511_1_1534312820108_3726029_586-297059-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 06:15:25,113 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_196-289637-0_sp1.1 from training. Duration: 0.6818125 2023-10-03 06:15:26,425 WARNING [train.py:1204] (2/4) Exclude cut with ID _18513_210_9206_1_1531618215223_3862620_402-5957-0 from training. Duration: 0.99 2023-10-03 06:15:26,431 WARNING [train.py:1204] (2/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_81-300212-0 from training. Duration: 0.6 2023-10-03 06:15:27,949 WARNING [train.py:1204] (2/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_166-128941-0 from training. Duration: 0.54 2023-10-03 06:15:29,552 WARNING [train.py:1204] (2/4) Exclude cut with ID _58059_210_5854_1_1534901471149_3164808_57-309660-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 06:15:30,962 WARNING [train.py:1204] (2/4) Exclude cut with ID _60621_210_17071_1_1535013053530_3671739_73-155814-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 06:15:32,758 WARNING [train.py:1204] (2/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_726-270780-0 from training. Duration: 0.86 2023-10-03 06:15:32,912 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.conv_module2.balancer1.prob, batch_count=1166653.3333333333, ans=0.125 2023-10-03 06:15:34,051 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8853_210_3273_1_1528633899_818968_51-187685-0_sp1.1 from training. Duration: 0.62725 2023-10-03 06:15:35,481 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_315-212794-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 06:15:35,565 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_53373_210_5399_1_1534229471_4715695_314-122795-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 06:15:37,528 WARNING [train.py:1204] (2/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_187-221566-0_sp0.9 from training. Duration: 0.47775 2023-10-03 06:15:38,910 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_12868_210_6753_1_1530702479_3610564_133-564-0 from training. Duration: 0.79 2023-10-03 06:15:38,945 WARNING [train.py:1204] (2/4) Exclude cut with ID _55153_210_2253_1_1534384786677_3162096_255-228344-0_sp1.1 from training. Duration: 0.936375 2023-10-03 06:15:40,385 WARNING [train.py:1204] (2/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_383-165234-0_sp1.1 from training. Duration: 0.8545625 2023-10-03 06:15:45,906 WARNING [train.py:1204] (2/4) Exclude cut with ID IC0677W0056-334889 from training. Duration: 0.768 2023-10-03 06:15:49,371 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_14953_210_2151_1_1530701710_5616165_710-165400-0_sp0.9 from training. Duration: 0.9666875 2023-10-03 06:15:50,835 WARNING [train.py:1204] (2/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_355-236205-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 06:15:50,837 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_34450_210_9547_1_1533099443_4109050_503-246893-0_sp1.1 from training. Duration: 0.963625 2023-10-03 06:15:55,557 WARNING [train.py:1204] (2/4) Exclude cut with ID _43552_210_8913_1_1533726031059_3523429_25-120161-0 from training. Duration: 0.98 2023-10-03 06:15:55,587 WARNING [train.py:1204] (2/4) Exclude cut with ID _66112_210_10635_1_1535590236305_3801484_205-311930-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 06:15:56,195 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.3.encoder.layers.0.feed_forward1.out_whiten, num_groups=1, num_channels=512, metric=10.36 vs. limit=15.0 2023-10-03 06:15:56,838 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_48049_210_5632_1_1533864553_3519937_185-281218-0_sp1.1 from training. Duration: 0.92725 2023-10-03 06:15:56,884 WARNING [train.py:1204] (2/4) Exclude cut with ID _48782_210_12658_1_1534467701544_2300422_191-332415-0_sp1.1 from training. Duration: 0.97275 2023-10-03 06:15:58,326 WARNING [train.py:1204] (2/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_907-151398-0_sp1.1 from training. Duration: 0.336375 2023-10-03 06:15:58,370 WARNING [train.py:1204] (2/4) Exclude cut with ID _67499_210_17282_1_1535509805220_3692430_20-179346-0_sp1.1 from training. Duration: 0.8818125 2023-10-03 06:16:01,177 WARNING [train.py:1204] (2/4) Exclude cut with ID _14998_210_5219_1_1530705582080_7474970_441-91466-0_sp1.1 from training. Duration: 0.8818125 2023-10-03 06:16:01,245 WARNING [train.py:1204] (2/4) Exclude cut with ID _46681_210_9642_1_1533893005571_7603359_786-164007-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 06:16:07,233 WARNING [train.py:1204] (2/4) Exclude cut with ID _14970_210_15709_1_1530773543493_1715730_187-20259-0 from training. Duration: 0.89 2023-10-03 06:16:11,587 WARNING [train.py:1204] (2/4) Exclude cut with ID _12734_210_8341_1_1530183614296_3891929_383-339075-0_sp1.1 from training. Duration: 0.963625 2023-10-03 06:16:21,768 WARNING [train.py:1204] (2/4) Exclude cut with ID _37086_210_9038_1_1532999540891_7021250_834-322225-0_sp1.1 from training. Duration: 0.92725 2023-10-03 06:16:23,185 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_10987_210_4163_1_1529760555_4123902_56-290559-0_sp1.1 from training. Duration: 0.963625 2023-10-03 06:16:23,192 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_92-246270-0_sp0.9 from training. Duration: 0.9444375 2023-10-03 06:16:23,209 WARNING [train.py:1204] (2/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_56-13561-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 06:16:23,239 WARNING [train.py:1204] (2/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_393-57888-0_sp1.1 from training. Duration: 0.5818125 2023-10-03 06:16:25,062 WARNING [train.py:1204] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_168-32906-0_sp1.1 from training. Duration: 0.62725 2023-10-03 06:16:25,118 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_51225_210_3601_1_1534400634_2327383_127-207553-0_sp1.1 from training. Duration: 0.963625 2023-10-03 06:16:26,411 INFO [train.py:1046] (2/4) Epoch 33, batch 5050, loss[loss=0.1706, simple_loss=0.257, pruned_loss=0.04213, over 24400.00 frames. ], tot_loss[loss=0.1609, simple_loss=0.2394, pruned_loss=0.04122, over 4702990.59 frames. ], batch size: 77, lr: 3.07e-03, grad_scale: 16.0 2023-10-03 06:16:27,933 WARNING [train.py:1204] (2/4) Exclude cut with ID _58583_210_5236_1_1534726835266_3574249_494-203521-0_sp1.1 from training. Duration: 0.963625 2023-10-03 06:16:27,963 WARNING [train.py:1204] (2/4) Exclude cut with ID _49376_210_5986_1_1533985278732_3370740_354-344695-0 from training. Duration: 0.99 2023-10-03 06:16:29,492 WARNING [train.py:1204] (2/4) Exclude cut with ID _8021_210_7994_1_1528376451358_3653069_179-45206-0_sp1.1 from training. Duration: 0.7818125 2023-10-03 06:16:30,876 WARNING [train.py:1204] (2/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_742-154017-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 06:16:32,245 WARNING [train.py:1204] (2/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_222-198127-0_sp1.1 from training. Duration: 0.763625 2023-10-03 06:16:32,312 WARNING [train.py:1204] (2/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_235-307337-0 from training. Duration: 0.94 2023-10-03 06:16:35,505 WARNING [train.py:1204] (2/4) Exclude cut with ID _8393_210_6702_1_1528505860249_3457328_453-176500-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 06:16:35,552 WARNING [train.py:1204] (2/4) Exclude cut with ID _53565_210_10635_1_1534337218335_3820259_102-179528-0_sp1.1 from training. Duration: 0.97275 2023-10-03 06:16:35,734 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.conv_module1.balancer2.min_positive, batch_count=1166920.0, ans=0.05 2023-10-03 06:16:36,982 WARNING [train.py:1204] (2/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_43-206757-0_sp1.1 from training. Duration: 0.6818125 2023-10-03 06:16:38,287 WARNING [train.py:1204] (2/4) Exclude cut with ID _8394_210_6702_1_1528590504316_3540189_335-274780-0_sp1.1 from training. Duration: 0.7090625 2023-10-03 06:16:39,653 WARNING [train.py:1204] (2/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_128-296581-0_sp1.1 from training. Duration: 0.67275 2023-10-03 06:16:40,921 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.528e+02 1.953e+02 2.133e+02 2.408e+02 3.808e+02, threshold=4.267e+02, percent-clipped=0.0 2023-10-03 06:16:47,127 WARNING [train.py:1204] (2/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_111-112362-0 from training. Duration: 0.91 2023-10-03 06:16:48,436 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_5522_210_3273_1_1527249606_3909096_366-172508-0_sp1.1 from training. Duration: 0.6 2023-10-03 06:16:48,524 WARNING [train.py:1204] (2/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_47-167373-0_sp1.1 from training. Duration: 0.836375 2023-10-03 06:16:49,822 WARNING [train.py:1204] (2/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_41-234353-0 from training. Duration: 0.68 2023-10-03 06:16:49,865 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6291_210_3273_1_1527683649_1078498_82-221196-0_sp0.9 from training. Duration: 0.8444375 2023-10-03 06:16:51,260 WARNING [train.py:1204] (2/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_47-4814-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 06:16:51,290 WARNING [train.py:1204] (2/4) Exclude cut with ID _43541_210_10626_1_1533796233418_3635440_40-247993-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 06:16:52,635 WARNING [train.py:1204] (2/4) Exclude cut with ID _21989_210_5854_1_1531899214931_3271360_133-44759-0_sp0.9 from training. Duration: 0.8555625 2023-10-03 06:16:52,638 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_94-195801-0 from training. Duration: 0.94 2023-10-03 06:16:54,408 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_141-89113-0 from training. Duration: 0.86 2023-10-03 06:16:55,762 WARNING [train.py:1204] (2/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_683-284578-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 06:16:58,526 WARNING [train.py:1204] (2/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_6-214815-0_sp1.1 from training. Duration: 0.77275 2023-10-03 06:17:01,397 WARNING [train.py:1204] (2/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_556-212082-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 06:17:01,435 WARNING [train.py:1204] (2/4) Exclude cut with ID _44902_210_16187_1_1533888006062_3792078_149-67212-0 from training. Duration: 0.87 2023-10-03 06:17:02,910 WARNING [train.py:1204] (2/4) Exclude cut with ID _61148_210_6897_1_1535094022726_4217610_333-159440-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 06:17:06,189 WARNING [train.py:1204] (2/4) Exclude cut with ID _3120_210_3525_1_1526036143112_4072980_404-249994-0 from training. Duration: 0.99 2023-10-03 06:17:07,489 WARNING [train.py:1204] (2/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_234-260682-0_sp0.9 from training. Duration: 0.9333125 2023-10-03 06:17:07,516 WARNING [train.py:1204] (2/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_374-138971-0_sp1.1 from training. Duration: 0.8545625 2023-10-03 06:17:08,881 WARNING [train.py:1204] (2/4) Exclude cut with ID _53713_210_24116_1_1534314487762_7416751_743-302085-0_sp1.1 from training. Duration: 0.92725 2023-10-03 06:17:08,943 WARNING [train.py:1204] (2/4) Exclude cut with ID _18516_210_9206_1_1531704653206_3692406_198-334870-0_sp1.1 from training. Duration: 0.836375 2023-10-03 06:17:11,561 WARNING [train.py:1204] (2/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_395-73570-0_sp1.1 from training. Duration: 0.9 2023-10-03 06:17:14,787 WARNING [train.py:1204] (2/4) Exclude cut with ID _46683_210_9642_1_1533730913492_4591116_421-19547-0_sp1.1 from training. Duration: 0.936375 2023-10-03 06:17:14,846 WARNING [train.py:1204] (2/4) Exclude cut with ID _19896_210_1883_1_1531709022662_6649569_714-8668-0_sp1.1 from training. Duration: 0.963625 2023-10-03 06:17:14,873 WARNING [train.py:1204] (2/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_49-101072-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 06:17:16,116 WARNING [train.py:1204] (2/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_220-191080-0_sp1.1 from training. Duration: 0.8909375 2023-10-03 06:17:16,157 WARNING [train.py:1204] (2/4) Exclude cut with ID _3465_210_3525_1_1526175667060_3574049_62-335552-0 from training. Duration: 0.87 2023-10-03 06:17:17,523 WARNING [train.py:1204] (2/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_111-112362-0_sp1.1 from training. Duration: 0.82725 2023-10-03 06:17:18,904 WARNING [train.py:1204] (2/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_318-59097-0_sp0.9 from training. Duration: 0.8444375 2023-10-03 06:17:21,765 WARNING [train.py:1204] (2/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_98-255710-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 06:17:21,779 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9042W0003-515609 from training. Duration: 0.896 2023-10-03 06:17:21,788 WARNING [train.py:1204] (2/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_158-250116-0_sp0.9 from training. Duration: 0.688875 2023-10-03 06:17:23,148 WARNING [train.py:1204] (2/4) Exclude cut with ID _26750_210_5665_1_1532226678442_4063230_7-346885-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 06:17:24,508 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_37839_210_7234_1_1533542462_3689303_357-227078-0_sp1.1 from training. Duration: 0.963625 2023-10-03 06:17:24,544 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9052W0119-520904 from training. Duration: 0.896 2023-10-03 06:17:24,810 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.balancer1.prob, batch_count=1167186.6666666667, ans=0.125 2023-10-03 06:17:29,044 WARNING [train.py:1204] (2/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_249-281349-0_sp1.1 from training. Duration: 0.77275 2023-10-03 06:17:29,049 WARNING [train.py:1204] (2/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_147-293311-0 from training. Duration: 0.94 2023-10-03 06:17:29,050 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_158-21166-0_sp1.1 from training. Duration: 0.963625 2023-10-03 06:17:29,242 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.bypass.skip_rate, batch_count=1167186.6666666667, ans=0.04949747468305833 2023-10-03 06:17:31,862 WARNING [train.py:1204] (2/4) Exclude cut with ID _14100_210_8341_1_1530529320613_3678069_207-332194-0_sp1.1 from training. Duration: 0.92725 2023-10-03 06:17:33,225 WARNING [train.py:1204] (2/4) Exclude cut with ID _9389_210_1664_1_1529217337301_5289269_324-211031-0_sp1.1 from training. Duration: 0.963625 2023-10-03 06:17:33,245 WARNING [train.py:1204] (2/4) Exclude cut with ID _14603_210_4220_1_1530615688441_3097319_133-158308-0 from training. Duration: 0.84 2023-10-03 06:17:34,593 WARNING [train.py:1204] (2/4) Exclude cut with ID _13252_210_1925_1_1530671372047_3336909_166-314607-0 from training. Duration: 0.54 2023-10-03 06:17:38,015 WARNING [train.py:1204] (2/4) Exclude cut with ID _32704_210_8954_1_1533017488840_6747370_649-79538-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 06:17:38,032 WARNING [train.py:1204] (2/4) Exclude cut with ID _28394_210_8913_1_1532430049518_3650118_343-115667-0_sp1.1 from training. Duration: 0.97275 2023-10-03 06:17:38,091 WARNING [train.py:1204] (2/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_87-278686-0_sp0.9 from training. Duration: 0.8555625 2023-10-03 06:17:39,184 INFO [train.py:1046] (2/4) Epoch 33, batch 5100, loss[loss=0.2249, simple_loss=0.2901, pruned_loss=0.07987, over 19396.00 frames. ], tot_loss[loss=0.1617, simple_loss=0.2406, pruned_loss=0.04138, over 4703701.87 frames. ], batch size: 388, lr: 3.07e-03, grad_scale: 16.0 2023-10-03 06:17:40,659 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9109W0003-549749 from training. Duration: 0.768 2023-10-03 06:17:42,049 WARNING [train.py:1204] (2/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_126-29104-0_sp1.1 from training. Duration: 0.77275 2023-10-03 06:17:45,173 WARNING [train.py:1204] (2/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_325-113798-0 from training. Duration: 0.85 2023-10-03 06:17:45,242 WARNING [train.py:1204] (2/4) Exclude cut with ID _6694_210_4599_1_1527852637490_3965529_116-109398-0 from training. Duration: 0.73 2023-10-03 06:17:46,586 WARNING [train.py:1204] (2/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_46-57119-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 06:17:48,012 WARNING [train.py:1204] (2/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_546-119312-0_sp1.1 from training. Duration: 0.9 2023-10-03 06:17:49,490 WARNING [train.py:1204] (2/4) Exclude cut with ID _60057_210_10640_1_1534903065793_3333150_468-306939-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 06:17:49,533 WARNING [train.py:1204] (2/4) Exclude cut with ID _15150_210_8341_1_1530752411803_7256384_22-138274-0 from training. Duration: 0.96 2023-10-03 06:17:50,826 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_289-180904-0 from training. Duration: 0.76 2023-10-03 06:17:54,901 WARNING [train.py:1204] (2/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_64-133721-0_sp1.1 from training. Duration: 0.92725 2023-10-03 06:17:54,937 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_195-257512-0_sp1.1 from training. Duration: 0.6090625 2023-10-03 06:18:00,008 WARNING [train.py:1204] (2/4) Exclude cut with ID _66873_210_5517_1_1535454136506_4293370_704-101963-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 06:18:01,510 WARNING [train.py:1204] (2/4) Exclude cut with ID _4366_210_1388_1_1526792053633_3613819_231-211402-0 from training. Duration: 0.94 2023-10-03 06:18:01,990 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.5.encoder.layers.1.conv_module1.whiten, num_groups=1, num_channels=256, metric=3.27 vs. limit=15.0 2023-10-03 06:18:02,796 WARNING [train.py:1204] (2/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_679-190385-0_sp1.1 from training. Duration: 0.97275 2023-10-03 06:18:04,179 WARNING [train.py:1204] (2/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_298-81459-0_sp1.1 from training. Duration: 0.963625 2023-10-03 06:18:04,194 WARNING [train.py:1204] (2/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_658-282610-0_sp0.9 from training. Duration: 0.588875 2023-10-03 06:18:07,613 WARNING [train.py:1204] (2/4) Exclude cut with ID _57572_210_5517_1_1534921219624_3809484_196-283734-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 06:18:08,871 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_53071_210_16554_1_1534490405_2954066_294-198554-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 06:18:08,876 WARNING [train.py:1204] (2/4) Exclude cut with ID _4866_210_2577_1_1527404412284_4364659_352-157930-0 from training. Duration: 0.81 2023-10-03 06:18:12,010 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9231W0113-610139 from training. Duration: 0.896 2023-10-03 06:18:13,330 WARNING [train.py:1204] (2/4) Exclude cut with ID _24271_210_12658_1_1531987270022_7190519_638-171906-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 06:18:14,743 WARNING [train.py:1204] (2/4) Exclude cut with ID _14170_210_5219_1_1530525949868_4469650_272-249985-0 from training. Duration: 0.67 2023-10-03 06:18:14,753 WARNING [train.py:1204] (2/4) Exclude cut with ID _64086_210_7994_1_1535270417019_7435949_449-31567-0 from training. Duration: 0.97 2023-10-03 06:18:18,903 WARNING [train.py:1204] (2/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_109-282568-0_sp1.1 from training. Duration: 0.97275 2023-10-03 06:18:26,003 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_121-45934-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 06:18:27,589 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.feed_forward1.hidden_balancer.prob, batch_count=1167453.3333333333, ans=0.125 2023-10-03 06:18:28,771 WARNING [train.py:1204] (2/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_314-126819-0 from training. Duration: 0.5 2023-10-03 06:18:28,792 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9309W0192-640319 from training. Duration: 0.512 2023-10-03 06:18:28,801 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9310W0003-640649 from training. Duration: 0.896 2023-10-03 06:18:32,063 WARNING [train.py:1204] (2/4) Exclude cut with ID _7160_210_1876_1_1527940923231_3703799_120-150943-0 from training. Duration: 0.81 2023-10-03 06:18:32,065 WARNING [train.py:1204] (2/4) Exclude cut with ID _40380_210_9059_1_1533380594759_4187380_6-33508-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 06:18:33,437 WARNING [train.py:1204] (2/4) Exclude cut with ID _59202_210_26984_1_1534816810612_3549174_137-318710-0 from training. Duration: 0.89 2023-10-03 06:18:37,777 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_435-313305-0 from training. Duration: 0.71 2023-10-03 06:18:39,791 WARNING [train.py:1204] (2/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_91-212486-0_sp1.1 from training. Duration: 0.6181875 2023-10-03 06:18:41,100 WARNING [train.py:1204] (2/4) Exclude cut with ID _14603_210_4220_1_1530615688441_3097319_133-158308-0_sp1.1 from training. Duration: 0.763625 2023-10-03 06:18:42,655 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.feed_forward1.hidden_balancer.prob, batch_count=1167520.0, ans=0.125 2023-10-03 06:18:43,770 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_99-251314-0 from training. Duration: 0.87 2023-10-03 06:18:43,914 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_365-217119-0_sp1.1 from training. Duration: 0.57275 2023-10-03 06:18:45,801 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_3475_210_2028_1_1526193578_3622569_377-166133-0 from training. Duration: 0.86 2023-10-03 06:18:49,196 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.3.encoder.layers.2.conv_module2.whiten, num_groups=1, num_channels=512, metric=4.51 vs. limit=15.0 2023-10-03 06:18:50,008 WARNING [train.py:1204] (2/4) Exclude cut with ID _13916_210_9994_1_1530788341126_3763149_116-21034-0_sp1.1 from training. Duration: 0.87275 2023-10-03 06:18:50,024 WARNING [train.py:1204] (2/4) Exclude cut with ID _64220_210_9527_1_1535250151772_4875259_142-216831-0_sp1.1 from training. Duration: 0.936375 2023-10-03 06:18:50,024 WARNING [train.py:1204] (2/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_490-207861-0_sp1.1 from training. Duration: 0.8909375 2023-10-03 06:18:51,376 WARNING [train.py:1204] (2/4) Exclude cut with ID _41314_210_12651_1_1533459476543_3766460_87-272068-0_sp0.9 from training. Duration: 0.92225 2023-10-03 06:18:51,396 WARNING [train.py:1204] (2/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_18-46756-0_sp1.1 from training. Duration: 0.7181875 2023-10-03 06:18:52,743 INFO [train.py:1046] (2/4) Epoch 33, batch 5150, loss[loss=0.1799, simple_loss=0.2498, pruned_loss=0.05505, over 23841.00 frames. ], tot_loss[loss=0.1619, simple_loss=0.241, pruned_loss=0.04143, over 4710925.08 frames. ], batch size: 195, lr: 3.07e-03, grad_scale: 8.0 2023-10-03 06:18:52,787 WARNING [train.py:1204] (2/4) Exclude cut with ID _39411_210_5986_1_1533538825030_3343649_98-311335-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 06:18:54,099 WARNING [train.py:1204] (2/4) Exclude cut with ID _21878_210_3525_1_1531738772002_7264330_113-18163-0 from training. Duration: 0.89 2023-10-03 06:18:54,102 WARNING [train.py:1204] (2/4) Exclude cut with ID _6653_210_2629_1_1527919172441_6732679_1103-56445-0 from training. Duration: 0.73 2023-10-03 06:18:55,469 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_3474_210_2028_1_1526188513_4825246_145-309131-0 from training. Duration: 0.74 2023-10-03 06:18:55,488 WARNING [train.py:1204] (2/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_120-211630-0_sp1.1 from training. Duration: 0.836375 2023-10-03 06:18:55,506 WARNING [train.py:1204] (2/4) Exclude cut with ID _8030_210_7497_1_1528444886781_3531089_32-31837-0 from training. Duration: 0.97 2023-10-03 06:18:58,076 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_19949_210_2161_1_1531706337_928079_8-160956-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 06:18:58,133 WARNING [train.py:1204] (2/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_321-215742-0_sp1.1 from training. Duration: 0.4545625 2023-10-03 06:18:59,770 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_583-124650-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 06:19:01,620 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_30434_210_13049_1_1532519384_981746_164-182755-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 06:19:06,400 WARNING [train.py:1204] (2/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_339-104791-0_sp1.1 from training. Duration: 0.4454375 2023-10-03 06:19:06,418 WARNING [train.py:1204] (2/4) Exclude cut with ID _15026_210_8750_1_1530777602316_3291850_211-195043-0 from training. Duration: 0.8 2023-10-03 06:19:07,727 WARNING [train.py:1204] (2/4) Exclude cut with ID _29877_210_6228_1_1532397791106_5276149_487-182647-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 06:19:08,863 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.579e+02 1.881e+02 2.011e+02 2.161e+02 3.119e+02, threshold=4.022e+02, percent-clipped=0.0 2023-10-03 06:19:08,974 WARNING [train.py:1204] (2/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_127-566-0_sp0.9 from training. Duration: 0.9333125 2023-10-03 06:19:09,119 WARNING [train.py:1204] (2/4) Exclude cut with ID _8263_210_1664_1_1528439452891_970220_68-181805-0_sp0.9 from training. Duration: 0.711125 2023-10-03 06:19:09,120 WARNING [train.py:1204] (2/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_504-72513-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 06:19:09,129 WARNING [train.py:1204] (2/4) Exclude cut with ID _64639_210_5816_1_1535367619437_3528000_324-265233-0_sp1.1 from training. Duration: 0.963625 2023-10-03 06:19:10,352 WARNING [train.py:1204] (2/4) Exclude cut with ID _6456_210_1534_1_1527681634212_3181049_215-309359-0_sp0.9 from training. Duration: 0.97775 2023-10-03 06:19:10,356 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_115-233929-0_sp1.1 from training. Duration: 0.6545625 2023-10-03 06:19:10,411 WARNING [train.py:1204] (2/4) Exclude cut with ID _14087_210_2253_1_1530529224512_3014029_219-224445-0 from training. Duration: 0.84 2023-10-03 06:19:13,561 WARNING [train.py:1204] (2/4) Exclude cut with ID _14867_210_1968_1_1530684179890_3846969_413-29292-0_sp1.1 from training. Duration: 0.8181875 2023-10-03 06:19:13,611 WARNING [train.py:1204] (2/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_1012-279473-0_sp0.9 from training. Duration: 0.7444375 2023-10-03 06:19:15,141 WARNING [train.py:1204] (2/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_605-133395-0_sp0.9 from training. Duration: 0.7333125 2023-10-03 06:19:16,370 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_89-35748-0 from training. Duration: 0.79 2023-10-03 06:19:17,826 WARNING [train.py:1204] (2/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_476-272757-0_sp1.1 from training. Duration: 0.8454375 2023-10-03 06:19:21,977 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.ff3_skip_rate, batch_count=1167720.0, ans=0.0 2023-10-03 06:19:23,227 WARNING [train.py:1204] (2/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_449-15508-0_sp0.9 from training. Duration: 0.911125 2023-10-03 06:19:24,592 WARNING [train.py:1204] (2/4) Exclude cut with ID _29345_210_4381_1_1532689168263_3860169_198-174328-0 from training. Duration: 0.91 2023-10-03 06:19:27,326 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_15517_210_6753_1_1530952293_1664795_125-158157-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 06:19:35,546 WARNING [train.py:1204] (2/4) Exclude cut with ID _25565_210_12392_1_1532423009382_3952098_420-113180-0_sp1.1 from training. Duration: 0.963625 2023-10-03 06:19:36,965 WARNING [train.py:1204] (2/4) Exclude cut with ID _51471_210_1889_1_1534399391390_3094250_236-146773-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 06:19:39,856 WARNING [train.py:1204] (2/4) Exclude cut with ID _30039_210_13270_1_1532484612900_2725150_175-35012-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 06:19:41,135 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_23850_210_4238_1_1532232597_1632433_99-275843-0_sp1.1 from training. Duration: 0.936375 2023-10-03 06:19:43,119 WARNING [train.py:1204] (2/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_658-282610-0 from training. Duration: 0.53 2023-10-03 06:19:47,698 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_37818_210_4238_1_1533620910_4720083_234-65934-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 06:19:49,137 WARNING [train.py:1204] (2/4) Exclude cut with ID _3067_210_2667_1_1526212974640_4503168_459-173867-0_sp1.1 from training. Duration: 0.8 2023-10-03 06:19:49,162 WARNING [train.py:1204] (2/4) Exclude cut with ID _8696_210_1876_1_1528545632440_3345058_209-54069-0_sp0.9 from training. Duration: 0.7444375 2023-10-03 06:19:53,275 WARNING [train.py:1204] (2/4) Exclude cut with ID _62574_210_2655_1_1535180407058_3933249_420-236465-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 06:19:53,356 WARNING [train.py:1204] (2/4) Exclude cut with ID _46681_210_9642_1_1533893005571_7603359_333-4042-0_sp1.1 from training. Duration: 0.936375 2023-10-03 06:19:54,699 WARNING [train.py:1204] (2/4) Exclude cut with ID _3541_210_2477_1_1526475408709_3263973_377-296839-0 from training. Duration: 0.55 2023-10-03 06:19:58,861 WARNING [train.py:1204] (2/4) Exclude cut with ID _43997_210_4525_1_1533538632555_5189329_426-92599-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 06:19:58,959 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_43-71989-0_sp1.1 from training. Duration: 0.5181875 2023-10-03 06:20:00,381 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_2009_210_2071_1_1525481134_4588937_140-345530-0_sp1.1 from training. Duration: 0.963625 2023-10-03 06:20:01,680 WARNING [train.py:1204] (2/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_138-332773-0_sp0.9 from training. Duration: 0.9 2023-10-03 06:20:01,757 WARNING [train.py:1204] (2/4) Exclude cut with ID _13942_210_9988_1_1530604906036_3733360_499-55676-0_sp0.9 from training. Duration: 0.688875 2023-10-03 06:20:03,472 WARNING [train.py:1204] (2/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_259-34038-0_sp1.1 from training. Duration: 0.67275 2023-10-03 06:20:03,481 WARNING [train.py:1204] (2/4) Exclude cut with ID _3120_210_3525_1_1526036143112_4072980_404-249994-0_sp1.1 from training. Duration: 0.9 2023-10-03 06:20:03,521 WARNING [train.py:1204] (2/4) Exclude cut with ID _40402_210_18219_1_1533522324115_2953150_222-108810-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 06:20:06,126 INFO [train.py:1046] (2/4) Epoch 33, batch 5200, loss[loss=0.1534, simple_loss=0.2276, pruned_loss=0.03963, over 23762.00 frames. ], tot_loss[loss=0.1619, simple_loss=0.2407, pruned_loss=0.04152, over 4706986.00 frames. ], batch size: 212, lr: 3.07e-03, grad_scale: 16.0 2023-10-03 06:20:06,438 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.ff3_skip_rate, batch_count=1167920.0, ans=0.0 2023-10-03 06:20:07,470 WARNING [train.py:1204] (2/4) Exclude cut with ID _22083_210_8254_1_1531702862439_3678970_49-234546-0_sp0.9 from training. Duration: 0.92225 2023-10-03 06:20:09,390 WARNING [train.py:1204] (2/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_199-295611-0_sp0.9 from training. Duration: 0.611125 2023-10-03 06:20:12,182 WARNING [train.py:1204] (2/4) Exclude cut with ID _43552_210_8913_1_1533726031059_3523429_43-192945-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 06:20:16,844 WARNING [train.py:1204] (2/4) Exclude cut with ID _14224_210_4381_1_1530529062037_4264010_364-22817-0 from training. Duration: 0.71 2023-10-03 06:20:16,926 WARNING [train.py:1204] (2/4) Exclude cut with ID _14922_210_6228_1_1530707435273_3693310_388-129552-0_sp1.1 from training. Duration: 0.87275 2023-10-03 06:20:17,136 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.conv_module2.balancer2.prob, batch_count=1167920.0, ans=0.125 2023-10-03 06:20:18,272 WARNING [train.py:1204] (2/4) Exclude cut with ID _6537_210_1883_1_1527987559710_3597129_190-280227-0_sp1.1 from training. Duration: 0.97275 2023-10-03 06:20:19,819 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=1167986.6666666667, ans=0.1 2023-10-03 06:20:21,015 WARNING [train.py:1204] (2/4) Exclude cut with ID _55844_210_2346_1_1534471212569_7625707_398-56897-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 06:20:22,388 WARNING [train.py:1204] (2/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_48-182155-0_sp1.1 from training. Duration: 0.7909375 2023-10-03 06:20:22,417 WARNING [train.py:1204] (2/4) Exclude cut with ID _29529_210_7234_1_1532393961455_3977460_582-18084-0_sp1.1 from training. Duration: 0.97275 2023-10-03 06:20:23,924 WARNING [train.py:1204] (2/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_912-282412-0 from training. Duration: 0.49 2023-10-03 06:20:25,368 WARNING [train.py:1204] (2/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_332-256168-0_sp0.9 from training. Duration: 0.8333125 2023-10-03 06:20:25,425 WARNING [train.py:1204] (2/4) Exclude cut with ID _64220_210_9527_1_1535250151772_4875259_55-250624-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 06:20:28,104 WARNING [train.py:1204] (2/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_264-108685-0 from training. Duration: 0.55 2023-10-03 06:20:29,575 WARNING [train.py:1204] (2/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_613-232856-0_sp0.9 from training. Duration: 0.6 2023-10-03 06:20:32,187 WARNING [train.py:1204] (2/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_68-212801-0_sp1.1 from training. Duration: 0.663625 2023-10-03 06:20:32,246 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6290_210_3273_1_1527595224_1282787_115-87095-0 from training. Duration: 0.55 2023-10-03 06:20:32,409 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.self_attn_weights.pos_emb_skip_rate, batch_count=1167986.6666666667, ans=0.0 2023-10-03 06:20:34,127 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_3080_210_2071_1_1526086102_4365166_312-8726-0 from training. Duration: 0.91 2023-10-03 06:20:34,393 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.conv_skip_rate, batch_count=1168053.3333333333, ans=0.0 2023-10-03 06:20:36,775 WARNING [train.py:1204] (2/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_105-63309-0 from training. Duration: 0.98 2023-10-03 06:20:38,080 WARNING [train.py:1204] (2/4) Exclude cut with ID _2941_210_2577_1_1526199673150_2489759_201-160779-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 06:20:38,084 WARNING [train.py:1204] (2/4) Exclude cut with ID ID0406W0226-885434 from training. Duration: 0.997 2023-10-03 06:20:38,090 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_17292_210_6753_1_1531125388_2943734_353-328201-0_sp1.1 from training. Duration: 0.97275 2023-10-03 06:20:38,213 WARNING [train.py:1204] (2/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_572-96029-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 06:20:39,526 WARNING [train.py:1204] (2/4) Exclude cut with ID _14970_210_15709_1_1530775547361_1966965_6-23328-0_sp0.9 from training. Duration: 0.9555625 2023-10-03 06:20:39,573 WARNING [train.py:1204] (2/4) Exclude cut with ID _45012_210_12651_1_1533608435136_5371380_240-37101-0 from training. Duration: 0.93 2023-10-03 06:20:41,632 WARNING [train.py:1204] (2/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_685-90183-0_sp1.1 from training. Duration: 0.936375 2023-10-03 06:20:44,961 WARNING [train.py:1204] (2/4) Exclude cut with ID _13252_210_1925_1_1530671372047_3336909_41-22245-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 06:20:46,440 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_15126_210_6753_1_1530778677_2591185_120-277707-0 from training. Duration: 0.8 2023-10-03 06:20:46,476 WARNING [train.py:1204] (2/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_310-26186-0 from training. Duration: 0.7 2023-10-03 06:20:46,491 WARNING [train.py:1204] (2/4) Exclude cut with ID _14998_210_5219_1_1530705582080_7474970_787-294416-0 from training. Duration: 0.82 2023-10-03 06:20:50,807 WARNING [train.py:1204] (2/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_795-33252-0 from training. Duration: 0.37 2023-10-03 06:20:52,060 WARNING [train.py:1204] (2/4) Exclude cut with ID _7537_210_2084_1_1528502395431_3924359_113-199933-0_sp0.9 from training. Duration: 0.6555625 2023-10-03 06:20:52,370 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.conv_skip_rate, batch_count=1168120.0, ans=0.0 2023-10-03 06:20:56,267 WARNING [train.py:1204] (2/4) Exclude cut with ID _3465_210_3525_1_1526175667060_3574049_308-231735-0_sp0.9 from training. Duration: 0.811125 2023-10-03 06:20:57,533 WARNING [train.py:1204] (2/4) Exclude cut with ID _19504_210_9994_1_1531648863242_3325220_354-296765-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 06:20:58,884 WARNING [train.py:1204] (2/4) Exclude cut with ID _5409_210_1388_1_1527292850189_5865191_178-127970-0 from training. Duration: 0.57 2023-10-03 06:20:58,940 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_1466-248056-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 06:20:58,957 WARNING [train.py:1204] (2/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_535-14410-0_sp0.9 from training. Duration: 0.52225 2023-10-03 06:20:58,960 WARNING [train.py:1204] (2/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_747-129941-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 06:20:58,992 WARNING [train.py:1204] (2/4) Exclude cut with ID _15012_210_1968_1_1530856820429_5095350_688-178849-0_sp1.1 from training. Duration: 0.5818125 2023-10-03 06:21:01,848 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.ff2_skip_rate, batch_count=1168120.0, ans=0.0 2023-10-03 06:21:01,892 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.conv_module2.balancer1.prob, batch_count=1168120.0, ans=0.125 2023-10-03 06:21:03,075 WARNING [train.py:1204] (2/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_356-45054-0_sp1.1 from training. Duration: 0.7818125 2023-10-03 06:21:04,474 WARNING [train.py:1204] (2/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_146-276944-0_sp0.9 from training. Duration: 0.92225 2023-10-03 06:21:09,191 WARNING [train.py:1204] (2/4) Exclude cut with ID _39204_210_4220_1_1533880547368_3191120_346-180306-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 06:21:10,427 WARNING [train.py:1204] (2/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_641-158017-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 06:21:10,427 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_19952_210_2161_1_1531875469_3851750_274-42137-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 06:21:15,672 WARNING [train.py:1204] (2/4) Exclude cut with ID _60804_210_9642_1_1534831199859_7268299_87-327819-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 06:21:15,765 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_9302_210_2550_1_1528889962_4655416_638-301620-0 from training. Duration: 0.47 2023-10-03 06:21:17,146 WARNING [train.py:1204] (2/4) Exclude cut with ID _44304_210_15374_1_1533639352765_4613129_302-22084-0_sp1.1 from training. Duration: 0.7818125 2023-10-03 06:21:17,163 WARNING [train.py:1204] (2/4) Exclude cut with ID _18513_210_9206_1_1531618215223_3862620_177-227063-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 06:21:17,364 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.balancer1.prob, batch_count=1168186.6666666667, ans=0.125 2023-10-03 06:21:18,472 WARNING [train.py:1204] (2/4) Exclude cut with ID _41326_210_5854_1_1533778076509_3492080_510-45444-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 06:21:19,724 INFO [train.py:1046] (2/4) Epoch 33, batch 5250, loss[loss=0.1561, simple_loss=0.2323, pruned_loss=0.03995, over 23348.00 frames. ], tot_loss[loss=0.1609, simple_loss=0.2392, pruned_loss=0.0413, over 4708603.96 frames. ], batch size: 105, lr: 3.07e-03, grad_scale: 8.0 2023-10-03 06:21:19,802 WARNING [train.py:1204] (2/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_76-311963-0_sp1.1 from training. Duration: 0.636375 2023-10-03 06:21:19,902 WARNING [train.py:1204] (2/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_156-302675-0_sp0.9 from training. Duration: 0.611125 2023-10-03 06:21:22,712 WARNING [train.py:1204] (2/4) Exclude cut with ID _53489_210_6228_1_1534298357169_3835970_367-148411-0_sp1.1 from training. Duration: 0.936375 2023-10-03 06:21:25,561 WARNING [train.py:1204] (2/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_389-144525-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 06:21:26,824 WARNING [train.py:1204] (2/4) Exclude cut with ID _14069_210_5632_1_1530512692033_4803390_158-238427-0_sp1.1 from training. Duration: 0.963625 2023-10-03 06:21:28,062 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_486-151131-0_sp0.9 from training. Duration: 0.9444375 2023-10-03 06:21:29,706 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.feed_forward2.hidden_balancer.prob, batch_count=1168253.3333333333, ans=0.125 2023-10-03 06:21:32,310 WARNING [train.py:1204] (2/4) Exclude cut with ID _39846_210_12651_1_1533603651992_1318030_188-71831-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 06:21:33,711 WARNING [train.py:1204] (2/4) Exclude cut with ID _39411_210_5986_1_1533538825030_3343649_173-238248-0_sp1.1 from training. Duration: 0.7545625 2023-10-03 06:21:35,177 WARNING [train.py:1204] (2/4) Exclude cut with ID _20684_210_6917_1_1531567751646_4203930_469-49081-0_sp1.1 from training. Duration: 0.92725 2023-10-03 06:21:36,477 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.546e+02 1.883e+02 2.112e+02 2.383e+02 3.529e+02, threshold=4.224e+02, percent-clipped=0.0 2023-10-03 06:21:36,588 WARNING [train.py:1204] (2/4) Exclude cut with ID _8833_210_5219_1_1528592665210_7553059_611-80313-0_sp1.1 from training. Duration: 0.5818125 2023-10-03 06:21:40,517 WARNING [train.py:1204] (2/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_440-141811-0 from training. Duration: 0.95 2023-10-03 06:21:40,531 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_41211_210_2544_1_1533348859_3702910_236-223652-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 06:21:41,961 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_40222_210_6228_1_1533385298_4481939_220-163998-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 06:21:45,200 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.conv_module1.balancer2.min_abs, batch_count=1168320.0, ans=0.5 2023-10-03 06:21:58,502 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.self_attn_weights.pos_emb_skip_rate, batch_count=1168386.6666666667, ans=0.0 2023-10-03 06:22:15,417 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.ff2_skip_rate, batch_count=1168520.0, ans=0.0 2023-10-03 06:22:19,389 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.conv_module1.balancer1.prob, batch_count=1168520.0, ans=0.125 2023-10-03 06:22:27,876 INFO [train.py:1046] (2/4) Epoch 33, batch 5300, loss[loss=0.1592, simple_loss=0.2415, pruned_loss=0.03847, over 24367.00 frames. ], tot_loss[loss=0.1604, simple_loss=0.2389, pruned_loss=0.04093, over 4721977.26 frames. ], batch size: 77, lr: 3.07e-03, grad_scale: 8.0 2023-10-03 06:22:42,496 WARNING [train.py:1204] (2/4) Exclude cut with ID _14629_210_15709_1_1530604298182_956899_6-232695-0_sp1.1 from training. Duration: 0.8545625 2023-10-03 06:22:42,516 WARNING [train.py:1204] (2/4) Exclude cut with ID _13921_210_9994_1_1530841782272_3503520_327-69677-0 from training. Duration: 0.9 2023-10-03 06:22:42,542 WARNING [train.py:1204] (2/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_372-298093-0 from training. Duration: 0.8 2023-10-03 06:22:42,555 WARNING [train.py:1204] (2/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_515-139111-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 06:22:42,794 WARNING [train.py:1204] (2/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_85-65056-0_sp1.1 from training. Duration: 0.87275 2023-10-03 06:22:42,837 WARNING [train.py:1204] (2/4) Exclude cut with ID _15113_210_8254_1_1530842438579_3716279_209-54533-0_sp1.1 from training. Duration: 0.87275 2023-10-03 06:22:42,887 WARNING [train.py:1204] (2/4) Exclude cut with ID _70447_210_12392_1_1535972499300_3160420_123-312838-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 06:22:42,915 WARNING [train.py:1204] (2/4) Exclude cut with ID _60113_210_16271_1_1535164212481_4386079_70-63596-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 06:22:42,944 WARNING [train.py:1204] (2/4) Exclude cut with ID _14632_210_9988_1_1530691279392_3923200_225-188509-0_sp1.1 from training. Duration: 0.963625 2023-10-03 06:22:43,346 WARNING [train.py:1204] (2/4) Exclude cut with ID _34734_210_4525_1_1533365977542_4448720_584-180532-0_sp1.1 from training. Duration: 0.97275 2023-10-03 06:22:43,364 WARNING [train.py:1204] (2/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_351-294650-0_sp1.1 from training. Duration: 0.57275 2023-10-03 06:22:43,686 WARNING [train.py:1204] (2/4) Exclude cut with ID _13252_210_1925_1_1530671372047_3336909_263-168388-0_sp0.9 from training. Duration: 0.9555625 2023-10-03 06:22:43,725 WARNING [train.py:1204] (2/4) Exclude cut with ID _22083_210_8254_1_1531702862439_3678970_201-85738-0 from training. Duration: 0.9 2023-10-03 06:22:43,829 WARNING [train.py:1204] (2/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_237-58433-0 from training. Duration: 0.74 2023-10-03 06:22:43,864 WARNING [train.py:1204] (2/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_262-250826-0 from training. Duration: 0.99 2023-10-03 06:22:43,974 WARNING [train.py:1204] (2/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_927-43827-0_sp0.9 from training. Duration: 0.82225 2023-10-03 06:22:44,006 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_2821_210_2715_1_1526108385_3763560_73-296673-0 from training. Duration: 0.83 2023-10-03 06:22:44,082 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6287_210_3273_1_1527509506_2653716_182-199887-0 from training. Duration: 0.51 2023-10-03 06:22:44,229 WARNING [train.py:1204] (2/4) Exclude cut with ID _70519_210_4163_1_1535961595994_7458230_899-80223-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 06:22:44,490 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_210-234949-0_sp1.1 from training. Duration: 0.97275 2023-10-03 06:22:44,529 WARNING [train.py:1204] (2/4) Exclude cut with ID _30196_210_1664_1_1533708496522_7074140_768-278176-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 06:22:44,610 WARNING [train.py:1204] (2/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_315-128283-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 06:22:44,693 WARNING [train.py:1204] (2/4) Exclude cut with ID _14886_210_4220_1_1530702020355_3516190_330-323941-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 06:22:44,947 WARNING [train.py:1204] (2/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_264-348896-0_sp1.1 from training. Duration: 0.77275 2023-10-03 06:22:44,970 WARNING [train.py:1204] (2/4) Exclude cut with ID _37086_210_9038_1_1532999540891_7021250_433-10563-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 06:22:45,006 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_53638_210_21581_1_1534297319_5808449_885-80172-0_sp1.1 from training. Duration: 0.87275 2023-10-03 06:22:45,101 WARNING [train.py:1204] (2/4) Exclude cut with ID _14623_210_1925_1_1530773354962_3632609_157-218771-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 06:22:45,112 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_1368-78165-0_sp1.1 from training. Duration: 0.97275 2023-10-03 06:22:45,116 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_309-65177-0_sp0.9 from training. Duration: 0.97775 2023-10-03 06:22:45,122 WARNING [train.py:1204] (2/4) Exclude cut with ID _21632_210_2253_1_1531994001765_3744140_291-138812-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 06:22:45,136 WARNING [train.py:1204] (2/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_669-93163-0_sp1.1 from training. Duration: 0.8909375 2023-10-03 06:22:46,083 WARNING [train.py:1204] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_292-215346-0 from training. Duration: 0.97 2023-10-03 06:22:46,110 WARNING [train.py:1204] (2/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_286-222073-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 06:22:46,408 WARNING [train.py:1204] (2/4) Exclude cut with ID _59256_210_5986_1_1535076085997_3589120_121-237386-0_sp1.1 from training. Duration: 0.87275 2023-10-03 06:22:46,423 WARNING [train.py:1204] (2/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_96-320736-0 from training. Duration: 0.86 2023-10-03 06:22:46,433 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_128-202104-0 from training. Duration: 0.63 2023-10-03 06:22:46,520 WARNING [train.py:1204] (2/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_242-3472-0_sp0.9 from training. Duration: 0.811125 2023-10-03 06:22:46,564 WARNING [train.py:1204] (2/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_154-74182-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 06:22:46,568 WARNING [train.py:1204] (2/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_187-221566-0 from training. Duration: 0.43 2023-10-03 06:22:46,732 WARNING [train.py:1204] (2/4) Exclude cut with ID _6194_210_1534_1_1527511826160_3941194_14-282876-0 from training. Duration: 0.71 2023-10-03 06:22:46,774 WARNING [train.py:1204] (2/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_660-211088-0_sp1.1 from training. Duration: 0.563625 2023-10-03 06:22:47,113 WARNING [train.py:1204] (2/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_526-62079-0_sp1.1 from training. Duration: 0.6090625 2023-10-03 06:22:47,259 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_92-246270-0_sp1.1 from training. Duration: 0.77275 2023-10-03 06:22:47,344 WARNING [train.py:1204] (2/4) Exclude cut with ID IC0287W0023-142350 from training. Duration: 0.956 2023-10-03 06:22:47,416 WARNING [train.py:1204] (2/4) Exclude cut with ID _58605_210_12210_1_1534723195939_3904259_48-269402-0 from training. Duration: 0.96 2023-10-03 06:22:47,430 WARNING [train.py:1204] (2/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_416-257730-0_sp0.9 from training. Duration: 0.72225 2023-10-03 06:22:47,501 WARNING [train.py:1204] (2/4) Exclude cut with ID _7877_210_6014_1_1528286358099_3750136_185-17516-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 06:22:47,603 WARNING [train.py:1204] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_23-189234-0 from training. Duration: 0.83 2023-10-03 06:22:47,619 WARNING [train.py:1204] (2/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_237-89684-0 from training. Duration: 0.96 2023-10-03 06:22:48,158 WARNING [train.py:1204] (2/4) Exclude cut with ID _13916_210_9994_1_1530788341126_3763149_116-21034-0 from training. Duration: 0.96 2023-10-03 06:22:48,315 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_3133_210_2087_1_1526106446_2324690_246-125000-0_sp1.1 from training. Duration: 0.563625 2023-10-03 06:22:54,399 INFO [train.py:1046] (2/4) Epoch 34, batch 0, loss[loss=0.1596, simple_loss=0.2484, pruned_loss=0.03544, over 24421.00 frames. ], tot_loss[loss=0.1596, simple_loss=0.2484, pruned_loss=0.03544, over 24421.00 frames. ], batch size: 69, lr: 3.02e-03, grad_scale: 16.0 2023-10-03 06:22:54,400 INFO [train.py:1069] (2/4) Computing validation loss 2023-10-03 06:23:07,169 INFO [train.py:1078] (2/4) Epoch 34, validation: loss=0.3345, simple_loss=0.2716, pruned_loss=0.1987, over 1125622.00 frames. 2023-10-03 06:23:07,170 INFO [train.py:1079] (2/4) Maximum memory allocated so far is 20967MB 2023-10-03 06:23:11,729 WARNING [train.py:1204] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_402-253027-0 from training. Duration: 0.54 2023-10-03 06:23:11,878 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.1.balancer1.prob, batch_count=1168666.6666666667, ans=0.125 2023-10-03 06:23:13,093 WARNING [train.py:1204] (2/4) Exclude cut with ID _59288_210_5854_1_1535077757774_3412246_158-289345-0_sp1.1 from training. Duration: 0.92725 2023-10-03 06:23:14,500 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_28211_210_6388_1_1532861506_4523224_128-328162-0_sp1.1 from training. Duration: 0.8181875 2023-10-03 06:23:19,295 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_788-278281-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 06:23:19,309 WARNING [train.py:1204] (2/4) Exclude cut with ID _49728_210_12651_1_1534041034294_6788150_624-195301-0_sp0.9 from training. Duration: 0.9444375 2023-10-03 06:23:19,348 WARNING [train.py:1204] (2/4) Exclude cut with ID _14860_210_4915_1_1530702020340_7343240_267-139775-0_sp1.1 from training. Duration: 0.963625 2023-10-03 06:23:20,790 WARNING [train.py:1204] (2/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_332-221057-0 from training. Duration: 0.68 2023-10-03 06:23:22,099 WARNING [train.py:1204] (2/4) Exclude cut with ID _40402_210_18219_1_1533522324115_2953150_242-76943-0 from training. Duration: 0.94 2023-10-03 06:23:24,704 WARNING [train.py:1204] (2/4) Exclude cut with ID _16747_210_5986_1_1531811544733_2749109_87-349587-0_sp1.1 from training. Duration: 0.963625 2023-10-03 06:23:24,761 WARNING [train.py:1204] (2/4) Exclude cut with ID _22982_210_1883_1_1531882790447_5449409_505-346452-0_sp1.1 from training. Duration: 0.963625 2023-10-03 06:23:27,619 WARNING [train.py:1204] (2/4) Exclude cut with ID _17956_210_4353_1_1531209568125_6979970_13-192107-0_sp1.1 from training. Duration: 0.963625 2023-10-03 06:23:27,652 WARNING [train.py:1204] (2/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_430-307666-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 06:23:27,692 WARNING [train.py:1204] (2/4) Exclude cut with ID _2895_210_2477_1_1525956785794_4063750_289-11475-0_sp1.1 from training. Duration: 0.7454375 2023-10-03 06:23:27,704 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_3910_210_1794_1_1526742306_334530_28-238909-0_sp0.9 from training. Duration: 0.9 2023-10-03 06:23:29,085 WARNING [train.py:1204] (2/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_18-130336-0 from training. Duration: 0.79 2023-10-03 06:23:30,425 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_548-294316-0_sp0.9 from training. Duration: 0.9 2023-10-03 06:23:33,351 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.feed_forward1.hidden_balancer.prob, batch_count=1168733.3333333333, ans=0.125 2023-10-03 06:23:36,485 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.self_attn_weights.pos_emb_skip_rate, batch_count=1168800.0, ans=0.0 2023-10-03 06:23:37,735 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_42-142473-0_sp1.1 from training. Duration: 0.6545625 2023-10-03 06:23:37,740 WARNING [train.py:1204] (2/4) Exclude cut with ID _59170_210_6721_1_1534983633960_4203335_326-48066-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 06:23:41,670 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_45886_210_17024_1_1533709215_471063_7-79165-0 from training. Duration: 0.98 2023-10-03 06:23:45,691 WARNING [train.py:1204] (2/4) Exclude cut with ID _44585_210_18628_1_1533605350387_4518199_280-73534-0_sp0.9 from training. Duration: 0.988875 2023-10-03 06:23:45,704 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_3475_210_2028_1_1526193578_3622569_265-276625-0_sp1.1 from training. Duration: 0.6909375 2023-10-03 06:23:45,905 INFO [scaling.py:1118] (2/4) WithLoss: name=encoder.encoders.1.encoder.layers.1.self_attn_weights, loss-sum=0.000e+00 2023-10-03 06:23:49,105 WARNING [train.py:1204] (2/4) Exclude cut with ID _64631_210_27767_1_1535349580068_4165160_88-225232-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 06:23:52,062 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_205-265518-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 06:23:57,570 WARNING [train.py:1204] (2/4) Exclude cut with ID _40402_210_18219_1_1533522324115_2953150_271-253734-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 06:24:01,766 WARNING [train.py:1204] (2/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_282-257791-0 from training. Duration: 0.86 2023-10-03 06:24:03,424 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.ff2_skip_rate, batch_count=1168866.6666666667, ans=0.0 2023-10-03 06:24:04,613 WARNING [train.py:1204] (2/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_356-45054-0 from training. Duration: 0.86 2023-10-03 06:24:06,071 WARNING [train.py:1204] (2/4) Exclude cut with ID _69584_210_5219_1_1535785133775_4044450_38-348116-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 06:24:06,072 WARNING [train.py:1204] (2/4) Exclude cut with ID _66763_210_17301_1_1535788667617_241750_21-328480-0_sp1.1 from training. Duration: 0.936375 2023-10-03 06:24:06,143 WARNING [train.py:1204] (2/4) Exclude cut with ID _26528_210_9070_1_1532570410842_3851310_497-97218-0_sp0.9 from training. Duration: 0.9555625 2023-10-03 06:24:06,191 WARNING [train.py:1204] (2/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_592-101075-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 06:24:07,500 WARNING [train.py:1204] (2/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_678-159307-0 from training. Duration: 0.85 2023-10-03 06:24:09,508 WARNING [train.py:1204] (2/4) Exclude cut with ID _28427_210_4220_1_1532581240967_3061069_166-319773-0_sp1.1 from training. Duration: 0.936375 2023-10-03 06:24:11,333 WARNING [train.py:1204] (2/4) Exclude cut with ID _29877_210_6228_1_1532397791106_5276149_606-224984-0_sp1.1 from training. Duration: 0.936375 2023-10-03 06:24:14,103 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_125-203105-0_sp1.1 from training. Duration: 0.72725 2023-10-03 06:24:16,925 WARNING [train.py:1204] (2/4) Exclude cut with ID IC0620W0113-306765 from training. Duration: 0.768 2023-10-03 06:24:18,349 WARNING [train.py:1204] (2/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_410-287857-0_sp1.1 from training. Duration: 0.8454375 2023-10-03 06:24:21,493 INFO [train.py:1046] (2/4) Epoch 34, batch 50, loss[loss=0.1605, simple_loss=0.233, pruned_loss=0.04397, over 23472.00 frames. ], tot_loss[loss=0.1582, simple_loss=0.2395, pruned_loss=0.03848, over 1080395.94 frames. ], batch size: 285, lr: 3.02e-03, grad_scale: 8.0 2023-10-03 06:24:22,871 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.533e+02 1.913e+02 2.173e+02 2.528e+02 6.265e+02, threshold=4.345e+02, percent-clipped=6.0 2023-10-03 06:24:22,959 WARNING [train.py:1204] (2/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_78-95260-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 06:24:24,390 WARNING [train.py:1204] (2/4) Exclude cut with ID _58059_210_5854_1_1534901471149_3164808_6-73080-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 06:24:24,409 WARNING [train.py:1204] (2/4) Exclude cut with ID _5166_210_2084_1_1527073197160_4087993_331-117829-0 from training. Duration: 0.62 2023-10-03 06:24:25,727 WARNING [train.py:1204] (2/4) Exclude cut with ID _14880_210_8895_1_1530680426655_3795259_289-212817-0_sp0.9 from training. Duration: 0.7666875 2023-10-03 06:24:25,762 WARNING [train.py:1204] (2/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_620-144779-0_sp1.1 from training. Duration: 0.92725 2023-10-03 06:24:27,234 WARNING [train.py:1204] (2/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_561-288575-0_sp1.1 from training. Duration: 0.963625 2023-10-03 06:24:28,047 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.2.encoder.layers.1.nonlin_attention.whiten1, num_groups=1, num_channels=288, metric=5.08 vs. limit=10.0 2023-10-03 06:24:28,563 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_22657_210_6890_1_1532086002_1637216_122-188128-0_sp1.1 from training. Duration: 0.963625 2023-10-03 06:24:31,199 WARNING [train.py:1204] (2/4) Exclude cut with ID _16756_210_4915_1_1531047803763_7104259_604-86607-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 06:24:32,672 WARNING [train.py:1204] (2/4) Exclude cut with ID _15150_210_8341_1_1530752411803_7256384_469-239509-0 from training. Duration: 0.68 2023-10-03 06:24:32,680 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_10987_210_4163_1_1529760555_4123902_302-87926-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 06:24:38,798 WARNING [train.py:1204] (2/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_469-94673-0_sp0.9 from training. Duration: 0.87775 2023-10-03 06:24:40,619 WARNING [train.py:1204] (2/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_798-106179-0 from training. Duration: 0.83 2023-10-03 06:24:41,179 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.4.encoder.layers.0.conv_module2.whiten, num_groups=1, num_channels=384, metric=9.38 vs. limit=15.0 2023-10-03 06:24:41,923 WARNING [train.py:1204] (2/4) Exclude cut with ID _16744_210_5986_1_1531724405384_3907567_242-111316-0 from training. Duration: 0.93 2023-10-03 06:24:42,199 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.feed_forward1.hidden_balancer.prob, batch_count=1169066.6666666667, ans=0.125 2023-10-03 06:24:43,343 WARNING [train.py:1204] (2/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_938-205135-0_sp0.9 from training. Duration: 0.9666875 2023-10-03 06:24:44,704 WARNING [train.py:1204] (2/4) Exclude cut with ID _64135_210_2253_1_1535677267876_7608972_133-188266-0_sp0.9 from training. Duration: 0.9 2023-10-03 06:24:44,709 WARNING [train.py:1204] (2/4) Exclude cut with ID _44585_210_18628_1_1533605350387_4518199_623-346820-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 06:24:46,040 WARNING [train.py:1204] (2/4) Exclude cut with ID _42326_210_7770_1_1533814282531_3707269_454-266903-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 06:24:46,116 WARNING [train.py:1204] (2/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_5-55893-0_sp1.1 from training. Duration: 0.5 2023-10-03 06:24:47,377 WARNING [train.py:1204] (2/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_577-332600-0_sp1.1 from training. Duration: 0.5181875 2023-10-03 06:24:47,377 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_957-138882-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 06:24:50,724 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.bypass.skip_rate, batch_count=1169133.3333333333, ans=0.09899494936611666 2023-10-03 06:24:55,998 WARNING [train.py:1204] (2/4) Exclude cut with ID _12556_210_8341_1_1530081024100_7327690_219-38004-0_sp1.1 from training. Duration: 0.97275 2023-10-03 06:24:58,551 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_397-147196-0_sp1.1 from training. Duration: 0.72725 2023-10-03 06:24:58,560 WARNING [train.py:1204] (2/4) Exclude cut with ID _3541_210_2477_1_1526475408709_3263973_231-104736-0_sp0.9 from training. Duration: 0.8666875 2023-10-03 06:24:58,835 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.conv_module1.balancer1.max_abs, batch_count=1169133.3333333333, ans=10.0 2023-10-03 06:24:59,917 WARNING [train.py:1204] (2/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_398-334558-0 from training. Duration: 0.96 2023-10-03 06:25:00,201 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.conv_skip_rate, batch_count=1169133.3333333333, ans=0.0 2023-10-03 06:25:01,353 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_257-20733-0_sp1.1 from training. Duration: 0.6909375 2023-10-03 06:25:01,443 WARNING [train.py:1204] (2/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_146-282400-0_sp1.1 from training. Duration: 0.6454375 2023-10-03 06:25:01,448 WARNING [train.py:1204] (2/4) Exclude cut with ID _51899_210_20034_1_1534129254466_3669220_265-215428-0 from training. Duration: 0.98 2023-10-03 06:25:02,814 WARNING [train.py:1204] (2/4) Exclude cut with ID _34423_210_8750_1_1533430798456_3330199_219-106541-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 06:25:04,231 WARNING [train.py:1204] (2/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_458-67321-0 from training. Duration: 0.97 2023-10-03 06:25:10,113 WARNING [train.py:1204] (2/4) Exclude cut with ID _6653_210_2629_1_1527919172441_6732679_1051-182606-0_sp1.1 from training. Duration: 0.936375 2023-10-03 06:25:12,067 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_53555_210_5632_1_1534595220_7232297_603-4396-0_sp1.1 from training. Duration: 0.97275 2023-10-03 06:25:13,430 WARNING [train.py:1204] (2/4) Exclude cut with ID _54218_210_12210_1_1534413646388_4409259_159-144367-0_sp1.1 from training. Duration: 0.963625 2023-10-03 06:25:13,730 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.conv_skip_rate, batch_count=1169200.0, ans=0.0 2023-10-03 06:25:14,808 WARNING [train.py:1204] (2/4) Exclude cut with ID _7606_210_4915_1_1528894253277_5080690_301-10732-0_sp0.9 from training. Duration: 0.9 2023-10-03 06:25:14,811 WARNING [train.py:1204] (2/4) Exclude cut with ID _15026_210_8750_1_1530777602316_3291850_211-195043-0_sp0.9 from training. Duration: 0.888875 2023-10-03 06:25:17,543 WARNING [train.py:1204] (2/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_146-282400-0 from training. Duration: 0.71 2023-10-03 06:25:17,590 WARNING [train.py:1204] (2/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_65-88242-0 from training. Duration: 0.56 2023-10-03 06:25:19,000 WARNING [train.py:1204] (2/4) Exclude cut with ID _55857_210_2346_1_1534644007831_6898209_80-107757-0_sp1.1 from training. Duration: 0.963625 2023-10-03 06:25:19,044 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_15126_210_6753_1_1530778677_2591185_120-277707-0_sp0.9 from training. Duration: 0.888875 2023-10-03 06:25:21,074 WARNING [train.py:1204] (2/4) Exclude cut with ID _44778_210_2054_1_1533798127203_7443090_1033-168593-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 06:25:22,382 WARNING [train.py:1204] (2/4) Exclude cut with ID _46683_210_9642_1_1533730913492_4591116_475-30711-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 06:25:22,415 WARNING [train.py:1204] (2/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_253-308377-0 from training. Duration: 0.49 2023-10-03 06:25:23,739 WARNING [train.py:1204] (2/4) Exclude cut with ID _4680_210_3525_1_1527249757953_893386_75-78979-0 from training. Duration: 0.91 2023-10-03 06:25:25,062 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_5522_210_3273_1_1527249606_3909096_318-203153-0_sp0.9 from training. Duration: 0.47775 2023-10-03 06:25:25,159 WARNING [train.py:1204] (2/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_550-170116-0_sp1.1 from training. Duration: 0.92725 2023-10-03 06:25:26,310 WARNING [train.py:1204] (2/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_277-210798-0_sp0.9 from training. Duration: 0.611125 2023-10-03 06:25:27,610 WARNING [train.py:1204] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_620-276793-0 from training. Duration: 0.59 2023-10-03 06:25:27,623 WARNING [train.py:1204] (2/4) Exclude cut with ID _3067_210_2667_1_1526212974640_4503168_119-175776-0 from training. Duration: 0.74 2023-10-03 06:25:28,981 WARNING [train.py:1204] (2/4) Exclude cut with ID _55209_210_1531_1_1534675828996_4597160_681-195827-0_sp1.1 from training. Duration: 0.92725 2023-10-03 06:25:29,272 INFO [scaling.py:1118] (2/4) WithLoss: name=encoder.encoders.3.encoder.layers.1.self_attn_weights, loss-sum=0.000e+00 2023-10-03 06:25:30,337 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_427-78097-0_sp1.1 from training. Duration: 0.72725 2023-10-03 06:25:30,441 WARNING [train.py:1204] (2/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_96-290039-0_sp0.9 from training. Duration: 0.62225 2023-10-03 06:25:31,661 WARNING [train.py:1204] (2/4) Exclude cut with ID _45329_210_7005_1_1534157951211_6774339_287-292174-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 06:25:33,327 INFO [train.py:1046] (2/4) Epoch 34, batch 100, loss[loss=0.1405, simple_loss=0.224, pruned_loss=0.02848, over 19883.00 frames. ], tot_loss[loss=0.161, simple_loss=0.2415, pruned_loss=0.04026, over 1897600.60 frames. ], batch size: 43, lr: 3.02e-03, grad_scale: 8.0 2023-10-03 06:25:33,410 WARNING [train.py:1204] (2/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_200-292228-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 06:25:33,673 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=1169333.3333333333, ans=0.1 2023-10-03 06:25:36,212 WARNING [train.py:1204] (2/4) Exclude cut with ID _69884_210_9771_1_1535846392818_4166169_291-194999-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 06:25:38,898 WARNING [train.py:1204] (2/4) Exclude cut with ID _58895_210_8750_1_1535184081320_3649089_265-323810-0_sp1.1 from training. Duration: 0.87275 2023-10-03 06:25:40,293 WARNING [train.py:1204] (2/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_58-68956-0 from training. Duration: 0.83 2023-10-03 06:25:40,298 WARNING [train.py:1204] (2/4) Exclude cut with ID _39867_210_9826_1_1533553222030_9344259_322-115153-0_sp1.1 from training. Duration: 0.963625 2023-10-03 06:25:45,568 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_3371_210_1794_1_1526384462_5580777_396-339812-0_sp1.1 from training. Duration: 0.77275 2023-10-03 06:25:46,831 WARNING [train.py:1204] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_731-2428-0_sp0.9 from training. Duration: 0.92225 2023-10-03 06:25:46,852 WARNING [train.py:1204] (2/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_98-741-0_sp1.1 from training. Duration: 0.72725 2023-10-03 06:25:46,853 WARNING [train.py:1204] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_238-245655-0_sp0.9 from training. Duration: 0.9 2023-10-03 06:25:46,890 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6788_210_1388_1_1527898698_4562021_91-197981-0_sp0.9 from training. Duration: 0.92225 2023-10-03 06:25:48,261 WARNING [train.py:1204] (2/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_860-96198-0 from training. Duration: 0.73 2023-10-03 06:25:48,505 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.conv_skip_rate, batch_count=1169400.0, ans=0.0 2023-10-03 06:25:49,760 WARNING [train.py:1204] (2/4) Exclude cut with ID _14268_210_9206_1_1530622771285_4714099_331-105351-0_sp0.9 from training. Duration: 0.72225 2023-10-03 06:25:51,007 WARNING [train.py:1204] (2/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_204-137133-0_sp1.1 from training. Duration: 0.92725 2023-10-03 06:25:51,025 WARNING [train.py:1204] (2/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_5-46496-0_sp1.1 from training. Duration: 0.936375 2023-10-03 06:25:51,034 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_160-218721-0_sp1.1 from training. Duration: 0.87275 2023-10-03 06:25:52,686 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.conv_module2.balancer1.prob, batch_count=1169400.0, ans=0.125 2023-10-03 06:25:55,863 WARNING [train.py:1204] (2/4) Exclude cut with ID _45329_210_7005_1_1534157951211_6774339_503-287883-0 from training. Duration: 0.88 2023-10-03 06:25:57,118 WARNING [train.py:1204] (2/4) Exclude cut with ID _57572_210_5517_1_1534921219624_3809484_110-79134-0_sp1.1 from training. Duration: 0.92725 2023-10-03 06:25:58,581 WARNING [train.py:1204] (2/4) Exclude cut with ID _44585_210_18628_1_1533605350387_4518199_421-37641-0_sp1.1 from training. Duration: 0.936375 2023-10-03 06:25:59,828 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_42-142473-0_sp0.9 from training. Duration: 0.8 2023-10-03 06:26:01,297 WARNING [train.py:1204] (2/4) Exclude cut with ID _5166_210_2084_1_1527073197160_4087993_310-114452-0_sp0.9 from training. Duration: 0.6555625 2023-10-03 06:26:02,821 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.balancer1.prob, batch_count=1169466.6666666667, ans=0.125 2023-10-03 06:26:05,405 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9029W0120-509520 from training. Duration: 0.768 2023-10-03 06:26:05,421 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9029W0433-509820 from training. Duration: 0.896 2023-10-03 06:26:06,927 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_40222_210_6228_1_1533385298_4481939_163-334586-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 06:26:06,927 WARNING [train.py:1204] (2/4) Exclude cut with ID _48677_210_5663_1_1533902405919_3892989_408-40740-0_sp1.1 from training. Duration: 0.7545625 2023-10-03 06:26:09,724 WARNING [train.py:1204] (2/4) Exclude cut with ID _24314_210_4915_1_1532135078116_7170179_521-258868-0_sp0.9 from training. Duration: 0.87775 2023-10-03 06:26:11,046 WARNING [train.py:1204] (2/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_576-51198-0_sp1.1 from training. Duration: 0.92725 2023-10-03 06:26:14,125 WARNING [train.py:1204] (2/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_306-216871-0_sp1.1 from training. Duration: 0.97275 2023-10-03 06:26:17,112 WARNING [train.py:1204] (2/4) Exclude cut with ID _20684_210_6917_1_1531567751646_4203930_463-119982-0_sp1.1 from training. Duration: 0.97275 2023-10-03 06:26:18,366 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9084W0012-536850 from training. Duration: 0.64 2023-10-03 06:26:21,091 WARNING [train.py:1204] (2/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_847-41334-0_sp1.1 from training. Duration: 0.4 2023-10-03 06:26:23,904 WARNING [train.py:1204] (2/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_326-165900-0_sp1.1 from training. Duration: 0.62725 2023-10-03 06:26:23,975 WARNING [train.py:1204] (2/4) Exclude cut with ID _7537_210_2084_1_1528502395431_3924359_128-268029-0_sp1.1 from training. Duration: 0.8909375 2023-10-03 06:26:27,175 WARNING [train.py:1204] (2/4) Exclude cut with ID _67499_210_17282_1_1535509805220_3692430_333-155030-0_sp1.1 from training. Duration: 0.97275 2023-10-03 06:26:29,908 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_34008_210_10431_1_1533345424_8183247_588-257177-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 06:26:32,624 WARNING [train.py:1204] (2/4) Exclude cut with ID _25914_210_6400_1_1532141317058_644740_52-96808-0_sp1.1 from training. Duration: 0.82725 2023-10-03 06:26:34,053 WARNING [train.py:1204] (2/4) Exclude cut with ID _56494_210_4220_1_1535284837625_3033169_69-138150-0_sp1.1 from training. Duration: 0.8545625 2023-10-03 06:26:35,682 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.bypass.skip_rate, batch_count=1169600.0, ans=0.04949747468305833 2023-10-03 06:26:36,778 WARNING [train.py:1204] (2/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_19-143553-0_sp1.1 from training. Duration: 0.97275 2023-10-03 06:26:36,824 WARNING [train.py:1204] (2/4) Exclude cut with ID _21878_210_3525_1_1531738772002_7264330_582-130735-0_sp1.1 from training. Duration: 0.936375 2023-10-03 06:26:38,192 WARNING [train.py:1204] (2/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_943-201356-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 06:26:38,202 WARNING [train.py:1204] (2/4) Exclude cut with ID _3231_210_2106_1_1526205864934_6784220_422-229276-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 06:26:38,218 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_48049_210_5632_1_1533864553_3519937_73-198717-0_sp1.1 from training. Duration: 0.97275 2023-10-03 06:26:39,516 WARNING [train.py:1204] (2/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_329-285801-0 from training. Duration: 0.91 2023-10-03 06:26:39,540 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9171W0114-579000 from training. Duration: 0.896 2023-10-03 06:26:39,551 WARNING [train.py:1204] (2/4) Exclude cut with ID _3120_210_3525_1_1526036143112_4072980_210-132675-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 06:26:40,938 WARNING [train.py:1204] (2/4) Exclude cut with ID _55857_210_2346_1_1534644007831_6898209_745-98391-0_sp0.9 from training. Duration: 0.8444375 2023-10-03 06:26:42,252 WARNING [train.py:1204] (2/4) Exclude cut with ID _5250_210_5219_1_1527249561806_8692110_615-322035-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 06:26:42,264 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_36756_210_7234_1_1533026991_966276_119-315767-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 06:26:42,281 WARNING [train.py:1204] (2/4) Exclude cut with ID _13916_210_9994_1_1530788341126_3763149_60-191556-0_sp1.1 from training. Duration: 0.5545625 2023-10-03 06:26:43,909 WARNING [train.py:1204] (2/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_177-236809-0_sp1.1 from training. Duration: 0.4454375 2023-10-03 06:26:43,953 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7002_210_1794_1_1527939683_5157286_184-229630-0_sp1.1 from training. Duration: 0.67275 2023-10-03 06:26:43,959 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_50428_210_5724_1_1534247532_4126418_464-18322-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 06:26:44,023 WARNING [train.py:1204] (2/4) Exclude cut with ID _35799_210_5854_1_1533263487887_3529071_466-131402-0_sp1.1 from training. Duration: 0.936375 2023-10-03 06:26:45,798 INFO [train.py:1046] (2/4) Epoch 34, batch 150, loss[loss=0.152, simple_loss=0.2369, pruned_loss=0.03357, over 24648.00 frames. ], tot_loss[loss=0.161, simple_loss=0.2415, pruned_loss=0.0403, over 2513505.36 frames. ], batch size: 68, lr: 3.02e-03, grad_scale: 4.0 2023-10-03 06:26:45,860 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_1259-132768-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 06:26:45,923 WARNING [train.py:1204] (2/4) Exclude cut with ID _6694_210_4599_1_1527852637490_3965529_144-185051-0_sp1.1 from training. Duration: 0.87275 2023-10-03 06:26:45,954 WARNING [train.py:1204] (2/4) Exclude cut with ID _64639_210_5816_1_1535367619437_3528000_59-133290-0_sp1.1 from training. Duration: 0.963625 2023-10-03 06:26:48,605 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.566e+02 1.899e+02 2.057e+02 2.467e+02 3.842e+02, threshold=4.114e+02, percent-clipped=0.0 2023-10-03 06:26:50,025 WARNING [train.py:1204] (2/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_223-49525-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 06:26:52,801 WARNING [train.py:1204] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_748-36150-0_sp1.1 from training. Duration: 0.82725 2023-10-03 06:26:52,802 WARNING [train.py:1204] (2/4) Exclude cut with ID _19896_210_1883_1_1531709022662_6649569_52-221627-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 06:26:52,861 WARNING [train.py:1204] (2/4) Exclude cut with ID _6789_210_1534_1_1527854471671_2870519_296-115766-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 06:26:56,876 WARNING [train.py:1204] (2/4) Exclude cut with ID _14624_210_1925_1_1530858690657_3642219_100-19871-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 06:26:56,918 WARNING [train.py:1204] (2/4) Exclude cut with ID _12555_210_8750_1_1530075594402_3780239_50-293675-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 06:27:00,089 WARNING [train.py:1204] (2/4) Exclude cut with ID _14880_210_8895_1_1530680426655_3795259_289-212817-0_sp1.1 from training. Duration: 0.62725 2023-10-03 06:27:01,433 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_14056_210_4163_1_1530604483_7572827_388-211626-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 06:27:03,008 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=1169733.3333333333, ans=0.1 2023-10-03 06:27:05,537 WARNING [train.py:1204] (2/4) Exclude cut with ID _5289_210_2084_1_1527678202736_3779299_392-261988-0 from training. Duration: 0.65 2023-10-03 06:27:05,545 WARNING [train.py:1204] (2/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_124-345840-0 from training. Duration: 0.52 2023-10-03 06:27:05,556 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_2655_210_2087_1_1525688959_3897400_437-337690-0 from training. Duration: 0.63 2023-10-03 06:27:08,343 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_3780_210_2087_1_1526467391_239068_13-274633-0_sp1.1 from training. Duration: 0.92725 2023-10-03 06:27:08,347 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_805-71497-0_sp1.1 from training. Duration: 0.6454375 2023-10-03 06:27:09,638 WARNING [train.py:1204] (2/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_434-83000-0_sp1.1 from training. Duration: 0.8909375 2023-10-03 06:27:09,717 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_17613_210_2151_1_1531137569_2366160_285-219531-0_sp1.1 from training. Duration: 0.97275 2023-10-03 06:27:09,721 WARNING [train.py:1204] (2/4) Exclude cut with ID _56108_210_16380_1_1534505446159_3887959_22-37858-0_sp1.1 from training. Duration: 0.936375 2023-10-03 06:27:11,067 WARNING [train.py:1204] (2/4) Exclude cut with ID _28427_210_4220_1_1532581240967_3061069_200-88444-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 06:27:11,143 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_43802_210_2290_1_1533516203_8829504_77-279042-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 06:27:12,520 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9309W0193-640320 from training. Duration: 0.896 2023-10-03 06:27:15,871 WARNING [train.py:1204] (2/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_516-324055-0_sp1.1 from training. Duration: 0.936375 2023-10-03 06:27:20,459 WARNING [train.py:1204] (2/4) Exclude cut with ID _40094_210_16380_1_1533294005773_3641159_10-261240-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 06:27:20,649 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.conv_module2.balancer1.prob, batch_count=1169800.0, ans=0.125 2023-10-03 06:27:23,352 WARNING [train.py:1204] (2/4) Exclude cut with ID _6267_210_2084_1_1527764658526_3894730_375-219887-0_sp1.1 from training. Duration: 0.6090625 2023-10-03 06:27:24,673 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6788_210_1388_1_1527898698_4562021_180-75288-0 from training. Duration: 0.99 2023-10-03 06:27:26,120 WARNING [train.py:1204] (2/4) Exclude cut with ID _8030_210_7497_1_1528444886781_3531089_262-86750-0_sp1.1 from training. Duration: 0.736375 2023-10-03 06:27:26,143 WARNING [train.py:1204] (2/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_934-325498-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 06:27:26,164 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_456-22141-0_sp0.9 from training. Duration: 0.97775 2023-10-03 06:27:28,373 WARNING [train.py:1204] (2/4) Exclude cut with ID _49643_210_7989_1_1534640244224_3939031_598-161926-0_sp1.1 from training. Duration: 0.8181875 2023-10-03 06:27:29,626 WARNING [train.py:1204] (2/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_345-49721-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 06:27:30,971 WARNING [train.py:1204] (2/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_440-141811-0_sp1.1 from training. Duration: 0.863625 2023-10-03 06:27:32,359 WARNING [train.py:1204] (2/4) Exclude cut with ID _64048_210_10626_1_1535198382802_3835179_394-212120-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 06:27:32,405 WARNING [train.py:1204] (2/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_91-277684-0 from training. Duration: 0.74 2023-10-03 06:27:37,899 WARNING [train.py:1204] (2/4) Exclude cut with ID _23038_210_9570_1_1532001584158_3652419_191-346343-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 06:27:39,302 WARNING [train.py:1204] (2/4) Exclude cut with ID _15140_210_9288_1_1530791388450_4880149_351-347974-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 06:27:39,337 WARNING [train.py:1204] (2/4) Exclude cut with ID _58605_210_12210_1_1534723195939_3904259_48-269402-0_sp1.1 from training. Duration: 0.87275 2023-10-03 06:27:39,344 WARNING [train.py:1204] (2/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_401-53423-0_sp0.9 from training. Duration: 0.611125 2023-10-03 06:27:42,642 WARNING [train.py:1204] (2/4) Exclude cut with ID _42659_210_2070_1_1533607442765_3120380_158-49004-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 06:27:45,896 WARNING [train.py:1204] (2/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_81-75849-0_sp1.1 from training. Duration: 0.3545625 2023-10-03 06:27:46,406 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.feed_forward1.out_whiten.whitening_limit, batch_count=1169933.3333333333, ans=15.0 2023-10-03 06:27:47,337 WARNING [train.py:1204] (2/4) Exclude cut with ID _27052_210_5986_1_1532329197187_3585009_314-106499-0_sp1.1 from training. Duration: 0.836375 2023-10-03 06:27:47,434 WARNING [train.py:1204] (2/4) Exclude cut with ID _4626_210_3532_1_1526817652717_3916859_446-206656-0_sp0.9 from training. Duration: 0.8444375 2023-10-03 06:27:49,158 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_53378_210_17282_1_1534221071_5769509_158-114960-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 06:27:51,867 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_3171_210_2087_1_1526109084_2330007_37-64334-0_sp0.9 from training. Duration: 0.911125 2023-10-03 06:27:51,890 WARNING [train.py:1204] (2/4) Exclude cut with ID _5410_210_1388_1_1527397055795_1719020_39-112221-0 from training. Duration: 0.69 2023-10-03 06:27:51,910 WARNING [train.py:1204] (2/4) Exclude cut with ID _8064_210_5399_1_1528372299291_4282973_4-91037-0_sp0.9 from training. Duration: 0.97775 2023-10-03 06:27:53,218 WARNING [train.py:1204] (2/4) Exclude cut with ID ID0079W0443-717795 from training. Duration: 0.541 2023-10-03 06:27:54,764 WARNING [train.py:1204] (2/4) Exclude cut with ID _34734_210_4525_1_1533365977542_4448720_766-297928-0_sp1.1 from training. Duration: 0.936375 2023-10-03 06:27:58,628 INFO [train.py:1046] (2/4) Epoch 34, batch 200, loss[loss=0.1568, simple_loss=0.233, pruned_loss=0.04031, over 23728.00 frames. ], tot_loss[loss=0.1622, simple_loss=0.2422, pruned_loss=0.04107, over 3013838.62 frames. ], batch size: 149, lr: 3.02e-03, grad_scale: 8.0 2023-10-03 06:27:58,703 WARNING [train.py:1204] (2/4) Exclude cut with ID _51471_210_1889_1_1534399391390_3094250_310-337793-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 06:27:58,723 WARNING [train.py:1204] (2/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_678-159307-0_sp0.9 from training. Duration: 0.9444375 2023-10-03 06:27:59,655 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.feed_forward1.hidden_balancer.prob, batch_count=1170000.0, ans=0.125 2023-10-03 06:28:00,828 WARNING [train.py:1204] (2/4) Exclude cut with ID _50093_210_4220_1_1534417141207_3432299_259-322252-0 from training. Duration: 0.92 2023-10-03 06:28:02,172 WARNING [train.py:1204] (2/4) Exclude cut with ID _47790_210_16187_1_1533862814066_4207519_67-103444-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 06:28:03,498 WARNING [train.py:1204] (2/4) Exclude cut with ID _40551_210_10635_1_1533776424632_3416179_311-247578-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 06:28:03,774 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.conv_module2.balancer1.min_positive, batch_count=1170000.0, ans=0.025 2023-10-03 06:28:04,953 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_4633_210_2040_1_1526824163_4145343_416-76117-0 from training. Duration: 0.95 2023-10-03 06:28:06,278 WARNING [train.py:1204] (2/4) Exclude cut with ID _5166_210_2084_1_1527073197160_4087993_331-117829-0_sp0.9 from training. Duration: 0.688875 2023-10-03 06:28:07,596 WARNING [train.py:1204] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_299-247059-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 06:28:07,788 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.feed_forward1.out_proj.dropout_p, batch_count=1170000.0, ans=0.1 2023-10-03 06:28:08,954 WARNING [train.py:1204] (2/4) Exclude cut with ID _64002_210_9570_1_1535198605157_38979_2-240845-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 06:28:12,138 WARNING [train.py:1204] (2/4) Exclude cut with ID _4681_210_3525_1_1527250689464_120889_15-126760-0_sp0.9 from training. Duration: 0.9 2023-10-03 06:28:13,492 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_21500_210_2054_1_1532149310_2716758_256-156611-0_sp1.1 from training. Duration: 0.936375 2023-10-03 06:28:13,515 WARNING [train.py:1204] (2/4) Exclude cut with ID _16908_210_5724_1_1531119157248_3772700_624-203729-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 06:28:21,641 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.attention_skip_rate, batch_count=1170066.6666666667, ans=0.0 2023-10-03 06:28:28,564 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.feed_forward2.hidden_balancer.prob, batch_count=1170133.3333333333, ans=0.125 2023-10-03 06:28:31,805 WARNING [train.py:1204] (2/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_605-219732-0_sp1.1 from training. Duration: 0.8545625 2023-10-03 06:28:33,059 WARNING [train.py:1204] (2/4) Exclude cut with ID _44718_210_5816_1_1534050051869_6469030_380-167520-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 06:28:33,129 WARNING [train.py:1204] (2/4) Exclude cut with ID _19896_210_1883_1_1531709022662_6649569_94-229927-0_sp1.1 from training. Duration: 0.8818125 2023-10-03 06:28:34,489 WARNING [train.py:1204] (2/4) Exclude cut with ID _19514_210_9259_1_1531530071330_5084230_519-174300-0_sp1.1 from training. Duration: 0.963625 2023-10-03 06:28:34,552 WARNING [train.py:1204] (2/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_340-174176-0_sp0.9 from training. Duration: 0.5444375 2023-10-03 06:28:34,555 WARNING [train.py:1204] (2/4) Exclude cut with ID _8263_210_1664_1_1528439452891_970220_68-181805-0_sp1.1 from training. Duration: 0.5818125 2023-10-03 06:28:35,868 WARNING [train.py:1204] (2/4) Exclude cut with ID _3120_210_3525_1_1526036143112_4072980_348-257723-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 06:28:37,282 WARNING [train.py:1204] (2/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_453-133983-0_sp1.1 from training. Duration: 0.7090625 2023-10-03 06:28:38,624 WARNING [train.py:1204] (2/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_17-86060-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 06:28:38,645 WARNING [train.py:1204] (2/4) Exclude cut with ID _29877_210_6228_1_1532397791106_5276149_228-184657-0_sp1.1 from training. Duration: 0.97275 2023-10-03 06:28:40,042 WARNING [train.py:1204] (2/4) Exclude cut with ID _18516_210_9206_1_1531704653206_3692406_198-334870-0 from training. Duration: 0.92 2023-10-03 06:28:40,085 WARNING [train.py:1204] (2/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_340-174176-0_sp1.1 from training. Duration: 0.4454375 2023-10-03 06:28:40,097 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_3133_210_2087_1_1526106446_2324690_14-139186-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 06:28:43,506 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.conv_module2.balancer1.prob, batch_count=1170200.0, ans=0.125 2023-10-03 06:28:44,765 WARNING [train.py:1204] (2/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_126-29104-0_sp0.9 from training. Duration: 0.9444375 2023-10-03 06:28:50,036 WARNING [train.py:1204] (2/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_84-10310-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 06:28:57,200 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_15517_210_6753_1_1530954008_3487336_79-273076-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 06:28:57,251 WARNING [train.py:1204] (2/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_371-18169-0_sp0.9 from training. Duration: 0.9555625 2023-10-03 06:29:00,506 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.5.encoder.layers.1.feed_forward1.out_whiten, num_groups=1, num_channels=256, metric=8.42 vs. limit=15.0 2023-10-03 06:29:01,681 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.conv_skip_rate, batch_count=1170266.6666666667, ans=0.0 2023-10-03 06:29:03,447 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_68702_210_23689_1_1535799577_3420068_170-92788-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 06:29:03,986 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.4.encoder.layers.2.self_attn_weights.whiten_keys, num_groups=4, num_channels=128, metric=1.60 vs. limit=6.0 2023-10-03 06:29:04,704 WARNING [train.py:1204] (2/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_78-182912-0 from training. Duration: 0.59 2023-10-03 06:29:05,933 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_3019_210_2028_1_1526044245_3264865_297-12384-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 06:29:07,197 WARNING [train.py:1204] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_147-111137-0_sp1.1 from training. Duration: 0.863625 2023-10-03 06:29:07,202 WARNING [train.py:1204] (2/4) Exclude cut with ID _28323_210_16187_1_1532480395667_4100300_117-8859-0_sp1.1 from training. Duration: 0.97275 2023-10-03 06:29:08,571 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7762_210_1794_1_1528199508_4057408_419-125009-0_sp1.1 from training. Duration: 0.7545625 2023-10-03 06:29:08,663 WARNING [train.py:1204] (2/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_268-87909-0 from training. Duration: 0.99 2023-10-03 06:29:11,342 WARNING [train.py:1204] (2/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_118-311357-0_sp1.1 from training. Duration: 0.92725 2023-10-03 06:29:11,370 WARNING [train.py:1204] (2/4) Exclude cut with ID ID0377W0003-870225 from training. Duration: 0.852 2023-10-03 06:29:11,634 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.ff2_skip_rate, batch_count=1170333.3333333333, ans=0.0 2023-10-03 06:29:12,606 INFO [train.py:1046] (2/4) Epoch 34, batch 250, loss[loss=0.1462, simple_loss=0.2222, pruned_loss=0.03511, over 20755.00 frames. ], tot_loss[loss=0.1615, simple_loss=0.2411, pruned_loss=0.04099, over 3399016.29 frames. ], batch size: 45, lr: 3.02e-03, grad_scale: 8.0 2023-10-03 06:29:14,719 WARNING [train.py:1204] (2/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_56-5838-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 06:29:15,952 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.593e+02 1.885e+02 2.083e+02 2.421e+02 4.173e+02, threshold=4.166e+02, percent-clipped=1.0 2023-10-03 06:29:17,369 WARNING [train.py:1204] (2/4) Exclude cut with ID _14603_210_4220_1_1530615688441_3097319_90-31383-0_sp0.9 from training. Duration: 0.9666875 2023-10-03 06:29:18,067 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.3.encoder.layers.2.conv_module2.whiten, num_groups=1, num_channels=512, metric=4.33 vs. limit=15.0 2023-10-03 06:29:19,270 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_31583_210_3852_1_1532595851_4024413_58-86657-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 06:29:19,317 WARNING [train.py:1204] (2/4) Exclude cut with ID _30304_210_6319_1_1532476827261_7582060_344-113875-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 06:29:22,585 WARNING [train.py:1204] (2/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_278-104676-0_sp1.1 from training. Duration: 0.8909375 2023-10-03 06:29:22,614 WARNING [train.py:1204] (2/4) Exclude cut with ID _55844_210_2346_1_1534471212569_7625707_764-2387-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 06:29:23,933 WARNING [train.py:1204] (2/4) Exclude cut with ID _27430_210_7770_1_1532604689375_1378590_157-14963-0_sp1.1 from training. Duration: 0.963625 2023-10-03 06:29:25,552 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.feed_forward3.hidden_balancer.prob, batch_count=1170333.3333333333, ans=0.125 2023-10-03 06:29:26,861 WARNING [train.py:1204] (2/4) Exclude cut with ID _7160_210_1876_1_1527940923231_3703799_282-322611-0_sp1.1 from training. Duration: 0.7909375 2023-10-03 06:29:34,052 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=1170400.0, ans=0.1 2023-10-03 06:29:34,707 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.1.encoder.layers.1.conv_module2.whiten, num_groups=1, num_channels=256, metric=10.34 vs. limit=15.0 2023-10-03 06:29:36,662 WARNING [train.py:1204] (2/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_190-138640-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 06:29:38,697 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_33348_210_5632_1_1533086912_3324030_63-270922-0_sp1.1 from training. Duration: 0.97275 2023-10-03 06:29:38,730 WARNING [train.py:1204] (2/4) Exclude cut with ID _54871_210_22356_1_1534665798253_3856359_135-184281-0_sp1.1 from training. Duration: 0.9 2023-10-03 06:29:38,943 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.1.feed_forward2.hidden_balancer.prob, batch_count=1170400.0, ans=0.125 2023-10-03 06:29:44,495 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.conv_module1.balancer1.prob, batch_count=1170466.6666666667, ans=0.125 2023-10-03 06:29:45,694 WARNING [train.py:1204] (2/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_277-210798-0_sp1.1 from training. Duration: 0.5 2023-10-03 06:29:47,049 WARNING [train.py:1204] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_402-253027-0_sp0.9 from training. Duration: 0.6 2023-10-03 06:29:47,124 WARNING [train.py:1204] (2/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_540-275685-0_sp0.9 from training. Duration: 0.988875 2023-10-03 06:29:48,908 WARNING [train.py:1204] (2/4) Exclude cut with ID _59493_210_17282_1_1535078419077_3225879_118-18960-0_sp1.1 from training. Duration: 0.936375 2023-10-03 06:29:48,983 WARNING [train.py:1204] (2/4) Exclude cut with ID _6194_210_1534_1_1527511826160_3941194_321-148641-0_sp1.1 from training. Duration: 0.5818125 2023-10-03 06:29:48,989 WARNING [train.py:1204] (2/4) Exclude cut with ID _61133_210_9969_1_1535196777227_3696409_333-270325-0_sp1.1 from training. Duration: 0.8454375 2023-10-03 06:29:50,991 WARNING [train.py:1204] (2/4) Exclude cut with ID _49376_210_5986_1_1533985278732_3370740_307-145221-0_sp1.1 from training. Duration: 0.936375 2023-10-03 06:29:52,496 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6846_210_6753_1_1527850767_4295158_2-315073-0_sp0.9 from training. Duration: 0.97775 2023-10-03 06:29:55,265 WARNING [train.py:1204] (2/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_17-25045-0 from training. Duration: 0.59 2023-10-03 06:29:56,523 WARNING [train.py:1204] (2/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_503-333510-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 06:29:59,207 WARNING [train.py:1204] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_238-245655-0_sp1.1 from training. Duration: 0.736375 2023-10-03 06:29:59,235 WARNING [train.py:1204] (2/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_329-82235-0_sp0.9 from training. Duration: 0.911125 2023-10-03 06:29:59,240 WARNING [train.py:1204] (2/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_325-113798-0_sp0.9 from training. Duration: 0.9444375 2023-10-03 06:30:00,731 WARNING [train.py:1204] (2/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_395-37222-0_sp0.9 from training. Duration: 0.8444375 2023-10-03 06:30:00,802 WARNING [train.py:1204] (2/4) Exclude cut with ID _64135_210_2253_1_1535677267876_7608972_498-5039-0_sp1.1 from training. Duration: 0.7090625 2023-10-03 06:30:00,811 WARNING [train.py:1204] (2/4) Exclude cut with ID _14922_210_6228_1_1530707435273_3693310_303-228738-0_sp0.9 from training. Duration: 0.9333125 2023-10-03 06:30:02,341 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_577-220899-0_sp1.1 from training. Duration: 0.92725 2023-10-03 06:30:04,996 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_3475_210_2028_1_1526193578_3622569_377-166133-0_sp1.1 from training. Duration: 0.7818125 2023-10-03 06:30:05,027 WARNING [train.py:1204] (2/4) Exclude cut with ID _49376_210_5986_1_1533985278732_3370740_272-313512-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 06:30:09,655 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7762_210_1794_1_1528199508_4057408_422-227034-0_sp0.9 from training. Duration: 0.811125 2023-10-03 06:30:11,102 WARNING [train.py:1204] (2/4) Exclude cut with ID _18895_210_6228_1_1531357274234_3784930_74-341428-0_sp1.1 from training. Duration: 0.92725 2023-10-03 06:30:15,281 WARNING [train.py:1204] (2/4) Exclude cut with ID _44902_210_16187_1_1533888006062_3792078_149-67212-0_sp1.1 from training. Duration: 0.7909375 2023-10-03 06:30:15,603 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.conv_module1.balancer1.prob, batch_count=1170600.0, ans=0.125 2023-10-03 06:30:21,254 WARNING [train.py:1204] (2/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_243-126020-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 06:30:23,216 WARNING [train.py:1204] (2/4) Exclude cut with ID _54218_210_12210_1_1534413646388_4409259_259-6561-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 06:30:28,048 INFO [train.py:1046] (2/4) Epoch 34, batch 300, loss[loss=0.1577, simple_loss=0.2425, pruned_loss=0.03639, over 24469.00 frames. ], tot_loss[loss=0.1604, simple_loss=0.24, pruned_loss=0.04038, over 3691899.24 frames. ], batch size: 63, lr: 3.02e-03, grad_scale: 8.0 2023-10-03 06:30:28,130 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_41452_210_7291_1_1533437687_3941281_405-11559-0 from training. Duration: 0.91 2023-10-03 06:30:29,531 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_20120_210_6753_1_1531643035_5482859_759-225084-0_sp1.1 from training. Duration: 0.97275 2023-10-03 06:30:29,552 WARNING [train.py:1204] (2/4) Exclude cut with ID _14747_210_8341_1_1530685797463_3654089_147-131229-0_sp0.9 from training. Duration: 0.8444375 2023-10-03 06:30:30,928 WARNING [train.py:1204] (2/4) Exclude cut with ID _14867_210_1968_1_1530684179890_3846969_319-250868-0 from training. Duration: 0.99 2023-10-03 06:30:30,952 WARNING [train.py:1204] (2/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_580-93798-0_sp0.9 from training. Duration: 0.62225 2023-10-03 06:30:31,060 WARNING [train.py:1204] (2/4) Exclude cut with ID _9679_210_7497_1_1529129212906_4363929_512-313889-0_sp0.9 from training. Duration: 0.9 2023-10-03 06:30:31,069 WARNING [train.py:1204] (2/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_18-189198-0 from training. Duration: 0.9 2023-10-03 06:30:35,336 WARNING [train.py:1204] (2/4) Exclude cut with ID _49755_210_18053_1_1534581005846_3218709_137-64179-0_sp1.1 from training. Duration: 0.92725 2023-10-03 06:30:35,415 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_48049_210_5632_1_1533864553_3519937_306-283794-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 06:30:38,148 WARNING [train.py:1204] (2/4) Exclude cut with ID _43552_210_8913_1_1533726031059_3523429_25-120161-0_sp1.1 from training. Duration: 0.8909375 2023-10-03 06:30:39,893 WARNING [train.py:1204] (2/4) Exclude cut with ID _8833_210_5219_1_1528592665210_7553059_319-349181-0 from training. Duration: 0.99 2023-10-03 06:30:41,295 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_296-332325-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 06:30:41,382 WARNING [train.py:1204] (2/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_436-186311-0_sp1.1 from training. Duration: 0.5909375 2023-10-03 06:30:41,399 WARNING [train.py:1204] (2/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_348-127789-0 from training. Duration: 0.85 2023-10-03 06:30:41,404 WARNING [train.py:1204] (2/4) Exclude cut with ID _3992_210_1747_1_1526644816031_3731060_443-195183-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 06:30:45,538 WARNING [train.py:1204] (2/4) Exclude cut with ID _3465_210_3525_1_1526175667060_3574049_308-231735-0_sp1.1 from training. Duration: 0.663625 2023-10-03 06:30:50,140 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6786_210_1388_1_1527839699_4194495_378-46070-0_sp1.1 from training. Duration: 0.6545625 2023-10-03 06:30:50,182 WARNING [train.py:1204] (2/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_63-101996-0 from training. Duration: 0.88 2023-10-03 06:30:56,678 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_2541_210_1794_1_1525590269_3084765_156-173713-0 from training. Duration: 0.81 2023-10-03 06:30:56,722 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_217-239796-0_sp1.1 from training. Duration: 0.963625 2023-10-03 06:30:58,114 WARNING [train.py:1204] (2/4) Exclude cut with ID _21878_210_3525_1_1531738772002_7264330_530-329400-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 06:31:00,910 WARNING [train.py:1204] (2/4) Exclude cut with ID _21783_210_11357_1_1531742458730_3762679_150-62920-0_sp1.1 from training. Duration: 0.963625 2023-10-03 06:31:00,910 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_5522_210_3273_1_1527249606_3909096_366-172508-0 from training. Duration: 0.66 2023-10-03 06:31:00,913 WARNING [train.py:1204] (2/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_303-340446-0_sp1.1 from training. Duration: 0.8181875 2023-10-03 06:31:02,267 WARNING [train.py:1204] (2/4) Exclude cut with ID _8699_210_2069_1_1528630346831_3960409_160-24408-0_sp1.1 from training. Duration: 0.82725 2023-10-03 06:31:02,588 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=1170800.0, ans=0.1 2023-10-03 06:31:03,706 WARNING [train.py:1204] (2/4) Exclude cut with ID _41314_210_12651_1_1533459476543_3766460_299-187801-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 06:31:03,737 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_34886_210_10633_1_1533203882_1755661_92-345288-0_sp1.1 from training. Duration: 0.97275 2023-10-03 06:31:07,881 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_5438_210_1794_1_1527336687_2600009_104-288274-0_sp1.1 from training. Duration: 0.52725 2023-10-03 06:31:07,886 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_154-316450-0 from training. Duration: 0.76 2023-10-03 06:31:07,974 WARNING [train.py:1204] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_23-189234-0_sp0.9 from training. Duration: 0.92225 2023-10-03 06:31:11,381 WARNING [train.py:1204] (2/4) Exclude cut with ID _4779_210_2084_1_1527318022944_3914893_346-168820-0_sp1.1 from training. Duration: 0.963625 2023-10-03 06:31:11,477 WARNING [train.py:1204] (2/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_5-55893-0 from training. Duration: 0.55 2023-10-03 06:31:12,796 WARNING [train.py:1204] (2/4) Exclude cut with ID _20455_210_5219_1_1531565996775_8098100_180-24517-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 06:31:18,266 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_3133_210_2087_1_1526106446_2324690_243-143119-0_sp0.9 from training. Duration: 0.8555625 2023-10-03 06:31:20,370 WARNING [train.py:1204] (2/4) Exclude cut with ID _65585_210_12210_1_1535450451584_3764098_293-212045-0_sp1.1 from training. Duration: 0.92725 2023-10-03 06:31:20,381 WARNING [train.py:1204] (2/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_577-332600-0 from training. Duration: 0.57 2023-10-03 06:31:20,606 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=1170866.6666666667, ans=0.1 2023-10-03 06:31:23,859 WARNING [train.py:1204] (2/4) Exclude cut with ID _30039_210_13270_1_1532484612900_2725150_104-289874-0_sp1.1 from training. Duration: 0.963625 2023-10-03 06:31:23,866 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_3172_210_2087_1_1526292929_3987865_197-97762-0_sp1.1 from training. Duration: 0.6818125 2023-10-03 06:31:26,563 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_256-150908-0_sp1.1 from training. Duration: 0.963625 2023-10-03 06:31:26,664 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_296-287678-0_sp1.1 from training. Duration: 0.863625 2023-10-03 06:31:26,698 WARNING [train.py:1204] (2/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_443-30034-0 from training. Duration: 0.64 2023-10-03 06:31:27,859 WARNING [train.py:1204] (2/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_494-199538-0_sp0.9 from training. Duration: 0.8333125 2023-10-03 06:31:27,935 WARNING [train.py:1204] (2/4) Exclude cut with ID _19708_210_9206_1_1532136650733_4238364_61-162325-0_sp1.1 from training. Duration: 0.97275 2023-10-03 06:31:28,154 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.attention_skip_rate, batch_count=1170933.3333333333, ans=0.0 2023-10-03 06:31:30,158 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.0.layers.1.self_attn2.whiten, num_groups=1, num_channels=192, metric=10.81 vs. limit=22.5 2023-10-03 06:31:30,526 WARNING [train.py:1204] (2/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_475-127455-0 from training. Duration: 0.7 2023-10-03 06:31:31,914 WARNING [train.py:1204] (2/4) Exclude cut with ID _48677_210_5663_1_1533902405919_3892989_188-11581-0_sp1.1 from training. Duration: 0.963625 2023-10-03 06:31:31,950 WARNING [train.py:1204] (2/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_136-8381-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 06:31:33,276 WARNING [train.py:1204] (2/4) Exclude cut with ID _55328_210_27767_1_1534580973260_4088170_270-229555-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 06:31:34,196 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.0.layers.0.nonlin_attention.whiten1, num_groups=1, num_channels=144, metric=10.10 vs. limit=10.0 2023-10-03 06:31:34,556 WARNING [train.py:1204] (2/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_372-266951-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 06:31:34,632 WARNING [train.py:1204] (2/4) Exclude cut with ID _47790_210_16187_1_1533862814066_4207519_14-242149-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 06:31:41,418 INFO [train.py:1046] (2/4) Epoch 34, batch 350, loss[loss=0.1471, simple_loss=0.2313, pruned_loss=0.03144, over 24649.00 frames. ], tot_loss[loss=0.1591, simple_loss=0.238, pruned_loss=0.04004, over 3914615.66 frames. ], batch size: 68, lr: 3.02e-03, grad_scale: 8.0 2023-10-03 06:31:41,483 WARNING [train.py:1204] (2/4) Exclude cut with ID _7877_210_6014_1_1528286358099_3750136_159-76512-0_sp1.1 from training. Duration: 0.936375 2023-10-03 06:31:41,486 WARNING [train.py:1204] (2/4) Exclude cut with ID _8263_210_1664_1_1528438953494_294100_40-204731-0_sp1.1 from training. Duration: 0.4545625 2023-10-03 06:31:44,772 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8278_210_2550_1_1528637099_4220413_694-238413-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 06:31:45,028 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.feed_forward1.out_proj.dropout_p, batch_count=1171000.0, ans=0.1 2023-10-03 06:31:46,055 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.537e+02 1.903e+02 2.085e+02 2.376e+02 3.254e+02, threshold=4.169e+02, percent-clipped=0.0 2023-10-03 06:31:46,352 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.bypass_mid.scale_min, batch_count=1171000.0, ans=0.2 2023-10-03 06:31:48,364 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.2.encoder.layers.2.nonlin_attention.whiten2, num_groups=1, num_channels=384, metric=4.42 vs. limit=15.0 2023-10-03 06:31:49,516 WARNING [train.py:1204] (2/4) Exclude cut with ID _47278_210_12041_1_1533902418349_3576110_421-172710-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 06:31:51,046 WARNING [train.py:1204] (2/4) Exclude cut with ID _14970_210_15709_1_1530773543493_1715730_215-282680-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 06:31:53,010 WARNING [train.py:1204] (2/4) Exclude cut with ID _64639_210_5816_1_1535367619437_3528000_306-149449-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 06:31:56,072 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_28-291982-0 from training. Duration: 0.97 2023-10-03 06:31:58,814 WARNING [train.py:1204] (2/4) Exclude cut with ID _55134_210_21982_1_1534590028377_3882170_412-15870-0_sp1.1 from training. Duration: 0.936375 2023-10-03 06:31:58,854 WARNING [train.py:1204] (2/4) Exclude cut with ID _13684_210_7497_1_1530619354863_3598359_312-235606-0 from training. Duration: 0.6 2023-10-03 06:32:00,377 WARNING [train.py:1204] (2/4) Exclude cut with ID _37112_210_2575_1_1532997007673_7268610_347-238993-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 06:32:01,698 WARNING [train.py:1204] (2/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_121-284152-0 from training. Duration: 0.59 2023-10-03 06:32:01,745 WARNING [train.py:1204] (2/4) Exclude cut with ID _2941_210_2577_1_1526199673150_2489759_228-2961-0_sp1.1 from training. Duration: 0.97275 2023-10-03 06:32:03,195 WARNING [train.py:1204] (2/4) Exclude cut with ID _28323_210_16187_1_1532480395667_4100300_531-24157-0 from training. Duration: 0.98 2023-10-03 06:32:05,884 WARNING [train.py:1204] (2/4) Exclude cut with ID _32704_210_8954_1_1533017488840_6747370_266-329755-0_sp0.9 from training. Duration: 0.988875 2023-10-03 06:32:06,048 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.0.conv_module1.balancer1.max_abs, batch_count=1171066.6666666667, ans=10.0 2023-10-03 06:32:07,314 WARNING [train.py:1204] (2/4) Exclude cut with ID _6653_210_2629_1_1527919172441_6732679_1038-146421-0_sp1.1 from training. Duration: 0.97275 2023-10-03 06:32:08,611 WARNING [train.py:1204] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_391-22148-0_sp0.9 from training. Duration: 0.8555625 2023-10-03 06:32:10,050 WARNING [train.py:1204] (2/4) Exclude cut with ID _64521_210_14699_1_1535243427259_3815328_543-341117-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 06:32:11,322 WARNING [train.py:1204] (2/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_52-168712-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 06:32:11,368 WARNING [train.py:1204] (2/4) Exclude cut with ID _33920_210_4220_1_1533257851500_3084399_198-334063-0_sp1.1 from training. Duration: 0.8545625 2023-10-03 06:32:11,386 WARNING [train.py:1204] (2/4) Exclude cut with ID _50093_210_4220_1_1534417141207_3432299_69-111987-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 06:32:12,716 WARNING [train.py:1204] (2/4) Exclude cut with ID _14821_210_8750_1_1530680503991_6829359_189-297451-0_sp0.9 from training. Duration: 0.611125 2023-10-03 06:32:14,156 WARNING [train.py:1204] (2/4) Exclude cut with ID _8030_210_7497_1_1528444886781_3531089_262-86750-0_sp0.9 from training. Duration: 0.9 2023-10-03 06:32:14,161 WARNING [train.py:1204] (2/4) Exclude cut with ID _24612_210_8913_1_1531998054193_3806359_294-191196-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 06:32:22,030 WARNING [train.py:1204] (2/4) Exclude cut with ID _16909_210_5724_1_1531292079912_4118919_707-342896-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 06:32:22,053 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_9460_210_4211_1_1528954587_10049667_572-37035-0_sp0.9 from training. Duration: 0.888875 2023-10-03 06:32:23,755 WARNING [train.py:1204] (2/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_762-109886-0_sp1.1 from training. Duration: 0.82725 2023-10-03 06:32:23,779 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_14953_210_2151_1_1530701710_5616165_519-326393-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 06:32:27,200 WARNING [train.py:1204] (2/4) Exclude cut with ID _25914_210_6400_1_1532141317058_644740_52-96808-0 from training. Duration: 0.91 2023-10-03 06:32:27,204 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_68095_210_20185_1_1535718226_8888855_1079-94525-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 06:32:32,646 WARNING [train.py:1204] (2/4) Exclude cut with ID _14860_210_4915_1_1530702020340_7343240_746-3647-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 06:32:32,646 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_70379_210_7925_1_1535941455_557455_49-101720-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 06:32:32,667 WARNING [train.py:1204] (2/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_8-307291-0_sp1.1 from training. Duration: 0.936375 2023-10-03 06:32:33,884 WARNING [train.py:1204] (2/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_117-290916-0 from training. Duration: 0.51 2023-10-03 06:32:36,654 WARNING [train.py:1204] (2/4) Exclude cut with ID _22020_210_2461_1_1531724164513_6326900_944-339840-0_sp1.1 from training. Duration: 0.963625 2023-10-03 06:32:36,742 WARNING [train.py:1204] (2/4) Exclude cut with ID IC0515W0043-254911 from training. Duration: 0.976 2023-10-03 06:32:39,396 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_3910_210_1794_1_1526742306_334530_28-238909-0 from training. Duration: 0.81 2023-10-03 06:32:39,416 WARNING [train.py:1204] (2/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_269-267275-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 06:32:43,546 WARNING [train.py:1204] (2/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_72-251255-0_sp1.1 from training. Duration: 0.8545625 2023-10-03 06:32:43,557 WARNING [train.py:1204] (2/4) Exclude cut with ID _14886_210_4220_1_1530702020355_3516190_175-229073-0 from training. Duration: 0.45 2023-10-03 06:32:43,788 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.conv_module2.balancer2.min_positive, batch_count=1171266.6666666667, ans=0.05 2023-10-03 06:32:46,257 WARNING [train.py:1204] (2/4) Exclude cut with ID _40519_210_13794_1_1533715228655_7337390_921-209452-0_sp1.1 from training. Duration: 0.963625 2023-10-03 06:32:48,189 WARNING [train.py:1204] (2/4) Exclude cut with ID _29857_210_20034_1_1532395886753_3798160_335-163562-0_sp0.9 from training. Duration: 0.9444375 2023-10-03 06:32:48,292 WARNING [train.py:1204] (2/4) Exclude cut with ID _58583_210_5236_1_1534726835266_3574249_384-58445-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 06:32:49,708 WARNING [train.py:1204] (2/4) Exclude cut with ID _44011_210_2046_1_1533722401691_3631310_96-315656-0_sp1.1 from training. Duration: 0.963625 2023-10-03 06:32:49,713 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_68484_210_1794_1_1535849804_7678097_549-170613-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 06:32:51,525 WARNING [train.py:1204] (2/4) Exclude cut with ID _37892_210_19596_1_1533198677897_3964058_214-148058-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 06:32:54,686 WARNING [train.py:1204] (2/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_415-78792-0_sp0.9 from training. Duration: 0.9 2023-10-03 06:32:56,037 INFO [train.py:1046] (2/4) Epoch 34, batch 400, loss[loss=0.1638, simple_loss=0.2369, pruned_loss=0.0454, over 23781.00 frames. ], tot_loss[loss=0.1589, simple_loss=0.2381, pruned_loss=0.03981, over 4098938.06 frames. ], batch size: 164, lr: 3.02e-03, grad_scale: 16.0 2023-10-03 06:32:56,218 WARNING [train.py:1204] (2/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_383-205833-0_sp1.1 from training. Duration: 0.863625 2023-10-03 06:32:58,006 WARNING [train.py:1204] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_167-137255-0 from training. Duration: 0.64 2023-10-03 06:32:58,022 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8275_210_4198_1_1528455320_3892571_484-154786-0_sp1.1 from training. Duration: 0.963625 2023-10-03 06:32:59,393 WARNING [train.py:1204] (2/4) Exclude cut with ID _40284_210_2084_1_1533513679814_3704279_396-319412-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 06:33:00,675 WARNING [train.py:1204] (2/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_280-120942-0_sp0.9 from training. Duration: 0.8555625 2023-10-03 06:33:02,148 WARNING [train.py:1204] (2/4) Exclude cut with ID _45630_210_10556_1_1533949174498_3902049_558-240889-0_sp1.1 from training. Duration: 0.92725 2023-10-03 06:33:04,906 WARNING [train.py:1204] (2/4) Exclude cut with ID _56235_210_10640_1_1534557335235_4161099_197-307458-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 06:33:04,995 WARNING [train.py:1204] (2/4) Exclude cut with ID _16909_210_5724_1_1531292079912_4118919_163-254843-0_sp1.1 from training. Duration: 0.92725 2023-10-03 06:33:05,158 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.balancer1.prob, batch_count=1171333.3333333333, ans=0.125 2023-10-03 06:33:06,325 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_103-84010-0 from training. Duration: 0.69 2023-10-03 06:33:09,094 WARNING [train.py:1204] (2/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_401-158239-0 from training. Duration: 0.71 2023-10-03 06:33:09,095 WARNING [train.py:1204] (2/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_899-233658-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 06:33:10,584 WARNING [train.py:1204] (2/4) Exclude cut with ID _23825_210_13449_1_1531915313526_3497150_240-271367-0 from training. Duration: 0.97 2023-10-03 06:33:11,825 WARNING [train.py:1204] (2/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_424-134619-0_sp1.1 from training. Duration: 0.92725 2023-10-03 06:33:14,827 WARNING [train.py:1204] (2/4) Exclude cut with ID _44831_210_13427_1_1533816011549_3689035_32-188147-0_sp1.1 from training. Duration: 0.87275 2023-10-03 06:33:14,830 WARNING [train.py:1204] (2/4) Exclude cut with ID _59170_210_6721_1_1534983633960_4203335_314-63879-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 06:33:14,851 WARNING [train.py:1204] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_602-275260-0 from training. Duration: 0.82 2023-10-03 06:33:14,895 WARNING [train.py:1204] (2/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_511-306767-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 06:33:16,353 WARNING [train.py:1204] (2/4) Exclude cut with ID _41817_210_18835_1_1533434477160_7423049_565-201775-0_sp1.1 from training. Duration: 0.92725 2023-10-03 06:33:16,369 WARNING [train.py:1204] (2/4) Exclude cut with ID _28317_210_12742_1_1532480302381_8510325_778-173151-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 06:33:16,407 WARNING [train.py:1204] (2/4) Exclude cut with ID _14998_210_5219_1_1530705582080_7474970_680-51001-0_sp1.1 from training. Duration: 0.963625 2023-10-03 06:33:19,853 WARNING [train.py:1204] (2/4) Exclude cut with ID IC0677W0059-334891 from training. Duration: 0.896 2023-10-03 06:33:21,069 WARNING [train.py:1204] (2/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_370-331600-0 from training. Duration: 0.94 2023-10-03 06:33:25,827 WARNING [train.py:1204] (2/4) Exclude cut with ID _66840_210_5219_1_1535452168426_4196349_446-240192-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 06:33:25,919 WARNING [train.py:1204] (2/4) Exclude cut with ID _65242_210_18628_1_1535369393717_3823149_38-6732-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 06:33:27,266 WARNING [train.py:1204] (2/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_302-2550-0 from training. Duration: 0.81 2023-10-03 06:33:29,186 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_43802_210_2290_1_1533516203_8829504_100-348441-0 from training. Duration: 0.91 2023-10-03 06:33:29,439 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.conv_module1.balancer1.prob, batch_count=1171466.6666666667, ans=0.125 2023-10-03 06:33:31,952 WARNING [train.py:1204] (2/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_329-285801-0_sp1.1 from training. Duration: 0.82725 2023-10-03 06:33:33,265 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_30167_210_3528_1_1532428012_4772375_343-334712-0_sp1.1 from training. Duration: 0.936375 2023-10-03 06:33:33,408 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.1.bypass.skip_rate, batch_count=1171466.6666666667, ans=0.035 2023-10-03 06:33:34,979 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.conv_skip_rate, batch_count=1171466.6666666667, ans=0.0 2023-10-03 06:33:37,511 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7543_210_6753_1_1528543318_4552_1-85072-0 from training. Duration: 0.83 2023-10-03 06:33:40,273 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7002_210_1794_1_1527935634_4034444_487-236501-0_sp1.1 from training. Duration: 0.6 2023-10-03 06:33:41,584 WARNING [train.py:1204] (2/4) Exclude cut with ID _7160_210_1876_1_1527940923231_3703799_282-322611-0 from training. Duration: 0.87 2023-10-03 06:33:45,461 WARNING [train.py:1204] (2/4) Exclude cut with ID _67335_210_5219_1_1535525975652_4275229_570-262983-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 06:33:45,572 WARNING [train.py:1204] (2/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_625-338546-0_sp1.1 from training. Duration: 0.8 2023-10-03 06:33:45,591 WARNING [train.py:1204] (2/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_176-56120-0 from training. Duration: 0.83 2023-10-03 06:33:49,519 WARNING [train.py:1204] (2/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_963-187109-0_sp1.1 from training. Duration: 0.9 2023-10-03 06:33:51,055 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.balancer2.prob, batch_count=1171533.3333333333, ans=0.125 2023-10-03 06:33:52,287 WARNING [train.py:1204] (2/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_121-284152-0_sp0.9 from training. Duration: 0.6555625 2023-10-03 06:33:54,191 WARNING [train.py:1204] (2/4) Exclude cut with ID _45021_210_9994_1_1533619751094_3330288_104-81711-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 06:33:54,344 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.feed_forward1.hidden_balancer.prob, batch_count=1171600.0, ans=0.125 2023-10-03 06:33:56,951 WARNING [train.py:1204] (2/4) Exclude cut with ID _51382_210_23862_1_1534492801838_3650104_129-343914-0_sp1.1 from training. Duration: 0.936375 2023-10-03 06:33:56,980 WARNING [train.py:1204] (2/4) Exclude cut with ID _14970_210_15709_1_1530775547361_1966965_8-169924-0 from training. Duration: 0.92 2023-10-03 06:34:00,300 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_2295_210_2087_1_1525500993_2407863_234-83104-0_sp1.1 from training. Duration: 0.563625 2023-10-03 06:34:00,380 WARNING [train.py:1204] (2/4) Exclude cut with ID _14687_210_5854_1_1530703838259_3664889_115-239046-0 from training. Duration: 0.67 2023-10-03 06:34:01,834 WARNING [train.py:1204] (2/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_690-126306-0_sp1.1 from training. Duration: 0.7090625 2023-10-03 06:34:01,842 WARNING [train.py:1204] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_625-102955-0_sp0.9 from training. Duration: 0.8555625 2023-10-03 06:34:04,604 WARNING [train.py:1204] (2/4) Exclude cut with ID _9389_210_1664_1_1529217337301_5289269_289-213417-0 from training. Duration: 0.89 2023-10-03 06:34:06,654 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.2.encoder.layers.1.feed_forward3.out_whiten, num_groups=1, num_channels=384, metric=8.81 vs. limit=15.0 2023-10-03 06:34:07,237 WARNING [train.py:1204] (2/4) Exclude cut with ID _13253_210_1925_1_1530757775819_3577873_191-144977-0_sp0.9 from training. Duration: 0.8666875 2023-10-03 06:34:08,664 WARNING [train.py:1204] (2/4) Exclude cut with ID _21783_210_11357_1_1531742458730_3762679_325-155703-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 06:34:08,706 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_2295_210_2087_1_1525500993_2407863_116-238894-0_sp0.9 from training. Duration: 0.77775 2023-10-03 06:34:08,920 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=1171666.6666666667, ans=0.1 2023-10-03 06:34:09,942 INFO [train.py:1046] (2/4) Epoch 34, batch 450, loss[loss=0.1634, simple_loss=0.2329, pruned_loss=0.04696, over 22806.00 frames. ], tot_loss[loss=0.1597, simple_loss=0.239, pruned_loss=0.0402, over 4232948.25 frames. ], batch size: 322, lr: 3.02e-03, grad_scale: 16.0 2023-10-03 06:34:10,079 WARNING [train.py:1204] (2/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_94-126128-0 from training. Duration: 0.86 2023-10-03 06:34:10,094 WARNING [train.py:1204] (2/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_395-310220-0_sp0.9 from training. Duration: 0.988875 2023-10-03 06:34:11,483 WARNING [train.py:1204] (2/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_821-110852-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 06:34:12,770 WARNING [train.py:1204] (2/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_64-248359-0_sp1.1 from training. Duration: 0.62725 2023-10-03 06:34:12,781 WARNING [train.py:1204] (2/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_138-187031-0 from training. Duration: 0.99 2023-10-03 06:34:12,808 WARNING [train.py:1204] (2/4) Exclude cut with ID _4122_210_2084_1_1526713164417_4353280_528-206771-0_sp0.9 from training. Duration: 0.97775 2023-10-03 06:34:14,034 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.541e+02 1.869e+02 1.964e+02 2.234e+02 2.686e+02, threshold=3.928e+02, percent-clipped=0.0 2023-10-03 06:34:14,177 WARNING [train.py:1204] (2/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_844-96989-0_sp1.1 from training. Duration: 0.6454375 2023-10-03 06:34:15,567 WARNING [train.py:1204] (2/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_106-30782-0_sp1.1 from training. Duration: 0.72725 2023-10-03 06:34:26,341 WARNING [train.py:1204] (2/4) Exclude cut with ID _14683_210_5219_1_1530698352608_3974859_281-250040-0_sp1.1 from training. Duration: 0.936375 2023-10-03 06:34:26,377 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_46013_210_7234_1_1533776362_3744032_463-52378-0_sp0.9 from training. Duration: 0.9555625 2023-10-03 06:34:27,837 WARNING [train.py:1204] (2/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_299-34413-0 from training. Duration: 0.65 2023-10-03 06:34:29,236 WARNING [train.py:1204] (2/4) Exclude cut with ID _35799_210_5854_1_1533263487887_3529071_490-326847-0 from training. Duration: 0.99 2023-10-03 06:34:32,564 WARNING [train.py:1204] (2/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_589-9070-0_sp0.9 from training. Duration: 0.788875 2023-10-03 06:34:34,074 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.out_combiner.scale_min, batch_count=1171733.3333333333, ans=0.2 2023-10-03 06:34:35,302 WARNING [train.py:1204] (2/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_884-29515-0_sp1.1 from training. Duration: 0.936375 2023-10-03 06:34:36,809 WARNING [train.py:1204] (2/4) Exclude cut with ID _15012_210_1968_1_1530856820429_5095350_474-119995-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 06:34:39,628 WARNING [train.py:1204] (2/4) Exclude cut with ID _65585_210_12210_1_1535450451584_3764098_211-97124-0_sp1.1 from training. Duration: 0.963625 2023-10-03 06:34:40,899 WARNING [train.py:1204] (2/4) Exclude cut with ID _33640_210_2655_1_1533117605479_4855469_666-224463-0_sp1.1 from training. Duration: 0.963625 2023-10-03 06:34:43,539 WARNING [train.py:1204] (2/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_35-153886-0 from training. Duration: 0.84 2023-10-03 06:34:43,597 WARNING [train.py:1204] (2/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_210-106047-0 from training. Duration: 0.97 2023-10-03 06:34:46,183 WARNING [train.py:1204] (2/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_479-125611-0 from training. Duration: 0.96 2023-10-03 06:34:46,204 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_9417_210_1794_1_1529235261_4614131_427-153963-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 06:34:47,536 WARNING [train.py:1204] (2/4) Exclude cut with ID _23818_210_6228_1_1531917063884_3676920_133-170571-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 06:34:47,609 WARNING [train.py:1204] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_602-275260-0_sp1.1 from training. Duration: 0.7454375 2023-10-03 06:34:50,916 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9029W0121-509521 from training. Duration: 0.896 2023-10-03 06:34:50,924 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9029W0434-509821 from training. Duration: 0.64 2023-10-03 06:34:50,948 WARNING [train.py:1204] (2/4) Exclude cut with ID _23038_210_9570_1_1532001584158_3652419_289-307034-0_sp1.1 from training. Duration: 0.936375 2023-10-03 06:34:51,138 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.conv_skip_rate, batch_count=1171800.0, ans=0.0 2023-10-03 06:34:52,372 WARNING [train.py:1204] (2/4) Exclude cut with ID _29345_210_4381_1_1532689168263_3860169_149-257268-0_sp0.9 from training. Duration: 0.9 2023-10-03 06:34:53,752 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_405-339986-0_sp1.1 from training. Duration: 0.463625 2023-10-03 06:34:54,589 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.1.encoder.layers.0.feed_forward1.out_whiten, num_groups=1, num_channels=256, metric=4.94 vs. limit=15.0 2023-10-03 06:34:56,974 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_2655_210_2087_1_1525688959_3897400_437-337690-0_sp1.1 from training. Duration: 0.57275 2023-10-03 06:34:58,197 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7543_210_6753_1_1528541907_1399322_165-323619-0_sp1.1 from training. Duration: 0.62725 2023-10-03 06:34:58,248 WARNING [train.py:1204] (2/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_508-335369-0_sp1.1 from training. Duration: 0.436375 2023-10-03 06:34:58,312 WARNING [train.py:1204] (2/4) Exclude cut with ID _7852_210_5219_1_1528286471316_8139400_358-287492-0 from training. Duration: 0.96 2023-10-03 06:35:01,062 WARNING [train.py:1204] (2/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_322-217220-0_sp0.9 from training. Duration: 0.9555625 2023-10-03 06:35:03,964 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_3171_210_2087_1_1526109084_2330007_66-260583-0_sp0.9 from training. Duration: 0.8 2023-10-03 06:35:04,001 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8558_210_1410_1_1528505225_7613406_131-329524-0_sp1.1 from training. Duration: 0.7181875 2023-10-03 06:35:07,217 WARNING [train.py:1204] (2/4) Exclude cut with ID _9235_210_5219_1_1528891154253_8046120_30-326681-0 from training. Duration: 0.83 2023-10-03 06:35:11,408 WARNING [train.py:1204] (2/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_58-68956-0_sp0.9 from training. Duration: 0.92225 2023-10-03 06:35:12,683 WARNING [train.py:1204] (2/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_255-312654-0 from training. Duration: 0.91 2023-10-03 06:35:12,765 WARNING [train.py:1204] (2/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_99-167392-0 from training. Duration: 0.61 2023-10-03 06:35:14,113 WARNING [train.py:1204] (2/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_94-126128-0_sp0.9 from training. Duration: 0.9555625 2023-10-03 06:35:15,788 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.nonlin_attention.balancer.prob, batch_count=1171933.3333333333, ans=0.125 2023-10-03 06:35:18,338 WARNING [train.py:1204] (2/4) Exclude cut with ID _68635_210_9317_1_1535770813410_4315870_187-192817-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 06:35:19,681 WARNING [train.py:1204] (2/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_277-204970-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 06:35:21,667 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.2.encoder.layers.1.feed_forward2.out_whiten, num_groups=1, num_channels=384, metric=4.78 vs. limit=15.0 2023-10-03 06:35:22,461 INFO [train.py:1046] (2/4) Epoch 34, batch 500, loss[loss=0.1723, simple_loss=0.2391, pruned_loss=0.05272, over 22853.00 frames. ], tot_loss[loss=0.16, simple_loss=0.2396, pruned_loss=0.04022, over 4347008.63 frames. ], batch size: 322, lr: 3.02e-03, grad_scale: 16.0 2023-10-03 06:35:22,511 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6163_210_6400_1_1527506746_4263068_283-137811-0_sp0.9 from training. Duration: 0.8555625 2023-10-03 06:35:22,539 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9144W0121-566446 from training. Duration: 0.896 2023-10-03 06:35:23,934 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.2.encoder.layers.0.feed_forward2.out_whiten, num_groups=1, num_channels=384, metric=5.10 vs. limit=15.0 2023-10-03 06:35:27,631 WARNING [train.py:1204] (2/4) Exclude cut with ID _49371_210_12071_1_1533952761021_3812190_351-315519-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 06:35:29,065 WARNING [train.py:1204] (2/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_115-326778-0_sp1.1 from training. Duration: 0.8454375 2023-10-03 06:35:29,125 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_17292_210_6753_1_1531125388_2943734_436-198965-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 06:35:29,134 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9165W0117-576751 from training. Duration: 0.896 2023-10-03 06:35:30,585 WARNING [train.py:1204] (2/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_212-149947-0 from training. Duration: 0.73 2023-10-03 06:35:30,593 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_13757_210_3528_1_1530440682_4107239_65-167544-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 06:35:33,393 WARNING [train.py:1204] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_700-183549-0_sp0.9 from training. Duration: 0.7555625 2023-10-03 06:35:36,283 WARNING [train.py:1204] (2/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_216-343007-0_sp0.9 from training. Duration: 0.7333125 2023-10-03 06:35:38,105 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7544_210_6753_1_1528587722_3576686_295-258479-0_sp1.1 from training. Duration: 0.763625 2023-10-03 06:35:39,499 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_365-287106-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 06:35:39,512 WARNING [train.py:1204] (2/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_300-3747-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 06:35:39,624 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.0.feed_forward3.hidden_balancer.prob, batch_count=1172066.6666666667, ans=0.125 2023-10-03 06:35:40,871 WARNING [train.py:1204] (2/4) Exclude cut with ID _14069_210_5632_1_1530512692033_4803390_487-303201-0_sp1.1 from training. Duration: 0.92725 2023-10-03 06:35:49,088 WARNING [train.py:1204] (2/4) Exclude cut with ID _14459_210_2228_1_1531443641946_7516700_668-123026-0_sp1.1 from training. Duration: 0.936375 2023-10-03 06:35:49,116 WARNING [train.py:1204] (2/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_108-53929-0_sp1.1 from training. Duration: 0.536375 2023-10-03 06:35:50,391 WARNING [train.py:1204] (2/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_266-137104-0_sp0.9 from training. Duration: 0.72225 2023-10-03 06:35:50,422 WARNING [train.py:1204] (2/4) Exclude cut with ID _12302_210_5219_1_1529924446466_8107280_902-168619-0_sp1.1 from training. Duration: 0.936375 2023-10-03 06:35:50,652 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.ff2_skip_rate, batch_count=1172133.3333333333, ans=0.0 2023-10-03 06:35:51,708 WARNING [train.py:1204] (2/4) Exclude cut with ID _59502_210_12210_1_1535107024698_1835139_146-162495-0 from training. Duration: 0.92 2023-10-03 06:35:51,735 WARNING [train.py:1204] (2/4) Exclude cut with ID _3465_210_3525_1_1526175667060_3574049_412-287526-0_sp1.1 from training. Duration: 0.6545625 2023-10-03 06:35:55,605 WARNING [train.py:1204] (2/4) Exclude cut with ID _7160_210_1876_1_1527940923231_3703799_120-150943-0_sp0.9 from training. Duration: 0.9 2023-10-03 06:35:56,875 WARNING [train.py:1204] (2/4) Exclude cut with ID _15037_210_2253_1_1530752294104_3267650_296-192597-0_sp1.1 from training. Duration: 0.736375 2023-10-03 06:35:56,893 WARNING [train.py:1204] (2/4) Exclude cut with ID _13252_210_1925_1_1530671372047_3336909_397-216497-0_sp1.1 from training. Duration: 0.87275 2023-10-03 06:35:56,907 WARNING [train.py:1204] (2/4) Exclude cut with ID _40094_210_16380_1_1533294005773_3641159_76-269320-0_sp1.1 from training. Duration: 0.936375 2023-10-03 06:35:58,267 WARNING [train.py:1204] (2/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_449-15508-0 from training. Duration: 0.82 2023-10-03 06:36:02,431 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9309W0194-640321 from training. Duration: 0.64 2023-10-03 06:36:03,884 WARNING [train.py:1204] (2/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_284-312178-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 06:36:05,304 WARNING [train.py:1204] (2/4) Exclude cut with ID _29853_210_7497_1_1532505600851_6642110_401-55937-0_sp1.1 from training. Duration: 0.92725 2023-10-03 06:36:06,700 WARNING [train.py:1204] (2/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_716-24918-0_sp1.1 from training. Duration: 0.92725 2023-10-03 06:36:06,722 WARNING [train.py:1204] (2/4) Exclude cut with ID _19637_210_4220_1_1531537167676_3205060_314-305638-0_sp1.1 from training. Duration: 0.92725 2023-10-03 06:36:08,008 WARNING [train.py:1204] (2/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_475-127455-0_sp1.1 from training. Duration: 0.636375 2023-10-03 06:36:09,502 WARNING [train.py:1204] (2/4) Exclude cut with ID _5444_210_2151_1_1527418808464_5312210_581-315411-0 from training. Duration: 0.62 2023-10-03 06:36:12,251 WARNING [train.py:1204] (2/4) Exclude cut with ID _44902_210_16187_1_1533888006062_3792078_149-67212-0_sp0.9 from training. Duration: 0.9666875 2023-10-03 06:36:14,100 WARNING [train.py:1204] (2/4) Exclude cut with ID _31602_210_5854_1_1532677021456_4458250_63-63253-0_sp1.1 from training. Duration: 0.963625 2023-10-03 06:36:18,231 WARNING [train.py:1204] (2/4) Exclude cut with ID _55209_210_1531_1_1534675828996_4597160_660-203992-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 06:36:20,985 WARNING [train.py:1204] (2/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_83-190624-0_sp1.1 from training. Duration: 0.92725 2023-10-03 06:36:27,428 WARNING [train.py:1204] (2/4) Exclude cut with ID _58605_210_12210_1_1534723195939_3904259_154-139635-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 06:36:31,423 WARNING [train.py:1204] (2/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_164-224052-0 from training. Duration: 0.7 2023-10-03 06:36:31,432 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_1126-189071-0_sp1.1 from training. Duration: 0.963625 2023-10-03 06:36:31,449 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_15396_210_6890_1_1531308473_1938936_310-214789-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 06:36:31,603 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.1.conv_module1.balancer2.prob, batch_count=1172266.6666666667, ans=0.125 2023-10-03 06:36:34,238 WARNING [train.py:1204] (2/4) Exclude cut with ID _8395_210_1531_1_1528597871523_4411579_431-132470-0 from training. Duration: 0.74 2023-10-03 06:36:35,492 INFO [train.py:1046] (2/4) Epoch 34, batch 550, loss[loss=0.1769, simple_loss=0.2493, pruned_loss=0.05222, over 23609.00 frames. ], tot_loss[loss=0.1615, simple_loss=0.2411, pruned_loss=0.04096, over 4433469.76 frames. ], batch size: 256, lr: 3.02e-03, grad_scale: 16.0 2023-10-03 06:36:35,565 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_3780_210_2087_1_1526467760_2130483_170-259044-0_sp0.9 from training. Duration: 0.77775 2023-10-03 06:36:37,069 WARNING [train.py:1204] (2/4) Exclude cut with ID _12733_210_8341_1_1530167424556_3646380_537-230640-0_sp1.1 from training. Duration: 0.963625 2023-10-03 06:36:39,850 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.559e+02 1.879e+02 2.022e+02 2.267e+02 3.367e+02, threshold=4.045e+02, percent-clipped=0.0 2023-10-03 06:36:41,390 WARNING [train.py:1204] (2/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_159-281310-0 from training. Duration: 0.95 2023-10-03 06:36:42,849 WARNING [train.py:1204] (2/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_434-83000-0 from training. Duration: 0.98 2023-10-03 06:36:42,884 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_4405_210_1849_1_1526732205_3593939_493-32464-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 06:36:42,893 WARNING [train.py:1204] (2/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_342-100157-0 from training. Duration: 0.54 2023-10-03 06:36:42,943 WARNING [train.py:1204] (2/4) Exclude cut with ID _44778_210_2054_1_1533798127203_7443090_765-250133-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 06:36:44,231 WARNING [train.py:1204] (2/4) Exclude cut with ID _38670_210_15379_1_1533448810659_4391059_81-312245-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 06:36:44,310 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_50050_210_16223_1_1535435796_7290138_1273-247611-0_sp1.1 from training. Duration: 0.936375 2023-10-03 06:36:46,230 WARNING [train.py:1204] (2/4) Exclude cut with ID _44831_210_13427_1_1533816011549_3689035_399-18591-0_sp1.1 from training. Duration: 0.936375 2023-10-03 06:36:46,246 WARNING [train.py:1204] (2/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_557-324258-0_sp0.9 from training. Duration: 0.9 2023-10-03 06:36:47,568 WARNING [train.py:1204] (2/4) Exclude cut with ID _3465_210_3525_1_1526175667060_3574049_62-335552-0_sp1.1 from training. Duration: 0.7909375 2023-10-03 06:36:48,986 WARNING [train.py:1204] (2/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_673-264125-0_sp1.1 from training. Duration: 0.963625 2023-10-03 06:36:49,072 WARNING [train.py:1204] (2/4) Exclude cut with ID _8030_210_7497_1_1528444886781_3531089_262-86750-0 from training. Duration: 0.81 2023-10-03 06:36:49,228 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.bypass.scale_min, batch_count=1172400.0, ans=0.2 2023-10-03 06:36:50,393 WARNING [train.py:1204] (2/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_444-28041-0_sp1.1 from training. Duration: 0.77275 2023-10-03 06:36:54,597 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_34008_210_10431_1_1533345424_8183247_516-14858-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 06:36:54,610 WARNING [train.py:1204] (2/4) Exclude cut with ID _32704_210_8954_1_1533017488840_6747370_54-248218-0_sp1.1 from training. Duration: 0.936375 2023-10-03 06:36:57,869 WARNING [train.py:1204] (2/4) Exclude cut with ID _13253_210_1925_1_1530757775819_3577873_300-119782-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 06:36:57,940 WARNING [train.py:1204] (2/4) Exclude cut with ID _39290_210_4220_1_1533862778760_3075118_21-240703-0_sp1.1 from training. Duration: 0.936375 2023-10-03 06:37:03,472 WARNING [train.py:1204] (2/4) Exclude cut with ID 774-127930-0014-10412-0_sp1.1 from training. Duration: 0.95 2023-10-03 06:37:03,526 WARNING [train.py:1204] (2/4) Exclude cut with ID _67499_210_17282_1_1535509805220_3692430_20-179346-0 from training. Duration: 0.97 2023-10-03 06:37:03,639 WARNING [train.py:1204] (2/4) Exclude cut with ID _8365_210_2064_1_1528592311070_4677690_431-294102-0_sp0.9 from training. Duration: 0.988875 2023-10-03 06:37:08,002 WARNING [train.py:1204] (2/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_365-1492-0_sp1.1 from training. Duration: 0.82725 2023-10-03 06:37:09,344 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_105-114797-0_sp0.9 from training. Duration: 0.9333125 2023-10-03 06:37:10,663 WARNING [train.py:1204] (2/4) Exclude cut with ID _6274_210_2084_1_1527937583837_7494020_769-168302-0_sp0.9 from training. Duration: 0.788875 2023-10-03 06:37:13,483 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_40340_210_9024_1_1533433916_4132944_320-270277-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 06:37:13,488 WARNING [train.py:1204] (2/4) Exclude cut with ID ID0203W0003-781711 from training. Duration: 0.958 2023-10-03 06:37:15,415 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_30434_210_13049_1_1532519384_981746_52-12932-0_sp1.1 from training. Duration: 0.936375 2023-10-03 06:37:15,514 WARNING [train.py:1204] (2/4) Exclude cut with ID _8263_210_1664_1_1528438953494_294100_40-204731-0_sp0.9 from training. Duration: 0.5555625 2023-10-03 06:37:18,290 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1032-236863-0_sp0.9 from training. Duration: 0.9333125 2023-10-03 06:37:19,525 WARNING [train.py:1204] (2/4) Exclude cut with ID _8394_210_6702_1_1528590504316_3540189_335-274780-0_sp0.9 from training. Duration: 0.8666875 2023-10-03 06:37:19,533 WARNING [train.py:1204] (2/4) Exclude cut with ID _66873_210_5517_1_1535454136506_4293370_654-52713-0_sp1.1 from training. Duration: 0.7 2023-10-03 06:37:19,624 WARNING [train.py:1204] (2/4) Exclude cut with ID _5250_210_5219_1_1527249561806_8692110_742-267856-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 06:37:22,935 WARNING [train.py:1204] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_74-48717-0 from training. Duration: 0.58 2023-10-03 06:37:23,029 WARNING [train.py:1204] (2/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_18-46756-0 from training. Duration: 0.79 2023-10-03 06:37:23,229 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.bypass_mid.scale_min, batch_count=1172533.3333333333, ans=0.2 2023-10-03 06:37:24,426 WARNING [train.py:1204] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_612-56061-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 06:37:24,442 WARNING [train.py:1204] (2/4) Exclude cut with ID _56423_210_5854_1_1534896061049_3165969_412-2403-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 06:37:24,498 WARNING [train.py:1204] (2/4) Exclude cut with ID _62574_210_2655_1_1535180407058_3933249_120-196848-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 06:37:24,499 WARNING [train.py:1204] (2/4) Exclude cut with ID _40519_210_13794_1_1533715228655_7337390_916-51000-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 06:37:26,334 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.0.conv_skip_rate, batch_count=1172533.3333333333, ans=0.0 2023-10-03 06:37:27,590 WARNING [train.py:1204] (2/4) Exclude cut with ID _65263_210_22356_1_1535763598691_3921037_108-21902-0_sp1.1 from training. Duration: 0.9 2023-10-03 06:37:27,674 WARNING [train.py:1204] (2/4) Exclude cut with ID _4122_210_2084_1_1526713164417_4353280_528-206771-0_sp1.1 from training. Duration: 0.8 2023-10-03 06:37:27,895 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.conv_skip_rate, batch_count=1172533.3333333333, ans=0.0 2023-10-03 06:37:29,887 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.self_attn_weights.pos_emb_skip_rate, batch_count=1172533.3333333333, ans=0.0 2023-10-03 06:37:32,433 WARNING [train.py:1204] (2/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_43-325657-0_sp1.1 from training. Duration: 0.92725 2023-10-03 06:37:32,484 WARNING [train.py:1204] (2/4) Exclude cut with ID _44659_210_2721_1_1534036120061_6614860_314-82041-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 06:37:33,849 WARNING [train.py:1204] (2/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_81-300212-0_sp0.9 from training. Duration: 0.6666875 2023-10-03 06:37:33,940 WARNING [train.py:1204] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_665-196155-0_sp0.9 from training. Duration: 0.7444375 2023-10-03 06:37:34,094 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.balancer1.prob, batch_count=1172600.0, ans=0.125 2023-10-03 06:37:36,626 WARNING [train.py:1204] (2/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_140-336881-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 06:37:38,030 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6290_210_3273_1_1527595224_1282787_115-87095-0_sp0.9 from training. Duration: 0.611125 2023-10-03 06:37:38,089 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_53373_210_5399_1_1534229471_4715695_607-259685-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 06:37:38,239 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=1172600.0, ans=0.1 2023-10-03 06:37:39,398 WARNING [train.py:1204] (2/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_175-163894-0_sp1.1 from training. Duration: 0.67275 2023-10-03 06:37:39,420 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7002_210_1794_1_1527939683_5157286_1-281264-0_sp1.1 from training. Duration: 0.4 2023-10-03 06:37:44,992 WARNING [train.py:1204] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_605-257205-0 from training. Duration: 0.59 2023-10-03 06:37:48,205 WARNING [train.py:1204] (2/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_454-314901-0 from training. Duration: 0.46 2023-10-03 06:37:49,528 INFO [train.py:1046] (2/4) Epoch 34, batch 600, loss[loss=0.155, simple_loss=0.2346, pruned_loss=0.0377, over 23673.00 frames. ], tot_loss[loss=0.1618, simple_loss=0.241, pruned_loss=0.04133, over 4497248.18 frames. ], batch size: 149, lr: 3.02e-03, grad_scale: 16.0 2023-10-03 06:37:49,663 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_46032_210_14656_1_1533781412_4507969_287-130422-0_sp1.1 from training. Duration: 0.97275 2023-10-03 06:37:50,870 WARNING [train.py:1204] (2/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_186-309161-0_sp1.1 from training. Duration: 0.4909375 2023-10-03 06:37:50,920 WARNING [train.py:1204] (2/4) Exclude cut with ID _42659_210_2070_1_1533607442765_3120380_45-342307-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 06:37:57,602 WARNING [train.py:1204] (2/4) Exclude cut with ID _12555_210_8750_1_1530075594402_3780239_478-326818-0_sp1.1 from training. Duration: 0.963625 2023-10-03 06:38:00,778 WARNING [train.py:1204] (2/4) Exclude cut with ID _7606_210_4915_1_1528894253277_5080690_127-177761-0_sp1.1 from training. Duration: 0.5818125 2023-10-03 06:38:02,153 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6788_210_1388_1_1527898698_4562021_91-197981-0 from training. Duration: 0.83 2023-10-03 06:38:06,007 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8558_210_1410_1_1528505225_7613406_131-329524-0_sp0.9 from training. Duration: 0.87775 2023-10-03 06:38:06,139 WARNING [train.py:1204] (2/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_153-153537-0_sp1.1 from training. Duration: 0.82725 2023-10-03 06:38:06,406 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.conv_module2.balancer2.min_positive, batch_count=1172733.3333333333, ans=0.05 2023-10-03 06:38:08,051 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.5.encoder.layers.1.nonlin_attention.whiten1, num_groups=1, num_channels=192, metric=2.71 vs. limit=10.0 2023-10-03 06:38:08,907 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_17292_210_6753_1_1531125388_2943734_285-349105-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 06:38:10,295 WARNING [train.py:1204] (2/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_37-41919-0 from training. Duration: 0.87 2023-10-03 06:38:10,315 WARNING [train.py:1204] (2/4) Exclude cut with ID _60860_210_8254_1_1534896015875_3557169_172-294852-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 06:38:13,269 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.feed_forward2.hidden_balancer.prob, batch_count=1172733.3333333333, ans=0.125 2023-10-03 06:38:17,236 WARNING [train.py:1204] (2/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_146-276944-0 from training. Duration: 0.83 2023-10-03 06:38:20,594 WARNING [train.py:1204] (2/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_1254-44082-0_sp1.1 from training. Duration: 0.936375 2023-10-03 06:38:21,855 WARNING [train.py:1204] (2/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_1115-180350-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 06:38:21,883 WARNING [train.py:1204] (2/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_60-146189-0_sp1.1 from training. Duration: 0.8909375 2023-10-03 06:38:27,225 WARNING [train.py:1204] (2/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_517-153649-0_sp1.1 from training. Duration: 0.92725 2023-10-03 06:38:28,489 WARNING [train.py:1204] (2/4) Exclude cut with ID _39571_210_13449_1_1533801584143_3651077_37-266257-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 06:38:28,546 WARNING [train.py:1204] (2/4) Exclude cut with ID _6653_210_2629_1_1527919172441_6732679_527-276118-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 06:38:30,614 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.3.encoder.layers.2.conv_module1.whiten, num_groups=1, num_channels=512, metric=4.35 vs. limit=15.0 2023-10-03 06:38:34,521 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7696_210_2040_1_1528207167_3716099_117-125434-0_sp1.1 from training. Duration: 0.6909375 2023-10-03 06:38:38,672 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_25767_210_2161_1_1532307483_3867748_478-156360-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 06:38:38,676 WARNING [train.py:1204] (2/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_177-49092-0_sp1.1 from training. Duration: 0.82725 2023-10-03 06:38:38,683 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_11305_210_1794_1_1529750428_7178148_680-317073-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 06:38:39,022 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.balancer1.prob, batch_count=1172866.6666666667, ans=0.125 2023-10-03 06:38:45,544 WARNING [train.py:1204] (2/4) Exclude cut with ID _53565_210_10635_1_1534337218335_3820259_287-18120-0 from training. Duration: 0.97 2023-10-03 06:38:49,171 WARNING [train.py:1204] (2/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_498-40153-0_sp1.1 from training. Duration: 0.57275 2023-10-03 06:38:49,193 WARNING [train.py:1204] (2/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_555-63066-0_sp1.1 from training. Duration: 0.963625 2023-10-03 06:38:54,327 WARNING [train.py:1204] (2/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_395-310220-0 from training. Duration: 0.89 2023-10-03 06:38:55,590 WARNING [train.py:1204] (2/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_937-208281-0_sp1.1 from training. Duration: 0.77275 2023-10-03 06:38:58,399 WARNING [train.py:1204] (2/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_120-211630-0 from training. Duration: 0.92 2023-10-03 06:38:58,424 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_454-276429-0_sp1.1 from training. Duration: 0.7909375 2023-10-03 06:38:58,468 WARNING [train.py:1204] (2/4) Exclude cut with ID _22826_210_9994_1_1531908201522_3279169_404-75889-0_sp1.1 from training. Duration: 0.8090625 2023-10-03 06:39:02,685 WARNING [train.py:1204] (2/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_282-136641-0_sp0.9 from training. Duration: 0.4666875 2023-10-03 06:39:04,476 INFO [train.py:1046] (2/4) Epoch 34, batch 650, loss[loss=0.1566, simple_loss=0.2192, pruned_loss=0.04704, over 23590.00 frames. ], tot_loss[loss=0.1613, simple_loss=0.2398, pruned_loss=0.04138, over 4543427.46 frames. ], batch size: 256, lr: 3.02e-03, grad_scale: 16.0 2023-10-03 06:39:04,580 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_809-17156-0_sp1.1 from training. Duration: 0.663625 2023-10-03 06:39:07,294 WARNING [train.py:1204] (2/4) Exclude cut with ID _9235_210_5219_1_1528891154253_8046120_1003-101638-0_sp0.9 from training. Duration: 0.811125 2023-10-03 06:39:07,406 WARNING [train.py:1204] (2/4) Exclude cut with ID _22826_210_9994_1_1531908201522_3279169_404-75889-0_sp0.9 from training. Duration: 0.988875 2023-10-03 06:39:08,688 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.565e+02 1.849e+02 2.038e+02 2.277e+02 3.904e+02, threshold=4.076e+02, percent-clipped=0.0 2023-10-03 06:39:08,847 WARNING [train.py:1204] (2/4) Exclude cut with ID _59288_210_5854_1_1535077757774_3412246_570-295255-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 06:39:11,631 WARNING [train.py:1204] (2/4) Exclude cut with ID _3465_210_3525_1_1526175667060_3574049_412-287526-0 from training. Duration: 0.72 2023-10-03 06:39:12,888 WARNING [train.py:1204] (2/4) Exclude cut with ID _44010_210_4525_1_1533884262964_4013620_649-79655-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 06:39:14,462 INFO [scaling.py:1118] (2/4) WithLoss: name=encoder.encoders.1.encoder.layers.1.self_attn_weights, loss-sum=0.000e+00 2023-10-03 06:39:17,371 WARNING [train.py:1204] (2/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_57-270037-0_sp0.9 from training. Duration: 0.8555625 2023-10-03 06:39:17,372 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_34815_210_6388_1_1533381320_3571903_302-255963-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 06:39:21,341 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_47449_210_5399_1_1534076445_4006929_573-301852-0_sp1.1 from training. Duration: 0.97275 2023-10-03 06:39:25,186 WARNING [train.py:1204] (2/4) Exclude cut with ID 6709-74022-0004-86860-0_sp1.1 from training. Duration: 0.9409375 2023-10-03 06:39:26,560 WARNING [train.py:1204] (2/4) Exclude cut with ID _40519_210_13794_1_1533715228655_7337390_111-71991-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 06:39:27,988 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_30167_210_3528_1_1532428012_4772375_169-214186-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 06:39:28,278 INFO [scaling.py:1118] (2/4) WithLoss: name=encoder.encoders.3.encoder.layers.2.self_attn_weights, loss-sum=0.000e+00 2023-10-03 06:39:30,841 WARNING [train.py:1204] (2/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_20-235313-0_sp1.1 from training. Duration: 0.963625 2023-10-03 06:39:32,167 WARNING [train.py:1204] (2/4) Exclude cut with ID _4681_210_3525_1_1527251040321_1235713_123-24428-0_sp0.9 from training. Duration: 0.5333125 2023-10-03 06:39:35,427 WARNING [train.py:1204] (2/4) Exclude cut with ID _31602_210_5854_1_1532677021456_4458250_444-198752-0_sp1.1 from training. Duration: 0.97275 2023-10-03 06:39:35,474 WARNING [train.py:1204] (2/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_9-279442-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 06:39:35,521 WARNING [train.py:1204] (2/4) Exclude cut with ID _6275_210_2084_1_1528110062213_7669049_973-198085-0_sp0.9 from training. Duration: 0.8333125 2023-10-03 06:39:36,897 WARNING [train.py:1204] (2/4) Exclude cut with ID _29853_210_7497_1_1532505600851_6642110_567-250221-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 06:39:39,499 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6545_210_2040_1_1527774670_4245258_407-48070-0_sp1.1 from training. Duration: 0.6909375 2023-10-03 06:39:42,172 WARNING [train.py:1204] (2/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_224-343295-0_sp0.9 from training. Duration: 0.611125 2023-10-03 06:39:42,187 WARNING [train.py:1204] (2/4) Exclude cut with ID IC0125W0011-61817 from training. Duration: 0.88 2023-10-03 06:39:42,204 WARNING [train.py:1204] (2/4) Exclude cut with ID _60113_210_16271_1_1535164212481_4386079_435-154371-0_sp1.1 from training. Duration: 0.97275 2023-10-03 06:39:42,217 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_25836_210_16554_1_1532138167_53197_3-9394-0_sp1.1 from training. Duration: 0.92725 2023-10-03 06:39:44,975 WARNING [train.py:1204] (2/4) Exclude cut with ID _29343_210_4381_1_1532602757752_3674109_57-55125-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 06:39:46,345 WARNING [train.py:1204] (2/4) Exclude cut with ID _56235_210_10640_1_1534557335235_4161099_267-23560-0_sp1.1 from training. Duration: 0.936375 2023-10-03 06:39:46,374 WARNING [train.py:1204] (2/4) Exclude cut with ID _30196_210_1664_1_1533708496522_7074140_1150-278867-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 06:39:46,422 WARNING [train.py:1204] (2/4) Exclude cut with ID _6270_210_2084_1_1527850911573_3856420_26-277343-0_sp0.9 from training. Duration: 0.911125 2023-10-03 06:39:46,603 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.bypass_mid.scale_min, batch_count=1173200.0, ans=0.2 2023-10-03 06:39:47,842 WARNING [train.py:1204] (2/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_325-62802-0 from training. Duration: 0.87 2023-10-03 06:39:51,195 WARNING [train.py:1204] (2/4) Exclude cut with ID _56524_210_9642_1_1534572011779_3619170_145-310842-0_sp1.1 from training. Duration: 0.9 2023-10-03 06:39:51,231 WARNING [train.py:1204] (2/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_860-96198-0_sp0.9 from training. Duration: 0.811125 2023-10-03 06:39:52,548 WARNING [train.py:1204] (2/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_17-25045-0_sp1.1 from training. Duration: 0.536375 2023-10-03 06:39:54,436 WARNING [train.py:1204] (2/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_349-69727-0_sp1.1 from training. Duration: 0.936375 2023-10-03 06:39:54,515 WARNING [train.py:1204] (2/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_580-93798-0_sp1.1 from training. Duration: 0.5090625 2023-10-03 06:39:55,858 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7762_210_1794_1_1528199508_4057408_419-125009-0 from training. Duration: 0.83 2023-10-03 06:39:57,204 WARNING [train.py:1204] (2/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_356-167816-0 from training. Duration: 0.72 2023-10-03 06:39:57,238 WARNING [train.py:1204] (2/4) Exclude cut with ID _29345_210_4381_1_1532689168263_3860169_225-97100-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 06:39:57,242 WARNING [train.py:1204] (2/4) Exclude cut with ID _14227_210_4381_1_1530701804286_3940259_374-194712-0_sp1.1 from training. Duration: 0.92725 2023-10-03 06:39:57,274 WARNING [train.py:1204] (2/4) Exclude cut with ID _40137_210_12210_1_1533290484046_3795180_249-187059-0_sp0.9 from training. Duration: 0.97775 2023-10-03 06:39:57,293 WARNING [train.py:1204] (2/4) Exclude cut with ID _22826_210_9994_1_1531908201522_3279169_389-317194-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 06:39:58,642 WARNING [train.py:1204] (2/4) Exclude cut with ID _27485_210_4916_1_1532739191677_3668269_239-122918-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 06:40:02,901 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_2245_210_2715_1_1525511548_3540254_128-117609-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 06:40:02,964 WARNING [train.py:1204] (2/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_160-276178-0_sp1.1 from training. Duration: 0.963625 2023-10-03 06:40:04,721 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_31920_210_10633_1_1532860009_1686094_1-279735-0_sp1.1 from training. Duration: 0.97275 2023-10-03 06:40:05,012 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.nonlin_attention.balancer.max_positive, batch_count=1173266.6666666667, ans=0.95 2023-10-03 06:40:06,336 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.conv_skip_rate, batch_count=1173266.6666666667, ans=0.0 2023-10-03 06:40:07,479 WARNING [train.py:1204] (2/4) Exclude cut with ID _51078_210_9070_1_1534161557424_3805250_325-262383-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 06:40:07,501 WARNING [train.py:1204] (2/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_643-52007-0_sp0.9 from training. Duration: 0.6333125 2023-10-03 06:40:07,554 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_51937_210_5014_1_1534573610_4022025_518-299248-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 06:40:15,857 WARNING [train.py:1204] (2/4) Exclude cut with ID _13252_210_1925_1_1530671372047_3336909_166-314607-0_sp1.1 from training. Duration: 0.4909375 2023-10-03 06:40:15,861 WARNING [train.py:1204] (2/4) Exclude cut with ID _44778_210_2054_1_1533798127203_7443090_1087-282939-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 06:40:19,764 INFO [train.py:1046] (2/4) Epoch 34, batch 700, loss[loss=0.1708, simple_loss=0.2386, pruned_loss=0.05147, over 23816.00 frames. ], tot_loss[loss=0.1601, simple_loss=0.2378, pruned_loss=0.04115, over 4563657.69 frames. ], batch size: 179, lr: 3.02e-03, grad_scale: 16.0 2023-10-03 06:40:19,813 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_23850_210_4238_1_1532232597_1632433_32-185135-0_sp1.1 from training. Duration: 0.7818125 2023-10-03 06:40:19,851 WARNING [train.py:1204] (2/4) Exclude cut with ID _59500_210_8254_1_1534809610680_3736009_145-89359-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 06:40:26,188 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_340-336408-0 from training. Duration: 0.93 2023-10-03 06:40:27,486 WARNING [train.py:1204] (2/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_9-21737-0 from training. Duration: 0.97 2023-10-03 06:40:27,654 WARNING [train.py:1204] (2/4) Exclude cut with ID _2220_210_1819_1_1525509109898_3658680_50-99837-0 from training. Duration: 0.54 2023-10-03 06:40:28,992 WARNING [train.py:1204] (2/4) Exclude cut with ID _30304_210_6319_1_1532476827261_7582060_879-299350-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 06:40:30,415 WARNING [train.py:1204] (2/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_905-160074-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 06:40:31,911 WARNING [train.py:1204] (2/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_395-73570-0 from training. Duration: 0.99 2023-10-03 06:40:35,907 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7264_210_4938_1_1527939911_8132095_800-91995-0_sp1.1 from training. Duration: 0.7818125 2023-10-03 06:40:37,869 WARNING [train.py:1204] (2/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_77-311404-0_sp1.1 from training. Duration: 0.8545625 2023-10-03 06:40:40,572 WARNING [train.py:1204] (2/4) Exclude cut with ID _13679_210_7497_1_1530534293662_3355098_107-80772-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 06:40:40,685 WARNING [train.py:1204] (2/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_224-343295-0_sp1.1 from training. Duration: 0.5 2023-10-03 06:40:41,884 WARNING [train.py:1204] (2/4) Exclude cut with ID _54871_210_22356_1_1534665798253_3856359_156-253482-0_sp1.1 from training. Duration: 0.963625 2023-10-03 06:40:43,318 WARNING [train.py:1204] (2/4) Exclude cut with ID _49728_210_12651_1_1534041034294_6788150_309-125532-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 06:40:46,038 WARNING [train.py:1204] (2/4) Exclude cut with ID _13913_210_9994_1_1530579615187_3383897_473-277682-0_sp0.9 from training. Duration: 0.4333125 2023-10-03 06:40:46,043 WARNING [train.py:1204] (2/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_3-276701-0_sp1.1 from training. Duration: 0.82725 2023-10-03 06:40:46,308 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.feed_forward3.hidden_balancer.prob, batch_count=1173400.0, ans=0.125 2023-10-03 06:40:47,345 WARNING [train.py:1204] (2/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_282-136641-0 from training. Duration: 0.42 2023-10-03 06:40:47,614 INFO [scaling.py:1118] (2/4) WithLoss: name=encoder.encoders.3.encoder.layers.2.self_attn_weights, loss-sum=0.000e+00 2023-10-03 06:40:50,154 WARNING [train.py:1204] (2/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_127-566-0 from training. Duration: 0.84 2023-10-03 06:40:52,383 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.conv_skip_rate, batch_count=1173466.6666666667, ans=0.0 2023-10-03 06:40:55,593 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6163_210_6400_1_1527506746_4263068_283-137811-0_sp1.1 from training. Duration: 0.7 2023-10-03 06:40:55,626 WARNING [train.py:1204] (2/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_34-183685-0_sp1.1 from training. Duration: 0.8909375 2023-10-03 06:40:57,031 WARNING [train.py:1204] (2/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_154-135696-0_sp0.9 from training. Duration: 0.888875 2023-10-03 06:40:59,766 WARNING [train.py:1204] (2/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_676-51014-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 06:41:00,019 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.feed_forward1.hidden_balancer.prob, batch_count=1173466.6666666667, ans=0.125 2023-10-03 06:41:01,128 WARNING [train.py:1204] (2/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_38-169920-0 from training. Duration: 0.91 2023-10-03 06:41:05,562 WARNING [train.py:1204] (2/4) Exclude cut with ID _6653_210_2629_1_1527919172441_6732679_747-114465-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 06:41:06,869 WARNING [train.py:1204] (2/4) Exclude cut with ID _8021_210_7994_1_1528376451358_3653069_100-140231-0_sp1.1 from training. Duration: 0.6090625 2023-10-03 06:41:06,897 WARNING [train.py:1204] (2/4) Exclude cut with ID _64135_210_2253_1_1535677267876_7608972_133-188266-0 from training. Duration: 0.81 2023-10-03 06:41:09,894 WARNING [train.py:1204] (2/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_714-310538-0_sp1.1 from training. Duration: 0.7818125 2023-10-03 06:41:10,117 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.self_attn_weights.pos_emb_skip_rate, batch_count=1173533.3333333333, ans=0.0 2023-10-03 06:41:11,329 WARNING [train.py:1204] (2/4) Exclude cut with ID _39867_210_9826_1_1533553222030_9344259_1134-288498-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 06:41:13,320 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.4.encoder.layers.0.nonlin_attention.whiten1, num_groups=1, num_channels=288, metric=10.01 vs. limit=10.0 2023-10-03 06:41:14,157 WARNING [train.py:1204] (2/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_211-237289-0_sp1.1 from training. Duration: 0.936375 2023-10-03 06:41:19,658 WARNING [train.py:1204] (2/4) Exclude cut with ID _59202_210_26984_1_1534816810612_3549174_137-318710-0_sp0.9 from training. Duration: 0.988875 2023-10-03 06:41:20,957 WARNING [train.py:1204] (2/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_86-66192-0 from training. Duration: 0.55 2023-10-03 06:41:24,798 WARNING [train.py:1204] (2/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_540-275685-0 from training. Duration: 0.89 2023-10-03 06:41:24,836 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8776_210_2040_1_1528638837_4088036_244-278763-0 from training. Duration: 0.9 2023-10-03 06:41:28,267 WARNING [train.py:1204] (2/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_9-219972-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 06:41:29,730 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.0.feed_forward3.hidden_balancer.prob, batch_count=1173600.0, ans=0.125 2023-10-03 06:41:30,991 WARNING [train.py:1204] (2/4) Exclude cut with ID _44659_210_2721_1_1534036120061_6614860_656-283823-0_sp1.1 from training. Duration: 0.92725 2023-10-03 06:41:32,316 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_69630_210_7234_1_1535789601_661049_77-216385-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 06:41:33,733 INFO [train.py:1046] (2/4) Epoch 34, batch 750, loss[loss=0.1681, simple_loss=0.2402, pruned_loss=0.04799, over 23879.00 frames. ], tot_loss[loss=0.1595, simple_loss=0.2374, pruned_loss=0.04086, over 4602461.82 frames. ], batch size: 195, lr: 3.02e-03, grad_scale: 16.0 2023-10-03 06:41:33,821 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_19952_210_2161_1_1531875469_3851750_210-308919-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 06:41:33,827 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_4737_210_2040_1_1526998689_3293008_378-178650-0 from training. Duration: 0.95 2023-10-03 06:41:36,619 WARNING [train.py:1204] (2/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_224-343295-0 from training. Duration: 0.55 2023-10-03 06:41:38,361 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.513e+02 1.818e+02 1.954e+02 2.110e+02 2.895e+02, threshold=3.908e+02, percent-clipped=0.0 2023-10-03 06:41:38,426 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7002_210_1794_1_1527939683_5157286_184-229630-0 from training. Duration: 0.74 2023-10-03 06:41:38,458 WARNING [train.py:1204] (2/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_6-214815-0 from training. Duration: 0.85 2023-10-03 06:41:39,759 WARNING [train.py:1204] (2/4) Exclude cut with ID _8699_210_2069_1_1528630346831_3960409_160-24408-0 from training. Duration: 0.91 2023-10-03 06:41:39,789 WARNING [train.py:1204] (2/4) Exclude cut with ID _5585_210_2667_1_1527418813324_8007340_89-332072-0 from training. Duration: 0.64 2023-10-03 06:41:39,840 WARNING [train.py:1204] (2/4) Exclude cut with ID _5250_210_5219_1_1527249561806_8692110_528-259800-0_sp1.1 from training. Duration: 0.963625 2023-10-03 06:41:41,212 WARNING [train.py:1204] (2/4) Exclude cut with ID _55383_210_10635_1_1534468426172_2231669_134-128246-0 from training. Duration: 0.89 2023-10-03 06:41:42,509 WARNING [train.py:1204] (2/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_400-338274-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 06:41:42,605 WARNING [train.py:1204] (2/4) Exclude cut with ID _14224_210_4381_1_1530529062037_4264010_364-22817-0_sp0.9 from training. Duration: 0.788875 2023-10-03 06:41:44,153 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.feed_forward2.hidden_balancer.prob, batch_count=1173666.6666666667, ans=0.125 2023-10-03 06:41:45,241 WARNING [train.py:1204] (2/4) Exclude cut with ID _42863_210_9070_1_1533902398788_3696920_331-291057-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 06:41:46,665 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_5436_210_1794_1_1527331317_2541494_229-211016-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 06:41:46,700 WARNING [train.py:1204] (2/4) Exclude cut with ID _6456_210_1534_1_1527681634212_3181049_345-252877-0_sp0.9 from training. Duration: 0.711125 2023-10-03 06:41:46,722 WARNING [train.py:1204] (2/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_96-81499-0_sp1.1 from training. Duration: 0.92725 2023-10-03 06:41:49,670 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_5575_210_1410_1_1527301305_2570170_342-265572-0_sp1.1 from training. Duration: 0.8545625 2023-10-03 06:41:52,205 WARNING [train.py:1204] (2/4) Exclude cut with ID _5444_210_2151_1_1527418808464_5312210_726-27165-0_sp0.9 from training. Duration: 0.7444375 2023-10-03 06:41:53,679 WARNING [train.py:1204] (2/4) Exclude cut with ID _22083_210_8254_1_1531702862439_3678970_304-327750-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 06:41:55,563 WARNING [train.py:1204] (2/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_245-226199-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 06:41:56,895 WARNING [train.py:1204] (2/4) Exclude cut with ID _42875_210_14856_1_1533704397487_4012433_102-199499-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 06:41:56,941 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_51225_210_3601_1_1534400634_2327383_55-301844-0 from training. Duration: 0.93 2023-10-03 06:41:58,961 WARNING [train.py:1204] (2/4) Exclude cut with ID _28317_210_12742_1_1532480302381_8510325_916-195848-0_sp1.1 from training. Duration: 0.836375 2023-10-03 06:41:59,026 WARNING [train.py:1204] (2/4) Exclude cut with ID _44718_210_5816_1_1534050051869_6469030_497-311999-0_sp1.1 from training. Duration: 0.936375 2023-10-03 06:42:02,241 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_34886_210_10633_1_1533203882_1755661_91-194765-0_sp1.1 from training. Duration: 0.936375 2023-10-03 06:42:03,541 WARNING [train.py:1204] (2/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_310-26186-0_sp0.9 from training. Duration: 0.77775 2023-10-03 06:42:04,972 WARNING [train.py:1204] (2/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_416-257730-0 from training. Duration: 0.65 2023-10-03 06:42:04,977 WARNING [train.py:1204] (2/4) Exclude cut with ID _9389_210_1664_1_1529217337301_5289269_209-154760-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 06:42:06,488 WARNING [train.py:1204] (2/4) Exclude cut with ID _4092_210_2151_1_1526643044449_3135759_227-213972-0 from training. Duration: 0.54 2023-10-03 06:42:06,515 WARNING [train.py:1204] (2/4) Exclude cut with ID IC0679W0044-335852 from training. Duration: 0.768 2023-10-03 06:42:08,275 WARNING [train.py:1204] (2/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_42-186162-0 from training. Duration: 0.92 2023-10-03 06:42:08,281 WARNING [train.py:1204] (2/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_302-2550-0_sp0.9 from training. Duration: 0.9 2023-10-03 06:42:09,455 WARNING [train.py:1204] (2/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_356-31216-0_sp0.9 from training. Duration: 0.8333125 2023-10-03 06:42:10,088 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.4.encoder.layers.1.feed_forward3.out_whiten, num_groups=1, num_channels=384, metric=12.35 vs. limit=15.0 2023-10-03 06:42:10,975 WARNING [train.py:1204] (2/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_125-322172-0_sp1.1 from training. Duration: 0.8454375 2023-10-03 06:42:16,445 WARNING [train.py:1204] (2/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_141-89727-0_sp0.9 from training. Duration: 0.788875 2023-10-03 06:42:16,469 WARNING [train.py:1204] (2/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_647-337784-0_sp1.1 from training. Duration: 0.97275 2023-10-03 06:42:16,485 WARNING [train.py:1204] (2/4) Exclude cut with ID _6493_210_1747_1_1528459210424_4012470_513-140967-0_sp1.1 from training. Duration: 0.7181875 2023-10-03 06:42:19,233 WARNING [train.py:1204] (2/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_87-266334-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 06:42:20,577 WARNING [train.py:1204] (2/4) Exclude cut with ID _26528_210_9070_1_1532570410842_3851310_526-261338-0_sp1.1 from training. Duration: 0.92725 2023-10-03 06:42:21,920 WARNING [train.py:1204] (2/4) Exclude cut with ID _14224_210_4381_1_1530529062037_4264010_380-343670-0 from training. Duration: 0.88 2023-10-03 06:42:21,973 WARNING [train.py:1204] (2/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_449-15508-0_sp1.1 from training. Duration: 0.7454375 2023-10-03 06:42:23,408 WARNING [train.py:1204] (2/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_372-6869-0_sp0.9 from training. Duration: 0.488875 2023-10-03 06:42:23,873 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.5.encoder.layers.1.feed_forward2.out_whiten, num_groups=1, num_channels=256, metric=9.51 vs. limit=15.0 2023-10-03 06:42:24,762 WARNING [train.py:1204] (2/4) Exclude cut with ID _26907_210_7234_1_1532221740694_3205263_337-2494-0_sp0.9 from training. Duration: 0.97775 2023-10-03 06:42:28,139 WARNING [train.py:1204] (2/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_10-100367-0_sp1.1 from training. Duration: 0.8909375 2023-10-03 06:42:29,535 WARNING [train.py:1204] (2/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_47-167373-0 from training. Duration: 0.92 2023-10-03 06:42:29,595 WARNING [train.py:1204] (2/4) Exclude cut with ID _28323_210_16187_1_1532480395667_4100300_628-282179-0_sp1.1 from training. Duration: 0.97275 2023-10-03 06:42:35,724 WARNING [train.py:1204] (2/4) Exclude cut with ID _20455_210_5219_1_1531565996775_8098100_213-280898-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 06:42:35,814 WARNING [train.py:1204] (2/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_383-316000-0_sp1.1 from training. Duration: 0.8181875 2023-10-03 06:42:35,969 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.bypass_mid.scale_min, batch_count=1173933.3333333333, ans=0.2 2023-10-03 06:42:37,069 WARNING [train.py:1204] (2/4) Exclude cut with ID _18513_210_9206_1_1531618215223_3862620_16-78180-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 06:42:37,275 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.conv_module1.balancer1.prob, batch_count=1173933.3333333333, ans=0.125 2023-10-03 06:42:39,831 WARNING [train.py:1204] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_330-327861-0_sp1.1 from training. Duration: 0.6090625 2023-10-03 06:42:40,144 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.conv_module2.balancer2.prob, batch_count=1173933.3333333333, ans=0.125 2023-10-03 06:42:43,036 WARNING [train.py:1204] (2/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_356-31216-0 from training. Duration: 0.75 2023-10-03 06:42:43,056 WARNING [train.py:1204] (2/4) Exclude cut with ID _25248_210_8750_1_1532062825941_5554180_255-148539-0_sp1.1 from training. Duration: 0.963625 2023-10-03 06:42:43,095 WARNING [train.py:1204] (2/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_269-154726-0_sp1.1 from training. Duration: 0.7818125 2023-10-03 06:42:44,707 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7002_210_1794_1_1527939683_5157286_163-87552-0_sp1.1 from training. Duration: 0.7818125 2023-10-03 06:42:45,990 WARNING [train.py:1204] (2/4) Exclude cut with ID _54871_210_22356_1_1534665798253_3856359_97-151896-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 06:42:47,363 INFO [train.py:1046] (2/4) Epoch 34, batch 800, loss[loss=0.1676, simple_loss=0.2418, pruned_loss=0.04672, over 23696.00 frames. ], tot_loss[loss=0.1597, simple_loss=0.2383, pruned_loss=0.04055, over 4643285.10 frames. ], batch size: 164, lr: 3.01e-03, grad_scale: 32.0 2023-10-03 06:42:48,817 WARNING [train.py:1204] (2/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_207-7026-0_sp1.1 from training. Duration: 0.97275 2023-10-03 06:42:48,836 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_92-46009-0_sp0.9 from training. Duration: 0.6 2023-10-03 06:42:51,795 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=1174000.0, ans=0.1 2023-10-03 06:42:54,278 WARNING [train.py:1204] (2/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_122-178847-0_sp1.1 from training. Duration: 0.97275 2023-10-03 06:42:54,282 WARNING [train.py:1204] (2/4) Exclude cut with ID _44778_210_2054_1_1533798127203_7443090_94-348956-0_sp1.1 from training. Duration: 0.92725 2023-10-03 06:42:56,241 WARNING [train.py:1204] (2/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_1018-244089-0_sp1.1 from training. Duration: 0.963625 2023-10-03 06:42:56,261 WARNING [train.py:1204] (2/4) Exclude cut with ID _56235_210_10640_1_1534557335235_4161099_87-118423-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 06:42:57,700 WARNING [train.py:1204] (2/4) Exclude cut with ID _53713_210_24116_1_1534314487762_7416751_504-212292-0_sp1.1 from training. Duration: 0.92725 2023-10-03 06:42:58,528 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.feed_forward3.hidden_balancer.prob, batch_count=1174000.0, ans=0.125 2023-10-03 06:42:59,581 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_446-150327-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 06:43:01,452 WARNING [train.py:1204] (2/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_682-245989-0_sp1.1 from training. Duration: 0.92725 2023-10-03 06:43:05,695 WARNING [train.py:1204] (2/4) Exclude cut with ID _35723_210_1794_1_1533088341043_628250_8-234524-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 06:43:05,769 WARNING [train.py:1204] (2/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_64-248359-0_sp0.9 from training. Duration: 0.7666875 2023-10-03 06:43:08,671 WARNING [train.py:1204] (2/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_49-97158-0 from training. Duration: 0.76 2023-10-03 06:43:08,729 WARNING [train.py:1204] (2/4) Exclude cut with ID _24314_210_4915_1_1532135078116_7170179_743-915-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 06:43:11,873 WARNING [train.py:1204] (2/4) Exclude cut with ID _61194_210_18628_1_1535007555628_3750909_353-323468-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 06:43:11,906 WARNING [train.py:1204] (2/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_387-131538-0_sp1.1 from training. Duration: 0.736375 2023-10-03 06:43:11,927 WARNING [train.py:1204] (2/4) Exclude cut with ID _44424_210_18163_1_1533623394848_1213110_79-187603-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 06:43:11,975 WARNING [train.py:1204] (2/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_86-19601-0 from training. Duration: 0.85 2023-10-03 06:43:13,270 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_50050_210_16223_1_1535435796_7290138_103-283529-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 06:43:13,301 WARNING [train.py:1204] (2/4) Exclude cut with ID _13913_210_9994_1_1530579615187_3383897_473-277682-0 from training. Duration: 0.39 2023-10-03 06:43:16,157 WARNING [train.py:1204] (2/4) Exclude cut with ID _42326_210_7770_1_1533814282531_3707269_368-68029-0_sp1.1 from training. Duration: 0.92725 2023-10-03 06:43:17,656 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_41428_210_2161_1_1533774184_3992532_446-339645-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 06:43:18,932 WARNING [train.py:1204] (2/4) Exclude cut with ID _69584_210_5219_1_1535785133775_4044450_325-7656-0_sp1.1 from training. Duration: 0.7818125 2023-10-03 06:43:18,940 WARNING [train.py:1204] (2/4) Exclude cut with ID _21632_210_2253_1_1531994001765_3744140_134-184387-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 06:43:21,679 WARNING [train.py:1204] (2/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_550-184707-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 06:43:21,693 WARNING [train.py:1204] (2/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_106-341780-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 06:43:25,834 WARNING [train.py:1204] (2/4) Exclude cut with ID _45871_210_5665_1_1533864614245_3296060_253-132552-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 06:43:25,875 WARNING [train.py:1204] (2/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_395-310220-0_sp1.1 from training. Duration: 0.8090625 2023-10-03 06:43:25,896 WARNING [train.py:1204] (2/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_795-33252-0_sp0.9 from training. Duration: 0.411125 2023-10-03 06:43:29,190 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9002W0003-495962 from training. Duration: 0.946 2023-10-03 06:43:29,211 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_43-71989-0 from training. Duration: 0.57 2023-10-03 06:43:29,225 WARNING [train.py:1204] (2/4) Exclude cut with ID _8699_210_2069_1_1528630346831_3960409_155-39766-0_sp1.1 from training. Duration: 0.7090625 2023-10-03 06:43:29,238 WARNING [train.py:1204] (2/4) Exclude cut with ID _27894_210_12392_1_1532415496078_6902439_788-232684-0_sp1.1 from training. Duration: 0.97275 2023-10-03 06:43:29,433 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.nonlin_attention.balancer.prob, batch_count=1174133.3333333333, ans=0.125 2023-10-03 06:43:31,137 WARNING [train.py:1204] (2/4) Exclude cut with ID _56494_210_4220_1_1535284837625_3033169_10-39296-0_sp1.1 from training. Duration: 0.92725 2023-10-03 06:43:31,165 WARNING [train.py:1204] (2/4) Exclude cut with ID _40901_210_5665_1_1533363789958_3856130_341-20819-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 06:43:36,666 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9029W0435-509822 from training. Duration: 0.896 2023-10-03 06:43:36,719 WARNING [train.py:1204] (2/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_51-117576-0 from training. Duration: 0.4 2023-10-03 06:43:38,132 WARNING [train.py:1204] (2/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_197-28164-0_sp1.1 from training. Duration: 0.72725 2023-10-03 06:43:41,426 WARNING [train.py:1204] (2/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_145-137790-0_sp1.1 from training. Duration: 0.7545625 2023-10-03 06:43:44,354 WARNING [train.py:1204] (2/4) Exclude cut with ID _47790_210_16187_1_1533862814066_4207519_2-39653-0_sp1.1 from training. Duration: 0.8909375 2023-10-03 06:43:48,415 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_3780_210_2087_1_1526467760_2130483_186-208240-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 06:43:49,741 WARNING [train.py:1204] (2/4) Exclude cut with ID _24055_210_12651_1_1532310907143_976979_92-59356-0 from training. Duration: 0.91 2023-10-03 06:43:49,785 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_3910_210_1794_1_1526742306_334530_28-238909-0_sp1.1 from training. Duration: 0.736375 2023-10-03 06:43:52,504 WARNING [train.py:1204] (2/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_269-154726-0 from training. Duration: 0.86 2023-10-03 06:43:59,775 WARNING [train.py:1204] (2/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_326-165900-0_sp0.9 from training. Duration: 0.7666875 2023-10-03 06:44:01,095 INFO [train.py:1046] (2/4) Epoch 34, batch 850, loss[loss=0.1631, simple_loss=0.2465, pruned_loss=0.03984, over 24670.00 frames. ], tot_loss[loss=0.1611, simple_loss=0.2397, pruned_loss=0.0413, over 4645541.65 frames. ], batch size: 65, lr: 3.01e-03, grad_scale: 16.0 2023-10-03 06:44:01,226 WARNING [train.py:1204] (2/4) Exclude cut with ID _5294_210_1747_1_1527249643928_3696179_470-276077-0_sp1.1 from training. Duration: 0.963625 2023-10-03 06:44:02,533 WARNING [train.py:1204] (2/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_372-131163-0 from training. Duration: 0.94 2023-10-03 06:44:02,575 WARNING [train.py:1204] (2/4) Exclude cut with ID _45297_210_2721_1_1533715351381_3264949_351-147136-0_sp1.1 from training. Duration: 0.7909375 2023-10-03 06:44:02,661 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_1072-12671-0_sp1.1 from training. Duration: 0.92725 2023-10-03 06:44:04,466 WARNING [train.py:1204] (2/4) Exclude cut with ID _15133_210_6228_1_1530794512074_3080379_54-198193-0 from training. Duration: 0.94 2023-10-03 06:44:04,491 WARNING [train.py:1204] (2/4) Exclude cut with ID _30196_210_1664_1_1533708496522_7074140_1194-270988-0_sp1.1 from training. Duration: 0.97275 2023-10-03 06:44:07,098 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.564e+02 1.856e+02 2.060e+02 2.258e+02 3.332e+02, threshold=4.119e+02, percent-clipped=0.0 2023-10-03 06:44:07,191 WARNING [train.py:1204] (2/4) Exclude cut with ID _50310_210_15488_1_1534068043345_3641080_289-193637-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 06:44:08,465 WARNING [train.py:1204] (2/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_313-263495-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 06:44:09,890 WARNING [train.py:1204] (2/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_18-130336-0_sp1.1 from training. Duration: 0.7181875 2023-10-03 06:44:09,954 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_47896_210_20508_1_1534492518_5958868_646-287209-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 06:44:11,345 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_294-163793-0 from training. Duration: 0.82 2023-10-03 06:44:11,477 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.1.self_attn_weights.pos_emb_skip_rate, batch_count=1174333.3333333333, ans=0.0 2023-10-03 06:44:12,693 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_8-319957-0 from training. Duration: 0.96 2023-10-03 06:44:12,696 WARNING [train.py:1204] (2/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_1211-226066-0 from training. Duration: 0.76 2023-10-03 06:44:14,718 WARNING [train.py:1204] (2/4) Exclude cut with ID _4779_210_2084_1_1527318022944_3914893_500-285013-0_sp0.9 from training. Duration: 0.7666875 2023-10-03 06:44:14,736 WARNING [train.py:1204] (2/4) Exclude cut with ID _22826_210_9994_1_1531908201522_3279169_347-232268-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 06:44:16,156 WARNING [train.py:1204] (2/4) Exclude cut with ID _29877_210_6228_1_1532397791106_5276149_111-55056-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 06:44:17,450 WARNING [train.py:1204] (2/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_812-271646-0_sp1.1 from training. Duration: 0.92725 2023-10-03 06:44:17,484 WARNING [train.py:1204] (2/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_235-262124-0_sp1.1 from training. Duration: 0.6090625 2023-10-03 06:44:22,801 WARNING [train.py:1204] (2/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_14-16530-0_sp1.1 from training. Duration: 0.97275 2023-10-03 06:44:22,837 WARNING [train.py:1204] (2/4) Exclude cut with ID _44718_210_5816_1_1534050051869_6469030_568-304512-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 06:44:24,252 WARNING [train.py:1204] (2/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_1033-274376-0 from training. Duration: 0.87 2023-10-03 06:44:25,907 WARNING [train.py:1204] (2/4) Exclude cut with ID _4681_210_3525_1_1527251040321_1235713_123-24428-0 from training. Duration: 0.48 2023-10-03 06:44:26,095 INFO [scaling.py:1118] (2/4) WithLoss: name=encoder.encoders.1.encoder.layers.0.self_attn_weights, loss-sum=0.000e+00 2023-10-03 06:44:29,153 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_37818_210_4238_1_1533620910_4720083_251-288764-0_sp1.1 from training. Duration: 0.97275 2023-10-03 06:44:29,238 WARNING [train.py:1204] (2/4) Exclude cut with ID _65585_210_12210_1_1535450451584_3764098_112-294301-0 from training. Duration: 0.88 2023-10-03 06:44:33,811 WARNING [train.py:1204] (2/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_329-113173-0 from training. Duration: 0.62 2023-10-03 06:44:35,266 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6767_210_3273_1_1527855783_1718932_173-256601-0 from training. Duration: 0.8 2023-10-03 06:44:37,129 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9266W0001-627677 from training. Duration: 0.896 2023-10-03 06:44:37,139 WARNING [train.py:1204] (2/4) Exclude cut with ID _49643_210_7989_1_1534640244224_3939031_611-185515-0_sp1.1 from training. Duration: 0.8545625 2023-10-03 06:44:37,143 WARNING [train.py:1204] (2/4) Exclude cut with ID _31649_210_6228_1_1532584608063_4091078_687-237771-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 06:44:38,517 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_12868_210_6753_1_1530702479_3610564_148-172861-0_sp0.9 from training. Duration: 0.5555625 2023-10-03 06:44:41,322 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_5129_210_6400_1_1527161005_3579054_107-69738-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 06:44:43,169 WARNING [train.py:1204] (2/4) Exclude cut with ID _40137_210_12210_1_1533290484046_3795180_36-301164-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 06:44:43,203 WARNING [train.py:1204] (2/4) Exclude cut with ID _7537_210_2084_1_1528502395431_3924359_113-199933-0 from training. Duration: 0.59 2023-10-03 06:44:45,934 WARNING [train.py:1204] (2/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_547-5424-0_sp1.1 from training. Duration: 0.8545625 2023-10-03 06:44:47,334 WARNING [train.py:1204] (2/4) Exclude cut with ID _43552_210_8913_1_1533726031059_3523429_147-284024-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 06:44:47,404 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_294-163793-0_sp1.1 from training. Duration: 0.7454375 2023-10-03 06:44:47,448 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_3780_210_2087_1_1526467760_2130483_170-259044-0_sp1.1 from training. Duration: 0.636375 2023-10-03 06:44:48,865 WARNING [train.py:1204] (2/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_726-270780-0_sp1.1 from training. Duration: 0.7818125 2023-10-03 06:44:50,211 WARNING [train.py:1204] (2/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_214-291369-0_sp1.1 from training. Duration: 0.57275 2023-10-03 06:44:50,261 WARNING [train.py:1204] (2/4) Exclude cut with ID _21881_210_3525_1_1531825242757_3798380_247-102591-0 from training. Duration: 0.98 2023-10-03 06:44:54,288 WARNING [train.py:1204] (2/4) Exclude cut with ID _8064_210_5399_1_1528372299291_4282973_18-174647-0_sp1.1 from training. Duration: 0.82725 2023-10-03 06:44:54,291 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_415-52611-0_sp1.1 from training. Duration: 0.936375 2023-10-03 06:44:54,353 WARNING [train.py:1204] (2/4) Exclude cut with ID _8394_210_6702_1_1528590504316_3540189_379-49457-0_sp0.9 from training. Duration: 0.9666875 2023-10-03 06:44:54,369 WARNING [train.py:1204] (2/4) Exclude cut with ID _64521_210_14699_1_1535243427259_3815328_368-195106-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 06:44:55,737 WARNING [train.py:1204] (2/4) Exclude cut with ID _28394_210_8913_1_1532430049518_3650118_51-334124-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 06:44:57,460 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.bypass.scale_min, batch_count=1174533.3333333333, ans=0.2 2023-10-03 06:44:58,590 WARNING [train.py:1204] (2/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_317-161769-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 06:45:02,208 WARNING [train.py:1204] (2/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_40-129756-0_sp0.9 from training. Duration: 0.9 2023-10-03 06:45:03,692 WARNING [train.py:1204] (2/4) Exclude cut with ID _3067_210_2667_1_1526212974640_4503168_119-175776-0_sp0.9 from training. Duration: 0.82225 2023-10-03 06:45:03,742 WARNING [train.py:1204] (2/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_1004-304915-0_sp1.1 from training. Duration: 0.92725 2023-10-03 06:45:05,089 WARNING [train.py:1204] (2/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_359-118926-0_sp0.9 from training. Duration: 0.8 2023-10-03 06:45:14,140 WARNING [train.py:1204] (2/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_51-200220-0_sp0.9 from training. Duration: 0.62225 2023-10-03 06:45:14,229 WARNING [train.py:1204] (2/4) Exclude cut with ID _46683_210_9642_1_1533730913492_4591116_530-326202-0_sp1.1 from training. Duration: 0.936375 2023-10-03 06:45:15,811 INFO [train.py:1046] (2/4) Epoch 34, batch 900, loss[loss=0.1336, simple_loss=0.2117, pruned_loss=0.02777, over 24340.00 frames. ], tot_loss[loss=0.162, simple_loss=0.2407, pruned_loss=0.04168, over 4651833.67 frames. ], batch size: 56, lr: 3.01e-03, grad_scale: 8.0 2023-10-03 06:45:15,897 WARNING [train.py:1204] (2/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_567-135114-0 from training. Duration: 0.87 2023-10-03 06:45:15,931 WARNING [train.py:1204] (2/4) Exclude cut with ID _66863_210_18053_1_1535698758334_8590319_1092-271784-0_sp1.1 from training. Duration: 0.87275 2023-10-03 06:45:15,940 WARNING [train.py:1204] (2/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_762-282614-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 06:45:18,589 WARNING [train.py:1204] (2/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_280-120942-0 from training. Duration: 0.77 2023-10-03 06:45:22,737 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_31583_210_3852_1_1532595851_4024413_27-170931-0_sp1.1 from training. Duration: 0.963625 2023-10-03 06:45:25,475 WARNING [train.py:1204] (2/4) Exclude cut with ID _12369_210_4381_1_1530356078581_4388520_266-271628-0_sp1.1 from training. Duration: 0.92725 2023-10-03 06:45:26,808 WARNING [train.py:1204] (2/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_41-31157-0 from training. Duration: 0.57 2023-10-03 06:45:29,636 WARNING [train.py:1204] (2/4) Exclude cut with ID _21878_210_3525_1_1531738772002_7264330_113-18163-0_sp1.1 from training. Duration: 0.8090625 2023-10-03 06:45:29,673 WARNING [train.py:1204] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_147-111137-0 from training. Duration: 0.95 2023-10-03 06:45:31,466 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1083-220122-0_sp1.1 from training. Duration: 0.463625 2023-10-03 06:45:31,555 WARNING [train.py:1204] (2/4) Exclude cut with ID _14884_210_2252_1_1530707459689_5561250_591-173406-0_sp1.1 from training. Duration: 0.87275 2023-10-03 06:45:31,560 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_24677_210_1794_1_1532002693_4341296_149-303793-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 06:45:33,297 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_12868_210_6753_1_1530702479_3610564_133-564-0_sp1.1 from training. Duration: 0.7181875 2023-10-03 06:45:34,611 WARNING [train.py:1204] (2/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_625-338546-0_sp0.9 from training. Duration: 0.97775 2023-10-03 06:45:42,061 WARNING [train.py:1204] (2/4) Exclude cut with ID _25882_210_3036_1_1532775665815_6479510_624-320169-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 06:45:42,067 WARNING [train.py:1204] (2/4) Exclude cut with ID _20622_210_3852_1_1531622427080_1093839_127-325326-0_sp1.1 from training. Duration: 0.92725 2023-10-03 06:45:43,491 WARNING [train.py:1204] (2/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_464-33931-0_sp1.1 from training. Duration: 0.4909375 2023-10-03 06:45:43,743 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.self_attn_weights.pos_emb_skip_rate, batch_count=1174800.0, ans=0.0 2023-10-03 06:45:46,967 WARNING [train.py:1204] (2/4) Exclude cut with ID _66112_210_10635_1_1535590236305_3801484_120-304588-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 06:45:49,886 WARNING [train.py:1204] (2/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_110-130961-0 from training. Duration: 0.47 2023-10-03 06:45:52,621 WARNING [train.py:1204] (2/4) Exclude cut with ID _16908_210_5724_1_1531119157248_3772700_64-54020-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 06:45:55,517 WARNING [train.py:1204] (2/4) Exclude cut with ID _4068_210_2629_1_1526697350395_1546259_224-288693-0_sp1.1 from training. Duration: 0.536375 2023-10-03 06:45:55,554 WARNING [train.py:1204] (2/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_191-41023-0_sp0.9 from training. Duration: 0.788875 2023-10-03 06:45:56,854 WARNING [train.py:1204] (2/4) Exclude cut with ID ID0205W0255-783002 from training. Duration: 0.958 2023-10-03 06:45:56,923 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_162-262717-0 from training. Duration: 0.83 2023-10-03 06:46:03,719 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_224-322394-0_sp1.1 from training. Duration: 0.6 2023-10-03 06:46:03,732 WARNING [train.py:1204] (2/4) Exclude cut with ID _4866_210_2577_1_1527404412284_4364659_133-1696-0_sp1.1 from training. Duration: 0.9 2023-10-03 06:46:05,026 WARNING [train.py:1204] (2/4) Exclude cut with ID _6275_210_2084_1_1528110062213_7669049_973-198085-0_sp1.1 from training. Duration: 0.6818125 2023-10-03 06:46:05,285 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.0.nonlin_attention.balancer.prob, batch_count=1174866.6666666667, ans=0.125 2023-10-03 06:46:09,903 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_47449_210_5399_1_1534076445_4006929_410-237806-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 06:46:09,918 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7543_210_6753_1_1528543318_4552_1-85072-0_sp0.9 from training. Duration: 0.92225 2023-10-03 06:46:11,260 WARNING [train.py:1204] (2/4) Exclude cut with ID _65263_210_22356_1_1535763598691_3921037_108-21902-0 from training. Duration: 0.99 2023-10-03 06:46:11,265 WARNING [train.py:1204] (2/4) Exclude cut with ID _65585_210_12210_1_1535450451584_3764098_309-174818-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 06:46:14,453 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_309-65177-0 from training. Duration: 0.88 2023-10-03 06:46:15,929 WARNING [train.py:1204] (2/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_363-185703-0_sp1.1 from training. Duration: 0.863625 2023-10-03 06:46:15,964 WARNING [train.py:1204] (2/4) Exclude cut with ID _42326_210_7770_1_1533814282531_3707269_442-238692-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 06:46:17,310 WARNING [train.py:1204] (2/4) Exclude cut with ID _69884_210_9771_1_1535846392818_4166169_226-254446-0_sp1.1 from training. Duration: 0.936375 2023-10-03 06:46:17,327 WARNING [train.py:1204] (2/4) Exclude cut with ID _29853_210_7497_1_1532505600851_6642110_287-299903-0_sp1.1 from training. Duration: 0.963625 2023-10-03 06:46:21,432 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1015-343884-0 from training. Duration: 0.62 2023-10-03 06:46:22,641 WARNING [train.py:1204] (2/4) Exclude cut with ID ID0305W0008-833402 from training. Duration: 0.91 2023-10-03 06:46:22,756 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6673_210_3273_1_1527767790_3173240_351-167954-0_sp1.1 from training. Duration: 0.436375 2023-10-03 06:46:22,768 WARNING [train.py:1204] (2/4) Exclude cut with ID _6456_210_1534_1_1527681634212_3181049_345-252877-0 from training. Duration: 0.64 2023-10-03 06:46:26,963 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_13-205182-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 06:46:30,131 INFO [train.py:1046] (2/4) Epoch 34, batch 950, loss[loss=0.1645, simple_loss=0.2264, pruned_loss=0.05136, over 22845.00 frames. ], tot_loss[loss=0.1619, simple_loss=0.2407, pruned_loss=0.04161, over 4668010.57 frames. ], batch size: 323, lr: 3.01e-03, grad_scale: 8.0 2023-10-03 06:46:30,279 WARNING [train.py:1204] (2/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_427-208901-0 from training. Duration: 0.68 2023-10-03 06:46:35,101 WARNING [train.py:1204] (2/4) Exclude cut with ID _49039_210_6315_1_1533967537541_3846118_9-148921-0_sp1.1 from training. Duration: 0.97275 2023-10-03 06:46:38,420 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.571e+02 1.873e+02 2.075e+02 2.442e+02 3.584e+02, threshold=4.150e+02, percent-clipped=0.0 2023-10-03 06:46:38,525 WARNING [train.py:1204] (2/4) Exclude cut with ID _59493_210_17282_1_1535078419077_3225879_143-70317-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 06:46:38,563 WARNING [train.py:1204] (2/4) Exclude cut with ID _27967_210_12140_1_1532588391859_8419130_1056-236369-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 06:46:39,825 WARNING [train.py:1204] (2/4) Exclude cut with ID _6537_210_1883_1_1527987559710_3597129_382-133062-0_sp0.9 from training. Duration: 0.7555625 2023-10-03 06:46:41,257 WARNING [train.py:1204] (2/4) Exclude cut with ID ID0376W0255-869957 from training. Duration: 0.9510625 2023-10-03 06:46:45,314 WARNING [train.py:1204] (2/4) Exclude cut with ID _49371_210_12071_1_1533952761021_3812190_483-240691-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 06:46:47,336 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_12868_210_6753_1_1530701932_530275_18-76322-0_sp1.1 from training. Duration: 0.7909375 2023-10-03 06:46:47,488 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.conv_skip_rate, batch_count=1175066.6666666667, ans=0.0 2023-10-03 06:46:48,641 WARNING [train.py:1204] (2/4) Exclude cut with ID _53950_210_5816_1_1534417264919_3862951_367-26344-0_sp1.1 from training. Duration: 0.97275 2023-10-03 06:46:48,668 WARNING [train.py:1204] (2/4) Exclude cut with ID _5444_210_2151_1_1527418808464_5312210_634-194174-0_sp1.1 from training. Duration: 0.8818125 2023-10-03 06:46:48,684 WARNING [train.py:1204] (2/4) Exclude cut with ID _56494_210_4220_1_1535284837625_3033169_69-138150-0 from training. Duration: 0.94 2023-10-03 06:46:48,837 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.0.attention_skip_rate, batch_count=1175066.6666666667, ans=0.0 2023-10-03 06:46:50,032 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7543_210_6753_1_1528544010_1110402_25-183860-0_sp0.9 from training. Duration: 0.52225 2023-10-03 06:46:51,396 WARNING [train.py:1204] (2/4) Exclude cut with ID _40284_210_2084_1_1533513679814_3704279_287-191918-0_sp1.1 from training. Duration: 0.963625 2023-10-03 06:46:52,881 WARNING [train.py:1204] (2/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_291-270881-0 from training. Duration: 0.97 2023-10-03 06:46:52,928 WARNING [train.py:1204] (2/4) Exclude cut with ID _12555_210_8750_1_1530075594402_3780239_449-267986-0_sp0.9 from training. Duration: 0.92225 2023-10-03 06:46:56,986 WARNING [train.py:1204] (2/4) Exclude cut with ID _3992_210_1747_1_1526644816031_3731060_268-64520-0_sp1.1 from training. Duration: 0.963625 2023-10-03 06:46:56,992 WARNING [train.py:1204] (2/4) Exclude cut with ID _39411_210_5986_1_1533538825030_3343649_173-238248-0_sp0.9 from training. Duration: 0.92225 2023-10-03 06:46:58,300 WARNING [train.py:1204] (2/4) Exclude cut with ID _32704_210_8954_1_1533017488840_6747370_828-313097-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 06:46:59,787 WARNING [train.py:1204] (2/4) Exclude cut with ID _26907_210_7234_1_1532221740694_3205263_337-2494-0 from training. Duration: 0.88 2023-10-03 06:47:01,229 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_229-45300-0_sp1.1 from training. Duration: 0.5181875 2023-10-03 06:47:03,141 WARNING [train.py:1204] (2/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_300-27235-0_sp1.1 from training. Duration: 0.7909375 2023-10-03 06:47:04,537 WARNING [train.py:1204] (2/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_48-338976-0_sp1.1 from training. Duration: 0.8090625 2023-10-03 06:47:10,063 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_13091_210_7925_1_1530609447_8987354_792-195353-0_sp1.1 from training. Duration: 0.936375 2023-10-03 06:47:10,076 WARNING [train.py:1204] (2/4) Exclude cut with ID _56524_210_9642_1_1534572011779_3619170_311-54500-0_sp1.1 from training. Duration: 0.97275 2023-10-03 06:47:14,748 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7543_210_6753_1_1528544010_1110402_25-183860-0 from training. Duration: 0.47 2023-10-03 06:47:16,122 WARNING [train.py:1204] (2/4) Exclude cut with ID 2411-132532-0017-82279-0_sp1.1 from training. Duration: 0.9681875 2023-10-03 06:47:16,122 WARNING [train.py:1204] (2/4) Exclude cut with ID _14170_210_5219_1_1530525949868_4469650_272-249985-0_sp0.9 from training. Duration: 0.7444375 2023-10-03 06:47:17,594 WARNING [train.py:1204] (2/4) Exclude cut with ID _55644_210_11856_1_1534593599582_4103719_320-229006-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 06:47:18,924 WARNING [train.py:1204] (2/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_177-100554-0_sp1.1 from training. Duration: 0.963625 2023-10-03 06:47:18,926 WARNING [train.py:1204] (2/4) Exclude cut with ID _14998_210_5219_1_1530705582080_7474970_787-294416-0_sp1.1 from training. Duration: 0.7454375 2023-10-03 06:47:23,700 WARNING [train.py:1204] (2/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_318-59097-0 from training. Duration: 0.76 2023-10-03 06:47:23,770 WARNING [train.py:1204] (2/4) Exclude cut with ID _12369_210_4381_1_1530356078581_4388520_480-13146-0_sp1.1 from training. Duration: 0.77275 2023-10-03 06:47:25,117 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_469-338558-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 06:47:26,475 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_40896_210_15710_1_1533433956_7100802_367-126897-0_sp1.1 from training. Duration: 0.963625 2023-10-03 06:47:26,492 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6545_210_2040_1_1527774670_4245258_379-14182-0 from training. Duration: 0.8 2023-10-03 06:47:26,514 WARNING [train.py:1204] (2/4) Exclude cut with ID _21631_210_2253_1_1531907880778_3160339_197-79498-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 06:47:26,518 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_12868_210_6753_1_1530702479_3610564_416-293746-0_sp1.1 from training. Duration: 0.8181875 2023-10-03 06:47:26,555 WARNING [train.py:1204] (2/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_52-211683-0 from training. Duration: 0.9 2023-10-03 06:47:28,121 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.bypass.scale_min, batch_count=1175266.6666666667, ans=0.2 2023-10-03 06:47:29,641 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.feed_forward1.hidden_balancer.prob, batch_count=1175266.6666666667, ans=0.125 2023-10-03 06:47:30,830 WARNING [train.py:1204] (2/4) Exclude cut with ID _48678_210_5663_1_1533988830947_3769199_306-255886-0_sp0.9 from training. Duration: 0.8555625 2023-10-03 06:47:34,523 WARNING [train.py:1204] (2/4) Exclude cut with ID _65134_210_14763_1_1535335393985_3505368_296-24733-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 06:47:38,793 WARNING [train.py:1204] (2/4) Exclude cut with ID _14898_210_9206_1_1530766812377_4177190_335-99364-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 06:47:38,874 WARNING [train.py:1204] (2/4) Exclude cut with ID _54871_210_22356_1_1534665798253_3856359_19-342632-0 from training. Duration: 0.91 2023-10-03 06:47:38,890 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_53071_210_16554_1_1534490405_2954066_265-170472-0 from training. Duration: 0.83 2023-10-03 06:47:41,803 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.conv_module1.balancer1.prob, batch_count=1175266.6666666667, ans=0.125 2023-10-03 06:47:42,982 WARNING [train.py:1204] (2/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_189-298172-0_sp1.1 from training. Duration: 0.963625 2023-10-03 06:47:44,856 INFO [train.py:1046] (2/4) Epoch 34, batch 1000, loss[loss=0.1559, simple_loss=0.2189, pruned_loss=0.0464, over 23431.00 frames. ], tot_loss[loss=0.1615, simple_loss=0.24, pruned_loss=0.04152, over 4681374.34 frames. ], batch size: 285, lr: 3.01e-03, grad_scale: 8.0 2023-10-03 06:47:47,753 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7615_210_4211_1_1528110965_4580160_72-235211-0 from training. Duration: 0.76 2023-10-03 06:47:47,796 WARNING [train.py:1204] (2/4) Exclude cut with ID _47790_210_16187_1_1533862814066_4207519_155-90874-0_sp1.1 from training. Duration: 0.936375 2023-10-03 06:47:52,542 WARNING [train.py:1204] (2/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_164-216310-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 06:47:52,816 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.bypass.scale_min, batch_count=1175333.3333333333, ans=0.2 2023-10-03 06:47:53,280 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.3.encoder.layers.0.self_attn2.whiten, num_groups=1, num_channels=512, metric=14.72 vs. limit=22.5 2023-10-03 06:47:55,223 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_46013_210_7234_1_1533776362_3744032_463-52378-0 from training. Duration: 0.86 2023-10-03 06:47:55,234 WARNING [train.py:1204] (2/4) Exclude cut with ID _61133_210_9969_1_1535196777227_3696409_333-270325-0 from training. Duration: 0.93 2023-10-03 06:47:55,447 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.ff2_skip_rate, batch_count=1175333.3333333333, ans=0.0 2023-10-03 06:47:59,409 WARNING [train.py:1204] (2/4) Exclude cut with ID _5172_210_1883_1_1527246062567_3577219_18-101380-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 06:47:59,416 WARNING [train.py:1204] (2/4) Exclude cut with ID _30800_210_22382_1_1532499347904_4515959_577-236858-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 06:48:02,189 WARNING [train.py:1204] (2/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_1006-177345-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 06:48:05,444 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6291_210_3273_1_1527683649_1078498_4-266348-0 from training. Duration: 0.78 2023-10-03 06:48:06,979 WARNING [train.py:1204] (2/4) Exclude cut with ID _55857_210_2346_1_1534644007831_6898209_745-98391-0 from training. Duration: 0.76 2023-10-03 06:48:09,690 WARNING [train.py:1204] (2/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_375-244976-0 from training. Duration: 0.75 2023-10-03 06:48:09,743 WARNING [train.py:1204] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_4-248404-0_sp1.1 from training. Duration: 0.97275 2023-10-03 06:48:11,135 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8453_210_4211_1_1528502940_8562730_596-306820-0 from training. Duration: 0.97 2023-10-03 06:48:12,512 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_211-339622-0_sp0.9 from training. Duration: 0.5 2023-10-03 06:48:12,547 WARNING [train.py:1204] (2/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_393-57888-0 from training. Duration: 0.64 2023-10-03 06:48:13,946 WARNING [train.py:1204] (2/4) Exclude cut with ID _23825_210_13449_1_1531915313526_3497150_143-343575-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 06:48:14,005 WARNING [train.py:1204] (2/4) Exclude cut with ID _58605_210_12210_1_1534723195939_3904259_36-12256-0_sp1.1 from training. Duration: 0.936375 2023-10-03 06:48:21,627 WARNING [train.py:1204] (2/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_38-67795-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 06:48:22,864 WARNING [train.py:1204] (2/4) Exclude cut with ID _64631_210_27767_1_1535349580068_4165160_308-327832-0_sp1.1 from training. Duration: 0.92725 2023-10-03 06:48:22,927 WARNING [train.py:1204] (2/4) Exclude cut with ID _37086_210_9038_1_1532999540891_7021250_726-81176-0_sp1.1 from training. Duration: 0.936375 2023-10-03 06:48:22,987 WARNING [train.py:1204] (2/4) Exclude cut with ID _47790_210_16187_1_1533862814066_4207519_47-141521-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 06:48:22,991 WARNING [train.py:1204] (2/4) Exclude cut with ID _3067_210_2667_1_1526212974640_4503168_250-285937-0 from training. Duration: 0.8 2023-10-03 06:48:24,303 WARNING [train.py:1204] (2/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_239-24000-0_sp1.1 from training. Duration: 0.97275 2023-10-03 06:48:24,363 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_3371_210_1794_1_1526384462_5580777_396-339812-0_sp0.9 from training. Duration: 0.9444375 2023-10-03 06:48:25,667 WARNING [train.py:1204] (2/4) Exclude cut with ID _16909_210_5724_1_1531292079912_4118919_580-173454-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 06:48:25,824 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.1.feed_forward1.out_proj.dropout_p, batch_count=1175466.6666666667, ans=0.1 2023-10-03 06:48:26,910 WARNING [train.py:1204] (2/4) Exclude cut with ID IC0124W0002-61308 from training. Duration: 0.982 2023-10-03 06:48:28,479 WARNING [train.py:1204] (2/4) Exclude cut with ID _31602_210_5854_1_1532677021456_4458250_249-96604-0 from training. Duration: 0.89 2023-10-03 06:48:30,371 WARNING [train.py:1204] (2/4) Exclude cut with ID _69584_210_5219_1_1535785133775_4044450_325-7656-0 from training. Duration: 0.86 2023-10-03 06:48:31,818 WARNING [train.py:1204] (2/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_478-162247-0 from training. Duration: 0.82 2023-10-03 06:48:33,307 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.0.balancer2.prob, batch_count=1175533.3333333333, ans=0.125 2023-10-03 06:48:33,350 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.feed_forward1.hidden_balancer.prob, batch_count=1175533.3333333333, ans=0.125 2023-10-03 06:48:34,463 WARNING [train.py:1204] (2/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_686-54383-0_sp1.1 from training. Duration: 0.7909375 2023-10-03 06:48:40,094 WARNING [train.py:1204] (2/4) Exclude cut with ID _29303_210_2575_1_1532392237303_7410649_85-270092-0_sp1.1 from training. Duration: 0.936375 2023-10-03 06:48:40,104 WARNING [train.py:1204] (2/4) Exclude cut with ID _2322_210_2667_1_1525604548971_3656299_440-80494-0_sp1.1 from training. Duration: 0.8 2023-10-03 06:48:41,357 WARNING [train.py:1204] (2/4) Exclude cut with ID _47346_210_9476_1_1533815971553_3536160_201-40644-0_sp1.1 from training. Duration: 0.936375 2023-10-03 06:48:42,745 WARNING [train.py:1204] (2/4) Exclude cut with ID _56235_210_10640_1_1534557335235_4161099_343-139898-0_sp1.1 from training. Duration: 0.87275 2023-10-03 06:48:44,105 WARNING [train.py:1204] (2/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_1012-279473-0 from training. Duration: 0.67 2023-10-03 06:48:45,513 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7931_210_1794_1_1528594728_5564059_280-279560-0_sp1.1 from training. Duration: 0.8909375 2023-10-03 06:48:45,559 WARNING [train.py:1204] (2/4) Exclude cut with ID _8263_210_1664_1_1528439452891_970220_68-181805-0 from training. Duration: 0.64 2023-10-03 06:48:47,513 WARNING [train.py:1204] (2/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_605-133395-0 from training. Duration: 0.66 2023-10-03 06:48:48,764 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_43-171109-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 06:48:48,769 WARNING [train.py:1204] (2/4) Exclude cut with ID _40094_210_16380_1_1533294005773_3641159_107-280739-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 06:48:52,912 WARNING [train.py:1204] (2/4) Exclude cut with ID _14821_210_8750_1_1530680503991_6829359_2-227151-0_sp0.9 from training. Duration: 0.92225 2023-10-03 06:48:54,392 WARNING [train.py:1204] (2/4) Exclude cut with ID _14268_210_9206_1_1530622771285_4714099_331-105351-0_sp1.1 from training. Duration: 0.5909375 2023-10-03 06:48:55,776 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_26326_210_7925_1_1533002493_3592406_451-82741-0_sp1.1 from training. Duration: 0.97275 2023-10-03 06:48:57,056 INFO [train.py:1046] (2/4) Epoch 34, batch 1050, loss[loss=0.1721, simple_loss=0.2534, pruned_loss=0.04544, over 23960.00 frames. ], tot_loss[loss=0.1603, simple_loss=0.2382, pruned_loss=0.04116, over 4686058.61 frames. ], batch size: 86, lr: 3.01e-03, grad_scale: 8.0 2023-10-03 06:48:58,551 WARNING [train.py:1204] (2/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_191-59784-0_sp0.9 from training. Duration: 0.97775 2023-10-03 06:48:58,642 WARNING [train.py:1204] (2/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_445-117398-0_sp1.1 from training. Duration: 0.8090625 2023-10-03 06:49:00,114 WARNING [train.py:1204] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_605-257205-0_sp0.9 from training. Duration: 0.6555625 2023-10-03 06:49:00,256 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.0.conv_module1.balancer2.prob, batch_count=1175666.6666666667, ans=0.125 2023-10-03 06:49:02,097 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_50050_210_16223_1_1535435796_7290138_592-258841-0_sp1.1 from training. Duration: 0.936375 2023-10-03 06:49:04,580 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.649e+02 1.921e+02 2.098e+02 2.393e+02 3.925e+02, threshold=4.195e+02, percent-clipped=0.0 2023-10-03 06:49:04,675 WARNING [train.py:1204] (2/4) Exclude cut with ID _14348_210_8341_1_1530599434996_3778640_240-232370-0_sp0.9 from training. Duration: 0.9333125 2023-10-03 06:49:06,170 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.ff2_skip_rate, batch_count=1175666.6666666667, ans=0.0 2023-10-03 06:49:07,239 WARNING [train.py:1204] (2/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_239-316802-0_sp0.9 from training. Duration: 0.7666875 2023-10-03 06:49:08,682 WARNING [train.py:1204] (2/4) Exclude cut with ID _45560_210_21982_1_1533812603951_3584999_111-6376-0_sp1.1 from training. Duration: 0.763625 2023-10-03 06:49:10,208 WARNING [train.py:1204] (2/4) Exclude cut with ID _54189_210_16380_1_1534332670039_3911710_606-311481-0_sp1.1 from training. Duration: 0.963625 2023-10-03 06:49:11,542 WARNING [train.py:1204] (2/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_237-332027-0_sp1.1 from training. Duration: 0.863625 2023-10-03 06:49:11,577 WARNING [train.py:1204] (2/4) Exclude cut with ID _66070_210_7640_1_1535441393096_7805310_489-292631-0_sp0.9 from training. Duration: 0.87775 2023-10-03 06:49:11,649 WARNING [train.py:1204] (2/4) Exclude cut with ID _49728_210_12651_1_1534041034294_6788150_624-195301-0_sp1.1 from training. Duration: 0.77275 2023-10-03 06:49:13,032 WARNING [train.py:1204] (2/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_44-79427-0 from training. Duration: 0.98 2023-10-03 06:49:14,392 WARNING [train.py:1204] (2/4) Exclude cut with ID _35350_210_16040_1_1533040191183_4075299_490-198354-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 06:49:14,432 WARNING [train.py:1204] (2/4) Exclude cut with ID _5166_210_2084_1_1527073197160_4087993_309-222545-0 from training. Duration: 0.84 2023-10-03 06:49:17,784 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_13091_210_7925_1_1530609447_8987354_790-199469-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 06:49:17,790 WARNING [train.py:1204] (2/4) Exclude cut with ID _21632_210_2253_1_1531994001765_3744140_110-274654-0 from training. Duration: 0.82 2023-10-03 06:49:17,802 WARNING [train.py:1204] (2/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_498-40153-0_sp0.9 from training. Duration: 0.7 2023-10-03 06:49:24,041 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.nonlin_attention.balancer.prob, batch_count=1175733.3333333333, ans=0.125 2023-10-03 06:49:26,543 WARNING [train.py:1204] (2/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_363-282909-0_sp1.1 from training. Duration: 0.936375 2023-10-03 06:49:27,867 WARNING [train.py:1204] (2/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_178-177582-0_sp0.9 from training. Duration: 0.82225 2023-10-03 06:49:27,879 WARNING [train.py:1204] (2/4) Exclude cut with ID _24612_210_8913_1_1531998054193_3806359_357-35487-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 06:49:29,245 WARNING [train.py:1204] (2/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_138-332773-0 from training. Duration: 0.81 2023-10-03 06:49:29,267 WARNING [train.py:1204] (2/4) Exclude cut with ID _7624_210_1531_1_1528109802198_4933101_418-349834-0 from training. Duration: 0.6 2023-10-03 06:49:29,294 WARNING [train.py:1204] (2/4) Exclude cut with ID _7537_210_2084_1_1528502395431_3924359_14-312906-0_sp0.9 from training. Duration: 0.9333125 2023-10-03 06:49:34,366 WARNING [train.py:1204] (2/4) Exclude cut with ID _60057_210_10640_1_1534903065793_3333150_362-230377-0 from training. Duration: 0.87 2023-10-03 06:49:37,094 WARNING [train.py:1204] (2/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_138-64210-0 from training. Duration: 0.73 2023-10-03 06:49:37,174 WARNING [train.py:1204] (2/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_454-339504-0_sp1.1 from training. Duration: 0.97275 2023-10-03 06:49:41,054 WARNING [train.py:1204] (2/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_508-335369-0_sp0.9 from training. Duration: 0.5333125 2023-10-03 06:49:42,512 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=1175866.6666666667, ans=0.1 2023-10-03 06:49:43,689 WARNING [train.py:1204] (2/4) Exclude cut with ID _5608_210_2106_1_1527415439089_6819768_909-82767-0_sp0.9 from training. Duration: 0.67775 2023-10-03 06:49:43,729 WARNING [train.py:1204] (2/4) Exclude cut with ID _6694_210_4599_1_1527852637490_3965529_99-11611-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 06:49:43,789 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_2655_210_2087_1_1525688959_3897400_52-279899-0_sp1.1 from training. Duration: 0.62725 2023-10-03 06:49:46,643 WARNING [train.py:1204] (2/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_375-245677-0_sp1.1 from training. Duration: 0.736375 2023-10-03 06:49:51,873 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_23850_210_4238_1_1532232597_1632433_32-185135-0 from training. Duration: 0.86 2023-10-03 06:49:53,249 WARNING [train.py:1204] (2/4) Exclude cut with ID _15113_210_8254_1_1530842438579_3716279_209-54533-0 from training. Duration: 0.96 2023-10-03 06:49:54,596 WARNING [train.py:1204] (2/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_401-53423-0 from training. Duration: 0.55 2023-10-03 06:49:54,632 WARNING [train.py:1204] (2/4) Exclude cut with ID _14867_210_1968_1_1530684179890_3846969_319-250868-0_sp1.1 from training. Duration: 0.9 2023-10-03 06:49:54,658 WARNING [train.py:1204] (2/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_106-30782-0_sp0.9 from training. Duration: 0.888875 2023-10-03 06:49:56,062 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_5960_210_1794_1_1527403865_3990999_515-2613-0 from training. Duration: 0.88 2023-10-03 06:49:58,976 WARNING [train.py:1204] (2/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_798-106179-0_sp0.9 from training. Duration: 0.92225 2023-10-03 06:50:02,117 WARNING [train.py:1204] (2/4) Exclude cut with ID _14227_210_4381_1_1530701804286_3940259_125-275642-0_sp1.1 from training. Duration: 0.9 2023-10-03 06:50:02,119 WARNING [train.py:1204] (2/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_79-200392-0_sp1.1 from training. Duration: 0.8818125 2023-10-03 06:50:03,403 WARNING [train.py:1204] (2/4) Exclude cut with ID _48836_210_12210_1_1534075848293_3904150_429-35475-0_sp0.9 from training. Duration: 0.988875 2023-10-03 06:50:03,418 WARNING [train.py:1204] (2/4) Exclude cut with ID _69884_210_9771_1_1535846392818_4166169_224-172517-0_sp1.1 from training. Duration: 0.97275 2023-10-03 06:50:06,608 WARNING [train.py:1204] (2/4) Exclude cut with ID _15113_210_8254_1_1530842438579_3716279_76-232507-0_sp1.1 from training. Duration: 0.97275 2023-10-03 06:50:06,611 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_135-5501-0 from training. Duration: 0.98 2023-10-03 06:50:07,960 WARNING [train.py:1204] (2/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_563-347832-0_sp0.9 from training. Duration: 0.988875 2023-10-03 06:50:07,970 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6786_210_1388_1_1527839699_4194495_273-149841-0 from training. Duration: 0.92 2023-10-03 06:50:07,994 WARNING [train.py:1204] (2/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_172-215932-0 from training. Duration: 0.66 2023-10-03 06:50:08,045 WARNING [train.py:1204] (2/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_939-132910-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 06:50:10,775 INFO [train.py:1046] (2/4) Epoch 34, batch 1100, loss[loss=0.1591, simple_loss=0.2372, pruned_loss=0.04045, over 23300.00 frames. ], tot_loss[loss=0.1594, simple_loss=0.2372, pruned_loss=0.04075, over 4692237.85 frames. ], batch size: 105, lr: 3.01e-03, grad_scale: 8.0 2023-10-03 06:50:12,298 WARNING [train.py:1204] (2/4) Exclude cut with ID _49371_210_12071_1_1533952761021_3812190_16-246001-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 06:50:16,500 WARNING [train.py:1204] (2/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_680-189165-0_sp1.1 from training. Duration: 0.936375 2023-10-03 06:50:23,044 WARNING [train.py:1204] (2/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_53-300544-0_sp1.1 from training. Duration: 0.6090625 2023-10-03 06:50:25,755 WARNING [train.py:1204] (2/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_540-275685-0_sp1.1 from training. Duration: 0.8090625 2023-10-03 06:50:25,771 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_38965_210_4852_1_1533174906_3484735_453-106579-0_sp1.1 from training. Duration: 0.92725 2023-10-03 06:50:25,821 WARNING [train.py:1204] (2/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_105-328174-0 from training. Duration: 0.76 2023-10-03 06:50:28,653 WARNING [train.py:1204] (2/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_279-126205-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 06:50:28,815 WARNING [train.py:1204] (2/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_5-55893-0_sp0.9 from training. Duration: 0.611125 2023-10-03 06:50:32,193 WARNING [train.py:1204] (2/4) Exclude cut with ID _13678_210_7497_1_1530530069226_3463170_141-10835-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 06:50:32,374 INFO [scaling.py:1118] (2/4) WithLoss: name=encoder.encoders.0.layers.1.self_attn_weights, loss-sum=0.000e+00 2023-10-03 06:50:34,775 WARNING [train.py:1204] (2/4) Exclude cut with ID _22083_210_8254_1_1531702862439_3678970_201-85738-0_sp1.1 from training. Duration: 0.8181875 2023-10-03 06:50:34,806 WARNING [train.py:1204] (2/4) Exclude cut with ID _4697_210_2069_1_1526815801953_3823098_51-1049-0 from training. Duration: 0.75 2023-10-03 06:50:36,126 WARNING [train.py:1204] (2/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_206-10097-0_sp0.9 from training. Duration: 0.5444375 2023-10-03 06:50:38,008 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_4528_210_3273_1_1527076654_3276777_360-63661-0_sp1.1 from training. Duration: 0.92725 2023-10-03 06:50:38,022 WARNING [train.py:1204] (2/4) Exclude cut with ID _64086_210_7994_1_1535270417019_7435949_449-31567-0_sp1.1 from training. Duration: 0.8818125 2023-10-03 06:50:39,930 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.4.encoder.layers.0.whiten, num_groups=1, num_channels=384, metric=5.66 vs. limit=12.0 2023-10-03 06:50:40,724 WARNING [train.py:1204] (2/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_245-174553-0_sp0.9 from training. Duration: 0.97775 2023-10-03 06:50:42,192 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6545_210_2040_1_1527774670_4245258_379-14182-0_sp1.1 from training. Duration: 0.72725 2023-10-03 06:50:46,323 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_3133_210_2087_1_1526106446_2324690_256-206070-0_sp1.1 from training. Duration: 0.963625 2023-10-03 06:50:46,510 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.1.conv_module1.balancer2.prob, batch_count=1176133.3333333333, ans=0.125 2023-10-03 06:50:49,865 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.0.layers.1.feed_forward2.out_whiten, num_groups=1, num_channels=192, metric=4.17 vs. limit=15.0 2023-10-03 06:50:50,301 WARNING [train.py:1204] (2/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_278-104676-0 from training. Duration: 0.98 2023-10-03 06:50:50,529 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.feed_forward1.out_proj.dropout_p, batch_count=1176133.3333333333, ans=0.1 2023-10-03 06:50:51,488 WARNING [train.py:1204] (2/4) Exclude cut with ID IC0671W0065-331908 from training. Duration: 0.896 2023-10-03 06:50:51,529 WARNING [train.py:1204] (2/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_244-129917-0_sp1.1 from training. Duration: 0.97275 2023-10-03 06:50:54,877 WARNING [train.py:1204] (2/4) Exclude cut with ID _7852_210_5219_1_1528286471316_8139400_1129-170857-0_sp1.1 from training. Duration: 0.97275 2023-10-03 06:50:54,948 WARNING [train.py:1204] (2/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_443-30034-0_sp0.9 from training. Duration: 0.711125 2023-10-03 06:50:56,333 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_31583_210_3852_1_1532595851_4024413_250-332999-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 06:50:57,624 WARNING [train.py:1204] (2/4) Exclude cut with ID _8772_210_1850_1_1528635595972_3664011_247-336448-0 from training. Duration: 0.74 2023-10-03 06:50:58,122 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.4.encoder.layers.2.nonlin_attention.whiten2, num_groups=1, num_channels=384, metric=3.80 vs. limit=15.0 2023-10-03 06:50:58,845 WARNING [train.py:1204] (2/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_567-135114-0_sp0.9 from training. Duration: 0.9666875 2023-10-03 06:50:58,849 WARNING [train.py:1204] (2/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_557-349534-0_sp1.1 from training. Duration: 0.82725 2023-10-03 06:50:58,867 WARNING [train.py:1204] (2/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_1341-26728-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 06:51:00,214 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_40222_210_6228_1_1533385298_4481939_375-328051-0_sp1.1 from training. Duration: 0.97275 2023-10-03 06:51:00,234 WARNING [train.py:1204] (2/4) Exclude cut with ID _4697_210_2069_1_1526815801953_3823098_276-96593-0 from training. Duration: 0.77 2023-10-03 06:51:01,815 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.conv_skip_rate, batch_count=1176200.0, ans=0.0 2023-10-03 06:51:04,848 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_53071_210_16554_1_1534490405_2954066_265-170472-0_sp0.9 from training. Duration: 0.92225 2023-10-03 06:51:04,864 WARNING [train.py:1204] (2/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_365-1492-0 from training. Duration: 0.91 2023-10-03 06:51:06,389 WARNING [train.py:1204] (2/4) Exclude cut with ID _66873_210_5517_1_1535454136506_4293370_654-52713-0_sp0.9 from training. Duration: 0.8555625 2023-10-03 06:51:11,029 WARNING [train.py:1204] (2/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_113-92608-0_sp1.1 from training. Duration: 0.4909375 2023-10-03 06:51:12,911 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.3.encoder.layers.1.conv_module2.whiten, num_groups=1, num_channels=512, metric=3.67 vs. limit=15.0 2023-10-03 06:51:15,155 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_46013_210_7234_1_1533776362_3744032_50-141535-0 from training. Duration: 0.99 2023-10-03 06:51:15,165 WARNING [train.py:1204] (2/4) Exclude cut with ID _13942_210_9988_1_1530604906036_3733360_499-55676-0_sp1.1 from training. Duration: 0.563625 2023-10-03 06:51:16,625 WARNING [train.py:1204] (2/4) Exclude cut with ID _30800_210_22382_1_1532499347904_4515959_375-65143-0_sp1.1 from training. Duration: 0.97275 2023-10-03 06:51:17,299 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.2.encoder.layers.2.conv_module2.whiten, num_groups=1, num_channels=384, metric=5.16 vs. limit=15.0 2023-10-03 06:51:19,270 WARNING [train.py:1204] (2/4) Exclude cut with ID _16756_210_4915_1_1531047803763_7104259_199-325608-0_sp1.1 from training. Duration: 0.92725 2023-10-03 06:51:19,303 WARNING [train.py:1204] (2/4) Exclude cut with ID _31649_210_6228_1_1532584608063_4091078_443-124391-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 06:51:19,390 WARNING [train.py:1204] (2/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_146-218763-0 from training. Duration: 0.86 2023-10-03 06:51:20,715 WARNING [train.py:1204] (2/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_567-135114-0_sp1.1 from training. Duration: 0.7909375 2023-10-03 06:51:22,025 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_26326_210_7925_1_1533002493_3592406_462-288299-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 06:51:23,804 INFO [train.py:1046] (2/4) Epoch 34, batch 1150, loss[loss=0.1602, simple_loss=0.2439, pruned_loss=0.03823, over 23258.00 frames. ], tot_loss[loss=0.1597, simple_loss=0.2379, pruned_loss=0.04076, over 4691144.24 frames. ], batch size: 93, lr: 3.01e-03, grad_scale: 8.0 2023-10-03 06:51:23,901 WARNING [train.py:1204] (2/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_322-217220-0 from training. Duration: 0.86 2023-10-03 06:51:23,936 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_238-256668-0_sp1.1 from training. Duration: 0.62725 2023-10-03 06:51:23,981 WARNING [train.py:1204] (2/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_371-18169-0 from training. Duration: 0.86 2023-10-03 06:51:25,776 WARNING [train.py:1204] (2/4) Exclude cut with ID _58480_210_12392_1_1534811121111_3646160_23-74512-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 06:51:25,781 WARNING [train.py:1204] (2/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_784-213099-0_sp1.1 from training. Duration: 0.8454375 2023-10-03 06:51:25,878 WARNING [train.py:1204] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_625-102955-0_sp1.1 from training. Duration: 0.7 2023-10-03 06:51:31,201 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.646e+02 1.910e+02 2.113e+02 2.416e+02 3.603e+02, threshold=4.226e+02, percent-clipped=0.0 2023-10-03 06:51:31,308 WARNING [train.py:1204] (2/4) Exclude cut with ID _49748_210_24116_1_1533986971737_3158910_176-174327-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 06:51:33,338 WARNING [train.py:1204] (2/4) Exclude cut with ID _50310_210_15488_1_1534068043345_3641080_432-96140-0_sp1.1 from training. Duration: 0.936375 2023-10-03 06:51:34,629 WARNING [train.py:1204] (2/4) Exclude cut with ID _40402_210_18219_1_1533522324115_2953150_76-14910-0_sp1.1 from training. Duration: 0.92725 2023-10-03 06:51:35,895 WARNING [train.py:1204] (2/4) Exclude cut with ID _21783_210_11357_1_1531742458730_3762679_40-19458-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 06:51:35,933 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_14056_210_4163_1_1530604483_7572827_46-93547-0 from training. Duration: 0.95 2023-10-03 06:51:35,964 WARNING [train.py:1204] (2/4) Exclude cut with ID _21631_210_2253_1_1531907880778_3160339_321-35325-0_sp1.1 from training. Duration: 0.963625 2023-10-03 06:51:37,603 WARNING [train.py:1204] (2/4) Exclude cut with ID _4779_210_2084_1_1527318022944_3914893_500-285013-0 from training. Duration: 0.69 2023-10-03 06:51:40,720 WARNING [train.py:1204] (2/4) Exclude cut with ID _43552_210_8913_1_1533726031059_3523429_301-93324-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 06:51:40,726 WARNING [train.py:1204] (2/4) Exclude cut with ID _6473_210_1741_1_1527901307396_3815380_108-17335-0_sp1.1 from training. Duration: 0.5909375 2023-10-03 06:51:46,190 WARNING [train.py:1204] (2/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_206-10097-0 from training. Duration: 0.49 2023-10-03 06:51:48,915 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_36756_210_7234_1_1533026991_966276_103-77513-0_sp1.1 from training. Duration: 0.97275 2023-10-03 06:51:50,833 WARNING [train.py:1204] (2/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_662-18193-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 06:51:52,183 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_26433_210_12108_1_1532157158_4056480_439-52438-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 06:51:52,248 WARNING [train.py:1204] (2/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_464-33931-0 from training. Duration: 0.54 2023-10-03 06:51:52,254 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_3474_210_2028_1_1526188513_4825246_145-309131-0_sp0.9 from training. Duration: 0.82225 2023-10-03 06:51:53,693 WARNING [train.py:1204] (2/4) Exclude cut with ID _9679_210_7497_1_1529129212906_4363929_446-308321-0_sp1.1 from training. Duration: 0.963625 2023-10-03 06:51:54,032 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.conv_skip_rate, batch_count=1176466.6666666667, ans=0.0 2023-10-03 06:51:56,567 WARNING [train.py:1204] (2/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_662-275621-0 from training. Duration: 0.42 2023-10-03 06:51:58,352 WARNING [train.py:1204] (2/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_344-337148-0_sp1.1 from training. Duration: 0.97275 2023-10-03 06:51:59,849 WARNING [train.py:1204] (2/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_759-179071-0_sp1.1 from training. Duration: 0.92725 2023-10-03 06:52:05,875 WARNING [train.py:1204] (2/4) Exclude cut with ID _28783_210_7943_1_1532671164808_7155479_293-12188-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 06:52:06,081 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.conv_module1.balancer1.prob, batch_count=1176466.6666666667, ans=0.125 2023-10-03 06:52:12,101 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_17613_210_2151_1_1531137569_2366160_235-320304-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 06:52:13,388 WARNING [train.py:1204] (2/4) Exclude cut with ID _14349_210_8341_1_1530615710818_3442470_170-336541-0 from training. Duration: 0.86 2023-10-03 06:52:13,443 WARNING [train.py:1204] (2/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_375-218677-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 06:52:13,486 WARNING [train.py:1204] (2/4) Exclude cut with ID _28317_210_12742_1_1532480302381_8510325_829-202956-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 06:52:19,189 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9029W0436-509823 from training. Duration: 0.768 2023-10-03 06:52:22,288 WARNING [train.py:1204] (2/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_231-112214-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 06:52:27,813 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9059W0470-524883 from training. Duration: 0.3415625 2023-10-03 06:52:31,734 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_43266_210_1794_1_1533521789_4497218_206-79512-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 06:52:33,113 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_294-163793-0_sp0.9 from training. Duration: 0.911125 2023-10-03 06:52:33,149 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6290_210_3273_1_1527595224_1282787_115-87095-0_sp1.1 from training. Duration: 0.5 2023-10-03 06:52:34,509 WARNING [train.py:1204] (2/4) Exclude cut with ID _14100_210_8341_1_1530529320613_3678069_279-55760-0_sp1.1 from training. Duration: 0.8454375 2023-10-03 06:52:37,092 WARNING [train.py:1204] (2/4) Exclude cut with ID _14621_210_1925_1_1530686901185_3675420_419-234200-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 06:52:38,898 INFO [train.py:1046] (2/4) Epoch 34, batch 1200, loss[loss=0.2095, simple_loss=0.2771, pruned_loss=0.07097, over 19074.00 frames. ], tot_loss[loss=0.1612, simple_loss=0.2393, pruned_loss=0.04159, over 4680354.65 frames. ], batch size: 388, lr: 3.01e-03, grad_scale: 16.0 2023-10-03 06:52:41,809 WARNING [train.py:1204] (2/4) Exclude cut with ID _60229_210_27767_1_1534838406158_3655350_353-213656-0_sp1.1 from training. Duration: 0.863625 2023-10-03 06:52:41,818 WARNING [train.py:1204] (2/4) Exclude cut with ID _48678_210_5663_1_1533988830947_3769199_306-255886-0_sp1.1 from training. Duration: 0.7 2023-10-03 06:52:43,289 WARNING [train.py:1204] (2/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_466-3492-0_sp1.1 from training. Duration: 0.97275 2023-10-03 06:52:43,290 WARNING [train.py:1204] (2/4) Exclude cut with ID _8755_210_1876_1_1528628559357_3258209_199-242167-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 06:52:44,635 WARNING [train.py:1204] (2/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_237-89684-0_sp1.1 from training. Duration: 0.87275 2023-10-03 06:52:46,021 WARNING [train.py:1204] (2/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_70-84121-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 06:52:47,388 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7544_210_6753_1_1528587722_3576686_172-171750-0_sp1.1 from training. Duration: 0.5909375 2023-10-03 06:52:50,097 WARNING [train.py:1204] (2/4) Exclude cut with ID _24271_210_12658_1_1531987270022_7190519_40-321822-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 06:52:50,123 WARNING [train.py:1204] (2/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_227-83612-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 06:52:53,350 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9156W0015-572508 from training. Duration: 0.896 2023-10-03 06:52:56,072 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_292-265928-0 from training. Duration: 0.51 2023-10-03 06:52:58,924 WARNING [train.py:1204] (2/4) Exclude cut with ID _14747_210_8341_1_1530685797463_3654089_147-131229-0_sp1.1 from training. Duration: 0.6909375 2023-10-03 06:53:00,760 WARNING [train.py:1204] (2/4) Exclude cut with ID _45297_210_2721_1_1533715351381_3264949_351-147136-0_sp0.9 from training. Duration: 0.9666875 2023-10-03 06:53:03,887 WARNING [train.py:1204] (2/4) Exclude cut with ID _5250_210_5219_1_1527249561806_8692110_1102-324568-0_sp1.1 from training. Duration: 0.97275 2023-10-03 06:53:05,341 WARNING [train.py:1204] (2/4) Exclude cut with ID _18516_210_9206_1_1531704653206_3692406_465-75446-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 06:53:05,343 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9203W0126-595623 from training. Duration: 0.896 2023-10-03 06:53:05,419 WARNING [train.py:1204] (2/4) Exclude cut with ID _55383_210_10635_1_1534468426172_2231669_139-328410-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 06:53:08,486 INFO [scaling.py:1118] (2/4) WithLoss: name=encoder.encoders.5.encoder.layers.1.self_attn_weights, loss-sum=0.000e+00 2023-10-03 06:53:11,119 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.self_attn_weights.pos_emb_skip_rate, batch_count=1176800.0, ans=0.0 2023-10-03 06:53:14,180 WARNING [train.py:1204] (2/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_166-246557-0_sp0.9 from training. Duration: 0.711125 2023-10-03 06:53:14,185 WARNING [train.py:1204] (2/4) Exclude cut with ID _30039_210_13270_1_1532484612900_2725150_239-304872-0_sp1.1 from training. Duration: 0.963625 2023-10-03 06:53:14,214 WARNING [train.py:1204] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_6-226750-0 from training. Duration: 0.94 2023-10-03 06:53:15,558 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_135-5501-0_sp1.1 from training. Duration: 0.8909375 2023-10-03 06:53:18,400 WARNING [train.py:1204] (2/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_359-118926-0 from training. Duration: 0.72 2023-10-03 06:53:22,487 WARNING [train.py:1204] (2/4) Exclude cut with ID _8064_210_5399_1_1528372299291_4282973_4-91037-0 from training. Duration: 0.88 2023-10-03 06:53:22,516 WARNING [train.py:1204] (2/4) Exclude cut with ID _6653_210_2629_1_1527919172441_6732679_415-288492-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 06:53:23,860 WARNING [train.py:1204] (2/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_544-287179-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 06:53:25,859 WARNING [train.py:1204] (2/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_122-32606-0_sp1.1 from training. Duration: 0.92725 2023-10-03 06:53:27,150 WARNING [train.py:1204] (2/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_121-284152-0_sp1.1 from training. Duration: 0.536375 2023-10-03 06:53:28,631 WARNING [train.py:1204] (2/4) Exclude cut with ID _44718_210_5816_1_1534050051869_6469030_350-299477-0_sp1.1 from training. Duration: 0.97275 2023-10-03 06:53:28,643 WARNING [train.py:1204] (2/4) Exclude cut with ID _6653_210_2629_1_1527919172441_6732679_1103-56445-0_sp0.9 from training. Duration: 0.811125 2023-10-03 06:53:30,417 WARNING [train.py:1204] (2/4) Exclude cut with ID _12369_210_4381_1_1530356078581_4388520_64-298917-0_sp0.9 from training. Duration: 0.97775 2023-10-03 06:53:31,694 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_2245_210_2715_1_1525511548_3540254_310-52483-0 from training. Duration: 0.88 2023-10-03 06:53:31,735 WARNING [train.py:1204] (2/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_401-158239-0_sp1.1 from training. Duration: 0.6454375 2023-10-03 06:53:33,078 WARNING [train.py:1204] (2/4) Exclude cut with ID _59502_210_12210_1_1535107024698_1835139_146-162495-0_sp1.1 from training. Duration: 0.836375 2023-10-03 06:53:33,089 WARNING [train.py:1204] (2/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_41-31157-0_sp0.9 from training. Duration: 0.6333125 2023-10-03 06:53:34,566 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_68717_210_3528_1_1535628683_1556759_95-36722-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 06:53:34,579 WARNING [train.py:1204] (2/4) Exclude cut with ID _30196_210_1664_1_1533708496522_7074140_654-162611-0_sp1.1 from training. Duration: 0.92725 2023-10-03 06:53:39,195 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_9302_210_2550_1_1528889962_4655416_638-301620-0_sp0.9 from training. Duration: 0.52225 2023-10-03 06:53:41,962 WARNING [train.py:1204] (2/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_478-162247-0_sp1.1 from training. Duration: 0.7454375 2023-10-03 06:53:43,391 WARNING [train.py:1204] (2/4) Exclude cut with ID _45297_210_2721_1_1533715351381_3264949_353-9541-0 from training. Duration: 0.97 2023-10-03 06:53:46,917 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.1.bypass_mid.scale_min, batch_count=1176933.3333333333, ans=0.2 2023-10-03 06:53:48,284 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9653W0190-675468 from training. Duration: 0.9935 2023-10-03 06:53:49,608 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_49730_210_13049_1_1534075294_1830676_214-84436-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 06:53:50,303 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.3.encoder.layers.3.nonlin_attention.whiten2, num_groups=1, num_channels=512, metric=5.76 vs. limit=15.0 2023-10-03 06:53:52,408 INFO [train.py:1046] (2/4) Epoch 34, batch 1250, loss[loss=0.186, simple_loss=0.2571, pruned_loss=0.0575, over 23398.00 frames. ], tot_loss[loss=0.162, simple_loss=0.2405, pruned_loss=0.04174, over 4693336.85 frames. ], batch size: 285, lr: 3.01e-03, grad_scale: 16.0 2023-10-03 06:53:52,472 WARNING [train.py:1204] (2/4) Exclude cut with ID _50093_210_4220_1_1534417141207_3432299_259-322252-0_sp1.1 from training. Duration: 0.836375 2023-10-03 06:53:53,857 WARNING [train.py:1204] (2/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_599-50220-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 06:53:55,254 WARNING [train.py:1204] (2/4) Exclude cut with ID _21878_210_3525_1_1531738772002_7264330_174-109016-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 06:53:58,519 WARNING [train.py:1204] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_665-196155-0 from training. Duration: 0.67 2023-10-03 06:53:59,819 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.620e+02 1.873e+02 2.047e+02 2.320e+02 3.578e+02, threshold=4.093e+02, percent-clipped=0.0 2023-10-03 06:54:01,244 WARNING [train.py:1204] (2/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_656-188361-0_sp1.1 from training. Duration: 0.7909375 2023-10-03 06:54:03,074 WARNING [train.py:1204] (2/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_60-18123-0_sp1.1 from training. Duration: 0.8 2023-10-03 06:54:03,135 WARNING [train.py:1204] (2/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_535-14410-0 from training. Duration: 0.47 2023-10-03 06:54:04,555 WARNING [train.py:1204] (2/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_94-126128-0_sp1.1 from training. Duration: 0.7818125 2023-10-03 06:54:05,959 WARNING [train.py:1204] (2/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_356-31216-0_sp1.1 from training. Duration: 0.6818125 2023-10-03 06:54:10,764 WARNING [train.py:1204] (2/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_443-30034-0_sp1.1 from training. Duration: 0.5818125 2023-10-03 06:54:10,834 WARNING [train.py:1204] (2/4) Exclude cut with ID _26907_210_7234_1_1532221740694_3205263_337-2494-0_sp1.1 from training. Duration: 0.8 2023-10-03 06:54:11,266 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.conv_module2.whiten.whitening_limit, batch_count=1177066.6666666667, ans=15.0 2023-10-03 06:54:12,142 WARNING [train.py:1204] (2/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_49-97158-0_sp0.9 from training. Duration: 0.8444375 2023-10-03 06:54:12,154 WARNING [train.py:1204] (2/4) Exclude cut with ID _7877_210_6014_1_1528286358099_3750136_553-170128-0_sp1.1 from training. Duration: 0.963625 2023-10-03 06:54:14,883 WARNING [train.py:1204] (2/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_202-293523-0_sp0.9 from training. Duration: 0.8 2023-10-03 06:54:16,824 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.bypass_mid.scale_min, batch_count=1177066.6666666667, ans=0.2 2023-10-03 06:54:18,194 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.feed_forward1.hidden_balancer.prob, batch_count=1177066.6666666667, ans=0.125 2023-10-03 06:54:19,352 WARNING [train.py:1204] (2/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_188-171673-0_sp0.9 from training. Duration: 0.6444375 2023-10-03 06:54:19,367 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_120-41668-0_sp1.1 from training. Duration: 0.636375 2023-10-03 06:54:19,371 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_40223_210_6228_1_1533298404_4812267_613-78769-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 06:54:20,832 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.conv_module1.balancer1.prob, batch_count=1177133.3333333333, ans=0.125 2023-10-03 06:54:21,892 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_53861_210_5399_1_1534422156_4272077_150-336899-0_sp1.1 from training. Duration: 0.963625 2023-10-03 06:54:21,953 WARNING [train.py:1204] (2/4) Exclude cut with ID _14459_210_2228_1_1531443641946_7516700_1146-56843-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 06:54:24,704 WARNING [train.py:1204] (2/4) Exclude cut with ID _39867_210_9826_1_1533553222030_9344259_1194-299928-0_sp1.1 from training. Duration: 0.92725 2023-10-03 06:54:26,056 WARNING [train.py:1204] (2/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_214-291369-0_sp0.9 from training. Duration: 0.7 2023-10-03 06:54:29,538 WARNING [train.py:1204] (2/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_358-215558-0 from training. Duration: 0.93 2023-10-03 06:54:30,843 WARNING [train.py:1204] (2/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_577-287150-0_sp0.9 from training. Duration: 0.911125 2023-10-03 06:54:32,268 WARNING [train.py:1204] (2/4) Exclude cut with ID _60860_210_8254_1_1534896015875_3557169_229-187367-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 06:54:34,219 WARNING [train.py:1204] (2/4) Exclude cut with ID _6275_210_2084_1_1528110062213_7669049_857-298942-0 from training. Duration: 0.85 2023-10-03 06:54:34,272 WARNING [train.py:1204] (2/4) Exclude cut with ID _40137_210_12210_1_1533290484046_3795180_249-187059-0_sp1.1 from training. Duration: 0.8 2023-10-03 06:54:34,279 WARNING [train.py:1204] (2/4) Exclude cut with ID ID0171W0508-765603 from training. Duration: 0.541 2023-10-03 06:54:34,295 WARNING [train.py:1204] (2/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_521-219439-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 06:54:34,302 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_40223_210_6228_1_1533298404_4812267_741-302616-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 06:54:40,114 WARNING [train.py:1204] (2/4) Exclude cut with ID _65293_210_9070_1_1535545549765_4004210_438-154735-0_sp1.1 from training. Duration: 0.92725 2023-10-03 06:54:41,605 WARNING [train.py:1204] (2/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_564-132950-0_sp1.1 from training. Duration: 0.92725 2023-10-03 06:54:41,645 WARNING [train.py:1204] (2/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_291-270881-0_sp1.1 from training. Duration: 0.8818125 2023-10-03 06:54:43,017 WARNING [train.py:1204] (2/4) Exclude cut with ID _60229_210_27767_1_1534838406158_3655350_353-213656-0 from training. Duration: 0.95 2023-10-03 06:54:43,035 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_3133_210_2087_1_1526106446_2324690_243-143119-0 from training. Duration: 0.77 2023-10-03 06:54:43,056 WARNING [train.py:1204] (2/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_690-126306-0 from training. Duration: 0.78 2023-10-03 06:54:46,143 WARNING [train.py:1204] (2/4) Exclude cut with ID _30304_210_6319_1_1532476827261_7582060_347-95003-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 06:54:47,523 WARNING [train.py:1204] (2/4) Exclude cut with ID _2322_210_2667_1_1525604548971_3656299_440-80494-0 from training. Duration: 0.88 2023-10-03 06:54:47,544 WARNING [train.py:1204] (2/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_370-229434-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 06:54:50,378 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.conv_skip_rate, batch_count=1177266.6666666667, ans=0.0 2023-10-03 06:54:51,636 WARNING [train.py:1204] (2/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_124-345840-0_sp1.1 from training. Duration: 0.47275 2023-10-03 06:54:51,651 WARNING [train.py:1204] (2/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_165-123581-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 06:54:53,143 WARNING [train.py:1204] (2/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_453-282588-0 from training. Duration: 0.98 2023-10-03 06:54:54,452 WARNING [train.py:1204] (2/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_299-34413-0_sp0.9 from training. Duration: 0.72225 2023-10-03 06:54:54,488 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_40036_210_13405_1_1533648647_1315388_48-146016-0_sp1.1 from training. Duration: 0.8454375 2023-10-03 06:54:54,519 WARNING [train.py:1204] (2/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_99-167392-0_sp0.9 from training. Duration: 0.67775 2023-10-03 06:54:54,576 WARNING [train.py:1204] (2/4) Exclude cut with ID _48677_210_5663_1_1533902405919_3892989_37-207114-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 06:54:57,352 WARNING [train.py:1204] (2/4) Exclude cut with ID _29857_210_20034_1_1532395886753_3798160_335-163562-0 from training. Duration: 0.85 2023-10-03 06:55:00,713 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_36492_210_7927_1_1533261987_4790834_703-343351-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 06:55:00,792 WARNING [train.py:1204] (2/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_80-147301-0_sp0.9 from training. Duration: 0.9333125 2023-10-03 06:55:02,162 WARNING [train.py:1204] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_218-295097-0_sp0.9 from training. Duration: 0.8555625 2023-10-03 06:55:03,653 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=1177266.6666666667, ans=0.1 2023-10-03 06:55:07,037 INFO [train.py:1046] (2/4) Epoch 34, batch 1300, loss[loss=0.1828, simple_loss=0.2637, pruned_loss=0.05089, over 24438.00 frames. ], tot_loss[loss=0.1634, simple_loss=0.242, pruned_loss=0.04241, over 4680866.05 frames. ], batch size: 77, lr: 3.01e-03, grad_scale: 8.0 2023-10-03 06:55:07,154 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_2295_210_2087_1_1525500993_2407863_116-238894-0_sp1.1 from training. Duration: 0.636375 2023-10-03 06:55:09,856 WARNING [train.py:1204] (2/4) Exclude cut with ID _3231_210_2106_1_1526205864934_6784220_697-171068-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 06:55:09,885 WARNING [train.py:1204] (2/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_102-77064-0 from training. Duration: 0.93 2023-10-03 06:55:12,780 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_13091_210_7925_1_1530609447_8987354_1186-261957-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 06:55:15,528 WARNING [train.py:1204] (2/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_91-277684-0_sp1.1 from training. Duration: 0.67275 2023-10-03 06:55:15,606 WARNING [train.py:1204] (2/4) Exclude cut with ID _31088_210_16446_1_1532520012284_3440919_159-313751-0_sp1.1 from training. Duration: 0.936375 2023-10-03 06:55:17,502 WARNING [train.py:1204] (2/4) Exclude cut with ID _42326_210_7770_1_1533814282531_3707269_227-203721-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 06:55:18,862 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_3531_210_3528_1_1526294203_5216456_309-227302-0_sp1.1 from training. Duration: 0.62725 2023-10-03 06:55:19,107 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.bypass.scale_min, batch_count=1177333.3333333333, ans=0.2 2023-10-03 06:55:20,220 WARNING [train.py:1204] (2/4) Exclude cut with ID _14087_210_2253_1_1530529224512_3014029_226-242140-0 from training. Duration: 0.81 2023-10-03 06:55:24,452 WARNING [train.py:1204] (2/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_122-109023-0_sp1.1 from training. Duration: 0.6909375 2023-10-03 06:55:25,837 WARNING [train.py:1204] (2/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_660-211088-0_sp0.9 from training. Duration: 0.688875 2023-10-03 06:55:27,138 WARNING [train.py:1204] (2/4) Exclude cut with ID _39290_210_4220_1_1533862778760_3075118_92-311600-0 from training. Duration: 0.88 2023-10-03 06:55:31,918 WARNING [train.py:1204] (2/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_390-339137-0_sp1.1 from training. Duration: 0.5090625 2023-10-03 06:55:34,706 WARNING [train.py:1204] (2/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_449-341946-0_sp1.1 from training. Duration: 0.963625 2023-10-03 06:55:36,688 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_68095_210_20185_1_1535718226_8888855_884-327007-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 06:55:38,070 WARNING [train.py:1204] (2/4) Exclude cut with ID _42326_210_7770_1_1533814282531_3707269_308-222998-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 06:55:39,455 WARNING [train.py:1204] (2/4) Exclude cut with ID _40399_210_4353_1_1533305658194_3279406_344-238708-0_sp1.1 from training. Duration: 0.963625 2023-10-03 06:55:40,801 WARNING [train.py:1204] (2/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_87-131427-0_sp1.1 from training. Duration: 0.7090625 2023-10-03 06:55:40,852 WARNING [train.py:1204] (2/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_110-130961-0_sp0.9 from training. Duration: 0.52225 2023-10-03 06:55:40,905 WARNING [train.py:1204] (2/4) Exclude cut with ID _55153_210_2253_1_1534384786677_3162096_148-79022-0 from training. Duration: 0.96 2023-10-03 06:55:47,460 WARNING [train.py:1204] (2/4) Exclude cut with ID _9679_210_7497_1_1529129212906_4363929_512-313889-0_sp1.1 from training. Duration: 0.736375 2023-10-03 06:55:47,474 WARNING [train.py:1204] (2/4) Exclude cut with ID _7970_210_1883_1_1528455616296_3472230_25-178407-0_sp1.1 from training. Duration: 0.6454375 2023-10-03 06:55:50,542 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_3133_210_2087_1_1526106446_2324690_246-125000-0 from training. Duration: 0.62 2023-10-03 06:55:50,594 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_15126_210_6753_1_1530778677_2591185_113-157651-0_sp0.9 from training. Duration: 0.7555625 2023-10-03 06:55:51,974 WARNING [train.py:1204] (2/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_105-63309-0_sp1.1 from training. Duration: 0.8909375 2023-10-03 06:55:53,455 WARNING [train.py:1204] (2/4) Exclude cut with ID _29877_210_6228_1_1532397791106_5276149_578-177271-0_sp1.1 from training. Duration: 0.936375 2023-10-03 06:55:53,505 WARNING [train.py:1204] (2/4) Exclude cut with ID _44831_210_13427_1_1533816011549_3689035_32-188147-0 from training. Duration: 0.96 2023-10-03 06:55:54,894 WARNING [train.py:1204] (2/4) Exclude cut with ID _12369_210_4381_1_1530356078581_4388520_300-254337-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 06:55:54,907 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_128-241008-0 from training. Duration: 0.94 2023-10-03 06:55:56,353 WARNING [train.py:1204] (2/4) Exclude cut with ID _21878_210_3525_1_1531738772002_7264330_53-279178-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 06:55:59,170 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_41452_210_7291_1_1533437687_3941281_273-334088-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 06:55:59,172 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_128-241008-0_sp1.1 from training. Duration: 0.8545625 2023-10-03 06:55:59,320 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.conv_module1.balancer2.min_abs, batch_count=1177533.3333333333, ans=0.5 2023-10-03 06:56:03,959 WARNING [train.py:1204] (2/4) Exclude cut with ID _8394_210_6702_1_1528590504316_3540189_379-49457-0 from training. Duration: 0.87 2023-10-03 06:56:05,655 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_105-114797-0 from training. Duration: 0.84 2023-10-03 06:56:07,028 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_14928_210_5632_1_1530773461_4714000_454-112198-0 from training. Duration: 0.66 2023-10-03 06:56:10,336 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_456-22141-0_sp1.1 from training. Duration: 0.8 2023-10-03 06:56:13,159 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_115-233929-0 from training. Duration: 0.72 2023-10-03 06:56:14,610 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_616-16795-0_sp1.1 from training. Duration: 0.963625 2023-10-03 06:56:20,031 INFO [train.py:1046] (2/4) Epoch 34, batch 1350, loss[loss=0.1675, simple_loss=0.2515, pruned_loss=0.04174, over 24012.00 frames. ], tot_loss[loss=0.1624, simple_loss=0.2413, pruned_loss=0.04181, over 4694585.80 frames. ], batch size: 80, lr: 3.01e-03, grad_scale: 8.0 2023-10-03 06:56:20,218 WARNING [train.py:1204] (2/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_448-212135-0 from training. Duration: 0.91 2023-10-03 06:56:23,642 WARNING [train.py:1204] (2/4) Exclude cut with ID _46683_210_9642_1_1533730913492_4591116_201-205677-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 06:56:25,030 WARNING [train.py:1204] (2/4) Exclude cut with ID _64086_210_7994_1_1535270417019_7435949_43-80483-0_sp1.1 from training. Duration: 0.97275 2023-10-03 06:56:27,908 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_240-134124-0_sp1.1 from training. Duration: 0.963625 2023-10-03 06:56:27,942 WARNING [train.py:1204] (2/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_907-120940-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 06:56:29,622 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.537e+02 1.836e+02 2.071e+02 2.360e+02 3.214e+02, threshold=4.142e+02, percent-clipped=0.0 2023-10-03 06:56:29,993 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.feed_forward1.out_proj.dropout_p, batch_count=1177666.6666666667, ans=0.1 2023-10-03 06:56:31,067 WARNING [train.py:1204] (2/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_848-280957-0_sp1.1 from training. Duration: 0.7909375 2023-10-03 06:56:31,114 WARNING [train.py:1204] (2/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_243-42116-0_sp0.9 from training. Duration: 0.811125 2023-10-03 06:56:35,772 WARNING [train.py:1204] (2/4) Exclude cut with ID _6694_210_4599_1_1527852637490_3965529_116-109398-0_sp0.9 from training. Duration: 0.811125 2023-10-03 06:56:38,977 WARNING [train.py:1204] (2/4) Exclude cut with ID _24314_210_4915_1_1532135078116_7170179_521-258868-0 from training. Duration: 0.79 2023-10-03 06:56:40,396 WARNING [train.py:1204] (2/4) Exclude cut with ID _2220_210_1819_1_1525509109898_3658680_50-99837-0_sp0.9 from training. Duration: 0.6 2023-10-03 06:56:40,438 WARNING [train.py:1204] (2/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_313-134930-0_sp1.1 from training. Duration: 0.8818125 2023-10-03 06:56:43,075 WARNING [train.py:1204] (2/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_125-144174-0 from training. Duration: 0.39 2023-10-03 06:56:43,136 WARNING [train.py:1204] (2/4) Exclude cut with ID _59288_210_5854_1_1535077757774_3412246_275-293897-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 06:56:44,511 WARNING [train.py:1204] (2/4) Exclude cut with ID _40519_210_13794_1_1533715228655_7337390_722-52880-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 06:56:44,526 WARNING [train.py:1204] (2/4) Exclude cut with ID _7537_210_2084_1_1528502395431_3924359_128-268029-0 from training. Duration: 0.98 2023-10-03 06:56:45,965 WARNING [train.py:1204] (2/4) Exclude cut with ID _8021_210_7994_1_1528376451358_3653069_179-45206-0 from training. Duration: 0.86 2023-10-03 06:56:47,333 WARNING [train.py:1204] (2/4) Exclude cut with ID _13252_210_1925_1_1530671372047_3336909_88-276668-0 from training. Duration: 0.99 2023-10-03 06:56:48,650 WARNING [train.py:1204] (2/4) Exclude cut with ID _22613_210_5219_1_1532253599686_4090170_290-212862-0_sp1.1 from training. Duration: 0.97275 2023-10-03 06:56:48,655 WARNING [train.py:1204] (2/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_465-214726-0 from training. Duration: 0.86 2023-10-03 06:56:53,915 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.4.encoder.layers.2.conv_module2.whiten, num_groups=1, num_channels=384, metric=2.98 vs. limit=15.0 2023-10-03 06:57:00,598 WARNING [train.py:1204] (2/4) Exclude cut with ID _56480_210_4220_1_1534679942079_3075409_138-36031-0_sp1.1 from training. Duration: 0.97275 2023-10-03 06:57:08,238 WARNING [train.py:1204] (2/4) Exclude cut with ID _58605_210_12210_1_1534723195939_3904259_300-285878-0_sp1.1 from training. Duration: 0.97275 2023-10-03 06:57:09,612 WARNING [train.py:1204] (2/4) Exclude cut with ID _40901_210_5665_1_1533363789958_3856130_114-340229-0_sp1.1 from training. Duration: 0.936375 2023-10-03 06:57:09,659 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_489-230703-0 from training. Duration: 0.85 2023-10-03 06:57:12,467 WARNING [train.py:1204] (2/4) Exclude cut with ID _66873_210_5517_1_1535454136506_4293370_322-81243-0_sp1.1 from training. Duration: 0.936375 2023-10-03 06:57:13,122 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.4.encoder.layers.0.nonlin_attention.whiten2, num_groups=1, num_channels=384, metric=10.68 vs. limit=15.0 2023-10-03 06:57:13,790 WARNING [train.py:1204] (2/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_271-269084-0 from training. Duration: 0.83 2023-10-03 06:57:13,809 WARNING [train.py:1204] (2/4) Exclude cut with ID _13252_210_1925_1_1530671372047_3336909_166-314607-0_sp0.9 from training. Duration: 0.6 2023-10-03 06:57:15,185 WARNING [train.py:1204] (2/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_340-343302-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 06:57:15,516 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.self_attn_weights.pos_emb_skip_rate, batch_count=1177866.6666666667, ans=0.0 2023-10-03 06:57:16,542 WARNING [train.py:1204] (2/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_286-112015-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 06:57:17,940 WARNING [train.py:1204] (2/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_328-264776-0 from training. Duration: 0.67 2023-10-03 06:57:18,665 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.2.encoder.layers.1.whiten, num_groups=1, num_channels=384, metric=4.51 vs. limit=12.0 2023-10-03 06:57:20,538 WARNING [train.py:1204] (2/4) Exclude cut with ID _4866_210_2577_1_1527404412284_4364659_256-306830-0_sp0.9 from training. Duration: 0.9666875 2023-10-03 06:57:22,788 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.5.encoder.layers.1.conv_module2.whiten, num_groups=1, num_channels=256, metric=5.17 vs. limit=15.0 2023-10-03 06:57:26,678 WARNING [train.py:1204] (2/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_239-316802-0 from training. Duration: 0.69 2023-10-03 06:57:28,136 WARNING [train.py:1204] (2/4) Exclude cut with ID _29857_210_20034_1_1532395886753_3798160_279-185801-0 from training. Duration: 0.91 2023-10-03 06:57:34,241 INFO [train.py:1046] (2/4) Epoch 34, batch 1400, loss[loss=0.1591, simple_loss=0.2398, pruned_loss=0.0392, over 21338.00 frames. ], tot_loss[loss=0.1608, simple_loss=0.2391, pruned_loss=0.04126, over 4693800.41 frames. ], batch size: 46, lr: 3.01e-03, grad_scale: 8.0 2023-10-03 06:57:34,350 WARNING [train.py:1204] (2/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_656-188361-0 from training. Duration: 0.87 2023-10-03 06:57:36,311 WARNING [train.py:1204] (2/4) Exclude cut with ID _44718_210_5816_1_1534050051869_6469030_100-230287-0_sp1.1 from training. Duration: 0.936375 2023-10-03 06:57:38,930 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8275_210_4198_1_1528455320_3892571_276-215622-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 06:57:38,982 WARNING [train.py:1204] (2/4) Exclude cut with ID _30304_210_6319_1_1532476827261_7582060_698-309430-0_sp1.1 from training. Duration: 0.963625 2023-10-03 06:57:39,227 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.bypass_mid.scale_min, batch_count=1178000.0, ans=0.2 2023-10-03 06:57:43,179 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_365-217119-0 from training. Duration: 0.63 2023-10-03 06:57:44,641 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7264_210_4938_1_1527939911_8132095_800-91995-0 from training. Duration: 0.86 2023-10-03 06:57:54,930 WARNING [train.py:1204] (2/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_49-97158-0_sp1.1 from training. Duration: 0.6909375 2023-10-03 06:57:57,851 WARNING [train.py:1204] (2/4) Exclude cut with ID _61132_210_9969_1_1535021792964_3755419_61-57720-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 06:57:59,259 WARNING [train.py:1204] (2/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_345-306832-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 06:57:59,281 WARNING [train.py:1204] (2/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_164-224052-0_sp0.9 from training. Duration: 0.77775 2023-10-03 06:57:59,465 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.nonlin_attention.balancer.prob, batch_count=1178066.6666666667, ans=0.125 2023-10-03 06:58:04,460 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_34886_210_10633_1_1533203882_1755661_76-156096-0_sp1.1 from training. Duration: 0.7909375 2023-10-03 06:58:04,541 WARNING [train.py:1204] (2/4) Exclude cut with ID 4133-6541-0027-40495-0_sp1.1 from training. Duration: 0.9681875 2023-10-03 06:58:05,124 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.4.encoder.layers.2.self_attn2.whiten, num_groups=1, num_channels=384, metric=11.62 vs. limit=22.5 2023-10-03 06:58:13,209 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_514-211165-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 06:58:14,623 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_41211_210_2544_1_1533348859_3702910_97-212695-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 06:58:17,452 WARNING [train.py:1204] (2/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_406-45086-0 from training. Duration: 0.8 2023-10-03 06:58:18,767 WARNING [train.py:1204] (2/4) Exclude cut with ID _65263_210_22356_1_1535763598691_3921037_49-222917-0_sp1.1 from training. Duration: 0.836375 2023-10-03 06:58:20,150 WARNING [train.py:1204] (2/4) Exclude cut with ID _4866_210_2577_1_1527404412284_4364659_352-157930-0_sp1.1 from training. Duration: 0.736375 2023-10-03 06:58:20,209 WARNING [train.py:1204] (2/4) Exclude cut with ID _4366_210_1388_1_1526792053633_3613819_231-211402-0_sp1.1 from training. Duration: 0.8545625 2023-10-03 06:58:21,429 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_98-21576-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 06:58:23,302 WARNING [train.py:1204] (2/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_348-127789-0_sp0.9 from training. Duration: 0.9444375 2023-10-03 06:58:23,318 WARNING [train.py:1204] (2/4) Exclude cut with ID _45871_210_5665_1_1533864614245_3296060_98-329557-0_sp1.1 from training. Duration: 0.97275 2023-10-03 06:58:24,605 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6846_210_6753_1_1527850767_4295158_2-315073-0_sp1.1 from training. Duration: 0.8 2023-10-03 06:58:24,722 WARNING [train.py:1204] (2/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_145-137790-0 from training. Duration: 0.83 2023-10-03 06:58:24,736 WARNING [train.py:1204] (2/4) Exclude cut with ID _14603_210_4220_1_1530615688441_3097319_133-158308-0_sp0.9 from training. Duration: 0.9333125 2023-10-03 06:58:30,322 WARNING [train.py:1204] (2/4) Exclude cut with ID _14898_210_9206_1_1530766812377_4177190_320-11758-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 06:58:31,898 WARNING [train.py:1204] (2/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_406-45086-0_sp0.9 from training. Duration: 0.888875 2023-10-03 06:58:40,350 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_5438_210_1794_1_1527336687_2600009_104-288274-0 from training. Duration: 0.58 2023-10-03 06:58:41,683 WARNING [train.py:1204] (2/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_108-53929-0_sp0.9 from training. Duration: 0.6555625 2023-10-03 06:58:43,062 WARNING [train.py:1204] (2/4) Exclude cut with ID _30800_210_22382_1_1532499347904_4515959_444-286390-0_sp1.1 from training. Duration: 0.963625 2023-10-03 06:58:44,571 WARNING [train.py:1204] (2/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_125-144174-0_sp0.9 from training. Duration: 0.4333125 2023-10-03 06:58:44,636 WARNING [train.py:1204] (2/4) Exclude cut with ID _55209_210_1531_1_1534675828996_4597160_118-345621-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 06:58:47,385 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_45886_210_17024_1_1533709215_471063_7-79165-0_sp1.1 from training. Duration: 0.8909375 2023-10-03 06:58:48,816 INFO [train.py:1046] (2/4) Epoch 34, batch 1450, loss[loss=0.1609, simple_loss=0.2504, pruned_loss=0.03566, over 24673.00 frames. ], tot_loss[loss=0.1602, simple_loss=0.2384, pruned_loss=0.04097, over 4704470.53 frames. ], batch size: 68, lr: 3.01e-03, grad_scale: 8.0 2023-10-03 06:58:49,154 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.self_attn_weights.pos_emb_skip_rate, batch_count=1178333.3333333333, ans=0.0 2023-10-03 06:58:50,325 WARNING [train.py:1204] (2/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_175-163894-0_sp0.9 from training. Duration: 0.82225 2023-10-03 06:58:53,141 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_50188_210_4198_1_1534503621_3646041_353-81682-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 06:58:53,147 WARNING [train.py:1204] (2/4) Exclude cut with ID _17875_210_2087_1_1531224012907_1821339_175-80565-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 06:58:53,156 WARNING [train.py:1204] (2/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_159-328157-0_sp1.1 from training. Duration: 0.47275 2023-10-03 06:58:53,401 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.conv_skip_rate, batch_count=1178333.3333333333, ans=0.0 2023-10-03 06:58:57,696 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.501e+02 1.834e+02 2.006e+02 2.217e+02 3.059e+02, threshold=4.013e+02, percent-clipped=0.0 2023-10-03 06:58:59,172 WARNING [train.py:1204] (2/4) Exclude cut with ID _60057_210_10640_1_1534903065793_3333150_137-267579-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 06:58:59,219 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_92-46009-0_sp1.1 from training. Duration: 0.4909375 2023-10-03 06:59:00,565 WARNING [train.py:1204] (2/4) Exclude cut with ID _53203_210_2087_1_1534246218309_475590_48-68507-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 06:59:00,582 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_28211_210_6388_1_1532861506_4523224_128-328162-0 from training. Duration: 0.9 2023-10-03 06:59:01,924 WARNING [train.py:1204] (2/4) Exclude cut with ID _4697_210_2069_1_1526815801953_3823098_51-1049-0_sp1.1 from training. Duration: 0.6818125 2023-10-03 06:59:02,002 WARNING [train.py:1204] (2/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_383-316000-0 from training. Duration: 0.9 2023-10-03 06:59:03,285 WARNING [train.py:1204] (2/4) Exclude cut with ID _4779_210_2084_1_1527318022944_3914893_185-295153-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 06:59:03,368 WARNING [train.py:1204] (2/4) Exclude cut with ID _3231_210_2106_1_1526205864934_6784220_171-256004-0_sp0.9 from training. Duration: 0.9555625 2023-10-03 06:59:03,369 WARNING [train.py:1204] (2/4) Exclude cut with ID _41314_210_12651_1_1533459476543_3766460_87-272068-0 from training. Duration: 0.83 2023-10-03 06:59:04,752 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_164-152777-0_sp1.1 from training. Duration: 0.92725 2023-10-03 06:59:04,787 WARNING [train.py:1204] (2/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_18-130336-0_sp0.9 from training. Duration: 0.87775 2023-10-03 06:59:05,432 WARNING [train.py:1204] (2/4) Exclude cut with ID _14886_210_4220_1_1530702020355_3516190_175-229073-0_sp1.1 from training. Duration: 0.4090625 2023-10-03 06:59:05,444 WARNING [train.py:1204] (2/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_791-45149-0_sp0.9 from training. Duration: 0.9555625 2023-10-03 06:59:07,997 WARNING [train.py:1204] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_468-189058-0_sp1.1 from training. Duration: 0.87275 2023-10-03 06:59:09,798 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8278_210_2550_1_1528637099_4220413_32-142780-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 06:59:12,475 WARNING [train.py:1204] (2/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_146-218763-0_sp0.9 from training. Duration: 0.9555625 2023-10-03 06:59:15,378 WARNING [train.py:1204] (2/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_348-127789-0_sp1.1 from training. Duration: 0.77275 2023-10-03 06:59:15,383 WARNING [train.py:1204] (2/4) Exclude cut with ID _47790_210_16187_1_1533862814066_4207519_288-285445-0_sp1.1 from training. Duration: 0.936375 2023-10-03 06:59:16,864 WARNING [train.py:1204] (2/4) Exclude cut with ID _14603_210_4220_1_1530615688441_3097319_308-181732-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 06:59:16,888 WARNING [train.py:1204] (2/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_895-117428-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 06:59:21,020 WARNING [train.py:1204] (2/4) Exclude cut with ID _9235_210_5219_1_1528891154253_8046120_387-159008-0_sp0.9 from training. Duration: 0.9555625 2023-10-03 06:59:21,040 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_37351_210_7925_1_1533012931_3812113_265-85780-0_sp0.9 from training. Duration: 0.97775 2023-10-03 06:59:21,062 WARNING [train.py:1204] (2/4) Exclude cut with ID _40551_210_10635_1_1533776424632_3416179_361-3964-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 06:59:22,216 WARNING [train.py:1204] (2/4) Exclude cut with ID _25882_210_3036_1_1532775665815_6479510_485-122742-0_sp1.1 from training. Duration: 0.97275 2023-10-03 06:59:23,118 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.1.encoder.layers.0.nonlin_attention.whiten2, num_groups=1, num_channels=256, metric=5.33 vs. limit=15.0 2023-10-03 06:59:25,048 WARNING [train.py:1204] (2/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_98-51676-0 from training. Duration: 0.73 2023-10-03 06:59:28,483 WARNING [train.py:1204] (2/4) Exclude cut with ID _30548_210_16040_1_1532608097130_3146969_371-280610-0_sp1.1 from training. Duration: 0.92725 2023-10-03 06:59:30,025 WARNING [train.py:1204] (2/4) Exclude cut with ID IC0671W0081-331924 from training. Duration: 0.896 2023-10-03 06:59:32,629 WARNING [train.py:1204] (2/4) Exclude cut with ID _40284_210_2084_1_1533513679814_3704279_380-160775-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 06:59:33,930 WARNING [train.py:1204] (2/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_57-270037-0_sp1.1 from training. Duration: 0.7 2023-10-03 06:59:35,767 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_853-150237-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 06:59:37,155 WARNING [train.py:1204] (2/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_491-261923-0 from training. Duration: 0.98 2023-10-03 06:59:42,477 WARNING [train.py:1204] (2/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_836-244045-0_sp1.1 from training. Duration: 0.97275 2023-10-03 06:59:43,850 WARNING [train.py:1204] (2/4) Exclude cut with ID _14880_210_8895_1_1530680426655_3795259_289-212817-0 from training. Duration: 0.69 2023-10-03 06:59:45,260 WARNING [train.py:1204] (2/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_303-340446-0 from training. Duration: 0.9 2023-10-03 06:59:45,373 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_37152_210_9697_1_1533106573_3844188_418-41736-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 06:59:49,455 WARNING [train.py:1204] (2/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_371-18169-0_sp1.1 from training. Duration: 0.7818125 2023-10-03 06:59:49,491 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_187-219984-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 06:59:52,089 WARNING [train.py:1204] (2/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_63-267585-0 from training. Duration: 0.9 2023-10-03 06:59:52,359 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.feed_forward3.hidden_balancer.prob, batch_count=1178600.0, ans=0.125 2023-10-03 06:59:53,518 WARNING [train.py:1204] (2/4) Exclude cut with ID _27013_210_1389_1_1532437173240_3252649_222-125573-0 from training. Duration: 0.98 2023-10-03 06:59:53,564 WARNING [train.py:1204] (2/4) Exclude cut with ID _14821_210_8750_1_1530680503991_6829359_189-297451-0 from training. Duration: 0.55 2023-10-03 06:59:54,946 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_557-163391-0_sp1.1 from training. Duration: 0.97275 2023-10-03 06:59:56,789 WARNING [train.py:1204] (2/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_65-88242-0_sp1.1 from training. Duration: 0.5090625 2023-10-03 07:00:02,342 INFO [train.py:1046] (2/4) Epoch 34, batch 1500, loss[loss=0.1689, simple_loss=0.2423, pruned_loss=0.04772, over 22820.00 frames. ], tot_loss[loss=0.1599, simple_loss=0.2387, pruned_loss=0.04052, over 4710840.00 frames. ], batch size: 322, lr: 3.01e-03, grad_scale: 8.0 2023-10-03 07:00:05,071 WARNING [train.py:1204] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_218-295097-0 from training. Duration: 0.77 2023-10-03 07:00:06,802 WARNING [train.py:1204] (2/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_686-247600-0_sp1.1 from training. Duration: 0.836375 2023-10-03 07:00:06,807 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_397-147196-0_sp0.9 from training. Duration: 0.888875 2023-10-03 07:00:08,182 WARNING [train.py:1204] (2/4) Exclude cut with ID _53950_210_5816_1_1534417264919_3862951_237-136449-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 07:00:09,481 WARNING [train.py:1204] (2/4) Exclude cut with ID _3120_210_3525_1_1526036143112_4072980_74-178400-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 07:00:09,556 WARNING [train.py:1204] (2/4) Exclude cut with ID _5166_210_2084_1_1527073197160_4087993_309-222545-0_sp0.9 from training. Duration: 0.9333125 2023-10-03 07:00:11,454 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7544_210_6753_1_1528587722_3576686_172-171750-0 from training. Duration: 0.65 2023-10-03 07:00:13,380 WARNING [train.py:1204] (2/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_3-237434-0_sp0.9 from training. Duration: 0.8444375 2023-10-03 07:00:13,552 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.conv_module2.balancer1.prob, batch_count=1178666.6666666667, ans=0.125 2023-10-03 07:00:13,564 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.feed_forward1.hidden_balancer.prob, batch_count=1178666.6666666667, ans=0.125 2023-10-03 07:00:14,683 WARNING [train.py:1204] (2/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_101-293367-0_sp0.9 from training. Duration: 0.7 2023-10-03 07:00:14,691 WARNING [train.py:1204] (2/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_818-213552-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 07:00:16,004 WARNING [train.py:1204] (2/4) Exclude cut with ID _14349_210_8341_1_1530615710818_3442470_115-285975-0_sp1.1 from training. Duration: 0.7818125 2023-10-03 07:00:16,274 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.feed_forward2.hidden_balancer.prob, batch_count=1178733.3333333333, ans=0.125 2023-10-03 07:00:17,408 WARNING [train.py:1204] (2/4) Exclude cut with ID _30800_210_22382_1_1532499347904_4515959_291-309749-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 07:00:18,767 WARNING [train.py:1204] (2/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_715-3462-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 07:00:23,141 WARNING [train.py:1204] (2/4) Exclude cut with ID _59502_210_12210_1_1535107024698_1835139_13-331827-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 07:00:24,277 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_4528_210_3273_1_1527076654_3276777_342-168068-0 from training. Duration: 0.87 2023-10-03 07:00:24,342 WARNING [train.py:1204] (2/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_215-141059-0_sp0.9 from training. Duration: 0.688875 2023-10-03 07:00:25,624 WARNING [train.py:1204] (2/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_106-286130-0_sp1.1 from training. Duration: 0.8909375 2023-10-03 07:00:25,808 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=1178733.3333333333, ans=0.1 2023-10-03 07:00:26,952 WARNING [train.py:1204] (2/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_480-77655-0_sp1.1 from training. Duration: 0.97275 2023-10-03 07:00:30,140 WARNING [train.py:1204] (2/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_848-280957-0 from training. Duration: 0.87 2023-10-03 07:00:33,149 WARNING [train.py:1204] (2/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_122-109023-0 from training. Duration: 0.76 2023-10-03 07:00:33,257 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_11305_210_1794_1_1529750428_7178148_645-233480-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 07:00:34,650 WARNING [train.py:1204] (2/4) Exclude cut with ID _7562_210_2084_1_1528887767916_3946229_345-169156-0 from training. Duration: 0.58 2023-10-03 07:00:37,745 WARNING [train.py:1204] (2/4) Exclude cut with ID _7562_210_2084_1_1528887767916_3946229_345-169156-0_sp1.1 from training. Duration: 0.52725 2023-10-03 07:00:39,203 WARNING [train.py:1204] (2/4) Exclude cut with ID _13253_210_1925_1_1530757775819_3577873_253-246386-0_sp1.1 from training. Duration: 0.8181875 2023-10-03 07:00:40,576 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_136-264444-0_sp1.1 from training. Duration: 0.97275 2023-10-03 07:00:40,595 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_2168_210_2290_1_1525606920_1849400_136-222991-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 07:00:42,026 WARNING [train.py:1204] (2/4) Exclude cut with ID _7852_210_5219_1_1528286471316_8139400_242-105336-0 from training. Duration: 0.72 2023-10-03 07:00:43,306 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_5960_210_1794_1_1527403865_3990999_515-2613-0_sp1.1 from training. Duration: 0.8 2023-10-03 07:00:43,320 WARNING [train.py:1204] (2/4) Exclude cut with ID _39688_210_19596_1_1533371626725_4154159_551-167834-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 07:00:45,239 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_43802_210_2290_1_1533516203_8829504_85-195240-0 from training. Duration: 0.87 2023-10-03 07:00:45,294 WARNING [train.py:1204] (2/4) Exclude cut with ID _63171_210_15488_1_1535104792794_3295110_113-113534-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 07:00:49,494 WARNING [train.py:1204] (2/4) Exclude cut with ID _8833_210_5219_1_1528592665210_7553059_319-349181-0_sp1.1 from training. Duration: 0.9 2023-10-03 07:00:49,495 WARNING [train.py:1204] (2/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_225-43381-0 from training. Duration: 0.94 2023-10-03 07:00:55,068 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_5959_210_1794_1_1527400400_3445726_412-44052-0_sp0.9 from training. Duration: 0.8555625 2023-10-03 07:00:56,400 WARNING [train.py:1204] (2/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_356-167816-0_sp1.1 from training. Duration: 0.6545625 2023-10-03 07:00:58,511 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9012W0112-501244 from training. Duration: 0.768 2023-10-03 07:00:59,272 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.2.encoder.layers.1.self_attn_weights.whiten_keys, num_groups=4, num_channels=128, metric=3.05 vs. limit=6.0 2023-10-03 07:00:59,875 WARNING [train.py:1204] (2/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_177-326952-0_sp1.1 from training. Duration: 0.92725 2023-10-03 07:00:59,879 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9016W0116-502789 from training. Duration: 0.896 2023-10-03 07:01:02,522 WARNING [train.py:1204] (2/4) Exclude cut with ID _56235_210_10640_1_1534557335235_4161099_557-204815-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 07:01:03,835 WARNING [train.py:1204] (2/4) Exclude cut with ID _55857_210_2346_1_1534644007831_6898209_279-229469-0_sp1.1 from training. Duration: 0.936375 2023-10-03 07:01:03,916 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9029W0375-509764 from training. Duration: 0.896 2023-10-03 07:01:05,271 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_2295_210_2087_1_1525500993_2407863_234-83104-0_sp0.9 from training. Duration: 0.688875 2023-10-03 07:01:05,542 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.attention_skip_rate, batch_count=1178933.3333333333, ans=0.0 2023-10-03 07:01:08,437 WARNING [train.py:1204] (2/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_266-137104-0 from training. Duration: 0.65 2023-10-03 07:01:08,681 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=1178933.3333333333, ans=0.1 2023-10-03 07:01:09,817 WARNING [train.py:1204] (2/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_289-163927-0_sp1.1 from training. Duration: 0.92725 2023-10-03 07:01:14,072 WARNING [train.py:1204] (2/4) Exclude cut with ID _12733_210_8341_1_1530167424556_3646380_86-225314-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 07:01:14,095 WARNING [train.py:1204] (2/4) Exclude cut with ID _7877_210_6014_1_1528286358099_3750136_618-284064-0_sp1.1 from training. Duration: 0.92725 2023-10-03 07:01:15,347 INFO [train.py:1046] (2/4) Epoch 34, batch 1550, loss[loss=0.161, simple_loss=0.236, pruned_loss=0.04303, over 23706.00 frames. ], tot_loss[loss=0.1606, simple_loss=0.2395, pruned_loss=0.0408, over 4705844.11 frames. ], batch size: 232, lr: 3.01e-03, grad_scale: 8.0 2023-10-03 07:01:15,400 WARNING [train.py:1204] (2/4) Exclude cut with ID _55844_210_2346_1_1534471212569_7625707_1017-31469-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 07:01:15,445 WARNING [train.py:1204] (2/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_617-241446-0_sp1.1 from training. Duration: 0.92725 2023-10-03 07:01:17,391 WARNING [train.py:1204] (2/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_866-173440-0_sp1.1 from training. Duration: 0.8181875 2023-10-03 07:01:17,508 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_9460_210_4211_1_1528954587_10049667_480-260269-0 from training. Duration: 0.56 2023-10-03 07:01:18,794 WARNING [train.py:1204] (2/4) Exclude cut with ID _44659_210_2721_1_1534036120061_6614860_626-298884-0 from training. Duration: 0.97 2023-10-03 07:01:18,848 WARNING [train.py:1204] (2/4) Exclude cut with ID _53565_210_10635_1_1534337218335_3820259_287-18120-0_sp1.1 from training. Duration: 0.8818125 2023-10-03 07:01:20,205 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_791-197891-0 from training. Duration: 0.84 2023-10-03 07:01:20,265 WARNING [train.py:1204] (2/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_527-238990-0 from training. Duration: 0.9 2023-10-03 07:01:22,886 WARNING [train.py:1204] (2/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_449-65969-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 07:01:22,995 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_553-341452-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 07:01:24,226 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.491e+02 1.899e+02 2.097e+02 2.316e+02 2.849e+02, threshold=4.195e+02, percent-clipped=0.0 2023-10-03 07:01:24,329 WARNING [train.py:1204] (2/4) Exclude cut with ID _16909_210_5724_1_1531292079912_4118919_300-47574-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 07:01:24,350 WARNING [train.py:1204] (2/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_70-27732-0_sp1.1 from training. Duration: 0.8545625 2023-10-03 07:01:24,439 WARNING [train.py:1204] (2/4) Exclude cut with ID _44718_210_5816_1_1534050051869_6469030_727-28074-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 07:01:25,767 WARNING [train.py:1204] (2/4) Exclude cut with ID _55857_210_2346_1_1534644007831_6898209_357-123664-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 07:01:28,704 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9123W0004-555964 from training. Duration: 0.896 2023-10-03 07:01:28,735 WARNING [train.py:1204] (2/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_445-145208-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 07:01:28,759 WARNING [train.py:1204] (2/4) Exclude cut with ID _3675_210_4557_1_1526639934408_4182089_456-18305-0_sp1.1 from training. Duration: 0.6909375 2023-10-03 07:01:30,707 WARNING [train.py:1204] (2/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_1-99680-0_sp1.1 from training. Duration: 0.6090625 2023-10-03 07:01:32,150 WARNING [train.py:1204] (2/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_146-282400-0_sp0.9 from training. Duration: 0.788875 2023-10-03 07:01:32,171 WARNING [train.py:1204] (2/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_844-96989-0 from training. Duration: 0.71 2023-10-03 07:01:34,850 WARNING [train.py:1204] (2/4) Exclude cut with ID _29853_210_7497_1_1532505600851_6642110_128-146364-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 07:01:34,891 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_224-322394-0 from training. Duration: 0.66 2023-10-03 07:01:36,292 WARNING [train.py:1204] (2/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_191-59784-0 from training. Duration: 0.88 2023-10-03 07:01:36,297 WARNING [train.py:1204] (2/4) Exclude cut with ID _12556_210_8341_1_1530081024100_7327690_725-193477-0 from training. Duration: 0.98 2023-10-03 07:01:38,014 WARNING [train.py:1204] (2/4) Exclude cut with ID _5166_210_2084_1_1527073197160_4087993_523-79838-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 07:01:38,108 WARNING [train.py:1204] (2/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_398-334558-0_sp1.1 from training. Duration: 0.87275 2023-10-03 07:01:40,264 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.1.encoder.layers.1.nonlin_attention.whiten1, num_groups=1, num_channels=192, metric=5.61 vs. limit=10.0 2023-10-03 07:01:42,223 WARNING [train.py:1204] (2/4) Exclude cut with ID _6456_210_1534_1_1527681634212_3181049_171-36697-0_sp1.1 from training. Duration: 0.936375 2023-10-03 07:01:45,369 WARNING [train.py:1204] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_700-183549-0 from training. Duration: 0.68 2023-10-03 07:01:45,371 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8853_210_3273_1_1528633899_818968_51-187685-0 from training. Duration: 0.69 2023-10-03 07:01:52,762 WARNING [train.py:1204] (2/4) Exclude cut with ID _7719_210_3525_1_1528456153163_4296960_333-110399-0_sp1.1 from training. Duration: 0.87275 2023-10-03 07:01:55,529 WARNING [train.py:1204] (2/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_1022-83740-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 07:01:56,863 WARNING [train.py:1204] (2/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_61-327062-0_sp0.9 from training. Duration: 0.9 2023-10-03 07:01:56,877 WARNING [train.py:1204] (2/4) Exclude cut with ID _47251_210_18628_1_1533814063128_4137250_214-88076-0_sp1.1 from training. Duration: 0.8 2023-10-03 07:01:56,935 WARNING [train.py:1204] (2/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_839-173017-0 from training. Duration: 0.41 2023-10-03 07:02:04,166 WARNING [train.py:1204] (2/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_299-34413-0_sp1.1 from training. Duration: 0.5909375 2023-10-03 07:02:05,495 WARNING [train.py:1204] (2/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_664-159457-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 07:02:06,251 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.2.encoder.layers.2.self_attn1.whiten, num_groups=1, num_channels=384, metric=18.07 vs. limit=22.5 2023-10-03 07:02:08,306 WARNING [train.py:1204] (2/4) Exclude cut with ID _66873_210_5517_1_1535454136506_4293370_483-291760-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 07:02:10,275 WARNING [train.py:1204] (2/4) Exclude cut with ID _30800_210_22382_1_1532499347904_4515959_287-255474-0_sp1.1 from training. Duration: 0.92725 2023-10-03 07:02:10,306 WARNING [train.py:1204] (2/4) Exclude cut with ID _48677_210_5663_1_1533902405919_3892989_111-11439-0_sp1.1 from training. Duration: 0.87275 2023-10-03 07:02:10,320 WARNING [train.py:1204] (2/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_399-156791-0 from training. Duration: 0.83 2023-10-03 07:02:10,351 WARNING [train.py:1204] (2/4) Exclude cut with ID _3992_210_1747_1_1526644816031_3731060_36-147855-0_sp0.9 from training. Duration: 0.8666875 2023-10-03 07:02:14,361 WARNING [train.py:1204] (2/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_442-320317-0_sp1.1 from training. Duration: 0.7454375 2023-10-03 07:02:14,386 WARNING [train.py:1204] (2/4) Exclude cut with ID _55429_210_10626_1_1534467588527_7174143_953-80868-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 07:02:14,463 WARNING [train.py:1204] (2/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_159-328157-0_sp0.9 from training. Duration: 0.57775 2023-10-03 07:02:14,470 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9309W0197-640324 from training. Duration: 0.896 2023-10-03 07:02:15,285 INFO [scaling.py:1118] (2/4) WithLoss: name=encoder.encoders.3.encoder.layers.3.self_attn_weights, loss-sum=0.000e+00 2023-10-03 07:02:17,776 WARNING [train.py:1204] (2/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_361-30822-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 07:02:18,301 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.5.encoder.layers.1.feed_forward3.out_whiten, num_groups=1, num_channels=256, metric=11.45 vs. limit=15.0 2023-10-03 07:02:22,498 WARNING [train.py:1204] (2/4) Exclude cut with ID _5250_210_5219_1_1527249561806_8692110_636-251213-0 from training. Duration: 0.94 2023-10-03 07:02:25,559 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.balancer1.prob, batch_count=1179266.6666666667, ans=0.125 2023-10-03 07:02:28,064 WARNING [train.py:1204] (2/4) Exclude cut with ID _49728_210_12651_1_1534041034294_6788150_468-333827-0_sp1.1 from training. Duration: 0.963625 2023-10-03 07:02:29,297 INFO [train.py:1046] (2/4) Epoch 34, batch 1600, loss[loss=0.1369, simple_loss=0.2134, pruned_loss=0.03026, over 24313.00 frames. ], tot_loss[loss=0.1607, simple_loss=0.2397, pruned_loss=0.04088, over 4718950.02 frames. ], batch size: 56, lr: 3.01e-03, grad_scale: 16.0 2023-10-03 07:02:29,387 WARNING [train.py:1204] (2/4) Exclude cut with ID _25882_210_3036_1_1532775665815_6479510_623-191759-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 07:02:30,725 WARNING [train.py:1204] (2/4) Exclude cut with ID _5189_210_4557_1_1527248319428_3769229_199-53526-0 from training. Duration: 0.42 2023-10-03 07:02:32,098 WARNING [train.py:1204] (2/4) Exclude cut with ID _6274_210_2084_1_1527937583837_7494020_32-205288-0_sp0.9 from training. Duration: 0.8666875 2023-10-03 07:02:33,545 WARNING [train.py:1204] (2/4) Exclude cut with ID _65263_210_22356_1_1535763598691_3921037_401-346646-0_sp1.1 from training. Duration: 0.963625 2023-10-03 07:02:33,549 WARNING [train.py:1204] (2/4) Exclude cut with ID _12555_210_8750_1_1530075594402_3780239_449-267986-0_sp1.1 from training. Duration: 0.7545625 2023-10-03 07:02:33,571 WARNING [train.py:1204] (2/4) Exclude cut with ID _69884_210_9771_1_1535846392818_4166169_430-201020-0_sp1.1 from training. Duration: 0.8818125 2023-10-03 07:02:34,920 WARNING [train.py:1204] (2/4) Exclude cut with ID _8394_210_6702_1_1528590504316_3540189_379-49457-0_sp1.1 from training. Duration: 0.7909375 2023-10-03 07:02:38,335 WARNING [train.py:1204] (2/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_200-105218-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 07:02:38,387 WARNING [train.py:1204] (2/4) Exclude cut with ID _3867_210_1883_1_1526641200575_4475830_319-349850-0 from training. Duration: 0.67 2023-10-03 07:02:38,459 WARNING [train.py:1204] (2/4) Exclude cut with ID _28317_210_12742_1_1532480302381_8510325_916-195848-0 from training. Duration: 0.92 2023-10-03 07:02:41,129 WARNING [train.py:1204] (2/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_436-186311-0 from training. Duration: 0.65 2023-10-03 07:02:43,079 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_3910_210_1794_1_1526777667_3747260_302-180617-0_sp1.1 from training. Duration: 0.97275 2023-10-03 07:02:44,569 WARNING [train.py:1204] (2/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_102-92751-0 from training. Duration: 0.99 2023-10-03 07:02:44,628 WARNING [train.py:1204] (2/4) Exclude cut with ID _51899_210_20034_1_1534129254466_3669220_265-215428-0_sp1.1 from training. Duration: 0.8909375 2023-10-03 07:02:45,198 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.3.encoder.layers.3.feed_forward2.out_whiten, num_groups=1, num_channels=512, metric=6.57 vs. limit=15.0 2023-10-03 07:02:47,942 WARNING [train.py:1204] (2/4) Exclude cut with ID _58583_210_5236_1_1534726835266_3574249_438-198993-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 07:02:52,647 WARNING [train.py:1204] (2/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_166-166990-0_sp1.1 from training. Duration: 0.87275 2023-10-03 07:02:55,373 WARNING [train.py:1204] (2/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_907-151398-0 from training. Duration: 0.37 2023-10-03 07:02:58,156 WARNING [train.py:1204] (2/4) Exclude cut with ID _38670_210_15379_1_1533448810659_4391059_452-256669-0_sp1.1 from training. Duration: 0.936375 2023-10-03 07:02:59,396 WARNING [train.py:1204] (2/4) Exclude cut with ID _48677_210_5663_1_1533902405919_3892989_408-40740-0 from training. Duration: 0.83 2023-10-03 07:02:59,435 WARNING [train.py:1204] (2/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_903-158756-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 07:02:59,485 WARNING [train.py:1204] (2/4) Exclude cut with ID _13252_210_1925_1_1530671372047_3336909_397-216497-0 from training. Duration: 0.96 2023-10-03 07:03:03,772 WARNING [train.py:1204] (2/4) Exclude cut with ID _55429_210_10626_1_1534467588527_7174143_97-335084-0 from training. Duration: 0.97 2023-10-03 07:03:12,813 WARNING [train.py:1204] (2/4) Exclude cut with ID _45329_210_7005_1_1534157951211_6774339_453-267730-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 07:03:14,181 WARNING [train.py:1204] (2/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_153-97441-0 from training. Duration: 0.56 2023-10-03 07:03:14,233 WARNING [train.py:1204] (2/4) Exclude cut with ID _40901_210_5665_1_1533363789958_3856130_312-329122-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 07:03:14,271 WARNING [train.py:1204] (2/4) Exclude cut with ID _30800_210_22382_1_1532499347904_4515959_97-126471-0_sp1.1 from training. Duration: 0.97275 2023-10-03 07:03:14,271 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_11305_210_1794_1_1529750428_7178148_257-312870-0_sp1.1 from training. Duration: 0.8 2023-10-03 07:03:17,193 WARNING [train.py:1204] (2/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_454-314901-0_sp0.9 from training. Duration: 0.511125 2023-10-03 07:03:17,252 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder_embed.dropout.p, batch_count=1179533.3333333333, ans=0.1 2023-10-03 07:03:22,021 WARNING [train.py:1204] (2/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_282-136641-0_sp1.1 from training. Duration: 0.3818125 2023-10-03 07:03:25,097 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_46013_210_7234_1_1533776362_3744032_493-160724-0_sp1.1 from training. Duration: 0.963625 2023-10-03 07:03:25,143 WARNING [train.py:1204] (2/4) Exclude cut with ID _67831_210_12392_1_1535612464202_3092612_259-33442-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 07:03:26,550 WARNING [train.py:1204] (2/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_289-62384-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 07:03:26,600 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6545_210_2040_1_1527774670_4245258_379-14182-0_sp0.9 from training. Duration: 0.888875 2023-10-03 07:03:28,068 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_548-294316-0_sp1.1 from training. Duration: 0.736375 2023-10-03 07:03:29,507 WARNING [train.py:1204] (2/4) Exclude cut with ID _6275_210_2084_1_1528110062213_7669049_857-298942-0_sp1.1 from training. Duration: 0.77275 2023-10-03 07:03:30,870 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6673_210_3273_1_1527767790_3173240_327-306185-0_sp1.1 from training. Duration: 0.7545625 2023-10-03 07:03:31,089 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.feed_forward1.out_proj.dropout_p, batch_count=1179600.0, ans=0.1 2023-10-03 07:03:36,332 WARNING [train.py:1204] (2/4) Exclude cut with ID _61008_210_10633_1_1534852769198_6974370_814-107647-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 07:03:37,680 WARNING [train.py:1204] (2/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_116-332008-0_sp1.1 from training. Duration: 0.8909375 2023-10-03 07:03:39,686 WARNING [train.py:1204] (2/4) Exclude cut with ID _32704_210_8954_1_1533017488840_6747370_266-329755-0 from training. Duration: 0.89 2023-10-03 07:03:39,690 WARNING [train.py:1204] (2/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_271-269084-0_sp0.9 from training. Duration: 0.92225 2023-10-03 07:03:39,759 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_3171_210_2087_1_1526109084_2330007_66-260583-0 from training. Duration: 0.72 2023-10-03 07:03:44,170 INFO [train.py:1046] (2/4) Epoch 34, batch 1650, loss[loss=0.1721, simple_loss=0.2424, pruned_loss=0.05083, over 23565.00 frames. ], tot_loss[loss=0.162, simple_loss=0.241, pruned_loss=0.04156, over 4707212.64 frames. ], batch size: 256, lr: 3.01e-03, grad_scale: 16.0 2023-10-03 07:03:45,532 WARNING [train.py:1204] (2/4) Exclude cut with ID _5250_210_5219_1_1527249561806_8692110_1326-276386-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 07:03:45,652 WARNING [train.py:1204] (2/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_372-30302-0_sp1.1 from training. Duration: 0.7818125 2023-10-03 07:03:47,586 WARNING [train.py:1204] (2/4) Exclude cut with ID _25882_210_3036_1_1532775665815_6479510_631-257029-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 07:03:47,595 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_3691_210_3528_1_1526468169_4062053_355-334432-0 from training. Duration: 0.87 2023-10-03 07:03:47,603 WARNING [train.py:1204] (2/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_839-173017-0_sp1.1 from training. Duration: 0.37275 2023-10-03 07:03:47,612 WARNING [train.py:1204] (2/4) Exclude cut with ID _14998_210_5219_1_1530705582080_7474970_441-91466-0 from training. Duration: 0.97 2023-10-03 07:03:47,655 WARNING [train.py:1204] (2/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_108-53929-0 from training. Duration: 0.59 2023-10-03 07:03:51,567 WARNING [train.py:1204] (2/4) Exclude cut with ID _9389_210_1664_1_1529217337301_5289269_260-1942-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 07:03:51,784 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.nonlin_attention.balancer.prob, batch_count=1179666.6666666667, ans=0.125 2023-10-03 07:03:52,797 WARNING [train.py:1204] (2/4) Exclude cut with ID _56423_210_5854_1_1534896061049_3165969_198-61547-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 07:03:52,850 WARNING [train.py:1204] (2/4) Exclude cut with ID _44778_210_2054_1_1533798127203_7443090_412-142122-0_sp1.1 from training. Duration: 0.92725 2023-10-03 07:03:52,985 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.conv_skip_rate, batch_count=1179666.6666666667, ans=0.0 2023-10-03 07:03:54,532 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.588e+02 1.876e+02 2.077e+02 2.336e+02 3.555e+02, threshold=4.154e+02, percent-clipped=0.0 2023-10-03 07:03:54,613 WARNING [train.py:1204] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_88-341718-0_sp1.1 from training. Duration: 0.6 2023-10-03 07:03:57,245 WARNING [train.py:1204] (2/4) Exclude cut with ID _61079_210_4525_1_1534921008198_4011419_334-298697-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 07:03:58,635 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_5465_210_4211_1_1527227253_55357_8-2499-0_sp1.1 from training. Duration: 0.996375 2023-10-03 07:03:58,943 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.attention_skip_rate, batch_count=1179733.3333333333, ans=0.0 2023-10-03 07:04:00,227 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_447-201565-0_sp1.1 from training. Duration: 0.97275 2023-10-03 07:04:00,233 WARNING [train.py:1204] (2/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_253-161789-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 07:04:00,236 WARNING [train.py:1204] (2/4) Exclude cut with ID _44304_210_15374_1_1533639352765_4613129_302-22084-0_sp0.9 from training. Duration: 0.9555625 2023-10-03 07:04:00,244 WARNING [train.py:1204] (2/4) Exclude cut with ID _26664_210_7770_1_1532258943280_3418270_187-80815-0_sp1.1 from training. Duration: 0.8454375 2023-10-03 07:04:01,575 WARNING [train.py:1204] (2/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_68-212801-0 from training. Duration: 0.73 2023-10-03 07:04:01,612 WARNING [train.py:1204] (2/4) Exclude cut with ID _29345_210_4381_1_1532689168263_3860169_149-257268-0 from training. Duration: 0.81 2023-10-03 07:04:07,210 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_213-103282-0_sp1.1 from training. Duration: 0.7090625 2023-10-03 07:04:09,068 WARNING [train.py:1204] (2/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_156-302675-0_sp1.1 from training. Duration: 0.5 2023-10-03 07:04:09,355 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.ff3_skip_rate, batch_count=1179733.3333333333, ans=0.0 2023-10-03 07:04:18,173 WARNING [train.py:1204] (2/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_125-322172-0 from training. Duration: 0.93 2023-10-03 07:04:19,544 WARNING [train.py:1204] (2/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_493-60474-0_sp1.1 from training. Duration: 0.963625 2023-10-03 07:04:20,963 WARNING [train.py:1204] (2/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_156-302675-0 from training. Duration: 0.55 2023-10-03 07:04:23,853 WARNING [train.py:1204] (2/4) Exclude cut with ID _41314_210_12651_1_1533459476543_3766460_123-267862-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 07:04:25,850 WARNING [train.py:1204] (2/4) Exclude cut with ID _28427_210_4220_1_1532581240967_3061069_201-143747-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 07:04:27,017 WARNING [train.py:1204] (2/4) Exclude cut with ID _8772_210_1850_1_1528635595972_3664011_96-12756-0_sp1.1 from training. Duration: 0.8909375 2023-10-03 07:04:27,056 WARNING [train.py:1204] (2/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_1347-145675-0_sp1.1 from training. Duration: 0.936375 2023-10-03 07:04:28,492 WARNING [train.py:1204] (2/4) Exclude cut with ID _9235_210_5219_1_1528891154253_8046120_387-159008-0_sp1.1 from training. Duration: 0.7818125 2023-10-03 07:04:28,510 WARNING [train.py:1204] (2/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_400-201896-0_sp1.1 from training. Duration: 0.963625 2023-10-03 07:04:31,244 WARNING [train.py:1204] (2/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_326-174612-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 07:04:32,626 WARNING [train.py:1204] (2/4) Exclude cut with ID _55844_210_2346_1_1534471212569_7625707_390-116233-0_sp1.1 from training. Duration: 0.963625 2023-10-03 07:04:32,656 WARNING [train.py:1204] (2/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_419-317062-0_sp1.1 from training. Duration: 0.82725 2023-10-03 07:04:32,692 WARNING [train.py:1204] (2/4) Exclude cut with ID _29877_210_6228_1_1532397791106_5276149_266-62291-0_sp0.9 from training. Duration: 0.9666875 2023-10-03 07:04:34,040 WARNING [train.py:1204] (2/4) Exclude cut with ID _7857_210_1534_1_1528286515745_6831470_431-338528-0_sp1.1 from training. Duration: 0.92725 2023-10-03 07:04:35,391 WARNING [train.py:1204] (2/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_87-131427-0_sp0.9 from training. Duration: 0.8666875 2023-10-03 07:04:38,212 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.conv_module1.balancer2.min_abs, batch_count=1179866.6666666667, ans=0.5 2023-10-03 07:04:39,335 WARNING [train.py:1204] (2/4) Exclude cut with ID _24055_210_12651_1_1532310907143_976979_92-59356-0_sp1.1 from training. Duration: 0.82725 2023-10-03 07:04:41,232 WARNING [train.py:1204] (2/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_96-290039-0 from training. Duration: 0.56 2023-10-03 07:04:43,042 WARNING [train.py:1204] (2/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_48-182155-0_sp0.9 from training. Duration: 0.9666875 2023-10-03 07:04:43,090 WARNING [train.py:1204] (2/4) Exclude cut with ID _4626_210_3532_1_1526817652717_3916859_446-206656-0 from training. Duration: 0.76 2023-10-03 07:04:44,584 WARNING [train.py:1204] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_168-32906-0 from training. Duration: 0.69 2023-10-03 07:04:44,608 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_3780_210_2087_1_1526467760_2130483_170-259044-0 from training. Duration: 0.7 2023-10-03 07:04:44,619 WARNING [train.py:1204] (2/4) Exclude cut with ID _65134_210_14763_1_1535335393985_3505368_153-216718-0_sp1.1 from training. Duration: 0.92725 2023-10-03 07:04:46,014 WARNING [train.py:1204] (2/4) Exclude cut with ID _64639_210_5816_1_1535367619437_3528000_461-329156-0_sp1.1 from training. Duration: 0.87275 2023-10-03 07:04:46,055 WARNING [train.py:1204] (2/4) Exclude cut with ID _21989_210_5854_1_1531899214931_3271360_126-347531-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 07:04:46,101 WARNING [train.py:1204] (2/4) Exclude cut with ID _7614_210_4915_1_1529326764092_4537359_96-162197-0_sp1.1 from training. Duration: 0.963625 2023-10-03 07:04:46,102 WARNING [train.py:1204] (2/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_1083-147536-0 from training. Duration: 0.97 2023-10-03 07:04:50,704 WARNING [train.py:1204] (2/4) Exclude cut with ID _64029_210_4381_1_1535178502772_3756459_275-16710-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 07:04:50,840 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.attention_skip_rate, batch_count=1179933.3333333333, ans=0.0 2023-10-03 07:04:52,096 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7264_210_4938_1_1527939911_8132095_800-91995-0_sp0.9 from training. Duration: 0.9555625 2023-10-03 07:04:52,136 WARNING [train.py:1204] (2/4) Exclude cut with ID _3844_210_1390_1_1526727803582_2871209_50-193879-0_sp1.1 from training. Duration: 0.936375 2023-10-03 07:04:52,883 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.3.encoder.layers.0.nonlin_attention.whiten1, num_groups=1, num_channels=384, metric=7.44 vs. limit=10.0 2023-10-03 07:04:55,498 WARNING [train.py:1204] (2/4) Exclude cut with ID _46952_210_5986_1_1534230010357_3107379_232-344503-0 from training. Duration: 0.96 2023-10-03 07:04:58,445 INFO [train.py:1046] (2/4) Epoch 34, batch 1700, loss[loss=0.1535, simple_loss=0.2441, pruned_loss=0.03144, over 24551.00 frames. ], tot_loss[loss=0.1621, simple_loss=0.2411, pruned_loss=0.04154, over 4704184.04 frames. ], batch size: 71, lr: 3.01e-03, grad_scale: 8.0 2023-10-03 07:04:59,935 WARNING [train.py:1204] (2/4) Exclude cut with ID _25808_210_13292_1_1532343514562_7366109_206-14076-0_sp1.1 from training. Duration: 0.936375 2023-10-03 07:04:59,936 WARNING [train.py:1204] (2/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_55-31904-0_sp0.9 from training. Duration: 0.97775 2023-10-03 07:04:59,968 WARNING [train.py:1204] (2/4) Exclude cut with ID _14632_210_9988_1_1530691279392_3923200_161-129530-0 from training. Duration: 0.96 2023-10-03 07:05:00,203 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=1180000.0, ans=0.1 2023-10-03 07:05:01,311 WARNING [train.py:1204] (2/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_770-200430-0_sp1.1 from training. Duration: 0.8090625 2023-10-03 07:05:01,320 WARNING [train.py:1204] (2/4) Exclude cut with ID _6270_210_2084_1_1527850911573_3856420_26-277343-0_sp1.1 from training. Duration: 0.7454375 2023-10-03 07:05:01,321 WARNING [train.py:1204] (2/4) Exclude cut with ID _69884_210_9771_1_1535846392818_4166169_88-23776-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 07:05:04,120 WARNING [train.py:1204] (2/4) Exclude cut with ID _60860_210_8254_1_1534896015875_3557169_245-256666-0_sp1.1 from training. Duration: 0.8818125 2023-10-03 07:05:04,149 WARNING [train.py:1204] (2/4) Exclude cut with ID _5250_210_5219_1_1527249561806_8692110_636-251213-0_sp1.1 from training. Duration: 0.8545625 2023-10-03 07:05:04,170 WARNING [train.py:1204] (2/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_540-131164-0 from training. Duration: 0.92 2023-10-03 07:05:06,758 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_5931_210_2040_1_1527428481_4547999_321-193614-0_sp0.9 from training. Duration: 0.7444375 2023-10-03 07:05:12,893 WARNING [train.py:1204] (2/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_373-225403-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 07:05:13,117 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.balancer1.prob, batch_count=1180066.6666666667, ans=0.125 2023-10-03 07:05:17,487 WARNING [train.py:1204] (2/4) Exclude cut with ID _20622_210_3852_1_1531622427080_1093839_57-326027-0_sp1.1 from training. Duration: 0.97275 2023-10-03 07:05:22,270 WARNING [train.py:1204] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_605-257205-0_sp1.1 from training. Duration: 0.536375 2023-10-03 07:05:22,294 WARNING [train.py:1204] (2/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_442-320317-0_sp0.9 from training. Duration: 0.911125 2023-10-03 07:05:22,502 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.conv_module1.balancer2.min_abs, batch_count=1180066.6666666667, ans=0.5 2023-10-03 07:05:23,531 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_35361_210_4238_1_1533262810_7793476_675-87012-0_sp1.1 from training. Duration: 0.8090625 2023-10-03 07:05:23,559 WARNING [train.py:1204] (2/4) Exclude cut with ID _4184_210_2550_1_1526730192013_2219380_306-44736-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 07:05:26,623 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_124-113562-0 from training. Duration: 0.94 2023-10-03 07:05:28,129 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_2821_210_2715_1_1526108385_3763560_73-296673-0_sp0.9 from training. Duration: 0.92225 2023-10-03 07:05:28,144 WARNING [train.py:1204] (2/4) Exclude cut with ID _10301_210_5886_1_1529222275332_3910620_216-46959-0_sp1.1 from training. Duration: 0.92725 2023-10-03 07:05:29,571 WARNING [train.py:1204] (2/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_91-277684-0_sp0.9 from training. Duration: 0.82225 2023-10-03 07:05:29,663 WARNING [train.py:1204] (2/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_351-294650-0_sp0.9 from training. Duration: 0.7 2023-10-03 07:05:31,205 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.ff2_skip_rate, batch_count=1180133.3333333333, ans=0.0 2023-10-03 07:05:32,323 WARNING [train.py:1204] (2/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_209-205365-0 from training. Duration: 0.79 2023-10-03 07:05:32,368 WARNING [train.py:1204] (2/4) Exclude cut with ID _14603_210_4220_1_1530615688441_3097319_97-325053-0 from training. Duration: 0.92 2023-10-03 07:05:32,467 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_49348_210_7927_1_1533954500_3691300_109-276430-0_sp1.1 from training. Duration: 0.92725 2023-10-03 07:05:35,122 WARNING [train.py:1204] (2/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_912-47473-0 from training. Duration: 0.68 2023-10-03 07:05:36,571 WARNING [train.py:1204] (2/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_96-320736-0_sp0.9 from training. Duration: 0.9555625 2023-10-03 07:05:42,032 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=1180200.0, ans=0.1 2023-10-03 07:05:46,011 WARNING [train.py:1204] (2/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_1252-235481-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 07:05:46,102 WARNING [train.py:1204] (2/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_813-261554-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 07:05:47,956 WARNING [train.py:1204] (2/4) Exclude cut with ID _21632_210_2253_1_1531994001765_3744140_110-274654-0_sp0.9 from training. Duration: 0.911125 2023-10-03 07:05:49,350 WARNING [train.py:1204] (2/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_381-317791-0_sp1.1 from training. Duration: 0.663625 2023-10-03 07:05:49,367 WARNING [train.py:1204] (2/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_51-117576-0_sp1.1 from training. Duration: 0.363625 2023-10-03 07:05:49,384 WARNING [train.py:1204] (2/4) Exclude cut with ID _42321_210_7770_1_1533468631352_4366895_104-215915-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 07:05:49,684 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.balancer2.prob, batch_count=1180200.0, ans=0.125 2023-10-03 07:05:52,183 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_40896_210_15710_1_1533433956_7100802_216-22824-0_sp1.1 from training. Duration: 0.92725 2023-10-03 07:05:52,184 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_14831_210_7927_1_1530700200_5583610_181-27467-0 from training. Duration: 0.91 2023-10-03 07:05:52,230 WARNING [train.py:1204] (2/4) Exclude cut with ID _30304_210_6319_1_1532476827261_7582060_1024-265645-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 07:05:52,240 WARNING [train.py:1204] (2/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_412-233904-0_sp1.1 from training. Duration: 0.936375 2023-10-03 07:05:52,262 WARNING [train.py:1204] (2/4) Exclude cut with ID _30191_210_1664_1_1533189915047_7196670_911-223461-0_sp1.1 from training. Duration: 0.92725 2023-10-03 07:05:52,269 WARNING [train.py:1204] (2/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_558-148266-0_sp1.1 from training. Duration: 0.963625 2023-10-03 07:05:55,497 WARNING [train.py:1204] (2/4) Exclude cut with ID _58480_210_12392_1_1534811121111_3646160_212-164495-0_sp1.1 from training. Duration: 0.936375 2023-10-03 07:05:55,499 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_454-276429-0_sp0.9 from training. Duration: 0.9666875 2023-10-03 07:05:55,592 WARNING [train.py:1204] (2/4) Exclude cut with ID _24736_210_3279_1_1532332894218_2838700_73-197086-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 07:05:55,790 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.feed_forward3.hidden_balancer.prob, batch_count=1180200.0, ans=0.125 2023-10-03 07:05:57,025 WARNING [train.py:1204] (2/4) Exclude cut with ID _5294_210_1747_1_1527249643928_3696179_529-242298-0_sp1.1 from training. Duration: 0.863625 2023-10-03 07:05:57,071 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_53555_210_5632_1_1534595220_7232297_345-171438-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 07:06:01,144 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_28211_210_6388_1_1532861506_4523224_22-25766-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 07:06:02,568 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7002_210_1794_1_1527939683_5157286_163-87552-0 from training. Duration: 0.86 2023-10-03 07:06:04,026 WARNING [train.py:1204] (2/4) Exclude cut with ID _56927_210_9969_1_1534848970840_3695229_409-225129-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 07:06:06,634 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_26192_210_7927_1_1532137269_1040115_21-219331-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 07:06:08,068 WARNING [train.py:1204] (2/4) Exclude cut with ID _15012_210_1968_1_1530856820429_5095350_662-210469-0 from training. Duration: 0.57 2023-10-03 07:06:12,496 INFO [train.py:1046] (2/4) Epoch 34, batch 1750, loss[loss=0.1625, simple_loss=0.2479, pruned_loss=0.03853, over 24679.00 frames. ], tot_loss[loss=0.1608, simple_loss=0.239, pruned_loss=0.04127, over 4697950.38 frames. ], batch size: 65, lr: 3.01e-03, grad_scale: 8.0 2023-10-03 07:06:15,802 WARNING [train.py:1204] (2/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_582-191016-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 07:06:17,833 WARNING [train.py:1204] (2/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_270-210032-0_sp1.1 from training. Duration: 0.963625 2023-10-03 07:06:17,867 WARNING [train.py:1204] (2/4) Exclude cut with ID _6789_210_1534_1_1527857937218_3118950_262-48853-0_sp0.9 from training. Duration: 0.62225 2023-10-03 07:06:17,925 WARNING [train.py:1204] (2/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_847-41334-0 from training. Duration: 0.44 2023-10-03 07:06:19,219 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_40896_210_15710_1_1533433956_7100802_573-46532-0_sp1.1 from training. Duration: 0.92725 2023-10-03 07:06:20,717 WARNING [train.py:1204] (2/4) Exclude cut with ID _21881_210_3525_1_1531825242757_3798380_247-102591-0_sp1.1 from training. Duration: 0.8909375 2023-10-03 07:06:20,746 WARNING [train.py:1204] (2/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_200-21395-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 07:06:23,233 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.489e+02 1.852e+02 2.086e+02 2.272e+02 3.301e+02, threshold=4.171e+02, percent-clipped=0.0 2023-10-03 07:06:24,788 WARNING [train.py:1204] (2/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_711-15688-0 from training. Duration: 0.43 2023-10-03 07:06:26,638 WARNING [train.py:1204] (2/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_53-75025-0_sp1.1 from training. Duration: 0.963625 2023-10-03 07:06:29,467 WARNING [train.py:1204] (2/4) Exclude cut with ID _6194_210_1534_1_1527511826160_3941194_321-148641-0 from training. Duration: 0.64 2023-10-03 07:06:29,504 WARNING [train.py:1204] (2/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_337-294431-0_sp1.1 from training. Duration: 0.936375 2023-10-03 07:06:30,876 WARNING [train.py:1204] (2/4) Exclude cut with ID _21632_210_2253_1_1531994001765_3744140_110-274654-0_sp1.1 from training. Duration: 0.7454375 2023-10-03 07:06:34,826 WARNING [train.py:1204] (2/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_135-174080-0_sp1.1 from training. Duration: 0.4818125 2023-10-03 07:06:36,338 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.conv_module2.balancer1.prob, batch_count=1180400.0, ans=0.125 2023-10-03 07:06:37,486 WARNING [train.py:1204] (2/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_21-174514-0 from training. Duration: 0.43 2023-10-03 07:06:37,728 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.bypass.skip_rate, batch_count=1180400.0, ans=0.09899494936611666 2023-10-03 07:06:38,979 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_38965_210_4852_1_1533174906_3484735_215-294589-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 07:06:39,005 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_548-294316-0 from training. Duration: 0.81 2023-10-03 07:06:48,280 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_103-84010-0_sp1.1 from training. Duration: 0.62725 2023-10-03 07:06:50,203 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=1180466.6666666667, ans=0.1 2023-10-03 07:06:51,479 WARNING [train.py:1204] (2/4) Exclude cut with ID _18199_210_6109_1_1531475789864_4288790_295-198247-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 07:06:51,507 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_13757_210_3528_1_1530440682_4107239_103-313224-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 07:06:55,574 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_34886_210_10633_1_1533203882_1755661_67-294673-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 07:06:55,584 WARNING [train.py:1204] (2/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_350-101734-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 07:06:57,014 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_37357_210_17024_1_1533089504_2316121_204-153986-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 07:06:58,975 WARNING [train.py:1204] (2/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_673-115569-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 07:07:01,771 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_489-230703-0_sp0.9 from training. Duration: 0.9444375 2023-10-03 07:07:01,841 WARNING [train.py:1204] (2/4) Exclude cut with ID _5250_210_5219_1_1527249561806_8692110_575-102496-0_sp1.1 from training. Duration: 0.936375 2023-10-03 07:07:03,261 WARNING [train.py:1204] (2/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_669-93163-0 from training. Duration: 0.98 2023-10-03 07:07:04,703 WARNING [train.py:1204] (2/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_249-281349-0_sp0.9 from training. Duration: 0.9444375 2023-10-03 07:07:06,154 WARNING [train.py:1204] (2/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_106-30782-0 from training. Duration: 0.8 2023-10-03 07:07:07,505 WARNING [train.py:1204] (2/4) Exclude cut with ID _8772_210_1850_1_1528635595972_3664011_398-320049-0_sp1.1 from training. Duration: 0.8 2023-10-03 07:07:10,313 WARNING [train.py:1204] (2/4) Exclude cut with ID _44778_210_2054_1_1533798127203_7443090_516-151727-0_sp1.1 from training. Duration: 0.963625 2023-10-03 07:07:10,374 WARNING [train.py:1204] (2/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_325-62802-0_sp1.1 from training. Duration: 0.7909375 2023-10-03 07:07:14,917 WARNING [train.py:1204] (2/4) Exclude cut with ID _14368_210_4220_1_1530532714870_3595529_93-58002-0_sp0.9 from training. Duration: 0.6333125 2023-10-03 07:07:16,245 WARNING [train.py:1204] (2/4) Exclude cut with ID _4681_210_3525_1_1527251040321_1235713_123-24428-0_sp1.1 from training. Duration: 0.436375 2023-10-03 07:07:16,297 WARNING [train.py:1204] (2/4) Exclude cut with ID _5250_210_5219_1_1527249561806_8692110_1185-56473-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 07:07:16,611 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.conv_module1.balancer2.prob, batch_count=1180600.0, ans=0.125 2023-10-03 07:07:18,091 WARNING [train.py:1204] (2/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_96-244299-0_sp1.1 from training. Duration: 0.8 2023-10-03 07:07:21,564 WARNING [train.py:1204] (2/4) Exclude cut with ID _59500_210_8254_1_1534809610680_3736009_104-72815-0_sp1.1 from training. Duration: 0.963625 2023-10-03 07:07:23,972 WARNING [train.py:1204] (2/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_468-283113-0_sp1.1 from training. Duration: 0.87275 2023-10-03 07:07:24,083 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6239_210_4211_1_1527765584_4306698_528-102186-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 07:07:25,411 WARNING [train.py:1204] (2/4) Exclude cut with ID _45297_210_2721_1_1533715351381_3264949_351-147136-0 from training. Duration: 0.87 2023-10-03 07:07:25,422 WARNING [train.py:1204] (2/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_520-51744-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 07:07:26,714 INFO [train.py:1046] (2/4) Epoch 34, batch 1800, loss[loss=0.1591, simple_loss=0.2298, pruned_loss=0.04426, over 23424.00 frames. ], tot_loss[loss=0.1596, simple_loss=0.238, pruned_loss=0.0406, over 4695706.08 frames. ], batch size: 285, lr: 3.01e-03, grad_scale: 8.0 2023-10-03 07:07:26,796 WARNING [train.py:1204] (2/4) Exclude cut with ID _5166_210_2084_1_1527073197160_4087993_310-114452-0_sp1.1 from training. Duration: 0.536375 2023-10-03 07:07:26,806 WARNING [train.py:1204] (2/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_450-165710-0_sp1.1 from training. Duration: 0.92725 2023-10-03 07:07:26,816 WARNING [train.py:1204] (2/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_280-120942-0_sp1.1 from training. Duration: 0.7 2023-10-03 07:07:26,839 WARNING [train.py:1204] (2/4) Exclude cut with ID _65585_210_12210_1_1535450451584_3764098_112-294301-0_sp0.9 from training. Duration: 0.97775 2023-10-03 07:07:28,190 WARNING [train.py:1204] (2/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_98-51676-0_sp0.9 from training. Duration: 0.811125 2023-10-03 07:07:31,503 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_33348_210_5632_1_1533086912_3324030_85-28583-0_sp1.1 from training. Duration: 0.7454375 2023-10-03 07:07:32,869 WARNING [train.py:1204] (2/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_336-208232-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 07:07:34,285 WARNING [train.py:1204] (2/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_339-104791-0_sp0.9 from training. Duration: 0.5444375 2023-10-03 07:07:34,520 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.conv_skip_rate, batch_count=1180666.6666666667, ans=0.0 2023-10-03 07:07:37,003 WARNING [train.py:1204] (2/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_559-29840-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 07:07:38,395 WARNING [train.py:1204] (2/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_117-290916-0_sp0.9 from training. Duration: 0.5666875 2023-10-03 07:07:39,803 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_43802_210_2290_1_1533516203_8829504_185-112289-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 07:07:43,043 WARNING [train.py:1204] (2/4) Exclude cut with ID _59256_210_5986_1_1535076085997_3589120_155-298797-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 07:07:45,773 WARNING [train.py:1204] (2/4) Exclude cut with ID _44659_210_2721_1_1534036120061_6614860_246-206531-0_sp1.1 from training. Duration: 0.92725 2023-10-03 07:07:45,825 WARNING [train.py:1204] (2/4) Exclude cut with ID _23818_210_6228_1_1531917063884_3676920_469-151944-0_sp1.1 from training. Duration: 0.92725 2023-10-03 07:07:47,592 WARNING [train.py:1204] (2/4) Exclude cut with ID _13252_210_1925_1_1530671372047_3336909_88-276668-0_sp1.1 from training. Duration: 0.9 2023-10-03 07:07:47,725 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_15101_210_7927_1_1530786415_5033274_320-327527-0_sp1.1 from training. Duration: 0.87275 2023-10-03 07:07:47,739 WARNING [train.py:1204] (2/4) Exclude cut with ID _14998_210_5219_1_1530705582080_7474970_446-112117-0 from training. Duration: 0.9 2023-10-03 07:07:49,233 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_30280_210_1881_1_1532872779_3660139_400-254159-0_sp1.1 from training. Duration: 0.936375 2023-10-03 07:07:53,932 WARNING [train.py:1204] (2/4) Exclude cut with ID _40284_210_2084_1_1533513679814_3704279_158-345341-0_sp1.1 from training. Duration: 0.936375 2023-10-03 07:07:56,844 WARNING [train.py:1204] (2/4) Exclude cut with ID 3033-130750-0096-55598-0 from training. Duration: 0.83 2023-10-03 07:07:59,969 WARNING [train.py:1204] (2/4) Exclude cut with ID _9389_210_1664_1_1529217337301_5289269_379-128794-0 from training. Duration: 0.85 2023-10-03 07:07:59,990 WARNING [train.py:1204] (2/4) Exclude cut with ID _3675_210_4557_1_1526639934408_4182089_456-18305-0 from training. Duration: 0.76 2023-10-03 07:08:00,641 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.2.encoder.layers.1.self_attn2.whiten, num_groups=1, num_channels=384, metric=19.16 vs. limit=22.5 2023-10-03 07:08:01,252 WARNING [train.py:1204] (2/4) Exclude cut with ID _29853_210_7497_1_1532505600851_6642110_571-249460-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 07:08:01,377 WARNING [train.py:1204] (2/4) Exclude cut with ID _4068_210_2629_1_1526697350395_1546259_230-325568-0_sp1.1 from training. Duration: 0.92725 2023-10-03 07:08:01,378 WARNING [train.py:1204] (2/4) Exclude cut with ID _34415_210_8750_1_1533085213375_3451116_213-267570-0_sp1.1 from training. Duration: 0.963625 2023-10-03 07:08:02,666 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6767_210_3273_1_1527855783_1718932_142-127132-0_sp1.1 from training. Duration: 0.863625 2023-10-03 07:08:02,878 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.conv_module2.balancer2.min_abs, batch_count=1180800.0, ans=0.5 2023-10-03 07:08:04,371 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.conv_module1.balancer2.prob, batch_count=1180800.0, ans=0.125 2023-10-03 07:08:08,289 WARNING [train.py:1204] (2/4) Exclude cut with ID IC0653W0001-322925 from training. Duration: 0.896 2023-10-03 07:08:09,745 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_4528_210_3273_1_1527076654_3276777_209-219804-0_sp1.1 from training. Duration: 0.62725 2023-10-03 07:08:11,142 WARNING [train.py:1204] (2/4) Exclude cut with ID _55844_210_2346_1_1534471212569_7625707_187-204971-0_sp1.1 from training. Duration: 0.936375 2023-10-03 07:08:12,477 WARNING [train.py:1204] (2/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_369-325184-0 from training. Duration: 0.99 2023-10-03 07:08:12,512 WARNING [train.py:1204] (2/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_791-45149-0 from training. Duration: 0.86 2023-10-03 07:08:12,565 WARNING [train.py:1204] (2/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_317-211107-0_sp1.1 from training. Duration: 0.836375 2023-10-03 07:08:14,400 WARNING [train.py:1204] (2/4) Exclude cut with ID _30800_210_22382_1_1532499347904_4515959_488-330913-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 07:08:15,786 WARNING [train.py:1204] (2/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_506-42614-0_sp1.1 from training. Duration: 0.7181875 2023-10-03 07:08:21,006 WARNING [train.py:1204] (2/4) Exclude cut with ID _8696_210_1876_1_1528545632440_3345058_209-54069-0 from training. Duration: 0.67 2023-10-03 07:08:25,196 WARNING [train.py:1204] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_215-202973-0_sp1.1 from training. Duration: 0.8 2023-10-03 07:08:25,268 WARNING [train.py:1204] (2/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_321-215742-0 from training. Duration: 0.5 2023-10-03 07:08:25,527 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.balancer1.prob, batch_count=1180933.3333333333, ans=0.125 2023-10-03 07:08:26,648 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_40896_210_15710_1_1533433956_7100802_428-32114-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 07:08:26,656 WARNING [train.py:1204] (2/4) Exclude cut with ID _27013_210_1389_1_1532437173240_3252649_467-263151-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 07:08:26,708 WARNING [train.py:1204] (2/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_734-129761-0_sp0.9 from training. Duration: 0.911125 2023-10-03 07:08:28,022 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_521-214276-0 from training. Duration: 0.44 2023-10-03 07:08:31,252 WARNING [train.py:1204] (2/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_80-147301-0_sp1.1 from training. Duration: 0.763625 2023-10-03 07:08:31,254 WARNING [train.py:1204] (2/4) Exclude cut with ID _9679_210_7497_1_1529129212906_4363929_56-307796-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 07:08:35,213 WARNING [train.py:1204] (2/4) Exclude cut with ID _14832_210_8750_1_1530691351539_3171348_1-54315-0 from training. Duration: 0.87 2023-10-03 07:08:35,214 WARNING [train.py:1204] (2/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_558-155750-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 07:08:36,637 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_142-309370-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 07:08:36,661 WARNING [train.py:1204] (2/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_18-46756-0_sp0.9 from training. Duration: 0.87775 2023-10-03 07:08:36,668 WARNING [train.py:1204] (2/4) Exclude cut with ID _61148_210_6897_1_1535094022726_4217610_279-212159-0_sp1.1 from training. Duration: 0.936375 2023-10-03 07:08:39,470 WARNING [train.py:1204] (2/4) Exclude cut with ID _21987_210_5854_1_1531821500482_3637038_164-2641-0_sp1.1 from training. Duration: 0.936375 2023-10-03 07:08:39,510 WARNING [train.py:1204] (2/4) Exclude cut with ID _8064_210_5399_1_1528372299291_4282973_395-340852-0_sp1.1 from training. Duration: 0.8454375 2023-10-03 07:08:40,847 INFO [train.py:1046] (2/4) Epoch 34, batch 1850, loss[loss=0.1452, simple_loss=0.2282, pruned_loss=0.03107, over 24568.00 frames. ], tot_loss[loss=0.1602, simple_loss=0.2387, pruned_loss=0.04084, over 4699818.29 frames. ], batch size: 60, lr: 3.01e-03, grad_scale: 8.0 2023-10-03 07:08:42,334 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1077-184261-0_sp1.1 from training. Duration: 0.963625 2023-10-03 07:08:42,351 WARNING [train.py:1204] (2/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_1176-204992-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 07:08:44,378 WARNING [train.py:1204] (2/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_563-347832-0_sp1.1 from training. Duration: 0.8090625 2023-10-03 07:08:45,727 WARNING [train.py:1204] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_380-204756-0_sp0.9 from training. Duration: 0.9555625 2023-10-03 07:08:49,757 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.0.whiten.whitening_limit, batch_count=1181000.0, ans=12.0 2023-10-03 07:08:51,418 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.601e+02 1.930e+02 2.238e+02 2.527e+02 4.034e+02, threshold=4.475e+02, percent-clipped=0.0 2023-10-03 07:08:54,933 WARNING [train.py:1204] (2/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_212-10501-0_sp1.1 from training. Duration: 0.8545625 2023-10-03 07:08:54,963 WARNING [train.py:1204] (2/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_445-117398-0 from training. Duration: 0.89 2023-10-03 07:08:57,808 WARNING [train.py:1204] (2/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_29-280622-0 from training. Duration: 0.82 2023-10-03 07:08:58,668 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.0.layers.1.feed_forward3.out_whiten, num_groups=1, num_channels=192, metric=8.72 vs. limit=15.0 2023-10-03 07:09:00,512 WARNING [train.py:1204] (2/4) Exclude cut with ID _26664_210_7770_1_1532258943280_3418270_187-80815-0 from training. Duration: 0.93 2023-10-03 07:09:03,383 WARNING [train.py:1204] (2/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_414-349640-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 07:09:03,401 WARNING [train.py:1204] (2/4) Exclude cut with ID _45021_210_9994_1_1533619751094_3330288_96-239010-0 from training. Duration: 0.91 2023-10-03 07:09:03,403 WARNING [train.py:1204] (2/4) Exclude cut with ID _5189_210_4557_1_1527248319428_3769229_199-53526-0_sp0.9 from training. Duration: 0.4666875 2023-10-03 07:09:03,633 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=1181066.6666666667, ans=0.1 2023-10-03 07:09:13,510 WARNING [train.py:1204] (2/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_63-101996-0_sp1.1 from training. Duration: 0.8 2023-10-03 07:09:15,481 WARNING [train.py:1204] (2/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_199-86432-0 from training. Duration: 0.98 2023-10-03 07:09:15,740 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.feed_forward1.out_proj.dropout_p, batch_count=1181133.3333333333, ans=0.1 2023-10-03 07:09:16,957 WARNING [train.py:1204] (2/4) Exclude cut with ID _36399_210_16049_1_1532998771377_3698333_207-259013-0_sp1.1 from training. Duration: 0.92725 2023-10-03 07:09:17,171 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.conv_skip_rate, batch_count=1181133.3333333333, ans=0.0 2023-10-03 07:09:18,712 WARNING [train.py:1204] (2/4) Exclude cut with ID _62574_210_2655_1_1535180407058_3933249_567-111756-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 07:09:21,530 WARNING [train.py:1204] (2/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_436-318113-0 from training. Duration: 0.75 2023-10-03 07:09:22,828 WARNING [train.py:1204] (2/4) Exclude cut with ID _53379_210_20034_1_1534222027363_3854654_43-305214-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 07:09:22,850 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_15126_210_6753_1_1530778677_2591185_113-157651-0_sp1.1 from training. Duration: 0.6181875 2023-10-03 07:09:24,809 WARNING [train.py:1204] (2/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_146-218763-0_sp1.1 from training. Duration: 0.7818125 2023-10-03 07:09:26,307 WARNING [train.py:1204] (2/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_714-310538-0_sp0.9 from training. Duration: 0.9555625 2023-10-03 07:09:29,079 WARNING [train.py:1204] (2/4) Exclude cut with ID _24271_210_12658_1_1531987270022_7190519_709-16721-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 07:09:31,971 WARNING [train.py:1204] (2/4) Exclude cut with ID _5190_210_4557_1_1527244368799_373439_28-197030-0_sp0.9 from training. Duration: 0.8 2023-10-03 07:09:32,007 WARNING [train.py:1204] (2/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_267-197536-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 07:09:32,039 WARNING [train.py:1204] (2/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_658-282610-0_sp1.1 from training. Duration: 0.4818125 2023-10-03 07:09:32,047 WARNING [train.py:1204] (2/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_257-67050-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 07:09:33,397 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_14906_210_7927_1_1530754190_9408633_961-53402-0_sp1.1 from training. Duration: 0.936375 2023-10-03 07:09:36,263 WARNING [train.py:1204] (2/4) Exclude cut with ID _60113_210_16271_1_1535164212481_4386079_490-280290-0_sp1.1 from training. Duration: 0.8909375 2023-10-03 07:09:38,220 WARNING [train.py:1204] (2/4) Exclude cut with ID _14100_210_8341_1_1530529320613_3678069_258-224599-0 from training. Duration: 0.77 2023-10-03 07:09:39,435 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6961_210_2087_1_1527937168_6815011_565-326898-0_sp1.1 from training. Duration: 0.936375 2023-10-03 07:09:41,467 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.5.encoder.layers.0.conv_module1.whiten, num_groups=1, num_channels=256, metric=6.73 vs. limit=15.0 2023-10-03 07:09:42,288 WARNING [train.py:1204] (2/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_239-316802-0_sp1.1 from training. Duration: 0.62725 2023-10-03 07:09:43,648 WARNING [train.py:1204] (2/4) Exclude cut with ID _55857_210_2346_1_1534644007831_6898209_745-98391-0_sp1.1 from training. Duration: 0.6909375 2023-10-03 07:09:43,651 WARNING [train.py:1204] (2/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_154-135696-0 from training. Duration: 0.8 2023-10-03 07:09:43,653 WARNING [train.py:1204] (2/4) Exclude cut with ID _14368_210_4220_1_1530532714870_3595529_93-58002-0 from training. Duration: 0.57 2023-10-03 07:09:43,825 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.feed_forward2.hidden_balancer.prob, batch_count=1181266.6666666667, ans=0.125 2023-10-03 07:09:46,756 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9025W0005-507335 from training. Duration: 0.896 2023-10-03 07:09:48,584 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9029W0034-509435 from training. Duration: 0.64 2023-10-03 07:09:48,728 WARNING [train.py:1204] (2/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_76-203867-0_sp1.1 from training. Duration: 0.6545625 2023-10-03 07:09:48,738 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_46639_210_8341_1_1533726088_4495500_66-346325-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 07:09:48,744 WARNING [train.py:1204] (2/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_145-137790-0_sp0.9 from training. Duration: 0.92225 2023-10-03 07:09:50,059 WARNING [train.py:1204] (2/4) Exclude cut with ID _36522_210_8887_1_1533177030611_1772478_7-327299-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 07:09:50,117 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9042W0009-515615 from training. Duration: 0.896 2023-10-03 07:09:50,132 WARNING [train.py:1204] (2/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_39-248870-0_sp0.9 from training. Duration: 0.7666875 2023-10-03 07:09:51,576 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_35441_210_16554_1_1533347572_4092680_362-242912-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 07:09:52,900 WARNING [train.py:1204] (2/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_216-343007-0_sp1.1 from training. Duration: 0.6 2023-10-03 07:09:54,358 WARNING [train.py:1204] (2/4) Exclude cut with ID _3867_210_1883_1_1526641200575_4475830_272-215563-0_sp0.9 from training. Duration: 0.6333125 2023-10-03 07:09:56,150 INFO [train.py:1046] (2/4) Epoch 34, batch 1900, loss[loss=0.1705, simple_loss=0.2437, pruned_loss=0.04861, over 23830.00 frames. ], tot_loss[loss=0.1609, simple_loss=0.2399, pruned_loss=0.04092, over 4711596.22 frames. ], batch size: 164, lr: 3.01e-03, grad_scale: 8.0 2023-10-03 07:09:57,664 WARNING [train.py:1204] (2/4) Exclude cut with ID _21783_210_11357_1_1531742458730_3762679_553-175603-0_sp1.1 from training. Duration: 0.92725 2023-10-03 07:09:57,673 WARNING [train.py:1204] (2/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_963-187109-0 from training. Duration: 0.99 2023-10-03 07:09:59,245 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.self_attn_weights.pos_emb_skip_rate, batch_count=1181333.3333333333, ans=0.0 2023-10-03 07:10:00,384 WARNING [train.py:1204] (2/4) Exclude cut with ID _44973_210_1511_1_1533794381163_3634770_495-211466-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 07:10:00,401 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9068W0002-529085 from training. Duration: 0.896 2023-10-03 07:10:00,420 WARNING [train.py:1204] (2/4) Exclude cut with ID _5294_210_1747_1_1527249643928_3696179_180-100713-0_sp1.1 from training. Duration: 0.736375 2023-10-03 07:10:01,761 WARNING [train.py:1204] (2/4) Exclude cut with ID _35799_210_5854_1_1533263487887_3529071_149-49689-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 07:10:05,944 WARNING [train.py:1204] (2/4) Exclude cut with ID _67725_210_27767_1_1535608756671_5105359_36-220794-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 07:10:07,572 WARNING [train.py:1204] (2/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_338-96520-0_sp1.1 from training. Duration: 0.8545625 2023-10-03 07:10:09,251 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9104W0004-547160 from training. Duration: 0.896 2023-10-03 07:10:10,584 WARNING [train.py:1204] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_330-327861-0 from training. Duration: 0.67 2023-10-03 07:10:10,687 WARNING [train.py:1204] (2/4) Exclude cut with ID _14087_210_2253_1_1530529224512_3014029_156-257607-0_sp0.9 from training. Duration: 0.92225 2023-10-03 07:10:12,023 WARNING [train.py:1204] (2/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_476-322050-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 07:10:12,031 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9118W0017-553385 from training. Duration: 0.896 2023-10-03 07:10:12,064 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9120W0028-554435 from training. Duration: 0.896 2023-10-03 07:10:12,210 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.conv_module2.balancer2.prob, batch_count=1181400.0, ans=0.125 2023-10-03 07:10:16,065 WARNING [train.py:1204] (2/4) Exclude cut with ID _56524_210_9642_1_1534572011779_3619170_145-310842-0 from training. Duration: 0.99 2023-10-03 07:10:16,177 WARNING [train.py:1204] (2/4) Exclude cut with ID _51631_210_8254_1_1534129228420_4084794_270-222508-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 07:10:19,486 WARNING [train.py:1204] (2/4) Exclude cut with ID _14268_210_9206_1_1530622771285_4714099_331-105351-0 from training. Duration: 0.65 2023-10-03 07:10:21,388 WARNING [train.py:1204] (2/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_40-129756-0 from training. Duration: 0.81 2023-10-03 07:10:31,534 WARNING [train.py:1204] (2/4) Exclude cut with ID _3541_210_2477_1_1526475408709_3263973_231-104736-0 from training. Duration: 0.78 2023-10-03 07:10:34,345 WARNING [train.py:1204] (2/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_214-291369-0 from training. Duration: 0.63 2023-10-03 07:10:34,357 WARNING [train.py:1204] (2/4) Exclude cut with ID _62574_210_2655_1_1535180407058_3933249_369-84347-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 07:10:35,657 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9210W0111-599240 from training. Duration: 0.896 2023-10-03 07:10:35,661 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_92-246270-0 from training. Duration: 0.85 2023-10-03 07:10:35,694 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_37152_210_9697_1_1533106573_3844188_29-239070-0 from training. Duration: 0.98 2023-10-03 07:10:35,744 WARNING [train.py:1204] (2/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_239-22076-0 from training. Duration: 0.85 2023-10-03 07:10:35,746 WARNING [train.py:1204] (2/4) Exclude cut with ID _65962_210_29927_1_1535436026519_4174930_368-44639-0_sp1.1 from training. Duration: 0.97275 2023-10-03 07:10:40,419 WARNING [train.py:1204] (2/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_152-212533-0 from training. Duration: 0.97 2023-10-03 07:10:42,034 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.balancer1.prob, batch_count=1181533.3333333333, ans=0.125 2023-10-03 07:10:43,221 WARNING [train.py:1204] (2/4) Exclude cut with ID _14603_210_4220_1_1530615688441_3097319_90-31383-0_sp1.1 from training. Duration: 0.7909375 2023-10-03 07:10:46,020 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.nonlin_attention.balancer.prob, batch_count=1181533.3333333333, ans=0.125 2023-10-03 07:10:47,138 WARNING [train.py:1204] (2/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_196-152868-0_sp1.1 from training. Duration: 0.92725 2023-10-03 07:10:47,140 WARNING [train.py:1204] (2/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_140-181176-0 from training. Duration: 0.92 2023-10-03 07:10:49,083 WARNING [train.py:1204] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_168-32906-0_sp0.9 from training. Duration: 0.7666875 2023-10-03 07:10:53,847 WARNING [train.py:1204] (2/4) Exclude cut with ID _14922_210_6228_1_1530707435273_3693310_13-165512-0 from training. Duration: 0.82 2023-10-03 07:10:53,885 WARNING [train.py:1204] (2/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_381-317791-0_sp0.9 from training. Duration: 0.811125 2023-10-03 07:10:58,912 WARNING [train.py:1204] (2/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_63-267585-0_sp1.1 from training. Duration: 0.8181875 2023-10-03 07:10:58,914 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_326-303945-0_sp1.1 from training. Duration: 0.9 2023-10-03 07:11:00,107 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_194-143381-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 07:11:00,177 WARNING [train.py:1204] (2/4) Exclude cut with ID _39290_210_4220_1_1533862778760_3075118_194-16667-0_sp1.1 from training. Duration: 0.963625 2023-10-03 07:11:01,488 WARNING [train.py:1204] (2/4) Exclude cut with ID _15012_210_1968_1_1530856820429_5095350_662-210469-0_sp0.9 from training. Duration: 0.6333125 2023-10-03 07:11:01,527 WARNING [train.py:1204] (2/4) Exclude cut with ID _4697_210_2069_1_1526815801953_3823098_68-219807-0_sp1.1 from training. Duration: 0.42725 2023-10-03 07:11:02,820 WARNING [train.py:1204] (2/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_321-22821-0_sp0.9 from training. Duration: 0.988875 2023-10-03 07:11:06,736 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_14056_210_4163_1_1530604483_7572827_1037-45317-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 07:11:06,738 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_403-39619-0_sp1.1 from training. Duration: 0.763625 2023-10-03 07:11:09,529 INFO [train.py:1046] (2/4) Epoch 34, batch 1950, loss[loss=0.1418, simple_loss=0.224, pruned_loss=0.0298, over 24315.00 frames. ], tot_loss[loss=0.1614, simple_loss=0.2407, pruned_loss=0.04104, over 4709401.26 frames. ], batch size: 61, lr: 3.00e-03, grad_scale: 8.0 2023-10-03 07:11:09,583 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_20120_210_6753_1_1531643035_5482859_510-96996-0_sp0.9 from training. Duration: 0.9666875 2023-10-03 07:11:09,593 WARNING [train.py:1204] (2/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_225-180155-0_sp1.1 from training. Duration: 0.92725 2023-10-03 07:11:09,643 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_278-272612-0_sp0.9 from training. Duration: 0.811125 2023-10-03 07:11:11,637 WARNING [train.py:1204] (2/4) Exclude cut with ID _53713_210_24116_1_1534314487762_7416751_691-70244-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 07:11:14,609 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6788_210_1388_1_1527898698_4562021_91-197981-0_sp1.1 from training. Duration: 0.7545625 2023-10-03 07:11:16,012 WARNING [train.py:1204] (2/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_60-18123-0_sp0.9 from training. Duration: 0.97775 2023-10-03 07:11:16,065 WARNING [train.py:1204] (2/4) Exclude cut with ID _35799_210_5854_1_1533263487887_3529071_5-214617-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 07:11:16,070 WARNING [train.py:1204] (2/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_890-5117-0_sp1.1 from training. Duration: 0.7090625 2023-10-03 07:11:19,956 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.581e+02 1.878e+02 2.003e+02 2.217e+02 3.435e+02, threshold=4.007e+02, percent-clipped=0.0 2023-10-03 07:11:20,060 WARNING [train.py:1204] (2/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_660-211088-0 from training. Duration: 0.62 2023-10-03 07:11:20,113 WARNING [train.py:1204] (2/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_314-126819-0_sp0.9 from training. Duration: 0.5555625 2023-10-03 07:11:20,144 WARNING [train.py:1204] (2/4) Exclude cut with ID _21632_210_2253_1_1531994001765_3744140_362-66225-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 07:11:22,081 WARNING [train.py:1204] (2/4) Exclude cut with ID _61118_210_9527_1_1534897668884_3752630_173-258555-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 07:11:25,318 WARNING [train.py:1204] (2/4) Exclude cut with ID _7719_210_3525_1_1528456153163_4296960_88-250259-0_sp0.9 from training. Duration: 0.9333125 2023-10-03 07:11:25,343 WARNING [train.py:1204] (2/4) Exclude cut with ID _15140_210_9288_1_1530791388450_4880149_539-153708-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 07:11:25,365 WARNING [train.py:1204] (2/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_253-42544-0_sp1.1 from training. Duration: 0.97275 2023-10-03 07:11:28,169 WARNING [train.py:1204] (2/4) Exclude cut with ID _33640_210_2655_1_1533117605479_4855469_479-296877-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 07:11:31,463 WARNING [train.py:1204] (2/4) Exclude cut with ID 3033-130750-0096-55598-0_sp1.1 from training. Duration: 0.7545625 2023-10-03 07:11:31,474 WARNING [train.py:1204] (2/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_191-41023-0_sp1.1 from training. Duration: 0.6454375 2023-10-03 07:11:32,777 WARNING [train.py:1204] (2/4) Exclude cut with ID _8365_210_2064_1_1528592311070_4677690_431-294102-0_sp1.1 from training. Duration: 0.8090625 2023-10-03 07:11:32,808 WARNING [train.py:1204] (2/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_893-296030-0_sp1.1 from training. Duration: 0.97275 2023-10-03 07:11:35,565 WARNING [train.py:1204] (2/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_401-343820-0_sp1.1 from training. Duration: 0.97275 2023-10-03 07:11:39,720 WARNING [train.py:1204] (2/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_127-566-0_sp1.1 from training. Duration: 0.763625 2023-10-03 07:11:39,722 WARNING [train.py:1204] (2/4) Exclude cut with ID _14100_210_8341_1_1530529320613_3678069_126-272847-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 07:11:39,749 WARNING [train.py:1204] (2/4) Exclude cut with ID _15133_210_6228_1_1530794512074_3080379_29-264458-0_sp1.1 from training. Duration: 0.52725 2023-10-03 07:11:39,750 WARNING [train.py:1204] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_127-119109-0 from training. Duration: 0.72 2023-10-03 07:11:41,093 WARNING [train.py:1204] (2/4) Exclude cut with ID _6537_210_1883_1_1527987559710_3597129_382-133062-0_sp1.1 from training. Duration: 0.6181875 2023-10-03 07:11:41,111 WARNING [train.py:1204] (2/4) Exclude cut with ID _29529_210_7234_1_1532393961455_3977460_391-219682-0_sp1.1 from training. Duration: 0.936375 2023-10-03 07:11:42,335 WARNING [train.py:1204] (2/4) Exclude cut with ID _37537_210_2253_1_1533171758568_3421949_96-137189-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 07:11:47,121 WARNING [train.py:1204] (2/4) Exclude cut with ID _58414_210_8254_1_1535421609162_7151182_15-276301-0_sp1.1 from training. Duration: 0.97275 2023-10-03 07:11:48,614 WARNING [train.py:1204] (2/4) Exclude cut with ID _59502_210_12210_1_1535107024698_1835139_110-289074-0_sp1.1 from training. Duration: 0.92725 2023-10-03 07:11:51,452 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.attention_skip_rate, batch_count=1181800.0, ans=0.0 2023-10-03 07:11:52,548 WARNING [train.py:1204] (2/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_367-61330-0_sp1.1 from training. Duration: 0.6818125 2023-10-03 07:11:54,536 WARNING [train.py:1204] (2/4) Exclude cut with ID _45297_210_2721_1_1533715351381_3264949_353-9541-0_sp1.1 from training. Duration: 0.8818125 2023-10-03 07:11:56,398 WARNING [train.py:1204] (2/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_85-138257-0_sp0.9 from training. Duration: 0.788875 2023-10-03 07:11:56,423 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_3787_210_2040_1_1526477544_5478586_603-120324-0 from training. Duration: 0.81 2023-10-03 07:11:56,472 WARNING [train.py:1204] (2/4) Exclude cut with ID _42863_210_9070_1_1533902398788_3696920_411-204131-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 07:11:59,234 WARNING [train.py:1204] (2/4) Exclude cut with ID _30196_210_1664_1_1533708496522_7074140_822-289909-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 07:12:00,583 WARNING [train.py:1204] (2/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_32-335103-0_sp0.9 from training. Duration: 0.911125 2023-10-03 07:12:02,326 WARNING [train.py:1204] (2/4) Exclude cut with ID _6493_210_1747_1_1528459210424_4012470_513-140967-0_sp0.9 from training. Duration: 0.87775 2023-10-03 07:12:05,464 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.feed_forward3.hidden_balancer.prob, batch_count=1181866.6666666667, ans=0.125 2023-10-03 07:12:08,055 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_47896_210_20508_1_1534492518_5958868_439-257766-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 07:12:08,142 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_1294-63152-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 07:12:10,932 WARNING [train.py:1204] (2/4) Exclude cut with ID _59170_210_6721_1_1534983633960_4203335_319-166512-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 07:12:12,315 WARNING [train.py:1204] (2/4) Exclude cut with ID _3541_210_2477_1_1526475408709_3263973_146-3470-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 07:12:13,855 WARNING [train.py:1204] (2/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_678-159307-0_sp1.1 from training. Duration: 0.77275 2023-10-03 07:12:15,655 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_343-48822-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 07:12:15,718 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_4692_210_3528_1_1526899755_4323254_107-44401-0 from training. Duration: 0.86 2023-10-03 07:12:15,723 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6291_210_3273_1_1527683649_1078498_74-292608-0_sp1.1 from training. Duration: 0.6545625 2023-10-03 07:12:15,893 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.conv_skip_rate, batch_count=1181933.3333333333, ans=0.0 2023-10-03 07:12:17,023 WARNING [train.py:1204] (2/4) Exclude cut with ID _55644_210_11856_1_1534593599582_4103719_293-248694-0_sp1.1 from training. Duration: 0.97275 2023-10-03 07:12:17,108 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_3172_210_2087_1_1526292929_3987865_197-97762-0 from training. Duration: 0.75 2023-10-03 07:12:19,769 WARNING [train.py:1204] (2/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_264-348896-0_sp0.9 from training. Duration: 0.9444375 2023-10-03 07:12:24,232 INFO [train.py:1046] (2/4) Epoch 34, batch 2000, loss[loss=0.158, simple_loss=0.2319, pruned_loss=0.04211, over 23298.00 frames. ], tot_loss[loss=0.163, simple_loss=0.2421, pruned_loss=0.04188, over 4705216.72 frames. ], batch size: 119, lr: 3.00e-03, grad_scale: 16.0 2023-10-03 07:12:24,318 WARNING [train.py:1204] (2/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_209-205365-0_sp0.9 from training. Duration: 0.87775 2023-10-03 07:12:25,654 WARNING [train.py:1204] (2/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_222-198127-0_sp0.9 from training. Duration: 0.9333125 2023-10-03 07:12:26,982 WARNING [train.py:1204] (2/4) Exclude cut with ID _57572_210_5517_1_1534921219624_3809484_52-43530-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 07:12:29,715 WARNING [train.py:1204] (2/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_96-320736-0_sp1.1 from training. Duration: 0.7818125 2023-10-03 07:12:30,431 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_13091_210_7925_1_1530609447_8987354_1159-225508-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 07:12:33,183 WARNING [train.py:1204] (2/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_237-44332-0 from training. Duration: 0.94 2023-10-03 07:12:34,499 WARNING [train.py:1204] (2/4) Exclude cut with ID _7970_210_1883_1_1528455616296_3472230_25-178407-0_sp0.9 from training. Duration: 0.788875 2023-10-03 07:12:36,078 WARNING [train.py:1204] (2/4) Exclude cut with ID _29003_210_19596_1_1532507391720_3894098_357-257062-0_sp1.1 from training. Duration: 0.936375 2023-10-03 07:12:36,899 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.1.encoder.layers.1.nonlin_attention.whiten2, num_groups=1, num_channels=256, metric=8.21 vs. limit=15.0 2023-10-03 07:12:37,391 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_809-17156-0 from training. Duration: 0.73 2023-10-03 07:12:37,484 WARNING [train.py:1204] (2/4) Exclude cut with ID _7160_210_1876_1_1527940923231_3703799_302-150728-0_sp1.1 from training. Duration: 0.7090625 2023-10-03 07:12:37,492 WARNING [train.py:1204] (2/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_444-28041-0_sp0.9 from training. Duration: 0.9444375 2023-10-03 07:12:41,523 WARNING [train.py:1204] (2/4) Exclude cut with ID _52892_210_6721_1_1534378941259_4124129_278-251666-0_sp1.1 from training. Duration: 0.92725 2023-10-03 07:12:41,637 WARNING [train.py:1204] (2/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_115-326778-0 from training. Duration: 0.93 2023-10-03 07:12:42,973 WARNING [train.py:1204] (2/4) Exclude cut with ID _41391_210_7994_1_1533603771872_3638201_31-188961-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 07:12:44,719 WARNING [train.py:1204] (2/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_1073-6264-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 07:12:44,736 WARNING [train.py:1204] (2/4) Exclude cut with ID _66868_210_9527_1_1535509287954_4561140_435-259515-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 07:12:46,059 WARNING [train.py:1204] (2/4) Exclude cut with ID _14886_210_4220_1_1530702020355_3516190_159-73748-0 from training. Duration: 0.83 2023-10-03 07:12:46,091 WARNING [train.py:1204] (2/4) Exclude cut with ID _15150_210_8341_1_1530752411803_7256384_469-239509-0_sp1.1 from training. Duration: 0.6181875 2023-10-03 07:12:48,266 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.1.encoder.layers.1.conv_module1.whiten, num_groups=1, num_channels=256, metric=10.40 vs. limit=15.0 2023-10-03 07:12:48,801 WARNING [train.py:1204] (2/4) Exclude cut with ID _8064_210_5399_1_1528372299291_4282973_395-340852-0 from training. Duration: 0.93 2023-10-03 07:12:48,804 WARNING [train.py:1204] (2/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_138-187031-0_sp1.1 from training. Duration: 0.9 2023-10-03 07:12:53,770 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_13757_210_3528_1_1530440682_4107239_48-106347-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 07:12:53,850 WARNING [train.py:1204] (2/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_164-224052-0_sp1.1 from training. Duration: 0.636375 2023-10-03 07:12:53,854 WARNING [train.py:1204] (2/4) Exclude cut with ID _59288_210_5854_1_1535077757774_3412246_408-322648-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 07:12:55,197 WARNING [train.py:1204] (2/4) Exclude cut with ID _30191_210_1664_1_1533189915047_7196670_827-36359-0_sp1.1 from training. Duration: 0.963625 2023-10-03 07:12:55,283 WARNING [train.py:1204] (2/4) Exclude cut with ID _4680_210_3525_1_1527249757953_893386_75-78979-0_sp1.1 from training. Duration: 0.82725 2023-10-03 07:12:57,240 WARNING [train.py:1204] (2/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_383-165234-0 from training. Duration: 0.94 2023-10-03 07:12:58,849 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.balancer1.prob, batch_count=1182133.3333333333, ans=0.125 2023-10-03 07:13:00,030 WARNING [train.py:1204] (2/4) Exclude cut with ID _19896_210_1883_1_1531709022662_6649569_94-229927-0 from training. Duration: 0.97 2023-10-03 07:13:00,043 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6788_210_1388_1_1527898698_4562021_180-75288-0_sp1.1 from training. Duration: 0.9 2023-10-03 07:13:00,056 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_26433_210_12108_1_1532157158_4056480_54-321349-0_sp1.1 from training. Duration: 0.97275 2023-10-03 07:13:05,370 WARNING [train.py:1204] (2/4) Exclude cut with ID _56108_210_16380_1_1534505446159_3887959_265-161493-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 07:13:06,800 WARNING [train.py:1204] (2/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_395-111602-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 07:13:06,807 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_154-316450-0_sp0.9 from training. Duration: 0.8444375 2023-10-03 07:13:06,875 WARNING [train.py:1204] (2/4) Exclude cut with ID _43552_210_8913_1_1533726031059_3523429_381-116041-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 07:13:09,609 WARNING [train.py:1204] (2/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_65-104124-0_sp1.1 from training. Duration: 0.963625 2023-10-03 07:13:10,943 WARNING [train.py:1204] (2/4) Exclude cut with ID _6536_210_1883_1_1527850783339_3319069_169-23811-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 07:13:12,276 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_289-180904-0_sp0.9 from training. Duration: 0.8444375 2023-10-03 07:13:12,294 WARNING [train.py:1204] (2/4) Exclude cut with ID _56406_210_9288_1_1534899618664_3496149_226-242400-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 07:13:12,596 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.conv_module1.balancer1.prob, batch_count=1182200.0, ans=0.125 2023-10-03 07:13:13,684 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_24899_210_16554_1_1532046422_3827462_276-216330-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 07:13:16,449 WARNING [train.py:1204] (2/4) Exclude cut with ID _29345_210_4381_1_1532689168263_3860169_198-174328-0_sp1.1 from training. Duration: 0.82725 2023-10-03 07:13:16,512 WARNING [train.py:1204] (2/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_338-96520-0 from training. Duration: 0.94 2023-10-03 07:13:19,830 WARNING [train.py:1204] (2/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_7-111954-0_sp1.1 from training. Duration: 0.5090625 2023-10-03 07:13:21,192 WARNING [train.py:1204] (2/4) Exclude cut with ID _36692_210_5665_1_1533608967420_398809_8-16261-0_sp1.1 from training. Duration: 0.97275 2023-10-03 07:13:26,195 WARNING [train.py:1204] (2/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_86-212657-0_sp1.1 from training. Duration: 0.97275 2023-10-03 07:13:26,202 WARNING [train.py:1204] (2/4) Exclude cut with ID _44659_210_2721_1_1534036120061_6614860_626-298884-0_sp1.1 from training. Duration: 0.8818125 2023-10-03 07:13:29,707 WARNING [train.py:1204] (2/4) Exclude cut with ID _49755_210_18053_1_1534581005846_3218709_120-257918-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 07:13:31,079 WARNING [train.py:1204] (2/4) Exclude cut with ID _21881_210_3525_1_1531825242757_3798380_400-113821-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 07:13:31,081 WARNING [train.py:1204] (2/4) Exclude cut with ID _21783_210_11357_1_1531742458730_3762679_387-236773-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 07:13:33,647 WARNING [train.py:1204] (2/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_613-232856-0_sp1.1 from training. Duration: 0.4909375 2023-10-03 07:13:33,683 WARNING [train.py:1204] (2/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_181-78632-0_sp0.9 from training. Duration: 0.8666875 2023-10-03 07:13:37,622 INFO [train.py:1046] (2/4) Epoch 34, batch 2050, loss[loss=0.163, simple_loss=0.2352, pruned_loss=0.04544, over 23906.00 frames. ], tot_loss[loss=0.1623, simple_loss=0.241, pruned_loss=0.04179, over 4695349.46 frames. ], batch size: 195, lr: 3.00e-03, grad_scale: 16.0 2023-10-03 07:13:37,667 WARNING [train.py:1204] (2/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_175-17485-0_sp1.1 from training. Duration: 0.97275 2023-10-03 07:13:37,741 WARNING [train.py:1204] (2/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_1004-183422-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 07:13:38,040 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.balancer2.prob, batch_count=1182333.3333333333, ans=0.125 2023-10-03 07:13:39,297 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.feed_forward2.hidden_balancer.prob, batch_count=1182333.3333333333, ans=0.125 2023-10-03 07:13:40,672 WARNING [train.py:1204] (2/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_32-65962-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 07:13:41,995 WARNING [train.py:1204] (2/4) Exclude cut with ID _15458_210_8341_1_1530858614263_3834410_40-298664-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 07:13:44,847 WARNING [train.py:1204] (2/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_380-261863-0_sp1.1 from training. Duration: 0.963625 2023-10-03 07:13:46,340 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_372-173012-0_sp1.1 from training. Duration: 0.863625 2023-10-03 07:13:47,556 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.532e+02 1.844e+02 2.070e+02 2.334e+02 4.430e+02, threshold=4.140e+02, percent-clipped=1.0 2023-10-03 07:13:47,675 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_57-82205-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 07:13:47,724 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_3691_210_3528_1_1526468169_4062053_355-334432-0_sp0.9 from training. Duration: 0.9666875 2023-10-03 07:13:50,815 WARNING [train.py:1204] (2/4) Exclude cut with ID _19023_210_4353_1_1531378892552_5820780_28-36615-0 from training. Duration: 0.97 2023-10-03 07:13:50,837 WARNING [train.py:1204] (2/4) Exclude cut with ID _29853_210_7497_1_1532505600851_6642110_286-251333-0_sp1.1 from training. Duration: 0.92725 2023-10-03 07:13:52,761 WARNING [train.py:1204] (2/4) Exclude cut with ID _31602_210_5854_1_1532677021456_4458250_518-80679-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 07:13:52,815 WARNING [train.py:1204] (2/4) Exclude cut with ID _14922_210_6228_1_1530707435273_3693310_38-187851-0_sp0.9 from training. Duration: 0.688875 2023-10-03 07:14:02,559 WARNING [train.py:1204] (2/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_39-248870-0_sp1.1 from training. Duration: 0.62725 2023-10-03 07:14:03,835 WARNING [train.py:1204] (2/4) Exclude cut with ID _16908_210_5724_1_1531119157248_3772700_154-31455-0_sp1.1 from training. Duration: 0.97275 2023-10-03 07:14:05,259 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8453_210_4211_1_1528502940_8562730_347-330673-0 from training. Duration: 0.72 2023-10-03 07:14:07,925 WARNING [train.py:1204] (2/4) Exclude cut with ID _30191_210_1664_1_1533189915047_7196670_674-204247-0_sp1.1 from training. Duration: 0.97275 2023-10-03 07:14:09,328 WARNING [train.py:1204] (2/4) Exclude cut with ID _8833_210_5219_1_1528592665210_7553059_329-125156-0 from training. Duration: 0.82 2023-10-03 07:14:10,747 WARNING [train.py:1204] (2/4) Exclude cut with ID _4779_210_2084_1_1527318022944_3914893_500-285013-0_sp1.1 from training. Duration: 0.62725 2023-10-03 07:14:13,487 WARNING [train.py:1204] (2/4) Exclude cut with ID _14421_210_8750_1_1530604795825_3542290_190-313578-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 07:14:14,919 WARNING [train.py:1204] (2/4) Exclude cut with ID _35799_210_5854_1_1533263487887_3529071_538-273574-0_sp1.1 from training. Duration: 0.936375 2023-10-03 07:14:16,305 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_14928_210_5632_1_1530773461_4714000_454-112198-0_sp1.1 from training. Duration: 0.6 2023-10-03 07:14:16,349 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_468-143126-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 07:14:17,858 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_4528_210_3273_1_1527076654_3276777_258-121681-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 07:14:17,944 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_397-132565-0_sp1.1 from training. Duration: 0.9 2023-10-03 07:14:19,263 WARNING [train.py:1204] (2/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_454-123002-0_sp0.9 from training. Duration: 0.7666875 2023-10-03 07:14:22,571 WARNING [train.py:1204] (2/4) Exclude cut with ID _27052_210_5986_1_1532329197187_3585009_297-86890-0_sp1.1 from training. Duration: 0.936375 2023-10-03 07:14:25,637 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_51225_210_3601_1_1534400634_2327383_55-301844-0_sp1.1 from training. Duration: 0.8454375 2023-10-03 07:14:25,905 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.bypass_mid.scale_min, batch_count=1182533.3333333333, ans=0.2 2023-10-03 07:14:27,029 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6786_210_1388_1_1527839699_4194495_378-46070-0_sp0.9 from training. Duration: 0.8 2023-10-03 07:14:27,199 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.feed_forward3.hidden_balancer.prob, batch_count=1182533.3333333333, ans=0.125 2023-10-03 07:14:28,946 WARNING [train.py:1204] (2/4) Exclude cut with ID _42334_210_7770_1_1534206610396_3605880_55-19758-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 07:14:34,923 WARNING [train.py:1204] (2/4) Exclude cut with ID _14884_210_2252_1_1530707459689_5561250_114-201399-0_sp0.9 from training. Duration: 0.8444375 2023-10-03 07:14:37,874 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_99-251314-0_sp0.9 from training. Duration: 0.9666875 2023-10-03 07:14:39,244 WARNING [train.py:1204] (2/4) Exclude cut with ID _26528_210_9070_1_1532570410842_3851310_497-97218-0 from training. Duration: 0.86 2023-10-03 07:14:44,831 WARNING [train.py:1204] (2/4) Exclude cut with ID _55328_210_27767_1_1534580973260_4088170_230-317241-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 07:14:44,887 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_125-203105-0_sp0.9 from training. Duration: 0.888875 2023-10-03 07:14:47,698 WARNING [train.py:1204] (2/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_55-31904-0_sp1.1 from training. Duration: 0.8 2023-10-03 07:14:47,885 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.1.ff2_skip_rate, batch_count=1182600.0, ans=0.0 2023-10-03 07:14:49,119 WARNING [train.py:1204] (2/4) Exclude cut with ID _8064_210_5399_1_1528372299291_4282973_18-174647-0 from training. Duration: 0.91 2023-10-03 07:14:52,169 INFO [train.py:1046] (2/4) Epoch 34, batch 2100, loss[loss=0.1637, simple_loss=0.2358, pruned_loss=0.04578, over 23545.00 frames. ], tot_loss[loss=0.1615, simple_loss=0.2398, pruned_loss=0.0416, over 4688629.22 frames. ], batch size: 134, lr: 3.00e-03, grad_scale: 16.0 2023-10-03 07:14:54,101 WARNING [train.py:1204] (2/4) Exclude cut with ID IC0165W0005-81726 from training. Duration: 0.952 2023-10-03 07:14:54,101 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_31583_210_3852_1_1532595851_4024413_148-171847-0_sp1.1 from training. Duration: 0.963625 2023-10-03 07:14:54,149 WARNING [train.py:1204] (2/4) Exclude cut with ID _21989_210_5854_1_1531899214931_3271360_116-138769-0_sp1.1 from training. Duration: 0.936375 2023-10-03 07:14:54,196 WARNING [train.py:1204] (2/4) Exclude cut with ID _66070_210_7640_1_1535441393096_7805310_489-292631-0_sp1.1 from training. Duration: 0.7181875 2023-10-03 07:14:54,267 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_20120_210_6753_1_1531643035_5482859_202-27960-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 07:14:54,274 WARNING [train.py:1204] (2/4) Exclude cut with ID _5294_210_1747_1_1527249643928_3696179_342-270278-0 from training. Duration: 0.55 2023-10-03 07:14:56,005 WARNING [train.py:1204] (2/4) Exclude cut with ID _38323_210_12108_1_1533121233369_3303049_416-11919-0 from training. Duration: 0.96 2023-10-03 07:14:57,461 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6545_210_2040_1_1527774670_4245258_407-48070-0_sp0.9 from training. Duration: 0.8444375 2023-10-03 07:15:01,454 WARNING [train.py:1204] (2/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_503-30719-0_sp1.1 from training. Duration: 0.8909375 2023-10-03 07:15:02,571 WARNING [train.py:1204] (2/4) Exclude cut with ID _49643_210_7989_1_1534640244224_3939031_234-326497-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 07:15:04,013 WARNING [train.py:1204] (2/4) Exclude cut with ID _14170_210_5219_1_1530525949868_4469650_301-77873-0_sp1.1 from training. Duration: 0.963625 2023-10-03 07:15:04,058 WARNING [train.py:1204] (2/4) Exclude cut with ID _45297_210_2721_1_1533715351381_3264949_152-154697-0_sp1.1 from training. Duration: 0.97275 2023-10-03 07:15:05,359 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_3371_210_1794_1_1526384462_5580777_396-339812-0 from training. Duration: 0.85 2023-10-03 07:15:06,807 WARNING [train.py:1204] (2/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_399-156791-0_sp1.1 from training. Duration: 0.7545625 2023-10-03 07:15:06,859 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7696_210_2040_1_1528207167_3716099_117-125434-0 from training. Duration: 0.76 2023-10-03 07:15:06,864 WARNING [train.py:1204] (2/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_96-244299-0 from training. Duration: 0.88 2023-10-03 07:15:08,243 WARNING [train.py:1204] (2/4) Exclude cut with ID _37892_210_19596_1_1533198677897_3964058_519-343462-0_sp1.1 from training. Duration: 0.92725 2023-10-03 07:15:09,518 WARNING [train.py:1204] (2/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_202-183922-0_sp1.1 from training. Duration: 0.77275 2023-10-03 07:15:09,525 WARNING [train.py:1204] (2/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_453-133983-0 from training. Duration: 0.78 2023-10-03 07:15:09,564 WARNING [train.py:1204] (2/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_178-39722-0_sp1.1 from training. Duration: 0.4454375 2023-10-03 07:15:09,873 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.conv_skip_rate, batch_count=1182733.3333333333, ans=0.0 2023-10-03 07:15:14,962 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_15126_210_6753_1_1530778677_2591185_113-157651-0 from training. Duration: 0.68 2023-10-03 07:15:14,964 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_271-234962-0_sp1.1 from training. Duration: 0.7181875 2023-10-03 07:15:17,848 WARNING [train.py:1204] (2/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_195-64967-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 07:15:19,207 WARNING [train.py:1204] (2/4) Exclude cut with ID _43552_210_8913_1_1533726031059_3523429_365-215065-0_sp1.1 from training. Duration: 0.936375 2023-10-03 07:15:23,325 WARNING [train.py:1204] (2/4) Exclude cut with ID _31602_210_5854_1_1532677021456_4458250_249-96604-0_sp0.9 from training. Duration: 0.988875 2023-10-03 07:15:23,383 WARNING [train.py:1204] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_307-158462-0 from training. Duration: 0.69 2023-10-03 07:15:23,418 WARNING [train.py:1204] (2/4) Exclude cut with ID _30918_210_6400_1_1532754078600_3125920_407-223394-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 07:15:23,422 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6673_210_3273_1_1527767790_3173240_351-167954-0_sp0.9 from training. Duration: 0.5333125 2023-10-03 07:15:25,282 WARNING [train.py:1204] (2/4) Exclude cut with ID _4866_210_2577_1_1527404412284_4364659_256-306830-0 from training. Duration: 0.87 2023-10-03 07:15:25,325 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_17613_210_2151_1_1531137569_2366160_222-131118-0_sp1.1 from training. Duration: 0.963625 2023-10-03 07:15:25,347 WARNING [train.py:1204] (2/4) Exclude cut with ID _45560_210_21982_1_1533812603951_3584999_111-6376-0 from training. Duration: 0.84 2023-10-03 07:15:25,375 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_216-73841-0 from training. Duration: 0.96 2023-10-03 07:15:26,143 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.2.encoder.layers.0.feed_forward3.out_whiten, num_groups=1, num_channels=384, metric=13.68 vs. limit=15.0 2023-10-03 07:15:27,267 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_14058_210_4163_1_1530690953_6327180_49-210687-0 from training. Duration: 0.94 2023-10-03 07:15:28,737 WARNING [train.py:1204] (2/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_506-42614-0_sp0.9 from training. Duration: 0.87775 2023-10-03 07:15:30,695 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_3133_210_2087_1_1526106446_2324690_25-332138-0_sp1.1 from training. Duration: 0.82725 2023-10-03 07:15:33,935 WARNING [train.py:1204] (2/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_589-9070-0_sp1.1 from training. Duration: 0.6454375 2023-10-03 07:15:34,050 WARNING [train.py:1204] (2/4) Exclude cut with ID _6194_210_1534_1_1527511826160_3941194_14-282876-0_sp1.1 from training. Duration: 0.6454375 2023-10-03 07:15:35,449 WARNING [train.py:1204] (2/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_1166-90654-0_sp1.1 from training. Duration: 0.92725 2023-10-03 07:15:36,867 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_14056_210_4163_1_1530604483_7572827_218-255858-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 07:15:36,869 WARNING [train.py:1204] (2/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_1285-256622-0 from training. Duration: 0.84 2023-10-03 07:15:36,890 WARNING [train.py:1204] (2/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_205-286261-0_sp1.1 from training. Duration: 0.963625 2023-10-03 07:15:38,121 WARNING [train.py:1204] (2/4) Exclude cut with ID _68777_210_4353_1_1535810390731_3117904_6-154119-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 07:15:38,195 WARNING [train.py:1204] (2/4) Exclude cut with ID _22982_210_1883_1_1531882790447_5449409_565-199393-0_sp1.1 from training. Duration: 0.92725 2023-10-03 07:15:38,205 WARNING [train.py:1204] (2/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_278-154492-0 from training. Duration: 0.76 2023-10-03 07:15:39,598 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_426-313557-0 from training. Duration: 0.53 2023-10-03 07:15:39,750 INFO [scaling.py:1118] (2/4) WithLoss: name=encoder.encoders.1.encoder.layers.0.self_attn_weights, loss-sum=0.000e+00 2023-10-03 07:15:40,912 WARNING [train.py:1204] (2/4) Exclude cut with ID _41817_210_18835_1_1533434477160_7423049_555-293448-0 from training. Duration: 0.8 2023-10-03 07:15:43,729 WARNING [train.py:1204] (2/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_32-335103-0_sp1.1 from training. Duration: 0.7454375 2023-10-03 07:15:47,833 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_2655_210_2087_1_1525688959_3897400_202-147557-0_sp1.1 from training. Duration: 0.77275 2023-10-03 07:15:47,869 WARNING [train.py:1204] (2/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_158-250116-0 from training. Duration: 0.62 2023-10-03 07:15:52,590 WARNING [train.py:1204] (2/4) Exclude cut with ID _55429_210_10626_1_1534467588527_7174143_528-88083-0_sp1.1 from training. Duration: 0.963625 2023-10-03 07:15:55,884 WARNING [train.py:1204] (2/4) Exclude cut with ID _8030_210_7497_1_1528444886781_3531089_32-31837-0_sp1.1 from training. Duration: 0.8818125 2023-10-03 07:15:55,907 WARNING [train.py:1204] (2/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_744-86168-0_sp1.1 from training. Duration: 0.97275 2023-10-03 07:15:55,915 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1113-7369-0_sp0.9 from training. Duration: 0.9444375 2023-10-03 07:15:55,940 WARNING [train.py:1204] (2/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_124-345840-0_sp0.9 from training. Duration: 0.57775 2023-10-03 07:15:55,988 WARNING [train.py:1204] (2/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_328-264776-0_sp0.9 from training. Duration: 0.7444375 2023-10-03 07:15:57,358 WARNING [train.py:1204] (2/4) Exclude cut with ID _18895_210_6228_1_1531357274234_3784930_469-166615-0_sp1.1 from training. Duration: 0.963625 2023-10-03 07:15:57,365 WARNING [train.py:1204] (2/4) Exclude cut with ID _8395_210_1531_1_1528597871523_4411579_431-132470-0_sp0.9 from training. Duration: 0.82225 2023-10-03 07:15:59,344 WARNING [train.py:1204] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_554-45617-0_sp1.1 from training. Duration: 0.9 2023-10-03 07:15:59,374 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_35361_210_4238_1_1533262810_7793476_524-188242-0_sp1.1 from training. Duration: 0.92725 2023-10-03 07:16:01,446 WARNING [train.py:1204] (2/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_938-205135-0 from training. Duration: 0.87 2023-10-03 07:16:02,569 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_5931_210_2040_1_1527428481_4547999_321-193614-0 from training. Duration: 0.67 2023-10-03 07:16:02,581 WARNING [train.py:1204] (2/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_445-269521-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 07:16:05,342 WARNING [train.py:1204] (2/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_3-33116-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 07:16:05,344 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_4633_210_2040_1_1526824163_4145343_416-76117-0_sp1.1 from training. Duration: 0.863625 2023-10-03 07:16:06,356 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.0.layers.0.nonlin_attention.whiten1, num_groups=1, num_channels=144, metric=9.36 vs. limit=10.0 2023-10-03 07:16:06,646 WARNING [train.py:1204] (2/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_1033-274376-0_sp1.1 from training. Duration: 0.7909375 2023-10-03 07:16:06,684 WARNING [train.py:1204] (2/4) Exclude cut with ID _8772_210_1850_1_1528635595972_3664011_398-320049-0_sp0.9 from training. Duration: 0.97775 2023-10-03 07:16:08,079 INFO [train.py:1046] (2/4) Epoch 34, batch 2150, loss[loss=0.1601, simple_loss=0.2401, pruned_loss=0.04002, over 23613.00 frames. ], tot_loss[loss=0.1607, simple_loss=0.2392, pruned_loss=0.0411, over 4690135.61 frames. ], batch size: 149, lr: 3.00e-03, grad_scale: 16.0 2023-10-03 07:16:13,594 WARNING [train.py:1204] (2/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_321-215742-0_sp0.9 from training. Duration: 0.5555625 2023-10-03 07:16:14,985 WARNING [train.py:1204] (2/4) Exclude cut with ID _28394_210_8913_1_1532430049518_3650118_272-190950-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 07:16:16,362 WARNING [train.py:1204] (2/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_31-299371-0_sp1.1 from training. Duration: 0.92725 2023-10-03 07:16:16,672 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.feed_forward3.hidden_balancer.prob, batch_count=1183000.0, ans=0.125 2023-10-03 07:16:17,655 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.613e+02 1.865e+02 1.979e+02 2.220e+02 3.230e+02, threshold=3.959e+02, percent-clipped=0.0 2023-10-03 07:16:17,776 WARNING [train.py:1204] (2/4) Exclude cut with ID _55383_210_10635_1_1534468426172_2231669_134-128246-0_sp0.9 from training. Duration: 0.988875 2023-10-03 07:16:17,778 WARNING [train.py:1204] (2/4) Exclude cut with ID _40519_210_13794_1_1533715228655_7337390_887-9547-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 07:16:19,105 WARNING [train.py:1204] (2/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_310-109461-0_sp0.9 from training. Duration: 0.888875 2023-10-03 07:16:21,918 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_2009_210_2071_1_1525481134_4588937_258-250196-0_sp1.1 from training. Duration: 0.92725 2023-10-03 07:16:21,971 WARNING [train.py:1204] (2/4) Exclude cut with ID _9235_210_5219_1_1528891154253_8046120_1186-72702-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 07:16:21,973 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_14831_210_7927_1_1530700200_5583610_181-27467-0_sp1.1 from training. Duration: 0.82725 2023-10-03 07:16:26,449 WARNING [train.py:1204] (2/4) Exclude cut with ID _40402_210_18219_1_1533522324115_2953150_278-132109-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 07:16:26,467 WARNING [train.py:1204] (2/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_261-297503-0 from training. Duration: 0.99 2023-10-03 07:16:29,888 WARNING [train.py:1204] (2/4) Exclude cut with ID _19896_210_1883_1_1531709022662_6649569_313-265903-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 07:16:31,925 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_105-114797-0_sp1.1 from training. Duration: 0.763625 2023-10-03 07:16:32,629 WARNING [train.py:1204] (2/4) Exclude cut with ID _53755_210_8254_1_1534577312941_4707360_9-65843-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 07:16:33,870 WARNING [train.py:1204] (2/4) Exclude cut with ID _18516_210_9206_1_1531704653206_3692406_361-224549-0_sp1.1 from training. Duration: 0.97275 2023-10-03 07:16:33,898 WARNING [train.py:1204] (2/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_403-310975-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 07:16:35,265 WARNING [train.py:1204] (2/4) Exclude cut with ID _5444_210_2151_1_1527418808464_5312210_581-315411-0_sp0.9 from training. Duration: 0.688875 2023-10-03 07:16:35,322 WARNING [train.py:1204] (2/4) Exclude cut with ID _23825_210_13449_1_1531915313526_3497150_464-154904-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 07:16:36,612 WARNING [train.py:1204] (2/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_937-208281-0_sp0.9 from training. Duration: 0.9444375 2023-10-03 07:16:37,879 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_37839_210_7234_1_1533542462_3689303_158-181212-0_sp1.1 from training. Duration: 0.963625 2023-10-03 07:16:37,978 WARNING [train.py:1204] (2/4) Exclude cut with ID _8772_210_1850_1_1528635595972_3664011_398-320049-0 from training. Duration: 0.88 2023-10-03 07:16:40,559 WARNING [train.py:1204] (2/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_718-250907-0_sp1.1 from training. Duration: 0.62725 2023-10-03 07:16:41,915 WARNING [train.py:1204] (2/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_641-126395-0_sp1.1 from training. Duration: 0.92725 2023-10-03 07:16:43,239 WARNING [train.py:1204] (2/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_166-241493-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 07:16:43,326 WARNING [train.py:1204] (2/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_526-62079-0_sp0.9 from training. Duration: 0.7444375 2023-10-03 07:16:44,706 WARNING [train.py:1204] (2/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_439-275083-0_sp1.1 from training. Duration: 0.936375 2023-10-03 07:16:47,764 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_66907_210_21581_1_1535615661_3590177_493-229032-0_sp1.1 from training. Duration: 0.92725 2023-10-03 07:16:47,792 WARNING [train.py:1204] (2/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_89-321955-0_sp1.1 from training. Duration: 0.72725 2023-10-03 07:16:49,155 WARNING [train.py:1204] (2/4) Exclude cut with ID _29857_210_20034_1_1532395886753_3798160_273-226195-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 07:16:49,162 WARNING [train.py:1204] (2/4) Exclude cut with ID _22826_210_9994_1_1531908201522_3279169_404-75889-0 from training. Duration: 0.89 2023-10-03 07:16:49,211 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_271-234962-0_sp0.9 from training. Duration: 0.87775 2023-10-03 07:16:52,024 WARNING [train.py:1204] (2/4) Exclude cut with ID _22961_210_8254_1_1531976366672_4278180_156-339685-0_sp1.1 from training. Duration: 0.97275 2023-10-03 07:16:52,069 WARNING [train.py:1204] (2/4) Exclude cut with ID _37892_210_19596_1_1533198677897_3964058_425-231927-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 07:16:53,882 WARNING [train.py:1204] (2/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_102-288670-0_sp1.1 from training. Duration: 0.97275 2023-10-03 07:16:55,192 WARNING [train.py:1204] (2/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_283-220372-0_sp1.1 from training. Duration: 0.5090625 2023-10-03 07:16:56,517 WARNING [train.py:1204] (2/4) Exclude cut with ID _44304_210_15374_1_1533639352765_4613129_86-1699-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 07:16:56,594 WARNING [train.py:1204] (2/4) Exclude cut with ID _2891_210_2550_1_1525874398521_4427710_327-121590-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 07:16:56,600 WARNING [train.py:1204] (2/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_369-222060-0 from training. Duration: 0.71 2023-10-03 07:16:59,352 WARNING [train.py:1204] (2/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_762-109886-0 from training. Duration: 0.91 2023-10-03 07:16:59,379 WARNING [train.py:1204] (2/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_477-255782-0_sp0.9 from training. Duration: 0.911125 2023-10-03 07:17:00,657 WARNING [train.py:1204] (2/4) Exclude cut with ID IC0669W0061-330906 from training. Duration: 0.768 2023-10-03 07:17:00,719 WARNING [train.py:1204] (2/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_347-213071-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 07:17:00,746 WARNING [train.py:1204] (2/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_507-303370-0_sp1.1 from training. Duration: 0.87275 2023-10-03 07:17:02,679 WARNING [train.py:1204] (2/4) Exclude cut with ID _15012_210_1968_1_1530856820429_5095350_688-178849-0 from training. Duration: 0.64 2023-10-03 07:17:02,684 WARNING [train.py:1204] (2/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_28-70987-0_sp0.9 from training. Duration: 0.9666875 2023-10-03 07:17:02,686 WARNING [train.py:1204] (2/4) Exclude cut with ID _5910_210_1389_1_1527337972454_5050100_555-159815-0 from training. Duration: 0.98 2023-10-03 07:17:02,710 WARNING [train.py:1204] (2/4) Exclude cut with ID IC0679W0064-335871 from training. Duration: 0.768 2023-10-03 07:17:02,711 WARNING [train.py:1204] (2/4) Exclude cut with ID IC0679W0079-335886 from training. Duration: 0.896 2023-10-03 07:17:04,179 WARNING [train.py:1204] (2/4) Exclude cut with ID _5608_210_2106_1_1527415439089_6819768_909-82767-0 from training. Duration: 0.61 2023-10-03 07:17:06,207 WARNING [train.py:1204] (2/4) Exclude cut with ID _23038_210_9570_1_1532001584158_3652419_411-323967-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 07:17:06,264 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_40896_210_15710_1_1533433956_7100802_350-231748-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 07:17:06,274 WARNING [train.py:1204] (2/4) Exclude cut with ID _5289_210_2084_1_1527678202736_3779299_142-103317-0_sp0.9 from training. Duration: 0.9333125 2023-10-03 07:17:07,663 WARNING [train.py:1204] (2/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_314-144055-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 07:17:07,736 WARNING [train.py:1204] (2/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_177-236809-0_sp0.9 from training. Duration: 0.5444375 2023-10-03 07:17:10,339 WARNING [train.py:1204] (2/4) Exclude cut with ID _23038_210_9570_1_1532001584158_3652419_82-334733-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 07:17:10,348 WARNING [train.py:1204] (2/4) Exclude cut with ID _43552_210_8913_1_1533726031059_3523429_225-22526-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 07:17:17,375 WARNING [train.py:1204] (2/4) Exclude cut with ID _30327_210_16049_1_1532484161295_3924169_401-70819-0_sp1.1 from training. Duration: 0.7818125 2023-10-03 07:17:18,642 WARNING [train.py:1204] (2/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_538-32291-0 from training. Duration: 0.83 2023-10-03 07:17:21,358 INFO [train.py:1046] (2/4) Epoch 34, batch 2200, loss[loss=0.1789, simple_loss=0.2442, pruned_loss=0.05679, over 23907.00 frames. ], tot_loss[loss=0.1613, simple_loss=0.2402, pruned_loss=0.04121, over 4700817.37 frames. ], batch size: 195, lr: 3.00e-03, grad_scale: 16.0 2023-10-03 07:17:22,831 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_37357_210_17024_1_1533089504_2316121_258-211605-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 07:17:27,937 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.5.encoder.layers.0.feed_forward1.out_whiten, num_groups=1, num_channels=256, metric=6.16 vs. limit=15.0 2023-10-03 07:17:28,698 WARNING [train.py:1204] (2/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_550-335198-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 07:17:28,766 WARNING [train.py:1204] (2/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_98-741-0_sp0.9 from training. Duration: 0.888875 2023-10-03 07:17:30,066 WARNING [train.py:1204] (2/4) Exclude cut with ID _38323_210_12108_1_1533121233369_3303049_251-238988-0_sp1.1 from training. Duration: 0.92725 2023-10-03 07:17:30,149 WARNING [train.py:1204] (2/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_259-34038-0_sp0.9 from training. Duration: 0.82225 2023-10-03 07:17:30,389 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.balancer2.prob, batch_count=1183333.3333333333, ans=0.125 2023-10-03 07:17:31,948 WARNING [train.py:1204] (2/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_1060-180633-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 07:17:31,993 WARNING [train.py:1204] (2/4) Exclude cut with ID _8755_210_1876_1_1528628559357_3258209_211-37473-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 07:17:32,005 WARNING [train.py:1204] (2/4) Exclude cut with ID _4122_210_2084_1_1526713164417_4353280_528-206771-0 from training. Duration: 0.88 2023-10-03 07:17:38,136 WARNING [train.py:1204] (2/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_64-248359-0 from training. Duration: 0.69 2023-10-03 07:17:39,778 WARNING [train.py:1204] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_402-253027-0_sp1.1 from training. Duration: 0.4909375 2023-10-03 07:17:45,341 WARNING [train.py:1204] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_625-102955-0 from training. Duration: 0.77 2023-10-03 07:17:46,806 WARNING [train.py:1204] (2/4) Exclude cut with ID _14998_210_5219_1_1530705582080_7474970_611-219324-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 07:17:46,886 WARNING [train.py:1204] (2/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_718-68505-0_sp0.9 from training. Duration: 0.988875 2023-10-03 07:17:48,268 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_353-182855-0_sp1.1 from training. Duration: 0.87275 2023-10-03 07:17:51,092 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_5960_210_1794_1_1527403865_3990999_515-2613-0_sp0.9 from training. Duration: 0.97775 2023-10-03 07:17:51,111 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_5575_210_1410_1_1527301305_2570170_342-265572-0 from training. Duration: 0.94 2023-10-03 07:17:55,769 WARNING [train.py:1204] (2/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_329-113173-0_sp0.9 from training. Duration: 0.688875 2023-10-03 07:17:55,862 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_15396_210_6890_1_1531308473_1938936_213-312506-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 07:17:57,207 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_521-214276-0_sp1.1 from training. Duration: 0.4 2023-10-03 07:18:00,589 WARNING [train.py:1204] (2/4) Exclude cut with ID _15026_210_8750_1_1530777602316_3291850_211-195043-0_sp1.1 from training. Duration: 0.72725 2023-10-03 07:18:01,999 WARNING [train.py:1204] (2/4) Exclude cut with ID _29343_210_4381_1_1532602757752_3674109_108-257494-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 07:18:03,930 WARNING [train.py:1204] (2/4) Exclude cut with ID _64414_210_17301_1_1535243252048_3757759_403-58151-0_sp1.1 from training. Duration: 0.963625 2023-10-03 07:18:05,371 WARNING [train.py:1204] (2/4) Exclude cut with ID _36512_210_6987_1_1533033143998_3126069_109-177787-0_sp1.1 from training. Duration: 0.92725 2023-10-03 07:18:08,572 WARNING [train.py:1204] (2/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_498-40153-0 from training. Duration: 0.63 2023-10-03 07:18:08,646 WARNING [train.py:1204] (2/4) Exclude cut with ID _56447_210_1389_1_1535074245910_3697189_161-334523-0_sp1.1 from training. Duration: 0.936375 2023-10-03 07:18:08,791 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.conv_skip_rate, batch_count=1183533.3333333333, ans=0.0 2023-10-03 07:18:08,871 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.feed_forward3.hidden_balancer.prob, batch_count=1183533.3333333333, ans=0.125 2023-10-03 07:18:11,354 WARNING [train.py:1204] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_238-245655-0 from training. Duration: 0.81 2023-10-03 07:18:13,926 WARNING [train.py:1204] (2/4) Exclude cut with ID _64631_210_27767_1_1535349580068_4165160_108-43865-0_sp1.1 from training. Duration: 0.92725 2023-10-03 07:18:13,932 WARNING [train.py:1204] (2/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_243-42116-0_sp1.1 from training. Duration: 0.663625 2023-10-03 07:18:13,963 WARNING [train.py:1204] (2/4) Exclude cut with ID _66868_210_9527_1_1535509287954_4561140_594-132160-0_sp1.1 from training. Duration: 0.92725 2023-10-03 07:18:16,630 WARNING [train.py:1204] (2/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_48-338976-0_sp0.9 from training. Duration: 0.988875 2023-10-03 07:18:16,664 WARNING [train.py:1204] (2/4) Exclude cut with ID _5250_210_5219_1_1527249561806_8692110_137-100595-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 07:18:16,677 WARNING [train.py:1204] (2/4) Exclude cut with ID _49748_210_24116_1_1533986971737_3158910_12-271669-0_sp1.1 from training. Duration: 0.936375 2023-10-03 07:18:17,210 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.4.encoder.layers.2.self_attn2.whiten, num_groups=1, num_channels=384, metric=11.58 vs. limit=22.5 2023-10-03 07:18:18,078 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_37357_210_17024_1_1533089504_2316121_206-80421-0_sp1.1 from training. Duration: 0.936375 2023-10-03 07:18:19,452 WARNING [train.py:1204] (2/4) Exclude cut with ID _7537_210_2084_1_1528502395431_3924359_113-199933-0_sp1.1 from training. Duration: 0.536375 2023-10-03 07:18:19,501 WARNING [train.py:1204] (2/4) Exclude cut with ID _55440_210_12651_1_1534496713364_2980520_31-59289-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 07:18:22,150 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1083-220122-0_sp0.9 from training. Duration: 0.5666875 2023-10-03 07:18:23,597 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_426-313557-0_sp1.1 from training. Duration: 0.4818125 2023-10-03 07:18:24,923 WARNING [train.py:1204] (2/4) Exclude cut with ID _35799_210_5854_1_1533263487887_3529071_324-94843-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 07:18:26,389 WARNING [train.py:1204] (2/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_264-108685-0_sp1.1 from training. Duration: 0.5 2023-10-03 07:18:28,248 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9012W0006-501141 from training. Duration: 0.896 2023-10-03 07:18:29,709 WARNING [train.py:1204] (2/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_890-5117-0_sp0.9 from training. Duration: 0.8666875 2023-10-03 07:18:29,752 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9023W0008-506301 from training. Duration: 0.896 2023-10-03 07:18:31,115 WARNING [train.py:1204] (2/4) Exclude cut with ID _13916_210_9994_1_1530788341126_3763149_60-191556-0_sp0.9 from training. Duration: 0.67775 2023-10-03 07:18:31,164 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9029W0331-509721 from training. Duration: 0.896 2023-10-03 07:18:34,486 WARNING [train.py:1204] (2/4) Exclude cut with ID _35823_210_6917_1_1533187881914_7297830_567-108596-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 07:18:35,751 INFO [train.py:1046] (2/4) Epoch 34, batch 2250, loss[loss=0.1517, simple_loss=0.2311, pruned_loss=0.03617, over 24350.00 frames. ], tot_loss[loss=0.1618, simple_loss=0.2409, pruned_loss=0.0413, over 4701286.08 frames. ], batch size: 56, lr: 3.00e-03, grad_scale: 16.0 2023-10-03 07:18:36,439 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7543_210_6753_1_1528544010_1110402_25-183860-0_sp1.1 from training. Duration: 0.42725 2023-10-03 07:18:36,557 WARNING [train.py:1204] (2/4) Exclude cut with ID _47278_210_12041_1_1533902418349_3576110_495-260035-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 07:18:36,696 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.conv_skip_rate, batch_count=1183666.6666666667, ans=0.0 2023-10-03 07:18:39,651 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9051W0015-520281 from training. Duration: 0.9045625 2023-10-03 07:18:41,019 WARNING [train.py:1204] (2/4) Exclude cut with ID _14459_210_2228_1_1531443641946_7516700_707-66193-0_sp1.1 from training. Duration: 0.97275 2023-10-03 07:18:42,452 WARNING [train.py:1204] (2/4) Exclude cut with ID _14087_210_2253_1_1530529224512_3014029_219-224445-0_sp1.1 from training. Duration: 0.763625 2023-10-03 07:18:45,452 WARNING [train.py:1204] (2/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_345-167986-0_sp0.9 from training. Duration: 0.9333125 2023-10-03 07:18:46,635 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.504e+02 1.861e+02 2.051e+02 2.381e+02 3.141e+02, threshold=4.103e+02, percent-clipped=0.0 2023-10-03 07:18:47,984 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_34815_210_6388_1_1533381320_3571903_279-305565-0_sp1.1 from training. Duration: 0.836375 2023-10-03 07:18:48,180 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=1183666.6666666667, ans=0.1 2023-10-03 07:18:50,720 WARNING [train.py:1204] (2/4) Exclude cut with ID _21987_210_5854_1_1531821500482_3637038_317-179212-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 07:18:51,953 WARNING [train.py:1204] (2/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_395-37222-0_sp1.1 from training. Duration: 0.6909375 2023-10-03 07:18:53,287 WARNING [train.py:1204] (2/4) Exclude cut with ID _8064_210_5399_1_1528372299291_4282973_168-50337-0_sp1.1 from training. Duration: 0.763625 2023-10-03 07:18:54,898 WARNING [train.py:1204] (2/4) Exclude cut with ID _60113_210_16271_1_1535164212481_4386079_559-167191-0 from training. Duration: 0.93 2023-10-03 07:18:54,908 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_47228_210_4983_1_1533780021_3672027_344-179642-0_sp1.1 from training. Duration: 0.92725 2023-10-03 07:18:54,934 WARNING [train.py:1204] (2/4) Exclude cut with ID _22961_210_8254_1_1531976366672_4278180_317-199546-0_sp0.9 from training. Duration: 0.9555625 2023-10-03 07:18:57,781 WARNING [train.py:1204] (2/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_141-89727-0 from training. Duration: 0.71 2023-10-03 07:18:57,859 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_78-116075-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 07:18:57,880 WARNING [train.py:1204] (2/4) Exclude cut with ID _64086_210_7994_1_1535270417019_7435949_52-235756-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 07:18:58,121 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.bypass.scale_min, batch_count=1183733.3333333333, ans=0.2 2023-10-03 07:18:59,310 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7615_210_4211_1_1528110965_4580160_72-235211-0_sp1.1 from training. Duration: 0.6909375 2023-10-03 07:19:07,674 WARNING [train.py:1204] (2/4) Exclude cut with ID _14069_210_5632_1_1530512692033_4803390_287-27950-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 07:19:09,062 WARNING [train.py:1204] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_74-48717-0_sp0.9 from training. Duration: 0.6444375 2023-10-03 07:19:09,092 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7544_210_6753_1_1528591327_1471426_172-348777-0_sp1.1 from training. Duration: 0.6 2023-10-03 07:19:09,322 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.bypass_mid.scale_min, batch_count=1183800.0, ans=0.2 2023-10-03 07:19:11,021 WARNING [train.py:1204] (2/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_220-191080-0 from training. Duration: 0.98 2023-10-03 07:19:12,396 WARNING [train.py:1204] (2/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_1081-137641-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 07:19:13,789 WARNING [train.py:1204] (2/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_278-235459-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 07:19:18,103 WARNING [train.py:1204] (2/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_1181-77470-0_sp1.1 from training. Duration: 0.936375 2023-10-03 07:19:19,607 WARNING [train.py:1204] (2/4) Exclude cut with ID _60860_210_8254_1_1534896015875_3557169_198-5531-0_sp1.1 from training. Duration: 0.936375 2023-10-03 07:19:21,000 WARNING [train.py:1204] (2/4) Exclude cut with ID _12369_210_4381_1_1530356078581_4388520_17-289781-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 07:19:21,005 WARNING [train.py:1204] (2/4) Exclude cut with ID _50310_210_15488_1_1534068043345_3641080_146-199752-0_sp1.1 from training. Duration: 0.92725 2023-10-03 07:19:22,410 WARNING [train.py:1204] (2/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_969-213354-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 07:19:25,019 WARNING [train.py:1204] (2/4) Exclude cut with ID _70519_210_4163_1_1535961595994_7458230_66-344156-0_sp1.1 from training. Duration: 0.963625 2023-10-03 07:19:29,084 WARNING [train.py:1204] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_380-204756-0_sp1.1 from training. Duration: 0.7818125 2023-10-03 07:19:30,628 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_128-202104-0_sp0.9 from training. Duration: 0.7 2023-10-03 07:19:35,260 WARNING [train.py:1204] (2/4) Exclude cut with ID _4697_210_2069_1_1526815801953_3823098_51-1049-0_sp0.9 from training. Duration: 0.8333125 2023-10-03 07:19:37,070 WARNING [train.py:1204] (2/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_86-66192-0_sp1.1 from training. Duration: 0.5 2023-10-03 07:19:37,118 WARNING [train.py:1204] (2/4) Exclude cut with ID _14069_210_5632_1_1530512692033_4803390_95-1896-0_sp1.1 from training. Duration: 0.8909375 2023-10-03 07:19:41,196 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.attention_skip_rate, batch_count=1183933.3333333333, ans=0.0 2023-10-03 07:19:42,329 WARNING [train.py:1204] (2/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_29-293896-0_sp1.1 from training. Duration: 0.5545625 2023-10-03 07:19:43,766 WARNING [train.py:1204] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_167-137255-0_sp0.9 from training. Duration: 0.711125 2023-10-03 07:19:43,767 WARNING [train.py:1204] (2/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_217-95056-0 from training. Duration: 0.98 2023-10-03 07:19:45,028 WARNING [train.py:1204] (2/4) Exclude cut with ID _47278_210_12041_1_1533902418349_3576110_97-266746-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 07:19:45,075 WARNING [train.py:1204] (2/4) Exclude cut with ID _14224_210_4381_1_1530529062037_4264010_380-343670-0_sp1.1 from training. Duration: 0.8 2023-10-03 07:19:47,818 WARNING [train.py:1204] (2/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_107-332398-0 from training. Duration: 0.78 2023-10-03 07:19:51,017 INFO [train.py:1046] (2/4) Epoch 34, batch 2300, loss[loss=0.166, simple_loss=0.2463, pruned_loss=0.04281, over 23339.00 frames. ], tot_loss[loss=0.1624, simple_loss=0.2416, pruned_loss=0.04162, over 4691631.32 frames. ], batch size: 105, lr: 3.00e-03, grad_scale: 16.0 2023-10-03 07:19:51,068 WARNING [train.py:1204] (2/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_91-267696-0_sp1.1 from training. Duration: 0.7909375 2023-10-03 07:19:51,103 WARNING [train.py:1204] (2/4) Exclude cut with ID _41298_210_9019_1_1533430371450_4822143_498-186993-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 07:19:56,807 WARNING [train.py:1204] (2/4) Exclude cut with ID _17956_210_4353_1_1531209568125_6979970_54-77610-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 07:19:58,108 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_40047_210_2290_1_1533290519_2374311_257-335162-0_sp1.1 from training. Duration: 0.863625 2023-10-03 07:19:59,526 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9653W0193-675471 from training. Duration: 0.9656875 2023-10-03 07:20:00,943 WARNING [train.py:1204] (2/4) Exclude cut with ID _25248_210_8750_1_1532062825941_5554180_243-240536-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 07:20:09,335 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_32360_210_4198_1_1532689160_3648436_60-51908-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 07:20:09,351 WARNING [train.py:1204] (2/4) Exclude cut with ID _4092_210_2151_1_1526643044449_3135759_227-213972-0_sp0.9 from training. Duration: 0.6 2023-10-03 07:20:09,386 WARNING [train.py:1204] (2/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_392-173500-0_sp1.1 from training. Duration: 0.97275 2023-10-03 07:20:10,743 WARNING [train.py:1204] (2/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_71-329150-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 07:20:10,745 WARNING [train.py:1204] (2/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_866-173440-0 from training. Duration: 0.9 2023-10-03 07:20:10,829 WARNING [train.py:1204] (2/4) Exclude cut with ID _14832_210_8750_1_1530691351539_3171348_1-54315-0_sp0.9 from training. Duration: 0.9666875 2023-10-03 07:20:14,209 WARNING [train.py:1204] (2/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_430-116139-0_sp1.1 from training. Duration: 0.77275 2023-10-03 07:20:14,235 WARNING [train.py:1204] (2/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_356-45054-0_sp0.9 from training. Duration: 0.9555625 2023-10-03 07:20:17,064 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_3531_210_3528_1_1526294203_5216456_309-227302-0_sp0.9 from training. Duration: 0.7666875 2023-10-03 07:20:18,554 WARNING [train.py:1204] (2/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_301-330455-0_sp1.1 from training. Duration: 0.72725 2023-10-03 07:20:22,663 WARNING [train.py:1204] (2/4) Exclude cut with ID _23557_210_8750_1_1531890106370_6846255_667-145945-0_sp1.1 from training. Duration: 0.92725 2023-10-03 07:20:25,476 WARNING [train.py:1204] (2/4) Exclude cut with ID _14860_210_4915_1_1530702020340_7343240_836-328489-0_sp1.1 from training. Duration: 0.8181875 2023-10-03 07:20:26,790 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_37152_210_9697_1_1533106573_3844188_416-333249-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 07:20:28,473 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=1184133.3333333333, ans=0.1 2023-10-03 07:20:29,661 WARNING [train.py:1204] (2/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_538-32291-0_sp0.9 from training. Duration: 0.92225 2023-10-03 07:20:29,945 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.nonlin_attention.balancer.max_positive, batch_count=1184133.3333333333, ans=0.95 2023-10-03 07:20:32,411 WARNING [train.py:1204] (2/4) Exclude cut with ID _32704_210_8954_1_1533017488840_6747370_965-176824-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 07:20:34,325 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.5.encoder.layers.1.whiten, num_groups=1, num_channels=256, metric=2.03 vs. limit=12.0 2023-10-03 07:20:36,459 WARNING [train.py:1204] (2/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_86-19601-0_sp1.1 from training. Duration: 0.77275 2023-10-03 07:20:37,583 WARNING [train.py:1204] (2/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_436-318113-0_sp1.1 from training. Duration: 0.6818125 2023-10-03 07:20:37,611 WARNING [train.py:1204] (2/4) Exclude cut with ID _41817_210_18835_1_1533434477160_7423049_555-293448-0_sp0.9 from training. Duration: 0.888875 2023-10-03 07:20:37,622 WARNING [train.py:1204] (2/4) Exclude cut with ID _4697_210_2069_1_1526815801953_3823098_68-219807-0 from training. Duration: 0.47 2023-10-03 07:20:41,841 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7544_210_6753_1_1528591327_1471426_33-346031-0_sp1.1 from training. Duration: 0.5545625 2023-10-03 07:20:41,843 WARNING [train.py:1204] (2/4) Exclude cut with ID _49748_210_24116_1_1533986971737_3158910_263-52113-0_sp1.1 from training. Duration: 0.97275 2023-10-03 07:20:41,917 WARNING [train.py:1204] (2/4) Exclude cut with ID _64639_210_5816_1_1535367619437_3528000_340-24580-0_sp1.1 from training. Duration: 0.92725 2023-10-03 07:20:42,135 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.conv_skip_rate, batch_count=1184200.0, ans=0.0 2023-10-03 07:20:43,590 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_47228_210_4983_1_1533780021_3672027_479-114941-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 07:20:43,624 WARNING [train.py:1204] (2/4) Exclude cut with ID _44585_210_18628_1_1533605350387_4518199_506-119659-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 07:20:43,702 WARNING [train.py:1204] (2/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_314-126819-0_sp1.1 from training. Duration: 0.4545625 2023-10-03 07:20:43,706 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_2541_210_1794_1_1525590269_3084765_156-173713-0_sp0.9 from training. Duration: 0.9 2023-10-03 07:20:45,080 WARNING [train.py:1204] (2/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_108-255430-0 from training. Duration: 0.97 2023-10-03 07:20:45,087 WARNING [train.py:1204] (2/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_791-45149-0_sp1.1 from training. Duration: 0.7818125 2023-10-03 07:20:45,093 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_53378_210_17282_1_1534227054_1261425_10-118295-0_sp1.1 from training. Duration: 0.97275 2023-10-03 07:20:45,144 WARNING [train.py:1204] (2/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_148-158509-0 from training. Duration: 0.41 2023-10-03 07:20:49,456 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_34726_210_4525_1_1533019474_5030395_687-291305-0_sp1.1 from training. Duration: 0.936375 2023-10-03 07:20:52,330 WARNING [train.py:1204] (2/4) Exclude cut with ID _14860_210_4915_1_1530702020340_7343240_264-324537-0_sp1.1 from training. Duration: 0.963625 2023-10-03 07:20:56,369 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_41251_210_15368_1_1533429767_4820695_591-46386-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 07:20:57,673 WARNING [train.py:1204] (2/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_86-19601-0_sp0.9 from training. Duration: 0.9444375 2023-10-03 07:20:57,700 WARNING [train.py:1204] (2/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_401-255491-0_sp0.9 from training. Duration: 0.77775 2023-10-03 07:21:00,384 WARNING [train.py:1204] (2/4) Exclude cut with ID _8696_210_1876_1_1528545632440_3345058_209-54069-0_sp1.1 from training. Duration: 0.6090625 2023-10-03 07:21:00,394 WARNING [train.py:1204] (2/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_369-325184-0_sp1.1 from training. Duration: 0.9 2023-10-03 07:21:00,461 WARNING [train.py:1204] (2/4) Exclude cut with ID _14687_210_5854_1_1530703838259_3664889_115-239046-0_sp0.9 from training. Duration: 0.7444375 2023-10-03 07:21:01,800 WARNING [train.py:1204] (2/4) Exclude cut with ID _8064_210_5399_1_1528372299291_4282973_168-50337-0 from training. Duration: 0.84 2023-10-03 07:21:04,714 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.0.layers.1.self_attn1.whiten, num_groups=1, num_channels=192, metric=9.78 vs. limit=22.5 2023-10-03 07:21:05,069 INFO [train.py:1046] (2/4) Epoch 34, batch 2350, loss[loss=0.1661, simple_loss=0.2323, pruned_loss=0.04993, over 22673.00 frames. ], tot_loss[loss=0.1643, simple_loss=0.2429, pruned_loss=0.04287, over 4674292.82 frames. ], batch size: 322, lr: 3.00e-03, grad_scale: 16.0 2023-10-03 07:21:08,525 WARNING [train.py:1204] (2/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_909-64151-0_sp1.1 from training. Duration: 0.8545625 2023-10-03 07:21:08,535 WARNING [train.py:1204] (2/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_597-154640-0 from training. Duration: 0.9 2023-10-03 07:21:13,728 WARNING [train.py:1204] (2/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_126-274694-0 from training. Duration: 0.98 2023-10-03 07:21:16,332 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.488e+02 1.838e+02 1.985e+02 2.277e+02 3.152e+02, threshold=3.970e+02, percent-clipped=0.0 2023-10-03 07:21:17,735 WARNING [train.py:1204] (2/4) Exclude cut with ID _21783_210_11357_1_1531742458730_3762679_344-269786-0_sp1.1 from training. Duration: 0.97275 2023-10-03 07:21:20,616 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_37818_210_4238_1_1533620910_4720083_322-253154-0_sp1.1 from training. Duration: 0.92725 2023-10-03 07:21:20,619 WARNING [train.py:1204] (2/4) Exclude cut with ID _58895_210_8750_1_1535184081320_3649089_221-15823-0_sp1.1 from training. Duration: 0.92725 2023-10-03 07:21:21,855 WARNING [train.py:1204] (2/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_9-21737-0_sp1.1 from training. Duration: 0.8818125 2023-10-03 07:21:21,910 WARNING [train.py:1204] (2/4) Exclude cut with ID _14069_210_5632_1_1530512692033_4803390_522-341588-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 07:21:23,322 WARNING [train.py:1204] (2/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_363-185703-0 from training. Duration: 0.95 2023-10-03 07:21:26,084 WARNING [train.py:1204] (2/4) Exclude cut with ID _41817_210_18835_1_1533434477160_7423049_265-76216-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 07:21:30,422 WARNING [train.py:1204] (2/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_841-341679-0 from training. Duration: 0.58 2023-10-03 07:21:31,778 WARNING [train.py:1204] (2/4) Exclude cut with ID _55429_210_10626_1_1534467588527_7174143_97-335084-0_sp1.1 from training. Duration: 0.8818125 2023-10-03 07:21:35,142 WARNING [train.py:1204] (2/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_690-126306-0_sp0.9 from training. Duration: 0.8666875 2023-10-03 07:21:35,150 WARNING [train.py:1204] (2/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_102-92751-0_sp1.1 from training. Duration: 0.9 2023-10-03 07:21:36,462 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_3787_210_2040_1_1526477544_5478586_603-120324-0_sp1.1 from training. Duration: 0.736375 2023-10-03 07:21:39,124 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_33348_210_5632_1_1533086912_3324030_85-28583-0 from training. Duration: 0.82 2023-10-03 07:21:39,192 WARNING [train.py:1204] (2/4) Exclude cut with ID _60057_210_10640_1_1534903065793_3333150_362-230377-0_sp1.1 from training. Duration: 0.7909375 2023-10-03 07:21:42,451 WARNING [train.py:1204] (2/4) Exclude cut with ID _60113_210_16271_1_1535164212481_4386079_194-146486-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 07:21:42,464 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_53753_210_7234_1_1534320003_3594148_171-277668-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 07:21:42,499 WARNING [train.py:1204] (2/4) Exclude cut with ID _34423_210_8750_1_1533430798456_3330199_251-332788-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 07:21:46,060 WARNING [train.py:1204] (2/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_663-166973-0_sp0.9 from training. Duration: 0.888875 2023-10-03 07:21:46,206 WARNING [train.py:1204] (2/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_927-43827-0 from training. Duration: 0.74 2023-10-03 07:21:47,590 WARNING [train.py:1204] (2/4) Exclude cut with ID _28323_210_16187_1_1532480395667_4100300_306-74561-0_sp1.1 from training. Duration: 0.8545625 2023-10-03 07:21:49,534 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.feed_forward3.out_whiten.whitening_limit, batch_count=1184533.3333333333, ans=15.0 2023-10-03 07:21:50,366 WARNING [train.py:1204] (2/4) Exclude cut with ID _15037_210_2253_1_1530752294104_3267650_139-337989-0_sp1.1 from training. Duration: 0.92725 2023-10-03 07:21:50,387 WARNING [train.py:1204] (2/4) Exclude cut with ID _49039_210_6315_1_1533967537541_3846118_460-263670-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 07:21:51,215 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.1.encoder.layers.1.feed_forward1.out_whiten, num_groups=1, num_channels=256, metric=5.42 vs. limit=15.0 2023-10-03 07:21:51,770 WARNING [train.py:1204] (2/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_186-309161-0 from training. Duration: 0.54 2023-10-03 07:21:53,191 WARNING [train.py:1204] (2/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_87-278686-0_sp1.1 from training. Duration: 0.7 2023-10-03 07:21:55,911 WARNING [train.py:1204] (2/4) Exclude cut with ID _27052_210_5986_1_1532329197187_3585009_314-106499-0 from training. Duration: 0.92 2023-10-03 07:21:55,928 WARNING [train.py:1204] (2/4) Exclude cut with ID _3465_210_3525_1_1526175667060_3574049_412-287526-0_sp0.9 from training. Duration: 0.8 2023-10-03 07:22:00,222 WARNING [train.py:1204] (2/4) Exclude cut with ID _22961_210_8254_1_1531976366672_4278180_317-199546-0 from training. Duration: 0.86 2023-10-03 07:22:04,917 WARNING [train.py:1204] (2/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_381-317791-0 from training. Duration: 0.73 2023-10-03 07:22:04,971 WARNING [train.py:1204] (2/4) Exclude cut with ID _44010_210_4525_1_1533884262964_4013620_560-310411-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 07:22:04,979 WARNING [train.py:1204] (2/4) Exclude cut with ID _5294_210_1747_1_1527249643928_3696179_342-270278-0_sp0.9 from training. Duration: 0.611125 2023-10-03 07:22:06,934 WARNING [train.py:1204] (2/4) Exclude cut with ID ID0473W0002-918951 from training. Duration: 0.924 2023-10-03 07:22:06,959 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1083-220122-0 from training. Duration: 0.51 2023-10-03 07:22:08,552 WARNING [train.py:1204] (2/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_138-154812-0 from training. Duration: 0.78 2023-10-03 07:22:11,427 WARNING [train.py:1204] (2/4) Exclude cut with ID _44982_210_1511_1_1534312820108_3726029_331-19420-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 07:22:16,606 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_2655_210_2087_1_1525688959_3897400_202-147557-0_sp0.9 from training. Duration: 0.9444375 2023-10-03 07:22:19,512 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_23850_210_4238_1_1532232597_1632433_32-185135-0_sp0.9 from training. Duration: 0.9555625 2023-10-03 07:22:20,813 INFO [train.py:1046] (2/4) Epoch 34, batch 2400, loss[loss=0.1646, simple_loss=0.2332, pruned_loss=0.04805, over 23849.00 frames. ], tot_loss[loss=0.1633, simple_loss=0.2416, pruned_loss=0.04249, over 4673996.06 frames. ], batch size: 195, lr: 3.00e-03, grad_scale: 32.0 2023-10-03 07:22:22,205 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_14056_210_4163_1_1530604483_7572827_46-93547-0_sp1.1 from training. Duration: 0.863625 2023-10-03 07:22:23,588 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7931_210_1794_1_1528594728_5564059_280-279560-0 from training. Duration: 0.98 2023-10-03 07:22:23,633 WARNING [train.py:1204] (2/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_937-208281-0 from training. Duration: 0.85 2023-10-03 07:22:25,186 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.bypass.scale_min, batch_count=1184666.6666666667, ans=0.2 2023-10-03 07:22:29,206 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_14928_210_5632_1_1530773461_4714000_454-112198-0_sp0.9 from training. Duration: 0.7333125 2023-10-03 07:22:29,207 WARNING [train.py:1204] (2/4) Exclude cut with ID _32704_210_8954_1_1533017488840_6747370_944-153333-0_sp1.1 from training. Duration: 0.963625 2023-10-03 07:22:31,896 WARNING [train.py:1204] (2/4) Exclude cut with ID _14821_210_8750_1_1530680503991_6829359_2-227151-0 from training. Duration: 0.83 2023-10-03 07:22:31,916 WARNING [train.py:1204] (2/4) Exclude cut with ID _28461_210_2253_1_1532771775189_3041299_96-152158-0_sp1.1 from training. Duration: 0.77275 2023-10-03 07:22:33,200 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7837_210_1792_1_1528453445_5841305_654-54195-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 07:22:33,868 WARNING [train.py:1204] (2/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_344-152526-0 from training. Duration: 0.53 2023-10-03 07:22:35,392 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.conv_module2.balancer1.prob, batch_count=1184733.3333333333, ans=0.125 2023-10-03 07:22:40,304 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_30167_210_3528_1_1532428012_4772375_480-142230-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 07:22:41,704 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_160-218721-0 from training. Duration: 0.96 2023-10-03 07:22:46,447 WARNING [train.py:1204] (2/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_65-88242-0_sp0.9 from training. Duration: 0.62225 2023-10-03 07:22:52,577 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_34886_210_10633_1_1533203882_1755661_76-156096-0 from training. Duration: 0.87 2023-10-03 07:22:54,017 WARNING [train.py:1204] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_188-38388-0_sp1.1 from training. Duration: 0.92725 2023-10-03 07:22:55,352 WARNING [train.py:1204] (2/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_81-289335-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 07:22:57,135 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.nonlin_attention.balancer.min_positive, batch_count=1184800.0, ans=0.05 2023-10-03 07:22:58,345 WARNING [train.py:1204] (2/4) Exclude cut with ID _59493_210_17282_1_1535078419077_3225879_110-8778-0_sp1.1 from training. Duration: 0.936375 2023-10-03 07:22:59,588 WARNING [train.py:1204] (2/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_188-260011-0 from training. Duration: 0.73 2023-10-03 07:23:00,943 WARNING [train.py:1204] (2/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_453-133983-0_sp0.9 from training. Duration: 0.8666875 2023-10-03 07:23:09,553 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8054_210_7927_1_1528627721_4984094_553-17379-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 07:23:11,486 WARNING [train.py:1204] (2/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_341-171072-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 07:23:14,359 WARNING [train.py:1204] (2/4) Exclude cut with ID _30800_210_22382_1_1532499347904_4515959_265-257286-0_sp1.1 from training. Duration: 0.963625 2023-10-03 07:23:14,450 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_403-39619-0_sp0.9 from training. Duration: 0.9333125 2023-10-03 07:23:15,860 WARNING [train.py:1204] (2/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_464-33931-0_sp0.9 from training. Duration: 0.6 2023-10-03 07:23:15,895 WARNING [train.py:1204] (2/4) Exclude cut with ID _60804_210_9642_1_1534831199859_7268299_632-143863-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 07:23:15,902 WARNING [train.py:1204] (2/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_431-310997-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 07:23:17,210 WARNING [train.py:1204] (2/4) Exclude cut with ID _31649_210_6228_1_1532584608063_4091078_114-263860-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 07:23:17,221 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6961_210_2087_1_1527937168_6815011_562-165518-0_sp0.9 from training. Duration: 0.8555625 2023-10-03 07:23:19,588 INFO [scaling.py:1118] (2/4) WithLoss: name=encoder.encoders.5.encoder.layers.1.self_attn_weights, loss-sum=0.000e+00 2023-10-03 07:23:21,874 WARNING [train.py:1204] (2/4) Exclude cut with ID _27052_210_5986_1_1532329197187_3585009_338-325870-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 07:23:21,932 WARNING [train.py:1204] (2/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_469-94673-0_sp1.1 from training. Duration: 0.7181875 2023-10-03 07:23:21,960 WARNING [train.py:1204] (2/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_259-34038-0 from training. Duration: 0.74 2023-10-03 07:23:23,377 WARNING [train.py:1204] (2/4) Exclude cut with ID _14922_210_6228_1_1530707435273_3693310_388-129552-0 from training. Duration: 0.96 2023-10-03 07:23:26,083 WARNING [train.py:1204] (2/4) Exclude cut with ID _28394_210_8913_1_1532430049518_3650118_342-251455-0_sp1.1 from training. Duration: 0.936375 2023-10-03 07:23:26,110 WARNING [train.py:1204] (2/4) Exclude cut with ID _53731_210_7994_1_1534553979244_3757214_8-191390-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 07:23:27,359 WARNING [train.py:1204] (2/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_613-232856-0 from training. Duration: 0.54 2023-10-03 07:23:27,432 WARNING [train.py:1204] (2/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_557-349534-0 from training. Duration: 0.91 2023-10-03 07:23:28,726 WARNING [train.py:1204] (2/4) Exclude cut with ID _14224_210_4381_1_1530529062037_4264010_379-184223-0 from training. Duration: 0.85 2023-10-03 07:23:28,730 WARNING [train.py:1204] (2/4) Exclude cut with ID IC0125W0001-61807 from training. Duration: 0.966 2023-10-03 07:23:28,812 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_229-45300-0 from training. Duration: 0.57 2023-10-03 07:23:29,545 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.2.encoder.layers.2.conv_module2.whiten, num_groups=1, num_channels=384, metric=6.45 vs. limit=15.0 2023-10-03 07:23:30,166 WARNING [train.py:1204] (2/4) Exclude cut with ID _30304_210_6319_1_1532476827261_7582060_1069-24743-0_sp1.1 from training. Duration: 0.92725 2023-10-03 07:23:31,622 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_41452_210_7291_1_1533437687_3941281_323-172873-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 07:23:31,633 WARNING [train.py:1204] (2/4) Exclude cut with ID _37086_210_9038_1_1532999540891_7021250_802-226709-0_sp1.1 from training. Duration: 0.97275 2023-10-03 07:23:33,001 WARNING [train.py:1204] (2/4) Exclude cut with ID IC0144W0500-71767 from training. Duration: 0.931875 2023-10-03 07:23:33,231 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.balancer1.prob, batch_count=1185000.0, ans=0.125 2023-10-03 07:23:34,291 INFO [train.py:1046] (2/4) Epoch 34, batch 2450, loss[loss=0.1657, simple_loss=0.2373, pruned_loss=0.04704, over 23778.00 frames. ], tot_loss[loss=0.1622, simple_loss=0.2406, pruned_loss=0.04196, over 4694519.33 frames. ], batch size: 212, lr: 3.00e-03, grad_scale: 32.0 2023-10-03 07:23:34,334 WARNING [train.py:1204] (2/4) Exclude cut with ID _39846_210_12651_1_1533605310265_2115212_233-129699-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 07:23:34,382 WARNING [train.py:1204] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_574-45541-0_sp0.9 from training. Duration: 0.72225 2023-10-03 07:23:34,652 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.feed_forward2.hidden_balancer.prob, batch_count=1185000.0, ans=0.125 2023-10-03 07:23:37,205 WARNING [train.py:1204] (2/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_372-298093-0_sp1.1 from training. Duration: 0.72725 2023-10-03 07:23:37,226 WARNING [train.py:1204] (2/4) Exclude cut with ID _62574_210_2655_1_1535180407058_3933249_34-26343-0_sp1.1 from training. Duration: 0.97275 2023-10-03 07:23:41,753 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_512-256238-0_sp1.1 from training. Duration: 0.963625 2023-10-03 07:23:41,766 WARNING [train.py:1204] (2/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_196-55502-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 07:23:43,270 WARNING [train.py:1204] (2/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_195-167011-0 from training. Duration: 0.75 2023-10-03 07:23:47,263 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.558e+02 1.892e+02 2.133e+02 2.562e+02 4.061e+02, threshold=4.265e+02, percent-clipped=1.0 2023-10-03 07:23:47,376 WARNING [train.py:1204] (2/4) Exclude cut with ID _13678_210_7497_1_1530530069226_3463170_345-230416-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 07:23:47,386 WARNING [train.py:1204] (2/4) Exclude cut with ID _51078_210_9070_1_1534161557424_3805250_225-327809-0_sp1.1 from training. Duration: 0.963625 2023-10-03 07:23:50,815 WARNING [train.py:1204] (2/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_18-189198-0_sp1.1 from training. Duration: 0.8181875 2023-10-03 07:23:50,827 WARNING [train.py:1204] (2/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_329-82235-0_sp1.1 from training. Duration: 0.7454375 2023-10-03 07:23:50,835 WARNING [train.py:1204] (2/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_726-270780-0_sp0.9 from training. Duration: 0.9555625 2023-10-03 07:23:50,876 WARNING [train.py:1204] (2/4) Exclude cut with ID _66873_210_5517_1_1535454136506_4293370_654-52713-0 from training. Duration: 0.77 2023-10-03 07:23:54,890 WARNING [train.py:1204] (2/4) Exclude cut with ID _20455_210_5219_1_1531565996775_8098100_191-16006-0_sp1.1 from training. Duration: 0.963625 2023-10-03 07:23:56,302 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_3171_210_2087_1_1526109084_2330007_66-260583-0_sp1.1 from training. Duration: 0.6545625 2023-10-03 07:23:57,596 WARNING [train.py:1204] (2/4) Exclude cut with ID _38670_210_15379_1_1533448810659_4391059_63-129006-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 07:24:01,651 WARNING [train.py:1204] (2/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_393-57888-0_sp0.9 from training. Duration: 0.711125 2023-10-03 07:24:01,687 WARNING [train.py:1204] (2/4) Exclude cut with ID _6275_210_2084_1_1528110062213_7669049_857-298942-0_sp0.9 from training. Duration: 0.9444375 2023-10-03 07:24:03,002 WARNING [train.py:1204] (2/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_239-22076-0_sp0.9 from training. Duration: 0.9444375 2023-10-03 07:24:03,025 WARNING [train.py:1204] (2/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_424-227947-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 07:24:04,579 WARNING [train.py:1204] (2/4) Exclude cut with ID _14069_210_5632_1_1530512692033_4803390_95-1896-0 from training. Duration: 0.98 2023-10-03 07:24:04,654 WARNING [train.py:1204] (2/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_96-244299-0_sp0.9 from training. Duration: 0.97775 2023-10-03 07:24:12,614 WARNING [train.py:1204] (2/4) Exclude cut with ID _46683_210_9642_1_1533730913492_4591116_562-319825-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 07:24:12,718 WARNING [train.py:1204] (2/4) Exclude cut with ID _64639_210_5816_1_1535367619437_3528000_284-173474-0_sp1.1 from training. Duration: 0.963625 2023-10-03 07:24:14,033 WARNING [train.py:1204] (2/4) Exclude cut with ID _29529_210_7234_1_1532393961455_3977460_497-100133-0_sp1.1 from training. Duration: 0.936375 2023-10-03 07:24:14,060 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_12868_210_6753_1_1530701932_530275_18-76322-0_sp0.9 from training. Duration: 0.9666875 2023-10-03 07:24:14,097 WARNING [train.py:1204] (2/4) Exclude cut with ID _50093_210_4220_1_1534417141207_3432299_116-303447-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 07:24:15,956 WARNING [train.py:1204] (2/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_44-79427-0_sp1.1 from training. Duration: 0.8909375 2023-10-03 07:24:17,653 WARNING [train.py:1204] (2/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_887-249555-0 from training. Duration: 0.77 2023-10-03 07:24:20,452 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.conv_module2.balancer2.prob, batch_count=1185200.0, ans=0.125 2023-10-03 07:24:21,517 WARNING [train.py:1204] (2/4) Exclude cut with ID _28461_210_2253_1_1532771775189_3041299_96-152158-0_sp0.9 from training. Duration: 0.9444375 2023-10-03 07:24:22,864 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_14058_210_4163_1_1530690953_6327180_49-210687-0_sp1.1 from training. Duration: 0.8545625 2023-10-03 07:24:25,682 WARNING [train.py:1204] (2/4) Exclude cut with ID _40399_210_4353_1_1533305658194_3279406_351-127903-0_sp1.1 from training. Duration: 0.97275 2023-10-03 07:24:25,688 WARNING [train.py:1204] (2/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_184-206939-0_sp1.1 from training. Duration: 0.936375 2023-10-03 07:24:29,862 WARNING [train.py:1204] (2/4) Exclude cut with ID _14100_210_8341_1_1530529320613_3678069_258-224599-0_sp1.1 from training. Duration: 0.7 2023-10-03 07:24:29,879 WARNING [train.py:1204] (2/4) Exclude cut with ID _6493_210_1747_1_1528459210424_4012470_324-323854-0 from training. Duration: 0.92 2023-10-03 07:24:31,225 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_12868_210_6753_1_1530702479_3610564_151-325626-0_sp1.1 from training. Duration: 0.8818125 2023-10-03 07:24:32,642 WARNING [train.py:1204] (2/4) Exclude cut with ID _14528_210_8254_1_1530669700514_4306939_106-43145-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 07:24:32,655 WARNING [train.py:1204] (2/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_686-54383-0 from training. Duration: 0.87 2023-10-03 07:24:32,709 WARNING [train.py:1204] (2/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_801-102362-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 07:24:34,148 WARNING [train.py:1204] (2/4) Exclude cut with ID _48677_210_5663_1_1533902405919_3892989_408-40740-0_sp0.9 from training. Duration: 0.92225 2023-10-03 07:24:37,041 WARNING [train.py:1204] (2/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_448-212135-0_sp1.1 from training. Duration: 0.82725 2023-10-03 07:24:40,150 WARNING [train.py:1204] (2/4) Exclude cut with ID _59170_210_6721_1_1534983633960_4203335_320-124047-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 07:24:41,524 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_4692_210_3528_1_1526899755_4323254_107-44401-0_sp1.1 from training. Duration: 0.7818125 2023-10-03 07:24:44,902 WARNING [train.py:1204] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_731-2428-0 from training. Duration: 0.83 2023-10-03 07:24:47,509 WARNING [train.py:1204] (2/4) Exclude cut with ID _29857_210_20034_1_1532395886753_3798160_335-163562-0_sp1.1 from training. Duration: 0.77275 2023-10-03 07:24:49,888 INFO [train.py:1046] (2/4) Epoch 34, batch 2500, loss[loss=0.1517, simple_loss=0.2232, pruned_loss=0.0401, over 23482.00 frames. ], tot_loss[loss=0.1612, simple_loss=0.2393, pruned_loss=0.04157, over 4696629.25 frames. ], batch size: 285, lr: 3.00e-03, grad_scale: 16.0 2023-10-03 07:24:51,489 WARNING [train.py:1204] (2/4) Exclude cut with ID _14832_210_8750_1_1530691351539_3171348_171-275004-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 07:25:01,193 WARNING [train.py:1204] (2/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_105-328174-0_sp0.9 from training. Duration: 0.8444375 2023-10-03 07:25:01,227 WARNING [train.py:1204] (2/4) Exclude cut with ID _68965_210_1883_1_1535698962272_3497490_164-118421-0_sp1.1 from training. Duration: 0.936375 2023-10-03 07:25:01,304 WARNING [train.py:1204] (2/4) Exclude cut with ID _19896_210_1883_1_1531709022662_6649569_380-24170-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 07:25:01,309 WARNING [train.py:1204] (2/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_505-205270-0_sp1.1 from training. Duration: 0.3454375 2023-10-03 07:25:08,616 WARNING [train.py:1204] (2/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_427-208901-0_sp1.1 from training. Duration: 0.6181875 2023-10-03 07:25:09,975 WARNING [train.py:1204] (2/4) Exclude cut with ID _23818_210_6228_1_1531917063884_3676920_85-326916-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 07:25:11,323 WARNING [train.py:1204] (2/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_101-293367-0_sp1.1 from training. Duration: 0.57275 2023-10-03 07:25:11,337 WARNING [train.py:1204] (2/4) Exclude cut with ID _5409_210_1388_1_1527292850189_5865191_178-127970-0_sp1.1 from training. Duration: 0.5181875 2023-10-03 07:25:11,569 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.feed_forward1.out_proj.dropout_p, batch_count=1185400.0, ans=0.1 2023-10-03 07:25:12,660 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_403-39619-0 from training. Duration: 0.84 2023-10-03 07:25:14,121 WARNING [train.py:1204] (2/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_427-87841-0_sp1.1 from training. Duration: 0.963625 2023-10-03 07:25:14,171 WARNING [train.py:1204] (2/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_286-68181-0_sp1.1 from training. Duration: 0.97275 2023-10-03 07:25:15,510 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6673_210_3273_1_1527767790_3173240_327-306185-0 from training. Duration: 0.83 2023-10-03 07:25:15,525 WARNING [train.py:1204] (2/4) Exclude cut with ID _3675_210_4557_1_1526639934408_4182089_106-203713-0_sp1.1 from training. Duration: 0.963625 2023-10-03 07:25:17,470 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_53071_210_16554_1_1534493434_1276796_130-36076-0 from training. Duration: 0.95 2023-10-03 07:25:17,509 WARNING [train.py:1204] (2/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_586-93054-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 07:25:23,944 WARNING [train.py:1204] (2/4) Exclude cut with ID _23825_210_13449_1_1531915313526_3497150_262-10571-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 07:25:25,313 WARNING [train.py:1204] (2/4) Exclude cut with ID _28783_210_7943_1_1532671164808_7155479_432-67885-0_sp1.1 from training. Duration: 0.97275 2023-10-03 07:25:27,025 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.conv_skip_rate, batch_count=1185466.6666666667, ans=0.0 2023-10-03 07:25:28,195 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6287_210_3273_1_1527509506_2653716_182-199887-0_sp0.9 from training. Duration: 0.5666875 2023-10-03 07:25:28,253 WARNING [train.py:1204] (2/4) Exclude cut with ID _3231_210_2106_1_1526205864934_6784220_171-256004-0 from training. Duration: 0.86 2023-10-03 07:25:28,306 WARNING [train.py:1204] (2/4) Exclude cut with ID _69051_210_1531_1_1535885566901_3829538_332-92090-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 07:25:29,691 WARNING [train.py:1204] (2/4) Exclude cut with ID _3675_210_4557_1_1526639934408_4182089_104-324125-0_sp1.1 from training. Duration: 0.963625 2023-10-03 07:25:32,561 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_41764_210_2521_1_1533686110_4089313_501-2119-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 07:25:36,538 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_14056_210_4163_1_1530604483_7572827_56-100988-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 07:25:39,256 WARNING [train.py:1204] (2/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_398-239819-0_sp1.1 from training. Duration: 0.9 2023-10-03 07:25:43,821 WARNING [train.py:1204] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_74-48717-0_sp1.1 from training. Duration: 0.52725 2023-10-03 07:25:44,392 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.4.encoder.layers.2.conv_module2.whiten, num_groups=1, num_channels=384, metric=3.64 vs. limit=15.0 2023-10-03 07:25:46,108 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.1.encoder.layers.0.self_attn_weights.whiten_keys, num_groups=4, num_channels=128, metric=3.68 vs. limit=6.0 2023-10-03 07:25:46,537 WARNING [train.py:1204] (2/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_177-49092-0 from training. Duration: 0.91 2023-10-03 07:25:46,570 WARNING [train.py:1204] (2/4) Exclude cut with ID _62337_210_12392_1_1535009273105_3412229_44-176353-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 07:25:46,578 WARNING [train.py:1204] (2/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_110-130961-0_sp1.1 from training. Duration: 0.42725 2023-10-03 07:25:48,411 WARNING [train.py:1204] (2/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_37-189262-0_sp1.1 from training. Duration: 0.8909375 2023-10-03 07:25:48,412 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_3172_210_2087_1_1526292929_3987865_197-97762-0_sp0.9 from training. Duration: 0.8333125 2023-10-03 07:25:50,314 WARNING [train.py:1204] (2/4) Exclude cut with ID IC0677W0065-334897 from training. Duration: 0.896 2023-10-03 07:25:50,315 WARNING [train.py:1204] (2/4) Exclude cut with ID IC0677W0080-334912 from training. Duration: 0.768 2023-10-03 07:25:50,326 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_120-41668-0 from training. Duration: 0.7 2023-10-03 07:25:52,373 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.feed_forward1.hidden_balancer.prob, batch_count=1185600.0, ans=0.125 2023-10-03 07:25:54,874 WARNING [train.py:1204] (2/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_460-205287-0_sp1.1 from training. Duration: 0.963625 2023-10-03 07:25:56,287 WARNING [train.py:1204] (2/4) Exclude cut with ID _14632_210_9988_1_1530691279392_3923200_102-204477-0 from training. Duration: 0.97 2023-10-03 07:25:56,870 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.3.encoder.layers.0.feed_forward3.out_whiten, num_groups=1, num_channels=512, metric=9.92 vs. limit=15.0 2023-10-03 07:25:57,592 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_34815_210_6388_1_1533381320_3571903_279-305565-0 from training. Duration: 0.92 2023-10-03 07:25:57,658 WARNING [train.py:1204] (2/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_91-148831-0_sp1.1 from training. Duration: 0.9 2023-10-03 07:25:59,173 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6291_210_3273_1_1527683649_1078498_82-221196-0 from training. Duration: 0.76 2023-10-03 07:26:03,116 INFO [train.py:1046] (2/4) Epoch 34, batch 2550, loss[loss=0.1544, simple_loss=0.237, pruned_loss=0.03591, over 24640.00 frames. ], tot_loss[loss=0.1614, simple_loss=0.2399, pruned_loss=0.04149, over 4696706.39 frames. ], batch size: 65, lr: 3.00e-03, grad_scale: 16.0 2023-10-03 07:26:03,187 WARNING [train.py:1204] (2/4) Exclude cut with ID _38020_210_5986_1_1533112252259_7006089_493-102745-0 from training. Duration: 0.98 2023-10-03 07:26:04,569 WARNING [train.py:1204] (2/4) Exclude cut with ID _61133_210_9969_1_1535196777227_3696409_40-93442-0_sp1.1 from training. Duration: 0.92725 2023-10-03 07:26:04,741 WARNING [train.py:1204] (2/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_45-258852-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 07:26:06,094 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_53071_210_16554_1_1534493434_1276796_130-36076-0_sp1.1 from training. Duration: 0.863625 2023-10-03 07:26:07,621 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8694_210_6400_1_1529065754_3260756_332-13553-0_sp1.1 from training. Duration: 0.92725 2023-10-03 07:26:09,077 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_405-339986-0 from training. Duration: 0.51 2023-10-03 07:26:09,134 WARNING [train.py:1204] (2/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_927-43827-0_sp1.1 from training. Duration: 0.67275 2023-10-03 07:26:13,711 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_397-147196-0 from training. Duration: 0.8 2023-10-03 07:26:13,822 WARNING [train.py:1204] (2/4) Exclude cut with ID 3557-8342-0013-54691-0_sp1.1 from training. Duration: 0.836375 2023-10-03 07:26:15,134 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.559e+02 1.964e+02 2.224e+02 2.724e+02 4.382e+02, threshold=4.447e+02, percent-clipped=1.0 2023-10-03 07:26:15,281 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_47438_210_5399_1_1533797471_4513356_626-127722-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 07:26:18,641 WARNING [train.py:1204] (2/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_256-323883-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 07:26:18,654 WARNING [train.py:1204] (2/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_839-173017-0_sp0.9 from training. Duration: 0.4555625 2023-10-03 07:26:18,688 WARNING [train.py:1204] (2/4) Exclude cut with ID _5289_210_2084_1_1527678202736_3779299_392-261988-0_sp1.1 from training. Duration: 0.5909375 2023-10-03 07:26:18,725 WARNING [train.py:1204] (2/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_119-224384-0_sp1.1 from training. Duration: 0.936375 2023-10-03 07:26:20,450 WARNING [train.py:1204] (2/4) Exclude cut with ID _57523_210_1856_1_1534766294545_3732989_262-267933-0_sp1.1 from training. Duration: 0.963625 2023-10-03 07:26:21,814 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_35361_210_4238_1_1533262810_7793476_675-87012-0_sp0.9 from training. Duration: 0.988875 2023-10-03 07:26:21,839 WARNING [train.py:1204] (2/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_17-90541-0 from training. Duration: 0.94 2023-10-03 07:26:21,874 WARNING [train.py:1204] (2/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_535-14410-0_sp1.1 from training. Duration: 0.42725 2023-10-03 07:26:21,880 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_24673_210_6400_1_1531968633_778659_94-154511-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 07:26:23,772 WARNING [train.py:1204] (2/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_212-10501-0 from training. Duration: 0.94 2023-10-03 07:26:30,888 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.conv_module1.balancer1.prob, batch_count=1185733.3333333333, ans=0.125 2023-10-03 07:26:36,324 WARNING [train.py:1204] (2/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_383-285196-0_sp1.1 from training. Duration: 0.8818125 2023-10-03 07:26:39,280 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.out_combiner.scale_min, batch_count=1185800.0, ans=0.2 2023-10-03 07:26:41,109 WARNING [train.py:1204] (2/4) Exclude cut with ID _45560_210_21982_1_1533812603951_3584999_238-47020-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 07:26:41,125 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_21500_210_2054_1_1532149310_2716758_278-18053-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 07:26:41,863 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.feed_forward2.out_whiten.whitening_limit, batch_count=1185800.0, ans=15.0 2023-10-03 07:26:42,195 WARNING [train.py:1204] (2/4) Exclude cut with ID _56235_210_10640_1_1534557335235_4161099_453-9626-0_sp1.1 from training. Duration: 0.936375 2023-10-03 07:26:43,577 WARNING [train.py:1204] (2/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_153-97441-0_sp1.1 from training. Duration: 0.5090625 2023-10-03 07:26:49,139 WARNING [train.py:1204] (2/4) Exclude cut with ID _67725_210_27767_1_1535608756671_5105359_431-120895-0_sp1.1 from training. Duration: 0.963625 2023-10-03 07:26:52,454 WARNING [train.py:1204] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_574-45541-0_sp1.1 from training. Duration: 0.5909375 2023-10-03 07:26:52,628 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.ff2_skip_rate, batch_count=1185866.6666666667, ans=0.0 2023-10-03 07:26:54,174 WARNING [train.py:1204] (2/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_162-103620-0_sp1.1 from training. Duration: 0.8454375 2023-10-03 07:26:54,193 WARNING [train.py:1204] (2/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_734-129761-0_sp1.1 from training. Duration: 0.7454375 2023-10-03 07:26:54,222 WARNING [train.py:1204] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_620-276793-0_sp1.1 from training. Duration: 0.536375 2023-10-03 07:26:54,254 WARNING [train.py:1204] (2/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_153-97441-0_sp0.9 from training. Duration: 0.62225 2023-10-03 07:26:57,875 WARNING [train.py:1204] (2/4) Exclude cut with ID _12733_210_8341_1_1530167424556_3646380_523-85621-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 07:26:57,887 WARNING [train.py:1204] (2/4) Exclude cut with ID _64048_210_10626_1_1535198382802_3835179_352-179834-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 07:27:02,014 WARNING [train.py:1204] (2/4) Exclude cut with ID _55162_210_17071_1_1534408248583_3753480_43-242364-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 07:27:02,038 WARNING [train.py:1204] (2/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_661-264857-0 from training. Duration: 0.93 2023-10-03 07:27:02,045 WARNING [train.py:1204] (2/4) Exclude cut with ID _6274_210_2084_1_1527937583837_7494020_325-237800-0_sp1.1 from training. Duration: 0.97275 2023-10-03 07:27:02,326 INFO [scaling.py:1118] (2/4) WithLoss: name=encoder.encoders.4.encoder.layers.2.self_attn_weights, loss-sum=0.000e+00 2023-10-03 07:27:03,185 WARNING [train.py:1204] (2/4) Exclude cut with ID _55328_210_27767_1_1534580973260_4088170_385-171459-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 07:27:04,481 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_3780_210_2087_1_1526467389_2145572_176-107140-0_sp1.1 from training. Duration: 0.62725 2023-10-03 07:27:04,572 WARNING [train.py:1204] (2/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_375-244976-0_sp1.1 from training. Duration: 0.6818125 2023-10-03 07:27:05,962 WARNING [train.py:1204] (2/4) Exclude cut with ID _35941_210_15379_1_1532995294515_4023089_309-33162-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 07:27:08,974 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=1185933.3333333333, ans=0.1 2023-10-03 07:27:10,351 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.ff3_skip_rate, batch_count=1185933.3333333333, ans=0.0 2023-10-03 07:27:11,455 WARNING [train.py:1204] (2/4) Exclude cut with ID _14459_210_2228_1_1531443641946_7516700_1185-22038-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 07:27:13,435 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_1197-69460-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 07:27:14,924 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9012W0022-501157 from training. Duration: 0.896 2023-10-03 07:27:16,933 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9023W0009-506302 from training. Duration: 0.896 2023-10-03 07:27:16,948 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_2821_210_2715_1_1526108385_3763560_73-296673-0_sp1.1 from training. Duration: 0.7545625 2023-10-03 07:27:18,263 INFO [train.py:1046] (2/4) Epoch 34, batch 2600, loss[loss=0.1546, simple_loss=0.2377, pruned_loss=0.03572, over 24309.00 frames. ], tot_loss[loss=0.1616, simple_loss=0.2403, pruned_loss=0.04142, over 4706481.36 frames. ], batch size: 61, lr: 3.00e-03, grad_scale: 16.0 2023-10-03 07:27:18,342 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9025W0113-507442 from training. Duration: 0.896 2023-10-03 07:27:19,649 WARNING [train.py:1204] (2/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_739-2098-0 from training. Duration: 0.92 2023-10-03 07:27:19,669 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9029W0332-509722 from training. Duration: 0.768 2023-10-03 07:27:22,874 WARNING [train.py:1204] (2/4) Exclude cut with ID _16908_210_5724_1_1531119157248_3772700_68-101953-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 07:27:22,893 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9042W0011-515617 from training. Duration: 0.768 2023-10-03 07:27:24,793 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_3019_210_2028_1_1526044245_3264865_191-72235-0 from training. Duration: 0.97 2023-10-03 07:27:26,176 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9052W0006-520792 from training. Duration: 0.896 2023-10-03 07:27:26,539 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.conv_skip_rate, batch_count=1186000.0, ans=0.0 2023-10-03 07:27:27,652 WARNING [train.py:1204] (2/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_245-174553-0_sp1.1 from training. Duration: 0.8 2023-10-03 07:27:29,021 WARNING [train.py:1204] (2/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_202-67017-0 from training. Duration: 0.82 2023-10-03 07:27:30,355 WARNING [train.py:1204] (2/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_304-59227-0 from training. Duration: 0.77 2023-10-03 07:27:31,796 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_3133_210_2087_1_1526106446_2324690_246-125000-0_sp0.9 from training. Duration: 0.688875 2023-10-03 07:27:31,840 WARNING [train.py:1204] (2/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_243-42116-0 from training. Duration: 0.73 2023-10-03 07:27:33,285 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9087W0121-538492 from training. Duration: 0.896 2023-10-03 07:27:33,304 WARNING [train.py:1204] (2/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_120-76843-0 from training. Duration: 0.55 2023-10-03 07:27:38,870 WARNING [train.py:1204] (2/4) Exclude cut with ID _7852_210_5219_1_1528286471316_8139400_850-304621-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 07:27:38,882 WARNING [train.py:1204] (2/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_154-285415-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 07:27:38,900 WARNING [train.py:1204] (2/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_538-278194-0_sp1.1 from training. Duration: 0.92725 2023-10-03 07:27:38,900 WARNING [train.py:1204] (2/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_119-207162-0 from training. Duration: 0.71 2023-10-03 07:27:42,260 WARNING [train.py:1204] (2/4) Exclude cut with ID _2334_210_2667_1_1525608238880_4705129_459-104997-0_sp1.1 from training. Duration: 0.863625 2023-10-03 07:27:47,077 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.4.encoder.layers.0.feed_forward2.out_whiten, num_groups=1, num_channels=384, metric=6.03 vs. limit=15.0 2023-10-03 07:27:47,816 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9147W0019-567847 from training. Duration: 0.768 2023-10-03 07:27:54,381 WARNING [train.py:1204] (2/4) Exclude cut with ID _69884_210_9771_1_1535846392818_4166169_228-189439-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 07:27:55,825 WARNING [train.py:1204] (2/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_196-77172-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 07:27:57,206 WARNING [train.py:1204] (2/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_625-338546-0 from training. Duration: 0.88 2023-10-03 07:27:57,256 WARNING [train.py:1204] (2/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_193-327630-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 07:27:57,258 WARNING [train.py:1204] (2/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_391-24006-0_sp1.1 from training. Duration: 0.92725 2023-10-03 07:27:58,513 WARNING [train.py:1204] (2/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_197-28164-0 from training. Duration: 0.8 2023-10-03 07:28:00,021 WARNING [train.py:1204] (2/4) Exclude cut with ID _8395_210_1531_1_1528597871523_4411579_431-132470-0_sp1.1 from training. Duration: 0.67275 2023-10-03 07:28:00,063 WARNING [train.py:1204] (2/4) Exclude cut with ID _35350_210_16040_1_1533040191183_4075299_465-248326-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 07:28:01,561 WARNING [train.py:1204] (2/4) Exclude cut with ID _16756_210_4915_1_1531047803763_7104259_101-141922-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 07:28:03,309 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.conv_module1.balancer2.prob, batch_count=1186200.0, ans=0.125 2023-10-03 07:28:04,433 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9212W0005-600172 from training. Duration: 0.896 2023-10-03 07:28:05,712 WARNING [train.py:1204] (2/4) Exclude cut with ID _63186_210_2084_1_1535414479705_3908649_378-166820-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 07:28:05,725 WARNING [train.py:1204] (2/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_52-211683-0_sp1.1 from training. Duration: 0.8181875 2023-10-03 07:28:08,666 WARNING [train.py:1204] (2/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_1147-237729-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 07:28:09,874 WARNING [train.py:1204] (2/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_234-14174-0_sp0.9 from training. Duration: 0.911125 2023-10-03 07:28:09,886 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_454-276429-0 from training. Duration: 0.87 2023-10-03 07:28:13,117 WARNING [train.py:1204] (2/4) Exclude cut with ID _53713_210_24116_1_1534314487762_7416751_755-165313-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 07:28:15,101 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8278_210_2550_1_1528637099_4220413_436-317072-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 07:28:16,430 WARNING [train.py:1204] (2/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_28-324729-0_sp1.1 from training. Duration: 0.8545625 2023-10-03 07:28:21,208 WARNING [train.py:1204] (2/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_326-165900-0 from training. Duration: 0.69 2023-10-03 07:28:21,483 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.nonlin_attention.balancer.prob, batch_count=1186266.6666666667, ans=0.125 2023-10-03 07:28:22,528 WARNING [train.py:1204] (2/4) Exclude cut with ID _53489_210_6228_1_1534298357169_3835970_96-30246-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 07:28:23,827 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8275_210_4198_1_1528455320_3892571_491-17707-0_sp1.1 from training. Duration: 0.5818125 2023-10-03 07:28:28,213 WARNING [train.py:1204] (2/4) Exclude cut with ID _14348_210_8341_1_1530599434996_3778640_240-232370-0 from training. Duration: 0.84 2023-10-03 07:28:28,223 WARNING [train.py:1204] (2/4) Exclude cut with ID _45012_210_12651_1_1533608435136_5371380_54-112623-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 07:28:28,329 WARNING [train.py:1204] (2/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_328-264776-0_sp1.1 from training. Duration: 0.6090625 2023-10-03 07:28:29,655 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9325W0001-648427 from training. Duration: 0.768 2023-10-03 07:28:29,663 WARNING [train.py:1204] (2/4) Exclude cut with ID _56480_210_4220_1_1534679942079_3075409_49-74717-0_sp1.1 from training. Duration: 0.936375 2023-10-03 07:28:32,263 INFO [train.py:1046] (2/4) Epoch 34, batch 2650, loss[loss=0.1687, simple_loss=0.2501, pruned_loss=0.0437, over 23471.00 frames. ], tot_loss[loss=0.1622, simple_loss=0.2412, pruned_loss=0.04154, over 4719504.21 frames. ], batch size: 93, lr: 3.00e-03, grad_scale: 16.0 2023-10-03 07:28:32,353 WARNING [train.py:1204] (2/4) Exclude cut with ID _59500_210_8254_1_1534809610680_3736009_6-241869-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 07:28:35,012 WARNING [train.py:1204] (2/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_172-215932-0_sp0.9 from training. Duration: 0.7333125 2023-10-03 07:28:36,450 WARNING [train.py:1204] (2/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_17-90541-0_sp1.1 from training. Duration: 0.8545625 2023-10-03 07:28:39,165 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_43831_210_2158_1_1533725502_5872791_525-89447-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 07:28:40,585 WARNING [train.py:1204] (2/4) Exclude cut with ID _12302_210_5219_1_1529924446466_8107280_907-182949-0 from training. Duration: 0.92 2023-10-03 07:28:40,604 WARNING [train.py:1204] (2/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_402-102311-0_sp0.9 from training. Duration: 0.7444375 2023-10-03 07:28:41,912 WARNING [train.py:1204] (2/4) Exclude cut with ID _8021_210_7994_1_1528376451358_3653069_179-45206-0_sp0.9 from training. Duration: 0.9555625 2023-10-03 07:28:43,949 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.610e+02 1.874e+02 2.143e+02 2.497e+02 3.678e+02, threshold=4.285e+02, percent-clipped=0.0 2023-10-03 07:28:44,080 WARNING [train.py:1204] (2/4) Exclude cut with ID _40137_210_12210_1_1533290484046_3795180_249-187059-0 from training. Duration: 0.88 2023-10-03 07:28:47,245 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9653W0194-675472 from training. Duration: 0.9658125 2023-10-03 07:28:49,977 WARNING [train.py:1204] (2/4) Exclude cut with ID _51332_210_9988_1_1534244371836_3801270_336-255604-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 07:28:51,995 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_20120_210_6753_1_1531643035_5482859_510-96996-0 from training. Duration: 0.87 2023-10-03 07:28:52,007 WARNING [train.py:1204] (2/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_1167-20437-0_sp1.1 from training. Duration: 0.92725 2023-10-03 07:28:53,309 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_3475_210_2028_1_1526193578_3622569_265-276625-0 from training. Duration: 0.76 2023-10-03 07:28:57,255 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.0.layers.0.conv_module1.whiten, num_groups=1, num_channels=192, metric=10.23 vs. limit=15.0 2023-10-03 07:28:57,579 WARNING [train.py:1204] (2/4) Exclude cut with ID _14860_210_4915_1_1530702020340_7343240_470-172418-0_sp1.1 from training. Duration: 0.97275 2023-10-03 07:28:57,587 WARNING [train.py:1204] (2/4) Exclude cut with ID _5444_210_2151_1_1527418808464_5312210_581-315411-0_sp1.1 from training. Duration: 0.563625 2023-10-03 07:28:57,612 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_34450_210_9547_1_1533099443_4109050_519-342439-0_sp1.1 from training. Duration: 0.97275 2023-10-03 07:28:57,647 WARNING [train.py:1204] (2/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_50-175961-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 07:29:03,474 WARNING [train.py:1204] (2/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_372-6869-0 from training. Duration: 0.44 2023-10-03 07:29:03,479 WARNING [train.py:1204] (2/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_3-237434-0 from training. Duration: 0.76 2023-10-03 07:29:06,323 WARNING [train.py:1204] (2/4) Exclude cut with ID _8772_210_1850_1_1528635595972_3664011_247-336448-0_sp1.1 from training. Duration: 0.67275 2023-10-03 07:29:09,067 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_427-78097-0 from training. Duration: 0.8 2023-10-03 07:29:09,095 WARNING [train.py:1204] (2/4) Exclude cut with ID _69884_210_9771_1_1535846392818_4166169_466-252727-0_sp1.1 from training. Duration: 0.97275 2023-10-03 07:29:09,177 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_15972_210_6400_1_1530877934_3405692_316-35478-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 07:29:09,203 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_809-17156-0_sp0.9 from training. Duration: 0.811125 2023-10-03 07:29:10,497 WARNING [train.py:1204] (2/4) Exclude cut with ID _57572_210_5517_1_1534921219624_3809484_594-263474-0_sp1.1 from training. Duration: 0.936375 2023-10-03 07:29:10,541 WARNING [train.py:1204] (2/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_190-228204-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 07:29:11,990 WARNING [train.py:1204] (2/4) Exclude cut with ID _19514_210_9259_1_1531530071330_5084230_117-21193-0_sp1.1 from training. Duration: 0.936375 2023-10-03 07:29:15,044 WARNING [train.py:1204] (2/4) Exclude cut with ID _69584_210_5219_1_1535785133775_4044450_190-21315-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 07:29:16,778 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_53071_210_16554_1_1534493434_1276796_184-302050-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 07:29:16,839 WARNING [train.py:1204] (2/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_120-76843-0_sp1.1 from training. Duration: 0.5 2023-10-03 07:29:19,523 WARNING [train.py:1204] (2/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_318-162017-0_sp1.1 from training. Duration: 0.82725 2023-10-03 07:29:20,919 WARNING [train.py:1204] (2/4) Exclude cut with ID _46067_210_4220_1_1534125571066_3307450_79-46864-0_sp1.1 from training. Duration: 0.92725 2023-10-03 07:29:22,293 WARNING [train.py:1204] (2/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_358-215558-0_sp1.1 from training. Duration: 0.8454375 2023-10-03 07:29:22,366 WARNING [train.py:1204] (2/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_621-66764-0_sp1.1 from training. Duration: 0.92725 2023-10-03 07:29:22,482 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.conv_module2.balancer2.prob, batch_count=1186533.3333333333, ans=0.125 2023-10-03 07:29:25,532 WARNING [train.py:1204] (2/4) Exclude cut with ID _42863_210_9070_1_1533902398788_3696920_338-156851-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 07:29:25,575 WARNING [train.py:1204] (2/4) Exclude cut with ID _5289_210_2084_1_1527678202736_3779299_392-261988-0_sp0.9 from training. Duration: 0.72225 2023-10-03 07:29:29,559 WARNING [train.py:1204] (2/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_409-84682-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 07:29:30,919 WARNING [train.py:1204] (2/4) Exclude cut with ID _9235_210_5219_1_1528891154253_8046120_30-326681-0_sp0.9 from training. Duration: 0.92225 2023-10-03 07:29:30,925 WARNING [train.py:1204] (2/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_389-184695-0_sp1.1 from training. Duration: 0.92725 2023-10-03 07:29:30,970 WARNING [train.py:1204] (2/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_442-320317-0 from training. Duration: 0.82 2023-10-03 07:29:33,958 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.conv_module1.balancer1.prob, batch_count=1186600.0, ans=0.125 2023-10-03 07:29:37,080 WARNING [train.py:1204] (2/4) Exclude cut with ID _44778_210_2054_1_1533798127203_7443090_756-29911-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 07:29:37,180 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_401-112036-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 07:29:38,430 WARNING [train.py:1204] (2/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_181-308827-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 07:29:38,507 WARNING [train.py:1204] (2/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_332-258725-0_sp1.1 from training. Duration: 0.963625 2023-10-03 07:29:38,756 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.nonlin_attention.balancer.min_positive, batch_count=1186600.0, ans=0.05 2023-10-03 07:29:39,838 WARNING [train.py:1204] (2/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_188-260011-0_sp0.9 from training. Duration: 0.811125 2023-10-03 07:29:39,889 WARNING [train.py:1204] (2/4) Exclude cut with ID _49039_210_6315_1_1533967537541_3846118_99-302239-0_sp1.1 from training. Duration: 0.963625 2023-10-03 07:29:42,626 WARNING [train.py:1204] (2/4) Exclude cut with ID _4122_210_2084_1_1526713164417_4353280_312-294828-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 07:29:42,632 WARNING [train.py:1204] (2/4) Exclude cut with ID _14922_210_6228_1_1530707435273_3693310_303-228738-0 from training. Duration: 0.84 2023-10-03 07:29:45,424 INFO [train.py:1046] (2/4) Epoch 34, batch 2700, loss[loss=0.1679, simple_loss=0.2311, pruned_loss=0.05236, over 23666.00 frames. ], tot_loss[loss=0.1623, simple_loss=0.2417, pruned_loss=0.04149, over 4723050.45 frames. ], batch size: 256, lr: 3.00e-03, grad_scale: 16.0 2023-10-03 07:29:45,555 WARNING [train.py:1204] (2/4) Exclude cut with ID _14349_210_8341_1_1530615710818_3442470_115-285975-0_sp0.9 from training. Duration: 0.9555625 2023-10-03 07:29:48,236 WARNING [train.py:1204] (2/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_125-144174-0_sp1.1 from training. Duration: 0.3545625 2023-10-03 07:29:51,671 WARNING [train.py:1204] (2/4) Exclude cut with ID _53565_210_10635_1_1534337218335_3820259_351-54570-0_sp1.1 from training. Duration: 0.97275 2023-10-03 07:29:51,700 WARNING [train.py:1204] (2/4) Exclude cut with ID _28317_210_12742_1_1532480302381_8510325_410-40065-0_sp1.1 from training. Duration: 0.963625 2023-10-03 07:29:51,735 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_38965_210_4852_1_1533174906_3484735_167-339442-0_sp1.1 from training. Duration: 0.963625 2023-10-03 07:29:53,098 WARNING [train.py:1204] (2/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_265-164486-0_sp1.1 from training. Duration: 0.87275 2023-10-03 07:29:53,111 WARNING [train.py:1204] (2/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_725-182484-0_sp1.1 from training. Duration: 0.92725 2023-10-03 07:29:53,141 WARNING [train.py:1204] (2/4) Exclude cut with ID _28317_210_12742_1_1532480302381_8510325_607-213951-0_sp0.9 from training. Duration: 0.9333125 2023-10-03 07:29:53,155 WARNING [train.py:1204] (2/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_605-133395-0_sp1.1 from training. Duration: 0.6 2023-10-03 07:29:53,170 WARNING [train.py:1204] (2/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_351-294650-0 from training. Duration: 0.63 2023-10-03 07:29:55,023 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_14953_210_2151_1_1530701710_5616165_710-165400-0_sp1.1 from training. Duration: 0.7909375 2023-10-03 07:29:55,218 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.bypass_mid.scale_min, batch_count=1186666.6666666667, ans=0.2 2023-10-03 07:29:56,451 WARNING [train.py:1204] (2/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_237-58433-0_sp1.1 from training. Duration: 0.67275 2023-10-03 07:29:57,811 WARNING [train.py:1204] (2/4) Exclude cut with ID _5190_210_4557_1_1527244368799_373439_28-197030-0_sp1.1 from training. Duration: 0.6545625 2023-10-03 07:29:59,253 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_743-327651-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 07:30:02,061 WARNING [train.py:1204] (2/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_76-203867-0_sp0.9 from training. Duration: 0.8 2023-10-03 07:30:03,454 WARNING [train.py:1204] (2/4) Exclude cut with ID _14603_210_4220_1_1530615688441_3097319_274-274798-0 from training. Duration: 0.98 2023-10-03 07:30:03,490 WARNING [train.py:1204] (2/4) Exclude cut with ID _14087_210_2253_1_1530529224512_3014029_226-242140-0_sp1.1 from training. Duration: 0.736375 2023-10-03 07:30:08,180 WARNING [train.py:1204] (2/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_352-59958-0_sp1.1 from training. Duration: 0.7818125 2023-10-03 07:30:08,184 WARNING [train.py:1204] (2/4) Exclude cut with ID _39411_210_5986_1_1533538825030_3343649_191-251819-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 07:30:13,742 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7046_210_6753_1_1527895337_6920917_642-106799-0_sp1.1 from training. Duration: 0.836375 2023-10-03 07:30:13,749 WARNING [train.py:1204] (2/4) Exclude cut with ID _22826_210_9994_1_1531908201522_3279169_412-3442-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 07:30:13,782 WARNING [train.py:1204] (2/4) Exclude cut with ID _60057_210_10640_1_1534903065793_3333150_171-181133-0_sp0.9 from training. Duration: 0.97775 2023-10-03 07:30:13,791 WARNING [train.py:1204] (2/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_154-135696-0_sp1.1 from training. Duration: 0.72725 2023-10-03 07:30:17,860 WARNING [train.py:1204] (2/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_148-186805-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 07:30:19,287 WARNING [train.py:1204] (2/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_605-165641-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 07:30:19,293 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_174-277364-0_sp0.9 from training. Duration: 0.788875 2023-10-03 07:30:19,314 WARNING [train.py:1204] (2/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_89-321955-0_sp0.9 from training. Duration: 0.888875 2023-10-03 07:30:27,567 WARNING [train.py:1204] (2/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_438-105429-0_sp1.1 from training. Duration: 0.963625 2023-10-03 07:30:27,570 WARNING [train.py:1204] (2/4) Exclude cut with ID _14970_210_15709_1_1530773543493_1715730_187-20259-0_sp0.9 from training. Duration: 0.988875 2023-10-03 07:30:33,357 WARNING [train.py:1204] (2/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_385-193052-0_sp0.9 from training. Duration: 0.9555625 2023-10-03 07:30:33,418 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7113_210_1849_1_1527942685_3448768_194-311626-0_sp1.1 from training. Duration: 0.92725 2023-10-03 07:30:37,413 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_3780_210_2087_1_1526467389_2145572_176-107140-0_sp0.9 from training. Duration: 0.7666875 2023-10-03 07:30:37,415 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_36756_210_7234_1_1533026991_966276_123-213457-0_sp1.1 from training. Duration: 0.936375 2023-10-03 07:30:40,579 WARNING [train.py:1204] (2/4) Exclude cut with ID _23557_210_8750_1_1531890106370_6846255_456-244861-0_sp1.1 from training. Duration: 0.963625 2023-10-03 07:30:42,051 WARNING [train.py:1204] (2/4) Exclude cut with ID _37086_210_9038_1_1532999540891_7021250_242-349170-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 07:30:43,407 WARNING [train.py:1204] (2/4) Exclude cut with ID _41298_210_9019_1_1533430371450_4822143_471-281351-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 07:30:44,816 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_53378_210_17282_1_1534221071_5769509_572-126176-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 07:30:46,271 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_1507-263032-0_sp1.1 from training. Duration: 0.963625 2023-10-03 07:30:46,312 WARNING [train.py:1204] (2/4) Exclude cut with ID _13679_210_7497_1_1530534293662_3355098_411-186088-0_sp1.1 from training. Duration: 0.8545625 2023-10-03 07:30:47,684 WARNING [train.py:1204] (2/4) Exclude cut with ID _29345_210_4381_1_1532689168263_3860169_149-257268-0_sp1.1 from training. Duration: 0.736375 2023-10-03 07:30:49,026 WARNING [train.py:1204] (2/4) Exclude cut with ID _47790_210_16187_1_1533862814066_4207519_381-252014-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 07:30:49,032 WARNING [train.py:1204] (2/4) Exclude cut with ID _35799_210_5854_1_1533263487887_3529071_512-2717-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 07:30:49,196 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.feed_forward3.hidden_balancer.prob, batch_count=1186933.3333333333, ans=0.125 2023-10-03 07:30:52,534 WARNING [train.py:1204] (2/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_505-205270-0_sp0.9 from training. Duration: 0.42225 2023-10-03 07:30:53,822 WARNING [train.py:1204] (2/4) Exclude cut with ID _14998_210_5219_1_1530705582080_7474970_731-20117-0_sp1.1 from training. Duration: 0.936375 2023-10-03 07:30:55,223 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_37351_210_7925_1_1533012931_3812113_265-85780-0_sp1.1 from training. Duration: 0.8 2023-10-03 07:30:55,237 WARNING [train.py:1204] (2/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_313-134930-0 from training. Duration: 0.97 2023-10-03 07:30:56,678 WARNING [train.py:1204] (2/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_374-138971-0 from training. Duration: 0.94 2023-10-03 07:30:56,712 WARNING [train.py:1204] (2/4) Exclude cut with ID _30304_210_6319_1_1532476827261_7582060_1227-287886-0_sp1.1 from training. Duration: 0.936375 2023-10-03 07:30:59,393 INFO [train.py:1046] (2/4) Epoch 34, batch 2750, loss[loss=0.1664, simple_loss=0.2494, pruned_loss=0.04166, over 24356.00 frames. ], tot_loss[loss=0.1625, simple_loss=0.2421, pruned_loss=0.04144, over 4726291.70 frames. ], batch size: 77, lr: 3.00e-03, grad_scale: 16.0 2023-10-03 07:30:59,510 WARNING [train.py:1204] (2/4) Exclude cut with ID _59493_210_17282_1_1535078419077_3225879_332-217544-0_sp1.1 from training. Duration: 0.97275 2023-10-03 07:30:59,565 WARNING [train.py:1204] (2/4) Exclude cut with ID _23514_210_1389_1_1531832139785_3031439_361-119640-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 07:31:02,203 WARNING [train.py:1204] (2/4) Exclude cut with ID _69584_210_5219_1_1535785133775_4044450_142-118058-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 07:31:02,224 WARNING [train.py:1204] (2/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_178-177582-0_sp1.1 from training. Duration: 0.67275 2023-10-03 07:31:03,091 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.0.layers.0.self_attn_weights.whiten_keys, num_groups=4, num_channels=128, metric=2.99 vs. limit=6.0 2023-10-03 07:31:03,559 WARNING [train.py:1204] (2/4) Exclude cut with ID _25303_210_17519_1_1532050207576_3635970_41-195572-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 07:31:06,298 WARNING [train.py:1204] (2/4) Exclude cut with ID _55328_210_27767_1_1534580973260_4088170_329-341322-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 07:31:06,341 WARNING [train.py:1204] (2/4) Exclude cut with ID _5608_210_2106_1_1527415439089_6819768_909-82767-0_sp1.1 from training. Duration: 0.5545625 2023-10-03 07:31:07,774 WARNING [train.py:1204] (2/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_261-297503-0_sp1.1 from training. Duration: 0.9 2023-10-03 07:31:07,780 WARNING [train.py:1204] (2/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_459-291706-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 07:31:07,781 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7762_210_1794_1_1528199508_4057408_422-227034-0 from training. Duration: 0.73 2023-10-03 07:31:07,797 WARNING [train.py:1204] (2/4) Exclude cut with ID _3067_210_2667_1_1526212974640_4503168_250-285937-0_sp0.9 from training. Duration: 0.888875 2023-10-03 07:31:07,821 WARNING [train.py:1204] (2/4) Exclude cut with ID _66873_210_5517_1_1535454136506_4293370_332-253745-0_sp1.1 from training. Duration: 0.936375 2023-10-03 07:31:09,966 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.0.layers.1.self_attn2.whiten, num_groups=1, num_channels=192, metric=10.55 vs. limit=22.5 2023-10-03 07:31:10,302 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.538e+02 1.859e+02 2.069e+02 2.359e+02 3.471e+02, threshold=4.138e+02, percent-clipped=0.0 2023-10-03 07:31:17,673 WARNING [train.py:1204] (2/4) Exclude cut with ID _13938_210_9988_1_1530518718978_3693960_304-112784-0 from training. Duration: 0.46 2023-10-03 07:31:19,116 WARNING [train.py:1204] (2/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_226-267957-0_sp1.1 from training. Duration: 0.92725 2023-10-03 07:31:19,155 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_41822_210_13270_1_1533955976_4379284_608-254567-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 07:31:19,218 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_124-113562-0_sp1.1 from training. Duration: 0.8545625 2023-10-03 07:31:20,603 WARNING [train.py:1204] (2/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_117-290916-0_sp1.1 from training. Duration: 0.463625 2023-10-03 07:31:21,988 WARNING [train.py:1204] (2/4) Exclude cut with ID _28427_210_4220_1_1532581240967_3061069_102-66533-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 07:31:22,058 WARNING [train.py:1204] (2/4) Exclude cut with ID _55383_210_10635_1_1534468426172_2231669_134-128246-0_sp1.1 from training. Duration: 0.8090625 2023-10-03 07:31:23,797 WARNING [train.py:1204] (2/4) Exclude cut with ID _45871_210_5665_1_1533864614245_3296060_49-308316-0_sp1.1 from training. Duration: 0.97275 2023-10-03 07:31:24,480 WARNING [train.py:1204] (2/4) Exclude cut with ID _57572_210_5517_1_1534921219624_3809484_176-199205-0_sp1.1 from training. Duration: 0.97275 2023-10-03 07:31:27,346 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.feed_forward2.hidden_balancer.prob, batch_count=1187133.3333333333, ans=0.125 2023-10-03 07:31:28,621 WARNING [train.py:1204] (2/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_45-265684-0_sp1.1 from training. Duration: 0.6545625 2023-10-03 07:31:28,648 WARNING [train.py:1204] (2/4) Exclude cut with ID _15150_210_8341_1_1530752411803_7256384_469-239509-0_sp0.9 from training. Duration: 0.7555625 2023-10-03 07:31:29,996 WARNING [train.py:1204] (2/4) Exclude cut with ID _28461_210_2253_1_1532771775189_3041299_136-270644-0_sp1.1 from training. Duration: 0.8454375 2023-10-03 07:31:30,065 WARNING [train.py:1204] (2/4) Exclude cut with ID _69584_210_5219_1_1535785133775_4044450_302-47302-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 07:31:31,430 WARNING [train.py:1204] (2/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_166-128941-0_sp1.1 from training. Duration: 0.4909375 2023-10-03 07:31:38,232 WARNING [train.py:1204] (2/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_559-120504-0_sp1.1 from training. Duration: 0.97275 2023-10-03 07:31:38,487 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.conv_module2.balancer2.prob, batch_count=1187133.3333333333, ans=0.125 2023-10-03 07:31:39,668 WARNING [train.py:1204] (2/4) Exclude cut with ID _6456_210_1534_1_1527681634212_3181049_345-252877-0_sp1.1 from training. Duration: 0.5818125 2023-10-03 07:31:40,952 WARNING [train.py:1204] (2/4) Exclude cut with ID _28317_210_12742_1_1532480302381_8510325_902-233003-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 07:31:43,794 WARNING [train.py:1204] (2/4) Exclude cut with ID _36538_210_9994_1_1533204020363_3795620_204-261385-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 07:31:43,798 WARNING [train.py:1204] (2/4) Exclude cut with ID _14821_210_8750_1_1530680503991_6829359_189-297451-0_sp1.1 from training. Duration: 0.5 2023-10-03 07:31:43,841 WARNING [train.py:1204] (2/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_1211-226066-0_sp0.9 from training. Duration: 0.8444375 2023-10-03 07:31:49,863 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6786_210_1388_1_1527839699_4194495_273-149841-0_sp1.1 from training. Duration: 0.836375 2023-10-03 07:31:49,894 WARNING [train.py:1204] (2/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_380-162648-0_sp1.1 from training. Duration: 0.8545625 2023-10-03 07:31:49,894 WARNING [train.py:1204] (2/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_494-199538-0 from training. Duration: 0.75 2023-10-03 07:31:53,276 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.1.encoder.layers.1.nonlin_attention.whiten1, num_groups=1, num_channels=192, metric=5.86 vs. limit=10.0 2023-10-03 07:31:53,888 WARNING [train.py:1204] (2/4) Exclude cut with ID _49097_210_16049_1_1533952780207_4360979_276-87053-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 07:31:56,537 WARNING [train.py:1204] (2/4) Exclude cut with ID _5444_210_2151_1_1527418808464_5312210_726-27165-0 from training. Duration: 0.67 2023-10-03 07:32:00,767 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_128-202104-0_sp1.1 from training. Duration: 0.57275 2023-10-03 07:32:03,534 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_11349_210_6753_1_1529800861_1101207_26-65602-0_sp1.1 from training. Duration: 0.87275 2023-10-03 07:32:03,562 WARNING [train.py:1204] (2/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_159-328157-0 from training. Duration: 0.52 2023-10-03 07:32:04,860 WARNING [train.py:1204] (2/4) Exclude cut with ID _14069_210_5632_1_1530512692033_4803390_98-21934-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 07:32:06,287 WARNING [train.py:1204] (2/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_352-59958-0_sp0.9 from training. Duration: 0.9555625 2023-10-03 07:32:06,334 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6163_210_6400_1_1527506746_4263068_283-137811-0 from training. Duration: 0.77 2023-10-03 07:32:06,362 WARNING [train.py:1204] (2/4) Exclude cut with ID _21989_210_5854_1_1531899214931_3271360_133-44759-0_sp1.1 from training. Duration: 0.7 2023-10-03 07:32:07,779 WARNING [train.py:1204] (2/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_21-174514-0_sp0.9 from training. Duration: 0.47775 2023-10-03 07:32:09,130 WARNING [train.py:1204] (2/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_201-78811-0_sp1.1 from training. Duration: 0.963625 2023-10-03 07:32:09,164 WARNING [train.py:1204] (2/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_537-164630-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 07:32:09,234 WARNING [train.py:1204] (2/4) Exclude cut with ID _14087_210_2253_1_1530529224512_3014029_156-257607-0 from training. Duration: 0.83 2023-10-03 07:32:09,245 WARNING [train.py:1204] (2/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_308-154450-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 07:32:09,867 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.3.encoder.layers.1.feed_forward3.out_whiten, num_groups=1, num_channels=512, metric=9.29 vs. limit=15.0 2023-10-03 07:32:11,113 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_45399_210_4852_1_1533624725_7579737_214-240574-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 07:32:11,368 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.feed_forward1.hidden_balancer.prob, batch_count=1187333.3333333333, ans=0.125 2023-10-03 07:32:12,364 INFO [train.py:1046] (2/4) Epoch 34, batch 2800, loss[loss=0.1521, simple_loss=0.2051, pruned_loss=0.04956, over 18732.00 frames. ], tot_loss[loss=0.1616, simple_loss=0.2406, pruned_loss=0.04132, over 4712375.46 frames. ], batch size: 388, lr: 3.00e-03, grad_scale: 32.0 2023-10-03 07:32:12,498 WARNING [train.py:1204] (2/4) Exclude cut with ID _29303_210_2575_1_1532392237303_7410649_615-305517-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 07:32:13,818 WARNING [train.py:1204] (2/4) Exclude cut with ID IC0125W0002-61808 from training. Duration: 0.708 2023-10-03 07:32:13,819 WARNING [train.py:1204] (2/4) Exclude cut with ID IC0125W0017-61823 from training. Duration: 0.759 2023-10-03 07:32:16,670 WARNING [train.py:1204] (2/4) Exclude cut with ID _53219_210_4525_1_1534834836008_4064825_4-145291-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 07:32:20,096 WARNING [train.py:1204] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_185-174805-0_sp1.1 from training. Duration: 0.6181875 2023-10-03 07:32:20,119 WARNING [train.py:1204] (2/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_635-193046-0_sp1.1 from training. Duration: 0.92725 2023-10-03 07:32:23,358 WARNING [train.py:1204] (2/4) Exclude cut with ID _46067_210_4220_1_1534125571066_3307450_321-258866-0_sp1.1 from training. Duration: 0.97275 2023-10-03 07:32:25,307 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8558_210_1410_1_1528505225_7613406_131-329524-0 from training. Duration: 0.79 2023-10-03 07:32:26,765 WARNING [train.py:1204] (2/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_847-41334-0_sp0.9 from training. Duration: 0.488875 2023-10-03 07:32:28,149 WARNING [train.py:1204] (2/4) Exclude cut with ID _14227_210_4381_1_1530701804286_3940259_125-275642-0 from training. Duration: 0.99 2023-10-03 07:32:28,246 WARNING [train.py:1204] (2/4) Exclude cut with ID _25248_210_8750_1_1532062825941_5554180_248-177255-0_sp1.1 from training. Duration: 0.963625 2023-10-03 07:32:29,600 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_40047_210_2290_1_1533290519_2374311_195-165693-0_sp1.1 from training. Duration: 0.8090625 2023-10-03 07:32:29,603 WARNING [train.py:1204] (2/4) Exclude cut with ID _23109_210_4381_1_1532150828777_901949_29-258672-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 07:32:33,816 WARNING [train.py:1204] (2/4) Exclude cut with ID _14821_210_8750_1_1530680503991_6829359_2-227151-0_sp1.1 from training. Duration: 0.7545625 2023-10-03 07:32:33,847 WARNING [train.py:1204] (2/4) Exclude cut with ID _49643_210_7989_1_1534640244224_3939031_565-53982-0_sp1.1 from training. Duration: 0.963625 2023-10-03 07:32:33,850 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7544_210_6753_1_1528591327_1471426_33-346031-0_sp0.9 from training. Duration: 0.67775 2023-10-03 07:32:35,134 WARNING [train.py:1204] (2/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_95-61782-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 07:32:41,348 WARNING [train.py:1204] (2/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_686-54383-0_sp0.9 from training. Duration: 0.9666875 2023-10-03 07:32:44,052 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_14056_210_4163_1_1530604483_7572827_505-276823-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 07:32:46,762 WARNING [train.py:1204] (2/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_745-128443-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 07:32:46,832 WARNING [train.py:1204] (2/4) Exclude cut with ID _3844_210_1390_1_1526727803582_2871209_20-251664-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 07:32:48,169 WARNING [train.py:1204] (2/4) Exclude cut with ID _36538_210_9994_1_1533204020363_3795620_487-164628-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 07:32:49,640 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.0.layers.1.nonlin_attention.whiten2, num_groups=1, num_channels=192, metric=5.52 vs. limit=15.0 2023-10-03 07:32:52,938 WARNING [train.py:1204] (2/4) Exclude cut with ID _5166_210_2084_1_1527073197160_4087993_309-222545-0_sp1.1 from training. Duration: 0.763625 2023-10-03 07:32:52,952 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_3531_210_3528_1_1526294203_5216456_309-227302-0 from training. Duration: 0.69 2023-10-03 07:32:54,773 WARNING [train.py:1204] (2/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_42-192460-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 07:32:54,849 WARNING [train.py:1204] (2/4) Exclude cut with ID _22083_210_8254_1_1531702862439_3678970_49-234546-0_sp1.1 from training. Duration: 0.7545625 2023-10-03 07:32:54,853 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_427-78097-0_sp0.9 from training. Duration: 0.888875 2023-10-03 07:32:55,138 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.conv_module2.balancer1.prob, batch_count=1187466.6666666667, ans=0.125 2023-10-03 07:32:58,980 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_43802_210_2290_1_1533516203_8829504_378-205254-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 07:32:59,018 WARNING [train.py:1204] (2/4) Exclude cut with ID _45329_210_7005_1_1534157951211_6774339_672-314830-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 07:33:01,796 WARNING [train.py:1204] (2/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_90-30673-0_sp1.1 from training. Duration: 0.763625 2023-10-03 07:33:04,527 WARNING [train.py:1204] (2/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_563-329943-0_sp1.1 from training. Duration: 0.9 2023-10-03 07:33:04,542 WARNING [train.py:1204] (2/4) Exclude cut with ID _21632_210_2253_1_1531994001765_3744140_154-84090-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 07:33:04,543 WARNING [train.py:1204] (2/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_103-336064-0_sp0.9 from training. Duration: 0.5666875 2023-10-03 07:33:05,873 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_43-71989-0_sp0.9 from training. Duration: 0.6333125 2023-10-03 07:33:05,938 WARNING [train.py:1204] (2/4) Exclude cut with ID _3867_210_1883_1_1526641200575_4475830_319-349850-0_sp0.9 from training. Duration: 0.7444375 2023-10-03 07:33:07,198 WARNING [train.py:1204] (2/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_629-179454-0_sp1.1 from training. Duration: 0.963625 2023-10-03 07:33:07,213 WARNING [train.py:1204] (2/4) Exclude cut with ID _65263_210_22356_1_1535763598691_3921037_49-222917-0 from training. Duration: 0.92 2023-10-03 07:33:07,241 WARNING [train.py:1204] (2/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_602-331772-0_sp1.1 from training. Duration: 0.936375 2023-10-03 07:33:08,655 WARNING [train.py:1204] (2/4) Exclude cut with ID _58414_210_8254_1_1535421609162_7151182_292-134978-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 07:33:08,693 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_38793_210_2544_1_1533176114_3849320_200-299292-0_sp1.1 from training. Duration: 0.936375 2023-10-03 07:33:09,561 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.0.layers.1.feed_forward2.out_whiten, num_groups=1, num_channels=192, metric=4.66 vs. limit=15.0 2023-10-03 07:33:10,568 WARNING [train.py:1204] (2/4) Exclude cut with ID _14970_210_15709_1_1530775547361_1966965_6-23328-0 from training. Duration: 0.86 2023-10-03 07:33:12,100 WARNING [train.py:1204] (2/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_386-225511-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 07:33:12,111 WARNING [train.py:1204] (2/4) Exclude cut with ID _38323_210_12108_1_1533121233369_3303049_416-11919-0_sp1.1 from training. Duration: 0.87275 2023-10-03 07:33:13,512 WARNING [train.py:1204] (2/4) Exclude cut with ID _4866_210_2577_1_1527404412284_4364659_256-306830-0_sp1.1 from training. Duration: 0.7909375 2023-10-03 07:33:14,912 WARNING [train.py:1204] (2/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_53-300544-0 from training. Duration: 0.67 2023-10-03 07:33:22,344 WARNING [train.py:1204] (2/4) Exclude cut with ID _41314_210_12651_1_1533459476543_3766460_80-74184-0_sp1.1 from training. Duration: 0.7545625 2023-10-03 07:33:22,365 WARNING [train.py:1204] (2/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_332-256168-0_sp1.1 from training. Duration: 0.6818125 2023-10-03 07:33:23,662 WARNING [train.py:1204] (2/4) Exclude cut with ID _48836_210_12210_1_1534075848293_3904150_429-35475-0_sp1.1 from training. Duration: 0.8090625 2023-10-03 07:33:25,618 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_144-135833-0_sp1.1 from training. Duration: 0.97275 2023-10-03 07:33:27,108 INFO [train.py:1046] (2/4) Epoch 34, batch 2850, loss[loss=0.1355, simple_loss=0.2109, pruned_loss=0.03006, over 15447.00 frames. ], tot_loss[loss=0.1601, simple_loss=0.2389, pruned_loss=0.0407, over 4695124.09 frames. ], batch size: 33, lr: 3.00e-03, grad_scale: 32.0 2023-10-03 07:33:29,927 WARNING [train.py:1204] (2/4) Exclude cut with ID _14224_210_4381_1_1530529062037_4264010_380-343670-0_sp0.9 from training. Duration: 0.97775 2023-10-03 07:33:29,956 WARNING [train.py:1204] (2/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_213-265174-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 07:33:29,988 WARNING [train.py:1204] (2/4) Exclude cut with ID _20455_210_5219_1_1531565996775_8098100_86-314283-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 07:33:32,865 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_41750_210_1960_1_1533520678_3948042_68-223813-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 07:33:34,047 WARNING [train.py:1204] (2/4) Exclude cut with ID _55153_210_2253_1_1534384786677_3162096_24-198429-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 07:33:35,658 WARNING [train.py:1204] (2/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_119-207162-0_sp0.9 from training. Duration: 0.788875 2023-10-03 07:33:35,725 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_12868_210_6753_1_1530702479_3610564_148-172861-0 from training. Duration: 0.5 2023-10-03 07:33:37,207 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.bypass.scale_min, batch_count=1187666.6666666667, ans=0.2 2023-10-03 07:33:38,279 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.584e+02 1.832e+02 1.949e+02 2.119e+02 3.126e+02, threshold=3.897e+02, percent-clipped=0.0 2023-10-03 07:33:41,283 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_5522_210_3273_1_1527249606_3909096_318-203153-0 from training. Duration: 0.43 2023-10-03 07:33:41,298 WARNING [train.py:1204] (2/4) Exclude cut with ID _14884_210_2252_1_1530707459689_5561250_288-189949-0_sp1.1 from training. Duration: 0.97275 2023-10-03 07:33:42,741 WARNING [train.py:1204] (2/4) Exclude cut with ID _5294_210_1747_1_1527249643928_3696179_529-242298-0 from training. Duration: 0.95 2023-10-03 07:33:44,700 WARNING [train.py:1204] (2/4) Exclude cut with ID _36538_210_9994_1_1533204020363_3795620_503-110289-0_sp1.1 from training. Duration: 0.936375 2023-10-03 07:33:47,500 WARNING [train.py:1204] (2/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_385-193052-0 from training. Duration: 0.86 2023-10-03 07:33:47,571 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_456-22141-0 from training. Duration: 0.88 2023-10-03 07:33:48,987 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_38965_210_4852_1_1533174906_3484735_284-117840-0_sp1.1 from training. Duration: 0.936375 2023-10-03 07:34:01,780 WARNING [train.py:1204] (2/4) Exclude cut with ID _32704_210_8954_1_1533017488840_6747370_75-201625-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 07:34:03,133 WARNING [train.py:1204] (2/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_255-312654-0_sp1.1 from training. Duration: 0.82725 2023-10-03 07:34:03,154 WARNING [train.py:1204] (2/4) Exclude cut with ID _47251_210_18628_1_1533814063128_4137250_214-88076-0_sp0.9 from training. Duration: 0.97775 2023-10-03 07:34:03,230 WARNING [train.py:1204] (2/4) Exclude cut with ID _4092_210_2151_1_1526643044449_3135759_227-213972-0_sp1.1 from training. Duration: 0.4909375 2023-10-03 07:34:03,247 WARNING [train.py:1204] (2/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_912-47473-0_sp1.1 from training. Duration: 0.6181875 2023-10-03 07:34:03,275 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6767_210_3273_1_1527855783_1718932_173-256601-0_sp1.1 from training. Duration: 0.72725 2023-10-03 07:34:06,012 WARNING [train.py:1204] (2/4) Exclude cut with ID _6274_210_2084_1_1527937583837_7494020_32-205288-0_sp1.1 from training. Duration: 0.7090625 2023-10-03 07:34:06,053 WARNING [train.py:1204] (2/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_39-248870-0 from training. Duration: 0.69 2023-10-03 07:34:07,532 WARNING [train.py:1204] (2/4) Exclude cut with ID _3541_210_2477_1_1526475408709_3263973_377-296839-0_sp1.1 from training. Duration: 0.5 2023-10-03 07:34:08,806 WARNING [train.py:1204] (2/4) Exclude cut with ID _13679_210_7497_1_1530534293662_3355098_413-56154-0_sp1.1 from training. Duration: 0.963625 2023-10-03 07:34:08,858 WARNING [train.py:1204] (2/4) Exclude cut with ID _14970_210_15709_1_1530775547361_1966965_201-84544-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 07:34:08,909 WARNING [train.py:1204] (2/4) Exclude cut with ID _21783_210_11357_1_1531742458730_3762679_105-95775-0_sp1.1 from training. Duration: 0.936375 2023-10-03 07:34:09,628 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.2.encoder.layers.2.conv_module1.whiten, num_groups=1, num_channels=384, metric=6.90 vs. limit=15.0 2023-10-03 07:34:12,850 WARNING [train.py:1204] (2/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_131-191669-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 07:34:12,884 WARNING [train.py:1204] (2/4) Exclude cut with ID _14860_210_4915_1_1530702020340_7343240_695-4323-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 07:34:14,212 WARNING [train.py:1204] (2/4) Exclude cut with ID _39867_210_9826_1_1533553222030_9344259_991-130565-0_sp1.1 from training. Duration: 0.97275 2023-10-03 07:34:14,343 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_50-3289-0_sp1.1 from training. Duration: 0.82725 2023-10-03 07:34:16,133 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_54-347670-0_sp1.1 from training. Duration: 0.92725 2023-10-03 07:34:17,495 WARNING [train.py:1204] (2/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_200-153445-0_sp1.1 from training. Duration: 0.936375 2023-10-03 07:34:17,571 WARNING [train.py:1204] (2/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_279-302911-0_sp1.1 from training. Duration: 0.97275 2023-10-03 07:34:20,836 WARNING [train.py:1204] (2/4) Exclude cut with ID _14886_210_4220_1_1530702020355_3516190_159-73748-0_sp0.9 from training. Duration: 0.92225 2023-10-03 07:34:25,337 WARNING [train.py:1204] (2/4) Exclude cut with ID _53379_210_20034_1_1534222027363_3854654_23-222715-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 07:34:27,094 WARNING [train.py:1204] (2/4) Exclude cut with ID _12555_210_8750_1_1530075594402_3780239_449-267986-0 from training. Duration: 0.83 2023-10-03 07:34:27,107 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7544_210_6753_1_1528587722_3576686_295-258479-0 from training. Duration: 0.84 2023-10-03 07:34:27,326 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.balancer1.prob, batch_count=1187933.3333333333, ans=0.125 2023-10-03 07:34:27,349 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.feed_forward3.hidden_balancer.prob, batch_count=1187933.3333333333, ans=0.125 2023-10-03 07:34:28,469 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7002_210_1794_1_1527935634_4034444_487-236501-0_sp0.9 from training. Duration: 0.7333125 2023-10-03 07:34:28,522 WARNING [train.py:1204] (2/4) Exclude cut with ID _21783_210_11357_1_1531742458730_3762679_244-19107-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 07:34:29,881 WARNING [train.py:1204] (2/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_390-339137-0 from training. Duration: 0.56 2023-10-03 07:34:31,254 WARNING [train.py:1204] (2/4) Exclude cut with ID _7562_210_2084_1_1528887767916_3946229_186-326422-0_sp1.1 from training. Duration: 0.763625 2023-10-03 07:34:31,333 WARNING [train.py:1204] (2/4) Exclude cut with ID _33640_210_2655_1_1533117605479_4855469_645-145235-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 07:34:31,349 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_25767_210_2161_1_1532307483_3867748_506-171777-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 07:34:32,638 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_5438_210_1794_1_1527336687_2600009_233-58072-0_sp1.1 from training. Duration: 0.7 2023-10-03 07:34:32,639 WARNING [train.py:1204] (2/4) Exclude cut with ID IC0669W0063-330908 from training. Duration: 0.896 2023-10-03 07:34:32,687 WARNING [train.py:1204] (2/4) Exclude cut with ID IC0671W0070-331913 from training. Duration: 0.896 2023-10-03 07:34:32,690 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_258-310570-0_sp1.1 from training. Duration: 0.7454375 2023-10-03 07:34:32,749 WARNING [train.py:1204] (2/4) Exclude cut with ID _9461_210_4557_1_1529060514726_3677451_162-177507-0_sp1.1 from training. Duration: 0.97275 2023-10-03 07:34:36,820 WARNING [train.py:1204] (2/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_26-175553-0_sp0.9 from training. Duration: 0.67775 2023-10-03 07:34:36,850 WARNING [train.py:1204] (2/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_7-150481-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 07:34:38,132 WARNING [train.py:1204] (2/4) Exclude cut with ID _23557_210_8750_1_1531890106370_6846255_72-273262-0_sp1.1 from training. Duration: 0.963625 2023-10-03 07:34:38,223 WARNING [train.py:1204] (2/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_367-61330-0 from training. Duration: 0.75 2023-10-03 07:34:40,968 INFO [train.py:1046] (2/4) Epoch 34, batch 2900, loss[loss=0.1497, simple_loss=0.2351, pruned_loss=0.03218, over 24468.00 frames. ], tot_loss[loss=0.1599, simple_loss=0.2387, pruned_loss=0.04053, over 4708197.73 frames. ], batch size: 66, lr: 3.00e-03, grad_scale: 32.0 2023-10-03 07:34:41,149 WARNING [train.py:1204] (2/4) Exclude cut with ID _39867_210_9826_1_1533553222030_9344259_1337-11141-0_sp1.1 from training. Duration: 0.936375 2023-10-03 07:34:41,174 WARNING [train.py:1204] (2/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_101-293367-0 from training. Duration: 0.63 2023-10-03 07:34:42,935 WARNING [train.py:1204] (2/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_10-100367-0 from training. Duration: 0.98 2023-10-03 07:34:44,876 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.4.encoder.layers.1.conv_module1.whiten, num_groups=1, num_channels=384, metric=7.47 vs. limit=15.0 2023-10-03 07:34:45,643 WARNING [train.py:1204] (2/4) Exclude cut with ID _60057_210_10640_1_1534903065793_3333150_171-181133-0_sp1.1 from training. Duration: 0.8 2023-10-03 07:34:45,653 WARNING [train.py:1204] (2/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_844-96989-0_sp0.9 from training. Duration: 0.788875 2023-10-03 07:34:48,327 WARNING [train.py:1204] (2/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_301-144595-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 07:34:48,412 WARNING [train.py:1204] (2/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_115-147343-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 07:34:53,342 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_3171_210_2087_1_1526109084_2330007_37-64334-0_sp1.1 from training. Duration: 0.7454375 2023-10-03 07:34:53,391 WARNING [train.py:1204] (2/4) Exclude cut with ID _19514_210_9259_1_1531530071330_5084230_769-272914-0_sp1.1 from training. Duration: 0.936375 2023-10-03 07:34:56,115 WARNING [train.py:1204] (2/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_76-311963-0_sp0.9 from training. Duration: 0.77775 2023-10-03 07:34:57,455 WARNING [train.py:1204] (2/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_526-62079-0 from training. Duration: 0.67 2023-10-03 07:34:57,514 WARNING [train.py:1204] (2/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_7-111954-0_sp0.9 from training. Duration: 0.62225 2023-10-03 07:34:59,480 WARNING [train.py:1204] (2/4) Exclude cut with ID _57997_210_4163_1_1534766425889_3961049_360-279140-0_sp1.1 from training. Duration: 0.97275 2023-10-03 07:35:01,179 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.feed_forward3.hidden_balancer.prob, batch_count=1188066.6666666667, ans=0.125 2023-10-03 07:35:01,510 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.5.encoder.layers.0.feed_forward1.out_whiten, num_groups=1, num_channels=256, metric=6.10 vs. limit=15.0 2023-10-03 07:35:02,273 WARNING [train.py:1204] (2/4) Exclude cut with ID _4866_210_2577_1_1527404412284_4364659_133-1696-0 from training. Duration: 0.99 2023-10-03 07:35:02,347 WARNING [train.py:1204] (2/4) Exclude cut with ID _5190_210_4557_1_1527244368799_373439_28-197030-0 from training. Duration: 0.72 2023-10-03 07:35:05,182 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_53-39003-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 07:35:05,184 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6673_210_3273_1_1527767790_3173240_351-167954-0 from training. Duration: 0.48 2023-10-03 07:35:05,200 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_2822_210_2715_1_1526112586_1999587_165-36656-0_sp1.1 from training. Duration: 0.8090625 2023-10-03 07:35:05,414 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.ff3_skip_rate, batch_count=1188066.6666666667, ans=0.0 2023-10-03 07:35:08,004 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_328-264892-0_sp1.1 from training. Duration: 0.92725 2023-10-03 07:35:08,008 WARNING [train.py:1204] (2/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_29-293896-0_sp0.9 from training. Duration: 0.67775 2023-10-03 07:35:11,949 WARNING [train.py:1204] (2/4) Exclude cut with ID _65263_210_22356_1_1535763598691_3921037_465-55448-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 07:35:13,340 WARNING [train.py:1204] (2/4) Exclude cut with ID _21989_210_5854_1_1531899214931_3271360_311-53106-0_sp1.1 from training. Duration: 0.97275 2023-10-03 07:35:15,374 WARNING [train.py:1204] (2/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_389-277915-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 07:35:16,978 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.conv_skip_rate, batch_count=1188133.3333333333, ans=0.0 2023-10-03 07:35:18,188 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_69630_210_7234_1_1535791094_677837_63-277139-0_sp1.1 from training. Duration: 0.963625 2023-10-03 07:35:19,804 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=1188133.3333333333, ans=0.1 2023-10-03 07:35:21,322 WARNING [train.py:1204] (2/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_85-138257-0 from training. Duration: 0.71 2023-10-03 07:35:21,350 WARNING [train.py:1204] (2/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_686-247600-0 from training. Duration: 0.92 2023-10-03 07:35:21,355 WARNING [train.py:1204] (2/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_108-255430-0_sp1.1 from training. Duration: 0.8818125 2023-10-03 07:35:23,001 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.ff3_skip_rate, batch_count=1188133.3333333333, ans=0.0 2023-10-03 07:35:26,080 WARNING [train.py:1204] (2/4) Exclude cut with ID _14884_210_2252_1_1530707459689_5561250_114-201399-0_sp1.1 from training. Duration: 0.6909375 2023-10-03 07:35:27,483 WARNING [train.py:1204] (2/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_506-42614-0 from training. Duration: 0.79 2023-10-03 07:35:28,838 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_43802_210_2290_1_1533516203_8829504_85-195240-0_sp1.1 from training. Duration: 0.7909375 2023-10-03 07:35:31,766 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.nonlin_attention.balancer.max_positive, batch_count=1188200.0, ans=0.95 2023-10-03 07:35:34,825 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_141-36243-0_sp1.1 from training. Duration: 0.97275 2023-10-03 07:35:43,239 WARNING [train.py:1204] (2/4) Exclude cut with ID _7852_210_5219_1_1528286471316_8139400_358-287492-0_sp1.1 from training. Duration: 0.87275 2023-10-03 07:35:43,256 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_805-71497-0_sp0.9 from training. Duration: 0.788875 2023-10-03 07:35:43,338 WARNING [train.py:1204] (2/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_178-39722-0 from training. Duration: 0.49 2023-10-03 07:35:46,760 WARNING [train.py:1204] (2/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_428-181925-0_sp1.1 from training. Duration: 0.963625 2023-10-03 07:35:46,771 WARNING [train.py:1204] (2/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_770-200430-0 from training. Duration: 0.89 2023-10-03 07:35:48,086 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_1104-271009-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 07:35:48,145 WARNING [train.py:1204] (2/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_128-296581-0_sp0.9 from training. Duration: 0.82225 2023-10-03 07:35:52,387 WARNING [train.py:1204] (2/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_19-62511-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 07:35:54,522 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_805-71497-0 from training. Duration: 0.71 2023-10-03 07:35:55,837 INFO [train.py:1046] (2/4) Epoch 34, batch 2950, loss[loss=0.1474, simple_loss=0.2223, pruned_loss=0.03629, over 24346.00 frames. ], tot_loss[loss=0.1601, simple_loss=0.2392, pruned_loss=0.04045, over 4717239.30 frames. ], batch size: 56, lr: 3.00e-03, grad_scale: 16.0 2023-10-03 07:35:56,554 WARNING [train.py:1204] (2/4) Exclude cut with ID _44778_210_2054_1_1533798127203_7443090_573-152077-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 07:35:56,567 WARNING [train.py:1204] (2/4) Exclude cut with ID _42875_210_14856_1_1533704397487_4012433_393-177816-0_sp1.1 from training. Duration: 0.963625 2023-10-03 07:35:57,654 WARNING [train.py:1204] (2/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_278-35159-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 07:35:59,044 WARNING [train.py:1204] (2/4) Exclude cut with ID _15037_210_2253_1_1530752294104_3267650_13-130361-0_sp1.1 from training. Duration: 0.9 2023-10-03 07:36:00,372 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_3019_210_2028_1_1526044245_3264865_127-135615-0 from training. Duration: 0.94 2023-10-03 07:36:00,433 WARNING [train.py:1204] (2/4) Exclude cut with ID _28317_210_12742_1_1532480302381_8510325_607-213951-0 from training. Duration: 0.84 2023-10-03 07:36:00,495 WARNING [train.py:1204] (2/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_436-318113-0_sp0.9 from training. Duration: 0.8333125 2023-10-03 07:36:00,496 WARNING [train.py:1204] (2/4) Exclude cut with ID _13938_210_9988_1_1530518718978_3693960_148-152225-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 07:36:05,225 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_2541_210_1794_1_1525590269_3084765_156-173713-0_sp1.1 from training. Duration: 0.736375 2023-10-03 07:36:07,789 WARNING [train.py:1204] (2/4) Exclude cut with ID _28323_210_16187_1_1532480395667_4100300_531-24157-0_sp1.1 from training. Duration: 0.8909375 2023-10-03 07:36:07,927 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.0.feed_forward1.out_proj.dropout_p, batch_count=1188333.3333333333, ans=0.1 2023-10-03 07:36:08,013 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.conv_module2.balancer1.prob, batch_count=1188333.3333333333, ans=0.125 2023-10-03 07:36:08,911 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.604e+02 1.911e+02 2.066e+02 2.316e+02 3.128e+02, threshold=4.131e+02, percent-clipped=0.0 2023-10-03 07:36:10,802 WARNING [train.py:1204] (2/4) Exclude cut with ID _29303_210_2575_1_1532392237303_7410649_382-217492-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 07:36:10,835 WARNING [train.py:1204] (2/4) Exclude cut with ID _58895_210_8750_1_1535184081320_3649089_270-158013-0_sp1.1 from training. Duration: 0.92725 2023-10-03 07:36:13,566 WARNING [train.py:1204] (2/4) Exclude cut with ID _29277_210_3279_1_1532581109293_3173069_311-206227-0_sp1.1 from training. Duration: 0.936375 2023-10-03 07:36:13,584 WARNING [train.py:1204] (2/4) Exclude cut with ID _7160_210_1876_1_1527940923231_3703799_282-322611-0_sp0.9 from training. Duration: 0.9666875 2023-10-03 07:36:14,962 WARNING [train.py:1204] (2/4) Exclude cut with ID _53219_210_4525_1_1534834836008_4064825_562-32295-0_sp1.1 from training. Duration: 0.963625 2023-10-03 07:36:16,313 WARNING [train.py:1204] (2/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_408-87330-0_sp1.1 from training. Duration: 0.963625 2023-10-03 07:36:16,321 WARNING [train.py:1204] (2/4) Exclude cut with ID _59202_210_26984_1_1534816810612_3549174_137-318710-0_sp1.1 from training. Duration: 0.8090625 2023-10-03 07:36:19,100 WARNING [train.py:1204] (2/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_372-30302-0 from training. Duration: 0.86 2023-10-03 07:36:21,382 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.0.layers.0.feed_forward1.out_whiten, num_groups=1, num_channels=192, metric=11.34 vs. limit=15.0 2023-10-03 07:36:25,035 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7543_210_6753_1_1528541907_1399322_178-56269-0_sp1.1 from training. Duration: 0.95275 2023-10-03 07:36:26,309 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9109W0012-549758 from training. Duration: 0.512 2023-10-03 07:36:26,382 WARNING [train.py:1204] (2/4) Exclude cut with ID _5410_210_1388_1_1527397055795_1719020_39-112221-0_sp0.9 from training. Duration: 0.7666875 2023-10-03 07:36:29,631 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9121W0012-554933 from training. Duration: 0.896 2023-10-03 07:36:31,157 WARNING [train.py:1204] (2/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_476-272757-0 from training. Duration: 0.93 2023-10-03 07:36:31,177 WARNING [train.py:1204] (2/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_1085-171718-0_sp1.1 from training. Duration: 0.92725 2023-10-03 07:36:31,234 WARNING [train.py:1204] (2/4) Exclude cut with ID _8480_210_1874_1_1528589391205_6728330_309-139896-0_sp1.1 from training. Duration: 0.736375 2023-10-03 07:36:31,234 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9130W0113-559703 from training. Duration: 0.896 2023-10-03 07:36:31,238 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_292-265928-0_sp1.1 from training. Duration: 0.463625 2023-10-03 07:36:33,882 WARNING [train.py:1204] (2/4) Exclude cut with ID _8772_210_1850_1_1528635595972_3664011_96-12756-0 from training. Duration: 0.98 2023-10-03 07:36:35,225 WARNING [train.py:1204] (2/4) Exclude cut with ID _12556_210_8341_1_1530081024100_7327690_725-193477-0_sp1.1 from training. Duration: 0.8909375 2023-10-03 07:36:35,281 WARNING [train.py:1204] (2/4) Exclude cut with ID _5289_210_2084_1_1527678202736_3779299_142-103317-0_sp1.1 from training. Duration: 0.763625 2023-10-03 07:36:35,401 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.conv_skip_rate, batch_count=1188466.6666666667, ans=0.0 2023-10-03 07:36:38,385 WARNING [train.py:1204] (2/4) Exclude cut with ID _45012_210_12651_1_1533608435136_5371380_206-51075-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 07:36:39,765 WARNING [train.py:1204] (2/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_176-56120-0_sp1.1 from training. Duration: 0.7545625 2023-10-03 07:36:39,788 WARNING [train.py:1204] (2/4) Exclude cut with ID _40284_210_2084_1_1533513679814_3704279_220-201864-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 07:36:39,820 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9165W0004-576638 from training. Duration: 0.896 2023-10-03 07:36:39,857 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_5438_210_1794_1_1527336687_2600009_218-343336-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 07:36:41,138 WARNING [train.py:1204] (2/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_318-162017-0 from training. Duration: 0.91 2023-10-03 07:36:41,448 INFO [scaling.py:1118] (2/4) WithLoss: name=encoder.encoders.3.encoder.layers.1.self_attn_weights, loss-sum=0.000e+00 2023-10-03 07:36:46,963 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_49544_210_1794_1_1534379770_5262083_483-349298-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 07:36:48,323 WARNING [train.py:1204] (2/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_197-28164-0_sp0.9 from training. Duration: 0.888875 2023-10-03 07:36:48,393 WARNING [train.py:1204] (2/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_643-52007-0 from training. Duration: 0.57 2023-10-03 07:36:49,714 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_38965_210_4852_1_1533174906_3484735_403-332963-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 07:36:51,207 WARNING [train.py:1204] (2/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_340-174176-0 from training. Duration: 0.49 2023-10-03 07:36:52,699 WARNING [train.py:1204] (2/4) Exclude cut with ID _56708_210_13177_1_1535252351282_3422899_363-917-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 07:36:54,086 WARNING [train.py:1204] (2/4) Exclude cut with ID _19896_210_1883_1_1531709022662_6649569_525-159063-0_sp1.1 from training. Duration: 0.936375 2023-10-03 07:36:55,390 WARNING [train.py:1204] (2/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_430-116139-0_sp0.9 from training. Duration: 0.9444375 2023-10-03 07:36:57,397 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_53753_210_7234_1_1534320003_3594148_86-329250-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 07:36:57,407 WARNING [train.py:1204] (2/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_406-275649-0_sp1.1 from training. Duration: 0.3818125 2023-10-03 07:36:58,765 WARNING [train.py:1204] (2/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_1083-147536-0_sp1.1 from training. Duration: 0.8818125 2023-10-03 07:36:58,830 WARNING [train.py:1204] (2/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_54-311741-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 07:36:58,841 WARNING [train.py:1204] (2/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_237-58433-0_sp0.9 from training. Duration: 0.82225 2023-10-03 07:37:00,140 WARNING [train.py:1204] (2/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_98-51676-0_sp1.1 from training. Duration: 0.663625 2023-10-03 07:37:00,185 WARNING [train.py:1204] (2/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_386-270472-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 07:37:02,124 WARNING [train.py:1204] (2/4) Exclude cut with ID _19023_210_4353_1_1531378892552_5820780_83-254794-0_sp1.1 from training. Duration: 0.7818125 2023-10-03 07:37:03,485 WARNING [train.py:1204] (2/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_314-93656-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 07:37:03,496 WARNING [train.py:1204] (2/4) Exclude cut with ID _14170_210_5219_1_1530525949868_4469650_336-213530-0 from training. Duration: 0.9 2023-10-03 07:37:04,843 WARNING [train.py:1204] (2/4) Exclude cut with ID _36686_210_5665_1_1533778225913_3646410_65-221120-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 07:37:05,115 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.ff3_skip_rate, batch_count=1188600.0, ans=0.0 2023-10-03 07:37:06,769 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_26433_210_12108_1_1532157158_4056480_271-68178-0_sp1.1 from training. Duration: 0.92725 2023-10-03 07:37:06,818 WARNING [train.py:1204] (2/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_46-207599-0_sp0.9 from training. Duration: 0.711125 2023-10-03 07:37:09,470 INFO [train.py:1046] (2/4) Epoch 34, batch 3000, loss[loss=0.1622, simple_loss=0.2363, pruned_loss=0.04406, over 23316.00 frames. ], tot_loss[loss=0.1603, simple_loss=0.2395, pruned_loss=0.04051, over 4731176.94 frames. ], batch size: 119, lr: 3.00e-03, grad_scale: 16.0 2023-10-03 07:37:09,470 INFO [train.py:1069] (2/4) Computing validation loss 2023-10-03 07:37:21,175 INFO [train.py:1078] (2/4) Epoch 34, validation: loss=0.3506, simple_loss=0.2704, pruned_loss=0.2154, over 1125622.00 frames. 2023-10-03 07:37:21,176 INFO [train.py:1079] (2/4) Maximum memory allocated so far is 20967MB 2023-10-03 07:37:21,293 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9303W0114-637133 from training. Duration: 0.896 2023-10-03 07:37:22,675 WARNING [train.py:1204] (2/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_490-207861-0 from training. Duration: 0.98 2023-10-03 07:37:24,617 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6767_210_3273_1_1527855783_1718932_173-256601-0_sp0.9 from training. Duration: 0.888875 2023-10-03 07:37:24,652 WARNING [train.py:1204] (2/4) Exclude cut with ID _13921_210_9994_1_1530841782272_3503520_327-69677-0_sp1.1 from training. Duration: 0.8181875 2023-10-03 07:37:25,985 WARNING [train.py:1204] (2/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_508-335369-0 from training. Duration: 0.48 2023-10-03 07:37:26,007 WARNING [train.py:1204] (2/4) Exclude cut with ID _64631_210_27767_1_1535349580068_4165160_24-46567-0_sp1.1 from training. Duration: 0.963625 2023-10-03 07:37:32,113 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_89-35748-0_sp1.1 from training. Duration: 0.7181875 2023-10-03 07:37:32,227 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.0.feed_forward1.hidden_balancer.prob, batch_count=1188666.6666666667, ans=0.125 2023-10-03 07:37:35,507 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.conv_module2.balancer2.min_abs, batch_count=1188733.3333333333, ans=0.5 2023-10-03 07:37:37,009 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=1188733.3333333333, ans=0.1 2023-10-03 07:37:40,869 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_26433_210_12108_1_1532157158_4056480_578-262107-0_sp1.1 from training. Duration: 0.9 2023-10-03 07:37:46,793 WARNING [train.py:1204] (2/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_129-337802-0 from training. Duration: 0.72 2023-10-03 07:37:47,878 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.0.layers.0.feed_forward1.out_whiten, num_groups=1, num_channels=192, metric=10.26 vs. limit=15.0 2023-10-03 07:37:48,231 WARNING [train.py:1204] (2/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_356-167816-0_sp0.9 from training. Duration: 0.8 2023-10-03 07:37:49,648 WARNING [train.py:1204] (2/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_137-116442-0_sp1.1 from training. Duration: 0.7090625 2023-10-03 07:37:49,681 WARNING [train.py:1204] (2/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_358-172940-0_sp1.1 from training. Duration: 0.963625 2023-10-03 07:37:51,046 WARNING [train.py:1204] (2/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_668-344147-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 07:37:51,297 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.1.self_attn_weights.pos_emb_skip_rate, batch_count=1188800.0, ans=0.0 2023-10-03 07:37:53,072 WARNING [train.py:1204] (2/4) Exclude cut with ID _62337_210_12392_1_1535009273105_3412229_255-157667-0_sp1.1 from training. Duration: 0.936375 2023-10-03 07:37:53,075 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_486-151131-0 from training. Duration: 0.85 2023-10-03 07:37:54,498 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_53638_210_21581_1_1534297319_5808449_885-80172-0 from training. Duration: 0.96 2023-10-03 07:37:55,905 WARNING [train.py:1204] (2/4) Exclude cut with ID _50093_210_4220_1_1534417141207_3432299_105-183067-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 07:37:57,818 WARNING [train.py:1204] (2/4) Exclude cut with ID _7562_210_2084_1_1528887767916_3946229_345-169156-0_sp0.9 from training. Duration: 0.6444375 2023-10-03 07:37:59,084 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_4528_210_3273_1_1527076654_3276777_209-219804-0_sp0.9 from training. Duration: 0.7666875 2023-10-03 07:37:59,126 WARNING [train.py:1204] (2/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_35-153886-0_sp0.9 from training. Duration: 0.9333125 2023-10-03 07:37:59,178 WARNING [train.py:1204] (2/4) Exclude cut with ID _30800_210_22382_1_1532499347904_4515959_332-121925-0_sp1.1 from training. Duration: 0.92725 2023-10-03 07:37:59,180 WARNING [train.py:1204] (2/4) Exclude cut with ID _59202_210_26984_1_1534816810612_3549174_495-65515-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 07:38:03,724 WARNING [train.py:1204] (2/4) Exclude cut with ID _6274_210_2084_1_1527937583837_7494020_769-168302-0_sp1.1 from training. Duration: 0.6454375 2023-10-03 07:38:03,754 WARNING [train.py:1204] (2/4) Exclude cut with ID _24271_210_12658_1_1531987270022_7190519_708-71345-0_sp1.1 from training. Duration: 0.936375 2023-10-03 07:38:03,754 WARNING [train.py:1204] (2/4) Exclude cut with ID 3033-130750-0096-55598-0_sp0.9 from training. Duration: 0.92225 2023-10-03 07:38:05,090 WARNING [train.py:1204] (2/4) Exclude cut with ID _14087_210_2253_1_1530529224512_3014029_219-224445-0_sp0.9 from training. Duration: 0.9333125 2023-10-03 07:38:07,876 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_174-277364-0 from training. Duration: 0.71 2023-10-03 07:38:08,053 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.bypass.scale_min, batch_count=1188866.6666666667, ans=0.2 2023-10-03 07:38:09,308 WARNING [train.py:1204] (2/4) Exclude cut with ID _45021_210_9994_1_1533619751094_3330288_96-239010-0_sp1.1 from training. Duration: 0.82725 2023-10-03 07:38:10,574 WARNING [train.py:1204] (2/4) Exclude cut with ID _31602_210_5854_1_1532677021456_4458250_489-305116-0_sp1.1 from training. Duration: 0.97275 2023-10-03 07:38:10,600 WARNING [train.py:1204] (2/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_269-154726-0_sp0.9 from training. Duration: 0.9555625 2023-10-03 07:38:14,787 WARNING [train.py:1204] (2/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_126-54418-0_sp1.1 from training. Duration: 0.92725 2023-10-03 07:38:14,816 WARNING [train.py:1204] (2/4) Exclude cut with ID _42326_210_7770_1_1533814282531_3707269_182-224159-0_sp1.1 from training. Duration: 0.92725 2023-10-03 07:38:15,089 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.feed_forward2.hidden_balancer.prob, batch_count=1188866.6666666667, ans=0.125 2023-10-03 07:38:16,797 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7002_210_1794_1_1527939683_5157286_1-281264-0_sp0.9 from training. Duration: 0.488875 2023-10-03 07:38:16,825 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_86-16881-0 from training. Duration: 0.58 2023-10-03 07:38:16,855 WARNING [train.py:1204] (2/4) Exclude cut with ID _19514_210_9259_1_1531530071330_5084230_637-279094-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 07:38:18,210 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_26433_210_12108_1_1532157158_4056480_578-262107-0 from training. Duration: 0.99 2023-10-03 07:38:18,254 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6291_210_3273_1_1527683649_1078498_82-221196-0_sp1.1 from training. Duration: 0.6909375 2023-10-03 07:38:18,435 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.nonlin_attention.balancer.min_positive, batch_count=1188866.6666666667, ans=0.05 2023-10-03 07:38:19,690 WARNING [train.py:1204] (2/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_402-102311-0 from training. Duration: 0.67 2023-10-03 07:38:21,132 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1015-343884-0_sp1.1 from training. Duration: 0.563625 2023-10-03 07:38:24,109 WARNING [train.py:1204] (2/4) Exclude cut with ID _13684_210_7497_1_1530619354863_3598359_312-235606-0_sp1.1 from training. Duration: 0.5454375 2023-10-03 07:38:24,147 WARNING [train.py:1204] (2/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_55-31904-0 from training. Duration: 0.88 2023-10-03 07:38:25,457 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_3133_210_2087_1_1526106446_2324690_25-332138-0 from training. Duration: 0.91 2023-10-03 07:38:25,459 WARNING [train.py:1204] (2/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_912-282412-0_sp0.9 from training. Duration: 0.5444375 2023-10-03 07:38:25,535 WARNING [train.py:1204] (2/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_458-299435-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 07:38:26,835 WARNING [train.py:1204] (2/4) Exclude cut with ID _69584_210_5219_1_1535785133775_4044450_275-88403-0_sp1.1 from training. Duration: 0.92725 2023-10-03 07:38:26,843 WARNING [train.py:1204] (2/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_841-341679-0_sp1.1 from training. Duration: 0.52725 2023-10-03 07:38:26,856 WARNING [train.py:1204] (2/4) Exclude cut with ID _41391_210_7994_1_1533603771872_3638201_1-54521-0_sp1.1 from training. Duration: 0.97275 2023-10-03 07:38:28,202 WARNING [train.py:1204] (2/4) Exclude cut with ID _56235_210_10640_1_1534557335235_4161099_495-186609-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 07:38:32,762 WARNING [train.py:1204] (2/4) Exclude cut with ID _4681_210_3525_1_1527250689464_120889_15-126760-0 from training. Duration: 0.81 2023-10-03 07:38:34,085 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_3019_210_2028_1_1526044245_3264865_127-135615-0_sp1.1 from training. Duration: 0.8545625 2023-10-03 07:38:34,296 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.feed_forward1.out_proj.dropout_p, batch_count=1189000.0, ans=0.1 2023-10-03 07:38:34,731 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.3.encoder.layers.2.feed_forward2.out_whiten, num_groups=1, num_channels=512, metric=8.81 vs. limit=15.0 2023-10-03 07:38:35,341 INFO [train.py:1046] (2/4) Epoch 34, batch 3050, loss[loss=0.1538, simple_loss=0.2425, pruned_loss=0.03251, over 24656.00 frames. ], tot_loss[loss=0.1608, simple_loss=0.2401, pruned_loss=0.04075, over 4724015.46 frames. ], batch size: 65, lr: 3.00e-03, grad_scale: 16.0 2023-10-03 07:38:36,789 WARNING [train.py:1204] (2/4) Exclude cut with ID _30191_210_1664_1_1533189915047_7196670_915-21561-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 07:38:36,828 WARNING [train.py:1204] (2/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_6-214815-0_sp0.9 from training. Duration: 0.9444375 2023-10-03 07:38:38,932 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.2.encoder.layers.2.self_attn2.whiten, num_groups=1, num_channels=384, metric=18.79 vs. limit=22.5 2023-10-03 07:38:40,915 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_45399_210_4852_1_1533624725_7579737_190-137494-0_sp1.1 from training. Duration: 0.97275 2023-10-03 07:38:43,647 WARNING [train.py:1204] (2/4) Exclude cut with ID _41314_210_12651_1_1533459476543_3766460_207-132038-0 from training. Duration: 0.94 2023-10-03 07:38:47,567 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.492e+02 1.885e+02 2.088e+02 2.309e+02 3.994e+02, threshold=4.176e+02, percent-clipped=0.0 2023-10-03 07:38:50,399 WARNING [train.py:1204] (2/4) Exclude cut with ID _14603_210_4220_1_1530615688441_3097319_90-31383-0 from training. Duration: 0.87 2023-10-03 07:38:50,443 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_15517_210_6753_1_1530952293_1664795_119-31001-0 from training. Duration: 0.94 2023-10-03 07:38:50,928 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.5.encoder.layers.0.feed_forward3.out_whiten, num_groups=1, num_channels=256, metric=9.23 vs. limit=15.0 2023-10-03 07:38:52,128 WARNING [train.py:1204] (2/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_684-85342-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 07:38:56,771 WARNING [train.py:1204] (2/4) Exclude cut with ID _14603_210_4220_1_1530615688441_3097319_97-325053-0_sp1.1 from training. Duration: 0.836375 2023-10-03 07:38:58,279 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_68702_210_23689_1_1535799577_3420068_168-25150-0_sp1.1 from training. Duration: 0.97275 2023-10-03 07:38:59,570 WARNING [train.py:1204] (2/4) Exclude cut with ID _65134_210_14763_1_1535335393985_3505368_111-27984-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 07:38:59,629 WARNING [train.py:1204] (2/4) Exclude cut with ID _48678_210_5663_1_1533988830947_3769199_276-226230-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 07:39:02,438 WARNING [train.py:1204] (2/4) Exclude cut with ID _28427_210_4220_1_1532581240967_3061069_43-84928-0_sp1.1 from training. Duration: 0.9 2023-10-03 07:39:03,103 WARNING [train.py:1204] (2/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_158-250116-0_sp1.1 from training. Duration: 0.563625 2023-10-03 07:39:03,125 WARNING [train.py:1204] (2/4) Exclude cut with ID _66873_210_5517_1_1535454136506_4293370_279-332556-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 07:39:04,441 WARNING [train.py:1204] (2/4) Exclude cut with ID _42863_210_9070_1_1533902398788_3696920_488-230146-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 07:39:04,441 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_387-247159-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 07:39:05,790 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6729_210_2550_1_1528032481_1788725_261-42619-0_sp1.1 from training. Duration: 0.97275 2023-10-03 07:39:07,159 WARNING [train.py:1204] (2/4) Exclude cut with ID _19896_210_1883_1_1531709022662_6649569_831-44724-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 07:39:11,122 WARNING [train.py:1204] (2/4) Exclude cut with ID _55153_210_2253_1_1534384786677_3162096_280-285944-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 07:39:11,151 WARNING [train.py:1204] (2/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_448-4048-0 from training. Duration: 0.96 2023-10-03 07:39:11,194 WARNING [train.py:1204] (2/4) Exclude cut with ID _29678_210_7893_1_1532421042787_3906118_456-202548-0_sp1.1 from training. Duration: 0.97275 2023-10-03 07:39:11,213 WARNING [train.py:1204] (2/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_107-332398-0_sp1.1 from training. Duration: 0.7090625 2023-10-03 07:39:11,666 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.5.encoder.layers.1.nonlin_attention.whiten2, num_groups=1, num_channels=256, metric=4.55 vs. limit=15.0 2023-10-03 07:39:15,306 WARNING [train.py:1204] (2/4) Exclude cut with ID _13921_210_9994_1_1530838702800_3003429_46-149444-0_sp1.1 from training. Duration: 0.8545625 2023-10-03 07:39:15,345 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_257-20733-0_sp0.9 from training. Duration: 0.8444375 2023-10-03 07:39:16,682 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_53753_210_7234_1_1534320003_3594148_385-234809-0_sp1.1 from training. Duration: 0.963625 2023-10-03 07:39:16,736 WARNING [train.py:1204] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_292-215346-0_sp1.1 from training. Duration: 0.8818125 2023-10-03 07:39:22,085 WARNING [train.py:1204] (2/4) Exclude cut with ID _68965_210_1883_1_1535698962272_3497490_196-124268-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 07:39:22,126 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_23801_210_16223_1_1532066392_3294657_570-5439-0_sp1.1 from training. Duration: 0.8818125 2023-10-03 07:39:22,270 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.0.feed_forward1.out_proj.dropout_p, batch_count=1189200.0, ans=0.1 2023-10-03 07:39:28,292 WARNING [train.py:1204] (2/4) Exclude cut with ID _5166_210_2084_1_1527073197160_4087993_349-6928-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 07:39:28,332 WARNING [train.py:1204] (2/4) Exclude cut with ID _60621_210_17071_1_1535013053530_3671739_226-255202-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 07:39:28,335 WARNING [train.py:1204] (2/4) Exclude cut with ID _41817_210_18835_1_1533434477160_7423049_448-130302-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 07:39:31,072 WARNING [train.py:1204] (2/4) Exclude cut with ID _6653_210_2629_1_1527919172441_6732679_758-143677-0_sp1.1 from training. Duration: 0.8909375 2023-10-03 07:39:31,111 WARNING [train.py:1204] (2/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_912-47473-0_sp0.9 from training. Duration: 0.7555625 2023-10-03 07:39:31,148 WARNING [train.py:1204] (2/4) Exclude cut with ID _6694_210_4599_1_1527852637490_3965529_356-169233-0_sp1.1 from training. Duration: 0.9 2023-10-03 07:39:32,561 WARNING [train.py:1204] (2/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_1-99680-0 from training. Duration: 0.67 2023-10-03 07:39:34,510 WARNING [train.py:1204] (2/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_199-86432-0_sp1.1 from training. Duration: 0.8909375 2023-10-03 07:39:34,540 WARNING [train.py:1204] (2/4) Exclude cut with ID _61148_210_6897_1_1535094022726_4217610_362-182430-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 07:39:35,863 WARNING [train.py:1204] (2/4) Exclude cut with ID _49376_210_5986_1_1533985278732_3370740_301-154763-0 from training. Duration: 0.98 2023-10-03 07:39:37,315 WARNING [train.py:1204] (2/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_572-185150-0_sp1.1 from training. Duration: 0.8818125 2023-10-03 07:39:37,400 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder_embed.conv.5.prob, batch_count=1189266.6666666667, ans=0.125 2023-10-03 07:39:43,191 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_47896_210_20508_1_1534492518_5958868_558-314572-0_sp1.1 from training. Duration: 0.8818125 2023-10-03 07:39:44,495 WARNING [train.py:1204] (2/4) Exclude cut with ID _45560_210_21982_1_1533812603951_3584999_111-6376-0_sp0.9 from training. Duration: 0.9333125 2023-10-03 07:39:47,525 INFO [train.py:1046] (2/4) Epoch 34, batch 3100, loss[loss=0.1558, simple_loss=0.2448, pruned_loss=0.03343, over 24537.00 frames. ], tot_loss[loss=0.1604, simple_loss=0.2396, pruned_loss=0.04059, over 4724275.28 frames. ], batch size: 71, lr: 3.00e-03, grad_scale: 16.0 2023-10-03 07:39:47,640 WARNING [train.py:1204] (2/4) Exclude cut with ID _14170_210_5219_1_1530525949868_4469650_412-317077-0_sp1.1 from training. Duration: 0.6818125 2023-10-03 07:39:49,029 WARNING [train.py:1204] (2/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_718-68505-0 from training. Duration: 0.89 2023-10-03 07:39:51,655 WARNING [train.py:1204] (2/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_77-311404-0 from training. Duration: 0.94 2023-10-03 07:39:51,735 WARNING [train.py:1204] (2/4) Exclude cut with ID _60229_210_27767_1_1534838406158_3655350_264-279573-0 from training. Duration: 0.83 2023-10-03 07:39:54,464 WARNING [train.py:1204] (2/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_106-300985-0_sp1.1 from training. Duration: 0.8181875 2023-10-03 07:39:58,300 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_4183_210_2087_1_1526729100_1869838_79-277189-0_sp1.1 from training. Duration: 0.963625 2023-10-03 07:39:58,311 WARNING [train.py:1204] (2/4) Exclude cut with ID _28427_210_4220_1_1532581240967_3061069_195-295268-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 07:39:59,833 WARNING [train.py:1204] (2/4) Exclude cut with ID _4092_210_2151_1_1526643044449_3135759_356-278694-0_sp0.9 from training. Duration: 0.52225 2023-10-03 07:40:01,492 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.conv_skip_rate, batch_count=1189400.0, ans=0.0 2023-10-03 07:40:03,014 WARNING [train.py:1204] (2/4) Exclude cut with ID _14459_210_2228_1_1531443641946_7516700_777-71446-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 07:40:08,944 WARNING [train.py:1204] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_520-126729-0 from training. Duration: 0.7 2023-10-03 07:40:12,956 WARNING [train.py:1204] (2/4) Exclude cut with ID _13684_210_7497_1_1530619354863_3598359_312-235606-0_sp0.9 from training. Duration: 0.6666875 2023-10-03 07:40:13,003 WARNING [train.py:1204] (2/4) Exclude cut with ID _37537_210_2253_1_1533171758568_3421949_283-349402-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 07:40:13,033 WARNING [train.py:1204] (2/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_29-117932-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 07:40:14,490 WARNING [train.py:1204] (2/4) Exclude cut with ID _29303_210_2575_1_1532392237303_7410649_721-283351-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 07:40:15,775 WARNING [train.py:1204] (2/4) Exclude cut with ID _14886_210_4220_1_1530702020355_3516190_175-229073-0_sp0.9 from training. Duration: 0.5 2023-10-03 07:40:17,156 WARNING [train.py:1204] (2/4) Exclude cut with ID _15458_210_8341_1_1530858614263_3834410_70-268830-0_sp1.1 from training. Duration: 0.97275 2023-10-03 07:40:17,173 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_2295_210_2087_1_1525500993_2407863_234-83104-0 from training. Duration: 0.62 2023-10-03 07:40:17,174 WARNING [train.py:1204] (2/4) Exclude cut with ID _22961_210_8254_1_1531976366672_4278180_317-199546-0_sp1.1 from training. Duration: 0.7818125 2023-10-03 07:40:18,685 WARNING [train.py:1204] (2/4) Exclude cut with ID _27894_210_12392_1_1532415496078_6902439_547-196022-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 07:40:18,788 WARNING [train.py:1204] (2/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_46-207599-0 from training. Duration: 0.64 2023-10-03 07:40:19,462 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.3.encoder.layers.1.self_attn_weights.whiten_keys, num_groups=8, num_channels=256, metric=4.13 vs. limit=6.0 2023-10-03 07:40:20,201 WARNING [train.py:1204] (2/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_426-326246-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 07:40:21,692 WARNING [train.py:1204] (2/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_445-117398-0_sp0.9 from training. Duration: 0.988875 2023-10-03 07:40:22,996 WARNING [train.py:1204] (2/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_563-347832-0 from training. Duration: 0.89 2023-10-03 07:40:24,685 WARNING [train.py:1204] (2/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_252-216674-0 from training. Duration: 0.54 2023-10-03 07:40:24,918 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=1189466.6666666667, ans=0.1 2023-10-03 07:40:26,561 WARNING [train.py:1204] (2/4) Exclude cut with ID _59611_210_8895_1_1534741198240_3650361_187-212779-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 07:40:26,745 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.nonlin_attention.balancer.min_positive, batch_count=1189466.6666666667, ans=0.05 2023-10-03 07:40:27,926 WARNING [train.py:1204] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_282-159367-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 07:40:29,442 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_68702_210_23689_1_1535799577_3420068_33-136883-0_sp1.1 from training. Duration: 0.936375 2023-10-03 07:40:29,453 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_47228_210_4983_1_1533780021_3672027_184-293706-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 07:40:29,481 WARNING [train.py:1204] (2/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_656-188361-0_sp0.9 from training. Duration: 0.9666875 2023-10-03 07:40:30,853 WARNING [train.py:1204] (2/4) Exclude cut with ID _4866_210_2577_1_1527404412284_4364659_352-157930-0_sp0.9 from training. Duration: 0.9 2023-10-03 07:40:30,854 WARNING [train.py:1204] (2/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_210-106047-0_sp1.1 from training. Duration: 0.8818125 2023-10-03 07:40:34,579 WARNING [train.py:1204] (2/4) Exclude cut with ID _9389_210_1664_1_1529217337301_5289269_289-213417-0_sp1.1 from training. Duration: 0.8090625 2023-10-03 07:40:34,613 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_3910_210_1794_1_1526777667_3747260_178-41186-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 07:40:34,616 WARNING [train.py:1204] (2/4) Exclude cut with ID _27052_210_5986_1_1532329197187_3585009_24-172263-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 07:40:34,622 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_5522_210_3273_1_1527249606_3909096_318-203153-0_sp1.1 from training. Duration: 0.3909375 2023-10-03 07:40:38,834 WARNING [train.py:1204] (2/4) Exclude cut with ID _27894_210_12392_1_1532415496078_6902439_222-51982-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 07:40:39,015 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.1.bypass_mid.scale_min, batch_count=1189533.3333333333, ans=0.2 2023-10-03 07:40:40,184 WARNING [train.py:1204] (2/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_87-131427-0 from training. Duration: 0.78 2023-10-03 07:40:40,301 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder_embed.conv.8.prob, batch_count=1189533.3333333333, ans=0.125 2023-10-03 07:40:41,615 WARNING [train.py:1204] (2/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_372-298093-0_sp0.9 from training. Duration: 0.888875 2023-10-03 07:40:41,678 WARNING [train.py:1204] (2/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_663-166973-0 from training. Duration: 0.8 2023-10-03 07:40:41,708 WARNING [train.py:1204] (2/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_429-177469-0_sp1.1 from training. Duration: 0.936375 2023-10-03 07:40:41,741 WARNING [train.py:1204] (2/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_724-172651-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 07:40:43,128 WARNING [train.py:1204] (2/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_103-336064-0 from training. Duration: 0.51 2023-10-03 07:40:46,152 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=1189600.0, ans=0.1 2023-10-03 07:40:53,455 WARNING [train.py:1204] (2/4) Exclude cut with ID _48677_210_5663_1_1533902405919_3892989_111-11439-0 from training. Duration: 0.96 2023-10-03 07:40:56,767 WARNING [train.py:1204] (2/4) Exclude cut with ID _8085_210_4557_1_1528455633493_3716100_95-103398-0_sp1.1 from training. Duration: 0.963625 2023-10-03 07:40:58,127 WARNING [train.py:1204] (2/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_1041-110837-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 07:41:00,801 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_50050_210_16223_1_1535435796_7290138_1155-121757-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 07:41:00,804 WARNING [train.py:1204] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_602-275260-0_sp0.9 from training. Duration: 0.911125 2023-10-03 07:41:00,876 WARNING [train.py:1204] (2/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_265-164486-0 from training. Duration: 0.96 2023-10-03 07:41:02,587 INFO [train.py:1046] (2/4) Epoch 34, batch 3150, loss[loss=0.1669, simple_loss=0.2524, pruned_loss=0.04066, over 24355.00 frames. ], tot_loss[loss=0.1594, simple_loss=0.2385, pruned_loss=0.04014, over 4732686.65 frames. ], batch size: 77, lr: 2.99e-03, grad_scale: 16.0 2023-10-03 07:41:02,710 WARNING [train.py:1204] (2/4) Exclude cut with ID _46067_210_4220_1_1534125571066_3307450_142-229375-0_sp1.1 from training. Duration: 0.963625 2023-10-03 07:41:02,735 WARNING [train.py:1204] (2/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_310-26186-0_sp1.1 from training. Duration: 0.636375 2023-10-03 07:41:04,085 WARNING [train.py:1204] (2/4) Exclude cut with ID _28461_210_2253_1_1532771775189_3041299_136-270644-0 from training. Duration: 0.93 2023-10-03 07:41:05,881 WARNING [train.py:1204] (2/4) Exclude cut with ID _14860_210_4915_1_1530702020340_7343240_438-286372-0_sp1.1 from training. Duration: 0.936375 2023-10-03 07:41:06,557 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.2.encoder.layers.1.feed_forward1.out_whiten, num_groups=1, num_channels=384, metric=7.81 vs. limit=15.0 2023-10-03 07:41:08,629 WARNING [train.py:1204] (2/4) Exclude cut with ID IC0128W0004-63309 from training. Duration: 0.831 2023-10-03 07:41:10,059 WARNING [train.py:1204] (2/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_135-174080-0 from training. Duration: 0.53 2023-10-03 07:41:10,098 WARNING [train.py:1204] (2/4) Exclude cut with ID _41326_210_5854_1_1533778076509_3492080_277-59511-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 07:41:10,328 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.conv_module1.balancer1.min_positive, batch_count=1189666.6666666667, ans=0.025 2023-10-03 07:41:11,417 WARNING [train.py:1204] (2/4) Exclude cut with ID IC0144W0112-71379 from training. Duration: 0.992 2023-10-03 07:41:11,483 WARNING [train.py:1204] (2/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_711-15688-0_sp0.9 from training. Duration: 0.47775 2023-10-03 07:41:14,071 WARNING [train.py:1204] (2/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_237-332027-0 from training. Duration: 0.95 2023-10-03 07:41:14,132 WARNING [train.py:1204] (2/4) Exclude cut with ID _7970_210_1883_1_1528455616296_3472230_25-178407-0 from training. Duration: 0.71 2023-10-03 07:41:14,134 WARNING [train.py:1204] (2/4) Exclude cut with ID _8480_210_1874_1_1528589391205_6728330_309-139896-0 from training. Duration: 0.81 2023-10-03 07:41:14,150 WARNING [train.py:1204] (2/4) Exclude cut with ID _39571_210_13449_1_1533801584143_3651077_319-268877-0_sp1.1 from training. Duration: 0.936375 2023-10-03 07:41:14,153 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_43802_210_2290_1_1533516203_8829504_155-246880-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 07:41:15,389 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.646e+02 1.879e+02 2.066e+02 2.438e+02 3.109e+02, threshold=4.131e+02, percent-clipped=0.0 2023-10-03 07:41:16,087 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.3.encoder.layers.1.self_attn1.whiten, num_groups=1, num_channels=512, metric=14.29 vs. limit=22.5 2023-10-03 07:41:16,869 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_566-122599-0_sp1.1 from training. Duration: 0.936375 2023-10-03 07:41:18,213 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_12868_210_6753_1_1530701932_530275_18-76322-0 from training. Duration: 0.87 2023-10-03 07:41:18,339 WARNING [train.py:1204] (2/4) Exclude cut with ID _56494_210_4220_1_1535284837625_3033169_159-106035-0_sp1.1 from training. Duration: 0.963625 2023-10-03 07:41:18,370 WARNING [train.py:1204] (2/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_98-276013-0_sp1.1 from training. Duration: 0.963625 2023-10-03 07:41:19,774 WARNING [train.py:1204] (2/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_330-216124-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 07:41:19,908 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_177-58005-0_sp0.9 from training. Duration: 0.688875 2023-10-03 07:41:25,761 WARNING [train.py:1204] (2/4) Exclude cut with ID _56235_210_10640_1_1534557335235_4161099_343-139898-0 from training. Duration: 0.96 2023-10-03 07:41:25,825 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6961_210_2087_1_1527937168_6815011_395-185060-0_sp1.1 from training. Duration: 0.863625 2023-10-03 07:41:30,263 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7002_210_1794_1_1527939683_5157286_184-229630-0_sp0.9 from training. Duration: 0.82225 2023-10-03 07:41:30,308 WARNING [train.py:1204] (2/4) Exclude cut with ID _59170_210_6721_1_1534983633960_4203335_153-18895-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 07:41:30,354 WARNING [train.py:1204] (2/4) Exclude cut with ID _49643_210_7989_1_1534640244224_3939031_598-161926-0 from training. Duration: 0.9 2023-10-03 07:41:30,614 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.balancer2.prob, batch_count=1189800.0, ans=0.125 2023-10-03 07:41:34,110 WARNING [train.py:1204] (2/4) Exclude cut with ID _4068_210_2629_1_1526697350395_1546259_224-288693-0 from training. Duration: 0.59 2023-10-03 07:41:34,169 WARNING [train.py:1204] (2/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_718-68505-0_sp1.1 from training. Duration: 0.8090625 2023-10-03 07:41:35,554 WARNING [train.py:1204] (2/4) Exclude cut with ID _4068_210_2629_1_1526697350395_1546259_224-288693-0_sp0.9 from training. Duration: 0.6555625 2023-10-03 07:41:35,565 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_2296_210_2087_1_1525503448_3397335_57-185430-0_sp0.9 from training. Duration: 0.6666875 2023-10-03 07:41:36,871 WARNING [train.py:1204] (2/4) Exclude cut with ID _15458_210_8341_1_1530858614263_3834410_49-235472-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 07:41:36,874 WARNING [train.py:1204] (2/4) Exclude cut with ID _64135_210_2253_1_1535677267876_7608972_498-5039-0_sp0.9 from training. Duration: 0.8666875 2023-10-03 07:41:36,985 WARNING [train.py:1204] (2/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_283-220372-0_sp0.9 from training. Duration: 0.62225 2023-10-03 07:41:36,994 WARNING [train.py:1204] (2/4) Exclude cut with ID _6473_210_1741_1_1527901307396_3815380_108-17335-0_sp0.9 from training. Duration: 0.72225 2023-10-03 07:41:38,268 WARNING [train.py:1204] (2/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_113-92608-0 from training. Duration: 0.54 2023-10-03 07:41:38,545 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.ff2_skip_rate, batch_count=1189800.0, ans=0.0 2023-10-03 07:41:39,585 WARNING [train.py:1204] (2/4) Exclude cut with ID _14170_210_5219_1_1530525949868_4469650_272-249985-0_sp1.1 from training. Duration: 0.6090625 2023-10-03 07:41:39,602 WARNING [train.py:1204] (2/4) Exclude cut with ID _50376_210_13401_1_1534031502101_4217740_518-187390-0_sp1.1 from training. Duration: 0.97275 2023-10-03 07:41:39,851 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.conv_module1.balancer1.prob, batch_count=1189800.0, ans=0.125 2023-10-03 07:41:42,303 WARNING [train.py:1204] (2/4) Exclude cut with ID _59738_210_15379_1_1534849512562_3941470_165-248260-0_sp1.1 from training. Duration: 0.92725 2023-10-03 07:41:42,312 WARNING [train.py:1204] (2/4) Exclude cut with ID _23818_210_6228_1_1531917063884_3676920_400-343925-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 07:41:42,370 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6961_210_2087_1_1527937168_6815011_562-165518-0 from training. Duration: 0.77 2023-10-03 07:41:42,604 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=1189800.0, ans=0.1 2023-10-03 07:41:43,874 WARNING [train.py:1204] (2/4) Exclude cut with ID _24271_210_12658_1_1531987270022_7190519_289-207931-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 07:41:45,244 WARNING [train.py:1204] (2/4) Exclude cut with ID _14632_210_9988_1_1530691279392_3923200_332-105904-0 from training. Duration: 0.9 2023-10-03 07:41:46,507 WARNING [train.py:1204] (2/4) Exclude cut with ID _40402_210_18219_1_1533522324115_2953150_203-232438-0_sp1.1 from training. Duration: 0.97275 2023-10-03 07:41:46,619 WARNING [train.py:1204] (2/4) Exclude cut with ID _13252_210_1925_1_1530671372047_3336909_263-168388-0 from training. Duration: 0.86 2023-10-03 07:41:47,952 WARNING [train.py:1204] (2/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_174-205058-0 from training. Duration: 0.95 2023-10-03 07:41:50,707 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_94-195801-0_sp1.1 from training. Duration: 0.8545625 2023-10-03 07:41:50,742 WARNING [train.py:1204] (2/4) Exclude cut with ID _55153_210_2253_1_1534384786677_3162096_269-103204-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 07:41:52,137 WARNING [train.py:1204] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_748-36150-0 from training. Duration: 0.91 2023-10-03 07:41:52,214 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_12868_210_6753_1_1530702479_3610564_148-172861-0_sp1.1 from training. Duration: 0.4545625 2023-10-03 07:41:54,012 WARNING [train.py:1204] (2/4) Exclude cut with ID _67499_210_17282_1_1535509805220_3692430_460-77730-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 07:41:55,572 WARNING [train.py:1204] (2/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_374-20753-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 07:41:56,927 WARNING [train.py:1204] (2/4) Exclude cut with ID _45329_210_7005_1_1534157951211_6774339_374-289983-0_sp1.1 from training. Duration: 0.97275 2023-10-03 07:41:58,232 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_34886_210_10633_1_1533203882_1755661_76-156096-0_sp0.9 from training. Duration: 0.9666875 2023-10-03 07:42:03,440 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_238-256668-0_sp0.9 from training. Duration: 0.7666875 2023-10-03 07:42:04,122 WARNING [train.py:1204] (2/4) Exclude cut with ID _46683_210_9642_1_1533730913492_4591116_489-146727-0_sp1.1 from training. Duration: 0.97275 2023-10-03 07:42:05,566 WARNING [train.py:1204] (2/4) Exclude cut with ID _13938_210_9988_1_1530518718978_3693960_304-112784-0_sp0.9 from training. Duration: 0.511125 2023-10-03 07:42:11,130 WARNING [train.py:1204] (2/4) Exclude cut with ID _24271_210_12658_1_1531987270022_7190519_243-46549-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 07:42:11,134 WARNING [train.py:1204] (2/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_78-182912-0_sp1.1 from training. Duration: 0.536375 2023-10-03 07:42:13,919 WARNING [train.py:1204] (2/4) Exclude cut with ID _57572_210_5517_1_1534921219624_3809484_121-321334-0_sp1.1 from training. Duration: 0.97275 2023-10-03 07:42:15,331 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_40047_210_2290_1_1533290519_2374311_254-102149-0_sp1.1 from training. Duration: 0.963625 2023-10-03 07:42:15,332 WARNING [train.py:1204] (2/4) Exclude cut with ID _28461_210_2253_1_1532771775189_3041299_96-152158-0 from training. Duration: 0.85 2023-10-03 07:42:16,675 INFO [train.py:1046] (2/4) Epoch 34, batch 3200, loss[loss=0.1451, simple_loss=0.2266, pruned_loss=0.03177, over 24549.00 frames. ], tot_loss[loss=0.1586, simple_loss=0.2374, pruned_loss=0.03989, over 4724004.29 frames. ], batch size: 60, lr: 2.99e-03, grad_scale: 32.0 2023-10-03 07:42:16,808 WARNING [train.py:1204] (2/4) Exclude cut with ID _29877_210_6228_1_1532397791106_5276149_161-102979-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 07:42:17,542 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.3.encoder.layers.2.self_attn_weights.whiten_keys, num_groups=8, num_channels=256, metric=3.53 vs. limit=6.0 2023-10-03 07:42:21,004 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_3787_210_2040_1_1526477544_5478586_603-120324-0_sp0.9 from training. Duration: 0.9 2023-10-03 07:42:24,388 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_41750_210_1960_1_1533520678_3948042_247-213456-0_sp1.1 from training. Duration: 0.97275 2023-10-03 07:42:34,853 WARNING [train.py:1204] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_127-119109-0_sp0.9 from training. Duration: 0.8 2023-10-03 07:42:36,255 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.3.encoder.layers.0.feed_forward2.out_whiten, num_groups=1, num_channels=512, metric=3.95 vs. limit=15.0 2023-10-03 07:42:37,010 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.conv_module1.balancer2.prob, batch_count=1190066.6666666667, ans=0.125 2023-10-03 07:42:41,082 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.bypass.scale_min, batch_count=1190066.6666666667, ans=0.2 2023-10-03 07:42:44,957 WARNING [train.py:1204] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_215-202973-0 from training. Duration: 0.88 2023-10-03 07:42:46,351 WARNING [train.py:1204] (2/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_17-320491-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 07:42:49,312 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.nonlin_attention.balancer.prob, batch_count=1190133.3333333333, ans=0.125 2023-10-03 07:42:50,483 WARNING [train.py:1204] (2/4) Exclude cut with ID _5166_210_2084_1_1527073197160_4087993_310-114452-0 from training. Duration: 0.59 2023-10-03 07:42:50,558 WARNING [train.py:1204] (2/4) Exclude cut with ID _15133_210_6228_1_1530794512074_3080379_29-264458-0_sp0.9 from training. Duration: 0.6444375 2023-10-03 07:42:53,330 WARNING [train.py:1204] (2/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_159-281310-0_sp1.1 from training. Duration: 0.863625 2023-10-03 07:42:53,349 WARNING [train.py:1204] (2/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_195-167011-0_sp0.9 from training. Duration: 0.8333125 2023-10-03 07:42:54,702 WARNING [train.py:1204] (2/4) Exclude cut with ID _14087_210_2253_1_1530529224512_3014029_156-257607-0_sp1.1 from training. Duration: 0.7545625 2023-10-03 07:42:56,863 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.bypass_mid.scale_min, batch_count=1190133.3333333333, ans=0.2 2023-10-03 07:42:57,351 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.2.encoder.layers.2.self_attn1.whiten, num_groups=1, num_channels=384, metric=18.47 vs. limit=22.5 2023-10-03 07:42:58,001 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_2295_210_2087_1_1525500993_2407863_116-238894-0 from training. Duration: 0.7 2023-10-03 07:42:59,348 WARNING [train.py:1204] (2/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_321-335475-0_sp0.9 from training. Duration: 0.5 2023-10-03 07:43:02,087 WARNING [train.py:1204] (2/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_188-114464-0 from training. Duration: 0.82 2023-10-03 07:43:05,404 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_2592_210_2290_1_1525697025_4882925_157-70512-0 from training. Duration: 0.91 2023-10-03 07:43:08,206 WARNING [train.py:1204] (2/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_138-64210-0_sp1.1 from training. Duration: 0.663625 2023-10-03 07:43:12,946 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_11305_210_1794_1_1529750428_7178148_651-95702-0_sp1.1 from training. Duration: 0.963625 2023-10-03 07:43:12,967 WARNING [train.py:1204] (2/4) Exclude cut with ID _14224_210_4381_1_1530529062037_4264010_379-184223-0_sp0.9 from training. Duration: 0.9444375 2023-10-03 07:43:14,256 WARNING [train.py:1204] (2/4) Exclude cut with ID _8394_210_6702_1_1528590504316_3540189_381-125325-0_sp1.1 from training. Duration: 0.963625 2023-10-03 07:43:14,320 WARNING [train.py:1204] (2/4) Exclude cut with ID IC0622W0021-307659 from training. Duration: 0.896 2023-10-03 07:43:14,322 WARNING [train.py:1204] (2/4) Exclude cut with ID _7624_210_1531_1_1528109802198_4933101_418-349834-0_sp1.1 from training. Duration: 0.5454375 2023-10-03 07:43:18,375 WARNING [train.py:1204] (2/4) Exclude cut with ID _49728_210_12651_1_1534041034294_6788150_607-3416-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 07:43:18,630 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.feed_forward1.hidden_balancer.prob, batch_count=1190266.6666666667, ans=0.125 2023-10-03 07:43:19,776 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8275_210_4198_1_1528455320_3892571_491-17707-0 from training. Duration: 0.64 2023-10-03 07:43:19,820 WARNING [train.py:1204] (2/4) Exclude cut with ID _5287_210_2084_1_1527418883040_3878670_333-48508-0 from training. Duration: 0.6 2023-10-03 07:43:21,171 WARNING [train.py:1204] (2/4) Exclude cut with ID _8365_210_2064_1_1528592311070_4677690_371-122295-0 from training. Duration: 0.55 2023-10-03 07:43:21,283 WARNING [train.py:1204] (2/4) Exclude cut with ID _14860_210_4915_1_1530702020340_7343240_836-328489-0 from training. Duration: 0.9 2023-10-03 07:43:21,944 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.3.encoder.layers.1.self_attn1.whiten, num_groups=1, num_channels=512, metric=15.46 vs. limit=22.5 2023-10-03 07:43:22,655 WARNING [train.py:1204] (2/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_1033-274376-0_sp0.9 from training. Duration: 0.9666875 2023-10-03 07:43:24,103 WARNING [train.py:1204] (2/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_475-127455-0_sp0.9 from training. Duration: 0.77775 2023-10-03 07:43:24,109 WARNING [train.py:1204] (2/4) Exclude cut with ID IC0671W0071-331914 from training. Duration: 0.896 2023-10-03 07:43:25,868 WARNING [train.py:1204] (2/4) Exclude cut with ID _30327_210_16049_1_1532484161295_3924169_2-1523-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 07:43:25,870 WARNING [train.py:1204] (2/4) Exclude cut with ID _24314_210_4915_1_1532135078116_7170179_460-208279-0_sp1.1 from training. Duration: 0.97275 2023-10-03 07:43:27,252 WARNING [train.py:1204] (2/4) Exclude cut with ID IC0679W0052-335859 from training. Duration: 0.64 2023-10-03 07:43:30,666 INFO [train.py:1046] (2/4) Epoch 34, batch 3250, loss[loss=0.1754, simple_loss=0.2506, pruned_loss=0.05016, over 23491.00 frames. ], tot_loss[loss=0.159, simple_loss=0.2385, pruned_loss=0.03978, over 4734433.35 frames. ], batch size: 285, lr: 2.99e-03, grad_scale: 16.0 2023-10-03 07:43:30,781 WARNING [train.py:1204] (2/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_3-237434-0_sp1.1 from training. Duration: 0.6909375 2023-10-03 07:43:33,962 WARNING [train.py:1204] (2/4) Exclude cut with ID _69884_210_9771_1_1535846392818_4166169_405-185283-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 07:43:38,934 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=1190333.3333333333, ans=0.1 2023-10-03 07:43:44,077 WARNING [train.py:1204] (2/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_927-83684-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 07:43:44,093 WARNING [train.py:1204] (2/4) Exclude cut with ID _6694_210_4599_1_1527852637490_3965529_356-169233-0 from training. Duration: 0.99 2023-10-03 07:43:45,397 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.562e+02 1.836e+02 2.055e+02 2.276e+02 3.582e+02, threshold=4.110e+02, percent-clipped=0.0 2023-10-03 07:43:45,518 WARNING [train.py:1204] (2/4) Exclude cut with ID _19896_210_1883_1_1531709022662_6649569_562-233001-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 07:43:45,695 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.bypass_mid.scale_min, batch_count=1190400.0, ans=0.2 2023-10-03 07:43:46,820 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_20120_210_6753_1_1531643035_5482859_354-78629-0_sp1.1 from training. Duration: 0.963625 2023-10-03 07:43:46,821 WARNING [train.py:1204] (2/4) Exclude cut with ID _60135_210_13427_1_1534935557922_3651319_96-168198-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 07:43:48,261 WARNING [train.py:1204] (2/4) Exclude cut with ID _31602_210_5854_1_1532677021456_4458250_249-96604-0_sp1.1 from training. Duration: 0.8090625 2023-10-03 07:43:49,736 WARNING [train.py:1204] (2/4) Exclude cut with ID _14100_210_8341_1_1530529320613_3678069_258-224599-0_sp0.9 from training. Duration: 0.8555625 2023-10-03 07:43:52,474 WARNING [train.py:1204] (2/4) Exclude cut with ID _19708_210_9206_1_1532136650733_4238364_215-160135-0_sp1.1 from training. Duration: 0.97275 2023-10-03 07:43:52,504 WARNING [train.py:1204] (2/4) Exclude cut with ID _21878_210_3525_1_1531738772002_7264330_113-18163-0_sp0.9 from training. Duration: 0.988875 2023-10-03 07:43:52,549 WARNING [train.py:1204] (2/4) Exclude cut with ID _64135_210_2253_1_1535677267876_7608972_552-337340-0_sp1.1 from training. Duration: 0.92725 2023-10-03 07:43:52,579 WARNING [train.py:1204] (2/4) Exclude cut with ID _67725_210_27767_1_1535608756671_5105359_10-12887-0_sp1.1 from training. Duration: 0.97275 2023-10-03 07:43:52,585 WARNING [train.py:1204] (2/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_401-284840-0_sp1.1 from training. Duration: 0.97275 2023-10-03 07:43:53,848 WARNING [train.py:1204] (2/4) Exclude cut with ID _44659_210_2721_1_1534036120061_6614860_483-141285-0_sp1.1 from training. Duration: 0.936375 2023-10-03 07:43:55,350 WARNING [train.py:1204] (2/4) Exclude cut with ID _9461_210_4557_1_1529060514726_3677451_358-332734-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 07:43:56,671 WARNING [train.py:1204] (2/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_788-298907-0_sp1.1 from training. Duration: 0.8090625 2023-10-03 07:43:56,925 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=1190400.0, ans=0.1 2023-10-03 07:43:58,394 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.out_combiner.scale_min, batch_count=1190466.6666666667, ans=0.2 2023-10-03 07:43:59,452 WARNING [train.py:1204] (2/4) Exclude cut with ID _39411_210_5986_1_1533538825030_3343649_354-267196-0_sp1.1 from training. Duration: 0.92725 2023-10-03 07:43:59,480 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_14953_210_2151_1_1530701710_5616165_192-309792-0_sp1.1 from training. Duration: 0.97275 2023-10-03 07:43:59,582 WARNING [train.py:1204] (2/4) Exclude cut with ID _59170_210_6721_1_1534983633960_4203335_290-151255-0_sp1.1 from training. Duration: 0.92725 2023-10-03 07:43:59,603 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_5575_210_1410_1_1527301305_2570170_249-93769-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 07:43:59,611 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_46013_210_7234_1_1533776362_3744032_463-52378-0_sp1.1 from training. Duration: 0.7818125 2023-10-03 07:44:04,848 WARNING [train.py:1204] (2/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_580-93798-0 from training. Duration: 0.56 2023-10-03 07:44:06,183 WARNING [train.py:1204] (2/4) Exclude cut with ID _19514_210_9259_1_1531530071330_5084230_385-13762-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 07:44:06,188 WARNING [train.py:1204] (2/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_458-67321-0_sp1.1 from training. Duration: 0.8818125 2023-10-03 07:44:08,086 WARNING [train.py:1204] (2/4) Exclude cut with ID _41298_210_9019_1_1533430371450_4822143_188-154791-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 07:44:09,400 WARNING [train.py:1204] (2/4) Exclude cut with ID _15037_210_2253_1_1530752294104_3267650_296-192597-0_sp0.9 from training. Duration: 0.9 2023-10-03 07:44:15,592 WARNING [train.py:1204] (2/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_661-264857-0_sp1.1 from training. Duration: 0.8454375 2023-10-03 07:44:22,503 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_34008_210_10431_1_1533345424_8183247_662-317331-0_sp1.1 from training. Duration: 0.8545625 2023-10-03 07:44:22,528 WARNING [train.py:1204] (2/4) Exclude cut with ID _46502_210_13794_1_1533724452198_3657159_241-278329-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 07:44:22,529 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_11349_210_6753_1_1529800861_1101207_26-65602-0 from training. Duration: 0.96 2023-10-03 07:44:22,532 WARNING [train.py:1204] (2/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_448-4048-0_sp1.1 from training. Duration: 0.87275 2023-10-03 07:44:22,547 WARNING [train.py:1204] (2/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_662-275621-0_sp1.1 from training. Duration: 0.3818125 2023-10-03 07:44:22,597 WARNING [train.py:1204] (2/4) Exclude cut with ID _13938_210_9988_1_1530518718978_3693960_256-54454-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 07:44:25,407 WARNING [train.py:1204] (2/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_256-32662-0 from training. Duration: 0.9 2023-10-03 07:44:26,768 WARNING [train.py:1204] (2/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_326-17061-0 from training. Duration: 0.94 2023-10-03 07:44:26,802 WARNING [train.py:1204] (2/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_505-97987-0_sp1.1 from training. Duration: 0.7818125 2023-10-03 07:44:29,356 WARNING [train.py:1204] (2/4) Exclude cut with ID _66873_210_5517_1_1535454136506_4293370_316-294364-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 07:44:29,412 WARNING [train.py:1204] (2/4) Exclude cut with ID _19514_210_9259_1_1531530071330_5084230_762-231063-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 07:44:29,474 WARNING [train.py:1204] (2/4) Exclude cut with ID _4697_210_2069_1_1526815801953_3823098_68-219807-0_sp0.9 from training. Duration: 0.52225 2023-10-03 07:44:30,857 WARNING [train.py:1204] (2/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_479-158053-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 07:44:34,176 WARNING [train.py:1204] (2/4) Exclude cut with ID _66112_210_10635_1_1535590236305_3801484_83-38181-0_sp1.1 from training. Duration: 0.936375 2023-10-03 07:44:34,192 WARNING [train.py:1204] (2/4) Exclude cut with ID _55857_210_2346_1_1534644007831_6898209_255-146346-0_sp1.1 from training. Duration: 0.963625 2023-10-03 07:44:36,227 WARNING [train.py:1204] (2/4) Exclude cut with ID _48836_210_12210_1_1534075848293_3904150_429-35475-0 from training. Duration: 0.89 2023-10-03 07:44:36,238 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_50154_210_13731_1_1534750862_1080961_119-187163-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 07:44:40,839 WARNING [train.py:1204] (2/4) Exclude cut with ID _8793_210_6310_1_1528542985139_2310009_121-179782-0_sp0.9 from training. Duration: 0.888875 2023-10-03 07:44:40,850 WARNING [train.py:1204] (2/4) Exclude cut with ID _14884_210_2252_1_1530707459689_5561250_591-173406-0 from training. Duration: 0.96 2023-10-03 07:44:43,658 WARNING [train.py:1204] (2/4) Exclude cut with ID _15133_210_6228_1_1530794512074_3080379_54-198193-0_sp1.1 from training. Duration: 0.8545625 2023-10-03 07:44:43,675 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6545_210_2040_1_1527774670_4245258_407-48070-0 from training. Duration: 0.76 2023-10-03 07:44:45,002 INFO [train.py:1046] (2/4) Epoch 34, batch 3300, loss[loss=0.1653, simple_loss=0.2473, pruned_loss=0.04162, over 24657.00 frames. ], tot_loss[loss=0.1599, simple_loss=0.2395, pruned_loss=0.04014, over 4725400.80 frames. ], batch size: 73, lr: 2.99e-03, grad_scale: 16.0 2023-10-03 07:44:45,774 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1032-236863-0 from training. Duration: 0.84 2023-10-03 07:44:45,839 WARNING [train.py:1204] (2/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_507-303370-0 from training. Duration: 0.96 2023-10-03 07:44:47,123 WARNING [train.py:1204] (2/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_164-282366-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 07:44:47,391 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.self_attn_weights.pos_emb_skip_rate, batch_count=1190666.6666666667, ans=0.0 2023-10-03 07:44:50,043 WARNING [train.py:1204] (2/4) Exclude cut with ID _39867_210_9826_1_1533553222030_9344259_312-158727-0_sp1.1 from training. Duration: 0.963625 2023-10-03 07:44:50,224 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.ff2_skip_rate, batch_count=1190666.6666666667, ans=0.0 2023-10-03 07:44:51,438 WARNING [train.py:1204] (2/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_174-205058-0_sp1.1 from training. Duration: 0.863625 2023-10-03 07:44:51,469 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_11305_210_1794_1_1529750428_7178148_166-237358-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 07:44:51,657 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=1190666.6666666667, ans=0.1 2023-10-03 07:44:54,012 WARNING [train.py:1204] (2/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_26-175553-0_sp1.1 from training. Duration: 0.5545625 2023-10-03 07:44:54,030 WARNING [train.py:1204] (2/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_96-290039-0_sp1.1 from training. Duration: 0.5090625 2023-10-03 07:44:56,664 WARNING [train.py:1204] (2/4) Exclude cut with ID _53219_210_4525_1_1534834836008_4064825_261-101691-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 07:44:58,028 WARNING [train.py:1204] (2/4) Exclude cut with ID _24271_210_12658_1_1531987270022_7190519_96-293390-0_sp1.1 from training. Duration: 0.936375 2023-10-03 07:45:02,134 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_271-234962-0 from training. Duration: 0.79 2023-10-03 07:45:02,191 WARNING [train.py:1204] (2/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_136-30660-0_sp1.1 from training. Duration: 0.97275 2023-10-03 07:45:03,519 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_20120_210_6753_1_1531643035_5482859_625-281046-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 07:45:05,378 WARNING [train.py:1204] (2/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_333-168193-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 07:45:06,671 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9029W0269-509664 from training. Duration: 0.896 2023-10-03 07:45:06,761 WARNING [train.py:1204] (2/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_450-147393-0_sp1.1 from training. Duration: 0.92725 2023-10-03 07:45:08,225 WARNING [train.py:1204] (2/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_643-52007-0_sp1.1 from training. Duration: 0.5181875 2023-10-03 07:45:08,282 WARNING [train.py:1204] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_330-327861-0_sp0.9 from training. Duration: 0.7444375 2023-10-03 07:45:08,298 WARNING [train.py:1204] (2/4) Exclude cut with ID _47790_210_16187_1_1533862814066_4207519_260-317651-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 07:45:08,315 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9044W0010-516654 from training. Duration: 0.896 2023-10-03 07:45:13,579 WARNING [train.py:1204] (2/4) Exclude cut with ID _14087_210_2253_1_1530529224512_3014029_265-118378-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 07:45:13,584 WARNING [train.py:1204] (2/4) Exclude cut with ID _6194_210_1534_1_1527511826160_3941194_321-148641-0_sp0.9 from training. Duration: 0.711125 2023-10-03 07:45:15,701 WARNING [train.py:1204] (2/4) Exclude cut with ID _54871_210_22356_1_1534665798253_3856359_101-169176-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 07:45:15,704 WARNING [train.py:1204] (2/4) Exclude cut with ID 497-129325-0061-62254-0_sp1.1 from training. Duration: 0.97725 2023-10-03 07:45:16,817 WARNING [train.py:1204] (2/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_795-33252-0_sp1.1 from training. Duration: 0.336375 2023-10-03 07:45:16,831 WARNING [train.py:1204] (2/4) Exclude cut with ID _28317_210_12742_1_1532480302381_8510325_168-246176-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 07:45:18,174 WARNING [train.py:1204] (2/4) Exclude cut with ID _7160_210_1876_1_1527940923231_3703799_120-150943-0_sp1.1 from training. Duration: 0.736375 2023-10-03 07:45:20,972 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9084W0005-536844 from training. Duration: 0.768 2023-10-03 07:45:21,094 WARNING [train.py:1204] (2/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_234-260682-0 from training. Duration: 0.84 2023-10-03 07:45:21,770 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.2.encoder.layers.2.feed_forward2.out_whiten, num_groups=1, num_channels=384, metric=5.89 vs. limit=15.0 2023-10-03 07:45:22,412 WARNING [train.py:1204] (2/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_188-114464-0_sp0.9 from training. Duration: 0.911125 2023-10-03 07:45:23,824 WARNING [train.py:1204] (2/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_81-75849-0 from training. Duration: 0.39 2023-10-03 07:45:27,829 WARNING [train.py:1204] (2/4) Exclude cut with ID _14224_210_4381_1_1530529062037_4264010_379-184223-0_sp1.1 from training. Duration: 0.77275 2023-10-03 07:45:29,287 WARNING [train.py:1204] (2/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_454-123002-0_sp1.1 from training. Duration: 0.62725 2023-10-03 07:45:30,649 WARNING [train.py:1204] (2/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_887-249555-0_sp1.1 from training. Duration: 0.7 2023-10-03 07:45:32,090 WARNING [train.py:1204] (2/4) Exclude cut with ID _31088_210_16446_1_1532520012284_3440919_20-260902-0_sp1.1 from training. Duration: 0.97275 2023-10-03 07:45:33,414 WARNING [train.py:1204] (2/4) Exclude cut with ID _45329_210_7005_1_1534157951211_6774339_856-344574-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 07:45:33,416 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_36756_210_7234_1_1533026991_966276_4-164118-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 07:45:33,439 WARNING [train.py:1204] (2/4) Exclude cut with ID _8365_210_2064_1_1528592311070_4677690_371-122295-0_sp1.1 from training. Duration: 0.5 2023-10-03 07:45:36,782 WARNING [train.py:1204] (2/4) Exclude cut with ID _5910_210_1389_1_1527337972454_5050100_555-159815-0_sp1.1 from training. Duration: 0.8909375 2023-10-03 07:45:36,800 WARNING [train.py:1204] (2/4) Exclude cut with ID _37086_210_9038_1_1532999540891_7021250_469-147941-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 07:45:36,872 WARNING [train.py:1204] (2/4) Exclude cut with ID _3067_210_2667_1_1526212974640_4503168_459-173867-0_sp0.9 from training. Duration: 0.97775 2023-10-03 07:45:38,319 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9156W0006-572499 from training. Duration: 0.896 2023-10-03 07:45:40,343 WARNING [train.py:1204] (2/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_277-210798-0 from training. Duration: 0.55 2023-10-03 07:45:41,724 WARNING [train.py:1204] (2/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_103-336064-0_sp1.1 from training. Duration: 0.463625 2023-10-03 07:45:41,773 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_3414_210_1988_1_1526212718_4813195_331-290945-0_sp1.1 from training. Duration: 0.936375 2023-10-03 07:45:41,773 WARNING [train.py:1204] (2/4) Exclude cut with ID _67499_210_17282_1_1535509805220_3692430_365-29276-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 07:45:44,418 WARNING [train.py:1204] (2/4) Exclude cut with ID _14821_210_8750_1_1530680503991_6829359_69-250847-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 07:45:44,419 WARNING [train.py:1204] (2/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_1102-61301-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 07:45:46,421 WARNING [train.py:1204] (2/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_166-246557-0_sp1.1 from training. Duration: 0.5818125 2023-10-03 07:45:46,468 WARNING [train.py:1204] (2/4) Exclude cut with ID _15351_210_10626_1_1530856635653_3576210_214-129918-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 07:45:46,482 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_120-41668-0_sp0.9 from training. Duration: 0.77775 2023-10-03 07:45:47,837 WARNING [train.py:1204] (2/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_27-72278-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 07:45:49,228 WARNING [train.py:1204] (2/4) Exclude cut with ID _24314_210_4915_1_1532135078116_7170179_521-258868-0_sp1.1 from training. Duration: 0.7181875 2023-10-03 07:45:50,726 WARNING [train.py:1204] (2/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_143-86344-0 from training. Duration: 0.77 2023-10-03 07:45:52,019 WARNING [train.py:1204] (2/4) Exclude cut with ID _53489_210_6228_1_1534298357169_3835970_465-157433-0_sp1.1 from training. Duration: 0.92725 2023-10-03 07:45:53,321 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_248-319353-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 07:45:54,788 WARNING [train.py:1204] (2/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_102-77064-0_sp1.1 from training. Duration: 0.8454375 2023-10-03 07:45:54,824 WARNING [train.py:1204] (2/4) Exclude cut with ID _9389_210_1664_1_1529217337301_5289269_379-128794-0_sp1.1 from training. Duration: 0.77275 2023-10-03 07:45:56,253 WARNING [train.py:1204] (2/4) Exclude cut with ID _3231_210_2106_1_1526205864934_6784220_236-260835-0_sp1.1 from training. Duration: 0.97275 2023-10-03 07:45:57,670 WARNING [train.py:1204] (2/4) Exclude cut with ID _24271_210_12658_1_1531987270022_7190519_184-342363-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 07:45:57,671 WARNING [train.py:1204] (2/4) Exclude cut with ID _27894_210_12392_1_1532415496078_6902439_67-221663-0_sp1.1 from training. Duration: 0.92725 2023-10-03 07:45:59,060 INFO [train.py:1046] (2/4) Epoch 34, batch 3350, loss[loss=0.1495, simple_loss=0.2309, pruned_loss=0.034, over 24458.00 frames. ], tot_loss[loss=0.1605, simple_loss=0.2402, pruned_loss=0.0404, over 4727705.36 frames. ], batch size: 58, lr: 2.99e-03, grad_scale: 8.0 2023-10-03 07:45:59,248 WARNING [train.py:1204] (2/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_304-59227-0_sp1.1 from training. Duration: 0.7 2023-10-03 07:45:59,938 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.2.encoder.layers.2.conv_module1.whiten, num_groups=1, num_channels=384, metric=6.40 vs. limit=15.0 2023-10-03 07:46:00,734 WARNING [train.py:1204] (2/4) Exclude cut with ID _58605_210_12210_1_1534723195939_3904259_458-42232-0_sp1.1 from training. Duration: 0.92725 2023-10-03 07:46:00,813 WARNING [train.py:1204] (2/4) Exclude cut with ID _14922_210_6228_1_1530707435273_3693310_13-165512-0_sp0.9 from training. Duration: 0.911125 2023-10-03 07:46:03,933 WARNING [train.py:1204] (2/4) Exclude cut with ID _58602_210_2346_1_1534903210085_6908830_792-102241-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 07:46:07,171 WARNING [train.py:1204] (2/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_588-125165-0_sp0.9 from training. Duration: 0.92225 2023-10-03 07:46:08,544 WARNING [train.py:1204] (2/4) Exclude cut with ID _54218_210_12210_1_1534413646388_4409259_239-98429-0_sp1.1 from training. Duration: 0.97275 2023-10-03 07:46:08,715 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.bypass_mid.scale_min, batch_count=1191000.0, ans=0.2 2023-10-03 07:46:09,873 WARNING [train.py:1204] (2/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_538-32291-0_sp1.1 from training. Duration: 0.7545625 2023-10-03 07:46:09,964 WARNING [train.py:1204] (2/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_419-317062-0 from training. Duration: 0.91 2023-10-03 07:46:10,124 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.0.balancer1.prob, batch_count=1191000.0, ans=0.125 2023-10-03 07:46:11,435 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9309W0202-640329 from training. Duration: 0.896 2023-10-03 07:46:11,476 WARNING [train.py:1204] (2/4) Exclude cut with ID _3231_210_2106_1_1526205864934_6784220_419-313291-0_sp1.1 from training. Duration: 0.97275 2023-10-03 07:46:14,690 WARNING [train.py:1204] (2/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_619-344499-0 from training. Duration: 0.63 2023-10-03 07:46:14,704 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_278-272612-0 from training. Duration: 0.73 2023-10-03 07:46:16,357 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.570e+02 1.949e+02 2.135e+02 2.589e+02 3.898e+02, threshold=4.270e+02, percent-clipped=0.0 2023-10-03 07:46:16,440 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_103-84010-0_sp0.9 from training. Duration: 0.7666875 2023-10-03 07:46:16,476 WARNING [train.py:1204] (2/4) Exclude cut with ID _26528_210_9070_1_1532570410842_3851310_497-97218-0_sp1.1 from training. Duration: 0.7818125 2023-10-03 07:46:16,742 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=1191066.6666666667, ans=0.1 2023-10-03 07:46:17,861 WARNING [train.py:1204] (2/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_262-84613-0_sp1.1 from training. Duration: 0.936375 2023-10-03 07:46:17,899 WARNING [train.py:1204] (2/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_387-131538-0 from training. Duration: 0.81 2023-10-03 07:46:17,923 WARNING [train.py:1204] (2/4) Exclude cut with ID _8030_210_7497_1_1528444886781_3531089_224-300489-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 07:46:17,945 WARNING [train.py:1204] (2/4) Exclude cut with ID _14349_210_8341_1_1530615710818_3442470_170-336541-0_sp0.9 from training. Duration: 0.9555625 2023-10-03 07:46:19,321 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_47069_210_15368_1_1533781267_4431320_180-31746-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 07:46:20,668 WARNING [train.py:1204] (2/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_447-7041-0_sp1.1 from training. Duration: 0.92725 2023-10-03 07:46:20,710 WARNING [train.py:1204] (2/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_2-55283-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 07:46:22,026 WARNING [train.py:1204] (2/4) Exclude cut with ID _12555_210_8750_1_1530075594402_3780239_70-181911-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 07:46:24,729 WARNING [train.py:1204] (2/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_311-328022-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 07:46:27,445 WARNING [train.py:1204] (2/4) Exclude cut with ID _14528_210_8254_1_1530669700514_4306939_330-2987-0_sp1.1 from training. Duration: 0.92725 2023-10-03 07:46:28,806 WARNING [train.py:1204] (2/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_385-184858-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 07:46:30,294 WARNING [train.py:1204] (2/4) Exclude cut with ID _64414_210_17301_1_1535243252048_3757759_348-333360-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 07:46:31,636 WARNING [train.py:1204] (2/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_753-214751-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 07:46:33,714 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=1191133.3333333333, ans=0.1 2023-10-03 07:46:34,953 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_17613_210_2151_1_1531137569_2366160_202-208013-0_sp1.1 from training. Duration: 0.92725 2023-10-03 07:46:34,961 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_43831_210_2158_1_1533725502_5872791_610-129300-0_sp1.1 from training. Duration: 0.936375 2023-10-03 07:46:37,606 WARNING [train.py:1204] (2/4) Exclude cut with ID _54218_210_12210_1_1534413646388_4409259_321-23852-0_sp1.1 from training. Duration: 0.936375 2023-10-03 07:46:39,005 WARNING [train.py:1204] (2/4) Exclude cut with ID _39411_210_5986_1_1533538825030_3343649_173-238248-0 from training. Duration: 0.83 2023-10-03 07:46:39,012 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_405-339986-0_sp0.9 from training. Duration: 0.5666875 2023-10-03 07:46:39,050 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_2009_210_2071_1_1525481134_4588937_385-302100-0 from training. Duration: 0.87 2023-10-03 07:46:39,840 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.2.encoder.layers.0.self_attn_weights.whiten_keys, num_groups=4, num_channels=128, metric=4.66 vs. limit=6.0 2023-10-03 07:46:40,400 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_161-276996-0_sp1.1 from training. Duration: 0.863625 2023-10-03 07:46:41,770 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_177-58005-0 from training. Duration: 0.62 2023-10-03 07:46:43,129 WARNING [train.py:1204] (2/4) Exclude cut with ID _64639_210_5816_1_1535367619437_3528000_146-343734-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 07:46:45,187 WARNING [train.py:1204] (2/4) Exclude cut with ID _3845_210_1390_1_1526731326141_3523610_72-61441-0_sp1.1 from training. Duration: 0.92725 2023-10-03 07:46:49,808 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.5.encoder.layers.1.nonlin_attention.whiten1, num_groups=1, num_channels=192, metric=2.70 vs. limit=10.0 2023-10-03 07:46:52,026 WARNING [train.py:1204] (2/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_991-125028-0_sp1.1 from training. Duration: 0.936375 2023-10-03 07:46:53,378 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7544_210_6753_1_1528591327_1471426_172-348777-0 from training. Duration: 0.66 2023-10-03 07:46:54,704 WARNING [train.py:1204] (2/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_416-257730-0_sp1.1 from training. Duration: 0.5909375 2023-10-03 07:46:56,110 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1113-7369-0_sp1.1 from training. Duration: 0.77275 2023-10-03 07:46:57,519 WARNING [train.py:1204] (2/4) Exclude cut with ID _40402_210_18219_1_1533522324115_2953150_242-76943-0_sp1.1 from training. Duration: 0.8545625 2023-10-03 07:47:01,754 WARNING [train.py:1204] (2/4) Exclude cut with ID _44718_210_5816_1_1534050051869_6469030_190-49648-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 07:47:03,174 WARNING [train.py:1204] (2/4) Exclude cut with ID _47251_210_18628_1_1533814063128_4137250_214-88076-0 from training. Duration: 0.88 2023-10-03 07:47:05,024 WARNING [train.py:1204] (2/4) Exclude cut with ID _5585_210_2667_1_1527418813324_8007340_89-332072-0_sp1.1 from training. Duration: 0.5818125 2023-10-03 07:47:05,049 WARNING [train.py:1204] (2/4) Exclude cut with ID _41817_210_18835_1_1533434477160_7423049_555-293448-0_sp1.1 from training. Duration: 0.72725 2023-10-03 07:47:06,937 WARNING [train.py:1204] (2/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_286-331314-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 07:47:08,278 WARNING [train.py:1204] (2/4) Exclude cut with ID _13921_210_9994_1_1530838702800_3003429_46-149444-0 from training. Duration: 0.94 2023-10-03 07:47:08,341 WARNING [train.py:1204] (2/4) Exclude cut with ID _44778_210_2054_1_1533798127203_7443090_768-43595-0_sp1.1 from training. Duration: 0.936375 2023-10-03 07:47:08,349 WARNING [train.py:1204] (2/4) Exclude cut with ID _35355_210_16040_1_1533212988977_4016229_74-190076-0 from training. Duration: 0.8 2023-10-03 07:47:11,117 WARNING [train.py:1204] (2/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_1007-316380-0_sp1.1 from training. Duration: 0.963625 2023-10-03 07:47:12,403 INFO [train.py:1046] (2/4) Epoch 34, batch 3400, loss[loss=0.1675, simple_loss=0.2556, pruned_loss=0.03972, over 24045.00 frames. ], tot_loss[loss=0.1622, simple_loss=0.2415, pruned_loss=0.04143, over 4717109.38 frames. ], batch size: 80, lr: 2.99e-03, grad_scale: 8.0 2023-10-03 07:47:12,460 WARNING [train.py:1204] (2/4) Exclude cut with ID _23557_210_8750_1_1531890106370_6846255_150-283437-0_sp1.1 from training. Duration: 0.963625 2023-10-03 07:47:13,701 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_3474_210_2028_1_1526188513_4825246_145-309131-0_sp1.1 from training. Duration: 0.67275 2023-10-03 07:47:13,789 WARNING [train.py:1204] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_391-22148-0_sp1.1 from training. Duration: 0.7 2023-10-03 07:47:15,196 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_238-256668-0 from training. Duration: 0.69 2023-10-03 07:47:18,587 WARNING [train.py:1204] (2/4) Exclude cut with ID _8394_210_6702_1_1528590504316_3540189_372-198878-0 from training. Duration: 0.6 2023-10-03 07:47:18,597 WARNING [train.py:1204] (2/4) Exclude cut with ID ID0174W0006-766659 from training. Duration: 0.953 2023-10-03 07:47:18,619 WARNING [train.py:1204] (2/4) Exclude cut with ID _6274_210_2084_1_1527937583837_7494020_668-23019-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 07:47:23,341 WARNING [train.py:1204] (2/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_322-74328-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 07:47:23,347 WARNING [train.py:1204] (2/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_872-264525-0_sp1.1 from training. Duration: 0.5909375 2023-10-03 07:47:24,708 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_53378_210_17282_1_1534221071_5769509_641-286201-0_sp1.1 from training. Duration: 0.97275 2023-10-03 07:47:24,789 WARNING [train.py:1204] (2/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_199-295611-0_sp1.1 from training. Duration: 0.5 2023-10-03 07:47:28,861 WARNING [train.py:1204] (2/4) Exclude cut with ID _5250_210_5219_1_1527249561806_8692110_1270-317971-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 07:47:29,553 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.2.encoder.layers.1.conv_module2.whiten, num_groups=1, num_channels=384, metric=5.45 vs. limit=15.0 2023-10-03 07:47:30,369 WARNING [train.py:1204] (2/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_784-213099-0 from training. Duration: 0.93 2023-10-03 07:47:31,817 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.conv_module1.balancer2.min_positive, batch_count=1191400.0, ans=0.05 2023-10-03 07:47:35,918 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_115-233929-0_sp0.9 from training. Duration: 0.8 2023-10-03 07:47:39,225 WARNING [train.py:1204] (2/4) Exclude cut with ID _54871_210_22356_1_1534665798253_3856359_256-223809-0_sp1.1 from training. Duration: 0.97275 2023-10-03 07:47:39,263 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_20120_210_6753_1_1531643035_5482859_285-87781-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 07:47:39,427 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.1.feed_forward1.out_proj.dropout_p, batch_count=1191400.0, ans=0.1 2023-10-03 07:47:40,699 WARNING [train.py:1204] (2/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_619-344499-0_sp1.1 from training. Duration: 0.57275 2023-10-03 07:47:44,924 WARNING [train.py:1204] (2/4) Exclude cut with ID _60229_210_27767_1_1534838406158_3655350_264-279573-0_sp0.9 from training. Duration: 0.92225 2023-10-03 07:47:46,564 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.conv_skip_rate, batch_count=1191466.6666666667, ans=0.0 2023-10-03 07:47:49,653 WARNING [train.py:1204] (2/4) Exclude cut with ID _7719_210_3525_1_1528456153163_4296960_333-110399-0 from training. Duration: 0.96 2023-10-03 07:47:51,533 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.4.encoder.layers.2.whiten, num_groups=1, num_channels=384, metric=4.90 vs. limit=12.0 2023-10-03 07:47:56,748 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_50423_210_13270_1_1534128598_5021734_759-90833-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 07:47:56,940 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.conv_module2.balancer2.prob, batch_count=1191533.3333333333, ans=0.125 2023-10-03 07:47:56,958 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.conv_module2.balancer1.prob, batch_count=1191533.3333333333, ans=0.125 2023-10-03 07:47:58,109 WARNING [train.py:1204] (2/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_151-254798-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 07:47:58,155 WARNING [train.py:1204] (2/4) Exclude cut with ID _3465_210_3525_1_1526175667060_3574049_450-203637-0 from training. Duration: 0.92 2023-10-03 07:47:58,179 WARNING [train.py:1204] (2/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_673-36456-0_sp1.1 from training. Duration: 0.936375 2023-10-03 07:47:58,224 WARNING [train.py:1204] (2/4) Exclude cut with ID _36538_210_9994_1_1533204020363_3795620_131-8685-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 07:47:59,511 WARNING [train.py:1204] (2/4) Exclude cut with ID _12556_210_8341_1_1530081024100_7327690_713-55626-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 07:47:59,781 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.bypass.scale_min, batch_count=1191533.3333333333, ans=0.2 2023-10-03 07:48:00,902 WARNING [train.py:1204] (2/4) Exclude cut with ID _2322_210_2667_1_1525604548971_3656299_440-80494-0_sp0.9 from training. Duration: 0.97775 2023-10-03 07:48:02,424 WARNING [train.py:1204] (2/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_210-325081-0_sp1.1 from training. Duration: 0.97275 2023-10-03 07:48:05,107 WARNING [train.py:1204] (2/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_375-244976-0_sp0.9 from training. Duration: 0.8333125 2023-10-03 07:48:05,113 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_15517_210_6753_1_1530952293_1664795_119-31001-0_sp1.1 from training. Duration: 0.8545625 2023-10-03 07:48:12,929 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_41452_210_7291_1_1533437687_3941281_319-42326-0_sp1.1 from training. Duration: 0.92725 2023-10-03 07:48:14,313 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_372-173012-0 from training. Duration: 0.95 2023-10-03 07:48:17,258 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.1.feed_forward1.out_proj.dropout_p, batch_count=1191600.0, ans=0.1 2023-10-03 07:48:18,687 WARNING [train.py:1204] (2/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_41-31157-0_sp1.1 from training. Duration: 0.5181875 2023-10-03 07:48:21,016 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.conv_module2.balancer1.prob, batch_count=1191600.0, ans=0.125 2023-10-03 07:48:22,324 WARNING [train.py:1204] (2/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_91-212486-0 from training. Duration: 0.68 2023-10-03 07:48:26,333 INFO [train.py:1046] (2/4) Epoch 34, batch 3450, loss[loss=0.1509, simple_loss=0.2193, pruned_loss=0.04129, over 23769.00 frames. ], tot_loss[loss=0.1617, simple_loss=0.2412, pruned_loss=0.04117, over 4722091.54 frames. ], batch size: 212, lr: 2.99e-03, grad_scale: 4.0 2023-10-03 07:48:26,431 WARNING [train.py:1204] (2/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_1267-272693-0 from training. Duration: 0.73 2023-10-03 07:48:27,813 WARNING [train.py:1204] (2/4) Exclude cut with ID _13938_210_9988_1_1530518718978_3693960_152-67046-0_sp1.1 from training. Duration: 0.936375 2023-10-03 07:48:29,208 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_4528_210_3273_1_1527076654_3276777_342-168068-0_sp1.1 from training. Duration: 0.7909375 2023-10-03 07:48:29,216 WARNING [train.py:1204] (2/4) Exclude cut with ID _7606_210_4915_1_1528894253277_5080690_127-177761-0 from training. Duration: 0.64 2023-10-03 07:48:29,289 WARNING [train.py:1204] (2/4) Exclude cut with ID _45329_210_7005_1_1534157951211_6774339_335-137972-0_sp1.1 from training. Duration: 0.92725 2023-10-03 07:48:31,249 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.3.encoder.layers.3.feed_forward3.out_whiten, num_groups=1, num_channels=512, metric=15.16 vs. limit=15.0 2023-10-03 07:48:32,234 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=1191666.6666666667, ans=0.1 2023-10-03 07:48:33,306 WARNING [train.py:1204] (2/4) Exclude cut with ID _5166_210_2084_1_1527073197160_4087993_331-117829-0_sp1.1 from training. Duration: 0.563625 2023-10-03 07:48:34,986 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.self_attn_weights.pos_emb_skip_rate, batch_count=1191666.6666666667, ans=0.0 2023-10-03 07:48:37,421 WARNING [train.py:1204] (2/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_125-125818-0_sp1.1 from training. Duration: 0.87275 2023-10-03 07:48:39,357 WARNING [train.py:1204] (2/4) Exclude cut with ID _6789_210_1534_1_1527854471671_2870519_48-294897-0_sp1.1 from training. Duration: 0.963625 2023-10-03 07:48:39,437 WARNING [train.py:1204] (2/4) Exclude cut with ID _6694_210_4599_1_1527852637490_3965529_116-109398-0_sp1.1 from training. Duration: 0.663625 2023-10-03 07:48:39,447 WARNING [train.py:1204] (2/4) Exclude cut with ID _52892_210_6721_1_1534378941259_4124129_222-172685-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 07:48:42,772 WARNING [train.py:1204] (2/4) Exclude cut with ID _50655_210_18628_1_1534229942193_3772154_320-132275-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 07:48:44,095 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.632e+02 1.945e+02 2.174e+02 2.507e+02 5.518e+02, threshold=4.348e+02, percent-clipped=2.0 2023-10-03 07:48:48,242 WARNING [train.py:1204] (2/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_76-311963-0 from training. Duration: 0.7 2023-10-03 07:48:54,514 WARNING [train.py:1204] (2/4) Exclude cut with ID _6274_210_2084_1_1527937583837_7494020_32-205288-0 from training. Duration: 0.78 2023-10-03 07:48:55,793 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_86-16881-0_sp0.9 from training. Duration: 0.6444375 2023-10-03 07:48:55,834 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_271-297505-0_sp1.1 from training. Duration: 0.9 2023-10-03 07:48:57,185 WARNING [train.py:1204] (2/4) Exclude cut with ID _57572_210_5517_1_1534921219624_3809484_187-295293-0_sp1.1 from training. Duration: 0.963625 2023-10-03 07:48:57,404 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.feed_forward2.hidden_balancer.prob, batch_count=1191800.0, ans=0.125 2023-10-03 07:49:01,357 WARNING [train.py:1204] (2/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_563-329943-0 from training. Duration: 0.99 2023-10-03 07:49:02,701 WARNING [train.py:1204] (2/4) Exclude cut with ID _7160_210_1876_1_1527940923231_3703799_302-150728-0_sp0.9 from training. Duration: 0.8666875 2023-10-03 07:49:06,776 WARNING [train.py:1204] (2/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_926-7526-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 07:49:06,811 WARNING [train.py:1204] (2/4) Exclude cut with ID _62574_210_2655_1_1535180407058_3933249_467-22348-0_sp1.1 from training. Duration: 0.936375 2023-10-03 07:49:08,223 WARNING [train.py:1204] (2/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_280-161633-0_sp0.9 from training. Duration: 0.811125 2023-10-03 07:49:10,166 WARNING [train.py:1204] (2/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_385-193052-0_sp1.1 from training. Duration: 0.7818125 2023-10-03 07:49:11,545 WARNING [train.py:1204] (2/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_444-28041-0 from training. Duration: 0.85 2023-10-03 07:49:11,550 WARNING [train.py:1204] (2/4) Exclude cut with ID _66070_210_7640_1_1535441393096_7805310_341-249237-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 07:49:11,645 WARNING [train.py:1204] (2/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_425-269826-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 07:49:13,223 WARNING [train.py:1204] (2/4) Exclude cut with ID _53950_210_5816_1_1534417264919_3862951_124-185597-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 07:49:15,969 WARNING [train.py:1204] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_380-204756-0 from training. Duration: 0.86 2023-10-03 07:49:20,022 WARNING [train.py:1204] (2/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_282-257791-0_sp0.9 from training. Duration: 0.9555625 2023-10-03 07:49:23,575 WARNING [train.py:1204] (2/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_372-131163-0_sp1.1 from training. Duration: 0.8545625 2023-10-03 07:49:24,919 WARNING [train.py:1204] (2/4) Exclude cut with ID _28317_210_12742_1_1532480302381_8510325_447-233513-0_sp1.1 from training. Duration: 0.963625 2023-10-03 07:49:27,708 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_54-118333-0_sp1.1 from training. Duration: 0.97275 2023-10-03 07:49:33,161 WARNING [train.py:1204] (2/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_388-230239-0_sp1.1 from training. Duration: 0.963625 2023-10-03 07:49:33,181 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_40047_210_2290_1_1533290519_2374311_141-54639-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 07:49:34,522 WARNING [train.py:1204] (2/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_91-267696-0_sp0.9 from training. Duration: 0.9666875 2023-10-03 07:49:35,818 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_14056_210_4163_1_1530604483_7572827_532-137847-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 07:49:39,504 INFO [train.py:1046] (2/4) Epoch 34, batch 3500, loss[loss=0.1379, simple_loss=0.2051, pruned_loss=0.03535, over 23515.00 frames. ], tot_loss[loss=0.1612, simple_loss=0.2407, pruned_loss=0.04084, over 4729729.29 frames. ], batch size: 256, lr: 2.99e-03, grad_scale: 8.0 2023-10-03 07:49:41,047 WARNING [train.py:1204] (2/4) Exclude cut with ID _8064_210_5399_1_1528372299291_4282973_155-283231-0_sp1.1 from training. Duration: 0.97275 2023-10-03 07:49:42,936 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_2245_210_2715_1_1525511548_3540254_310-52483-0_sp1.1 from training. Duration: 0.8 2023-10-03 07:49:43,127 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.ff2_skip_rate, batch_count=1192000.0, ans=0.0 2023-10-03 07:49:44,303 WARNING [train.py:1204] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_185-174805-0 from training. Duration: 0.68 2023-10-03 07:49:45,760 WARNING [train.py:1204] (2/4) Exclude cut with ID _15012_210_1968_1_1530856820429_5095350_662-210469-0_sp1.1 from training. Duration: 0.5181875 2023-10-03 07:49:48,506 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_426-313557-0_sp0.9 from training. Duration: 0.588875 2023-10-03 07:49:51,183 WARNING [train.py:1204] (2/4) Exclude cut with ID _42326_210_7770_1_1533814282531_3707269_407-204343-0_sp1.1 from training. Duration: 0.97275 2023-10-03 07:49:51,191 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_258-310570-0 from training. Duration: 0.82 2023-10-03 07:49:51,375 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.conv_skip_rate, batch_count=1192000.0, ans=0.0 2023-10-03 07:49:56,668 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_4737_210_2040_1_1526998689_3293008_378-178650-0_sp1.1 from training. Duration: 0.863625 2023-10-03 07:49:58,058 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_47896_210_20508_1_1534492518_5958868_432-327451-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 07:49:58,160 WARNING [train.py:1204] (2/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_188-114464-0_sp1.1 from training. Duration: 0.7454375 2023-10-03 07:49:58,173 WARNING [train.py:1204] (2/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_798-106179-0_sp1.1 from training. Duration: 0.7545625 2023-10-03 07:49:59,400 WARNING [train.py:1204] (2/4) Exclude cut with ID _14368_210_4220_1_1530532714870_3595529_143-117421-0_sp1.1 from training. Duration: 0.52725 2023-10-03 07:49:59,441 WARNING [train.py:1204] (2/4) Exclude cut with ID _67499_210_17282_1_1535509805220_3692430_361-78786-0_sp1.1 from training. Duration: 0.963625 2023-10-03 07:50:00,709 WARNING [train.py:1204] (2/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_712-342563-0_sp1.1 from training. Duration: 0.8818125 2023-10-03 07:50:00,759 WARNING [train.py:1204] (2/4) Exclude cut with ID _9235_210_5219_1_1528891154253_8046120_387-159008-0 from training. Duration: 0.86 2023-10-03 07:50:02,200 WARNING [train.py:1204] (2/4) Exclude cut with ID _33640_210_2655_1_1533117605479_4855469_959-18820-0_sp1.1 from training. Duration: 0.963625 2023-10-03 07:50:03,460 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_12868_210_6753_1_1530702479_3610564_133-564-0_sp0.9 from training. Duration: 0.87775 2023-10-03 07:50:04,930 WARNING [train.py:1204] (2/4) Exclude cut with ID _23514_210_1389_1_1531832139785_3031439_12-29287-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 07:50:10,791 WARNING [train.py:1204] (2/4) Exclude cut with ID _23514_210_1389_1_1531832139785_3031439_489-185052-0_sp1.1 from training. Duration: 0.963625 2023-10-03 07:50:12,115 WARNING [train.py:1204] (2/4) Exclude cut with ID _5444_210_2151_1_1527418808464_5312210_634-194174-0 from training. Duration: 0.97 2023-10-03 07:50:12,140 WARNING [train.py:1204] (2/4) Exclude cut with ID _6275_210_2084_1_1528110062213_7669049_91-26919-0_sp1.1 from training. Duration: 0.8818125 2023-10-03 07:50:14,222 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_37351_210_7925_1_1533012931_3812113_516-82303-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 07:50:15,581 WARNING [train.py:1204] (2/4) Exclude cut with ID _41314_210_12651_1_1533459476543_3766460_80-74184-0_sp0.9 from training. Duration: 0.92225 2023-10-03 07:50:17,006 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_9460_210_4211_1_1528954587_10049667_495-202963-0_sp1.1 from training. Duration: 0.963625 2023-10-03 07:50:18,371 WARNING [train.py:1204] (2/4) Exclude cut with ID _14632_210_9988_1_1530691279392_3923200_332-105904-0_sp1.1 from training. Duration: 0.8181875 2023-10-03 07:50:19,573 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_40764_210_1925_1_1533882359_3752345_493-167886-0_sp1.1 from training. Duration: 0.92725 2023-10-03 07:50:20,863 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_196-289637-0 from training. Duration: 0.75 2023-10-03 07:50:22,756 WARNING [train.py:1204] (2/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_26-175553-0 from training. Duration: 0.61 2023-10-03 07:50:24,012 WARNING [train.py:1204] (2/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_890-5117-0 from training. Duration: 0.78 2023-10-03 07:50:24,072 WARNING [train.py:1204] (2/4) Exclude cut with ID _41314_210_12651_1_1533459476543_3766460_206-306860-0_sp1.1 from training. Duration: 0.92725 2023-10-03 07:50:25,954 WARNING [train.py:1204] (2/4) Exclude cut with ID _25882_210_3036_1_1532775665815_6479510_570-79824-0_sp1.1 from training. Duration: 0.963625 2023-10-03 07:50:27,288 WARNING [train.py:1204] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_731-2428-0_sp1.1 from training. Duration: 0.7545625 2023-10-03 07:50:27,317 WARNING [train.py:1204] (2/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_138-154812-0_sp0.9 from training. Duration: 0.8666875 2023-10-03 07:50:27,494 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.ff2_skip_rate, batch_count=1192200.0, ans=0.0 2023-10-03 07:50:30,228 WARNING [train.py:1204] (2/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_178-39722-0_sp0.9 from training. Duration: 0.5444375 2023-10-03 07:50:31,511 WARNING [train.py:1204] (2/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_1285-256622-0_sp0.9 from training. Duration: 0.9333125 2023-10-03 07:50:31,764 INFO [scaling.py:1118] (2/4) WithLoss: name=encoder.encoders.2.encoder.layers.2.self_attn_weights, loss-sum=0.000e+00 2023-10-03 07:50:35,133 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.2.encoder.layers.0.nonlin_attention.whiten1, num_groups=1, num_channels=288, metric=7.67 vs. limit=10.0 2023-10-03 07:50:35,756 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_41428_210_2161_1_1533774184_3992532_65-271323-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 07:50:38,507 WARNING [train.py:1204] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_308-161067-0 from training. Duration: 0.66 2023-10-03 07:50:38,512 WARNING [train.py:1204] (2/4) Exclude cut with ID _3067_210_2667_1_1526212974640_4503168_459-173867-0 from training. Duration: 0.88 2023-10-03 07:50:38,513 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_2221_210_1988_1_1525597051_4935541_415-296882-0_sp1.1 from training. Duration: 0.7 2023-10-03 07:50:38,842 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.feed_forward3.hidden_balancer.prob, batch_count=1192266.6666666667, ans=0.125 2023-10-03 07:50:39,936 WARNING [train.py:1204] (2/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_354-192266-0_sp1.1 from training. Duration: 0.936375 2023-10-03 07:50:39,992 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_258-310570-0_sp0.9 from training. Duration: 0.911125 2023-10-03 07:50:41,893 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_548-141039-0_sp1.1 from training. Duration: 0.963625 2023-10-03 07:50:43,423 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.bypass.skip_rate, batch_count=1192266.6666666667, ans=0.09899494936611666 2023-10-03 07:50:45,098 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_15101_210_7927_1_1530786415_5033274_320-327527-0 from training. Duration: 0.96 2023-10-03 07:50:46,678 WARNING [train.py:1204] (2/4) Exclude cut with ID _2895_210_2477_1_1525956785794_4063750_289-11475-0_sp0.9 from training. Duration: 0.911125 2023-10-03 07:50:48,054 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7543_210_6753_1_1528543318_4552_1-85072-0_sp1.1 from training. Duration: 0.7545625 2023-10-03 07:50:49,286 WARNING [train.py:1204] (2/4) Exclude cut with ID _64639_210_5816_1_1535367619437_3528000_461-329156-0 from training. Duration: 0.96 2023-10-03 07:50:50,720 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_211-339622-0 from training. Duration: 0.45 2023-10-03 07:50:52,201 WARNING [train.py:1204] (2/4) Exclude cut with ID _20455_210_5219_1_1531565996775_8098100_402-112739-0_sp1.1 from training. Duration: 0.963625 2023-10-03 07:50:53,619 INFO [train.py:1046] (2/4) Epoch 34, batch 3550, loss[loss=0.1504, simple_loss=0.2272, pruned_loss=0.03676, over 18205.00 frames. ], tot_loss[loss=0.1605, simple_loss=0.2397, pruned_loss=0.04063, over 4718126.80 frames. ], batch size: 39, lr: 2.99e-03, grad_scale: 8.0 2023-10-03 07:50:53,707 WARNING [train.py:1204] (2/4) Exclude cut with ID _53489_210_6228_1_1534298357169_3835970_256-115565-0_sp1.1 from training. Duration: 0.936375 2023-10-03 07:50:53,735 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_26326_210_7925_1_1533002493_3592406_360-305260-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 07:50:55,445 WARNING [train.py:1204] (2/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_18-204383-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 07:50:57,491 WARNING [train.py:1204] (2/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_306-232480-0_sp1.1 from training. Duration: 0.87275 2023-10-03 07:51:05,622 WARNING [train.py:1204] (2/4) Exclude cut with ID _41161_210_20034_1_1533431249657_3555314_116-299186-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 07:51:07,048 WARNING [train.py:1204] (2/4) Exclude cut with ID _13913_210_9994_1_1530579615187_3383897_473-277682-0_sp1.1 from training. Duration: 0.3545625 2023-10-03 07:51:11,511 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.553e+02 1.871e+02 2.039e+02 2.227e+02 3.484e+02, threshold=4.078e+02, percent-clipped=0.0 2023-10-03 07:51:11,656 WARNING [train.py:1204] (2/4) Exclude cut with ID _58895_210_8750_1_1535184081320_3649089_49-225702-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 07:51:11,716 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_3080_210_2071_1_1526086102_4365166_312-8726-0_sp1.1 from training. Duration: 0.82725 2023-10-03 07:51:11,916 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.conv_module2.balancer1.prob, batch_count=1192400.0, ans=0.125 2023-10-03 07:51:13,076 WARNING [train.py:1204] (2/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_489-190175-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 07:51:13,142 WARNING [train.py:1204] (2/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_345-95717-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 07:51:13,152 WARNING [train.py:1204] (2/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_85-138257-0_sp1.1 from training. Duration: 0.6454375 2023-10-03 07:51:17,847 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7046_210_6753_1_1527895337_6920917_783-278109-0_sp1.1 from training. Duration: 0.7 2023-10-03 07:51:17,884 WARNING [train.py:1204] (2/4) Exclude cut with ID _3465_210_3525_1_1526175667060_3574049_450-203637-0_sp1.1 from training. Duration: 0.836375 2023-10-03 07:51:17,924 WARNING [train.py:1204] (2/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_416-132893-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 07:51:19,208 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_2655_210_2087_1_1525688959_3897400_437-337690-0_sp0.9 from training. Duration: 0.7 2023-10-03 07:51:20,561 WARNING [train.py:1204] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_700-183549-0_sp1.1 from training. Duration: 0.6181875 2023-10-03 07:51:25,362 WARNING [train.py:1204] (2/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_191-59784-0_sp1.1 from training. Duration: 0.8 2023-10-03 07:51:25,383 WARNING [train.py:1204] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_218-295097-0_sp1.1 from training. Duration: 0.7 2023-10-03 07:51:27,363 WARNING [train.py:1204] (2/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_310-109461-0_sp1.1 from training. Duration: 0.72725 2023-10-03 07:51:27,369 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_33348_210_5632_1_1533086912_3324030_131-321149-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 07:51:28,698 WARNING [train.py:1204] (2/4) Exclude cut with ID _64135_210_2253_1_1535677267876_7608972_133-188266-0_sp1.1 from training. Duration: 0.736375 2023-10-03 07:51:28,716 WARNING [train.py:1204] (2/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_505-205270-0 from training. Duration: 0.38 2023-10-03 07:51:28,734 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_67223_210_4238_1_1535612120_2537431_102-88248-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 07:51:30,317 WARNING [train.py:1204] (2/4) Exclude cut with ID _54871_210_22356_1_1534665798253_3856359_60-235433-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 07:51:31,037 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.2.encoder.layers.0.feed_forward2.out_whiten, num_groups=1, num_channels=384, metric=9.87 vs. limit=15.0 2023-10-03 07:51:31,683 WARNING [train.py:1204] (2/4) Exclude cut with ID _5189_210_4557_1_1527248319428_3769229_199-53526-0_sp1.1 from training. Duration: 0.3818125 2023-10-03 07:51:35,839 WARNING [train.py:1204] (2/4) Exclude cut with ID _45329_210_7005_1_1534157951211_6774339_439-90979-0_sp1.1 from training. Duration: 0.963625 2023-10-03 07:51:35,911 WARNING [train.py:1204] (2/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_1162-132880-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 07:51:37,267 WARNING [train.py:1204] (2/4) Exclude cut with ID _56423_210_5854_1_1534896061049_3165969_220-22317-0_sp1.1 from training. Duration: 0.963625 2023-10-03 07:51:38,680 WARNING [train.py:1204] (2/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_909-64151-0 from training. Duration: 0.94 2023-10-03 07:51:40,022 WARNING [train.py:1204] (2/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_465-156500-0_sp0.9 from training. Duration: 0.8 2023-10-03 07:51:40,113 WARNING [train.py:1204] (2/4) Exclude cut with ID _44304_210_15374_1_1533639352765_4613129_302-22084-0 from training. Duration: 0.86 2023-10-03 07:51:40,155 WARNING [train.py:1204] (2/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_663-166973-0_sp1.1 from training. Duration: 0.72725 2023-10-03 07:51:43,404 WARNING [train.py:1204] (2/4) Exclude cut with ID _45329_210_7005_1_1534157951211_6774339_503-287883-0_sp0.9 from training. Duration: 0.97775 2023-10-03 07:51:43,427 WARNING [train.py:1204] (2/4) Exclude cut with ID _60057_210_10640_1_1534903065793_3333150_362-230377-0_sp0.9 from training. Duration: 0.9666875 2023-10-03 07:51:46,661 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_4183_210_2087_1_1526729100_1869838_143-280962-0 from training. Duration: 0.56 2023-10-03 07:51:47,962 WARNING [train.py:1204] (2/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_281-1313-0_sp1.1 from training. Duration: 0.936375 2023-10-03 07:51:48,195 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.balancer1.prob, batch_count=1192533.3333333333, ans=0.125 2023-10-03 07:51:52,171 WARNING [train.py:1204] (2/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_64-215118-0_sp1.1 from training. Duration: 0.936375 2023-10-03 07:51:53,957 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_2822_210_2715_1_1526112586_1999587_165-36656-0 from training. Duration: 0.89 2023-10-03 07:51:53,999 WARNING [train.py:1204] (2/4) Exclude cut with ID _22961_210_8254_1_1531976366672_4278180_402-208830-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 07:51:57,237 WARNING [train.py:1204] (2/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_98-193494-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 07:51:58,526 WARNING [train.py:1204] (2/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_383-285196-0 from training. Duration: 0.97 2023-10-03 07:52:04,123 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6767_210_3273_1_1527855783_1718932_142-127132-0 from training. Duration: 0.95 2023-10-03 07:52:05,433 WARNING [train.py:1204] (2/4) Exclude cut with ID _39867_210_9826_1_1533553222030_9344259_1018-339635-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 07:52:06,782 WARNING [train.py:1204] (2/4) Exclude cut with ID _14603_210_4220_1_1530615688441_3097319_274-274798-0_sp1.1 from training. Duration: 0.8909375 2023-10-03 07:52:08,113 INFO [train.py:1046] (2/4) Epoch 34, batch 3600, loss[loss=0.1556, simple_loss=0.2305, pruned_loss=0.04038, over 23546.00 frames. ], tot_loss[loss=0.1601, simple_loss=0.2391, pruned_loss=0.0405, over 4714016.69 frames. ], batch size: 134, lr: 2.99e-03, grad_scale: 16.0 2023-10-03 07:52:08,211 WARNING [train.py:1204] (2/4) Exclude cut with ID _2334_210_2667_1_1525608238880_4705129_376-263393-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 07:52:08,261 WARNING [train.py:1204] (2/4) Exclude cut with ID _29303_210_2575_1_1532392237303_7410649_363-344675-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 07:52:11,392 WARNING [train.py:1204] (2/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_58-68956-0_sp1.1 from training. Duration: 0.7545625 2023-10-03 07:52:14,253 WARNING [train.py:1204] (2/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_709-311571-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 07:52:14,509 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.self_attn_weights.pos_emb_skip_rate, batch_count=1192666.6666666667, ans=0.0 2023-10-03 07:52:17,302 WARNING [train.py:1204] (2/4) Exclude cut with ID _46067_210_4220_1_1534125571066_3307450_136-257407-0_sp1.1 from training. Duration: 0.963625 2023-10-03 07:52:18,552 WARNING [train.py:1204] (2/4) Exclude cut with ID _6493_210_1747_1_1528459210424_4012470_324-323854-0_sp1.1 from training. Duration: 0.836375 2023-10-03 07:52:18,630 WARNING [train.py:1204] (2/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_234-260682-0_sp1.1 from training. Duration: 0.763625 2023-10-03 07:52:19,905 WARNING [train.py:1204] (2/4) Exclude cut with ID _36634_210_1794_1_1533296352450_1399884_55-119451-0_sp1.1 from training. Duration: 0.963625 2023-10-03 07:52:19,909 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_40047_210_2290_1_1533290519_2374311_195-165693-0 from training. Duration: 0.89 2023-10-03 07:52:23,906 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_5522_210_3273_1_1527249606_3909096_366-172508-0_sp0.9 from training. Duration: 0.7333125 2023-10-03 07:52:25,184 WARNING [train.py:1204] (2/4) Exclude cut with ID _21878_210_3525_1_1531738772002_7264330_187-10504-0_sp1.1 from training. Duration: 0.963625 2023-10-03 07:52:28,528 WARNING [train.py:1204] (2/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_282-257791-0_sp1.1 from training. Duration: 0.7818125 2023-10-03 07:52:30,005 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_43831_210_2158_1_1533725502_5872791_467-288776-0_sp1.1 from training. Duration: 0.92725 2023-10-03 07:52:31,337 WARNING [train.py:1204] (2/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_138-154812-0_sp1.1 from training. Duration: 0.7090625 2023-10-03 07:52:32,752 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_1154-185930-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 07:52:32,777 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_2655_210_2087_1_1525688959_3897400_52-279899-0 from training. Duration: 0.69 2023-10-03 07:52:32,832 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_141-89113-0_sp1.1 from training. Duration: 0.7818125 2023-10-03 07:52:36,736 WARNING [train.py:1204] (2/4) Exclude cut with ID _55134_210_21982_1_1534590028377_3882170_297-47310-0_sp1.1 from training. Duration: 0.963625 2023-10-03 07:52:38,069 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_486-151131-0_sp1.1 from training. Duration: 0.77275 2023-10-03 07:52:39,459 WARNING [train.py:1204] (2/4) Exclude cut with ID _44778_210_2054_1_1533798127203_7443090_21-232980-0_sp1.1 from training. Duration: 0.936375 2023-10-03 07:52:39,761 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.attention_skip_rate, batch_count=1192800.0, ans=0.0 2023-10-03 07:52:40,938 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6165_210_3528_1_1527590982_4379841_338-106380-0_sp1.1 from training. Duration: 0.92725 2023-10-03 07:52:42,733 WARNING [train.py:1204] (2/4) Exclude cut with ID _55162_210_17071_1_1534408248583_3753480_355-257218-0_sp1.1 from training. Duration: 0.97275 2023-10-03 07:52:42,809 WARNING [train.py:1204] (2/4) Exclude cut with ID _2895_210_2477_1_1525956785794_4063750_289-11475-0 from training. Duration: 0.82 2023-10-03 07:52:42,944 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.0.ff2_skip_rate, batch_count=1192800.0, ans=0.0 2023-10-03 07:52:50,442 WARNING [train.py:1204] (2/4) Exclude cut with ID _16756_210_4915_1_1531047803763_7104259_839-183431-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 07:52:51,890 WARNING [train.py:1204] (2/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_427-208901-0_sp0.9 from training. Duration: 0.7555625 2023-10-03 07:52:51,941 WARNING [train.py:1204] (2/4) Exclude cut with ID _36512_210_6987_1_1533033143998_3126069_181-306500-0 from training. Duration: 0.95 2023-10-03 07:52:53,517 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.feed_forward1.hidden_balancer.prob, batch_count=1192866.6666666667, ans=0.125 2023-10-03 07:52:53,537 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.feed_forward1.out_proj.dropout_p, batch_count=1192866.6666666667, ans=0.1 2023-10-03 07:52:56,321 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=1192866.6666666667, ans=0.1 2023-10-03 07:52:57,970 WARNING [train.py:1204] (2/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_28-70987-0_sp1.1 from training. Duration: 0.7909375 2023-10-03 07:53:00,057 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.3.encoder.layers.3.nonlin_attention.whiten1, num_groups=1, num_channels=384, metric=5.36 vs. limit=10.0 2023-10-03 07:53:02,758 WARNING [train.py:1204] (2/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_590-58996-0_sp1.1 from training. Duration: 0.936375 2023-10-03 07:53:05,575 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_48730_210_5399_1_1533885886_2212769_108-47561-0_sp1.1 from training. Duration: 0.936375 2023-10-03 07:53:05,785 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.feed_forward1.hidden_balancer.prob, batch_count=1192933.3333333333, ans=0.125 2023-10-03 07:53:09,789 WARNING [train.py:1204] (2/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_401-158239-0_sp0.9 from training. Duration: 0.788875 2023-10-03 07:53:09,802 WARNING [train.py:1204] (2/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_278-154492-0_sp1.1 from training. Duration: 0.6909375 2023-10-03 07:53:09,809 WARNING [train.py:1204] (2/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_137-116442-0 from training. Duration: 0.78 2023-10-03 07:53:11,226 WARNING [train.py:1204] (2/4) Exclude cut with ID _3541_210_2477_1_1526475408709_3263973_189-317158-0 from training. Duration: 0.9 2023-10-03 07:53:11,496 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.ff2_skip_rate, batch_count=1192933.3333333333, ans=0.0 2023-10-03 07:53:13,035 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_47896_210_20508_1_1534492518_5958868_558-314572-0 from training. Duration: 0.97 2023-10-03 07:53:13,326 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.conv_skip_rate, batch_count=1192933.3333333333, ans=0.0 2023-10-03 07:53:15,676 WARNING [train.py:1204] (2/4) Exclude cut with ID _66070_210_7640_1_1535441393096_7805310_290-40284-0_sp1.1 from training. Duration: 0.97275 2023-10-03 07:53:15,717 WARNING [train.py:1204] (2/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_110-224688-0_sp1.1 from training. Duration: 0.8818125 2023-10-03 07:53:15,790 WARNING [train.py:1204] (2/4) Exclude cut with ID _7719_210_3525_1_1528456153163_4296960_88-250259-0 from training. Duration: 0.84 2023-10-03 07:53:17,536 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8453_210_4211_1_1528502940_8562730_1157-114102-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 07:53:17,575 WARNING [train.py:1204] (2/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_317-175062-0_sp1.1 from training. Duration: 0.7454375 2023-10-03 07:53:17,580 WARNING [train.py:1204] (2/4) Exclude cut with ID _29857_210_20034_1_1532395886753_3798160_254-347254-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 07:53:18,844 WARNING [train.py:1204] (2/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_28-70987-0 from training. Duration: 0.87 2023-10-03 07:53:20,204 WARNING [train.py:1204] (2/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_321-335475-0 from training. Duration: 0.45 2023-10-03 07:53:21,550 INFO [train.py:1046] (2/4) Epoch 34, batch 3650, loss[loss=0.1691, simple_loss=0.2516, pruned_loss=0.04333, over 23434.00 frames. ], tot_loss[loss=0.16, simple_loss=0.2392, pruned_loss=0.04034, over 4719040.00 frames. ], batch size: 93, lr: 2.99e-03, grad_scale: 16.0 2023-10-03 07:53:22,957 WARNING [train.py:1204] (2/4) Exclude cut with ID _40901_210_5665_1_1533363789958_3856130_356-107330-0_sp1.1 from training. Duration: 0.936375 2023-10-03 07:53:24,330 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_3780_210_2087_1_1526467389_2145572_176-107140-0 from training. Duration: 0.69 2023-10-03 07:53:29,210 WARNING [train.py:1204] (2/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_80-135200-0 from training. Duration: 0.79 2023-10-03 07:53:30,592 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_33348_210_5632_1_1533086912_3324030_85-28583-0_sp0.9 from training. Duration: 0.911125 2023-10-03 07:53:35,111 WARNING [train.py:1204] (2/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_29-293896-0 from training. Duration: 0.61 2023-10-03 07:53:36,505 WARNING [train.py:1204] (2/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_194-252800-0 from training. Duration: 0.97 2023-10-03 07:53:39,300 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.432e+02 1.892e+02 2.047e+02 2.269e+02 3.053e+02, threshold=4.094e+02, percent-clipped=0.0 2023-10-03 07:53:39,373 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_360-52446-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 07:53:39,375 WARNING [train.py:1204] (2/4) Exclude cut with ID _9389_210_1664_1_1529217337301_5289269_289-213417-0_sp0.9 from training. Duration: 0.988875 2023-10-03 07:53:39,431 WARNING [train.py:1204] (2/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_143-86344-0_sp0.9 from training. Duration: 0.8555625 2023-10-03 07:53:41,547 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.conv_module2.balancer1.max_abs, batch_count=1193066.6666666667, ans=10.0 2023-10-03 07:53:42,715 WARNING [train.py:1204] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_520-126729-0_sp0.9 from training. Duration: 0.77775 2023-10-03 07:53:42,748 WARNING [train.py:1204] (2/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_76-308663-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 07:53:44,103 WARNING [train.py:1204] (2/4) Exclude cut with ID _8021_210_7994_1_1528376451358_3653069_100-140231-0 from training. Duration: 0.67 2023-10-03 07:53:44,168 WARNING [train.py:1204] (2/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_387-131538-0_sp0.9 from training. Duration: 0.9 2023-10-03 07:53:44,199 WARNING [train.py:1204] (2/4) Exclude cut with ID _6275_210_2084_1_1528110062213_7669049_365-182054-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 07:53:45,670 WARNING [train.py:1204] (2/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_345-340690-0 from training. Duration: 0.93 2023-10-03 07:53:46,968 WARNING [train.py:1204] (2/4) Exclude cut with ID _5409_210_1388_1_1527292850189_5865191_178-127970-0_sp0.9 from training. Duration: 0.6333125 2023-10-03 07:53:47,025 WARNING [train.py:1204] (2/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_352-12734-0_sp1.1 from training. Duration: 0.92725 2023-10-03 07:53:47,028 WARNING [train.py:1204] (2/4) Exclude cut with ID _65134_210_14763_1_1535335393985_3505368_121-129373-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 07:53:51,454 WARNING [train.py:1204] (2/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_557-324258-0_sp1.1 from training. Duration: 0.736375 2023-10-03 07:53:53,023 WARNING [train.py:1204] (2/4) Exclude cut with ID _48678_210_5663_1_1533988830947_3769199_306-255886-0 from training. Duration: 0.77 2023-10-03 07:53:54,350 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7544_210_6753_1_1528587000_691468_87-178599-0 from training. Duration: 0.87 2023-10-03 07:53:55,719 WARNING [train.py:1204] (2/4) Exclude cut with ID _2941_210_2577_1_1526199673150_2489759_208-236402-0_sp1.1 from training. Duration: 0.963625 2023-10-03 07:53:58,566 WARNING [train.py:1204] (2/4) Exclude cut with ID _38670_210_15379_1_1533448810659_4391059_65-169397-0 from training. Duration: 0.97 2023-10-03 07:53:59,919 WARNING [train.py:1204] (2/4) Exclude cut with ID _30304_210_6319_1_1532476827261_7582060_306-328175-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 07:53:59,935 WARNING [train.py:1204] (2/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_212-149947-0_sp1.1 from training. Duration: 0.663625 2023-10-03 07:54:05,354 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.0.feed_forward2.hidden_balancer.prob, batch_count=1193200.0, ans=0.125 2023-10-03 07:54:06,491 WARNING [train.py:1204] (2/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_321-22821-0_sp1.1 from training. Duration: 0.8090625 2023-10-03 07:54:06,609 WARNING [train.py:1204] (2/4) Exclude cut with ID _44010_210_4525_1_1533884262964_4013620_483-69110-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 07:54:06,617 WARNING [train.py:1204] (2/4) Exclude cut with ID _14922_210_6228_1_1530707435273_3693310_38-187851-0_sp1.1 from training. Duration: 0.563625 2023-10-03 07:54:06,740 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.1.conv_skip_rate, batch_count=1193200.0, ans=0.0 2023-10-03 07:54:06,817 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.bypass_mid.scale_min, batch_count=1193200.0, ans=0.2 2023-10-03 07:54:09,322 WARNING [train.py:1204] (2/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_129-337802-0_sp0.9 from training. Duration: 0.8 2023-10-03 07:54:10,540 WARNING [train.py:1204] (2/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_268-87909-0_sp1.1 from training. Duration: 0.9 2023-10-03 07:54:12,116 WARNING [train.py:1204] (2/4) Exclude cut with ID _21878_210_3525_1_1531738772002_7264330_559-184862-0_sp1.1 from training. Duration: 0.8545625 2023-10-03 07:54:14,760 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_46032_210_14656_1_1533781412_4507969_237-120766-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 07:54:14,859 WARNING [train.py:1204] (2/4) Exclude cut with ID _28427_210_4220_1_1532581240967_3061069_330-170906-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 07:54:14,865 WARNING [train.py:1204] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_401-51578-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 07:54:18,069 WARNING [train.py:1204] (2/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_209-205365-0_sp1.1 from training. Duration: 0.7181875 2023-10-03 07:54:18,321 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.conv_module2.balancer1.prob, batch_count=1193200.0, ans=0.125 2023-10-03 07:54:19,457 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_23801_210_16223_1_1532066392_3294657_57-242170-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 07:54:19,515 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_127-37371-0_sp1.1 from training. Duration: 0.936375 2023-10-03 07:54:25,726 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9171W0019-578905 from training. Duration: 0.896 2023-10-03 07:54:28,672 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.self_attn_weights.pos_emb_skip_rate, batch_count=1193266.6666666667, ans=0.0 2023-10-03 07:54:29,792 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_24605_210_1794_1_1532047863_4149755_349-49056-0_sp1.1 from training. Duration: 0.92725 2023-10-03 07:54:29,810 WARNING [train.py:1204] (2/4) Exclude cut with ID _12302_210_5219_1_1529924446466_8107280_897-21237-0_sp1.1 from training. Duration: 0.936375 2023-10-03 07:54:31,131 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_9460_210_4211_1_1528954587_10049667_480-260269-0_sp0.9 from training. Duration: 0.62225 2023-10-03 07:54:31,281 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.conv_module2.balancer1.prob, batch_count=1193266.6666666667, ans=0.125 2023-10-03 07:54:32,958 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_348-152548-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 07:54:34,169 WARNING [train.py:1204] (2/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_301-330455-0_sp0.9 from training. Duration: 0.888875 2023-10-03 07:54:35,821 INFO [train.py:1046] (2/4) Epoch 34, batch 3700, loss[loss=0.169, simple_loss=0.2605, pruned_loss=0.03874, over 24578.00 frames. ], tot_loss[loss=0.1601, simple_loss=0.2395, pruned_loss=0.0403, over 4720486.14 frames. ], batch size: 71, lr: 2.99e-03, grad_scale: 16.0 2023-10-03 07:54:35,914 WARNING [train.py:1204] (2/4) Exclude cut with ID _61008_210_10633_1_1534852769198_6974370_345-91705-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 07:54:37,304 WARNING [train.py:1204] (2/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_304-273433-0 from training. Duration: 0.99 2023-10-03 07:54:37,307 WARNING [train.py:1204] (2/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_359-345972-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 07:54:37,594 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.feed_forward1.out_proj.dropout_p, batch_count=1193333.3333333333, ans=0.1 2023-10-03 07:54:39,261 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_2296_210_2087_1_1525503448_3397335_57-185430-0_sp1.1 from training. Duration: 0.5454375 2023-10-03 07:54:40,666 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_1256-120974-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 07:54:41,991 WARNING [train.py:1204] (2/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_372-30302-0_sp0.9 from training. Duration: 0.9555625 2023-10-03 07:54:43,449 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_42012_210_7927_1_1533439936_3871556_451-344262-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 07:54:43,450 WARNING [train.py:1204] (2/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_28-324729-0 from training. Duration: 0.94 2023-10-03 07:54:44,735 WARNING [train.py:1204] (2/4) Exclude cut with ID _64135_210_2253_1_1535677267876_7608972_313-18394-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 07:54:44,822 WARNING [train.py:1204] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_620-276793-0_sp0.9 from training. Duration: 0.6555625 2023-10-03 07:54:45,745 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.0.layers.1.self_attn_weights.whiten_keys, num_groups=4, num_channels=128, metric=2.66 vs. limit=6.0 2023-10-03 07:54:46,054 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7544_210_6753_1_1528591327_1471426_172-348777-0_sp0.9 from training. Duration: 0.7333125 2023-10-03 07:54:49,434 WARNING [train.py:1204] (2/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_332-221057-0_sp0.9 from training. Duration: 0.7555625 2023-10-03 07:54:52,751 WARNING [train.py:1204] (2/4) Exclude cut with ID _19023_210_4353_1_1531378892552_5820780_5-300035-0_sp1.1 from training. Duration: 0.92725 2023-10-03 07:54:52,798 WARNING [train.py:1204] (2/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_343-309023-0_sp1.1 from training. Duration: 0.963625 2023-10-03 07:54:54,089 WARNING [train.py:1204] (2/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_37-41919-0_sp1.1 from training. Duration: 0.7909375 2023-10-03 07:54:55,402 WARNING [train.py:1204] (2/4) Exclude cut with ID _3541_210_2477_1_1526475408709_3263973_41-182910-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 07:54:55,462 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_229-45300-0_sp0.9 from training. Duration: 0.6333125 2023-10-03 07:54:56,914 WARNING [train.py:1204] (2/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_40-214716-0_sp1.1 from training. Duration: 0.963625 2023-10-03 07:54:58,286 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9309W0203-640330 from training. Duration: 0.896 2023-10-03 07:55:05,905 WARNING [train.py:1204] (2/4) Exclude cut with ID _60229_210_27767_1_1534838406158_3655350_264-279573-0_sp1.1 from training. Duration: 0.7545625 2023-10-03 07:55:07,124 WARNING [train.py:1204] (2/4) Exclude cut with ID _8394_210_6702_1_1528590504316_3540189_372-198878-0_sp0.9 from training. Duration: 0.6666875 2023-10-03 07:55:08,483 WARNING [train.py:1204] (2/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_359-118926-0_sp1.1 from training. Duration: 0.6545625 2023-10-03 07:55:08,599 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.1.feed_forward1.out_proj.dropout_p, batch_count=1193466.6666666667, ans=0.1 2023-10-03 07:55:08,652 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.conv_module1.balancer1.prob, batch_count=1193466.6666666667, ans=0.125 2023-10-03 07:55:10,203 WARNING [train.py:1204] (2/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_85-65056-0 from training. Duration: 0.96 2023-10-03 07:55:10,222 WARNING [train.py:1204] (2/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_89-242335-0_sp1.1 from training. Duration: 0.863625 2023-10-03 07:55:13,025 WARNING [train.py:1204] (2/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_128-49444-0_sp1.1 from training. Duration: 0.936375 2023-10-03 07:55:14,357 WARNING [train.py:1204] (2/4) Exclude cut with ID _30327_210_16049_1_1532484161295_3924169_401-70819-0 from training. Duration: 0.86 2023-10-03 07:55:15,802 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_41822_210_13270_1_1533955976_4379284_260-256391-0_sp1.1 from training. Duration: 0.936375 2023-10-03 07:55:17,072 WARNING [train.py:1204] (2/4) Exclude cut with ID _67335_210_5219_1_1535525975652_4275229_599-247317-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 07:55:18,585 WARNING [train.py:1204] (2/4) Exclude cut with ID _29857_210_20034_1_1532395886753_3798160_209-239267-0_sp1.1 from training. Duration: 0.936375 2023-10-03 07:55:18,612 WARNING [train.py:1204] (2/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_46-207599-0_sp1.1 from training. Duration: 0.5818125 2023-10-03 07:55:21,830 WARNING [train.py:1204] (2/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_912-282412-0_sp1.1 from training. Duration: 0.4454375 2023-10-03 07:55:23,865 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=1193533.3333333333, ans=0.1 2023-10-03 07:55:26,357 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_4183_210_2087_1_1526729100_1869838_141-166600-0_sp1.1 from training. Duration: 0.863625 2023-10-03 07:55:26,362 WARNING [train.py:1204] (2/4) Exclude cut with ID _15037_210_2253_1_1530752294104_3267650_296-192597-0 from training. Duration: 0.81 2023-10-03 07:55:27,674 WARNING [train.py:1204] (2/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_571-220562-0_sp1.1 from training. Duration: 0.963625 2023-10-03 07:55:27,712 WARNING [train.py:1204] (2/4) Exclude cut with ID _5287_210_2084_1_1527418883040_3878670_12-221186-0 from training. Duration: 0.98 2023-10-03 07:55:32,109 WARNING [train.py:1204] (2/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_149-85602-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 07:55:32,140 WARNING [train.py:1204] (2/4) Exclude cut with ID _9235_210_5219_1_1528891154253_8046120_1003-101638-0_sp1.1 from training. Duration: 0.663625 2023-10-03 07:55:35,173 WARNING [train.py:1204] (2/4) Exclude cut with ID _55844_210_2346_1_1534471212569_7625707_1099-33689-0_sp1.1 from training. Duration: 0.92725 2023-10-03 07:55:36,344 WARNING [train.py:1204] (2/4) Exclude cut with ID _7606_210_4915_1_1528894253277_5080690_301-10732-0 from training. Duration: 0.81 2023-10-03 07:55:38,315 WARNING [train.py:1204] (2/4) Exclude cut with ID _22826_210_9994_1_1531908201522_3279169_403-34698-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 07:55:38,321 WARNING [train.py:1204] (2/4) Exclude cut with ID _8480_210_1874_1_1528589391205_6728330_755-112625-0_sp0.9 from training. Duration: 0.82225 2023-10-03 07:55:39,715 WARNING [train.py:1204] (2/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_527-238990-0_sp1.1 from training. Duration: 0.8181875 2023-10-03 07:55:39,736 WARNING [train.py:1204] (2/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_426-7328-0_sp1.1 from training. Duration: 0.92725 2023-10-03 07:55:43,109 WARNING [train.py:1204] (2/4) Exclude cut with ID _3541_210_2477_1_1526475408709_3263973_189-317158-0_sp1.1 from training. Duration: 0.8181875 2023-10-03 07:55:43,180 WARNING [train.py:1204] (2/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_162-103620-0 from training. Duration: 0.93 2023-10-03 07:55:45,777 WARNING [train.py:1204] (2/4) Exclude cut with ID _22083_210_8254_1_1531702862439_3678970_49-234546-0 from training. Duration: 0.83 2023-10-03 07:55:45,833 WARNING [train.py:1204] (2/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_370-331600-0_sp1.1 from training. Duration: 0.8545625 2023-10-03 07:55:45,848 WARNING [train.py:1204] (2/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_166-59042-0_sp1.1 from training. Duration: 0.97275 2023-10-03 07:55:47,285 WARNING [train.py:1204] (2/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_138-332773-0_sp1.1 from training. Duration: 0.736375 2023-10-03 07:55:48,574 WARNING [train.py:1204] (2/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_90-30673-0_sp0.9 from training. Duration: 0.9333125 2023-10-03 07:55:50,546 INFO [train.py:1046] (2/4) Epoch 34, batch 3750, loss[loss=0.1708, simple_loss=0.2642, pruned_loss=0.0387, over 24462.00 frames. ], tot_loss[loss=0.1604, simple_loss=0.2401, pruned_loss=0.04035, over 4734504.41 frames. ], batch size: 69, lr: 2.99e-03, grad_scale: 16.0 2023-10-03 07:55:50,695 WARNING [train.py:1204] (2/4) Exclude cut with ID _27967_210_12140_1_1532588391859_8419130_737-41898-0_sp1.1 from training. Duration: 0.936375 2023-10-03 07:55:51,586 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.0.layers.1.feed_forward3.out_whiten, num_groups=1, num_channels=192, metric=9.00 vs. limit=15.0 2023-10-03 07:55:52,026 WARNING [train.py:1204] (2/4) Exclude cut with ID _8833_210_5219_1_1528592665210_7553059_329-125156-0_sp1.1 from training. Duration: 0.7454375 2023-10-03 07:55:53,280 WARNING [train.py:1204] (2/4) Exclude cut with ID _41817_210_18835_1_1533434477160_7423049_352-42537-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 07:55:55,232 WARNING [train.py:1204] (2/4) Exclude cut with ID _8365_210_2064_1_1528592311070_4677690_431-294102-0 from training. Duration: 0.89 2023-10-03 07:55:56,624 WARNING [train.py:1204] (2/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_454-314901-0_sp1.1 from training. Duration: 0.4181875 2023-10-03 07:55:58,071 WARNING [train.py:1204] (2/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_80-135200-0_sp0.9 from training. Duration: 0.87775 2023-10-03 07:55:58,116 WARNING [train.py:1204] (2/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_181-78632-0 from training. Duration: 0.78 2023-10-03 07:55:58,174 WARNING [train.py:1204] (2/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_300-27235-0_sp0.9 from training. Duration: 0.9666875 2023-10-03 07:55:59,593 WARNING [train.py:1204] (2/4) Exclude cut with ID _47790_210_16187_1_1533862814066_4207519_96-177176-0_sp1.1 from training. Duration: 0.97275 2023-10-03 07:56:00,977 WARNING [train.py:1204] (2/4) Exclude cut with ID _35941_210_15379_1_1532995294515_4023089_246-348742-0_sp1.1 from training. Duration: 0.97275 2023-10-03 07:56:02,373 WARNING [train.py:1204] (2/4) Exclude cut with ID _38670_210_15379_1_1533448810659_4391059_65-169397-0_sp1.1 from training. Duration: 0.8818125 2023-10-03 07:56:02,620 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.bypass.skip_rate, batch_count=1193666.6666666667, ans=0.04949747468305833 2023-10-03 07:56:03,760 WARNING [train.py:1204] (2/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_1003-99642-0_sp1.1 from training. Duration: 0.963625 2023-10-03 07:56:05,917 WARNING [train.py:1204] (2/4) Exclude cut with ID _40402_210_18219_1_1533522324115_2953150_320-14078-0_sp1.1 from training. Duration: 0.863625 2023-10-03 07:56:07,187 WARNING [train.py:1204] (2/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_367-61330-0_sp0.9 from training. Duration: 0.8333125 2023-10-03 07:56:08,415 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.577e+02 1.948e+02 2.234e+02 2.708e+02 3.464e+02, threshold=4.468e+02, percent-clipped=0.0 2023-10-03 07:56:08,796 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.attention_skip_rate, batch_count=1193733.3333333333, ans=0.0 2023-10-03 07:56:10,300 WARNING [train.py:1204] (2/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_515-298514-0_sp1.1 from training. Duration: 0.92725 2023-10-03 07:56:12,956 WARNING [train.py:1204] (2/4) Exclude cut with ID _53731_210_7994_1_1534553979244_3757214_471-320815-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 07:56:13,010 WARNING [train.py:1204] (2/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_306-232480-0 from training. Duration: 0.96 2023-10-03 07:56:14,885 WARNING [train.py:1204] (2/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_478-162247-0_sp0.9 from training. Duration: 0.911125 2023-10-03 07:56:16,360 WARNING [train.py:1204] (2/4) Exclude cut with ID _56423_210_5854_1_1534896061049_3165969_345-284696-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 07:56:16,395 WARNING [train.py:1204] (2/4) Exclude cut with ID _44778_210_2054_1_1533798127203_7443090_600-122253-0_sp1.1 from training. Duration: 0.963625 2023-10-03 07:56:19,627 WARNING [train.py:1204] (2/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_199-295611-0 from training. Duration: 0.55 2023-10-03 07:56:23,693 WARNING [train.py:1204] (2/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_734-129761-0 from training. Duration: 0.82 2023-10-03 07:56:25,158 WARNING [train.py:1204] (2/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_349-169257-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 07:56:25,194 WARNING [train.py:1204] (2/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_202-67017-0_sp0.9 from training. Duration: 0.911125 2023-10-03 07:56:26,565 WARNING [train.py:1204] (2/4) Exclude cut with ID _16908_210_5724_1_1531119157248_3772700_61-258978-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 07:56:26,756 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.bypass_mid.scale_min, batch_count=1193800.0, ans=0.2 2023-10-03 07:56:29,506 WARNING [train.py:1204] (2/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_172-156931-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 07:56:30,919 WARNING [train.py:1204] (2/4) Exclude cut with ID _4092_210_2151_1_1526643044449_3135759_356-278694-0_sp1.1 from training. Duration: 0.42725 2023-10-03 07:56:33,697 WARNING [train.py:1204] (2/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_116-332008-0 from training. Duration: 0.98 2023-10-03 07:56:37,259 WARNING [train.py:1204] (2/4) Exclude cut with ID _29003_210_19596_1_1532507391720_3894098_358-114789-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 07:56:38,808 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.feed_forward3.hidden_balancer.prob, batch_count=1193866.6666666667, ans=0.125 2023-10-03 07:56:38,850 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=1193866.6666666667, ans=0.1 2023-10-03 07:56:40,050 WARNING [train.py:1204] (2/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_152-212533-0_sp1.1 from training. Duration: 0.8818125 2023-10-03 07:56:41,341 WARNING [train.py:1204] (2/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_267-214116-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 07:56:44,174 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7543_210_6753_1_1528541907_1399322_165-323619-0_sp0.9 from training. Duration: 0.7666875 2023-10-03 07:56:47,671 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.feed_forward2.hidden_balancer.prob, batch_count=1193866.6666666667, ans=0.125 2023-10-03 07:56:49,292 WARNING [train.py:1204] (2/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_577-332600-0_sp0.9 from training. Duration: 0.6333125 2023-10-03 07:56:50,661 WARNING [train.py:1204] (2/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_342-100157-0_sp0.9 from training. Duration: 0.6 2023-10-03 07:56:52,020 WARNING [train.py:1204] (2/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_141-89727-0_sp1.1 from training. Duration: 0.6454375 2023-10-03 07:56:53,430 WARNING [train.py:1204] (2/4) Exclude cut with ID _3992_210_1747_1_1526644816031_3731060_107-251434-0_sp1.1 from training. Duration: 0.9 2023-10-03 07:56:54,928 WARNING [train.py:1204] (2/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_401-53423-0_sp1.1 from training. Duration: 0.5 2023-10-03 07:57:02,037 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.feed_forward1.hidden_balancer.prob, batch_count=1193933.3333333333, ans=0.125 2023-10-03 07:57:03,301 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7762_210_1794_1_1528199508_4057408_422-227034-0_sp1.1 from training. Duration: 0.663625 2023-10-03 07:57:04,524 INFO [train.py:1046] (2/4) Epoch 34, batch 3800, loss[loss=0.1528, simple_loss=0.2124, pruned_loss=0.04658, over 19538.00 frames. ], tot_loss[loss=0.1609, simple_loss=0.2403, pruned_loss=0.04073, over 4726081.92 frames. ], batch size: 388, lr: 2.99e-03, grad_scale: 16.0 2023-10-03 07:57:07,931 WARNING [train.py:1204] (2/4) Exclude cut with ID _4184_210_2550_1_1526730192013_2219380_102-250299-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 07:57:08,004 WARNING [train.py:1204] (2/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_253-308377-0_sp0.9 from training. Duration: 0.5444375 2023-10-03 07:57:08,729 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.2.encoder.layers.0.feed_forward3.out_whiten, num_groups=1, num_channels=384, metric=12.68 vs. limit=15.0 2023-10-03 07:57:09,907 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_271-297505-0 from training. Duration: 0.99 2023-10-03 07:57:10,009 WARNING [train.py:1204] (2/4) Exclude cut with ID _35941_210_15379_1_1532995294515_4023089_213-142973-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 07:57:12,676 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7544_210_6753_1_1528587722_3576686_162-63018-0_sp1.1 from training. Duration: 0.92725 2023-10-03 07:57:14,020 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_365-217119-0_sp0.9 from training. Duration: 0.7 2023-10-03 07:57:15,689 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.self_attn_weights.pos_emb_skip_rate, batch_count=1194000.0, ans=0.0 2023-10-03 07:57:17,243 WARNING [train.py:1204] (2/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_662-275621-0_sp0.9 from training. Duration: 0.4666875 2023-10-03 07:57:17,244 WARNING [train.py:1204] (2/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_374-187555-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 07:57:19,121 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_289-180904-0_sp1.1 from training. Duration: 0.6909375 2023-10-03 07:57:20,570 WARNING [train.py:1204] (2/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_189-47206-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 07:57:20,597 WARNING [train.py:1204] (2/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_234-14174-0_sp1.1 from training. Duration: 0.7454375 2023-10-03 07:57:21,855 WARNING [train.py:1204] (2/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_576-76621-0_sp1.1 from training. Duration: 0.963625 2023-10-03 07:57:23,223 WARNING [train.py:1204] (2/4) Exclude cut with ID _13253_210_1925_1_1530757775819_3577873_253-246386-0 from training. Duration: 0.9 2023-10-03 07:57:27,287 WARNING [train.py:1204] (2/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_372-6869-0_sp1.1 from training. Duration: 0.4 2023-10-03 07:57:27,343 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_21434_210_4238_1_1531916339_1222252_22-54954-0_sp1.1 from training. Duration: 0.936375 2023-10-03 07:57:30,055 WARNING [train.py:1204] (2/4) Exclude cut with ID _29853_210_7497_1_1532505600851_6642110_492-170361-0_sp1.1 from training. Duration: 0.92725 2023-10-03 07:57:30,279 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.feed_forward3.hidden_balancer.prob, batch_count=1194066.6666666667, ans=0.125 2023-10-03 07:57:32,905 WARNING [train.py:1204] (2/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_505-97987-0_sp0.9 from training. Duration: 0.9555625 2023-10-03 07:57:34,318 WARNING [train.py:1204] (2/4) Exclude cut with ID _8365_210_2064_1_1528592311070_4677690_371-122295-0_sp0.9 from training. Duration: 0.611125 2023-10-03 07:57:35,785 WARNING [train.py:1204] (2/4) Exclude cut with ID _14268_210_9206_1_1530622771285_4714099_637-86630-0_sp1.1 from training. Duration: 0.463625 2023-10-03 07:57:35,809 WARNING [train.py:1204] (2/4) Exclude cut with ID _29857_210_20034_1_1532395886753_3798160_372-5678-0_sp1.1 from training. Duration: 0.963625 2023-10-03 07:57:38,452 WARNING [train.py:1204] (2/4) Exclude cut with ID _52892_210_6721_1_1534378941259_4124129_90-165108-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 07:57:38,650 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=1194133.3333333333, ans=0.1 2023-10-03 07:57:39,753 WARNING [train.py:1204] (2/4) Exclude cut with ID _27052_210_5986_1_1532329197187_3585009_143-339198-0_sp1.1 from training. Duration: 0.963625 2023-10-03 07:57:44,451 WARNING [train.py:1204] (2/4) Exclude cut with ID _14268_210_9206_1_1530622771285_4714099_637-86630-0_sp0.9 from training. Duration: 0.5666875 2023-10-03 07:57:44,454 WARNING [train.py:1204] (2/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_401-255491-0 from training. Duration: 0.7 2023-10-03 07:57:46,233 WARNING [train.py:1204] (2/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_178-6946-0_sp1.1 from training. Duration: 0.97275 2023-10-03 07:57:54,168 WARNING [train.py:1204] (2/4) Exclude cut with ID _27485_210_4916_1_1532739191677_3668269_460-163885-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 07:57:59,618 WARNING [train.py:1204] (2/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_270-26053-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 07:58:02,365 WARNING [train.py:1204] (2/4) Exclude cut with ID _14170_210_5219_1_1530525949868_4469650_412-317077-0 from training. Duration: 0.75 2023-10-03 07:58:03,781 WARNING [train.py:1204] (2/4) Exclude cut with ID _5289_210_2084_1_1527678202736_3779299_142-103317-0 from training. Duration: 0.84 2023-10-03 07:58:05,239 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_150-143438-0_sp1.1 from training. Duration: 0.92725 2023-10-03 07:58:06,642 WARNING [train.py:1204] (2/4) Exclude cut with ID _27894_210_12392_1_1532415496078_6902439_668-269779-0_sp1.1 from training. Duration: 0.97275 2023-10-03 07:58:06,709 WARNING [train.py:1204] (2/4) Exclude cut with ID _18513_210_9206_1_1531618215223_3862620_56-222075-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 07:58:08,096 WARNING [train.py:1204] (2/4) Exclude cut with ID _4092_210_2151_1_1526643044449_3135759_356-278694-0 from training. Duration: 0.47 2023-10-03 07:58:12,210 WARNING [train.py:1204] (2/4) Exclude cut with ID _25808_210_13292_1_1532343514562_7366109_633-47524-0 from training. Duration: 0.99 2023-10-03 07:58:12,221 WARNING [train.py:1204] (2/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_872-264525-0 from training. Duration: 0.65 2023-10-03 07:58:12,251 WARNING [train.py:1204] (2/4) Exclude cut with ID _14632_210_9988_1_1530691279392_3923200_239-232545-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 07:58:13,672 WARNING [train.py:1204] (2/4) Exclude cut with ID _14349_210_8341_1_1530615710818_3442470_153-27432-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 07:58:17,318 WARNING [train.py:1204] (2/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_37-41919-0_sp0.9 from training. Duration: 0.9666875 2023-10-03 07:58:17,575 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.bypass.skip_rate, batch_count=1194333.3333333333, ans=0.07 2023-10-03 07:58:19,192 INFO [train.py:1046] (2/4) Epoch 34, batch 3850, loss[loss=0.1783, simple_loss=0.2657, pruned_loss=0.04547, over 24393.00 frames. ], tot_loss[loss=0.1597, simple_loss=0.2393, pruned_loss=0.04008, over 4738424.32 frames. ], batch size: 77, lr: 2.99e-03, grad_scale: 16.0 2023-10-03 07:58:19,264 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_196-289637-0_sp0.9 from training. Duration: 0.8333125 2023-10-03 07:58:24,452 WARNING [train.py:1204] (2/4) Exclude cut with ID _13678_210_7497_1_1530530069226_3463170_371-3221-0_sp1.1 from training. Duration: 0.8181875 2023-10-03 07:58:25,786 WARNING [train.py:1204] (2/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_557-324258-0 from training. Duration: 0.81 2023-10-03 07:58:27,113 WARNING [train.py:1204] (2/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_1012-279473-0_sp1.1 from training. Duration: 0.6090625 2023-10-03 07:58:27,219 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_66916_210_14656_1_1535725525_326566_7-92649-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 07:58:27,403 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.ff3_skip_rate, batch_count=1194333.3333333333, ans=0.0 2023-10-03 07:58:28,732 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.self_attn_weights.pos_emb_skip_rate, batch_count=1194333.3333333333, ans=0.0 2023-10-03 07:58:31,035 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.0.layers.1.whiten, num_groups=1, num_channels=192, metric=3.74 vs. limit=12.0 2023-10-03 07:58:31,545 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_9460_210_4211_1_1528954587_10049667_480-260269-0_sp1.1 from training. Duration: 0.5090625 2023-10-03 07:58:32,957 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_40896_210_15710_1_1533433956_7100802_121-155001-0_sp1.1 from training. Duration: 0.92725 2023-10-03 07:58:35,662 WARNING [train.py:1204] (2/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_188-171673-0_sp1.1 from training. Duration: 0.52725 2023-10-03 07:58:36,895 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.629e+02 1.877e+02 2.078e+02 2.275e+02 4.210e+02, threshold=4.155e+02, percent-clipped=0.0 2023-10-03 07:58:36,983 WARNING [train.py:1204] (2/4) Exclude cut with ID _47790_210_16187_1_1533862814066_4207519_2-39653-0 from training. Duration: 0.98 2023-10-03 07:58:37,154 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.self_attn_weights.pos_emb_skip_rate, batch_count=1194400.0, ans=0.0 2023-10-03 07:58:43,801 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_23850_210_4238_1_1532232597_1632433_112-229012-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 07:58:45,735 WARNING [train.py:1204] (2/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_626-79528-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 07:58:48,555 WARNING [train.py:1204] (2/4) Exclude cut with ID _30304_210_6319_1_1532476827261_7582060_856-90052-0_sp1.1 from training. Duration: 0.963625 2023-10-03 07:58:48,593 WARNING [train.py:1204] (2/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_122-109023-0_sp0.9 from training. Duration: 0.8444375 2023-10-03 07:58:51,711 WARNING [train.py:1204] (2/4) Exclude cut with ID _64414_210_17301_1_1535243252048_3757759_37-70328-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 07:58:51,769 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_20120_210_6753_1_1531643035_5482859_622-344907-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 07:58:53,734 WARNING [train.py:1204] (2/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_410-321956-0_sp1.1 from training. Duration: 0.92725 2023-10-03 07:58:53,753 WARNING [train.py:1204] (2/4) Exclude cut with ID _7562_210_2084_1_1528887767916_3946229_186-326422-0_sp0.9 from training. Duration: 0.9333125 2023-10-03 07:58:53,910 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.conv_module1.balancer2.prob, batch_count=1194466.6666666667, ans=0.125 2023-10-03 07:58:55,039 WARNING [train.py:1204] (2/4) Exclude cut with ID _37537_210_2253_1_1533171758568_3421949_127-285192-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 07:58:57,026 WARNING [train.py:1204] (2/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_723-228168-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 07:58:58,389 WARNING [train.py:1204] (2/4) Exclude cut with ID _13678_210_7497_1_1530530069226_3463170_426-246984-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 07:58:58,404 WARNING [train.py:1204] (2/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_399-156791-0_sp0.9 from training. Duration: 0.92225 2023-10-03 07:58:59,825 WARNING [train.py:1204] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_391-22148-0 from training. Duration: 0.77 2023-10-03 07:58:59,856 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_3475_210_2028_1_1526193578_3622569_64-297072-0 from training. Duration: 0.91 2023-10-03 07:59:01,164 WARNING [train.py:1204] (2/4) Exclude cut with ID _17875_210_2087_1_1531224012907_1821339_225-280015-0_sp1.1 from training. Duration: 0.963625 2023-10-03 07:59:01,182 WARNING [train.py:1204] (2/4) Exclude cut with ID _50655_210_18628_1_1534229942193_3772154_325-246169-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 07:59:04,061 WARNING [train.py:1204] (2/4) Exclude cut with ID _29529_210_7234_1_1532393961455_3977460_597-257348-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 07:59:04,106 WARNING [train.py:1204] (2/4) Exclude cut with ID _58895_210_8750_1_1535184081320_3649089_77-173485-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 07:59:04,135 WARNING [train.py:1204] (2/4) Exclude cut with ID _6270_210_2084_1_1527850911573_3856420_26-277343-0 from training. Duration: 0.82 2023-10-03 07:59:06,855 WARNING [train.py:1204] (2/4) Exclude cut with ID _14368_210_4220_1_1530532714870_3595529_144-3287-0 from training. Duration: 0.67 2023-10-03 07:59:08,208 WARNING [train.py:1204] (2/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_293-145278-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 07:59:10,979 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_40047_210_2290_1_1533290519_2374311_257-335162-0 from training. Duration: 0.95 2023-10-03 07:59:12,392 WARNING [train.py:1204] (2/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_619-344499-0_sp0.9 from training. Duration: 0.7 2023-10-03 07:59:17,192 WARNING [train.py:1204] (2/4) Exclude cut with ID _19504_210_9994_1_1531648863242_3325220_456-315379-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 07:59:18,337 WARNING [train.py:1204] (2/4) Exclude cut with ID _55857_210_2346_1_1534644007831_6898209_631-221692-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 07:59:21,740 WARNING [train.py:1204] (2/4) Exclude cut with ID _64631_210_27767_1_1535349580068_4165160_324-3682-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 07:59:21,760 WARNING [train.py:1204] (2/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_317-211107-0 from training. Duration: 0.92 2023-10-03 07:59:21,945 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.bypass.scale_min, batch_count=1194600.0, ans=0.2 2023-10-03 07:59:26,233 WARNING [train.py:1204] (2/4) Exclude cut with ID _6789_210_1534_1_1527857937218_3118950_311-61262-0 from training. Duration: 0.65 2023-10-03 07:59:27,725 WARNING [train.py:1204] (2/4) Exclude cut with ID _28323_210_16187_1_1532480395667_4100300_365-68882-0_sp1.1 from training. Duration: 0.92725 2023-10-03 07:59:27,763 WARNING [train.py:1204] (2/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_816-181825-0_sp1.1 from training. Duration: 0.92725 2023-10-03 07:59:29,762 WARNING [train.py:1204] (2/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_132-269633-0_sp1.1 from training. Duration: 0.6181875 2023-10-03 07:59:29,780 WARNING [train.py:1204] (2/4) Exclude cut with ID _44585_210_18628_1_1533605350387_4518199_280-73534-0_sp1.1 from training. Duration: 0.8090625 2023-10-03 07:59:30,397 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.4.encoder.layers.0.feed_forward2.out_whiten, num_groups=1, num_channels=384, metric=6.02 vs. limit=15.0 2023-10-03 07:59:31,232 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_68095_210_20185_1_1535718226_8888855_1009-164755-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 07:59:31,300 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_40223_210_6228_1_1533298404_4812267_400-103813-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 07:59:31,301 WARNING [train.py:1204] (2/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_491-261923-0_sp1.1 from training. Duration: 0.8909375 2023-10-03 07:59:32,559 WARNING [train.py:1204] (2/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_712-342563-0 from training. Duration: 0.97 2023-10-03 07:59:32,646 WARNING [train.py:1204] (2/4) Exclude cut with ID _41161_210_20034_1_1533431249657_3555314_175-190032-0_sp1.1 from training. Duration: 0.963625 2023-10-03 07:59:34,194 INFO [train.py:1046] (2/4) Epoch 34, batch 3900, loss[loss=0.1584, simple_loss=0.2514, pruned_loss=0.03276, over 24305.00 frames. ], tot_loss[loss=0.1588, simple_loss=0.2383, pruned_loss=0.03966, over 4724623.51 frames. ], batch size: 74, lr: 2.99e-03, grad_scale: 16.0 2023-10-03 07:59:34,334 WARNING [train.py:1204] (2/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_788-298907-0 from training. Duration: 0.89 2023-10-03 07:59:35,630 WARNING [train.py:1204] (2/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_225-70961-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 07:59:35,636 WARNING [train.py:1204] (2/4) Exclude cut with ID _58583_210_5236_1_1534726835266_3574249_235-175994-0_sp1.1 from training. Duration: 0.92725 2023-10-03 07:59:37,036 WARNING [train.py:1204] (2/4) Exclude cut with ID _12302_210_5219_1_1529924446466_8107280_907-182949-0_sp1.1 from training. Duration: 0.836375 2023-10-03 07:59:37,082 WARNING [train.py:1204] (2/4) Exclude cut with ID _54871_210_22356_1_1534665798253_3856359_94-193692-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 07:59:39,688 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_4528_210_3273_1_1527076654_3276777_342-168068-0_sp0.9 from training. Duration: 0.9666875 2023-10-03 07:59:39,738 WARNING [train.py:1204] (2/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_368-74239-0_sp1.1 from training. Duration: 0.92725 2023-10-03 07:59:39,739 WARNING [train.py:1204] (2/4) Exclude cut with ID _44831_210_13427_1_1533816011549_3689035_291-295928-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 07:59:41,160 WARNING [train.py:1204] (2/4) Exclude cut with ID _56406_210_9288_1_1534899618664_3496149_455-131997-0_sp1.1 from training. Duration: 0.97275 2023-10-03 07:59:41,168 WARNING [train.py:1204] (2/4) Exclude cut with ID _6473_210_1741_1_1527901307396_3815380_108-17335-0 from training. Duration: 0.65 2023-10-03 07:59:41,225 WARNING [train.py:1204] (2/4) Exclude cut with ID _14224_210_4381_1_1530529062037_4264010_286-135600-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 07:59:45,209 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_14056_210_4163_1_1530604483_7572827_928-321675-0_sp1.1 from training. Duration: 0.936375 2023-10-03 07:59:45,290 WARNING [train.py:1204] (2/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_81-300212-0_sp1.1 from training. Duration: 0.5454375 2023-10-03 07:59:47,238 WARNING [train.py:1204] (2/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_29-280622-0_sp0.9 from training. Duration: 0.911125 2023-10-03 07:59:48,583 WARNING [train.py:1204] (2/4) Exclude cut with ID _61079_210_4525_1_1534921008198_4011419_228-176447-0_sp1.1 from training. Duration: 0.936375 2023-10-03 07:59:50,023 WARNING [train.py:1204] (2/4) Exclude cut with ID _8394_210_6702_1_1528590504316_3540189_372-198878-0_sp1.1 from training. Duration: 0.5454375 2023-10-03 07:59:51,643 WARNING [train.py:1204] (2/4) Exclude cut with ID _64521_210_14699_1_1535243427259_3815328_244-208725-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 07:59:53,051 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8275_210_4198_1_1528455320_3892571_491-17707-0_sp0.9 from training. Duration: 0.711125 2023-10-03 07:59:54,488 WARNING [train.py:1204] (2/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_375-245677-0 from training. Duration: 0.81 2023-10-03 07:59:54,502 WARNING [train.py:1204] (2/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_690-286055-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 07:59:56,383 WARNING [train.py:1204] (2/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_89-321955-0 from training. Duration: 0.8 2023-10-03 07:59:56,419 WARNING [train.py:1204] (2/4) Exclude cut with ID _18895_210_6228_1_1531357274234_3784930_213-98721-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 07:59:56,721 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.conv_module1.balancer1.prob, batch_count=1194733.3333333333, ans=0.125 2023-10-03 07:59:57,788 WARNING [train.py:1204] (2/4) Exclude cut with ID _19708_210_9206_1_1532136650733_4238364_347-65240-0 from training. Duration: 0.97 2023-10-03 07:59:59,606 WARNING [train.py:1204] (2/4) Exclude cut with ID _64135_210_2253_1_1535677267876_7608972_498-5039-0 from training. Duration: 0.78 2023-10-03 08:00:03,580 WARNING [train.py:1204] (2/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_146-276944-0_sp1.1 from training. Duration: 0.7545625 2023-10-03 08:00:03,661 WARNING [train.py:1204] (2/4) Exclude cut with ID _63171_210_15488_1_1535104792794_3295110_155-34488-0_sp1.1 from training. Duration: 0.97275 2023-10-03 08:00:03,673 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_5438_210_1794_1_1527336687_2600009_233-58072-0_sp0.9 from training. Duration: 0.8555625 2023-10-03 08:00:03,725 WARNING [train.py:1204] (2/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_63-101996-0_sp0.9 from training. Duration: 0.97775 2023-10-03 08:00:04,339 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.3.encoder.layers.2.self_attn2.whiten, num_groups=1, num_channels=512, metric=11.45 vs. limit=22.5 2023-10-03 08:00:08,147 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.conv_skip_rate, batch_count=1194800.0, ans=0.0 2023-10-03 08:00:09,254 WARNING [train.py:1204] (2/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_271-269084-0_sp1.1 from training. Duration: 0.7545625 2023-10-03 08:00:10,751 WARNING [train.py:1204] (2/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_345-167986-0_sp1.1 from training. Duration: 0.763625 2023-10-03 08:00:13,340 WARNING [train.py:1204] (2/4) Exclude cut with ID _39290_210_4220_1_1533862778760_3075118_92-311600-0_sp1.1 from training. Duration: 0.8 2023-10-03 08:00:13,354 WARNING [train.py:1204] (2/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_209-65306-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 08:00:13,427 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_26433_210_12108_1_1532157158_4056480_317-335807-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 08:00:18,176 WARNING [train.py:1204] (2/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_238-94127-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 08:00:19,481 WARNING [train.py:1204] (2/4) Exclude cut with ID _41314_210_12651_1_1533459476543_3766460_207-132038-0_sp1.1 from training. Duration: 0.8545625 2023-10-03 08:00:19,735 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.feed_forward3.hidden_balancer.prob, batch_count=1194866.6666666667, ans=0.125 2023-10-03 08:00:27,500 WARNING [train.py:1204] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_167-137255-0_sp1.1 from training. Duration: 0.5818125 2023-10-03 08:00:28,899 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_2009_210_2071_1_1525481134_4588937_385-302100-0_sp0.9 from training. Duration: 0.9666875 2023-10-03 08:00:37,716 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_15517_210_6753_1_1530954008_3487336_253-110617-0_sp1.1 from training. Duration: 0.936375 2023-10-03 08:00:39,220 WARNING [train.py:1204] (2/4) Exclude cut with ID _39290_210_4220_1_1533862778760_3075118_92-311600-0_sp0.9 from training. Duration: 0.97775 2023-10-03 08:00:40,595 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_42-142473-0 from training. Duration: 0.72 2023-10-03 08:00:40,631 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_2296_210_2087_1_1525503448_3397335_57-185430-0 from training. Duration: 0.6 2023-10-03 08:00:40,642 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_11305_210_1794_1_1529750428_7178148_257-312870-0_sp0.9 from training. Duration: 0.97775 2023-10-03 08:00:42,000 WARNING [train.py:1204] (2/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_222-198127-0 from training. Duration: 0.84 2023-10-03 08:00:43,327 WARNING [train.py:1204] (2/4) Exclude cut with ID _65263_210_22356_1_1535763598691_3921037_423-202699-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 08:00:43,390 WARNING [train.py:1204] (2/4) Exclude cut with ID _15037_210_2253_1_1530752294104_3267650_13-130361-0 from training. Duration: 0.99 2023-10-03 08:00:47,990 INFO [train.py:1046] (2/4) Epoch 34, batch 3950, loss[loss=0.1606, simple_loss=0.238, pruned_loss=0.04155, over 23329.00 frames. ], tot_loss[loss=0.1586, simple_loss=0.2373, pruned_loss=0.03994, over 4713573.83 frames. ], batch size: 119, lr: 2.99e-03, grad_scale: 16.0 2023-10-03 08:00:50,910 WARNING [train.py:1204] (2/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_628-198080-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 08:00:52,784 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_3171_210_2087_1_1526109084_2330007_37-64334-0 from training. Duration: 0.82 2023-10-03 08:00:52,835 WARNING [train.py:1204] (2/4) Exclude cut with ID _5287_210_2084_1_1527418883040_3878670_12-221186-0_sp1.1 from training. Duration: 0.8909375 2023-10-03 08:00:54,782 WARNING [train.py:1204] (2/4) Exclude cut with ID _6653_210_2629_1_1527919172441_6732679_1103-56445-0_sp1.1 from training. Duration: 0.663625 2023-10-03 08:00:57,512 WARNING [train.py:1204] (2/4) Exclude cut with ID _65263_210_22356_1_1535763598691_3921037_169-140068-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 08:00:58,020 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.5.encoder.layers.0.feed_forward2.out_whiten, num_groups=1, num_channels=256, metric=7.19 vs. limit=15.0 2023-10-03 08:01:02,408 WARNING [train.py:1204] (2/4) Exclude cut with ID IC0671W0118-331961 from training. Duration: 0.896 2023-10-03 08:01:03,759 WARNING [train.py:1204] (2/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_402-102311-0_sp1.1 from training. Duration: 0.6090625 2023-10-03 08:01:03,786 WARNING [train.py:1204] (2/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_202-293523-0 from training. Duration: 0.72 2023-10-03 08:01:05,109 WARNING [train.py:1204] (2/4) Exclude cut with ID IC0679W0038-335846 from training. Duration: 0.768 2023-10-03 08:01:05,141 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_37357_210_17024_1_1533089504_2316121_79-198866-0_sp1.1 from training. Duration: 0.936375 2023-10-03 08:01:06,448 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.524e+02 1.876e+02 2.006e+02 2.250e+02 3.004e+02, threshold=4.013e+02, percent-clipped=0.0 2023-10-03 08:01:06,569 WARNING [train.py:1204] (2/4) Exclude cut with ID _7606_210_4915_1_1528894253277_5080690_292-271285-0_sp1.1 from training. Duration: 0.963625 2023-10-03 08:01:06,586 WARNING [train.py:1204] (2/4) Exclude cut with ID _3541_210_2477_1_1526475408709_3263973_377-296839-0_sp0.9 from training. Duration: 0.611125 2023-10-03 08:01:06,589 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_45886_210_17024_1_1533704917_2435333_58-131069-0_sp1.1 from training. Duration: 0.963625 2023-10-03 08:01:09,306 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6961_210_2087_1_1527937168_6815011_395-185060-0 from training. Duration: 0.95 2023-10-03 08:01:10,741 WARNING [train.py:1204] (2/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_304-266479-0_sp1.1 from training. Duration: 0.97275 2023-10-03 08:01:10,793 WARNING [train.py:1204] (2/4) Exclude cut with ID _3867_210_1883_1_1526641200575_4475830_319-349850-0_sp1.1 from training. Duration: 0.6090625 2023-10-03 08:01:10,801 WARNING [train.py:1204] (2/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_304-59227-0_sp0.9 from training. Duration: 0.8555625 2023-10-03 08:01:12,089 WARNING [train.py:1204] (2/4) Exclude cut with ID _8021_210_7994_1_1528376451358_3653069_100-140231-0_sp0.9 from training. Duration: 0.7444375 2023-10-03 08:01:12,143 WARNING [train.py:1204] (2/4) Exclude cut with ID _8833_210_5219_1_1528592665210_7553059_329-125156-0_sp0.9 from training. Duration: 0.911125 2023-10-03 08:01:12,364 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=1195066.6666666667, ans=0.1 2023-10-03 08:01:24,669 WARNING [train.py:1204] (2/4) Exclude cut with ID _14459_210_2228_1_1531443641946_7516700_792-287234-0_sp1.1 from training. Duration: 0.92725 2023-10-03 08:01:24,684 WARNING [train.py:1204] (2/4) Exclude cut with ID _19708_210_9206_1_1532136650733_4238364_347-65240-0_sp1.1 from training. Duration: 0.8818125 2023-10-03 08:01:30,231 WARNING [train.py:1204] (2/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_70-27732-0 from training. Duration: 0.94 2023-10-03 08:01:36,428 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_397-132565-0 from training. Duration: 0.99 2023-10-03 08:01:36,431 WARNING [train.py:1204] (2/4) Exclude cut with ID _49728_210_12651_1_1534041034294_6788150_624-195301-0 from training. Duration: 0.85 2023-10-03 08:01:36,478 WARNING [train.py:1204] (2/4) Exclude cut with ID _55429_210_10626_1_1534467588527_7174143_377-169391-0_sp1.1 from training. Duration: 0.87275 2023-10-03 08:01:37,764 WARNING [train.py:1204] (2/4) Exclude cut with ID _49376_210_5986_1_1533985278732_3370740_354-344695-0_sp1.1 from training. Duration: 0.9 2023-10-03 08:01:45,958 WARNING [train.py:1204] (2/4) Exclude cut with ID _4697_210_2069_1_1526815801953_3823098_276-96593-0_sp1.1 from training. Duration: 0.7 2023-10-03 08:01:45,966 WARNING [train.py:1204] (2/4) Exclude cut with ID _15012_210_1968_1_1530856820429_5095350_688-178849-0_sp0.9 from training. Duration: 0.711125 2023-10-03 08:01:46,006 WARNING [train.py:1204] (2/4) Exclude cut with ID _25565_210_12392_1_1532423009382_3952098_372-69023-0_sp1.1 from training. Duration: 0.936375 2023-10-03 08:01:47,831 WARNING [train.py:1204] (2/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_61-327062-0_sp1.1 from training. Duration: 0.736375 2023-10-03 08:01:47,860 WARNING [train.py:1204] (2/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_215-141059-0 from training. Duration: 0.62 2023-10-03 08:01:52,099 WARNING [train.py:1204] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_6-226750-0_sp1.1 from training. Duration: 0.8545625 2023-10-03 08:01:53,481 WARNING [train.py:1204] (2/4) Exclude cut with ID _14348_210_8341_1_1530599434996_3778640_240-232370-0_sp1.1 from training. Duration: 0.763625 2023-10-03 08:01:57,235 WARNING [train.py:1204] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_468-189058-0 from training. Duration: 0.96 2023-10-03 08:02:02,687 INFO [train.py:1046] (2/4) Epoch 34, batch 4000, loss[loss=0.14, simple_loss=0.2203, pruned_loss=0.02987, over 24609.00 frames. ], tot_loss[loss=0.1593, simple_loss=0.2381, pruned_loss=0.04029, over 4720980.49 frames. ], batch size: 60, lr: 2.99e-03, grad_scale: 32.0 2023-10-03 08:02:06,075 WARNING [train.py:1204] (2/4) Exclude cut with ID _21987_210_5854_1_1531821500482_3637038_247-93340-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 08:02:12,969 WARNING [train.py:1204] (2/4) Exclude cut with ID _66868_210_9527_1_1535509287954_4561140_436-191602-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 08:02:17,202 WARNING [train.py:1204] (2/4) Exclude cut with ID _18895_210_6228_1_1531357274234_3784930_394-58203-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 08:02:19,184 WARNING [train.py:1204] (2/4) Exclude cut with ID _58605_210_12210_1_1534723195939_3904259_258-281498-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 08:02:19,228 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_33348_210_5632_1_1533086912_3324030_245-292752-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 08:02:20,515 WARNING [train.py:1204] (2/4) Exclude cut with ID _67831_210_12392_1_1535612464202_3092612_346-2148-0 from training. Duration: 0.93 2023-10-03 08:02:20,579 WARNING [train.py:1204] (2/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_369-222060-0_sp0.9 from training. Duration: 0.788875 2023-10-03 08:02:20,629 WARNING [train.py:1204] (2/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_245-174553-0 from training. Duration: 0.88 2023-10-03 08:02:20,635 WARNING [train.py:1204] (2/4) Exclude cut with ID _3541_210_2477_1_1526475408709_3263973_231-104736-0_sp1.1 from training. Duration: 0.7090625 2023-10-03 08:02:20,649 WARNING [train.py:1204] (2/4) Exclude cut with ID _44585_210_18628_1_1533605350387_4518199_280-73534-0 from training. Duration: 0.89 2023-10-03 08:02:23,352 WARNING [train.py:1204] (2/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_22-201476-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 08:02:26,646 WARNING [train.py:1204] (2/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_325-62802-0_sp0.9 from training. Duration: 0.9666875 2023-10-03 08:02:26,659 WARNING [train.py:1204] (2/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_199-274062-0_sp1.1 from training. Duration: 0.97275 2023-10-03 08:02:26,662 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_8-319957-0_sp1.1 from training. Duration: 0.87275 2023-10-03 08:02:26,703 WARNING [train.py:1204] (2/4) Exclude cut with ID _29343_210_4381_1_1532602757752_3674109_224-349726-0_sp1.1 from training. Duration: 0.936375 2023-10-03 08:02:26,708 WARNING [train.py:1204] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_520-126729-0_sp1.1 from training. Duration: 0.636375 2023-10-03 08:02:28,177 WARNING [train.py:1204] (2/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_239-22076-0_sp1.1 from training. Duration: 0.77275 2023-10-03 08:02:29,457 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9012W0011-501146 from training. Duration: 0.896 2023-10-03 08:02:29,532 WARNING [train.py:1204] (2/4) Exclude cut with ID _29857_210_20034_1_1532395886753_3798160_279-185801-0_sp1.1 from training. Duration: 0.82725 2023-10-03 08:02:29,574 WARNING [train.py:1204] (2/4) Exclude cut with ID _43552_210_8913_1_1533726031059_3523429_261-223933-0_sp1.1 from training. Duration: 0.92725 2023-10-03 08:02:32,784 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9029W0025-509426 from training. Duration: 0.896 2023-10-03 08:02:34,232 WARNING [train.py:1204] (2/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_337-181502-0_sp1.1 from training. Duration: 0.5909375 2023-10-03 08:02:34,250 WARNING [train.py:1204] (2/4) Exclude cut with ID _68635_210_9317_1_1535770813410_4315870_159-332599-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 08:02:36,858 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.conv_skip_rate, batch_count=1195466.6666666667, ans=0.0 2023-10-03 08:02:40,745 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_161-276996-0 from training. Duration: 0.95 2023-10-03 08:02:40,793 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7088_210_1874_1_1527939776_1101113_127-68870-0_sp1.1 from training. Duration: 0.936375 2023-10-03 08:02:41,416 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.3.encoder.layers.3.feed_forward1.out_whiten, num_groups=1, num_channels=512, metric=7.52 vs. limit=15.0 2023-10-03 08:02:42,165 WARNING [train.py:1204] (2/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_322-217220-0_sp1.1 from training. Duration: 0.7818125 2023-10-03 08:02:43,482 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9072W0002-530636 from training. Duration: 0.768 2023-10-03 08:02:43,583 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_340-336408-0_sp1.1 from training. Duration: 0.8454375 2023-10-03 08:02:44,901 WARNING [train.py:1204] (2/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_395-37222-0 from training. Duration: 0.76 2023-10-03 08:02:44,904 WARNING [train.py:1204] (2/4) Exclude cut with ID _64099_210_17301_1_1535526977656_1578188_159-209156-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 08:02:44,993 WARNING [train.py:1204] (2/4) Exclude cut with ID _53950_210_5816_1_1534417264919_3862951_28-29681-0_sp1.1 from training. Duration: 0.92725 2023-10-03 08:02:46,371 WARNING [train.py:1204] (2/4) Exclude cut with ID _4681_210_3525_1_1527250689464_120889_15-126760-0_sp1.1 from training. Duration: 0.736375 2023-10-03 08:02:48,446 WARNING [train.py:1204] (2/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_242-3472-0_sp1.1 from training. Duration: 0.663625 2023-10-03 08:02:48,485 WARNING [train.py:1204] (2/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_436-186311-0_sp0.9 from training. Duration: 0.72225 2023-10-03 08:02:49,721 WARNING [train.py:1204] (2/4) Exclude cut with ID _3675_210_4557_1_1526639934408_4182089_452-172147-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 08:02:51,016 WARNING [train.py:1204] (2/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_80-147301-0 from training. Duration: 0.84 2023-10-03 08:02:51,056 WARNING [train.py:1204] (2/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_351-107588-0_sp1.1 from training. Duration: 0.92725 2023-10-03 08:02:52,545 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9111W0002-550781 from training. Duration: 0.896 2023-10-03 08:02:59,019 WARNING [train.py:1204] (2/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_465-156500-0_sp1.1 from training. Duration: 0.6545625 2023-10-03 08:03:02,482 WARNING [train.py:1204] (2/4) Exclude cut with ID _13938_210_9988_1_1530518718978_3693960_304-112784-0_sp1.1 from training. Duration: 0.4181875 2023-10-03 08:03:03,934 WARNING [train.py:1204] (2/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_202-67017-0_sp1.1 from training. Duration: 0.7454375 2023-10-03 08:03:03,971 WARNING [train.py:1204] (2/4) Exclude cut with ID _58414_210_8254_1_1535421609162_7151182_448-4358-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 08:03:05,258 WARNING [train.py:1204] (2/4) Exclude cut with ID _66840_210_5219_1_1535452168426_4196349_203-243234-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 08:03:06,627 WARNING [train.py:1204] (2/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_201-213513-0_sp1.1 from training. Duration: 0.97275 2023-10-03 08:03:12,138 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6291_210_3273_1_1527683649_1078498_117-187560-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 08:03:13,607 WARNING [train.py:1204] (2/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_135-174080-0_sp0.9 from training. Duration: 0.588875 2023-10-03 08:03:14,941 WARNING [train.py:1204] (2/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_45-265684-0 from training. Duration: 0.72 2023-10-03 08:03:16,217 INFO [train.py:1046] (2/4) Epoch 34, batch 4050, loss[loss=0.1437, simple_loss=0.2216, pruned_loss=0.03283, over 23394.00 frames. ], tot_loss[loss=0.1599, simple_loss=0.2385, pruned_loss=0.04067, over 4715700.75 frames. ], batch size: 119, lr: 2.99e-03, grad_scale: 32.0 2023-10-03 08:03:16,318 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_435-313305-0_sp1.1 from training. Duration: 0.6454375 2023-10-03 08:03:16,340 WARNING [train.py:1204] (2/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_235-307337-0_sp1.1 from training. Duration: 0.8545625 2023-10-03 08:03:18,279 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7762_210_1794_1_1528199508_4057408_419-125009-0_sp0.9 from training. Duration: 0.92225 2023-10-03 08:03:18,479 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.bypass.skip_rate, batch_count=1195666.6666666667, ans=0.09899494936611666 2023-10-03 08:03:19,633 WARNING [train.py:1204] (2/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_215-141059-0_sp1.1 from training. Duration: 0.563625 2023-10-03 08:03:20,997 WARNING [train.py:1204] (2/4) Exclude cut with ID _27013_210_1389_1_1532437173240_3252649_222-125573-0_sp1.1 from training. Duration: 0.8909375 2023-10-03 08:03:23,149 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.1.whiten.whitening_limit, batch_count=1195666.6666666667, ans=12.0 2023-10-03 08:03:25,071 WARNING [train.py:1204] (2/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_126-274694-0_sp1.1 from training. Duration: 0.8909375 2023-10-03 08:03:25,744 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.3.encoder.layers.0.conv_module1.whiten, num_groups=1, num_channels=512, metric=2.98 vs. limit=15.0 2023-10-03 08:03:29,542 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_2592_210_2290_1_1525697025_4882925_157-70512-0_sp1.1 from training. Duration: 0.82725 2023-10-03 08:03:29,600 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_292-265928-0_sp0.9 from training. Duration: 0.5666875 2023-10-03 08:03:31,509 WARNING [train.py:1204] (2/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_1211-226066-0_sp1.1 from training. Duration: 0.6909375 2023-10-03 08:03:32,888 WARNING [train.py:1204] (2/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_394-265747-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 08:03:34,567 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.556e+02 1.804e+02 1.973e+02 2.142e+02 3.125e+02, threshold=3.945e+02, percent-clipped=0.0 2023-10-03 08:03:37,310 WARNING [train.py:1204] (2/4) Exclude cut with ID _48782_210_12658_1_1534467701544_2300422_239-67139-0_sp1.1 from training. Duration: 0.97275 2023-10-03 08:03:40,042 WARNING [train.py:1204] (2/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_329-113173-0_sp1.1 from training. Duration: 0.563625 2023-10-03 08:03:41,491 WARNING [train.py:1204] (2/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_81-75849-0_sp0.9 from training. Duration: 0.4333125 2023-10-03 08:03:44,260 WARNING [train.py:1204] (2/4) Exclude cut with ID _9235_210_5219_1_1528891154253_8046120_1003-101638-0 from training. Duration: 0.73 2023-10-03 08:03:44,307 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9309W0189-640316 from training. Duration: 0.64 2023-10-03 08:03:45,803 WARNING [train.py:1204] (2/4) Exclude cut with ID _45329_210_7005_1_1534157951211_6774339_503-287883-0_sp1.1 from training. Duration: 0.8 2023-10-03 08:03:50,879 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.feed_forward3.hidden_balancer.prob, batch_count=1195800.0, ans=0.125 2023-10-03 08:03:51,814 WARNING [train.py:1204] (2/4) Exclude cut with ID _8833_210_5219_1_1528592665210_7553059_611-80313-0 from training. Duration: 0.64 2023-10-03 08:03:51,889 WARNING [train.py:1204] (2/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_335-106626-0_sp1.1 from training. Duration: 0.963625 2023-10-03 08:03:56,027 WARNING [train.py:1204] (2/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_237-44332-0_sp1.1 from training. Duration: 0.8545625 2023-10-03 08:03:58,726 WARNING [train.py:1204] (2/4) Exclude cut with ID _46952_210_5986_1_1534230010357_3107379_178-147341-0_sp1.1 from training. Duration: 0.97275 2023-10-03 08:03:58,765 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_13091_210_7925_1_1530609447_8987354_946-154426-0_sp1.1 from training. Duration: 0.936375 2023-10-03 08:03:58,780 WARNING [train.py:1204] (2/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_326-17061-0_sp1.1 from training. Duration: 0.8545625 2023-10-03 08:04:03,904 WARNING [train.py:1204] (2/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_38-169920-0_sp1.1 from training. Duration: 0.82725 2023-10-03 08:04:08,611 WARNING [train.py:1204] (2/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_283-220372-0 from training. Duration: 0.56 2023-10-03 08:04:08,626 WARNING [train.py:1204] (2/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_252-216674-0_sp1.1 from training. Duration: 0.4909375 2023-10-03 08:04:10,056 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_202-91290-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 08:04:12,757 WARNING [train.py:1204] (2/4) Exclude cut with ID _41314_210_12651_1_1533459476543_3766460_80-74184-0 from training. Duration: 0.83 2023-10-03 08:04:15,647 WARNING [train.py:1204] (2/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_411-271358-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 08:04:21,856 WARNING [train.py:1204] (2/4) Exclude cut with ID _68777_210_4353_1_1535810390731_3117904_129-166012-0 from training. Duration: 0.85 2023-10-03 08:04:23,001 WARNING [train.py:1204] (2/4) Exclude cut with ID _44982_210_1511_1_1534312820108_3726029_410-109759-0_sp1.1 from training. Duration: 0.963625 2023-10-03 08:04:23,005 WARNING [train.py:1204] (2/4) Exclude cut with ID _14224_210_4381_1_1530529062037_4264010_364-22817-0_sp1.1 from training. Duration: 0.6454375 2023-10-03 08:04:24,421 WARNING [train.py:1204] (2/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_89-242335-0 from training. Duration: 0.95 2023-10-03 08:04:24,430 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_12868_210_6753_1_1530702479_3610564_151-325626-0 from training. Duration: 0.97 2023-10-03 08:04:24,431 WARNING [train.py:1204] (2/4) Exclude cut with ID _6275_210_2084_1_1528110062213_7669049_786-264332-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 08:04:27,132 WARNING [train.py:1204] (2/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_35-153886-0_sp1.1 from training. Duration: 0.763625 2023-10-03 08:04:27,232 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_24007_210_1794_1_1531918925_1663875_116-100916-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 08:04:27,245 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_162-262717-0_sp1.1 from training. Duration: 0.7545625 2023-10-03 08:04:30,028 INFO [train.py:1046] (2/4) Epoch 34, batch 4100, loss[loss=0.1606, simple_loss=0.2488, pruned_loss=0.03617, over 24656.00 frames. ], tot_loss[loss=0.161, simple_loss=0.2397, pruned_loss=0.04116, over 4716844.40 frames. ], batch size: 68, lr: 2.99e-03, grad_scale: 32.0 2023-10-03 08:04:35,320 WARNING [train.py:1204] (2/4) Exclude cut with ID _8263_210_1664_1_1528438953494_294100_40-204731-0 from training. Duration: 0.5 2023-10-03 08:04:37,979 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_12868_210_6753_1_1530702479_3610564_416-293746-0 from training. Duration: 0.9 2023-10-03 08:04:39,381 WARNING [train.py:1204] (2/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_605-219732-0 from training. Duration: 0.94 2023-10-03 08:04:39,471 WARNING [train.py:1204] (2/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_454-123002-0 from training. Duration: 0.69 2023-10-03 08:04:39,476 WARNING [train.py:1204] (2/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_25-304273-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 08:04:40,861 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6860_210_4852_1_1527917457_3833327_451-39702-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 08:04:40,887 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_304-16505-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 08:04:40,901 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6291_210_3273_1_1527683649_1078498_4-266348-0_sp1.1 from training. Duration: 0.7090625 2023-10-03 08:04:40,989 WARNING [train.py:1204] (2/4) Exclude cut with ID ID0149W0001-753701 from training. Duration: 0.989 2023-10-03 08:04:42,480 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.conv_module2.balancer1.max_abs, batch_count=1196000.0, ans=10.0 2023-10-03 08:04:43,887 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.balancer2.prob, batch_count=1196066.6666666667, ans=0.125 2023-10-03 08:04:45,070 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_41251_210_15368_1_1533429767_4820695_394-55393-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 08:04:46,378 WARNING [train.py:1204] (2/4) Exclude cut with ID _8793_210_6310_1_1528542985139_2310009_121-179782-0_sp1.1 from training. Duration: 0.72725 2023-10-03 08:04:46,410 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_2729_210_3528_1_1525863505_3882214_421-73047-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 08:04:47,816 WARNING [train.py:1204] (2/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_278-154492-0_sp0.9 from training. Duration: 0.8444375 2023-10-03 08:04:52,561 WARNING [train.py:1204] (2/4) Exclude cut with ID _14170_210_5219_1_1530525949868_4469650_412-317077-0_sp0.9 from training. Duration: 0.8333125 2023-10-03 08:04:52,635 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_727-138317-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 08:04:52,676 WARNING [train.py:1204] (2/4) Exclude cut with ID _25808_210_13292_1_1532343514562_7366109_345-335048-0_sp1.1 from training. Duration: 0.92725 2023-10-03 08:04:52,695 WARNING [train.py:1204] (2/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_506-23200-0 from training. Duration: 0.97 2023-10-03 08:04:54,020 WARNING [train.py:1204] (2/4) Exclude cut with ID _44718_210_5816_1_1534050051869_6469030_643-245187-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 08:04:54,024 WARNING [train.py:1204] (2/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_42-186162-0_sp1.1 from training. Duration: 0.836375 2023-10-03 08:04:54,044 WARNING [train.py:1204] (2/4) Exclude cut with ID _3231_210_2106_1_1526205864934_6784220_171-256004-0_sp1.1 from training. Duration: 0.7818125 2023-10-03 08:04:54,070 WARNING [train.py:1204] (2/4) Exclude cut with ID _25914_210_6400_1_1532141317058_644740_43-248478-0_sp1.1 from training. Duration: 0.936375 2023-10-03 08:04:55,477 WARNING [train.py:1204] (2/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_577-287150-0 from training. Duration: 0.82 2023-10-03 08:04:58,402 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_46015_210_7234_1_1533863319_3087837_376-276068-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 08:04:59,753 WARNING [train.py:1204] (2/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_51-200220-0 from training. Duration: 0.56 2023-10-03 08:05:02,356 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_452-96101-0_sp1.1 from training. Duration: 0.963625 2023-10-03 08:05:05,599 WARNING [train.py:1204] (2/4) Exclude cut with ID _13252_210_1925_1_1530671372047_3336909_263-168388-0_sp1.1 from training. Duration: 0.7818125 2023-10-03 08:05:05,600 WARNING [train.py:1204] (2/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_469-94673-0 from training. Duration: 0.79 2023-10-03 08:05:06,928 WARNING [train.py:1204] (2/4) Exclude cut with ID _14922_210_6228_1_1530707435273_3693310_303-228738-0_sp1.1 from training. Duration: 0.763625 2023-10-03 08:05:06,971 WARNING [train.py:1204] (2/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_262-250826-0_sp1.1 from training. Duration: 0.9 2023-10-03 08:05:06,998 WARNING [train.py:1204] (2/4) Exclude cut with ID _68777_210_4353_1_1535810390731_3117904_129-166012-0_sp1.1 from training. Duration: 0.77275 2023-10-03 08:05:10,424 WARNING [train.py:1204] (2/4) Exclude cut with ID _58895_210_8750_1_1535184081320_3649089_265-323810-0 from training. Duration: 0.96 2023-10-03 08:05:10,508 WARNING [train.py:1204] (2/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_120-76843-0_sp0.9 from training. Duration: 0.611125 2023-10-03 08:05:10,777 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.conv_skip_rate, batch_count=1196133.3333333333, ans=0.0 2023-10-03 08:05:11,834 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7544_210_6753_1_1528587722_3576686_295-258479-0_sp0.9 from training. Duration: 0.9333125 2023-10-03 08:05:13,274 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7046_210_6753_1_1527895337_6920917_783-278109-0 from training. Duration: 0.77 2023-10-03 08:05:14,581 WARNING [train.py:1204] (2/4) Exclude cut with ID _42326_210_7770_1_1533814282531_3707269_113-308323-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 08:05:14,628 WARNING [train.py:1204] (2/4) Exclude cut with ID _7852_210_5219_1_1528286471316_8139400_242-105336-0_sp0.9 from training. Duration: 0.8 2023-10-03 08:05:17,329 WARNING [train.py:1204] (2/4) Exclude cut with ID _56235_210_10640_1_1534557335235_4161099_114-277905-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 08:05:18,035 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.2.encoder.layers.1.whiten, num_groups=1, num_channels=384, metric=4.83 vs. limit=12.0 2023-10-03 08:05:22,005 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.balancer2.prob, batch_count=1196200.0, ans=0.125 2023-10-03 08:05:23,144 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_13118_210_7925_1_1530755612_7608297_42-66683-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 08:05:25,960 WARNING [train.py:1204] (2/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_901-79438-0_sp1.1 from training. Duration: 0.8545625 2023-10-03 08:05:26,007 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_30280_210_1881_1_1532872779_3660139_211-182194-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 08:05:31,737 WARNING [train.py:1204] (2/4) Exclude cut with ID _14898_210_9206_1_1530766812377_4177190_621-45732-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 08:05:31,746 WARNING [train.py:1204] (2/4) Exclude cut with ID _35350_210_16040_1_1533040191183_4075299_224-133775-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 08:05:36,840 WARNING [train.py:1204] (2/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_147-293311-0_sp1.1 from training. Duration: 0.8545625 2023-10-03 08:05:39,488 WARNING [train.py:1204] (2/4) Exclude cut with ID _12302_210_5219_1_1529924446466_8107280_835-279754-0_sp1.1 from training. Duration: 0.8181875 2023-10-03 08:05:43,634 INFO [train.py:1046] (2/4) Epoch 34, batch 4150, loss[loss=0.1618, simple_loss=0.2484, pruned_loss=0.03762, over 24457.00 frames. ], tot_loss[loss=0.1611, simple_loss=0.2397, pruned_loss=0.04127, over 4726644.50 frames. ], batch size: 63, lr: 2.99e-03, grad_scale: 16.0 2023-10-03 08:05:43,679 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8453_210_4211_1_1528502940_8562730_347-330673-0_sp0.9 from training. Duration: 0.8 2023-10-03 08:05:43,758 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_213-103282-0_sp0.9 from training. Duration: 0.8666875 2023-10-03 08:05:45,125 WARNING [train.py:1204] (2/4) Exclude cut with ID _49728_210_12651_1_1534041034294_6788150_66-43496-0_sp1.1 from training. Duration: 0.97275 2023-10-03 08:05:45,127 WARNING [train.py:1204] (2/4) Exclude cut with ID _19023_210_4353_1_1531378892552_5820780_28-36615-0_sp1.1 from training. Duration: 0.8818125 2023-10-03 08:05:47,926 WARNING [train.py:1204] (2/4) Exclude cut with ID _14349_210_8341_1_1530615710818_3442470_115-285975-0 from training. Duration: 0.86 2023-10-03 08:05:49,212 WARNING [train.py:1204] (2/4) Exclude cut with ID _12734_210_8341_1_1530183614296_3891929_236-47464-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 08:05:49,266 WARNING [train.py:1204] (2/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_177-236809-0 from training. Duration: 0.49 2023-10-03 08:05:49,338 WARNING [train.py:1204] (2/4) Exclude cut with ID _13678_210_7497_1_1530530069226_3463170_371-3221-0 from training. Duration: 0.9 2023-10-03 08:05:49,355 WARNING [train.py:1204] (2/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_48-338976-0 from training. Duration: 0.89 2023-10-03 08:05:52,589 WARNING [train.py:1204] (2/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_1167-309744-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 08:05:56,816 WARNING [train.py:1204] (2/4) Exclude cut with ID _30327_210_16049_1_1532484161295_3924169_401-70819-0_sp0.9 from training. Duration: 0.9555625 2023-10-03 08:05:56,824 WARNING [train.py:1204] (2/4) Exclude cut with ID _55134_210_21982_1_1534590028377_3882170_346-91022-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 08:06:01,006 WARNING [train.py:1204] (2/4) Exclude cut with ID _14459_210_2228_1_1531443641946_7516700_174-118801-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 08:06:02,270 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.580e+02 1.887e+02 2.039e+02 2.346e+02 3.122e+02, threshold=4.079e+02, percent-clipped=0.0 2023-10-03 08:06:02,386 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_269-278006-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 08:06:02,440 WARNING [train.py:1204] (2/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_390-339137-0_sp0.9 from training. Duration: 0.62225 2023-10-03 08:06:05,000 WARNING [train.py:1204] (2/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_253-308377-0_sp1.1 from training. Duration: 0.4454375 2023-10-03 08:06:05,011 WARNING [train.py:1204] (2/4) Exclude cut with ID _23825_210_13449_1_1531915313526_3497150_240-271367-0_sp1.1 from training. Duration: 0.8818125 2023-10-03 08:06:05,096 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1015-343884-0_sp0.9 from training. Duration: 0.688875 2023-10-03 08:06:09,422 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.attention_skip_rate, batch_count=1196400.0, ans=0.0 2023-10-03 08:06:10,535 WARNING [train.py:1204] (2/4) Exclude cut with ID _23514_210_1389_1_1531832139785_3031439_335-48210-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 08:06:13,283 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_309-65177-0_sp1.1 from training. Duration: 0.8 2023-10-03 08:06:13,353 WARNING [train.py:1204] (2/4) Exclude cut with ID _14368_210_4220_1_1530532714870_3595529_231-137892-0 from training. Duration: 0.99 2023-10-03 08:06:13,908 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.4.encoder.layers.0.self_attn1.whiten, num_groups=1, num_channels=384, metric=13.67 vs. limit=22.5 2023-10-03 08:06:16,136 WARNING [train.py:1204] (2/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_57-270037-0 from training. Duration: 0.77 2023-10-03 08:06:16,141 WARNING [train.py:1204] (2/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_465-214726-0_sp1.1 from training. Duration: 0.7818125 2023-10-03 08:06:16,219 WARNING [train.py:1204] (2/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_106-300985-0 from training. Duration: 0.9 2023-10-03 08:06:16,224 WARNING [train.py:1204] (2/4) Exclude cut with ID _14632_210_9988_1_1530691279392_3923200_161-129530-0_sp1.1 from training. Duration: 0.87275 2023-10-03 08:06:17,554 WARNING [train.py:1204] (2/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_514-88370-0_sp1.1 from training. Duration: 0.963625 2023-10-03 08:06:20,686 WARNING [train.py:1204] (2/4) Exclude cut with ID _56495_210_4234_1_1534748080334_4136219_305-334505-0_sp1.1 from training. Duration: 0.92725 2023-10-03 08:06:22,019 WARNING [train.py:1204] (2/4) Exclude cut with ID _42875_210_14856_1_1533704397487_4012433_61-264914-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 08:06:24,901 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.conv_module2.balancer2.min_positive, batch_count=1196466.6666666667, ans=0.05 2023-10-03 08:06:25,683 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.0.layers.0.nonlin_attention.whiten1, num_groups=1, num_channels=144, metric=9.53 vs. limit=10.0 2023-10-03 08:06:26,081 WARNING [train.py:1204] (2/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_91-148831-0 from training. Duration: 0.99 2023-10-03 08:06:28,938 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_435-313305-0_sp0.9 from training. Duration: 0.788875 2023-10-03 08:06:29,070 WARNING [train.py:1204] (2/4) Exclude cut with ID _16744_210_5986_1_1531724405384_3907567_242-111316-0_sp1.1 from training. Duration: 0.8454375 2023-10-03 08:06:30,529 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_257-20733-0 from training. Duration: 0.76 2023-10-03 08:06:30,580 WARNING [train.py:1204] (2/4) Exclude cut with ID _8064_210_5399_1_1528372299291_4282973_4-91037-0_sp1.1 from training. Duration: 0.8 2023-10-03 08:06:31,911 WARNING [train.py:1204] (2/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_3-276701-0 from training. Duration: 0.91 2023-10-03 08:06:33,734 WARNING [train.py:1204] (2/4) Exclude cut with ID _3992_210_1747_1_1526644816031_3731060_36-147855-0_sp1.1 from training. Duration: 0.7090625 2023-10-03 08:06:36,842 WARNING [train.py:1204] (2/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_1296-298671-0_sp1.1 from training. Duration: 0.963625 2023-10-03 08:06:36,928 WARNING [train.py:1204] (2/4) Exclude cut with ID _68965_210_1883_1_1535698962272_3497490_309-334593-0_sp1.1 from training. Duration: 0.92725 2023-10-03 08:06:38,814 WARNING [train.py:1204] (2/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_128-296581-0 from training. Duration: 0.74 2023-10-03 08:06:38,814 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_17613_210_2151_1_1531137569_2366160_170-299079-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 08:06:38,816 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6287_210_3273_1_1527509506_2653716_182-199887-0_sp1.1 from training. Duration: 0.463625 2023-10-03 08:06:40,275 WARNING [train.py:1204] (2/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_80-135200-0_sp1.1 from training. Duration: 0.7181875 2023-10-03 08:06:44,140 WARNING [train.py:1204] (2/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_34-183685-0 from training. Duration: 0.98 2023-10-03 08:06:44,165 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8278_210_2550_1_1528637099_4220413_418-210155-0_sp1.1 from training. Duration: 0.92725 2023-10-03 08:06:44,168 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8853_210_3273_1_1528633899_818968_51-187685-0_sp0.9 from training. Duration: 0.7666875 2023-10-03 08:06:44,193 WARNING [train.py:1204] (2/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_332-221057-0_sp1.1 from training. Duration: 0.6181875 2023-10-03 08:06:45,683 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_2221_210_1988_1_1525597051_4935541_415-296882-0 from training. Duration: 0.77 2023-10-03 08:06:45,719 WARNING [train.py:1204] (2/4) Exclude cut with ID _33640_210_2655_1_1533117605479_4855469_770-278185-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 08:06:47,020 WARNING [train.py:1204] (2/4) Exclude cut with ID _5287_210_2084_1_1527418883040_3878670_333-48508-0_sp1.1 from training. Duration: 0.5454375 2023-10-03 08:06:47,077 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_43802_210_2290_1_1533516203_8829504_142-276839-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 08:06:48,542 WARNING [train.py:1204] (2/4) Exclude cut with ID _28394_210_8913_1_1532430049518_3650118_126-139371-0_sp1.1 from training. Duration: 0.92725 2023-10-03 08:06:49,793 WARNING [train.py:1204] (2/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_76-203867-0 from training. Duration: 0.72 2023-10-03 08:06:49,849 WARNING [train.py:1204] (2/4) Exclude cut with ID _6194_210_1534_1_1527511826160_3941194_14-282876-0_sp0.9 from training. Duration: 0.788875 2023-10-03 08:06:54,528 WARNING [train.py:1204] (2/4) Exclude cut with ID _35355_210_16040_1_1533212988977_4016229_74-190076-0_sp1.1 from training. Duration: 0.72725 2023-10-03 08:06:56,012 WARNING [train.py:1204] (2/4) Exclude cut with ID _33920_210_4220_1_1533257851500_3084399_198-334063-0 from training. Duration: 0.94 2023-10-03 08:06:57,318 INFO [train.py:1046] (2/4) Epoch 34, batch 4200, loss[loss=0.1686, simple_loss=0.2549, pruned_loss=0.04118, over 23748.00 frames. ], tot_loss[loss=0.1607, simple_loss=0.2392, pruned_loss=0.04108, over 4714585.70 frames. ], batch size: 85, lr: 2.99e-03, grad_scale: 16.0 2023-10-03 08:06:58,716 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6291_210_3273_1_1527683649_1078498_4-266348-0_sp0.9 from training. Duration: 0.8666875 2023-10-03 08:07:00,148 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_23856_210_4238_1_1531994785_3087934_122-38075-0_sp1.1 from training. Duration: 0.936375 2023-10-03 08:07:01,460 WARNING [train.py:1204] (2/4) Exclude cut with ID _4697_210_2069_1_1526815801953_3823098_276-96593-0_sp0.9 from training. Duration: 0.8555625 2023-10-03 08:07:01,516 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_308-278734-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 08:07:01,517 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_5190_210_4557_1_1527245078_2965646_353-250603-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 08:07:02,902 WARNING [train.py:1204] (2/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_472-216663-0 from training. Duration: 0.92 2023-10-03 08:07:06,307 WARNING [train.py:1204] (2/4) Exclude cut with ID _7537_210_2084_1_1528502395431_3924359_14-312906-0 from training. Duration: 0.84 2023-10-03 08:07:06,342 WARNING [train.py:1204] (2/4) Exclude cut with ID _29003_210_19596_1_1532507391720_3894098_559-236660-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 08:07:08,915 WARNING [train.py:1204] (2/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_312-204798-0_sp1.1 from training. Duration: 0.8454375 2023-10-03 08:07:12,263 WARNING [train.py:1204] (2/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_17-309070-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 08:07:16,246 WARNING [train.py:1204] (2/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_344-152526-0_sp0.9 from training. Duration: 0.588875 2023-10-03 08:07:16,376 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_41452_210_7291_1_1533437687_3941281_405-11559-0_sp1.1 from training. Duration: 0.82725 2023-10-03 08:07:17,716 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_48730_210_5399_1_1533885886_2212769_157-124052-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 08:07:17,756 WARNING [train.py:1204] (2/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_415-78792-0 from training. Duration: 0.81 2023-10-03 08:07:17,761 WARNING [train.py:1204] (2/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_345-340690-0_sp1.1 from training. Duration: 0.8454375 2023-10-03 08:07:19,747 WARNING [train.py:1204] (2/4) Exclude cut with ID _23825_210_13449_1_1531915313526_3497150_124-8138-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 08:07:19,804 WARNING [train.py:1204] (2/4) Exclude cut with ID _49258_210_7994_1_1534122017863_3738861_194-24864-0_sp1.1 from training. Duration: 0.936375 2023-10-03 08:07:20,904 WARNING [train.py:1204] (2/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_369-222060-0_sp1.1 from training. Duration: 0.6454375 2023-10-03 08:07:22,382 WARNING [train.py:1204] (2/4) Exclude cut with ID _12369_210_4381_1_1530356078581_4388520_480-13146-0_sp0.9 from training. Duration: 0.9444375 2023-10-03 08:07:25,160 WARNING [train.py:1204] (2/4) Exclude cut with ID _13253_210_1925_1_1530757775819_3577873_191-144977-0 from training. Duration: 0.78 2023-10-03 08:07:25,186 WARNING [train.py:1204] (2/4) Exclude cut with ID _30304_210_6319_1_1532476827261_7582060_1128-140266-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 08:07:28,179 WARNING [train.py:1204] (2/4) Exclude cut with ID _8772_210_1850_1_1528635595972_3664011_247-336448-0_sp0.9 from training. Duration: 0.82225 2023-10-03 08:07:29,479 WARNING [train.py:1204] (2/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_318-59097-0_sp1.1 from training. Duration: 0.6909375 2023-10-03 08:07:30,946 WARNING [train.py:1204] (2/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_540-131164-0_sp1.1 from training. Duration: 0.836375 2023-10-03 08:07:32,749 WARNING [train.py:1204] (2/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_39-110836-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 08:07:35,465 WARNING [train.py:1204] (2/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_860-96198-0_sp1.1 from training. Duration: 0.663625 2023-10-03 08:07:37,092 WARNING [train.py:1204] (2/4) Exclude cut with ID _15133_210_6228_1_1530794512074_3080379_29-264458-0 from training. Duration: 0.58 2023-10-03 08:07:37,120 WARNING [train.py:1204] (2/4) Exclude cut with ID _55209_210_1531_1_1534675828996_4597160_797-348343-0_sp1.1 from training. Duration: 0.963625 2023-10-03 08:07:37,386 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.ff2_skip_rate, batch_count=1196800.0, ans=0.0 2023-10-03 08:07:38,508 WARNING [train.py:1204] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_23-189234-0_sp1.1 from training. Duration: 0.7545625 2023-10-03 08:07:45,837 WARNING [train.py:1204] (2/4) Exclude cut with ID _5410_210_1388_1_1527397055795_1719020_39-112221-0_sp1.1 from training. Duration: 0.62725 2023-10-03 08:07:47,295 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_43802_210_2290_1_1533516203_8829504_100-348441-0_sp1.1 from training. Duration: 0.82725 2023-10-03 08:07:51,572 WARNING [train.py:1204] (2/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_143-86344-0_sp1.1 from training. Duration: 0.7 2023-10-03 08:07:56,069 WARNING [train.py:1204] (2/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_126-29104-0 from training. Duration: 0.85 2023-10-03 08:07:58,703 WARNING [train.py:1204] (2/4) Exclude cut with ID _39846_210_12651_1_1533603651992_1318030_138-31771-0_sp1.1 from training. Duration: 0.963625 2023-10-03 08:08:03,059 WARNING [train.py:1204] (2/4) Exclude cut with ID _2220_210_1819_1_1525509109898_3658680_50-99837-0_sp1.1 from training. Duration: 0.4909375 2023-10-03 08:08:03,123 WARNING [train.py:1204] (2/4) Exclude cut with ID _6274_210_2084_1_1527937583837_7494020_836-210550-0_sp1.1 from training. Duration: 0.97275 2023-10-03 08:08:03,242 WARNING [train.py:1204] (2/4) Exclude cut with ID _28323_210_16187_1_1532480395667_4100300_306-74561-0 from training. Duration: 0.94 2023-10-03 08:08:09,728 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_177-58005-0_sp1.1 from training. Duration: 0.563625 2023-10-03 08:08:11,119 INFO [train.py:1046] (2/4) Epoch 34, batch 4250, loss[loss=0.1502, simple_loss=0.2205, pruned_loss=0.03994, over 23521.00 frames. ], tot_loss[loss=0.16, simple_loss=0.2386, pruned_loss=0.04067, over 4715634.79 frames. ], batch size: 285, lr: 2.99e-03, grad_scale: 16.0 2023-10-03 08:08:13,178 WARNING [train.py:1204] (2/4) Exclude cut with ID _54871_210_22356_1_1534665798253_3856359_19-342632-0_sp1.1 from training. Duration: 0.82725 2023-10-03 08:08:13,197 WARNING [train.py:1204] (2/4) Exclude cut with ID _35355_210_16040_1_1533212988977_4016229_74-190076-0_sp0.9 from training. Duration: 0.888875 2023-10-03 08:08:15,908 WARNING [train.py:1204] (2/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_285-97786-0_sp1.1 from training. Duration: 0.97275 2023-10-03 08:08:20,121 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=1197000.0, ans=0.1 2023-10-03 08:08:22,642 WARNING [train.py:1204] (2/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_337-181502-0_sp0.9 from training. Duration: 0.72225 2023-10-03 08:08:22,680 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_353-182855-0 from training. Duration: 0.96 2023-10-03 08:08:22,710 WARNING [train.py:1204] (2/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_409-327730-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 08:08:27,389 WARNING [train.py:1204] (2/4) Exclude cut with ID _8030_210_7497_1_1528444886781_3531089_373-19403-0_sp1.1 from training. Duration: 0.97275 2023-10-03 08:08:29,001 WARNING [train.py:1204] (2/4) Exclude cut with ID _6789_210_1534_1_1527857937218_3118950_231-340237-0_sp1.1 from training. Duration: 0.936375 2023-10-03 08:08:29,115 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.1.conv_module1.balancer2.min_positive, batch_count=1197066.6666666667, ans=0.05 2023-10-03 08:08:30,135 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.578e+02 1.924e+02 2.069e+02 2.506e+02 3.818e+02, threshold=4.139e+02, percent-clipped=0.0 2023-10-03 08:08:31,852 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.conv_module2.balancer1.prob, batch_count=1197066.6666666667, ans=0.125 2023-10-03 08:08:33,041 WARNING [train.py:1204] (2/4) Exclude cut with ID _6493_210_1747_1_1528459210424_4012470_536-57832-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 08:08:33,060 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7046_210_6753_1_1527895337_6920917_756-116002-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 08:08:36,110 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_37152_210_9697_1_1533106573_3844188_29-239070-0_sp1.1 from training. Duration: 0.8909375 2023-10-03 08:08:36,113 WARNING [train.py:1204] (2/4) Exclude cut with ID _19023_210_4353_1_1531378892552_5820780_61-334539-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 08:08:37,512 WARNING [train.py:1204] (2/4) Exclude cut with ID _14170_210_5219_1_1530525949868_4469650_159-28488-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 08:08:38,806 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_40896_210_15710_1_1533433956_7100802_764-71272-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 08:08:40,809 WARNING [train.py:1204] (2/4) Exclude cut with ID _44902_210_16187_1_1533888006062_3792078_258-178274-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 08:08:40,970 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.0.feed_forward3.hidden_balancer.prob, batch_count=1197133.3333333333, ans=0.125 2023-10-03 08:08:42,305 WARNING [train.py:1204] (2/4) Exclude cut with ID _7537_210_2084_1_1528502395431_3924359_14-312906-0_sp1.1 from training. Duration: 0.763625 2023-10-03 08:08:43,643 WARNING [train.py:1204] (2/4) Exclude cut with ID _49643_210_7989_1_1534640244224_3939031_674-116639-0_sp1.1 from training. Duration: 0.963625 2023-10-03 08:08:45,028 WARNING [train.py:1204] (2/4) Exclude cut with ID _22826_210_9994_1_1531908201522_3279169_415-161-0 from training. Duration: 0.98 2023-10-03 08:08:47,114 WARNING [train.py:1204] (2/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_264-348896-0 from training. Duration: 0.85 2023-10-03 08:08:47,122 WARNING [train.py:1204] (2/4) Exclude cut with ID _51899_210_20034_1_1534129254466_3669220_233-161492-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 08:08:48,541 WARNING [train.py:1204] (2/4) Exclude cut with ID _31255_210_13427_1_1532865619495_3815143_233-200023-0_sp1.1 from training. Duration: 0.936375 2023-10-03 08:08:49,859 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_23850_210_4238_1_1532232597_1632433_100-229879-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 08:08:50,574 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.2.encoder.layers.1.nonlin_attention.whiten2, num_groups=1, num_channels=384, metric=10.93 vs. limit=15.0 2023-10-03 08:08:50,623 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.1.encoder.layers.1.nonlin_attention.whiten2, num_groups=1, num_channels=256, metric=10.41 vs. limit=15.0 2023-10-03 08:08:51,083 WARNING [train.py:1204] (2/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_375-245677-0_sp0.9 from training. Duration: 0.9 2023-10-03 08:08:51,087 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_40223_210_6228_1_1533298404_4812267_355-317415-0_sp1.1 from training. Duration: 0.97275 2023-10-03 08:08:51,141 WARNING [train.py:1204] (2/4) Exclude cut with ID _9679_210_7497_1_1529129212906_4363929_235-81488-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 08:08:53,866 WARNING [train.py:1204] (2/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_172-215932-0_sp1.1 from training. Duration: 0.6 2023-10-03 08:08:55,674 WARNING [train.py:1204] (2/4) Exclude cut with ID _12369_210_4381_1_1530356078581_4388520_64-298917-0_sp1.1 from training. Duration: 0.8 2023-10-03 08:08:55,917 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.bypass.skip_rate, batch_count=1197200.0, ans=0.09899494936611666 2023-10-03 08:08:58,566 WARNING [train.py:1204] (2/4) Exclude cut with ID _38020_210_5986_1_1533112252259_7006089_131-53533-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 08:09:01,218 WARNING [train.py:1204] (2/4) Exclude cut with ID _42334_210_7770_1_1534206610396_3605880_392-48154-0_sp1.1 from training. Duration: 0.963625 2023-10-03 08:09:01,277 WARNING [train.py:1204] (2/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_337-181502-0 from training. Duration: 0.65 2023-10-03 08:09:01,284 WARNING [train.py:1204] (2/4) Exclude cut with ID _6267_210_2084_1_1527764658526_3894730_375-219887-0_sp0.9 from training. Duration: 0.7444375 2023-10-03 08:09:01,358 WARNING [train.py:1204] (2/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_60-146189-0 from training. Duration: 0.98 2023-10-03 08:09:02,716 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_3133_210_2087_1_1526106446_2324690_243-143119-0_sp1.1 from training. Duration: 0.7 2023-10-03 08:09:04,023 WARNING [train.py:1204] (2/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_1267-272693-0_sp0.9 from training. Duration: 0.811125 2023-10-03 08:09:05,982 WARNING [train.py:1204] (2/4) Exclude cut with ID _69884_210_9771_1_1535846392818_4166169_83-182645-0_sp1.1 from training. Duration: 0.97275 2023-10-03 08:09:06,012 WARNING [train.py:1204] (2/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_403-292260-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 08:09:08,795 WARNING [train.py:1204] (2/4) Exclude cut with ID _55429_210_10626_1_1534467588527_7174143_377-169391-0 from training. Duration: 0.96 2023-10-03 08:09:10,645 WARNING [train.py:1204] (2/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_494-199538-0_sp1.1 from training. Duration: 0.6818125 2023-10-03 08:09:10,695 WARNING [train.py:1204] (2/4) Exclude cut with ID _3067_210_2667_1_1526212974640_4503168_119-175776-0_sp1.1 from training. Duration: 0.67275 2023-10-03 08:09:13,465 WARNING [train.py:1204] (2/4) Exclude cut with ID _3231_210_2106_1_1526205864934_6784220_183-303839-0_sp1.1 from training. Duration: 0.97275 2023-10-03 08:09:13,595 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.1.conv_skip_rate, batch_count=1197266.6666666667, ans=0.0 2023-10-03 08:09:16,677 WARNING [train.py:1204] (2/4) Exclude cut with ID _18513_210_9206_1_1531618215223_3862620_165-38958-0_sp1.1 from training. Duration: 0.963625 2023-10-03 08:09:18,074 WARNING [train.py:1204] (2/4) Exclude cut with ID _41314_210_12651_1_1533459476543_3766460_87-272068-0_sp1.1 from training. Duration: 0.7545625 2023-10-03 08:09:19,427 WARNING [train.py:1204] (2/4) Exclude cut with ID _3541_210_2477_1_1526475408709_3263973_353-95-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 08:09:20,892 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_2009_210_2071_1_1525481134_4588937_73-302813-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 08:09:22,313 WARNING [train.py:1204] (2/4) Exclude cut with ID _64220_210_9527_1_1535250151772_4875259_88-95011-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 08:09:23,675 WARNING [train.py:1204] (2/4) Exclude cut with ID _44902_210_16187_1_1533888006062_3792078_160-12107-0_sp1.1 from training. Duration: 0.92725 2023-10-03 08:09:23,682 WARNING [train.py:1204] (2/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_547-5424-0 from training. Duration: 0.94 2023-10-03 08:09:24,421 WARNING [train.py:1204] (2/4) Exclude cut with ID _21783_210_11357_1_1531742458730_3762679_268-85656-0_sp1.1 from training. Duration: 0.936375 2023-10-03 08:09:25,710 INFO [train.py:1046] (2/4) Epoch 34, batch 4300, loss[loss=0.1526, simple_loss=0.2395, pruned_loss=0.03288, over 24470.00 frames. ], tot_loss[loss=0.1595, simple_loss=0.2384, pruned_loss=0.04035, over 4715220.77 frames. ], batch size: 66, lr: 2.99e-03, grad_scale: 16.0 2023-10-03 08:09:29,933 WARNING [train.py:1204] (2/4) Exclude cut with ID _66210_210_12651_1_1535446812922_3365850_42-284985-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 08:09:29,970 WARNING [train.py:1204] (2/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_465-214726-0_sp0.9 from training. Duration: 0.9555625 2023-10-03 08:09:34,169 WARNING [train.py:1204] (2/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_50-95770-0_sp1.1 from training. Duration: 0.936375 2023-10-03 08:09:40,965 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.conv_module2.balancer2.prob, batch_count=1197400.0, ans=0.125 2023-10-03 08:09:42,078 WARNING [train.py:1204] (2/4) Exclude cut with ID _30800_210_22382_1_1532499347904_4515959_269-77567-0_sp1.1 from training. Duration: 0.963625 2023-10-03 08:09:42,083 WARNING [train.py:1204] (2/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_7-111954-0 from training. Duration: 0.56 2023-10-03 08:09:44,707 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_43802_210_2290_1_1533516203_8829504_85-195240-0_sp0.9 from training. Duration: 0.9666875 2023-10-03 08:09:48,009 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_2822_210_2715_1_1526112586_1999587_165-36656-0_sp0.9 from training. Duration: 0.988875 2023-10-03 08:09:48,026 WARNING [train.py:1204] (2/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_181-78632-0_sp1.1 from training. Duration: 0.7090625 2023-10-03 08:09:48,038 WARNING [train.py:1204] (2/4) Exclude cut with ID IC0671W0044-331887 from training. Duration: 0.896 2023-10-03 08:09:48,780 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.3.encoder.layers.2.feed_forward2.out_whiten, num_groups=1, num_channels=512, metric=8.30 vs. limit=15.0 2023-10-03 08:09:49,548 WARNING [train.py:1204] (2/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_266-137104-0_sp1.1 from training. Duration: 0.5909375 2023-10-03 08:09:50,947 WARNING [train.py:1204] (2/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_137-116442-0_sp0.9 from training. Duration: 0.8666875 2023-10-03 08:09:51,132 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=1197400.0, ans=0.1 2023-10-03 08:09:54,993 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_23801_210_16223_1_1532066392_3294657_570-5439-0 from training. Duration: 0.97 2023-10-03 08:09:55,011 WARNING [train.py:1204] (2/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_29-280622-0_sp1.1 from training. Duration: 0.7454375 2023-10-03 08:09:55,032 WARNING [train.py:1204] (2/4) Exclude cut with ID 3557-8342-0013-54691-0 from training. Duration: 0.92 2023-10-03 08:09:57,192 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.bypass.scale_min, batch_count=1197466.6666666667, ans=0.2 2023-10-03 08:09:58,209 WARNING [train.py:1204] (2/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_711-15688-0_sp1.1 from training. Duration: 0.3909375 2023-10-03 08:09:59,605 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_15126_210_6753_1_1530778677_2591185_120-277707-0_sp1.1 from training. Duration: 0.72725 2023-10-03 08:10:01,061 WARNING [train.py:1204] (2/4) Exclude cut with ID _14970_210_15709_1_1530775547361_1966965_8-169924-0_sp1.1 from training. Duration: 0.836375 2023-10-03 08:10:01,063 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_4692_210_3528_1_1526899755_4323254_107-44401-0_sp0.9 from training. Duration: 0.9555625 2023-10-03 08:10:02,557 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.self_attn_weights.pos_emb_skip_rate, batch_count=1197466.6666666667, ans=0.0 2023-10-03 08:10:03,607 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_3475_210_2028_1_1526193578_3622569_265-276625-0_sp0.9 from training. Duration: 0.8444375 2023-10-03 08:10:04,949 WARNING [train.py:1204] (2/4) Exclude cut with ID _55153_210_2253_1_1534384786677_3162096_148-79022-0_sp1.1 from training. Duration: 0.87275 2023-10-03 08:10:05,030 WARNING [train.py:1204] (2/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_292-36336-0_sp1.1 from training. Duration: 0.92725 2023-10-03 08:10:05,039 WARNING [train.py:1204] (2/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_329-82235-0 from training. Duration: 0.82 2023-10-03 08:10:06,796 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_11305_210_1794_1_1529750428_7178148_257-312870-0 from training. Duration: 0.88 2023-10-03 08:10:08,140 WARNING [train.py:1204] (2/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_436-4493-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 08:10:09,947 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.conv_module2.balancer2.prob, batch_count=1197533.3333333333, ans=0.125 2023-10-03 08:10:11,578 WARNING [train.py:1204] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_321-54129-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 08:10:11,585 WARNING [train.py:1204] (2/4) Exclude cut with ID _5287_210_2084_1_1527418883040_3878670_333-48508-0_sp0.9 from training. Duration: 0.6666875 2023-10-03 08:10:11,606 WARNING [train.py:1204] (2/4) Exclude cut with ID _26528_210_9070_1_1532570410842_3851310_244-278158-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 08:10:12,858 WARNING [train.py:1204] (2/4) Exclude cut with ID _46952_210_5986_1_1534230010357_3107379_232-344503-0_sp1.1 from training. Duration: 0.87275 2023-10-03 08:10:12,868 WARNING [train.py:1204] (2/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_43-206757-0 from training. Duration: 0.75 2023-10-03 08:10:12,869 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_4183_210_2087_1_1526729100_1869838_141-166600-0 from training. Duration: 0.95 2023-10-03 08:10:12,939 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_296-287678-0 from training. Duration: 0.95 2023-10-03 08:10:14,276 WARNING [train.py:1204] (2/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_265-60343-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 08:10:14,302 WARNING [train.py:1204] (2/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_332-256168-0 from training. Duration: 0.75 2023-10-03 08:10:14,343 WARNING [train.py:1204] (2/4) Exclude cut with ID _13679_210_7497_1_1530534293662_3355098_411-186088-0 from training. Duration: 0.94 2023-10-03 08:10:18,307 WARNING [train.py:1204] (2/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_458-126779-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 08:10:20,159 WARNING [train.py:1204] (2/4) Exclude cut with ID IC0803W0380-397572 from training. Duration: 0.032 2023-10-03 08:10:21,522 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_278-272612-0_sp1.1 from training. Duration: 0.663625 2023-10-03 08:10:24,202 WARNING [train.py:1204] (2/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_491-263101-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 08:10:24,207 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_53555_210_5632_1_1534595220_7232297_334-336778-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 08:10:24,492 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=1197600.0, ans=0.1 2023-10-03 08:10:25,616 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_50-3289-0 from training. Duration: 0.91 2023-10-03 08:10:28,039 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.4.encoder.layers.0.nonlin_attention.whiten1, num_groups=1, num_channels=288, metric=8.91 vs. limit=10.0 2023-10-03 08:10:28,757 WARNING [train.py:1204] (2/4) Exclude cut with ID _8699_210_2069_1_1528630346831_3960409_155-39766-0_sp0.9 from training. Duration: 0.8666875 2023-10-03 08:10:28,761 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_502-82145-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 08:10:28,821 WARNING [train.py:1204] (2/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_550-43132-0_sp1.1 from training. Duration: 0.97275 2023-10-03 08:10:30,052 WARNING [train.py:1204] (2/4) Exclude cut with ID _14998_210_5219_1_1530705582080_7474970_446-112117-0_sp1.1 from training. Duration: 0.8181875 2023-10-03 08:10:31,390 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_3691_210_3528_1_1526468169_4062053_355-334432-0_sp1.1 from training. Duration: 0.7909375 2023-10-03 08:10:32,865 WARNING [train.py:1204] (2/4) Exclude cut with ID _58605_210_12210_1_1534723195939_3904259_239-94720-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 08:10:34,422 WARNING [train.py:1204] (2/4) Exclude cut with ID _49376_210_5986_1_1533985278732_3370740_391-344072-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 08:10:35,772 WARNING [train.py:1204] (2/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_558-322014-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 08:10:37,057 WARNING [train.py:1204] (2/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_256-32662-0_sp1.1 from training. Duration: 0.8181875 2023-10-03 08:10:38,349 INFO [train.py:1046] (2/4) Epoch 34, batch 4350, loss[loss=0.1618, simple_loss=0.2511, pruned_loss=0.03629, over 24636.00 frames. ], tot_loss[loss=0.1602, simple_loss=0.2392, pruned_loss=0.04057, over 4726107.88 frames. ], batch size: 73, lr: 2.98e-03, grad_scale: 16.0 2023-10-03 08:10:43,178 WARNING [train.py:1204] (2/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_572-185150-0 from training. Duration: 0.97 2023-10-03 08:10:43,210 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_89-35748-0_sp0.9 from training. Duration: 0.87775 2023-10-03 08:10:47,816 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6291_210_3273_1_1527683649_1078498_88-66747-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 08:10:47,975 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.0.bypass_mid.scale_min, batch_count=1197666.6666666667, ans=0.2 2023-10-03 08:10:49,238 WARNING [train.py:1204] (2/4) Exclude cut with ID _19504_210_9994_1_1531648863242_3325220_357-237029-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 08:10:51,968 WARNING [train.py:1204] (2/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_302-2550-0_sp1.1 from training. Duration: 0.736375 2023-10-03 08:10:51,969 WARNING [train.py:1204] (2/4) Exclude cut with ID _9461_210_4557_1_1529060514726_3677451_59-194017-0_sp1.1 from training. Duration: 0.963625 2023-10-03 08:10:55,352 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.bypass.scale_min, batch_count=1197733.3333333333, ans=0.2 2023-10-03 08:10:56,414 WARNING [train.py:1204] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_665-196155-0_sp1.1 from training. Duration: 0.6090625 2023-10-03 08:10:57,703 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.566e+02 1.863e+02 2.001e+02 2.275e+02 3.129e+02, threshold=4.001e+02, percent-clipped=0.0 2023-10-03 08:10:59,933 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.bypass.skip_rate, batch_count=1197733.3333333333, ans=0.07 2023-10-03 08:11:00,927 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_326-315007-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 08:11:03,634 WARNING [train.py:1204] (2/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_105-328174-0_sp1.1 from training. Duration: 0.6909375 2023-10-03 08:11:03,649 WARNING [train.py:1204] (2/4) Exclude cut with ID _56494_210_4220_1_1535284837625_3033169_106-263722-0_sp1.1 from training. Duration: 0.97275 2023-10-03 08:11:05,261 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.feed_forward1.out_proj.dropout_p, batch_count=1197733.3333333333, ans=0.1 2023-10-03 08:11:06,385 WARNING [train.py:1204] (2/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_770-200430-0_sp0.9 from training. Duration: 0.988875 2023-10-03 08:11:09,062 WARNING [train.py:1204] (2/4) Exclude cut with ID _65640_210_6310_1_1535368201445_3173250_209-13008-0_sp1.1 from training. Duration: 0.9 2023-10-03 08:11:11,000 WARNING [train.py:1204] (2/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_212-149947-0_sp0.9 from training. Duration: 0.811125 2023-10-03 08:11:15,190 WARNING [train.py:1204] (2/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_188-171673-0 from training. Duration: 0.58 2023-10-03 08:11:16,561 WARNING [train.py:1204] (2/4) Exclude cut with ID _58895_210_8750_1_1535184081320_3649089_286-281724-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 08:11:16,628 WARNING [train.py:1204] (2/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_16-254909-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 08:11:22,144 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.2.encoder.layers.1.nonlin_attention.whiten1, num_groups=1, num_channels=288, metric=5.79 vs. limit=10.0 2023-10-03 08:11:24,440 WARNING [train.py:1204] (2/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_52-95655-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 08:11:25,886 WARNING [train.py:1204] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_554-45617-0 from training. Duration: 0.99 2023-10-03 08:11:28,661 WARNING [train.py:1204] (2/4) Exclude cut with ID _14683_210_5219_1_1530698352608_3974859_473-344611-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 08:11:29,383 WARNING [train.py:1204] (2/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_17-25045-0_sp0.9 from training. Duration: 0.6555625 2023-10-03 08:11:33,550 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9072W0003-530637 from training. Duration: 0.768 2023-10-03 08:11:35,058 WARNING [train.py:1204] (2/4) Exclude cut with ID _38670_210_15379_1_1533448810659_4391059_76-74025-0_sp1.1 from training. Duration: 0.936375 2023-10-03 08:11:35,096 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6291_210_3273_1_1527683649_1078498_74-292608-0_sp0.9 from training. Duration: 0.8 2023-10-03 08:11:35,331 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=1197866.6666666667, ans=0.1 2023-10-03 08:11:36,423 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9084W0025-536862 from training. Duration: 0.896 2023-10-03 08:11:36,645 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.conv_module1.balancer2.prob, batch_count=1197933.3333333333, ans=0.125 2023-10-03 08:11:37,840 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9087W0021-538392 from training. Duration: 0.768 2023-10-03 08:11:37,846 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7543_210_6753_1_1528543323_645515_56-163234-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 08:11:37,866 WARNING [train.py:1204] (2/4) Exclude cut with ID _45560_210_21982_1_1533812603951_3584999_44-332643-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 08:11:37,950 WARNING [train.py:1204] (2/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_1285-256622-0_sp1.1 from training. Duration: 0.763625 2023-10-03 08:11:39,345 WARNING [train.py:1204] (2/4) Exclude cut with ID _52781_210_16187_1_1534297729099_1118149_141-325632-0_sp1.1 from training. Duration: 0.936375 2023-10-03 08:11:39,409 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_40896_210_15710_1_1533433956_7100802_1051-137631-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 08:11:39,456 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_13091_210_7925_1_1530609447_8987354_233-18333-0_sp1.1 from training. Duration: 0.97275 2023-10-03 08:11:43,909 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_34008_210_10431_1_1533345424_8183247_662-317331-0 from training. Duration: 0.94 2023-10-03 08:11:43,922 WARNING [train.py:1204] (2/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_291-248920-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 08:11:43,923 WARNING [train.py:1204] (2/4) Exclude cut with ID _65263_210_22356_1_1535763598691_3921037_274-288481-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 08:11:43,950 WARNING [train.py:1204] (2/4) Exclude cut with ID _5189_210_4557_1_1527248319428_3769229_233-168080-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 08:11:45,293 WARNING [train.py:1204] (2/4) Exclude cut with ID _54871_210_22356_1_1534665798253_3856359_135-184281-0 from training. Duration: 0.99 2023-10-03 08:11:45,394 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9123W0012-555972 from training. Duration: 0.896 2023-10-03 08:11:45,398 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9123W0118-556077 from training. Duration: 0.896 2023-10-03 08:11:45,414 WARNING [train.py:1204] (2/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_406-275649-0 from training. Duration: 0.42 2023-10-03 08:11:45,643 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.bypass.skip_rate, batch_count=1197933.3333333333, ans=0.09899494936611666 2023-10-03 08:11:48,327 WARNING [train.py:1204] (2/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_309-13684-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 08:11:50,180 WARNING [train.py:1204] (2/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_129-337802-0_sp1.1 from training. Duration: 0.6545625 2023-10-03 08:11:50,212 WARNING [train.py:1204] (2/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_649-266825-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 08:11:51,583 WARNING [train.py:1204] (2/4) Exclude cut with ID _40519_210_13794_1_1533715228655_7337390_895-224742-0_sp1.1 from training. Duration: 0.92725 2023-10-03 08:11:52,795 INFO [train.py:1046] (2/4) Epoch 34, batch 4400, loss[loss=0.1665, simple_loss=0.2437, pruned_loss=0.04459, over 23653.00 frames. ], tot_loss[loss=0.1614, simple_loss=0.2405, pruned_loss=0.04116, over 4717054.92 frames. ], batch size: 232, lr: 2.98e-03, grad_scale: 32.0 2023-10-03 08:11:52,944 WARNING [train.py:1204] (2/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_234-14174-0 from training. Duration: 0.82 2023-10-03 08:11:56,197 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9156W0009-572502 from training. Duration: 0.768 2023-10-03 08:11:56,204 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_424-245529-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 08:12:00,950 WARNING [train.py:1204] (2/4) Exclude cut with ID _26528_210_9070_1_1532570410842_3851310_232-10149-0_sp1.1 from training. Duration: 0.963625 2023-10-03 08:12:02,239 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_17292_210_6753_1_1531125388_2943734_152-30410-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 08:12:04,861 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_35441_210_16554_1_1533347572_4092680_291-82075-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 08:12:05,004 WARNING [train.py:1204] (2/4) Exclude cut with ID _12369_210_4381_1_1530356078581_4388520_64-298917-0 from training. Duration: 0.88 2023-10-03 08:12:05,022 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_5618_210_1874_1_1527311008_5851_1-214501-0_sp0.9 from training. Duration: 0.7655625 2023-10-03 08:12:06,323 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_9460_210_4211_1_1528954587_10049667_572-37035-0 from training. Duration: 0.8 2023-10-03 08:12:06,348 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9193W0023-590322 from training. Duration: 0.896 2023-10-03 08:12:07,769 WARNING [train.py:1204] (2/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_206-10097-0_sp1.1 from training. Duration: 0.4454375 2023-10-03 08:12:07,788 WARNING [train.py:1204] (2/4) Exclude cut with ID _55134_210_21982_1_1534590028377_3882170_17-51731-0_sp1.1 from training. Duration: 0.963625 2023-10-03 08:12:09,346 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.nonlin_attention.balancer.prob, batch_count=1198066.6666666667, ans=0.125 2023-10-03 08:12:10,589 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_14953_210_2151_1_1530701710_5616165_710-165400-0 from training. Duration: 0.87 2023-10-03 08:12:11,969 WARNING [train.py:1204] (2/4) Exclude cut with ID _53203_210_2087_1_1534246218309_475590_61-85777-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 08:12:13,297 WARNING [train.py:1204] (2/4) Exclude cut with ID _55844_210_2346_1_1534471212569_7625707_1050-10453-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 08:12:13,307 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9218W0013-603297 from training. Duration: 0.896 2023-10-03 08:12:14,808 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_20120_210_6753_1_1531643035_5482859_267-102818-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 08:12:14,809 WARNING [train.py:1204] (2/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_191-41023-0 from training. Duration: 0.71 2023-10-03 08:12:14,854 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9231W0202-610227 from training. Duration: 0.896 2023-10-03 08:12:18,166 WARNING [train.py:1204] (2/4) Exclude cut with ID _6537_210_1883_1_1527987559710_3597129_382-133062-0 from training. Duration: 0.68 2023-10-03 08:12:19,951 WARNING [train.py:1204] (2/4) Exclude cut with ID _28427_210_4220_1_1532581240967_3061069_43-84928-0 from training. Duration: 0.99 2023-10-03 08:12:19,988 WARNING [train.py:1204] (2/4) Exclude cut with ID _59256_210_5986_1_1535076085997_3589120_121-237386-0 from training. Duration: 0.96 2023-10-03 08:12:21,171 WARNING [train.py:1204] (2/4) Exclude cut with ID _8480_210_1874_1_1528589391205_6728330_137-256986-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 08:12:21,264 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_9417_210_1794_1_1529235261_4614131_479-346963-0_sp1.1 from training. Duration: 0.936375 2023-10-03 08:12:21,294 WARNING [train.py:1204] (2/4) Exclude cut with ID _26474_210_9570_1_1532520022699_3623380_267-7362-0_sp1.1 from training. Duration: 0.936375 2023-10-03 08:12:22,697 WARNING [train.py:1204] (2/4) Exclude cut with ID _55328_210_27767_1_1534580973260_4088170_287-61272-0_sp1.1 from training. Duration: 0.97275 2023-10-03 08:12:24,105 WARNING [train.py:1204] (2/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_589-9070-0 from training. Duration: 0.71 2023-10-03 08:12:24,119 WARNING [train.py:1204] (2/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_148-158509-0_sp1.1 from training. Duration: 0.37275 2023-10-03 08:12:25,474 WARNING [train.py:1204] (2/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_164-184548-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 08:12:26,955 WARNING [train.py:1204] (2/4) Exclude cut with ID _14970_210_15709_1_1530775547361_1966965_6-23328-0_sp1.1 from training. Duration: 0.7818125 2023-10-03 08:12:26,961 WARNING [train.py:1204] (2/4) Exclude cut with ID _29303_210_2575_1_1532392237303_7410649_612-165069-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 08:12:28,689 WARNING [train.py:1204] (2/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_383-321224-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 08:12:28,742 WARNING [train.py:1204] (2/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_283-160525-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 08:12:28,753 WARNING [train.py:1204] (2/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_505-97987-0 from training. Duration: 0.86 2023-10-03 08:12:30,836 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9309W0190-640317 from training. Duration: 0.768 2023-10-03 08:12:33,362 WARNING [train.py:1204] (2/4) Exclude cut with ID _50093_210_4220_1_1534417141207_3432299_280-297390-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 08:12:39,150 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.0.conv_module1.balancer2.prob, batch_count=1198200.0, ans=0.125 2023-10-03 08:12:39,187 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=1198200.0, ans=0.1 2023-10-03 08:12:39,242 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.conv_module2.balancer1.min_positive, batch_count=1198200.0, ans=0.025 2023-10-03 08:12:40,327 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_17292_210_6753_1_1531125388_2943734_320-71385-0_sp1.1 from training. Duration: 0.97275 2023-10-03 08:12:41,703 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_213-103282-0 from training. Duration: 0.78 2023-10-03 08:12:46,296 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7615_210_4211_1_1528110965_4580160_72-235211-0_sp0.9 from training. Duration: 0.8444375 2023-10-03 08:12:48,994 WARNING [train.py:1204] (2/4) Exclude cut with ID _14867_210_1968_1_1530684179890_3846969_173-320977-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 08:12:51,722 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7046_210_6753_1_1527895337_6920917_783-278109-0_sp0.9 from training. Duration: 0.8555625 2023-10-03 08:12:51,765 WARNING [train.py:1204] (2/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_90-30673-0 from training. Duration: 0.84 2023-10-03 08:12:51,789 WARNING [train.py:1204] (2/4) Exclude cut with ID _22826_210_9994_1_1531908201522_3279169_415-161-0_sp1.1 from training. Duration: 0.8909375 2023-10-03 08:12:53,677 WARNING [train.py:1204] (2/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_86-66192-0_sp0.9 from training. Duration: 0.611125 2023-10-03 08:12:53,680 WARNING [train.py:1204] (2/4) Exclude cut with ID _4626_210_3532_1_1526817652717_3916859_446-206656-0_sp1.1 from training. Duration: 0.6909375 2023-10-03 08:12:53,738 WARNING [train.py:1204] (2/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_166-128941-0_sp0.9 from training. Duration: 0.6 2023-10-03 08:12:56,597 WARNING [train.py:1204] (2/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_345-167986-0 from training. Duration: 0.84 2023-10-03 08:12:59,955 WARNING [train.py:1204] (2/4) Exclude cut with ID _6275_210_2084_1_1528110062213_7669049_973-198085-0 from training. Duration: 0.75 2023-10-03 08:13:01,262 WARNING [train.py:1204] (2/4) Exclude cut with ID _60057_210_10640_1_1534903065793_3333150_171-181133-0 from training. Duration: 0.88 2023-10-03 08:13:01,276 WARNING [train.py:1204] (2/4) Exclude cut with ID _19504_210_9994_1_1531648863242_3325220_336-196513-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 08:13:03,173 WARNING [train.py:1204] (2/4) Exclude cut with ID _6493_210_1747_1_1528459210424_4012470_513-140967-0 from training. Duration: 0.79 2023-10-03 08:13:04,503 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6673_210_3273_1_1527767790_3173240_327-306185-0_sp0.9 from training. Duration: 0.92225 2023-10-03 08:13:07,171 INFO [train.py:1046] (2/4) Epoch 34, batch 4450, loss[loss=0.1574, simple_loss=0.2436, pruned_loss=0.03555, over 24643.00 frames. ], tot_loss[loss=0.1615, simple_loss=0.2405, pruned_loss=0.04122, over 4721318.70 frames. ], batch size: 68, lr: 2.98e-03, grad_scale: 32.0 2023-10-03 08:13:07,299 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1032-236863-0_sp1.1 from training. Duration: 0.763625 2023-10-03 08:13:08,719 WARNING [train.py:1204] (2/4) Exclude cut with ID _14867_210_1968_1_1530684179890_3846969_413-29292-0 from training. Duration: 0.9 2023-10-03 08:13:12,942 WARNING [train.py:1204] (2/4) Exclude cut with ID _47251_210_18628_1_1533814063128_4137250_186-343322-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 08:13:14,344 WARNING [train.py:1204] (2/4) Exclude cut with ID _56235_210_10640_1_1534557335235_4161099_624-46091-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 08:13:15,663 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_791-197891-0_sp0.9 from training. Duration: 0.9333125 2023-10-03 08:13:22,262 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_50050_210_16223_1_1535435796_7290138_786-288457-0_sp1.1 from training. Duration: 0.92725 2023-10-03 08:13:22,279 WARNING [train.py:1204] (2/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_213-227283-0_sp1.1 from training. Duration: 0.963625 2023-10-03 08:13:26,261 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.574e+02 1.851e+02 2.013e+02 2.365e+02 4.076e+02, threshold=4.026e+02, percent-clipped=1.0 2023-10-03 08:13:26,339 WARNING [train.py:1204] (2/4) Exclude cut with ID _42659_210_2070_1_1533607442765_3120380_58-341121-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 08:13:27,769 WARNING [train.py:1204] (2/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_506-23200-0_sp1.1 from training. Duration: 0.8818125 2023-10-03 08:13:27,932 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.feed_forward1.hidden_balancer.prob, batch_count=1198400.0, ans=0.125 2023-10-03 08:13:29,353 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.feed_forward2.hidden_balancer.prob, batch_count=1198400.0, ans=0.125 2023-10-03 08:13:30,473 WARNING [train.py:1204] (2/4) Exclude cut with ID _14170_210_5219_1_1530525949868_4469650_336-213530-0_sp1.1 from training. Duration: 0.8181875 2023-10-03 08:13:30,492 WARNING [train.py:1204] (2/4) Exclude cut with ID _64135_210_2253_1_1535677267876_7608972_180-5312-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 08:13:32,447 WARNING [train.py:1204] (2/4) Exclude cut with ID _12302_210_5219_1_1529924446466_8107280_835-279754-0 from training. Duration: 0.9 2023-10-03 08:13:32,449 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_20120_210_6753_1_1531643035_5482859_510-96996-0_sp1.1 from training. Duration: 0.7909375 2023-10-03 08:13:34,334 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_15517_210_6753_1_1530954008_3487336_216-187834-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 08:13:34,373 WARNING [train.py:1204] (2/4) Exclude cut with ID _19504_210_9994_1_1531648863242_3325220_414-113427-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 08:13:34,374 WARNING [train.py:1204] (2/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_264-108685-0_sp0.9 from training. Duration: 0.611125 2023-10-03 08:13:37,222 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_5438_210_1794_1_1527336687_2600009_104-288274-0_sp0.9 from training. Duration: 0.6444375 2023-10-03 08:13:41,445 WARNING [train.py:1204] (2/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_520-39496-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 08:13:41,486 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_37818_210_4238_1_1533620910_4720083_315-308565-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 08:13:44,128 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_2009_210_2071_1_1525481134_4588937_385-302100-0_sp1.1 from training. Duration: 0.7909375 2023-10-03 08:13:44,170 WARNING [train.py:1204] (2/4) Exclude cut with ID _58895_210_8750_1_1535184081320_3649089_277-53219-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 08:13:45,582 WARNING [train.py:1204] (2/4) Exclude cut with ID _26664_210_7770_1_1532258943280_3418270_121-111258-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 08:13:50,437 WARNING [train.py:1204] (2/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_99-167392-0_sp1.1 from training. Duration: 0.5545625 2023-10-03 08:13:51,175 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.3.encoder.layers.1.nonlin_attention.whiten1, num_groups=1, num_channels=384, metric=4.75 vs. limit=10.0 2023-10-03 08:13:51,798 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1113-7369-0 from training. Duration: 0.85 2023-10-03 08:13:51,813 WARNING [train.py:1204] (2/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_352-59958-0 from training. Duration: 0.86 2023-10-03 08:13:51,822 WARNING [train.py:1204] (2/4) Exclude cut with ID _14886_210_4220_1_1530702020355_3516190_159-73748-0_sp1.1 from training. Duration: 0.7545625 2023-10-03 08:13:56,387 WARNING [train.py:1204] (2/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_131-119569-0_sp1.1 from training. Duration: 0.92725 2023-10-03 08:13:56,472 WARNING [train.py:1204] (2/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_216-343007-0 from training. Duration: 0.66 2023-10-03 08:13:59,254 WARNING [train.py:1204] (2/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_113-92608-0_sp0.9 from training. Duration: 0.6 2023-10-03 08:14:02,121 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_40047_210_2290_1_1533290519_2374311_132-123271-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 08:14:03,893 WARNING [train.py:1204] (2/4) Exclude cut with ID _8793_210_6310_1_1528542985139_2310009_121-179782-0 from training. Duration: 0.8 2023-10-03 08:14:03,909 WARNING [train.py:1204] (2/4) Exclude cut with ID _43997_210_4525_1_1533538632555_5189329_283-218639-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 08:14:03,912 WARNING [train.py:1204] (2/4) Exclude cut with ID _49376_210_5986_1_1533985278732_3370740_297-78397-0_sp1.1 from training. Duration: 0.936375 2023-10-03 08:14:03,927 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_3475_210_2028_1_1526193578_3622569_377-166133-0_sp0.9 from training. Duration: 0.9555625 2023-10-03 08:14:03,942 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_520-213149-0_sp1.1 from training. Duration: 0.92725 2023-10-03 08:14:05,784 WARNING [train.py:1204] (2/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_567-144483-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 08:14:09,817 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_4183_210_2087_1_1526729100_1869838_143-280962-0_sp0.9 from training. Duration: 0.62225 2023-10-03 08:14:09,863 WARNING [train.py:1204] (2/4) Exclude cut with ID _49643_210_7989_1_1534640244224_3939031_611-185515-0 from training. Duration: 0.94 2023-10-03 08:14:11,264 WARNING [train.py:1204] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_88-341718-0_sp0.9 from training. Duration: 0.7333125 2023-10-03 08:14:12,666 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_51225_210_3601_1_1534400634_2327383_49-235441-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 08:14:15,303 WARNING [train.py:1204] (2/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_240-294400-0_sp1.1 from training. Duration: 0.936375 2023-10-03 08:14:15,410 WARNING [train.py:1204] (2/4) Exclude cut with ID _55429_210_10626_1_1534467588527_7174143_640-304825-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 08:14:16,699 WARNING [train.py:1204] (2/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_187-221566-0_sp1.1 from training. Duration: 0.3909375 2023-10-03 08:14:18,559 WARNING [train.py:1204] (2/4) Exclude cut with ID _7606_210_4915_1_1528894253277_5080690_127-177761-0_sp0.9 from training. Duration: 0.711125 2023-10-03 08:14:20,190 WARNING [train.py:1204] (2/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_430-116139-0 from training. Duration: 0.85 2023-10-03 08:14:21,766 INFO [train.py:1046] (2/4) Epoch 34, batch 4500, loss[loss=0.1517, simple_loss=0.2364, pruned_loss=0.03349, over 24313.00 frames. ], tot_loss[loss=0.1618, simple_loss=0.2409, pruned_loss=0.04139, over 4720959.84 frames. ], batch size: 61, lr: 2.98e-03, grad_scale: 16.0 2023-10-03 08:14:21,887 WARNING [train.py:1204] (2/4) Exclude cut with ID _3675_210_4557_1_1526639934408_4182089_456-18305-0_sp0.9 from training. Duration: 0.8444375 2023-10-03 08:14:25,229 WARNING [train.py:1204] (2/4) Exclude cut with ID _18513_210_9206_1_1531618215223_3862620_402-5957-0_sp1.1 from training. Duration: 0.9 2023-10-03 08:14:27,789 WARNING [train.py:1204] (2/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_465-156500-0 from training. Duration: 0.72 2023-10-03 08:14:27,790 WARNING [train.py:1204] (2/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_714-310538-0 from training. Duration: 0.86 2023-10-03 08:14:27,919 WARNING [train.py:1204] (2/4) Exclude cut with ID _39411_210_5986_1_1533538825030_3343649_335-301187-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 08:14:30,654 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.feed_forward2.hidden_balancer.prob, batch_count=1198666.6666666667, ans=0.125 2023-10-03 08:14:35,649 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_41750_210_1960_1_1533520678_3948042_241-158616-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 08:14:35,696 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_46013_210_7234_1_1533776362_3744032_50-141535-0_sp1.1 from training. Duration: 0.9 2023-10-03 08:14:37,079 WARNING [train.py:1204] (2/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_43-206757-0_sp0.9 from training. Duration: 0.8333125 2023-10-03 08:14:37,139 WARNING [train.py:1204] (2/4) Exclude cut with ID _22961_210_8254_1_1531976366672_4278180_172-276296-0_sp1.1 from training. Duration: 0.97275 2023-10-03 08:14:37,160 WARNING [train.py:1204] (2/4) Exclude cut with ID _17956_210_4353_1_1531209568125_6979970_1305-301903-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 08:14:37,206 WARNING [train.py:1204] (2/4) Exclude cut with ID _28323_210_16187_1_1532480395667_4100300_604-44507-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 08:14:47,794 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.conv_module2.balancer2.prob, batch_count=1198733.3333333333, ans=0.125 2023-10-03 08:14:48,382 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.0.layers.1.conv_module1.whiten, num_groups=1, num_channels=192, metric=9.55 vs. limit=15.0 2023-10-03 08:14:48,879 WARNING [train.py:1204] (2/4) Exclude cut with ID _23514_210_1389_1_1531832139785_3031439_88-337544-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 08:14:48,947 WARNING [train.py:1204] (2/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_932-3672-0_sp1.1 from training. Duration: 0.963625 2023-10-03 08:14:50,393 WARNING [train.py:1204] (2/4) Exclude cut with ID _46502_210_13794_1_1533724452198_3657159_207-109991-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 08:14:51,672 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_216-73841-0_sp1.1 from training. Duration: 0.87275 2023-10-03 08:14:53,004 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8453_210_4211_1_1528502940_8562730_347-330673-0_sp1.1 from training. Duration: 0.6545625 2023-10-03 08:15:00,259 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_4183_210_2087_1_1526729100_1869838_143-280962-0_sp1.1 from training. Duration: 0.5090625 2023-10-03 08:15:05,016 WARNING [train.py:1204] (2/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_325-113798-0_sp1.1 from training. Duration: 0.77275 2023-10-03 08:15:08,419 WARNING [train.py:1204] (2/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_91-212486-0_sp0.9 from training. Duration: 0.7555625 2023-10-03 08:15:09,946 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8776_210_2040_1_1528638837_4088036_244-278763-0_sp1.1 from training. Duration: 0.8181875 2023-10-03 08:15:11,224 WARNING [train.py:1204] (2/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_106-286130-0 from training. Duration: 0.98 2023-10-03 08:15:11,286 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_2746_210_1988_1_1525872816_3795627_372-205861-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 08:15:11,330 WARNING [train.py:1204] (2/4) Exclude cut with ID _30800_210_22382_1_1532499347904_4515959_240-54646-0_sp1.1 from training. Duration: 0.936375 2023-10-03 08:15:12,830 WARNING [train.py:1204] (2/4) Exclude cut with ID _59500_210_8254_1_1534809610680_3736009_162-180695-0_sp1.1 from training. Duration: 0.936375 2023-10-03 08:15:14,113 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7544_210_6753_1_1528591327_1471426_146-93357-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 08:15:17,406 WARNING [train.py:1204] (2/4) Exclude cut with ID _42657_210_2070_1_1533520654016_3327008_405-87432-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 08:15:17,436 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_4528_210_3273_1_1527076654_3276777_209-219804-0 from training. Duration: 0.69 2023-10-03 08:15:17,436 WARNING [train.py:1204] (2/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_78-182912-0_sp0.9 from training. Duration: 0.6555625 2023-10-03 08:15:17,443 WARNING [train.py:1204] (2/4) Exclude cut with ID _31255_210_13427_1_1532865619495_3815143_322-114998-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 08:15:20,275 WARNING [train.py:1204] (2/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_848-280957-0_sp0.9 from training. Duration: 0.9666875 2023-10-03 08:15:21,567 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7696_210_2040_1_1528207167_3716099_117-125434-0_sp0.9 from training. Duration: 0.8444375 2023-10-03 08:15:24,752 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_1002-109192-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 08:15:26,338 WARNING [train.py:1204] (2/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_138-64210-0_sp0.9 from training. Duration: 0.811125 2023-10-03 08:15:26,359 WARNING [train.py:1204] (2/4) Exclude cut with ID _25808_210_13292_1_1532343514562_7366109_633-47524-0_sp1.1 from training. Duration: 0.9 2023-10-03 08:15:28,983 WARNING [train.py:1204] (2/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_297-5233-0 from training. Duration: 0.95 2023-10-03 08:15:30,323 WARNING [train.py:1204] (2/4) Exclude cut with ID _8394_210_6702_1_1528590504316_3540189_335-274780-0 from training. Duration: 0.78 2023-10-03 08:15:30,327 WARNING [train.py:1204] (2/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_301-330455-0 from training. Duration: 0.8 2023-10-03 08:15:33,755 WARNING [train.py:1204] (2/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_383-205833-0 from training. Duration: 0.95 2023-10-03 08:15:37,042 INFO [train.py:1046] (2/4) Epoch 34, batch 4550, loss[loss=0.1604, simple_loss=0.2406, pruned_loss=0.04005, over 24500.00 frames. ], tot_loss[loss=0.1614, simple_loss=0.2401, pruned_loss=0.04133, over 4713802.92 frames. ], batch size: 63, lr: 2.98e-03, grad_scale: 16.0 2023-10-03 08:15:37,109 WARNING [train.py:1204] (2/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_48-182155-0 from training. Duration: 0.87 2023-10-03 08:15:38,471 WARNING [train.py:1204] (2/4) Exclude cut with ID _14832_210_8750_1_1530691351539_3171348_1-54315-0_sp1.1 from training. Duration: 0.7909375 2023-10-03 08:15:41,164 WARNING [train.py:1204] (2/4) Exclude cut with ID _44982_210_1511_1_1534312820108_3726029_413-100814-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 08:15:41,226 WARNING [train.py:1204] (2/4) Exclude cut with ID _3231_210_2106_1_1526205864934_6784220_104-261392-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 08:15:44,052 WARNING [train.py:1204] (2/4) Exclude cut with ID _24314_210_4915_1_1532135078116_7170179_1-13588-0_sp1.1 from training. Duration: 0.97275 2023-10-03 08:15:48,600 WARNING [train.py:1204] (2/4) Exclude cut with ID _44304_210_15374_1_1533639352765_4613129_141-8356-0_sp1.1 from training. Duration: 0.963625 2023-10-03 08:15:50,058 WARNING [train.py:1204] (2/4) Exclude cut with ID _37892_210_19596_1_1533198677897_3964058_374-335034-0_sp1.1 from training. Duration: 0.936375 2023-10-03 08:15:51,487 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_195-257512-0_sp0.9 from training. Duration: 0.7444375 2023-10-03 08:15:51,489 WARNING [train.py:1204] (2/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_1267-272693-0_sp1.1 from training. Duration: 0.663625 2023-10-03 08:15:51,490 WARNING [train.py:1204] (2/4) Exclude cut with ID _30196_210_1664_1_1533708496522_7074140_690-326155-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 08:15:54,572 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_68847_210_14656_1_1536070214_2586843_38-20717-0_sp1.1 from training. Duration: 0.97275 2023-10-03 08:15:54,626 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_99-251314-0_sp1.1 from training. Duration: 0.7909375 2023-10-03 08:15:57,246 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.584e+02 1.900e+02 2.073e+02 2.296e+02 3.311e+02, threshold=4.145e+02, percent-clipped=0.0 2023-10-03 08:15:58,765 WARNING [train.py:1204] (2/4) Exclude cut with ID _49376_210_5986_1_1533985278732_3370740_225-95630-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 08:16:00,202 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_2655_210_2087_1_1525688959_3897400_202-147557-0 from training. Duration: 0.85 2023-10-03 08:16:00,253 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_5618_210_1874_1_1527311008_5851_1-214501-0_sp1.1 from training. Duration: 0.626375 2023-10-03 08:16:01,614 WARNING [train.py:1204] (2/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_225-43381-0_sp1.1 from training. Duration: 0.8545625 2023-10-03 08:16:03,622 WARNING [train.py:1204] (2/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_242-3472-0 from training. Duration: 0.73 2023-10-03 08:16:04,526 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.bypass.scale_min, batch_count=1199066.6666666667, ans=0.2 2023-10-03 08:16:06,961 WARNING [train.py:1204] (2/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_339-104791-0 from training. Duration: 0.49 2023-10-03 08:16:08,335 WARNING [train.py:1204] (2/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_488-92261-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 08:16:11,013 WARNING [train.py:1204] (2/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_175-163894-0 from training. Duration: 0.74 2023-10-03 08:16:12,467 WARNING [train.py:1204] (2/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_41-234353-0_sp0.9 from training. Duration: 0.7555625 2023-10-03 08:16:15,778 WARNING [train.py:1204] (2/4) Exclude cut with ID _47251_210_18628_1_1533814063128_4137250_116-275927-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 08:16:15,803 WARNING [train.py:1204] (2/4) Exclude cut with ID _4697_210_2069_1_1526815801953_3823098_61-223324-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 08:16:15,822 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6961_210_2087_1_1527937168_6815011_562-165518-0_sp1.1 from training. Duration: 0.7 2023-10-03 08:16:18,518 WARNING [train.py:1204] (2/4) Exclude cut with ID _21989_210_5854_1_1531899214931_3271360_133-44759-0 from training. Duration: 0.77 2023-10-03 08:16:19,927 WARNING [train.py:1204] (2/4) Exclude cut with ID _23557_210_8750_1_1531890106370_6846255_77-284923-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 08:16:24,450 WARNING [train.py:1204] (2/4) Exclude cut with ID _13678_210_7497_1_1530530069226_3463170_464-296127-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 08:16:24,463 WARNING [train.py:1204] (2/4) Exclude cut with ID _22613_210_5219_1_1532253599686_4090170_393-36663-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 08:16:25,791 WARNING [train.py:1204] (2/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_235-262124-0_sp0.9 from training. Duration: 0.7444375 2023-10-03 08:16:25,877 WARNING [train.py:1204] (2/4) Exclude cut with ID _8480_210_1874_1_1528589391205_6728330_755-112625-0 from training. Duration: 0.74 2023-10-03 08:16:27,349 WARNING [train.py:1204] (2/4) Exclude cut with ID _6456_210_1534_1_1527681634212_3181049_215-309359-0 from training. Duration: 0.88 2023-10-03 08:16:27,371 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_3019_210_2028_1_1526044245_3264865_191-72235-0_sp1.1 from training. Duration: 0.8818125 2023-10-03 08:16:27,437 WARNING [train.py:1204] (2/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_380-162648-0 from training. Duration: 0.94 2023-10-03 08:16:30,182 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_92-46009-0 from training. Duration: 0.54 2023-10-03 08:16:30,215 WARNING [train.py:1204] (2/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_53-300544-0_sp0.9 from training. Duration: 0.7444375 2023-10-03 08:16:31,585 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_68235_210_1925_1_1535863247_4484656_101-250943-0_sp1.1 from training. Duration: 0.97275 2023-10-03 08:16:31,601 WARNING [train.py:1204] (2/4) Exclude cut with ID _14970_210_15709_1_1530775547361_1966965_204-22182-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 08:16:32,936 WARNING [train.py:1204] (2/4) Exclude cut with ID _55153_210_2253_1_1534384786677_3162096_310-130092-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 08:16:34,118 WARNING [train.py:1204] (2/4) Exclude cut with ID _60113_210_16271_1_1535164212481_4386079_559-167191-0_sp1.1 from training. Duration: 0.8454375 2023-10-03 08:16:36,087 WARNING [train.py:1204] (2/4) Exclude cut with ID _14368_210_4220_1_1530532714870_3595529_93-58002-0_sp1.1 from training. Duration: 0.5181875 2023-10-03 08:16:36,149 WARNING [train.py:1204] (2/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_110-224688-0 from training. Duration: 0.97 2023-10-03 08:16:37,605 WARNING [train.py:1204] (2/4) Exclude cut with ID _40402_210_18219_1_1533522324115_2953150_178-178004-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 08:16:37,616 WARNING [train.py:1204] (2/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_321-335475-0_sp1.1 from training. Duration: 0.4090625 2023-10-03 08:16:38,949 WARNING [train.py:1204] (2/4) Exclude cut with ID _66863_210_18053_1_1535698758334_8590319_1092-271784-0 from training. Duration: 0.96 2023-10-03 08:16:38,955 WARNING [train.py:1204] (2/4) Exclude cut with ID _28317_210_12742_1_1532480302381_8510325_607-213951-0_sp1.1 from training. Duration: 0.763625 2023-10-03 08:16:38,963 WARNING [train.py:1204] (2/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_468-283113-0 from training. Duration: 0.96 2023-10-03 08:16:40,571 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.ff3_skip_rate, batch_count=1199266.6666666667, ans=0.0 2023-10-03 08:16:42,881 WARNING [train.py:1204] (2/4) Exclude cut with ID _14922_210_6228_1_1530707435273_3693310_13-165512-0_sp1.1 from training. Duration: 0.7454375 2023-10-03 08:16:42,901 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_69630_210_7234_1_1535788670_910989_56-169762-0_sp1.1 from training. Duration: 0.92725 2023-10-03 08:16:46,176 WARNING [train.py:1204] (2/4) Exclude cut with ID _67335_210_5219_1_1535525975652_4275229_121-80357-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 08:16:46,219 WARNING [train.py:1204] (2/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_126-12079-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 08:16:46,260 WARNING [train.py:1204] (2/4) Exclude cut with ID _5294_210_1747_1_1527249643928_3696179_342-270278-0_sp1.1 from training. Duration: 0.5 2023-10-03 08:16:47,580 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_30322_210_1925_1_1532843478_826817_35-328264-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 08:16:50,172 INFO [train.py:1046] (2/4) Epoch 34, batch 4600, loss[loss=0.1607, simple_loss=0.237, pruned_loss=0.0422, over 23764.00 frames. ], tot_loss[loss=0.1608, simple_loss=0.2391, pruned_loss=0.04125, over 4710742.50 frames. ], batch size: 149, lr: 2.98e-03, grad_scale: 16.0 2023-10-03 08:16:50,249 WARNING [train.py:1204] (2/4) Exclude cut with ID _8480_210_1874_1_1528589391205_6728330_755-112625-0_sp1.1 from training. Duration: 0.67275 2023-10-03 08:16:52,968 WARNING [train.py:1204] (2/4) Exclude cut with ID _38670_210_15379_1_1533448810659_4391059_132-146412-0_sp1.1 from training. Duration: 0.97275 2023-10-03 08:16:54,416 WARNING [train.py:1204] (2/4) Exclude cut with ID _22020_210_2461_1_1531724164513_6326900_563-39691-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 08:16:54,672 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.conv_module2.balancer1.prob, batch_count=1199333.3333333333, ans=0.125 2023-10-03 08:16:57,755 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_9460_210_4211_1_1528954587_10049667_572-37035-0_sp1.1 from training. Duration: 0.72725 2023-10-03 08:16:57,773 WARNING [train.py:1204] (2/4) Exclude cut with ID _68777_210_4353_1_1535810390731_3117904_129-166012-0_sp0.9 from training. Duration: 0.9444375 2023-10-03 08:16:59,100 WARNING [train.py:1204] (2/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_487-91921-0_sp1.1 from training. Duration: 0.963625 2023-10-03 08:17:00,566 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_195-257512-0 from training. Duration: 0.67 2023-10-03 08:17:01,925 WARNING [train.py:1204] (2/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_188-260011-0_sp1.1 from training. Duration: 0.663625 2023-10-03 08:17:02,100 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.bypass.scale_min, batch_count=1199333.3333333333, ans=0.2 2023-10-03 08:17:04,799 WARNING [train.py:1204] (2/4) Exclude cut with ID _8480_210_1874_1_1528589391205_6728330_309-139896-0_sp0.9 from training. Duration: 0.9 2023-10-03 08:17:06,107 WARNING [train.py:1204] (2/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_866-321248-0_sp1.1 from training. Duration: 0.963625 2023-10-03 08:17:08,111 WARNING [train.py:1204] (2/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_510-216513-0_sp1.1 from training. Duration: 0.97275 2023-10-03 08:17:12,423 WARNING [train.py:1204] (2/4) Exclude cut with ID _6653_210_2629_1_1527919172441_6732679_758-143677-0 from training. Duration: 0.98 2023-10-03 08:17:14,342 WARNING [train.py:1204] (2/4) Exclude cut with ID _55134_210_21982_1_1534590028377_3882170_50-214427-0_sp1.1 from training. Duration: 0.97275 2023-10-03 08:17:17,195 WARNING [train.py:1204] (2/4) Exclude cut with ID _31649_210_6228_1_1532584608063_4091078_482-301330-0_sp1.1 from training. Duration: 0.97275 2023-10-03 08:17:19,887 WARNING [train.py:1204] (2/4) Exclude cut with ID _35799_210_5854_1_1533263487887_3529071_490-326847-0_sp1.1 from training. Duration: 0.9 2023-10-03 08:17:19,897 WARNING [train.py:1204] (2/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_984-90716-0_sp1.1 from training. Duration: 0.963625 2023-10-03 08:17:24,206 WARNING [train.py:1204] (2/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_166-246557-0 from training. Duration: 0.64 2023-10-03 08:17:24,207 WARNING [train.py:1204] (2/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_51-200220-0_sp1.1 from training. Duration: 0.5090625 2023-10-03 08:17:25,561 WARNING [train.py:1204] (2/4) Exclude cut with ID _51382_210_23862_1_1534492801838_3650104_275-92796-0_sp1.1 from training. Duration: 0.936375 2023-10-03 08:17:31,554 WARNING [train.py:1204] (2/4) Exclude cut with ID _29853_210_7497_1_1532505600851_6642110_461-142829-0_sp1.1 from training. Duration: 0.97275 2023-10-03 08:17:31,602 WARNING [train.py:1204] (2/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_788-298907-0_sp0.9 from training. Duration: 0.988875 2023-10-03 08:17:32,990 WARNING [train.py:1204] (2/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_739-2098-0_sp1.1 from training. Duration: 0.836375 2023-10-03 08:17:36,269 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7543_210_6753_1_1528541907_1399322_165-323619-0 from training. Duration: 0.69 2023-10-03 08:17:38,192 WARNING [train.py:1204] (2/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_401-255491-0_sp1.1 from training. Duration: 0.636375 2023-10-03 08:17:42,781 WARNING [train.py:1204] (2/4) Exclude cut with ID _66873_210_5517_1_1535454136506_4293370_24-143283-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 08:17:43,620 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.2.encoder.layers.1.nonlin_attention.whiten2, num_groups=1, num_channels=384, metric=5.85 vs. limit=15.0 2023-10-03 08:17:44,093 WARNING [train.py:1204] (2/4) Exclude cut with ID _14459_210_2228_1_1531443641946_7516700_1221-280439-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 08:17:45,642 WARNING [train.py:1204] (2/4) Exclude cut with ID _59202_210_26984_1_1534816810612_3549174_469-213482-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 08:17:45,643 WARNING [train.py:1204] (2/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_148-158509-0_sp0.9 from training. Duration: 0.4555625 2023-10-03 08:17:46,942 WARNING [train.py:1204] (2/4) Exclude cut with ID _24314_210_4915_1_1532135078116_7170179_863-259095-0_sp1.1 from training. Duration: 0.97275 2023-10-03 08:17:47,004 WARNING [train.py:1204] (2/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_321-22821-0 from training. Duration: 0.89 2023-10-03 08:17:47,022 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_43831_210_2158_1_1533725502_5872791_55-163137-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 08:17:48,291 WARNING [train.py:1204] (2/4) Exclude cut with ID _27430_210_7770_1_1532604689375_1378590_26-25207-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 08:17:49,675 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_17613_210_2151_1_1531137569_2366160_223-316461-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 08:17:51,030 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_53679_210_4852_1_1534556726_4186141_541-213370-0_sp1.1 from training. Duration: 0.963625 2023-10-03 08:17:51,095 WARNING [train.py:1204] (2/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_369-295086-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 08:17:52,382 WARNING [train.py:1204] (2/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_91-267696-0 from training. Duration: 0.87 2023-10-03 08:17:52,434 WARNING [train.py:1204] (2/4) Exclude cut with ID _5294_210_1747_1_1527249643928_3696179_180-100713-0 from training. Duration: 0.81 2023-10-03 08:17:53,746 WARNING [train.py:1204] (2/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_32-335103-0 from training. Duration: 0.82 2023-10-03 08:17:53,752 WARNING [train.py:1204] (2/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_133-312727-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 08:17:53,843 WARNING [train.py:1204] (2/4) Exclude cut with ID _59288_210_5854_1_1535077757774_3412246_391-165934-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 08:17:55,162 WARNING [train.py:1204] (2/4) Exclude cut with ID _28323_210_16187_1_1532480395667_4100300_488-1281-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 08:17:56,590 WARNING [train.py:1204] (2/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_124-193207-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 08:18:03,967 INFO [train.py:1046] (2/4) Epoch 34, batch 4650, loss[loss=0.165, simple_loss=0.2507, pruned_loss=0.03964, over 24473.00 frames. ], tot_loss[loss=0.1603, simple_loss=0.2384, pruned_loss=0.04107, over 4704788.62 frames. ], batch size: 69, lr: 2.98e-03, grad_scale: 16.0 2023-10-03 08:18:07,432 WARNING [train.py:1204] (2/4) Exclude cut with ID _14087_210_2253_1_1530529224512_3014029_226-242140-0_sp0.9 from training. Duration: 0.9 2023-10-03 08:18:10,096 WARNING [train.py:1204] (2/4) Exclude cut with ID _29277_210_3279_1_1532581109293_3173069_392-3694-0_sp1.1 from training. Duration: 0.936375 2023-10-03 08:18:10,131 WARNING [train.py:1204] (2/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_563-329842-0_sp1.1 from training. Duration: 0.97275 2023-10-03 08:18:12,143 WARNING [train.py:1204] (2/4) Exclude cut with ID _58583_210_5236_1_1534726835266_3574249_391-87755-0_sp1.1 from training. Duration: 0.92725 2023-10-03 08:18:12,161 WARNING [train.py:1204] (2/4) Exclude cut with ID _28427_210_4220_1_1532581240967_3061069_281-237827-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 08:18:12,181 WARNING [train.py:1204] (2/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_44-147898-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 08:18:13,538 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_24007_210_1794_1_1531918925_1663875_125-320772-0_sp1.1 from training. Duration: 0.97275 2023-10-03 08:18:17,619 WARNING [train.py:1204] (2/4) Exclude cut with ID _14368_210_4220_1_1530532714870_3595529_143-117421-0 from training. Duration: 0.58 2023-10-03 08:18:20,427 WARNING [train.py:1204] (2/4) Exclude cut with ID _69584_210_5219_1_1535785133775_4044450_325-7656-0_sp0.9 from training. Duration: 0.9555625 2023-10-03 08:18:21,808 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_37351_210_7925_1_1533012931_3812113_265-85780-0 from training. Duration: 0.88 2023-10-03 08:18:21,829 WARNING [train.py:1204] (2/4) Exclude cut with ID _18516_210_9206_1_1531704653206_3692406_331-237896-0_sp1.1 from training. Duration: 0.936375 2023-10-03 08:18:23,119 WARNING [train.py:1204] (2/4) Exclude cut with ID _14629_210_15709_1_1530604298182_956899_6-232695-0 from training. Duration: 0.94 2023-10-03 08:18:23,140 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7544_210_6753_1_1528587000_691468_87-178599-0_sp1.1 from training. Duration: 0.7909375 2023-10-03 08:18:23,206 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_326-303945-0 from training. Duration: 0.99 2023-10-03 08:18:24,433 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.454e+02 1.812e+02 2.012e+02 2.227e+02 3.293e+02, threshold=4.023e+02, percent-clipped=0.0 2023-10-03 08:18:24,512 WARNING [train.py:1204] (2/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_503-30719-0 from training. Duration: 0.98 2023-10-03 08:18:24,520 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_11305_210_1794_1_1529750428_7178148_299-344640-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 08:18:24,581 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_53071_210_16554_1_1534490405_2954066_265-170472-0_sp1.1 from training. Duration: 0.7545625 2023-10-03 08:18:28,624 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_174-277364-0_sp1.1 from training. Duration: 0.6454375 2023-10-03 08:18:30,116 WARNING [train.py:1204] (2/4) Exclude cut with ID _58414_210_8254_1_1535421609162_7151182_193-151884-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 08:18:30,133 WARNING [train.py:1204] (2/4) Exclude cut with ID IC0651W0104-322063 from training. Duration: 0.768 2023-10-03 08:18:33,302 WARNING [train.py:1204] (2/4) Exclude cut with ID _21881_210_3525_1_1531825242757_3798380_139-305982-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 08:18:34,741 WARNING [train.py:1204] (2/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_79-200392-0 from training. Duration: 0.97 2023-10-03 08:18:36,257 WARNING [train.py:1204] (2/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_451-280894-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 08:18:36,273 WARNING [train.py:1204] (2/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_304-273433-0_sp1.1 from training. Duration: 0.9 2023-10-03 08:18:38,160 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_5959_210_1794_1_1527400400_3445726_412-44052-0 from training. Duration: 0.77 2023-10-03 08:18:39,430 WARNING [train.py:1204] (2/4) Exclude cut with ID _4681_210_3525_1_1527251040321_1235713_148-26159-0_sp1.1 from training. Duration: 0.963625 2023-10-03 08:18:42,700 WARNING [train.py:1204] (2/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_887-249555-0_sp0.9 from training. Duration: 0.8555625 2023-10-03 08:18:45,923 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_25236_210_2158_1_1532051968_280567_37-6860-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 08:18:51,599 WARNING [train.py:1204] (2/4) Exclude cut with ID _29277_210_3279_1_1532581109293_3173069_417-115527-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 08:18:54,381 WARNING [train.py:1204] (2/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_552-134748-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 08:18:54,446 WARNING [train.py:1204] (2/4) Exclude cut with ID _43176_210_8254_1_1533524417469_4174339_188-174143-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 08:18:55,797 WARNING [train.py:1204] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_127-119109-0_sp1.1 from training. Duration: 0.6545625 2023-10-03 08:18:57,238 WARNING [train.py:1204] (2/4) Exclude cut with ID _14922_210_6228_1_1530707435273_3693310_38-187851-0 from training. Duration: 0.62 2023-10-03 08:18:57,347 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.0.feed_forward2.hidden_balancer.prob, batch_count=1199866.6666666667, ans=0.125 2023-10-03 08:18:58,529 WARNING [train.py:1204] (2/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_588-125165-0 from training. Duration: 0.83 2023-10-03 08:18:58,596 WARNING [train.py:1204] (2/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_344-152526-0_sp1.1 from training. Duration: 0.4818125 2023-10-03 08:18:58,597 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7002_210_1794_1_1527935634_4034444_487-236501-0 from training. Duration: 0.66 2023-10-03 08:18:59,932 WARNING [train.py:1204] (2/4) Exclude cut with ID _23825_210_13449_1_1531915313526_3497150_379-16981-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 08:19:01,488 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.0.balancer2.prob, batch_count=1199933.3333333333, ans=0.125 2023-10-03 08:19:06,061 WARNING [train.py:1204] (2/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_297-5233-0_sp1.1 from training. Duration: 0.863625 2023-10-03 08:19:06,074 WARNING [train.py:1204] (2/4) Exclude cut with ID _41298_210_9019_1_1533430371450_4822143_178-283538-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 08:19:06,108 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7046_210_6753_1_1527895337_6920917_642-106799-0 from training. Duration: 0.92 2023-10-03 08:19:07,341 WARNING [train.py:1204] (2/4) Exclude cut with ID _6274_210_2084_1_1527937583837_7494020_532-291952-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 08:19:07,461 WARNING [train.py:1204] (2/4) Exclude cut with ID _47278_210_12041_1_1533902418349_3576110_260-270468-0_sp1.1 from training. Duration: 0.92725 2023-10-03 08:19:08,707 WARNING [train.py:1204] (2/4) Exclude cut with ID _14368_210_4220_1_1530532714870_3595529_144-3287-0_sp0.9 from training. Duration: 0.7444375 2023-10-03 08:19:08,823 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_162-262717-0_sp0.9 from training. Duration: 0.92225 2023-10-03 08:19:11,819 WARNING [train.py:1204] (2/4) Exclude cut with ID _3465_210_3525_1_1526175667060_3574049_62-335552-0_sp0.9 from training. Duration: 0.9666875 2023-10-03 08:19:11,825 WARNING [train.py:1204] (2/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_726-272852-0_sp1.1 from training. Duration: 0.92725 2023-10-03 08:19:13,702 WARNING [train.py:1204] (2/4) Exclude cut with ID _14898_210_9206_1_1530766812377_4177190_26-135492-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 08:19:17,139 WARNING [train.py:1204] (2/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_200-95304-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 08:19:20,903 INFO [train.py:1046] (2/4) Epoch 34, batch 4700, loss[loss=0.1647, simple_loss=0.2452, pruned_loss=0.04211, over 23417.00 frames. ], tot_loss[loss=0.1604, simple_loss=0.2391, pruned_loss=0.04085, over 4714229.19 frames. ], batch size: 119, lr: 2.98e-03, grad_scale: 16.0 2023-10-03 08:19:20,981 WARNING [train.py:1204] (2/4) Exclude cut with ID _45012_210_12651_1_1533608435136_5371380_240-37101-0_sp1.1 from training. Duration: 0.8454375 2023-10-03 08:19:20,989 WARNING [train.py:1204] (2/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_132-269633-0_sp0.9 from training. Duration: 0.7555625 2023-10-03 08:19:22,327 WARNING [train.py:1204] (2/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_907-151398-0_sp0.9 from training. Duration: 0.411125 2023-10-03 08:19:22,409 WARNING [train.py:1204] (2/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_45-265684-0_sp0.9 from training. Duration: 0.8 2023-10-03 08:19:23,739 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6291_210_3273_1_1527683649_1078498_74-292608-0 from training. Duration: 0.72 2023-10-03 08:19:30,616 WARNING [train.py:1204] (2/4) Exclude cut with ID _59202_210_26984_1_1534816810612_3549174_221-84819-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 08:19:30,679 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_33696_210_3852_1_1533082780_2663375_181-325595-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 08:19:31,997 WARNING [train.py:1204] (2/4) Exclude cut with ID _47251_210_18628_1_1533814063128_4137250_351-154105-0_sp1.1 from training. Duration: 0.963625 2023-10-03 08:19:32,091 WARNING [train.py:1204] (2/4) Exclude cut with ID _35501_210_9288_1_1532998507986_3192189_351-334740-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 08:19:33,932 WARNING [train.py:1204] (2/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_342-100157-0_sp1.1 from training. Duration: 0.4909375 2023-10-03 08:19:39,409 WARNING [train.py:1204] (2/4) Exclude cut with ID _6789_210_1534_1_1527857937218_3118950_262-48853-0 from training. Duration: 0.56 2023-10-03 08:19:39,443 WARNING [train.py:1204] (2/4) Exclude cut with ID _69884_210_9771_1_1535846392818_4166169_430-201020-0 from training. Duration: 0.97 2023-10-03 08:19:39,636 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.1.feed_forward1.out_proj.dropout_p, batch_count=1200066.6666666667, ans=0.1 2023-10-03 08:19:41,546 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_71-136518-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 08:19:43,124 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.5.encoder.layers.1.conv_module2.whiten, num_groups=1, num_channels=256, metric=5.41 vs. limit=15.0 2023-10-03 08:19:44,357 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_28-291982-0_sp1.1 from training. Duration: 0.8818125 2023-10-03 08:19:44,391 WARNING [train.py:1204] (2/4) Exclude cut with ID _54218_210_12210_1_1534413646388_4409259_437-275326-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 08:19:47,322 WARNING [train.py:1204] (2/4) Exclude cut with ID _55134_210_21982_1_1534590028377_3882170_40-309139-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 08:19:54,517 WARNING [train.py:1204] (2/4) Exclude cut with ID _32704_210_8954_1_1533017488840_6747370_266-329755-0_sp1.1 from training. Duration: 0.8090625 2023-10-03 08:19:54,609 WARNING [train.py:1204] (2/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_21-174514-0_sp1.1 from training. Duration: 0.3909375 2023-10-03 08:19:57,515 WARNING [train.py:1204] (2/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_409-207378-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 08:20:01,792 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=1200133.3333333333, ans=0.1 2023-10-03 08:20:04,668 WARNING [train.py:1204] (2/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_132-269633-0 from training. Duration: 0.68 2023-10-03 08:20:05,004 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.conv_module2.balancer2.prob, batch_count=1200200.0, ans=0.125 2023-10-03 08:20:06,114 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_2245_210_2715_1_1525511548_3540254_310-52483-0_sp0.9 from training. Duration: 0.97775 2023-10-03 08:20:07,561 WARNING [train.py:1204] (2/4) Exclude cut with ID _27430_210_7770_1_1532604689375_1378590_47-77403-0_sp1.1 from training. Duration: 0.92725 2023-10-03 08:20:07,786 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.conv_module1.balancer2.prob, batch_count=1200200.0, ans=0.125 2023-10-03 08:20:10,376 WARNING [train.py:1204] (2/4) Exclude cut with ID _7562_210_2084_1_1528887767916_3946229_186-326422-0 from training. Duration: 0.84 2023-10-03 08:20:11,799 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_14056_210_4163_1_1530604483_7572827_230-35993-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 08:20:12,049 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.nonlin_attention.balancer.prob, batch_count=1200200.0, ans=0.125 2023-10-03 08:20:16,403 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_23854_210_1794_1_1531969696_1329157_57-123491-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 08:20:17,793 WARNING [train.py:1204] (2/4) Exclude cut with ID _14747_210_8341_1_1530685797463_3654089_147-131229-0 from training. Duration: 0.76 2023-10-03 08:20:19,736 WARNING [train.py:1204] (2/4) Exclude cut with ID _30191_210_1664_1_1533189915047_7196670_430-19539-0_sp1.1 from training. Duration: 0.92725 2023-10-03 08:20:19,756 WARNING [train.py:1204] (2/4) Exclude cut with ID _67725_210_27767_1_1535608756671_5105359_44-50762-0_sp1.1 from training. Duration: 0.963625 2023-10-03 08:20:19,965 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.conv_module2.balancer2.prob, batch_count=1200266.6666666667, ans=0.125 2023-10-03 08:20:22,537 WARNING [train.py:1204] (2/4) Exclude cut with ID _27485_210_4916_1_1532739191677_3668269_217-161755-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 08:20:22,580 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_5931_210_2040_1_1527428481_4547999_321-193614-0_sp1.1 from training. Duration: 0.6090625 2023-10-03 08:20:22,599 WARNING [train.py:1204] (2/4) Exclude cut with ID _6267_210_2084_1_1527764658526_3894730_375-219887-0 from training. Duration: 0.67 2023-10-03 08:20:24,000 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9087W0022-538393 from training. Duration: 0.768 2023-10-03 08:20:25,387 WARNING [train.py:1204] (2/4) Exclude cut with ID _68777_210_4353_1_1535810390731_3117904_83-196459-0_sp1.1 from training. Duration: 0.963625 2023-10-03 08:20:25,688 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.conv_module2.balancer2.prob, batch_count=1200266.6666666667, ans=0.125 2023-10-03 08:20:25,858 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.5.encoder.layers.1.conv_module2.whiten, num_groups=1, num_channels=256, metric=5.72 vs. limit=15.0 2023-10-03 08:20:28,060 WARNING [train.py:1204] (2/4) Exclude cut with ID _69884_210_9771_1_1535846392818_4166169_455-343812-0_sp1.1 from training. Duration: 0.92725 2023-10-03 08:20:28,061 WARNING [train.py:1204] (2/4) Exclude cut with ID _13916_210_9994_1_1530788341126_3763149_414-320174-0_sp1.1 from training. Duration: 0.92725 2023-10-03 08:20:28,065 WARNING [train.py:1204] (2/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_398-239819-0 from training. Duration: 0.99 2023-10-03 08:20:28,173 WARNING [train.py:1204] (2/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_199-232314-0_sp1.1 from training. Duration: 0.92725 2023-10-03 08:20:30,192 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.2.encoder.layers.0.feed_forward2.out_whiten, num_groups=1, num_channels=384, metric=6.05 vs. limit=15.0 2023-10-03 08:20:32,148 WARNING [train.py:1204] (2/4) Exclude cut with ID _2334_210_2667_1_1525608238880_4705129_459-104997-0 from training. Duration: 0.95 2023-10-03 08:20:33,483 INFO [train.py:1046] (2/4) Epoch 34, batch 4750, loss[loss=0.1664, simple_loss=0.2428, pruned_loss=0.04497, over 23382.00 frames. ], tot_loss[loss=0.1608, simple_loss=0.2395, pruned_loss=0.04101, over 4714422.70 frames. ], batch size: 93, lr: 2.98e-03, grad_scale: 16.0 2023-10-03 08:20:34,915 WARNING [train.py:1204] (2/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_441-209395-0_sp1.1 from training. Duration: 0.936375 2023-10-03 08:20:34,997 WARNING [train.py:1204] (2/4) Exclude cut with ID _27052_210_5986_1_1532329197187_3585009_320-80393-0_sp1.1 from training. Duration: 0.97275 2023-10-03 08:20:39,762 WARNING [train.py:1204] (2/4) Exclude cut with ID _21783_210_11357_1_1531742458730_3762679_89-217253-0_sp1.1 from training. Duration: 0.97275 2023-10-03 08:20:41,055 WARNING [train.py:1204] (2/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_453-282588-0_sp1.1 from training. Duration: 0.8909375 2023-10-03 08:20:42,430 WARNING [train.py:1204] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_88-341718-0 from training. Duration: 0.66 2023-10-03 08:20:42,483 WARNING [train.py:1204] (2/4) Exclude cut with ID _43552_210_8913_1_1533726031059_3523429_230-3521-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 08:20:44,650 WARNING [train.py:1204] (2/4) Exclude cut with ID _9679_210_7497_1_1529129212906_4363929_512-313889-0 from training. Duration: 0.81 2023-10-03 08:20:45,368 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.2.encoder.layers.2.self_attn2.whiten, num_groups=1, num_channels=384, metric=16.73 vs. limit=22.5 2023-10-03 08:20:47,260 WARNING [train.py:1204] (2/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_194-252800-0_sp1.1 from training. Duration: 0.8818125 2023-10-03 08:20:47,291 WARNING [train.py:1204] (2/4) Exclude cut with ID _55209_210_1531_1_1534675828996_4597160_239-293541-0_sp1.1 from training. Duration: 0.963625 2023-10-03 08:20:48,663 WARNING [train.py:1204] (2/4) Exclude cut with ID _35799_210_5854_1_1533263487887_3529071_15-325131-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 08:20:55,001 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.401e+02 1.941e+02 2.055e+02 2.370e+02 3.747e+02, threshold=4.109e+02, percent-clipped=0.0 2023-10-03 08:20:55,142 WARNING [train.py:1204] (2/4) Exclude cut with ID _14100_210_8341_1_1530529320613_3678069_279-55760-0 from training. Duration: 0.93 2023-10-03 08:20:59,376 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_489-230703-0_sp1.1 from training. Duration: 0.77275 2023-10-03 08:21:00,849 WARNING [train.py:1204] (2/4) Exclude cut with ID _6694_210_4599_1_1527852637490_3965529_144-185051-0 from training. Duration: 0.96 2023-10-03 08:21:02,179 WARNING [train.py:1204] (2/4) Exclude cut with ID _28461_210_2253_1_1532771775189_3041299_1-228155-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 08:21:03,115 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.1.encoder.layers.1.conv_module1.whiten, num_groups=1, num_channels=256, metric=10.41 vs. limit=15.0 2023-10-03 08:21:04,949 WARNING [train.py:1204] (2/4) Exclude cut with ID _32704_210_8954_1_1533017488840_6747370_686-252323-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 08:21:04,951 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_40896_210_15710_1_1533433956_7100802_1075-294633-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 08:21:04,974 WARNING [train.py:1204] (2/4) Exclude cut with ID _22083_210_8254_1_1531702862439_3678970_30-92879-0_sp1.1 from training. Duration: 0.97275 2023-10-03 08:21:05,056 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9245W0121-616888 from training. Duration: 0.896 2023-10-03 08:21:05,058 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6786_210_1388_1_1527839699_4194495_378-46070-0 from training. Duration: 0.72 2023-10-03 08:21:09,796 WARNING [train.py:1204] (2/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_317-175062-0 from training. Duration: 0.82 2023-10-03 08:21:14,966 WARNING [train.py:1204] (2/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_238-89390-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 08:21:16,389 WARNING [train.py:1204] (2/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_524-310811-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 08:21:19,231 WARNING [train.py:1204] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_307-158462-0_sp0.9 from training. Duration: 0.7666875 2023-10-03 08:21:19,232 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9309W0191-640318 from training. Duration: 0.512 2023-10-03 08:21:19,237 WARNING [train.py:1204] (2/4) Exclude cut with ID _55153_210_2253_1_1534384786677_3162096_100-204617-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 08:21:22,674 WARNING [train.py:1204] (2/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_112-149213-0_sp1.1 from training. Duration: 0.863625 2023-10-03 08:21:22,864 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.self_attn_weights.pos_emb_skip_rate, batch_count=1200533.3333333333, ans=0.0 2023-10-03 08:21:27,310 WARNING [train.py:1204] (2/4) Exclude cut with ID _67831_210_12392_1_1535612464202_3092612_346-2148-0_sp1.1 from training. Duration: 0.8454375 2023-10-03 08:21:27,437 WARNING [train.py:1204] (2/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_51-117576-0_sp0.9 from training. Duration: 0.4444375 2023-10-03 08:21:28,872 WARNING [train.py:1204] (2/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_300-27235-0 from training. Duration: 0.87 2023-10-03 08:21:30,255 WARNING [train.py:1204] (2/4) Exclude cut with ID _40399_210_4353_1_1533305658194_3279406_195-116825-0_sp1.1 from training. Duration: 0.97275 2023-10-03 08:21:30,285 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7002_210_1794_1_1527939683_5157286_163-87552-0_sp0.9 from training. Duration: 0.9555625 2023-10-03 08:21:30,321 WARNING [train.py:1204] (2/4) Exclude cut with ID _19514_210_9259_1_1531530071330_5084230_560-237597-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 08:21:31,656 WARNING [train.py:1204] (2/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_41-234353-0_sp1.1 from training. Duration: 0.6181875 2023-10-03 08:21:32,950 WARNING [train.py:1204] (2/4) Exclude cut with ID _6274_210_2084_1_1527937583837_7494020_769-168302-0 from training. Duration: 0.71 2023-10-03 08:21:34,448 WARNING [train.py:1204] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_574-45541-0 from training. Duration: 0.65 2023-10-03 08:21:35,957 WARNING [train.py:1204] (2/4) Exclude cut with ID _50310_210_15488_1_1534068043345_3641080_262-67120-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 08:21:36,935 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.0.layers.0.conv_module2.whiten, num_groups=1, num_channels=192, metric=10.74 vs. limit=15.0 2023-10-03 08:21:38,726 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_53071_210_16554_1_1534493434_1276796_35-11927-0_sp1.1 from training. Duration: 0.963625 2023-10-03 08:21:38,728 WARNING [train.py:1204] (2/4) Exclude cut with ID _3992_210_1747_1_1526644816031_3731060_107-251434-0 from training. Duration: 0.99 2023-10-03 08:21:38,782 WARNING [train.py:1204] (2/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_412-54488-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 08:21:40,649 WARNING [train.py:1204] (2/4) Exclude cut with ID _18516_210_9206_1_1531704653206_3692406_77-258490-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 08:21:43,367 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_40047_210_2290_1_1533290519_2374311_195-165693-0_sp0.9 from training. Duration: 0.988875 2023-10-03 08:21:43,424 WARNING [train.py:1204] (2/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_296-36971-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 08:21:43,486 WARNING [train.py:1204] (2/4) Exclude cut with ID _14970_210_15709_1_1530773543493_1715730_187-20259-0_sp1.1 from training. Duration: 0.8090625 2023-10-03 08:21:46,347 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_53373_210_5399_1_1534229471_4715695_135-305927-0_sp1.1 from training. Duration: 0.936375 2023-10-03 08:21:47,621 INFO [train.py:1046] (2/4) Epoch 34, batch 4800, loss[loss=0.1424, simple_loss=0.2186, pruned_loss=0.03313, over 24569.00 frames. ], tot_loss[loss=0.1607, simple_loss=0.2399, pruned_loss=0.04071, over 4718292.39 frames. ], batch size: 60, lr: 2.98e-03, grad_scale: 32.0 2023-10-03 08:21:47,666 WARNING [train.py:1204] (2/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_60-18123-0 from training. Duration: 0.88 2023-10-03 08:21:47,747 WARNING [train.py:1204] (2/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_166-166990-0 from training. Duration: 0.96 2023-10-03 08:21:49,102 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_5438_210_1794_1_1527336687_2600009_233-58072-0 from training. Duration: 0.77 2023-10-03 08:21:51,134 WARNING [train.py:1204] (2/4) Exclude cut with ID _8833_210_5219_1_1528592665210_7553059_611-80313-0_sp0.9 from training. Duration: 0.711125 2023-10-03 08:21:52,248 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_53555_210_5632_1_1534595220_7232297_118-43239-0_sp1.1 from training. Duration: 0.936375 2023-10-03 08:21:53,601 WARNING [train.py:1204] (2/4) Exclude cut with ID _12369_210_4381_1_1530356078581_4388520_480-13146-0 from training. Duration: 0.85 2023-10-03 08:21:59,685 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_892-28814-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 08:21:59,745 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_24899_210_16554_1_1532046422_3827462_160-21255-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 08:22:05,116 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_154-316450-0_sp1.1 from training. Duration: 0.6909375 2023-10-03 08:22:06,468 WARNING [train.py:1204] (2/4) Exclude cut with ID _30196_210_1664_1_1533708496522_7074140_1225-301232-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 08:22:06,494 WARNING [train.py:1204] (2/4) Exclude cut with ID _28783_210_7943_1_1532671164808_7155479_414-208100-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 08:22:07,892 WARNING [train.py:1204] (2/4) Exclude cut with ID _19023_210_4353_1_1531378892552_5820780_83-254794-0 from training. Duration: 0.86 2023-10-03 08:22:07,966 WARNING [train.py:1204] (2/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_591-83752-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 08:22:08,209 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.conv_module1.balancer1.prob, batch_count=1200733.3333333333, ans=0.125 2023-10-03 08:22:09,200 WARNING [train.py:1204] (2/4) Exclude cut with ID _29877_210_6228_1_1532397791106_5276149_266-62291-0_sp1.1 from training. Duration: 0.7909375 2023-10-03 08:22:11,842 WARNING [train.py:1204] (2/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_280-161633-0_sp1.1 from training. Duration: 0.663625 2023-10-03 08:22:15,181 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7543_210_6753_1_1528539468_333764_39-156634-0_sp1.1 from training. Duration: 0.97275 2023-10-03 08:22:15,470 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.bypass_mid.scale_min, batch_count=1200800.0, ans=0.2 2023-10-03 08:22:16,624 WARNING [train.py:1204] (2/4) Exclude cut with ID _67499_210_17282_1_1535509805220_3692430_505-312248-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 08:22:16,660 WARNING [train.py:1204] (2/4) Exclude cut with ID _5294_210_1747_1_1527249643928_3696179_180-100713-0_sp0.9 from training. Duration: 0.9 2023-10-03 08:22:18,003 WARNING [train.py:1204] (2/4) Exclude cut with ID _26528_210_9070_1_1532570410842_3851310_508-148132-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 08:22:18,013 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_211-339622-0_sp1.1 from training. Duration: 0.4090625 2023-10-03 08:22:18,030 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_40896_210_15710_1_1533433956_7100802_339-138356-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 08:22:19,254 WARNING [train.py:1204] (2/4) Exclude cut with ID _27060_210_5986_1_1532674838922_3522549_211-19933-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 08:22:20,716 WARNING [train.py:1204] (2/4) Exclude cut with ID _60229_210_27767_1_1534838406158_3655350_457-117923-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 08:22:21,483 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.1.conv_module1.balancer1.max_abs, batch_count=1200800.0, ans=10.0 2023-10-03 08:22:21,524 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.ff2_skip_rate, batch_count=1200800.0, ans=0.0 2023-10-03 08:22:22,595 WARNING [train.py:1204] (2/4) Exclude cut with ID _55429_210_10626_1_1534467588527_7174143_460-141439-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 08:22:26,178 WARNING [train.py:1204] (2/4) Exclude cut with ID _59170_210_6721_1_1534983633960_4203335_263-10780-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 08:22:26,190 WARNING [train.py:1204] (2/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_872-264525-0_sp0.9 from training. Duration: 0.72225 2023-10-03 08:22:27,575 WARNING [train.py:1204] (2/4) Exclude cut with ID _7624_210_1531_1_1528109802198_4933101_418-349834-0_sp0.9 from training. Duration: 0.6666875 2023-10-03 08:22:28,889 WARNING [train.py:1204] (2/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_27-53998-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 08:22:30,279 WARNING [train.py:1204] (2/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_202-183922-0 from training. Duration: 0.85 2023-10-03 08:22:30,292 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7002_210_1794_1_1527939683_5157286_1-281264-0 from training. Duration: 0.44 2023-10-03 08:22:31,595 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_53753_210_7234_1_1534320003_3594148_493-9314-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 08:22:31,607 WARNING [train.py:1204] (2/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_217-95056-0_sp1.1 from training. Duration: 0.8909375 2023-10-03 08:22:33,029 WARNING [train.py:1204] (2/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_406-45086-0_sp1.1 from training. Duration: 0.72725 2023-10-03 08:22:33,036 WARNING [train.py:1204] (2/4) Exclude cut with ID _31649_210_6228_1_1532584608063_4091078_505-213821-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 08:22:33,059 WARNING [train.py:1204] (2/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_317-175062-0_sp0.9 from training. Duration: 0.911125 2023-10-03 08:22:35,769 WARNING [train.py:1204] (2/4) Exclude cut with ID _5444_210_2151_1_1527418808464_5312210_726-27165-0_sp1.1 from training. Duration: 0.6090625 2023-10-03 08:22:35,811 WARNING [train.py:1204] (2/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_494-170437-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 08:22:37,352 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_474-64054-0_sp1.1 from training. Duration: 0.936375 2023-10-03 08:22:37,612 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.feed_forward1.hidden_balancer.prob, batch_count=1200866.6666666667, ans=0.125 2023-10-03 08:22:40,203 WARNING [train.py:1204] (2/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_521-7556-0_sp1.1 from training. Duration: 0.97275 2023-10-03 08:22:41,632 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_269-348505-0_sp1.1 from training. Duration: 0.92725 2023-10-03 08:22:45,406 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=1200933.3333333333, ans=0.1 2023-10-03 08:22:46,464 WARNING [train.py:1204] (2/4) Exclude cut with ID _21878_210_3525_1_1531738772002_7264330_559-184862-0 from training. Duration: 0.94 2023-10-03 08:22:46,507 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_68702_210_23689_1_1535799577_3420068_92-76040-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 08:22:47,828 WARNING [train.py:1204] (2/4) Exclude cut with ID _22826_210_9994_1_1531908201522_3279169_227-102052-0_sp1.1 from training. Duration: 0.97275 2023-10-03 08:22:47,871 WARNING [train.py:1204] (2/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_718-250907-0_sp0.9 from training. Duration: 0.7666875 2023-10-03 08:22:49,219 WARNING [train.py:1204] (2/4) Exclude cut with ID _14069_210_5632_1_1530512692033_4803390_352-143891-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 08:22:52,466 WARNING [train.py:1204] (2/4) Exclude cut with ID _6267_210_2084_1_1527764658526_3894730_65-106602-0_sp1.1 from training. Duration: 0.936375 2023-10-03 08:22:53,807 WARNING [train.py:1204] (2/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_202-183922-0_sp0.9 from training. Duration: 0.9444375 2023-10-03 08:22:53,814 WARNING [train.py:1204] (2/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_465-268827-0_sp1.1 from training. Duration: 0.97275 2023-10-03 08:22:53,871 WARNING [train.py:1204] (2/4) Exclude cut with ID _12369_210_4381_1_1530356078581_4388520_58-29064-0_sp1.1 from training. Duration: 0.9 2023-10-03 08:22:53,910 WARNING [train.py:1204] (2/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_1-99680-0_sp0.9 from training. Duration: 0.7444375 2023-10-03 08:22:55,822 WARNING [train.py:1204] (2/4) Exclude cut with ID _13253_210_1925_1_1530757775819_3577873_191-144977-0_sp1.1 from training. Duration: 0.7090625 2023-10-03 08:22:58,763 WARNING [train.py:1204] (2/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_276-112684-0_sp1.1 from training. Duration: 0.92725 2023-10-03 08:22:58,771 WARNING [train.py:1204] (2/4) Exclude cut with ID _64639_210_5816_1_1535367619437_3528000_358-411-0_sp1.1 from training. Duration: 0.97275 2023-10-03 08:22:58,778 WARNING [train.py:1204] (2/4) Exclude cut with ID _31649_210_6228_1_1532584608063_4091078_106-21199-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 08:22:59,447 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.3.encoder.layers.3.nonlin_attention.whiten1, num_groups=1, num_channels=384, metric=5.36 vs. limit=10.0 2023-10-03 08:23:01,327 INFO [train.py:1046] (2/4) Epoch 34, batch 4850, loss[loss=0.1606, simple_loss=0.2375, pruned_loss=0.04187, over 23651.00 frames. ], tot_loss[loss=0.1613, simple_loss=0.2404, pruned_loss=0.04108, over 4712059.48 frames. ], batch size: 149, lr: 2.98e-03, grad_scale: 32.0 2023-10-03 08:23:01,372 WARNING [train.py:1204] (2/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_901-79438-0 from training. Duration: 0.94 2023-10-03 08:23:02,832 WARNING [train.py:1204] (2/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_718-250907-0 from training. Duration: 0.69 2023-10-03 08:23:02,838 WARNING [train.py:1204] (2/4) Exclude cut with ID _5287_210_2084_1_1527418883040_3878670_363-194161-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 08:23:02,841 WARNING [train.py:1204] (2/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_586-321321-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 08:23:03,409 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.4.encoder.layers.1.feed_forward3.out_whiten, num_groups=1, num_channels=384, metric=11.61 vs. limit=15.0 2023-10-03 08:23:04,201 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_68900_210_2087_1_1535713157_3708529_164-292661-0_sp1.1 from training. Duration: 0.963625 2023-10-03 08:23:04,202 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_26433_210_12108_1_1532157158_4056480_114-144878-0_sp1.1 from training. Duration: 0.97275 2023-10-03 08:23:06,827 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_15517_210_6753_1_1530954008_3487336_96-341874-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 08:23:08,532 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.attention_skip_rate, batch_count=1201000.0, ans=0.0 2023-10-03 08:23:11,200 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.bypass_mid.scale_min, batch_count=1201000.0, ans=0.2 2023-10-03 08:23:13,964 WARNING [train.py:1204] (2/4) Exclude cut with ID _14268_210_9206_1_1530622771285_4714099_637-86630-0 from training. Duration: 0.51 2023-10-03 08:23:15,358 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_28211_210_6388_1_1532861506_4523224_121-320092-0_sp1.1 from training. Duration: 0.92725 2023-10-03 08:23:19,491 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_141-89113-0_sp0.9 from training. Duration: 0.9555625 2023-10-03 08:23:19,582 WARNING [train.py:1204] (2/4) Exclude cut with ID _6789_210_1534_1_1527857937218_3118950_311-61262-0_sp1.1 from training. Duration: 0.5909375 2023-10-03 08:23:20,842 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.564e+02 1.893e+02 2.140e+02 2.446e+02 3.787e+02, threshold=4.279e+02, percent-clipped=0.0 2023-10-03 08:23:20,910 WARNING [train.py:1204] (2/4) Exclude cut with ID _58480_210_12392_1_1534811121111_3646160_389-200461-0_sp1.1 from training. Duration: 0.97275 2023-10-03 08:23:24,792 WARNING [train.py:1204] (2/4) Exclude cut with ID _41326_210_5854_1_1533778076509_3492080_28-156553-0_sp1.1 from training. Duration: 0.92725 2023-10-03 08:23:26,711 WARNING [train.py:1204] (2/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_577-287150-0_sp1.1 from training. Duration: 0.7454375 2023-10-03 08:23:28,117 WARNING [train.py:1204] (2/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_40-129756-0_sp1.1 from training. Duration: 0.736375 2023-10-03 08:23:28,127 WARNING [train.py:1204] (2/4) Exclude cut with ID _60113_210_16271_1_1535164212481_4386079_490-280290-0 from training. Duration: 0.98 2023-10-03 08:23:30,896 WARNING [train.py:1204] (2/4) Exclude cut with ID _58059_210_5854_1_1534901471149_3164808_34-39905-0_sp1.1 from training. Duration: 0.963625 2023-10-03 08:23:32,331 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_791-197891-0_sp1.1 from training. Duration: 0.763625 2023-10-03 08:23:32,352 WARNING [train.py:1204] (2/4) Exclude cut with ID _6789_210_1534_1_1527857937218_3118950_262-48853-0_sp1.1 from training. Duration: 0.5090625 2023-10-03 08:23:33,802 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_2655_210_2087_1_1525688959_3897400_52-279899-0_sp0.9 from training. Duration: 0.7666875 2023-10-03 08:23:33,806 WARNING [train.py:1204] (2/4) Exclude cut with ID _14884_210_2252_1_1530707459689_5561250_114-201399-0 from training. Duration: 0.76 2023-10-03 08:23:35,245 WARNING [train.py:1204] (2/4) Exclude cut with ID _19023_210_4353_1_1531378892552_5820780_83-254794-0_sp0.9 from training. Duration: 0.9555625 2023-10-03 08:23:35,261 WARNING [train.py:1204] (2/4) Exclude cut with ID _53489_210_6228_1_1534298357169_3835970_10-297919-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 08:23:35,483 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.conv_module1.balancer2.min_abs, batch_count=1201133.3333333333, ans=0.5 2023-10-03 08:23:37,983 WARNING [train.py:1204] (2/4) Exclude cut with ID _44585_210_18628_1_1533605350387_4518199_418-3055-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 08:23:37,995 WARNING [train.py:1204] (2/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_310-109461-0 from training. Duration: 0.8 2023-10-03 08:23:38,778 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.2.encoder.layers.2.whiten, num_groups=1, num_channels=384, metric=3.62 vs. limit=12.0 2023-10-03 08:23:39,338 WARNING [train.py:1204] (2/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_235-262124-0 from training. Duration: 0.67 2023-10-03 08:23:40,677 WARNING [train.py:1204] (2/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_195-167011-0_sp1.1 from training. Duration: 0.6818125 2023-10-03 08:23:48,023 WARNING [train.py:1204] (2/4) Exclude cut with ID _7852_210_5219_1_1528286471316_8139400_188-55162-0_sp1.1 from training. Duration: 0.936375 2023-10-03 08:23:49,392 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_35361_210_4238_1_1533262810_7793476_675-87012-0 from training. Duration: 0.89 2023-10-03 08:23:50,795 WARNING [train.py:1204] (2/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_465-81427-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 08:23:50,802 WARNING [train.py:1204] (2/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_597-154640-0_sp1.1 from training. Duration: 0.8181875 2023-10-03 08:23:54,501 WARNING [train.py:1204] (2/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_186-309161-0_sp0.9 from training. Duration: 0.6 2023-10-03 08:23:54,601 WARNING [train.py:1204] (2/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_546-119312-0 from training. Duration: 0.99 2023-10-03 08:23:56,277 WARNING [train.py:1204] (2/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_330-240060-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 08:23:56,362 WARNING [train.py:1204] (2/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_477-255782-0 from training. Duration: 0.82 2023-10-03 08:23:57,609 WARNING [train.py:1204] (2/4) Exclude cut with ID _14970_210_15709_1_1530773543493_1715730_150-248939-0_sp1.1 from training. Duration: 0.97275 2023-10-03 08:23:57,689 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_556-206615-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 08:23:58,629 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.0.layers.1.whiten, num_groups=1, num_channels=192, metric=3.77 vs. limit=12.0 2023-10-03 08:23:59,030 WARNING [train.py:1204] (2/4) Exclude cut with ID _3465_210_3525_1_1526175667060_3574049_308-231735-0 from training. Duration: 0.73 2023-10-03 08:24:06,074 WARNING [train.py:1204] (2/4) Exclude cut with ID _27485_210_4916_1_1532739191677_3668269_66-236571-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 08:24:10,378 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_2221_210_1988_1_1525597051_4935541_415-296882-0_sp0.9 from training. Duration: 0.8555625 2023-10-03 08:24:11,651 WARNING [train.py:1204] (2/4) Exclude cut with ID _64521_210_14699_1_1535243427259_3815328_231-78630-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 08:24:15,138 INFO [train.py:1046] (2/4) Epoch 34, batch 4900, loss[loss=0.1568, simple_loss=0.2499, pruned_loss=0.03185, over 24254.00 frames. ], tot_loss[loss=0.1604, simple_loss=0.2394, pruned_loss=0.0407, over 4725481.50 frames. ], batch size: 74, lr: 2.98e-03, grad_scale: 32.0 2023-10-03 08:24:15,847 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.3.encoder.layers.2.nonlin_attention.whiten2, num_groups=1, num_channels=512, metric=5.23 vs. limit=15.0 2023-10-03 08:24:17,923 WARNING [train.py:1204] (2/4) Exclude cut with ID _13942_210_9988_1_1530604906036_3733360_499-55676-0 from training. Duration: 0.62 2023-10-03 08:24:17,926 WARNING [train.py:1204] (2/4) Exclude cut with ID _49376_210_5986_1_1533985278732_3370740_301-154763-0_sp1.1 from training. Duration: 0.8909375 2023-10-03 08:24:22,762 WARNING [train.py:1204] (2/4) Exclude cut with ID _27485_210_4916_1_1532739191677_3668269_255-216698-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 08:24:24,163 WARNING [train.py:1204] (2/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_137-334143-0_sp1.1 from training. Duration: 0.97275 2023-10-03 08:24:24,200 WARNING [train.py:1204] (2/4) Exclude cut with ID _6456_210_1534_1_1527681634212_3181049_215-309359-0_sp1.1 from training. Duration: 0.8 2023-10-03 08:24:27,613 WARNING [train.py:1204] (2/4) Exclude cut with ID _65640_210_6310_1_1535368201445_3173250_209-13008-0 from training. Duration: 0.99 2023-10-03 08:24:31,876 WARNING [train.py:1204] (2/4) Exclude cut with ID _12369_210_4381_1_1530356078581_4388520_58-29064-0 from training. Duration: 0.99 2023-10-03 08:24:35,945 WARNING [train.py:1204] (2/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_410-287857-0 from training. Duration: 0.93 2023-10-03 08:24:37,304 WARNING [train.py:1204] (2/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_153-153537-0 from training. Duration: 0.91 2023-10-03 08:24:37,343 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7544_210_6753_1_1528587722_3576686_172-171750-0_sp0.9 from training. Duration: 0.72225 2023-10-03 08:24:37,378 WARNING [train.py:1204] (2/4) Exclude cut with ID _64521_210_14699_1_1535243427259_3815328_464-183853-0_sp1.1 from training. Duration: 0.97275 2023-10-03 08:24:38,737 WARNING [train.py:1204] (2/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_385-210308-0_sp1.1 from training. Duration: 0.963625 2023-10-03 08:24:38,744 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_46013_210_7234_1_1533776362_3744032_354-261865-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 08:24:38,758 WARNING [train.py:1204] (2/4) Exclude cut with ID _3067_210_2667_1_1526212974640_4503168_250-285937-0_sp1.1 from training. Duration: 0.72725 2023-10-03 08:24:38,848 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_40036_210_13405_1_1533648647_1315388_48-146016-0 from training. Duration: 0.93 2023-10-03 08:24:41,547 WARNING [train.py:1204] (2/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_280-161633-0 from training. Duration: 0.73 2023-10-03 08:24:42,795 WARNING [train.py:1204] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_308-161067-0_sp0.9 from training. Duration: 0.7333125 2023-10-03 08:24:44,196 WARNING [train.py:1204] (2/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_68-212801-0_sp0.9 from training. Duration: 0.811125 2023-10-03 08:24:44,240 WARNING [train.py:1204] (2/4) Exclude cut with ID _6789_210_1534_1_1527857937218_3118950_311-61262-0_sp0.9 from training. Duration: 0.72225 2023-10-03 08:24:47,565 WARNING [train.py:1204] (2/4) Exclude cut with ID _14632_210_9988_1_1530691279392_3923200_102-204477-0_sp1.1 from training. Duration: 0.8818125 2023-10-03 08:24:48,856 WARNING [train.py:1204] (2/4) Exclude cut with ID _65242_210_18628_1_1535369393717_3823149_290-117517-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 08:24:50,178 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_69630_210_7234_1_1535788670_910989_35-337140-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 08:24:50,187 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_5618_210_1874_1_1527311008_5851_1-214501-0 from training. Duration: 0.689 2023-10-03 08:24:51,661 WARNING [train.py:1204] (2/4) Exclude cut with ID _8064_210_5399_1_1528372299291_4282973_168-50337-0_sp0.9 from training. Duration: 0.9333125 2023-10-03 08:24:53,535 WARNING [train.py:1204] (2/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_185-14438-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 08:24:53,555 WARNING [train.py:1204] (2/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_61-327062-0 from training. Duration: 0.81 2023-10-03 08:24:53,560 WARNING [train.py:1204] (2/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_98-741-0 from training. Duration: 0.8 2023-10-03 08:24:57,356 WARNING [train.py:1204] (2/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_312-204798-0 from training. Duration: 0.93 2023-10-03 08:24:58,876 WARNING [train.py:1204] (2/4) Exclude cut with ID _14998_210_5219_1_1530705582080_7474970_787-294416-0_sp0.9 from training. Duration: 0.911125 2023-10-03 08:24:58,963 WARNING [train.py:1204] (2/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_140-181176-0_sp1.1 from training. Duration: 0.836375 2023-10-03 08:24:59,148 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.balancer1.prob, batch_count=1201533.3333333333, ans=0.125 2023-10-03 08:25:00,238 WARNING [train.py:1204] (2/4) Exclude cut with ID _14687_210_5854_1_1530703838259_3664889_115-239046-0_sp1.1 from training. Duration: 0.6090625 2023-10-03 08:25:00,287 WARNING [train.py:1204] (2/4) Exclude cut with ID _56423_210_5854_1_1534896061049_3165969_370-230614-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 08:25:01,660 WARNING [train.py:1204] (2/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_406-275649-0_sp0.9 from training. Duration: 0.4666875 2023-10-03 08:25:01,678 WARNING [train.py:1204] (2/4) Exclude cut with ID _15150_210_8341_1_1530752411803_7256384_22-138274-0_sp1.1 from training. Duration: 0.87275 2023-10-03 08:25:01,726 WARNING [train.py:1204] (2/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_125-125818-0 from training. Duration: 0.96 2023-10-03 08:25:03,098 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_4399_210_1874_1_1526705485_2400757_220-47974-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 08:25:05,643 WARNING [train.py:1204] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_308-161067-0_sp1.1 from training. Duration: 0.6 2023-10-03 08:25:07,079 WARNING [train.py:1204] (2/4) Exclude cut with ID _53489_210_6228_1_1534298357169_3835970_102-57645-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 08:25:09,719 WARNING [train.py:1204] (2/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_178-177582-0 from training. Duration: 0.74 2023-10-03 08:25:11,036 WARNING [train.py:1204] (2/4) Exclude cut with ID _14349_210_8341_1_1530615710818_3442470_170-336541-0_sp1.1 from training. Duration: 0.7818125 2023-10-03 08:25:12,524 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_521-214276-0_sp0.9 from training. Duration: 0.488875 2023-10-03 08:25:12,559 WARNING [train.py:1204] (2/4) Exclude cut with ID _66070_210_7640_1_1535441393096_7805310_489-292631-0 from training. Duration: 0.79 2023-10-03 08:25:19,858 WARNING [train.py:1204] (2/4) Exclude cut with ID _14069_210_5632_1_1530512692033_4803390_438-340023-0_sp1.1 from training. Duration: 0.936375 2023-10-03 08:25:19,949 WARNING [train.py:1204] (2/4) Exclude cut with ID _7852_210_5219_1_1528286471316_8139400_242-105336-0_sp1.1 from training. Duration: 0.6545625 2023-10-03 08:25:21,302 WARNING [train.py:1204] (2/4) Exclude cut with ID _3867_210_1883_1_1526641200575_4475830_272-215563-0 from training. Duration: 0.57 2023-10-03 08:25:21,310 WARNING [train.py:1204] (2/4) Exclude cut with ID _14368_210_4220_1_1530532714870_3595529_143-117421-0_sp0.9 from training. Duration: 0.6444375 2023-10-03 08:25:21,326 WARNING [train.py:1204] (2/4) Exclude cut with ID _9235_210_5219_1_1528891154253_8046120_30-326681-0_sp1.1 from training. Duration: 0.7545625 2023-10-03 08:25:23,352 WARNING [train.py:1204] (2/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_538-218448-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 08:25:27,902 WARNING [train.py:1204] (2/4) Exclude cut with ID _33640_210_2655_1_1533117605479_4855469_60-192970-0_sp1.1 from training. Duration: 0.963625 2023-10-03 08:25:27,909 WARNING [train.py:1204] (2/4) Exclude cut with ID _65585_210_12210_1_1535450451584_3764098_112-294301-0_sp1.1 from training. Duration: 0.8 2023-10-03 08:25:27,938 WARNING [train.py:1204] (2/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_643-37412-0_sp1.1 from training. Duration: 0.936375 2023-10-03 08:25:29,182 INFO [train.py:1046] (2/4) Epoch 34, batch 4950, loss[loss=0.1582, simple_loss=0.2284, pruned_loss=0.04394, over 23665.00 frames. ], tot_loss[loss=0.1597, simple_loss=0.2381, pruned_loss=0.04065, over 4713663.04 frames. ], batch size: 256, lr: 2.98e-03, grad_scale: 16.0 2023-10-03 08:25:29,224 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_9302_210_2550_1_1528889962_4655416_638-301620-0_sp1.1 from training. Duration: 0.42725 2023-10-03 08:25:29,342 WARNING [train.py:1204] (2/4) Exclude cut with ID _3867_210_1883_1_1526641200575_4475830_272-215563-0_sp1.1 from training. Duration: 0.5181875 2023-10-03 08:25:29,489 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.self_attn_weights.pos_emb_skip_rate, batch_count=1201666.6666666667, ans=0.0 2023-10-03 08:25:32,193 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_53679_210_4852_1_1534556726_4186141_358-154143-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 08:25:33,594 WARNING [train.py:1204] (2/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_841-341679-0_sp0.9 from training. Duration: 0.6444375 2023-10-03 08:25:33,824 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.attention_skip_rate, batch_count=1201666.6666666667, ans=0.0 2023-10-03 08:25:34,327 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.3.encoder.layers.3.whiten, num_groups=1, num_channels=512, metric=3.62 vs. limit=12.0 2023-10-03 08:25:36,308 WARNING [train.py:1204] (2/4) Exclude cut with ID _7160_210_1876_1_1527940923231_3703799_302-150728-0 from training. Duration: 0.78 2023-10-03 08:25:37,656 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_125-203105-0 from training. Duration: 0.8 2023-10-03 08:25:37,677 WARNING [train.py:1204] (2/4) Exclude cut with ID _5585_210_2667_1_1527418813324_8007340_89-332072-0_sp0.9 from training. Duration: 0.711125 2023-10-03 08:25:39,046 WARNING [train.py:1204] (2/4) Exclude cut with ID _3992_210_1747_1_1526644816031_3731060_36-147855-0 from training. Duration: 0.78 2023-10-03 08:25:39,064 WARNING [train.py:1204] (2/4) Exclude cut with ID _59256_210_5986_1_1535076085997_3589120_172-239045-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 08:25:39,073 WARNING [train.py:1204] (2/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_472-216663-0_sp1.1 from training. Duration: 0.836375 2023-10-03 08:25:39,128 WARNING [train.py:1204] (2/4) Exclude cut with ID _36512_210_6987_1_1533033143998_3126069_181-306500-0_sp1.1 from training. Duration: 0.863625 2023-10-03 08:25:39,150 WARNING [train.py:1204] (2/4) Exclude cut with ID _56524_210_9642_1_1534572011779_3619170_492-95008-0_sp1.1 from training. Duration: 0.97275 2023-10-03 08:25:41,815 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_33348_210_5632_1_1533086912_3324030_119-90326-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 08:25:42,059 INFO [scaling.py:1118] (2/4) WithLoss: name=encoder.encoders.2.encoder.layers.0.self_attn_weights, loss-sum=0.000e+00 2023-10-03 08:25:43,175 WARNING [train.py:1204] (2/4) Exclude cut with ID _14368_210_4220_1_1530532714870_3595529_231-137892-0_sp1.1 from training. Duration: 0.9 2023-10-03 08:25:43,292 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8453_210_4211_1_1528502940_8562730_596-306820-0_sp1.1 from training. Duration: 0.8818125 2023-10-03 08:25:44,632 WARNING [train.py:1204] (2/4) Exclude cut with ID _27052_210_5986_1_1532329197187_3585009_50-242750-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 08:25:47,894 WARNING [train.py:1204] (2/4) Exclude cut with ID _13913_210_9994_1_1530579615187_3383897_358-98435-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 08:25:47,913 WARNING [train.py:1204] (2/4) Exclude cut with ID _27013_210_1389_1_1532437173240_3252649_399-229778-0_sp1.1 from training. Duration: 0.963625 2023-10-03 08:25:50,569 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.526e+02 1.889e+02 2.059e+02 2.301e+02 3.763e+02, threshold=4.118e+02, percent-clipped=0.0 2023-10-03 08:25:50,675 WARNING [train.py:1204] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_185-174805-0_sp0.9 from training. Duration: 0.7555625 2023-10-03 08:25:56,017 WARNING [train.py:1204] (2/4) Exclude cut with ID _65585_210_12210_1_1535450451584_3764098_196-221014-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 08:25:57,756 WARNING [train.py:1204] (2/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_202-293523-0_sp1.1 from training. Duration: 0.6545625 2023-10-03 08:26:00,398 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8275_210_4198_1_1528455320_3892571_213-127634-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 08:26:00,448 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_40896_210_15710_1_1533433956_7100802_921-231678-0_sp1.1 from training. Duration: 0.97275 2023-10-03 08:26:01,857 WARNING [train.py:1204] (2/4) Exclude cut with ID _38020_210_5986_1_1533112252259_7006089_493-102745-0_sp1.1 from training. Duration: 0.8909375 2023-10-03 08:26:03,218 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6846_210_6753_1_1527850767_4295158_2-315073-0 from training. Duration: 0.88 2023-10-03 08:26:04,471 WARNING [train.py:1204] (2/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_249-281349-0 from training. Duration: 0.85 2023-10-03 08:26:05,812 WARNING [train.py:1204] (2/4) Exclude cut with ID _18513_210_9206_1_1531618215223_3862620_314-83429-0_sp1.1 from training. Duration: 0.97275 2023-10-03 08:26:08,463 WARNING [train.py:1204] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_215-202973-0_sp0.9 from training. Duration: 0.97775 2023-10-03 08:26:08,474 WARNING [train.py:1204] (2/4) Exclude cut with ID _7719_210_3525_1_1528456153163_4296960_88-250259-0_sp1.1 from training. Duration: 0.763625 2023-10-03 08:26:09,839 WARNING [train.py:1204] (2/4) Exclude cut with ID _7606_210_4915_1_1528894253277_5080690_301-10732-0_sp1.1 from training. Duration: 0.736375 2023-10-03 08:26:09,848 WARNING [train.py:1204] (2/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_818-11907-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 08:26:09,925 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_5959_210_1794_1_1527400400_3445726_412-44052-0_sp1.1 from training. Duration: 0.7 2023-10-03 08:26:11,315 WARNING [train.py:1204] (2/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_6-279195-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 08:26:14,150 WARNING [train.py:1204] (2/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_252-216674-0_sp0.9 from training. Duration: 0.6 2023-10-03 08:26:14,371 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.0.feed_forward1.out_proj.dropout_p, batch_count=1201866.6666666667, ans=0.1 2023-10-03 08:26:15,503 WARNING [train.py:1204] (2/4) Exclude cut with ID _14368_210_4220_1_1530532714870_3595529_144-3287-0_sp1.1 from training. Duration: 0.6090625 2023-10-03 08:26:17,360 WARNING [train.py:1204] (2/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_342-3879-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 08:26:17,399 WARNING [train.py:1204] (2/4) Exclude cut with ID _33640_210_2655_1_1533117605479_4855469_584-65630-0_sp1.1 from training. Duration: 0.97275 2023-10-03 08:26:18,774 WARNING [train.py:1204] (2/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_37-189262-0 from training. Duration: 0.98 2023-10-03 08:26:18,822 WARNING [train.py:1204] (2/4) Exclude cut with ID _9389_210_1664_1_1529217337301_5289269_379-128794-0_sp0.9 from training. Duration: 0.9444375 2023-10-03 08:26:19,103 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.conv_skip_rate, batch_count=1201866.6666666667, ans=0.0 2023-10-03 08:26:20,237 WARNING [train.py:1204] (2/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_119-207162-0_sp1.1 from training. Duration: 0.6454375 2023-10-03 08:26:24,860 WARNING [train.py:1204] (2/4) Exclude cut with ID _2941_210_2577_1_1526199673150_2489759_200-333643-0_sp1.1 from training. Duration: 0.936375 2023-10-03 08:26:26,827 WARNING [train.py:1204] (2/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_479-125611-0_sp1.1 from training. Duration: 0.87275 2023-10-03 08:26:26,837 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_3475_210_2028_1_1526193578_3622569_64-297072-0_sp1.1 from training. Duration: 0.82725 2023-10-03 08:26:28,535 WARNING [train.py:1204] (2/4) Exclude cut with ID _53565_210_10635_1_1534337218335_3820259_139-346291-0_sp1.1 from training. Duration: 0.97275 2023-10-03 08:26:28,590 WARNING [train.py:1204] (2/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_588-125165-0_sp1.1 from training. Duration: 0.7545625 2023-10-03 08:26:29,953 WARNING [train.py:1204] (2/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_938-205135-0_sp1.1 from training. Duration: 0.7909375 2023-10-03 08:26:31,373 WARNING [train.py:1204] (2/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_216-48707-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 08:26:31,448 WARNING [train.py:1204] (2/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_477-255782-0_sp1.1 from training. Duration: 0.7454375 2023-10-03 08:26:31,695 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.bypass.skip_rate, batch_count=1201933.3333333333, ans=0.04949747468305833 2023-10-03 08:26:32,734 WARNING [train.py:1204] (2/4) Exclude cut with ID _16909_210_5724_1_1531292079912_4118919_583-92842-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 08:26:34,085 WARNING [train.py:1204] (2/4) Exclude cut with ID _60860_210_8254_1_1534896015875_3557169_245-256666-0 from training. Duration: 0.97 2023-10-03 08:26:35,603 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.attention_skip_rate, batch_count=1201933.3333333333, ans=0.0 2023-10-03 08:26:38,224 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_45399_210_4852_1_1533624725_7579737_349-116173-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 08:26:42,291 INFO [train.py:1046] (2/4) Epoch 34, batch 5000, loss[loss=0.1642, simple_loss=0.2306, pruned_loss=0.04889, over 23721.00 frames. ], tot_loss[loss=0.1594, simple_loss=0.2375, pruned_loss=0.04059, over 4694169.21 frames. ], batch size: 164, lr: 2.98e-03, grad_scale: 16.0 2023-10-03 08:26:42,391 WARNING [train.py:1204] (2/4) Exclude cut with ID _8699_210_2069_1_1528630346831_3960409_155-39766-0 from training. Duration: 0.78 2023-10-03 08:26:42,406 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_86-16881-0_sp1.1 from training. Duration: 0.52725 2023-10-03 08:26:42,527 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.0.conv_skip_rate, batch_count=1202000.0, ans=0.0 2023-10-03 08:26:48,411 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_14056_210_4163_1_1530604483_7572827_191-159384-0_sp1.1 from training. Duration: 0.97275 2023-10-03 08:26:48,419 WARNING [train.py:1204] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_307-158462-0_sp1.1 from training. Duration: 0.62725 2023-10-03 08:26:51,220 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7544_210_6753_1_1528591327_1471426_33-346031-0 from training. Duration: 0.61 2023-10-03 08:26:51,314 WARNING [train.py:1204] (2/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_112-149213-0 from training. Duration: 0.95 2023-10-03 08:26:54,546 WARNING [train.py:1204] (2/4) Exclude cut with ID _55844_210_2346_1_1534471212569_7625707_925-191342-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 08:26:55,897 WARNING [train.py:1204] (2/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_87-278686-0 from training. Duration: 0.77 2023-10-03 08:26:55,923 WARNING [train.py:1204] (2/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_415-78792-0_sp1.1 from training. Duration: 0.736375 2023-10-03 08:26:55,932 WARNING [train.py:1204] (2/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_107-332398-0_sp0.9 from training. Duration: 0.8666875 2023-10-03 08:26:57,403 WARNING [train.py:1204] (2/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_72-251255-0 from training. Duration: 0.94 2023-10-03 08:26:59,082 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_431-227273-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 08:26:59,143 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7544_210_6753_1_1528587000_691468_87-178599-0_sp0.9 from training. Duration: 0.9666875 2023-10-03 08:27:01,097 WARNING [train.py:1204] (2/4) Exclude cut with ID _13916_210_9994_1_1530788341126_3763149_60-191556-0 from training. Duration: 0.61 2023-10-03 08:27:01,100 WARNING [train.py:1204] (2/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_746-164782-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 08:27:01,136 WARNING [train.py:1204] (2/4) Exclude cut with ID _33640_210_2655_1_1533117605479_4855469_596-21066-0_sp1.1 from training. Duration: 0.92725 2023-10-03 08:27:02,605 WARNING [train.py:1204] (2/4) Exclude cut with ID _40402_210_18219_1_1533522324115_2953150_320-14078-0 from training. Duration: 0.95 2023-10-03 08:27:03,874 WARNING [train.py:1204] (2/4) Exclude cut with ID _29877_210_6228_1_1532397791106_5276149_266-62291-0 from training. Duration: 0.87 2023-10-03 08:27:05,228 WARNING [train.py:1204] (2/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_176-56120-0_sp0.9 from training. Duration: 0.92225 2023-10-03 08:27:05,268 WARNING [train.py:1204] (2/4) Exclude cut with ID _6275_210_2084_1_1528110062213_7669049_91-26919-0 from training. Duration: 0.97 2023-10-03 08:27:05,275 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_224-322394-0_sp0.9 from training. Duration: 0.7333125 2023-10-03 08:27:05,319 WARNING [train.py:1204] (2/4) Exclude cut with ID _44982_210_1511_1_1534312820108_3726029_586-297059-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 08:27:05,362 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_196-289637-0_sp1.1 from training. Duration: 0.6818125 2023-10-03 08:27:05,363 WARNING [train.py:1204] (2/4) Exclude cut with ID _18513_210_9206_1_1531618215223_3862620_402-5957-0 from training. Duration: 0.99 2023-10-03 08:27:06,605 WARNING [train.py:1204] (2/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_81-300212-0 from training. Duration: 0.6 2023-10-03 08:27:09,302 WARNING [train.py:1204] (2/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_166-128941-0 from training. Duration: 0.54 2023-10-03 08:27:09,315 WARNING [train.py:1204] (2/4) Exclude cut with ID _58059_210_5854_1_1534901471149_3164808_57-309660-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 08:27:09,383 WARNING [train.py:1204] (2/4) Exclude cut with ID _60621_210_17071_1_1535013053530_3671739_73-155814-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 08:27:10,711 WARNING [train.py:1204] (2/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_726-270780-0 from training. Duration: 0.86 2023-10-03 08:27:10,733 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8853_210_3273_1_1528633899_818968_51-187685-0_sp1.1 from training. Duration: 0.62725 2023-10-03 08:27:10,849 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_315-212794-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 08:27:12,259 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_53373_210_5399_1_1534229471_4715695_314-122795-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 08:27:13,820 WARNING [train.py:1204] (2/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_187-221566-0_sp0.9 from training. Duration: 0.47775 2023-10-03 08:27:16,563 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_12868_210_6753_1_1530702479_3610564_133-564-0 from training. Duration: 0.79 2023-10-03 08:27:16,610 WARNING [train.py:1204] (2/4) Exclude cut with ID _55153_210_2253_1_1534384786677_3162096_255-228344-0_sp1.1 from training. Duration: 0.936375 2023-10-03 08:27:18,512 WARNING [train.py:1204] (2/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_383-165234-0_sp1.1 from training. Duration: 0.8545625 2023-10-03 08:27:22,723 WARNING [train.py:1204] (2/4) Exclude cut with ID IC0677W0056-334889 from training. Duration: 0.768 2023-10-03 08:27:22,939 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.conv_module1.balancer2.prob, batch_count=1202133.3333333333, ans=0.125 2023-10-03 08:27:24,773 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.conv_skip_rate, batch_count=1202133.3333333333, ans=0.0 2023-10-03 08:27:27,303 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_14953_210_2151_1_1530701710_5616165_710-165400-0_sp0.9 from training. Duration: 0.9666875 2023-10-03 08:27:28,777 WARNING [train.py:1204] (2/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_355-236205-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 08:27:28,778 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_34450_210_9547_1_1533099443_4109050_503-246893-0_sp1.1 from training. Duration: 0.963625 2023-10-03 08:27:28,961 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=1202200.0, ans=0.1 2023-10-03 08:27:32,157 WARNING [train.py:1204] (2/4) Exclude cut with ID _43552_210_8913_1_1533726031059_3523429_25-120161-0 from training. Duration: 0.98 2023-10-03 08:27:32,184 WARNING [train.py:1204] (2/4) Exclude cut with ID _66112_210_10635_1_1535590236305_3801484_205-311930-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 08:27:32,212 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_48049_210_5632_1_1533864553_3519937_185-281218-0_sp1.1 from training. Duration: 0.92725 2023-10-03 08:27:32,240 WARNING [train.py:1204] (2/4) Exclude cut with ID _48782_210_12658_1_1534467701544_2300422_191-332415-0_sp1.1 from training. Duration: 0.97275 2023-10-03 08:27:34,951 WARNING [train.py:1204] (2/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_907-151398-0_sp1.1 from training. Duration: 0.336375 2023-10-03 08:27:34,988 WARNING [train.py:1204] (2/4) Exclude cut with ID _67499_210_17282_1_1535509805220_3692430_20-179346-0_sp1.1 from training. Duration: 0.8818125 2023-10-03 08:27:35,154 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.ff2_skip_rate, batch_count=1202200.0, ans=0.0 2023-10-03 08:27:38,312 WARNING [train.py:1204] (2/4) Exclude cut with ID _14998_210_5219_1_1530705582080_7474970_441-91466-0_sp1.1 from training. Duration: 0.8818125 2023-10-03 08:27:39,631 WARNING [train.py:1204] (2/4) Exclude cut with ID _46681_210_9642_1_1533893005571_7603359_786-164007-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 08:27:43,992 WARNING [train.py:1204] (2/4) Exclude cut with ID _14970_210_15709_1_1530773543493_1715730_187-20259-0 from training. Duration: 0.89 2023-10-03 08:27:46,705 WARNING [train.py:1204] (2/4) Exclude cut with ID _12734_210_8341_1_1530183614296_3891929_383-339075-0_sp1.1 from training. Duration: 0.963625 2023-10-03 08:27:51,720 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.5.encoder.layers.1.feed_forward3.out_whiten, num_groups=1, num_channels=256, metric=9.90 vs. limit=15.0 2023-10-03 08:27:57,230 INFO [train.py:1046] (2/4) Epoch 34, batch 5050, loss[loss=0.1689, simple_loss=0.2528, pruned_loss=0.04247, over 23634.00 frames. ], tot_loss[loss=0.1597, simple_loss=0.2383, pruned_loss=0.04062, over 4703772.30 frames. ], batch size: 106, lr: 2.98e-03, grad_scale: 16.0 2023-10-03 08:27:57,305 WARNING [train.py:1204] (2/4) Exclude cut with ID _37086_210_9038_1_1532999540891_7021250_834-322225-0_sp1.1 from training. Duration: 0.92725 2023-10-03 08:27:58,662 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_10987_210_4163_1_1529760555_4123902_56-290559-0_sp1.1 from training. Duration: 0.963625 2023-10-03 08:27:58,668 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_92-246270-0_sp0.9 from training. Duration: 0.9444375 2023-10-03 08:27:58,693 WARNING [train.py:1204] (2/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_56-13561-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 08:27:58,723 WARNING [train.py:1204] (2/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_393-57888-0_sp1.1 from training. Duration: 0.5818125 2023-10-03 08:27:58,750 WARNING [train.py:1204] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_168-32906-0_sp1.1 from training. Duration: 0.62725 2023-10-03 08:28:00,060 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_51225_210_3601_1_1534400634_2327383_127-207553-0_sp1.1 from training. Duration: 0.963625 2023-10-03 08:28:02,918 WARNING [train.py:1204] (2/4) Exclude cut with ID _58583_210_5236_1_1534726835266_3574249_494-203521-0_sp1.1 from training. Duration: 0.963625 2023-10-03 08:28:04,184 WARNING [train.py:1204] (2/4) Exclude cut with ID _49376_210_5986_1_1533985278732_3370740_354-344695-0 from training. Duration: 0.99 2023-10-03 08:28:05,906 WARNING [train.py:1204] (2/4) Exclude cut with ID _8021_210_7994_1_1528376451358_3653069_179-45206-0_sp1.1 from training. Duration: 0.7818125 2023-10-03 08:28:08,601 WARNING [train.py:1204] (2/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_742-154017-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 08:28:09,978 WARNING [train.py:1204] (2/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_222-198127-0_sp1.1 from training. Duration: 0.763625 2023-10-03 08:28:10,031 WARNING [train.py:1204] (2/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_235-307337-0 from training. Duration: 0.94 2023-10-03 08:28:11,444 WARNING [train.py:1204] (2/4) Exclude cut with ID _8393_210_6702_1_1528505860249_3457328_453-176500-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 08:28:11,489 WARNING [train.py:1204] (2/4) Exclude cut with ID _53565_210_10635_1_1534337218335_3820259_102-179528-0_sp1.1 from training. Duration: 0.97275 2023-10-03 08:28:14,166 WARNING [train.py:1204] (2/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_43-206757-0_sp1.1 from training. Duration: 0.6818125 2023-10-03 08:28:14,257 WARNING [train.py:1204] (2/4) Exclude cut with ID _8394_210_6702_1_1528590504316_3540189_335-274780-0_sp1.1 from training. Duration: 0.7090625 2023-10-03 08:28:15,573 WARNING [train.py:1204] (2/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_128-296581-0_sp1.1 from training. Duration: 0.67275 2023-10-03 08:28:18,351 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.528e+02 1.843e+02 2.007e+02 2.253e+02 3.128e+02, threshold=4.014e+02, percent-clipped=0.0 2023-10-03 08:28:22,999 WARNING [train.py:1204] (2/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_111-112362-0 from training. Duration: 0.91 2023-10-03 08:28:23,034 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_5522_210_3273_1_1527249606_3909096_366-172508-0_sp1.1 from training. Duration: 0.6 2023-10-03 08:28:24,280 WARNING [train.py:1204] (2/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_47-167373-0_sp1.1 from training. Duration: 0.836375 2023-10-03 08:28:24,334 WARNING [train.py:1204] (2/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_41-234353-0 from training. Duration: 0.68 2023-10-03 08:28:26,211 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6291_210_3273_1_1527683649_1078498_82-221196-0_sp0.9 from training. Duration: 0.8444375 2023-10-03 08:28:28,038 WARNING [train.py:1204] (2/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_47-4814-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 08:28:28,075 WARNING [train.py:1204] (2/4) Exclude cut with ID _43541_210_10626_1_1533796233418_3635440_40-247993-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 08:28:29,448 WARNING [train.py:1204] (2/4) Exclude cut with ID _21989_210_5854_1_1531899214931_3271360_133-44759-0_sp0.9 from training. Duration: 0.8555625 2023-10-03 08:28:29,451 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_94-195801-0 from training. Duration: 0.94 2023-10-03 08:28:29,539 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_141-89113-0 from training. Duration: 0.86 2023-10-03 08:28:30,813 WARNING [train.py:1204] (2/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_683-284578-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 08:28:33,539 WARNING [train.py:1204] (2/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_6-214815-0_sp1.1 from training. Duration: 0.77275 2023-10-03 08:28:36,918 WARNING [train.py:1204] (2/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_556-212082-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 08:28:36,957 WARNING [train.py:1204] (2/4) Exclude cut with ID _44902_210_16187_1_1533888006062_3792078_149-67212-0 from training. Duration: 0.87 2023-10-03 08:28:38,312 WARNING [train.py:1204] (2/4) Exclude cut with ID _61148_210_6897_1_1535094022726_4217610_333-159440-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 08:28:41,047 WARNING [train.py:1204] (2/4) Exclude cut with ID _3120_210_3525_1_1526036143112_4072980_404-249994-0 from training. Duration: 0.99 2023-10-03 08:28:42,424 WARNING [train.py:1204] (2/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_234-260682-0_sp0.9 from training. Duration: 0.9333125 2023-10-03 08:28:42,453 WARNING [train.py:1204] (2/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_374-138971-0_sp1.1 from training. Duration: 0.8545625 2023-10-03 08:28:42,544 WARNING [train.py:1204] (2/4) Exclude cut with ID _53713_210_24116_1_1534314487762_7416751_743-302085-0_sp1.1 from training. Duration: 0.92725 2023-10-03 08:28:43,967 WARNING [train.py:1204] (2/4) Exclude cut with ID _18516_210_9206_1_1531704653206_3692406_198-334870-0_sp1.1 from training. Duration: 0.836375 2023-10-03 08:28:45,316 WARNING [train.py:1204] (2/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_395-73570-0_sp1.1 from training. Duration: 0.9 2023-10-03 08:28:46,763 WARNING [train.py:1204] (2/4) Exclude cut with ID _46683_210_9642_1_1533730913492_4591116_421-19547-0_sp1.1 from training. Duration: 0.936375 2023-10-03 08:28:46,805 WARNING [train.py:1204] (2/4) Exclude cut with ID _19896_210_1883_1_1531709022662_6649569_714-8668-0_sp1.1 from training. Duration: 0.963625 2023-10-03 08:28:48,130 WARNING [train.py:1204] (2/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_49-101072-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 08:28:48,137 WARNING [train.py:1204] (2/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_220-191080-0_sp1.1 from training. Duration: 0.8909375 2023-10-03 08:28:48,178 WARNING [train.py:1204] (2/4) Exclude cut with ID _3465_210_3525_1_1526175667060_3574049_62-335552-0 from training. Duration: 0.87 2023-10-03 08:28:49,596 WARNING [train.py:1204] (2/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_111-112362-0_sp1.1 from training. Duration: 0.82725 2023-10-03 08:28:52,784 WARNING [train.py:1204] (2/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_318-59097-0_sp0.9 from training. Duration: 0.8444375 2023-10-03 08:28:56,863 WARNING [train.py:1204] (2/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_98-255710-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 08:28:56,869 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9042W0003-515609 from training. Duration: 0.896 2023-10-03 08:28:56,878 WARNING [train.py:1204] (2/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_158-250116-0_sp0.9 from training. Duration: 0.688875 2023-10-03 08:28:58,952 WARNING [train.py:1204] (2/4) Exclude cut with ID _26750_210_5665_1_1532226678442_4063230_7-346885-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 08:28:59,109 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.1.feed_forward1.out_proj.dropout_p, batch_count=1202600.0, ans=0.1 2023-10-03 08:29:00,263 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_37839_210_7234_1_1533542462_3689303_357-227078-0_sp1.1 from training. Duration: 0.963625 2023-10-03 08:29:00,284 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9052W0119-520904 from training. Duration: 0.896 2023-10-03 08:29:01,791 WARNING [train.py:1204] (2/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_249-281349-0_sp1.1 from training. Duration: 0.77275 2023-10-03 08:29:01,796 WARNING [train.py:1204] (2/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_147-293311-0 from training. Duration: 0.94 2023-10-03 08:29:01,797 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_158-21166-0_sp1.1 from training. Duration: 0.963625 2023-10-03 08:29:04,584 WARNING [train.py:1204] (2/4) Exclude cut with ID _14100_210_8341_1_1530529320613_3678069_207-332194-0_sp1.1 from training. Duration: 0.92725 2023-10-03 08:29:06,441 WARNING [train.py:1204] (2/4) Exclude cut with ID _9389_210_1664_1_1529217337301_5289269_324-211031-0_sp1.1 from training. Duration: 0.963625 2023-10-03 08:29:06,461 WARNING [train.py:1204] (2/4) Exclude cut with ID _14603_210_4220_1_1530615688441_3097319_133-158308-0 from training. Duration: 0.84 2023-10-03 08:29:07,855 WARNING [train.py:1204] (2/4) Exclude cut with ID _13252_210_1925_1_1530671372047_3336909_166-314607-0 from training. Duration: 0.54 2023-10-03 08:29:09,250 WARNING [train.py:1204] (2/4) Exclude cut with ID _32704_210_8954_1_1533017488840_6747370_649-79538-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 08:29:09,258 WARNING [train.py:1204] (2/4) Exclude cut with ID _28394_210_8913_1_1532430049518_3650118_343-115667-0_sp1.1 from training. Duration: 0.97275 2023-10-03 08:29:09,296 WARNING [train.py:1204] (2/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_87-278686-0_sp0.9 from training. Duration: 0.8555625 2023-10-03 08:29:09,528 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.bypass.skip_rate, batch_count=1202666.6666666667, ans=0.07 2023-10-03 08:29:10,813 INFO [train.py:1046] (2/4) Epoch 34, batch 5100, loss[loss=0.17, simple_loss=0.236, pruned_loss=0.05197, over 23808.00 frames. ], tot_loss[loss=0.1607, simple_loss=0.2393, pruned_loss=0.04103, over 4709786.54 frames. ], batch size: 164, lr: 2.98e-03, grad_scale: 16.0 2023-10-03 08:29:12,317 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9109W0003-549749 from training. Duration: 0.768 2023-10-03 08:29:15,080 WARNING [train.py:1204] (2/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_126-29104-0_sp1.1 from training. Duration: 0.77275 2023-10-03 08:29:17,889 WARNING [train.py:1204] (2/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_325-113798-0 from training. Duration: 0.85 2023-10-03 08:29:17,944 WARNING [train.py:1204] (2/4) Exclude cut with ID _6694_210_4599_1_1527852637490_3965529_116-109398-0 from training. Duration: 0.73 2023-10-03 08:29:19,283 WARNING [train.py:1204] (2/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_46-57119-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 08:29:19,546 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.feed_forward3.hidden_balancer.prob, batch_count=1202666.6666666667, ans=0.125 2023-10-03 08:29:20,723 WARNING [train.py:1204] (2/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_546-119312-0_sp1.1 from training. Duration: 0.9 2023-10-03 08:29:21,455 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.3.encoder.layers.0.feed_forward2.out_whiten, num_groups=1, num_channels=512, metric=4.67 vs. limit=15.0 2023-10-03 08:29:22,153 WARNING [train.py:1204] (2/4) Exclude cut with ID _60057_210_10640_1_1534903065793_3333150_468-306939-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 08:29:23,971 WARNING [train.py:1204] (2/4) Exclude cut with ID _15150_210_8341_1_1530752411803_7256384_22-138274-0 from training. Duration: 0.96 2023-10-03 08:29:24,001 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_289-180904-0 from training. Duration: 0.76 2023-10-03 08:29:24,215 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.conv_module1.balancer2.prob, batch_count=1202733.3333333333, ans=0.125 2023-10-03 08:29:28,821 WARNING [train.py:1204] (2/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_64-133721-0_sp1.1 from training. Duration: 0.92725 2023-10-03 08:29:29,480 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_195-257512-0_sp1.1 from training. Duration: 0.6090625 2023-10-03 08:29:33,638 WARNING [train.py:1204] (2/4) Exclude cut with ID _66873_210_5517_1_1535454136506_4293370_704-101963-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 08:29:36,515 WARNING [train.py:1204] (2/4) Exclude cut with ID _4366_210_1388_1_1526792053633_3613819_231-211402-0 from training. Duration: 0.94 2023-10-03 08:29:38,448 WARNING [train.py:1204] (2/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_679-190385-0_sp1.1 from training. Duration: 0.97275 2023-10-03 08:29:39,896 WARNING [train.py:1204] (2/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_298-81459-0_sp1.1 from training. Duration: 0.963625 2023-10-03 08:29:39,906 WARNING [train.py:1204] (2/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_658-282610-0_sp0.9 from training. Duration: 0.588875 2023-10-03 08:29:45,162 WARNING [train.py:1204] (2/4) Exclude cut with ID _57572_210_5517_1_1534921219624_3809484_196-283734-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 08:29:45,241 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_53071_210_16554_1_1534490405_2954066_294-198554-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 08:29:45,246 WARNING [train.py:1204] (2/4) Exclude cut with ID _4866_210_2577_1_1527404412284_4364659_352-157930-0 from training. Duration: 0.81 2023-10-03 08:29:48,075 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9231W0113-610139 from training. Duration: 0.896 2023-10-03 08:29:48,125 WARNING [train.py:1204] (2/4) Exclude cut with ID _24271_210_12658_1_1531987270022_7190519_638-171906-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 08:29:48,165 WARNING [train.py:1204] (2/4) Exclude cut with ID _14170_210_5219_1_1530525949868_4469650_272-249985-0 from training. Duration: 0.67 2023-10-03 08:29:49,543 WARNING [train.py:1204] (2/4) Exclude cut with ID _64086_210_7994_1_1535270417019_7435949_449-31567-0 from training. Duration: 0.97 2023-10-03 08:29:52,257 WARNING [train.py:1204] (2/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_109-282568-0_sp1.1 from training. Duration: 0.97275 2023-10-03 08:29:57,399 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.3.encoder.layers.2.conv_module1.whiten, num_groups=1, num_channels=512, metric=5.53 vs. limit=15.0 2023-10-03 08:29:58,873 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.balancer2.prob, batch_count=1202866.6666666667, ans=0.125 2023-10-03 08:30:00,099 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_121-45934-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 08:30:04,789 WARNING [train.py:1204] (2/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_314-126819-0 from training. Duration: 0.5 2023-10-03 08:30:04,811 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9309W0192-640319 from training. Duration: 0.512 2023-10-03 08:30:04,819 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9310W0003-640649 from training. Duration: 0.896 2023-10-03 08:30:06,204 WARNING [train.py:1204] (2/4) Exclude cut with ID _7160_210_1876_1_1527940923231_3703799_120-150943-0 from training. Duration: 0.81 2023-10-03 08:30:06,206 WARNING [train.py:1204] (2/4) Exclude cut with ID _40380_210_9059_1_1533380594759_4187380_6-33508-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 08:30:08,974 WARNING [train.py:1204] (2/4) Exclude cut with ID _59202_210_26984_1_1534816810612_3549174_137-318710-0 from training. Duration: 0.89 2023-10-03 08:30:11,157 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.4.encoder.layers.0.conv_module1.whiten, num_groups=1, num_channels=384, metric=4.62 vs. limit=15.0 2023-10-03 08:30:13,862 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_435-313305-0 from training. Duration: 0.71 2023-10-03 08:30:15,285 WARNING [train.py:1204] (2/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_91-212486-0_sp1.1 from training. Duration: 0.6181875 2023-10-03 08:30:15,377 WARNING [train.py:1204] (2/4) Exclude cut with ID _14603_210_4220_1_1530615688441_3097319_133-158308-0_sp1.1 from training. Duration: 0.763625 2023-10-03 08:30:19,424 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_99-251314-0 from training. Duration: 0.87 2023-10-03 08:30:20,902 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_365-217119-0_sp1.1 from training. Duration: 0.57275 2023-10-03 08:30:20,944 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_3475_210_2028_1_1526193578_3622569_377-166133-0 from training. Duration: 0.86 2023-10-03 08:30:24,210 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.4.encoder.layers.2.conv_module2.whiten, num_groups=1, num_channels=384, metric=3.31 vs. limit=15.0 2023-10-03 08:30:25,336 INFO [train.py:1046] (2/4) Epoch 34, batch 5150, loss[loss=0.1749, simple_loss=0.2577, pruned_loss=0.04604, over 23904.00 frames. ], tot_loss[loss=0.1615, simple_loss=0.2404, pruned_loss=0.04131, over 4719006.45 frames. ], batch size: 86, lr: 2.98e-03, grad_scale: 16.0 2023-10-03 08:30:26,844 WARNING [train.py:1204] (2/4) Exclude cut with ID _13916_210_9994_1_1530788341126_3763149_116-21034-0_sp1.1 from training. Duration: 0.87275 2023-10-03 08:30:26,861 WARNING [train.py:1204] (2/4) Exclude cut with ID _64220_210_9527_1_1535250151772_4875259_142-216831-0_sp1.1 from training. Duration: 0.936375 2023-10-03 08:30:26,861 WARNING [train.py:1204] (2/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_490-207861-0_sp1.1 from training. Duration: 0.8909375 2023-10-03 08:30:26,908 WARNING [train.py:1204] (2/4) Exclude cut with ID _41314_210_12651_1_1533459476543_3766460_87-272068-0_sp0.9 from training. Duration: 0.92225 2023-10-03 08:30:28,263 WARNING [train.py:1204] (2/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_18-46756-0_sp1.1 from training. Duration: 0.7181875 2023-10-03 08:30:28,346 WARNING [train.py:1204] (2/4) Exclude cut with ID _39411_210_5986_1_1533538825030_3343649_98-311335-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 08:30:29,669 WARNING [train.py:1204] (2/4) Exclude cut with ID _21878_210_3525_1_1531738772002_7264330_113-18163-0 from training. Duration: 0.89 2023-10-03 08:30:29,671 WARNING [train.py:1204] (2/4) Exclude cut with ID _6653_210_2629_1_1527919172441_6732679_1103-56445-0 from training. Duration: 0.73 2023-10-03 08:30:29,715 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_3474_210_2028_1_1526188513_4825246_145-309131-0 from training. Duration: 0.74 2023-10-03 08:30:31,613 WARNING [train.py:1204] (2/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_120-211630-0_sp1.1 from training. Duration: 0.836375 2023-10-03 08:30:31,631 WARNING [train.py:1204] (2/4) Exclude cut with ID _8030_210_7497_1_1528444886781_3531089_32-31837-0 from training. Duration: 0.97 2023-10-03 08:30:32,995 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_19949_210_2161_1_1531706337_928079_8-160956-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 08:30:34,342 WARNING [train.py:1204] (2/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_321-215742-0_sp1.1 from training. Duration: 0.4545625 2023-10-03 08:30:36,271 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_583-124650-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 08:30:37,628 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_30434_210_13049_1_1532519384_981746_164-182755-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 08:30:42,042 WARNING [train.py:1204] (2/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_339-104791-0_sp1.1 from training. Duration: 0.4454375 2023-10-03 08:30:42,061 WARNING [train.py:1204] (2/4) Exclude cut with ID _15026_210_8750_1_1530777602316_3291850_211-195043-0 from training. Duration: 0.8 2023-10-03 08:30:43,425 WARNING [train.py:1204] (2/4) Exclude cut with ID _29877_210_6228_1_1532397791106_5276149_487-182647-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 08:30:43,471 WARNING [train.py:1204] (2/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_127-566-0_sp0.9 from training. Duration: 0.9333125 2023-10-03 08:30:44,920 WARNING [train.py:1204] (2/4) Exclude cut with ID _8263_210_1664_1_1528439452891_970220_68-181805-0_sp0.9 from training. Duration: 0.711125 2023-10-03 08:30:44,922 WARNING [train.py:1204] (2/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_504-72513-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 08:30:44,930 WARNING [train.py:1204] (2/4) Exclude cut with ID _64639_210_5816_1_1535367619437_3528000_324-265233-0_sp1.1 from training. Duration: 0.963625 2023-10-03 08:30:46,310 WARNING [train.py:1204] (2/4) Exclude cut with ID _6456_210_1534_1_1527681634212_3181049_215-309359-0_sp0.9 from training. Duration: 0.97775 2023-10-03 08:30:46,314 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_115-233929-0_sp1.1 from training. Duration: 0.6545625 2023-10-03 08:30:48,005 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.525e+02 1.915e+02 2.112e+02 2.409e+02 3.229e+02, threshold=4.224e+02, percent-clipped=0.0 2023-10-03 08:30:48,110 WARNING [train.py:1204] (2/4) Exclude cut with ID _14087_210_2253_1_1530529224512_3014029_219-224445-0 from training. Duration: 0.84 2023-10-03 08:30:49,609 WARNING [train.py:1204] (2/4) Exclude cut with ID _14867_210_1968_1_1530684179890_3846969_413-29292-0_sp1.1 from training. Duration: 0.8181875 2023-10-03 08:30:49,656 WARNING [train.py:1204] (2/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_1012-279473-0_sp0.9 from training. Duration: 0.7444375 2023-10-03 08:30:52,360 WARNING [train.py:1204] (2/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_605-133395-0_sp0.9 from training. Duration: 0.7333125 2023-10-03 08:30:53,861 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_89-35748-0 from training. Duration: 0.79 2023-10-03 08:30:53,953 WARNING [train.py:1204] (2/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_476-272757-0_sp1.1 from training. Duration: 0.8454375 2023-10-03 08:31:01,723 WARNING [train.py:1204] (2/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_449-15508-0_sp0.9 from training. Duration: 0.911125 2023-10-03 08:31:03,177 WARNING [train.py:1204] (2/4) Exclude cut with ID _29345_210_4381_1_1532689168263_3860169_198-174328-0 from training. Duration: 0.91 2023-10-03 08:31:03,386 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.1.conv_module1.balancer1.prob, batch_count=1203133.3333333333, ans=0.125 2023-10-03 08:31:04,839 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.attention_skip_rate, batch_count=1203133.3333333333, ans=0.0 2023-10-03 08:31:05,986 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_15517_210_6753_1_1530952293_1664795_125-158157-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 08:31:12,051 WARNING [train.py:1204] (2/4) Exclude cut with ID _25565_210_12392_1_1532423009382_3952098_420-113180-0_sp1.1 from training. Duration: 0.963625 2023-10-03 08:31:12,137 WARNING [train.py:1204] (2/4) Exclude cut with ID _51471_210_1889_1_1534399391390_3094250_236-146773-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 08:31:16,364 WARNING [train.py:1204] (2/4) Exclude cut with ID _30039_210_13270_1_1532484612900_2725150_175-35012-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 08:31:18,115 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_23850_210_4238_1_1532232597_1632433_99-275843-0_sp1.1 from training. Duration: 0.936375 2023-10-03 08:31:19,549 WARNING [train.py:1204] (2/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_658-282610-0 from training. Duration: 0.53 2023-10-03 08:31:21,124 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.0.bypass.scale_min, batch_count=1203200.0, ans=0.2 2023-10-03 08:31:22,465 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_37818_210_4238_1_1533620910_4720083_234-65934-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 08:31:23,952 WARNING [train.py:1204] (2/4) Exclude cut with ID _3067_210_2667_1_1526212974640_4503168_459-173867-0_sp1.1 from training. Duration: 0.8 2023-10-03 08:31:24,135 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.feed_forward2.hidden_balancer.prob, batch_count=1203266.6666666667, ans=0.125 2023-10-03 08:31:25,260 WARNING [train.py:1204] (2/4) Exclude cut with ID _8696_210_1876_1_1528545632440_3345058_209-54069-0_sp0.9 from training. Duration: 0.7444375 2023-10-03 08:31:28,035 WARNING [train.py:1204] (2/4) Exclude cut with ID _62574_210_2655_1_1535180407058_3933249_420-236465-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 08:31:30,014 WARNING [train.py:1204] (2/4) Exclude cut with ID _46681_210_9642_1_1533893005571_7603359_333-4042-0_sp1.1 from training. Duration: 0.936375 2023-10-03 08:31:31,468 WARNING [train.py:1204] (2/4) Exclude cut with ID _3541_210_2477_1_1526475408709_3263973_377-296839-0 from training. Duration: 0.55 2023-10-03 08:31:33,627 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=1203266.6666666667, ans=0.1 2023-10-03 08:31:36,244 WARNING [train.py:1204] (2/4) Exclude cut with ID _43997_210_4525_1_1533538632555_5189329_426-92599-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 08:31:38,900 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_43-71989-0_sp1.1 from training. Duration: 0.5181875 2023-10-03 08:31:40,217 INFO [train.py:1046] (2/4) Epoch 34, batch 5200, loss[loss=0.1462, simple_loss=0.2287, pruned_loss=0.03187, over 24597.00 frames. ], tot_loss[loss=0.1618, simple_loss=0.2412, pruned_loss=0.04119, over 4720692.15 frames. ], batch size: 60, lr: 2.98e-03, grad_scale: 32.0 2023-10-03 08:31:40,333 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_2009_210_2071_1_1525481134_4588937_140-345530-0_sp1.1 from training. Duration: 0.963625 2023-10-03 08:31:40,346 WARNING [train.py:1204] (2/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_138-332773-0_sp0.9 from training. Duration: 0.9 2023-10-03 08:31:42,140 WARNING [train.py:1204] (2/4) Exclude cut with ID _13942_210_9988_1_1530604906036_3733360_499-55676-0_sp0.9 from training. Duration: 0.688875 2023-10-03 08:31:42,160 WARNING [train.py:1204] (2/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_259-34038-0_sp1.1 from training. Duration: 0.67275 2023-10-03 08:31:42,169 WARNING [train.py:1204] (2/4) Exclude cut with ID _3120_210_3525_1_1526036143112_4072980_404-249994-0_sp1.1 from training. Duration: 0.9 2023-10-03 08:31:42,208 WARNING [train.py:1204] (2/4) Exclude cut with ID _40402_210_18219_1_1533522324115_2953150_222-108810-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 08:31:43,711 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.bypass.scale_min, batch_count=1203333.3333333333, ans=0.2 2023-10-03 08:31:46,411 WARNING [train.py:1204] (2/4) Exclude cut with ID _22083_210_8254_1_1531702862439_3678970_49-234546-0_sp0.9 from training. Duration: 0.92225 2023-10-03 08:31:47,794 WARNING [train.py:1204] (2/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_199-295611-0_sp0.9 from training. Duration: 0.611125 2023-10-03 08:31:49,686 WARNING [train.py:1204] (2/4) Exclude cut with ID _43552_210_8913_1_1533726031059_3523429_43-192945-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 08:31:50,442 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.2.encoder.layers.1.self_attn1.whiten, num_groups=1, num_channels=384, metric=15.89 vs. limit=22.5 2023-10-03 08:31:53,706 WARNING [train.py:1204] (2/4) Exclude cut with ID _14224_210_4381_1_1530529062037_4264010_364-22817-0 from training. Duration: 0.71 2023-10-03 08:31:55,091 WARNING [train.py:1204] (2/4) Exclude cut with ID _14922_210_6228_1_1530707435273_3693310_388-129552-0_sp1.1 from training. Duration: 0.87275 2023-10-03 08:31:55,144 WARNING [train.py:1204] (2/4) Exclude cut with ID _6537_210_1883_1_1527987559710_3597129_190-280227-0_sp1.1 from training. Duration: 0.97275 2023-10-03 08:31:57,890 WARNING [train.py:1204] (2/4) Exclude cut with ID _55844_210_2346_1_1534471212569_7625707_398-56897-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 08:31:57,992 WARNING [train.py:1204] (2/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_48-182155-0_sp1.1 from training. Duration: 0.7909375 2023-10-03 08:31:58,018 WARNING [train.py:1204] (2/4) Exclude cut with ID _29529_210_7234_1_1532393961455_3977460_582-18084-0_sp1.1 from training. Duration: 0.97275 2023-10-03 08:32:01,327 WARNING [train.py:1204] (2/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_912-282412-0 from training. Duration: 0.49 2023-10-03 08:32:02,703 WARNING [train.py:1204] (2/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_332-256168-0_sp0.9 from training. Duration: 0.8333125 2023-10-03 08:32:04,103 WARNING [train.py:1204] (2/4) Exclude cut with ID _64220_210_9527_1_1535250151772_4875259_55-250624-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 08:32:05,526 WARNING [train.py:1204] (2/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_264-108685-0 from training. Duration: 0.55 2023-10-03 08:32:06,970 WARNING [train.py:1204] (2/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_613-232856-0_sp0.9 from training. Duration: 0.6 2023-10-03 08:32:08,300 WARNING [train.py:1204] (2/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_68-212801-0_sp1.1 from training. Duration: 0.663625 2023-10-03 08:32:08,965 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.3.encoder.layers.3.self_attn_weights.whiten_keys, num_groups=8, num_channels=256, metric=2.93 vs. limit=6.0 2023-10-03 08:32:09,239 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.0.layers.1.self_attn1.whiten, num_groups=1, num_channels=192, metric=11.16 vs. limit=22.5 2023-10-03 08:32:09,628 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6290_210_3273_1_1527595224_1282787_115-87095-0 from training. Duration: 0.55 2023-10-03 08:32:09,681 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_3080_210_2071_1_1526086102_4365166_312-8726-0 from training. Duration: 0.91 2023-10-03 08:32:12,824 WARNING [train.py:1204] (2/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_105-63309-0 from training. Duration: 0.98 2023-10-03 08:32:14,192 WARNING [train.py:1204] (2/4) Exclude cut with ID _2941_210_2577_1_1526199673150_2489759_201-160779-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 08:32:14,195 WARNING [train.py:1204] (2/4) Exclude cut with ID ID0406W0226-885434 from training. Duration: 0.997 2023-10-03 08:32:14,202 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_17292_210_6753_1_1531125388_2943734_353-328201-0_sp1.1 from training. Duration: 0.97275 2023-10-03 08:32:15,611 WARNING [train.py:1204] (2/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_572-96029-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 08:32:15,640 WARNING [train.py:1204] (2/4) Exclude cut with ID _14970_210_15709_1_1530775547361_1966965_6-23328-0_sp0.9 from training. Duration: 0.9555625 2023-10-03 08:32:17,015 WARNING [train.py:1204] (2/4) Exclude cut with ID _45012_210_12651_1_1533608435136_5371380_240-37101-0 from training. Duration: 0.93 2023-10-03 08:32:17,102 WARNING [train.py:1204] (2/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_685-90183-0_sp1.1 from training. Duration: 0.936375 2023-10-03 08:32:20,363 WARNING [train.py:1204] (2/4) Exclude cut with ID _13252_210_1925_1_1530671372047_3336909_41-22245-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 08:32:21,851 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_15126_210_6753_1_1530778677_2591185_120-277707-0 from training. Duration: 0.8 2023-10-03 08:32:21,887 WARNING [train.py:1204] (2/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_310-26186-0 from training. Duration: 0.7 2023-10-03 08:32:21,905 WARNING [train.py:1204] (2/4) Exclude cut with ID _14998_210_5219_1_1530705582080_7474970_787-294416-0 from training. Duration: 0.82 2023-10-03 08:32:27,342 WARNING [train.py:1204] (2/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_795-33252-0 from training. Duration: 0.37 2023-10-03 08:32:27,378 WARNING [train.py:1204] (2/4) Exclude cut with ID _7537_210_2084_1_1528502395431_3924359_113-199933-0_sp0.9 from training. Duration: 0.6555625 2023-10-03 08:32:32,510 WARNING [train.py:1204] (2/4) Exclude cut with ID _3465_210_3525_1_1526175667060_3574049_308-231735-0_sp0.9 from training. Duration: 0.811125 2023-10-03 08:32:32,529 WARNING [train.py:1204] (2/4) Exclude cut with ID _19504_210_9994_1_1531648863242_3325220_354-296765-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 08:32:35,148 WARNING [train.py:1204] (2/4) Exclude cut with ID _5409_210_1388_1_1527292850189_5865191_178-127970-0 from training. Duration: 0.57 2023-10-03 08:32:36,476 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_1466-248056-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 08:32:36,507 WARNING [train.py:1204] (2/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_535-14410-0_sp0.9 from training. Duration: 0.52225 2023-10-03 08:32:36,519 WARNING [train.py:1204] (2/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_747-129941-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 08:32:36,553 WARNING [train.py:1204] (2/4) Exclude cut with ID _15012_210_1968_1_1530856820429_5095350_688-178849-0_sp1.1 from training. Duration: 0.5818125 2023-10-03 08:32:39,390 WARNING [train.py:1204] (2/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_356-45054-0_sp1.1 from training. Duration: 0.7818125 2023-10-03 08:32:41,307 WARNING [train.py:1204] (2/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_146-276944-0_sp0.9 from training. Duration: 0.92225 2023-10-03 08:32:44,144 WARNING [train.py:1204] (2/4) Exclude cut with ID _39204_210_4220_1_1533880547368_3191120_346-180306-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 08:32:44,216 WARNING [train.py:1204] (2/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_641-158017-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 08:32:44,216 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_19952_210_2161_1_1531875469_3851750_274-42137-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 08:32:50,290 WARNING [train.py:1204] (2/4) Exclude cut with ID _60804_210_9642_1_1534831199859_7268299_87-327819-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 08:32:51,594 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_9302_210_2550_1_1528889962_4655416_638-301620-0 from training. Duration: 0.47 2023-10-03 08:32:52,980 WARNING [train.py:1204] (2/4) Exclude cut with ID _44304_210_15374_1_1533639352765_4613129_302-22084-0_sp1.1 from training. Duration: 0.7818125 2023-10-03 08:32:52,991 WARNING [train.py:1204] (2/4) Exclude cut with ID _18513_210_9206_1_1531618215223_3862620_177-227063-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 08:32:54,306 INFO [train.py:1046] (2/4) Epoch 34, batch 5250, loss[loss=0.1654, simple_loss=0.2447, pruned_loss=0.043, over 24446.00 frames. ], tot_loss[loss=0.1612, simple_loss=0.2405, pruned_loss=0.04097, over 4712970.54 frames. ], batch size: 66, lr: 2.98e-03, grad_scale: 16.0 2023-10-03 08:32:54,381 WARNING [train.py:1204] (2/4) Exclude cut with ID _41326_210_5854_1_1533778076509_3492080_510-45444-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 08:32:55,642 WARNING [train.py:1204] (2/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_76-311963-0_sp1.1 from training. Duration: 0.636375 2023-10-03 08:32:55,729 WARNING [train.py:1204] (2/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_156-302675-0_sp0.9 from training. Duration: 0.611125 2023-10-03 08:32:58,997 WARNING [train.py:1204] (2/4) Exclude cut with ID _53489_210_6228_1_1534298357169_3835970_367-148411-0_sp1.1 from training. Duration: 0.936375 2023-10-03 08:33:01,708 WARNING [train.py:1204] (2/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_389-144525-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 08:33:01,742 WARNING [train.py:1204] (2/4) Exclude cut with ID _14069_210_5632_1_1530512692033_4803390_158-238427-0_sp1.1 from training. Duration: 0.963625 2023-10-03 08:33:03,587 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_486-151131-0_sp0.9 from training. Duration: 0.9444375 2023-10-03 08:33:07,914 WARNING [train.py:1204] (2/4) Exclude cut with ID _39846_210_12651_1_1533603651992_1318030_188-71831-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 08:33:10,573 WARNING [train.py:1204] (2/4) Exclude cut with ID _39411_210_5986_1_1533538825030_3343649_173-238248-0_sp1.1 from training. Duration: 0.7545625 2023-10-03 08:33:12,002 WARNING [train.py:1204] (2/4) Exclude cut with ID _20684_210_6917_1_1531567751646_4203930_469-49081-0_sp1.1 from training. Duration: 0.92725 2023-10-03 08:33:13,941 WARNING [train.py:1204] (2/4) Exclude cut with ID _8833_210_5219_1_1528592665210_7553059_611-80313-0_sp1.1 from training. Duration: 0.5818125 2023-10-03 08:33:15,348 WARNING [train.py:1204] (2/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_440-141811-0 from training. Duration: 0.95 2023-10-03 08:33:15,365 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_41211_210_2544_1_1533348859_3702910_236-223652-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 08:33:16,764 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_40222_210_6228_1_1533385298_4481939_220-163998-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 08:33:18,464 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.466e+02 1.928e+02 2.131e+02 2.518e+02 4.702e+02, threshold=4.262e+02, percent-clipped=2.0 2023-10-03 08:33:19,135 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.3.encoder.layers.1.feed_forward2.out_whiten, num_groups=1, num_channels=512, metric=6.68 vs. limit=15.0 2023-10-03 08:34:03,909 INFO [train.py:1046] (2/4) Epoch 34, batch 5300, loss[loss=0.1533, simple_loss=0.2367, pruned_loss=0.03496, over 24466.00 frames. ], tot_loss[loss=0.1599, simple_loss=0.2386, pruned_loss=0.04055, over 4709905.20 frames. ], batch size: 63, lr: 2.98e-03, grad_scale: 16.0 2023-10-03 08:34:09,320 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.0.feed_forward2.hidden_balancer.prob, batch_count=1204000.0, ans=0.125 2023-10-03 08:34:19,279 WARNING [train.py:1204] (2/4) Exclude cut with ID _14629_210_15709_1_1530604298182_956899_6-232695-0_sp1.1 from training. Duration: 0.8545625 2023-10-03 08:34:19,300 WARNING [train.py:1204] (2/4) Exclude cut with ID _13921_210_9994_1_1530841782272_3503520_327-69677-0 from training. Duration: 0.9 2023-10-03 08:34:19,326 WARNING [train.py:1204] (2/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_372-298093-0 from training. Duration: 0.8 2023-10-03 08:34:19,339 WARNING [train.py:1204] (2/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_515-139111-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 08:34:19,582 WARNING [train.py:1204] (2/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_85-65056-0_sp1.1 from training. Duration: 0.87275 2023-10-03 08:34:19,625 WARNING [train.py:1204] (2/4) Exclude cut with ID _15113_210_8254_1_1530842438579_3716279_209-54533-0_sp1.1 from training. Duration: 0.87275 2023-10-03 08:34:19,672 WARNING [train.py:1204] (2/4) Exclude cut with ID _70447_210_12392_1_1535972499300_3160420_123-312838-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 08:34:19,701 WARNING [train.py:1204] (2/4) Exclude cut with ID _60113_210_16271_1_1535164212481_4386079_70-63596-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 08:34:19,729 WARNING [train.py:1204] (2/4) Exclude cut with ID _14632_210_9988_1_1530691279392_3923200_225-188509-0_sp1.1 from training. Duration: 0.963625 2023-10-03 08:34:19,779 WARNING [train.py:1204] (2/4) Exclude cut with ID _34734_210_4525_1_1533365977542_4448720_584-180532-0_sp1.1 from training. Duration: 0.97275 2023-10-03 08:34:19,792 WARNING [train.py:1204] (2/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_351-294650-0_sp1.1 from training. Duration: 0.57275 2023-10-03 08:34:20,131 WARNING [train.py:1204] (2/4) Exclude cut with ID _13252_210_1925_1_1530671372047_3336909_263-168388-0_sp0.9 from training. Duration: 0.9555625 2023-10-03 08:34:20,158 WARNING [train.py:1204] (2/4) Exclude cut with ID _22083_210_8254_1_1531702862439_3678970_201-85738-0 from training. Duration: 0.9 2023-10-03 08:34:20,249 WARNING [train.py:1204] (2/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_237-58433-0 from training. Duration: 0.74 2023-10-03 08:34:20,278 WARNING [train.py:1204] (2/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_262-250826-0 from training. Duration: 0.99 2023-10-03 08:34:20,380 WARNING [train.py:1204] (2/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_927-43827-0_sp0.9 from training. Duration: 0.82225 2023-10-03 08:34:20,410 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_2821_210_2715_1_1526108385_3763560_73-296673-0 from training. Duration: 0.83 2023-10-03 08:34:20,485 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6287_210_3273_1_1527509506_2653716_182-199887-0 from training. Duration: 0.51 2023-10-03 08:34:20,620 WARNING [train.py:1204] (2/4) Exclude cut with ID _70519_210_4163_1_1535961595994_7458230_899-80223-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 08:34:21,249 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_210-234949-0_sp1.1 from training. Duration: 0.97275 2023-10-03 08:34:21,287 WARNING [train.py:1204] (2/4) Exclude cut with ID _30196_210_1664_1_1533708496522_7074140_768-278176-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 08:34:21,364 WARNING [train.py:1204] (2/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_315-128283-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 08:34:21,445 WARNING [train.py:1204] (2/4) Exclude cut with ID _14886_210_4220_1_1530702020355_3516190_330-323941-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 08:34:21,709 WARNING [train.py:1204] (2/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_264-348896-0_sp1.1 from training. Duration: 0.77275 2023-10-03 08:34:21,731 WARNING [train.py:1204] (2/4) Exclude cut with ID _37086_210_9038_1_1532999540891_7021250_433-10563-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 08:34:21,769 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_53638_210_21581_1_1534297319_5808449_885-80172-0_sp1.1 from training. Duration: 0.87275 2023-10-03 08:34:21,862 WARNING [train.py:1204] (2/4) Exclude cut with ID _14623_210_1925_1_1530773354962_3632609_157-218771-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 08:34:21,873 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_1368-78165-0_sp1.1 from training. Duration: 0.97275 2023-10-03 08:34:21,877 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_309-65177-0_sp0.9 from training. Duration: 0.97775 2023-10-03 08:34:21,883 WARNING [train.py:1204] (2/4) Exclude cut with ID _21632_210_2253_1_1531994001765_3744140_291-138812-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 08:34:21,896 WARNING [train.py:1204] (2/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_669-93163-0_sp1.1 from training. Duration: 0.8909375 2023-10-03 08:34:22,469 WARNING [train.py:1204] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_292-215346-0 from training. Duration: 0.97 2023-10-03 08:34:22,494 WARNING [train.py:1204] (2/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_286-222073-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 08:34:22,779 WARNING [train.py:1204] (2/4) Exclude cut with ID _59256_210_5986_1_1535076085997_3589120_121-237386-0_sp1.1 from training. Duration: 0.87275 2023-10-03 08:34:22,793 WARNING [train.py:1204] (2/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_96-320736-0 from training. Duration: 0.86 2023-10-03 08:34:22,802 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_128-202104-0 from training. Duration: 0.63 2023-10-03 08:34:22,890 WARNING [train.py:1204] (2/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_242-3472-0_sp0.9 from training. Duration: 0.811125 2023-10-03 08:34:22,938 WARNING [train.py:1204] (2/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_154-74182-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 08:34:22,942 WARNING [train.py:1204] (2/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_187-221566-0 from training. Duration: 0.43 2023-10-03 08:34:23,118 WARNING [train.py:1204] (2/4) Exclude cut with ID _6194_210_1534_1_1527511826160_3941194_14-282876-0 from training. Duration: 0.71 2023-10-03 08:34:23,640 WARNING [train.py:1204] (2/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_660-211088-0_sp1.1 from training. Duration: 0.563625 2023-10-03 08:34:23,992 WARNING [train.py:1204] (2/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_526-62079-0_sp1.1 from training. Duration: 0.6090625 2023-10-03 08:34:24,145 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_92-246270-0_sp1.1 from training. Duration: 0.77275 2023-10-03 08:34:24,232 WARNING [train.py:1204] (2/4) Exclude cut with ID IC0287W0023-142350 from training. Duration: 0.956 2023-10-03 08:34:24,308 WARNING [train.py:1204] (2/4) Exclude cut with ID _58605_210_12210_1_1534723195939_3904259_48-269402-0 from training. Duration: 0.96 2023-10-03 08:34:24,322 WARNING [train.py:1204] (2/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_416-257730-0_sp0.9 from training. Duration: 0.72225 2023-10-03 08:34:24,395 WARNING [train.py:1204] (2/4) Exclude cut with ID _7877_210_6014_1_1528286358099_3750136_185-17516-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 08:34:24,499 WARNING [train.py:1204] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_23-189234-0 from training. Duration: 0.83 2023-10-03 08:34:24,515 WARNING [train.py:1204] (2/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_237-89684-0 from training. Duration: 0.96 2023-10-03 08:34:24,583 WARNING [train.py:1204] (2/4) Exclude cut with ID _13916_210_9994_1_1530788341126_3763149_116-21034-0 from training. Duration: 0.96 2023-10-03 08:34:24,733 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_3133_210_2087_1_1526106446_2324690_246-125000-0_sp1.1 from training. Duration: 0.563625 2023-10-03 08:34:31,026 INFO [train.py:1046] (2/4) Epoch 35, batch 0, loss[loss=0.162, simple_loss=0.2504, pruned_loss=0.03683, over 24658.00 frames. ], tot_loss[loss=0.162, simple_loss=0.2504, pruned_loss=0.03683, over 24658.00 frames. ], batch size: 73, lr: 2.93e-03, grad_scale: 32.0 2023-10-03 08:34:31,026 INFO [train.py:1069] (2/4) Computing validation loss 2023-10-03 08:34:43,446 INFO [train.py:1078] (2/4) Epoch 35, validation: loss=0.3289, simple_loss=0.2753, pruned_loss=0.1913, over 1125622.00 frames. 2023-10-03 08:34:43,446 INFO [train.py:1079] (2/4) Maximum memory allocated so far is 20967MB 2023-10-03 08:34:44,928 WARNING [train.py:1204] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_402-253027-0 from training. Duration: 0.54 2023-10-03 08:34:44,989 WARNING [train.py:1204] (2/4) Exclude cut with ID _59288_210_5854_1_1535077757774_3412246_158-289345-0_sp1.1 from training. Duration: 0.92725 2023-10-03 08:34:46,343 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_28211_210_6388_1_1532861506_4523224_128-328162-0_sp1.1 from training. Duration: 0.8181875 2023-10-03 08:34:51,657 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_788-278281-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 08:34:51,665 WARNING [train.py:1204] (2/4) Exclude cut with ID _49728_210_12651_1_1534041034294_6788150_624-195301-0_sp0.9 from training. Duration: 0.9444375 2023-10-03 08:34:51,714 WARNING [train.py:1204] (2/4) Exclude cut with ID _14860_210_4915_1_1530702020340_7343240_267-139775-0_sp1.1 from training. Duration: 0.963625 2023-10-03 08:34:53,006 WARNING [train.py:1204] (2/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_332-221057-0 from training. Duration: 0.68 2023-10-03 08:34:54,451 WARNING [train.py:1204] (2/4) Exclude cut with ID _40402_210_18219_1_1533522324115_2953150_242-76943-0 from training. Duration: 0.94 2023-10-03 08:34:57,310 WARNING [train.py:1204] (2/4) Exclude cut with ID _16747_210_5986_1_1531811544733_2749109_87-349587-0_sp1.1 from training. Duration: 0.963625 2023-10-03 08:34:57,367 WARNING [train.py:1204] (2/4) Exclude cut with ID _22982_210_1883_1_1531882790447_5449409_505-346452-0_sp1.1 from training. Duration: 0.963625 2023-10-03 08:35:00,656 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.bypass_mid.scale_min, batch_count=1204153.3333333333, ans=0.2 2023-10-03 08:35:01,840 WARNING [train.py:1204] (2/4) Exclude cut with ID _17956_210_4353_1_1531209568125_6979970_13-192107-0_sp1.1 from training. Duration: 0.963625 2023-10-03 08:35:01,881 WARNING [train.py:1204] (2/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_430-307666-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 08:35:03,166 WARNING [train.py:1204] (2/4) Exclude cut with ID _2895_210_2477_1_1525956785794_4063750_289-11475-0_sp1.1 from training. Duration: 0.7454375 2023-10-03 08:35:03,178 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_3910_210_1794_1_1526742306_334530_28-238909-0_sp0.9 from training. Duration: 0.9 2023-10-03 08:35:04,663 WARNING [train.py:1204] (2/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_18-130336-0 from training. Duration: 0.79 2023-10-03 08:35:07,839 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_548-294316-0_sp0.9 from training. Duration: 0.9 2023-10-03 08:35:15,281 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_42-142473-0_sp1.1 from training. Duration: 0.6545625 2023-10-03 08:35:15,296 WARNING [train.py:1204] (2/4) Exclude cut with ID _59170_210_6721_1_1534983633960_4203335_326-48066-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 08:35:17,144 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_45886_210_17024_1_1533709215_471063_7-79165-0 from training. Duration: 0.98 2023-10-03 08:35:17,401 INFO [scaling.py:1118] (2/4) WithLoss: name=encoder.encoders.2.encoder.layers.1.self_attn_weights, loss-sum=0.000e+00 2023-10-03 08:35:18,756 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=1204220.0, ans=0.1 2023-10-03 08:35:21,222 WARNING [train.py:1204] (2/4) Exclude cut with ID _44585_210_18628_1_1533605350387_4518199_280-73534-0_sp0.9 from training. Duration: 0.988875 2023-10-03 08:35:21,236 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_3475_210_2028_1_1526193578_3622569_265-276625-0_sp1.1 from training. Duration: 0.6909375 2023-10-03 08:35:23,973 WARNING [train.py:1204] (2/4) Exclude cut with ID _64631_210_27767_1_1535349580068_4165160_88-225232-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 08:35:26,847 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_205-265518-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 08:35:27,095 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.balancer_ff2.min_abs, batch_count=1204286.6666666667, ans=0.1 2023-10-03 08:35:31,127 WARNING [train.py:1204] (2/4) Exclude cut with ID _40402_210_18219_1_1533522324115_2953150_271-253734-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 08:35:33,026 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.0.self_attn_weights.pos_emb_skip_rate, batch_count=1204286.6666666667, ans=0.0 2023-10-03 08:35:38,426 WARNING [train.py:1204] (2/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_282-257791-0 from training. Duration: 0.86 2023-10-03 08:35:42,110 WARNING [train.py:1204] (2/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_356-45054-0 from training. Duration: 0.86 2023-10-03 08:35:43,434 WARNING [train.py:1204] (2/4) Exclude cut with ID _69584_210_5219_1_1535785133775_4044450_38-348116-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 08:35:43,436 WARNING [train.py:1204] (2/4) Exclude cut with ID _66763_210_17301_1_1535788667617_241750_21-328480-0_sp1.1 from training. Duration: 0.936375 2023-10-03 08:35:44,803 WARNING [train.py:1204] (2/4) Exclude cut with ID _26528_210_9070_1_1532570410842_3851310_497-97218-0_sp0.9 from training. Duration: 0.9555625 2023-10-03 08:35:44,867 WARNING [train.py:1204] (2/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_592-101075-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 08:35:47,898 WARNING [train.py:1204] (2/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_678-159307-0 from training. Duration: 0.85 2023-10-03 08:35:49,392 WARNING [train.py:1204] (2/4) Exclude cut with ID _28427_210_4220_1_1532581240967_3061069_166-319773-0_sp1.1 from training. Duration: 0.936375 2023-10-03 08:35:49,463 WARNING [train.py:1204] (2/4) Exclude cut with ID _29877_210_6228_1_1532397791106_5276149_606-224984-0_sp1.1 from training. Duration: 0.936375 2023-10-03 08:35:54,240 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_125-203105-0_sp1.1 from training. Duration: 0.72725 2023-10-03 08:35:56,824 INFO [train.py:1046] (2/4) Epoch 35, batch 50, loss[loss=0.1611, simple_loss=0.2362, pruned_loss=0.04303, over 23628.00 frames. ], tot_loss[loss=0.1616, simple_loss=0.2424, pruned_loss=0.04038, over 1076000.45 frames. ], batch size: 256, lr: 2.93e-03, grad_scale: 8.0 2023-10-03 08:35:58,321 WARNING [train.py:1204] (2/4) Exclude cut with ID IC0620W0113-306765 from training. Duration: 0.768 2023-10-03 08:35:59,755 WARNING [train.py:1204] (2/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_410-287857-0_sp1.1 from training. Duration: 0.8454375 2023-10-03 08:36:01,213 WARNING [train.py:1204] (2/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_78-95260-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 08:36:03,819 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.567e+02 1.925e+02 2.366e+02 2.722e+02 6.685e+02, threshold=4.732e+02, percent-clipped=5.0 2023-10-03 08:36:03,915 WARNING [train.py:1204] (2/4) Exclude cut with ID _58059_210_5854_1_1534901471149_3164808_6-73080-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 08:36:03,929 WARNING [train.py:1204] (2/4) Exclude cut with ID _5166_210_2084_1_1527073197160_4087993_331-117829-0 from training. Duration: 0.62 2023-10-03 08:36:03,981 WARNING [train.py:1204] (2/4) Exclude cut with ID _14880_210_8895_1_1530680426655_3795259_289-212817-0_sp0.9 from training. Duration: 0.7666875 2023-10-03 08:36:04,012 WARNING [train.py:1204] (2/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_620-144779-0_sp1.1 from training. Duration: 0.92725 2023-10-03 08:36:07,241 WARNING [train.py:1204] (2/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_561-288575-0_sp1.1 from training. Duration: 0.963625 2023-10-03 08:36:08,618 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_22657_210_6890_1_1532086002_1637216_122-188128-0_sp1.1 from training. Duration: 0.963625 2023-10-03 08:36:11,290 WARNING [train.py:1204] (2/4) Exclude cut with ID _16756_210_4915_1_1531047803763_7104259_604-86607-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 08:36:14,603 WARNING [train.py:1204] (2/4) Exclude cut with ID _15150_210_8341_1_1530752411803_7256384_469-239509-0 from training. Duration: 0.68 2023-10-03 08:36:14,619 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_10987_210_4163_1_1529760555_4123902_302-87926-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 08:36:16,849 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.conv_module2.balancer2.prob, batch_count=1204486.6666666667, ans=0.125 2023-10-03 08:36:21,889 WARNING [train.py:1204] (2/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_469-94673-0_sp0.9 from training. Duration: 0.87775 2023-10-03 08:36:23,790 WARNING [train.py:1204] (2/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_798-106179-0 from training. Duration: 0.83 2023-10-03 08:36:25,230 WARNING [train.py:1204] (2/4) Exclude cut with ID _16744_210_5986_1_1531724405384_3907567_242-111316-0 from training. Duration: 0.93 2023-10-03 08:36:27,878 WARNING [train.py:1204] (2/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_938-205135-0_sp0.9 from training. Duration: 0.9666875 2023-10-03 08:36:27,996 WARNING [train.py:1204] (2/4) Exclude cut with ID _64135_210_2253_1_1535677267876_7608972_133-188266-0_sp0.9 from training. Duration: 0.9 2023-10-03 08:36:29,332 WARNING [train.py:1204] (2/4) Exclude cut with ID _44585_210_18628_1_1533605350387_4518199_623-346820-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 08:36:29,408 WARNING [train.py:1204] (2/4) Exclude cut with ID _42326_210_7770_1_1533814282531_3707269_454-266903-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 08:36:29,476 WARNING [train.py:1204] (2/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_5-55893-0_sp1.1 from training. Duration: 0.5 2023-10-03 08:36:30,751 WARNING [train.py:1204] (2/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_577-332600-0_sp1.1 from training. Duration: 0.5181875 2023-10-03 08:36:30,752 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_957-138882-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 08:36:37,714 WARNING [train.py:1204] (2/4) Exclude cut with ID _12556_210_8341_1_1530081024100_7327690_219-38004-0_sp1.1 from training. Duration: 0.97275 2023-10-03 08:36:39,161 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_397-147196-0_sp1.1 from training. Duration: 0.72725 2023-10-03 08:36:40,836 WARNING [train.py:1204] (2/4) Exclude cut with ID _3541_210_2477_1_1526475408709_3263973_231-104736-0_sp0.9 from training. Duration: 0.8666875 2023-10-03 08:36:40,927 WARNING [train.py:1204] (2/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_398-334558-0 from training. Duration: 0.96 2023-10-03 08:36:42,280 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_257-20733-0_sp1.1 from training. Duration: 0.6909375 2023-10-03 08:36:43,647 WARNING [train.py:1204] (2/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_146-282400-0_sp1.1 from training. Duration: 0.6454375 2023-10-03 08:36:43,651 WARNING [train.py:1204] (2/4) Exclude cut with ID _51899_210_20034_1_1534129254466_3669220_265-215428-0 from training. Duration: 0.98 2023-10-03 08:36:43,725 WARNING [train.py:1204] (2/4) Exclude cut with ID _34423_210_8750_1_1533430798456_3330199_219-106541-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 08:36:45,800 WARNING [train.py:1204] (2/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_458-67321-0 from training. Duration: 0.97 2023-10-03 08:36:52,561 WARNING [train.py:1204] (2/4) Exclude cut with ID _6653_210_2629_1_1527919172441_6732679_1051-182606-0_sp1.1 from training. Duration: 0.936375 2023-10-03 08:36:53,832 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_53555_210_5632_1_1534595220_7232297_603-4396-0_sp1.1 from training. Duration: 0.97275 2023-10-03 08:36:55,305 WARNING [train.py:1204] (2/4) Exclude cut with ID _54218_210_12210_1_1534413646388_4409259_159-144367-0_sp1.1 from training. Duration: 0.963625 2023-10-03 08:36:57,209 WARNING [train.py:1204] (2/4) Exclude cut with ID _7606_210_4915_1_1528894253277_5080690_301-10732-0_sp0.9 from training. Duration: 0.9 2023-10-03 08:36:57,212 WARNING [train.py:1204] (2/4) Exclude cut with ID _15026_210_8750_1_1530777602316_3291850_211-195043-0_sp0.9 from training. Duration: 0.888875 2023-10-03 08:36:57,467 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.feed_forward1.out_proj.dropout_p, batch_count=1204686.6666666667, ans=0.1 2023-10-03 08:36:58,696 WARNING [train.py:1204] (2/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_146-282400-0 from training. Duration: 0.71 2023-10-03 08:36:58,732 WARNING [train.py:1204] (2/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_65-88242-0 from training. Duration: 0.56 2023-10-03 08:37:00,207 WARNING [train.py:1204] (2/4) Exclude cut with ID _55857_210_2346_1_1534644007831_6898209_80-107757-0_sp1.1 from training. Duration: 0.963625 2023-10-03 08:37:01,566 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_15126_210_6753_1_1530778677_2591185_120-277707-0_sp0.9 from training. Duration: 0.888875 2023-10-03 08:37:01,669 WARNING [train.py:1204] (2/4) Exclude cut with ID _44778_210_2054_1_1533798127203_7443090_1033-168593-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 08:37:02,915 WARNING [train.py:1204] (2/4) Exclude cut with ID _46683_210_9642_1_1533730913492_4591116_475-30711-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 08:37:02,938 WARNING [train.py:1204] (2/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_253-308377-0 from training. Duration: 0.49 2023-10-03 08:37:03,004 WARNING [train.py:1204] (2/4) Exclude cut with ID _4680_210_3525_1_1527249757953_893386_75-78979-0 from training. Duration: 0.91 2023-10-03 08:37:04,340 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_5522_210_3273_1_1527249606_3909096_318-203153-0_sp0.9 from training. Duration: 0.47775 2023-10-03 08:37:05,726 WARNING [train.py:1204] (2/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_550-170116-0_sp1.1 from training. Duration: 0.92725 2023-10-03 08:37:07,131 WARNING [train.py:1204] (2/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_277-210798-0_sp0.9 from training. Duration: 0.611125 2023-10-03 08:37:08,468 WARNING [train.py:1204] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_620-276793-0 from training. Duration: 0.59 2023-10-03 08:37:08,479 WARNING [train.py:1204] (2/4) Exclude cut with ID _3067_210_2667_1_1526212974640_4503168_119-175776-0 from training. Duration: 0.74 2023-10-03 08:37:08,550 WARNING [train.py:1204] (2/4) Exclude cut with ID _55209_210_1531_1_1534675828996_4597160_681-195827-0_sp1.1 from training. Duration: 0.92725 2023-10-03 08:37:09,817 INFO [train.py:1046] (2/4) Epoch 35, batch 100, loss[loss=0.165, simple_loss=0.2547, pruned_loss=0.03766, over 24637.00 frames. ], tot_loss[loss=0.1631, simple_loss=0.2438, pruned_loss=0.04116, over 1892154.59 frames. ], batch size: 73, lr: 2.93e-03, grad_scale: 8.0 2023-10-03 08:37:09,865 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_427-78097-0_sp1.1 from training. Duration: 0.72725 2023-10-03 08:37:11,668 WARNING [train.py:1204] (2/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_96-290039-0_sp0.9 from training. Duration: 0.62225 2023-10-03 08:37:11,684 WARNING [train.py:1204] (2/4) Exclude cut with ID _45329_210_7005_1_1534157951211_6774339_287-292174-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 08:37:15,028 WARNING [train.py:1204] (2/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_200-292228-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 08:37:18,296 WARNING [train.py:1204] (2/4) Exclude cut with ID _69884_210_9771_1_1535846392818_4166169_291-194999-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 08:37:21,052 WARNING [train.py:1204] (2/4) Exclude cut with ID _58895_210_8750_1_1535184081320_3649089_265-323810-0_sp1.1 from training. Duration: 0.87275 2023-10-03 08:37:22,484 WARNING [train.py:1204] (2/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_58-68956-0 from training. Duration: 0.83 2023-10-03 08:37:22,500 WARNING [train.py:1204] (2/4) Exclude cut with ID _39867_210_9826_1_1533553222030_9344259_322-115153-0_sp1.1 from training. Duration: 0.963625 2023-10-03 08:37:26,558 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_3371_210_1794_1_1526384462_5580777_396-339812-0_sp1.1 from training. Duration: 0.77275 2023-10-03 08:37:26,578 WARNING [train.py:1204] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_731-2428-0_sp0.9 from training. Duration: 0.92225 2023-10-03 08:37:26,600 WARNING [train.py:1204] (2/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_98-741-0_sp1.1 from training. Duration: 0.72725 2023-10-03 08:37:26,601 WARNING [train.py:1204] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_238-245655-0_sp0.9 from training. Duration: 0.9 2023-10-03 08:37:26,815 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.conv_skip_rate, batch_count=1204820.0, ans=0.0 2023-10-03 08:37:27,926 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6788_210_1388_1_1527898698_4562021_91-197981-0_sp0.9 from training. Duration: 0.92225 2023-10-03 08:37:29,354 WARNING [train.py:1204] (2/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_860-96198-0 from training. Duration: 0.73 2023-10-03 08:37:32,557 WARNING [train.py:1204] (2/4) Exclude cut with ID _14268_210_9206_1_1530622771285_4714099_331-105351-0_sp0.9 from training. Duration: 0.72225 2023-10-03 08:37:32,593 WARNING [train.py:1204] (2/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_204-137133-0_sp1.1 from training. Duration: 0.92725 2023-10-03 08:37:32,619 WARNING [train.py:1204] (2/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_5-46496-0_sp1.1 from training. Duration: 0.936375 2023-10-03 08:37:33,886 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_160-218721-0_sp1.1 from training. Duration: 0.87275 2023-10-03 08:37:36,755 WARNING [train.py:1204] (2/4) Exclude cut with ID _45329_210_7005_1_1534157951211_6774339_503-287883-0 from training. Duration: 0.88 2023-10-03 08:37:38,061 WARNING [train.py:1204] (2/4) Exclude cut with ID _57572_210_5517_1_1534921219624_3809484_110-79134-0_sp1.1 from training. Duration: 0.92725 2023-10-03 08:37:38,140 WARNING [train.py:1204] (2/4) Exclude cut with ID _44585_210_18628_1_1533605350387_4518199_421-37641-0_sp1.1 from training. Duration: 0.936375 2023-10-03 08:37:39,521 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_42-142473-0_sp0.9 from training. Duration: 0.8 2023-10-03 08:37:42,061 WARNING [train.py:1204] (2/4) Exclude cut with ID _5166_210_2084_1_1527073197160_4087993_310-114452-0_sp0.9 from training. Duration: 0.6555625 2023-10-03 08:37:46,646 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9029W0120-509520 from training. Duration: 0.768 2023-10-03 08:37:46,666 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9029W0433-509820 from training. Duration: 0.896 2023-10-03 08:37:49,907 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_40222_210_6228_1_1533385298_4481939_163-334586-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 08:37:49,908 WARNING [train.py:1204] (2/4) Exclude cut with ID _48677_210_5663_1_1533902405919_3892989_408-40740-0_sp1.1 from training. Duration: 0.7545625 2023-10-03 08:37:52,089 WARNING [train.py:1204] (2/4) Exclude cut with ID _24314_210_4915_1_1532135078116_7170179_521-258868-0_sp0.9 from training. Duration: 0.87775 2023-10-03 08:37:53,498 WARNING [train.py:1204] (2/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_576-51198-0_sp1.1 from training. Duration: 0.92725 2023-10-03 08:37:56,211 WARNING [train.py:1204] (2/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_306-216871-0_sp1.1 from training. Duration: 0.97275 2023-10-03 08:38:00,261 WARNING [train.py:1204] (2/4) Exclude cut with ID _20684_210_6917_1_1531567751646_4203930_463-119982-0_sp1.1 from training. Duration: 0.97275 2023-10-03 08:38:01,594 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9084W0012-536850 from training. Duration: 0.64 2023-10-03 08:38:04,684 WARNING [train.py:1204] (2/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_847-41334-0_sp1.1 from training. Duration: 0.4 2023-10-03 08:38:07,487 WARNING [train.py:1204] (2/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_326-165900-0_sp1.1 from training. Duration: 0.62725 2023-10-03 08:38:07,568 WARNING [train.py:1204] (2/4) Exclude cut with ID _7537_210_2084_1_1528502395431_3924359_128-268029-0_sp1.1 from training. Duration: 0.8909375 2023-10-03 08:38:11,553 WARNING [train.py:1204] (2/4) Exclude cut with ID _67499_210_17282_1_1535509805220_3692430_333-155030-0_sp1.1 from training. Duration: 0.97275 2023-10-03 08:38:13,058 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_34008_210_10431_1_1533345424_8183247_588-257177-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 08:38:14,500 WARNING [train.py:1204] (2/4) Exclude cut with ID _25914_210_6400_1_1532141317058_644740_52-96808-0_sp1.1 from training. Duration: 0.82725 2023-10-03 08:38:17,772 WARNING [train.py:1204] (2/4) Exclude cut with ID _56494_210_4220_1_1535284837625_3033169_69-138150-0_sp1.1 from training. Duration: 0.8545625 2023-10-03 08:38:17,951 WARNING [train.py:1204] (2/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_19-143553-0_sp1.1 from training. Duration: 0.97275 2023-10-03 08:38:19,246 WARNING [train.py:1204] (2/4) Exclude cut with ID _21878_210_3525_1_1531738772002_7264330_582-130735-0_sp1.1 from training. Duration: 0.936375 2023-10-03 08:38:21,239 WARNING [train.py:1204] (2/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_943-201356-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 08:38:21,261 WARNING [train.py:1204] (2/4) Exclude cut with ID _3231_210_2106_1_1526205864934_6784220_422-229276-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 08:38:21,278 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_48049_210_5632_1_1533864553_3519937_73-198717-0_sp1.1 from training. Duration: 0.97275 2023-10-03 08:38:21,335 WARNING [train.py:1204] (2/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_329-285801-0 from training. Duration: 0.91 2023-10-03 08:38:22,598 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9171W0114-579000 from training. Duration: 0.896 2023-10-03 08:38:22,611 WARNING [train.py:1204] (2/4) Exclude cut with ID _3120_210_3525_1_1526036143112_4072980_210-132675-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 08:38:22,688 WARNING [train.py:1204] (2/4) Exclude cut with ID _55857_210_2346_1_1534644007831_6898209_745-98391-0_sp0.9 from training. Duration: 0.8444375 2023-10-03 08:38:24,005 INFO [train.py:1046] (2/4) Epoch 35, batch 150, loss[loss=0.1406, simple_loss=0.2302, pruned_loss=0.02549, over 24692.00 frames. ], tot_loss[loss=0.1631, simple_loss=0.2433, pruned_loss=0.04147, over 2527657.35 frames. ], batch size: 65, lr: 2.93e-03, grad_scale: 8.0 2023-10-03 08:38:24,059 WARNING [train.py:1204] (2/4) Exclude cut with ID _5250_210_5219_1_1527249561806_8692110_615-322035-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 08:38:24,072 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_36756_210_7234_1_1533026991_966276_119-315767-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 08:38:24,087 WARNING [train.py:1204] (2/4) Exclude cut with ID _13916_210_9994_1_1530788341126_3763149_60-191556-0_sp1.1 from training. Duration: 0.5545625 2023-10-03 08:38:24,101 WARNING [train.py:1204] (2/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_177-236809-0_sp1.1 from training. Duration: 0.4454375 2023-10-03 08:38:25,406 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7002_210_1794_1_1527939683_5157286_184-229630-0_sp1.1 from training. Duration: 0.67275 2023-10-03 08:38:25,412 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_50428_210_5724_1_1534247532_4126418_464-18322-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 08:38:26,797 WARNING [train.py:1204] (2/4) Exclude cut with ID _35799_210_5854_1_1533263487887_3529071_466-131402-0_sp1.1 from training. Duration: 0.936375 2023-10-03 08:38:26,867 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_1259-132768-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 08:38:28,224 WARNING [train.py:1204] (2/4) Exclude cut with ID _6694_210_4599_1_1527852637490_3965529_144-185051-0_sp1.1 from training. Duration: 0.87275 2023-10-03 08:38:28,272 WARNING [train.py:1204] (2/4) Exclude cut with ID _64639_210_5816_1_1535367619437_3528000_59-133290-0_sp1.1 from training. Duration: 0.963625 2023-10-03 08:38:30,898 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.586e+02 1.859e+02 2.007e+02 2.245e+02 3.352e+02, threshold=4.014e+02, percent-clipped=0.0 2023-10-03 08:38:31,039 WARNING [train.py:1204] (2/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_223-49525-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 08:38:33,132 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.2.encoder.layers.2.whiten, num_groups=1, num_channels=384, metric=4.29 vs. limit=12.0 2023-10-03 08:38:35,577 WARNING [train.py:1204] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_748-36150-0_sp1.1 from training. Duration: 0.82725 2023-10-03 08:38:35,578 WARNING [train.py:1204] (2/4) Exclude cut with ID _19896_210_1883_1_1531709022662_6649569_52-221627-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 08:38:35,633 WARNING [train.py:1204] (2/4) Exclude cut with ID _6789_210_1534_1_1527854471671_2870519_296-115766-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 08:38:37,108 WARNING [train.py:1204] (2/4) Exclude cut with ID _14624_210_1925_1_1530858690657_3642219_100-19871-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 08:38:37,138 WARNING [train.py:1204] (2/4) Exclude cut with ID _12555_210_8750_1_1530075594402_3780239_50-293675-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 08:38:41,372 WARNING [train.py:1204] (2/4) Exclude cut with ID _14880_210_8895_1_1530680426655_3795259_289-212817-0_sp1.1 from training. Duration: 0.62725 2023-10-03 08:38:42,592 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_14056_210_4163_1_1530604483_7572827_388-211626-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 08:38:45,763 WARNING [train.py:1204] (2/4) Exclude cut with ID _5289_210_2084_1_1527678202736_3779299_392-261988-0 from training. Duration: 0.65 2023-10-03 08:38:45,771 WARNING [train.py:1204] (2/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_124-345840-0 from training. Duration: 0.52 2023-10-03 08:38:45,783 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_2655_210_2087_1_1525688959_3897400_437-337690-0 from training. Duration: 0.63 2023-10-03 08:38:47,303 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.0.feed_forward2.hidden_balancer.prob, batch_count=1205153.3333333333, ans=0.125 2023-10-03 08:38:48,563 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_3780_210_2087_1_1526467391_239068_13-274633-0_sp1.1 from training. Duration: 0.92725 2023-10-03 08:38:48,567 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_805-71497-0_sp1.1 from training. Duration: 0.6454375 2023-10-03 08:38:48,757 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=1205153.3333333333, ans=0.1 2023-10-03 08:38:49,897 WARNING [train.py:1204] (2/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_434-83000-0_sp1.1 from training. Duration: 0.8909375 2023-10-03 08:38:50,599 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_17613_210_2151_1_1531137569_2366160_285-219531-0_sp1.1 from training. Duration: 0.97275 2023-10-03 08:38:51,853 WARNING [train.py:1204] (2/4) Exclude cut with ID _56108_210_16380_1_1534505446159_3887959_22-37858-0_sp1.1 from training. Duration: 0.936375 2023-10-03 08:38:51,890 WARNING [train.py:1204] (2/4) Exclude cut with ID _28427_210_4220_1_1532581240967_3061069_200-88444-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 08:38:53,231 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_43802_210_2290_1_1533516203_8829504_77-279042-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 08:38:54,631 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9309W0193-640320 from training. Duration: 0.896 2023-10-03 08:38:56,080 WARNING [train.py:1204] (2/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_516-324055-0_sp1.1 from training. Duration: 0.936375 2023-10-03 08:39:00,191 WARNING [train.py:1204] (2/4) Exclude cut with ID _40094_210_16380_1_1533294005773_3641159_10-261240-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 08:39:00,478 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.conv_module2.balancer1.prob, batch_count=1205220.0, ans=0.125 2023-10-03 08:39:00,530 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=1205220.0, ans=0.1 2023-10-03 08:39:03,547 WARNING [train.py:1204] (2/4) Exclude cut with ID _6267_210_2084_1_1527764658526_3894730_375-219887-0_sp1.1 from training. Duration: 0.6090625 2023-10-03 08:39:04,902 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6788_210_1388_1_1527898698_4562021_180-75288-0 from training. Duration: 0.99 2023-10-03 08:39:05,116 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.conv_module2.balancer2.prob, batch_count=1205220.0, ans=0.125 2023-10-03 08:39:09,035 WARNING [train.py:1204] (2/4) Exclude cut with ID _8030_210_7497_1_1528444886781_3531089_262-86750-0_sp1.1 from training. Duration: 0.736375 2023-10-03 08:39:09,055 WARNING [train.py:1204] (2/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_934-325498-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 08:39:09,083 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_456-22141-0_sp0.9 from training. Duration: 0.97775 2023-10-03 08:39:10,479 WARNING [train.py:1204] (2/4) Exclude cut with ID _49643_210_7989_1_1534640244224_3939031_598-161926-0_sp1.1 from training. Duration: 0.8181875 2023-10-03 08:39:11,835 WARNING [train.py:1204] (2/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_345-49721-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 08:39:13,142 WARNING [train.py:1204] (2/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_440-141811-0_sp1.1 from training. Duration: 0.863625 2023-10-03 08:39:14,863 WARNING [train.py:1204] (2/4) Exclude cut with ID _64048_210_10626_1_1535198382802_3835179_394-212120-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 08:39:15,003 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.1.bypass.skip_rate, batch_count=1205286.6666666667, ans=0.035 2023-10-03 08:39:15,036 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.nonlin_attention.balancer.prob, batch_count=1205286.6666666667, ans=0.125 2023-10-03 08:39:16,197 WARNING [train.py:1204] (2/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_91-277684-0 from training. Duration: 0.74 2023-10-03 08:39:20,497 WARNING [train.py:1204] (2/4) Exclude cut with ID _23038_210_9570_1_1532001584158_3652419_191-346343-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 08:39:21,850 WARNING [train.py:1204] (2/4) Exclude cut with ID _15140_210_9288_1_1530791388450_4880149_351-347974-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 08:39:21,882 WARNING [train.py:1204] (2/4) Exclude cut with ID _58605_210_12210_1_1534723195939_3904259_48-269402-0_sp1.1 from training. Duration: 0.87275 2023-10-03 08:39:21,898 WARNING [train.py:1204] (2/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_401-53423-0_sp0.9 from training. Duration: 0.611125 2023-10-03 08:39:23,853 WARNING [train.py:1204] (2/4) Exclude cut with ID _42659_210_2070_1_1533607442765_3120380_158-49004-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 08:39:26,506 WARNING [train.py:1204] (2/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_81-75849-0_sp1.1 from training. Duration: 0.3545625 2023-10-03 08:39:27,871 WARNING [train.py:1204] (2/4) Exclude cut with ID _27052_210_5986_1_1532329197187_3585009_314-106499-0_sp1.1 from training. Duration: 0.836375 2023-10-03 08:39:29,233 WARNING [train.py:1204] (2/4) Exclude cut with ID _4626_210_3532_1_1526817652717_3916859_446-206656-0_sp0.9 from training. Duration: 0.8444375 2023-10-03 08:39:30,676 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_53378_210_17282_1_1534221071_5769509_158-114960-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 08:39:33,922 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_3171_210_2087_1_1526109084_2330007_37-64334-0_sp0.9 from training. Duration: 0.911125 2023-10-03 08:39:33,942 WARNING [train.py:1204] (2/4) Exclude cut with ID _5410_210_1388_1_1527397055795_1719020_39-112221-0 from training. Duration: 0.69 2023-10-03 08:39:35,241 WARNING [train.py:1204] (2/4) Exclude cut with ID _8064_210_5399_1_1528372299291_4282973_4-91037-0_sp0.9 from training. Duration: 0.97775 2023-10-03 08:39:35,255 WARNING [train.py:1204] (2/4) Exclude cut with ID ID0079W0443-717795 from training. Duration: 0.541 2023-10-03 08:39:36,562 INFO [train.py:1046] (2/4) Epoch 35, batch 200, loss[loss=0.1759, simple_loss=0.2505, pruned_loss=0.05061, over 22709.00 frames. ], tot_loss[loss=0.1636, simple_loss=0.2437, pruned_loss=0.04178, over 3029329.82 frames. ], batch size: 322, lr: 2.93e-03, grad_scale: 8.0 2023-10-03 08:39:38,052 WARNING [train.py:1204] (2/4) Exclude cut with ID _34734_210_4525_1_1533365977542_4448720_766-297928-0_sp1.1 from training. Duration: 0.936375 2023-10-03 08:39:42,137 WARNING [train.py:1204] (2/4) Exclude cut with ID _51471_210_1889_1_1534399391390_3094250_310-337793-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 08:39:42,148 WARNING [train.py:1204] (2/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_678-159307-0_sp0.9 from training. Duration: 0.9444375 2023-10-03 08:39:43,753 WARNING [train.py:1204] (2/4) Exclude cut with ID _50093_210_4220_1_1534417141207_3432299_259-322252-0 from training. Duration: 0.92 2023-10-03 08:39:45,135 WARNING [train.py:1204] (2/4) Exclude cut with ID _47790_210_16187_1_1533862814066_4207519_67-103444-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 08:39:45,165 WARNING [train.py:1204] (2/4) Exclude cut with ID _40551_210_10635_1_1533776424632_3416179_311-247578-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 08:39:48,453 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_4633_210_2040_1_1526824163_4145343_416-76117-0 from training. Duration: 0.95 2023-10-03 08:39:49,681 WARNING [train.py:1204] (2/4) Exclude cut with ID _5166_210_2084_1_1527073197160_4087993_331-117829-0_sp0.9 from training. Duration: 0.688875 2023-10-03 08:39:49,985 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.ff3_skip_rate, batch_count=1205486.6666666667, ans=0.0 2023-10-03 08:39:51,138 WARNING [train.py:1204] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_299-247059-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 08:39:51,218 WARNING [train.py:1204] (2/4) Exclude cut with ID _64002_210_9570_1_1535198605157_38979_2-240845-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 08:39:56,407 WARNING [train.py:1204] (2/4) Exclude cut with ID _4681_210_3525_1_1527250689464_120889_15-126760-0_sp0.9 from training. Duration: 0.9 2023-10-03 08:39:56,425 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_21500_210_2054_1_1532149310_2716758_256-156611-0_sp1.1 from training. Duration: 0.936375 2023-10-03 08:39:57,766 WARNING [train.py:1204] (2/4) Exclude cut with ID _16908_210_5724_1_1531119157248_3772700_624-203729-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 08:40:01,346 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.self_attn1.whiten.whitening_limit, batch_count=1205486.6666666667, ans=22.5 2023-10-03 08:40:06,763 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=1205553.3333333333, ans=0.1 2023-10-03 08:40:12,965 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.0.layers.0.feed_forward2.out_whiten, num_groups=1, num_channels=192, metric=7.11 vs. limit=15.0 2023-10-03 08:40:14,798 WARNING [train.py:1204] (2/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_605-219732-0_sp1.1 from training. Duration: 0.8545625 2023-10-03 08:40:14,825 WARNING [train.py:1204] (2/4) Exclude cut with ID _44718_210_5816_1_1534050051869_6469030_380-167520-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 08:40:14,897 WARNING [train.py:1204] (2/4) Exclude cut with ID _19896_210_1883_1_1531709022662_6649569_94-229927-0_sp1.1 from training. Duration: 0.8818125 2023-10-03 08:40:16,109 WARNING [train.py:1204] (2/4) Exclude cut with ID _19514_210_9259_1_1531530071330_5084230_519-174300-0_sp1.1 from training. Duration: 0.963625 2023-10-03 08:40:18,070 WARNING [train.py:1204] (2/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_340-174176-0_sp0.9 from training. Duration: 0.5444375 2023-10-03 08:40:18,074 WARNING [train.py:1204] (2/4) Exclude cut with ID _8263_210_1664_1_1528439452891_970220_68-181805-0_sp1.1 from training. Duration: 0.5818125 2023-10-03 08:40:19,588 WARNING [train.py:1204] (2/4) Exclude cut with ID _3120_210_3525_1_1526036143112_4072980_348-257723-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 08:40:20,980 WARNING [train.py:1204] (2/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_453-133983-0_sp1.1 from training. Duration: 0.7090625 2023-10-03 08:40:21,043 WARNING [train.py:1204] (2/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_17-86060-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 08:40:22,339 WARNING [train.py:1204] (2/4) Exclude cut with ID _29877_210_6228_1_1532397791106_5276149_228-184657-0_sp1.1 from training. Duration: 0.97275 2023-10-03 08:40:22,428 WARNING [train.py:1204] (2/4) Exclude cut with ID _18516_210_9206_1_1531704653206_3692406_198-334870-0 from training. Duration: 0.92 2023-10-03 08:40:23,731 WARNING [train.py:1204] (2/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_340-174176-0_sp1.1 from training. Duration: 0.4454375 2023-10-03 08:40:23,748 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_3133_210_2087_1_1526106446_2324690_14-139186-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 08:40:27,676 WARNING [train.py:1204] (2/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_126-29104-0_sp0.9 from training. Duration: 0.9444375 2023-10-03 08:40:34,849 WARNING [train.py:1204] (2/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_84-10310-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 08:40:40,343 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_15517_210_6753_1_1530954008_3487336_79-273076-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 08:40:41,665 WARNING [train.py:1204] (2/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_371-18169-0_sp0.9 from training. Duration: 0.9555625 2023-10-03 08:40:47,303 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_68702_210_23689_1_1535799577_3420068_170-92788-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 08:40:50,611 INFO [train.py:1046] (2/4) Epoch 35, batch 250, loss[loss=0.1504, simple_loss=0.231, pruned_loss=0.03493, over 24590.00 frames. ], tot_loss[loss=0.1621, simple_loss=0.242, pruned_loss=0.04113, over 3418820.43 frames. ], batch size: 60, lr: 2.93e-03, grad_scale: 8.0 2023-10-03 08:40:50,685 WARNING [train.py:1204] (2/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_78-182912-0 from training. Duration: 0.59 2023-10-03 08:40:50,740 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_3019_210_2028_1_1526044245_3264865_297-12384-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 08:40:50,746 WARNING [train.py:1204] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_147-111137-0_sp1.1 from training. Duration: 0.863625 2023-10-03 08:40:50,751 WARNING [train.py:1204] (2/4) Exclude cut with ID _28323_210_16187_1_1532480395667_4100300_117-8859-0_sp1.1 from training. Duration: 0.97275 2023-10-03 08:40:50,833 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7762_210_1794_1_1528199508_4057408_419-125009-0_sp1.1 from training. Duration: 0.7545625 2023-10-03 08:40:52,174 WARNING [train.py:1204] (2/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_268-87909-0 from training. Duration: 0.99 2023-10-03 08:40:52,249 WARNING [train.py:1204] (2/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_118-311357-0_sp1.1 from training. Duration: 0.92725 2023-10-03 08:40:53,521 WARNING [train.py:1204] (2/4) Exclude cut with ID ID0377W0003-870225 from training. Duration: 0.852 2023-10-03 08:40:54,319 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.2.encoder.layers.0.self_attn1.whiten, num_groups=1, num_channels=384, metric=20.13 vs. limit=22.5 2023-10-03 08:40:54,803 WARNING [train.py:1204] (2/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_56-5838-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 08:40:56,797 WARNING [train.py:1204] (2/4) Exclude cut with ID _14603_210_4220_1_1530615688441_3097319_90-31383-0_sp0.9 from training. Duration: 0.9666875 2023-10-03 08:40:57,920 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.625e+02 1.919e+02 2.120e+02 2.596e+02 4.381e+02, threshold=4.240e+02, percent-clipped=2.0 2023-10-03 08:40:58,045 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_31583_210_3852_1_1532595851_4024413_58-86657-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 08:40:59,938 WARNING [train.py:1204] (2/4) Exclude cut with ID _30304_210_6319_1_1532476827261_7582060_344-113875-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 08:41:03,131 WARNING [train.py:1204] (2/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_278-104676-0_sp1.1 from training. Duration: 0.8909375 2023-10-03 08:41:03,160 WARNING [train.py:1204] (2/4) Exclude cut with ID _55844_210_2346_1_1534471212569_7625707_764-2387-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 08:41:04,642 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.ff2_skip_rate, batch_count=1205820.0, ans=0.0 2023-10-03 08:41:05,826 WARNING [train.py:1204] (2/4) Exclude cut with ID _27430_210_7770_1_1532604689375_1378590_157-14963-0_sp1.1 from training. Duration: 0.963625 2023-10-03 08:41:08,638 WARNING [train.py:1204] (2/4) Exclude cut with ID _7160_210_1876_1_1527940923231_3703799_282-322611-0_sp1.1 from training. Duration: 0.7909375 2023-10-03 08:41:11,777 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.conv_module1.balancer2.prob, batch_count=1205820.0, ans=0.125 2023-10-03 08:41:18,287 WARNING [train.py:1204] (2/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_190-138640-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 08:41:20,336 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.2.encoder.layers.1.conv_module1.whiten, num_groups=1, num_channels=384, metric=8.40 vs. limit=15.0 2023-10-03 08:41:21,385 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_33348_210_5632_1_1533086912_3324030_63-270922-0_sp1.1 from training. Duration: 0.97275 2023-10-03 08:41:21,407 WARNING [train.py:1204] (2/4) Exclude cut with ID _54871_210_22356_1_1534665798253_3856359_135-184281-0_sp1.1 from training. Duration: 0.9 2023-10-03 08:41:27,119 WARNING [train.py:1204] (2/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_277-210798-0_sp1.1 from training. Duration: 0.5 2023-10-03 08:41:27,170 WARNING [train.py:1204] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_402-253027-0_sp0.9 from training. Duration: 0.6 2023-10-03 08:41:27,329 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.conv_module1.balancer2.prob, batch_count=1205886.6666666667, ans=0.125 2023-10-03 08:41:28,473 WARNING [train.py:1204] (2/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_540-275685-0_sp0.9 from training. Duration: 0.988875 2023-10-03 08:41:28,512 WARNING [train.py:1204] (2/4) Exclude cut with ID _59493_210_17282_1_1535078419077_3225879_118-18960-0_sp1.1 from training. Duration: 0.936375 2023-10-03 08:41:30,471 WARNING [train.py:1204] (2/4) Exclude cut with ID _6194_210_1534_1_1527511826160_3941194_321-148641-0_sp1.1 from training. Duration: 0.5818125 2023-10-03 08:41:30,477 WARNING [train.py:1204] (2/4) Exclude cut with ID _61133_210_9969_1_1535196777227_3696409_333-270325-0_sp1.1 from training. Duration: 0.8454375 2023-10-03 08:41:31,323 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.1.encoder.layers.1.self_attn1.whiten, num_groups=1, num_channels=256, metric=12.72 vs. limit=22.5 2023-10-03 08:41:31,815 WARNING [train.py:1204] (2/4) Exclude cut with ID _49376_210_5986_1_1533985278732_3370740_307-145221-0_sp1.1 from training. Duration: 0.936375 2023-10-03 08:41:33,671 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6846_210_6753_1_1527850767_4295158_2-315073-0_sp0.9 from training. Duration: 0.97775 2023-10-03 08:41:36,315 WARNING [train.py:1204] (2/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_17-25045-0 from training. Duration: 0.59 2023-10-03 08:41:36,336 WARNING [train.py:1204] (2/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_503-333510-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 08:41:39,112 WARNING [train.py:1204] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_238-245655-0_sp1.1 from training. Duration: 0.736375 2023-10-03 08:41:39,134 WARNING [train.py:1204] (2/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_329-82235-0_sp0.9 from training. Duration: 0.911125 2023-10-03 08:41:39,139 WARNING [train.py:1204] (2/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_325-113798-0_sp0.9 from training. Duration: 0.9444375 2023-10-03 08:41:39,221 WARNING [train.py:1204] (2/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_395-37222-0_sp0.9 from training. Duration: 0.8444375 2023-10-03 08:41:40,516 WARNING [train.py:1204] (2/4) Exclude cut with ID _64135_210_2253_1_1535677267876_7608972_498-5039-0_sp1.1 from training. Duration: 0.7090625 2023-10-03 08:41:40,525 WARNING [train.py:1204] (2/4) Exclude cut with ID _14922_210_6228_1_1530707435273_3693310_303-228738-0_sp0.9 from training. Duration: 0.9333125 2023-10-03 08:41:43,312 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_577-220899-0_sp1.1 from training. Duration: 0.92725 2023-10-03 08:41:43,416 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_3475_210_2028_1_1526193578_3622569_377-166133-0_sp1.1 from training. Duration: 0.7818125 2023-10-03 08:41:44,723 WARNING [train.py:1204] (2/4) Exclude cut with ID _49376_210_5986_1_1533985278732_3370740_272-313512-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 08:41:47,467 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7762_210_1794_1_1528199508_4057408_422-227034-0_sp0.9 from training. Duration: 0.811125 2023-10-03 08:41:52,201 WARNING [train.py:1204] (2/4) Exclude cut with ID _18895_210_6228_1_1531357274234_3784930_74-341428-0_sp1.1 from training. Duration: 0.92725 2023-10-03 08:41:53,637 WARNING [train.py:1204] (2/4) Exclude cut with ID _44902_210_16187_1_1533888006062_3792078_149-67212-0_sp1.1 from training. Duration: 0.7909375 2023-10-03 08:41:57,947 WARNING [train.py:1204] (2/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_243-126020-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 08:41:59,853 WARNING [train.py:1204] (2/4) Exclude cut with ID _54218_210_12210_1_1534413646388_4409259_259-6561-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 08:42:02,781 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_41452_210_7291_1_1533437687_3941281_405-11559-0 from training. Duration: 0.91 2023-10-03 08:42:04,662 INFO [train.py:1046] (2/4) Epoch 35, batch 300, loss[loss=0.1429, simple_loss=0.2198, pruned_loss=0.033, over 24318.00 frames. ], tot_loss[loss=0.1606, simple_loss=0.2401, pruned_loss=0.0406, over 3714009.97 frames. ], batch size: 56, lr: 2.93e-03, grad_scale: 8.0 2023-10-03 08:42:04,731 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_20120_210_6753_1_1531643035_5482859_759-225084-0_sp1.1 from training. Duration: 0.97275 2023-10-03 08:42:04,744 WARNING [train.py:1204] (2/4) Exclude cut with ID _14747_210_8341_1_1530685797463_3654089_147-131229-0_sp0.9 from training. Duration: 0.8444375 2023-10-03 08:42:07,492 WARNING [train.py:1204] (2/4) Exclude cut with ID _14867_210_1968_1_1530684179890_3846969_319-250868-0 from training. Duration: 0.99 2023-10-03 08:42:07,515 WARNING [train.py:1204] (2/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_580-93798-0_sp0.9 from training. Duration: 0.62225 2023-10-03 08:42:07,615 WARNING [train.py:1204] (2/4) Exclude cut with ID _9679_210_7497_1_1529129212906_4363929_512-313889-0_sp0.9 from training. Duration: 0.9 2023-10-03 08:42:07,633 WARNING [train.py:1204] (2/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_18-189198-0 from training. Duration: 0.9 2023-10-03 08:42:11,895 WARNING [train.py:1204] (2/4) Exclude cut with ID _49755_210_18053_1_1534581005846_3218709_137-64179-0_sp1.1 from training. Duration: 0.92725 2023-10-03 08:42:13,183 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_48049_210_5632_1_1533864553_3519937_306-283794-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 08:42:14,739 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.ff3_skip_rate, batch_count=1206086.6666666667, ans=0.0 2023-10-03 08:42:17,163 WARNING [train.py:1204] (2/4) Exclude cut with ID _43552_210_8913_1_1533726031059_3523429_25-120161-0_sp1.1 from training. Duration: 0.8909375 2023-10-03 08:42:17,226 WARNING [train.py:1204] (2/4) Exclude cut with ID _8833_210_5219_1_1528592665210_7553059_319-349181-0 from training. Duration: 0.99 2023-10-03 08:42:19,224 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_296-332325-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 08:42:20,533 WARNING [train.py:1204] (2/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_436-186311-0_sp1.1 from training. Duration: 0.5909375 2023-10-03 08:42:20,549 WARNING [train.py:1204] (2/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_348-127789-0 from training. Duration: 0.85 2023-10-03 08:42:20,555 WARNING [train.py:1204] (2/4) Exclude cut with ID _3992_210_1747_1_1526644816031_3731060_443-195183-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 08:42:23,331 WARNING [train.py:1204] (2/4) Exclude cut with ID _3465_210_3525_1_1526175667060_3574049_308-231735-0_sp1.1 from training. Duration: 0.663625 2023-10-03 08:42:28,797 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6786_210_1388_1_1527839699_4194495_378-46070-0_sp1.1 from training. Duration: 0.6545625 2023-10-03 08:42:28,844 WARNING [train.py:1204] (2/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_63-101996-0 from training. Duration: 0.88 2023-10-03 08:42:32,363 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_2541_210_1794_1_1525590269_3084765_156-173713-0 from training. Duration: 0.81 2023-10-03 08:42:32,403 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_217-239796-0_sp1.1 from training. Duration: 0.963625 2023-10-03 08:42:35,572 WARNING [train.py:1204] (2/4) Exclude cut with ID _21878_210_3525_1_1531738772002_7264330_530-329400-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 08:42:36,973 WARNING [train.py:1204] (2/4) Exclude cut with ID _21783_210_11357_1_1531742458730_3762679_150-62920-0_sp1.1 from training. Duration: 0.963625 2023-10-03 08:42:36,973 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_5522_210_3273_1_1527249606_3909096_366-172508-0 from training. Duration: 0.66 2023-10-03 08:42:36,975 WARNING [train.py:1204] (2/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_303-340446-0_sp1.1 from training. Duration: 0.8181875 2023-10-03 08:42:39,794 WARNING [train.py:1204] (2/4) Exclude cut with ID _8699_210_2069_1_1528630346831_3960409_160-24408-0_sp1.1 from training. Duration: 0.82725 2023-10-03 08:42:42,578 WARNING [train.py:1204] (2/4) Exclude cut with ID _41314_210_12651_1_1533459476543_3766460_299-187801-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 08:42:42,613 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_34886_210_10633_1_1533203882_1755661_92-345288-0_sp1.1 from training. Duration: 0.97275 2023-10-03 08:42:45,448 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_5438_210_1794_1_1527336687_2600009_104-288274-0_sp1.1 from training. Duration: 0.52725 2023-10-03 08:42:45,452 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_154-316450-0 from training. Duration: 0.76 2023-10-03 08:42:47,392 WARNING [train.py:1204] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_23-189234-0_sp0.9 from training. Duration: 0.92225 2023-10-03 08:42:50,329 WARNING [train.py:1204] (2/4) Exclude cut with ID _4779_210_2084_1_1527318022944_3914893_346-168820-0_sp1.1 from training. Duration: 0.963625 2023-10-03 08:42:51,684 WARNING [train.py:1204] (2/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_5-55893-0 from training. Duration: 0.55 2023-10-03 08:42:53,063 WARNING [train.py:1204] (2/4) Exclude cut with ID _20455_210_5219_1_1531565996775_8098100_180-24517-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 08:42:55,996 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_3133_210_2087_1_1526106446_2324690_243-143119-0_sp0.9 from training. Duration: 0.8555625 2023-10-03 08:42:58,568 WARNING [train.py:1204] (2/4) Exclude cut with ID _65585_210_12210_1_1535450451584_3764098_293-212045-0_sp1.1 from training. Duration: 0.92725 2023-10-03 08:42:58,572 WARNING [train.py:1204] (2/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_577-332600-0 from training. Duration: 0.57 2023-10-03 08:43:03,660 WARNING [train.py:1204] (2/4) Exclude cut with ID _30039_210_13270_1_1532484612900_2725150_104-289874-0_sp1.1 from training. Duration: 0.963625 2023-10-03 08:43:03,667 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_3172_210_2087_1_1526292929_3987865_197-97762-0_sp1.1 from training. Duration: 0.6818125 2023-10-03 08:43:06,497 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_256-150908-0_sp1.1 from training. Duration: 0.963625 2023-10-03 08:43:07,894 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_296-287678-0_sp1.1 from training. Duration: 0.863625 2023-10-03 08:43:07,926 WARNING [train.py:1204] (2/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_443-30034-0 from training. Duration: 0.64 2023-10-03 08:43:07,952 WARNING [train.py:1204] (2/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_494-199538-0_sp0.9 from training. Duration: 0.8333125 2023-10-03 08:43:09,409 WARNING [train.py:1204] (2/4) Exclude cut with ID _19708_210_9206_1_1532136650733_4238364_61-162325-0_sp1.1 from training. Duration: 0.97275 2023-10-03 08:43:09,494 WARNING [train.py:1204] (2/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_475-127455-0 from training. Duration: 0.7 2023-10-03 08:43:10,887 WARNING [train.py:1204] (2/4) Exclude cut with ID _48677_210_5663_1_1533902405919_3892989_188-11581-0_sp1.1 from training. Duration: 0.963625 2023-10-03 08:43:12,192 WARNING [train.py:1204] (2/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_136-8381-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 08:43:12,306 WARNING [train.py:1204] (2/4) Exclude cut with ID _55328_210_27767_1_1534580973260_4088170_270-229555-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 08:43:12,336 WARNING [train.py:1204] (2/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_372-266951-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 08:43:13,691 WARNING [train.py:1204] (2/4) Exclude cut with ID _47790_210_16187_1_1533862814066_4207519_14-242149-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 08:43:18,386 INFO [train.py:1046] (2/4) Epoch 35, batch 350, loss[loss=0.1593, simple_loss=0.2529, pruned_loss=0.03286, over 24632.00 frames. ], tot_loss[loss=0.159, simple_loss=0.2382, pruned_loss=0.03993, over 3941163.25 frames. ], batch size: 68, lr: 2.93e-03, grad_scale: 8.0 2023-10-03 08:43:18,489 WARNING [train.py:1204] (2/4) Exclude cut with ID _7877_210_6014_1_1528286358099_3750136_159-76512-0_sp1.1 from training. Duration: 0.936375 2023-10-03 08:43:18,493 WARNING [train.py:1204] (2/4) Exclude cut with ID _8263_210_1664_1_1528438953494_294100_40-204731-0_sp1.1 from training. Duration: 0.4545625 2023-10-03 08:43:21,181 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8278_210_2550_1_1528637099_4220413_694-238413-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 08:43:25,038 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.539e+02 1.902e+02 2.096e+02 2.398e+02 4.416e+02, threshold=4.192e+02, percent-clipped=1.0 2023-10-03 08:43:27,928 WARNING [train.py:1204] (2/4) Exclude cut with ID _47278_210_12041_1_1533902418349_3576110_421-172710-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 08:43:28,180 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.bypass_mid.scale_min, batch_count=1206420.0, ans=0.2 2023-10-03 08:43:30,139 WARNING [train.py:1204] (2/4) Exclude cut with ID _14970_210_15709_1_1530773543493_1715730_215-282680-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 08:43:30,386 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.ff2_skip_rate, batch_count=1206420.0, ans=0.0 2023-10-03 08:43:30,805 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.3.encoder.layers.2.self_attn1.whiten, num_groups=1, num_channels=512, metric=12.14 vs. limit=22.5 2023-10-03 08:43:31,415 WARNING [train.py:1204] (2/4) Exclude cut with ID _64639_210_5816_1_1535367619437_3528000_306-149449-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 08:43:34,722 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_28-291982-0 from training. Duration: 0.97 2023-10-03 08:43:36,062 WARNING [train.py:1204] (2/4) Exclude cut with ID _55134_210_21982_1_1534590028377_3882170_412-15870-0_sp1.1 from training. Duration: 0.936375 2023-10-03 08:43:36,089 WARNING [train.py:1204] (2/4) Exclude cut with ID _13684_210_7497_1_1530619354863_3598359_312-235606-0 from training. Duration: 0.6 2023-10-03 08:43:37,582 WARNING [train.py:1204] (2/4) Exclude cut with ID _37112_210_2575_1_1532997007673_7268610_347-238993-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 08:43:38,874 WARNING [train.py:1204] (2/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_121-284152-0 from training. Duration: 0.59 2023-10-03 08:43:38,938 WARNING [train.py:1204] (2/4) Exclude cut with ID _2941_210_2577_1_1526199673150_2489759_228-2961-0_sp1.1 from training. Duration: 0.97275 2023-10-03 08:43:41,750 WARNING [train.py:1204] (2/4) Exclude cut with ID _28323_210_16187_1_1532480395667_4100300_531-24157-0 from training. Duration: 0.98 2023-10-03 08:43:44,384 WARNING [train.py:1204] (2/4) Exclude cut with ID _32704_210_8954_1_1533017488840_6747370_266-329755-0_sp0.9 from training. Duration: 0.988875 2023-10-03 08:43:45,675 WARNING [train.py:1204] (2/4) Exclude cut with ID _6653_210_2629_1_1527919172441_6732679_1038-146421-0_sp1.1 from training. Duration: 0.97275 2023-10-03 08:43:45,746 WARNING [train.py:1204] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_391-22148-0_sp0.9 from training. Duration: 0.8555625 2023-10-03 08:43:47,175 WARNING [train.py:1204] (2/4) Exclude cut with ID _64521_210_14699_1_1535243427259_3815328_543-341117-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 08:43:47,194 WARNING [train.py:1204] (2/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_52-168712-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 08:43:48,536 WARNING [train.py:1204] (2/4) Exclude cut with ID _33920_210_4220_1_1533257851500_3084399_198-334063-0_sp1.1 from training. Duration: 0.8545625 2023-10-03 08:43:48,554 WARNING [train.py:1204] (2/4) Exclude cut with ID _50093_210_4220_1_1534417141207_3432299_69-111987-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 08:43:48,592 WARNING [train.py:1204] (2/4) Exclude cut with ID _14821_210_8750_1_1530680503991_6829359_189-297451-0_sp0.9 from training. Duration: 0.611125 2023-10-03 08:43:51,925 WARNING [train.py:1204] (2/4) Exclude cut with ID _8030_210_7497_1_1528444886781_3531089_262-86750-0_sp0.9 from training. Duration: 0.9 2023-10-03 08:43:51,932 WARNING [train.py:1204] (2/4) Exclude cut with ID _24612_210_8913_1_1531998054193_3806359_294-191196-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 08:43:58,796 WARNING [train.py:1204] (2/4) Exclude cut with ID _16909_210_5724_1_1531292079912_4118919_707-342896-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 08:43:58,810 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_9460_210_4211_1_1528954587_10049667_572-37035-0_sp0.9 from training. Duration: 0.888875 2023-10-03 08:44:00,712 WARNING [train.py:1204] (2/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_762-109886-0_sp1.1 from training. Duration: 0.82725 2023-10-03 08:44:00,740 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_14953_210_2151_1_1530701710_5616165_519-326393-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 08:44:05,308 WARNING [train.py:1204] (2/4) Exclude cut with ID _25914_210_6400_1_1532141317058_644740_52-96808-0 from training. Duration: 0.91 2023-10-03 08:44:05,320 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_68095_210_20185_1_1535718226_8888855_1079-94525-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 08:44:09,891 WARNING [train.py:1204] (2/4) Exclude cut with ID _14860_210_4915_1_1530702020340_7343240_746-3647-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 08:44:09,892 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_70379_210_7925_1_1535941455_557455_49-101720-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 08:44:09,904 WARNING [train.py:1204] (2/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_8-307291-0_sp1.1 from training. Duration: 0.936375 2023-10-03 08:44:10,130 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.0.conv_module2.balancer1.prob, batch_count=1206620.0, ans=0.125 2023-10-03 08:44:11,301 WARNING [train.py:1204] (2/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_117-290916-0 from training. Duration: 0.51 2023-10-03 08:44:14,172 WARNING [train.py:1204] (2/4) Exclude cut with ID _22020_210_2461_1_1531724164513_6326900_944-339840-0_sp1.1 from training. Duration: 0.963625 2023-10-03 08:44:15,528 WARNING [train.py:1204] (2/4) Exclude cut with ID IC0515W0043-254911 from training. Duration: 0.976 2023-10-03 08:44:16,918 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_3910_210_1794_1_1526742306_334530_28-238909-0 from training. Duration: 0.81 2023-10-03 08:44:16,932 WARNING [train.py:1204] (2/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_269-267275-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 08:44:19,707 WARNING [train.py:1204] (2/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_72-251255-0_sp1.1 from training. Duration: 0.8545625 2023-10-03 08:44:19,727 WARNING [train.py:1204] (2/4) Exclude cut with ID _14886_210_4220_1_1530702020355_3516190_175-229073-0 from training. Duration: 0.45 2023-10-03 08:44:21,144 WARNING [train.py:1204] (2/4) Exclude cut with ID _40519_210_13794_1_1533715228655_7337390_921-209452-0_sp1.1 from training. Duration: 0.963625 2023-10-03 08:44:23,202 WARNING [train.py:1204] (2/4) Exclude cut with ID _29857_210_20034_1_1532395886753_3798160_335-163562-0_sp0.9 from training. Duration: 0.9444375 2023-10-03 08:44:24,553 WARNING [train.py:1204] (2/4) Exclude cut with ID _58583_210_5236_1_1534726835266_3574249_384-58445-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 08:44:25,948 WARNING [train.py:1204] (2/4) Exclude cut with ID _44011_210_2046_1_1533722401691_3631310_96-315656-0_sp1.1 from training. Duration: 0.963625 2023-10-03 08:44:25,953 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_68484_210_1794_1_1535849804_7678097_549-170613-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 08:44:27,341 WARNING [train.py:1204] (2/4) Exclude cut with ID _37892_210_19596_1_1533198677897_3964058_214-148058-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 08:44:30,771 WARNING [train.py:1204] (2/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_415-78792-0_sp0.9 from training. Duration: 0.9 2023-10-03 08:44:31,941 INFO [train.py:1046] (2/4) Epoch 35, batch 400, loss[loss=0.1675, simple_loss=0.2431, pruned_loss=0.04595, over 23383.00 frames. ], tot_loss[loss=0.1594, simple_loss=0.2382, pruned_loss=0.04035, over 4103751.63 frames. ], batch size: 119, lr: 2.93e-03, grad_scale: 16.0 2023-10-03 08:44:33,423 WARNING [train.py:1204] (2/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_383-205833-0_sp1.1 from training. Duration: 0.863625 2023-10-03 08:44:33,502 WARNING [train.py:1204] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_167-137255-0 from training. Duration: 0.64 2023-10-03 08:44:33,511 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8275_210_4198_1_1528455320_3892571_484-154786-0_sp1.1 from training. Duration: 0.963625 2023-10-03 08:44:35,357 WARNING [train.py:1204] (2/4) Exclude cut with ID _40284_210_2084_1_1533513679814_3704279_396-319412-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 08:44:36,916 WARNING [train.py:1204] (2/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_280-120942-0_sp0.9 from training. Duration: 0.8555625 2023-10-03 08:44:36,959 WARNING [train.py:1204] (2/4) Exclude cut with ID _45630_210_10556_1_1533949174498_3902049_558-240889-0_sp1.1 from training. Duration: 0.92725 2023-10-03 08:44:39,716 WARNING [train.py:1204] (2/4) Exclude cut with ID _56235_210_10640_1_1534557335235_4161099_197-307458-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 08:44:41,036 WARNING [train.py:1204] (2/4) Exclude cut with ID _16909_210_5724_1_1531292079912_4118919_163-254843-0_sp1.1 from training. Duration: 0.92725 2023-10-03 08:44:41,154 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_103-84010-0 from training. Duration: 0.69 2023-10-03 08:44:42,637 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.conv_module1.balancer2.prob, batch_count=1206753.3333333333, ans=0.125 2023-10-03 08:44:43,821 WARNING [train.py:1204] (2/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_401-158239-0 from training. Duration: 0.71 2023-10-03 08:44:43,821 WARNING [train.py:1204] (2/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_899-233658-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 08:44:45,258 WARNING [train.py:1204] (2/4) Exclude cut with ID _23825_210_13449_1_1531915313526_3497150_240-271367-0 from training. Duration: 0.97 2023-10-03 08:44:45,298 WARNING [train.py:1204] (2/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_424-134619-0_sp1.1 from training. Duration: 0.92725 2023-10-03 08:44:50,770 WARNING [train.py:1204] (2/4) Exclude cut with ID _44831_210_13427_1_1533816011549_3689035_32-188147-0_sp1.1 from training. Duration: 0.87275 2023-10-03 08:44:50,773 WARNING [train.py:1204] (2/4) Exclude cut with ID _59170_210_6721_1_1534983633960_4203335_314-63879-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 08:44:50,787 WARNING [train.py:1204] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_602-275260-0 from training. Duration: 0.82 2023-10-03 08:44:50,822 WARNING [train.py:1204] (2/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_511-306767-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 08:44:52,616 WARNING [train.py:1204] (2/4) Exclude cut with ID _41817_210_18835_1_1533434477160_7423049_565-201775-0_sp1.1 from training. Duration: 0.92725 2023-10-03 08:44:52,634 WARNING [train.py:1204] (2/4) Exclude cut with ID _28317_210_12742_1_1532480302381_8510325_778-173151-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 08:44:52,680 WARNING [train.py:1204] (2/4) Exclude cut with ID _14998_210_5219_1_1530705582080_7474970_680-51001-0_sp1.1 from training. Duration: 0.963625 2023-10-03 08:44:54,238 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.1.bypass_mid.scale_min, batch_count=1206820.0, ans=0.2 2023-10-03 08:44:55,490 WARNING [train.py:1204] (2/4) Exclude cut with ID IC0677W0059-334891 from training. Duration: 0.896 2023-10-03 08:44:56,821 WARNING [train.py:1204] (2/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_370-331600-0 from training. Duration: 0.94 2023-10-03 08:45:00,918 WARNING [train.py:1204] (2/4) Exclude cut with ID _66840_210_5219_1_1535452168426_4196349_446-240192-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 08:45:01,164 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.feed_forward1.hidden_balancer.prob, batch_count=1206886.6666666667, ans=0.125 2023-10-03 08:45:03,548 WARNING [train.py:1204] (2/4) Exclude cut with ID _65242_210_18628_1_1535369393717_3823149_38-6732-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 08:45:04,900 WARNING [train.py:1204] (2/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_302-2550-0 from training. Duration: 0.81 2023-10-03 08:45:04,983 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_43802_210_2290_1_1533516203_8829504_100-348441-0 from training. Duration: 0.91 2023-10-03 08:45:05,199 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.conv_module2.balancer1.min_positive, batch_count=1206886.6666666667, ans=0.025 2023-10-03 08:45:09,499 WARNING [train.py:1204] (2/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_329-285801-0_sp1.1 from training. Duration: 0.82725 2023-10-03 08:45:10,928 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_30167_210_3528_1_1532428012_4772375_343-334712-0_sp1.1 from training. Duration: 0.936375 2023-10-03 08:45:19,026 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7543_210_6753_1_1528543318_4552_1-85072-0 from training. Duration: 0.83 2023-10-03 08:45:21,914 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7002_210_1794_1_1527935634_4034444_487-236501-0_sp1.1 from training. Duration: 0.6 2023-10-03 08:45:22,020 WARNING [train.py:1204] (2/4) Exclude cut with ID _7160_210_1876_1_1527940923231_3703799_282-322611-0 from training. Duration: 0.87 2023-10-03 08:45:23,410 WARNING [train.py:1204] (2/4) Exclude cut with ID _67335_210_5219_1_1535525975652_4275229_570-262983-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 08:45:25,312 WARNING [train.py:1204] (2/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_625-338546-0_sp1.1 from training. Duration: 0.8 2023-10-03 08:45:25,349 WARNING [train.py:1204] (2/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_176-56120-0 from training. Duration: 0.83 2023-10-03 08:45:29,439 WARNING [train.py:1204] (2/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_963-187109-0_sp1.1 from training. Duration: 0.9 2023-10-03 08:45:31,433 WARNING [train.py:1204] (2/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_121-284152-0_sp0.9 from training. Duration: 0.6555625 2023-10-03 08:45:34,048 WARNING [train.py:1204] (2/4) Exclude cut with ID _45021_210_9994_1_1533619751094_3330288_104-81711-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 08:45:37,654 WARNING [train.py:1204] (2/4) Exclude cut with ID _51382_210_23862_1_1534492801838_3650104_129-343914-0_sp1.1 from training. Duration: 0.936375 2023-10-03 08:45:37,693 WARNING [train.py:1204] (2/4) Exclude cut with ID _14970_210_15709_1_1530775547361_1966965_8-169924-0 from training. Duration: 0.92 2023-10-03 08:45:39,079 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_2295_210_2087_1_1525500993_2407863_234-83104-0_sp1.1 from training. Duration: 0.563625 2023-10-03 08:45:41,157 WARNING [train.py:1204] (2/4) Exclude cut with ID _14687_210_5854_1_1530703838259_3664889_115-239046-0 from training. Duration: 0.67 2023-10-03 08:45:42,562 WARNING [train.py:1204] (2/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_690-126306-0_sp1.1 from training. Duration: 0.7090625 2023-10-03 08:45:43,896 WARNING [train.py:1204] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_625-102955-0_sp0.9 from training. Duration: 0.8555625 2023-10-03 08:45:44,147 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.balancer1.prob, batch_count=1207020.0, ans=0.125 2023-10-03 08:45:45,377 WARNING [train.py:1204] (2/4) Exclude cut with ID _9389_210_1664_1_1529217337301_5289269_289-213417-0 from training. Duration: 0.89 2023-10-03 08:45:46,703 INFO [train.py:1046] (2/4) Epoch 35, batch 450, loss[loss=0.1998, simple_loss=0.2634, pruned_loss=0.06814, over 19414.00 frames. ], tot_loss[loss=0.1594, simple_loss=0.2384, pruned_loss=0.04021, over 4234994.71 frames. ], batch size: 388, lr: 2.93e-03, grad_scale: 16.0 2023-10-03 08:45:48,090 WARNING [train.py:1204] (2/4) Exclude cut with ID _13253_210_1925_1_1530757775819_3577873_191-144977-0_sp0.9 from training. Duration: 0.8666875 2023-10-03 08:45:48,141 WARNING [train.py:1204] (2/4) Exclude cut with ID _21783_210_11357_1_1531742458730_3762679_325-155703-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 08:45:48,184 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_2295_210_2087_1_1525500993_2407863_116-238894-0_sp0.9 from training. Duration: 0.77775 2023-10-03 08:45:49,583 WARNING [train.py:1204] (2/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_94-126128-0 from training. Duration: 0.86 2023-10-03 08:45:49,597 WARNING [train.py:1204] (2/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_395-310220-0_sp0.9 from training. Duration: 0.988875 2023-10-03 08:45:50,936 WARNING [train.py:1204] (2/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_821-110852-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 08:45:52,337 WARNING [train.py:1204] (2/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_64-248359-0_sp1.1 from training. Duration: 0.62725 2023-10-03 08:45:52,356 WARNING [train.py:1204] (2/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_138-187031-0 from training. Duration: 0.99 2023-10-03 08:45:52,388 WARNING [train.py:1204] (2/4) Exclude cut with ID _4122_210_2084_1_1526713164417_4353280_528-206771-0_sp0.9 from training. Duration: 0.97775 2023-10-03 08:45:53,602 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.578e+02 1.846e+02 1.963e+02 2.221e+02 3.123e+02, threshold=3.927e+02, percent-clipped=0.0 2023-10-03 08:45:53,725 WARNING [train.py:1204] (2/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_844-96989-0_sp1.1 from training. Duration: 0.6454375 2023-10-03 08:45:54,035 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.conv_skip_rate, batch_count=1207086.6666666667, ans=0.0 2023-10-03 08:45:57,066 WARNING [train.py:1204] (2/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_106-30782-0_sp1.1 from training. Duration: 0.72725 2023-10-03 08:46:06,621 WARNING [train.py:1204] (2/4) Exclude cut with ID _14683_210_5219_1_1530698352608_3974859_281-250040-0_sp1.1 from training. Duration: 0.936375 2023-10-03 08:46:06,664 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_46013_210_7234_1_1533776362_3744032_463-52378-0_sp0.9 from training. Duration: 0.9555625 2023-10-03 08:46:09,137 WARNING [train.py:1204] (2/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_299-34413-0 from training. Duration: 0.65 2023-10-03 08:46:10,502 WARNING [train.py:1204] (2/4) Exclude cut with ID _35799_210_5854_1_1533263487887_3529071_490-326847-0 from training. Duration: 0.99 2023-10-03 08:46:13,785 WARNING [train.py:1204] (2/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_589-9070-0_sp0.9 from training. Duration: 0.788875 2023-10-03 08:46:15,257 WARNING [train.py:1204] (2/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_884-29515-0_sp1.1 from training. Duration: 0.936375 2023-10-03 08:46:16,806 WARNING [train.py:1204] (2/4) Exclude cut with ID _15012_210_1968_1_1530856820429_5095350_474-119995-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 08:46:19,513 WARNING [train.py:1204] (2/4) Exclude cut with ID _65585_210_12210_1_1535450451584_3764098_211-97124-0_sp1.1 from training. Duration: 0.963625 2023-10-03 08:46:20,865 WARNING [train.py:1204] (2/4) Exclude cut with ID _33640_210_2655_1_1533117605479_4855469_666-224463-0_sp1.1 from training. Duration: 0.963625 2023-10-03 08:46:24,817 WARNING [train.py:1204] (2/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_35-153886-0 from training. Duration: 0.84 2023-10-03 08:46:24,859 WARNING [train.py:1204] (2/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_210-106047-0 from training. Duration: 0.97 2023-10-03 08:46:26,229 WARNING [train.py:1204] (2/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_479-125611-0 from training. Duration: 0.96 2023-10-03 08:46:28,086 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_9417_210_1794_1_1529235261_4614131_427-153963-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 08:46:29,520 WARNING [train.py:1204] (2/4) Exclude cut with ID _23818_210_6228_1_1531917063884_3676920_133-170571-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 08:46:29,596 WARNING [train.py:1204] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_602-275260-0_sp1.1 from training. Duration: 0.7454375 2023-10-03 08:46:31,052 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9029W0121-509521 from training. Duration: 0.896 2023-10-03 08:46:31,060 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9029W0434-509821 from training. Duration: 0.64 2023-10-03 08:46:32,426 WARNING [train.py:1204] (2/4) Exclude cut with ID _23038_210_9570_1_1532001584158_3652419_289-307034-0_sp1.1 from training. Duration: 0.936375 2023-10-03 08:46:33,826 WARNING [train.py:1204] (2/4) Exclude cut with ID _29345_210_4381_1_1532689168263_3860169_149-257268-0_sp0.9 from training. Duration: 0.9 2023-10-03 08:46:35,132 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_405-339986-0_sp1.1 from training. Duration: 0.463625 2023-10-03 08:46:37,033 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_2655_210_2087_1_1525688959_3897400_437-337690-0_sp1.1 from training. Duration: 0.57275 2023-10-03 08:46:38,304 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7543_210_6753_1_1528541907_1399322_165-323619-0_sp1.1 from training. Duration: 0.62725 2023-10-03 08:46:38,360 WARNING [train.py:1204] (2/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_508-335369-0_sp1.1 from training. Duration: 0.436375 2023-10-03 08:46:40,302 WARNING [train.py:1204] (2/4) Exclude cut with ID _7852_210_5219_1_1528286471316_8139400_358-287492-0 from training. Duration: 0.96 2023-10-03 08:46:42,861 WARNING [train.py:1204] (2/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_322-217220-0_sp0.9 from training. Duration: 0.9555625 2023-10-03 08:46:46,090 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_3171_210_2087_1_1526109084_2330007_66-260583-0_sp0.9 from training. Duration: 0.8 2023-10-03 08:46:47,370 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8558_210_1410_1_1528505225_7613406_131-329524-0_sp1.1 from training. Duration: 0.7181875 2023-10-03 08:46:47,481 WARNING [train.py:1204] (2/4) Exclude cut with ID _9235_210_5219_1_1528891154253_8046120_30-326681-0 from training. Duration: 0.83 2023-10-03 08:46:50,393 WARNING [train.py:1204] (2/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_58-68956-0_sp0.9 from training. Duration: 0.92225 2023-10-03 08:46:50,453 WARNING [train.py:1204] (2/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_255-312654-0 from training. Duration: 0.91 2023-10-03 08:46:50,531 WARNING [train.py:1204] (2/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_99-167392-0 from training. Duration: 0.61 2023-10-03 08:46:51,884 WARNING [train.py:1204] (2/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_94-126128-0_sp0.9 from training. Duration: 0.9555625 2023-10-03 08:46:57,477 WARNING [train.py:1204] (2/4) Exclude cut with ID _68635_210_9317_1_1535770813410_4315870_187-192817-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 08:47:00,729 INFO [train.py:1046] (2/4) Epoch 35, batch 500, loss[loss=0.1531, simple_loss=0.2282, pruned_loss=0.03901, over 24611.00 frames. ], tot_loss[loss=0.1603, simple_loss=0.2391, pruned_loss=0.04077, over 4337034.17 frames. ], batch size: 60, lr: 2.93e-03, grad_scale: 16.0 2023-10-03 08:47:00,776 WARNING [train.py:1204] (2/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_277-204970-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 08:47:00,887 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6163_210_6400_1_1527506746_4263068_283-137811-0_sp0.9 from training. Duration: 0.8555625 2023-10-03 08:47:02,214 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9144W0121-566446 from training. Duration: 0.896 2023-10-03 08:47:02,402 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.0.ff3_skip_rate, batch_count=1207420.0, ans=0.0 2023-10-03 08:47:02,537 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.conv_module1.balancer1.max_abs, batch_count=1207420.0, ans=10.0 2023-10-03 08:47:05,072 WARNING [train.py:1204] (2/4) Exclude cut with ID _49371_210_12071_1_1533952761021_3812190_351-315519-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 08:47:05,131 WARNING [train.py:1204] (2/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_115-326778-0_sp1.1 from training. Duration: 0.8454375 2023-10-03 08:47:05,883 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.1.encoder.layers.1.self_attn1.whiten, num_groups=1, num_channels=256, metric=11.53 vs. limit=22.5 2023-10-03 08:47:06,352 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_17292_210_6753_1_1531125388_2943734_436-198965-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 08:47:06,361 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9165W0117-576751 from training. Duration: 0.896 2023-10-03 08:47:08,402 WARNING [train.py:1204] (2/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_212-149947-0 from training. Duration: 0.73 2023-10-03 08:47:08,411 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_13757_210_3528_1_1530440682_4107239_65-167544-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 08:47:11,104 WARNING [train.py:1204] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_700-183549-0_sp0.9 from training. Duration: 0.7555625 2023-10-03 08:47:14,424 WARNING [train.py:1204] (2/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_216-343007-0_sp0.9 from training. Duration: 0.7333125 2023-10-03 08:47:15,831 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7544_210_6753_1_1528587722_3576686_295-258479-0_sp1.1 from training. Duration: 0.763625 2023-10-03 08:47:17,209 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_365-287106-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 08:47:18,527 WARNING [train.py:1204] (2/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_300-3747-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 08:47:18,598 WARNING [train.py:1204] (2/4) Exclude cut with ID _14069_210_5632_1_1530512692033_4803390_487-303201-0_sp1.1 from training. Duration: 0.92725 2023-10-03 08:47:28,373 WARNING [train.py:1204] (2/4) Exclude cut with ID _14459_210_2228_1_1531443641946_7516700_668-123026-0_sp1.1 from training. Duration: 0.936375 2023-10-03 08:47:28,396 WARNING [train.py:1204] (2/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_108-53929-0_sp1.1 from training. Duration: 0.536375 2023-10-03 08:47:30,201 WARNING [train.py:1204] (2/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_266-137104-0_sp0.9 from training. Duration: 0.72225 2023-10-03 08:47:30,235 WARNING [train.py:1204] (2/4) Exclude cut with ID _12302_210_5219_1_1529924446466_8107280_902-168619-0_sp1.1 from training. Duration: 0.936375 2023-10-03 08:47:30,259 WARNING [train.py:1204] (2/4) Exclude cut with ID _59502_210_12210_1_1535107024698_1835139_146-162495-0 from training. Duration: 0.92 2023-10-03 08:47:31,696 WARNING [train.py:1204] (2/4) Exclude cut with ID _3465_210_3525_1_1526175667060_3574049_412-287526-0_sp1.1 from training. Duration: 0.6545625 2023-10-03 08:47:33,487 INFO [scaling.py:1118] (2/4) WithLoss: name=encoder.encoders.5.encoder.layers.0.self_attn_weights, loss-sum=5.199e-03 2023-10-03 08:47:34,487 WARNING [train.py:1204] (2/4) Exclude cut with ID _7160_210_1876_1_1527940923231_3703799_120-150943-0_sp0.9 from training. Duration: 0.9 2023-10-03 08:47:35,810 WARNING [train.py:1204] (2/4) Exclude cut with ID _15037_210_2253_1_1530752294104_3267650_296-192597-0_sp1.1 from training. Duration: 0.736375 2023-10-03 08:47:35,836 WARNING [train.py:1204] (2/4) Exclude cut with ID _13252_210_1925_1_1530671372047_3336909_397-216497-0_sp1.1 from training. Duration: 0.87275 2023-10-03 08:47:35,851 WARNING [train.py:1204] (2/4) Exclude cut with ID _40094_210_16380_1_1533294005773_3641159_76-269320-0_sp1.1 from training. Duration: 0.936375 2023-10-03 08:47:37,225 WARNING [train.py:1204] (2/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_449-15508-0 from training. Duration: 0.82 2023-10-03 08:47:41,191 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9309W0194-640321 from training. Duration: 0.64 2023-10-03 08:47:43,907 WARNING [train.py:1204] (2/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_284-312178-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 08:47:45,292 WARNING [train.py:1204] (2/4) Exclude cut with ID _29853_210_7497_1_1532505600851_6642110_401-55937-0_sp1.1 from training. Duration: 0.92725 2023-10-03 08:47:45,362 WARNING [train.py:1204] (2/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_716-24918-0_sp1.1 from training. Duration: 0.92725 2023-10-03 08:47:47,167 WARNING [train.py:1204] (2/4) Exclude cut with ID _19637_210_4220_1_1531537167676_3205060_314-305638-0_sp1.1 from training. Duration: 0.92725 2023-10-03 08:47:47,196 WARNING [train.py:1204] (2/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_475-127455-0_sp1.1 from training. Duration: 0.636375 2023-10-03 08:47:48,648 WARNING [train.py:1204] (2/4) Exclude cut with ID _5444_210_2151_1_1527418808464_5312210_581-315411-0 from training. Duration: 0.62 2023-10-03 08:47:50,112 WARNING [train.py:1204] (2/4) Exclude cut with ID _44902_210_16187_1_1533888006062_3792078_149-67212-0_sp0.9 from training. Duration: 0.9666875 2023-10-03 08:47:52,597 WARNING [train.py:1204] (2/4) Exclude cut with ID _31602_210_5854_1_1532677021456_4458250_63-63253-0_sp1.1 from training. Duration: 0.963625 2023-10-03 08:47:52,847 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.conv_module2.balancer1.prob, batch_count=1207620.0, ans=0.125 2023-10-03 08:47:56,760 WARNING [train.py:1204] (2/4) Exclude cut with ID _55209_210_1531_1_1534675828996_4597160_660-203992-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 08:47:58,260 WARNING [train.py:1204] (2/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_83-190624-0_sp1.1 from training. Duration: 0.92725 2023-10-03 08:48:05,702 WARNING [train.py:1204] (2/4) Exclude cut with ID _58605_210_12210_1_1534723195939_3904259_154-139635-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 08:48:07,193 WARNING [train.py:1204] (2/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_164-224052-0 from training. Duration: 0.7 2023-10-03 08:48:07,207 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_1126-189071-0_sp1.1 from training. Duration: 0.963625 2023-10-03 08:48:07,326 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.1.balancer1.prob, batch_count=1207686.6666666667, ans=0.125 2023-10-03 08:48:07,336 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.feed_forward1.hidden_balancer.prob, batch_count=1207686.6666666667, ans=0.125 2023-10-03 08:48:08,912 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_15396_210_6890_1_1531308473_1938936_310-214789-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 08:48:12,424 WARNING [train.py:1204] (2/4) Exclude cut with ID _8395_210_1531_1_1528597871523_4411579_431-132470-0 from training. Duration: 0.74 2023-10-03 08:48:13,728 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_3780_210_2087_1_1526467760_2130483_170-259044-0_sp0.9 from training. Duration: 0.77775 2023-10-03 08:48:15,001 INFO [train.py:1046] (2/4) Epoch 35, batch 550, loss[loss=0.167, simple_loss=0.2555, pruned_loss=0.03926, over 24423.00 frames. ], tot_loss[loss=0.1612, simple_loss=0.2401, pruned_loss=0.04116, over 4411085.68 frames. ], batch size: 69, lr: 2.93e-03, grad_scale: 16.0 2023-10-03 08:48:16,458 WARNING [train.py:1204] (2/4) Exclude cut with ID _12733_210_8341_1_1530167424556_3646380_537-230640-0_sp1.1 from training. Duration: 0.963625 2023-10-03 08:48:19,832 WARNING [train.py:1204] (2/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_159-281310-0 from training. Duration: 0.95 2023-10-03 08:48:21,271 WARNING [train.py:1204] (2/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_434-83000-0 from training. Duration: 0.98 2023-10-03 08:48:21,299 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_4405_210_1849_1_1526732205_3593939_493-32464-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 08:48:21,314 WARNING [train.py:1204] (2/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_342-100157-0 from training. Duration: 0.54 2023-10-03 08:48:22,462 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.577e+02 1.820e+02 2.073e+02 2.406e+02 3.793e+02, threshold=4.145e+02, percent-clipped=0.0 2023-10-03 08:48:22,593 WARNING [train.py:1204] (2/4) Exclude cut with ID _44778_210_2054_1_1533798127203_7443090_765-250133-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 08:48:22,599 WARNING [train.py:1204] (2/4) Exclude cut with ID _38670_210_15379_1_1533448810659_4391059_81-312245-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 08:48:23,920 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_50050_210_16223_1_1535435796_7290138_1273-247611-0_sp1.1 from training. Duration: 0.936375 2023-10-03 08:48:23,980 WARNING [train.py:1204] (2/4) Exclude cut with ID _44831_210_13427_1_1533816011549_3689035_399-18591-0_sp1.1 from training. Duration: 0.936375 2023-10-03 08:48:25,249 WARNING [train.py:1204] (2/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_557-324258-0_sp0.9 from training. Duration: 0.9 2023-10-03 08:48:26,640 WARNING [train.py:1204] (2/4) Exclude cut with ID _3465_210_3525_1_1526175667060_3574049_62-335552-0_sp1.1 from training. Duration: 0.7909375 2023-10-03 08:48:28,421 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.out_combiner.scale_min, batch_count=1207820.0, ans=0.2 2023-10-03 08:48:28,441 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.feed_forward1.out_proj.dropout_p, batch_count=1207820.0, ans=0.1 2023-10-03 08:48:29,509 WARNING [train.py:1204] (2/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_673-264125-0_sp1.1 from training. Duration: 0.963625 2023-10-03 08:48:29,600 WARNING [train.py:1204] (2/4) Exclude cut with ID _8030_210_7497_1_1528444886781_3531089_262-86750-0 from training. Duration: 0.81 2023-10-03 08:48:29,615 WARNING [train.py:1204] (2/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_444-28041-0_sp1.1 from training. Duration: 0.77275 2023-10-03 08:48:32,626 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.ff3_skip_rate, batch_count=1207820.0, ans=0.0 2023-10-03 08:48:33,725 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_34008_210_10431_1_1533345424_8183247_516-14858-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 08:48:34,972 WARNING [train.py:1204] (2/4) Exclude cut with ID _32704_210_8954_1_1533017488840_6747370_54-248218-0_sp1.1 from training. Duration: 0.936375 2023-10-03 08:48:36,951 WARNING [train.py:1204] (2/4) Exclude cut with ID _13253_210_1925_1_1530757775819_3577873_300-119782-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 08:48:38,294 WARNING [train.py:1204] (2/4) Exclude cut with ID _39290_210_4220_1_1533862778760_3075118_21-240703-0_sp1.1 from training. Duration: 0.936375 2023-10-03 08:48:41,631 WARNING [train.py:1204] (2/4) Exclude cut with ID 774-127930-0014-10412-0_sp1.1 from training. Duration: 0.95 2023-10-03 08:48:44,289 WARNING [train.py:1204] (2/4) Exclude cut with ID _67499_210_17282_1_1535509805220_3692430_20-179346-0 from training. Duration: 0.97 2023-10-03 08:48:44,518 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.conv_module1.balancer2.prob, batch_count=1207886.6666666667, ans=0.125 2023-10-03 08:48:45,667 WARNING [train.py:1204] (2/4) Exclude cut with ID _8365_210_2064_1_1528592311070_4677690_431-294102-0_sp0.9 from training. Duration: 0.988875 2023-10-03 08:48:50,401 WARNING [train.py:1204] (2/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_365-1492-0_sp1.1 from training. Duration: 0.82725 2023-10-03 08:48:50,431 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_105-114797-0_sp0.9 from training. Duration: 0.9333125 2023-10-03 08:48:51,858 WARNING [train.py:1204] (2/4) Exclude cut with ID _6274_210_2084_1_1527937583837_7494020_769-168302-0_sp0.9 from training. Duration: 0.788875 2023-10-03 08:48:54,592 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_40340_210_9024_1_1533433916_4132944_320-270277-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 08:48:54,597 WARNING [train.py:1204] (2/4) Exclude cut with ID ID0203W0003-781711 from training. Duration: 0.958 2023-10-03 08:48:56,076 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_30434_210_13049_1_1532519384_981746_52-12932-0_sp1.1 from training. Duration: 0.936375 2023-10-03 08:48:57,477 WARNING [train.py:1204] (2/4) Exclude cut with ID _8263_210_1664_1_1528438953494_294100_40-204731-0_sp0.9 from training. Duration: 0.5555625 2023-10-03 08:49:00,172 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1032-236863-0_sp0.9 from training. Duration: 0.9333125 2023-10-03 08:49:00,228 WARNING [train.py:1204] (2/4) Exclude cut with ID _8394_210_6702_1_1528590504316_3540189_335-274780-0_sp0.9 from training. Duration: 0.8666875 2023-10-03 08:49:00,245 WARNING [train.py:1204] (2/4) Exclude cut with ID _66873_210_5517_1_1535454136506_4293370_654-52713-0_sp1.1 from training. Duration: 0.7 2023-10-03 08:49:01,645 WARNING [train.py:1204] (2/4) Exclude cut with ID _5250_210_5219_1_1527249561806_8692110_742-267856-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 08:49:02,981 WARNING [train.py:1204] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_74-48717-0 from training. Duration: 0.58 2023-10-03 08:49:03,082 WARNING [train.py:1204] (2/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_18-46756-0 from training. Duration: 0.79 2023-10-03 08:49:03,296 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=1207953.3333333333, ans=0.1 2023-10-03 08:49:04,358 WARNING [train.py:1204] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_612-56061-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 08:49:04,368 WARNING [train.py:1204] (2/4) Exclude cut with ID _56423_210_5854_1_1534896061049_3165969_412-2403-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 08:49:04,431 WARNING [train.py:1204] (2/4) Exclude cut with ID _62574_210_2655_1_1535180407058_3933249_120-196848-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 08:49:04,431 WARNING [train.py:1204] (2/4) Exclude cut with ID _40519_210_13794_1_1533715228655_7337390_916-51000-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 08:49:07,623 WARNING [train.py:1204] (2/4) Exclude cut with ID _65263_210_22356_1_1535763598691_3921037_108-21902-0_sp1.1 from training. Duration: 0.9 2023-10-03 08:49:09,945 WARNING [train.py:1204] (2/4) Exclude cut with ID _4122_210_2084_1_1526713164417_4353280_528-206771-0_sp1.1 from training. Duration: 0.8 2023-10-03 08:49:12,666 WARNING [train.py:1204] (2/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_43-325657-0_sp1.1 from training. Duration: 0.92725 2023-10-03 08:49:12,720 WARNING [train.py:1204] (2/4) Exclude cut with ID _44659_210_2721_1_1534036120061_6614860_314-82041-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 08:49:12,784 WARNING [train.py:1204] (2/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_81-300212-0_sp0.9 from training. Duration: 0.6666875 2023-10-03 08:49:15,416 WARNING [train.py:1204] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_665-196155-0_sp0.9 from training. Duration: 0.7444375 2023-10-03 08:49:15,525 WARNING [train.py:1204] (2/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_140-336881-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 08:49:16,699 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6290_210_3273_1_1527595224_1282787_115-87095-0_sp0.9 from training. Duration: 0.611125 2023-10-03 08:49:18,066 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_53373_210_5399_1_1534229471_4715695_607-259685-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 08:49:19,426 WARNING [train.py:1204] (2/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_175-163894-0_sp1.1 from training. Duration: 0.67275 2023-10-03 08:49:21,432 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7002_210_1794_1_1527939683_5157286_1-281264-0_sp1.1 from training. Duration: 0.4 2023-10-03 08:49:21,708 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=1208020.0, ans=0.1 2023-10-03 08:49:25,703 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.conv_skip_rate, batch_count=1208020.0, ans=0.0 2023-10-03 08:49:26,858 WARNING [train.py:1204] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_605-257205-0 from training. Duration: 0.59 2023-10-03 08:49:28,076 INFO [train.py:1046] (2/4) Epoch 35, batch 600, loss[loss=0.1672, simple_loss=0.2572, pruned_loss=0.03855, over 24659.00 frames. ], tot_loss[loss=0.1612, simple_loss=0.2403, pruned_loss=0.04104, over 4485689.25 frames. ], batch size: 68, lr: 2.93e-03, grad_scale: 16.0 2023-10-03 08:49:29,625 WARNING [train.py:1204] (2/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_454-314901-0 from training. Duration: 0.46 2023-10-03 08:49:30,991 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_46032_210_14656_1_1533781412_4507969_287-130422-0_sp1.1 from training. Duration: 0.97275 2023-10-03 08:49:32,317 WARNING [train.py:1204] (2/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_186-309161-0_sp1.1 from training. Duration: 0.4909375 2023-10-03 08:49:32,362 WARNING [train.py:1204] (2/4) Exclude cut with ID _42659_210_2070_1_1533607442765_3120380_45-342307-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 08:49:36,700 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.2.encoder.layers.1.self_attn2.whiten, num_groups=1, num_channels=384, metric=15.29 vs. limit=22.5 2023-10-03 08:49:37,317 WARNING [train.py:1204] (2/4) Exclude cut with ID _12555_210_8750_1_1530075594402_3780239_478-326818-0_sp1.1 from training. Duration: 0.963625 2023-10-03 08:49:39,190 WARNING [train.py:1204] (2/4) Exclude cut with ID _7606_210_4915_1_1528894253277_5080690_127-177761-0_sp1.1 from training. Duration: 0.5818125 2023-10-03 08:49:41,142 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6788_210_1388_1_1527898698_4562021_91-197981-0 from training. Duration: 0.83 2023-10-03 08:49:43,599 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8558_210_1410_1_1528505225_7613406_131-329524-0_sp0.9 from training. Duration: 0.87775 2023-10-03 08:49:44,981 WARNING [train.py:1204] (2/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_153-153537-0_sp1.1 from training. Duration: 0.82725 2023-10-03 08:49:46,381 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_17292_210_6753_1_1531125388_2943734_285-349105-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 08:49:47,825 WARNING [train.py:1204] (2/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_37-41919-0 from training. Duration: 0.87 2023-10-03 08:49:47,848 WARNING [train.py:1204] (2/4) Exclude cut with ID _60860_210_8254_1_1534896015875_3557169_172-294852-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 08:49:55,076 WARNING [train.py:1204] (2/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_146-276944-0 from training. Duration: 0.83 2023-10-03 08:49:59,106 WARNING [train.py:1204] (2/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_1254-44082-0_sp1.1 from training. Duration: 0.936375 2023-10-03 08:49:59,125 WARNING [train.py:1204] (2/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_1115-180350-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 08:49:59,155 WARNING [train.py:1204] (2/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_60-146189-0_sp1.1 from training. Duration: 0.8909375 2023-10-03 08:50:01,134 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.feed_forward3.out_whiten.whitening_limit, batch_count=1208220.0, ans=15.0 2023-10-03 08:50:03,428 WARNING [train.py:1204] (2/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_517-153649-0_sp1.1 from training. Duration: 0.92725 2023-10-03 08:50:03,439 WARNING [train.py:1204] (2/4) Exclude cut with ID _39571_210_13449_1_1533801584143_3651077_37-266257-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 08:50:04,786 WARNING [train.py:1204] (2/4) Exclude cut with ID _6653_210_2629_1_1527919172441_6732679_527-276118-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 08:50:11,414 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7696_210_2040_1_1528207167_3716099_117-125434-0_sp1.1 from training. Duration: 0.6909375 2023-10-03 08:50:16,123 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_25767_210_2161_1_1532307483_3867748_478-156360-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 08:50:16,127 WARNING [train.py:1204] (2/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_177-49092-0_sp1.1 from training. Duration: 0.82725 2023-10-03 08:50:16,134 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_11305_210_1794_1_1529750428_7178148_680-317073-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 08:50:23,575 WARNING [train.py:1204] (2/4) Exclude cut with ID _53565_210_10635_1_1534337218335_3820259_287-18120-0 from training. Duration: 0.97 2023-10-03 08:50:26,359 WARNING [train.py:1204] (2/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_498-40153-0_sp1.1 from training. Duration: 0.57275 2023-10-03 08:50:27,029 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.3.encoder.layers.2.feed_forward1.out_whiten, num_groups=1, num_channels=512, metric=7.38 vs. limit=15.0 2023-10-03 08:50:27,652 WARNING [train.py:1204] (2/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_555-63066-0_sp1.1 from training. Duration: 0.963625 2023-10-03 08:50:31,754 WARNING [train.py:1204] (2/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_395-310220-0 from training. Duration: 0.89 2023-10-03 08:50:33,088 WARNING [train.py:1204] (2/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_937-208281-0_sp1.1 from training. Duration: 0.77275 2023-10-03 08:50:35,069 WARNING [train.py:1204] (2/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_120-211630-0 from training. Duration: 0.92 2023-10-03 08:50:36,468 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_454-276429-0_sp1.1 from training. Duration: 0.7909375 2023-10-03 08:50:36,511 WARNING [train.py:1204] (2/4) Exclude cut with ID _22826_210_9994_1_1531908201522_3279169_404-75889-0_sp1.1 from training. Duration: 0.8090625 2023-10-03 08:50:40,117 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.conv_skip_rate, batch_count=1208353.3333333333, ans=0.0 2023-10-03 08:50:40,183 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.self_attn_weights.pos_emb_skip_rate, batch_count=1208353.3333333333, ans=0.0 2023-10-03 08:50:41,252 WARNING [train.py:1204] (2/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_282-136641-0_sp0.9 from training. Duration: 0.4666875 2023-10-03 08:50:43,025 INFO [train.py:1046] (2/4) Epoch 35, batch 650, loss[loss=0.1474, simple_loss=0.2066, pruned_loss=0.04412, over 22800.00 frames. ], tot_loss[loss=0.1609, simple_loss=0.2398, pruned_loss=0.04097, over 4542605.98 frames. ], batch size: 322, lr: 2.93e-03, grad_scale: 16.0 2023-10-03 08:50:43,093 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_809-17156-0_sp1.1 from training. Duration: 0.663625 2023-10-03 08:50:44,566 WARNING [train.py:1204] (2/4) Exclude cut with ID _9235_210_5219_1_1528891154253_8046120_1003-101638-0_sp0.9 from training. Duration: 0.811125 2023-10-03 08:50:46,065 WARNING [train.py:1204] (2/4) Exclude cut with ID _22826_210_9994_1_1531908201522_3279169_404-75889-0_sp0.9 from training. Duration: 0.988875 2023-10-03 08:50:48,742 WARNING [train.py:1204] (2/4) Exclude cut with ID _59288_210_5854_1_1535077757774_3412246_570-295255-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 08:50:48,916 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.bypass.skip_rate, batch_count=1208420.0, ans=0.07 2023-10-03 08:50:48,953 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.conv_module1.balancer1.prob, batch_count=1208420.0, ans=0.125 2023-10-03 08:50:49,939 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.581e+02 1.925e+02 2.110e+02 2.430e+02 3.265e+02, threshold=4.220e+02, percent-clipped=0.0 2023-10-03 08:50:51,844 WARNING [train.py:1204] (2/4) Exclude cut with ID _3465_210_3525_1_1526175667060_3574049_412-287526-0 from training. Duration: 0.72 2023-10-03 08:50:53,104 WARNING [train.py:1204] (2/4) Exclude cut with ID _44010_210_4525_1_1533884262964_4013620_649-79655-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 08:50:57,446 WARNING [train.py:1204] (2/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_57-270037-0_sp0.9 from training. Duration: 0.8555625 2023-10-03 08:50:57,447 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_34815_210_6388_1_1533381320_3571903_302-255963-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 08:51:00,169 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_47449_210_5399_1_1534076445_4006929_573-301852-0_sp1.1 from training. Duration: 0.97275 2023-10-03 08:51:04,285 WARNING [train.py:1204] (2/4) Exclude cut with ID 6709-74022-0004-86860-0_sp1.1 from training. Duration: 0.9409375 2023-10-03 08:51:04,408 WARNING [train.py:1204] (2/4) Exclude cut with ID _40519_210_13794_1_1533715228655_7337390_111-71991-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 08:51:05,916 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_30167_210_3528_1_1532428012_4772375_169-214186-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 08:51:06,847 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.0.layers.0.self_attn1.whiten, num_groups=1, num_channels=192, metric=11.02 vs. limit=22.5 2023-10-03 08:51:09,763 WARNING [train.py:1204] (2/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_20-235313-0_sp1.1 from training. Duration: 0.963625 2023-10-03 08:51:09,800 WARNING [train.py:1204] (2/4) Exclude cut with ID _4681_210_3525_1_1527251040321_1235713_123-24428-0_sp0.9 from training. Duration: 0.5333125 2023-10-03 08:51:13,092 WARNING [train.py:1204] (2/4) Exclude cut with ID _31602_210_5854_1_1532677021456_4458250_444-198752-0_sp1.1 from training. Duration: 0.97275 2023-10-03 08:51:14,375 WARNING [train.py:1204] (2/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_9-279442-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 08:51:15,722 WARNING [train.py:1204] (2/4) Exclude cut with ID _6275_210_2084_1_1528110062213_7669049_973-198085-0_sp0.9 from training. Duration: 0.8333125 2023-10-03 08:51:15,806 WARNING [train.py:1204] (2/4) Exclude cut with ID _29853_210_7497_1_1532505600851_6642110_567-250221-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 08:51:17,173 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6545_210_2040_1_1527774670_4245258_407-48070-0_sp1.1 from training. Duration: 0.6909375 2023-10-03 08:51:19,265 WARNING [train.py:1204] (2/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_224-343295-0_sp0.9 from training. Duration: 0.611125 2023-10-03 08:51:19,280 WARNING [train.py:1204] (2/4) Exclude cut with ID IC0125W0011-61817 from training. Duration: 0.88 2023-10-03 08:51:19,296 WARNING [train.py:1204] (2/4) Exclude cut with ID _60113_210_16271_1_1535164212481_4386079_435-154371-0_sp1.1 from training. Duration: 0.97275 2023-10-03 08:51:19,309 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_25836_210_16554_1_1532138167_53197_3-9394-0_sp1.1 from training. Duration: 0.92725 2023-10-03 08:51:23,526 WARNING [train.py:1204] (2/4) Exclude cut with ID _29343_210_4381_1_1532602757752_3674109_57-55125-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 08:51:24,891 WARNING [train.py:1204] (2/4) Exclude cut with ID _56235_210_10640_1_1534557335235_4161099_267-23560-0_sp1.1 from training. Duration: 0.936375 2023-10-03 08:51:24,917 WARNING [train.py:1204] (2/4) Exclude cut with ID _30196_210_1664_1_1533708496522_7074140_1150-278867-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 08:51:25,155 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.ff2_skip_rate, batch_count=1208553.3333333333, ans=0.0 2023-10-03 08:51:26,159 WARNING [train.py:1204] (2/4) Exclude cut with ID _6270_210_2084_1_1527850911573_3856420_26-277343-0_sp0.9 from training. Duration: 0.911125 2023-10-03 08:51:26,227 WARNING [train.py:1204] (2/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_325-62802-0 from training. Duration: 0.87 2023-10-03 08:51:27,592 WARNING [train.py:1204] (2/4) Exclude cut with ID _56524_210_9642_1_1534572011779_3619170_145-310842-0_sp1.1 from training. Duration: 0.9 2023-10-03 08:51:27,633 WARNING [train.py:1204] (2/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_860-96198-0_sp0.9 from training. Duration: 0.811125 2023-10-03 08:51:29,038 WARNING [train.py:1204] (2/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_17-25045-0_sp1.1 from training. Duration: 0.536375 2023-10-03 08:51:29,056 WARNING [train.py:1204] (2/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_349-69727-0_sp1.1 from training. Duration: 0.936375 2023-10-03 08:51:29,287 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.balancer1.prob, batch_count=1208620.0, ans=0.125 2023-10-03 08:51:30,407 WARNING [train.py:1204] (2/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_580-93798-0_sp1.1 from training. Duration: 0.5090625 2023-10-03 08:51:31,913 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7762_210_1794_1_1528199508_4057408_419-125009-0 from training. Duration: 0.83 2023-10-03 08:51:32,004 WARNING [train.py:1204] (2/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_356-167816-0 from training. Duration: 0.72 2023-10-03 08:51:32,032 WARNING [train.py:1204] (2/4) Exclude cut with ID _29345_210_4381_1_1532689168263_3860169_225-97100-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 08:51:32,036 WARNING [train.py:1204] (2/4) Exclude cut with ID _14227_210_4381_1_1530701804286_3940259_374-194712-0_sp1.1 from training. Duration: 0.92725 2023-10-03 08:51:32,068 WARNING [train.py:1204] (2/4) Exclude cut with ID _40137_210_12210_1_1533290484046_3795180_249-187059-0_sp0.9 from training. Duration: 0.97775 2023-10-03 08:51:33,241 WARNING [train.py:1204] (2/4) Exclude cut with ID _22826_210_9994_1_1531908201522_3279169_389-317194-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 08:51:34,625 WARNING [train.py:1204] (2/4) Exclude cut with ID _27485_210_4916_1_1532739191677_3668269_239-122918-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 08:51:41,192 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_2245_210_2715_1_1525511548_3540254_128-117609-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 08:51:41,226 WARNING [train.py:1204] (2/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_160-276178-0_sp1.1 from training. Duration: 0.963625 2023-10-03 08:51:42,615 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_31920_210_10633_1_1532860009_1686094_1-279735-0_sp1.1 from training. Duration: 0.97275 2023-10-03 08:51:45,351 WARNING [train.py:1204] (2/4) Exclude cut with ID _51078_210_9070_1_1534161557424_3805250_325-262383-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 08:51:45,378 WARNING [train.py:1204] (2/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_643-52007-0_sp0.9 from training. Duration: 0.6333125 2023-10-03 08:51:46,757 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_51937_210_5014_1_1534573610_4022025_518-299248-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 08:51:51,704 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.conv_skip_rate, batch_count=1208686.6666666667, ans=0.0 2023-10-03 08:51:52,739 WARNING [train.py:1204] (2/4) Exclude cut with ID _13252_210_1925_1_1530671372047_3336909_166-314607-0_sp1.1 from training. Duration: 0.4909375 2023-10-03 08:51:52,743 WARNING [train.py:1204] (2/4) Exclude cut with ID _44778_210_2054_1_1533798127203_7443090_1087-282939-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 08:51:52,785 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_23850_210_4238_1_1532232597_1632433_32-185135-0_sp1.1 from training. Duration: 0.7818125 2023-10-03 08:51:54,061 WARNING [train.py:1204] (2/4) Exclude cut with ID _59500_210_8254_1_1534809610680_3736009_145-89359-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 08:51:56,799 INFO [train.py:1046] (2/4) Epoch 35, batch 700, loss[loss=0.1591, simple_loss=0.2304, pruned_loss=0.04391, over 23980.00 frames. ], tot_loss[loss=0.16, simple_loss=0.2385, pruned_loss=0.0408, over 4555647.03 frames. ], batch size: 196, lr: 2.93e-03, grad_scale: 16.0 2023-10-03 08:51:57,004 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_340-336408-0 from training. Duration: 0.93 2023-10-03 08:51:58,333 WARNING [train.py:1204] (2/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_9-21737-0 from training. Duration: 0.97 2023-10-03 08:52:01,179 WARNING [train.py:1204] (2/4) Exclude cut with ID _2220_210_1819_1_1525509109898_3658680_50-99837-0 from training. Duration: 0.54 2023-10-03 08:52:02,507 WARNING [train.py:1204] (2/4) Exclude cut with ID _30304_210_6319_1_1532476827261_7582060_879-299350-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 08:52:03,912 WARNING [train.py:1204] (2/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_905-160074-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 08:52:05,350 WARNING [train.py:1204] (2/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_395-73570-0 from training. Duration: 0.99 2023-10-03 08:52:10,679 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.attention_skip_rate, batch_count=1208820.0, ans=0.0 2023-10-03 08:52:12,358 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7264_210_4938_1_1527939911_8132095_800-91995-0_sp1.1 from training. Duration: 0.7818125 2023-10-03 08:52:13,786 WARNING [train.py:1204] (2/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_77-311404-0_sp1.1 from training. Duration: 0.8545625 2023-10-03 08:52:15,391 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.conv_skip_rate, batch_count=1208820.0, ans=0.0 2023-10-03 08:52:16,388 WARNING [train.py:1204] (2/4) Exclude cut with ID _13679_210_7497_1_1530534293662_3355098_107-80772-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 08:52:17,705 WARNING [train.py:1204] (2/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_224-343295-0_sp1.1 from training. Duration: 0.5 2023-10-03 08:52:17,742 WARNING [train.py:1204] (2/4) Exclude cut with ID _54871_210_22356_1_1534665798253_3856359_156-253482-0_sp1.1 from training. Duration: 0.963625 2023-10-03 08:52:21,019 WARNING [train.py:1204] (2/4) Exclude cut with ID _49728_210_12651_1_1534041034294_6788150_309-125532-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 08:52:23,752 WARNING [train.py:1204] (2/4) Exclude cut with ID _13913_210_9994_1_1530579615187_3383897_473-277682-0_sp0.9 from training. Duration: 0.4333125 2023-10-03 08:52:23,757 WARNING [train.py:1204] (2/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_3-276701-0_sp1.1 from training. Duration: 0.82725 2023-10-03 08:52:25,113 WARNING [train.py:1204] (2/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_282-136641-0 from training. Duration: 0.42 2023-10-03 08:52:27,906 WARNING [train.py:1204] (2/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_127-566-0 from training. Duration: 0.84 2023-10-03 08:52:33,260 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6163_210_6400_1_1527506746_4263068_283-137811-0_sp1.1 from training. Duration: 0.7 2023-10-03 08:52:33,289 WARNING [train.py:1204] (2/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_34-183685-0_sp1.1 from training. Duration: 0.8909375 2023-10-03 08:52:33,421 WARNING [train.py:1204] (2/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_154-135696-0_sp0.9 from training. Duration: 0.888875 2023-10-03 08:52:37,593 WARNING [train.py:1204] (2/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_676-51014-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 08:52:39,467 WARNING [train.py:1204] (2/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_38-169920-0 from training. Duration: 0.91 2023-10-03 08:52:44,663 WARNING [train.py:1204] (2/4) Exclude cut with ID _6653_210_2629_1_1527919172441_6732679_747-114465-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 08:52:44,708 WARNING [train.py:1204] (2/4) Exclude cut with ID _8021_210_7994_1_1528376451358_3653069_100-140231-0_sp1.1 from training. Duration: 0.6090625 2023-10-03 08:52:46,019 WARNING [train.py:1204] (2/4) Exclude cut with ID _64135_210_2253_1_1535677267876_7608972_133-188266-0 from training. Duration: 0.81 2023-10-03 08:52:46,297 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.attention_skip_rate, batch_count=1208953.3333333333, ans=0.0 2023-10-03 08:52:48,883 WARNING [train.py:1204] (2/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_714-310538-0_sp1.1 from training. Duration: 0.7818125 2023-10-03 08:52:50,243 WARNING [train.py:1204] (2/4) Exclude cut with ID _39867_210_9826_1_1533553222030_9344259_1134-288498-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 08:52:53,630 WARNING [train.py:1204] (2/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_211-237289-0_sp1.1 from training. Duration: 0.936375 2023-10-03 08:52:57,937 WARNING [train.py:1204] (2/4) Exclude cut with ID _59202_210_26984_1_1534816810612_3549174_137-318710-0_sp0.9 from training. Duration: 0.988875 2023-10-03 08:52:57,961 WARNING [train.py:1204] (2/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_86-66192-0 from training. Duration: 0.55 2023-10-03 08:53:02,024 WARNING [train.py:1204] (2/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_540-275685-0 from training. Duration: 0.89 2023-10-03 08:53:02,062 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8776_210_2040_1_1528638837_4088036_244-278763-0 from training. Duration: 0.9 2023-10-03 08:53:04,727 WARNING [train.py:1204] (2/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_9-219972-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 08:53:06,165 WARNING [train.py:1204] (2/4) Exclude cut with ID _44659_210_2721_1_1534036120061_6614860_656-283823-0_sp1.1 from training. Duration: 0.92725 2023-10-03 08:53:07,544 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_69630_210_7234_1_1535789601_661049_77-216385-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 08:53:10,785 INFO [train.py:1046] (2/4) Epoch 35, batch 750, loss[loss=0.1673, simple_loss=0.254, pruned_loss=0.04026, over 24435.00 frames. ], tot_loss[loss=0.1597, simple_loss=0.2387, pruned_loss=0.04037, over 4606151.58 frames. ], batch size: 77, lr: 2.93e-03, grad_scale: 16.0 2023-10-03 08:53:10,867 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_19952_210_2161_1_1531875469_3851750_210-308919-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 08:53:10,880 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_4737_210_2040_1_1526998689_3293008_378-178650-0 from training. Duration: 0.95 2023-10-03 08:53:16,026 WARNING [train.py:1204] (2/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_224-343295-0 from training. Duration: 0.55 2023-10-03 08:53:17,352 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7002_210_1794_1_1527939683_5157286_184-229630-0 from training. Duration: 0.74 2023-10-03 08:53:17,383 WARNING [train.py:1204] (2/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_6-214815-0 from training. Duration: 0.85 2023-10-03 08:53:18,615 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.560e+02 2.000e+02 2.273e+02 2.606e+02 4.191e+02, threshold=4.547e+02, percent-clipped=0.0 2023-10-03 08:53:18,721 WARNING [train.py:1204] (2/4) Exclude cut with ID _8699_210_2069_1_1528630346831_3960409_160-24408-0 from training. Duration: 0.91 2023-10-03 08:53:18,760 WARNING [train.py:1204] (2/4) Exclude cut with ID _5585_210_2667_1_1527418813324_8007340_89-332072-0 from training. Duration: 0.64 2023-10-03 08:53:19,558 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.2.encoder.layers.2.nonlin_attention.whiten1, num_groups=1, num_channels=288, metric=4.98 vs. limit=10.0 2023-10-03 08:53:20,100 WARNING [train.py:1204] (2/4) Exclude cut with ID _5250_210_5219_1_1527249561806_8692110_528-259800-0_sp1.1 from training. Duration: 0.963625 2023-10-03 08:53:20,254 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.ff2_skip_rate, batch_count=1209086.6666666667, ans=0.0 2023-10-03 08:53:21,363 WARNING [train.py:1204] (2/4) Exclude cut with ID _55383_210_10635_1_1534468426172_2231669_134-128246-0 from training. Duration: 0.89 2023-10-03 08:53:21,439 WARNING [train.py:1204] (2/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_400-338274-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 08:53:21,689 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.conv_skip_rate, batch_count=1209086.6666666667, ans=0.0 2023-10-03 08:53:22,798 WARNING [train.py:1204] (2/4) Exclude cut with ID _14224_210_4381_1_1530529062037_4264010_364-22817-0_sp0.9 from training. Duration: 0.788875 2023-10-03 08:53:24,437 WARNING [train.py:1204] (2/4) Exclude cut with ID _42863_210_9070_1_1533902398788_3696920_331-291057-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 08:53:26,264 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_5436_210_1794_1_1527331317_2541494_229-211016-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 08:53:26,490 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.conv_module1.balancer1.prob, batch_count=1209153.3333333333, ans=0.125 2023-10-03 08:53:27,615 WARNING [train.py:1204] (2/4) Exclude cut with ID _6456_210_1534_1_1527681634212_3181049_345-252877-0_sp0.9 from training. Duration: 0.711125 2023-10-03 08:53:27,644 WARNING [train.py:1204] (2/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_96-81499-0_sp1.1 from training. Duration: 0.92725 2023-10-03 08:53:29,130 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_5575_210_1410_1_1527301305_2570170_342-265572-0_sp1.1 from training. Duration: 0.8545625 2023-10-03 08:53:30,474 WARNING [train.py:1204] (2/4) Exclude cut with ID _5444_210_2151_1_1527418808464_5312210_726-27165-0_sp0.9 from training. Duration: 0.7444375 2023-10-03 08:53:31,903 WARNING [train.py:1204] (2/4) Exclude cut with ID _22083_210_8254_1_1531702862439_3678970_304-327750-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 08:53:33,322 WARNING [train.py:1204] (2/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_245-226199-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 08:53:33,358 WARNING [train.py:1204] (2/4) Exclude cut with ID _42875_210_14856_1_1533704397487_4012433_102-199499-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 08:53:34,675 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_51225_210_3601_1_1534400634_2327383_55-301844-0 from training. Duration: 0.93 2023-10-03 08:53:36,024 WARNING [train.py:1204] (2/4) Exclude cut with ID _28317_210_12742_1_1532480302381_8510325_916-195848-0_sp1.1 from training. Duration: 0.836375 2023-10-03 08:53:36,089 WARNING [train.py:1204] (2/4) Exclude cut with ID _44718_210_5816_1_1534050051869_6469030_497-311999-0_sp1.1 from training. Duration: 0.936375 2023-10-03 08:53:37,521 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_34886_210_10633_1_1533203882_1755661_91-194765-0_sp1.1 from training. Duration: 0.936375 2023-10-03 08:53:38,863 WARNING [train.py:1204] (2/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_310-26186-0_sp0.9 from training. Duration: 0.77775 2023-10-03 08:53:40,246 WARNING [train.py:1204] (2/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_416-257730-0 from training. Duration: 0.65 2023-10-03 08:53:40,258 WARNING [train.py:1204] (2/4) Exclude cut with ID _9389_210_1664_1_1529217337301_5289269_209-154760-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 08:53:43,578 WARNING [train.py:1204] (2/4) Exclude cut with ID _4092_210_2151_1_1526643044449_3135759_227-213972-0 from training. Duration: 0.54 2023-10-03 08:53:43,607 WARNING [train.py:1204] (2/4) Exclude cut with ID IC0679W0044-335852 from training. Duration: 0.768 2023-10-03 08:53:43,677 WARNING [train.py:1204] (2/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_42-186162-0 from training. Duration: 0.92 2023-10-03 08:53:43,683 WARNING [train.py:1204] (2/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_302-2550-0_sp0.9 from training. Duration: 0.9 2023-10-03 08:53:43,696 WARNING [train.py:1204] (2/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_356-31216-0_sp0.9 from training. Duration: 0.8333125 2023-10-03 08:53:45,543 WARNING [train.py:1204] (2/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_125-322172-0_sp1.1 from training. Duration: 0.8454375 2023-10-03 08:53:51,083 WARNING [train.py:1204] (2/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_141-89727-0_sp0.9 from training. Duration: 0.788875 2023-10-03 08:53:52,400 WARNING [train.py:1204] (2/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_647-337784-0_sp1.1 from training. Duration: 0.97275 2023-10-03 08:53:52,418 WARNING [train.py:1204] (2/4) Exclude cut with ID _6493_210_1747_1_1528459210424_4012470_513-140967-0_sp1.1 from training. Duration: 0.7181875 2023-10-03 08:53:54,069 INFO [scaling.py:1118] (2/4) WithLoss: name=encoder.encoders.3.encoder.layers.2.self_attn_weights, loss-sum=0.000e+00 2023-10-03 08:53:54,543 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.3.encoder.layers.0.nonlin_attention.whiten2, num_groups=1, num_channels=512, metric=6.91 vs. limit=15.0 2023-10-03 08:53:55,892 WARNING [train.py:1204] (2/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_87-266334-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 08:53:56,126 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.feed_forward3.hidden_balancer.prob, batch_count=1209286.6666666667, ans=0.125 2023-10-03 08:53:57,208 WARNING [train.py:1204] (2/4) Exclude cut with ID _26528_210_9070_1_1532570410842_3851310_526-261338-0_sp1.1 from training. Duration: 0.92725 2023-10-03 08:53:57,251 WARNING [train.py:1204] (2/4) Exclude cut with ID _14224_210_4381_1_1530529062037_4264010_380-343670-0 from training. Duration: 0.88 2023-10-03 08:53:58,481 WARNING [train.py:1204] (2/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_449-15508-0_sp1.1 from training. Duration: 0.7454375 2023-10-03 08:53:59,869 WARNING [train.py:1204] (2/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_372-6869-0_sp0.9 from training. Duration: 0.488875 2023-10-03 08:53:59,937 WARNING [train.py:1204] (2/4) Exclude cut with ID _26907_210_7234_1_1532221740694_3205263_337-2494-0_sp0.9 from training. Duration: 0.97775 2023-10-03 08:54:02,793 WARNING [train.py:1204] (2/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_10-100367-0_sp1.1 from training. Duration: 0.8909375 2023-10-03 08:54:02,828 WARNING [train.py:1204] (2/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_47-167373-0 from training. Duration: 0.92 2023-10-03 08:54:02,882 WARNING [train.py:1204] (2/4) Exclude cut with ID _28323_210_16187_1_1532480395667_4100300_628-282179-0_sp1.1 from training. Duration: 0.97275 2023-10-03 08:54:08,789 WARNING [train.py:1204] (2/4) Exclude cut with ID _20455_210_5219_1_1531565996775_8098100_213-280898-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 08:54:09,132 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.ff3_skip_rate, batch_count=1209353.3333333333, ans=0.0 2023-10-03 08:54:10,085 WARNING [train.py:1204] (2/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_383-316000-0_sp1.1 from training. Duration: 0.8181875 2023-10-03 08:54:10,121 WARNING [train.py:1204] (2/4) Exclude cut with ID _18513_210_9206_1_1531618215223_3862620_16-78180-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 08:54:12,257 WARNING [train.py:1204] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_330-327861-0_sp1.1 from training. Duration: 0.6090625 2023-10-03 08:54:16,346 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.ff3_skip_rate, batch_count=1209353.3333333333, ans=0.0 2023-10-03 08:54:16,718 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.3.encoder.layers.2.feed_forward1.out_whiten, num_groups=1, num_channels=512, metric=6.52 vs. limit=15.0 2023-10-03 08:54:17,399 WARNING [train.py:1204] (2/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_356-31216-0 from training. Duration: 0.75 2023-10-03 08:54:17,435 WARNING [train.py:1204] (2/4) Exclude cut with ID _25248_210_8750_1_1532062825941_5554180_255-148539-0_sp1.1 from training. Duration: 0.963625 2023-10-03 08:54:17,476 WARNING [train.py:1204] (2/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_269-154726-0_sp1.1 from training. Duration: 0.7818125 2023-10-03 08:54:18,973 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.balancer1.prob, batch_count=1209353.3333333333, ans=0.125 2023-10-03 08:54:20,140 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7002_210_1794_1_1527939683_5157286_163-87552-0_sp1.1 from training. Duration: 0.7818125 2023-10-03 08:54:20,167 WARNING [train.py:1204] (2/4) Exclude cut with ID _54871_210_22356_1_1534665798253_3856359_97-151896-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 08:54:23,025 WARNING [train.py:1204] (2/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_207-7026-0_sp1.1 from training. Duration: 0.97275 2023-10-03 08:54:23,043 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_92-46009-0_sp0.9 from training. Duration: 0.6 2023-10-03 08:54:25,518 INFO [train.py:1046] (2/4) Epoch 35, batch 800, loss[loss=0.1641, simple_loss=0.2388, pruned_loss=0.04472, over 23329.00 frames. ], tot_loss[loss=0.1601, simple_loss=0.2392, pruned_loss=0.04054, over 4630160.33 frames. ], batch size: 285, lr: 2.93e-03, grad_scale: 32.0 2023-10-03 08:54:31,862 WARNING [train.py:1204] (2/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_122-178847-0_sp1.1 from training. Duration: 0.97275 2023-10-03 08:54:31,866 WARNING [train.py:1204] (2/4) Exclude cut with ID _44778_210_2054_1_1533798127203_7443090_94-348956-0_sp1.1 from training. Duration: 0.92725 2023-10-03 08:54:33,617 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.bypass_mid.scale_min, batch_count=1209420.0, ans=0.2 2023-10-03 08:54:34,697 WARNING [train.py:1204] (2/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_1018-244089-0_sp1.1 from training. Duration: 0.963625 2023-10-03 08:54:34,709 WARNING [train.py:1204] (2/4) Exclude cut with ID _56235_210_10640_1_1534557335235_4161099_87-118423-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 08:54:34,814 WARNING [train.py:1204] (2/4) Exclude cut with ID _53713_210_24116_1_1534314487762_7416751_504-212292-0_sp1.1 from training. Duration: 0.92725 2023-10-03 08:54:34,834 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_446-150327-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 08:54:37,474 WARNING [train.py:1204] (2/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_682-245989-0_sp1.1 from training. Duration: 0.92725 2023-10-03 08:54:40,303 WARNING [train.py:1204] (2/4) Exclude cut with ID _35723_210_1794_1_1533088341043_628250_8-234524-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 08:54:42,288 WARNING [train.py:1204] (2/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_64-248359-0_sp0.9 from training. Duration: 0.7666875 2023-10-03 08:54:43,631 WARNING [train.py:1204] (2/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_49-97158-0 from training. Duration: 0.76 2023-10-03 08:54:45,744 WARNING [train.py:1204] (2/4) Exclude cut with ID _24314_210_4915_1_1532135078116_7170179_743-915-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 08:54:47,185 WARNING [train.py:1204] (2/4) Exclude cut with ID _61194_210_18628_1_1535007555628_3750909_353-323468-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 08:54:47,215 WARNING [train.py:1204] (2/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_387-131538-0_sp1.1 from training. Duration: 0.736375 2023-10-03 08:54:47,243 WARNING [train.py:1204] (2/4) Exclude cut with ID _44424_210_18163_1_1533623394848_1213110_79-187603-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 08:54:48,625 WARNING [train.py:1204] (2/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_86-19601-0 from training. Duration: 0.85 2023-10-03 08:54:48,654 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_50050_210_16223_1_1535435796_7290138_103-283529-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 08:54:48,683 WARNING [train.py:1204] (2/4) Exclude cut with ID _13913_210_9994_1_1530579615187_3383897_473-277682-0 from training. Duration: 0.39 2023-10-03 08:54:52,751 WARNING [train.py:1204] (2/4) Exclude cut with ID _42326_210_7770_1_1533814282531_3707269_368-68029-0_sp1.1 from training. Duration: 0.92725 2023-10-03 08:54:54,128 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_41428_210_2161_1_1533774184_3992532_446-339645-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 08:54:56,891 WARNING [train.py:1204] (2/4) Exclude cut with ID _69584_210_5219_1_1535785133775_4044450_325-7656-0_sp1.1 from training. Duration: 0.7818125 2023-10-03 08:54:56,899 WARNING [train.py:1204] (2/4) Exclude cut with ID _21632_210_2253_1_1531994001765_3744140_134-184387-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 08:55:00,307 WARNING [train.py:1204] (2/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_550-184707-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 08:55:00,323 WARNING [train.py:1204] (2/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_106-341780-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 08:55:04,546 WARNING [train.py:1204] (2/4) Exclude cut with ID _45871_210_5665_1_1533864614245_3296060_253-132552-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 08:55:05,829 WARNING [train.py:1204] (2/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_395-310220-0_sp1.1 from training. Duration: 0.8090625 2023-10-03 08:55:05,843 WARNING [train.py:1204] (2/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_795-33252-0_sp0.9 from training. Duration: 0.411125 2023-10-03 08:55:07,355 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9002W0003-495962 from training. Duration: 0.946 2023-10-03 08:55:07,375 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_43-71989-0 from training. Duration: 0.57 2023-10-03 08:55:08,620 WARNING [train.py:1204] (2/4) Exclude cut with ID _8699_210_2069_1_1528630346831_3960409_155-39766-0_sp1.1 from training. Duration: 0.7090625 2023-10-03 08:55:08,642 WARNING [train.py:1204] (2/4) Exclude cut with ID _27894_210_12392_1_1532415496078_6902439_788-232684-0_sp1.1 from training. Duration: 0.97275 2023-10-03 08:55:10,077 WARNING [train.py:1204] (2/4) Exclude cut with ID _56494_210_4220_1_1535284837625_3033169_10-39296-0_sp1.1 from training. Duration: 0.92725 2023-10-03 08:55:10,105 WARNING [train.py:1204] (2/4) Exclude cut with ID _40901_210_5665_1_1533363789958_3856130_341-20819-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 08:55:15,959 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9029W0435-509822 from training. Duration: 0.896 2023-10-03 08:55:16,017 WARNING [train.py:1204] (2/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_51-117576-0 from training. Duration: 0.4 2023-10-03 08:55:19,557 WARNING [train.py:1204] (2/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_197-28164-0_sp1.1 from training. Duration: 0.72725 2023-10-03 08:55:21,083 WARNING [train.py:1204] (2/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_145-137790-0_sp1.1 from training. Duration: 0.7545625 2023-10-03 08:55:25,160 WARNING [train.py:1204] (2/4) Exclude cut with ID _47790_210_16187_1_1533862814066_4207519_2-39653-0_sp1.1 from training. Duration: 0.8909375 2023-10-03 08:55:29,253 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_3780_210_2087_1_1526467760_2130483_186-208240-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 08:55:29,339 WARNING [train.py:1204] (2/4) Exclude cut with ID _24055_210_12651_1_1532310907143_976979_92-59356-0 from training. Duration: 0.91 2023-10-03 08:55:30,682 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_3910_210_1794_1_1526742306_334530_28-238909-0_sp1.1 from training. Duration: 0.736375 2023-10-03 08:55:32,760 WARNING [train.py:1204] (2/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_269-154726-0 from training. Duration: 0.86 2023-10-03 08:55:38,313 WARNING [train.py:1204] (2/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_326-165900-0_sp0.9 from training. Duration: 0.7666875 2023-10-03 08:55:39,646 INFO [train.py:1046] (2/4) Epoch 35, batch 850, loss[loss=0.2116, simple_loss=0.2798, pruned_loss=0.07169, over 19101.00 frames. ], tot_loss[loss=0.161, simple_loss=0.2402, pruned_loss=0.04089, over 4649720.32 frames. ], batch size: 388, lr: 2.93e-03, grad_scale: 32.0 2023-10-03 08:55:39,794 WARNING [train.py:1204] (2/4) Exclude cut with ID _5294_210_1747_1_1527249643928_3696179_470-276077-0_sp1.1 from training. Duration: 0.963625 2023-10-03 08:55:41,085 WARNING [train.py:1204] (2/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_372-131163-0 from training. Duration: 0.94 2023-10-03 08:55:41,149 WARNING [train.py:1204] (2/4) Exclude cut with ID _45297_210_2721_1_1533715351381_3264949_351-147136-0_sp1.1 from training. Duration: 0.7909375 2023-10-03 08:55:42,453 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_1072-12671-0_sp1.1 from training. Duration: 0.92725 2023-10-03 08:55:43,788 WARNING [train.py:1204] (2/4) Exclude cut with ID _15133_210_6228_1_1530794512074_3080379_54-198193-0 from training. Duration: 0.94 2023-10-03 08:55:43,818 WARNING [train.py:1204] (2/4) Exclude cut with ID _30196_210_1664_1_1533708496522_7074140_1194-270988-0_sp1.1 from training. Duration: 0.97275 2023-10-03 08:55:46,871 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.632e+02 1.835e+02 2.028e+02 2.413e+02 3.992e+02, threshold=4.056e+02, percent-clipped=0.0 2023-10-03 08:55:46,964 WARNING [train.py:1204] (2/4) Exclude cut with ID _50310_210_15488_1_1534068043345_3641080_289-193637-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 08:55:47,070 WARNING [train.py:1204] (2/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_313-263495-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 08:55:48,443 WARNING [train.py:1204] (2/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_18-130336-0_sp1.1 from training. Duration: 0.7181875 2023-10-03 08:55:50,235 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_47896_210_20508_1_1534492518_5958868_646-287209-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 08:55:52,126 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_294-163793-0 from training. Duration: 0.82 2023-10-03 08:55:52,156 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_8-319957-0 from training. Duration: 0.96 2023-10-03 08:55:52,159 WARNING [train.py:1204] (2/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_1211-226066-0 from training. Duration: 0.76 2023-10-03 08:55:52,295 WARNING [train.py:1204] (2/4) Exclude cut with ID _4779_210_2084_1_1527318022944_3914893_500-285013-0_sp0.9 from training. Duration: 0.7666875 2023-10-03 08:55:52,312 WARNING [train.py:1204] (2/4) Exclude cut with ID _22826_210_9994_1_1531908201522_3279169_347-232268-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 08:55:55,299 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.conv_module1.balancer1.prob, batch_count=1209820.0, ans=0.125 2023-10-03 08:55:56,309 WARNING [train.py:1204] (2/4) Exclude cut with ID _29877_210_6228_1_1532397791106_5276149_111-55056-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 08:55:56,343 WARNING [train.py:1204] (2/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_812-271646-0_sp1.1 from training. Duration: 0.92725 2023-10-03 08:55:56,373 WARNING [train.py:1204] (2/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_235-262124-0_sp1.1 from training. Duration: 0.6090625 2023-10-03 08:56:00,591 WARNING [train.py:1204] (2/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_14-16530-0_sp1.1 from training. Duration: 0.97275 2023-10-03 08:56:01,963 WARNING [train.py:1204] (2/4) Exclude cut with ID _44718_210_5816_1_1534050051869_6469030_568-304512-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 08:56:01,990 WARNING [train.py:1204] (2/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_1033-274376-0 from training. Duration: 0.87 2023-10-03 08:56:06,623 WARNING [train.py:1204] (2/4) Exclude cut with ID _4681_210_3525_1_1527251040321_1235713_123-24428-0 from training. Duration: 0.48 2023-10-03 08:56:09,310 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_37818_210_4238_1_1533620910_4720083_251-288764-0_sp1.1 from training. Duration: 0.97275 2023-10-03 08:56:10,650 WARNING [train.py:1204] (2/4) Exclude cut with ID _65585_210_12210_1_1535450451584_3764098_112-294301-0 from training. Duration: 0.88 2023-10-03 08:56:14,725 WARNING [train.py:1204] (2/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_329-113173-0 from training. Duration: 0.62 2023-10-03 08:56:16,225 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6767_210_3273_1_1527855783_1718932_173-256601-0 from training. Duration: 0.8 2023-10-03 08:56:17,759 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9266W0001-627677 from training. Duration: 0.896 2023-10-03 08:56:17,769 WARNING [train.py:1204] (2/4) Exclude cut with ID _49643_210_7989_1_1534640244224_3939031_611-185515-0_sp1.1 from training. Duration: 0.8545625 2023-10-03 08:56:17,773 WARNING [train.py:1204] (2/4) Exclude cut with ID _31649_210_6228_1_1532584608063_4091078_687-237771-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 08:56:17,786 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_12868_210_6753_1_1530702479_3610564_148-172861-0_sp0.9 from training. Duration: 0.5555625 2023-10-03 08:56:21,535 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_5129_210_6400_1_1527161005_3579054_107-69738-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 08:56:22,937 WARNING [train.py:1204] (2/4) Exclude cut with ID _40137_210_12210_1_1533290484046_3795180_36-301164-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 08:56:22,968 WARNING [train.py:1204] (2/4) Exclude cut with ID _7537_210_2084_1_1528502395431_3924359_113-199933-0 from training. Duration: 0.59 2023-10-03 08:56:26,184 WARNING [train.py:1204] (2/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_547-5424-0_sp1.1 from training. Duration: 0.8545625 2023-10-03 08:56:26,230 WARNING [train.py:1204] (2/4) Exclude cut with ID _43552_210_8913_1_1533726031059_3523429_147-284024-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 08:56:26,770 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.4.encoder.layers.1.self_attn_weights.whiten_keys, num_groups=4, num_channels=128, metric=2.19 vs. limit=6.0 2023-10-03 08:56:27,584 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_294-163793-0_sp1.1 from training. Duration: 0.7454375 2023-10-03 08:56:27,611 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_3780_210_2087_1_1526467760_2130483_170-259044-0_sp1.1 from training. Duration: 0.636375 2023-10-03 08:56:30,365 WARNING [train.py:1204] (2/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_726-270780-0_sp1.1 from training. Duration: 0.7818125 2023-10-03 08:56:30,446 WARNING [train.py:1204] (2/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_214-291369-0_sp1.1 from training. Duration: 0.57275 2023-10-03 08:56:31,804 WARNING [train.py:1204] (2/4) Exclude cut with ID _21881_210_3525_1_1531825242757_3798380_247-102591-0 from training. Duration: 0.98 2023-10-03 08:56:31,941 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder_embed.convnext.layerdrop_rate, batch_count=1209953.3333333333, ans=0.015 2023-10-03 08:56:34,617 WARNING [train.py:1204] (2/4) Exclude cut with ID _8064_210_5399_1_1528372299291_4282973_18-174647-0_sp1.1 from training. Duration: 0.82725 2023-10-03 08:56:34,622 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_415-52611-0_sp1.1 from training. Duration: 0.936375 2023-10-03 08:56:35,949 WARNING [train.py:1204] (2/4) Exclude cut with ID _8394_210_6702_1_1528590504316_3540189_379-49457-0_sp0.9 from training. Duration: 0.9666875 2023-10-03 08:56:35,970 WARNING [train.py:1204] (2/4) Exclude cut with ID _64521_210_14699_1_1535243427259_3815328_368-195106-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 08:56:37,743 WARNING [train.py:1204] (2/4) Exclude cut with ID _28394_210_8913_1_1532430049518_3650118_51-334124-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 08:56:40,448 WARNING [train.py:1204] (2/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_317-161769-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 08:56:41,870 WARNING [train.py:1204] (2/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_40-129756-0_sp0.9 from training. Duration: 0.9 2023-10-03 08:56:43,262 WARNING [train.py:1204] (2/4) Exclude cut with ID _3067_210_2667_1_1526212974640_4503168_119-175776-0_sp0.9 from training. Duration: 0.82225 2023-10-03 08:56:43,306 WARNING [train.py:1204] (2/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_1004-304915-0_sp1.1 from training. Duration: 0.92725 2023-10-03 08:56:44,613 WARNING [train.py:1204] (2/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_359-118926-0_sp0.9 from training. Duration: 0.8 2023-10-03 08:56:54,424 INFO [train.py:1046] (2/4) Epoch 35, batch 900, loss[loss=0.1838, simple_loss=0.2543, pruned_loss=0.05659, over 23374.00 frames. ], tot_loss[loss=0.1612, simple_loss=0.2405, pruned_loss=0.04094, over 4674524.55 frames. ], batch size: 134, lr: 2.93e-03, grad_scale: 16.0 2023-10-03 08:56:54,478 WARNING [train.py:1204] (2/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_51-200220-0_sp0.9 from training. Duration: 0.62225 2023-10-03 08:56:54,561 WARNING [train.py:1204] (2/4) Exclude cut with ID _46683_210_9642_1_1533730913492_4591116_530-326202-0_sp1.1 from training. Duration: 0.936375 2023-10-03 08:56:54,607 WARNING [train.py:1204] (2/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_567-135114-0 from training. Duration: 0.87 2023-10-03 08:56:55,901 WARNING [train.py:1204] (2/4) Exclude cut with ID _66863_210_18053_1_1535698758334_8590319_1092-271784-0_sp1.1 from training. Duration: 0.87275 2023-10-03 08:56:55,923 WARNING [train.py:1204] (2/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_762-282614-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 08:56:57,313 WARNING [train.py:1204] (2/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_280-120942-0 from training. Duration: 0.77 2023-10-03 08:57:04,259 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_31583_210_3852_1_1532595851_4024413_27-170931-0_sp1.1 from training. Duration: 0.963625 2023-10-03 08:57:05,679 WARNING [train.py:1204] (2/4) Exclude cut with ID _12369_210_4381_1_1530356078581_4388520_266-271628-0_sp1.1 from training. Duration: 0.92725 2023-10-03 08:57:06,982 WARNING [train.py:1204] (2/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_41-31157-0 from training. Duration: 0.57 2023-10-03 08:57:10,393 WARNING [train.py:1204] (2/4) Exclude cut with ID _21878_210_3525_1_1531738772002_7264330_113-18163-0_sp1.1 from training. Duration: 0.8090625 2023-10-03 08:57:10,419 WARNING [train.py:1204] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_147-111137-0 from training. Duration: 0.95 2023-10-03 08:57:11,861 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1083-220122-0_sp1.1 from training. Duration: 0.463625 2023-10-03 08:57:13,149 WARNING [train.py:1204] (2/4) Exclude cut with ID _14884_210_2252_1_1530707459689_5561250_591-173406-0_sp1.1 from training. Duration: 0.87275 2023-10-03 08:57:13,161 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_24677_210_1794_1_1532002693_4341296_149-303793-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 08:57:13,404 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.nonlin_attention.balancer.prob, batch_count=1210153.3333333333, ans=0.125 2023-10-03 08:57:14,477 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_12868_210_6753_1_1530702479_3610564_133-564-0_sp1.1 from training. Duration: 0.7181875 2023-10-03 08:57:14,504 WARNING [train.py:1204] (2/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_625-338546-0_sp0.9 from training. Duration: 0.97775 2023-10-03 08:57:22,862 WARNING [train.py:1204] (2/4) Exclude cut with ID _25882_210_3036_1_1532775665815_6479510_624-320169-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 08:57:22,875 WARNING [train.py:1204] (2/4) Exclude cut with ID _20622_210_3852_1_1531622427080_1093839_127-325326-0_sp1.1 from training. Duration: 0.92725 2023-10-03 08:57:23,996 WARNING [train.py:1204] (2/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_464-33931-0_sp1.1 from training. Duration: 0.4909375 2023-10-03 08:57:27,290 WARNING [train.py:1204] (2/4) Exclude cut with ID _66112_210_10635_1_1535590236305_3801484_120-304588-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 08:57:30,135 WARNING [train.py:1204] (2/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_110-130961-0 from training. Duration: 0.47 2023-10-03 08:57:32,840 WARNING [train.py:1204] (2/4) Exclude cut with ID _16908_210_5724_1_1531119157248_3772700_64-54020-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 08:57:37,015 WARNING [train.py:1204] (2/4) Exclude cut with ID _4068_210_2629_1_1526697350395_1546259_224-288693-0_sp1.1 from training. Duration: 0.536375 2023-10-03 08:57:38,286 WARNING [train.py:1204] (2/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_191-41023-0_sp0.9 from training. Duration: 0.788875 2023-10-03 08:57:38,344 WARNING [train.py:1204] (2/4) Exclude cut with ID ID0205W0255-783002 from training. Duration: 0.958 2023-10-03 08:57:38,566 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.balancer2.prob, batch_count=1210286.6666666667, ans=0.125 2023-10-03 08:57:39,767 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_162-262717-0 from training. Duration: 0.83 2023-10-03 08:57:44,677 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.balancer_ff2.min_abs, batch_count=1210286.6666666667, ans=0.1 2023-10-03 08:57:45,853 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_224-322394-0_sp1.1 from training. Duration: 0.6 2023-10-03 08:57:45,860 WARNING [train.py:1204] (2/4) Exclude cut with ID _4866_210_2577_1_1527404412284_4364659_133-1696-0_sp1.1 from training. Duration: 0.9 2023-10-03 08:57:47,228 WARNING [train.py:1204] (2/4) Exclude cut with ID _6275_210_2084_1_1528110062213_7669049_973-198085-0_sp1.1 from training. Duration: 0.6818125 2023-10-03 08:57:51,437 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.balancer1.prob, batch_count=1210286.6666666667, ans=0.125 2023-10-03 08:57:53,893 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_47449_210_5399_1_1534076445_4006929_410-237806-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 08:57:53,908 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7543_210_6753_1_1528543318_4552_1-85072-0_sp0.9 from training. Duration: 0.92225 2023-10-03 08:57:55,221 WARNING [train.py:1204] (2/4) Exclude cut with ID _65263_210_22356_1_1535763598691_3921037_108-21902-0 from training. Duration: 0.99 2023-10-03 08:57:55,226 WARNING [train.py:1204] (2/4) Exclude cut with ID _65585_210_12210_1_1535450451584_3764098_309-174818-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 08:57:58,589 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_309-65177-0 from training. Duration: 0.88 2023-10-03 08:58:01,227 WARNING [train.py:1204] (2/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_363-185703-0_sp1.1 from training. Duration: 0.863625 2023-10-03 08:58:01,254 WARNING [train.py:1204] (2/4) Exclude cut with ID _42326_210_7770_1_1533814282531_3707269_442-238692-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 08:58:02,622 WARNING [train.py:1204] (2/4) Exclude cut with ID _69884_210_9771_1_1535846392818_4166169_226-254446-0_sp1.1 from training. Duration: 0.936375 2023-10-03 08:58:02,645 WARNING [train.py:1204] (2/4) Exclude cut with ID _29853_210_7497_1_1532505600851_6642110_287-299903-0_sp1.1 from training. Duration: 0.963625 2023-10-03 08:58:06,789 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1015-343884-0 from training. Duration: 0.62 2023-10-03 08:58:06,831 WARNING [train.py:1204] (2/4) Exclude cut with ID ID0305W0008-833402 from training. Duration: 0.91 2023-10-03 08:58:08,142 INFO [train.py:1046] (2/4) Epoch 35, batch 950, loss[loss=0.1617, simple_loss=0.2503, pruned_loss=0.03659, over 24534.00 frames. ], tot_loss[loss=0.1621, simple_loss=0.2414, pruned_loss=0.04144, over 4676904.05 frames. ], batch size: 71, lr: 2.93e-03, grad_scale: 16.0 2023-10-03 08:58:08,255 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6673_210_3273_1_1527767790_3173240_351-167954-0_sp1.1 from training. Duration: 0.436375 2023-10-03 08:58:09,539 WARNING [train.py:1204] (2/4) Exclude cut with ID _6456_210_1534_1_1527681634212_3181049_345-252877-0 from training. Duration: 0.64 2023-10-03 08:58:11,018 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_13-205182-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 08:58:14,273 WARNING [train.py:1204] (2/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_427-208901-0 from training. Duration: 0.68 2023-10-03 08:58:17,058 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.651e+02 2.074e+02 2.262e+02 2.631e+02 4.033e+02, threshold=4.525e+02, percent-clipped=0.0 2023-10-03 08:58:17,179 WARNING [train.py:1204] (2/4) Exclude cut with ID _49039_210_6315_1_1533967537541_3846118_9-148921-0_sp1.1 from training. Duration: 0.97275 2023-10-03 08:58:20,438 WARNING [train.py:1204] (2/4) Exclude cut with ID _59493_210_17282_1_1535078419077_3225879_143-70317-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 08:58:20,465 WARNING [train.py:1204] (2/4) Exclude cut with ID _27967_210_12140_1_1532588391859_8419130_1056-236369-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 08:58:22,473 WARNING [train.py:1204] (2/4) Exclude cut with ID _6537_210_1883_1_1527987559710_3597129_382-133062-0_sp0.9 from training. Duration: 0.7555625 2023-10-03 08:58:23,890 WARNING [train.py:1204] (2/4) Exclude cut with ID ID0376W0255-869957 from training. Duration: 0.9510625 2023-10-03 08:58:26,743 WARNING [train.py:1204] (2/4) Exclude cut with ID _49371_210_12071_1_1533952761021_3812190_483-240691-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 08:58:28,663 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_12868_210_6753_1_1530701932_530275_18-76322-0_sp1.1 from training. Duration: 0.7909375 2023-10-03 08:58:28,747 WARNING [train.py:1204] (2/4) Exclude cut with ID _53950_210_5816_1_1534417264919_3862951_367-26344-0_sp1.1 from training. Duration: 0.97275 2023-10-03 08:58:30,064 WARNING [train.py:1204] (2/4) Exclude cut with ID _5444_210_2151_1_1527418808464_5312210_634-194174-0_sp1.1 from training. Duration: 0.8818125 2023-10-03 08:58:30,095 WARNING [train.py:1204] (2/4) Exclude cut with ID _56494_210_4220_1_1535284837625_3033169_69-138150-0 from training. Duration: 0.94 2023-10-03 08:58:31,626 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7543_210_6753_1_1528544010_1110402_25-183860-0_sp0.9 from training. Duration: 0.52225 2023-10-03 08:58:32,989 WARNING [train.py:1204] (2/4) Exclude cut with ID _40284_210_2084_1_1533513679814_3704279_287-191918-0_sp1.1 from training. Duration: 0.963625 2023-10-03 08:58:34,396 WARNING [train.py:1204] (2/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_291-270881-0 from training. Duration: 0.97 2023-10-03 08:58:34,446 WARNING [train.py:1204] (2/4) Exclude cut with ID _12555_210_8750_1_1530075594402_3780239_449-267986-0_sp0.9 from training. Duration: 0.92225 2023-10-03 08:58:38,591 WARNING [train.py:1204] (2/4) Exclude cut with ID _3992_210_1747_1_1526644816031_3731060_268-64520-0_sp1.1 from training. Duration: 0.963625 2023-10-03 08:58:38,598 WARNING [train.py:1204] (2/4) Exclude cut with ID _39411_210_5986_1_1533538825030_3343649_173-238248-0_sp0.9 from training. Duration: 0.92225 2023-10-03 08:58:38,628 WARNING [train.py:1204] (2/4) Exclude cut with ID _32704_210_8954_1_1533017488840_6747370_828-313097-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 08:58:40,033 WARNING [train.py:1204] (2/4) Exclude cut with ID _26907_210_7234_1_1532221740694_3205263_337-2494-0 from training. Duration: 0.88 2023-10-03 08:58:42,611 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_229-45300-0_sp1.1 from training. Duration: 0.5181875 2023-10-03 08:58:44,570 WARNING [train.py:1204] (2/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_300-27235-0_sp1.1 from training. Duration: 0.7909375 2023-10-03 08:58:44,667 WARNING [train.py:1204] (2/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_48-338976-0_sp1.1 from training. Duration: 0.8090625 2023-10-03 08:58:50,656 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_13091_210_7925_1_1530609447_8987354_792-195353-0_sp1.1 from training. Duration: 0.936375 2023-10-03 08:58:50,661 WARNING [train.py:1204] (2/4) Exclude cut with ID _56524_210_9642_1_1534572011779_3619170_311-54500-0_sp1.1 from training. Duration: 0.97275 2023-10-03 08:58:54,068 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7543_210_6753_1_1528544010_1110402_25-183860-0 from training. Duration: 0.47 2023-10-03 08:58:54,262 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.feed_forward2.hidden_balancer.prob, batch_count=1210620.0, ans=0.125 2023-10-03 08:58:56,749 WARNING [train.py:1204] (2/4) Exclude cut with ID 2411-132532-0017-82279-0_sp1.1 from training. Duration: 0.9681875 2023-10-03 08:58:56,756 WARNING [train.py:1204] (2/4) Exclude cut with ID _14170_210_5219_1_1530525949868_4469650_272-249985-0_sp0.9 from training. Duration: 0.7444375 2023-10-03 08:58:56,928 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.conv_skip_rate, batch_count=1210620.0, ans=0.0 2023-10-03 08:58:58,122 WARNING [train.py:1204] (2/4) Exclude cut with ID _55644_210_11856_1_1534593599582_4103719_320-229006-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 08:58:59,538 WARNING [train.py:1204] (2/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_177-100554-0_sp1.1 from training. Duration: 0.963625 2023-10-03 08:58:59,540 WARNING [train.py:1204] (2/4) Exclude cut with ID _14998_210_5219_1_1530705582080_7474970_787-294416-0_sp1.1 from training. Duration: 0.7454375 2023-10-03 08:59:04,438 WARNING [train.py:1204] (2/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_318-59097-0 from training. Duration: 0.76 2023-10-03 08:59:04,510 WARNING [train.py:1204] (2/4) Exclude cut with ID _12369_210_4381_1_1530356078581_4388520_480-13146-0_sp1.1 from training. Duration: 0.77275 2023-10-03 08:59:07,233 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_469-338558-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 08:59:08,050 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.1.encoder.layers.1.conv_module2.whiten, num_groups=1, num_channels=256, metric=11.11 vs. limit=15.0 2023-10-03 08:59:08,569 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_40896_210_15710_1_1533433956_7100802_367-126897-0_sp1.1 from training. Duration: 0.963625 2023-10-03 08:59:09,952 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6545_210_2040_1_1527774670_4245258_379-14182-0 from training. Duration: 0.8 2023-10-03 08:59:09,964 WARNING [train.py:1204] (2/4) Exclude cut with ID _21631_210_2253_1_1531907880778_3160339_197-79498-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 08:59:09,968 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_12868_210_6753_1_1530702479_3610564_416-293746-0_sp1.1 from training. Duration: 0.8181875 2023-10-03 08:59:10,004 WARNING [train.py:1204] (2/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_52-211683-0 from training. Duration: 0.9 2023-10-03 08:59:12,947 WARNING [train.py:1204] (2/4) Exclude cut with ID _48678_210_5663_1_1533988830947_3769199_306-255886-0_sp0.9 from training. Duration: 0.8555625 2023-10-03 08:59:14,339 WARNING [train.py:1204] (2/4) Exclude cut with ID _65134_210_14763_1_1535335393985_3505368_296-24733-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 08:59:20,235 WARNING [train.py:1204] (2/4) Exclude cut with ID _14898_210_9206_1_1530766812377_4177190_335-99364-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 08:59:21,641 WARNING [train.py:1204] (2/4) Exclude cut with ID _54871_210_22356_1_1534665798253_3856359_19-342632-0 from training. Duration: 0.91 2023-10-03 08:59:21,658 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_53071_210_16554_1_1534490405_2954066_265-170472-0 from training. Duration: 0.83 2023-10-03 08:59:23,541 INFO [train.py:1046] (2/4) Epoch 35, batch 1000, loss[loss=0.1646, simple_loss=0.2472, pruned_loss=0.04098, over 24084.00 frames. ], tot_loss[loss=0.1616, simple_loss=0.2404, pruned_loss=0.04143, over 4686026.60 frames. ], batch size: 80, lr: 2.93e-03, grad_scale: 16.0 2023-10-03 08:59:25,673 WARNING [train.py:1204] (2/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_189-298172-0_sp1.1 from training. Duration: 0.963625 2023-10-03 08:59:27,080 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.bypass.scale_min, batch_count=1210753.3333333333, ans=0.2 2023-10-03 08:59:29,623 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7615_210_4211_1_1528110965_4580160_72-235211-0 from training. Duration: 0.76 2023-10-03 08:59:31,006 WARNING [train.py:1204] (2/4) Exclude cut with ID _47790_210_16187_1_1533862814066_4207519_155-90874-0_sp1.1 from training. Duration: 0.936375 2023-10-03 08:59:34,526 WARNING [train.py:1204] (2/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_164-216310-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 08:59:35,947 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_46013_210_7234_1_1533776362_3744032_463-52378-0 from training. Duration: 0.86 2023-10-03 08:59:35,955 WARNING [train.py:1204] (2/4) Exclude cut with ID _61133_210_9969_1_1535196777227_3696409_333-270325-0 from training. Duration: 0.93 2023-10-03 08:59:36,753 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.feed_forward1.out_whiten.whitening_limit, batch_count=1210753.3333333333, ans=15.0 2023-10-03 08:59:40,273 WARNING [train.py:1204] (2/4) Exclude cut with ID _5172_210_1883_1_1527246062567_3577219_18-101380-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 08:59:40,282 WARNING [train.py:1204] (2/4) Exclude cut with ID _30800_210_22382_1_1532499347904_4515959_577-236858-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 08:59:42,954 WARNING [train.py:1204] (2/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_1006-177345-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 08:59:44,385 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6291_210_3273_1_1527683649_1078498_4-266348-0 from training. Duration: 0.78 2023-10-03 08:59:47,220 WARNING [train.py:1204] (2/4) Exclude cut with ID _55857_210_2346_1_1534644007831_6898209_745-98391-0 from training. Duration: 0.76 2023-10-03 08:59:49,158 WARNING [train.py:1204] (2/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_375-244976-0 from training. Duration: 0.75 2023-10-03 08:59:49,195 WARNING [train.py:1204] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_4-248404-0_sp1.1 from training. Duration: 0.97275 2023-10-03 08:59:50,543 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8453_210_4211_1_1528502940_8562730_596-306820-0 from training. Duration: 0.97 2023-10-03 08:59:52,649 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.1.bypass_mid.scale_min, batch_count=1210886.6666666667, ans=0.2 2023-10-03 08:59:53,201 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.3.encoder.layers.2.conv_module1.whiten, num_groups=1, num_channels=512, metric=4.87 vs. limit=15.0 2023-10-03 08:59:53,877 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_211-339622-0_sp0.9 from training. Duration: 0.5 2023-10-03 08:59:53,904 WARNING [train.py:1204] (2/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_393-57888-0 from training. Duration: 0.64 2023-10-03 08:59:55,127 WARNING [train.py:1204] (2/4) Exclude cut with ID _23825_210_13449_1_1531915313526_3497150_143-343575-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 08:59:55,202 WARNING [train.py:1204] (2/4) Exclude cut with ID _58605_210_12210_1_1534723195939_3904259_36-12256-0_sp1.1 from training. Duration: 0.936375 2023-10-03 08:59:55,351 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.balancer1.prob, batch_count=1210886.6666666667, ans=0.125 2023-10-03 08:59:55,409 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.conv_module1.balancer2.min_abs, batch_count=1210886.6666666667, ans=0.5 2023-10-03 09:00:04,247 WARNING [train.py:1204] (2/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_38-67795-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 09:00:05,561 WARNING [train.py:1204] (2/4) Exclude cut with ID _64631_210_27767_1_1535349580068_4165160_308-327832-0_sp1.1 from training. Duration: 0.92725 2023-10-03 09:00:05,613 WARNING [train.py:1204] (2/4) Exclude cut with ID _37086_210_9038_1_1532999540891_7021250_726-81176-0_sp1.1 from training. Duration: 0.936375 2023-10-03 09:00:05,664 WARNING [train.py:1204] (2/4) Exclude cut with ID _47790_210_16187_1_1533862814066_4207519_47-141521-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 09:00:05,667 WARNING [train.py:1204] (2/4) Exclude cut with ID _3067_210_2667_1_1526212974640_4503168_250-285937-0 from training. Duration: 0.8 2023-10-03 09:00:05,697 WARNING [train.py:1204] (2/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_239-24000-0_sp1.1 from training. Duration: 0.97275 2023-10-03 09:00:07,061 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_3371_210_1794_1_1526384462_5580777_396-339812-0_sp0.9 from training. Duration: 0.9444375 2023-10-03 09:00:07,108 WARNING [train.py:1204] (2/4) Exclude cut with ID _16909_210_5724_1_1531292079912_4118919_580-173454-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 09:00:08,441 WARNING [train.py:1204] (2/4) Exclude cut with ID IC0124W0002-61308 from training. Duration: 0.982 2023-10-03 09:00:12,490 WARNING [train.py:1204] (2/4) Exclude cut with ID _31602_210_5854_1_1532677021456_4458250_249-96604-0 from training. Duration: 0.89 2023-10-03 09:00:13,819 WARNING [train.py:1204] (2/4) Exclude cut with ID _69584_210_5219_1_1535785133775_4044450_325-7656-0 from training. Duration: 0.86 2023-10-03 09:00:13,947 WARNING [train.py:1204] (2/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_478-162247-0 from training. Duration: 0.82 2023-10-03 09:00:15,531 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.conv_module1.balancer1.prob, batch_count=1210953.3333333333, ans=0.125 2023-10-03 09:00:17,225 WARNING [train.py:1204] (2/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_686-54383-0_sp1.1 from training. Duration: 0.7909375 2023-10-03 09:00:23,889 WARNING [train.py:1204] (2/4) Exclude cut with ID _29303_210_2575_1_1532392237303_7410649_85-270092-0_sp1.1 from training. Duration: 0.936375 2023-10-03 09:00:23,907 WARNING [train.py:1204] (2/4) Exclude cut with ID _2322_210_2667_1_1525604548971_3656299_440-80494-0_sp1.1 from training. Duration: 0.8 2023-10-03 09:00:25,178 WARNING [train.py:1204] (2/4) Exclude cut with ID _47346_210_9476_1_1533815971553_3536160_201-40644-0_sp1.1 from training. Duration: 0.936375 2023-10-03 09:00:25,283 WARNING [train.py:1204] (2/4) Exclude cut with ID _56235_210_10640_1_1534557335235_4161099_343-139898-0_sp1.1 from training. Duration: 0.87275 2023-10-03 09:00:27,938 WARNING [train.py:1204] (2/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_1012-279473-0 from training. Duration: 0.67 2023-10-03 09:00:29,304 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7931_210_1794_1_1528594728_5564059_280-279560-0_sp1.1 from training. Duration: 0.8909375 2023-10-03 09:00:29,348 WARNING [train.py:1204] (2/4) Exclude cut with ID _8263_210_1664_1_1528439452891_970220_68-181805-0 from training. Duration: 0.64 2023-10-03 09:00:30,728 WARNING [train.py:1204] (2/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_605-133395-0 from training. Duration: 0.66 2023-10-03 09:00:32,553 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_43-171109-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 09:00:32,556 WARNING [train.py:1204] (2/4) Exclude cut with ID _40094_210_16380_1_1533294005773_3641159_107-280739-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 09:00:33,992 WARNING [train.py:1204] (2/4) Exclude cut with ID _14821_210_8750_1_1530680503991_6829359_2-227151-0_sp0.9 from training. Duration: 0.92225 2023-10-03 09:00:36,713 WARNING [train.py:1204] (2/4) Exclude cut with ID _14268_210_9206_1_1530622771285_4714099_331-105351-0_sp1.1 from training. Duration: 0.5909375 2023-10-03 09:00:38,078 INFO [train.py:1046] (2/4) Epoch 35, batch 1050, loss[loss=0.1655, simple_loss=0.254, pruned_loss=0.03851, over 24422.00 frames. ], tot_loss[loss=0.1606, simple_loss=0.2392, pruned_loss=0.04102, over 4678625.31 frames. ], batch size: 77, lr: 2.92e-03, grad_scale: 16.0 2023-10-03 09:00:38,180 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_26326_210_7925_1_1533002493_3592406_451-82741-0_sp1.1 from training. Duration: 0.97275 2023-10-03 09:00:41,034 WARNING [train.py:1204] (2/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_191-59784-0_sp0.9 from training. Duration: 0.97775 2023-10-03 09:00:43,646 WARNING [train.py:1204] (2/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_445-117398-0_sp1.1 from training. Duration: 0.8090625 2023-10-03 09:00:45,034 WARNING [train.py:1204] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_605-257205-0_sp0.9 from training. Duration: 0.6555625 2023-10-03 09:00:45,112 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_50050_210_16223_1_1535435796_7290138_592-258841-0_sp1.1 from training. Duration: 0.936375 2023-10-03 09:00:46,414 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.526e+02 1.826e+02 1.998e+02 2.224e+02 3.015e+02, threshold=3.995e+02, percent-clipped=0.0 2023-10-03 09:00:47,894 WARNING [train.py:1204] (2/4) Exclude cut with ID _14348_210_8341_1_1530599434996_3778640_240-232370-0_sp0.9 from training. Duration: 0.9333125 2023-10-03 09:00:49,835 WARNING [train.py:1204] (2/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_239-316802-0_sp0.9 from training. Duration: 0.7666875 2023-10-03 09:00:51,615 WARNING [train.py:1204] (2/4) Exclude cut with ID _45560_210_21982_1_1533812603951_3584999_111-6376-0_sp1.1 from training. Duration: 0.763625 2023-10-03 09:00:51,786 WARNING [train.py:1204] (2/4) Exclude cut with ID _54189_210_16380_1_1534332670039_3911710_606-311481-0_sp1.1 from training. Duration: 0.963625 2023-10-03 09:00:53,216 WARNING [train.py:1204] (2/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_237-332027-0_sp1.1 from training. Duration: 0.863625 2023-10-03 09:00:53,241 WARNING [train.py:1204] (2/4) Exclude cut with ID _66070_210_7640_1_1535441393096_7805310_489-292631-0_sp0.9 from training. Duration: 0.87775 2023-10-03 09:00:54,640 WARNING [train.py:1204] (2/4) Exclude cut with ID _49728_210_12651_1_1534041034294_6788150_624-195301-0_sp1.1 from training. Duration: 0.77275 2023-10-03 09:00:56,487 WARNING [train.py:1204] (2/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_44-79427-0 from training. Duration: 0.98 2023-10-03 09:00:56,568 WARNING [train.py:1204] (2/4) Exclude cut with ID _35350_210_16040_1_1533040191183_4075299_490-198354-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 09:00:56,753 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.conv_module1.balancer2.prob, batch_count=1211153.3333333333, ans=0.125 2023-10-03 09:00:57,820 WARNING [train.py:1204] (2/4) Exclude cut with ID _5166_210_2084_1_1527073197160_4087993_309-222545-0 from training. Duration: 0.84 2023-10-03 09:00:57,992 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_13091_210_7925_1_1530609447_8987354_790-199469-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 09:00:57,997 WARNING [train.py:1204] (2/4) Exclude cut with ID _21632_210_2253_1_1531994001765_3744140_110-274654-0 from training. Duration: 0.82 2023-10-03 09:00:58,003 WARNING [train.py:1204] (2/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_498-40153-0_sp0.9 from training. Duration: 0.7 2023-10-03 09:01:01,492 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.conv_module2.balancer2.prob, batch_count=1211153.3333333333, ans=0.125 2023-10-03 09:01:05,477 WARNING [train.py:1204] (2/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_363-282909-0_sp1.1 from training. Duration: 0.936375 2023-10-03 09:01:06,834 WARNING [train.py:1204] (2/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_178-177582-0_sp0.9 from training. Duration: 0.82225 2023-10-03 09:01:06,852 WARNING [train.py:1204] (2/4) Exclude cut with ID _24612_210_8913_1_1531998054193_3806359_357-35487-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 09:01:08,288 WARNING [train.py:1204] (2/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_138-332773-0 from training. Duration: 0.81 2023-10-03 09:01:09,598 WARNING [train.py:1204] (2/4) Exclude cut with ID _7624_210_1531_1_1528109802198_4933101_418-349834-0 from training. Duration: 0.6 2023-10-03 09:01:09,628 WARNING [train.py:1204] (2/4) Exclude cut with ID _7537_210_2084_1_1528502395431_3924359_14-312906-0_sp0.9 from training. Duration: 0.9333125 2023-10-03 09:01:12,382 WARNING [train.py:1204] (2/4) Exclude cut with ID _60057_210_10640_1_1534903065793_3333150_362-230377-0 from training. Duration: 0.87 2023-10-03 09:01:15,073 WARNING [train.py:1204] (2/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_138-64210-0 from training. Duration: 0.73 2023-10-03 09:01:16,347 WARNING [train.py:1204] (2/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_454-339504-0_sp1.1 from training. Duration: 0.97275 2023-10-03 09:01:19,663 WARNING [train.py:1204] (2/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_508-335369-0_sp0.9 from training. Duration: 0.5333125 2023-10-03 09:01:22,767 WARNING [train.py:1204] (2/4) Exclude cut with ID _5608_210_2106_1_1527415439089_6819768_909-82767-0_sp0.9 from training. Duration: 0.67775 2023-10-03 09:01:22,792 WARNING [train.py:1204] (2/4) Exclude cut with ID _6694_210_4599_1_1527852637490_3965529_99-11611-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 09:01:22,857 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_2655_210_2087_1_1525688959_3897400_52-279899-0_sp1.1 from training. Duration: 0.62725 2023-10-03 09:01:27,538 WARNING [train.py:1204] (2/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_375-245677-0_sp1.1 from training. Duration: 0.736375 2023-10-03 09:01:31,518 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_23850_210_4238_1_1532232597_1632433_32-185135-0 from training. Duration: 0.86 2023-10-03 09:01:32,977 WARNING [train.py:1204] (2/4) Exclude cut with ID _15113_210_8254_1_1530842438579_3716279_209-54533-0 from training. Duration: 0.96 2023-10-03 09:01:33,028 WARNING [train.py:1204] (2/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_401-53423-0 from training. Duration: 0.55 2023-10-03 09:01:34,320 WARNING [train.py:1204] (2/4) Exclude cut with ID _14867_210_1968_1_1530684179890_3846969_319-250868-0_sp1.1 from training. Duration: 0.9 2023-10-03 09:01:34,336 WARNING [train.py:1204] (2/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_106-30782-0_sp0.9 from training. Duration: 0.888875 2023-10-03 09:01:35,517 INFO [scaling.py:1022] (2/4) Whitening: name=encoder_embed.out_whiten, num_groups=1, num_channels=192, metric=7.25 vs. limit=8.0 2023-10-03 09:01:35,726 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_5960_210_1794_1_1527403865_3990999_515-2613-0 from training. Duration: 0.88 2023-10-03 09:01:39,782 WARNING [train.py:1204] (2/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_798-106179-0_sp0.9 from training. Duration: 0.92225 2023-10-03 09:01:42,434 WARNING [train.py:1204] (2/4) Exclude cut with ID _14227_210_4381_1_1530701804286_3940259_125-275642-0_sp1.1 from training. Duration: 0.9 2023-10-03 09:01:42,437 WARNING [train.py:1204] (2/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_79-200392-0_sp1.1 from training. Duration: 0.8818125 2023-10-03 09:01:43,680 WARNING [train.py:1204] (2/4) Exclude cut with ID _48836_210_12210_1_1534075848293_3904150_429-35475-0_sp0.9 from training. Duration: 0.988875 2023-10-03 09:01:43,694 WARNING [train.py:1204] (2/4) Exclude cut with ID _69884_210_9771_1_1535846392818_4166169_224-172517-0_sp1.1 from training. Duration: 0.97275 2023-10-03 09:01:46,565 WARNING [train.py:1204] (2/4) Exclude cut with ID _15113_210_8254_1_1530842438579_3716279_76-232507-0_sp1.1 from training. Duration: 0.97275 2023-10-03 09:01:46,568 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_135-5501-0 from training. Duration: 0.98 2023-10-03 09:01:47,988 WARNING [train.py:1204] (2/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_563-347832-0_sp0.9 from training. Duration: 0.988875 2023-10-03 09:01:49,335 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6786_210_1388_1_1527839699_4194495_273-149841-0 from training. Duration: 0.92 2023-10-03 09:01:49,368 WARNING [train.py:1204] (2/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_172-215932-0 from training. Duration: 0.66 2023-10-03 09:01:49,427 WARNING [train.py:1204] (2/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_939-132910-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 09:01:50,772 INFO [train.py:1046] (2/4) Epoch 35, batch 1100, loss[loss=0.1509, simple_loss=0.2244, pruned_loss=0.03871, over 23682.00 frames. ], tot_loss[loss=0.1598, simple_loss=0.2387, pruned_loss=0.04042, over 4690832.19 frames. ], batch size: 256, lr: 2.92e-03, grad_scale: 16.0 2023-10-03 09:01:52,867 WARNING [train.py:1204] (2/4) Exclude cut with ID _49371_210_12071_1_1533952761021_3812190_16-246001-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 09:01:59,047 WARNING [train.py:1204] (2/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_680-189165-0_sp1.1 from training. Duration: 0.936375 2023-10-03 09:02:03,315 WARNING [train.py:1204] (2/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_53-300544-0_sp1.1 from training. Duration: 0.6090625 2023-10-03 09:02:03,422 WARNING [train.py:1204] (2/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_540-275685-0_sp1.1 from training. Duration: 0.8090625 2023-10-03 09:02:04,766 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_38965_210_4852_1_1533174906_3484735_453-106579-0_sp1.1 from training. Duration: 0.92725 2023-10-03 09:02:04,818 WARNING [train.py:1204] (2/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_105-328174-0 from training. Duration: 0.76 2023-10-03 09:02:06,166 WARNING [train.py:1204] (2/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_279-126205-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 09:02:07,651 WARNING [train.py:1204] (2/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_5-55893-0_sp0.9 from training. Duration: 0.611125 2023-10-03 09:02:10,298 WARNING [train.py:1204] (2/4) Exclude cut with ID _13678_210_7497_1_1530530069226_3463170_141-10835-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 09:02:13,016 WARNING [train.py:1204] (2/4) Exclude cut with ID _22083_210_8254_1_1531702862439_3678970_201-85738-0_sp1.1 from training. Duration: 0.8181875 2023-10-03 09:02:13,045 WARNING [train.py:1204] (2/4) Exclude cut with ID _4697_210_2069_1_1526815801953_3823098_51-1049-0 from training. Duration: 0.75 2023-10-03 09:02:14,416 WARNING [train.py:1204] (2/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_206-10097-0_sp0.9 from training. Duration: 0.5444375 2023-10-03 09:02:15,805 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_4528_210_3273_1_1527076654_3276777_360-63661-0_sp1.1 from training. Duration: 0.92725 2023-10-03 09:02:17,095 WARNING [train.py:1204] (2/4) Exclude cut with ID _64086_210_7994_1_1535270417019_7435949_449-31567-0_sp1.1 from training. Duration: 0.8818125 2023-10-03 09:02:18,590 WARNING [train.py:1204] (2/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_245-174553-0_sp0.9 from training. Duration: 0.97775 2023-10-03 09:02:21,277 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6545_210_2040_1_1527774670_4245258_379-14182-0_sp1.1 from training. Duration: 0.72725 2023-10-03 09:02:26,426 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_3133_210_2087_1_1526106446_2324690_256-206070-0_sp1.1 from training. Duration: 0.963625 2023-10-03 09:02:29,601 WARNING [train.py:1204] (2/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_278-104676-0 from training. Duration: 0.98 2023-10-03 09:02:31,437 WARNING [train.py:1204] (2/4) Exclude cut with ID IC0671W0065-331908 from training. Duration: 0.896 2023-10-03 09:02:31,484 WARNING [train.py:1204] (2/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_244-129917-0_sp1.1 from training. Duration: 0.97275 2023-10-03 09:02:34,214 WARNING [train.py:1204] (2/4) Exclude cut with ID _7852_210_5219_1_1528286471316_8139400_1129-170857-0_sp1.1 from training. Duration: 0.97275 2023-10-03 09:02:34,285 WARNING [train.py:1204] (2/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_443-30034-0_sp0.9 from training. Duration: 0.711125 2023-10-03 09:02:34,314 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_31583_210_3852_1_1532595851_4024413_250-332999-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 09:02:35,700 WARNING [train.py:1204] (2/4) Exclude cut with ID _8772_210_1850_1_1528635595972_3664011_247-336448-0 from training. Duration: 0.74 2023-10-03 09:02:37,091 WARNING [train.py:1204] (2/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_567-135114-0_sp0.9 from training. Duration: 0.9666875 2023-10-03 09:02:37,095 WARNING [train.py:1204] (2/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_557-349534-0_sp1.1 from training. Duration: 0.82725 2023-10-03 09:02:37,113 WARNING [train.py:1204] (2/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_1341-26728-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 09:02:37,161 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_40222_210_6228_1_1533385298_4481939_375-328051-0_sp1.1 from training. Duration: 0.97275 2023-10-03 09:02:37,181 WARNING [train.py:1204] (2/4) Exclude cut with ID _4697_210_2069_1_1526815801953_3823098_276-96593-0 from training. Duration: 0.77 2023-10-03 09:02:43,911 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_53071_210_16554_1_1534490405_2954066_265-170472-0_sp0.9 from training. Duration: 0.92225 2023-10-03 09:02:43,929 WARNING [train.py:1204] (2/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_365-1492-0 from training. Duration: 0.91 2023-10-03 09:02:45,312 WARNING [train.py:1204] (2/4) Exclude cut with ID _66873_210_5517_1_1535454136506_4293370_654-52713-0_sp0.9 from training. Duration: 0.8555625 2023-10-03 09:02:47,104 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.self_attn_weights.pos_emb_skip_rate, batch_count=1211620.0, ans=0.0 2023-10-03 09:02:48,224 WARNING [train.py:1204] (2/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_113-92608-0_sp1.1 from training. Duration: 0.4909375 2023-10-03 09:02:51,048 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_46013_210_7234_1_1533776362_3744032_50-141535-0 from training. Duration: 0.99 2023-10-03 09:02:51,056 WARNING [train.py:1204] (2/4) Exclude cut with ID _13942_210_9988_1_1530604906036_3733360_499-55676-0_sp1.1 from training. Duration: 0.563625 2023-10-03 09:02:54,329 WARNING [train.py:1204] (2/4) Exclude cut with ID _30800_210_22382_1_1532499347904_4515959_375-65143-0_sp1.1 from training. Duration: 0.97275 2023-10-03 09:02:56,154 WARNING [train.py:1204] (2/4) Exclude cut with ID _16756_210_4915_1_1531047803763_7104259_199-325608-0_sp1.1 from training. Duration: 0.92725 2023-10-03 09:02:56,176 WARNING [train.py:1204] (2/4) Exclude cut with ID _31649_210_6228_1_1532584608063_4091078_443-124391-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 09:02:58,129 WARNING [train.py:1204] (2/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_146-218763-0 from training. Duration: 0.86 2023-10-03 09:02:58,177 WARNING [train.py:1204] (2/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_567-135114-0_sp1.1 from training. Duration: 0.7909375 2023-10-03 09:02:59,521 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_26326_210_7925_1_1533002493_3592406_462-288299-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 09:03:00,258 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.2.encoder.layers.2.nonlin_attention.whiten2, num_groups=1, num_channels=384, metric=5.12 vs. limit=15.0 2023-10-03 09:03:00,986 WARNING [train.py:1204] (2/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_322-217220-0 from training. Duration: 0.86 2023-10-03 09:03:01,020 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_238-256668-0_sp1.1 from training. Duration: 0.62725 2023-10-03 09:03:01,066 WARNING [train.py:1204] (2/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_371-18169-0 from training. Duration: 0.86 2023-10-03 09:03:02,449 WARNING [train.py:1204] (2/4) Exclude cut with ID _58480_210_12392_1_1534811121111_3646160_23-74512-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 09:03:02,461 WARNING [train.py:1204] (2/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_784-213099-0_sp1.1 from training. Duration: 0.8454375 2023-10-03 09:03:02,790 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.ff3_skip_rate, batch_count=1211686.6666666667, ans=0.0 2023-10-03 09:03:03,832 WARNING [train.py:1204] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_625-102955-0_sp1.1 from training. Duration: 0.7 2023-10-03 09:03:05,197 INFO [train.py:1046] (2/4) Epoch 35, batch 1150, loss[loss=0.1553, simple_loss=0.2423, pruned_loss=0.03414, over 24638.00 frames. ], tot_loss[loss=0.16, simple_loss=0.2389, pruned_loss=0.04058, over 4688357.46 frames. ], batch size: 68, lr: 2.92e-03, grad_scale: 16.0 2023-10-03 09:03:06,799 WARNING [train.py:1204] (2/4) Exclude cut with ID _49748_210_24116_1_1533986971737_3158910_176-174327-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 09:03:09,432 WARNING [train.py:1204] (2/4) Exclude cut with ID _50310_210_15488_1_1534068043345_3641080_432-96140-0_sp1.1 from training. Duration: 0.936375 2023-10-03 09:03:12,091 WARNING [train.py:1204] (2/4) Exclude cut with ID _40402_210_18219_1_1533522324115_2953150_76-14910-0_sp1.1 from training. Duration: 0.92725 2023-10-03 09:03:12,114 WARNING [train.py:1204] (2/4) Exclude cut with ID _21783_210_11357_1_1531742458730_3762679_40-19458-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 09:03:12,144 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_14056_210_4163_1_1530604483_7572827_46-93547-0 from training. Duration: 0.95 2023-10-03 09:03:12,174 WARNING [train.py:1204] (2/4) Exclude cut with ID _21631_210_2253_1_1531907880778_3160339_321-35325-0_sp1.1 from training. Duration: 0.963625 2023-10-03 09:03:12,874 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.3.encoder.layers.1.whiten, num_groups=1, num_channels=512, metric=4.62 vs. limit=12.0 2023-10-03 09:03:13,385 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.530e+02 1.882e+02 2.034e+02 2.362e+02 3.611e+02, threshold=4.067e+02, percent-clipped=0.0 2023-10-03 09:03:15,016 WARNING [train.py:1204] (2/4) Exclude cut with ID _4779_210_2084_1_1527318022944_3914893_500-285013-0 from training. Duration: 0.69 2023-10-03 09:03:16,361 WARNING [train.py:1204] (2/4) Exclude cut with ID _43552_210_8913_1_1533726031059_3523429_301-93324-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 09:03:16,375 WARNING [train.py:1204] (2/4) Exclude cut with ID _6473_210_1741_1_1527901307396_3815380_108-17335-0_sp1.1 from training. Duration: 0.5909375 2023-10-03 09:03:22,384 WARNING [train.py:1204] (2/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_206-10097-0 from training. Duration: 0.49 2023-10-03 09:03:23,850 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_36756_210_7234_1_1533026991_966276_103-77513-0_sp1.1 from training. Duration: 0.97275 2023-10-03 09:03:28,968 WARNING [train.py:1204] (2/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_662-18193-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 09:03:29,036 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_26433_210_12108_1_1532157158_4056480_439-52438-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 09:03:29,075 WARNING [train.py:1204] (2/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_464-33931-0 from training. Duration: 0.54 2023-10-03 09:03:29,081 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_3474_210_2028_1_1526188513_4825246_145-309131-0_sp0.9 from training. Duration: 0.82225 2023-10-03 09:03:29,092 WARNING [train.py:1204] (2/4) Exclude cut with ID _9679_210_7497_1_1529129212906_4363929_446-308321-0_sp1.1 from training. Duration: 0.963625 2023-10-03 09:03:33,327 WARNING [train.py:1204] (2/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_662-275621-0 from training. Duration: 0.42 2023-10-03 09:03:34,687 WARNING [train.py:1204] (2/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_344-337148-0_sp1.1 from training. Duration: 0.97275 2023-10-03 09:03:34,870 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.nonlin_attention.balancer.prob, batch_count=1211886.6666666667, ans=0.125 2023-10-03 09:03:36,044 WARNING [train.py:1204] (2/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_759-179071-0_sp1.1 from training. Duration: 0.92725 2023-10-03 09:03:44,281 WARNING [train.py:1204] (2/4) Exclude cut with ID _28783_210_7943_1_1532671164808_7155479_293-12188-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 09:03:49,694 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_17613_210_2151_1_1531137569_2366160_235-320304-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 09:03:49,717 WARNING [train.py:1204] (2/4) Exclude cut with ID _14349_210_8341_1_1530615710818_3442470_170-336541-0 from training. Duration: 0.86 2023-10-03 09:03:49,770 WARNING [train.py:1204] (2/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_375-218677-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 09:03:51,449 WARNING [train.py:1204] (2/4) Exclude cut with ID _28317_210_12742_1_1532480302381_8510325_829-202956-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 09:03:56,235 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9029W0436-509823 from training. Duration: 0.768 2023-10-03 09:03:57,595 WARNING [train.py:1204] (2/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_231-112214-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 09:04:06,292 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9059W0470-524883 from training. Duration: 0.3415625 2023-10-03 09:04:10,506 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_43266_210_1794_1_1533521789_4497218_206-79512-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 09:04:10,594 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_294-163793-0_sp0.9 from training. Duration: 0.911125 2023-10-03 09:04:11,929 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6290_210_3273_1_1527595224_1282787_115-87095-0_sp1.1 from training. Duration: 0.5 2023-10-03 09:04:11,975 WARNING [train.py:1204] (2/4) Exclude cut with ID _14100_210_8341_1_1530529320613_3678069_279-55760-0_sp1.1 from training. Duration: 0.8454375 2023-10-03 09:04:13,970 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.4.encoder.layers.0.conv_module2.whiten, num_groups=1, num_channels=384, metric=9.12 vs. limit=15.0 2023-10-03 09:04:14,822 WARNING [train.py:1204] (2/4) Exclude cut with ID _14621_210_1925_1_1530686901185_3675420_419-234200-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 09:04:17,435 INFO [train.py:1046] (2/4) Epoch 35, batch 1200, loss[loss=0.1778, simple_loss=0.2416, pruned_loss=0.05693, over 23822.00 frames. ], tot_loss[loss=0.1608, simple_loss=0.2398, pruned_loss=0.04091, over 4689294.12 frames. ], batch size: 179, lr: 2.92e-03, grad_scale: 32.0 2023-10-03 09:04:17,750 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.nonlin_attention.balancer.min_positive, batch_count=1212086.6666666667, ans=0.05 2023-10-03 09:04:19,443 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.feed_forward1.out_whiten.whitening_limit, batch_count=1212086.6666666667, ans=15.0 2023-10-03 09:04:20,277 WARNING [train.py:1204] (2/4) Exclude cut with ID _60229_210_27767_1_1534838406158_3655350_353-213656-0_sp1.1 from training. Duration: 0.863625 2023-10-03 09:04:20,286 WARNING [train.py:1204] (2/4) Exclude cut with ID _48678_210_5663_1_1533988830947_3769199_306-255886-0_sp1.1 from training. Duration: 0.7 2023-10-03 09:04:21,716 WARNING [train.py:1204] (2/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_466-3492-0_sp1.1 from training. Duration: 0.97275 2023-10-03 09:04:21,716 WARNING [train.py:1204] (2/4) Exclude cut with ID _8755_210_1876_1_1528628559357_3258209_199-242167-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 09:04:22,327 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.3.encoder.layers.0.feed_forward3.out_whiten, num_groups=1, num_channels=512, metric=13.77 vs. limit=15.0 2023-10-03 09:04:23,622 WARNING [train.py:1204] (2/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_237-89684-0_sp1.1 from training. Duration: 0.87275 2023-10-03 09:04:25,848 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=1212086.6666666667, ans=0.1 2023-10-03 09:04:26,713 WARNING [train.py:1204] (2/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_70-84121-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 09:04:26,832 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7544_210_6753_1_1528587722_3576686_172-171750-0_sp1.1 from training. Duration: 0.5909375 2023-10-03 09:04:29,882 WARNING [train.py:1204] (2/4) Exclude cut with ID _24271_210_12658_1_1531987270022_7190519_40-321822-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 09:04:29,906 WARNING [train.py:1204] (2/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_227-83612-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 09:04:33,083 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9156W0015-572508 from training. Duration: 0.896 2023-10-03 09:04:35,950 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_292-265928-0 from training. Duration: 0.51 2023-10-03 09:04:40,069 WARNING [train.py:1204] (2/4) Exclude cut with ID _14747_210_8341_1_1530685797463_3654089_147-131229-0_sp1.1 from training. Duration: 0.6909375 2023-10-03 09:04:40,216 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=1212153.3333333333, ans=0.1 2023-10-03 09:04:42,751 WARNING [train.py:1204] (2/4) Exclude cut with ID _45297_210_2721_1_1533715351381_3264949_351-147136-0_sp0.9 from training. Duration: 0.9666875 2023-10-03 09:04:44,165 WARNING [train.py:1204] (2/4) Exclude cut with ID _5250_210_5219_1_1527249561806_8692110_1102-324568-0_sp1.1 from training. Duration: 0.97275 2023-10-03 09:04:46,907 WARNING [train.py:1204] (2/4) Exclude cut with ID _18516_210_9206_1_1531704653206_3692406_465-75446-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 09:04:46,908 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9203W0126-595623 from training. Duration: 0.896 2023-10-03 09:04:48,270 WARNING [train.py:1204] (2/4) Exclude cut with ID _55383_210_10635_1_1534468426172_2231669_139-328410-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 09:04:55,178 WARNING [train.py:1204] (2/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_166-246557-0_sp0.9 from training. Duration: 0.711125 2023-10-03 09:04:55,183 WARNING [train.py:1204] (2/4) Exclude cut with ID _30039_210_13270_1_1532484612900_2725150_239-304872-0_sp1.1 from training. Duration: 0.963625 2023-10-03 09:04:55,214 WARNING [train.py:1204] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_6-226750-0 from training. Duration: 0.94 2023-10-03 09:04:57,067 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_135-5501-0_sp1.1 from training. Duration: 0.8909375 2023-10-03 09:05:00,404 WARNING [train.py:1204] (2/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_359-118926-0 from training. Duration: 0.72 2023-10-03 09:05:05,126 WARNING [train.py:1204] (2/4) Exclude cut with ID _8064_210_5399_1_1528372299291_4282973_4-91037-0 from training. Duration: 0.88 2023-10-03 09:05:05,149 WARNING [train.py:1204] (2/4) Exclude cut with ID _6653_210_2629_1_1527919172441_6732679_415-288492-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 09:05:05,216 WARNING [train.py:1204] (2/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_544-287179-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 09:05:07,010 WARNING [train.py:1204] (2/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_122-32606-0_sp1.1 from training. Duration: 0.92725 2023-10-03 09:05:08,337 WARNING [train.py:1204] (2/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_121-284152-0_sp1.1 from training. Duration: 0.536375 2023-10-03 09:05:09,712 WARNING [train.py:1204] (2/4) Exclude cut with ID _44718_210_5816_1_1534050051869_6469030_350-299477-0_sp1.1 from training. Duration: 0.97275 2023-10-03 09:05:09,731 WARNING [train.py:1204] (2/4) Exclude cut with ID _6653_210_2629_1_1527919172441_6732679_1103-56445-0_sp0.9 from training. Duration: 0.811125 2023-10-03 09:05:09,790 WARNING [train.py:1204] (2/4) Exclude cut with ID _12369_210_4381_1_1530356078581_4388520_64-298917-0_sp0.9 from training. Duration: 0.97775 2023-10-03 09:05:11,076 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_2245_210_2715_1_1525511548_3540254_310-52483-0 from training. Duration: 0.88 2023-10-03 09:05:12,231 WARNING [train.py:1204] (2/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_401-158239-0_sp1.1 from training. Duration: 0.6454375 2023-10-03 09:05:12,278 WARNING [train.py:1204] (2/4) Exclude cut with ID _59502_210_12210_1_1535107024698_1835139_146-162495-0_sp1.1 from training. Duration: 0.836375 2023-10-03 09:05:12,288 WARNING [train.py:1204] (2/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_41-31157-0_sp0.9 from training. Duration: 0.6333125 2023-10-03 09:05:13,697 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_68717_210_3528_1_1535628683_1556759_95-36722-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 09:05:14,984 WARNING [train.py:1204] (2/4) Exclude cut with ID _30196_210_1664_1_1533708496522_7074140_654-162611-0_sp1.1 from training. Duration: 0.92725 2023-10-03 09:05:19,199 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_9302_210_2550_1_1528889962_4655416_638-301620-0_sp0.9 from training. Duration: 0.52225 2023-10-03 09:05:20,551 WARNING [train.py:1204] (2/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_478-162247-0_sp1.1 from training. Duration: 0.7454375 2023-10-03 09:05:23,286 WARNING [train.py:1204] (2/4) Exclude cut with ID _45297_210_2721_1_1533715351381_3264949_353-9541-0 from training. Duration: 0.97 2023-10-03 09:05:24,927 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.nonlin_attention.balancer.prob, batch_count=1212353.3333333333, ans=0.125 2023-10-03 09:05:29,252 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9653W0190-675468 from training. Duration: 0.9935 2023-10-03 09:05:31,165 INFO [train.py:1046] (2/4) Epoch 35, batch 1250, loss[loss=0.1579, simple_loss=0.2445, pruned_loss=0.03562, over 24480.00 frames. ], tot_loss[loss=0.1614, simple_loss=0.241, pruned_loss=0.04093, over 4705157.64 frames. ], batch size: 66, lr: 2.92e-03, grad_scale: 16.0 2023-10-03 09:05:31,211 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_49730_210_13049_1_1534075294_1830676_214-84436-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 09:05:32,685 WARNING [train.py:1204] (2/4) Exclude cut with ID _50093_210_4220_1_1534417141207_3432299_259-322252-0_sp1.1 from training. Duration: 0.836375 2023-10-03 09:05:34,101 WARNING [train.py:1204] (2/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_599-50220-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 09:05:35,898 WARNING [train.py:1204] (2/4) Exclude cut with ID _21878_210_3525_1_1531738772002_7264330_174-109016-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 09:05:37,855 WARNING [train.py:1204] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_665-196155-0 from training. Duration: 0.67 2023-10-03 09:05:40,696 WARNING [train.py:1204] (2/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_656-188361-0_sp1.1 from training. Duration: 0.7909375 2023-10-03 09:05:41,844 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.586e+02 1.899e+02 2.182e+02 2.478e+02 3.266e+02, threshold=4.363e+02, percent-clipped=0.0 2023-10-03 09:05:41,975 WARNING [train.py:1204] (2/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_60-18123-0_sp1.1 from training. Duration: 0.8 2023-10-03 09:05:43,269 WARNING [train.py:1204] (2/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_535-14410-0 from training. Duration: 0.47 2023-10-03 09:05:43,541 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.conv_module2.balancer1.max_abs, batch_count=1212420.0, ans=10.0 2023-10-03 09:05:44,638 WARNING [train.py:1204] (2/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_94-126128-0_sp1.1 from training. Duration: 0.7818125 2023-10-03 09:05:44,720 WARNING [train.py:1204] (2/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_356-31216-0_sp1.1 from training. Duration: 0.6818125 2023-10-03 09:05:48,911 WARNING [train.py:1204] (2/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_443-30034-0_sp1.1 from training. Duration: 0.5818125 2023-10-03 09:05:48,974 WARNING [train.py:1204] (2/4) Exclude cut with ID _26907_210_7234_1_1532221740694_3205263_337-2494-0_sp1.1 from training. Duration: 0.8 2023-10-03 09:05:50,289 WARNING [train.py:1204] (2/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_49-97158-0_sp0.9 from training. Duration: 0.8444375 2023-10-03 09:05:50,306 WARNING [train.py:1204] (2/4) Exclude cut with ID _7877_210_6014_1_1528286358099_3750136_553-170128-0_sp1.1 from training. Duration: 0.963625 2023-10-03 09:05:53,088 WARNING [train.py:1204] (2/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_202-293523-0_sp0.9 from training. Duration: 0.8 2023-10-03 09:05:57,601 WARNING [train.py:1204] (2/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_188-171673-0_sp0.9 from training. Duration: 0.6444375 2023-10-03 09:05:57,626 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_120-41668-0_sp1.1 from training. Duration: 0.636375 2023-10-03 09:05:57,638 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_40223_210_6228_1_1533298404_4812267_613-78769-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 09:05:59,052 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_53861_210_5399_1_1534422156_4272077_150-336899-0_sp1.1 from training. Duration: 0.963625 2023-10-03 09:06:00,381 WARNING [train.py:1204] (2/4) Exclude cut with ID _14459_210_2228_1_1531443641946_7516700_1146-56843-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 09:06:03,184 WARNING [train.py:1204] (2/4) Exclude cut with ID _39867_210_9826_1_1533553222030_9344259_1194-299928-0_sp1.1 from training. Duration: 0.92725 2023-10-03 09:06:03,355 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.feed_forward3.hidden_balancer.prob, batch_count=1212553.3333333333, ans=0.125 2023-10-03 09:06:04,992 WARNING [train.py:1204] (2/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_214-291369-0_sp0.9 from training. Duration: 0.7 2023-10-03 09:06:10,037 WARNING [train.py:1204] (2/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_358-215558-0 from training. Duration: 0.93 2023-10-03 09:06:10,097 WARNING [train.py:1204] (2/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_577-287150-0_sp0.9 from training. Duration: 0.911125 2023-10-03 09:06:12,917 WARNING [train.py:1204] (2/4) Exclude cut with ID _60860_210_8254_1_1534896015875_3557169_229-187367-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 09:06:14,257 WARNING [train.py:1204] (2/4) Exclude cut with ID _6275_210_2084_1_1528110062213_7669049_857-298942-0 from training. Duration: 0.85 2023-10-03 09:06:14,310 WARNING [train.py:1204] (2/4) Exclude cut with ID _40137_210_12210_1_1533290484046_3795180_249-187059-0_sp1.1 from training. Duration: 0.8 2023-10-03 09:06:15,635 WARNING [train.py:1204] (2/4) Exclude cut with ID ID0171W0508-765603 from training. Duration: 0.541 2023-10-03 09:06:15,666 WARNING [train.py:1204] (2/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_521-219439-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 09:06:15,672 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_40223_210_6228_1_1533298404_4812267_741-302616-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 09:06:16,152 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.5.encoder.layers.1.nonlin_attention.whiten1, num_groups=1, num_channels=192, metric=2.79 vs. limit=10.0 2023-10-03 09:06:18,403 WARNING [train.py:1204] (2/4) Exclude cut with ID _65293_210_9070_1_1535545549765_4004210_438-154735-0_sp1.1 from training. Duration: 0.92725 2023-10-03 09:06:21,144 WARNING [train.py:1204] (2/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_564-132950-0_sp1.1 from training. Duration: 0.92725 2023-10-03 09:06:22,439 WARNING [train.py:1204] (2/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_291-270881-0_sp1.1 from training. Duration: 0.8818125 2023-10-03 09:06:23,769 WARNING [train.py:1204] (2/4) Exclude cut with ID _60229_210_27767_1_1534838406158_3655350_353-213656-0 from training. Duration: 0.95 2023-10-03 09:06:23,788 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_3133_210_2087_1_1526106446_2324690_243-143119-0 from training. Duration: 0.77 2023-10-03 09:06:23,811 WARNING [train.py:1204] (2/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_690-126306-0 from training. Duration: 0.78 2023-10-03 09:06:26,554 WARNING [train.py:1204] (2/4) Exclude cut with ID _30304_210_6319_1_1532476827261_7582060_347-95003-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 09:06:28,555 WARNING [train.py:1204] (2/4) Exclude cut with ID _2322_210_2667_1_1525604548971_3656299_440-80494-0 from training. Duration: 0.88 2023-10-03 09:06:28,570 WARNING [train.py:1204] (2/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_370-229434-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 09:06:32,616 WARNING [train.py:1204] (2/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_124-345840-0_sp1.1 from training. Duration: 0.47275 2023-10-03 09:06:32,626 WARNING [train.py:1204] (2/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_165-123581-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 09:06:32,770 WARNING [train.py:1204] (2/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_453-282588-0 from training. Duration: 0.98 2023-10-03 09:06:32,786 WARNING [train.py:1204] (2/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_299-34413-0_sp0.9 from training. Duration: 0.72225 2023-10-03 09:06:32,821 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_40036_210_13405_1_1533648647_1315388_48-146016-0_sp1.1 from training. Duration: 0.8454375 2023-10-03 09:06:34,065 WARNING [train.py:1204] (2/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_99-167392-0_sp0.9 from training. Duration: 0.67775 2023-10-03 09:06:34,123 WARNING [train.py:1204] (2/4) Exclude cut with ID _48677_210_5663_1_1533902405919_3892989_37-207114-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 09:06:37,445 WARNING [train.py:1204] (2/4) Exclude cut with ID _29857_210_20034_1_1532395886753_3798160_335-163562-0 from training. Duration: 0.85 2023-10-03 09:06:40,550 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_36492_210_7927_1_1533261987_4790834_703-343351-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 09:06:41,919 WARNING [train.py:1204] (2/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_80-147301-0_sp0.9 from training. Duration: 0.9333125 2023-10-03 09:06:43,281 WARNING [train.py:1204] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_218-295097-0_sp0.9 from training. Duration: 0.8555625 2023-10-03 09:06:44,590 INFO [train.py:1046] (2/4) Epoch 35, batch 1300, loss[loss=0.1558, simple_loss=0.2436, pruned_loss=0.03401, over 24425.00 frames. ], tot_loss[loss=0.1627, simple_loss=0.2419, pruned_loss=0.04179, over 4700888.65 frames. ], batch size: 69, lr: 2.92e-03, grad_scale: 16.0 2023-10-03 09:06:44,881 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.bypass.scale_min, batch_count=1212753.3333333333, ans=0.2 2023-10-03 09:06:46,003 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_2295_210_2087_1_1525500993_2407863_116-238894-0_sp1.1 from training. Duration: 0.636375 2023-10-03 09:06:48,855 WARNING [train.py:1204] (2/4) Exclude cut with ID _3231_210_2106_1_1526205864934_6784220_697-171068-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 09:06:48,884 WARNING [train.py:1204] (2/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_102-77064-0 from training. Duration: 0.93 2023-10-03 09:06:52,974 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_13091_210_7925_1_1530609447_8987354_1186-261957-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 09:06:55,797 WARNING [train.py:1204] (2/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_91-277684-0_sp1.1 from training. Duration: 0.67275 2023-10-03 09:06:57,716 WARNING [train.py:1204] (2/4) Exclude cut with ID _31088_210_16446_1_1532520012284_3440919_159-313751-0_sp1.1 from training. Duration: 0.936375 2023-10-03 09:06:57,822 WARNING [train.py:1204] (2/4) Exclude cut with ID _42326_210_7770_1_1533814282531_3707269_227-203721-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 09:07:00,970 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_3531_210_3528_1_1526294203_5216456_309-227302-0_sp1.1 from training. Duration: 0.62725 2023-10-03 09:07:02,359 WARNING [train.py:1204] (2/4) Exclude cut with ID _14087_210_2253_1_1530529224512_3014029_226-242140-0 from training. Duration: 0.81 2023-10-03 09:07:04,018 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.0.bypass.skip_rate, batch_count=1212820.0, ans=0.035 2023-10-03 09:07:05,193 WARNING [train.py:1204] (2/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_122-109023-0_sp1.1 from training. Duration: 0.6909375 2023-10-03 09:07:06,530 WARNING [train.py:1204] (2/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_660-211088-0_sp0.9 from training. Duration: 0.688875 2023-10-03 09:07:06,632 WARNING [train.py:1204] (2/4) Exclude cut with ID _39290_210_4220_1_1533862778760_3075118_92-311600-0 from training. Duration: 0.88 2023-10-03 09:07:09,985 WARNING [train.py:1204] (2/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_390-339137-0_sp1.1 from training. Duration: 0.5090625 2023-10-03 09:07:10,242 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.balancer2.prob, batch_count=1212820.0, ans=0.125 2023-10-03 09:07:13,007 WARNING [train.py:1204] (2/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_449-341946-0_sp1.1 from training. Duration: 0.963625 2023-10-03 09:07:14,427 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_68095_210_20185_1_1535718226_8888855_884-327007-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 09:07:15,903 WARNING [train.py:1204] (2/4) Exclude cut with ID _42326_210_7770_1_1533814282531_3707269_308-222998-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 09:07:17,181 WARNING [train.py:1204] (2/4) Exclude cut with ID _40399_210_4353_1_1533305658194_3279406_344-238708-0_sp1.1 from training. Duration: 0.963625 2023-10-03 09:07:17,249 WARNING [train.py:1204] (2/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_87-131427-0_sp1.1 from training. Duration: 0.7090625 2023-10-03 09:07:17,484 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.bypass.skip_rate, batch_count=1212886.6666666667, ans=0.07 2023-10-03 09:07:18,616 WARNING [train.py:1204] (2/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_110-130961-0_sp0.9 from training. Duration: 0.52225 2023-10-03 09:07:18,649 WARNING [train.py:1204] (2/4) Exclude cut with ID _55153_210_2253_1_1534384786677_3162096_148-79022-0 from training. Duration: 0.96 2023-10-03 09:07:18,867 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.self_attn_weights.pos_emb_skip_rate, batch_count=1212886.6666666667, ans=0.0 2023-10-03 09:07:24,102 WARNING [train.py:1204] (2/4) Exclude cut with ID _9679_210_7497_1_1529129212906_4363929_512-313889-0_sp1.1 from training. Duration: 0.736375 2023-10-03 09:07:24,113 WARNING [train.py:1204] (2/4) Exclude cut with ID _7970_210_1883_1_1528455616296_3472230_25-178407-0_sp1.1 from training. Duration: 0.6454375 2023-10-03 09:07:27,179 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_3133_210_2087_1_1526106446_2324690_246-125000-0 from training. Duration: 0.62 2023-10-03 09:07:27,243 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_15126_210_6753_1_1530778677_2591185_113-157651-0_sp0.9 from training. Duration: 0.7555625 2023-10-03 09:07:30,377 WARNING [train.py:1204] (2/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_105-63309-0_sp1.1 from training. Duration: 0.8909375 2023-10-03 09:07:33,160 WARNING [train.py:1204] (2/4) Exclude cut with ID _29877_210_6228_1_1532397791106_5276149_578-177271-0_sp1.1 from training. Duration: 0.936375 2023-10-03 09:07:33,200 WARNING [train.py:1204] (2/4) Exclude cut with ID _44831_210_13427_1_1533816011549_3689035_32-188147-0 from training. Duration: 0.96 2023-10-03 09:07:33,254 WARNING [train.py:1204] (2/4) Exclude cut with ID _12369_210_4381_1_1530356078581_4388520_300-254337-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 09:07:34,545 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_128-241008-0 from training. Duration: 0.94 2023-10-03 09:07:34,679 WARNING [train.py:1204] (2/4) Exclude cut with ID _21878_210_3525_1_1531738772002_7264330_53-279178-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 09:07:39,373 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_41452_210_7291_1_1533437687_3941281_273-334088-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 09:07:39,376 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_128-241008-0_sp1.1 from training. Duration: 0.8545625 2023-10-03 09:07:42,044 WARNING [train.py:1204] (2/4) Exclude cut with ID _8394_210_6702_1_1528590504316_3540189_379-49457-0 from training. Duration: 0.87 2023-10-03 09:07:43,429 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_105-114797-0 from training. Duration: 0.84 2023-10-03 09:07:44,729 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_14928_210_5632_1_1530773461_4714000_454-112198-0 from training. Duration: 0.66 2023-10-03 09:07:48,931 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_456-22141-0_sp1.1 from training. Duration: 0.8 2023-10-03 09:07:50,429 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_115-233929-0 from training. Duration: 0.72 2023-10-03 09:07:53,076 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_616-16795-0_sp1.1 from training. Duration: 0.963625 2023-10-03 09:07:57,293 INFO [train.py:1046] (2/4) Epoch 35, batch 1350, loss[loss=0.1463, simple_loss=0.2307, pruned_loss=0.03101, over 24304.00 frames. ], tot_loss[loss=0.1621, simple_loss=0.2413, pruned_loss=0.04145, over 4705139.11 frames. ], batch size: 61, lr: 2.92e-03, grad_scale: 16.0 2023-10-03 09:07:59,158 WARNING [train.py:1204] (2/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_448-212135-0 from training. Duration: 0.91 2023-10-03 09:08:01,122 WARNING [train.py:1204] (2/4) Exclude cut with ID _46683_210_9642_1_1533730913492_4591116_201-205677-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 09:08:03,863 WARNING [train.py:1204] (2/4) Exclude cut with ID _64086_210_7994_1_1535270417019_7435949_43-80483-0_sp1.1 from training. Duration: 0.97275 2023-10-03 09:08:07,132 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_240-134124-0_sp1.1 from training. Duration: 0.963625 2023-10-03 09:08:07,353 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.bypass.scale_min, batch_count=1213086.6666666667, ans=0.2 2023-10-03 09:08:08,374 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.516e+02 1.947e+02 2.144e+02 2.393e+02 3.515e+02, threshold=4.287e+02, percent-clipped=0.0 2023-10-03 09:08:08,477 WARNING [train.py:1204] (2/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_907-120940-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 09:08:09,823 WARNING [train.py:1204] (2/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_848-280957-0_sp1.1 from training. Duration: 0.7909375 2023-10-03 09:08:11,607 WARNING [train.py:1204] (2/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_243-42116-0_sp0.9 from training. Duration: 0.811125 2023-10-03 09:08:15,857 WARNING [train.py:1204] (2/4) Exclude cut with ID _6694_210_4599_1_1527852637490_3965529_116-109398-0_sp0.9 from training. Duration: 0.811125 2023-10-03 09:08:17,255 WARNING [train.py:1204] (2/4) Exclude cut with ID _24314_210_4915_1_1532135078116_7170179_521-258868-0 from training. Duration: 0.79 2023-10-03 09:08:17,362 WARNING [train.py:1204] (2/4) Exclude cut with ID _2220_210_1819_1_1525509109898_3658680_50-99837-0_sp0.9 from training. Duration: 0.6 2023-10-03 09:08:18,766 WARNING [train.py:1204] (2/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_313-134930-0_sp1.1 from training. Duration: 0.8818125 2023-10-03 09:08:21,418 WARNING [train.py:1204] (2/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_125-144174-0 from training. Duration: 0.39 2023-10-03 09:08:21,473 WARNING [train.py:1204] (2/4) Exclude cut with ID _59288_210_5854_1_1535077757774_3412246_275-293897-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 09:08:23,402 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.3.encoder.layers.3.feed_forward1.out_whiten, num_groups=1, num_channels=512, metric=5.34 vs. limit=15.0 2023-10-03 09:08:24,061 WARNING [train.py:1204] (2/4) Exclude cut with ID _40519_210_13794_1_1533715228655_7337390_722-52880-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 09:08:24,082 WARNING [train.py:1204] (2/4) Exclude cut with ID _7537_210_2084_1_1528502395431_3924359_128-268029-0 from training. Duration: 0.98 2023-10-03 09:08:25,502 WARNING [train.py:1204] (2/4) Exclude cut with ID _8021_210_7994_1_1528376451358_3653069_179-45206-0 from training. Duration: 0.86 2023-10-03 09:08:27,476 WARNING [train.py:1204] (2/4) Exclude cut with ID _13252_210_1925_1_1530671372047_3336909_88-276668-0 from training. Duration: 0.99 2023-10-03 09:08:27,773 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.nonlin_attention.balancer.prob, batch_count=1213220.0, ans=0.125 2023-10-03 09:08:28,707 WARNING [train.py:1204] (2/4) Exclude cut with ID _22613_210_5219_1_1532253599686_4090170_290-212862-0_sp1.1 from training. Duration: 0.97275 2023-10-03 09:08:28,718 WARNING [train.py:1204] (2/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_465-214726-0 from training. Duration: 0.86 2023-10-03 09:08:37,832 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.0.attention_skip_rate, batch_count=1213220.0, ans=0.0 2023-10-03 09:08:39,615 WARNING [train.py:1204] (2/4) Exclude cut with ID _56480_210_4220_1_1534679942079_3075409_138-36031-0_sp1.1 from training. Duration: 0.97275 2023-10-03 09:08:41,514 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.balancer2.prob, batch_count=1213286.6666666667, ans=0.125 2023-10-03 09:08:49,494 WARNING [train.py:1204] (2/4) Exclude cut with ID _58605_210_12210_1_1534723195939_3904259_300-285878-0_sp1.1 from training. Duration: 0.97275 2023-10-03 09:08:49,528 WARNING [train.py:1204] (2/4) Exclude cut with ID _40901_210_5665_1_1533363789958_3856130_114-340229-0_sp1.1 from training. Duration: 0.936375 2023-10-03 09:08:49,574 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_489-230703-0 from training. Duration: 0.85 2023-10-03 09:08:49,810 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.balancer_ff2.min_abs, batch_count=1213286.6666666667, ans=0.1 2023-10-03 09:08:52,449 WARNING [train.py:1204] (2/4) Exclude cut with ID _66873_210_5517_1_1535454136506_4293370_322-81243-0_sp1.1 from training. Duration: 0.936375 2023-10-03 09:08:52,534 WARNING [train.py:1204] (2/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_271-269084-0 from training. Duration: 0.83 2023-10-03 09:08:53,814 WARNING [train.py:1204] (2/4) Exclude cut with ID _13252_210_1925_1_1530671372047_3336909_166-314607-0_sp0.9 from training. Duration: 0.6 2023-10-03 09:08:53,879 WARNING [train.py:1204] (2/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_340-343302-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 09:08:57,323 WARNING [train.py:1204] (2/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_286-112015-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 09:08:59,866 WARNING [train.py:1204] (2/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_328-264776-0 from training. Duration: 0.67 2023-10-03 09:08:59,979 WARNING [train.py:1204] (2/4) Exclude cut with ID _4866_210_2577_1_1527404412284_4364659_256-306830-0_sp0.9 from training. Duration: 0.9666875 2023-10-03 09:09:04,632 WARNING [train.py:1204] (2/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_239-316802-0 from training. Duration: 0.69 2023-10-03 09:09:07,994 WARNING [train.py:1204] (2/4) Exclude cut with ID _29857_210_20034_1_1532395886753_3798160_279-185801-0 from training. Duration: 0.91 2023-10-03 09:09:12,355 INFO [train.py:1046] (2/4) Epoch 35, batch 1400, loss[loss=0.1673, simple_loss=0.2547, pruned_loss=0.03995, over 23932.00 frames. ], tot_loss[loss=0.1606, simple_loss=0.2394, pruned_loss=0.04095, over 4699210.01 frames. ], batch size: 86, lr: 2.92e-03, grad_scale: 16.0 2023-10-03 09:09:15,218 WARNING [train.py:1204] (2/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_656-188361-0 from training. Duration: 0.87 2023-10-03 09:09:16,613 WARNING [train.py:1204] (2/4) Exclude cut with ID _44718_210_5816_1_1534050051869_6469030_100-230287-0_sp1.1 from training. Duration: 0.936375 2023-10-03 09:09:19,627 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8275_210_4198_1_1528455320_3892571_276-215622-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 09:09:19,693 WARNING [train.py:1204] (2/4) Exclude cut with ID _30304_210_6319_1_1532476827261_7582060_698-309430-0_sp1.1 from training. Duration: 0.963625 2023-10-03 09:09:23,602 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_365-217119-0 from training. Duration: 0.63 2023-10-03 09:09:26,251 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7264_210_4938_1_1527939911_8132095_800-91995-0 from training. Duration: 0.86 2023-10-03 09:09:34,760 WARNING [train.py:1204] (2/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_49-97158-0_sp1.1 from training. Duration: 0.6909375 2023-10-03 09:09:36,053 WARNING [train.py:1204] (2/4) Exclude cut with ID _61132_210_9969_1_1535021792964_3755419_61-57720-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 09:09:39,143 WARNING [train.py:1204] (2/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_345-306832-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 09:09:39,166 WARNING [train.py:1204] (2/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_164-224052-0_sp0.9 from training. Duration: 0.77775 2023-10-03 09:09:39,817 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.4.encoder.layers.1.whiten, num_groups=1, num_channels=384, metric=11.99 vs. limit=12.0 2023-10-03 09:09:42,657 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_34886_210_10633_1_1533203882_1755661_76-156096-0_sp1.1 from training. Duration: 0.7909375 2023-10-03 09:09:45,336 WARNING [train.py:1204] (2/4) Exclude cut with ID 4133-6541-0027-40495-0_sp1.1 from training. Duration: 0.9681875 2023-10-03 09:09:45,591 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.feed_forward1.hidden_balancer.prob, batch_count=1213553.3333333333, ans=0.125 2023-10-03 09:09:50,101 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.4.encoder.layers.1.feed_forward3.out_whiten, num_groups=1, num_channels=384, metric=12.42 vs. limit=15.0 2023-10-03 09:09:52,497 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_514-211165-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 09:09:53,813 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_41211_210_2544_1_1533348859_3702910_97-212695-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 09:09:55,612 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=1213620.0, ans=0.1 2023-10-03 09:09:56,676 WARNING [train.py:1204] (2/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_406-45086-0 from training. Duration: 0.8 2023-10-03 09:09:58,428 WARNING [train.py:1204] (2/4) Exclude cut with ID _65263_210_22356_1_1535763598691_3921037_49-222917-0_sp1.1 from training. Duration: 0.836375 2023-10-03 09:09:58,472 WARNING [train.py:1204] (2/4) Exclude cut with ID _4866_210_2577_1_1527404412284_4364659_352-157930-0_sp1.1 from training. Duration: 0.736375 2023-10-03 09:09:58,536 WARNING [train.py:1204] (2/4) Exclude cut with ID _4366_210_1388_1_1526792053633_3613819_231-211402-0_sp1.1 from training. Duration: 0.8545625 2023-10-03 09:09:59,765 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_98-21576-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 09:10:01,652 WARNING [train.py:1204] (2/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_348-127789-0_sp0.9 from training. Duration: 0.9444375 2023-10-03 09:10:01,668 WARNING [train.py:1204] (2/4) Exclude cut with ID _45871_210_5665_1_1533864614245_3296060_98-329557-0_sp1.1 from training. Duration: 0.97275 2023-10-03 09:10:01,736 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6846_210_6753_1_1527850767_4295158_2-315073-0_sp1.1 from training. Duration: 0.8 2023-10-03 09:10:03,108 WARNING [train.py:1204] (2/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_145-137790-0 from training. Duration: 0.83 2023-10-03 09:10:03,121 WARNING [train.py:1204] (2/4) Exclude cut with ID _14603_210_4220_1_1530615688441_3097319_133-158308-0_sp0.9 from training. Duration: 0.9333125 2023-10-03 09:10:05,965 WARNING [train.py:1204] (2/4) Exclude cut with ID _14898_210_9206_1_1530766812377_4177190_320-11758-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 09:10:06,209 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.conv_skip_rate, batch_count=1213620.0, ans=0.0 2023-10-03 09:10:12,445 WARNING [train.py:1204] (2/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_406-45086-0_sp0.9 from training. Duration: 0.888875 2023-10-03 09:10:16,784 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_5438_210_1794_1_1527336687_2600009_104-288274-0 from training. Duration: 0.58 2023-10-03 09:10:17,506 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.2.encoder.layers.2.conv_module2.whiten, num_groups=1, num_channels=384, metric=6.20 vs. limit=15.0 2023-10-03 09:10:18,114 WARNING [train.py:1204] (2/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_108-53929-0_sp0.9 from training. Duration: 0.6555625 2023-10-03 09:10:19,489 WARNING [train.py:1204] (2/4) Exclude cut with ID _30800_210_22382_1_1532499347904_4515959_444-286390-0_sp1.1 from training. Duration: 0.963625 2023-10-03 09:10:20,971 WARNING [train.py:1204] (2/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_125-144174-0_sp0.9 from training. Duration: 0.4333125 2023-10-03 09:10:21,179 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.feed_forward1.hidden_balancer.prob, batch_count=1213686.6666666667, ans=0.125 2023-10-03 09:10:22,306 WARNING [train.py:1204] (2/4) Exclude cut with ID _55209_210_1531_1_1534675828996_4597160_118-345621-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 09:10:24,991 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_45886_210_17024_1_1533709215_471063_7-79165-0_sp1.1 from training. Duration: 0.8909375 2023-10-03 09:10:26,293 INFO [train.py:1046] (2/4) Epoch 35, batch 1450, loss[loss=0.1365, simple_loss=0.2154, pruned_loss=0.02881, over 24332.00 frames. ], tot_loss[loss=0.1598, simple_loss=0.2382, pruned_loss=0.04064, over 4690711.85 frames. ], batch size: 56, lr: 2.92e-03, grad_scale: 16.0 2023-10-03 09:10:29,741 WARNING [train.py:1204] (2/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_175-163894-0_sp0.9 from training. Duration: 0.82225 2023-10-03 09:10:31,046 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_50188_210_4198_1_1534503621_3646041_353-81682-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 09:10:31,054 WARNING [train.py:1204] (2/4) Exclude cut with ID _17875_210_2087_1_1531224012907_1821339_175-80565-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 09:10:31,069 WARNING [train.py:1204] (2/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_159-328157-0_sp1.1 from training. Duration: 0.47275 2023-10-03 09:10:35,948 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.self_attn_weights.pos_emb_skip_rate, batch_count=1213753.3333333333, ans=0.0 2023-10-03 09:10:36,986 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.554e+02 1.854e+02 2.034e+02 2.256e+02 3.370e+02, threshold=4.069e+02, percent-clipped=0.0 2023-10-03 09:10:38,363 WARNING [train.py:1204] (2/4) Exclude cut with ID _60057_210_10640_1_1534903065793_3333150_137-267579-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 09:10:38,418 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_92-46009-0_sp1.1 from training. Duration: 0.4909375 2023-10-03 09:10:40,406 WARNING [train.py:1204] (2/4) Exclude cut with ID _53203_210_2087_1_1534246218309_475590_48-68507-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 09:10:40,429 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_28211_210_6388_1_1532861506_4523224_128-328162-0 from training. Duration: 0.9 2023-10-03 09:10:41,813 WARNING [train.py:1204] (2/4) Exclude cut with ID _4697_210_2069_1_1526815801953_3823098_51-1049-0_sp1.1 from training. Duration: 0.6818125 2023-10-03 09:10:43,738 WARNING [train.py:1204] (2/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_383-316000-0 from training. Duration: 0.9 2023-10-03 09:10:43,781 WARNING [train.py:1204] (2/4) Exclude cut with ID _4779_210_2084_1_1527318022944_3914893_185-295153-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 09:10:43,864 WARNING [train.py:1204] (2/4) Exclude cut with ID _3231_210_2106_1_1526205864934_6784220_171-256004-0_sp0.9 from training. Duration: 0.9555625 2023-10-03 09:10:43,864 WARNING [train.py:1204] (2/4) Exclude cut with ID _41314_210_12651_1_1533459476543_3766460_87-272068-0 from training. Duration: 0.83 2023-10-03 09:10:45,264 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_164-152777-0_sp1.1 from training. Duration: 0.92725 2023-10-03 09:10:46,565 WARNING [train.py:1204] (2/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_18-130336-0_sp0.9 from training. Duration: 0.87775 2023-10-03 09:10:46,612 WARNING [train.py:1204] (2/4) Exclude cut with ID _14886_210_4220_1_1530702020355_3516190_175-229073-0_sp1.1 from training. Duration: 0.4090625 2023-10-03 09:10:46,636 WARNING [train.py:1204] (2/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_791-45149-0_sp0.9 from training. Duration: 0.9555625 2023-10-03 09:10:46,737 WARNING [train.py:1204] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_468-189058-0_sp1.1 from training. Duration: 0.87275 2023-10-03 09:10:48,125 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8278_210_2550_1_1528637099_4220413_32-142780-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 09:10:50,902 WARNING [train.py:1204] (2/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_146-218763-0_sp0.9 from training. Duration: 0.9555625 2023-10-03 09:10:53,051 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.1.encoder.layers.1.whiten, num_groups=1, num_channels=256, metric=3.96 vs. limit=12.0 2023-10-03 09:10:55,048 WARNING [train.py:1204] (2/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_348-127789-0_sp1.1 from training. Duration: 0.77275 2023-10-03 09:10:55,061 WARNING [train.py:1204] (2/4) Exclude cut with ID _47790_210_16187_1_1533862814066_4207519_288-285445-0_sp1.1 from training. Duration: 0.936375 2023-10-03 09:10:56,465 WARNING [train.py:1204] (2/4) Exclude cut with ID _14603_210_4220_1_1530615688441_3097319_308-181732-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 09:10:57,763 WARNING [train.py:1204] (2/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_895-117428-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 09:10:59,102 WARNING [train.py:1204] (2/4) Exclude cut with ID _9235_210_5219_1_1528891154253_8046120_387-159008-0_sp0.9 from training. Duration: 0.9555625 2023-10-03 09:10:59,115 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_37351_210_7925_1_1533012931_3812113_265-85780-0_sp0.9 from training. Duration: 0.97775 2023-10-03 09:10:59,139 WARNING [train.py:1204] (2/4) Exclude cut with ID _40551_210_10635_1_1533776424632_3416179_361-3964-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 09:10:59,185 WARNING [train.py:1204] (2/4) Exclude cut with ID _25882_210_3036_1_1532775665815_6479510_485-122742-0_sp1.1 from training. Duration: 0.97275 2023-10-03 09:11:02,583 WARNING [train.py:1204] (2/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_98-51676-0 from training. Duration: 0.73 2023-10-03 09:11:04,138 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.conv_module2.balancer1.prob, batch_count=1213886.6666666667, ans=0.125 2023-10-03 09:11:07,532 WARNING [train.py:1204] (2/4) Exclude cut with ID _30548_210_16040_1_1532608097130_3146969_371-280610-0_sp1.1 from training. Duration: 0.92725 2023-10-03 09:11:10,449 WARNING [train.py:1204] (2/4) Exclude cut with ID IC0671W0081-331924 from training. Duration: 0.896 2023-10-03 09:11:11,861 WARNING [train.py:1204] (2/4) Exclude cut with ID _40284_210_2084_1_1533513679814_3704279_380-160775-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 09:11:13,729 WARNING [train.py:1204] (2/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_57-270037-0_sp1.1 from training. Duration: 0.7 2023-10-03 09:11:15,156 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_853-150237-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 09:11:16,495 WARNING [train.py:1204] (2/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_491-261923-0 from training. Duration: 0.98 2023-10-03 09:11:19,394 WARNING [train.py:1204] (2/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_836-244045-0_sp1.1 from training. Duration: 0.97275 2023-10-03 09:11:20,740 WARNING [train.py:1204] (2/4) Exclude cut with ID _14880_210_8895_1_1530680426655_3795259_289-212817-0 from training. Duration: 0.69 2023-10-03 09:11:22,113 WARNING [train.py:1204] (2/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_303-340446-0 from training. Duration: 0.9 2023-10-03 09:11:22,456 INFO [scaling.py:1118] (2/4) WithLoss: name=encoder.encoders.5.encoder.layers.1.self_attn_weights, loss-sum=0.000e+00 2023-10-03 09:11:23,537 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_37152_210_9697_1_1533106573_3844188_418-41736-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 09:11:27,569 WARNING [train.py:1204] (2/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_371-18169-0_sp1.1 from training. Duration: 0.7818125 2023-10-03 09:11:27,604 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_187-219984-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 09:11:28,995 WARNING [train.py:1204] (2/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_63-267585-0 from training. Duration: 0.9 2023-10-03 09:11:31,118 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.conv_skip_rate, batch_count=1214020.0, ans=0.0 2023-10-03 09:11:32,252 WARNING [train.py:1204] (2/4) Exclude cut with ID _27013_210_1389_1_1532437173240_3252649_222-125573-0 from training. Duration: 0.98 2023-10-03 09:11:33,598 WARNING [train.py:1204] (2/4) Exclude cut with ID _14821_210_8750_1_1530680503991_6829359_189-297451-0 from training. Duration: 0.55 2023-10-03 09:11:33,691 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_557-163391-0_sp1.1 from training. Duration: 0.97275 2023-10-03 09:11:35,033 WARNING [train.py:1204] (2/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_65-88242-0_sp1.1 from training. Duration: 0.5090625 2023-10-03 09:11:41,106 INFO [train.py:1046] (2/4) Epoch 35, batch 1500, loss[loss=0.1543, simple_loss=0.2392, pruned_loss=0.03469, over 24339.00 frames. ], tot_loss[loss=0.1599, simple_loss=0.2387, pruned_loss=0.04054, over 4700246.50 frames. ], batch size: 77, lr: 2.92e-03, grad_scale: 16.0 2023-10-03 09:11:42,512 WARNING [train.py:1204] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_218-295097-0 from training. Duration: 0.77 2023-10-03 09:11:42,533 WARNING [train.py:1204] (2/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_686-247600-0_sp1.1 from training. Duration: 0.836375 2023-10-03 09:11:42,536 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_397-147196-0_sp0.9 from training. Duration: 0.888875 2023-10-03 09:11:44,368 WARNING [train.py:1204] (2/4) Exclude cut with ID _53950_210_5816_1_1534417264919_3862951_237-136449-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 09:11:45,717 WARNING [train.py:1204] (2/4) Exclude cut with ID _3120_210_3525_1_1526036143112_4072980_74-178400-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 09:11:45,791 WARNING [train.py:1204] (2/4) Exclude cut with ID _5166_210_2084_1_1527073197160_4087993_309-222545-0_sp0.9 from training. Duration: 0.9333125 2023-10-03 09:11:47,216 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7544_210_6753_1_1528587722_3576686_172-171750-0 from training. Duration: 0.65 2023-10-03 09:11:48,597 WARNING [train.py:1204] (2/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_3-237434-0_sp0.9 from training. Duration: 0.8444375 2023-10-03 09:11:48,629 WARNING [train.py:1204] (2/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_101-293367-0_sp0.9 from training. Duration: 0.7 2023-10-03 09:11:48,645 WARNING [train.py:1204] (2/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_818-213552-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 09:11:48,716 WARNING [train.py:1204] (2/4) Exclude cut with ID _14349_210_8341_1_1530615710818_3442470_115-285975-0_sp1.1 from training. Duration: 0.7818125 2023-10-03 09:11:51,303 WARNING [train.py:1204] (2/4) Exclude cut with ID _30800_210_22382_1_1532499347904_4515959_291-309749-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 09:11:52,787 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.conv_module2.balancer2.min_abs, batch_count=1214086.6666666667, ans=0.5 2023-10-03 09:11:53,901 WARNING [train.py:1204] (2/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_715-3462-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 09:11:58,180 WARNING [train.py:1204] (2/4) Exclude cut with ID _59502_210_12210_1_1535107024698_1835139_13-331827-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 09:11:58,191 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_4528_210_3273_1_1527076654_3276777_342-168068-0 from training. Duration: 0.87 2023-10-03 09:11:59,523 WARNING [train.py:1204] (2/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_215-141059-0_sp0.9 from training. Duration: 0.688875 2023-10-03 09:11:59,552 WARNING [train.py:1204] (2/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_106-286130-0_sp1.1 from training. Duration: 0.8909375 2023-10-03 09:12:00,459 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=1214153.3333333333, ans=0.1 2023-10-03 09:12:01,360 WARNING [train.py:1204] (2/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_480-77655-0_sp1.1 from training. Duration: 0.97275 2023-10-03 09:12:05,218 WARNING [train.py:1204] (2/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_848-280957-0 from training. Duration: 0.87 2023-10-03 09:12:09,998 WARNING [train.py:1204] (2/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_122-109023-0 from training. Duration: 0.76 2023-10-03 09:12:10,305 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.balancer1.prob, batch_count=1214220.0, ans=0.125 2023-10-03 09:12:11,359 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_11305_210_1794_1_1529750428_7178148_645-233480-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 09:12:11,426 WARNING [train.py:1204] (2/4) Exclude cut with ID _7562_210_2084_1_1528887767916_3946229_345-169156-0 from training. Duration: 0.58 2023-10-03 09:12:14,169 WARNING [train.py:1204] (2/4) Exclude cut with ID _7562_210_2084_1_1528887767916_3946229_345-169156-0_sp1.1 from training. Duration: 0.52725 2023-10-03 09:12:17,307 WARNING [train.py:1204] (2/4) Exclude cut with ID _13253_210_1925_1_1530757775819_3577873_253-246386-0_sp1.1 from training. Duration: 0.8181875 2023-10-03 09:12:17,381 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_136-264444-0_sp1.1 from training. Duration: 0.97275 2023-10-03 09:12:17,394 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_2168_210_2290_1_1525606920_1849400_136-222991-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 09:12:18,725 WARNING [train.py:1204] (2/4) Exclude cut with ID _7852_210_5219_1_1528286471316_8139400_242-105336-0 from training. Duration: 0.72 2023-10-03 09:12:18,770 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_5960_210_1794_1_1527403865_3990999_515-2613-0_sp1.1 from training. Duration: 0.8 2023-10-03 09:12:18,791 WARNING [train.py:1204] (2/4) Exclude cut with ID _39688_210_19596_1_1533371626725_4154159_551-167834-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 09:12:20,193 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_43802_210_2290_1_1533516203_8829504_85-195240-0 from training. Duration: 0.87 2023-10-03 09:12:20,253 WARNING [train.py:1204] (2/4) Exclude cut with ID _63171_210_15488_1_1535104792794_3295110_113-113534-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 09:12:25,818 WARNING [train.py:1204] (2/4) Exclude cut with ID _8833_210_5219_1_1528592665210_7553059_319-349181-0_sp1.1 from training. Duration: 0.9 2023-10-03 09:12:25,819 WARNING [train.py:1204] (2/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_225-43381-0 from training. Duration: 0.94 2023-10-03 09:12:28,206 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.0.layers.0.nonlin_attention.whiten1, num_groups=1, num_channels=144, metric=9.47 vs. limit=10.0 2023-10-03 09:12:31,339 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_5959_210_1794_1_1527400400_3445726_412-44052-0_sp0.9 from training. Duration: 0.8555625 2023-10-03 09:12:33,216 WARNING [train.py:1204] (2/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_356-167816-0_sp1.1 from training. Duration: 0.6545625 2023-10-03 09:12:37,273 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9012W0112-501244 from training. Duration: 0.768 2023-10-03 09:12:37,320 WARNING [train.py:1204] (2/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_177-326952-0_sp1.1 from training. Duration: 0.92725 2023-10-03 09:12:37,330 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9016W0116-502789 from training. Duration: 0.896 2023-10-03 09:12:39,142 WARNING [train.py:1204] (2/4) Exclude cut with ID _56235_210_10640_1_1534557335235_4161099_557-204815-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 09:12:41,112 WARNING [train.py:1204] (2/4) Exclude cut with ID _55857_210_2346_1_1534644007831_6898209_279-229469-0_sp1.1 from training. Duration: 0.936375 2023-10-03 09:12:42,397 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9029W0375-509764 from training. Duration: 0.896 2023-10-03 09:12:43,811 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_2295_210_2087_1_1525500993_2407863_234-83104-0_sp0.9 from training. Duration: 0.688875 2023-10-03 09:12:45,260 WARNING [train.py:1204] (2/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_266-137104-0 from training. Duration: 0.65 2023-10-03 09:12:46,629 WARNING [train.py:1204] (2/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_289-163927-0_sp1.1 from training. Duration: 0.92725 2023-10-03 09:12:48,748 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.balancer2.prob, batch_count=1214353.3333333333, ans=0.125 2023-10-03 09:12:49,973 WARNING [train.py:1204] (2/4) Exclude cut with ID _12733_210_8341_1_1530167424556_3646380_86-225314-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 09:12:51,293 WARNING [train.py:1204] (2/4) Exclude cut with ID _7877_210_6014_1_1528286358099_3750136_618-284064-0_sp1.1 from training. Duration: 0.92725 2023-10-03 09:12:51,332 WARNING [train.py:1204] (2/4) Exclude cut with ID _55844_210_2346_1_1534471212569_7625707_1017-31469-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 09:12:51,372 WARNING [train.py:1204] (2/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_617-241446-0_sp1.1 from training. Duration: 0.92725 2023-10-03 09:12:51,428 WARNING [train.py:1204] (2/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_866-173440-0_sp1.1 from training. Duration: 0.8181875 2023-10-03 09:12:54,132 INFO [train.py:1046] (2/4) Epoch 35, batch 1550, loss[loss=0.1629, simple_loss=0.2393, pruned_loss=0.04324, over 23371.00 frames. ], tot_loss[loss=0.1602, simple_loss=0.2392, pruned_loss=0.04059, over 4709986.72 frames. ], batch size: 119, lr: 2.92e-03, grad_scale: 16.0 2023-10-03 09:12:54,189 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_9460_210_4211_1_1528954587_10049667_480-260269-0 from training. Duration: 0.56 2023-10-03 09:12:54,227 WARNING [train.py:1204] (2/4) Exclude cut with ID _44659_210_2721_1_1534036120061_6614860_626-298884-0 from training. Duration: 0.97 2023-10-03 09:12:54,267 WARNING [train.py:1204] (2/4) Exclude cut with ID _53565_210_10635_1_1534337218335_3820259_287-18120-0_sp1.1 from training. Duration: 0.8818125 2023-10-03 09:12:55,545 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_791-197891-0 from training. Duration: 0.84 2023-10-03 09:12:55,596 WARNING [train.py:1204] (2/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_527-238990-0 from training. Duration: 0.9 2023-10-03 09:12:55,841 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=1214420.0, ans=0.1 2023-10-03 09:12:58,313 WARNING [train.py:1204] (2/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_449-65969-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 09:12:58,411 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_553-341452-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 09:12:59,740 WARNING [train.py:1204] (2/4) Exclude cut with ID _16909_210_5724_1_1531292079912_4118919_300-47574-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 09:12:59,756 WARNING [train.py:1204] (2/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_70-27732-0_sp1.1 from training. Duration: 0.8545625 2023-10-03 09:12:59,864 WARNING [train.py:1204] (2/4) Exclude cut with ID _44718_210_5816_1_1534050051869_6469030_727-28074-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 09:13:01,141 WARNING [train.py:1204] (2/4) Exclude cut with ID _55857_210_2346_1_1534644007831_6898209_357-123664-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 09:13:04,291 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.573e+02 1.874e+02 2.151e+02 2.475e+02 3.456e+02, threshold=4.303e+02, percent-clipped=0.0 2023-10-03 09:13:04,405 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9123W0004-555964 from training. Duration: 0.896 2023-10-03 09:13:04,433 WARNING [train.py:1204] (2/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_445-145208-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 09:13:04,456 WARNING [train.py:1204] (2/4) Exclude cut with ID _3675_210_4557_1_1526639934408_4182089_456-18305-0_sp1.1 from training. Duration: 0.6909375 2023-10-03 09:13:05,775 WARNING [train.py:1204] (2/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_1-99680-0_sp1.1 from training. Duration: 0.6090625 2023-10-03 09:13:09,168 WARNING [train.py:1204] (2/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_146-282400-0_sp0.9 from training. Duration: 0.788875 2023-10-03 09:13:09,189 WARNING [train.py:1204] (2/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_844-96989-0 from training. Duration: 0.71 2023-10-03 09:13:11,150 WARNING [train.py:1204] (2/4) Exclude cut with ID _29853_210_7497_1_1532505600851_6642110_128-146364-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 09:13:12,397 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_224-322394-0 from training. Duration: 0.66 2023-10-03 09:13:13,904 WARNING [train.py:1204] (2/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_191-59784-0 from training. Duration: 0.88 2023-10-03 09:13:13,909 WARNING [train.py:1204] (2/4) Exclude cut with ID _12556_210_8341_1_1530081024100_7327690_725-193477-0 from training. Duration: 0.98 2023-10-03 09:13:13,967 WARNING [train.py:1204] (2/4) Exclude cut with ID _5166_210_2084_1_1527073197160_4087993_523-79838-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 09:13:15,369 WARNING [train.py:1204] (2/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_398-334558-0_sp1.1 from training. Duration: 0.87275 2023-10-03 09:13:15,525 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.1.feed_forward3.hidden_balancer.prob, batch_count=1214486.6666666667, ans=0.125 2023-10-03 09:13:19,956 WARNING [train.py:1204] (2/4) Exclude cut with ID _6456_210_1534_1_1527681634212_3181049_171-36697-0_sp1.1 from training. Duration: 0.936375 2023-10-03 09:13:22,828 WARNING [train.py:1204] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_700-183549-0 from training. Duration: 0.68 2023-10-03 09:13:22,831 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8853_210_3273_1_1528633899_818968_51-187685-0 from training. Duration: 0.69 2023-10-03 09:13:31,253 WARNING [train.py:1204] (2/4) Exclude cut with ID _7719_210_3525_1_1528456153163_4296960_333-110399-0_sp1.1 from training. Duration: 0.87275 2023-10-03 09:13:34,196 WARNING [train.py:1204] (2/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_1022-83740-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 09:13:34,215 WARNING [train.py:1204] (2/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_61-327062-0_sp0.9 from training. Duration: 0.9 2023-10-03 09:13:34,227 WARNING [train.py:1204] (2/4) Exclude cut with ID _47251_210_18628_1_1533814063128_4137250_214-88076-0_sp1.1 from training. Duration: 0.8 2023-10-03 09:13:35,508 WARNING [train.py:1204] (2/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_839-173017-0 from training. Duration: 0.41 2023-10-03 09:13:41,836 WARNING [train.py:1204] (2/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_299-34413-0_sp1.1 from training. Duration: 0.5909375 2023-10-03 09:13:43,550 WARNING [train.py:1204] (2/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_664-159457-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 09:13:46,354 WARNING [train.py:1204] (2/4) Exclude cut with ID _66873_210_5517_1_1535454136506_4293370_483-291760-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 09:13:47,871 WARNING [train.py:1204] (2/4) Exclude cut with ID _30800_210_22382_1_1532499347904_4515959_287-255474-0_sp1.1 from training. Duration: 0.92725 2023-10-03 09:13:49,521 WARNING [train.py:1204] (2/4) Exclude cut with ID _48677_210_5663_1_1533902405919_3892989_111-11439-0_sp1.1 from training. Duration: 0.87275 2023-10-03 09:13:49,534 WARNING [train.py:1204] (2/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_399-156791-0 from training. Duration: 0.83 2023-10-03 09:13:49,572 WARNING [train.py:1204] (2/4) Exclude cut with ID _3992_210_1747_1_1526644816031_3731060_36-147855-0_sp0.9 from training. Duration: 0.8666875 2023-10-03 09:13:52,253 WARNING [train.py:1204] (2/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_442-320317-0_sp1.1 from training. Duration: 0.7454375 2023-10-03 09:13:52,279 WARNING [train.py:1204] (2/4) Exclude cut with ID _55429_210_10626_1_1534467588527_7174143_953-80868-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 09:13:52,558 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.bypass.skip_rate, batch_count=1214686.6666666667, ans=0.04949747468305833 2023-10-03 09:13:54,198 WARNING [train.py:1204] (2/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_159-328157-0_sp0.9 from training. Duration: 0.57775 2023-10-03 09:13:54,211 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9309W0197-640324 from training. Duration: 0.896 2023-10-03 09:13:57,015 WARNING [train.py:1204] (2/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_361-30822-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 09:13:59,872 WARNING [train.py:1204] (2/4) Exclude cut with ID _5250_210_5219_1_1527249561806_8692110_636-251213-0 from training. Duration: 0.94 2023-10-03 09:14:01,370 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.bypass.skip_rate, batch_count=1214686.6666666667, ans=0.09899494936611666 2023-10-03 09:14:05,245 WARNING [train.py:1204] (2/4) Exclude cut with ID _49728_210_12651_1_1534041034294_6788150_468-333827-0_sp1.1 from training. Duration: 0.963625 2023-10-03 09:14:05,315 WARNING [train.py:1204] (2/4) Exclude cut with ID _25882_210_3036_1_1532775665815_6479510_623-191759-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 09:14:05,372 WARNING [train.py:1204] (2/4) Exclude cut with ID _5189_210_4557_1_1527248319428_3769229_199-53526-0 from training. Duration: 0.42 2023-10-03 09:14:08,468 INFO [train.py:1046] (2/4) Epoch 35, batch 1600, loss[loss=0.1521, simple_loss=0.2393, pruned_loss=0.03244, over 24578.00 frames. ], tot_loss[loss=0.1603, simple_loss=0.2398, pruned_loss=0.04044, over 4715067.35 frames. ], batch size: 71, lr: 2.92e-03, grad_scale: 32.0 2023-10-03 09:14:08,567 WARNING [train.py:1204] (2/4) Exclude cut with ID _6274_210_2084_1_1527937583837_7494020_32-205288-0_sp0.9 from training. Duration: 0.8666875 2023-10-03 09:14:09,849 WARNING [train.py:1204] (2/4) Exclude cut with ID _65263_210_22356_1_1535763598691_3921037_401-346646-0_sp1.1 from training. Duration: 0.963625 2023-10-03 09:14:09,853 WARNING [train.py:1204] (2/4) Exclude cut with ID _12555_210_8750_1_1530075594402_3780239_449-267986-0_sp1.1 from training. Duration: 0.7545625 2023-10-03 09:14:09,871 WARNING [train.py:1204] (2/4) Exclude cut with ID _69884_210_9771_1_1535846392818_4166169_430-201020-0_sp1.1 from training. Duration: 0.8818125 2023-10-03 09:14:11,165 WARNING [train.py:1204] (2/4) Exclude cut with ID _8394_210_6702_1_1528590504316_3540189_379-49457-0_sp1.1 from training. Duration: 0.7909375 2023-10-03 09:14:14,801 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.nonlin_attention.balancer.prob, batch_count=1214753.3333333333, ans=0.125 2023-10-03 09:14:16,514 WARNING [train.py:1204] (2/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_200-105218-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 09:14:17,827 WARNING [train.py:1204] (2/4) Exclude cut with ID _3867_210_1883_1_1526641200575_4475830_319-349850-0 from training. Duration: 0.67 2023-10-03 09:14:17,909 WARNING [train.py:1204] (2/4) Exclude cut with ID _28317_210_12742_1_1532480302381_8510325_916-195848-0 from training. Duration: 0.92 2023-10-03 09:14:19,267 WARNING [train.py:1204] (2/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_436-186311-0 from training. Duration: 0.65 2023-10-03 09:14:19,508 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.bypass_mid.scale_min, batch_count=1214753.3333333333, ans=0.2 2023-10-03 09:14:20,692 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_3910_210_1794_1_1526777667_3747260_302-180617-0_sp1.1 from training. Duration: 0.97275 2023-10-03 09:14:22,165 WARNING [train.py:1204] (2/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_102-92751-0 from training. Duration: 0.99 2023-10-03 09:14:22,355 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.feed_forward2.hidden_balancer.prob, batch_count=1214820.0, ans=0.125 2023-10-03 09:14:23,942 WARNING [train.py:1204] (2/4) Exclude cut with ID _51899_210_20034_1_1534129254466_3669220_265-215428-0_sp1.1 from training. Duration: 0.8909375 2023-10-03 09:14:25,349 WARNING [train.py:1204] (2/4) Exclude cut with ID _58583_210_5236_1_1534726835266_3574249_438-198993-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 09:14:29,423 WARNING [train.py:1204] (2/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_166-166990-0_sp1.1 from training. Duration: 0.87275 2023-10-03 09:14:33,449 WARNING [train.py:1204] (2/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_907-151398-0 from training. Duration: 0.37 2023-10-03 09:14:36,294 WARNING [train.py:1204] (2/4) Exclude cut with ID _38670_210_15379_1_1533448810659_4391059_452-256669-0_sp1.1 from training. Duration: 0.936375 2023-10-03 09:14:37,565 WARNING [train.py:1204] (2/4) Exclude cut with ID _48677_210_5663_1_1533902405919_3892989_408-40740-0 from training. Duration: 0.83 2023-10-03 09:14:37,616 WARNING [train.py:1204] (2/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_903-158756-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 09:14:37,661 WARNING [train.py:1204] (2/4) Exclude cut with ID _13252_210_1925_1_1530671372047_3336909_397-216497-0 from training. Duration: 0.96 2023-10-03 09:14:38,540 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.conv_module2.balancer1.prob, batch_count=1214886.6666666667, ans=0.125 2023-10-03 09:14:42,288 WARNING [train.py:1204] (2/4) Exclude cut with ID _55429_210_10626_1_1534467588527_7174143_97-335084-0 from training. Duration: 0.97 2023-10-03 09:14:50,052 WARNING [train.py:1204] (2/4) Exclude cut with ID _45329_210_7005_1_1534157951211_6774339_453-267730-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 09:14:51,409 WARNING [train.py:1204] (2/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_153-97441-0 from training. Duration: 0.56 2023-10-03 09:14:52,812 WARNING [train.py:1204] (2/4) Exclude cut with ID _40901_210_5665_1_1533363789958_3856130_312-329122-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 09:14:52,861 WARNING [train.py:1204] (2/4) Exclude cut with ID _30800_210_22382_1_1532499347904_4515959_97-126471-0_sp1.1 from training. Duration: 0.97275 2023-10-03 09:14:52,861 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_11305_210_1794_1_1529750428_7178148_257-312870-0_sp1.1 from training. Duration: 0.8 2023-10-03 09:14:56,027 WARNING [train.py:1204] (2/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_454-314901-0_sp0.9 from training. Duration: 0.511125 2023-10-03 09:15:00,232 WARNING [train.py:1204] (2/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_282-136641-0_sp1.1 from training. Duration: 0.3818125 2023-10-03 09:15:01,633 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_46013_210_7234_1_1533776362_3744032_493-160724-0_sp1.1 from training. Duration: 0.963625 2023-10-03 09:15:01,675 WARNING [train.py:1204] (2/4) Exclude cut with ID _67831_210_12392_1_1535612464202_3092612_259-33442-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 09:15:03,036 WARNING [train.py:1204] (2/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_289-62384-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 09:15:04,346 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6545_210_2040_1_1527774670_4245258_379-14182-0_sp0.9 from training. Duration: 0.888875 2023-10-03 09:15:05,803 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_548-294316-0_sp1.1 from training. Duration: 0.736375 2023-10-03 09:15:05,971 INFO [scaling.py:1118] (2/4) WithLoss: name=encoder.encoders.0.layers.0.self_attn_weights, loss-sum=0.000e+00 2023-10-03 09:15:07,159 WARNING [train.py:1204] (2/4) Exclude cut with ID _6275_210_2084_1_1528110062213_7669049_857-298942-0_sp1.1 from training. Duration: 0.77275 2023-10-03 09:15:08,592 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6673_210_3273_1_1527767790_3173240_327-306185-0_sp1.1 from training. Duration: 0.7545625 2023-10-03 09:15:13,412 WARNING [train.py:1204] (2/4) Exclude cut with ID _61008_210_10633_1_1534852769198_6974370_814-107647-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 09:15:14,811 WARNING [train.py:1204] (2/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_116-332008-0_sp1.1 from training. Duration: 0.8909375 2023-10-03 09:15:16,850 WARNING [train.py:1204] (2/4) Exclude cut with ID _32704_210_8954_1_1533017488840_6747370_266-329755-0 from training. Duration: 0.89 2023-10-03 09:15:16,853 WARNING [train.py:1204] (2/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_271-269084-0_sp0.9 from training. Duration: 0.92225 2023-10-03 09:15:18,202 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_3171_210_2087_1_1526109084_2330007_66-260583-0 from training. Duration: 0.72 2023-10-03 09:15:22,290 INFO [train.py:1046] (2/4) Epoch 35, batch 1650, loss[loss=0.1636, simple_loss=0.2539, pruned_loss=0.03669, over 24620.00 frames. ], tot_loss[loss=0.1607, simple_loss=0.2402, pruned_loss=0.04061, over 4726021.20 frames. ], batch size: 73, lr: 2.92e-03, grad_scale: 16.0 2023-10-03 09:15:24,370 WARNING [train.py:1204] (2/4) Exclude cut with ID _5250_210_5219_1_1527249561806_8692110_1326-276386-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 09:15:24,486 WARNING [train.py:1204] (2/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_372-30302-0_sp1.1 from training. Duration: 0.7818125 2023-10-03 09:15:25,839 WARNING [train.py:1204] (2/4) Exclude cut with ID _25882_210_3036_1_1532775665815_6479510_631-257029-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 09:15:25,855 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_3691_210_3528_1_1526468169_4062053_355-334432-0 from training. Duration: 0.87 2023-10-03 09:15:25,869 WARNING [train.py:1204] (2/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_839-173017-0_sp1.1 from training. Duration: 0.37275 2023-10-03 09:15:25,876 WARNING [train.py:1204] (2/4) Exclude cut with ID _14998_210_5219_1_1530705582080_7474970_441-91466-0 from training. Duration: 0.97 2023-10-03 09:15:27,240 WARNING [train.py:1204] (2/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_108-53929-0 from training. Duration: 0.59 2023-10-03 09:15:31,348 WARNING [train.py:1204] (2/4) Exclude cut with ID _9389_210_1664_1_1529217337301_5289269_260-1942-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 09:15:31,405 WARNING [train.py:1204] (2/4) Exclude cut with ID _56423_210_5854_1_1534896061049_3165969_198-61547-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 09:15:32,714 WARNING [train.py:1204] (2/4) Exclude cut with ID _44778_210_2054_1_1533798127203_7443090_412-142122-0_sp1.1 from training. Duration: 0.92725 2023-10-03 09:15:32,741 WARNING [train.py:1204] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_88-341718-0_sp1.1 from training. Duration: 0.6 2023-10-03 09:15:34,005 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.573e+02 1.944e+02 2.112e+02 2.392e+02 3.284e+02, threshold=4.225e+02, percent-clipped=0.0 2023-10-03 09:15:35,386 WARNING [train.py:1204] (2/4) Exclude cut with ID _61079_210_4525_1_1534921008198_4011419_334-298697-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 09:15:36,838 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_5465_210_4211_1_1527227253_55357_8-2499-0_sp1.1 from training. Duration: 0.996375 2023-10-03 09:15:38,367 INFO [scaling.py:1118] (2/4) WithLoss: name=encoder.encoders.0.layers.0.self_attn_weights, loss-sum=0.000e+00 2023-10-03 09:15:39,532 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_447-201565-0_sp1.1 from training. Duration: 0.97275 2023-10-03 09:15:39,538 WARNING [train.py:1204] (2/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_253-161789-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 09:15:39,541 WARNING [train.py:1204] (2/4) Exclude cut with ID _44304_210_15374_1_1533639352765_4613129_302-22084-0_sp0.9 from training. Duration: 0.9555625 2023-10-03 09:15:39,550 WARNING [train.py:1204] (2/4) Exclude cut with ID _26664_210_7770_1_1532258943280_3418270_187-80815-0_sp1.1 from training. Duration: 0.8454375 2023-10-03 09:15:39,617 WARNING [train.py:1204] (2/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_68-212801-0 from training. Duration: 0.73 2023-10-03 09:15:39,647 WARNING [train.py:1204] (2/4) Exclude cut with ID _29345_210_4381_1_1532689168263_3860169_149-257268-0 from training. Duration: 0.81 2023-10-03 09:15:44,425 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_213-103282-0_sp1.1 from training. Duration: 0.7090625 2023-10-03 09:15:47,892 WARNING [train.py:1204] (2/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_156-302675-0_sp1.1 from training. Duration: 0.5 2023-10-03 09:15:55,487 WARNING [train.py:1204] (2/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_125-322172-0 from training. Duration: 0.93 2023-10-03 09:15:56,899 WARNING [train.py:1204] (2/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_493-60474-0_sp1.1 from training. Duration: 0.963625 2023-10-03 09:15:58,246 WARNING [train.py:1204] (2/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_156-302675-0 from training. Duration: 0.55 2023-10-03 09:15:59,600 WARNING [train.py:1204] (2/4) Exclude cut with ID _41314_210_12651_1_1533459476543_3766460_123-267862-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 09:16:02,746 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.4.encoder.layers.0.conv_module1.whiten, num_groups=1, num_channels=384, metric=7.69 vs. limit=15.0 2023-10-03 09:16:03,507 WARNING [train.py:1204] (2/4) Exclude cut with ID _28427_210_4220_1_1532581240967_3061069_201-143747-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 09:16:04,839 WARNING [train.py:1204] (2/4) Exclude cut with ID _8772_210_1850_1_1528635595972_3664011_96-12756-0_sp1.1 from training. Duration: 0.8909375 2023-10-03 09:16:04,865 WARNING [train.py:1204] (2/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_1347-145675-0_sp1.1 from training. Duration: 0.936375 2023-10-03 09:16:04,971 WARNING [train.py:1204] (2/4) Exclude cut with ID _9235_210_5219_1_1528891154253_8046120_387-159008-0_sp1.1 from training. Duration: 0.7818125 2023-10-03 09:16:04,988 WARNING [train.py:1204] (2/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_400-201896-0_sp1.1 from training. Duration: 0.963625 2023-10-03 09:16:07,841 WARNING [train.py:1204] (2/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_326-174612-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 09:16:09,158 WARNING [train.py:1204] (2/4) Exclude cut with ID _55844_210_2346_1_1534471212569_7625707_390-116233-0_sp1.1 from training. Duration: 0.963625 2023-10-03 09:16:09,194 WARNING [train.py:1204] (2/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_419-317062-0_sp1.1 from training. Duration: 0.82725 2023-10-03 09:16:09,421 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.1.ff2_skip_rate, batch_count=1215286.6666666667, ans=0.0 2023-10-03 09:16:11,229 WARNING [train.py:1204] (2/4) Exclude cut with ID _29877_210_6228_1_1532397791106_5276149_266-62291-0_sp0.9 from training. Duration: 0.9666875 2023-10-03 09:16:12,533 WARNING [train.py:1204] (2/4) Exclude cut with ID _7857_210_1534_1_1528286515745_6831470_431-338528-0_sp1.1 from training. Duration: 0.92725 2023-10-03 09:16:13,891 WARNING [train.py:1204] (2/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_87-131427-0_sp0.9 from training. Duration: 0.8666875 2023-10-03 09:16:17,375 WARNING [train.py:1204] (2/4) Exclude cut with ID _24055_210_12651_1_1532310907143_976979_92-59356-0_sp1.1 from training. Duration: 0.82725 2023-10-03 09:16:17,459 WARNING [train.py:1204] (2/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_96-290039-0 from training. Duration: 0.56 2023-10-03 09:16:20,159 WARNING [train.py:1204] (2/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_48-182155-0_sp0.9 from training. Duration: 0.9666875 2023-10-03 09:16:20,212 WARNING [train.py:1204] (2/4) Exclude cut with ID _4626_210_3532_1_1526817652717_3916859_446-206656-0 from training. Duration: 0.76 2023-10-03 09:16:21,698 WARNING [train.py:1204] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_168-32906-0 from training. Duration: 0.69 2023-10-03 09:16:21,714 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_3780_210_2087_1_1526467760_2130483_170-259044-0 from training. Duration: 0.7 2023-10-03 09:16:21,732 WARNING [train.py:1204] (2/4) Exclude cut with ID _65134_210_14763_1_1535335393985_3505368_153-216718-0_sp1.1 from training. Duration: 0.92725 2023-10-03 09:16:21,901 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.bypass.skip_rate, batch_count=1215353.3333333333, ans=0.07 2023-10-03 09:16:23,207 WARNING [train.py:1204] (2/4) Exclude cut with ID _64639_210_5816_1_1535367619437_3528000_461-329156-0_sp1.1 from training. Duration: 0.87275 2023-10-03 09:16:24,478 WARNING [train.py:1204] (2/4) Exclude cut with ID _21989_210_5854_1_1531899214931_3271360_126-347531-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 09:16:24,550 WARNING [train.py:1204] (2/4) Exclude cut with ID _7614_210_4915_1_1529326764092_4537359_96-162197-0_sp1.1 from training. Duration: 0.963625 2023-10-03 09:16:24,551 WARNING [train.py:1204] (2/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_1083-147536-0 from training. Duration: 0.97 2023-10-03 09:16:28,043 WARNING [train.py:1204] (2/4) Exclude cut with ID _64029_210_4381_1_1535178502772_3756459_275-16710-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 09:16:29,345 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7264_210_4938_1_1527939911_8132095_800-91995-0_sp0.9 from training. Duration: 0.9555625 2023-10-03 09:16:29,383 WARNING [train.py:1204] (2/4) Exclude cut with ID _3844_210_1390_1_1526727803582_2871209_50-193879-0_sp1.1 from training. Duration: 0.936375 2023-10-03 09:16:33,292 WARNING [train.py:1204] (2/4) Exclude cut with ID _46952_210_5986_1_1534230010357_3107379_232-344503-0 from training. Duration: 0.96 2023-10-03 09:16:34,013 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.3.encoder.layers.2.self_attn1.whiten, num_groups=1, num_channels=512, metric=11.56 vs. limit=22.5 2023-10-03 09:16:35,004 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.attention_skip_rate, batch_count=1215420.0, ans=0.0 2023-10-03 09:16:36,032 INFO [train.py:1046] (2/4) Epoch 35, batch 1700, loss[loss=0.1437, simple_loss=0.2066, pruned_loss=0.04039, over 23460.00 frames. ], tot_loss[loss=0.1601, simple_loss=0.2393, pruned_loss=0.04049, over 4710179.44 frames. ], batch size: 285, lr: 2.92e-03, grad_scale: 16.0 2023-10-03 09:16:36,204 WARNING [train.py:1204] (2/4) Exclude cut with ID _25808_210_13292_1_1532343514562_7366109_206-14076-0_sp1.1 from training. Duration: 0.936375 2023-10-03 09:16:36,205 WARNING [train.py:1204] (2/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_55-31904-0_sp0.9 from training. Duration: 0.97775 2023-10-03 09:16:37,549 WARNING [train.py:1204] (2/4) Exclude cut with ID _14632_210_9988_1_1530691279392_3923200_161-129530-0 from training. Duration: 0.96 2023-10-03 09:16:37,617 WARNING [train.py:1204] (2/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_770-200430-0_sp1.1 from training. Duration: 0.8090625 2023-10-03 09:16:38,840 WARNING [train.py:1204] (2/4) Exclude cut with ID _6270_210_2084_1_1527850911573_3856420_26-277343-0_sp1.1 from training. Duration: 0.7454375 2023-10-03 09:16:38,841 WARNING [train.py:1204] (2/4) Exclude cut with ID _69884_210_9771_1_1535846392818_4166169_88-23776-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 09:16:42,255 WARNING [train.py:1204] (2/4) Exclude cut with ID _60860_210_8254_1_1534896015875_3557169_245-256666-0_sp1.1 from training. Duration: 0.8818125 2023-10-03 09:16:42,284 WARNING [train.py:1204] (2/4) Exclude cut with ID _5250_210_5219_1_1527249561806_8692110_636-251213-0_sp1.1 from training. Duration: 0.8545625 2023-10-03 09:16:42,298 WARNING [train.py:1204] (2/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_540-131164-0 from training. Duration: 0.92 2023-10-03 09:16:43,558 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_5931_210_2040_1_1527428481_4547999_321-193614-0_sp0.9 from training. Duration: 0.7444375 2023-10-03 09:16:51,524 WARNING [train.py:1204] (2/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_373-225403-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 09:16:54,157 WARNING [train.py:1204] (2/4) Exclude cut with ID _20622_210_3852_1_1531622427080_1093839_57-326027-0_sp1.1 from training. Duration: 0.97275 2023-10-03 09:16:58,945 WARNING [train.py:1204] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_605-257205-0_sp1.1 from training. Duration: 0.536375 2023-10-03 09:16:58,981 WARNING [train.py:1204] (2/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_442-320317-0_sp0.9 from training. Duration: 0.911125 2023-10-03 09:16:59,015 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_35361_210_4238_1_1533262810_7793476_675-87012-0_sp1.1 from training. Duration: 0.8090625 2023-10-03 09:16:59,049 WARNING [train.py:1204] (2/4) Exclude cut with ID _4184_210_2550_1_1526730192013_2219380_306-44736-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 09:17:01,856 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_124-113562-0 from training. Duration: 0.94 2023-10-03 09:17:03,355 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_2821_210_2715_1_1526108385_3763560_73-296673-0_sp0.9 from training. Duration: 0.92225 2023-10-03 09:17:03,376 WARNING [train.py:1204] (2/4) Exclude cut with ID _10301_210_5886_1_1529222275332_3910620_216-46959-0_sp1.1 from training. Duration: 0.92725 2023-10-03 09:17:04,884 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.conv_module2.balancer1.prob, batch_count=1215553.3333333333, ans=0.125 2023-10-03 09:17:06,008 WARNING [train.py:1204] (2/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_91-277684-0_sp0.9 from training. Duration: 0.82225 2023-10-03 09:17:07,380 WARNING [train.py:1204] (2/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_351-294650-0_sp0.9 from training. Duration: 0.7 2023-10-03 09:17:08,865 WARNING [train.py:1204] (2/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_209-205365-0 from training. Duration: 0.79 2023-10-03 09:17:09,098 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.bypass_mid.scale_min, batch_count=1215553.3333333333, ans=0.2 2023-10-03 09:17:10,186 WARNING [train.py:1204] (2/4) Exclude cut with ID _14603_210_4220_1_1530615688441_3097319_97-325053-0 from training. Duration: 0.92 2023-10-03 09:17:12,105 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_49348_210_7927_1_1533954500_3691300_109-276430-0_sp1.1 from training. Duration: 0.92725 2023-10-03 09:17:13,866 WARNING [train.py:1204] (2/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_912-47473-0 from training. Duration: 0.68 2023-10-03 09:17:15,246 WARNING [train.py:1204] (2/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_96-320736-0_sp0.9 from training. Duration: 0.9555625 2023-10-03 09:17:24,003 WARNING [train.py:1204] (2/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_1252-235481-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 09:17:24,095 WARNING [train.py:1204] (2/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_813-261554-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 09:17:24,160 WARNING [train.py:1204] (2/4) Exclude cut with ID _21632_210_2253_1_1531994001765_3744140_110-274654-0_sp0.9 from training. Duration: 0.911125 2023-10-03 09:17:25,563 WARNING [train.py:1204] (2/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_381-317791-0_sp1.1 from training. Duration: 0.663625 2023-10-03 09:17:25,575 WARNING [train.py:1204] (2/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_51-117576-0_sp1.1 from training. Duration: 0.363625 2023-10-03 09:17:25,600 WARNING [train.py:1204] (2/4) Exclude cut with ID _42321_210_7770_1_1533468631352_4366895_104-215915-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 09:17:28,768 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_40896_210_15710_1_1533433956_7100802_216-22824-0_sp1.1 from training. Duration: 0.92725 2023-10-03 09:17:28,768 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_14831_210_7927_1_1530700200_5583610_181-27467-0 from training. Duration: 0.91 2023-10-03 09:17:28,832 WARNING [train.py:1204] (2/4) Exclude cut with ID _30304_210_6319_1_1532476827261_7582060_1024-265645-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 09:17:28,834 WARNING [train.py:1204] (2/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_412-233904-0_sp1.1 from training. Duration: 0.936375 2023-10-03 09:17:28,853 WARNING [train.py:1204] (2/4) Exclude cut with ID _30191_210_1664_1_1533189915047_7196670_911-223461-0_sp1.1 from training. Duration: 0.92725 2023-10-03 09:17:28,860 WARNING [train.py:1204] (2/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_558-148266-0_sp1.1 from training. Duration: 0.963625 2023-10-03 09:17:31,579 WARNING [train.py:1204] (2/4) Exclude cut with ID _58480_210_12392_1_1534811121111_3646160_212-164495-0_sp1.1 from training. Duration: 0.936375 2023-10-03 09:17:31,580 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_454-276429-0_sp0.9 from training. Duration: 0.9666875 2023-10-03 09:17:32,865 WARNING [train.py:1204] (2/4) Exclude cut with ID _24736_210_3279_1_1532332894218_2838700_73-197086-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 09:17:32,915 WARNING [train.py:1204] (2/4) Exclude cut with ID _5294_210_1747_1_1527249643928_3696179_529-242298-0_sp1.1 from training. Duration: 0.863625 2023-10-03 09:17:34,215 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_53555_210_5632_1_1534595220_7232297_345-171438-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 09:17:36,944 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_28211_210_6388_1_1532861506_4523224_22-25766-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 09:17:38,316 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7002_210_1794_1_1527939683_5157286_163-87552-0 from training. Duration: 0.86 2023-10-03 09:17:40,284 WARNING [train.py:1204] (2/4) Exclude cut with ID _56927_210_9969_1_1534848970840_3695229_409-225129-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 09:17:42,138 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_26192_210_7927_1_1532137269_1040115_21-219331-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 09:17:45,562 WARNING [train.py:1204] (2/4) Exclude cut with ID _15012_210_1968_1_1530856820429_5095350_662-210469-0 from training. Duration: 0.57 2023-10-03 09:17:47,691 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.3.encoder.layers.0.self_attn1.whiten, num_groups=1, num_channels=512, metric=21.18 vs. limit=22.5 2023-10-03 09:17:51,151 INFO [train.py:1046] (2/4) Epoch 35, batch 1750, loss[loss=0.1697, simple_loss=0.2487, pruned_loss=0.0453, over 23412.00 frames. ], tot_loss[loss=0.1592, simple_loss=0.2383, pruned_loss=0.0401, over 4709208.29 frames. ], batch size: 105, lr: 2.92e-03, grad_scale: 16.0 2023-10-03 09:17:51,191 WARNING [train.py:1204] (2/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_582-191016-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 09:17:53,980 WARNING [train.py:1204] (2/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_270-210032-0_sp1.1 from training. Duration: 0.963625 2023-10-03 09:17:54,015 WARNING [train.py:1204] (2/4) Exclude cut with ID _6789_210_1534_1_1527857937218_3118950_262-48853-0_sp0.9 from training. Duration: 0.62225 2023-10-03 09:17:55,375 WARNING [train.py:1204] (2/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_847-41334-0 from training. Duration: 0.44 2023-10-03 09:17:55,404 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_40896_210_15710_1_1533433956_7100802_573-46532-0_sp1.1 from training. Duration: 0.92725 2023-10-03 09:17:59,572 WARNING [train.py:1204] (2/4) Exclude cut with ID _21881_210_3525_1_1531825242757_3798380_247-102591-0_sp1.1 from training. Duration: 0.8909375 2023-10-03 09:17:59,602 WARNING [train.py:1204] (2/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_200-21395-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 09:18:02,916 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.539e+02 1.871e+02 1.981e+02 2.197e+02 2.904e+02, threshold=3.962e+02, percent-clipped=0.0 2023-10-03 09:18:03,075 WARNING [train.py:1204] (2/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_711-15688-0 from training. Duration: 0.43 2023-10-03 09:18:05,723 WARNING [train.py:1204] (2/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_53-75025-0_sp1.1 from training. Duration: 0.963625 2023-10-03 09:18:08,509 WARNING [train.py:1204] (2/4) Exclude cut with ID _6194_210_1534_1_1527511826160_3941194_321-148641-0 from training. Duration: 0.64 2023-10-03 09:18:08,548 WARNING [train.py:1204] (2/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_337-294431-0_sp1.1 from training. Duration: 0.936375 2023-10-03 09:18:09,912 WARNING [train.py:1204] (2/4) Exclude cut with ID _21632_210_2253_1_1531994001765_3744140_110-274654-0_sp1.1 from training. Duration: 0.7454375 2023-10-03 09:18:13,174 WARNING [train.py:1204] (2/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_135-174080-0_sp1.1 from training. Duration: 0.4818125 2023-10-03 09:18:14,053 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=1215820.0, ans=0.1 2023-10-03 09:18:15,129 WARNING [train.py:1204] (2/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_21-174514-0 from training. Duration: 0.43 2023-10-03 09:18:15,281 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_38965_210_4852_1_1533174906_3484735_215-294589-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 09:18:16,601 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_548-294316-0 from training. Duration: 0.81 2023-10-03 09:18:22,628 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_103-84010-0_sp1.1 from training. Duration: 0.62725 2023-10-03 09:18:22,884 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.self_attn_weights.pos_emb_skip_rate, batch_count=1215886.6666666667, ans=0.0 2023-10-03 09:18:24,262 WARNING [train.py:1204] (2/4) Exclude cut with ID _18199_210_6109_1_1531475789864_4288790_295-198247-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 09:18:24,281 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_13757_210_3528_1_1530440682_4107239_103-313224-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 09:18:28,409 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_34886_210_10633_1_1533203882_1755661_67-294673-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 09:18:28,418 WARNING [train.py:1204] (2/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_350-101734-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 09:18:31,718 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_37357_210_17024_1_1533089504_2316121_204-153986-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 09:18:32,064 INFO [scaling.py:1118] (2/4) WithLoss: name=encoder.encoders.5.encoder.layers.0.self_attn_weights, loss-sum=0.000e+00 2023-10-03 09:18:33,125 WARNING [train.py:1204] (2/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_673-115569-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 09:18:34,565 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_489-230703-0_sp0.9 from training. Duration: 0.9444375 2023-10-03 09:18:35,945 WARNING [train.py:1204] (2/4) Exclude cut with ID _5250_210_5219_1_1527249561806_8692110_575-102496-0_sp1.1 from training. Duration: 0.936375 2023-10-03 09:18:37,270 WARNING [train.py:1204] (2/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_669-93163-0 from training. Duration: 0.98 2023-10-03 09:18:38,669 WARNING [train.py:1204] (2/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_249-281349-0_sp0.9 from training. Duration: 0.9444375 2023-10-03 09:18:41,295 WARNING [train.py:1204] (2/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_106-30782-0 from training. Duration: 0.8 2023-10-03 09:18:42,511 WARNING [train.py:1204] (2/4) Exclude cut with ID _8772_210_1850_1_1528635595972_3664011_398-320049-0_sp1.1 from training. Duration: 0.8 2023-10-03 09:18:43,868 WARNING [train.py:1204] (2/4) Exclude cut with ID _44778_210_2054_1_1533798127203_7443090_516-151727-0_sp1.1 from training. Duration: 0.963625 2023-10-03 09:18:45,194 WARNING [train.py:1204] (2/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_325-62802-0_sp1.1 from training. Duration: 0.7909375 2023-10-03 09:18:50,124 WARNING [train.py:1204] (2/4) Exclude cut with ID _14368_210_4220_1_1530532714870_3595529_93-58002-0_sp0.9 from training. Duration: 0.6333125 2023-10-03 09:18:50,171 WARNING [train.py:1204] (2/4) Exclude cut with ID _4681_210_3525_1_1527251040321_1235713_123-24428-0_sp1.1 from training. Duration: 0.436375 2023-10-03 09:18:51,546 WARNING [train.py:1204] (2/4) Exclude cut with ID _5250_210_5219_1_1527249561806_8692110_1185-56473-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 09:18:52,921 WARNING [train.py:1204] (2/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_96-244299-0_sp1.1 from training. Duration: 0.8 2023-10-03 09:18:56,971 WARNING [train.py:1204] (2/4) Exclude cut with ID _59500_210_8254_1_1534809610680_3736009_104-72815-0_sp1.1 from training. Duration: 0.963625 2023-10-03 09:18:59,683 WARNING [train.py:1204] (2/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_468-283113-0_sp1.1 from training. Duration: 0.87275 2023-10-03 09:19:01,083 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6239_210_4211_1_1527765584_4306698_528-102186-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 09:19:01,609 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.3.encoder.layers.3.conv_module1.whiten, num_groups=1, num_channels=512, metric=7.39 vs. limit=15.0 2023-10-03 09:19:02,484 WARNING [train.py:1204] (2/4) Exclude cut with ID _45297_210_2721_1_1533715351381_3264949_351-147136-0 from training. Duration: 0.87 2023-10-03 09:19:02,488 WARNING [train.py:1204] (2/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_520-51744-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 09:19:04,335 WARNING [train.py:1204] (2/4) Exclude cut with ID _5166_210_2084_1_1527073197160_4087993_310-114452-0_sp1.1 from training. Duration: 0.536375 2023-10-03 09:19:04,339 WARNING [train.py:1204] (2/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_450-165710-0_sp1.1 from training. Duration: 0.92725 2023-10-03 09:19:05,612 INFO [train.py:1046] (2/4) Epoch 35, batch 1800, loss[loss=0.1525, simple_loss=0.2431, pruned_loss=0.03098, over 24670.00 frames. ], tot_loss[loss=0.1585, simple_loss=0.2376, pruned_loss=0.0397, over 4716145.08 frames. ], batch size: 73, lr: 2.92e-03, grad_scale: 16.0 2023-10-03 09:19:05,655 WARNING [train.py:1204] (2/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_280-120942-0_sp1.1 from training. Duration: 0.7 2023-10-03 09:19:05,676 WARNING [train.py:1204] (2/4) Exclude cut with ID _65585_210_12210_1_1535450451584_3764098_112-294301-0_sp0.9 from training. Duration: 0.97775 2023-10-03 09:19:05,728 WARNING [train.py:1204] (2/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_98-51676-0_sp0.9 from training. Duration: 0.811125 2023-10-03 09:19:08,568 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_33348_210_5632_1_1533086912_3324030_85-28583-0_sp1.1 from training. Duration: 0.7454375 2023-10-03 09:19:08,660 WARNING [train.py:1204] (2/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_336-208232-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 09:19:10,016 WARNING [train.py:1204] (2/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_339-104791-0_sp0.9 from training. Duration: 0.5444375 2023-10-03 09:19:13,346 WARNING [train.py:1204] (2/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_559-29840-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 09:19:14,796 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.ff2_skip_rate, batch_count=1216086.6666666667, ans=0.0 2023-10-03 09:19:15,933 WARNING [train.py:1204] (2/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_117-290916-0_sp0.9 from training. Duration: 0.5666875 2023-10-03 09:19:17,258 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_43802_210_2290_1_1533516203_8829504_185-112289-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 09:19:20,435 WARNING [train.py:1204] (2/4) Exclude cut with ID _59256_210_5986_1_1535076085997_3589120_155-298797-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 09:19:21,953 WARNING [train.py:1204] (2/4) Exclude cut with ID _44659_210_2721_1_1534036120061_6614860_246-206531-0_sp1.1 from training. Duration: 0.92725 2023-10-03 09:19:22,005 WARNING [train.py:1204] (2/4) Exclude cut with ID _23818_210_6228_1_1531917063884_3676920_469-151944-0_sp1.1 from training. Duration: 0.92725 2023-10-03 09:19:23,389 WARNING [train.py:1204] (2/4) Exclude cut with ID _13252_210_1925_1_1530671372047_3336909_88-276668-0_sp1.1 from training. Duration: 0.9 2023-10-03 09:19:26,074 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_15101_210_7927_1_1530786415_5033274_320-327527-0_sp1.1 from training. Duration: 0.87275 2023-10-03 09:19:26,091 WARNING [train.py:1204] (2/4) Exclude cut with ID _14998_210_5219_1_1530705582080_7474970_446-112117-0 from training. Duration: 0.9 2023-10-03 09:19:26,157 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_30280_210_1881_1_1532872779_3660139_400-254159-0_sp1.1 from training. Duration: 0.936375 2023-10-03 09:19:30,225 WARNING [train.py:1204] (2/4) Exclude cut with ID _40284_210_2084_1_1533513679814_3704279_158-345341-0_sp1.1 from training. Duration: 0.936375 2023-10-03 09:19:33,152 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.balancer1.prob, batch_count=1216220.0, ans=0.125 2023-10-03 09:19:34,290 WARNING [train.py:1204] (2/4) Exclude cut with ID 3033-130750-0096-55598-0 from training. Duration: 0.83 2023-10-03 09:19:35,808 WARNING [train.py:1204] (2/4) Exclude cut with ID _9389_210_1664_1_1529217337301_5289269_379-128794-0 from training. Duration: 0.85 2023-10-03 09:19:37,661 WARNING [train.py:1204] (2/4) Exclude cut with ID _3675_210_4557_1_1526639934408_4182089_456-18305-0 from training. Duration: 0.76 2023-10-03 09:19:37,688 WARNING [train.py:1204] (2/4) Exclude cut with ID _29853_210_7497_1_1532505600851_6642110_571-249460-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 09:19:37,785 WARNING [train.py:1204] (2/4) Exclude cut with ID _4068_210_2629_1_1526697350395_1546259_230-325568-0_sp1.1 from training. Duration: 0.92725 2023-10-03 09:19:37,786 WARNING [train.py:1204] (2/4) Exclude cut with ID _34415_210_8750_1_1533085213375_3451116_213-267570-0_sp1.1 from training. Duration: 0.963625 2023-10-03 09:19:39,154 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6767_210_3273_1_1527855783_1718932_142-127132-0_sp1.1 from training. Duration: 0.863625 2023-10-03 09:19:45,351 WARNING [train.py:1204] (2/4) Exclude cut with ID IC0653W0001-322925 from training. Duration: 0.896 2023-10-03 09:19:45,438 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_4528_210_3273_1_1527076654_3276777_209-219804-0_sp1.1 from training. Duration: 0.62725 2023-10-03 09:19:46,509 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.0.layers.0.self_attn_weights.whiten_keys, num_groups=4, num_channels=128, metric=2.68 vs. limit=6.0 2023-10-03 09:19:47,324 WARNING [train.py:1204] (2/4) Exclude cut with ID _55844_210_2346_1_1534471212569_7625707_187-204971-0_sp1.1 from training. Duration: 0.936375 2023-10-03 09:19:49,914 WARNING [train.py:1204] (2/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_369-325184-0 from training. Duration: 0.99 2023-10-03 09:19:49,945 WARNING [train.py:1204] (2/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_791-45149-0 from training. Duration: 0.86 2023-10-03 09:19:51,191 WARNING [train.py:1204] (2/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_317-211107-0_sp1.1 from training. Duration: 0.836375 2023-10-03 09:19:52,558 WARNING [train.py:1204] (2/4) Exclude cut with ID _30800_210_22382_1_1532499347904_4515959_488-330913-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 09:19:53,930 WARNING [train.py:1204] (2/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_506-42614-0_sp1.1 from training. Duration: 0.7181875 2023-10-03 09:19:58,425 WARNING [train.py:1204] (2/4) Exclude cut with ID _8696_210_1876_1_1528545632440_3345058_209-54069-0 from training. Duration: 0.67 2023-10-03 09:20:05,310 WARNING [train.py:1204] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_215-202973-0_sp1.1 from training. Duration: 0.8 2023-10-03 09:20:05,373 WARNING [train.py:1204] (2/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_321-215742-0 from training. Duration: 0.5 2023-10-03 09:20:05,426 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_40896_210_15710_1_1533433956_7100802_428-32114-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 09:20:05,434 WARNING [train.py:1204] (2/4) Exclude cut with ID _27013_210_1389_1_1532437173240_3252649_467-263151-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 09:20:06,750 WARNING [train.py:1204] (2/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_734-129761-0_sp0.9 from training. Duration: 0.911125 2023-10-03 09:20:06,794 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_521-214276-0 from training. Duration: 0.44 2023-10-03 09:20:11,282 WARNING [train.py:1204] (2/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_80-147301-0_sp1.1 from training. Duration: 0.763625 2023-10-03 09:20:11,284 WARNING [train.py:1204] (2/4) Exclude cut with ID _9679_210_7497_1_1529129212906_4363929_56-307796-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 09:20:11,556 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.conv_module2.balancer1.prob, batch_count=1216353.3333333333, ans=0.125 2023-10-03 09:20:12,708 WARNING [train.py:1204] (2/4) Exclude cut with ID _14832_210_8750_1_1530691351539_3171348_1-54315-0 from training. Duration: 0.87 2023-10-03 09:20:12,708 WARNING [train.py:1204] (2/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_558-155750-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 09:20:14,726 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_142-309370-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 09:20:16,585 WARNING [train.py:1204] (2/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_18-46756-0_sp0.9 from training. Duration: 0.87775 2023-10-03 09:20:16,592 WARNING [train.py:1204] (2/4) Exclude cut with ID _61148_210_6897_1_1535094022726_4217610_279-212159-0_sp1.1 from training. Duration: 0.936375 2023-10-03 09:20:16,697 WARNING [train.py:1204] (2/4) Exclude cut with ID _21987_210_5854_1_1531821500482_3637038_164-2641-0_sp1.1 from training. Duration: 0.936375 2023-10-03 09:20:18,033 WARNING [train.py:1204] (2/4) Exclude cut with ID _8064_210_5399_1_1528372299291_4282973_395-340852-0_sp1.1 from training. Duration: 0.8454375 2023-10-03 09:20:19,687 INFO [train.py:1046] (2/4) Epoch 35, batch 1850, loss[loss=0.18, simple_loss=0.2519, pruned_loss=0.05403, over 22797.00 frames. ], tot_loss[loss=0.1591, simple_loss=0.2383, pruned_loss=0.03996, over 4712765.79 frames. ], batch size: 322, lr: 2.92e-03, grad_scale: 16.0 2023-10-03 09:20:19,785 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1077-184261-0_sp1.1 from training. Duration: 0.963625 2023-10-03 09:20:19,793 WARNING [train.py:1204] (2/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_1176-204992-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 09:20:22,499 WARNING [train.py:1204] (2/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_563-347832-0_sp1.1 from training. Duration: 0.8090625 2023-10-03 09:20:23,865 WARNING [train.py:1204] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_380-204756-0_sp0.9 from training. Duration: 0.9555625 2023-10-03 09:20:26,789 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.1.conv_module2.balancer2.prob, batch_count=1216420.0, ans=0.125 2023-10-03 09:20:30,537 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.557e+02 1.898e+02 2.066e+02 2.341e+02 4.051e+02, threshold=4.131e+02, percent-clipped=1.0 2023-10-03 09:20:30,693 WARNING [train.py:1204] (2/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_212-10501-0_sp1.1 from training. Duration: 0.8545625 2023-10-03 09:20:30,717 WARNING [train.py:1204] (2/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_445-117398-0 from training. Duration: 0.89 2023-10-03 09:20:34,712 WARNING [train.py:1204] (2/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_29-280622-0 from training. Duration: 0.82 2023-10-03 09:20:37,389 WARNING [train.py:1204] (2/4) Exclude cut with ID _26664_210_7770_1_1532258943280_3418270_187-80815-0 from training. Duration: 0.93 2023-10-03 09:20:39,024 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.ff3_skip_rate, batch_count=1216486.6666666667, ans=0.0 2023-10-03 09:20:40,120 WARNING [train.py:1204] (2/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_414-349640-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 09:20:40,138 WARNING [train.py:1204] (2/4) Exclude cut with ID _45021_210_9994_1_1533619751094_3330288_96-239010-0 from training. Duration: 0.91 2023-10-03 09:20:40,140 WARNING [train.py:1204] (2/4) Exclude cut with ID _5189_210_4557_1_1527248319428_3769229_199-53526-0_sp0.9 from training. Duration: 0.4666875 2023-10-03 09:20:51,270 WARNING [train.py:1204] (2/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_63-101996-0_sp1.1 from training. Duration: 0.8 2023-10-03 09:20:52,662 WARNING [train.py:1204] (2/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_199-86432-0 from training. Duration: 0.98 2023-10-03 09:20:52,813 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.bypass.scale_min, batch_count=1216553.3333333333, ans=0.2 2023-10-03 09:20:55,358 WARNING [train.py:1204] (2/4) Exclude cut with ID _36399_210_16049_1_1532998771377_3698333_207-259013-0_sp1.1 from training. Duration: 0.92725 2023-10-03 09:20:56,691 WARNING [train.py:1204] (2/4) Exclude cut with ID _62574_210_2655_1_1535180407058_3933249_567-111756-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 09:20:59,634 WARNING [train.py:1204] (2/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_436-318113-0 from training. Duration: 0.75 2023-10-03 09:20:59,687 WARNING [train.py:1204] (2/4) Exclude cut with ID _53379_210_20034_1_1534222027363_3854654_43-305214-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 09:21:00,875 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_15126_210_6753_1_1530778677_2591185_113-157651-0_sp1.1 from training. Duration: 0.6181875 2023-10-03 09:21:02,379 WARNING [train.py:1204] (2/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_146-218763-0_sp1.1 from training. Duration: 0.7818125 2023-10-03 09:21:05,146 WARNING [train.py:1204] (2/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_714-310538-0_sp0.9 from training. Duration: 0.9555625 2023-10-03 09:21:06,833 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.feed_forward1.hidden_balancer.prob, batch_count=1216620.0, ans=0.125 2023-10-03 09:21:07,995 WARNING [train.py:1204] (2/4) Exclude cut with ID _24271_210_12658_1_1531987270022_7190519_709-16721-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 09:21:09,481 WARNING [train.py:1204] (2/4) Exclude cut with ID _5190_210_4557_1_1527244368799_373439_28-197030-0_sp0.9 from training. Duration: 0.8 2023-10-03 09:21:09,508 WARNING [train.py:1204] (2/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_267-197536-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 09:21:10,793 WARNING [train.py:1204] (2/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_658-282610-0_sp1.1 from training. Duration: 0.4818125 2023-10-03 09:21:10,803 WARNING [train.py:1204] (2/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_257-67050-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 09:21:12,249 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_14906_210_7927_1_1530754190_9408633_961-53402-0_sp1.1 from training. Duration: 0.936375 2023-10-03 09:21:14,152 WARNING [train.py:1204] (2/4) Exclude cut with ID _60113_210_16271_1_1535164212481_4386079_490-280290-0_sp1.1 from training. Duration: 0.8909375 2023-10-03 09:21:17,321 WARNING [train.py:1204] (2/4) Exclude cut with ID _14100_210_8341_1_1530529320613_3678069_258-224599-0 from training. Duration: 0.77 2023-10-03 09:21:17,380 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6961_210_2087_1_1527937168_6815011_565-326898-0_sp1.1 from training. Duration: 0.936375 2023-10-03 09:21:19,413 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.0.feed_forward1.out_proj.dropout_p, batch_count=1216686.6666666667, ans=0.1 2023-10-03 09:21:22,420 WARNING [train.py:1204] (2/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_239-316802-0_sp1.1 from training. Duration: 0.62725 2023-10-03 09:21:23,720 WARNING [train.py:1204] (2/4) Exclude cut with ID _55857_210_2346_1_1534644007831_6898209_745-98391-0_sp1.1 from training. Duration: 0.6909375 2023-10-03 09:21:23,723 WARNING [train.py:1204] (2/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_154-135696-0 from training. Duration: 0.8 2023-10-03 09:21:23,725 WARNING [train.py:1204] (2/4) Exclude cut with ID _14368_210_4220_1_1530532714870_3595529_93-58002-0 from training. Duration: 0.57 2023-10-03 09:21:25,109 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9025W0005-507335 from training. Duration: 0.896 2023-10-03 09:21:26,437 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9029W0034-509435 from training. Duration: 0.64 2023-10-03 09:21:27,808 WARNING [train.py:1204] (2/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_76-203867-0_sp1.1 from training. Duration: 0.6545625 2023-10-03 09:21:27,824 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_46639_210_8341_1_1533726088_4495500_66-346325-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 09:21:27,831 WARNING [train.py:1204] (2/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_145-137790-0_sp0.9 from training. Duration: 0.92225 2023-10-03 09:21:29,133 WARNING [train.py:1204] (2/4) Exclude cut with ID _36522_210_8887_1_1533177030611_1772478_7-327299-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 09:21:29,220 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9042W0009-515615 from training. Duration: 0.896 2023-10-03 09:21:30,516 WARNING [train.py:1204] (2/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_39-248870-0_sp0.9 from training. Duration: 0.7666875 2023-10-03 09:21:30,560 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_35441_210_16554_1_1533347572_4092680_362-242912-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 09:21:31,899 INFO [train.py:1046] (2/4) Epoch 35, batch 1900, loss[loss=0.2148, simple_loss=0.2786, pruned_loss=0.07552, over 19636.00 frames. ], tot_loss[loss=0.1603, simple_loss=0.2396, pruned_loss=0.04051, over 4722026.78 frames. ], batch size: 388, lr: 2.92e-03, grad_scale: 8.0 2023-10-03 09:21:31,958 WARNING [train.py:1204] (2/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_216-343007-0_sp1.1 from training. Duration: 0.6 2023-10-03 09:21:32,047 WARNING [train.py:1204] (2/4) Exclude cut with ID _3867_210_1883_1_1526641200575_4475830_272-215563-0_sp0.9 from training. Duration: 0.6333125 2023-10-03 09:21:32,191 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.0.attention_skip_rate, batch_count=1216753.3333333333, ans=0.0 2023-10-03 09:21:33,417 WARNING [train.py:1204] (2/4) Exclude cut with ID _21783_210_11357_1_1531742458730_3762679_553-175603-0_sp1.1 from training. Duration: 0.92725 2023-10-03 09:21:33,432 WARNING [train.py:1204] (2/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_963-187109-0 from training. Duration: 0.99 2023-10-03 09:21:36,167 WARNING [train.py:1204] (2/4) Exclude cut with ID _44973_210_1511_1_1533794381163_3634770_495-211466-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 09:21:36,179 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9068W0002-529085 from training. Duration: 0.896 2023-10-03 09:21:36,193 WARNING [train.py:1204] (2/4) Exclude cut with ID _5294_210_1747_1_1527249643928_3696179_180-100713-0_sp1.1 from training. Duration: 0.736375 2023-10-03 09:21:38,810 WARNING [train.py:1204] (2/4) Exclude cut with ID _35799_210_5854_1_1533263487887_3529071_149-49689-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 09:21:40,535 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.feed_forward1.hidden_balancer.prob, batch_count=1216753.3333333333, ans=0.125 2023-10-03 09:21:42,990 WARNING [train.py:1204] (2/4) Exclude cut with ID _67725_210_27767_1_1535608756671_5105359_36-220794-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 09:21:44,865 WARNING [train.py:1204] (2/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_338-96520-0_sp1.1 from training. Duration: 0.8545625 2023-10-03 09:21:46,218 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9104W0004-547160 from training. Duration: 0.896 2023-10-03 09:21:46,303 WARNING [train.py:1204] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_330-327861-0 from training. Duration: 0.67 2023-10-03 09:21:47,687 WARNING [train.py:1204] (2/4) Exclude cut with ID _14087_210_2253_1_1530529224512_3014029_156-257607-0_sp0.9 from training. Duration: 0.92225 2023-10-03 09:21:48,303 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.3.encoder.layers.3.feed_forward2.out_whiten, num_groups=1, num_channels=512, metric=4.95 vs. limit=15.0 2023-10-03 09:21:49,525 WARNING [train.py:1204] (2/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_476-322050-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 09:21:49,534 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9118W0017-553385 from training. Duration: 0.896 2023-10-03 09:21:49,559 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9120W0028-554435 from training. Duration: 0.896 2023-10-03 09:21:53,055 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.conv_module2.balancer1.max_abs, batch_count=1216820.0, ans=10.0 2023-10-03 09:21:54,115 WARNING [train.py:1204] (2/4) Exclude cut with ID _56524_210_9642_1_1534572011779_3619170_145-310842-0 from training. Duration: 0.99 2023-10-03 09:21:55,436 WARNING [train.py:1204] (2/4) Exclude cut with ID _51631_210_8254_1_1534129228420_4084794_270-222508-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 09:21:58,532 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.feed_forward2.hidden_balancer.prob, batch_count=1216820.0, ans=0.125 2023-10-03 09:21:59,689 WARNING [train.py:1204] (2/4) Exclude cut with ID _14268_210_9206_1_1530622771285_4714099_331-105351-0 from training. Duration: 0.65 2023-10-03 09:22:00,026 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.self_attn_weights.pos_emb_skip_rate, batch_count=1216886.6666666667, ans=0.0 2023-10-03 09:22:01,103 WARNING [train.py:1204] (2/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_40-129756-0 from training. Duration: 0.81 2023-10-03 09:22:02,955 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.conv_module2.balancer2.prob, batch_count=1216886.6666666667, ans=0.125 2023-10-03 09:22:08,172 WARNING [train.py:1204] (2/4) Exclude cut with ID _3541_210_2477_1_1526475408709_3263973_231-104736-0 from training. Duration: 0.78 2023-10-03 09:22:10,878 WARNING [train.py:1204] (2/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_214-291369-0 from training. Duration: 0.63 2023-10-03 09:22:10,884 WARNING [train.py:1204] (2/4) Exclude cut with ID _62574_210_2655_1_1535180407058_3933249_369-84347-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 09:22:12,371 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9210W0111-599240 from training. Duration: 0.896 2023-10-03 09:22:12,375 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_92-246270-0 from training. Duration: 0.85 2023-10-03 09:22:12,410 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_37152_210_9697_1_1533106573_3844188_29-239070-0 from training. Duration: 0.98 2023-10-03 09:22:13,761 WARNING [train.py:1204] (2/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_239-22076-0 from training. Duration: 0.85 2023-10-03 09:22:13,764 WARNING [train.py:1204] (2/4) Exclude cut with ID _65962_210_29927_1_1535436026519_4174930_368-44639-0_sp1.1 from training. Duration: 0.97275 2023-10-03 09:22:18,907 WARNING [train.py:1204] (2/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_152-212533-0 from training. Duration: 0.97 2023-10-03 09:22:20,381 WARNING [train.py:1204] (2/4) Exclude cut with ID _14603_210_4220_1_1530615688441_3097319_90-31383-0_sp1.1 from training. Duration: 0.7909375 2023-10-03 09:22:22,549 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.0.bypass.skip_rate, batch_count=1216953.3333333333, ans=0.035 2023-10-03 09:22:23,754 WARNING [train.py:1204] (2/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_196-152868-0_sp1.1 from training. Duration: 0.92725 2023-10-03 09:22:23,756 WARNING [train.py:1204] (2/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_140-181176-0 from training. Duration: 0.92 2023-10-03 09:22:26,489 WARNING [train.py:1204] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_168-32906-0_sp0.9 from training. Duration: 0.7666875 2023-10-03 09:22:30,639 WARNING [train.py:1204] (2/4) Exclude cut with ID _14922_210_6228_1_1530707435273_3693310_13-165512-0 from training. Duration: 0.82 2023-10-03 09:22:31,895 WARNING [train.py:1204] (2/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_381-317791-0_sp0.9 from training. Duration: 0.811125 2023-10-03 09:22:32,132 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.bypass_mid.scale_min, batch_count=1217020.0, ans=0.2 2023-10-03 09:22:37,813 WARNING [train.py:1204] (2/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_63-267585-0_sp1.1 from training. Duration: 0.8181875 2023-10-03 09:22:37,816 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_326-303945-0_sp1.1 from training. Duration: 0.9 2023-10-03 09:22:37,834 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_194-143381-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 09:22:37,909 WARNING [train.py:1204] (2/4) Exclude cut with ID _39290_210_4220_1_1533862778760_3075118_194-16667-0_sp1.1 from training. Duration: 0.963625 2023-10-03 09:22:39,255 WARNING [train.py:1204] (2/4) Exclude cut with ID _15012_210_1968_1_1530856820429_5095350_662-210469-0_sp0.9 from training. Duration: 0.6333125 2023-10-03 09:22:40,491 WARNING [train.py:1204] (2/4) Exclude cut with ID _4697_210_2069_1_1526815801953_3823098_68-219807-0_sp1.1 from training. Duration: 0.42725 2023-10-03 09:22:40,558 WARNING [train.py:1204] (2/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_321-22821-0_sp0.9 from training. Duration: 0.988875 2023-10-03 09:22:41,225 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.4.encoder.layers.1.nonlin_attention.whiten1, num_groups=1, num_channels=288, metric=4.19 vs. limit=10.0 2023-10-03 09:22:43,352 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_14056_210_4163_1_1530604483_7572827_1037-45317-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 09:22:43,353 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_403-39619-0_sp1.1 from training. Duration: 0.763625 2023-10-03 09:22:44,698 INFO [train.py:1046] (2/4) Epoch 35, batch 1950, loss[loss=0.1757, simple_loss=0.2487, pruned_loss=0.05131, over 22757.00 frames. ], tot_loss[loss=0.1612, simple_loss=0.2409, pruned_loss=0.04082, over 4710971.54 frames. ], batch size: 322, lr: 2.92e-03, grad_scale: 8.0 2023-10-03 09:22:46,676 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_20120_210_6753_1_1531643035_5482859_510-96996-0_sp0.9 from training. Duration: 0.9666875 2023-10-03 09:22:46,678 WARNING [train.py:1204] (2/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_225-180155-0_sp1.1 from training. Duration: 0.92725 2023-10-03 09:22:46,725 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_278-272612-0_sp0.9 from training. Duration: 0.811125 2023-10-03 09:22:47,892 WARNING [train.py:1204] (2/4) Exclude cut with ID _53713_210_24116_1_1534314487762_7416751_691-70244-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 09:22:52,393 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6788_210_1388_1_1527898698_4562021_91-197981-0_sp1.1 from training. Duration: 0.7545625 2023-10-03 09:22:53,819 WARNING [train.py:1204] (2/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_60-18123-0_sp0.9 from training. Duration: 0.97775 2023-10-03 09:22:55,114 WARNING [train.py:1204] (2/4) Exclude cut with ID _35799_210_5854_1_1533263487887_3529071_5-214617-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 09:22:55,124 WARNING [train.py:1204] (2/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_890-5117-0_sp1.1 from training. Duration: 0.7090625 2023-10-03 09:22:56,959 WARNING [train.py:1204] (2/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_660-211088-0 from training. Duration: 0.62 2023-10-03 09:22:58,294 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.491e+02 1.842e+02 2.075e+02 2.339e+02 3.045e+02, threshold=4.150e+02, percent-clipped=0.0 2023-10-03 09:22:58,391 WARNING [train.py:1204] (2/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_314-126819-0_sp0.9 from training. Duration: 0.5555625 2023-10-03 09:22:58,416 WARNING [train.py:1204] (2/4) Exclude cut with ID _21632_210_2253_1_1531994001765_3744140_362-66225-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 09:22:59,910 WARNING [train.py:1204] (2/4) Exclude cut with ID _61118_210_9527_1_1534897668884_3752630_173-258555-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 09:23:01,388 WARNING [train.py:1204] (2/4) Exclude cut with ID _7719_210_3525_1_1528456153163_4296960_88-250259-0_sp0.9 from training. Duration: 0.9333125 2023-10-03 09:23:01,410 WARNING [train.py:1204] (2/4) Exclude cut with ID _15140_210_9288_1_1530791388450_4880149_539-153708-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 09:23:01,444 WARNING [train.py:1204] (2/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_253-42544-0_sp1.1 from training. Duration: 0.97275 2023-10-03 09:23:02,865 WARNING [train.py:1204] (2/4) Exclude cut with ID _33640_210_2655_1_1533117605479_4855469_479-296877-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 09:23:05,504 WARNING [train.py:1204] (2/4) Exclude cut with ID 3033-130750-0096-55598-0_sp1.1 from training. Duration: 0.7545625 2023-10-03 09:23:05,655 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.bypass.scale_min, batch_count=1217153.3333333333, ans=0.2 2023-10-03 09:23:06,795 WARNING [train.py:1204] (2/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_191-41023-0_sp1.1 from training. Duration: 0.6454375 2023-10-03 09:23:06,817 WARNING [train.py:1204] (2/4) Exclude cut with ID _8365_210_2064_1_1528592311070_4677690_431-294102-0_sp1.1 from training. Duration: 0.8090625 2023-10-03 09:23:06,852 WARNING [train.py:1204] (2/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_893-296030-0_sp1.1 from training. Duration: 0.97275 2023-10-03 09:23:09,651 WARNING [train.py:1204] (2/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_401-343820-0_sp1.1 from training. Duration: 0.97275 2023-10-03 09:23:12,488 WARNING [train.py:1204] (2/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_127-566-0_sp1.1 from training. Duration: 0.763625 2023-10-03 09:23:12,489 WARNING [train.py:1204] (2/4) Exclude cut with ID _14100_210_8341_1_1530529320613_3678069_126-272847-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 09:23:12,511 WARNING [train.py:1204] (2/4) Exclude cut with ID _15133_210_6228_1_1530794512074_3080379_29-264458-0_sp1.1 from training. Duration: 0.52725 2023-10-03 09:23:12,512 WARNING [train.py:1204] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_127-119109-0 from training. Duration: 0.72 2023-10-03 09:23:12,563 WARNING [train.py:1204] (2/4) Exclude cut with ID _6537_210_1883_1_1527987559710_3597129_382-133062-0_sp1.1 from training. Duration: 0.6181875 2023-10-03 09:23:12,580 WARNING [train.py:1204] (2/4) Exclude cut with ID _29529_210_7234_1_1532393961455_3977460_391-219682-0_sp1.1 from training. Duration: 0.936375 2023-10-03 09:23:13,946 WARNING [train.py:1204] (2/4) Exclude cut with ID _37537_210_2253_1_1533171758568_3421949_96-137189-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 09:23:18,378 WARNING [train.py:1204] (2/4) Exclude cut with ID _58414_210_8254_1_1535421609162_7151182_15-276301-0_sp1.1 from training. Duration: 0.97275 2023-10-03 09:23:18,677 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.conv_module1.balancer2.prob, batch_count=1217220.0, ans=0.125 2023-10-03 09:23:20,276 WARNING [train.py:1204] (2/4) Exclude cut with ID _59502_210_12210_1_1535107024698_1835139_110-289074-0_sp1.1 from training. Duration: 0.92725 2023-10-03 09:23:20,439 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.1.feed_forward2.hidden_balancer.prob, batch_count=1217220.0, ans=0.125 2023-10-03 09:23:25,448 WARNING [train.py:1204] (2/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_367-61330-0_sp1.1 from training. Duration: 0.6818125 2023-10-03 09:23:28,363 WARNING [train.py:1204] (2/4) Exclude cut with ID _45297_210_2721_1_1533715351381_3264949_353-9541-0_sp1.1 from training. Duration: 0.8818125 2023-10-03 09:23:29,641 WARNING [train.py:1204] (2/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_85-138257-0_sp0.9 from training. Duration: 0.788875 2023-10-03 09:23:29,667 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_3787_210_2040_1_1526477544_5478586_603-120324-0 from training. Duration: 0.81 2023-10-03 09:23:29,710 WARNING [train.py:1204] (2/4) Exclude cut with ID _42863_210_9070_1_1533902398788_3696920_411-204131-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 09:23:32,654 WARNING [train.py:1204] (2/4) Exclude cut with ID _30196_210_1664_1_1533708496522_7074140_822-289909-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 09:23:34,126 WARNING [train.py:1204] (2/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_32-335103-0_sp0.9 from training. Duration: 0.911125 2023-10-03 09:23:34,167 WARNING [train.py:1204] (2/4) Exclude cut with ID _6493_210_1747_1_1528459210424_4012470_513-140967-0_sp0.9 from training. Duration: 0.87775 2023-10-03 09:23:38,421 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.nonlin_attention.balancer.prob, batch_count=1217286.6666666667, ans=0.125 2023-10-03 09:23:41,280 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.attention_skip_rate, batch_count=1217286.6666666667, ans=0.0 2023-10-03 09:23:42,460 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_47896_210_20508_1_1534492518_5958868_439-257766-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 09:23:43,937 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_1294-63152-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 09:23:46,653 WARNING [train.py:1204] (2/4) Exclude cut with ID _59170_210_6721_1_1534983633960_4203335_319-166512-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 09:23:49,948 WARNING [train.py:1204] (2/4) Exclude cut with ID _3541_210_2477_1_1526475408709_3263973_146-3470-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 09:23:51,457 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.ff3_skip_rate, batch_count=1217353.3333333333, ans=0.0 2023-10-03 09:23:53,107 WARNING [train.py:1204] (2/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_678-159307-0_sp1.1 from training. Duration: 0.77275 2023-10-03 09:23:54,354 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_343-48822-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 09:23:54,419 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_4692_210_3528_1_1526899755_4323254_107-44401-0 from training. Duration: 0.86 2023-10-03 09:23:54,424 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6291_210_3273_1_1527683649_1078498_74-292608-0_sp1.1 from training. Duration: 0.6545625 2023-10-03 09:23:56,330 WARNING [train.py:1204] (2/4) Exclude cut with ID _55644_210_11856_1_1534593599582_4103719_293-248694-0_sp1.1 from training. Duration: 0.97275 2023-10-03 09:23:56,408 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_3172_210_2087_1_1526292929_3987865_197-97762-0 from training. Duration: 0.75 2023-10-03 09:23:58,385 WARNING [train.py:1204] (2/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_264-348896-0_sp0.9 from training. Duration: 0.9444375 2023-10-03 09:23:59,661 INFO [train.py:1046] (2/4) Epoch 35, batch 2000, loss[loss=0.1766, simple_loss=0.2587, pruned_loss=0.04728, over 23450.00 frames. ], tot_loss[loss=0.161, simple_loss=0.2403, pruned_loss=0.04084, over 4707833.50 frames. ], batch size: 93, lr: 2.92e-03, grad_scale: 16.0 2023-10-03 09:24:01,368 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.ff3_skip_rate, batch_count=1217420.0, ans=0.0 2023-10-03 09:24:02,399 WARNING [train.py:1204] (2/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_209-205365-0_sp0.9 from training. Duration: 0.87775 2023-10-03 09:24:02,476 WARNING [train.py:1204] (2/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_222-198127-0_sp0.9 from training. Duration: 0.9333125 2023-10-03 09:24:03,777 WARNING [train.py:1204] (2/4) Exclude cut with ID _57572_210_5517_1_1534921219624_3809484_52-43530-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 09:24:05,068 WARNING [train.py:1204] (2/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_96-320736-0_sp1.1 from training. Duration: 0.7818125 2023-10-03 09:24:07,741 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_13091_210_7925_1_1530609447_8987354_1159-225508-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 09:24:10,577 WARNING [train.py:1204] (2/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_237-44332-0 from training. Duration: 0.94 2023-10-03 09:24:10,623 WARNING [train.py:1204] (2/4) Exclude cut with ID _7970_210_1883_1_1528455616296_3472230_25-178407-0_sp0.9 from training. Duration: 0.788875 2023-10-03 09:24:13,404 WARNING [train.py:1204] (2/4) Exclude cut with ID _29003_210_19596_1_1532507391720_3894098_357-257062-0_sp1.1 from training. Duration: 0.936375 2023-10-03 09:24:14,892 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_809-17156-0 from training. Duration: 0.73 2023-10-03 09:24:16,281 WARNING [train.py:1204] (2/4) Exclude cut with ID _7160_210_1876_1_1527940923231_3703799_302-150728-0_sp1.1 from training. Duration: 0.7090625 2023-10-03 09:24:16,289 WARNING [train.py:1204] (2/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_444-28041-0_sp0.9 from training. Duration: 0.9444375 2023-10-03 09:24:19,020 WARNING [train.py:1204] (2/4) Exclude cut with ID _52892_210_6721_1_1534378941259_4124129_278-251666-0_sp1.1 from training. Duration: 0.92725 2023-10-03 09:24:20,893 WARNING [train.py:1204] (2/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_115-326778-0 from training. Duration: 0.93 2023-10-03 09:24:22,184 WARNING [train.py:1204] (2/4) Exclude cut with ID _41391_210_7994_1_1533603771872_3638201_31-188961-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 09:24:23,559 WARNING [train.py:1204] (2/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_1073-6264-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 09:24:24,186 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.3.encoder.layers.2.nonlin_attention.whiten2, num_groups=1, num_channels=512, metric=6.17 vs. limit=15.0 2023-10-03 09:24:25,294 WARNING [train.py:1204] (2/4) Exclude cut with ID _66868_210_9527_1_1535509287954_4561140_435-259515-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 09:24:25,377 WARNING [train.py:1204] (2/4) Exclude cut with ID _14886_210_4220_1_1530702020355_3516190_159-73748-0 from training. Duration: 0.83 2023-10-03 09:24:25,419 WARNING [train.py:1204] (2/4) Exclude cut with ID _15150_210_8341_1_1530752411803_7256384_469-239509-0_sp1.1 from training. Duration: 0.6181875 2023-10-03 09:24:27,590 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=1217553.3333333333, ans=0.1 2023-10-03 09:24:28,610 WARNING [train.py:1204] (2/4) Exclude cut with ID _8064_210_5399_1_1528372299291_4282973_395-340852-0 from training. Duration: 0.93 2023-10-03 09:24:28,620 WARNING [train.py:1204] (2/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_138-187031-0_sp1.1 from training. Duration: 0.9 2023-10-03 09:24:31,556 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_13757_210_3528_1_1530440682_4107239_48-106347-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 09:24:32,857 WARNING [train.py:1204] (2/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_164-224052-0_sp1.1 from training. Duration: 0.636375 2023-10-03 09:24:32,861 WARNING [train.py:1204] (2/4) Exclude cut with ID _59288_210_5854_1_1535077757774_3412246_408-322648-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 09:24:32,907 WARNING [train.py:1204] (2/4) Exclude cut with ID _30191_210_1664_1_1533189915047_7196670_827-36359-0_sp1.1 from training. Duration: 0.963625 2023-10-03 09:24:33,096 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.conv_module1.balancer2.prob, batch_count=1217553.3333333333, ans=0.125 2023-10-03 09:24:34,270 WARNING [train.py:1204] (2/4) Exclude cut with ID _4680_210_3525_1_1527249757953_893386_75-78979-0_sp1.1 from training. Duration: 0.82725 2023-10-03 09:24:35,555 WARNING [train.py:1204] (2/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_383-165234-0 from training. Duration: 0.94 2023-10-03 09:24:38,314 WARNING [train.py:1204] (2/4) Exclude cut with ID _19896_210_1883_1_1531709022662_6649569_94-229927-0 from training. Duration: 0.97 2023-10-03 09:24:38,320 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6788_210_1388_1_1527898698_4562021_180-75288-0_sp1.1 from training. Duration: 0.9 2023-10-03 09:24:38,329 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_26433_210_12108_1_1532157158_4056480_54-321349-0_sp1.1 from training. Duration: 0.97275 2023-10-03 09:24:40,402 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.3.encoder.layers.3.conv_module1.whiten, num_groups=1, num_channels=512, metric=4.76 vs. limit=15.0 2023-10-03 09:24:42,550 WARNING [train.py:1204] (2/4) Exclude cut with ID _56108_210_16380_1_1534505446159_3887959_265-161493-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 09:24:44,022 WARNING [train.py:1204] (2/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_395-111602-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 09:24:44,035 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_154-316450-0_sp0.9 from training. Duration: 0.8444375 2023-10-03 09:24:45,298 WARNING [train.py:1204] (2/4) Exclude cut with ID _43552_210_8913_1_1533726031059_3523429_381-116041-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 09:24:46,617 WARNING [train.py:1204] (2/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_65-104124-0_sp1.1 from training. Duration: 0.963625 2023-10-03 09:24:46,678 WARNING [train.py:1204] (2/4) Exclude cut with ID _6536_210_1883_1_1527850783339_3319069_169-23811-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 09:24:47,952 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_289-180904-0_sp0.9 from training. Duration: 0.8444375 2023-10-03 09:24:47,963 WARNING [train.py:1204] (2/4) Exclude cut with ID _56406_210_9288_1_1534899618664_3496149_226-242400-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 09:24:49,480 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_24899_210_16554_1_1532046422_3827462_276-216330-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 09:24:52,721 WARNING [train.py:1204] (2/4) Exclude cut with ID _29345_210_4381_1_1532689168263_3860169_198-174328-0_sp1.1 from training. Duration: 0.82725 2023-10-03 09:24:54,726 WARNING [train.py:1204] (2/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_338-96520-0 from training. Duration: 0.94 2023-10-03 09:24:54,922 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.conv_module2.balancer2.prob, batch_count=1217620.0, ans=0.125 2023-10-03 09:24:59,324 WARNING [train.py:1204] (2/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_7-111954-0_sp1.1 from training. Duration: 0.5090625 2023-10-03 09:25:00,820 WARNING [train.py:1204] (2/4) Exclude cut with ID _36692_210_5665_1_1533608967420_398809_8-16261-0_sp1.1 from training. Duration: 0.97275 2023-10-03 09:25:02,326 WARNING [train.py:1204] (2/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_86-212657-0_sp1.1 from training. Duration: 0.97275 2023-10-03 09:25:03,591 WARNING [train.py:1204] (2/4) Exclude cut with ID _44659_210_2721_1_1534036120061_6614860_626-298884-0_sp1.1 from training. Duration: 0.8818125 2023-10-03 09:25:03,899 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.nonlin_attention.balancer.prob, batch_count=1217686.6666666667, ans=0.125 2023-10-03 09:25:05,155 WARNING [train.py:1204] (2/4) Exclude cut with ID _49755_210_18053_1_1534581005846_3218709_120-257918-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 09:25:08,050 WARNING [train.py:1204] (2/4) Exclude cut with ID _21881_210_3525_1_1531825242757_3798380_400-113821-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 09:25:08,053 WARNING [train.py:1204] (2/4) Exclude cut with ID _21783_210_11357_1_1531742458730_3762679_387-236773-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 09:25:08,206 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.0.feed_forward1.out_proj.dropout_p, batch_count=1217686.6666666667, ans=0.1 2023-10-03 09:25:09,404 WARNING [train.py:1204] (2/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_613-232856-0_sp1.1 from training. Duration: 0.4909375 2023-10-03 09:25:09,437 WARNING [train.py:1204] (2/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_181-78632-0_sp0.9 from training. Duration: 0.8666875 2023-10-03 09:25:12,118 INFO [train.py:1046] (2/4) Epoch 35, batch 2050, loss[loss=0.1397, simple_loss=0.2033, pruned_loss=0.03809, over 23388.00 frames. ], tot_loss[loss=0.1599, simple_loss=0.2387, pruned_loss=0.04051, over 4714328.29 frames. ], batch size: 285, lr: 2.92e-03, grad_scale: 16.0 2023-10-03 09:25:12,190 WARNING [train.py:1204] (2/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_175-17485-0_sp1.1 from training. Duration: 0.97275 2023-10-03 09:25:12,266 WARNING [train.py:1204] (2/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_1004-183422-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 09:25:13,710 WARNING [train.py:1204] (2/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_32-65962-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 09:25:15,001 WARNING [train.py:1204] (2/4) Exclude cut with ID _15458_210_8341_1_1530858614263_3834410_40-298664-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 09:25:19,572 WARNING [train.py:1204] (2/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_380-261863-0_sp1.1 from training. Duration: 0.963625 2023-10-03 09:25:22,776 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_372-173012-0_sp1.1 from training. Duration: 0.863625 2023-10-03 09:25:24,534 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_57-82205-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 09:25:25,760 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.652e+02 1.915e+02 2.069e+02 2.253e+02 3.253e+02, threshold=4.138e+02, percent-clipped=0.0 2023-10-03 09:25:25,863 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_3691_210_3528_1_1526468169_4062053_355-334432-0_sp0.9 from training. Duration: 0.9666875 2023-10-03 09:25:27,840 WARNING [train.py:1204] (2/4) Exclude cut with ID _19023_210_4353_1_1531378892552_5820780_28-36615-0 from training. Duration: 0.97 2023-10-03 09:25:27,845 WARNING [train.py:1204] (2/4) Exclude cut with ID _29853_210_7497_1_1532505600851_6642110_286-251333-0_sp1.1 from training. Duration: 0.92725 2023-10-03 09:25:29,292 WARNING [train.py:1204] (2/4) Exclude cut with ID _31602_210_5854_1_1532677021456_4458250_518-80679-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 09:25:29,328 WARNING [train.py:1204] (2/4) Exclude cut with ID _14922_210_6228_1_1530707435273_3693310_38-187851-0_sp0.9 from training. Duration: 0.688875 2023-10-03 09:25:40,258 WARNING [train.py:1204] (2/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_39-248870-0_sp1.1 from training. Duration: 0.62725 2023-10-03 09:25:40,275 WARNING [train.py:1204] (2/4) Exclude cut with ID _16908_210_5724_1_1531119157248_3772700_154-31455-0_sp1.1 from training. Duration: 0.97275 2023-10-03 09:25:43,072 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8453_210_4211_1_1528502940_8562730_347-330673-0 from training. Duration: 0.72 2023-10-03 09:25:44,488 WARNING [train.py:1204] (2/4) Exclude cut with ID _30191_210_1664_1_1533189915047_7196670_674-204247-0_sp1.1 from training. Duration: 0.97275 2023-10-03 09:25:44,580 WARNING [train.py:1204] (2/4) Exclude cut with ID _8833_210_5219_1_1528592665210_7553059_329-125156-0 from training. Duration: 0.82 2023-10-03 09:25:44,615 WARNING [train.py:1204] (2/4) Exclude cut with ID _4779_210_2084_1_1527318022944_3914893_500-285013-0_sp1.1 from training. Duration: 0.62725 2023-10-03 09:25:47,401 WARNING [train.py:1204] (2/4) Exclude cut with ID _14421_210_8750_1_1530604795825_3542290_190-313578-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 09:25:50,709 WARNING [train.py:1204] (2/4) Exclude cut with ID _35799_210_5854_1_1533263487887_3529071_538-273574-0_sp1.1 from training. Duration: 0.936375 2023-10-03 09:25:52,201 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_14928_210_5632_1_1530773461_4714000_454-112198-0_sp1.1 from training. Duration: 0.6 2023-10-03 09:25:52,239 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_468-143126-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 09:25:53,654 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_4528_210_3273_1_1527076654_3276777_258-121681-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 09:25:55,665 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_397-132565-0_sp1.1 from training. Duration: 0.9 2023-10-03 09:25:55,683 WARNING [train.py:1204] (2/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_454-123002-0_sp0.9 from training. Duration: 0.7666875 2023-10-03 09:26:00,265 WARNING [train.py:1204] (2/4) Exclude cut with ID _27052_210_5986_1_1532329197187_3585009_297-86890-0_sp1.1 from training. Duration: 0.936375 2023-10-03 09:26:01,681 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_51225_210_3601_1_1534400634_2327383_55-301844-0_sp1.1 from training. Duration: 0.8454375 2023-10-03 09:26:03,092 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6786_210_1388_1_1527839699_4194495_378-46070-0_sp0.9 from training. Duration: 0.8 2023-10-03 09:26:04,412 WARNING [train.py:1204] (2/4) Exclude cut with ID _42334_210_7770_1_1534206610396_3605880_55-19758-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 09:26:07,261 WARNING [train.py:1204] (2/4) Exclude cut with ID _14884_210_2252_1_1530707459689_5561250_114-201399-0_sp0.9 from training. Duration: 0.8444375 2023-10-03 09:26:12,723 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_99-251314-0_sp0.9 from training. Duration: 0.9666875 2023-10-03 09:26:12,806 WARNING [train.py:1204] (2/4) Exclude cut with ID _26528_210_9070_1_1532570410842_3851310_497-97218-0 from training. Duration: 0.86 2023-10-03 09:26:17,005 WARNING [train.py:1204] (2/4) Exclude cut with ID _55328_210_27767_1_1534580973260_4088170_230-317241-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 09:26:18,370 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_125-203105-0_sp0.9 from training. Duration: 0.888875 2023-10-03 09:26:20,168 WARNING [train.py:1204] (2/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_55-31904-0_sp1.1 from training. Duration: 0.8 2023-10-03 09:26:22,826 WARNING [train.py:1204] (2/4) Exclude cut with ID _8064_210_5399_1_1528372299291_4282973_18-174647-0 from training. Duration: 0.91 2023-10-03 09:26:26,568 INFO [train.py:1046] (2/4) Epoch 35, batch 2100, loss[loss=0.1594, simple_loss=0.2332, pruned_loss=0.0428, over 23830.00 frames. ], tot_loss[loss=0.1595, simple_loss=0.2377, pruned_loss=0.04061, over 4714052.53 frames. ], batch size: 179, lr: 2.92e-03, grad_scale: 8.0 2023-10-03 09:26:26,634 WARNING [train.py:1204] (2/4) Exclude cut with ID IC0165W0005-81726 from training. Duration: 0.952 2023-10-03 09:26:26,634 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_31583_210_3852_1_1532595851_4024413_148-171847-0_sp1.1 from training. Duration: 0.963625 2023-10-03 09:26:28,390 WARNING [train.py:1204] (2/4) Exclude cut with ID _21989_210_5854_1_1531899214931_3271360_116-138769-0_sp1.1 from training. Duration: 0.936375 2023-10-03 09:26:28,445 WARNING [train.py:1204] (2/4) Exclude cut with ID _66070_210_7640_1_1535441393096_7805310_489-292631-0_sp1.1 from training. Duration: 0.7181875 2023-10-03 09:26:29,523 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.0.layers.0.feed_forward2.out_whiten, num_groups=1, num_channels=192, metric=5.91 vs. limit=15.0 2023-10-03 09:26:29,833 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_20120_210_6753_1_1531643035_5482859_202-27960-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 09:26:29,842 WARNING [train.py:1204] (2/4) Exclude cut with ID _5294_210_1747_1_1527249643928_3696179_342-270278-0 from training. Duration: 0.55 2023-10-03 09:26:29,878 WARNING [train.py:1204] (2/4) Exclude cut with ID _38323_210_12108_1_1533121233369_3303049_416-11919-0 from training. Duration: 0.96 2023-10-03 09:26:32,538 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6545_210_2040_1_1527774670_4245258_407-48070-0_sp0.9 from training. Duration: 0.8444375 2023-10-03 09:26:35,383 WARNING [train.py:1204] (2/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_503-30719-0_sp1.1 from training. Duration: 0.8909375 2023-10-03 09:26:35,446 WARNING [train.py:1204] (2/4) Exclude cut with ID _49643_210_7989_1_1534640244224_3939031_234-326497-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 09:26:39,469 WARNING [train.py:1204] (2/4) Exclude cut with ID _14170_210_5219_1_1530525949868_4469650_301-77873-0_sp1.1 from training. Duration: 0.963625 2023-10-03 09:26:39,533 WARNING [train.py:1204] (2/4) Exclude cut with ID _45297_210_2721_1_1533715351381_3264949_152-154697-0_sp1.1 from training. Duration: 0.97275 2023-10-03 09:26:39,706 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.bypass.scale_min, batch_count=1218153.3333333333, ans=0.2 2023-10-03 09:26:40,774 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_3371_210_1794_1_1526384462_5580777_396-339812-0 from training. Duration: 0.85 2023-10-03 09:26:42,150 WARNING [train.py:1204] (2/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_399-156791-0_sp1.1 from training. Duration: 0.7545625 2023-10-03 09:26:42,185 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7696_210_2040_1_1528207167_3716099_117-125434-0 from training. Duration: 0.76 2023-10-03 09:26:42,198 WARNING [train.py:1204] (2/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_96-244299-0 from training. Duration: 0.88 2023-10-03 09:26:43,558 WARNING [train.py:1204] (2/4) Exclude cut with ID _37892_210_19596_1_1533198677897_3964058_519-343462-0_sp1.1 from training. Duration: 0.92725 2023-10-03 09:26:43,592 WARNING [train.py:1204] (2/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_202-183922-0_sp1.1 from training. Duration: 0.77275 2023-10-03 09:26:43,597 WARNING [train.py:1204] (2/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_453-133983-0 from training. Duration: 0.78 2023-10-03 09:26:44,886 WARNING [train.py:1204] (2/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_178-39722-0_sp1.1 from training. Duration: 0.4454375 2023-10-03 09:26:50,393 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_15126_210_6753_1_1530778677_2591185_113-157651-0 from training. Duration: 0.68 2023-10-03 09:26:50,394 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_271-234962-0_sp1.1 from training. Duration: 0.7181875 2023-10-03 09:26:53,522 WARNING [train.py:1204] (2/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_195-64967-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 09:26:53,582 WARNING [train.py:1204] (2/4) Exclude cut with ID _43552_210_8913_1_1533726031059_3523429_365-215065-0_sp1.1 from training. Duration: 0.936375 2023-10-03 09:26:57,073 WARNING [train.py:1204] (2/4) Exclude cut with ID _31602_210_5854_1_1532677021456_4458250_249-96604-0_sp0.9 from training. Duration: 0.988875 2023-10-03 09:26:57,232 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.feed_forward3.hidden_balancer.prob, batch_count=1218220.0, ans=0.125 2023-10-03 09:26:58,382 WARNING [train.py:1204] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_307-158462-0 from training. Duration: 0.69 2023-10-03 09:26:58,424 WARNING [train.py:1204] (2/4) Exclude cut with ID _30918_210_6400_1_1532754078600_3125920_407-223394-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 09:26:58,427 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6673_210_3273_1_1527767790_3173240_351-167954-0_sp0.9 from training. Duration: 0.5333125 2023-10-03 09:27:00,232 WARNING [train.py:1204] (2/4) Exclude cut with ID _4866_210_2577_1_1527404412284_4364659_256-306830-0 from training. Duration: 0.87 2023-10-03 09:27:00,271 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_17613_210_2151_1_1531137569_2366160_222-131118-0_sp1.1 from training. Duration: 0.963625 2023-10-03 09:27:00,290 WARNING [train.py:1204] (2/4) Exclude cut with ID _45560_210_21982_1_1533812603951_3584999_111-6376-0 from training. Duration: 0.84 2023-10-03 09:27:01,641 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_216-73841-0 from training. Duration: 0.96 2023-10-03 09:27:01,690 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_14058_210_4163_1_1530690953_6327180_49-210687-0 from training. Duration: 0.94 2023-10-03 09:27:03,090 WARNING [train.py:1204] (2/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_506-42614-0_sp0.9 from training. Duration: 0.87775 2023-10-03 09:27:04,572 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_3133_210_2087_1_1526106446_2324690_25-332138-0_sp1.1 from training. Duration: 0.82725 2023-10-03 09:27:06,138 WARNING [train.py:1204] (2/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_589-9070-0_sp1.1 from training. Duration: 0.6454375 2023-10-03 09:27:07,550 WARNING [train.py:1204] (2/4) Exclude cut with ID _6194_210_1534_1_1527511826160_3941194_14-282876-0_sp1.1 from training. Duration: 0.6454375 2023-10-03 09:27:08,908 WARNING [train.py:1204] (2/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_1166-90654-0_sp1.1 from training. Duration: 0.92725 2023-10-03 09:27:10,347 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_14056_210_4163_1_1530604483_7572827_218-255858-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 09:27:10,355 WARNING [train.py:1204] (2/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_1285-256622-0 from training. Duration: 0.84 2023-10-03 09:27:11,544 WARNING [train.py:1204] (2/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_205-286261-0_sp1.1 from training. Duration: 0.963625 2023-10-03 09:27:11,558 WARNING [train.py:1204] (2/4) Exclude cut with ID _68777_210_4353_1_1535810390731_3117904_6-154119-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 09:27:11,618 WARNING [train.py:1204] (2/4) Exclude cut with ID _22982_210_1883_1_1531882790447_5449409_565-199393-0_sp1.1 from training. Duration: 0.92725 2023-10-03 09:27:11,633 WARNING [train.py:1204] (2/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_278-154492-0 from training. Duration: 0.76 2023-10-03 09:27:14,391 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_426-313557-0 from training. Duration: 0.53 2023-10-03 09:27:15,687 WARNING [train.py:1204] (2/4) Exclude cut with ID _41817_210_18835_1_1533434477160_7423049_555-293448-0 from training. Duration: 0.8 2023-10-03 09:27:18,384 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.conv_module1.whiten.whitening_limit, batch_count=1218286.6666666667, ans=15.0 2023-10-03 09:27:19,076 WARNING [train.py:1204] (2/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_32-335103-0_sp1.1 from training. Duration: 0.7454375 2023-10-03 09:27:23,140 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_2655_210_2087_1_1525688959_3897400_202-147557-0_sp1.1 from training. Duration: 0.77275 2023-10-03 09:27:23,169 WARNING [train.py:1204] (2/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_158-250116-0 from training. Duration: 0.62 2023-10-03 09:27:28,419 WARNING [train.py:1204] (2/4) Exclude cut with ID _55429_210_10626_1_1534467588527_7174143_528-88083-0_sp1.1 from training. Duration: 0.963625 2023-10-03 09:27:31,579 WARNING [train.py:1204] (2/4) Exclude cut with ID _8030_210_7497_1_1528444886781_3531089_32-31837-0_sp1.1 from training. Duration: 0.8818125 2023-10-03 09:27:31,600 WARNING [train.py:1204] (2/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_744-86168-0_sp1.1 from training. Duration: 0.97275 2023-10-03 09:27:31,615 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1113-7369-0_sp0.9 from training. Duration: 0.9444375 2023-10-03 09:27:31,640 WARNING [train.py:1204] (2/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_124-345840-0_sp0.9 from training. Duration: 0.57775 2023-10-03 09:27:32,951 WARNING [train.py:1204] (2/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_328-264776-0_sp0.9 from training. Duration: 0.7444375 2023-10-03 09:27:34,333 WARNING [train.py:1204] (2/4) Exclude cut with ID _18895_210_6228_1_1531357274234_3784930_469-166615-0_sp1.1 from training. Duration: 0.963625 2023-10-03 09:27:34,346 WARNING [train.py:1204] (2/4) Exclude cut with ID _8395_210_1531_1_1528597871523_4411579_431-132470-0_sp0.9 from training. Duration: 0.82225 2023-10-03 09:27:35,724 WARNING [train.py:1204] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_554-45617-0_sp1.1 from training. Duration: 0.9 2023-10-03 09:27:35,753 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_35361_210_4238_1_1533262810_7793476_524-188242-0_sp1.1 from training. Duration: 0.92725 2023-10-03 09:27:37,167 WARNING [train.py:1204] (2/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_938-205135-0 from training. Duration: 0.87 2023-10-03 09:27:38,588 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_5931_210_2040_1_1527428481_4547999_321-193614-0 from training. Duration: 0.67 2023-10-03 09:27:38,610 WARNING [train.py:1204] (2/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_445-269521-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 09:27:40,016 INFO [train.py:1046] (2/4) Epoch 35, batch 2150, loss[loss=0.1552, simple_loss=0.2346, pruned_loss=0.0379, over 23515.00 frames. ], tot_loss[loss=0.1588, simple_loss=0.2368, pruned_loss=0.0404, over 4701829.72 frames. ], batch size: 120, lr: 2.92e-03, grad_scale: 8.0 2023-10-03 09:27:40,114 WARNING [train.py:1204] (2/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_3-33116-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 09:27:40,115 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_4633_210_2040_1_1526824163_4145343_416-76117-0_sp1.1 from training. Duration: 0.863625 2023-10-03 09:27:40,149 WARNING [train.py:1204] (2/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_1033-274376-0_sp1.1 from training. Duration: 0.7909375 2023-10-03 09:27:41,498 WARNING [train.py:1204] (2/4) Exclude cut with ID _8772_210_1850_1_1528635595972_3664011_398-320049-0_sp0.9 from training. Duration: 0.97775 2023-10-03 09:27:45,530 WARNING [train.py:1204] (2/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_321-215742-0_sp0.9 from training. Duration: 0.5555625 2023-10-03 09:27:47,466 WARNING [train.py:1204] (2/4) Exclude cut with ID _28394_210_8913_1_1532430049518_3650118_272-190950-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 09:27:48,870 WARNING [train.py:1204] (2/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_31-299371-0_sp1.1 from training. Duration: 0.92725 2023-10-03 09:27:51,593 WARNING [train.py:1204] (2/4) Exclude cut with ID _55383_210_10635_1_1534468426172_2231669_134-128246-0_sp0.9 from training. Duration: 0.988875 2023-10-03 09:27:51,603 WARNING [train.py:1204] (2/4) Exclude cut with ID _40519_210_13794_1_1533715228655_7337390_887-9547-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 09:27:51,636 WARNING [train.py:1204] (2/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_310-109461-0_sp0.9 from training. Duration: 0.888875 2023-10-03 09:27:54,159 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.583e+02 1.825e+02 1.971e+02 2.203e+02 3.479e+02, threshold=3.943e+02, percent-clipped=0.0 2023-10-03 09:27:55,674 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_2009_210_2071_1_1525481134_4588937_258-250196-0_sp1.1 from training. Duration: 0.92725 2023-10-03 09:27:55,725 WARNING [train.py:1204] (2/4) Exclude cut with ID _9235_210_5219_1_1528891154253_8046120_1186-72702-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 09:27:55,728 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_14831_210_7927_1_1530700200_5583610_181-27467-0_sp1.1 from training. Duration: 0.82725 2023-10-03 09:28:00,849 WARNING [train.py:1204] (2/4) Exclude cut with ID _40402_210_18219_1_1533522324115_2953150_278-132109-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 09:28:00,868 WARNING [train.py:1204] (2/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_261-297503-0 from training. Duration: 0.99 2023-10-03 09:28:04,239 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.5.encoder.layers.0.whiten, num_groups=1, num_channels=256, metric=2.40 vs. limit=12.0 2023-10-03 09:28:05,118 WARNING [train.py:1204] (2/4) Exclude cut with ID _19896_210_1883_1_1531709022662_6649569_313-265903-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 09:28:05,220 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_105-114797-0_sp1.1 from training. Duration: 0.763625 2023-10-03 09:28:05,468 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.self_attn_weights.pos_emb_skip_rate, batch_count=1218486.6666666667, ans=0.0 2023-10-03 09:28:06,485 WARNING [train.py:1204] (2/4) Exclude cut with ID _53755_210_8254_1_1534577312941_4707360_9-65843-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 09:28:06,514 WARNING [train.py:1204] (2/4) Exclude cut with ID _18516_210_9206_1_1531704653206_3692406_361-224549-0_sp1.1 from training. Duration: 0.97275 2023-10-03 09:28:06,533 WARNING [train.py:1204] (2/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_403-310975-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 09:28:06,569 WARNING [train.py:1204] (2/4) Exclude cut with ID _5444_210_2151_1_1527418808464_5312210_581-315411-0_sp0.9 from training. Duration: 0.688875 2023-10-03 09:28:07,851 WARNING [train.py:1204] (2/4) Exclude cut with ID _23825_210_13449_1_1531915313526_3497150_464-154904-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 09:28:07,860 WARNING [train.py:1204] (2/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_937-208281-0_sp0.9 from training. Duration: 0.9444375 2023-10-03 09:28:07,922 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_37839_210_7234_1_1533542462_3689303_158-181212-0_sp1.1 from training. Duration: 0.963625 2023-10-03 09:28:09,291 WARNING [train.py:1204] (2/4) Exclude cut with ID _8772_210_1850_1_1528635595972_3664011_398-320049-0 from training. Duration: 0.88 2023-10-03 09:28:10,664 WARNING [train.py:1204] (2/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_718-250907-0_sp1.1 from training. Duration: 0.62725 2023-10-03 09:28:12,034 WARNING [train.py:1204] (2/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_641-126395-0_sp1.1 from training. Duration: 0.92725 2023-10-03 09:28:13,362 WARNING [train.py:1204] (2/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_166-241493-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 09:28:13,452 WARNING [train.py:1204] (2/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_526-62079-0_sp0.9 from training. Duration: 0.7444375 2023-10-03 09:28:14,815 WARNING [train.py:1204] (2/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_439-275083-0_sp1.1 from training. Duration: 0.936375 2023-10-03 09:28:17,183 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.conv_skip_rate, batch_count=1218553.3333333333, ans=0.0 2023-10-03 09:28:18,149 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_66907_210_21581_1_1535615661_3590177_493-229032-0_sp1.1 from training. Duration: 0.92725 2023-10-03 09:28:18,188 WARNING [train.py:1204] (2/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_89-321955-0_sp1.1 from training. Duration: 0.72725 2023-10-03 09:28:19,552 WARNING [train.py:1204] (2/4) Exclude cut with ID _29857_210_20034_1_1532395886753_3798160_273-226195-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 09:28:19,559 WARNING [train.py:1204] (2/4) Exclude cut with ID _22826_210_9994_1_1531908201522_3279169_404-75889-0 from training. Duration: 0.89 2023-10-03 09:28:20,786 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_271-234962-0_sp0.9 from training. Duration: 0.87775 2023-10-03 09:28:22,254 WARNING [train.py:1204] (2/4) Exclude cut with ID _22961_210_8254_1_1531976366672_4278180_156-339685-0_sp1.1 from training. Duration: 0.97275 2023-10-03 09:28:23,581 WARNING [train.py:1204] (2/4) Exclude cut with ID _37892_210_19596_1_1533198677897_3964058_425-231927-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 09:28:23,679 WARNING [train.py:1204] (2/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_102-288670-0_sp1.1 from training. Duration: 0.97275 2023-10-03 09:28:25,641 WARNING [train.py:1204] (2/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_283-220372-0_sp1.1 from training. Duration: 0.5090625 2023-10-03 09:28:27,025 WARNING [train.py:1204] (2/4) Exclude cut with ID _44304_210_15374_1_1533639352765_4613129_86-1699-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 09:28:27,803 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.0.layers.1.nonlin_attention.whiten1, num_groups=1, num_channels=144, metric=5.22 vs. limit=10.0 2023-10-03 09:28:28,420 WARNING [train.py:1204] (2/4) Exclude cut with ID _2891_210_2550_1_1525874398521_4427710_327-121590-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 09:28:28,439 WARNING [train.py:1204] (2/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_369-222060-0 from training. Duration: 0.71 2023-10-03 09:28:30,396 WARNING [train.py:1204] (2/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_762-109886-0 from training. Duration: 0.91 2023-10-03 09:28:30,428 WARNING [train.py:1204] (2/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_477-255782-0_sp0.9 from training. Duration: 0.911125 2023-10-03 09:28:30,454 WARNING [train.py:1204] (2/4) Exclude cut with ID IC0669W0061-330906 from training. Duration: 0.768 2023-10-03 09:28:31,763 WARNING [train.py:1204] (2/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_347-213071-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 09:28:31,809 WARNING [train.py:1204] (2/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_507-303370-0_sp1.1 from training. Duration: 0.87275 2023-10-03 09:28:33,157 WARNING [train.py:1204] (2/4) Exclude cut with ID _15012_210_1968_1_1530856820429_5095350_688-178849-0 from training. Duration: 0.64 2023-10-03 09:28:33,162 WARNING [train.py:1204] (2/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_28-70987-0_sp0.9 from training. Duration: 0.9666875 2023-10-03 09:28:33,164 WARNING [train.py:1204] (2/4) Exclude cut with ID _5910_210_1389_1_1527337972454_5050100_555-159815-0 from training. Duration: 0.98 2023-10-03 09:28:33,193 WARNING [train.py:1204] (2/4) Exclude cut with ID IC0679W0064-335871 from training. Duration: 0.768 2023-10-03 09:28:33,193 WARNING [train.py:1204] (2/4) Exclude cut with ID IC0679W0079-335886 from training. Duration: 0.896 2023-10-03 09:28:34,816 WARNING [train.py:1204] (2/4) Exclude cut with ID _5608_210_2106_1_1527415439089_6819768_909-82767-0 from training. Duration: 0.61 2023-10-03 09:28:36,157 WARNING [train.py:1204] (2/4) Exclude cut with ID _23038_210_9570_1_1532001584158_3652419_411-323967-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 09:28:36,204 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_40896_210_15710_1_1533433956_7100802_350-231748-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 09:28:36,214 WARNING [train.py:1204] (2/4) Exclude cut with ID _5289_210_2084_1_1527678202736_3779299_142-103317-0_sp0.9 from training. Duration: 0.9333125 2023-10-03 09:28:36,293 WARNING [train.py:1204] (2/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_314-144055-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 09:28:37,574 WARNING [train.py:1204] (2/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_177-236809-0_sp0.9 from training. Duration: 0.5444375 2023-10-03 09:28:37,692 WARNING [train.py:1204] (2/4) Exclude cut with ID _23038_210_9570_1_1532001584158_3652419_82-334733-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 09:28:37,699 WARNING [train.py:1204] (2/4) Exclude cut with ID _43552_210_8913_1_1533726031059_3523429_225-22526-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 09:28:47,333 WARNING [train.py:1204] (2/4) Exclude cut with ID _30327_210_16049_1_1532484161295_3924169_401-70819-0_sp1.1 from training. Duration: 0.7818125 2023-10-03 09:28:48,650 WARNING [train.py:1204] (2/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_538-32291-0 from training. Duration: 0.83 2023-10-03 09:28:51,856 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_37357_210_17024_1_1533089504_2316121_258-211605-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 09:28:53,564 INFO [train.py:1046] (2/4) Epoch 35, batch 2200, loss[loss=0.1615, simple_loss=0.2357, pruned_loss=0.04369, over 23369.00 frames. ], tot_loss[loss=0.159, simple_loss=0.2376, pruned_loss=0.04021, over 4714410.83 frames. ], batch size: 285, lr: 2.92e-03, grad_scale: 8.0 2023-10-03 09:28:58,293 WARNING [train.py:1204] (2/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_550-335198-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 09:28:59,569 WARNING [train.py:1204] (2/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_98-741-0_sp0.9 from training. Duration: 0.888875 2023-10-03 09:28:59,619 WARNING [train.py:1204] (2/4) Exclude cut with ID _38323_210_12108_1_1533121233369_3303049_251-238988-0_sp1.1 from training. Duration: 0.92725 2023-10-03 09:29:01,104 WARNING [train.py:1204] (2/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_259-34038-0_sp0.9 from training. Duration: 0.82225 2023-10-03 09:29:04,478 WARNING [train.py:1204] (2/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_1060-180633-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 09:29:04,514 WARNING [train.py:1204] (2/4) Exclude cut with ID _8755_210_1876_1_1528628559357_3258209_211-37473-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 09:29:04,522 WARNING [train.py:1204] (2/4) Exclude cut with ID _4122_210_2084_1_1526713164417_4353280_528-206771-0 from training. Duration: 0.88 2023-10-03 09:29:08,763 WARNING [train.py:1204] (2/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_64-248359-0 from training. Duration: 0.69 2023-10-03 09:29:11,376 WARNING [train.py:1204] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_402-253027-0_sp1.1 from training. Duration: 0.4909375 2023-10-03 09:29:15,681 WARNING [train.py:1204] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_625-102955-0 from training. Duration: 0.77 2023-10-03 09:29:17,130 WARNING [train.py:1204] (2/4) Exclude cut with ID _14998_210_5219_1_1530705582080_7474970_611-219324-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 09:29:18,491 WARNING [train.py:1204] (2/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_718-68505-0_sp0.9 from training. Duration: 0.988875 2023-10-03 09:29:18,543 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_353-182855-0_sp1.1 from training. Duration: 0.87275 2023-10-03 09:29:22,276 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.5.encoder.layers.0.feed_forward2.out_whiten, num_groups=1, num_channels=256, metric=6.86 vs. limit=15.0 2023-10-03 09:29:23,046 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_5960_210_1794_1_1527403865_3990999_515-2613-0_sp0.9 from training. Duration: 0.97775 2023-10-03 09:29:23,080 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_5575_210_1410_1_1527301305_2570170_342-265572-0 from training. Duration: 0.94 2023-10-03 09:29:27,983 WARNING [train.py:1204] (2/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_329-113173-0_sp0.9 from training. Duration: 0.688875 2023-10-03 09:29:28,209 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.bypass.scale_min, batch_count=1218886.6666666667, ans=0.2 2023-10-03 09:29:29,329 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_15396_210_6890_1_1531308473_1938936_213-312506-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 09:29:29,382 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_521-214276-0_sp1.1 from training. Duration: 0.4 2023-10-03 09:29:32,688 WARNING [train.py:1204] (2/4) Exclude cut with ID _15026_210_8750_1_1530777602316_3291850_211-195043-0_sp1.1 from training. Duration: 0.72725 2023-10-03 09:29:35,343 WARNING [train.py:1204] (2/4) Exclude cut with ID _29343_210_4381_1_1532602757752_3674109_108-257494-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 09:29:36,745 WARNING [train.py:1204] (2/4) Exclude cut with ID _64414_210_17301_1_1535243252048_3757759_403-58151-0_sp1.1 from training. Duration: 0.963625 2023-10-03 09:29:36,854 WARNING [train.py:1204] (2/4) Exclude cut with ID _36512_210_6987_1_1533033143998_3126069_109-177787-0_sp1.1 from training. Duration: 0.92725 2023-10-03 09:29:39,608 WARNING [train.py:1204] (2/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_498-40153-0 from training. Duration: 0.63 2023-10-03 09:29:39,684 WARNING [train.py:1204] (2/4) Exclude cut with ID _56447_210_1389_1_1535074245910_3697189_161-334523-0_sp1.1 from training. Duration: 0.936375 2023-10-03 09:29:41,102 WARNING [train.py:1204] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_238-245655-0 from training. Duration: 0.81 2023-10-03 09:29:42,554 WARNING [train.py:1204] (2/4) Exclude cut with ID _64631_210_27767_1_1535349580068_4165160_108-43865-0_sp1.1 from training. Duration: 0.92725 2023-10-03 09:29:42,567 WARNING [train.py:1204] (2/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_243-42116-0_sp1.1 from training. Duration: 0.663625 2023-10-03 09:29:42,591 WARNING [train.py:1204] (2/4) Exclude cut with ID _66868_210_9527_1_1535509287954_4561140_594-132160-0_sp1.1 from training. Duration: 0.92725 2023-10-03 09:29:45,333 WARNING [train.py:1204] (2/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_48-338976-0_sp0.9 from training. Duration: 0.988875 2023-10-03 09:29:45,367 WARNING [train.py:1204] (2/4) Exclude cut with ID _5250_210_5219_1_1527249561806_8692110_137-100595-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 09:29:45,373 WARNING [train.py:1204] (2/4) Exclude cut with ID _49748_210_24116_1_1533986971737_3158910_12-271669-0_sp1.1 from training. Duration: 0.936375 2023-10-03 09:29:45,395 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_37357_210_17024_1_1533089504_2316121_206-80421-0_sp1.1 from training. Duration: 0.936375 2023-10-03 09:29:46,718 WARNING [train.py:1204] (2/4) Exclude cut with ID _7537_210_2084_1_1528502395431_3924359_113-199933-0_sp1.1 from training. Duration: 0.536375 2023-10-03 09:29:48,388 WARNING [train.py:1204] (2/4) Exclude cut with ID _55440_210_12651_1_1534496713364_2980520_31-59289-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 09:29:49,752 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1083-220122-0_sp0.9 from training. Duration: 0.5666875 2023-10-03 09:29:52,586 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_426-313557-0_sp1.1 from training. Duration: 0.4818125 2023-10-03 09:29:53,871 WARNING [train.py:1204] (2/4) Exclude cut with ID _35799_210_5854_1_1533263487887_3529071_324-94843-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 09:29:57,519 WARNING [train.py:1204] (2/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_264-108685-0_sp1.1 from training. Duration: 0.5 2023-10-03 09:29:58,907 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9012W0006-501141 from training. Duration: 0.896 2023-10-03 09:30:00,358 WARNING [train.py:1204] (2/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_890-5117-0_sp0.9 from training. Duration: 0.8666875 2023-10-03 09:30:00,395 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9023W0008-506301 from training. Duration: 0.896 2023-10-03 09:30:00,588 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.1.bypass.scale_min, batch_count=1219020.0, ans=0.2 2023-10-03 09:30:01,717 WARNING [train.py:1204] (2/4) Exclude cut with ID _13916_210_9994_1_1530788341126_3763149_60-191556-0_sp0.9 from training. Duration: 0.67775 2023-10-03 09:30:01,754 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9029W0331-509721 from training. Duration: 0.896 2023-10-03 09:30:03,866 WARNING [train.py:1204] (2/4) Exclude cut with ID _35823_210_6917_1_1533187881914_7297830_567-108596-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 09:30:03,923 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7543_210_6753_1_1528544010_1110402_25-183860-0_sp1.1 from training. Duration: 0.42725 2023-10-03 09:30:06,533 WARNING [train.py:1204] (2/4) Exclude cut with ID _47278_210_12041_1_1533902418349_3576110_495-260035-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 09:30:07,814 INFO [train.py:1046] (2/4) Epoch 35, batch 2250, loss[loss=0.1789, simple_loss=0.2592, pruned_loss=0.04926, over 23272.00 frames. ], tot_loss[loss=0.1591, simple_loss=0.2379, pruned_loss=0.04011, over 4721759.86 frames. ], batch size: 93, lr: 2.92e-03, grad_scale: 8.0 2023-10-03 09:30:07,927 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9051W0015-520281 from training. Duration: 0.9045625 2023-10-03 09:30:10,547 WARNING [train.py:1204] (2/4) Exclude cut with ID _14459_210_2228_1_1531443641946_7516700_707-66193-0_sp1.1 from training. Duration: 0.97275 2023-10-03 09:30:11,898 WARNING [train.py:1204] (2/4) Exclude cut with ID _14087_210_2253_1_1530529224512_3014029_219-224445-0_sp1.1 from training. Duration: 0.763625 2023-10-03 09:30:17,443 WARNING [train.py:1204] (2/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_345-167986-0_sp0.9 from training. Duration: 0.9333125 2023-10-03 09:30:18,898 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_34815_210_6388_1_1533381320_3571903_279-305565-0_sp1.1 from training. Duration: 0.836375 2023-10-03 09:30:22,204 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.522e+02 1.843e+02 1.976e+02 2.228e+02 2.990e+02, threshold=3.951e+02, percent-clipped=0.0 2023-10-03 09:30:23,606 WARNING [train.py:1204] (2/4) Exclude cut with ID _21987_210_5854_1_1531821500482_3637038_317-179212-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 09:30:24,951 WARNING [train.py:1204] (2/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_395-37222-0_sp1.1 from training. Duration: 0.6909375 2023-10-03 09:30:26,362 WARNING [train.py:1204] (2/4) Exclude cut with ID _8064_210_5399_1_1528372299291_4282973_168-50337-0_sp1.1 from training. Duration: 0.763625 2023-10-03 09:30:26,520 WARNING [train.py:1204] (2/4) Exclude cut with ID _60113_210_16271_1_1535164212481_4386079_559-167191-0 from training. Duration: 0.93 2023-10-03 09:30:26,530 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_47228_210_4983_1_1533780021_3672027_344-179642-0_sp1.1 from training. Duration: 0.92725 2023-10-03 09:30:28,410 WARNING [train.py:1204] (2/4) Exclude cut with ID _22961_210_8254_1_1531976366672_4278180_317-199546-0_sp0.9 from training. Duration: 0.9555625 2023-10-03 09:30:29,826 WARNING [train.py:1204] (2/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_141-89727-0 from training. Duration: 0.71 2023-10-03 09:30:29,888 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_78-116075-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 09:30:29,899 WARNING [train.py:1204] (2/4) Exclude cut with ID _64086_210_7994_1_1535270417019_7435949_52-235756-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 09:30:31,796 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7615_210_4211_1_1528110965_4580160_72-235211-0_sp1.1 from training. Duration: 0.6909375 2023-10-03 09:30:38,687 WARNING [train.py:1204] (2/4) Exclude cut with ID _14069_210_5632_1_1530512692033_4803390_287-27950-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 09:30:38,793 WARNING [train.py:1204] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_74-48717-0_sp0.9 from training. Duration: 0.6444375 2023-10-03 09:30:40,174 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7544_210_6753_1_1528591327_1471426_172-348777-0_sp1.1 from training. Duration: 0.6 2023-10-03 09:30:40,352 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.attention_skip_rate, batch_count=1219220.0, ans=0.0 2023-10-03 09:30:41,617 WARNING [train.py:1204] (2/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_220-191080-0 from training. Duration: 0.98 2023-10-03 09:30:43,003 WARNING [train.py:1204] (2/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_1081-137641-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 09:30:44,415 WARNING [train.py:1204] (2/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_278-235459-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 09:30:47,312 WARNING [train.py:1204] (2/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_1181-77470-0_sp1.1 from training. Duration: 0.936375 2023-10-03 09:30:48,705 WARNING [train.py:1204] (2/4) Exclude cut with ID _60860_210_8254_1_1534896015875_3557169_198-5531-0_sp1.1 from training. Duration: 0.936375 2023-10-03 09:30:49,766 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.0.feed_forward2.out_whiten.whitening_limit, batch_count=1219220.0, ans=15.0 2023-10-03 09:30:50,162 WARNING [train.py:1204] (2/4) Exclude cut with ID _12369_210_4381_1_1530356078581_4388520_17-289781-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 09:30:50,168 WARNING [train.py:1204] (2/4) Exclude cut with ID _50310_210_15488_1_1534068043345_3641080_146-199752-0_sp1.1 from training. Duration: 0.92725 2023-10-03 09:30:52,266 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.conv_module2.balancer1.prob, batch_count=1219286.6666666667, ans=0.125 2023-10-03 09:30:53,338 WARNING [train.py:1204] (2/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_969-213354-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 09:30:53,423 WARNING [train.py:1204] (2/4) Exclude cut with ID _70519_210_4163_1_1535961595994_7458230_66-344156-0_sp1.1 from training. Duration: 0.963625 2023-10-03 09:30:56,740 WARNING [train.py:1204] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_380-204756-0_sp1.1 from training. Duration: 0.7818125 2023-10-03 09:30:58,222 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.1.feed_forward1.out_proj.dropout_p, batch_count=1219286.6666666667, ans=0.1 2023-10-03 09:30:59,495 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_128-202104-0_sp0.9 from training. Duration: 0.7 2023-10-03 09:31:03,980 WARNING [train.py:1204] (2/4) Exclude cut with ID _4697_210_2069_1_1526815801953_3823098_51-1049-0_sp0.9 from training. Duration: 0.8333125 2023-10-03 09:31:04,004 WARNING [train.py:1204] (2/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_86-66192-0_sp1.1 from training. Duration: 0.5 2023-10-03 09:31:05,401 WARNING [train.py:1204] (2/4) Exclude cut with ID _14069_210_5632_1_1530512692033_4803390_95-1896-0_sp1.1 from training. Duration: 0.8909375 2023-10-03 09:31:12,812 WARNING [train.py:1204] (2/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_29-293896-0_sp1.1 from training. Duration: 0.5545625 2023-10-03 09:31:15,593 WARNING [train.py:1204] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_167-137255-0_sp0.9 from training. Duration: 0.711125 2023-10-03 09:31:15,594 WARNING [train.py:1204] (2/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_217-95056-0 from training. Duration: 0.98 2023-10-03 09:31:15,622 WARNING [train.py:1204] (2/4) Exclude cut with ID _47278_210_12041_1_1533902418349_3576110_97-266746-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 09:31:17,014 WARNING [train.py:1204] (2/4) Exclude cut with ID _14224_210_4381_1_1530529062037_4264010_380-343670-0_sp1.1 from training. Duration: 0.8 2023-10-03 09:31:18,512 WARNING [train.py:1204] (2/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_107-332398-0 from training. Duration: 0.78 2023-10-03 09:31:21,139 INFO [train.py:1046] (2/4) Epoch 35, batch 2300, loss[loss=0.1616, simple_loss=0.2364, pruned_loss=0.04337, over 23720.00 frames. ], tot_loss[loss=0.1595, simple_loss=0.2384, pruned_loss=0.04031, over 4724555.83 frames. ], batch size: 179, lr: 2.91e-03, grad_scale: 8.0 2023-10-03 09:31:21,273 WARNING [train.py:1204] (2/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_91-267696-0_sp1.1 from training. Duration: 0.7909375 2023-10-03 09:31:22,589 WARNING [train.py:1204] (2/4) Exclude cut with ID _41298_210_9019_1_1533430371450_4822143_498-186993-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 09:31:27,445 WARNING [train.py:1204] (2/4) Exclude cut with ID _17956_210_4353_1_1531209568125_6979970_54-77610-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 09:31:27,487 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_40047_210_2290_1_1533290519_2374311_257-335162-0_sp1.1 from training. Duration: 0.863625 2023-10-03 09:31:30,595 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9653W0193-675471 from training. Duration: 0.9656875 2023-10-03 09:31:31,998 WARNING [train.py:1204] (2/4) Exclude cut with ID _25248_210_8750_1_1532062825941_5554180_243-240536-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 09:31:38,022 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_32360_210_4198_1_1532689160_3648436_60-51908-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 09:31:38,030 WARNING [train.py:1204] (2/4) Exclude cut with ID _4092_210_2151_1_1526643044449_3135759_227-213972-0_sp0.9 from training. Duration: 0.6 2023-10-03 09:31:38,074 WARNING [train.py:1204] (2/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_392-173500-0_sp1.1 from training. Duration: 0.97275 2023-10-03 09:31:39,375 WARNING [train.py:1204] (2/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_71-329150-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 09:31:39,378 WARNING [train.py:1204] (2/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_866-173440-0 from training. Duration: 0.9 2023-10-03 09:31:41,184 WARNING [train.py:1204] (2/4) Exclude cut with ID _14832_210_8750_1_1530691351539_3171348_1-54315-0_sp0.9 from training. Duration: 0.9666875 2023-10-03 09:31:42,656 WARNING [train.py:1204] (2/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_430-116139-0_sp1.1 from training. Duration: 0.77275 2023-10-03 09:31:43,893 WARNING [train.py:1204] (2/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_356-45054-0_sp0.9 from training. Duration: 0.9555625 2023-10-03 09:31:46,771 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_3531_210_3528_1_1526294203_5216456_309-227302-0_sp0.9 from training. Duration: 0.7666875 2023-10-03 09:31:48,263 WARNING [train.py:1204] (2/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_301-330455-0_sp1.1 from training. Duration: 0.72725 2023-10-03 09:31:52,258 WARNING [train.py:1204] (2/4) Exclude cut with ID _23557_210_8750_1_1531890106370_6846255_667-145945-0_sp1.1 from training. Duration: 0.92725 2023-10-03 09:31:57,210 WARNING [train.py:1204] (2/4) Exclude cut with ID _14860_210_4915_1_1530702020340_7343240_836-328489-0_sp1.1 from training. Duration: 0.8181875 2023-10-03 09:31:57,253 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_37152_210_9697_1_1533106573_3844188_416-333249-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 09:32:00,174 WARNING [train.py:1204] (2/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_538-32291-0_sp0.9 from training. Duration: 0.92225 2023-10-03 09:32:03,405 WARNING [train.py:1204] (2/4) Exclude cut with ID _32704_210_8954_1_1533017488840_6747370_965-176824-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 09:32:07,567 WARNING [train.py:1204] (2/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_86-19601-0_sp1.1 from training. Duration: 0.77275 2023-10-03 09:32:08,950 WARNING [train.py:1204] (2/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_436-318113-0_sp1.1 from training. Duration: 0.6818125 2023-10-03 09:32:08,979 WARNING [train.py:1204] (2/4) Exclude cut with ID _41817_210_18835_1_1533434477160_7423049_555-293448-0_sp0.9 from training. Duration: 0.888875 2023-10-03 09:32:09,001 WARNING [train.py:1204] (2/4) Exclude cut with ID _4697_210_2069_1_1526815801953_3823098_68-219807-0 from training. Duration: 0.47 2023-10-03 09:32:13,737 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7544_210_6753_1_1528591327_1471426_33-346031-0_sp1.1 from training. Duration: 0.5545625 2023-10-03 09:32:13,738 WARNING [train.py:1204] (2/4) Exclude cut with ID _49748_210_24116_1_1533986971737_3158910_263-52113-0_sp1.1 from training. Duration: 0.97275 2023-10-03 09:32:15,084 WARNING [train.py:1204] (2/4) Exclude cut with ID _64639_210_5816_1_1535367619437_3528000_340-24580-0_sp1.1 from training. Duration: 0.92725 2023-10-03 09:32:15,106 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_47228_210_4983_1_1533780021_3672027_479-114941-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 09:32:15,146 WARNING [train.py:1204] (2/4) Exclude cut with ID _44585_210_18628_1_1533605350387_4518199_506-119659-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 09:32:16,489 WARNING [train.py:1204] (2/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_314-126819-0_sp1.1 from training. Duration: 0.4545625 2023-10-03 09:32:16,492 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_2541_210_1794_1_1525590269_3084765_156-173713-0_sp0.9 from training. Duration: 0.9 2023-10-03 09:32:16,540 WARNING [train.py:1204] (2/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_108-255430-0 from training. Duration: 0.97 2023-10-03 09:32:16,548 WARNING [train.py:1204] (2/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_791-45149-0_sp1.1 from training. Duration: 0.7818125 2023-10-03 09:32:16,553 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_53378_210_17282_1_1534227054_1261425_10-118295-0_sp1.1 from training. Duration: 0.97275 2023-10-03 09:32:17,880 WARNING [train.py:1204] (2/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_148-158509-0 from training. Duration: 0.41 2023-10-03 09:32:24,757 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_34726_210_4525_1_1533019474_5030395_687-291305-0_sp1.1 from training. Duration: 0.936375 2023-10-03 09:32:26,400 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.conv_module2.balancer1.max_abs, batch_count=1219686.6666666667, ans=10.0 2023-10-03 09:32:29,292 WARNING [train.py:1204] (2/4) Exclude cut with ID _14860_210_4915_1_1530702020340_7343240_264-324537-0_sp1.1 from training. Duration: 0.963625 2023-10-03 09:32:34,019 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_41251_210_15368_1_1533429767_4820695_591-46386-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 09:32:34,042 WARNING [train.py:1204] (2/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_86-19601-0_sp0.9 from training. Duration: 0.9444375 2023-10-03 09:32:34,071 WARNING [train.py:1204] (2/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_401-255491-0_sp0.9 from training. Duration: 0.77775 2023-10-03 09:32:35,432 INFO [train.py:1046] (2/4) Epoch 35, batch 2350, loss[loss=0.1593, simple_loss=0.2241, pruned_loss=0.04729, over 23794.00 frames. ], tot_loss[loss=0.1595, simple_loss=0.2389, pruned_loss=0.04002, over 4735214.17 frames. ], batch size: 164, lr: 2.91e-03, grad_scale: 8.0 2023-10-03 09:32:35,595 WARNING [train.py:1204] (2/4) Exclude cut with ID _8696_210_1876_1_1528545632440_3345058_209-54069-0_sp1.1 from training. Duration: 0.6090625 2023-10-03 09:32:37,471 WARNING [train.py:1204] (2/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_369-325184-0_sp1.1 from training. Duration: 0.9 2023-10-03 09:32:37,543 WARNING [train.py:1204] (2/4) Exclude cut with ID _14687_210_5854_1_1530703838259_3664889_115-239046-0_sp0.9 from training. Duration: 0.7444375 2023-10-03 09:32:38,862 WARNING [train.py:1204] (2/4) Exclude cut with ID _8064_210_5399_1_1528372299291_4282973_168-50337-0 from training. Duration: 0.84 2023-10-03 09:32:39,501 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.3.encoder.layers.3.conv_module2.whiten, num_groups=1, num_channels=512, metric=4.40 vs. limit=15.0 2023-10-03 09:32:44,381 WARNING [train.py:1204] (2/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_909-64151-0_sp1.1 from training. Duration: 0.8545625 2023-10-03 09:32:44,391 WARNING [train.py:1204] (2/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_597-154640-0 from training. Duration: 0.9 2023-10-03 09:32:49,278 WARNING [train.py:1204] (2/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_126-274694-0 from training. Duration: 0.98 2023-10-03 09:32:50,632 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.499e+02 1.929e+02 2.127e+02 2.368e+02 3.367e+02, threshold=4.254e+02, percent-clipped=0.0 2023-10-03 09:32:52,111 WARNING [train.py:1204] (2/4) Exclude cut with ID _21783_210_11357_1_1531742458730_3762679_344-269786-0_sp1.1 from training. Duration: 0.97275 2023-10-03 09:32:54,849 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_37818_210_4238_1_1533620910_4720083_322-253154-0_sp1.1 from training. Duration: 0.92725 2023-10-03 09:32:54,852 WARNING [train.py:1204] (2/4) Exclude cut with ID _58895_210_8750_1_1535184081320_3649089_221-15823-0_sp1.1 from training. Duration: 0.92725 2023-10-03 09:32:56,102 WARNING [train.py:1204] (2/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_9-21737-0_sp1.1 from training. Duration: 0.8818125 2023-10-03 09:32:56,144 WARNING [train.py:1204] (2/4) Exclude cut with ID _14069_210_5632_1_1530512692033_4803390_522-341588-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 09:32:57,561 WARNING [train.py:1204] (2/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_363-185703-0 from training. Duration: 0.95 2023-10-03 09:33:00,936 WARNING [train.py:1204] (2/4) Exclude cut with ID _41817_210_18835_1_1533434477160_7423049_265-76216-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 09:33:05,169 WARNING [train.py:1204] (2/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_841-341679-0 from training. Duration: 0.58 2023-10-03 09:33:07,776 WARNING [train.py:1204] (2/4) Exclude cut with ID _55429_210_10626_1_1534467588527_7174143_97-335084-0_sp1.1 from training. Duration: 0.8818125 2023-10-03 09:33:10,957 WARNING [train.py:1204] (2/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_690-126306-0_sp0.9 from training. Duration: 0.8666875 2023-10-03 09:33:10,966 WARNING [train.py:1204] (2/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_102-92751-0_sp1.1 from training. Duration: 0.9 2023-10-03 09:33:12,412 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_3787_210_2040_1_1526477544_5478586_603-120324-0_sp1.1 from training. Duration: 0.736375 2023-10-03 09:33:15,215 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_33348_210_5632_1_1533086912_3324030_85-28583-0 from training. Duration: 0.82 2023-10-03 09:33:15,272 WARNING [train.py:1204] (2/4) Exclude cut with ID _60057_210_10640_1_1534903065793_3333150_362-230377-0_sp1.1 from training. Duration: 0.7909375 2023-10-03 09:33:17,256 WARNING [train.py:1204] (2/4) Exclude cut with ID _60113_210_16271_1_1535164212481_4386079_194-146486-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 09:33:17,276 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_53753_210_7234_1_1534320003_3594148_171-277668-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 09:33:18,534 WARNING [train.py:1204] (2/4) Exclude cut with ID _34423_210_8750_1_1533430798456_3330199_251-332788-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 09:33:21,319 WARNING [train.py:1204] (2/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_663-166973-0_sp0.9 from training. Duration: 0.888875 2023-10-03 09:33:22,608 WARNING [train.py:1204] (2/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_927-43827-0 from training. Duration: 0.74 2023-10-03 09:33:22,779 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.attention_skip_rate, batch_count=1219953.3333333333, ans=0.0 2023-10-03 09:33:23,982 WARNING [train.py:1204] (2/4) Exclude cut with ID _28323_210_16187_1_1532480395667_4100300_306-74561-0_sp1.1 from training. Duration: 0.8545625 2023-10-03 09:33:25,444 WARNING [train.py:1204] (2/4) Exclude cut with ID _15037_210_2253_1_1530752294104_3267650_139-337989-0_sp1.1 from training. Duration: 0.92725 2023-10-03 09:33:25,466 WARNING [train.py:1204] (2/4) Exclude cut with ID _49039_210_6315_1_1533967537541_3846118_460-263670-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 09:33:26,846 WARNING [train.py:1204] (2/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_186-309161-0 from training. Duration: 0.54 2023-10-03 09:33:28,225 WARNING [train.py:1204] (2/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_87-278686-0_sp1.1 from training. Duration: 0.7 2023-10-03 09:33:31,332 WARNING [train.py:1204] (2/4) Exclude cut with ID _27052_210_5986_1_1532329197187_3585009_314-106499-0 from training. Duration: 0.92 2023-10-03 09:33:31,967 WARNING [train.py:1204] (2/4) Exclude cut with ID _3465_210_3525_1_1526175667060_3574049_412-287526-0_sp0.9 from training. Duration: 0.8 2023-10-03 09:33:37,403 WARNING [train.py:1204] (2/4) Exclude cut with ID _22961_210_8254_1_1531976366672_4278180_317-199546-0 from training. Duration: 0.86 2023-10-03 09:33:40,338 WARNING [train.py:1204] (2/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_381-317791-0 from training. Duration: 0.73 2023-10-03 09:33:42,166 WARNING [train.py:1204] (2/4) Exclude cut with ID _44010_210_4525_1_1533884262964_4013620_560-310411-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 09:33:42,173 WARNING [train.py:1204] (2/4) Exclude cut with ID _5294_210_1747_1_1527249643928_3696179_342-270278-0_sp0.9 from training. Duration: 0.611125 2023-10-03 09:33:42,194 WARNING [train.py:1204] (2/4) Exclude cut with ID ID0473W0002-918951 from training. Duration: 0.924 2023-10-03 09:33:42,223 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1083-220122-0 from training. Duration: 0.51 2023-10-03 09:33:43,695 WARNING [train.py:1204] (2/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_138-154812-0 from training. Duration: 0.78 2023-10-03 09:33:43,844 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.bypass_mid.scale_min, batch_count=1220020.0, ans=0.2 2023-10-03 09:33:46,873 WARNING [train.py:1204] (2/4) Exclude cut with ID _44982_210_1511_1_1534312820108_3726029_331-19420-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 09:33:49,755 INFO [train.py:1046] (2/4) Epoch 35, batch 2400, loss[loss=0.1477, simple_loss=0.2384, pruned_loss=0.0285, over 24655.00 frames. ], tot_loss[loss=0.1592, simple_loss=0.2385, pruned_loss=0.03993, over 4738500.78 frames. ], batch size: 68, lr: 2.91e-03, grad_scale: 16.0 2023-10-03 09:33:49,888 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_2655_210_2087_1_1525688959_3897400_202-147557-0_sp0.9 from training. Duration: 0.9444375 2023-10-03 09:33:50,043 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.0.conv_module2.balancer1.max_abs, batch_count=1220086.6666666667, ans=10.0 2023-10-03 09:33:55,130 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_23850_210_4238_1_1532232597_1632433_32-185135-0_sp0.9 from training. Duration: 0.9555625 2023-10-03 09:33:56,556 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_14056_210_4163_1_1530604483_7572827_46-93547-0_sp1.1 from training. Duration: 0.863625 2023-10-03 09:33:56,604 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7931_210_1794_1_1528594728_5564059_280-279560-0 from training. Duration: 0.98 2023-10-03 09:33:56,641 WARNING [train.py:1204] (2/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_937-208281-0 from training. Duration: 0.85 2023-10-03 09:34:03,893 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_14928_210_5632_1_1530773461_4714000_454-112198-0_sp0.9 from training. Duration: 0.7333125 2023-10-03 09:34:03,895 WARNING [train.py:1204] (2/4) Exclude cut with ID _32704_210_8954_1_1533017488840_6747370_944-153333-0_sp1.1 from training. Duration: 0.963625 2023-10-03 09:34:05,950 WARNING [train.py:1204] (2/4) Exclude cut with ID _14821_210_8750_1_1530680503991_6829359_2-227151-0 from training. Duration: 0.83 2023-10-03 09:34:05,971 WARNING [train.py:1204] (2/4) Exclude cut with ID _28461_210_2253_1_1532771775189_3041299_96-152158-0_sp1.1 from training. Duration: 0.77275 2023-10-03 09:34:07,310 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7837_210_1792_1_1528453445_5841305_654-54195-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 09:34:07,342 WARNING [train.py:1204] (2/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_344-152526-0 from training. Duration: 0.53 2023-10-03 09:34:13,345 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_30167_210_3528_1_1532428012_4772375_480-142230-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 09:34:16,195 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_160-218721-0 from training. Duration: 0.96 2023-10-03 09:34:20,920 WARNING [train.py:1204] (2/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_65-88242-0_sp0.9 from training. Duration: 0.62225 2023-10-03 09:34:25,187 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_34886_210_10633_1_1533203882_1755661_76-156096-0 from training. Duration: 0.87 2023-10-03 09:34:26,643 WARNING [train.py:1204] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_188-38388-0_sp1.1 from training. Duration: 0.92725 2023-10-03 09:34:28,018 WARNING [train.py:1204] (2/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_81-289335-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 09:34:28,155 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.1.balancer1.prob, batch_count=1220220.0, ans=0.125 2023-10-03 09:34:32,712 WARNING [train.py:1204] (2/4) Exclude cut with ID _59493_210_17282_1_1535078419077_3225879_110-8778-0_sp1.1 from training. Duration: 0.936375 2023-10-03 09:34:32,745 WARNING [train.py:1204] (2/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_188-260011-0 from training. Duration: 0.73 2023-10-03 09:34:34,004 WARNING [train.py:1204] (2/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_453-133983-0_sp0.9 from training. Duration: 0.8666875 2023-10-03 09:34:40,054 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.nonlin_attention.balancer.prob, batch_count=1220286.6666666667, ans=0.125 2023-10-03 09:34:41,228 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8054_210_7927_1_1528627721_4984094_553-17379-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 09:34:42,691 WARNING [train.py:1204] (2/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_341-171072-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 09:34:44,141 WARNING [train.py:1204] (2/4) Exclude cut with ID _30800_210_22382_1_1532499347904_4515959_265-257286-0_sp1.1 from training. Duration: 0.963625 2023-10-03 09:34:45,586 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_403-39619-0_sp0.9 from training. Duration: 0.9333125 2023-10-03 09:34:45,593 WARNING [train.py:1204] (2/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_464-33931-0_sp0.9 from training. Duration: 0.6 2023-10-03 09:34:46,879 WARNING [train.py:1204] (2/4) Exclude cut with ID _60804_210_9642_1_1534831199859_7268299_632-143863-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 09:34:46,886 WARNING [train.py:1204] (2/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_431-310997-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 09:34:46,926 WARNING [train.py:1204] (2/4) Exclude cut with ID _31649_210_6228_1_1532584608063_4091078_114-263860-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 09:34:46,939 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6961_210_2087_1_1527937168_6815011_562-165518-0_sp0.9 from training. Duration: 0.8555625 2023-10-03 09:34:51,598 WARNING [train.py:1204] (2/4) Exclude cut with ID _27052_210_5986_1_1532329197187_3585009_338-325870-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 09:34:53,034 WARNING [train.py:1204] (2/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_469-94673-0_sp1.1 from training. Duration: 0.7181875 2023-10-03 09:34:53,054 WARNING [train.py:1204] (2/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_259-34038-0 from training. Duration: 0.74 2023-10-03 09:34:53,142 WARNING [train.py:1204] (2/4) Exclude cut with ID _14922_210_6228_1_1530707435273_3693310_388-129552-0 from training. Duration: 0.96 2023-10-03 09:34:55,879 WARNING [train.py:1204] (2/4) Exclude cut with ID _28394_210_8913_1_1532430049518_3650118_342-251455-0_sp1.1 from training. Duration: 0.936375 2023-10-03 09:34:55,897 WARNING [train.py:1204] (2/4) Exclude cut with ID _53731_210_7994_1_1534553979244_3757214_8-191390-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 09:34:55,936 WARNING [train.py:1204] (2/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_613-232856-0 from training. Duration: 0.54 2023-10-03 09:34:57,288 WARNING [train.py:1204] (2/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_557-349534-0 from training. Duration: 0.91 2023-10-03 09:34:57,305 WARNING [train.py:1204] (2/4) Exclude cut with ID _14224_210_4381_1_1530529062037_4264010_379-184223-0 from training. Duration: 0.85 2023-10-03 09:34:57,308 WARNING [train.py:1204] (2/4) Exclude cut with ID IC0125W0001-61807 from training. Duration: 0.966 2023-10-03 09:34:57,392 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_229-45300-0 from training. Duration: 0.57 2023-10-03 09:34:58,858 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=1220353.3333333333, ans=0.1 2023-10-03 09:35:00,039 WARNING [train.py:1204] (2/4) Exclude cut with ID _30304_210_6319_1_1532476827261_7582060_1069-24743-0_sp1.1 from training. Duration: 0.92725 2023-10-03 09:35:01,404 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_41452_210_7291_1_1533437687_3941281_323-172873-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 09:35:01,414 WARNING [train.py:1204] (2/4) Exclude cut with ID _37086_210_9038_1_1532999540891_7021250_802-226709-0_sp1.1 from training. Duration: 0.97275 2023-10-03 09:35:03,234 INFO [train.py:1046] (2/4) Epoch 35, batch 2450, loss[loss=0.1714, simple_loss=0.2469, pruned_loss=0.048, over 23622.00 frames. ], tot_loss[loss=0.1592, simple_loss=0.2383, pruned_loss=0.04004, over 4729153.69 frames. ], batch size: 149, lr: 2.91e-03, grad_scale: 8.0 2023-10-03 09:35:03,274 WARNING [train.py:1204] (2/4) Exclude cut with ID IC0144W0500-71767 from training. Duration: 0.931875 2023-10-03 09:35:03,850 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.4.encoder.layers.2.self_attn1.whiten, num_groups=1, num_channels=384, metric=12.31 vs. limit=22.5 2023-10-03 09:35:04,715 WARNING [train.py:1204] (2/4) Exclude cut with ID _39846_210_12651_1_1533605310265_2115212_233-129699-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 09:35:04,778 WARNING [train.py:1204] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_574-45541-0_sp0.9 from training. Duration: 0.72225 2023-10-03 09:35:08,303 WARNING [train.py:1204] (2/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_372-298093-0_sp1.1 from training. Duration: 0.72725 2023-10-03 09:35:08,414 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.0.conv_module1.balancer1.min_positive, batch_count=1220420.0, ans=0.025 2023-10-03 09:35:09,571 WARNING [train.py:1204] (2/4) Exclude cut with ID _62574_210_2655_1_1535180407058_3933249_34-26343-0_sp1.1 from training. Duration: 0.97275 2023-10-03 09:35:11,163 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_512-256238-0_sp1.1 from training. Duration: 0.963625 2023-10-03 09:35:11,168 WARNING [train.py:1204] (2/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_196-55502-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 09:35:11,331 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.1.nonlin_attention.balancer.prob, batch_count=1220420.0, ans=0.125 2023-10-03 09:35:12,526 WARNING [train.py:1204] (2/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_195-167011-0 from training. Duration: 0.75 2023-10-03 09:35:15,912 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.3.encoder.layers.2.conv_module2.whiten, num_groups=1, num_channels=512, metric=4.76 vs. limit=15.0 2023-10-03 09:35:16,603 WARNING [train.py:1204] (2/4) Exclude cut with ID _13678_210_7497_1_1530530069226_3463170_345-230416-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 09:35:16,620 WARNING [train.py:1204] (2/4) Exclude cut with ID _51078_210_9070_1_1534161557424_3805250_225-327809-0_sp1.1 from training. Duration: 0.963625 2023-10-03 09:35:19,798 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.587e+02 1.894e+02 2.090e+02 2.378e+02 3.280e+02, threshold=4.179e+02, percent-clipped=0.0 2023-10-03 09:35:21,248 WARNING [train.py:1204] (2/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_18-189198-0_sp1.1 from training. Duration: 0.8181875 2023-10-03 09:35:21,266 WARNING [train.py:1204] (2/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_329-82235-0_sp1.1 from training. Duration: 0.7454375 2023-10-03 09:35:21,274 WARNING [train.py:1204] (2/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_726-270780-0_sp0.9 from training. Duration: 0.9555625 2023-10-03 09:35:21,314 WARNING [train.py:1204] (2/4) Exclude cut with ID _66873_210_5517_1_1535454136506_4293370_654-52713-0 from training. Duration: 0.77 2023-10-03 09:35:21,504 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.bypass_mid.scale_min, batch_count=1220486.6666666667, ans=0.2 2023-10-03 09:35:26,553 WARNING [train.py:1204] (2/4) Exclude cut with ID _20455_210_5219_1_1531565996775_8098100_191-16006-0_sp1.1 from training. Duration: 0.963625 2023-10-03 09:35:27,937 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_3171_210_2087_1_1526109084_2330007_66-260583-0_sp1.1 from training. Duration: 0.6545625 2023-10-03 09:35:29,264 WARNING [train.py:1204] (2/4) Exclude cut with ID _38670_210_15379_1_1533448810659_4391059_63-129006-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 09:35:32,683 WARNING [train.py:1204] (2/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_393-57888-0_sp0.9 from training. Duration: 0.711125 2023-10-03 09:35:34,592 WARNING [train.py:1204] (2/4) Exclude cut with ID _6275_210_2084_1_1528110062213_7669049_857-298942-0_sp0.9 from training. Duration: 0.9444375 2023-10-03 09:35:34,694 WARNING [train.py:1204] (2/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_239-22076-0_sp0.9 from training. Duration: 0.9444375 2023-10-03 09:35:35,997 WARNING [train.py:1204] (2/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_424-227947-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 09:35:37,913 WARNING [train.py:1204] (2/4) Exclude cut with ID _14069_210_5632_1_1530512692033_4803390_95-1896-0 from training. Duration: 0.98 2023-10-03 09:35:39,181 WARNING [train.py:1204] (2/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_96-244299-0_sp0.9 from training. Duration: 0.97775 2023-10-03 09:35:42,233 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.bypass.skip_rate, batch_count=1220553.3333333333, ans=0.07 2023-10-03 09:35:46,178 WARNING [train.py:1204] (2/4) Exclude cut with ID _46683_210_9642_1_1533730913492_4591116_562-319825-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 09:35:47,531 WARNING [train.py:1204] (2/4) Exclude cut with ID _64639_210_5816_1_1535367619437_3528000_284-173474-0_sp1.1 from training. Duration: 0.963625 2023-10-03 09:35:47,561 WARNING [train.py:1204] (2/4) Exclude cut with ID _29529_210_7234_1_1532393961455_3977460_497-100133-0_sp1.1 from training. Duration: 0.936375 2023-10-03 09:35:47,596 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_12868_210_6753_1_1530701932_530275_18-76322-0_sp0.9 from training. Duration: 0.9666875 2023-10-03 09:35:47,634 WARNING [train.py:1204] (2/4) Exclude cut with ID _50093_210_4220_1_1534417141207_3432299_116-303447-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 09:35:49,666 WARNING [train.py:1204] (2/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_44-79427-0_sp1.1 from training. Duration: 0.8909375 2023-10-03 09:35:50,977 WARNING [train.py:1204] (2/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_887-249555-0 from training. Duration: 0.77 2023-10-03 09:35:53,839 WARNING [train.py:1204] (2/4) Exclude cut with ID _28461_210_2253_1_1532771775189_3041299_96-152158-0_sp0.9 from training. Duration: 0.9444375 2023-10-03 09:35:55,146 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_14058_210_4163_1_1530690953_6327180_49-210687-0_sp1.1 from training. Duration: 0.8545625 2023-10-03 09:35:56,614 WARNING [train.py:1204] (2/4) Exclude cut with ID _40399_210_4353_1_1533305658194_3279406_351-127903-0_sp1.1 from training. Duration: 0.97275 2023-10-03 09:35:56,619 WARNING [train.py:1204] (2/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_184-206939-0_sp1.1 from training. Duration: 0.936375 2023-10-03 09:36:02,104 WARNING [train.py:1204] (2/4) Exclude cut with ID _14100_210_8341_1_1530529320613_3678069_258-224599-0_sp1.1 from training. Duration: 0.7 2023-10-03 09:36:02,120 WARNING [train.py:1204] (2/4) Exclude cut with ID _6493_210_1747_1_1528459210424_4012470_324-323854-0 from training. Duration: 0.92 2023-10-03 09:36:02,840 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_12868_210_6753_1_1530702479_3610564_151-325626-0_sp1.1 from training. Duration: 0.8818125 2023-10-03 09:36:04,021 WARNING [train.py:1204] (2/4) Exclude cut with ID _14528_210_8254_1_1530669700514_4306939_106-43145-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 09:36:04,047 WARNING [train.py:1204] (2/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_686-54383-0 from training. Duration: 0.87 2023-10-03 09:36:04,865 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.2.encoder.layers.1.nonlin_attention.whiten1, num_groups=1, num_channels=288, metric=5.32 vs. limit=10.0 2023-10-03 09:36:05,416 WARNING [train.py:1204] (2/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_801-102362-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 09:36:06,793 WARNING [train.py:1204] (2/4) Exclude cut with ID _48677_210_5663_1_1533902405919_3892989_408-40740-0_sp0.9 from training. Duration: 0.92225 2023-10-03 09:36:11,488 WARNING [train.py:1204] (2/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_448-212135-0_sp1.1 from training. Duration: 0.82725 2023-10-03 09:36:12,920 WARNING [train.py:1204] (2/4) Exclude cut with ID _59170_210_6721_1_1534983633960_4203335_320-124047-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 09:36:14,211 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_4692_210_3528_1_1526899755_4323254_107-44401-0_sp1.1 from training. Duration: 0.7818125 2023-10-03 09:36:16,948 INFO [train.py:1046] (2/4) Epoch 35, batch 2500, loss[loss=0.1561, simple_loss=0.2391, pruned_loss=0.03655, over 24299.00 frames. ], tot_loss[loss=0.1586, simple_loss=0.2374, pruned_loss=0.03984, over 4718489.05 frames. ], batch size: 61, lr: 2.91e-03, grad_scale: 8.0 2023-10-03 09:36:17,098 WARNING [train.py:1204] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_731-2428-0 from training. Duration: 0.83 2023-10-03 09:36:18,507 WARNING [train.py:1204] (2/4) Exclude cut with ID _29857_210_20034_1_1532395886753_3798160_335-163562-0_sp1.1 from training. Duration: 0.77275 2023-10-03 09:36:24,556 WARNING [train.py:1204] (2/4) Exclude cut with ID _14832_210_8750_1_1530691351539_3171348_171-275004-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 09:36:29,022 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.feed_forward3.hidden_balancer.prob, batch_count=1220753.3333333333, ans=0.125 2023-10-03 09:36:33,111 WARNING [train.py:1204] (2/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_105-328174-0_sp0.9 from training. Duration: 0.8444375 2023-10-03 09:36:33,142 WARNING [train.py:1204] (2/4) Exclude cut with ID _68965_210_1883_1_1535698962272_3497490_164-118421-0_sp1.1 from training. Duration: 0.936375 2023-10-03 09:36:35,526 WARNING [train.py:1204] (2/4) Exclude cut with ID _19896_210_1883_1_1531709022662_6649569_380-24170-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 09:36:35,538 WARNING [train.py:1204] (2/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_505-205270-0_sp1.1 from training. Duration: 0.3454375 2023-10-03 09:36:36,927 INFO [scaling.py:1118] (2/4) WithLoss: name=encoder.encoders.1.encoder.layers.1.self_attn_weights, loss-sum=0.000e+00 2023-10-03 09:36:42,574 WARNING [train.py:1204] (2/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_427-208901-0_sp1.1 from training. Duration: 0.6181875 2023-10-03 09:36:42,615 WARNING [train.py:1204] (2/4) Exclude cut with ID _23818_210_6228_1_1531917063884_3676920_85-326916-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 09:36:44,023 WARNING [train.py:1204] (2/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_101-293367-0_sp1.1 from training. Duration: 0.57275 2023-10-03 09:36:44,032 WARNING [train.py:1204] (2/4) Exclude cut with ID _5409_210_1388_1_1527292850189_5865191_178-127970-0_sp1.1 from training. Duration: 0.5181875 2023-10-03 09:36:44,083 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_403-39619-0 from training. Duration: 0.84 2023-10-03 09:36:45,450 WARNING [train.py:1204] (2/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_427-87841-0_sp1.1 from training. Duration: 0.963625 2023-10-03 09:36:46,796 WARNING [train.py:1204] (2/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_286-68181-0_sp1.1 from training. Duration: 0.97275 2023-10-03 09:36:46,834 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6673_210_3273_1_1527767790_3173240_327-306185-0 from training. Duration: 0.83 2023-10-03 09:36:46,848 WARNING [train.py:1204] (2/4) Exclude cut with ID _3675_210_4557_1_1526639934408_4182089_106-203713-0_sp1.1 from training. Duration: 0.963625 2023-10-03 09:36:48,168 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_53071_210_16554_1_1534493434_1276796_130-36076-0 from training. Duration: 0.95 2023-10-03 09:36:49,404 WARNING [train.py:1204] (2/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_586-93054-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 09:36:51,736 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.feed_forward1.hidden_balancer.prob, batch_count=1220886.6666666667, ans=0.125 2023-10-03 09:36:52,842 WARNING [train.py:1204] (2/4) Exclude cut with ID _23825_210_13449_1_1531915313526_3497150_262-10571-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 09:36:52,913 WARNING [train.py:1204] (2/4) Exclude cut with ID _28783_210_7943_1_1532671164808_7155479_432-67885-0_sp1.1 from training. Duration: 0.97275 2023-10-03 09:36:55,711 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6287_210_3273_1_1527509506_2653716_182-199887-0_sp0.9 from training. Duration: 0.5666875 2023-10-03 09:36:55,768 WARNING [train.py:1204] (2/4) Exclude cut with ID _3231_210_2106_1_1526205864934_6784220_171-256004-0 from training. Duration: 0.86 2023-10-03 09:36:55,815 WARNING [train.py:1204] (2/4) Exclude cut with ID _69051_210_1531_1_1535885566901_3829538_332-92090-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 09:36:58,579 WARNING [train.py:1204] (2/4) Exclude cut with ID _3675_210_4557_1_1526639934408_4182089_104-324125-0_sp1.1 from training. Duration: 0.963625 2023-10-03 09:37:01,399 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_41764_210_2521_1_1533686110_4089313_501-2119-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 09:37:06,040 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_14056_210_4163_1_1530604483_7572827_56-100988-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 09:37:08,807 WARNING [train.py:1204] (2/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_398-239819-0_sp1.1 from training. Duration: 0.9 2023-10-03 09:37:10,463 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.conv_module1.balancer2.prob, batch_count=1220953.3333333333, ans=0.125 2023-10-03 09:37:13,686 WARNING [train.py:1204] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_74-48717-0_sp1.1 from training. Duration: 0.52725 2023-10-03 09:37:16,293 WARNING [train.py:1204] (2/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_177-49092-0 from training. Duration: 0.91 2023-10-03 09:37:16,309 WARNING [train.py:1204] (2/4) Exclude cut with ID _62337_210_12392_1_1535009273105_3412229_44-176353-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 09:37:16,326 WARNING [train.py:1204] (2/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_110-130961-0_sp1.1 from training. Duration: 0.42725 2023-10-03 09:37:20,851 WARNING [train.py:1204] (2/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_37-189262-0_sp1.1 from training. Duration: 0.8909375 2023-10-03 09:37:20,852 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_3172_210_2087_1_1526292929_3987865_197-97762-0_sp0.9 from training. Duration: 0.8333125 2023-10-03 09:37:20,963 WARNING [train.py:1204] (2/4) Exclude cut with ID IC0677W0065-334897 from training. Duration: 0.896 2023-10-03 09:37:20,963 WARNING [train.py:1204] (2/4) Exclude cut with ID IC0677W0080-334912 from training. Duration: 0.768 2023-10-03 09:37:20,968 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_120-41668-0 from training. Duration: 0.7 2023-10-03 09:37:22,645 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.feed_forward1.out_proj.dropout_p, batch_count=1221020.0, ans=0.1 2023-10-03 09:37:25,186 WARNING [train.py:1204] (2/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_460-205287-0_sp1.1 from training. Duration: 0.963625 2023-10-03 09:37:26,688 WARNING [train.py:1204] (2/4) Exclude cut with ID _14632_210_9988_1_1530691279392_3923200_102-204477-0 from training. Duration: 0.97 2023-10-03 09:37:27,882 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_34815_210_6388_1_1533381320_3571903_279-305565-0 from training. Duration: 0.92 2023-10-03 09:37:27,947 WARNING [train.py:1204] (2/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_91-148831-0_sp1.1 from training. Duration: 0.9 2023-10-03 09:37:29,342 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6291_210_3273_1_1527683649_1078498_82-221196-0 from training. Duration: 0.76 2023-10-03 09:37:30,696 INFO [train.py:1046] (2/4) Epoch 35, batch 2550, loss[loss=0.1584, simple_loss=0.246, pruned_loss=0.03547, over 24429.00 frames. ], tot_loss[loss=0.1584, simple_loss=0.2377, pruned_loss=0.03957, over 4719901.52 frames. ], batch size: 77, lr: 2.91e-03, grad_scale: 8.0 2023-10-03 09:37:32,137 WARNING [train.py:1204] (2/4) Exclude cut with ID _38020_210_5986_1_1533112252259_7006089_493-102745-0 from training. Duration: 0.98 2023-10-03 09:37:35,038 WARNING [train.py:1204] (2/4) Exclude cut with ID _61133_210_9969_1_1535196777227_3696409_40-93442-0_sp1.1 from training. Duration: 0.92725 2023-10-03 09:37:36,415 WARNING [train.py:1204] (2/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_45-258852-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 09:37:36,449 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_53071_210_16554_1_1534493434_1276796_130-36076-0_sp1.1 from training. Duration: 0.863625 2023-10-03 09:37:38,468 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8694_210_6400_1_1529065754_3260756_332-13553-0_sp1.1 from training. Duration: 0.92725 2023-10-03 09:37:39,870 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_405-339986-0 from training. Duration: 0.51 2023-10-03 09:37:39,922 WARNING [train.py:1204] (2/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_927-43827-0_sp1.1 from training. Duration: 0.67275 2023-10-03 09:37:42,784 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_397-147196-0 from training. Duration: 0.8 2023-10-03 09:37:44,728 WARNING [train.py:1204] (2/4) Exclude cut with ID 3557-8342-0013-54691-0_sp1.1 from training. Duration: 0.836375 2023-10-03 09:37:46,245 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_47438_210_5399_1_1533797471_4513356_626-127722-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 09:37:47,471 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.474e+02 1.892e+02 2.118e+02 2.497e+02 3.276e+02, threshold=4.237e+02, percent-clipped=0.0 2023-10-03 09:37:49,995 WARNING [train.py:1204] (2/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_256-323883-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 09:37:50,000 WARNING [train.py:1204] (2/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_839-173017-0_sp0.9 from training. Duration: 0.4555625 2023-10-03 09:37:50,049 WARNING [train.py:1204] (2/4) Exclude cut with ID _5289_210_2084_1_1527678202736_3779299_392-261988-0_sp1.1 from training. Duration: 0.5909375 2023-10-03 09:37:50,083 WARNING [train.py:1204] (2/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_119-224384-0_sp1.1 from training. Duration: 0.936375 2023-10-03 09:37:51,443 WARNING [train.py:1204] (2/4) Exclude cut with ID _57523_210_1856_1_1534766294545_3732989_262-267933-0_sp1.1 from training. Duration: 0.963625 2023-10-03 09:37:53,473 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_35361_210_4238_1_1533262810_7793476_675-87012-0_sp0.9 from training. Duration: 0.988875 2023-10-03 09:37:53,492 WARNING [train.py:1204] (2/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_17-90541-0 from training. Duration: 0.94 2023-10-03 09:37:53,524 WARNING [train.py:1204] (2/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_535-14410-0_sp1.1 from training. Duration: 0.42725 2023-10-03 09:37:53,535 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_24673_210_6400_1_1531968633_778659_94-154511-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 09:37:53,539 WARNING [train.py:1204] (2/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_212-10501-0 from training. Duration: 0.94 2023-10-03 09:38:05,101 WARNING [train.py:1204] (2/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_383-285196-0_sp1.1 from training. Duration: 0.8818125 2023-10-03 09:38:05,387 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.conv_module2.balancer1.prob, batch_count=1221220.0, ans=0.125 2023-10-03 09:38:07,180 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.3.encoder.layers.2.self_attn1.whiten, num_groups=1, num_channels=512, metric=14.82 vs. limit=22.5 2023-10-03 09:38:08,610 WARNING [train.py:1204] (2/4) Exclude cut with ID _45560_210_21982_1_1533812603951_3584999_238-47020-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 09:38:09,862 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_21500_210_2054_1_1532149310_2716758_278-18053-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 09:38:09,878 WARNING [train.py:1204] (2/4) Exclude cut with ID _56235_210_10640_1_1534557335235_4161099_453-9626-0_sp1.1 from training. Duration: 0.936375 2023-10-03 09:38:11,343 WARNING [train.py:1204] (2/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_153-97441-0_sp1.1 from training. Duration: 0.5090625 2023-10-03 09:38:16,000 WARNING [train.py:1204] (2/4) Exclude cut with ID _67725_210_27767_1_1535608756671_5105359_431-120895-0_sp1.1 from training. Duration: 0.963625 2023-10-03 09:38:16,165 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.bypass_mid.scale_min, batch_count=1221286.6666666667, ans=0.2 2023-10-03 09:38:18,712 WARNING [train.py:1204] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_574-45541-0_sp1.1 from training. Duration: 0.5909375 2023-10-03 09:38:18,728 WARNING [train.py:1204] (2/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_162-103620-0_sp1.1 from training. Duration: 0.8454375 2023-10-03 09:38:18,739 WARNING [train.py:1204] (2/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_734-129761-0_sp1.1 from training. Duration: 0.7454375 2023-10-03 09:38:18,785 WARNING [train.py:1204] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_620-276793-0_sp1.1 from training. Duration: 0.536375 2023-10-03 09:38:20,668 WARNING [train.py:1204] (2/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_153-97441-0_sp0.9 from training. Duration: 0.62225 2023-10-03 09:38:23,475 WARNING [train.py:1204] (2/4) Exclude cut with ID _12733_210_8341_1_1530167424556_3646380_523-85621-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 09:38:23,487 WARNING [train.py:1204] (2/4) Exclude cut with ID _64048_210_10626_1_1535198382802_3835179_352-179834-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 09:38:30,531 WARNING [train.py:1204] (2/4) Exclude cut with ID _55162_210_17071_1_1534408248583_3753480_43-242364-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 09:38:30,547 WARNING [train.py:1204] (2/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_661-264857-0 from training. Duration: 0.93 2023-10-03 09:38:30,554 WARNING [train.py:1204] (2/4) Exclude cut with ID _6274_210_2084_1_1527937583837_7494020_325-237800-0_sp1.1 from training. Duration: 0.97275 2023-10-03 09:38:30,599 WARNING [train.py:1204] (2/4) Exclude cut with ID _55328_210_27767_1_1534580973260_4088170_385-171459-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 09:38:31,910 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_3780_210_2087_1_1526467389_2145572_176-107140-0_sp1.1 from training. Duration: 0.62725 2023-10-03 09:38:32,606 WARNING [train.py:1204] (2/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_375-244976-0_sp1.1 from training. Duration: 0.6818125 2023-10-03 09:38:33,973 WARNING [train.py:1204] (2/4) Exclude cut with ID _35941_210_15379_1_1532995294515_4023089_309-33162-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 09:38:41,151 WARNING [train.py:1204] (2/4) Exclude cut with ID _14459_210_2228_1_1531443641946_7516700_1185-22038-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 09:38:41,867 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.2.encoder.layers.2.nonlin_attention.whiten2, num_groups=1, num_channels=384, metric=4.77 vs. limit=15.0 2023-10-03 09:38:42,553 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_1197-69460-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 09:38:45,744 INFO [train.py:1046] (2/4) Epoch 35, batch 2600, loss[loss=0.1631, simple_loss=0.2575, pruned_loss=0.03432, over 24657.00 frames. ], tot_loss[loss=0.1598, simple_loss=0.239, pruned_loss=0.04031, over 4717352.17 frames. ], batch size: 73, lr: 2.91e-03, grad_scale: 8.0 2023-10-03 09:38:45,883 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9012W0022-501157 from training. Duration: 0.896 2023-10-03 09:38:48,599 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9023W0009-506302 from training. Duration: 0.896 2023-10-03 09:38:48,621 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_2821_210_2715_1_1526108385_3763560_73-296673-0_sp1.1 from training. Duration: 0.7545625 2023-10-03 09:38:49,920 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9025W0113-507442 from training. Duration: 0.896 2023-10-03 09:38:49,988 WARNING [train.py:1204] (2/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_739-2098-0 from training. Duration: 0.92 2023-10-03 09:38:50,006 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9029W0332-509722 from training. Duration: 0.768 2023-10-03 09:38:53,305 WARNING [train.py:1204] (2/4) Exclude cut with ID _16908_210_5724_1_1531119157248_3772700_68-101953-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 09:38:53,321 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9042W0011-515617 from training. Duration: 0.768 2023-10-03 09:38:54,698 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_3019_210_2028_1_1526044245_3264865_191-72235-0 from training. Duration: 0.97 2023-10-03 09:38:56,040 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9052W0006-520792 from training. Duration: 0.896 2023-10-03 09:38:58,664 WARNING [train.py:1204] (2/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_245-174553-0_sp1.1 from training. Duration: 0.8 2023-10-03 09:39:00,017 WARNING [train.py:1204] (2/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_202-67017-0 from training. Duration: 0.82 2023-10-03 09:39:02,775 WARNING [train.py:1204] (2/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_304-59227-0 from training. Duration: 0.77 2023-10-03 09:39:04,143 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_3133_210_2087_1_1526106446_2324690_246-125000-0_sp0.9 from training. Duration: 0.688875 2023-10-03 09:39:06,084 WARNING [train.py:1204] (2/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_243-42116-0 from training. Duration: 0.73 2023-10-03 09:39:07,517 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9087W0121-538492 from training. Duration: 0.896 2023-10-03 09:39:07,533 WARNING [train.py:1204] (2/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_120-76843-0 from training. Duration: 0.55 2023-10-03 09:39:15,161 WARNING [train.py:1204] (2/4) Exclude cut with ID _7852_210_5219_1_1528286471316_8139400_850-304621-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 09:39:15,174 WARNING [train.py:1204] (2/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_154-285415-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 09:39:15,187 WARNING [train.py:1204] (2/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_538-278194-0_sp1.1 from training. Duration: 0.92725 2023-10-03 09:39:15,188 WARNING [train.py:1204] (2/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_119-207162-0 from training. Duration: 0.71 2023-10-03 09:39:17,054 WARNING [train.py:1204] (2/4) Exclude cut with ID _2334_210_2667_1_1525608238880_4705129_459-104997-0_sp1.1 from training. Duration: 0.863625 2023-10-03 09:39:23,823 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9147W0019-567847 from training. Duration: 0.768 2023-10-03 09:39:29,940 WARNING [train.py:1204] (2/4) Exclude cut with ID _69884_210_9771_1_1535846392818_4166169_228-189439-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 09:39:29,979 WARNING [train.py:1204] (2/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_196-77172-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 09:39:31,277 WARNING [train.py:1204] (2/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_625-338546-0 from training. Duration: 0.88 2023-10-03 09:39:31,330 WARNING [train.py:1204] (2/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_193-327630-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 09:39:31,331 WARNING [train.py:1204] (2/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_391-24006-0_sp1.1 from training. Duration: 0.92725 2023-10-03 09:39:32,641 WARNING [train.py:1204] (2/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_197-28164-0 from training. Duration: 0.8 2023-10-03 09:39:34,133 WARNING [train.py:1204] (2/4) Exclude cut with ID _8395_210_1531_1_1528597871523_4411579_431-132470-0_sp1.1 from training. Duration: 0.67275 2023-10-03 09:39:34,150 WARNING [train.py:1204] (2/4) Exclude cut with ID _35350_210_16040_1_1533040191183_4075299_465-248326-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 09:39:37,325 WARNING [train.py:1204] (2/4) Exclude cut with ID _16756_210_4915_1_1531047803763_7104259_101-141922-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 09:39:41,554 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9212W0005-600172 from training. Duration: 0.896 2023-10-03 09:39:41,578 WARNING [train.py:1204] (2/4) Exclude cut with ID _63186_210_2084_1_1535414479705_3908649_378-166820-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 09:39:41,598 WARNING [train.py:1204] (2/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_52-211683-0_sp1.1 from training. Duration: 0.8181875 2023-10-03 09:39:47,896 WARNING [train.py:1204] (2/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_1147-237729-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 09:39:47,971 WARNING [train.py:1204] (2/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_234-14174-0_sp0.9 from training. Duration: 0.911125 2023-10-03 09:39:47,992 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_454-276429-0 from training. Duration: 0.87 2023-10-03 09:39:49,699 WARNING [train.py:1204] (2/4) Exclude cut with ID _53713_210_24116_1_1534314487762_7416751_755-165313-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 09:39:51,220 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8278_210_2550_1_1528637099_4220413_436-317072-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 09:39:52,019 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.2.encoder.layers.1.self_attn1.whiten, num_groups=1, num_channels=384, metric=18.01 vs. limit=22.5 2023-10-03 09:39:52,586 WARNING [train.py:1204] (2/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_28-324729-0_sp1.1 from training. Duration: 0.8545625 2023-10-03 09:39:58,639 WARNING [train.py:1204] (2/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_326-165900-0 from training. Duration: 0.69 2023-10-03 09:39:58,713 WARNING [train.py:1204] (2/4) Exclude cut with ID _53489_210_6228_1_1534298357169_3835970_96-30246-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 09:40:00,022 INFO [train.py:1046] (2/4) Epoch 35, batch 2650, loss[loss=0.1755, simple_loss=0.2502, pruned_loss=0.0504, over 23264.00 frames. ], tot_loss[loss=0.1601, simple_loss=0.2396, pruned_loss=0.04033, over 4727276.55 frames. ], batch size: 105, lr: 2.91e-03, grad_scale: 8.0 2023-10-03 09:40:00,156 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8275_210_4198_1_1528455320_3892571_491-17707-0_sp1.1 from training. Duration: 0.5818125 2023-10-03 09:40:03,083 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.self_attn_weights.pos_emb_skip_rate, batch_count=1221753.3333333333, ans=0.0 2023-10-03 09:40:04,239 WARNING [train.py:1204] (2/4) Exclude cut with ID _14348_210_8341_1_1530599434996_3778640_240-232370-0 from training. Duration: 0.84 2023-10-03 09:40:04,251 WARNING [train.py:1204] (2/4) Exclude cut with ID _45012_210_12651_1_1533608435136_5371380_54-112623-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 09:40:05,625 WARNING [train.py:1204] (2/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_328-264776-0_sp1.1 from training. Duration: 0.6090625 2023-10-03 09:40:07,002 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9325W0001-648427 from training. Duration: 0.768 2023-10-03 09:40:07,011 WARNING [train.py:1204] (2/4) Exclude cut with ID _56480_210_4220_1_1534679942079_3075409_49-74717-0_sp1.1 from training. Duration: 0.936375 2023-10-03 09:40:08,881 WARNING [train.py:1204] (2/4) Exclude cut with ID _59500_210_8254_1_1534809610680_3736009_6-241869-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 09:40:09,781 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.1.encoder.layers.0.feed_forward1.out_whiten, num_groups=1, num_channels=256, metric=4.68 vs. limit=15.0 2023-10-03 09:40:11,538 WARNING [train.py:1204] (2/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_172-215932-0_sp0.9 from training. Duration: 0.7333125 2023-10-03 09:40:13,035 WARNING [train.py:1204] (2/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_17-90541-0_sp1.1 from training. Duration: 0.8545625 2023-10-03 09:40:13,273 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.conv_skip_rate, batch_count=1221820.0, ans=0.0 2023-10-03 09:40:14,452 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_43831_210_2158_1_1533725502_5872791_525-89447-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 09:40:15,674 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.512e+02 1.893e+02 2.177e+02 2.429e+02 3.374e+02, threshold=4.354e+02, percent-clipped=0.0 2023-10-03 09:40:15,799 WARNING [train.py:1204] (2/4) Exclude cut with ID _12302_210_5219_1_1529924446466_8107280_907-182949-0 from training. Duration: 0.92 2023-10-03 09:40:15,818 WARNING [train.py:1204] (2/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_402-102311-0_sp0.9 from training. Duration: 0.7444375 2023-10-03 09:40:15,853 WARNING [train.py:1204] (2/4) Exclude cut with ID _8021_210_7994_1_1528376451358_3653069_179-45206-0_sp0.9 from training. Duration: 0.9555625 2023-10-03 09:40:19,347 WARNING [train.py:1204] (2/4) Exclude cut with ID _40137_210_12210_1_1533290484046_3795180_249-187059-0 from training. Duration: 0.88 2023-10-03 09:40:22,544 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9653W0194-675472 from training. Duration: 0.9658125 2023-10-03 09:40:25,275 WARNING [train.py:1204] (2/4) Exclude cut with ID _51332_210_9988_1_1534244371836_3801270_336-255604-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 09:40:26,708 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_20120_210_6753_1_1531643035_5482859_510-96996-0 from training. Duration: 0.87 2023-10-03 09:40:26,811 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.0.self_attn_weights.pos_emb_skip_rate, batch_count=1221820.0, ans=0.0 2023-10-03 09:40:28,505 WARNING [train.py:1204] (2/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_1167-20437-0_sp1.1 from training. Duration: 0.92725 2023-10-03 09:40:28,562 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_3475_210_2028_1_1526193578_3622569_265-276625-0 from training. Duration: 0.76 2023-10-03 09:40:31,433 WARNING [train.py:1204] (2/4) Exclude cut with ID _14860_210_4915_1_1530702020340_7343240_470-172418-0_sp1.1 from training. Duration: 0.97275 2023-10-03 09:40:31,442 WARNING [train.py:1204] (2/4) Exclude cut with ID _5444_210_2151_1_1527418808464_5312210_581-315411-0_sp1.1 from training. Duration: 0.563625 2023-10-03 09:40:31,461 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_34450_210_9547_1_1533099443_4109050_519-342439-0_sp1.1 from training. Duration: 0.97275 2023-10-03 09:40:32,740 WARNING [train.py:1204] (2/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_50-175961-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 09:40:33,262 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.5.encoder.layers.1.self_attn_weights.whiten_keys, num_groups=4, num_channels=128, metric=1.67 vs. limit=6.0 2023-10-03 09:40:34,309 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.bypass.scale_min, batch_count=1221886.6666666667, ans=0.2 2023-10-03 09:40:36,811 WARNING [train.py:1204] (2/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_372-6869-0 from training. Duration: 0.44 2023-10-03 09:40:36,819 WARNING [train.py:1204] (2/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_3-237434-0 from training. Duration: 0.76 2023-10-03 09:40:38,296 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.0.bypass.scale_min, batch_count=1221886.6666666667, ans=0.2 2023-10-03 09:40:41,387 WARNING [train.py:1204] (2/4) Exclude cut with ID _8772_210_1850_1_1528635595972_3664011_247-336448-0_sp1.1 from training. Duration: 0.67275 2023-10-03 09:40:45,396 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_427-78097-0 from training. Duration: 0.8 2023-10-03 09:40:45,414 WARNING [train.py:1204] (2/4) Exclude cut with ID _69884_210_9771_1_1535846392818_4166169_466-252727-0_sp1.1 from training. Duration: 0.97275 2023-10-03 09:40:46,766 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_15972_210_6400_1_1530877934_3405692_316-35478-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 09:40:46,792 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_809-17156-0_sp0.9 from training. Duration: 0.811125 2023-10-03 09:40:46,838 WARNING [train.py:1204] (2/4) Exclude cut with ID _57572_210_5517_1_1534921219624_3809484_594-263474-0_sp1.1 from training. Duration: 0.936375 2023-10-03 09:40:48,650 WARNING [train.py:1204] (2/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_190-228204-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 09:40:51,398 WARNING [train.py:1204] (2/4) Exclude cut with ID _19514_210_9259_1_1531530071330_5084230_117-21193-0_sp1.1 from training. Duration: 0.936375 2023-10-03 09:40:51,525 WARNING [train.py:1204] (2/4) Exclude cut with ID _69584_210_5219_1_1535785133775_4044450_190-21315-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 09:40:54,054 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_53071_210_16554_1_1534493434_1276796_184-302050-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 09:40:54,101 WARNING [train.py:1204] (2/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_120-76843-0_sp1.1 from training. Duration: 0.5 2023-10-03 09:40:55,482 WARNING [train.py:1204] (2/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_318-162017-0_sp1.1 from training. Duration: 0.82725 2023-10-03 09:40:57,409 WARNING [train.py:1204] (2/4) Exclude cut with ID _46067_210_4220_1_1534125571066_3307450_79-46864-0_sp1.1 from training. Duration: 0.92725 2023-10-03 09:40:57,467 WARNING [train.py:1204] (2/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_358-215558-0_sp1.1 from training. Duration: 0.8454375 2023-10-03 09:40:58,810 WARNING [train.py:1204] (2/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_621-66764-0_sp1.1 from training. Duration: 0.92725 2023-10-03 09:40:58,925 WARNING [train.py:1204] (2/4) Exclude cut with ID _42863_210_9070_1_1533902398788_3696920_338-156851-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 09:41:00,284 WARNING [train.py:1204] (2/4) Exclude cut with ID _5289_210_2084_1_1527678202736_3779299_392-261988-0_sp0.9 from training. Duration: 0.72225 2023-10-03 09:41:02,487 WARNING [train.py:1204] (2/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_409-84682-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 09:41:03,853 WARNING [train.py:1204] (2/4) Exclude cut with ID _9235_210_5219_1_1528891154253_8046120_30-326681-0_sp0.9 from training. Duration: 0.92225 2023-10-03 09:41:03,858 WARNING [train.py:1204] (2/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_389-184695-0_sp1.1 from training. Duration: 0.92725 2023-10-03 09:41:03,906 WARNING [train.py:1204] (2/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_442-320317-0 from training. Duration: 0.82 2023-10-03 09:41:08,034 WARNING [train.py:1204] (2/4) Exclude cut with ID _44778_210_2054_1_1533798127203_7443090_756-29911-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 09:41:09,429 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.feed_forward3.hidden_balancer.prob, batch_count=1222020.0, ans=0.125 2023-10-03 09:41:10,571 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_401-112036-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 09:41:12,532 WARNING [train.py:1204] (2/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_181-308827-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 09:41:13,652 INFO [train.py:1046] (2/4) Epoch 35, batch 2700, loss[loss=0.2112, simple_loss=0.2751, pruned_loss=0.07363, over 19491.00 frames. ], tot_loss[loss=0.1606, simple_loss=0.2401, pruned_loss=0.04051, over 4727989.81 frames. ], batch size: 388, lr: 2.91e-03, grad_scale: 8.0 2023-10-03 09:41:13,698 WARNING [train.py:1204] (2/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_332-258725-0_sp1.1 from training. Duration: 0.963625 2023-10-03 09:41:13,763 WARNING [train.py:1204] (2/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_188-260011-0_sp0.9 from training. Duration: 0.811125 2023-10-03 09:41:13,808 WARNING [train.py:1204] (2/4) Exclude cut with ID _49039_210_6315_1_1533967537541_3846118_99-302239-0_sp1.1 from training. Duration: 0.963625 2023-10-03 09:41:16,492 WARNING [train.py:1204] (2/4) Exclude cut with ID _4122_210_2084_1_1526713164417_4353280_312-294828-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 09:41:16,497 WARNING [train.py:1204] (2/4) Exclude cut with ID _14922_210_6228_1_1530707435273_3693310_303-228738-0 from training. Duration: 0.84 2023-10-03 09:41:18,387 WARNING [train.py:1204] (2/4) Exclude cut with ID _14349_210_8341_1_1530615710818_3442470_115-285975-0_sp0.9 from training. Duration: 0.9555625 2023-10-03 09:41:19,734 WARNING [train.py:1204] (2/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_125-144174-0_sp1.1 from training. Duration: 0.3545625 2023-10-03 09:41:22,483 WARNING [train.py:1204] (2/4) Exclude cut with ID _53565_210_10635_1_1534337218335_3820259_351-54570-0_sp1.1 from training. Duration: 0.97275 2023-10-03 09:41:22,510 WARNING [train.py:1204] (2/4) Exclude cut with ID _28317_210_12742_1_1532480302381_8510325_410-40065-0_sp1.1 from training. Duration: 0.963625 2023-10-03 09:41:22,531 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_38965_210_4852_1_1533174906_3484735_167-339442-0_sp1.1 from training. Duration: 0.963625 2023-10-03 09:41:25,179 WARNING [train.py:1204] (2/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_265-164486-0_sp1.1 from training. Duration: 0.87275 2023-10-03 09:41:25,190 WARNING [train.py:1204] (2/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_725-182484-0_sp1.1 from training. Duration: 0.92725 2023-10-03 09:41:25,225 WARNING [train.py:1204] (2/4) Exclude cut with ID _28317_210_12742_1_1532480302381_8510325_607-213951-0_sp0.9 from training. Duration: 0.9333125 2023-10-03 09:41:26,513 WARNING [train.py:1204] (2/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_605-133395-0_sp1.1 from training. Duration: 0.6 2023-10-03 09:41:26,534 WARNING [train.py:1204] (2/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_351-294650-0 from training. Duration: 0.63 2023-10-03 09:41:28,435 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_14953_210_2151_1_1530701710_5616165_710-165400-0_sp1.1 from training. Duration: 0.7909375 2023-10-03 09:41:29,789 WARNING [train.py:1204] (2/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_237-58433-0_sp1.1 from training. Duration: 0.67275 2023-10-03 09:41:29,882 WARNING [train.py:1204] (2/4) Exclude cut with ID _5190_210_4557_1_1527244368799_373439_28-197030-0_sp1.1 from training. Duration: 0.6545625 2023-10-03 09:41:29,929 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_743-327651-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 09:41:32,365 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.0.layers.0.feed_forward1.out_whiten, num_groups=1, num_channels=192, metric=9.26 vs. limit=15.0 2023-10-03 09:41:32,809 WARNING [train.py:1204] (2/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_76-203867-0_sp0.9 from training. Duration: 0.8 2023-10-03 09:41:34,821 WARNING [train.py:1204] (2/4) Exclude cut with ID _14603_210_4220_1_1530615688441_3097319_274-274798-0 from training. Duration: 0.98 2023-10-03 09:41:34,860 WARNING [train.py:1204] (2/4) Exclude cut with ID _14087_210_2253_1_1530529224512_3014029_226-242140-0_sp1.1 from training. Duration: 0.736375 2023-10-03 09:41:40,145 WARNING [train.py:1204] (2/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_352-59958-0_sp1.1 from training. Duration: 0.7818125 2023-10-03 09:41:40,150 WARNING [train.py:1204] (2/4) Exclude cut with ID _39411_210_5986_1_1533538825030_3343649_191-251819-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 09:41:45,671 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7046_210_6753_1_1527895337_6920917_642-106799-0_sp1.1 from training. Duration: 0.836375 2023-10-03 09:41:45,678 WARNING [train.py:1204] (2/4) Exclude cut with ID _22826_210_9994_1_1531908201522_3279169_412-3442-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 09:41:45,702 WARNING [train.py:1204] (2/4) Exclude cut with ID _60057_210_10640_1_1534903065793_3333150_171-181133-0_sp0.9 from training. Duration: 0.97775 2023-10-03 09:41:45,716 WARNING [train.py:1204] (2/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_154-135696-0_sp1.1 from training. Duration: 0.72725 2023-10-03 09:41:49,795 WARNING [train.py:1204] (2/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_148-186805-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 09:41:51,193 WARNING [train.py:1204] (2/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_605-165641-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 09:41:52,471 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_174-277364-0_sp0.9 from training. Duration: 0.788875 2023-10-03 09:41:52,492 WARNING [train.py:1204] (2/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_89-321955-0_sp0.9 from training. Duration: 0.888875 2023-10-03 09:41:55,398 WARNING [train.py:1204] (2/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_438-105429-0_sp1.1 from training. Duration: 0.963625 2023-10-03 09:41:55,401 WARNING [train.py:1204] (2/4) Exclude cut with ID _14970_210_15709_1_1530773543493_1715730_187-20259-0_sp0.9 from training. Duration: 0.988875 2023-10-03 09:41:55,633 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.balancer1.prob, batch_count=1222220.0, ans=0.125 2023-10-03 09:42:04,741 WARNING [train.py:1204] (2/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_385-193052-0_sp0.9 from training. Duration: 0.9555625 2023-10-03 09:42:06,044 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7113_210_1849_1_1527942685_3448768_194-311626-0_sp1.1 from training. Duration: 0.92725 2023-10-03 09:42:09,051 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_3780_210_2087_1_1526467389_2145572_176-107140-0_sp0.9 from training. Duration: 0.7666875 2023-10-03 09:42:09,053 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_36756_210_7234_1_1533026991_966276_123-213457-0_sp1.1 from training. Duration: 0.936375 2023-10-03 09:42:10,599 WARNING [train.py:1204] (2/4) Exclude cut with ID _23557_210_8750_1_1531890106370_6846255_456-244861-0_sp1.1 from training. Duration: 0.963625 2023-10-03 09:42:11,930 WARNING [train.py:1204] (2/4) Exclude cut with ID _37086_210_9038_1_1532999540891_7021250_242-349170-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 09:42:11,983 WARNING [train.py:1204] (2/4) Exclude cut with ID _41298_210_9019_1_1533430371450_4822143_471-281351-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 09:42:13,812 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_53378_210_17282_1_1534221071_5769509_572-126176-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 09:42:13,933 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.1.conv_module1.balancer2.prob, batch_count=1222353.3333333333, ans=0.125 2023-10-03 09:42:15,134 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_1507-263032-0_sp1.1 from training. Duration: 0.963625 2023-10-03 09:42:17,030 WARNING [train.py:1204] (2/4) Exclude cut with ID _13679_210_7497_1_1530534293662_3355098_411-186088-0_sp1.1 from training. Duration: 0.8545625 2023-10-03 09:42:19,818 WARNING [train.py:1204] (2/4) Exclude cut with ID _29345_210_4381_1_1532689168263_3860169_149-257268-0_sp1.1 from training. Duration: 0.736375 2023-10-03 09:42:21,212 WARNING [train.py:1204] (2/4) Exclude cut with ID _47790_210_16187_1_1533862814066_4207519_381-252014-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 09:42:21,218 WARNING [train.py:1204] (2/4) Exclude cut with ID _35799_210_5854_1_1533263487887_3529071_512-2717-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 09:42:23,912 WARNING [train.py:1204] (2/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_505-205270-0_sp0.9 from training. Duration: 0.42225 2023-10-03 09:42:23,978 WARNING [train.py:1204] (2/4) Exclude cut with ID _14998_210_5219_1_1530705582080_7474970_731-20117-0_sp1.1 from training. Duration: 0.936375 2023-10-03 09:42:27,296 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_37351_210_7925_1_1533012931_3812113_265-85780-0_sp1.1 from training. Duration: 0.8 2023-10-03 09:42:27,310 WARNING [train.py:1204] (2/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_313-134930-0 from training. Duration: 0.97 2023-10-03 09:42:28,555 INFO [train.py:1046] (2/4) Epoch 35, batch 2750, loss[loss=0.1611, simple_loss=0.229, pruned_loss=0.04656, over 23811.00 frames. ], tot_loss[loss=0.1598, simple_loss=0.2394, pruned_loss=0.04009, over 4731355.06 frames. ], batch size: 212, lr: 2.91e-03, grad_scale: 8.0 2023-10-03 09:42:28,673 WARNING [train.py:1204] (2/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_374-138971-0 from training. Duration: 0.94 2023-10-03 09:42:28,713 WARNING [train.py:1204] (2/4) Exclude cut with ID _30304_210_6319_1_1532476827261_7582060_1227-287886-0_sp1.1 from training. Duration: 0.936375 2023-10-03 09:42:33,074 WARNING [train.py:1204] (2/4) Exclude cut with ID _59493_210_17282_1_1535078419077_3225879_332-217544-0_sp1.1 from training. Duration: 0.97275 2023-10-03 09:42:33,115 WARNING [train.py:1204] (2/4) Exclude cut with ID _23514_210_1389_1_1531832139785_3031439_361-119640-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 09:42:34,518 WARNING [train.py:1204] (2/4) Exclude cut with ID _69584_210_5219_1_1535785133775_4044450_142-118058-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 09:42:34,551 WARNING [train.py:1204] (2/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_178-177582-0_sp1.1 from training. Duration: 0.67275 2023-10-03 09:42:35,918 WARNING [train.py:1204] (2/4) Exclude cut with ID _25303_210_17519_1_1532050207576_3635970_41-195572-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 09:42:38,647 WARNING [train.py:1204] (2/4) Exclude cut with ID _55328_210_27767_1_1534580973260_4088170_329-341322-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 09:42:38,674 WARNING [train.py:1204] (2/4) Exclude cut with ID _5608_210_2106_1_1527415439089_6819768_909-82767-0_sp1.1 from training. Duration: 0.5545625 2023-10-03 09:42:38,717 WARNING [train.py:1204] (2/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_261-297503-0_sp1.1 from training. Duration: 0.9 2023-10-03 09:42:38,723 WARNING [train.py:1204] (2/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_459-291706-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 09:42:38,724 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7762_210_1794_1_1528199508_4057408_422-227034-0 from training. Duration: 0.73 2023-10-03 09:42:38,732 WARNING [train.py:1204] (2/4) Exclude cut with ID _3067_210_2667_1_1526212974640_4503168_250-285937-0_sp0.9 from training. Duration: 0.888875 2023-10-03 09:42:39,973 WARNING [train.py:1204] (2/4) Exclude cut with ID _66873_210_5517_1_1535454136506_4293370_332-253745-0_sp1.1 from training. Duration: 0.936375 2023-10-03 09:42:44,655 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.577e+02 1.884e+02 2.035e+02 2.268e+02 3.504e+02, threshold=4.069e+02, percent-clipped=0.0 2023-10-03 09:42:46,168 WARNING [train.py:1204] (2/4) Exclude cut with ID _13938_210_9988_1_1530518718978_3693960_304-112784-0 from training. Duration: 0.46 2023-10-03 09:42:47,614 WARNING [train.py:1204] (2/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_226-267957-0_sp1.1 from training. Duration: 0.92725 2023-10-03 09:42:49,478 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_41822_210_13270_1_1533955976_4379284_608-254567-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 09:42:50,812 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_124-113562-0_sp1.1 from training. Duration: 0.8545625 2023-10-03 09:42:50,854 WARNING [train.py:1204] (2/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_117-290916-0_sp1.1 from training. Duration: 0.463625 2023-10-03 09:42:52,221 WARNING [train.py:1204] (2/4) Exclude cut with ID _28427_210_4220_1_1532581240967_3061069_102-66533-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 09:42:53,592 WARNING [train.py:1204] (2/4) Exclude cut with ID _55383_210_10635_1_1534468426172_2231669_134-128246-0_sp1.1 from training. Duration: 0.8090625 2023-10-03 09:42:53,654 WARNING [train.py:1204] (2/4) Exclude cut with ID _45871_210_5665_1_1533864614245_3296060_49-308316-0_sp1.1 from training. Duration: 0.97275 2023-10-03 09:42:53,692 WARNING [train.py:1204] (2/4) Exclude cut with ID _57572_210_5517_1_1534921219624_3809484_176-199205-0_sp1.1 from training. Duration: 0.97275 2023-10-03 09:42:53,927 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=1222486.6666666667, ans=0.1 2023-10-03 09:42:55,377 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.conv_module1.balancer2.min_abs, batch_count=1222486.6666666667, ans=0.5 2023-10-03 09:42:58,367 WARNING [train.py:1204] (2/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_45-265684-0_sp1.1 from training. Duration: 0.6545625 2023-10-03 09:42:59,754 WARNING [train.py:1204] (2/4) Exclude cut with ID _15150_210_8341_1_1530752411803_7256384_469-239509-0_sp0.9 from training. Duration: 0.7555625 2023-10-03 09:42:59,803 WARNING [train.py:1204] (2/4) Exclude cut with ID _28461_210_2253_1_1532771775189_3041299_136-270644-0_sp1.1 from training. Duration: 0.8454375 2023-10-03 09:42:59,870 WARNING [train.py:1204] (2/4) Exclude cut with ID _69584_210_5219_1_1535785133775_4044450_302-47302-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 09:43:01,829 WARNING [train.py:1204] (2/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_166-128941-0_sp1.1 from training. Duration: 0.4909375 2023-10-03 09:43:08,624 WARNING [train.py:1204] (2/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_559-120504-0_sp1.1 from training. Duration: 0.97275 2023-10-03 09:43:11,393 WARNING [train.py:1204] (2/4) Exclude cut with ID _6456_210_1534_1_1527681634212_3181049_345-252877-0_sp1.1 from training. Duration: 0.5818125 2023-10-03 09:43:11,424 WARNING [train.py:1204] (2/4) Exclude cut with ID _28317_210_12742_1_1532480302381_8510325_902-233003-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 09:43:11,769 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.feed_forward2.hidden_balancer.prob, batch_count=1222620.0, ans=0.125 2023-10-03 09:43:14,337 WARNING [train.py:1204] (2/4) Exclude cut with ID _36538_210_9994_1_1533204020363_3795620_204-261385-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 09:43:14,341 WARNING [train.py:1204] (2/4) Exclude cut with ID _14821_210_8750_1_1530680503991_6829359_189-297451-0_sp1.1 from training. Duration: 0.5 2023-10-03 09:43:14,382 WARNING [train.py:1204] (2/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_1211-226066-0_sp0.9 from training. Duration: 0.8444375 2023-10-03 09:43:19,572 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6786_210_1388_1_1527839699_4194495_273-149841-0_sp1.1 from training. Duration: 0.836375 2023-10-03 09:43:20,909 WARNING [train.py:1204] (2/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_380-162648-0_sp1.1 from training. Duration: 0.8545625 2023-10-03 09:43:20,910 WARNING [train.py:1204] (2/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_494-199538-0 from training. Duration: 0.75 2023-10-03 09:43:21,701 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.2.encoder.layers.0.nonlin_attention.whiten1, num_groups=1, num_channels=288, metric=7.70 vs. limit=10.0 2023-10-03 09:43:25,344 WARNING [train.py:1204] (2/4) Exclude cut with ID _49097_210_16049_1_1533952780207_4360979_276-87053-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 09:43:26,755 WARNING [train.py:1204] (2/4) Exclude cut with ID _5444_210_2151_1_1527418808464_5312210_726-27165-0 from training. Duration: 0.67 2023-10-03 09:43:33,330 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_128-202104-0_sp1.1 from training. Duration: 0.57275 2023-10-03 09:43:34,801 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_11349_210_6753_1_1529800861_1101207_26-65602-0_sp1.1 from training. Duration: 0.87275 2023-10-03 09:43:34,832 WARNING [train.py:1204] (2/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_159-328157-0 from training. Duration: 0.52 2023-10-03 09:43:36,162 WARNING [train.py:1204] (2/4) Exclude cut with ID _14069_210_5632_1_1530512692033_4803390_98-21934-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 09:43:37,744 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.bypass_mid.scale_min, batch_count=1222686.6666666667, ans=0.2 2023-10-03 09:43:38,715 WARNING [train.py:1204] (2/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_352-59958-0_sp0.9 from training. Duration: 0.9555625 2023-10-03 09:43:38,749 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6163_210_6400_1_1527506746_4263068_283-137811-0 from training. Duration: 0.77 2023-10-03 09:43:38,791 WARNING [train.py:1204] (2/4) Exclude cut with ID _21989_210_5854_1_1531899214931_3271360_133-44759-0_sp1.1 from training. Duration: 0.7 2023-10-03 09:43:42,683 INFO [train.py:1046] (2/4) Epoch 35, batch 2800, loss[loss=0.134, simple_loss=0.2156, pruned_loss=0.02619, over 24425.00 frames. ], tot_loss[loss=0.1588, simple_loss=0.2377, pruned_loss=0.03991, over 4732342.59 frames. ], batch size: 58, lr: 2.91e-03, grad_scale: 8.0 2023-10-03 09:43:42,763 WARNING [train.py:1204] (2/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_21-174514-0_sp0.9 from training. Duration: 0.47775 2023-10-03 09:43:42,795 WARNING [train.py:1204] (2/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_201-78811-0_sp1.1 from training. Duration: 0.963625 2023-10-03 09:43:42,833 WARNING [train.py:1204] (2/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_537-164630-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 09:43:44,334 WARNING [train.py:1204] (2/4) Exclude cut with ID _14087_210_2253_1_1530529224512_3014029_156-257607-0 from training. Duration: 0.83 2023-10-03 09:43:44,353 WARNING [train.py:1204] (2/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_308-154450-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 09:43:44,396 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_45399_210_4852_1_1533624725_7579737_214-240574-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 09:43:47,731 WARNING [train.py:1204] (2/4) Exclude cut with ID _29303_210_2575_1_1532392237303_7410649_615-305517-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 09:43:47,774 WARNING [train.py:1204] (2/4) Exclude cut with ID IC0125W0002-61808 from training. Duration: 0.708 2023-10-03 09:43:47,776 WARNING [train.py:1204] (2/4) Exclude cut with ID IC0125W0017-61823 from training. Duration: 0.759 2023-10-03 09:43:51,016 WARNING [train.py:1204] (2/4) Exclude cut with ID _53219_210_4525_1_1534834836008_4064825_4-145291-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 09:43:52,446 WARNING [train.py:1204] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_185-174805-0_sp1.1 from training. Duration: 0.6181875 2023-10-03 09:43:52,460 WARNING [train.py:1204] (2/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_635-193046-0_sp1.1 from training. Duration: 0.92725 2023-10-03 09:43:55,398 WARNING [train.py:1204] (2/4) Exclude cut with ID _46067_210_4220_1_1534125571066_3307450_321-258866-0_sp1.1 from training. Duration: 0.97275 2023-10-03 09:43:56,875 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8558_210_1410_1_1528505225_7613406_131-329524-0 from training. Duration: 0.79 2023-10-03 09:43:58,841 WARNING [train.py:1204] (2/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_847-41334-0_sp0.9 from training. Duration: 0.488875 2023-10-03 09:44:00,248 WARNING [train.py:1204] (2/4) Exclude cut with ID _14227_210_4381_1_1530701804286_3940259_125-275642-0 from training. Duration: 0.99 2023-10-03 09:44:01,636 WARNING [train.py:1204] (2/4) Exclude cut with ID _25248_210_8750_1_1532062825941_5554180_248-177255-0_sp1.1 from training. Duration: 0.963625 2023-10-03 09:44:03,444 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_40047_210_2290_1_1533290519_2374311_195-165693-0_sp1.1 from training. Duration: 0.8090625 2023-10-03 09:44:03,450 WARNING [train.py:1204] (2/4) Exclude cut with ID _23109_210_4381_1_1532150828777_901949_29-258672-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 09:44:06,202 WARNING [train.py:1204] (2/4) Exclude cut with ID _14821_210_8750_1_1530680503991_6829359_2-227151-0_sp1.1 from training. Duration: 0.7545625 2023-10-03 09:44:07,565 WARNING [train.py:1204] (2/4) Exclude cut with ID _49643_210_7989_1_1534640244224_3939031_565-53982-0_sp1.1 from training. Duration: 0.963625 2023-10-03 09:44:07,569 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7544_210_6753_1_1528591327_1471426_33-346031-0_sp0.9 from training. Duration: 0.67775 2023-10-03 09:44:08,988 WARNING [train.py:1204] (2/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_95-61782-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 09:44:17,809 WARNING [train.py:1204] (2/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_686-54383-0_sp0.9 from training. Duration: 0.9666875 2023-10-03 09:44:19,953 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_14056_210_4163_1_1530604483_7572827_505-276823-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 09:44:21,371 WARNING [train.py:1204] (2/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_745-128443-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 09:44:21,427 WARNING [train.py:1204] (2/4) Exclude cut with ID _3844_210_1390_1_1526727803582_2871209_20-251664-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 09:44:22,820 WARNING [train.py:1204] (2/4) Exclude cut with ID _36538_210_9994_1_1533204020363_3795620_487-164628-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 09:44:28,177 WARNING [train.py:1204] (2/4) Exclude cut with ID _5166_210_2084_1_1527073197160_4087993_309-222545-0_sp1.1 from training. Duration: 0.763625 2023-10-03 09:44:28,192 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_3531_210_3528_1_1526294203_5216456_309-227302-0 from training. Duration: 0.69 2023-10-03 09:44:28,260 WARNING [train.py:1204] (2/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_42-192460-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 09:44:29,717 WARNING [train.py:1204] (2/4) Exclude cut with ID _22083_210_8254_1_1531702862439_3678970_49-234546-0_sp1.1 from training. Duration: 0.7545625 2023-10-03 09:44:29,721 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_427-78097-0_sp0.9 from training. Duration: 0.888875 2023-10-03 09:44:32,993 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_43802_210_2290_1_1533516203_8829504_378-205254-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 09:44:34,275 WARNING [train.py:1204] (2/4) Exclude cut with ID _45329_210_7005_1_1534157951211_6774339_672-314830-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 09:44:37,512 WARNING [train.py:1204] (2/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_90-30673-0_sp1.1 from training. Duration: 0.763625 2023-10-03 09:44:40,237 WARNING [train.py:1204] (2/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_563-329943-0_sp1.1 from training. Duration: 0.9 2023-10-03 09:44:40,259 WARNING [train.py:1204] (2/4) Exclude cut with ID _21632_210_2253_1_1531994001765_3744140_154-84090-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 09:44:40,260 WARNING [train.py:1204] (2/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_103-336064-0_sp0.9 from training. Duration: 0.5666875 2023-10-03 09:44:40,301 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_43-71989-0_sp0.9 from training. Duration: 0.6333125 2023-10-03 09:44:40,356 WARNING [train.py:1204] (2/4) Exclude cut with ID _3867_210_1883_1_1526641200575_4475830_319-349850-0_sp0.9 from training. Duration: 0.7444375 2023-10-03 09:44:43,022 WARNING [train.py:1204] (2/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_629-179454-0_sp1.1 from training. Duration: 0.963625 2023-10-03 09:44:43,030 WARNING [train.py:1204] (2/4) Exclude cut with ID _65263_210_22356_1_1535763598691_3921037_49-222917-0 from training. Duration: 0.92 2023-10-03 09:44:43,069 WARNING [train.py:1204] (2/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_602-331772-0_sp1.1 from training. Duration: 0.936375 2023-10-03 09:44:44,347 WARNING [train.py:1204] (2/4) Exclude cut with ID _58414_210_8254_1_1535421609162_7151182_292-134978-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 09:44:44,375 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_38793_210_2544_1_1533176114_3849320_200-299292-0_sp1.1 from training. Duration: 0.936375 2023-10-03 09:44:44,489 WARNING [train.py:1204] (2/4) Exclude cut with ID _14970_210_15709_1_1530775547361_1966965_6-23328-0 from training. Duration: 0.86 2023-10-03 09:44:46,408 WARNING [train.py:1204] (2/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_386-225511-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 09:44:46,428 WARNING [train.py:1204] (2/4) Exclude cut with ID _38323_210_12108_1_1533121233369_3303049_416-11919-0_sp1.1 from training. Duration: 0.87275 2023-10-03 09:44:46,479 WARNING [train.py:1204] (2/4) Exclude cut with ID _4866_210_2577_1_1527404412284_4364659_256-306830-0_sp1.1 from training. Duration: 0.7909375 2023-10-03 09:44:48,479 WARNING [train.py:1204] (2/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_53-300544-0 from training. Duration: 0.67 2023-10-03 09:44:52,860 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.attention_skip_rate, batch_count=1223020.0, ans=0.0 2023-10-03 09:44:53,906 WARNING [train.py:1204] (2/4) Exclude cut with ID _41314_210_12651_1_1533459476543_3766460_80-74184-0_sp1.1 from training. Duration: 0.7545625 2023-10-03 09:44:53,919 WARNING [train.py:1204] (2/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_332-256168-0_sp1.1 from training. Duration: 0.6818125 2023-10-03 09:44:53,960 WARNING [train.py:1204] (2/4) Exclude cut with ID _48836_210_12210_1_1534075848293_3904150_429-35475-0_sp1.1 from training. Duration: 0.8090625 2023-10-03 09:44:54,125 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.self_attn_weights.pos_emb_skip_rate, batch_count=1223020.0, ans=0.0 2023-10-03 09:44:56,652 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_144-135833-0_sp1.1 from training. Duration: 0.97275 2023-10-03 09:44:56,990 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.conv_module1.balancer1.max_abs, batch_count=1223086.6666666667, ans=10.0 2023-10-03 09:44:58,072 INFO [train.py:1046] (2/4) Epoch 35, batch 2850, loss[loss=0.1567, simple_loss=0.2467, pruned_loss=0.03332, over 24639.00 frames. ], tot_loss[loss=0.1585, simple_loss=0.2375, pruned_loss=0.03977, over 4724088.25 frames. ], batch size: 68, lr: 2.91e-03, grad_scale: 8.0 2023-10-03 09:44:59,989 WARNING [train.py:1204] (2/4) Exclude cut with ID _14224_210_4381_1_1530529062037_4264010_380-343670-0_sp0.9 from training. Duration: 0.97775 2023-10-03 09:45:00,018 WARNING [train.py:1204] (2/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_213-265174-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 09:45:01,363 WARNING [train.py:1204] (2/4) Exclude cut with ID _20455_210_5219_1_1531565996775_8098100_86-314283-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 09:45:03,538 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.conv_module1.balancer2.prob, batch_count=1223086.6666666667, ans=0.125 2023-10-03 09:45:04,739 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_41750_210_1960_1_1533520678_3948042_68-223813-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 09:45:06,042 WARNING [train.py:1204] (2/4) Exclude cut with ID _55153_210_2253_1_1534384786677_3162096_24-198429-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 09:45:07,353 WARNING [train.py:1204] (2/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_119-207162-0_sp0.9 from training. Duration: 0.788875 2023-10-03 09:45:07,398 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_12868_210_6753_1_1530702479_3610564_148-172861-0 from training. Duration: 0.5 2023-10-03 09:45:13,666 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_5522_210_3273_1_1527249606_3909096_318-203153-0 from training. Duration: 0.43 2023-10-03 09:45:13,673 WARNING [train.py:1204] (2/4) Exclude cut with ID _14884_210_2252_1_1530707459689_5561250_288-189949-0_sp1.1 from training. Duration: 0.97275 2023-10-03 09:45:15,174 WARNING [train.py:1204] (2/4) Exclude cut with ID _5294_210_1747_1_1527249643928_3696179_529-242298-0 from training. Duration: 0.95 2023-10-03 09:45:16,440 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.620e+02 1.902e+02 2.080e+02 2.462e+02 6.971e+02, threshold=4.161e+02, percent-clipped=1.0 2023-10-03 09:45:16,519 WARNING [train.py:1204] (2/4) Exclude cut with ID _36538_210_9994_1_1533204020363_3795620_503-110289-0_sp1.1 from training. Duration: 0.936375 2023-10-03 09:45:18,034 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.conv_skip_rate, batch_count=1223153.3333333333, ans=0.0 2023-10-03 09:45:19,145 WARNING [train.py:1204] (2/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_385-193052-0 from training. Duration: 0.86 2023-10-03 09:45:20,497 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_456-22141-0 from training. Duration: 0.88 2023-10-03 09:45:20,612 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_38965_210_4852_1_1533174906_3484735_284-117840-0_sp1.1 from training. Duration: 0.936375 2023-10-03 09:45:26,213 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.0.layers.0.self_attn1.whiten, num_groups=1, num_channels=192, metric=11.02 vs. limit=22.5 2023-10-03 09:45:32,774 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.ff2_skip_rate, batch_count=1223220.0, ans=0.0 2023-10-03 09:45:33,889 WARNING [train.py:1204] (2/4) Exclude cut with ID _32704_210_8954_1_1533017488840_6747370_75-201625-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 09:45:34,172 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.self_attn_weights.pos_emb_skip_rate, batch_count=1223220.0, ans=0.0 2023-10-03 09:45:35,446 WARNING [train.py:1204] (2/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_255-312654-0_sp1.1 from training. Duration: 0.82725 2023-10-03 09:45:35,464 WARNING [train.py:1204] (2/4) Exclude cut with ID _47251_210_18628_1_1533814063128_4137250_214-88076-0_sp0.9 from training. Duration: 0.97775 2023-10-03 09:45:37,308 WARNING [train.py:1204] (2/4) Exclude cut with ID _4092_210_2151_1_1526643044449_3135759_227-213972-0_sp1.1 from training. Duration: 0.4909375 2023-10-03 09:45:37,323 WARNING [train.py:1204] (2/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_912-47473-0_sp1.1 from training. Duration: 0.6181875 2023-10-03 09:45:37,357 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6767_210_3273_1_1527855783_1718932_173-256601-0_sp1.1 from training. Duration: 0.72725 2023-10-03 09:45:40,138 WARNING [train.py:1204] (2/4) Exclude cut with ID _6274_210_2084_1_1527937583837_7494020_32-205288-0_sp1.1 from training. Duration: 0.7090625 2023-10-03 09:45:40,172 WARNING [train.py:1204] (2/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_39-248870-0 from training. Duration: 0.69 2023-10-03 09:45:43,242 WARNING [train.py:1204] (2/4) Exclude cut with ID _3541_210_2477_1_1526475408709_3263973_377-296839-0_sp1.1 from training. Duration: 0.5 2023-10-03 09:45:43,275 WARNING [train.py:1204] (2/4) Exclude cut with ID _13679_210_7497_1_1530534293662_3355098_413-56154-0_sp1.1 from training. Duration: 0.963625 2023-10-03 09:45:43,318 WARNING [train.py:1204] (2/4) Exclude cut with ID _14970_210_15709_1_1530775547361_1966965_201-84544-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 09:45:44,660 WARNING [train.py:1204] (2/4) Exclude cut with ID _21783_210_11357_1_1531742458730_3762679_105-95775-0_sp1.1 from training. Duration: 0.936375 2023-10-03 09:45:46,067 WARNING [train.py:1204] (2/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_131-191669-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 09:45:46,092 WARNING [train.py:1204] (2/4) Exclude cut with ID _14860_210_4915_1_1530702020340_7343240_695-4323-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 09:45:46,432 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.conv_module1.balancer2.prob, batch_count=1223286.6666666667, ans=0.125 2023-10-03 09:45:47,508 WARNING [train.py:1204] (2/4) Exclude cut with ID _39867_210_9826_1_1533553222030_9344259_991-130565-0_sp1.1 from training. Duration: 0.97275 2023-10-03 09:45:47,673 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_50-3289-0_sp1.1 from training. Duration: 0.82725 2023-10-03 09:45:50,376 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_54-347670-0_sp1.1 from training. Duration: 0.92725 2023-10-03 09:45:51,728 WARNING [train.py:1204] (2/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_200-153445-0_sp1.1 from training. Duration: 0.936375 2023-10-03 09:45:53,473 WARNING [train.py:1204] (2/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_279-302911-0_sp1.1 from training. Duration: 0.97275 2023-10-03 09:45:54,863 WARNING [train.py:1204] (2/4) Exclude cut with ID _14886_210_4220_1_1530702020355_3516190_159-73748-0_sp0.9 from training. Duration: 0.92225 2023-10-03 09:45:55,011 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.0.conv_module2.balancer1.prob, batch_count=1223286.6666666667, ans=0.125 2023-10-03 09:45:59,099 WARNING [train.py:1204] (2/4) Exclude cut with ID _53379_210_20034_1_1534222027363_3854654_23-222715-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 09:46:00,387 WARNING [train.py:1204] (2/4) Exclude cut with ID _12555_210_8750_1_1530075594402_3780239_449-267986-0 from training. Duration: 0.83 2023-10-03 09:46:01,649 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7544_210_6753_1_1528587722_3576686_295-258479-0 from training. Duration: 0.84 2023-10-03 09:46:04,351 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7002_210_1794_1_1527935634_4034444_487-236501-0_sp0.9 from training. Duration: 0.7333125 2023-10-03 09:46:04,410 WARNING [train.py:1204] (2/4) Exclude cut with ID _21783_210_11357_1_1531742458730_3762679_244-19107-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 09:46:04,424 WARNING [train.py:1204] (2/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_390-339137-0 from training. Duration: 0.56 2023-10-03 09:46:06,414 WARNING [train.py:1204] (2/4) Exclude cut with ID _7562_210_2084_1_1528887767916_3946229_186-326422-0_sp1.1 from training. Duration: 0.763625 2023-10-03 09:46:06,486 WARNING [train.py:1204] (2/4) Exclude cut with ID _33640_210_2655_1_1533117605479_4855469_645-145235-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 09:46:06,503 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_25767_210_2161_1_1532307483_3867748_506-171777-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 09:46:06,522 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_5438_210_1794_1_1527336687_2600009_233-58072-0_sp1.1 from training. Duration: 0.7 2023-10-03 09:46:06,523 WARNING [train.py:1204] (2/4) Exclude cut with ID IC0669W0063-330908 from training. Duration: 0.896 2023-10-03 09:46:07,789 WARNING [train.py:1204] (2/4) Exclude cut with ID IC0671W0070-331913 from training. Duration: 0.896 2023-10-03 09:46:07,793 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_258-310570-0_sp1.1 from training. Duration: 0.7454375 2023-10-03 09:46:09,601 WARNING [train.py:1204] (2/4) Exclude cut with ID _9461_210_4557_1_1529060514726_3677451_162-177507-0_sp1.1 from training. Duration: 0.97275 2023-10-03 09:46:12,868 INFO [train.py:1046] (2/4) Epoch 35, batch 2900, loss[loss=0.1525, simple_loss=0.2385, pruned_loss=0.03323, over 24336.00 frames. ], tot_loss[loss=0.1589, simple_loss=0.2377, pruned_loss=0.04001, over 4721501.54 frames. ], batch size: 77, lr: 2.91e-03, grad_scale: 8.0 2023-10-03 09:46:14,242 WARNING [train.py:1204] (2/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_26-175553-0_sp0.9 from training. Duration: 0.67775 2023-10-03 09:46:14,259 WARNING [train.py:1204] (2/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_7-150481-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 09:46:14,327 WARNING [train.py:1204] (2/4) Exclude cut with ID _23557_210_8750_1_1531890106370_6846255_72-273262-0_sp1.1 from training. Duration: 0.963625 2023-10-03 09:46:15,781 WARNING [train.py:1204] (2/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_367-61330-0 from training. Duration: 0.75 2023-10-03 09:46:21,213 WARNING [train.py:1204] (2/4) Exclude cut with ID _39867_210_9826_1_1533553222030_9344259_1337-11141-0_sp1.1 from training. Duration: 0.936375 2023-10-03 09:46:21,231 WARNING [train.py:1204] (2/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_101-293367-0 from training. Duration: 0.63 2023-10-03 09:46:21,307 WARNING [train.py:1204] (2/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_10-100367-0 from training. Duration: 0.98 2023-10-03 09:46:22,694 WARNING [train.py:1204] (2/4) Exclude cut with ID _60057_210_10640_1_1534903065793_3333150_171-181133-0_sp1.1 from training. Duration: 0.8 2023-10-03 09:46:22,697 WARNING [train.py:1204] (2/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_844-96989-0_sp0.9 from training. Duration: 0.788875 2023-10-03 09:46:24,748 WARNING [train.py:1204] (2/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_301-144595-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 09:46:26,183 WARNING [train.py:1204] (2/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_115-147343-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 09:46:29,026 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_3171_210_2087_1_1526109084_2330007_37-64334-0_sp1.1 from training. Duration: 0.7454375 2023-10-03 09:46:29,062 WARNING [train.py:1204] (2/4) Exclude cut with ID _19514_210_9259_1_1531530071330_5084230_769-272914-0_sp1.1 from training. Duration: 0.936375 2023-10-03 09:46:33,129 WARNING [train.py:1204] (2/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_76-311963-0_sp0.9 from training. Duration: 0.77775 2023-10-03 09:46:33,165 WARNING [train.py:1204] (2/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_526-62079-0 from training. Duration: 0.67 2023-10-03 09:46:33,220 WARNING [train.py:1204] (2/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_7-111954-0_sp0.9 from training. Duration: 0.62225 2023-10-03 09:46:34,572 WARNING [train.py:1204] (2/4) Exclude cut with ID _57997_210_4163_1_1534766425889_3961049_360-279140-0_sp1.1 from training. Duration: 0.97275 2023-10-03 09:46:35,997 WARNING [train.py:1204] (2/4) Exclude cut with ID _4866_210_2577_1_1527404412284_4364659_133-1696-0 from training. Duration: 0.99 2023-10-03 09:46:36,053 WARNING [train.py:1204] (2/4) Exclude cut with ID _5190_210_4557_1_1527244368799_373439_28-197030-0 from training. Duration: 0.72 2023-10-03 09:46:38,901 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.1.encoder.layers.0.nonlin_attention.whiten1, num_groups=1, num_channels=192, metric=4.66 vs. limit=10.0 2023-10-03 09:46:39,747 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_53-39003-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 09:46:39,750 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6673_210_3273_1_1527767790_3173240_351-167954-0 from training. Duration: 0.48 2023-10-03 09:46:39,772 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_2822_210_2715_1_1526112586_1999587_165-36656-0_sp1.1 from training. Duration: 0.8090625 2023-10-03 09:46:43,034 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_328-264892-0_sp1.1 from training. Duration: 0.92725 2023-10-03 09:46:43,038 WARNING [train.py:1204] (2/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_29-293896-0_sp0.9 from training. Duration: 0.67775 2023-10-03 09:46:44,670 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=1223553.3333333333, ans=0.1 2023-10-03 09:46:45,716 WARNING [train.py:1204] (2/4) Exclude cut with ID _65263_210_22356_1_1535763598691_3921037_465-55448-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 09:46:45,794 WARNING [train.py:1204] (2/4) Exclude cut with ID _21989_210_5854_1_1531899214931_3271360_311-53106-0_sp1.1 from training. Duration: 0.97275 2023-10-03 09:46:47,436 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.balancer2.prob, batch_count=1223553.3333333333, ans=0.125 2023-10-03 09:46:49,783 WARNING [train.py:1204] (2/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_389-277915-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 09:46:52,457 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_69630_210_7234_1_1535791094_677837_63-277139-0_sp1.1 from training. Duration: 0.963625 2023-10-03 09:46:55,709 WARNING [train.py:1204] (2/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_85-138257-0 from training. Duration: 0.71 2023-10-03 09:46:55,732 WARNING [train.py:1204] (2/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_686-247600-0 from training. Duration: 0.92 2023-10-03 09:46:55,737 WARNING [train.py:1204] (2/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_108-255430-0_sp1.1 from training. Duration: 0.8818125 2023-10-03 09:47:00,012 WARNING [train.py:1204] (2/4) Exclude cut with ID _14884_210_2252_1_1530707459689_5561250_114-201399-0_sp1.1 from training. Duration: 0.6909375 2023-10-03 09:47:00,349 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.ff3_skip_rate, batch_count=1223620.0, ans=0.0 2023-10-03 09:47:01,584 WARNING [train.py:1204] (2/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_506-42614-0 from training. Duration: 0.79 2023-10-03 09:47:02,981 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_43802_210_2290_1_1533516203_8829504_85-195240-0_sp1.1 from training. Duration: 0.7909375 2023-10-03 09:47:08,958 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_141-36243-0_sp1.1 from training. Duration: 0.97275 2023-10-03 09:47:13,913 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.balancer_ff2.min_abs, batch_count=1223686.6666666667, ans=0.1 2023-10-03 09:47:18,102 WARNING [train.py:1204] (2/4) Exclude cut with ID _7852_210_5219_1_1528286471316_8139400_358-287492-0_sp1.1 from training. Duration: 0.87275 2023-10-03 09:47:18,126 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_805-71497-0_sp0.9 from training. Duration: 0.788875 2023-10-03 09:47:19,499 WARNING [train.py:1204] (2/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_178-39722-0 from training. Duration: 0.49 2023-10-03 09:47:21,901 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.2.encoder.layers.0.conv_module1.whiten, num_groups=1, num_channels=384, metric=6.83 vs. limit=15.0 2023-10-03 09:47:22,398 WARNING [train.py:1204] (2/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_428-181925-0_sp1.1 from training. Duration: 0.963625 2023-10-03 09:47:22,401 WARNING [train.py:1204] (2/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_770-200430-0 from training. Duration: 0.89 2023-10-03 09:47:22,448 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_1104-271009-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 09:47:22,498 WARNING [train.py:1204] (2/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_128-296581-0_sp0.9 from training. Duration: 0.82225 2023-10-03 09:47:26,628 INFO [train.py:1046] (2/4) Epoch 35, batch 2950, loss[loss=0.1691, simple_loss=0.2574, pruned_loss=0.04041, over 24434.00 frames. ], tot_loss[loss=0.1592, simple_loss=0.2383, pruned_loss=0.04004, over 4720444.34 frames. ], batch size: 69, lr: 2.91e-03, grad_scale: 8.0 2023-10-03 09:47:29,881 WARNING [train.py:1204] (2/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_19-62511-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 09:47:31,317 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_805-71497-0 from training. Duration: 0.71 2023-10-03 09:47:31,386 WARNING [train.py:1204] (2/4) Exclude cut with ID _44778_210_2054_1_1533798127203_7443090_573-152077-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 09:47:31,391 WARNING [train.py:1204] (2/4) Exclude cut with ID _42875_210_14856_1_1533704397487_4012433_393-177816-0_sp1.1 from training. Duration: 0.963625 2023-10-03 09:47:32,787 WARNING [train.py:1204] (2/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_278-35159-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 09:47:34,158 WARNING [train.py:1204] (2/4) Exclude cut with ID _15037_210_2253_1_1530752294104_3267650_13-130361-0_sp1.1 from training. Duration: 0.9 2023-10-03 09:47:35,553 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_3019_210_2028_1_1526044245_3264865_127-135615-0 from training. Duration: 0.94 2023-10-03 09:47:36,919 WARNING [train.py:1204] (2/4) Exclude cut with ID _28317_210_12742_1_1532480302381_8510325_607-213951-0 from training. Duration: 0.84 2023-10-03 09:47:38,293 WARNING [train.py:1204] (2/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_436-318113-0_sp0.9 from training. Duration: 0.8333125 2023-10-03 09:47:38,293 WARNING [train.py:1204] (2/4) Exclude cut with ID _13938_210_9988_1_1530518718978_3693960_148-152225-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 09:47:44,986 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.625e+02 1.855e+02 2.022e+02 2.286e+02 3.734e+02, threshold=4.043e+02, percent-clipped=0.0 2023-10-03 09:47:46,976 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_2541_210_1794_1_1525590269_3084765_156-173713-0_sp1.1 from training. Duration: 0.736375 2023-10-03 09:47:48,397 WARNING [train.py:1204] (2/4) Exclude cut with ID _28323_210_16187_1_1532480395667_4100300_531-24157-0_sp1.1 from training. Duration: 0.8909375 2023-10-03 09:47:49,882 WARNING [train.py:1204] (2/4) Exclude cut with ID _29303_210_2575_1_1532392237303_7410649_382-217492-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 09:47:49,918 WARNING [train.py:1204] (2/4) Exclude cut with ID _58895_210_8750_1_1535184081320_3649089_270-158013-0_sp1.1 from training. Duration: 0.92725 2023-10-03 09:47:52,695 WARNING [train.py:1204] (2/4) Exclude cut with ID _29277_210_3279_1_1532581109293_3173069_311-206227-0_sp1.1 from training. Duration: 0.936375 2023-10-03 09:47:52,706 WARNING [train.py:1204] (2/4) Exclude cut with ID _7160_210_1876_1_1527940923231_3703799_282-322611-0_sp0.9 from training. Duration: 0.9666875 2023-10-03 09:47:54,119 WARNING [train.py:1204] (2/4) Exclude cut with ID _53219_210_4525_1_1534834836008_4064825_562-32295-0_sp1.1 from training. Duration: 0.963625 2023-10-03 09:47:54,206 WARNING [train.py:1204] (2/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_408-87330-0_sp1.1 from training. Duration: 0.963625 2023-10-03 09:47:54,213 WARNING [train.py:1204] (2/4) Exclude cut with ID _59202_210_26984_1_1534816810612_3549174_137-318710-0_sp1.1 from training. Duration: 0.8090625 2023-10-03 09:47:57,025 WARNING [train.py:1204] (2/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_372-30302-0 from training. Duration: 0.86 2023-10-03 09:48:01,768 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.conv_module1.balancer1.min_positive, batch_count=1223886.6666666667, ans=0.025 2023-10-03 09:48:02,974 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7543_210_6753_1_1528541907_1399322_178-56269-0_sp1.1 from training. Duration: 0.95275 2023-10-03 09:48:02,991 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9109W0012-549758 from training. Duration: 0.512 2023-10-03 09:48:04,287 WARNING [train.py:1204] (2/4) Exclude cut with ID _5410_210_1388_1_1527397055795_1719020_39-112221-0_sp0.9 from training. Duration: 0.7666875 2023-10-03 09:48:04,415 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9121W0012-554933 from training. Duration: 0.896 2023-10-03 09:48:07,001 WARNING [train.py:1204] (2/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_476-272757-0 from training. Duration: 0.93 2023-10-03 09:48:07,020 WARNING [train.py:1204] (2/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_1085-171718-0_sp1.1 from training. Duration: 0.92725 2023-10-03 09:48:07,075 WARNING [train.py:1204] (2/4) Exclude cut with ID _8480_210_1874_1_1528589391205_6728330_309-139896-0_sp1.1 from training. Duration: 0.736375 2023-10-03 09:48:07,076 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9130W0113-559703 from training. Duration: 0.896 2023-10-03 09:48:07,079 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_292-265928-0_sp1.1 from training. Duration: 0.463625 2023-10-03 09:48:09,870 WARNING [train.py:1204] (2/4) Exclude cut with ID _8772_210_1850_1_1528635595972_3664011_96-12756-0 from training. Duration: 0.98 2023-10-03 09:48:09,932 WARNING [train.py:1204] (2/4) Exclude cut with ID _12556_210_8341_1_1530081024100_7327690_725-193477-0_sp1.1 from training. Duration: 0.8909375 2023-10-03 09:48:09,969 WARNING [train.py:1204] (2/4) Exclude cut with ID _5289_210_2084_1_1527678202736_3779299_142-103317-0_sp1.1 from training. Duration: 0.763625 2023-10-03 09:48:13,561 WARNING [train.py:1204] (2/4) Exclude cut with ID _45012_210_12651_1_1533608435136_5371380_206-51075-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 09:48:14,883 WARNING [train.py:1204] (2/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_176-56120-0_sp1.1 from training. Duration: 0.7545625 2023-10-03 09:48:14,920 WARNING [train.py:1204] (2/4) Exclude cut with ID _40284_210_2084_1_1533513679814_3704279_220-201864-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 09:48:16,279 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9165W0004-576638 from training. Duration: 0.896 2023-10-03 09:48:16,318 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_5438_210_1794_1_1527336687_2600009_218-343336-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 09:48:16,339 WARNING [train.py:1204] (2/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_318-162017-0 from training. Duration: 0.91 2023-10-03 09:48:17,057 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.3.encoder.layers.1.nonlin_attention.whiten2, num_groups=1, num_channels=512, metric=8.35 vs. limit=15.0 2023-10-03 09:48:22,594 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_49544_210_1794_1_1534379770_5262083_483-349298-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 09:48:23,752 WARNING [train.py:1204] (2/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_197-28164-0_sp0.9 from training. Duration: 0.888875 2023-10-03 09:48:23,808 WARNING [train.py:1204] (2/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_643-52007-0 from training. Duration: 0.57 2023-10-03 09:48:23,839 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_38965_210_4852_1_1533174906_3484735_403-332963-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 09:48:24,536 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.3.encoder.layers.0.feed_forward3.out_whiten, num_groups=1, num_channels=512, metric=12.81 vs. limit=15.0 2023-10-03 09:48:25,262 WARNING [train.py:1204] (2/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_340-174176-0 from training. Duration: 0.49 2023-10-03 09:48:28,028 WARNING [train.py:1204] (2/4) Exclude cut with ID _56708_210_13177_1_1535252351282_3422899_363-917-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 09:48:28,136 WARNING [train.py:1204] (2/4) Exclude cut with ID _19896_210_1883_1_1531709022662_6649569_525-159063-0_sp1.1 from training. Duration: 0.936375 2023-10-03 09:48:28,180 WARNING [train.py:1204] (2/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_430-116139-0_sp0.9 from training. Duration: 0.9444375 2023-10-03 09:48:31,406 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_53753_210_7234_1_1534320003_3594148_86-329250-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 09:48:31,416 WARNING [train.py:1204] (2/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_406-275649-0_sp1.1 from training. Duration: 0.3818125 2023-10-03 09:48:32,791 WARNING [train.py:1204] (2/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_1083-147536-0_sp1.1 from training. Duration: 0.8818125 2023-10-03 09:48:32,852 WARNING [train.py:1204] (2/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_54-311741-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 09:48:32,865 WARNING [train.py:1204] (2/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_237-58433-0_sp0.9 from training. Duration: 0.82225 2023-10-03 09:48:34,262 WARNING [train.py:1204] (2/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_98-51676-0_sp1.1 from training. Duration: 0.663625 2023-10-03 09:48:34,324 WARNING [train.py:1204] (2/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_386-270472-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 09:48:35,705 WARNING [train.py:1204] (2/4) Exclude cut with ID _19023_210_4353_1_1531378892552_5820780_83-254794-0_sp1.1 from training. Duration: 0.7818125 2023-10-03 09:48:37,078 WARNING [train.py:1204] (2/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_314-93656-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 09:48:37,091 WARNING [train.py:1204] (2/4) Exclude cut with ID _14170_210_5219_1_1530525949868_4469650_336-213530-0 from training. Duration: 0.9 2023-10-03 09:48:38,883 WARNING [train.py:1204] (2/4) Exclude cut with ID _36686_210_5665_1_1533778225913_3646410_65-221120-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 09:48:41,553 INFO [train.py:1046] (2/4) Epoch 35, batch 3000, loss[loss=0.1464, simple_loss=0.2298, pruned_loss=0.03149, over 24426.00 frames. ], tot_loss[loss=0.1594, simple_loss=0.2386, pruned_loss=0.04008, over 4725846.15 frames. ], batch size: 58, lr: 2.91e-03, grad_scale: 8.0 2023-10-03 09:48:41,554 INFO [train.py:1069] (2/4) Computing validation loss 2023-10-03 09:48:53,192 INFO [train.py:1078] (2/4) Epoch 35, validation: loss=0.3596, simple_loss=0.2732, pruned_loss=0.223, over 1125622.00 frames. 2023-10-03 09:48:53,192 INFO [train.py:1079] (2/4) Maximum memory allocated so far is 20967MB 2023-10-03 09:48:53,302 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_26433_210_12108_1_1532157158_4056480_271-68178-0_sp1.1 from training. Duration: 0.92725 2023-10-03 09:48:53,338 WARNING [train.py:1204] (2/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_46-207599-0_sp0.9 from training. Duration: 0.711125 2023-10-03 09:48:56,739 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9303W0114-637133 from training. Duration: 0.896 2023-10-03 09:48:56,773 WARNING [train.py:1204] (2/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_490-207861-0 from training. Duration: 0.98 2023-10-03 09:48:59,682 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6767_210_3273_1_1527855783_1718932_173-256601-0_sp0.9 from training. Duration: 0.888875 2023-10-03 09:48:59,724 WARNING [train.py:1204] (2/4) Exclude cut with ID _13921_210_9994_1_1530841782272_3503520_327-69677-0_sp1.1 from training. Duration: 0.8181875 2023-10-03 09:49:01,069 WARNING [train.py:1204] (2/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_508-335369-0 from training. Duration: 0.48 2023-10-03 09:49:01,092 WARNING [train.py:1204] (2/4) Exclude cut with ID _64631_210_27767_1_1535349580068_4165160_24-46567-0_sp1.1 from training. Duration: 0.963625 2023-10-03 09:49:01,331 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.bypass.scale_min, batch_count=1224086.6666666667, ans=0.2 2023-10-03 09:49:05,393 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.bypass.scale_min, batch_count=1224086.6666666667, ans=0.2 2023-10-03 09:49:09,576 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_89-35748-0_sp1.1 from training. Duration: 0.7181875 2023-10-03 09:49:19,975 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_26433_210_12108_1_1532157158_4056480_578-262107-0_sp1.1 from training. Duration: 0.9 2023-10-03 09:49:27,340 WARNING [train.py:1204] (2/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_129-337802-0 from training. Duration: 0.72 2023-10-03 09:49:28,699 WARNING [train.py:1204] (2/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_356-167816-0_sp0.9 from training. Duration: 0.8 2023-10-03 09:49:28,878 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.conv_skip_rate, batch_count=1224220.0, ans=0.0 2023-10-03 09:49:31,226 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.4.encoder.layers.0.feed_forward1.out_whiten, num_groups=1, num_channels=384, metric=5.30 vs. limit=15.0 2023-10-03 09:49:33,241 WARNING [train.py:1204] (2/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_137-116442-0_sp1.1 from training. Duration: 0.7090625 2023-10-03 09:49:33,278 WARNING [train.py:1204] (2/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_358-172940-0_sp1.1 from training. Duration: 0.963625 2023-10-03 09:49:34,617 WARNING [train.py:1204] (2/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_668-344147-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 09:49:36,050 WARNING [train.py:1204] (2/4) Exclude cut with ID _62337_210_12392_1_1535009273105_3412229_255-157667-0_sp1.1 from training. Duration: 0.936375 2023-10-03 09:49:36,053 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_486-151131-0 from training. Duration: 0.85 2023-10-03 09:49:37,462 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_53638_210_21581_1_1534297319_5808449_885-80172-0 from training. Duration: 0.96 2023-10-03 09:49:38,880 WARNING [train.py:1204] (2/4) Exclude cut with ID _50093_210_4220_1_1534417141207_3432299_105-183067-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 09:49:38,950 WARNING [train.py:1204] (2/4) Exclude cut with ID _7562_210_2084_1_1528887767916_3946229_345-169156-0_sp0.9 from training. Duration: 0.6444375 2023-10-03 09:49:40,362 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_4528_210_3273_1_1527076654_3276777_209-219804-0_sp0.9 from training. Duration: 0.7666875 2023-10-03 09:49:40,407 WARNING [train.py:1204] (2/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_35-153886-0_sp0.9 from training. Duration: 0.9333125 2023-10-03 09:49:40,468 WARNING [train.py:1204] (2/4) Exclude cut with ID _30800_210_22382_1_1532499347904_4515959_332-121925-0_sp1.1 from training. Duration: 0.92725 2023-10-03 09:49:40,470 WARNING [train.py:1204] (2/4) Exclude cut with ID _59202_210_26984_1_1534816810612_3549174_495-65515-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 09:49:45,299 WARNING [train.py:1204] (2/4) Exclude cut with ID _6274_210_2084_1_1527937583837_7494020_769-168302-0_sp1.1 from training. Duration: 0.6454375 2023-10-03 09:49:45,433 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.1.conv_skip_rate, batch_count=1224286.6666666667, ans=0.0 2023-10-03 09:49:47,031 WARNING [train.py:1204] (2/4) Exclude cut with ID _24271_210_12658_1_1531987270022_7190519_708-71345-0_sp1.1 from training. Duration: 0.936375 2023-10-03 09:49:47,031 WARNING [train.py:1204] (2/4) Exclude cut with ID 3033-130750-0096-55598-0_sp0.9 from training. Duration: 0.92225 2023-10-03 09:49:48,288 WARNING [train.py:1204] (2/4) Exclude cut with ID _14087_210_2253_1_1530529224512_3014029_219-224445-0_sp0.9 from training. Duration: 0.9333125 2023-10-03 09:49:48,505 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.bypass_mid.scale_min, batch_count=1224286.6666666667, ans=0.2 2023-10-03 09:49:51,103 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_174-277364-0 from training. Duration: 0.71 2023-10-03 09:49:51,184 WARNING [train.py:1204] (2/4) Exclude cut with ID _45021_210_9994_1_1533619751094_3330288_96-239010-0_sp1.1 from training. Duration: 0.82725 2023-10-03 09:49:51,205 WARNING [train.py:1204] (2/4) Exclude cut with ID _31602_210_5854_1_1532677021456_4458250_489-305116-0_sp1.1 from training. Duration: 0.97275 2023-10-03 09:49:51,224 WARNING [train.py:1204] (2/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_269-154726-0_sp0.9 from training. Duration: 0.9555625 2023-10-03 09:49:55,896 WARNING [train.py:1204] (2/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_126-54418-0_sp1.1 from training. Duration: 0.92725 2023-10-03 09:49:55,935 WARNING [train.py:1204] (2/4) Exclude cut with ID _42326_210_7770_1_1533814282531_3707269_182-224159-0_sp1.1 from training. Duration: 0.92725 2023-10-03 09:49:58,542 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7002_210_1794_1_1527939683_5157286_1-281264-0_sp0.9 from training. Duration: 0.488875 2023-10-03 09:49:58,582 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_86-16881-0 from training. Duration: 0.58 2023-10-03 09:49:58,620 WARNING [train.py:1204] (2/4) Exclude cut with ID _19514_210_9259_1_1531530071330_5084230_637-279094-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 09:49:59,924 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_26433_210_12108_1_1532157158_4056480_578-262107-0 from training. Duration: 0.99 2023-10-03 09:49:59,982 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6291_210_3273_1_1527683649_1078498_82-221196-0_sp1.1 from training. Duration: 0.6909375 2023-10-03 09:50:03,161 WARNING [train.py:1204] (2/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_402-102311-0 from training. Duration: 0.67 2023-10-03 09:50:07,176 INFO [train.py:1046] (2/4) Epoch 35, batch 3050, loss[loss=0.148, simple_loss=0.2378, pruned_loss=0.02909, over 24644.00 frames. ], tot_loss[loss=0.16, simple_loss=0.2394, pruned_loss=0.04032, over 4725886.77 frames. ], batch size: 73, lr: 2.91e-03, grad_scale: 4.0 2023-10-03 09:50:07,239 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1015-343884-0_sp1.1 from training. Duration: 0.563625 2023-10-03 09:50:08,626 WARNING [train.py:1204] (2/4) Exclude cut with ID _13684_210_7497_1_1530619354863_3598359_312-235606-0_sp1.1 from training. Duration: 0.5454375 2023-10-03 09:50:08,668 WARNING [train.py:1204] (2/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_55-31904-0 from training. Duration: 0.88 2023-10-03 09:50:08,746 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_3133_210_2087_1_1526106446_2324690_25-332138-0 from training. Duration: 0.91 2023-10-03 09:50:08,748 WARNING [train.py:1204] (2/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_912-282412-0_sp0.9 from training. Duration: 0.5444375 2023-10-03 09:50:09,008 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.conv_module1.balancer1.prob, batch_count=1224420.0, ans=0.125 2023-10-03 09:50:10,105 WARNING [train.py:1204] (2/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_458-299435-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 09:50:10,182 WARNING [train.py:1204] (2/4) Exclude cut with ID _69584_210_5219_1_1535785133775_4044450_275-88403-0_sp1.1 from training. Duration: 0.92725 2023-10-03 09:50:10,190 WARNING [train.py:1204] (2/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_841-341679-0_sp1.1 from training. Duration: 0.52725 2023-10-03 09:50:10,198 WARNING [train.py:1204] (2/4) Exclude cut with ID _41391_210_7994_1_1533603771872_3638201_1-54521-0_sp1.1 from training. Duration: 0.97275 2023-10-03 09:50:11,527 WARNING [train.py:1204] (2/4) Exclude cut with ID _56235_210_10640_1_1534557335235_4161099_495-186609-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 09:50:14,394 WARNING [train.py:1204] (2/4) Exclude cut with ID _4681_210_3525_1_1527250689464_120889_15-126760-0 from training. Duration: 0.81 2023-10-03 09:50:17,504 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_3019_210_2028_1_1526044245_3264865_127-135615-0_sp1.1 from training. Duration: 0.8545625 2023-10-03 09:50:19,574 WARNING [train.py:1204] (2/4) Exclude cut with ID _30191_210_1664_1_1533189915047_7196670_915-21561-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 09:50:19,620 WARNING [train.py:1204] (2/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_6-214815-0_sp0.9 from training. Duration: 0.9444375 2023-10-03 09:50:21,225 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.feed_forward1.out_proj.dropout_p, batch_count=1224486.6666666667, ans=0.1 2023-10-03 09:50:21,288 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.conv_skip_rate, batch_count=1224486.6666666667, ans=0.0 2023-10-03 09:50:23,769 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_45399_210_4852_1_1533624725_7579737_190-137494-0_sp1.1 from training. Duration: 0.97275 2023-10-03 09:50:26,933 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.384e+02 1.939e+02 2.109e+02 2.349e+02 4.315e+02, threshold=4.217e+02, percent-clipped=1.0 2023-10-03 09:50:27,038 WARNING [train.py:1204] (2/4) Exclude cut with ID _41314_210_12651_1_1533459476543_3766460_207-132038-0 from training. Duration: 0.94 2023-10-03 09:50:31,311 WARNING [train.py:1204] (2/4) Exclude cut with ID _14603_210_4220_1_1530615688441_3097319_90-31383-0 from training. Duration: 0.87 2023-10-03 09:50:31,358 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_15517_210_6753_1_1530952293_1664795_119-31001-0 from training. Duration: 0.94 2023-10-03 09:50:31,393 WARNING [train.py:1204] (2/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_684-85342-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 09:50:34,479 WARNING [train.py:1204] (2/4) Exclude cut with ID _14603_210_4220_1_1530615688441_3097319_97-325053-0_sp1.1 from training. Duration: 0.836375 2023-10-03 09:50:38,439 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_68702_210_23689_1_1535799577_3420068_168-25150-0_sp1.1 from training. Duration: 0.97275 2023-10-03 09:50:38,448 WARNING [train.py:1204] (2/4) Exclude cut with ID _65134_210_14763_1_1535335393985_3505368_111-27984-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 09:50:38,519 WARNING [train.py:1204] (2/4) Exclude cut with ID _48678_210_5663_1_1533988830947_3769199_276-226230-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 09:50:41,242 WARNING [train.py:1204] (2/4) Exclude cut with ID _28427_210_4220_1_1532581240967_3061069_43-84928-0_sp1.1 from training. Duration: 0.9 2023-10-03 09:50:41,275 WARNING [train.py:1204] (2/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_158-250116-0_sp1.1 from training. Duration: 0.563625 2023-10-03 09:50:42,477 WARNING [train.py:1204] (2/4) Exclude cut with ID _66873_210_5517_1_1535454136506_4293370_279-332556-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 09:50:42,519 WARNING [train.py:1204] (2/4) Exclude cut with ID _42863_210_9070_1_1533902398788_3696920_488-230146-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 09:50:42,520 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_387-247159-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 09:50:43,841 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6729_210_2550_1_1528032481_1788725_261-42619-0_sp1.1 from training. Duration: 0.97275 2023-10-03 09:50:44,072 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.feed_forward2.hidden_balancer.prob, batch_count=1224553.3333333333, ans=0.125 2023-10-03 09:50:45,315 WARNING [train.py:1204] (2/4) Exclude cut with ID _19896_210_1883_1_1531709022662_6649569_831-44724-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 09:50:46,738 WARNING [train.py:1204] (2/4) Exclude cut with ID _55153_210_2253_1_1534384786677_3162096_280-285944-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 09:50:48,447 WARNING [train.py:1204] (2/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_448-4048-0 from training. Duration: 0.96 2023-10-03 09:50:49,789 WARNING [train.py:1204] (2/4) Exclude cut with ID _29678_210_7893_1_1532421042787_3906118_456-202548-0_sp1.1 from training. Duration: 0.97275 2023-10-03 09:50:49,802 WARNING [train.py:1204] (2/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_107-332398-0_sp1.1 from training. Duration: 0.7090625 2023-10-03 09:50:53,174 WARNING [train.py:1204] (2/4) Exclude cut with ID _13921_210_9994_1_1530838702800_3003429_46-149444-0_sp1.1 from training. Duration: 0.8545625 2023-10-03 09:50:53,215 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_257-20733-0_sp0.9 from training. Duration: 0.8444375 2023-10-03 09:50:54,517 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_53753_210_7234_1_1534320003_3594148_385-234809-0_sp1.1 from training. Duration: 0.963625 2023-10-03 09:50:54,580 WARNING [train.py:1204] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_292-215346-0_sp1.1 from training. Duration: 0.8818125 2023-10-03 09:50:58,694 WARNING [train.py:1204] (2/4) Exclude cut with ID _68965_210_1883_1_1535698962272_3497490_196-124268-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 09:51:00,054 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_23801_210_16223_1_1532066392_3294657_570-5439-0_sp1.1 from training. Duration: 0.8818125 2023-10-03 09:51:04,819 WARNING [train.py:1204] (2/4) Exclude cut with ID _5166_210_2084_1_1527073197160_4087993_349-6928-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 09:51:06,161 WARNING [train.py:1204] (2/4) Exclude cut with ID _60621_210_17071_1_1535013053530_3671739_226-255202-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 09:51:06,165 WARNING [train.py:1204] (2/4) Exclude cut with ID _41817_210_18835_1_1533434477160_7423049_448-130302-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 09:51:07,601 WARNING [train.py:1204] (2/4) Exclude cut with ID _6653_210_2629_1_1527919172441_6732679_758-143677-0_sp1.1 from training. Duration: 0.8909375 2023-10-03 09:51:07,651 WARNING [train.py:1204] (2/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_912-47473-0_sp0.9 from training. Duration: 0.7555625 2023-10-03 09:51:09,001 WARNING [train.py:1204] (2/4) Exclude cut with ID _6694_210_4599_1_1527852637490_3965529_356-169233-0_sp1.1 from training. Duration: 0.9 2023-10-03 09:51:09,269 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.conv_skip_rate, batch_count=1224686.6666666667, ans=0.0 2023-10-03 09:51:10,361 WARNING [train.py:1204] (2/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_1-99680-0 from training. Duration: 0.67 2023-10-03 09:51:11,707 WARNING [train.py:1204] (2/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_199-86432-0_sp1.1 from training. Duration: 0.8909375 2023-10-03 09:51:11,730 WARNING [train.py:1204] (2/4) Exclude cut with ID _61148_210_6897_1_1535094022726_4217610_362-182430-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 09:51:13,007 WARNING [train.py:1204] (2/4) Exclude cut with ID _49376_210_5986_1_1533985278732_3370740_301-154763-0 from training. Duration: 0.98 2023-10-03 09:51:13,280 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.conv_module1.balancer1.prob, batch_count=1224686.6666666667, ans=0.125 2023-10-03 09:51:14,382 WARNING [train.py:1204] (2/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_572-185150-0_sp1.1 from training. Duration: 0.8818125 2023-10-03 09:51:20,344 INFO [train.py:1046] (2/4) Epoch 35, batch 3100, loss[loss=0.1521, simple_loss=0.2418, pruned_loss=0.03121, over 24277.00 frames. ], tot_loss[loss=0.1603, simple_loss=0.2396, pruned_loss=0.04051, over 4721004.97 frames. ], batch size: 74, lr: 2.91e-03, grad_scale: 8.0 2023-10-03 09:51:20,494 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_47896_210_20508_1_1534492518_5958868_558-314572-0_sp1.1 from training. Duration: 0.8818125 2023-10-03 09:51:22,497 WARNING [train.py:1204] (2/4) Exclude cut with ID _45560_210_21982_1_1533812603951_3584999_111-6376-0_sp0.9 from training. Duration: 0.9333125 2023-10-03 09:51:25,274 WARNING [train.py:1204] (2/4) Exclude cut with ID _14170_210_5219_1_1530525949868_4469650_412-317077-0_sp1.1 from training. Duration: 0.6818125 2023-10-03 09:51:26,678 WARNING [train.py:1204] (2/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_718-68505-0 from training. Duration: 0.89 2023-10-03 09:51:29,293 WARNING [train.py:1204] (2/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_77-311404-0 from training. Duration: 0.94 2023-10-03 09:51:30,623 WARNING [train.py:1204] (2/4) Exclude cut with ID _60229_210_27767_1_1534838406158_3655350_264-279573-0 from training. Duration: 0.83 2023-10-03 09:51:30,767 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.conv_module2.balancer2.prob, batch_count=1224753.3333333333, ans=0.125 2023-10-03 09:51:31,998 WARNING [train.py:1204] (2/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_106-300985-0_sp1.1 from training. Duration: 0.8181875 2023-10-03 09:51:35,308 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_4183_210_2087_1_1526729100_1869838_79-277189-0_sp1.1 from training. Duration: 0.963625 2023-10-03 09:51:36,575 WARNING [train.py:1204] (2/4) Exclude cut with ID _28427_210_4220_1_1532581240967_3061069_195-295268-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 09:51:38,168 WARNING [train.py:1204] (2/4) Exclude cut with ID _4092_210_2151_1_1526643044449_3135759_356-278694-0_sp0.9 from training. Duration: 0.52225 2023-10-03 09:51:41,020 WARNING [train.py:1204] (2/4) Exclude cut with ID _14459_210_2228_1_1531443641946_7516700_777-71446-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 09:51:42,818 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.5.encoder.layers.0.whiten, num_groups=1, num_channels=256, metric=2.36 vs. limit=12.0 2023-10-03 09:51:46,573 WARNING [train.py:1204] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_520-126729-0 from training. Duration: 0.7 2023-10-03 09:51:51,171 WARNING [train.py:1204] (2/4) Exclude cut with ID _13684_210_7497_1_1530619354863_3598359_312-235606-0_sp0.9 from training. Duration: 0.6666875 2023-10-03 09:51:53,085 WARNING [train.py:1204] (2/4) Exclude cut with ID _37537_210_2253_1_1533171758568_3421949_283-349402-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 09:51:53,123 WARNING [train.py:1204] (2/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_29-117932-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 09:51:53,156 WARNING [train.py:1204] (2/4) Exclude cut with ID _29303_210_2575_1_1532392237303_7410649_721-283351-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 09:51:54,490 WARNING [train.py:1204] (2/4) Exclude cut with ID _14886_210_4220_1_1530702020355_3516190_175-229073-0_sp0.9 from training. Duration: 0.5 2023-10-03 09:51:56,012 WARNING [train.py:1204] (2/4) Exclude cut with ID _15458_210_8341_1_1530858614263_3834410_70-268830-0_sp1.1 from training. Duration: 0.97275 2023-10-03 09:51:56,035 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_2295_210_2087_1_1525500993_2407863_234-83104-0 from training. Duration: 0.62 2023-10-03 09:51:56,036 WARNING [train.py:1204] (2/4) Exclude cut with ID _22961_210_8254_1_1531976366672_4278180_317-199546-0_sp1.1 from training. Duration: 0.7818125 2023-10-03 09:51:57,441 WARNING [train.py:1204] (2/4) Exclude cut with ID _27894_210_12392_1_1532415496078_6902439_547-196022-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 09:51:58,803 WARNING [train.py:1204] (2/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_46-207599-0 from training. Duration: 0.64 2023-10-03 09:52:00,184 WARNING [train.py:1204] (2/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_426-326246-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 09:52:04,228 WARNING [train.py:1204] (2/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_445-117398-0_sp0.9 from training. Duration: 0.988875 2023-10-03 09:52:04,295 WARNING [train.py:1204] (2/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_563-347832-0 from training. Duration: 0.89 2023-10-03 09:52:07,521 WARNING [train.py:1204] (2/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_252-216674-0 from training. Duration: 0.54 2023-10-03 09:52:07,600 WARNING [train.py:1204] (2/4) Exclude cut with ID _59611_210_8895_1_1534741198240_3650361_187-212779-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 09:52:07,664 WARNING [train.py:1204] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_282-159367-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 09:52:09,150 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_68702_210_23689_1_1535799577_3420068_33-136883-0_sp1.1 from training. Duration: 0.936375 2023-10-03 09:52:09,161 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_47228_210_4983_1_1533780021_3672027_184-293706-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 09:52:10,471 WARNING [train.py:1204] (2/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_656-188361-0_sp0.9 from training. Duration: 0.9666875 2023-10-03 09:52:10,597 WARNING [train.py:1204] (2/4) Exclude cut with ID _4866_210_2577_1_1527404412284_4364659_352-157930-0_sp0.9 from training. Duration: 0.9 2023-10-03 09:52:10,598 WARNING [train.py:1204] (2/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_210-106047-0_sp1.1 from training. Duration: 0.8818125 2023-10-03 09:52:12,224 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=1224953.3333333333, ans=0.1 2023-10-03 09:52:13,322 WARNING [train.py:1204] (2/4) Exclude cut with ID _9389_210_1664_1_1529217337301_5289269_289-213417-0_sp1.1 from training. Duration: 0.8090625 2023-10-03 09:52:13,358 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_3910_210_1794_1_1526777667_3747260_178-41186-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 09:52:13,362 WARNING [train.py:1204] (2/4) Exclude cut with ID _27052_210_5986_1_1532329197187_3585009_24-172263-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 09:52:14,648 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_5522_210_3273_1_1527249606_3909096_318-203153-0_sp1.1 from training. Duration: 0.3909375 2023-10-03 09:52:16,236 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.1.ff2_skip_rate, batch_count=1224953.3333333333, ans=0.0 2023-10-03 09:52:16,772 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.3.encoder.layers.0.conv_module2.whiten, num_groups=1, num_channels=512, metric=2.85 vs. limit=15.0 2023-10-03 09:52:17,423 WARNING [train.py:1204] (2/4) Exclude cut with ID _27894_210_12392_1_1532415496078_6902439_222-51982-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 09:52:18,814 WARNING [train.py:1204] (2/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_87-131427-0 from training. Duration: 0.78 2023-10-03 09:52:20,754 WARNING [train.py:1204] (2/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_372-298093-0_sp0.9 from training. Duration: 0.888875 2023-10-03 09:52:20,905 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.balancer1.prob, batch_count=1225020.0, ans=0.125 2023-10-03 09:52:22,715 WARNING [train.py:1204] (2/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_663-166973-0 from training. Duration: 0.8 2023-10-03 09:52:22,741 WARNING [train.py:1204] (2/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_429-177469-0_sp1.1 from training. Duration: 0.936375 2023-10-03 09:52:24,068 WARNING [train.py:1204] (2/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_724-172651-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 09:52:24,109 WARNING [train.py:1204] (2/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_103-336064-0 from training. Duration: 0.51 2023-10-03 09:52:26,072 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder_embed.conv.8.prob, batch_count=1225020.0, ans=0.125 2023-10-03 09:52:34,862 INFO [train.py:1046] (2/4) Epoch 35, batch 3150, loss[loss=0.1369, simple_loss=0.1927, pruned_loss=0.04059, over 19140.00 frames. ], tot_loss[loss=0.1595, simple_loss=0.2384, pruned_loss=0.04026, over 4710132.97 frames. ], batch size: 388, lr: 2.91e-03, grad_scale: 8.0 2023-10-03 09:52:34,973 WARNING [train.py:1204] (2/4) Exclude cut with ID _48677_210_5663_1_1533902405919_3892989_111-11439-0 from training. Duration: 0.96 2023-10-03 09:52:37,708 WARNING [train.py:1204] (2/4) Exclude cut with ID _8085_210_4557_1_1528455633493_3716100_95-103398-0_sp1.1 from training. Duration: 0.963625 2023-10-03 09:52:37,762 WARNING [train.py:1204] (2/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_1041-110837-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 09:52:39,263 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_50050_210_16223_1_1535435796_7290138_1155-121757-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 09:52:39,265 WARNING [train.py:1204] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_602-275260-0_sp0.9 from training. Duration: 0.911125 2023-10-03 09:52:39,335 WARNING [train.py:1204] (2/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_265-164486-0 from training. Duration: 0.96 2023-10-03 09:52:40,659 WARNING [train.py:1204] (2/4) Exclude cut with ID _46067_210_4220_1_1534125571066_3307450_142-229375-0_sp1.1 from training. Duration: 0.963625 2023-10-03 09:52:40,692 WARNING [train.py:1204] (2/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_310-26186-0_sp1.1 from training. Duration: 0.636375 2023-10-03 09:52:40,783 WARNING [train.py:1204] (2/4) Exclude cut with ID _28461_210_2253_1_1532771775189_3041299_136-270644-0 from training. Duration: 0.93 2023-10-03 09:52:42,283 WARNING [train.py:1204] (2/4) Exclude cut with ID _14860_210_4915_1_1530702020340_7343240_438-286372-0_sp1.1 from training. Duration: 0.936375 2023-10-03 09:52:43,686 WARNING [train.py:1204] (2/4) Exclude cut with ID IC0128W0004-63309 from training. Duration: 0.831 2023-10-03 09:52:46,487 WARNING [train.py:1204] (2/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_135-174080-0 from training. Duration: 0.53 2023-10-03 09:52:46,530 WARNING [train.py:1204] (2/4) Exclude cut with ID _41326_210_5854_1_1533778076509_3492080_277-59511-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 09:52:46,600 WARNING [train.py:1204] (2/4) Exclude cut with ID IC0144W0112-71379 from training. Duration: 0.992 2023-10-03 09:52:47,134 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.4.encoder.layers.1.conv_module1.whiten, num_groups=1, num_channels=384, metric=4.77 vs. limit=15.0 2023-10-03 09:52:48,561 WARNING [train.py:1204] (2/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_711-15688-0_sp0.9 from training. Duration: 0.47775 2023-10-03 09:52:49,949 WARNING [train.py:1204] (2/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_237-332027-0 from training. Duration: 0.95 2023-10-03 09:52:50,012 WARNING [train.py:1204] (2/4) Exclude cut with ID _7970_210_1883_1_1528455616296_3472230_25-178407-0 from training. Duration: 0.71 2023-10-03 09:52:50,013 WARNING [train.py:1204] (2/4) Exclude cut with ID _8480_210_1874_1_1528589391205_6728330_309-139896-0 from training. Duration: 0.81 2023-10-03 09:52:50,031 WARNING [train.py:1204] (2/4) Exclude cut with ID _39571_210_13449_1_1533801584143_3651077_319-268877-0_sp1.1 from training. Duration: 0.936375 2023-10-03 09:52:50,035 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_43802_210_2290_1_1533516203_8829504_155-246880-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 09:52:51,357 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_566-122599-0_sp1.1 from training. Duration: 0.936375 2023-10-03 09:52:52,629 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_12868_210_6753_1_1530701932_530275_18-76322-0 from training. Duration: 0.87 2023-10-03 09:52:54,301 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.542e+02 1.875e+02 2.111e+02 2.412e+02 4.030e+02, threshold=4.223e+02, percent-clipped=0.0 2023-10-03 09:52:57,074 WARNING [train.py:1204] (2/4) Exclude cut with ID _56494_210_4220_1_1535284837625_3033169_159-106035-0_sp1.1 from training. Duration: 0.963625 2023-10-03 09:52:57,105 WARNING [train.py:1204] (2/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_98-276013-0_sp1.1 from training. Duration: 0.963625 2023-10-03 09:52:57,162 WARNING [train.py:1204] (2/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_330-216124-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 09:52:59,185 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_177-58005-0_sp0.9 from training. Duration: 0.688875 2023-10-03 09:53:03,252 WARNING [train.py:1204] (2/4) Exclude cut with ID _56235_210_10640_1_1534557335235_4161099_343-139898-0 from training. Duration: 0.96 2023-10-03 09:53:04,510 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6961_210_2087_1_1527937168_6815011_395-185060-0_sp1.1 from training. Duration: 0.863625 2023-10-03 09:53:05,997 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7002_210_1794_1_1527939683_5157286_184-229630-0_sp0.9 from training. Duration: 0.82225 2023-10-03 09:53:06,035 WARNING [train.py:1204] (2/4) Exclude cut with ID _59170_210_6721_1_1534983633960_4203335_153-18895-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 09:53:06,096 WARNING [train.py:1204] (2/4) Exclude cut with ID _49643_210_7989_1_1534640244224_3939031_598-161926-0 from training. Duration: 0.9 2023-10-03 09:53:08,901 WARNING [train.py:1204] (2/4) Exclude cut with ID _4068_210_2629_1_1526697350395_1546259_224-288693-0 from training. Duration: 0.59 2023-10-03 09:53:08,955 WARNING [train.py:1204] (2/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_718-68505-0_sp1.1 from training. Duration: 0.8090625 2023-10-03 09:53:10,433 WARNING [train.py:1204] (2/4) Exclude cut with ID _4068_210_2629_1_1526697350395_1546259_224-288693-0_sp0.9 from training. Duration: 0.6555625 2023-10-03 09:53:10,445 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_2296_210_2087_1_1525503448_3397335_57-185430-0_sp0.9 from training. Duration: 0.6666875 2023-10-03 09:53:10,511 WARNING [train.py:1204] (2/4) Exclude cut with ID _15458_210_8341_1_1530858614263_3834410_49-235472-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 09:53:10,514 WARNING [train.py:1204] (2/4) Exclude cut with ID _64135_210_2253_1_1535677267876_7608972_498-5039-0_sp0.9 from training. Duration: 0.8666875 2023-10-03 09:53:11,800 WARNING [train.py:1204] (2/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_283-220372-0_sp0.9 from training. Duration: 0.62225 2023-10-03 09:53:11,810 WARNING [train.py:1204] (2/4) Exclude cut with ID _6473_210_1741_1_1527901307396_3815380_108-17335-0_sp0.9 from training. Duration: 0.72225 2023-10-03 09:53:12,730 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.0.feed_forward1.out_whiten.whitening_limit, batch_count=1225220.0, ans=15.0 2023-10-03 09:53:13,133 WARNING [train.py:1204] (2/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_113-92608-0 from training. Duration: 0.54 2023-10-03 09:53:13,184 WARNING [train.py:1204] (2/4) Exclude cut with ID _14170_210_5219_1_1530525949868_4469650_272-249985-0_sp1.1 from training. Duration: 0.6090625 2023-10-03 09:53:14,435 WARNING [train.py:1204] (2/4) Exclude cut with ID _50376_210_13401_1_1534031502101_4217740_518-187390-0_sp1.1 from training. Duration: 0.97275 2023-10-03 09:53:17,598 WARNING [train.py:1204] (2/4) Exclude cut with ID _59738_210_15379_1_1534849512562_3941470_165-248260-0_sp1.1 from training. Duration: 0.92725 2023-10-03 09:53:17,615 WARNING [train.py:1204] (2/4) Exclude cut with ID _23818_210_6228_1_1531917063884_3676920_400-343925-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 09:53:19,601 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6961_210_2087_1_1527937168_6815011_562-165518-0 from training. Duration: 0.77 2023-10-03 09:53:19,654 WARNING [train.py:1204] (2/4) Exclude cut with ID _24271_210_12658_1_1531987270022_7190519_289-207931-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 09:53:22,372 WARNING [train.py:1204] (2/4) Exclude cut with ID _14632_210_9988_1_1530691279392_3923200_332-105904-0 from training. Duration: 0.9 2023-10-03 09:53:22,384 WARNING [train.py:1204] (2/4) Exclude cut with ID _40402_210_18219_1_1533522324115_2953150_203-232438-0_sp1.1 from training. Duration: 0.97275 2023-10-03 09:53:22,483 WARNING [train.py:1204] (2/4) Exclude cut with ID _13252_210_1925_1_1530671372047_3336909_263-168388-0 from training. Duration: 0.86 2023-10-03 09:53:23,951 WARNING [train.py:1204] (2/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_174-205058-0 from training. Duration: 0.95 2023-10-03 09:53:24,073 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_94-195801-0_sp1.1 from training. Duration: 0.8545625 2023-10-03 09:53:24,103 WARNING [train.py:1204] (2/4) Exclude cut with ID _55153_210_2253_1_1534384786677_3162096_269-103204-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 09:53:25,892 WARNING [train.py:1204] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_748-36150-0 from training. Duration: 0.91 2023-10-03 09:53:27,396 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_12868_210_6753_1_1530702479_3610564_148-172861-0_sp1.1 from training. Duration: 0.4545625 2023-10-03 09:53:27,437 WARNING [train.py:1204] (2/4) Exclude cut with ID _67499_210_17282_1_1535509805220_3692430_460-77730-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 09:53:29,404 WARNING [train.py:1204] (2/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_374-20753-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 09:53:31,169 WARNING [train.py:1204] (2/4) Exclude cut with ID _45329_210_7005_1_1534157951211_6774339_374-289983-0_sp1.1 from training. Duration: 0.97275 2023-10-03 09:53:32,297 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_34886_210_10633_1_1533203882_1755661_76-156096-0_sp0.9 from training. Duration: 0.9666875 2023-10-03 09:53:34,240 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.5.encoder.layers.0.feed_forward2.out_whiten, num_groups=1, num_channels=256, metric=7.76 vs. limit=15.0 2023-10-03 09:53:35,243 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_238-256668-0_sp0.9 from training. Duration: 0.7666875 2023-10-03 09:53:36,594 WARNING [train.py:1204] (2/4) Exclude cut with ID _46683_210_9642_1_1533730913492_4591116_489-146727-0_sp1.1 from training. Duration: 0.97275 2023-10-03 09:53:38,037 WARNING [train.py:1204] (2/4) Exclude cut with ID _13938_210_9988_1_1530518718978_3693960_304-112784-0_sp0.9 from training. Duration: 0.511125 2023-10-03 09:53:42,320 WARNING [train.py:1204] (2/4) Exclude cut with ID _24271_210_12658_1_1531987270022_7190519_243-46549-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 09:53:42,325 WARNING [train.py:1204] (2/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_78-182912-0_sp1.1 from training. Duration: 0.536375 2023-10-03 09:53:47,235 WARNING [train.py:1204] (2/4) Exclude cut with ID _57572_210_5517_1_1534921219624_3809484_121-321334-0_sp1.1 from training. Duration: 0.97275 2023-10-03 09:53:50,268 INFO [train.py:1046] (2/4) Epoch 35, batch 3200, loss[loss=0.165, simple_loss=0.2426, pruned_loss=0.04369, over 23358.00 frames. ], tot_loss[loss=0.1591, simple_loss=0.2374, pruned_loss=0.04039, over 4700764.26 frames. ], batch size: 119, lr: 2.91e-03, grad_scale: 16.0 2023-10-03 09:53:50,327 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_40047_210_2290_1_1533290519_2374311_254-102149-0_sp1.1 from training. Duration: 0.963625 2023-10-03 09:53:50,328 WARNING [train.py:1204] (2/4) Exclude cut with ID _28461_210_2253_1_1532771775189_3041299_96-152158-0 from training. Duration: 0.85 2023-10-03 09:53:53,636 WARNING [train.py:1204] (2/4) Exclude cut with ID _29877_210_6228_1_1532397791106_5276149_161-102979-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 09:53:55,213 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.0.balancer2.prob, batch_count=1225420.0, ans=0.125 2023-10-03 09:53:56,536 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_3787_210_2040_1_1526477544_5478586_603-120324-0_sp0.9 from training. Duration: 0.9 2023-10-03 09:54:01,044 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_41750_210_1960_1_1533520678_3948042_247-213456-0_sp1.1 from training. Duration: 0.97275 2023-10-03 09:54:09,434 WARNING [train.py:1204] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_127-119109-0_sp0.9 from training. Duration: 0.8 2023-10-03 09:54:19,498 WARNING [train.py:1204] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_215-202973-0 from training. Duration: 0.88 2023-10-03 09:54:19,571 WARNING [train.py:1204] (2/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_17-320491-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 09:54:19,732 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.conv_skip_rate, batch_count=1225553.3333333333, ans=0.0 2023-10-03 09:54:19,747 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.feed_forward1.hidden_balancer.prob, batch_count=1225553.3333333333, ans=0.125 2023-10-03 09:54:22,255 WARNING [train.py:1204] (2/4) Exclude cut with ID _5166_210_2084_1_1527073197160_4087993_310-114452-0 from training. Duration: 0.59 2023-10-03 09:54:22,332 WARNING [train.py:1204] (2/4) Exclude cut with ID _15133_210_6228_1_1530794512074_3080379_29-264458-0_sp0.9 from training. Duration: 0.6444375 2023-10-03 09:54:22,994 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.3.encoder.layers.0.conv_module2.whiten, num_groups=1, num_channels=512, metric=3.21 vs. limit=15.0 2023-10-03 09:54:25,598 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.0.conv_skip_rate, batch_count=1225553.3333333333, ans=0.0 2023-10-03 09:54:26,826 WARNING [train.py:1204] (2/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_159-281310-0_sp1.1 from training. Duration: 0.863625 2023-10-03 09:54:26,843 WARNING [train.py:1204] (2/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_195-167011-0_sp0.9 from training. Duration: 0.8333125 2023-10-03 09:54:28,164 WARNING [train.py:1204] (2/4) Exclude cut with ID _14087_210_2253_1_1530529224512_3014029_156-257607-0_sp1.1 from training. Duration: 0.7545625 2023-10-03 09:54:32,938 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_2295_210_2087_1_1525500993_2407863_116-238894-0 from training. Duration: 0.7 2023-10-03 09:54:33,040 WARNING [train.py:1204] (2/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_321-335475-0_sp0.9 from training. Duration: 0.5 2023-10-03 09:54:35,752 WARNING [train.py:1204] (2/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_188-114464-0 from training. Duration: 0.82 2023-10-03 09:54:37,280 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_2592_210_2290_1_1525697025_4882925_157-70512-0 from training. Duration: 0.91 2023-10-03 09:54:39,956 WARNING [train.py:1204] (2/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_138-64210-0_sp1.1 from training. Duration: 0.663625 2023-10-03 09:54:45,474 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_11305_210_1794_1_1529750428_7178148_651-95702-0_sp1.1 from training. Duration: 0.963625 2023-10-03 09:54:45,488 WARNING [train.py:1204] (2/4) Exclude cut with ID _14224_210_4381_1_1530529062037_4264010_379-184223-0_sp0.9 from training. Duration: 0.9444375 2023-10-03 09:54:45,506 WARNING [train.py:1204] (2/4) Exclude cut with ID _8394_210_6702_1_1528590504316_3540189_381-125325-0_sp1.1 from training. Duration: 0.963625 2023-10-03 09:54:45,559 WARNING [train.py:1204] (2/4) Exclude cut with ID IC0622W0021-307659 from training. Duration: 0.896 2023-10-03 09:54:45,561 WARNING [train.py:1204] (2/4) Exclude cut with ID _7624_210_1531_1_1528109802198_4933101_418-349834-0_sp1.1 from training. Duration: 0.5454375 2023-10-03 09:54:50,849 WARNING [train.py:1204] (2/4) Exclude cut with ID _49728_210_12651_1_1534041034294_6788150_607-3416-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 09:54:52,193 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8275_210_4198_1_1528455320_3892571_491-17707-0 from training. Duration: 0.64 2023-10-03 09:54:52,249 WARNING [train.py:1204] (2/4) Exclude cut with ID _5287_210_2084_1_1527418883040_3878670_333-48508-0 from training. Duration: 0.6 2023-10-03 09:54:53,485 WARNING [train.py:1204] (2/4) Exclude cut with ID _8365_210_2064_1_1528592311070_4677690_371-122295-0 from training. Duration: 0.55 2023-10-03 09:54:53,591 WARNING [train.py:1204] (2/4) Exclude cut with ID _14860_210_4915_1_1530702020340_7343240_836-328489-0 from training. Duration: 0.9 2023-10-03 09:54:56,835 WARNING [train.py:1204] (2/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_1033-274376-0_sp0.9 from training. Duration: 0.9666875 2023-10-03 09:54:59,692 WARNING [train.py:1204] (2/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_475-127455-0_sp0.9 from training. Duration: 0.77775 2023-10-03 09:54:59,699 WARNING [train.py:1204] (2/4) Exclude cut with ID IC0671W0071-331914 from training. Duration: 0.896 2023-10-03 09:54:59,719 WARNING [train.py:1204] (2/4) Exclude cut with ID _30327_210_16049_1_1532484161295_3924169_2-1523-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 09:54:59,721 WARNING [train.py:1204] (2/4) Exclude cut with ID _24314_210_4915_1_1532135078116_7170179_460-208279-0_sp1.1 from training. Duration: 0.97275 2023-10-03 09:54:59,968 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.feed_forward1.out_proj.dropout_p, batch_count=1225686.6666666667, ans=0.1 2023-10-03 09:55:01,712 WARNING [train.py:1204] (2/4) Exclude cut with ID IC0679W0052-335859 from training. Duration: 0.64 2023-10-03 09:55:02,235 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.4.encoder.layers.0.self_attn1.whiten, num_groups=1, num_channels=384, metric=14.28 vs. limit=22.5 2023-10-03 09:55:04,061 INFO [train.py:1046] (2/4) Epoch 35, batch 3250, loss[loss=0.16, simple_loss=0.2485, pruned_loss=0.03576, over 24310.00 frames. ], tot_loss[loss=0.1588, simple_loss=0.2375, pruned_loss=0.04007, over 4708020.91 frames. ], batch size: 74, lr: 2.91e-03, grad_scale: 16.0 2023-10-03 09:55:05,655 WARNING [train.py:1204] (2/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_3-237434-0_sp1.1 from training. Duration: 0.6909375 2023-10-03 09:55:09,734 WARNING [train.py:1204] (2/4) Exclude cut with ID _69884_210_9771_1_1535846392818_4166169_405-185283-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 09:55:10,580 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.2.encoder.layers.0.conv_module1.whiten, num_groups=1, num_channels=384, metric=6.56 vs. limit=15.0 2023-10-03 09:55:15,532 WARNING [train.py:1204] (2/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_927-83684-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 09:55:15,541 WARNING [train.py:1204] (2/4) Exclude cut with ID _6694_210_4599_1_1527852637490_3965529_356-169233-0 from training. Duration: 0.99 2023-10-03 09:55:16,834 WARNING [train.py:1204] (2/4) Exclude cut with ID _19896_210_1883_1_1531709022662_6649569_562-233001-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 09:55:18,676 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_20120_210_6753_1_1531643035_5482859_354-78629-0_sp1.1 from training. Duration: 0.963625 2023-10-03 09:55:18,677 WARNING [train.py:1204] (2/4) Exclude cut with ID _60135_210_13427_1_1534935557922_3651319_96-168198-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 09:55:19,026 INFO [scaling.py:1118] (2/4) WithLoss: name=encoder.encoders.2.encoder.layers.1.self_attn_weights, loss-sum=0.000e+00 2023-10-03 09:55:20,103 WARNING [train.py:1204] (2/4) Exclude cut with ID _31602_210_5854_1_1532677021456_4458250_249-96604-0_sp1.1 from training. Duration: 0.8090625 2023-10-03 09:55:20,141 WARNING [train.py:1204] (2/4) Exclude cut with ID _14100_210_8341_1_1530529320613_3678069_258-224599-0_sp0.9 from training. Duration: 0.8555625 2023-10-03 09:55:23,364 WARNING [train.py:1204] (2/4) Exclude cut with ID _19708_210_9206_1_1532136650733_4238364_215-160135-0_sp1.1 from training. Duration: 0.97275 2023-10-03 09:55:23,399 WARNING [train.py:1204] (2/4) Exclude cut with ID _21878_210_3525_1_1531738772002_7264330_113-18163-0_sp0.9 from training. Duration: 0.988875 2023-10-03 09:55:24,721 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.562e+02 2.063e+02 2.296e+02 2.650e+02 3.939e+02, threshold=4.593e+02, percent-clipped=0.0 2023-10-03 09:55:24,804 WARNING [train.py:1204] (2/4) Exclude cut with ID _64135_210_2253_1_1535677267876_7608972_552-337340-0_sp1.1 from training. Duration: 0.92725 2023-10-03 09:55:24,843 WARNING [train.py:1204] (2/4) Exclude cut with ID _67725_210_27767_1_1535608756671_5105359_10-12887-0_sp1.1 from training. Duration: 0.97275 2023-10-03 09:55:24,849 WARNING [train.py:1204] (2/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_401-284840-0_sp1.1 from training. Duration: 0.97275 2023-10-03 09:55:24,876 WARNING [train.py:1204] (2/4) Exclude cut with ID _44659_210_2721_1_1534036120061_6614860_483-141285-0_sp1.1 from training. Duration: 0.936375 2023-10-03 09:55:26,397 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.1.conv_module2.balancer1.prob, batch_count=1225820.0, ans=0.125 2023-10-03 09:55:29,464 WARNING [train.py:1204] (2/4) Exclude cut with ID _9461_210_4557_1_1529060514726_3677451_358-332734-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 09:55:30,890 WARNING [train.py:1204] (2/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_788-298907-0_sp1.1 from training. Duration: 0.8090625 2023-10-03 09:55:32,325 WARNING [train.py:1204] (2/4) Exclude cut with ID _39411_210_5986_1_1533538825030_3343649_354-267196-0_sp1.1 from training. Duration: 0.92725 2023-10-03 09:55:32,352 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_14953_210_2151_1_1530701710_5616165_192-309792-0_sp1.1 from training. Duration: 0.97275 2023-10-03 09:55:32,535 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.conv_module2.balancer2.prob, batch_count=1225886.6666666667, ans=0.125 2023-10-03 09:55:33,711 WARNING [train.py:1204] (2/4) Exclude cut with ID _59170_210_6721_1_1534983633960_4203335_290-151255-0_sp1.1 from training. Duration: 0.92725 2023-10-03 09:55:35,652 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_5575_210_1410_1_1527301305_2570170_249-93769-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 09:55:35,669 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_46013_210_7234_1_1533776362_3744032_463-52378-0_sp1.1 from training. Duration: 0.7818125 2023-10-03 09:55:39,800 WARNING [train.py:1204] (2/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_580-93798-0 from training. Duration: 0.56 2023-10-03 09:55:41,067 WARNING [train.py:1204] (2/4) Exclude cut with ID _19514_210_9259_1_1531530071330_5084230_385-13762-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 09:55:41,074 WARNING [train.py:1204] (2/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_458-67321-0_sp1.1 from training. Duration: 0.8818125 2023-10-03 09:55:42,476 WARNING [train.py:1204] (2/4) Exclude cut with ID _41298_210_9019_1_1533430371450_4822143_188-154791-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 09:55:42,558 WARNING [train.py:1204] (2/4) Exclude cut with ID _15037_210_2253_1_1530752294104_3267650_296-192597-0_sp0.9 from training. Duration: 0.9 2023-10-03 09:55:48,253 WARNING [train.py:1204] (2/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_661-264857-0_sp1.1 from training. Duration: 0.8454375 2023-10-03 09:55:54,935 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_34008_210_10431_1_1533345424_8183247_662-317331-0_sp1.1 from training. Duration: 0.8545625 2023-10-03 09:55:56,346 WARNING [train.py:1204] (2/4) Exclude cut with ID _46502_210_13794_1_1533724452198_3657159_241-278329-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 09:55:56,346 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_11349_210_6753_1_1529800861_1101207_26-65602-0 from training. Duration: 0.96 2023-10-03 09:55:56,350 WARNING [train.py:1204] (2/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_448-4048-0_sp1.1 from training. Duration: 0.87275 2023-10-03 09:55:56,359 WARNING [train.py:1204] (2/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_662-275621-0_sp1.1 from training. Duration: 0.3818125 2023-10-03 09:55:56,407 WARNING [train.py:1204] (2/4) Exclude cut with ID _13938_210_9988_1_1530518718978_3693960_256-54454-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 09:56:00,400 WARNING [train.py:1204] (2/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_256-32662-0 from training. Duration: 0.9 2023-10-03 09:56:00,440 WARNING [train.py:1204] (2/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_326-17061-0 from training. Duration: 0.94 2023-10-03 09:56:00,477 WARNING [train.py:1204] (2/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_505-97987-0_sp1.1 from training. Duration: 0.7818125 2023-10-03 09:56:02,451 WARNING [train.py:1204] (2/4) Exclude cut with ID _66873_210_5517_1_1535454136506_4293370_316-294364-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 09:56:02,524 WARNING [train.py:1204] (2/4) Exclude cut with ID _19514_210_9259_1_1531530071330_5084230_762-231063-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 09:56:02,567 WARNING [train.py:1204] (2/4) Exclude cut with ID _4697_210_2069_1_1526815801953_3823098_68-219807-0_sp0.9 from training. Duration: 0.52225 2023-10-03 09:56:03,939 WARNING [train.py:1204] (2/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_479-158053-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 09:56:08,499 WARNING [train.py:1204] (2/4) Exclude cut with ID _66112_210_10635_1_1535590236305_3801484_83-38181-0_sp1.1 from training. Duration: 0.936375 2023-10-03 09:56:08,513 WARNING [train.py:1204] (2/4) Exclude cut with ID _55857_210_2346_1_1534644007831_6898209_255-146346-0_sp1.1 from training. Duration: 0.963625 2023-10-03 09:56:09,849 WARNING [train.py:1204] (2/4) Exclude cut with ID _48836_210_12210_1_1534075848293_3904150_429-35475-0 from training. Duration: 0.89 2023-10-03 09:56:09,861 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_50154_210_13731_1_1534750862_1080961_119-187163-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 09:56:12,642 WARNING [train.py:1204] (2/4) Exclude cut with ID _8793_210_6310_1_1528542985139_2310009_121-179782-0_sp0.9 from training. Duration: 0.888875 2023-10-03 09:56:12,645 WARNING [train.py:1204] (2/4) Exclude cut with ID _14884_210_2252_1_1530707459689_5561250_591-173406-0 from training. Duration: 0.96 2023-10-03 09:56:15,350 WARNING [train.py:1204] (2/4) Exclude cut with ID _15133_210_6228_1_1530794512074_3080379_54-198193-0_sp1.1 from training. Duration: 0.8545625 2023-10-03 09:56:15,359 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6545_210_2040_1_1527774670_4245258_407-48070-0 from training. Duration: 0.76 2023-10-03 09:56:18,028 INFO [train.py:1046] (2/4) Epoch 35, batch 3300, loss[loss=0.1892, simple_loss=0.2562, pruned_loss=0.06109, over 22895.00 frames. ], tot_loss[loss=0.1599, simple_loss=0.2387, pruned_loss=0.0406, over 4704635.47 frames. ], batch size: 322, lr: 2.91e-03, grad_scale: 8.0 2023-10-03 09:56:18,092 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1032-236863-0 from training. Duration: 0.84 2023-10-03 09:56:19,473 WARNING [train.py:1204] (2/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_507-303370-0 from training. Duration: 0.96 2023-10-03 09:56:19,493 WARNING [train.py:1204] (2/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_164-282366-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 09:56:22,870 WARNING [train.py:1204] (2/4) Exclude cut with ID _39867_210_9826_1_1533553222030_9344259_312-158727-0_sp1.1 from training. Duration: 0.963625 2023-10-03 09:56:24,227 WARNING [train.py:1204] (2/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_174-205058-0_sp1.1 from training. Duration: 0.863625 2023-10-03 09:56:24,249 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_11305_210_1794_1_1529750428_7178148_166-237358-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 09:56:25,712 WARNING [train.py:1204] (2/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_26-175553-0_sp1.1 from training. Duration: 0.5545625 2023-10-03 09:56:25,738 WARNING [train.py:1204] (2/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_96-290039-0_sp1.1 from training. Duration: 0.5090625 2023-10-03 09:56:28,970 WARNING [train.py:1204] (2/4) Exclude cut with ID _53219_210_4525_1_1534834836008_4064825_261-101691-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 09:56:30,486 WARNING [train.py:1204] (2/4) Exclude cut with ID _24271_210_12658_1_1531987270022_7190519_96-293390-0_sp1.1 from training. Duration: 0.936375 2023-10-03 09:56:30,700 INFO [scaling.py:1118] (2/4) WithLoss: name=encoder.encoders.2.encoder.layers.2.self_attn_weights, loss-sum=0.000e+00 2023-10-03 09:56:35,756 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_271-234962-0 from training. Duration: 0.79 2023-10-03 09:56:35,827 WARNING [train.py:1204] (2/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_136-30660-0_sp1.1 from training. Duration: 0.97275 2023-10-03 09:56:35,941 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.0.feed_forward2.hidden_balancer.prob, batch_count=1226153.3333333333, ans=0.125 2023-10-03 09:56:37,194 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_20120_210_6753_1_1531643035_5482859_625-281046-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 09:56:38,670 WARNING [train.py:1204] (2/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_333-168193-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 09:56:39,577 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.0.layers.1.nonlin_attention.whiten1, num_groups=1, num_channels=144, metric=5.12 vs. limit=10.0 2023-10-03 09:56:40,054 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9029W0269-509664 from training. Duration: 0.896 2023-10-03 09:56:41,381 WARNING [train.py:1204] (2/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_450-147393-0_sp1.1 from training. Duration: 0.92725 2023-10-03 09:56:41,442 WARNING [train.py:1204] (2/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_643-52007-0_sp1.1 from training. Duration: 0.5181875 2023-10-03 09:56:42,842 WARNING [train.py:1204] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_330-327861-0_sp0.9 from training. Duration: 0.7444375 2023-10-03 09:56:42,858 WARNING [train.py:1204] (2/4) Exclude cut with ID _47790_210_16187_1_1533862814066_4207519_260-317651-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 09:56:44,144 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9044W0010-516654 from training. Duration: 0.896 2023-10-03 09:56:48,163 WARNING [train.py:1204] (2/4) Exclude cut with ID _14087_210_2253_1_1530529224512_3014029_265-118378-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 09:56:48,177 WARNING [train.py:1204] (2/4) Exclude cut with ID _6194_210_1534_1_1527511826160_3941194_321-148641-0_sp0.9 from training. Duration: 0.711125 2023-10-03 09:56:50,012 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.4.encoder.layers.2.whiten, num_groups=1, num_channels=384, metric=3.56 vs. limit=12.0 2023-10-03 09:56:50,825 WARNING [train.py:1204] (2/4) Exclude cut with ID _54871_210_22356_1_1534665798253_3856359_101-169176-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 09:56:50,827 WARNING [train.py:1204] (2/4) Exclude cut with ID 497-129325-0061-62254-0_sp1.1 from training. Duration: 0.97725 2023-10-03 09:56:52,198 WARNING [train.py:1204] (2/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_795-33252-0_sp1.1 from training. Duration: 0.336375 2023-10-03 09:56:52,219 WARNING [train.py:1204] (2/4) Exclude cut with ID _28317_210_12742_1_1532480302381_8510325_168-246176-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 09:56:53,638 WARNING [train.py:1204] (2/4) Exclude cut with ID _7160_210_1876_1_1527940923231_3703799_120-150943-0_sp1.1 from training. Duration: 0.736375 2023-10-03 09:56:56,278 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9084W0005-536844 from training. Duration: 0.768 2023-10-03 09:56:56,405 WARNING [train.py:1204] (2/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_234-260682-0 from training. Duration: 0.84 2023-10-03 09:56:58,303 WARNING [train.py:1204] (2/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_188-114464-0_sp0.9 from training. Duration: 0.911125 2023-10-03 09:57:01,075 WARNING [train.py:1204] (2/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_81-75849-0 from training. Duration: 0.39 2023-10-03 09:57:02,648 WARNING [train.py:1204] (2/4) Exclude cut with ID _14224_210_4381_1_1530529062037_4264010_379-184223-0_sp1.1 from training. Duration: 0.77275 2023-10-03 09:57:04,608 WARNING [train.py:1204] (2/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_454-123002-0_sp1.1 from training. Duration: 0.62725 2023-10-03 09:57:05,906 WARNING [train.py:1204] (2/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_887-249555-0_sp1.1 from training. Duration: 0.7 2023-10-03 09:57:09,324 WARNING [train.py:1204] (2/4) Exclude cut with ID _31088_210_16446_1_1532520012284_3440919_20-260902-0_sp1.1 from training. Duration: 0.97275 2023-10-03 09:57:09,366 WARNING [train.py:1204] (2/4) Exclude cut with ID _45329_210_7005_1_1534157951211_6774339_856-344574-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 09:57:09,368 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_36756_210_7234_1_1533026991_966276_4-164118-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 09:57:09,388 WARNING [train.py:1204] (2/4) Exclude cut with ID _8365_210_2064_1_1528592311070_4677690_371-122295-0_sp1.1 from training. Duration: 0.5 2023-10-03 09:57:11,971 WARNING [train.py:1204] (2/4) Exclude cut with ID _5910_210_1389_1_1527337972454_5050100_555-159815-0_sp1.1 from training. Duration: 0.8909375 2023-10-03 09:57:11,989 WARNING [train.py:1204] (2/4) Exclude cut with ID _37086_210_9038_1_1532999540891_7021250_469-147941-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 09:57:12,060 WARNING [train.py:1204] (2/4) Exclude cut with ID _3067_210_2667_1_1526212974640_4503168_459-173867-0_sp0.9 from training. Duration: 0.97775 2023-10-03 09:57:13,460 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9156W0006-572499 from training. Duration: 0.896 2023-10-03 09:57:13,556 WARNING [train.py:1204] (2/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_277-210798-0 from training. Duration: 0.55 2023-10-03 09:57:16,310 WARNING [train.py:1204] (2/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_103-336064-0_sp1.1 from training. Duration: 0.463625 2023-10-03 09:57:17,663 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_3414_210_1988_1_1526212718_4813195_331-290945-0_sp1.1 from training. Duration: 0.936375 2023-10-03 09:57:17,664 WARNING [train.py:1204] (2/4) Exclude cut with ID _67499_210_17282_1_1535509805220_3692430_365-29276-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 09:57:19,008 WARNING [train.py:1204] (2/4) Exclude cut with ID _14821_210_8750_1_1530680503991_6829359_69-250847-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 09:57:19,009 WARNING [train.py:1204] (2/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_1102-61301-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 09:57:20,327 WARNING [train.py:1204] (2/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_166-246557-0_sp1.1 from training. Duration: 0.5818125 2023-10-03 09:57:20,393 WARNING [train.py:1204] (2/4) Exclude cut with ID _15351_210_10626_1_1530856635653_3576210_214-129918-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 09:57:20,405 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_120-41668-0_sp0.9 from training. Duration: 0.77775 2023-10-03 09:57:21,780 WARNING [train.py:1204] (2/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_27-72278-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 09:57:24,531 WARNING [train.py:1204] (2/4) Exclude cut with ID _24314_210_4915_1_1532135078116_7170179_521-258868-0_sp1.1 from training. Duration: 0.7181875 2023-10-03 09:57:25,191 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.3.encoder.layers.2.whiten, num_groups=1, num_channels=512, metric=5.23 vs. limit=12.0 2023-10-03 09:57:27,375 WARNING [train.py:1204] (2/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_143-86344-0 from training. Duration: 0.77 2023-10-03 09:57:29,148 WARNING [train.py:1204] (2/4) Exclude cut with ID _53489_210_6228_1_1534298357169_3835970_465-157433-0_sp1.1 from training. Duration: 0.92725 2023-10-03 09:57:30,444 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_248-319353-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 09:57:31,612 INFO [train.py:1046] (2/4) Epoch 35, batch 3350, loss[loss=0.1642, simple_loss=0.2385, pruned_loss=0.04492, over 23781.00 frames. ], tot_loss[loss=0.1606, simple_loss=0.2396, pruned_loss=0.04083, over 4716835.93 frames. ], batch size: 195, lr: 2.91e-03, grad_scale: 8.0 2023-10-03 09:57:33,017 WARNING [train.py:1204] (2/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_102-77064-0_sp1.1 from training. Duration: 0.8454375 2023-10-03 09:57:33,068 WARNING [train.py:1204] (2/4) Exclude cut with ID _9389_210_1664_1_1529217337301_5289269_379-128794-0_sp1.1 from training. Duration: 0.77275 2023-10-03 09:57:34,514 WARNING [train.py:1204] (2/4) Exclude cut with ID _3231_210_2106_1_1526205864934_6784220_236-260835-0_sp1.1 from training. Duration: 0.97275 2023-10-03 09:57:35,923 WARNING [train.py:1204] (2/4) Exclude cut with ID _24271_210_12658_1_1531987270022_7190519_184-342363-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 09:57:35,924 WARNING [train.py:1204] (2/4) Exclude cut with ID _27894_210_12392_1_1532415496078_6902439_67-221663-0_sp1.1 from training. Duration: 0.92725 2023-10-03 09:57:40,280 WARNING [train.py:1204] (2/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_304-59227-0_sp1.1 from training. Duration: 0.7 2023-10-03 09:57:40,382 WARNING [train.py:1204] (2/4) Exclude cut with ID _58605_210_12210_1_1534723195939_3904259_458-42232-0_sp1.1 from training. Duration: 0.92725 2023-10-03 09:57:41,693 WARNING [train.py:1204] (2/4) Exclude cut with ID _14922_210_6228_1_1530707435273_3693310_13-165512-0_sp0.9 from training. Duration: 0.911125 2023-10-03 09:57:43,105 WARNING [train.py:1204] (2/4) Exclude cut with ID _58602_210_2346_1_1534903210085_6908830_792-102241-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 09:57:44,517 WARNING [train.py:1204] (2/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_588-125165-0_sp0.9 from training. Duration: 0.92225 2023-10-03 09:57:45,875 WARNING [train.py:1204] (2/4) Exclude cut with ID _54218_210_12210_1_1534413646388_4409259_239-98429-0_sp1.1 from training. Duration: 0.97275 2023-10-03 09:57:47,072 WARNING [train.py:1204] (2/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_538-32291-0_sp1.1 from training. Duration: 0.7545625 2023-10-03 09:57:48,383 WARNING [train.py:1204] (2/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_419-317062-0 from training. Duration: 0.91 2023-10-03 09:57:49,884 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9309W0202-640329 from training. Duration: 0.896 2023-10-03 09:57:49,927 WARNING [train.py:1204] (2/4) Exclude cut with ID _3231_210_2106_1_1526205864934_6784220_419-313291-0_sp1.1 from training. Duration: 0.97275 2023-10-03 09:57:50,084 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.balancer2.prob, batch_count=1226486.6666666667, ans=0.125 2023-10-03 09:57:52,674 WARNING [train.py:1204] (2/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_619-344499-0 from training. Duration: 0.63 2023-10-03 09:57:52,692 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_278-272612-0 from training. Duration: 0.73 2023-10-03 09:57:53,948 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.535e+02 1.867e+02 2.273e+02 2.647e+02 3.558e+02, threshold=4.546e+02, percent-clipped=0.0 2023-10-03 09:57:54,057 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_103-84010-0_sp0.9 from training. Duration: 0.7666875 2023-10-03 09:57:54,085 WARNING [train.py:1204] (2/4) Exclude cut with ID _26528_210_9070_1_1532570410842_3851310_497-97218-0_sp1.1 from training. Duration: 0.7818125 2023-10-03 09:57:55,376 WARNING [train.py:1204] (2/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_262-84613-0_sp1.1 from training. Duration: 0.936375 2023-10-03 09:57:55,414 WARNING [train.py:1204] (2/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_387-131538-0 from training. Duration: 0.81 2023-10-03 09:57:55,437 WARNING [train.py:1204] (2/4) Exclude cut with ID _8030_210_7497_1_1528444886781_3531089_224-300489-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 09:57:55,625 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=1226486.6666666667, ans=0.1 2023-10-03 09:57:56,744 WARNING [train.py:1204] (2/4) Exclude cut with ID _14349_210_8341_1_1530615710818_3442470_170-336541-0_sp0.9 from training. Duration: 0.9555625 2023-10-03 09:57:56,883 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_47069_210_15368_1_1533781267_4431320_180-31746-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 09:58:00,052 WARNING [train.py:1204] (2/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_447-7041-0_sp1.1 from training. Duration: 0.92725 2023-10-03 09:58:00,085 WARNING [train.py:1204] (2/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_2-55283-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 09:58:01,488 WARNING [train.py:1204] (2/4) Exclude cut with ID _12555_210_8750_1_1530075594402_3780239_70-181911-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 09:58:04,192 WARNING [train.py:1204] (2/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_311-328022-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 09:58:08,113 WARNING [train.py:1204] (2/4) Exclude cut with ID _14528_210_8254_1_1530669700514_4306939_330-2987-0_sp1.1 from training. Duration: 0.92725 2023-10-03 09:58:08,178 WARNING [train.py:1204] (2/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_385-184858-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 09:58:11,604 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.nonlin_attention.balancer.prob, batch_count=1226553.3333333333, ans=0.125 2023-10-03 09:58:12,827 WARNING [train.py:1204] (2/4) Exclude cut with ID _64414_210_17301_1_1535243252048_3757759_348-333360-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 09:58:12,893 WARNING [train.py:1204] (2/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_753-214751-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 09:58:14,379 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_17613_210_2151_1_1531137569_2366160_202-208013-0_sp1.1 from training. Duration: 0.92725 2023-10-03 09:58:15,677 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_43831_210_2158_1_1533725502_5872791_610-129300-0_sp1.1 from training. Duration: 0.936375 2023-10-03 09:58:17,111 WARNING [train.py:1204] (2/4) Exclude cut with ID _54218_210_12210_1_1534413646388_4409259_321-23852-0_sp1.1 from training. Duration: 0.936375 2023-10-03 09:58:17,343 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.bypass_mid.scale_min, batch_count=1226620.0, ans=0.2 2023-10-03 09:58:18,685 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.balancer2.prob, batch_count=1226620.0, ans=0.125 2023-10-03 09:58:19,780 WARNING [train.py:1204] (2/4) Exclude cut with ID _39411_210_5986_1_1533538825030_3343649_173-238248-0 from training. Duration: 0.83 2023-10-03 09:58:19,794 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_405-339986-0_sp0.9 from training. Duration: 0.5666875 2023-10-03 09:58:19,827 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_2009_210_2071_1_1525481134_4588937_385-302100-0 from training. Duration: 0.87 2023-10-03 09:58:19,858 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_161-276996-0_sp1.1 from training. Duration: 0.863625 2023-10-03 09:58:21,208 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_177-58005-0 from training. Duration: 0.62 2023-10-03 09:58:22,625 WARNING [train.py:1204] (2/4) Exclude cut with ID _64639_210_5816_1_1535367619437_3528000_146-343734-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 09:58:24,014 WARNING [train.py:1204] (2/4) Exclude cut with ID _3845_210_1390_1_1526731326141_3523610_72-61441-0_sp1.1 from training. Duration: 0.92725 2023-10-03 09:58:30,987 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.conv_module2.balancer1.prob, batch_count=1226620.0, ans=0.125 2023-10-03 09:58:32,056 WARNING [train.py:1204] (2/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_991-125028-0_sp1.1 from training. Duration: 0.936375 2023-10-03 09:58:33,950 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7544_210_6753_1_1528591327_1471426_172-348777-0 from training. Duration: 0.66 2023-10-03 09:58:33,997 WARNING [train.py:1204] (2/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_416-257730-0_sp1.1 from training. Duration: 0.5909375 2023-10-03 09:58:34,258 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=1226686.6666666667, ans=0.1 2023-10-03 09:58:35,188 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1113-7369-0_sp1.1 from training. Duration: 0.77275 2023-10-03 09:58:38,327 WARNING [train.py:1204] (2/4) Exclude cut with ID _40402_210_18219_1_1533522324115_2953150_242-76943-0_sp1.1 from training. Duration: 0.8545625 2023-10-03 09:58:41,760 WARNING [train.py:1204] (2/4) Exclude cut with ID _44718_210_5816_1_1534050051869_6469030_190-49648-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 09:58:43,743 WARNING [train.py:1204] (2/4) Exclude cut with ID _47251_210_18628_1_1533814063128_4137250_214-88076-0 from training. Duration: 0.88 2023-10-03 09:58:45,071 WARNING [train.py:1204] (2/4) Exclude cut with ID _5585_210_2667_1_1527418813324_8007340_89-332072-0_sp1.1 from training. Duration: 0.5818125 2023-10-03 09:58:46,330 WARNING [train.py:1204] (2/4) Exclude cut with ID _41817_210_18835_1_1533434477160_7423049_555-293448-0_sp1.1 from training. Duration: 0.72725 2023-10-03 09:58:47,636 WARNING [train.py:1204] (2/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_286-331314-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 09:58:48,890 INFO [train.py:1046] (2/4) Epoch 35, batch 3400, loss[loss=0.1826, simple_loss=0.2558, pruned_loss=0.05463, over 23875.00 frames. ], tot_loss[loss=0.1621, simple_loss=0.2412, pruned_loss=0.04143, over 4707902.82 frames. ], batch size: 212, lr: 2.91e-03, grad_scale: 8.0 2023-10-03 09:58:48,941 WARNING [train.py:1204] (2/4) Exclude cut with ID _13921_210_9994_1_1530838702800_3003429_46-149444-0 from training. Duration: 0.94 2023-10-03 09:58:49,002 WARNING [train.py:1204] (2/4) Exclude cut with ID _44778_210_2054_1_1533798127203_7443090_768-43595-0_sp1.1 from training. Duration: 0.936375 2023-10-03 09:58:49,014 WARNING [train.py:1204] (2/4) Exclude cut with ID _35355_210_16040_1_1533212988977_4016229_74-190076-0 from training. Duration: 0.8 2023-10-03 09:58:50,329 WARNING [train.py:1204] (2/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_1007-316380-0_sp1.1 from training. Duration: 0.963625 2023-10-03 09:58:51,638 WARNING [train.py:1204] (2/4) Exclude cut with ID _23557_210_8750_1_1531890106370_6846255_150-283437-0_sp1.1 from training. Duration: 0.963625 2023-10-03 09:58:51,674 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_3474_210_2028_1_1526188513_4825246_145-309131-0_sp1.1 from training. Duration: 0.67275 2023-10-03 09:58:53,018 WARNING [train.py:1204] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_391-22148-0_sp1.1 from training. Duration: 0.7 2023-10-03 09:58:53,034 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_238-256668-0 from training. Duration: 0.69 2023-10-03 09:58:57,303 WARNING [train.py:1204] (2/4) Exclude cut with ID _8394_210_6702_1_1528590504316_3540189_372-198878-0 from training. Duration: 0.6 2023-10-03 09:58:57,316 WARNING [train.py:1204] (2/4) Exclude cut with ID ID0174W0006-766659 from training. Duration: 0.953 2023-10-03 09:58:57,339 WARNING [train.py:1204] (2/4) Exclude cut with ID _6274_210_2084_1_1527937583837_7494020_668-23019-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 09:59:01,497 WARNING [train.py:1204] (2/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_322-74328-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 09:59:01,506 WARNING [train.py:1204] (2/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_872-264525-0_sp1.1 from training. Duration: 0.5909375 2023-10-03 09:59:01,548 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_53378_210_17282_1_1534221071_5769509_641-286201-0_sp1.1 from training. Duration: 0.97275 2023-10-03 09:59:01,635 WARNING [train.py:1204] (2/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_199-295611-0_sp1.1 from training. Duration: 0.5 2023-10-03 09:59:08,147 WARNING [train.py:1204] (2/4) Exclude cut with ID _5250_210_5219_1_1527249561806_8692110_1270-317971-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 09:59:08,264 WARNING [train.py:1204] (2/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_784-213099-0 from training. Duration: 0.93 2023-10-03 09:59:16,131 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_115-233929-0_sp0.9 from training. Duration: 0.8 2023-10-03 09:59:18,964 WARNING [train.py:1204] (2/4) Exclude cut with ID _54871_210_22356_1_1534665798253_3856359_256-223809-0_sp1.1 from training. Duration: 0.97275 2023-10-03 09:59:19,016 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_20120_210_6753_1_1531643035_5482859_285-87781-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 09:59:20,466 WARNING [train.py:1204] (2/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_619-344499-0_sp1.1 from training. Duration: 0.57275 2023-10-03 09:59:27,316 WARNING [train.py:1204] (2/4) Exclude cut with ID _60229_210_27767_1_1534838406158_3655350_264-279573-0_sp0.9 from training. Duration: 0.92225 2023-10-03 09:59:30,188 WARNING [train.py:1204] (2/4) Exclude cut with ID _7719_210_3525_1_1528456153163_4296960_333-110399-0 from training. Duration: 0.96 2023-10-03 09:59:34,372 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_50423_210_13270_1_1534128598_5021734_759-90833-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 09:59:36,277 WARNING [train.py:1204] (2/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_151-254798-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 09:59:36,308 WARNING [train.py:1204] (2/4) Exclude cut with ID _3465_210_3525_1_1526175667060_3574049_450-203637-0 from training. Duration: 0.92 2023-10-03 09:59:36,350 WARNING [train.py:1204] (2/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_673-36456-0_sp1.1 from training. Duration: 0.936375 2023-10-03 09:59:36,542 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.conv_module1.balancer1.max_abs, batch_count=1226953.3333333333, ans=10.0 2023-10-03 09:59:38,214 WARNING [train.py:1204] (2/4) Exclude cut with ID _36538_210_9994_1_1533204020363_3795620_131-8685-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 09:59:38,267 WARNING [train.py:1204] (2/4) Exclude cut with ID _12556_210_8341_1_1530081024100_7327690_713-55626-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 09:59:39,613 WARNING [train.py:1204] (2/4) Exclude cut with ID _2322_210_2667_1_1525604548971_3656299_440-80494-0_sp0.9 from training. Duration: 0.97775 2023-10-03 09:59:42,416 WARNING [train.py:1204] (2/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_210-325081-0_sp1.1 from training. Duration: 0.97275 2023-10-03 09:59:45,735 WARNING [train.py:1204] (2/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_375-244976-0_sp0.9 from training. Duration: 0.8333125 2023-10-03 09:59:45,741 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_15517_210_6753_1_1530952293_1664795_119-31001-0_sp1.1 from training. Duration: 0.8545625 2023-10-03 09:59:50,414 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_41452_210_7291_1_1533437687_3941281_319-42326-0_sp1.1 from training. Duration: 0.92725 2023-10-03 09:59:51,795 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_372-173012-0 from training. Duration: 0.95 2023-10-03 09:59:57,327 WARNING [train.py:1204] (2/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_41-31157-0_sp1.1 from training. Duration: 0.5181875 2023-10-03 10:00:02,663 INFO [train.py:1046] (2/4) Epoch 35, batch 3450, loss[loss=0.1618, simple_loss=0.2395, pruned_loss=0.04207, over 23232.00 frames. ], tot_loss[loss=0.1619, simple_loss=0.2407, pruned_loss=0.04155, over 4704878.88 frames. ], batch size: 105, lr: 2.91e-03, grad_scale: 8.0 2023-10-03 10:00:02,719 WARNING [train.py:1204] (2/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_91-212486-0 from training. Duration: 0.68 2023-10-03 10:00:07,390 WARNING [train.py:1204] (2/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_1267-272693-0 from training. Duration: 0.73 2023-10-03 10:00:09,408 WARNING [train.py:1204] (2/4) Exclude cut with ID _13938_210_9988_1_1530518718978_3693960_152-67046-0_sp1.1 from training. Duration: 0.936375 2023-10-03 10:00:09,543 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_4528_210_3273_1_1527076654_3276777_342-168068-0_sp1.1 from training. Duration: 0.7909375 2023-10-03 10:00:09,552 WARNING [train.py:1204] (2/4) Exclude cut with ID _7606_210_4915_1_1528894253277_5080690_127-177761-0 from training. Duration: 0.64 2023-10-03 10:00:10,879 WARNING [train.py:1204] (2/4) Exclude cut with ID _45329_210_7005_1_1534157951211_6774339_335-137972-0_sp1.1 from training. Duration: 0.92725 2023-10-03 10:00:11,142 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.bypass.skip_rate, batch_count=1227086.6666666667, ans=0.07 2023-10-03 10:00:14,684 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.0.layers.0.nonlin_attention.whiten1, num_groups=1, num_channels=144, metric=9.76 vs. limit=10.0 2023-10-03 10:00:15,046 WARNING [train.py:1204] (2/4) Exclude cut with ID _5166_210_2084_1_1527073197160_4087993_331-117829-0_sp1.1 from training. Duration: 0.563625 2023-10-03 10:00:18,590 WARNING [train.py:1204] (2/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_125-125818-0_sp1.1 from training. Duration: 0.87275 2023-10-03 10:00:18,636 WARNING [train.py:1204] (2/4) Exclude cut with ID _6789_210_1534_1_1527854471671_2870519_48-294897-0_sp1.1 from training. Duration: 0.963625 2023-10-03 10:00:19,920 WARNING [train.py:1204] (2/4) Exclude cut with ID _6694_210_4599_1_1527852637490_3965529_116-109398-0_sp1.1 from training. Duration: 0.663625 2023-10-03 10:00:19,930 WARNING [train.py:1204] (2/4) Exclude cut with ID _52892_210_6721_1_1534378941259_4124129_222-172685-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 10:00:20,224 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=1227153.3333333333, ans=0.1 2023-10-03 10:00:23,150 WARNING [train.py:1204] (2/4) Exclude cut with ID _50655_210_18628_1_1534229942193_3772154_320-132275-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 10:00:27,176 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.503e+02 1.826e+02 1.965e+02 2.200e+02 4.257e+02, threshold=3.929e+02, percent-clipped=0.0 2023-10-03 10:00:28,628 WARNING [train.py:1204] (2/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_76-311963-0 from training. Duration: 0.7 2023-10-03 10:00:31,668 WARNING [train.py:1204] (2/4) Exclude cut with ID _6274_210_2084_1_1527937583837_7494020_32-205288-0 from training. Duration: 0.78 2023-10-03 10:00:32,976 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_86-16881-0_sp0.9 from training. Duration: 0.6444375 2023-10-03 10:00:33,022 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_271-297505-0_sp1.1 from training. Duration: 0.9 2023-10-03 10:00:33,187 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.bypass.skip_rate, batch_count=1227220.0, ans=0.07 2023-10-03 10:00:34,798 WARNING [train.py:1204] (2/4) Exclude cut with ID _57572_210_5517_1_1534921219624_3809484_187-295293-0_sp1.1 from training. Duration: 0.963625 2023-10-03 10:00:36,445 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.attention_skip_rate, batch_count=1227220.0, ans=0.0 2023-10-03 10:00:39,384 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder_embed.conv.5.prob, batch_count=1227220.0, ans=0.125 2023-10-03 10:00:40,680 WARNING [train.py:1204] (2/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_563-329943-0 from training. Duration: 0.99 2023-10-03 10:00:42,052 WARNING [train.py:1204] (2/4) Exclude cut with ID _7160_210_1876_1_1527940923231_3703799_302-150728-0_sp0.9 from training. Duration: 0.8666875 2023-10-03 10:00:46,522 WARNING [train.py:1204] (2/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_926-7526-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 10:00:46,551 WARNING [train.py:1204] (2/4) Exclude cut with ID _62574_210_2655_1_1535180407058_3933249_467-22348-0_sp1.1 from training. Duration: 0.936375 2023-10-03 10:00:49,285 WARNING [train.py:1204] (2/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_280-161633-0_sp0.9 from training. Duration: 0.811125 2023-10-03 10:00:50,632 WARNING [train.py:1204] (2/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_385-193052-0_sp1.1 from training. Duration: 0.7818125 2023-10-03 10:00:50,736 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.0.nonlin_attention.balancer.max_positive, batch_count=1227286.6666666667, ans=0.95 2023-10-03 10:00:52,023 WARNING [train.py:1204] (2/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_444-28041-0 from training. Duration: 0.85 2023-10-03 10:00:52,028 WARNING [train.py:1204] (2/4) Exclude cut with ID _66070_210_7640_1_1535441393096_7805310_341-249237-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 10:00:52,333 INFO [scaling.py:1118] (2/4) WithLoss: name=encoder.encoders.4.encoder.layers.2.self_attn_weights, loss-sum=0.000e+00 2023-10-03 10:00:55,223 WARNING [train.py:1204] (2/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_425-269826-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 10:00:56,627 WARNING [train.py:1204] (2/4) Exclude cut with ID _53950_210_5816_1_1534417264919_3862951_124-185597-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 10:00:56,928 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.ff3_skip_rate, batch_count=1227286.6666666667, ans=0.0 2023-10-03 10:00:59,524 WARNING [train.py:1204] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_380-204756-0 from training. Duration: 0.86 2023-10-03 10:01:02,371 WARNING [train.py:1204] (2/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_282-257791-0_sp0.9 from training. Duration: 0.9555625 2023-10-03 10:01:07,066 WARNING [train.py:1204] (2/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_372-131163-0_sp1.1 from training. Duration: 0.8545625 2023-10-03 10:01:08,410 WARNING [train.py:1204] (2/4) Exclude cut with ID _28317_210_12742_1_1532480302381_8510325_447-233513-0_sp1.1 from training. Duration: 0.963625 2023-10-03 10:01:12,840 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_54-118333-0_sp1.1 from training. Duration: 0.97275 2023-10-03 10:01:15,748 WARNING [train.py:1204] (2/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_388-230239-0_sp1.1 from training. Duration: 0.963625 2023-10-03 10:01:15,765 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_40047_210_2290_1_1533290519_2374311_141-54639-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 10:01:17,065 INFO [train.py:1046] (2/4) Epoch 35, batch 3500, loss[loss=0.1635, simple_loss=0.2479, pruned_loss=0.03954, over 24371.00 frames. ], tot_loss[loss=0.1604, simple_loss=0.2391, pruned_loss=0.04082, over 4700800.43 frames. ], batch size: 77, lr: 2.91e-03, grad_scale: 8.0 2023-10-03 10:01:17,156 WARNING [train.py:1204] (2/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_91-267696-0_sp0.9 from training. Duration: 0.9666875 2023-10-03 10:01:19,113 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_14056_210_4163_1_1530604483_7572827_532-137847-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 10:01:21,844 WARNING [train.py:1204] (2/4) Exclude cut with ID _8064_210_5399_1_1528372299291_4282973_155-283231-0_sp1.1 from training. Duration: 0.97275 2023-10-03 10:01:24,541 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_2245_210_2715_1_1525511548_3540254_310-52483-0_sp1.1 from training. Duration: 0.8 2023-10-03 10:01:24,789 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.ff2_skip_rate, batch_count=1227420.0, ans=0.0 2023-10-03 10:01:26,444 WARNING [train.py:1204] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_185-174805-0 from training. Duration: 0.68 2023-10-03 10:01:26,724 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.1.feed_forward1.out_proj.dropout_p, batch_count=1227420.0, ans=0.1 2023-10-03 10:01:27,809 WARNING [train.py:1204] (2/4) Exclude cut with ID _15012_210_1968_1_1530856820429_5095350_662-210469-0_sp1.1 from training. Duration: 0.5181875 2023-10-03 10:01:28,038 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.conv_skip_rate, batch_count=1227420.0, ans=0.0 2023-10-03 10:01:30,594 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_426-313557-0_sp0.9 from training. Duration: 0.588875 2023-10-03 10:01:33,328 WARNING [train.py:1204] (2/4) Exclude cut with ID _42326_210_7770_1_1533814282531_3707269_407-204343-0_sp1.1 from training. Duration: 0.97275 2023-10-03 10:01:33,336 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_258-310570-0 from training. Duration: 0.82 2023-10-03 10:01:39,276 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_4737_210_2040_1_1526998689_3293008_378-178650-0_sp1.1 from training. Duration: 0.863625 2023-10-03 10:01:39,364 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_47896_210_20508_1_1534492518_5958868_432-327451-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 10:01:40,087 WARNING [train.py:1204] (2/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_188-114464-0_sp1.1 from training. Duration: 0.7454375 2023-10-03 10:01:40,094 WARNING [train.py:1204] (2/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_798-106179-0_sp1.1 from training. Duration: 0.7545625 2023-10-03 10:01:40,245 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.feed_forward2.hidden_balancer.prob, batch_count=1227486.6666666667, ans=0.125 2023-10-03 10:01:41,356 WARNING [train.py:1204] (2/4) Exclude cut with ID _14368_210_4220_1_1530532714870_3595529_143-117421-0_sp1.1 from training. Duration: 0.52725 2023-10-03 10:01:41,399 WARNING [train.py:1204] (2/4) Exclude cut with ID _67499_210_17282_1_1535509805220_3692430_361-78786-0_sp1.1 from training. Duration: 0.963625 2023-10-03 10:01:41,659 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.conv_module2.balancer1.prob, batch_count=1227486.6666666667, ans=0.125 2023-10-03 10:01:42,723 WARNING [train.py:1204] (2/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_712-342563-0_sp1.1 from training. Duration: 0.8818125 2023-10-03 10:01:42,765 WARNING [train.py:1204] (2/4) Exclude cut with ID _9235_210_5219_1_1528891154253_8046120_387-159008-0 from training. Duration: 0.86 2023-10-03 10:01:43,793 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.0.layers.0.feed_forward2.out_whiten, num_groups=1, num_channels=192, metric=5.86 vs. limit=15.0 2023-10-03 10:01:45,622 WARNING [train.py:1204] (2/4) Exclude cut with ID _33640_210_2655_1_1533117605479_4855469_959-18820-0_sp1.1 from training. Duration: 0.963625 2023-10-03 10:01:45,655 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_12868_210_6753_1_1530702479_3610564_133-564-0_sp0.9 from training. Duration: 0.87775 2023-10-03 10:01:46,921 WARNING [train.py:1204] (2/4) Exclude cut with ID _23514_210_1389_1_1531832139785_3031439_12-29287-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 10:01:50,302 WARNING [train.py:1204] (2/4) Exclude cut with ID _23514_210_1389_1_1531832139785_3031439_489-185052-0_sp1.1 from training. Duration: 0.963625 2023-10-03 10:01:50,372 WARNING [train.py:1204] (2/4) Exclude cut with ID _5444_210_2151_1_1527418808464_5312210_634-194174-0 from training. Duration: 0.97 2023-10-03 10:01:51,656 WARNING [train.py:1204] (2/4) Exclude cut with ID _6275_210_2084_1_1528110062213_7669049_91-26919-0_sp1.1 from training. Duration: 0.8818125 2023-10-03 10:01:53,065 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_37351_210_7925_1_1533012931_3812113_516-82303-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 10:01:54,276 WARNING [train.py:1204] (2/4) Exclude cut with ID _41314_210_12651_1_1533459476543_3766460_80-74184-0_sp0.9 from training. Duration: 0.92225 2023-10-03 10:01:55,751 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_9460_210_4211_1_1528954587_10049667_495-202963-0_sp1.1 from training. Duration: 0.963625 2023-10-03 10:01:58,700 WARNING [train.py:1204] (2/4) Exclude cut with ID _14632_210_9988_1_1530691279392_3923200_332-105904-0_sp1.1 from training. Duration: 0.8181875 2023-10-03 10:01:59,947 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_40764_210_1925_1_1533882359_3752345_493-167886-0_sp1.1 from training. Duration: 0.92725 2023-10-03 10:02:01,352 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_196-289637-0 from training. Duration: 0.75 2023-10-03 10:02:01,432 WARNING [train.py:1204] (2/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_26-175553-0 from training. Duration: 0.61 2023-10-03 10:02:02,720 WARNING [train.py:1204] (2/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_890-5117-0 from training. Duration: 0.78 2023-10-03 10:02:02,773 WARNING [train.py:1204] (2/4) Exclude cut with ID _41314_210_12651_1_1533459476543_3766460_206-306860-0_sp1.1 from training. Duration: 0.92725 2023-10-03 10:02:04,222 WARNING [train.py:1204] (2/4) Exclude cut with ID _25882_210_3036_1_1532775665815_6479510_570-79824-0_sp1.1 from training. Duration: 0.963625 2023-10-03 10:02:05,604 WARNING [train.py:1204] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_731-2428-0_sp1.1 from training. Duration: 0.7545625 2023-10-03 10:02:05,637 WARNING [train.py:1204] (2/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_138-154812-0_sp0.9 from training. Duration: 0.8666875 2023-10-03 10:02:10,106 WARNING [train.py:1204] (2/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_178-39722-0_sp0.9 from training. Duration: 0.5444375 2023-10-03 10:02:10,167 WARNING [train.py:1204] (2/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_1285-256622-0_sp0.9 from training. Duration: 0.9333125 2023-10-03 10:02:14,852 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_41428_210_2161_1_1533774184_3992532_65-271323-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 10:02:16,365 WARNING [train.py:1204] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_308-161067-0 from training. Duration: 0.66 2023-10-03 10:02:16,370 WARNING [train.py:1204] (2/4) Exclude cut with ID _3067_210_2667_1_1526212974640_4503168_459-173867-0 from training. Duration: 0.88 2023-10-03 10:02:16,371 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_2221_210_1988_1_1525597051_4935541_415-296882-0_sp1.1 from training. Duration: 0.7 2023-10-03 10:02:19,705 WARNING [train.py:1204] (2/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_354-192266-0_sp1.1 from training. Duration: 0.936375 2023-10-03 10:02:21,001 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_258-310570-0_sp0.9 from training. Duration: 0.911125 2023-10-03 10:02:22,392 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_548-141039-0_sp1.1 from training. Duration: 0.963625 2023-10-03 10:02:23,748 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_15101_210_7927_1_1530786415_5033274_320-327527-0 from training. Duration: 0.96 2023-10-03 10:02:23,805 WARNING [train.py:1204] (2/4) Exclude cut with ID _2895_210_2477_1_1525956785794_4063750_289-11475-0_sp0.9 from training. Duration: 0.911125 2023-10-03 10:02:24,051 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.balancer1.prob, batch_count=1227686.6666666667, ans=0.125 2023-10-03 10:02:25,237 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7543_210_6753_1_1528543318_4552_1-85072-0_sp1.1 from training. Duration: 0.7545625 2023-10-03 10:02:25,310 WARNING [train.py:1204] (2/4) Exclude cut with ID _64639_210_5816_1_1535367619437_3528000_461-329156-0 from training. Duration: 0.96 2023-10-03 10:02:26,034 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.2.encoder.layers.2.self_attn1.whiten, num_groups=1, num_channels=384, metric=18.39 vs. limit=22.5 2023-10-03 10:02:26,878 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.balancer_ff2.min_abs, batch_count=1227686.6666666667, ans=0.1 2023-10-03 10:02:28,446 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_211-339622-0 from training. Duration: 0.45 2023-10-03 10:02:31,058 INFO [train.py:1046] (2/4) Epoch 35, batch 3550, loss[loss=0.1696, simple_loss=0.2376, pruned_loss=0.05078, over 22894.00 frames. ], tot_loss[loss=0.16, simple_loss=0.2384, pruned_loss=0.04081, over 4701679.10 frames. ], batch size: 322, lr: 2.90e-03, grad_scale: 4.0 2023-10-03 10:02:31,163 WARNING [train.py:1204] (2/4) Exclude cut with ID _20455_210_5219_1_1531565996775_8098100_402-112739-0_sp1.1 from training. Duration: 0.963625 2023-10-03 10:02:32,571 WARNING [train.py:1204] (2/4) Exclude cut with ID _53489_210_6228_1_1534298357169_3835970_256-115565-0_sp1.1 from training. Duration: 0.936375 2023-10-03 10:02:32,587 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_26326_210_7925_1_1533002493_3592406_360-305260-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 10:02:32,616 WARNING [train.py:1204] (2/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_18-204383-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 10:02:37,156 WARNING [train.py:1204] (2/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_306-232480-0_sp1.1 from training. Duration: 0.87275 2023-10-03 10:02:44,741 WARNING [train.py:1204] (2/4) Exclude cut with ID _41161_210_20034_1_1533431249657_3555314_116-299186-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 10:02:46,211 WARNING [train.py:1204] (2/4) Exclude cut with ID _13913_210_9994_1_1530579615187_3383897_473-277682-0_sp1.1 from training. Duration: 0.3545625 2023-10-03 10:02:49,556 WARNING [train.py:1204] (2/4) Exclude cut with ID _58895_210_8750_1_1535184081320_3649089_49-225702-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 10:02:50,846 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_3080_210_2071_1_1526086102_4365166_312-8726-0_sp1.1 from training. Duration: 0.82725 2023-10-03 10:02:52,196 WARNING [train.py:1204] (2/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_489-190175-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 10:02:52,271 WARNING [train.py:1204] (2/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_345-95717-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 10:02:52,292 WARNING [train.py:1204] (2/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_85-138257-0_sp1.1 from training. Duration: 0.6454375 2023-10-03 10:02:56,324 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.583e+02 1.952e+02 2.132e+02 2.407e+02 4.209e+02, threshold=4.264e+02, percent-clipped=1.0 2023-10-03 10:02:56,433 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7046_210_6753_1_1527895337_6920917_783-278109-0_sp1.1 from training. Duration: 0.7 2023-10-03 10:02:56,464 WARNING [train.py:1204] (2/4) Exclude cut with ID _3465_210_3525_1_1526175667060_3574049_450-203637-0_sp1.1 from training. Duration: 0.836375 2023-10-03 10:02:57,737 WARNING [train.py:1204] (2/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_416-132893-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 10:02:57,753 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_2655_210_2087_1_1525688959_3897400_437-337690-0_sp0.9 from training. Duration: 0.7 2023-10-03 10:02:59,583 WARNING [train.py:1204] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_700-183549-0_sp1.1 from training. Duration: 0.6181875 2023-10-03 10:03:01,264 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.conv_module2.balancer1.prob, batch_count=1227886.6666666667, ans=0.125 2023-10-03 10:03:01,610 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.3.encoder.layers.3.feed_forward1.out_whiten, num_groups=1, num_channels=512, metric=5.53 vs. limit=15.0 2023-10-03 10:03:02,981 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.self_attn2.whiten.whitening_limit, batch_count=1227886.6666666667, ans=22.5 2023-10-03 10:03:03,770 WARNING [train.py:1204] (2/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_191-59784-0_sp1.1 from training. Duration: 0.8 2023-10-03 10:03:05,022 WARNING [train.py:1204] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_218-295097-0_sp1.1 from training. Duration: 0.7 2023-10-03 10:03:07,650 WARNING [train.py:1204] (2/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_310-109461-0_sp1.1 from training. Duration: 0.72725 2023-10-03 10:03:07,655 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_33348_210_5632_1_1533086912_3324030_131-321149-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 10:03:07,698 WARNING [train.py:1204] (2/4) Exclude cut with ID _64135_210_2253_1_1535677267876_7608972_133-188266-0_sp1.1 from training. Duration: 0.736375 2023-10-03 10:03:07,731 WARNING [train.py:1204] (2/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_505-205270-0 from training. Duration: 0.38 2023-10-03 10:03:07,750 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_67223_210_4238_1_1535612120_2537431_102-88248-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 10:03:09,737 WARNING [train.py:1204] (2/4) Exclude cut with ID _54871_210_22356_1_1534665798253_3856359_60-235433-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 10:03:11,556 WARNING [train.py:1204] (2/4) Exclude cut with ID _5189_210_4557_1_1527248319428_3769229_199-53526-0_sp1.1 from training. Duration: 0.3818125 2023-10-03 10:03:17,093 WARNING [train.py:1204] (2/4) Exclude cut with ID _45329_210_7005_1_1534157951211_6774339_439-90979-0_sp1.1 from training. Duration: 0.963625 2023-10-03 10:03:17,155 WARNING [train.py:1204] (2/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_1162-132880-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 10:03:18,503 WARNING [train.py:1204] (2/4) Exclude cut with ID _56423_210_5854_1_1534896061049_3165969_220-22317-0_sp1.1 from training. Duration: 0.963625 2023-10-03 10:03:19,925 WARNING [train.py:1204] (2/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_909-64151-0 from training. Duration: 0.94 2023-10-03 10:03:19,984 WARNING [train.py:1204] (2/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_465-156500-0_sp0.9 from training. Duration: 0.8 2023-10-03 10:03:22,025 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.conv_module1.balancer1.prob, batch_count=1227953.3333333333, ans=0.125 2023-10-03 10:03:23,222 WARNING [train.py:1204] (2/4) Exclude cut with ID _44304_210_15374_1_1533639352765_4613129_302-22084-0 from training. Duration: 0.86 2023-10-03 10:03:23,267 WARNING [train.py:1204] (2/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_663-166973-0_sp1.1 from training. Duration: 0.72725 2023-10-03 10:03:23,537 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.ff2_skip_rate, batch_count=1227953.3333333333, ans=0.0 2023-10-03 10:03:24,764 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.attention_skip_rate, batch_count=1227953.3333333333, ans=0.0 2023-10-03 10:03:26,248 WARNING [train.py:1204] (2/4) Exclude cut with ID _45329_210_7005_1_1534157951211_6774339_503-287883-0_sp0.9 from training. Duration: 0.97775 2023-10-03 10:03:26,270 WARNING [train.py:1204] (2/4) Exclude cut with ID _60057_210_10640_1_1534903065793_3333150_362-230377-0_sp0.9 from training. Duration: 0.9666875 2023-10-03 10:03:29,286 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_4183_210_2087_1_1526729100_1869838_143-280962-0 from training. Duration: 0.56 2023-10-03 10:03:30,678 WARNING [train.py:1204] (2/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_281-1313-0_sp1.1 from training. Duration: 0.936375 2023-10-03 10:03:35,538 WARNING [train.py:1204] (2/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_64-215118-0_sp1.1 from training. Duration: 0.936375 2023-10-03 10:03:35,586 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_2822_210_2715_1_1526112586_1999587_165-36656-0 from training. Duration: 0.89 2023-10-03 10:03:35,641 WARNING [train.py:1204] (2/4) Exclude cut with ID _22961_210_8254_1_1531976366672_4278180_402-208830-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 10:03:39,625 WARNING [train.py:1204] (2/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_98-193494-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 10:03:41,542 WARNING [train.py:1204] (2/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_383-285196-0 from training. Duration: 0.97 2023-10-03 10:03:45,622 INFO [train.py:1046] (2/4) Epoch 35, batch 3600, loss[loss=0.1649, simple_loss=0.2423, pruned_loss=0.04377, over 23312.00 frames. ], tot_loss[loss=0.1596, simple_loss=0.2378, pruned_loss=0.04067, over 4690189.99 frames. ], batch size: 105, lr: 2.90e-03, grad_scale: 8.0 2023-10-03 10:03:45,905 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.feed_forward1.hidden_balancer.prob, batch_count=1228086.6666666667, ans=0.125 2023-10-03 10:03:46,962 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6767_210_3273_1_1527855783_1718932_142-127132-0 from training. Duration: 0.95 2023-10-03 10:03:46,998 WARNING [train.py:1204] (2/4) Exclude cut with ID _39867_210_9826_1_1533553222030_9344259_1018-339635-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 10:03:48,327 WARNING [train.py:1204] (2/4) Exclude cut with ID _14603_210_4220_1_1530615688441_3097319_274-274798-0_sp1.1 from training. Duration: 0.8909375 2023-10-03 10:03:49,873 WARNING [train.py:1204] (2/4) Exclude cut with ID _2334_210_2667_1_1525608238880_4705129_376-263393-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 10:03:49,930 WARNING [train.py:1204] (2/4) Exclude cut with ID _29303_210_2575_1_1532392237303_7410649_363-344675-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 10:03:52,517 WARNING [train.py:1204] (2/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_58-68956-0_sp1.1 from training. Duration: 0.7545625 2023-10-03 10:03:55,891 WARNING [train.py:1204] (2/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_709-311571-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 10:03:57,328 WARNING [train.py:1204] (2/4) Exclude cut with ID _46067_210_4220_1_1534125571066_3307450_136-257407-0_sp1.1 from training. Duration: 0.963625 2023-10-03 10:03:58,633 WARNING [train.py:1204] (2/4) Exclude cut with ID _6493_210_1747_1_1528459210424_4012470_324-323854-0_sp1.1 from training. Duration: 0.836375 2023-10-03 10:03:58,699 WARNING [train.py:1204] (2/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_234-260682-0_sp1.1 from training. Duration: 0.763625 2023-10-03 10:04:00,182 WARNING [train.py:1204] (2/4) Exclude cut with ID _36634_210_1794_1_1533296352450_1399884_55-119451-0_sp1.1 from training. Duration: 0.963625 2023-10-03 10:04:00,186 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_40047_210_2290_1_1533290519_2374311_195-165693-0 from training. Duration: 0.89 2023-10-03 10:04:02,151 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_5522_210_3273_1_1527249606_3909096_366-172508-0_sp0.9 from training. Duration: 0.7333125 2023-10-03 10:04:03,434 WARNING [train.py:1204] (2/4) Exclude cut with ID _21878_210_3525_1_1531738772002_7264330_187-10504-0_sp1.1 from training. Duration: 0.963625 2023-10-03 10:04:03,994 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.4.encoder.layers.2.nonlin_attention.whiten1, num_groups=1, num_channels=288, metric=5.52 vs. limit=10.0 2023-10-03 10:04:04,824 WARNING [train.py:1204] (2/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_282-257791-0_sp1.1 from training. Duration: 0.7818125 2023-10-03 10:04:09,223 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_43831_210_2158_1_1533725502_5872791_467-288776-0_sp1.1 from training. Duration: 0.92725 2023-10-03 10:04:09,317 WARNING [train.py:1204] (2/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_138-154812-0_sp1.1 from training. Duration: 0.7090625 2023-10-03 10:04:11,386 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_1154-185930-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 10:04:12,417 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_2655_210_2087_1_1525688959_3897400_52-279899-0 from training. Duration: 0.69 2023-10-03 10:04:12,509 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_141-89113-0_sp1.1 from training. Duration: 0.7818125 2023-10-03 10:04:15,236 WARNING [train.py:1204] (2/4) Exclude cut with ID _55134_210_21982_1_1534590028377_3882170_297-47310-0_sp1.1 from training. Duration: 0.963625 2023-10-03 10:04:15,406 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.conv_module1.balancer1.prob, batch_count=1228220.0, ans=0.125 2023-10-03 10:04:16,546 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_486-151131-0_sp1.1 from training. Duration: 0.77275 2023-10-03 10:04:18,001 WARNING [train.py:1204] (2/4) Exclude cut with ID _44778_210_2054_1_1533798127203_7443090_21-232980-0_sp1.1 from training. Duration: 0.936375 2023-10-03 10:04:18,133 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6165_210_3528_1_1527590982_4379841_338-106380-0_sp1.1 from training. Duration: 0.92725 2023-10-03 10:04:19,551 WARNING [train.py:1204] (2/4) Exclude cut with ID _55162_210_17071_1_1534408248583_3753480_355-257218-0_sp1.1 from training. Duration: 0.97275 2023-10-03 10:04:20,901 WARNING [train.py:1204] (2/4) Exclude cut with ID _2895_210_2477_1_1525956785794_4063750_289-11475-0 from training. Duration: 0.82 2023-10-03 10:04:29,612 WARNING [train.py:1204] (2/4) Exclude cut with ID _16756_210_4915_1_1531047803763_7104259_839-183431-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 10:04:29,715 WARNING [train.py:1204] (2/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_427-208901-0_sp0.9 from training. Duration: 0.7555625 2023-10-03 10:04:31,385 WARNING [train.py:1204] (2/4) Exclude cut with ID _36512_210_6987_1_1533033143998_3126069_181-306500-0 from training. Duration: 0.95 2023-10-03 10:04:35,559 WARNING [train.py:1204] (2/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_28-70987-0_sp1.1 from training. Duration: 0.7909375 2023-10-03 10:04:39,774 WARNING [train.py:1204] (2/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_590-58996-0_sp1.1 from training. Duration: 0.936375 2023-10-03 10:04:42,085 INFO [scaling.py:1118] (2/4) WithLoss: name=encoder.encoders.4.encoder.layers.0.self_attn_weights, loss-sum=0.000e+00 2023-10-03 10:04:43,248 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_48730_210_5399_1_1533885886_2212769_108-47561-0_sp1.1 from training. Duration: 0.936375 2023-10-03 10:04:46,897 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.bypass.scale_min, batch_count=1228353.3333333333, ans=0.2 2023-10-03 10:04:49,274 WARNING [train.py:1204] (2/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_401-158239-0_sp0.9 from training. Duration: 0.788875 2023-10-03 10:04:49,289 WARNING [train.py:1204] (2/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_278-154492-0_sp1.1 from training. Duration: 0.6909375 2023-10-03 10:04:49,294 WARNING [train.py:1204] (2/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_137-116442-0 from training. Duration: 0.78 2023-10-03 10:04:50,656 WARNING [train.py:1204] (2/4) Exclude cut with ID _3541_210_2477_1_1526475408709_3263973_189-317158-0 from training. Duration: 0.9 2023-10-03 10:04:52,387 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_47896_210_20508_1_1534492518_5958868_558-314572-0 from training. Duration: 0.97 2023-10-03 10:04:53,885 WARNING [train.py:1204] (2/4) Exclude cut with ID _66070_210_7640_1_1535441393096_7805310_290-40284-0_sp1.1 from training. Duration: 0.97275 2023-10-03 10:04:53,941 WARNING [train.py:1204] (2/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_110-224688-0_sp1.1 from training. Duration: 0.8818125 2023-10-03 10:04:54,842 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.0.layers.1.self_attn1.whiten, num_groups=1, num_channels=192, metric=11.84 vs. limit=22.5 2023-10-03 10:04:56,618 WARNING [train.py:1204] (2/4) Exclude cut with ID _7719_210_3525_1_1528456153163_4296960_88-250259-0 from training. Duration: 0.84 2023-10-03 10:04:56,680 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8453_210_4211_1_1528502940_8562730_1157-114102-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 10:04:56,702 WARNING [train.py:1204] (2/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_317-175062-0_sp1.1 from training. Duration: 0.7454375 2023-10-03 10:04:56,714 WARNING [train.py:1204] (2/4) Exclude cut with ID _29857_210_20034_1_1532395886753_3798160_254-347254-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 10:04:58,083 WARNING [train.py:1204] (2/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_28-70987-0 from training. Duration: 0.87 2023-10-03 10:04:58,152 WARNING [train.py:1204] (2/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_321-335475-0 from training. Duration: 0.45 2023-10-03 10:04:59,421 INFO [train.py:1046] (2/4) Epoch 35, batch 3650, loss[loss=0.1498, simple_loss=0.2334, pruned_loss=0.03306, over 23460.00 frames. ], tot_loss[loss=0.1599, simple_loss=0.2388, pruned_loss=0.04052, over 4699640.00 frames. ], batch size: 106, lr: 2.90e-03, grad_scale: 8.0 2023-10-03 10:05:00,876 WARNING [train.py:1204] (2/4) Exclude cut with ID _40901_210_5665_1_1533363789958_3856130_356-107330-0_sp1.1 from training. Duration: 0.936375 2023-10-03 10:05:02,836 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_3780_210_2087_1_1526467389_2145572_176-107140-0 from training. Duration: 0.69 2023-10-03 10:05:08,475 WARNING [train.py:1204] (2/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_80-135200-0 from training. Duration: 0.79 2023-10-03 10:05:09,842 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_33348_210_5632_1_1533086912_3324030_85-28583-0_sp0.9 from training. Duration: 0.911125 2023-10-03 10:05:14,581 WARNING [train.py:1204] (2/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_29-293896-0 from training. Duration: 0.61 2023-10-03 10:05:14,702 WARNING [train.py:1204] (2/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_194-252800-0 from training. Duration: 0.97 2023-10-03 10:05:19,494 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_360-52446-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 10:05:19,495 WARNING [train.py:1204] (2/4) Exclude cut with ID _9389_210_1664_1_1529217337301_5289269_289-213417-0_sp0.9 from training. Duration: 0.988875 2023-10-03 10:05:19,539 WARNING [train.py:1204] (2/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_143-86344-0_sp0.9 from training. Duration: 0.8555625 2023-10-03 10:05:22,259 WARNING [train.py:1204] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_520-126729-0_sp0.9 from training. Duration: 0.77775 2023-10-03 10:05:22,279 WARNING [train.py:1204] (2/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_76-308663-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 10:05:22,340 WARNING [train.py:1204] (2/4) Exclude cut with ID _8021_210_7994_1_1528376451358_3653069_100-140231-0 from training. Duration: 0.67 2023-10-03 10:05:24,238 WARNING [train.py:1204] (2/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_387-131538-0_sp0.9 from training. Duration: 0.9 2023-10-03 10:05:24,268 WARNING [train.py:1204] (2/4) Exclude cut with ID _6275_210_2084_1_1528110062213_7669049_365-182054-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 10:05:24,321 WARNING [train.py:1204] (2/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_345-340690-0 from training. Duration: 0.93 2023-10-03 10:05:25,446 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.563e+02 1.828e+02 1.992e+02 2.156e+02 3.543e+02, threshold=3.984e+02, percent-clipped=0.0 2023-10-03 10:05:25,558 WARNING [train.py:1204] (2/4) Exclude cut with ID _5409_210_1388_1_1527292850189_5865191_178-127970-0_sp0.9 from training. Duration: 0.6333125 2023-10-03 10:05:25,831 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.conv_module1.balancer1.prob, batch_count=1228486.6666666667, ans=0.125 2023-10-03 10:05:25,844 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=1228486.6666666667, ans=0.1 2023-10-03 10:05:26,891 WARNING [train.py:1204] (2/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_352-12734-0_sp1.1 from training. Duration: 0.92725 2023-10-03 10:05:26,895 WARNING [train.py:1204] (2/4) Exclude cut with ID _65134_210_14763_1_1535335393985_3505368_121-129373-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 10:05:28,219 WARNING [train.py:1204] (2/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_557-324258-0_sp1.1 from training. Duration: 0.736375 2023-10-03 10:05:31,001 WARNING [train.py:1204] (2/4) Exclude cut with ID _48678_210_5663_1_1533988830947_3769199_306-255886-0 from training. Duration: 0.77 2023-10-03 10:05:31,080 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7544_210_6753_1_1528587000_691468_87-178599-0 from training. Duration: 0.87 2023-10-03 10:05:31,140 WARNING [train.py:1204] (2/4) Exclude cut with ID _2941_210_2577_1_1526199673150_2489759_208-236402-0_sp1.1 from training. Duration: 0.963625 2023-10-03 10:05:31,248 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.0.conv_module1.balancer2.prob, batch_count=1228553.3333333333, ans=0.125 2023-10-03 10:05:34,311 WARNING [train.py:1204] (2/4) Exclude cut with ID _38670_210_15379_1_1533448810659_4391059_65-169397-0 from training. Duration: 0.97 2023-10-03 10:05:35,718 WARNING [train.py:1204] (2/4) Exclude cut with ID _30304_210_6319_1_1532476827261_7582060_306-328175-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 10:05:36,953 WARNING [train.py:1204] (2/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_212-149947-0_sp1.1 from training. Duration: 0.663625 2023-10-03 10:05:43,043 WARNING [train.py:1204] (2/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_321-22821-0_sp1.1 from training. Duration: 0.8090625 2023-10-03 10:05:44,432 WARNING [train.py:1204] (2/4) Exclude cut with ID _44010_210_4525_1_1533884262964_4013620_483-69110-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 10:05:44,450 WARNING [train.py:1204] (2/4) Exclude cut with ID _14922_210_6228_1_1530707435273_3693310_38-187851-0_sp1.1 from training. Duration: 0.563625 2023-10-03 10:05:45,987 WARNING [train.py:1204] (2/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_129-337802-0_sp0.9 from training. Duration: 0.8 2023-10-03 10:05:46,071 WARNING [train.py:1204] (2/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_268-87909-0_sp1.1 from training. Duration: 0.9 2023-10-03 10:05:49,420 WARNING [train.py:1204] (2/4) Exclude cut with ID _21878_210_3525_1_1531738772002_7264330_559-184862-0_sp1.1 from training. Duration: 0.8545625 2023-10-03 10:05:52,134 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_46032_210_14656_1_1533781412_4507969_237-120766-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 10:05:52,220 WARNING [train.py:1204] (2/4) Exclude cut with ID _28427_210_4220_1_1532581240967_3061069_330-170906-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 10:05:52,226 WARNING [train.py:1204] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_401-51578-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 10:05:53,595 WARNING [train.py:1204] (2/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_209-205365-0_sp1.1 from training. Duration: 0.7181875 2023-10-03 10:05:54,398 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.2.encoder.layers.1.self_attn_weights.whiten_keys, num_groups=4, num_channels=128, metric=2.88 vs. limit=6.0 2023-10-03 10:05:55,559 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_23801_210_16223_1_1532066392_3294657_57-242170-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 10:05:55,615 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_127-37371-0_sp1.1 from training. Duration: 0.936375 2023-10-03 10:06:02,498 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9171W0019-578905 from training. Duration: 0.896 2023-10-03 10:06:07,183 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_24605_210_1794_1_1532047863_4149755_349-49056-0_sp1.1 from training. Duration: 0.92725 2023-10-03 10:06:07,195 WARNING [train.py:1204] (2/4) Exclude cut with ID _12302_210_5219_1_1529924446466_8107280_897-21237-0_sp1.1 from training. Duration: 0.936375 2023-10-03 10:06:07,287 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_9460_210_4211_1_1528954587_10049667_480-260269-0_sp0.9 from training. Duration: 0.62225 2023-10-03 10:06:08,655 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_348-152548-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 10:06:09,279 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.2.encoder.layers.2.feed_forward3.out_whiten, num_groups=1, num_channels=384, metric=7.02 vs. limit=15.0 2023-10-03 10:06:10,034 WARNING [train.py:1204] (2/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_301-330455-0_sp0.9 from training. Duration: 0.888875 2023-10-03 10:06:11,292 WARNING [train.py:1204] (2/4) Exclude cut with ID _61008_210_10633_1_1534852769198_6974370_345-91705-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 10:06:11,405 WARNING [train.py:1204] (2/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_304-273433-0 from training. Duration: 0.99 2023-10-03 10:06:11,408 WARNING [train.py:1204] (2/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_359-345972-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 10:06:14,643 INFO [train.py:1046] (2/4) Epoch 35, batch 3700, loss[loss=0.1576, simple_loss=0.2356, pruned_loss=0.03977, over 24466.00 frames. ], tot_loss[loss=0.161, simple_loss=0.2399, pruned_loss=0.0411, over 4693911.15 frames. ], batch size: 58, lr: 2.90e-03, grad_scale: 8.0 2023-10-03 10:06:14,753 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_2296_210_2087_1_1525503448_3397335_57-185430-0_sp1.1 from training. Duration: 0.5454375 2023-10-03 10:06:17,444 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_1256-120974-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 10:06:17,508 WARNING [train.py:1204] (2/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_372-30302-0_sp0.9 from training. Duration: 0.9555625 2023-10-03 10:06:20,705 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_42012_210_7927_1_1533439936_3871556_451-344262-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 10:06:20,706 WARNING [train.py:1204] (2/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_28-324729-0 from training. Duration: 0.94 2023-10-03 10:06:20,715 WARNING [train.py:1204] (2/4) Exclude cut with ID _64135_210_2253_1_1535677267876_7608972_313-18394-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 10:06:22,112 WARNING [train.py:1204] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_620-276793-0_sp0.9 from training. Duration: 0.6555625 2023-10-03 10:06:22,149 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7544_210_6753_1_1528591327_1471426_172-348777-0_sp0.9 from training. Duration: 0.7333125 2023-10-03 10:06:22,387 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.conv_skip_rate, batch_count=1228753.3333333333, ans=0.0 2023-10-03 10:06:26,768 WARNING [train.py:1204] (2/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_332-221057-0_sp0.9 from training. Duration: 0.7555625 2023-10-03 10:06:28,251 WARNING [train.py:1204] (2/4) Exclude cut with ID _19023_210_4353_1_1531378892552_5820780_5-300035-0_sp1.1 from training. Duration: 0.92725 2023-10-03 10:06:29,712 WARNING [train.py:1204] (2/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_343-309023-0_sp1.1 from training. Duration: 0.963625 2023-10-03 10:06:31,149 WARNING [train.py:1204] (2/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_37-41919-0_sp1.1 from training. Duration: 0.7909375 2023-10-03 10:06:31,185 WARNING [train.py:1204] (2/4) Exclude cut with ID _3541_210_2477_1_1526475408709_3263973_41-182910-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 10:06:32,510 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_229-45300-0_sp0.9 from training. Duration: 0.6333125 2023-10-03 10:06:35,152 WARNING [train.py:1204] (2/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_40-214716-0_sp1.1 from training. Duration: 0.963625 2023-10-03 10:06:36,526 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9309W0203-640330 from training. Duration: 0.896 2023-10-03 10:06:45,260 WARNING [train.py:1204] (2/4) Exclude cut with ID _60229_210_27767_1_1534838406158_3655350_264-279573-0_sp1.1 from training. Duration: 0.7545625 2023-10-03 10:06:45,304 WARNING [train.py:1204] (2/4) Exclude cut with ID _8394_210_6702_1_1528590504316_3540189_372-198878-0_sp0.9 from training. Duration: 0.6666875 2023-10-03 10:06:46,685 WARNING [train.py:1204] (2/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_359-118926-0_sp1.1 from training. Duration: 0.6545625 2023-10-03 10:06:46,695 WARNING [train.py:1204] (2/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_85-65056-0 from training. Duration: 0.96 2023-10-03 10:06:48,362 WARNING [train.py:1204] (2/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_89-242335-0_sp1.1 from training. Duration: 0.863625 2023-10-03 10:06:49,931 WARNING [train.py:1204] (2/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_128-49444-0_sp1.1 from training. Duration: 0.936375 2023-10-03 10:06:51,856 WARNING [train.py:1204] (2/4) Exclude cut with ID _30327_210_16049_1_1532484161295_3924169_401-70819-0 from training. Duration: 0.86 2023-10-03 10:06:53,308 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_41822_210_13270_1_1533955976_4379284_260-256391-0_sp1.1 from training. Duration: 0.936375 2023-10-03 10:06:54,712 WARNING [train.py:1204] (2/4) Exclude cut with ID _67335_210_5219_1_1535525975652_4275229_599-247317-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 10:06:56,122 WARNING [train.py:1204] (2/4) Exclude cut with ID _29857_210_20034_1_1532395886753_3798160_209-239267-0_sp1.1 from training. Duration: 0.936375 2023-10-03 10:06:56,141 WARNING [train.py:1204] (2/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_46-207599-0_sp1.1 from training. Duration: 0.5818125 2023-10-03 10:06:59,385 WARNING [train.py:1204] (2/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_912-282412-0_sp1.1 from training. Duration: 0.4454375 2023-10-03 10:07:02,419 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_4183_210_2087_1_1526729100_1869838_141-166600-0_sp1.1 from training. Duration: 0.863625 2023-10-03 10:07:02,423 WARNING [train.py:1204] (2/4) Exclude cut with ID _15037_210_2253_1_1530752294104_3267650_296-192597-0 from training. Duration: 0.81 2023-10-03 10:07:03,753 WARNING [train.py:1204] (2/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_571-220562-0_sp1.1 from training. Duration: 0.963625 2023-10-03 10:07:03,770 WARNING [train.py:1204] (2/4) Exclude cut with ID _5287_210_2084_1_1527418883040_3878670_12-221186-0 from training. Duration: 0.98 2023-10-03 10:07:05,330 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.1.conv_module1.balancer2.prob, batch_count=1228953.3333333333, ans=0.125 2023-10-03 10:07:07,784 WARNING [train.py:1204] (2/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_149-85602-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 10:07:07,813 WARNING [train.py:1204] (2/4) Exclude cut with ID _9235_210_5219_1_1528891154253_8046120_1003-101638-0_sp1.1 from training. Duration: 0.663625 2023-10-03 10:07:12,245 WARNING [train.py:1204] (2/4) Exclude cut with ID _55844_210_2346_1_1534471212569_7625707_1099-33689-0_sp1.1 from training. Duration: 0.92725 2023-10-03 10:07:13,463 WARNING [train.py:1204] (2/4) Exclude cut with ID _7606_210_4915_1_1528894253277_5080690_301-10732-0 from training. Duration: 0.81 2023-10-03 10:07:16,161 WARNING [train.py:1204] (2/4) Exclude cut with ID _22826_210_9994_1_1531908201522_3279169_403-34698-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 10:07:16,175 WARNING [train.py:1204] (2/4) Exclude cut with ID _8480_210_1874_1_1528589391205_6728330_755-112625-0_sp0.9 from training. Duration: 0.82225 2023-10-03 10:07:16,188 WARNING [train.py:1204] (2/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_527-238990-0_sp1.1 from training. Duration: 0.8181875 2023-10-03 10:07:16,223 WARNING [train.py:1204] (2/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_426-7328-0_sp1.1 from training. Duration: 0.92725 2023-10-03 10:07:17,899 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.conv_module1.balancer1.prob, batch_count=1229020.0, ans=0.125 2023-10-03 10:07:20,956 WARNING [train.py:1204] (2/4) Exclude cut with ID _3541_210_2477_1_1526475408709_3263973_189-317158-0_sp1.1 from training. Duration: 0.8181875 2023-10-03 10:07:21,015 WARNING [train.py:1204] (2/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_162-103620-0 from training. Duration: 0.93 2023-10-03 10:07:22,433 WARNING [train.py:1204] (2/4) Exclude cut with ID _22083_210_8254_1_1531702862439_3678970_49-234546-0 from training. Duration: 0.83 2023-10-03 10:07:24,278 WARNING [train.py:1204] (2/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_370-331600-0_sp1.1 from training. Duration: 0.8545625 2023-10-03 10:07:24,292 WARNING [train.py:1204] (2/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_166-59042-0_sp1.1 from training. Duration: 0.97275 2023-10-03 10:07:25,651 WARNING [train.py:1204] (2/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_138-332773-0_sp1.1 from training. Duration: 0.736375 2023-10-03 10:07:25,718 WARNING [train.py:1204] (2/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_90-30673-0_sp0.9 from training. Duration: 0.9333125 2023-10-03 10:07:28,303 INFO [train.py:1046] (2/4) Epoch 35, batch 3750, loss[loss=0.1761, simple_loss=0.2503, pruned_loss=0.05098, over 23442.00 frames. ], tot_loss[loss=0.1614, simple_loss=0.2406, pruned_loss=0.04106, over 4712111.65 frames. ], batch size: 256, lr: 2.90e-03, grad_scale: 8.0 2023-10-03 10:07:28,444 WARNING [train.py:1204] (2/4) Exclude cut with ID _27967_210_12140_1_1532588391859_8419130_737-41898-0_sp1.1 from training. Duration: 0.936375 2023-10-03 10:07:29,888 WARNING [train.py:1204] (2/4) Exclude cut with ID _8833_210_5219_1_1528592665210_7553059_329-125156-0_sp1.1 from training. Duration: 0.7454375 2023-10-03 10:07:30,099 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.ff3_skip_rate, batch_count=1229086.6666666667, ans=0.0 2023-10-03 10:07:31,756 WARNING [train.py:1204] (2/4) Exclude cut with ID _41817_210_18835_1_1533434477160_7423049_352-42537-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 10:07:33,161 WARNING [train.py:1204] (2/4) Exclude cut with ID _8365_210_2064_1_1528592311070_4677690_431-294102-0 from training. Duration: 0.89 2023-10-03 10:07:34,599 WARNING [train.py:1204] (2/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_454-314901-0_sp1.1 from training. Duration: 0.4181875 2023-10-03 10:07:37,308 WARNING [train.py:1204] (2/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_80-135200-0_sp0.9 from training. Duration: 0.87775 2023-10-03 10:07:37,346 WARNING [train.py:1204] (2/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_181-78632-0 from training. Duration: 0.78 2023-10-03 10:07:38,631 WARNING [train.py:1204] (2/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_300-27235-0_sp0.9 from training. Duration: 0.9666875 2023-10-03 10:07:40,039 WARNING [train.py:1204] (2/4) Exclude cut with ID _47790_210_16187_1_1533862814066_4207519_96-177176-0_sp1.1 from training. Duration: 0.97275 2023-10-03 10:07:41,951 WARNING [train.py:1204] (2/4) Exclude cut with ID _35941_210_15379_1_1532995294515_4023089_246-348742-0_sp1.1 from training. Duration: 0.97275 2023-10-03 10:07:43,313 WARNING [train.py:1204] (2/4) Exclude cut with ID _38670_210_15379_1_1533448810659_4391059_65-169397-0_sp1.1 from training. Duration: 0.8818125 2023-10-03 10:07:46,112 WARNING [train.py:1204] (2/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_1003-99642-0_sp1.1 from training. Duration: 0.963625 2023-10-03 10:07:48,884 WARNING [train.py:1204] (2/4) Exclude cut with ID _40402_210_18219_1_1533522324115_2953150_320-14078-0_sp1.1 from training. Duration: 0.863625 2023-10-03 10:07:48,981 WARNING [train.py:1204] (2/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_367-61330-0_sp0.9 from training. Duration: 0.8333125 2023-10-03 10:07:52,847 WARNING [train.py:1204] (2/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_515-298514-0_sp1.1 from training. Duration: 0.92725 2023-10-03 10:07:53,850 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.536e+02 1.913e+02 2.089e+02 2.337e+02 3.206e+02, threshold=4.178e+02, percent-clipped=0.0 2023-10-03 10:07:55,347 WARNING [train.py:1204] (2/4) Exclude cut with ID _53731_210_7994_1_1534553979244_3757214_471-320815-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 10:07:55,405 WARNING [train.py:1204] (2/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_306-232480-0 from training. Duration: 0.96 2023-10-03 10:07:56,713 WARNING [train.py:1204] (2/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_478-162247-0_sp0.9 from training. Duration: 0.911125 2023-10-03 10:07:57,514 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.3.encoder.layers.0.feed_forward3.out_whiten, num_groups=1, num_channels=512, metric=13.43 vs. limit=15.0 2023-10-03 10:07:58,158 WARNING [train.py:1204] (2/4) Exclude cut with ID _56423_210_5854_1_1534896061049_3165969_345-284696-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 10:07:58,205 WARNING [train.py:1204] (2/4) Exclude cut with ID _44778_210_2054_1_1533798127203_7443090_600-122253-0_sp1.1 from training. Duration: 0.963625 2023-10-03 10:08:01,441 WARNING [train.py:1204] (2/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_199-295611-0 from training. Duration: 0.55 2023-10-03 10:08:04,200 WARNING [train.py:1204] (2/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_734-129761-0 from training. Duration: 0.82 2023-10-03 10:08:05,593 WARNING [train.py:1204] (2/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_349-169257-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 10:08:05,630 WARNING [train.py:1204] (2/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_202-67017-0_sp0.9 from training. Duration: 0.911125 2023-10-03 10:08:07,183 WARNING [train.py:1204] (2/4) Exclude cut with ID _16908_210_5724_1_1531119157248_3772700_61-258978-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 10:08:09,190 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.3.encoder.layers.1.feed_forward3.out_whiten, num_groups=1, num_channels=512, metric=14.63 vs. limit=15.0 2023-10-03 10:08:11,316 WARNING [train.py:1204] (2/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_172-156931-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 10:08:14,557 WARNING [train.py:1204] (2/4) Exclude cut with ID _4092_210_2151_1_1526643044449_3135759_356-278694-0_sp1.1 from training. Duration: 0.42725 2023-10-03 10:08:17,464 WARNING [train.py:1204] (2/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_116-332008-0 from training. Duration: 0.98 2023-10-03 10:08:20,893 WARNING [train.py:1204] (2/4) Exclude cut with ID _29003_210_19596_1_1532507391720_3894098_358-114789-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 10:08:24,484 WARNING [train.py:1204] (2/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_152-212533-0_sp1.1 from training. Duration: 0.8818125 2023-10-03 10:08:25,821 WARNING [train.py:1204] (2/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_267-214116-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 10:08:29,888 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7543_210_6753_1_1528541907_1399322_165-323619-0_sp0.9 from training. Duration: 0.7666875 2023-10-03 10:08:33,025 WARNING [train.py:1204] (2/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_577-332600-0_sp0.9 from training. Duration: 0.6333125 2023-10-03 10:08:34,440 WARNING [train.py:1204] (2/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_342-100157-0_sp0.9 from training. Duration: 0.6 2023-10-03 10:08:35,861 WARNING [train.py:1204] (2/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_141-89727-0_sp1.1 from training. Duration: 0.6454375 2023-10-03 10:08:37,262 WARNING [train.py:1204] (2/4) Exclude cut with ID _3992_210_1747_1_1526644816031_3731060_107-251434-0_sp1.1 from training. Duration: 0.9 2023-10-03 10:08:39,974 WARNING [train.py:1204] (2/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_401-53423-0_sp1.1 from training. Duration: 0.5 2023-10-03 10:08:42,695 INFO [train.py:1046] (2/4) Epoch 35, batch 3800, loss[loss=0.1762, simple_loss=0.2515, pruned_loss=0.05047, over 23741.00 frames. ], tot_loss[loss=0.1612, simple_loss=0.24, pruned_loss=0.04114, over 4710515.45 frames. ], batch size: 150, lr: 2.90e-03, grad_scale: 8.0 2023-10-03 10:08:46,206 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7762_210_1794_1_1528199508_4057408_422-227034-0_sp1.1 from training. Duration: 0.663625 2023-10-03 10:08:49,040 WARNING [train.py:1204] (2/4) Exclude cut with ID _4184_210_2550_1_1526730192013_2219380_102-250299-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 10:08:49,101 WARNING [train.py:1204] (2/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_253-308377-0_sp0.9 from training. Duration: 0.5444375 2023-10-03 10:08:51,012 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_271-297505-0 from training. Duration: 0.99 2023-10-03 10:08:53,828 WARNING [train.py:1204] (2/4) Exclude cut with ID _35941_210_15379_1_1532995294515_4023089_213-142973-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 10:08:55,308 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7544_210_6753_1_1528587722_3576686_162-63018-0_sp1.1 from training. Duration: 0.92725 2023-10-03 10:08:56,536 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_365-217119-0_sp0.9 from training. Duration: 0.7 2023-10-03 10:08:59,343 WARNING [train.py:1204] (2/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_662-275621-0_sp0.9 from training. Duration: 0.4666875 2023-10-03 10:08:59,344 WARNING [train.py:1204] (2/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_374-187555-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 10:09:00,670 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_289-180904-0_sp1.1 from training. Duration: 0.6909375 2023-10-03 10:09:02,495 WARNING [train.py:1204] (2/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_189-47206-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 10:09:02,537 WARNING [train.py:1204] (2/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_234-14174-0_sp1.1 from training. Duration: 0.7454375 2023-10-03 10:09:03,913 WARNING [train.py:1204] (2/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_576-76621-0_sp1.1 from training. Duration: 0.963625 2023-10-03 10:09:04,004 WARNING [train.py:1204] (2/4) Exclude cut with ID _13253_210_1925_1_1530757775819_3577873_253-246386-0 from training. Duration: 0.9 2023-10-03 10:09:08,078 WARNING [train.py:1204] (2/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_372-6869-0_sp1.1 from training. Duration: 0.4 2023-10-03 10:09:08,142 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_21434_210_4238_1_1531916339_1222252_22-54954-0_sp1.1 from training. Duration: 0.936375 2023-10-03 10:09:10,921 WARNING [train.py:1204] (2/4) Exclude cut with ID _29853_210_7497_1_1532505600851_6642110_492-170361-0_sp1.1 from training. Duration: 0.92725 2023-10-03 10:09:13,727 WARNING [train.py:1204] (2/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_505-97987-0_sp0.9 from training. Duration: 0.9555625 2023-10-03 10:09:13,767 WARNING [train.py:1204] (2/4) Exclude cut with ID _8365_210_2064_1_1528592311070_4677690_371-122295-0_sp0.9 from training. Duration: 0.611125 2023-10-03 10:09:15,240 WARNING [train.py:1204] (2/4) Exclude cut with ID _14268_210_9206_1_1530622771285_4714099_637-86630-0_sp1.1 from training. Duration: 0.463625 2023-10-03 10:09:15,254 WARNING [train.py:1204] (2/4) Exclude cut with ID _29857_210_20034_1_1532395886753_3798160_372-5678-0_sp1.1 from training. Duration: 0.963625 2023-10-03 10:09:18,018 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.0.layers.1.feed_forward3.out_whiten, num_groups=1, num_channels=192, metric=9.33 vs. limit=15.0 2023-10-03 10:09:18,433 WARNING [train.py:1204] (2/4) Exclude cut with ID _52892_210_6721_1_1534378941259_4124129_90-165108-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 10:09:18,564 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.1.feed_forward1.out_proj.dropout_p, batch_count=1229553.3333333333, ans=0.1 2023-10-03 10:09:20,361 WARNING [train.py:1204] (2/4) Exclude cut with ID _27052_210_5986_1_1532329197187_3585009_143-339198-0_sp1.1 from training. Duration: 0.963625 2023-10-03 10:09:24,967 WARNING [train.py:1204] (2/4) Exclude cut with ID _14268_210_9206_1_1530622771285_4714099_637-86630-0_sp0.9 from training. Duration: 0.5666875 2023-10-03 10:09:24,970 WARNING [train.py:1204] (2/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_401-255491-0 from training. Duration: 0.7 2023-10-03 10:09:25,151 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.bypass.skip_rate, batch_count=1229553.3333333333, ans=0.07 2023-10-03 10:09:27,748 WARNING [train.py:1204] (2/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_178-6946-0_sp1.1 from training. Duration: 0.97275 2023-10-03 10:09:34,080 WARNING [train.py:1204] (2/4) Exclude cut with ID _27485_210_4916_1_1532739191677_3668269_460-163885-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 10:09:38,604 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.3.encoder.layers.2.conv_module2.whiten, num_groups=1, num_channels=512, metric=4.12 vs. limit=15.0 2023-10-03 10:09:39,441 WARNING [train.py:1204] (2/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_270-26053-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 10:09:39,693 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.balancer1.prob, batch_count=1229620.0, ans=0.125 2023-10-03 10:09:40,867 WARNING [train.py:1204] (2/4) Exclude cut with ID _14170_210_5219_1_1530525949868_4469650_412-317077-0 from training. Duration: 0.75 2023-10-03 10:09:41,071 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.ff3_skip_rate, batch_count=1229686.6666666667, ans=0.0 2023-10-03 10:09:41,072 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=1229686.6666666667, ans=0.1 2023-10-03 10:09:42,366 WARNING [train.py:1204] (2/4) Exclude cut with ID _5289_210_2084_1_1527678202736_3779299_142-103317-0 from training. Duration: 0.84 2023-10-03 10:09:42,413 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_150-143438-0_sp1.1 from training. Duration: 0.92725 2023-10-03 10:09:45,127 WARNING [train.py:1204] (2/4) Exclude cut with ID _27894_210_12392_1_1532415496078_6902439_668-269779-0_sp1.1 from training. Duration: 0.97275 2023-10-03 10:09:45,194 WARNING [train.py:1204] (2/4) Exclude cut with ID _18513_210_9206_1_1531618215223_3862620_56-222075-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 10:09:46,594 WARNING [train.py:1204] (2/4) Exclude cut with ID _4092_210_2151_1_1526643044449_3135759_356-278694-0 from training. Duration: 0.47 2023-10-03 10:09:49,770 WARNING [train.py:1204] (2/4) Exclude cut with ID _25808_210_13292_1_1532343514562_7366109_633-47524-0 from training. Duration: 0.99 2023-10-03 10:09:49,780 WARNING [train.py:1204] (2/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_872-264525-0 from training. Duration: 0.65 2023-10-03 10:09:49,813 WARNING [train.py:1204] (2/4) Exclude cut with ID _14632_210_9988_1_1530691279392_3923200_239-232545-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 10:09:51,824 WARNING [train.py:1204] (2/4) Exclude cut with ID _14349_210_8341_1_1530615710818_3442470_153-27432-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 10:09:52,154 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.attention_skip_rate, batch_count=1229686.6666666667, ans=0.0 2023-10-03 10:09:57,212 INFO [train.py:1046] (2/4) Epoch 35, batch 3850, loss[loss=0.141, simple_loss=0.1924, pruned_loss=0.04483, over 19582.00 frames. ], tot_loss[loss=0.1602, simple_loss=0.2385, pruned_loss=0.04091, over 4695003.03 frames. ], batch size: 388, lr: 2.90e-03, grad_scale: 8.0 2023-10-03 10:09:57,284 WARNING [train.py:1204] (2/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_37-41919-0_sp0.9 from training. Duration: 0.9666875 2023-10-03 10:09:58,625 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_196-289637-0_sp0.9 from training. Duration: 0.8333125 2023-10-03 10:10:03,538 WARNING [train.py:1204] (2/4) Exclude cut with ID _13678_210_7497_1_1530530069226_3463170_371-3221-0_sp1.1 from training. Duration: 0.8181875 2023-10-03 10:10:03,613 WARNING [train.py:1204] (2/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_557-324258-0 from training. Duration: 0.81 2023-10-03 10:10:04,949 WARNING [train.py:1204] (2/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_1012-279473-0_sp1.1 from training. Duration: 0.6090625 2023-10-03 10:10:06,269 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_66916_210_14656_1_1535725525_326566_7-92649-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 10:10:07,730 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.balancer1.prob, batch_count=1229753.3333333333, ans=0.125 2023-10-03 10:10:09,094 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.conv_module1.balancer2.prob, batch_count=1229753.3333333333, ans=0.125 2023-10-03 10:10:10,281 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_9460_210_4211_1_1528954587_10049667_480-260269-0_sp1.1 from training. Duration: 0.5090625 2023-10-03 10:10:11,831 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.conv_module1.balancer1.max_abs, batch_count=1229820.0, ans=10.0 2023-10-03 10:10:12,960 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_40896_210_15710_1_1533433956_7100802_121-155001-0_sp1.1 from training. Duration: 0.92725 2023-10-03 10:10:14,372 WARNING [train.py:1204] (2/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_188-171673-0_sp1.1 from training. Duration: 0.52725 2023-10-03 10:10:15,717 WARNING [train.py:1204] (2/4) Exclude cut with ID _47790_210_16187_1_1533862814066_4207519_2-39653-0 from training. Duration: 0.98 2023-10-03 10:10:21,200 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.3.encoder.layers.1.self_attn2.whiten, num_groups=1, num_channels=512, metric=13.31 vs. limit=22.5 2023-10-03 10:10:22,239 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.524e+02 1.944e+02 2.219e+02 2.451e+02 3.928e+02, threshold=4.439e+02, percent-clipped=0.0 2023-10-03 10:10:22,315 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_23850_210_4238_1_1532232597_1632433_112-229012-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 10:10:22,609 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.conv_module2.balancer1.prob, batch_count=1229820.0, ans=0.125 2023-10-03 10:10:23,768 WARNING [train.py:1204] (2/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_626-79528-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 10:10:25,341 WARNING [train.py:1204] (2/4) Exclude cut with ID _30304_210_6319_1_1532476827261_7582060_856-90052-0_sp1.1 from training. Duration: 0.963625 2023-10-03 10:10:26,663 WARNING [train.py:1204] (2/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_122-109023-0_sp0.9 from training. Duration: 0.8444375 2023-10-03 10:10:29,375 WARNING [train.py:1204] (2/4) Exclude cut with ID _64414_210_17301_1_1535243252048_3757759_37-70328-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 10:10:29,450 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_20120_210_6753_1_1531643035_5482859_622-344907-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 10:10:30,704 WARNING [train.py:1204] (2/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_410-321956-0_sp1.1 from training. Duration: 0.92725 2023-10-03 10:10:30,717 WARNING [train.py:1204] (2/4) Exclude cut with ID _7562_210_2084_1_1528887767916_3946229_186-326422-0_sp0.9 from training. Duration: 0.9333125 2023-10-03 10:10:32,143 WARNING [train.py:1204] (2/4) Exclude cut with ID _37537_210_2253_1_1533171758568_3421949_127-285192-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 10:10:32,307 WARNING [train.py:1204] (2/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_723-228168-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 10:10:33,919 WARNING [train.py:1204] (2/4) Exclude cut with ID _13678_210_7497_1_1530530069226_3463170_426-246984-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 10:10:33,939 WARNING [train.py:1204] (2/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_399-156791-0_sp0.9 from training. Duration: 0.92225 2023-10-03 10:10:35,357 WARNING [train.py:1204] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_391-22148-0 from training. Duration: 0.77 2023-10-03 10:10:35,391 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_3475_210_2028_1_1526193578_3622569_64-297072-0 from training. Duration: 0.91 2023-10-03 10:10:35,452 WARNING [train.py:1204] (2/4) Exclude cut with ID _17875_210_2087_1_1531224012907_1821339_225-280015-0_sp1.1 from training. Duration: 0.963625 2023-10-03 10:10:35,478 WARNING [train.py:1204] (2/4) Exclude cut with ID _50655_210_18628_1_1534229942193_3772154_325-246169-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 10:10:38,215 WARNING [train.py:1204] (2/4) Exclude cut with ID _29529_210_7234_1_1532393961455_3977460_597-257348-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 10:10:39,455 WARNING [train.py:1204] (2/4) Exclude cut with ID _58895_210_8750_1_1535184081320_3649089_77-173485-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 10:10:39,481 WARNING [train.py:1204] (2/4) Exclude cut with ID _6270_210_2084_1_1527850911573_3856420_26-277343-0 from training. Duration: 0.82 2023-10-03 10:10:42,192 WARNING [train.py:1204] (2/4) Exclude cut with ID _14368_210_4220_1_1530532714870_3595529_144-3287-0 from training. Duration: 0.67 2023-10-03 10:10:43,608 WARNING [train.py:1204] (2/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_293-145278-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 10:10:46,301 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_40047_210_2290_1_1533290519_2374311_257-335162-0 from training. Duration: 0.95 2023-10-03 10:10:47,773 WARNING [train.py:1204] (2/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_619-344499-0_sp0.9 from training. Duration: 0.7 2023-10-03 10:10:50,214 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.bypass_mid.scale_min, batch_count=1229953.3333333333, ans=0.2 2023-10-03 10:10:53,176 WARNING [train.py:1204] (2/4) Exclude cut with ID _19504_210_9994_1_1531648863242_3325220_456-315379-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 10:10:54,523 WARNING [train.py:1204] (2/4) Exclude cut with ID _55857_210_2346_1_1534644007831_6898209_631-221692-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 10:10:58,748 WARNING [train.py:1204] (2/4) Exclude cut with ID _64631_210_27767_1_1535349580068_4165160_324-3682-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 10:10:58,769 WARNING [train.py:1204] (2/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_317-211107-0 from training. Duration: 0.92 2023-10-03 10:11:01,582 WARNING [train.py:1204] (2/4) Exclude cut with ID _6789_210_1534_1_1527857937218_3118950_311-61262-0 from training. Duration: 0.65 2023-10-03 10:11:04,684 WARNING [train.py:1204] (2/4) Exclude cut with ID _28323_210_16187_1_1532480395667_4100300_365-68882-0_sp1.1 from training. Duration: 0.92725 2023-10-03 10:11:04,729 WARNING [train.py:1204] (2/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_816-181825-0_sp1.1 from training. Duration: 0.92725 2023-10-03 10:11:06,048 WARNING [train.py:1204] (2/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_132-269633-0_sp1.1 from training. Duration: 0.6181875 2023-10-03 10:11:07,409 WARNING [train.py:1204] (2/4) Exclude cut with ID _44585_210_18628_1_1533605350387_4518199_280-73534-0_sp1.1 from training. Duration: 0.8090625 2023-10-03 10:11:07,453 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_68095_210_20185_1_1535718226_8888855_1009-164755-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 10:11:07,520 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_40223_210_6228_1_1533298404_4812267_400-103813-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 10:11:07,521 WARNING [train.py:1204] (2/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_491-261923-0_sp1.1 from training. Duration: 0.8909375 2023-10-03 10:11:07,526 WARNING [train.py:1204] (2/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_712-342563-0 from training. Duration: 0.97 2023-10-03 10:11:08,868 WARNING [train.py:1204] (2/4) Exclude cut with ID _41161_210_20034_1_1533431249657_3555314_175-190032-0_sp1.1 from training. Duration: 0.963625 2023-10-03 10:11:10,219 INFO [train.py:1046] (2/4) Epoch 35, batch 3900, loss[loss=0.1595, simple_loss=0.2292, pruned_loss=0.0449, over 23867.00 frames. ], tot_loss[loss=0.1596, simple_loss=0.2381, pruned_loss=0.04055, over 4707563.43 frames. ], batch size: 195, lr: 2.90e-03, grad_scale: 8.0 2023-10-03 10:11:10,280 WARNING [train.py:1204] (2/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_788-298907-0 from training. Duration: 0.89 2023-10-03 10:11:10,303 WARNING [train.py:1204] (2/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_225-70961-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 10:11:10,315 WARNING [train.py:1204] (2/4) Exclude cut with ID _58583_210_5236_1_1534726835266_3574249_235-175994-0_sp1.1 from training. Duration: 0.92725 2023-10-03 10:11:11,721 WARNING [train.py:1204] (2/4) Exclude cut with ID _12302_210_5219_1_1529924446466_8107280_907-182949-0_sp1.1 from training. Duration: 0.836375 2023-10-03 10:11:13,027 WARNING [train.py:1204] (2/4) Exclude cut with ID _54871_210_22356_1_1534665798253_3856359_94-193692-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 10:11:13,139 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_4528_210_3273_1_1527076654_3276777_342-168068-0_sp0.9 from training. Duration: 0.9666875 2023-10-03 10:11:14,426 WARNING [train.py:1204] (2/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_368-74239-0_sp1.1 from training. Duration: 0.92725 2023-10-03 10:11:14,427 WARNING [train.py:1204] (2/4) Exclude cut with ID _44831_210_13427_1_1533816011549_3689035_291-295928-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 10:11:15,705 WARNING [train.py:1204] (2/4) Exclude cut with ID _56406_210_9288_1_1534899618664_3496149_455-131997-0_sp1.1 from training. Duration: 0.97275 2023-10-03 10:11:15,712 WARNING [train.py:1204] (2/4) Exclude cut with ID _6473_210_1741_1_1527901307396_3815380_108-17335-0 from training. Duration: 0.65 2023-10-03 10:11:15,747 WARNING [train.py:1204] (2/4) Exclude cut with ID _14224_210_4381_1_1530529062037_4264010_286-135600-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 10:11:19,154 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_14056_210_4163_1_1530604483_7572827_928-321675-0_sp1.1 from training. Duration: 0.936375 2023-10-03 10:11:19,831 WARNING [train.py:1204] (2/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_81-300212-0_sp1.1 from training. Duration: 0.5454375 2023-10-03 10:11:19,868 WARNING [train.py:1204] (2/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_29-280622-0_sp0.9 from training. Duration: 0.911125 2023-10-03 10:11:22,417 WARNING [train.py:1204] (2/4) Exclude cut with ID _61079_210_4525_1_1534921008198_4011419_228-176447-0_sp1.1 from training. Duration: 0.936375 2023-10-03 10:11:23,772 WARNING [train.py:1204] (2/4) Exclude cut with ID _8394_210_6702_1_1528590504316_3540189_372-198878-0_sp1.1 from training. Duration: 0.5454375 2023-10-03 10:11:23,782 WARNING [train.py:1204] (2/4) Exclude cut with ID _64521_210_14699_1_1535243427259_3815328_244-208725-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 10:11:25,172 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8275_210_4198_1_1528455320_3892571_491-17707-0_sp0.9 from training. Duration: 0.711125 2023-10-03 10:11:27,883 WARNING [train.py:1204] (2/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_375-245677-0 from training. Duration: 0.81 2023-10-03 10:11:27,890 WARNING [train.py:1204] (2/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_690-286055-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 10:11:30,604 WARNING [train.py:1204] (2/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_89-321955-0 from training. Duration: 0.8 2023-10-03 10:11:30,634 WARNING [train.py:1204] (2/4) Exclude cut with ID _18895_210_6228_1_1531357274234_3784930_213-98721-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 10:11:32,621 WARNING [train.py:1204] (2/4) Exclude cut with ID _19708_210_9206_1_1532136650733_4238364_347-65240-0 from training. Duration: 0.97 2023-10-03 10:11:34,023 WARNING [train.py:1204] (2/4) Exclude cut with ID _64135_210_2253_1_1535677267876_7608972_498-5039-0 from training. Duration: 0.78 2023-10-03 10:11:35,626 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=1230153.3333333333, ans=0.1 2023-10-03 10:11:38,097 WARNING [train.py:1204] (2/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_146-276944-0_sp1.1 from training. Duration: 0.7545625 2023-10-03 10:11:39,596 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.balancer2.prob, batch_count=1230220.0, ans=0.125 2023-10-03 10:11:40,945 WARNING [train.py:1204] (2/4) Exclude cut with ID _63171_210_15488_1_1535104792794_3295110_155-34488-0_sp1.1 from training. Duration: 0.97275 2023-10-03 10:11:40,959 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_5438_210_1794_1_1527336687_2600009_233-58072-0_sp0.9 from training. Duration: 0.8555625 2023-10-03 10:11:41,015 WARNING [train.py:1204] (2/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_63-101996-0_sp0.9 from training. Duration: 0.97775 2023-10-03 10:11:45,177 WARNING [train.py:1204] (2/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_271-269084-0_sp1.1 from training. Duration: 0.7545625 2023-10-03 10:11:46,623 WARNING [train.py:1204] (2/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_345-167986-0_sp1.1 from training. Duration: 0.763625 2023-10-03 10:11:49,334 WARNING [train.py:1204] (2/4) Exclude cut with ID _39290_210_4220_1_1533862778760_3075118_92-311600-0_sp1.1 from training. Duration: 0.8 2023-10-03 10:11:49,340 WARNING [train.py:1204] (2/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_209-65306-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 10:11:51,165 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_26433_210_12108_1_1532157158_4056480_317-335807-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 10:11:56,562 WARNING [train.py:1204] (2/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_238-94127-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 10:11:56,609 WARNING [train.py:1204] (2/4) Exclude cut with ID _41314_210_12651_1_1533459476543_3766460_207-132038-0_sp1.1 from training. Duration: 0.8545625 2023-10-03 10:12:00,907 WARNING [train.py:1204] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_167-137255-0_sp1.1 from training. Duration: 0.5818125 2023-10-03 10:12:02,317 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_2009_210_2071_1_1525481134_4588937_385-302100-0_sp0.9 from training. Duration: 0.9666875 2023-10-03 10:12:12,635 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_15517_210_6753_1_1530954008_3487336_253-110617-0_sp1.1 from training. Duration: 0.936375 2023-10-03 10:12:16,595 WARNING [train.py:1204] (2/4) Exclude cut with ID _39290_210_4220_1_1533862778760_3075118_92-311600-0_sp0.9 from training. Duration: 0.97775 2023-10-03 10:12:16,640 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_42-142473-0 from training. Duration: 0.72 2023-10-03 10:12:16,686 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_2296_210_2087_1_1525503448_3397335_57-185430-0 from training. Duration: 0.6 2023-10-03 10:12:16,698 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_11305_210_1794_1_1529750428_7178148_257-312870-0_sp0.9 from training. Duration: 0.97775 2023-10-03 10:12:19,333 WARNING [train.py:1204] (2/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_222-198127-0 from training. Duration: 0.84 2023-10-03 10:12:20,729 WARNING [train.py:1204] (2/4) Exclude cut with ID _65263_210_22356_1_1535763598691_3921037_423-202699-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 10:12:20,784 WARNING [train.py:1204] (2/4) Exclude cut with ID _15037_210_2253_1_1530752294104_3267650_13-130361-0 from training. Duration: 0.99 2023-10-03 10:12:23,461 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.ff2_skip_rate, batch_count=1230420.0, ans=0.0 2023-10-03 10:12:24,443 INFO [train.py:1046] (2/4) Epoch 35, batch 3950, loss[loss=0.1697, simple_loss=0.2404, pruned_loss=0.04948, over 23455.00 frames. ], tot_loss[loss=0.1595, simple_loss=0.2382, pruned_loss=0.04044, over 4712955.04 frames. ], batch size: 285, lr: 2.90e-03, grad_scale: 8.0 2023-10-03 10:12:28,608 WARNING [train.py:1204] (2/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_628-198080-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 10:12:29,069 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.5.encoder.layers.1.feed_forward1.out_whiten, num_groups=1, num_channels=256, metric=7.05 vs. limit=15.0 2023-10-03 10:12:29,957 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_3171_210_2087_1_1526109084_2330007_37-64334-0 from training. Duration: 0.82 2023-10-03 10:12:30,007 WARNING [train.py:1204] (2/4) Exclude cut with ID _5287_210_2084_1_1527418883040_3878670_12-221186-0_sp1.1 from training. Duration: 0.8909375 2023-10-03 10:12:34,052 WARNING [train.py:1204] (2/4) Exclude cut with ID _6653_210_2629_1_1527919172441_6732679_1103-56445-0_sp1.1 from training. Duration: 0.663625 2023-10-03 10:12:35,455 WARNING [train.py:1204] (2/4) Exclude cut with ID _65263_210_22356_1_1535763598691_3921037_169-140068-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 10:12:40,284 WARNING [train.py:1204] (2/4) Exclude cut with ID IC0671W0118-331961 from training. Duration: 0.896 2023-10-03 10:12:41,639 WARNING [train.py:1204] (2/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_402-102311-0_sp1.1 from training. Duration: 0.6090625 2023-10-03 10:12:41,668 WARNING [train.py:1204] (2/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_202-293523-0 from training. Duration: 0.72 2023-10-03 10:12:41,725 WARNING [train.py:1204] (2/4) Exclude cut with ID IC0679W0038-335846 from training. Duration: 0.768 2023-10-03 10:12:42,902 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_37357_210_17024_1_1533089504_2316121_79-198866-0_sp1.1 from training. Duration: 0.936375 2023-10-03 10:12:44,607 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.conv_module1.balancer2.prob, batch_count=1230486.6666666667, ans=0.125 2023-10-03 10:12:45,701 WARNING [train.py:1204] (2/4) Exclude cut with ID _7606_210_4915_1_1528894253277_5080690_292-271285-0_sp1.1 from training. Duration: 0.963625 2023-10-03 10:12:45,721 WARNING [train.py:1204] (2/4) Exclude cut with ID _3541_210_2477_1_1526475408709_3263973_377-296839-0_sp0.9 from training. Duration: 0.611125 2023-10-03 10:12:45,724 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_45886_210_17024_1_1533704917_2435333_58-131069-0_sp1.1 from training. Duration: 0.963625 2023-10-03 10:12:48,449 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.574e+02 1.899e+02 2.029e+02 2.399e+02 3.247e+02, threshold=4.058e+02, percent-clipped=0.0 2023-10-03 10:12:48,564 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6961_210_2087_1_1527937168_6815011_395-185060-0 from training. Duration: 0.95 2023-10-03 10:12:49,907 WARNING [train.py:1204] (2/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_304-266479-0_sp1.1 from training. Duration: 0.97275 2023-10-03 10:12:51,156 WARNING [train.py:1204] (2/4) Exclude cut with ID _3867_210_1883_1_1526641200575_4475830_319-349850-0_sp1.1 from training. Duration: 0.6090625 2023-10-03 10:12:51,172 WARNING [train.py:1204] (2/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_304-59227-0_sp0.9 from training. Duration: 0.8555625 2023-10-03 10:12:52,987 WARNING [train.py:1204] (2/4) Exclude cut with ID _8021_210_7994_1_1528376451358_3653069_100-140231-0_sp0.9 from training. Duration: 0.7444375 2023-10-03 10:12:53,040 WARNING [train.py:1204] (2/4) Exclude cut with ID _8833_210_5219_1_1528592665210_7553059_329-125156-0_sp0.9 from training. Duration: 0.911125 2023-10-03 10:12:56,498 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.bypass.skip_rate, batch_count=1230553.3333333333, ans=0.09899494936611666 2023-10-03 10:12:59,276 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.bypass_mid.scale_min, batch_count=1230553.3333333333, ans=0.2 2023-10-03 10:13:03,263 WARNING [train.py:1204] (2/4) Exclude cut with ID _14459_210_2228_1_1531443641946_7516700_792-287234-0_sp1.1 from training. Duration: 0.92725 2023-10-03 10:13:03,281 WARNING [train.py:1204] (2/4) Exclude cut with ID _19708_210_9206_1_1532136650733_4238364_347-65240-0_sp1.1 from training. Duration: 0.8818125 2023-10-03 10:13:07,684 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.balancer_na.min_abs, batch_count=1230620.0, ans=0.02 2023-10-03 10:13:09,219 WARNING [train.py:1204] (2/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_70-27732-0 from training. Duration: 0.94 2023-10-03 10:13:12,102 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_397-132565-0 from training. Duration: 0.99 2023-10-03 10:13:12,105 WARNING [train.py:1204] (2/4) Exclude cut with ID _49728_210_12651_1_1534041034294_6788150_624-195301-0 from training. Duration: 0.85 2023-10-03 10:13:13,439 WARNING [train.py:1204] (2/4) Exclude cut with ID _55429_210_10626_1_1534467588527_7174143_377-169391-0_sp1.1 from training. Duration: 0.87275 2023-10-03 10:13:13,547 WARNING [train.py:1204] (2/4) Exclude cut with ID _49376_210_5986_1_1533985278732_3370740_354-344695-0_sp1.1 from training. Duration: 0.9 2023-10-03 10:13:21,870 WARNING [train.py:1204] (2/4) Exclude cut with ID _4697_210_2069_1_1526815801953_3823098_276-96593-0_sp1.1 from training. Duration: 0.7 2023-10-03 10:13:21,886 WARNING [train.py:1204] (2/4) Exclude cut with ID _15012_210_1968_1_1530856820429_5095350_688-178849-0_sp0.9 from training. Duration: 0.711125 2023-10-03 10:13:21,922 WARNING [train.py:1204] (2/4) Exclude cut with ID _25565_210_12392_1_1532423009382_3952098_372-69023-0_sp1.1 from training. Duration: 0.936375 2023-10-03 10:13:23,853 WARNING [train.py:1204] (2/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_61-327062-0_sp1.1 from training. Duration: 0.736375 2023-10-03 10:13:23,883 WARNING [train.py:1204] (2/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_215-141059-0 from training. Duration: 0.62 2023-10-03 10:13:27,301 WARNING [train.py:1204] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_6-226750-0_sp1.1 from training. Duration: 0.8545625 2023-10-03 10:13:28,623 WARNING [train.py:1204] (2/4) Exclude cut with ID _14348_210_8341_1_1530599434996_3778640_240-232370-0_sp1.1 from training. Duration: 0.763625 2023-10-03 10:13:32,685 WARNING [train.py:1204] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_468-189058-0 from training. Duration: 0.96 2023-10-03 10:13:37,227 INFO [train.py:1046] (2/4) Epoch 35, batch 4000, loss[loss=0.1312, simple_loss=0.2127, pruned_loss=0.02486, over 24595.00 frames. ], tot_loss[loss=0.1598, simple_loss=0.2385, pruned_loss=0.04049, over 4717144.23 frames. ], batch size: 60, lr: 2.90e-03, grad_scale: 16.0 2023-10-03 10:13:40,183 WARNING [train.py:1204] (2/4) Exclude cut with ID _21987_210_5854_1_1531821500482_3637038_247-93340-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 10:13:47,198 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.balancer2.prob, batch_count=1230753.3333333333, ans=0.125 2023-10-03 10:13:48,370 WARNING [train.py:1204] (2/4) Exclude cut with ID _66868_210_9527_1_1535509287954_4561140_436-191602-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 10:13:49,918 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder_embed.dropout.p, batch_count=1230820.0, ans=0.1 2023-10-03 10:13:51,584 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.conv_module1.balancer2.prob, batch_count=1230820.0, ans=0.125 2023-10-03 10:13:52,971 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.nonlin_attention.balancer.prob, batch_count=1230820.0, ans=0.125 2023-10-03 10:13:54,740 WARNING [train.py:1204] (2/4) Exclude cut with ID _18895_210_6228_1_1531357274234_3784930_394-58203-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 10:13:54,793 WARNING [train.py:1204] (2/4) Exclude cut with ID _58605_210_12210_1_1534723195939_3904259_258-281498-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 10:13:54,819 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_33348_210_5632_1_1533086912_3324030_245-292752-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 10:13:54,848 WARNING [train.py:1204] (2/4) Exclude cut with ID _67831_210_12392_1_1535612464202_3092612_346-2148-0 from training. Duration: 0.93 2023-10-03 10:13:56,741 WARNING [train.py:1204] (2/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_369-222060-0_sp0.9 from training. Duration: 0.788875 2023-10-03 10:13:57,429 WARNING [train.py:1204] (2/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_245-174553-0 from training. Duration: 0.88 2023-10-03 10:13:57,434 WARNING [train.py:1204] (2/4) Exclude cut with ID _3541_210_2477_1_1526475408709_3263973_231-104736-0_sp1.1 from training. Duration: 0.7090625 2023-10-03 10:13:57,440 WARNING [train.py:1204] (2/4) Exclude cut with ID _44585_210_18628_1_1533605350387_4518199_280-73534-0 from training. Duration: 0.89 2023-10-03 10:14:00,009 WARNING [train.py:1204] (2/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_22-201476-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 10:14:03,062 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.attention_skip_rate, batch_count=1230820.0, ans=0.0 2023-10-03 10:14:04,165 WARNING [train.py:1204] (2/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_325-62802-0_sp0.9 from training. Duration: 0.9666875 2023-10-03 10:14:04,185 WARNING [train.py:1204] (2/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_199-274062-0_sp1.1 from training. Duration: 0.97275 2023-10-03 10:14:04,188 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_8-319957-0_sp1.1 from training. Duration: 0.87275 2023-10-03 10:14:05,530 WARNING [train.py:1204] (2/4) Exclude cut with ID _29343_210_4381_1_1532602757752_3674109_224-349726-0_sp1.1 from training. Duration: 0.936375 2023-10-03 10:14:05,542 WARNING [train.py:1204] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_520-126729-0_sp1.1 from training. Duration: 0.636375 2023-10-03 10:14:06,995 WARNING [train.py:1204] (2/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_239-22076-0_sp1.1 from training. Duration: 0.77275 2023-10-03 10:14:10,239 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9012W0011-501146 from training. Duration: 0.896 2023-10-03 10:14:10,315 WARNING [train.py:1204] (2/4) Exclude cut with ID _29857_210_20034_1_1532395886753_3798160_279-185801-0_sp1.1 from training. Duration: 0.82725 2023-10-03 10:14:11,630 WARNING [train.py:1204] (2/4) Exclude cut with ID _43552_210_8913_1_1533726031059_3523429_261-223933-0_sp1.1 from training. Duration: 0.92725 2023-10-03 10:14:14,447 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9029W0025-509426 from training. Duration: 0.896 2023-10-03 10:14:15,854 WARNING [train.py:1204] (2/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_337-181502-0_sp1.1 from training. Duration: 0.5909375 2023-10-03 10:14:15,878 WARNING [train.py:1204] (2/4) Exclude cut with ID _68635_210_9317_1_1535770813410_4315870_159-332599-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 10:14:22,723 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_161-276996-0 from training. Duration: 0.95 2023-10-03 10:14:22,773 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7088_210_1874_1_1527939776_1101113_127-68870-0_sp1.1 from training. Duration: 0.936375 2023-10-03 10:14:25,678 WARNING [train.py:1204] (2/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_322-217220-0_sp1.1 from training. Duration: 0.7818125 2023-10-03 10:14:25,739 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9072W0002-530636 from training. Duration: 0.768 2023-10-03 10:14:27,087 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_340-336408-0_sp1.1 from training. Duration: 0.8454375 2023-10-03 10:14:27,150 WARNING [train.py:1204] (2/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_395-37222-0 from training. Duration: 0.76 2023-10-03 10:14:27,153 WARNING [train.py:1204] (2/4) Exclude cut with ID _64099_210_17301_1_1535526977656_1578188_159-209156-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 10:14:27,392 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.self_attn_weights.pos_emb_skip_rate, batch_count=1230953.3333333333, ans=0.0 2023-10-03 10:14:29,017 WARNING [train.py:1204] (2/4) Exclude cut with ID _53950_210_5816_1_1534417264919_3862951_28-29681-0_sp1.1 from training. Duration: 0.92725 2023-10-03 10:14:30,770 WARNING [train.py:1204] (2/4) Exclude cut with ID _4681_210_3525_1_1527250689464_120889_15-126760-0_sp1.1 from training. Duration: 0.736375 2023-10-03 10:14:32,634 WARNING [train.py:1204] (2/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_242-3472-0_sp1.1 from training. Duration: 0.663625 2023-10-03 10:14:32,665 WARNING [train.py:1204] (2/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_436-186311-0_sp0.9 from training. Duration: 0.72225 2023-10-03 10:14:32,689 WARNING [train.py:1204] (2/4) Exclude cut with ID _3675_210_4557_1_1526639934408_4182089_452-172147-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 10:14:35,347 WARNING [train.py:1204] (2/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_80-147301-0 from training. Duration: 0.84 2023-10-03 10:14:35,370 WARNING [train.py:1204] (2/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_351-107588-0_sp1.1 from training. Duration: 0.92725 2023-10-03 10:14:36,798 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9111W0002-550781 from training. Duration: 0.896 2023-10-03 10:14:41,077 WARNING [train.py:1204] (2/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_465-156500-0_sp1.1 from training. Duration: 0.6545625 2023-10-03 10:14:44,335 WARNING [train.py:1204] (2/4) Exclude cut with ID _13938_210_9988_1_1530518718978_3693960_304-112784-0_sp1.1 from training. Duration: 0.4181875 2023-10-03 10:14:47,070 WARNING [train.py:1204] (2/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_202-67017-0_sp1.1 from training. Duration: 0.7454375 2023-10-03 10:14:47,105 WARNING [train.py:1204] (2/4) Exclude cut with ID _58414_210_8254_1_1535421609162_7151182_448-4358-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 10:14:47,178 WARNING [train.py:1204] (2/4) Exclude cut with ID _66840_210_5219_1_1535452168426_4196349_203-243234-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 10:14:48,458 WARNING [train.py:1204] (2/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_201-213513-0_sp1.1 from training. Duration: 0.97275 2023-10-03 10:14:51,142 INFO [train.py:1046] (2/4) Epoch 35, batch 4050, loss[loss=0.1519, simple_loss=0.2334, pruned_loss=0.03516, over 23357.00 frames. ], tot_loss[loss=0.1599, simple_loss=0.2391, pruned_loss=0.04036, over 4713036.77 frames. ], batch size: 93, lr: 2.90e-03, grad_scale: 16.0 2023-10-03 10:14:53,970 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6291_210_3273_1_1527683649_1078498_117-187560-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 10:14:55,458 WARNING [train.py:1204] (2/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_135-174080-0_sp0.9 from training. Duration: 0.588875 2023-10-03 10:14:56,716 WARNING [train.py:1204] (2/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_45-265684-0 from training. Duration: 0.72 2023-10-03 10:14:58,709 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_435-313305-0_sp1.1 from training. Duration: 0.6454375 2023-10-03 10:14:58,732 WARNING [train.py:1204] (2/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_235-307337-0_sp1.1 from training. Duration: 0.8545625 2023-10-03 10:15:00,129 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7762_210_1794_1_1528199508_4057408_419-125009-0_sp0.9 from training. Duration: 0.92225 2023-10-03 10:15:01,446 WARNING [train.py:1204] (2/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_215-141059-0_sp1.1 from training. Duration: 0.563625 2023-10-03 10:15:03,503 WARNING [train.py:1204] (2/4) Exclude cut with ID _27013_210_1389_1_1532437173240_3252649_222-125573-0_sp1.1 from training. Duration: 0.8909375 2023-10-03 10:15:05,626 WARNING [train.py:1204] (2/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_126-274694-0_sp1.1 from training. Duration: 0.8909375 2023-10-03 10:15:06,331 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.2.encoder.layers.2.conv_module1.whiten, num_groups=1, num_channels=384, metric=5.34 vs. limit=15.0 2023-10-03 10:15:08,339 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_2592_210_2290_1_1525697025_4882925_157-70512-0_sp1.1 from training. Duration: 0.82725 2023-10-03 10:15:09,717 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_292-265928-0_sp0.9 from training. Duration: 0.5666875 2023-10-03 10:15:12,555 WARNING [train.py:1204] (2/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_1211-226066-0_sp1.1 from training. Duration: 0.6909375 2023-10-03 10:15:12,609 WARNING [train.py:1204] (2/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_394-265747-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 10:15:12,815 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.bypass.scale_min, batch_count=1231153.3333333333, ans=0.2 2023-10-03 10:15:17,185 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.537e+02 1.812e+02 1.989e+02 2.164e+02 3.056e+02, threshold=3.978e+02, percent-clipped=0.0 2023-10-03 10:15:17,270 WARNING [train.py:1204] (2/4) Exclude cut with ID _48782_210_12658_1_1534467701544_2300422_239-67139-0_sp1.1 from training. Duration: 0.97275 2023-10-03 10:15:18,636 WARNING [train.py:1204] (2/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_329-113173-0_sp1.1 from training. Duration: 0.563625 2023-10-03 10:15:20,152 WARNING [train.py:1204] (2/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_81-75849-0_sp0.9 from training. Duration: 0.4333125 2023-10-03 10:15:21,624 WARNING [train.py:1204] (2/4) Exclude cut with ID _9235_210_5219_1_1528891154253_8046120_1003-101638-0 from training. Duration: 0.73 2023-10-03 10:15:21,650 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9309W0189-640316 from training. Duration: 0.64 2023-10-03 10:15:22,942 WARNING [train.py:1204] (2/4) Exclude cut with ID _45329_210_7005_1_1534157951211_6774339_503-287883-0_sp1.1 from training. Duration: 0.8 2023-10-03 10:15:26,577 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.0.layers.1.whiten, num_groups=1, num_channels=192, metric=3.82 vs. limit=12.0 2023-10-03 10:15:29,167 WARNING [train.py:1204] (2/4) Exclude cut with ID _8833_210_5219_1_1528592665210_7553059_611-80313-0 from training. Duration: 0.64 2023-10-03 10:15:31,008 WARNING [train.py:1204] (2/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_335-106626-0_sp1.1 from training. Duration: 0.963625 2023-10-03 10:15:31,391 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.bypass.skip_rate, batch_count=1231220.0, ans=0.07 2023-10-03 10:15:33,111 WARNING [train.py:1204] (2/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_237-44332-0_sp1.1 from training. Duration: 0.8545625 2023-10-03 10:15:33,241 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.conv_module1.balancer2.min_positive, batch_count=1231220.0, ans=0.05 2023-10-03 10:15:33,331 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.feed_forward3.hidden_balancer.prob, batch_count=1231220.0, ans=0.125 2023-10-03 10:15:37,230 WARNING [train.py:1204] (2/4) Exclude cut with ID _46952_210_5986_1_1534230010357_3107379_178-147341-0_sp1.1 from training. Duration: 0.97275 2023-10-03 10:15:37,272 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_13091_210_7925_1_1530609447_8987354_946-154426-0_sp1.1 from training. Duration: 0.936375 2023-10-03 10:15:37,288 WARNING [train.py:1204] (2/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_326-17061-0_sp1.1 from training. Duration: 0.8545625 2023-10-03 10:15:41,453 WARNING [train.py:1204] (2/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_38-169920-0_sp1.1 from training. Duration: 0.82725 2023-10-03 10:15:44,605 WARNING [train.py:1204] (2/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_283-220372-0 from training. Duration: 0.56 2023-10-03 10:15:44,613 WARNING [train.py:1204] (2/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_252-216674-0_sp1.1 from training. Duration: 0.4909375 2023-10-03 10:15:47,323 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_202-91290-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 10:15:47,433 WARNING [train.py:1204] (2/4) Exclude cut with ID _41314_210_12651_1_1533459476543_3766460_80-74184-0 from training. Duration: 0.83 2023-10-03 10:15:52,878 WARNING [train.py:1204] (2/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_411-271358-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 10:15:54,859 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.5.encoder.layers.0.nonlin_attention.whiten2, num_groups=1, num_channels=256, metric=3.40 vs. limit=15.0 2023-10-03 10:15:55,997 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.feed_forward1.out_proj.dropout_p, batch_count=1231353.3333333333, ans=0.1 2023-10-03 10:15:58,522 WARNING [train.py:1204] (2/4) Exclude cut with ID _68777_210_4353_1_1535810390731_3117904_129-166012-0 from training. Duration: 0.85 2023-10-03 10:15:58,600 WARNING [train.py:1204] (2/4) Exclude cut with ID _44982_210_1511_1_1534312820108_3726029_410-109759-0_sp1.1 from training. Duration: 0.963625 2023-10-03 10:15:58,604 WARNING [train.py:1204] (2/4) Exclude cut with ID _14224_210_4381_1_1530529062037_4264010_364-22817-0_sp1.1 from training. Duration: 0.6454375 2023-10-03 10:16:03,597 WARNING [train.py:1204] (2/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_89-242335-0 from training. Duration: 0.95 2023-10-03 10:16:03,612 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_12868_210_6753_1_1530702479_3610564_151-325626-0 from training. Duration: 0.97 2023-10-03 10:16:03,613 WARNING [train.py:1204] (2/4) Exclude cut with ID _6275_210_2084_1_1528110062213_7669049_786-264332-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 10:16:03,816 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.feed_forward2.hidden_balancer.prob, batch_count=1231353.3333333333, ans=0.125 2023-10-03 10:16:06,179 INFO [train.py:1046] (2/4) Epoch 35, batch 4100, loss[loss=0.1669, simple_loss=0.2375, pruned_loss=0.0482, over 23377.00 frames. ], tot_loss[loss=0.1609, simple_loss=0.2401, pruned_loss=0.04081, over 4722008.32 frames. ], batch size: 285, lr: 2.90e-03, grad_scale: 16.0 2023-10-03 10:16:06,301 WARNING [train.py:1204] (2/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_35-153886-0_sp1.1 from training. Duration: 0.763625 2023-10-03 10:16:08,999 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_24007_210_1794_1_1531918925_1663875_116-100916-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 10:16:09,021 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_162-262717-0_sp1.1 from training. Duration: 0.7545625 2023-10-03 10:16:13,439 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.bypass_mid.scale_min, batch_count=1231420.0, ans=0.2 2023-10-03 10:16:16,354 WARNING [train.py:1204] (2/4) Exclude cut with ID _8263_210_1664_1_1528438953494_294100_40-204731-0 from training. Duration: 0.5 2023-10-03 10:16:17,765 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_12868_210_6753_1_1530702479_3610564_416-293746-0 from training. Duration: 0.9 2023-10-03 10:16:19,189 WARNING [train.py:1204] (2/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_605-219732-0 from training. Duration: 0.94 2023-10-03 10:16:19,259 WARNING [train.py:1204] (2/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_454-123002-0 from training. Duration: 0.69 2023-10-03 10:16:19,264 WARNING [train.py:1204] (2/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_25-304273-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 10:16:20,574 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6860_210_4852_1_1527917457_3833327_451-39702-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 10:16:20,604 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_304-16505-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 10:16:20,623 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6291_210_3273_1_1527683649_1078498_4-266348-0_sp1.1 from training. Duration: 0.7090625 2023-10-03 10:16:21,956 WARNING [train.py:1204] (2/4) Exclude cut with ID ID0149W0001-753701 from training. Duration: 0.989 2023-10-03 10:16:24,731 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_41251_210_15368_1_1533429767_4820695_394-55393-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 10:16:26,130 WARNING [train.py:1204] (2/4) Exclude cut with ID _8793_210_6310_1_1528542985139_2310009_121-179782-0_sp1.1 from training. Duration: 0.72725 2023-10-03 10:16:26,147 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_2729_210_3528_1_1525863505_3882214_421-73047-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 10:16:26,222 WARNING [train.py:1204] (2/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_278-154492-0_sp0.9 from training. Duration: 0.8444375 2023-10-03 10:16:30,213 WARNING [train.py:1204] (2/4) Exclude cut with ID _14170_210_5219_1_1530525949868_4469650_412-317077-0_sp0.9 from training. Duration: 0.8333125 2023-10-03 10:16:32,108 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_727-138317-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 10:16:33,410 WARNING [train.py:1204] (2/4) Exclude cut with ID _25808_210_13292_1_1532343514562_7366109_345-335048-0_sp1.1 from training. Duration: 0.92725 2023-10-03 10:16:33,429 WARNING [train.py:1204] (2/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_506-23200-0 from training. Duration: 0.97 2023-10-03 10:16:33,609 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.1.feed_forward1.hidden_balancer.prob, batch_count=1231553.3333333333, ans=0.125 2023-10-03 10:16:34,720 WARNING [train.py:1204] (2/4) Exclude cut with ID _44718_210_5816_1_1534050051869_6469030_643-245187-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 10:16:34,724 WARNING [train.py:1204] (2/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_42-186162-0_sp1.1 from training. Duration: 0.836375 2023-10-03 10:16:34,751 WARNING [train.py:1204] (2/4) Exclude cut with ID _3231_210_2106_1_1526205864934_6784220_171-256004-0_sp1.1 from training. Duration: 0.7818125 2023-10-03 10:16:34,768 WARNING [train.py:1204] (2/4) Exclude cut with ID _25914_210_6400_1_1532141317058_644740_43-248478-0_sp1.1 from training. Duration: 0.936375 2023-10-03 10:16:34,806 WARNING [train.py:1204] (2/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_577-287150-0 from training. Duration: 0.82 2023-10-03 10:16:37,970 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_46015_210_7234_1_1533863319_3087837_376-276068-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 10:16:39,250 WARNING [train.py:1204] (2/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_51-200220-0 from training. Duration: 0.56 2023-10-03 10:16:40,749 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_452-96101-0_sp1.1 from training. Duration: 0.963625 2023-10-03 10:16:42,308 WARNING [train.py:1204] (2/4) Exclude cut with ID _13252_210_1925_1_1530671372047_3336909_263-168388-0_sp1.1 from training. Duration: 0.7818125 2023-10-03 10:16:42,309 WARNING [train.py:1204] (2/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_469-94673-0 from training. Duration: 0.79 2023-10-03 10:16:43,733 WARNING [train.py:1204] (2/4) Exclude cut with ID _14922_210_6228_1_1530707435273_3693310_303-228738-0_sp1.1 from training. Duration: 0.763625 2023-10-03 10:16:45,473 WARNING [train.py:1204] (2/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_262-250826-0_sp1.1 from training. Duration: 0.9 2023-10-03 10:16:45,502 WARNING [train.py:1204] (2/4) Exclude cut with ID _68777_210_4353_1_1535810390731_3117904_129-166012-0_sp1.1 from training. Duration: 0.77275 2023-10-03 10:16:46,875 WARNING [train.py:1204] (2/4) Exclude cut with ID _58895_210_8750_1_1535184081320_3649089_265-323810-0 from training. Duration: 0.96 2023-10-03 10:16:48,140 WARNING [train.py:1204] (2/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_120-76843-0_sp0.9 from training. Duration: 0.611125 2023-10-03 10:16:50,770 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7544_210_6753_1_1528587722_3576686_295-258479-0_sp0.9 from training. Duration: 0.9333125 2023-10-03 10:16:52,169 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7046_210_6753_1_1527895337_6920917_783-278109-0 from training. Duration: 0.77 2023-10-03 10:16:52,216 WARNING [train.py:1204] (2/4) Exclude cut with ID _42326_210_7770_1_1533814282531_3707269_113-308323-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 10:16:52,253 WARNING [train.py:1204] (2/4) Exclude cut with ID _7852_210_5219_1_1528286471316_8139400_242-105336-0_sp0.9 from training. Duration: 0.8 2023-10-03 10:16:55,084 WARNING [train.py:1204] (2/4) Exclude cut with ID _56235_210_10640_1_1534557335235_4161099_114-277905-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 10:17:00,072 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.0.layers.1.self_attn2.whiten, num_groups=1, num_channels=192, metric=10.46 vs. limit=22.5 2023-10-03 10:17:00,623 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_13118_210_7925_1_1530755612_7608297_42-66683-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 10:17:02,662 WARNING [train.py:1204] (2/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_901-79438-0_sp1.1 from training. Duration: 0.8545625 2023-10-03 10:17:03,970 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_30280_210_1881_1_1532872779_3660139_211-182194-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 10:17:08,647 WARNING [train.py:1204] (2/4) Exclude cut with ID _14898_210_9206_1_1530766812377_4177190_621-45732-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 10:17:09,937 WARNING [train.py:1204] (2/4) Exclude cut with ID _35350_210_16040_1_1533040191183_4075299_224-133775-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 10:17:11,632 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=1231686.6666666667, ans=0.1 2023-10-03 10:17:12,859 WARNING [train.py:1204] (2/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_147-293311-0_sp1.1 from training. Duration: 0.8545625 2023-10-03 10:17:15,925 WARNING [train.py:1204] (2/4) Exclude cut with ID _12302_210_5219_1_1529924446466_8107280_835-279754-0_sp1.1 from training. Duration: 0.8181875 2023-10-03 10:17:18,750 INFO [train.py:1046] (2/4) Epoch 35, batch 4150, loss[loss=0.1537, simple_loss=0.245, pruned_loss=0.03126, over 24407.00 frames. ], tot_loss[loss=0.1611, simple_loss=0.2403, pruned_loss=0.04099, over 4711920.28 frames. ], batch size: 77, lr: 2.90e-03, grad_scale: 16.0 2023-10-03 10:17:20,137 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8453_210_4211_1_1528502940_8562730_347-330673-0_sp0.9 from training. Duration: 0.8 2023-10-03 10:17:21,565 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_213-103282-0_sp0.9 from training. Duration: 0.8666875 2023-10-03 10:17:22,864 WARNING [train.py:1204] (2/4) Exclude cut with ID _49728_210_12651_1_1534041034294_6788150_66-43496-0_sp1.1 from training. Duration: 0.97275 2023-10-03 10:17:22,865 WARNING [train.py:1204] (2/4) Exclude cut with ID _19023_210_4353_1_1531378892552_5820780_28-36615-0_sp1.1 from training. Duration: 0.8818125 2023-10-03 10:17:25,611 WARNING [train.py:1204] (2/4) Exclude cut with ID _14349_210_8341_1_1530615710818_3442470_115-285975-0 from training. Duration: 0.86 2023-10-03 10:17:26,896 WARNING [train.py:1204] (2/4) Exclude cut with ID _12734_210_8341_1_1530183614296_3891929_236-47464-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 10:17:26,953 WARNING [train.py:1204] (2/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_177-236809-0 from training. Duration: 0.49 2023-10-03 10:17:28,344 WARNING [train.py:1204] (2/4) Exclude cut with ID _13678_210_7497_1_1530530069226_3463170_371-3221-0 from training. Duration: 0.9 2023-10-03 10:17:28,358 WARNING [train.py:1204] (2/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_48-338976-0 from training. Duration: 0.89 2023-10-03 10:17:30,141 WARNING [train.py:1204] (2/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_1167-309744-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 10:17:34,614 WARNING [train.py:1204] (2/4) Exclude cut with ID _30327_210_16049_1_1532484161295_3924169_401-70819-0_sp0.9 from training. Duration: 0.9555625 2023-10-03 10:17:34,622 WARNING [train.py:1204] (2/4) Exclude cut with ID _55134_210_21982_1_1534590028377_3882170_346-91022-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 10:17:39,204 WARNING [train.py:1204] (2/4) Exclude cut with ID _14459_210_2228_1_1531443641946_7516700_174-118801-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 10:17:39,264 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_269-278006-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 10:17:40,609 WARNING [train.py:1204] (2/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_390-339137-0_sp0.9 from training. Duration: 0.62225 2023-10-03 10:17:40,737 WARNING [train.py:1204] (2/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_253-308377-0_sp1.1 from training. Duration: 0.4454375 2023-10-03 10:17:42,067 WARNING [train.py:1204] (2/4) Exclude cut with ID _23825_210_13449_1_1531915313526_3497150_240-271367-0_sp1.1 from training. Duration: 0.8818125 2023-10-03 10:17:43,362 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.504e+02 1.871e+02 2.078e+02 2.359e+02 3.570e+02, threshold=4.155e+02, percent-clipped=0.0 2023-10-03 10:17:43,471 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1015-343884-0_sp0.9 from training. Duration: 0.688875 2023-10-03 10:17:48,031 WARNING [train.py:1204] (2/4) Exclude cut with ID _23514_210_1389_1_1531832139785_3031439_335-48210-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 10:17:52,142 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_309-65177-0_sp1.1 from training. Duration: 0.8 2023-10-03 10:17:53,484 WARNING [train.py:1204] (2/4) Exclude cut with ID _14368_210_4220_1_1530532714870_3595529_231-137892-0 from training. Duration: 0.99 2023-10-03 10:17:54,831 WARNING [train.py:1204] (2/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_57-270037-0 from training. Duration: 0.77 2023-10-03 10:17:54,836 WARNING [train.py:1204] (2/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_465-214726-0_sp1.1 from training. Duration: 0.7818125 2023-10-03 10:17:56,778 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.4.encoder.layers.0.nonlin_attention.whiten2, num_groups=1, num_channels=384, metric=10.84 vs. limit=15.0 2023-10-03 10:17:57,509 WARNING [train.py:1204] (2/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_106-300985-0 from training. Duration: 0.9 2023-10-03 10:17:57,514 WARNING [train.py:1204] (2/4) Exclude cut with ID _14632_210_9988_1_1530691279392_3923200_161-129530-0_sp1.1 from training. Duration: 0.87275 2023-10-03 10:17:57,527 WARNING [train.py:1204] (2/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_514-88370-0_sp1.1 from training. Duration: 0.963625 2023-10-03 10:17:58,972 WARNING [train.py:1204] (2/4) Exclude cut with ID _56495_210_4234_1_1534748080334_4136219_305-334505-0_sp1.1 from training. Duration: 0.92725 2023-10-03 10:17:59,040 WARNING [train.py:1204] (2/4) Exclude cut with ID _42875_210_14856_1_1533704397487_4012433_61-264914-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 10:18:03,743 WARNING [train.py:1204] (2/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_91-148831-0 from training. Duration: 0.99 2023-10-03 10:18:05,781 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.conv_module2.balancer2.prob, batch_count=1231953.3333333333, ans=0.125 2023-10-03 10:18:06,928 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_435-313305-0_sp0.9 from training. Duration: 0.788875 2023-10-03 10:18:07,044 WARNING [train.py:1204] (2/4) Exclude cut with ID _16744_210_5986_1_1531724405384_3907567_242-111316-0_sp1.1 from training. Duration: 0.8454375 2023-10-03 10:18:07,106 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_257-20733-0 from training. Duration: 0.76 2023-10-03 10:18:08,967 WARNING [train.py:1204] (2/4) Exclude cut with ID _8064_210_5399_1_1528372299291_4282973_4-91037-0_sp1.1 from training. Duration: 0.8 2023-10-03 10:18:10,293 WARNING [train.py:1204] (2/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_3-276701-0 from training. Duration: 0.91 2023-10-03 10:18:13,375 WARNING [train.py:1204] (2/4) Exclude cut with ID _3992_210_1747_1_1526644816031_3731060_36-147855-0_sp1.1 from training. Duration: 0.7090625 2023-10-03 10:18:15,094 WARNING [train.py:1204] (2/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_1296-298671-0_sp1.1 from training. Duration: 0.963625 2023-10-03 10:18:16,587 WARNING [train.py:1204] (2/4) Exclude cut with ID _68965_210_1883_1_1535698962272_3497490_309-334593-0_sp1.1 from training. Duration: 0.92725 2023-10-03 10:18:17,974 WARNING [train.py:1204] (2/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_128-296581-0 from training. Duration: 0.74 2023-10-03 10:18:17,974 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_17613_210_2151_1_1531137569_2366160_170-299079-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 10:18:17,976 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6287_210_3273_1_1527509506_2653716_182-199887-0_sp1.1 from training. Duration: 0.463625 2023-10-03 10:18:19,344 WARNING [train.py:1204] (2/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_80-135200-0_sp1.1 from training. Duration: 0.7181875 2023-10-03 10:18:20,817 WARNING [train.py:1204] (2/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_34-183685-0 from training. Duration: 0.98 2023-10-03 10:18:22,090 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8278_210_2550_1_1528637099_4220413_418-210155-0_sp1.1 from training. Duration: 0.92725 2023-10-03 10:18:22,094 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8853_210_3273_1_1528633899_818968_51-187685-0_sp0.9 from training. Duration: 0.7666875 2023-10-03 10:18:22,113 WARNING [train.py:1204] (2/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_332-221057-0_sp1.1 from training. Duration: 0.6181875 2023-10-03 10:18:22,183 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_2221_210_1988_1_1525597051_4935541_415-296882-0 from training. Duration: 0.77 2023-10-03 10:18:23,518 WARNING [train.py:1204] (2/4) Exclude cut with ID _33640_210_2655_1_1533117605479_4855469_770-278185-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 10:18:23,546 WARNING [train.py:1204] (2/4) Exclude cut with ID _5287_210_2084_1_1527418883040_3878670_333-48508-0_sp1.1 from training. Duration: 0.5454375 2023-10-03 10:18:24,850 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_43802_210_2290_1_1533516203_8829504_142-276839-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 10:18:26,414 WARNING [train.py:1204] (2/4) Exclude cut with ID _28394_210_8913_1_1532430049518_3650118_126-139371-0_sp1.1 from training. Duration: 0.92725 2023-10-03 10:18:27,690 WARNING [train.py:1204] (2/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_76-203867-0 from training. Duration: 0.72 2023-10-03 10:18:27,733 WARNING [train.py:1204] (2/4) Exclude cut with ID _6194_210_1534_1_1527511826160_3941194_14-282876-0_sp0.9 from training. Duration: 0.788875 2023-10-03 10:18:29,360 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.conv_skip_rate, batch_count=1232020.0, ans=0.0 2023-10-03 10:18:32,343 INFO [train.py:1046] (2/4) Epoch 35, batch 4200, loss[loss=0.141, simple_loss=0.2207, pruned_loss=0.03062, over 24463.00 frames. ], tot_loss[loss=0.1609, simple_loss=0.2397, pruned_loss=0.04105, over 4719724.15 frames. ], batch size: 58, lr: 2.90e-03, grad_scale: 16.0 2023-10-03 10:18:32,404 WARNING [train.py:1204] (2/4) Exclude cut with ID _35355_210_16040_1_1533212988977_4016229_74-190076-0_sp1.1 from training. Duration: 0.72725 2023-10-03 10:18:33,839 WARNING [train.py:1204] (2/4) Exclude cut with ID _33920_210_4220_1_1533257851500_3084399_198-334063-0 from training. Duration: 0.94 2023-10-03 10:18:35,827 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6291_210_3273_1_1527683649_1078498_4-266348-0_sp0.9 from training. Duration: 0.8666875 2023-10-03 10:18:37,270 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_23856_210_4238_1_1531994785_3087934_122-38075-0_sp1.1 from training. Duration: 0.936375 2023-10-03 10:18:39,082 WARNING [train.py:1204] (2/4) Exclude cut with ID _4697_210_2069_1_1526815801953_3823098_276-96593-0_sp0.9 from training. Duration: 0.8555625 2023-10-03 10:18:39,153 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_308-278734-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 10:18:39,154 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_5190_210_4557_1_1527245078_2965646_353-250603-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 10:18:39,808 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.3.encoder.layers.3.nonlin_attention.whiten2, num_groups=1, num_channels=512, metric=5.16 vs. limit=15.0 2023-10-03 10:18:41,860 WARNING [train.py:1204] (2/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_472-216663-0 from training. Duration: 0.92 2023-10-03 10:18:44,614 WARNING [train.py:1204] (2/4) Exclude cut with ID _7537_210_2084_1_1528502395431_3924359_14-312906-0 from training. Duration: 0.84 2023-10-03 10:18:44,659 WARNING [train.py:1204] (2/4) Exclude cut with ID _29003_210_19596_1_1532507391720_3894098_559-236660-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 10:18:48,113 WARNING [train.py:1204] (2/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_312-204798-0_sp1.1 from training. Duration: 0.8454375 2023-10-03 10:18:50,819 WARNING [train.py:1204] (2/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_17-309070-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 10:18:53,653 WARNING [train.py:1204] (2/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_344-152526-0_sp0.9 from training. Duration: 0.588875 2023-10-03 10:18:55,111 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_41452_210_7291_1_1533437687_3941281_405-11559-0_sp1.1 from training. Duration: 0.82725 2023-10-03 10:18:55,134 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_48730_210_5399_1_1533885886_2212769_157-124052-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 10:18:55,190 WARNING [train.py:1204] (2/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_415-78792-0 from training. Duration: 0.81 2023-10-03 10:18:55,194 WARNING [train.py:1204] (2/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_345-340690-0_sp1.1 from training. Duration: 0.8454375 2023-10-03 10:18:56,659 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.0.nonlin_attention.balancer.max_positive, batch_count=1232153.3333333333, ans=0.95 2023-10-03 10:18:56,665 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.0.balancer1.prob, batch_count=1232153.3333333333, ans=0.125 2023-10-03 10:18:56,679 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.conv_module2.balancer1.prob, batch_count=1232153.3333333333, ans=0.125 2023-10-03 10:18:57,830 WARNING [train.py:1204] (2/4) Exclude cut with ID _23825_210_13449_1_1531915313526_3497150_124-8138-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 10:18:57,882 WARNING [train.py:1204] (2/4) Exclude cut with ID _49258_210_7994_1_1534122017863_3738861_194-24864-0_sp1.1 from training. Duration: 0.936375 2023-10-03 10:18:57,900 WARNING [train.py:1204] (2/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_369-222060-0_sp1.1 from training. Duration: 0.6454375 2023-10-03 10:18:59,310 WARNING [train.py:1204] (2/4) Exclude cut with ID _12369_210_4381_1_1530356078581_4388520_480-13146-0_sp0.9 from training. Duration: 0.9444375 2023-10-03 10:19:01,289 WARNING [train.py:1204] (2/4) Exclude cut with ID _13253_210_1925_1_1530757775819_3577873_191-144977-0 from training. Duration: 0.78 2023-10-03 10:19:01,306 WARNING [train.py:1204] (2/4) Exclude cut with ID _30304_210_6319_1_1532476827261_7582060_1128-140266-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 10:19:06,114 WARNING [train.py:1204] (2/4) Exclude cut with ID _8772_210_1850_1_1528635595972_3664011_247-336448-0_sp0.9 from training. Duration: 0.82225 2023-10-03 10:19:07,410 WARNING [train.py:1204] (2/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_318-59097-0_sp1.1 from training. Duration: 0.6909375 2023-10-03 10:19:10,146 WARNING [train.py:1204] (2/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_540-131164-0_sp1.1 from training. Duration: 0.836375 2023-10-03 10:19:10,243 WARNING [train.py:1204] (2/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_39-110836-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 10:19:11,664 WARNING [train.py:1204] (2/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_860-96198-0_sp1.1 from training. Duration: 0.663625 2023-10-03 10:19:11,672 WARNING [train.py:1204] (2/4) Exclude cut with ID _15133_210_6228_1_1530794512074_3080379_29-264458-0 from training. Duration: 0.58 2023-10-03 10:19:12,979 WARNING [train.py:1204] (2/4) Exclude cut with ID _55209_210_1531_1_1534675828996_4597160_797-348343-0_sp1.1 from training. Duration: 0.963625 2023-10-03 10:19:13,249 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.ff2_skip_rate, batch_count=1232220.0, ans=0.0 2023-10-03 10:19:14,315 WARNING [train.py:1204] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_23-189234-0_sp1.1 from training. Duration: 0.7545625 2023-10-03 10:19:19,155 WARNING [train.py:1204] (2/4) Exclude cut with ID _5410_210_1388_1_1527397055795_1719020_39-112221-0_sp1.1 from training. Duration: 0.62725 2023-10-03 10:19:20,559 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_43802_210_2290_1_1533516203_8829504_100-348441-0_sp1.1 from training. Duration: 0.82725 2023-10-03 10:19:23,386 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.1.conv_module2.balancer1.prob, batch_count=1232286.6666666667, ans=0.125 2023-10-03 10:19:27,341 WARNING [train.py:1204] (2/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_143-86344-0_sp1.1 from training. Duration: 0.7 2023-10-03 10:19:30,581 WARNING [train.py:1204] (2/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_126-29104-0 from training. Duration: 0.85 2023-10-03 10:19:31,988 WARNING [train.py:1204] (2/4) Exclude cut with ID _39846_210_12651_1_1533603651992_1318030_138-31771-0_sp1.1 from training. Duration: 0.963625 2023-10-03 10:19:36,711 WARNING [train.py:1204] (2/4) Exclude cut with ID _2220_210_1819_1_1525509109898_3658680_50-99837-0_sp1.1 from training. Duration: 0.4909375 2023-10-03 10:19:36,787 WARNING [train.py:1204] (2/4) Exclude cut with ID _6274_210_2084_1_1527937583837_7494020_836-210550-0_sp1.1 from training. Duration: 0.97275 2023-10-03 10:19:38,746 WARNING [train.py:1204] (2/4) Exclude cut with ID _28323_210_16187_1_1532480395667_4100300_306-74561-0 from training. Duration: 0.94 2023-10-03 10:19:41,940 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.balancer_ff3.min_abs, batch_count=1232353.3333333333, ans=0.2 2023-10-03 10:19:43,038 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_177-58005-0_sp1.1 from training. Duration: 0.563625 2023-10-03 10:19:47,740 INFO [train.py:1046] (2/4) Epoch 35, batch 4250, loss[loss=0.1645, simple_loss=0.2467, pruned_loss=0.04113, over 23361.00 frames. ], tot_loss[loss=0.1594, simple_loss=0.2378, pruned_loss=0.04045, over 4711436.69 frames. ], batch size: 93, lr: 2.90e-03, grad_scale: 16.0 2023-10-03 10:19:49,109 WARNING [train.py:1204] (2/4) Exclude cut with ID _54871_210_22356_1_1534665798253_3856359_19-342632-0_sp1.1 from training. Duration: 0.82725 2023-10-03 10:19:49,123 WARNING [train.py:1204] (2/4) Exclude cut with ID _35355_210_16040_1_1533212988977_4016229_74-190076-0_sp0.9 from training. Duration: 0.888875 2023-10-03 10:19:51,941 WARNING [train.py:1204] (2/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_285-97786-0_sp1.1 from training. Duration: 0.97275 2023-10-03 10:19:53,579 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.0.balancer1.prob, batch_count=1232420.0, ans=0.125 2023-10-03 10:19:56,242 WARNING [train.py:1204] (2/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_337-181502-0_sp0.9 from training. Duration: 0.72225 2023-10-03 10:19:57,429 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_353-182855-0 from training. Duration: 0.96 2023-10-03 10:19:57,456 WARNING [train.py:1204] (2/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_409-327730-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 10:20:00,818 WARNING [train.py:1204] (2/4) Exclude cut with ID _8030_210_7497_1_1528444886781_3531089_373-19403-0_sp1.1 from training. Duration: 0.97275 2023-10-03 10:20:05,527 WARNING [train.py:1204] (2/4) Exclude cut with ID _6789_210_1534_1_1527857937218_3118950_231-340237-0_sp1.1 from training. Duration: 0.936375 2023-10-03 10:20:09,674 WARNING [train.py:1204] (2/4) Exclude cut with ID _6493_210_1747_1_1528459210424_4012470_536-57832-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 10:20:09,695 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7046_210_6753_1_1527895337_6920917_756-116002-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 10:20:11,127 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_37152_210_9697_1_1533106573_3844188_29-239070-0_sp1.1 from training. Duration: 0.8909375 2023-10-03 10:20:12,890 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.522e+02 1.847e+02 2.096e+02 2.391e+02 3.957e+02, threshold=4.191e+02, percent-clipped=0.0 2023-10-03 10:20:12,958 WARNING [train.py:1204] (2/4) Exclude cut with ID _19023_210_4353_1_1531378892552_5820780_61-334539-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 10:20:13,078 WARNING [train.py:1204] (2/4) Exclude cut with ID _14170_210_5219_1_1530525949868_4469650_159-28488-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 10:20:14,313 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_40896_210_15710_1_1533433956_7100802_764-71272-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 10:20:14,384 WARNING [train.py:1204] (2/4) Exclude cut with ID _44902_210_16187_1_1533888006062_3792078_258-178274-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 10:20:17,749 WARNING [train.py:1204] (2/4) Exclude cut with ID _7537_210_2084_1_1528502395431_3924359_14-312906-0_sp1.1 from training. Duration: 0.763625 2023-10-03 10:20:19,018 WARNING [train.py:1204] (2/4) Exclude cut with ID _49643_210_7989_1_1534640244224_3939031_674-116639-0_sp1.1 from training. Duration: 0.963625 2023-10-03 10:20:19,112 WARNING [train.py:1204] (2/4) Exclude cut with ID _22826_210_9994_1_1531908201522_3279169_415-161-0 from training. Duration: 0.98 2023-10-03 10:20:23,117 WARNING [train.py:1204] (2/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_264-348896-0 from training. Duration: 0.85 2023-10-03 10:20:23,125 WARNING [train.py:1204] (2/4) Exclude cut with ID _51899_210_20034_1_1534129254466_3669220_233-161492-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 10:20:24,491 WARNING [train.py:1204] (2/4) Exclude cut with ID _31255_210_13427_1_1532865619495_3815143_233-200023-0_sp1.1 from training. Duration: 0.936375 2023-10-03 10:20:24,510 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_23850_210_4238_1_1532232597_1632433_100-229879-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 10:20:25,874 WARNING [train.py:1204] (2/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_375-245677-0_sp0.9 from training. Duration: 0.9 2023-10-03 10:20:25,877 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_40223_210_6228_1_1533298404_4812267_355-317415-0_sp1.1 from training. Duration: 0.97275 2023-10-03 10:20:25,910 WARNING [train.py:1204] (2/4) Exclude cut with ID _9679_210_7497_1_1529129212906_4363929_235-81488-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 10:20:26,164 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.feed_forward3.hidden_balancer.prob, batch_count=1232553.3333333333, ans=0.125 2023-10-03 10:20:28,792 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.nonlin_attention.balancer.prob, batch_count=1232553.3333333333, ans=0.125 2023-10-03 10:20:30,280 WARNING [train.py:1204] (2/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_172-215932-0_sp1.1 from training. Duration: 0.6 2023-10-03 10:20:31,635 WARNING [train.py:1204] (2/4) Exclude cut with ID _12369_210_4381_1_1530356078581_4388520_64-298917-0_sp1.1 from training. Duration: 0.8 2023-10-03 10:20:33,119 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.bypass.skip_rate, batch_count=1232620.0, ans=0.09899494936611666 2023-10-03 10:20:35,814 WARNING [train.py:1204] (2/4) Exclude cut with ID _38020_210_5986_1_1533112252259_7006089_131-53533-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 10:20:37,687 WARNING [train.py:1204] (2/4) Exclude cut with ID _42334_210_7770_1_1534206610396_3605880_392-48154-0_sp1.1 from training. Duration: 0.963625 2023-10-03 10:20:39,042 WARNING [train.py:1204] (2/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_337-181502-0 from training. Duration: 0.65 2023-10-03 10:20:39,063 WARNING [train.py:1204] (2/4) Exclude cut with ID _6267_210_2084_1_1527764658526_3894730_375-219887-0_sp0.9 from training. Duration: 0.7444375 2023-10-03 10:20:39,158 WARNING [train.py:1204] (2/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_60-146189-0 from training. Duration: 0.98 2023-10-03 10:20:40,513 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_3133_210_2087_1_1526106446_2324690_243-143119-0_sp1.1 from training. Duration: 0.7 2023-10-03 10:20:41,930 WARNING [train.py:1204] (2/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_1267-272693-0_sp0.9 from training. Duration: 0.811125 2023-10-03 10:20:43,879 WARNING [train.py:1204] (2/4) Exclude cut with ID _69884_210_9771_1_1535846392818_4166169_83-182645-0_sp1.1 from training. Duration: 0.97275 2023-10-03 10:20:43,904 WARNING [train.py:1204] (2/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_403-292260-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 10:20:46,689 WARNING [train.py:1204] (2/4) Exclude cut with ID _55429_210_10626_1_1534467588527_7174143_377-169391-0 from training. Duration: 0.96 2023-10-03 10:20:46,929 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=1232686.6666666667, ans=0.1 2023-10-03 10:20:49,802 WARNING [train.py:1204] (2/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_494-199538-0_sp1.1 from training. Duration: 0.6818125 2023-10-03 10:20:49,857 WARNING [train.py:1204] (2/4) Exclude cut with ID _3067_210_2667_1_1526212974640_4503168_119-175776-0_sp1.1 from training. Duration: 0.67275 2023-10-03 10:20:53,997 WARNING [train.py:1204] (2/4) Exclude cut with ID _3231_210_2106_1_1526205864934_6784220_183-303839-0_sp1.1 from training. Duration: 0.97275 2023-10-03 10:20:55,430 WARNING [train.py:1204] (2/4) Exclude cut with ID _18513_210_9206_1_1531618215223_3862620_165-38958-0_sp1.1 from training. Duration: 0.963625 2023-10-03 10:20:58,024 WARNING [train.py:1204] (2/4) Exclude cut with ID _41314_210_12651_1_1533459476543_3766460_87-272068-0_sp1.1 from training. Duration: 0.7545625 2023-10-03 10:20:58,114 WARNING [train.py:1204] (2/4) Exclude cut with ID _3541_210_2477_1_1526475408709_3263973_353-95-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 10:20:59,498 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_2009_210_2071_1_1525481134_4588937_73-302813-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 10:20:59,605 WARNING [train.py:1204] (2/4) Exclude cut with ID _64220_210_9527_1_1535250151772_4875259_88-95011-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 10:21:01,289 INFO [train.py:1046] (2/4) Epoch 35, batch 4300, loss[loss=0.1385, simple_loss=0.2118, pruned_loss=0.03259, over 23616.00 frames. ], tot_loss[loss=0.1588, simple_loss=0.2373, pruned_loss=0.0402, over 4716530.65 frames. ], batch size: 232, lr: 2.90e-03, grad_scale: 16.0 2023-10-03 10:21:01,385 WARNING [train.py:1204] (2/4) Exclude cut with ID _44902_210_16187_1_1533888006062_3792078_160-12107-0_sp1.1 from training. Duration: 0.92725 2023-10-03 10:21:01,398 WARNING [train.py:1204] (2/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_547-5424-0 from training. Duration: 0.94 2023-10-03 10:21:04,152 WARNING [train.py:1204] (2/4) Exclude cut with ID _21783_210_11357_1_1531742458730_3762679_268-85656-0_sp1.1 from training. Duration: 0.936375 2023-10-03 10:21:08,755 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder_embed.convnext.hidden_balancer.prob, batch_count=1232753.3333333333, ans=0.125 2023-10-03 10:21:10,103 WARNING [train.py:1204] (2/4) Exclude cut with ID _66210_210_12651_1_1535446812922_3365850_42-284985-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 10:21:10,128 WARNING [train.py:1204] (2/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_465-214726-0_sp0.9 from training. Duration: 0.9555625 2023-10-03 10:21:13,095 WARNING [train.py:1204] (2/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_50-95770-0_sp1.1 from training. Duration: 0.936375 2023-10-03 10:21:19,401 WARNING [train.py:1204] (2/4) Exclude cut with ID _30800_210_22382_1_1532499347904_4515959_269-77567-0_sp1.1 from training. Duration: 0.963625 2023-10-03 10:21:19,403 WARNING [train.py:1204] (2/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_7-111954-0 from training. Duration: 0.56 2023-10-03 10:21:21,329 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_43802_210_2290_1_1533516203_8829504_85-195240-0_sp0.9 from training. Duration: 0.9666875 2023-10-03 10:21:21,592 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.feed_forward3.hidden_balancer.prob, batch_count=1232820.0, ans=0.125 2023-10-03 10:21:22,696 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_2822_210_2715_1_1526112586_1999587_165-36656-0_sp0.9 from training. Duration: 0.988875 2023-10-03 10:21:23,947 WARNING [train.py:1204] (2/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_181-78632-0_sp1.1 from training. Duration: 0.7090625 2023-10-03 10:21:23,960 WARNING [train.py:1204] (2/4) Exclude cut with ID IC0671W0044-331887 from training. Duration: 0.896 2023-10-03 10:21:26,754 WARNING [train.py:1204] (2/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_266-137104-0_sp1.1 from training. Duration: 0.5909375 2023-10-03 10:21:28,230 WARNING [train.py:1204] (2/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_137-116442-0_sp0.9 from training. Duration: 0.8666875 2023-10-03 10:21:31,531 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_23801_210_16223_1_1532066392_3294657_570-5439-0 from training. Duration: 0.97 2023-10-03 10:21:31,549 WARNING [train.py:1204] (2/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_29-280622-0_sp1.1 from training. Duration: 0.7454375 2023-10-03 10:21:31,576 WARNING [train.py:1204] (2/4) Exclude cut with ID 3557-8342-0013-54691-0 from training. Duration: 0.92 2023-10-03 10:21:33,022 WARNING [train.py:1204] (2/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_711-15688-0_sp1.1 from training. Duration: 0.3909375 2023-10-03 10:21:35,686 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_15126_210_6753_1_1530778677_2591185_120-277707-0_sp1.1 from training. Duration: 0.72725 2023-10-03 10:21:35,960 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.feed_forward1.out_proj.dropout_p, batch_count=1232886.6666666667, ans=0.1 2023-10-03 10:21:38,432 WARNING [train.py:1204] (2/4) Exclude cut with ID _14970_210_15709_1_1530775547361_1966965_8-169924-0_sp1.1 from training. Duration: 0.836375 2023-10-03 10:21:38,434 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_4692_210_3528_1_1526899755_4323254_107-44401-0_sp0.9 from training. Duration: 0.9555625 2023-10-03 10:21:40,205 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_3475_210_2028_1_1526193578_3622569_265-276625-0_sp0.9 from training. Duration: 0.8444375 2023-10-03 10:21:41,579 WARNING [train.py:1204] (2/4) Exclude cut with ID _55153_210_2253_1_1534384786677_3162096_148-79022-0_sp1.1 from training. Duration: 0.87275 2023-10-03 10:21:42,916 WARNING [train.py:1204] (2/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_292-36336-0_sp1.1 from training. Duration: 0.92725 2023-10-03 10:21:42,933 WARNING [train.py:1204] (2/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_329-82235-0 from training. Duration: 0.82 2023-10-03 10:21:44,264 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_11305_210_1794_1_1529750428_7178148_257-312870-0 from training. Duration: 0.88 2023-10-03 10:21:46,220 WARNING [train.py:1204] (2/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_436-4493-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 10:21:49,041 WARNING [train.py:1204] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_321-54129-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 10:21:49,054 WARNING [train.py:1204] (2/4) Exclude cut with ID _5287_210_2084_1_1527418883040_3878670_333-48508-0_sp0.9 from training. Duration: 0.6666875 2023-10-03 10:21:50,653 WARNING [train.py:1204] (2/4) Exclude cut with ID _26528_210_9070_1_1532570410842_3851310_244-278158-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 10:21:50,681 WARNING [train.py:1204] (2/4) Exclude cut with ID _46952_210_5986_1_1534230010357_3107379_232-344503-0_sp1.1 from training. Duration: 0.87275 2023-10-03 10:21:50,691 WARNING [train.py:1204] (2/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_43-206757-0 from training. Duration: 0.75 2023-10-03 10:21:50,692 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_4183_210_2087_1_1526729100_1869838_141-166600-0 from training. Duration: 0.95 2023-10-03 10:21:50,759 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_296-287678-0 from training. Duration: 0.95 2023-10-03 10:21:52,084 WARNING [train.py:1204] (2/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_265-60343-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 10:21:52,112 WARNING [train.py:1204] (2/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_332-256168-0 from training. Duration: 0.75 2023-10-03 10:21:52,148 WARNING [train.py:1204] (2/4) Exclude cut with ID _13679_210_7497_1_1530534293662_3355098_411-186088-0 from training. Duration: 0.94 2023-10-03 10:21:52,629 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.5.encoder.layers.1.feed_forward2.out_whiten, num_groups=1, num_channels=256, metric=10.83 vs. limit=15.0 2023-10-03 10:21:56,125 WARNING [train.py:1204] (2/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_458-126779-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 10:21:57,570 WARNING [train.py:1204] (2/4) Exclude cut with ID IC0803W0380-397572 from training. Duration: 0.032 2023-10-03 10:21:58,913 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_278-272612-0_sp1.1 from training. Duration: 0.663625 2023-10-03 10:22:00,335 WARNING [train.py:1204] (2/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_491-263101-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 10:22:01,665 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_53555_210_5632_1_1534595220_7232297_334-336778-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 10:22:03,822 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_50-3289-0 from training. Duration: 0.91 2023-10-03 10:22:05,214 WARNING [train.py:1204] (2/4) Exclude cut with ID _8699_210_2069_1_1528630346831_3960409_155-39766-0_sp0.9 from training. Duration: 0.8666875 2023-10-03 10:22:05,218 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_502-82145-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 10:22:06,517 WARNING [train.py:1204] (2/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_550-43132-0_sp1.1 from training. Duration: 0.97275 2023-10-03 10:22:06,546 WARNING [train.py:1204] (2/4) Exclude cut with ID _14998_210_5219_1_1530705582080_7474970_446-112117-0_sp1.1 from training. Duration: 0.8181875 2023-10-03 10:22:07,900 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_3691_210_3528_1_1526468169_4062053_355-334432-0_sp1.1 from training. Duration: 0.7909375 2023-10-03 10:22:11,061 WARNING [train.py:1204] (2/4) Exclude cut with ID _58605_210_12210_1_1534723195939_3904259_239-94720-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 10:22:12,508 WARNING [train.py:1204] (2/4) Exclude cut with ID _49376_210_5986_1_1533985278732_3370740_391-344072-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 10:22:13,870 WARNING [train.py:1204] (2/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_558-322014-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 10:22:13,918 WARNING [train.py:1204] (2/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_256-32662-0_sp1.1 from training. Duration: 0.8181875 2023-10-03 10:22:15,221 INFO [train.py:1046] (2/4) Epoch 35, batch 4350, loss[loss=0.1607, simple_loss=0.2376, pruned_loss=0.0419, over 23714.00 frames. ], tot_loss[loss=0.1589, simple_loss=0.2376, pruned_loss=0.04009, over 4722628.40 frames. ], batch size: 149, lr: 2.90e-03, grad_scale: 16.0 2023-10-03 10:22:18,180 WARNING [train.py:1204] (2/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_572-185150-0 from training. Duration: 0.97 2023-10-03 10:22:19,423 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_89-35748-0_sp0.9 from training. Duration: 0.87775 2023-10-03 10:22:24,476 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6291_210_3273_1_1527683649_1078498_88-66747-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 10:22:25,848 WARNING [train.py:1204] (2/4) Exclude cut with ID _19504_210_9994_1_1531648863242_3325220_357-237029-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 10:22:28,588 WARNING [train.py:1204] (2/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_302-2550-0_sp1.1 from training. Duration: 0.736375 2023-10-03 10:22:28,588 WARNING [train.py:1204] (2/4) Exclude cut with ID _9461_210_4557_1_1529060514726_3677451_59-194017-0_sp1.1 from training. Duration: 0.963625 2023-10-03 10:22:33,537 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.4.encoder.layers.2.feed_forward2.out_whiten, num_groups=1, num_channels=384, metric=8.36 vs. limit=15.0 2023-10-03 10:22:34,372 WARNING [train.py:1204] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_665-196155-0_sp1.1 from training. Duration: 0.6090625 2023-10-03 10:22:37,835 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_326-315007-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 10:22:40,392 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.639e+02 1.912e+02 2.047e+02 2.309e+02 3.251e+02, threshold=4.094e+02, percent-clipped=0.0 2023-10-03 10:22:41,172 WARNING [train.py:1204] (2/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_105-328174-0_sp1.1 from training. Duration: 0.6909375 2023-10-03 10:22:41,179 WARNING [train.py:1204] (2/4) Exclude cut with ID _56494_210_4220_1_1535284837625_3033169_106-263722-0_sp1.1 from training. Duration: 0.97275 2023-10-03 10:22:43,886 WARNING [train.py:1204] (2/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_770-200430-0_sp0.9 from training. Duration: 0.988875 2023-10-03 10:22:45,288 WARNING [train.py:1204] (2/4) Exclude cut with ID _65640_210_6310_1_1535368201445_3173250_209-13008-0_sp1.1 from training. Duration: 0.9 2023-10-03 10:22:46,726 WARNING [train.py:1204] (2/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_212-149947-0_sp0.9 from training. Duration: 0.811125 2023-10-03 10:22:48,309 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.1.ff3_skip_rate, batch_count=1233220.0, ans=0.0 2023-10-03 10:22:51,409 WARNING [train.py:1204] (2/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_188-171673-0 from training. Duration: 0.58 2023-10-03 10:22:51,476 WARNING [train.py:1204] (2/4) Exclude cut with ID _58895_210_8750_1_1535184081320_3649089_286-281724-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 10:22:52,645 WARNING [train.py:1204] (2/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_16-254909-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 10:22:54,180 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.balancer2.prob, batch_count=1233220.0, ans=0.125 2023-10-03 10:22:57,306 WARNING [train.py:1204] (2/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_52-95655-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 10:22:59,945 WARNING [train.py:1204] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_554-45617-0 from training. Duration: 0.99 2023-10-03 10:23:03,916 WARNING [train.py:1204] (2/4) Exclude cut with ID _14683_210_5219_1_1530698352608_3974859_473-344611-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 10:23:04,013 WARNING [train.py:1204] (2/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_17-25045-0_sp0.9 from training. Duration: 0.6555625 2023-10-03 10:23:08,205 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.1.encoder.layers.1.feed_forward1.out_whiten, num_groups=1, num_channels=256, metric=5.69 vs. limit=15.0 2023-10-03 10:23:08,734 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9072W0003-530637 from training. Duration: 0.768 2023-10-03 10:23:09,478 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.0.layers.1.feed_forward2.out_whiten, num_groups=1, num_channels=192, metric=5.01 vs. limit=15.0 2023-10-03 10:23:12,080 WARNING [train.py:1204] (2/4) Exclude cut with ID _38670_210_15379_1_1533448810659_4391059_76-74025-0_sp1.1 from training. Duration: 0.936375 2023-10-03 10:23:12,120 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6291_210_3273_1_1527683649_1078498_74-292608-0_sp0.9 from training. Duration: 0.8 2023-10-03 10:23:13,502 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9084W0025-536862 from training. Duration: 0.896 2023-10-03 10:23:13,573 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9087W0021-538392 from training. Duration: 0.768 2023-10-03 10:23:13,579 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7543_210_6753_1_1528543323_645515_56-163234-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 10:23:13,606 WARNING [train.py:1204] (2/4) Exclude cut with ID _45560_210_21982_1_1533812603951_3584999_44-332643-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 10:23:14,930 WARNING [train.py:1204] (2/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_1285-256622-0_sp1.1 from training. Duration: 0.763625 2023-10-03 10:23:16,235 WARNING [train.py:1204] (2/4) Exclude cut with ID _52781_210_16187_1_1534297729099_1118149_141-325632-0_sp1.1 from training. Duration: 0.936375 2023-10-03 10:23:17,550 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_40896_210_15710_1_1533433956_7100802_1051-137631-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 10:23:17,596 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_13091_210_7925_1_1530609447_8987354_233-18333-0_sp1.1 from training. Duration: 0.97275 2023-10-03 10:23:17,807 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.feed_forward3.hidden_balancer.prob, batch_count=1233353.3333333333, ans=0.125 2023-10-03 10:23:20,267 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_34008_210_10431_1_1533345424_8183247_662-317331-0 from training. Duration: 0.94 2023-10-03 10:23:20,273 WARNING [train.py:1204] (2/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_291-248920-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 10:23:20,275 WARNING [train.py:1204] (2/4) Exclude cut with ID _65263_210_22356_1_1535763598691_3921037_274-288481-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 10:23:20,303 WARNING [train.py:1204] (2/4) Exclude cut with ID _5189_210_4557_1_1527248319428_3769229_233-168080-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 10:23:22,140 WARNING [train.py:1204] (2/4) Exclude cut with ID _54871_210_22356_1_1534665798253_3856359_135-184281-0 from training. Duration: 0.99 2023-10-03 10:23:23,482 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9123W0012-555972 from training. Duration: 0.896 2023-10-03 10:23:23,486 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9123W0118-556077 from training. Duration: 0.896 2023-10-03 10:23:23,495 WARNING [train.py:1204] (2/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_406-275649-0 from training. Duration: 0.42 2023-10-03 10:23:24,990 WARNING [train.py:1204] (2/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_309-13684-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 10:23:26,283 WARNING [train.py:1204] (2/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_129-337802-0_sp1.1 from training. Duration: 0.6545625 2023-10-03 10:23:26,320 WARNING [train.py:1204] (2/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_649-266825-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 10:23:28,084 WARNING [train.py:1204] (2/4) Exclude cut with ID _40519_210_13794_1_1533715228655_7337390_895-224742-0_sp1.1 from training. Duration: 0.92725 2023-10-03 10:23:29,411 INFO [train.py:1046] (2/4) Epoch 35, batch 4400, loss[loss=0.169, simple_loss=0.2534, pruned_loss=0.0423, over 23941.00 frames. ], tot_loss[loss=0.1594, simple_loss=0.2381, pruned_loss=0.04033, over 4714123.93 frames. ], batch size: 80, lr: 2.90e-03, grad_scale: 16.0 2023-10-03 10:23:29,553 WARNING [train.py:1204] (2/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_234-14174-0 from training. Duration: 0.82 2023-10-03 10:23:30,972 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9156W0009-572502 from training. Duration: 0.768 2023-10-03 10:23:30,978 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_424-245529-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 10:23:36,526 WARNING [train.py:1204] (2/4) Exclude cut with ID _26528_210_9070_1_1532570410842_3851310_232-10149-0_sp1.1 from training. Duration: 0.963625 2023-10-03 10:23:36,533 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_17292_210_6753_1_1531125388_2943734_152-30410-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 10:23:38,480 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_35441_210_16554_1_1533347572_4092680_291-82075-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 10:23:39,889 WARNING [train.py:1204] (2/4) Exclude cut with ID _12369_210_4381_1_1530356078581_4388520_64-298917-0 from training. Duration: 0.88 2023-10-03 10:23:39,920 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_5618_210_1874_1_1527311008_5851_1-214501-0_sp0.9 from training. Duration: 0.7655625 2023-10-03 10:23:39,949 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_9460_210_4211_1_1528954587_10049667_572-37035-0 from training. Duration: 0.8 2023-10-03 10:23:41,317 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9193W0023-590322 from training. Duration: 0.896 2023-10-03 10:23:42,643 WARNING [train.py:1204] (2/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_206-10097-0_sp1.1 from training. Duration: 0.4454375 2023-10-03 10:23:42,662 WARNING [train.py:1204] (2/4) Exclude cut with ID _55134_210_21982_1_1534590028377_3882170_17-51731-0_sp1.1 from training. Duration: 0.963625 2023-10-03 10:23:44,921 INFO [scaling.py:1118] (2/4) WithLoss: name=encoder.encoders.5.encoder.layers.0.self_attn_weights, loss-sum=0.000e+00 2023-10-03 10:23:45,934 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_14953_210_2151_1_1530701710_5616165_710-165400-0 from training. Duration: 0.87 2023-10-03 10:23:48,646 WARNING [train.py:1204] (2/4) Exclude cut with ID _53203_210_2087_1_1534246218309_475590_61-85777-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 10:23:48,747 WARNING [train.py:1204] (2/4) Exclude cut with ID _55844_210_2346_1_1534471212569_7625707_1050-10453-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 10:23:48,756 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9218W0013-603297 from training. Duration: 0.896 2023-10-03 10:23:51,611 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_20120_210_6753_1_1531643035_5482859_267-102818-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 10:23:51,612 WARNING [train.py:1204] (2/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_191-41023-0 from training. Duration: 0.71 2023-10-03 10:23:52,949 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9231W0202-610227 from training. Duration: 0.896 2023-10-03 10:23:56,346 WARNING [train.py:1204] (2/4) Exclude cut with ID _6537_210_1883_1_1527987559710_3597129_382-133062-0 from training. Duration: 0.68 2023-10-03 10:23:56,402 WARNING [train.py:1204] (2/4) Exclude cut with ID _28427_210_4220_1_1532581240967_3061069_43-84928-0 from training. Duration: 0.99 2023-10-03 10:23:56,427 WARNING [train.py:1204] (2/4) Exclude cut with ID _59256_210_5986_1_1535076085997_3589120_121-237386-0 from training. Duration: 0.96 2023-10-03 10:23:56,449 WARNING [train.py:1204] (2/4) Exclude cut with ID _8480_210_1874_1_1528589391205_6728330_137-256986-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 10:23:57,936 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_9417_210_1794_1_1529235261_4614131_479-346963-0_sp1.1 from training. Duration: 0.936375 2023-10-03 10:23:58,170 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.feed_forward1.hidden_balancer.prob, batch_count=1233553.3333333333, ans=0.125 2023-10-03 10:23:58,791 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.1.encoder.layers.0.feed_forward2.out_whiten, num_groups=1, num_channels=256, metric=4.14 vs. limit=15.0 2023-10-03 10:23:59,252 WARNING [train.py:1204] (2/4) Exclude cut with ID _26474_210_9570_1_1532520022699_3623380_267-7362-0_sp1.1 from training. Duration: 0.936375 2023-10-03 10:24:01,185 WARNING [train.py:1204] (2/4) Exclude cut with ID _55328_210_27767_1_1534580973260_4088170_287-61272-0_sp1.1 from training. Duration: 0.97275 2023-10-03 10:24:02,584 WARNING [train.py:1204] (2/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_589-9070-0 from training. Duration: 0.71 2023-10-03 10:24:03,921 WARNING [train.py:1204] (2/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_148-158509-0_sp1.1 from training. Duration: 0.37275 2023-10-03 10:24:03,988 WARNING [train.py:1204] (2/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_164-184548-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 10:24:05,348 WARNING [train.py:1204] (2/4) Exclude cut with ID _14970_210_15709_1_1530775547361_1966965_6-23328-0_sp1.1 from training. Duration: 0.7818125 2023-10-03 10:24:05,353 WARNING [train.py:1204] (2/4) Exclude cut with ID _29303_210_2575_1_1532392237303_7410649_612-165069-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 10:24:06,130 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.2.encoder.layers.1.self_attn1.whiten, num_groups=1, num_channels=384, metric=17.02 vs. limit=22.5 2023-10-03 10:24:06,661 WARNING [train.py:1204] (2/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_383-321224-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 10:24:06,691 WARNING [train.py:1204] (2/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_283-160525-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 10:24:06,701 WARNING [train.py:1204] (2/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_505-97987-0 from training. Duration: 0.86 2023-10-03 10:24:08,057 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9309W0190-640317 from training. Duration: 0.768 2023-10-03 10:24:11,546 WARNING [train.py:1204] (2/4) Exclude cut with ID _50093_210_4220_1_1534417141207_3432299_280-297390-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 10:24:18,930 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_17292_210_6753_1_1531125388_2943734_320-71385-0_sp1.1 from training. Duration: 0.97275 2023-10-03 10:24:21,685 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_213-103282-0 from training. Duration: 0.78 2023-10-03 10:24:25,840 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7615_210_4211_1_1528110965_4580160_72-235211-0_sp0.9 from training. Duration: 0.8444375 2023-10-03 10:24:27,893 WARNING [train.py:1204] (2/4) Exclude cut with ID _14867_210_1968_1_1530684179890_3846969_173-320977-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 10:24:30,572 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7046_210_6753_1_1527895337_6920917_783-278109-0_sp0.9 from training. Duration: 0.8555625 2023-10-03 10:24:31,870 WARNING [train.py:1204] (2/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_90-30673-0 from training. Duration: 0.84 2023-10-03 10:24:31,885 WARNING [train.py:1204] (2/4) Exclude cut with ID _22826_210_9994_1_1531908201522_3279169_415-161-0_sp1.1 from training. Duration: 0.8909375 2023-10-03 10:24:31,895 WARNING [train.py:1204] (2/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_86-66192-0_sp0.9 from training. Duration: 0.611125 2023-10-03 10:24:31,898 WARNING [train.py:1204] (2/4) Exclude cut with ID _4626_210_3532_1_1526817652717_3916859_446-206656-0_sp1.1 from training. Duration: 0.6909375 2023-10-03 10:24:31,960 WARNING [train.py:1204] (2/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_166-128941-0_sp0.9 from training. Duration: 0.6 2023-10-03 10:24:36,671 WARNING [train.py:1204] (2/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_345-167986-0 from training. Duration: 0.84 2023-10-03 10:24:38,149 WARNING [train.py:1204] (2/4) Exclude cut with ID _6275_210_2084_1_1528110062213_7669049_973-198085-0 from training. Duration: 0.75 2023-10-03 10:24:39,538 WARNING [train.py:1204] (2/4) Exclude cut with ID _60057_210_10640_1_1534903065793_3333150_171-181133-0 from training. Duration: 0.88 2023-10-03 10:24:39,552 WARNING [train.py:1204] (2/4) Exclude cut with ID _19504_210_9994_1_1531648863242_3325220_336-196513-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 10:24:39,559 WARNING [train.py:1204] (2/4) Exclude cut with ID _6493_210_1747_1_1528459210424_4012470_513-140967-0 from training. Duration: 0.79 2023-10-03 10:24:41,443 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6673_210_3273_1_1527767790_3173240_327-306185-0_sp0.9 from training. Duration: 0.92225 2023-10-03 10:24:44,089 INFO [train.py:1046] (2/4) Epoch 35, batch 4450, loss[loss=0.1499, simple_loss=0.2307, pruned_loss=0.03452, over 24667.00 frames. ], tot_loss[loss=0.161, simple_loss=0.2397, pruned_loss=0.04117, over 4705274.11 frames. ], batch size: 65, lr: 2.90e-03, grad_scale: 16.0 2023-10-03 10:24:46,075 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1032-236863-0_sp1.1 from training. Duration: 0.763625 2023-10-03 10:24:47,447 WARNING [train.py:1204] (2/4) Exclude cut with ID _14867_210_1968_1_1530684179890_3846969_413-29292-0 from training. Duration: 0.9 2023-10-03 10:24:50,300 WARNING [train.py:1204] (2/4) Exclude cut with ID _47251_210_18628_1_1533814063128_4137250_186-343322-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 10:24:53,044 WARNING [train.py:1204] (2/4) Exclude cut with ID _56235_210_10640_1_1534557335235_4161099_624-46091-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 10:24:53,092 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_791-197891-0_sp0.9 from training. Duration: 0.9333125 2023-10-03 10:24:56,575 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_50050_210_16223_1_1535435796_7290138_786-288457-0_sp1.1 from training. Duration: 0.92725 2023-10-03 10:24:56,606 WARNING [train.py:1204] (2/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_213-227283-0_sp1.1 from training. Duration: 0.963625 2023-10-03 10:24:58,233 INFO [scaling.py:1118] (2/4) WithLoss: name=encoder.encoders.2.encoder.layers.0.self_attn_weights, loss-sum=0.000e+00 2023-10-03 10:24:59,307 WARNING [train.py:1204] (2/4) Exclude cut with ID _42659_210_2070_1_1533607442765_3120380_58-341121-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 10:25:01,838 WARNING [train.py:1204] (2/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_506-23200-0_sp1.1 from training. Duration: 0.8818125 2023-10-03 10:25:04,452 WARNING [train.py:1204] (2/4) Exclude cut with ID _14170_210_5219_1_1530525949868_4469650_336-213530-0_sp1.1 from training. Duration: 0.8181875 2023-10-03 10:25:05,744 WARNING [train.py:1204] (2/4) Exclude cut with ID _64135_210_2253_1_1535677267876_7608972_180-5312-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 10:25:05,833 WARNING [train.py:1204] (2/4) Exclude cut with ID _12302_210_5219_1_1529924446466_8107280_835-279754-0 from training. Duration: 0.9 2023-10-03 10:25:05,834 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_20120_210_6753_1_1531643035_5482859_510-96996-0_sp1.1 from training. Duration: 0.7909375 2023-10-03 10:25:07,770 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_15517_210_6753_1_1530954008_3487336_216-187834-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 10:25:07,796 WARNING [train.py:1204] (2/4) Exclude cut with ID _19504_210_9994_1_1531648863242_3325220_414-113427-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 10:25:07,797 WARNING [train.py:1204] (2/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_264-108685-0_sp0.9 from training. Duration: 0.611125 2023-10-03 10:25:10,451 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.500e+02 1.919e+02 2.096e+02 2.482e+02 3.695e+02, threshold=4.192e+02, percent-clipped=0.0 2023-10-03 10:25:10,571 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_5438_210_1794_1_1527336687_2600009_104-288274-0_sp0.9 from training. Duration: 0.6444375 2023-10-03 10:25:15,322 WARNING [train.py:1204] (2/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_520-39496-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 10:25:15,356 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_37818_210_4238_1_1533620910_4720083_315-308565-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 10:25:16,715 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_2009_210_2071_1_1525481134_4588937_385-302100-0_sp1.1 from training. Duration: 0.7909375 2023-10-03 10:25:16,758 WARNING [train.py:1204] (2/4) Exclude cut with ID _58895_210_8750_1_1535184081320_3649089_277-53219-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 10:25:18,161 WARNING [train.py:1204] (2/4) Exclude cut with ID _26664_210_7770_1_1532258943280_3418270_121-111258-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 10:25:18,669 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.4.encoder.layers.1.nonlin_attention.whiten2, num_groups=1, num_channels=384, metric=3.09 vs. limit=15.0 2023-10-03 10:25:24,212 WARNING [train.py:1204] (2/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_99-167392-0_sp1.1 from training. Duration: 0.5545625 2023-10-03 10:25:24,459 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.balancer1.prob, batch_count=1233886.6666666667, ans=0.125 2023-10-03 10:25:25,576 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1113-7369-0 from training. Duration: 0.85 2023-10-03 10:25:25,592 WARNING [train.py:1204] (2/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_352-59958-0 from training. Duration: 0.86 2023-10-03 10:25:25,596 WARNING [train.py:1204] (2/4) Exclude cut with ID _14886_210_4220_1_1530702020355_3516190_159-73748-0_sp1.1 from training. Duration: 0.7545625 2023-10-03 10:25:27,022 WARNING [train.py:1204] (2/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_131-119569-0_sp1.1 from training. Duration: 0.92725 2023-10-03 10:25:28,365 WARNING [train.py:1204] (2/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_216-343007-0 from training. Duration: 0.66 2023-10-03 10:25:31,252 WARNING [train.py:1204] (2/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_113-92608-0_sp0.9 from training. Duration: 0.6 2023-10-03 10:25:35,304 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_40047_210_2290_1_1533290519_2374311_132-123271-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 10:25:35,350 WARNING [train.py:1204] (2/4) Exclude cut with ID _8793_210_6310_1_1528542985139_2310009_121-179782-0 from training. Duration: 0.8 2023-10-03 10:25:35,372 WARNING [train.py:1204] (2/4) Exclude cut with ID _43997_210_4525_1_1533538632555_5189329_283-218639-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 10:25:35,375 WARNING [train.py:1204] (2/4) Exclude cut with ID _49376_210_5986_1_1533985278732_3370740_297-78397-0_sp1.1 from training. Duration: 0.936375 2023-10-03 10:25:35,395 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_3475_210_2028_1_1526193578_3622569_377-166133-0_sp0.9 from training. Duration: 0.9555625 2023-10-03 10:25:35,407 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_520-213149-0_sp1.1 from training. Duration: 0.92725 2023-10-03 10:25:37,264 WARNING [train.py:1204] (2/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_567-144483-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 10:25:41,690 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_4183_210_2087_1_1526729100_1869838_143-280962-0_sp0.9 from training. Duration: 0.62225 2023-10-03 10:25:43,021 WARNING [train.py:1204] (2/4) Exclude cut with ID _49643_210_7989_1_1534640244224_3939031_611-185515-0 from training. Duration: 0.94 2023-10-03 10:25:44,377 WARNING [train.py:1204] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_88-341718-0_sp0.9 from training. Duration: 0.7333125 2023-10-03 10:25:47,500 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_51225_210_3601_1_1534400634_2327383_49-235441-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 10:25:47,585 WARNING [train.py:1204] (2/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_240-294400-0_sp1.1 from training. Duration: 0.936375 2023-10-03 10:25:49,090 WARNING [train.py:1204] (2/4) Exclude cut with ID _55429_210_10626_1_1534467588527_7174143_640-304825-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 10:25:49,128 WARNING [train.py:1204] (2/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_187-221566-0_sp1.1 from training. Duration: 0.3909375 2023-10-03 10:25:51,006 WARNING [train.py:1204] (2/4) Exclude cut with ID _7606_210_4915_1_1528894253277_5080690_127-177761-0_sp0.9 from training. Duration: 0.711125 2023-10-03 10:25:53,762 WARNING [train.py:1204] (2/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_430-116139-0 from training. Duration: 0.85 2023-10-03 10:25:55,124 WARNING [train.py:1204] (2/4) Exclude cut with ID _3675_210_4557_1_1526639934408_4182089_456-18305-0_sp0.9 from training. Duration: 0.8444375 2023-10-03 10:25:57,831 INFO [train.py:1046] (2/4) Epoch 35, batch 4500, loss[loss=0.1676, simple_loss=0.2392, pruned_loss=0.04806, over 23781.00 frames. ], tot_loss[loss=0.1609, simple_loss=0.2399, pruned_loss=0.04095, over 4708532.10 frames. ], batch size: 179, lr: 2.90e-03, grad_scale: 16.0 2023-10-03 10:25:59,168 WARNING [train.py:1204] (2/4) Exclude cut with ID _18513_210_9206_1_1531618215223_3862620_402-5957-0_sp1.1 from training. Duration: 0.9 2023-10-03 10:26:00,516 WARNING [train.py:1204] (2/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_465-156500-0 from training. Duration: 0.72 2023-10-03 10:26:00,517 WARNING [train.py:1204] (2/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_714-310538-0 from training. Duration: 0.86 2023-10-03 10:26:01,811 WARNING [train.py:1204] (2/4) Exclude cut with ID _39411_210_5986_1_1533538825030_3343649_335-301187-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 10:26:05,399 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_41750_210_1960_1_1533520678_3948042_241-158616-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 10:26:05,437 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_46013_210_7234_1_1533776362_3744032_50-141535-0_sp1.1 from training. Duration: 0.9 2023-10-03 10:26:06,758 WARNING [train.py:1204] (2/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_43-206757-0_sp0.9 from training. Duration: 0.8333125 2023-10-03 10:26:08,719 WARNING [train.py:1204] (2/4) Exclude cut with ID _22961_210_8254_1_1531976366672_4278180_172-276296-0_sp1.1 from training. Duration: 0.97275 2023-10-03 10:26:08,739 WARNING [train.py:1204] (2/4) Exclude cut with ID _17956_210_4353_1_1531209568125_6979970_1305-301903-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 10:26:08,781 WARNING [train.py:1204] (2/4) Exclude cut with ID _28323_210_16187_1_1532480395667_4100300_604-44507-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 10:26:21,858 WARNING [train.py:1204] (2/4) Exclude cut with ID _23514_210_1389_1_1531832139785_3031439_88-337544-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 10:26:21,923 WARNING [train.py:1204] (2/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_932-3672-0_sp1.1 from training. Duration: 0.963625 2023-10-03 10:26:23,367 WARNING [train.py:1204] (2/4) Exclude cut with ID _46502_210_13794_1_1533724452198_3657159_207-109991-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 10:26:26,127 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_216-73841-0_sp1.1 from training. Duration: 0.87275 2023-10-03 10:26:27,503 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8453_210_4211_1_1528502940_8562730_347-330673-0_sp1.1 from training. Duration: 0.6545625 2023-10-03 10:26:27,741 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.bypass_mid.scale_min, batch_count=1234220.0, ans=0.2 2023-10-03 10:26:32,994 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_4183_210_2087_1_1526729100_1869838_143-280962-0_sp1.1 from training. Duration: 0.5090625 2023-10-03 10:26:37,639 WARNING [train.py:1204] (2/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_325-113798-0_sp1.1 from training. Duration: 0.77275 2023-10-03 10:26:42,378 WARNING [train.py:1204] (2/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_91-212486-0_sp0.9 from training. Duration: 0.7555625 2023-10-03 10:26:44,054 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.ff2_skip_rate, batch_count=1234286.6666666667, ans=0.0 2023-10-03 10:26:45,059 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8776_210_2040_1_1528638837_4088036_244-278763-0_sp1.1 from training. Duration: 0.8181875 2023-10-03 10:26:45,099 WARNING [train.py:1204] (2/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_106-286130-0 from training. Duration: 0.98 2023-10-03 10:26:45,164 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_2746_210_1988_1_1525872816_3795627_372-205861-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 10:26:47,011 WARNING [train.py:1204] (2/4) Exclude cut with ID _30800_210_22382_1_1532499347904_4515959_240-54646-0_sp1.1 from training. Duration: 0.936375 2023-10-03 10:26:49,763 WARNING [train.py:1204] (2/4) Exclude cut with ID _59500_210_8254_1_1534809610680_3736009_162-180695-0_sp1.1 from training. Duration: 0.936375 2023-10-03 10:26:49,784 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7544_210_6753_1_1528591327_1471426_146-93357-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 10:26:52,569 WARNING [train.py:1204] (2/4) Exclude cut with ID _42657_210_2070_1_1533520654016_3327008_405-87432-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 10:26:52,592 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_4528_210_3273_1_1527076654_3276777_209-219804-0 from training. Duration: 0.69 2023-10-03 10:26:52,593 WARNING [train.py:1204] (2/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_78-182912-0_sp0.9 from training. Duration: 0.6555625 2023-10-03 10:26:52,601 WARNING [train.py:1204] (2/4) Exclude cut with ID _31255_210_13427_1_1532865619495_3815143_322-114998-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 10:26:58,730 WARNING [train.py:1204] (2/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_848-280957-0_sp0.9 from training. Duration: 0.9666875 2023-10-03 10:26:58,759 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7696_210_2040_1_1528207167_3716099_117-125434-0_sp0.9 from training. Duration: 0.8444375 2023-10-03 10:27:00,200 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_1002-109192-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 10:27:02,923 WARNING [train.py:1204] (2/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_138-64210-0_sp0.9 from training. Duration: 0.811125 2023-10-03 10:27:04,201 WARNING [train.py:1204] (2/4) Exclude cut with ID _25808_210_13292_1_1532343514562_7366109_633-47524-0_sp1.1 from training. Duration: 0.9 2023-10-03 10:27:04,316 WARNING [train.py:1204] (2/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_297-5233-0 from training. Duration: 0.95 2023-10-03 10:27:06,149 WARNING [train.py:1204] (2/4) Exclude cut with ID _8394_210_6702_1_1528590504316_3540189_335-274780-0 from training. Duration: 0.78 2023-10-03 10:27:06,153 WARNING [train.py:1204] (2/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_301-330455-0 from training. Duration: 0.8 2023-10-03 10:27:09,410 WARNING [train.py:1204] (2/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_383-205833-0 from training. Duration: 0.95 2023-10-03 10:27:12,137 INFO [train.py:1046] (2/4) Epoch 35, batch 4550, loss[loss=0.1493, simple_loss=0.2282, pruned_loss=0.03521, over 22366.00 frames. ], tot_loss[loss=0.16, simple_loss=0.2392, pruned_loss=0.04039, over 4716704.89 frames. ], batch size: 49, lr: 2.90e-03, grad_scale: 16.0 2023-10-03 10:27:12,228 WARNING [train.py:1204] (2/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_48-182155-0 from training. Duration: 0.87 2023-10-03 10:27:12,309 WARNING [train.py:1204] (2/4) Exclude cut with ID _14832_210_8750_1_1530691351539_3171348_1-54315-0_sp1.1 from training. Duration: 0.7909375 2023-10-03 10:27:16,305 WARNING [train.py:1204] (2/4) Exclude cut with ID _44982_210_1511_1_1534312820108_3726029_413-100814-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 10:27:17,705 WARNING [train.py:1204] (2/4) Exclude cut with ID _3231_210_2106_1_1526205864934_6784220_104-261392-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 10:27:19,601 WARNING [train.py:1204] (2/4) Exclude cut with ID _24314_210_4915_1_1532135078116_7170179_1-13588-0_sp1.1 from training. Duration: 0.97275 2023-10-03 10:27:23,796 WARNING [train.py:1204] (2/4) Exclude cut with ID _44304_210_15374_1_1533639352765_4613129_141-8356-0_sp1.1 from training. Duration: 0.963625 2023-10-03 10:27:25,682 WARNING [train.py:1204] (2/4) Exclude cut with ID _37892_210_19596_1_1533198677897_3964058_374-335034-0_sp1.1 from training. Duration: 0.936375 2023-10-03 10:27:27,112 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_195-257512-0_sp0.9 from training. Duration: 0.7444375 2023-10-03 10:27:27,114 WARNING [train.py:1204] (2/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_1267-272693-0_sp1.1 from training. Duration: 0.663625 2023-10-03 10:27:27,115 WARNING [train.py:1204] (2/4) Exclude cut with ID _30196_210_1664_1_1533708496522_7074140_690-326155-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 10:27:29,821 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_68847_210_14656_1_1536070214_2586843_38-20717-0_sp1.1 from training. Duration: 0.97275 2023-10-03 10:27:31,122 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_99-251314-0_sp1.1 from training. Duration: 0.7909375 2023-10-03 10:27:33,059 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.4.encoder.layers.1.self_attn2.whiten, num_groups=1, num_channels=384, metric=13.00 vs. limit=22.5 2023-10-03 10:27:34,500 WARNING [train.py:1204] (2/4) Exclude cut with ID _49376_210_5986_1_1533985278732_3370740_225-95630-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 10:27:35,379 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.0.layers.1.self_attn2.whiten, num_groups=1, num_channels=192, metric=8.52 vs. limit=22.5 2023-10-03 10:27:37,900 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_2655_210_2087_1_1525688959_3897400_202-147557-0 from training. Duration: 0.85 2023-10-03 10:27:37,955 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_5618_210_1874_1_1527311008_5851_1-214501-0_sp1.1 from training. Duration: 0.626375 2023-10-03 10:27:38,027 WARNING [train.py:1204] (2/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_225-43381-0_sp1.1 from training. Duration: 0.8545625 2023-10-03 10:27:39,284 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.577e+02 1.927e+02 2.055e+02 2.353e+02 3.694e+02, threshold=4.110e+02, percent-clipped=0.0 2023-10-03 10:27:39,445 WARNING [train.py:1204] (2/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_242-3472-0 from training. Duration: 0.73 2023-10-03 10:27:42,317 WARNING [train.py:1204] (2/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_339-104791-0 from training. Duration: 0.49 2023-10-03 10:27:43,679 WARNING [train.py:1204] (2/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_488-92261-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 10:27:47,757 WARNING [train.py:1204] (2/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_175-163894-0 from training. Duration: 0.74 2023-10-03 10:27:49,888 WARNING [train.py:1204] (2/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_41-234353-0_sp0.9 from training. Duration: 0.7555625 2023-10-03 10:27:53,743 WARNING [train.py:1204] (2/4) Exclude cut with ID _47251_210_18628_1_1533814063128_4137250_116-275927-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 10:27:53,791 WARNING [train.py:1204] (2/4) Exclude cut with ID _4697_210_2069_1_1526815801953_3823098_61-223324-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 10:27:53,804 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6961_210_2087_1_1527937168_6815011_562-165518-0_sp1.1 from training. Duration: 0.7 2023-10-03 10:27:56,570 WARNING [train.py:1204] (2/4) Exclude cut with ID _21989_210_5854_1_1531899214931_3271360_133-44759-0 from training. Duration: 0.77 2023-10-03 10:27:59,904 WARNING [train.py:1204] (2/4) Exclude cut with ID _23557_210_8750_1_1531890106370_6846255_77-284923-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 10:28:01,314 WARNING [train.py:1204] (2/4) Exclude cut with ID _13678_210_7497_1_1530530069226_3463170_464-296127-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 10:28:01,334 WARNING [train.py:1204] (2/4) Exclude cut with ID _22613_210_5219_1_1532253599686_4090170_393-36663-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 10:28:02,734 WARNING [train.py:1204] (2/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_235-262124-0_sp0.9 from training. Duration: 0.7444375 2023-10-03 10:28:02,957 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.feed_forward2.hidden_balancer.prob, batch_count=1234620.0, ans=0.125 2023-10-03 10:28:04,104 WARNING [train.py:1204] (2/4) Exclude cut with ID _8480_210_1874_1_1528589391205_6728330_755-112625-0 from training. Duration: 0.74 2023-10-03 10:28:04,157 WARNING [train.py:1204] (2/4) Exclude cut with ID _6456_210_1534_1_1527681634212_3181049_215-309359-0 from training. Duration: 0.88 2023-10-03 10:28:05,424 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_3019_210_2028_1_1526044245_3264865_191-72235-0_sp1.1 from training. Duration: 0.8818125 2023-10-03 10:28:07,372 WARNING [train.py:1204] (2/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_380-162648-0 from training. Duration: 0.94 2023-10-03 10:28:10,536 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_92-46009-0 from training. Duration: 0.54 2023-10-03 10:28:10,556 WARNING [train.py:1204] (2/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_53-300544-0_sp0.9 from training. Duration: 0.7444375 2023-10-03 10:28:10,668 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_68235_210_1925_1_1535863247_4484656_101-250943-0_sp1.1 from training. Duration: 0.97275 2023-10-03 10:28:10,677 WARNING [train.py:1204] (2/4) Exclude cut with ID _14970_210_15709_1_1530775547361_1966965_204-22182-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 10:28:11,998 WARNING [train.py:1204] (2/4) Exclude cut with ID _55153_210_2253_1_1534384786677_3162096_310-130092-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 10:28:12,008 WARNING [train.py:1204] (2/4) Exclude cut with ID _60113_210_16271_1_1535164212481_4386079_559-167191-0_sp1.1 from training. Duration: 0.8454375 2023-10-03 10:28:13,401 WARNING [train.py:1204] (2/4) Exclude cut with ID _14368_210_4220_1_1530532714870_3595529_93-58002-0_sp1.1 from training. Duration: 0.5181875 2023-10-03 10:28:14,783 WARNING [train.py:1204] (2/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_110-224688-0 from training. Duration: 0.97 2023-10-03 10:28:17,468 WARNING [train.py:1204] (2/4) Exclude cut with ID _40402_210_18219_1_1533522324115_2953150_178-178004-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 10:28:17,478 WARNING [train.py:1204] (2/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_321-335475-0_sp1.1 from training. Duration: 0.4090625 2023-10-03 10:28:17,556 WARNING [train.py:1204] (2/4) Exclude cut with ID _66863_210_18053_1_1535698758334_8590319_1092-271784-0 from training. Duration: 0.96 2023-10-03 10:28:17,563 WARNING [train.py:1204] (2/4) Exclude cut with ID _28317_210_12742_1_1532480302381_8510325_607-213951-0_sp1.1 from training. Duration: 0.763625 2023-10-03 10:28:18,855 WARNING [train.py:1204] (2/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_468-283113-0 from training. Duration: 0.96 2023-10-03 10:28:21,749 WARNING [train.py:1204] (2/4) Exclude cut with ID _14922_210_6228_1_1530707435273_3693310_13-165512-0_sp1.1 from training. Duration: 0.7454375 2023-10-03 10:28:21,767 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_69630_210_7234_1_1535788670_910989_56-169762-0_sp1.1 from training. Duration: 0.92725 2023-10-03 10:28:23,830 WARNING [train.py:1204] (2/4) Exclude cut with ID _67335_210_5219_1_1535525975652_4275229_121-80357-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 10:28:23,873 WARNING [train.py:1204] (2/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_126-12079-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 10:28:24,927 WARNING [train.py:1204] (2/4) Exclude cut with ID _5294_210_1747_1_1527249643928_3696179_342-270278-0_sp1.1 from training. Duration: 0.5 2023-10-03 10:28:25,136 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.1.attention_skip_rate, batch_count=1234753.3333333333, ans=0.0 2023-10-03 10:28:26,307 INFO [train.py:1046] (2/4) Epoch 35, batch 4600, loss[loss=0.1452, simple_loss=0.2245, pruned_loss=0.03299, over 24399.00 frames. ], tot_loss[loss=0.1597, simple_loss=0.2388, pruned_loss=0.0403, over 4727683.09 frames. ], batch size: 58, lr: 2.90e-03, grad_scale: 16.0 2023-10-03 10:28:26,368 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_30322_210_1925_1_1532843478_826817_35-328264-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 10:28:29,082 WARNING [train.py:1204] (2/4) Exclude cut with ID _8480_210_1874_1_1528589391205_6728330_755-112625-0_sp1.1 from training. Duration: 0.67275 2023-10-03 10:28:30,946 WARNING [train.py:1204] (2/4) Exclude cut with ID _38670_210_15379_1_1533448810659_4391059_132-146412-0_sp1.1 from training. Duration: 0.97275 2023-10-03 10:28:32,339 WARNING [train.py:1204] (2/4) Exclude cut with ID _22020_210_2461_1_1531724164513_6326900_563-39691-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 10:28:35,142 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_9460_210_4211_1_1528954587_10049667_572-37035-0_sp1.1 from training. Duration: 0.72725 2023-10-03 10:28:35,153 WARNING [train.py:1204] (2/4) Exclude cut with ID _68777_210_4353_1_1535810390731_3117904_129-166012-0_sp0.9 from training. Duration: 0.9444375 2023-10-03 10:28:36,431 WARNING [train.py:1204] (2/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_487-91921-0_sp1.1 from training. Duration: 0.963625 2023-10-03 10:28:37,895 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_195-257512-0 from training. Duration: 0.67 2023-10-03 10:28:39,806 WARNING [train.py:1204] (2/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_188-260011-0_sp1.1 from training. Duration: 0.663625 2023-10-03 10:28:43,165 WARNING [train.py:1204] (2/4) Exclude cut with ID _8480_210_1874_1_1528589391205_6728330_309-139896-0_sp0.9 from training. Duration: 0.9 2023-10-03 10:28:43,380 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.feed_forward3.hidden_balancer.prob, batch_count=1234820.0, ans=0.125 2023-10-03 10:28:44,564 WARNING [train.py:1204] (2/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_866-321248-0_sp1.1 from training. Duration: 0.963625 2023-10-03 10:28:46,010 WARNING [train.py:1204] (2/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_510-216513-0_sp1.1 from training. Duration: 0.97275 2023-10-03 10:28:51,613 WARNING [train.py:1204] (2/4) Exclude cut with ID _6653_210_2629_1_1527919172441_6732679_758-143677-0 from training. Duration: 0.98 2023-10-03 10:28:53,350 WARNING [train.py:1204] (2/4) Exclude cut with ID _55134_210_21982_1_1534590028377_3882170_50-214427-0_sp1.1 from training. Duration: 0.97275 2023-10-03 10:28:54,345 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.1.encoder.layers.0.self_attn2.whiten, num_groups=1, num_channels=256, metric=13.82 vs. limit=22.5 2023-10-03 10:28:55,009 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.1.attention_skip_rate, batch_count=1234886.6666666667, ans=0.0 2023-10-03 10:28:56,236 WARNING [train.py:1204] (2/4) Exclude cut with ID _31649_210_6228_1_1532584608063_4091078_482-301330-0_sp1.1 from training. Duration: 0.97275 2023-10-03 10:28:58,959 WARNING [train.py:1204] (2/4) Exclude cut with ID _35799_210_5854_1_1533263487887_3529071_490-326847-0_sp1.1 from training. Duration: 0.9 2023-10-03 10:28:58,967 WARNING [train.py:1204] (2/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_984-90716-0_sp1.1 from training. Duration: 0.963625 2023-10-03 10:29:03,613 WARNING [train.py:1204] (2/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_166-246557-0 from training. Duration: 0.64 2023-10-03 10:29:03,614 WARNING [train.py:1204] (2/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_51-200220-0_sp1.1 from training. Duration: 0.5090625 2023-10-03 10:29:03,690 WARNING [train.py:1204] (2/4) Exclude cut with ID _51382_210_23862_1_1534492801838_3650104_275-92796-0_sp1.1 from training. Duration: 0.936375 2023-10-03 10:29:09,556 WARNING [train.py:1204] (2/4) Exclude cut with ID _29853_210_7497_1_1532505600851_6642110_461-142829-0_sp1.1 from training. Duration: 0.97275 2023-10-03 10:29:09,605 WARNING [train.py:1204] (2/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_788-298907-0_sp0.9 from training. Duration: 0.988875 2023-10-03 10:29:11,465 WARNING [train.py:1204] (2/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_739-2098-0_sp1.1 from training. Duration: 0.836375 2023-10-03 10:29:14,222 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7543_210_6753_1_1528541907_1399322_165-323619-0 from training. Duration: 0.69 2023-10-03 10:29:15,665 WARNING [train.py:1204] (2/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_401-255491-0_sp1.1 from training. Duration: 0.636375 2023-10-03 10:29:18,503 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.conv_module2.balancer1.prob, batch_count=1234953.3333333333, ans=0.125 2023-10-03 10:29:21,043 WARNING [train.py:1204] (2/4) Exclude cut with ID _66873_210_5517_1_1535454136506_4293370_24-143283-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 10:29:21,768 WARNING [train.py:1204] (2/4) Exclude cut with ID _14459_210_2228_1_1531443641946_7516700_1221-280439-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 10:29:24,508 WARNING [train.py:1204] (2/4) Exclude cut with ID _59202_210_26984_1_1534816810612_3549174_469-213482-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 10:29:24,508 WARNING [train.py:1204] (2/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_148-158509-0_sp0.9 from training. Duration: 0.4555625 2023-10-03 10:29:24,552 WARNING [train.py:1204] (2/4) Exclude cut with ID _24314_210_4915_1_1532135078116_7170179_863-259095-0_sp1.1 from training. Duration: 0.97275 2023-10-03 10:29:25,941 WARNING [train.py:1204] (2/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_321-22821-0 from training. Duration: 0.89 2023-10-03 10:29:25,966 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_43831_210_2158_1_1533725502_5872791_55-163137-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 10:29:27,323 WARNING [train.py:1204] (2/4) Exclude cut with ID _27430_210_7770_1_1532604689375_1378590_26-25207-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 10:29:28,636 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_17613_210_2151_1_1531137569_2366160_223-316461-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 10:29:28,713 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_53679_210_4852_1_1534556726_4186141_541-213370-0_sp1.1 from training. Duration: 0.963625 2023-10-03 10:29:30,103 WARNING [train.py:1204] (2/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_369-295086-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 10:29:31,410 WARNING [train.py:1204] (2/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_91-267696-0 from training. Duration: 0.87 2023-10-03 10:29:32,687 WARNING [train.py:1204] (2/4) Exclude cut with ID _5294_210_1747_1_1527249643928_3696179_180-100713-0 from training. Duration: 0.81 2023-10-03 10:29:32,718 WARNING [train.py:1204] (2/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_32-335103-0 from training. Duration: 0.82 2023-10-03 10:29:32,731 WARNING [train.py:1204] (2/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_133-312727-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 10:29:34,603 WARNING [train.py:1204] (2/4) Exclude cut with ID _59288_210_5854_1_1535077757774_3412246_391-165934-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 10:29:35,955 WARNING [train.py:1204] (2/4) Exclude cut with ID _28323_210_16187_1_1532480395667_4100300_488-1281-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 10:29:36,025 WARNING [train.py:1204] (2/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_124-193207-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 10:29:36,278 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.balancer2.prob, batch_count=1235020.0, ans=0.125 2023-10-03 10:29:40,765 INFO [train.py:1046] (2/4) Epoch 35, batch 4650, loss[loss=0.1647, simple_loss=0.2502, pruned_loss=0.0396, over 24059.00 frames. ], tot_loss[loss=0.1597, simple_loss=0.2386, pruned_loss=0.04037, over 4724412.18 frames. ], batch size: 80, lr: 2.90e-03, grad_scale: 8.0 2023-10-03 10:29:45,584 WARNING [train.py:1204] (2/4) Exclude cut with ID _14087_210_2253_1_1530529224512_3014029_226-242140-0_sp0.9 from training. Duration: 0.9 2023-10-03 10:29:48,322 WARNING [train.py:1204] (2/4) Exclude cut with ID _29277_210_3279_1_1532581109293_3173069_392-3694-0_sp1.1 from training. Duration: 0.936375 2023-10-03 10:29:48,364 WARNING [train.py:1204] (2/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_563-329842-0_sp1.1 from training. Duration: 0.97275 2023-10-03 10:29:49,741 WARNING [train.py:1204] (2/4) Exclude cut with ID _58583_210_5236_1_1534726835266_3574249_391-87755-0_sp1.1 from training. Duration: 0.92725 2023-10-03 10:29:49,768 WARNING [train.py:1204] (2/4) Exclude cut with ID _28427_210_4220_1_1532581240967_3061069_281-237827-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 10:29:49,782 WARNING [train.py:1204] (2/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_44-147898-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 10:29:51,156 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_24007_210_1794_1_1531918925_1663875_125-320772-0_sp1.1 from training. Duration: 0.97275 2023-10-03 10:29:55,683 WARNING [train.py:1204] (2/4) Exclude cut with ID _14368_210_4220_1_1530532714870_3595529_143-117421-0 from training. Duration: 0.58 2023-10-03 10:29:58,442 WARNING [train.py:1204] (2/4) Exclude cut with ID _69584_210_5219_1_1535785133775_4044450_325-7656-0_sp0.9 from training. Duration: 0.9555625 2023-10-03 10:30:01,225 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_37351_210_7925_1_1533012931_3812113_265-85780-0 from training. Duration: 0.88 2023-10-03 10:30:01,245 WARNING [train.py:1204] (2/4) Exclude cut with ID _18516_210_9206_1_1531704653206_3692406_331-237896-0_sp1.1 from training. Duration: 0.936375 2023-10-03 10:30:02,569 WARNING [train.py:1204] (2/4) Exclude cut with ID _14629_210_15709_1_1530604298182_956899_6-232695-0 from training. Duration: 0.94 2023-10-03 10:30:02,601 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7544_210_6753_1_1528587000_691468_87-178599-0_sp1.1 from training. Duration: 0.7909375 2023-10-03 10:30:03,913 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_326-303945-0 from training. Duration: 0.99 2023-10-03 10:30:03,934 WARNING [train.py:1204] (2/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_503-30719-0 from training. Duration: 0.98 2023-10-03 10:30:03,944 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_11305_210_1794_1_1529750428_7178148_299-344640-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 10:30:04,007 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_53071_210_16554_1_1534490405_2954066_265-170472-0_sp1.1 from training. Duration: 0.7545625 2023-10-03 10:30:07,466 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_174-277364-0_sp1.1 from training. Duration: 0.6454375 2023-10-03 10:30:08,814 WARNING [train.py:1204] (2/4) Exclude cut with ID _58414_210_8254_1_1535421609162_7151182_193-151884-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 10:30:08,831 WARNING [train.py:1204] (2/4) Exclude cut with ID IC0651W0104-322063 from training. Duration: 0.768 2023-10-03 10:30:10,230 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.548e+02 1.852e+02 2.049e+02 2.338e+02 3.401e+02, threshold=4.098e+02, percent-clipped=0.0 2023-10-03 10:30:13,480 WARNING [train.py:1204] (2/4) Exclude cut with ID _21881_210_3525_1_1531825242757_3798380_139-305982-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 10:30:14,854 WARNING [train.py:1204] (2/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_79-200392-0 from training. Duration: 0.97 2023-10-03 10:30:16,964 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.3.encoder.layers.0.feed_forward3.out_whiten, num_groups=1, num_channels=512, metric=12.77 vs. limit=15.0 2023-10-03 10:30:18,061 WARNING [train.py:1204] (2/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_451-280894-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 10:30:18,069 WARNING [train.py:1204] (2/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_304-273433-0_sp1.1 from training. Duration: 0.9 2023-10-03 10:30:18,149 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_5959_210_1794_1_1527400400_3445726_412-44052-0 from training. Duration: 0.77 2023-10-03 10:30:18,333 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.balancer2.prob, batch_count=1235220.0, ans=0.125 2023-10-03 10:30:19,510 WARNING [train.py:1204] (2/4) Exclude cut with ID _4681_210_3525_1_1527251040321_1235713_148-26159-0_sp1.1 from training. Duration: 0.963625 2023-10-03 10:30:22,408 WARNING [train.py:1204] (2/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_887-249555-0_sp0.9 from training. Duration: 0.8555625 2023-10-03 10:30:23,187 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.2.encoder.layers.0.feed_forward3.out_whiten, num_groups=1, num_channels=384, metric=13.42 vs. limit=15.0 2023-10-03 10:30:27,039 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_25236_210_2158_1_1532051968_280567_37-6860-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 10:30:30,140 INFO [scaling.py:1118] (2/4) WithLoss: name=encoder.encoders.4.encoder.layers.2.self_attn_weights, loss-sum=0.000e+00 2023-10-03 10:30:31,300 WARNING [train.py:1204] (2/4) Exclude cut with ID _29277_210_3279_1_1532581109293_3173069_417-115527-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 10:30:32,748 WARNING [train.py:1204] (2/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_552-134748-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 10:30:34,008 WARNING [train.py:1204] (2/4) Exclude cut with ID _43176_210_8254_1_1533524417469_4174339_188-174143-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 10:30:34,059 WARNING [train.py:1204] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_127-119109-0_sp1.1 from training. Duration: 0.6545625 2023-10-03 10:30:36,785 WARNING [train.py:1204] (2/4) Exclude cut with ID _14922_210_6228_1_1530707435273_3693310_38-187851-0 from training. Duration: 0.62 2023-10-03 10:30:36,820 WARNING [train.py:1204] (2/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_588-125165-0 from training. Duration: 0.83 2023-10-03 10:30:38,677 WARNING [train.py:1204] (2/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_344-152526-0_sp1.1 from training. Duration: 0.4818125 2023-10-03 10:30:38,679 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7002_210_1794_1_1527935634_4034444_487-236501-0 from training. Duration: 0.66 2023-10-03 10:30:40,153 WARNING [train.py:1204] (2/4) Exclude cut with ID _23825_210_13449_1_1531915313526_3497150_379-16981-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 10:30:41,902 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.nonlin_attention.balancer.min_positive, batch_count=1235353.3333333333, ans=0.05 2023-10-03 10:30:42,307 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.4.encoder.layers.1.feed_forward2.out_whiten, num_groups=1, num_channels=384, metric=10.53 vs. limit=15.0 2023-10-03 10:30:46,320 WARNING [train.py:1204] (2/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_297-5233-0_sp1.1 from training. Duration: 0.863625 2023-10-03 10:30:46,324 WARNING [train.py:1204] (2/4) Exclude cut with ID _41298_210_9019_1_1533430371450_4822143_178-283538-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 10:30:46,365 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7046_210_6753_1_1527895337_6920917_642-106799-0 from training. Duration: 0.92 2023-10-03 10:30:46,382 WARNING [train.py:1204] (2/4) Exclude cut with ID _6274_210_2084_1_1527937583837_7494020_532-291952-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 10:30:47,799 WARNING [train.py:1204] (2/4) Exclude cut with ID _47278_210_12041_1_1533902418349_3576110_260-270468-0_sp1.1 from training. Duration: 0.92725 2023-10-03 10:30:47,805 WARNING [train.py:1204] (2/4) Exclude cut with ID _14368_210_4220_1_1530532714870_3595529_144-3287-0_sp0.9 from training. Duration: 0.7444375 2023-10-03 10:30:49,782 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_162-262717-0_sp0.9 from training. Duration: 0.92225 2023-10-03 10:30:50,038 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.feed_forward3.hidden_balancer.prob, batch_count=1235353.3333333333, ans=0.125 2023-10-03 10:30:52,486 WARNING [train.py:1204] (2/4) Exclude cut with ID _3465_210_3525_1_1526175667060_3574049_62-335552-0_sp0.9 from training. Duration: 0.9666875 2023-10-03 10:30:52,492 WARNING [train.py:1204] (2/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_726-272852-0_sp1.1 from training. Duration: 0.92725 2023-10-03 10:30:52,579 WARNING [train.py:1204] (2/4) Exclude cut with ID _14898_210_9206_1_1530766812377_4177190_26-135492-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 10:30:55,116 INFO [train.py:1046] (2/4) Epoch 35, batch 4700, loss[loss=0.1709, simple_loss=0.2534, pruned_loss=0.04415, over 23408.00 frames. ], tot_loss[loss=0.1598, simple_loss=0.2389, pruned_loss=0.04035, over 4723301.55 frames. ], batch size: 93, lr: 2.90e-03, grad_scale: 8.0 2023-10-03 10:30:57,113 WARNING [train.py:1204] (2/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_200-95304-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 10:30:58,430 WARNING [train.py:1204] (2/4) Exclude cut with ID _45012_210_12651_1_1533608435136_5371380_240-37101-0_sp1.1 from training. Duration: 0.8454375 2023-10-03 10:30:58,439 WARNING [train.py:1204] (2/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_132-269633-0_sp0.9 from training. Duration: 0.7555625 2023-10-03 10:30:58,527 WARNING [train.py:1204] (2/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_907-151398-0_sp0.9 from training. Duration: 0.411125 2023-10-03 10:30:59,819 WARNING [train.py:1204] (2/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_45-265684-0_sp0.9 from training. Duration: 0.8 2023-10-03 10:31:01,230 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6291_210_3273_1_1527683649_1078498_74-292608-0 from training. Duration: 0.72 2023-10-03 10:31:08,046 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.4.encoder.layers.2.nonlin_attention.whiten2, num_groups=1, num_channels=384, metric=3.51 vs. limit=15.0 2023-10-03 10:31:08,912 WARNING [train.py:1204] (2/4) Exclude cut with ID _59202_210_26984_1_1534816810612_3549174_221-84819-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 10:31:10,276 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_33696_210_3852_1_1533082780_2663375_181-325595-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 10:31:10,319 WARNING [train.py:1204] (2/4) Exclude cut with ID _47251_210_18628_1_1533814063128_4137250_351-154105-0_sp1.1 from training. Duration: 0.963625 2023-10-03 10:31:11,647 WARNING [train.py:1204] (2/4) Exclude cut with ID _35501_210_9288_1_1532998507986_3192189_351-334740-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 10:31:13,152 WARNING [train.py:1204] (2/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_342-100157-0_sp1.1 from training. Duration: 0.4909375 2023-10-03 10:31:16,581 WARNING [train.py:1204] (2/4) Exclude cut with ID _6789_210_1534_1_1527857937218_3118950_262-48853-0 from training. Duration: 0.56 2023-10-03 10:31:17,891 WARNING [train.py:1204] (2/4) Exclude cut with ID _69884_210_9771_1_1535846392818_4166169_430-201020-0 from training. Duration: 0.97 2023-10-03 10:31:19,335 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_71-136518-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 10:31:21,160 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_28-291982-0_sp1.1 from training. Duration: 0.8818125 2023-10-03 10:31:21,192 WARNING [train.py:1204] (2/4) Exclude cut with ID _54218_210_12210_1_1534413646388_4409259_437-275326-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 10:31:24,031 WARNING [train.py:1204] (2/4) Exclude cut with ID _55134_210_21982_1_1534590028377_3882170_40-309139-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 10:31:29,939 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.0.bypass.scale_min, batch_count=1235553.3333333333, ans=0.2 2023-10-03 10:31:31,253 WARNING [train.py:1204] (2/4) Exclude cut with ID _32704_210_8954_1_1533017488840_6747370_266-329755-0_sp1.1 from training. Duration: 0.8090625 2023-10-03 10:31:32,655 WARNING [train.py:1204] (2/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_21-174514-0_sp1.1 from training. Duration: 0.3909375 2023-10-03 10:31:34,142 WARNING [train.py:1204] (2/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_409-207378-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 10:31:39,153 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.ff3_skip_rate, batch_count=1235620.0, ans=0.0 2023-10-03 10:31:40,462 WARNING [train.py:1204] (2/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_132-269633-0 from training. Duration: 0.68 2023-10-03 10:31:41,830 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_2245_210_2715_1_1525511548_3540254_310-52483-0_sp0.9 from training. Duration: 0.97775 2023-10-03 10:31:44,459 WARNING [train.py:1204] (2/4) Exclude cut with ID _27430_210_7770_1_1532604689375_1378590_47-77403-0_sp1.1 from training. Duration: 0.92725 2023-10-03 10:31:45,987 WARNING [train.py:1204] (2/4) Exclude cut with ID _7562_210_2084_1_1528887767916_3946229_186-326422-0 from training. Duration: 0.84 2023-10-03 10:31:47,862 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_14056_210_4163_1_1530604483_7572827_230-35993-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 10:31:51,234 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_23854_210_1794_1_1531969696_1329157_57-123491-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 10:31:52,460 WARNING [train.py:1204] (2/4) Exclude cut with ID _14747_210_8341_1_1530685797463_3654089_147-131229-0 from training. Duration: 0.76 2023-10-03 10:31:53,798 WARNING [train.py:1204] (2/4) Exclude cut with ID _30191_210_1664_1_1533189915047_7196670_430-19539-0_sp1.1 from training. Duration: 0.92725 2023-10-03 10:31:53,820 WARNING [train.py:1204] (2/4) Exclude cut with ID _67725_210_27767_1_1535608756671_5105359_44-50762-0_sp1.1 from training. Duration: 0.963625 2023-10-03 10:31:57,272 WARNING [train.py:1204] (2/4) Exclude cut with ID _27485_210_4916_1_1532739191677_3668269_217-161755-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 10:31:58,539 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_5931_210_2040_1_1527428481_4547999_321-193614-0_sp1.1 from training. Duration: 0.6090625 2023-10-03 10:31:58,574 WARNING [train.py:1204] (2/4) Exclude cut with ID _6267_210_2084_1_1527764658526_3894730_375-219887-0 from training. Duration: 0.67 2023-10-03 10:31:59,914 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9087W0022-538393 from training. Duration: 0.768 2023-10-03 10:32:01,322 WARNING [train.py:1204] (2/4) Exclude cut with ID _68777_210_4353_1_1535810390731_3117904_83-196459-0_sp1.1 from training. Duration: 0.963625 2023-10-03 10:32:01,461 WARNING [train.py:1204] (2/4) Exclude cut with ID _69884_210_9771_1_1535846392818_4166169_455-343812-0_sp1.1 from training. Duration: 0.92725 2023-10-03 10:32:01,462 WARNING [train.py:1204] (2/4) Exclude cut with ID _13916_210_9994_1_1530788341126_3763149_414-320174-0_sp1.1 from training. Duration: 0.92725 2023-10-03 10:32:01,466 WARNING [train.py:1204] (2/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_398-239819-0 from training. Duration: 0.99 2023-10-03 10:32:02,747 WARNING [train.py:1204] (2/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_199-232314-0_sp1.1 from training. Duration: 0.92725 2023-10-03 10:32:04,408 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.ff3_skip_rate, batch_count=1235686.6666666667, ans=0.0 2023-10-03 10:32:07,478 WARNING [train.py:1204] (2/4) Exclude cut with ID _2334_210_2667_1_1525608238880_4705129_459-104997-0 from training. Duration: 0.95 2023-10-03 10:32:10,140 INFO [train.py:1046] (2/4) Epoch 35, batch 4750, loss[loss=0.1799, simple_loss=0.2548, pruned_loss=0.05249, over 23814.00 frames. ], tot_loss[loss=0.1598, simple_loss=0.239, pruned_loss=0.04031, over 4728579.63 frames. ], batch size: 179, lr: 2.90e-03, grad_scale: 8.0 2023-10-03 10:32:10,205 WARNING [train.py:1204] (2/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_441-209395-0_sp1.1 from training. Duration: 0.936375 2023-10-03 10:32:10,283 WARNING [train.py:1204] (2/4) Exclude cut with ID _27052_210_5986_1_1532329197187_3585009_320-80393-0_sp1.1 from training. Duration: 0.97275 2023-10-03 10:32:10,515 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.conv_skip_rate, batch_count=1235753.3333333333, ans=0.0 2023-10-03 10:32:15,803 WARNING [train.py:1204] (2/4) Exclude cut with ID _21783_210_11357_1_1531742458730_3762679_89-217253-0_sp1.1 from training. Duration: 0.97275 2023-10-03 10:32:15,826 WARNING [train.py:1204] (2/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_453-282588-0_sp1.1 from training. Duration: 0.8909375 2023-10-03 10:32:19,130 WARNING [train.py:1204] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_88-341718-0 from training. Duration: 0.66 2023-10-03 10:32:20,468 WARNING [train.py:1204] (2/4) Exclude cut with ID _43552_210_8913_1_1533726031059_3523429_230-3521-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 10:32:23,638 WARNING [train.py:1204] (2/4) Exclude cut with ID _9679_210_7497_1_1529129212906_4363929_512-313889-0 from training. Duration: 0.81 2023-10-03 10:32:25,093 WARNING [train.py:1204] (2/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_194-252800-0_sp1.1 from training. Duration: 0.8818125 2023-10-03 10:32:25,123 WARNING [train.py:1204] (2/4) Exclude cut with ID _55209_210_1531_1_1534675828996_4597160_239-293541-0_sp1.1 from training. Duration: 0.963625 2023-10-03 10:32:25,158 WARNING [train.py:1204] (2/4) Exclude cut with ID _35799_210_5854_1_1533263487887_3529071_15-325131-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 10:32:28,087 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.bypass.scale_min, batch_count=1235820.0, ans=0.2 2023-10-03 10:32:28,752 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=1235820.0, ans=0.1 2023-10-03 10:32:30,976 WARNING [train.py:1204] (2/4) Exclude cut with ID _14100_210_8341_1_1530529320613_3678069_279-55760-0 from training. Duration: 0.93 2023-10-03 10:32:35,193 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_489-230703-0_sp1.1 from training. Duration: 0.77275 2023-10-03 10:32:36,558 WARNING [train.py:1204] (2/4) Exclude cut with ID _6694_210_4599_1_1527852637490_3965529_144-185051-0 from training. Duration: 0.96 2023-10-03 10:32:36,748 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.bypass.scale_min, batch_count=1235820.0, ans=0.2 2023-10-03 10:32:37,929 WARNING [train.py:1204] (2/4) Exclude cut with ID _28461_210_2253_1_1532771775189_3041299_1-228155-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 10:32:39,854 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.646e+02 1.852e+02 2.146e+02 2.471e+02 3.124e+02, threshold=4.292e+02, percent-clipped=0.0 2023-10-03 10:32:41,413 WARNING [train.py:1204] (2/4) Exclude cut with ID _32704_210_8954_1_1533017488840_6747370_686-252323-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 10:32:41,416 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_40896_210_15710_1_1533433956_7100802_1075-294633-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 10:32:41,432 WARNING [train.py:1204] (2/4) Exclude cut with ID _22083_210_8254_1_1531702862439_3678970_30-92879-0_sp1.1 from training. Duration: 0.97275 2023-10-03 10:32:44,139 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9245W0121-616888 from training. Duration: 0.896 2023-10-03 10:32:44,142 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6786_210_1388_1_1527839699_4194495_378-46070-0 from training. Duration: 0.72 2023-10-03 10:32:46,073 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.5.encoder.layers.1.conv_module1.whiten, num_groups=1, num_channels=256, metric=3.24 vs. limit=15.0 2023-10-03 10:32:50,098 WARNING [train.py:1204] (2/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_317-175062-0 from training. Duration: 0.82 2023-10-03 10:32:51,647 WARNING [train.py:1204] (2/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_238-89390-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 10:32:53,069 WARNING [train.py:1204] (2/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_524-310811-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 10:32:53,288 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.conv_skip_rate, batch_count=1235953.3333333333, ans=0.0 2023-10-03 10:32:55,949 WARNING [train.py:1204] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_307-158462-0_sp0.9 from training. Duration: 0.7666875 2023-10-03 10:32:55,950 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9309W0191-640318 from training. Duration: 0.512 2023-10-03 10:32:55,954 WARNING [train.py:1204] (2/4) Exclude cut with ID _55153_210_2253_1_1534384786677_3162096_100-204617-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 10:32:59,801 WARNING [train.py:1204] (2/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_112-149213-0_sp1.1 from training. Duration: 0.863625 2023-10-03 10:33:02,419 WARNING [train.py:1204] (2/4) Exclude cut with ID _67831_210_12392_1_1535612464202_3092612_346-2148-0_sp1.1 from training. Duration: 0.8454375 2023-10-03 10:33:03,810 WARNING [train.py:1204] (2/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_51-117576-0_sp0.9 from training. Duration: 0.4444375 2023-10-03 10:33:03,847 WARNING [train.py:1204] (2/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_300-27235-0 from training. Duration: 0.87 2023-10-03 10:33:05,397 WARNING [train.py:1204] (2/4) Exclude cut with ID _40399_210_4353_1_1533305658194_3279406_195-116825-0_sp1.1 from training. Duration: 0.97275 2023-10-03 10:33:05,423 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7002_210_1794_1_1527939683_5157286_163-87552-0_sp0.9 from training. Duration: 0.9555625 2023-10-03 10:33:06,765 WARNING [train.py:1204] (2/4) Exclude cut with ID _19514_210_9259_1_1531530071330_5084230_560-237597-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 10:33:06,830 WARNING [train.py:1204] (2/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_41-234353-0_sp1.1 from training. Duration: 0.6181875 2023-10-03 10:33:06,846 WARNING [train.py:1204] (2/4) Exclude cut with ID _6274_210_2084_1_1527937583837_7494020_769-168302-0 from training. Duration: 0.71 2023-10-03 10:33:07,091 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.ff3_skip_rate, batch_count=1235953.3333333333, ans=0.0 2023-10-03 10:33:09,592 WARNING [train.py:1204] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_574-45541-0 from training. Duration: 0.65 2023-10-03 10:33:12,361 WARNING [train.py:1204] (2/4) Exclude cut with ID _50310_210_15488_1_1534068043345_3641080_262-67120-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 10:33:12,587 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.feed_forward1.hidden_balancer.prob, batch_count=1236020.0, ans=0.125 2023-10-03 10:33:15,566 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_53071_210_16554_1_1534493434_1276796_35-11927-0_sp1.1 from training. Duration: 0.963625 2023-10-03 10:33:15,568 WARNING [train.py:1204] (2/4) Exclude cut with ID _3992_210_1747_1_1526644816031_3731060_107-251434-0 from training. Duration: 0.99 2023-10-03 10:33:15,614 WARNING [train.py:1204] (2/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_412-54488-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 10:33:17,016 WARNING [train.py:1204] (2/4) Exclude cut with ID _18516_210_9206_1_1531704653206_3692406_77-258490-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 10:33:18,440 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_40047_210_2290_1_1533290519_2374311_195-165693-0_sp0.9 from training. Duration: 0.988875 2023-10-03 10:33:18,496 WARNING [train.py:1204] (2/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_296-36971-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 10:33:20,262 WARNING [train.py:1204] (2/4) Exclude cut with ID _14970_210_15709_1_1530773543493_1715730_187-20259-0_sp1.1 from training. Duration: 0.8090625 2023-10-03 10:33:21,733 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_53373_210_5399_1_1534229471_4715695_135-305927-0_sp1.1 from training. Duration: 0.936375 2023-10-03 10:33:21,756 WARNING [train.py:1204] (2/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_60-18123-0 from training. Duration: 0.88 2023-10-03 10:33:24,454 INFO [train.py:1046] (2/4) Epoch 35, batch 4800, loss[loss=0.2278, simple_loss=0.281, pruned_loss=0.08731, over 19159.00 frames. ], tot_loss[loss=0.1603, simple_loss=0.2397, pruned_loss=0.04051, over 4732649.96 frames. ], batch size: 388, lr: 2.89e-03, grad_scale: 16.0 2023-10-03 10:33:24,510 WARNING [train.py:1204] (2/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_166-166990-0 from training. Duration: 0.96 2023-10-03 10:33:24,603 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_5438_210_1794_1_1527336687_2600009_233-58072-0 from training. Duration: 0.77 2023-10-03 10:33:28,000 WARNING [train.py:1204] (2/4) Exclude cut with ID _8833_210_5219_1_1528592665210_7553059_611-80313-0_sp0.9 from training. Duration: 0.711125 2023-10-03 10:33:28,027 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_53555_210_5632_1_1534595220_7232297_118-43239-0_sp1.1 from training. Duration: 0.936375 2023-10-03 10:33:29,408 WARNING [train.py:1204] (2/4) Exclude cut with ID _12369_210_4381_1_1530356078581_4388520_480-13146-0 from training. Duration: 0.85 2023-10-03 10:33:34,256 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_892-28814-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 10:33:34,307 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_24899_210_16554_1_1532046422_3827462_160-21255-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 10:33:35,792 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.feed_forward2.hidden_balancer.prob, batch_count=1236086.6666666667, ans=0.125 2023-10-03 10:33:39,716 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_154-316450-0_sp1.1 from training. Duration: 0.6909375 2023-10-03 10:33:39,795 WARNING [train.py:1204] (2/4) Exclude cut with ID _30196_210_1664_1_1533708496522_7074140_1225-301232-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 10:33:39,821 WARNING [train.py:1204] (2/4) Exclude cut with ID _28783_210_7943_1_1532671164808_7155479_414-208100-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 10:33:41,105 WARNING [train.py:1204] (2/4) Exclude cut with ID _19023_210_4353_1_1531378892552_5820780_83-254794-0 from training. Duration: 0.86 2023-10-03 10:33:41,169 WARNING [train.py:1204] (2/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_591-83752-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 10:33:41,211 WARNING [train.py:1204] (2/4) Exclude cut with ID _29877_210_6228_1_1532397791106_5276149_266-62291-0_sp1.1 from training. Duration: 0.7909375 2023-10-03 10:33:42,622 WARNING [train.py:1204] (2/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_280-161633-0_sp1.1 from training. Duration: 0.663625 2023-10-03 10:33:42,880 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.conv_skip_rate, batch_count=1236153.3333333333, ans=0.0 2023-10-03 10:33:44,681 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.feed_forward2.hidden_balancer.prob, batch_count=1236153.3333333333, ans=0.125 2023-10-03 10:33:47,030 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7543_210_6753_1_1528539468_333764_39-156634-0_sp1.1 from training. Duration: 0.97275 2023-10-03 10:33:50,243 WARNING [train.py:1204] (2/4) Exclude cut with ID _67499_210_17282_1_1535509805220_3692430_505-312248-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 10:33:50,265 WARNING [train.py:1204] (2/4) Exclude cut with ID _5294_210_1747_1_1527249643928_3696179_180-100713-0_sp0.9 from training. Duration: 0.9 2023-10-03 10:33:50,352 WARNING [train.py:1204] (2/4) Exclude cut with ID _26528_210_9070_1_1532570410842_3851310_508-148132-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 10:33:51,656 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_211-339622-0_sp1.1 from training. Duration: 0.4090625 2023-10-03 10:33:51,677 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_40896_210_15710_1_1533433956_7100802_339-138356-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 10:33:53,048 WARNING [train.py:1204] (2/4) Exclude cut with ID _27060_210_5986_1_1532674838922_3522549_211-19933-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 10:33:55,835 WARNING [train.py:1204] (2/4) Exclude cut with ID _60229_210_27767_1_1534838406158_3655350_457-117923-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 10:33:59,051 WARNING [train.py:1204] (2/4) Exclude cut with ID _55429_210_10626_1_1534467588527_7174143_460-141439-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 10:33:59,244 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.nonlin_attention.balancer.prob, batch_count=1236220.0, ans=0.125 2023-10-03 10:34:00,413 WARNING [train.py:1204] (2/4) Exclude cut with ID _59170_210_6721_1_1534983633960_4203335_263-10780-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 10:34:00,424 WARNING [train.py:1204] (2/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_872-264525-0_sp0.9 from training. Duration: 0.72225 2023-10-03 10:34:03,182 WARNING [train.py:1204] (2/4) Exclude cut with ID _7624_210_1531_1_1528109802198_4933101_418-349834-0_sp0.9 from training. Duration: 0.6666875 2023-10-03 10:34:04,575 WARNING [train.py:1204] (2/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_27-53998-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 10:34:05,951 WARNING [train.py:1204] (2/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_202-183922-0 from training. Duration: 0.85 2023-10-03 10:34:07,301 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7002_210_1794_1_1527939683_5157286_1-281264-0 from training. Duration: 0.44 2023-10-03 10:34:07,384 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_53753_210_7234_1_1534320003_3594148_493-9314-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 10:34:08,674 WARNING [train.py:1204] (2/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_217-95056-0_sp1.1 from training. Duration: 0.8909375 2023-10-03 10:34:08,718 WARNING [train.py:1204] (2/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_406-45086-0_sp1.1 from training. Duration: 0.72725 2023-10-03 10:34:08,723 WARNING [train.py:1204] (2/4) Exclude cut with ID _31649_210_6228_1_1532584608063_4091078_505-213821-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 10:34:08,731 WARNING [train.py:1204] (2/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_317-175062-0_sp0.9 from training. Duration: 0.911125 2023-10-03 10:34:10,146 WARNING [train.py:1204] (2/4) Exclude cut with ID _5444_210_2151_1_1527418808464_5312210_726-27165-0_sp1.1 from training. Duration: 0.6090625 2023-10-03 10:34:11,469 WARNING [train.py:1204] (2/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_494-170437-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 10:34:14,313 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_474-64054-0_sp1.1 from training. Duration: 0.936375 2023-10-03 10:34:17,621 WARNING [train.py:1204] (2/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_521-7556-0_sp1.1 from training. Duration: 0.97275 2023-10-03 10:34:17,736 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_269-348505-0_sp1.1 from training. Duration: 0.92725 2023-10-03 10:34:22,451 WARNING [train.py:1204] (2/4) Exclude cut with ID _21878_210_3525_1_1531738772002_7264330_559-184862-0 from training. Duration: 0.94 2023-10-03 10:34:23,693 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_68702_210_23689_1_1535799577_3420068_92-76040-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 10:34:23,735 WARNING [train.py:1204] (2/4) Exclude cut with ID _22826_210_9994_1_1531908201522_3279169_227-102052-0_sp1.1 from training. Duration: 0.97275 2023-10-03 10:34:25,088 WARNING [train.py:1204] (2/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_718-250907-0_sp0.9 from training. Duration: 0.7666875 2023-10-03 10:34:25,152 WARNING [train.py:1204] (2/4) Exclude cut with ID _14069_210_5632_1_1530512692033_4803390_352-143891-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 10:34:29,571 WARNING [train.py:1204] (2/4) Exclude cut with ID _6267_210_2084_1_1527764658526_3894730_65-106602-0_sp1.1 from training. Duration: 0.936375 2023-10-03 10:34:30,237 WARNING [train.py:1204] (2/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_202-183922-0_sp0.9 from training. Duration: 0.9444375 2023-10-03 10:34:30,251 WARNING [train.py:1204] (2/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_465-268827-0_sp1.1 from training. Duration: 0.97275 2023-10-03 10:34:31,592 WARNING [train.py:1204] (2/4) Exclude cut with ID _12369_210_4381_1_1530356078581_4388520_58-29064-0_sp1.1 from training. Duration: 0.9 2023-10-03 10:34:31,633 WARNING [train.py:1204] (2/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_1-99680-0_sp0.9 from training. Duration: 0.7444375 2023-10-03 10:34:33,017 WARNING [train.py:1204] (2/4) Exclude cut with ID _13253_210_1925_1_1530757775819_3577873_191-144977-0_sp1.1 from training. Duration: 0.7090625 2023-10-03 10:34:35,799 WARNING [train.py:1204] (2/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_276-112684-0_sp1.1 from training. Duration: 0.92725 2023-10-03 10:34:35,807 WARNING [train.py:1204] (2/4) Exclude cut with ID _64639_210_5816_1_1535367619437_3528000_358-411-0_sp1.1 from training. Duration: 0.97275 2023-10-03 10:34:35,821 WARNING [train.py:1204] (2/4) Exclude cut with ID _31649_210_6228_1_1532584608063_4091078_106-21199-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 10:34:37,170 WARNING [train.py:1204] (2/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_901-79438-0 from training. Duration: 0.94 2023-10-03 10:34:38,459 INFO [train.py:1046] (2/4) Epoch 35, batch 4850, loss[loss=0.1551, simple_loss=0.2324, pruned_loss=0.03894, over 23243.00 frames. ], tot_loss[loss=0.1609, simple_loss=0.2403, pruned_loss=0.04078, over 4726521.47 frames. ], batch size: 105, lr: 2.89e-03, grad_scale: 16.0 2023-10-03 10:34:40,018 WARNING [train.py:1204] (2/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_718-250907-0 from training. Duration: 0.69 2023-10-03 10:34:40,024 WARNING [train.py:1204] (2/4) Exclude cut with ID _5287_210_2084_1_1527418883040_3878670_363-194161-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 10:34:40,029 WARNING [train.py:1204] (2/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_586-321321-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 10:34:41,383 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_68900_210_2087_1_1535713157_3708529_164-292661-0_sp1.1 from training. Duration: 0.963625 2023-10-03 10:34:41,384 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_26433_210_12108_1_1532157158_4056480_114-144878-0_sp1.1 from training. Duration: 0.97275 2023-10-03 10:34:42,864 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_15517_210_6753_1_1530954008_3487336_96-341874-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 10:34:43,095 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.self_attn_weights.pos_emb_skip_rate, batch_count=1236420.0, ans=0.0 2023-10-03 10:34:47,685 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.conv_module1.balancer1.prob, batch_count=1236420.0, ans=0.125 2023-10-03 10:34:51,591 WARNING [train.py:1204] (2/4) Exclude cut with ID _14268_210_9206_1_1530622771285_4714099_637-86630-0 from training. Duration: 0.51 2023-10-03 10:34:53,397 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_28211_210_6388_1_1532861506_4523224_121-320092-0_sp1.1 from training. Duration: 0.92725 2023-10-03 10:34:56,280 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_141-89113-0_sp0.9 from training. Duration: 0.9555625 2023-10-03 10:34:57,637 WARNING [train.py:1204] (2/4) Exclude cut with ID _6789_210_1534_1_1527857937218_3118950_311-61262-0_sp1.1 from training. Duration: 0.5909375 2023-10-03 10:34:57,665 WARNING [train.py:1204] (2/4) Exclude cut with ID _58480_210_12392_1_1534811121111_3646160_389-200461-0_sp1.1 from training. Duration: 0.97275 2023-10-03 10:35:01,085 WARNING [train.py:1204] (2/4) Exclude cut with ID _41326_210_5854_1_1533778076509_3492080_28-156553-0_sp1.1 from training. Duration: 0.92725 2023-10-03 10:35:02,444 WARNING [train.py:1204] (2/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_577-287150-0_sp1.1 from training. Duration: 0.7454375 2023-10-03 10:35:03,775 WARNING [train.py:1204] (2/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_40-129756-0_sp1.1 from training. Duration: 0.736375 2023-10-03 10:35:03,790 WARNING [train.py:1204] (2/4) Exclude cut with ID _60113_210_16271_1_1535164212481_4386079_490-280290-0 from training. Duration: 0.98 2023-10-03 10:35:06,632 WARNING [train.py:1204] (2/4) Exclude cut with ID _58059_210_5854_1_1534901471149_3164808_34-39905-0_sp1.1 from training. Duration: 0.963625 2023-10-03 10:35:07,876 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.563e+02 1.944e+02 2.127e+02 2.497e+02 3.827e+02, threshold=4.255e+02, percent-clipped=0.0 2023-10-03 10:35:08,032 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_791-197891-0_sp1.1 from training. Duration: 0.763625 2023-10-03 10:35:09,341 WARNING [train.py:1204] (2/4) Exclude cut with ID _6789_210_1534_1_1527857937218_3118950_262-48853-0_sp1.1 from training. Duration: 0.5090625 2023-10-03 10:35:09,415 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_2655_210_2087_1_1525688959_3897400_52-279899-0_sp0.9 from training. Duration: 0.7666875 2023-10-03 10:35:09,419 WARNING [train.py:1204] (2/4) Exclude cut with ID _14884_210_2252_1_1530707459689_5561250_114-201399-0 from training. Duration: 0.76 2023-10-03 10:35:12,247 WARNING [train.py:1204] (2/4) Exclude cut with ID _19023_210_4353_1_1531378892552_5820780_83-254794-0_sp0.9 from training. Duration: 0.9555625 2023-10-03 10:35:12,262 WARNING [train.py:1204] (2/4) Exclude cut with ID _53489_210_6228_1_1534298357169_3835970_10-297919-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 10:35:15,160 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=1236553.3333333333, ans=0.1 2023-10-03 10:35:18,170 WARNING [train.py:1204] (2/4) Exclude cut with ID _44585_210_18628_1_1533605350387_4518199_418-3055-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 10:35:18,191 WARNING [train.py:1204] (2/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_310-109461-0 from training. Duration: 0.8 2023-10-03 10:35:18,229 WARNING [train.py:1204] (2/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_235-262124-0 from training. Duration: 0.67 2023-10-03 10:35:19,521 WARNING [train.py:1204] (2/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_195-167011-0_sp1.1 from training. Duration: 0.6818125 2023-10-03 10:35:27,189 WARNING [train.py:1204] (2/4) Exclude cut with ID _7852_210_5219_1_1528286471316_8139400_188-55162-0_sp1.1 from training. Duration: 0.936375 2023-10-03 10:35:27,250 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_35361_210_4238_1_1533262810_7793476_675-87012-0 from training. Duration: 0.89 2023-10-03 10:35:28,681 WARNING [train.py:1204] (2/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_465-81427-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 10:35:28,688 WARNING [train.py:1204] (2/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_597-154640-0_sp1.1 from training. Duration: 0.8181875 2023-10-03 10:35:30,052 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.2.encoder.layers.2.nonlin_attention.whiten2, num_groups=1, num_channels=384, metric=4.30 vs. limit=15.0 2023-10-03 10:35:32,153 WARNING [train.py:1204] (2/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_186-309161-0_sp0.9 from training. Duration: 0.6 2023-10-03 10:35:32,427 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.feed_forward1.out_proj.dropout_p, batch_count=1236620.0, ans=0.1 2023-10-03 10:35:33,513 WARNING [train.py:1204] (2/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_546-119312-0 from training. Duration: 0.99 2023-10-03 10:35:33,518 WARNING [train.py:1204] (2/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_330-240060-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 10:35:33,600 WARNING [train.py:1204] (2/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_477-255782-0 from training. Duration: 0.82 2023-10-03 10:35:33,623 WARNING [train.py:1204] (2/4) Exclude cut with ID _14970_210_15709_1_1530773543493_1715730_150-248939-0_sp1.1 from training. Duration: 0.97275 2023-10-03 10:35:35,030 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_556-206615-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 10:35:36,433 WARNING [train.py:1204] (2/4) Exclude cut with ID _3465_210_3525_1_1526175667060_3574049_308-231735-0 from training. Duration: 0.73 2023-10-03 10:35:43,426 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.conv_skip_rate, batch_count=1236686.6666666667, ans=0.0 2023-10-03 10:35:44,536 WARNING [train.py:1204] (2/4) Exclude cut with ID _27485_210_4916_1_1532739191677_3668269_66-236571-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 10:35:46,319 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.feed_forward3.hidden_balancer.prob, batch_count=1236686.6666666667, ans=0.125 2023-10-03 10:35:49,365 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_2221_210_1988_1_1525597051_4935541_415-296882-0_sp0.9 from training. Duration: 0.8555625 2023-10-03 10:35:49,373 WARNING [train.py:1204] (2/4) Exclude cut with ID _64521_210_14699_1_1535243427259_3815328_231-78630-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 10:35:52,528 INFO [train.py:1046] (2/4) Epoch 35, batch 4900, loss[loss=0.1595, simple_loss=0.2489, pruned_loss=0.03506, over 24696.00 frames. ], tot_loss[loss=0.1598, simple_loss=0.2388, pruned_loss=0.0404, over 4718057.92 frames. ], batch size: 73, lr: 2.89e-03, grad_scale: 16.0 2023-10-03 10:35:55,221 WARNING [train.py:1204] (2/4) Exclude cut with ID _13942_210_9988_1_1530604906036_3733360_499-55676-0 from training. Duration: 0.62 2023-10-03 10:35:55,223 WARNING [train.py:1204] (2/4) Exclude cut with ID _49376_210_5986_1_1533985278732_3370740_301-154763-0_sp1.1 from training. Duration: 0.8909375 2023-10-03 10:35:58,125 WARNING [train.py:1204] (2/4) Exclude cut with ID _27485_210_4916_1_1532739191677_3668269_255-216698-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 10:35:59,941 WARNING [train.py:1204] (2/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_137-334143-0_sp1.1 from training. Duration: 0.97275 2023-10-03 10:35:59,974 WARNING [train.py:1204] (2/4) Exclude cut with ID _6456_210_1534_1_1527681634212_3181049_215-309359-0_sp1.1 from training. Duration: 0.8 2023-10-03 10:36:03,206 WARNING [train.py:1204] (2/4) Exclude cut with ID _65640_210_6310_1_1535368201445_3173250_209-13008-0 from training. Duration: 0.99 2023-10-03 10:36:06,217 WARNING [train.py:1204] (2/4) Exclude cut with ID _12369_210_4381_1_1530356078581_4388520_58-29064-0 from training. Duration: 0.99 2023-10-03 10:36:11,540 WARNING [train.py:1204] (2/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_410-287857-0 from training. Duration: 0.93 2023-10-03 10:36:12,919 WARNING [train.py:1204] (2/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_153-153537-0 from training. Duration: 0.91 2023-10-03 10:36:12,950 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7544_210_6753_1_1528587722_3576686_172-171750-0_sp0.9 from training. Duration: 0.72225 2023-10-03 10:36:12,986 WARNING [train.py:1204] (2/4) Exclude cut with ID _64521_210_14699_1_1535243427259_3815328_464-183853-0_sp1.1 from training. Duration: 0.97275 2023-10-03 10:36:12,999 WARNING [train.py:1204] (2/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_385-210308-0_sp1.1 from training. Duration: 0.963625 2023-10-03 10:36:13,006 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_46013_210_7234_1_1533776362_3744032_354-261865-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 10:36:13,013 WARNING [train.py:1204] (2/4) Exclude cut with ID _3067_210_2667_1_1526212974640_4503168_250-285937-0_sp1.1 from training. Duration: 0.72725 2023-10-03 10:36:14,446 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_40036_210_13405_1_1533648647_1315388_48-146016-0 from training. Duration: 0.93 2023-10-03 10:36:17,206 WARNING [train.py:1204] (2/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_280-161633-0 from training. Duration: 0.73 2023-10-03 10:36:19,088 WARNING [train.py:1204] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_308-161067-0_sp0.9 from training. Duration: 0.7333125 2023-10-03 10:36:19,198 WARNING [train.py:1204] (2/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_68-212801-0_sp0.9 from training. Duration: 0.811125 2023-10-03 10:36:20,507 WARNING [train.py:1204] (2/4) Exclude cut with ID _6789_210_1534_1_1527857937218_3118950_311-61262-0_sp0.9 from training. Duration: 0.72225 2023-10-03 10:36:22,551 WARNING [train.py:1204] (2/4) Exclude cut with ID _14632_210_9988_1_1530691279392_3923200_102-204477-0_sp1.1 from training. Duration: 0.8818125 2023-10-03 10:36:23,876 WARNING [train.py:1204] (2/4) Exclude cut with ID _65242_210_18628_1_1535369393717_3823149_290-117517-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 10:36:25,321 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_69630_210_7234_1_1535788670_910989_35-337140-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 10:36:25,336 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_5618_210_1874_1_1527311008_5851_1-214501-0 from training. Duration: 0.689 2023-10-03 10:36:27,996 WARNING [train.py:1204] (2/4) Exclude cut with ID _8064_210_5399_1_1528372299291_4282973_168-50337-0_sp0.9 from training. Duration: 0.9333125 2023-10-03 10:36:29,979 WARNING [train.py:1204] (2/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_185-14438-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 10:36:30,000 WARNING [train.py:1204] (2/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_61-327062-0 from training. Duration: 0.81 2023-10-03 10:36:30,006 WARNING [train.py:1204] (2/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_98-741-0 from training. Duration: 0.8 2023-10-03 10:36:30,914 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.balancer1.prob, batch_count=1236886.6666666667, ans=0.125 2023-10-03 10:36:33,493 WARNING [train.py:1204] (2/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_312-204798-0 from training. Duration: 0.93 2023-10-03 10:36:36,234 WARNING [train.py:1204] (2/4) Exclude cut with ID _14998_210_5219_1_1530705582080_7474970_787-294416-0_sp0.9 from training. Duration: 0.911125 2023-10-03 10:36:36,321 WARNING [train.py:1204] (2/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_140-181176-0_sp1.1 from training. Duration: 0.836375 2023-10-03 10:36:36,351 WARNING [train.py:1204] (2/4) Exclude cut with ID _14687_210_5854_1_1530703838259_3664889_115-239046-0_sp1.1 from training. Duration: 0.6090625 2023-10-03 10:36:37,636 WARNING [train.py:1204] (2/4) Exclude cut with ID _56423_210_5854_1_1534896061049_3165969_370-230614-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 10:36:38,933 WARNING [train.py:1204] (2/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_406-275649-0_sp0.9 from training. Duration: 0.4666875 2023-10-03 10:36:38,944 WARNING [train.py:1204] (2/4) Exclude cut with ID _15150_210_8341_1_1530752411803_7256384_22-138274-0_sp1.1 from training. Duration: 0.87275 2023-10-03 10:36:38,988 WARNING [train.py:1204] (2/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_125-125818-0 from training. Duration: 0.96 2023-10-03 10:36:40,351 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_4399_210_1874_1_1526705485_2400757_220-47974-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 10:36:42,920 WARNING [train.py:1204] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_308-161067-0_sp1.1 from training. Duration: 0.6 2023-10-03 10:36:44,310 WARNING [train.py:1204] (2/4) Exclude cut with ID _53489_210_6228_1_1534298357169_3835970_102-57645-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 10:36:47,000 WARNING [train.py:1204] (2/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_178-177582-0 from training. Duration: 0.74 2023-10-03 10:36:47,083 WARNING [train.py:1204] (2/4) Exclude cut with ID _14349_210_8341_1_1530615710818_3442470_170-336541-0_sp1.1 from training. Duration: 0.7818125 2023-10-03 10:36:48,968 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_521-214276-0_sp0.9 from training. Duration: 0.488875 2023-10-03 10:36:49,029 WARNING [train.py:1204] (2/4) Exclude cut with ID _66070_210_7640_1_1535441393096_7805310_489-292631-0 from training. Duration: 0.79 2023-10-03 10:36:50,532 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.1.balancer2.prob, batch_count=1237020.0, ans=0.125 2023-10-03 10:36:52,452 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=1237020.0, ans=0.1 2023-10-03 10:36:56,540 WARNING [train.py:1204] (2/4) Exclude cut with ID _14069_210_5632_1_1530512692033_4803390_438-340023-0_sp1.1 from training. Duration: 0.936375 2023-10-03 10:36:56,694 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.feed_forward1.hidden_balancer.prob, batch_count=1237020.0, ans=0.125 2023-10-03 10:36:57,875 WARNING [train.py:1204] (2/4) Exclude cut with ID _7852_210_5219_1_1528286471316_8139400_242-105336-0_sp1.1 from training. Duration: 0.6545625 2023-10-03 10:36:57,982 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.1.conv_skip_rate, batch_count=1237020.0, ans=0.0 2023-10-03 10:36:59,308 WARNING [train.py:1204] (2/4) Exclude cut with ID _3867_210_1883_1_1526641200575_4475830_272-215563-0 from training. Duration: 0.57 2023-10-03 10:36:59,315 WARNING [train.py:1204] (2/4) Exclude cut with ID _14368_210_4220_1_1530532714870_3595529_143-117421-0_sp0.9 from training. Duration: 0.6444375 2023-10-03 10:36:59,329 WARNING [train.py:1204] (2/4) Exclude cut with ID _9235_210_5219_1_1528891154253_8046120_30-326681-0_sp1.1 from training. Duration: 0.7545625 2023-10-03 10:37:01,291 WARNING [train.py:1204] (2/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_538-218448-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 10:37:05,359 WARNING [train.py:1204] (2/4) Exclude cut with ID _33640_210_2655_1_1533117605479_4855469_60-192970-0_sp1.1 from training. Duration: 0.963625 2023-10-03 10:37:05,367 WARNING [train.py:1204] (2/4) Exclude cut with ID _65585_210_12210_1_1535450451584_3764098_112-294301-0_sp1.1 from training. Duration: 0.8 2023-10-03 10:37:06,713 INFO [train.py:1046] (2/4) Epoch 35, batch 4950, loss[loss=0.1548, simple_loss=0.2382, pruned_loss=0.03569, over 23446.00 frames. ], tot_loss[loss=0.1591, simple_loss=0.2376, pruned_loss=0.04028, over 4718358.09 frames. ], batch size: 93, lr: 2.89e-03, grad_scale: 16.0 2023-10-03 10:37:06,771 WARNING [train.py:1204] (2/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_643-37412-0_sp1.1 from training. Duration: 0.936375 2023-10-03 10:37:06,790 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_9302_210_2550_1_1528889962_4655416_638-301620-0_sp1.1 from training. Duration: 0.42725 2023-10-03 10:37:08,133 WARNING [train.py:1204] (2/4) Exclude cut with ID _3867_210_1883_1_1526641200575_4475830_272-215563-0_sp1.1 from training. Duration: 0.5181875 2023-10-03 10:37:09,830 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.balancer2.prob, batch_count=1237086.6666666667, ans=0.125 2023-10-03 10:37:10,992 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_53679_210_4852_1_1534556726_4186141_358-154143-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 10:37:11,004 WARNING [train.py:1204] (2/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_841-341679-0_sp0.9 from training. Duration: 0.6444375 2023-10-03 10:37:13,815 WARNING [train.py:1204] (2/4) Exclude cut with ID _7160_210_1876_1_1527940923231_3703799_302-150728-0 from training. Duration: 0.78 2023-10-03 10:37:13,850 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_125-203105-0 from training. Duration: 0.8 2023-10-03 10:37:13,864 WARNING [train.py:1204] (2/4) Exclude cut with ID _5585_210_2667_1_1527418813324_8007340_89-332072-0_sp0.9 from training. Duration: 0.711125 2023-10-03 10:37:13,928 WARNING [train.py:1204] (2/4) Exclude cut with ID _3992_210_1747_1_1526644816031_3731060_36-147855-0 from training. Duration: 0.78 2023-10-03 10:37:15,276 WARNING [train.py:1204] (2/4) Exclude cut with ID _59256_210_5986_1_1535076085997_3589120_172-239045-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 10:37:15,285 WARNING [train.py:1204] (2/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_472-216663-0_sp1.1 from training. Duration: 0.836375 2023-10-03 10:37:15,339 WARNING [train.py:1204] (2/4) Exclude cut with ID _36512_210_6987_1_1533033143998_3126069_181-306500-0_sp1.1 from training. Duration: 0.863625 2023-10-03 10:37:15,355 WARNING [train.py:1204] (2/4) Exclude cut with ID _56524_210_9642_1_1534572011779_3619170_492-95008-0_sp1.1 from training. Duration: 0.97275 2023-10-03 10:37:18,124 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_33348_210_5632_1_1533086912_3324030_119-90326-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 10:37:19,964 WARNING [train.py:1204] (2/4) Exclude cut with ID _14368_210_4220_1_1530532714870_3595529_231-137892-0_sp1.1 from training. Duration: 0.9 2023-10-03 10:37:21,415 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8453_210_4211_1_1528502940_8562730_596-306820-0_sp1.1 from training. Duration: 0.8818125 2023-10-03 10:37:21,501 WARNING [train.py:1204] (2/4) Exclude cut with ID _27052_210_5986_1_1532329197187_3585009_50-242750-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 10:37:21,772 INFO [scaling.py:1118] (2/4) WithLoss: name=encoder.encoders.2.encoder.layers.2.self_attn_weights, loss-sum=0.000e+00 2023-10-03 10:37:24,721 WARNING [train.py:1204] (2/4) Exclude cut with ID _13913_210_9994_1_1530579615187_3383897_358-98435-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 10:37:24,734 WARNING [train.py:1204] (2/4) Exclude cut with ID _27013_210_1389_1_1532437173240_3252649_399-229778-0_sp1.1 from training. Duration: 0.963625 2023-10-03 10:37:28,853 WARNING [train.py:1204] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_185-174805-0_sp0.9 from training. Duration: 0.7555625 2023-10-03 10:37:33,506 WARNING [train.py:1204] (2/4) Exclude cut with ID _65585_210_12210_1_1535450451584_3764098_196-221014-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 10:37:34,774 WARNING [train.py:1204] (2/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_202-293523-0_sp1.1 from training. Duration: 0.6545625 2023-10-03 10:37:36,240 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8275_210_4198_1_1528455320_3892571_213-127634-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 10:37:36,298 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_40896_210_15710_1_1533433956_7100802_921-231678-0_sp1.1 from training. Duration: 0.97275 2023-10-03 10:37:37,597 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.649e+02 1.912e+02 2.157e+02 2.432e+02 3.456e+02, threshold=4.313e+02, percent-clipped=0.0 2023-10-03 10:37:37,720 WARNING [train.py:1204] (2/4) Exclude cut with ID _38020_210_5986_1_1533112252259_7006089_493-102745-0_sp1.1 from training. Duration: 0.8909375 2023-10-03 10:37:39,218 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6846_210_6753_1_1527850767_4295158_2-315073-0 from training. Duration: 0.88 2023-10-03 10:37:40,553 WARNING [train.py:1204] (2/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_249-281349-0 from training. Duration: 0.85 2023-10-03 10:37:43,153 WARNING [train.py:1204] (2/4) Exclude cut with ID _18513_210_9206_1_1531618215223_3862620_314-83429-0_sp1.1 from training. Duration: 0.97275 2023-10-03 10:37:44,529 WARNING [train.py:1204] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_215-202973-0_sp0.9 from training. Duration: 0.97775 2023-10-03 10:37:44,538 WARNING [train.py:1204] (2/4) Exclude cut with ID _7719_210_3525_1_1528456153163_4296960_88-250259-0_sp1.1 from training. Duration: 0.763625 2023-10-03 10:37:45,953 WARNING [train.py:1204] (2/4) Exclude cut with ID _7606_210_4915_1_1528894253277_5080690_301-10732-0_sp1.1 from training. Duration: 0.736375 2023-10-03 10:37:45,962 WARNING [train.py:1204] (2/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_818-11907-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 10:37:47,316 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_5959_210_1794_1_1527400400_3445726_412-44052-0_sp1.1 from training. Duration: 0.7 2023-10-03 10:37:49,295 WARNING [train.py:1204] (2/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_6-279195-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 10:37:51,979 WARNING [train.py:1204] (2/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_252-216674-0_sp0.9 from training. Duration: 0.6 2023-10-03 10:37:53,839 WARNING [train.py:1204] (2/4) Exclude cut with ID _14368_210_4220_1_1530532714870_3595529_144-3287-0_sp1.1 from training. Duration: 0.6090625 2023-10-03 10:37:55,240 WARNING [train.py:1204] (2/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_342-3879-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 10:37:56,674 WARNING [train.py:1204] (2/4) Exclude cut with ID _33640_210_2655_1_1533117605479_4855469_584-65630-0_sp1.1 from training. Duration: 0.97275 2023-10-03 10:37:56,742 WARNING [train.py:1204] (2/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_37-189262-0 from training. Duration: 0.98 2023-10-03 10:37:56,768 WARNING [train.py:1204] (2/4) Exclude cut with ID _9389_210_1664_1_1529217337301_5289269_379-128794-0_sp0.9 from training. Duration: 0.9444375 2023-10-03 10:37:59,415 WARNING [train.py:1204] (2/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_119-207162-0_sp1.1 from training. Duration: 0.6454375 2023-10-03 10:38:02,441 WARNING [train.py:1204] (2/4) Exclude cut with ID _2941_210_2577_1_1526199673150_2489759_200-333643-0_sp1.1 from training. Duration: 0.936375 2023-10-03 10:38:04,189 WARNING [train.py:1204] (2/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_479-125611-0_sp1.1 from training. Duration: 0.87275 2023-10-03 10:38:04,199 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_3475_210_2028_1_1526193578_3622569_64-297072-0_sp1.1 from training. Duration: 0.82725 2023-10-03 10:38:05,519 WARNING [train.py:1204] (2/4) Exclude cut with ID _53565_210_10635_1_1534337218335_3820259_139-346291-0_sp1.1 from training. Duration: 0.97275 2023-10-03 10:38:05,562 WARNING [train.py:1204] (2/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_588-125165-0_sp1.1 from training. Duration: 0.7545625 2023-10-03 10:38:05,624 WARNING [train.py:1204] (2/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_938-205135-0_sp1.1 from training. Duration: 0.7909375 2023-10-03 10:38:08,288 WARNING [train.py:1204] (2/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_216-48707-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 10:38:08,333 WARNING [train.py:1204] (2/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_477-255782-0_sp1.1 from training. Duration: 0.7454375 2023-10-03 10:38:08,373 WARNING [train.py:1204] (2/4) Exclude cut with ID _16909_210_5724_1_1531292079912_4118919_583-92842-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 10:38:09,855 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=1237353.3333333333, ans=0.1 2023-10-03 10:38:11,008 WARNING [train.py:1204] (2/4) Exclude cut with ID _60860_210_8254_1_1534896015875_3557169_245-256666-0 from training. Duration: 0.97 2023-10-03 10:38:12,498 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=1237353.3333333333, ans=0.1 2023-10-03 10:38:15,070 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_45399_210_4852_1_1533624725_7579737_349-116173-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 10:38:19,244 INFO [train.py:1046] (2/4) Epoch 35, batch 5000, loss[loss=0.1464, simple_loss=0.2254, pruned_loss=0.03369, over 23681.00 frames. ], tot_loss[loss=0.1586, simple_loss=0.2377, pruned_loss=0.0398, over 4728766.57 frames. ], batch size: 149, lr: 2.89e-03, grad_scale: 8.0 2023-10-03 10:38:19,290 WARNING [train.py:1204] (2/4) Exclude cut with ID _8699_210_2069_1_1528630346831_3960409_155-39766-0 from training. Duration: 0.78 2023-10-03 10:38:19,307 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_86-16881-0_sp1.1 from training. Duration: 0.52725 2023-10-03 10:38:24,050 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_14056_210_4163_1_1530604483_7572827_191-159384-0_sp1.1 from training. Duration: 0.97275 2023-10-03 10:38:25,891 WARNING [train.py:1204] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_307-158462-0_sp1.1 from training. Duration: 0.62725 2023-10-03 10:38:27,328 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7544_210_6753_1_1528591327_1471426_33-346031-0 from training. Duration: 0.61 2023-10-03 10:38:28,740 WARNING [train.py:1204] (2/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_112-149213-0 from training. Duration: 0.95 2023-10-03 10:38:30,190 WARNING [train.py:1204] (2/4) Exclude cut with ID _55844_210_2346_1_1534471212569_7625707_925-191342-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 10:38:33,352 WARNING [train.py:1204] (2/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_87-278686-0 from training. Duration: 0.77 2023-10-03 10:38:33,387 WARNING [train.py:1204] (2/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_415-78792-0_sp1.1 from training. Duration: 0.736375 2023-10-03 10:38:33,403 WARNING [train.py:1204] (2/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_107-332398-0_sp0.9 from training. Duration: 0.8666875 2023-10-03 10:38:34,759 WARNING [train.py:1204] (2/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_72-251255-0 from training. Duration: 0.94 2023-10-03 10:38:36,425 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_431-227273-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 10:38:36,488 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7544_210_6753_1_1528587000_691468_87-178599-0_sp0.9 from training. Duration: 0.9666875 2023-10-03 10:38:36,551 WARNING [train.py:1204] (2/4) Exclude cut with ID _13916_210_9994_1_1530788341126_3763149_60-191556-0 from training. Duration: 0.61 2023-10-03 10:38:36,553 WARNING [train.py:1204] (2/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_746-164782-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 10:38:37,882 WARNING [train.py:1204] (2/4) Exclude cut with ID _33640_210_2655_1_1533117605479_4855469_596-21066-0_sp1.1 from training. Duration: 0.92725 2023-10-03 10:38:37,995 WARNING [train.py:1204] (2/4) Exclude cut with ID _40402_210_18219_1_1533522324115_2953150_320-14078-0 from training. Duration: 0.95 2023-10-03 10:38:39,261 WARNING [train.py:1204] (2/4) Exclude cut with ID _29877_210_6228_1_1532397791106_5276149_266-62291-0 from training. Duration: 0.87 2023-10-03 10:38:39,336 WARNING [train.py:1204] (2/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_176-56120-0_sp0.9 from training. Duration: 0.92225 2023-10-03 10:38:39,373 WARNING [train.py:1204] (2/4) Exclude cut with ID _6275_210_2084_1_1528110062213_7669049_91-26919-0 from training. Duration: 0.97 2023-10-03 10:38:39,387 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_224-322394-0_sp0.9 from training. Duration: 0.7333125 2023-10-03 10:38:40,730 WARNING [train.py:1204] (2/4) Exclude cut with ID _44982_210_1511_1_1534312820108_3726029_586-297059-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 10:38:40,778 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_196-289637-0_sp1.1 from training. Duration: 0.6818125 2023-10-03 10:38:40,779 WARNING [train.py:1204] (2/4) Exclude cut with ID _18513_210_9206_1_1531618215223_3862620_402-5957-0 from training. Duration: 0.99 2023-10-03 10:38:40,784 WARNING [train.py:1204] (2/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_81-300212-0 from training. Duration: 0.6 2023-10-03 10:38:43,475 WARNING [train.py:1204] (2/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_166-128941-0 from training. Duration: 0.54 2023-10-03 10:38:43,489 WARNING [train.py:1204] (2/4) Exclude cut with ID _58059_210_5854_1_1534901471149_3164808_57-309660-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 10:38:44,791 WARNING [train.py:1204] (2/4) Exclude cut with ID _60621_210_17071_1_1535013053530_3671739_73-155814-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 10:38:46,191 WARNING [train.py:1204] (2/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_726-270780-0 from training. Duration: 0.86 2023-10-03 10:38:46,203 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8853_210_3273_1_1528633899_818968_51-187685-0_sp1.1 from training. Duration: 0.62725 2023-10-03 10:38:47,596 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_315-212794-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 10:38:49,508 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_53373_210_5399_1_1534229471_4715695_314-122795-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 10:38:52,194 WARNING [train.py:1204] (2/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_187-221566-0_sp0.9 from training. Duration: 0.47775 2023-10-03 10:38:53,554 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_12868_210_6753_1_1530702479_3610564_133-564-0 from training. Duration: 0.79 2023-10-03 10:38:53,599 WARNING [train.py:1204] (2/4) Exclude cut with ID _55153_210_2253_1_1534384786677_3162096_255-228344-0_sp1.1 from training. Duration: 0.936375 2023-10-03 10:38:54,956 WARNING [train.py:1204] (2/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_383-165234-0_sp1.1 from training. Duration: 0.8545625 2023-10-03 10:38:59,634 WARNING [train.py:1204] (2/4) Exclude cut with ID IC0677W0056-334889 from training. Duration: 0.768 2023-10-03 10:39:01,158 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_14953_210_2151_1_1530701710_5616165_710-165400-0_sp0.9 from training. Duration: 0.9666875 2023-10-03 10:39:02,574 WARNING [train.py:1204] (2/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_355-236205-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 10:39:02,576 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_34450_210_9547_1_1533099443_4109050_503-246893-0_sp1.1 from training. Duration: 0.963625 2023-10-03 10:39:06,512 WARNING [train.py:1204] (2/4) Exclude cut with ID _43552_210_8913_1_1533726031059_3523429_25-120161-0 from training. Duration: 0.98 2023-10-03 10:39:06,535 WARNING [train.py:1204] (2/4) Exclude cut with ID _66112_210_10635_1_1535590236305_3801484_205-311930-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 10:39:07,618 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_48049_210_5632_1_1533864553_3519937_185-281218-0_sp1.1 from training. Duration: 0.92725 2023-10-03 10:39:07,656 WARNING [train.py:1204] (2/4) Exclude cut with ID _48782_210_12658_1_1534467701544_2300422_191-332415-0_sp1.1 from training. Duration: 0.97275 2023-10-03 10:39:10,293 WARNING [train.py:1204] (2/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_907-151398-0_sp1.1 from training. Duration: 0.336375 2023-10-03 10:39:10,362 WARNING [train.py:1204] (2/4) Exclude cut with ID _67499_210_17282_1_1535509805220_3692430_20-179346-0_sp1.1 from training. Duration: 0.8818125 2023-10-03 10:39:10,603 INFO [scaling.py:1118] (2/4) WithLoss: name=encoder.encoders.3.encoder.layers.2.self_attn_weights, loss-sum=0.000e+00 2023-10-03 10:39:14,538 WARNING [train.py:1204] (2/4) Exclude cut with ID _14998_210_5219_1_1530705582080_7474970_441-91466-0_sp1.1 from training. Duration: 0.8818125 2023-10-03 10:39:14,612 WARNING [train.py:1204] (2/4) Exclude cut with ID _46681_210_9642_1_1533893005571_7603359_786-164007-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 10:39:17,562 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.balancer1.prob, batch_count=1237686.6666666667, ans=0.125 2023-10-03 10:39:20,078 WARNING [train.py:1204] (2/4) Exclude cut with ID _14970_210_15709_1_1530773543493_1715730_187-20259-0 from training. Duration: 0.89 2023-10-03 10:39:20,415 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.balancer2.prob, batch_count=1237686.6666666667, ans=0.125 2023-10-03 10:39:23,364 WARNING [train.py:1204] (2/4) Exclude cut with ID _12734_210_8341_1_1530183614296_3891929_383-339075-0_sp1.1 from training. Duration: 0.963625 2023-10-03 10:39:29,755 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.conv_skip_rate, batch_count=1237686.6666666667, ans=0.0 2023-10-03 10:39:32,448 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.conv_module2.balancer1.prob, batch_count=1237753.3333333333, ans=0.125 2023-10-03 10:39:33,440 INFO [train.py:1046] (2/4) Epoch 35, batch 5050, loss[loss=0.1607, simple_loss=0.2436, pruned_loss=0.03896, over 24448.00 frames. ], tot_loss[loss=0.1592, simple_loss=0.2385, pruned_loss=0.03999, over 4719205.67 frames. ], batch size: 69, lr: 2.89e-03, grad_scale: 8.0 2023-10-03 10:39:33,497 WARNING [train.py:1204] (2/4) Exclude cut with ID _37086_210_9038_1_1532999540891_7021250_834-322225-0_sp1.1 from training. Duration: 0.92725 2023-10-03 10:39:34,845 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_10987_210_4163_1_1529760555_4123902_56-290559-0_sp1.1 from training. Duration: 0.963625 2023-10-03 10:39:34,852 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_92-246270-0_sp0.9 from training. Duration: 0.9444375 2023-10-03 10:39:34,876 WARNING [train.py:1204] (2/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_56-13561-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 10:39:36,238 WARNING [train.py:1204] (2/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_393-57888-0_sp1.1 from training. Duration: 0.5818125 2023-10-03 10:39:36,268 WARNING [train.py:1204] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_168-32906-0_sp1.1 from training. Duration: 0.62725 2023-10-03 10:39:36,312 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_51225_210_3601_1_1534400634_2327383_127-207553-0_sp1.1 from training. Duration: 0.963625 2023-10-03 10:39:40,566 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.0.layers.0.self_attn2.whiten, num_groups=1, num_channels=192, metric=12.72 vs. limit=22.5 2023-10-03 10:39:40,994 WARNING [train.py:1204] (2/4) Exclude cut with ID _58583_210_5236_1_1534726835266_3574249_494-203521-0_sp1.1 from training. Duration: 0.963625 2023-10-03 10:39:41,005 WARNING [train.py:1204] (2/4) Exclude cut with ID _49376_210_5986_1_1533985278732_3370740_354-344695-0 from training. Duration: 0.99 2023-10-03 10:39:42,370 WARNING [train.py:1204] (2/4) Exclude cut with ID _8021_210_7994_1_1528376451358_3653069_179-45206-0_sp1.1 from training. Duration: 0.7818125 2023-10-03 10:39:43,757 WARNING [train.py:1204] (2/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_742-154017-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 10:39:45,165 WARNING [train.py:1204] (2/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_222-198127-0_sp1.1 from training. Duration: 0.763625 2023-10-03 10:39:45,210 WARNING [train.py:1204] (2/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_235-307337-0 from training. Duration: 0.94 2023-10-03 10:39:46,542 WARNING [train.py:1204] (2/4) Exclude cut with ID _8393_210_6702_1_1528505860249_3457328_453-176500-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 10:39:46,598 WARNING [train.py:1204] (2/4) Exclude cut with ID _53565_210_10635_1_1534337218335_3820259_102-179528-0_sp1.1 from training. Duration: 0.97275 2023-10-03 10:39:48,101 WARNING [train.py:1204] (2/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_43-206757-0_sp1.1 from training. Duration: 0.6818125 2023-10-03 10:39:49,471 WARNING [train.py:1204] (2/4) Exclude cut with ID _8394_210_6702_1_1528590504316_3540189_335-274780-0_sp1.1 from training. Duration: 0.7090625 2023-10-03 10:39:49,508 WARNING [train.py:1204] (2/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_128-296581-0_sp1.1 from training. Duration: 0.67275 2023-10-03 10:39:52,962 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.balancer2.prob, batch_count=1237820.0, ans=0.125 2023-10-03 10:39:58,945 WARNING [train.py:1204] (2/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_111-112362-0 from training. Duration: 0.91 2023-10-03 10:39:58,992 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_5522_210_3273_1_1527249606_3909096_366-172508-0_sp1.1 from training. Duration: 0.6 2023-10-03 10:40:00,352 WARNING [train.py:1204] (2/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_47-167373-0_sp1.1 from training. Duration: 0.836375 2023-10-03 10:40:00,402 WARNING [train.py:1204] (2/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_41-234353-0 from training. Duration: 0.68 2023-10-03 10:40:01,736 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6291_210_3273_1_1527683649_1078498_82-221196-0_sp0.9 from training. Duration: 0.8444375 2023-10-03 10:40:01,848 WARNING [train.py:1204] (2/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_47-4814-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 10:40:01,868 WARNING [train.py:1204] (2/4) Exclude cut with ID _43541_210_10626_1_1533796233418_3635440_40-247993-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 10:40:03,228 WARNING [train.py:1204] (2/4) Exclude cut with ID _21989_210_5854_1_1531899214931_3271360_133-44759-0_sp0.9 from training. Duration: 0.8555625 2023-10-03 10:40:03,231 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_94-195801-0 from training. Duration: 0.94 2023-10-03 10:40:04,409 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.654e+02 1.891e+02 1.998e+02 2.212e+02 3.004e+02, threshold=3.997e+02, percent-clipped=0.0 2023-10-03 10:40:04,545 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_141-89113-0 from training. Duration: 0.86 2023-10-03 10:40:06,338 WARNING [train.py:1204] (2/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_683-284578-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 10:40:08,414 WARNING [train.py:1204] (2/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_6-214815-0_sp1.1 from training. Duration: 0.77275 2023-10-03 10:40:09,894 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.conv_module1.balancer2.prob, batch_count=1237886.6666666667, ans=0.125 2023-10-03 10:40:11,058 WARNING [train.py:1204] (2/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_556-212082-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 10:40:11,098 WARNING [train.py:1204] (2/4) Exclude cut with ID _44902_210_16187_1_1533888006062_3792078_149-67212-0 from training. Duration: 0.87 2023-10-03 10:40:13,844 WARNING [train.py:1204] (2/4) Exclude cut with ID _61148_210_6897_1_1535094022726_4217610_333-159440-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 10:40:16,529 WARNING [train.py:1204] (2/4) Exclude cut with ID _3120_210_3525_1_1526036143112_4072980_404-249994-0 from training. Duration: 0.99 2023-10-03 10:40:16,626 WARNING [train.py:1204] (2/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_234-260682-0_sp0.9 from training. Duration: 0.9333125 2023-10-03 10:40:17,960 WARNING [train.py:1204] (2/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_374-138971-0_sp1.1 from training. Duration: 0.8545625 2023-10-03 10:40:18,042 WARNING [train.py:1204] (2/4) Exclude cut with ID _53713_210_24116_1_1534314487762_7416751_743-302085-0_sp1.1 from training. Duration: 0.92725 2023-10-03 10:40:19,428 WARNING [train.py:1204] (2/4) Exclude cut with ID _18516_210_9206_1_1531704653206_3692406_198-334870-0_sp1.1 from training. Duration: 0.836375 2023-10-03 10:40:22,138 WARNING [train.py:1204] (2/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_395-73570-0_sp1.1 from training. Duration: 0.9 2023-10-03 10:40:24,113 WARNING [train.py:1204] (2/4) Exclude cut with ID _46683_210_9642_1_1533730913492_4591116_421-19547-0_sp1.1 from training. Duration: 0.936375 2023-10-03 10:40:24,161 WARNING [train.py:1204] (2/4) Exclude cut with ID _19896_210_1883_1_1531709022662_6649569_714-8668-0_sp1.1 from training. Duration: 0.963625 2023-10-03 10:40:25,425 WARNING [train.py:1204] (2/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_49-101072-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 10:40:25,441 WARNING [train.py:1204] (2/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_220-191080-0_sp1.1 from training. Duration: 0.8909375 2023-10-03 10:40:25,467 WARNING [train.py:1204] (2/4) Exclude cut with ID _3465_210_3525_1_1526175667060_3574049_62-335552-0 from training. Duration: 0.87 2023-10-03 10:40:25,552 WARNING [train.py:1204] (2/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_111-112362-0_sp1.1 from training. Duration: 0.82725 2023-10-03 10:40:28,892 WARNING [train.py:1204] (2/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_318-59097-0_sp0.9 from training. Duration: 0.8444375 2023-10-03 10:40:30,748 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.conv_module2.balancer1.prob, batch_count=1237953.3333333333, ans=0.125 2023-10-03 10:40:31,745 WARNING [train.py:1204] (2/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_98-255710-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 10:40:31,753 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9042W0003-515609 from training. Duration: 0.896 2023-10-03 10:40:31,762 WARNING [train.py:1204] (2/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_158-250116-0_sp0.9 from training. Duration: 0.688875 2023-10-03 10:40:31,879 WARNING [train.py:1204] (2/4) Exclude cut with ID _26750_210_5665_1_1532226678442_4063230_7-346885-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 10:40:33,208 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_37839_210_7234_1_1533542462_3689303_357-227078-0_sp1.1 from training. Duration: 0.963625 2023-10-03 10:40:33,244 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9052W0119-520904 from training. Duration: 0.896 2023-10-03 10:40:37,749 WARNING [train.py:1204] (2/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_249-281349-0_sp1.1 from training. Duration: 0.77275 2023-10-03 10:40:37,754 WARNING [train.py:1204] (2/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_147-293311-0 from training. Duration: 0.94 2023-10-03 10:40:37,757 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_158-21166-0_sp1.1 from training. Duration: 0.963625 2023-10-03 10:40:42,626 WARNING [train.py:1204] (2/4) Exclude cut with ID _14100_210_8341_1_1530529320613_3678069_207-332194-0_sp1.1 from training. Duration: 0.92725 2023-10-03 10:40:42,658 WARNING [train.py:1204] (2/4) Exclude cut with ID _9389_210_1664_1_1529217337301_5289269_324-211031-0_sp1.1 from training. Duration: 0.963625 2023-10-03 10:40:42,679 WARNING [train.py:1204] (2/4) Exclude cut with ID _14603_210_4220_1_1530615688441_3097319_133-158308-0 from training. Duration: 0.84 2023-10-03 10:40:45,266 WARNING [train.py:1204] (2/4) Exclude cut with ID _13252_210_1925_1_1530671372047_3336909_166-314607-0 from training. Duration: 0.54 2023-10-03 10:40:46,700 WARNING [train.py:1204] (2/4) Exclude cut with ID _32704_210_8954_1_1533017488840_6747370_649-79538-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 10:40:46,708 WARNING [train.py:1204] (2/4) Exclude cut with ID _28394_210_8913_1_1532430049518_3650118_343-115667-0_sp1.1 from training. Duration: 0.97275 2023-10-03 10:40:46,744 WARNING [train.py:1204] (2/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_87-278686-0_sp0.9 from training. Duration: 0.8555625 2023-10-03 10:40:48,042 INFO [train.py:1046] (2/4) Epoch 35, batch 5100, loss[loss=0.1525, simple_loss=0.237, pruned_loss=0.03396, over 24660.00 frames. ], tot_loss[loss=0.1606, simple_loss=0.2399, pruned_loss=0.04061, over 4703612.39 frames. ], batch size: 65, lr: 2.89e-03, grad_scale: 8.0 2023-10-03 10:40:49,625 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9109W0003-549749 from training. Duration: 0.768 2023-10-03 10:40:52,318 WARNING [train.py:1204] (2/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_126-29104-0_sp1.1 from training. Duration: 0.77275 2023-10-03 10:40:55,621 WARNING [train.py:1204] (2/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_325-113798-0 from training. Duration: 0.85 2023-10-03 10:40:55,674 WARNING [train.py:1204] (2/4) Exclude cut with ID _6694_210_4599_1_1527852637490_3965529_116-109398-0 from training. Duration: 0.73 2023-10-03 10:40:57,043 WARNING [train.py:1204] (2/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_46-57119-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 10:41:00,275 WARNING [train.py:1204] (2/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_546-119312-0_sp1.1 from training. Duration: 0.9 2023-10-03 10:41:01,726 WARNING [train.py:1204] (2/4) Exclude cut with ID _60057_210_10640_1_1534903065793_3333150_468-306939-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 10:41:03,080 WARNING [train.py:1204] (2/4) Exclude cut with ID _15150_210_8341_1_1530752411803_7256384_22-138274-0 from training. Duration: 0.96 2023-10-03 10:41:03,096 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_289-180904-0 from training. Duration: 0.76 2023-10-03 10:41:09,021 WARNING [train.py:1204] (2/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_64-133721-0_sp1.1 from training. Duration: 0.92725 2023-10-03 10:41:10,324 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_195-257512-0_sp1.1 from training. Duration: 0.6090625 2023-10-03 10:41:13,206 WARNING [train.py:1204] (2/4) Exclude cut with ID _66873_210_5517_1_1535454136506_4293370_704-101963-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 10:41:17,675 WARNING [train.py:1204] (2/4) Exclude cut with ID _4366_210_1388_1_1526792053633_3613819_231-211402-0 from training. Duration: 0.94 2023-10-03 10:41:17,735 WARNING [train.py:1204] (2/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_679-190385-0_sp1.1 from training. Duration: 0.97275 2023-10-03 10:41:20,559 WARNING [train.py:1204] (2/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_298-81459-0_sp1.1 from training. Duration: 0.963625 2023-10-03 10:41:20,570 WARNING [train.py:1204] (2/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_658-282610-0_sp0.9 from training. Duration: 0.588875 2023-10-03 10:41:21,945 WARNING [train.py:1204] (2/4) Exclude cut with ID _57572_210_5517_1_1534921219624_3809484_196-283734-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 10:41:22,036 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_53071_210_16554_1_1534490405_2954066_294-198554-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 10:41:23,375 WARNING [train.py:1204] (2/4) Exclude cut with ID _4866_210_2577_1_1527404412284_4364659_352-157930-0 from training. Duration: 0.81 2023-10-03 10:41:24,756 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9231W0113-610139 from training. Duration: 0.896 2023-10-03 10:41:26,117 WARNING [train.py:1204] (2/4) Exclude cut with ID _24271_210_12658_1_1531987270022_7190519_638-171906-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 10:41:26,159 WARNING [train.py:1204] (2/4) Exclude cut with ID _14170_210_5219_1_1530525949868_4469650_272-249985-0 from training. Duration: 0.67 2023-10-03 10:41:26,168 WARNING [train.py:1204] (2/4) Exclude cut with ID _64086_210_7994_1_1535270417019_7435949_449-31567-0 from training. Duration: 0.97 2023-10-03 10:41:29,453 WARNING [train.py:1204] (2/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_109-282568-0_sp1.1 from training. Duration: 0.97275 2023-10-03 10:41:38,228 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_121-45934-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 10:41:41,402 WARNING [train.py:1204] (2/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_314-126819-0 from training. Duration: 0.5 2023-10-03 10:41:41,423 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9309W0192-640319 from training. Duration: 0.512 2023-10-03 10:41:41,436 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9310W0003-640649 from training. Duration: 0.896 2023-10-03 10:41:44,113 WARNING [train.py:1204] (2/4) Exclude cut with ID _7160_210_1876_1_1527940923231_3703799_120-150943-0 from training. Duration: 0.81 2023-10-03 10:41:44,115 WARNING [train.py:1204] (2/4) Exclude cut with ID _40380_210_9059_1_1533380594759_4187380_6-33508-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 10:41:46,122 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.self_attn_weights.whiten_keys.whitening_limit, batch_count=1238353.3333333333, ans=6.0 2023-10-03 10:41:47,443 WARNING [train.py:1204] (2/4) Exclude cut with ID _59202_210_26984_1_1534816810612_3549174_137-318710-0 from training. Duration: 0.89 2023-10-03 10:41:51,545 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_435-313305-0 from training. Duration: 0.71 2023-10-03 10:41:54,164 WARNING [train.py:1204] (2/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_91-212486-0_sp1.1 from training. Duration: 0.6181875 2023-10-03 10:41:54,259 WARNING [train.py:1204] (2/4) Exclude cut with ID _14603_210_4220_1_1530615688441_3097319_133-158308-0_sp1.1 from training. Duration: 0.763625 2023-10-03 10:41:55,748 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_99-251314-0 from training. Duration: 0.87 2023-10-03 10:41:57,177 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_365-217119-0_sp1.1 from training. Duration: 0.57275 2023-10-03 10:41:58,535 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_3475_210_2028_1_1526193578_3622569_377-166133-0 from training. Duration: 0.86 2023-10-03 10:42:00,313 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.feed_forward1.out_proj.dropout_p, batch_count=1238420.0, ans=0.1 2023-10-03 10:42:01,756 INFO [train.py:1046] (2/4) Epoch 35, batch 5150, loss[loss=0.1696, simple_loss=0.2484, pruned_loss=0.04536, over 23630.00 frames. ], tot_loss[loss=0.161, simple_loss=0.2403, pruned_loss=0.04087, over 4707748.45 frames. ], batch size: 106, lr: 2.89e-03, grad_scale: 8.0 2023-10-03 10:42:01,949 WARNING [train.py:1204] (2/4) Exclude cut with ID _13916_210_9994_1_1530788341126_3763149_116-21034-0_sp1.1 from training. Duration: 0.87275 2023-10-03 10:42:01,961 WARNING [train.py:1204] (2/4) Exclude cut with ID _64220_210_9527_1_1535250151772_4875259_142-216831-0_sp1.1 from training. Duration: 0.936375 2023-10-03 10:42:01,962 WARNING [train.py:1204] (2/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_490-207861-0_sp1.1 from training. Duration: 0.8909375 2023-10-03 10:42:03,310 WARNING [train.py:1204] (2/4) Exclude cut with ID _41314_210_12651_1_1533459476543_3766460_87-272068-0_sp0.9 from training. Duration: 0.92225 2023-10-03 10:42:04,644 WARNING [train.py:1204] (2/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_18-46756-0_sp1.1 from training. Duration: 0.7181875 2023-10-03 10:42:04,709 WARNING [train.py:1204] (2/4) Exclude cut with ID _39411_210_5986_1_1533538825030_3343649_98-311335-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 10:42:06,040 WARNING [train.py:1204] (2/4) Exclude cut with ID _21878_210_3525_1_1531738772002_7264330_113-18163-0 from training. Duration: 0.89 2023-10-03 10:42:06,044 WARNING [train.py:1204] (2/4) Exclude cut with ID _6653_210_2629_1_1527919172441_6732679_1103-56445-0 from training. Duration: 0.73 2023-10-03 10:42:07,322 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_3474_210_2028_1_1526188513_4825246_145-309131-0 from training. Duration: 0.74 2023-10-03 10:42:07,347 WARNING [train.py:1204] (2/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_120-211630-0_sp1.1 from training. Duration: 0.836375 2023-10-03 10:42:07,363 WARNING [train.py:1204] (2/4) Exclude cut with ID _8030_210_7497_1_1528444886781_3531089_32-31837-0 from training. Duration: 0.97 2023-10-03 10:42:08,727 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_19949_210_2161_1_1531706337_928079_8-160956-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 10:42:10,738 WARNING [train.py:1204] (2/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_321-215742-0_sp1.1 from training. Duration: 0.4545625 2023-10-03 10:42:12,089 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_583-124650-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 10:42:12,704 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.4.encoder.layers.1.feed_forward1.out_whiten, num_groups=1, num_channels=384, metric=9.62 vs. limit=15.0 2023-10-03 10:42:13,513 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_30434_210_13049_1_1532519384_981746_164-182755-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 10:42:17,028 WARNING [train.py:1204] (2/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_339-104791-0_sp1.1 from training. Duration: 0.4454375 2023-10-03 10:42:17,051 WARNING [train.py:1204] (2/4) Exclude cut with ID _15026_210_8750_1_1530777602316_3291850_211-195043-0 from training. Duration: 0.8 2023-10-03 10:42:17,221 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.balancer1.prob, batch_count=1238486.6666666667, ans=0.125 2023-10-03 10:42:18,395 WARNING [train.py:1204] (2/4) Exclude cut with ID _29877_210_6228_1_1532397791106_5276149_487-182647-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 10:42:19,747 WARNING [train.py:1204] (2/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_127-566-0_sp0.9 from training. Duration: 0.9333125 2023-10-03 10:42:21,152 WARNING [train.py:1204] (2/4) Exclude cut with ID _8263_210_1664_1_1528439452891_970220_68-181805-0_sp0.9 from training. Duration: 0.711125 2023-10-03 10:42:21,154 WARNING [train.py:1204] (2/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_504-72513-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 10:42:22,437 WARNING [train.py:1204] (2/4) Exclude cut with ID _64639_210_5816_1_1535367619437_3528000_324-265233-0_sp1.1 from training. Duration: 0.963625 2023-10-03 10:42:22,503 WARNING [train.py:1204] (2/4) Exclude cut with ID _6456_210_1534_1_1527681634212_3181049_215-309359-0_sp0.9 from training. Duration: 0.97775 2023-10-03 10:42:22,508 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_115-233929-0_sp1.1 from training. Duration: 0.6545625 2023-10-03 10:42:23,787 WARNING [train.py:1204] (2/4) Exclude cut with ID _14087_210_2253_1_1530529224512_3014029_219-224445-0 from training. Duration: 0.84 2023-10-03 10:42:23,913 WARNING [train.py:1204] (2/4) Exclude cut with ID _14867_210_1968_1_1530684179890_3846969_413-29292-0_sp1.1 from training. Duration: 0.8181875 2023-10-03 10:42:25,283 WARNING [train.py:1204] (2/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_1012-279473-0_sp0.9 from training. Duration: 0.7444375 2023-10-03 10:42:25,577 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.conv_module2.balancer1.prob, batch_count=1238486.6666666667, ans=0.125 2023-10-03 10:42:26,727 WARNING [train.py:1204] (2/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_605-133395-0_sp0.9 from training. Duration: 0.7333125 2023-10-03 10:42:26,963 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.attention_skip_rate, batch_count=1238486.6666666667, ans=0.0 2023-10-03 10:42:28,120 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_89-35748-0 from training. Duration: 0.79 2023-10-03 10:42:29,901 WARNING [train.py:1204] (2/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_476-272757-0_sp1.1 from training. Duration: 0.8454375 2023-10-03 10:42:33,230 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.556e+02 2.014e+02 2.285e+02 2.770e+02 4.713e+02, threshold=4.570e+02, percent-clipped=3.0 2023-10-03 10:42:36,005 WARNING [train.py:1204] (2/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_449-15508-0_sp0.9 from training. Duration: 0.911125 2023-10-03 10:42:37,421 WARNING [train.py:1204] (2/4) Exclude cut with ID _29345_210_4381_1_1532689168263_3860169_198-174328-0 from training. Duration: 0.91 2023-10-03 10:42:40,910 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_15517_210_6753_1_1530952293_1664795_125-158157-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 10:42:45,002 WARNING [train.py:1204] (2/4) Exclude cut with ID _25565_210_12392_1_1532423009382_3952098_420-113180-0_sp1.1 from training. Duration: 0.963625 2023-10-03 10:42:45,098 WARNING [train.py:1204] (2/4) Exclude cut with ID _51471_210_1889_1_1534399391390_3094250_236-146773-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 10:42:50,888 WARNING [train.py:1204] (2/4) Exclude cut with ID _30039_210_13270_1_1532484612900_2725150_175-35012-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 10:42:52,220 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_23850_210_4238_1_1532232597_1632433_99-275843-0_sp1.1 from training. Duration: 0.936375 2023-10-03 10:42:54,888 WARNING [train.py:1204] (2/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_658-282610-0 from training. Duration: 0.53 2023-10-03 10:42:59,187 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_37818_210_4238_1_1533620910_4720083_234-65934-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 10:42:59,259 WARNING [train.py:1204] (2/4) Exclude cut with ID _3067_210_2667_1_1526212974640_4503168_459-173867-0_sp1.1 from training. Duration: 0.8 2023-10-03 10:42:59,292 WARNING [train.py:1204] (2/4) Exclude cut with ID _8696_210_1876_1_1528545632440_3345058_209-54069-0_sp0.9 from training. Duration: 0.7444375 2023-10-03 10:43:02,050 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.nonlin_attention.whiten2.whitening_limit, batch_count=1238686.6666666667, ans=15.0 2023-10-03 10:43:02,681 WARNING [train.py:1204] (2/4) Exclude cut with ID _62574_210_2655_1_1535180407058_3933249_420-236465-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 10:43:02,938 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.ff2_skip_rate, batch_count=1238686.6666666667, ans=0.0 2023-10-03 10:43:04,048 WARNING [train.py:1204] (2/4) Exclude cut with ID _46681_210_9642_1_1533893005571_7603359_333-4042-0_sp1.1 from training. Duration: 0.936375 2023-10-03 10:43:04,118 WARNING [train.py:1204] (2/4) Exclude cut with ID _3541_210_2477_1_1526475408709_3263973_377-296839-0 from training. Duration: 0.55 2023-10-03 10:43:10,080 WARNING [train.py:1204] (2/4) Exclude cut with ID _43997_210_4525_1_1533538632555_5189329_426-92599-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 10:43:10,184 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_43-71989-0_sp1.1 from training. Duration: 0.5181875 2023-10-03 10:43:13,413 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_2009_210_2071_1_1525481134_4588937_140-345530-0_sp1.1 from training. Duration: 0.963625 2023-10-03 10:43:13,420 WARNING [train.py:1204] (2/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_138-332773-0_sp0.9 from training. Duration: 0.9 2023-10-03 10:43:14,762 WARNING [train.py:1204] (2/4) Exclude cut with ID _13942_210_9988_1_1530604906036_3733360_499-55676-0_sp0.9 from training. Duration: 0.688875 2023-10-03 10:43:14,783 WARNING [train.py:1204] (2/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_259-34038-0_sp1.1 from training. Duration: 0.67275 2023-10-03 10:43:14,792 WARNING [train.py:1204] (2/4) Exclude cut with ID _3120_210_3525_1_1526036143112_4072980_404-249994-0_sp1.1 from training. Duration: 0.9 2023-10-03 10:43:16,096 INFO [train.py:1046] (2/4) Epoch 35, batch 5200, loss[loss=0.1668, simple_loss=0.2477, pruned_loss=0.0429, over 23421.00 frames. ], tot_loss[loss=0.1619, simple_loss=0.241, pruned_loss=0.0414, over 4706989.11 frames. ], batch size: 93, lr: 2.89e-03, grad_scale: 16.0 2023-10-03 10:43:16,146 WARNING [train.py:1204] (2/4) Exclude cut with ID _40402_210_18219_1_1533522324115_2953150_222-108810-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 10:43:19,328 WARNING [train.py:1204] (2/4) Exclude cut with ID _22083_210_8254_1_1531702862439_3678970_49-234546-0_sp0.9 from training. Duration: 0.92225 2023-10-03 10:43:19,599 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.bypass.scale_min, batch_count=1238753.3333333333, ans=0.2 2023-10-03 10:43:22,017 WARNING [train.py:1204] (2/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_199-295611-0_sp0.9 from training. Duration: 0.611125 2023-10-03 10:43:23,527 WARNING [train.py:1204] (2/4) Exclude cut with ID _43552_210_8913_1_1533726031059_3523429_43-192945-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 10:43:26,310 WARNING [train.py:1204] (2/4) Exclude cut with ID _14224_210_4381_1_1530529062037_4264010_364-22817-0 from training. Duration: 0.71 2023-10-03 10:43:27,685 WARNING [train.py:1204] (2/4) Exclude cut with ID _14922_210_6228_1_1530707435273_3693310_388-129552-0_sp1.1 from training. Duration: 0.87275 2023-10-03 10:43:29,009 WARNING [train.py:1204] (2/4) Exclude cut with ID _6537_210_1883_1_1527987559710_3597129_190-280227-0_sp1.1 from training. Duration: 0.97275 2023-10-03 10:43:31,243 WARNING [train.py:1204] (2/4) Exclude cut with ID _55844_210_2346_1_1534471212569_7625707_398-56897-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 10:43:31,323 WARNING [train.py:1204] (2/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_48-182155-0_sp1.1 from training. Duration: 0.7909375 2023-10-03 10:43:31,357 WARNING [train.py:1204] (2/4) Exclude cut with ID _29529_210_7234_1_1532393961455_3977460_582-18084-0_sp1.1 from training. Duration: 0.97275 2023-10-03 10:43:32,732 WARNING [train.py:1204] (2/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_912-282412-0 from training. Duration: 0.49 2023-10-03 10:43:35,759 WARNING [train.py:1204] (2/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_332-256168-0_sp0.9 from training. Duration: 0.8333125 2023-10-03 10:43:35,835 WARNING [train.py:1204] (2/4) Exclude cut with ID _64220_210_9527_1_1535250151772_4875259_55-250624-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 10:43:39,685 WARNING [train.py:1204] (2/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_264-108685-0 from training. Duration: 0.55 2023-10-03 10:43:42,503 WARNING [train.py:1204] (2/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_613-232856-0_sp0.9 from training. Duration: 0.6 2023-10-03 10:43:42,591 WARNING [train.py:1204] (2/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_68-212801-0_sp1.1 from training. Duration: 0.663625 2023-10-03 10:43:44,378 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6290_210_3273_1_1527595224_1282787_115-87095-0 from training. Duration: 0.55 2023-10-03 10:43:45,741 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_3080_210_2071_1_1526086102_4365166_312-8726-0 from training. Duration: 0.91 2023-10-03 10:43:47,190 WARNING [train.py:1204] (2/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_105-63309-0 from training. Duration: 0.98 2023-10-03 10:43:47,264 WARNING [train.py:1204] (2/4) Exclude cut with ID _2941_210_2577_1_1526199673150_2489759_201-160779-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 10:43:49,133 WARNING [train.py:1204] (2/4) Exclude cut with ID ID0406W0226-885434 from training. Duration: 0.997 2023-10-03 10:43:49,140 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_17292_210_6753_1_1531125388_2943734_353-328201-0_sp1.1 from training. Duration: 0.97275 2023-10-03 10:43:50,518 WARNING [train.py:1204] (2/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_572-96029-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 10:43:50,546 WARNING [train.py:1204] (2/4) Exclude cut with ID _14970_210_15709_1_1530775547361_1966965_6-23328-0_sp0.9 from training. Duration: 0.9555625 2023-10-03 10:43:51,946 WARNING [train.py:1204] (2/4) Exclude cut with ID _45012_210_12651_1_1533608435136_5371380_240-37101-0 from training. Duration: 0.93 2023-10-03 10:43:52,013 WARNING [train.py:1204] (2/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_685-90183-0_sp1.1 from training. Duration: 0.936375 2023-10-03 10:43:54,754 WARNING [train.py:1204] (2/4) Exclude cut with ID _13252_210_1925_1_1530671372047_3336909_41-22245-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 10:43:57,551 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_15126_210_6753_1_1530778677_2591185_120-277707-0 from training. Duration: 0.8 2023-10-03 10:43:57,587 WARNING [train.py:1204] (2/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_310-26186-0 from training. Duration: 0.7 2023-10-03 10:43:57,610 WARNING [train.py:1204] (2/4) Exclude cut with ID _14998_210_5219_1_1530705582080_7474970_787-294416-0 from training. Duration: 0.82 2023-10-03 10:44:03,563 WARNING [train.py:1204] (2/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_795-33252-0 from training. Duration: 0.37 2023-10-03 10:44:03,607 WARNING [train.py:1204] (2/4) Exclude cut with ID _7537_210_2084_1_1528502395431_3924359_113-199933-0_sp0.9 from training. Duration: 0.6555625 2023-10-03 10:44:10,940 WARNING [train.py:1204] (2/4) Exclude cut with ID _3465_210_3525_1_1526175667060_3574049_308-231735-0_sp0.9 from training. Duration: 0.811125 2023-10-03 10:44:10,970 WARNING [train.py:1204] (2/4) Exclude cut with ID _19504_210_9994_1_1531648863242_3325220_354-296765-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 10:44:12,436 WARNING [train.py:1204] (2/4) Exclude cut with ID _5409_210_1388_1_1527292850189_5865191_178-127970-0 from training. Duration: 0.57 2023-10-03 10:44:13,755 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_1466-248056-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 10:44:13,780 WARNING [train.py:1204] (2/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_535-14410-0_sp0.9 from training. Duration: 0.52225 2023-10-03 10:44:13,783 WARNING [train.py:1204] (2/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_747-129941-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 10:44:13,818 WARNING [train.py:1204] (2/4) Exclude cut with ID _15012_210_1968_1_1530856820429_5095350_688-178849-0_sp1.1 from training. Duration: 0.5818125 2023-10-03 10:44:15,414 WARNING [train.py:1204] (2/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_356-45054-0_sp1.1 from training. Duration: 0.7818125 2023-10-03 10:44:17,435 WARNING [train.py:1204] (2/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_146-276944-0_sp0.9 from training. Duration: 0.92225 2023-10-03 10:44:21,896 WARNING [train.py:1204] (2/4) Exclude cut with ID _39204_210_4220_1_1533880547368_3191120_346-180306-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 10:44:21,979 WARNING [train.py:1204] (2/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_641-158017-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 10:44:21,980 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_19952_210_2161_1_1531875469_3851750_274-42137-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 10:44:24,875 WARNING [train.py:1204] (2/4) Exclude cut with ID _60804_210_9642_1_1534831199859_7268299_87-327819-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 10:44:26,242 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_9302_210_2550_1_1528889962_4655416_638-301620-0 from training. Duration: 0.47 2023-10-03 10:44:27,577 WARNING [train.py:1204] (2/4) Exclude cut with ID _44304_210_15374_1_1533639352765_4613129_302-22084-0_sp1.1 from training. Duration: 0.7818125 2023-10-03 10:44:27,595 WARNING [train.py:1204] (2/4) Exclude cut with ID _18513_210_9206_1_1531618215223_3862620_177-227063-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 10:44:28,960 WARNING [train.py:1204] (2/4) Exclude cut with ID _41326_210_5854_1_1533778076509_3492080_510-45444-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 10:44:29,005 WARNING [train.py:1204] (2/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_76-311963-0_sp1.1 from training. Duration: 0.636375 2023-10-03 10:44:30,837 INFO [train.py:1046] (2/4) Epoch 35, batch 5250, loss[loss=0.1724, simple_loss=0.2367, pruned_loss=0.05404, over 23925.00 frames. ], tot_loss[loss=0.161, simple_loss=0.24, pruned_loss=0.04103, over 4697069.15 frames. ], batch size: 195, lr: 2.89e-03, grad_scale: 16.0 2023-10-03 10:44:30,902 WARNING [train.py:1204] (2/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_156-302675-0_sp0.9 from training. Duration: 0.611125 2023-10-03 10:44:34,874 WARNING [train.py:1204] (2/4) Exclude cut with ID _53489_210_6228_1_1534298357169_3835970_367-148411-0_sp1.1 from training. Duration: 0.936375 2023-10-03 10:44:36,354 WARNING [train.py:1204] (2/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_389-144525-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 10:44:36,399 WARNING [train.py:1204] (2/4) Exclude cut with ID _14069_210_5632_1_1530512692033_4803390_158-238427-0_sp1.1 from training. Duration: 0.963625 2023-10-03 10:44:39,596 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_486-151131-0_sp0.9 from training. Duration: 0.9444375 2023-10-03 10:44:40,632 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.0.layers.0.self_attn1.whiten, num_groups=1, num_channels=192, metric=12.25 vs. limit=22.5 2023-10-03 10:44:43,730 WARNING [train.py:1204] (2/4) Exclude cut with ID _39846_210_12651_1_1533603651992_1318030_188-71831-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 10:44:46,882 WARNING [train.py:1204] (2/4) Exclude cut with ID _39411_210_5986_1_1533538825030_3343649_173-238248-0_sp1.1 from training. Duration: 0.7545625 2023-10-03 10:44:48,310 WARNING [train.py:1204] (2/4) Exclude cut with ID _20684_210_6917_1_1531567751646_4203930_469-49081-0_sp1.1 from training. Duration: 0.92725 2023-10-03 10:44:48,402 WARNING [train.py:1204] (2/4) Exclude cut with ID _8833_210_5219_1_1528592665210_7553059_611-80313-0_sp1.1 from training. Duration: 0.5818125 2023-10-03 10:44:51,563 WARNING [train.py:1204] (2/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_440-141811-0 from training. Duration: 0.95 2023-10-03 10:44:51,578 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_41211_210_2544_1_1533348859_3702910_236-223652-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 10:44:51,671 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_40222_210_6228_1_1533385298_4481939_220-163998-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 10:44:54,350 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.0.conv_module1.balancer2.prob, batch_count=1239153.3333333333, ans=0.125 2023-10-03 10:44:56,181 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.3.encoder.layers.1.whiten, num_groups=1, num_channels=512, metric=4.90 vs. limit=12.0 2023-10-03 10:45:00,737 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.461e+02 1.950e+02 2.167e+02 2.359e+02 3.354e+02, threshold=4.333e+02, percent-clipped=0.0 2023-10-03 10:45:24,210 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.1.conv_module2.whiten.whitening_limit, batch_count=1239286.6666666667, ans=15.0 2023-10-03 10:45:27,965 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=1239353.3333333333, ans=0.1 2023-10-03 10:45:34,322 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder_embed.conv.5.prob, batch_count=1239353.3333333333, ans=0.125 2023-10-03 10:45:39,423 INFO [train.py:1046] (2/4) Epoch 35, batch 5300, loss[loss=0.1467, simple_loss=0.2255, pruned_loss=0.034, over 24624.00 frames. ], tot_loss[loss=0.1597, simple_loss=0.2382, pruned_loss=0.0406, over 4709306.32 frames. ], batch size: 60, lr: 2.89e-03, grad_scale: 16.0 2023-10-03 10:45:39,696 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.conv_skip_rate, batch_count=1239420.0, ans=0.0 2023-10-03 10:45:53,836 WARNING [train.py:1204] (2/4) Exclude cut with ID _14629_210_15709_1_1530604298182_956899_6-232695-0_sp1.1 from training. Duration: 0.8545625 2023-10-03 10:45:53,857 WARNING [train.py:1204] (2/4) Exclude cut with ID _13921_210_9994_1_1530841782272_3503520_327-69677-0 from training. Duration: 0.9 2023-10-03 10:45:53,883 WARNING [train.py:1204] (2/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_372-298093-0 from training. Duration: 0.8 2023-10-03 10:45:53,897 WARNING [train.py:1204] (2/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_515-139111-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 10:45:54,129 WARNING [train.py:1204] (2/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_85-65056-0_sp1.1 from training. Duration: 0.87275 2023-10-03 10:45:54,169 WARNING [train.py:1204] (2/4) Exclude cut with ID _15113_210_8254_1_1530842438579_3716279_209-54533-0_sp1.1 from training. Duration: 0.87275 2023-10-03 10:45:54,217 WARNING [train.py:1204] (2/4) Exclude cut with ID _70447_210_12392_1_1535972499300_3160420_123-312838-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 10:45:54,244 WARNING [train.py:1204] (2/4) Exclude cut with ID _60113_210_16271_1_1535164212481_4386079_70-63596-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 10:45:54,271 WARNING [train.py:1204] (2/4) Exclude cut with ID _14632_210_9988_1_1530691279392_3923200_225-188509-0_sp1.1 from training. Duration: 0.963625 2023-10-03 10:45:54,322 WARNING [train.py:1204] (2/4) Exclude cut with ID _34734_210_4525_1_1533365977542_4448720_584-180532-0_sp1.1 from training. Duration: 0.97275 2023-10-03 10:45:54,336 WARNING [train.py:1204] (2/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_351-294650-0_sp1.1 from training. Duration: 0.57275 2023-10-03 10:45:54,666 WARNING [train.py:1204] (2/4) Exclude cut with ID _13252_210_1925_1_1530671372047_3336909_263-168388-0_sp0.9 from training. Duration: 0.9555625 2023-10-03 10:45:54,695 WARNING [train.py:1204] (2/4) Exclude cut with ID _22083_210_8254_1_1531702862439_3678970_201-85738-0 from training. Duration: 0.9 2023-10-03 10:45:54,787 WARNING [train.py:1204] (2/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_237-58433-0 from training. Duration: 0.74 2023-10-03 10:45:54,814 WARNING [train.py:1204] (2/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_262-250826-0 from training. Duration: 0.99 2023-10-03 10:45:54,918 WARNING [train.py:1204] (2/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_927-43827-0_sp0.9 from training. Duration: 0.82225 2023-10-03 10:45:54,950 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_2821_210_2715_1_1526108385_3763560_73-296673-0 from training. Duration: 0.83 2023-10-03 10:45:55,025 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6287_210_3273_1_1527509506_2653716_182-199887-0 from training. Duration: 0.51 2023-10-03 10:45:55,156 WARNING [train.py:1204] (2/4) Exclude cut with ID _70519_210_4163_1_1535961595994_7458230_899-80223-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 10:45:55,784 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_210-234949-0_sp1.1 from training. Duration: 0.97275 2023-10-03 10:45:55,824 WARNING [train.py:1204] (2/4) Exclude cut with ID _30196_210_1664_1_1533708496522_7074140_768-278176-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 10:45:55,906 WARNING [train.py:1204] (2/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_315-128283-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 10:45:55,987 WARNING [train.py:1204] (2/4) Exclude cut with ID _14886_210_4220_1_1530702020355_3516190_330-323941-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 10:45:56,240 WARNING [train.py:1204] (2/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_264-348896-0_sp1.1 from training. Duration: 0.77275 2023-10-03 10:45:56,262 WARNING [train.py:1204] (2/4) Exclude cut with ID _37086_210_9038_1_1532999540891_7021250_433-10563-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 10:45:56,302 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_53638_210_21581_1_1534297319_5808449_885-80172-0_sp1.1 from training. Duration: 0.87275 2023-10-03 10:45:56,396 WARNING [train.py:1204] (2/4) Exclude cut with ID _14623_210_1925_1_1530773354962_3632609_157-218771-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 10:45:56,407 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_1368-78165-0_sp1.1 from training. Duration: 0.97275 2023-10-03 10:45:56,411 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_309-65177-0_sp0.9 from training. Duration: 0.97775 2023-10-03 10:45:56,417 WARNING [train.py:1204] (2/4) Exclude cut with ID _21632_210_2253_1_1531994001765_3744140_291-138812-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 10:45:56,430 WARNING [train.py:1204] (2/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_669-93163-0_sp1.1 from training. Duration: 0.8909375 2023-10-03 10:45:57,011 WARNING [train.py:1204] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_292-215346-0 from training. Duration: 0.97 2023-10-03 10:45:57,038 WARNING [train.py:1204] (2/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_286-222073-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 10:45:57,351 WARNING [train.py:1204] (2/4) Exclude cut with ID _59256_210_5986_1_1535076085997_3589120_121-237386-0_sp1.1 from training. Duration: 0.87275 2023-10-03 10:45:57,365 WARNING [train.py:1204] (2/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_96-320736-0 from training. Duration: 0.86 2023-10-03 10:45:57,376 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_128-202104-0 from training. Duration: 0.63 2023-10-03 10:45:57,465 WARNING [train.py:1204] (2/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_242-3472-0_sp0.9 from training. Duration: 0.811125 2023-10-03 10:45:57,514 WARNING [train.py:1204] (2/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_154-74182-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 10:45:57,518 WARNING [train.py:1204] (2/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_187-221566-0 from training. Duration: 0.43 2023-10-03 10:45:57,686 WARNING [train.py:1204] (2/4) Exclude cut with ID _6194_210_1534_1_1527511826160_3941194_14-282876-0 from training. Duration: 0.71 2023-10-03 10:45:58,190 WARNING [train.py:1204] (2/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_660-211088-0_sp1.1 from training. Duration: 0.563625 2023-10-03 10:45:58,539 WARNING [train.py:1204] (2/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_526-62079-0_sp1.1 from training. Duration: 0.6090625 2023-10-03 10:45:58,700 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_92-246270-0_sp1.1 from training. Duration: 0.77275 2023-10-03 10:45:58,793 WARNING [train.py:1204] (2/4) Exclude cut with ID IC0287W0023-142350 from training. Duration: 0.956 2023-10-03 10:45:58,875 WARNING [train.py:1204] (2/4) Exclude cut with ID _58605_210_12210_1_1534723195939_3904259_48-269402-0 from training. Duration: 0.96 2023-10-03 10:45:58,892 WARNING [train.py:1204] (2/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_416-257730-0_sp0.9 from training. Duration: 0.72225 2023-10-03 10:45:58,971 WARNING [train.py:1204] (2/4) Exclude cut with ID _7877_210_6014_1_1528286358099_3750136_185-17516-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 10:45:59,078 WARNING [train.py:1204] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_23-189234-0 from training. Duration: 0.83 2023-10-03 10:45:59,095 WARNING [train.py:1204] (2/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_237-89684-0 from training. Duration: 0.96 2023-10-03 10:45:59,166 WARNING [train.py:1204] (2/4) Exclude cut with ID _13916_210_9994_1_1530788341126_3763149_116-21034-0 from training. Duration: 0.96 2023-10-03 10:45:59,322 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_3133_210_2087_1_1526106446_2324690_246-125000-0_sp1.1 from training. Duration: 0.563625 2023-10-03 10:46:05,604 INFO [train.py:1046] (2/4) Epoch 36, batch 0, loss[loss=0.1372, simple_loss=0.2182, pruned_loss=0.02809, over 24451.00 frames. ], tot_loss[loss=0.1372, simple_loss=0.2182, pruned_loss=0.02809, over 24451.00 frames. ], batch size: 63, lr: 2.85e-03, grad_scale: 32.0 2023-10-03 10:46:05,605 INFO [train.py:1069] (2/4) Computing validation loss 2023-10-03 10:46:17,643 INFO [train.py:1078] (2/4) Epoch 36, validation: loss=0.3188, simple_loss=0.2685, pruned_loss=0.1846, over 1125622.00 frames. 2023-10-03 10:46:17,644 INFO [train.py:1079] (2/4) Maximum memory allocated so far is 20967MB 2023-10-03 10:46:20,561 WARNING [train.py:1204] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_402-253027-0 from training. Duration: 0.54 2023-10-03 10:46:20,621 WARNING [train.py:1204] (2/4) Exclude cut with ID _59288_210_5854_1_1535077757774_3412246_158-289345-0_sp1.1 from training. Duration: 0.92725 2023-10-03 10:46:23,321 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_28211_210_6388_1_1532861506_4523224_128-328162-0_sp1.1 from training. Duration: 0.8181875 2023-10-03 10:46:23,547 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.nonlin_attention.balancer.prob, batch_count=1239500.0, ans=0.125 2023-10-03 10:46:25,055 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.feed_forward3.hidden_balancer.prob, batch_count=1239500.0, ans=0.125 2023-10-03 10:46:26,299 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_788-278281-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 10:46:26,319 WARNING [train.py:1204] (2/4) Exclude cut with ID _49728_210_12651_1_1534041034294_6788150_624-195301-0_sp0.9 from training. Duration: 0.9444375 2023-10-03 10:46:26,359 WARNING [train.py:1204] (2/4) Exclude cut with ID _14860_210_4915_1_1530702020340_7343240_267-139775-0_sp1.1 from training. Duration: 0.963625 2023-10-03 10:46:27,692 WARNING [train.py:1204] (2/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_332-221057-0 from training. Duration: 0.68 2023-10-03 10:46:29,578 WARNING [train.py:1204] (2/4) Exclude cut with ID _40402_210_18219_1_1533522324115_2953150_242-76943-0 from training. Duration: 0.94 2023-10-03 10:46:30,990 WARNING [train.py:1204] (2/4) Exclude cut with ID _16747_210_5986_1_1531811544733_2749109_87-349587-0_sp1.1 from training. Duration: 0.963625 2023-10-03 10:46:32,314 WARNING [train.py:1204] (2/4) Exclude cut with ID _22982_210_1883_1_1531882790447_5449409_505-346452-0_sp1.1 from training. Duration: 0.963625 2023-10-03 10:46:36,341 WARNING [train.py:1204] (2/4) Exclude cut with ID _17956_210_4353_1_1531209568125_6979970_13-192107-0_sp1.1 from training. Duration: 0.963625 2023-10-03 10:46:37,111 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.2.encoder.layers.0.feed_forward2.out_whiten, num_groups=1, num_channels=384, metric=4.74 vs. limit=15.0 2023-10-03 10:46:37,608 WARNING [train.py:1204] (2/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_430-307666-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 10:46:37,651 WARNING [train.py:1204] (2/4) Exclude cut with ID _2895_210_2477_1_1525956785794_4063750_289-11475-0_sp1.1 from training. Duration: 0.7454375 2023-10-03 10:46:37,662 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_3910_210_1794_1_1526742306_334530_28-238909-0_sp0.9 from training. Duration: 0.9 2023-10-03 10:46:40,417 WARNING [train.py:1204] (2/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_18-130336-0 from training. Duration: 0.79 2023-10-03 10:46:41,880 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_548-294316-0_sp0.9 from training. Duration: 0.9 2023-10-03 10:46:48,419 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_42-142473-0_sp1.1 from training. Duration: 0.6545625 2023-10-03 10:46:48,425 WARNING [train.py:1204] (2/4) Exclude cut with ID _59170_210_6721_1_1534983633960_4203335_326-48066-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 10:46:50,593 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_45886_210_17024_1_1533709215_471063_7-79165-0 from training. Duration: 0.98 2023-10-03 10:46:54,577 WARNING [train.py:1204] (2/4) Exclude cut with ID _44585_210_18628_1_1533605350387_4518199_280-73534-0_sp0.9 from training. Duration: 0.988875 2023-10-03 10:46:54,590 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_3475_210_2028_1_1526193578_3622569_265-276625-0_sp1.1 from training. Duration: 0.6909375 2023-10-03 10:46:57,258 WARNING [train.py:1204] (2/4) Exclude cut with ID _64631_210_27767_1_1535349580068_4165160_88-225232-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 10:47:02,193 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_205-265518-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 10:47:07,583 WARNING [train.py:1204] (2/4) Exclude cut with ID _40402_210_18219_1_1533522324115_2953150_271-253734-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 10:47:11,962 WARNING [train.py:1204] (2/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_282-257791-0 from training. Duration: 0.86 2023-10-03 10:47:14,944 WARNING [train.py:1204] (2/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_356-45054-0 from training. Duration: 0.86 2023-10-03 10:47:16,346 WARNING [train.py:1204] (2/4) Exclude cut with ID _69584_210_5219_1_1535785133775_4044450_38-348116-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 10:47:16,347 WARNING [train.py:1204] (2/4) Exclude cut with ID _66763_210_17301_1_1535788667617_241750_21-328480-0_sp1.1 from training. Duration: 0.936375 2023-10-03 10:47:18,080 WARNING [train.py:1204] (2/4) Exclude cut with ID _26528_210_9070_1_1532570410842_3851310_497-97218-0_sp0.9 from training. Duration: 0.9555625 2023-10-03 10:47:18,141 WARNING [train.py:1204] (2/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_592-101075-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 10:47:21,221 WARNING [train.py:1204] (2/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_678-159307-0 from training. Duration: 0.85 2023-10-03 10:47:22,676 WARNING [train.py:1204] (2/4) Exclude cut with ID _28427_210_4220_1_1532581240967_3061069_166-319773-0_sp1.1 from training. Duration: 0.936375 2023-10-03 10:47:24,417 WARNING [train.py:1204] (2/4) Exclude cut with ID _29877_210_6228_1_1532397791106_5276149_606-224984-0_sp1.1 from training. Duration: 0.936375 2023-10-03 10:47:27,236 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_125-203105-0_sp1.1 from training. Duration: 0.72725 2023-10-03 10:47:31,223 INFO [train.py:1046] (2/4) Epoch 36, batch 50, loss[loss=0.1507, simple_loss=0.2337, pruned_loss=0.03383, over 24564.00 frames. ], tot_loss[loss=0.1574, simple_loss=0.2379, pruned_loss=0.03846, over 1075272.08 frames. ], batch size: 60, lr: 2.85e-03, grad_scale: 16.0 2023-10-03 10:47:31,320 WARNING [train.py:1204] (2/4) Exclude cut with ID IC0620W0113-306765 from training. Duration: 0.768 2023-10-03 10:47:32,714 WARNING [train.py:1204] (2/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_410-287857-0_sp1.1 from training. Duration: 0.8454375 2023-10-03 10:47:36,170 WARNING [train.py:1204] (2/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_78-95260-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 10:47:37,603 WARNING [train.py:1204] (2/4) Exclude cut with ID _58059_210_5854_1_1534901471149_3164808_6-73080-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 10:47:37,632 WARNING [train.py:1204] (2/4) Exclude cut with ID _5166_210_2084_1_1527073197160_4087993_331-117829-0 from training. Duration: 0.62 2023-10-03 10:47:38,900 WARNING [train.py:1204] (2/4) Exclude cut with ID _14880_210_8895_1_1530680426655_3795259_289-212817-0_sp0.9 from training. Duration: 0.7666875 2023-10-03 10:47:38,934 WARNING [train.py:1204] (2/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_620-144779-0_sp1.1 from training. Duration: 0.92725 2023-10-03 10:47:40,387 WARNING [train.py:1204] (2/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_561-288575-0_sp1.1 from training. Duration: 0.963625 2023-10-03 10:47:40,494 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_22657_210_6890_1_1532086002_1637216_122-188128-0_sp1.1 from training. Duration: 0.963625 2023-10-03 10:47:41,850 WARNING [train.py:1204] (2/4) Exclude cut with ID _16756_210_4915_1_1531047803763_7104259_604-86607-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 10:47:44,659 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.conv_skip_rate, batch_count=1239900.0, ans=0.0 2023-10-03 10:47:45,661 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.520e+02 1.883e+02 2.047e+02 2.460e+02 5.185e+02, threshold=4.094e+02, percent-clipped=4.0 2023-10-03 10:47:45,766 WARNING [train.py:1204] (2/4) Exclude cut with ID _15150_210_8341_1_1530752411803_7256384_469-239509-0 from training. Duration: 0.68 2023-10-03 10:47:45,773 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_10987_210_4163_1_1529760555_4123902_302-87926-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 10:47:52,425 WARNING [train.py:1204] (2/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_469-94673-0_sp0.9 from training. Duration: 0.87775 2023-10-03 10:47:52,582 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=1239900.0, ans=0.1 2023-10-03 10:47:53,795 WARNING [train.py:1204] (2/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_798-106179-0 from training. Duration: 0.83 2023-10-03 10:47:54,060 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.balancer2.prob, batch_count=1239900.0, ans=0.125 2023-10-03 10:47:55,218 WARNING [train.py:1204] (2/4) Exclude cut with ID _16744_210_5986_1_1531724405384_3907567_242-111316-0 from training. Duration: 0.93 2023-10-03 10:47:56,576 WARNING [train.py:1204] (2/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_938-205135-0_sp0.9 from training. Duration: 0.9666875 2023-10-03 10:47:58,026 WARNING [train.py:1204] (2/4) Exclude cut with ID _64135_210_2253_1_1535677267876_7608972_133-188266-0_sp0.9 from training. Duration: 0.9 2023-10-03 10:47:58,031 WARNING [train.py:1204] (2/4) Exclude cut with ID _44585_210_18628_1_1533605350387_4518199_623-346820-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 10:47:59,293 WARNING [train.py:1204] (2/4) Exclude cut with ID _42326_210_7770_1_1533814282531_3707269_454-266903-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 10:47:59,383 WARNING [train.py:1204] (2/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_5-55893-0_sp1.1 from training. Duration: 0.5 2023-10-03 10:48:00,828 WARNING [train.py:1204] (2/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_577-332600-0_sp1.1 from training. Duration: 0.5181875 2023-10-03 10:48:00,837 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_957-138882-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 10:48:04,780 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.3.encoder.layers.3.conv_module1.whiten, num_groups=1, num_channels=512, metric=4.79 vs. limit=15.0 2023-10-03 10:48:05,628 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.conv_module1.balancer2.min_positive, batch_count=1239966.6666666667, ans=0.05 2023-10-03 10:48:08,699 WARNING [train.py:1204] (2/4) Exclude cut with ID _12556_210_8341_1_1530081024100_7327690_219-38004-0_sp1.1 from training. Duration: 0.97275 2023-10-03 10:48:08,896 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.conv_module2.balancer1.prob, batch_count=1239966.6666666667, ans=0.125 2023-10-03 10:48:10,144 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_397-147196-0_sp1.1 from training. Duration: 0.72725 2023-10-03 10:48:10,153 WARNING [train.py:1204] (2/4) Exclude cut with ID _3541_210_2477_1_1526475408709_3263973_231-104736-0_sp0.9 from training. Duration: 0.8666875 2023-10-03 10:48:11,384 WARNING [train.py:1204] (2/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_398-334558-0 from training. Duration: 0.96 2023-10-03 10:48:12,899 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_257-20733-0_sp1.1 from training. Duration: 0.6909375 2023-10-03 10:48:14,188 WARNING [train.py:1204] (2/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_146-282400-0_sp1.1 from training. Duration: 0.6454375 2023-10-03 10:48:14,192 WARNING [train.py:1204] (2/4) Exclude cut with ID _51899_210_20034_1_1534129254466_3669220_265-215428-0 from training. Duration: 0.98 2023-10-03 10:48:14,253 WARNING [train.py:1204] (2/4) Exclude cut with ID _34423_210_8750_1_1533430798456_3330199_219-106541-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 10:48:16,834 WARNING [train.py:1204] (2/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_458-67321-0 from training. Duration: 0.97 2023-10-03 10:48:25,122 WARNING [train.py:1204] (2/4) Exclude cut with ID _6653_210_2629_1_1527919172441_6732679_1051-182606-0_sp1.1 from training. Duration: 0.936375 2023-10-03 10:48:25,149 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_53555_210_5632_1_1534595220_7232297_603-4396-0_sp1.1 from training. Duration: 0.97275 2023-10-03 10:48:27,019 WARNING [train.py:1204] (2/4) Exclude cut with ID _54218_210_12210_1_1534413646388_4409259_159-144367-0_sp1.1 from training. Duration: 0.963625 2023-10-03 10:48:28,492 WARNING [train.py:1204] (2/4) Exclude cut with ID _7606_210_4915_1_1528894253277_5080690_301-10732-0_sp0.9 from training. Duration: 0.9 2023-10-03 10:48:28,495 WARNING [train.py:1204] (2/4) Exclude cut with ID _15026_210_8750_1_1530777602316_3291850_211-195043-0_sp0.9 from training. Duration: 0.888875 2023-10-03 10:48:29,847 WARNING [train.py:1204] (2/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_146-282400-0 from training. Duration: 0.71 2023-10-03 10:48:29,890 WARNING [train.py:1204] (2/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_65-88242-0 from training. Duration: 0.56 2023-10-03 10:48:31,341 WARNING [train.py:1204] (2/4) Exclude cut with ID _55857_210_2346_1_1534644007831_6898209_80-107757-0_sp1.1 from training. Duration: 0.963625 2023-10-03 10:48:32,049 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.2.encoder.layers.2.conv_module2.whiten, num_groups=1, num_channels=384, metric=6.28 vs. limit=15.0 2023-10-03 10:48:32,578 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_15126_210_6753_1_1530778677_2591185_120-277707-0_sp0.9 from training. Duration: 0.888875 2023-10-03 10:48:32,861 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.conv_module1.balancer1.max_abs, batch_count=1240100.0, ans=10.0 2023-10-03 10:48:34,050 WARNING [train.py:1204] (2/4) Exclude cut with ID _44778_210_2054_1_1533798127203_7443090_1033-168593-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 10:48:34,106 WARNING [train.py:1204] (2/4) Exclude cut with ID _46683_210_9642_1_1533730913492_4591116_475-30711-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 10:48:34,131 WARNING [train.py:1204] (2/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_253-308377-0 from training. Duration: 0.49 2023-10-03 10:48:35,841 WARNING [train.py:1204] (2/4) Exclude cut with ID _4680_210_3525_1_1527249757953_893386_75-78979-0 from training. Duration: 0.91 2023-10-03 10:48:36,076 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.nonlin_attention.balancer.prob, batch_count=1240100.0, ans=0.125 2023-10-03 10:48:37,231 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_5522_210_3273_1_1527249606_3909096_318-203153-0_sp0.9 from training. Duration: 0.47775 2023-10-03 10:48:37,388 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=1240100.0, ans=0.1 2023-10-03 10:48:38,550 WARNING [train.py:1204] (2/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_550-170116-0_sp1.1 from training. Duration: 0.92725 2023-10-03 10:48:38,585 WARNING [train.py:1204] (2/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_277-210798-0_sp0.9 from training. Duration: 0.611125 2023-10-03 10:48:39,957 WARNING [train.py:1204] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_620-276793-0 from training. Duration: 0.59 2023-10-03 10:48:39,976 WARNING [train.py:1204] (2/4) Exclude cut with ID _3067_210_2667_1_1526212974640_4503168_119-175776-0 from training. Duration: 0.74 2023-10-03 10:48:40,138 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.ff2_skip_rate, batch_count=1240100.0, ans=0.0 2023-10-03 10:48:42,595 WARNING [train.py:1204] (2/4) Exclude cut with ID _55209_210_1531_1_1534675828996_4597160_681-195827-0_sp1.1 from training. Duration: 0.92725 2023-10-03 10:48:42,663 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_427-78097-0_sp1.1 from training. Duration: 0.72725 2023-10-03 10:48:43,963 WARNING [train.py:1204] (2/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_96-290039-0_sp0.9 from training. Duration: 0.62225 2023-10-03 10:48:43,973 WARNING [train.py:1204] (2/4) Exclude cut with ID _45329_210_7005_1_1534157951211_6774339_287-292174-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 10:48:45,317 INFO [train.py:1046] (2/4) Epoch 36, batch 100, loss[loss=0.1661, simple_loss=0.2522, pruned_loss=0.03994, over 24636.00 frames. ], tot_loss[loss=0.1599, simple_loss=0.2394, pruned_loss=0.0402, over 1882619.30 frames. ], batch size: 68, lr: 2.85e-03, grad_scale: 16.0 2023-10-03 10:48:45,514 WARNING [train.py:1204] (2/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_200-292228-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 10:48:48,299 WARNING [train.py:1204] (2/4) Exclude cut with ID _69884_210_9771_1_1535846392818_4166169_291-194999-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 10:48:51,035 WARNING [train.py:1204] (2/4) Exclude cut with ID _58895_210_8750_1_1535184081320_3649089_265-323810-0_sp1.1 from training. Duration: 0.87275 2023-10-03 10:48:52,836 WARNING [train.py:1204] (2/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_58-68956-0 from training. Duration: 0.83 2023-10-03 10:48:52,848 WARNING [train.py:1204] (2/4) Exclude cut with ID _39867_210_9826_1_1533553222030_9344259_322-115153-0_sp1.1 from training. Duration: 0.963625 2023-10-03 10:48:57,374 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_3371_210_1794_1_1526384462_5580777_396-339812-0_sp1.1 from training. Duration: 0.77275 2023-10-03 10:48:57,394 WARNING [train.py:1204] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_731-2428-0_sp0.9 from training. Duration: 0.92225 2023-10-03 10:48:57,426 WARNING [train.py:1204] (2/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_98-741-0_sp1.1 from training. Duration: 0.72725 2023-10-03 10:48:57,428 WARNING [train.py:1204] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_238-245655-0_sp0.9 from training. Duration: 0.9 2023-10-03 10:48:58,585 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6788_210_1388_1_1527898698_4562021_91-197981-0_sp0.9 from training. Duration: 0.92225 2023-10-03 10:49:00,119 WARNING [train.py:1204] (2/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_860-96198-0 from training. Duration: 0.73 2023-10-03 10:49:02,829 WARNING [train.py:1204] (2/4) Exclude cut with ID _14268_210_9206_1_1530622771285_4714099_331-105351-0_sp0.9 from training. Duration: 0.72225 2023-10-03 10:49:02,859 WARNING [train.py:1204] (2/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_204-137133-0_sp1.1 from training. Duration: 0.92725 2023-10-03 10:49:02,892 WARNING [train.py:1204] (2/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_5-46496-0_sp1.1 from training. Duration: 0.936375 2023-10-03 10:49:04,152 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_160-218721-0_sp1.1 from training. Duration: 0.87275 2023-10-03 10:49:07,636 WARNING [train.py:1204] (2/4) Exclude cut with ID _45329_210_7005_1_1534157951211_6774339_503-287883-0 from training. Duration: 0.88 2023-10-03 10:49:07,699 WARNING [train.py:1204] (2/4) Exclude cut with ID _57572_210_5517_1_1534921219624_3809484_110-79134-0_sp1.1 from training. Duration: 0.92725 2023-10-03 10:49:09,079 WARNING [train.py:1204] (2/4) Exclude cut with ID _44585_210_18628_1_1533605350387_4518199_421-37641-0_sp1.1 from training. Duration: 0.936375 2023-10-03 10:49:10,399 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_42-142473-0_sp0.9 from training. Duration: 0.8 2023-10-03 10:49:11,914 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.feed_forward1.hidden_balancer.prob, batch_count=1240233.3333333333, ans=0.125 2023-10-03 10:49:11,975 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.conv_module2.balancer2.prob, batch_count=1240233.3333333333, ans=0.125 2023-10-03 10:49:13,096 WARNING [train.py:1204] (2/4) Exclude cut with ID _5166_210_2084_1_1527073197160_4087993_310-114452-0_sp0.9 from training. Duration: 0.6555625 2023-10-03 10:49:17,218 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9029W0120-509520 from training. Duration: 0.768 2023-10-03 10:49:17,241 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9029W0433-509820 from training. Duration: 0.896 2023-10-03 10:49:18,793 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_40222_210_6228_1_1533385298_4481939_163-334586-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 10:49:18,794 WARNING [train.py:1204] (2/4) Exclude cut with ID _48677_210_5663_1_1533902405919_3892989_408-40740-0_sp1.1 from training. Duration: 0.7545625 2023-10-03 10:49:21,579 WARNING [train.py:1204] (2/4) Exclude cut with ID _24314_210_4915_1_1532135078116_7170179_521-258868-0_sp0.9 from training. Duration: 0.87775 2023-10-03 10:49:23,010 WARNING [train.py:1204] (2/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_576-51198-0_sp1.1 from training. Duration: 0.92725 2023-10-03 10:49:24,969 WARNING [train.py:1204] (2/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_306-216871-0_sp1.1 from training. Duration: 0.97275 2023-10-03 10:49:26,838 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.balancer_ff2.min_abs, batch_count=1240300.0, ans=0.1 2023-10-03 10:49:29,666 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.conv_skip_rate, batch_count=1240366.6666666667, ans=0.0 2023-10-03 10:49:30,824 WARNING [train.py:1204] (2/4) Exclude cut with ID _20684_210_6917_1_1531567751646_4203930_463-119982-0_sp1.1 from training. Duration: 0.97275 2023-10-03 10:49:32,627 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9084W0012-536850 from training. Duration: 0.64 2023-10-03 10:49:34,030 WARNING [train.py:1204] (2/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_847-41334-0_sp1.1 from training. Duration: 0.4 2023-10-03 10:49:36,914 WARNING [train.py:1204] (2/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_326-165900-0_sp1.1 from training. Duration: 0.62725 2023-10-03 10:49:37,185 INFO [scaling.py:1118] (2/4) WithLoss: name=encoder.encoders.4.encoder.layers.0.self_attn_weights, loss-sum=0.000e+00 2023-10-03 10:49:38,244 WARNING [train.py:1204] (2/4) Exclude cut with ID _7537_210_2084_1_1528502395431_3924359_128-268029-0_sp1.1 from training. Duration: 0.8909375 2023-10-03 10:49:40,138 WARNING [train.py:1204] (2/4) Exclude cut with ID _67499_210_17282_1_1535509805220_3692430_333-155030-0_sp1.1 from training. Duration: 0.97275 2023-10-03 10:49:42,937 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_34008_210_10431_1_1533345424_8183247_588-257177-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 10:49:44,965 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.4.encoder.layers.1.self_attn1.whiten, num_groups=1, num_channels=384, metric=11.90 vs. limit=22.5 2023-10-03 10:49:45,761 WARNING [train.py:1204] (2/4) Exclude cut with ID _25914_210_6400_1_1532141317058_644740_52-96808-0_sp1.1 from training. Duration: 0.82725 2023-10-03 10:49:45,871 WARNING [train.py:1204] (2/4) Exclude cut with ID _56494_210_4220_1_1535284837625_3033169_69-138150-0_sp1.1 from training. Duration: 0.8545625 2023-10-03 10:49:48,492 WARNING [train.py:1204] (2/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_19-143553-0_sp1.1 from training. Duration: 0.97275 2023-10-03 10:49:48,548 WARNING [train.py:1204] (2/4) Exclude cut with ID _21878_210_3525_1_1531738772002_7264330_582-130735-0_sp1.1 from training. Duration: 0.936375 2023-10-03 10:49:49,903 WARNING [train.py:1204] (2/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_943-201356-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 10:49:49,913 WARNING [train.py:1204] (2/4) Exclude cut with ID _3231_210_2106_1_1526205864934_6784220_422-229276-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 10:49:50,034 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.attention_skip_rate, batch_count=1240433.3333333333, ans=0.0 2023-10-03 10:49:51,716 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_48049_210_5632_1_1533864553_3519937_73-198717-0_sp1.1 from training. Duration: 0.97275 2023-10-03 10:49:51,763 WARNING [train.py:1204] (2/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_329-285801-0 from training. Duration: 0.91 2023-10-03 10:49:51,798 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9171W0114-579000 from training. Duration: 0.896 2023-10-03 10:49:51,817 WARNING [train.py:1204] (2/4) Exclude cut with ID _3120_210_3525_1_1526036143112_4072980_210-132675-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 10:49:53,185 WARNING [train.py:1204] (2/4) Exclude cut with ID _55857_210_2346_1_1534644007831_6898209_745-98391-0_sp0.9 from training. Duration: 0.8444375 2023-10-03 10:49:54,526 WARNING [train.py:1204] (2/4) Exclude cut with ID _5250_210_5219_1_1527249561806_8692110_615-322035-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 10:49:54,530 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_36756_210_7234_1_1533026991_966276_119-315767-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 10:49:54,544 WARNING [train.py:1204] (2/4) Exclude cut with ID _13916_210_9994_1_1530788341126_3763149_60-191556-0_sp1.1 from training. Duration: 0.5545625 2023-10-03 10:49:54,560 WARNING [train.py:1204] (2/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_177-236809-0_sp1.1 from training. Duration: 0.4454375 2023-10-03 10:49:54,608 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7002_210_1794_1_1527939683_5157286_184-229630-0_sp1.1 from training. Duration: 0.67275 2023-10-03 10:49:54,613 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_50428_210_5724_1_1534247532_4126418_464-18322-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 10:49:55,302 WARNING [train.py:1204] (2/4) Exclude cut with ID _35799_210_5854_1_1533263487887_3529071_466-131402-0_sp1.1 from training. Duration: 0.936375 2023-10-03 10:49:56,428 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_1259-132768-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 10:49:56,579 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.1.feed_forward1.out_proj.dropout_p, batch_count=1240433.3333333333, ans=0.1 2023-10-03 10:49:57,789 WARNING [train.py:1204] (2/4) Exclude cut with ID _6694_210_4599_1_1527852637490_3965529_144-185051-0_sp1.1 from training. Duration: 0.87275 2023-10-03 10:49:57,820 WARNING [train.py:1204] (2/4) Exclude cut with ID _64639_210_5816_1_1535367619437_3528000_59-133290-0_sp1.1 from training. Duration: 0.963625 2023-10-03 10:49:59,052 INFO [train.py:1046] (2/4) Epoch 36, batch 150, loss[loss=0.1597, simple_loss=0.2358, pruned_loss=0.04183, over 23246.00 frames. ], tot_loss[loss=0.1608, simple_loss=0.2409, pruned_loss=0.04037, over 2516308.56 frames. ], batch size: 119, lr: 2.85e-03, grad_scale: 16.0 2023-10-03 10:49:59,205 WARNING [train.py:1204] (2/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_223-49525-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 10:50:03,872 WARNING [train.py:1204] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_748-36150-0_sp1.1 from training. Duration: 0.82725 2023-10-03 10:50:03,873 WARNING [train.py:1204] (2/4) Exclude cut with ID _19896_210_1883_1_1531709022662_6649569_52-221627-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 10:50:03,927 WARNING [train.py:1204] (2/4) Exclude cut with ID _6789_210_1534_1_1527854471671_2870519_296-115766-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 10:50:08,623 WARNING [train.py:1204] (2/4) Exclude cut with ID _14624_210_1925_1_1530858690657_3642219_100-19871-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 10:50:08,657 WARNING [train.py:1204] (2/4) Exclude cut with ID _12555_210_8750_1_1530075594402_3780239_50-293675-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 10:50:11,363 WARNING [train.py:1204] (2/4) Exclude cut with ID _14880_210_8895_1_1530680426655_3795259_289-212817-0_sp1.1 from training. Duration: 0.62725 2023-10-03 10:50:12,132 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.2.encoder.layers.1.whiten, num_groups=1, num_channels=384, metric=4.78 vs. limit=12.0 2023-10-03 10:50:12,712 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_14056_210_4163_1_1530604483_7572827_388-211626-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 10:50:13,948 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.531e+02 1.846e+02 1.951e+02 2.154e+02 3.020e+02, threshold=3.902e+02, percent-clipped=0.0 2023-10-03 10:50:15,534 WARNING [train.py:1204] (2/4) Exclude cut with ID _5289_210_2084_1_1527678202736_3779299_392-261988-0 from training. Duration: 0.65 2023-10-03 10:50:15,541 WARNING [train.py:1204] (2/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_124-345840-0 from training. Duration: 0.52 2023-10-03 10:50:15,546 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_2655_210_2087_1_1525688959_3897400_437-337690-0 from training. Duration: 0.63 2023-10-03 10:50:15,697 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.bypass.scale_min, batch_count=1240566.6666666667, ans=0.2 2023-10-03 10:50:18,335 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_3780_210_2087_1_1526467391_239068_13-274633-0_sp1.1 from training. Duration: 0.92725 2023-10-03 10:50:18,347 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_805-71497-0_sp1.1 from training. Duration: 0.6454375 2023-10-03 10:50:19,705 WARNING [train.py:1204] (2/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_434-83000-0_sp1.1 from training. Duration: 0.8909375 2023-10-03 10:50:19,794 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_17613_210_2151_1_1531137569_2366160_285-219531-0_sp1.1 from training. Duration: 0.97275 2023-10-03 10:50:19,799 WARNING [train.py:1204] (2/4) Exclude cut with ID _56108_210_16380_1_1534505446159_3887959_22-37858-0_sp1.1 from training. Duration: 0.936375 2023-10-03 10:50:19,824 WARNING [train.py:1204] (2/4) Exclude cut with ID _28427_210_4220_1_1532581240967_3061069_200-88444-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 10:50:21,236 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_43802_210_2290_1_1533516203_8829504_77-279042-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 10:50:23,094 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9309W0193-640320 from training. Duration: 0.896 2023-10-03 10:50:24,422 WARNING [train.py:1204] (2/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_516-324055-0_sp1.1 from training. Duration: 0.936375 2023-10-03 10:50:30,432 WARNING [train.py:1204] (2/4) Exclude cut with ID _40094_210_16380_1_1533294005773_3641159_10-261240-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 10:50:35,202 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.balancer1.prob, batch_count=1240633.3333333333, ans=0.125 2023-10-03 10:50:36,273 WARNING [train.py:1204] (2/4) Exclude cut with ID _6267_210_2084_1_1527764658526_3894730_375-219887-0_sp1.1 from training. Duration: 0.6090625 2023-10-03 10:50:36,367 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6788_210_1388_1_1527898698_4562021_180-75288-0 from training. Duration: 0.99 2023-10-03 10:50:41,149 WARNING [train.py:1204] (2/4) Exclude cut with ID _8030_210_7497_1_1528444886781_3531089_262-86750-0_sp1.1 from training. Duration: 0.736375 2023-10-03 10:50:41,162 WARNING [train.py:1204] (2/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_934-325498-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 10:50:41,194 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_456-22141-0_sp0.9 from training. Duration: 0.97775 2023-10-03 10:50:42,647 WARNING [train.py:1204] (2/4) Exclude cut with ID _49643_210_7989_1_1534640244224_3939031_598-161926-0_sp1.1 from training. Duration: 0.8181875 2023-10-03 10:50:45,389 WARNING [train.py:1204] (2/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_345-49721-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 10:50:45,456 WARNING [train.py:1204] (2/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_440-141811-0_sp1.1 from training. Duration: 0.863625 2023-10-03 10:50:46,768 WARNING [train.py:1204] (2/4) Exclude cut with ID _64048_210_10626_1_1535198382802_3835179_394-212120-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 10:50:46,821 WARNING [train.py:1204] (2/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_91-277684-0 from training. Duration: 0.74 2023-10-03 10:50:52,200 WARNING [train.py:1204] (2/4) Exclude cut with ID _23038_210_9570_1_1532001584158_3652419_191-346343-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 10:50:52,727 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.5.encoder.layers.1.self_attn_weights.whiten_keys, num_groups=4, num_channels=128, metric=1.51 vs. limit=6.0 2023-10-03 10:50:53,541 WARNING [train.py:1204] (2/4) Exclude cut with ID _15140_210_9288_1_1530791388450_4880149_351-347974-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 10:50:53,581 WARNING [train.py:1204] (2/4) Exclude cut with ID _58605_210_12210_1_1534723195939_3904259_48-269402-0_sp1.1 from training. Duration: 0.87275 2023-10-03 10:50:55,373 WARNING [train.py:1204] (2/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_401-53423-0_sp0.9 from training. Duration: 0.611125 2023-10-03 10:50:56,900 WARNING [train.py:1204] (2/4) Exclude cut with ID _42659_210_2070_1_1533607442765_3120380_158-49004-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 10:50:58,272 WARNING [train.py:1204] (2/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_81-75849-0_sp1.1 from training. Duration: 0.3545625 2023-10-03 10:51:01,010 WARNING [train.py:1204] (2/4) Exclude cut with ID _27052_210_5986_1_1532329197187_3585009_314-106499-0_sp1.1 from training. Duration: 0.836375 2023-10-03 10:51:02,904 WARNING [train.py:1204] (2/4) Exclude cut with ID _4626_210_3532_1_1526817652717_3916859_446-206656-0_sp0.9 from training. Duration: 0.8444375 2023-10-03 10:51:02,995 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_53378_210_17282_1_1534221071_5769509_158-114960-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 10:51:04,294 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_3171_210_2087_1_1526109084_2330007_37-64334-0_sp0.9 from training. Duration: 0.911125 2023-10-03 10:51:04,308 WARNING [train.py:1204] (2/4) Exclude cut with ID _5410_210_1388_1_1527397055795_1719020_39-112221-0 from training. Duration: 0.69 2023-10-03 10:51:04,325 WARNING [train.py:1204] (2/4) Exclude cut with ID _8064_210_5399_1_1528372299291_4282973_4-91037-0_sp0.9 from training. Duration: 0.97775 2023-10-03 10:51:04,337 WARNING [train.py:1204] (2/4) Exclude cut with ID ID0079W0443-717795 from training. Duration: 0.541 2023-10-03 10:51:10,219 WARNING [train.py:1204] (2/4) Exclude cut with ID _34734_210_4525_1_1533365977542_4448720_766-297928-0_sp1.1 from training. Duration: 0.936375 2023-10-03 10:51:12,202 WARNING [train.py:1204] (2/4) Exclude cut with ID _51471_210_1889_1_1534399391390_3094250_310-337793-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 10:51:12,223 WARNING [train.py:1204] (2/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_678-159307-0_sp0.9 from training. Duration: 0.9444375 2023-10-03 10:51:12,369 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.conv_module2.balancer2.prob, batch_count=1240833.3333333333, ans=0.125 2023-10-03 10:51:13,582 INFO [train.py:1046] (2/4) Epoch 36, batch 200, loss[loss=0.15, simple_loss=0.2303, pruned_loss=0.03482, over 21601.00 frames. ], tot_loss[loss=0.1624, simple_loss=0.2424, pruned_loss=0.04118, over 3003724.40 frames. ], batch size: 47, lr: 2.85e-03, grad_scale: 8.0 2023-10-03 10:51:16,450 WARNING [train.py:1204] (2/4) Exclude cut with ID _50093_210_4220_1_1534417141207_3432299_259-322252-0 from training. Duration: 0.92 2023-10-03 10:51:16,515 WARNING [train.py:1204] (2/4) Exclude cut with ID _47790_210_16187_1_1533862814066_4207519_67-103444-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 10:51:16,535 WARNING [train.py:1204] (2/4) Exclude cut with ID _40551_210_10635_1_1533776424632_3416179_311-247578-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 10:51:19,218 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_4633_210_2040_1_1526824163_4145343_416-76117-0 from training. Duration: 0.95 2023-10-03 10:51:20,552 WARNING [train.py:1204] (2/4) Exclude cut with ID _5166_210_2084_1_1527073197160_4087993_331-117829-0_sp0.9 from training. Duration: 0.688875 2023-10-03 10:51:20,663 WARNING [train.py:1204] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_299-247059-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 10:51:21,944 WARNING [train.py:1204] (2/4) Exclude cut with ID _64002_210_9570_1_1535198605157_38979_2-240845-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 10:51:22,141 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.feed_forward3.hidden_balancer.prob, batch_count=1240833.3333333333, ans=0.125 2023-10-03 10:51:23,926 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.1.attention_skip_rate, batch_count=1240833.3333333333, ans=0.0 2023-10-03 10:51:26,355 WARNING [train.py:1204] (2/4) Exclude cut with ID _4681_210_3525_1_1527250689464_120889_15-126760-0_sp0.9 from training. Duration: 0.9 2023-10-03 10:51:26,375 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_21500_210_2054_1_1532149310_2716758_256-156611-0_sp1.1 from training. Duration: 0.936375 2023-10-03 10:51:26,395 WARNING [train.py:1204] (2/4) Exclude cut with ID _16908_210_5724_1_1531119157248_3772700_624-203729-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 10:51:37,548 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.conv_skip_rate, batch_count=1240900.0, ans=0.0 2023-10-03 10:51:47,167 WARNING [train.py:1204] (2/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_605-219732-0_sp1.1 from training. Duration: 0.8545625 2023-10-03 10:51:47,192 WARNING [train.py:1204] (2/4) Exclude cut with ID _44718_210_5816_1_1534050051869_6469030_380-167520-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 10:51:48,487 WARNING [train.py:1204] (2/4) Exclude cut with ID _19896_210_1883_1_1531709022662_6649569_94-229927-0_sp1.1 from training. Duration: 0.8818125 2023-10-03 10:51:49,789 WARNING [train.py:1204] (2/4) Exclude cut with ID _19514_210_9259_1_1531530071330_5084230_519-174300-0_sp1.1 from training. Duration: 0.963625 2023-10-03 10:51:49,883 WARNING [train.py:1204] (2/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_340-174176-0_sp0.9 from training. Duration: 0.5444375 2023-10-03 10:51:49,887 WARNING [train.py:1204] (2/4) Exclude cut with ID _8263_210_1664_1_1528439452891_970220_68-181805-0_sp1.1 from training. Duration: 0.5818125 2023-10-03 10:51:51,288 WARNING [train.py:1204] (2/4) Exclude cut with ID _3120_210_3525_1_1526036143112_4072980_348-257723-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 10:51:52,628 WARNING [train.py:1204] (2/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_453-133983-0_sp1.1 from training. Duration: 0.7090625 2023-10-03 10:51:52,684 WARNING [train.py:1204] (2/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_17-86060-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 10:51:52,712 WARNING [train.py:1204] (2/4) Exclude cut with ID _29877_210_6228_1_1532397791106_5276149_228-184657-0_sp1.1 from training. Duration: 0.97275 2023-10-03 10:51:52,799 WARNING [train.py:1204] (2/4) Exclude cut with ID _18516_210_9206_1_1531704653206_3692406_198-334870-0 from training. Duration: 0.92 2023-10-03 10:51:53,988 WARNING [train.py:1204] (2/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_340-174176-0_sp1.1 from training. Duration: 0.4454375 2023-10-03 10:51:54,000 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_3133_210_2087_1_1526106446_2324690_14-139186-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 10:51:58,681 WARNING [train.py:1204] (2/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_126-29104-0_sp0.9 from training. Duration: 0.9444375 2023-10-03 10:51:58,970 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.bypass_mid.scale_min, batch_count=1241033.3333333333, ans=0.2 2023-10-03 10:52:02,072 WARNING [train.py:1204] (2/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_84-10310-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 10:52:06,282 INFO [scaling.py:1118] (2/4) WithLoss: name=encoder.encoders.1.encoder.layers.1.self_attn_weights, loss-sum=0.000e+00 2023-10-03 10:52:11,081 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_15517_210_6753_1_1530954008_3487336_79-273076-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 10:52:11,132 WARNING [train.py:1204] (2/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_371-18169-0_sp0.9 from training. Duration: 0.9555625 2023-10-03 10:52:16,979 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_68702_210_23689_1_1535799577_3420068_170-92788-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 10:52:19,804 WARNING [train.py:1204] (2/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_78-182912-0 from training. Duration: 0.59 2023-10-03 10:52:19,868 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_3019_210_2028_1_1526044245_3264865_297-12384-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 10:52:19,873 WARNING [train.py:1204] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_147-111137-0_sp1.1 from training. Duration: 0.863625 2023-10-03 10:52:21,175 WARNING [train.py:1204] (2/4) Exclude cut with ID _28323_210_16187_1_1532480395667_4100300_117-8859-0_sp1.1 from training. Duration: 0.97275 2023-10-03 10:52:22,627 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7762_210_1794_1_1528199508_4057408_419-125009-0_sp1.1 from training. Duration: 0.7545625 2023-10-03 10:52:23,972 WARNING [train.py:1204] (2/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_268-87909-0 from training. Duration: 0.99 2023-10-03 10:52:25,405 WARNING [train.py:1204] (2/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_118-311357-0_sp1.1 from training. Duration: 0.92725 2023-10-03 10:52:25,418 WARNING [train.py:1204] (2/4) Exclude cut with ID ID0377W0003-870225 from training. Duration: 0.852 2023-10-03 10:52:26,684 INFO [train.py:1046] (2/4) Epoch 36, batch 250, loss[loss=0.151, simple_loss=0.2383, pruned_loss=0.0319, over 24656.00 frames. ], tot_loss[loss=0.1607, simple_loss=0.2407, pruned_loss=0.04031, over 3382264.78 frames. ], batch size: 68, lr: 2.85e-03, grad_scale: 4.0 2023-10-03 10:52:28,691 WARNING [train.py:1204] (2/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_56-5838-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 10:52:31,545 WARNING [train.py:1204] (2/4) Exclude cut with ID _14603_210_4220_1_1530615688441_3097319_90-31383-0_sp0.9 from training. Duration: 0.9666875 2023-10-03 10:52:32,937 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_31583_210_3852_1_1532595851_4024413_58-86657-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 10:52:32,976 WARNING [train.py:1204] (2/4) Exclude cut with ID _30304_210_6319_1_1532476827261_7582060_344-113875-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 10:52:34,473 WARNING [train.py:1204] (2/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_278-104676-0_sp1.1 from training. Duration: 0.8909375 2023-10-03 10:52:34,496 WARNING [train.py:1204] (2/4) Exclude cut with ID _55844_210_2346_1_1534471212569_7625707_764-2387-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 10:52:36,292 WARNING [train.py:1204] (2/4) Exclude cut with ID _27430_210_7770_1_1532604689375_1378590_157-14963-0_sp1.1 from training. Duration: 0.963625 2023-10-03 10:52:39,097 WARNING [train.py:1204] (2/4) Exclude cut with ID _7160_210_1876_1_1527940923231_3703799_282-322611-0_sp1.1 from training. Duration: 0.7909375 2023-10-03 10:52:45,131 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.606e+02 1.828e+02 1.971e+02 2.158e+02 2.749e+02, threshold=3.942e+02, percent-clipped=0.0 2023-10-03 10:52:48,132 WARNING [train.py:1204] (2/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_190-138640-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 10:52:50,881 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_33348_210_5632_1_1533086912_3324030_63-270922-0_sp1.1 from training. Duration: 0.97275 2023-10-03 10:52:50,915 WARNING [train.py:1204] (2/4) Exclude cut with ID _54871_210_22356_1_1534665798253_3856359_135-184281-0_sp1.1 from training. Duration: 0.9 2023-10-03 10:52:56,948 WARNING [train.py:1204] (2/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_277-210798-0_sp1.1 from training. Duration: 0.5 2023-10-03 10:52:57,010 WARNING [train.py:1204] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_402-253027-0_sp0.9 from training. Duration: 0.6 2023-10-03 10:52:57,251 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.ff2_skip_rate, batch_count=1241300.0, ans=0.0 2023-10-03 10:52:58,339 WARNING [train.py:1204] (2/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_540-275685-0_sp0.9 from training. Duration: 0.988875 2023-10-03 10:52:59,708 WARNING [train.py:1204] (2/4) Exclude cut with ID _59493_210_17282_1_1535078419077_3225879_118-18960-0_sp1.1 from training. Duration: 0.936375 2023-10-03 10:52:59,779 WARNING [train.py:1204] (2/4) Exclude cut with ID _6194_210_1534_1_1527511826160_3941194_321-148641-0_sp1.1 from training. Duration: 0.5818125 2023-10-03 10:52:59,792 WARNING [train.py:1204] (2/4) Exclude cut with ID _61133_210_9969_1_1535196777227_3696409_333-270325-0_sp1.1 from training. Duration: 0.8454375 2023-10-03 10:53:01,163 WARNING [train.py:1204] (2/4) Exclude cut with ID _49376_210_5986_1_1533985278732_3370740_307-145221-0_sp1.1 from training. Duration: 0.936375 2023-10-03 10:53:03,165 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6846_210_6753_1_1527850767_4295158_2-315073-0_sp0.9 from training. Duration: 0.97775 2023-10-03 10:53:06,189 WARNING [train.py:1204] (2/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_17-25045-0 from training. Duration: 0.59 2023-10-03 10:53:06,204 WARNING [train.py:1204] (2/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_503-333510-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 10:53:08,821 WARNING [train.py:1204] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_238-245655-0_sp1.1 from training. Duration: 0.736375 2023-10-03 10:53:08,857 WARNING [train.py:1204] (2/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_329-82235-0_sp0.9 from training. Duration: 0.911125 2023-10-03 10:53:08,862 WARNING [train.py:1204] (2/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_325-113798-0_sp0.9 from training. Duration: 0.9444375 2023-10-03 10:53:10,272 WARNING [train.py:1204] (2/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_395-37222-0_sp0.9 from training. Duration: 0.8444375 2023-10-03 10:53:12,073 WARNING [train.py:1204] (2/4) Exclude cut with ID _64135_210_2253_1_1535677267876_7608972_498-5039-0_sp1.1 from training. Duration: 0.7090625 2023-10-03 10:53:12,090 WARNING [train.py:1204] (2/4) Exclude cut with ID _14922_210_6228_1_1530707435273_3693310_303-228738-0_sp0.9 from training. Duration: 0.9333125 2023-10-03 10:53:15,301 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_577-220899-0_sp1.1 from training. Duration: 0.92725 2023-10-03 10:53:15,404 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_3475_210_2028_1_1526193578_3622569_377-166133-0_sp1.1 from training. Duration: 0.7818125 2023-10-03 10:53:16,722 WARNING [train.py:1204] (2/4) Exclude cut with ID _49376_210_5986_1_1533985278732_3370740_272-313512-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 10:53:20,742 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7762_210_1794_1_1528199508_4057408_422-227034-0_sp0.9 from training. Duration: 0.811125 2023-10-03 10:53:23,680 WARNING [train.py:1204] (2/4) Exclude cut with ID _18895_210_6228_1_1531357274234_3784930_74-341428-0_sp1.1 from training. Duration: 0.92725 2023-10-03 10:53:26,620 WARNING [train.py:1204] (2/4) Exclude cut with ID _44902_210_16187_1_1533888006062_3792078_149-67212-0_sp1.1 from training. Duration: 0.7909375 2023-10-03 10:53:31,219 WARNING [train.py:1204] (2/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_243-126020-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 10:53:32,997 WARNING [train.py:1204] (2/4) Exclude cut with ID _54218_210_12210_1_1534413646388_4409259_259-6561-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 10:53:35,736 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_41452_210_7291_1_1533437687_3941281_405-11559-0 from training. Duration: 0.91 2023-10-03 10:53:35,825 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_20120_210_6753_1_1531643035_5482859_759-225084-0_sp1.1 from training. Duration: 0.97275 2023-10-03 10:53:35,837 WARNING [train.py:1204] (2/4) Exclude cut with ID _14747_210_8341_1_1530685797463_3654089_147-131229-0_sp0.9 from training. Duration: 0.8444375 2023-10-03 10:53:38,632 WARNING [train.py:1204] (2/4) Exclude cut with ID _14867_210_1968_1_1530684179890_3846969_319-250868-0 from training. Duration: 0.99 2023-10-03 10:53:38,669 WARNING [train.py:1204] (2/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_580-93798-0_sp0.9 from training. Duration: 0.62225 2023-10-03 10:53:41,194 INFO [train.py:1046] (2/4) Epoch 36, batch 300, loss[loss=0.1602, simple_loss=0.2506, pruned_loss=0.03489, over 24314.00 frames. ], tot_loss[loss=0.16, simple_loss=0.2389, pruned_loss=0.04052, over 3664558.58 frames. ], batch size: 74, lr: 2.85e-03, grad_scale: 8.0 2023-10-03 10:53:41,252 WARNING [train.py:1204] (2/4) Exclude cut with ID _9679_210_7497_1_1529129212906_4363929_512-313889-0_sp0.9 from training. Duration: 0.9 2023-10-03 10:53:41,270 WARNING [train.py:1204] (2/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_18-189198-0 from training. Duration: 0.9 2023-10-03 10:53:45,831 WARNING [train.py:1204] (2/4) Exclude cut with ID _49755_210_18053_1_1534581005846_3218709_137-64179-0_sp1.1 from training. Duration: 0.92725 2023-10-03 10:53:47,168 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_48049_210_5632_1_1533864553_3519937_306-283794-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 10:53:50,388 WARNING [train.py:1204] (2/4) Exclude cut with ID _43552_210_8913_1_1533726031059_3523429_25-120161-0_sp1.1 from training. Duration: 0.8909375 2023-10-03 10:53:50,447 WARNING [train.py:1204] (2/4) Exclude cut with ID _8833_210_5219_1_1528592665210_7553059_319-349181-0 from training. Duration: 0.99 2023-10-03 10:53:50,688 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=1241500.0, ans=0.1 2023-10-03 10:53:51,959 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_296-332325-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 10:53:52,053 WARNING [train.py:1204] (2/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_436-186311-0_sp1.1 from training. Duration: 0.5909375 2023-10-03 10:53:53,291 WARNING [train.py:1204] (2/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_348-127789-0 from training. Duration: 0.85 2023-10-03 10:53:53,296 WARNING [train.py:1204] (2/4) Exclude cut with ID _3992_210_1747_1_1526644816031_3731060_443-195183-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 10:53:58,848 WARNING [train.py:1204] (2/4) Exclude cut with ID _3465_210_3525_1_1526175667060_3574049_308-231735-0_sp1.1 from training. Duration: 0.663625 2023-10-03 10:54:00,896 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.3.encoder.layers.2.conv_module2.whiten, num_groups=1, num_channels=512, metric=4.13 vs. limit=15.0 2023-10-03 10:54:04,095 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6786_210_1388_1_1527839699_4194495_378-46070-0_sp1.1 from training. Duration: 0.6545625 2023-10-03 10:54:04,141 WARNING [train.py:1204] (2/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_63-101996-0 from training. Duration: 0.88 2023-10-03 10:54:06,885 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_2541_210_1794_1_1525590269_3084765_156-173713-0 from training. Duration: 0.81 2023-10-03 10:54:06,918 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_217-239796-0_sp1.1 from training. Duration: 0.963625 2023-10-03 10:54:09,641 WARNING [train.py:1204] (2/4) Exclude cut with ID _21878_210_3525_1_1531738772002_7264330_530-329400-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 10:54:09,794 WARNING [train.py:1204] (2/4) Exclude cut with ID _21783_210_11357_1_1531742458730_3762679_150-62920-0_sp1.1 from training. Duration: 0.963625 2023-10-03 10:54:09,794 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_5522_210_3273_1_1527249606_3909096_366-172508-0 from training. Duration: 0.66 2023-10-03 10:54:09,797 WARNING [train.py:1204] (2/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_303-340446-0_sp1.1 from training. Duration: 0.8181875 2023-10-03 10:54:12,494 WARNING [train.py:1204] (2/4) Exclude cut with ID _8699_210_2069_1_1528630346831_3960409_160-24408-0_sp1.1 from training. Duration: 0.82725 2023-10-03 10:54:13,876 WARNING [train.py:1204] (2/4) Exclude cut with ID _41314_210_12651_1_1533459476543_3766460_299-187801-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 10:54:15,528 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_34886_210_10633_1_1533203882_1755661_92-345288-0_sp1.1 from training. Duration: 0.97275 2023-10-03 10:54:17,638 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.2.encoder.layers.1.feed_forward1.out_whiten, num_groups=1, num_channels=384, metric=6.20 vs. limit=15.0 2023-10-03 10:54:20,129 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_5438_210_1794_1_1527336687_2600009_104-288274-0_sp1.1 from training. Duration: 0.52725 2023-10-03 10:54:20,133 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_154-316450-0 from training. Duration: 0.76 2023-10-03 10:54:21,476 WARNING [train.py:1204] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_23-189234-0_sp0.9 from training. Duration: 0.92225 2023-10-03 10:54:24,293 WARNING [train.py:1204] (2/4) Exclude cut with ID _4779_210_2084_1_1527318022944_3914893_346-168820-0_sp1.1 from training. Duration: 0.963625 2023-10-03 10:54:24,402 WARNING [train.py:1204] (2/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_5-55893-0 from training. Duration: 0.55 2023-10-03 10:54:25,689 WARNING [train.py:1204] (2/4) Exclude cut with ID _20455_210_5219_1_1531565996775_8098100_180-24517-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 10:54:29,813 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_3133_210_2087_1_1526106446_2324690_243-143119-0_sp0.9 from training. Duration: 0.8555625 2023-10-03 10:54:32,899 WARNING [train.py:1204] (2/4) Exclude cut with ID _65585_210_12210_1_1535450451584_3764098_293-212045-0_sp1.1 from training. Duration: 0.92725 2023-10-03 10:54:32,903 WARNING [train.py:1204] (2/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_577-332600-0 from training. Duration: 0.57 2023-10-03 10:54:36,190 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.1.conv_skip_rate, batch_count=1241700.0, ans=0.0 2023-10-03 10:54:37,351 WARNING [train.py:1204] (2/4) Exclude cut with ID _30039_210_13270_1_1532484612900_2725150_104-289874-0_sp1.1 from training. Duration: 0.963625 2023-10-03 10:54:37,366 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_3172_210_2087_1_1526292929_3987865_197-97762-0_sp1.1 from training. Duration: 0.6818125 2023-10-03 10:54:38,864 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.feed_forward1.hidden_balancer.prob, batch_count=1241766.6666666667, ans=0.125 2023-10-03 10:54:40,128 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_256-150908-0_sp1.1 from training. Duration: 0.963625 2023-10-03 10:54:41,574 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_296-287678-0_sp1.1 from training. Duration: 0.863625 2023-10-03 10:54:41,621 WARNING [train.py:1204] (2/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_443-30034-0 from training. Duration: 0.64 2023-10-03 10:54:41,646 WARNING [train.py:1204] (2/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_494-199538-0_sp0.9 from training. Duration: 0.8333125 2023-10-03 10:54:42,934 WARNING [train.py:1204] (2/4) Exclude cut with ID _19708_210_9206_1_1532136650733_4238364_61-162325-0_sp1.1 from training. Duration: 0.97275 2023-10-03 10:54:44,263 WARNING [train.py:1204] (2/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_475-127455-0 from training. Duration: 0.7 2023-10-03 10:54:47,362 WARNING [train.py:1204] (2/4) Exclude cut with ID _48677_210_5663_1_1533902405919_3892989_188-11581-0_sp1.1 from training. Duration: 0.963625 2023-10-03 10:54:47,384 WARNING [train.py:1204] (2/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_136-8381-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 10:54:48,828 WARNING [train.py:1204] (2/4) Exclude cut with ID _55328_210_27767_1_1534580973260_4088170_270-229555-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 10:54:48,856 WARNING [train.py:1204] (2/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_372-266951-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 10:54:48,915 WARNING [train.py:1204] (2/4) Exclude cut with ID _47790_210_16187_1_1533862814066_4207519_14-242149-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 10:54:53,792 WARNING [train.py:1204] (2/4) Exclude cut with ID _7877_210_6014_1_1528286358099_3750136_159-76512-0_sp1.1 from training. Duration: 0.936375 2023-10-03 10:54:53,797 WARNING [train.py:1204] (2/4) Exclude cut with ID _8263_210_1664_1_1528438953494_294100_40-204731-0_sp1.1 from training. Duration: 0.4545625 2023-10-03 10:54:55,179 INFO [train.py:1046] (2/4) Epoch 36, batch 350, loss[loss=0.1662, simple_loss=0.2484, pruned_loss=0.04196, over 23283.00 frames. ], tot_loss[loss=0.1586, simple_loss=0.2375, pruned_loss=0.03985, over 3905581.21 frames. ], batch size: 105, lr: 2.85e-03, grad_scale: 8.0 2023-10-03 10:54:55,938 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.2.encoder.layers.0.self_attn_weights.whiten_keys, num_groups=4, num_channels=128, metric=3.66 vs. limit=6.0 2023-10-03 10:54:56,649 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8278_210_2550_1_1528637099_4220413_694-238413-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 10:55:00,919 WARNING [train.py:1204] (2/4) Exclude cut with ID _47278_210_12041_1_1533902418349_3576110_421-172710-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 10:55:01,025 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.0.bypass.skip_rate, batch_count=1241833.3333333333, ans=0.035 2023-10-03 10:55:06,061 WARNING [train.py:1204] (2/4) Exclude cut with ID _14970_210_15709_1_1530773543493_1715730_215-282680-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 10:55:06,103 WARNING [train.py:1204] (2/4) Exclude cut with ID _64639_210_5816_1_1535367619437_3528000_306-149449-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 10:55:08,757 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_28-291982-0 from training. Duration: 0.97 2023-10-03 10:55:10,125 WARNING [train.py:1204] (2/4) Exclude cut with ID _55134_210_21982_1_1534590028377_3882170_412-15870-0_sp1.1 from training. Duration: 0.936375 2023-10-03 10:55:10,151 WARNING [train.py:1204] (2/4) Exclude cut with ID _13684_210_7497_1_1530619354863_3598359_312-235606-0 from training. Duration: 0.6 2023-10-03 10:55:11,753 WARNING [train.py:1204] (2/4) Exclude cut with ID _37112_210_2575_1_1532997007673_7268610_347-238993-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 10:55:13,004 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.545e+02 1.890e+02 2.063e+02 2.336e+02 3.358e+02, threshold=4.126e+02, percent-clipped=0.0 2023-10-03 10:55:13,069 WARNING [train.py:1204] (2/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_121-284152-0 from training. Duration: 0.59 2023-10-03 10:55:13,126 WARNING [train.py:1204] (2/4) Exclude cut with ID _2941_210_2577_1_1526199673150_2489759_228-2961-0_sp1.1 from training. Duration: 0.97275 2023-10-03 10:55:17,629 WARNING [train.py:1204] (2/4) Exclude cut with ID _28323_210_16187_1_1532480395667_4100300_531-24157-0 from training. Duration: 0.98 2023-10-03 10:55:17,745 WARNING [train.py:1204] (2/4) Exclude cut with ID _32704_210_8954_1_1533017488840_6747370_266-329755-0_sp0.9 from training. Duration: 0.988875 2023-10-03 10:55:20,459 WARNING [train.py:1204] (2/4) Exclude cut with ID _6653_210_2629_1_1527919172441_6732679_1038-146421-0_sp1.1 from training. Duration: 0.97275 2023-10-03 10:55:22,446 WARNING [train.py:1204] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_391-22148-0_sp0.9 from training. Duration: 0.8555625 2023-10-03 10:55:22,553 WARNING [train.py:1204] (2/4) Exclude cut with ID _64521_210_14699_1_1535243427259_3815328_543-341117-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 10:55:22,564 WARNING [train.py:1204] (2/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_52-168712-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 10:55:23,871 WARNING [train.py:1204] (2/4) Exclude cut with ID _33920_210_4220_1_1533257851500_3084399_198-334063-0_sp1.1 from training. Duration: 0.8545625 2023-10-03 10:55:23,885 WARNING [train.py:1204] (2/4) Exclude cut with ID _50093_210_4220_1_1534417141207_3432299_69-111987-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 10:55:25,193 WARNING [train.py:1204] (2/4) Exclude cut with ID _14821_210_8750_1_1530680503991_6829359_189-297451-0_sp0.9 from training. Duration: 0.611125 2023-10-03 10:55:26,653 WARNING [train.py:1204] (2/4) Exclude cut with ID _8030_210_7497_1_1528444886781_3531089_262-86750-0_sp0.9 from training. Duration: 0.9 2023-10-03 10:55:26,660 WARNING [train.py:1204] (2/4) Exclude cut with ID _24612_210_8913_1_1531998054193_3806359_294-191196-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 10:55:32,402 WARNING [train.py:1204] (2/4) Exclude cut with ID _16909_210_5724_1_1531292079912_4118919_707-342896-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 10:55:32,416 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_9460_210_4211_1_1528954587_10049667_572-37035-0_sp0.9 from training. Duration: 0.888875 2023-10-03 10:55:34,322 WARNING [train.py:1204] (2/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_762-109886-0_sp1.1 from training. Duration: 0.82725 2023-10-03 10:55:35,607 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_14953_210_2151_1_1530701710_5616165_519-326393-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 10:55:40,478 WARNING [train.py:1204] (2/4) Exclude cut with ID _25914_210_6400_1_1532141317058_644740_52-96808-0 from training. Duration: 0.91 2023-10-03 10:55:40,484 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_68095_210_20185_1_1535718226_8888855_1079-94525-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 10:55:45,972 WARNING [train.py:1204] (2/4) Exclude cut with ID _14860_210_4915_1_1530702020340_7343240_746-3647-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 10:55:45,973 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_70379_210_7925_1_1535941455_557455_49-101720-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 10:55:45,992 WARNING [train.py:1204] (2/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_8-307291-0_sp1.1 from training. Duration: 0.936375 2023-10-03 10:55:49,097 WARNING [train.py:1204] (2/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_117-290916-0 from training. Duration: 0.51 2023-10-03 10:55:50,439 WARNING [train.py:1204] (2/4) Exclude cut with ID _22020_210_2461_1_1531724164513_6326900_944-339840-0_sp1.1 from training. Duration: 0.963625 2023-10-03 10:55:51,700 WARNING [train.py:1204] (2/4) Exclude cut with ID IC0515W0043-254911 from training. Duration: 0.976 2023-10-03 10:55:51,814 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_3910_210_1794_1_1526742306_334530_28-238909-0 from training. Duration: 0.81 2023-10-03 10:55:51,829 WARNING [train.py:1204] (2/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_269-267275-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 10:55:55,416 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.conv_module2.balancer1.prob, batch_count=1242100.0, ans=0.125 2023-10-03 10:55:56,438 WARNING [train.py:1204] (2/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_72-251255-0_sp1.1 from training. Duration: 0.8545625 2023-10-03 10:55:56,457 WARNING [train.py:1204] (2/4) Exclude cut with ID _14886_210_4220_1_1530702020355_3516190_175-229073-0 from training. Duration: 0.45 2023-10-03 10:55:57,868 WARNING [train.py:1204] (2/4) Exclude cut with ID _40519_210_13794_1_1533715228655_7337390_921-209452-0_sp1.1 from training. Duration: 0.963625 2023-10-03 10:55:59,421 WARNING [train.py:1204] (2/4) Exclude cut with ID _29857_210_20034_1_1532395886753_3798160_335-163562-0_sp0.9 from training. Duration: 0.9444375 2023-10-03 10:56:00,909 WARNING [train.py:1204] (2/4) Exclude cut with ID _58583_210_5236_1_1534726835266_3574249_384-58445-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 10:56:02,311 WARNING [train.py:1204] (2/4) Exclude cut with ID _44011_210_2046_1_1533722401691_3631310_96-315656-0_sp1.1 from training. Duration: 0.963625 2023-10-03 10:56:02,315 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_68484_210_1794_1_1535849804_7678097_549-170613-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 10:56:05,526 WARNING [train.py:1204] (2/4) Exclude cut with ID _37892_210_19596_1_1533198677897_3964058_214-148058-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 10:56:06,971 WARNING [train.py:1204] (2/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_415-78792-0_sp0.9 from training. Duration: 0.9 2023-10-03 10:56:08,424 WARNING [train.py:1204] (2/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_383-205833-0_sp1.1 from training. Duration: 0.863625 2023-10-03 10:56:10,303 INFO [train.py:1046] (2/4) Epoch 36, batch 400, loss[loss=0.1521, simple_loss=0.2401, pruned_loss=0.03201, over 24657.00 frames. ], tot_loss[loss=0.1585, simple_loss=0.2377, pruned_loss=0.03963, over 4095635.08 frames. ], batch size: 65, lr: 2.85e-03, grad_scale: 16.0 2023-10-03 10:56:10,411 WARNING [train.py:1204] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_167-137255-0 from training. Duration: 0.64 2023-10-03 10:56:10,420 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8275_210_4198_1_1528455320_3892571_484-154786-0_sp1.1 from training. Duration: 0.963625 2023-10-03 10:56:10,472 WARNING [train.py:1204] (2/4) Exclude cut with ID _40284_210_2084_1_1533513679814_3704279_396-319412-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 10:56:12,011 WARNING [train.py:1204] (2/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_280-120942-0_sp0.9 from training. Duration: 0.8555625 2023-10-03 10:56:13,345 WARNING [train.py:1204] (2/4) Exclude cut with ID _45630_210_10556_1_1533949174498_3902049_558-240889-0_sp1.1 from training. Duration: 0.92725 2023-10-03 10:56:14,770 WARNING [train.py:1204] (2/4) Exclude cut with ID _56235_210_10640_1_1534557335235_4161099_197-307458-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 10:56:16,442 WARNING [train.py:1204] (2/4) Exclude cut with ID _16909_210_5724_1_1531292079912_4118919_163-254843-0_sp1.1 from training. Duration: 0.92725 2023-10-03 10:56:17,858 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_103-84010-0 from training. Duration: 0.69 2023-10-03 10:56:18,093 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.bypass.scale_min, batch_count=1242166.6666666667, ans=0.2 2023-10-03 10:56:18,472 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.2.encoder.layers.1.nonlin_attention.whiten1, num_groups=1, num_channels=288, metric=6.01 vs. limit=10.0 2023-10-03 10:56:20,657 WARNING [train.py:1204] (2/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_401-158239-0 from training. Duration: 0.71 2023-10-03 10:56:20,658 WARNING [train.py:1204] (2/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_899-233658-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 10:56:22,485 WARNING [train.py:1204] (2/4) Exclude cut with ID _23825_210_13449_1_1531915313526_3497150_240-271367-0 from training. Duration: 0.97 2023-10-03 10:56:22,548 WARNING [train.py:1204] (2/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_424-134619-0_sp1.1 from training. Duration: 0.92725 2023-10-03 10:56:26,707 WARNING [train.py:1204] (2/4) Exclude cut with ID _44831_210_13427_1_1533816011549_3689035_32-188147-0_sp1.1 from training. Duration: 0.87275 2023-10-03 10:56:26,710 WARNING [train.py:1204] (2/4) Exclude cut with ID _59170_210_6721_1_1534983633960_4203335_314-63879-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 10:56:26,723 WARNING [train.py:1204] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_602-275260-0 from training. Duration: 0.82 2023-10-03 10:56:28,056 WARNING [train.py:1204] (2/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_511-306767-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 10:56:28,085 WARNING [train.py:1204] (2/4) Exclude cut with ID _41817_210_18835_1_1533434477160_7423049_565-201775-0_sp1.1 from training. Duration: 0.92725 2023-10-03 10:56:28,095 WARNING [train.py:1204] (2/4) Exclude cut with ID _28317_210_12742_1_1532480302381_8510325_778-173151-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 10:56:28,149 WARNING [train.py:1204] (2/4) Exclude cut with ID _14998_210_5219_1_1530705582080_7474970_680-51001-0_sp1.1 from training. Duration: 0.963625 2023-10-03 10:56:30,819 WARNING [train.py:1204] (2/4) Exclude cut with ID IC0677W0059-334891 from training. Duration: 0.896 2023-10-03 10:56:30,867 WARNING [train.py:1204] (2/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_370-331600-0 from training. Duration: 0.94 2023-10-03 10:56:35,695 WARNING [train.py:1204] (2/4) Exclude cut with ID _66840_210_5219_1_1535452168426_4196349_446-240192-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 10:56:37,172 WARNING [train.py:1204] (2/4) Exclude cut with ID _65242_210_18628_1_1535369393717_3823149_38-6732-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 10:56:37,340 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.conv_skip_rate, batch_count=1242233.3333333333, ans=0.0 2023-10-03 10:56:38,521 WARNING [train.py:1204] (2/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_302-2550-0 from training. Duration: 0.81 2023-10-03 10:56:40,373 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_43802_210_2290_1_1533516203_8829504_100-348441-0 from training. Duration: 0.91 2023-10-03 10:56:41,857 WARNING [train.py:1204] (2/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_329-285801-0_sp1.1 from training. Duration: 0.82725 2023-10-03 10:56:42,089 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.feed_forward2.hidden_balancer.prob, batch_count=1242300.0, ans=0.125 2023-10-03 10:56:43,310 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_30167_210_3528_1_1532428012_4772375_343-334712-0_sp1.1 from training. Duration: 0.936375 2023-10-03 10:56:48,052 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=1242300.0, ans=0.1 2023-10-03 10:56:51,392 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.1.encoder.layers.1.self_attn1.whiten, num_groups=1, num_channels=256, metric=12.69 vs. limit=22.5 2023-10-03 10:56:51,882 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7543_210_6753_1_1528543318_4552_1-85072-0 from training. Duration: 0.83 2023-10-03 10:56:54,684 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7002_210_1794_1_1527935634_4034444_487-236501-0_sp1.1 from training. Duration: 0.6 2023-10-03 10:56:56,171 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.0.feed_forward1.out_proj.dropout_p, batch_count=1242366.6666666667, ans=0.1 2023-10-03 10:56:57,931 WARNING [train.py:1204] (2/4) Exclude cut with ID _7160_210_1876_1_1527940923231_3703799_282-322611-0 from training. Duration: 0.87 2023-10-03 10:56:58,503 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.5.encoder.layers.0.nonlin_attention.whiten2, num_groups=1, num_channels=256, metric=3.46 vs. limit=15.0 2023-10-03 10:56:59,359 WARNING [train.py:1204] (2/4) Exclude cut with ID _67335_210_5219_1_1535525975652_4275229_570-262983-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 10:56:59,469 WARNING [train.py:1204] (2/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_625-338546-0_sp1.1 from training. Duration: 0.8 2023-10-03 10:57:00,664 WARNING [train.py:1204] (2/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_176-56120-0 from training. Duration: 0.83 2023-10-03 10:57:03,364 WARNING [train.py:1204] (2/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_963-187109-0_sp1.1 from training. Duration: 0.9 2023-10-03 10:57:05,256 WARNING [train.py:1204] (2/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_121-284152-0_sp0.9 from training. Duration: 0.6555625 2023-10-03 10:57:06,679 WARNING [train.py:1204] (2/4) Exclude cut with ID _45021_210_9994_1_1533619751094_3330288_104-81711-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 10:57:09,626 WARNING [train.py:1204] (2/4) Exclude cut with ID _51382_210_23862_1_1534492801838_3650104_129-343914-0_sp1.1 from training. Duration: 0.936375 2023-10-03 10:57:11,475 WARNING [train.py:1204] (2/4) Exclude cut with ID _14970_210_15709_1_1530775547361_1966965_8-169924-0 from training. Duration: 0.92 2023-10-03 10:57:12,912 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_2295_210_2087_1_1525500993_2407863_234-83104-0_sp1.1 from training. Duration: 0.563625 2023-10-03 10:57:14,373 WARNING [train.py:1204] (2/4) Exclude cut with ID _14687_210_5854_1_1530703838259_3664889_115-239046-0 from training. Duration: 0.67 2023-10-03 10:57:17,035 WARNING [train.py:1204] (2/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_690-126306-0_sp1.1 from training. Duration: 0.7090625 2023-10-03 10:57:17,050 WARNING [train.py:1204] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_625-102955-0_sp0.9 from training. Duration: 0.8555625 2023-10-03 10:57:19,870 WARNING [train.py:1204] (2/4) Exclude cut with ID _9389_210_1664_1_1529217337301_5289269_289-213417-0 from training. Duration: 0.89 2023-10-03 10:57:21,857 WARNING [train.py:1204] (2/4) Exclude cut with ID _13253_210_1925_1_1530757775819_3577873_191-144977-0_sp0.9 from training. Duration: 0.8666875 2023-10-03 10:57:21,912 WARNING [train.py:1204] (2/4) Exclude cut with ID _21783_210_11357_1_1531742458730_3762679_325-155703-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 10:57:21,956 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_2295_210_2087_1_1525500993_2407863_116-238894-0_sp0.9 from training. Duration: 0.77775 2023-10-03 10:57:22,062 WARNING [train.py:1204] (2/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_94-126128-0 from training. Duration: 0.86 2023-10-03 10:57:23,373 WARNING [train.py:1204] (2/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_395-310220-0_sp0.9 from training. Duration: 0.988875 2023-10-03 10:57:24,719 INFO [train.py:1046] (2/4) Epoch 36, batch 450, loss[loss=0.1704, simple_loss=0.2486, pruned_loss=0.04611, over 23814.00 frames. ], tot_loss[loss=0.1591, simple_loss=0.2386, pruned_loss=0.03984, over 4240323.51 frames. ], batch size: 212, lr: 2.85e-03, grad_scale: 16.0 2023-10-03 10:57:24,780 WARNING [train.py:1204] (2/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_821-110852-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 10:57:24,842 WARNING [train.py:1204] (2/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_64-248359-0_sp1.1 from training. Duration: 0.62725 2023-10-03 10:57:24,852 WARNING [train.py:1204] (2/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_138-187031-0 from training. Duration: 0.99 2023-10-03 10:57:24,896 WARNING [train.py:1204] (2/4) Exclude cut with ID _4122_210_2084_1_1526713164417_4353280_528-206771-0_sp0.9 from training. Duration: 0.97775 2023-10-03 10:57:27,506 WARNING [train.py:1204] (2/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_844-96989-0_sp1.1 from training. Duration: 0.6454375 2023-10-03 10:57:29,416 WARNING [train.py:1204] (2/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_106-30782-0_sp1.1 from training. Duration: 0.72725 2023-10-03 10:57:39,735 WARNING [train.py:1204] (2/4) Exclude cut with ID _14683_210_5219_1_1530698352608_3974859_281-250040-0_sp1.1 from training. Duration: 0.936375 2023-10-03 10:57:39,770 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_46013_210_7234_1_1533776362_3744032_463-52378-0_sp0.9 from training. Duration: 0.9555625 2023-10-03 10:57:42,358 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.565e+02 1.918e+02 2.094e+02 2.356e+02 3.401e+02, threshold=4.188e+02, percent-clipped=0.0 2023-10-03 10:57:42,494 WARNING [train.py:1204] (2/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_299-34413-0 from training. Duration: 0.65 2023-10-03 10:57:43,184 WARNING [train.py:1204] (2/4) Exclude cut with ID _35799_210_5854_1_1533263487887_3529071_490-326847-0 from training. Duration: 0.99 2023-10-03 10:57:47,389 WARNING [train.py:1204] (2/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_589-9070-0_sp0.9 from training. Duration: 0.788875 2023-10-03 10:57:48,816 WARNING [train.py:1204] (2/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_884-29515-0_sp1.1 from training. Duration: 0.936375 2023-10-03 10:57:51,345 WARNING [train.py:1204] (2/4) Exclude cut with ID _15012_210_1968_1_1530856820429_5095350_474-119995-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 10:57:56,087 WARNING [train.py:1204] (2/4) Exclude cut with ID _65585_210_12210_1_1535450451584_3764098_211-97124-0_sp1.1 from training. Duration: 0.963625 2023-10-03 10:57:57,487 WARNING [train.py:1204] (2/4) Exclude cut with ID _33640_210_2655_1_1533117605479_4855469_666-224463-0_sp1.1 from training. Duration: 0.963625 2023-10-03 10:57:57,831 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.nonlin_attention.balancer.max_positive, batch_count=1242633.3333333333, ans=0.95 2023-10-03 10:57:58,937 WARNING [train.py:1204] (2/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_35-153886-0 from training. Duration: 0.84 2023-10-03 10:58:00,200 WARNING [train.py:1204] (2/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_210-106047-0 from training. Duration: 0.97 2023-10-03 10:58:00,337 WARNING [train.py:1204] (2/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_479-125611-0 from training. Duration: 0.96 2023-10-03 10:58:00,357 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_9417_210_1794_1_1529235261_4614131_427-153963-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 10:58:01,837 WARNING [train.py:1204] (2/4) Exclude cut with ID _23818_210_6228_1_1531917063884_3676920_133-170571-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 10:58:03,710 WARNING [train.py:1204] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_602-275260-0_sp1.1 from training. Duration: 0.7454375 2023-10-03 10:58:05,038 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9029W0121-509521 from training. Duration: 0.896 2023-10-03 10:58:05,053 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9029W0434-509821 from training. Duration: 0.64 2023-10-03 10:58:06,408 WARNING [train.py:1204] (2/4) Exclude cut with ID _23038_210_9570_1_1532001584158_3652419_289-307034-0_sp1.1 from training. Duration: 0.936375 2023-10-03 10:58:08,280 WARNING [train.py:1204] (2/4) Exclude cut with ID _29345_210_4381_1_1532689168263_3860169_149-257268-0_sp0.9 from training. Duration: 0.9 2023-10-03 10:58:08,359 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_405-339986-0_sp1.1 from training. Duration: 0.463625 2023-10-03 10:58:08,696 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.conv_module1.balancer2.min_abs, batch_count=1242700.0, ans=0.5 2023-10-03 10:58:11,245 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_2655_210_2087_1_1525688959_3897400_437-337690-0_sp1.1 from training. Duration: 0.57275 2023-10-03 10:58:11,269 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7543_210_6753_1_1528541907_1399322_165-323619-0_sp1.1 from training. Duration: 0.62725 2023-10-03 10:58:12,588 WARNING [train.py:1204] (2/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_508-335369-0_sp1.1 from training. Duration: 0.436375 2023-10-03 10:58:12,645 WARNING [train.py:1204] (2/4) Exclude cut with ID _7852_210_5219_1_1528286471316_8139400_358-287492-0 from training. Duration: 0.96 2023-10-03 10:58:12,843 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.bypass.skip_rate, batch_count=1242700.0, ans=0.07 2023-10-03 10:58:14,718 WARNING [train.py:1204] (2/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_322-217220-0_sp0.9 from training. Duration: 0.9555625 2023-10-03 10:58:17,539 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_3171_210_2087_1_1526109084_2330007_66-260583-0_sp0.9 from training. Duration: 0.8 2023-10-03 10:58:17,567 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8558_210_1410_1_1528505225_7613406_131-329524-0_sp1.1 from training. Duration: 0.7181875 2023-10-03 10:58:19,037 WARNING [train.py:1204] (2/4) Exclude cut with ID _9235_210_5219_1_1528891154253_8046120_30-326681-0 from training. Duration: 0.83 2023-10-03 10:58:23,176 WARNING [train.py:1204] (2/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_58-68956-0_sp0.9 from training. Duration: 0.92225 2023-10-03 10:58:23,242 WARNING [train.py:1204] (2/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_255-312654-0 from training. Duration: 0.91 2023-10-03 10:58:25,183 WARNING [train.py:1204] (2/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_99-167392-0 from training. Duration: 0.61 2023-10-03 10:58:25,413 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.feed_forward1.out_proj.dropout_p, batch_count=1242766.6666666667, ans=0.1 2023-10-03 10:58:26,460 WARNING [train.py:1204] (2/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_94-126128-0_sp0.9 from training. Duration: 0.9555625 2023-10-03 10:58:31,862 WARNING [train.py:1204] (2/4) Exclude cut with ID _68635_210_9317_1_1535770813410_4315870_187-192817-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 10:58:31,979 WARNING [train.py:1204] (2/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_277-204970-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 10:58:35,204 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6163_210_6400_1_1527506746_4263068_283-137811-0_sp0.9 from training. Duration: 0.8555625 2023-10-03 10:58:35,233 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9144W0121-566446 from training. Duration: 0.896 2023-10-03 10:58:39,707 INFO [train.py:1046] (2/4) Epoch 36, batch 500, loss[loss=0.1693, simple_loss=0.2504, pruned_loss=0.04414, over 23087.00 frames. ], tot_loss[loss=0.1597, simple_loss=0.2394, pruned_loss=0.03997, over 4344249.08 frames. ], batch size: 105, lr: 2.85e-03, grad_scale: 16.0 2023-10-03 10:58:39,813 WARNING [train.py:1204] (2/4) Exclude cut with ID _49371_210_12071_1_1533952761021_3812190_351-315519-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 10:58:39,869 WARNING [train.py:1204] (2/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_115-326778-0_sp1.1 from training. Duration: 0.8454375 2023-10-03 10:58:41,211 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_17292_210_6753_1_1531125388_2943734_436-198965-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 10:58:42,474 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9165W0117-576751 from training. Duration: 0.896 2023-10-03 10:58:43,855 WARNING [train.py:1204] (2/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_212-149947-0 from training. Duration: 0.73 2023-10-03 10:58:43,865 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_13757_210_3528_1_1530440682_4107239_65-167544-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 10:58:45,390 WARNING [train.py:1204] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_700-183549-0_sp0.9 from training. Duration: 0.7555625 2023-10-03 10:58:49,954 WARNING [train.py:1204] (2/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_216-343007-0_sp0.9 from training. Duration: 0.7333125 2023-10-03 10:58:51,278 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7544_210_6753_1_1528587722_3576686_295-258479-0_sp1.1 from training. Duration: 0.763625 2023-10-03 10:58:52,671 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_365-287106-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 10:58:53,980 WARNING [train.py:1204] (2/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_300-3747-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 10:58:54,040 WARNING [train.py:1204] (2/4) Exclude cut with ID _14069_210_5632_1_1530512692033_4803390_487-303201-0_sp1.1 from training. Duration: 0.92725 2023-10-03 10:58:54,317 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.1.balancer2.prob, batch_count=1242900.0, ans=0.125 2023-10-03 10:59:06,125 WARNING [train.py:1204] (2/4) Exclude cut with ID _14459_210_2228_1_1531443641946_7516700_668-123026-0_sp1.1 from training. Duration: 0.936375 2023-10-03 10:59:06,162 WARNING [train.py:1204] (2/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_108-53929-0_sp1.1 from training. Duration: 0.536375 2023-10-03 10:59:06,189 WARNING [train.py:1204] (2/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_266-137104-0_sp0.9 from training. Duration: 0.72225 2023-10-03 10:59:06,226 WARNING [train.py:1204] (2/4) Exclude cut with ID _12302_210_5219_1_1529924446466_8107280_902-168619-0_sp1.1 from training. Duration: 0.936375 2023-10-03 10:59:07,541 WARNING [train.py:1204] (2/4) Exclude cut with ID _59502_210_12210_1_1535107024698_1835139_146-162495-0 from training. Duration: 0.92 2023-10-03 10:59:07,564 WARNING [train.py:1204] (2/4) Exclude cut with ID _3465_210_3525_1_1526175667060_3574049_412-287526-0_sp1.1 from training. Duration: 0.6545625 2023-10-03 10:59:10,847 WARNING [train.py:1204] (2/4) Exclude cut with ID _7160_210_1876_1_1527940923231_3703799_120-150943-0_sp0.9 from training. Duration: 0.9 2023-10-03 10:59:12,201 WARNING [train.py:1204] (2/4) Exclude cut with ID _15037_210_2253_1_1530752294104_3267650_296-192597-0_sp1.1 from training. Duration: 0.736375 2023-10-03 10:59:12,221 WARNING [train.py:1204] (2/4) Exclude cut with ID _13252_210_1925_1_1530671372047_3336909_397-216497-0_sp1.1 from training. Duration: 0.87275 2023-10-03 10:59:12,245 WARNING [train.py:1204] (2/4) Exclude cut with ID _40094_210_16380_1_1533294005773_3641159_76-269320-0_sp1.1 from training. Duration: 0.936375 2023-10-03 10:59:12,307 WARNING [train.py:1204] (2/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_449-15508-0 from training. Duration: 0.82 2023-10-03 10:59:15,105 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9309W0194-640321 from training. Duration: 0.64 2023-10-03 10:59:15,961 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.1.encoder.layers.0.conv_module1.whiten, num_groups=1, num_channels=256, metric=10.05 vs. limit=15.0 2023-10-03 10:59:17,834 WARNING [train.py:1204] (2/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_284-312178-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 10:59:19,197 WARNING [train.py:1204] (2/4) Exclude cut with ID _29853_210_7497_1_1532505600851_6642110_401-55937-0_sp1.1 from training. Duration: 0.92725 2023-10-03 10:59:21,018 WARNING [train.py:1204] (2/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_716-24918-0_sp1.1 from training. Duration: 0.92725 2023-10-03 10:59:21,048 WARNING [train.py:1204] (2/4) Exclude cut with ID _19637_210_4220_1_1531537167676_3205060_314-305638-0_sp1.1 from training. Duration: 0.92725 2023-10-03 10:59:21,084 WARNING [train.py:1204] (2/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_475-127455-0_sp1.1 from training. Duration: 0.636375 2023-10-03 10:59:25,129 WARNING [train.py:1204] (2/4) Exclude cut with ID _5444_210_2151_1_1527418808464_5312210_581-315411-0 from training. Duration: 0.62 2023-10-03 10:59:27,116 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.3.encoder.layers.3.conv_module2.whiten, num_groups=1, num_channels=512, metric=6.91 vs. limit=15.0 2023-10-03 10:59:27,950 WARNING [train.py:1204] (2/4) Exclude cut with ID _44902_210_16187_1_1533888006062_3792078_149-67212-0_sp0.9 from training. Duration: 0.9666875 2023-10-03 10:59:29,388 WARNING [train.py:1204] (2/4) Exclude cut with ID _31602_210_5854_1_1532677021456_4458250_63-63253-0_sp1.1 from training. Duration: 0.963625 2023-10-03 10:59:32,800 WARNING [train.py:1204] (2/4) Exclude cut with ID _55209_210_1531_1_1534675828996_4597160_660-203992-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 10:59:33,305 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.4.encoder.layers.2.nonlin_attention.whiten1, num_groups=1, num_channels=288, metric=4.84 vs. limit=10.0 2023-10-03 10:59:35,534 WARNING [train.py:1204] (2/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_83-190624-0_sp1.1 from training. Duration: 0.92725 2023-10-03 10:59:35,712 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.ff3_skip_rate, batch_count=1243033.3333333333, ans=0.0 2023-10-03 10:59:43,411 WARNING [train.py:1204] (2/4) Exclude cut with ID _58605_210_12210_1_1534723195939_3904259_154-139635-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 10:59:46,241 WARNING [train.py:1204] (2/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_164-224052-0 from training. Duration: 0.7 2023-10-03 10:59:46,249 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_1126-189071-0_sp1.1 from training. Duration: 0.963625 2023-10-03 10:59:46,261 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_15396_210_6890_1_1531308473_1938936_310-214789-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 10:59:49,027 WARNING [train.py:1204] (2/4) Exclude cut with ID _8395_210_1531_1_1528597871523_4411579_431-132470-0 from training. Duration: 0.74 2023-10-03 10:59:49,074 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_3780_210_2087_1_1526467760_2130483_170-259044-0_sp0.9 from training. Duration: 0.77775 2023-10-03 10:59:50,445 WARNING [train.py:1204] (2/4) Exclude cut with ID _12733_210_8341_1_1530167424556_3646380_537-230640-0_sp1.1 from training. Duration: 0.963625 2023-10-03 10:59:53,729 INFO [train.py:1046] (2/4) Epoch 36, batch 550, loss[loss=0.1515, simple_loss=0.2377, pruned_loss=0.03266, over 24463.00 frames. ], tot_loss[loss=0.1602, simple_loss=0.2402, pruned_loss=0.04013, over 4434260.69 frames. ], batch size: 69, lr: 2.85e-03, grad_scale: 16.0 2023-10-03 10:59:56,555 WARNING [train.py:1204] (2/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_159-281310-0 from training. Duration: 0.95 2023-10-03 10:59:57,990 WARNING [train.py:1204] (2/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_434-83000-0 from training. Duration: 0.98 2023-10-03 10:59:58,018 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_4405_210_1849_1_1526732205_3593939_493-32464-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 10:59:58,034 WARNING [train.py:1204] (2/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_342-100157-0 from training. Duration: 0.54 2023-10-03 10:59:59,324 WARNING [train.py:1204] (2/4) Exclude cut with ID _44778_210_2054_1_1533798127203_7443090_765-250133-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 10:59:59,330 WARNING [train.py:1204] (2/4) Exclude cut with ID _38670_210_15379_1_1533448810659_4391059_81-312245-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 11:00:00,614 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_50050_210_16223_1_1535435796_7290138_1273-247611-0_sp1.1 from training. Duration: 0.936375 2023-10-03 11:00:00,676 WARNING [train.py:1204] (2/4) Exclude cut with ID _44831_210_13427_1_1533816011549_3689035_399-18591-0_sp1.1 from training. Duration: 0.936375 2023-10-03 11:00:00,696 WARNING [train.py:1204] (2/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_557-324258-0_sp0.9 from training. Duration: 0.9 2023-10-03 11:00:00,852 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.attention_skip_rate, batch_count=1243166.6666666667, ans=0.0 2023-10-03 11:00:02,081 WARNING [train.py:1204] (2/4) Exclude cut with ID _3465_210_3525_1_1526175667060_3574049_62-335552-0_sp1.1 from training. Duration: 0.7909375 2023-10-03 11:00:05,047 WARNING [train.py:1204] (2/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_673-264125-0_sp1.1 from training. Duration: 0.963625 2023-10-03 11:00:06,386 WARNING [train.py:1204] (2/4) Exclude cut with ID _8030_210_7497_1_1528444886781_3531089_262-86750-0 from training. Duration: 0.81 2023-10-03 11:00:06,400 WARNING [train.py:1204] (2/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_444-28041-0_sp1.1 from training. Duration: 0.77275 2023-10-03 11:00:11,068 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.564e+02 1.849e+02 2.026e+02 2.312e+02 3.706e+02, threshold=4.052e+02, percent-clipped=0.0 2023-10-03 11:00:11,175 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_34008_210_10431_1_1533345424_8183247_516-14858-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 11:00:11,187 WARNING [train.py:1204] (2/4) Exclude cut with ID _32704_210_8954_1_1533017488840_6747370_54-248218-0_sp1.1 from training. Duration: 0.936375 2023-10-03 11:00:14,351 WARNING [train.py:1204] (2/4) Exclude cut with ID _13253_210_1925_1_1530757775819_3577873_300-119782-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 11:00:14,414 WARNING [train.py:1204] (2/4) Exclude cut with ID _39290_210_4220_1_1533862778760_3075118_21-240703-0_sp1.1 from training. Duration: 0.936375 2023-10-03 11:00:14,547 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder_embed.convnext.out_balancer.prob, batch_count=1243233.3333333333, ans=0.125 2023-10-03 11:00:17,378 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.self_attn_weights.pos_emb_skip_rate, batch_count=1243233.3333333333, ans=0.0 2023-10-03 11:00:18,551 WARNING [train.py:1204] (2/4) Exclude cut with ID 774-127930-0014-10412-0_sp1.1 from training. Duration: 0.95 2023-10-03 11:00:19,886 WARNING [train.py:1204] (2/4) Exclude cut with ID _67499_210_17282_1_1535509805220_3692430_20-179346-0 from training. Duration: 0.97 2023-10-03 11:00:19,997 WARNING [train.py:1204] (2/4) Exclude cut with ID _8365_210_2064_1_1528592311070_4677690_431-294102-0_sp0.9 from training. Duration: 0.988875 2023-10-03 11:00:24,617 WARNING [train.py:1204] (2/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_365-1492-0_sp1.1 from training. Duration: 0.82725 2023-10-03 11:00:24,657 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_105-114797-0_sp0.9 from training. Duration: 0.9333125 2023-10-03 11:00:26,061 WARNING [train.py:1204] (2/4) Exclude cut with ID _6274_210_2084_1_1527937583837_7494020_769-168302-0_sp0.9 from training. Duration: 0.788875 2023-10-03 11:00:28,748 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_40340_210_9024_1_1533433916_4132944_320-270277-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 11:00:28,753 WARNING [train.py:1204] (2/4) Exclude cut with ID ID0203W0003-781711 from training. Duration: 0.958 2023-10-03 11:00:30,044 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_30434_210_13049_1_1532519384_981746_52-12932-0_sp1.1 from training. Duration: 0.936375 2023-10-03 11:00:31,862 WARNING [train.py:1204] (2/4) Exclude cut with ID _8263_210_1664_1_1528438953494_294100_40-204731-0_sp0.9 from training. Duration: 0.5555625 2023-10-03 11:00:34,627 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1032-236863-0_sp0.9 from training. Duration: 0.9333125 2023-10-03 11:00:34,673 WARNING [train.py:1204] (2/4) Exclude cut with ID _8394_210_6702_1_1528590504316_3540189_335-274780-0_sp0.9 from training. Duration: 0.8666875 2023-10-03 11:00:34,690 WARNING [train.py:1204] (2/4) Exclude cut with ID _66873_210_5517_1_1535454136506_4293370_654-52713-0_sp1.1 from training. Duration: 0.7 2023-10-03 11:00:34,773 WARNING [train.py:1204] (2/4) Exclude cut with ID _5250_210_5219_1_1527249561806_8692110_742-267856-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 11:00:36,233 WARNING [train.py:1204] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_74-48717-0 from training. Duration: 0.58 2023-10-03 11:00:38,904 WARNING [train.py:1204] (2/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_18-46756-0 from training. Duration: 0.79 2023-10-03 11:00:40,833 WARNING [train.py:1204] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_612-56061-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 11:00:40,847 WARNING [train.py:1204] (2/4) Exclude cut with ID _56423_210_5854_1_1534896061049_3165969_412-2403-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 11:00:42,142 WARNING [train.py:1204] (2/4) Exclude cut with ID _62574_210_2655_1_1535180407058_3933249_120-196848-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 11:00:42,143 WARNING [train.py:1204] (2/4) Exclude cut with ID _40519_210_13794_1_1533715228655_7337390_916-51000-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 11:00:45,495 WARNING [train.py:1204] (2/4) Exclude cut with ID _65263_210_22356_1_1535763598691_3921037_108-21902-0_sp1.1 from training. Duration: 0.9 2023-10-03 11:00:45,583 WARNING [train.py:1204] (2/4) Exclude cut with ID _4122_210_2084_1_1526713164417_4353280_528-206771-0_sp1.1 from training. Duration: 0.8 2023-10-03 11:00:48,386 WARNING [train.py:1204] (2/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_43-325657-0_sp1.1 from training. Duration: 0.92725 2023-10-03 11:00:48,434 WARNING [train.py:1204] (2/4) Exclude cut with ID _44659_210_2721_1_1534036120061_6614860_314-82041-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 11:00:49,757 WARNING [train.py:1204] (2/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_81-300212-0_sp0.9 from training. Duration: 0.6666875 2023-10-03 11:00:51,681 WARNING [train.py:1204] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_665-196155-0_sp0.9 from training. Duration: 0.7444375 2023-10-03 11:00:54,177 WARNING [train.py:1204] (2/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_140-336881-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 11:00:55,636 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6290_210_3273_1_1527595224_1282787_115-87095-0_sp0.9 from training. Duration: 0.611125 2023-10-03 11:00:56,920 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_53373_210_5399_1_1534229471_4715695_607-259685-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 11:00:58,289 WARNING [train.py:1204] (2/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_175-163894-0_sp1.1 from training. Duration: 0.67275 2023-10-03 11:00:58,314 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7002_210_1794_1_1527939683_5157286_1-281264-0_sp1.1 from training. Duration: 0.4 2023-10-03 11:01:05,587 WARNING [train.py:1204] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_605-257205-0 from training. Duration: 0.59 2023-10-03 11:01:06,990 INFO [train.py:1046] (2/4) Epoch 36, batch 600, loss[loss=0.1749, simple_loss=0.259, pruned_loss=0.04537, over 23972.00 frames. ], tot_loss[loss=0.1612, simple_loss=0.2407, pruned_loss=0.04082, over 4481341.87 frames. ], batch size: 86, lr: 2.85e-03, grad_scale: 16.0 2023-10-03 11:01:08,607 WARNING [train.py:1204] (2/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_454-314901-0 from training. Duration: 0.46 2023-10-03 11:01:09,895 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_46032_210_14656_1_1533781412_4507969_287-130422-0_sp1.1 from training. Duration: 0.97275 2023-10-03 11:01:09,918 WARNING [train.py:1204] (2/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_186-309161-0_sp1.1 from training. Duration: 0.4909375 2023-10-03 11:01:09,948 WARNING [train.py:1204] (2/4) Exclude cut with ID _42659_210_2070_1_1533607442765_3120380_45-342307-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 11:01:10,149 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.balancer_ff2.min_abs, batch_count=1243500.0, ans=0.1 2023-10-03 11:01:17,758 WARNING [train.py:1204] (2/4) Exclude cut with ID _12555_210_8750_1_1530075594402_3780239_478-326818-0_sp1.1 from training. Duration: 0.963625 2023-10-03 11:01:19,121 WARNING [train.py:1204] (2/4) Exclude cut with ID _7606_210_4915_1_1528894253277_5080690_127-177761-0_sp1.1 from training. Duration: 0.5818125 2023-10-03 11:01:20,586 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6788_210_1388_1_1527898698_4562021_91-197981-0 from training. Duration: 0.83 2023-10-03 11:01:21,952 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8558_210_1410_1_1528505225_7613406_131-329524-0_sp0.9 from training. Duration: 0.87775 2023-10-03 11:01:25,185 WARNING [train.py:1204] (2/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_153-153537-0_sp1.1 from training. Duration: 0.82725 2023-10-03 11:01:26,617 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_17292_210_6753_1_1531125388_2943734_285-349105-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 11:01:28,074 WARNING [train.py:1204] (2/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_37-41919-0 from training. Duration: 0.87 2023-10-03 11:01:28,105 WARNING [train.py:1204] (2/4) Exclude cut with ID _60860_210_8254_1_1534896015875_3557169_172-294852-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 11:01:31,209 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.conv_module2.balancer2.min_positive, batch_count=1243566.6666666667, ans=0.05 2023-10-03 11:01:32,431 WARNING [train.py:1204] (2/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_146-276944-0 from training. Duration: 0.83 2023-10-03 11:01:35,776 WARNING [train.py:1204] (2/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_1254-44082-0_sp1.1 from training. Duration: 0.936375 2023-10-03 11:01:35,789 WARNING [train.py:1204] (2/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_1115-180350-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 11:01:35,813 WARNING [train.py:1204] (2/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_60-146189-0_sp1.1 from training. Duration: 0.8909375 2023-10-03 11:01:36,040 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.bypass.scale_min, batch_count=1243633.3333333333, ans=0.2 2023-10-03 11:01:43,164 WARNING [train.py:1204] (2/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_517-153649-0_sp1.1 from training. Duration: 0.92725 2023-10-03 11:01:43,177 WARNING [train.py:1204] (2/4) Exclude cut with ID _39571_210_13449_1_1533801584143_3651077_37-266257-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 11:01:43,221 WARNING [train.py:1204] (2/4) Exclude cut with ID _6653_210_2629_1_1527919172441_6732679_527-276118-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 11:01:49,362 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7696_210_2040_1_1528207167_3716099_117-125434-0_sp1.1 from training. Duration: 0.6909375 2023-10-03 11:01:54,174 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_25767_210_2161_1_1532307483_3867748_478-156360-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 11:01:54,178 WARNING [train.py:1204] (2/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_177-49092-0_sp1.1 from training. Duration: 0.82725 2023-10-03 11:01:54,199 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_11305_210_1794_1_1529750428_7178148_680-317073-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 11:02:01,049 WARNING [train.py:1204] (2/4) Exclude cut with ID _53565_210_10635_1_1534337218335_3820259_287-18120-0 from training. Duration: 0.97 2023-10-03 11:02:02,682 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.feed_forward1.out_proj.dropout_p, batch_count=1243700.0, ans=0.1 2023-10-03 11:02:06,979 WARNING [train.py:1204] (2/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_498-40153-0_sp1.1 from training. Duration: 0.57275 2023-10-03 11:02:07,003 WARNING [train.py:1204] (2/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_555-63066-0_sp1.1 from training. Duration: 0.963625 2023-10-03 11:02:11,190 WARNING [train.py:1204] (2/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_395-310220-0 from training. Duration: 0.89 2023-10-03 11:02:12,963 WARNING [train.py:1204] (2/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_937-208281-0_sp1.1 from training. Duration: 0.77275 2023-10-03 11:02:14,562 WARNING [train.py:1204] (2/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_120-211630-0 from training. Duration: 0.92 2023-10-03 11:02:14,592 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_454-276429-0_sp1.1 from training. Duration: 0.7909375 2023-10-03 11:02:15,867 WARNING [train.py:1204] (2/4) Exclude cut with ID _22826_210_9994_1_1531908201522_3279169_404-75889-0_sp1.1 from training. Duration: 0.8090625 2023-10-03 11:02:21,592 INFO [train.py:1046] (2/4) Epoch 36, batch 650, loss[loss=0.1502, simple_loss=0.2281, pruned_loss=0.03617, over 24354.00 frames. ], tot_loss[loss=0.1605, simple_loss=0.2395, pruned_loss=0.0407, over 4513390.28 frames. ], batch size: 61, lr: 2.84e-03, grad_scale: 16.0 2023-10-03 11:02:23,118 WARNING [train.py:1204] (2/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_282-136641-0_sp0.9 from training. Duration: 0.4666875 2023-10-03 11:02:24,844 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_809-17156-0_sp1.1 from training. Duration: 0.663625 2023-10-03 11:02:26,422 WARNING [train.py:1204] (2/4) Exclude cut with ID _9235_210_5219_1_1528891154253_8046120_1003-101638-0_sp0.9 from training. Duration: 0.811125 2023-10-03 11:02:26,525 WARNING [train.py:1204] (2/4) Exclude cut with ID _22826_210_9994_1_1531908201522_3279169_404-75889-0_sp0.9 from training. Duration: 0.988875 2023-10-03 11:02:29,140 WARNING [train.py:1204] (2/4) Exclude cut with ID _59288_210_5854_1_1535077757774_3412246_570-295255-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 11:02:30,714 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.ff2_skip_rate, batch_count=1243833.3333333333, ans=0.0 2023-10-03 11:02:31,033 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.5.encoder.layers.1.feed_forward2.out_whiten, num_groups=1, num_channels=256, metric=10.86 vs. limit=15.0 2023-10-03 11:02:31,884 WARNING [train.py:1204] (2/4) Exclude cut with ID _3465_210_3525_1_1526175667060_3574049_412-287526-0 from training. Duration: 0.72 2023-10-03 11:02:31,945 WARNING [train.py:1204] (2/4) Exclude cut with ID _44010_210_4525_1_1533884262964_4013620_649-79655-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 11:02:38,044 WARNING [train.py:1204] (2/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_57-270037-0_sp0.9 from training. Duration: 0.8555625 2023-10-03 11:02:38,045 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_34815_210_6388_1_1533381320_3571903_302-255963-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 11:02:39,286 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.445e+02 1.874e+02 2.139e+02 2.425e+02 3.850e+02, threshold=4.279e+02, percent-clipped=0.0 2023-10-03 11:02:39,627 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.nonlin_attention.balancer.prob, batch_count=1243900.0, ans=0.125 2023-10-03 11:02:42,051 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_47449_210_5399_1_1534076445_4006929_573-301852-0_sp1.1 from training. Duration: 0.97275 2023-10-03 11:02:45,571 WARNING [train.py:1204] (2/4) Exclude cut with ID 6709-74022-0004-86860-0_sp1.1 from training. Duration: 0.9409375 2023-10-03 11:02:46,972 WARNING [train.py:1204] (2/4) Exclude cut with ID _40519_210_13794_1_1533715228655_7337390_111-71991-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 11:02:47,017 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_30167_210_3528_1_1532428012_4772375_169-214186-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 11:02:50,310 WARNING [train.py:1204] (2/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_20-235313-0_sp1.1 from training. Duration: 0.963625 2023-10-03 11:02:51,564 WARNING [train.py:1204] (2/4) Exclude cut with ID _4681_210_3525_1_1527251040321_1235713_123-24428-0_sp0.9 from training. Duration: 0.5333125 2023-10-03 11:02:53,625 WARNING [train.py:1204] (2/4) Exclude cut with ID _31602_210_5854_1_1532677021456_4458250_444-198752-0_sp1.1 from training. Duration: 0.97275 2023-10-03 11:02:54,701 WARNING [train.py:1204] (2/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_9-279442-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 11:02:54,750 WARNING [train.py:1204] (2/4) Exclude cut with ID _6275_210_2084_1_1528110062213_7669049_973-198085-0_sp0.9 from training. Duration: 0.8333125 2023-10-03 11:02:56,105 WARNING [train.py:1204] (2/4) Exclude cut with ID _29853_210_7497_1_1532505600851_6642110_567-250221-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 11:02:57,591 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6545_210_2040_1_1527774670_4245258_407-48070-0_sp1.1 from training. Duration: 0.6909375 2023-10-03 11:02:57,933 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.conv_skip_rate, batch_count=1243966.6666666667, ans=0.0 2023-10-03 11:02:59,299 WARNING [train.py:1204] (2/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_224-343295-0_sp0.9 from training. Duration: 0.611125 2023-10-03 11:02:59,322 WARNING [train.py:1204] (2/4) Exclude cut with ID IC0125W0011-61817 from training. Duration: 0.88 2023-10-03 11:02:59,338 WARNING [train.py:1204] (2/4) Exclude cut with ID _60113_210_16271_1_1535164212481_4386079_435-154371-0_sp1.1 from training. Duration: 0.97275 2023-10-03 11:02:59,356 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_25836_210_16554_1_1532138167_53197_3-9394-0_sp1.1 from training. Duration: 0.92725 2023-10-03 11:03:03,831 WARNING [train.py:1204] (2/4) Exclude cut with ID _29343_210_4381_1_1532602757752_3674109_57-55125-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 11:03:03,920 WARNING [train.py:1204] (2/4) Exclude cut with ID _56235_210_10640_1_1534557335235_4161099_267-23560-0_sp1.1 from training. Duration: 0.936375 2023-10-03 11:03:04,148 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.conv_module2.balancer2.prob, batch_count=1243966.6666666667, ans=0.125 2023-10-03 11:03:05,147 WARNING [train.py:1204] (2/4) Exclude cut with ID _30196_210_1664_1_1533708496522_7074140_1150-278867-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 11:03:05,210 WARNING [train.py:1204] (2/4) Exclude cut with ID _6270_210_2084_1_1527850911573_3856420_26-277343-0_sp0.9 from training. Duration: 0.911125 2023-10-03 11:03:06,616 WARNING [train.py:1204] (2/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_325-62802-0 from training. Duration: 0.87 2023-10-03 11:03:06,683 WARNING [train.py:1204] (2/4) Exclude cut with ID _56524_210_9642_1_1534572011779_3619170_145-310842-0_sp1.1 from training. Duration: 0.9 2023-10-03 11:03:07,993 WARNING [train.py:1204] (2/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_860-96198-0_sp0.9 from training. Duration: 0.811125 2023-10-03 11:03:09,386 WARNING [train.py:1204] (2/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_17-25045-0_sp1.1 from training. Duration: 0.536375 2023-10-03 11:03:09,411 WARNING [train.py:1204] (2/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_349-69727-0_sp1.1 from training. Duration: 0.936375 2023-10-03 11:03:10,755 WARNING [train.py:1204] (2/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_580-93798-0_sp1.1 from training. Duration: 0.5090625 2023-10-03 11:03:12,182 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7762_210_1794_1_1528199508_4057408_419-125009-0 from training. Duration: 0.83 2023-10-03 11:03:12,274 WARNING [train.py:1204] (2/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_356-167816-0 from training. Duration: 0.72 2023-10-03 11:03:14,046 WARNING [train.py:1204] (2/4) Exclude cut with ID _29345_210_4381_1_1532689168263_3860169_225-97100-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 11:03:14,051 WARNING [train.py:1204] (2/4) Exclude cut with ID _14227_210_4381_1_1530701804286_3940259_374-194712-0_sp1.1 from training. Duration: 0.92725 2023-10-03 11:03:14,075 WARNING [train.py:1204] (2/4) Exclude cut with ID _40137_210_12210_1_1533290484046_3795180_249-187059-0_sp0.9 from training. Duration: 0.97775 2023-10-03 11:03:14,099 WARNING [train.py:1204] (2/4) Exclude cut with ID _22826_210_9994_1_1531908201522_3279169_389-317194-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 11:03:15,596 WARNING [train.py:1204] (2/4) Exclude cut with ID _27485_210_4916_1_1532739191677_3668269_239-122918-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 11:03:17,204 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.self_attn_weights.pos_emb_skip_rate, batch_count=1244033.3333333333, ans=0.0 2023-10-03 11:03:21,567 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_2245_210_2715_1_1525511548_3540254_128-117609-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 11:03:22,843 WARNING [train.py:1204] (2/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_160-276178-0_sp1.1 from training. Duration: 0.963625 2023-10-03 11:03:24,372 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_31920_210_10633_1_1532860009_1686094_1-279735-0_sp1.1 from training. Duration: 0.97275 2023-10-03 11:03:26,349 WARNING [train.py:1204] (2/4) Exclude cut with ID _51078_210_9070_1_1534161557424_3805250_325-262383-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 11:03:26,369 WARNING [train.py:1204] (2/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_643-52007-0_sp0.9 from training. Duration: 0.6333125 2023-10-03 11:03:26,414 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_51937_210_5014_1_1534573610_4022025_518-299248-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 11:03:32,914 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.0.layers.1.feed_forward2.out_whiten, num_groups=1, num_channels=192, metric=4.29 vs. limit=15.0 2023-10-03 11:03:33,387 WARNING [train.py:1204] (2/4) Exclude cut with ID _13252_210_1925_1_1530671372047_3336909_166-314607-0_sp1.1 from training. Duration: 0.4909375 2023-10-03 11:03:33,391 WARNING [train.py:1204] (2/4) Exclude cut with ID _44778_210_2054_1_1533798127203_7443090_1087-282939-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 11:03:33,429 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_23850_210_4238_1_1532232597_1632433_32-185135-0_sp1.1 from training. Duration: 0.7818125 2023-10-03 11:03:33,462 WARNING [train.py:1204] (2/4) Exclude cut with ID _59500_210_8254_1_1534809610680_3736009_145-89359-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 11:03:35,579 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.0.feed_forward1.out_proj.dropout_p, batch_count=1244166.6666666667, ans=0.1 2023-10-03 11:03:36,718 INFO [train.py:1046] (2/4) Epoch 36, batch 700, loss[loss=0.1728, simple_loss=0.2635, pruned_loss=0.04107, over 24429.00 frames. ], tot_loss[loss=0.1595, simple_loss=0.2385, pruned_loss=0.04024, over 4559806.16 frames. ], batch size: 69, lr: 2.84e-03, grad_scale: 8.0 2023-10-03 11:03:38,285 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_340-336408-0 from training. Duration: 0.93 2023-10-03 11:03:39,115 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.2.encoder.layers.0.self_attn_weights.whiten_keys, num_groups=4, num_channels=128, metric=4.20 vs. limit=6.0 2023-10-03 11:03:39,680 WARNING [train.py:1204] (2/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_9-21737-0 from training. Duration: 0.97 2023-10-03 11:03:41,590 WARNING [train.py:1204] (2/4) Exclude cut with ID _2220_210_1819_1_1525509109898_3658680_50-99837-0 from training. Duration: 0.54 2023-10-03 11:03:42,935 WARNING [train.py:1204] (2/4) Exclude cut with ID _30304_210_6319_1_1532476827261_7582060_879-299350-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 11:03:45,623 WARNING [train.py:1204] (2/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_905-160074-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 11:03:47,036 WARNING [train.py:1204] (2/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_395-73570-0 from training. Duration: 0.99 2023-10-03 11:03:48,686 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.conv_module2.balancer2.prob, batch_count=1244166.6666666667, ans=0.125 2023-10-03 11:03:51,690 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7264_210_4938_1_1527939911_8132095_800-91995-0_sp1.1 from training. Duration: 0.7818125 2023-10-03 11:03:54,441 WARNING [train.py:1204] (2/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_77-311404-0_sp1.1 from training. Duration: 0.8545625 2023-10-03 11:03:56,422 WARNING [train.py:1204] (2/4) Exclude cut with ID _13679_210_7497_1_1530534293662_3355098_107-80772-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 11:03:59,046 WARNING [train.py:1204] (2/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_224-343295-0_sp1.1 from training. Duration: 0.5 2023-10-03 11:03:59,093 WARNING [train.py:1204] (2/4) Exclude cut with ID _54871_210_22356_1_1534665798253_3856359_156-253482-0_sp1.1 from training. Duration: 0.963625 2023-10-03 11:04:01,829 WARNING [train.py:1204] (2/4) Exclude cut with ID _49728_210_12651_1_1534041034294_6788150_309-125532-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 11:04:04,740 WARNING [train.py:1204] (2/4) Exclude cut with ID _13913_210_9994_1_1530579615187_3383897_473-277682-0_sp0.9 from training. Duration: 0.4333125 2023-10-03 11:04:04,751 WARNING [train.py:1204] (2/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_3-276701-0_sp1.1 from training. Duration: 0.82725 2023-10-03 11:04:07,434 WARNING [train.py:1204] (2/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_282-136641-0 from training. Duration: 0.42 2023-10-03 11:04:10,768 WARNING [train.py:1204] (2/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_127-566-0 from training. Duration: 0.84 2023-10-03 11:04:12,264 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6163_210_6400_1_1527506746_4263068_283-137811-0_sp1.1 from training. Duration: 0.7 2023-10-03 11:04:14,013 WARNING [train.py:1204] (2/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_34-183685-0_sp1.1 from training. Duration: 0.8909375 2023-10-03 11:04:15,469 WARNING [train.py:1204] (2/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_154-135696-0_sp0.9 from training. Duration: 0.888875 2023-10-03 11:04:19,553 WARNING [train.py:1204] (2/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_676-51014-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 11:04:21,389 WARNING [train.py:1204] (2/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_38-169920-0 from training. Duration: 0.91 2023-10-03 11:04:24,120 WARNING [train.py:1204] (2/4) Exclude cut with ID _6653_210_2629_1_1527919172441_6732679_747-114465-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 11:04:25,540 WARNING [train.py:1204] (2/4) Exclude cut with ID _8021_210_7994_1_1528376451358_3653069_100-140231-0_sp1.1 from training. Duration: 0.6090625 2023-10-03 11:04:25,556 WARNING [train.py:1204] (2/4) Exclude cut with ID _64135_210_2253_1_1535677267876_7608972_133-188266-0 from training. Duration: 0.81 2023-10-03 11:04:29,041 WARNING [train.py:1204] (2/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_714-310538-0_sp1.1 from training. Duration: 0.7818125 2023-10-03 11:04:30,343 WARNING [train.py:1204] (2/4) Exclude cut with ID _39867_210_9826_1_1533553222030_9344259_1134-288498-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 11:04:31,743 WARNING [train.py:1204] (2/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_211-237289-0_sp1.1 from training. Duration: 0.936375 2023-10-03 11:04:36,058 WARNING [train.py:1204] (2/4) Exclude cut with ID _59202_210_26984_1_1534816810612_3549174_137-318710-0_sp0.9 from training. Duration: 0.988875 2023-10-03 11:04:36,084 WARNING [train.py:1204] (2/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_86-66192-0 from training. Duration: 0.55 2023-10-03 11:04:38,234 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.conv_skip_rate, batch_count=1244433.3333333333, ans=0.0 2023-10-03 11:04:40,870 WARNING [train.py:1204] (2/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_540-275685-0 from training. Duration: 0.89 2023-10-03 11:04:42,722 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8776_210_2040_1_1528638837_4088036_244-278763-0 from training. Duration: 0.9 2023-10-03 11:04:44,288 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.conv_module2.balancer2.prob, batch_count=1244433.3333333333, ans=0.125 2023-10-03 11:04:45,397 WARNING [train.py:1204] (2/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_9-219972-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 11:04:46,810 WARNING [train.py:1204] (2/4) Exclude cut with ID _44659_210_2721_1_1534036120061_6614860_656-283823-0_sp1.1 from training. Duration: 0.92725 2023-10-03 11:04:46,890 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_69630_210_7234_1_1535789601_661049_77-216385-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 11:04:50,083 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_19952_210_2161_1_1531875469_3851750_210-308919-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 11:04:50,088 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_4737_210_2040_1_1526998689_3293008_378-178650-0 from training. Duration: 0.95 2023-10-03 11:04:51,439 INFO [train.py:1046] (2/4) Epoch 36, batch 750, loss[loss=0.1571, simple_loss=0.2291, pruned_loss=0.04257, over 23751.00 frames. ], tot_loss[loss=0.1586, simple_loss=0.2378, pruned_loss=0.03968, over 4602652.18 frames. ], batch size: 164, lr: 2.84e-03, grad_scale: 8.0 2023-10-03 11:04:52,993 WARNING [train.py:1204] (2/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_224-343295-0 from training. Duration: 0.55 2023-10-03 11:04:53,017 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7002_210_1794_1_1527939683_5157286_184-229630-0 from training. Duration: 0.74 2023-10-03 11:04:53,052 WARNING [train.py:1204] (2/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_6-214815-0 from training. Duration: 0.85 2023-10-03 11:04:54,405 WARNING [train.py:1204] (2/4) Exclude cut with ID _8699_210_2069_1_1528630346831_3960409_160-24408-0 from training. Duration: 0.91 2023-10-03 11:04:55,677 WARNING [train.py:1204] (2/4) Exclude cut with ID _5585_210_2667_1_1527418813324_8007340_89-332072-0 from training. Duration: 0.64 2023-10-03 11:04:55,730 WARNING [train.py:1204] (2/4) Exclude cut with ID _5250_210_5219_1_1527249561806_8692110_528-259800-0_sp1.1 from training. Duration: 0.963625 2023-10-03 11:04:57,089 WARNING [train.py:1204] (2/4) Exclude cut with ID _55383_210_10635_1_1534468426172_2231669_134-128246-0 from training. Duration: 0.89 2023-10-03 11:04:57,170 WARNING [train.py:1204] (2/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_400-338274-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 11:04:59,066 WARNING [train.py:1204] (2/4) Exclude cut with ID _14224_210_4381_1_1530529062037_4264010_364-22817-0_sp0.9 from training. Duration: 0.788875 2023-10-03 11:04:59,183 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.1.bypass_mid.scale_min, batch_count=1244500.0, ans=0.2 2023-10-03 11:05:00,557 WARNING [train.py:1204] (2/4) Exclude cut with ID _42863_210_9070_1_1533902398788_3696920_331-291057-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 11:05:03,375 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_5436_210_1794_1_1527331317_2541494_229-211016-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 11:05:03,405 WARNING [train.py:1204] (2/4) Exclude cut with ID _6456_210_1534_1_1527681634212_3181049_345-252877-0_sp0.9 from training. Duration: 0.711125 2023-10-03 11:05:03,448 WARNING [train.py:1204] (2/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_96-81499-0_sp1.1 from training. Duration: 0.92725 2023-10-03 11:05:04,922 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_5575_210_1410_1_1527301305_2570170_342-265572-0_sp1.1 from training. Duration: 0.8545625 2023-10-03 11:05:06,281 WARNING [train.py:1204] (2/4) Exclude cut with ID _5444_210_2151_1_1527418808464_5312210_726-27165-0_sp0.9 from training. Duration: 0.7444375 2023-10-03 11:05:07,936 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.nonlin_attention.balancer.prob, batch_count=1244566.6666666667, ans=0.125 2023-10-03 11:05:09,506 WARNING [train.py:1204] (2/4) Exclude cut with ID _22083_210_8254_1_1531702862439_3678970_304-327750-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 11:05:10,662 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.521e+02 1.795e+02 1.933e+02 2.137e+02 2.929e+02, threshold=3.865e+02, percent-clipped=0.0 2023-10-03 11:05:10,861 WARNING [train.py:1204] (2/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_245-226199-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 11:05:12,149 WARNING [train.py:1204] (2/4) Exclude cut with ID _42875_210_14856_1_1533704397487_4012433_102-199499-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 11:05:12,212 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_51225_210_3601_1_1534400634_2327383_55-301844-0 from training. Duration: 0.93 2023-10-03 11:05:13,600 WARNING [train.py:1204] (2/4) Exclude cut with ID _28317_210_12742_1_1532480302381_8510325_916-195848-0_sp1.1 from training. Duration: 0.836375 2023-10-03 11:05:15,631 WARNING [train.py:1204] (2/4) Exclude cut with ID _44718_210_5816_1_1534050051869_6469030_497-311999-0_sp1.1 from training. Duration: 0.936375 2023-10-03 11:05:15,817 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.balancer2.prob, batch_count=1244566.6666666667, ans=0.125 2023-10-03 11:05:17,100 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_34886_210_10633_1_1533203882_1755661_91-194765-0_sp1.1 from training. Duration: 0.936375 2023-10-03 11:05:18,425 WARNING [train.py:1204] (2/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_310-26186-0_sp0.9 from training. Duration: 0.77775 2023-10-03 11:05:19,809 WARNING [train.py:1204] (2/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_416-257730-0 from training. Duration: 0.65 2023-10-03 11:05:19,814 WARNING [train.py:1204] (2/4) Exclude cut with ID _9389_210_1664_1_1529217337301_5289269_209-154760-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 11:05:19,963 WARNING [train.py:1204] (2/4) Exclude cut with ID _4092_210_2151_1_1526643044449_3135759_227-213972-0 from training. Duration: 0.54 2023-10-03 11:05:21,256 WARNING [train.py:1204] (2/4) Exclude cut with ID IC0679W0044-335852 from training. Duration: 0.768 2023-10-03 11:05:21,325 WARNING [train.py:1204] (2/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_42-186162-0 from training. Duration: 0.92 2023-10-03 11:05:21,332 WARNING [train.py:1204] (2/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_302-2550-0_sp0.9 from training. Duration: 0.9 2023-10-03 11:05:23,092 WARNING [train.py:1204] (2/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_356-31216-0_sp0.9 from training. Duration: 0.8333125 2023-10-03 11:05:26,025 WARNING [train.py:1204] (2/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_125-322172-0_sp1.1 from training. Duration: 0.8454375 2023-10-03 11:05:26,877 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.bypass.scale_min, batch_count=1244633.3333333333, ans=0.2 2023-10-03 11:05:28,141 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.conv_module2.balancer1.prob, batch_count=1244633.3333333333, ans=0.125 2023-10-03 11:05:32,047 WARNING [train.py:1204] (2/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_141-89727-0_sp0.9 from training. Duration: 0.788875 2023-10-03 11:05:32,064 WARNING [train.py:1204] (2/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_647-337784-0_sp1.1 from training. Duration: 0.97275 2023-10-03 11:05:32,082 WARNING [train.py:1204] (2/4) Exclude cut with ID _6493_210_1747_1_1528459210424_4012470_513-140967-0_sp1.1 from training. Duration: 0.7181875 2023-10-03 11:05:34,710 WARNING [train.py:1204] (2/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_87-266334-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 11:05:35,055 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.conv_module2.balancer2.prob, batch_count=1244700.0, ans=0.125 2023-10-03 11:05:36,153 WARNING [train.py:1204] (2/4) Exclude cut with ID _26528_210_9070_1_1532570410842_3851310_526-261338-0_sp1.1 from training. Duration: 0.92725 2023-10-03 11:05:36,834 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.2.encoder.layers.2.self_attn1.whiten, num_groups=1, num_channels=384, metric=19.82 vs. limit=22.5 2023-10-03 11:05:37,477 WARNING [train.py:1204] (2/4) Exclude cut with ID _14224_210_4381_1_1530529062037_4264010_380-343670-0 from training. Duration: 0.88 2023-10-03 11:05:37,522 WARNING [train.py:1204] (2/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_449-15508-0_sp1.1 from training. Duration: 0.7454375 2023-10-03 11:05:38,872 WARNING [train.py:1204] (2/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_372-6869-0_sp0.9 from training. Duration: 0.488875 2023-10-03 11:05:38,932 WARNING [train.py:1204] (2/4) Exclude cut with ID _26907_210_7234_1_1532221740694_3205263_337-2494-0_sp0.9 from training. Duration: 0.97775 2023-10-03 11:05:40,921 WARNING [train.py:1204] (2/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_10-100367-0_sp1.1 from training. Duration: 0.8909375 2023-10-03 11:05:42,263 WARNING [train.py:1204] (2/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_47-167373-0 from training. Duration: 0.92 2023-10-03 11:05:42,324 WARNING [train.py:1204] (2/4) Exclude cut with ID _28323_210_16187_1_1532480395667_4100300_628-282179-0_sp1.1 from training. Duration: 0.97275 2023-10-03 11:05:42,458 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.0.bypass_mid.scale_min, batch_count=1244700.0, ans=0.2 2023-10-03 11:05:46,542 WARNING [train.py:1204] (2/4) Exclude cut with ID _20455_210_5219_1_1531565996775_8098100_213-280898-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 11:05:48,427 WARNING [train.py:1204] (2/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_383-316000-0_sp1.1 from training. Duration: 0.8181875 2023-10-03 11:05:49,789 WARNING [train.py:1204] (2/4) Exclude cut with ID _18513_210_9206_1_1531618215223_3862620_16-78180-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 11:05:51,578 WARNING [train.py:1204] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_330-327861-0_sp1.1 from training. Duration: 0.6090625 2023-10-03 11:05:54,309 WARNING [train.py:1204] (2/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_356-31216-0 from training. Duration: 0.75 2023-10-03 11:05:55,561 WARNING [train.py:1204] (2/4) Exclude cut with ID _25248_210_8750_1_1532062825941_5554180_255-148539-0_sp1.1 from training. Duration: 0.963625 2023-10-03 11:05:56,855 WARNING [train.py:1204] (2/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_269-154726-0_sp1.1 from training. Duration: 0.7818125 2023-10-03 11:06:00,273 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7002_210_1794_1_1527939683_5157286_163-87552-0_sp1.1 from training. Duration: 0.7818125 2023-10-03 11:06:00,301 WARNING [train.py:1204] (2/4) Exclude cut with ID _54871_210_22356_1_1534665798253_3856359_97-151896-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 11:06:04,389 WARNING [train.py:1204] (2/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_207-7026-0_sp1.1 from training. Duration: 0.97275 2023-10-03 11:06:04,415 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_92-46009-0_sp0.9 from training. Duration: 0.6 2023-10-03 11:06:05,746 INFO [train.py:1046] (2/4) Epoch 36, batch 800, loss[loss=0.1683, simple_loss=0.2396, pruned_loss=0.04848, over 23837.00 frames. ], tot_loss[loss=0.1591, simple_loss=0.2379, pruned_loss=0.04008, over 4626533.24 frames. ], batch size: 164, lr: 2.84e-03, grad_scale: 16.0 2023-10-03 11:06:11,502 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.bypass.skip_rate, batch_count=1244833.3333333333, ans=0.09899494936611666 2023-10-03 11:06:11,576 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.feed_forward3.hidden_balancer.prob, batch_count=1244833.3333333333, ans=0.125 2023-10-03 11:06:13,239 WARNING [train.py:1204] (2/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_122-178847-0_sp1.1 from training. Duration: 0.97275 2023-10-03 11:06:13,243 WARNING [train.py:1204] (2/4) Exclude cut with ID _44778_210_2054_1_1533798127203_7443090_94-348956-0_sp1.1 from training. Duration: 0.92725 2023-10-03 11:06:15,973 WARNING [train.py:1204] (2/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_1018-244089-0_sp1.1 from training. Duration: 0.963625 2023-10-03 11:06:15,993 WARNING [train.py:1204] (2/4) Exclude cut with ID _56235_210_10640_1_1534557335235_4161099_87-118423-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 11:06:16,084 WARNING [train.py:1204] (2/4) Exclude cut with ID _53713_210_24116_1_1534314487762_7416751_504-212292-0_sp1.1 from training. Duration: 0.92725 2023-10-03 11:06:17,361 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_446-150327-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 11:06:18,902 WARNING [train.py:1204] (2/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_682-245989-0_sp1.1 from training. Duration: 0.92725 2023-10-03 11:06:22,186 WARNING [train.py:1204] (2/4) Exclude cut with ID _35723_210_1794_1_1533088341043_628250_8-234524-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 11:06:22,438 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.bypass.scale_min, batch_count=1244900.0, ans=0.2 2023-10-03 11:06:23,592 WARNING [train.py:1204] (2/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_64-248359-0_sp0.9 from training. Duration: 0.7666875 2023-10-03 11:06:25,076 WARNING [train.py:1204] (2/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_49-97158-0 from training. Duration: 0.76 2023-10-03 11:06:26,767 WARNING [train.py:1204] (2/4) Exclude cut with ID _24314_210_4915_1_1532135078116_7170179_743-915-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 11:06:28,130 WARNING [train.py:1204] (2/4) Exclude cut with ID _61194_210_18628_1_1535007555628_3750909_353-323468-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 11:06:28,161 WARNING [train.py:1204] (2/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_387-131538-0_sp1.1 from training. Duration: 0.736375 2023-10-03 11:06:28,211 WARNING [train.py:1204] (2/4) Exclude cut with ID _44424_210_18163_1_1533623394848_1213110_79-187603-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 11:06:28,243 WARNING [train.py:1204] (2/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_86-19601-0 from training. Duration: 0.85 2023-10-03 11:06:28,272 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_50050_210_16223_1_1535435796_7290138_103-283529-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 11:06:30,017 WARNING [train.py:1204] (2/4) Exclude cut with ID _13913_210_9994_1_1530579615187_3383897_473-277682-0 from training. Duration: 0.39 2023-10-03 11:06:32,741 WARNING [train.py:1204] (2/4) Exclude cut with ID _42326_210_7770_1_1533814282531_3707269_368-68029-0_sp1.1 from training. Duration: 0.92725 2023-10-03 11:06:34,198 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_41428_210_2161_1_1533774184_3992532_446-339645-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 11:06:36,858 WARNING [train.py:1204] (2/4) Exclude cut with ID _69584_210_5219_1_1535785133775_4044450_325-7656-0_sp1.1 from training. Duration: 0.7818125 2023-10-03 11:06:36,867 WARNING [train.py:1204] (2/4) Exclude cut with ID _21632_210_2253_1_1531994001765_3744140_134-184387-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 11:06:38,276 WARNING [train.py:1204] (2/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_550-184707-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 11:06:38,301 WARNING [train.py:1204] (2/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_106-341780-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 11:06:42,891 WARNING [train.py:1204] (2/4) Exclude cut with ID _45871_210_5665_1_1533864614245_3296060_253-132552-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 11:06:44,153 WARNING [train.py:1204] (2/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_395-310220-0_sp1.1 from training. Duration: 0.8090625 2023-10-03 11:06:44,176 WARNING [train.py:1204] (2/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_795-33252-0_sp0.9 from training. Duration: 0.411125 2023-10-03 11:06:46,897 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9002W0003-495962 from training. Duration: 0.946 2023-10-03 11:06:46,925 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_43-71989-0 from training. Duration: 0.57 2023-10-03 11:06:46,944 WARNING [train.py:1204] (2/4) Exclude cut with ID _8699_210_2069_1_1528630346831_3960409_155-39766-0_sp1.1 from training. Duration: 0.7090625 2023-10-03 11:06:46,963 WARNING [train.py:1204] (2/4) Exclude cut with ID _27894_210_12392_1_1532415496078_6902439_788-232684-0_sp1.1 from training. Duration: 0.97275 2023-10-03 11:06:48,374 WARNING [train.py:1204] (2/4) Exclude cut with ID _56494_210_4220_1_1535284837625_3033169_10-39296-0_sp1.1 from training. Duration: 0.92725 2023-10-03 11:06:48,400 WARNING [train.py:1204] (2/4) Exclude cut with ID _40901_210_5665_1_1533363789958_3856130_341-20819-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 11:06:50,544 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.conv_module1.balancer2.prob, batch_count=1245033.3333333333, ans=0.125 2023-10-03 11:06:54,422 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9029W0435-509822 from training. Duration: 0.896 2023-10-03 11:06:54,479 WARNING [train.py:1204] (2/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_51-117576-0 from training. Duration: 0.4 2023-10-03 11:06:55,755 WARNING [train.py:1204] (2/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_197-28164-0_sp1.1 from training. Duration: 0.72725 2023-10-03 11:06:57,144 WARNING [train.py:1204] (2/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_145-137790-0_sp1.1 from training. Duration: 0.7545625 2023-10-03 11:07:00,609 WARNING [train.py:1204] (2/4) Exclude cut with ID _47790_210_16187_1_1533862814066_4207519_2-39653-0_sp1.1 from training. Duration: 0.8909375 2023-10-03 11:07:03,333 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_3780_210_2087_1_1526467760_2130483_186-208240-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 11:07:04,662 WARNING [train.py:1204] (2/4) Exclude cut with ID _24055_210_12651_1_1532310907143_976979_92-59356-0 from training. Duration: 0.91 2023-10-03 11:07:04,706 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_3910_210_1794_1_1526742306_334530_28-238909-0_sp1.1 from training. Duration: 0.736375 2023-10-03 11:07:07,497 WARNING [train.py:1204] (2/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_269-154726-0 from training. Duration: 0.86 2023-10-03 11:07:12,064 WARNING [train.py:1204] (2/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_326-165900-0_sp0.9 from training. Duration: 0.7666875 2023-10-03 11:07:16,569 WARNING [train.py:1204] (2/4) Exclude cut with ID _5294_210_1747_1_1527249643928_3696179_470-276077-0_sp1.1 from training. Duration: 0.963625 2023-10-03 11:07:16,607 WARNING [train.py:1204] (2/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_372-131163-0 from training. Duration: 0.94 2023-10-03 11:07:16,659 WARNING [train.py:1204] (2/4) Exclude cut with ID _45297_210_2721_1_1533715351381_3264949_351-147136-0_sp1.1 from training. Duration: 0.7909375 2023-10-03 11:07:17,920 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_1072-12671-0_sp1.1 from training. Duration: 0.92725 2023-10-03 11:07:19,157 INFO [train.py:1046] (2/4) Epoch 36, batch 850, loss[loss=0.17, simple_loss=0.2563, pruned_loss=0.04185, over 24086.00 frames. ], tot_loss[loss=0.1605, simple_loss=0.2394, pruned_loss=0.0408, over 4647631.35 frames. ], batch size: 80, lr: 2.84e-03, grad_scale: 8.0 2023-10-03 11:07:19,242 WARNING [train.py:1204] (2/4) Exclude cut with ID _15133_210_6228_1_1530794512074_3080379_54-198193-0 from training. Duration: 0.94 2023-10-03 11:07:19,265 WARNING [train.py:1204] (2/4) Exclude cut with ID _30196_210_1664_1_1533708496522_7074140_1194-270988-0_sp1.1 from training. Duration: 0.97275 2023-10-03 11:07:19,693 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.5.encoder.layers.1.nonlin_attention.whiten2, num_groups=1, num_channels=256, metric=4.01 vs. limit=15.0 2023-10-03 11:07:20,700 WARNING [train.py:1204] (2/4) Exclude cut with ID _50310_210_15488_1_1534068043345_3641080_289-193637-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 11:07:20,791 WARNING [train.py:1204] (2/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_313-263495-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 11:07:23,539 WARNING [train.py:1204] (2/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_18-130336-0_sp1.1 from training. Duration: 0.7181875 2023-10-03 11:07:23,606 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_47896_210_20508_1_1534492518_5958868_646-287209-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 11:07:25,032 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_294-163793-0 from training. Duration: 0.82 2023-10-03 11:07:25,063 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_8-319957-0 from training. Duration: 0.96 2023-10-03 11:07:25,069 WARNING [train.py:1204] (2/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_1211-226066-0 from training. Duration: 0.76 2023-10-03 11:07:28,183 WARNING [train.py:1204] (2/4) Exclude cut with ID _4779_210_2084_1_1527318022944_3914893_500-285013-0_sp0.9 from training. Duration: 0.7666875 2023-10-03 11:07:28,208 WARNING [train.py:1204] (2/4) Exclude cut with ID _22826_210_9994_1_1531908201522_3279169_347-232268-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 11:07:29,605 WARNING [train.py:1204] (2/4) Exclude cut with ID _29877_210_6228_1_1532397791106_5276149_111-55056-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 11:07:29,631 WARNING [train.py:1204] (2/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_812-271646-0_sp1.1 from training. Duration: 0.92725 2023-10-03 11:07:29,658 WARNING [train.py:1204] (2/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_235-262124-0_sp1.1 from training. Duration: 0.6090625 2023-10-03 11:07:32,642 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=1245233.3333333333, ans=0.1 2023-10-03 11:07:33,128 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.3.encoder.layers.3.nonlin_attention.whiten1, num_groups=1, num_channels=384, metric=5.10 vs. limit=10.0 2023-10-03 11:07:33,847 WARNING [train.py:1204] (2/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_14-16530-0_sp1.1 from training. Duration: 0.97275 2023-10-03 11:07:35,110 WARNING [train.py:1204] (2/4) Exclude cut with ID _44718_210_5816_1_1534050051869_6469030_568-304512-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 11:07:35,145 WARNING [train.py:1204] (2/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_1033-274376-0 from training. Duration: 0.87 2023-10-03 11:07:38,057 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.conv_module2.balancer2.prob, batch_count=1245233.3333333333, ans=0.125 2023-10-03 11:07:39,062 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.588e+02 1.931e+02 2.185e+02 2.563e+02 4.001e+02, threshold=4.370e+02, percent-clipped=1.0 2023-10-03 11:07:40,424 WARNING [train.py:1204] (2/4) Exclude cut with ID _4681_210_3525_1_1527251040321_1235713_123-24428-0 from training. Duration: 0.48 2023-10-03 11:07:43,835 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_37818_210_4238_1_1533620910_4720083_251-288764-0_sp1.1 from training. Duration: 0.97275 2023-10-03 11:07:45,207 WARNING [train.py:1204] (2/4) Exclude cut with ID _65585_210_12210_1_1535450451584_3764098_112-294301-0 from training. Duration: 0.88 2023-10-03 11:07:47,259 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.balancer1.prob, batch_count=1245300.0, ans=0.125 2023-10-03 11:07:49,809 WARNING [train.py:1204] (2/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_329-113173-0 from training. Duration: 0.62 2023-10-03 11:07:51,157 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6767_210_3273_1_1527855783_1718932_173-256601-0 from training. Duration: 0.8 2023-10-03 11:07:53,976 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9266W0001-627677 from training. Duration: 0.896 2023-10-03 11:07:53,991 WARNING [train.py:1204] (2/4) Exclude cut with ID _49643_210_7989_1_1534640244224_3939031_611-185515-0_sp1.1 from training. Duration: 0.8545625 2023-10-03 11:07:53,994 WARNING [train.py:1204] (2/4) Exclude cut with ID _31649_210_6228_1_1532584608063_4091078_687-237771-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 11:07:54,005 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_12868_210_6753_1_1530702479_3610564_148-172861-0_sp0.9 from training. Duration: 0.5555625 2023-10-03 11:07:55,839 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.0.conv_module1.balancer1.prob, batch_count=1245300.0, ans=0.125 2023-10-03 11:07:56,984 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_5129_210_6400_1_1527161005_3579054_107-69738-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 11:07:58,403 WARNING [train.py:1204] (2/4) Exclude cut with ID _40137_210_12210_1_1533290484046_3795180_36-301164-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 11:07:58,425 WARNING [train.py:1204] (2/4) Exclude cut with ID _7537_210_2084_1_1528502395431_3924359_113-199933-0 from training. Duration: 0.59 2023-10-03 11:08:00,519 WARNING [train.py:1204] (2/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_547-5424-0_sp1.1 from training. Duration: 0.8545625 2023-10-03 11:08:01,874 WARNING [train.py:1204] (2/4) Exclude cut with ID _43552_210_8913_1_1533726031059_3523429_147-284024-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 11:08:01,948 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_294-163793-0_sp1.1 from training. Duration: 0.7454375 2023-10-03 11:08:03,909 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_3780_210_2087_1_1526467760_2130483_170-259044-0_sp1.1 from training. Duration: 0.636375 2023-10-03 11:08:05,323 WARNING [train.py:1204] (2/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_726-270780-0_sp1.1 from training. Duration: 0.7818125 2023-10-03 11:08:08,052 WARNING [train.py:1204] (2/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_214-291369-0_sp1.1 from training. Duration: 0.57275 2023-10-03 11:08:09,416 WARNING [train.py:1204] (2/4) Exclude cut with ID _21881_210_3525_1_1531825242757_3798380_247-102591-0 from training. Duration: 0.98 2023-10-03 11:08:10,964 WARNING [train.py:1204] (2/4) Exclude cut with ID _8064_210_5399_1_1528372299291_4282973_18-174647-0_sp1.1 from training. Duration: 0.82725 2023-10-03 11:08:12,372 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_415-52611-0_sp1.1 from training. Duration: 0.936375 2023-10-03 11:08:13,740 WARNING [train.py:1204] (2/4) Exclude cut with ID _8394_210_6702_1_1528590504316_3540189_379-49457-0_sp0.9 from training. Duration: 0.9666875 2023-10-03 11:08:13,762 WARNING [train.py:1204] (2/4) Exclude cut with ID _64521_210_14699_1_1535243427259_3815328_368-195106-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 11:08:13,839 WARNING [train.py:1204] (2/4) Exclude cut with ID _28394_210_8913_1_1532430049518_3650118_51-334124-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 11:08:15,293 WARNING [train.py:1204] (2/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_317-161769-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 11:08:18,479 WARNING [train.py:1204] (2/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_40-129756-0_sp0.9 from training. Duration: 0.9 2023-10-03 11:08:18,574 WARNING [train.py:1204] (2/4) Exclude cut with ID _3067_210_2667_1_1526212974640_4503168_119-175776-0_sp0.9 from training. Duration: 0.82225 2023-10-03 11:08:19,946 WARNING [train.py:1204] (2/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_1004-304915-0_sp1.1 from training. Duration: 0.92725 2023-10-03 11:08:21,364 WARNING [train.py:1204] (2/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_359-118926-0_sp0.9 from training. Duration: 0.8 2023-10-03 11:08:27,044 WARNING [train.py:1204] (2/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_51-200220-0_sp0.9 from training. Duration: 0.62225 2023-10-03 11:08:28,367 WARNING [train.py:1204] (2/4) Exclude cut with ID _46683_210_9642_1_1533730913492_4591116_530-326202-0_sp1.1 from training. Duration: 0.936375 2023-10-03 11:08:29,658 WARNING [train.py:1204] (2/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_567-135114-0 from training. Duration: 0.87 2023-10-03 11:08:29,710 WARNING [train.py:1204] (2/4) Exclude cut with ID _66863_210_18053_1_1535698758334_8590319_1092-271784-0_sp1.1 from training. Duration: 0.87275 2023-10-03 11:08:31,056 WARNING [train.py:1204] (2/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_762-282614-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 11:08:32,855 INFO [train.py:1046] (2/4) Epoch 36, batch 900, loss[loss=0.1718, simple_loss=0.2463, pruned_loss=0.04872, over 22698.00 frames. ], tot_loss[loss=0.1608, simple_loss=0.2398, pruned_loss=0.04085, over 4661870.04 frames. ], batch size: 322, lr: 2.84e-03, grad_scale: 8.0 2023-10-03 11:08:32,929 WARNING [train.py:1204] (2/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_280-120942-0 from training. Duration: 0.77 2023-10-03 11:08:39,086 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_31583_210_3852_1_1532595851_4024413_27-170931-0_sp1.1 from training. Duration: 0.963625 2023-10-03 11:08:41,762 WARNING [train.py:1204] (2/4) Exclude cut with ID _12369_210_4381_1_1530356078581_4388520_266-271628-0_sp1.1 from training. Duration: 0.92725 2023-10-03 11:08:41,806 WARNING [train.py:1204] (2/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_41-31157-0 from training. Duration: 0.57 2023-10-03 11:08:44,579 WARNING [train.py:1204] (2/4) Exclude cut with ID _21878_210_3525_1_1531738772002_7264330_113-18163-0_sp1.1 from training. Duration: 0.8090625 2023-10-03 11:08:44,615 WARNING [train.py:1204] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_147-111137-0 from training. Duration: 0.95 2023-10-03 11:08:46,590 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1083-220122-0_sp1.1 from training. Duration: 0.463625 2023-10-03 11:08:49,777 WARNING [train.py:1204] (2/4) Exclude cut with ID _14884_210_2252_1_1530707459689_5561250_591-173406-0_sp1.1 from training. Duration: 0.87275 2023-10-03 11:08:49,783 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_24677_210_1794_1_1532002693_4341296_149-303793-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 11:08:49,834 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_12868_210_6753_1_1530702479_3610564_133-564-0_sp1.1 from training. Duration: 0.7181875 2023-10-03 11:08:49,870 WARNING [train.py:1204] (2/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_625-338546-0_sp0.9 from training. Duration: 0.97775 2023-10-03 11:08:54,677 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.3.encoder.layers.1.feed_forward2.out_whiten, num_groups=1, num_channels=512, metric=5.33 vs. limit=15.0 2023-10-03 11:08:58,863 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.3.encoder.layers.1.feed_forward2.out_whiten, num_groups=1, num_channels=512, metric=6.85 vs. limit=15.0 2023-10-03 11:08:59,594 WARNING [train.py:1204] (2/4) Exclude cut with ID _25882_210_3036_1_1532775665815_6479510_624-320169-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 11:08:59,610 WARNING [train.py:1204] (2/4) Exclude cut with ID _20622_210_3852_1_1531622427080_1093839_127-325326-0_sp1.1 from training. Duration: 0.92725 2023-10-03 11:08:59,638 WARNING [train.py:1204] (2/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_464-33931-0_sp1.1 from training. Duration: 0.4909375 2023-10-03 11:09:03,085 WARNING [train.py:1204] (2/4) Exclude cut with ID _66112_210_10635_1_1535590236305_3801484_120-304588-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 11:09:08,543 WARNING [train.py:1204] (2/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_110-130961-0 from training. Duration: 0.47 2023-10-03 11:09:10,550 WARNING [train.py:1204] (2/4) Exclude cut with ID _16908_210_5724_1_1531119157248_3772700_64-54020-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 11:09:13,494 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.feed_forward1.out_proj.dropout_p, batch_count=1245633.3333333333, ans=0.1 2023-10-03 11:09:14,667 WARNING [train.py:1204] (2/4) Exclude cut with ID _4068_210_2629_1_1526697350395_1546259_224-288693-0_sp1.1 from training. Duration: 0.536375 2023-10-03 11:09:15,937 WARNING [train.py:1204] (2/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_191-41023-0_sp0.9 from training. Duration: 0.788875 2023-10-03 11:09:16,005 WARNING [train.py:1204] (2/4) Exclude cut with ID ID0205W0255-783002 from training. Duration: 0.958 2023-10-03 11:09:17,400 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_162-262717-0 from training. Duration: 0.83 2023-10-03 11:09:23,593 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_224-322394-0_sp1.1 from training. Duration: 0.6 2023-10-03 11:09:23,599 WARNING [train.py:1204] (2/4) Exclude cut with ID _4866_210_2577_1_1527404412284_4364659_133-1696-0_sp1.1 from training. Duration: 0.9 2023-10-03 11:09:24,940 WARNING [train.py:1204] (2/4) Exclude cut with ID _6275_210_2084_1_1528110062213_7669049_973-198085-0_sp1.1 from training. Duration: 0.6818125 2023-10-03 11:09:31,748 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_47449_210_5399_1_1534076445_4006929_410-237806-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 11:09:31,764 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7543_210_6753_1_1528543318_4552_1-85072-0_sp0.9 from training. Duration: 0.92225 2023-10-03 11:09:33,222 WARNING [train.py:1204] (2/4) Exclude cut with ID _65263_210_22356_1_1535763598691_3921037_108-21902-0 from training. Duration: 0.99 2023-10-03 11:09:33,227 WARNING [train.py:1204] (2/4) Exclude cut with ID _65585_210_12210_1_1535450451584_3764098_309-174818-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 11:09:35,885 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_309-65177-0 from training. Duration: 0.88 2023-10-03 11:09:37,912 WARNING [train.py:1204] (2/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_363-185703-0_sp1.1 from training. Duration: 0.863625 2023-10-03 11:09:37,944 WARNING [train.py:1204] (2/4) Exclude cut with ID _42326_210_7770_1_1533814282531_3707269_442-238692-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 11:09:40,647 WARNING [train.py:1204] (2/4) Exclude cut with ID _69884_210_9771_1_1535846392818_4166169_226-254446-0_sp1.1 from training. Duration: 0.936375 2023-10-03 11:09:40,665 WARNING [train.py:1204] (2/4) Exclude cut with ID _29853_210_7497_1_1532505600851_6642110_287-299903-0_sp1.1 from training. Duration: 0.963625 2023-10-03 11:09:41,194 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.5.encoder.layers.1.conv_module2.whiten, num_groups=1, num_channels=256, metric=5.32 vs. limit=15.0 2023-10-03 11:09:44,233 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1015-343884-0 from training. Duration: 0.62 2023-10-03 11:09:44,279 WARNING [train.py:1204] (2/4) Exclude cut with ID ID0305W0008-833402 from training. Duration: 0.91 2023-10-03 11:09:45,611 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6673_210_3273_1_1527767790_3173240_351-167954-0_sp1.1 from training. Duration: 0.436375 2023-10-03 11:09:45,625 WARNING [train.py:1204] (2/4) Exclude cut with ID _6456_210_1534_1_1527681634212_3181049_345-252877-0 from training. Duration: 0.64 2023-10-03 11:09:46,949 INFO [train.py:1046] (2/4) Epoch 36, batch 950, loss[loss=0.1524, simple_loss=0.2279, pruned_loss=0.03843, over 23489.00 frames. ], tot_loss[loss=0.1605, simple_loss=0.2398, pruned_loss=0.04066, over 4680432.97 frames. ], batch size: 134, lr: 2.84e-03, grad_scale: 8.0 2023-10-03 11:09:47,364 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.conv_skip_rate, batch_count=1245833.3333333333, ans=0.0 2023-10-03 11:09:48,468 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_13-205182-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 11:09:51,685 WARNING [train.py:1204] (2/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_427-208901-0 from training. Duration: 0.68 2023-10-03 11:09:54,513 WARNING [train.py:1204] (2/4) Exclude cut with ID _49039_210_6315_1_1533967537541_3846118_9-148921-0_sp1.1 from training. Duration: 0.97275 2023-10-03 11:09:57,340 WARNING [train.py:1204] (2/4) Exclude cut with ID _59493_210_17282_1_1535078419077_3225879_143-70317-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 11:09:57,383 WARNING [train.py:1204] (2/4) Exclude cut with ID _27967_210_12140_1_1532588391859_8419130_1056-236369-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 11:09:57,426 WARNING [train.py:1204] (2/4) Exclude cut with ID _6537_210_1883_1_1527987559710_3597129_382-133062-0_sp0.9 from training. Duration: 0.7555625 2023-10-03 11:10:01,448 WARNING [train.py:1204] (2/4) Exclude cut with ID ID0376W0255-869957 from training. Duration: 0.9510625 2023-10-03 11:10:04,146 WARNING [train.py:1204] (2/4) Exclude cut with ID _49371_210_12071_1_1533952761021_3812190_483-240691-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 11:10:05,469 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_12868_210_6753_1_1530701932_530275_18-76322-0_sp1.1 from training. Duration: 0.7909375 2023-10-03 11:10:07,346 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.472e+02 1.892e+02 2.032e+02 2.261e+02 4.305e+02, threshold=4.064e+02, percent-clipped=0.0 2023-10-03 11:10:07,446 WARNING [train.py:1204] (2/4) Exclude cut with ID _53950_210_5816_1_1534417264919_3862951_367-26344-0_sp1.1 from training. Duration: 0.97275 2023-10-03 11:10:07,473 WARNING [train.py:1204] (2/4) Exclude cut with ID _5444_210_2151_1_1527418808464_5312210_634-194174-0_sp1.1 from training. Duration: 0.8818125 2023-10-03 11:10:07,492 WARNING [train.py:1204] (2/4) Exclude cut with ID _56494_210_4220_1_1535284837625_3033169_69-138150-0 from training. Duration: 0.94 2023-10-03 11:10:08,825 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7543_210_6753_1_1528544010_1110402_25-183860-0_sp0.9 from training. Duration: 0.52225 2023-10-03 11:10:10,124 WARNING [train.py:1204] (2/4) Exclude cut with ID _40284_210_2084_1_1533513679814_3704279_287-191918-0_sp1.1 from training. Duration: 0.963625 2023-10-03 11:10:10,248 WARNING [train.py:1204] (2/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_291-270881-0 from training. Duration: 0.97 2023-10-03 11:10:11,971 WARNING [train.py:1204] (2/4) Exclude cut with ID _12555_210_8750_1_1530075594402_3780239_449-267986-0_sp0.9 from training. Duration: 0.92225 2023-10-03 11:10:14,998 WARNING [train.py:1204] (2/4) Exclude cut with ID _3992_210_1747_1_1526644816031_3731060_268-64520-0_sp1.1 from training. Duration: 0.963625 2023-10-03 11:10:15,006 WARNING [train.py:1204] (2/4) Exclude cut with ID _39411_210_5986_1_1533538825030_3343649_173-238248-0_sp0.9 from training. Duration: 0.92225 2023-10-03 11:10:15,027 WARNING [train.py:1204] (2/4) Exclude cut with ID _32704_210_8954_1_1533017488840_6747370_828-313097-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 11:10:16,908 WARNING [train.py:1204] (2/4) Exclude cut with ID _26907_210_7234_1_1532221740694_3205263_337-2494-0 from training. Duration: 0.88 2023-10-03 11:10:18,403 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_229-45300-0_sp1.1 from training. Duration: 0.5181875 2023-10-03 11:10:22,816 WARNING [train.py:1204] (2/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_300-27235-0_sp1.1 from training. Duration: 0.7909375 2023-10-03 11:10:24,243 WARNING [train.py:1204] (2/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_48-338976-0_sp1.1 from training. Duration: 0.8090625 2023-10-03 11:10:27,054 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_13091_210_7925_1_1530609447_8987354_792-195353-0_sp1.1 from training. Duration: 0.936375 2023-10-03 11:10:28,332 WARNING [train.py:1204] (2/4) Exclude cut with ID _56524_210_9642_1_1534572011779_3619170_311-54500-0_sp1.1 from training. Duration: 0.97275 2023-10-03 11:10:31,016 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7543_210_6753_1_1528544010_1110402_25-183860-0 from training. Duration: 0.47 2023-10-03 11:10:32,423 WARNING [train.py:1204] (2/4) Exclude cut with ID 2411-132532-0017-82279-0_sp1.1 from training. Duration: 0.9681875 2023-10-03 11:10:32,423 WARNING [train.py:1204] (2/4) Exclude cut with ID _14170_210_5219_1_1530525949868_4469650_272-249985-0_sp0.9 from training. Duration: 0.7444375 2023-10-03 11:10:34,392 WARNING [train.py:1204] (2/4) Exclude cut with ID _55644_210_11856_1_1534593599582_4103719_320-229006-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 11:10:34,444 WARNING [train.py:1204] (2/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_177-100554-0_sp1.1 from training. Duration: 0.963625 2023-10-03 11:10:34,446 WARNING [train.py:1204] (2/4) Exclude cut with ID _14998_210_5219_1_1530705582080_7474970_787-294416-0_sp1.1 from training. Duration: 0.7454375 2023-10-03 11:10:37,336 WARNING [train.py:1204] (2/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_318-59097-0 from training. Duration: 0.76 2023-10-03 11:10:38,720 WARNING [train.py:1204] (2/4) Exclude cut with ID _12369_210_4381_1_1530356078581_4388520_480-13146-0_sp1.1 from training. Duration: 0.77275 2023-10-03 11:10:41,378 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_469-338558-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 11:10:41,450 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_40896_210_15710_1_1533433956_7100802_367-126897-0_sp1.1 from training. Duration: 0.963625 2023-10-03 11:10:42,742 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6545_210_2040_1_1527774670_4245258_379-14182-0 from training. Duration: 0.8 2023-10-03 11:10:42,753 WARNING [train.py:1204] (2/4) Exclude cut with ID _21631_210_2253_1_1531907880778_3160339_197-79498-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 11:10:42,757 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_12868_210_6753_1_1530702479_3610564_416-293746-0_sp1.1 from training. Duration: 0.8181875 2023-10-03 11:10:42,792 WARNING [train.py:1204] (2/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_52-211683-0 from training. Duration: 0.9 2023-10-03 11:10:46,033 WARNING [train.py:1204] (2/4) Exclude cut with ID _48678_210_5663_1_1533988830947_3769199_306-255886-0_sp0.9 from training. Duration: 0.8555625 2023-10-03 11:10:49,221 WARNING [train.py:1204] (2/4) Exclude cut with ID _65134_210_14763_1_1535335393985_3505368_296-24733-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 11:10:55,373 WARNING [train.py:1204] (2/4) Exclude cut with ID _14898_210_9206_1_1530766812377_4177190_335-99364-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 11:10:56,694 WARNING [train.py:1204] (2/4) Exclude cut with ID _54871_210_22356_1_1534665798253_3856359_19-342632-0 from training. Duration: 0.91 2023-10-03 11:10:56,719 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_53071_210_16554_1_1534490405_2954066_265-170472-0 from training. Duration: 0.83 2023-10-03 11:10:58,352 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.conv_module2.balancer2.prob, batch_count=1246100.0, ans=0.125 2023-10-03 11:10:59,496 WARNING [train.py:1204] (2/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_189-298172-0_sp1.1 from training. Duration: 0.963625 2023-10-03 11:11:00,878 INFO [train.py:1046] (2/4) Epoch 36, batch 1000, loss[loss=0.167, simple_loss=0.2514, pruned_loss=0.04124, over 24023.00 frames. ], tot_loss[loss=0.1605, simple_loss=0.2391, pruned_loss=0.04094, over 4681466.17 frames. ], batch size: 86, lr: 2.84e-03, grad_scale: 8.0 2023-10-03 11:11:01,132 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.1.attention_skip_rate, batch_count=1246166.6666666667, ans=0.0 2023-10-03 11:11:03,717 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7615_210_4211_1_1528110965_4580160_72-235211-0 from training. Duration: 0.76 2023-10-03 11:11:05,033 WARNING [train.py:1204] (2/4) Exclude cut with ID _47790_210_16187_1_1533862814066_4207519_155-90874-0_sp1.1 from training. Duration: 0.936375 2023-10-03 11:11:07,216 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=1246166.6666666667, ans=0.1 2023-10-03 11:11:09,571 WARNING [train.py:1204] (2/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_164-216310-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 11:11:09,675 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_46013_210_7234_1_1533776362_3744032_463-52378-0 from training. Duration: 0.86 2023-10-03 11:11:10,964 WARNING [train.py:1204] (2/4) Exclude cut with ID _61133_210_9969_1_1535196777227_3696409_333-270325-0 from training. Duration: 0.93 2023-10-03 11:11:14,426 WARNING [train.py:1204] (2/4) Exclude cut with ID _5172_210_1883_1_1527246062567_3577219_18-101380-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 11:11:15,813 WARNING [train.py:1204] (2/4) Exclude cut with ID _30800_210_22382_1_1532499347904_4515959_577-236858-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 11:11:15,938 WARNING [train.py:1204] (2/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_1006-177345-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 11:11:17,482 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.feed_forward1.hidden_balancer.prob, batch_count=1246233.3333333333, ans=0.125 2023-10-03 11:11:19,048 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6291_210_3273_1_1527683649_1078498_4-266348-0 from training. Duration: 0.78 2023-10-03 11:11:20,494 WARNING [train.py:1204] (2/4) Exclude cut with ID _55857_210_2346_1_1534644007831_6898209_745-98391-0 from training. Duration: 0.76 2023-10-03 11:11:22,266 WARNING [train.py:1204] (2/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_375-244976-0 from training. Duration: 0.75 2023-10-03 11:11:23,602 WARNING [train.py:1204] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_4-248404-0_sp1.1 from training. Duration: 0.97275 2023-10-03 11:11:25,042 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8453_210_4211_1_1528502940_8562730_596-306820-0 from training. Duration: 0.97 2023-10-03 11:11:26,405 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_211-339622-0_sp0.9 from training. Duration: 0.5 2023-10-03 11:11:26,443 WARNING [train.py:1204] (2/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_393-57888-0 from training. Duration: 0.64 2023-10-03 11:11:26,602 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.1.conv_module2.balancer1.min_positive, batch_count=1246233.3333333333, ans=0.025 2023-10-03 11:11:27,648 WARNING [train.py:1204] (2/4) Exclude cut with ID _23825_210_13449_1_1531915313526_3497150_143-343575-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 11:11:27,705 WARNING [train.py:1204] (2/4) Exclude cut with ID _58605_210_12210_1_1534723195939_3904259_36-12256-0_sp1.1 from training. Duration: 0.936375 2023-10-03 11:11:36,542 WARNING [train.py:1204] (2/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_38-67795-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 11:11:36,613 WARNING [train.py:1204] (2/4) Exclude cut with ID _64631_210_27767_1_1535349580068_4165160_308-327832-0_sp1.1 from training. Duration: 0.92725 2023-10-03 11:11:37,870 WARNING [train.py:1204] (2/4) Exclude cut with ID _37086_210_9038_1_1532999540891_7021250_726-81176-0_sp1.1 from training. Duration: 0.936375 2023-10-03 11:11:37,933 WARNING [train.py:1204] (2/4) Exclude cut with ID _47790_210_16187_1_1533862814066_4207519_47-141521-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 11:11:37,937 WARNING [train.py:1204] (2/4) Exclude cut with ID _3067_210_2667_1_1526212974640_4503168_250-285937-0 from training. Duration: 0.8 2023-10-03 11:11:39,312 WARNING [train.py:1204] (2/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_239-24000-0_sp1.1 from training. Duration: 0.97275 2023-10-03 11:11:39,381 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_3371_210_1794_1_1526384462_5580777_396-339812-0_sp0.9 from training. Duration: 0.9444375 2023-10-03 11:11:40,678 WARNING [train.py:1204] (2/4) Exclude cut with ID _16909_210_5724_1_1531292079912_4118919_580-173454-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 11:11:40,746 WARNING [train.py:1204] (2/4) Exclude cut with ID IC0124W0002-61308 from training. Duration: 0.982 2023-10-03 11:11:43,978 WARNING [train.py:1204] (2/4) Exclude cut with ID _31602_210_5854_1_1532677021456_4458250_249-96604-0 from training. Duration: 0.89 2023-10-03 11:11:45,995 WARNING [train.py:1204] (2/4) Exclude cut with ID _69584_210_5219_1_1535785133775_4044450_325-7656-0 from training. Duration: 0.86 2023-10-03 11:11:47,493 WARNING [train.py:1204] (2/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_478-162247-0 from training. Duration: 0.82 2023-10-03 11:11:48,848 WARNING [train.py:1204] (2/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_686-54383-0_sp1.1 from training. Duration: 0.7909375 2023-10-03 11:11:53,130 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.1.encoder.layers.0.nonlin_attention.whiten2, num_groups=1, num_channels=256, metric=5.49 vs. limit=15.0 2023-10-03 11:11:56,222 WARNING [train.py:1204] (2/4) Exclude cut with ID _29303_210_2575_1_1532392237303_7410649_85-270092-0_sp1.1 from training. Duration: 0.936375 2023-10-03 11:11:56,239 WARNING [train.py:1204] (2/4) Exclude cut with ID _2322_210_2667_1_1525604548971_3656299_440-80494-0_sp1.1 from training. Duration: 0.8 2023-10-03 11:11:56,282 WARNING [train.py:1204] (2/4) Exclude cut with ID _47346_210_9476_1_1533815971553_3536160_201-40644-0_sp1.1 from training. Duration: 0.936375 2023-10-03 11:11:56,420 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.conv_skip_rate, batch_count=1246366.6666666667, ans=0.0 2023-10-03 11:11:58,936 WARNING [train.py:1204] (2/4) Exclude cut with ID _56235_210_10640_1_1534557335235_4161099_343-139898-0_sp1.1 from training. Duration: 0.87275 2023-10-03 11:12:00,350 WARNING [train.py:1204] (2/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_1012-279473-0 from training. Duration: 0.67 2023-10-03 11:12:02,339 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7931_210_1794_1_1528594728_5564059_280-279560-0_sp1.1 from training. Duration: 0.8909375 2023-10-03 11:12:02,382 WARNING [train.py:1204] (2/4) Exclude cut with ID _8263_210_1664_1_1528439452891_970220_68-181805-0 from training. Duration: 0.64 2023-10-03 11:12:03,700 WARNING [train.py:1204] (2/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_605-133395-0 from training. Duration: 0.66 2023-10-03 11:12:05,077 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_43-171109-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 11:12:05,081 WARNING [train.py:1204] (2/4) Exclude cut with ID _40094_210_16380_1_1533294005773_3641159_107-280739-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 11:12:06,573 WARNING [train.py:1204] (2/4) Exclude cut with ID _14821_210_8750_1_1530680503991_6829359_2-227151-0_sp0.9 from training. Duration: 0.92225 2023-10-03 11:12:09,241 WARNING [train.py:1204] (2/4) Exclude cut with ID _14268_210_9206_1_1530622771285_4714099_331-105351-0_sp1.1 from training. Duration: 0.5909375 2023-10-03 11:12:10,642 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_26326_210_7925_1_1533002493_3592406_451-82741-0_sp1.1 from training. Duration: 0.97275 2023-10-03 11:12:13,806 WARNING [train.py:1204] (2/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_191-59784-0_sp0.9 from training. Duration: 0.97775 2023-10-03 11:12:15,119 INFO [train.py:1046] (2/4) Epoch 36, batch 1050, loss[loss=0.1472, simple_loss=0.2292, pruned_loss=0.03258, over 24429.00 frames. ], tot_loss[loss=0.1599, simple_loss=0.2377, pruned_loss=0.04102, over 4682480.68 frames. ], batch size: 66, lr: 2.84e-03, grad_scale: 8.0 2023-10-03 11:12:15,206 WARNING [train.py:1204] (2/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_445-117398-0_sp1.1 from training. Duration: 0.8090625 2023-10-03 11:12:17,113 WARNING [train.py:1204] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_605-257205-0_sp0.9 from training. Duration: 0.6555625 2023-10-03 11:12:18,471 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_50050_210_16223_1_1535435796_7290138_592-258841-0_sp1.1 from training. Duration: 0.936375 2023-10-03 11:12:19,847 WARNING [train.py:1204] (2/4) Exclude cut with ID _14348_210_8341_1_1530599434996_3778640_240-232370-0_sp0.9 from training. Duration: 0.9333125 2023-10-03 11:12:20,065 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.bypass.skip_rate, batch_count=1246500.0, ans=0.07 2023-10-03 11:12:24,381 WARNING [train.py:1204] (2/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_239-316802-0_sp0.9 from training. Duration: 0.7666875 2023-10-03 11:12:25,840 WARNING [train.py:1204] (2/4) Exclude cut with ID _45560_210_21982_1_1533812603951_3584999_111-6376-0_sp1.1 from training. Duration: 0.763625 2023-10-03 11:12:28,588 WARNING [train.py:1204] (2/4) Exclude cut with ID _54189_210_16380_1_1534332670039_3911710_606-311481-0_sp1.1 from training. Duration: 0.963625 2023-10-03 11:12:28,664 WARNING [train.py:1204] (2/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_237-332027-0_sp1.1 from training. Duration: 0.863625 2023-10-03 11:12:28,697 WARNING [train.py:1204] (2/4) Exclude cut with ID _66070_210_7640_1_1535441393096_7805310_489-292631-0_sp0.9 from training. Duration: 0.87775 2023-10-03 11:12:30,051 WARNING [train.py:1204] (2/4) Exclude cut with ID _49728_210_12651_1_1534041034294_6788150_624-195301-0_sp1.1 from training. Duration: 0.77275 2023-10-03 11:12:31,408 WARNING [train.py:1204] (2/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_44-79427-0 from training. Duration: 0.98 2023-10-03 11:12:32,794 WARNING [train.py:1204] (2/4) Exclude cut with ID _35350_210_16040_1_1533040191183_4075299_490-198354-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 11:12:33,443 WARNING [train.py:1204] (2/4) Exclude cut with ID _5166_210_2084_1_1527073197160_4087993_309-222545-0 from training. Duration: 0.84 2023-10-03 11:12:36,074 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_13091_210_7925_1_1530609447_8987354_790-199469-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 11:12:36,086 WARNING [train.py:1204] (2/4) Exclude cut with ID _21632_210_2253_1_1531994001765_3744140_110-274654-0 from training. Duration: 0.82 2023-10-03 11:12:36,091 WARNING [train.py:1204] (2/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_498-40153-0_sp0.9 from training. Duration: 0.7 2023-10-03 11:12:37,305 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.595e+02 1.909e+02 2.045e+02 2.275e+02 3.142e+02, threshold=4.091e+02, percent-clipped=0.0 2023-10-03 11:12:41,673 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.conv_skip_rate, batch_count=1246566.6666666667, ans=0.0 2023-10-03 11:12:42,899 WARNING [train.py:1204] (2/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_363-282909-0_sp1.1 from training. Duration: 0.936375 2023-10-03 11:12:42,955 WARNING [train.py:1204] (2/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_178-177582-0_sp0.9 from training. Duration: 0.82225 2023-10-03 11:12:42,965 WARNING [train.py:1204] (2/4) Exclude cut with ID _24612_210_8913_1_1531998054193_3806359_357-35487-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 11:12:45,858 WARNING [train.py:1204] (2/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_138-332773-0 from training. Duration: 0.81 2023-10-03 11:12:45,885 WARNING [train.py:1204] (2/4) Exclude cut with ID _7624_210_1531_1_1528109802198_4933101_418-349834-0 from training. Duration: 0.6 2023-10-03 11:12:47,664 WARNING [train.py:1204] (2/4) Exclude cut with ID _7537_210_2084_1_1528502395431_3924359_14-312906-0_sp0.9 from training. Duration: 0.9333125 2023-10-03 11:12:49,194 WARNING [train.py:1204] (2/4) Exclude cut with ID _60057_210_10640_1_1534903065793_3333150_362-230377-0 from training. Duration: 0.87 2023-10-03 11:12:52,387 WARNING [train.py:1204] (2/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_138-64210-0 from training. Duration: 0.73 2023-10-03 11:12:54,375 WARNING [train.py:1204] (2/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_454-339504-0_sp1.1 from training. Duration: 0.97275 2023-10-03 11:12:58,390 WARNING [train.py:1204] (2/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_508-335369-0_sp0.9 from training. Duration: 0.5333125 2023-10-03 11:12:59,816 WARNING [train.py:1204] (2/4) Exclude cut with ID _5608_210_2106_1_1527415439089_6819768_909-82767-0_sp0.9 from training. Duration: 0.67775 2023-10-03 11:12:59,860 WARNING [train.py:1204] (2/4) Exclude cut with ID _6694_210_4599_1_1527852637490_3965529_99-11611-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 11:13:01,335 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_2655_210_2087_1_1525688959_3897400_52-279899-0_sp1.1 from training. Duration: 0.62725 2023-10-03 11:13:04,368 WARNING [train.py:1204] (2/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_375-245677-0_sp1.1 from training. Duration: 0.736375 2023-10-03 11:13:07,785 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_23850_210_4238_1_1532232597_1632433_32-185135-0 from training. Duration: 0.86 2023-10-03 11:13:09,135 WARNING [train.py:1204] (2/4) Exclude cut with ID _15113_210_8254_1_1530842438579_3716279_209-54533-0 from training. Duration: 0.96 2023-10-03 11:13:09,316 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.balancer_ff2.min_abs, batch_count=1246700.0, ans=0.1 2023-10-03 11:13:10,377 WARNING [train.py:1204] (2/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_401-53423-0 from training. Duration: 0.55 2023-10-03 11:13:11,677 WARNING [train.py:1204] (2/4) Exclude cut with ID _14867_210_1968_1_1530684179890_3846969_319-250868-0_sp1.1 from training. Duration: 0.9 2023-10-03 11:13:11,698 WARNING [train.py:1204] (2/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_106-30782-0_sp0.9 from training. Duration: 0.888875 2023-10-03 11:13:13,103 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_5960_210_1794_1_1527403865_3990999_515-2613-0 from training. Duration: 0.88 2023-10-03 11:13:16,309 WARNING [train.py:1204] (2/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_798-106179-0_sp0.9 from training. Duration: 0.92225 2023-10-03 11:13:17,765 WARNING [train.py:1204] (2/4) Exclude cut with ID _14227_210_4381_1_1530701804286_3940259_125-275642-0_sp1.1 from training. Duration: 0.9 2023-10-03 11:13:17,768 WARNING [train.py:1204] (2/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_79-200392-0_sp1.1 from training. Duration: 0.8818125 2023-10-03 11:13:19,125 WARNING [train.py:1204] (2/4) Exclude cut with ID _48836_210_12210_1_1534075848293_3904150_429-35475-0_sp0.9 from training. Duration: 0.988875 2023-10-03 11:13:19,148 WARNING [train.py:1204] (2/4) Exclude cut with ID _69884_210_9771_1_1535846392818_4166169_224-172517-0_sp1.1 from training. Duration: 0.97275 2023-10-03 11:13:22,316 WARNING [train.py:1204] (2/4) Exclude cut with ID _15113_210_8254_1_1530842438579_3716279_76-232507-0_sp1.1 from training. Duration: 0.97275 2023-10-03 11:13:22,319 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_135-5501-0 from training. Duration: 0.98 2023-10-03 11:13:25,508 WARNING [train.py:1204] (2/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_563-347832-0_sp0.9 from training. Duration: 0.988875 2023-10-03 11:13:25,518 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6786_210_1388_1_1527839699_4194495_273-149841-0 from training. Duration: 0.92 2023-10-03 11:13:25,536 WARNING [train.py:1204] (2/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_172-215932-0 from training. Duration: 0.66 2023-10-03 11:13:25,601 WARNING [train.py:1204] (2/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_939-132910-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 11:13:29,566 INFO [train.py:1046] (2/4) Epoch 36, batch 1100, loss[loss=0.1548, simple_loss=0.2488, pruned_loss=0.03035, over 24031.00 frames. ], tot_loss[loss=0.1592, simple_loss=0.2374, pruned_loss=0.04049, over 4687698.03 frames. ], batch size: 80, lr: 2.84e-03, grad_scale: 8.0 2023-10-03 11:13:29,636 WARNING [train.py:1204] (2/4) Exclude cut with ID _49371_210_12071_1_1533952761021_3812190_16-246001-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 11:13:33,765 WARNING [train.py:1204] (2/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_680-189165-0_sp1.1 from training. Duration: 0.936375 2023-10-03 11:13:35,303 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.conv_module1.balancer2.prob, batch_count=1246833.3333333333, ans=0.125 2023-10-03 11:13:38,425 WARNING [train.py:1204] (2/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_53-300544-0_sp1.1 from training. Duration: 0.6090625 2023-10-03 11:13:41,111 WARNING [train.py:1204] (2/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_540-275685-0_sp1.1 from training. Duration: 0.8090625 2023-10-03 11:13:41,138 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_38965_210_4852_1_1533174906_3484735_453-106579-0_sp1.1 from training. Duration: 0.92725 2023-10-03 11:13:41,175 WARNING [train.py:1204] (2/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_105-328174-0 from training. Duration: 0.76 2023-10-03 11:13:42,538 WARNING [train.py:1204] (2/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_279-126205-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 11:13:43,962 WARNING [train.py:1204] (2/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_5-55893-0_sp0.9 from training. Duration: 0.611125 2023-10-03 11:13:47,144 WARNING [train.py:1204] (2/4) Exclude cut with ID _13678_210_7497_1_1530530069226_3463170_141-10835-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 11:13:48,594 WARNING [train.py:1204] (2/4) Exclude cut with ID _22083_210_8254_1_1531702862439_3678970_201-85738-0_sp1.1 from training. Duration: 0.8181875 2023-10-03 11:13:48,628 WARNING [train.py:1204] (2/4) Exclude cut with ID _4697_210_2069_1_1526815801953_3823098_51-1049-0 from training. Duration: 0.75 2023-10-03 11:13:50,070 WARNING [train.py:1204] (2/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_206-10097-0_sp0.9 from training. Duration: 0.5444375 2023-10-03 11:13:52,068 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_4528_210_3273_1_1527076654_3276777_360-63661-0_sp1.1 from training. Duration: 0.92725 2023-10-03 11:13:52,076 WARNING [train.py:1204] (2/4) Exclude cut with ID _64086_210_7994_1_1535270417019_7435949_449-31567-0_sp1.1 from training. Duration: 0.8818125 2023-10-03 11:13:55,442 WARNING [train.py:1204] (2/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_245-174553-0_sp0.9 from training. Duration: 0.97775 2023-10-03 11:13:56,953 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6545_210_2040_1_1527774670_4245258_379-14182-0_sp1.1 from training. Duration: 0.72725 2023-10-03 11:14:02,412 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_3133_210_2087_1_1526106446_2324690_256-206070-0_sp1.1 from training. Duration: 0.963625 2023-10-03 11:14:03,917 WARNING [train.py:1204] (2/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_278-104676-0 from training. Duration: 0.98 2023-10-03 11:14:05,382 WARNING [train.py:1204] (2/4) Exclude cut with ID IC0671W0065-331908 from training. Duration: 0.896 2023-10-03 11:14:06,707 WARNING [train.py:1204] (2/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_244-129917-0_sp1.1 from training. Duration: 0.97275 2023-10-03 11:14:09,781 WARNING [train.py:1204] (2/4) Exclude cut with ID _7852_210_5219_1_1528286471316_8139400_1129-170857-0_sp1.1 from training. Duration: 0.97275 2023-10-03 11:14:10,006 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.attention_skip_rate, batch_count=1246966.6666666667, ans=0.0 2023-10-03 11:14:11,160 WARNING [train.py:1204] (2/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_443-30034-0_sp0.9 from training. Duration: 0.711125 2023-10-03 11:14:11,189 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_31583_210_3852_1_1532595851_4024413_250-332999-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 11:14:13,962 WARNING [train.py:1204] (2/4) Exclude cut with ID _8772_210_1850_1_1528635595972_3664011_247-336448-0 from training. Duration: 0.74 2023-10-03 11:14:15,359 WARNING [train.py:1204] (2/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_567-135114-0_sp0.9 from training. Duration: 0.9666875 2023-10-03 11:14:15,364 WARNING [train.py:1204] (2/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_557-349534-0_sp1.1 from training. Duration: 0.82725 2023-10-03 11:14:15,380 WARNING [train.py:1204] (2/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_1341-26728-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 11:14:15,452 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_40222_210_6228_1_1533385298_4481939_375-328051-0_sp1.1 from training. Duration: 0.97275 2023-10-03 11:14:15,481 WARNING [train.py:1204] (2/4) Exclude cut with ID _4697_210_2069_1_1526815801953_3823098_276-96593-0 from training. Duration: 0.77 2023-10-03 11:14:23,265 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_53071_210_16554_1_1534490405_2954066_265-170472-0_sp0.9 from training. Duration: 0.92225 2023-10-03 11:14:23,282 WARNING [train.py:1204] (2/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_365-1492-0 from training. Duration: 0.91 2023-10-03 11:14:24,741 WARNING [train.py:1204] (2/4) Exclude cut with ID _66873_210_5517_1_1535454136506_4293370_654-52713-0_sp0.9 from training. Duration: 0.8555625 2023-10-03 11:14:30,805 WARNING [train.py:1204] (2/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_113-92608-0_sp1.1 from training. Duration: 0.4909375 2023-10-03 11:14:31,124 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.feed_forward1.hidden_balancer.prob, batch_count=1247100.0, ans=0.125 2023-10-03 11:14:32,403 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_46013_210_7234_1_1533776362_3744032_50-141535-0 from training. Duration: 0.99 2023-10-03 11:14:32,411 WARNING [train.py:1204] (2/4) Exclude cut with ID _13942_210_9988_1_1530604906036_3733360_499-55676-0_sp1.1 from training. Duration: 0.563625 2023-10-03 11:14:33,765 WARNING [train.py:1204] (2/4) Exclude cut with ID _30800_210_22382_1_1532499347904_4515959_375-65143-0_sp1.1 from training. Duration: 0.97275 2023-10-03 11:14:36,558 WARNING [train.py:1204] (2/4) Exclude cut with ID _16756_210_4915_1_1531047803763_7104259_199-325608-0_sp1.1 from training. Duration: 0.92725 2023-10-03 11:14:36,582 WARNING [train.py:1204] (2/4) Exclude cut with ID _31649_210_6228_1_1532584608063_4091078_443-124391-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 11:14:39,208 WARNING [train.py:1204] (2/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_146-218763-0 from training. Duration: 0.86 2023-10-03 11:14:39,267 WARNING [train.py:1204] (2/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_567-135114-0_sp1.1 from training. Duration: 0.7909375 2023-10-03 11:14:39,309 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_26326_210_7925_1_1533002493_3592406_462-288299-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 11:14:40,677 WARNING [train.py:1204] (2/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_322-217220-0 from training. Duration: 0.86 2023-10-03 11:14:40,709 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_238-256668-0_sp1.1 from training. Duration: 0.62725 2023-10-03 11:14:42,029 WARNING [train.py:1204] (2/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_371-18169-0 from training. Duration: 0.86 2023-10-03 11:14:43,464 INFO [train.py:1046] (2/4) Epoch 36, batch 1150, loss[loss=0.1488, simple_loss=0.2273, pruned_loss=0.03513, over 23664.00 frames. ], tot_loss[loss=0.1594, simple_loss=0.2385, pruned_loss=0.04014, over 4710171.51 frames. ], batch size: 149, lr: 2.84e-03, grad_scale: 8.0 2023-10-03 11:14:43,510 WARNING [train.py:1204] (2/4) Exclude cut with ID _58480_210_12392_1_1534811121111_3646160_23-74512-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 11:14:43,515 WARNING [train.py:1204] (2/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_784-213099-0_sp1.1 from training. Duration: 0.8454375 2023-10-03 11:14:45,590 WARNING [train.py:1204] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_625-102955-0_sp1.1 from training. Duration: 0.7 2023-10-03 11:14:45,894 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.conv_skip_rate, batch_count=1247166.6666666667, ans=0.0 2023-10-03 11:14:50,111 WARNING [train.py:1204] (2/4) Exclude cut with ID _49748_210_24116_1_1533986971737_3158910_176-174327-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 11:14:51,554 WARNING [train.py:1204] (2/4) Exclude cut with ID _50310_210_15488_1_1534068043345_3641080_432-96140-0_sp1.1 from training. Duration: 0.936375 2023-10-03 11:14:52,961 WARNING [train.py:1204] (2/4) Exclude cut with ID _40402_210_18219_1_1533522324115_2953150_76-14910-0_sp1.1 from training. Duration: 0.92725 2023-10-03 11:14:54,716 WARNING [train.py:1204] (2/4) Exclude cut with ID _21783_210_11357_1_1531742458730_3762679_40-19458-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 11:14:54,755 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_14056_210_4163_1_1530604483_7572827_46-93547-0 from training. Duration: 0.95 2023-10-03 11:14:54,784 WARNING [train.py:1204] (2/4) Exclude cut with ID _21631_210_2253_1_1531907880778_3160339_321-35325-0_sp1.1 from training. Duration: 0.963625 2023-10-03 11:14:57,972 WARNING [train.py:1204] (2/4) Exclude cut with ID _4779_210_2084_1_1527318022944_3914893_500-285013-0 from training. Duration: 0.69 2023-10-03 11:14:59,429 WARNING [train.py:1204] (2/4) Exclude cut with ID _43552_210_8913_1_1533726031059_3523429_301-93324-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 11:14:59,435 WARNING [train.py:1204] (2/4) Exclude cut with ID _6473_210_1741_1_1527901307396_3815380_108-17335-0_sp1.1 from training. Duration: 0.5909375 2023-10-03 11:15:04,868 WARNING [train.py:1204] (2/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_206-10097-0 from training. Duration: 0.49 2023-10-03 11:15:06,092 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.611e+02 1.838e+02 2.022e+02 2.263e+02 3.460e+02, threshold=4.043e+02, percent-clipped=0.0 2023-10-03 11:15:07,592 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_36756_210_7234_1_1533026991_966276_103-77513-0_sp1.1 from training. Duration: 0.97275 2023-10-03 11:15:10,684 WARNING [train.py:1204] (2/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_662-18193-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 11:15:10,742 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_26433_210_12108_1_1532157158_4056480_439-52438-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 11:15:12,071 WARNING [train.py:1204] (2/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_464-33931-0 from training. Duration: 0.54 2023-10-03 11:15:12,077 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_3474_210_2028_1_1526188513_4825246_145-309131-0_sp0.9 from training. Duration: 0.82225 2023-10-03 11:15:12,096 WARNING [train.py:1204] (2/4) Exclude cut with ID _9679_210_7497_1_1529129212906_4363929_446-308321-0_sp1.1 from training. Duration: 0.963625 2023-10-03 11:15:12,272 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.conv_module1.balancer1.max_abs, batch_count=1247300.0, ans=10.0 2023-10-03 11:15:15,062 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.self_attn_weights.pos_emb_skip_rate, batch_count=1247300.0, ans=0.0 2023-10-03 11:15:18,215 WARNING [train.py:1204] (2/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_662-275621-0 from training. Duration: 0.42 2023-10-03 11:15:19,423 WARNING [train.py:1204] (2/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_344-337148-0_sp1.1 from training. Duration: 0.97275 2023-10-03 11:15:20,838 WARNING [train.py:1204] (2/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_759-179071-0_sp1.1 from training. Duration: 0.92725 2023-10-03 11:15:22,509 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.feed_forward3.hidden_balancer.prob, batch_count=1247300.0, ans=0.125 2023-10-03 11:15:24,421 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.conv_module2.balancer1.prob, batch_count=1247300.0, ans=0.125 2023-10-03 11:15:28,820 WARNING [train.py:1204] (2/4) Exclude cut with ID _28783_210_7943_1_1532671164808_7155479_293-12188-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 11:15:31,104 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.conv_module1.balancer1.prob, batch_count=1247366.6666666667, ans=0.125 2023-10-03 11:15:34,960 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_17613_210_2151_1_1531137569_2366160_235-320304-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 11:15:34,976 WARNING [train.py:1204] (2/4) Exclude cut with ID _14349_210_8341_1_1530615710818_3442470_170-336541-0 from training. Duration: 0.86 2023-10-03 11:15:35,037 WARNING [train.py:1204] (2/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_375-218677-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 11:15:35,218 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.balancer1.prob, batch_count=1247366.6666666667, ans=0.125 2023-10-03 11:15:36,240 WARNING [train.py:1204] (2/4) Exclude cut with ID _28317_210_12742_1_1532480302381_8510325_829-202956-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 11:15:40,438 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9029W0436-509823 from training. Duration: 0.768 2023-10-03 11:15:43,248 WARNING [train.py:1204] (2/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_231-112214-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 11:15:43,510 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.feed_forward1.out_proj.dropout_p, batch_count=1247433.3333333333, ans=0.1 2023-10-03 11:15:50,640 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9059W0470-524883 from training. Duration: 0.3415625 2023-10-03 11:15:53,334 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_43266_210_1794_1_1533521789_4497218_206-79512-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 11:15:55,242 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_294-163793-0_sp0.9 from training. Duration: 0.911125 2023-10-03 11:15:56,517 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6290_210_3273_1_1527595224_1282787_115-87095-0_sp1.1 from training. Duration: 0.5 2023-10-03 11:15:56,566 WARNING [train.py:1204] (2/4) Exclude cut with ID _14100_210_8341_1_1530529320613_3678069_279-55760-0_sp1.1 from training. Duration: 0.8454375 2023-10-03 11:15:57,876 INFO [train.py:1046] (2/4) Epoch 36, batch 1200, loss[loss=0.1641, simple_loss=0.2375, pruned_loss=0.04539, over 23557.00 frames. ], tot_loss[loss=0.1595, simple_loss=0.2389, pruned_loss=0.0401, over 4716216.14 frames. ], batch size: 256, lr: 2.84e-03, grad_scale: 16.0 2023-10-03 11:15:59,395 WARNING [train.py:1204] (2/4) Exclude cut with ID _14621_210_1925_1_1530686901185_3675420_419-234200-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 11:16:04,267 WARNING [train.py:1204] (2/4) Exclude cut with ID _60229_210_27767_1_1534838406158_3655350_353-213656-0_sp1.1 from training. Duration: 0.863625 2023-10-03 11:16:04,277 WARNING [train.py:1204] (2/4) Exclude cut with ID _48678_210_5663_1_1533988830947_3769199_306-255886-0_sp1.1 from training. Duration: 0.7 2023-10-03 11:16:06,883 WARNING [train.py:1204] (2/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_466-3492-0_sp1.1 from training. Duration: 0.97275 2023-10-03 11:16:06,883 WARNING [train.py:1204] (2/4) Exclude cut with ID _8755_210_1876_1_1528628559357_3258209_199-242167-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 11:16:06,942 WARNING [train.py:1204] (2/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_237-89684-0_sp1.1 from training. Duration: 0.87275 2023-10-03 11:16:08,370 WARNING [train.py:1204] (2/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_70-84121-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 11:16:11,094 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7544_210_6753_1_1528587722_3576686_172-171750-0_sp1.1 from training. Duration: 0.5909375 2023-10-03 11:16:11,202 WARNING [train.py:1204] (2/4) Exclude cut with ID _24271_210_12658_1_1531987270022_7190519_40-321822-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 11:16:11,226 WARNING [train.py:1204] (2/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_227-83612-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 11:16:13,863 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9156W0015-572508 from training. Duration: 0.896 2023-10-03 11:16:16,003 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.3.encoder.layers.3.feed_forward1.out_whiten, num_groups=1, num_channels=512, metric=4.99 vs. limit=15.0 2023-10-03 11:16:16,724 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_292-265928-0 from training. Duration: 0.51 2023-10-03 11:16:19,358 WARNING [train.py:1204] (2/4) Exclude cut with ID _14747_210_8341_1_1530685797463_3654089_147-131229-0_sp1.1 from training. Duration: 0.6909375 2023-10-03 11:16:19,517 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.feed_forward2.hidden_balancer.prob, batch_count=1247566.6666666667, ans=0.125 2023-10-03 11:16:24,013 WARNING [train.py:1204] (2/4) Exclude cut with ID _45297_210_2721_1_1533715351381_3264949_351-147136-0_sp0.9 from training. Duration: 0.9666875 2023-10-03 11:16:26,789 WARNING [train.py:1204] (2/4) Exclude cut with ID _5250_210_5219_1_1527249561806_8692110_1102-324568-0_sp1.1 from training. Duration: 0.97275 2023-10-03 11:16:28,707 WARNING [train.py:1204] (2/4) Exclude cut with ID _18516_210_9206_1_1531704653206_3692406_465-75446-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 11:16:28,716 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9203W0126-595623 from training. Duration: 0.896 2023-10-03 11:16:30,140 WARNING [train.py:1204] (2/4) Exclude cut with ID _55383_210_10635_1_1534468426172_2231669_139-328410-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 11:16:36,161 WARNING [train.py:1204] (2/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_166-246557-0_sp0.9 from training. Duration: 0.711125 2023-10-03 11:16:36,165 WARNING [train.py:1204] (2/4) Exclude cut with ID _30039_210_13270_1_1532484612900_2725150_239-304872-0_sp1.1 from training. Duration: 0.963625 2023-10-03 11:16:37,492 WARNING [train.py:1204] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_6-226750-0 from training. Duration: 0.94 2023-10-03 11:16:38,861 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_135-5501-0_sp1.1 from training. Duration: 0.8909375 2023-10-03 11:16:39,053 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.feed_forward3.hidden_balancer.prob, batch_count=1247633.3333333333, ans=0.125 2023-10-03 11:16:41,664 WARNING [train.py:1204] (2/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_359-118926-0 from training. Duration: 0.72 2023-10-03 11:16:43,207 WARNING [train.py:1204] (2/4) Exclude cut with ID _8064_210_5399_1_1528372299291_4282973_4-91037-0 from training. Duration: 0.88 2023-10-03 11:16:43,230 WARNING [train.py:1204] (2/4) Exclude cut with ID _6653_210_2629_1_1527919172441_6732679_415-288492-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 11:16:45,916 WARNING [train.py:1204] (2/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_544-287179-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 11:16:46,019 WARNING [train.py:1204] (2/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_122-32606-0_sp1.1 from training. Duration: 0.92725 2023-10-03 11:16:47,320 WARNING [train.py:1204] (2/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_121-284152-0_sp1.1 from training. Duration: 0.536375 2023-10-03 11:16:48,812 WARNING [train.py:1204] (2/4) Exclude cut with ID _44718_210_5816_1_1534050051869_6469030_350-299477-0_sp1.1 from training. Duration: 0.97275 2023-10-03 11:16:48,824 WARNING [train.py:1204] (2/4) Exclude cut with ID _6653_210_2629_1_1527919172441_6732679_1103-56445-0_sp0.9 from training. Duration: 0.811125 2023-10-03 11:16:50,143 WARNING [train.py:1204] (2/4) Exclude cut with ID _12369_210_4381_1_1530356078581_4388520_64-298917-0_sp0.9 from training. Duration: 0.97775 2023-10-03 11:16:50,203 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_2245_210_2715_1_1525511548_3540254_310-52483-0 from training. Duration: 0.88 2023-10-03 11:16:50,241 WARNING [train.py:1204] (2/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_401-158239-0_sp1.1 from training. Duration: 0.6454375 2023-10-03 11:16:51,537 WARNING [train.py:1204] (2/4) Exclude cut with ID _59502_210_12210_1_1535107024698_1835139_146-162495-0_sp1.1 from training. Duration: 0.836375 2023-10-03 11:16:51,555 WARNING [train.py:1204] (2/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_41-31157-0_sp0.9 from training. Duration: 0.6333125 2023-10-03 11:16:53,618 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_68717_210_3528_1_1535628683_1556759_95-36722-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 11:16:53,625 WARNING [train.py:1204] (2/4) Exclude cut with ID _30196_210_1664_1_1533708496522_7074140_654-162611-0_sp1.1 from training. Duration: 0.92725 2023-10-03 11:16:58,172 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_9302_210_2550_1_1528889962_4655416_638-301620-0_sp0.9 from training. Duration: 0.52225 2023-10-03 11:16:59,582 WARNING [train.py:1204] (2/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_478-162247-0_sp1.1 from training. Duration: 0.7454375 2023-10-03 11:17:04,032 WARNING [train.py:1204] (2/4) Exclude cut with ID _45297_210_2721_1_1533715351381_3264949_353-9541-0 from training. Duration: 0.97 2023-10-03 11:17:08,149 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9653W0190-675468 from training. Duration: 0.9935 2023-10-03 11:17:09,551 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_49730_210_13049_1_1534075294_1830676_214-84436-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 11:17:10,303 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.2.encoder.layers.2.feed_forward1.out_whiten, num_groups=1, num_channels=384, metric=5.80 vs. limit=15.0 2023-10-03 11:17:10,802 INFO [train.py:1046] (2/4) Epoch 36, batch 1250, loss[loss=0.1529, simple_loss=0.2476, pruned_loss=0.0291, over 24689.00 frames. ], tot_loss[loss=0.16, simple_loss=0.2395, pruned_loss=0.04021, over 4724165.11 frames. ], batch size: 73, lr: 2.84e-03, grad_scale: 16.0 2023-10-03 11:17:12,183 WARNING [train.py:1204] (2/4) Exclude cut with ID _50093_210_4220_1_1534417141207_3432299_259-322252-0_sp1.1 from training. Duration: 0.836375 2023-10-03 11:17:13,527 WARNING [train.py:1204] (2/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_599-50220-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 11:17:14,887 WARNING [train.py:1204] (2/4) Exclude cut with ID _21878_210_3525_1_1531738772002_7264330_174-109016-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 11:17:16,403 WARNING [train.py:1204] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_665-196155-0 from training. Duration: 0.67 2023-10-03 11:17:20,456 WARNING [train.py:1204] (2/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_656-188361-0_sp1.1 from training. Duration: 0.7909375 2023-10-03 11:17:21,784 WARNING [train.py:1204] (2/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_60-18123-0_sp1.1 from training. Duration: 0.8 2023-10-03 11:17:21,992 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.self_attn_weights.pos_emb_skip_rate, batch_count=1247833.3333333333, ans=0.0 2023-10-03 11:17:23,175 WARNING [train.py:1204] (2/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_535-14410-0 from training. Duration: 0.47 2023-10-03 11:17:23,373 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=1247900.0, ans=0.1 2023-10-03 11:17:25,631 WARNING [train.py:1204] (2/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_94-126128-0_sp1.1 from training. Duration: 0.7818125 2023-10-03 11:17:26,977 WARNING [train.py:1204] (2/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_356-31216-0_sp1.1 from training. Duration: 0.6818125 2023-10-03 11:17:30,313 WARNING [train.py:1204] (2/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_443-30034-0_sp1.1 from training. Duration: 0.5818125 2023-10-03 11:17:31,715 WARNING [train.py:1204] (2/4) Exclude cut with ID _26907_210_7234_1_1532221740694_3205263_337-2494-0_sp1.1 from training. Duration: 0.8 2023-10-03 11:17:31,795 WARNING [train.py:1204] (2/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_49-97158-0_sp0.9 from training. Duration: 0.8444375 2023-10-03 11:17:32,976 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.503e+02 1.893e+02 2.096e+02 2.315e+02 3.131e+02, threshold=4.193e+02, percent-clipped=0.0 2023-10-03 11:17:33,059 WARNING [train.py:1204] (2/4) Exclude cut with ID _7877_210_6014_1_1528286358099_3750136_553-170128-0_sp1.1 from training. Duration: 0.963625 2023-10-03 11:17:36,339 WARNING [train.py:1204] (2/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_202-293523-0_sp0.9 from training. Duration: 0.8 2023-10-03 11:17:39,187 WARNING [train.py:1204] (2/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_188-171673-0_sp0.9 from training. Duration: 0.6444375 2023-10-03 11:17:39,209 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_120-41668-0_sp1.1 from training. Duration: 0.636375 2023-10-03 11:17:39,214 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_40223_210_6228_1_1533298404_4812267_613-78769-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 11:17:41,884 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_53861_210_5399_1_1534422156_4272077_150-336899-0_sp1.1 from training. Duration: 0.963625 2023-10-03 11:17:41,944 WARNING [train.py:1204] (2/4) Exclude cut with ID _14459_210_2228_1_1531443641946_7516700_1146-56843-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 11:17:44,747 WARNING [train.py:1204] (2/4) Exclude cut with ID _39867_210_9826_1_1533553222030_9344259_1194-299928-0_sp1.1 from training. Duration: 0.92725 2023-10-03 11:17:44,827 WARNING [train.py:1204] (2/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_214-291369-0_sp0.9 from training. Duration: 0.7 2023-10-03 11:17:49,250 WARNING [train.py:1204] (2/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_358-215558-0 from training. Duration: 0.93 2023-10-03 11:17:49,298 WARNING [train.py:1204] (2/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_577-287150-0_sp0.9 from training. Duration: 0.911125 2023-10-03 11:17:50,722 WARNING [train.py:1204] (2/4) Exclude cut with ID _60860_210_8254_1_1534896015875_3557169_229-187367-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 11:17:52,093 WARNING [train.py:1204] (2/4) Exclude cut with ID _6275_210_2084_1_1528110062213_7669049_857-298942-0 from training. Duration: 0.85 2023-10-03 11:17:53,985 WARNING [train.py:1204] (2/4) Exclude cut with ID _40137_210_12210_1_1533290484046_3795180_249-187059-0_sp1.1 from training. Duration: 0.8 2023-10-03 11:17:53,994 WARNING [train.py:1204] (2/4) Exclude cut with ID ID0171W0508-765603 from training. Duration: 0.541 2023-10-03 11:17:54,012 WARNING [train.py:1204] (2/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_521-219439-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 11:17:54,020 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_40223_210_6228_1_1533298404_4812267_741-302616-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 11:17:58,558 WARNING [train.py:1204] (2/4) Exclude cut with ID _65293_210_9070_1_1535545549765_4004210_438-154735-0_sp1.1 from training. Duration: 0.92725 2023-10-03 11:18:01,847 WARNING [train.py:1204] (2/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_564-132950-0_sp1.1 from training. Duration: 0.92725 2023-10-03 11:18:01,878 WARNING [train.py:1204] (2/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_291-270881-0_sp1.1 from training. Duration: 0.8818125 2023-10-03 11:18:05,120 WARNING [train.py:1204] (2/4) Exclude cut with ID _60229_210_27767_1_1534838406158_3655350_353-213656-0 from training. Duration: 0.95 2023-10-03 11:18:05,138 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_3133_210_2087_1_1526106446_2324690_243-143119-0 from training. Duration: 0.77 2023-10-03 11:18:05,171 WARNING [train.py:1204] (2/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_690-126306-0 from training. Duration: 0.78 2023-10-03 11:18:08,088 WARNING [train.py:1204] (2/4) Exclude cut with ID _30304_210_6319_1_1532476827261_7582060_347-95003-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 11:18:09,635 WARNING [train.py:1204] (2/4) Exclude cut with ID _2322_210_2667_1_1525604548971_3656299_440-80494-0 from training. Duration: 0.88 2023-10-03 11:18:09,655 WARNING [train.py:1204] (2/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_370-229434-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 11:18:12,476 WARNING [train.py:1204] (2/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_124-345840-0_sp1.1 from training. Duration: 0.47275 2023-10-03 11:18:12,484 WARNING [train.py:1204] (2/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_165-123581-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 11:18:13,911 WARNING [train.py:1204] (2/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_453-282588-0 from training. Duration: 0.98 2023-10-03 11:18:13,928 WARNING [train.py:1204] (2/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_299-34413-0_sp0.9 from training. Duration: 0.72225 2023-10-03 11:18:13,978 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_40036_210_13405_1_1533648647_1315388_48-146016-0_sp1.1 from training. Duration: 0.8454375 2023-10-03 11:18:14,006 WARNING [train.py:1204] (2/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_99-167392-0_sp0.9 from training. Duration: 0.67775 2023-10-03 11:18:14,050 WARNING [train.py:1204] (2/4) Exclude cut with ID _48677_210_5663_1_1533902405919_3892989_37-207114-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 11:18:16,655 WARNING [train.py:1204] (2/4) Exclude cut with ID _29857_210_20034_1_1532395886753_3798160_335-163562-0 from training. Duration: 0.85 2023-10-03 11:18:19,522 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_36492_210_7927_1_1533261987_4790834_703-343351-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 11:18:20,848 WARNING [train.py:1204] (2/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_80-147301-0_sp0.9 from training. Duration: 0.9333125 2023-10-03 11:18:22,672 WARNING [train.py:1204] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_218-295097-0_sp0.9 from training. Duration: 0.8555625 2023-10-03 11:18:24,242 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_2295_210_2087_1_1525500993_2407863_116-238894-0_sp1.1 from training. Duration: 0.636375 2023-10-03 11:18:25,571 INFO [train.py:1046] (2/4) Epoch 36, batch 1300, loss[loss=0.1605, simple_loss=0.2398, pruned_loss=0.04056, over 23525.00 frames. ], tot_loss[loss=0.1609, simple_loss=0.2405, pruned_loss=0.04068, over 4717944.86 frames. ], batch size: 120, lr: 2.84e-03, grad_scale: 16.0 2023-10-03 11:18:26,231 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.4.encoder.layers.1.feed_forward3.out_whiten, num_groups=1, num_channels=384, metric=13.31 vs. limit=15.0 2023-10-03 11:18:27,098 WARNING [train.py:1204] (2/4) Exclude cut with ID _3231_210_2106_1_1526205864934_6784220_697-171068-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 11:18:27,124 WARNING [train.py:1204] (2/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_102-77064-0 from training. Duration: 0.93 2023-10-03 11:18:31,752 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_13091_210_7925_1_1530609447_8987354_1186-261957-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 11:18:33,557 WARNING [train.py:1204] (2/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_91-277684-0_sp1.1 from training. Duration: 0.67275 2023-10-03 11:18:34,864 WARNING [train.py:1204] (2/4) Exclude cut with ID _31088_210_16446_1_1532520012284_3440919_159-313751-0_sp1.1 from training. Duration: 0.936375 2023-10-03 11:18:34,970 WARNING [train.py:1204] (2/4) Exclude cut with ID _42326_210_7770_1_1533814282531_3707269_227-203721-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 11:18:37,537 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_3531_210_3528_1_1526294203_5216456_309-227302-0_sp1.1 from training. Duration: 0.62725 2023-10-03 11:18:37,594 WARNING [train.py:1204] (2/4) Exclude cut with ID _14087_210_2253_1_1530529224512_3014029_226-242140-0 from training. Duration: 0.81 2023-10-03 11:18:41,957 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.ff3_skip_rate, batch_count=1248233.3333333333, ans=0.0 2023-10-03 11:18:44,533 WARNING [train.py:1204] (2/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_122-109023-0_sp1.1 from training. Duration: 0.6909375 2023-10-03 11:18:44,621 WARNING [train.py:1204] (2/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_660-211088-0_sp0.9 from training. Duration: 0.688875 2023-10-03 11:18:47,306 WARNING [train.py:1204] (2/4) Exclude cut with ID _39290_210_4220_1_1533862778760_3075118_92-311600-0 from training. Duration: 0.88 2023-10-03 11:18:50,132 WARNING [train.py:1204] (2/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_390-339137-0_sp1.1 from training. Duration: 0.5090625 2023-10-03 11:18:50,699 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.4.encoder.layers.0.conv_module2.whiten, num_groups=1, num_channels=384, metric=8.84 vs. limit=15.0 2023-10-03 11:18:53,593 WARNING [train.py:1204] (2/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_449-341946-0_sp1.1 from training. Duration: 0.963625 2023-10-03 11:18:54,846 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_68095_210_20185_1_1535718226_8888855_884-327007-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 11:18:54,944 WARNING [train.py:1204] (2/4) Exclude cut with ID _42326_210_7770_1_1533814282531_3707269_308-222998-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 11:18:55,151 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.nonlin_attention.balancer.prob, batch_count=1248300.0, ans=0.125 2023-10-03 11:18:56,865 WARNING [train.py:1204] (2/4) Exclude cut with ID _40399_210_4353_1_1533305658194_3279406_344-238708-0_sp1.1 from training. Duration: 0.963625 2023-10-03 11:18:58,174 WARNING [train.py:1204] (2/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_87-131427-0_sp1.1 from training. Duration: 0.7090625 2023-10-03 11:18:58,232 WARNING [train.py:1204] (2/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_110-130961-0_sp0.9 from training. Duration: 0.52225 2023-10-03 11:18:58,266 WARNING [train.py:1204] (2/4) Exclude cut with ID _55153_210_2253_1_1534384786677_3162096_148-79022-0 from training. Duration: 0.96 2023-10-03 11:19:04,712 WARNING [train.py:1204] (2/4) Exclude cut with ID _9679_210_7497_1_1529129212906_4363929_512-313889-0_sp1.1 from training. Duration: 0.736375 2023-10-03 11:19:04,724 WARNING [train.py:1204] (2/4) Exclude cut with ID _7970_210_1883_1_1528455616296_3472230_25-178407-0_sp1.1 from training. Duration: 0.6454375 2023-10-03 11:19:06,109 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_3133_210_2087_1_1526106446_2324690_246-125000-0 from training. Duration: 0.62 2023-10-03 11:19:06,165 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_15126_210_6753_1_1530778677_2591185_113-157651-0_sp0.9 from training. Duration: 0.7555625 2023-10-03 11:19:07,557 WARNING [train.py:1204] (2/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_105-63309-0_sp1.1 from training. Duration: 0.8909375 2023-10-03 11:19:10,258 WARNING [train.py:1204] (2/4) Exclude cut with ID _29877_210_6228_1_1532397791106_5276149_578-177271-0_sp1.1 from training. Duration: 0.936375 2023-10-03 11:19:11,496 WARNING [train.py:1204] (2/4) Exclude cut with ID _44831_210_13427_1_1533816011549_3689035_32-188147-0 from training. Duration: 0.96 2023-10-03 11:19:11,539 WARNING [train.py:1204] (2/4) Exclude cut with ID _12369_210_4381_1_1530356078581_4388520_300-254337-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 11:19:11,552 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_128-241008-0 from training. Duration: 0.94 2023-10-03 11:19:12,995 WARNING [train.py:1204] (2/4) Exclude cut with ID _21878_210_3525_1_1531738772002_7264330_53-279178-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 11:19:15,898 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_41452_210_7291_1_1533437687_3941281_273-334088-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 11:19:15,900 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_128-241008-0_sp1.1 from training. Duration: 0.8545625 2023-10-03 11:19:21,777 WARNING [train.py:1204] (2/4) Exclude cut with ID _8394_210_6702_1_1528590504316_3540189_379-49457-0 from training. Duration: 0.87 2023-10-03 11:19:22,980 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_105-114797-0 from training. Duration: 0.84 2023-10-03 11:19:23,056 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_14928_210_5632_1_1530773461_4714000_454-112198-0 from training. Duration: 0.66 2023-10-03 11:19:29,057 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_456-22141-0_sp1.1 from training. Duration: 0.8 2023-10-03 11:19:31,932 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_115-233929-0 from training. Duration: 0.72 2023-10-03 11:19:33,840 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_616-16795-0_sp1.1 from training. Duration: 0.963625 2023-10-03 11:19:40,015 INFO [train.py:1046] (2/4) Epoch 36, batch 1350, loss[loss=0.1689, simple_loss=0.2557, pruned_loss=0.04109, over 24303.00 frames. ], tot_loss[loss=0.1602, simple_loss=0.2395, pruned_loss=0.04042, over 4718767.70 frames. ], batch size: 74, lr: 2.84e-03, grad_scale: 16.0 2023-10-03 11:19:41,401 WARNING [train.py:1204] (2/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_448-212135-0 from training. Duration: 0.91 2023-10-03 11:19:42,989 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.bypass.skip_rate, batch_count=1248500.0, ans=0.09899494936611666 2023-10-03 11:19:44,127 WARNING [train.py:1204] (2/4) Exclude cut with ID _46683_210_9642_1_1533730913492_4591116_201-205677-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 11:19:45,527 WARNING [train.py:1204] (2/4) Exclude cut with ID _64086_210_7994_1_1535270417019_7435949_43-80483-0_sp1.1 from training. Duration: 0.97275 2023-10-03 11:19:47,064 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.ff2_skip_rate, batch_count=1248500.0, ans=0.0 2023-10-03 11:19:48,317 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_240-134124-0_sp1.1 from training. Duration: 0.963625 2023-10-03 11:19:48,344 WARNING [train.py:1204] (2/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_907-120940-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 11:19:50,809 WARNING [train.py:1204] (2/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_848-280957-0_sp1.1 from training. Duration: 0.7909375 2023-10-03 11:19:51,006 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.bypass_mid.scale_min, batch_count=1248500.0, ans=0.2 2023-10-03 11:19:52,136 WARNING [train.py:1204] (2/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_243-42116-0_sp0.9 from training. Duration: 0.811125 2023-10-03 11:19:55,691 WARNING [train.py:1204] (2/4) Exclude cut with ID _6694_210_4599_1_1527852637490_3965529_116-109398-0_sp0.9 from training. Duration: 0.811125 2023-10-03 11:19:57,071 WARNING [train.py:1204] (2/4) Exclude cut with ID _24314_210_4915_1_1532135078116_7170179_521-258868-0 from training. Duration: 0.79 2023-10-03 11:19:58,448 WARNING [train.py:1204] (2/4) Exclude cut with ID _2220_210_1819_1_1525509109898_3658680_50-99837-0_sp0.9 from training. Duration: 0.6 2023-10-03 11:20:00,228 WARNING [train.py:1204] (2/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_313-134930-0_sp1.1 from training. Duration: 0.8818125 2023-10-03 11:20:01,941 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.606e+02 2.023e+02 2.236e+02 2.575e+02 4.914e+02, threshold=4.472e+02, percent-clipped=3.0 2023-10-03 11:20:02,078 WARNING [train.py:1204] (2/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_125-144174-0 from training. Duration: 0.39 2023-10-03 11:20:03,492 WARNING [train.py:1204] (2/4) Exclude cut with ID _59288_210_5854_1_1535077757774_3412246_275-293897-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 11:20:03,578 WARNING [train.py:1204] (2/4) Exclude cut with ID _40519_210_13794_1_1533715228655_7337390_722-52880-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 11:20:03,598 WARNING [train.py:1204] (2/4) Exclude cut with ID _7537_210_2084_1_1528502395431_3924359_128-268029-0 from training. Duration: 0.98 2023-10-03 11:20:06,407 WARNING [train.py:1204] (2/4) Exclude cut with ID _8021_210_7994_1_1528376451358_3653069_179-45206-0 from training. Duration: 0.86 2023-10-03 11:20:09,634 WARNING [train.py:1204] (2/4) Exclude cut with ID _13252_210_1925_1_1530671372047_3336909_88-276668-0 from training. Duration: 0.99 2023-10-03 11:20:11,032 WARNING [train.py:1204] (2/4) Exclude cut with ID _22613_210_5219_1_1532253599686_4090170_290-212862-0_sp1.1 from training. Duration: 0.97275 2023-10-03 11:20:11,036 WARNING [train.py:1204] (2/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_465-214726-0 from training. Duration: 0.86 2023-10-03 11:20:21,930 WARNING [train.py:1204] (2/4) Exclude cut with ID _56480_210_4220_1_1534679942079_3075409_138-36031-0_sp1.1 from training. Duration: 0.97275 2023-10-03 11:20:29,915 WARNING [train.py:1204] (2/4) Exclude cut with ID _58605_210_12210_1_1534723195939_3904259_300-285878-0_sp1.1 from training. Duration: 0.97275 2023-10-03 11:20:29,946 WARNING [train.py:1204] (2/4) Exclude cut with ID _40901_210_5665_1_1533363789958_3856130_114-340229-0_sp1.1 from training. Duration: 0.936375 2023-10-03 11:20:30,173 INFO [scaling.py:1118] (2/4) WithLoss: name=encoder.encoders.4.encoder.layers.1.self_attn_weights, loss-sum=0.000e+00 2023-10-03 11:20:31,248 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_489-230703-0 from training. Duration: 0.85 2023-10-03 11:20:34,470 WARNING [train.py:1204] (2/4) Exclude cut with ID _66873_210_5517_1_1535454136506_4293370_322-81243-0_sp1.1 from training. Duration: 0.936375 2023-10-03 11:20:34,555 WARNING [train.py:1204] (2/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_271-269084-0 from training. Duration: 0.83 2023-10-03 11:20:34,568 WARNING [train.py:1204] (2/4) Exclude cut with ID _13252_210_1925_1_1530671372047_3336909_166-314607-0_sp0.9 from training. Duration: 0.6 2023-10-03 11:20:35,841 WARNING [train.py:1204] (2/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_340-343302-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 11:20:37,749 WARNING [train.py:1204] (2/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_286-112015-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 11:20:39,138 WARNING [train.py:1204] (2/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_328-264776-0 from training. Duration: 0.67 2023-10-03 11:20:40,552 WARNING [train.py:1204] (2/4) Exclude cut with ID _4866_210_2577_1_1527404412284_4364659_256-306830-0_sp0.9 from training. Duration: 0.9666875 2023-10-03 11:20:45,984 WARNING [train.py:1204] (2/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_239-316802-0 from training. Duration: 0.69 2023-10-03 11:20:46,117 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.feed_forward3.hidden_balancer.prob, batch_count=1248766.6666666667, ans=0.125 2023-10-03 11:20:46,251 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.bypass.scale_min, batch_count=1248766.6666666667, ans=0.2 2023-10-03 11:20:48,711 WARNING [train.py:1204] (2/4) Exclude cut with ID _29857_210_20034_1_1532395886753_3798160_279-185801-0 from training. Duration: 0.91 2023-10-03 11:20:52,868 INFO [train.py:1046] (2/4) Epoch 36, batch 1400, loss[loss=0.1589, simple_loss=0.2469, pruned_loss=0.03542, over 23895.00 frames. ], tot_loss[loss=0.1593, simple_loss=0.2382, pruned_loss=0.0402, over 4707357.57 frames. ], batch size: 86, lr: 2.84e-03, grad_scale: 16.0 2023-10-03 11:20:54,868 WARNING [train.py:1204] (2/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_656-188361-0 from training. Duration: 0.87 2023-10-03 11:20:54,984 WARNING [train.py:1204] (2/4) Exclude cut with ID _44718_210_5816_1_1534050051869_6469030_100-230287-0_sp1.1 from training. Duration: 0.936375 2023-10-03 11:20:58,914 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8275_210_4198_1_1528455320_3892571_276-215622-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 11:21:00,240 WARNING [train.py:1204] (2/4) Exclude cut with ID _30304_210_6319_1_1532476827261_7582060_698-309430-0_sp1.1 from training. Duration: 0.963625 2023-10-03 11:21:04,268 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_365-217119-0 from training. Duration: 0.63 2023-10-03 11:21:05,671 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7264_210_4938_1_1527939911_8132095_800-91995-0 from training. Duration: 0.86 2023-10-03 11:21:10,945 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.self_attn_weights.whiten_keys.whitening_limit, batch_count=1248900.0, ans=6.0 2023-10-03 11:21:11,689 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.feed_forward1.hidden_balancer.prob, batch_count=1248900.0, ans=0.125 2023-10-03 11:21:15,637 WARNING [train.py:1204] (2/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_49-97158-0_sp1.1 from training. Duration: 0.6909375 2023-10-03 11:21:17,127 WARNING [train.py:1204] (2/4) Exclude cut with ID _61132_210_9969_1_1535021792964_3755419_61-57720-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 11:21:18,620 WARNING [train.py:1204] (2/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_345-306832-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 11:21:19,960 WARNING [train.py:1204] (2/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_164-224052-0_sp0.9 from training. Duration: 0.77775 2023-10-03 11:21:23,319 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_34886_210_10633_1_1533203882_1755661_76-156096-0_sp1.1 from training. Duration: 0.7909375 2023-10-03 11:21:23,393 WARNING [train.py:1204] (2/4) Exclude cut with ID 4133-6541-0027-40495-0_sp1.1 from training. Duration: 0.9681875 2023-10-03 11:21:25,053 INFO [scaling.py:1118] (2/4) WithLoss: name=encoder.encoders.4.encoder.layers.2.self_attn_weights, loss-sum=0.000e+00 2023-10-03 11:21:33,554 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_514-211165-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 11:21:35,448 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_41211_210_2544_1_1533348859_3702910_97-212695-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 11:21:40,271 WARNING [train.py:1204] (2/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_406-45086-0 from training. Duration: 0.8 2023-10-03 11:21:40,342 WARNING [train.py:1204] (2/4) Exclude cut with ID _65263_210_22356_1_1535763598691_3921037_49-222917-0_sp1.1 from training. Duration: 0.836375 2023-10-03 11:21:41,587 WARNING [train.py:1204] (2/4) Exclude cut with ID _4866_210_2577_1_1527404412284_4364659_352-157930-0_sp1.1 from training. Duration: 0.736375 2023-10-03 11:21:41,652 WARNING [train.py:1204] (2/4) Exclude cut with ID _4366_210_1388_1_1526792053633_3613819_231-211402-0_sp1.1 from training. Duration: 0.8545625 2023-10-03 11:21:41,685 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_98-21576-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 11:21:43,223 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.conv_module1.balancer1.prob, batch_count=1249033.3333333333, ans=0.125 2023-10-03 11:21:44,342 WARNING [train.py:1204] (2/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_348-127789-0_sp0.9 from training. Duration: 0.9444375 2023-10-03 11:21:44,365 WARNING [train.py:1204] (2/4) Exclude cut with ID _45871_210_5665_1_1533864614245_3296060_98-329557-0_sp1.1 from training. Duration: 0.97275 2023-10-03 11:21:44,416 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6846_210_6753_1_1527850767_4295158_2-315073-0_sp1.1 from training. Duration: 0.8 2023-10-03 11:21:45,825 WARNING [train.py:1204] (2/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_145-137790-0 from training. Duration: 0.83 2023-10-03 11:21:47,172 WARNING [train.py:1204] (2/4) Exclude cut with ID _14603_210_4220_1_1530615688441_3097319_133-158308-0_sp0.9 from training. Duration: 0.9333125 2023-10-03 11:21:48,778 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.bypass.skip_rate, batch_count=1249033.3333333333, ans=0.07 2023-10-03 11:21:50,016 WARNING [train.py:1204] (2/4) Exclude cut with ID _14898_210_9206_1_1530766812377_4177190_320-11758-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 11:21:52,744 WARNING [train.py:1204] (2/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_406-45086-0_sp0.9 from training. Duration: 0.888875 2023-10-03 11:21:53,589 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.feed_forward2.hidden_balancer.prob, batch_count=1249100.0, ans=0.125 2023-10-03 11:21:57,689 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.0.conv_module2.balancer1.prob, batch_count=1249100.0, ans=0.125 2023-10-03 11:21:58,797 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_5438_210_1794_1_1527336687_2600009_104-288274-0 from training. Duration: 0.58 2023-10-03 11:22:00,201 WARNING [train.py:1204] (2/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_108-53929-0_sp0.9 from training. Duration: 0.6555625 2023-10-03 11:22:01,456 WARNING [train.py:1204] (2/4) Exclude cut with ID _30800_210_22382_1_1532499347904_4515959_444-286390-0_sp1.1 from training. Duration: 0.963625 2023-10-03 11:22:01,772 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.feed_forward2.hidden_balancer.prob, batch_count=1249100.0, ans=0.125 2023-10-03 11:22:04,820 WARNING [train.py:1204] (2/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_125-144174-0_sp0.9 from training. Duration: 0.4333125 2023-10-03 11:22:04,886 WARNING [train.py:1204] (2/4) Exclude cut with ID _55209_210_1531_1_1534675828996_4597160_118-345621-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 11:22:06,762 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_45886_210_17024_1_1533709215_471063_7-79165-0_sp1.1 from training. Duration: 0.8909375 2023-10-03 11:22:08,012 INFO [train.py:1046] (2/4) Epoch 36, batch 1450, loss[loss=0.1455, simple_loss=0.226, pruned_loss=0.03254, over 24475.00 frames. ], tot_loss[loss=0.1589, simple_loss=0.2377, pruned_loss=0.04007, over 4698245.80 frames. ], batch size: 63, lr: 2.84e-03, grad_scale: 16.0 2023-10-03 11:22:08,230 WARNING [train.py:1204] (2/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_175-163894-0_sp0.9 from training. Duration: 0.82225 2023-10-03 11:22:10,957 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_50188_210_4198_1_1534503621_3646041_353-81682-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 11:22:10,964 WARNING [train.py:1204] (2/4) Exclude cut with ID _17875_210_2087_1_1531224012907_1821339_175-80565-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 11:22:10,972 WARNING [train.py:1204] (2/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_159-328157-0_sp1.1 from training. Duration: 0.47275 2023-10-03 11:22:15,050 WARNING [train.py:1204] (2/4) Exclude cut with ID _60057_210_10640_1_1534903065793_3333150_137-267579-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 11:22:15,099 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_92-46009-0_sp1.1 from training. Duration: 0.4909375 2023-10-03 11:22:16,505 WARNING [train.py:1204] (2/4) Exclude cut with ID _53203_210_2087_1_1534246218309_475590_48-68507-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 11:22:16,528 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_28211_210_6388_1_1532861506_4523224_128-328162-0 from training. Duration: 0.9 2023-10-03 11:22:17,860 WARNING [train.py:1204] (2/4) Exclude cut with ID _4697_210_2069_1_1526815801953_3823098_51-1049-0_sp1.1 from training. Duration: 0.6818125 2023-10-03 11:22:19,222 WARNING [train.py:1204] (2/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_383-316000-0 from training. Duration: 0.9 2023-10-03 11:22:20,573 WARNING [train.py:1204] (2/4) Exclude cut with ID _4779_210_2084_1_1527318022944_3914893_185-295153-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 11:22:23,170 WARNING [train.py:1204] (2/4) Exclude cut with ID _3231_210_2106_1_1526205864934_6784220_171-256004-0_sp0.9 from training. Duration: 0.9555625 2023-10-03 11:22:23,170 WARNING [train.py:1204] (2/4) Exclude cut with ID _41314_210_12651_1_1533459476543_3766460_87-272068-0 from training. Duration: 0.83 2023-10-03 11:22:23,931 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_164-152777-0_sp1.1 from training. Duration: 0.92725 2023-10-03 11:22:25,216 WARNING [train.py:1204] (2/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_18-130336-0_sp0.9 from training. Duration: 0.87775 2023-10-03 11:22:25,280 WARNING [train.py:1204] (2/4) Exclude cut with ID _14886_210_4220_1_1530702020355_3516190_175-229073-0_sp1.1 from training. Duration: 0.4090625 2023-10-03 11:22:25,299 WARNING [train.py:1204] (2/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_791-45149-0_sp0.9 from training. Duration: 0.9555625 2023-10-03 11:22:26,636 WARNING [train.py:1204] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_468-189058-0_sp1.1 from training. Duration: 0.87275 2023-10-03 11:22:28,167 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8278_210_2550_1_1528637099_4220413_32-142780-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 11:22:29,486 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.581e+02 1.817e+02 1.972e+02 2.237e+02 2.946e+02, threshold=3.944e+02, percent-clipped=0.0 2023-10-03 11:22:30,990 WARNING [train.py:1204] (2/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_146-218763-0_sp0.9 from training. Duration: 0.9555625 2023-10-03 11:22:34,352 WARNING [train.py:1204] (2/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_348-127789-0_sp1.1 from training. Duration: 0.77275 2023-10-03 11:22:34,362 WARNING [train.py:1204] (2/4) Exclude cut with ID _47790_210_16187_1_1533862814066_4207519_288-285445-0_sp1.1 from training. Duration: 0.936375 2023-10-03 11:22:37,589 WARNING [train.py:1204] (2/4) Exclude cut with ID _14603_210_4220_1_1530615688441_3097319_308-181732-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 11:22:37,612 WARNING [train.py:1204] (2/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_895-117428-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 11:22:39,171 WARNING [train.py:1204] (2/4) Exclude cut with ID _9235_210_5219_1_1528891154253_8046120_387-159008-0_sp0.9 from training. Duration: 0.9555625 2023-10-03 11:22:40,469 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_37351_210_7925_1_1533012931_3812113_265-85780-0_sp0.9 from training. Duration: 0.97775 2023-10-03 11:22:40,483 WARNING [train.py:1204] (2/4) Exclude cut with ID _40551_210_10635_1_1533776424632_3416179_361-3964-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 11:22:40,522 WARNING [train.py:1204] (2/4) Exclude cut with ID _25882_210_3036_1_1532775665815_6479510_485-122742-0_sp1.1 from training. Duration: 0.97275 2023-10-03 11:22:42,116 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=1249300.0, ans=0.1 2023-10-03 11:22:44,706 WARNING [train.py:1204] (2/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_98-51676-0 from training. Duration: 0.73 2023-10-03 11:22:46,410 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.bypass.scale_min, batch_count=1249300.0, ans=0.2 2023-10-03 11:22:47,529 WARNING [train.py:1204] (2/4) Exclude cut with ID _30548_210_16040_1_1532608097130_3146969_371-280610-0_sp1.1 from training. Duration: 0.92725 2023-10-03 11:22:51,562 WARNING [train.py:1204] (2/4) Exclude cut with ID IC0671W0081-331924 from training. Duration: 0.896 2023-10-03 11:22:51,667 WARNING [train.py:1204] (2/4) Exclude cut with ID _40284_210_2084_1_1533513679814_3704279_380-160775-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 11:22:53,024 WARNING [train.py:1204] (2/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_57-270037-0_sp1.1 from training. Duration: 0.7 2023-10-03 11:22:55,694 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_853-150237-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 11:22:55,780 WARNING [train.py:1204] (2/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_491-261923-0 from training. Duration: 0.98 2023-10-03 11:22:59,258 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.bypass_mid.scale_min, batch_count=1249366.6666666667, ans=0.2 2023-10-03 11:23:00,378 WARNING [train.py:1204] (2/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_836-244045-0_sp1.1 from training. Duration: 0.97275 2023-10-03 11:23:00,455 WARNING [train.py:1204] (2/4) Exclude cut with ID _14880_210_8895_1_1530680426655_3795259_289-212817-0 from training. Duration: 0.69 2023-10-03 11:23:00,707 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=1249366.6666666667, ans=0.1 2023-10-03 11:23:01,943 WARNING [train.py:1204] (2/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_303-340446-0 from training. Duration: 0.9 2023-10-03 11:23:05,241 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_37152_210_9697_1_1533106573_3844188_418-41736-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 11:23:08,930 WARNING [train.py:1204] (2/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_371-18169-0_sp1.1 from training. Duration: 0.7818125 2023-10-03 11:23:08,963 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_187-219984-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 11:23:10,401 WARNING [train.py:1204] (2/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_63-267585-0 from training. Duration: 0.9 2023-10-03 11:23:11,860 WARNING [train.py:1204] (2/4) Exclude cut with ID _27013_210_1389_1_1532437173240_3252649_222-125573-0 from training. Duration: 0.98 2023-10-03 11:23:13,149 WARNING [train.py:1204] (2/4) Exclude cut with ID _14821_210_8750_1_1530680503991_6829359_189-297451-0 from training. Duration: 0.55 2023-10-03 11:23:14,470 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_557-163391-0_sp1.1 from training. Duration: 0.97275 2023-10-03 11:23:15,893 WARNING [train.py:1204] (2/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_65-88242-0_sp1.1 from training. Duration: 0.5090625 2023-10-03 11:23:21,483 INFO [train.py:1046] (2/4) Epoch 36, batch 1500, loss[loss=0.1677, simple_loss=0.2571, pruned_loss=0.03911, over 24126.00 frames. ], tot_loss[loss=0.1586, simple_loss=0.2379, pruned_loss=0.0396, over 4720321.21 frames. ], batch size: 86, lr: 2.84e-03, grad_scale: 16.0 2023-10-03 11:23:25,695 WARNING [train.py:1204] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_218-295097-0 from training. Duration: 0.77 2023-10-03 11:23:25,724 WARNING [train.py:1204] (2/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_686-247600-0_sp1.1 from training. Duration: 0.836375 2023-10-03 11:23:25,728 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_397-147196-0_sp0.9 from training. Duration: 0.888875 2023-10-03 11:23:25,991 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.bypass.skip_rate, batch_count=1249500.0, ans=0.04949747468305833 2023-10-03 11:23:27,099 WARNING [train.py:1204] (2/4) Exclude cut with ID _53950_210_5816_1_1534417264919_3862951_237-136449-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 11:23:27,183 WARNING [train.py:1204] (2/4) Exclude cut with ID _3120_210_3525_1_1526036143112_4072980_74-178400-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 11:23:29,084 WARNING [train.py:1204] (2/4) Exclude cut with ID _5166_210_2084_1_1527073197160_4087993_309-222545-0_sp0.9 from training. Duration: 0.9333125 2023-10-03 11:23:30,537 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7544_210_6753_1_1528587722_3576686_172-171750-0 from training. Duration: 0.65 2023-10-03 11:23:31,915 WARNING [train.py:1204] (2/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_3-237434-0_sp0.9 from training. Duration: 0.8444375 2023-10-03 11:23:31,948 WARNING [train.py:1204] (2/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_101-293367-0_sp0.9 from training. Duration: 0.7 2023-10-03 11:23:32,158 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.conv_module1.balancer2.prob, batch_count=1249500.0, ans=0.125 2023-10-03 11:23:33,282 WARNING [train.py:1204] (2/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_818-213552-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 11:23:33,356 WARNING [train.py:1204] (2/4) Exclude cut with ID _14349_210_8341_1_1530615710818_3442470_115-285975-0_sp1.1 from training. Duration: 0.7818125 2023-10-03 11:23:34,861 WARNING [train.py:1204] (2/4) Exclude cut with ID _30800_210_22382_1_1532499347904_4515959_291-309749-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 11:23:34,941 WARNING [train.py:1204] (2/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_715-3462-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 11:23:40,312 WARNING [train.py:1204] (2/4) Exclude cut with ID _59502_210_12210_1_1535107024698_1835139_13-331827-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 11:23:40,325 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_4528_210_3273_1_1527076654_3276777_342-168068-0 from training. Duration: 0.87 2023-10-03 11:23:41,731 WARNING [train.py:1204] (2/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_215-141059-0_sp0.9 from training. Duration: 0.688875 2023-10-03 11:23:41,761 WARNING [train.py:1204] (2/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_106-286130-0_sp1.1 from training. Duration: 0.8909375 2023-10-03 11:23:43,188 WARNING [train.py:1204] (2/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_480-77655-0_sp1.1 from training. Duration: 0.97275 2023-10-03 11:23:45,920 WARNING [train.py:1204] (2/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_848-280957-0 from training. Duration: 0.87 2023-10-03 11:23:51,363 WARNING [train.py:1204] (2/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_122-109023-0 from training. Duration: 0.76 2023-10-03 11:23:52,776 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_11305_210_1794_1_1529750428_7178148_645-233480-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 11:23:54,112 WARNING [train.py:1204] (2/4) Exclude cut with ID _7562_210_2084_1_1528887767916_3946229_345-169156-0 from training. Duration: 0.58 2023-10-03 11:23:56,809 WARNING [train.py:1204] (2/4) Exclude cut with ID _7562_210_2084_1_1528887767916_3946229_345-169156-0_sp1.1 from training. Duration: 0.52725 2023-10-03 11:23:59,515 WARNING [train.py:1204] (2/4) Exclude cut with ID _13253_210_1925_1_1530757775819_3577873_253-246386-0_sp1.1 from training. Duration: 0.8181875 2023-10-03 11:24:00,840 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_136-264444-0_sp1.1 from training. Duration: 0.97275 2023-10-03 11:24:00,861 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_2168_210_2290_1_1525606920_1849400_136-222991-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 11:24:02,230 WARNING [train.py:1204] (2/4) Exclude cut with ID _7852_210_5219_1_1528286471316_8139400_242-105336-0 from training. Duration: 0.72 2023-10-03 11:24:02,282 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_5960_210_1794_1_1527403865_3990999_515-2613-0_sp1.1 from training. Duration: 0.8 2023-10-03 11:24:02,299 WARNING [train.py:1204] (2/4) Exclude cut with ID _39688_210_19596_1_1533371626725_4154159_551-167834-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 11:24:02,970 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_43802_210_2290_1_1533516203_8829504_85-195240-0 from training. Duration: 0.87 2023-10-03 11:24:04,269 WARNING [train.py:1204] (2/4) Exclude cut with ID _63171_210_15488_1_1535104792794_3295110_113-113534-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 11:24:09,324 WARNING [train.py:1204] (2/4) Exclude cut with ID _8833_210_5219_1_1528592665210_7553059_319-349181-0_sp1.1 from training. Duration: 0.9 2023-10-03 11:24:09,332 WARNING [train.py:1204] (2/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_225-43381-0 from training. Duration: 0.94 2023-10-03 11:24:15,122 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_5959_210_1794_1_1527400400_3445726_412-44052-0_sp0.9 from training. Duration: 0.8555625 2023-10-03 11:24:16,513 WARNING [train.py:1204] (2/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_356-167816-0_sp1.1 from training. Duration: 0.6545625 2023-10-03 11:24:18,267 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.conv_module1.balancer2.prob, batch_count=1249700.0, ans=0.125 2023-10-03 11:24:19,405 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9012W0112-501244 from training. Duration: 0.768 2023-10-03 11:24:19,455 WARNING [train.py:1204] (2/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_177-326952-0_sp1.1 from training. Duration: 0.92725 2023-10-03 11:24:19,458 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9016W0116-502789 from training. Duration: 0.896 2023-10-03 11:24:19,544 WARNING [train.py:1204] (2/4) Exclude cut with ID _56235_210_10640_1_1534557335235_4161099_557-204815-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 11:24:20,833 WARNING [train.py:1204] (2/4) Exclude cut with ID _55857_210_2346_1_1534644007831_6898209_279-229469-0_sp1.1 from training. Duration: 0.936375 2023-10-03 11:24:22,266 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9029W0375-509764 from training. Duration: 0.896 2023-10-03 11:24:23,701 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_2295_210_2087_1_1525500993_2407863_234-83104-0_sp0.9 from training. Duration: 0.688875 2023-10-03 11:24:26,433 WARNING [train.py:1204] (2/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_266-137104-0 from training. Duration: 0.65 2023-10-03 11:24:27,759 WARNING [train.py:1204] (2/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_289-163927-0_sp1.1 from training. Duration: 0.92725 2023-10-03 11:24:30,503 WARNING [train.py:1204] (2/4) Exclude cut with ID _12733_210_8341_1_1530167424556_3646380_86-225314-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 11:24:30,518 WARNING [train.py:1204] (2/4) Exclude cut with ID _7877_210_6014_1_1528286358099_3750136_618-284064-0_sp1.1 from training. Duration: 0.92725 2023-10-03 11:24:32,428 WARNING [train.py:1204] (2/4) Exclude cut with ID _55844_210_2346_1_1534471212569_7625707_1017-31469-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 11:24:32,500 WARNING [train.py:1204] (2/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_617-241446-0_sp1.1 from training. Duration: 0.92725 2023-10-03 11:24:33,795 WARNING [train.py:1204] (2/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_866-173440-0_sp1.1 from training. Duration: 0.8181875 2023-10-03 11:24:35,215 INFO [train.py:1046] (2/4) Epoch 36, batch 1550, loss[loss=0.1518, simple_loss=0.2358, pruned_loss=0.03389, over 24289.00 frames. ], tot_loss[loss=0.1586, simple_loss=0.2382, pruned_loss=0.03949, over 4725476.51 frames. ], batch size: 61, lr: 2.84e-03, grad_scale: 16.0 2023-10-03 11:24:35,305 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_9460_210_4211_1_1528954587_10049667_480-260269-0 from training. Duration: 0.56 2023-10-03 11:24:35,341 WARNING [train.py:1204] (2/4) Exclude cut with ID _44659_210_2721_1_1534036120061_6614860_626-298884-0 from training. Duration: 0.97 2023-10-03 11:24:36,632 WARNING [train.py:1204] (2/4) Exclude cut with ID _53565_210_10635_1_1534337218335_3820259_287-18120-0_sp1.1 from training. Duration: 0.8818125 2023-10-03 11:24:36,692 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_791-197891-0 from training. Duration: 0.84 2023-10-03 11:24:38,564 WARNING [train.py:1204] (2/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_527-238990-0 from training. Duration: 0.9 2023-10-03 11:24:40,044 WARNING [train.py:1204] (2/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_449-65969-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 11:24:41,741 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_553-341452-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 11:24:41,904 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.ff2_skip_rate, batch_count=1249833.3333333333, ans=0.0 2023-10-03 11:24:42,405 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.2.encoder.layers.0.whiten, num_groups=1, num_channels=384, metric=5.33 vs. limit=12.0 2023-10-03 11:24:42,996 WARNING [train.py:1204] (2/4) Exclude cut with ID _16909_210_5724_1_1531292079912_4118919_300-47574-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 11:24:43,018 WARNING [train.py:1204] (2/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_70-27732-0_sp1.1 from training. Duration: 0.8545625 2023-10-03 11:24:44,488 WARNING [train.py:1204] (2/4) Exclude cut with ID _44718_210_5816_1_1534050051869_6469030_727-28074-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 11:24:44,540 WARNING [train.py:1204] (2/4) Exclude cut with ID _55857_210_2346_1_1534644007831_6898209_357-123664-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 11:24:47,223 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9123W0004-555964 from training. Duration: 0.896 2023-10-03 11:24:47,259 WARNING [train.py:1204] (2/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_445-145208-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 11:24:47,294 WARNING [train.py:1204] (2/4) Exclude cut with ID _3675_210_4557_1_1526639934408_4182089_456-18305-0_sp1.1 from training. Duration: 0.6909375 2023-10-03 11:24:48,655 WARNING [train.py:1204] (2/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_1-99680-0_sp1.1 from training. Duration: 0.6090625 2023-10-03 11:24:50,117 WARNING [train.py:1204] (2/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_146-282400-0_sp0.9 from training. Duration: 0.788875 2023-10-03 11:24:50,138 WARNING [train.py:1204] (2/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_844-96989-0 from training. Duration: 0.71 2023-10-03 11:24:50,416 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.feed_forward2.hidden_balancer.prob, batch_count=1249900.0, ans=0.125 2023-10-03 11:24:51,466 WARNING [train.py:1204] (2/4) Exclude cut with ID _29853_210_7497_1_1532505600851_6642110_128-146364-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 11:24:51,488 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_224-322394-0 from training. Duration: 0.66 2023-10-03 11:24:52,902 WARNING [train.py:1204] (2/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_191-59784-0 from training. Duration: 0.88 2023-10-03 11:24:52,907 WARNING [train.py:1204] (2/4) Exclude cut with ID _12556_210_8341_1_1530081024100_7327690_725-193477-0 from training. Duration: 0.98 2023-10-03 11:24:54,240 WARNING [train.py:1204] (2/4) Exclude cut with ID _5166_210_2084_1_1527073197160_4087993_523-79838-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 11:24:55,662 WARNING [train.py:1204] (2/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_398-334558-0_sp1.1 from training. Duration: 0.87275 2023-10-03 11:24:56,773 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.601e+02 1.814e+02 1.950e+02 2.198e+02 3.185e+02, threshold=3.899e+02, percent-clipped=0.0 2023-10-03 11:24:59,586 WARNING [train.py:1204] (2/4) Exclude cut with ID _6456_210_1534_1_1527681634212_3181049_171-36697-0_sp1.1 from training. Duration: 0.936375 2023-10-03 11:25:03,038 WARNING [train.py:1204] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_700-183549-0 from training. Duration: 0.68 2023-10-03 11:25:03,040 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8853_210_3273_1_1528633899_818968_51-187685-0 from training. Duration: 0.69 2023-10-03 11:25:11,293 WARNING [train.py:1204] (2/4) Exclude cut with ID _7719_210_3525_1_1528456153163_4296960_333-110399-0_sp1.1 from training. Duration: 0.87275 2023-10-03 11:25:14,679 WARNING [train.py:1204] (2/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_1022-83740-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 11:25:15,952 WARNING [train.py:1204] (2/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_61-327062-0_sp0.9 from training. Duration: 0.9 2023-10-03 11:25:15,959 WARNING [train.py:1204] (2/4) Exclude cut with ID _47251_210_18628_1_1533814063128_4137250_214-88076-0_sp1.1 from training. Duration: 0.8 2023-10-03 11:25:17,302 WARNING [train.py:1204] (2/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_839-173017-0 from training. Duration: 0.41 2023-10-03 11:25:21,713 WARNING [train.py:1204] (2/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_299-34413-0_sp1.1 from training. Duration: 0.5909375 2023-10-03 11:25:23,119 WARNING [train.py:1204] (2/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_664-159457-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 11:25:25,865 WARNING [train.py:1204] (2/4) Exclude cut with ID _66873_210_5517_1_1535454136506_4293370_483-291760-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 11:25:27,300 WARNING [train.py:1204] (2/4) Exclude cut with ID _30800_210_22382_1_1532499347904_4515959_287-255474-0_sp1.1 from training. Duration: 0.92725 2023-10-03 11:25:28,632 WARNING [train.py:1204] (2/4) Exclude cut with ID _48677_210_5663_1_1533902405919_3892989_111-11439-0_sp1.1 from training. Duration: 0.87275 2023-10-03 11:25:28,646 WARNING [train.py:1204] (2/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_399-156791-0 from training. Duration: 0.83 2023-10-03 11:25:28,688 WARNING [train.py:1204] (2/4) Exclude cut with ID _3992_210_1747_1_1526644816031_3731060_36-147855-0_sp0.9 from training. Duration: 0.8666875 2023-10-03 11:25:31,379 WARNING [train.py:1204] (2/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_442-320317-0_sp1.1 from training. Duration: 0.7454375 2023-10-03 11:25:31,407 WARNING [train.py:1204] (2/4) Exclude cut with ID _55429_210_10626_1_1534467588527_7174143_953-80868-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 11:25:32,688 WARNING [train.py:1204] (2/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_159-328157-0_sp0.9 from training. Duration: 0.57775 2023-10-03 11:25:32,701 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9309W0197-640324 from training. Duration: 0.896 2023-10-03 11:25:35,776 WARNING [train.py:1204] (2/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_361-30822-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 11:25:38,697 WARNING [train.py:1204] (2/4) Exclude cut with ID _5250_210_5219_1_1527249561806_8692110_636-251213-0 from training. Duration: 0.94 2023-10-03 11:25:41,704 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.0.layers.0.nonlin_attention.whiten2, num_groups=1, num_channels=192, metric=4.69 vs. limit=15.0 2023-10-03 11:25:44,393 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.5.encoder.layers.0.conv_module2.whiten, num_groups=1, num_channels=256, metric=4.18 vs. limit=15.0 2023-10-03 11:25:45,142 WARNING [train.py:1204] (2/4) Exclude cut with ID _49728_210_12651_1_1534041034294_6788150_468-333827-0_sp1.1 from training. Duration: 0.963625 2023-10-03 11:25:46,458 WARNING [train.py:1204] (2/4) Exclude cut with ID _25882_210_3036_1_1532775665815_6479510_623-191759-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 11:25:46,510 WARNING [train.py:1204] (2/4) Exclude cut with ID _5189_210_4557_1_1527248319428_3769229_199-53526-0 from training. Duration: 0.42 2023-10-03 11:25:47,351 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.0.layers.1.whiten, num_groups=1, num_channels=192, metric=3.58 vs. limit=12.0 2023-10-03 11:25:49,033 INFO [train.py:1046] (2/4) Epoch 36, batch 1600, loss[loss=0.1363, simple_loss=0.2168, pruned_loss=0.02791, over 24333.00 frames. ], tot_loss[loss=0.1595, simple_loss=0.2392, pruned_loss=0.03984, over 4729353.57 frames. ], batch size: 56, lr: 2.84e-03, grad_scale: 32.0 2023-10-03 11:25:49,083 WARNING [train.py:1204] (2/4) Exclude cut with ID _6274_210_2084_1_1527937583837_7494020_32-205288-0_sp0.9 from training. Duration: 0.8666875 2023-10-03 11:25:50,452 WARNING [train.py:1204] (2/4) Exclude cut with ID _65263_210_22356_1_1535763598691_3921037_401-346646-0_sp1.1 from training. Duration: 0.963625 2023-10-03 11:25:50,457 WARNING [train.py:1204] (2/4) Exclude cut with ID _12555_210_8750_1_1530075594402_3780239_449-267986-0_sp1.1 from training. Duration: 0.7545625 2023-10-03 11:25:50,477 WARNING [train.py:1204] (2/4) Exclude cut with ID _69884_210_9771_1_1535846392818_4166169_430-201020-0_sp1.1 from training. Duration: 0.8818125 2023-10-03 11:25:50,557 WARNING [train.py:1204] (2/4) Exclude cut with ID _8394_210_6702_1_1528590504316_3540189_379-49457-0_sp1.1 from training. Duration: 0.7909375 2023-10-03 11:25:53,441 WARNING [train.py:1204] (2/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_200-105218-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 11:25:53,495 WARNING [train.py:1204] (2/4) Exclude cut with ID _3867_210_1883_1_1526641200575_4475830_319-349850-0 from training. Duration: 0.67 2023-10-03 11:25:54,864 WARNING [train.py:1204] (2/4) Exclude cut with ID _28317_210_12742_1_1532480302381_8510325_916-195848-0 from training. Duration: 0.92 2023-10-03 11:25:55,111 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.conv_module2.balancer2.prob, batch_count=1250166.6666666667, ans=0.125 2023-10-03 11:25:56,271 WARNING [train.py:1204] (2/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_436-186311-0 from training. Duration: 0.65 2023-10-03 11:25:57,738 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_3910_210_1794_1_1526777667_3747260_302-180617-0_sp1.1 from training. Duration: 0.97275 2023-10-03 11:26:00,301 WARNING [train.py:1204] (2/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_102-92751-0 from training. Duration: 0.99 2023-10-03 11:26:00,371 WARNING [train.py:1204] (2/4) Exclude cut with ID _51899_210_20034_1_1534129254466_3669220_265-215428-0_sp1.1 from training. Duration: 0.8909375 2023-10-03 11:26:03,738 WARNING [train.py:1204] (2/4) Exclude cut with ID _58583_210_5236_1_1534726835266_3574249_438-198993-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 11:26:08,251 WARNING [train.py:1204] (2/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_166-166990-0_sp1.1 from training. Duration: 0.87275 2023-10-03 11:26:12,376 WARNING [train.py:1204] (2/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_907-151398-0 from training. Duration: 0.37 2023-10-03 11:26:12,617 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.bypass.skip_rate, batch_count=1250233.3333333333, ans=0.09899494936611666 2023-10-03 11:26:15,795 WARNING [train.py:1204] (2/4) Exclude cut with ID _38670_210_15379_1_1533448810659_4391059_452-256669-0_sp1.1 from training. Duration: 0.936375 2023-10-03 11:26:15,841 WARNING [train.py:1204] (2/4) Exclude cut with ID _48677_210_5663_1_1533902405919_3892989_408-40740-0 from training. Duration: 0.83 2023-10-03 11:26:15,881 WARNING [train.py:1204] (2/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_903-158756-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 11:26:17,207 WARNING [train.py:1204] (2/4) Exclude cut with ID _13252_210_1925_1_1530671372047_3336909_397-216497-0 from training. Duration: 0.96 2023-10-03 11:26:20,351 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=1250300.0, ans=0.1 2023-10-03 11:26:22,722 WARNING [train.py:1204] (2/4) Exclude cut with ID _55429_210_10626_1_1534467588527_7174143_97-335084-0 from training. Duration: 0.97 2023-10-03 11:26:29,621 WARNING [train.py:1204] (2/4) Exclude cut with ID _45329_210_7005_1_1534157951211_6774339_453-267730-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 11:26:29,694 WARNING [train.py:1204] (2/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_153-97441-0 from training. Duration: 0.56 2023-10-03 11:26:31,106 WARNING [train.py:1204] (2/4) Exclude cut with ID _40901_210_5665_1_1533363789958_3856130_312-329122-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 11:26:31,152 WARNING [train.py:1204] (2/4) Exclude cut with ID _30800_210_22382_1_1532499347904_4515959_97-126471-0_sp1.1 from training. Duration: 0.97275 2023-10-03 11:26:31,153 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_11305_210_1794_1_1529750428_7178148_257-312870-0_sp1.1 from training. Duration: 0.8 2023-10-03 11:26:33,834 WARNING [train.py:1204] (2/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_454-314901-0_sp0.9 from training. Duration: 0.511125 2023-10-03 11:26:36,001 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.attention_skip_rate, batch_count=1250366.6666666667, ans=0.0 2023-10-03 11:26:38,642 WARNING [train.py:1204] (2/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_282-136641-0_sp1.1 from training. Duration: 0.3818125 2023-10-03 11:26:40,634 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_46013_210_7234_1_1533776362_3744032_493-160724-0_sp1.1 from training. Duration: 0.963625 2023-10-03 11:26:41,948 WARNING [train.py:1204] (2/4) Exclude cut with ID _67831_210_12392_1_1535612464202_3092612_259-33442-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 11:26:42,007 WARNING [train.py:1204] (2/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_289-62384-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 11:26:42,263 INFO [scaling.py:1118] (2/4) WithLoss: name=encoder.encoders.4.encoder.layers.1.self_attn_weights, loss-sum=0.000e+00 2023-10-03 11:26:43,433 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6545_210_2040_1_1527774670_4245258_379-14182-0_sp0.9 from training. Duration: 0.888875 2023-10-03 11:26:44,803 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_548-294316-0_sp1.1 from training. Duration: 0.736375 2023-10-03 11:26:46,636 WARNING [train.py:1204] (2/4) Exclude cut with ID _6275_210_2084_1_1528110062213_7669049_857-298942-0_sp1.1 from training. Duration: 0.77275 2023-10-03 11:26:48,652 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6673_210_3273_1_1527767790_3173240_327-306185-0_sp1.1 from training. Duration: 0.7545625 2023-10-03 11:26:50,389 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.self_attn_weights.pos_emb_skip_rate, batch_count=1250433.3333333333, ans=0.0 2023-10-03 11:26:54,137 WARNING [train.py:1204] (2/4) Exclude cut with ID _61008_210_10633_1_1534852769198_6974370_814-107647-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 11:26:54,204 WARNING [train.py:1204] (2/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_116-332008-0_sp1.1 from training. Duration: 0.8909375 2023-10-03 11:26:56,925 WARNING [train.py:1204] (2/4) Exclude cut with ID _32704_210_8954_1_1533017488840_6747370_266-329755-0 from training. Duration: 0.89 2023-10-03 11:26:56,928 WARNING [train.py:1204] (2/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_271-269084-0_sp0.9 from training. Duration: 0.92225 2023-10-03 11:26:58,322 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_3171_210_2087_1_1526109084_2330007_66-260583-0 from training. Duration: 0.72 2023-10-03 11:27:02,342 INFO [train.py:1046] (2/4) Epoch 36, batch 1650, loss[loss=0.1756, simple_loss=0.2423, pruned_loss=0.05444, over 23620.00 frames. ], tot_loss[loss=0.1602, simple_loss=0.2398, pruned_loss=0.04026, over 4727248.25 frames. ], batch size: 232, lr: 2.84e-03, grad_scale: 32.0 2023-10-03 11:27:02,467 WARNING [train.py:1204] (2/4) Exclude cut with ID _5250_210_5219_1_1527249561806_8692110_1326-276386-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 11:27:03,903 WARNING [train.py:1204] (2/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_372-30302-0_sp1.1 from training. Duration: 0.7818125 2023-10-03 11:27:05,643 WARNING [train.py:1204] (2/4) Exclude cut with ID _25882_210_3036_1_1532775665815_6479510_631-257029-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 11:27:05,651 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_3691_210_3528_1_1526468169_4062053_355-334432-0 from training. Duration: 0.87 2023-10-03 11:27:05,666 WARNING [train.py:1204] (2/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_839-173017-0_sp1.1 from training. Duration: 0.37275 2023-10-03 11:27:05,680 WARNING [train.py:1204] (2/4) Exclude cut with ID _14998_210_5219_1_1530705582080_7474970_441-91466-0 from training. Duration: 0.97 2023-10-03 11:27:07,006 WARNING [train.py:1204] (2/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_108-53929-0 from training. Duration: 0.59 2023-10-03 11:27:11,622 WARNING [train.py:1204] (2/4) Exclude cut with ID _9389_210_1664_1_1529217337301_5289269_260-1942-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 11:27:11,678 WARNING [train.py:1204] (2/4) Exclude cut with ID _56423_210_5854_1_1534896061049_3165969_198-61547-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 11:27:12,936 WARNING [train.py:1204] (2/4) Exclude cut with ID _44778_210_2054_1_1533798127203_7443090_412-142122-0_sp1.1 from training. Duration: 0.92725 2023-10-03 11:27:12,961 WARNING [train.py:1204] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_88-341718-0_sp1.1 from training. Duration: 0.6 2023-10-03 11:27:14,432 WARNING [train.py:1204] (2/4) Exclude cut with ID _61079_210_4525_1_1534921008198_4011419_334-298697-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 11:27:16,211 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_5465_210_4211_1_1527227253_55357_8-2499-0_sp1.1 from training. Duration: 0.996375 2023-10-03 11:27:17,678 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_447-201565-0_sp1.1 from training. Duration: 0.97275 2023-10-03 11:27:17,683 WARNING [train.py:1204] (2/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_253-161789-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 11:27:17,687 WARNING [train.py:1204] (2/4) Exclude cut with ID _44304_210_15374_1_1533639352765_4613129_302-22084-0_sp0.9 from training. Duration: 0.9555625 2023-10-03 11:27:17,702 WARNING [train.py:1204] (2/4) Exclude cut with ID _26664_210_7770_1_1532258943280_3418270_187-80815-0_sp1.1 from training. Duration: 0.8454375 2023-10-03 11:27:19,060 WARNING [train.py:1204] (2/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_68-212801-0 from training. Duration: 0.73 2023-10-03 11:27:20,772 WARNING [train.py:1204] (2/4) Exclude cut with ID _29345_210_4381_1_1532689168263_3860169_149-257268-0 from training. Duration: 0.81 2023-10-03 11:27:24,861 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.495e+02 1.852e+02 2.089e+02 2.353e+02 3.371e+02, threshold=4.177e+02, percent-clipped=0.0 2023-10-03 11:27:24,954 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_213-103282-0_sp1.1 from training. Duration: 0.7090625 2023-10-03 11:27:26,412 WARNING [train.py:1204] (2/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_156-302675-0_sp1.1 from training. Duration: 0.5 2023-10-03 11:27:33,971 WARNING [train.py:1204] (2/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_125-322172-0 from training. Duration: 0.93 2023-10-03 11:27:35,233 WARNING [train.py:1204] (2/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_493-60474-0_sp1.1 from training. Duration: 0.963625 2023-10-03 11:27:36,729 WARNING [train.py:1204] (2/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_156-302675-0 from training. Duration: 0.55 2023-10-03 11:27:39,896 WARNING [train.py:1204] (2/4) Exclude cut with ID _41314_210_12651_1_1533459476543_3766460_123-267862-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 11:27:43,184 WARNING [train.py:1204] (2/4) Exclude cut with ID _28427_210_4220_1_1532581240967_3061069_201-143747-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 11:27:43,219 WARNING [train.py:1204] (2/4) Exclude cut with ID _8772_210_1850_1_1528635595972_3664011_96-12756-0_sp1.1 from training. Duration: 0.8909375 2023-10-03 11:27:43,257 WARNING [train.py:1204] (2/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_1347-145675-0_sp1.1 from training. Duration: 0.936375 2023-10-03 11:27:47,028 WARNING [train.py:1204] (2/4) Exclude cut with ID _9235_210_5219_1_1528891154253_8046120_387-159008-0_sp1.1 from training. Duration: 0.7818125 2023-10-03 11:27:47,053 WARNING [train.py:1204] (2/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_400-201896-0_sp1.1 from training. Duration: 0.963625 2023-10-03 11:27:49,815 WARNING [train.py:1204] (2/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_326-174612-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 11:27:49,861 WARNING [train.py:1204] (2/4) Exclude cut with ID _55844_210_2346_1_1534471212569_7625707_390-116233-0_sp1.1 from training. Duration: 0.963625 2023-10-03 11:27:49,890 WARNING [train.py:1204] (2/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_419-317062-0_sp1.1 from training. Duration: 0.82725 2023-10-03 11:27:51,232 WARNING [train.py:1204] (2/4) Exclude cut with ID _29877_210_6228_1_1532397791106_5276149_266-62291-0_sp0.9 from training. Duration: 0.9666875 2023-10-03 11:27:51,415 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=1250700.0, ans=0.1 2023-10-03 11:27:52,629 WARNING [train.py:1204] (2/4) Exclude cut with ID _7857_210_1534_1_1528286515745_6831470_431-338528-0_sp1.1 from training. Duration: 0.92725 2023-10-03 11:27:52,700 WARNING [train.py:1204] (2/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_87-131427-0_sp0.9 from training. Duration: 0.8666875 2023-10-03 11:27:55,578 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.conv_module1.balancer2.prob, batch_count=1250700.0, ans=0.125 2023-10-03 11:27:58,098 WARNING [train.py:1204] (2/4) Exclude cut with ID _24055_210_12651_1_1532310907143_976979_92-59356-0_sp1.1 from training. Duration: 0.82725 2023-10-03 11:27:58,183 WARNING [train.py:1204] (2/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_96-290039-0 from training. Duration: 0.56 2023-10-03 11:27:58,288 WARNING [train.py:1204] (2/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_48-182155-0_sp0.9 from training. Duration: 0.9666875 2023-10-03 11:27:59,581 WARNING [train.py:1204] (2/4) Exclude cut with ID _4626_210_3532_1_1526817652717_3916859_446-206656-0 from training. Duration: 0.76 2023-10-03 11:27:59,676 WARNING [train.py:1204] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_168-32906-0 from training. Duration: 0.69 2023-10-03 11:27:59,699 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_3780_210_2087_1_1526467760_2130483_170-259044-0 from training. Duration: 0.7 2023-10-03 11:28:01,083 WARNING [train.py:1204] (2/4) Exclude cut with ID _65134_210_14763_1_1535335393985_3505368_153-216718-0_sp1.1 from training. Duration: 0.92725 2023-10-03 11:28:01,157 WARNING [train.py:1204] (2/4) Exclude cut with ID _64639_210_5816_1_1535367619437_3528000_461-329156-0_sp1.1 from training. Duration: 0.87275 2023-10-03 11:28:01,190 WARNING [train.py:1204] (2/4) Exclude cut with ID _21989_210_5854_1_1531899214931_3271360_126-347531-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 11:28:01,246 WARNING [train.py:1204] (2/4) Exclude cut with ID _7614_210_4915_1_1529326764092_4537359_96-162197-0_sp1.1 from training. Duration: 0.963625 2023-10-03 11:28:01,247 WARNING [train.py:1204] (2/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_1083-147536-0 from training. Duration: 0.97 2023-10-03 11:28:05,966 WARNING [train.py:1204] (2/4) Exclude cut with ID _64029_210_4381_1_1535178502772_3756459_275-16710-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 11:28:06,179 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.1.feed_forward1.out_proj.dropout_p, batch_count=1250766.6666666667, ans=0.1 2023-10-03 11:28:06,283 INFO [scaling.py:1118] (2/4) WithLoss: name=encoder.encoders.3.encoder.layers.0.self_attn_weights, loss-sum=0.000e+00 2023-10-03 11:28:07,374 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7264_210_4938_1_1527939911_8132095_800-91995-0_sp0.9 from training. Duration: 0.9555625 2023-10-03 11:28:08,661 WARNING [train.py:1204] (2/4) Exclude cut with ID _3844_210_1390_1_1526727803582_2871209_50-193879-0_sp1.1 from training. Duration: 0.936375 2023-10-03 11:28:09,976 WARNING [train.py:1204] (2/4) Exclude cut with ID _46952_210_5986_1_1534230010357_3107379_232-344503-0 from training. Duration: 0.96 2023-10-03 11:28:16,048 WARNING [train.py:1204] (2/4) Exclude cut with ID _25808_210_13292_1_1532343514562_7366109_206-14076-0_sp1.1 from training. Duration: 0.936375 2023-10-03 11:28:16,049 WARNING [train.py:1204] (2/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_55-31904-0_sp0.9 from training. Duration: 0.97775 2023-10-03 11:28:16,068 WARNING [train.py:1204] (2/4) Exclude cut with ID _14632_210_9988_1_1530691279392_3923200_161-129530-0 from training. Duration: 0.96 2023-10-03 11:28:16,141 WARNING [train.py:1204] (2/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_770-200430-0_sp1.1 from training. Duration: 0.8090625 2023-10-03 11:28:17,844 INFO [train.py:1046] (2/4) Epoch 36, batch 1700, loss[loss=0.1687, simple_loss=0.256, pruned_loss=0.04066, over 24002.00 frames. ], tot_loss[loss=0.1599, simple_loss=0.2392, pruned_loss=0.04028, over 4711356.59 frames. ], batch size: 86, lr: 2.84e-03, grad_scale: 32.0 2023-10-03 11:28:17,850 WARNING [train.py:1204] (2/4) Exclude cut with ID _6270_210_2084_1_1527850911573_3856420_26-277343-0_sp1.1 from training. Duration: 0.7454375 2023-10-03 11:28:17,851 WARNING [train.py:1204] (2/4) Exclude cut with ID _69884_210_9771_1_1535846392818_4166169_88-23776-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 11:28:19,331 WARNING [train.py:1204] (2/4) Exclude cut with ID _60860_210_8254_1_1534896015875_3557169_245-256666-0_sp1.1 from training. Duration: 0.8818125 2023-10-03 11:28:20,624 WARNING [train.py:1204] (2/4) Exclude cut with ID _5250_210_5219_1_1527249561806_8692110_636-251213-0_sp1.1 from training. Duration: 0.8545625 2023-10-03 11:28:20,638 WARNING [train.py:1204] (2/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_540-131164-0 from training. Duration: 0.92 2023-10-03 11:28:21,588 INFO [scaling.py:1022] (2/4) Whitening: name=encoder_embed.out_whiten, num_groups=1, num_channels=192, metric=7.08 vs. limit=8.0 2023-10-03 11:28:23,438 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_5931_210_2040_1_1527428481_4547999_321-193614-0_sp0.9 from training. Duration: 0.7444375 2023-10-03 11:28:31,713 WARNING [train.py:1204] (2/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_373-225403-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 11:28:33,412 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=1250900.0, ans=0.1 2023-10-03 11:28:34,479 WARNING [train.py:1204] (2/4) Exclude cut with ID _20622_210_3852_1_1531622427080_1093839_57-326027-0_sp1.1 from training. Duration: 0.97275 2023-10-03 11:28:40,385 WARNING [train.py:1204] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_605-257205-0_sp1.1 from training. Duration: 0.536375 2023-10-03 11:28:40,412 WARNING [train.py:1204] (2/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_442-320317-0_sp0.9 from training. Duration: 0.911125 2023-10-03 11:28:40,667 INFO [scaling.py:1118] (2/4) WithLoss: name=encoder.encoders.4.encoder.layers.2.self_attn_weights, loss-sum=0.000e+00 2023-10-03 11:28:42,312 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_35361_210_4238_1_1533262810_7793476_675-87012-0_sp1.1 from training. Duration: 0.8090625 2023-10-03 11:28:42,344 WARNING [train.py:1204] (2/4) Exclude cut with ID _4184_210_2550_1_1526730192013_2219380_306-44736-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 11:28:44,979 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_124-113562-0 from training. Duration: 0.94 2023-10-03 11:28:47,791 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_2821_210_2715_1_1526108385_3763560_73-296673-0_sp0.9 from training. Duration: 0.92225 2023-10-03 11:28:47,812 WARNING [train.py:1204] (2/4) Exclude cut with ID _10301_210_5886_1_1529222275332_3910620_216-46959-0_sp1.1 from training. Duration: 0.92725 2023-10-03 11:28:49,745 WARNING [train.py:1204] (2/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_91-277684-0_sp0.9 from training. Duration: 0.82225 2023-10-03 11:28:50,632 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.0.layers.1.self_attn2.whiten, num_groups=1, num_channels=192, metric=10.42 vs. limit=22.5 2023-10-03 11:28:51,270 WARNING [train.py:1204] (2/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_351-294650-0_sp0.9 from training. Duration: 0.7 2023-10-03 11:28:54,427 WARNING [train.py:1204] (2/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_209-205365-0 from training. Duration: 0.79 2023-10-03 11:28:54,483 WARNING [train.py:1204] (2/4) Exclude cut with ID _14603_210_4220_1_1530615688441_3097319_97-325053-0 from training. Duration: 0.92 2023-10-03 11:28:55,720 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_49348_210_7927_1_1533954500_3691300_109-276430-0_sp1.1 from training. Duration: 0.92725 2023-10-03 11:28:57,303 WARNING [train.py:1204] (2/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_912-47473-0 from training. Duration: 0.68 2023-10-03 11:28:58,628 WARNING [train.py:1204] (2/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_96-320736-0_sp0.9 from training. Duration: 0.9555625 2023-10-03 11:28:59,501 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.2.encoder.layers.0.feed_forward3.out_whiten, num_groups=1, num_channels=384, metric=12.90 vs. limit=15.0 2023-10-03 11:29:05,710 WARNING [train.py:1204] (2/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_1252-235481-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 11:29:05,802 WARNING [train.py:1204] (2/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_813-261554-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 11:29:07,691 WARNING [train.py:1204] (2/4) Exclude cut with ID _21632_210_2253_1_1531994001765_3744140_110-274654-0_sp0.9 from training. Duration: 0.911125 2023-10-03 11:29:10,480 WARNING [train.py:1204] (2/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_381-317791-0_sp1.1 from training. Duration: 0.663625 2023-10-03 11:29:10,494 WARNING [train.py:1204] (2/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_51-117576-0_sp1.1 from training. Duration: 0.363625 2023-10-03 11:29:10,518 WARNING [train.py:1204] (2/4) Exclude cut with ID _42321_210_7770_1_1533468631352_4366895_104-215915-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 11:29:10,779 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.conv_skip_rate, batch_count=1251033.3333333333, ans=0.0 2023-10-03 11:29:11,812 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_40896_210_15710_1_1533433956_7100802_216-22824-0_sp1.1 from training. Duration: 0.92725 2023-10-03 11:29:11,813 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_14831_210_7927_1_1530700200_5583610_181-27467-0 from training. Duration: 0.91 2023-10-03 11:29:13,118 WARNING [train.py:1204] (2/4) Exclude cut with ID _30304_210_6319_1_1532476827261_7582060_1024-265645-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 11:29:13,121 WARNING [train.py:1204] (2/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_412-233904-0_sp1.1 from training. Duration: 0.936375 2023-10-03 11:29:13,146 WARNING [train.py:1204] (2/4) Exclude cut with ID _30191_210_1664_1_1533189915047_7196670_911-223461-0_sp1.1 from training. Duration: 0.92725 2023-10-03 11:29:13,154 WARNING [train.py:1204] (2/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_558-148266-0_sp1.1 from training. Duration: 0.963625 2023-10-03 11:29:15,215 WARNING [train.py:1204] (2/4) Exclude cut with ID _58480_210_12392_1_1534811121111_3646160_212-164495-0_sp1.1 from training. Duration: 0.936375 2023-10-03 11:29:15,217 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_454-276429-0_sp0.9 from training. Duration: 0.9666875 2023-10-03 11:29:15,309 WARNING [train.py:1204] (2/4) Exclude cut with ID _24736_210_3279_1_1532332894218_2838700_73-197086-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 11:29:16,645 WARNING [train.py:1204] (2/4) Exclude cut with ID _5294_210_1747_1_1527249643928_3696179_529-242298-0_sp1.1 from training. Duration: 0.863625 2023-10-03 11:29:16,695 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_53555_210_5632_1_1534595220_7232297_345-171438-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 11:29:23,292 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_28211_210_6388_1_1532861506_4523224_22-25766-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 11:29:23,546 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.ff2_skip_rate, batch_count=1251100.0, ans=0.0 2023-10-03 11:29:24,677 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7002_210_1794_1_1527939683_5157286_163-87552-0 from training. Duration: 0.86 2023-10-03 11:29:24,871 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.feed_forward3.hidden_balancer.prob, batch_count=1251100.0, ans=0.125 2023-10-03 11:29:26,163 WARNING [train.py:1204] (2/4) Exclude cut with ID _56927_210_9969_1_1534848970840_3695229_409-225129-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 11:29:27,503 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_26192_210_7927_1_1532137269_1040115_21-219331-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 11:29:28,910 WARNING [train.py:1204] (2/4) Exclude cut with ID _15012_210_1968_1_1530856820429_5095350_662-210469-0 from training. Duration: 0.57 2023-10-03 11:29:31,604 INFO [train.py:1046] (2/4) Epoch 36, batch 1750, loss[loss=0.142, simple_loss=0.2234, pruned_loss=0.03034, over 24325.00 frames. ], tot_loss[loss=0.1589, simple_loss=0.2376, pruned_loss=0.04005, over 4711891.95 frames. ], batch size: 61, lr: 2.84e-03, grad_scale: 16.0 2023-10-03 11:29:34,484 WARNING [train.py:1204] (2/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_582-191016-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 11:29:34,742 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.conv_module2.balancer2.prob, batch_count=1251166.6666666667, ans=0.125 2023-10-03 11:29:37,788 WARNING [train.py:1204] (2/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_270-210032-0_sp1.1 from training. Duration: 0.963625 2023-10-03 11:29:37,834 WARNING [train.py:1204] (2/4) Exclude cut with ID _6789_210_1534_1_1527857937218_3118950_262-48853-0_sp0.9 from training. Duration: 0.62225 2023-10-03 11:29:39,211 WARNING [train.py:1204] (2/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_847-41334-0 from training. Duration: 0.44 2023-10-03 11:29:39,246 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_40896_210_15710_1_1533433956_7100802_573-46532-0_sp1.1 from training. Duration: 0.92725 2023-10-03 11:29:41,922 WARNING [train.py:1204] (2/4) Exclude cut with ID _21881_210_3525_1_1531825242757_3798380_247-102591-0_sp1.1 from training. Duration: 0.8909375 2023-10-03 11:29:41,944 WARNING [train.py:1204] (2/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_200-21395-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 11:29:46,603 WARNING [train.py:1204] (2/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_711-15688-0 from training. Duration: 0.43 2023-10-03 11:29:48,090 WARNING [train.py:1204] (2/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_53-75025-0_sp1.1 from training. Duration: 0.963625 2023-10-03 11:29:50,805 WARNING [train.py:1204] (2/4) Exclude cut with ID _6194_210_1534_1_1527511826160_3941194_321-148641-0 from training. Duration: 0.64 2023-10-03 11:29:50,838 WARNING [train.py:1204] (2/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_337-294431-0_sp1.1 from training. Duration: 0.936375 2023-10-03 11:29:51,092 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.nonlin_attention.balancer.prob, batch_count=1251233.3333333333, ans=0.125 2023-10-03 11:29:52,829 WARNING [train.py:1204] (2/4) Exclude cut with ID _21632_210_2253_1_1531994001765_3744140_110-274654-0_sp1.1 from training. Duration: 0.7454375 2023-10-03 11:29:55,391 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.520e+02 1.875e+02 2.027e+02 2.400e+02 3.332e+02, threshold=4.055e+02, percent-clipped=0.0 2023-10-03 11:29:55,472 WARNING [train.py:1204] (2/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_135-174080-0_sp1.1 from training. Duration: 0.4818125 2023-10-03 11:29:56,825 WARNING [train.py:1204] (2/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_21-174514-0 from training. Duration: 0.43 2023-10-03 11:29:58,273 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_38965_210_4852_1_1533174906_3484735_215-294589-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 11:29:58,308 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_548-294316-0 from training. Duration: 0.81 2023-10-03 11:30:06,047 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_103-84010-0_sp1.1 from training. Duration: 0.62725 2023-10-03 11:30:08,766 WARNING [train.py:1204] (2/4) Exclude cut with ID _18199_210_6109_1_1531475789864_4288790_295-198247-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 11:30:08,788 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_13757_210_3528_1_1530440682_4107239_103-313224-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 11:30:09,505 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.nonlin_attention.whiten2.whitening_limit, batch_count=1251300.0, ans=15.0 2023-10-03 11:30:12,903 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_34886_210_10633_1_1533203882_1755661_67-294673-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 11:30:12,915 WARNING [train.py:1204] (2/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_350-101734-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 11:30:14,381 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_37357_210_17024_1_1533089504_2316121_204-153986-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 11:30:15,719 WARNING [train.py:1204] (2/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_673-115569-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 11:30:17,697 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.0.feed_forward1.out_proj.dropout_p, batch_count=1251366.6666666667, ans=0.1 2023-10-03 11:30:18,934 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_489-230703-0_sp0.9 from training. Duration: 0.9444375 2023-10-03 11:30:20,283 WARNING [train.py:1204] (2/4) Exclude cut with ID _5250_210_5219_1_1527249561806_8692110_575-102496-0_sp1.1 from training. Duration: 0.936375 2023-10-03 11:30:20,368 WARNING [train.py:1204] (2/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_669-93163-0 from training. Duration: 0.98 2023-10-03 11:30:24,029 WARNING [train.py:1204] (2/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_249-281349-0_sp0.9 from training. Duration: 0.9444375 2023-10-03 11:30:25,405 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.0.conv_skip_rate, batch_count=1251366.6666666667, ans=0.0 2023-10-03 11:30:26,578 WARNING [train.py:1204] (2/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_106-30782-0 from training. Duration: 0.8 2023-10-03 11:30:26,641 WARNING [train.py:1204] (2/4) Exclude cut with ID _8772_210_1850_1_1528635595972_3664011_398-320049-0_sp1.1 from training. Duration: 0.8 2023-10-03 11:30:28,045 WARNING [train.py:1204] (2/4) Exclude cut with ID _44778_210_2054_1_1533798127203_7443090_516-151727-0_sp1.1 from training. Duration: 0.963625 2023-10-03 11:30:28,285 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.feed_forward1.hidden_balancer.prob, batch_count=1251366.6666666667, ans=0.125 2023-10-03 11:30:29,377 WARNING [train.py:1204] (2/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_325-62802-0_sp1.1 from training. Duration: 0.7909375 2023-10-03 11:30:31,109 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.conv_module1.balancer2.prob, batch_count=1251433.3333333333, ans=0.125 2023-10-03 11:30:33,531 WARNING [train.py:1204] (2/4) Exclude cut with ID _14368_210_4220_1_1530532714870_3595529_93-58002-0_sp0.9 from training. Duration: 0.6333125 2023-10-03 11:30:34,887 WARNING [train.py:1204] (2/4) Exclude cut with ID _4681_210_3525_1_1527251040321_1235713_123-24428-0_sp1.1 from training. Duration: 0.436375 2023-10-03 11:30:34,936 WARNING [train.py:1204] (2/4) Exclude cut with ID _5250_210_5219_1_1527249561806_8692110_1185-56473-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 11:30:36,509 WARNING [train.py:1204] (2/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_96-244299-0_sp1.1 from training. Duration: 0.8 2023-10-03 11:30:41,134 WARNING [train.py:1204] (2/4) Exclude cut with ID _59500_210_8254_1_1534809610680_3736009_104-72815-0_sp1.1 from training. Duration: 0.963625 2023-10-03 11:30:43,872 WARNING [train.py:1204] (2/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_468-283113-0_sp1.1 from training. Duration: 0.87275 2023-10-03 11:30:45,170 INFO [train.py:1046] (2/4) Epoch 36, batch 1800, loss[loss=0.1761, simple_loss=0.2507, pruned_loss=0.05078, over 23846.00 frames. ], tot_loss[loss=0.1586, simple_loss=0.2374, pruned_loss=0.03985, over 4708887.14 frames. ], batch size: 195, lr: 2.84e-03, grad_scale: 16.0 2023-10-03 11:30:45,241 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6239_210_4211_1_1527765584_4306698_528-102186-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 11:30:45,306 WARNING [train.py:1204] (2/4) Exclude cut with ID _45297_210_2721_1_1533715351381_3264949_351-147136-0 from training. Duration: 0.87 2023-10-03 11:30:46,656 WARNING [train.py:1204] (2/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_520-51744-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 11:30:46,754 WARNING [train.py:1204] (2/4) Exclude cut with ID _5166_210_2084_1_1527073197160_4087993_310-114452-0_sp1.1 from training. Duration: 0.536375 2023-10-03 11:30:46,757 WARNING [train.py:1204] (2/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_450-165710-0_sp1.1 from training. Duration: 0.92725 2023-10-03 11:30:46,767 WARNING [train.py:1204] (2/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_280-120942-0_sp1.1 from training. Duration: 0.7 2023-10-03 11:30:46,797 WARNING [train.py:1204] (2/4) Exclude cut with ID _65585_210_12210_1_1535450451584_3764098_112-294301-0_sp0.9 from training. Duration: 0.97775 2023-10-03 11:30:48,114 WARNING [train.py:1204] (2/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_98-51676-0_sp0.9 from training. Duration: 0.811125 2023-10-03 11:30:52,606 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_33348_210_5632_1_1533086912_3324030_85-28583-0_sp1.1 from training. Duration: 0.7454375 2023-10-03 11:30:52,692 WARNING [train.py:1204] (2/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_336-208232-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 11:30:54,513 WARNING [train.py:1204] (2/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_339-104791-0_sp0.9 from training. Duration: 0.5444375 2023-10-03 11:30:55,956 WARNING [train.py:1204] (2/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_559-29840-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 11:31:00,422 WARNING [train.py:1204] (2/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_117-290916-0_sp0.9 from training. Duration: 0.5666875 2023-10-03 11:31:00,665 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.nonlin_attention.balancer.max_positive, batch_count=1251566.6666666667, ans=0.95 2023-10-03 11:31:01,794 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_43802_210_2290_1_1533516203_8829504_185-112289-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 11:31:03,255 WARNING [train.py:1204] (2/4) Exclude cut with ID _59256_210_5986_1_1535076085997_3589120_155-298797-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 11:31:05,989 WARNING [train.py:1204] (2/4) Exclude cut with ID _44659_210_2721_1_1534036120061_6614860_246-206531-0_sp1.1 from training. Duration: 0.92725 2023-10-03 11:31:06,036 WARNING [train.py:1204] (2/4) Exclude cut with ID _23818_210_6228_1_1531917063884_3676920_469-151944-0_sp1.1 from training. Duration: 0.92725 2023-10-03 11:31:07,366 WARNING [train.py:1204] (2/4) Exclude cut with ID _13252_210_1925_1_1530671372047_3336909_88-276668-0_sp1.1 from training. Duration: 0.9 2023-10-03 11:31:07,645 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.balancer_ff2.min_abs, batch_count=1251566.6666666667, ans=0.1 2023-10-03 11:31:10,537 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_15101_210_7927_1_1530786415_5033274_320-327527-0_sp1.1 from training. Duration: 0.87275 2023-10-03 11:31:10,545 WARNING [train.py:1204] (2/4) Exclude cut with ID _14998_210_5219_1_1530705582080_7474970_446-112117-0 from training. Duration: 0.9 2023-10-03 11:31:12,023 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_30280_210_1881_1_1532872779_3660139_400-254159-0_sp1.1 from training. Duration: 0.936375 2023-10-03 11:31:14,794 WARNING [train.py:1204] (2/4) Exclude cut with ID _40284_210_2084_1_1533513679814_3704279_158-345341-0_sp1.1 from training. Duration: 0.936375 2023-10-03 11:31:18,945 WARNING [train.py:1204] (2/4) Exclude cut with ID 3033-130750-0096-55598-0 from training. Duration: 0.83 2023-10-03 11:31:20,389 WARNING [train.py:1204] (2/4) Exclude cut with ID _9389_210_1664_1_1529217337301_5289269_379-128794-0 from training. Duration: 0.85 2023-10-03 11:31:20,409 WARNING [train.py:1204] (2/4) Exclude cut with ID _3675_210_4557_1_1526639934408_4182089_456-18305-0 from training. Duration: 0.76 2023-10-03 11:31:22,344 WARNING [train.py:1204] (2/4) Exclude cut with ID _29853_210_7497_1_1532505600851_6642110_571-249460-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 11:31:23,710 WARNING [train.py:1204] (2/4) Exclude cut with ID _4068_210_2629_1_1526697350395_1546259_230-325568-0_sp1.1 from training. Duration: 0.92725 2023-10-03 11:31:23,710 WARNING [train.py:1204] (2/4) Exclude cut with ID _34415_210_8750_1_1533085213375_3451116_213-267570-0_sp1.1 from training. Duration: 0.963625 2023-10-03 11:31:25,370 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6767_210_3273_1_1527855783_1718932_142-127132-0_sp1.1 from training. Duration: 0.863625 2023-10-03 11:31:31,605 WARNING [train.py:1204] (2/4) Exclude cut with ID IC0653W0001-322925 from training. Duration: 0.896 2023-10-03 11:31:31,695 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_4528_210_3273_1_1527076654_3276777_209-219804-0_sp1.1 from training. Duration: 0.62725 2023-10-03 11:31:31,868 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.balancer1.prob, batch_count=1251700.0, ans=0.125 2023-10-03 11:31:33,050 WARNING [train.py:1204] (2/4) Exclude cut with ID _55844_210_2346_1_1534471212569_7625707_187-204971-0_sp1.1 from training. Duration: 0.936375 2023-10-03 11:31:34,472 WARNING [train.py:1204] (2/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_369-325184-0 from training. Duration: 0.99 2023-10-03 11:31:34,511 WARNING [train.py:1204] (2/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_791-45149-0 from training. Duration: 0.86 2023-10-03 11:31:35,833 WARNING [train.py:1204] (2/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_317-211107-0_sp1.1 from training. Duration: 0.836375 2023-10-03 11:31:37,255 WARNING [train.py:1204] (2/4) Exclude cut with ID _30800_210_22382_1_1532499347904_4515959_488-330913-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 11:31:37,358 WARNING [train.py:1204] (2/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_506-42614-0_sp1.1 from training. Duration: 0.7181875 2023-10-03 11:31:39,039 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.bypass.skip_rate, batch_count=1251700.0, ans=0.07 2023-10-03 11:31:41,847 WARNING [train.py:1204] (2/4) Exclude cut with ID _8696_210_1876_1_1528545632440_3345058_209-54069-0 from training. Duration: 0.67 2023-10-03 11:31:47,442 WARNING [train.py:1204] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_215-202973-0_sp1.1 from training. Duration: 0.8 2023-10-03 11:31:47,519 WARNING [train.py:1204] (2/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_321-215742-0 from training. Duration: 0.5 2023-10-03 11:31:48,882 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_40896_210_15710_1_1533433956_7100802_428-32114-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 11:31:48,897 WARNING [train.py:1204] (2/4) Exclude cut with ID _27013_210_1389_1_1532437173240_3252649_467-263151-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 11:31:49,129 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.conv_module2.balancer1.prob, batch_count=1251766.6666666667, ans=0.125 2023-10-03 11:31:50,231 WARNING [train.py:1204] (2/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_734-129761-0_sp0.9 from training. Duration: 0.911125 2023-10-03 11:31:50,273 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_521-214276-0 from training. Duration: 0.44 2023-10-03 11:31:53,661 WARNING [train.py:1204] (2/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_80-147301-0_sp1.1 from training. Duration: 0.763625 2023-10-03 11:31:53,663 WARNING [train.py:1204] (2/4) Exclude cut with ID _9679_210_7497_1_1529129212906_4363929_56-307796-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 11:31:57,655 WARNING [train.py:1204] (2/4) Exclude cut with ID _14832_210_8750_1_1530691351539_3171348_1-54315-0 from training. Duration: 0.87 2023-10-03 11:31:57,656 WARNING [train.py:1204] (2/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_558-155750-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 11:31:59,431 INFO [train.py:1046] (2/4) Epoch 36, batch 1850, loss[loss=0.165, simple_loss=0.2531, pruned_loss=0.03841, over 24677.00 frames. ], tot_loss[loss=0.1588, simple_loss=0.2377, pruned_loss=0.03994, over 4704948.44 frames. ], batch size: 73, lr: 2.84e-03, grad_scale: 16.0 2023-10-03 11:31:59,584 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_142-309370-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 11:31:59,614 WARNING [train.py:1204] (2/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_18-46756-0_sp0.9 from training. Duration: 0.87775 2023-10-03 11:31:59,622 WARNING [train.py:1204] (2/4) Exclude cut with ID _61148_210_6897_1_1535094022726_4217610_279-212159-0_sp1.1 from training. Duration: 0.936375 2023-10-03 11:32:01,007 WARNING [train.py:1204] (2/4) Exclude cut with ID _21987_210_5854_1_1531821500482_3637038_164-2641-0_sp1.1 from training. Duration: 0.936375 2023-10-03 11:32:01,048 WARNING [train.py:1204] (2/4) Exclude cut with ID _8064_210_5399_1_1528372299291_4282973_395-340852-0_sp1.1 from training. Duration: 0.8454375 2023-10-03 11:32:03,872 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1077-184261-0_sp1.1 from training. Duration: 0.963625 2023-10-03 11:32:03,880 WARNING [train.py:1204] (2/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_1176-204992-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 11:32:05,278 WARNING [train.py:1204] (2/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_563-347832-0_sp1.1 from training. Duration: 0.8090625 2023-10-03 11:32:06,568 WARNING [train.py:1204] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_380-204756-0_sp0.9 from training. Duration: 0.9555625 2023-10-03 11:32:12,680 WARNING [train.py:1204] (2/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_212-10501-0_sp1.1 from training. Duration: 0.8545625 2023-10-03 11:32:12,708 WARNING [train.py:1204] (2/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_445-117398-0 from training. Duration: 0.89 2023-10-03 11:32:15,586 WARNING [train.py:1204] (2/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_29-280622-0 from training. Duration: 0.82 2023-10-03 11:32:16,325 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.2.encoder.layers.1.nonlin_attention.whiten1, num_groups=1, num_channels=288, metric=5.68 vs. limit=10.0 2023-10-03 11:32:18,470 WARNING [train.py:1204] (2/4) Exclude cut with ID _26664_210_7770_1_1532258943280_3418270_187-80815-0 from training. Duration: 0.93 2023-10-03 11:32:21,231 WARNING [train.py:1204] (2/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_414-349640-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 11:32:22,890 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.499e+02 1.882e+02 2.066e+02 2.277e+02 3.527e+02, threshold=4.131e+02, percent-clipped=0.0 2023-10-03 11:32:22,973 WARNING [train.py:1204] (2/4) Exclude cut with ID _45021_210_9994_1_1533619751094_3330288_96-239010-0 from training. Duration: 0.91 2023-10-03 11:32:22,976 WARNING [train.py:1204] (2/4) Exclude cut with ID _5189_210_4557_1_1527248319428_3769229_199-53526-0_sp0.9 from training. Duration: 0.4666875 2023-10-03 11:32:25,985 INFO [scaling.py:1118] (2/4) WithLoss: name=encoder.encoders.0.layers.1.self_attn_weights, loss-sum=0.000e+00 2023-10-03 11:32:33,145 WARNING [train.py:1204] (2/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_63-101996-0_sp1.1 from training. Duration: 0.8 2023-10-03 11:32:34,499 WARNING [train.py:1204] (2/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_199-86432-0 from training. Duration: 0.98 2023-10-03 11:32:34,776 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.conv_module1.balancer2.prob, batch_count=1251966.6666666667, ans=0.125 2023-10-03 11:32:37,549 WARNING [train.py:1204] (2/4) Exclude cut with ID _36399_210_16049_1_1532998771377_3698333_207-259013-0_sp1.1 from training. Duration: 0.92725 2023-10-03 11:32:37,585 WARNING [train.py:1204] (2/4) Exclude cut with ID _62574_210_2655_1_1535180407058_3933249_567-111756-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 11:32:43,517 WARNING [train.py:1204] (2/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_436-318113-0 from training. Duration: 0.75 2023-10-03 11:32:43,560 WARNING [train.py:1204] (2/4) Exclude cut with ID _53379_210_20034_1_1534222027363_3854654_43-305214-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 11:32:43,576 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_15126_210_6753_1_1530778677_2591185_113-157651-0_sp1.1 from training. Duration: 0.6181875 2023-10-03 11:32:43,806 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.conv_module1.balancer2.prob, batch_count=1252033.3333333333, ans=0.125 2023-10-03 11:32:44,905 WARNING [train.py:1204] (2/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_146-218763-0_sp1.1 from training. Duration: 0.7818125 2023-10-03 11:32:45,068 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.out_combiner.scale_min, batch_count=1252033.3333333333, ans=0.2 2023-10-03 11:32:46,347 WARNING [train.py:1204] (2/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_714-310538-0_sp0.9 from training. Duration: 0.9555625 2023-10-03 11:32:49,092 WARNING [train.py:1204] (2/4) Exclude cut with ID _24271_210_12658_1_1531987270022_7190519_709-16721-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 11:32:53,100 WARNING [train.py:1204] (2/4) Exclude cut with ID _5190_210_4557_1_1527244368799_373439_28-197030-0_sp0.9 from training. Duration: 0.8 2023-10-03 11:32:53,146 WARNING [train.py:1204] (2/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_267-197536-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 11:32:54,468 WARNING [train.py:1204] (2/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_658-282610-0_sp1.1 from training. Duration: 0.4818125 2023-10-03 11:32:54,476 WARNING [train.py:1204] (2/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_257-67050-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 11:32:55,874 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_14906_210_7927_1_1530754190_9408633_961-53402-0_sp1.1 from training. Duration: 0.936375 2023-10-03 11:32:58,032 WARNING [train.py:1204] (2/4) Exclude cut with ID _60113_210_16271_1_1535164212481_4386079_490-280290-0_sp1.1 from training. Duration: 0.8909375 2023-10-03 11:32:59,494 WARNING [train.py:1204] (2/4) Exclude cut with ID _14100_210_8341_1_1530529320613_3678069_258-224599-0 from training. Duration: 0.77 2023-10-03 11:33:01,367 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6961_210_2087_1_1527937168_6815011_565-326898-0_sp1.1 from training. Duration: 0.936375 2023-10-03 11:33:06,722 WARNING [train.py:1204] (2/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_239-316802-0_sp1.1 from training. Duration: 0.62725 2023-10-03 11:33:06,791 WARNING [train.py:1204] (2/4) Exclude cut with ID _55857_210_2346_1_1534644007831_6898209_745-98391-0_sp1.1 from training. Duration: 0.6909375 2023-10-03 11:33:06,794 WARNING [train.py:1204] (2/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_154-135696-0 from training. Duration: 0.8 2023-10-03 11:33:06,796 WARNING [train.py:1204] (2/4) Exclude cut with ID _14368_210_4220_1_1530532714870_3595529_93-58002-0 from training. Duration: 0.57 2023-10-03 11:33:09,499 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9025W0005-507335 from training. Duration: 0.896 2023-10-03 11:33:09,572 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9029W0034-509435 from training. Duration: 0.64 2023-10-03 11:33:12,169 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.3.encoder.layers.1.nonlin_attention.whiten1, num_groups=1, num_channels=384, metric=4.88 vs. limit=10.0 2023-10-03 11:33:12,681 INFO [train.py:1046] (2/4) Epoch 36, batch 1900, loss[loss=0.1716, simple_loss=0.2568, pruned_loss=0.04315, over 24628.00 frames. ], tot_loss[loss=0.1591, simple_loss=0.2382, pruned_loss=0.04004, over 4697386.19 frames. ], batch size: 73, lr: 2.84e-03, grad_scale: 8.0 2023-10-03 11:33:12,735 WARNING [train.py:1204] (2/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_76-203867-0_sp1.1 from training. Duration: 0.6545625 2023-10-03 11:33:12,745 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_46639_210_8341_1_1533726088_4495500_66-346325-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 11:33:12,751 WARNING [train.py:1204] (2/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_145-137790-0_sp0.9 from training. Duration: 0.92225 2023-10-03 11:33:12,774 WARNING [train.py:1204] (2/4) Exclude cut with ID _36522_210_8887_1_1533177030611_1772478_7-327299-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 11:33:12,842 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9042W0009-515615 from training. Duration: 0.896 2023-10-03 11:33:12,850 WARNING [train.py:1204] (2/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_39-248870-0_sp0.9 from training. Duration: 0.7666875 2023-10-03 11:33:12,891 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_35441_210_16554_1_1533347572_4092680_362-242912-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 11:33:14,268 WARNING [train.py:1204] (2/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_216-343007-0_sp1.1 from training. Duration: 0.6 2023-10-03 11:33:15,057 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.2.encoder.layers.0.self_attn2.whiten, num_groups=1, num_channels=384, metric=18.14 vs. limit=22.5 2023-10-03 11:33:15,633 WARNING [train.py:1204] (2/4) Exclude cut with ID _3867_210_1883_1_1526641200575_4475830_272-215563-0_sp0.9 from training. Duration: 0.6333125 2023-10-03 11:33:16,999 WARNING [train.py:1204] (2/4) Exclude cut with ID _21783_210_11357_1_1531742458730_3762679_553-175603-0_sp1.1 from training. Duration: 0.92725 2023-10-03 11:33:17,015 WARNING [train.py:1204] (2/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_963-187109-0 from training. Duration: 0.99 2023-10-03 11:33:20,890 WARNING [train.py:1204] (2/4) Exclude cut with ID _44973_210_1511_1_1533794381163_3634770_495-211466-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 11:33:20,901 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9068W0002-529085 from training. Duration: 0.896 2023-10-03 11:33:20,915 WARNING [train.py:1204] (2/4) Exclude cut with ID _5294_210_1747_1_1527249643928_3696179_180-100713-0_sp1.1 from training. Duration: 0.736375 2023-10-03 11:33:21,001 WARNING [train.py:1204] (2/4) Exclude cut with ID _35799_210_5854_1_1533263487887_3529071_149-49689-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 11:33:26,553 WARNING [train.py:1204] (2/4) Exclude cut with ID _67725_210_27767_1_1535608756671_5105359_36-220794-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 11:33:28,404 WARNING [train.py:1204] (2/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_338-96520-0_sp1.1 from training. Duration: 0.8545625 2023-10-03 11:33:28,498 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9104W0004-547160 from training. Duration: 0.896 2023-10-03 11:33:30,336 WARNING [train.py:1204] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_330-327861-0 from training. Duration: 0.67 2023-10-03 11:33:31,696 WARNING [train.py:1204] (2/4) Exclude cut with ID _14087_210_2253_1_1530529224512_3014029_156-257607-0_sp0.9 from training. Duration: 0.92225 2023-10-03 11:33:31,880 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.bypass_mid.scale_min, batch_count=1252233.3333333333, ans=0.2 2023-10-03 11:33:33,408 WARNING [train.py:1204] (2/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_476-322050-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 11:33:33,419 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9118W0017-553385 from training. Duration: 0.896 2023-10-03 11:33:33,469 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9120W0028-554435 from training. Duration: 0.896 2023-10-03 11:33:36,143 WARNING [train.py:1204] (2/4) Exclude cut with ID _56524_210_9642_1_1534572011779_3619170_145-310842-0 from training. Duration: 0.99 2023-10-03 11:33:37,559 WARNING [train.py:1204] (2/4) Exclude cut with ID _51631_210_8254_1_1534129228420_4084794_270-222508-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 11:33:40,373 WARNING [train.py:1204] (2/4) Exclude cut with ID _14268_210_9206_1_1530622771285_4714099_331-105351-0 from training. Duration: 0.65 2023-10-03 11:33:42,477 WARNING [train.py:1204] (2/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_40-129756-0 from training. Duration: 0.81 2023-10-03 11:33:47,948 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.conv_skip_rate, batch_count=1252300.0, ans=0.0 2023-10-03 11:33:53,276 WARNING [train.py:1204] (2/4) Exclude cut with ID _3541_210_2477_1_1526475408709_3263973_231-104736-0 from training. Duration: 0.78 2023-10-03 11:33:57,342 WARNING [train.py:1204] (2/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_214-291369-0 from training. Duration: 0.63 2023-10-03 11:33:57,350 WARNING [train.py:1204] (2/4) Exclude cut with ID _62574_210_2655_1_1535180407058_3933249_369-84347-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 11:33:57,416 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9210W0111-599240 from training. Duration: 0.896 2023-10-03 11:33:58,708 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_92-246270-0 from training. Duration: 0.85 2023-10-03 11:33:58,735 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_37152_210_9697_1_1533106573_3844188_29-239070-0 from training. Duration: 0.98 2023-10-03 11:33:58,793 WARNING [train.py:1204] (2/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_239-22076-0 from training. Duration: 0.85 2023-10-03 11:33:58,796 WARNING [train.py:1204] (2/4) Exclude cut with ID _65962_210_29927_1_1535436026519_4174930_368-44639-0_sp1.1 from training. Duration: 0.97275 2023-10-03 11:34:00,928 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.bypass.scale_min, batch_count=1252366.6666666667, ans=0.2 2023-10-03 11:34:03,452 WARNING [train.py:1204] (2/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_152-212533-0 from training. Duration: 0.97 2023-10-03 11:34:05,386 WARNING [train.py:1204] (2/4) Exclude cut with ID _14603_210_4220_1_1530615688441_3097319_90-31383-0_sp1.1 from training. Duration: 0.7909375 2023-10-03 11:34:05,585 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=1252366.6666666667, ans=0.1 2023-10-03 11:34:06,865 WARNING [train.py:1204] (2/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_196-152868-0_sp1.1 from training. Duration: 0.92725 2023-10-03 11:34:06,867 WARNING [train.py:1204] (2/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_140-181176-0 from training. Duration: 0.92 2023-10-03 11:34:09,591 WARNING [train.py:1204] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_168-32906-0_sp0.9 from training. Duration: 0.7666875 2023-10-03 11:34:11,214 WARNING [train.py:1204] (2/4) Exclude cut with ID _14922_210_6228_1_1530707435273_3693310_13-165512-0 from training. Duration: 0.82 2023-10-03 11:34:13,275 WARNING [train.py:1204] (2/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_381-317791-0_sp0.9 from training. Duration: 0.811125 2023-10-03 11:34:17,337 WARNING [train.py:1204] (2/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_63-267585-0_sp1.1 from training. Duration: 0.8181875 2023-10-03 11:34:17,339 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_326-303945-0_sp1.1 from training. Duration: 0.9 2023-10-03 11:34:17,352 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_194-143381-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 11:34:19,927 WARNING [train.py:1204] (2/4) Exclude cut with ID _39290_210_4220_1_1533862778760_3075118_194-16667-0_sp1.1 from training. Duration: 0.963625 2023-10-03 11:34:21,292 WARNING [train.py:1204] (2/4) Exclude cut with ID _15012_210_1968_1_1530856820429_5095350_662-210469-0_sp0.9 from training. Duration: 0.6333125 2023-10-03 11:34:21,319 WARNING [train.py:1204] (2/4) Exclude cut with ID _4697_210_2069_1_1526815801953_3823098_68-219807-0_sp1.1 from training. Duration: 0.42725 2023-10-03 11:34:22,625 WARNING [train.py:1204] (2/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_321-22821-0_sp0.9 from training. Duration: 0.988875 2023-10-03 11:34:24,030 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_14056_210_4163_1_1530604483_7572827_1037-45317-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 11:34:24,031 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_403-39619-0_sp1.1 from training. Duration: 0.763625 2023-10-03 11:34:25,334 INFO [train.py:1046] (2/4) Epoch 36, batch 1950, loss[loss=0.146, simple_loss=0.226, pruned_loss=0.03304, over 24282.00 frames. ], tot_loss[loss=0.1598, simple_loss=0.2391, pruned_loss=0.04027, over 4707420.13 frames. ], batch size: 56, lr: 2.84e-03, grad_scale: 4.0 2023-10-03 11:34:27,240 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_20120_210_6753_1_1531643035_5482859_510-96996-0_sp0.9 from training. Duration: 0.9666875 2023-10-03 11:34:27,242 WARNING [train.py:1204] (2/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_225-180155-0_sp1.1 from training. Duration: 0.92725 2023-10-03 11:34:28,556 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_278-272612-0_sp0.9 from training. Duration: 0.811125 2023-10-03 11:34:28,652 WARNING [train.py:1204] (2/4) Exclude cut with ID _53713_210_24116_1_1534314487762_7416751_691-70244-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 11:34:31,988 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6788_210_1388_1_1527898698_4562021_91-197981-0_sp1.1 from training. Duration: 0.7545625 2023-10-03 11:34:32,586 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.4.encoder.layers.1.feed_forward2.out_whiten, num_groups=1, num_channels=384, metric=9.54 vs. limit=15.0 2023-10-03 11:34:33,548 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=1252500.0, ans=0.1 2023-10-03 11:34:34,710 WARNING [train.py:1204] (2/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_60-18123-0_sp0.9 from training. Duration: 0.97775 2023-10-03 11:34:35,904 WARNING [train.py:1204] (2/4) Exclude cut with ID _35799_210_5854_1_1533263487887_3529071_5-214617-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 11:34:35,909 WARNING [train.py:1204] (2/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_890-5117-0_sp1.1 from training. Duration: 0.7090625 2023-10-03 11:34:37,411 WARNING [train.py:1204] (2/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_660-211088-0 from training. Duration: 0.62 2023-10-03 11:34:37,474 WARNING [train.py:1204] (2/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_314-126819-0_sp0.9 from training. Duration: 0.5555625 2023-10-03 11:34:38,809 WARNING [train.py:1204] (2/4) Exclude cut with ID _21632_210_2253_1_1531994001765_3744140_362-66225-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 11:34:38,905 WARNING [train.py:1204] (2/4) Exclude cut with ID _61118_210_9527_1_1534897668884_3752630_173-258555-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 11:34:42,949 WARNING [train.py:1204] (2/4) Exclude cut with ID _7719_210_3525_1_1528456153163_4296960_88-250259-0_sp0.9 from training. Duration: 0.9333125 2023-10-03 11:34:42,973 WARNING [train.py:1204] (2/4) Exclude cut with ID _15140_210_9288_1_1530791388450_4880149_539-153708-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 11:34:43,003 WARNING [train.py:1204] (2/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_253-42544-0_sp1.1 from training. Duration: 0.97275 2023-10-03 11:34:44,994 WARNING [train.py:1204] (2/4) Exclude cut with ID _33640_210_2655_1_1533117605479_4855469_479-296877-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 11:34:47,657 WARNING [train.py:1204] (2/4) Exclude cut with ID 3033-130750-0096-55598-0_sp1.1 from training. Duration: 0.7545625 2023-10-03 11:34:47,674 WARNING [train.py:1204] (2/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_191-41023-0_sp1.1 from training. Duration: 0.6454375 2023-10-03 11:34:47,689 WARNING [train.py:1204] (2/4) Exclude cut with ID _8365_210_2064_1_1528592311070_4677690_431-294102-0_sp1.1 from training. Duration: 0.8090625 2023-10-03 11:34:47,719 WARNING [train.py:1204] (2/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_893-296030-0_sp1.1 from training. Duration: 0.97275 2023-10-03 11:34:50,434 WARNING [train.py:1204] (2/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_401-343820-0_sp1.1 from training. Duration: 0.97275 2023-10-03 11:34:51,697 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.582e+02 1.918e+02 2.114e+02 2.421e+02 3.439e+02, threshold=4.228e+02, percent-clipped=0.0 2023-10-03 11:34:53,271 WARNING [train.py:1204] (2/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_127-566-0_sp1.1 from training. Duration: 0.763625 2023-10-03 11:34:53,272 WARNING [train.py:1204] (2/4) Exclude cut with ID _14100_210_8341_1_1530529320613_3678069_126-272847-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 11:34:53,293 WARNING [train.py:1204] (2/4) Exclude cut with ID _15133_210_6228_1_1530794512074_3080379_29-264458-0_sp1.1 from training. Duration: 0.52725 2023-10-03 11:34:53,293 WARNING [train.py:1204] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_127-119109-0 from training. Duration: 0.72 2023-10-03 11:34:54,554 WARNING [train.py:1204] (2/4) Exclude cut with ID _6537_210_1883_1_1527987559710_3597129_382-133062-0_sp1.1 from training. Duration: 0.6181875 2023-10-03 11:34:54,566 WARNING [train.py:1204] (2/4) Exclude cut with ID _29529_210_7234_1_1532393961455_3977460_391-219682-0_sp1.1 from training. Duration: 0.936375 2023-10-03 11:34:54,604 WARNING [train.py:1204] (2/4) Exclude cut with ID _37537_210_2253_1_1533171758568_3421949_96-137189-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 11:34:58,017 WARNING [train.py:1204] (2/4) Exclude cut with ID _58414_210_8254_1_1535421609162_7151182_15-276301-0_sp1.1 from training. Duration: 0.97275 2023-10-03 11:35:01,323 WARNING [train.py:1204] (2/4) Exclude cut with ID _59502_210_12210_1_1535107024698_1835139_110-289074-0_sp1.1 from training. Duration: 0.92725 2023-10-03 11:35:04,684 WARNING [train.py:1204] (2/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_367-61330-0_sp1.1 from training. Duration: 0.6818125 2023-10-03 11:35:07,506 WARNING [train.py:1204] (2/4) Exclude cut with ID _45297_210_2721_1_1533715351381_3264949_353-9541-0_sp1.1 from training. Duration: 0.8818125 2023-10-03 11:35:07,533 WARNING [train.py:1204] (2/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_85-138257-0_sp0.9 from training. Duration: 0.788875 2023-10-03 11:35:07,569 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_3787_210_2040_1_1526477544_5478586_603-120324-0 from training. Duration: 0.81 2023-10-03 11:35:07,610 WARNING [train.py:1204] (2/4) Exclude cut with ID _42863_210_9070_1_1533902398788_3696920_411-204131-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 11:35:11,760 WARNING [train.py:1204] (2/4) Exclude cut with ID _30196_210_1664_1_1533708496522_7074140_822-289909-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 11:35:13,493 WARNING [train.py:1204] (2/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_32-335103-0_sp0.9 from training. Duration: 0.911125 2023-10-03 11:35:13,539 WARNING [train.py:1204] (2/4) Exclude cut with ID _6493_210_1747_1_1528459210424_4012470_513-140967-0_sp0.9 from training. Duration: 0.87775 2023-10-03 11:35:20,611 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_47896_210_20508_1_1534492518_5958868_439-257766-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 11:35:22,558 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_1294-63152-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 11:35:25,435 WARNING [train.py:1204] (2/4) Exclude cut with ID _59170_210_6721_1_1534983633960_4203335_319-166512-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 11:35:26,891 WARNING [train.py:1204] (2/4) Exclude cut with ID _3541_210_2477_1_1526475408709_3263973_146-3470-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 11:35:30,207 WARNING [train.py:1204] (2/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_678-159307-0_sp1.1 from training. Duration: 0.77275 2023-10-03 11:35:30,255 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_343-48822-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 11:35:30,310 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_4692_210_3528_1_1526899755_4323254_107-44401-0 from training. Duration: 0.86 2023-10-03 11:35:31,469 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6291_210_3273_1_1527683649_1078498_74-292608-0_sp1.1 from training. Duration: 0.6545625 2023-10-03 11:35:31,530 WARNING [train.py:1204] (2/4) Exclude cut with ID _55644_210_11856_1_1534593599582_4103719_293-248694-0_sp1.1 from training. Duration: 0.97275 2023-10-03 11:35:31,619 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_3172_210_2087_1_1526292929_3987865_197-97762-0 from training. Duration: 0.75 2023-10-03 11:35:34,349 WARNING [train.py:1204] (2/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_264-348896-0_sp0.9 from training. Duration: 0.9444375 2023-10-03 11:35:38,457 WARNING [train.py:1204] (2/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_209-205365-0_sp0.9 from training. Duration: 0.87775 2023-10-03 11:35:39,756 INFO [train.py:1046] (2/4) Epoch 36, batch 2000, loss[loss=0.1783, simple_loss=0.2624, pruned_loss=0.04712, over 23944.00 frames. ], tot_loss[loss=0.1599, simple_loss=0.2397, pruned_loss=0.0401, over 4714237.52 frames. ], batch size: 86, lr: 2.83e-03, grad_scale: 8.0 2023-10-03 11:35:39,852 WARNING [train.py:1204] (2/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_222-198127-0_sp0.9 from training. Duration: 0.9333125 2023-10-03 11:35:41,071 WARNING [train.py:1204] (2/4) Exclude cut with ID _57572_210_5517_1_1534921219624_3809484_52-43530-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 11:35:42,467 WARNING [train.py:1204] (2/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_96-320736-0_sp1.1 from training. Duration: 0.7818125 2023-10-03 11:35:43,857 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_13091_210_7925_1_1530609447_8987354_1159-225508-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 11:35:48,660 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.ff3_skip_rate, batch_count=1252833.3333333333, ans=0.0 2023-10-03 11:35:48,707 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.bypass.skip_rate, batch_count=1252833.3333333333, ans=0.07 2023-10-03 11:35:49,827 WARNING [train.py:1204] (2/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_237-44332-0 from training. Duration: 0.94 2023-10-03 11:35:49,882 WARNING [train.py:1204] (2/4) Exclude cut with ID _7970_210_1883_1_1528455616296_3472230_25-178407-0_sp0.9 from training. Duration: 0.788875 2023-10-03 11:35:52,652 WARNING [train.py:1204] (2/4) Exclude cut with ID _29003_210_19596_1_1532507391720_3894098_357-257062-0_sp1.1 from training. Duration: 0.936375 2023-10-03 11:35:54,629 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_809-17156-0 from training. Duration: 0.73 2023-10-03 11:35:56,049 WARNING [train.py:1204] (2/4) Exclude cut with ID _7160_210_1876_1_1527940923231_3703799_302-150728-0_sp1.1 from training. Duration: 0.7090625 2023-10-03 11:35:56,066 WARNING [train.py:1204] (2/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_444-28041-0_sp0.9 from training. Duration: 0.9444375 2023-10-03 11:35:58,744 WARNING [train.py:1204] (2/4) Exclude cut with ID _52892_210_6721_1_1534378941259_4124129_278-251666-0_sp1.1 from training. Duration: 0.92725 2023-10-03 11:36:00,775 WARNING [train.py:1204] (2/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_115-326778-0 from training. Duration: 0.93 2023-10-03 11:36:03,367 WARNING [train.py:1204] (2/4) Exclude cut with ID _41391_210_7994_1_1533603771872_3638201_31-188961-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 11:36:04,830 WARNING [train.py:1204] (2/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_1073-6264-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 11:36:04,860 WARNING [train.py:1204] (2/4) Exclude cut with ID _66868_210_9527_1_1535509287954_4561140_435-259515-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 11:36:06,744 WARNING [train.py:1204] (2/4) Exclude cut with ID _14886_210_4220_1_1530702020355_3516190_159-73748-0 from training. Duration: 0.83 2023-10-03 11:36:06,775 WARNING [train.py:1204] (2/4) Exclude cut with ID _15150_210_8341_1_1530752411803_7256384_469-239509-0_sp1.1 from training. Duration: 0.6181875 2023-10-03 11:36:08,172 WARNING [train.py:1204] (2/4) Exclude cut with ID _8064_210_5399_1_1528372299291_4282973_395-340852-0 from training. Duration: 0.93 2023-10-03 11:36:08,175 WARNING [train.py:1204] (2/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_138-187031-0_sp1.1 from training. Duration: 0.9 2023-10-03 11:36:11,075 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_13757_210_3528_1_1530440682_4107239_48-106347-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 11:36:12,375 WARNING [train.py:1204] (2/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_164-224052-0_sp1.1 from training. Duration: 0.636375 2023-10-03 11:36:12,379 WARNING [train.py:1204] (2/4) Exclude cut with ID _59288_210_5854_1_1535077757774_3412246_408-322648-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 11:36:12,428 WARNING [train.py:1204] (2/4) Exclude cut with ID _30191_210_1664_1_1533189915047_7196670_827-36359-0_sp1.1 from training. Duration: 0.963625 2023-10-03 11:36:12,607 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.ff2_skip_rate, batch_count=1252966.6666666667, ans=0.0 2023-10-03 11:36:13,763 WARNING [train.py:1204] (2/4) Exclude cut with ID _4680_210_3525_1_1527249757953_893386_75-78979-0_sp1.1 from training. Duration: 0.82725 2023-10-03 11:36:13,831 WARNING [train.py:1204] (2/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_383-165234-0 from training. Duration: 0.94 2023-10-03 11:36:17,153 WARNING [train.py:1204] (2/4) Exclude cut with ID _19896_210_1883_1_1531709022662_6649569_94-229927-0 from training. Duration: 0.97 2023-10-03 11:36:17,167 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6788_210_1388_1_1527898698_4562021_180-75288-0_sp1.1 from training. Duration: 0.9 2023-10-03 11:36:17,410 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.feed_forward1.out_proj.dropout_p, batch_count=1252966.6666666667, ans=0.1 2023-10-03 11:36:18,191 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_26433_210_12108_1_1532157158_4056480_54-321349-0_sp1.1 from training. Duration: 0.97275 2023-10-03 11:36:21,277 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.feed_forward3.hidden_balancer.prob, batch_count=1252966.6666666667, ans=0.125 2023-10-03 11:36:22,446 WARNING [train.py:1204] (2/4) Exclude cut with ID _56108_210_16380_1_1534505446159_3887959_265-161493-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 11:36:24,278 WARNING [train.py:1204] (2/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_395-111602-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 11:36:24,290 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_154-316450-0_sp0.9 from training. Duration: 0.8444375 2023-10-03 11:36:25,768 WARNING [train.py:1204] (2/4) Exclude cut with ID _43552_210_8913_1_1533726031059_3523429_381-116041-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 11:36:28,485 WARNING [train.py:1204] (2/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_65-104124-0_sp1.1 from training. Duration: 0.963625 2023-10-03 11:36:29,737 WARNING [train.py:1204] (2/4) Exclude cut with ID _6536_210_1883_1_1527850783339_3319069_169-23811-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 11:36:29,779 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_289-180904-0_sp0.9 from training. Duration: 0.8444375 2023-10-03 11:36:29,789 WARNING [train.py:1204] (2/4) Exclude cut with ID _56406_210_9288_1_1534899618664_3496149_226-242400-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 11:36:31,843 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_24899_210_16554_1_1532046422_3827462_276-216330-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 11:36:34,567 WARNING [train.py:1204] (2/4) Exclude cut with ID _29345_210_4381_1_1532689168263_3860169_198-174328-0_sp1.1 from training. Duration: 0.82725 2023-10-03 11:36:34,630 WARNING [train.py:1204] (2/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_338-96520-0 from training. Duration: 0.94 2023-10-03 11:36:39,213 WARNING [train.py:1204] (2/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_7-111954-0_sp1.1 from training. Duration: 0.5090625 2023-10-03 11:36:39,308 WARNING [train.py:1204] (2/4) Exclude cut with ID _36692_210_5665_1_1533608967420_398809_8-16261-0_sp1.1 from training. Duration: 0.97275 2023-10-03 11:36:42,091 WARNING [train.py:1204] (2/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_86-212657-0_sp1.1 from training. Duration: 0.97275 2023-10-03 11:36:43,448 WARNING [train.py:1204] (2/4) Exclude cut with ID _44659_210_2721_1_1534036120061_6614860_626-298884-0_sp1.1 from training. Duration: 0.8818125 2023-10-03 11:36:46,193 WARNING [train.py:1204] (2/4) Exclude cut with ID _49755_210_18053_1_1534581005846_3218709_120-257918-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 11:36:49,320 WARNING [train.py:1204] (2/4) Exclude cut with ID _21881_210_3525_1_1531825242757_3798380_400-113821-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 11:36:49,322 WARNING [train.py:1204] (2/4) Exclude cut with ID _21783_210_11357_1_1531742458730_3762679_387-236773-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 11:36:50,680 WARNING [train.py:1204] (2/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_613-232856-0_sp1.1 from training. Duration: 0.4909375 2023-10-03 11:36:50,716 WARNING [train.py:1204] (2/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_181-78632-0_sp0.9 from training. Duration: 0.8666875 2023-10-03 11:36:53,336 INFO [train.py:1046] (2/4) Epoch 36, batch 2050, loss[loss=0.1378, simple_loss=0.2031, pruned_loss=0.03626, over 22750.00 frames. ], tot_loss[loss=0.159, simple_loss=0.2387, pruned_loss=0.03967, over 4725775.81 frames. ], batch size: 322, lr: 2.83e-03, grad_scale: 8.0 2023-10-03 11:36:53,425 WARNING [train.py:1204] (2/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_175-17485-0_sp1.1 from training. Duration: 0.97275 2023-10-03 11:36:55,282 WARNING [train.py:1204] (2/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_1004-183422-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 11:36:58,021 WARNING [train.py:1204] (2/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_32-65962-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 11:36:59,391 WARNING [train.py:1204] (2/4) Exclude cut with ID _15458_210_8341_1_1530858614263_3834410_40-298664-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 11:37:04,196 WARNING [train.py:1204] (2/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_380-261863-0_sp1.1 from training. Duration: 0.963625 2023-10-03 11:37:06,968 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_372-173012-0_sp1.1 from training. Duration: 0.863625 2023-10-03 11:37:07,018 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_57-82205-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 11:37:08,341 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_3691_210_3528_1_1526468169_4062053_355-334432-0_sp0.9 from training. Duration: 0.9666875 2023-10-03 11:37:08,605 INFO [scaling.py:1118] (2/4) WithLoss: name=encoder.encoders.2.encoder.layers.2.self_attn_weights, loss-sum=0.000e+00 2023-10-03 11:37:09,609 WARNING [train.py:1204] (2/4) Exclude cut with ID _19023_210_4353_1_1531378892552_5820780_28-36615-0 from training. Duration: 0.97 2023-10-03 11:37:09,613 WARNING [train.py:1204] (2/4) Exclude cut with ID _29853_210_7497_1_1532505600851_6642110_286-251333-0_sp1.1 from training. Duration: 0.92725 2023-10-03 11:37:11,360 WARNING [train.py:1204] (2/4) Exclude cut with ID _31602_210_5854_1_1532677021456_4458250_518-80679-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 11:37:11,394 WARNING [train.py:1204] (2/4) Exclude cut with ID _14922_210_6228_1_1530707435273_3693310_38-187851-0_sp0.9 from training. Duration: 0.688875 2023-10-03 11:37:14,523 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.ff3_skip_rate, batch_count=1253233.3333333333, ans=0.0 2023-10-03 11:37:15,968 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.ff3_skip_rate, batch_count=1253233.3333333333, ans=0.0 2023-10-03 11:37:20,198 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.542e+02 1.868e+02 2.079e+02 2.374e+02 3.501e+02, threshold=4.157e+02, percent-clipped=0.0 2023-10-03 11:37:20,301 WARNING [train.py:1204] (2/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_39-248870-0_sp1.1 from training. Duration: 0.62725 2023-10-03 11:37:20,339 WARNING [train.py:1204] (2/4) Exclude cut with ID _16908_210_5724_1_1531119157248_3772700_154-31455-0_sp1.1 from training. Duration: 0.97275 2023-10-03 11:37:21,737 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8453_210_4211_1_1528502940_8562730_347-330673-0 from training. Duration: 0.72 2023-10-03 11:37:24,428 WARNING [train.py:1204] (2/4) Exclude cut with ID _30191_210_1664_1_1533189915047_7196670_674-204247-0_sp1.1 from training. Duration: 0.97275 2023-10-03 11:37:24,750 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.conv_skip_rate, batch_count=1253300.0, ans=0.0 2023-10-03 11:37:26,357 WARNING [train.py:1204] (2/4) Exclude cut with ID _8833_210_5219_1_1528592665210_7553059_329-125156-0 from training. Duration: 0.82 2023-10-03 11:37:26,398 WARNING [train.py:1204] (2/4) Exclude cut with ID _4779_210_2084_1_1527318022944_3914893_500-285013-0_sp1.1 from training. Duration: 0.62725 2023-10-03 11:37:29,608 WARNING [train.py:1204] (2/4) Exclude cut with ID _14421_210_8750_1_1530604795825_3542290_190-313578-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 11:37:34,957 WARNING [train.py:1204] (2/4) Exclude cut with ID _35799_210_5854_1_1533263487887_3529071_538-273574-0_sp1.1 from training. Duration: 0.936375 2023-10-03 11:37:35,038 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_14928_210_5632_1_1530773461_4714000_454-112198-0_sp1.1 from training. Duration: 0.6 2023-10-03 11:37:36,417 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_468-143126-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 11:37:37,846 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_4528_210_3273_1_1527076654_3276777_258-121681-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 11:37:37,933 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_397-132565-0_sp1.1 from training. Duration: 0.9 2023-10-03 11:37:37,958 WARNING [train.py:1204] (2/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_454-123002-0_sp0.9 from training. Duration: 0.7666875 2023-10-03 11:37:39,526 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.balancer2.prob, batch_count=1253366.6666666667, ans=0.125 2023-10-03 11:37:40,584 WARNING [train.py:1204] (2/4) Exclude cut with ID _27052_210_5986_1_1532329197187_3585009_297-86890-0_sp1.1 from training. Duration: 0.936375 2023-10-03 11:37:43,893 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_51225_210_3601_1_1534400634_2327383_55-301844-0_sp1.1 from training. Duration: 0.8454375 2023-10-03 11:37:45,229 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6786_210_1388_1_1527839699_4194495_378-46070-0_sp0.9 from training. Duration: 0.8 2023-10-03 11:37:46,641 WARNING [train.py:1204] (2/4) Exclude cut with ID _42334_210_7770_1_1534206610396_3605880_55-19758-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 11:37:50,712 WARNING [train.py:1204] (2/4) Exclude cut with ID _14884_210_2252_1_1530707459689_5561250_114-201399-0_sp0.9 from training. Duration: 0.8444375 2023-10-03 11:37:54,131 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.0.feed_forward1.out_proj.dropout_p, batch_count=1253433.3333333333, ans=0.1 2023-10-03 11:37:55,359 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_99-251314-0_sp0.9 from training. Duration: 0.9666875 2023-10-03 11:37:58,566 WARNING [train.py:1204] (2/4) Exclude cut with ID _26528_210_9070_1_1532570410842_3851310_497-97218-0 from training. Duration: 0.86 2023-10-03 11:38:03,481 WARNING [train.py:1204] (2/4) Exclude cut with ID _55328_210_27767_1_1534580973260_4088170_230-317241-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 11:38:04,913 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_125-203105-0_sp0.9 from training. Duration: 0.888875 2023-10-03 11:38:06,367 WARNING [train.py:1204] (2/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_55-31904-0_sp1.1 from training. Duration: 0.8 2023-10-03 11:38:07,661 WARNING [train.py:1204] (2/4) Exclude cut with ID _8064_210_5399_1_1528372299291_4282973_18-174647-0 from training. Duration: 0.91 2023-10-03 11:38:10,444 INFO [train.py:1046] (2/4) Epoch 36, batch 2100, loss[loss=0.1558, simple_loss=0.237, pruned_loss=0.03727, over 23589.00 frames. ], tot_loss[loss=0.1582, simple_loss=0.2374, pruned_loss=0.03954, over 4714766.27 frames. ], batch size: 120, lr: 2.83e-03, grad_scale: 8.0 2023-10-03 11:38:10,545 WARNING [train.py:1204] (2/4) Exclude cut with ID IC0165W0005-81726 from training. Duration: 0.952 2023-10-03 11:38:10,546 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_31583_210_3852_1_1532595851_4024413_148-171847-0_sp1.1 from training. Duration: 0.963625 2023-10-03 11:38:10,600 WARNING [train.py:1204] (2/4) Exclude cut with ID _21989_210_5854_1_1531899214931_3271360_116-138769-0_sp1.1 from training. Duration: 0.936375 2023-10-03 11:38:11,925 WARNING [train.py:1204] (2/4) Exclude cut with ID _66070_210_7640_1_1535441393096_7805310_489-292631-0_sp1.1 from training. Duration: 0.7181875 2023-10-03 11:38:14,495 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_20120_210_6753_1_1531643035_5482859_202-27960-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 11:38:14,504 WARNING [train.py:1204] (2/4) Exclude cut with ID _5294_210_1747_1_1527249643928_3696179_342-270278-0 from training. Duration: 0.55 2023-10-03 11:38:14,551 WARNING [train.py:1204] (2/4) Exclude cut with ID _38323_210_12108_1_1533121233369_3303049_416-11919-0 from training. Duration: 0.96 2023-10-03 11:38:16,519 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6545_210_2040_1_1527774670_4245258_407-48070-0_sp0.9 from training. Duration: 0.8444375 2023-10-03 11:38:18,276 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.self_attn_weights.pos_emb_skip_rate, batch_count=1253500.0, ans=0.0 2023-10-03 11:38:19,472 WARNING [train.py:1204] (2/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_503-30719-0_sp1.1 from training. Duration: 0.8909375 2023-10-03 11:38:20,798 WARNING [train.py:1204] (2/4) Exclude cut with ID _49643_210_7989_1_1534640244224_3939031_234-326497-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 11:38:24,007 WARNING [train.py:1204] (2/4) Exclude cut with ID _14170_210_5219_1_1530525949868_4469650_301-77873-0_sp1.1 from training. Duration: 0.963625 2023-10-03 11:38:24,055 WARNING [train.py:1204] (2/4) Exclude cut with ID _45297_210_2721_1_1533715351381_3264949_152-154697-0_sp1.1 from training. Duration: 0.97275 2023-10-03 11:38:24,062 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_3371_210_1794_1_1526384462_5580777_396-339812-0 from training. Duration: 0.85 2023-10-03 11:38:25,395 WARNING [train.py:1204] (2/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_399-156791-0_sp1.1 from training. Duration: 0.7545625 2023-10-03 11:38:26,658 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7696_210_2040_1_1528207167_3716099_117-125434-0 from training. Duration: 0.76 2023-10-03 11:38:26,663 WARNING [train.py:1204] (2/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_96-244299-0 from training. Duration: 0.88 2023-10-03 11:38:28,721 WARNING [train.py:1204] (2/4) Exclude cut with ID _37892_210_19596_1_1533198677897_3964058_519-343462-0_sp1.1 from training. Duration: 0.92725 2023-10-03 11:38:28,753 WARNING [train.py:1204] (2/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_202-183922-0_sp1.1 from training. Duration: 0.77275 2023-10-03 11:38:28,761 WARNING [train.py:1204] (2/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_453-133983-0 from training. Duration: 0.78 2023-10-03 11:38:28,801 WARNING [train.py:1204] (2/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_178-39722-0_sp1.1 from training. Duration: 0.4454375 2023-10-03 11:38:33,547 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_15126_210_6753_1_1530778677_2591185_113-157651-0 from training. Duration: 0.68 2023-10-03 11:38:33,548 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_271-234962-0_sp1.1 from training. Duration: 0.7181875 2023-10-03 11:38:36,325 WARNING [train.py:1204] (2/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_195-64967-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 11:38:37,651 WARNING [train.py:1204] (2/4) Exclude cut with ID _43552_210_8913_1_1533726031059_3523429_365-215065-0_sp1.1 from training. Duration: 0.936375 2023-10-03 11:38:40,344 WARNING [train.py:1204] (2/4) Exclude cut with ID _31602_210_5854_1_1532677021456_4458250_249-96604-0_sp0.9 from training. Duration: 0.988875 2023-10-03 11:38:40,396 WARNING [train.py:1204] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_307-158462-0 from training. Duration: 0.69 2023-10-03 11:38:40,444 WARNING [train.py:1204] (2/4) Exclude cut with ID _30918_210_6400_1_1532754078600_3125920_407-223394-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 11:38:41,745 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6673_210_3273_1_1527767790_3173240_351-167954-0_sp0.9 from training. Duration: 0.5333125 2023-10-03 11:38:43,191 WARNING [train.py:1204] (2/4) Exclude cut with ID _4866_210_2577_1_1527404412284_4364659_256-306830-0 from training. Duration: 0.87 2023-10-03 11:38:43,268 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_17613_210_2151_1_1531137569_2366160_222-131118-0_sp1.1 from training. Duration: 0.963625 2023-10-03 11:38:44,551 WARNING [train.py:1204] (2/4) Exclude cut with ID _45560_210_21982_1_1533812603951_3584999_111-6376-0 from training. Duration: 0.84 2023-10-03 11:38:44,586 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_216-73841-0 from training. Duration: 0.96 2023-10-03 11:38:44,628 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_14058_210_4163_1_1530690953_6327180_49-210687-0 from training. Duration: 0.94 2023-10-03 11:38:47,839 WARNING [train.py:1204] (2/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_506-42614-0_sp0.9 from training. Duration: 0.87775 2023-10-03 11:38:49,271 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_3133_210_2087_1_1526106446_2324690_25-332138-0_sp1.1 from training. Duration: 0.82725 2023-10-03 11:38:52,042 WARNING [train.py:1204] (2/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_589-9070-0_sp1.1 from training. Duration: 0.6454375 2023-10-03 11:38:52,935 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.bypass.skip_rate, batch_count=1253633.3333333333, ans=0.07 2023-10-03 11:38:53,837 WARNING [train.py:1204] (2/4) Exclude cut with ID _6194_210_1534_1_1527511826160_3941194_14-282876-0_sp1.1 from training. Duration: 0.6454375 2023-10-03 11:38:55,173 WARNING [train.py:1204] (2/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_1166-90654-0_sp1.1 from training. Duration: 0.92725 2023-10-03 11:38:55,280 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_14056_210_4163_1_1530604483_7572827_218-255858-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 11:38:55,281 WARNING [train.py:1204] (2/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_1285-256622-0 from training. Duration: 0.84 2023-10-03 11:38:55,294 WARNING [train.py:1204] (2/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_205-286261-0_sp1.1 from training. Duration: 0.963625 2023-10-03 11:38:55,514 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.self_attn_weights.pos_emb_skip_rate, batch_count=1253700.0, ans=0.0 2023-10-03 11:38:56,589 WARNING [train.py:1204] (2/4) Exclude cut with ID _68777_210_4353_1_1535810390731_3117904_6-154119-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 11:38:56,657 WARNING [train.py:1204] (2/4) Exclude cut with ID _22982_210_1883_1_1531882790447_5449409_565-199393-0_sp1.1 from training. Duration: 0.92725 2023-10-03 11:38:58,444 WARNING [train.py:1204] (2/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_278-154492-0 from training. Duration: 0.76 2023-10-03 11:39:01,143 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_426-313557-0 from training. Duration: 0.53 2023-10-03 11:39:01,198 WARNING [train.py:1204] (2/4) Exclude cut with ID _41817_210_18835_1_1533434477160_7423049_555-293448-0 from training. Duration: 0.8 2023-10-03 11:39:03,434 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.self_attn_weights.pos_emb_skip_rate, batch_count=1253700.0, ans=0.0 2023-10-03 11:39:04,655 WARNING [train.py:1204] (2/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_32-335103-0_sp1.1 from training. Duration: 0.7454375 2023-10-03 11:39:07,463 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_2655_210_2087_1_1525688959_3897400_202-147557-0_sp1.1 from training. Duration: 0.77275 2023-10-03 11:39:08,738 WARNING [train.py:1204] (2/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_158-250116-0 from training. Duration: 0.62 2023-10-03 11:39:11,683 WARNING [train.py:1204] (2/4) Exclude cut with ID _55429_210_10626_1_1534467588527_7174143_528-88083-0_sp1.1 from training. Duration: 0.963625 2023-10-03 11:39:14,568 WARNING [train.py:1204] (2/4) Exclude cut with ID _8030_210_7497_1_1528444886781_3531089_32-31837-0_sp1.1 from training. Duration: 0.8818125 2023-10-03 11:39:14,587 WARNING [train.py:1204] (2/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_744-86168-0_sp1.1 from training. Duration: 0.97275 2023-10-03 11:39:14,595 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1113-7369-0_sp0.9 from training. Duration: 0.9444375 2023-10-03 11:39:14,613 WARNING [train.py:1204] (2/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_124-345840-0_sp0.9 from training. Duration: 0.57775 2023-10-03 11:39:14,661 WARNING [train.py:1204] (2/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_328-264776-0_sp0.9 from training. Duration: 0.7444375 2023-10-03 11:39:17,165 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.2.encoder.layers.1.nonlin_attention.whiten2, num_groups=1, num_channels=384, metric=5.25 vs. limit=15.0 2023-10-03 11:39:17,800 WARNING [train.py:1204] (2/4) Exclude cut with ID _18895_210_6228_1_1531357274234_3784930_469-166615-0_sp1.1 from training. Duration: 0.963625 2023-10-03 11:39:17,806 WARNING [train.py:1204] (2/4) Exclude cut with ID _8395_210_1531_1_1528597871523_4411579_431-132470-0_sp0.9 from training. Duration: 0.82225 2023-10-03 11:39:17,885 WARNING [train.py:1204] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_554-45617-0_sp1.1 from training. Duration: 0.9 2023-10-03 11:39:19,213 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_35361_210_4238_1_1533262810_7793476_524-188242-0_sp1.1 from training. Duration: 0.92725 2023-10-03 11:39:20,677 WARNING [train.py:1204] (2/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_938-205135-0 from training. Duration: 0.87 2023-10-03 11:39:22,083 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_5931_210_2040_1_1527428481_4547999_321-193614-0 from training. Duration: 0.67 2023-10-03 11:39:22,101 WARNING [train.py:1204] (2/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_445-269521-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 11:39:24,118 WARNING [train.py:1204] (2/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_3-33116-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 11:39:24,120 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_4633_210_2040_1_1526824163_4145343_416-76117-0_sp1.1 from training. Duration: 0.863625 2023-10-03 11:39:25,359 INFO [train.py:1046] (2/4) Epoch 36, batch 2150, loss[loss=0.1618, simple_loss=0.2382, pruned_loss=0.04269, over 23156.00 frames. ], tot_loss[loss=0.1578, simple_loss=0.2368, pruned_loss=0.03937, over 4715741.44 frames. ], batch size: 105, lr: 2.83e-03, grad_scale: 8.0 2023-10-03 11:39:25,436 WARNING [train.py:1204] (2/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_1033-274376-0_sp1.1 from training. Duration: 0.7909375 2023-10-03 11:39:25,473 WARNING [train.py:1204] (2/4) Exclude cut with ID _8772_210_1850_1_1528635595972_3664011_398-320049-0_sp0.9 from training. Duration: 0.97775 2023-10-03 11:39:27,007 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.conv_module1.balancer1.prob, batch_count=1253833.3333333333, ans=0.125 2023-10-03 11:39:30,220 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.ff3_skip_rate, batch_count=1253833.3333333333, ans=0.0 2023-10-03 11:39:31,354 WARNING [train.py:1204] (2/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_321-215742-0_sp0.9 from training. Duration: 0.5555625 2023-10-03 11:39:33,203 WARNING [train.py:1204] (2/4) Exclude cut with ID _28394_210_8913_1_1532430049518_3650118_272-190950-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 11:39:34,606 WARNING [train.py:1204] (2/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_31-299371-0_sp1.1 from training. Duration: 0.92725 2023-10-03 11:39:37,154 WARNING [train.py:1204] (2/4) Exclude cut with ID _55383_210_10635_1_1534468426172_2231669_134-128246-0_sp0.9 from training. Duration: 0.988875 2023-10-03 11:39:37,156 WARNING [train.py:1204] (2/4) Exclude cut with ID _40519_210_13794_1_1533715228655_7337390_887-9547-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 11:39:37,202 WARNING [train.py:1204] (2/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_310-109461-0_sp0.9 from training. Duration: 0.888875 2023-10-03 11:39:37,397 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.0.ff3_skip_rate, batch_count=1253833.3333333333, ans=0.0 2023-10-03 11:39:40,004 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_2009_210_2071_1_1525481134_4588937_258-250196-0_sp1.1 from training. Duration: 0.92725 2023-10-03 11:39:41,340 WARNING [train.py:1204] (2/4) Exclude cut with ID _9235_210_5219_1_1528891154253_8046120_1186-72702-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 11:39:41,342 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_14831_210_7927_1_1530700200_5583610_181-27467-0_sp1.1 from training. Duration: 0.82725 2023-10-03 11:39:44,041 WARNING [train.py:1204] (2/4) Exclude cut with ID _40402_210_18219_1_1533522324115_2953150_278-132109-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 11:39:45,324 WARNING [train.py:1204] (2/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_261-297503-0 from training. Duration: 0.99 2023-10-03 11:39:50,128 WARNING [train.py:1204] (2/4) Exclude cut with ID _19896_210_1883_1_1531709022662_6649569_313-265903-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 11:39:51,382 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.630e+02 1.841e+02 2.012e+02 2.249e+02 3.397e+02, threshold=4.024e+02, percent-clipped=0.0 2023-10-03 11:39:51,505 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_105-114797-0_sp1.1 from training. Duration: 0.763625 2023-10-03 11:39:52,869 WARNING [train.py:1204] (2/4) Exclude cut with ID _53755_210_8254_1_1534577312941_4707360_9-65843-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 11:39:52,905 WARNING [train.py:1204] (2/4) Exclude cut with ID _18516_210_9206_1_1531704653206_3692406_361-224549-0_sp1.1 from training. Duration: 0.97275 2023-10-03 11:39:54,543 WARNING [train.py:1204] (2/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_403-310975-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 11:39:55,850 WARNING [train.py:1204] (2/4) Exclude cut with ID _5444_210_2151_1_1527418808464_5312210_581-315411-0_sp0.9 from training. Duration: 0.688875 2023-10-03 11:39:55,901 WARNING [train.py:1204] (2/4) Exclude cut with ID _23825_210_13449_1_1531915313526_3497150_464-154904-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 11:39:55,916 WARNING [train.py:1204] (2/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_937-208281-0_sp0.9 from training. Duration: 0.9444375 2023-10-03 11:39:57,231 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_37839_210_7234_1_1533542462_3689303_158-181212-0_sp1.1 from training. Duration: 0.963625 2023-10-03 11:39:57,373 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.1.feed_forward1.out_proj.dropout_p, batch_count=1253966.6666666667, ans=0.1 2023-10-03 11:39:58,565 WARNING [train.py:1204] (2/4) Exclude cut with ID _8772_210_1850_1_1528635595972_3664011_398-320049-0 from training. Duration: 0.88 2023-10-03 11:40:00,056 WARNING [train.py:1204] (2/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_718-250907-0_sp1.1 from training. Duration: 0.62725 2023-10-03 11:40:01,575 WARNING [train.py:1204] (2/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_641-126395-0_sp1.1 from training. Duration: 0.92725 2023-10-03 11:40:01,730 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.out_combiner.scale_min, batch_count=1253966.6666666667, ans=0.2 2023-10-03 11:40:03,313 WARNING [train.py:1204] (2/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_166-241493-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 11:40:03,658 INFO [scaling.py:1118] (2/4) WithLoss: name=encoder.encoders.4.encoder.layers.1.self_attn_weights, loss-sum=0.000e+00 2023-10-03 11:40:04,644 WARNING [train.py:1204] (2/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_526-62079-0_sp0.9 from training. Duration: 0.7444375 2023-10-03 11:40:04,743 WARNING [train.py:1204] (2/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_439-275083-0_sp1.1 from training. Duration: 0.936375 2023-10-03 11:40:06,761 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_66907_210_21581_1_1535615661_3590177_493-229032-0_sp1.1 from training. Duration: 0.92725 2023-10-03 11:40:06,787 WARNING [train.py:1204] (2/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_89-321955-0_sp1.1 from training. Duration: 0.72725 2023-10-03 11:40:09,476 WARNING [train.py:1204] (2/4) Exclude cut with ID _29857_210_20034_1_1532395886753_3798160_273-226195-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 11:40:09,483 WARNING [train.py:1204] (2/4) Exclude cut with ID _22826_210_9994_1_1531908201522_3279169_404-75889-0 from training. Duration: 0.89 2023-10-03 11:40:09,527 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_271-234962-0_sp0.9 from training. Duration: 0.87775 2023-10-03 11:40:12,197 WARNING [train.py:1204] (2/4) Exclude cut with ID _22961_210_8254_1_1531976366672_4278180_156-339685-0_sp1.1 from training. Duration: 0.97275 2023-10-03 11:40:13,467 WARNING [train.py:1204] (2/4) Exclude cut with ID _37892_210_19596_1_1533198677897_3964058_425-231927-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 11:40:13,564 WARNING [train.py:1204] (2/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_102-288670-0_sp1.1 from training. Duration: 0.97275 2023-10-03 11:40:14,924 WARNING [train.py:1204] (2/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_283-220372-0_sp1.1 from training. Duration: 0.5090625 2023-10-03 11:40:16,240 WARNING [train.py:1204] (2/4) Exclude cut with ID _44304_210_15374_1_1533639352765_4613129_86-1699-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 11:40:16,315 WARNING [train.py:1204] (2/4) Exclude cut with ID _2891_210_2550_1_1525874398521_4427710_327-121590-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 11:40:16,459 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.balancer1.prob, batch_count=1254033.3333333333, ans=0.125 2023-10-03 11:40:17,649 WARNING [train.py:1204] (2/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_369-222060-0 from training. Duration: 0.71 2023-10-03 11:40:19,094 WARNING [train.py:1204] (2/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_762-109886-0 from training. Duration: 0.91 2023-10-03 11:40:19,123 WARNING [train.py:1204] (2/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_477-255782-0_sp0.9 from training. Duration: 0.911125 2023-10-03 11:40:20,359 WARNING [train.py:1204] (2/4) Exclude cut with ID IC0669W0061-330906 from training. Duration: 0.768 2023-10-03 11:40:21,626 WARNING [train.py:1204] (2/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_347-213071-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 11:40:21,660 WARNING [train.py:1204] (2/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_507-303370-0_sp1.1 from training. Duration: 0.87275 2023-10-03 11:40:23,477 WARNING [train.py:1204] (2/4) Exclude cut with ID _15012_210_1968_1_1530856820429_5095350_688-178849-0 from training. Duration: 0.64 2023-10-03 11:40:23,489 WARNING [train.py:1204] (2/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_28-70987-0_sp0.9 from training. Duration: 0.9666875 2023-10-03 11:40:23,491 WARNING [train.py:1204] (2/4) Exclude cut with ID _5910_210_1389_1_1527337972454_5050100_555-159815-0 from training. Duration: 0.98 2023-10-03 11:40:23,510 WARNING [train.py:1204] (2/4) Exclude cut with ID IC0679W0064-335871 from training. Duration: 0.768 2023-10-03 11:40:23,511 WARNING [train.py:1204] (2/4) Exclude cut with ID IC0679W0079-335886 from training. Duration: 0.896 2023-10-03 11:40:23,561 WARNING [train.py:1204] (2/4) Exclude cut with ID _5608_210_2106_1_1527415439089_6819768_909-82767-0 from training. Duration: 0.61 2023-10-03 11:40:25,076 WARNING [train.py:1204] (2/4) Exclude cut with ID _23038_210_9570_1_1532001584158_3652419_411-323967-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 11:40:26,817 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_40896_210_15710_1_1533433956_7100802_350-231748-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 11:40:26,829 WARNING [train.py:1204] (2/4) Exclude cut with ID _5289_210_2084_1_1527678202736_3779299_142-103317-0_sp0.9 from training. Duration: 0.9333125 2023-10-03 11:40:28,196 WARNING [train.py:1204] (2/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_314-144055-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 11:40:28,262 WARNING [train.py:1204] (2/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_177-236809-0_sp0.9 from training. Duration: 0.5444375 2023-10-03 11:40:29,643 WARNING [train.py:1204] (2/4) Exclude cut with ID _23038_210_9570_1_1532001584158_3652419_82-334733-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 11:40:29,658 WARNING [train.py:1204] (2/4) Exclude cut with ID _43552_210_8913_1_1533726031059_3523429_225-22526-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 11:40:34,616 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.conv_module2.balancer2.prob, batch_count=1254100.0, ans=0.125 2023-10-03 11:40:37,147 WARNING [train.py:1204] (2/4) Exclude cut with ID _30327_210_16049_1_1532484161295_3924169_401-70819-0_sp1.1 from training. Duration: 0.7818125 2023-10-03 11:40:37,198 WARNING [train.py:1204] (2/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_538-32291-0 from training. Duration: 0.83 2023-10-03 11:40:38,509 INFO [train.py:1046] (2/4) Epoch 36, batch 2200, loss[loss=0.1487, simple_loss=0.2322, pruned_loss=0.03257, over 24470.00 frames. ], tot_loss[loss=0.1575, simple_loss=0.2368, pruned_loss=0.03907, over 4712364.07 frames. ], batch size: 63, lr: 2.83e-03, grad_scale: 8.0 2023-10-03 11:40:40,137 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_37357_210_17024_1_1533089504_2316121_258-211605-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 11:40:44,107 WARNING [train.py:1204] (2/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_550-335198-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 11:40:45,375 WARNING [train.py:1204] (2/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_98-741-0_sp0.9 from training. Duration: 0.888875 2023-10-03 11:40:45,423 WARNING [train.py:1204] (2/4) Exclude cut with ID _38323_210_12108_1_1533121233369_3303049_251-238988-0_sp1.1 from training. Duration: 0.92725 2023-10-03 11:40:45,471 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder_embed.convnext.hidden_balancer.prob, batch_count=1254166.6666666667, ans=0.125 2023-10-03 11:40:46,719 WARNING [train.py:1204] (2/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_259-34038-0_sp0.9 from training. Duration: 0.82225 2023-10-03 11:40:48,648 WARNING [train.py:1204] (2/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_1060-180633-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 11:40:49,850 WARNING [train.py:1204] (2/4) Exclude cut with ID _8755_210_1876_1_1528628559357_3258209_211-37473-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 11:40:49,856 WARNING [train.py:1204] (2/4) Exclude cut with ID _4122_210_2084_1_1526713164417_4353280_528-206771-0 from training. Duration: 0.88 2023-10-03 11:40:54,896 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.bypass.skip_rate, batch_count=1254233.3333333333, ans=0.04949747468305833 2023-10-03 11:40:55,988 WARNING [train.py:1204] (2/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_64-248359-0 from training. Duration: 0.69 2023-10-03 11:40:58,764 WARNING [train.py:1204] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_402-253027-0_sp1.1 from training. Duration: 0.4909375 2023-10-03 11:41:03,692 WARNING [train.py:1204] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_625-102955-0 from training. Duration: 0.77 2023-10-03 11:41:06,822 WARNING [train.py:1204] (2/4) Exclude cut with ID _14998_210_5219_1_1530705582080_7474970_611-219324-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 11:41:08,214 WARNING [train.py:1204] (2/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_718-68505-0_sp0.9 from training. Duration: 0.988875 2023-10-03 11:41:08,259 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_353-182855-0_sp1.1 from training. Duration: 0.87275 2023-10-03 11:41:10,983 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_5960_210_1794_1_1527403865_3990999_515-2613-0_sp0.9 from training. Duration: 0.97775 2023-10-03 11:41:11,003 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_5575_210_1410_1_1527301305_2570170_342-265572-0 from training. Duration: 0.94 2023-10-03 11:41:15,081 WARNING [train.py:1204] (2/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_329-113173-0_sp0.9 from training. Duration: 0.688875 2023-10-03 11:41:15,172 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_15396_210_6890_1_1531308473_1938936_213-312506-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 11:41:16,528 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_521-214276-0_sp1.1 from training. Duration: 0.4 2023-10-03 11:41:19,857 WARNING [train.py:1204] (2/4) Exclude cut with ID _15026_210_8750_1_1530777602316_3291850_211-195043-0_sp1.1 from training. Duration: 0.72725 2023-10-03 11:41:22,471 WARNING [train.py:1204] (2/4) Exclude cut with ID _29343_210_4381_1_1532602757752_3674109_108-257494-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 11:41:22,598 WARNING [train.py:1204] (2/4) Exclude cut with ID _64414_210_17301_1_1535243252048_3757759_403-58151-0_sp1.1 from training. Duration: 0.963625 2023-10-03 11:41:24,057 WARNING [train.py:1204] (2/4) Exclude cut with ID _36512_210_6987_1_1533033143998_3126069_109-177787-0_sp1.1 from training. Duration: 0.92725 2023-10-03 11:41:25,702 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.0.layers.0.whiten, num_groups=1, num_channels=192, metric=4.14 vs. limit=12.0 2023-10-03 11:41:26,091 WARNING [train.py:1204] (2/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_498-40153-0 from training. Duration: 0.63 2023-10-03 11:41:26,166 WARNING [train.py:1204] (2/4) Exclude cut with ID _56447_210_1389_1_1535074245910_3697189_161-334523-0_sp1.1 from training. Duration: 0.936375 2023-10-03 11:41:27,485 WARNING [train.py:1204] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_238-245655-0 from training. Duration: 0.81 2023-10-03 11:41:28,948 WARNING [train.py:1204] (2/4) Exclude cut with ID _64631_210_27767_1_1535349580068_4165160_108-43865-0_sp1.1 from training. Duration: 0.92725 2023-10-03 11:41:28,953 WARNING [train.py:1204] (2/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_243-42116-0_sp1.1 from training. Duration: 0.663625 2023-10-03 11:41:28,976 WARNING [train.py:1204] (2/4) Exclude cut with ID _66868_210_9527_1_1535509287954_4561140_594-132160-0_sp1.1 from training. Duration: 0.92725 2023-10-03 11:41:30,405 WARNING [train.py:1204] (2/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_48-338976-0_sp0.9 from training. Duration: 0.988875 2023-10-03 11:41:32,197 WARNING [train.py:1204] (2/4) Exclude cut with ID _5250_210_5219_1_1527249561806_8692110_137-100595-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 11:41:32,203 WARNING [train.py:1204] (2/4) Exclude cut with ID _49748_210_24116_1_1533986971737_3158910_12-271669-0_sp1.1 from training. Duration: 0.936375 2023-10-03 11:41:32,229 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_37357_210_17024_1_1533089504_2316121_206-80421-0_sp1.1 from training. Duration: 0.936375 2023-10-03 11:41:33,449 WARNING [train.py:1204] (2/4) Exclude cut with ID _7537_210_2084_1_1528502395431_3924359_113-199933-0_sp1.1 from training. Duration: 0.536375 2023-10-03 11:41:33,481 WARNING [train.py:1204] (2/4) Exclude cut with ID _55440_210_12651_1_1534496713364_2980520_31-59289-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 11:41:36,091 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1083-220122-0_sp0.9 from training. Duration: 0.5666875 2023-10-03 11:41:40,120 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_426-313557-0_sp1.1 from training. Duration: 0.4818125 2023-10-03 11:41:41,447 WARNING [train.py:1204] (2/4) Exclude cut with ID _35799_210_5854_1_1533263487887_3529071_324-94843-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 11:41:44,161 WARNING [train.py:1204] (2/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_264-108685-0_sp1.1 from training. Duration: 0.5 2023-10-03 11:41:45,481 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9012W0006-501141 from training. Duration: 0.896 2023-10-03 11:41:48,209 WARNING [train.py:1204] (2/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_890-5117-0_sp0.9 from training. Duration: 0.8666875 2023-10-03 11:41:48,261 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9023W0008-506301 from training. Duration: 0.896 2023-10-03 11:41:49,567 WARNING [train.py:1204] (2/4) Exclude cut with ID _13916_210_9994_1_1530788341126_3763149_60-191556-0_sp0.9 from training. Duration: 0.67775 2023-10-03 11:41:49,610 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9029W0331-509721 from training. Duration: 0.896 2023-10-03 11:41:51,357 INFO [train.py:1046] (2/4) Epoch 36, batch 2250, loss[loss=0.1464, simple_loss=0.2276, pruned_loss=0.03262, over 24373.00 frames. ], tot_loss[loss=0.1581, simple_loss=0.2377, pruned_loss=0.03926, over 4720879.28 frames. ], batch size: 61, lr: 2.83e-03, grad_scale: 8.0 2023-10-03 11:41:52,758 WARNING [train.py:1204] (2/4) Exclude cut with ID _35823_210_6917_1_1533187881914_7297830_567-108596-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 11:41:52,811 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7543_210_6753_1_1528544010_1110402_25-183860-0_sp1.1 from training. Duration: 0.42725 2023-10-03 11:41:54,180 WARNING [train.py:1204] (2/4) Exclude cut with ID _47278_210_12041_1_1533902418349_3576110_495-260035-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 11:41:57,384 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9051W0015-520281 from training. Duration: 0.9045625 2023-10-03 11:41:57,487 WARNING [train.py:1204] (2/4) Exclude cut with ID _14459_210_2228_1_1531443641946_7516700_707-66193-0_sp1.1 from training. Duration: 0.97275 2023-10-03 11:41:58,948 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.bypass_mid.scale_min, batch_count=1254500.0, ans=0.2 2023-10-03 11:42:00,306 WARNING [train.py:1204] (2/4) Exclude cut with ID _14087_210_2253_1_1530529224512_3014029_219-224445-0_sp1.1 from training. Duration: 0.763625 2023-10-03 11:42:05,496 WARNING [train.py:1204] (2/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_345-167986-0_sp0.9 from training. Duration: 0.9333125 2023-10-03 11:42:05,594 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_34815_210_6388_1_1533381320_3571903_279-305565-0_sp1.1 from training. Duration: 0.836375 2023-10-03 11:42:09,821 WARNING [train.py:1204] (2/4) Exclude cut with ID _21987_210_5854_1_1531821500482_3637038_317-179212-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 11:42:11,038 WARNING [train.py:1204] (2/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_395-37222-0_sp1.1 from training. Duration: 0.6909375 2023-10-03 11:42:11,115 WARNING [train.py:1204] (2/4) Exclude cut with ID _8064_210_5399_1_1528372299291_4282973_168-50337-0_sp1.1 from training. Duration: 0.763625 2023-10-03 11:42:12,474 WARNING [train.py:1204] (2/4) Exclude cut with ID _60113_210_16271_1_1535164212481_4386079_559-167191-0 from training. Duration: 0.93 2023-10-03 11:42:12,675 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.feed_forward1.out_proj.dropout_p, batch_count=1254566.6666666667, ans=0.1 2023-10-03 11:42:13,807 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_47228_210_4983_1_1533780021_3672027_344-179642-0_sp1.1 from training. Duration: 0.92725 2023-10-03 11:42:13,847 WARNING [train.py:1204] (2/4) Exclude cut with ID _22961_210_8254_1_1531976366672_4278180_317-199546-0_sp0.9 from training. Duration: 0.9555625 2023-10-03 11:42:16,613 WARNING [train.py:1204] (2/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_141-89727-0 from training. Duration: 0.71 2023-10-03 11:42:16,700 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_78-116075-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 11:42:16,719 WARNING [train.py:1204] (2/4) Exclude cut with ID _64086_210_7994_1_1535270417019_7435949_52-235756-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 11:42:17,886 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.488e+02 1.895e+02 2.080e+02 2.389e+02 3.595e+02, threshold=4.160e+02, percent-clipped=0.0 2023-10-03 11:42:19,351 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7615_210_4211_1_1528110965_4580160_72-235211-0_sp1.1 from training. Duration: 0.6909375 2023-10-03 11:42:23,561 WARNING [train.py:1204] (2/4) Exclude cut with ID _14069_210_5632_1_1530512692033_4803390_287-27950-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 11:42:25,279 WARNING [train.py:1204] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_74-48717-0_sp0.9 from training. Duration: 0.6444375 2023-10-03 11:42:25,304 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7544_210_6753_1_1528591327_1471426_172-348777-0_sp1.1 from training. Duration: 0.6 2023-10-03 11:42:26,707 WARNING [train.py:1204] (2/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_220-191080-0 from training. Duration: 0.98 2023-10-03 11:42:28,676 WARNING [train.py:1204] (2/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_1081-137641-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 11:42:31,339 WARNING [train.py:1204] (2/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_278-235459-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 11:42:34,653 WARNING [train.py:1204] (2/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_1181-77470-0_sp1.1 from training. Duration: 0.936375 2023-10-03 11:42:36,676 WARNING [train.py:1204] (2/4) Exclude cut with ID _60860_210_8254_1_1534896015875_3557169_198-5531-0_sp1.1 from training. Duration: 0.936375 2023-10-03 11:42:38,058 WARNING [train.py:1204] (2/4) Exclude cut with ID _12369_210_4381_1_1530356078581_4388520_17-289781-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 11:42:38,063 WARNING [train.py:1204] (2/4) Exclude cut with ID _50310_210_15488_1_1534068043345_3641080_146-199752-0_sp1.1 from training. Duration: 0.92725 2023-10-03 11:42:40,769 WARNING [train.py:1204] (2/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_969-213354-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 11:42:42,122 WARNING [train.py:1204] (2/4) Exclude cut with ID _70519_210_4163_1_1535961595994_7458230_66-344156-0_sp1.1 from training. Duration: 0.963625 2023-10-03 11:42:45,141 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.feed_forward1.hidden_balancer.prob, batch_count=1254700.0, ans=0.125 2023-10-03 11:42:46,253 WARNING [train.py:1204] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_380-204756-0_sp1.1 from training. Duration: 0.7818125 2023-10-03 11:42:47,603 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_128-202104-0_sp0.9 from training. Duration: 0.7 2023-10-03 11:42:51,911 WARNING [train.py:1204] (2/4) Exclude cut with ID _4697_210_2069_1_1526815801953_3823098_51-1049-0_sp0.9 from training. Duration: 0.8333125 2023-10-03 11:42:51,936 WARNING [train.py:1204] (2/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_86-66192-0_sp1.1 from training. Duration: 0.5 2023-10-03 11:42:53,265 WARNING [train.py:1204] (2/4) Exclude cut with ID _14069_210_5632_1_1530512692033_4803390_95-1896-0_sp1.1 from training. Duration: 0.8909375 2023-10-03 11:42:58,522 WARNING [train.py:1204] (2/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_29-293896-0_sp1.1 from training. Duration: 0.5545625 2023-10-03 11:42:59,772 WARNING [train.py:1204] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_167-137255-0_sp0.9 from training. Duration: 0.711125 2023-10-03 11:43:01,032 WARNING [train.py:1204] (2/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_217-95056-0 from training. Duration: 0.98 2023-10-03 11:43:01,046 WARNING [train.py:1204] (2/4) Exclude cut with ID _47278_210_12041_1_1533902418349_3576110_97-266746-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 11:43:01,092 WARNING [train.py:1204] (2/4) Exclude cut with ID _14224_210_4381_1_1530529062037_4264010_380-343670-0_sp1.1 from training. Duration: 0.8 2023-10-03 11:43:04,153 WARNING [train.py:1204] (2/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_107-332398-0 from training. Duration: 0.78 2023-10-03 11:43:06,073 INFO [train.py:1046] (2/4) Epoch 36, batch 2300, loss[loss=0.1989, simple_loss=0.2607, pruned_loss=0.06854, over 19493.00 frames. ], tot_loss[loss=0.1591, simple_loss=0.2387, pruned_loss=0.0397, over 4723854.97 frames. ], batch size: 388, lr: 2.83e-03, grad_scale: 8.0 2023-10-03 11:43:07,492 WARNING [train.py:1204] (2/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_91-267696-0_sp1.1 from training. Duration: 0.7909375 2023-10-03 11:43:07,520 WARNING [train.py:1204] (2/4) Exclude cut with ID _41298_210_9019_1_1533430371450_4822143_498-186993-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 11:43:14,592 WARNING [train.py:1204] (2/4) Exclude cut with ID _17956_210_4353_1_1531209568125_6979970_54-77610-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 11:43:14,643 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_40047_210_2290_1_1533290519_2374311_257-335162-0_sp1.1 from training. Duration: 0.863625 2023-10-03 11:43:18,532 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9653W0193-675471 from training. Duration: 0.9656875 2023-10-03 11:43:19,815 WARNING [train.py:1204] (2/4) Exclude cut with ID _25248_210_8750_1_1532062825941_5554180_243-240536-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 11:43:25,474 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_32360_210_4198_1_1532689160_3648436_60-51908-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 11:43:25,482 WARNING [train.py:1204] (2/4) Exclude cut with ID _4092_210_2151_1_1526643044449_3135759_227-213972-0_sp0.9 from training. Duration: 0.6 2023-10-03 11:43:26,819 WARNING [train.py:1204] (2/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_392-173500-0_sp1.1 from training. Duration: 0.97275 2023-10-03 11:43:26,851 WARNING [train.py:1204] (2/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_71-329150-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 11:43:26,853 WARNING [train.py:1204] (2/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_866-173440-0 from training. Duration: 0.9 2023-10-03 11:43:26,939 WARNING [train.py:1204] (2/4) Exclude cut with ID _14832_210_8750_1_1530691351539_3171348_1-54315-0_sp0.9 from training. Duration: 0.9666875 2023-10-03 11:43:28,996 WARNING [train.py:1204] (2/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_430-116139-0_sp1.1 from training. Duration: 0.77275 2023-10-03 11:43:30,305 WARNING [train.py:1204] (2/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_356-45054-0_sp0.9 from training. Duration: 0.9555625 2023-10-03 11:43:31,952 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.feed_forward1.hidden_balancer.prob, batch_count=1254900.0, ans=0.125 2023-10-03 11:43:34,411 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_3531_210_3528_1_1526294203_5216456_309-227302-0_sp0.9 from training. Duration: 0.7666875 2023-10-03 11:43:37,719 WARNING [train.py:1204] (2/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_301-330455-0_sp1.1 from training. Duration: 0.72725 2023-10-03 11:43:40,521 WARNING [train.py:1204] (2/4) Exclude cut with ID _23557_210_8750_1_1531890106370_6846255_667-145945-0_sp1.1 from training. Duration: 0.92725 2023-10-03 11:43:43,456 WARNING [train.py:1204] (2/4) Exclude cut with ID _14860_210_4915_1_1530702020340_7343240_836-328489-0_sp1.1 from training. Duration: 0.8181875 2023-10-03 11:43:44,820 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_37152_210_9697_1_1533106573_3844188_416-333249-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 11:43:47,622 WARNING [train.py:1204] (2/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_538-32291-0_sp0.9 from training. Duration: 0.92225 2023-10-03 11:43:49,157 INFO [scaling.py:1118] (2/4) WithLoss: name=encoder.encoders.0.layers.1.self_attn_weights, loss-sum=0.000e+00 2023-10-03 11:43:50,397 WARNING [train.py:1204] (2/4) Exclude cut with ID _32704_210_8954_1_1533017488840_6747370_965-176824-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 11:43:51,974 WARNING [train.py:1204] (2/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_86-19601-0_sp1.1 from training. Duration: 0.77275 2023-10-03 11:43:54,654 WARNING [train.py:1204] (2/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_436-318113-0_sp1.1 from training. Duration: 0.6818125 2023-10-03 11:43:54,688 WARNING [train.py:1204] (2/4) Exclude cut with ID _41817_210_18835_1_1533434477160_7423049_555-293448-0_sp0.9 from training. Duration: 0.888875 2023-10-03 11:43:54,698 WARNING [train.py:1204] (2/4) Exclude cut with ID _4697_210_2069_1_1526815801953_3823098_68-219807-0 from training. Duration: 0.47 2023-10-03 11:43:59,300 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7544_210_6753_1_1528591327_1471426_33-346031-0_sp1.1 from training. Duration: 0.5545625 2023-10-03 11:43:59,302 WARNING [train.py:1204] (2/4) Exclude cut with ID _49748_210_24116_1_1533986971737_3158910_263-52113-0_sp1.1 from training. Duration: 0.97275 2023-10-03 11:44:00,594 WARNING [train.py:1204] (2/4) Exclude cut with ID _64639_210_5816_1_1535367619437_3528000_340-24580-0_sp1.1 from training. Duration: 0.92725 2023-10-03 11:44:00,609 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_47228_210_4983_1_1533780021_3672027_479-114941-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 11:44:00,646 WARNING [train.py:1204] (2/4) Exclude cut with ID _44585_210_18628_1_1533605350387_4518199_506-119659-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 11:44:02,573 WARNING [train.py:1204] (2/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_314-126819-0_sp1.1 from training. Duration: 0.4545625 2023-10-03 11:44:02,576 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_2541_210_1794_1_1525590269_3084765_156-173713-0_sp0.9 from training. Duration: 0.9 2023-10-03 11:44:02,627 WARNING [train.py:1204] (2/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_108-255430-0 from training. Duration: 0.97 2023-10-03 11:44:02,634 WARNING [train.py:1204] (2/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_791-45149-0_sp1.1 from training. Duration: 0.7818125 2023-10-03 11:44:02,639 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_53378_210_17282_1_1534227054_1261425_10-118295-0_sp1.1 from training. Duration: 0.97275 2023-10-03 11:44:02,688 WARNING [train.py:1204] (2/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_148-158509-0 from training. Duration: 0.41 2023-10-03 11:44:08,842 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_34726_210_4525_1_1533019474_5030395_687-291305-0_sp1.1 from training. Duration: 0.936375 2023-10-03 11:44:11,650 WARNING [train.py:1204] (2/4) Exclude cut with ID _14860_210_4915_1_1530702020340_7343240_264-324537-0_sp1.1 from training. Duration: 0.963625 2023-10-03 11:44:14,527 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_41251_210_15368_1_1533429767_4820695_591-46386-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 11:44:14,733 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.ff3_skip_rate, batch_count=1255100.0, ans=0.0 2023-10-03 11:44:15,856 WARNING [train.py:1204] (2/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_86-19601-0_sp0.9 from training. Duration: 0.9444375 2023-10-03 11:44:15,877 WARNING [train.py:1204] (2/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_401-255491-0_sp0.9 from training. Duration: 0.77775 2023-10-03 11:44:17,274 WARNING [train.py:1204] (2/4) Exclude cut with ID _8696_210_1876_1_1528545632440_3345058_209-54069-0_sp1.1 from training. Duration: 0.6090625 2023-10-03 11:44:17,285 WARNING [train.py:1204] (2/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_369-325184-0_sp1.1 from training. Duration: 0.9 2023-10-03 11:44:17,345 WARNING [train.py:1204] (2/4) Exclude cut with ID _14687_210_5854_1_1530703838259_3664889_115-239046-0_sp0.9 from training. Duration: 0.7444375 2023-10-03 11:44:18,609 INFO [train.py:1046] (2/4) Epoch 36, batch 2350, loss[loss=0.2199, simple_loss=0.2839, pruned_loss=0.07793, over 19777.00 frames. ], tot_loss[loss=0.1602, simple_loss=0.2397, pruned_loss=0.04039, over 4705787.76 frames. ], batch size: 389, lr: 2.83e-03, grad_scale: 8.0 2023-10-03 11:44:18,652 WARNING [train.py:1204] (2/4) Exclude cut with ID _8064_210_5399_1_1528372299291_4282973_168-50337-0 from training. Duration: 0.84 2023-10-03 11:44:19,226 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.4.encoder.layers.2.whiten, num_groups=1, num_channels=384, metric=2.81 vs. limit=12.0 2023-10-03 11:44:26,118 WARNING [train.py:1204] (2/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_909-64151-0_sp1.1 from training. Duration: 0.8545625 2023-10-03 11:44:26,128 WARNING [train.py:1204] (2/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_597-154640-0 from training. Duration: 0.9 2023-10-03 11:44:30,481 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.conv_module1.balancer1.prob, batch_count=1255166.6666666667, ans=0.125 2023-10-03 11:44:31,727 WARNING [train.py:1204] (2/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_126-274694-0 from training. Duration: 0.98 2023-10-03 11:44:34,576 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.2.encoder.layers.0.nonlin_attention.whiten1, num_groups=1, num_channels=288, metric=7.74 vs. limit=10.0 2023-10-03 11:44:35,741 WARNING [train.py:1204] (2/4) Exclude cut with ID _21783_210_11357_1_1531742458730_3762679_344-269786-0_sp1.1 from training. Duration: 0.97275 2023-10-03 11:44:38,997 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_37818_210_4238_1_1533620910_4720083_322-253154-0_sp1.1 from training. Duration: 0.92725 2023-10-03 11:44:39,000 WARNING [train.py:1204] (2/4) Exclude cut with ID _58895_210_8750_1_1535184081320_3649089_221-15823-0_sp1.1 from training. Duration: 0.92725 2023-10-03 11:44:39,009 WARNING [train.py:1204] (2/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_9-21737-0_sp1.1 from training. Duration: 0.8818125 2023-10-03 11:44:39,052 WARNING [train.py:1204] (2/4) Exclude cut with ID _14069_210_5632_1_1530512692033_4803390_522-341588-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 11:44:40,408 WARNING [train.py:1204] (2/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_363-185703-0 from training. Duration: 0.95 2023-10-03 11:44:44,497 WARNING [train.py:1204] (2/4) Exclude cut with ID _41817_210_18835_1_1533434477160_7423049_265-76216-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 11:44:44,827 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.ff2_skip_rate, batch_count=1255233.3333333333, ans=0.0 2023-10-03 11:44:45,733 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.584e+02 1.891e+02 2.082e+02 2.272e+02 3.750e+02, threshold=4.165e+02, percent-clipped=0.0 2023-10-03 11:44:46,097 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.self_attn_weights.pos_emb_skip_rate, batch_count=1255233.3333333333, ans=0.0 2023-10-03 11:44:48,535 WARNING [train.py:1204] (2/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_841-341679-0 from training. Duration: 0.58 2023-10-03 11:44:49,937 WARNING [train.py:1204] (2/4) Exclude cut with ID _55429_210_10626_1_1534467588527_7174143_97-335084-0_sp1.1 from training. Duration: 0.8818125 2023-10-03 11:44:52,892 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.out_combiner.scale_min, batch_count=1255300.0, ans=0.2 2023-10-03 11:44:53,984 WARNING [train.py:1204] (2/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_690-126306-0_sp0.9 from training. Duration: 0.8666875 2023-10-03 11:44:54,002 WARNING [train.py:1204] (2/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_102-92751-0_sp1.1 from training. Duration: 0.9 2023-10-03 11:44:55,492 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_3787_210_2040_1_1526477544_5478586_603-120324-0_sp1.1 from training. Duration: 0.736375 2023-10-03 11:44:56,826 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_33348_210_5632_1_1533086912_3324030_85-28583-0 from training. Duration: 0.82 2023-10-03 11:44:57,479 WARNING [train.py:1204] (2/4) Exclude cut with ID _60057_210_10640_1_1534903065793_3333150_362-230377-0_sp1.1 from training. Duration: 0.7909375 2023-10-03 11:44:59,966 WARNING [train.py:1204] (2/4) Exclude cut with ID _60113_210_16271_1_1535164212481_4386079_194-146486-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 11:44:59,979 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_53753_210_7234_1_1534320003_3594148_171-277668-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 11:45:00,014 WARNING [train.py:1204] (2/4) Exclude cut with ID _34423_210_8750_1_1533430798456_3330199_251-332788-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 11:45:02,794 WARNING [train.py:1204] (2/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_663-166973-0_sp0.9 from training. Duration: 0.888875 2023-10-03 11:45:06,051 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.attention_skip_rate, batch_count=1255366.6666666667, ans=0.0 2023-10-03 11:45:07,522 WARNING [train.py:1204] (2/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_927-43827-0 from training. Duration: 0.74 2023-10-03 11:45:07,574 WARNING [train.py:1204] (2/4) Exclude cut with ID _28323_210_16187_1_1532480395667_4100300_306-74561-0_sp1.1 from training. Duration: 0.8545625 2023-10-03 11:45:10,947 WARNING [train.py:1204] (2/4) Exclude cut with ID _15037_210_2253_1_1530752294104_3267650_139-337989-0_sp1.1 from training. Duration: 0.92725 2023-10-03 11:45:10,960 WARNING [train.py:1204] (2/4) Exclude cut with ID _49039_210_6315_1_1533967537541_3846118_460-263670-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 11:45:11,094 WARNING [train.py:1204] (2/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_186-309161-0 from training. Duration: 0.54 2023-10-03 11:45:12,551 WARNING [train.py:1204] (2/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_87-278686-0_sp1.1 from training. Duration: 0.7 2023-10-03 11:45:15,254 WARNING [train.py:1204] (2/4) Exclude cut with ID _27052_210_5986_1_1532329197187_3585009_314-106499-0 from training. Duration: 0.92 2023-10-03 11:45:15,278 WARNING [train.py:1204] (2/4) Exclude cut with ID _3465_210_3525_1_1526175667060_3574049_412-287526-0_sp0.9 from training. Duration: 0.8 2023-10-03 11:45:19,369 WARNING [train.py:1204] (2/4) Exclude cut with ID _22961_210_8254_1_1531976366672_4278180_317-199546-0 from training. Duration: 0.86 2023-10-03 11:45:22,132 WARNING [train.py:1204] (2/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_381-317791-0 from training. Duration: 0.73 2023-10-03 11:45:23,457 WARNING [train.py:1204] (2/4) Exclude cut with ID _44010_210_4525_1_1533884262964_4013620_560-310411-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 11:45:23,463 WARNING [train.py:1204] (2/4) Exclude cut with ID _5294_210_1747_1_1527249643928_3696179_342-270278-0_sp0.9 from training. Duration: 0.611125 2023-10-03 11:45:23,491 WARNING [train.py:1204] (2/4) Exclude cut with ID ID0473W0002-918951 from training. Duration: 0.924 2023-10-03 11:45:23,509 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1083-220122-0 from training. Duration: 0.51 2023-10-03 11:45:24,960 WARNING [train.py:1204] (2/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_138-154812-0 from training. Duration: 0.78 2023-10-03 11:45:27,082 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.nonlin_attention.balancer.prob, batch_count=1255433.3333333333, ans=0.125 2023-10-03 11:45:27,196 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.balancer2.prob, batch_count=1255433.3333333333, ans=0.125 2023-10-03 11:45:28,171 WARNING [train.py:1204] (2/4) Exclude cut with ID _44982_210_1511_1_1534312820108_3726029_331-19420-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 11:45:32,231 INFO [train.py:1046] (2/4) Epoch 36, batch 2400, loss[loss=0.1523, simple_loss=0.2174, pruned_loss=0.0436, over 23681.00 frames. ], tot_loss[loss=0.1601, simple_loss=0.2397, pruned_loss=0.04022, over 4709528.14 frames. ], batch size: 232, lr: 2.83e-03, grad_scale: 16.0 2023-10-03 11:45:32,316 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_2655_210_2087_1_1525688959_3897400_202-147557-0_sp0.9 from training. Duration: 0.9444375 2023-10-03 11:45:37,822 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_23850_210_4238_1_1532232597_1632433_32-185135-0_sp0.9 from training. Duration: 0.9555625 2023-10-03 11:45:39,239 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_14056_210_4163_1_1530604483_7572827_46-93547-0_sp1.1 from training. Duration: 0.863625 2023-10-03 11:45:39,292 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7931_210_1794_1_1528594728_5564059_280-279560-0 from training. Duration: 0.98 2023-10-03 11:45:40,622 WARNING [train.py:1204] (2/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_937-208281-0 from training. Duration: 0.85 2023-10-03 11:45:42,317 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.out_combiner.scale_min, batch_count=1255500.0, ans=0.2 2023-10-03 11:45:47,861 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_14928_210_5632_1_1530773461_4714000_454-112198-0_sp0.9 from training. Duration: 0.7333125 2023-10-03 11:45:47,863 WARNING [train.py:1204] (2/4) Exclude cut with ID _32704_210_8954_1_1533017488840_6747370_944-153333-0_sp1.1 from training. Duration: 0.963625 2023-10-03 11:45:50,798 WARNING [train.py:1204] (2/4) Exclude cut with ID _14821_210_8750_1_1530680503991_6829359_2-227151-0 from training. Duration: 0.83 2023-10-03 11:45:50,822 WARNING [train.py:1204] (2/4) Exclude cut with ID _28461_210_2253_1_1532771775189_3041299_96-152158-0_sp1.1 from training. Duration: 0.77275 2023-10-03 11:45:50,891 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7837_210_1792_1_1528453445_5841305_654-54195-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 11:45:52,185 WARNING [train.py:1204] (2/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_344-152526-0 from training. Duration: 0.53 2023-10-03 11:45:53,008 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.1.encoder.layers.0.nonlin_attention.whiten2, num_groups=1, num_channels=256, metric=5.63 vs. limit=15.0 2023-10-03 11:45:57,581 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_30167_210_3528_1_1532428012_4772375_480-142230-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 11:45:59,657 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_160-218721-0 from training. Duration: 0.96 2023-10-03 11:46:02,498 WARNING [train.py:1204] (2/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_65-88242-0_sp0.9 from training. Duration: 0.62225 2023-10-03 11:46:07,687 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_34886_210_10633_1_1533203882_1755661_76-156096-0 from training. Duration: 0.87 2023-10-03 11:46:09,780 WARNING [train.py:1204] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_188-38388-0_sp1.1 from training. Duration: 0.92725 2023-10-03 11:46:11,156 WARNING [train.py:1204] (2/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_81-289335-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 11:46:11,422 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.bypass.skip_rate, batch_count=1255633.3333333333, ans=0.07 2023-10-03 11:46:15,395 WARNING [train.py:1204] (2/4) Exclude cut with ID _59493_210_17282_1_1535078419077_3225879_110-8778-0_sp1.1 from training. Duration: 0.936375 2023-10-03 11:46:15,431 WARNING [train.py:1204] (2/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_188-260011-0 from training. Duration: 0.73 2023-10-03 11:46:16,812 WARNING [train.py:1204] (2/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_453-133983-0_sp0.9 from training. Duration: 0.8666875 2023-10-03 11:46:22,391 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8054_210_7927_1_1528627721_4984094_553-17379-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 11:46:23,767 WARNING [train.py:1204] (2/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_341-171072-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 11:46:28,159 WARNING [train.py:1204] (2/4) Exclude cut with ID _30800_210_22382_1_1532499347904_4515959_265-257286-0_sp1.1 from training. Duration: 0.963625 2023-10-03 11:46:29,676 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_403-39619-0_sp0.9 from training. Duration: 0.9333125 2023-10-03 11:46:29,680 WARNING [train.py:1204] (2/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_464-33931-0_sp0.9 from training. Duration: 0.6 2023-10-03 11:46:29,710 WARNING [train.py:1204] (2/4) Exclude cut with ID _60804_210_9642_1_1534831199859_7268299_632-143863-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 11:46:29,717 WARNING [train.py:1204] (2/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_431-310997-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 11:46:29,772 WARNING [train.py:1204] (2/4) Exclude cut with ID _31649_210_6228_1_1532584608063_4091078_114-263860-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 11:46:29,783 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6961_210_2087_1_1527937168_6815011_562-165518-0_sp0.9 from training. Duration: 0.8555625 2023-10-03 11:46:32,561 WARNING [train.py:1204] (2/4) Exclude cut with ID _27052_210_5986_1_1532329197187_3585009_338-325870-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 11:46:32,606 WARNING [train.py:1204] (2/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_469-94673-0_sp1.1 from training. Duration: 0.7181875 2023-10-03 11:46:34,558 WARNING [train.py:1204] (2/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_259-34038-0 from training. Duration: 0.74 2023-10-03 11:46:34,767 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.conv_module1.balancer1.prob, batch_count=1255766.6666666667, ans=0.125 2023-10-03 11:46:36,309 WARNING [train.py:1204] (2/4) Exclude cut with ID _14922_210_6228_1_1530707435273_3693310_388-129552-0 from training. Duration: 0.96 2023-10-03 11:46:37,738 WARNING [train.py:1204] (2/4) Exclude cut with ID _28394_210_8913_1_1532430049518_3650118_342-251455-0_sp1.1 from training. Duration: 0.936375 2023-10-03 11:46:37,762 WARNING [train.py:1204] (2/4) Exclude cut with ID _53731_210_7994_1_1534553979244_3757214_8-191390-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 11:46:39,553 WARNING [train.py:1204] (2/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_613-232856-0 from training. Duration: 0.54 2023-10-03 11:46:40,196 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.3.encoder.layers.2.feed_forward2.out_whiten, num_groups=1, num_channels=512, metric=8.22 vs. limit=15.0 2023-10-03 11:46:40,820 WARNING [train.py:1204] (2/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_557-349534-0 from training. Duration: 0.91 2023-10-03 11:46:40,829 WARNING [train.py:1204] (2/4) Exclude cut with ID _14224_210_4381_1_1530529062037_4264010_379-184223-0 from training. Duration: 0.85 2023-10-03 11:46:42,251 WARNING [train.py:1204] (2/4) Exclude cut with ID IC0125W0001-61807 from training. Duration: 0.966 2023-10-03 11:46:42,341 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_229-45300-0 from training. Duration: 0.57 2023-10-03 11:46:43,680 WARNING [train.py:1204] (2/4) Exclude cut with ID _30304_210_6319_1_1532476827261_7582060_1069-24743-0_sp1.1 from training. Duration: 0.92725 2023-10-03 11:46:43,779 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_41452_210_7291_1_1533437687_3941281_323-172873-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 11:46:45,126 WARNING [train.py:1204] (2/4) Exclude cut with ID _37086_210_9038_1_1532999540891_7021250_802-226709-0_sp1.1 from training. Duration: 0.97275 2023-10-03 11:46:45,214 WARNING [train.py:1204] (2/4) Exclude cut with ID IC0144W0500-71767 from training. Duration: 0.931875 2023-10-03 11:46:46,591 WARNING [train.py:1204] (2/4) Exclude cut with ID _39846_210_12651_1_1533605310265_2115212_233-129699-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 11:46:47,879 INFO [train.py:1046] (2/4) Epoch 36, batch 2450, loss[loss=0.1607, simple_loss=0.2412, pruned_loss=0.04016, over 23292.00 frames. ], tot_loss[loss=0.1592, simple_loss=0.2383, pruned_loss=0.04006, over 4685145.99 frames. ], batch size: 105, lr: 2.83e-03, grad_scale: 16.0 2023-10-03 11:46:47,920 WARNING [train.py:1204] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_574-45541-0_sp0.9 from training. Duration: 0.72225 2023-10-03 11:46:50,917 WARNING [train.py:1204] (2/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_372-298093-0_sp1.1 from training. Duration: 0.72725 2023-10-03 11:46:50,930 WARNING [train.py:1204] (2/4) Exclude cut with ID _62574_210_2655_1_1535180407058_3933249_34-26343-0_sp1.1 from training. Duration: 0.97275 2023-10-03 11:46:51,196 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.conv_module1.balancer2.prob, batch_count=1255833.3333333333, ans=0.125 2023-10-03 11:46:52,591 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.attention_skip_rate, batch_count=1255833.3333333333, ans=0.0 2023-10-03 11:46:53,754 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_512-256238-0_sp1.1 from training. Duration: 0.963625 2023-10-03 11:46:55,039 WARNING [train.py:1204] (2/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_196-55502-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 11:46:55,136 WARNING [train.py:1204] (2/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_195-167011-0 from training. Duration: 0.75 2023-10-03 11:46:59,793 WARNING [train.py:1204] (2/4) Exclude cut with ID _13678_210_7497_1_1530530069226_3463170_345-230416-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 11:46:59,810 WARNING [train.py:1204] (2/4) Exclude cut with ID _51078_210_9070_1_1534161557424_3805250_225-327809-0_sp1.1 from training. Duration: 0.963625 2023-10-03 11:47:03,231 WARNING [train.py:1204] (2/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_18-189198-0_sp1.1 from training. Duration: 0.8181875 2023-10-03 11:47:03,251 WARNING [train.py:1204] (2/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_329-82235-0_sp1.1 from training. Duration: 0.7454375 2023-10-03 11:47:03,270 WARNING [train.py:1204] (2/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_726-270780-0_sp0.9 from training. Duration: 0.9555625 2023-10-03 11:47:04,548 WARNING [train.py:1204] (2/4) Exclude cut with ID _66873_210_5517_1_1535454136506_4293370_654-52713-0 from training. Duration: 0.77 2023-10-03 11:47:10,994 WARNING [train.py:1204] (2/4) Exclude cut with ID _20455_210_5219_1_1531565996775_8098100_191-16006-0_sp1.1 from training. Duration: 0.963625 2023-10-03 11:47:12,395 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_3171_210_2087_1_1526109084_2330007_66-260583-0_sp1.1 from training. Duration: 0.6545625 2023-10-03 11:47:12,448 WARNING [train.py:1204] (2/4) Exclude cut with ID _38670_210_15379_1_1533448810659_4391059_63-129006-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 11:47:15,052 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.613e+02 1.962e+02 2.115e+02 2.391e+02 5.447e+02, threshold=4.229e+02, percent-clipped=1.0 2023-10-03 11:47:16,610 WARNING [train.py:1204] (2/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_393-57888-0_sp0.9 from training. Duration: 0.711125 2023-10-03 11:47:16,638 WARNING [train.py:1204] (2/4) Exclude cut with ID _6275_210_2084_1_1528110062213_7669049_857-298942-0_sp0.9 from training. Duration: 0.9444375 2023-10-03 11:47:17,901 WARNING [train.py:1204] (2/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_239-22076-0_sp0.9 from training. Duration: 0.9444375 2023-10-03 11:47:17,930 WARNING [train.py:1204] (2/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_424-227947-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 11:47:19,501 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.feed_forward1.out_proj.dropout_p, batch_count=1255966.6666666667, ans=0.1 2023-10-03 11:47:20,635 WARNING [train.py:1204] (2/4) Exclude cut with ID _14069_210_5632_1_1530512692033_4803390_95-1896-0 from training. Duration: 0.98 2023-10-03 11:47:21,965 WARNING [train.py:1204] (2/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_96-244299-0_sp0.9 from training. Duration: 0.97775 2023-10-03 11:47:30,424 WARNING [train.py:1204] (2/4) Exclude cut with ID _46683_210_9642_1_1533730913492_4591116_562-319825-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 11:47:30,538 WARNING [train.py:1204] (2/4) Exclude cut with ID _64639_210_5816_1_1535367619437_3528000_284-173474-0_sp1.1 from training. Duration: 0.963625 2023-10-03 11:47:30,560 WARNING [train.py:1204] (2/4) Exclude cut with ID _29529_210_7234_1_1532393961455_3977460_497-100133-0_sp1.1 from training. Duration: 0.936375 2023-10-03 11:47:32,448 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_12868_210_6753_1_1530701932_530275_18-76322-0_sp0.9 from training. Duration: 0.9666875 2023-10-03 11:47:32,480 WARNING [train.py:1204] (2/4) Exclude cut with ID _50093_210_4220_1_1534417141207_3432299_116-303447-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 11:47:32,580 WARNING [train.py:1204] (2/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_44-79427-0_sp1.1 from training. Duration: 0.8909375 2023-10-03 11:47:33,937 WARNING [train.py:1204] (2/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_887-249555-0 from training. Duration: 0.77 2023-10-03 11:47:36,734 WARNING [train.py:1204] (2/4) Exclude cut with ID _28461_210_2253_1_1532771775189_3041299_96-152158-0_sp0.9 from training. Duration: 0.9444375 2023-10-03 11:47:38,191 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_14058_210_4163_1_1530690953_6327180_49-210687-0_sp1.1 from training. Duration: 0.8545625 2023-10-03 11:47:41,506 WARNING [train.py:1204] (2/4) Exclude cut with ID _40399_210_4353_1_1533305658194_3279406_351-127903-0_sp1.1 from training. Duration: 0.97275 2023-10-03 11:47:41,511 WARNING [train.py:1204] (2/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_184-206939-0_sp1.1 from training. Duration: 0.936375 2023-10-03 11:47:44,387 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.feed_forward2.hidden_balancer.prob, batch_count=1256033.3333333333, ans=0.125 2023-10-03 11:47:45,588 WARNING [train.py:1204] (2/4) Exclude cut with ID _14100_210_8341_1_1530529320613_3678069_258-224599-0_sp1.1 from training. Duration: 0.7 2023-10-03 11:47:45,603 WARNING [train.py:1204] (2/4) Exclude cut with ID _6493_210_1747_1_1528459210424_4012470_324-323854-0 from training. Duration: 0.92 2023-10-03 11:47:46,855 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_12868_210_6753_1_1530702479_3610564_151-325626-0_sp1.1 from training. Duration: 0.8818125 2023-10-03 11:47:48,196 WARNING [train.py:1204] (2/4) Exclude cut with ID _14528_210_8254_1_1530669700514_4306939_106-43145-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 11:47:48,216 WARNING [train.py:1204] (2/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_686-54383-0 from training. Duration: 0.87 2023-10-03 11:47:49,542 WARNING [train.py:1204] (2/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_801-102362-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 11:47:50,897 WARNING [train.py:1204] (2/4) Exclude cut with ID _48677_210_5663_1_1533902405919_3892989_408-40740-0_sp0.9 from training. Duration: 0.92225 2023-10-03 11:47:54,950 WARNING [train.py:1204] (2/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_448-212135-0_sp1.1 from training. Duration: 0.82725 2023-10-03 11:47:57,730 WARNING [train.py:1204] (2/4) Exclude cut with ID _59170_210_6721_1_1534983633960_4203335_320-124047-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 11:47:57,780 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_4692_210_3528_1_1526899755_4323254_107-44401-0_sp1.1 from training. Duration: 0.7818125 2023-10-03 11:48:00,571 INFO [train.py:1046] (2/4) Epoch 36, batch 2500, loss[loss=0.1587, simple_loss=0.2339, pruned_loss=0.0418, over 23752.00 frames. ], tot_loss[loss=0.1587, simple_loss=0.2378, pruned_loss=0.03982, over 4706835.58 frames. ], batch size: 164, lr: 2.83e-03, grad_scale: 16.0 2023-10-03 11:48:00,715 WARNING [train.py:1204] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_731-2428-0 from training. Duration: 0.83 2023-10-03 11:48:02,055 WARNING [train.py:1204] (2/4) Exclude cut with ID _29857_210_20034_1_1532395886753_3798160_335-163562-0_sp1.1 from training. Duration: 0.77275 2023-10-03 11:48:08,675 WARNING [train.py:1204] (2/4) Exclude cut with ID _14832_210_8750_1_1530691351539_3171348_171-275004-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 11:48:18,663 WARNING [train.py:1204] (2/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_105-328174-0_sp0.9 from training. Duration: 0.8444375 2023-10-03 11:48:18,688 WARNING [train.py:1204] (2/4) Exclude cut with ID _68965_210_1883_1_1535698962272_3497490_164-118421-0_sp1.1 from training. Duration: 0.936375 2023-10-03 11:48:20,056 WARNING [train.py:1204] (2/4) Exclude cut with ID _19896_210_1883_1_1531709022662_6649569_380-24170-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 11:48:20,062 WARNING [train.py:1204] (2/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_505-205270-0_sp1.1 from training. Duration: 0.3454375 2023-10-03 11:48:23,110 INFO [scaling.py:1118] (2/4) WithLoss: name=encoder.encoders.3.encoder.layers.0.self_attn_weights, loss-sum=0.000e+00 2023-10-03 11:48:25,669 WARNING [train.py:1204] (2/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_427-208901-0_sp1.1 from training. Duration: 0.6181875 2023-10-03 11:48:25,712 WARNING [train.py:1204] (2/4) Exclude cut with ID _23818_210_6228_1_1531917063884_3676920_85-326916-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 11:48:27,073 WARNING [train.py:1204] (2/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_101-293367-0_sp1.1 from training. Duration: 0.57275 2023-10-03 11:48:27,089 WARNING [train.py:1204] (2/4) Exclude cut with ID _5409_210_1388_1_1527292850189_5865191_178-127970-0_sp1.1 from training. Duration: 0.5181875 2023-10-03 11:48:28,445 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_403-39619-0 from training. Duration: 0.84 2023-10-03 11:48:29,782 WARNING [train.py:1204] (2/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_427-87841-0_sp1.1 from training. Duration: 0.963625 2023-10-03 11:48:29,829 WARNING [train.py:1204] (2/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_286-68181-0_sp1.1 from training. Duration: 0.97275 2023-10-03 11:48:29,865 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6673_210_3273_1_1527767790_3173240_327-306185-0 from training. Duration: 0.83 2023-10-03 11:48:29,880 WARNING [train.py:1204] (2/4) Exclude cut with ID _3675_210_4557_1_1526639934408_4182089_106-203713-0_sp1.1 from training. Duration: 0.963625 2023-10-03 11:48:31,190 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_53071_210_16554_1_1534493434_1276796_130-36076-0 from training. Duration: 0.95 2023-10-03 11:48:31,219 WARNING [train.py:1204] (2/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_586-93054-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 11:48:36,302 WARNING [train.py:1204] (2/4) Exclude cut with ID _23825_210_13449_1_1531915313526_3497150_262-10571-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 11:48:37,682 WARNING [train.py:1204] (2/4) Exclude cut with ID _28783_210_7943_1_1532671164808_7155479_432-67885-0_sp1.1 from training. Duration: 0.97275 2023-10-03 11:48:40,460 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6287_210_3273_1_1527509506_2653716_182-199887-0_sp0.9 from training. Duration: 0.5666875 2023-10-03 11:48:40,510 WARNING [train.py:1204] (2/4) Exclude cut with ID _3231_210_2106_1_1526205864934_6784220_171-256004-0 from training. Duration: 0.86 2023-10-03 11:48:42,360 WARNING [train.py:1204] (2/4) Exclude cut with ID _69051_210_1531_1_1535885566901_3829538_332-92090-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 11:48:42,654 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.ff2_skip_rate, batch_count=1256300.0, ans=0.0 2023-10-03 11:48:44,455 WARNING [train.py:1204] (2/4) Exclude cut with ID _3675_210_4557_1_1526639934408_4182089_104-324125-0_sp1.1 from training. Duration: 0.963625 2023-10-03 11:48:45,186 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.2.encoder.layers.1.nonlin_attention.whiten2, num_groups=1, num_channels=384, metric=4.48 vs. limit=15.0 2023-10-03 11:48:48,611 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_41764_210_2521_1_1533686110_4089313_501-2119-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 11:48:51,424 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_14056_210_4163_1_1530604483_7572827_56-100988-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 11:48:54,312 WARNING [train.py:1204] (2/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_398-239819-0_sp1.1 from training. Duration: 0.9 2023-10-03 11:48:59,763 WARNING [train.py:1204] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_74-48717-0_sp1.1 from training. Duration: 0.52725 2023-10-03 11:49:02,501 WARNING [train.py:1204] (2/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_177-49092-0 from training. Duration: 0.91 2023-10-03 11:49:02,518 WARNING [train.py:1204] (2/4) Exclude cut with ID _62337_210_12392_1_1535009273105_3412229_44-176353-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 11:49:02,528 WARNING [train.py:1204] (2/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_110-130961-0_sp1.1 from training. Duration: 0.42725 2023-10-03 11:49:05,761 WARNING [train.py:1204] (2/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_37-189262-0_sp1.1 from training. Duration: 0.8909375 2023-10-03 11:49:05,762 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_3172_210_2087_1_1526292929_3987865_197-97762-0_sp0.9 from training. Duration: 0.8333125 2023-10-03 11:49:05,864 WARNING [train.py:1204] (2/4) Exclude cut with ID IC0677W0065-334897 from training. Duration: 0.896 2023-10-03 11:49:05,864 WARNING [train.py:1204] (2/4) Exclude cut with ID IC0677W0080-334912 from training. Duration: 0.768 2023-10-03 11:49:05,869 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_120-41668-0 from training. Duration: 0.7 2023-10-03 11:49:06,601 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.3.encoder.layers.0.self_attn_weights.whiten_keys, num_groups=8, num_channels=256, metric=4.62 vs. limit=6.0 2023-10-03 11:49:09,225 WARNING [train.py:1204] (2/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_460-205287-0_sp1.1 from training. Duration: 0.963625 2023-10-03 11:49:10,594 WARNING [train.py:1204] (2/4) Exclude cut with ID _14632_210_9988_1_1530691279392_3923200_102-204477-0 from training. Duration: 0.97 2023-10-03 11:49:10,599 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_34815_210_6388_1_1533381320_3571903_279-305565-0 from training. Duration: 0.92 2023-10-03 11:49:10,668 WARNING [train.py:1204] (2/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_91-148831-0_sp1.1 from training. Duration: 0.9 2023-10-03 11:49:10,723 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6291_210_3273_1_1527683649_1078498_82-221196-0 from training. Duration: 0.76 2023-10-03 11:49:15,077 INFO [train.py:1046] (2/4) Epoch 36, batch 2550, loss[loss=0.1587, simple_loss=0.2476, pruned_loss=0.03487, over 24661.00 frames. ], tot_loss[loss=0.1586, simple_loss=0.2379, pruned_loss=0.03963, over 4707904.96 frames. ], batch size: 73, lr: 2.83e-03, grad_scale: 16.0 2023-10-03 11:49:15,192 WARNING [train.py:1204] (2/4) Exclude cut with ID _38020_210_5986_1_1533112252259_7006089_493-102745-0 from training. Duration: 0.98 2023-10-03 11:49:18,385 WARNING [train.py:1204] (2/4) Exclude cut with ID _61133_210_9969_1_1535196777227_3696409_40-93442-0_sp1.1 from training. Duration: 0.92725 2023-10-03 11:49:19,874 WARNING [train.py:1204] (2/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_45-258852-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 11:49:21,182 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_53071_210_16554_1_1534493434_1276796_130-36076-0_sp1.1 from training. Duration: 0.863625 2023-10-03 11:49:22,657 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8694_210_6400_1_1529065754_3260756_332-13553-0_sp1.1 from training. Duration: 0.92725 2023-10-03 11:49:24,023 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_405-339986-0 from training. Duration: 0.51 2023-10-03 11:49:24,075 WARNING [train.py:1204] (2/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_927-43827-0_sp1.1 from training. Duration: 0.67275 2023-10-03 11:49:26,744 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_397-147196-0 from training. Duration: 0.8 2023-10-03 11:49:26,949 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.conv_module2.balancer1.max_abs, batch_count=1256500.0, ans=10.0 2023-10-03 11:49:28,178 WARNING [train.py:1204] (2/4) Exclude cut with ID 3557-8342-0013-54691-0_sp1.1 from training. Duration: 0.836375 2023-10-03 11:49:29,635 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder_embed.conv.2.prob, batch_count=1256566.6666666667, ans=0.125 2023-10-03 11:49:30,932 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_47438_210_5399_1_1533797471_4513356_626-127722-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 11:49:34,198 WARNING [train.py:1204] (2/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_256-323883-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 11:49:34,203 WARNING [train.py:1204] (2/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_839-173017-0_sp0.9 from training. Duration: 0.4555625 2023-10-03 11:49:34,237 WARNING [train.py:1204] (2/4) Exclude cut with ID _5289_210_2084_1_1527678202736_3779299_392-261988-0_sp1.1 from training. Duration: 0.5909375 2023-10-03 11:49:34,271 WARNING [train.py:1204] (2/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_119-224384-0_sp1.1 from training. Duration: 0.936375 2023-10-03 11:49:34,309 WARNING [train.py:1204] (2/4) Exclude cut with ID _57523_210_1856_1_1534766294545_3732989_262-267933-0_sp1.1 from training. Duration: 0.963625 2023-10-03 11:49:35,017 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.2.encoder.layers.0.self_attn2.whiten, num_groups=1, num_channels=384, metric=18.15 vs. limit=22.5 2023-10-03 11:49:37,108 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_35361_210_4238_1_1533262810_7793476_675-87012-0_sp0.9 from training. Duration: 0.988875 2023-10-03 11:49:38,920 WARNING [train.py:1204] (2/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_17-90541-0 from training. Duration: 0.94 2023-10-03 11:49:38,947 WARNING [train.py:1204] (2/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_535-14410-0_sp1.1 from training. Duration: 0.42725 2023-10-03 11:49:38,953 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_24673_210_6400_1_1531968633_778659_94-154511-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 11:49:38,956 WARNING [train.py:1204] (2/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_212-10501-0 from training. Duration: 0.94 2023-10-03 11:49:43,473 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.594e+02 1.881e+02 2.105e+02 2.315e+02 3.420e+02, threshold=4.210e+02, percent-clipped=0.0 2023-10-03 11:49:49,705 WARNING [train.py:1204] (2/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_383-285196-0_sp1.1 from training. Duration: 0.8818125 2023-10-03 11:49:56,469 WARNING [train.py:1204] (2/4) Exclude cut with ID _45560_210_21982_1_1533812603951_3584999_238-47020-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 11:49:56,478 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_21500_210_2054_1_1532149310_2716758_278-18053-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 11:49:56,493 WARNING [train.py:1204] (2/4) Exclude cut with ID _56235_210_10640_1_1534557335235_4161099_453-9626-0_sp1.1 from training. Duration: 0.936375 2023-10-03 11:49:56,634 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.0.conv_module1.balancer2.prob, batch_count=1256633.3333333333, ans=0.125 2023-10-03 11:49:57,886 WARNING [train.py:1204] (2/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_153-97441-0_sp1.1 from training. Duration: 0.5090625 2023-10-03 11:50:00,754 INFO [scaling.py:1118] (2/4) WithLoss: name=encoder.encoders.0.layers.0.self_attn_weights, loss-sum=0.000e+00 2023-10-03 11:50:03,356 WARNING [train.py:1204] (2/4) Exclude cut with ID _67725_210_27767_1_1535608756671_5105359_431-120895-0_sp1.1 from training. Duration: 0.963625 2023-10-03 11:50:06,696 WARNING [train.py:1204] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_574-45541-0_sp1.1 from training. Duration: 0.5909375 2023-10-03 11:50:06,719 WARNING [train.py:1204] (2/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_162-103620-0_sp1.1 from training. Duration: 0.8454375 2023-10-03 11:50:06,730 WARNING [train.py:1204] (2/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_734-129761-0_sp1.1 from training. Duration: 0.7454375 2023-10-03 11:50:07,943 WARNING [train.py:1204] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_620-276793-0_sp1.1 from training. Duration: 0.536375 2023-10-03 11:50:07,980 WARNING [train.py:1204] (2/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_153-97441-0_sp0.9 from training. Duration: 0.62225 2023-10-03 11:50:13,097 WARNING [train.py:1204] (2/4) Exclude cut with ID _12733_210_8341_1_1530167424556_3646380_523-85621-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 11:50:13,108 WARNING [train.py:1204] (2/4) Exclude cut with ID _64048_210_10626_1_1535198382802_3835179_352-179834-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 11:50:13,323 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.attention_skip_rate, batch_count=1256766.6666666667, ans=0.0 2023-10-03 11:50:17,504 WARNING [train.py:1204] (2/4) Exclude cut with ID _55162_210_17071_1_1534408248583_3753480_43-242364-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 11:50:17,520 WARNING [train.py:1204] (2/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_661-264857-0 from training. Duration: 0.93 2023-10-03 11:50:17,533 WARNING [train.py:1204] (2/4) Exclude cut with ID _6274_210_2084_1_1527937583837_7494020_325-237800-0_sp1.1 from training. Duration: 0.97275 2023-10-03 11:50:19,411 WARNING [train.py:1204] (2/4) Exclude cut with ID _55328_210_27767_1_1534580973260_4088170_385-171459-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 11:50:19,481 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_3780_210_2087_1_1526467389_2145572_176-107140-0_sp1.1 from training. Duration: 0.62725 2023-10-03 11:50:19,707 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.conv_module1.balancer1.min_positive, batch_count=1256766.6666666667, ans=0.025 2023-10-03 11:50:20,815 WARNING [train.py:1204] (2/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_375-244976-0_sp1.1 from training. Duration: 0.6818125 2023-10-03 11:50:22,311 WARNING [train.py:1204] (2/4) Exclude cut with ID _35941_210_15379_1_1532995294515_4023089_309-33162-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 11:50:27,782 WARNING [train.py:1204] (2/4) Exclude cut with ID _14459_210_2228_1_1531443641946_7516700_1185-22038-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 11:50:27,999 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.feed_forward2.hidden_balancer.prob, batch_count=1256833.3333333333, ans=0.125 2023-10-03 11:50:29,092 INFO [train.py:1046] (2/4) Epoch 36, batch 2600, loss[loss=0.1564, simple_loss=0.2312, pruned_loss=0.04083, over 23795.00 frames. ], tot_loss[loss=0.1594, simple_loss=0.2389, pruned_loss=0.03997, over 4715387.34 frames. ], batch size: 232, lr: 2.83e-03, grad_scale: 8.0 2023-10-03 11:50:30,485 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_1197-69460-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 11:50:31,948 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9012W0022-501157 from training. Duration: 0.896 2023-10-03 11:50:36,025 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9023W0009-506302 from training. Duration: 0.896 2023-10-03 11:50:36,051 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_2821_210_2715_1_1526108385_3763560_73-296673-0_sp1.1 from training. Duration: 0.7545625 2023-10-03 11:50:36,080 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9025W0113-507442 from training. Duration: 0.896 2023-10-03 11:50:37,910 WARNING [train.py:1204] (2/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_739-2098-0 from training. Duration: 0.92 2023-10-03 11:50:37,930 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9029W0332-509722 from training. Duration: 0.768 2023-10-03 11:50:39,544 WARNING [train.py:1204] (2/4) Exclude cut with ID _16908_210_5724_1_1531119157248_3772700_68-101953-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 11:50:39,573 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9042W0011-515617 from training. Duration: 0.768 2023-10-03 11:50:40,954 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_3019_210_2028_1_1526044245_3264865_191-72235-0 from training. Duration: 0.97 2023-10-03 11:50:42,859 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9052W0006-520792 from training. Duration: 0.896 2023-10-03 11:50:42,994 WARNING [train.py:1204] (2/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_245-174553-0_sp1.1 from training. Duration: 0.8 2023-10-03 11:50:44,982 WARNING [train.py:1204] (2/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_202-67017-0 from training. Duration: 0.82 2023-10-03 11:50:46,478 WARNING [train.py:1204] (2/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_304-59227-0 from training. Duration: 0.77 2023-10-03 11:50:48,525 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_3133_210_2087_1_1526106446_2324690_246-125000-0_sp0.9 from training. Duration: 0.688875 2023-10-03 11:50:48,557 WARNING [train.py:1204] (2/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_243-42116-0 from training. Duration: 0.73 2023-10-03 11:50:50,222 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=1256900.0, ans=0.1 2023-10-03 11:50:51,237 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9087W0121-538492 from training. Duration: 0.896 2023-10-03 11:50:52,528 WARNING [train.py:1204] (2/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_120-76843-0 from training. Duration: 0.55 2023-10-03 11:50:56,925 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.balancer1.prob, batch_count=1256900.0, ans=0.125 2023-10-03 11:50:58,089 WARNING [train.py:1204] (2/4) Exclude cut with ID _7852_210_5219_1_1528286471316_8139400_850-304621-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 11:50:58,101 WARNING [train.py:1204] (2/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_154-285415-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 11:50:58,111 WARNING [train.py:1204] (2/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_538-278194-0_sp1.1 from training. Duration: 0.92725 2023-10-03 11:50:58,112 WARNING [train.py:1204] (2/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_119-207162-0 from training. Duration: 0.71 2023-10-03 11:50:59,617 WARNING [train.py:1204] (2/4) Exclude cut with ID _2334_210_2667_1_1525608238880_4705129_459-104997-0_sp1.1 from training. Duration: 0.863625 2023-10-03 11:51:04,412 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9147W0019-567847 from training. Duration: 0.768 2023-10-03 11:51:09,608 WARNING [train.py:1204] (2/4) Exclude cut with ID _69884_210_9771_1_1535846392818_4166169_228-189439-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 11:51:10,733 WARNING [train.py:1204] (2/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_196-77172-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 11:51:10,803 WARNING [train.py:1204] (2/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_625-338546-0 from training. Duration: 0.88 2023-10-03 11:51:10,847 WARNING [train.py:1204] (2/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_193-327630-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 11:51:10,848 WARNING [train.py:1204] (2/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_391-24006-0_sp1.1 from training. Duration: 0.92725 2023-10-03 11:51:12,826 WARNING [train.py:1204] (2/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_197-28164-0 from training. Duration: 0.8 2023-10-03 11:51:16,070 WARNING [train.py:1204] (2/4) Exclude cut with ID _8395_210_1531_1_1528597871523_4411579_431-132470-0_sp1.1 from training. Duration: 0.67275 2023-10-03 11:51:17,437 WARNING [train.py:1204] (2/4) Exclude cut with ID _35350_210_16040_1_1533040191183_4075299_465-248326-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 11:51:18,854 WARNING [train.py:1204] (2/4) Exclude cut with ID _16756_210_4915_1_1531047803763_7104259_101-141922-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 11:51:19,083 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.conv_skip_rate, batch_count=1257033.3333333333, ans=0.0 2023-10-03 11:51:20,871 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.3.encoder.layers.1.nonlin_attention.whiten2, num_groups=1, num_channels=512, metric=4.93 vs. limit=15.0 2023-10-03 11:51:21,650 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9212W0005-600172 from training. Duration: 0.896 2023-10-03 11:51:21,678 WARNING [train.py:1204] (2/4) Exclude cut with ID _63186_210_2084_1_1535414479705_3908649_378-166820-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 11:51:22,953 WARNING [train.py:1204] (2/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_52-211683-0_sp1.1 from training. Duration: 0.8181875 2023-10-03 11:51:23,130 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.0.balancer2.prob, batch_count=1257033.3333333333, ans=0.125 2023-10-03 11:51:24,553 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=1257033.3333333333, ans=0.1 2023-10-03 11:51:25,830 WARNING [train.py:1204] (2/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_1147-237729-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 11:51:27,130 WARNING [train.py:1204] (2/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_234-14174-0_sp0.9 from training. Duration: 0.911125 2023-10-03 11:51:27,142 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_454-276429-0 from training. Duration: 0.87 2023-10-03 11:51:27,228 WARNING [train.py:1204] (2/4) Exclude cut with ID _53713_210_24116_1_1534314487762_7416751_755-165313-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 11:51:28,712 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8278_210_2550_1_1528637099_4220413_436-317072-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 11:51:30,073 WARNING [train.py:1204] (2/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_28-324729-0_sp1.1 from training. Duration: 0.8545625 2023-10-03 11:51:34,549 WARNING [train.py:1204] (2/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_326-165900-0 from training. Duration: 0.69 2023-10-03 11:51:35,327 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.2.encoder.layers.0.conv_module1.whiten, num_groups=1, num_channels=384, metric=9.66 vs. limit=15.0 2023-10-03 11:51:35,866 WARNING [train.py:1204] (2/4) Exclude cut with ID _53489_210_6228_1_1534298357169_3835970_96-30246-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 11:51:37,299 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8275_210_4198_1_1528455320_3892571_491-17707-0_sp1.1 from training. Duration: 0.5818125 2023-10-03 11:51:40,761 WARNING [train.py:1204] (2/4) Exclude cut with ID _14348_210_8341_1_1530599434996_3778640_240-232370-0 from training. Duration: 0.84 2023-10-03 11:51:42,053 WARNING [train.py:1204] (2/4) Exclude cut with ID _45012_210_12651_1_1533608435136_5371380_54-112623-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 11:51:43,352 INFO [train.py:1046] (2/4) Epoch 36, batch 2650, loss[loss=0.2128, simple_loss=0.279, pruned_loss=0.07329, over 19481.00 frames. ], tot_loss[loss=0.1601, simple_loss=0.2394, pruned_loss=0.04038, over 4699442.42 frames. ], batch size: 388, lr: 2.83e-03, grad_scale: 8.0 2023-10-03 11:51:43,394 WARNING [train.py:1204] (2/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_328-264776-0_sp1.1 from training. Duration: 0.6090625 2023-10-03 11:51:43,456 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9325W0001-648427 from training. Duration: 0.768 2023-10-03 11:51:43,464 WARNING [train.py:1204] (2/4) Exclude cut with ID _56480_210_4220_1_1534679942079_3075409_49-74717-0_sp1.1 from training. Duration: 0.936375 2023-10-03 11:51:46,803 WARNING [train.py:1204] (2/4) Exclude cut with ID _59500_210_8254_1_1534809610680_3736009_6-241869-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 11:51:48,271 WARNING [train.py:1204] (2/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_172-215932-0_sp0.9 from training. Duration: 0.7333125 2023-10-03 11:51:49,673 WARNING [train.py:1204] (2/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_17-90541-0_sp1.1 from training. Duration: 0.8545625 2023-10-03 11:51:51,096 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_43831_210_2158_1_1533725502_5872791_525-89447-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 11:51:52,413 WARNING [train.py:1204] (2/4) Exclude cut with ID _12302_210_5219_1_1529924446466_8107280_907-182949-0 from training. Duration: 0.92 2023-10-03 11:51:53,689 WARNING [train.py:1204] (2/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_402-102311-0_sp0.9 from training. Duration: 0.7444375 2023-10-03 11:51:53,727 WARNING [train.py:1204] (2/4) Exclude cut with ID _8021_210_7994_1_1528376451358_3653069_179-45206-0_sp0.9 from training. Duration: 0.9555625 2023-10-03 11:51:57,726 WARNING [train.py:1204] (2/4) Exclude cut with ID _40137_210_12210_1_1533290484046_3795180_249-187059-0 from training. Duration: 0.88 2023-10-03 11:51:57,819 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9653W0194-675472 from training. Duration: 0.9658125 2023-10-03 11:52:00,632 WARNING [train.py:1204] (2/4) Exclude cut with ID _51332_210_9988_1_1534244371836_3801270_336-255604-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 11:52:02,590 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_20120_210_6753_1_1531643035_5482859_510-96996-0 from training. Duration: 0.87 2023-10-03 11:52:02,605 WARNING [train.py:1204] (2/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_1167-20437-0_sp1.1 from training. Duration: 0.92725 2023-10-03 11:52:03,920 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_3475_210_2028_1_1526193578_3622569_265-276625-0 from training. Duration: 0.76 2023-10-03 11:52:08,270 WARNING [train.py:1204] (2/4) Exclude cut with ID _14860_210_4915_1_1530702020340_7343240_470-172418-0_sp1.1 from training. Duration: 0.97275 2023-10-03 11:52:08,279 WARNING [train.py:1204] (2/4) Exclude cut with ID _5444_210_2151_1_1527418808464_5312210_581-315411-0_sp1.1 from training. Duration: 0.563625 2023-10-03 11:52:09,565 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_34450_210_9547_1_1533099443_4109050_519-342439-0_sp1.1 from training. Duration: 0.97275 2023-10-03 11:52:09,590 WARNING [train.py:1204] (2/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_50-175961-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 11:52:11,329 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.541e+02 1.883e+02 2.080e+02 2.401e+02 3.232e+02, threshold=4.161e+02, percent-clipped=0.0 2023-10-03 11:52:14,267 WARNING [train.py:1204] (2/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_372-6869-0 from training. Duration: 0.44 2023-10-03 11:52:14,273 WARNING [train.py:1204] (2/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_3-237434-0 from training. Duration: 0.76 2023-10-03 11:52:16,977 WARNING [train.py:1204] (2/4) Exclude cut with ID _8772_210_1850_1_1528635595972_3664011_247-336448-0_sp1.1 from training. Duration: 0.67275 2023-10-03 11:52:22,019 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_427-78097-0 from training. Duration: 0.8 2023-10-03 11:52:22,042 WARNING [train.py:1204] (2/4) Exclude cut with ID _69884_210_9771_1_1535846392818_4166169_466-252727-0_sp1.1 from training. Duration: 0.97275 2023-10-03 11:52:23,410 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_15972_210_6400_1_1530877934_3405692_316-35478-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 11:52:23,435 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_809-17156-0_sp0.9 from training. Duration: 0.811125 2023-10-03 11:52:23,475 WARNING [train.py:1204] (2/4) Exclude cut with ID _57572_210_5517_1_1534921219624_3809484_594-263474-0_sp1.1 from training. Duration: 0.936375 2023-10-03 11:52:24,779 WARNING [train.py:1204] (2/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_190-228204-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 11:52:26,156 WARNING [train.py:1204] (2/4) Exclude cut with ID _19514_210_9259_1_1531530071330_5084230_117-21193-0_sp1.1 from training. Duration: 0.936375 2023-10-03 11:52:26,801 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.4.encoder.layers.2.whiten, num_groups=1, num_channels=384, metric=2.90 vs. limit=12.0 2023-10-03 11:52:27,607 WARNING [train.py:1204] (2/4) Exclude cut with ID _69584_210_5219_1_1535785133775_4044450_190-21315-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 11:52:28,969 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_53071_210_16554_1_1534493434_1276796_184-302050-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 11:52:30,314 WARNING [train.py:1204] (2/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_120-76843-0_sp1.1 from training. Duration: 0.5 2023-10-03 11:52:31,724 WARNING [train.py:1204] (2/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_318-162017-0_sp1.1 from training. Duration: 0.82725 2023-10-03 11:52:33,822 WARNING [train.py:1204] (2/4) Exclude cut with ID _46067_210_4220_1_1534125571066_3307450_79-46864-0_sp1.1 from training. Duration: 0.92725 2023-10-03 11:52:35,148 WARNING [train.py:1204] (2/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_358-215558-0_sp1.1 from training. Duration: 0.8454375 2023-10-03 11:52:36,547 WARNING [train.py:1204] (2/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_621-66764-0_sp1.1 from training. Duration: 0.92725 2023-10-03 11:52:38,049 WARNING [train.py:1204] (2/4) Exclude cut with ID _42863_210_9070_1_1533902398788_3696920_338-156851-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 11:52:39,386 WARNING [train.py:1204] (2/4) Exclude cut with ID _5289_210_2084_1_1527678202736_3779299_392-261988-0_sp0.9 from training. Duration: 0.72225 2023-10-03 11:52:42,152 WARNING [train.py:1204] (2/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_409-84682-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 11:52:43,522 WARNING [train.py:1204] (2/4) Exclude cut with ID _9235_210_5219_1_1528891154253_8046120_30-326681-0_sp0.9 from training. Duration: 0.92225 2023-10-03 11:52:43,527 WARNING [train.py:1204] (2/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_389-184695-0_sp1.1 from training. Duration: 0.92725 2023-10-03 11:52:43,579 WARNING [train.py:1204] (2/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_442-320317-0 from training. Duration: 0.82 2023-10-03 11:52:43,738 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.feed_forward1.out_proj.dropout_p, batch_count=1257433.3333333333, ans=0.1 2023-10-03 11:52:48,340 WARNING [train.py:1204] (2/4) Exclude cut with ID _44778_210_2054_1_1533798127203_7443090_756-29911-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 11:52:49,767 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_401-112036-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 11:52:49,853 WARNING [train.py:1204] (2/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_181-308827-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 11:52:51,229 WARNING [train.py:1204] (2/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_332-258725-0_sp1.1 from training. Duration: 0.963625 2023-10-03 11:52:51,390 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.self_attn_weights.pos_emb_skip_rate, batch_count=1257433.3333333333, ans=0.0 2023-10-03 11:52:52,987 WARNING [train.py:1204] (2/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_188-260011-0_sp0.9 from training. Duration: 0.811125 2023-10-03 11:52:53,030 WARNING [train.py:1204] (2/4) Exclude cut with ID _49039_210_6315_1_1533967537541_3846118_99-302239-0_sp1.1 from training. Duration: 0.963625 2023-10-03 11:52:55,721 WARNING [train.py:1204] (2/4) Exclude cut with ID _4122_210_2084_1_1526713164417_4353280_312-294828-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 11:52:55,733 WARNING [train.py:1204] (2/4) Exclude cut with ID _14922_210_6228_1_1530707435273_3693310_303-228738-0 from training. Duration: 0.84 2023-10-03 11:52:56,978 INFO [train.py:1046] (2/4) Epoch 36, batch 2700, loss[loss=0.1687, simple_loss=0.2521, pruned_loss=0.04264, over 23993.00 frames. ], tot_loss[loss=0.1611, simple_loss=0.2405, pruned_loss=0.04086, over 4695025.03 frames. ], batch size: 80, lr: 2.83e-03, grad_scale: 8.0 2023-10-03 11:52:57,427 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=1257500.0, ans=0.1 2023-10-03 11:52:58,599 WARNING [train.py:1204] (2/4) Exclude cut with ID _14349_210_8341_1_1530615710818_3442470_115-285975-0_sp0.9 from training. Duration: 0.9555625 2023-10-03 11:52:58,711 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.0.attention_skip_rate, batch_count=1257500.0, ans=0.0 2023-10-03 11:53:01,213 WARNING [train.py:1204] (2/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_125-144174-0_sp1.1 from training. Duration: 0.3545625 2023-10-03 11:53:02,654 WARNING [train.py:1204] (2/4) Exclude cut with ID _53565_210_10635_1_1534337218335_3820259_351-54570-0_sp1.1 from training. Duration: 0.97275 2023-10-03 11:53:03,909 WARNING [train.py:1204] (2/4) Exclude cut with ID _28317_210_12742_1_1532480302381_8510325_410-40065-0_sp1.1 from training. Duration: 0.963625 2023-10-03 11:53:03,930 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_38965_210_4852_1_1533174906_3484735_167-339442-0_sp1.1 from training. Duration: 0.963625 2023-10-03 11:53:05,315 WARNING [train.py:1204] (2/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_265-164486-0_sp1.1 from training. Duration: 0.87275 2023-10-03 11:53:05,950 WARNING [train.py:1204] (2/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_725-182484-0_sp1.1 from training. Duration: 0.92725 2023-10-03 11:53:05,974 WARNING [train.py:1204] (2/4) Exclude cut with ID _28317_210_12742_1_1532480302381_8510325_607-213951-0_sp0.9 from training. Duration: 0.9333125 2023-10-03 11:53:05,988 WARNING [train.py:1204] (2/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_605-133395-0_sp1.1 from training. Duration: 0.6 2023-10-03 11:53:06,012 WARNING [train.py:1204] (2/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_351-294650-0 from training. Duration: 0.63 2023-10-03 11:53:07,258 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_14953_210_2151_1_1530701710_5616165_710-165400-0_sp1.1 from training. Duration: 0.7909375 2023-10-03 11:53:07,422 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.1.bypass.skip_rate, batch_count=1257500.0, ans=0.035 2023-10-03 11:53:10,062 WARNING [train.py:1204] (2/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_237-58433-0_sp1.1 from training. Duration: 0.67275 2023-10-03 11:53:10,160 WARNING [train.py:1204] (2/4) Exclude cut with ID _5190_210_4557_1_1527244368799_373439_28-197030-0_sp1.1 from training. Duration: 0.6545625 2023-10-03 11:53:10,204 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_743-327651-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 11:53:12,946 WARNING [train.py:1204] (2/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_76-203867-0_sp0.9 from training. Duration: 0.8 2023-10-03 11:53:13,027 WARNING [train.py:1204] (2/4) Exclude cut with ID _14603_210_4220_1_1530615688441_3097319_274-274798-0 from training. Duration: 0.98 2023-10-03 11:53:14,406 WARNING [train.py:1204] (2/4) Exclude cut with ID _14087_210_2253_1_1530529224512_3014029_226-242140-0_sp1.1 from training. Duration: 0.736375 2023-10-03 11:53:16,878 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.3.encoder.layers.0.self_attn2.whiten, num_groups=1, num_channels=512, metric=15.76 vs. limit=22.5 2023-10-03 11:53:20,676 WARNING [train.py:1204] (2/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_352-59958-0_sp1.1 from training. Duration: 0.7818125 2023-10-03 11:53:20,680 WARNING [train.py:1204] (2/4) Exclude cut with ID _39411_210_5986_1_1533538825030_3343649_191-251819-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 11:53:26,712 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7046_210_6753_1_1527895337_6920917_642-106799-0_sp1.1 from training. Duration: 0.836375 2023-10-03 11:53:26,725 WARNING [train.py:1204] (2/4) Exclude cut with ID _22826_210_9994_1_1531908201522_3279169_412-3442-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 11:53:26,889 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.bypass.skip_rate, batch_count=1257633.3333333333, ans=0.07 2023-10-03 11:53:28,077 WARNING [train.py:1204] (2/4) Exclude cut with ID _60057_210_10640_1_1534903065793_3333150_171-181133-0_sp0.9 from training. Duration: 0.97775 2023-10-03 11:53:28,086 WARNING [train.py:1204] (2/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_154-135696-0_sp1.1 from training. Duration: 0.72725 2023-10-03 11:53:30,828 WARNING [train.py:1204] (2/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_148-186805-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 11:53:32,354 WARNING [train.py:1204] (2/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_605-165641-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 11:53:32,361 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_174-277364-0_sp0.9 from training. Duration: 0.788875 2023-10-03 11:53:32,377 WARNING [train.py:1204] (2/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_89-321955-0_sp0.9 from training. Duration: 0.888875 2023-10-03 11:53:36,980 WARNING [train.py:1204] (2/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_438-105429-0_sp1.1 from training. Duration: 0.963625 2023-10-03 11:53:36,983 WARNING [train.py:1204] (2/4) Exclude cut with ID _14970_210_15709_1_1530773543493_1715730_187-20259-0_sp0.9 from training. Duration: 0.988875 2023-10-03 11:53:45,700 INFO [scaling.py:1118] (2/4) WithLoss: name=encoder.encoders.3.encoder.layers.2.self_attn_weights, loss-sum=0.000e+00 2023-10-03 11:53:46,733 WARNING [train.py:1204] (2/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_385-193052-0_sp0.9 from training. Duration: 0.9555625 2023-10-03 11:53:46,789 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7113_210_1849_1_1527942685_3448768_194-311626-0_sp1.1 from training. Duration: 0.92725 2023-10-03 11:53:52,070 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_3780_210_2087_1_1526467389_2145572_176-107140-0_sp0.9 from training. Duration: 0.7666875 2023-10-03 11:53:52,072 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_36756_210_7234_1_1533026991_966276_123-213457-0_sp1.1 from training. Duration: 0.936375 2023-10-03 11:53:55,175 WARNING [train.py:1204] (2/4) Exclude cut with ID _23557_210_8750_1_1531890106370_6846255_456-244861-0_sp1.1 from training. Duration: 0.963625 2023-10-03 11:53:56,536 WARNING [train.py:1204] (2/4) Exclude cut with ID _37086_210_9038_1_1532999540891_7021250_242-349170-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 11:53:57,837 WARNING [train.py:1204] (2/4) Exclude cut with ID _41298_210_9019_1_1533430371450_4822143_471-281351-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 11:53:59,174 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_53378_210_17282_1_1534221071_5769509_572-126176-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 11:53:59,278 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_1507-263032-0_sp1.1 from training. Duration: 0.963625 2023-10-03 11:53:59,308 WARNING [train.py:1204] (2/4) Exclude cut with ID _13679_210_7497_1_1530534293662_3355098_411-186088-0_sp1.1 from training. Duration: 0.8545625 2023-10-03 11:54:02,003 WARNING [train.py:1204] (2/4) Exclude cut with ID _29345_210_4381_1_1532689168263_3860169_149-257268-0_sp1.1 from training. Duration: 0.736375 2023-10-03 11:54:03,414 WARNING [train.py:1204] (2/4) Exclude cut with ID _47790_210_16187_1_1533862814066_4207519_381-252014-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 11:54:03,423 WARNING [train.py:1204] (2/4) Exclude cut with ID _35799_210_5854_1_1533263487887_3529071_512-2717-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 11:54:06,818 WARNING [train.py:1204] (2/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_505-205270-0_sp0.9 from training. Duration: 0.42225 2023-10-03 11:54:08,002 WARNING [train.py:1204] (2/4) Exclude cut with ID _14998_210_5219_1_1530705582080_7474970_731-20117-0_sp1.1 from training. Duration: 0.936375 2023-10-03 11:54:10,706 INFO [train.py:1046] (2/4) Epoch 36, batch 2750, loss[loss=0.1474, simple_loss=0.2221, pruned_loss=0.03633, over 23587.00 frames. ], tot_loss[loss=0.1608, simple_loss=0.2401, pruned_loss=0.04072, over 4693514.89 frames. ], batch size: 149, lr: 2.83e-03, grad_scale: 4.0 2023-10-03 11:54:10,820 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_37351_210_7925_1_1533012931_3812113_265-85780-0_sp1.1 from training. Duration: 0.8 2023-10-03 11:54:10,826 WARNING [train.py:1204] (2/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_313-134930-0 from training. Duration: 0.97 2023-10-03 11:54:13,571 WARNING [train.py:1204] (2/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_374-138971-0 from training. Duration: 0.94 2023-10-03 11:54:13,618 WARNING [train.py:1204] (2/4) Exclude cut with ID _30304_210_6319_1_1532476827261_7582060_1227-287886-0_sp1.1 from training. Duration: 0.936375 2023-10-03 11:54:15,110 WARNING [train.py:1204] (2/4) Exclude cut with ID _59493_210_17282_1_1535078419077_3225879_332-217544-0_sp1.1 from training. Duration: 0.97275 2023-10-03 11:54:16,454 WARNING [train.py:1204] (2/4) Exclude cut with ID _23514_210_1389_1_1531832139785_3031439_361-119640-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 11:54:17,934 WARNING [train.py:1204] (2/4) Exclude cut with ID _69584_210_5219_1_1535785133775_4044450_142-118058-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 11:54:17,963 WARNING [train.py:1204] (2/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_178-177582-0_sp1.1 from training. Duration: 0.67275 2023-10-03 11:54:18,000 WARNING [train.py:1204] (2/4) Exclude cut with ID _25303_210_17519_1_1532050207576_3635970_41-195572-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 11:54:23,049 WARNING [train.py:1204] (2/4) Exclude cut with ID _55328_210_27767_1_1534580973260_4088170_329-341322-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 11:54:23,084 WARNING [train.py:1204] (2/4) Exclude cut with ID _5608_210_2106_1_1527415439089_6819768_909-82767-0_sp1.1 from training. Duration: 0.5545625 2023-10-03 11:54:23,135 WARNING [train.py:1204] (2/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_261-297503-0_sp1.1 from training. Duration: 0.9 2023-10-03 11:54:23,141 WARNING [train.py:1204] (2/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_459-291706-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 11:54:23,142 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7762_210_1794_1_1528199508_4057408_422-227034-0 from training. Duration: 0.73 2023-10-03 11:54:23,151 WARNING [train.py:1204] (2/4) Exclude cut with ID _3067_210_2667_1_1526212974640_4503168_250-285937-0_sp0.9 from training. Duration: 0.888875 2023-10-03 11:54:23,166 WARNING [train.py:1204] (2/4) Exclude cut with ID _66873_210_5517_1_1535454136506_4293370_332-253745-0_sp1.1 from training. Duration: 0.936375 2023-10-03 11:54:24,798 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.feed_forward1.out_proj.dropout_p, batch_count=1257900.0, ans=0.1 2023-10-03 11:54:30,327 WARNING [train.py:1204] (2/4) Exclude cut with ID _13938_210_9988_1_1530518718978_3693960_304-112784-0 from training. Duration: 0.46 2023-10-03 11:54:30,471 WARNING [train.py:1204] (2/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_226-267957-0_sp1.1 from training. Duration: 0.92725 2023-10-03 11:54:31,788 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_41822_210_13270_1_1533955976_4379284_608-254567-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 11:54:31,857 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_124-113562-0_sp1.1 from training. Duration: 0.8545625 2023-10-03 11:54:31,888 WARNING [train.py:1204] (2/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_117-290916-0_sp1.1 from training. Duration: 0.463625 2023-10-03 11:54:33,355 WARNING [train.py:1204] (2/4) Exclude cut with ID _28427_210_4220_1_1532581240967_3061069_102-66533-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 11:54:35,929 WARNING [train.py:1204] (2/4) Exclude cut with ID _55383_210_10635_1_1534468426172_2231669_134-128246-0_sp1.1 from training. Duration: 0.8090625 2023-10-03 11:54:35,967 WARNING [train.py:1204] (2/4) Exclude cut with ID _45871_210_5665_1_1533864614245_3296060_49-308316-0_sp1.1 from training. Duration: 0.97275 2023-10-03 11:54:36,604 WARNING [train.py:1204] (2/4) Exclude cut with ID _57572_210_5517_1_1534921219624_3809484_176-199205-0_sp1.1 from training. Duration: 0.97275 2023-10-03 11:54:37,978 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.bypass.skip_rate, batch_count=1257900.0, ans=0.04949747468305833 2023-10-03 11:54:39,230 WARNING [train.py:1204] (2/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_45-265684-0_sp1.1 from training. Duration: 0.6545625 2023-10-03 11:54:39,249 WARNING [train.py:1204] (2/4) Exclude cut with ID _15150_210_8341_1_1530752411803_7256384_469-239509-0_sp0.9 from training. Duration: 0.7555625 2023-10-03 11:54:39,418 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.conv_module1.balancer2.prob, batch_count=1257966.6666666667, ans=0.125 2023-10-03 11:54:40,467 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.561e+02 1.983e+02 2.239e+02 2.641e+02 5.389e+02, threshold=4.478e+02, percent-clipped=1.0 2023-10-03 11:54:40,558 WARNING [train.py:1204] (2/4) Exclude cut with ID _28461_210_2253_1_1532771775189_3041299_136-270644-0_sp1.1 from training. Duration: 0.8454375 2023-10-03 11:54:40,799 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.balancer1.prob, batch_count=1257966.6666666667, ans=0.125 2023-10-03 11:54:41,931 WARNING [train.py:1204] (2/4) Exclude cut with ID _69584_210_5219_1_1535785133775_4044450_302-47302-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 11:54:42,021 WARNING [train.py:1204] (2/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_166-128941-0_sp1.1 from training. Duration: 0.4909375 2023-10-03 11:54:47,828 WARNING [train.py:1204] (2/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_559-120504-0_sp1.1 from training. Duration: 0.97275 2023-10-03 11:54:51,041 WARNING [train.py:1204] (2/4) Exclude cut with ID _6456_210_1534_1_1527681634212_3181049_345-252877-0_sp1.1 from training. Duration: 0.5818125 2023-10-03 11:54:51,065 WARNING [train.py:1204] (2/4) Exclude cut with ID _28317_210_12742_1_1532480302381_8510325_902-233003-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 11:54:54,410 WARNING [train.py:1204] (2/4) Exclude cut with ID _36538_210_9994_1_1533204020363_3795620_204-261385-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 11:54:54,414 WARNING [train.py:1204] (2/4) Exclude cut with ID _14821_210_8750_1_1530680503991_6829359_189-297451-0_sp1.1 from training. Duration: 0.5 2023-10-03 11:54:56,355 WARNING [train.py:1204] (2/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_1211-226066-0_sp0.9 from training. Duration: 0.8444375 2023-10-03 11:55:00,752 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6786_210_1388_1_1527839699_4194495_273-149841-0_sp1.1 from training. Duration: 0.836375 2023-10-03 11:55:00,786 WARNING [train.py:1204] (2/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_380-162648-0_sp1.1 from training. Duration: 0.8545625 2023-10-03 11:55:00,787 WARNING [train.py:1204] (2/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_494-199538-0 from training. Duration: 0.75 2023-10-03 11:55:06,174 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.2.encoder.layers.0.self_attn_weights.whiten_keys, num_groups=4, num_channels=128, metric=4.08 vs. limit=6.0 2023-10-03 11:55:06,678 WARNING [train.py:1204] (2/4) Exclude cut with ID _49097_210_16049_1_1533952780207_4360979_276-87053-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 11:55:08,062 WARNING [train.py:1204] (2/4) Exclude cut with ID _5444_210_2151_1_1527418808464_5312210_726-27165-0 from training. Duration: 0.67 2023-10-03 11:55:14,685 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_128-202104-0_sp1.1 from training. Duration: 0.57275 2023-10-03 11:55:16,080 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_11349_210_6753_1_1529800861_1101207_26-65602-0_sp1.1 from training. Duration: 0.87275 2023-10-03 11:55:16,113 WARNING [train.py:1204] (2/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_159-328157-0 from training. Duration: 0.52 2023-10-03 11:55:17,500 WARNING [train.py:1204] (2/4) Exclude cut with ID _14069_210_5632_1_1530512692033_4803390_98-21934-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 11:55:18,912 WARNING [train.py:1204] (2/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_352-59958-0_sp0.9 from training. Duration: 0.9555625 2023-10-03 11:55:18,953 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6163_210_6400_1_1527506746_4263068_283-137811-0 from training. Duration: 0.77 2023-10-03 11:55:18,979 WARNING [train.py:1204] (2/4) Exclude cut with ID _21989_210_5854_1_1531899214931_3271360_133-44759-0_sp1.1 from training. Duration: 0.7 2023-10-03 11:55:24,169 WARNING [train.py:1204] (2/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_21-174514-0_sp0.9 from training. Duration: 0.47775 2023-10-03 11:55:24,721 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.4.encoder.layers.0.self_attn1.whiten, num_groups=1, num_channels=384, metric=14.01 vs. limit=22.5 2023-10-03 11:55:25,397 INFO [train.py:1046] (2/4) Epoch 36, batch 2800, loss[loss=0.1468, simple_loss=0.2115, pruned_loss=0.04107, over 22883.00 frames. ], tot_loss[loss=0.1591, simple_loss=0.2376, pruned_loss=0.04034, over 4687616.06 frames. ], batch size: 322, lr: 2.83e-03, grad_scale: 8.0 2023-10-03 11:55:25,475 WARNING [train.py:1204] (2/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_201-78811-0_sp1.1 from training. Duration: 0.963625 2023-10-03 11:55:25,501 WARNING [train.py:1204] (2/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_537-164630-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 11:55:25,566 WARNING [train.py:1204] (2/4) Exclude cut with ID _14087_210_2253_1_1530529224512_3014029_156-257607-0 from training. Duration: 0.83 2023-10-03 11:55:25,576 WARNING [train.py:1204] (2/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_308-154450-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 11:55:27,309 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_45399_210_4852_1_1533624725_7579737_214-240574-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 11:55:28,785 WARNING [train.py:1204] (2/4) Exclude cut with ID _29303_210_2575_1_1532392237303_7410649_615-305517-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 11:55:28,823 WARNING [train.py:1204] (2/4) Exclude cut with ID IC0125W0002-61808 from training. Duration: 0.708 2023-10-03 11:55:28,824 WARNING [train.py:1204] (2/4) Exclude cut with ID IC0125W0017-61823 from training. Duration: 0.759 2023-10-03 11:55:29,297 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.4.encoder.layers.2.self_attn1.whiten, num_groups=1, num_channels=384, metric=12.19 vs. limit=22.5 2023-10-03 11:55:31,586 WARNING [train.py:1204] (2/4) Exclude cut with ID _53219_210_4525_1_1534834836008_4064825_4-145291-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 11:55:34,599 WARNING [train.py:1204] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_185-174805-0_sp1.1 from training. Duration: 0.6181875 2023-10-03 11:55:35,888 WARNING [train.py:1204] (2/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_635-193046-0_sp1.1 from training. Duration: 0.92725 2023-10-03 11:55:38,763 WARNING [train.py:1204] (2/4) Exclude cut with ID _46067_210_4220_1_1534125571066_3307450_321-258866-0_sp1.1 from training. Duration: 0.97275 2023-10-03 11:55:41,698 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8558_210_1410_1_1528505225_7613406_131-329524-0 from training. Duration: 0.79 2023-10-03 11:55:43,281 WARNING [train.py:1204] (2/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_847-41334-0_sp0.9 from training. Duration: 0.488875 2023-10-03 11:55:44,703 WARNING [train.py:1204] (2/4) Exclude cut with ID _14227_210_4381_1_1530701804286_3940259_125-275642-0 from training. Duration: 0.99 2023-10-03 11:55:46,058 WARNING [train.py:1204] (2/4) Exclude cut with ID _25248_210_8750_1_1532062825941_5554180_248-177255-0_sp1.1 from training. Duration: 0.963625 2023-10-03 11:55:46,106 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_40047_210_2290_1_1533290519_2374311_195-165693-0_sp1.1 from training. Duration: 0.8090625 2023-10-03 11:55:46,110 WARNING [train.py:1204] (2/4) Exclude cut with ID _23109_210_4381_1_1532150828777_901949_29-258672-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 11:55:47,707 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.bypass_mid.scale_min, batch_count=1258233.3333333333, ans=0.2 2023-10-03 11:55:49,668 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.1.encoder.layers.1.conv_module2.whiten, num_groups=1, num_channels=256, metric=10.51 vs. limit=15.0 2023-10-03 11:55:50,211 WARNING [train.py:1204] (2/4) Exclude cut with ID _14821_210_8750_1_1530680503991_6829359_2-227151-0_sp1.1 from training. Duration: 0.7545625 2023-10-03 11:55:51,898 WARNING [train.py:1204] (2/4) Exclude cut with ID _49643_210_7989_1_1534640244224_3939031_565-53982-0_sp1.1 from training. Duration: 0.963625 2023-10-03 11:55:51,901 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7544_210_6753_1_1528591327_1471426_33-346031-0_sp0.9 from training. Duration: 0.67775 2023-10-03 11:55:51,981 WARNING [train.py:1204] (2/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_95-61782-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 11:55:56,840 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.2.encoder.layers.2.feed_forward2.out_whiten, num_groups=1, num_channels=384, metric=8.14 vs. limit=15.0 2023-10-03 11:56:01,154 WARNING [train.py:1204] (2/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_686-54383-0_sp0.9 from training. Duration: 0.9666875 2023-10-03 11:56:02,495 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_14056_210_4163_1_1530604483_7572827_505-276823-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 11:56:05,137 WARNING [train.py:1204] (2/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_745-128443-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 11:56:05,186 WARNING [train.py:1204] (2/4) Exclude cut with ID _3844_210_1390_1_1526727803582_2871209_20-251664-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 11:56:05,251 WARNING [train.py:1204] (2/4) Exclude cut with ID _36538_210_9994_1_1533204020363_3795620_487-164628-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 11:56:10,781 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.4.encoder.layers.0.self_attn1.whiten, num_groups=1, num_channels=384, metric=14.80 vs. limit=22.5 2023-10-03 11:56:11,547 WARNING [train.py:1204] (2/4) Exclude cut with ID _5166_210_2084_1_1527073197160_4087993_309-222545-0_sp1.1 from training. Duration: 0.763625 2023-10-03 11:56:11,570 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_3531_210_3528_1_1526294203_5216456_309-227302-0 from training. Duration: 0.69 2023-10-03 11:56:11,622 WARNING [train.py:1204] (2/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_42-192460-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 11:56:12,932 WARNING [train.py:1204] (2/4) Exclude cut with ID _22083_210_8254_1_1531702862439_3678970_49-234546-0_sp1.1 from training. Duration: 0.7545625 2023-10-03 11:56:12,942 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_427-78097-0_sp0.9 from training. Duration: 0.888875 2023-10-03 11:56:15,703 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_43802_210_2290_1_1533516203_8829504_378-205254-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 11:56:17,089 WARNING [train.py:1204] (2/4) Exclude cut with ID _45329_210_7005_1_1534157951211_6774339_672-314830-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 11:56:21,221 WARNING [train.py:1204] (2/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_90-30673-0_sp1.1 from training. Duration: 0.763625 2023-10-03 11:56:22,762 WARNING [train.py:1204] (2/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_563-329943-0_sp1.1 from training. Duration: 0.9 2023-10-03 11:56:23,698 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.0.nonlin_attention.whiten2.whitening_limit, batch_count=1258433.3333333333, ans=15.0 2023-10-03 11:56:24,603 WARNING [train.py:1204] (2/4) Exclude cut with ID _21632_210_2253_1_1531994001765_3744140_154-84090-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 11:56:24,604 WARNING [train.py:1204] (2/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_103-336064-0_sp0.9 from training. Duration: 0.5666875 2023-10-03 11:56:24,657 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_43-71989-0_sp0.9 from training. Duration: 0.6333125 2023-10-03 11:56:24,711 WARNING [train.py:1204] (2/4) Exclude cut with ID _3867_210_1883_1_1526641200575_4475830_319-349850-0_sp0.9 from training. Duration: 0.7444375 2023-10-03 11:56:24,776 WARNING [train.py:1204] (2/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_629-179454-0_sp1.1 from training. Duration: 0.963625 2023-10-03 11:56:24,784 WARNING [train.py:1204] (2/4) Exclude cut with ID _65263_210_22356_1_1535763598691_3921037_49-222917-0 from training. Duration: 0.92 2023-10-03 11:56:26,145 WARNING [train.py:1204] (2/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_602-331772-0_sp1.1 from training. Duration: 0.936375 2023-10-03 11:56:28,051 WARNING [train.py:1204] (2/4) Exclude cut with ID _58414_210_8254_1_1535421609162_7151182_292-134978-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 11:56:28,078 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_38793_210_2544_1_1533176114_3849320_200-299292-0_sp1.1 from training. Duration: 0.936375 2023-10-03 11:56:28,178 WARNING [train.py:1204] (2/4) Exclude cut with ID _14970_210_15709_1_1530775547361_1966965_6-23328-0 from training. Duration: 0.86 2023-10-03 11:56:29,537 WARNING [train.py:1204] (2/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_386-225511-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 11:56:29,556 WARNING [train.py:1204] (2/4) Exclude cut with ID _38323_210_12108_1_1533121233369_3303049_416-11919-0_sp1.1 from training. Duration: 0.87275 2023-10-03 11:56:30,890 WARNING [train.py:1204] (2/4) Exclude cut with ID _4866_210_2577_1_1527404412284_4364659_256-306830-0_sp1.1 from training. Duration: 0.7909375 2023-10-03 11:56:32,343 WARNING [train.py:1204] (2/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_53-300544-0 from training. Duration: 0.67 2023-10-03 11:56:36,577 WARNING [train.py:1204] (2/4) Exclude cut with ID _41314_210_12651_1_1533459476543_3766460_80-74184-0_sp1.1 from training. Duration: 0.7545625 2023-10-03 11:56:36,597 WARNING [train.py:1204] (2/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_332-256168-0_sp1.1 from training. Duration: 0.6818125 2023-10-03 11:56:38,583 WARNING [train.py:1204] (2/4) Exclude cut with ID _48836_210_12210_1_1534075848293_3904150_429-35475-0_sp1.1 from training. Duration: 0.8090625 2023-10-03 11:56:39,665 INFO [train.py:1046] (2/4) Epoch 36, batch 2850, loss[loss=0.1674, simple_loss=0.2363, pruned_loss=0.0492, over 22789.00 frames. ], tot_loss[loss=0.1587, simple_loss=0.2374, pruned_loss=0.04, over 4683327.37 frames. ], batch size: 322, lr: 2.83e-03, grad_scale: 8.0 2023-10-03 11:56:41,104 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_144-135833-0_sp1.1 from training. Duration: 0.97275 2023-10-03 11:56:45,079 WARNING [train.py:1204] (2/4) Exclude cut with ID _14224_210_4381_1_1530529062037_4264010_380-343670-0_sp0.9 from training. Duration: 0.97775 2023-10-03 11:56:46,524 WARNING [train.py:1204] (2/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_213-265174-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 11:56:46,562 WARNING [train.py:1204] (2/4) Exclude cut with ID _20455_210_5219_1_1531565996775_8098100_86-314283-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 11:56:48,047 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_41750_210_1960_1_1533520678_3948042_68-223813-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 11:56:49,435 WARNING [train.py:1204] (2/4) Exclude cut with ID _55153_210_2253_1_1534384786677_3162096_24-198429-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 11:56:50,831 WARNING [train.py:1204] (2/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_119-207162-0_sp0.9 from training. Duration: 0.788875 2023-10-03 11:56:52,180 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_12868_210_6753_1_1530702479_3610564_148-172861-0 from training. Duration: 0.5 2023-10-03 11:56:53,919 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.balancer2.prob, batch_count=1258566.6666666667, ans=0.125 2023-10-03 11:56:58,807 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_5522_210_3273_1_1527249606_3909096_318-203153-0 from training. Duration: 0.43 2023-10-03 11:56:58,820 WARNING [train.py:1204] (2/4) Exclude cut with ID _14884_210_2252_1_1530707459689_5561250_288-189949-0_sp1.1 from training. Duration: 0.97275 2023-10-03 11:57:01,506 WARNING [train.py:1204] (2/4) Exclude cut with ID _5294_210_1747_1_1527249643928_3696179_529-242298-0 from training. Duration: 0.95 2023-10-03 11:57:02,820 WARNING [train.py:1204] (2/4) Exclude cut with ID _36538_210_9994_1_1533204020363_3795620_503-110289-0_sp1.1 from training. Duration: 0.936375 2023-10-03 11:57:04,233 WARNING [train.py:1204] (2/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_385-193052-0 from training. Duration: 0.86 2023-10-03 11:57:04,291 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_456-22141-0 from training. Duration: 0.88 2023-10-03 11:57:05,729 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_38965_210_4852_1_1533174906_3484735_284-117840-0_sp1.1 from training. Duration: 0.936375 2023-10-03 11:57:08,329 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.568e+02 1.821e+02 1.991e+02 2.148e+02 3.259e+02, threshold=3.982e+02, percent-clipped=0.0 2023-10-03 11:57:08,597 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.balancer2.prob, batch_count=1258633.3333333333, ans=0.125 2023-10-03 11:57:17,588 WARNING [train.py:1204] (2/4) Exclude cut with ID _32704_210_8954_1_1533017488840_6747370_75-201625-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 11:57:17,691 WARNING [train.py:1204] (2/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_255-312654-0_sp1.1 from training. Duration: 0.82725 2023-10-03 11:57:17,702 WARNING [train.py:1204] (2/4) Exclude cut with ID _47251_210_18628_1_1533814063128_4137250_214-88076-0_sp0.9 from training. Duration: 0.97775 2023-10-03 11:57:18,994 WARNING [train.py:1204] (2/4) Exclude cut with ID _4092_210_2151_1_1526643044449_3135759_227-213972-0_sp1.1 from training. Duration: 0.4909375 2023-10-03 11:57:19,009 WARNING [train.py:1204] (2/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_912-47473-0_sp1.1 from training. Duration: 0.6181875 2023-10-03 11:57:19,039 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6767_210_3273_1_1527855783_1718932_173-256601-0_sp1.1 from training. Duration: 0.72725 2023-10-03 11:57:21,673 WARNING [train.py:1204] (2/4) Exclude cut with ID _6274_210_2084_1_1527937583837_7494020_32-205288-0_sp1.1 from training. Duration: 0.7090625 2023-10-03 11:57:21,703 WARNING [train.py:1204] (2/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_39-248870-0 from training. Duration: 0.69 2023-10-03 11:57:24,553 WARNING [train.py:1204] (2/4) Exclude cut with ID _3541_210_2477_1_1526475408709_3263973_377-296839-0_sp1.1 from training. Duration: 0.5 2023-10-03 11:57:24,568 WARNING [train.py:1204] (2/4) Exclude cut with ID _13679_210_7497_1_1530534293662_3355098_413-56154-0_sp1.1 from training. Duration: 0.963625 2023-10-03 11:57:25,871 WARNING [train.py:1204] (2/4) Exclude cut with ID _14970_210_15709_1_1530775547361_1966965_201-84544-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 11:57:25,921 WARNING [train.py:1204] (2/4) Exclude cut with ID _21783_210_11357_1_1531742458730_3762679_105-95775-0_sp1.1 from training. Duration: 0.936375 2023-10-03 11:57:26,100 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.conv_skip_rate, batch_count=1258700.0, ans=0.0 2023-10-03 11:57:29,163 WARNING [train.py:1204] (2/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_131-191669-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 11:57:29,193 WARNING [train.py:1204] (2/4) Exclude cut with ID _14860_210_4915_1_1530702020340_7343240_695-4323-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 11:57:31,289 WARNING [train.py:1204] (2/4) Exclude cut with ID _39867_210_9826_1_1533553222030_9344259_991-130565-0_sp1.1 from training. Duration: 0.97275 2023-10-03 11:57:32,706 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_50-3289-0_sp1.1 from training. Duration: 0.82725 2023-10-03 11:57:34,095 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_54-347670-0_sp1.1 from training. Duration: 0.92725 2023-10-03 11:57:34,295 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.bypass_mid.scale_min, batch_count=1258700.0, ans=0.2 2023-10-03 11:57:35,423 WARNING [train.py:1204] (2/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_200-153445-0_sp1.1 from training. Duration: 0.936375 2023-10-03 11:57:35,509 WARNING [train.py:1204] (2/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_279-302911-0_sp1.1 from training. Duration: 0.97275 2023-10-03 11:57:38,155 WARNING [train.py:1204] (2/4) Exclude cut with ID _14886_210_4220_1_1530702020355_3516190_159-73748-0_sp0.9 from training. Duration: 0.92225 2023-10-03 11:57:40,346 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.balancer2.prob, batch_count=1258766.6666666667, ans=0.125 2023-10-03 11:57:44,106 WARNING [train.py:1204] (2/4) Exclude cut with ID _53379_210_20034_1_1534222027363_3854654_23-222715-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 11:57:45,441 WARNING [train.py:1204] (2/4) Exclude cut with ID _12555_210_8750_1_1530075594402_3780239_449-267986-0 from training. Duration: 0.83 2023-10-03 11:57:45,460 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7544_210_6753_1_1528587722_3576686_295-258479-0 from training. Duration: 0.84 2023-10-03 11:57:46,934 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7002_210_1794_1_1527935634_4034444_487-236501-0_sp0.9 from training. Duration: 0.7333125 2023-10-03 11:57:46,983 WARNING [train.py:1204] (2/4) Exclude cut with ID _21783_210_11357_1_1531742458730_3762679_244-19107-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 11:57:46,995 WARNING [train.py:1204] (2/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_390-339137-0 from training. Duration: 0.56 2023-10-03 11:57:48,351 WARNING [train.py:1204] (2/4) Exclude cut with ID _7562_210_2084_1_1528887767916_3946229_186-326422-0_sp1.1 from training. Duration: 0.763625 2023-10-03 11:57:49,692 WARNING [train.py:1204] (2/4) Exclude cut with ID _33640_210_2655_1_1533117605479_4855469_645-145235-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 11:57:49,709 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_25767_210_2161_1_1532307483_3867748_506-171777-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 11:57:49,740 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_5438_210_1794_1_1527336687_2600009_233-58072-0_sp1.1 from training. Duration: 0.7 2023-10-03 11:57:49,740 WARNING [train.py:1204] (2/4) Exclude cut with ID IC0669W0063-330908 from training. Duration: 0.896 2023-10-03 11:57:51,063 WARNING [train.py:1204] (2/4) Exclude cut with ID IC0671W0070-331913 from training. Duration: 0.896 2023-10-03 11:57:51,066 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_258-310570-0_sp1.1 from training. Duration: 0.7454375 2023-10-03 11:57:51,127 WARNING [train.py:1204] (2/4) Exclude cut with ID _9461_210_4557_1_1529060514726_3677451_162-177507-0_sp1.1 from training. Duration: 0.97275 2023-10-03 11:57:52,531 INFO [train.py:1046] (2/4) Epoch 36, batch 2900, loss[loss=0.1586, simple_loss=0.2359, pruned_loss=0.04069, over 23364.00 frames. ], tot_loss[loss=0.1586, simple_loss=0.2373, pruned_loss=0.03991, over 4682630.71 frames. ], batch size: 119, lr: 2.83e-03, grad_scale: 8.0 2023-10-03 11:57:52,907 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.nonlin_attention.balancer.prob, batch_count=1258833.3333333333, ans=0.125 2023-10-03 11:57:53,977 WARNING [train.py:1204] (2/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_26-175553-0_sp0.9 from training. Duration: 0.67775 2023-10-03 11:57:53,993 WARNING [train.py:1204] (2/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_7-150481-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 11:57:54,056 WARNING [train.py:1204] (2/4) Exclude cut with ID _23557_210_8750_1_1531890106370_6846255_72-273262-0_sp1.1 from training. Duration: 0.963625 2023-10-03 11:57:55,464 WARNING [train.py:1204] (2/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_367-61330-0 from training. Duration: 0.75 2023-10-03 11:57:59,223 WARNING [train.py:1204] (2/4) Exclude cut with ID _39867_210_9826_1_1533553222030_9344259_1337-11141-0_sp1.1 from training. Duration: 0.936375 2023-10-03 11:57:59,242 WARNING [train.py:1204] (2/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_101-293367-0 from training. Duration: 0.63 2023-10-03 11:57:59,406 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.conv_module1.balancer1.prob, batch_count=1258833.3333333333, ans=0.125 2023-10-03 11:58:01,036 WARNING [train.py:1204] (2/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_10-100367-0 from training. Duration: 0.98 2023-10-03 11:58:02,473 WARNING [train.py:1204] (2/4) Exclude cut with ID _60057_210_10640_1_1534903065793_3333150_171-181133-0_sp1.1 from training. Duration: 0.8 2023-10-03 11:58:02,476 WARNING [train.py:1204] (2/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_844-96989-0_sp0.9 from training. Duration: 0.788875 2023-10-03 11:58:03,901 WARNING [train.py:1204] (2/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_301-144595-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 11:58:05,277 WARNING [train.py:1204] (2/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_115-147343-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 11:58:09,978 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_3171_210_2087_1_1526109084_2330007_37-64334-0_sp1.1 from training. Duration: 0.7454375 2023-10-03 11:58:10,020 WARNING [train.py:1204] (2/4) Exclude cut with ID _19514_210_9259_1_1531530071330_5084230_769-272914-0_sp1.1 from training. Duration: 0.936375 2023-10-03 11:58:12,743 WARNING [train.py:1204] (2/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_76-311963-0_sp0.9 from training. Duration: 0.77775 2023-10-03 11:58:12,770 WARNING [train.py:1204] (2/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_526-62079-0 from training. Duration: 0.67 2023-10-03 11:58:12,817 WARNING [train.py:1204] (2/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_7-111954-0_sp0.9 from training. Duration: 0.62225 2023-10-03 11:58:14,128 WARNING [train.py:1204] (2/4) Exclude cut with ID _57997_210_4163_1_1534766425889_3961049_360-279140-0_sp1.1 from training. Duration: 0.97275 2023-10-03 11:58:15,547 WARNING [train.py:1204] (2/4) Exclude cut with ID _4866_210_2577_1_1527404412284_4364659_133-1696-0 from training. Duration: 0.99 2023-10-03 11:58:15,595 WARNING [train.py:1204] (2/4) Exclude cut with ID _5190_210_4557_1_1527244368799_373439_28-197030-0 from training. Duration: 0.72 2023-10-03 11:58:17,115 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.1.feed_forward3.hidden_balancer.prob, batch_count=1258900.0, ans=0.125 2023-10-03 11:58:19,613 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_53-39003-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 11:58:19,615 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6673_210_3273_1_1527767790_3173240_351-167954-0 from training. Duration: 0.48 2023-10-03 11:58:19,639 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_2822_210_2715_1_1526112586_1999587_165-36656-0_sp1.1 from training. Duration: 0.8090625 2023-10-03 11:58:22,324 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_328-264892-0_sp1.1 from training. Duration: 0.92725 2023-10-03 11:58:22,329 WARNING [train.py:1204] (2/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_29-293896-0_sp0.9 from training. Duration: 0.67775 2023-10-03 11:58:24,013 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.conv_module2.balancer1.max_abs, batch_count=1258966.6666666667, ans=10.0 2023-10-03 11:58:25,086 WARNING [train.py:1204] (2/4) Exclude cut with ID _65263_210_22356_1_1535763598691_3921037_465-55448-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 11:58:25,151 WARNING [train.py:1204] (2/4) Exclude cut with ID _21989_210_5854_1_1531899214931_3271360_311-53106-0_sp1.1 from training. Duration: 0.97275 2023-10-03 11:58:29,663 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.2.encoder.layers.1.feed_forward1.out_whiten, num_groups=1, num_channels=384, metric=7.79 vs. limit=15.0 2023-10-03 11:58:30,469 WARNING [train.py:1204] (2/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_389-277915-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 11:58:31,900 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_69630_210_7234_1_1535791094_677837_63-277139-0_sp1.1 from training. Duration: 0.963625 2023-10-03 11:58:33,362 WARNING [train.py:1204] (2/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_85-138257-0 from training. Duration: 0.71 2023-10-03 11:58:33,377 WARNING [train.py:1204] (2/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_686-247600-0 from training. Duration: 0.92 2023-10-03 11:58:33,382 WARNING [train.py:1204] (2/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_108-255430-0_sp1.1 from training. Duration: 0.8818125 2023-10-03 11:58:36,435 WARNING [train.py:1204] (2/4) Exclude cut with ID _14884_210_2252_1_1530707459689_5561250_114-201399-0_sp1.1 from training. Duration: 0.6909375 2023-10-03 11:58:39,629 WARNING [train.py:1204] (2/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_506-42614-0 from training. Duration: 0.79 2023-10-03 11:58:39,785 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.bypass_mid.scale_min, batch_count=1259033.3333333333, ans=0.2 2023-10-03 11:58:40,950 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_43802_210_2290_1_1533516203_8829504_85-195240-0_sp1.1 from training. Duration: 0.7909375 2023-10-03 11:58:46,595 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_141-36243-0_sp1.1 from training. Duration: 0.97275 2023-10-03 11:58:52,296 WARNING [train.py:1204] (2/4) Exclude cut with ID _7852_210_5219_1_1528286471316_8139400_358-287492-0_sp1.1 from training. Duration: 0.87275 2023-10-03 11:58:53,566 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_805-71497-0_sp0.9 from training. Duration: 0.788875 2023-10-03 11:58:53,663 WARNING [train.py:1204] (2/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_178-39722-0 from training. Duration: 0.49 2023-10-03 11:58:58,659 WARNING [train.py:1204] (2/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_428-181925-0_sp1.1 from training. Duration: 0.963625 2023-10-03 11:58:58,663 WARNING [train.py:1204] (2/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_770-200430-0 from training. Duration: 0.89 2023-10-03 11:58:58,714 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_1104-271009-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 11:59:00,050 WARNING [train.py:1204] (2/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_128-296581-0_sp0.9 from training. Duration: 0.82225 2023-10-03 11:59:04,337 WARNING [train.py:1204] (2/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_19-62511-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 11:59:05,761 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_805-71497-0 from training. Duration: 0.71 2023-10-03 11:59:07,054 INFO [train.py:1046] (2/4) Epoch 36, batch 2950, loss[loss=0.1581, simple_loss=0.231, pruned_loss=0.04256, over 23551.00 frames. ], tot_loss[loss=0.1588, simple_loss=0.2378, pruned_loss=0.03993, over 4681590.74 frames. ], batch size: 256, lr: 2.83e-03, grad_scale: 8.0 2023-10-03 11:59:07,138 WARNING [train.py:1204] (2/4) Exclude cut with ID _44778_210_2054_1_1533798127203_7443090_573-152077-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 11:59:07,150 WARNING [train.py:1204] (2/4) Exclude cut with ID _42875_210_14856_1_1533704397487_4012433_393-177816-0_sp1.1 from training. Duration: 0.963625 2023-10-03 11:59:08,556 WARNING [train.py:1204] (2/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_278-35159-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 11:59:09,408 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.conv_skip_rate, batch_count=1259166.6666666667, ans=0.0 2023-10-03 11:59:10,709 WARNING [train.py:1204] (2/4) Exclude cut with ID _15037_210_2253_1_1530752294104_3267650_13-130361-0_sp1.1 from training. Duration: 0.9 2023-10-03 11:59:13,356 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_3019_210_2028_1_1526044245_3264865_127-135615-0 from training. Duration: 0.94 2023-10-03 11:59:13,414 WARNING [train.py:1204] (2/4) Exclude cut with ID _28317_210_12742_1_1532480302381_8510325_607-213951-0 from training. Duration: 0.84 2023-10-03 11:59:14,780 WARNING [train.py:1204] (2/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_436-318113-0_sp0.9 from training. Duration: 0.8333125 2023-10-03 11:59:14,781 WARNING [train.py:1204] (2/4) Exclude cut with ID _13938_210_9988_1_1530518718978_3693960_148-152225-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 11:59:20,398 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_2541_210_1794_1_1525590269_3084765_156-173713-0_sp1.1 from training. Duration: 0.736375 2023-10-03 11:59:21,875 WARNING [train.py:1204] (2/4) Exclude cut with ID _28323_210_16187_1_1532480395667_4100300_531-24157-0_sp1.1 from training. Duration: 0.8909375 2023-10-03 11:59:23,252 WARNING [train.py:1204] (2/4) Exclude cut with ID _29303_210_2575_1_1532392237303_7410649_382-217492-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 11:59:23,291 WARNING [train.py:1204] (2/4) Exclude cut with ID _58895_210_8750_1_1535184081320_3649089_270-158013-0_sp1.1 from training. Duration: 0.92725 2023-10-03 11:59:26,576 WARNING [train.py:1204] (2/4) Exclude cut with ID _29277_210_3279_1_1532581109293_3173069_311-206227-0_sp1.1 from training. Duration: 0.936375 2023-10-03 11:59:26,586 WARNING [train.py:1204] (2/4) Exclude cut with ID _7160_210_1876_1_1527940923231_3703799_282-322611-0_sp0.9 from training. Duration: 0.9666875 2023-10-03 11:59:29,755 WARNING [train.py:1204] (2/4) Exclude cut with ID _53219_210_4525_1_1534834836008_4064825_562-32295-0_sp1.1 from training. Duration: 0.963625 2023-10-03 11:59:29,861 WARNING [train.py:1204] (2/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_408-87330-0_sp1.1 from training. Duration: 0.963625 2023-10-03 11:59:29,869 WARNING [train.py:1204] (2/4) Exclude cut with ID _59202_210_26984_1_1534816810612_3549174_137-318710-0_sp1.1 from training. Duration: 0.8090625 2023-10-03 11:59:32,527 WARNING [train.py:1204] (2/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_372-30302-0 from training. Duration: 0.86 2023-10-03 11:59:36,561 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.651e+02 1.912e+02 2.122e+02 2.362e+02 3.535e+02, threshold=4.243e+02, percent-clipped=0.0 2023-10-03 11:59:36,666 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7543_210_6753_1_1528541907_1399322_178-56269-0_sp1.1 from training. Duration: 0.95275 2023-10-03 11:59:36,690 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9109W0012-549758 from training. Duration: 0.512 2023-10-03 11:59:36,826 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.1.balancer1.prob, batch_count=1259300.0, ans=0.125 2023-10-03 11:59:37,978 WARNING [train.py:1204] (2/4) Exclude cut with ID _5410_210_1388_1_1527397055795_1719020_39-112221-0_sp0.9 from training. Duration: 0.7666875 2023-10-03 11:59:39,377 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9121W0012-554933 from training. Duration: 0.896 2023-10-03 11:59:41,312 WARNING [train.py:1204] (2/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_476-272757-0 from training. Duration: 0.93 2023-10-03 11:59:41,331 WARNING [train.py:1204] (2/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_1085-171718-0_sp1.1 from training. Duration: 0.92725 2023-10-03 11:59:42,815 WARNING [train.py:1204] (2/4) Exclude cut with ID _8480_210_1874_1_1528589391205_6728330_309-139896-0_sp1.1 from training. Duration: 0.736375 2023-10-03 11:59:42,816 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9130W0113-559703 from training. Duration: 0.896 2023-10-03 11:59:42,820 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_292-265928-0_sp1.1 from training. Duration: 0.463625 2023-10-03 11:59:45,437 WARNING [train.py:1204] (2/4) Exclude cut with ID _8772_210_1850_1_1528635595972_3664011_96-12756-0 from training. Duration: 0.98 2023-10-03 11:59:45,512 WARNING [train.py:1204] (2/4) Exclude cut with ID _12556_210_8341_1_1530081024100_7327690_725-193477-0_sp1.1 from training. Duration: 0.8909375 2023-10-03 11:59:46,887 WARNING [train.py:1204] (2/4) Exclude cut with ID _5289_210_2084_1_1527678202736_3779299_142-103317-0_sp1.1 from training. Duration: 0.763625 2023-10-03 11:59:49,636 WARNING [train.py:1204] (2/4) Exclude cut with ID _45012_210_12651_1_1533608435136_5371380_206-51075-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 11:59:51,129 WARNING [train.py:1204] (2/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_176-56120-0_sp1.1 from training. Duration: 0.7545625 2023-10-03 11:59:52,478 WARNING [train.py:1204] (2/4) Exclude cut with ID _40284_210_2084_1_1533513679814_3704279_220-201864-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 11:59:52,505 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9165W0004-576638 from training. Duration: 0.896 2023-10-03 11:59:53,834 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_5438_210_1794_1_1527336687_2600009_218-343336-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 11:59:53,869 WARNING [train.py:1204] (2/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_318-162017-0 from training. Duration: 0.91 2023-10-03 11:59:58,737 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.3.encoder.layers.0.self_attn_weights.whiten_keys, num_groups=8, num_channels=256, metric=5.05 vs. limit=6.0 2023-10-03 11:59:59,955 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_49544_210_1794_1_1534379770_5262083_483-349298-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 12:00:01,850 WARNING [train.py:1204] (2/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_197-28164-0_sp0.9 from training. Duration: 0.888875 2023-10-03 12:00:03,227 WARNING [train.py:1204] (2/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_643-52007-0 from training. Duration: 0.57 2023-10-03 12:00:03,245 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_38965_210_4852_1_1533174906_3484735_403-332963-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 12:00:04,597 WARNING [train.py:1204] (2/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_340-174176-0 from training. Duration: 0.49 2023-10-03 12:00:07,421 WARNING [train.py:1204] (2/4) Exclude cut with ID _56708_210_13177_1_1535252351282_3422899_363-917-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 12:00:07,617 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.feed_forward1.hidden_balancer.prob, batch_count=1259433.3333333333, ans=0.125 2023-10-03 12:00:08,872 WARNING [train.py:1204] (2/4) Exclude cut with ID _19896_210_1883_1_1531709022662_6649569_525-159063-0_sp1.1 from training. Duration: 0.936375 2023-10-03 12:00:10,207 WARNING [train.py:1204] (2/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_430-116139-0_sp0.9 from training. Duration: 0.9444375 2023-10-03 12:00:11,583 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_53753_210_7234_1_1534320003_3594148_86-329250-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 12:00:12,854 WARNING [train.py:1204] (2/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_406-275649-0_sp1.1 from training. Duration: 0.3818125 2023-10-03 12:00:14,217 WARNING [train.py:1204] (2/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_1083-147536-0_sp1.1 from training. Duration: 0.8818125 2023-10-03 12:00:15,603 WARNING [train.py:1204] (2/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_54-311741-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 12:00:15,615 WARNING [train.py:1204] (2/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_237-58433-0_sp0.9 from training. Duration: 0.82225 2023-10-03 12:00:15,660 WARNING [train.py:1204] (2/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_98-51676-0_sp1.1 from training. Duration: 0.663625 2023-10-03 12:00:17,558 WARNING [train.py:1204] (2/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_386-270472-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 12:00:17,642 WARNING [train.py:1204] (2/4) Exclude cut with ID _19023_210_4353_1_1531378892552_5820780_83-254794-0_sp1.1 from training. Duration: 0.7818125 2023-10-03 12:00:18,956 WARNING [train.py:1204] (2/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_314-93656-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 12:00:18,976 WARNING [train.py:1204] (2/4) Exclude cut with ID _14170_210_5219_1_1530525949868_4469650_336-213530-0 from training. Duration: 0.9 2023-10-03 12:00:19,281 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.attention_skip_rate, batch_count=1259500.0, ans=0.0 2023-10-03 12:00:20,328 INFO [train.py:1046] (2/4) Epoch 36, batch 3000, loss[loss=0.1609, simple_loss=0.245, pruned_loss=0.03835, over 24035.00 frames. ], tot_loss[loss=0.159, simple_loss=0.2383, pruned_loss=0.0398, over 4705935.90 frames. ], batch size: 80, lr: 2.83e-03, grad_scale: 8.0 2023-10-03 12:00:20,328 INFO [train.py:1069] (2/4) Computing validation loss 2023-10-03 12:00:31,797 INFO [train.py:1078] (2/4) Epoch 36, validation: loss=0.3578, simple_loss=0.2691, pruned_loss=0.2232, over 1125622.00 frames. 2023-10-03 12:00:31,797 INFO [train.py:1079] (2/4) Maximum memory allocated so far is 20967MB 2023-10-03 12:00:31,849 WARNING [train.py:1204] (2/4) Exclude cut with ID _36686_210_5665_1_1533778225913_3646410_65-221120-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 12:00:33,315 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_26433_210_12108_1_1532157158_4056480_271-68178-0_sp1.1 from training. Duration: 0.92725 2023-10-03 12:00:34,576 WARNING [train.py:1204] (2/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_46-207599-0_sp0.9 from training. Duration: 0.711125 2023-10-03 12:00:38,545 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9303W0114-637133 from training. Duration: 0.896 2023-10-03 12:00:38,587 WARNING [train.py:1204] (2/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_490-207861-0 from training. Duration: 0.98 2023-10-03 12:00:41,346 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6767_210_3273_1_1527855783_1718932_173-256601-0_sp0.9 from training. Duration: 0.888875 2023-10-03 12:00:42,784 WARNING [train.py:1204] (2/4) Exclude cut with ID _13921_210_9994_1_1530841782272_3503520_327-69677-0_sp1.1 from training. Duration: 0.8181875 2023-10-03 12:00:43,465 WARNING [train.py:1204] (2/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_508-335369-0 from training. Duration: 0.48 2023-10-03 12:00:43,497 WARNING [train.py:1204] (2/4) Exclude cut with ID _64631_210_27767_1_1535349580068_4165160_24-46567-0_sp1.1 from training. Duration: 0.963625 2023-10-03 12:00:49,030 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_89-35748-0_sp1.1 from training. Duration: 0.7181875 2023-10-03 12:00:58,554 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_26433_210_12108_1_1532157158_4056480_578-262107-0_sp1.1 from training. Duration: 0.9 2023-10-03 12:01:06,451 WARNING [train.py:1204] (2/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_129-337802-0 from training. Duration: 0.72 2023-10-03 12:01:08,402 WARNING [train.py:1204] (2/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_356-167816-0_sp0.9 from training. Duration: 0.8 2023-10-03 12:01:09,852 WARNING [train.py:1204] (2/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_137-116442-0_sp1.1 from training. Duration: 0.7090625 2023-10-03 12:01:11,157 WARNING [train.py:1204] (2/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_358-172940-0_sp1.1 from training. Duration: 0.963625 2023-10-03 12:01:11,177 WARNING [train.py:1204] (2/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_668-344147-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 12:01:12,638 WARNING [train.py:1204] (2/4) Exclude cut with ID _62337_210_12392_1_1535009273105_3412229_255-157667-0_sp1.1 from training. Duration: 0.936375 2023-10-03 12:01:12,642 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_486-151131-0 from training. Duration: 0.85 2023-10-03 12:01:14,092 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_53638_210_21581_1_1534297319_5808449_885-80172-0 from training. Duration: 0.96 2023-10-03 12:01:15,472 WARNING [train.py:1204] (2/4) Exclude cut with ID _50093_210_4220_1_1534417141207_3432299_105-183067-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 12:01:17,393 WARNING [train.py:1204] (2/4) Exclude cut with ID _7562_210_2084_1_1528887767916_3946229_345-169156-0_sp0.9 from training. Duration: 0.6444375 2023-10-03 12:01:18,822 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_4528_210_3273_1_1527076654_3276777_209-219804-0_sp0.9 from training. Duration: 0.7666875 2023-10-03 12:01:18,853 WARNING [train.py:1204] (2/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_35-153886-0_sp0.9 from training. Duration: 0.9333125 2023-10-03 12:01:20,186 WARNING [train.py:1204] (2/4) Exclude cut with ID _30800_210_22382_1_1532499347904_4515959_332-121925-0_sp1.1 from training. Duration: 0.92725 2023-10-03 12:01:20,188 WARNING [train.py:1204] (2/4) Exclude cut with ID _59202_210_26984_1_1534816810612_3549174_495-65515-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 12:01:22,949 WARNING [train.py:1204] (2/4) Exclude cut with ID _6274_210_2084_1_1527937583837_7494020_769-168302-0_sp1.1 from training. Duration: 0.6454375 2023-10-03 12:01:22,980 WARNING [train.py:1204] (2/4) Exclude cut with ID _24271_210_12658_1_1531987270022_7190519_708-71345-0_sp1.1 from training. Duration: 0.936375 2023-10-03 12:01:22,981 WARNING [train.py:1204] (2/4) Exclude cut with ID 3033-130750-0096-55598-0_sp0.9 from training. Duration: 0.92225 2023-10-03 12:01:23,293 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.balancer1.prob, batch_count=1259700.0, ans=0.125 2023-10-03 12:01:24,387 WARNING [train.py:1204] (2/4) Exclude cut with ID _14087_210_2253_1_1530529224512_3014029_219-224445-0_sp0.9 from training. Duration: 0.9333125 2023-10-03 12:01:24,590 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.bypass.scale_min, batch_count=1259700.0, ans=0.2 2023-10-03 12:01:27,204 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_174-277364-0 from training. Duration: 0.71 2023-10-03 12:01:29,820 WARNING [train.py:1204] (2/4) Exclude cut with ID _45021_210_9994_1_1533619751094_3330288_96-239010-0_sp1.1 from training. Duration: 0.82725 2023-10-03 12:01:29,851 WARNING [train.py:1204] (2/4) Exclude cut with ID _31602_210_5854_1_1532677021456_4458250_489-305116-0_sp1.1 from training. Duration: 0.97275 2023-10-03 12:01:29,887 WARNING [train.py:1204] (2/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_269-154726-0_sp0.9 from training. Duration: 0.9555625 2023-10-03 12:01:33,186 WARNING [train.py:1204] (2/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_126-54418-0_sp1.1 from training. Duration: 0.92725 2023-10-03 12:01:35,023 WARNING [train.py:1204] (2/4) Exclude cut with ID _42326_210_7770_1_1533814282531_3707269_182-224159-0_sp1.1 from training. Duration: 0.92725 2023-10-03 12:01:36,968 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7002_210_1794_1_1527939683_5157286_1-281264-0_sp0.9 from training. Duration: 0.488875 2023-10-03 12:01:37,006 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_86-16881-0 from training. Duration: 0.58 2023-10-03 12:01:37,038 WARNING [train.py:1204] (2/4) Exclude cut with ID _19514_210_9259_1_1531530071330_5084230_637-279094-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 12:01:37,071 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_26433_210_12108_1_1532157158_4056480_578-262107-0 from training. Duration: 0.99 2023-10-03 12:01:38,395 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6291_210_3273_1_1527683649_1078498_82-221196-0_sp1.1 from training. Duration: 0.6909375 2023-10-03 12:01:41,011 WARNING [train.py:1204] (2/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_402-102311-0 from training. Duration: 0.67 2023-10-03 12:01:42,565 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1015-343884-0_sp1.1 from training. Duration: 0.563625 2023-10-03 12:01:43,990 WARNING [train.py:1204] (2/4) Exclude cut with ID _13684_210_7497_1_1530619354863_3598359_312-235606-0_sp1.1 from training. Duration: 0.5454375 2023-10-03 12:01:44,025 WARNING [train.py:1204] (2/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_55-31904-0 from training. Duration: 0.88 2023-10-03 12:01:44,610 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.4.encoder.layers.0.feed_forward3.out_whiten, num_groups=1, num_channels=384, metric=14.20 vs. limit=15.0 2023-10-03 12:01:45,286 INFO [train.py:1046] (2/4) Epoch 36, batch 3050, loss[loss=0.16, simple_loss=0.2288, pruned_loss=0.04558, over 23812.00 frames. ], tot_loss[loss=0.1596, simple_loss=0.2393, pruned_loss=0.04001, over 4710210.46 frames. ], batch size: 179, lr: 2.83e-03, grad_scale: 8.0 2023-10-03 12:01:45,358 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_3133_210_2087_1_1526106446_2324690_25-332138-0 from training. Duration: 0.91 2023-10-03 12:01:45,360 WARNING [train.py:1204] (2/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_912-282412-0_sp0.9 from training. Duration: 0.5444375 2023-10-03 12:01:45,426 WARNING [train.py:1204] (2/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_458-299435-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 12:01:47,456 WARNING [train.py:1204] (2/4) Exclude cut with ID _69584_210_5219_1_1535785133775_4044450_275-88403-0_sp1.1 from training. Duration: 0.92725 2023-10-03 12:01:47,466 WARNING [train.py:1204] (2/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_841-341679-0_sp1.1 from training. Duration: 0.52725 2023-10-03 12:01:47,472 WARNING [train.py:1204] (2/4) Exclude cut with ID _41391_210_7994_1_1533603771872_3638201_1-54521-0_sp1.1 from training. Duration: 0.97275 2023-10-03 12:01:47,522 WARNING [train.py:1204] (2/4) Exclude cut with ID _56235_210_10640_1_1534557335235_4161099_495-186609-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 12:01:51,515 WARNING [train.py:1204] (2/4) Exclude cut with ID _4681_210_3525_1_1527250689464_120889_15-126760-0 from training. Duration: 0.81 2023-10-03 12:01:52,928 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_3019_210_2028_1_1526044245_3264865_127-135615-0_sp1.1 from training. Duration: 0.8545625 2023-10-03 12:01:54,423 WARNING [train.py:1204] (2/4) Exclude cut with ID _30191_210_1664_1_1533189915047_7196670_915-21561-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 12:01:55,337 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.0.layers.1.feed_forward3.out_whiten, num_groups=1, num_channels=192, metric=8.79 vs. limit=15.0 2023-10-03 12:01:55,786 WARNING [train.py:1204] (2/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_6-214815-0_sp0.9 from training. Duration: 0.9444375 2023-10-03 12:01:58,574 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_45399_210_4852_1_1533624725_7579737_190-137494-0_sp1.1 from training. Duration: 0.97275 2023-10-03 12:01:59,964 WARNING [train.py:1204] (2/4) Exclude cut with ID _41314_210_12651_1_1533459476543_3766460_207-132038-0 from training. Duration: 0.94 2023-10-03 12:02:06,226 WARNING [train.py:1204] (2/4) Exclude cut with ID _14603_210_4220_1_1530615688441_3097319_90-31383-0 from training. Duration: 0.87 2023-10-03 12:02:06,259 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_15517_210_6753_1_1530952293_1664795_119-31001-0 from training. Duration: 0.94 2023-10-03 12:02:06,301 WARNING [train.py:1204] (2/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_684-85342-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 12:02:09,606 WARNING [train.py:1204] (2/4) Exclude cut with ID _14603_210_4220_1_1530615688441_3097319_97-325053-0_sp1.1 from training. Duration: 0.836375 2023-10-03 12:02:12,457 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_68702_210_23689_1_1535799577_3420068_168-25150-0_sp1.1 from training. Duration: 0.97275 2023-10-03 12:02:12,467 WARNING [train.py:1204] (2/4) Exclude cut with ID _65134_210_14763_1_1535335393985_3505368_111-27984-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 12:02:12,525 WARNING [train.py:1204] (2/4) Exclude cut with ID _48678_210_5663_1_1533988830947_3769199_276-226230-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 12:02:15,670 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.645e+02 1.931e+02 2.120e+02 2.392e+02 4.197e+02, threshold=4.241e+02, percent-clipped=0.0 2023-10-03 12:02:15,799 WARNING [train.py:1204] (2/4) Exclude cut with ID _28427_210_4220_1_1532581240967_3061069_43-84928-0_sp1.1 from training. Duration: 0.9 2023-10-03 12:02:15,835 WARNING [train.py:1204] (2/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_158-250116-0_sp1.1 from training. Duration: 0.563625 2023-10-03 12:02:17,108 WARNING [train.py:1204] (2/4) Exclude cut with ID _66873_210_5517_1_1535454136506_4293370_279-332556-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 12:02:17,162 WARNING [train.py:1204] (2/4) Exclude cut with ID _42863_210_9070_1_1533902398788_3696920_488-230146-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 12:02:17,162 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_387-247159-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 12:02:19,908 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6729_210_2550_1_1528032481_1788725_261-42619-0_sp1.1 from training. Duration: 0.97275 2023-10-03 12:02:20,242 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.conv_module1.balancer2.prob, batch_count=1259966.6666666667, ans=0.125 2023-10-03 12:02:21,363 WARNING [train.py:1204] (2/4) Exclude cut with ID _19896_210_1883_1_1531709022662_6649569_831-44724-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 12:02:23,155 WARNING [train.py:1204] (2/4) Exclude cut with ID _55153_210_2253_1_1534384786677_3162096_280-285944-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 12:02:24,434 WARNING [train.py:1204] (2/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_448-4048-0 from training. Duration: 0.96 2023-10-03 12:02:24,489 WARNING [train.py:1204] (2/4) Exclude cut with ID _29678_210_7893_1_1532421042787_3906118_456-202548-0_sp1.1 from training. Duration: 0.97275 2023-10-03 12:02:24,509 WARNING [train.py:1204] (2/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_107-332398-0_sp1.1 from training. Duration: 0.7090625 2023-10-03 12:02:27,925 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.3.encoder.layers.1.whiten, num_groups=1, num_channels=512, metric=8.45 vs. limit=12.0 2023-10-03 12:02:28,678 WARNING [train.py:1204] (2/4) Exclude cut with ID _13921_210_9994_1_1530838702800_3003429_46-149444-0_sp1.1 from training. Duration: 0.8545625 2023-10-03 12:02:28,711 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_257-20733-0_sp0.9 from training. Duration: 0.8444375 2023-10-03 12:02:30,050 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_53753_210_7234_1_1534320003_3594148_385-234809-0_sp1.1 from training. Duration: 0.963625 2023-10-03 12:02:30,095 WARNING [train.py:1204] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_292-215346-0_sp1.1 from training. Duration: 0.8818125 2023-10-03 12:02:35,255 WARNING [train.py:1204] (2/4) Exclude cut with ID _68965_210_1883_1_1535698962272_3497490_196-124268-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 12:02:35,309 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_23801_210_16223_1_1532066392_3294657_570-5439-0_sp1.1 from training. Duration: 0.8818125 2023-10-03 12:02:42,519 WARNING [train.py:1204] (2/4) Exclude cut with ID _5166_210_2084_1_1527073197160_4087993_349-6928-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 12:02:42,704 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.conv_module1.balancer1.prob, batch_count=1260033.3333333333, ans=0.125 2023-10-03 12:02:43,895 WARNING [train.py:1204] (2/4) Exclude cut with ID _60621_210_17071_1_1535013053530_3671739_226-255202-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 12:02:43,898 WARNING [train.py:1204] (2/4) Exclude cut with ID _41817_210_18835_1_1533434477160_7423049_448-130302-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 12:02:45,381 WARNING [train.py:1204] (2/4) Exclude cut with ID _6653_210_2629_1_1527919172441_6732679_758-143677-0_sp1.1 from training. Duration: 0.8909375 2023-10-03 12:02:47,225 WARNING [train.py:1204] (2/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_912-47473-0_sp0.9 from training. Duration: 0.7555625 2023-10-03 12:02:48,531 WARNING [train.py:1204] (2/4) Exclude cut with ID _6694_210_4599_1_1527852637490_3965529_356-169233-0_sp1.1 from training. Duration: 0.9 2023-10-03 12:02:48,608 WARNING [train.py:1204] (2/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_1-99680-0 from training. Duration: 0.67 2023-10-03 12:02:49,884 WARNING [train.py:1204] (2/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_199-86432-0_sp1.1 from training. Duration: 0.8909375 2023-10-03 12:02:49,912 WARNING [train.py:1204] (2/4) Exclude cut with ID _61148_210_6897_1_1535094022726_4217610_362-182430-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 12:02:51,282 WARNING [train.py:1204] (2/4) Exclude cut with ID _49376_210_5986_1_1533985278732_3370740_301-154763-0 from training. Duration: 0.98 2023-10-03 12:02:52,741 WARNING [train.py:1204] (2/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_572-185150-0_sp1.1 from training. Duration: 0.8818125 2023-10-03 12:02:55,898 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.bypass_mid.scale_min, batch_count=1260100.0, ans=0.2 2023-10-03 12:02:57,644 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.3.encoder.layers.2.feed_forward1.out_whiten, num_groups=1, num_channels=512, metric=7.25 vs. limit=15.0 2023-10-03 12:02:58,279 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_47896_210_20508_1_1534492518_5958868_558-314572-0_sp1.1 from training. Duration: 0.8818125 2023-10-03 12:02:58,479 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.nonlin_attention.balancer.min_positive, batch_count=1260166.6666666667, ans=0.05 2023-10-03 12:02:59,596 INFO [train.py:1046] (2/4) Epoch 36, batch 3100, loss[loss=0.1619, simple_loss=0.2581, pruned_loss=0.03289, over 24696.00 frames. ], tot_loss[loss=0.1598, simple_loss=0.2392, pruned_loss=0.04022, over 4708360.48 frames. ], batch size: 73, lr: 2.83e-03, grad_scale: 8.0 2023-10-03 12:02:59,675 WARNING [train.py:1204] (2/4) Exclude cut with ID _45560_210_21982_1_1533812603951_3584999_111-6376-0_sp0.9 from training. Duration: 0.9333125 2023-10-03 12:03:02,553 WARNING [train.py:1204] (2/4) Exclude cut with ID _14170_210_5219_1_1530525949868_4469650_412-317077-0_sp1.1 from training. Duration: 0.6818125 2023-10-03 12:03:05,627 WARNING [train.py:1204] (2/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_718-68505-0 from training. Duration: 0.89 2023-10-03 12:03:07,043 WARNING [train.py:1204] (2/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_77-311404-0 from training. Duration: 0.94 2023-10-03 12:03:08,459 WARNING [train.py:1204] (2/4) Exclude cut with ID _60229_210_27767_1_1534838406158_3655350_264-279573-0 from training. Duration: 0.83 2023-10-03 12:03:08,567 WARNING [train.py:1204] (2/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_106-300985-0_sp1.1 from training. Duration: 0.8181875 2023-10-03 12:03:11,950 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_4183_210_2087_1_1526729100_1869838_79-277189-0_sp1.1 from training. Duration: 0.963625 2023-10-03 12:03:11,961 WARNING [train.py:1204] (2/4) Exclude cut with ID _28427_210_4220_1_1532581240967_3061069_195-295268-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 12:03:13,608 WARNING [train.py:1204] (2/4) Exclude cut with ID _4092_210_2151_1_1526643044449_3135759_356-278694-0_sp0.9 from training. Duration: 0.52225 2023-10-03 12:03:16,905 WARNING [train.py:1204] (2/4) Exclude cut with ID _14459_210_2228_1_1531443641946_7516700_777-71446-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 12:03:22,514 WARNING [train.py:1204] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_520-126729-0 from training. Duration: 0.7 2023-10-03 12:03:26,663 WARNING [train.py:1204] (2/4) Exclude cut with ID _13684_210_7497_1_1530619354863_3598359_312-235606-0_sp0.9 from training. Duration: 0.6666875 2023-10-03 12:03:27,940 WARNING [train.py:1204] (2/4) Exclude cut with ID _37537_210_2253_1_1533171758568_3421949_283-349402-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 12:03:29,257 WARNING [train.py:1204] (2/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_29-117932-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 12:03:29,292 WARNING [train.py:1204] (2/4) Exclude cut with ID _29303_210_2575_1_1532392237303_7410649_721-283351-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 12:03:30,671 WARNING [train.py:1204] (2/4) Exclude cut with ID _14886_210_4220_1_1530702020355_3516190_175-229073-0_sp0.9 from training. Duration: 0.5 2023-10-03 12:03:32,579 WARNING [train.py:1204] (2/4) Exclude cut with ID _15458_210_8341_1_1530858614263_3834410_70-268830-0_sp1.1 from training. Duration: 0.97275 2023-10-03 12:03:32,595 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_2295_210_2087_1_1525500993_2407863_234-83104-0 from training. Duration: 0.62 2023-10-03 12:03:32,596 WARNING [train.py:1204] (2/4) Exclude cut with ID _22961_210_8254_1_1531976366672_4278180_317-199546-0_sp1.1 from training. Duration: 0.7818125 2023-10-03 12:03:34,024 WARNING [train.py:1204] (2/4) Exclude cut with ID _27894_210_12392_1_1532415496078_6902439_547-196022-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 12:03:34,714 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.conv_module2.whiten.whitening_limit, batch_count=1260300.0, ans=15.0 2023-10-03 12:03:35,404 WARNING [train.py:1204] (2/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_46-207599-0 from training. Duration: 0.64 2023-10-03 12:03:35,527 WARNING [train.py:1204] (2/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_426-326246-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 12:03:39,770 WARNING [train.py:1204] (2/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_445-117398-0_sp0.9 from training. Duration: 0.988875 2023-10-03 12:03:39,930 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.1.conv_skip_rate, batch_count=1260300.0, ans=0.0 2023-10-03 12:03:41,642 WARNING [train.py:1204] (2/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_563-347832-0 from training. Duration: 0.89 2023-10-03 12:03:43,059 WARNING [train.py:1204] (2/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_252-216674-0 from training. Duration: 0.54 2023-10-03 12:03:43,276 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.nonlin_attention.balancer.prob, batch_count=1260366.6666666667, ans=0.125 2023-10-03 12:03:44,403 WARNING [train.py:1204] (2/4) Exclude cut with ID _59611_210_8895_1_1534741198240_3650361_187-212779-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 12:03:45,896 WARNING [train.py:1204] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_282-159367-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 12:03:47,883 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_68702_210_23689_1_1535799577_3420068_33-136883-0_sp1.1 from training. Duration: 0.936375 2023-10-03 12:03:47,903 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_47228_210_4983_1_1533780021_3672027_184-293706-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 12:03:49,215 WARNING [train.py:1204] (2/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_656-188361-0_sp0.9 from training. Duration: 0.9666875 2023-10-03 12:03:49,337 WARNING [train.py:1204] (2/4) Exclude cut with ID _4866_210_2577_1_1527404412284_4364659_352-157930-0_sp0.9 from training. Duration: 0.9 2023-10-03 12:03:49,338 WARNING [train.py:1204] (2/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_210-106047-0_sp1.1 from training. Duration: 0.8818125 2023-10-03 12:03:52,039 WARNING [train.py:1204] (2/4) Exclude cut with ID _9389_210_1664_1_1529217337301_5289269_289-213417-0_sp1.1 from training. Duration: 0.8090625 2023-10-03 12:03:52,075 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_3910_210_1794_1_1526777667_3747260_178-41186-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 12:03:52,079 WARNING [train.py:1204] (2/4) Exclude cut with ID _27052_210_5986_1_1532329197187_3585009_24-172263-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 12:03:52,083 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_5522_210_3273_1_1527249606_3909096_318-203153-0_sp1.1 from training. Duration: 0.3909375 2023-10-03 12:03:57,624 WARNING [train.py:1204] (2/4) Exclude cut with ID _27894_210_12392_1_1532415496078_6902439_222-51982-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 12:03:57,730 WARNING [train.py:1204] (2/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_87-131427-0 from training. Duration: 0.78 2023-10-03 12:04:00,316 WARNING [train.py:1204] (2/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_372-298093-0_sp0.9 from training. Duration: 0.888875 2023-10-03 12:04:01,636 WARNING [train.py:1204] (2/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_663-166973-0 from training. Duration: 0.8 2023-10-03 12:04:01,669 WARNING [train.py:1204] (2/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_429-177469-0_sp1.1 from training. Duration: 0.936375 2023-10-03 12:04:01,703 WARNING [train.py:1204] (2/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_724-172651-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 12:04:01,736 WARNING [train.py:1204] (2/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_103-336064-0 from training. Duration: 0.51 2023-10-03 12:04:08,456 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.2.encoder.layers.2.feed_forward3.out_whiten, num_groups=1, num_channels=384, metric=16.07 vs. limit=15.0 2023-10-03 12:04:11,096 WARNING [train.py:1204] (2/4) Exclude cut with ID _48677_210_5663_1_1533902405919_3892989_111-11439-0 from training. Duration: 0.96 2023-10-03 12:04:14,230 INFO [train.py:1046] (2/4) Epoch 36, batch 3150, loss[loss=0.1649, simple_loss=0.2468, pruned_loss=0.04147, over 23493.00 frames. ], tot_loss[loss=0.1588, simple_loss=0.2377, pruned_loss=0.03999, over 4706598.56 frames. ], batch size: 106, lr: 2.83e-03, grad_scale: 8.0 2023-10-03 12:04:14,320 WARNING [train.py:1204] (2/4) Exclude cut with ID _8085_210_4557_1_1528455633493_3716100_95-103398-0_sp1.1 from training. Duration: 0.963625 2023-10-03 12:04:14,403 WARNING [train.py:1204] (2/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_1041-110837-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 12:04:17,618 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_50050_210_16223_1_1535435796_7290138_1155-121757-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 12:04:17,620 WARNING [train.py:1204] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_602-275260-0_sp0.9 from training. Duration: 0.911125 2023-10-03 12:04:17,689 WARNING [train.py:1204] (2/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_265-164486-0 from training. Duration: 0.96 2023-10-03 12:04:17,776 WARNING [train.py:1204] (2/4) Exclude cut with ID _46067_210_4220_1_1534125571066_3307450_142-229375-0_sp1.1 from training. Duration: 0.963625 2023-10-03 12:04:19,071 WARNING [train.py:1204] (2/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_310-26186-0_sp1.1 from training. Duration: 0.636375 2023-10-03 12:04:19,164 WARNING [train.py:1204] (2/4) Exclude cut with ID _28461_210_2253_1_1532771775189_3041299_136-270644-0 from training. Duration: 0.93 2023-10-03 12:04:21,938 WARNING [train.py:1204] (2/4) Exclude cut with ID _14860_210_4915_1_1530702020340_7343240_438-286372-0_sp1.1 from training. Duration: 0.936375 2023-10-03 12:04:22,787 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.1.encoder.layers.1.conv_module2.whiten, num_groups=1, num_channels=256, metric=10.61 vs. limit=15.0 2023-10-03 12:04:24,631 WARNING [train.py:1204] (2/4) Exclude cut with ID IC0128W0004-63309 from training. Duration: 0.831 2023-10-03 12:04:24,761 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.0.conv_module2.balancer1.min_positive, batch_count=1260500.0, ans=0.025 2023-10-03 12:04:27,441 WARNING [train.py:1204] (2/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_135-174080-0 from training. Duration: 0.53 2023-10-03 12:04:28,737 WARNING [train.py:1204] (2/4) Exclude cut with ID _41326_210_5854_1_1533778076509_3492080_277-59511-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 12:04:28,811 WARNING [train.py:1204] (2/4) Exclude cut with ID IC0144W0112-71379 from training. Duration: 0.992 2023-10-03 12:04:30,240 WARNING [train.py:1204] (2/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_711-15688-0_sp0.9 from training. Duration: 0.47775 2023-10-03 12:04:31,622 WARNING [train.py:1204] (2/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_237-332027-0 from training. Duration: 0.95 2023-10-03 12:04:31,686 WARNING [train.py:1204] (2/4) Exclude cut with ID _7970_210_1883_1_1528455616296_3472230_25-178407-0 from training. Duration: 0.71 2023-10-03 12:04:31,687 WARNING [train.py:1204] (2/4) Exclude cut with ID _8480_210_1874_1_1528589391205_6728330_309-139896-0 from training. Duration: 0.81 2023-10-03 12:04:31,704 WARNING [train.py:1204] (2/4) Exclude cut with ID _39571_210_13449_1_1533801584143_3651077_319-268877-0_sp1.1 from training. Duration: 0.936375 2023-10-03 12:04:31,707 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_43802_210_2290_1_1533516203_8829504_155-246880-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 12:04:34,930 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_566-122599-0_sp1.1 from training. Duration: 0.936375 2023-10-03 12:04:35,048 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_12868_210_6753_1_1530701932_530275_18-76322-0 from training. Duration: 0.87 2023-10-03 12:04:36,453 WARNING [train.py:1204] (2/4) Exclude cut with ID _56494_210_4220_1_1535284837625_3033169_159-106035-0_sp1.1 from training. Duration: 0.963625 2023-10-03 12:04:37,739 WARNING [train.py:1204] (2/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_98-276013-0_sp1.1 from training. Duration: 0.963625 2023-10-03 12:04:37,778 WARNING [train.py:1204] (2/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_330-216124-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 12:04:37,939 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.1.bypass.skip_rate, batch_count=1260566.6666666667, ans=0.035 2023-10-03 12:04:41,072 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_177-58005-0_sp0.9 from training. Duration: 0.688875 2023-10-03 12:04:42,471 WARNING [train.py:1204] (2/4) Exclude cut with ID _56235_210_10640_1_1534557335235_4161099_343-139898-0 from training. Duration: 0.96 2023-10-03 12:04:44,296 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.558e+02 1.839e+02 2.057e+02 2.261e+02 3.088e+02, threshold=4.113e+02, percent-clipped=0.0 2023-10-03 12:04:44,411 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6961_210_2087_1_1527937168_6815011_395-185060-0_sp1.1 from training. Duration: 0.863625 2023-10-03 12:04:45,869 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7002_210_1794_1_1527939683_5157286_184-229630-0_sp0.9 from training. Duration: 0.82225 2023-10-03 12:04:46,556 WARNING [train.py:1204] (2/4) Exclude cut with ID _59170_210_6721_1_1534983633960_4203335_153-18895-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 12:04:47,808 WARNING [train.py:1204] (2/4) Exclude cut with ID _49643_210_7989_1_1534640244224_3939031_598-161926-0 from training. Duration: 0.9 2023-10-03 12:04:49,251 WARNING [train.py:1204] (2/4) Exclude cut with ID _4068_210_2629_1_1526697350395_1546259_224-288693-0 from training. Duration: 0.59 2023-10-03 12:04:50,577 WARNING [train.py:1204] (2/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_718-68505-0_sp1.1 from training. Duration: 0.8090625 2023-10-03 12:04:50,630 WARNING [train.py:1204] (2/4) Exclude cut with ID _4068_210_2629_1_1526697350395_1546259_224-288693-0_sp0.9 from training. Duration: 0.6555625 2023-10-03 12:04:50,646 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_2296_210_2087_1_1525503448_3397335_57-185430-0_sp0.9 from training. Duration: 0.6666875 2023-10-03 12:04:51,972 WARNING [train.py:1204] (2/4) Exclude cut with ID _15458_210_8341_1_1530858614263_3834410_49-235472-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 12:04:51,975 WARNING [train.py:1204] (2/4) Exclude cut with ID _64135_210_2253_1_1535677267876_7608972_498-5039-0_sp0.9 from training. Duration: 0.8666875 2023-10-03 12:04:53,333 WARNING [train.py:1204] (2/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_283-220372-0_sp0.9 from training. Duration: 0.62225 2023-10-03 12:04:53,342 WARNING [train.py:1204] (2/4) Exclude cut with ID _6473_210_1741_1_1527901307396_3815380_108-17335-0_sp0.9 from training. Duration: 0.72225 2023-10-03 12:04:54,664 WARNING [train.py:1204] (2/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_113-92608-0 from training. Duration: 0.54 2023-10-03 12:04:55,941 WARNING [train.py:1204] (2/4) Exclude cut with ID _14170_210_5219_1_1530525949868_4469650_272-249985-0_sp1.1 from training. Duration: 0.6090625 2023-10-03 12:04:55,950 WARNING [train.py:1204] (2/4) Exclude cut with ID _50376_210_13401_1_1534031502101_4217740_518-187390-0_sp1.1 from training. Duration: 0.97275 2023-10-03 12:04:56,130 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.1.bypass.scale_min, batch_count=1260633.3333333333, ans=0.2 2023-10-03 12:04:58,821 WARNING [train.py:1204] (2/4) Exclude cut with ID _59738_210_15379_1_1534849512562_3941470_165-248260-0_sp1.1 from training. Duration: 0.92725 2023-10-03 12:04:58,838 WARNING [train.py:1204] (2/4) Exclude cut with ID _23818_210_6228_1_1531917063884_3676920_400-343925-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 12:05:00,129 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6961_210_2087_1_1527937168_6815011_562-165518-0 from training. Duration: 0.77 2023-10-03 12:05:01,523 WARNING [train.py:1204] (2/4) Exclude cut with ID _24271_210_12658_1_1531987270022_7190519_289-207931-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 12:05:01,807 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.balancer1.prob, batch_count=1260700.0, ans=0.125 2023-10-03 12:05:02,884 WARNING [train.py:1204] (2/4) Exclude cut with ID _14632_210_9988_1_1530691279392_3923200_332-105904-0 from training. Duration: 0.9 2023-10-03 12:05:04,141 WARNING [train.py:1204] (2/4) Exclude cut with ID _40402_210_18219_1_1533522324115_2953150_203-232438-0_sp1.1 from training. Duration: 0.97275 2023-10-03 12:05:04,234 WARNING [train.py:1204] (2/4) Exclude cut with ID _13252_210_1925_1_1530671372047_3336909_263-168388-0 from training. Duration: 0.86 2023-10-03 12:05:05,574 WARNING [train.py:1204] (2/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_174-205058-0 from training. Duration: 0.95 2023-10-03 12:05:07,524 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_94-195801-0_sp1.1 from training. Duration: 0.8545625 2023-10-03 12:05:08,826 WARNING [train.py:1204] (2/4) Exclude cut with ID _55153_210_2253_1_1534384786677_3162096_269-103204-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 12:05:10,773 WARNING [train.py:1204] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_748-36150-0 from training. Duration: 0.91 2023-10-03 12:05:12,174 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_12868_210_6753_1_1530702479_3610564_148-172861-0_sp1.1 from training. Duration: 0.4545625 2023-10-03 12:05:12,229 WARNING [train.py:1204] (2/4) Exclude cut with ID _67499_210_17282_1_1535509805220_3692430_460-77730-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 12:05:14,271 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.2.encoder.layers.0.self_attn1.whiten, num_groups=1, num_channels=384, metric=19.81 vs. limit=22.5 2023-10-03 12:05:16,599 WARNING [train.py:1204] (2/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_374-20753-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 12:05:16,696 WARNING [train.py:1204] (2/4) Exclude cut with ID _45329_210_7005_1_1534157951211_6774339_374-289983-0_sp1.1 from training. Duration: 0.97275 2023-10-03 12:05:16,721 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_34886_210_10633_1_1533203882_1755661_76-156096-0_sp0.9 from training. Duration: 0.9666875 2023-10-03 12:05:21,558 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_238-256668-0_sp0.9 from training. Duration: 0.7666875 2023-10-03 12:05:22,925 WARNING [train.py:1204] (2/4) Exclude cut with ID _46683_210_9642_1_1533730913492_4591116_489-146727-0_sp1.1 from training. Duration: 0.97275 2023-10-03 12:05:24,269 WARNING [train.py:1204] (2/4) Exclude cut with ID _13938_210_9988_1_1530518718978_3693960_304-112784-0_sp0.9 from training. Duration: 0.511125 2023-10-03 12:05:28,481 INFO [train.py:1046] (2/4) Epoch 36, batch 3200, loss[loss=0.1714, simple_loss=0.2441, pruned_loss=0.04932, over 23946.00 frames. ], tot_loss[loss=0.1583, simple_loss=0.2374, pruned_loss=0.03958, over 4716730.46 frames. ], batch size: 195, lr: 2.83e-03, grad_scale: 16.0 2023-10-03 12:05:28,570 WARNING [train.py:1204] (2/4) Exclude cut with ID _24271_210_12658_1_1531987270022_7190519_243-46549-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 12:05:28,574 WARNING [train.py:1204] (2/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_78-182912-0_sp1.1 from training. Duration: 0.536375 2023-10-03 12:05:32,809 WARNING [train.py:1204] (2/4) Exclude cut with ID _57572_210_5517_1_1534921219624_3809484_121-321334-0_sp1.1 from training. Duration: 0.97275 2023-10-03 12:05:33,536 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.1.encoder.layers.0.self_attn_weights.whiten_keys, num_groups=4, num_channels=128, metric=3.57 vs. limit=6.0 2023-10-03 12:05:34,250 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_40047_210_2290_1_1533290519_2374311_254-102149-0_sp1.1 from training. Duration: 0.963625 2023-10-03 12:05:34,251 WARNING [train.py:1204] (2/4) Exclude cut with ID _28461_210_2253_1_1532771775189_3041299_96-152158-0 from training. Duration: 0.85 2023-10-03 12:05:36,927 WARNING [train.py:1204] (2/4) Exclude cut with ID _29877_210_6228_1_1532397791106_5276149_161-102979-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 12:05:39,019 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.ff2_skip_rate, batch_count=1260833.3333333333, ans=0.0 2023-10-03 12:05:41,632 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_3787_210_2040_1_1526477544_5478586_603-120324-0_sp0.9 from training. Duration: 0.9 2023-10-03 12:05:44,409 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_41750_210_1960_1_1533520678_3948042_247-213456-0_sp1.1 from training. Duration: 0.97275 2023-10-03 12:05:52,355 WARNING [train.py:1204] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_127-119109-0_sp0.9 from training. Duration: 0.8 2023-10-03 12:06:00,729 WARNING [train.py:1204] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_215-202973-0 from training. Duration: 0.88 2023-10-03 12:06:02,076 WARNING [train.py:1204] (2/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_17-320491-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 12:06:03,593 WARNING [train.py:1204] (2/4) Exclude cut with ID _5166_210_2084_1_1527073197160_4087993_310-114452-0 from training. Duration: 0.59 2023-10-03 12:06:03,662 WARNING [train.py:1204] (2/4) Exclude cut with ID _15133_210_6228_1_1530794512074_3080379_29-264458-0_sp0.9 from training. Duration: 0.6444375 2023-10-03 12:06:08,366 WARNING [train.py:1204] (2/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_159-281310-0_sp1.1 from training. Duration: 0.863625 2023-10-03 12:06:08,378 WARNING [train.py:1204] (2/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_195-167011-0_sp0.9 from training. Duration: 0.8333125 2023-10-03 12:06:08,532 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=1260966.6666666667, ans=0.1 2023-10-03 12:06:09,700 WARNING [train.py:1204] (2/4) Exclude cut with ID _14087_210_2253_1_1530529224512_3014029_156-257607-0_sp1.1 from training. Duration: 0.7545625 2023-10-03 12:06:13,725 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_2295_210_2087_1_1525500993_2407863_116-238894-0 from training. Duration: 0.7 2023-10-03 12:06:16,362 WARNING [train.py:1204] (2/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_321-335475-0_sp0.9 from training. Duration: 0.5 2023-10-03 12:06:18,358 WARNING [train.py:1204] (2/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_188-114464-0 from training. Duration: 0.82 2023-10-03 12:06:20,557 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_2592_210_2290_1_1525697025_4882925_157-70512-0 from training. Duration: 0.91 2023-10-03 12:06:23,138 WARNING [train.py:1204] (2/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_138-64210-0_sp1.1 from training. Duration: 0.663625 2023-10-03 12:06:28,581 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_11305_210_1794_1_1529750428_7178148_651-95702-0_sp1.1 from training. Duration: 0.963625 2023-10-03 12:06:28,601 WARNING [train.py:1204] (2/4) Exclude cut with ID _14224_210_4381_1_1530529062037_4264010_379-184223-0_sp0.9 from training. Duration: 0.9444375 2023-10-03 12:06:29,890 WARNING [train.py:1204] (2/4) Exclude cut with ID _8394_210_6702_1_1528590504316_3540189_381-125325-0_sp1.1 from training. Duration: 0.963625 2023-10-03 12:06:29,947 WARNING [train.py:1204] (2/4) Exclude cut with ID IC0622W0021-307659 from training. Duration: 0.896 2023-10-03 12:06:29,950 WARNING [train.py:1204] (2/4) Exclude cut with ID _7624_210_1531_1_1528109802198_4933101_418-349834-0_sp1.1 from training. Duration: 0.5454375 2023-10-03 12:06:34,220 WARNING [train.py:1204] (2/4) Exclude cut with ID _49728_210_12651_1_1534041034294_6788150_607-3416-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 12:06:34,324 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8275_210_4198_1_1528455320_3892571_491-17707-0 from training. Duration: 0.64 2023-10-03 12:06:36,220 WARNING [train.py:1204] (2/4) Exclude cut with ID _5287_210_2084_1_1527418883040_3878670_333-48508-0 from training. Duration: 0.6 2023-10-03 12:06:36,295 WARNING [train.py:1204] (2/4) Exclude cut with ID _8365_210_2064_1_1528592311070_4677690_371-122295-0 from training. Duration: 0.55 2023-10-03 12:06:37,644 WARNING [train.py:1204] (2/4) Exclude cut with ID _14860_210_4915_1_1530702020340_7343240_836-328489-0 from training. Duration: 0.9 2023-10-03 12:06:39,665 WARNING [train.py:1204] (2/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_1033-274376-0_sp0.9 from training. Duration: 0.9666875 2023-10-03 12:06:41,392 INFO [scaling.py:1118] (2/4) WithLoss: name=encoder.encoders.2.encoder.layers.1.self_attn_weights, loss-sum=0.000e+00 2023-10-03 12:06:42,436 INFO [train.py:1046] (2/4) Epoch 36, batch 3250, loss[loss=0.1588, simple_loss=0.2382, pruned_loss=0.03972, over 23462.00 frames. ], tot_loss[loss=0.158, simple_loss=0.2371, pruned_loss=0.03947, over 4710400.70 frames. ], batch size: 93, lr: 2.83e-03, grad_scale: 16.0 2023-10-03 12:06:42,551 WARNING [train.py:1204] (2/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_475-127455-0_sp0.9 from training. Duration: 0.77775 2023-10-03 12:06:42,557 WARNING [train.py:1204] (2/4) Exclude cut with ID IC0671W0071-331914 from training. Duration: 0.896 2023-10-03 12:06:42,585 WARNING [train.py:1204] (2/4) Exclude cut with ID _30327_210_16049_1_1532484161295_3924169_2-1523-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 12:06:42,587 WARNING [train.py:1204] (2/4) Exclude cut with ID _24314_210_4915_1_1532135078116_7170179_460-208279-0_sp1.1 from training. Duration: 0.97275 2023-10-03 12:06:42,770 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=1261166.6666666667, ans=0.1 2023-10-03 12:06:44,012 WARNING [train.py:1204] (2/4) Exclude cut with ID IC0679W0052-335859 from training. Duration: 0.64 2023-10-03 12:06:49,864 WARNING [train.py:1204] (2/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_3-237434-0_sp1.1 from training. Duration: 0.6909375 2023-10-03 12:06:51,897 WARNING [train.py:1204] (2/4) Exclude cut with ID _69884_210_9771_1_1535846392818_4166169_405-185283-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 12:06:56,133 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.0.feed_forward1.out_proj.dropout_p, batch_count=1261233.3333333333, ans=0.1 2023-10-03 12:07:00,292 WARNING [train.py:1204] (2/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_927-83684-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 12:07:01,682 WARNING [train.py:1204] (2/4) Exclude cut with ID _6694_210_4599_1_1527852637490_3965529_356-169233-0 from training. Duration: 0.99 2023-10-03 12:07:01,771 WARNING [train.py:1204] (2/4) Exclude cut with ID _19896_210_1883_1_1531709022662_6649569_562-233001-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 12:07:03,043 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_20120_210_6753_1_1531643035_5482859_354-78629-0_sp1.1 from training. Duration: 0.963625 2023-10-03 12:07:03,044 WARNING [train.py:1204] (2/4) Exclude cut with ID _60135_210_13427_1_1534935557922_3651319_96-168198-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 12:07:03,159 WARNING [train.py:1204] (2/4) Exclude cut with ID _31602_210_5854_1_1532677021456_4458250_249-96604-0_sp1.1 from training. Duration: 0.8090625 2023-10-03 12:07:04,566 WARNING [train.py:1204] (2/4) Exclude cut with ID _14100_210_8341_1_1530529320613_3678069_258-224599-0_sp0.9 from training. Duration: 0.8555625 2023-10-03 12:07:06,116 WARNING [train.py:1204] (2/4) Exclude cut with ID _19708_210_9206_1_1532136650733_4238364_215-160135-0_sp1.1 from training. Duration: 0.97275 2023-10-03 12:07:06,145 WARNING [train.py:1204] (2/4) Exclude cut with ID _21878_210_3525_1_1531738772002_7264330_113-18163-0_sp0.9 from training. Duration: 0.988875 2023-10-03 12:07:06,179 WARNING [train.py:1204] (2/4) Exclude cut with ID _64135_210_2253_1_1535677267876_7608972_552-337340-0_sp1.1 from training. Duration: 0.92725 2023-10-03 12:07:06,219 WARNING [train.py:1204] (2/4) Exclude cut with ID _67725_210_27767_1_1535608756671_5105359_10-12887-0_sp1.1 from training. Duration: 0.97275 2023-10-03 12:07:06,227 WARNING [train.py:1204] (2/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_401-284840-0_sp1.1 from training. Duration: 0.97275 2023-10-03 12:07:08,218 WARNING [train.py:1204] (2/4) Exclude cut with ID _44659_210_2721_1_1534036120061_6614860_483-141285-0_sp1.1 from training. Duration: 0.936375 2023-10-03 12:07:11,557 WARNING [train.py:1204] (2/4) Exclude cut with ID _9461_210_4557_1_1529060514726_3677451_358-332734-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 12:07:12,909 WARNING [train.py:1204] (2/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_788-298907-0_sp1.1 from training. Duration: 0.8090625 2023-10-03 12:07:14,172 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.672e+02 1.976e+02 2.164e+02 2.550e+02 4.020e+02, threshold=4.329e+02, percent-clipped=0.0 2023-10-03 12:07:14,287 WARNING [train.py:1204] (2/4) Exclude cut with ID _39411_210_5986_1_1533538825030_3343649_354-267196-0_sp1.1 from training. Duration: 0.92725 2023-10-03 12:07:14,314 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_14953_210_2151_1_1530701710_5616165_192-309792-0_sp1.1 from training. Duration: 0.97275 2023-10-03 12:07:16,182 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.3.encoder.layers.3.feed_forward2.out_whiten, num_groups=1, num_channels=512, metric=6.67 vs. limit=15.0 2023-10-03 12:07:17,058 WARNING [train.py:1204] (2/4) Exclude cut with ID _59170_210_6721_1_1534983633960_4203335_290-151255-0_sp1.1 from training. Duration: 0.92725 2023-10-03 12:07:17,083 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_5575_210_1410_1_1527301305_2570170_249-93769-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 12:07:17,092 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_46013_210_7234_1_1533776362_3744032_463-52378-0_sp1.1 from training. Duration: 0.7818125 2023-10-03 12:07:21,940 WARNING [train.py:1204] (2/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_580-93798-0 from training. Duration: 0.56 2023-10-03 12:07:23,279 WARNING [train.py:1204] (2/4) Exclude cut with ID _19514_210_9259_1_1531530071330_5084230_385-13762-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 12:07:23,284 WARNING [train.py:1204] (2/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_458-67321-0_sp1.1 from training. Duration: 0.8818125 2023-10-03 12:07:25,316 WARNING [train.py:1204] (2/4) Exclude cut with ID _41298_210_9019_1_1533430371450_4822143_188-154791-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 12:07:25,994 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.2.encoder.layers.1.conv_module1.whiten, num_groups=1, num_channels=384, metric=7.31 vs. limit=15.0 2023-10-03 12:07:26,446 WARNING [train.py:1204] (2/4) Exclude cut with ID _15037_210_2253_1_1530752294104_3267650_296-192597-0_sp0.9 from training. Duration: 0.9 2023-10-03 12:07:31,957 WARNING [train.py:1204] (2/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_661-264857-0_sp1.1 from training. Duration: 0.8454375 2023-10-03 12:07:34,210 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.3.encoder.layers.0.self_attn1.whiten, num_groups=1, num_channels=512, metric=15.11 vs. limit=22.5 2023-10-03 12:07:37,666 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_34008_210_10431_1_1533345424_8183247_662-317331-0_sp1.1 from training. Duration: 0.8545625 2023-10-03 12:07:39,556 WARNING [train.py:1204] (2/4) Exclude cut with ID _46502_210_13794_1_1533724452198_3657159_241-278329-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 12:07:39,557 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_11349_210_6753_1_1529800861_1101207_26-65602-0 from training. Duration: 0.96 2023-10-03 12:07:39,561 WARNING [train.py:1204] (2/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_448-4048-0_sp1.1 from training. Duration: 0.87275 2023-10-03 12:07:39,568 WARNING [train.py:1204] (2/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_662-275621-0_sp1.1 from training. Duration: 0.3818125 2023-10-03 12:07:39,616 WARNING [train.py:1204] (2/4) Exclude cut with ID _13938_210_9988_1_1530518718978_3693960_256-54454-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 12:07:41,136 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.ff3_skip_rate, batch_count=1261433.3333333333, ans=0.0 2023-10-03 12:07:42,930 WARNING [train.py:1204] (2/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_256-32662-0 from training. Duration: 0.9 2023-10-03 12:07:42,972 WARNING [train.py:1204] (2/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_326-17061-0 from training. Duration: 0.94 2023-10-03 12:07:43,010 WARNING [train.py:1204] (2/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_505-97987-0_sp1.1 from training. Duration: 0.7818125 2023-10-03 12:07:43,101 WARNING [train.py:1204] (2/4) Exclude cut with ID _66873_210_5517_1_1535454136506_4293370_316-294364-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 12:07:44,447 WARNING [train.py:1204] (2/4) Exclude cut with ID _19514_210_9259_1_1531530071330_5084230_762-231063-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 12:07:44,581 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.0.conv_module2.balancer1.prob, batch_count=1261433.3333333333, ans=0.125 2023-10-03 12:07:45,801 WARNING [train.py:1204] (2/4) Exclude cut with ID _4697_210_2069_1_1526815801953_3823098_68-219807-0_sp0.9 from training. Duration: 0.52225 2023-10-03 12:07:45,835 WARNING [train.py:1204] (2/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_479-158053-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 12:07:48,578 WARNING [train.py:1204] (2/4) Exclude cut with ID _66112_210_10635_1_1535590236305_3801484_83-38181-0_sp1.1 from training. Duration: 0.936375 2023-10-03 12:07:49,785 WARNING [train.py:1204] (2/4) Exclude cut with ID _55857_210_2346_1_1534644007831_6898209_255-146346-0_sp1.1 from training. Duration: 0.963625 2023-10-03 12:07:49,909 WARNING [train.py:1204] (2/4) Exclude cut with ID _48836_210_12210_1_1534075848293_3904150_429-35475-0 from training. Duration: 0.89 2023-10-03 12:07:51,251 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_50154_210_13731_1_1534750862_1080961_119-187163-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 12:07:53,088 WARNING [train.py:1204] (2/4) Exclude cut with ID _8793_210_6310_1_1528542985139_2310009_121-179782-0_sp0.9 from training. Duration: 0.888875 2023-10-03 12:07:53,091 WARNING [train.py:1204] (2/4) Exclude cut with ID _14884_210_2252_1_1530707459689_5561250_591-173406-0 from training. Duration: 0.96 2023-10-03 12:07:57,709 INFO [train.py:1046] (2/4) Epoch 36, batch 3300, loss[loss=0.1431, simple_loss=0.2189, pruned_loss=0.03362, over 24451.00 frames. ], tot_loss[loss=0.1578, simple_loss=0.2371, pruned_loss=0.03929, over 4726062.00 frames. ], batch size: 58, lr: 2.82e-03, grad_scale: 8.0 2023-10-03 12:07:57,805 WARNING [train.py:1204] (2/4) Exclude cut with ID _15133_210_6228_1_1530794512074_3080379_54-198193-0_sp1.1 from training. Duration: 0.8545625 2023-10-03 12:07:57,814 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6545_210_2040_1_1527774670_4245258_407-48070-0 from training. Duration: 0.76 2023-10-03 12:08:00,508 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1032-236863-0 from training. Duration: 0.84 2023-10-03 12:08:00,816 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.feed_forward3.hidden_balancer.prob, batch_count=1261500.0, ans=0.125 2023-10-03 12:08:01,846 WARNING [train.py:1204] (2/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_507-303370-0 from training. Duration: 0.96 2023-10-03 12:08:01,869 WARNING [train.py:1204] (2/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_164-282366-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 12:08:04,663 WARNING [train.py:1204] (2/4) Exclude cut with ID _39867_210_9826_1_1533553222030_9344259_312-158727-0_sp1.1 from training. Duration: 0.963625 2023-10-03 12:08:05,973 WARNING [train.py:1204] (2/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_174-205058-0_sp1.1 from training. Duration: 0.863625 2023-10-03 12:08:06,454 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.5.encoder.layers.1.feed_forward2.out_whiten, num_groups=1, num_channels=256, metric=9.26 vs. limit=15.0 2023-10-03 12:08:07,288 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_11305_210_1794_1_1529750428_7178148_166-237358-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 12:08:08,642 WARNING [train.py:1204] (2/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_26-175553-0_sp1.1 from training. Duration: 0.5545625 2023-10-03 12:08:08,672 WARNING [train.py:1204] (2/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_96-290039-0_sp1.1 from training. Duration: 0.5090625 2023-10-03 12:08:11,744 WARNING [train.py:1204] (2/4) Exclude cut with ID _53219_210_4525_1_1534834836008_4064825_261-101691-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 12:08:11,865 WARNING [train.py:1204] (2/4) Exclude cut with ID _24271_210_12658_1_1531987270022_7190519_96-293390-0_sp1.1 from training. Duration: 0.936375 2023-10-03 12:08:14,423 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.2.encoder.layers.1.whiten, num_groups=1, num_channels=384, metric=6.90 vs. limit=12.0 2023-10-03 12:08:16,531 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_271-234962-0 from training. Duration: 0.79 2023-10-03 12:08:17,778 WARNING [train.py:1204] (2/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_136-30660-0_sp1.1 from training. Duration: 0.97275 2023-10-03 12:08:17,809 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_20120_210_6753_1_1531643035_5482859_625-281046-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 12:08:17,900 WARNING [train.py:1204] (2/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_333-168193-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 12:08:18,122 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.nonlin_attention.balancer.prob, batch_count=1261566.6666666667, ans=0.125 2023-10-03 12:08:19,153 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9029W0269-509664 from training. Duration: 0.896 2023-10-03 12:08:19,240 WARNING [train.py:1204] (2/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_450-147393-0_sp1.1 from training. Duration: 0.92725 2023-10-03 12:08:20,579 WARNING [train.py:1204] (2/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_643-52007-0_sp1.1 from training. Duration: 0.5181875 2023-10-03 12:08:21,784 WARNING [train.py:1204] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_330-327861-0_sp0.9 from training. Duration: 0.7444375 2023-10-03 12:08:21,793 WARNING [train.py:1204] (2/4) Exclude cut with ID _47790_210_16187_1_1533862814066_4207519_260-317651-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 12:08:21,827 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9044W0010-516654 from training. Duration: 0.896 2023-10-03 12:08:26,418 WARNING [train.py:1204] (2/4) Exclude cut with ID _14087_210_2253_1_1530529224512_3014029_265-118378-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 12:08:26,432 WARNING [train.py:1204] (2/4) Exclude cut with ID _6194_210_1534_1_1527511826160_3941194_321-148641-0_sp0.9 from training. Duration: 0.711125 2023-10-03 12:08:27,881 WARNING [train.py:1204] (2/4) Exclude cut with ID _54871_210_22356_1_1534665798253_3856359_101-169176-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 12:08:27,884 WARNING [train.py:1204] (2/4) Exclude cut with ID 497-129325-0061-62254-0_sp1.1 from training. Duration: 0.97725 2023-10-03 12:08:27,972 WARNING [train.py:1204] (2/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_795-33252-0_sp1.1 from training. Duration: 0.336375 2023-10-03 12:08:29,230 WARNING [train.py:1204] (2/4) Exclude cut with ID _28317_210_12742_1_1532480302381_8510325_168-246176-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 12:08:30,684 WARNING [train.py:1204] (2/4) Exclude cut with ID _7160_210_1876_1_1527940923231_3703799_120-150943-0_sp1.1 from training. Duration: 0.736375 2023-10-03 12:08:33,415 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9084W0005-536844 from training. Duration: 0.768 2023-10-03 12:08:34,899 WARNING [train.py:1204] (2/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_234-260682-0 from training. Duration: 0.84 2023-10-03 12:08:34,932 WARNING [train.py:1204] (2/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_188-114464-0_sp0.9 from training. Duration: 0.911125 2023-10-03 12:08:36,376 WARNING [train.py:1204] (2/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_81-75849-0 from training. Duration: 0.39 2023-10-03 12:08:37,847 WARNING [train.py:1204] (2/4) Exclude cut with ID _14224_210_4381_1_1530529062037_4264010_379-184223-0_sp1.1 from training. Duration: 0.77275 2023-10-03 12:08:41,264 WARNING [train.py:1204] (2/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_454-123002-0_sp1.1 from training. Duration: 0.62725 2023-10-03 12:08:42,456 WARNING [train.py:1204] (2/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_887-249555-0_sp1.1 from training. Duration: 0.7 2023-10-03 12:08:44,387 WARNING [train.py:1204] (2/4) Exclude cut with ID _31088_210_16446_1_1532520012284_3440919_20-260902-0_sp1.1 from training. Duration: 0.97275 2023-10-03 12:08:44,437 WARNING [train.py:1204] (2/4) Exclude cut with ID _45329_210_7005_1_1534157951211_6774339_856-344574-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 12:08:44,439 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_36756_210_7234_1_1533026991_966276_4-164118-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 12:08:44,454 WARNING [train.py:1204] (2/4) Exclude cut with ID _8365_210_2064_1_1528592311070_4677690_371-122295-0_sp1.1 from training. Duration: 0.5 2023-10-03 12:08:47,168 WARNING [train.py:1204] (2/4) Exclude cut with ID _5910_210_1389_1_1527337972454_5050100_555-159815-0_sp1.1 from training. Duration: 0.8909375 2023-10-03 12:08:47,184 WARNING [train.py:1204] (2/4) Exclude cut with ID _37086_210_9038_1_1532999540891_7021250_469-147941-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 12:08:48,515 WARNING [train.py:1204] (2/4) Exclude cut with ID _3067_210_2667_1_1526212974640_4503168_459-173867-0_sp0.9 from training. Duration: 0.97775 2023-10-03 12:08:48,622 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9156W0006-572499 from training. Duration: 0.896 2023-10-03 12:08:51,216 WARNING [train.py:1204] (2/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_277-210798-0 from training. Duration: 0.55 2023-10-03 12:08:52,617 WARNING [train.py:1204] (2/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_103-336064-0_sp1.1 from training. Duration: 0.463625 2023-10-03 12:08:52,660 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_3414_210_1988_1_1526212718_4813195_331-290945-0_sp1.1 from training. Duration: 0.936375 2023-10-03 12:08:52,661 WARNING [train.py:1204] (2/4) Exclude cut with ID _67499_210_17282_1_1535509805220_3692430_365-29276-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 12:08:54,093 WARNING [train.py:1204] (2/4) Exclude cut with ID _14821_210_8750_1_1530680503991_6829359_69-250847-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 12:08:54,748 WARNING [train.py:1204] (2/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_1102-61301-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 12:08:55,933 WARNING [train.py:1204] (2/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_166-246557-0_sp1.1 from training. Duration: 0.5818125 2023-10-03 12:08:55,978 WARNING [train.py:1204] (2/4) Exclude cut with ID _15351_210_10626_1_1530856635653_3576210_214-129918-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 12:08:55,990 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_120-41668-0_sp0.9 from training. Duration: 0.77775 2023-10-03 12:08:57,306 WARNING [train.py:1204] (2/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_27-72278-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 12:08:58,751 WARNING [train.py:1204] (2/4) Exclude cut with ID _24314_210_4915_1_1532135078116_7170179_521-258868-0_sp1.1 from training. Duration: 0.7181875 2023-10-03 12:09:01,544 WARNING [train.py:1204] (2/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_143-86344-0 from training. Duration: 0.77 2023-10-03 12:09:01,585 WARNING [train.py:1204] (2/4) Exclude cut with ID _53489_210_6228_1_1534298357169_3835970_465-157433-0_sp1.1 from training. Duration: 0.92725 2023-10-03 12:09:02,917 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_248-319353-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 12:09:03,101 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.feed_forward1.out_proj.dropout_p, batch_count=1261766.6666666667, ans=0.1 2023-10-03 12:09:04,492 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.balancer1.prob, batch_count=1261766.6666666667, ans=0.125 2023-10-03 12:09:05,570 WARNING [train.py:1204] (2/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_102-77064-0_sp1.1 from training. Duration: 0.8454375 2023-10-03 12:09:05,596 WARNING [train.py:1204] (2/4) Exclude cut with ID _9389_210_1664_1_1529217337301_5289269_379-128794-0_sp1.1 from training. Duration: 0.77275 2023-10-03 12:09:07,501 WARNING [train.py:1204] (2/4) Exclude cut with ID _3231_210_2106_1_1526205864934_6784220_236-260835-0_sp1.1 from training. Duration: 0.97275 2023-10-03 12:09:09,008 WARNING [train.py:1204] (2/4) Exclude cut with ID _24271_210_12658_1_1531987270022_7190519_184-342363-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 12:09:09,009 WARNING [train.py:1204] (2/4) Exclude cut with ID _27894_210_12392_1_1532415496078_6902439_67-221663-0_sp1.1 from training. Duration: 0.92725 2023-10-03 12:09:10,280 INFO [train.py:1046] (2/4) Epoch 36, batch 3350, loss[loss=0.1581, simple_loss=0.2445, pruned_loss=0.03588, over 24480.00 frames. ], tot_loss[loss=0.1591, simple_loss=0.2384, pruned_loss=0.03996, over 4721328.57 frames. ], batch size: 66, lr: 2.82e-03, grad_scale: 8.0 2023-10-03 12:09:12,304 WARNING [train.py:1204] (2/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_304-59227-0_sp1.1 from training. Duration: 0.7 2023-10-03 12:09:13,824 WARNING [train.py:1204] (2/4) Exclude cut with ID _58605_210_12210_1_1534723195939_3904259_458-42232-0_sp1.1 from training. Duration: 0.92725 2023-10-03 12:09:15,151 WARNING [train.py:1204] (2/4) Exclude cut with ID _14922_210_6228_1_1530707435273_3693310_13-165512-0_sp0.9 from training. Duration: 0.911125 2023-10-03 12:09:15,735 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.4.encoder.layers.2.self_attn1.whiten, num_groups=1, num_channels=384, metric=12.68 vs. limit=22.5 2023-10-03 12:09:19,335 WARNING [train.py:1204] (2/4) Exclude cut with ID _58602_210_2346_1_1534903210085_6908830_792-102241-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 12:09:20,765 WARNING [train.py:1204] (2/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_588-125165-0_sp0.9 from training. Duration: 0.92225 2023-10-03 12:09:23,521 WARNING [train.py:1204] (2/4) Exclude cut with ID _54218_210_12210_1_1534413646388_4409259_239-98429-0_sp1.1 from training. Duration: 0.97275 2023-10-03 12:09:23,583 WARNING [train.py:1204] (2/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_538-32291-0_sp1.1 from training. Duration: 0.7545625 2023-10-03 12:09:24,599 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.0.layers.1.feed_forward2.out_whiten, num_groups=1, num_channels=192, metric=4.07 vs. limit=15.0 2023-10-03 12:09:24,983 WARNING [train.py:1204] (2/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_419-317062-0 from training. Duration: 0.91 2023-10-03 12:09:25,688 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9309W0202-640329 from training. Duration: 0.896 2023-10-03 12:09:25,721 WARNING [train.py:1204] (2/4) Exclude cut with ID _3231_210_2106_1_1526205864934_6784220_419-313291-0_sp1.1 from training. Duration: 0.97275 2023-10-03 12:09:28,332 WARNING [train.py:1204] (2/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_619-344499-0 from training. Duration: 0.63 2023-10-03 12:09:28,344 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_278-272612-0 from training. Duration: 0.73 2023-10-03 12:09:30,304 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_103-84010-0_sp0.9 from training. Duration: 0.7666875 2023-10-03 12:09:30,342 WARNING [train.py:1204] (2/4) Exclude cut with ID _26528_210_9070_1_1532570410842_3851310_497-97218-0_sp1.1 from training. Duration: 0.7818125 2023-10-03 12:09:31,739 WARNING [train.py:1204] (2/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_262-84613-0_sp1.1 from training. Duration: 0.936375 2023-10-03 12:09:31,776 WARNING [train.py:1204] (2/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_387-131538-0 from training. Duration: 0.81 2023-10-03 12:09:31,937 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.nonlin_attention.balancer.prob, batch_count=1261900.0, ans=0.125 2023-10-03 12:09:33,126 WARNING [train.py:1204] (2/4) Exclude cut with ID _8030_210_7497_1_1528444886781_3531089_224-300489-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 12:09:33,142 WARNING [train.py:1204] (2/4) Exclude cut with ID _14349_210_8341_1_1530615710818_3442470_170-336541-0_sp0.9 from training. Duration: 0.9555625 2023-10-03 12:09:34,767 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.conv_module2.balancer2.prob, batch_count=1261900.0, ans=0.125 2023-10-03 12:09:35,839 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_47069_210_15368_1_1533781267_4431320_180-31746-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 12:09:35,953 WARNING [train.py:1204] (2/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_447-7041-0_sp1.1 from training. Duration: 0.92725 2023-10-03 12:09:35,976 WARNING [train.py:1204] (2/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_2-55283-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 12:09:37,330 WARNING [train.py:1204] (2/4) Exclude cut with ID _12555_210_8750_1_1530075594402_3780239_70-181911-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 12:09:40,710 WARNING [train.py:1204] (2/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_311-328022-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 12:09:41,980 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.593e+02 1.875e+02 2.056e+02 2.286e+02 3.351e+02, threshold=4.113e+02, percent-clipped=0.0 2023-10-03 12:09:42,135 WARNING [train.py:1204] (2/4) Exclude cut with ID _14528_210_8254_1_1530669700514_4306939_330-2987-0_sp1.1 from training. Duration: 0.92725 2023-10-03 12:09:42,173 WARNING [train.py:1204] (2/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_385-184858-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 12:09:42,427 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.conv_module1.balancer2.prob, batch_count=1261966.6666666667, ans=0.125 2023-10-03 12:09:44,399 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.feed_forward1.out_proj.dropout_p, batch_count=1261966.6666666667, ans=0.1 2023-10-03 12:09:46,874 WARNING [train.py:1204] (2/4) Exclude cut with ID _64414_210_17301_1_1535243252048_3757759_348-333360-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 12:09:48,232 WARNING [train.py:1204] (2/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_753-214751-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 12:09:50,979 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_17613_210_2151_1_1531137569_2366160_202-208013-0_sp1.1 from training. Duration: 0.92725 2023-10-03 12:09:50,988 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_43831_210_2158_1_1533725502_5872791_610-129300-0_sp1.1 from training. Duration: 0.936375 2023-10-03 12:09:53,754 WARNING [train.py:1204] (2/4) Exclude cut with ID _54218_210_12210_1_1534413646388_4409259_321-23852-0_sp1.1 from training. Duration: 0.936375 2023-10-03 12:09:55,175 WARNING [train.py:1204] (2/4) Exclude cut with ID _39411_210_5986_1_1533538825030_3343649_173-238248-0 from training. Duration: 0.83 2023-10-03 12:09:55,182 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_405-339986-0_sp0.9 from training. Duration: 0.5666875 2023-10-03 12:09:55,208 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_2009_210_2071_1_1525481134_4588937_385-302100-0 from training. Duration: 0.87 2023-10-03 12:09:55,887 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_161-276996-0_sp1.1 from training. Duration: 0.863625 2023-10-03 12:09:55,977 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_177-58005-0 from training. Duration: 0.62 2023-10-03 12:09:58,660 WARNING [train.py:1204] (2/4) Exclude cut with ID _64639_210_5816_1_1535367619437_3528000_146-343734-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 12:09:58,754 WARNING [train.py:1204] (2/4) Exclude cut with ID _3845_210_1390_1_1526731326141_3523610_72-61441-0_sp1.1 from training. Duration: 0.92725 2023-10-03 12:09:59,010 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.self_attn_weights.pos_emb_skip_rate, batch_count=1262033.3333333333, ans=0.0 2023-10-03 12:10:05,544 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.1.encoder.layers.0.self_attn1.whiten, num_groups=1, num_channels=256, metric=13.46 vs. limit=22.5 2023-10-03 12:10:05,976 WARNING [train.py:1204] (2/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_991-125028-0_sp1.1 from training. Duration: 0.936375 2023-10-03 12:10:07,265 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7544_210_6753_1_1528591327_1471426_172-348777-0 from training. Duration: 0.66 2023-10-03 12:10:07,321 WARNING [train.py:1204] (2/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_416-257730-0_sp1.1 from training. Duration: 0.5909375 2023-10-03 12:10:07,449 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.0.feed_forward1.out_proj.dropout_p, batch_count=1262033.3333333333, ans=0.1 2023-10-03 12:10:08,624 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1113-7369-0_sp1.1 from training. Duration: 0.77275 2023-10-03 12:10:08,725 WARNING [train.py:1204] (2/4) Exclude cut with ID _40402_210_18219_1_1533522324115_2953150_242-76943-0_sp1.1 from training. Duration: 0.8545625 2023-10-03 12:10:15,076 WARNING [train.py:1204] (2/4) Exclude cut with ID _44718_210_5816_1_1534050051869_6469030_190-49648-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 12:10:16,582 WARNING [train.py:1204] (2/4) Exclude cut with ID _47251_210_18628_1_1533814063128_4137250_214-88076-0 from training. Duration: 0.88 2023-10-03 12:10:17,083 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.5.encoder.layers.0.self_attn2.whiten, num_groups=1, num_channels=256, metric=11.33 vs. limit=22.5 2023-10-03 12:10:17,926 WARNING [train.py:1204] (2/4) Exclude cut with ID _5585_210_2667_1_1527418813324_8007340_89-332072-0_sp1.1 from training. Duration: 0.5818125 2023-10-03 12:10:17,950 WARNING [train.py:1204] (2/4) Exclude cut with ID _41817_210_18835_1_1533434477160_7423049_555-293448-0_sp1.1 from training. Duration: 0.72725 2023-10-03 12:10:19,368 WARNING [train.py:1204] (2/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_286-331314-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 12:10:19,409 WARNING [train.py:1204] (2/4) Exclude cut with ID _13921_210_9994_1_1530838702800_3003429_46-149444-0 from training. Duration: 0.94 2023-10-03 12:10:20,761 WARNING [train.py:1204] (2/4) Exclude cut with ID _44778_210_2054_1_1533798127203_7443090_768-43595-0_sp1.1 from training. Duration: 0.936375 2023-10-03 12:10:20,770 WARNING [train.py:1204] (2/4) Exclude cut with ID _35355_210_16040_1_1533212988977_4016229_74-190076-0 from training. Duration: 0.8 2023-10-03 12:10:23,478 WARNING [train.py:1204] (2/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_1007-316380-0_sp1.1 from training. Duration: 0.963625 2023-10-03 12:10:24,771 INFO [train.py:1046] (2/4) Epoch 36, batch 3400, loss[loss=0.1572, simple_loss=0.2532, pruned_loss=0.03058, over 24320.00 frames. ], tot_loss[loss=0.1595, simple_loss=0.239, pruned_loss=0.03999, over 4723255.06 frames. ], batch size: 74, lr: 2.82e-03, grad_scale: 8.0 2023-10-03 12:10:24,821 WARNING [train.py:1204] (2/4) Exclude cut with ID _23557_210_8750_1_1531890106370_6846255_150-283437-0_sp1.1 from training. Duration: 0.963625 2023-10-03 12:10:26,149 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_3474_210_2028_1_1526188513_4825246_145-309131-0_sp1.1 from training. Duration: 0.67275 2023-10-03 12:10:26,383 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.conv_module2.balancer2.prob, batch_count=1262166.6666666667, ans=0.125 2023-10-03 12:10:27,534 WARNING [train.py:1204] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_391-22148-0_sp1.1 from training. Duration: 0.7 2023-10-03 12:10:27,550 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_238-256668-0 from training. Duration: 0.69 2023-10-03 12:10:33,651 WARNING [train.py:1204] (2/4) Exclude cut with ID _8394_210_6702_1_1528590504316_3540189_372-198878-0 from training. Duration: 0.6 2023-10-03 12:10:34,913 WARNING [train.py:1204] (2/4) Exclude cut with ID ID0174W0006-766659 from training. Duration: 0.953 2023-10-03 12:10:34,930 WARNING [train.py:1204] (2/4) Exclude cut with ID _6274_210_2084_1_1527937583837_7494020_668-23019-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 12:10:35,225 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.conv_module2.balancer2.min_abs, batch_count=1262166.6666666667, ans=0.5 2023-10-03 12:10:38,424 WARNING [train.py:1204] (2/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_322-74328-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 12:10:38,430 WARNING [train.py:1204] (2/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_872-264525-0_sp1.1 from training. Duration: 0.5909375 2023-10-03 12:10:39,746 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_53378_210_17282_1_1534221071_5769509_641-286201-0_sp1.1 from training. Duration: 0.97275 2023-10-03 12:10:39,842 WARNING [train.py:1204] (2/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_199-295611-0_sp1.1 from training. Duration: 0.5 2023-10-03 12:10:47,960 WARNING [train.py:1204] (2/4) Exclude cut with ID _5250_210_5219_1_1527249561806_8692110_1270-317971-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 12:10:48,055 WARNING [train.py:1204] (2/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_784-213099-0 from training. Duration: 0.93 2023-10-03 12:10:51,130 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.ff2_skip_rate, batch_count=1262233.3333333333, ans=0.0 2023-10-03 12:10:52,480 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_115-233929-0_sp0.9 from training. Duration: 0.8 2023-10-03 12:10:53,858 WARNING [train.py:1204] (2/4) Exclude cut with ID _54871_210_22356_1_1534665798253_3856359_256-223809-0_sp1.1 from training. Duration: 0.97275 2023-10-03 12:10:55,254 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_20120_210_6753_1_1531643035_5482859_285-87781-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 12:10:55,339 WARNING [train.py:1204] (2/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_619-344499-0_sp1.1 from training. Duration: 0.57275 2023-10-03 12:11:00,761 WARNING [train.py:1204] (2/4) Exclude cut with ID _60229_210_27767_1_1534838406158_3655350_264-279573-0_sp0.9 from training. Duration: 0.92225 2023-10-03 12:11:01,588 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.feed_forward1.hidden_balancer.prob, batch_count=1262300.0, ans=0.125 2023-10-03 12:11:05,362 WARNING [train.py:1204] (2/4) Exclude cut with ID _7719_210_3525_1_1528456153163_4296960_333-110399-0 from training. Duration: 0.96 2023-10-03 12:11:11,348 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_50423_210_13270_1_1534128598_5021734_759-90833-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 12:11:12,629 WARNING [train.py:1204] (2/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_151-254798-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 12:11:12,659 WARNING [train.py:1204] (2/4) Exclude cut with ID _3465_210_3525_1_1526175667060_3574049_450-203637-0 from training. Duration: 0.92 2023-10-03 12:11:14,635 WARNING [train.py:1204] (2/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_673-36456-0_sp1.1 from training. Duration: 0.936375 2023-10-03 12:11:14,681 WARNING [train.py:1204] (2/4) Exclude cut with ID _36538_210_9994_1_1533204020363_3795620_131-8685-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 12:11:15,981 WARNING [train.py:1204] (2/4) Exclude cut with ID _12556_210_8341_1_1530081024100_7327690_713-55626-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 12:11:16,022 WARNING [train.py:1204] (2/4) Exclude cut with ID _2322_210_2667_1_1525604548971_3656299_440-80494-0_sp0.9 from training. Duration: 0.97775 2023-10-03 12:11:19,371 WARNING [train.py:1204] (2/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_210-325081-0_sp1.1 from training. Duration: 0.97275 2023-10-03 12:11:19,622 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.balancer1.prob, batch_count=1262366.6666666667, ans=0.125 2023-10-03 12:11:22,270 WARNING [train.py:1204] (2/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_375-244976-0_sp0.9 from training. Duration: 0.8333125 2023-10-03 12:11:22,283 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_15517_210_6753_1_1530952293_1664795_119-31001-0_sp1.1 from training. Duration: 0.8545625 2023-10-03 12:11:26,310 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_41452_210_7291_1_1533437687_3941281_319-42326-0_sp1.1 from training. Duration: 0.92725 2023-10-03 12:11:29,123 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_372-173012-0 from training. Duration: 0.95 2023-10-03 12:11:33,371 WARNING [train.py:1204] (2/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_41-31157-0_sp1.1 from training. Duration: 0.5181875 2023-10-03 12:11:33,575 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.nonlin_attention.balancer.prob, batch_count=1262433.3333333333, ans=0.125 2023-10-03 12:11:37,995 WARNING [train.py:1204] (2/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_91-212486-0 from training. Duration: 0.68 2023-10-03 12:11:38,317 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.feed_forward2.hidden_balancer.prob, batch_count=1262500.0, ans=0.125 2023-10-03 12:11:39,314 INFO [train.py:1046] (2/4) Epoch 36, batch 3450, loss[loss=0.1582, simple_loss=0.2497, pruned_loss=0.0334, over 24455.00 frames. ], tot_loss[loss=0.1595, simple_loss=0.2393, pruned_loss=0.03989, over 4734271.83 frames. ], batch size: 69, lr: 2.82e-03, grad_scale: 8.0 2023-10-03 12:11:40,830 WARNING [train.py:1204] (2/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_1267-272693-0 from training. Duration: 0.73 2023-10-03 12:11:40,858 WARNING [train.py:1204] (2/4) Exclude cut with ID _13938_210_9988_1_1530518718978_3693960_152-67046-0_sp1.1 from training. Duration: 0.936375 2023-10-03 12:11:42,273 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_4528_210_3273_1_1527076654_3276777_342-168068-0_sp1.1 from training. Duration: 0.7909375 2023-10-03 12:11:42,288 WARNING [train.py:1204] (2/4) Exclude cut with ID _7606_210_4915_1_1528894253277_5080690_127-177761-0 from training. Duration: 0.64 2023-10-03 12:11:44,186 WARNING [train.py:1204] (2/4) Exclude cut with ID _45329_210_7005_1_1534157951211_6774339_335-137972-0_sp1.1 from training. Duration: 0.92725 2023-10-03 12:11:47,015 WARNING [train.py:1204] (2/4) Exclude cut with ID _5166_210_2084_1_1527073197160_4087993_331-117829-0_sp1.1 from training. Duration: 0.563625 2023-10-03 12:11:53,057 WARNING [train.py:1204] (2/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_125-125818-0_sp1.1 from training. Duration: 0.87275 2023-10-03 12:11:53,107 WARNING [train.py:1204] (2/4) Exclude cut with ID _6789_210_1534_1_1527854471671_2870519_48-294897-0_sp1.1 from training. Duration: 0.963625 2023-10-03 12:11:53,237 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.0.bypass_mid.scale_min, batch_count=1262566.6666666667, ans=0.2 2023-10-03 12:11:54,556 WARNING [train.py:1204] (2/4) Exclude cut with ID _6694_210_4599_1_1527852637490_3965529_116-109398-0_sp1.1 from training. Duration: 0.663625 2023-10-03 12:11:54,568 WARNING [train.py:1204] (2/4) Exclude cut with ID _52892_210_6721_1_1534378941259_4124129_222-172685-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 12:11:56,030 WARNING [train.py:1204] (2/4) Exclude cut with ID _50655_210_18628_1_1534229942193_3772154_320-132275-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 12:12:01,613 WARNING [train.py:1204] (2/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_76-311963-0 from training. Duration: 0.7 2023-10-03 12:12:05,741 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.feed_forward1.out_proj.dropout_p, batch_count=1262566.6666666667, ans=0.1 2023-10-03 12:12:08,014 WARNING [train.py:1204] (2/4) Exclude cut with ID _6274_210_2084_1_1527937583837_7494020_32-205288-0 from training. Duration: 0.78 2023-10-03 12:12:08,046 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_86-16881-0_sp0.9 from training. Duration: 0.6444375 2023-10-03 12:12:09,296 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_271-297505-0_sp1.1 from training. Duration: 0.9 2023-10-03 12:12:10,549 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.624e+02 1.861e+02 2.014e+02 2.167e+02 2.671e+02, threshold=4.028e+02, percent-clipped=0.0 2023-10-03 12:12:10,676 WARNING [train.py:1204] (2/4) Exclude cut with ID _57572_210_5517_1_1534921219624_3809484_187-295293-0_sp1.1 from training. Duration: 0.963625 2023-10-03 12:12:10,921 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.0.feed_forward3.hidden_balancer.prob, batch_count=1262633.3333333333, ans=0.125 2023-10-03 12:12:12,201 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.conv_module1.balancer1.prob, batch_count=1262633.3333333333, ans=0.125 2023-10-03 12:12:15,273 WARNING [train.py:1204] (2/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_563-329943-0 from training. Duration: 0.99 2023-10-03 12:12:16,668 WARNING [train.py:1204] (2/4) Exclude cut with ID _7160_210_1876_1_1527940923231_3703799_302-150728-0_sp0.9 from training. Duration: 0.8666875 2023-10-03 12:12:20,060 WARNING [train.py:1204] (2/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_926-7526-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 12:12:20,088 WARNING [train.py:1204] (2/4) Exclude cut with ID _62574_210_2655_1_1535180407058_3933249_467-22348-0_sp1.1 from training. Duration: 0.936375 2023-10-03 12:12:20,311 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.balancer2.prob, batch_count=1262633.3333333333, ans=0.125 2023-10-03 12:12:20,325 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=1262633.3333333333, ans=0.1 2023-10-03 12:12:22,770 WARNING [train.py:1204] (2/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_280-161633-0_sp0.9 from training. Duration: 0.811125 2023-10-03 12:12:24,111 WARNING [train.py:1204] (2/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_385-193052-0_sp1.1 from training. Duration: 0.7818125 2023-10-03 12:12:25,544 WARNING [train.py:1204] (2/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_444-28041-0 from training. Duration: 0.85 2023-10-03 12:12:25,549 WARNING [train.py:1204] (2/4) Exclude cut with ID _66070_210_7640_1_1535441393096_7805310_341-249237-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 12:12:26,906 WARNING [train.py:1204] (2/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_425-269826-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 12:12:29,635 WARNING [train.py:1204] (2/4) Exclude cut with ID _53950_210_5816_1_1534417264919_3862951_124-185597-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 12:12:32,459 WARNING [train.py:1204] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_380-204756-0 from training. Duration: 0.86 2023-10-03 12:12:35,684 WARNING [train.py:1204] (2/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_282-257791-0_sp0.9 from training. Duration: 0.9555625 2023-10-03 12:12:36,022 INFO [scaling.py:1118] (2/4) WithLoss: name=encoder.encoders.4.encoder.layers.2.self_attn_weights, loss-sum=0.000e+00 2023-10-03 12:12:40,779 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.bypass.skip_rate, batch_count=1262766.6666666667, ans=0.07 2023-10-03 12:12:41,905 WARNING [train.py:1204] (2/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_372-131163-0_sp1.1 from training. Duration: 0.8545625 2023-10-03 12:12:42,000 WARNING [train.py:1204] (2/4) Exclude cut with ID _28317_210_12742_1_1532480302381_8510325_447-233513-0_sp1.1 from training. Duration: 0.963625 2023-10-03 12:12:46,542 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_54-118333-0_sp1.1 from training. Duration: 0.97275 2023-10-03 12:12:49,422 WARNING [train.py:1204] (2/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_388-230239-0_sp1.1 from training. Duration: 0.963625 2023-10-03 12:12:49,440 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_40047_210_2290_1_1533290519_2374311_141-54639-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 12:12:51,336 WARNING [train.py:1204] (2/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_91-267696-0_sp0.9 from training. Duration: 0.9666875 2023-10-03 12:12:51,381 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_14056_210_4163_1_1530604483_7572827_532-137847-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 12:12:53,914 INFO [train.py:1046] (2/4) Epoch 36, batch 3500, loss[loss=0.1504, simple_loss=0.227, pruned_loss=0.03692, over 24610.00 frames. ], tot_loss[loss=0.1578, simple_loss=0.2368, pruned_loss=0.0394, over 4734919.55 frames. ], batch size: 60, lr: 2.82e-03, grad_scale: 8.0 2023-10-03 12:12:54,149 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.0.ff2_skip_rate, batch_count=1262833.3333333333, ans=0.0 2023-10-03 12:12:54,203 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.out_combiner.scale_min, batch_count=1262833.3333333333, ans=0.2 2023-10-03 12:12:55,435 WARNING [train.py:1204] (2/4) Exclude cut with ID _8064_210_5399_1_1528372299291_4282973_155-283231-0_sp1.1 from training. Duration: 0.97275 2023-10-03 12:12:58,300 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_2245_210_2715_1_1525511548_3540254_310-52483-0_sp1.1 from training. Duration: 0.8 2023-10-03 12:12:59,629 WARNING [train.py:1204] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_185-174805-0 from training. Duration: 0.68 2023-10-03 12:12:59,856 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.feed_forward1.hidden_balancer.prob, batch_count=1262833.3333333333, ans=0.125 2023-10-03 12:13:02,195 WARNING [train.py:1204] (2/4) Exclude cut with ID _15012_210_1968_1_1530856820429_5095350_662-210469-0_sp1.1 from training. Duration: 0.5181875 2023-10-03 12:13:04,883 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_426-313557-0_sp0.9 from training. Duration: 0.588875 2023-10-03 12:13:08,333 WARNING [train.py:1204] (2/4) Exclude cut with ID _42326_210_7770_1_1533814282531_3707269_407-204343-0_sp1.1 from training. Duration: 0.97275 2023-10-03 12:13:08,342 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_258-310570-0 from training. Duration: 0.82 2023-10-03 12:13:13,078 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_4737_210_2040_1_1526998689_3293008_378-178650-0_sp1.1 from training. Duration: 0.863625 2023-10-03 12:13:14,452 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_47896_210_20508_1_1534492518_5958868_432-327451-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 12:13:14,548 WARNING [train.py:1204] (2/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_188-114464-0_sp1.1 from training. Duration: 0.7454375 2023-10-03 12:13:16,247 WARNING [train.py:1204] (2/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_798-106179-0_sp1.1 from training. Duration: 0.7545625 2023-10-03 12:13:16,286 WARNING [train.py:1204] (2/4) Exclude cut with ID _14368_210_4220_1_1530532714870_3595529_143-117421-0_sp1.1 from training. Duration: 0.52725 2023-10-03 12:13:17,654 WARNING [train.py:1204] (2/4) Exclude cut with ID _67499_210_17282_1_1535509805220_3692430_361-78786-0_sp1.1 from training. Duration: 0.963625 2023-10-03 12:13:17,693 WARNING [train.py:1204] (2/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_712-342563-0_sp1.1 from training. Duration: 0.8818125 2023-10-03 12:13:17,740 WARNING [train.py:1204] (2/4) Exclude cut with ID _9235_210_5219_1_1528891154253_8046120_387-159008-0 from training. Duration: 0.86 2023-10-03 12:13:20,362 WARNING [train.py:1204] (2/4) Exclude cut with ID _33640_210_2655_1_1533117605479_4855469_959-18820-0_sp1.1 from training. Duration: 0.963625 2023-10-03 12:13:20,390 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_12868_210_6753_1_1530702479_3610564_133-564-0_sp0.9 from training. Duration: 0.87775 2023-10-03 12:13:23,538 WARNING [train.py:1204] (2/4) Exclude cut with ID _23514_210_1389_1_1531832139785_3031439_12-29287-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 12:13:26,379 WARNING [train.py:1204] (2/4) Exclude cut with ID _23514_210_1389_1_1531832139785_3031439_489-185052-0_sp1.1 from training. Duration: 0.963625 2023-10-03 12:13:26,460 WARNING [train.py:1204] (2/4) Exclude cut with ID _5444_210_2151_1_1527418808464_5312210_634-194174-0 from training. Duration: 0.97 2023-10-03 12:13:26,478 WARNING [train.py:1204] (2/4) Exclude cut with ID _6275_210_2084_1_1528110062213_7669049_91-26919-0_sp1.1 from training. Duration: 0.8818125 2023-10-03 12:13:29,193 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_37351_210_7925_1_1533012931_3812113_516-82303-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 12:13:30,589 WARNING [train.py:1204] (2/4) Exclude cut with ID _41314_210_12651_1_1533459476543_3766460_80-74184-0_sp0.9 from training. Duration: 0.92225 2023-10-03 12:13:31,981 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_9460_210_4211_1_1528954587_10049667_495-202963-0_sp1.1 from training. Duration: 0.963625 2023-10-03 12:13:33,350 WARNING [train.py:1204] (2/4) Exclude cut with ID _14632_210_9988_1_1530691279392_3923200_332-105904-0_sp1.1 from training. Duration: 0.8181875 2023-10-03 12:13:33,372 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_40764_210_1925_1_1533882359_3752345_493-167886-0_sp1.1 from training. Duration: 0.92725 2023-10-03 12:13:34,794 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_196-289637-0 from training. Duration: 0.75 2023-10-03 12:13:36,852 WARNING [train.py:1204] (2/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_26-175553-0 from training. Duration: 0.61 2023-10-03 12:13:36,905 WARNING [train.py:1204] (2/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_890-5117-0 from training. Duration: 0.78 2023-10-03 12:13:38,086 WARNING [train.py:1204] (2/4) Exclude cut with ID _41314_210_12651_1_1533459476543_3766460_206-306860-0_sp1.1 from training. Duration: 0.92725 2023-10-03 12:13:39,450 WARNING [train.py:1204] (2/4) Exclude cut with ID _25882_210_3036_1_1532775665815_6479510_570-79824-0_sp1.1 from training. Duration: 0.963625 2023-10-03 12:13:39,509 WARNING [train.py:1204] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_731-2428-0_sp1.1 from training. Duration: 0.7545625 2023-10-03 12:13:40,810 WARNING [train.py:1204] (2/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_138-154812-0_sp0.9 from training. Duration: 0.8666875 2023-10-03 12:13:44,177 WARNING [train.py:1204] (2/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_178-39722-0_sp0.9 from training. Duration: 0.5444375 2023-10-03 12:13:45,547 WARNING [train.py:1204] (2/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_1285-256622-0_sp0.9 from training. Duration: 0.9333125 2023-10-03 12:13:48,451 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.0.layers.0.conv_module2.whiten, num_groups=1, num_channels=192, metric=10.42 vs. limit=15.0 2023-10-03 12:13:50,293 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_41428_210_2161_1_1533774184_3992532_65-271323-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 12:13:51,662 WARNING [train.py:1204] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_308-161067-0 from training. Duration: 0.66 2023-10-03 12:13:51,666 WARNING [train.py:1204] (2/4) Exclude cut with ID _3067_210_2667_1_1526212974640_4503168_459-173867-0 from training. Duration: 0.88 2023-10-03 12:13:51,667 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_2221_210_1988_1_1525597051_4935541_415-296882-0_sp1.1 from training. Duration: 0.7 2023-10-03 12:13:53,526 WARNING [train.py:1204] (2/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_354-192266-0_sp1.1 from training. Duration: 0.936375 2023-10-03 12:13:53,591 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_258-310570-0_sp0.9 from training. Duration: 0.911125 2023-10-03 12:13:56,323 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_548-141039-0_sp1.1 from training. Duration: 0.963625 2023-10-03 12:13:57,773 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_15101_210_7927_1_1530786415_5033274_320-327527-0 from training. Duration: 0.96 2023-10-03 12:13:59,160 WARNING [train.py:1204] (2/4) Exclude cut with ID _2895_210_2477_1_1525956785794_4063750_289-11475-0_sp0.9 from training. Duration: 0.911125 2023-10-03 12:14:00,557 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7543_210_6753_1_1528543318_4552_1-85072-0_sp1.1 from training. Duration: 0.7545625 2023-10-03 12:14:01,909 WARNING [train.py:1204] (2/4) Exclude cut with ID _64639_210_5816_1_1535367619437_3528000_461-329156-0 from training. Duration: 0.96 2023-10-03 12:14:02,201 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.feed_forward1.out_proj.dropout_p, batch_count=1263100.0, ans=0.1 2023-10-03 12:14:03,303 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_211-339622-0 from training. Duration: 0.45 2023-10-03 12:14:03,460 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.0.balancer2.prob, batch_count=1263100.0, ans=0.125 2023-10-03 12:14:06,629 WARNING [train.py:1204] (2/4) Exclude cut with ID _20455_210_5219_1_1531565996775_8098100_402-112739-0_sp1.1 from training. Duration: 0.963625 2023-10-03 12:14:07,174 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.4.encoder.layers.2.nonlin_attention.whiten2, num_groups=1, num_channels=384, metric=3.66 vs. limit=15.0 2023-10-03 12:14:07,920 INFO [train.py:1046] (2/4) Epoch 36, batch 3550, loss[loss=0.1588, simple_loss=0.2476, pruned_loss=0.03504, over 24332.00 frames. ], tot_loss[loss=0.1569, simple_loss=0.2352, pruned_loss=0.03929, over 4724629.11 frames. ], batch size: 74, lr: 2.82e-03, grad_scale: 8.0 2023-10-03 12:14:08,009 WARNING [train.py:1204] (2/4) Exclude cut with ID _53489_210_6228_1_1534298357169_3835970_256-115565-0_sp1.1 from training. Duration: 0.936375 2023-10-03 12:14:08,025 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_26326_210_7925_1_1533002493_3592406_360-305260-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 12:14:08,055 WARNING [train.py:1204] (2/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_18-204383-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 12:14:10,818 WARNING [train.py:1204] (2/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_306-232480-0_sp1.1 from training. Duration: 0.87275 2023-10-03 12:14:16,291 INFO [scaling.py:1118] (2/4) WithLoss: name=encoder.encoders.1.encoder.layers.0.self_attn_weights, loss-sum=0.000e+00 2023-10-03 12:14:18,872 WARNING [train.py:1204] (2/4) Exclude cut with ID _41161_210_20034_1_1533431249657_3555314_116-299186-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 12:14:20,245 WARNING [train.py:1204] (2/4) Exclude cut with ID _13913_210_9994_1_1530579615187_3383897_473-277682-0_sp1.1 from training. Duration: 0.3545625 2023-10-03 12:14:23,036 WARNING [train.py:1204] (2/4) Exclude cut with ID _58895_210_8750_1_1535184081320_3649089_49-225702-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 12:14:23,130 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_3080_210_2071_1_1526086102_4365166_312-8726-0_sp1.1 from training. Duration: 0.82725 2023-10-03 12:14:26,342 WARNING [train.py:1204] (2/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_489-190175-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 12:14:27,724 WARNING [train.py:1204] (2/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_345-95717-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 12:14:27,742 WARNING [train.py:1204] (2/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_85-138257-0_sp1.1 from training. Duration: 0.6454375 2023-10-03 12:14:29,755 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.3.encoder.layers.1.feed_forward1.out_whiten, num_groups=1, num_channels=512, metric=9.25 vs. limit=15.0 2023-10-03 12:14:30,575 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7046_210_6753_1_1527895337_6920917_783-278109-0_sp1.1 from training. Duration: 0.7 2023-10-03 12:14:30,604 WARNING [train.py:1204] (2/4) Exclude cut with ID _3465_210_3525_1_1526175667060_3574049_450-203637-0_sp1.1 from training. Duration: 0.836375 2023-10-03 12:14:30,766 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.out_combiner.scale_min, batch_count=1263233.3333333333, ans=0.2 2023-10-03 12:14:32,452 WARNING [train.py:1204] (2/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_416-132893-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 12:14:32,468 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_2655_210_2087_1_1525688959_3897400_437-337690-0_sp0.9 from training. Duration: 0.7 2023-10-03 12:14:32,551 WARNING [train.py:1204] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_700-183549-0_sp1.1 from training. Duration: 0.6181875 2023-10-03 12:14:37,991 WARNING [train.py:1204] (2/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_191-59784-0_sp1.1 from training. Duration: 0.8 2023-10-03 12:14:38,002 WARNING [train.py:1204] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_218-295097-0_sp1.1 from training. Duration: 0.7 2023-10-03 12:14:39,163 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.636e+02 1.892e+02 2.093e+02 2.381e+02 3.257e+02, threshold=4.186e+02, percent-clipped=0.0 2023-10-03 12:14:39,340 WARNING [train.py:1204] (2/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_310-109461-0_sp1.1 from training. Duration: 0.72725 2023-10-03 12:14:39,344 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_33348_210_5632_1_1533086912_3324030_131-321149-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 12:14:39,391 WARNING [train.py:1204] (2/4) Exclude cut with ID _64135_210_2253_1_1535677267876_7608972_133-188266-0_sp1.1 from training. Duration: 0.736375 2023-10-03 12:14:40,613 WARNING [train.py:1204] (2/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_505-205270-0 from training. Duration: 0.38 2023-10-03 12:14:40,623 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_67223_210_4238_1_1535612120_2537431_102-88248-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 12:14:42,048 WARNING [train.py:1204] (2/4) Exclude cut with ID _54871_210_22356_1_1534665798253_3856359_60-235433-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 12:14:45,070 WARNING [train.py:1204] (2/4) Exclude cut with ID _5189_210_4557_1_1527248319428_3769229_199-53526-0_sp1.1 from training. Duration: 0.3818125 2023-10-03 12:14:48,425 WARNING [train.py:1204] (2/4) Exclude cut with ID _45329_210_7005_1_1534157951211_6774339_439-90979-0_sp1.1 from training. Duration: 0.963625 2023-10-03 12:14:49,787 WARNING [train.py:1204] (2/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_1162-132880-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 12:14:51,219 WARNING [train.py:1204] (2/4) Exclude cut with ID _56423_210_5854_1_1534896061049_3165969_220-22317-0_sp1.1 from training. Duration: 0.963625 2023-10-03 12:14:52,635 WARNING [train.py:1204] (2/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_909-64151-0 from training. Duration: 0.94 2023-10-03 12:14:52,704 WARNING [train.py:1204] (2/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_465-156500-0_sp0.9 from training. Duration: 0.8 2023-10-03 12:14:54,603 WARNING [train.py:1204] (2/4) Exclude cut with ID _44304_210_15374_1_1533639352765_4613129_302-22084-0 from training. Duration: 0.86 2023-10-03 12:14:54,657 WARNING [train.py:1204] (2/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_663-166973-0_sp1.1 from training. Duration: 0.72725 2023-10-03 12:14:57,457 WARNING [train.py:1204] (2/4) Exclude cut with ID _45329_210_7005_1_1534157951211_6774339_503-287883-0_sp0.9 from training. Duration: 0.97775 2023-10-03 12:14:57,478 WARNING [train.py:1204] (2/4) Exclude cut with ID _60057_210_10640_1_1534903065793_3333150_362-230377-0_sp0.9 from training. Duration: 0.9666875 2023-10-03 12:14:59,065 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.bypass.scale_min, batch_count=1263366.6666666667, ans=0.2 2023-10-03 12:15:01,479 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_4183_210_2087_1_1526729100_1869838_143-280962-0 from training. Duration: 0.56 2023-10-03 12:15:03,485 WARNING [train.py:1204] (2/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_281-1313-0_sp1.1 from training. Duration: 0.936375 2023-10-03 12:15:05,683 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.2.encoder.layers.1.whiten, num_groups=1, num_channels=384, metric=5.03 vs. limit=12.0 2023-10-03 12:15:09,146 WARNING [train.py:1204] (2/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_64-215118-0_sp1.1 from training. Duration: 0.936375 2023-10-03 12:15:10,593 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_2822_210_2715_1_1526112586_1999587_165-36656-0 from training. Duration: 0.89 2023-10-03 12:15:10,640 WARNING [train.py:1204] (2/4) Exclude cut with ID _22961_210_8254_1_1531976366672_4278180_402-208830-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 12:15:14,689 WARNING [train.py:1204] (2/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_98-193494-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 12:15:14,964 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.feed_forward3.hidden_balancer.prob, batch_count=1263433.3333333333, ans=0.125 2023-10-03 12:15:16,090 WARNING [train.py:1204] (2/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_383-285196-0 from training. Duration: 0.97 2023-10-03 12:15:22,075 INFO [train.py:1046] (2/4) Epoch 36, batch 3600, loss[loss=0.1515, simple_loss=0.2398, pruned_loss=0.03156, over 24453.00 frames. ], tot_loss[loss=0.1569, simple_loss=0.2352, pruned_loss=0.0393, over 4716145.81 frames. ], batch size: 66, lr: 2.82e-03, grad_scale: 16.0 2023-10-03 12:15:22,368 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.conv_module1.balancer2.prob, batch_count=1263500.0, ans=0.125 2023-10-03 12:15:23,517 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6767_210_3273_1_1527855783_1718932_142-127132-0 from training. Duration: 0.95 2023-10-03 12:15:23,553 WARNING [train.py:1204] (2/4) Exclude cut with ID _39867_210_9826_1_1533553222030_9344259_1018-339635-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 12:15:23,617 WARNING [train.py:1204] (2/4) Exclude cut with ID _14603_210_4220_1_1530615688441_3097319_274-274798-0_sp1.1 from training. Duration: 0.8909375 2023-10-03 12:15:25,011 WARNING [train.py:1204] (2/4) Exclude cut with ID _2334_210_2667_1_1525608238880_4705129_376-263393-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 12:15:25,281 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.nonlin_attention.balancer.max_positive, batch_count=1263500.0, ans=0.95 2023-10-03 12:15:26,919 WARNING [train.py:1204] (2/4) Exclude cut with ID _29303_210_2575_1_1532392237303_7410649_363-344675-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 12:15:26,998 WARNING [train.py:1204] (2/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_58-68956-0_sp1.1 from training. Duration: 0.7545625 2023-10-03 12:15:29,776 WARNING [train.py:1204] (2/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_709-311571-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 12:15:31,074 WARNING [train.py:1204] (2/4) Exclude cut with ID _46067_210_4220_1_1534125571066_3307450_136-257407-0_sp1.1 from training. Duration: 0.963625 2023-10-03 12:15:34,064 WARNING [train.py:1204] (2/4) Exclude cut with ID _6493_210_1747_1_1528459210424_4012470_324-323854-0_sp1.1 from training. Duration: 0.836375 2023-10-03 12:15:34,119 WARNING [train.py:1204] (2/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_234-260682-0_sp1.1 from training. Duration: 0.763625 2023-10-03 12:15:34,172 WARNING [train.py:1204] (2/4) Exclude cut with ID _36634_210_1794_1_1533296352450_1399884_55-119451-0_sp1.1 from training. Duration: 0.963625 2023-10-03 12:15:34,175 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_40047_210_2290_1_1533290519_2374311_195-165693-0 from training. Duration: 0.89 2023-10-03 12:15:34,455 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.nonlin_attention.balancer.prob, batch_count=1263500.0, ans=0.125 2023-10-03 12:15:35,890 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.conv_module1.balancer2.min_positive, batch_count=1263566.6666666667, ans=0.05 2023-10-03 12:15:37,006 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_5522_210_3273_1_1527249606_3909096_366-172508-0_sp0.9 from training. Duration: 0.7333125 2023-10-03 12:15:37,087 WARNING [train.py:1204] (2/4) Exclude cut with ID _21878_210_3525_1_1531738772002_7264330_187-10504-0_sp1.1 from training. Duration: 0.963625 2023-10-03 12:15:40,033 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.ff3_skip_rate, batch_count=1263566.6666666667, ans=0.0 2023-10-03 12:15:41,234 WARNING [train.py:1204] (2/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_282-257791-0_sp1.1 from training. Duration: 0.7818125 2023-10-03 12:15:42,773 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_43831_210_2158_1_1533725502_5872791_467-288776-0_sp1.1 from training. Duration: 0.92725 2023-10-03 12:15:44,111 WARNING [train.py:1204] (2/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_138-154812-0_sp1.1 from training. Duration: 0.7090625 2023-10-03 12:15:45,472 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_1154-185930-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 12:15:45,497 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_2655_210_2087_1_1525688959_3897400_52-279899-0 from training. Duration: 0.69 2023-10-03 12:15:46,914 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_141-89113-0_sp1.1 from training. Duration: 0.7818125 2023-10-03 12:15:48,976 WARNING [train.py:1204] (2/4) Exclude cut with ID _55134_210_21982_1_1534590028377_3882170_297-47310-0_sp1.1 from training. Duration: 0.963625 2023-10-03 12:15:51,645 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_486-151131-0_sp1.1 from training. Duration: 0.77275 2023-10-03 12:15:53,022 WARNING [train.py:1204] (2/4) Exclude cut with ID _44778_210_2054_1_1533798127203_7443090_21-232980-0_sp1.1 from training. Duration: 0.936375 2023-10-03 12:15:54,432 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6165_210_3528_1_1527590982_4379841_338-106380-0_sp1.1 from training. Duration: 0.92725 2023-10-03 12:15:54,689 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=1263633.3333333333, ans=0.1 2023-10-03 12:15:55,785 WARNING [train.py:1204] (2/4) Exclude cut with ID _55162_210_17071_1_1534408248583_3753480_355-257218-0_sp1.1 from training. Duration: 0.97275 2023-10-03 12:15:56,008 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.conv_module2.balancer1.prob, batch_count=1263633.3333333333, ans=0.125 2023-10-03 12:15:57,134 WARNING [train.py:1204] (2/4) Exclude cut with ID _2895_210_2477_1_1525956785794_4063750_289-11475-0 from training. Duration: 0.82 2023-10-03 12:16:03,226 WARNING [train.py:1204] (2/4) Exclude cut with ID _16756_210_4915_1_1531047803763_7104259_839-183431-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 12:16:04,960 WARNING [train.py:1204] (2/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_427-208901-0_sp0.9 from training. Duration: 0.7555625 2023-10-03 12:16:05,012 WARNING [train.py:1204] (2/4) Exclude cut with ID _36512_210_6987_1_1533033143998_3126069_181-306500-0 from training. Duration: 0.95 2023-10-03 12:16:09,112 WARNING [train.py:1204] (2/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_28-70987-0_sp1.1 from training. Duration: 0.7909375 2023-10-03 12:16:14,500 WARNING [train.py:1204] (2/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_590-58996-0_sp1.1 from training. Duration: 0.936375 2023-10-03 12:16:17,395 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.0.bypass_mid.scale_min, batch_count=1263700.0, ans=0.2 2023-10-03 12:16:19,144 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_48730_210_5399_1_1533885886_2212769_108-47561-0_sp1.1 from training. Duration: 0.936375 2023-10-03 12:16:25,092 WARNING [train.py:1204] (2/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_401-158239-0_sp0.9 from training. Duration: 0.788875 2023-10-03 12:16:25,105 WARNING [train.py:1204] (2/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_278-154492-0_sp1.1 from training. Duration: 0.6909375 2023-10-03 12:16:25,119 WARNING [train.py:1204] (2/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_137-116442-0 from training. Duration: 0.78 2023-10-03 12:16:26,571 WARNING [train.py:1204] (2/4) Exclude cut with ID _3541_210_2477_1_1526475408709_3263973_189-317158-0 from training. Duration: 0.9 2023-10-03 12:16:28,010 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_47896_210_20508_1_1534492518_5958868_558-314572-0 from training. Duration: 0.97 2023-10-03 12:16:29,683 WARNING [train.py:1204] (2/4) Exclude cut with ID _66070_210_7640_1_1535441393096_7805310_290-40284-0_sp1.1 from training. Duration: 0.97275 2023-10-03 12:16:29,718 WARNING [train.py:1204] (2/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_110-224688-0_sp1.1 from training. Duration: 0.8818125 2023-10-03 12:16:31,516 WARNING [train.py:1204] (2/4) Exclude cut with ID _7719_210_3525_1_1528456153163_4296960_88-250259-0 from training. Duration: 0.84 2023-10-03 12:16:32,949 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8453_210_4211_1_1528502940_8562730_1157-114102-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 12:16:32,981 WARNING [train.py:1204] (2/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_317-175062-0_sp1.1 from training. Duration: 0.7454375 2023-10-03 12:16:32,986 WARNING [train.py:1204] (2/4) Exclude cut with ID _29857_210_20034_1_1532395886753_3798160_254-347254-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 12:16:33,062 WARNING [train.py:1204] (2/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_28-70987-0 from training. Duration: 0.87 2023-10-03 12:16:34,885 WARNING [train.py:1204] (2/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_321-335475-0 from training. Duration: 0.45 2023-10-03 12:16:35,478 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.4.encoder.layers.1.nonlin_attention.whiten2, num_groups=1, num_channels=384, metric=3.22 vs. limit=15.0 2023-10-03 12:16:36,401 INFO [train.py:1046] (2/4) Epoch 36, batch 3650, loss[loss=0.1654, simple_loss=0.2424, pruned_loss=0.04422, over 23763.00 frames. ], tot_loss[loss=0.1574, simple_loss=0.2359, pruned_loss=0.03943, over 4708923.81 frames. ], batch size: 179, lr: 2.82e-03, grad_scale: 16.0 2023-10-03 12:16:37,893 WARNING [train.py:1204] (2/4) Exclude cut with ID _40901_210_5665_1_1533363789958_3856130_356-107330-0_sp1.1 from training. Duration: 0.936375 2023-10-03 12:16:39,250 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_3780_210_2087_1_1526467389_2145572_176-107140-0 from training. Duration: 0.69 2023-10-03 12:16:43,370 WARNING [train.py:1204] (2/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_80-135200-0 from training. Duration: 0.79 2023-10-03 12:16:44,794 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_33348_210_5632_1_1533086912_3324030_85-28583-0_sp0.9 from training. Duration: 0.911125 2023-10-03 12:16:49,246 WARNING [train.py:1204] (2/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_29-293896-0 from training. Duration: 0.61 2023-10-03 12:16:50,783 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.nonlin_attention.balancer.max_positive, batch_count=1263900.0, ans=0.95 2023-10-03 12:16:51,998 WARNING [train.py:1204] (2/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_194-252800-0 from training. Duration: 0.97 2023-10-03 12:16:54,886 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_360-52446-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 12:16:54,888 WARNING [train.py:1204] (2/4) Exclude cut with ID _9389_210_1664_1_1529217337301_5289269_289-213417-0_sp0.9 from training. Duration: 0.988875 2023-10-03 12:16:56,737 WARNING [train.py:1204] (2/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_143-86344-0_sp0.9 from training. Duration: 0.8555625 2023-10-03 12:16:58,203 WARNING [train.py:1204] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_520-126729-0_sp0.9 from training. Duration: 0.77775 2023-10-03 12:16:58,229 WARNING [train.py:1204] (2/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_76-308663-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 12:16:59,790 WARNING [train.py:1204] (2/4) Exclude cut with ID _8021_210_7994_1_1528376451358_3653069_100-140231-0 from training. Duration: 0.67 2023-10-03 12:17:01,016 WARNING [train.py:1204] (2/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_387-131538-0_sp0.9 from training. Duration: 0.9 2023-10-03 12:17:01,064 WARNING [train.py:1204] (2/4) Exclude cut with ID _6275_210_2084_1_1528110062213_7669049_365-182054-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 12:17:01,110 WARNING [train.py:1204] (2/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_345-340690-0 from training. Duration: 0.93 2023-10-03 12:17:04,228 WARNING [train.py:1204] (2/4) Exclude cut with ID _5409_210_1388_1_1527292850189_5865191_178-127970-0_sp0.9 from training. Duration: 0.6333125 2023-10-03 12:17:04,290 WARNING [train.py:1204] (2/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_352-12734-0_sp1.1 from training. Duration: 0.92725 2023-10-03 12:17:04,294 WARNING [train.py:1204] (2/4) Exclude cut with ID _65134_210_14763_1_1535335393985_3505368_121-129373-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 12:17:07,636 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.612e+02 1.919e+02 2.181e+02 2.473e+02 3.454e+02, threshold=4.361e+02, percent-clipped=0.0 2023-10-03 12:17:07,703 WARNING [train.py:1204] (2/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_557-324258-0_sp1.1 from training. Duration: 0.736375 2023-10-03 12:17:09,181 WARNING [train.py:1204] (2/4) Exclude cut with ID _48678_210_5663_1_1533988830947_3769199_306-255886-0 from training. Duration: 0.77 2023-10-03 12:17:10,557 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7544_210_6753_1_1528587000_691468_87-178599-0 from training. Duration: 0.87 2023-10-03 12:17:10,615 WARNING [train.py:1204] (2/4) Exclude cut with ID _2941_210_2577_1_1526199673150_2489759_208-236402-0_sp1.1 from training. Duration: 0.963625 2023-10-03 12:17:13,606 WARNING [train.py:1204] (2/4) Exclude cut with ID _38670_210_15379_1_1533448810659_4391059_65-169397-0 from training. Duration: 0.97 2023-10-03 12:17:15,055 WARNING [train.py:1204] (2/4) Exclude cut with ID _30304_210_6319_1_1532476827261_7582060_306-328175-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 12:17:15,064 WARNING [train.py:1204] (2/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_212-149947-0_sp1.1 from training. Duration: 0.663625 2023-10-03 12:17:16,643 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.conv_module2.balancer1.prob, batch_count=1263966.6666666667, ans=0.125 2023-10-03 12:17:19,226 WARNING [train.py:1204] (2/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_321-22821-0_sp1.1 from training. Duration: 0.8090625 2023-10-03 12:17:20,779 WARNING [train.py:1204] (2/4) Exclude cut with ID _44010_210_4525_1_1533884262964_4013620_483-69110-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 12:17:20,788 WARNING [train.py:1204] (2/4) Exclude cut with ID _14922_210_6228_1_1530707435273_3693310_38-187851-0_sp1.1 from training. Duration: 0.563625 2023-10-03 12:17:22,181 WARNING [train.py:1204] (2/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_129-337802-0_sp0.9 from training. Duration: 0.8 2023-10-03 12:17:25,228 WARNING [train.py:1204] (2/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_268-87909-0_sp1.1 from training. Duration: 0.9 2023-10-03 12:17:28,376 WARNING [train.py:1204] (2/4) Exclude cut with ID _21878_210_3525_1_1531738772002_7264330_559-184862-0_sp1.1 from training. Duration: 0.8545625 2023-10-03 12:17:29,961 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_46032_210_14656_1_1533781412_4507969_237-120766-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 12:17:31,232 WARNING [train.py:1204] (2/4) Exclude cut with ID _28427_210_4220_1_1532581240967_3061069_330-170906-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 12:17:31,238 WARNING [train.py:1204] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_401-51578-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 12:17:31,365 WARNING [train.py:1204] (2/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_209-205365-0_sp1.1 from training. Duration: 0.7181875 2023-10-03 12:17:32,832 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_23801_210_16223_1_1532066392_3294657_57-242170-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 12:17:34,463 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_127-37371-0_sp1.1 from training. Duration: 0.936375 2023-10-03 12:17:40,237 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9171W0019-578905 from training. Duration: 0.896 2023-10-03 12:17:44,407 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_24605_210_1794_1_1532047863_4149755_349-49056-0_sp1.1 from training. Duration: 0.92725 2023-10-03 12:17:44,418 WARNING [train.py:1204] (2/4) Exclude cut with ID _12302_210_5219_1_1529924446466_8107280_897-21237-0_sp1.1 from training. Duration: 0.936375 2023-10-03 12:17:44,508 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_9460_210_4211_1_1528954587_10049667_480-260269-0_sp0.9 from training. Duration: 0.62225 2023-10-03 12:17:45,852 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_348-152548-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 12:17:45,935 WARNING [train.py:1204] (2/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_301-330455-0_sp0.9 from training. Duration: 0.888875 2023-10-03 12:17:47,428 WARNING [train.py:1204] (2/4) Exclude cut with ID _61008_210_10633_1_1534852769198_6974370_345-91705-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 12:17:47,748 INFO [scaling.py:1118] (2/4) WithLoss: name=encoder.encoders.3.encoder.layers.2.self_attn_weights, loss-sum=0.000e+00 2023-10-03 12:17:48,760 WARNING [train.py:1204] (2/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_304-273433-0 from training. Duration: 0.99 2023-10-03 12:17:48,772 WARNING [train.py:1204] (2/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_359-345972-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 12:17:50,160 INFO [train.py:1046] (2/4) Epoch 36, batch 3700, loss[loss=0.178, simple_loss=0.2413, pruned_loss=0.05732, over 23712.00 frames. ], tot_loss[loss=0.1587, simple_loss=0.2376, pruned_loss=0.03986, over 4714327.29 frames. ], batch size: 164, lr: 2.82e-03, grad_scale: 16.0 2023-10-03 12:17:50,219 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_2296_210_2087_1_1525503448_3397335_57-185430-0_sp1.1 from training. Duration: 0.5454375 2023-10-03 12:17:51,571 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_1256-120974-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 12:17:51,745 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.feed_forward1.out_proj.dropout_p, batch_count=1264166.6666666667, ans=0.1 2023-10-03 12:17:52,938 WARNING [train.py:1204] (2/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_372-30302-0_sp0.9 from training. Duration: 0.9555625 2023-10-03 12:17:56,172 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_42012_210_7927_1_1533439936_3871556_451-344262-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 12:17:56,173 WARNING [train.py:1204] (2/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_28-324729-0 from training. Duration: 0.94 2023-10-03 12:17:56,181 WARNING [train.py:1204] (2/4) Exclude cut with ID _64135_210_2253_1_1535677267876_7608972_313-18394-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 12:17:58,182 WARNING [train.py:1204] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_620-276793-0_sp0.9 from training. Duration: 0.6555625 2023-10-03 12:17:58,213 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7544_210_6753_1_1528591327_1471426_172-348777-0_sp0.9 from training. Duration: 0.7333125 2023-10-03 12:18:04,158 WARNING [train.py:1204] (2/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_332-221057-0_sp0.9 from training. Duration: 0.7555625 2023-10-03 12:18:07,034 WARNING [train.py:1204] (2/4) Exclude cut with ID _19023_210_4353_1_1531378892552_5820780_5-300035-0_sp1.1 from training. Duration: 0.92725 2023-10-03 12:18:08,422 WARNING [train.py:1204] (2/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_343-309023-0_sp1.1 from training. Duration: 0.963625 2023-10-03 12:18:09,809 WARNING [train.py:1204] (2/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_37-41919-0_sp1.1 from training. Duration: 0.7909375 2023-10-03 12:18:09,841 WARNING [train.py:1204] (2/4) Exclude cut with ID _3541_210_2477_1_1526475408709_3263973_41-182910-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 12:18:10,536 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_229-45300-0_sp0.9 from training. Duration: 0.6333125 2023-10-03 12:18:13,113 WARNING [train.py:1204] (2/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_40-214716-0_sp1.1 from training. Duration: 0.963625 2023-10-03 12:18:13,231 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9309W0203-640330 from training. Duration: 0.896 2023-10-03 12:18:20,108 WARNING [train.py:1204] (2/4) Exclude cut with ID _60229_210_27767_1_1534838406158_3655350_264-279573-0_sp1.1 from training. Duration: 0.7545625 2023-10-03 12:18:20,157 WARNING [train.py:1204] (2/4) Exclude cut with ID _8394_210_6702_1_1528590504316_3540189_372-198878-0_sp0.9 from training. Duration: 0.6666875 2023-10-03 12:18:20,299 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.conv_skip_rate, batch_count=1264300.0, ans=0.0 2023-10-03 12:18:21,491 WARNING [train.py:1204] (2/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_359-118926-0_sp1.1 from training. Duration: 0.6545625 2023-10-03 12:18:21,516 WARNING [train.py:1204] (2/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_85-65056-0 from training. Duration: 0.96 2023-10-03 12:18:22,838 WARNING [train.py:1204] (2/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_89-242335-0_sp1.1 from training. Duration: 0.863625 2023-10-03 12:18:24,432 WARNING [train.py:1204] (2/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_128-49444-0_sp1.1 from training. Duration: 0.936375 2023-10-03 12:18:25,840 WARNING [train.py:1204] (2/4) Exclude cut with ID _30327_210_16049_1_1532484161295_3924169_401-70819-0 from training. Duration: 0.86 2023-10-03 12:18:27,759 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_41822_210_13270_1_1533955976_4379284_260-256391-0_sp1.1 from training. Duration: 0.936375 2023-10-03 12:18:27,911 WARNING [train.py:1204] (2/4) Exclude cut with ID _67335_210_5219_1_1535525975652_4275229_599-247317-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 12:18:31,193 WARNING [train.py:1204] (2/4) Exclude cut with ID _29857_210_20034_1_1532395886753_3798160_209-239267-0_sp1.1 from training. Duration: 0.936375 2023-10-03 12:18:31,213 WARNING [train.py:1204] (2/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_46-207599-0_sp1.1 from training. Duration: 0.5818125 2023-10-03 12:18:31,394 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.bypass_mid.scale_min, batch_count=1264300.0, ans=0.2 2023-10-03 12:18:34,486 WARNING [train.py:1204] (2/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_912-282412-0_sp1.1 from training. Duration: 0.4454375 2023-10-03 12:18:38,632 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_4183_210_2087_1_1526729100_1869838_141-166600-0_sp1.1 from training. Duration: 0.863625 2023-10-03 12:18:38,636 WARNING [train.py:1204] (2/4) Exclude cut with ID _15037_210_2253_1_1530752294104_3267650_296-192597-0 from training. Duration: 0.81 2023-10-03 12:18:40,302 WARNING [train.py:1204] (2/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_571-220562-0_sp1.1 from training. Duration: 0.963625 2023-10-03 12:18:40,334 WARNING [train.py:1204] (2/4) Exclude cut with ID _5287_210_2084_1_1527418883040_3878670_12-221186-0 from training. Duration: 0.98 2023-10-03 12:18:43,860 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.self_attn2.whiten.whitening_limit, batch_count=1264366.6666666667, ans=22.5 2023-10-03 12:18:44,683 WARNING [train.py:1204] (2/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_149-85602-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 12:18:44,710 WARNING [train.py:1204] (2/4) Exclude cut with ID _9235_210_5219_1_1528891154253_8046120_1003-101638-0_sp1.1 from training. Duration: 0.663625 2023-10-03 12:18:46,218 WARNING [train.py:1204] (2/4) Exclude cut with ID _55844_210_2346_1_1534471212569_7625707_1099-33689-0_sp1.1 from training. Duration: 0.92725 2023-10-03 12:18:47,510 WARNING [train.py:1204] (2/4) Exclude cut with ID _7606_210_4915_1_1528894253277_5080690_301-10732-0 from training. Duration: 0.81 2023-10-03 12:18:49,022 WARNING [train.py:1204] (2/4) Exclude cut with ID _22826_210_9994_1_1531908201522_3279169_403-34698-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 12:18:50,330 WARNING [train.py:1204] (2/4) Exclude cut with ID _8480_210_1874_1_1528589391205_6728330_755-112625-0_sp0.9 from training. Duration: 0.82225 2023-10-03 12:18:50,350 WARNING [train.py:1204] (2/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_527-238990-0_sp1.1 from training. Duration: 0.8181875 2023-10-03 12:18:50,384 WARNING [train.py:1204] (2/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_426-7328-0_sp1.1 from training. Duration: 0.92725 2023-10-03 12:18:53,067 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.nonlin_attention.balancer.min_positive, batch_count=1264433.3333333333, ans=0.05 2023-10-03 12:18:55,625 WARNING [train.py:1204] (2/4) Exclude cut with ID _3541_210_2477_1_1526475408709_3263973_189-317158-0_sp1.1 from training. Duration: 0.8181875 2023-10-03 12:18:55,697 WARNING [train.py:1204] (2/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_162-103620-0 from training. Duration: 0.93 2023-10-03 12:18:57,117 WARNING [train.py:1204] (2/4) Exclude cut with ID _22083_210_8254_1_1531702862439_3678970_49-234546-0 from training. Duration: 0.83 2023-10-03 12:18:57,159 WARNING [train.py:1204] (2/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_370-331600-0_sp1.1 from training. Duration: 0.8545625 2023-10-03 12:18:57,171 WARNING [train.py:1204] (2/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_166-59042-0_sp1.1 from training. Duration: 0.97275 2023-10-03 12:19:00,388 WARNING [train.py:1204] (2/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_138-332773-0_sp1.1 from training. Duration: 0.736375 2023-10-03 12:19:00,680 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.conv_module2.balancer2.prob, batch_count=1264433.3333333333, ans=0.125 2023-10-03 12:19:02,350 WARNING [train.py:1204] (2/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_90-30673-0_sp0.9 from training. Duration: 0.9333125 2023-10-03 12:19:02,689 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.attention_skip_rate, batch_count=1264433.3333333333, ans=0.0 2023-10-03 12:19:05,615 INFO [train.py:1046] (2/4) Epoch 36, batch 3750, loss[loss=0.1679, simple_loss=0.2545, pruned_loss=0.04069, over 24404.00 frames. ], tot_loss[loss=0.1593, simple_loss=0.2386, pruned_loss=0.04004, over 4720248.15 frames. ], batch size: 77, lr: 2.82e-03, grad_scale: 16.0 2023-10-03 12:19:05,689 WARNING [train.py:1204] (2/4) Exclude cut with ID _27967_210_12140_1_1532588391859_8419130_737-41898-0_sp1.1 from training. Duration: 0.936375 2023-10-03 12:19:05,974 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.ff3_skip_rate, batch_count=1264500.0, ans=0.0 2023-10-03 12:19:07,193 WARNING [train.py:1204] (2/4) Exclude cut with ID _8833_210_5219_1_1528592665210_7553059_329-125156-0_sp1.1 from training. Duration: 0.7454375 2023-10-03 12:19:07,288 WARNING [train.py:1204] (2/4) Exclude cut with ID _41817_210_18835_1_1533434477160_7423049_352-42537-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 12:19:08,599 WARNING [train.py:1204] (2/4) Exclude cut with ID _8365_210_2064_1_1528592311070_4677690_431-294102-0 from training. Duration: 0.89 2023-10-03 12:19:10,004 WARNING [train.py:1204] (2/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_454-314901-0_sp1.1 from training. Duration: 0.4181875 2023-10-03 12:19:11,464 WARNING [train.py:1204] (2/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_80-135200-0_sp0.9 from training. Duration: 0.87775 2023-10-03 12:19:13,167 WARNING [train.py:1204] (2/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_181-78632-0 from training. Duration: 0.78 2023-10-03 12:19:13,224 WARNING [train.py:1204] (2/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_300-27235-0_sp0.9 from training. Duration: 0.9666875 2023-10-03 12:19:14,633 WARNING [train.py:1204] (2/4) Exclude cut with ID _47790_210_16187_1_1533862814066_4207519_96-177176-0_sp1.1 from training. Duration: 0.97275 2023-10-03 12:19:16,031 WARNING [train.py:1204] (2/4) Exclude cut with ID _35941_210_15379_1_1532995294515_4023089_246-348742-0_sp1.1 from training. Duration: 0.97275 2023-10-03 12:19:17,422 WARNING [train.py:1204] (2/4) Exclude cut with ID _38670_210_15379_1_1533448810659_4391059_65-169397-0_sp1.1 from training. Duration: 0.8818125 2023-10-03 12:19:20,129 WARNING [train.py:1204] (2/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_1003-99642-0_sp1.1 from training. Duration: 0.963625 2023-10-03 12:19:24,428 WARNING [train.py:1204] (2/4) Exclude cut with ID _40402_210_18219_1_1533522324115_2953150_320-14078-0_sp1.1 from training. Duration: 0.863625 2023-10-03 12:19:24,529 WARNING [train.py:1204] (2/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_367-61330-0_sp0.9 from training. Duration: 0.8333125 2023-10-03 12:19:27,155 WARNING [train.py:1204] (2/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_515-298514-0_sp1.1 from training. Duration: 0.92725 2023-10-03 12:19:30,547 WARNING [train.py:1204] (2/4) Exclude cut with ID _53731_210_7994_1_1534553979244_3757214_471-320815-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 12:19:30,612 WARNING [train.py:1204] (2/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_306-232480-0 from training. Duration: 0.96 2023-10-03 12:19:30,684 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder_embed.conv.5.prob, batch_count=1264566.6666666667, ans=0.125 2023-10-03 12:19:32,585 WARNING [train.py:1204] (2/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_478-162247-0_sp0.9 from training. Duration: 0.911125 2023-10-03 12:19:34,419 WARNING [train.py:1204] (2/4) Exclude cut with ID _56423_210_5854_1_1534896061049_3165969_345-284696-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 12:19:35,743 WARNING [train.py:1204] (2/4) Exclude cut with ID _44778_210_2054_1_1533798127203_7443090_600-122253-0_sp1.1 from training. Duration: 0.963625 2023-10-03 12:19:37,204 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.627e+02 1.905e+02 2.182e+02 2.625e+02 6.484e+02, threshold=4.364e+02, percent-clipped=1.0 2023-10-03 12:19:38,668 WARNING [train.py:1204] (2/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_199-295611-0 from training. Duration: 0.55 2023-10-03 12:19:41,521 WARNING [train.py:1204] (2/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_734-129761-0 from training. Duration: 0.82 2023-10-03 12:19:43,080 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.balancer2.prob, batch_count=1264633.3333333333, ans=0.125 2023-10-03 12:19:44,154 WARNING [train.py:1204] (2/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_349-169257-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 12:19:44,192 WARNING [train.py:1204] (2/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_202-67017-0_sp0.9 from training. Duration: 0.911125 2023-10-03 12:19:46,184 WARNING [train.py:1204] (2/4) Exclude cut with ID _16908_210_5724_1_1531119157248_3772700_61-258978-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 12:19:49,190 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.balancer1.prob, batch_count=1264700.0, ans=0.125 2023-10-03 12:19:49,194 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=1264700.0, ans=0.1 2023-10-03 12:19:50,350 WARNING [train.py:1204] (2/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_172-156931-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 12:19:51,776 WARNING [train.py:1204] (2/4) Exclude cut with ID _4092_210_2151_1_1526643044449_3135759_356-278694-0_sp1.1 from training. Duration: 0.42725 2023-10-03 12:19:55,950 WARNING [train.py:1204] (2/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_116-332008-0 from training. Duration: 0.98 2023-10-03 12:19:56,150 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.feed_forward3.hidden_balancer.prob, batch_count=1264700.0, ans=0.125 2023-10-03 12:19:58,835 WARNING [train.py:1204] (2/4) Exclude cut with ID _29003_210_19596_1_1532507391720_3894098_358-114789-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 12:20:02,148 WARNING [train.py:1204] (2/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_152-212533-0_sp1.1 from training. Duration: 0.8818125 2023-10-03 12:20:02,217 WARNING [train.py:1204] (2/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_267-214116-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 12:20:05,670 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7543_210_6753_1_1528541907_1399322_165-323619-0_sp0.9 from training. Duration: 0.7666875 2023-10-03 12:20:06,404 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.3.encoder.layers.1.self_attn2.whiten, num_groups=1, num_channels=512, metric=13.17 vs. limit=22.5 2023-10-03 12:20:07,194 WARNING [train.py:1204] (2/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_577-332600-0_sp0.9 from training. Duration: 0.6333125 2023-10-03 12:20:08,585 WARNING [train.py:1204] (2/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_342-100157-0_sp0.9 from training. Duration: 0.6 2023-10-03 12:20:11,085 WARNING [train.py:1204] (2/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_141-89727-0_sp1.1 from training. Duration: 0.6454375 2023-10-03 12:20:12,536 WARNING [train.py:1204] (2/4) Exclude cut with ID _3992_210_1747_1_1526644816031_3731060_107-251434-0_sp1.1 from training. Duration: 0.9 2023-10-03 12:20:15,743 WARNING [train.py:1204] (2/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_401-53423-0_sp1.1 from training. Duration: 0.5 2023-10-03 12:20:19,910 INFO [train.py:1046] (2/4) Epoch 36, batch 3800, loss[loss=0.1629, simple_loss=0.2525, pruned_loss=0.0366, over 24398.00 frames. ], tot_loss[loss=0.1595, simple_loss=0.239, pruned_loss=0.03999, over 4722794.97 frames. ], batch size: 77, lr: 2.82e-03, grad_scale: 16.0 2023-10-03 12:20:24,049 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7762_210_1794_1_1528199508_4057408_422-227034-0_sp1.1 from training. Duration: 0.663625 2023-10-03 12:20:28,275 WARNING [train.py:1204] (2/4) Exclude cut with ID _4184_210_2550_1_1526730192013_2219380_102-250299-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 12:20:28,352 WARNING [train.py:1204] (2/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_253-308377-0_sp0.9 from training. Duration: 0.5444375 2023-10-03 12:20:29,643 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_271-297505-0 from training. Duration: 0.99 2023-10-03 12:20:31,487 WARNING [train.py:1204] (2/4) Exclude cut with ID _35941_210_15379_1_1532995294515_4023089_213-142973-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 12:20:31,778 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.conv_module2.balancer2.prob, batch_count=1264833.3333333333, ans=0.125 2023-10-03 12:20:32,890 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7544_210_6753_1_1528587722_3576686_162-63018-0_sp1.1 from training. Duration: 0.92725 2023-10-03 12:20:34,854 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_365-217119-0_sp0.9 from training. Duration: 0.7 2023-10-03 12:20:36,290 WARNING [train.py:1204] (2/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_662-275621-0_sp0.9 from training. Duration: 0.4666875 2023-10-03 12:20:36,292 WARNING [train.py:1204] (2/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_374-187555-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 12:20:37,536 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_289-180904-0_sp1.1 from training. Duration: 0.6909375 2023-10-03 12:20:39,033 WARNING [train.py:1204] (2/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_189-47206-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 12:20:39,063 WARNING [train.py:1204] (2/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_234-14174-0_sp1.1 from training. Duration: 0.7454375 2023-10-03 12:20:40,435 WARNING [train.py:1204] (2/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_576-76621-0_sp1.1 from training. Duration: 0.963625 2023-10-03 12:20:41,789 WARNING [train.py:1204] (2/4) Exclude cut with ID _13253_210_1925_1_1530757775819_3577873_253-246386-0 from training. Duration: 0.9 2023-10-03 12:20:44,598 WARNING [train.py:1204] (2/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_372-6869-0_sp1.1 from training. Duration: 0.4 2023-10-03 12:20:44,651 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_21434_210_4238_1_1531916339_1222252_22-54954-0_sp1.1 from training. Duration: 0.936375 2023-10-03 12:20:47,964 WARNING [train.py:1204] (2/4) Exclude cut with ID _29853_210_7497_1_1532505600851_6642110_492-170361-0_sp1.1 from training. Duration: 0.92725 2023-10-03 12:20:49,393 WARNING [train.py:1204] (2/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_505-97987-0_sp0.9 from training. Duration: 0.9555625 2023-10-03 12:20:49,433 WARNING [train.py:1204] (2/4) Exclude cut with ID _8365_210_2064_1_1528592311070_4677690_371-122295-0_sp0.9 from training. Duration: 0.611125 2023-10-03 12:20:52,046 WARNING [train.py:1204] (2/4) Exclude cut with ID _14268_210_9206_1_1530622771285_4714099_637-86630-0_sp1.1 from training. Duration: 0.463625 2023-10-03 12:20:52,070 WARNING [train.py:1204] (2/4) Exclude cut with ID _29857_210_20034_1_1532395886753_3798160_372-5678-0_sp1.1 from training. Duration: 0.963625 2023-10-03 12:20:54,798 WARNING [train.py:1204] (2/4) Exclude cut with ID _52892_210_6721_1_1534378941259_4124129_90-165108-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 12:20:55,027 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.attention_skip_rate, batch_count=1264966.6666666667, ans=0.0 2023-10-03 12:20:56,165 WARNING [train.py:1204] (2/4) Exclude cut with ID _27052_210_5986_1_1532329197187_3585009_143-339198-0_sp1.1 from training. Duration: 0.963625 2023-10-03 12:21:00,380 WARNING [train.py:1204] (2/4) Exclude cut with ID _14268_210_9206_1_1530622771285_4714099_637-86630-0_sp0.9 from training. Duration: 0.5666875 2023-10-03 12:21:00,382 WARNING [train.py:1204] (2/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_401-255491-0 from training. Duration: 0.7 2023-10-03 12:21:03,566 WARNING [train.py:1204] (2/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_178-6946-0_sp1.1 from training. Duration: 0.97275 2023-10-03 12:21:07,179 WARNING [train.py:1204] (2/4) Exclude cut with ID _27485_210_4916_1_1532739191677_3668269_460-163885-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 12:21:11,389 WARNING [train.py:1204] (2/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_270-26053-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 12:21:14,688 WARNING [train.py:1204] (2/4) Exclude cut with ID _14170_210_5219_1_1530525949868_4469650_412-317077-0 from training. Duration: 0.75 2023-10-03 12:21:17,302 WARNING [train.py:1204] (2/4) Exclude cut with ID _5289_210_2084_1_1527678202736_3779299_142-103317-0 from training. Duration: 0.84 2023-10-03 12:21:18,601 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_150-143438-0_sp1.1 from training. Duration: 0.92725 2023-10-03 12:21:20,041 WARNING [train.py:1204] (2/4) Exclude cut with ID _27894_210_12392_1_1532415496078_6902439_668-269779-0_sp1.1 from training. Duration: 0.97275 2023-10-03 12:21:20,096 WARNING [train.py:1204] (2/4) Exclude cut with ID _18513_210_9206_1_1531618215223_3862620_56-222075-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 12:21:20,351 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.bypass.scale_min, batch_count=1265100.0, ans=0.2 2023-10-03 12:21:22,821 WARNING [train.py:1204] (2/4) Exclude cut with ID _4092_210_2151_1_1526643044449_3135759_356-278694-0 from training. Duration: 0.47 2023-10-03 12:21:25,648 WARNING [train.py:1204] (2/4) Exclude cut with ID _25808_210_13292_1_1532343514562_7366109_633-47524-0 from training. Duration: 0.99 2023-10-03 12:21:25,658 WARNING [train.py:1204] (2/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_872-264525-0 from training. Duration: 0.65 2023-10-03 12:21:25,691 WARNING [train.py:1204] (2/4) Exclude cut with ID _14632_210_9988_1_1530691279392_3923200_239-232545-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 12:21:27,092 WARNING [train.py:1204] (2/4) Exclude cut with ID _14349_210_8341_1_1530615710818_3442470_153-27432-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 12:21:32,792 INFO [train.py:1046] (2/4) Epoch 36, batch 3850, loss[loss=0.155, simple_loss=0.223, pruned_loss=0.04353, over 23782.00 frames. ], tot_loss[loss=0.1588, simple_loss=0.238, pruned_loss=0.03979, over 4718627.00 frames. ], batch size: 212, lr: 2.82e-03, grad_scale: 16.0 2023-10-03 12:21:32,889 WARNING [train.py:1204] (2/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_37-41919-0_sp0.9 from training. Duration: 0.9666875 2023-10-03 12:21:32,948 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_196-289637-0_sp0.9 from training. Duration: 0.8333125 2023-10-03 12:21:39,713 WARNING [train.py:1204] (2/4) Exclude cut with ID _13678_210_7497_1_1530530069226_3463170_371-3221-0_sp1.1 from training. Duration: 0.8181875 2023-10-03 12:21:41,079 WARNING [train.py:1204] (2/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_557-324258-0 from training. Duration: 0.81 2023-10-03 12:21:41,166 WARNING [train.py:1204] (2/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_1012-279473-0_sp1.1 from training. Duration: 0.6090625 2023-10-03 12:21:42,511 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_66916_210_14656_1_1535725525_326566_7-92649-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 12:21:47,262 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_9460_210_4211_1_1528954587_10049667_480-260269-0_sp1.1 from training. Duration: 0.5090625 2023-10-03 12:21:48,528 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_40896_210_15710_1_1533433956_7100802_121-155001-0_sp1.1 from training. Duration: 0.92725 2023-10-03 12:21:51,281 WARNING [train.py:1204] (2/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_188-171673-0_sp1.1 from training. Duration: 0.52725 2023-10-03 12:21:52,656 WARNING [train.py:1204] (2/4) Exclude cut with ID _47790_210_16187_1_1533862814066_4207519_2-39653-0 from training. Duration: 0.98 2023-10-03 12:21:52,889 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.conv_skip_rate, batch_count=1265233.3333333333, ans=0.0 2023-10-03 12:21:58,074 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_23850_210_4238_1_1532232597_1632433_112-229012-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 12:21:59,529 WARNING [train.py:1204] (2/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_626-79528-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 12:21:59,708 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=1265233.3333333333, ans=0.1 2023-10-03 12:22:01,555 WARNING [train.py:1204] (2/4) Exclude cut with ID _30304_210_6319_1_1532476827261_7582060_856-90052-0_sp1.1 from training. Duration: 0.963625 2023-10-03 12:22:02,909 WARNING [train.py:1204] (2/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_122-109023-0_sp0.9 from training. Duration: 0.8444375 2023-10-03 12:22:04,229 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.509e+02 1.990e+02 2.175e+02 2.458e+02 3.348e+02, threshold=4.349e+02, percent-clipped=0.0 2023-10-03 12:22:05,775 WARNING [train.py:1204] (2/4) Exclude cut with ID _64414_210_17301_1_1535243252048_3757759_37-70328-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 12:22:07,481 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_20120_210_6753_1_1531643035_5482859_622-344907-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 12:22:07,538 WARNING [train.py:1204] (2/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_410-321956-0_sp1.1 from training. Duration: 0.92725 2023-10-03 12:22:07,559 WARNING [train.py:1204] (2/4) Exclude cut with ID _7562_210_2084_1_1528887767916_3946229_186-326422-0_sp0.9 from training. Duration: 0.9333125 2023-10-03 12:22:08,235 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.1.encoder.layers.1.self_attn1.whiten, num_groups=1, num_channels=256, metric=10.78 vs. limit=22.5 2023-10-03 12:22:08,899 WARNING [train.py:1204] (2/4) Exclude cut with ID _37537_210_2253_1_1533171758568_3421949_127-285192-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 12:22:12,138 WARNING [train.py:1204] (2/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_723-228168-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 12:22:12,201 WARNING [train.py:1204] (2/4) Exclude cut with ID _13678_210_7497_1_1530530069226_3463170_426-246984-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 12:22:12,222 WARNING [train.py:1204] (2/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_399-156791-0_sp0.9 from training. Duration: 0.92225 2023-10-03 12:22:12,270 WARNING [train.py:1204] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_391-22148-0 from training. Duration: 0.77 2023-10-03 12:22:12,300 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_3475_210_2028_1_1526193578_3622569_64-297072-0 from training. Duration: 0.91 2023-10-03 12:22:13,765 WARNING [train.py:1204] (2/4) Exclude cut with ID _17875_210_2087_1_1531224012907_1821339_225-280015-0_sp1.1 from training. Duration: 0.963625 2023-10-03 12:22:13,791 WARNING [train.py:1204] (2/4) Exclude cut with ID _50655_210_18628_1_1534229942193_3772154_325-246169-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 12:22:16,594 WARNING [train.py:1204] (2/4) Exclude cut with ID _29529_210_7234_1_1532393961455_3977460_597-257348-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 12:22:16,641 WARNING [train.py:1204] (2/4) Exclude cut with ID _58895_210_8750_1_1535184081320_3649089_77-173485-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 12:22:17,928 WARNING [train.py:1204] (2/4) Exclude cut with ID _6270_210_2084_1_1527850911573_3856420_26-277343-0 from training. Duration: 0.82 2023-10-03 12:22:19,858 WARNING [train.py:1204] (2/4) Exclude cut with ID _14368_210_4220_1_1530532714870_3595529_144-3287-0 from training. Duration: 0.67 2023-10-03 12:22:21,258 WARNING [train.py:1204] (2/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_293-145278-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 12:22:23,896 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_40047_210_2290_1_1533290519_2374311_257-335162-0 from training. Duration: 0.95 2023-10-03 12:22:24,032 WARNING [train.py:1204] (2/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_619-344499-0_sp0.9 from training. Duration: 0.7 2023-10-03 12:22:27,188 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.feed_forward1.out_proj.dropout_p, batch_count=1265366.6666666667, ans=0.1 2023-10-03 12:22:28,417 WARNING [train.py:1204] (2/4) Exclude cut with ID _19504_210_9994_1_1531648863242_3325220_456-315379-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 12:22:28,675 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.conv_module2.balancer2.prob, batch_count=1265366.6666666667, ans=0.125 2023-10-03 12:22:30,458 WARNING [train.py:1204] (2/4) Exclude cut with ID _55857_210_2346_1_1534644007831_6898209_631-221692-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 12:22:33,223 WARNING [train.py:1204] (2/4) Exclude cut with ID _64631_210_27767_1_1535349580068_4165160_324-3682-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 12:22:33,250 WARNING [train.py:1204] (2/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_317-211107-0 from training. Duration: 0.92 2023-10-03 12:22:36,171 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.conv_module2.balancer1.max_abs, batch_count=1265433.3333333333, ans=10.0 2023-10-03 12:22:37,832 WARNING [train.py:1204] (2/4) Exclude cut with ID _6789_210_1534_1_1527857937218_3118950_311-61262-0 from training. Duration: 0.65 2023-10-03 12:22:39,849 WARNING [train.py:1204] (2/4) Exclude cut with ID _28323_210_16187_1_1532480395667_4100300_365-68882-0_sp1.1 from training. Duration: 0.92725 2023-10-03 12:22:41,123 WARNING [train.py:1204] (2/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_816-181825-0_sp1.1 from training. Duration: 0.92725 2023-10-03 12:22:43,956 WARNING [train.py:1204] (2/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_132-269633-0_sp1.1 from training. Duration: 0.6181875 2023-10-03 12:22:43,973 WARNING [train.py:1204] (2/4) Exclude cut with ID _44585_210_18628_1_1533605350387_4518199_280-73534-0_sp1.1 from training. Duration: 0.8090625 2023-10-03 12:22:45,389 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_68095_210_20185_1_1535718226_8888855_1009-164755-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 12:22:45,464 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_40223_210_6228_1_1533298404_4812267_400-103813-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 12:22:45,471 WARNING [train.py:1204] (2/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_491-261923-0_sp1.1 from training. Duration: 0.8909375 2023-10-03 12:22:45,477 WARNING [train.py:1204] (2/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_712-342563-0 from training. Duration: 0.97 2023-10-03 12:22:46,789 WARNING [train.py:1204] (2/4) Exclude cut with ID _41161_210_20034_1_1533431249657_3555314_175-190032-0_sp1.1 from training. Duration: 0.963625 2023-10-03 12:22:48,160 INFO [train.py:1046] (2/4) Epoch 36, batch 3900, loss[loss=0.1702, simple_loss=0.2501, pruned_loss=0.04519, over 23285.00 frames. ], tot_loss[loss=0.1584, simple_loss=0.2376, pruned_loss=0.03958, over 4718159.47 frames. ], batch size: 93, lr: 2.82e-03, grad_scale: 16.0 2023-10-03 12:22:48,232 WARNING [train.py:1204] (2/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_788-298907-0 from training. Duration: 0.89 2023-10-03 12:22:48,255 WARNING [train.py:1204] (2/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_225-70961-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 12:22:48,267 WARNING [train.py:1204] (2/4) Exclude cut with ID _58583_210_5236_1_1534726835266_3574249_235-175994-0_sp1.1 from training. Duration: 0.92725 2023-10-03 12:22:50,397 WARNING [train.py:1204] (2/4) Exclude cut with ID _12302_210_5219_1_1529924446466_8107280_907-182949-0_sp1.1 from training. Duration: 0.836375 2023-10-03 12:22:50,435 WARNING [train.py:1204] (2/4) Exclude cut with ID _54871_210_22356_1_1534665798253_3856359_94-193692-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 12:22:52,860 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_4528_210_3273_1_1527076654_3276777_342-168068-0_sp0.9 from training. Duration: 0.9666875 2023-10-03 12:22:54,148 WARNING [train.py:1204] (2/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_368-74239-0_sp1.1 from training. Duration: 0.92725 2023-10-03 12:22:54,150 WARNING [train.py:1204] (2/4) Exclude cut with ID _44831_210_13427_1_1533816011549_3689035_291-295928-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 12:22:55,558 WARNING [train.py:1204] (2/4) Exclude cut with ID _56406_210_9288_1_1534899618664_3496149_455-131997-0_sp1.1 from training. Duration: 0.97275 2023-10-03 12:22:55,565 WARNING [train.py:1204] (2/4) Exclude cut with ID _6473_210_1741_1_1527901307396_3815380_108-17335-0 from training. Duration: 0.65 2023-10-03 12:22:55,609 WARNING [train.py:1204] (2/4) Exclude cut with ID _14224_210_4381_1_1530529062037_4264010_286-135600-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 12:22:58,405 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_14056_210_4163_1_1530604483_7572827_928-321675-0_sp1.1 from training. Duration: 0.936375 2023-10-03 12:22:59,735 WARNING [train.py:1204] (2/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_81-300212-0_sp1.1 from training. Duration: 0.5454375 2023-10-03 12:22:59,777 WARNING [train.py:1204] (2/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_29-280622-0_sp0.9 from training. Duration: 0.911125 2023-10-03 12:23:01,546 WARNING [train.py:1204] (2/4) Exclude cut with ID _61079_210_4525_1_1534921008198_4011419_228-176447-0_sp1.1 from training. Duration: 0.936375 2023-10-03 12:23:04,372 WARNING [train.py:1204] (2/4) Exclude cut with ID _8394_210_6702_1_1528590504316_3540189_372-198878-0_sp1.1 from training. Duration: 0.5454375 2023-10-03 12:23:04,383 WARNING [train.py:1204] (2/4) Exclude cut with ID _64521_210_14699_1_1535243427259_3815328_244-208725-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 12:23:04,506 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8275_210_4198_1_1528455320_3892571_491-17707-0_sp0.9 from training. Duration: 0.711125 2023-10-03 12:23:05,912 WARNING [train.py:1204] (2/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_375-245677-0 from training. Duration: 0.81 2023-10-03 12:23:05,921 WARNING [train.py:1204] (2/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_690-286055-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 12:23:09,579 WARNING [train.py:1204] (2/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_89-321955-0 from training. Duration: 0.8 2023-10-03 12:23:09,616 WARNING [train.py:1204] (2/4) Exclude cut with ID _18895_210_6228_1_1531357274234_3784930_213-98721-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 12:23:10,970 WARNING [train.py:1204] (2/4) Exclude cut with ID _19708_210_9206_1_1532136650733_4238364_347-65240-0 from training. Duration: 0.97 2023-10-03 12:23:12,250 WARNING [train.py:1204] (2/4) Exclude cut with ID _64135_210_2253_1_1535677267876_7608972_498-5039-0 from training. Duration: 0.78 2023-10-03 12:23:15,175 WARNING [train.py:1204] (2/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_146-276944-0_sp1.1 from training. Duration: 0.7545625 2023-10-03 12:23:16,557 WARNING [train.py:1204] (2/4) Exclude cut with ID _63171_210_15488_1_1535104792794_3295110_155-34488-0_sp1.1 from training. Duration: 0.97275 2023-10-03 12:23:16,570 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_5438_210_1794_1_1527336687_2600009_233-58072-0_sp0.9 from training. Duration: 0.8555625 2023-10-03 12:23:16,621 WARNING [train.py:1204] (2/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_63-101996-0_sp0.9 from training. Duration: 0.97775 2023-10-03 12:23:19,795 WARNING [train.py:1204] (2/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_271-269084-0_sp1.1 from training. Duration: 0.7545625 2023-10-03 12:23:22,454 WARNING [train.py:1204] (2/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_345-167986-0_sp1.1 from training. Duration: 0.763625 2023-10-03 12:23:25,104 WARNING [train.py:1204] (2/4) Exclude cut with ID _39290_210_4220_1_1533862778760_3075118_92-311600-0_sp1.1 from training. Duration: 0.8 2023-10-03 12:23:25,110 WARNING [train.py:1204] (2/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_209-65306-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 12:23:25,195 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_26433_210_12108_1_1532157158_4056480_317-335807-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 12:23:29,604 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.ff2_skip_rate, batch_count=1265633.3333333333, ans=0.0 2023-10-03 12:23:31,418 WARNING [train.py:1204] (2/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_238-94127-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 12:23:31,477 WARNING [train.py:1204] (2/4) Exclude cut with ID _41314_210_12651_1_1533459476543_3766460_207-132038-0_sp1.1 from training. Duration: 0.8545625 2023-10-03 12:23:39,015 WARNING [train.py:1204] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_167-137255-0_sp1.1 from training. Duration: 0.5818125 2023-10-03 12:23:40,346 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_2009_210_2071_1_1525481134_4588937_385-302100-0_sp0.9 from training. Duration: 0.9666875 2023-10-03 12:23:44,828 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.1.feed_forward1.out_proj.dropout_p, batch_count=1265700.0, ans=0.1 2023-10-03 12:23:49,175 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_15517_210_6753_1_1530954008_3487336_253-110617-0_sp1.1 from training. Duration: 0.936375 2023-10-03 12:23:50,594 WARNING [train.py:1204] (2/4) Exclude cut with ID _39290_210_4220_1_1533862778760_3075118_92-311600-0_sp0.9 from training. Duration: 0.97775 2023-10-03 12:23:50,636 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_42-142473-0 from training. Duration: 0.72 2023-10-03 12:23:51,868 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_2296_210_2087_1_1525503448_3397335_57-185430-0 from training. Duration: 0.6 2023-10-03 12:23:51,880 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_11305_210_1794_1_1529750428_7178148_257-312870-0_sp0.9 from training. Duration: 0.97775 2023-10-03 12:23:52,001 WARNING [train.py:1204] (2/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_222-198127-0 from training. Duration: 0.84 2023-10-03 12:23:52,148 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.self_attn_weights.pos_emb_skip_rate, batch_count=1265766.6666666667, ans=0.0 2023-10-03 12:23:52,253 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.ff3_skip_rate, batch_count=1265766.6666666667, ans=0.0 2023-10-03 12:23:54,636 WARNING [train.py:1204] (2/4) Exclude cut with ID _65263_210_22356_1_1535763598691_3921037_423-202699-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 12:23:54,694 WARNING [train.py:1204] (2/4) Exclude cut with ID _15037_210_2253_1_1530752294104_3267650_13-130361-0 from training. Duration: 0.99 2023-10-03 12:24:01,815 INFO [train.py:1046] (2/4) Epoch 36, batch 3950, loss[loss=0.1518, simple_loss=0.235, pruned_loss=0.03433, over 23513.00 frames. ], tot_loss[loss=0.1581, simple_loss=0.237, pruned_loss=0.03955, over 4723894.43 frames. ], batch size: 149, lr: 2.82e-03, grad_scale: 16.0 2023-10-03 12:24:03,206 WARNING [train.py:1204] (2/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_628-198080-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 12:24:03,296 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_3171_210_2087_1_1526109084_2330007_37-64334-0 from training. Duration: 0.82 2023-10-03 12:24:04,665 WARNING [train.py:1204] (2/4) Exclude cut with ID _5287_210_2084_1_1527418883040_3878670_12-221186-0_sp1.1 from training. Duration: 0.8909375 2023-10-03 12:24:06,494 WARNING [train.py:1204] (2/4) Exclude cut with ID _6653_210_2629_1_1527919172441_6732679_1103-56445-0_sp1.1 from training. Duration: 0.663625 2023-10-03 12:24:08,095 WARNING [train.py:1204] (2/4) Exclude cut with ID _65263_210_22356_1_1535763598691_3921037_169-140068-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 12:24:08,655 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.3.encoder.layers.3.self_attn2.whiten, num_groups=1, num_channels=512, metric=11.33 vs. limit=22.5 2023-10-03 12:24:14,266 WARNING [train.py:1204] (2/4) Exclude cut with ID IC0671W0118-331961 from training. Duration: 0.896 2023-10-03 12:24:15,599 WARNING [train.py:1204] (2/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_402-102311-0_sp1.1 from training. Duration: 0.6090625 2023-10-03 12:24:15,759 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.conv_skip_rate, batch_count=1265900.0, ans=0.0 2023-10-03 12:24:16,928 WARNING [train.py:1204] (2/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_202-293523-0 from training. Duration: 0.72 2023-10-03 12:24:16,984 WARNING [train.py:1204] (2/4) Exclude cut with ID IC0679W0038-335846 from training. Duration: 0.768 2023-10-03 12:24:17,013 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_37357_210_17024_1_1533089504_2316121_79-198866-0_sp1.1 from training. Duration: 0.936375 2023-10-03 12:24:19,836 WARNING [train.py:1204] (2/4) Exclude cut with ID _7606_210_4915_1_1528894253277_5080690_292-271285-0_sp1.1 from training. Duration: 0.963625 2023-10-03 12:24:19,848 WARNING [train.py:1204] (2/4) Exclude cut with ID _3541_210_2477_1_1526475408709_3263973_377-296839-0_sp0.9 from training. Duration: 0.611125 2023-10-03 12:24:19,851 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_45886_210_17024_1_1533704917_2435333_58-131069-0_sp1.1 from training. Duration: 0.963625 2023-10-03 12:24:23,208 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6961_210_2087_1_1527937168_6815011_395-185060-0 from training. Duration: 0.95 2023-10-03 12:24:25,734 WARNING [train.py:1204] (2/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_304-266479-0_sp1.1 from training. Duration: 0.97275 2023-10-03 12:24:25,787 WARNING [train.py:1204] (2/4) Exclude cut with ID _3867_210_1883_1_1526641200575_4475830_319-349850-0_sp1.1 from training. Duration: 0.6090625 2023-10-03 12:24:25,802 WARNING [train.py:1204] (2/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_304-59227-0_sp0.9 from training. Duration: 0.8555625 2023-10-03 12:24:27,190 WARNING [train.py:1204] (2/4) Exclude cut with ID _8021_210_7994_1_1528376451358_3653069_100-140231-0_sp0.9 from training. Duration: 0.7444375 2023-10-03 12:24:28,535 WARNING [train.py:1204] (2/4) Exclude cut with ID _8833_210_5219_1_1528592665210_7553059_329-125156-0_sp0.9 from training. Duration: 0.911125 2023-10-03 12:24:31,443 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.1.conv_skip_rate, batch_count=1265966.6666666667, ans=0.0 2023-10-03 12:24:32,504 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.566e+02 1.840e+02 2.075e+02 2.335e+02 2.837e+02, threshold=4.150e+02, percent-clipped=0.0 2023-10-03 12:24:37,882 WARNING [train.py:1204] (2/4) Exclude cut with ID _14459_210_2228_1_1531443641946_7516700_792-287234-0_sp1.1 from training. Duration: 0.92725 2023-10-03 12:24:37,902 WARNING [train.py:1204] (2/4) Exclude cut with ID _19708_210_9206_1_1532136650733_4238364_347-65240-0_sp1.1 from training. Duration: 0.8818125 2023-10-03 12:24:41,216 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.conv_skip_rate, batch_count=1265966.6666666667, ans=0.0 2023-10-03 12:24:45,177 WARNING [train.py:1204] (2/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_70-27732-0 from training. Duration: 0.94 2023-10-03 12:24:50,759 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_397-132565-0 from training. Duration: 0.99 2023-10-03 12:24:50,770 WARNING [train.py:1204] (2/4) Exclude cut with ID _49728_210_12651_1_1534041034294_6788150_624-195301-0 from training. Duration: 0.85 2023-10-03 12:24:50,810 WARNING [train.py:1204] (2/4) Exclude cut with ID _55429_210_10626_1_1534467588527_7174143_377-169391-0_sp1.1 from training. Duration: 0.87275 2023-10-03 12:24:54,170 WARNING [train.py:1204] (2/4) Exclude cut with ID _49376_210_5986_1_1533985278732_3370740_354-344695-0_sp1.1 from training. Duration: 0.9 2023-10-03 12:25:01,231 WARNING [train.py:1204] (2/4) Exclude cut with ID _4697_210_2069_1_1526815801953_3823098_276-96593-0_sp1.1 from training. Duration: 0.7 2023-10-03 12:25:01,241 WARNING [train.py:1204] (2/4) Exclude cut with ID _15012_210_1968_1_1530856820429_5095350_688-178849-0_sp0.9 from training. Duration: 0.711125 2023-10-03 12:25:02,544 WARNING [train.py:1204] (2/4) Exclude cut with ID _25565_210_12392_1_1532423009382_3952098_372-69023-0_sp1.1 from training. Duration: 0.936375 2023-10-03 12:25:02,585 WARNING [train.py:1204] (2/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_61-327062-0_sp1.1 from training. Duration: 0.736375 2023-10-03 12:25:02,618 WARNING [train.py:1204] (2/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_215-141059-0 from training. Duration: 0.62 2023-10-03 12:25:08,564 WARNING [train.py:1204] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_6-226750-0_sp1.1 from training. Duration: 0.8545625 2023-10-03 12:25:08,649 WARNING [train.py:1204] (2/4) Exclude cut with ID _14348_210_8341_1_1530599434996_3778640_240-232370-0_sp1.1 from training. Duration: 0.763625 2023-10-03 12:25:13,523 WARNING [train.py:1204] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_468-189058-0 from training. Duration: 0.96 2023-10-03 12:25:16,618 INFO [train.py:1046] (2/4) Epoch 36, batch 4000, loss[loss=0.1887, simple_loss=0.2583, pruned_loss=0.05961, over 19774.00 frames. ], tot_loss[loss=0.1587, simple_loss=0.2379, pruned_loss=0.03978, over 4721040.14 frames. ], batch size: 388, lr: 2.82e-03, grad_scale: 32.0 2023-10-03 12:25:22,274 WARNING [train.py:1204] (2/4) Exclude cut with ID _21987_210_5854_1_1531821500482_3637038_247-93340-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 12:25:27,848 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.1.encoder.layers.1.feed_forward2.out_whiten, num_groups=1, num_channels=256, metric=4.35 vs. limit=15.0 2023-10-03 12:25:29,746 WARNING [train.py:1204] (2/4) Exclude cut with ID _66868_210_9527_1_1535509287954_4561140_436-191602-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 12:25:33,947 WARNING [train.py:1204] (2/4) Exclude cut with ID _18895_210_6228_1_1531357274234_3784930_394-58203-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 12:25:33,994 WARNING [train.py:1204] (2/4) Exclude cut with ID _58605_210_12210_1_1534723195939_3904259_258-281498-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 12:25:35,275 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_33348_210_5632_1_1533086912_3324030_245-292752-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 12:25:35,296 WARNING [train.py:1204] (2/4) Exclude cut with ID _67831_210_12392_1_1535612464202_3092612_346-2148-0 from training. Duration: 0.93 2023-10-03 12:25:35,368 WARNING [train.py:1204] (2/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_369-222060-0_sp0.9 from training. Duration: 0.788875 2023-10-03 12:25:36,658 WARNING [train.py:1204] (2/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_245-174553-0 from training. Duration: 0.88 2023-10-03 12:25:36,664 WARNING [train.py:1204] (2/4) Exclude cut with ID _3541_210_2477_1_1526475408709_3263973_231-104736-0_sp1.1 from training. Duration: 0.7090625 2023-10-03 12:25:36,669 WARNING [train.py:1204] (2/4) Exclude cut with ID _44585_210_18628_1_1533605350387_4518199_280-73534-0 from training. Duration: 0.89 2023-10-03 12:25:38,703 WARNING [train.py:1204] (2/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_22-201476-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 12:25:40,834 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.self_attn_weights.pos_emb_skip_rate, batch_count=1266233.3333333333, ans=0.0 2023-10-03 12:25:41,957 WARNING [train.py:1204] (2/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_325-62802-0_sp0.9 from training. Duration: 0.9666875 2023-10-03 12:25:41,971 WARNING [train.py:1204] (2/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_199-274062-0_sp1.1 from training. Duration: 0.97275 2023-10-03 12:25:41,974 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_8-319957-0_sp1.1 from training. Duration: 0.87275 2023-10-03 12:25:42,006 WARNING [train.py:1204] (2/4) Exclude cut with ID _29343_210_4381_1_1532602757752_3674109_224-349726-0_sp1.1 from training. Duration: 0.936375 2023-10-03 12:25:42,011 WARNING [train.py:1204] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_520-126729-0_sp1.1 from training. Duration: 0.636375 2023-10-03 12:25:43,626 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.balancer1.prob, batch_count=1266233.3333333333, ans=0.125 2023-10-03 12:25:44,622 WARNING [train.py:1204] (2/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_239-22076-0_sp1.1 from training. Duration: 0.77275 2023-10-03 12:25:46,008 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9012W0011-501146 from training. Duration: 0.896 2023-10-03 12:25:47,739 WARNING [train.py:1204] (2/4) Exclude cut with ID _29857_210_20034_1_1532395886753_3798160_279-185801-0_sp1.1 from training. Duration: 0.82725 2023-10-03 12:25:47,797 WARNING [train.py:1204] (2/4) Exclude cut with ID _43552_210_8913_1_1533726031059_3523429_261-223933-0_sp1.1 from training. Duration: 0.92725 2023-10-03 12:25:47,983 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.conv_skip_rate, batch_count=1266300.0, ans=0.0 2023-10-03 12:25:50,707 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9029W0025-509426 from training. Duration: 0.896 2023-10-03 12:25:52,079 WARNING [train.py:1204] (2/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_337-181502-0_sp1.1 from training. Duration: 0.5909375 2023-10-03 12:25:52,093 WARNING [train.py:1204] (2/4) Exclude cut with ID _68635_210_9317_1_1535770813410_4315870_159-332599-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 12:25:58,225 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_161-276996-0 from training. Duration: 0.95 2023-10-03 12:25:58,263 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7088_210_1874_1_1527939776_1101113_127-68870-0_sp1.1 from training. Duration: 0.936375 2023-10-03 12:25:59,750 WARNING [train.py:1204] (2/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_322-217220-0_sp1.1 from training. Duration: 0.7818125 2023-10-03 12:26:01,091 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9072W0002-530636 from training. Duration: 0.768 2023-10-03 12:26:02,502 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_340-336408-0_sp1.1 from training. Duration: 0.8454375 2023-10-03 12:26:03,766 WARNING [train.py:1204] (2/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_395-37222-0 from training. Duration: 0.76 2023-10-03 12:26:03,768 WARNING [train.py:1204] (2/4) Exclude cut with ID _64099_210_17301_1_1535526977656_1578188_159-209156-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 12:26:04,378 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.4.encoder.layers.1.conv_module1.whiten, num_groups=1, num_channels=384, metric=3.21 vs. limit=15.0 2023-10-03 12:26:05,126 WARNING [train.py:1204] (2/4) Exclude cut with ID _53950_210_5816_1_1534417264919_3862951_28-29681-0_sp1.1 from training. Duration: 0.92725 2023-10-03 12:26:06,526 WARNING [train.py:1204] (2/4) Exclude cut with ID _4681_210_3525_1_1527250689464_120889_15-126760-0_sp1.1 from training. Duration: 0.736375 2023-10-03 12:26:06,630 WARNING [train.py:1204] (2/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_242-3472-0_sp1.1 from training. Duration: 0.663625 2023-10-03 12:26:07,866 WARNING [train.py:1204] (2/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_436-186311-0_sp0.9 from training. Duration: 0.72225 2023-10-03 12:26:09,609 WARNING [train.py:1204] (2/4) Exclude cut with ID _3675_210_4557_1_1526639934408_4182089_452-172147-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 12:26:11,520 WARNING [train.py:1204] (2/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_80-147301-0 from training. Duration: 0.84 2023-10-03 12:26:11,558 WARNING [train.py:1204] (2/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_351-107588-0_sp1.1 from training. Duration: 0.92725 2023-10-03 12:26:12,988 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9111W0002-550781 from training. Duration: 0.896 2023-10-03 12:26:18,539 WARNING [train.py:1204] (2/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_465-156500-0_sp1.1 from training. Duration: 0.6545625 2023-10-03 12:26:18,765 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.feed_forward1.out_proj.dropout_p, batch_count=1266433.3333333333, ans=0.1 2023-10-03 12:26:21,906 WARNING [train.py:1204] (2/4) Exclude cut with ID _13938_210_9988_1_1530518718978_3693960_304-112784-0_sp1.1 from training. Duration: 0.4181875 2023-10-03 12:26:25,848 WARNING [train.py:1204] (2/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_202-67017-0_sp1.1 from training. Duration: 0.7454375 2023-10-03 12:26:25,898 WARNING [train.py:1204] (2/4) Exclude cut with ID _58414_210_8254_1_1535421609162_7151182_448-4358-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 12:26:25,961 WARNING [train.py:1204] (2/4) Exclude cut with ID _66840_210_5219_1_1535452168426_4196349_203-243234-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 12:26:27,308 WARNING [train.py:1204] (2/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_201-213513-0_sp1.1 from training. Duration: 0.97275 2023-10-03 12:26:30,020 INFO [train.py:1046] (2/4) Epoch 36, batch 4050, loss[loss=0.1366, simple_loss=0.2137, pruned_loss=0.02981, over 24489.00 frames. ], tot_loss[loss=0.1593, simple_loss=0.2382, pruned_loss=0.04017, over 4721944.25 frames. ], batch size: 58, lr: 2.82e-03, grad_scale: 32.0 2023-10-03 12:26:33,325 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6291_210_3273_1_1527683649_1078498_117-187560-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 12:26:34,662 WARNING [train.py:1204] (2/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_135-174080-0_sp0.9 from training. Duration: 0.588875 2023-10-03 12:26:36,008 WARNING [train.py:1204] (2/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_45-265684-0 from training. Duration: 0.72 2023-10-03 12:26:38,766 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_435-313305-0_sp1.1 from training. Duration: 0.6454375 2023-10-03 12:26:38,791 WARNING [train.py:1204] (2/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_235-307337-0_sp1.1 from training. Duration: 0.8545625 2023-10-03 12:26:39,032 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=1266500.0, ans=0.1 2023-10-03 12:26:40,058 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7762_210_1794_1_1528199508_4057408_419-125009-0_sp0.9 from training. Duration: 0.92225 2023-10-03 12:26:41,464 WARNING [train.py:1204] (2/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_215-141059-0_sp1.1 from training. Duration: 0.563625 2023-10-03 12:26:42,844 WARNING [train.py:1204] (2/4) Exclude cut with ID _27013_210_1389_1_1532437173240_3252649_222-125573-0_sp1.1 from training. Duration: 0.8909375 2023-10-03 12:26:43,801 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.0.layers.1.feed_forward2.out_whiten, num_groups=1, num_channels=192, metric=4.29 vs. limit=15.0 2023-10-03 12:26:47,593 WARNING [train.py:1204] (2/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_126-274694-0_sp1.1 from training. Duration: 0.8909375 2023-10-03 12:26:49,019 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_2592_210_2290_1_1525697025_4882925_157-70512-0_sp1.1 from training. Duration: 0.82725 2023-10-03 12:26:49,067 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_292-265928-0_sp0.9 from training. Duration: 0.5666875 2023-10-03 12:26:51,770 WARNING [train.py:1204] (2/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_1211-226066-0_sp1.1 from training. Duration: 0.6909375 2023-10-03 12:26:51,820 WARNING [train.py:1204] (2/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_394-265747-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 12:26:55,207 WARNING [train.py:1204] (2/4) Exclude cut with ID _48782_210_12658_1_1534467701544_2300422_239-67139-0_sp1.1 from training. Duration: 0.97275 2023-10-03 12:26:56,697 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.conv_module2.balancer1.prob, batch_count=1266566.6666666667, ans=0.125 2023-10-03 12:26:57,850 WARNING [train.py:1204] (2/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_329-113173-0_sp1.1 from training. Duration: 0.563625 2023-10-03 12:26:58,056 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.conv_module1.balancer2.prob, batch_count=1266633.3333333333, ans=0.125 2023-10-03 12:27:00,595 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.610e+02 1.848e+02 2.001e+02 2.141e+02 3.089e+02, threshold=4.003e+02, percent-clipped=0.0 2023-10-03 12:27:00,889 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.nonlin_attention.balancer.prob, batch_count=1266633.3333333333, ans=0.125 2023-10-03 12:27:02,025 WARNING [train.py:1204] (2/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_81-75849-0_sp0.9 from training. Duration: 0.4333125 2023-10-03 12:27:03,508 WARNING [train.py:1204] (2/4) Exclude cut with ID _9235_210_5219_1_1528891154253_8046120_1003-101638-0 from training. Duration: 0.73 2023-10-03 12:27:03,536 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9309W0189-640316 from training. Duration: 0.64 2023-10-03 12:27:05,328 WARNING [train.py:1204] (2/4) Exclude cut with ID _45329_210_7005_1_1534157951211_6774339_503-287883-0_sp1.1 from training. Duration: 0.8 2023-10-03 12:27:08,754 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.out_combiner.scale_min, batch_count=1266633.3333333333, ans=0.2 2023-10-03 12:27:10,338 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.5.encoder.layers.1.nonlin_attention.whiten1, num_groups=1, num_channels=192, metric=2.73 vs. limit=10.0 2023-10-03 12:27:11,254 WARNING [train.py:1204] (2/4) Exclude cut with ID _8833_210_5219_1_1528592665210_7553059_611-80313-0 from training. Duration: 0.64 2023-10-03 12:27:12,624 WARNING [train.py:1204] (2/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_335-106626-0_sp1.1 from training. Duration: 0.963625 2023-10-03 12:27:16,047 WARNING [train.py:1204] (2/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_237-44332-0_sp1.1 from training. Duration: 0.8545625 2023-10-03 12:27:19,514 WARNING [train.py:1204] (2/4) Exclude cut with ID _46952_210_5986_1_1534230010357_3107379_178-147341-0_sp1.1 from training. Duration: 0.97275 2023-10-03 12:27:19,552 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_13091_210_7925_1_1530609447_8987354_946-154426-0_sp1.1 from training. Duration: 0.936375 2023-10-03 12:27:19,566 WARNING [train.py:1204] (2/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_326-17061-0_sp1.1 from training. Duration: 0.8545625 2023-10-03 12:27:22,278 WARNING [train.py:1204] (2/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_38-169920-0_sp1.1 from training. Duration: 0.82725 2023-10-03 12:27:22,988 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.2.encoder.layers.2.feed_forward3.out_whiten, num_groups=1, num_channels=384, metric=13.40 vs. limit=15.0 2023-10-03 12:27:25,156 WARNING [train.py:1204] (2/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_283-220372-0 from training. Duration: 0.56 2023-10-03 12:27:25,163 WARNING [train.py:1204] (2/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_252-216674-0_sp1.1 from training. Duration: 0.4909375 2023-10-03 12:27:27,148 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_202-91290-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 12:27:28,487 WARNING [train.py:1204] (2/4) Exclude cut with ID _41314_210_12651_1_1533459476543_3766460_80-74184-0 from training. Duration: 0.83 2023-10-03 12:27:32,616 WARNING [train.py:1204] (2/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_411-271358-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 12:27:38,864 WARNING [train.py:1204] (2/4) Exclude cut with ID _68777_210_4353_1_1535810390731_3117904_129-166012-0 from training. Duration: 0.85 2023-10-03 12:27:39,141 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.nonlin_attention.balancer.prob, batch_count=1266766.6666666667, ans=0.125 2023-10-03 12:27:40,198 WARNING [train.py:1204] (2/4) Exclude cut with ID _44982_210_1511_1_1534312820108_3726029_410-109759-0_sp1.1 from training. Duration: 0.963625 2023-10-03 12:27:40,202 WARNING [train.py:1204] (2/4) Exclude cut with ID _14224_210_4381_1_1530529062037_4264010_364-22817-0_sp1.1 from training. Duration: 0.6454375 2023-10-03 12:27:41,573 WARNING [train.py:1204] (2/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_89-242335-0 from training. Duration: 0.95 2023-10-03 12:27:41,581 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_12868_210_6753_1_1530702479_3610564_151-325626-0 from training. Duration: 0.97 2023-10-03 12:27:41,582 WARNING [train.py:1204] (2/4) Exclude cut with ID _6275_210_2084_1_1528110062213_7669049_786-264332-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 12:27:43,655 WARNING [train.py:1204] (2/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_35-153886-0_sp1.1 from training. Duration: 0.763625 2023-10-03 12:27:44,926 INFO [train.py:1046] (2/4) Epoch 36, batch 4100, loss[loss=0.1812, simple_loss=0.2665, pruned_loss=0.04801, over 24378.00 frames. ], tot_loss[loss=0.1601, simple_loss=0.2394, pruned_loss=0.04035, over 4722362.28 frames. ], batch size: 77, lr: 2.82e-03, grad_scale: 32.0 2023-10-03 12:27:44,998 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_24007_210_1794_1_1531918925_1663875_116-100916-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 12:27:45,015 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_162-262717-0_sp1.1 from training. Duration: 0.7545625 2023-10-03 12:27:48,489 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.conv_skip_rate, batch_count=1266833.3333333333, ans=0.0 2023-10-03 12:27:52,257 WARNING [train.py:1204] (2/4) Exclude cut with ID _8263_210_1664_1_1528438953494_294100_40-204731-0 from training. Duration: 0.5 2023-10-03 12:27:53,737 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_12868_210_6753_1_1530702479_3610564_416-293746-0 from training. Duration: 0.9 2023-10-03 12:27:56,383 WARNING [train.py:1204] (2/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_605-219732-0 from training. Duration: 0.94 2023-10-03 12:27:56,459 WARNING [train.py:1204] (2/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_454-123002-0 from training. Duration: 0.69 2023-10-03 12:27:56,464 WARNING [train.py:1204] (2/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_25-304273-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 12:27:58,462 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6860_210_4852_1_1527917457_3833327_451-39702-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 12:27:58,501 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_304-16505-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 12:27:58,514 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6291_210_3273_1_1527683649_1078498_4-266348-0_sp1.1 from training. Duration: 0.7090625 2023-10-03 12:27:59,883 WARNING [train.py:1204] (2/4) Exclude cut with ID ID0149W0001-753701 from training. Duration: 0.989 2023-10-03 12:28:02,600 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_41251_210_15368_1_1533429767_4820695_394-55393-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 12:28:03,287 WARNING [train.py:1204] (2/4) Exclude cut with ID _8793_210_6310_1_1528542985139_2310009_121-179782-0_sp1.1 from training. Duration: 0.72725 2023-10-03 12:28:03,309 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_2729_210_3528_1_1525863505_3882214_421-73047-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 12:28:04,594 WARNING [train.py:1204] (2/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_278-154492-0_sp0.9 from training. Duration: 0.8444375 2023-10-03 12:28:07,418 WARNING [train.py:1204] (2/4) Exclude cut with ID _14170_210_5219_1_1530525949868_4469650_412-317077-0_sp0.9 from training. Duration: 0.8333125 2023-10-03 12:28:08,839 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_727-138317-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 12:28:10,139 WARNING [train.py:1204] (2/4) Exclude cut with ID _25808_210_13292_1_1532343514562_7366109_345-335048-0_sp1.1 from training. Duration: 0.92725 2023-10-03 12:28:10,160 WARNING [train.py:1204] (2/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_506-23200-0 from training. Duration: 0.97 2023-10-03 12:28:10,237 WARNING [train.py:1204] (2/4) Exclude cut with ID _44718_210_5816_1_1534050051869_6469030_643-245187-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 12:28:10,241 WARNING [train.py:1204] (2/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_42-186162-0_sp1.1 from training. Duration: 0.836375 2023-10-03 12:28:11,905 WARNING [train.py:1204] (2/4) Exclude cut with ID _3231_210_2106_1_1526205864934_6784220_171-256004-0_sp1.1 from training. Duration: 0.7818125 2023-10-03 12:28:11,932 WARNING [train.py:1204] (2/4) Exclude cut with ID _25914_210_6400_1_1532141317058_644740_43-248478-0_sp1.1 from training. Duration: 0.936375 2023-10-03 12:28:11,977 WARNING [train.py:1204] (2/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_577-287150-0 from training. Duration: 0.82 2023-10-03 12:28:14,825 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_46015_210_7234_1_1533863319_3087837_376-276068-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 12:28:15,545 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.3.encoder.layers.0.feed_forward2.out_whiten, num_groups=1, num_channels=512, metric=4.11 vs. limit=15.0 2023-10-03 12:28:16,768 WARNING [train.py:1204] (2/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_51-200220-0 from training. Duration: 0.56 2023-10-03 12:28:18,211 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_452-96101-0_sp1.1 from training. Duration: 0.963625 2023-10-03 12:28:19,637 WARNING [train.py:1204] (2/4) Exclude cut with ID _13252_210_1925_1_1530671372047_3336909_263-168388-0_sp1.1 from training. Duration: 0.7818125 2023-10-03 12:28:19,637 WARNING [train.py:1204] (2/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_469-94673-0 from training. Duration: 0.79 2023-10-03 12:28:21,050 WARNING [train.py:1204] (2/4) Exclude cut with ID _14922_210_6228_1_1530707435273_3693310_303-228738-0_sp1.1 from training. Duration: 0.763625 2023-10-03 12:28:21,099 WARNING [train.py:1204] (2/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_262-250826-0_sp1.1 from training. Duration: 0.9 2023-10-03 12:28:22,387 WARNING [train.py:1204] (2/4) Exclude cut with ID _68777_210_4353_1_1535810390731_3117904_129-166012-0_sp1.1 from training. Duration: 0.77275 2023-10-03 12:28:23,850 WARNING [train.py:1204] (2/4) Exclude cut with ID _58895_210_8750_1_1535184081320_3649089_265-323810-0 from training. Duration: 0.96 2023-10-03 12:28:25,231 WARNING [train.py:1204] (2/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_120-76843-0_sp0.9 from training. Duration: 0.611125 2023-10-03 12:28:25,294 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7544_210_6753_1_1528587722_3576686_295-258479-0_sp0.9 from training. Duration: 0.9333125 2023-10-03 12:28:28,432 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7046_210_6753_1_1527895337_6920917_783-278109-0 from training. Duration: 0.77 2023-10-03 12:28:29,767 WARNING [train.py:1204] (2/4) Exclude cut with ID _42326_210_7770_1_1533814282531_3707269_113-308323-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 12:28:29,819 WARNING [train.py:1204] (2/4) Exclude cut with ID _7852_210_5219_1_1528286471316_8139400_242-105336-0_sp0.9 from training. Duration: 0.8 2023-10-03 12:28:33,209 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.feed_forward2.hidden_balancer.prob, batch_count=1267033.3333333333, ans=0.125 2023-10-03 12:28:34,272 WARNING [train.py:1204] (2/4) Exclude cut with ID _56235_210_10640_1_1534557335235_4161099_114-277905-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 12:28:34,828 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.4.encoder.layers.2.self_attn_weights.whiten_keys, num_groups=4, num_channels=128, metric=1.67 vs. limit=6.0 2023-10-03 12:28:37,325 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.feed_forward1.out_proj.dropout_p, batch_count=1267033.3333333333, ans=0.1 2023-10-03 12:28:38,415 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_13118_210_7925_1_1530755612_7608297_42-66683-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 12:28:40,024 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.0.feed_forward1.out_proj.dropout_p, batch_count=1267033.3333333333, ans=0.1 2023-10-03 12:28:41,455 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.conv_module2.balancer2.prob, batch_count=1267033.3333333333, ans=0.125 2023-10-03 12:28:42,538 WARNING [train.py:1204] (2/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_901-79438-0_sp1.1 from training. Duration: 0.8545625 2023-10-03 12:28:44,339 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_30280_210_1881_1_1532872779_3660139_211-182194-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 12:28:46,482 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.3.encoder.layers.1.nonlin_attention.whiten2, num_groups=1, num_channels=512, metric=7.35 vs. limit=15.0 2023-10-03 12:28:51,914 WARNING [train.py:1204] (2/4) Exclude cut with ID _14898_210_9206_1_1530766812377_4177190_621-45732-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 12:28:51,930 WARNING [train.py:1204] (2/4) Exclude cut with ID _35350_210_16040_1_1533040191183_4075299_224-133775-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 12:28:54,767 WARNING [train.py:1204] (2/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_147-293311-0_sp1.1 from training. Duration: 0.8545625 2023-10-03 12:28:56,113 WARNING [train.py:1204] (2/4) Exclude cut with ID _12302_210_5219_1_1529924446466_8107280_835-279754-0_sp1.1 from training. Duration: 0.8181875 2023-10-03 12:28:58,820 INFO [train.py:1046] (2/4) Epoch 36, batch 4150, loss[loss=0.1574, simple_loss=0.2356, pruned_loss=0.03955, over 23412.00 frames. ], tot_loss[loss=0.1605, simple_loss=0.2399, pruned_loss=0.04051, over 4717816.74 frames. ], batch size: 119, lr: 2.82e-03, grad_scale: 32.0 2023-10-03 12:28:58,956 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8453_210_4211_1_1528502940_8562730_347-330673-0_sp0.9 from training. Duration: 0.8 2023-10-03 12:29:00,785 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_213-103282-0_sp0.9 from training. Duration: 0.8666875 2023-10-03 12:29:00,871 WARNING [train.py:1204] (2/4) Exclude cut with ID _49728_210_12651_1_1534041034294_6788150_66-43496-0_sp1.1 from training. Duration: 0.97275 2023-10-03 12:29:00,873 WARNING [train.py:1204] (2/4) Exclude cut with ID _19023_210_4353_1_1531378892552_5820780_28-36615-0_sp1.1 from training. Duration: 0.8818125 2023-10-03 12:29:04,233 WARNING [train.py:1204] (2/4) Exclude cut with ID _14349_210_8341_1_1530615710818_3442470_115-285975-0 from training. Duration: 0.86 2023-10-03 12:29:04,265 WARNING [train.py:1204] (2/4) Exclude cut with ID _12734_210_8341_1_1530183614296_3891929_236-47464-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 12:29:04,307 WARNING [train.py:1204] (2/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_177-236809-0 from training. Duration: 0.49 2023-10-03 12:29:05,453 WARNING [train.py:1204] (2/4) Exclude cut with ID _13678_210_7497_1_1530530069226_3463170_371-3221-0 from training. Duration: 0.9 2023-10-03 12:29:06,744 WARNING [train.py:1204] (2/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_48-338976-0 from training. Duration: 0.89 2023-10-03 12:29:06,860 WARNING [train.py:1204] (2/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_1167-309744-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 12:29:06,982 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.conv_skip_rate, batch_count=1267166.6666666667, ans=0.0 2023-10-03 12:29:08,695 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.4.encoder.layers.2.feed_forward2.out_whiten, num_groups=1, num_channels=384, metric=7.74 vs. limit=15.0 2023-10-03 12:29:12,428 WARNING [train.py:1204] (2/4) Exclude cut with ID _30327_210_16049_1_1532484161295_3924169_401-70819-0_sp0.9 from training. Duration: 0.9555625 2023-10-03 12:29:12,444 WARNING [train.py:1204] (2/4) Exclude cut with ID _55134_210_21982_1_1534590028377_3882170_346-91022-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 12:29:15,834 WARNING [train.py:1204] (2/4) Exclude cut with ID _14459_210_2228_1_1531443641946_7516700_174-118801-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 12:29:17,654 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_269-278006-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 12:29:19,022 WARNING [train.py:1204] (2/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_390-339137-0_sp0.9 from training. Duration: 0.62225 2023-10-03 12:29:20,512 WARNING [train.py:1204] (2/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_253-308377-0_sp1.1 from training. Duration: 0.4454375 2023-10-03 12:29:20,526 WARNING [train.py:1204] (2/4) Exclude cut with ID _23825_210_13449_1_1531915313526_3497150_240-271367-0_sp1.1 from training. Duration: 0.8818125 2023-10-03 12:29:21,871 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1015-343884-0_sp0.9 from training. Duration: 0.688875 2023-10-03 12:29:24,655 WARNING [train.py:1204] (2/4) Exclude cut with ID _23514_210_1389_1_1531832139785_3031439_335-48210-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 12:29:28,263 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.nonlin_attention.balancer.prob, batch_count=1267300.0, ans=0.125 2023-10-03 12:29:29,394 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_309-65177-0_sp1.1 from training. Duration: 0.8 2023-10-03 12:29:29,467 WARNING [train.py:1204] (2/4) Exclude cut with ID _14368_210_4220_1_1530532714870_3595529_231-137892-0 from training. Duration: 0.99 2023-10-03 12:29:32,465 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.540e+02 1.928e+02 2.046e+02 2.330e+02 3.418e+02, threshold=4.092e+02, percent-clipped=0.0 2023-10-03 12:29:32,622 WARNING [train.py:1204] (2/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_57-270037-0 from training. Duration: 0.77 2023-10-03 12:29:32,626 WARNING [train.py:1204] (2/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_465-214726-0_sp1.1 from training. Duration: 0.7818125 2023-10-03 12:29:34,006 WARNING [train.py:1204] (2/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_106-300985-0 from training. Duration: 0.9 2023-10-03 12:29:34,014 WARNING [train.py:1204] (2/4) Exclude cut with ID _14632_210_9988_1_1530691279392_3923200_161-129530-0_sp1.1 from training. Duration: 0.87275 2023-10-03 12:29:35,260 WARNING [train.py:1204] (2/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_514-88370-0_sp1.1 from training. Duration: 0.963625 2023-10-03 12:29:39,191 WARNING [train.py:1204] (2/4) Exclude cut with ID _56495_210_4234_1_1534748080334_4136219_305-334505-0_sp1.1 from training. Duration: 0.92725 2023-10-03 12:29:39,276 WARNING [train.py:1204] (2/4) Exclude cut with ID _42875_210_14856_1_1533704397487_4012433_61-264914-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 12:29:43,512 WARNING [train.py:1204] (2/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_91-148831-0 from training. Duration: 0.99 2023-10-03 12:29:46,812 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_435-313305-0_sp0.9 from training. Duration: 0.788875 2023-10-03 12:29:48,226 WARNING [train.py:1204] (2/4) Exclude cut with ID _16744_210_5986_1_1531724405384_3907567_242-111316-0_sp1.1 from training. Duration: 0.8454375 2023-10-03 12:29:50,166 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_257-20733-0 from training. Duration: 0.76 2023-10-03 12:29:51,475 WARNING [train.py:1204] (2/4) Exclude cut with ID _8064_210_5399_1_1528372299291_4282973_4-91037-0_sp1.1 from training. Duration: 0.8 2023-10-03 12:29:51,578 WARNING [train.py:1204] (2/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_3-276701-0 from training. Duration: 0.91 2023-10-03 12:29:52,967 WARNING [train.py:1204] (2/4) Exclude cut with ID _3992_210_1747_1_1526644816031_3731060_36-147855-0_sp1.1 from training. Duration: 0.7090625 2023-10-03 12:29:53,188 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.feed_forward3.hidden_balancer.prob, batch_count=1267366.6666666667, ans=0.125 2023-10-03 12:29:54,378 WARNING [train.py:1204] (2/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_1296-298671-0_sp1.1 from training. Duration: 0.963625 2023-10-03 12:29:55,819 WARNING [train.py:1204] (2/4) Exclude cut with ID _68965_210_1883_1_1535698962272_3497490_309-334593-0_sp1.1 from training. Duration: 0.92725 2023-10-03 12:29:57,248 WARNING [train.py:1204] (2/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_128-296581-0 from training. Duration: 0.74 2023-10-03 12:29:57,248 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_17613_210_2151_1_1531137569_2366160_170-299079-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 12:29:57,250 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6287_210_3273_1_1527509506_2653716_182-199887-0_sp1.1 from training. Duration: 0.463625 2023-10-03 12:29:58,636 WARNING [train.py:1204] (2/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_80-135200-0_sp1.1 from training. Duration: 0.7181875 2023-10-03 12:30:03,135 WARNING [train.py:1204] (2/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_34-183685-0 from training. Duration: 0.98 2023-10-03 12:30:03,158 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8278_210_2550_1_1528637099_4220413_418-210155-0_sp1.1 from training. Duration: 0.92725 2023-10-03 12:30:03,167 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8853_210_3273_1_1528633899_818968_51-187685-0_sp0.9 from training. Duration: 0.7666875 2023-10-03 12:30:03,181 WARNING [train.py:1204] (2/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_332-221057-0_sp1.1 from training. Duration: 0.6181875 2023-10-03 12:30:05,021 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_2221_210_1988_1_1525597051_4935541_415-296882-0 from training. Duration: 0.77 2023-10-03 12:30:05,043 WARNING [train.py:1204] (2/4) Exclude cut with ID _33640_210_2655_1_1533117605479_4855469_770-278185-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 12:30:05,076 WARNING [train.py:1204] (2/4) Exclude cut with ID _5287_210_2084_1_1527418883040_3878670_333-48508-0_sp1.1 from training. Duration: 0.5454375 2023-10-03 12:30:06,511 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_43802_210_2290_1_1533516203_8829504_142-276839-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 12:30:07,925 WARNING [train.py:1204] (2/4) Exclude cut with ID _28394_210_8913_1_1532430049518_3650118_126-139371-0_sp1.1 from training. Duration: 0.92725 2023-10-03 12:30:07,937 WARNING [train.py:1204] (2/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_76-203867-0 from training. Duration: 0.72 2023-10-03 12:30:07,981 WARNING [train.py:1204] (2/4) Exclude cut with ID _6194_210_1534_1_1527511826160_3941194_14-282876-0_sp0.9 from training. Duration: 0.788875 2023-10-03 12:30:11,172 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.4.encoder.layers.1.nonlin_attention.whiten1, num_groups=1, num_channels=288, metric=3.69 vs. limit=10.0 2023-10-03 12:30:13,247 INFO [train.py:1046] (2/4) Epoch 36, batch 4200, loss[loss=0.1748, simple_loss=0.2527, pruned_loss=0.04849, over 23457.00 frames. ], tot_loss[loss=0.16, simple_loss=0.2395, pruned_loss=0.04031, over 4702351.29 frames. ], batch size: 93, lr: 2.82e-03, grad_scale: 16.0 2023-10-03 12:30:13,321 WARNING [train.py:1204] (2/4) Exclude cut with ID _35355_210_16040_1_1533212988977_4016229_74-190076-0_sp1.1 from training. Duration: 0.72725 2023-10-03 12:30:14,705 WARNING [train.py:1204] (2/4) Exclude cut with ID _33920_210_4220_1_1533257851500_3084399_198-334063-0 from training. Duration: 0.94 2023-10-03 12:30:16,178 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6291_210_3273_1_1527683649_1078498_4-266348-0_sp0.9 from training. Duration: 0.8666875 2023-10-03 12:30:18,322 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_23856_210_4238_1_1531994785_3087934_122-38075-0_sp1.1 from training. Duration: 0.936375 2023-10-03 12:30:19,654 WARNING [train.py:1204] (2/4) Exclude cut with ID _4697_210_2069_1_1526815801953_3823098_276-96593-0_sp0.9 from training. Duration: 0.8555625 2023-10-03 12:30:19,822 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.conv_module1.balancer2.min_positive, batch_count=1267500.0, ans=0.05 2023-10-03 12:30:20,932 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_308-278734-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 12:30:20,935 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_5190_210_4557_1_1527245078_2965646_353-250603-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 12:30:21,092 WARNING [train.py:1204] (2/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_472-216663-0 from training. Duration: 0.92 2023-10-03 12:30:25,255 WARNING [train.py:1204] (2/4) Exclude cut with ID _7537_210_2084_1_1528502395431_3924359_14-312906-0 from training. Duration: 0.84 2023-10-03 12:30:25,300 WARNING [train.py:1204] (2/4) Exclude cut with ID _29003_210_19596_1_1532507391720_3894098_559-236660-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 12:30:26,704 WARNING [train.py:1204] (2/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_312-204798-0_sp1.1 from training. Duration: 0.8454375 2023-10-03 12:30:28,249 WARNING [train.py:1204] (2/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_17-309070-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 12:30:32,036 WARNING [train.py:1204] (2/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_344-152526-0_sp0.9 from training. Duration: 0.588875 2023-10-03 12:30:33,487 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_41452_210_7291_1_1533437687_3941281_405-11559-0_sp1.1 from training. Duration: 0.82725 2023-10-03 12:30:33,510 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_48730_210_5399_1_1533885886_2212769_157-124052-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 12:30:34,827 WARNING [train.py:1204] (2/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_415-78792-0 from training. Duration: 0.81 2023-10-03 12:30:34,831 WARNING [train.py:1204] (2/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_345-340690-0_sp1.1 from training. Duration: 0.8454375 2023-10-03 12:30:34,952 WARNING [train.py:1204] (2/4) Exclude cut with ID _23825_210_13449_1_1531915313526_3497150_124-8138-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 12:30:34,984 WARNING [train.py:1204] (2/4) Exclude cut with ID _49258_210_7994_1_1534122017863_3738861_194-24864-0_sp1.1 from training. Duration: 0.936375 2023-10-03 12:30:36,261 WARNING [train.py:1204] (2/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_369-222060-0_sp1.1 from training. Duration: 0.6454375 2023-10-03 12:30:37,650 WARNING [train.py:1204] (2/4) Exclude cut with ID _12369_210_4381_1_1530356078581_4388520_480-13146-0_sp0.9 from training. Duration: 0.9444375 2023-10-03 12:30:40,182 WARNING [train.py:1204] (2/4) Exclude cut with ID _13253_210_1925_1_1530757775819_3577873_191-144977-0 from training. Duration: 0.78 2023-10-03 12:30:40,205 WARNING [train.py:1204] (2/4) Exclude cut with ID _30304_210_6319_1_1532476827261_7582060_1128-140266-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 12:30:42,969 WARNING [train.py:1204] (2/4) Exclude cut with ID _8772_210_1850_1_1528635595972_3664011_247-336448-0_sp0.9 from training. Duration: 0.82225 2023-10-03 12:30:46,162 WARNING [train.py:1204] (2/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_318-59097-0_sp1.1 from training. Duration: 0.6909375 2023-10-03 12:30:47,966 WARNING [train.py:1204] (2/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_540-131164-0_sp1.1 from training. Duration: 0.836375 2023-10-03 12:30:49,465 WARNING [train.py:1204] (2/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_39-110836-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 12:30:50,913 WARNING [train.py:1204] (2/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_860-96198-0_sp1.1 from training. Duration: 0.663625 2023-10-03 12:30:50,928 WARNING [train.py:1204] (2/4) Exclude cut with ID _15133_210_6228_1_1530794512074_3080379_29-264458-0 from training. Duration: 0.58 2023-10-03 12:30:50,948 WARNING [train.py:1204] (2/4) Exclude cut with ID _55209_210_1531_1_1534675828996_4597160_797-348343-0_sp1.1 from training. Duration: 0.963625 2023-10-03 12:30:51,159 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.conv_module2.balancer1.prob, batch_count=1267633.3333333333, ans=0.125 2023-10-03 12:30:53,745 WARNING [train.py:1204] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_23-189234-0_sp1.1 from training. Duration: 0.7545625 2023-10-03 12:30:56,618 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.conv_module1.balancer1.prob, batch_count=1267700.0, ans=0.125 2023-10-03 12:30:58,149 WARNING [train.py:1204] (2/4) Exclude cut with ID _5410_210_1388_1_1527397055795_1719020_39-112221-0_sp1.1 from training. Duration: 0.62725 2023-10-03 12:31:01,353 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_43802_210_2290_1_1533516203_8829504_100-348441-0_sp1.1 from training. Duration: 0.82725 2023-10-03 12:31:01,608 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.0.conv_module2.balancer1.prob, batch_count=1267700.0, ans=0.125 2023-10-03 12:31:05,638 WARNING [train.py:1204] (2/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_143-86344-0_sp1.1 from training. Duration: 0.7 2023-10-03 12:31:08,453 WARNING [train.py:1204] (2/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_126-29104-0 from training. Duration: 0.85 2023-10-03 12:31:09,866 WARNING [train.py:1204] (2/4) Exclude cut with ID _39846_210_12651_1_1533603651992_1318030_138-31771-0_sp1.1 from training. Duration: 0.963625 2023-10-03 12:31:13,918 WARNING [train.py:1204] (2/4) Exclude cut with ID _2220_210_1819_1_1525509109898_3658680_50-99837-0_sp1.1 from training. Duration: 0.4909375 2023-10-03 12:31:15,145 WARNING [train.py:1204] (2/4) Exclude cut with ID _6274_210_2084_1_1527937583837_7494020_836-210550-0_sp1.1 from training. Duration: 0.97275 2023-10-03 12:31:16,601 WARNING [train.py:1204] (2/4) Exclude cut with ID _28323_210_16187_1_1532480395667_4100300_306-74561-0 from training. Duration: 0.94 2023-10-03 12:31:22,611 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_177-58005-0_sp1.1 from training. Duration: 0.563625 2023-10-03 12:31:26,596 INFO [train.py:1046] (2/4) Epoch 36, batch 4250, loss[loss=0.1346, simple_loss=0.2177, pruned_loss=0.02582, over 24327.00 frames. ], tot_loss[loss=0.1588, simple_loss=0.2373, pruned_loss=0.04018, over 4695334.99 frames. ], batch size: 61, lr: 2.82e-03, grad_scale: 16.0 2023-10-03 12:31:28,428 WARNING [train.py:1204] (2/4) Exclude cut with ID _54871_210_22356_1_1534665798253_3856359_19-342632-0_sp1.1 from training. Duration: 0.82725 2023-10-03 12:31:28,450 WARNING [train.py:1204] (2/4) Exclude cut with ID _35355_210_16040_1_1533212988977_4016229_74-190076-0_sp0.9 from training. Duration: 0.888875 2023-10-03 12:31:31,288 WARNING [train.py:1204] (2/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_285-97786-0_sp1.1 from training. Duration: 0.97275 2023-10-03 12:31:37,126 WARNING [train.py:1204] (2/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_337-181502-0_sp0.9 from training. Duration: 0.72225 2023-10-03 12:31:37,169 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_353-182855-0 from training. Duration: 0.96 2023-10-03 12:31:37,202 WARNING [train.py:1204] (2/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_409-327730-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 12:31:40,161 WARNING [train.py:1204] (2/4) Exclude cut with ID _8030_210_7497_1_1528444886781_3531089_373-19403-0_sp1.1 from training. Duration: 0.97275 2023-10-03 12:31:41,740 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.conv_module2.balancer1.prob, batch_count=1267900.0, ans=0.125 2023-10-03 12:31:44,239 WARNING [train.py:1204] (2/4) Exclude cut with ID _6789_210_1534_1_1527857937218_3118950_231-340237-0_sp1.1 from training. Duration: 0.936375 2023-10-03 12:31:46,379 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.4.encoder.layers.2.self_attn2.whiten, num_groups=1, num_channels=384, metric=11.57 vs. limit=22.5 2023-10-03 12:31:47,455 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.nonlin_attention.balancer.prob, batch_count=1267900.0, ans=0.125 2023-10-03 12:31:48,556 WARNING [train.py:1204] (2/4) Exclude cut with ID _6493_210_1747_1_1528459210424_4012470_536-57832-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 12:31:48,592 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7046_210_6753_1_1527895337_6920917_756-116002-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 12:31:52,396 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_37152_210_9697_1_1533106573_3844188_29-239070-0_sp1.1 from training. Duration: 0.8909375 2023-10-03 12:31:52,399 WARNING [train.py:1204] (2/4) Exclude cut with ID _19023_210_4353_1_1531378892552_5820780_61-334539-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 12:31:52,522 WARNING [train.py:1204] (2/4) Exclude cut with ID _14170_210_5219_1_1530525949868_4469650_159-28488-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 12:31:53,827 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_40896_210_15710_1_1533433956_7100802_764-71272-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 12:31:53,904 WARNING [train.py:1204] (2/4) Exclude cut with ID _44902_210_16187_1_1533888006062_3792078_258-178274-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 12:31:58,021 WARNING [train.py:1204] (2/4) Exclude cut with ID _7537_210_2084_1_1528502395431_3924359_14-312906-0_sp1.1 from training. Duration: 0.763625 2023-10-03 12:31:59,362 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.638e+02 1.900e+02 2.182e+02 2.555e+02 3.786e+02, threshold=4.364e+02, percent-clipped=0.0 2023-10-03 12:31:59,478 WARNING [train.py:1204] (2/4) Exclude cut with ID _49643_210_7989_1_1534640244224_3939031_674-116639-0_sp1.1 from training. Duration: 0.963625 2023-10-03 12:32:00,211 WARNING [train.py:1204] (2/4) Exclude cut with ID _22826_210_9994_1_1531908201522_3279169_415-161-0 from training. Duration: 0.98 2023-10-03 12:32:05,007 WARNING [train.py:1204] (2/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_264-348896-0 from training. Duration: 0.85 2023-10-03 12:32:05,014 WARNING [train.py:1204] (2/4) Exclude cut with ID _51899_210_20034_1_1534129254466_3669220_233-161492-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 12:32:06,367 WARNING [train.py:1204] (2/4) Exclude cut with ID _31255_210_13427_1_1532865619495_3815143_233-200023-0_sp1.1 from training. Duration: 0.936375 2023-10-03 12:32:06,382 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_23850_210_4238_1_1532232597_1632433_100-229879-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 12:32:07,638 WARNING [train.py:1204] (2/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_375-245677-0_sp0.9 from training. Duration: 0.9 2023-10-03 12:32:07,641 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_40223_210_6228_1_1533298404_4812267_355-317415-0_sp1.1 from training. Duration: 0.97275 2023-10-03 12:32:07,693 WARNING [train.py:1204] (2/4) Exclude cut with ID _9679_210_7497_1_1529129212906_4363929_235-81488-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 12:32:10,599 WARNING [train.py:1204] (2/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_172-215932-0_sp1.1 from training. Duration: 0.6 2023-10-03 12:32:11,969 WARNING [train.py:1204] (2/4) Exclude cut with ID _12369_210_4381_1_1530356078581_4388520_64-298917-0_sp1.1 from training. Duration: 0.8 2023-10-03 12:32:16,030 WARNING [train.py:1204] (2/4) Exclude cut with ID _38020_210_5986_1_1533112252259_7006089_131-53533-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 12:32:18,701 WARNING [train.py:1204] (2/4) Exclude cut with ID _42334_210_7770_1_1534206610396_3605880_392-48154-0_sp1.1 from training. Duration: 0.963625 2023-10-03 12:32:20,037 WARNING [train.py:1204] (2/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_337-181502-0 from training. Duration: 0.65 2023-10-03 12:32:20,052 WARNING [train.py:1204] (2/4) Exclude cut with ID _6267_210_2084_1_1527764658526_3894730_375-219887-0_sp0.9 from training. Duration: 0.7444375 2023-10-03 12:32:21,432 WARNING [train.py:1204] (2/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_60-146189-0 from training. Duration: 0.98 2023-10-03 12:32:22,745 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_3133_210_2087_1_1526106446_2324690_243-143119-0_sp1.1 from training. Duration: 0.7 2023-10-03 12:32:24,686 WARNING [train.py:1204] (2/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_1267-272693-0_sp0.9 from training. Duration: 0.811125 2023-10-03 12:32:26,754 WARNING [train.py:1204] (2/4) Exclude cut with ID _69884_210_9771_1_1535846392818_4166169_83-182645-0_sp1.1 from training. Duration: 0.97275 2023-10-03 12:32:26,771 WARNING [train.py:1204] (2/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_403-292260-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 12:32:26,985 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.0.ff3_skip_rate, batch_count=1268100.0, ans=0.0 2023-10-03 12:32:28,225 WARNING [train.py:1204] (2/4) Exclude cut with ID _55429_210_10626_1_1534467588527_7174143_377-169391-0 from training. Duration: 0.96 2023-10-03 12:32:29,631 WARNING [train.py:1204] (2/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_494-199538-0_sp1.1 from training. Duration: 0.6818125 2023-10-03 12:32:29,684 WARNING [train.py:1204] (2/4) Exclude cut with ID _3067_210_2667_1_1526212974640_4503168_119-175776-0_sp1.1 from training. Duration: 0.67275 2023-10-03 12:32:34,049 WARNING [train.py:1204] (2/4) Exclude cut with ID _3231_210_2106_1_1526205864934_6784220_183-303839-0_sp1.1 from training. Duration: 0.97275 2023-10-03 12:32:37,247 WARNING [train.py:1204] (2/4) Exclude cut with ID _18513_210_9206_1_1531618215223_3862620_165-38958-0_sp1.1 from training. Duration: 0.963625 2023-10-03 12:32:38,517 WARNING [train.py:1204] (2/4) Exclude cut with ID _41314_210_12651_1_1533459476543_3766460_87-272068-0_sp1.1 from training. Duration: 0.7545625 2023-10-03 12:32:38,608 WARNING [train.py:1204] (2/4) Exclude cut with ID _3541_210_2477_1_1526475408709_3263973_353-95-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 12:32:41,281 INFO [train.py:1046] (2/4) Epoch 36, batch 4300, loss[loss=0.1533, simple_loss=0.229, pruned_loss=0.03882, over 23451.00 frames. ], tot_loss[loss=0.1582, simple_loss=0.2367, pruned_loss=0.03989, over 4698816.61 frames. ], batch size: 285, lr: 2.82e-03, grad_scale: 16.0 2023-10-03 12:32:41,338 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_2009_210_2071_1_1525481134_4588937_73-302813-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 12:32:42,704 WARNING [train.py:1204] (2/4) Exclude cut with ID _64220_210_9527_1_1535250151772_4875259_88-95011-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 12:32:42,780 WARNING [train.py:1204] (2/4) Exclude cut with ID _44902_210_16187_1_1533888006062_3792078_160-12107-0_sp1.1 from training. Duration: 0.92725 2023-10-03 12:32:42,789 WARNING [train.py:1204] (2/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_547-5424-0 from training. Duration: 0.94 2023-10-03 12:32:44,138 WARNING [train.py:1204] (2/4) Exclude cut with ID _21783_210_11357_1_1531742458730_3762679_268-85656-0_sp1.1 from training. Duration: 0.936375 2023-10-03 12:32:48,343 WARNING [train.py:1204] (2/4) Exclude cut with ID _66210_210_12651_1_1535446812922_3365850_42-284985-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 12:32:49,752 WARNING [train.py:1204] (2/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_465-214726-0_sp0.9 from training. Duration: 0.9555625 2023-10-03 12:32:52,654 WARNING [train.py:1204] (2/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_50-95770-0_sp1.1 from training. Duration: 0.936375 2023-10-03 12:33:00,124 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.3.encoder.layers.1.self_attn2.whiten, num_groups=1, num_channels=512, metric=14.96 vs. limit=22.5 2023-10-03 12:33:00,804 WARNING [train.py:1204] (2/4) Exclude cut with ID _30800_210_22382_1_1532499347904_4515959_269-77567-0_sp1.1 from training. Duration: 0.963625 2023-10-03 12:33:00,806 WARNING [train.py:1204] (2/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_7-111954-0 from training. Duration: 0.56 2023-10-03 12:33:02,755 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_43802_210_2290_1_1533516203_8829504_85-195240-0_sp0.9 from training. Duration: 0.9666875 2023-10-03 12:33:04,205 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_2822_210_2715_1_1526112586_1999587_165-36656-0_sp0.9 from training. Duration: 0.988875 2023-10-03 12:33:04,220 WARNING [train.py:1204] (2/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_181-78632-0_sp1.1 from training. Duration: 0.7090625 2023-10-03 12:33:04,231 WARNING [train.py:1204] (2/4) Exclude cut with ID IC0671W0044-331887 from training. Duration: 0.896 2023-10-03 12:33:04,423 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=1268233.3333333333, ans=0.1 2023-10-03 12:33:07,586 WARNING [train.py:1204] (2/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_266-137104-0_sp1.1 from training. Duration: 0.5909375 2023-10-03 12:33:08,987 WARNING [train.py:1204] (2/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_137-116442-0_sp0.9 from training. Duration: 0.8666875 2023-10-03 12:33:11,917 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_23801_210_16223_1_1532066392_3294657_570-5439-0 from training. Duration: 0.97 2023-10-03 12:33:11,937 WARNING [train.py:1204] (2/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_29-280622-0_sp1.1 from training. Duration: 0.7454375 2023-10-03 12:33:11,968 WARNING [train.py:1204] (2/4) Exclude cut with ID 3557-8342-0013-54691-0 from training. Duration: 0.92 2023-10-03 12:33:14,623 WARNING [train.py:1204] (2/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_711-15688-0_sp1.1 from training. Duration: 0.3909375 2023-10-03 12:33:15,968 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_15126_210_6753_1_1530778677_2591185_120-277707-0_sp1.1 from training. Duration: 0.72725 2023-10-03 12:33:18,746 WARNING [train.py:1204] (2/4) Exclude cut with ID _14970_210_15709_1_1530775547361_1966965_8-169924-0_sp1.1 from training. Duration: 0.836375 2023-10-03 12:33:18,748 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_4692_210_3528_1_1526899755_4323254_107-44401-0_sp0.9 from training. Duration: 0.9555625 2023-10-03 12:33:20,140 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_3475_210_2028_1_1526193578_3622569_265-276625-0_sp0.9 from training. Duration: 0.8444375 2023-10-03 12:33:21,564 WARNING [train.py:1204] (2/4) Exclude cut with ID _55153_210_2253_1_1534384786677_3162096_148-79022-0_sp1.1 from training. Duration: 0.87275 2023-10-03 12:33:23,469 WARNING [train.py:1204] (2/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_292-36336-0_sp1.1 from training. Duration: 0.92725 2023-10-03 12:33:23,479 WARNING [train.py:1204] (2/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_329-82235-0 from training. Duration: 0.82 2023-10-03 12:33:24,845 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_11305_210_1794_1_1529750428_7178148_257-312870-0 from training. Duration: 0.88 2023-10-03 12:33:26,222 WARNING [train.py:1204] (2/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_436-4493-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 12:33:28,955 WARNING [train.py:1204] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_321-54129-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 12:33:28,963 WARNING [train.py:1204] (2/4) Exclude cut with ID _5287_210_2084_1_1527418883040_3878670_333-48508-0_sp0.9 from training. Duration: 0.6666875 2023-10-03 12:33:28,983 WARNING [train.py:1204] (2/4) Exclude cut with ID _26528_210_9070_1_1532570410842_3851310_244-278158-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 12:33:30,891 WARNING [train.py:1204] (2/4) Exclude cut with ID _46952_210_5986_1_1534230010357_3107379_232-344503-0_sp1.1 from training. Duration: 0.87275 2023-10-03 12:33:30,902 WARNING [train.py:1204] (2/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_43-206757-0 from training. Duration: 0.75 2023-10-03 12:33:30,904 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_4183_210_2087_1_1526729100_1869838_141-166600-0 from training. Duration: 0.95 2023-10-03 12:33:30,951 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_296-287678-0 from training. Duration: 0.95 2023-10-03 12:33:31,033 WARNING [train.py:1204] (2/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_265-60343-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 12:33:31,854 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.balancer_ff3.min_abs, batch_count=1268366.6666666667, ans=0.2 2023-10-03 12:33:32,862 WARNING [train.py:1204] (2/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_332-256168-0 from training. Duration: 0.75 2023-10-03 12:33:32,906 WARNING [train.py:1204] (2/4) Exclude cut with ID _13679_210_7497_1_1530534293662_3355098_411-186088-0 from training. Duration: 0.94 2023-10-03 12:33:37,426 WARNING [train.py:1204] (2/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_458-126779-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 12:33:38,758 WARNING [train.py:1204] (2/4) Exclude cut with ID IC0803W0380-397572 from training. Duration: 0.032 2023-10-03 12:33:38,826 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_278-272612-0_sp1.1 from training. Duration: 0.663625 2023-10-03 12:33:40,273 WARNING [train.py:1204] (2/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_491-263101-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 12:33:40,278 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_53555_210_5632_1_1534595220_7232297_334-336778-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 12:33:43,035 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_50-3289-0 from training. Duration: 0.91 2023-10-03 12:33:44,327 WARNING [train.py:1204] (2/4) Exclude cut with ID _8699_210_2069_1_1528630346831_3960409_155-39766-0_sp0.9 from training. Duration: 0.8666875 2023-10-03 12:33:44,331 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_502-82145-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 12:33:44,376 WARNING [train.py:1204] (2/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_550-43132-0_sp1.1 from training. Duration: 0.97275 2023-10-03 12:33:44,397 WARNING [train.py:1204] (2/4) Exclude cut with ID _14998_210_5219_1_1530705582080_7474970_446-112117-0_sp1.1 from training. Duration: 0.8181875 2023-10-03 12:33:45,799 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_3691_210_3528_1_1526468169_4062053_355-334432-0_sp1.1 from training. Duration: 0.7909375 2023-10-03 12:33:47,197 WARNING [train.py:1204] (2/4) Exclude cut with ID _58605_210_12210_1_1534723195939_3904259_239-94720-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 12:33:49,968 WARNING [train.py:1204] (2/4) Exclude cut with ID _49376_210_5986_1_1533985278732_3370740_391-344072-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 12:33:51,200 WARNING [train.py:1204] (2/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_558-322014-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 12:33:51,238 WARNING [train.py:1204] (2/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_256-32662-0_sp1.1 from training. Duration: 0.8181875 2023-10-03 12:33:55,771 INFO [train.py:1046] (2/4) Epoch 36, batch 4350, loss[loss=0.1931, simple_loss=0.2587, pruned_loss=0.06375, over 19383.00 frames. ], tot_loss[loss=0.1587, simple_loss=0.2373, pruned_loss=0.04007, over 4701003.16 frames. ], batch size: 388, lr: 2.82e-03, grad_scale: 16.0 2023-10-03 12:33:55,946 WARNING [train.py:1204] (2/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_572-185150-0 from training. Duration: 0.97 2023-10-03 12:33:57,271 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_89-35748-0_sp0.9 from training. Duration: 0.87775 2023-10-03 12:34:01,719 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6291_210_3273_1_1527683649_1078498_88-66747-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 12:34:04,506 WARNING [train.py:1204] (2/4) Exclude cut with ID _19504_210_9994_1_1531648863242_3325220_357-237029-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 12:34:07,208 WARNING [train.py:1204] (2/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_302-2550-0_sp1.1 from training. Duration: 0.736375 2023-10-03 12:34:07,208 WARNING [train.py:1204] (2/4) Exclude cut with ID _9461_210_4557_1_1529060514726_3677451_59-194017-0_sp1.1 from training. Duration: 0.963625 2023-10-03 12:34:10,763 WARNING [train.py:1204] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_665-196155-0_sp1.1 from training. Duration: 0.6090625 2023-10-03 12:34:16,134 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_326-315007-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 12:34:17,541 WARNING [train.py:1204] (2/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_105-328174-0_sp1.1 from training. Duration: 0.6909375 2023-10-03 12:34:17,549 WARNING [train.py:1204] (2/4) Exclude cut with ID _56494_210_4220_1_1535284837625_3033169_106-263722-0_sp1.1 from training. Duration: 0.97275 2023-10-03 12:34:21,766 WARNING [train.py:1204] (2/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_770-200430-0_sp0.9 from training. Duration: 0.988875 2023-10-03 12:34:23,177 WARNING [train.py:1204] (2/4) Exclude cut with ID _65640_210_6310_1_1535368201445_3173250_209-13008-0_sp1.1 from training. Duration: 0.9 2023-10-03 12:34:24,506 WARNING [train.py:1204] (2/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_212-149947-0_sp0.9 from training. Duration: 0.811125 2023-10-03 12:34:27,499 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.653e+02 1.937e+02 2.130e+02 2.314e+02 3.607e+02, threshold=4.259e+02, percent-clipped=0.0 2023-10-03 12:34:30,761 WARNING [train.py:1204] (2/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_188-171673-0 from training. Duration: 0.58 2023-10-03 12:34:31,478 WARNING [train.py:1204] (2/4) Exclude cut with ID _58895_210_8750_1_1535184081320_3649089_286-281724-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 12:34:32,593 WARNING [train.py:1204] (2/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_16-254909-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 12:34:36,815 WARNING [train.py:1204] (2/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_52-95655-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 12:34:38,184 WARNING [train.py:1204] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_554-45617-0 from training. Duration: 0.99 2023-10-03 12:34:41,526 WARNING [train.py:1204] (2/4) Exclude cut with ID _14683_210_5219_1_1530698352608_3974859_473-344611-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 12:34:42,853 WARNING [train.py:1204] (2/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_17-25045-0_sp0.9 from training. Duration: 0.6555625 2023-10-03 12:34:45,704 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9072W0003-530637 from training. Duration: 0.768 2023-10-03 12:34:47,263 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.conv_module2.balancer2.prob, batch_count=1268700.0, ans=0.125 2023-10-03 12:34:48,294 WARNING [train.py:1204] (2/4) Exclude cut with ID _38670_210_15379_1_1533448810659_4391059_76-74025-0_sp1.1 from training. Duration: 0.936375 2023-10-03 12:34:48,327 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6291_210_3273_1_1527683649_1078498_74-292608-0_sp0.9 from training. Duration: 0.8 2023-10-03 12:34:49,699 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9084W0025-536862 from training. Duration: 0.896 2023-10-03 12:34:49,751 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9087W0021-538392 from training. Duration: 0.768 2023-10-03 12:34:49,756 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7543_210_6753_1_1528543323_645515_56-163234-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 12:34:49,789 WARNING [train.py:1204] (2/4) Exclude cut with ID _45560_210_21982_1_1533812603951_3584999_44-332643-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 12:34:51,153 WARNING [train.py:1204] (2/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_1285-256622-0_sp1.1 from training. Duration: 0.763625 2023-10-03 12:34:51,205 WARNING [train.py:1204] (2/4) Exclude cut with ID _52781_210_16187_1_1534297729099_1118149_141-325632-0_sp1.1 from training. Duration: 0.936375 2023-10-03 12:34:51,370 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=1268700.0, ans=0.1 2023-10-03 12:34:52,535 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_40896_210_15710_1_1533433956_7100802_1051-137631-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 12:34:52,590 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_13091_210_7925_1_1530609447_8987354_233-18333-0_sp1.1 from training. Duration: 0.97275 2023-10-03 12:34:55,336 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_34008_210_10431_1_1533345424_8183247_662-317331-0 from training. Duration: 0.94 2023-10-03 12:34:55,342 WARNING [train.py:1204] (2/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_291-248920-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 12:34:55,344 WARNING [train.py:1204] (2/4) Exclude cut with ID _65263_210_22356_1_1535763598691_3921037_274-288481-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 12:34:55,371 WARNING [train.py:1204] (2/4) Exclude cut with ID _5189_210_4557_1_1527248319428_3769229_233-168080-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 12:34:57,146 WARNING [train.py:1204] (2/4) Exclude cut with ID _54871_210_22356_1_1534665798253_3856359_135-184281-0 from training. Duration: 0.99 2023-10-03 12:34:58,872 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9123W0012-555972 from training. Duration: 0.896 2023-10-03 12:34:58,883 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9123W0118-556077 from training. Duration: 0.896 2023-10-03 12:34:58,900 WARNING [train.py:1204] (2/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_406-275649-0 from training. Duration: 0.42 2023-10-03 12:35:02,170 WARNING [train.py:1204] (2/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_309-13684-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 12:35:02,193 WARNING [train.py:1204] (2/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_129-337802-0_sp1.1 from training. Duration: 0.6545625 2023-10-03 12:35:02,212 WARNING [train.py:1204] (2/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_649-266825-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 12:35:03,602 WARNING [train.py:1204] (2/4) Exclude cut with ID _40519_210_13794_1_1533715228655_7337390_895-224742-0_sp1.1 from training. Duration: 0.92725 2023-10-03 12:35:05,050 WARNING [train.py:1204] (2/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_234-14174-0 from training. Duration: 0.82 2023-10-03 12:35:07,678 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9156W0009-572502 from training. Duration: 0.768 2023-10-03 12:35:07,692 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_424-245529-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 12:35:09,582 INFO [train.py:1046] (2/4) Epoch 36, batch 4400, loss[loss=0.1716, simple_loss=0.2459, pruned_loss=0.04864, over 23752.00 frames. ], tot_loss[loss=0.1599, simple_loss=0.2388, pruned_loss=0.04052, over 4696279.82 frames. ], batch size: 179, lr: 2.82e-03, grad_scale: 32.0 2023-10-03 12:35:11,173 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.conv_module1.balancer2.prob, batch_count=1268833.3333333333, ans=0.125 2023-10-03 12:35:12,248 WARNING [train.py:1204] (2/4) Exclude cut with ID _26528_210_9070_1_1532570410842_3851310_232-10149-0_sp1.1 from training. Duration: 0.963625 2023-10-03 12:35:12,256 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_17292_210_6753_1_1531125388_2943734_152-30410-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 12:35:13,849 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_35441_210_16554_1_1533347572_4092680_291-82075-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 12:35:14,133 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.conv_module2.balancer1.prob, batch_count=1268833.3333333333, ans=0.125 2023-10-03 12:35:15,214 WARNING [train.py:1204] (2/4) Exclude cut with ID _12369_210_4381_1_1530356078581_4388520_64-298917-0 from training. Duration: 0.88 2023-10-03 12:35:15,233 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_5618_210_1874_1_1527311008_5851_1-214501-0_sp0.9 from training. Duration: 0.7655625 2023-10-03 12:35:15,266 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_9460_210_4211_1_1528954587_10049667_572-37035-0 from training. Duration: 0.8 2023-10-03 12:35:16,647 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9193W0023-590322 from training. Duration: 0.896 2023-10-03 12:35:16,738 WARNING [train.py:1204] (2/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_206-10097-0_sp1.1 from training. Duration: 0.4454375 2023-10-03 12:35:16,752 WARNING [train.py:1204] (2/4) Exclude cut with ID _55134_210_21982_1_1534590028377_3882170_17-51731-0_sp1.1 from training. Duration: 0.963625 2023-10-03 12:35:18,219 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.feed_forward1.hidden_balancer.prob, batch_count=1268833.3333333333, ans=0.125 2023-10-03 12:35:19,447 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_14953_210_2151_1_1530701710_5616165_710-165400-0 from training. Duration: 0.87 2023-10-03 12:35:20,813 WARNING [train.py:1204] (2/4) Exclude cut with ID _53203_210_2087_1_1534246218309_475590_61-85777-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 12:35:21,031 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=1268833.3333333333, ans=0.1 2023-10-03 12:35:22,367 WARNING [train.py:1204] (2/4) Exclude cut with ID _55844_210_2346_1_1534471212569_7625707_1050-10453-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 12:35:22,376 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9218W0013-603297 from training. Duration: 0.896 2023-10-03 12:35:25,780 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_20120_210_6753_1_1531643035_5482859_267-102818-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 12:35:25,789 WARNING [train.py:1204] (2/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_191-41023-0 from training. Duration: 0.71 2023-10-03 12:35:27,035 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9231W0202-610227 from training. Duration: 0.896 2023-10-03 12:35:29,925 WARNING [train.py:1204] (2/4) Exclude cut with ID _6537_210_1883_1_1527987559710_3597129_382-133062-0 from training. Duration: 0.68 2023-10-03 12:35:29,957 WARNING [train.py:1204] (2/4) Exclude cut with ID _28427_210_4220_1_1532581240967_3061069_43-84928-0 from training. Duration: 0.99 2023-10-03 12:35:30,635 WARNING [train.py:1204] (2/4) Exclude cut with ID _59256_210_5986_1_1535076085997_3589120_121-237386-0 from training. Duration: 0.96 2023-10-03 12:35:31,915 WARNING [train.py:1204] (2/4) Exclude cut with ID _8480_210_1874_1_1528589391205_6728330_137-256986-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 12:35:32,009 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_9417_210_1794_1_1529235261_4614131_479-346963-0_sp1.1 from training. Duration: 0.936375 2023-10-03 12:35:33,289 WARNING [train.py:1204] (2/4) Exclude cut with ID _26474_210_9570_1_1532520022699_3623380_267-7362-0_sp1.1 from training. Duration: 0.936375 2023-10-03 12:35:33,367 WARNING [train.py:1204] (2/4) Exclude cut with ID _55328_210_27767_1_1534580973260_4088170_287-61272-0_sp1.1 from training. Duration: 0.97275 2023-10-03 12:35:36,061 WARNING [train.py:1204] (2/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_589-9070-0 from training. Duration: 0.71 2023-10-03 12:35:36,069 WARNING [train.py:1204] (2/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_148-158509-0_sp1.1 from training. Duration: 0.37275 2023-10-03 12:35:37,503 WARNING [train.py:1204] (2/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_164-184548-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 12:35:40,886 WARNING [train.py:1204] (2/4) Exclude cut with ID _14970_210_15709_1_1530775547361_1966965_6-23328-0_sp1.1 from training. Duration: 0.7818125 2023-10-03 12:35:40,898 WARNING [train.py:1204] (2/4) Exclude cut with ID _29303_210_2575_1_1532392237303_7410649_612-165069-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 12:35:43,579 WARNING [train.py:1204] (2/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_383-321224-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 12:35:43,617 WARNING [train.py:1204] (2/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_283-160525-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 12:35:43,628 WARNING [train.py:1204] (2/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_505-97987-0 from training. Duration: 0.86 2023-10-03 12:35:44,948 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9309W0190-640317 from training. Duration: 0.768 2023-10-03 12:35:46,709 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.balancer_ff3.min_abs, batch_count=1268966.6666666667, ans=0.2 2023-10-03 12:35:49,191 WARNING [train.py:1204] (2/4) Exclude cut with ID _50093_210_4220_1_1534417141207_3432299_280-297390-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 12:35:56,003 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_17292_210_6753_1_1531125388_2943734_320-71385-0_sp1.1 from training. Duration: 0.97275 2023-10-03 12:35:57,677 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.conv_module2.balancer1.prob, batch_count=1269033.3333333333, ans=0.125 2023-10-03 12:35:58,830 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_213-103282-0 from training. Duration: 0.78 2023-10-03 12:36:02,056 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7615_210_4211_1_1528110965_4580160_72-235211-0_sp0.9 from training. Duration: 0.8444375 2023-10-03 12:36:05,379 WARNING [train.py:1204] (2/4) Exclude cut with ID _14867_210_1968_1_1530684179890_3846969_173-320977-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 12:36:06,913 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7046_210_6753_1_1527895337_6920917_783-278109-0_sp0.9 from training. Duration: 0.8555625 2023-10-03 12:36:08,691 WARNING [train.py:1204] (2/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_90-30673-0 from training. Duration: 0.84 2023-10-03 12:36:08,711 WARNING [train.py:1204] (2/4) Exclude cut with ID _22826_210_9994_1_1531908201522_3279169_415-161-0_sp1.1 from training. Duration: 0.8909375 2023-10-03 12:36:08,721 WARNING [train.py:1204] (2/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_86-66192-0_sp0.9 from training. Duration: 0.611125 2023-10-03 12:36:08,723 WARNING [train.py:1204] (2/4) Exclude cut with ID _4626_210_3532_1_1526817652717_3916859_446-206656-0_sp1.1 from training. Duration: 0.6909375 2023-10-03 12:36:10,237 WARNING [train.py:1204] (2/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_166-128941-0_sp0.9 from training. Duration: 0.6 2023-10-03 12:36:15,638 WARNING [train.py:1204] (2/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_345-167986-0 from training. Duration: 0.84 2023-10-03 12:36:17,544 WARNING [train.py:1204] (2/4) Exclude cut with ID _6275_210_2084_1_1528110062213_7669049_973-198085-0 from training. Duration: 0.75 2023-10-03 12:36:20,227 WARNING [train.py:1204] (2/4) Exclude cut with ID _60057_210_10640_1_1534903065793_3333150_171-181133-0 from training. Duration: 0.88 2023-10-03 12:36:20,251 WARNING [train.py:1204] (2/4) Exclude cut with ID _19504_210_9994_1_1531648863242_3325220_336-196513-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 12:36:20,258 WARNING [train.py:1204] (2/4) Exclude cut with ID _6493_210_1747_1_1528459210424_4012470_513-140967-0 from training. Duration: 0.79 2023-10-03 12:36:20,336 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6673_210_3273_1_1527767790_3173240_327-306185-0_sp0.9 from training. Duration: 0.92225 2023-10-03 12:36:23,133 INFO [train.py:1046] (2/4) Epoch 36, batch 4450, loss[loss=0.1389, simple_loss=0.2155, pruned_loss=0.03117, over 18699.00 frames. ], tot_loss[loss=0.1602, simple_loss=0.2395, pruned_loss=0.04051, over 4684328.59 frames. ], batch size: 40, lr: 2.82e-03, grad_scale: 32.0 2023-10-03 12:36:23,198 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1032-236863-0_sp1.1 from training. Duration: 0.763625 2023-10-03 12:36:23,474 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.conv_module1.balancer2.prob, batch_count=1269166.6666666667, ans=0.125 2023-10-03 12:36:24,691 WARNING [train.py:1204] (2/4) Exclude cut with ID _14867_210_1968_1_1530684179890_3846969_413-29292-0 from training. Duration: 0.9 2023-10-03 12:36:26,140 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.balancer1.prob, batch_count=1269166.6666666667, ans=0.125 2023-10-03 12:36:28,734 WARNING [train.py:1204] (2/4) Exclude cut with ID _47251_210_18628_1_1533814063128_4137250_186-343322-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 12:36:31,426 WARNING [train.py:1204] (2/4) Exclude cut with ID _56235_210_10640_1_1534557335235_4161099_624-46091-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 12:36:31,464 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_791-197891-0_sp0.9 from training. Duration: 0.9333125 2023-10-03 12:36:38,381 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_50050_210_16223_1_1535435796_7290138_786-288457-0_sp1.1 from training. Duration: 0.92725 2023-10-03 12:36:38,398 WARNING [train.py:1204] (2/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_213-227283-0_sp1.1 from training. Duration: 0.963625 2023-10-03 12:36:40,059 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=1269233.3333333333, ans=0.1 2023-10-03 12:36:41,195 WARNING [train.py:1204] (2/4) Exclude cut with ID _42659_210_2070_1_1533607442765_3120380_58-341121-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 12:36:43,143 WARNING [train.py:1204] (2/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_506-23200-0_sp1.1 from training. Duration: 0.8818125 2023-10-03 12:36:44,563 WARNING [train.py:1204] (2/4) Exclude cut with ID _14170_210_5219_1_1530525949868_4469650_336-213530-0_sp1.1 from training. Duration: 0.8181875 2023-10-03 12:36:44,588 WARNING [train.py:1204] (2/4) Exclude cut with ID _64135_210_2253_1_1535677267876_7608972_180-5312-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 12:36:46,001 WARNING [train.py:1204] (2/4) Exclude cut with ID _12302_210_5219_1_1529924446466_8107280_835-279754-0 from training. Duration: 0.9 2023-10-03 12:36:46,003 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_20120_210_6753_1_1531643035_5482859_510-96996-0_sp1.1 from training. Duration: 0.7909375 2023-10-03 12:36:47,947 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_15517_210_6753_1_1530954008_3487336_216-187834-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 12:36:47,980 WARNING [train.py:1204] (2/4) Exclude cut with ID _19504_210_9994_1_1531648863242_3325220_414-113427-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 12:36:47,981 WARNING [train.py:1204] (2/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_264-108685-0_sp0.9 from training. Duration: 0.611125 2023-10-03 12:36:50,650 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_5438_210_1794_1_1527336687_2600009_104-288274-0_sp0.9 from training. Duration: 0.6444375 2023-10-03 12:36:53,625 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.balancer1.prob, batch_count=1269300.0, ans=0.125 2023-10-03 12:36:53,704 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.conv_module2.balancer1.prob, batch_count=1269300.0, ans=0.125 2023-10-03 12:36:56,045 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.578e+02 1.913e+02 2.111e+02 2.301e+02 3.200e+02, threshold=4.223e+02, percent-clipped=0.0 2023-10-03 12:36:56,113 WARNING [train.py:1204] (2/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_520-39496-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 12:36:56,156 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_37818_210_4238_1_1533620910_4720083_315-308565-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 12:36:57,589 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_2009_210_2071_1_1525481134_4588937_385-302100-0_sp1.1 from training. Duration: 0.7909375 2023-10-03 12:36:57,627 WARNING [train.py:1204] (2/4) Exclude cut with ID _58895_210_8750_1_1535184081320_3649089_277-53219-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 12:36:59,061 WARNING [train.py:1204] (2/4) Exclude cut with ID _26664_210_7770_1_1532258943280_3418270_121-111258-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 12:37:03,668 WARNING [train.py:1204] (2/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_99-167392-0_sp1.1 from training. Duration: 0.5545625 2023-10-03 12:37:05,055 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1113-7369-0 from training. Duration: 0.85 2023-10-03 12:37:06,914 WARNING [train.py:1204] (2/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_352-59958-0 from training. Duration: 0.86 2023-10-03 12:37:06,919 WARNING [train.py:1204] (2/4) Exclude cut with ID _14886_210_4220_1_1530702020355_3516190_159-73748-0_sp1.1 from training. Duration: 0.7545625 2023-10-03 12:37:08,412 WARNING [train.py:1204] (2/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_131-119569-0_sp1.1 from training. Duration: 0.92725 2023-10-03 12:37:09,962 WARNING [train.py:1204] (2/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_216-343007-0 from training. Duration: 0.66 2023-10-03 12:37:13,361 WARNING [train.py:1204] (2/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_113-92608-0_sp0.9 from training. Duration: 0.6 2023-10-03 12:37:16,143 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_40047_210_2290_1_1533290519_2374311_132-123271-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 12:37:16,202 WARNING [train.py:1204] (2/4) Exclude cut with ID _8793_210_6310_1_1528542985139_2310009_121-179782-0 from training. Duration: 0.8 2023-10-03 12:37:16,216 WARNING [train.py:1204] (2/4) Exclude cut with ID _43997_210_4525_1_1533538632555_5189329_283-218639-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 12:37:16,219 WARNING [train.py:1204] (2/4) Exclude cut with ID _49376_210_5986_1_1533985278732_3370740_297-78397-0_sp1.1 from training. Duration: 0.936375 2023-10-03 12:37:16,232 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_3475_210_2028_1_1526193578_3622569_377-166133-0_sp0.9 from training. Duration: 0.9555625 2023-10-03 12:37:16,237 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_520-213149-0_sp1.1 from training. Duration: 0.92725 2023-10-03 12:37:17,603 WARNING [train.py:1204] (2/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_567-144483-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 12:37:20,906 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_4183_210_2087_1_1526729100_1869838_143-280962-0_sp0.9 from training. Duration: 0.62225 2023-10-03 12:37:22,225 WARNING [train.py:1204] (2/4) Exclude cut with ID _49643_210_7989_1_1534640244224_3939031_611-185515-0 from training. Duration: 0.94 2023-10-03 12:37:24,892 WARNING [train.py:1204] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_88-341718-0_sp0.9 from training. Duration: 0.7333125 2023-10-03 12:37:26,470 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_51225_210_3601_1_1534400634_2327383_49-235441-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 12:37:27,844 WARNING [train.py:1204] (2/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_240-294400-0_sp1.1 from training. Duration: 0.936375 2023-10-03 12:37:27,968 WARNING [train.py:1204] (2/4) Exclude cut with ID _55429_210_10626_1_1534467588527_7174143_640-304825-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 12:37:27,993 WARNING [train.py:1204] (2/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_187-221566-0_sp1.1 from training. Duration: 0.3909375 2023-10-03 12:37:29,465 WARNING [train.py:1204] (2/4) Exclude cut with ID _7606_210_4915_1_1528894253277_5080690_127-177761-0_sp0.9 from training. Duration: 0.711125 2023-10-03 12:37:32,950 WARNING [train.py:1204] (2/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_430-116139-0 from training. Duration: 0.85 2023-10-03 12:37:34,304 WARNING [train.py:1204] (2/4) Exclude cut with ID _3675_210_4557_1_1526639934408_4182089_456-18305-0_sp0.9 from training. Duration: 0.8444375 2023-10-03 12:37:38,741 INFO [train.py:1046] (2/4) Epoch 36, batch 4500, loss[loss=0.1736, simple_loss=0.2405, pruned_loss=0.05335, over 23792.00 frames. ], tot_loss[loss=0.161, simple_loss=0.2404, pruned_loss=0.04085, over 4683832.40 frames. ], batch size: 212, lr: 2.82e-03, grad_scale: 16.0 2023-10-03 12:37:40,547 WARNING [train.py:1204] (2/4) Exclude cut with ID _18513_210_9206_1_1531618215223_3862620_402-5957-0_sp1.1 from training. Duration: 0.9 2023-10-03 12:37:41,903 WARNING [train.py:1204] (2/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_465-156500-0 from training. Duration: 0.72 2023-10-03 12:37:41,904 WARNING [train.py:1204] (2/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_714-310538-0 from training. Duration: 0.86 2023-10-03 12:37:43,328 WARNING [train.py:1204] (2/4) Exclude cut with ID _39411_210_5986_1_1533538825030_3343649_335-301187-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 12:37:49,372 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_41750_210_1960_1_1533520678_3948042_241-158616-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 12:37:49,433 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_46013_210_7234_1_1533776362_3744032_50-141535-0_sp1.1 from training. Duration: 0.9 2023-10-03 12:37:50,809 WARNING [train.py:1204] (2/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_43-206757-0_sp0.9 from training. Duration: 0.8333125 2023-10-03 12:37:52,143 WARNING [train.py:1204] (2/4) Exclude cut with ID _22961_210_8254_1_1531976366672_4278180_172-276296-0_sp1.1 from training. Duration: 0.97275 2023-10-03 12:37:52,163 WARNING [train.py:1204] (2/4) Exclude cut with ID _17956_210_4353_1_1531209568125_6979970_1305-301903-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 12:37:53,576 WARNING [train.py:1204] (2/4) Exclude cut with ID _28323_210_16187_1_1532480395667_4100300_604-44507-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 12:38:02,570 WARNING [train.py:1204] (2/4) Exclude cut with ID _23514_210_1389_1_1531832139785_3031439_88-337544-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 12:38:03,794 WARNING [train.py:1204] (2/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_932-3672-0_sp1.1 from training. Duration: 0.963625 2023-10-03 12:38:05,259 WARNING [train.py:1204] (2/4) Exclude cut with ID _46502_210_13794_1_1533724452198_3657159_207-109991-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 12:38:07,279 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_216-73841-0_sp1.1 from training. Duration: 0.87275 2023-10-03 12:38:08,623 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8453_210_4211_1_1528502940_8562730_347-330673-0_sp1.1 from training. Duration: 0.6545625 2023-10-03 12:38:13,327 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_4183_210_2087_1_1526729100_1869838_143-280962-0_sp1.1 from training. Duration: 0.5090625 2023-10-03 12:38:18,876 WARNING [train.py:1204] (2/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_325-113798-0_sp1.1 from training. Duration: 0.77275 2023-10-03 12:38:23,806 WARNING [train.py:1204] (2/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_91-212486-0_sp0.9 from training. Duration: 0.7555625 2023-10-03 12:38:25,236 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8776_210_2040_1_1528638837_4088036_244-278763-0_sp1.1 from training. Duration: 0.8181875 2023-10-03 12:38:25,275 WARNING [train.py:1204] (2/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_106-286130-0 from training. Duration: 0.98 2023-10-03 12:38:26,654 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_2746_210_1988_1_1525872816_3795627_372-205861-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 12:38:27,936 WARNING [train.py:1204] (2/4) Exclude cut with ID _30800_210_22382_1_1532499347904_4515959_240-54646-0_sp1.1 from training. Duration: 0.936375 2023-10-03 12:38:29,371 WARNING [train.py:1204] (2/4) Exclude cut with ID _59500_210_8254_1_1534809610680_3736009_162-180695-0_sp1.1 from training. Duration: 0.936375 2023-10-03 12:38:29,397 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7544_210_6753_1_1528591327_1471426_146-93357-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 12:38:32,586 WARNING [train.py:1204] (2/4) Exclude cut with ID _42657_210_2070_1_1533520654016_3327008_405-87432-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 12:38:32,608 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_4528_210_3273_1_1527076654_3276777_209-219804-0 from training. Duration: 0.69 2023-10-03 12:38:32,609 WARNING [train.py:1204] (2/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_78-182912-0_sp0.9 from training. Duration: 0.6555625 2023-10-03 12:38:32,622 WARNING [train.py:1204] (2/4) Exclude cut with ID _31255_210_13427_1_1532865619495_3815143_322-114998-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 12:38:36,117 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.3.encoder.layers.3.conv_module1.whiten, num_groups=1, num_channels=512, metric=4.94 vs. limit=15.0 2023-10-03 12:38:36,845 WARNING [train.py:1204] (2/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_848-280957-0_sp0.9 from training. Duration: 0.9666875 2023-10-03 12:38:36,865 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7696_210_2040_1_1528207167_3716099_117-125434-0_sp0.9 from training. Duration: 0.8444375 2023-10-03 12:38:40,048 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_1002-109192-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 12:38:42,652 WARNING [train.py:1204] (2/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_138-64210-0_sp0.9 from training. Duration: 0.811125 2023-10-03 12:38:42,674 WARNING [train.py:1204] (2/4) Exclude cut with ID _25808_210_13292_1_1532343514562_7366109_633-47524-0_sp1.1 from training. Duration: 0.9 2023-10-03 12:38:44,123 WARNING [train.py:1204] (2/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_297-5233-0 from training. Duration: 0.95 2023-10-03 12:38:45,553 WARNING [train.py:1204] (2/4) Exclude cut with ID _8394_210_6702_1_1528590504316_3540189_335-274780-0 from training. Duration: 0.78 2023-10-03 12:38:45,558 WARNING [train.py:1204] (2/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_301-330455-0 from training. Duration: 0.8 2023-10-03 12:38:48,396 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.conv_skip_rate, batch_count=1269766.6666666667, ans=0.0 2023-10-03 12:38:49,577 WARNING [train.py:1204] (2/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_383-205833-0 from training. Duration: 0.95 2023-10-03 12:38:52,186 INFO [train.py:1046] (2/4) Epoch 36, batch 4550, loss[loss=0.1383, simple_loss=0.1929, pruned_loss=0.04189, over 19297.00 frames. ], tot_loss[loss=0.1598, simple_loss=0.2388, pruned_loss=0.04037, over 4686893.07 frames. ], batch size: 388, lr: 2.82e-03, grad_scale: 16.0 2023-10-03 12:38:52,240 WARNING [train.py:1204] (2/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_48-182155-0 from training. Duration: 0.87 2023-10-03 12:38:52,334 WARNING [train.py:1204] (2/4) Exclude cut with ID _14832_210_8750_1_1530691351539_3171348_1-54315-0_sp1.1 from training. Duration: 0.7909375 2023-10-03 12:38:57,002 WARNING [train.py:1204] (2/4) Exclude cut with ID _44982_210_1511_1_1534312820108_3726029_413-100814-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 12:38:57,067 WARNING [train.py:1204] (2/4) Exclude cut with ID _3231_210_2106_1_1526205864934_6784220_104-261392-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 12:39:00,290 WARNING [train.py:1204] (2/4) Exclude cut with ID _24314_210_4915_1_1532135078116_7170179_1-13588-0_sp1.1 from training. Duration: 0.97275 2023-10-03 12:39:02,129 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.4.encoder.layers.0.nonlin_attention.whiten1, num_groups=1, num_channels=288, metric=7.11 vs. limit=10.0 2023-10-03 12:39:04,343 WARNING [train.py:1204] (2/4) Exclude cut with ID _44304_210_15374_1_1533639352765_4613129_141-8356-0_sp1.1 from training. Duration: 0.963625 2023-10-03 12:39:06,531 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.1.encoder.layers.1.feed_forward1.out_whiten, num_groups=1, num_channels=256, metric=7.27 vs. limit=15.0 2023-10-03 12:39:07,041 WARNING [train.py:1204] (2/4) Exclude cut with ID _37892_210_19596_1_1533198677897_3964058_374-335034-0_sp1.1 from training. Duration: 0.936375 2023-10-03 12:39:08,916 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_195-257512-0_sp0.9 from training. Duration: 0.7444375 2023-10-03 12:39:08,918 WARNING [train.py:1204] (2/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_1267-272693-0_sp1.1 from training. Duration: 0.663625 2023-10-03 12:39:08,919 WARNING [train.py:1204] (2/4) Exclude cut with ID _30196_210_1664_1_1533708496522_7074140_690-326155-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 12:39:10,433 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.bypass.scale_min, batch_count=1269900.0, ans=0.2 2023-10-03 12:39:10,471 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.bypass_mid.scale_min, batch_count=1269900.0, ans=0.2 2023-10-03 12:39:13,539 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_68847_210_14656_1_1536070214_2586843_38-20717-0_sp1.1 from training. Duration: 0.97275 2023-10-03 12:39:13,582 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_99-251314-0_sp1.1 from training. Duration: 0.7909375 2023-10-03 12:39:16,362 WARNING [train.py:1204] (2/4) Exclude cut with ID _49376_210_5986_1_1533985278732_3370740_225-95630-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 12:39:17,918 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_2655_210_2087_1_1525688959_3897400_202-147557-0 from training. Duration: 0.85 2023-10-03 12:39:19,202 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_5618_210_1874_1_1527311008_5851_1-214501-0_sp1.1 from training. Duration: 0.626375 2023-10-03 12:39:19,291 WARNING [train.py:1204] (2/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_225-43381-0_sp1.1 from training. Duration: 0.8545625 2023-10-03 12:39:21,935 WARNING [train.py:1204] (2/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_242-3472-0 from training. Duration: 0.73 2023-10-03 12:39:22,091 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder_embed.convnext.hidden_balancer.prob, batch_count=1269966.6666666667, ans=0.125 2023-10-03 12:39:24,817 WARNING [train.py:1204] (2/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_339-104791-0 from training. Duration: 0.49 2023-10-03 12:39:25,972 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.557e+02 1.879e+02 2.066e+02 2.299e+02 3.391e+02, threshold=4.132e+02, percent-clipped=0.0 2023-10-03 12:39:26,043 WARNING [train.py:1204] (2/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_488-92261-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 12:39:28,136 WARNING [train.py:1204] (2/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_175-163894-0 from training. Duration: 0.74 2023-10-03 12:39:28,253 WARNING [train.py:1204] (2/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_41-234353-0_sp0.9 from training. Duration: 0.7555625 2023-10-03 12:39:30,015 INFO [scaling.py:1118] (2/4) WithLoss: name=encoder.encoders.3.encoder.layers.2.self_attn_weights, loss-sum=0.000e+00 2023-10-03 12:39:31,716 WARNING [train.py:1204] (2/4) Exclude cut with ID _47251_210_18628_1_1533814063128_4137250_116-275927-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 12:39:31,754 WARNING [train.py:1204] (2/4) Exclude cut with ID _4697_210_2069_1_1526815801953_3823098_61-223324-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 12:39:33,074 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6961_210_2087_1_1527937168_6815011_562-165518-0_sp1.1 from training. Duration: 0.7 2023-10-03 12:39:36,190 WARNING [train.py:1204] (2/4) Exclude cut with ID _21989_210_5854_1_1531899214931_3271360_133-44759-0 from training. Duration: 0.77 2023-10-03 12:39:37,691 WARNING [train.py:1204] (2/4) Exclude cut with ID _23557_210_8750_1_1531890106370_6846255_77-284923-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 12:39:40,842 WARNING [train.py:1204] (2/4) Exclude cut with ID _13678_210_7497_1_1530530069226_3463170_464-296127-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 12:39:40,861 WARNING [train.py:1204] (2/4) Exclude cut with ID _22613_210_5219_1_1532253599686_4090170_393-36663-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 12:39:43,567 WARNING [train.py:1204] (2/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_235-262124-0_sp0.9 from training. Duration: 0.7444375 2023-10-03 12:39:43,664 WARNING [train.py:1204] (2/4) Exclude cut with ID _8480_210_1874_1_1528589391205_6728330_755-112625-0 from training. Duration: 0.74 2023-10-03 12:39:44,958 WARNING [train.py:1204] (2/4) Exclude cut with ID _6456_210_1534_1_1527681634212_3181049_215-309359-0 from training. Duration: 0.88 2023-10-03 12:39:44,980 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_3019_210_2028_1_1526044245_3264865_191-72235-0_sp1.1 from training. Duration: 0.8818125 2023-10-03 12:39:46,407 WARNING [train.py:1204] (2/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_380-162648-0 from training. Duration: 0.94 2023-10-03 12:39:49,285 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_92-46009-0 from training. Duration: 0.54 2023-10-03 12:39:49,302 WARNING [train.py:1204] (2/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_53-300544-0_sp0.9 from training. Duration: 0.7444375 2023-10-03 12:39:50,657 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_68235_210_1925_1_1535863247_4484656_101-250943-0_sp1.1 from training. Duration: 0.97275 2023-10-03 12:39:50,666 WARNING [train.py:1204] (2/4) Exclude cut with ID _14970_210_15709_1_1530775547361_1966965_204-22182-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 12:39:50,754 WARNING [train.py:1204] (2/4) Exclude cut with ID _55153_210_2253_1_1534384786677_3162096_310-130092-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 12:39:50,765 WARNING [train.py:1204] (2/4) Exclude cut with ID _60113_210_16271_1_1535164212481_4386079_559-167191-0_sp1.1 from training. Duration: 0.8454375 2023-10-03 12:39:52,146 WARNING [train.py:1204] (2/4) Exclude cut with ID _14368_210_4220_1_1530532714870_3595529_93-58002-0_sp1.1 from training. Duration: 0.5181875 2023-10-03 12:39:53,392 WARNING [train.py:1204] (2/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_110-224688-0 from training. Duration: 0.97 2023-10-03 12:39:54,843 WARNING [train.py:1204] (2/4) Exclude cut with ID _40402_210_18219_1_1533522324115_2953150_178-178004-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 12:39:54,853 WARNING [train.py:1204] (2/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_321-335475-0_sp1.1 from training. Duration: 0.4090625 2023-10-03 12:39:54,953 WARNING [train.py:1204] (2/4) Exclude cut with ID _66863_210_18053_1_1535698758334_8590319_1092-271784-0 from training. Duration: 0.96 2023-10-03 12:39:54,961 WARNING [train.py:1204] (2/4) Exclude cut with ID _28317_210_12742_1_1532480302381_8510325_607-213951-0_sp1.1 from training. Duration: 0.763625 2023-10-03 12:39:56,236 WARNING [train.py:1204] (2/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_468-283113-0 from training. Duration: 0.96 2023-10-03 12:39:58,566 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.1.encoder.layers.0.nonlin_attention.whiten1, num_groups=1, num_channels=192, metric=6.09 vs. limit=10.0 2023-10-03 12:39:59,544 WARNING [train.py:1204] (2/4) Exclude cut with ID _14922_210_6228_1_1530707435273_3693310_13-165512-0_sp1.1 from training. Duration: 0.7454375 2023-10-03 12:39:59,564 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_69630_210_7234_1_1535788670_910989_56-169762-0_sp1.1 from training. Duration: 0.92725 2023-10-03 12:40:00,328 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.2.encoder.layers.0.nonlin_attention.whiten2, num_groups=1, num_channels=384, metric=15.55 vs. limit=15.0 2023-10-03 12:40:00,988 WARNING [train.py:1204] (2/4) Exclude cut with ID _67335_210_5219_1_1535525975652_4275229_121-80357-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 12:40:01,036 WARNING [train.py:1204] (2/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_126-12079-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 12:40:02,867 WARNING [train.py:1204] (2/4) Exclude cut with ID _5294_210_1747_1_1527249643928_3696179_342-270278-0_sp1.1 from training. Duration: 0.5 2023-10-03 12:40:02,978 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_30322_210_1925_1_1532843478_826817_35-328264-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 12:40:05,637 WARNING [train.py:1204] (2/4) Exclude cut with ID _8480_210_1874_1_1528589391205_6728330_755-112625-0_sp1.1 from training. Duration: 0.67275 2023-10-03 12:40:06,945 INFO [train.py:1046] (2/4) Epoch 36, batch 4600, loss[loss=0.1785, simple_loss=0.2671, pruned_loss=0.04491, over 24445.00 frames. ], tot_loss[loss=0.1586, simple_loss=0.2366, pruned_loss=0.04028, over 4681246.40 frames. ], batch size: 69, lr: 2.82e-03, grad_scale: 16.0 2023-10-03 12:40:07,106 WARNING [train.py:1204] (2/4) Exclude cut with ID _38670_210_15379_1_1533448810659_4391059_132-146412-0_sp1.1 from training. Duration: 0.97275 2023-10-03 12:40:08,473 WARNING [train.py:1204] (2/4) Exclude cut with ID _22020_210_2461_1_1531724164513_6326900_563-39691-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 12:40:13,216 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_9460_210_4211_1_1528954587_10049667_572-37035-0_sp1.1 from training. Duration: 0.72725 2023-10-03 12:40:13,228 WARNING [train.py:1204] (2/4) Exclude cut with ID _68777_210_4353_1_1535810390731_3117904_129-166012-0_sp0.9 from training. Duration: 0.9444375 2023-10-03 12:40:13,302 WARNING [train.py:1204] (2/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_487-91921-0_sp1.1 from training. Duration: 0.963625 2023-10-03 12:40:15,849 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_195-257512-0 from training. Duration: 0.67 2023-10-03 12:40:15,952 WARNING [train.py:1204] (2/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_188-260011-0_sp1.1 from training. Duration: 0.663625 2023-10-03 12:40:20,005 WARNING [train.py:1204] (2/4) Exclude cut with ID _8480_210_1874_1_1528589391205_6728330_309-139896-0_sp0.9 from training. Duration: 0.9 2023-10-03 12:40:20,065 WARNING [train.py:1204] (2/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_866-321248-0_sp1.1 from training. Duration: 0.963625 2023-10-03 12:40:22,783 WARNING [train.py:1204] (2/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_510-216513-0_sp1.1 from training. Duration: 0.97275 2023-10-03 12:40:24,515 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.attention_skip_rate, batch_count=1270233.3333333333, ans=0.0 2023-10-03 12:40:28,718 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.feed_forward1.hidden_balancer.prob, batch_count=1270233.3333333333, ans=0.125 2023-10-03 12:40:30,504 WARNING [train.py:1204] (2/4) Exclude cut with ID _6653_210_2629_1_1527919172441_6732679_758-143677-0 from training. Duration: 0.98 2023-10-03 12:40:31,858 WARNING [train.py:1204] (2/4) Exclude cut with ID _55134_210_21982_1_1534590028377_3882170_50-214427-0_sp1.1 from training. Duration: 0.97275 2023-10-03 12:40:34,430 WARNING [train.py:1204] (2/4) Exclude cut with ID _31649_210_6228_1_1532584608063_4091078_482-301330-0_sp1.1 from training. Duration: 0.97275 2023-10-03 12:40:36,974 WARNING [train.py:1204] (2/4) Exclude cut with ID _35799_210_5854_1_1533263487887_3529071_490-326847-0_sp1.1 from training. Duration: 0.9 2023-10-03 12:40:36,988 WARNING [train.py:1204] (2/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_984-90716-0_sp1.1 from training. Duration: 0.963625 2023-10-03 12:40:38,682 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.feed_forward1.hidden_balancer.prob, batch_count=1270300.0, ans=0.125 2023-10-03 12:40:41,126 WARNING [train.py:1204] (2/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_166-246557-0 from training. Duration: 0.64 2023-10-03 12:40:41,127 WARNING [train.py:1204] (2/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_51-200220-0_sp1.1 from training. Duration: 0.5090625 2023-10-03 12:40:43,035 WARNING [train.py:1204] (2/4) Exclude cut with ID _51382_210_23862_1_1534492801838_3650104_275-92796-0_sp1.1 from training. Duration: 0.936375 2023-10-03 12:40:48,450 WARNING [train.py:1204] (2/4) Exclude cut with ID _29853_210_7497_1_1532505600851_6642110_461-142829-0_sp1.1 from training. Duration: 0.97275 2023-10-03 12:40:48,489 WARNING [train.py:1204] (2/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_788-298907-0_sp0.9 from training. Duration: 0.988875 2023-10-03 12:40:49,876 WARNING [train.py:1204] (2/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_739-2098-0_sp1.1 from training. Duration: 0.836375 2023-10-03 12:40:52,791 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7543_210_6753_1_1528541907_1399322_165-323619-0 from training. Duration: 0.69 2023-10-03 12:40:54,145 WARNING [train.py:1204] (2/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_401-255491-0_sp1.1 from training. Duration: 0.636375 2023-10-03 12:40:58,330 WARNING [train.py:1204] (2/4) Exclude cut with ID _66873_210_5517_1_1535454136506_4293370_24-143283-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 12:41:00,160 WARNING [train.py:1204] (2/4) Exclude cut with ID _14459_210_2228_1_1531443641946_7516700_1221-280439-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 12:41:00,654 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.5.encoder.layers.1.self_attn_weights.whiten_keys, num_groups=4, num_channels=128, metric=1.70 vs. limit=6.0 2023-10-03 12:41:02,900 WARNING [train.py:1204] (2/4) Exclude cut with ID _59202_210_26984_1_1534816810612_3549174_469-213482-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 12:41:02,901 WARNING [train.py:1204] (2/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_148-158509-0_sp0.9 from training. Duration: 0.4555625 2023-10-03 12:41:03,027 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.0.attention_skip_rate, batch_count=1270366.6666666667, ans=0.0 2023-10-03 12:41:05,388 WARNING [train.py:1204] (2/4) Exclude cut with ID _24314_210_4915_1_1532135078116_7170179_863-259095-0_sp1.1 from training. Duration: 0.97275 2023-10-03 12:41:05,469 WARNING [train.py:1204] (2/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_321-22821-0 from training. Duration: 0.89 2023-10-03 12:41:05,693 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.feed_forward1.out_proj.dropout_p, batch_count=1270433.3333333333, ans=0.1 2023-10-03 12:41:06,504 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_43831_210_2158_1_1533725502_5872791_55-163137-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 12:41:06,557 WARNING [train.py:1204] (2/4) Exclude cut with ID _27430_210_7770_1_1532604689375_1378590_26-25207-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 12:41:07,986 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_17613_210_2151_1_1531137569_2366160_223-316461-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 12:41:09,386 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_53679_210_4852_1_1534556726_4186141_541-213370-0_sp1.1 from training. Duration: 0.963625 2023-10-03 12:41:09,458 WARNING [train.py:1204] (2/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_369-295086-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 12:41:09,512 WARNING [train.py:1204] (2/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_91-267696-0 from training. Duration: 0.87 2023-10-03 12:41:10,862 WARNING [train.py:1204] (2/4) Exclude cut with ID _5294_210_1747_1_1527249643928_3696179_180-100713-0 from training. Duration: 0.81 2023-10-03 12:41:10,902 WARNING [train.py:1204] (2/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_32-335103-0 from training. Duration: 0.82 2023-10-03 12:41:10,908 WARNING [train.py:1204] (2/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_133-312727-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 12:41:13,862 WARNING [train.py:1204] (2/4) Exclude cut with ID _59288_210_5854_1_1535077757774_3412246_391-165934-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 12:41:13,905 WARNING [train.py:1204] (2/4) Exclude cut with ID _28323_210_16187_1_1532480395667_4100300_488-1281-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 12:41:13,972 WARNING [train.py:1204] (2/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_124-193207-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 12:41:20,677 INFO [train.py:1046] (2/4) Epoch 36, batch 4650, loss[loss=0.1725, simple_loss=0.2461, pruned_loss=0.04948, over 23506.00 frames. ], tot_loss[loss=0.1577, simple_loss=0.2359, pruned_loss=0.03972, over 4691233.64 frames. ], batch size: 106, lr: 2.81e-03, grad_scale: 16.0 2023-10-03 12:41:24,988 WARNING [train.py:1204] (2/4) Exclude cut with ID _14087_210_2253_1_1530529224512_3014029_226-242140-0_sp0.9 from training. Duration: 0.9 2023-10-03 12:41:26,498 WARNING [train.py:1204] (2/4) Exclude cut with ID _29277_210_3279_1_1532581109293_3173069_392-3694-0_sp1.1 from training. Duration: 0.936375 2023-10-03 12:41:27,808 WARNING [train.py:1204] (2/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_563-329842-0_sp1.1 from training. Duration: 0.97275 2023-10-03 12:41:27,866 WARNING [train.py:1204] (2/4) Exclude cut with ID _58583_210_5236_1_1534726835266_3574249_391-87755-0_sp1.1 from training. Duration: 0.92725 2023-10-03 12:41:27,898 WARNING [train.py:1204] (2/4) Exclude cut with ID _28427_210_4220_1_1532581240967_3061069_281-237827-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 12:41:27,912 WARNING [train.py:1204] (2/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_44-147898-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 12:41:30,738 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_24007_210_1794_1_1531918925_1663875_125-320772-0_sp1.1 from training. Duration: 0.97275 2023-10-03 12:41:33,872 WARNING [train.py:1204] (2/4) Exclude cut with ID _14368_210_4220_1_1530532714870_3595529_143-117421-0 from training. Duration: 0.58 2023-10-03 12:41:38,525 WARNING [train.py:1204] (2/4) Exclude cut with ID _69584_210_5219_1_1535785133775_4044450_325-7656-0_sp0.9 from training. Duration: 0.9555625 2023-10-03 12:41:39,953 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_37351_210_7925_1_1533012931_3812113_265-85780-0 from training. Duration: 0.88 2023-10-03 12:41:39,974 WARNING [train.py:1204] (2/4) Exclude cut with ID _18516_210_9206_1_1531704653206_3692406_331-237896-0_sp1.1 from training. Duration: 0.936375 2023-10-03 12:41:41,330 WARNING [train.py:1204] (2/4) Exclude cut with ID _14629_210_15709_1_1530604298182_956899_6-232695-0 from training. Duration: 0.94 2023-10-03 12:41:41,356 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7544_210_6753_1_1528587000_691468_87-178599-0_sp1.1 from training. Duration: 0.7909375 2023-10-03 12:41:41,402 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_326-303945-0 from training. Duration: 0.99 2023-10-03 12:41:42,697 WARNING [train.py:1204] (2/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_503-30719-0 from training. Duration: 0.98 2023-10-03 12:41:42,705 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_11305_210_1794_1_1529750428_7178148_299-344640-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 12:41:42,768 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_53071_210_16554_1_1534490405_2954066_265-170472-0_sp1.1 from training. Duration: 0.7545625 2023-10-03 12:41:46,130 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_174-277364-0_sp1.1 from training. Duration: 0.6454375 2023-10-03 12:41:46,365 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.bypass_mid.scale_min, batch_count=1270566.6666666667, ans=0.2 2023-10-03 12:41:47,423 WARNING [train.py:1204] (2/4) Exclude cut with ID _58414_210_8254_1_1535421609162_7151182_193-151884-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 12:41:47,441 WARNING [train.py:1204] (2/4) Exclude cut with ID IC0651W0104-322063 from training. Duration: 0.768 2023-10-03 12:41:51,431 WARNING [train.py:1204] (2/4) Exclude cut with ID _21881_210_3525_1_1531825242757_3798380_139-305982-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 12:41:52,859 WARNING [train.py:1204] (2/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_79-200392-0 from training. Duration: 0.97 2023-10-03 12:41:54,101 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.612e+02 1.890e+02 2.118e+02 2.487e+02 4.002e+02, threshold=4.237e+02, percent-clipped=0.0 2023-10-03 12:41:55,616 WARNING [train.py:1204] (2/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_451-280894-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 12:41:55,624 WARNING [train.py:1204] (2/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_304-273433-0_sp1.1 from training. Duration: 0.9 2023-10-03 12:41:55,875 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.feed_forward1.out_proj.dropout_p, batch_count=1270633.3333333333, ans=0.1 2023-10-03 12:41:57,106 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_5959_210_1794_1_1527400400_3445726_412-44052-0 from training. Duration: 0.77 2023-10-03 12:41:57,230 WARNING [train.py:1204] (2/4) Exclude cut with ID _4681_210_3525_1_1527251040321_1235713_148-26159-0_sp1.1 from training. Duration: 0.963625 2023-10-03 12:41:59,882 WARNING [train.py:1204] (2/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_887-249555-0_sp0.9 from training. Duration: 0.8555625 2023-10-03 12:42:02,710 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_25236_210_2158_1_1532051968_280567_37-6860-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 12:42:07,636 WARNING [train.py:1204] (2/4) Exclude cut with ID _29277_210_3279_1_1532581109293_3173069_417-115527-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 12:42:09,071 WARNING [train.py:1204] (2/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_552-134748-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 12:42:10,444 WARNING [train.py:1204] (2/4) Exclude cut with ID _43176_210_8254_1_1533524417469_4174339_188-174143-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 12:42:10,490 WARNING [train.py:1204] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_127-119109-0_sp1.1 from training. Duration: 0.6545625 2023-10-03 12:42:13,329 WARNING [train.py:1204] (2/4) Exclude cut with ID _14922_210_6228_1_1530707435273_3693310_38-187851-0 from training. Duration: 0.62 2023-10-03 12:42:14,677 WARNING [train.py:1204] (2/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_588-125165-0 from training. Duration: 0.83 2023-10-03 12:42:16,504 WARNING [train.py:1204] (2/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_344-152526-0_sp1.1 from training. Duration: 0.4818125 2023-10-03 12:42:16,505 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7002_210_1794_1_1527935634_4034444_487-236501-0 from training. Duration: 0.66 2023-10-03 12:42:16,688 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.feed_forward1.hidden_balancer.prob, batch_count=1270700.0, ans=0.125 2023-10-03 12:42:17,869 WARNING [train.py:1204] (2/4) Exclude cut with ID _23825_210_13449_1_1531915313526_3497150_379-16981-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 12:42:24,750 WARNING [train.py:1204] (2/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_297-5233-0_sp1.1 from training. Duration: 0.863625 2023-10-03 12:42:24,753 WARNING [train.py:1204] (2/4) Exclude cut with ID _41298_210_9019_1_1533430371450_4822143_178-283538-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 12:42:24,783 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7046_210_6753_1_1527895337_6920917_642-106799-0 from training. Duration: 0.92 2023-10-03 12:42:24,803 WARNING [train.py:1204] (2/4) Exclude cut with ID _6274_210_2084_1_1527937583837_7494020_532-291952-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 12:42:26,181 WARNING [train.py:1204] (2/4) Exclude cut with ID _47278_210_12041_1_1533902418349_3576110_260-270468-0_sp1.1 from training. Duration: 0.92725 2023-10-03 12:42:26,186 WARNING [train.py:1204] (2/4) Exclude cut with ID _14368_210_4220_1_1530532714870_3595529_144-3287-0_sp0.9 from training. Duration: 0.7444375 2023-10-03 12:42:28,837 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_162-262717-0_sp0.9 from training. Duration: 0.92225 2023-10-03 12:42:30,285 WARNING [train.py:1204] (2/4) Exclude cut with ID _3465_210_3525_1_1526175667060_3574049_62-335552-0_sp0.9 from training. Duration: 0.9666875 2023-10-03 12:42:30,290 WARNING [train.py:1204] (2/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_726-272852-0_sp1.1 from training. Duration: 0.92725 2023-10-03 12:42:31,625 WARNING [train.py:1204] (2/4) Exclude cut with ID _14898_210_9206_1_1530766812377_4177190_26-135492-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 12:42:33,002 INFO [train.py:1046] (2/4) Epoch 36, batch 4700, loss[loss=0.1609, simple_loss=0.248, pruned_loss=0.03695, over 24670.00 frames. ], tot_loss[loss=0.1584, simple_loss=0.2368, pruned_loss=0.04, over 4700542.60 frames. ], batch size: 73, lr: 2.81e-03, grad_scale: 16.0 2023-10-03 12:42:35,715 WARNING [train.py:1204] (2/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_200-95304-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 12:42:35,743 WARNING [train.py:1204] (2/4) Exclude cut with ID _45012_210_12651_1_1533608435136_5371380_240-37101-0_sp1.1 from training. Duration: 0.8454375 2023-10-03 12:42:35,758 WARNING [train.py:1204] (2/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_132-269633-0_sp0.9 from training. Duration: 0.7555625 2023-10-03 12:42:37,758 WARNING [train.py:1204] (2/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_907-151398-0_sp0.9 from training. Duration: 0.411125 2023-10-03 12:42:39,584 WARNING [train.py:1204] (2/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_45-265684-0_sp0.9 from training. Duration: 0.8 2023-10-03 12:42:39,670 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6291_210_3273_1_1527683649_1078498_74-292608-0 from training. Duration: 0.72 2023-10-03 12:42:46,636 WARNING [train.py:1204] (2/4) Exclude cut with ID _59202_210_26984_1_1534816810612_3549174_221-84819-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 12:42:47,348 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.2.encoder.layers.1.self_attn2.whiten, num_groups=1, num_channels=384, metric=17.40 vs. limit=22.5 2023-10-03 12:42:48,567 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_33696_210_3852_1_1533082780_2663375_181-325595-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 12:42:48,610 WARNING [train.py:1204] (2/4) Exclude cut with ID _47251_210_18628_1_1533814063128_4137250_351-154105-0_sp1.1 from training. Duration: 0.963625 2023-10-03 12:42:49,920 WARNING [train.py:1204] (2/4) Exclude cut with ID _35501_210_9288_1_1532998507986_3192189_351-334740-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 12:42:51,363 WARNING [train.py:1204] (2/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_342-100157-0_sp1.1 from training. Duration: 0.4909375 2023-10-03 12:42:54,233 WARNING [train.py:1204] (2/4) Exclude cut with ID _6789_210_1534_1_1527857937218_3118950_262-48853-0 from training. Duration: 0.56 2023-10-03 12:42:55,479 WARNING [train.py:1204] (2/4) Exclude cut with ID _69884_210_9771_1_1535846392818_4166169_430-201020-0 from training. Duration: 0.97 2023-10-03 12:42:56,916 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_71-136518-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 12:42:58,282 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_28-291982-0_sp1.1 from training. Duration: 0.8818125 2023-10-03 12:42:58,313 WARNING [train.py:1204] (2/4) Exclude cut with ID _54218_210_12210_1_1534413646388_4409259_437-275326-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 12:43:02,219 WARNING [train.py:1204] (2/4) Exclude cut with ID _55134_210_21982_1_1534590028377_3882170_40-309139-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 12:43:03,722 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.1.feed_forward1.out_proj.dropout_p, batch_count=1270966.6666666667, ans=0.1 2023-10-03 12:43:08,779 WARNING [train.py:1204] (2/4) Exclude cut with ID _32704_210_8954_1_1533017488840_6747370_266-329755-0_sp1.1 from training. Duration: 0.8090625 2023-10-03 12:43:10,220 WARNING [train.py:1204] (2/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_21-174514-0_sp1.1 from training. Duration: 0.3909375 2023-10-03 12:43:11,651 WARNING [train.py:1204] (2/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_409-207378-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 12:43:13,366 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.conv_module1.balancer1.prob, batch_count=1270966.6666666667, ans=0.125 2023-10-03 12:43:16,038 INFO [scaling.py:1118] (2/4) WithLoss: name=encoder.encoders.2.encoder.layers.1.self_attn_weights, loss-sum=0.000e+00 2023-10-03 12:43:17,716 WARNING [train.py:1204] (2/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_132-269633-0 from training. Duration: 0.68 2023-10-03 12:43:19,150 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_2245_210_2715_1_1525511548_3540254_310-52483-0_sp0.9 from training. Duration: 0.97775 2023-10-03 12:43:20,599 WARNING [train.py:1204] (2/4) Exclude cut with ID _27430_210_7770_1_1532604689375_1378590_47-77403-0_sp1.1 from training. Duration: 0.92725 2023-10-03 12:43:23,325 WARNING [train.py:1204] (2/4) Exclude cut with ID _7562_210_2084_1_1528887767916_3946229_186-326422-0 from training. Duration: 0.84 2023-10-03 12:43:24,746 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_14056_210_4163_1_1530604483_7572827_230-35993-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 12:43:28,822 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_23854_210_1794_1_1531969696_1329157_57-123491-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 12:43:30,057 WARNING [train.py:1204] (2/4) Exclude cut with ID _14747_210_8341_1_1530685797463_3654089_147-131229-0 from training. Duration: 0.76 2023-10-03 12:43:31,411 WARNING [train.py:1204] (2/4) Exclude cut with ID _30191_210_1664_1_1533189915047_7196670_430-19539-0_sp1.1 from training. Duration: 0.92725 2023-10-03 12:43:31,425 WARNING [train.py:1204] (2/4) Exclude cut with ID _67725_210_27767_1_1535608756671_5105359_44-50762-0_sp1.1 from training. Duration: 0.963625 2023-10-03 12:43:33,038 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.balancer2.prob, batch_count=1271100.0, ans=0.125 2023-10-03 12:43:34,205 WARNING [train.py:1204] (2/4) Exclude cut with ID _27485_210_4916_1_1532739191677_3668269_217-161755-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 12:43:34,451 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.conv_module2.balancer2.min_positive, batch_count=1271100.0, ans=0.05 2023-10-03 12:43:35,503 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_5931_210_2040_1_1527428481_4547999_321-193614-0_sp1.1 from training. Duration: 0.6090625 2023-10-03 12:43:35,526 WARNING [train.py:1204] (2/4) Exclude cut with ID _6267_210_2084_1_1527764658526_3894730_375-219887-0 from training. Duration: 0.67 2023-10-03 12:43:37,347 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9087W0022-538393 from training. Duration: 0.768 2023-10-03 12:43:39,395 WARNING [train.py:1204] (2/4) Exclude cut with ID _68777_210_4353_1_1535810390731_3117904_83-196459-0_sp1.1 from training. Duration: 0.963625 2023-10-03 12:43:42,188 WARNING [train.py:1204] (2/4) Exclude cut with ID _69884_210_9771_1_1535846392818_4166169_455-343812-0_sp1.1 from training. Duration: 0.92725 2023-10-03 12:43:42,189 WARNING [train.py:1204] (2/4) Exclude cut with ID _13916_210_9994_1_1530788341126_3763149_414-320174-0_sp1.1 from training. Duration: 0.92725 2023-10-03 12:43:42,193 WARNING [train.py:1204] (2/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_398-239819-0 from training. Duration: 0.99 2023-10-03 12:43:42,271 WARNING [train.py:1204] (2/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_199-232314-0_sp1.1 from training. Duration: 0.92725 2023-10-03 12:43:46,291 INFO [train.py:1046] (2/4) Epoch 36, batch 4750, loss[loss=0.1559, simple_loss=0.2422, pruned_loss=0.03478, over 24591.00 frames. ], tot_loss[loss=0.1587, simple_loss=0.2373, pruned_loss=0.04004, over 4704033.85 frames. ], batch size: 71, lr: 2.81e-03, grad_scale: 8.0 2023-10-03 12:43:47,749 WARNING [train.py:1204] (2/4) Exclude cut with ID _2334_210_2667_1_1525608238880_4705129_459-104997-0 from training. Duration: 0.95 2023-10-03 12:43:50,922 WARNING [train.py:1204] (2/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_441-209395-0_sp1.1 from training. Duration: 0.936375 2023-10-03 12:43:52,364 WARNING [train.py:1204] (2/4) Exclude cut with ID _27052_210_5986_1_1532329197187_3585009_320-80393-0_sp1.1 from training. Duration: 0.97275 2023-10-03 12:43:53,988 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.0.bypass.scale_min, batch_count=1271166.6666666667, ans=0.2 2023-10-03 12:43:54,825 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.0.layers.1.self_attn2.whiten, num_groups=1, num_channels=192, metric=11.41 vs. limit=22.5 2023-10-03 12:43:55,346 WARNING [train.py:1204] (2/4) Exclude cut with ID _21783_210_11357_1_1531742458730_3762679_89-217253-0_sp1.1 from training. Duration: 0.97275 2023-10-03 12:43:56,644 WARNING [train.py:1204] (2/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_453-282588-0_sp1.1 from training. Duration: 0.8909375 2023-10-03 12:43:58,027 WARNING [train.py:1204] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_88-341718-0 from training. Duration: 0.66 2023-10-03 12:43:58,072 WARNING [train.py:1204] (2/4) Exclude cut with ID _43552_210_8913_1_1533726031059_3523429_230-3521-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 12:44:00,855 WARNING [train.py:1204] (2/4) Exclude cut with ID _9679_210_7497_1_1529129212906_4363929_512-313889-0 from training. Duration: 0.81 2023-10-03 12:44:02,231 WARNING [train.py:1204] (2/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_194-252800-0_sp1.1 from training. Duration: 0.8818125 2023-10-03 12:44:02,249 WARNING [train.py:1204] (2/4) Exclude cut with ID _55209_210_1531_1_1534675828996_4597160_239-293541-0_sp1.1 from training. Duration: 0.963625 2023-10-03 12:44:03,484 WARNING [train.py:1204] (2/4) Exclude cut with ID _35799_210_5854_1_1533263487887_3529071_15-325131-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 12:44:08,891 WARNING [train.py:1204] (2/4) Exclude cut with ID _14100_210_8341_1_1530529320613_3678069_279-55760-0 from training. Duration: 0.93 2023-10-03 12:44:14,140 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_489-230703-0_sp1.1 from training. Duration: 0.77275 2023-10-03 12:44:15,520 WARNING [train.py:1204] (2/4) Exclude cut with ID _6694_210_4599_1_1527852637490_3965529_144-185051-0 from training. Duration: 0.96 2023-10-03 12:44:16,837 WARNING [train.py:1204] (2/4) Exclude cut with ID _28461_210_2253_1_1532771775189_3041299_1-228155-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 12:44:20,882 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.627e+02 1.859e+02 2.065e+02 2.331e+02 3.483e+02, threshold=4.130e+02, percent-clipped=0.0 2023-10-03 12:44:21,003 WARNING [train.py:1204] (2/4) Exclude cut with ID _32704_210_8954_1_1533017488840_6747370_686-252323-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 12:44:21,006 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_40896_210_15710_1_1533433956_7100802_1075-294633-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 12:44:21,017 WARNING [train.py:1204] (2/4) Exclude cut with ID _22083_210_8254_1_1531702862439_3678970_30-92879-0_sp1.1 from training. Duration: 0.97275 2023-10-03 12:44:21,237 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.ff3_skip_rate, batch_count=1271300.0, ans=0.0 2023-10-03 12:44:22,303 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9245W0121-616888 from training. Duration: 0.896 2023-10-03 12:44:22,306 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6786_210_1388_1_1527839699_4194495_378-46070-0 from training. Duration: 0.72 2023-10-03 12:44:26,045 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.conv_module2.balancer2.prob, batch_count=1271300.0, ans=0.125 2023-10-03 12:44:27,153 WARNING [train.py:1204] (2/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_317-175062-0 from training. Duration: 0.82 2023-10-03 12:44:28,648 WARNING [train.py:1204] (2/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_238-89390-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 12:44:28,843 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.conv_module2.balancer2.prob, batch_count=1271366.6666666667, ans=0.125 2023-10-03 12:44:30,256 INFO [scaling.py:1118] (2/4) WithLoss: name=encoder.encoders.3.encoder.layers.3.self_attn_weights, loss-sum=0.000e+00 2023-10-03 12:44:31,371 WARNING [train.py:1204] (2/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_524-310811-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 12:44:32,691 WARNING [train.py:1204] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_307-158462-0_sp0.9 from training. Duration: 0.7666875 2023-10-03 12:44:32,692 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9309W0191-640318 from training. Duration: 0.512 2023-10-03 12:44:32,696 WARNING [train.py:1204] (2/4) Exclude cut with ID _55153_210_2253_1_1534384786677_3162096_100-204617-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 12:44:32,828 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.ff2_skip_rate, batch_count=1271366.6666666667, ans=0.0 2023-10-03 12:44:35,369 WARNING [train.py:1204] (2/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_112-149213-0_sp1.1 from training. Duration: 0.863625 2023-10-03 12:44:38,179 WARNING [train.py:1204] (2/4) Exclude cut with ID _67831_210_12392_1_1535612464202_3092612_346-2148-0_sp1.1 from training. Duration: 0.8454375 2023-10-03 12:44:40,768 WARNING [train.py:1204] (2/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_51-117576-0_sp0.9 from training. Duration: 0.4444375 2023-10-03 12:44:40,789 WARNING [train.py:1204] (2/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_300-27235-0 from training. Duration: 0.87 2023-10-03 12:44:41,453 WARNING [train.py:1204] (2/4) Exclude cut with ID _40399_210_4353_1_1533305658194_3279406_195-116825-0_sp1.1 from training. Duration: 0.97275 2023-10-03 12:44:41,475 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7002_210_1794_1_1527939683_5157286_163-87552-0_sp0.9 from training. Duration: 0.9555625 2023-10-03 12:44:41,503 WARNING [train.py:1204] (2/4) Exclude cut with ID _19514_210_9259_1_1531530071330_5084230_560-237597-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 12:44:41,711 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.ff2_skip_rate, batch_count=1271366.6666666667, ans=0.0 2023-10-03 12:44:44,063 WARNING [train.py:1204] (2/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_41-234353-0_sp1.1 from training. Duration: 0.6181875 2023-10-03 12:44:44,092 WARNING [train.py:1204] (2/4) Exclude cut with ID _6274_210_2084_1_1527937583837_7494020_769-168302-0 from training. Duration: 0.71 2023-10-03 12:44:47,387 WARNING [train.py:1204] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_574-45541-0 from training. Duration: 0.65 2023-10-03 12:44:48,854 WARNING [train.py:1204] (2/4) Exclude cut with ID _50310_210_15488_1_1534068043345_3641080_262-67120-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 12:44:52,109 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_53071_210_16554_1_1534493434_1276796_35-11927-0_sp1.1 from training. Duration: 0.963625 2023-10-03 12:44:52,110 WARNING [train.py:1204] (2/4) Exclude cut with ID _3992_210_1747_1_1526644816031_3731060_107-251434-0 from training. Duration: 0.99 2023-10-03 12:44:52,151 WARNING [train.py:1204] (2/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_412-54488-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 12:44:53,679 WARNING [train.py:1204] (2/4) Exclude cut with ID _18516_210_9206_1_1531704653206_3692406_77-258490-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 12:44:55,045 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_40047_210_2290_1_1533290519_2374311_195-165693-0_sp0.9 from training. Duration: 0.988875 2023-10-03 12:44:56,315 WARNING [train.py:1204] (2/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_296-36971-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 12:44:56,367 WARNING [train.py:1204] (2/4) Exclude cut with ID _14970_210_15709_1_1530773543493_1715730_187-20259-0_sp1.1 from training. Duration: 0.8090625 2023-10-03 12:44:58,361 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.2.encoder.layers.1.feed_forward3.out_whiten, num_groups=1, num_channels=384, metric=7.24 vs. limit=15.0 2023-10-03 12:44:58,858 INFO [train.py:1046] (2/4) Epoch 36, batch 4800, loss[loss=0.1714, simple_loss=0.2517, pruned_loss=0.04554, over 23318.00 frames. ], tot_loss[loss=0.1594, simple_loss=0.2383, pruned_loss=0.04019, over 4710681.35 frames. ], batch size: 105, lr: 2.81e-03, grad_scale: 16.0 2023-10-03 12:45:00,321 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_53373_210_5399_1_1534229471_4715695_135-305927-0_sp1.1 from training. Duration: 0.936375 2023-10-03 12:45:00,346 WARNING [train.py:1204] (2/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_60-18123-0 from training. Duration: 0.88 2023-10-03 12:45:01,651 WARNING [train.py:1204] (2/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_166-166990-0 from training. Duration: 0.96 2023-10-03 12:45:01,726 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_5438_210_1794_1_1527336687_2600009_233-58072-0 from training. Duration: 0.77 2023-10-03 12:45:01,975 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.conv_module1.balancer2.prob, batch_count=1271500.0, ans=0.125 2023-10-03 12:45:03,373 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.conv_skip_rate, batch_count=1271500.0, ans=0.0 2023-10-03 12:45:04,457 WARNING [train.py:1204] (2/4) Exclude cut with ID _8833_210_5219_1_1528592665210_7553059_611-80313-0_sp0.9 from training. Duration: 0.711125 2023-10-03 12:45:04,479 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_53555_210_5632_1_1534595220_7232297_118-43239-0_sp1.1 from training. Duration: 0.936375 2023-10-03 12:45:04,562 WARNING [train.py:1204] (2/4) Exclude cut with ID _12369_210_4381_1_1530356078581_4388520_480-13146-0 from training. Duration: 0.85 2023-10-03 12:45:10,620 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_892-28814-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 12:45:10,662 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_24899_210_16554_1_1532046422_3827462_160-21255-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 12:45:12,655 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.self_attn_weights.pos_emb_skip_rate, batch_count=1271566.6666666667, ans=0.0 2023-10-03 12:45:15,333 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_154-316450-0_sp1.1 from training. Duration: 0.6909375 2023-10-03 12:45:18,726 WARNING [train.py:1204] (2/4) Exclude cut with ID _30196_210_1664_1_1533708496522_7074140_1225-301232-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 12:45:18,757 WARNING [train.py:1204] (2/4) Exclude cut with ID _28783_210_7943_1_1532671164808_7155479_414-208100-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 12:45:18,797 WARNING [train.py:1204] (2/4) Exclude cut with ID _19023_210_4353_1_1531378892552_5820780_83-254794-0 from training. Duration: 0.86 2023-10-03 12:45:20,621 WARNING [train.py:1204] (2/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_591-83752-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 12:45:20,665 WARNING [train.py:1204] (2/4) Exclude cut with ID _29877_210_6228_1_1532397791106_5276149_266-62291-0_sp1.1 from training. Duration: 0.7909375 2023-10-03 12:45:20,775 WARNING [train.py:1204] (2/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_280-161633-0_sp1.1 from training. Duration: 0.663625 2023-10-03 12:45:26,033 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7543_210_6753_1_1528539468_333764_39-156634-0_sp1.1 from training. Duration: 0.97275 2023-10-03 12:45:27,562 WARNING [train.py:1204] (2/4) Exclude cut with ID _67499_210_17282_1_1535509805220_3692430_505-312248-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 12:45:27,584 WARNING [train.py:1204] (2/4) Exclude cut with ID _5294_210_1747_1_1527249643928_3696179_180-100713-0_sp0.9 from training. Duration: 0.9 2023-10-03 12:45:27,673 WARNING [train.py:1204] (2/4) Exclude cut with ID _26528_210_9070_1_1532570410842_3851310_508-148132-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 12:45:27,683 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_211-339622-0_sp1.1 from training. Duration: 0.4090625 2023-10-03 12:45:27,697 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_40896_210_15710_1_1533433956_7100802_339-138356-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 12:45:30,420 WARNING [train.py:1204] (2/4) Exclude cut with ID _27060_210_5986_1_1532674838922_3522549_211-19933-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 12:45:31,816 WARNING [train.py:1204] (2/4) Exclude cut with ID _60229_210_27767_1_1534838406158_3655350_457-117923-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 12:45:34,682 WARNING [train.py:1204] (2/4) Exclude cut with ID _55429_210_10626_1_1534467588527_7174143_460-141439-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 12:45:36,142 WARNING [train.py:1204] (2/4) Exclude cut with ID _59170_210_6721_1_1534983633960_4203335_263-10780-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 12:45:36,154 WARNING [train.py:1204] (2/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_872-264525-0_sp0.9 from training. Duration: 0.72225 2023-10-03 12:45:37,535 WARNING [train.py:1204] (2/4) Exclude cut with ID _7624_210_1531_1_1528109802198_4933101_418-349834-0_sp0.9 from training. Duration: 0.6666875 2023-10-03 12:45:38,891 WARNING [train.py:1204] (2/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_27-53998-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 12:45:40,884 WARNING [train.py:1204] (2/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_202-183922-0 from training. Duration: 0.85 2023-10-03 12:45:40,896 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7002_210_1794_1_1527939683_5157286_1-281264-0 from training. Duration: 0.44 2023-10-03 12:45:41,090 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.1.conv_module1.balancer1.max_abs, batch_count=1271633.3333333333, ans=10.0 2023-10-03 12:45:42,829 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_53753_210_7234_1_1534320003_3594148_493-9314-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 12:45:42,843 WARNING [train.py:1204] (2/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_217-95056-0_sp1.1 from training. Duration: 0.8909375 2023-10-03 12:45:42,886 WARNING [train.py:1204] (2/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_406-45086-0_sp1.1 from training. Duration: 0.72725 2023-10-03 12:45:42,892 WARNING [train.py:1204] (2/4) Exclude cut with ID _31649_210_6228_1_1532584608063_4091078_505-213821-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 12:45:42,906 WARNING [train.py:1204] (2/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_317-175062-0_sp0.9 from training. Duration: 0.911125 2023-10-03 12:45:44,344 WARNING [train.py:1204] (2/4) Exclude cut with ID _5444_210_2151_1_1527418808464_5312210_726-27165-0_sp1.1 from training. Duration: 0.6090625 2023-10-03 12:45:45,596 WARNING [train.py:1204] (2/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_494-170437-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 12:45:50,163 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_474-64054-0_sp1.1 from training. Duration: 0.936375 2023-10-03 12:45:52,258 WARNING [train.py:1204] (2/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_521-7556-0_sp1.1 from training. Duration: 0.97275 2023-10-03 12:45:53,672 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_269-348505-0_sp1.1 from training. Duration: 0.92725 2023-10-03 12:45:57,808 WARNING [train.py:1204] (2/4) Exclude cut with ID _21878_210_3525_1_1531738772002_7264330_559-184862-0 from training. Duration: 0.94 2023-10-03 12:45:59,093 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_68702_210_23689_1_1535799577_3420068_92-76040-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 12:45:59,123 WARNING [train.py:1204] (2/4) Exclude cut with ID _22826_210_9994_1_1531908201522_3279169_227-102052-0_sp1.1 from training. Duration: 0.97275 2023-10-03 12:45:59,163 WARNING [train.py:1204] (2/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_718-250907-0_sp0.9 from training. Duration: 0.7666875 2023-10-03 12:46:00,619 WARNING [train.py:1204] (2/4) Exclude cut with ID _14069_210_5632_1_1530512692033_4803390_352-143891-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 12:46:06,045 WARNING [train.py:1204] (2/4) Exclude cut with ID _6267_210_2084_1_1527764658526_3894730_65-106602-0_sp1.1 from training. Duration: 0.936375 2023-10-03 12:46:07,370 WARNING [train.py:1204] (2/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_202-183922-0_sp0.9 from training. Duration: 0.9444375 2023-10-03 12:46:07,378 WARNING [train.py:1204] (2/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_465-268827-0_sp1.1 from training. Duration: 0.97275 2023-10-03 12:46:07,441 WARNING [train.py:1204] (2/4) Exclude cut with ID _12369_210_4381_1_1530356078581_4388520_58-29064-0_sp1.1 from training. Duration: 0.9 2023-10-03 12:46:07,473 WARNING [train.py:1204] (2/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_1-99680-0_sp0.9 from training. Duration: 0.7444375 2023-10-03 12:46:08,823 WARNING [train.py:1204] (2/4) Exclude cut with ID _13253_210_1925_1_1530757775819_3577873_191-144977-0_sp1.1 from training. Duration: 0.7090625 2023-10-03 12:46:12,209 WARNING [train.py:1204] (2/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_276-112684-0_sp1.1 from training. Duration: 0.92725 2023-10-03 12:46:12,225 WARNING [train.py:1204] (2/4) Exclude cut with ID _64639_210_5816_1_1535367619437_3528000_358-411-0_sp1.1 from training. Duration: 0.97275 2023-10-03 12:46:12,231 WARNING [train.py:1204] (2/4) Exclude cut with ID _31649_210_6228_1_1532584608063_4091078_106-21199-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 12:46:12,336 WARNING [train.py:1204] (2/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_901-79438-0 from training. Duration: 0.94 2023-10-03 12:46:13,506 INFO [train.py:1046] (2/4) Epoch 36, batch 4850, loss[loss=0.161, simple_loss=0.2484, pruned_loss=0.03678, over 24460.00 frames. ], tot_loss[loss=0.1595, simple_loss=0.2389, pruned_loss=0.04003, over 4714413.98 frames. ], batch size: 63, lr: 2.81e-03, grad_scale: 16.0 2023-10-03 12:46:15,692 WARNING [train.py:1204] (2/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_718-250907-0 from training. Duration: 0.69 2023-10-03 12:46:15,705 WARNING [train.py:1204] (2/4) Exclude cut with ID _5287_210_2084_1_1527418883040_3878670_363-194161-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 12:46:15,709 WARNING [train.py:1204] (2/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_586-321321-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 12:46:17,583 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_68900_210_2087_1_1535713157_3708529_164-292661-0_sp1.1 from training. Duration: 0.963625 2023-10-03 12:46:17,584 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_26433_210_12108_1_1532157158_4056480_114-144878-0_sp1.1 from training. Duration: 0.97275 2023-10-03 12:46:20,321 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_15517_210_6753_1_1530954008_3487336_96-341874-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 12:46:26,592 WARNING [train.py:1204] (2/4) Exclude cut with ID _14268_210_9206_1_1530622771285_4714099_637-86630-0 from training. Duration: 0.51 2023-10-03 12:46:27,896 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_28211_210_6388_1_1532861506_4523224_121-320092-0_sp1.1 from training. Duration: 0.92725 2023-10-03 12:46:30,754 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_141-89113-0_sp0.9 from training. Duration: 0.9555625 2023-10-03 12:46:32,111 WARNING [train.py:1204] (2/4) Exclude cut with ID _6789_210_1534_1_1527857937218_3118950_311-61262-0_sp1.1 from training. Duration: 0.5909375 2023-10-03 12:46:32,139 WARNING [train.py:1204] (2/4) Exclude cut with ID _58480_210_12392_1_1534811121111_3646160_389-200461-0_sp1.1 from training. Duration: 0.97275 2023-10-03 12:46:34,781 WARNING [train.py:1204] (2/4) Exclude cut with ID _41326_210_5854_1_1533778076509_3492080_28-156553-0_sp1.1 from training. Duration: 0.92725 2023-10-03 12:46:36,304 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.conv_module1.balancer2.prob, batch_count=1271900.0, ans=0.125 2023-10-03 12:46:37,372 WARNING [train.py:1204] (2/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_577-287150-0_sp1.1 from training. Duration: 0.7454375 2023-10-03 12:46:37,467 WARNING [train.py:1204] (2/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_40-129756-0_sp1.1 from training. Duration: 0.736375 2023-10-03 12:46:37,480 WARNING [train.py:1204] (2/4) Exclude cut with ID _60113_210_16271_1_1535164212481_4386079_490-280290-0 from training. Duration: 0.98 2023-10-03 12:46:38,358 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.1.encoder.layers.0.feed_forward1.out_whiten, num_groups=1, num_channels=256, metric=5.05 vs. limit=15.0 2023-10-03 12:46:42,079 WARNING [train.py:1204] (2/4) Exclude cut with ID _58059_210_5854_1_1534901471149_3164808_34-39905-0_sp1.1 from training. Duration: 0.963625 2023-10-03 12:46:44,123 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_791-197891-0_sp1.1 from training. Duration: 0.763625 2023-10-03 12:46:44,136 WARNING [train.py:1204] (2/4) Exclude cut with ID _6789_210_1534_1_1527857937218_3118950_262-48853-0_sp1.1 from training. Duration: 0.5090625 2023-10-03 12:46:44,195 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_2655_210_2087_1_1525688959_3897400_52-279899-0_sp0.9 from training. Duration: 0.7666875 2023-10-03 12:46:45,501 WARNING [train.py:1204] (2/4) Exclude cut with ID _14884_210_2252_1_1530707459689_5561250_114-201399-0 from training. Duration: 0.76 2023-10-03 12:46:48,691 WARNING [train.py:1204] (2/4) Exclude cut with ID _19023_210_4353_1_1531378892552_5820780_83-254794-0_sp0.9 from training. Duration: 0.9555625 2023-10-03 12:46:48,708 WARNING [train.py:1204] (2/4) Exclude cut with ID _53489_210_6228_1_1534298357169_3835970_10-297919-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 12:46:52,111 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.642e+02 1.942e+02 2.103e+02 2.430e+02 3.861e+02, threshold=4.206e+02, percent-clipped=0.0 2023-10-03 12:46:52,297 WARNING [train.py:1204] (2/4) Exclude cut with ID _44585_210_18628_1_1533605350387_4518199_418-3055-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 12:46:52,309 WARNING [train.py:1204] (2/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_310-109461-0 from training. Duration: 0.8 2023-10-03 12:46:53,636 WARNING [train.py:1204] (2/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_235-262124-0 from training. Duration: 0.67 2023-10-03 12:46:55,149 WARNING [train.py:1204] (2/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_195-167011-0_sp1.1 from training. Duration: 0.6818125 2023-10-03 12:47:01,978 WARNING [train.py:1204] (2/4) Exclude cut with ID _7852_210_5219_1_1528286471316_8139400_188-55162-0_sp1.1 from training. Duration: 0.936375 2023-10-03 12:47:03,260 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_35361_210_4238_1_1533262810_7793476_675-87012-0 from training. Duration: 0.89 2023-10-03 12:47:04,609 WARNING [train.py:1204] (2/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_465-81427-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 12:47:04,616 WARNING [train.py:1204] (2/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_597-154640-0_sp1.1 from training. Duration: 0.8181875 2023-10-03 12:47:05,993 WARNING [train.py:1204] (2/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_186-309161-0_sp0.9 from training. Duration: 0.6 2023-10-03 12:47:08,642 WARNING [train.py:1204] (2/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_546-119312-0 from training. Duration: 0.99 2023-10-03 12:47:08,647 WARNING [train.py:1204] (2/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_330-240060-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 12:47:10,578 WARNING [train.py:1204] (2/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_477-255782-0 from training. Duration: 0.82 2023-10-03 12:47:10,608 WARNING [train.py:1204] (2/4) Exclude cut with ID _14970_210_15709_1_1530773543493_1715730_150-248939-0_sp1.1 from training. Duration: 0.97275 2023-10-03 12:47:11,975 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_556-206615-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 12:47:13,341 WARNING [train.py:1204] (2/4) Exclude cut with ID _3465_210_3525_1_1526175667060_3574049_308-231735-0 from training. Duration: 0.73 2023-10-03 12:47:14,884 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.balancer1.prob, batch_count=1272100.0, ans=0.125 2023-10-03 12:47:18,420 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.conv_skip_rate, batch_count=1272100.0, ans=0.0 2023-10-03 12:47:19,577 WARNING [train.py:1204] (2/4) Exclude cut with ID _27485_210_4916_1_1532739191677_3668269_66-236571-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 12:47:25,544 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_2221_210_1988_1_1525597051_4935541_415-296882-0_sp0.9 from training. Duration: 0.8555625 2023-10-03 12:47:25,551 WARNING [train.py:1204] (2/4) Exclude cut with ID _64521_210_14699_1_1535243427259_3815328_231-78630-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 12:47:28,169 INFO [train.py:1046] (2/4) Epoch 36, batch 4900, loss[loss=0.1778, simple_loss=0.262, pruned_loss=0.04678, over 23964.00 frames. ], tot_loss[loss=0.1589, simple_loss=0.2387, pruned_loss=0.03958, over 4716401.73 frames. ], batch size: 80, lr: 2.81e-03, grad_scale: 8.0 2023-10-03 12:47:30,968 WARNING [train.py:1204] (2/4) Exclude cut with ID _13942_210_9988_1_1530604906036_3733360_499-55676-0 from training. Duration: 0.62 2023-10-03 12:47:30,971 WARNING [train.py:1204] (2/4) Exclude cut with ID _49376_210_5986_1_1533985278732_3370740_301-154763-0_sp1.1 from training. Duration: 0.8909375 2023-10-03 12:47:33,856 WARNING [train.py:1204] (2/4) Exclude cut with ID _27485_210_4916_1_1532739191677_3668269_255-216698-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 12:47:35,268 WARNING [train.py:1204] (2/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_137-334143-0_sp1.1 from training. Duration: 0.97275 2023-10-03 12:47:35,297 WARNING [train.py:1204] (2/4) Exclude cut with ID _6456_210_1534_1_1527681634212_3181049_215-309359-0_sp1.1 from training. Duration: 0.8 2023-10-03 12:47:36,952 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.self_attn_weights.pos_emb_skip_rate, batch_count=1272166.6666666667, ans=0.0 2023-10-03 12:47:39,104 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.0.layers.0.self_attn1.whiten, num_groups=1, num_channels=192, metric=11.54 vs. limit=22.5 2023-10-03 12:47:39,804 WARNING [train.py:1204] (2/4) Exclude cut with ID _65640_210_6310_1_1535368201445_3173250_209-13008-0 from training. Duration: 0.99 2023-10-03 12:47:45,333 WARNING [train.py:1204] (2/4) Exclude cut with ID _12369_210_4381_1_1530356078581_4388520_58-29064-0 from training. Duration: 0.99 2023-10-03 12:47:48,241 WARNING [train.py:1204] (2/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_410-287857-0 from training. Duration: 0.93 2023-10-03 12:47:48,327 WARNING [train.py:1204] (2/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_153-153537-0 from training. Duration: 0.91 2023-10-03 12:47:50,248 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7544_210_6753_1_1528587722_3576686_172-171750-0_sp0.9 from training. Duration: 0.72225 2023-10-03 12:47:50,303 WARNING [train.py:1204] (2/4) Exclude cut with ID _64521_210_14699_1_1535243427259_3815328_464-183853-0_sp1.1 from training. Duration: 0.97275 2023-10-03 12:47:50,318 WARNING [train.py:1204] (2/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_385-210308-0_sp1.1 from training. Duration: 0.963625 2023-10-03 12:47:50,326 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_46013_210_7234_1_1533776362_3744032_354-261865-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 12:47:50,336 WARNING [train.py:1204] (2/4) Exclude cut with ID _3067_210_2667_1_1526212974640_4503168_250-285937-0_sp1.1 from training. Duration: 0.72725 2023-10-03 12:47:52,271 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_40036_210_13405_1_1533648647_1315388_48-146016-0 from training. Duration: 0.93 2023-10-03 12:47:55,016 WARNING [train.py:1204] (2/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_280-161633-0 from training. Duration: 0.73 2023-10-03 12:47:55,068 WARNING [train.py:1204] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_308-161067-0_sp0.9 from training. Duration: 0.7333125 2023-10-03 12:47:56,431 WARNING [train.py:1204] (2/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_68-212801-0_sp0.9 from training. Duration: 0.811125 2023-10-03 12:47:56,485 WARNING [train.py:1204] (2/4) Exclude cut with ID _6789_210_1534_1_1527857937218_3118950_311-61262-0_sp0.9 from training. Duration: 0.72225 2023-10-03 12:47:57,979 WARNING [train.py:1204] (2/4) Exclude cut with ID _14632_210_9988_1_1530691279392_3923200_102-204477-0_sp1.1 from training. Duration: 0.8818125 2023-10-03 12:47:59,382 WARNING [train.py:1204] (2/4) Exclude cut with ID _65242_210_18628_1_1535369393717_3823149_290-117517-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 12:48:00,755 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_69630_210_7234_1_1535788670_910989_35-337140-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 12:48:00,763 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_5618_210_1874_1_1527311008_5851_1-214501-0 from training. Duration: 0.689 2023-10-03 12:48:02,069 WARNING [train.py:1204] (2/4) Exclude cut with ID _8064_210_5399_1_1528372299291_4282973_168-50337-0_sp0.9 from training. Duration: 0.9333125 2023-10-03 12:48:03,435 WARNING [train.py:1204] (2/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_185-14438-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 12:48:03,450 WARNING [train.py:1204] (2/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_61-327062-0 from training. Duration: 0.81 2023-10-03 12:48:03,456 WARNING [train.py:1204] (2/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_98-741-0 from training. Duration: 0.8 2023-10-03 12:48:06,910 WARNING [train.py:1204] (2/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_312-204798-0 from training. Duration: 0.93 2023-10-03 12:48:09,535 WARNING [train.py:1204] (2/4) Exclude cut with ID _14998_210_5219_1_1530705582080_7474970_787-294416-0_sp0.9 from training. Duration: 0.911125 2023-10-03 12:48:09,620 WARNING [train.py:1204] (2/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_140-181176-0_sp1.1 from training. Duration: 0.836375 2023-10-03 12:48:09,665 WARNING [train.py:1204] (2/4) Exclude cut with ID _14687_210_5854_1_1530703838259_3664889_115-239046-0_sp1.1 from training. Duration: 0.6090625 2023-10-03 12:48:10,905 WARNING [train.py:1204] (2/4) Exclude cut with ID _56423_210_5854_1_1534896061049_3165969_370-230614-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 12:48:10,950 WARNING [train.py:1204] (2/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_406-275649-0_sp0.9 from training. Duration: 0.4666875 2023-10-03 12:48:10,967 WARNING [train.py:1204] (2/4) Exclude cut with ID _15150_210_8341_1_1530752411803_7256384_22-138274-0_sp1.1 from training. Duration: 0.87275 2023-10-03 12:48:12,317 WARNING [train.py:1204] (2/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_125-125818-0 from training. Duration: 0.96 2023-10-03 12:48:13,807 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_4399_210_1874_1_1526705485_2400757_220-47974-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 12:48:16,484 WARNING [train.py:1204] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_308-161067-0_sp1.1 from training. Duration: 0.6 2023-10-03 12:48:19,113 WARNING [train.py:1204] (2/4) Exclude cut with ID _53489_210_6228_1_1534298357169_3835970_102-57645-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 12:48:21,664 WARNING [train.py:1204] (2/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_178-177582-0 from training. Duration: 0.74 2023-10-03 12:48:21,889 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.conv_skip_rate, batch_count=1272366.6666666667, ans=0.0 2023-10-03 12:48:23,019 WARNING [train.py:1204] (2/4) Exclude cut with ID _14349_210_8341_1_1530615710818_3442470_170-336541-0_sp1.1 from training. Duration: 0.7818125 2023-10-03 12:48:23,082 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_521-214276-0_sp0.9 from training. Duration: 0.488875 2023-10-03 12:48:23,295 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.ff3_skip_rate, batch_count=1272366.6666666667, ans=0.0 2023-10-03 12:48:24,421 WARNING [train.py:1204] (2/4) Exclude cut with ID _66070_210_7640_1_1535441393096_7805310_489-292631-0 from training. Duration: 0.79 2023-10-03 12:48:30,184 WARNING [train.py:1204] (2/4) Exclude cut with ID _14069_210_5632_1_1530512692033_4803390_438-340023-0_sp1.1 from training. Duration: 0.936375 2023-10-03 12:48:31,527 WARNING [train.py:1204] (2/4) Exclude cut with ID _7852_210_5219_1_1528286471316_8139400_242-105336-0_sp1.1 from training. Duration: 0.6545625 2023-10-03 12:48:32,927 WARNING [train.py:1204] (2/4) Exclude cut with ID _3867_210_1883_1_1526641200575_4475830_272-215563-0 from training. Duration: 0.57 2023-10-03 12:48:34,213 WARNING [train.py:1204] (2/4) Exclude cut with ID _14368_210_4220_1_1530532714870_3595529_143-117421-0_sp0.9 from training. Duration: 0.6444375 2023-10-03 12:48:34,220 WARNING [train.py:1204] (2/4) Exclude cut with ID _9235_210_5219_1_1528891154253_8046120_30-326681-0_sp1.1 from training. Duration: 0.7545625 2023-10-03 12:48:35,630 WARNING [train.py:1204] (2/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_538-218448-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 12:48:40,413 WARNING [train.py:1204] (2/4) Exclude cut with ID _33640_210_2655_1_1533117605479_4855469_60-192970-0_sp1.1 from training. Duration: 0.963625 2023-10-03 12:48:40,430 WARNING [train.py:1204] (2/4) Exclude cut with ID _65585_210_12210_1_1535450451584_3764098_112-294301-0_sp1.1 from training. Duration: 0.8 2023-10-03 12:48:40,463 WARNING [train.py:1204] (2/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_643-37412-0_sp1.1 from training. Duration: 0.936375 2023-10-03 12:48:40,479 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_9302_210_2550_1_1528889962_4655416_638-301620-0_sp1.1 from training. Duration: 0.42725 2023-10-03 12:48:41,767 INFO [train.py:1046] (2/4) Epoch 36, batch 4950, loss[loss=0.1489, simple_loss=0.2226, pruned_loss=0.03761, over 24340.00 frames. ], tot_loss[loss=0.1577, simple_loss=0.2372, pruned_loss=0.03904, over 4722547.40 frames. ], batch size: 56, lr: 2.81e-03, grad_scale: 8.0 2023-10-03 12:48:41,882 WARNING [train.py:1204] (2/4) Exclude cut with ID _3867_210_1883_1_1526641200575_4475830_272-215563-0_sp1.1 from training. Duration: 0.5181875 2023-10-03 12:48:44,548 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_53679_210_4852_1_1534556726_4186141_358-154143-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 12:48:44,560 WARNING [train.py:1204] (2/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_841-341679-0_sp0.9 from training. Duration: 0.6444375 2023-10-03 12:48:45,087 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.5.encoder.layers.0.feed_forward1.out_whiten, num_groups=1, num_channels=256, metric=6.48 vs. limit=15.0 2023-10-03 12:48:47,425 WARNING [train.py:1204] (2/4) Exclude cut with ID _7160_210_1876_1_1527940923231_3703799_302-150728-0 from training. Duration: 0.78 2023-10-03 12:48:47,464 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_125-203105-0 from training. Duration: 0.8 2023-10-03 12:48:47,485 WARNING [train.py:1204] (2/4) Exclude cut with ID _5585_210_2667_1_1527418813324_8007340_89-332072-0_sp0.9 from training. Duration: 0.711125 2023-10-03 12:48:49,842 WARNING [train.py:1204] (2/4) Exclude cut with ID _3992_210_1747_1_1526644816031_3731060_36-147855-0 from training. Duration: 0.78 2023-10-03 12:48:49,860 WARNING [train.py:1204] (2/4) Exclude cut with ID _59256_210_5986_1_1535076085997_3589120_172-239045-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 12:48:49,876 WARNING [train.py:1204] (2/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_472-216663-0_sp1.1 from training. Duration: 0.836375 2023-10-03 12:48:51,085 WARNING [train.py:1204] (2/4) Exclude cut with ID _36512_210_6987_1_1533033143998_3126069_181-306500-0_sp1.1 from training. Duration: 0.863625 2023-10-03 12:48:51,116 WARNING [train.py:1204] (2/4) Exclude cut with ID _56524_210_9642_1_1534572011779_3619170_492-95008-0_sp1.1 from training. Duration: 0.97275 2023-10-03 12:48:53,832 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_33348_210_5632_1_1533086912_3324030_119-90326-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 12:48:53,880 WARNING [train.py:1204] (2/4) Exclude cut with ID _14368_210_4220_1_1530532714870_3595529_231-137892-0_sp1.1 from training. Duration: 0.9 2023-10-03 12:48:55,250 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8453_210_4211_1_1528502940_8562730_596-306820-0_sp1.1 from training. Duration: 0.8818125 2023-10-03 12:48:56,604 WARNING [train.py:1204] (2/4) Exclude cut with ID _27052_210_5986_1_1532329197187_3585009_50-242750-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 12:48:59,334 WARNING [train.py:1204] (2/4) Exclude cut with ID _13913_210_9994_1_1530579615187_3383897_358-98435-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 12:48:59,355 WARNING [train.py:1204] (2/4) Exclude cut with ID _27013_210_1389_1_1532437173240_3252649_399-229778-0_sp1.1 from training. Duration: 0.963625 2023-10-03 12:48:59,522 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.feed_forward3.hidden_balancer.prob, batch_count=1272566.6666666667, ans=0.125 2023-10-03 12:49:03,619 WARNING [train.py:1204] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_185-174805-0_sp0.9 from training. Duration: 0.7555625 2023-10-03 12:49:06,763 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.conv_module2.balancer2.prob, batch_count=1272566.6666666667, ans=0.125 2023-10-03 12:49:07,820 WARNING [train.py:1204] (2/4) Exclude cut with ID _65585_210_12210_1_1535450451584_3764098_196-221014-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 12:49:09,182 WARNING [train.py:1204] (2/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_202-293523-0_sp1.1 from training. Duration: 0.6545625 2023-10-03 12:49:11,059 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8275_210_4198_1_1528455320_3892571_213-127634-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 12:49:12,282 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_40896_210_15710_1_1533433956_7100802_921-231678-0_sp1.1 from training. Duration: 0.97275 2023-10-03 12:49:13,639 WARNING [train.py:1204] (2/4) Exclude cut with ID _38020_210_5986_1_1533112252259_7006089_493-102745-0_sp1.1 from training. Duration: 0.8909375 2023-10-03 12:49:14,997 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6846_210_6753_1_1527850767_4295158_2-315073-0 from training. Duration: 0.88 2023-10-03 12:49:16,379 WARNING [train.py:1204] (2/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_249-281349-0 from training. Duration: 0.85 2023-10-03 12:49:17,641 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.645e+02 1.932e+02 2.177e+02 2.793e+02 4.403e+02, threshold=4.354e+02, percent-clipped=2.0 2023-10-03 12:49:17,771 WARNING [train.py:1204] (2/4) Exclude cut with ID _18513_210_9206_1_1531618215223_3862620_314-83429-0_sp1.1 from training. Duration: 0.97275 2023-10-03 12:49:19,681 WARNING [train.py:1204] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_215-202973-0_sp0.9 from training. Duration: 0.97775 2023-10-03 12:49:19,697 WARNING [train.py:1204] (2/4) Exclude cut with ID _7719_210_3525_1_1528456153163_4296960_88-250259-0_sp1.1 from training. Duration: 0.763625 2023-10-03 12:49:21,588 WARNING [train.py:1204] (2/4) Exclude cut with ID _7606_210_4915_1_1528894253277_5080690_301-10732-0_sp1.1 from training. Duration: 0.736375 2023-10-03 12:49:21,597 WARNING [train.py:1204] (2/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_818-11907-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 12:49:21,657 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_5959_210_1794_1_1527400400_3445726_412-44052-0_sp1.1 from training. Duration: 0.7 2023-10-03 12:49:23,639 WARNING [train.py:1204] (2/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_6-279195-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 12:49:26,095 WARNING [train.py:1204] (2/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_252-216674-0_sp0.9 from training. Duration: 0.6 2023-10-03 12:49:28,824 WARNING [train.py:1204] (2/4) Exclude cut with ID _14368_210_4220_1_1530532714870_3595529_144-3287-0_sp1.1 from training. Duration: 0.6090625 2023-10-03 12:49:30,188 WARNING [train.py:1204] (2/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_342-3879-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 12:49:30,214 WARNING [train.py:1204] (2/4) Exclude cut with ID _33640_210_2655_1_1533117605479_4855469_584-65630-0_sp1.1 from training. Duration: 0.97275 2023-10-03 12:49:30,284 WARNING [train.py:1204] (2/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_37-189262-0 from training. Duration: 0.98 2023-10-03 12:49:30,312 WARNING [train.py:1204] (2/4) Exclude cut with ID _9389_210_1664_1_1529217337301_5289269_379-128794-0_sp0.9 from training. Duration: 0.9444375 2023-10-03 12:49:31,690 WARNING [train.py:1204] (2/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_119-207162-0_sp1.1 from training. Duration: 0.6454375 2023-10-03 12:49:31,910 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.conv_skip_rate, batch_count=1272700.0, ans=0.0 2023-10-03 12:49:33,168 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder_embed.conv.5.prob, batch_count=1272700.0, ans=0.125 2023-10-03 12:49:34,573 WARNING [train.py:1204] (2/4) Exclude cut with ID _2941_210_2577_1_1526199673150_2489759_200-333643-0_sp1.1 from training. Duration: 0.936375 2023-10-03 12:49:35,964 WARNING [train.py:1204] (2/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_479-125611-0_sp1.1 from training. Duration: 0.87275 2023-10-03 12:49:35,974 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_3475_210_2028_1_1526193578_3622569_64-297072-0_sp1.1 from training. Duration: 0.82725 2023-10-03 12:49:36,010 WARNING [train.py:1204] (2/4) Exclude cut with ID _53565_210_10635_1_1534337218335_3820259_139-346291-0_sp1.1 from training. Duration: 0.97275 2023-10-03 12:49:37,780 WARNING [train.py:1204] (2/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_588-125165-0_sp1.1 from training. Duration: 0.7545625 2023-10-03 12:49:39,095 WARNING [train.py:1204] (2/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_938-205135-0_sp1.1 from training. Duration: 0.7909375 2023-10-03 12:49:39,233 WARNING [train.py:1204] (2/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_216-48707-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 12:49:40,596 WARNING [train.py:1204] (2/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_477-255782-0_sp1.1 from training. Duration: 0.7454375 2023-10-03 12:49:40,632 WARNING [train.py:1204] (2/4) Exclude cut with ID _16909_210_5724_1_1531292079912_4118919_583-92842-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 12:49:41,936 WARNING [train.py:1204] (2/4) Exclude cut with ID _60860_210_8254_1_1534896015875_3557169_245-256666-0 from training. Duration: 0.97 2023-10-03 12:49:46,698 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_45399_210_4852_1_1533624725_7579737_349-116173-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 12:49:46,954 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.conv_module1.balancer2.prob, batch_count=1272766.6666666667, ans=0.125 2023-10-03 12:49:52,016 WARNING [train.py:1204] (2/4) Exclude cut with ID _8699_210_2069_1_1528630346831_3960409_155-39766-0 from training. Duration: 0.78 2023-10-03 12:49:52,031 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_86-16881-0_sp1.1 from training. Duration: 0.52725 2023-10-03 12:49:52,713 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.2.encoder.layers.0.nonlin_attention.whiten2, num_groups=1, num_channels=384, metric=7.41 vs. limit=15.0 2023-10-03 12:49:55,798 INFO [train.py:1046] (2/4) Epoch 36, batch 5000, loss[loss=0.1612, simple_loss=0.2186, pruned_loss=0.05196, over 19417.00 frames. ], tot_loss[loss=0.1577, simple_loss=0.2371, pruned_loss=0.03918, over 4733716.95 frames. ], batch size: 389, lr: 2.81e-03, grad_scale: 8.0 2023-10-03 12:49:58,601 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_14056_210_4163_1_1530604483_7572827_191-159384-0_sp1.1 from training. Duration: 0.97275 2023-10-03 12:49:58,608 WARNING [train.py:1204] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_307-158462-0_sp1.1 from training. Duration: 0.62725 2023-10-03 12:50:00,012 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7544_210_6753_1_1528591327_1471426_33-346031-0 from training. Duration: 0.61 2023-10-03 12:50:01,359 WARNING [train.py:1204] (2/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_112-149213-0 from training. Duration: 0.95 2023-10-03 12:50:02,738 WARNING [train.py:1204] (2/4) Exclude cut with ID _55844_210_2346_1_1534471212569_7625707_925-191342-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 12:50:04,214 WARNING [train.py:1204] (2/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_87-278686-0 from training. Duration: 0.77 2023-10-03 12:50:04,243 WARNING [train.py:1204] (2/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_415-78792-0_sp1.1 from training. Duration: 0.736375 2023-10-03 12:50:04,261 WARNING [train.py:1204] (2/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_107-332398-0_sp0.9 from training. Duration: 0.8666875 2023-10-03 12:50:05,577 WARNING [train.py:1204] (2/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_72-251255-0 from training. Duration: 0.94 2023-10-03 12:50:05,616 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_431-227273-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 12:50:07,446 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7544_210_6753_1_1528587000_691468_87-178599-0_sp0.9 from training. Duration: 0.9666875 2023-10-03 12:50:07,527 WARNING [train.py:1204] (2/4) Exclude cut with ID _13916_210_9994_1_1530788341126_3763149_60-191556-0 from training. Duration: 0.61 2023-10-03 12:50:07,529 WARNING [train.py:1204] (2/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_746-164782-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 12:50:08,085 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.3.encoder.layers.3.nonlin_attention.whiten1, num_groups=1, num_channels=384, metric=5.70 vs. limit=10.0 2023-10-03 12:50:08,890 WARNING [train.py:1204] (2/4) Exclude cut with ID _33640_210_2655_1_1533117605479_4855469_596-21066-0_sp1.1 from training. Duration: 0.92725 2023-10-03 12:50:10,373 WARNING [train.py:1204] (2/4) Exclude cut with ID _40402_210_18219_1_1533522324115_2953150_320-14078-0 from training. Duration: 0.95 2023-10-03 12:50:11,644 WARNING [train.py:1204] (2/4) Exclude cut with ID _29877_210_6228_1_1532397791106_5276149_266-62291-0 from training. Duration: 0.87 2023-10-03 12:50:11,723 WARNING [train.py:1204] (2/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_176-56120-0_sp0.9 from training. Duration: 0.92225 2023-10-03 12:50:13,033 WARNING [train.py:1204] (2/4) Exclude cut with ID _6275_210_2084_1_1528110062213_7669049_91-26919-0 from training. Duration: 0.97 2023-10-03 12:50:13,050 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_224-322394-0_sp0.9 from training. Duration: 0.7333125 2023-10-03 12:50:13,091 WARNING [train.py:1204] (2/4) Exclude cut with ID _44982_210_1511_1_1534312820108_3726029_586-297059-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 12:50:14,300 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_196-289637-0_sp1.1 from training. Duration: 0.6818125 2023-10-03 12:50:14,301 WARNING [train.py:1204] (2/4) Exclude cut with ID _18513_210_9206_1_1531618215223_3862620_402-5957-0 from training. Duration: 0.99 2023-10-03 12:50:14,314 WARNING [train.py:1204] (2/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_81-300212-0 from training. Duration: 0.6 2023-10-03 12:50:15,755 WARNING [train.py:1204] (2/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_166-128941-0 from training. Duration: 0.54 2023-10-03 12:50:15,768 WARNING [train.py:1204] (2/4) Exclude cut with ID _58059_210_5854_1_1534901471149_3164808_57-309660-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 12:50:17,744 WARNING [train.py:1204] (2/4) Exclude cut with ID _60621_210_17071_1_1535013053530_3671739_73-155814-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 12:50:18,042 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.nonlin_attention.balancer.prob, batch_count=1272900.0, ans=0.125 2023-10-03 12:50:19,118 WARNING [train.py:1204] (2/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_726-270780-0 from training. Duration: 0.86 2023-10-03 12:50:19,138 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8853_210_3273_1_1528633899_818968_51-187685-0_sp1.1 from training. Duration: 0.62725 2023-10-03 12:50:21,112 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_315-212794-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 12:50:22,962 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_53373_210_5399_1_1534229471_4715695_314-122795-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 12:50:24,351 WARNING [train.py:1204] (2/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_187-221566-0_sp0.9 from training. Duration: 0.47775 2023-10-03 12:50:25,803 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_12868_210_6753_1_1530702479_3610564_133-564-0 from training. Duration: 0.79 2023-10-03 12:50:25,862 WARNING [train.py:1204] (2/4) Exclude cut with ID _55153_210_2253_1_1534384786677_3162096_255-228344-0_sp1.1 from training. Duration: 0.936375 2023-10-03 12:50:28,528 WARNING [train.py:1204] (2/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_383-165234-0_sp1.1 from training. Duration: 0.8545625 2023-10-03 12:50:31,349 WARNING [train.py:1204] (2/4) Exclude cut with ID IC0677W0056-334889 from training. Duration: 0.768 2023-10-03 12:50:34,154 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_14953_210_2151_1_1530701710_5616165_710-165400-0_sp0.9 from training. Duration: 0.9666875 2023-10-03 12:50:36,835 WARNING [train.py:1204] (2/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_355-236205-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 12:50:36,837 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_34450_210_9547_1_1533099443_4109050_503-246893-0_sp1.1 from training. Duration: 0.963625 2023-10-03 12:50:40,093 WARNING [train.py:1204] (2/4) Exclude cut with ID _43552_210_8913_1_1533726031059_3523429_25-120161-0 from training. Duration: 0.98 2023-10-03 12:50:41,593 WARNING [train.py:1204] (2/4) Exclude cut with ID _66112_210_10635_1_1535590236305_3801484_205-311930-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 12:50:41,651 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder_embed.convnext.out_balancer.prob, batch_count=1273033.3333333333, ans=0.125 2023-10-03 12:50:43,014 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_48049_210_5632_1_1533864553_3519937_185-281218-0_sp1.1 from training. Duration: 0.92725 2023-10-03 12:50:43,050 WARNING [train.py:1204] (2/4) Exclude cut with ID _48782_210_12658_1_1534467701544_2300422_191-332415-0_sp1.1 from training. Duration: 0.97275 2023-10-03 12:50:44,485 WARNING [train.py:1204] (2/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_907-151398-0_sp1.1 from training. Duration: 0.336375 2023-10-03 12:50:44,538 WARNING [train.py:1204] (2/4) Exclude cut with ID _67499_210_17282_1_1535509805220_3692430_20-179346-0_sp1.1 from training. Duration: 0.8818125 2023-10-03 12:50:48,591 WARNING [train.py:1204] (2/4) Exclude cut with ID _14998_210_5219_1_1530705582080_7474970_441-91466-0_sp1.1 from training. Duration: 0.8818125 2023-10-03 12:50:50,298 WARNING [train.py:1204] (2/4) Exclude cut with ID _46681_210_9642_1_1533893005571_7603359_786-164007-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 12:50:55,442 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.balancer1.prob, batch_count=1273100.0, ans=0.125 2023-10-03 12:50:56,709 WARNING [train.py:1204] (2/4) Exclude cut with ID _14970_210_15709_1_1530773543493_1715730_187-20259-0 from training. Duration: 0.89 2023-10-03 12:50:59,963 WARNING [train.py:1204] (2/4) Exclude cut with ID _12734_210_8341_1_1530183614296_3891929_383-339075-0_sp1.1 from training. Duration: 0.963625 2023-10-03 12:51:06,714 WARNING [train.py:1204] (2/4) Exclude cut with ID _37086_210_9038_1_1532999540891_7021250_834-322225-0_sp1.1 from training. Duration: 0.92725 2023-10-03 12:51:08,073 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_10987_210_4163_1_1529760555_4123902_56-290559-0_sp1.1 from training. Duration: 0.963625 2023-10-03 12:51:08,080 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_92-246270-0_sp0.9 from training. Duration: 0.9444375 2023-10-03 12:51:08,099 WARNING [train.py:1204] (2/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_56-13561-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 12:51:08,129 WARNING [train.py:1204] (2/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_393-57888-0_sp1.1 from training. Duration: 0.5818125 2023-10-03 12:51:09,427 INFO [train.py:1046] (2/4) Epoch 36, batch 5050, loss[loss=0.1712, simple_loss=0.2556, pruned_loss=0.04334, over 24024.00 frames. ], tot_loss[loss=0.1584, simple_loss=0.2377, pruned_loss=0.03951, over 4714820.24 frames. ], batch size: 86, lr: 2.81e-03, grad_scale: 8.0 2023-10-03 12:51:09,475 WARNING [train.py:1204] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_168-32906-0_sp1.1 from training. Duration: 0.62725 2023-10-03 12:51:09,518 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_51225_210_3601_1_1534400634_2327383_127-207553-0_sp1.1 from training. Duration: 0.963625 2023-10-03 12:51:12,786 WARNING [train.py:1204] (2/4) Exclude cut with ID _58583_210_5236_1_1534726835266_3574249_494-203521-0_sp1.1 from training. Duration: 0.963625 2023-10-03 12:51:12,809 WARNING [train.py:1204] (2/4) Exclude cut with ID _49376_210_5986_1_1533985278732_3370740_354-344695-0 from training. Duration: 0.99 2023-10-03 12:51:12,892 WARNING [train.py:1204] (2/4) Exclude cut with ID _8021_210_7994_1_1528376451358_3653069_179-45206-0_sp1.1 from training. Duration: 0.7818125 2023-10-03 12:51:15,648 WARNING [train.py:1204] (2/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_742-154017-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 12:51:16,928 WARNING [train.py:1204] (2/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_222-198127-0_sp1.1 from training. Duration: 0.763625 2023-10-03 12:51:16,979 WARNING [train.py:1204] (2/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_235-307337-0 from training. Duration: 0.94 2023-10-03 12:51:18,246 WARNING [train.py:1204] (2/4) Exclude cut with ID _8393_210_6702_1_1528505860249_3457328_453-176500-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 12:51:18,280 WARNING [train.py:1204] (2/4) Exclude cut with ID _53565_210_10635_1_1534337218335_3820259_102-179528-0_sp1.1 from training. Duration: 0.97275 2023-10-03 12:51:21,667 WARNING [train.py:1204] (2/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_43-206757-0_sp1.1 from training. Duration: 0.6818125 2023-10-03 12:51:21,753 WARNING [train.py:1204] (2/4) Exclude cut with ID _8394_210_6702_1_1528590504316_3540189_335-274780-0_sp1.1 from training. Duration: 0.7090625 2023-10-03 12:51:22,418 WARNING [train.py:1204] (2/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_128-296581-0_sp1.1 from training. Duration: 0.67275 2023-10-03 12:51:31,079 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.5.encoder.layers.0.feed_forward3.out_whiten, num_groups=1, num_channels=256, metric=9.01 vs. limit=15.0 2023-10-03 12:51:33,285 WARNING [train.py:1204] (2/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_111-112362-0 from training. Duration: 0.91 2023-10-03 12:51:33,326 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_5522_210_3273_1_1527249606_3909096_366-172508-0_sp1.1 from training. Duration: 0.6 2023-10-03 12:51:34,672 WARNING [train.py:1204] (2/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_47-167373-0_sp1.1 from training. Duration: 0.836375 2023-10-03 12:51:34,806 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.conv_module1.balancer1.prob, batch_count=1273233.3333333333, ans=0.125 2023-10-03 12:51:36,041 WARNING [train.py:1204] (2/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_41-234353-0 from training. Duration: 0.68 2023-10-03 12:51:36,082 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6291_210_3273_1_1527683649_1078498_82-221196-0_sp0.9 from training. Duration: 0.8444375 2023-10-03 12:51:37,571 WARNING [train.py:1204] (2/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_47-4814-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 12:51:38,850 WARNING [train.py:1204] (2/4) Exclude cut with ID _43541_210_10626_1_1533796233418_3635440_40-247993-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 12:51:40,125 WARNING [train.py:1204] (2/4) Exclude cut with ID _21989_210_5854_1_1531899214931_3271360_133-44759-0_sp0.9 from training. Duration: 0.8555625 2023-10-03 12:51:40,128 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_94-195801-0 from training. Duration: 0.94 2023-10-03 12:51:40,219 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_141-89113-0 from training. Duration: 0.86 2023-10-03 12:51:40,720 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.4.encoder.layers.0.feed_forward2.out_whiten, num_groups=1, num_channels=384, metric=6.66 vs. limit=15.0 2023-10-03 12:51:43,412 WARNING [train.py:1204] (2/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_683-284578-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 12:51:43,569 WARNING [train.py:1204] (2/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_6-214815-0_sp1.1 from training. Duration: 0.77275 2023-10-03 12:51:46,454 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.666e+02 1.880e+02 2.077e+02 2.440e+02 4.009e+02, threshold=4.155e+02, percent-clipped=0.0 2023-10-03 12:51:46,555 WARNING [train.py:1204] (2/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_556-212082-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 12:51:47,889 WARNING [train.py:1204] (2/4) Exclude cut with ID _44902_210_16187_1_1533888006062_3792078_149-67212-0 from training. Duration: 0.87 2023-10-03 12:51:49,358 WARNING [train.py:1204] (2/4) Exclude cut with ID _61148_210_6897_1_1535094022726_4217610_333-159440-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 12:51:50,846 WARNING [train.py:1204] (2/4) Exclude cut with ID _3120_210_3525_1_1526036143112_4072980_404-249994-0 from training. Duration: 0.99 2023-10-03 12:51:52,263 WARNING [train.py:1204] (2/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_234-260682-0_sp0.9 from training. Duration: 0.9333125 2023-10-03 12:51:53,946 WARNING [train.py:1204] (2/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_374-138971-0_sp1.1 from training. Duration: 0.8545625 2023-10-03 12:51:54,030 WARNING [train.py:1204] (2/4) Exclude cut with ID _53713_210_24116_1_1534314487762_7416751_743-302085-0_sp1.1 from training. Duration: 0.92725 2023-10-03 12:51:55,392 WARNING [train.py:1204] (2/4) Exclude cut with ID _18516_210_9206_1_1531704653206_3692406_198-334870-0_sp1.1 from training. Duration: 0.836375 2023-10-03 12:51:57,286 WARNING [train.py:1204] (2/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_395-73570-0_sp1.1 from training. Duration: 0.9 2023-10-03 12:51:59,970 WARNING [train.py:1204] (2/4) Exclude cut with ID _46683_210_9642_1_1533730913492_4591116_421-19547-0_sp1.1 from training. Duration: 0.936375 2023-10-03 12:52:00,034 WARNING [train.py:1204] (2/4) Exclude cut with ID _19896_210_1883_1_1531709022662_6649569_714-8668-0_sp1.1 from training. Duration: 0.963625 2023-10-03 12:52:01,300 WARNING [train.py:1204] (2/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_49-101072-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 12:52:01,314 WARNING [train.py:1204] (2/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_220-191080-0_sp1.1 from training. Duration: 0.8909375 2023-10-03 12:52:01,368 WARNING [train.py:1204] (2/4) Exclude cut with ID _3465_210_3525_1_1526175667060_3574049_62-335552-0 from training. Duration: 0.87 2023-10-03 12:52:02,854 WARNING [train.py:1204] (2/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_111-112362-0_sp1.1 from training. Duration: 0.82725 2023-10-03 12:52:04,231 WARNING [train.py:1204] (2/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_318-59097-0_sp0.9 from training. Duration: 0.8444375 2023-10-03 12:52:07,031 WARNING [train.py:1204] (2/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_98-255710-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 12:52:07,037 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9042W0003-515609 from training. Duration: 0.896 2023-10-03 12:52:07,046 WARNING [train.py:1204] (2/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_158-250116-0_sp0.9 from training. Duration: 0.688875 2023-10-03 12:52:07,222 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.0.nonlin_attention.balancer.prob, batch_count=1273433.3333333333, ans=0.125 2023-10-03 12:52:08,506 WARNING [train.py:1204] (2/4) Exclude cut with ID _26750_210_5665_1_1532226678442_4063230_7-346885-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 12:52:09,789 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_37839_210_7234_1_1533542462_3689303_357-227078-0_sp1.1 from training. Duration: 0.963625 2023-10-03 12:52:11,129 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9052W0119-520904 from training. Duration: 0.896 2023-10-03 12:52:13,861 WARNING [train.py:1204] (2/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_249-281349-0_sp1.1 from training. Duration: 0.77275 2023-10-03 12:52:13,872 WARNING [train.py:1204] (2/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_147-293311-0 from training. Duration: 0.94 2023-10-03 12:52:13,873 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_158-21166-0_sp1.1 from training. Duration: 0.963625 2023-10-03 12:52:19,930 WARNING [train.py:1204] (2/4) Exclude cut with ID _14100_210_8341_1_1530529320613_3678069_207-332194-0_sp1.1 from training. Duration: 0.92725 2023-10-03 12:52:19,975 WARNING [train.py:1204] (2/4) Exclude cut with ID _9389_210_1664_1_1529217337301_5289269_324-211031-0_sp1.1 from training. Duration: 0.963625 2023-10-03 12:52:19,993 WARNING [train.py:1204] (2/4) Exclude cut with ID _14603_210_4220_1_1530615688441_3097319_133-158308-0 from training. Duration: 0.84 2023-10-03 12:52:21,267 WARNING [train.py:1204] (2/4) Exclude cut with ID _13252_210_1925_1_1530671372047_3336909_166-314607-0 from training. Duration: 0.54 2023-10-03 12:52:22,589 INFO [train.py:1046] (2/4) Epoch 36, batch 5100, loss[loss=0.1575, simple_loss=0.2347, pruned_loss=0.04016, over 23218.00 frames. ], tot_loss[loss=0.1588, simple_loss=0.2381, pruned_loss=0.03976, over 4705847.41 frames. ], batch size: 105, lr: 2.81e-03, grad_scale: 8.0 2023-10-03 12:52:22,730 WARNING [train.py:1204] (2/4) Exclude cut with ID _32704_210_8954_1_1533017488840_6747370_649-79538-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 12:52:22,742 WARNING [train.py:1204] (2/4) Exclude cut with ID _28394_210_8913_1_1532430049518_3650118_343-115667-0_sp1.1 from training. Duration: 0.97275 2023-10-03 12:52:24,652 WARNING [train.py:1204] (2/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_87-278686-0_sp0.9 from training. Duration: 0.8555625 2023-10-03 12:52:24,961 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.feed_forward2.hidden_balancer.prob, batch_count=1273500.0, ans=0.125 2023-10-03 12:52:26,091 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9109W0003-549749 from training. Duration: 0.768 2023-10-03 12:52:29,530 WARNING [train.py:1204] (2/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_126-29104-0_sp1.1 from training. Duration: 0.77275 2023-10-03 12:52:32,251 WARNING [train.py:1204] (2/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_325-113798-0 from training. Duration: 0.85 2023-10-03 12:52:32,311 WARNING [train.py:1204] (2/4) Exclude cut with ID _6694_210_4599_1_1527852637490_3965529_116-109398-0 from training. Duration: 0.73 2023-10-03 12:52:33,712 WARNING [train.py:1204] (2/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_46-57119-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 12:52:36,382 WARNING [train.py:1204] (2/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_546-119312-0_sp1.1 from training. Duration: 0.9 2023-10-03 12:52:39,186 WARNING [train.py:1204] (2/4) Exclude cut with ID _60057_210_10640_1_1534903065793_3333150_468-306939-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 12:52:39,223 WARNING [train.py:1204] (2/4) Exclude cut with ID _15150_210_8341_1_1530752411803_7256384_22-138274-0 from training. Duration: 0.96 2023-10-03 12:52:39,245 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_289-180904-0 from training. Duration: 0.76 2023-10-03 12:52:43,401 WARNING [train.py:1204] (2/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_64-133721-0_sp1.1 from training. Duration: 0.92725 2023-10-03 12:52:43,442 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_195-257512-0_sp1.1 from training. Duration: 0.6090625 2023-10-03 12:52:47,496 WARNING [train.py:1204] (2/4) Exclude cut with ID _66873_210_5517_1_1535454136506_4293370_704-101963-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 12:52:50,635 WARNING [train.py:1204] (2/4) Exclude cut with ID _4366_210_1388_1_1526792053633_3613819_231-211402-0 from training. Duration: 0.94 2023-10-03 12:52:51,987 WARNING [train.py:1204] (2/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_679-190385-0_sp1.1 from training. Duration: 0.97275 2023-10-03 12:52:52,129 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.0.bypass_mid.scale_min, batch_count=1273633.3333333333, ans=0.2 2023-10-03 12:52:53,421 WARNING [train.py:1204] (2/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_298-81459-0_sp1.1 from training. Duration: 0.963625 2023-10-03 12:52:53,435 WARNING [train.py:1204] (2/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_658-282610-0_sp0.9 from training. Duration: 0.588875 2023-10-03 12:52:56,783 WARNING [train.py:1204] (2/4) Exclude cut with ID _57572_210_5517_1_1534921219624_3809484_196-283734-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 12:52:56,853 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_53071_210_16554_1_1534490405_2954066_294-198554-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 12:52:56,861 WARNING [train.py:1204] (2/4) Exclude cut with ID _4866_210_2577_1_1527404412284_4364659_352-157930-0 from training. Duration: 0.81 2023-10-03 12:52:59,528 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9231W0113-610139 from training. Duration: 0.896 2023-10-03 12:53:01,377 WARNING [train.py:1204] (2/4) Exclude cut with ID _24271_210_12658_1_1531987270022_7190519_638-171906-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 12:53:01,412 WARNING [train.py:1204] (2/4) Exclude cut with ID _14170_210_5219_1_1530525949868_4469650_272-249985-0 from training. Duration: 0.67 2023-10-03 12:53:01,421 WARNING [train.py:1204] (2/4) Exclude cut with ID _64086_210_7994_1_1535270417019_7435949_449-31567-0 from training. Duration: 0.97 2023-10-03 12:53:05,874 WARNING [train.py:1204] (2/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_109-282568-0_sp1.1 from training. Duration: 0.97275 2023-10-03 12:53:12,814 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_121-45934-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 12:53:15,551 WARNING [train.py:1204] (2/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_314-126819-0 from training. Duration: 0.5 2023-10-03 12:53:15,579 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9309W0192-640319 from training. Duration: 0.512 2023-10-03 12:53:15,589 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9310W0003-640649 from training. Duration: 0.896 2023-10-03 12:53:17,015 WARNING [train.py:1204] (2/4) Exclude cut with ID _7160_210_1876_1_1527940923231_3703799_120-150943-0 from training. Duration: 0.81 2023-10-03 12:53:17,016 WARNING [train.py:1204] (2/4) Exclude cut with ID _40380_210_9059_1_1533380594759_4187380_6-33508-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 12:53:20,403 WARNING [train.py:1204] (2/4) Exclude cut with ID _59202_210_26984_1_1534816810612_3549174_137-318710-0 from training. Duration: 0.89 2023-10-03 12:53:24,477 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_435-313305-0 from training. Duration: 0.71 2023-10-03 12:53:26,510 WARNING [train.py:1204] (2/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_91-212486-0_sp1.1 from training. Duration: 0.6181875 2023-10-03 12:53:27,864 WARNING [train.py:1204] (2/4) Exclude cut with ID _14603_210_4220_1_1530615688441_3097319_133-158308-0_sp1.1 from training. Duration: 0.763625 2023-10-03 12:53:31,309 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.ff3_skip_rate, batch_count=1273766.6666666667, ans=0.0 2023-10-03 12:53:32,261 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_99-251314-0 from training. Duration: 0.87 2023-10-03 12:53:34,195 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_365-217119-0_sp1.1 from training. Duration: 0.57275 2023-10-03 12:53:35,470 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_3475_210_2028_1_1526193578_3622569_377-166133-0 from training. Duration: 0.86 2023-10-03 12:53:36,860 INFO [train.py:1046] (2/4) Epoch 36, batch 5150, loss[loss=0.1541, simple_loss=0.2406, pruned_loss=0.0338, over 24692.00 frames. ], tot_loss[loss=0.1595, simple_loss=0.2391, pruned_loss=0.03994, over 4711706.68 frames. ], batch size: 65, lr: 2.81e-03, grad_scale: 8.0 2023-10-03 12:53:37,857 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.1.encoder.layers.1.conv_module1.whiten, num_groups=1, num_channels=256, metric=10.12 vs. limit=15.0 2023-10-03 12:53:38,625 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.balancer2.prob, batch_count=1273833.3333333333, ans=0.125 2023-10-03 12:53:39,756 WARNING [train.py:1204] (2/4) Exclude cut with ID _13916_210_9994_1_1530788341126_3763149_116-21034-0_sp1.1 from training. Duration: 0.87275 2023-10-03 12:53:39,772 WARNING [train.py:1204] (2/4) Exclude cut with ID _64220_210_9527_1_1535250151772_4875259_142-216831-0_sp1.1 from training. Duration: 0.936375 2023-10-03 12:53:39,773 WARNING [train.py:1204] (2/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_490-207861-0_sp1.1 from training. Duration: 0.8909375 2023-10-03 12:53:41,129 WARNING [train.py:1204] (2/4) Exclude cut with ID _41314_210_12651_1_1533459476543_3766460_87-272068-0_sp0.9 from training. Duration: 0.92225 2023-10-03 12:53:42,458 WARNING [train.py:1204] (2/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_18-46756-0_sp1.1 from training. Duration: 0.7181875 2023-10-03 12:53:42,527 WARNING [train.py:1204] (2/4) Exclude cut with ID _39411_210_5986_1_1533538825030_3343649_98-311335-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 12:53:42,701 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.conv_module2.balancer1.prob, batch_count=1273833.3333333333, ans=0.125 2023-10-03 12:53:43,866 WARNING [train.py:1204] (2/4) Exclude cut with ID _21878_210_3525_1_1531738772002_7264330_113-18163-0 from training. Duration: 0.89 2023-10-03 12:53:43,868 WARNING [train.py:1204] (2/4) Exclude cut with ID _6653_210_2629_1_1527919172441_6732679_1103-56445-0 from training. Duration: 0.73 2023-10-03 12:53:43,921 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_3474_210_2028_1_1526188513_4825246_145-309131-0 from training. Duration: 0.74 2023-10-03 12:53:43,940 WARNING [train.py:1204] (2/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_120-211630-0_sp1.1 from training. Duration: 0.836375 2023-10-03 12:53:43,950 WARNING [train.py:1204] (2/4) Exclude cut with ID _8030_210_7497_1_1528444886781_3531089_32-31837-0 from training. Duration: 0.97 2023-10-03 12:53:45,357 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_19949_210_2161_1_1531706337_928079_8-160956-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 12:53:46,706 WARNING [train.py:1204] (2/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_321-215742-0_sp1.1 from training. Duration: 0.4545625 2023-10-03 12:53:48,277 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_583-124650-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 12:53:49,661 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_30434_210_13049_1_1532519384_981746_164-182755-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 12:53:53,043 WARNING [train.py:1204] (2/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_339-104791-0_sp1.1 from training. Duration: 0.4454375 2023-10-03 12:53:53,063 WARNING [train.py:1204] (2/4) Exclude cut with ID _15026_210_8750_1_1530777602316_3291850_211-195043-0 from training. Duration: 0.8 2023-10-03 12:53:55,658 WARNING [train.py:1204] (2/4) Exclude cut with ID _29877_210_6228_1_1532397791106_5276149_487-182647-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 12:53:55,703 WARNING [train.py:1204] (2/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_127-566-0_sp0.9 from training. Duration: 0.9333125 2023-10-03 12:53:57,692 WARNING [train.py:1204] (2/4) Exclude cut with ID _8263_210_1664_1_1528439452891_970220_68-181805-0_sp0.9 from training. Duration: 0.711125 2023-10-03 12:53:57,693 WARNING [train.py:1204] (2/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_504-72513-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 12:53:57,704 WARNING [train.py:1204] (2/4) Exclude cut with ID _64639_210_5816_1_1535367619437_3528000_324-265233-0_sp1.1 from training. Duration: 0.963625 2023-10-03 12:53:58,883 WARNING [train.py:1204] (2/4) Exclude cut with ID _6456_210_1534_1_1527681634212_3181049_215-309359-0_sp0.9 from training. Duration: 0.97775 2023-10-03 12:53:58,887 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_115-233929-0_sp1.1 from training. Duration: 0.6545625 2023-10-03 12:53:58,928 WARNING [train.py:1204] (2/4) Exclude cut with ID _14087_210_2253_1_1530529224512_3014029_219-224445-0 from training. Duration: 0.84 2023-10-03 12:54:00,334 WARNING [train.py:1204] (2/4) Exclude cut with ID _14867_210_1968_1_1530684179890_3846969_413-29292-0_sp1.1 from training. Duration: 0.8181875 2023-10-03 12:54:01,631 WARNING [train.py:1204] (2/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_1012-279473-0_sp0.9 from training. Duration: 0.7444375 2023-10-03 12:54:04,965 WARNING [train.py:1204] (2/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_605-133395-0_sp0.9 from training. Duration: 0.7333125 2023-10-03 12:54:06,743 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_89-35748-0 from training. Duration: 0.79 2023-10-03 12:54:08,139 WARNING [train.py:1204] (2/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_476-272757-0_sp1.1 from training. Duration: 0.8454375 2023-10-03 12:54:11,128 WARNING [train.py:1204] (2/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_449-15508-0_sp0.9 from training. Duration: 0.911125 2023-10-03 12:54:11,305 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.1.feed_forward2.hidden_balancer.prob, batch_count=1273966.6666666667, ans=0.125 2023-10-03 12:54:12,555 WARNING [train.py:1204] (2/4) Exclude cut with ID _29345_210_4381_1_1532689168263_3860169_198-174328-0 from training. Duration: 0.91 2023-10-03 12:54:13,777 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.503e+02 1.870e+02 2.045e+02 2.275e+02 3.519e+02, threshold=4.091e+02, percent-clipped=0.0 2023-10-03 12:54:15,272 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_15517_210_6753_1_1530952293_1664795_125-158157-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 12:54:21,121 WARNING [train.py:1204] (2/4) Exclude cut with ID _25565_210_12392_1_1532423009382_3952098_420-113180-0_sp1.1 from training. Duration: 0.963625 2023-10-03 12:54:22,502 WARNING [train.py:1204] (2/4) Exclude cut with ID _51471_210_1889_1_1534399391390_3094250_236-146773-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 12:54:27,125 WARNING [train.py:1204] (2/4) Exclude cut with ID _30039_210_13270_1_1532484612900_2725150_175-35012-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 12:54:27,176 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_23850_210_4238_1_1532232597_1632433_99-275843-0_sp1.1 from training. Duration: 0.936375 2023-10-03 12:54:30,620 WARNING [train.py:1204] (2/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_658-282610-0 from training. Duration: 0.53 2023-10-03 12:54:31,979 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.1.conv_module2.balancer2.prob, batch_count=1274033.3333333333, ans=0.125 2023-10-03 12:54:33,314 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_37818_210_4238_1_1533620910_4720083_234-65934-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 12:54:33,845 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.5.encoder.layers.0.feed_forward1.out_whiten, num_groups=1, num_channels=256, metric=6.29 vs. limit=15.0 2023-10-03 12:54:34,660 WARNING [train.py:1204] (2/4) Exclude cut with ID _3067_210_2667_1_1526212974640_4503168_459-173867-0_sp1.1 from training. Duration: 0.8 2023-10-03 12:54:34,684 WARNING [train.py:1204] (2/4) Exclude cut with ID _8696_210_1876_1_1528545632440_3345058_209-54069-0_sp0.9 from training. Duration: 0.7444375 2023-10-03 12:54:37,962 WARNING [train.py:1204] (2/4) Exclude cut with ID _62574_210_2655_1_1535180407058_3933249_420-236465-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 12:54:38,055 WARNING [train.py:1204] (2/4) Exclude cut with ID _46681_210_9642_1_1533893005571_7603359_333-4042-0_sp1.1 from training. Duration: 0.936375 2023-10-03 12:54:39,349 WARNING [train.py:1204] (2/4) Exclude cut with ID _3541_210_2477_1_1526475408709_3263973_377-296839-0 from training. Duration: 0.55 2023-10-03 12:54:43,609 WARNING [train.py:1204] (2/4) Exclude cut with ID _43997_210_4525_1_1533538632555_5189329_426-92599-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 12:54:45,009 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_43-71989-0_sp1.1 from training. Duration: 0.5181875 2023-10-03 12:54:46,412 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_2009_210_2071_1_1525481134_4588937_140-345530-0_sp1.1 from training. Duration: 0.963625 2023-10-03 12:54:47,683 WARNING [train.py:1204] (2/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_138-332773-0_sp0.9 from training. Duration: 0.9 2023-10-03 12:54:47,754 WARNING [train.py:1204] (2/4) Exclude cut with ID _13942_210_9988_1_1530604906036_3733360_499-55676-0_sp0.9 from training. Duration: 0.688875 2023-10-03 12:54:47,771 WARNING [train.py:1204] (2/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_259-34038-0_sp1.1 from training. Duration: 0.67275 2023-10-03 12:54:47,787 WARNING [train.py:1204] (2/4) Exclude cut with ID _3120_210_3525_1_1526036143112_4072980_404-249994-0_sp1.1 from training. Duration: 0.9 2023-10-03 12:54:49,084 WARNING [train.py:1204] (2/4) Exclude cut with ID _40402_210_18219_1_1533522324115_2953150_222-108810-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 12:54:50,974 INFO [train.py:1046] (2/4) Epoch 36, batch 5200, loss[loss=0.1433, simple_loss=0.2177, pruned_loss=0.03451, over 24549.00 frames. ], tot_loss[loss=0.1597, simple_loss=0.2388, pruned_loss=0.04033, over 4705326.72 frames. ], batch size: 60, lr: 2.81e-03, grad_scale: 16.0 2023-10-03 12:54:52,478 WARNING [train.py:1204] (2/4) Exclude cut with ID _22083_210_8254_1_1531702862439_3678970_49-234546-0_sp0.9 from training. Duration: 0.92225 2023-10-03 12:54:55,545 WARNING [train.py:1204] (2/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_199-295611-0_sp0.9 from training. Duration: 0.611125 2023-10-03 12:54:57,123 WARNING [train.py:1204] (2/4) Exclude cut with ID _43552_210_8913_1_1533726031059_3523429_43-192945-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 12:55:01,844 WARNING [train.py:1204] (2/4) Exclude cut with ID _14224_210_4381_1_1530529062037_4264010_364-22817-0 from training. Duration: 0.71 2023-10-03 12:55:03,025 WARNING [train.py:1204] (2/4) Exclude cut with ID _14922_210_6228_1_1530707435273_3693310_388-129552-0_sp1.1 from training. Duration: 0.87275 2023-10-03 12:55:03,075 WARNING [train.py:1204] (2/4) Exclude cut with ID _6537_210_1883_1_1527987559710_3597129_190-280227-0_sp1.1 from training. Duration: 0.97275 2023-10-03 12:55:05,814 WARNING [train.py:1204] (2/4) Exclude cut with ID _55844_210_2346_1_1534471212569_7625707_398-56897-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 12:55:05,897 WARNING [train.py:1204] (2/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_48-182155-0_sp1.1 from training. Duration: 0.7909375 2023-10-03 12:55:05,917 WARNING [train.py:1204] (2/4) Exclude cut with ID _29529_210_7234_1_1532393961455_3977460_582-18084-0_sp1.1 from training. Duration: 0.97275 2023-10-03 12:55:06,136 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.feed_forward1.hidden_balancer.prob, batch_count=1274233.3333333333, ans=0.125 2023-10-03 12:55:07,389 WARNING [train.py:1204] (2/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_912-282412-0 from training. Duration: 0.49 2023-10-03 12:55:10,638 WARNING [train.py:1204] (2/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_332-256168-0_sp0.9 from training. Duration: 0.8333125 2023-10-03 12:55:10,695 WARNING [train.py:1204] (2/4) Exclude cut with ID _64220_210_9527_1_1535250151772_4875259_55-250624-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 12:55:13,270 WARNING [train.py:1204] (2/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_264-108685-0 from training. Duration: 0.55 2023-10-03 12:55:13,793 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.5.encoder.layers.0.nonlin_attention.whiten2, num_groups=1, num_channels=256, metric=3.51 vs. limit=15.0 2023-10-03 12:55:15,931 WARNING [train.py:1204] (2/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_613-232856-0_sp0.9 from training. Duration: 0.6 2023-10-03 12:55:16,009 WARNING [train.py:1204] (2/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_68-212801-0_sp1.1 from training. Duration: 0.663625 2023-10-03 12:55:17,282 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6290_210_3273_1_1527595224_1282787_115-87095-0 from training. Duration: 0.55 2023-10-03 12:55:17,319 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_3080_210_2071_1_1526086102_4365166_312-8726-0 from training. Duration: 0.91 2023-10-03 12:55:20,084 WARNING [train.py:1204] (2/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_105-63309-0 from training. Duration: 0.98 2023-10-03 12:55:20,145 WARNING [train.py:1204] (2/4) Exclude cut with ID _2941_210_2577_1_1526199673150_2489759_201-160779-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 12:55:20,148 WARNING [train.py:1204] (2/4) Exclude cut with ID ID0406W0226-885434 from training. Duration: 0.997 2023-10-03 12:55:20,154 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_17292_210_6753_1_1531125388_2943734_353-328201-0_sp1.1 from training. Duration: 0.97275 2023-10-03 12:55:20,428 INFO [scaling.py:1118] (2/4) WithLoss: name=encoder.encoders.3.encoder.layers.1.self_attn_weights, loss-sum=0.000e+00 2023-10-03 12:55:22,136 WARNING [train.py:1204] (2/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_572-96029-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 12:55:24,091 WARNING [train.py:1204] (2/4) Exclude cut with ID _14970_210_15709_1_1530775547361_1966965_6-23328-0_sp0.9 from training. Duration: 0.9555625 2023-10-03 12:55:25,428 WARNING [train.py:1204] (2/4) Exclude cut with ID _45012_210_12651_1_1533608435136_5371380_240-37101-0 from training. Duration: 0.93 2023-10-03 12:55:25,495 WARNING [train.py:1204] (2/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_685-90183-0_sp1.1 from training. Duration: 0.936375 2023-10-03 12:55:28,240 WARNING [train.py:1204] (2/4) Exclude cut with ID _13252_210_1925_1_1530671372047_3336909_41-22245-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 12:55:28,429 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_15126_210_6753_1_1530778677_2591185_120-277707-0 from training. Duration: 0.8 2023-10-03 12:55:30,283 WARNING [train.py:1204] (2/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_310-26186-0 from training. Duration: 0.7 2023-10-03 12:55:30,305 WARNING [train.py:1204] (2/4) Exclude cut with ID _14998_210_5219_1_1530705582080_7474970_787-294416-0 from training. Duration: 0.82 2023-10-03 12:55:34,479 WARNING [train.py:1204] (2/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_795-33252-0 from training. Duration: 0.37 2023-10-03 12:55:35,803 WARNING [train.py:1204] (2/4) Exclude cut with ID _7537_210_2084_1_1528502395431_3924359_113-199933-0_sp0.9 from training. Duration: 0.6555625 2023-10-03 12:55:41,898 WARNING [train.py:1204] (2/4) Exclude cut with ID _3465_210_3525_1_1526175667060_3574049_308-231735-0_sp0.9 from training. Duration: 0.811125 2023-10-03 12:55:43,191 WARNING [train.py:1204] (2/4) Exclude cut with ID _19504_210_9994_1_1531648863242_3325220_354-296765-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 12:55:44,603 WARNING [train.py:1204] (2/4) Exclude cut with ID _5409_210_1388_1_1527292850189_5865191_178-127970-0 from training. Duration: 0.57 2023-10-03 12:55:44,642 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_1466-248056-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 12:55:44,669 WARNING [train.py:1204] (2/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_535-14410-0_sp0.9 from training. Duration: 0.52225 2023-10-03 12:55:44,672 WARNING [train.py:1204] (2/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_747-129941-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 12:55:46,006 WARNING [train.py:1204] (2/4) Exclude cut with ID _15012_210_1968_1_1530856820429_5095350_688-178849-0_sp1.1 from training. Duration: 0.5818125 2023-10-03 12:55:50,121 WARNING [train.py:1204] (2/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_356-45054-0_sp1.1 from training. Duration: 0.7818125 2023-10-03 12:55:50,229 WARNING [train.py:1204] (2/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_146-276944-0_sp0.9 from training. Duration: 0.92225 2023-10-03 12:55:55,236 WARNING [train.py:1204] (2/4) Exclude cut with ID _39204_210_4220_1_1533880547368_3191120_346-180306-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 12:55:56,542 WARNING [train.py:1204] (2/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_641-158017-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 12:55:56,542 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_19952_210_2161_1_1531875469_3851750_274-42137-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 12:56:02,194 WARNING [train.py:1204] (2/4) Exclude cut with ID _60804_210_9642_1_1534831199859_7268299_87-327819-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 12:56:03,523 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_9302_210_2550_1_1528889962_4655416_638-301620-0 from training. Duration: 0.47 2023-10-03 12:56:05,239 INFO [train.py:1046] (2/4) Epoch 36, batch 5250, loss[loss=0.1644, simple_loss=0.2468, pruned_loss=0.04106, over 24019.00 frames. ], tot_loss[loss=0.1598, simple_loss=0.2394, pruned_loss=0.04009, over 4717370.90 frames. ], batch size: 80, lr: 2.81e-03, grad_scale: 16.0 2023-10-03 12:56:05,287 WARNING [train.py:1204] (2/4) Exclude cut with ID _44304_210_15374_1_1533639352765_4613129_302-22084-0_sp1.1 from training. Duration: 0.7818125 2023-10-03 12:56:05,309 WARNING [train.py:1204] (2/4) Exclude cut with ID _18513_210_9206_1_1531618215223_3862620_177-227063-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 12:56:06,726 WARNING [train.py:1204] (2/4) Exclude cut with ID _41326_210_5854_1_1533778076509_3492080_510-45444-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 12:56:06,772 WARNING [train.py:1204] (2/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_76-311963-0_sp1.1 from training. Duration: 0.636375 2023-10-03 12:56:06,871 WARNING [train.py:1204] (2/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_156-302675-0_sp0.9 from training. Duration: 0.611125 2023-10-03 12:56:08,280 WARNING [train.py:1204] (2/4) Exclude cut with ID _53489_210_6228_1_1534298357169_3835970_367-148411-0_sp1.1 from training. Duration: 0.936375 2023-10-03 12:56:11,101 WARNING [train.py:1204] (2/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_389-144525-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 12:56:12,741 WARNING [train.py:1204] (2/4) Exclude cut with ID _14069_210_5632_1_1530512692033_4803390_158-238427-0_sp1.1 from training. Duration: 0.963625 2023-10-03 12:56:14,078 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_486-151131-0_sp0.9 from training. Duration: 0.9444375 2023-10-03 12:56:18,162 WARNING [train.py:1204] (2/4) Exclude cut with ID _39846_210_12651_1_1533603651992_1318030_188-71831-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 12:56:19,673 WARNING [train.py:1204] (2/4) Exclude cut with ID _39411_210_5986_1_1533538825030_3343649_173-238248-0_sp1.1 from training. Duration: 0.7545625 2023-10-03 12:56:22,298 WARNING [train.py:1204] (2/4) Exclude cut with ID _20684_210_6917_1_1531567751646_4203930_469-49081-0_sp1.1 from training. Duration: 0.92725 2023-10-03 12:56:24,345 WARNING [train.py:1204] (2/4) Exclude cut with ID _8833_210_5219_1_1528592665210_7553059_611-80313-0_sp1.1 from training. Duration: 0.5818125 2023-10-03 12:56:25,741 WARNING [train.py:1204] (2/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_440-141811-0 from training. Duration: 0.95 2023-10-03 12:56:25,756 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_41211_210_2544_1_1533348859_3702910_236-223652-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 12:56:27,107 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_40222_210_6228_1_1533385298_4481939_220-163998-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 12:56:31,116 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.feed_forward1.hidden_balancer.prob, batch_count=1274566.6666666667, ans=0.125 2023-10-03 12:56:39,277 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.conv_skip_rate, batch_count=1274633.3333333333, ans=0.0 2023-10-03 12:56:40,460 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.644e+02 1.965e+02 2.175e+02 2.587e+02 3.682e+02, threshold=4.349e+02, percent-clipped=0.0 2023-10-03 12:56:41,971 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.attention_skip_rate, batch_count=1274633.3333333333, ans=0.0 2023-10-03 12:56:49,745 INFO [scaling.py:1118] (2/4) WithLoss: name=encoder.encoders.2.encoder.layers.0.self_attn_weights, loss-sum=0.000e+00 2023-10-03 12:56:57,357 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.conv_module2.balancer1.prob, batch_count=1274700.0, ans=0.125 2023-10-03 12:57:13,149 INFO [train.py:1046] (2/4) Epoch 36, batch 5300, loss[loss=0.1692, simple_loss=0.2503, pruned_loss=0.04403, over 24073.00 frames. ], tot_loss[loss=0.1593, simple_loss=0.2378, pruned_loss=0.04035, over 4694692.34 frames. ], batch size: 80, lr: 2.81e-03, grad_scale: 16.0 2023-10-03 12:57:27,503 WARNING [train.py:1204] (2/4) Exclude cut with ID _14629_210_15709_1_1530604298182_956899_6-232695-0_sp1.1 from training. Duration: 0.8545625 2023-10-03 12:57:27,524 WARNING [train.py:1204] (2/4) Exclude cut with ID _13921_210_9994_1_1530841782272_3503520_327-69677-0 from training. Duration: 0.9 2023-10-03 12:57:27,549 WARNING [train.py:1204] (2/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_372-298093-0 from training. Duration: 0.8 2023-10-03 12:57:27,562 WARNING [train.py:1204] (2/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_515-139111-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 12:57:27,801 WARNING [train.py:1204] (2/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_85-65056-0_sp1.1 from training. Duration: 0.87275 2023-10-03 12:57:27,844 WARNING [train.py:1204] (2/4) Exclude cut with ID _15113_210_8254_1_1530842438579_3716279_209-54533-0_sp1.1 from training. Duration: 0.87275 2023-10-03 12:57:27,893 WARNING [train.py:1204] (2/4) Exclude cut with ID _70447_210_12392_1_1535972499300_3160420_123-312838-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 12:57:27,923 WARNING [train.py:1204] (2/4) Exclude cut with ID _60113_210_16271_1_1535164212481_4386079_70-63596-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 12:57:27,952 WARNING [train.py:1204] (2/4) Exclude cut with ID _14632_210_9988_1_1530691279392_3923200_225-188509-0_sp1.1 from training. Duration: 0.963625 2023-10-03 12:57:28,005 WARNING [train.py:1204] (2/4) Exclude cut with ID _34734_210_4525_1_1533365977542_4448720_584-180532-0_sp1.1 from training. Duration: 0.97275 2023-10-03 12:57:28,019 WARNING [train.py:1204] (2/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_351-294650-0_sp1.1 from training. Duration: 0.57275 2023-10-03 12:57:28,367 WARNING [train.py:1204] (2/4) Exclude cut with ID _13252_210_1925_1_1530671372047_3336909_263-168388-0_sp0.9 from training. Duration: 0.9555625 2023-10-03 12:57:28,394 WARNING [train.py:1204] (2/4) Exclude cut with ID _22083_210_8254_1_1531702862439_3678970_201-85738-0 from training. Duration: 0.9 2023-10-03 12:57:28,491 WARNING [train.py:1204] (2/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_237-58433-0 from training. Duration: 0.74 2023-10-03 12:57:28,520 WARNING [train.py:1204] (2/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_262-250826-0 from training. Duration: 0.99 2023-10-03 12:57:28,631 WARNING [train.py:1204] (2/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_927-43827-0_sp0.9 from training. Duration: 0.82225 2023-10-03 12:57:28,667 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_2821_210_2715_1_1526108385_3763560_73-296673-0 from training. Duration: 0.83 2023-10-03 12:57:28,744 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6287_210_3273_1_1527509506_2653716_182-199887-0 from training. Duration: 0.51 2023-10-03 12:57:28,882 WARNING [train.py:1204] (2/4) Exclude cut with ID _70519_210_4163_1_1535961595994_7458230_899-80223-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 12:57:29,150 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_210-234949-0_sp1.1 from training. Duration: 0.97275 2023-10-03 12:57:29,190 WARNING [train.py:1204] (2/4) Exclude cut with ID _30196_210_1664_1_1533708496522_7074140_768-278176-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 12:57:29,274 WARNING [train.py:1204] (2/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_315-128283-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 12:57:29,749 WARNING [train.py:1204] (2/4) Exclude cut with ID _14886_210_4220_1_1530702020355_3516190_330-323941-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 12:57:30,015 WARNING [train.py:1204] (2/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_264-348896-0_sp1.1 from training. Duration: 0.77275 2023-10-03 12:57:30,037 WARNING [train.py:1204] (2/4) Exclude cut with ID _37086_210_9038_1_1532999540891_7021250_433-10563-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 12:57:30,077 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_53638_210_21581_1_1534297319_5808449_885-80172-0_sp1.1 from training. Duration: 0.87275 2023-10-03 12:57:30,171 WARNING [train.py:1204] (2/4) Exclude cut with ID _14623_210_1925_1_1530773354962_3632609_157-218771-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 12:57:30,181 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_1368-78165-0_sp1.1 from training. Duration: 0.97275 2023-10-03 12:57:30,186 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_309-65177-0_sp0.9 from training. Duration: 0.97775 2023-10-03 12:57:30,193 WARNING [train.py:1204] (2/4) Exclude cut with ID _21632_210_2253_1_1531994001765_3744140_291-138812-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 12:57:30,207 WARNING [train.py:1204] (2/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_669-93163-0_sp1.1 from training. Duration: 0.8909375 2023-10-03 12:57:30,770 WARNING [train.py:1204] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_292-215346-0 from training. Duration: 0.97 2023-10-03 12:57:30,796 WARNING [train.py:1204] (2/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_286-222073-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 12:57:31,115 WARNING [train.py:1204] (2/4) Exclude cut with ID _59256_210_5986_1_1535076085997_3589120_121-237386-0_sp1.1 from training. Duration: 0.87275 2023-10-03 12:57:31,129 WARNING [train.py:1204] (2/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_96-320736-0 from training. Duration: 0.86 2023-10-03 12:57:31,138 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_128-202104-0 from training. Duration: 0.63 2023-10-03 12:57:31,223 WARNING [train.py:1204] (2/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_242-3472-0_sp0.9 from training. Duration: 0.811125 2023-10-03 12:57:31,268 WARNING [train.py:1204] (2/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_154-74182-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 12:57:31,272 WARNING [train.py:1204] (2/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_187-221566-0 from training. Duration: 0.43 2023-10-03 12:57:31,435 WARNING [train.py:1204] (2/4) Exclude cut with ID _6194_210_1534_1_1527511826160_3941194_14-282876-0 from training. Duration: 0.71 2023-10-03 12:57:31,476 WARNING [train.py:1204] (2/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_660-211088-0_sp1.1 from training. Duration: 0.563625 2023-10-03 12:57:32,272 WARNING [train.py:1204] (2/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_526-62079-0_sp1.1 from training. Duration: 0.6090625 2023-10-03 12:57:32,425 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_92-246270-0_sp1.1 from training. Duration: 0.77275 2023-10-03 12:57:32,519 WARNING [train.py:1204] (2/4) Exclude cut with ID IC0287W0023-142350 from training. Duration: 0.956 2023-10-03 12:57:32,598 WARNING [train.py:1204] (2/4) Exclude cut with ID _58605_210_12210_1_1534723195939_3904259_48-269402-0 from training. Duration: 0.96 2023-10-03 12:57:32,613 WARNING [train.py:1204] (2/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_416-257730-0_sp0.9 from training. Duration: 0.72225 2023-10-03 12:57:32,690 WARNING [train.py:1204] (2/4) Exclude cut with ID _7877_210_6014_1_1528286358099_3750136_185-17516-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 12:57:32,796 WARNING [train.py:1204] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_23-189234-0 from training. Duration: 0.83 2023-10-03 12:57:32,814 WARNING [train.py:1204] (2/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_237-89684-0 from training. Duration: 0.96 2023-10-03 12:57:32,887 WARNING [train.py:1204] (2/4) Exclude cut with ID _13916_210_9994_1_1530788341126_3763149_116-21034-0 from training. Duration: 0.96 2023-10-03 12:57:33,026 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_3133_210_2087_1_1526106446_2324690_246-125000-0_sp1.1 from training. Duration: 0.563625 2023-10-03 12:57:39,478 INFO [train.py:1046] (2/4) Epoch 37, batch 0, loss[loss=0.2159, simple_loss=0.2835, pruned_loss=0.07417, over 19127.00 frames. ], tot_loss[loss=0.2159, simple_loss=0.2835, pruned_loss=0.07417, over 19127.00 frames. ], batch size: 388, lr: 2.77e-03, grad_scale: 32.0 2023-10-03 12:57:39,478 INFO [train.py:1069] (2/4) Computing validation loss 2023-10-03 12:57:51,185 INFO [train.py:1078] (2/4) Epoch 37, validation: loss=0.3206, simple_loss=0.2712, pruned_loss=0.185, over 1125622.00 frames. 2023-10-03 12:57:51,186 INFO [train.py:1079] (2/4) Maximum memory allocated so far is 20967MB 2023-10-03 12:57:52,725 WARNING [train.py:1204] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_402-253027-0 from training. Duration: 0.54 2023-10-03 12:57:54,043 WARNING [train.py:1204] (2/4) Exclude cut with ID _59288_210_5854_1_1535077757774_3412246_158-289345-0_sp1.1 from training. Duration: 0.92725 2023-10-03 12:57:54,284 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.feed_forward3.hidden_balancer.prob, batch_count=1274913.3333333333, ans=0.125 2023-10-03 12:57:55,512 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_28211_210_6388_1_1532861506_4523224_128-328162-0_sp1.1 from training. Duration: 0.8181875 2023-10-03 12:57:55,999 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.5.encoder.layers.0.conv_module1.whiten, num_groups=1, num_channels=256, metric=6.40 vs. limit=15.0 2023-10-03 12:57:57,050 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.balancer1.prob, batch_count=1274913.3333333333, ans=0.125 2023-10-03 12:58:01,002 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_788-278281-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 12:58:01,010 WARNING [train.py:1204] (2/4) Exclude cut with ID _49728_210_12651_1_1534041034294_6788150_624-195301-0_sp0.9 from training. Duration: 0.9444375 2023-10-03 12:58:02,343 WARNING [train.py:1204] (2/4) Exclude cut with ID _14860_210_4915_1_1530702020340_7343240_267-139775-0_sp1.1 from training. Duration: 0.963625 2023-10-03 12:58:02,406 WARNING [train.py:1204] (2/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_332-221057-0 from training. Duration: 0.68 2023-10-03 12:58:03,861 WARNING [train.py:1204] (2/4) Exclude cut with ID _40402_210_18219_1_1533522324115_2953150_242-76943-0 from training. Duration: 0.94 2023-10-03 12:58:06,529 WARNING [train.py:1204] (2/4) Exclude cut with ID _16747_210_5986_1_1531811544733_2749109_87-349587-0_sp1.1 from training. Duration: 0.963625 2023-10-03 12:58:06,763 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.conv_module1.balancer1.prob, batch_count=1274980.0, ans=0.125 2023-10-03 12:58:07,813 WARNING [train.py:1204] (2/4) Exclude cut with ID _22982_210_1883_1_1531882790447_5449409_505-346452-0_sp1.1 from training. Duration: 0.963625 2023-10-03 12:58:10,634 WARNING [train.py:1204] (2/4) Exclude cut with ID _17956_210_4353_1_1531209568125_6979970_13-192107-0_sp1.1 from training. Duration: 0.963625 2023-10-03 12:58:11,971 WARNING [train.py:1204] (2/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_430-307666-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 12:58:12,016 WARNING [train.py:1204] (2/4) Exclude cut with ID _2895_210_2477_1_1525956785794_4063750_289-11475-0_sp1.1 from training. Duration: 0.7454375 2023-10-03 12:58:12,034 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_3910_210_1794_1_1526742306_334530_28-238909-0_sp0.9 from training. Duration: 0.9 2023-10-03 12:58:13,508 WARNING [train.py:1204] (2/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_18-130336-0 from training. Duration: 0.79 2023-10-03 12:58:13,745 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.ff3_skip_rate, batch_count=1274980.0, ans=0.0 2023-10-03 12:58:14,905 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_548-294316-0_sp0.9 from training. Duration: 0.9 2023-10-03 12:58:22,860 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_42-142473-0_sp1.1 from training. Duration: 0.6545625 2023-10-03 12:58:22,867 WARNING [train.py:1204] (2/4) Exclude cut with ID _59170_210_6721_1_1534983633960_4203335_326-48066-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 12:58:26,242 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_45886_210_17024_1_1533709215_471063_7-79165-0 from training. Duration: 0.98 2023-10-03 12:58:26,504 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.bypass.skip_rate, batch_count=1275046.6666666667, ans=0.09899494936611666 2023-10-03 12:58:29,200 WARNING [train.py:1204] (2/4) Exclude cut with ID _44585_210_18628_1_1533605350387_4518199_280-73534-0_sp0.9 from training. Duration: 0.988875 2023-10-03 12:58:29,207 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_3475_210_2028_1_1526193578_3622569_265-276625-0_sp1.1 from training. Duration: 0.6909375 2023-10-03 12:58:30,636 WARNING [train.py:1204] (2/4) Exclude cut with ID _64631_210_27767_1_1535349580068_4165160_88-225232-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 12:58:33,543 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.feed_forward1.out_proj.dropout_p, batch_count=1275113.3333333333, ans=0.1 2023-10-03 12:58:34,606 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_205-265518-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 12:58:37,581 WARNING [train.py:1204] (2/4) Exclude cut with ID _40402_210_18219_1_1533522324115_2953150_271-253734-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 12:58:42,905 WARNING [train.py:1204] (2/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_282-257791-0 from training. Duration: 0.86 2023-10-03 12:58:45,743 WARNING [train.py:1204] (2/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_356-45054-0 from training. Duration: 0.86 2023-10-03 12:58:47,094 WARNING [train.py:1204] (2/4) Exclude cut with ID _69584_210_5219_1_1535785133775_4044450_38-348116-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 12:58:47,096 WARNING [train.py:1204] (2/4) Exclude cut with ID _66763_210_17301_1_1535788667617_241750_21-328480-0_sp1.1 from training. Duration: 0.936375 2023-10-03 12:58:48,910 WARNING [train.py:1204] (2/4) Exclude cut with ID _26528_210_9070_1_1532570410842_3851310_497-97218-0_sp0.9 from training. Duration: 0.9555625 2023-10-03 12:58:48,966 WARNING [train.py:1204] (2/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_592-101075-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 12:58:52,240 WARNING [train.py:1204] (2/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_678-159307-0 from training. Duration: 0.85 2023-10-03 12:58:53,740 WARNING [train.py:1204] (2/4) Exclude cut with ID _28427_210_4220_1_1532581240967_3061069_166-319773-0_sp1.1 from training. Duration: 0.936375 2023-10-03 12:58:55,052 WARNING [train.py:1204] (2/4) Exclude cut with ID _29877_210_6228_1_1532397791106_5276149_606-224984-0_sp1.1 from training. Duration: 0.936375 2023-10-03 12:58:59,808 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_125-203105-0_sp1.1 from training. Duration: 0.72725 2023-10-03 12:59:00,075 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.balancer_na.min_abs, batch_count=1275180.0, ans=0.02 2023-10-03 12:59:02,459 WARNING [train.py:1204] (2/4) Exclude cut with ID IC0620W0113-306765 from training. Duration: 0.768 2023-10-03 12:59:03,738 INFO [train.py:1046] (2/4) Epoch 37, batch 50, loss[loss=0.1516, simple_loss=0.2309, pruned_loss=0.03609, over 24303.00 frames. ], tot_loss[loss=0.1601, simple_loss=0.2413, pruned_loss=0.03947, over 1070234.87 frames. ], batch size: 56, lr: 2.77e-03, grad_scale: 32.0 2023-10-03 12:59:03,788 WARNING [train.py:1204] (2/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_410-287857-0_sp1.1 from training. Duration: 0.8454375 2023-10-03 12:59:06,691 WARNING [train.py:1204] (2/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_78-95260-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 12:59:09,495 WARNING [train.py:1204] (2/4) Exclude cut with ID _58059_210_5854_1_1534901471149_3164808_6-73080-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 12:59:09,508 WARNING [train.py:1204] (2/4) Exclude cut with ID _5166_210_2084_1_1527073197160_4087993_331-117829-0 from training. Duration: 0.62 2023-10-03 12:59:10,945 WARNING [train.py:1204] (2/4) Exclude cut with ID _14880_210_8895_1_1530680426655_3795259_289-212817-0_sp0.9 from training. Duration: 0.7666875 2023-10-03 12:59:10,971 WARNING [train.py:1204] (2/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_620-144779-0_sp1.1 from training. Duration: 0.92725 2023-10-03 12:59:12,311 WARNING [train.py:1204] (2/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_561-288575-0_sp1.1 from training. Duration: 0.963625 2023-10-03 12:59:13,738 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_22657_210_6890_1_1532086002_1637216_122-188128-0_sp1.1 from training. Duration: 0.963625 2023-10-03 12:59:16,556 WARNING [train.py:1204] (2/4) Exclude cut with ID _16756_210_4915_1_1531047803763_7104259_604-86607-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 12:59:19,237 WARNING [train.py:1204] (2/4) Exclude cut with ID _15150_210_8341_1_1530752411803_7256384_469-239509-0 from training. Duration: 0.68 2023-10-03 12:59:19,246 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_10987_210_4163_1_1529760555_4123902_302-87926-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 12:59:19,523 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.bypass_mid.scale_min, batch_count=1275313.3333333333, ans=0.2 2023-10-03 12:59:21,203 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.0.balancer1.prob, batch_count=1275313.3333333333, ans=0.125 2023-10-03 12:59:21,321 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.feed_forward2.hidden_balancer.prob, batch_count=1275313.3333333333, ans=0.125 2023-10-03 12:59:22,971 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.548e+02 1.909e+02 2.063e+02 2.312e+02 4.693e+02, threshold=4.126e+02, percent-clipped=2.0 2023-10-03 12:59:25,827 WARNING [train.py:1204] (2/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_469-94673-0_sp0.9 from training. Duration: 0.87775 2023-10-03 12:59:25,941 WARNING [train.py:1204] (2/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_798-106179-0 from training. Duration: 0.83 2023-10-03 12:59:28,629 WARNING [train.py:1204] (2/4) Exclude cut with ID _16744_210_5986_1_1531724405384_3907567_242-111316-0 from training. Duration: 0.93 2023-10-03 12:59:30,432 WARNING [train.py:1204] (2/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_938-205135-0_sp0.9 from training. Duration: 0.9666875 2023-10-03 12:59:31,953 WARNING [train.py:1204] (2/4) Exclude cut with ID _64135_210_2253_1_1535677267876_7608972_133-188266-0_sp0.9 from training. Duration: 0.9 2023-10-03 12:59:31,957 WARNING [train.py:1204] (2/4) Exclude cut with ID _44585_210_18628_1_1533605350387_4518199_623-346820-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 12:59:33,251 WARNING [train.py:1204] (2/4) Exclude cut with ID _42326_210_7770_1_1533814282531_3707269_454-266903-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 12:59:34,608 WARNING [train.py:1204] (2/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_5-55893-0_sp1.1 from training. Duration: 0.5 2023-10-03 12:59:34,646 WARNING [train.py:1204] (2/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_577-332600-0_sp1.1 from training. Duration: 0.5181875 2023-10-03 12:59:34,647 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_957-138882-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 12:59:41,521 WARNING [train.py:1204] (2/4) Exclude cut with ID _12556_210_8341_1_1530081024100_7327690_219-38004-0_sp1.1 from training. Duration: 0.97275 2023-10-03 12:59:44,204 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_397-147196-0_sp1.1 from training. Duration: 0.72725 2023-10-03 12:59:44,222 WARNING [train.py:1204] (2/4) Exclude cut with ID _3541_210_2477_1_1526475408709_3263973_231-104736-0_sp0.9 from training. Duration: 0.8666875 2023-10-03 12:59:45,627 WARNING [train.py:1204] (2/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_398-334558-0 from training. Duration: 0.96 2023-10-03 12:59:47,075 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_257-20733-0_sp1.1 from training. Duration: 0.6909375 2023-10-03 12:59:48,414 WARNING [train.py:1204] (2/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_146-282400-0_sp1.1 from training. Duration: 0.6454375 2023-10-03 12:59:48,421 WARNING [train.py:1204] (2/4) Exclude cut with ID _51899_210_20034_1_1534129254466_3669220_265-215428-0 from training. Duration: 0.98 2023-10-03 12:59:49,728 WARNING [train.py:1204] (2/4) Exclude cut with ID _34423_210_8750_1_1533430798456_3330199_219-106541-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 12:59:51,665 WARNING [train.py:1204] (2/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_458-67321-0 from training. Duration: 0.97 2023-10-03 12:59:52,367 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.4.encoder.layers.0.feed_forward2.out_whiten, num_groups=1, num_channels=384, metric=5.97 vs. limit=15.0 2023-10-03 12:59:52,667 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.0.layers.1.feed_forward2.out_whiten, num_groups=1, num_channels=192, metric=4.05 vs. limit=15.0 2023-10-03 12:59:59,786 WARNING [train.py:1204] (2/4) Exclude cut with ID _6653_210_2629_1_1527919172441_6732679_1051-182606-0_sp1.1 from training. Duration: 0.936375 2023-10-03 12:59:59,804 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_53555_210_5632_1_1534595220_7232297_603-4396-0_sp1.1 from training. Duration: 0.97275 2023-10-03 13:00:00,038 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=1275446.6666666667, ans=0.1 2023-10-03 13:00:01,253 WARNING [train.py:1204] (2/4) Exclude cut with ID _54218_210_12210_1_1534413646388_4409259_159-144367-0_sp1.1 from training. Duration: 0.963625 2023-10-03 13:00:02,558 WARNING [train.py:1204] (2/4) Exclude cut with ID _7606_210_4915_1_1528894253277_5080690_301-10732-0_sp0.9 from training. Duration: 0.9 2023-10-03 13:00:02,569 WARNING [train.py:1204] (2/4) Exclude cut with ID _15026_210_8750_1_1530777602316_3291850_211-195043-0_sp0.9 from training. Duration: 0.888875 2023-10-03 13:00:05,035 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.4.encoder.layers.2.conv_module1.whiten, num_groups=1, num_channels=384, metric=3.14 vs. limit=15.0 2023-10-03 13:00:05,773 WARNING [train.py:1204] (2/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_146-282400-0 from training. Duration: 0.71 2023-10-03 13:00:07,104 WARNING [train.py:1204] (2/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_65-88242-0 from training. Duration: 0.56 2023-10-03 13:00:07,197 WARNING [train.py:1204] (2/4) Exclude cut with ID _55857_210_2346_1_1534644007831_6898209_80-107757-0_sp1.1 from training. Duration: 0.963625 2023-10-03 13:00:08,394 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_15126_210_6753_1_1530778677_2591185_120-277707-0_sp0.9 from training. Duration: 0.888875 2023-10-03 13:00:08,513 WARNING [train.py:1204] (2/4) Exclude cut with ID _44778_210_2054_1_1533798127203_7443090_1033-168593-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 13:00:08,556 WARNING [train.py:1204] (2/4) Exclude cut with ID _46683_210_9642_1_1533730913492_4591116_475-30711-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 13:00:08,579 WARNING [train.py:1204] (2/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_253-308377-0 from training. Duration: 0.49 2023-10-03 13:00:08,720 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.balancer1.prob, batch_count=1275513.3333333333, ans=0.125 2023-10-03 13:00:09,989 WARNING [train.py:1204] (2/4) Exclude cut with ID _4680_210_3525_1_1527249757953_893386_75-78979-0 from training. Duration: 0.91 2023-10-03 13:00:11,338 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_5522_210_3273_1_1527249606_3909096_318-203153-0_sp0.9 from training. Duration: 0.47775 2023-10-03 13:00:12,815 WARNING [train.py:1204] (2/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_550-170116-0_sp1.1 from training. Duration: 0.92725 2023-10-03 13:00:12,851 WARNING [train.py:1204] (2/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_277-210798-0_sp0.9 from training. Duration: 0.611125 2023-10-03 13:00:14,133 WARNING [train.py:1204] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_620-276793-0 from training. Duration: 0.59 2023-10-03 13:00:14,152 WARNING [train.py:1204] (2/4) Exclude cut with ID _3067_210_2667_1_1526212974640_4503168_119-175776-0 from training. Duration: 0.74 2023-10-03 13:00:15,627 WARNING [train.py:1204] (2/4) Exclude cut with ID _55209_210_1531_1_1534675828996_4597160_681-195827-0_sp1.1 from training. Duration: 0.92725 2023-10-03 13:00:15,707 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_427-78097-0_sp1.1 from training. Duration: 0.72725 2023-10-03 13:00:17,028 INFO [train.py:1046] (2/4) Epoch 37, batch 100, loss[loss=0.1396, simple_loss=0.2171, pruned_loss=0.03108, over 24310.00 frames. ], tot_loss[loss=0.1599, simple_loss=0.2409, pruned_loss=0.03945, over 1888397.60 frames. ], batch size: 56, lr: 2.77e-03, grad_scale: 32.0 2023-10-03 13:00:17,158 WARNING [train.py:1204] (2/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_96-290039-0_sp0.9 from training. Duration: 0.62225 2023-10-03 13:00:17,167 WARNING [train.py:1204] (2/4) Exclude cut with ID _45329_210_7005_1_1534157951211_6774339_287-292174-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 13:00:19,787 WARNING [train.py:1204] (2/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_200-292228-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 13:00:23,364 WARNING [train.py:1204] (2/4) Exclude cut with ID _69884_210_9771_1_1535846392818_4166169_291-194999-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 13:00:26,112 WARNING [train.py:1204] (2/4) Exclude cut with ID _58895_210_8750_1_1535184081320_3649089_265-323810-0_sp1.1 from training. Duration: 0.87275 2023-10-03 13:00:27,500 WARNING [train.py:1204] (2/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_58-68956-0 from training. Duration: 0.83 2023-10-03 13:00:27,505 WARNING [train.py:1204] (2/4) Exclude cut with ID _39867_210_9826_1_1533553222030_9344259_322-115153-0_sp1.1 from training. Duration: 0.963625 2023-10-03 13:00:29,084 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_3371_210_1794_1_1526384462_5580777_396-339812-0_sp1.1 from training. Duration: 0.77275 2023-10-03 13:00:30,304 WARNING [train.py:1204] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_731-2428-0_sp0.9 from training. Duration: 0.92225 2023-10-03 13:00:30,324 WARNING [train.py:1204] (2/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_98-741-0_sp1.1 from training. Duration: 0.72725 2023-10-03 13:00:30,326 WARNING [train.py:1204] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_238-245655-0_sp0.9 from training. Duration: 0.9 2023-10-03 13:00:30,347 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6788_210_1388_1_1527898698_4562021_91-197981-0_sp0.9 from training. Duration: 0.92225 2023-10-03 13:00:32,116 WARNING [train.py:1204] (2/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_860-96198-0 from training. Duration: 0.73 2023-10-03 13:00:34,096 WARNING [train.py:1204] (2/4) Exclude cut with ID _14268_210_9206_1_1530622771285_4714099_331-105351-0_sp0.9 from training. Duration: 0.72225 2023-10-03 13:00:34,120 WARNING [train.py:1204] (2/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_204-137133-0_sp1.1 from training. Duration: 0.92725 2023-10-03 13:00:35,449 WARNING [train.py:1204] (2/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_5-46496-0_sp1.1 from training. Duration: 0.936375 2023-10-03 13:00:35,457 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_160-218721-0_sp1.1 from training. Duration: 0.87275 2023-10-03 13:00:38,260 WARNING [train.py:1204] (2/4) Exclude cut with ID _45329_210_7005_1_1534157951211_6774339_503-287883-0 from training. Duration: 0.88 2023-10-03 13:00:39,574 WARNING [train.py:1204] (2/4) Exclude cut with ID _57572_210_5517_1_1534921219624_3809484_110-79134-0_sp1.1 from training. Duration: 0.92725 2023-10-03 13:00:40,919 WARNING [train.py:1204] (2/4) Exclude cut with ID _44585_210_18628_1_1533605350387_4518199_421-37641-0_sp1.1 from training. Duration: 0.936375 2023-10-03 13:00:42,296 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_42-142473-0_sp0.9 from training. Duration: 0.8 2023-10-03 13:00:42,568 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.bypass.skip_rate, batch_count=1275646.6666666667, ans=0.09899494936611666 2023-10-03 13:00:43,787 WARNING [train.py:1204] (2/4) Exclude cut with ID _5166_210_2084_1_1527073197160_4087993_310-114452-0_sp0.9 from training. Duration: 0.6555625 2023-10-03 13:00:48,033 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9029W0120-509520 from training. Duration: 0.768 2023-10-03 13:00:48,056 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9029W0433-509820 from training. Duration: 0.896 2023-10-03 13:00:49,839 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_40222_210_6228_1_1533385298_4481939_163-334586-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 13:00:49,839 WARNING [train.py:1204] (2/4) Exclude cut with ID _48677_210_5663_1_1533902405919_3892989_408-40740-0_sp1.1 from training. Duration: 0.7545625 2023-10-03 13:00:55,798 WARNING [train.py:1204] (2/4) Exclude cut with ID _24314_210_4915_1_1532135078116_7170179_521-258868-0_sp0.9 from training. Duration: 0.87775 2023-10-03 13:00:57,175 WARNING [train.py:1204] (2/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_576-51198-0_sp1.1 from training. Duration: 0.92725 2023-10-03 13:00:58,621 WARNING [train.py:1204] (2/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_306-216871-0_sp1.1 from training. Duration: 0.97275 2023-10-03 13:01:03,507 WARNING [train.py:1204] (2/4) Exclude cut with ID _20684_210_6917_1_1531567751646_4203930_463-119982-0_sp1.1 from training. Duration: 0.97275 2023-10-03 13:01:04,829 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9084W0012-536850 from training. Duration: 0.64 2023-10-03 13:01:06,269 WARNING [train.py:1204] (2/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_847-41334-0_sp1.1 from training. Duration: 0.4 2023-10-03 13:01:09,383 WARNING [train.py:1204] (2/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_326-165900-0_sp1.1 from training. Duration: 0.62725 2023-10-03 13:01:10,768 WARNING [train.py:1204] (2/4) Exclude cut with ID _7537_210_2084_1_1528502395431_3924359_128-268029-0_sp1.1 from training. Duration: 0.8909375 2023-10-03 13:01:12,572 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.5.encoder.layers.0.feed_forward3.out_whiten, num_groups=1, num_channels=256, metric=10.36 vs. limit=15.0 2023-10-03 13:01:13,489 WARNING [train.py:1204] (2/4) Exclude cut with ID _67499_210_17282_1_1535509805220_3692430_333-155030-0_sp1.1 from training. Duration: 0.97275 2023-10-03 13:01:17,503 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_34008_210_10431_1_1533345424_8183247_588-257177-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 13:01:18,984 WARNING [train.py:1204] (2/4) Exclude cut with ID _25914_210_6400_1_1532141317058_644740_52-96808-0_sp1.1 from training. Duration: 0.82725 2023-10-03 13:01:21,686 WARNING [train.py:1204] (2/4) Exclude cut with ID _56494_210_4220_1_1535284837625_3033169_69-138150-0_sp1.1 from training. Duration: 0.8545625 2023-10-03 13:01:23,181 WARNING [train.py:1204] (2/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_19-143553-0_sp1.1 from training. Duration: 0.97275 2023-10-03 13:01:25,020 WARNING [train.py:1204] (2/4) Exclude cut with ID _21878_210_3525_1_1531738772002_7264330_582-130735-0_sp1.1 from training. Duration: 0.936375 2023-10-03 13:01:25,134 WARNING [train.py:1204] (2/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_943-201356-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 13:01:25,147 WARNING [train.py:1204] (2/4) Exclude cut with ID _3231_210_2106_1_1526205864934_6784220_422-229276-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 13:01:26,906 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_48049_210_5632_1_1533864553_3519937_73-198717-0_sp1.1 from training. Duration: 0.97275 2023-10-03 13:01:28,252 WARNING [train.py:1204] (2/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_329-285801-0 from training. Duration: 0.91 2023-10-03 13:01:28,270 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9171W0114-579000 from training. Duration: 0.896 2023-10-03 13:01:28,283 WARNING [train.py:1204] (2/4) Exclude cut with ID _3120_210_3525_1_1526036143112_4072980_210-132675-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 13:01:28,356 WARNING [train.py:1204] (2/4) Exclude cut with ID _55857_210_2346_1_1534644007831_6898209_745-98391-0_sp0.9 from training. Duration: 0.8444375 2023-10-03 13:01:29,730 WARNING [train.py:1204] (2/4) Exclude cut with ID _5250_210_5219_1_1527249561806_8692110_615-322035-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 13:01:29,735 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_36756_210_7234_1_1533026991_966276_119-315767-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 13:01:29,759 WARNING [train.py:1204] (2/4) Exclude cut with ID _13916_210_9994_1_1530788341126_3763149_60-191556-0_sp1.1 from training. Duration: 0.5545625 2023-10-03 13:01:31,003 INFO [train.py:1046] (2/4) Epoch 37, batch 150, loss[loss=0.1632, simple_loss=0.2314, pruned_loss=0.04753, over 23794.00 frames. ], tot_loss[loss=0.1596, simple_loss=0.2404, pruned_loss=0.03941, over 2513863.35 frames. ], batch size: 179, lr: 2.77e-03, grad_scale: 32.0 2023-10-03 13:01:31,053 WARNING [train.py:1204] (2/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_177-236809-0_sp1.1 from training. Duration: 0.4454375 2023-10-03 13:01:31,103 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7002_210_1794_1_1527939683_5157286_184-229630-0_sp1.1 from training. Duration: 0.67275 2023-10-03 13:01:31,115 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_50428_210_5724_1_1534247532_4126418_464-18322-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 13:01:31,233 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.1.feed_forward1.out_proj.dropout_p, batch_count=1275913.3333333333, ans=0.1 2023-10-03 13:01:32,402 WARNING [train.py:1204] (2/4) Exclude cut with ID _35799_210_5854_1_1533263487887_3529071_466-131402-0_sp1.1 from training. Duration: 0.936375 2023-10-03 13:01:33,835 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_1259-132768-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 13:01:33,885 WARNING [train.py:1204] (2/4) Exclude cut with ID _6694_210_4599_1_1527852637490_3965529_144-185051-0_sp1.1 from training. Duration: 0.87275 2023-10-03 13:01:35,173 WARNING [train.py:1204] (2/4) Exclude cut with ID _64639_210_5816_1_1535367619437_3528000_59-133290-0_sp1.1 from training. Duration: 0.963625 2023-10-03 13:01:37,249 WARNING [train.py:1204] (2/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_223-49525-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 13:01:40,666 WARNING [train.py:1204] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_748-36150-0_sp1.1 from training. Duration: 0.82725 2023-10-03 13:01:40,667 WARNING [train.py:1204] (2/4) Exclude cut with ID _19896_210_1883_1_1531709022662_6649569_52-221627-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 13:01:41,944 WARNING [train.py:1204] (2/4) Exclude cut with ID _6789_210_1534_1_1527854471671_2870519_296-115766-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 13:01:43,391 WARNING [train.py:1204] (2/4) Exclude cut with ID _14624_210_1925_1_1530858690657_3642219_100-19871-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 13:01:43,421 WARNING [train.py:1204] (2/4) Exclude cut with ID _12555_210_8750_1_1530075594402_3780239_50-293675-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 13:01:46,133 WARNING [train.py:1204] (2/4) Exclude cut with ID _14880_210_8895_1_1530680426655_3795259_289-212817-0_sp1.1 from training. Duration: 0.62725 2023-10-03 13:01:46,196 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_14056_210_4163_1_1530604483_7572827_388-211626-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 13:01:48,997 WARNING [train.py:1204] (2/4) Exclude cut with ID _5289_210_2084_1_1527678202736_3779299_392-261988-0 from training. Duration: 0.65 2023-10-03 13:01:49,006 WARNING [train.py:1204] (2/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_124-345840-0 from training. Duration: 0.52 2023-10-03 13:01:49,010 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_2655_210_2087_1_1525688959_3897400_437-337690-0 from training. Duration: 0.63 2023-10-03 13:01:50,627 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.540e+02 1.933e+02 2.120e+02 2.360e+02 3.157e+02, threshold=4.239e+02, percent-clipped=0.0 2023-10-03 13:01:52,128 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_3780_210_2087_1_1526467391_239068_13-274633-0_sp1.1 from training. Duration: 0.92725 2023-10-03 13:01:52,132 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_805-71497-0_sp1.1 from training. Duration: 0.6454375 2023-10-03 13:01:53,494 WARNING [train.py:1204] (2/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_434-83000-0_sp1.1 from training. Duration: 0.8909375 2023-10-03 13:01:55,284 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_17613_210_2151_1_1531137569_2366160_285-219531-0_sp1.1 from training. Duration: 0.97275 2023-10-03 13:01:55,289 WARNING [train.py:1204] (2/4) Exclude cut with ID _56108_210_16380_1_1534505446159_3887959_22-37858-0_sp1.1 from training. Duration: 0.936375 2023-10-03 13:01:57,086 WARNING [train.py:1204] (2/4) Exclude cut with ID _28427_210_4220_1_1532581240967_3061069_200-88444-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 13:01:57,154 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_43802_210_2290_1_1533516203_8829504_77-279042-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 13:01:58,837 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=1275980.0, ans=0.1 2023-10-03 13:01:59,936 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9309W0193-640320 from training. Duration: 0.896 2023-10-03 13:02:01,406 WARNING [train.py:1204] (2/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_516-324055-0_sp1.1 from training. Duration: 0.936375 2023-10-03 13:02:05,078 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.1.encoder.layers.1.conv_module2.whiten, num_groups=1, num_channels=256, metric=10.74 vs. limit=15.0 2023-10-03 13:02:05,657 WARNING [train.py:1204] (2/4) Exclude cut with ID _40094_210_16380_1_1533294005773_3641159_10-261240-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 13:02:08,987 WARNING [train.py:1204] (2/4) Exclude cut with ID _6267_210_2084_1_1527764658526_3894730_375-219887-0_sp1.1 from training. Duration: 0.6090625 2023-10-03 13:02:09,084 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6788_210_1388_1_1527898698_4562021_180-75288-0 from training. Duration: 0.99 2023-10-03 13:02:09,333 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.feed_forward3.hidden_balancer.prob, batch_count=1276046.6666666667, ans=0.125 2023-10-03 13:02:11,071 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.3.encoder.layers.3.self_attn2.whiten, num_groups=1, num_channels=512, metric=11.56 vs. limit=22.5 2023-10-03 13:02:11,867 WARNING [train.py:1204] (2/4) Exclude cut with ID _8030_210_7497_1_1528444886781_3531089_262-86750-0_sp1.1 from training. Duration: 0.736375 2023-10-03 13:02:13,127 WARNING [train.py:1204] (2/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_934-325498-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 13:02:13,155 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_456-22141-0_sp0.9 from training. Duration: 0.97775 2023-10-03 13:02:15,835 WARNING [train.py:1204] (2/4) Exclude cut with ID _49643_210_7989_1_1534640244224_3939031_598-161926-0_sp1.1 from training. Duration: 0.8181875 2023-10-03 13:02:17,129 WARNING [train.py:1204] (2/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_345-49721-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 13:02:18,456 WARNING [train.py:1204] (2/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_440-141811-0_sp1.1 from training. Duration: 0.863625 2023-10-03 13:02:19,786 WARNING [train.py:1204] (2/4) Exclude cut with ID _64048_210_10626_1_1535198382802_3835179_394-212120-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 13:02:19,836 WARNING [train.py:1204] (2/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_91-277684-0 from training. Duration: 0.74 2023-10-03 13:02:22,722 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.0.feed_forward1.hidden_balancer.prob, batch_count=1276113.3333333333, ans=0.125 2023-10-03 13:02:24,549 WARNING [train.py:1204] (2/4) Exclude cut with ID _23038_210_9570_1_1532001584158_3652419_191-346343-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 13:02:25,936 WARNING [train.py:1204] (2/4) Exclude cut with ID _15140_210_9288_1_1530791388450_4880149_351-347974-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 13:02:25,979 WARNING [train.py:1204] (2/4) Exclude cut with ID _58605_210_12210_1_1534723195939_3904259_48-269402-0_sp1.1 from training. Duration: 0.87275 2023-10-03 13:02:25,986 WARNING [train.py:1204] (2/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_401-53423-0_sp0.9 from training. Duration: 0.611125 2023-10-03 13:02:29,104 WARNING [train.py:1204] (2/4) Exclude cut with ID _42659_210_2070_1_1533607442765_3120380_158-49004-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 13:02:29,239 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.0.bypass_mid.scale_min, batch_count=1276180.0, ans=0.2 2023-10-03 13:02:31,790 WARNING [train.py:1204] (2/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_81-75849-0_sp1.1 from training. Duration: 0.3545625 2023-10-03 13:02:34,411 WARNING [train.py:1204] (2/4) Exclude cut with ID _27052_210_5986_1_1532329197187_3585009_314-106499-0_sp1.1 from training. Duration: 0.836375 2023-10-03 13:02:35,762 WARNING [train.py:1204] (2/4) Exclude cut with ID _4626_210_3532_1_1526817652717_3916859_446-206656-0_sp0.9 from training. Duration: 0.8444375 2023-10-03 13:02:35,855 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_53378_210_17282_1_1534221071_5769509_158-114960-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 13:02:37,859 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_3171_210_2087_1_1526109084_2330007_37-64334-0_sp0.9 from training. Duration: 0.911125 2023-10-03 13:02:37,882 WARNING [train.py:1204] (2/4) Exclude cut with ID _5410_210_1388_1_1527397055795_1719020_39-112221-0 from training. Duration: 0.69 2023-10-03 13:02:39,165 WARNING [train.py:1204] (2/4) Exclude cut with ID _8064_210_5399_1_1528372299291_4282973_4-91037-0_sp0.9 from training. Duration: 0.97775 2023-10-03 13:02:39,187 WARNING [train.py:1204] (2/4) Exclude cut with ID ID0079W0443-717795 from training. Duration: 0.541 2023-10-03 13:02:42,500 WARNING [train.py:1204] (2/4) Exclude cut with ID _34734_210_4525_1_1533365977542_4448720_766-297928-0_sp1.1 from training. Duration: 0.936375 2023-10-03 13:02:45,194 INFO [train.py:1046] (2/4) Epoch 37, batch 200, loss[loss=0.1361, simple_loss=0.2155, pruned_loss=0.02832, over 24285.00 frames. ], tot_loss[loss=0.1611, simple_loss=0.2415, pruned_loss=0.04032, over 3002627.10 frames. ], batch size: 56, lr: 2.77e-03, grad_scale: 16.0 2023-10-03 13:02:45,341 WARNING [train.py:1204] (2/4) Exclude cut with ID _51471_210_1889_1_1534399391390_3094250_310-337793-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 13:02:46,618 WARNING [train.py:1204] (2/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_678-159307-0_sp0.9 from training. Duration: 0.9444375 2023-10-03 13:02:46,818 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.0.ff2_skip_rate, batch_count=1276246.6666666667, ans=0.0 2023-10-03 13:02:49,539 WARNING [train.py:1204] (2/4) Exclude cut with ID _50093_210_4220_1_1534417141207_3432299_259-322252-0 from training. Duration: 0.92 2023-10-03 13:02:50,877 WARNING [train.py:1204] (2/4) Exclude cut with ID _47790_210_16187_1_1533862814066_4207519_67-103444-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 13:02:50,896 WARNING [train.py:1204] (2/4) Exclude cut with ID _40551_210_10635_1_1533776424632_3416179_311-247578-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 13:02:53,501 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_4633_210_2040_1_1526824163_4145343_416-76117-0 from training. Duration: 0.95 2023-10-03 13:02:55,446 WARNING [train.py:1204] (2/4) Exclude cut with ID _5166_210_2084_1_1527073197160_4087993_331-117829-0_sp0.9 from training. Duration: 0.688875 2023-10-03 13:02:56,839 WARNING [train.py:1204] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_299-247059-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 13:02:58,193 WARNING [train.py:1204] (2/4) Exclude cut with ID _64002_210_9570_1_1535198605157_38979_2-240845-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 13:03:01,548 WARNING [train.py:1204] (2/4) Exclude cut with ID _4681_210_3525_1_1527250689464_120889_15-126760-0_sp0.9 from training. Duration: 0.9 2023-10-03 13:03:01,567 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_21500_210_2054_1_1532149310_2716758_256-156611-0_sp1.1 from training. Duration: 0.936375 2023-10-03 13:03:01,587 WARNING [train.py:1204] (2/4) Exclude cut with ID _16908_210_5724_1_1531119157248_3772700_624-203729-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 13:03:09,282 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.3.encoder.layers.0.self_attn2.whiten, num_groups=1, num_channels=512, metric=12.85 vs. limit=22.5 2023-10-03 13:03:20,233 WARNING [train.py:1204] (2/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_605-219732-0_sp1.1 from training. Duration: 0.8545625 2023-10-03 13:03:21,439 WARNING [train.py:1204] (2/4) Exclude cut with ID _44718_210_5816_1_1534050051869_6469030_380-167520-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 13:03:21,507 WARNING [train.py:1204] (2/4) Exclude cut with ID _19896_210_1883_1_1531709022662_6649569_94-229927-0_sp1.1 from training. Duration: 0.8818125 2023-10-03 13:03:22,878 WARNING [train.py:1204] (2/4) Exclude cut with ID _19514_210_9259_1_1531530071330_5084230_519-174300-0_sp1.1 from training. Duration: 0.963625 2023-10-03 13:03:24,287 WARNING [train.py:1204] (2/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_340-174176-0_sp0.9 from training. Duration: 0.5444375 2023-10-03 13:03:24,291 WARNING [train.py:1204] (2/4) Exclude cut with ID _8263_210_1664_1_1528439452891_970220_68-181805-0_sp1.1 from training. Duration: 0.5818125 2023-10-03 13:03:24,411 WARNING [train.py:1204] (2/4) Exclude cut with ID _3120_210_3525_1_1526036143112_4072980_348-257723-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 13:03:25,705 WARNING [train.py:1204] (2/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_453-133983-0_sp1.1 from training. Duration: 0.7090625 2023-10-03 13:03:25,761 WARNING [train.py:1204] (2/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_17-86060-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 13:03:25,790 WARNING [train.py:1204] (2/4) Exclude cut with ID _29877_210_6228_1_1532397791106_5276149_228-184657-0_sp1.1 from training. Duration: 0.97275 2023-10-03 13:03:27,657 WARNING [train.py:1204] (2/4) Exclude cut with ID _18516_210_9206_1_1531704653206_3692406_198-334870-0 from training. Duration: 0.92 2023-10-03 13:03:27,690 WARNING [train.py:1204] (2/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_340-174176-0_sp1.1 from training. Duration: 0.4454375 2023-10-03 13:03:29,485 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_3133_210_2087_1_1526106446_2324690_14-139186-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 13:03:31,199 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.feed_forward1.out_proj.dropout_p, batch_count=1276446.6666666667, ans=0.1 2023-10-03 13:03:33,632 WARNING [train.py:1204] (2/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_126-29104-0_sp0.9 from training. Duration: 0.9444375 2023-10-03 13:03:37,979 WARNING [train.py:1204] (2/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_84-10310-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 13:03:44,614 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_15517_210_6753_1_1530954008_3487336_79-273076-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 13:03:44,657 WARNING [train.py:1204] (2/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_371-18169-0_sp0.9 from training. Duration: 0.9555625 2023-10-03 13:03:50,256 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_68702_210_23689_1_1535799577_3420068_170-92788-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 13:03:50,782 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.4.encoder.layers.0.nonlin_attention.whiten1, num_groups=1, num_channels=288, metric=9.63 vs. limit=10.0 2023-10-03 13:03:52,851 WARNING [train.py:1204] (2/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_78-182912-0 from training. Duration: 0.59 2023-10-03 13:03:54,174 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_3019_210_2028_1_1526044245_3264865_297-12384-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 13:03:54,180 WARNING [train.py:1204] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_147-111137-0_sp1.1 from training. Duration: 0.863625 2023-10-03 13:03:54,185 WARNING [train.py:1204] (2/4) Exclude cut with ID _28323_210_16187_1_1532480395667_4100300_117-8859-0_sp1.1 from training. Duration: 0.97275 2023-10-03 13:03:54,264 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7762_210_1794_1_1528199508_4057408_419-125009-0_sp1.1 from training. Duration: 0.7545625 2023-10-03 13:03:55,962 WARNING [train.py:1204] (2/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_268-87909-0 from training. Duration: 0.99 2023-10-03 13:03:56,034 WARNING [train.py:1204] (2/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_118-311357-0_sp1.1 from training. Duration: 0.92725 2023-10-03 13:03:57,736 WARNING [train.py:1204] (2/4) Exclude cut with ID ID0377W0003-870225 from training. Duration: 0.852 2023-10-03 13:03:59,121 INFO [train.py:1046] (2/4) Epoch 37, batch 250, loss[loss=0.1467, simple_loss=0.2216, pruned_loss=0.03591, over 24433.00 frames. ], tot_loss[loss=0.1609, simple_loss=0.2408, pruned_loss=0.04049, over 3379219.70 frames. ], batch size: 58, lr: 2.77e-03, grad_scale: 16.0 2023-10-03 13:03:59,197 WARNING [train.py:1204] (2/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_56-5838-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 13:04:00,723 WARNING [train.py:1204] (2/4) Exclude cut with ID _14603_210_4220_1_1530615688441_3097319_90-31383-0_sp0.9 from training. Duration: 0.9666875 2023-10-03 13:04:02,128 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_31583_210_3852_1_1532595851_4024413_58-86657-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 13:04:03,358 WARNING [train.py:1204] (2/4) Exclude cut with ID _30304_210_6319_1_1532476827261_7582060_344-113875-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 13:04:06,093 WARNING [train.py:1204] (2/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_278-104676-0_sp1.1 from training. Duration: 0.8909375 2023-10-03 13:04:06,116 WARNING [train.py:1204] (2/4) Exclude cut with ID _55844_210_2346_1_1534471212569_7625707_764-2387-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 13:04:07,486 WARNING [train.py:1204] (2/4) Exclude cut with ID _27430_210_7770_1_1532604689375_1378590_157-14963-0_sp1.1 from training. Duration: 0.963625 2023-10-03 13:04:09,001 WARNING [train.py:1204] (2/4) Exclude cut with ID _7160_210_1876_1_1527940923231_3703799_282-322611-0_sp1.1 from training. Duration: 0.7909375 2023-10-03 13:04:12,435 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.self_attn_weights.pos_emb_skip_rate, batch_count=1276646.6666666667, ans=0.0 2023-10-03 13:04:19,364 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.555e+02 1.868e+02 1.988e+02 2.171e+02 2.742e+02, threshold=3.975e+02, percent-clipped=0.0 2023-10-03 13:04:20,900 WARNING [train.py:1204] (2/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_190-138640-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 13:04:23,640 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_33348_210_5632_1_1533086912_3324030_63-270922-0_sp1.1 from training. Duration: 0.97275 2023-10-03 13:04:23,667 WARNING [train.py:1204] (2/4) Exclude cut with ID _54871_210_22356_1_1534665798253_3856359_135-184281-0_sp1.1 from training. Duration: 0.9 2023-10-03 13:04:30,137 WARNING [train.py:1204] (2/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_277-210798-0_sp1.1 from training. Duration: 0.5 2023-10-03 13:04:31,502 WARNING [train.py:1204] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_402-253027-0_sp0.9 from training. Duration: 0.6 2023-10-03 13:04:31,569 WARNING [train.py:1204] (2/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_540-275685-0_sp0.9 from training. Duration: 0.988875 2023-10-03 13:04:32,868 WARNING [train.py:1204] (2/4) Exclude cut with ID _59493_210_17282_1_1535078419077_3225879_118-18960-0_sp1.1 from training. Duration: 0.936375 2023-10-03 13:04:34,292 WARNING [train.py:1204] (2/4) Exclude cut with ID _6194_210_1534_1_1527511826160_3941194_321-148641-0_sp1.1 from training. Duration: 0.5818125 2023-10-03 13:04:34,307 WARNING [train.py:1204] (2/4) Exclude cut with ID _61133_210_9969_1_1535196777227_3696409_333-270325-0_sp1.1 from training. Duration: 0.8454375 2023-10-03 13:04:35,642 WARNING [train.py:1204] (2/4) Exclude cut with ID _49376_210_5986_1_1533985278732_3370740_307-145221-0_sp1.1 from training. Duration: 0.936375 2023-10-03 13:04:37,091 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6846_210_6753_1_1527850767_4295158_2-315073-0_sp0.9 from training. Duration: 0.97775 2023-10-03 13:04:39,723 WARNING [train.py:1204] (2/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_17-25045-0 from training. Duration: 0.59 2023-10-03 13:04:39,736 WARNING [train.py:1204] (2/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_503-333510-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 13:04:39,872 WARNING [train.py:1204] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_238-245655-0_sp1.1 from training. Duration: 0.736375 2023-10-03 13:04:41,113 WARNING [train.py:1204] (2/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_329-82235-0_sp0.9 from training. Duration: 0.911125 2023-10-03 13:04:41,118 WARNING [train.py:1204] (2/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_325-113798-0_sp0.9 from training. Duration: 0.9444375 2023-10-03 13:04:41,190 WARNING [train.py:1204] (2/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_395-37222-0_sp0.9 from training. Duration: 0.8444375 2023-10-03 13:04:41,909 WARNING [train.py:1204] (2/4) Exclude cut with ID _64135_210_2253_1_1535677267876_7608972_498-5039-0_sp1.1 from training. Duration: 0.7090625 2023-10-03 13:04:41,919 WARNING [train.py:1204] (2/4) Exclude cut with ID _14922_210_6228_1_1530707435273_3693310_303-228738-0_sp0.9 from training. Duration: 0.9333125 2023-10-03 13:04:44,729 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_577-220899-0_sp1.1 from training. Duration: 0.92725 2023-10-03 13:04:44,834 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_3475_210_2028_1_1526193578_3622569_377-166133-0_sp1.1 from training. Duration: 0.7818125 2023-10-03 13:04:44,856 WARNING [train.py:1204] (2/4) Exclude cut with ID _49376_210_5986_1_1533985278732_3370740_272-313512-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 13:04:48,088 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7762_210_1794_1_1528199508_4057408_422-227034-0_sp0.9 from training. Duration: 0.811125 2023-10-03 13:04:49,608 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.self_attn_weights.pos_emb_skip_rate, batch_count=1276780.0, ans=0.0 2023-10-03 13:04:52,192 WARNING [train.py:1204] (2/4) Exclude cut with ID _18895_210_6228_1_1531357274234_3784930_74-341428-0_sp1.1 from training. Duration: 0.92725 2023-10-03 13:04:54,322 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.2.encoder.layers.0.self_attn2.whiten, num_groups=1, num_channels=384, metric=16.74 vs. limit=22.5 2023-10-03 13:04:56,733 WARNING [train.py:1204] (2/4) Exclude cut with ID _44902_210_16187_1_1533888006062_3792078_149-67212-0_sp1.1 from training. Duration: 0.7909375 2023-10-03 13:04:57,076 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.conv_module1.balancer2.prob, batch_count=1276846.6666666667, ans=0.125 2023-10-03 13:05:00,273 WARNING [train.py:1204] (2/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_243-126020-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 13:05:02,962 WARNING [train.py:1204] (2/4) Exclude cut with ID _54218_210_12210_1_1534413646388_4409259_259-6561-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 13:05:06,952 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_41452_210_7291_1_1533437687_3941281_405-11559-0 from training. Duration: 0.91 2023-10-03 13:05:08,392 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_20120_210_6753_1_1531643035_5482859_759-225084-0_sp1.1 from training. Duration: 0.97275 2023-10-03 13:05:08,415 WARNING [train.py:1204] (2/4) Exclude cut with ID _14747_210_8341_1_1530685797463_3654089_147-131229-0_sp0.9 from training. Duration: 0.8444375 2023-10-03 13:05:09,921 WARNING [train.py:1204] (2/4) Exclude cut with ID _14867_210_1968_1_1530684179890_3846969_319-250868-0 from training. Duration: 0.99 2023-10-03 13:05:09,942 WARNING [train.py:1204] (2/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_580-93798-0_sp0.9 from training. Duration: 0.62225 2023-10-03 13:05:11,398 WARNING [train.py:1204] (2/4) Exclude cut with ID _9679_210_7497_1_1529129212906_4363929_512-313889-0_sp0.9 from training. Duration: 0.9 2023-10-03 13:05:11,415 WARNING [train.py:1204] (2/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_18-189198-0 from training. Duration: 0.9 2023-10-03 13:05:12,713 INFO [train.py:1046] (2/4) Epoch 37, batch 300, loss[loss=0.1496, simple_loss=0.2182, pruned_loss=0.04047, over 23861.00 frames. ], tot_loss[loss=0.1593, simple_loss=0.2387, pruned_loss=0.03995, over 3671954.09 frames. ], batch size: 212, lr: 2.77e-03, grad_scale: 16.0 2023-10-03 13:05:16,702 WARNING [train.py:1204] (2/4) Exclude cut with ID _49755_210_18053_1_1534581005846_3218709_137-64179-0_sp1.1 from training. Duration: 0.92725 2023-10-03 13:05:16,777 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_48049_210_5632_1_1533864553_3519937_306-283794-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 13:05:20,805 WARNING [train.py:1204] (2/4) Exclude cut with ID _43552_210_8913_1_1533726031059_3523429_25-120161-0_sp1.1 from training. Duration: 0.8909375 2023-10-03 13:05:20,865 WARNING [train.py:1204] (2/4) Exclude cut with ID _8833_210_5219_1_1528592665210_7553059_319-349181-0 from training. Duration: 0.99 2023-10-03 13:05:22,223 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_296-332325-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 13:05:22,491 INFO [scaling.py:1118] (2/4) WithLoss: name=encoder.encoders.2.encoder.layers.2.self_attn_weights, loss-sum=0.000e+00 2023-10-03 13:05:23,596 WARNING [train.py:1204] (2/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_436-186311-0_sp1.1 from training. Duration: 0.5909375 2023-10-03 13:05:23,618 WARNING [train.py:1204] (2/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_348-127789-0 from training. Duration: 0.85 2023-10-03 13:05:23,623 WARNING [train.py:1204] (2/4) Exclude cut with ID _3992_210_1747_1_1526644816031_3731060_443-195183-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 13:05:25,668 WARNING [train.py:1204] (2/4) Exclude cut with ID _3465_210_3525_1_1526175667060_3574049_308-231735-0_sp1.1 from training. Duration: 0.663625 2023-10-03 13:05:29,052 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6786_210_1388_1_1527839699_4194495_378-46070-0_sp1.1 from training. Duration: 0.6545625 2023-10-03 13:05:29,089 WARNING [train.py:1204] (2/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_63-101996-0 from training. Duration: 0.88 2023-10-03 13:05:32,988 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_2541_210_1794_1_1525590269_3084765_156-173713-0 from training. Duration: 0.81 2023-10-03 13:05:34,224 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_217-239796-0_sp1.1 from training. Duration: 0.963625 2023-10-03 13:05:36,853 WARNING [train.py:1204] (2/4) Exclude cut with ID _21878_210_3525_1_1531738772002_7264330_530-329400-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 13:05:38,321 WARNING [train.py:1204] (2/4) Exclude cut with ID _21783_210_11357_1_1531742458730_3762679_150-62920-0_sp1.1 from training. Duration: 0.963625 2023-10-03 13:05:38,321 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_5522_210_3273_1_1527249606_3909096_366-172508-0 from training. Duration: 0.66 2023-10-03 13:05:38,324 WARNING [train.py:1204] (2/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_303-340446-0_sp1.1 from training. Duration: 0.8181875 2023-10-03 13:05:39,795 WARNING [train.py:1204] (2/4) Exclude cut with ID _8699_210_2069_1_1528630346831_3960409_160-24408-0_sp1.1 from training. Duration: 0.82725 2023-10-03 13:05:40,029 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.conv_module2.balancer2.min_positive, batch_count=1276980.0, ans=0.05 2023-10-03 13:05:43,300 WARNING [train.py:1204] (2/4) Exclude cut with ID _41314_210_12651_1_1533459476543_3766460_299-187801-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 13:05:43,334 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_34886_210_10633_1_1533203882_1755661_92-345288-0_sp1.1 from training. Duration: 0.97275 2023-10-03 13:05:49,194 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_5438_210_1794_1_1527336687_2600009_104-288274-0_sp1.1 from training. Duration: 0.52725 2023-10-03 13:05:49,198 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_154-316450-0 from training. Duration: 0.76 2023-10-03 13:05:49,397 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.conv_module1.balancer1.prob, batch_count=1277046.6666666667, ans=0.125 2023-10-03 13:05:50,560 WARNING [train.py:1204] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_23-189234-0_sp0.9 from training. Duration: 0.92225 2023-10-03 13:05:51,971 WARNING [train.py:1204] (2/4) Exclude cut with ID _4779_210_2084_1_1527318022944_3914893_346-168820-0_sp1.1 from training. Duration: 0.963625 2023-10-03 13:05:53,606 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.self_attn_weights.pos_emb_skip_rate, batch_count=1277046.6666666667, ans=0.0 2023-10-03 13:05:54,623 WARNING [train.py:1204] (2/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_5-55893-0 from training. Duration: 0.55 2023-10-03 13:05:54,707 WARNING [train.py:1204] (2/4) Exclude cut with ID _20455_210_5219_1_1531565996775_8098100_180-24517-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 13:05:55,325 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.5.encoder.layers.0.self_attn1.whiten, num_groups=1, num_channels=256, metric=12.09 vs. limit=22.5 2023-10-03 13:05:59,542 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_3133_210_2087_1_1526106446_2324690_243-143119-0_sp0.9 from training. Duration: 0.8555625 2023-10-03 13:06:01,054 WARNING [train.py:1204] (2/4) Exclude cut with ID _65585_210_12210_1_1535450451584_3764098_293-212045-0_sp1.1 from training. Duration: 0.92725 2023-10-03 13:06:01,058 WARNING [train.py:1204] (2/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_577-332600-0 from training. Duration: 0.57 2023-10-03 13:06:06,547 WARNING [train.py:1204] (2/4) Exclude cut with ID _30039_210_13270_1_1532484612900_2725150_104-289874-0_sp1.1 from training. Duration: 0.963625 2023-10-03 13:06:06,555 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_3172_210_2087_1_1526292929_3987865_197-97762-0_sp1.1 from training. Duration: 0.6818125 2023-10-03 13:06:09,340 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_256-150908-0_sp1.1 from training. Duration: 0.963625 2023-10-03 13:06:10,739 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_296-287678-0_sp1.1 from training. Duration: 0.863625 2023-10-03 13:06:10,765 WARNING [train.py:1204] (2/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_443-30034-0 from training. Duration: 0.64 2023-10-03 13:06:10,805 WARNING [train.py:1204] (2/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_494-199538-0_sp0.9 from training. Duration: 0.8333125 2023-10-03 13:06:12,069 WARNING [train.py:1204] (2/4) Exclude cut with ID _19708_210_9206_1_1532136650733_4238364_61-162325-0_sp1.1 from training. Duration: 0.97275 2023-10-03 13:06:14,022 WARNING [train.py:1204] (2/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_475-127455-0 from training. Duration: 0.7 2023-10-03 13:06:15,296 WARNING [train.py:1204] (2/4) Exclude cut with ID _48677_210_5663_1_1533902405919_3892989_188-11581-0_sp1.1 from training. Duration: 0.963625 2023-10-03 13:06:15,318 WARNING [train.py:1204] (2/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_136-8381-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 13:06:16,728 WARNING [train.py:1204] (2/4) Exclude cut with ID _55328_210_27767_1_1534580973260_4088170_270-229555-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 13:06:18,129 WARNING [train.py:1204] (2/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_372-266951-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 13:06:18,201 WARNING [train.py:1204] (2/4) Exclude cut with ID _47790_210_16187_1_1533862814066_4207519_14-242149-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 13:06:23,039 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.conv_module2.balancer1.prob, batch_count=1277180.0, ans=0.125 2023-10-03 13:06:25,408 WARNING [train.py:1204] (2/4) Exclude cut with ID _7877_210_6014_1_1528286358099_3750136_159-76512-0_sp1.1 from training. Duration: 0.936375 2023-10-03 13:06:25,410 WARNING [train.py:1204] (2/4) Exclude cut with ID _8263_210_1664_1_1528438953494_294100_40-204731-0_sp1.1 from training. Duration: 0.4545625 2023-10-03 13:06:26,764 INFO [train.py:1046] (2/4) Epoch 37, batch 350, loss[loss=0.1792, simple_loss=0.2623, pruned_loss=0.0481, over 24426.00 frames. ], tot_loss[loss=0.1584, simple_loss=0.2376, pruned_loss=0.03961, over 3903252.26 frames. ], batch size: 77, lr: 2.77e-03, grad_scale: 16.0 2023-10-03 13:06:26,923 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8278_210_2550_1_1528637099_4220413_694-238413-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 13:06:31,247 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.4.encoder.layers.0.nonlin_attention.whiten2, num_groups=1, num_channels=384, metric=10.26 vs. limit=15.0 2023-10-03 13:06:33,373 WARNING [train.py:1204] (2/4) Exclude cut with ID _47278_210_12041_1_1533902418349_3576110_421-172710-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 13:06:36,140 WARNING [train.py:1204] (2/4) Exclude cut with ID _14970_210_15709_1_1530773543493_1715730_215-282680-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 13:06:36,168 WARNING [train.py:1204] (2/4) Exclude cut with ID _64639_210_5816_1_1535367619437_3528000_306-149449-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 13:06:38,603 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.0.layers.1.self_attn1.whiten, num_groups=1, num_channels=192, metric=10.30 vs. limit=22.5 2023-10-03 13:06:39,123 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_28-291982-0 from training. Duration: 0.97 2023-10-03 13:06:40,077 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.0.layers.0.self_attn1.whiten, num_groups=1, num_channels=192, metric=10.39 vs. limit=22.5 2023-10-03 13:06:41,825 WARNING [train.py:1204] (2/4) Exclude cut with ID _55134_210_21982_1_1534590028377_3882170_412-15870-0_sp1.1 from training. Duration: 0.936375 2023-10-03 13:06:41,873 WARNING [train.py:1204] (2/4) Exclude cut with ID _13684_210_7497_1_1530619354863_3598359_312-235606-0 from training. Duration: 0.6 2023-10-03 13:06:42,018 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.balancer1.prob, batch_count=1277313.3333333333, ans=0.125 2023-10-03 13:06:44,638 WARNING [train.py:1204] (2/4) Exclude cut with ID _37112_210_2575_1_1532997007673_7268610_347-238993-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 13:06:46,271 WARNING [train.py:1204] (2/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_121-284152-0 from training. Duration: 0.59 2023-10-03 13:06:48,106 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.592e+02 1.877e+02 2.152e+02 2.421e+02 3.801e+02, threshold=4.304e+02, percent-clipped=0.0 2023-10-03 13:06:48,221 WARNING [train.py:1204] (2/4) Exclude cut with ID _2941_210_2577_1_1526199673150_2489759_228-2961-0_sp1.1 from training. Duration: 0.97275 2023-10-03 13:06:48,543 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.conv_module1.balancer2.prob, batch_count=1277313.3333333333, ans=0.125 2023-10-03 13:06:49,621 WARNING [train.py:1204] (2/4) Exclude cut with ID _28323_210_16187_1_1532480395667_4100300_531-24157-0 from training. Duration: 0.98 2023-10-03 13:06:49,903 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.bypass.scale_min, batch_count=1277313.3333333333, ans=0.2 2023-10-03 13:06:51,131 WARNING [train.py:1204] (2/4) Exclude cut with ID _32704_210_8954_1_1533017488840_6747370_266-329755-0_sp0.9 from training. Duration: 0.988875 2023-10-03 13:06:52,478 WARNING [train.py:1204] (2/4) Exclude cut with ID _6653_210_2629_1_1527919172441_6732679_1038-146421-0_sp1.1 from training. Duration: 0.97275 2023-10-03 13:06:53,945 WARNING [train.py:1204] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_391-22148-0_sp0.9 from training. Duration: 0.8555625 2023-10-03 13:06:55,872 WARNING [train.py:1204] (2/4) Exclude cut with ID _64521_210_14699_1_1535243427259_3815328_543-341117-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 13:06:55,884 WARNING [train.py:1204] (2/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_52-168712-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 13:06:55,928 WARNING [train.py:1204] (2/4) Exclude cut with ID _33920_210_4220_1_1533257851500_3084399_198-334063-0_sp1.1 from training. Duration: 0.8545625 2023-10-03 13:06:55,941 WARNING [train.py:1204] (2/4) Exclude cut with ID _50093_210_4220_1_1534417141207_3432299_69-111987-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 13:06:55,976 WARNING [train.py:1204] (2/4) Exclude cut with ID _14821_210_8750_1_1530680503991_6829359_189-297451-0_sp0.9 from training. Duration: 0.611125 2023-10-03 13:06:57,414 WARNING [train.py:1204] (2/4) Exclude cut with ID _8030_210_7497_1_1528444886781_3531089_262-86750-0_sp0.9 from training. Duration: 0.9 2023-10-03 13:06:57,419 WARNING [train.py:1204] (2/4) Exclude cut with ID _24612_210_8913_1_1531998054193_3806359_294-191196-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 13:07:06,498 WARNING [train.py:1204] (2/4) Exclude cut with ID _16909_210_5724_1_1531292079912_4118919_707-342896-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 13:07:06,514 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_9460_210_4211_1_1528954587_10049667_572-37035-0_sp0.9 from training. Duration: 0.888875 2023-10-03 13:07:06,567 WARNING [train.py:1204] (2/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_762-109886-0_sp1.1 from training. Duration: 0.82725 2023-10-03 13:07:06,590 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_14953_210_2151_1_1530701710_5616165_519-326393-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 13:07:11,977 WARNING [train.py:1204] (2/4) Exclude cut with ID _25914_210_6400_1_1532141317058_644740_52-96808-0 from training. Duration: 0.91 2023-10-03 13:07:11,981 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_68095_210_20185_1_1535718226_8888855_1079-94525-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 13:07:14,820 WARNING [train.py:1204] (2/4) Exclude cut with ID _14860_210_4915_1_1530702020340_7343240_746-3647-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 13:07:14,821 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_70379_210_7925_1_1535941455_557455_49-101720-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 13:07:14,835 WARNING [train.py:1204] (2/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_8-307291-0_sp1.1 from training. Duration: 0.936375 2023-10-03 13:07:16,645 WARNING [train.py:1204] (2/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_117-290916-0 from training. Duration: 0.51 2023-10-03 13:07:18,147 WARNING [train.py:1204] (2/4) Exclude cut with ID _22020_210_2461_1_1531724164513_6326900_944-339840-0_sp1.1 from training. Duration: 0.963625 2023-10-03 13:07:19,697 WARNING [train.py:1204] (2/4) Exclude cut with ID IC0515W0043-254911 from training. Duration: 0.976 2023-10-03 13:07:21,068 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_3910_210_1794_1_1526742306_334530_28-238909-0 from training. Duration: 0.81 2023-10-03 13:07:21,092 WARNING [train.py:1204] (2/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_269-267275-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 13:07:24,435 WARNING [train.py:1204] (2/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_72-251255-0_sp1.1 from training. Duration: 0.8545625 2023-10-03 13:07:24,446 WARNING [train.py:1204] (2/4) Exclude cut with ID _14886_210_4220_1_1530702020355_3516190_175-229073-0 from training. Duration: 0.45 2023-10-03 13:07:25,984 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.self_attn_weights.pos_emb_skip_rate, batch_count=1277513.3333333333, ans=0.0 2023-10-03 13:07:28,420 WARNING [train.py:1204] (2/4) Exclude cut with ID _40519_210_13794_1_1533715228655_7337390_921-209452-0_sp1.1 from training. Duration: 0.963625 2023-10-03 13:07:30,466 WARNING [train.py:1204] (2/4) Exclude cut with ID _29857_210_20034_1_1532395886753_3798160_335-163562-0_sp0.9 from training. Duration: 0.9444375 2023-10-03 13:07:32,360 WARNING [train.py:1204] (2/4) Exclude cut with ID _58583_210_5236_1_1534726835266_3574249_384-58445-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 13:07:32,424 WARNING [train.py:1204] (2/4) Exclude cut with ID _44011_210_2046_1_1533722401691_3631310_96-315656-0_sp1.1 from training. Duration: 0.963625 2023-10-03 13:07:32,428 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_68484_210_1794_1_1535849804_7678097_549-170613-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 13:07:35,234 WARNING [train.py:1204] (2/4) Exclude cut with ID _37892_210_19596_1_1533198677897_3964058_214-148058-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 13:07:39,250 WARNING [train.py:1204] (2/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_415-78792-0_sp0.9 from training. Duration: 0.9 2023-10-03 13:07:40,657 WARNING [train.py:1204] (2/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_383-205833-0_sp1.1 from training. Duration: 0.863625 2023-10-03 13:07:41,931 INFO [train.py:1046] (2/4) Epoch 37, batch 400, loss[loss=0.1604, simple_loss=0.2542, pruned_loss=0.03326, over 24322.00 frames. ], tot_loss[loss=0.1584, simple_loss=0.2376, pruned_loss=0.03957, over 4070360.86 frames. ], batch size: 74, lr: 2.77e-03, grad_scale: 32.0 2023-10-03 13:07:42,008 WARNING [train.py:1204] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_167-137255-0 from training. Duration: 0.64 2023-10-03 13:07:42,015 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8275_210_4198_1_1528455320_3892571_484-154786-0_sp1.1 from training. Duration: 0.963625 2023-10-03 13:07:42,058 WARNING [train.py:1204] (2/4) Exclude cut with ID _40284_210_2084_1_1533513679814_3704279_396-319412-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 13:07:43,484 WARNING [train.py:1204] (2/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_280-120942-0_sp0.9 from training. Duration: 0.8555625 2023-10-03 13:07:44,750 WARNING [train.py:1204] (2/4) Exclude cut with ID _45630_210_10556_1_1533949174498_3902049_558-240889-0_sp1.1 from training. Duration: 0.92725 2023-10-03 13:07:47,956 WARNING [train.py:1204] (2/4) Exclude cut with ID _56235_210_10640_1_1534557335235_4161099_197-307458-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 13:07:49,203 WARNING [train.py:1204] (2/4) Exclude cut with ID _16909_210_5724_1_1531292079912_4118919_163-254843-0_sp1.1 from training. Duration: 0.92725 2023-10-03 13:07:50,587 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_103-84010-0 from training. Duration: 0.69 2023-10-03 13:07:52,042 WARNING [train.py:1204] (2/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_401-158239-0 from training. Duration: 0.71 2023-10-03 13:07:52,043 WARNING [train.py:1204] (2/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_899-233658-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 13:07:52,282 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.attention_skip_rate, batch_count=1277580.0, ans=0.0 2023-10-03 13:07:53,556 WARNING [train.py:1204] (2/4) Exclude cut with ID _23825_210_13449_1_1531915313526_3497150_240-271367-0 from training. Duration: 0.97 2023-10-03 13:07:53,592 WARNING [train.py:1204] (2/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_424-134619-0_sp1.1 from training. Duration: 0.92725 2023-10-03 13:07:59,679 WARNING [train.py:1204] (2/4) Exclude cut with ID _44831_210_13427_1_1533816011549_3689035_32-188147-0_sp1.1 from training. Duration: 0.87275 2023-10-03 13:07:59,682 WARNING [train.py:1204] (2/4) Exclude cut with ID _59170_210_6721_1_1534983633960_4203335_314-63879-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 13:07:59,697 WARNING [train.py:1204] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_602-275260-0 from training. Duration: 0.82 2023-10-03 13:07:59,728 WARNING [train.py:1204] (2/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_511-306767-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 13:07:59,765 WARNING [train.py:1204] (2/4) Exclude cut with ID _41817_210_18835_1_1533434477160_7423049_565-201775-0_sp1.1 from training. Duration: 0.92725 2023-10-03 13:07:59,783 WARNING [train.py:1204] (2/4) Exclude cut with ID _28317_210_12742_1_1532480302381_8510325_778-173151-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 13:08:01,820 WARNING [train.py:1204] (2/4) Exclude cut with ID _14998_210_5219_1_1530705582080_7474970_680-51001-0_sp1.1 from training. Duration: 0.963625 2023-10-03 13:08:03,795 WARNING [train.py:1204] (2/4) Exclude cut with ID IC0677W0059-334891 from training. Duration: 0.896 2023-10-03 13:08:03,842 WARNING [train.py:1204] (2/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_370-331600-0 from training. Duration: 0.94 2023-10-03 13:08:08,156 WARNING [train.py:1204] (2/4) Exclude cut with ID _66840_210_5219_1_1535452168426_4196349_446-240192-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 13:08:09,480 WARNING [train.py:1204] (2/4) Exclude cut with ID _65242_210_18628_1_1535369393717_3823149_38-6732-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 13:08:09,573 WARNING [train.py:1204] (2/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_302-2550-0 from training. Duration: 0.81 2023-10-03 13:08:09,670 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_43802_210_2290_1_1533516203_8829504_100-348441-0 from training. Duration: 0.91 2023-10-03 13:08:12,306 WARNING [train.py:1204] (2/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_329-285801-0_sp1.1 from training. Duration: 0.82725 2023-10-03 13:08:13,389 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.0.layers.0.feed_forward1.out_whiten, num_groups=1, num_channels=192, metric=9.84 vs. limit=15.0 2023-10-03 13:08:13,819 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_30167_210_3528_1_1532428012_4772375_343-334712-0_sp1.1 from training. Duration: 0.936375 2023-10-03 13:08:22,301 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7543_210_6753_1_1528543318_4552_1-85072-0 from training. Duration: 0.83 2023-10-03 13:08:22,529 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.balancer2.prob, batch_count=1277713.3333333333, ans=0.125 2023-10-03 13:08:25,066 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7002_210_1794_1_1527935634_4034444_487-236501-0_sp1.1 from training. Duration: 0.6 2023-10-03 13:08:25,161 WARNING [train.py:1204] (2/4) Exclude cut with ID _7160_210_1876_1_1527940923231_3703799_282-322611-0 from training. Duration: 0.87 2023-10-03 13:08:27,904 WARNING [train.py:1204] (2/4) Exclude cut with ID _67335_210_5219_1_1535525975652_4275229_570-262983-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 13:08:28,235 INFO [scaling.py:1118] (2/4) WithLoss: name=encoder.encoders.3.encoder.layers.3.self_attn_weights, loss-sum=0.000e+00 2023-10-03 13:08:28,739 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.2.encoder.layers.1.feed_forward2.out_whiten, num_groups=1, num_channels=384, metric=4.63 vs. limit=15.0 2023-10-03 13:08:29,869 WARNING [train.py:1204] (2/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_625-338546-0_sp1.1 from training. Duration: 0.8 2023-10-03 13:08:29,900 WARNING [train.py:1204] (2/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_176-56120-0 from training. Duration: 0.83 2023-10-03 13:08:34,468 WARNING [train.py:1204] (2/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_963-187109-0_sp1.1 from training. Duration: 0.9 2023-10-03 13:08:35,974 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.attention_skip_rate, batch_count=1277780.0, ans=0.0 2023-10-03 13:08:37,205 WARNING [train.py:1204] (2/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_121-284152-0_sp0.9 from training. Duration: 0.6555625 2023-10-03 13:08:39,869 WARNING [train.py:1204] (2/4) Exclude cut with ID _45021_210_9994_1_1533619751094_3330288_104-81711-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 13:08:41,258 WARNING [train.py:1204] (2/4) Exclude cut with ID _51382_210_23862_1_1534492801838_3650104_129-343914-0_sp1.1 from training. Duration: 0.936375 2023-10-03 13:08:41,283 WARNING [train.py:1204] (2/4) Exclude cut with ID _14970_210_15709_1_1530775547361_1966965_8-169924-0 from training. Duration: 0.92 2023-10-03 13:08:44,010 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_2295_210_2087_1_1525500993_2407863_234-83104-0_sp1.1 from training. Duration: 0.563625 2023-10-03 13:08:45,339 WARNING [train.py:1204] (2/4) Exclude cut with ID _14687_210_5854_1_1530703838259_3664889_115-239046-0 from training. Duration: 0.67 2023-10-03 13:08:47,414 WARNING [train.py:1204] (2/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_690-126306-0_sp1.1 from training. Duration: 0.7090625 2023-10-03 13:08:47,429 WARNING [train.py:1204] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_625-102955-0_sp0.9 from training. Duration: 0.8555625 2023-10-03 13:08:48,961 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.conv_module2.balancer2.prob, batch_count=1277846.6666666667, ans=0.125 2023-10-03 13:08:51,542 WARNING [train.py:1204] (2/4) Exclude cut with ID _9389_210_1664_1_1529217337301_5289269_289-213417-0 from training. Duration: 0.89 2023-10-03 13:08:52,951 WARNING [train.py:1204] (2/4) Exclude cut with ID _13253_210_1925_1_1530757775819_3577873_191-144977-0_sp0.9 from training. Duration: 0.8666875 2023-10-03 13:08:54,313 WARNING [train.py:1204] (2/4) Exclude cut with ID _21783_210_11357_1_1531742458730_3762679_325-155703-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 13:08:54,346 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_2295_210_2087_1_1525500993_2407863_116-238894-0_sp0.9 from training. Duration: 0.77775 2023-10-03 13:08:55,620 INFO [train.py:1046] (2/4) Epoch 37, batch 450, loss[loss=0.1426, simple_loss=0.2189, pruned_loss=0.03314, over 24477.00 frames. ], tot_loss[loss=0.1595, simple_loss=0.2387, pruned_loss=0.04016, over 4206881.55 frames. ], batch size: 58, lr: 2.77e-03, grad_scale: 32.0 2023-10-03 13:08:55,767 WARNING [train.py:1204] (2/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_94-126128-0 from training. Duration: 0.86 2023-10-03 13:08:55,783 WARNING [train.py:1204] (2/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_395-310220-0_sp0.9 from training. Duration: 0.988875 2023-10-03 13:08:57,128 WARNING [train.py:1204] (2/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_821-110852-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 13:08:57,193 WARNING [train.py:1204] (2/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_64-248359-0_sp1.1 from training. Duration: 0.62725 2023-10-03 13:08:57,204 WARNING [train.py:1204] (2/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_138-187031-0 from training. Duration: 0.99 2023-10-03 13:08:58,690 WARNING [train.py:1204] (2/4) Exclude cut with ID _4122_210_2084_1_1526713164417_4353280_528-206771-0_sp0.9 from training. Duration: 0.97775 2023-10-03 13:08:58,986 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.feed_forward2.hidden_balancer.prob, batch_count=1277913.3333333333, ans=0.125 2023-10-03 13:09:00,103 WARNING [train.py:1204] (2/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_844-96989-0_sp1.1 from training. Duration: 0.6454375 2023-10-03 13:09:01,596 WARNING [train.py:1204] (2/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_106-30782-0_sp1.1 from training. Duration: 0.72725 2023-10-03 13:09:09,248 WARNING [train.py:1204] (2/4) Exclude cut with ID _14683_210_5219_1_1530698352608_3974859_281-250040-0_sp1.1 from training. Duration: 0.936375 2023-10-03 13:09:10,489 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_46013_210_7234_1_1533776362_3744032_463-52378-0_sp0.9 from training. Duration: 0.9555625 2023-10-03 13:09:11,852 WARNING [train.py:1204] (2/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_299-34413-0 from training. Duration: 0.65 2023-10-03 13:09:11,921 WARNING [train.py:1204] (2/4) Exclude cut with ID _35799_210_5854_1_1533263487887_3529071_490-326847-0 from training. Duration: 0.99 2023-10-03 13:09:15,841 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.670e+02 1.823e+02 2.027e+02 2.290e+02 3.468e+02, threshold=4.055e+02, percent-clipped=0.0 2023-10-03 13:09:17,120 WARNING [train.py:1204] (2/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_589-9070-0_sp0.9 from training. Duration: 0.788875 2023-10-03 13:09:20,443 WARNING [train.py:1204] (2/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_884-29515-0_sp1.1 from training. Duration: 0.936375 2023-10-03 13:09:21,856 WARNING [train.py:1204] (2/4) Exclude cut with ID _15012_210_1968_1_1530856820429_5095350_474-119995-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 13:09:27,348 WARNING [train.py:1204] (2/4) Exclude cut with ID _65585_210_12210_1_1535450451584_3764098_211-97124-0_sp1.1 from training. Duration: 0.963625 2023-10-03 13:09:27,411 WARNING [train.py:1204] (2/4) Exclude cut with ID _33640_210_2655_1_1533117605479_4855469_666-224463-0_sp1.1 from training. Duration: 0.963625 2023-10-03 13:09:27,625 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.ff2_skip_rate, batch_count=1278046.6666666667, ans=0.0 2023-10-03 13:09:28,807 WARNING [train.py:1204] (2/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_35-153886-0 from training. Duration: 0.84 2023-10-03 13:09:30,159 WARNING [train.py:1204] (2/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_210-106047-0 from training. Duration: 0.97 2023-10-03 13:09:31,615 WARNING [train.py:1204] (2/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_479-125611-0 from training. Duration: 0.96 2023-10-03 13:09:31,648 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_9417_210_1794_1_1529235261_4614131_427-153963-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 13:09:32,979 WARNING [train.py:1204] (2/4) Exclude cut with ID _23818_210_6228_1_1531917063884_3676920_133-170571-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 13:09:33,048 WARNING [train.py:1204] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_602-275260-0_sp1.1 from training. Duration: 0.7454375 2023-10-03 13:09:34,972 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9029W0121-509521 from training. Duration: 0.896 2023-10-03 13:09:34,988 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9029W0434-509821 from training. Duration: 0.64 2023-10-03 13:09:36,592 WARNING [train.py:1204] (2/4) Exclude cut with ID _23038_210_9570_1_1532001584158_3652419_289-307034-0_sp1.1 from training. Duration: 0.936375 2023-10-03 13:09:37,938 WARNING [train.py:1204] (2/4) Exclude cut with ID _29345_210_4381_1_1532689168263_3860169_149-257268-0_sp0.9 from training. Duration: 0.9 2023-10-03 13:09:39,295 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_405-339986-0_sp1.1 from training. Duration: 0.463625 2023-10-03 13:09:42,105 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_2655_210_2087_1_1525688959_3897400_437-337690-0_sp1.1 from training. Duration: 0.57275 2023-10-03 13:09:42,142 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7543_210_6753_1_1528541907_1399322_165-323619-0_sp1.1 from training. Duration: 0.62725 2023-10-03 13:09:43,598 WARNING [train.py:1204] (2/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_508-335369-0_sp1.1 from training. Duration: 0.436375 2023-10-03 13:09:44,905 WARNING [train.py:1204] (2/4) Exclude cut with ID _7852_210_5219_1_1528286471316_8139400_358-287492-0 from training. Duration: 0.96 2023-10-03 13:09:47,600 WARNING [train.py:1204] (2/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_322-217220-0_sp0.9 from training. Duration: 0.9555625 2023-10-03 13:09:49,154 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_3171_210_2087_1_1526109084_2330007_66-260583-0_sp0.9 from training. Duration: 0.8 2023-10-03 13:09:50,427 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8558_210_1410_1_1528505225_7613406_131-329524-0_sp1.1 from training. Duration: 0.7181875 2023-10-03 13:09:52,478 WARNING [train.py:1204] (2/4) Exclude cut with ID _9235_210_5219_1_1528891154253_8046120_30-326681-0 from training. Duration: 0.83 2023-10-03 13:09:55,281 WARNING [train.py:1204] (2/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_58-68956-0_sp0.9 from training. Duration: 0.92225 2023-10-03 13:09:56,622 WARNING [train.py:1204] (2/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_255-312654-0 from training. Duration: 0.91 2023-10-03 13:09:57,964 WARNING [train.py:1204] (2/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_99-167392-0 from training. Duration: 0.61 2023-10-03 13:09:59,336 WARNING [train.py:1204] (2/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_94-126128-0_sp0.9 from training. Duration: 0.9555625 2023-10-03 13:10:05,136 WARNING [train.py:1204] (2/4) Exclude cut with ID _68635_210_9317_1_1535770813410_4315870_187-192817-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 13:10:06,526 WARNING [train.py:1204] (2/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_277-204970-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 13:10:06,665 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder_embed.convnext.out_balancer.prob, batch_count=1278246.6666666667, ans=0.125 2023-10-03 13:10:08,434 INFO [train.py:1046] (2/4) Epoch 37, batch 500, loss[loss=0.1707, simple_loss=0.253, pruned_loss=0.04423, over 24402.00 frames. ], tot_loss[loss=0.1596, simple_loss=0.2393, pruned_loss=0.03996, over 4323717.12 frames. ], batch size: 77, lr: 2.77e-03, grad_scale: 32.0 2023-10-03 13:10:08,506 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6163_210_6400_1_1527506746_4263068_283-137811-0_sp0.9 from training. Duration: 0.8555625 2023-10-03 13:10:08,527 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9144W0121-566446 from training. Duration: 0.896 2023-10-03 13:10:12,481 WARNING [train.py:1204] (2/4) Exclude cut with ID _49371_210_12071_1_1533952761021_3812190_351-315519-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 13:10:12,550 WARNING [train.py:1204] (2/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_115-326778-0_sp1.1 from training. Duration: 0.8454375 2023-10-03 13:10:12,605 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_17292_210_6753_1_1531125388_2943734_436-198965-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 13:10:12,614 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9165W0117-576751 from training. Duration: 0.896 2023-10-03 13:10:15,434 WARNING [train.py:1204] (2/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_212-149947-0 from training. Duration: 0.73 2023-10-03 13:10:15,442 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_13757_210_3528_1_1530440682_4107239_65-167544-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 13:10:18,246 WARNING [train.py:1204] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_700-183549-0_sp0.9 from training. Duration: 0.7555625 2023-10-03 13:10:22,345 WARNING [train.py:1204] (2/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_216-343007-0_sp0.9 from training. Duration: 0.7333125 2023-10-03 13:10:23,747 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7544_210_6753_1_1528587722_3576686_295-258479-0_sp1.1 from training. Duration: 0.763625 2023-10-03 13:10:27,025 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_365-287106-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 13:10:27,038 WARNING [train.py:1204] (2/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_300-3747-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 13:10:27,093 WARNING [train.py:1204] (2/4) Exclude cut with ID _14069_210_5632_1_1530512692033_4803390_487-303201-0_sp1.1 from training. Duration: 0.92725 2023-10-03 13:10:34,231 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.nonlin_attention.balancer.prob, batch_count=1278313.3333333333, ans=0.125 2023-10-03 13:10:35,369 WARNING [train.py:1204] (2/4) Exclude cut with ID _14459_210_2228_1_1531443641946_7516700_668-123026-0_sp1.1 from training. Duration: 0.936375 2023-10-03 13:10:35,405 WARNING [train.py:1204] (2/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_108-53929-0_sp1.1 from training. Duration: 0.536375 2023-10-03 13:10:37,126 WARNING [train.py:1204] (2/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_266-137104-0_sp0.9 from training. Duration: 0.72225 2023-10-03 13:10:37,169 WARNING [train.py:1204] (2/4) Exclude cut with ID _12302_210_5219_1_1529924446466_8107280_902-168619-0_sp1.1 from training. Duration: 0.936375 2023-10-03 13:10:37,195 WARNING [train.py:1204] (2/4) Exclude cut with ID _59502_210_12210_1_1535107024698_1835139_146-162495-0 from training. Duration: 0.92 2023-10-03 13:10:37,210 WARNING [train.py:1204] (2/4) Exclude cut with ID _3465_210_3525_1_1526175667060_3574049_412-287526-0_sp1.1 from training. Duration: 0.6545625 2023-10-03 13:10:41,075 WARNING [train.py:1204] (2/4) Exclude cut with ID _7160_210_1876_1_1527940923231_3703799_120-150943-0_sp0.9 from training. Duration: 0.9 2023-10-03 13:10:42,458 WARNING [train.py:1204] (2/4) Exclude cut with ID _15037_210_2253_1_1530752294104_3267650_296-192597-0_sp1.1 from training. Duration: 0.736375 2023-10-03 13:10:42,486 WARNING [train.py:1204] (2/4) Exclude cut with ID _13252_210_1925_1_1530671372047_3336909_397-216497-0_sp1.1 from training. Duration: 0.87275 2023-10-03 13:10:42,509 WARNING [train.py:1204] (2/4) Exclude cut with ID _40094_210_16380_1_1533294005773_3641159_76-269320-0_sp1.1 from training. Duration: 0.936375 2023-10-03 13:10:43,841 WARNING [train.py:1204] (2/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_449-15508-0 from training. Duration: 0.82 2023-10-03 13:10:45,215 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9309W0194-640321 from training. Duration: 0.64 2023-10-03 13:10:47,912 WARNING [train.py:1204] (2/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_284-312178-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 13:10:49,303 WARNING [train.py:1204] (2/4) Exclude cut with ID _29853_210_7497_1_1532505600851_6642110_401-55937-0_sp1.1 from training. Duration: 0.92725 2023-10-03 13:10:50,538 WARNING [train.py:1204] (2/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_716-24918-0_sp1.1 from training. Duration: 0.92725 2023-10-03 13:10:50,558 WARNING [train.py:1204] (2/4) Exclude cut with ID _19637_210_4220_1_1531537167676_3205060_314-305638-0_sp1.1 from training. Duration: 0.92725 2023-10-03 13:10:50,598 WARNING [train.py:1204] (2/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_475-127455-0_sp1.1 from training. Duration: 0.636375 2023-10-03 13:10:53,349 WARNING [train.py:1204] (2/4) Exclude cut with ID _5444_210_2151_1_1527418808464_5312210_581-315411-0 from training. Duration: 0.62 2023-10-03 13:10:56,677 WARNING [train.py:1204] (2/4) Exclude cut with ID _44902_210_16187_1_1533888006062_3792078_149-67212-0_sp0.9 from training. Duration: 0.9666875 2023-10-03 13:10:56,752 WARNING [train.py:1204] (2/4) Exclude cut with ID _31602_210_5854_1_1532677021456_4458250_63-63253-0_sp1.1 from training. Duration: 0.963625 2023-10-03 13:10:59,709 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.bypass.scale_min, batch_count=1278446.6666666667, ans=0.2 2023-10-03 13:11:00,879 WARNING [train.py:1204] (2/4) Exclude cut with ID _55209_210_1531_1_1534675828996_4597160_660-203992-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 13:11:03,617 WARNING [train.py:1204] (2/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_83-190624-0_sp1.1 from training. Duration: 0.92725 2023-10-03 13:11:08,419 WARNING [train.py:1204] (2/4) Exclude cut with ID _58605_210_12210_1_1534723195939_3904259_154-139635-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 13:11:11,649 WARNING [train.py:1204] (2/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_164-224052-0 from training. Duration: 0.7 2023-10-03 13:11:11,665 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_1126-189071-0_sp1.1 from training. Duration: 0.963625 2023-10-03 13:11:11,682 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_15396_210_6890_1_1531308473_1938936_310-214789-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 13:11:15,695 WARNING [train.py:1204] (2/4) Exclude cut with ID _8395_210_1531_1_1528597871523_4411579_431-132470-0 from training. Duration: 0.74 2023-10-03 13:11:15,751 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_3780_210_2087_1_1526467760_2130483_170-259044-0_sp0.9 from training. Duration: 0.77775 2023-10-03 13:11:17,208 WARNING [train.py:1204] (2/4) Exclude cut with ID _12733_210_8341_1_1530167424556_3646380_537-230640-0_sp1.1 from training. Duration: 0.963625 2023-10-03 13:11:20,170 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.feed_forward3.hidden_balancer.prob, batch_count=1278580.0, ans=0.125 2023-10-03 13:11:21,322 INFO [train.py:1046] (2/4) Epoch 37, batch 550, loss[loss=0.1471, simple_loss=0.2231, pruned_loss=0.03557, over 24317.00 frames. ], tot_loss[loss=0.16, simple_loss=0.2399, pruned_loss=0.04003, over 4423954.91 frames. ], batch size: 56, lr: 2.77e-03, grad_scale: 32.0 2023-10-03 13:11:22,748 WARNING [train.py:1204] (2/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_159-281310-0 from training. Duration: 0.95 2023-10-03 13:11:24,155 WARNING [train.py:1204] (2/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_434-83000-0 from training. Duration: 0.98 2023-10-03 13:11:24,173 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_4405_210_1849_1_1526732205_3593939_493-32464-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 13:11:24,182 WARNING [train.py:1204] (2/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_342-100157-0 from training. Duration: 0.54 2023-10-03 13:11:26,109 WARNING [train.py:1204] (2/4) Exclude cut with ID _44778_210_2054_1_1533798127203_7443090_765-250133-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 13:11:26,113 WARNING [train.py:1204] (2/4) Exclude cut with ID _38670_210_15379_1_1533448810659_4391059_81-312245-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 13:11:26,185 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_50050_210_16223_1_1535435796_7290138_1273-247611-0_sp1.1 from training. Duration: 0.936375 2023-10-03 13:11:26,237 WARNING [train.py:1204] (2/4) Exclude cut with ID _44831_210_13427_1_1533816011549_3689035_399-18591-0_sp1.1 from training. Duration: 0.936375 2023-10-03 13:11:26,260 WARNING [train.py:1204] (2/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_557-324258-0_sp0.9 from training. Duration: 0.9 2023-10-03 13:11:28,919 WARNING [train.py:1204] (2/4) Exclude cut with ID _3465_210_3525_1_1526175667060_3574049_62-335552-0_sp1.1 from training. Duration: 0.7909375 2023-10-03 13:11:29,073 WARNING [train.py:1204] (2/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_673-264125-0_sp1.1 from training. Duration: 0.963625 2023-10-03 13:11:30,325 WARNING [train.py:1204] (2/4) Exclude cut with ID _8030_210_7497_1_1528444886781_3531089_262-86750-0 from training. Duration: 0.81 2023-10-03 13:11:30,338 WARNING [train.py:1204] (2/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_444-28041-0_sp1.1 from training. Duration: 0.77275 2023-10-03 13:11:32,473 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.2.encoder.layers.1.feed_forward3.out_whiten, num_groups=1, num_channels=384, metric=7.52 vs. limit=15.0 2023-10-03 13:11:35,005 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_34008_210_10431_1_1533345424_8183247_516-14858-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 13:11:36,810 WARNING [train.py:1204] (2/4) Exclude cut with ID _32704_210_8954_1_1533017488840_6747370_54-248218-0_sp1.1 from training. Duration: 0.936375 2023-10-03 13:11:36,966 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.1.nonlin_attention.balancer.min_positive, batch_count=1278646.6666666667, ans=0.05 2023-10-03 13:11:41,554 WARNING [train.py:1204] (2/4) Exclude cut with ID _13253_210_1925_1_1530757775819_3577873_300-119782-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 13:11:41,634 WARNING [train.py:1204] (2/4) Exclude cut with ID _39290_210_4220_1_1533862778760_3075118_21-240703-0_sp1.1 from training. Duration: 0.936375 2023-10-03 13:11:42,344 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.3.encoder.layers.1.self_attn1.whiten, num_groups=1, num_channels=512, metric=13.91 vs. limit=22.5 2023-10-03 13:11:42,924 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.614e+02 1.845e+02 2.021e+02 2.216e+02 2.763e+02, threshold=4.043e+02, percent-clipped=0.0 2023-10-03 13:11:45,873 WARNING [train.py:1204] (2/4) Exclude cut with ID 774-127930-0014-10412-0_sp1.1 from training. Duration: 0.95 2023-10-03 13:11:47,211 WARNING [train.py:1204] (2/4) Exclude cut with ID _67499_210_17282_1_1535509805220_3692430_20-179346-0 from training. Duration: 0.97 2023-10-03 13:11:47,391 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.1.feed_forward1.out_proj.dropout_p, batch_count=1278646.6666666667, ans=0.1 2023-10-03 13:11:47,436 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.attention_skip_rate, batch_count=1278646.6666666667, ans=0.0 2023-10-03 13:11:48,571 WARNING [train.py:1204] (2/4) Exclude cut with ID _8365_210_2064_1_1528592311070_4677690_431-294102-0_sp0.9 from training. Duration: 0.988875 2023-10-03 13:11:51,539 WARNING [train.py:1204] (2/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_365-1492-0_sp1.1 from training. Duration: 0.82725 2023-10-03 13:11:52,820 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_105-114797-0_sp0.9 from training. Duration: 0.9333125 2023-10-03 13:11:54,283 WARNING [train.py:1204] (2/4) Exclude cut with ID _6274_210_2084_1_1527937583837_7494020_769-168302-0_sp0.9 from training. Duration: 0.788875 2023-10-03 13:11:57,587 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_40340_210_9024_1_1533433916_4132944_320-270277-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 13:11:57,592 WARNING [train.py:1204] (2/4) Exclude cut with ID ID0203W0003-781711 from training. Duration: 0.958 2023-10-03 13:11:58,911 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_30434_210_13049_1_1532519384_981746_52-12932-0_sp1.1 from training. Duration: 0.936375 2023-10-03 13:11:58,997 WARNING [train.py:1204] (2/4) Exclude cut with ID _8263_210_1664_1_1528438953494_294100_40-204731-0_sp0.9 from training. Duration: 0.5555625 2023-10-03 13:12:00,639 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.balancer1.prob, batch_count=1278713.3333333333, ans=0.125 2023-10-03 13:12:01,737 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1032-236863-0_sp0.9 from training. Duration: 0.9333125 2023-10-03 13:12:03,017 WARNING [train.py:1204] (2/4) Exclude cut with ID _8394_210_6702_1_1528590504316_3540189_335-274780-0_sp0.9 from training. Duration: 0.8666875 2023-10-03 13:12:03,032 WARNING [train.py:1204] (2/4) Exclude cut with ID _66873_210_5517_1_1535454136506_4293370_654-52713-0_sp1.1 from training. Duration: 0.7 2023-10-03 13:12:04,411 WARNING [train.py:1204] (2/4) Exclude cut with ID _5250_210_5219_1_1527249561806_8692110_742-267856-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 13:12:05,800 WARNING [train.py:1204] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_74-48717-0 from training. Duration: 0.58 2023-10-03 13:12:05,885 WARNING [train.py:1204] (2/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_18-46756-0 from training. Duration: 0.79 2023-10-03 13:12:07,859 WARNING [train.py:1204] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_612-56061-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 13:12:07,867 WARNING [train.py:1204] (2/4) Exclude cut with ID _56423_210_5854_1_1534896061049_3165969_412-2403-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 13:12:09,167 WARNING [train.py:1204] (2/4) Exclude cut with ID _62574_210_2655_1_1535180407058_3933249_120-196848-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 13:12:09,167 WARNING [train.py:1204] (2/4) Exclude cut with ID _40519_210_13794_1_1533715228655_7337390_916-51000-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 13:12:10,739 WARNING [train.py:1204] (2/4) Exclude cut with ID _65263_210_22356_1_1535763598691_3921037_108-21902-0_sp1.1 from training. Duration: 0.9 2023-10-03 13:12:12,668 WARNING [train.py:1204] (2/4) Exclude cut with ID _4122_210_2084_1_1526713164417_4353280_528-206771-0_sp1.1 from training. Duration: 0.8 2023-10-03 13:12:15,415 WARNING [train.py:1204] (2/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_43-325657-0_sp1.1 from training. Duration: 0.92725 2023-10-03 13:12:16,705 WARNING [train.py:1204] (2/4) Exclude cut with ID _44659_210_2721_1_1534036120061_6614860_314-82041-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 13:12:16,761 WARNING [train.py:1204] (2/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_81-300212-0_sp0.9 from training. Duration: 0.6666875 2023-10-03 13:12:18,159 WARNING [train.py:1204] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_665-196155-0_sp0.9 from training. Duration: 0.7444375 2023-10-03 13:12:19,614 WARNING [train.py:1204] (2/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_140-336881-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 13:12:19,689 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6290_210_3273_1_1527595224_1282787_115-87095-0_sp0.9 from training. Duration: 0.611125 2023-10-03 13:12:21,108 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_53373_210_5399_1_1534229471_4715695_607-259685-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 13:12:21,219 WARNING [train.py:1204] (2/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_175-163894-0_sp1.1 from training. Duration: 0.67275 2023-10-03 13:12:21,239 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7002_210_1794_1_1527939683_5157286_1-281264-0_sp1.1 from training. Duration: 0.4 2023-10-03 13:12:28,630 WARNING [train.py:1204] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_605-257205-0 from training. Duration: 0.59 2023-10-03 13:12:31,670 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.bypass_mid.scale_min, batch_count=1278846.6666666667, ans=0.2 2023-10-03 13:12:32,706 WARNING [train.py:1204] (2/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_454-314901-0 from training. Duration: 0.46 2023-10-03 13:12:32,882 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.bypass_mid.scale_min, batch_count=1278846.6666666667, ans=0.2 2023-10-03 13:12:34,133 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_46032_210_14656_1_1533781412_4507969_287-130422-0_sp1.1 from training. Duration: 0.97275 2023-10-03 13:12:34,149 WARNING [train.py:1204] (2/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_186-309161-0_sp1.1 from training. Duration: 0.4909375 2023-10-03 13:12:35,424 INFO [train.py:1046] (2/4) Epoch 37, batch 600, loss[loss=0.1608, simple_loss=0.2375, pruned_loss=0.04207, over 23318.00 frames. ], tot_loss[loss=0.1604, simple_loss=0.2399, pruned_loss=0.04043, over 4475467.13 frames. ], batch size: 119, lr: 2.77e-03, grad_scale: 16.0 2023-10-03 13:12:35,490 WARNING [train.py:1204] (2/4) Exclude cut with ID _42659_210_2070_1_1533607442765_3120380_45-342307-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 13:12:42,294 WARNING [train.py:1204] (2/4) Exclude cut with ID _12555_210_8750_1_1530075594402_3780239_478-326818-0_sp1.1 from training. Duration: 0.963625 2023-10-03 13:12:43,640 WARNING [train.py:1204] (2/4) Exclude cut with ID _7606_210_4915_1_1528894253277_5080690_127-177761-0_sp1.1 from training. Duration: 0.5818125 2023-10-03 13:12:43,743 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder_embed.conv.2.prob, batch_count=1278913.3333333333, ans=0.125 2023-10-03 13:12:45,000 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6788_210_1388_1_1527898698_4562021_91-197981-0 from training. Duration: 0.83 2023-10-03 13:12:47,730 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8558_210_1410_1_1528505225_7613406_131-329524-0_sp0.9 from training. Duration: 0.87775 2023-10-03 13:12:47,855 WARNING [train.py:1204] (2/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_153-153537-0_sp1.1 from training. Duration: 0.82725 2023-10-03 13:12:50,552 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_17292_210_6753_1_1531125388_2943734_285-349105-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 13:12:50,833 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.bypass_mid.scale_min, batch_count=1278980.0, ans=0.2 2023-10-03 13:12:51,993 WARNING [train.py:1204] (2/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_37-41919-0 from training. Duration: 0.87 2023-10-03 13:12:52,023 WARNING [train.py:1204] (2/4) Exclude cut with ID _60860_210_8254_1_1534896015875_3557169_172-294852-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 13:12:58,083 WARNING [train.py:1204] (2/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_146-276944-0 from training. Duration: 0.83 2023-10-03 13:13:00,961 WARNING [train.py:1204] (2/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_1254-44082-0_sp1.1 from training. Duration: 0.936375 2023-10-03 13:13:00,973 WARNING [train.py:1204] (2/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_1115-180350-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 13:13:02,286 WARNING [train.py:1204] (2/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_60-146189-0_sp1.1 from training. Duration: 0.8909375 2023-10-03 13:13:07,151 WARNING [train.py:1204] (2/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_517-153649-0_sp1.1 from training. Duration: 0.92725 2023-10-03 13:13:07,171 WARNING [train.py:1204] (2/4) Exclude cut with ID _39571_210_13449_1_1533801584143_3651077_37-266257-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 13:13:07,223 WARNING [train.py:1204] (2/4) Exclude cut with ID _6653_210_2629_1_1527919172441_6732679_527-276118-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 13:13:07,850 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.4.encoder.layers.2.self_attn1.whiten, num_groups=1, num_channels=384, metric=12.54 vs. limit=22.5 2023-10-03 13:13:16,157 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7696_210_2040_1_1528207167_3716099_117-125434-0_sp1.1 from training. Duration: 0.6909375 2023-10-03 13:13:21,791 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_25767_210_2161_1_1532307483_3867748_478-156360-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 13:13:23,063 WARNING [train.py:1204] (2/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_177-49092-0_sp1.1 from training. Duration: 0.82725 2023-10-03 13:13:23,070 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_11305_210_1794_1_1529750428_7178148_680-317073-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 13:13:30,515 WARNING [train.py:1204] (2/4) Exclude cut with ID _53565_210_10635_1_1534337218335_3820259_287-18120-0 from training. Duration: 0.97 2023-10-03 13:13:34,680 WARNING [train.py:1204] (2/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_498-40153-0_sp1.1 from training. Duration: 0.57275 2023-10-03 13:13:34,697 WARNING [train.py:1204] (2/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_555-63066-0_sp1.1 from training. Duration: 0.963625 2023-10-03 13:13:36,271 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.bypass.scale_min, batch_count=1279180.0, ans=0.2 2023-10-03 13:13:38,791 WARNING [train.py:1204] (2/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_395-310220-0 from training. Duration: 0.89 2023-10-03 13:13:38,881 WARNING [train.py:1204] (2/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_937-208281-0_sp1.1 from training. Duration: 0.77275 2023-10-03 13:13:39,162 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=1279180.0, ans=0.1 2023-10-03 13:13:42,654 WARNING [train.py:1204] (2/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_120-211630-0 from training. Duration: 0.92 2023-10-03 13:13:42,686 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_454-276429-0_sp1.1 from training. Duration: 0.7909375 2023-10-03 13:13:44,085 WARNING [train.py:1204] (2/4) Exclude cut with ID _22826_210_9994_1_1531908201522_3279169_404-75889-0_sp1.1 from training. Duration: 0.8090625 2023-10-03 13:13:49,928 INFO [train.py:1046] (2/4) Epoch 37, batch 650, loss[loss=0.1473, simple_loss=0.2259, pruned_loss=0.03435, over 24670.00 frames. ], tot_loss[loss=0.1595, simple_loss=0.2386, pruned_loss=0.04024, over 4522145.95 frames. ], batch size: 65, lr: 2.77e-03, grad_scale: 8.0 2023-10-03 13:13:50,073 WARNING [train.py:1204] (2/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_282-136641-0_sp0.9 from training. Duration: 0.4666875 2023-10-03 13:13:52,661 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_809-17156-0_sp1.1 from training. Duration: 0.663625 2023-10-03 13:13:54,133 WARNING [train.py:1204] (2/4) Exclude cut with ID _9235_210_5219_1_1528891154253_8046120_1003-101638-0_sp0.9 from training. Duration: 0.811125 2023-10-03 13:13:55,591 WARNING [train.py:1204] (2/4) Exclude cut with ID _22826_210_9994_1_1531908201522_3279169_404-75889-0_sp0.9 from training. Duration: 0.988875 2023-10-03 13:13:56,941 WARNING [train.py:1204] (2/4) Exclude cut with ID _59288_210_5854_1_1535077757774_3412246_570-295255-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 13:14:00,280 WARNING [train.py:1204] (2/4) Exclude cut with ID _3465_210_3525_1_1526175667060_3574049_412-287526-0 from training. Duration: 0.72 2023-10-03 13:14:00,345 WARNING [train.py:1204] (2/4) Exclude cut with ID _44010_210_4525_1_1533884262964_4013620_649-79655-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 13:14:04,536 WARNING [train.py:1204] (2/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_57-270037-0_sp0.9 from training. Duration: 0.8555625 2023-10-03 13:14:04,537 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_34815_210_6388_1_1533381320_3571903_302-255963-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 13:14:08,750 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_47449_210_5399_1_1534076445_4006929_573-301852-0_sp1.1 from training. Duration: 0.97275 2023-10-03 13:14:12,015 WARNING [train.py:1204] (2/4) Exclude cut with ID 6709-74022-0004-86860-0_sp1.1 from training. Duration: 0.9409375 2023-10-03 13:14:13,240 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.541e+02 1.868e+02 2.019e+02 2.279e+02 4.165e+02, threshold=4.037e+02, percent-clipped=1.0 2023-10-03 13:14:13,368 WARNING [train.py:1204] (2/4) Exclude cut with ID _40519_210_13794_1_1533715228655_7337390_111-71991-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 13:14:15,079 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_30167_210_3528_1_1532428012_4772375_169-214186-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 13:14:18,267 WARNING [train.py:1204] (2/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_20-235313-0_sp1.1 from training. Duration: 0.963625 2023-10-03 13:14:18,419 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.conv_module1.balancer1.prob, batch_count=1279380.0, ans=0.125 2023-10-03 13:14:19,623 WARNING [train.py:1204] (2/4) Exclude cut with ID _4681_210_3525_1_1527251040321_1235713_123-24428-0_sp0.9 from training. Duration: 0.5333125 2023-10-03 13:14:21,048 WARNING [train.py:1204] (2/4) Exclude cut with ID _31602_210_5854_1_1532677021456_4458250_444-198752-0_sp1.1 from training. Duration: 0.97275 2023-10-03 13:14:21,342 INFO [scaling.py:1118] (2/4) WithLoss: name=encoder.encoders.5.encoder.layers.1.self_attn_weights, loss-sum=0.000e+00 2023-10-03 13:14:22,405 WARNING [train.py:1204] (2/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_9-279442-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 13:14:22,455 WARNING [train.py:1204] (2/4) Exclude cut with ID _6275_210_2084_1_1528110062213_7669049_973-198085-0_sp0.9 from training. Duration: 0.8333125 2023-10-03 13:14:23,843 WARNING [train.py:1204] (2/4) Exclude cut with ID _29853_210_7497_1_1532505600851_6642110_567-250221-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 13:14:25,193 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6545_210_2040_1_1527774670_4245258_407-48070-0_sp1.1 from training. Duration: 0.6909375 2023-10-03 13:14:26,534 WARNING [train.py:1204] (2/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_224-343295-0_sp0.9 from training. Duration: 0.611125 2023-10-03 13:14:26,563 WARNING [train.py:1204] (2/4) Exclude cut with ID IC0125W0011-61817 from training. Duration: 0.88 2023-10-03 13:14:26,571 WARNING [train.py:1204] (2/4) Exclude cut with ID _60113_210_16271_1_1535164212481_4386079_435-154371-0_sp1.1 from training. Duration: 0.97275 2023-10-03 13:14:28,438 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_25836_210_16554_1_1532138167_53197_3-9394-0_sp1.1 from training. Duration: 0.92725 2023-10-03 13:14:31,299 WARNING [train.py:1204] (2/4) Exclude cut with ID _29343_210_4381_1_1532602757752_3674109_57-55125-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 13:14:31,398 WARNING [train.py:1204] (2/4) Exclude cut with ID _56235_210_10640_1_1534557335235_4161099_267-23560-0_sp1.1 from training. Duration: 0.936375 2023-10-03 13:14:32,582 WARNING [train.py:1204] (2/4) Exclude cut with ID _30196_210_1664_1_1533708496522_7074140_1150-278867-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 13:14:32,646 WARNING [train.py:1204] (2/4) Exclude cut with ID _6270_210_2084_1_1527850911573_3856420_26-277343-0_sp0.9 from training. Duration: 0.911125 2023-10-03 13:14:33,977 WARNING [train.py:1204] (2/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_325-62802-0 from training. Duration: 0.87 2023-10-03 13:14:35,341 WARNING [train.py:1204] (2/4) Exclude cut with ID _56524_210_9642_1_1534572011779_3619170_145-310842-0_sp1.1 from training. Duration: 0.9 2023-10-03 13:14:35,377 WARNING [train.py:1204] (2/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_860-96198-0_sp0.9 from training. Duration: 0.811125 2023-10-03 13:14:35,463 WARNING [train.py:1204] (2/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_17-25045-0_sp1.1 from training. Duration: 0.536375 2023-10-03 13:14:35,481 WARNING [train.py:1204] (2/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_349-69727-0_sp1.1 from training. Duration: 0.936375 2023-10-03 13:14:36,777 WARNING [train.py:1204] (2/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_580-93798-0_sp1.1 from training. Duration: 0.5090625 2023-10-03 13:14:38,127 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7762_210_1794_1_1528199508_4057408_419-125009-0 from training. Duration: 0.83 2023-10-03 13:14:40,830 WARNING [train.py:1204] (2/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_356-167816-0 from training. Duration: 0.72 2023-10-03 13:14:40,860 WARNING [train.py:1204] (2/4) Exclude cut with ID _29345_210_4381_1_1532689168263_3860169_225-97100-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 13:14:40,870 WARNING [train.py:1204] (2/4) Exclude cut with ID _14227_210_4381_1_1530701804286_3940259_374-194712-0_sp1.1 from training. Duration: 0.92725 2023-10-03 13:14:40,899 WARNING [train.py:1204] (2/4) Exclude cut with ID _40137_210_12210_1_1533290484046_3795180_249-187059-0_sp0.9 from training. Duration: 0.97775 2023-10-03 13:14:42,729 WARNING [train.py:1204] (2/4) Exclude cut with ID _22826_210_9994_1_1531908201522_3279169_389-317194-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 13:14:44,111 WARNING [train.py:1204] (2/4) Exclude cut with ID _27485_210_4916_1_1532739191677_3668269_239-122918-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 13:14:48,336 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_2245_210_2715_1_1525511548_3540254_128-117609-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 13:14:50,213 WARNING [train.py:1204] (2/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_160-276178-0_sp1.1 from training. Duration: 0.963625 2023-10-03 13:14:51,597 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_31920_210_10633_1_1532860009_1686094_1-279735-0_sp1.1 from training. Duration: 0.97275 2023-10-03 13:14:51,840 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.attention_skip_rate, batch_count=1279513.3333333333, ans=0.0 2023-10-03 13:14:54,338 WARNING [train.py:1204] (2/4) Exclude cut with ID _51078_210_9070_1_1534161557424_3805250_325-262383-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 13:14:54,365 WARNING [train.py:1204] (2/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_643-52007-0_sp0.9 from training. Duration: 0.6333125 2023-10-03 13:14:55,582 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_51937_210_5014_1_1534573610_4022025_518-299248-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 13:15:01,081 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.0.layers.0.self_attn1.whiten, num_groups=1, num_channels=192, metric=10.62 vs. limit=22.5 2023-10-03 13:15:02,709 INFO [train.py:1046] (2/4) Epoch 37, batch 700, loss[loss=0.172, simple_loss=0.2479, pruned_loss=0.04806, over 24020.00 frames. ], tot_loss[loss=0.1581, simple_loss=0.2369, pruned_loss=0.03966, over 4558322.53 frames. ], batch size: 80, lr: 2.77e-03, grad_scale: 8.0 2023-10-03 13:15:02,770 WARNING [train.py:1204] (2/4) Exclude cut with ID _13252_210_1925_1_1530671372047_3336909_166-314607-0_sp1.1 from training. Duration: 0.4909375 2023-10-03 13:15:02,774 WARNING [train.py:1204] (2/4) Exclude cut with ID _44778_210_2054_1_1533798127203_7443090_1087-282939-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 13:15:02,816 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_23850_210_4238_1_1532232597_1632433_32-185135-0_sp1.1 from training. Duration: 0.7818125 2023-10-03 13:15:04,105 WARNING [train.py:1204] (2/4) Exclude cut with ID _59500_210_8254_1_1534809610680_3736009_145-89359-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 13:15:07,039 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_340-336408-0 from training. Duration: 0.93 2023-10-03 13:15:07,116 WARNING [train.py:1204] (2/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_9-21737-0 from training. Duration: 0.97 2023-10-03 13:15:09,866 WARNING [train.py:1204] (2/4) Exclude cut with ID _2220_210_1819_1_1525509109898_3658680_50-99837-0 from training. Duration: 0.54 2023-10-03 13:15:11,625 WARNING [train.py:1204] (2/4) Exclude cut with ID _30304_210_6319_1_1532476827261_7582060_879-299350-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 13:15:13,441 WARNING [train.py:1204] (2/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_905-160074-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 13:15:14,822 WARNING [train.py:1204] (2/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_395-73570-0 from training. Duration: 0.99 2023-10-03 13:15:17,642 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.nonlin_attention.balancer.prob, batch_count=1279646.6666666667, ans=0.125 2023-10-03 13:15:18,874 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7264_210_4938_1_1527939911_8132095_800-91995-0_sp1.1 from training. Duration: 0.7818125 2023-10-03 13:15:20,287 WARNING [train.py:1204] (2/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_77-311404-0_sp1.1 from training. Duration: 0.8545625 2023-10-03 13:15:23,755 WARNING [train.py:1204] (2/4) Exclude cut with ID _13679_210_7497_1_1530534293662_3355098_107-80772-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 13:15:23,861 WARNING [train.py:1204] (2/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_224-343295-0_sp1.1 from training. Duration: 0.5 2023-10-03 13:15:25,142 WARNING [train.py:1204] (2/4) Exclude cut with ID _54871_210_22356_1_1534665798253_3856359_156-253482-0_sp1.1 from training. Duration: 0.963625 2023-10-03 13:15:28,382 WARNING [train.py:1204] (2/4) Exclude cut with ID _49728_210_12651_1_1534041034294_6788150_309-125532-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 13:15:28,648 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.conv_module2.balancer1.prob, batch_count=1279646.6666666667, ans=0.125 2023-10-03 13:15:31,205 WARNING [train.py:1204] (2/4) Exclude cut with ID _13913_210_9994_1_1530579615187_3383897_473-277682-0_sp0.9 from training. Duration: 0.4333125 2023-10-03 13:15:31,210 WARNING [train.py:1204] (2/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_3-276701-0_sp1.1 from training. Duration: 0.82725 2023-10-03 13:15:33,988 WARNING [train.py:1204] (2/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_282-136641-0 from training. Duration: 0.42 2023-10-03 13:15:34,170 WARNING [train.py:1204] (2/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_127-566-0 from training. Duration: 0.84 2023-10-03 13:15:38,266 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6163_210_6400_1_1527506746_4263068_283-137811-0_sp1.1 from training. Duration: 0.7 2023-10-03 13:15:38,295 WARNING [train.py:1204] (2/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_34-183685-0_sp1.1 from training. Duration: 0.8909375 2023-10-03 13:15:38,451 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.0.balancer2.prob, batch_count=1279713.3333333333, ans=0.125 2023-10-03 13:15:38,607 INFO [scaling.py:1118] (2/4) WithLoss: name=encoder.encoders.4.encoder.layers.2.self_attn_weights, loss-sum=0.000e+00 2023-10-03 13:15:39,696 WARNING [train.py:1204] (2/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_154-135696-0_sp0.9 from training. Duration: 0.888875 2023-10-03 13:15:39,964 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.bypass.scale_min, batch_count=1279713.3333333333, ans=0.2 2023-10-03 13:15:44,367 WARNING [train.py:1204] (2/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_676-51014-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 13:15:45,618 WARNING [train.py:1204] (2/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_38-169920-0 from training. Duration: 0.91 2023-10-03 13:15:51,160 WARNING [train.py:1204] (2/4) Exclude cut with ID _6653_210_2629_1_1527919172441_6732679_747-114465-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 13:15:51,216 WARNING [train.py:1204] (2/4) Exclude cut with ID _8021_210_7994_1_1528376451358_3653069_100-140231-0_sp1.1 from training. Duration: 0.6090625 2023-10-03 13:15:51,239 WARNING [train.py:1204] (2/4) Exclude cut with ID _64135_210_2253_1_1535677267876_7608972_133-188266-0 from training. Duration: 0.81 2023-10-03 13:15:51,828 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.4.encoder.layers.0.nonlin_attention.whiten1, num_groups=1, num_channels=288, metric=8.49 vs. limit=10.0 2023-10-03 13:15:55,785 WARNING [train.py:1204] (2/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_714-310538-0_sp1.1 from training. Duration: 0.7818125 2023-10-03 13:15:57,768 WARNING [train.py:1204] (2/4) Exclude cut with ID _39867_210_9826_1_1533553222030_9344259_1134-288498-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 13:15:58,030 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.conv_skip_rate, batch_count=1279780.0, ans=0.0 2023-10-03 13:16:00,549 WARNING [train.py:1204] (2/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_211-237289-0_sp1.1 from training. Duration: 0.936375 2023-10-03 13:16:06,068 WARNING [train.py:1204] (2/4) Exclude cut with ID _59202_210_26984_1_1534816810612_3549174_137-318710-0_sp0.9 from training. Duration: 0.988875 2023-10-03 13:16:06,104 WARNING [train.py:1204] (2/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_86-66192-0 from training. Duration: 0.55 2023-10-03 13:16:09,188 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.out_combiner.scale_min, batch_count=1279846.6666666667, ans=0.2 2023-10-03 13:16:09,214 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.feed_forward1.out_proj.dropout_p, batch_count=1279846.6666666667, ans=0.1 2023-10-03 13:16:10,321 WARNING [train.py:1204] (2/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_540-275685-0 from training. Duration: 0.89 2023-10-03 13:16:10,364 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8776_210_2040_1_1528638837_4088036_244-278763-0 from training. Duration: 0.9 2023-10-03 13:16:11,821 WARNING [train.py:1204] (2/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_9-219972-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 13:16:12,273 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.4.encoder.layers.2.conv_module1.whiten, num_groups=1, num_channels=384, metric=3.40 vs. limit=15.0 2023-10-03 13:16:15,070 WARNING [train.py:1204] (2/4) Exclude cut with ID _44659_210_2721_1_1534036120061_6614860_656-283823-0_sp1.1 from training. Duration: 0.92725 2023-10-03 13:16:15,136 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_69630_210_7234_1_1535789601_661049_77-216385-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 13:16:16,840 INFO [train.py:1046] (2/4) Epoch 37, batch 750, loss[loss=0.1543, simple_loss=0.2281, pruned_loss=0.04019, over 23409.00 frames. ], tot_loss[loss=0.1574, simple_loss=0.236, pruned_loss=0.03937, over 4589772.54 frames. ], batch size: 119, lr: 2.77e-03, grad_scale: 8.0 2023-10-03 13:16:16,922 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_19952_210_2161_1_1531875469_3851750_210-308919-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 13:16:16,927 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_4737_210_2040_1_1526998689_3293008_378-178650-0 from training. Duration: 0.95 2023-10-03 13:16:21,053 WARNING [train.py:1204] (2/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_224-343295-0 from training. Duration: 0.55 2023-10-03 13:16:21,077 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7002_210_1794_1_1527939683_5157286_184-229630-0 from training. Duration: 0.74 2023-10-03 13:16:21,099 WARNING [train.py:1204] (2/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_6-214815-0 from training. Duration: 0.85 2023-10-03 13:16:22,372 WARNING [train.py:1204] (2/4) Exclude cut with ID _8699_210_2069_1_1528630346831_3960409_160-24408-0 from training. Duration: 0.91 2023-10-03 13:16:22,414 WARNING [train.py:1204] (2/4) Exclude cut with ID _5585_210_2667_1_1527418813324_8007340_89-332072-0 from training. Duration: 0.64 2023-10-03 13:16:24,159 WARNING [train.py:1204] (2/4) Exclude cut with ID _5250_210_5219_1_1527249561806_8692110_528-259800-0_sp1.1 from training. Duration: 0.963625 2023-10-03 13:16:24,246 WARNING [train.py:1204] (2/4) Exclude cut with ID _55383_210_10635_1_1534468426172_2231669_134-128246-0 from training. Duration: 0.89 2023-10-03 13:16:25,635 WARNING [train.py:1204] (2/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_400-338274-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 13:16:26,909 WARNING [train.py:1204] (2/4) Exclude cut with ID _14224_210_4381_1_1530529062037_4264010_364-22817-0_sp0.9 from training. Duration: 0.788875 2023-10-03 13:16:28,640 WARNING [train.py:1204] (2/4) Exclude cut with ID _42863_210_9070_1_1533902398788_3696920_331-291057-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 13:16:28,890 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.feed_forward3.hidden_balancer.prob, batch_count=1279913.3333333333, ans=0.125 2023-10-03 13:16:31,265 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_5436_210_1794_1_1527331317_2541494_229-211016-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 13:16:31,311 WARNING [train.py:1204] (2/4) Exclude cut with ID _6456_210_1534_1_1527681634212_3181049_345-252877-0_sp0.9 from training. Duration: 0.711125 2023-10-03 13:16:31,472 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.self_attn_weights.pos_emb_skip_rate, batch_count=1279980.0, ans=0.0 2023-10-03 13:16:32,629 WARNING [train.py:1204] (2/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_96-81499-0_sp1.1 from training. Duration: 0.92725 2023-10-03 13:16:37,995 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_5575_210_1410_1_1527301305_2570170_342-265572-0_sp1.1 from training. Duration: 0.8545625 2023-10-03 13:16:38,056 WARNING [train.py:1204] (2/4) Exclude cut with ID _5444_210_2151_1_1527418808464_5312210_726-27165-0_sp0.9 from training. Duration: 0.7444375 2023-10-03 13:16:40,730 WARNING [train.py:1204] (2/4) Exclude cut with ID _22083_210_8254_1_1531702862439_3678970_304-327750-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 13:16:42,002 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.618e+02 1.819e+02 1.975e+02 2.168e+02 3.086e+02, threshold=3.950e+02, percent-clipped=0.0 2023-10-03 13:16:42,153 WARNING [train.py:1204] (2/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_245-226199-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 13:16:42,193 WARNING [train.py:1204] (2/4) Exclude cut with ID _42875_210_14856_1_1533704397487_4012433_102-199499-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 13:16:43,527 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_51225_210_3601_1_1534400634_2327383_55-301844-0 from training. Duration: 0.93 2023-10-03 13:16:44,933 WARNING [train.py:1204] (2/4) Exclude cut with ID _28317_210_12742_1_1532480302381_8510325_916-195848-0_sp1.1 from training. Duration: 0.836375 2023-10-03 13:16:44,990 WARNING [train.py:1204] (2/4) Exclude cut with ID _44718_210_5816_1_1534050051869_6469030_497-311999-0_sp1.1 from training. Duration: 0.936375 2023-10-03 13:16:48,594 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_34886_210_10633_1_1533203882_1755661_91-194765-0_sp1.1 from training. Duration: 0.936375 2023-10-03 13:16:50,033 WARNING [train.py:1204] (2/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_310-26186-0_sp0.9 from training. Duration: 0.77775 2023-10-03 13:16:50,124 WARNING [train.py:1204] (2/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_416-257730-0 from training. Duration: 0.65 2023-10-03 13:16:50,129 WARNING [train.py:1204] (2/4) Exclude cut with ID _9389_210_1664_1_1529217337301_5289269_209-154760-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 13:16:51,610 WARNING [train.py:1204] (2/4) Exclude cut with ID _4092_210_2151_1_1526643044449_3135759_227-213972-0 from training. Duration: 0.54 2023-10-03 13:16:51,631 WARNING [train.py:1204] (2/4) Exclude cut with ID IC0679W0044-335852 from training. Duration: 0.768 2023-10-03 13:16:51,839 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.feed_forward2.hidden_balancer.prob, batch_count=1280046.6666666667, ans=0.125 2023-10-03 13:16:53,002 WARNING [train.py:1204] (2/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_42-186162-0 from training. Duration: 0.92 2023-10-03 13:16:53,009 WARNING [train.py:1204] (2/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_302-2550-0_sp0.9 from training. Duration: 0.9 2023-10-03 13:16:53,023 WARNING [train.py:1204] (2/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_356-31216-0_sp0.9 from training. Duration: 0.8333125 2023-10-03 13:16:54,858 WARNING [train.py:1204] (2/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_125-322172-0_sp1.1 from training. Duration: 0.8454375 2023-10-03 13:17:01,866 WARNING [train.py:1204] (2/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_141-89727-0_sp0.9 from training. Duration: 0.788875 2023-10-03 13:17:01,884 WARNING [train.py:1204] (2/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_647-337784-0_sp1.1 from training. Duration: 0.97275 2023-10-03 13:17:01,895 WARNING [train.py:1204] (2/4) Exclude cut with ID _6493_210_1747_1_1528459210424_4012470_513-140967-0_sp1.1 from training. Duration: 0.7181875 2023-10-03 13:17:05,226 WARNING [train.py:1204] (2/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_87-266334-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 13:17:06,635 WARNING [train.py:1204] (2/4) Exclude cut with ID _26528_210_9070_1_1532570410842_3851310_526-261338-0_sp1.1 from training. Duration: 0.92725 2023-10-03 13:17:07,985 WARNING [train.py:1204] (2/4) Exclude cut with ID _14224_210_4381_1_1530529062037_4264010_380-343670-0 from training. Duration: 0.88 2023-10-03 13:17:08,031 WARNING [train.py:1204] (2/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_449-15508-0_sp1.1 from training. Duration: 0.7454375 2023-10-03 13:17:08,146 WARNING [train.py:1204] (2/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_372-6869-0_sp0.9 from training. Duration: 0.488875 2023-10-03 13:17:09,412 WARNING [train.py:1204] (2/4) Exclude cut with ID _26907_210_7234_1_1532221740694_3205263_337-2494-0_sp0.9 from training. Duration: 0.97775 2023-10-03 13:17:12,293 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.bypass_mid.scale_min, batch_count=1280113.3333333333, ans=0.2 2023-10-03 13:17:13,434 WARNING [train.py:1204] (2/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_10-100367-0_sp1.1 from training. Duration: 0.8909375 2023-10-03 13:17:13,461 WARNING [train.py:1204] (2/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_47-167373-0 from training. Duration: 0.92 2023-10-03 13:17:14,799 WARNING [train.py:1204] (2/4) Exclude cut with ID _28323_210_16187_1_1532480395667_4100300_628-282179-0_sp1.1 from training. Duration: 0.97275 2023-10-03 13:17:19,882 WARNING [train.py:1204] (2/4) Exclude cut with ID _20455_210_5219_1_1531565996775_8098100_213-280898-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 13:17:21,265 WARNING [train.py:1204] (2/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_383-316000-0_sp1.1 from training. Duration: 0.8181875 2023-10-03 13:17:21,305 WARNING [train.py:1204] (2/4) Exclude cut with ID _18513_210_9206_1_1531618215223_3862620_16-78180-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 13:17:22,721 WARNING [train.py:1204] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_330-327861-0_sp1.1 from training. Duration: 0.6090625 2023-10-03 13:17:26,031 WARNING [train.py:1204] (2/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_356-31216-0 from training. Duration: 0.75 2023-10-03 13:17:26,063 WARNING [train.py:1204] (2/4) Exclude cut with ID _25248_210_8750_1_1532062825941_5554180_255-148539-0_sp1.1 from training. Duration: 0.963625 2023-10-03 13:17:27,320 WARNING [train.py:1204] (2/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_269-154726-0_sp1.1 from training. Duration: 0.7818125 2023-10-03 13:17:30,037 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7002_210_1794_1_1527939683_5157286_163-87552-0_sp1.1 from training. Duration: 0.7818125 2023-10-03 13:17:30,075 WARNING [train.py:1204] (2/4) Exclude cut with ID _54871_210_22356_1_1534665798253_3856359_97-151896-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 13:17:32,783 INFO [train.py:1046] (2/4) Epoch 37, batch 800, loss[loss=0.15, simple_loss=0.2412, pruned_loss=0.02938, over 24689.00 frames. ], tot_loss[loss=0.1577, simple_loss=0.2367, pruned_loss=0.03935, over 4617096.43 frames. ], batch size: 73, lr: 2.77e-03, grad_scale: 16.0 2023-10-03 13:17:33,492 WARNING [train.py:1204] (2/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_207-7026-0_sp1.1 from training. Duration: 0.97275 2023-10-03 13:17:33,512 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_92-46009-0_sp0.9 from training. Duration: 0.6 2023-10-03 13:17:41,749 WARNING [train.py:1204] (2/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_122-178847-0_sp1.1 from training. Duration: 0.97275 2023-10-03 13:17:41,753 WARNING [train.py:1204] (2/4) Exclude cut with ID _44778_210_2054_1_1533798127203_7443090_94-348956-0_sp1.1 from training. Duration: 0.92725 2023-10-03 13:17:43,244 WARNING [train.py:1204] (2/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_1018-244089-0_sp1.1 from training. Duration: 0.963625 2023-10-03 13:17:43,262 WARNING [train.py:1204] (2/4) Exclude cut with ID _56235_210_10640_1_1534557335235_4161099_87-118423-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 13:17:44,511 WARNING [train.py:1204] (2/4) Exclude cut with ID _53713_210_24116_1_1534314487762_7416751_504-212292-0_sp1.1 from training. Duration: 0.92725 2023-10-03 13:17:44,540 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_446-150327-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 13:17:47,606 WARNING [train.py:1204] (2/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_682-245989-0_sp1.1 from training. Duration: 0.92725 2023-10-03 13:17:50,857 WARNING [train.py:1204] (2/4) Exclude cut with ID _35723_210_1794_1_1533088341043_628250_8-234524-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 13:17:50,936 WARNING [train.py:1204] (2/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_64-248359-0_sp0.9 from training. Duration: 0.7666875 2023-10-03 13:17:55,089 WARNING [train.py:1204] (2/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_49-97158-0 from training. Duration: 0.76 2023-10-03 13:17:55,138 WARNING [train.py:1204] (2/4) Exclude cut with ID _24314_210_4915_1_1532135078116_7170179_743-915-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 13:17:56,466 WARNING [train.py:1204] (2/4) Exclude cut with ID _61194_210_18628_1_1535007555628_3750909_353-323468-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 13:17:56,496 WARNING [train.py:1204] (2/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_387-131538-0_sp1.1 from training. Duration: 0.736375 2023-10-03 13:17:56,523 WARNING [train.py:1204] (2/4) Exclude cut with ID _44424_210_18163_1_1533623394848_1213110_79-187603-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 13:17:56,553 WARNING [train.py:1204] (2/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_86-19601-0 from training. Duration: 0.85 2023-10-03 13:17:58,396 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_50050_210_16223_1_1535435796_7290138_103-283529-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 13:17:58,434 WARNING [train.py:1204] (2/4) Exclude cut with ID _13913_210_9994_1_1530579615187_3383897_473-277682-0 from training. Duration: 0.39 2023-10-03 13:18:01,241 WARNING [train.py:1204] (2/4) Exclude cut with ID _42326_210_7770_1_1533814282531_3707269_368-68029-0_sp1.1 from training. Duration: 0.92725 2023-10-03 13:18:03,953 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_41428_210_2161_1_1533774184_3992532_446-339645-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 13:18:05,708 WARNING [train.py:1204] (2/4) Exclude cut with ID _69584_210_5219_1_1535785133775_4044450_325-7656-0_sp1.1 from training. Duration: 0.7818125 2023-10-03 13:18:06,985 WARNING [train.py:1204] (2/4) Exclude cut with ID _21632_210_2253_1_1531994001765_3744140_134-184387-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 13:18:08,398 WARNING [train.py:1204] (2/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_550-184707-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 13:18:08,414 WARNING [train.py:1204] (2/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_106-341780-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 13:18:08,549 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.conv_module2.balancer2.min_abs, batch_count=1280380.0, ans=0.5 2023-10-03 13:18:12,667 WARNING [train.py:1204] (2/4) Exclude cut with ID _45871_210_5665_1_1533864614245_3296060_253-132552-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 13:18:12,703 WARNING [train.py:1204] (2/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_395-310220-0_sp1.1 from training. Duration: 0.8090625 2023-10-03 13:18:13,968 WARNING [train.py:1204] (2/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_795-33252-0_sp0.9 from training. Duration: 0.411125 2023-10-03 13:18:15,491 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9002W0003-495962 from training. Duration: 0.946 2023-10-03 13:18:15,512 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_43-71989-0 from training. Duration: 0.57 2023-10-03 13:18:15,525 WARNING [train.py:1204] (2/4) Exclude cut with ID _8699_210_2069_1_1528630346831_3960409_155-39766-0_sp1.1 from training. Duration: 0.7090625 2023-10-03 13:18:15,547 WARNING [train.py:1204] (2/4) Exclude cut with ID _27894_210_12392_1_1532415496078_6902439_788-232684-0_sp1.1 from training. Duration: 0.97275 2023-10-03 13:18:17,533 WARNING [train.py:1204] (2/4) Exclude cut with ID _56494_210_4220_1_1535284837625_3033169_10-39296-0_sp1.1 from training. Duration: 0.92725 2023-10-03 13:18:18,786 WARNING [train.py:1204] (2/4) Exclude cut with ID _40901_210_5665_1_1533363789958_3856130_341-20819-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 13:18:23,685 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.1.ff3_skip_rate, batch_count=1280446.6666666667, ans=0.0 2023-10-03 13:18:23,697 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.conv_skip_rate, batch_count=1280446.6666666667, ans=0.0 2023-10-03 13:18:24,828 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9029W0435-509822 from training. Duration: 0.896 2023-10-03 13:18:24,869 WARNING [train.py:1204] (2/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_51-117576-0 from training. Duration: 0.4 2023-10-03 13:18:26,402 WARNING [train.py:1204] (2/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_197-28164-0_sp1.1 from training. Duration: 0.72725 2023-10-03 13:18:27,826 WARNING [train.py:1204] (2/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_145-137790-0_sp1.1 from training. Duration: 0.7545625 2023-10-03 13:18:30,733 WARNING [train.py:1204] (2/4) Exclude cut with ID _47790_210_16187_1_1533862814066_4207519_2-39653-0_sp1.1 from training. Duration: 0.8909375 2023-10-03 13:18:33,981 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_3780_210_2087_1_1526467760_2130483_186-208240-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 13:18:35,366 WARNING [train.py:1204] (2/4) Exclude cut with ID _24055_210_12651_1_1532310907143_976979_92-59356-0 from training. Duration: 0.91 2023-10-03 13:18:36,696 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_3910_210_1794_1_1526742306_334530_28-238909-0_sp1.1 from training. Duration: 0.736375 2023-10-03 13:18:39,460 WARNING [train.py:1204] (2/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_269-154726-0 from training. Duration: 0.86 2023-10-03 13:18:39,703 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.balancer1.prob, batch_count=1280513.3333333333, ans=0.125 2023-10-03 13:18:46,224 INFO [train.py:1046] (2/4) Epoch 37, batch 850, loss[loss=0.1488, simple_loss=0.2338, pruned_loss=0.03193, over 24305.00 frames. ], tot_loss[loss=0.1588, simple_loss=0.2379, pruned_loss=0.03984, over 4629575.40 frames. ], batch size: 61, lr: 2.76e-03, grad_scale: 16.0 2023-10-03 13:18:46,308 WARNING [train.py:1204] (2/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_326-165900-0_sp0.9 from training. Duration: 0.7666875 2023-10-03 13:18:48,278 WARNING [train.py:1204] (2/4) Exclude cut with ID _5294_210_1747_1_1527249643928_3696179_470-276077-0_sp1.1 from training. Duration: 0.963625 2023-10-03 13:18:49,680 WARNING [train.py:1204] (2/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_372-131163-0 from training. Duration: 0.94 2023-10-03 13:18:49,730 WARNING [train.py:1204] (2/4) Exclude cut with ID _45297_210_2721_1_1533715351381_3264949_351-147136-0_sp1.1 from training. Duration: 0.7909375 2023-10-03 13:18:49,896 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=1280580.0, ans=0.1 2023-10-03 13:18:52,350 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_1072-12671-0_sp1.1 from training. Duration: 0.92725 2023-10-03 13:18:52,440 WARNING [train.py:1204] (2/4) Exclude cut with ID _15133_210_6228_1_1530794512074_3080379_54-198193-0 from training. Duration: 0.94 2023-10-03 13:18:52,457 WARNING [train.py:1204] (2/4) Exclude cut with ID _30196_210_1664_1_1533708496522_7074140_1194-270988-0_sp1.1 from training. Duration: 0.97275 2023-10-03 13:18:54,169 WARNING [train.py:1204] (2/4) Exclude cut with ID _50310_210_15488_1_1534068043345_3641080_289-193637-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 13:18:55,629 WARNING [train.py:1204] (2/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_313-263495-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 13:18:57,044 WARNING [train.py:1204] (2/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_18-130336-0_sp1.1 from training. Duration: 0.7181875 2023-10-03 13:18:58,415 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_47896_210_20508_1_1534492518_5958868_646-287209-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 13:18:59,696 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_294-163793-0 from training. Duration: 0.82 2023-10-03 13:19:01,065 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_8-319957-0 from training. Duration: 0.96 2023-10-03 13:19:01,069 WARNING [train.py:1204] (2/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_1211-226066-0 from training. Duration: 0.76 2023-10-03 13:19:02,415 WARNING [train.py:1204] (2/4) Exclude cut with ID _4779_210_2084_1_1527318022944_3914893_500-285013-0_sp0.9 from training. Duration: 0.7666875 2023-10-03 13:19:02,435 WARNING [train.py:1204] (2/4) Exclude cut with ID _22826_210_9994_1_1531908201522_3279169_347-232268-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 13:19:02,724 INFO [scaling.py:1118] (2/4) WithLoss: name=encoder.encoders.3.encoder.layers.1.self_attn_weights, loss-sum=0.000e+00 2023-10-03 13:19:05,721 WARNING [train.py:1204] (2/4) Exclude cut with ID _29877_210_6228_1_1532397791106_5276149_111-55056-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 13:19:05,766 WARNING [train.py:1204] (2/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_812-271646-0_sp1.1 from training. Duration: 0.92725 2023-10-03 13:19:07,205 WARNING [train.py:1204] (2/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_235-262124-0_sp1.1 from training. Duration: 0.6090625 2023-10-03 13:19:10,507 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.463e+02 1.795e+02 1.931e+02 2.229e+02 3.193e+02, threshold=3.862e+02, percent-clipped=0.0 2023-10-03 13:19:10,685 WARNING [train.py:1204] (2/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_14-16530-0_sp1.1 from training. Duration: 0.97275 2023-10-03 13:19:12,033 WARNING [train.py:1204] (2/4) Exclude cut with ID _44718_210_5816_1_1534050051869_6469030_568-304512-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 13:19:12,070 WARNING [train.py:1204] (2/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_1033-274376-0 from training. Duration: 0.87 2023-10-03 13:19:16,136 WARNING [train.py:1204] (2/4) Exclude cut with ID _4681_210_3525_1_1527251040321_1235713_123-24428-0 from training. Duration: 0.48 2023-10-03 13:19:16,310 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.balancer2.prob, batch_count=1280713.3333333333, ans=0.125 2023-10-03 13:19:20,369 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_37818_210_4238_1_1533620910_4720083_251-288764-0_sp1.1 from training. Duration: 0.97275 2023-10-03 13:19:20,436 WARNING [train.py:1204] (2/4) Exclude cut with ID _65585_210_12210_1_1535450451584_3764098_112-294301-0 from training. Duration: 0.88 2023-10-03 13:19:23,820 WARNING [train.py:1204] (2/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_329-113173-0 from training. Duration: 0.62 2023-10-03 13:19:25,695 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6767_210_3273_1_1527855783_1718932_173-256601-0 from training. Duration: 0.8 2023-10-03 13:19:27,171 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9266W0001-627677 from training. Duration: 0.896 2023-10-03 13:19:27,181 WARNING [train.py:1204] (2/4) Exclude cut with ID _49643_210_7989_1_1534640244224_3939031_611-185515-0_sp1.1 from training. Duration: 0.8545625 2023-10-03 13:19:27,191 WARNING [train.py:1204] (2/4) Exclude cut with ID _31649_210_6228_1_1532584608063_4091078_687-237771-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 13:19:27,207 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_12868_210_6753_1_1530702479_3610564_148-172861-0_sp0.9 from training. Duration: 0.5555625 2023-10-03 13:19:31,207 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_5129_210_6400_1_1527161005_3579054_107-69738-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 13:19:31,299 WARNING [train.py:1204] (2/4) Exclude cut with ID _40137_210_12210_1_1533290484046_3795180_36-301164-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 13:19:31,328 WARNING [train.py:1204] (2/4) Exclude cut with ID _7537_210_2084_1_1528502395431_3924359_113-199933-0 from training. Duration: 0.59 2023-10-03 13:19:32,812 WARNING [train.py:1204] (2/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_547-5424-0_sp1.1 from training. Duration: 0.8545625 2023-10-03 13:19:33,027 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.feed_forward1.hidden_balancer.prob, batch_count=1280780.0, ans=0.125 2023-10-03 13:19:34,160 WARNING [train.py:1204] (2/4) Exclude cut with ID _43552_210_8913_1_1533726031059_3523429_147-284024-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 13:19:35,923 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_294-163793-0_sp1.1 from training. Duration: 0.7454375 2023-10-03 13:19:35,950 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_3780_210_2087_1_1526467760_2130483_170-259044-0_sp1.1 from training. Duration: 0.636375 2023-10-03 13:19:37,335 WARNING [train.py:1204] (2/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_726-270780-0_sp1.1 from training. Duration: 0.7818125 2023-10-03 13:19:39,251 WARNING [train.py:1204] (2/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_214-291369-0_sp1.1 from training. Duration: 0.57275 2023-10-03 13:19:40,567 WARNING [train.py:1204] (2/4) Exclude cut with ID _21881_210_3525_1_1531825242757_3798380_247-102591-0 from training. Duration: 0.98 2023-10-03 13:19:42,108 WARNING [train.py:1204] (2/4) Exclude cut with ID _8064_210_5399_1_1528372299291_4282973_18-174647-0_sp1.1 from training. Duration: 0.82725 2023-10-03 13:19:42,111 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_415-52611-0_sp1.1 from training. Duration: 0.936375 2023-10-03 13:19:43,404 WARNING [train.py:1204] (2/4) Exclude cut with ID _8394_210_6702_1_1528590504316_3540189_379-49457-0_sp0.9 from training. Duration: 0.9666875 2023-10-03 13:19:43,425 WARNING [train.py:1204] (2/4) Exclude cut with ID _64521_210_14699_1_1535243427259_3815328_368-195106-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 13:19:44,818 WARNING [train.py:1204] (2/4) Exclude cut with ID _28394_210_8913_1_1532430049518_3650118_51-334124-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 13:19:46,290 WARNING [train.py:1204] (2/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_317-161769-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 13:19:47,644 WARNING [train.py:1204] (2/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_40-129756-0_sp0.9 from training. Duration: 0.9 2023-10-03 13:19:47,908 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.feed_forward2.hidden_balancer.prob, batch_count=1280846.6666666667, ans=0.125 2023-10-03 13:19:49,758 WARNING [train.py:1204] (2/4) Exclude cut with ID _3067_210_2667_1_1526212974640_4503168_119-175776-0_sp0.9 from training. Duration: 0.82225 2023-10-03 13:19:51,057 WARNING [train.py:1204] (2/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_1004-304915-0_sp1.1 from training. Duration: 0.92725 2023-10-03 13:19:52,391 WARNING [train.py:1204] (2/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_359-118926-0_sp0.9 from training. Duration: 0.8 2023-10-03 13:19:59,877 WARNING [train.py:1204] (2/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_51-200220-0_sp0.9 from training. Duration: 0.62225 2023-10-03 13:20:00,070 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.ff2_skip_rate, batch_count=1280913.3333333333, ans=0.0 2023-10-03 13:20:01,123 INFO [train.py:1046] (2/4) Epoch 37, batch 900, loss[loss=0.163, simple_loss=0.2519, pruned_loss=0.03708, over 24647.00 frames. ], tot_loss[loss=0.1594, simple_loss=0.2389, pruned_loss=0.03992, over 4655914.19 frames. ], batch size: 73, lr: 2.76e-03, grad_scale: 16.0 2023-10-03 13:20:01,230 WARNING [train.py:1204] (2/4) Exclude cut with ID _46683_210_9642_1_1533730913492_4591116_530-326202-0_sp1.1 from training. Duration: 0.936375 2023-10-03 13:20:01,279 WARNING [train.py:1204] (2/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_567-135114-0 from training. Duration: 0.87 2023-10-03 13:20:01,309 WARNING [train.py:1204] (2/4) Exclude cut with ID _66863_210_18053_1_1535698758334_8590319_1092-271784-0_sp1.1 from training. Duration: 0.87275 2023-10-03 13:20:01,318 WARNING [train.py:1204] (2/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_762-282614-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 13:20:02,740 WARNING [train.py:1204] (2/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_280-120942-0 from training. Duration: 0.77 2023-10-03 13:20:09,354 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_31583_210_3852_1_1532595851_4024413_27-170931-0_sp1.1 from training. Duration: 0.963625 2023-10-03 13:20:13,208 WARNING [train.py:1204] (2/4) Exclude cut with ID _12369_210_4381_1_1530356078581_4388520_266-271628-0_sp1.1 from training. Duration: 0.92725 2023-10-03 13:20:13,251 WARNING [train.py:1204] (2/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_41-31157-0 from training. Duration: 0.57 2023-10-03 13:20:16,163 WARNING [train.py:1204] (2/4) Exclude cut with ID _21878_210_3525_1_1531738772002_7264330_113-18163-0_sp1.1 from training. Duration: 0.8090625 2023-10-03 13:20:17,511 WARNING [train.py:1204] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_147-111137-0 from training. Duration: 0.95 2023-10-03 13:20:17,588 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1083-220122-0_sp1.1 from training. Duration: 0.463625 2023-10-03 13:20:17,679 WARNING [train.py:1204] (2/4) Exclude cut with ID _14884_210_2252_1_1530707459689_5561250_591-173406-0_sp1.1 from training. Duration: 0.87275 2023-10-03 13:20:17,685 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_24677_210_1794_1_1532002693_4341296_149-303793-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 13:20:19,112 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_12868_210_6753_1_1530702479_3610564_133-564-0_sp1.1 from training. Duration: 0.7181875 2023-10-03 13:20:19,132 WARNING [train.py:1204] (2/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_625-338546-0_sp0.9 from training. Duration: 0.97775 2023-10-03 13:20:31,305 WARNING [train.py:1204] (2/4) Exclude cut with ID _25882_210_3036_1_1532775665815_6479510_624-320169-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 13:20:31,316 WARNING [train.py:1204] (2/4) Exclude cut with ID _20622_210_3852_1_1531622427080_1093839_127-325326-0_sp1.1 from training. Duration: 0.92725 2023-10-03 13:20:31,346 WARNING [train.py:1204] (2/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_464-33931-0_sp1.1 from training. Duration: 0.4909375 2023-10-03 13:20:32,856 WARNING [train.py:1204] (2/4) Exclude cut with ID _66112_210_10635_1_1535590236305_3801484_120-304588-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 13:20:33,013 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=1281046.6666666667, ans=0.1 2023-10-03 13:20:35,814 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.conv_skip_rate, batch_count=1281046.6666666667, ans=0.0 2023-10-03 13:20:37,674 WARNING [train.py:1204] (2/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_110-130961-0 from training. Duration: 0.47 2023-10-03 13:20:40,329 WARNING [train.py:1204] (2/4) Exclude cut with ID _16908_210_5724_1_1531119157248_3772700_64-54020-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 13:20:43,403 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.conv_module2.balancer2.prob, batch_count=1281046.6666666667, ans=0.125 2023-10-03 13:20:44,632 WARNING [train.py:1204] (2/4) Exclude cut with ID _4068_210_2629_1_1526697350395_1546259_224-288693-0_sp1.1 from training. Duration: 0.536375 2023-10-03 13:20:45,974 WARNING [train.py:1204] (2/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_191-41023-0_sp0.9 from training. Duration: 0.788875 2023-10-03 13:20:46,029 WARNING [train.py:1204] (2/4) Exclude cut with ID ID0205W0255-783002 from training. Duration: 0.958 2023-10-03 13:20:47,331 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_162-262717-0 from training. Duration: 0.83 2023-10-03 13:20:53,027 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_224-322394-0_sp1.1 from training. Duration: 0.6 2023-10-03 13:20:53,033 WARNING [train.py:1204] (2/4) Exclude cut with ID _4866_210_2577_1_1527404412284_4364659_133-1696-0_sp1.1 from training. Duration: 0.9 2023-10-03 13:20:53,085 WARNING [train.py:1204] (2/4) Exclude cut with ID _6275_210_2084_1_1528110062213_7669049_973-198085-0_sp1.1 from training. Duration: 0.6818125 2023-10-03 13:20:53,341 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.balancer2.prob, batch_count=1281113.3333333333, ans=0.125 2023-10-03 13:21:00,752 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_47449_210_5399_1_1534076445_4006929_410-237806-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 13:21:00,765 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7543_210_6753_1_1528543318_4552_1-85072-0_sp0.9 from training. Duration: 0.92225 2023-10-03 13:21:01,054 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.nonlin_attention.balancer.prob, batch_count=1281180.0, ans=0.125 2023-10-03 13:21:02,169 WARNING [train.py:1204] (2/4) Exclude cut with ID _65263_210_22356_1_1535763598691_3921037_108-21902-0 from training. Duration: 0.99 2023-10-03 13:21:02,174 WARNING [train.py:1204] (2/4) Exclude cut with ID _65585_210_12210_1_1535450451584_3764098_309-174818-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 13:21:02,472 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.feed_forward1.hidden_balancer.prob, batch_count=1281180.0, ans=0.125 2023-10-03 13:21:02,976 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.3.encoder.layers.1.self_attn_weights.whiten_keys, num_groups=8, num_channels=256, metric=3.99 vs. limit=6.0 2023-10-03 13:21:03,672 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_309-65177-0 from training. Duration: 0.88 2023-10-03 13:21:05,051 WARNING [train.py:1204] (2/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_363-185703-0_sp1.1 from training. Duration: 0.863625 2023-10-03 13:21:05,094 WARNING [train.py:1204] (2/4) Exclude cut with ID _42326_210_7770_1_1533814282531_3707269_442-238692-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 13:21:06,457 WARNING [train.py:1204] (2/4) Exclude cut with ID _69884_210_9771_1_1535846392818_4166169_226-254446-0_sp1.1 from training. Duration: 0.936375 2023-10-03 13:21:06,473 WARNING [train.py:1204] (2/4) Exclude cut with ID _29853_210_7497_1_1532505600851_6642110_287-299903-0_sp1.1 from training. Duration: 0.963625 2023-10-03 13:21:08,594 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.balancer1.prob, batch_count=1281180.0, ans=0.125 2023-10-03 13:21:11,594 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1015-343884-0 from training. Duration: 0.62 2023-10-03 13:21:11,624 WARNING [train.py:1204] (2/4) Exclude cut with ID ID0305W0008-833402 from training. Duration: 0.91 2023-10-03 13:21:11,746 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6673_210_3273_1_1527767790_3173240_351-167954-0_sp1.1 from training. Duration: 0.436375 2023-10-03 13:21:11,758 WARNING [train.py:1204] (2/4) Exclude cut with ID _6456_210_1534_1_1527681634212_3181049_345-252877-0 from training. Duration: 0.64 2023-10-03 13:21:13,799 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.2.encoder.layers.0.self_attn1.whiten, num_groups=1, num_channels=384, metric=20.53 vs. limit=22.5 2023-10-03 13:21:14,533 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.conv_skip_rate, batch_count=1281246.6666666667, ans=0.0 2023-10-03 13:21:15,626 INFO [train.py:1046] (2/4) Epoch 37, batch 950, loss[loss=0.151, simple_loss=0.2241, pruned_loss=0.03895, over 23883.00 frames. ], tot_loss[loss=0.1601, simple_loss=0.2397, pruned_loss=0.04023, over 4666603.03 frames. ], batch size: 196, lr: 2.76e-03, grad_scale: 16.0 2023-10-03 13:21:15,733 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_13-205182-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 13:21:19,734 WARNING [train.py:1204] (2/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_427-208901-0 from training. Duration: 0.68 2023-10-03 13:21:22,985 WARNING [train.py:1204] (2/4) Exclude cut with ID _49039_210_6315_1_1533967537541_3846118_9-148921-0_sp1.1 from training. Duration: 0.97275 2023-10-03 13:21:25,792 WARNING [train.py:1204] (2/4) Exclude cut with ID _59493_210_17282_1_1535078419077_3225879_143-70317-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 13:21:25,833 WARNING [train.py:1204] (2/4) Exclude cut with ID _27967_210_12140_1_1532588391859_8419130_1056-236369-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 13:21:25,878 WARNING [train.py:1204] (2/4) Exclude cut with ID _6537_210_1883_1_1527987559710_3597129_382-133062-0_sp0.9 from training. Duration: 0.7555625 2023-10-03 13:21:28,960 WARNING [train.py:1204] (2/4) Exclude cut with ID ID0376W0255-869957 from training. Duration: 0.9510625 2023-10-03 13:21:31,786 WARNING [train.py:1204] (2/4) Exclude cut with ID _49371_210_12071_1_1533952761021_3812190_483-240691-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 13:21:31,865 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_12868_210_6753_1_1530701932_530275_18-76322-0_sp1.1 from training. Duration: 0.7909375 2023-10-03 13:21:33,203 WARNING [train.py:1204] (2/4) Exclude cut with ID _53950_210_5816_1_1534417264919_3862951_367-26344-0_sp1.1 from training. Duration: 0.97275 2023-10-03 13:21:34,674 WARNING [train.py:1204] (2/4) Exclude cut with ID _5444_210_2151_1_1527418808464_5312210_634-194174-0_sp1.1 from training. Duration: 0.8818125 2023-10-03 13:21:34,700 WARNING [train.py:1204] (2/4) Exclude cut with ID _56494_210_4220_1_1535284837625_3033169_69-138150-0 from training. Duration: 0.94 2023-10-03 13:21:36,527 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7543_210_6753_1_1528544010_1110402_25-183860-0_sp0.9 from training. Duration: 0.52225 2023-10-03 13:21:37,249 WARNING [train.py:1204] (2/4) Exclude cut with ID _40284_210_2084_1_1533513679814_3704279_287-191918-0_sp1.1 from training. Duration: 0.963625 2023-10-03 13:21:38,764 WARNING [train.py:1204] (2/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_291-270881-0 from training. Duration: 0.97 2023-10-03 13:21:38,820 WARNING [train.py:1204] (2/4) Exclude cut with ID _12555_210_8750_1_1530075594402_3780239_449-267986-0_sp0.9 from training. Duration: 0.92225 2023-10-03 13:21:39,979 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.660e+02 1.894e+02 2.023e+02 2.225e+02 2.799e+02, threshold=4.046e+02, percent-clipped=0.0 2023-10-03 13:21:44,248 WARNING [train.py:1204] (2/4) Exclude cut with ID _3992_210_1747_1_1526644816031_3731060_268-64520-0_sp1.1 from training. Duration: 0.963625 2023-10-03 13:21:44,263 WARNING [train.py:1204] (2/4) Exclude cut with ID _39411_210_5986_1_1533538825030_3343649_173-238248-0_sp0.9 from training. Duration: 0.92225 2023-10-03 13:21:44,286 WARNING [train.py:1204] (2/4) Exclude cut with ID _32704_210_8954_1_1533017488840_6747370_828-313097-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 13:21:45,648 WARNING [train.py:1204] (2/4) Exclude cut with ID _26907_210_7234_1_1532221740694_3205263_337-2494-0 from training. Duration: 0.88 2023-10-03 13:21:47,141 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_229-45300-0_sp1.1 from training. Duration: 0.5181875 2023-10-03 13:21:48,582 WARNING [train.py:1204] (2/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_300-27235-0_sp1.1 from training. Duration: 0.7909375 2023-10-03 13:21:51,161 WARNING [train.py:1204] (2/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_48-338976-0_sp1.1 from training. Duration: 0.8090625 2023-10-03 13:21:56,096 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_13091_210_7925_1_1530609447_8987354_792-195353-0_sp1.1 from training. Duration: 0.936375 2023-10-03 13:21:56,102 WARNING [train.py:1204] (2/4) Exclude cut with ID _56524_210_9642_1_1534572011779_3619170_311-54500-0_sp1.1 from training. Duration: 0.97275 2023-10-03 13:22:00,730 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7543_210_6753_1_1528544010_1110402_25-183860-0 from training. Duration: 0.47 2023-10-03 13:22:02,200 WARNING [train.py:1204] (2/4) Exclude cut with ID 2411-132532-0017-82279-0_sp1.1 from training. Duration: 0.9681875 2023-10-03 13:22:02,201 WARNING [train.py:1204] (2/4) Exclude cut with ID _14170_210_5219_1_1530525949868_4469650_272-249985-0_sp0.9 from training. Duration: 0.7444375 2023-10-03 13:22:03,485 WARNING [train.py:1204] (2/4) Exclude cut with ID _55644_210_11856_1_1534593599582_4103719_320-229006-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 13:22:03,557 WARNING [train.py:1204] (2/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_177-100554-0_sp1.1 from training. Duration: 0.963625 2023-10-03 13:22:03,559 WARNING [train.py:1204] (2/4) Exclude cut with ID _14998_210_5219_1_1530705582080_7474970_787-294416-0_sp1.1 from training. Duration: 0.7454375 2023-10-03 13:22:06,399 WARNING [train.py:1204] (2/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_318-59097-0 from training. Duration: 0.76 2023-10-03 13:22:06,467 WARNING [train.py:1204] (2/4) Exclude cut with ID _12369_210_4381_1_1530356078581_4388520_480-13146-0_sp1.1 from training. Duration: 0.77275 2023-10-03 13:22:10,229 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_469-338558-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 13:22:10,299 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_40896_210_15710_1_1533433956_7100802_367-126897-0_sp1.1 from training. Duration: 0.963625 2023-10-03 13:22:10,316 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6545_210_2040_1_1527774670_4245258_379-14182-0 from training. Duration: 0.8 2023-10-03 13:22:10,327 WARNING [train.py:1204] (2/4) Exclude cut with ID _21631_210_2253_1_1531907880778_3160339_197-79498-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 13:22:10,331 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_12868_210_6753_1_1530702479_3610564_416-293746-0_sp1.1 from training. Duration: 0.8181875 2023-10-03 13:22:11,679 WARNING [train.py:1204] (2/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_52-211683-0 from training. Duration: 0.9 2023-10-03 13:22:13,691 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.3.encoder.layers.0.nonlin_attention.whiten2, num_groups=1, num_channels=512, metric=9.36 vs. limit=15.0 2023-10-03 13:22:14,448 WARNING [train.py:1204] (2/4) Exclude cut with ID _48678_210_5663_1_1533988830947_3769199_306-255886-0_sp0.9 from training. Duration: 0.8555625 2023-10-03 13:22:17,167 WARNING [train.py:1204] (2/4) Exclude cut with ID _65134_210_14763_1_1535335393985_3505368_296-24733-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 13:22:21,334 WARNING [train.py:1204] (2/4) Exclude cut with ID _14898_210_9206_1_1530766812377_4177190_335-99364-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 13:22:22,691 WARNING [train.py:1204] (2/4) Exclude cut with ID _54871_210_22356_1_1534665798253_3856359_19-342632-0 from training. Duration: 0.91 2023-10-03 13:22:22,714 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_53071_210_16554_1_1534490405_2954066_265-170472-0 from training. Duration: 0.83 2023-10-03 13:22:27,904 WARNING [train.py:1204] (2/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_189-298172-0_sp1.1 from training. Duration: 0.963625 2023-10-03 13:22:30,581 INFO [train.py:1046] (2/4) Epoch 37, batch 1000, loss[loss=0.1552, simple_loss=0.2261, pruned_loss=0.0422, over 23773.00 frames. ], tot_loss[loss=0.1586, simple_loss=0.2371, pruned_loss=0.03999, over 4656851.56 frames. ], batch size: 212, lr: 2.76e-03, grad_scale: 16.0 2023-10-03 13:22:30,690 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7615_210_4211_1_1528110965_4580160_72-235211-0 from training. Duration: 0.76 2023-10-03 13:22:30,733 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder_embed.dropout.p, batch_count=1281580.0, ans=0.1 2023-10-03 13:22:31,988 WARNING [train.py:1204] (2/4) Exclude cut with ID _47790_210_16187_1_1533862814066_4207519_155-90874-0_sp1.1 from training. Duration: 0.936375 2023-10-03 13:22:34,683 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.2.encoder.layers.1.whiten, num_groups=1, num_channels=384, metric=4.37 vs. limit=12.0 2023-10-03 13:22:38,100 WARNING [train.py:1204] (2/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_164-216310-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 13:22:39,448 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_46013_210_7234_1_1533776362_3744032_463-52378-0 from training. Duration: 0.86 2023-10-03 13:22:39,451 WARNING [train.py:1204] (2/4) Exclude cut with ID _61133_210_9969_1_1535196777227_3696409_333-270325-0 from training. Duration: 0.93 2023-10-03 13:22:44,129 WARNING [train.py:1204] (2/4) Exclude cut with ID _5172_210_1883_1_1527246062567_3577219_18-101380-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 13:22:44,137 WARNING [train.py:1204] (2/4) Exclude cut with ID _30800_210_22382_1_1532499347904_4515959_577-236858-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 13:22:45,547 WARNING [train.py:1204] (2/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_1006-177345-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 13:22:46,951 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6291_210_3273_1_1527683649_1078498_4-266348-0 from training. Duration: 0.78 2023-10-03 13:22:51,034 WARNING [train.py:1204] (2/4) Exclude cut with ID _55857_210_2346_1_1534644007831_6898209_745-98391-0 from training. Duration: 0.76 2023-10-03 13:22:52,422 WARNING [train.py:1204] (2/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_375-244976-0 from training. Duration: 0.75 2023-10-03 13:22:52,474 WARNING [train.py:1204] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_4-248404-0_sp1.1 from training. Duration: 0.97275 2023-10-03 13:22:54,359 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8453_210_4211_1_1528502940_8562730_596-306820-0 from training. Duration: 0.97 2023-10-03 13:22:54,603 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.conv_module2.balancer1.prob, batch_count=1281646.6666666667, ans=0.125 2023-10-03 13:22:56,412 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_211-339622-0_sp0.9 from training. Duration: 0.5 2023-10-03 13:22:57,753 WARNING [train.py:1204] (2/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_393-57888-0 from training. Duration: 0.64 2023-10-03 13:22:59,143 WARNING [train.py:1204] (2/4) Exclude cut with ID _23825_210_13449_1_1531915313526_3497150_143-343575-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 13:22:59,201 WARNING [train.py:1204] (2/4) Exclude cut with ID _58605_210_12210_1_1534723195939_3904259_36-12256-0_sp1.1 from training. Duration: 0.936375 2023-10-03 13:23:07,909 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.balancer2.prob, batch_count=1281713.3333333333, ans=0.125 2023-10-03 13:23:08,935 WARNING [train.py:1204] (2/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_38-67795-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 13:23:09,005 WARNING [train.py:1204] (2/4) Exclude cut with ID _64631_210_27767_1_1535349580068_4165160_308-327832-0_sp1.1 from training. Duration: 0.92725 2023-10-03 13:23:10,306 WARNING [train.py:1204] (2/4) Exclude cut with ID _37086_210_9038_1_1532999540891_7021250_726-81176-0_sp1.1 from training. Duration: 0.936375 2023-10-03 13:23:11,727 WARNING [train.py:1204] (2/4) Exclude cut with ID _47790_210_16187_1_1533862814066_4207519_47-141521-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 13:23:11,732 WARNING [train.py:1204] (2/4) Exclude cut with ID _3067_210_2667_1_1526212974640_4503168_250-285937-0 from training. Duration: 0.8 2023-10-03 13:23:11,755 WARNING [train.py:1204] (2/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_239-24000-0_sp1.1 from training. Duration: 0.97275 2023-10-03 13:23:11,814 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_3371_210_1794_1_1526384462_5580777_396-339812-0_sp0.9 from training. Duration: 0.9444375 2023-10-03 13:23:11,858 WARNING [train.py:1204] (2/4) Exclude cut with ID _16909_210_5724_1_1531292079912_4118919_580-173454-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 13:23:11,897 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder_embed.convnext.layerdrop_rate, batch_count=1281713.3333333333, ans=0.015 2023-10-03 13:23:13,216 WARNING [train.py:1204] (2/4) Exclude cut with ID IC0124W0002-61308 from training. Duration: 0.982 2023-10-03 13:23:18,107 WARNING [train.py:1204] (2/4) Exclude cut with ID _31602_210_5854_1_1532677021456_4458250_249-96604-0 from training. Duration: 0.89 2023-10-03 13:23:18,186 WARNING [train.py:1204] (2/4) Exclude cut with ID _69584_210_5219_1_1535785133775_4044450_325-7656-0 from training. Duration: 0.86 2023-10-03 13:23:19,600 WARNING [train.py:1204] (2/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_478-162247-0 from training. Duration: 0.82 2023-10-03 13:23:22,304 WARNING [train.py:1204] (2/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_686-54383-0_sp1.1 from training. Duration: 0.7909375 2023-10-03 13:23:28,855 WARNING [train.py:1204] (2/4) Exclude cut with ID _29303_210_2575_1_1532392237303_7410649_85-270092-0_sp1.1 from training. Duration: 0.936375 2023-10-03 13:23:28,868 WARNING [train.py:1204] (2/4) Exclude cut with ID _2322_210_2667_1_1525604548971_3656299_440-80494-0_sp1.1 from training. Duration: 0.8 2023-10-03 13:23:30,124 WARNING [train.py:1204] (2/4) Exclude cut with ID _47346_210_9476_1_1533815971553_3536160_201-40644-0_sp1.1 from training. Duration: 0.936375 2023-10-03 13:23:31,451 WARNING [train.py:1204] (2/4) Exclude cut with ID _56235_210_10640_1_1534557335235_4161099_343-139898-0_sp1.1 from training. Duration: 0.87275 2023-10-03 13:23:31,559 WARNING [train.py:1204] (2/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_1012-279473-0 from training. Duration: 0.67 2023-10-03 13:23:34,165 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7931_210_1794_1_1528594728_5564059_280-279560-0_sp1.1 from training. Duration: 0.8909375 2023-10-03 13:23:34,218 WARNING [train.py:1204] (2/4) Exclude cut with ID _8263_210_1664_1_1528439452891_970220_68-181805-0 from training. Duration: 0.64 2023-10-03 13:23:35,528 WARNING [train.py:1204] (2/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_605-133395-0 from training. Duration: 0.66 2023-10-03 13:23:35,635 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_43-171109-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 13:23:37,476 WARNING [train.py:1204] (2/4) Exclude cut with ID _40094_210_16380_1_1533294005773_3641159_107-280739-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 13:23:38,877 WARNING [train.py:1204] (2/4) Exclude cut with ID _14821_210_8750_1_1530680503991_6829359_2-227151-0_sp0.9 from training. Duration: 0.92225 2023-10-03 13:23:40,474 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.out_combiner.scale_min, batch_count=1281846.6666666667, ans=0.2 2023-10-03 13:23:41,700 WARNING [train.py:1204] (2/4) Exclude cut with ID _14268_210_9206_1_1530622771285_4714099_331-105351-0_sp1.1 from training. Duration: 0.5909375 2023-10-03 13:23:44,313 INFO [train.py:1046] (2/4) Epoch 37, batch 1050, loss[loss=0.1461, simple_loss=0.224, pruned_loss=0.03408, over 24321.00 frames. ], tot_loss[loss=0.1578, simple_loss=0.2362, pruned_loss=0.03964, over 4679677.37 frames. ], batch size: 56, lr: 2.76e-03, grad_scale: 16.0 2023-10-03 13:23:44,368 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_26326_210_7925_1_1533002493_3592406_451-82741-0_sp1.1 from training. Duration: 0.97275 2023-10-03 13:23:47,667 WARNING [train.py:1204] (2/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_191-59784-0_sp0.9 from training. Duration: 0.97775 2023-10-03 13:23:47,765 WARNING [train.py:1204] (2/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_445-117398-0_sp1.1 from training. Duration: 0.8090625 2023-10-03 13:23:49,160 WARNING [train.py:1204] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_605-257205-0_sp0.9 from training. Duration: 0.6555625 2023-10-03 13:23:49,241 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_50050_210_16223_1_1535435796_7290138_592-258841-0_sp1.1 from training. Duration: 0.936375 2023-10-03 13:23:50,705 WARNING [train.py:1204] (2/4) Exclude cut with ID _14348_210_8341_1_1530599434996_3778640_240-232370-0_sp0.9 from training. Duration: 0.9333125 2023-10-03 13:23:53,496 WARNING [train.py:1204] (2/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_239-316802-0_sp0.9 from training. Duration: 0.7666875 2023-10-03 13:23:56,551 WARNING [train.py:1204] (2/4) Exclude cut with ID _45560_210_21982_1_1533812603951_3584999_111-6376-0_sp1.1 from training. Duration: 0.763625 2023-10-03 13:23:58,587 WARNING [train.py:1204] (2/4) Exclude cut with ID _54189_210_16380_1_1534332670039_3911710_606-311481-0_sp1.1 from training. Duration: 0.963625 2023-10-03 13:23:58,641 WARNING [train.py:1204] (2/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_237-332027-0_sp1.1 from training. Duration: 0.863625 2023-10-03 13:23:59,925 WARNING [train.py:1204] (2/4) Exclude cut with ID _66070_210_7640_1_1535441393096_7805310_489-292631-0_sp0.9 from training. Duration: 0.87775 2023-10-03 13:23:59,993 WARNING [train.py:1204] (2/4) Exclude cut with ID _49728_210_12651_1_1534041034294_6788150_624-195301-0_sp1.1 from training. Duration: 0.77275 2023-10-03 13:24:01,475 WARNING [train.py:1204] (2/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_44-79427-0 from training. Duration: 0.98 2023-10-03 13:24:02,843 WARNING [train.py:1204] (2/4) Exclude cut with ID _35350_210_16040_1_1533040191183_4075299_490-198354-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 13:24:02,886 WARNING [train.py:1204] (2/4) Exclude cut with ID _5166_210_2084_1_1527073197160_4087993_309-222545-0 from training. Duration: 0.84 2023-10-03 13:24:05,578 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_13091_210_7925_1_1530609447_8987354_790-199469-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 13:24:05,585 WARNING [train.py:1204] (2/4) Exclude cut with ID _21632_210_2253_1_1531994001765_3744140_110-274654-0 from training. Duration: 0.82 2023-10-03 13:24:06,226 WARNING [train.py:1204] (2/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_498-40153-0_sp0.9 from training. Duration: 0.7 2023-10-03 13:24:08,988 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.473e+02 1.790e+02 1.943e+02 2.142e+02 2.975e+02, threshold=3.887e+02, percent-clipped=0.0 2023-10-03 13:24:09,269 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.bypass.skip_rate, batch_count=1281980.0, ans=0.07 2023-10-03 13:24:13,236 WARNING [train.py:1204] (2/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_363-282909-0_sp1.1 from training. Duration: 0.936375 2023-10-03 13:24:14,559 WARNING [train.py:1204] (2/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_178-177582-0_sp0.9 from training. Duration: 0.82225 2023-10-03 13:24:14,578 WARNING [train.py:1204] (2/4) Exclude cut with ID _24612_210_8913_1_1531998054193_3806359_357-35487-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 13:24:16,093 WARNING [train.py:1204] (2/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_138-332773-0 from training. Duration: 0.81 2023-10-03 13:24:16,121 WARNING [train.py:1204] (2/4) Exclude cut with ID _7624_210_1531_1_1528109802198_4933101_418-349834-0 from training. Duration: 0.6 2023-10-03 13:24:16,140 WARNING [train.py:1204] (2/4) Exclude cut with ID _7537_210_2084_1_1528502395431_3924359_14-312906-0_sp0.9 from training. Duration: 0.9333125 2023-10-03 13:24:19,448 WARNING [train.py:1204] (2/4) Exclude cut with ID _60057_210_10640_1_1534903065793_3333150_362-230377-0 from training. Duration: 0.87 2023-10-03 13:24:22,153 WARNING [train.py:1204] (2/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_138-64210-0 from training. Duration: 0.73 2023-10-03 13:24:22,223 WARNING [train.py:1204] (2/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_454-339504-0_sp1.1 from training. Duration: 0.97275 2023-10-03 13:24:25,027 WARNING [train.py:1204] (2/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_508-335369-0_sp0.9 from training. Duration: 0.5333125 2023-10-03 13:24:26,848 WARNING [train.py:1204] (2/4) Exclude cut with ID _5608_210_2106_1_1527415439089_6819768_909-82767-0_sp0.9 from training. Duration: 0.67775 2023-10-03 13:24:28,655 WARNING [train.py:1204] (2/4) Exclude cut with ID _6694_210_4599_1_1527852637490_3965529_99-11611-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 13:24:28,714 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_2655_210_2087_1_1525688959_3897400_52-279899-0_sp1.1 from training. Duration: 0.62725 2023-10-03 13:24:32,792 WARNING [train.py:1204] (2/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_375-245677-0_sp1.1 from training. Duration: 0.736375 2023-10-03 13:24:37,470 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_23850_210_4238_1_1532232597_1632433_32-185135-0 from training. Duration: 0.86 2023-10-03 13:24:38,816 WARNING [train.py:1204] (2/4) Exclude cut with ID _15113_210_8254_1_1530842438579_3716279_209-54533-0 from training. Duration: 0.96 2023-10-03 13:24:38,854 WARNING [train.py:1204] (2/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_401-53423-0 from training. Duration: 0.55 2023-10-03 13:24:38,894 WARNING [train.py:1204] (2/4) Exclude cut with ID _14867_210_1968_1_1530684179890_3846969_319-250868-0_sp1.1 from training. Duration: 0.9 2023-10-03 13:24:38,908 WARNING [train.py:1204] (2/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_106-30782-0_sp0.9 from training. Duration: 0.888875 2023-10-03 13:24:40,328 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_5960_210_1794_1_1527403865_3990999_515-2613-0 from training. Duration: 0.88 2023-10-03 13:24:41,938 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.ff2_skip_rate, batch_count=1282113.3333333333, ans=0.0 2023-10-03 13:24:44,476 WARNING [train.py:1204] (2/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_798-106179-0_sp0.9 from training. Duration: 0.92225 2023-10-03 13:24:45,904 WARNING [train.py:1204] (2/4) Exclude cut with ID _14227_210_4381_1_1530701804286_3940259_125-275642-0_sp1.1 from training. Duration: 0.9 2023-10-03 13:24:45,906 WARNING [train.py:1204] (2/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_79-200392-0_sp1.1 from training. Duration: 0.8818125 2023-10-03 13:24:47,157 WARNING [train.py:1204] (2/4) Exclude cut with ID _48836_210_12210_1_1534075848293_3904150_429-35475-0_sp0.9 from training. Duration: 0.988875 2023-10-03 13:24:47,180 WARNING [train.py:1204] (2/4) Exclude cut with ID _69884_210_9771_1_1535846392818_4166169_224-172517-0_sp1.1 from training. Duration: 0.97275 2023-10-03 13:24:50,651 WARNING [train.py:1204] (2/4) Exclude cut with ID _15113_210_8254_1_1530842438579_3716279_76-232507-0_sp1.1 from training. Duration: 0.97275 2023-10-03 13:24:50,653 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_135-5501-0 from training. Duration: 0.98 2023-10-03 13:24:52,195 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.conv_module2.balancer1.prob, batch_count=1282180.0, ans=0.125 2023-10-03 13:24:53,266 WARNING [train.py:1204] (2/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_563-347832-0_sp0.9 from training. Duration: 0.988875 2023-10-03 13:24:53,284 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6786_210_1388_1_1527839699_4194495_273-149841-0 from training. Duration: 0.92 2023-10-03 13:24:53,309 WARNING [train.py:1204] (2/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_172-215932-0 from training. Duration: 0.66 2023-10-03 13:24:54,630 WARNING [train.py:1204] (2/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_939-132910-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 13:24:58,569 WARNING [train.py:1204] (2/4) Exclude cut with ID _49371_210_12071_1_1533952761021_3812190_16-246001-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 13:24:59,844 INFO [train.py:1046] (2/4) Epoch 37, batch 1100, loss[loss=0.1509, simple_loss=0.2371, pruned_loss=0.03239, over 24666.00 frames. ], tot_loss[loss=0.1577, simple_loss=0.2364, pruned_loss=0.03949, over 4670932.51 frames. ], batch size: 65, lr: 2.76e-03, grad_scale: 16.0 2023-10-03 13:25:02,629 WARNING [train.py:1204] (2/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_680-189165-0_sp1.1 from training. Duration: 0.936375 2023-10-03 13:25:07,356 WARNING [train.py:1204] (2/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_53-300544-0_sp1.1 from training. Duration: 0.6090625 2023-10-03 13:25:08,619 WARNING [train.py:1204] (2/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_540-275685-0_sp1.1 from training. Duration: 0.8090625 2023-10-03 13:25:09,892 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_38965_210_4852_1_1533174906_3484735_453-106579-0_sp1.1 from training. Duration: 0.92725 2023-10-03 13:25:09,937 WARNING [train.py:1204] (2/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_105-328174-0 from training. Duration: 0.76 2023-10-03 13:25:11,363 WARNING [train.py:1204] (2/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_279-126205-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 13:25:14,149 WARNING [train.py:1204] (2/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_5-55893-0_sp0.9 from training. Duration: 0.611125 2023-10-03 13:25:15,601 WARNING [train.py:1204] (2/4) Exclude cut with ID _13678_210_7497_1_1530530069226_3463170_141-10835-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 13:25:18,309 WARNING [train.py:1204] (2/4) Exclude cut with ID _22083_210_8254_1_1531702862439_3678970_201-85738-0_sp1.1 from training. Duration: 0.8181875 2023-10-03 13:25:18,336 WARNING [train.py:1204] (2/4) Exclude cut with ID _4697_210_2069_1_1526815801953_3823098_51-1049-0 from training. Duration: 0.75 2023-10-03 13:25:18,406 WARNING [train.py:1204] (2/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_206-10097-0_sp0.9 from training. Duration: 0.5444375 2023-10-03 13:25:20,299 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_4528_210_3273_1_1527076654_3276777_360-63661-0_sp1.1 from training. Duration: 0.92725 2023-10-03 13:25:20,307 WARNING [train.py:1204] (2/4) Exclude cut with ID _64086_210_7994_1_1535270417019_7435949_449-31567-0_sp1.1 from training. Duration: 0.8818125 2023-10-03 13:25:23,051 WARNING [train.py:1204] (2/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_245-174553-0_sp0.9 from training. Duration: 0.97775 2023-10-03 13:25:24,472 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6545_210_2040_1_1527774670_4245258_379-14182-0_sp1.1 from training. Duration: 0.72725 2023-10-03 13:25:27,934 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=1282380.0, ans=0.1 2023-10-03 13:25:30,984 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_3133_210_2087_1_1526106446_2324690_256-206070-0_sp1.1 from training. Duration: 0.963625 2023-10-03 13:25:33,882 WARNING [train.py:1204] (2/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_278-104676-0 from training. Duration: 0.98 2023-10-03 13:25:35,242 WARNING [train.py:1204] (2/4) Exclude cut with ID IC0671W0065-331908 from training. Duration: 0.896 2023-10-03 13:25:35,290 WARNING [train.py:1204] (2/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_244-129917-0_sp1.1 from training. Duration: 0.97275 2023-10-03 13:25:38,596 WARNING [train.py:1204] (2/4) Exclude cut with ID _7852_210_5219_1_1528286471316_8139400_1129-170857-0_sp1.1 from training. Duration: 0.97275 2023-10-03 13:25:39,813 WARNING [train.py:1204] (2/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_443-30034-0_sp0.9 from training. Duration: 0.711125 2023-10-03 13:25:39,861 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_31583_210_3852_1_1532595851_4024413_250-332999-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 13:25:41,247 WARNING [train.py:1204] (2/4) Exclude cut with ID _8772_210_1850_1_1528635595972_3664011_247-336448-0 from training. Duration: 0.74 2023-10-03 13:25:42,512 WARNING [train.py:1204] (2/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_567-135114-0_sp0.9 from training. Duration: 0.9666875 2023-10-03 13:25:42,516 WARNING [train.py:1204] (2/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_557-349534-0_sp1.1 from training. Duration: 0.82725 2023-10-03 13:25:42,528 WARNING [train.py:1204] (2/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_1341-26728-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 13:25:42,595 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_40222_210_6228_1_1533385298_4481939_375-328051-0_sp1.1 from training. Duration: 0.97275 2023-10-03 13:25:43,818 WARNING [train.py:1204] (2/4) Exclude cut with ID _4697_210_2069_1_1526815801953_3823098_276-96593-0 from training. Duration: 0.77 2023-10-03 13:25:44,078 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.conv_module2.balancer1.prob, batch_count=1282446.6666666667, ans=0.125 2023-10-03 13:25:49,430 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_53071_210_16554_1_1534490405_2954066_265-170472-0_sp0.9 from training. Duration: 0.92225 2023-10-03 13:25:49,449 WARNING [train.py:1204] (2/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_365-1492-0 from training. Duration: 0.91 2023-10-03 13:25:52,164 WARNING [train.py:1204] (2/4) Exclude cut with ID _66873_210_5517_1_1535454136506_4293370_654-52713-0_sp0.9 from training. Duration: 0.8555625 2023-10-03 13:25:56,827 WARNING [train.py:1204] (2/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_113-92608-0_sp1.1 from training. Duration: 0.4909375 2023-10-03 13:25:58,999 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_46013_210_7234_1_1533776362_3744032_50-141535-0 from training. Duration: 0.99 2023-10-03 13:25:59,007 WARNING [train.py:1204] (2/4) Exclude cut with ID _13942_210_9988_1_1530604906036_3733360_499-55676-0_sp1.1 from training. Duration: 0.563625 2023-10-03 13:26:02,366 WARNING [train.py:1204] (2/4) Exclude cut with ID _30800_210_22382_1_1532499347904_4515959_375-65143-0_sp1.1 from training. Duration: 0.97275 2023-10-03 13:26:03,843 WARNING [train.py:1204] (2/4) Exclude cut with ID _16756_210_4915_1_1531047803763_7104259_199-325608-0_sp1.1 from training. Duration: 0.92725 2023-10-03 13:26:03,886 WARNING [train.py:1204] (2/4) Exclude cut with ID _31649_210_6228_1_1532584608063_4091078_443-124391-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 13:26:05,181 WARNING [train.py:1204] (2/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_146-218763-0 from training. Duration: 0.86 2023-10-03 13:26:06,449 WARNING [train.py:1204] (2/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_567-135114-0_sp1.1 from training. Duration: 0.7909375 2023-10-03 13:26:06,507 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_26326_210_7925_1_1533002493_3592406_462-288299-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 13:26:06,760 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.conv_module1.balancer1.prob, batch_count=1282513.3333333333, ans=0.125 2023-10-03 13:26:07,774 WARNING [train.py:1204] (2/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_322-217220-0 from training. Duration: 0.86 2023-10-03 13:26:09,664 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_238-256668-0_sp1.1 from training. Duration: 0.62725 2023-10-03 13:26:09,700 WARNING [train.py:1204] (2/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_371-18169-0 from training. Duration: 0.86 2023-10-03 13:26:10,995 WARNING [train.py:1204] (2/4) Exclude cut with ID _58480_210_12392_1_1534811121111_3646160_23-74512-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 13:26:11,000 WARNING [train.py:1204] (2/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_784-213099-0_sp1.1 from training. Duration: 0.8454375 2023-10-03 13:26:12,402 WARNING [train.py:1204] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_625-102955-0_sp1.1 from training. Duration: 0.7 2023-10-03 13:26:13,798 INFO [train.py:1046] (2/4) Epoch 37, batch 1150, loss[loss=0.157, simple_loss=0.2329, pruned_loss=0.04058, over 23694.00 frames. ], tot_loss[loss=0.1579, simple_loss=0.2371, pruned_loss=0.03931, over 4681850.31 frames. ], batch size: 232, lr: 2.76e-03, grad_scale: 8.0 2023-10-03 13:26:15,325 WARNING [train.py:1204] (2/4) Exclude cut with ID _49748_210_24116_1_1533986971737_3158910_176-174327-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 13:26:16,737 WARNING [train.py:1204] (2/4) Exclude cut with ID _50310_210_15488_1_1534068043345_3641080_432-96140-0_sp1.1 from training. Duration: 0.936375 2023-10-03 13:26:18,243 WARNING [train.py:1204] (2/4) Exclude cut with ID _40402_210_18219_1_1533522324115_2953150_76-14910-0_sp1.1 from training. Duration: 0.92725 2023-10-03 13:26:18,266 WARNING [train.py:1204] (2/4) Exclude cut with ID _21783_210_11357_1_1531742458730_3762679_40-19458-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 13:26:18,497 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.conv_module2.balancer2.min_positive, batch_count=1282580.0, ans=0.05 2023-10-03 13:26:19,670 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_14056_210_4163_1_1530604483_7572827_46-93547-0 from training. Duration: 0.95 2023-10-03 13:26:19,715 WARNING [train.py:1204] (2/4) Exclude cut with ID _21631_210_2253_1_1531907880778_3160339_321-35325-0_sp1.1 from training. Duration: 0.963625 2023-10-03 13:26:21,175 WARNING [train.py:1204] (2/4) Exclude cut with ID _4779_210_2084_1_1527318022944_3914893_500-285013-0 from training. Duration: 0.69 2023-10-03 13:26:22,524 WARNING [train.py:1204] (2/4) Exclude cut with ID _43552_210_8913_1_1533726031059_3523429_301-93324-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 13:26:22,532 WARNING [train.py:1204] (2/4) Exclude cut with ID _6473_210_1741_1_1527901307396_3815380_108-17335-0_sp1.1 from training. Duration: 0.5909375 2023-10-03 13:26:24,565 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.conv_module2.balancer1.prob, batch_count=1282580.0, ans=0.125 2023-10-03 13:26:25,924 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.conv_skip_rate, batch_count=1282580.0, ans=0.0 2023-10-03 13:26:27,588 WARNING [train.py:1204] (2/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_206-10097-0 from training. Duration: 0.49 2023-10-03 13:26:30,867 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_36756_210_7234_1_1533026991_966276_103-77513-0_sp1.1 from training. Duration: 0.97275 2023-10-03 13:26:34,848 WARNING [train.py:1204] (2/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_662-18193-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 13:26:34,921 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_26433_210_12108_1_1532157158_4056480_439-52438-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 13:26:36,726 WARNING [train.py:1204] (2/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_464-33931-0 from training. Duration: 0.54 2023-10-03 13:26:36,732 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_3474_210_2028_1_1526188513_4825246_145-309131-0_sp0.9 from training. Duration: 0.82225 2023-10-03 13:26:36,743 WARNING [train.py:1204] (2/4) Exclude cut with ID _9679_210_7497_1_1529129212906_4363929_446-308321-0_sp1.1 from training. Duration: 0.963625 2023-10-03 13:26:39,416 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.536e+02 1.859e+02 2.016e+02 2.195e+02 3.712e+02, threshold=4.032e+02, percent-clipped=0.0 2023-10-03 13:26:42,154 WARNING [train.py:1204] (2/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_662-275621-0 from training. Duration: 0.42 2023-10-03 13:26:43,646 WARNING [train.py:1204] (2/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_344-337148-0_sp1.1 from training. Duration: 0.97275 2023-10-03 13:26:43,772 WARNING [train.py:1204] (2/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_759-179071-0_sp1.1 from training. Duration: 0.92725 2023-10-03 13:26:52,094 WARNING [train.py:1204] (2/4) Exclude cut with ID _28783_210_7943_1_1532671164808_7155479_293-12188-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 13:26:58,624 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_17613_210_2151_1_1531137569_2366160_235-320304-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 13:27:00,466 WARNING [train.py:1204] (2/4) Exclude cut with ID _14349_210_8341_1_1530615710818_3442470_170-336541-0 from training. Duration: 0.86 2023-10-03 13:27:00,543 WARNING [train.py:1204] (2/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_375-218677-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 13:27:00,589 WARNING [train.py:1204] (2/4) Exclude cut with ID _28317_210_12742_1_1532480302381_8510325_829-202956-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 13:27:07,456 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9029W0436-509823 from training. Duration: 0.768 2023-10-03 13:27:10,697 WARNING [train.py:1204] (2/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_231-112214-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 13:27:16,265 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9059W0470-524883 from training. Duration: 0.3415625 2023-10-03 13:27:17,777 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_43266_210_1794_1_1533521789_4497218_206-79512-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 13:27:19,210 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_294-163793-0_sp0.9 from training. Duration: 0.911125 2023-10-03 13:27:19,236 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6290_210_3273_1_1527595224_1282787_115-87095-0_sp1.1 from training. Duration: 0.5 2023-10-03 13:27:20,625 WARNING [train.py:1204] (2/4) Exclude cut with ID _14100_210_8341_1_1530529320613_3678069_279-55760-0_sp1.1 from training. Duration: 0.8454375 2023-10-03 13:27:21,956 WARNING [train.py:1204] (2/4) Exclude cut with ID _14621_210_1925_1_1530686901185_3675420_419-234200-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 13:27:26,473 WARNING [train.py:1204] (2/4) Exclude cut with ID _60229_210_27767_1_1534838406158_3655350_353-213656-0_sp1.1 from training. Duration: 0.863625 2023-10-03 13:27:26,482 WARNING [train.py:1204] (2/4) Exclude cut with ID _48678_210_5663_1_1533988830947_3769199_306-255886-0_sp1.1 from training. Duration: 0.7 2023-10-03 13:27:28,277 INFO [train.py:1046] (2/4) Epoch 37, batch 1200, loss[loss=0.1466, simple_loss=0.2216, pruned_loss=0.03586, over 23746.00 frames. ], tot_loss[loss=0.1587, simple_loss=0.238, pruned_loss=0.03975, over 4689715.62 frames. ], batch size: 149, lr: 2.76e-03, grad_scale: 16.0 2023-10-03 13:27:29,767 WARNING [train.py:1204] (2/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_466-3492-0_sp1.1 from training. Duration: 0.97275 2023-10-03 13:27:29,768 WARNING [train.py:1204] (2/4) Exclude cut with ID _8755_210_1876_1_1528628559357_3258209_199-242167-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 13:27:29,838 WARNING [train.py:1204] (2/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_237-89684-0_sp1.1 from training. Duration: 0.87275 2023-10-03 13:27:30,005 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.nonlin_attention.balancer.prob, batch_count=1282913.3333333333, ans=0.125 2023-10-03 13:27:33,131 WARNING [train.py:1204] (2/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_70-84121-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 13:27:34,565 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7544_210_6753_1_1528587722_3576686_172-171750-0_sp1.1 from training. Duration: 0.5909375 2023-10-03 13:27:35,947 WARNING [train.py:1204] (2/4) Exclude cut with ID _24271_210_12658_1_1531987270022_7190519_40-321822-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 13:27:35,973 WARNING [train.py:1204] (2/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_227-83612-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 13:27:39,359 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9156W0015-572508 from training. Duration: 0.896 2023-10-03 13:27:41,993 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_292-265928-0 from training. Duration: 0.51 2023-10-03 13:27:44,832 WARNING [train.py:1204] (2/4) Exclude cut with ID _14747_210_8341_1_1530685797463_3654089_147-131229-0_sp1.1 from training. Duration: 0.6909375 2023-10-03 13:27:47,550 WARNING [train.py:1204] (2/4) Exclude cut with ID _45297_210_2721_1_1533715351381_3264949_351-147136-0_sp0.9 from training. Duration: 0.9666875 2023-10-03 13:27:50,290 WARNING [train.py:1204] (2/4) Exclude cut with ID _5250_210_5219_1_1527249561806_8692110_1102-324568-0_sp1.1 from training. Duration: 0.97275 2023-10-03 13:27:51,719 WARNING [train.py:1204] (2/4) Exclude cut with ID _18516_210_9206_1_1531704653206_3692406_465-75446-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 13:27:51,720 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9203W0126-595623 from training. Duration: 0.896 2023-10-03 13:27:51,798 WARNING [train.py:1204] (2/4) Exclude cut with ID _55383_210_10635_1_1534468426172_2231669_139-328410-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 13:28:01,807 WARNING [train.py:1204] (2/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_166-246557-0_sp0.9 from training. Duration: 0.711125 2023-10-03 13:28:01,812 WARNING [train.py:1204] (2/4) Exclude cut with ID _30039_210_13270_1_1532484612900_2725150_239-304872-0_sp1.1 from training. Duration: 0.963625 2023-10-03 13:28:01,837 WARNING [train.py:1204] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_6-226750-0 from training. Duration: 0.94 2023-10-03 13:28:03,763 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_135-5501-0_sp1.1 from training. Duration: 0.8909375 2023-10-03 13:28:05,434 WARNING [train.py:1204] (2/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_359-118926-0 from training. Duration: 0.72 2023-10-03 13:28:05,621 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.balancer1.prob, batch_count=1283046.6666666667, ans=0.125 2023-10-03 13:28:09,540 WARNING [train.py:1204] (2/4) Exclude cut with ID _8064_210_5399_1_1528372299291_4282973_4-91037-0 from training. Duration: 0.88 2023-10-03 13:28:09,558 WARNING [train.py:1204] (2/4) Exclude cut with ID _6653_210_2629_1_1527919172441_6732679_415-288492-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 13:28:10,873 WARNING [train.py:1204] (2/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_544-287179-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 13:28:10,985 WARNING [train.py:1204] (2/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_122-32606-0_sp1.1 from training. Duration: 0.92725 2023-10-03 13:28:12,840 WARNING [train.py:1204] (2/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_121-284152-0_sp1.1 from training. Duration: 0.536375 2023-10-03 13:28:15,530 WARNING [train.py:1204] (2/4) Exclude cut with ID _44718_210_5816_1_1534050051869_6469030_350-299477-0_sp1.1 from training. Duration: 0.97275 2023-10-03 13:28:15,542 WARNING [train.py:1204] (2/4) Exclude cut with ID _6653_210_2629_1_1527919172441_6732679_1103-56445-0_sp0.9 from training. Duration: 0.811125 2023-10-03 13:28:16,884 WARNING [train.py:1204] (2/4) Exclude cut with ID _12369_210_4381_1_1530356078581_4388520_64-298917-0_sp0.9 from training. Duration: 0.97775 2023-10-03 13:28:16,948 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_2245_210_2715_1_1525511548_3540254_310-52483-0 from training. Duration: 0.88 2023-10-03 13:28:16,991 WARNING [train.py:1204] (2/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_401-158239-0_sp1.1 from training. Duration: 0.6454375 2023-10-03 13:28:18,368 WARNING [train.py:1204] (2/4) Exclude cut with ID _59502_210_12210_1_1535107024698_1835139_146-162495-0_sp1.1 from training. Duration: 0.836375 2023-10-03 13:28:18,386 WARNING [train.py:1204] (2/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_41-31157-0_sp0.9 from training. Duration: 0.6333125 2023-10-03 13:28:21,100 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_68717_210_3528_1_1535628683_1556759_95-36722-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 13:28:21,107 WARNING [train.py:1204] (2/4) Exclude cut with ID _30196_210_1664_1_1533708496522_7074140_654-162611-0_sp1.1 from training. Duration: 0.92725 2023-10-03 13:28:23,968 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_9302_210_2550_1_1528889962_4655416_638-301620-0_sp0.9 from training. Duration: 0.52225 2023-10-03 13:28:26,792 WARNING [train.py:1204] (2/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_478-162247-0_sp1.1 from training. Duration: 0.7454375 2023-10-03 13:28:29,541 WARNING [train.py:1204] (2/4) Exclude cut with ID _45297_210_2721_1_1533715351381_3264949_353-9541-0 from training. Duration: 0.97 2023-10-03 13:28:31,750 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9653W0190-675468 from training. Duration: 0.9935 2023-10-03 13:28:35,132 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_49730_210_13049_1_1534075294_1830676_214-84436-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 13:28:36,560 WARNING [train.py:1204] (2/4) Exclude cut with ID _50093_210_4220_1_1534417141207_3432299_259-322252-0_sp1.1 from training. Duration: 0.836375 2023-10-03 13:28:37,916 WARNING [train.py:1204] (2/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_599-50220-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 13:28:39,311 WARNING [train.py:1204] (2/4) Exclude cut with ID _21878_210_3525_1_1531738772002_7264330_174-109016-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 13:28:42,547 INFO [train.py:1046] (2/4) Epoch 37, batch 1250, loss[loss=0.184, simple_loss=0.2558, pruned_loss=0.05606, over 22803.00 frames. ], tot_loss[loss=0.1594, simple_loss=0.2386, pruned_loss=0.04004, over 4696527.85 frames. ], batch size: 322, lr: 2.76e-03, grad_scale: 16.0 2023-10-03 13:28:42,643 WARNING [train.py:1204] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_665-196155-0 from training. Duration: 0.67 2023-10-03 13:28:45,587 WARNING [train.py:1204] (2/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_656-188361-0_sp1.1 from training. Duration: 0.7909375 2023-10-03 13:28:46,486 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.0.layers.1.nonlin_attention.whiten1, num_groups=1, num_channels=144, metric=5.01 vs. limit=10.0 2023-10-03 13:28:46,879 WARNING [train.py:1204] (2/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_60-18123-0_sp1.1 from training. Duration: 0.8 2023-10-03 13:28:46,941 WARNING [train.py:1204] (2/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_535-14410-0 from training. Duration: 0.47 2023-10-03 13:28:49,604 WARNING [train.py:1204] (2/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_94-126128-0_sp1.1 from training. Duration: 0.7818125 2023-10-03 13:28:49,696 WARNING [train.py:1204] (2/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_356-31216-0_sp1.1 from training. Duration: 0.6818125 2023-10-03 13:28:52,622 WARNING [train.py:1204] (2/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_443-30034-0_sp1.1 from training. Duration: 0.5818125 2023-10-03 13:28:53,890 WARNING [train.py:1204] (2/4) Exclude cut with ID _26907_210_7234_1_1532221740694_3205263_337-2494-0_sp1.1 from training. Duration: 0.8 2023-10-03 13:28:53,967 WARNING [train.py:1204] (2/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_49-97158-0_sp0.9 from training. Duration: 0.8444375 2023-10-03 13:28:55,303 WARNING [train.py:1204] (2/4) Exclude cut with ID _7877_210_6014_1_1528286358099_3750136_553-170128-0_sp1.1 from training. Duration: 0.963625 2023-10-03 13:28:56,762 WARNING [train.py:1204] (2/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_202-293523-0_sp0.9 from training. Duration: 0.8 2023-10-03 13:28:59,944 WARNING [train.py:1204] (2/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_188-171673-0_sp0.9 from training. Duration: 0.6444375 2023-10-03 13:28:59,965 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_120-41668-0_sp1.1 from training. Duration: 0.636375 2023-10-03 13:29:01,723 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_40223_210_6228_1_1533298404_4812267_613-78769-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 13:29:01,837 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_53861_210_5399_1_1534422156_4272077_150-336899-0_sp1.1 from training. Duration: 0.963625 2023-10-03 13:29:03,144 WARNING [train.py:1204] (2/4) Exclude cut with ID _14459_210_2228_1_1531443641946_7516700_1146-56843-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 13:29:06,513 WARNING [train.py:1204] (2/4) Exclude cut with ID _39867_210_9826_1_1533553222030_9344259_1194-299928-0_sp1.1 from training. Duration: 0.92725 2023-10-03 13:29:06,586 WARNING [train.py:1204] (2/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_214-291369-0_sp0.9 from training. Duration: 0.7 2023-10-03 13:29:07,846 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.540e+02 1.893e+02 2.075e+02 2.273e+02 3.186e+02, threshold=4.149e+02, percent-clipped=0.0 2023-10-03 13:29:11,328 WARNING [train.py:1204] (2/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_358-215558-0 from training. Duration: 0.93 2023-10-03 13:29:12,731 WARNING [train.py:1204] (2/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_577-287150-0_sp0.9 from training. Duration: 0.911125 2023-10-03 13:29:15,484 WARNING [train.py:1204] (2/4) Exclude cut with ID _60860_210_8254_1_1534896015875_3557169_229-187367-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 13:29:16,852 WARNING [train.py:1204] (2/4) Exclude cut with ID _6275_210_2084_1_1528110062213_7669049_857-298942-0 from training. Duration: 0.85 2023-10-03 13:29:16,894 WARNING [train.py:1204] (2/4) Exclude cut with ID _40137_210_12210_1_1533290484046_3795180_249-187059-0_sp1.1 from training. Duration: 0.8 2023-10-03 13:29:16,903 WARNING [train.py:1204] (2/4) Exclude cut with ID ID0171W0508-765603 from training. Duration: 0.541 2023-10-03 13:29:18,175 WARNING [train.py:1204] (2/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_521-219439-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 13:29:18,180 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_40223_210_6228_1_1533298404_4812267_741-302616-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 13:29:21,056 WARNING [train.py:1204] (2/4) Exclude cut with ID _65293_210_9070_1_1535545549765_4004210_438-154735-0_sp1.1 from training. Duration: 0.92725 2023-10-03 13:29:23,799 WARNING [train.py:1204] (2/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_564-132950-0_sp1.1 from training. Duration: 0.92725 2023-10-03 13:29:23,847 WARNING [train.py:1204] (2/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_291-270881-0_sp1.1 from training. Duration: 0.8818125 2023-10-03 13:29:25,238 WARNING [train.py:1204] (2/4) Exclude cut with ID _60229_210_27767_1_1534838406158_3655350_353-213656-0 from training. Duration: 0.95 2023-10-03 13:29:25,257 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_3133_210_2087_1_1526106446_2324690_243-143119-0 from training. Duration: 0.77 2023-10-03 13:29:25,278 WARNING [train.py:1204] (2/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_690-126306-0 from training. Duration: 0.78 2023-10-03 13:29:28,119 WARNING [train.py:1204] (2/4) Exclude cut with ID _30304_210_6319_1_1532476827261_7582060_347-95003-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 13:29:29,929 WARNING [train.py:1204] (2/4) Exclude cut with ID _2322_210_2667_1_1525604548971_3656299_440-80494-0 from training. Duration: 0.88 2023-10-03 13:29:29,944 WARNING [train.py:1204] (2/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_370-229434-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 13:29:31,431 WARNING [train.py:1204] (2/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_124-345840-0_sp1.1 from training. Duration: 0.47275 2023-10-03 13:29:31,439 WARNING [train.py:1204] (2/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_165-123581-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 13:29:32,184 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.3.encoder.layers.0.feed_forward2.out_whiten, num_groups=1, num_channels=512, metric=7.07 vs. limit=15.0 2023-10-03 13:29:33,384 WARNING [train.py:1204] (2/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_453-282588-0 from training. Duration: 0.98 2023-10-03 13:29:33,409 WARNING [train.py:1204] (2/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_299-34413-0_sp0.9 from training. Duration: 0.72225 2023-10-03 13:29:33,462 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_40036_210_13405_1_1533648647_1315388_48-146016-0_sp1.1 from training. Duration: 0.8454375 2023-10-03 13:29:34,770 WARNING [train.py:1204] (2/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_99-167392-0_sp0.9 from training. Duration: 0.67775 2023-10-03 13:29:34,823 WARNING [train.py:1204] (2/4) Exclude cut with ID _48677_210_5663_1_1533902405919_3892989_37-207114-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 13:29:36,862 WARNING [train.py:1204] (2/4) Exclude cut with ID _29857_210_20034_1_1532395886753_3798160_335-163562-0 from training. Duration: 0.85 2023-10-03 13:29:39,250 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_36492_210_7927_1_1533261987_4790834_703-343351-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 13:29:40,609 WARNING [train.py:1204] (2/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_80-147301-0_sp0.9 from training. Duration: 0.9333125 2023-10-03 13:29:41,989 WARNING [train.py:1204] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_218-295097-0_sp0.9 from training. Duration: 0.8555625 2023-10-03 13:29:46,202 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_2295_210_2087_1_1525500993_2407863_116-238894-0_sp1.1 from training. Duration: 0.636375 2023-10-03 13:29:49,567 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.3.encoder.layers.3.self_attn_weights.whiten_keys, num_groups=8, num_channels=256, metric=2.55 vs. limit=6.0 2023-10-03 13:29:50,375 WARNING [train.py:1204] (2/4) Exclude cut with ID _3231_210_2106_1_1526205864934_6784220_697-171068-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 13:29:50,408 WARNING [train.py:1204] (2/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_102-77064-0 from training. Duration: 0.93 2023-10-03 13:29:55,654 INFO [train.py:1046] (2/4) Epoch 37, batch 1300, loss[loss=0.1496, simple_loss=0.229, pruned_loss=0.03505, over 24456.00 frames. ], tot_loss[loss=0.1603, simple_loss=0.2393, pruned_loss=0.04063, over 4695619.38 frames. ], batch size: 63, lr: 2.76e-03, grad_scale: 16.0 2023-10-03 13:29:55,704 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_13091_210_7925_1_1530609447_8987354_1186-261957-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 13:29:55,837 WARNING [train.py:1204] (2/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_91-277684-0_sp1.1 from training. Duration: 0.67275 2023-10-03 13:29:57,178 WARNING [train.py:1204] (2/4) Exclude cut with ID _31088_210_16446_1_1532520012284_3440919_159-313751-0_sp1.1 from training. Duration: 0.936375 2023-10-03 13:29:58,655 WARNING [train.py:1204] (2/4) Exclude cut with ID _42326_210_7770_1_1533814282531_3707269_227-203721-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 13:30:00,649 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_3531_210_3528_1_1526294203_5216456_309-227302-0_sp1.1 from training. Duration: 0.62725 2023-10-03 13:30:00,720 WARNING [train.py:1204] (2/4) Exclude cut with ID _14087_210_2253_1_1530529224512_3014029_226-242140-0 from training. Duration: 0.81 2023-10-03 13:30:05,684 WARNING [train.py:1204] (2/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_122-109023-0_sp1.1 from training. Duration: 0.6909375 2023-10-03 13:30:05,793 WARNING [train.py:1204] (2/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_660-211088-0_sp0.9 from training. Duration: 0.688875 2023-10-03 13:30:07,768 WARNING [train.py:1204] (2/4) Exclude cut with ID _39290_210_4220_1_1533862778760_3075118_92-311600-0 from training. Duration: 0.88 2023-10-03 13:30:12,877 WARNING [train.py:1204] (2/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_390-339137-0_sp1.1 from training. Duration: 0.5090625 2023-10-03 13:30:14,370 WARNING [train.py:1204] (2/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_449-341946-0_sp1.1 from training. Duration: 0.963625 2023-10-03 13:30:15,636 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_68095_210_20185_1_1535718226_8888855_884-327007-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 13:30:17,076 WARNING [train.py:1204] (2/4) Exclude cut with ID _42326_210_7770_1_1533814282531_3707269_308-222998-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 13:30:19,649 WARNING [train.py:1204] (2/4) Exclude cut with ID _40399_210_4353_1_1533305658194_3279406_344-238708-0_sp1.1 from training. Duration: 0.963625 2023-10-03 13:30:20,947 WARNING [train.py:1204] (2/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_87-131427-0_sp1.1 from training. Duration: 0.7090625 2023-10-03 13:30:21,004 WARNING [train.py:1204] (2/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_110-130961-0_sp0.9 from training. Duration: 0.52225 2023-10-03 13:30:21,043 WARNING [train.py:1204] (2/4) Exclude cut with ID _55153_210_2253_1_1534384786677_3162096_148-79022-0 from training. Duration: 0.96 2023-10-03 13:30:21,174 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.1.bypass.skip_rate, batch_count=1283646.6666666667, ans=0.035 2023-10-03 13:30:28,094 WARNING [train.py:1204] (2/4) Exclude cut with ID _9679_210_7497_1_1529129212906_4363929_512-313889-0_sp1.1 from training. Duration: 0.736375 2023-10-03 13:30:28,113 WARNING [train.py:1204] (2/4) Exclude cut with ID _7970_210_1883_1_1528455616296_3472230_25-178407-0_sp1.1 from training. Duration: 0.6454375 2023-10-03 13:30:29,621 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_3133_210_2087_1_1526106446_2324690_246-125000-0 from training. Duration: 0.62 2023-10-03 13:30:29,683 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_15126_210_6753_1_1530778677_2591185_113-157651-0_sp0.9 from training. Duration: 0.7555625 2023-10-03 13:30:31,568 WARNING [train.py:1204] (2/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_105-63309-0_sp1.1 from training. Duration: 0.8909375 2023-10-03 13:30:34,920 WARNING [train.py:1204] (2/4) Exclude cut with ID _29877_210_6228_1_1532397791106_5276149_578-177271-0_sp1.1 from training. Duration: 0.936375 2023-10-03 13:30:36,268 WARNING [train.py:1204] (2/4) Exclude cut with ID _44831_210_13427_1_1533816011549_3689035_32-188147-0 from training. Duration: 0.96 2023-10-03 13:30:36,320 WARNING [train.py:1204] (2/4) Exclude cut with ID _12369_210_4381_1_1530356078581_4388520_300-254337-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 13:30:36,340 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_128-241008-0 from training. Duration: 0.94 2023-10-03 13:30:38,326 WARNING [train.py:1204] (2/4) Exclude cut with ID _21878_210_3525_1_1531738772002_7264330_53-279178-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 13:30:42,651 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_41452_210_7291_1_1533437687_3941281_273-334088-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 13:30:42,654 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_128-241008-0_sp1.1 from training. Duration: 0.8545625 2023-10-03 13:30:46,885 WARNING [train.py:1204] (2/4) Exclude cut with ID _8394_210_6702_1_1528590504316_3540189_379-49457-0 from training. Duration: 0.87 2023-10-03 13:30:46,949 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_105-114797-0 from training. Duration: 0.84 2023-10-03 13:30:47,158 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.ff3_skip_rate, batch_count=1283780.0, ans=0.0 2023-10-03 13:30:48,306 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_14928_210_5632_1_1530773461_4714000_454-112198-0 from training. Duration: 0.66 2023-10-03 13:30:52,230 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_456-22141-0_sp1.1 from training. Duration: 0.8 2023-10-03 13:30:55,099 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_115-233929-0 from training. Duration: 0.72 2023-10-03 13:30:56,590 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_616-16795-0_sp1.1 from training. Duration: 0.963625 2023-10-03 13:31:03,988 WARNING [train.py:1204] (2/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_448-212135-0 from training. Duration: 0.91 2023-10-03 13:31:06,733 WARNING [train.py:1204] (2/4) Exclude cut with ID _46683_210_9642_1_1533730913492_4591116_201-205677-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 13:31:10,521 INFO [train.py:1046] (2/4) Epoch 37, batch 1350, loss[loss=0.1538, simple_loss=0.216, pruned_loss=0.04578, over 22739.00 frames. ], tot_loss[loss=0.1596, simple_loss=0.2383, pruned_loss=0.04045, over 4690456.57 frames. ], batch size: 322, lr: 2.76e-03, grad_scale: 16.0 2023-10-03 13:31:12,544 WARNING [train.py:1204] (2/4) Exclude cut with ID _64086_210_7994_1_1535270417019_7435949_43-80483-0_sp1.1 from training. Duration: 0.97275 2023-10-03 13:31:13,910 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_240-134124-0_sp1.1 from training. Duration: 0.963625 2023-10-03 13:31:13,930 WARNING [train.py:1204] (2/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_907-120940-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 13:31:17,886 WARNING [train.py:1204] (2/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_848-280957-0_sp1.1 from training. Duration: 0.7909375 2023-10-03 13:31:17,926 WARNING [train.py:1204] (2/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_243-42116-0_sp0.9 from training. Duration: 0.811125 2023-10-03 13:31:21,995 WARNING [train.py:1204] (2/4) Exclude cut with ID _6694_210_4599_1_1527852637490_3965529_116-109398-0_sp0.9 from training. Duration: 0.811125 2023-10-03 13:31:23,441 WARNING [train.py:1204] (2/4) Exclude cut with ID _24314_210_4915_1_1532135078116_7170179_521-258868-0 from training. Duration: 0.79 2023-10-03 13:31:23,549 WARNING [train.py:1204] (2/4) Exclude cut with ID _2220_210_1819_1_1525509109898_3658680_50-99837-0_sp0.9 from training. Duration: 0.6 2023-10-03 13:31:23,590 WARNING [train.py:1204] (2/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_313-134930-0_sp1.1 from training. Duration: 0.8818125 2023-10-03 13:31:27,813 WARNING [train.py:1204] (2/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_125-144174-0 from training. Duration: 0.39 2023-10-03 13:31:27,881 WARNING [train.py:1204] (2/4) Exclude cut with ID _59288_210_5854_1_1535077757774_3412246_275-293897-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 13:31:29,380 WARNING [train.py:1204] (2/4) Exclude cut with ID _40519_210_13794_1_1533715228655_7337390_722-52880-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 13:31:29,401 WARNING [train.py:1204] (2/4) Exclude cut with ID _7537_210_2084_1_1528502395431_3924359_128-268029-0 from training. Duration: 0.98 2023-10-03 13:31:30,790 WARNING [train.py:1204] (2/4) Exclude cut with ID _8021_210_7994_1_1528376451358_3653069_179-45206-0 from training. Duration: 0.86 2023-10-03 13:31:34,549 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.514e+02 1.947e+02 2.174e+02 2.484e+02 3.426e+02, threshold=4.348e+02, percent-clipped=0.0 2023-10-03 13:31:34,628 WARNING [train.py:1204] (2/4) Exclude cut with ID _13252_210_1925_1_1530671372047_3336909_88-276668-0 from training. Duration: 0.99 2023-10-03 13:31:35,891 WARNING [train.py:1204] (2/4) Exclude cut with ID _22613_210_5219_1_1532253599686_4090170_290-212862-0_sp1.1 from training. Duration: 0.97275 2023-10-03 13:31:35,896 WARNING [train.py:1204] (2/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_465-214726-0 from training. Duration: 0.86 2023-10-03 13:31:38,084 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.conv_skip_rate, batch_count=1284046.6666666667, ans=0.0 2023-10-03 13:31:48,740 WARNING [train.py:1204] (2/4) Exclude cut with ID _56480_210_4220_1_1534679942079_3075409_138-36031-0_sp1.1 from training. Duration: 0.97275 2023-10-03 13:31:58,237 WARNING [train.py:1204] (2/4) Exclude cut with ID _58605_210_12210_1_1534723195939_3904259_300-285878-0_sp1.1 from training. Duration: 0.97275 2023-10-03 13:31:58,271 WARNING [train.py:1204] (2/4) Exclude cut with ID _40901_210_5665_1_1533363789958_3856130_114-340229-0_sp1.1 from training. Duration: 0.936375 2023-10-03 13:31:58,312 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_489-230703-0 from training. Duration: 0.85 2023-10-03 13:32:02,366 WARNING [train.py:1204] (2/4) Exclude cut with ID _66873_210_5517_1_1535454136506_4293370_322-81243-0_sp1.1 from training. Duration: 0.936375 2023-10-03 13:32:02,450 WARNING [train.py:1204] (2/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_271-269084-0 from training. Duration: 0.83 2023-10-03 13:32:03,755 WARNING [train.py:1204] (2/4) Exclude cut with ID _13252_210_1925_1_1530671372047_3336909_166-314607-0_sp0.9 from training. Duration: 0.6 2023-10-03 13:32:03,825 WARNING [train.py:1204] (2/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_340-343302-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 13:32:06,527 WARNING [train.py:1204] (2/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_286-112015-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 13:32:09,370 WARNING [train.py:1204] (2/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_328-264776-0 from training. Duration: 0.67 2023-10-03 13:32:09,483 WARNING [train.py:1204] (2/4) Exclude cut with ID _4866_210_2577_1_1527404412284_4364659_256-306830-0_sp0.9 from training. Duration: 0.9666875 2023-10-03 13:32:18,299 WARNING [train.py:1204] (2/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_239-316802-0 from training. Duration: 0.69 2023-10-03 13:32:19,740 WARNING [train.py:1204] (2/4) Exclude cut with ID _29857_210_20034_1_1532395886753_3798160_279-185801-0 from training. Duration: 0.91 2023-10-03 13:32:23,882 INFO [train.py:1046] (2/4) Epoch 37, batch 1400, loss[loss=0.1607, simple_loss=0.2463, pruned_loss=0.03757, over 24675.00 frames. ], tot_loss[loss=0.1579, simple_loss=0.2367, pruned_loss=0.03959, over 4694097.09 frames. ], batch size: 68, lr: 2.76e-03, grad_scale: 16.0 2023-10-03 13:32:24,172 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.conv_module1.balancer2.prob, batch_count=1284246.6666666667, ans=0.125 2023-10-03 13:32:25,402 WARNING [train.py:1204] (2/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_656-188361-0 from training. Duration: 0.87 2023-10-03 13:32:26,820 WARNING [train.py:1204] (2/4) Exclude cut with ID _44718_210_5816_1_1534050051869_6469030_100-230287-0_sp1.1 from training. Duration: 0.936375 2023-10-03 13:32:29,584 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8275_210_4198_1_1528455320_3892571_276-215622-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 13:32:29,642 WARNING [train.py:1204] (2/4) Exclude cut with ID _30304_210_6319_1_1532476827261_7582060_698-309430-0_sp1.1 from training. Duration: 0.963625 2023-10-03 13:32:33,795 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_365-217119-0 from training. Duration: 0.63 2023-10-03 13:32:33,984 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.bypass.skip_rate, batch_count=1284246.6666666667, ans=0.04949747468305833 2023-10-03 13:32:35,175 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7264_210_4938_1_1527939911_8132095_800-91995-0 from training. Duration: 0.86 2023-10-03 13:32:38,154 INFO [scaling.py:1118] (2/4) WithLoss: name=encoder.encoders.1.encoder.layers.1.self_attn_weights, loss-sum=0.000e+00 2023-10-03 13:32:44,702 WARNING [train.py:1204] (2/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_49-97158-0_sp1.1 from training. Duration: 0.6909375 2023-10-03 13:32:46,713 WARNING [train.py:1204] (2/4) Exclude cut with ID _61132_210_9969_1_1535021792964_3755419_61-57720-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 13:32:50,605 WARNING [train.py:1204] (2/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_345-306832-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 13:32:50,628 WARNING [train.py:1204] (2/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_164-224052-0_sp0.9 from training. Duration: 0.77775 2023-10-03 13:32:53,452 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_34886_210_10633_1_1533203882_1755661_76-156096-0_sp1.1 from training. Duration: 0.7909375 2023-10-03 13:32:54,800 WARNING [train.py:1204] (2/4) Exclude cut with ID 4133-6541-0027-40495-0_sp1.1 from training. Duration: 0.9681875 2023-10-03 13:33:01,794 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_514-211165-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 13:33:01,848 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_41211_210_2544_1_1533348859_3702910_97-212695-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 13:33:05,841 WARNING [train.py:1204] (2/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_406-45086-0 from training. Duration: 0.8 2023-10-03 13:33:05,910 WARNING [train.py:1204] (2/4) Exclude cut with ID _65263_210_22356_1_1535763598691_3921037_49-222917-0_sp1.1 from training. Duration: 0.836375 2023-10-03 13:33:07,354 WARNING [train.py:1204] (2/4) Exclude cut with ID _4866_210_2577_1_1527404412284_4364659_352-157930-0_sp1.1 from training. Duration: 0.736375 2023-10-03 13:33:07,416 WARNING [train.py:1204] (2/4) Exclude cut with ID _4366_210_1388_1_1526792053633_3613819_231-211402-0_sp1.1 from training. Duration: 0.8545625 2023-10-03 13:33:07,470 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_98-21576-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 13:33:07,612 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=1284446.6666666667, ans=0.1 2023-10-03 13:33:08,839 WARNING [train.py:1204] (2/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_348-127789-0_sp0.9 from training. Duration: 0.9444375 2023-10-03 13:33:08,855 WARNING [train.py:1204] (2/4) Exclude cut with ID _45871_210_5665_1_1533864614245_3296060_98-329557-0_sp1.1 from training. Duration: 0.97275 2023-10-03 13:33:08,912 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6846_210_6753_1_1527850767_4295158_2-315073-0_sp1.1 from training. Duration: 0.8 2023-10-03 13:33:10,707 WARNING [train.py:1204] (2/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_145-137790-0 from training. Duration: 0.83 2023-10-03 13:33:10,722 WARNING [train.py:1204] (2/4) Exclude cut with ID _14603_210_4220_1_1530615688441_3097319_133-158308-0_sp0.9 from training. Duration: 0.9333125 2023-10-03 13:33:15,858 WARNING [train.py:1204] (2/4) Exclude cut with ID _14898_210_9206_1_1530766812377_4177190_320-11758-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 13:33:18,647 WARNING [train.py:1204] (2/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_406-45086-0_sp0.9 from training. Duration: 0.888875 2023-10-03 13:33:18,838 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.conv_module1.balancer1.prob, batch_count=1284446.6666666667, ans=0.125 2023-10-03 13:33:22,107 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.3.encoder.layers.3.self_attn_weights.whiten_keys, num_groups=8, num_channels=256, metric=2.56 vs. limit=6.0 2023-10-03 13:33:25,765 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_5438_210_1794_1_1527336687_2600009_104-288274-0 from training. Duration: 0.58 2023-10-03 13:33:27,073 WARNING [train.py:1204] (2/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_108-53929-0_sp0.9 from training. Duration: 0.6555625 2023-10-03 13:33:27,132 WARNING [train.py:1204] (2/4) Exclude cut with ID _30800_210_22382_1_1532499347904_4515959_444-286390-0_sp1.1 from training. Duration: 0.963625 2023-10-03 13:33:27,350 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.conv_module2.balancer1.min_positive, batch_count=1284513.3333333333, ans=0.025 2023-10-03 13:33:31,071 WARNING [train.py:1204] (2/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_125-144174-0_sp0.9 from training. Duration: 0.4333125 2023-10-03 13:33:31,134 WARNING [train.py:1204] (2/4) Exclude cut with ID _55209_210_1531_1_1534675828996_4597160_118-345621-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 13:33:32,000 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.0.layers.0.nonlin_attention.whiten2, num_groups=1, num_channels=192, metric=4.99 vs. limit=15.0 2023-10-03 13:33:32,646 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_45886_210_17024_1_1533709215_471063_7-79165-0_sp1.1 from training. Duration: 0.8909375 2023-10-03 13:33:34,245 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.conv_module1.balancer1.prob, batch_count=1284513.3333333333, ans=0.125 2023-10-03 13:33:35,455 WARNING [train.py:1204] (2/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_175-163894-0_sp0.9 from training. Duration: 0.82225 2023-10-03 13:33:36,807 INFO [train.py:1046] (2/4) Epoch 37, batch 1450, loss[loss=0.1398, simple_loss=0.2186, pruned_loss=0.03053, over 21153.00 frames. ], tot_loss[loss=0.1573, simple_loss=0.2359, pruned_loss=0.03931, over 4695183.11 frames. ], batch size: 46, lr: 2.76e-03, grad_scale: 16.0 2023-10-03 13:33:38,315 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_50188_210_4198_1_1534503621_3646041_353-81682-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 13:33:38,321 WARNING [train.py:1204] (2/4) Exclude cut with ID _17875_210_2087_1_1531224012907_1821339_175-80565-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 13:33:38,330 WARNING [train.py:1204] (2/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_159-328157-0_sp1.1 from training. Duration: 0.47275 2023-10-03 13:33:43,658 WARNING [train.py:1204] (2/4) Exclude cut with ID _60057_210_10640_1_1534903065793_3333150_137-267579-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 13:33:45,567 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_92-46009-0_sp1.1 from training. Duration: 0.4909375 2023-10-03 13:33:48,722 WARNING [train.py:1204] (2/4) Exclude cut with ID _53203_210_2087_1_1534246218309_475590_48-68507-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 13:33:48,738 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_28211_210_6388_1_1532861506_4523224_128-328162-0 from training. Duration: 0.9 2023-10-03 13:33:50,244 WARNING [train.py:1204] (2/4) Exclude cut with ID _4697_210_2069_1_1526815801953_3823098_51-1049-0_sp1.1 from training. Duration: 0.6818125 2023-10-03 13:33:51,609 WARNING [train.py:1204] (2/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_383-316000-0 from training. Duration: 0.9 2023-10-03 13:33:51,663 WARNING [train.py:1204] (2/4) Exclude cut with ID _4779_210_2084_1_1527318022944_3914893_185-295153-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 13:33:53,057 WARNING [train.py:1204] (2/4) Exclude cut with ID _3231_210_2106_1_1526205864934_6784220_171-256004-0_sp0.9 from training. Duration: 0.9555625 2023-10-03 13:33:53,058 WARNING [train.py:1204] (2/4) Exclude cut with ID _41314_210_12651_1_1533459476543_3766460_87-272068-0 from training. Duration: 0.83 2023-10-03 13:33:54,392 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_164-152777-0_sp1.1 from training. Duration: 0.92725 2023-10-03 13:33:54,437 WARNING [train.py:1204] (2/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_18-130336-0_sp0.9 from training. Duration: 0.87775 2023-10-03 13:33:55,784 WARNING [train.py:1204] (2/4) Exclude cut with ID _14886_210_4220_1_1530702020355_3516190_175-229073-0_sp1.1 from training. Duration: 0.4090625 2023-10-03 13:33:55,797 WARNING [train.py:1204] (2/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_791-45149-0_sp0.9 from training. Duration: 0.9555625 2023-10-03 13:33:57,129 WARNING [train.py:1204] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_468-189058-0_sp1.1 from training. Duration: 0.87275 2023-10-03 13:33:57,379 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.balancer1.prob, batch_count=1284646.6666666667, ans=0.125 2023-10-03 13:33:58,430 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8278_210_2550_1_1528637099_4220413_32-142780-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 13:33:59,879 WARNING [train.py:1204] (2/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_146-218763-0_sp0.9 from training. Duration: 0.9555625 2023-10-03 13:34:02,484 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.497e+02 1.928e+02 2.172e+02 2.508e+02 3.657e+02, threshold=4.343e+02, percent-clipped=0.0 2023-10-03 13:34:03,990 WARNING [train.py:1204] (2/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_348-127789-0_sp1.1 from training. Duration: 0.77275 2023-10-03 13:34:03,995 WARNING [train.py:1204] (2/4) Exclude cut with ID _47790_210_16187_1_1533862814066_4207519_288-285445-0_sp1.1 from training. Duration: 0.936375 2023-10-03 13:34:05,312 WARNING [train.py:1204] (2/4) Exclude cut with ID _14603_210_4220_1_1530615688441_3097319_308-181732-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 13:34:05,347 WARNING [train.py:1204] (2/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_895-117428-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 13:34:09,157 WARNING [train.py:1204] (2/4) Exclude cut with ID _9235_210_5219_1_1528891154253_8046120_387-159008-0_sp0.9 from training. Duration: 0.9555625 2023-10-03 13:34:09,172 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_37351_210_7925_1_1533012931_3812113_265-85780-0_sp0.9 from training. Duration: 0.97775 2023-10-03 13:34:09,204 WARNING [train.py:1204] (2/4) Exclude cut with ID _40551_210_10635_1_1533776424632_3416179_361-3964-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 13:34:09,236 WARNING [train.py:1204] (2/4) Exclude cut with ID _25882_210_3036_1_1532775665815_6479510_485-122742-0_sp1.1 from training. Duration: 0.97275 2023-10-03 13:34:13,953 WARNING [train.py:1204] (2/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_98-51676-0 from training. Duration: 0.73 2023-10-03 13:34:17,164 WARNING [train.py:1204] (2/4) Exclude cut with ID _30548_210_16040_1_1532608097130_3146969_371-280610-0_sp1.1 from training. Duration: 0.92725 2023-10-03 13:34:20,513 WARNING [train.py:1204] (2/4) Exclude cut with ID IC0671W0081-331924 from training. Duration: 0.896 2023-10-03 13:34:21,889 WARNING [train.py:1204] (2/4) Exclude cut with ID _40284_210_2084_1_1533513679814_3704279_380-160775-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 13:34:22,712 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.1.encoder.layers.0.feed_forward3.out_whiten, num_groups=1, num_channels=256, metric=11.79 vs. limit=15.0 2023-10-03 13:34:23,317 WARNING [train.py:1204] (2/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_57-270037-0_sp1.1 from training. Duration: 0.7 2023-10-03 13:34:24,652 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_853-150237-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 13:34:26,046 WARNING [train.py:1204] (2/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_491-261923-0 from training. Duration: 0.98 2023-10-03 13:34:28,932 WARNING [train.py:1204] (2/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_836-244045-0_sp1.1 from training. Duration: 0.97275 2023-10-03 13:34:31,504 WARNING [train.py:1204] (2/4) Exclude cut with ID _14880_210_8895_1_1530680426655_3795259_289-212817-0 from training. Duration: 0.69 2023-10-03 13:34:32,871 WARNING [train.py:1204] (2/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_303-340446-0 from training. Duration: 0.9 2023-10-03 13:34:34,323 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_37152_210_9697_1_1533106573_3844188_418-41736-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 13:34:36,002 INFO [scaling.py:1118] (2/4) WithLoss: name=encoder.encoders.4.encoder.layers.0.self_attn_weights, loss-sum=0.000e+00 2023-10-03 13:34:38,349 WARNING [train.py:1204] (2/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_371-18169-0_sp1.1 from training. Duration: 0.7818125 2023-10-03 13:34:38,375 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_187-219984-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 13:34:38,482 WARNING [train.py:1204] (2/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_63-267585-0 from training. Duration: 0.9 2023-10-03 13:34:41,274 WARNING [train.py:1204] (2/4) Exclude cut with ID _27013_210_1389_1_1532437173240_3252649_222-125573-0 from training. Duration: 0.98 2023-10-03 13:34:41,325 WARNING [train.py:1204] (2/4) Exclude cut with ID _14821_210_8750_1_1530680503991_6829359_189-297451-0 from training. Duration: 0.55 2023-10-03 13:34:42,737 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_557-163391-0_sp1.1 from training. Duration: 0.97275 2023-10-03 13:34:44,572 WARNING [train.py:1204] (2/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_65-88242-0_sp1.1 from training. Duration: 0.5090625 2023-10-03 13:34:45,274 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.2.encoder.layers.0.feed_forward2.out_whiten, num_groups=1, num_channels=384, metric=5.93 vs. limit=15.0 2023-10-03 13:34:51,417 INFO [train.py:1046] (2/4) Epoch 37, batch 1500, loss[loss=0.1661, simple_loss=0.2374, pruned_loss=0.04738, over 23597.00 frames. ], tot_loss[loss=0.1577, simple_loss=0.2365, pruned_loss=0.03939, over 4705467.69 frames. ], batch size: 256, lr: 2.76e-03, grad_scale: 16.0 2023-10-03 13:34:53,140 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.feed_forward1.hidden_balancer.prob, batch_count=1284913.3333333333, ans=0.125 2023-10-03 13:34:57,015 WARNING [train.py:1204] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_218-295097-0 from training. Duration: 0.77 2023-10-03 13:34:57,050 WARNING [train.py:1204] (2/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_686-247600-0_sp1.1 from training. Duration: 0.836375 2023-10-03 13:34:57,053 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_397-147196-0_sp0.9 from training. Duration: 0.888875 2023-10-03 13:34:58,380 WARNING [train.py:1204] (2/4) Exclude cut with ID _53950_210_5816_1_1534417264919_3862951_237-136449-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 13:34:59,778 WARNING [train.py:1204] (2/4) Exclude cut with ID _3120_210_3525_1_1526036143112_4072980_74-178400-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 13:34:59,864 WARNING [train.py:1204] (2/4) Exclude cut with ID _5166_210_2084_1_1527073197160_4087993_309-222545-0_sp0.9 from training. Duration: 0.9333125 2023-10-03 13:35:01,199 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7544_210_6753_1_1528587722_3576686_172-171750-0 from training. Duration: 0.65 2023-10-03 13:35:02,596 WARNING [train.py:1204] (2/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_3-237434-0_sp0.9 from training. Duration: 0.8444375 2023-10-03 13:35:02,643 WARNING [train.py:1204] (2/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_101-293367-0_sp0.9 from training. Duration: 0.7 2023-10-03 13:35:02,651 WARNING [train.py:1204] (2/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_818-213552-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 13:35:03,983 WARNING [train.py:1204] (2/4) Exclude cut with ID _14349_210_8341_1_1530615710818_3442470_115-285975-0_sp1.1 from training. Duration: 0.7818125 2023-10-03 13:35:04,125 WARNING [train.py:1204] (2/4) Exclude cut with ID _30800_210_22382_1_1532499347904_4515959_291-309749-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 13:35:05,517 WARNING [train.py:1204] (2/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_715-3462-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 13:35:07,000 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.out_combiner.scale_min, batch_count=1284980.0, ans=0.2 2023-10-03 13:35:10,867 WARNING [train.py:1204] (2/4) Exclude cut with ID _59502_210_12210_1_1535107024698_1835139_13-331827-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 13:35:10,886 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_4528_210_3273_1_1527076654_3276777_342-168068-0 from training. Duration: 0.87 2023-10-03 13:35:12,284 WARNING [train.py:1204] (2/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_215-141059-0_sp0.9 from training. Duration: 0.688875 2023-10-03 13:35:12,305 WARNING [train.py:1204] (2/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_106-286130-0_sp1.1 from training. Duration: 0.8909375 2023-10-03 13:35:12,382 WARNING [train.py:1204] (2/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_480-77655-0_sp1.1 from training. Duration: 0.97275 2023-10-03 13:35:15,091 WARNING [train.py:1204] (2/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_848-280957-0 from training. Duration: 0.87 2023-10-03 13:35:17,281 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.conv_module2.balancer1.prob, batch_count=1284980.0, ans=0.125 2023-10-03 13:35:18,629 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.conv_module2.balancer1.max_abs, batch_count=1285046.6666666667, ans=10.0 2023-10-03 13:35:20,296 WARNING [train.py:1204] (2/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_122-109023-0 from training. Duration: 0.76 2023-10-03 13:35:21,652 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_11305_210_1794_1_1529750428_7178148_645-233480-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 13:35:21,721 WARNING [train.py:1204] (2/4) Exclude cut with ID _7562_210_2084_1_1528887767916_3946229_345-169156-0 from training. Duration: 0.58 2023-10-03 13:35:24,372 WARNING [train.py:1204] (2/4) Exclude cut with ID _7562_210_2084_1_1528887767916_3946229_345-169156-0_sp1.1 from training. Duration: 0.52725 2023-10-03 13:35:25,774 WARNING [train.py:1204] (2/4) Exclude cut with ID _13253_210_1925_1_1530757775819_3577873_253-246386-0_sp1.1 from training. Duration: 0.8181875 2023-10-03 13:35:27,096 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_136-264444-0_sp1.1 from training. Duration: 0.97275 2023-10-03 13:35:27,114 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_2168_210_2290_1_1525606920_1849400_136-222991-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 13:35:29,816 WARNING [train.py:1204] (2/4) Exclude cut with ID _7852_210_5219_1_1528286471316_8139400_242-105336-0 from training. Duration: 0.72 2023-10-03 13:35:29,852 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_5960_210_1794_1_1527403865_3990999_515-2613-0_sp1.1 from training. Duration: 0.8 2023-10-03 13:35:29,868 WARNING [train.py:1204] (2/4) Exclude cut with ID _39688_210_19596_1_1533371626725_4154159_551-167834-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 13:35:29,925 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_43802_210_2290_1_1533516203_8829504_85-195240-0 from training. Duration: 0.87 2023-10-03 13:35:31,306 WARNING [train.py:1204] (2/4) Exclude cut with ID _63171_210_15488_1_1535104792794_3295110_113-113534-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 13:35:35,521 WARNING [train.py:1204] (2/4) Exclude cut with ID _8833_210_5219_1_1528592665210_7553059_319-349181-0_sp1.1 from training. Duration: 0.9 2023-10-03 13:35:35,521 WARNING [train.py:1204] (2/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_225-43381-0 from training. Duration: 0.94 2023-10-03 13:35:39,732 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_5959_210_1794_1_1527400400_3445726_412-44052-0_sp0.9 from training. Duration: 0.8555625 2023-10-03 13:35:41,104 WARNING [train.py:1204] (2/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_356-167816-0_sp1.1 from training. Duration: 0.6545625 2023-10-03 13:35:45,618 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9012W0112-501244 from training. Duration: 0.768 2023-10-03 13:35:47,497 WARNING [train.py:1204] (2/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_177-326952-0_sp1.1 from training. Duration: 0.92725 2023-10-03 13:35:47,501 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9016W0116-502789 from training. Duration: 0.896 2023-10-03 13:35:47,780 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.bypass_mid.scale_min, batch_count=1285180.0, ans=0.2 2023-10-03 13:35:49,384 WARNING [train.py:1204] (2/4) Exclude cut with ID _56235_210_10640_1_1534557335235_4161099_557-204815-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 13:35:49,468 WARNING [train.py:1204] (2/4) Exclude cut with ID _55857_210_2346_1_1534644007831_6898209_279-229469-0_sp1.1 from training. Duration: 0.936375 2023-10-03 13:35:50,161 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9029W0375-509764 from training. Duration: 0.896 2023-10-03 13:35:50,412 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.conv_skip_rate, batch_count=1285180.0, ans=0.0 2023-10-03 13:35:51,327 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_2295_210_2087_1_1525500993_2407863_234-83104-0_sp0.9 from training. Duration: 0.688875 2023-10-03 13:35:52,831 WARNING [train.py:1204] (2/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_266-137104-0 from training. Duration: 0.65 2023-10-03 13:35:54,259 WARNING [train.py:1204] (2/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_289-163927-0_sp1.1 from training. Duration: 0.92725 2023-10-03 13:35:57,150 WARNING [train.py:1204] (2/4) Exclude cut with ID _12733_210_8341_1_1530167424556_3646380_86-225314-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 13:35:57,170 WARNING [train.py:1204] (2/4) Exclude cut with ID _7877_210_6014_1_1528286358099_3750136_618-284064-0_sp1.1 from training. Duration: 0.92725 2023-10-03 13:35:58,392 WARNING [train.py:1204] (2/4) Exclude cut with ID _55844_210_2346_1_1534471212569_7625707_1017-31469-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 13:35:58,424 WARNING [train.py:1204] (2/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_617-241446-0_sp1.1 from training. Duration: 0.92725 2023-10-03 13:35:58,479 WARNING [train.py:1204] (2/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_866-173440-0_sp1.1 from training. Duration: 0.8181875 2023-10-03 13:35:59,827 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_9460_210_4211_1_1528954587_10049667_480-260269-0 from training. Duration: 0.56 2023-10-03 13:35:59,878 WARNING [train.py:1204] (2/4) Exclude cut with ID _44659_210_2721_1_1534036120061_6614860_626-298884-0 from training. Duration: 0.97 2023-10-03 13:36:01,172 WARNING [train.py:1204] (2/4) Exclude cut with ID _53565_210_10635_1_1534337218335_3820259_287-18120-0_sp1.1 from training. Duration: 0.8818125 2023-10-03 13:36:01,238 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_791-197891-0 from training. Duration: 0.84 2023-10-03 13:36:02,550 WARNING [train.py:1204] (2/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_527-238990-0 from training. Duration: 0.9 2023-10-03 13:36:03,870 INFO [train.py:1046] (2/4) Epoch 37, batch 1550, loss[loss=0.1459, simple_loss=0.2333, pruned_loss=0.02925, over 24460.00 frames. ], tot_loss[loss=0.1581, simple_loss=0.2372, pruned_loss=0.0395, over 4720067.60 frames. ], batch size: 63, lr: 2.76e-03, grad_scale: 8.0 2023-10-03 13:36:03,987 WARNING [train.py:1204] (2/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_449-65969-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 13:36:05,426 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_553-341452-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 13:36:06,713 WARNING [train.py:1204] (2/4) Exclude cut with ID _16909_210_5724_1_1531292079912_4118919_300-47574-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 13:36:06,735 WARNING [train.py:1204] (2/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_70-27732-0_sp1.1 from training. Duration: 0.8545625 2023-10-03 13:36:06,925 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.conv_skip_rate, batch_count=1285246.6666666667, ans=0.0 2023-10-03 13:36:09,316 WARNING [train.py:1204] (2/4) Exclude cut with ID _44718_210_5816_1_1534050051869_6469030_727-28074-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 13:36:09,488 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.conv_skip_rate, batch_count=1285246.6666666667, ans=0.0 2023-10-03 13:36:09,598 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.self_attn_weights.pos_emb_skip_rate, batch_count=1285246.6666666667, ans=0.0 2023-10-03 13:36:10,660 WARNING [train.py:1204] (2/4) Exclude cut with ID _55857_210_2346_1_1534644007831_6898209_357-123664-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 13:36:12,053 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9123W0004-555964 from training. Duration: 0.896 2023-10-03 13:36:12,270 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.bypass.skip_rate, batch_count=1285246.6666666667, ans=0.09899494936611666 2023-10-03 13:36:13,354 WARNING [train.py:1204] (2/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_445-145208-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 13:36:13,378 WARNING [train.py:1204] (2/4) Exclude cut with ID _3675_210_4557_1_1526639934408_4182089_456-18305-0_sp1.1 from training. Duration: 0.6909375 2023-10-03 13:36:13,453 WARNING [train.py:1204] (2/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_1-99680-0_sp1.1 from training. Duration: 0.6090625 2023-10-03 13:36:18,686 WARNING [train.py:1204] (2/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_146-282400-0_sp0.9 from training. Duration: 0.788875 2023-10-03 13:36:18,709 WARNING [train.py:1204] (2/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_844-96989-0 from training. Duration: 0.71 2023-10-03 13:36:21,531 WARNING [train.py:1204] (2/4) Exclude cut with ID _29853_210_7497_1_1532505600851_6642110_128-146364-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 13:36:21,554 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_224-322394-0 from training. Duration: 0.66 2023-10-03 13:36:22,290 WARNING [train.py:1204] (2/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_191-59784-0 from training. Duration: 0.88 2023-10-03 13:36:22,295 WARNING [train.py:1204] (2/4) Exclude cut with ID _12556_210_8341_1_1530081024100_7327690_725-193477-0 from training. Duration: 0.98 2023-10-03 13:36:23,565 WARNING [train.py:1204] (2/4) Exclude cut with ID _5166_210_2084_1_1527073197160_4087993_523-79838-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 13:36:25,031 WARNING [train.py:1204] (2/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_398-334558-0_sp1.1 from training. Duration: 0.87275 2023-10-03 13:36:26,767 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.bypass.scale_min, batch_count=1285313.3333333333, ans=0.2 2023-10-03 13:36:27,807 WARNING [train.py:1204] (2/4) Exclude cut with ID _6456_210_1534_1_1527681634212_3181049_171-36697-0_sp1.1 from training. Duration: 0.936375 2023-10-03 13:36:29,259 WARNING [train.py:1204] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_700-183549-0 from training. Duration: 0.68 2023-10-03 13:36:29,261 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8853_210_3273_1_1528633899_818968_51-187685-0 from training. Duration: 0.69 2023-10-03 13:36:30,488 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.568e+02 1.843e+02 2.005e+02 2.170e+02 3.085e+02, threshold=4.010e+02, percent-clipped=0.0 2023-10-03 13:36:36,333 WARNING [train.py:1204] (2/4) Exclude cut with ID _7719_210_3525_1_1528456153163_4296960_333-110399-0_sp1.1 from training. Duration: 0.87275 2023-10-03 13:36:40,469 WARNING [train.py:1204] (2/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_1022-83740-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 13:36:40,488 WARNING [train.py:1204] (2/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_61-327062-0_sp0.9 from training. Duration: 0.9 2023-10-03 13:36:40,501 WARNING [train.py:1204] (2/4) Exclude cut with ID _47251_210_18628_1_1533814063128_4137250_214-88076-0_sp1.1 from training. Duration: 0.8 2023-10-03 13:36:41,831 WARNING [train.py:1204] (2/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_839-173017-0 from training. Duration: 0.41 2023-10-03 13:36:46,401 WARNING [train.py:1204] (2/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_299-34413-0_sp1.1 from training. Duration: 0.5909375 2023-10-03 13:36:48,338 WARNING [train.py:1204] (2/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_664-159457-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 13:36:51,617 WARNING [train.py:1204] (2/4) Exclude cut with ID _66873_210_5517_1_1535454136506_4293370_483-291760-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 13:36:54,923 WARNING [train.py:1204] (2/4) Exclude cut with ID _30800_210_22382_1_1532499347904_4515959_287-255474-0_sp1.1 from training. Duration: 0.92725 2023-10-03 13:36:54,961 WARNING [train.py:1204] (2/4) Exclude cut with ID _48677_210_5663_1_1533902405919_3892989_111-11439-0_sp1.1 from training. Duration: 0.87275 2023-10-03 13:36:54,968 WARNING [train.py:1204] (2/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_399-156791-0 from training. Duration: 0.83 2023-10-03 13:36:55,010 WARNING [train.py:1204] (2/4) Exclude cut with ID _3992_210_1747_1_1526644816031_3731060_36-147855-0_sp0.9 from training. Duration: 0.8666875 2023-10-03 13:36:56,384 WARNING [train.py:1204] (2/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_442-320317-0_sp1.1 from training. Duration: 0.7454375 2023-10-03 13:36:56,408 WARNING [train.py:1204] (2/4) Exclude cut with ID _55429_210_10626_1_1534467588527_7174143_953-80868-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 13:36:57,750 WARNING [train.py:1204] (2/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_159-328157-0_sp0.9 from training. Duration: 0.57775 2023-10-03 13:36:57,763 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9309W0197-640324 from training. Duration: 0.896 2023-10-03 13:37:00,611 WARNING [train.py:1204] (2/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_361-30822-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 13:37:04,822 WARNING [train.py:1204] (2/4) Exclude cut with ID _5250_210_5219_1_1527249561806_8692110_636-251213-0 from training. Duration: 0.94 2023-10-03 13:37:11,542 WARNING [train.py:1204] (2/4) Exclude cut with ID _49728_210_12651_1_1534041034294_6788150_468-333827-0_sp1.1 from training. Duration: 0.963625 2023-10-03 13:37:11,626 WARNING [train.py:1204] (2/4) Exclude cut with ID _25882_210_3036_1_1532775665815_6479510_623-191759-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 13:37:12,278 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.3.encoder.layers.1.feed_forward2.out_whiten, num_groups=1, num_channels=512, metric=4.97 vs. limit=15.0 2023-10-03 13:37:12,886 WARNING [train.py:1204] (2/4) Exclude cut with ID _5189_210_4557_1_1527248319428_3769229_199-53526-0 from training. Duration: 0.42 2023-10-03 13:37:14,282 WARNING [train.py:1204] (2/4) Exclude cut with ID _6274_210_2084_1_1527937583837_7494020_32-205288-0_sp0.9 from training. Duration: 0.8666875 2023-10-03 13:37:14,364 WARNING [train.py:1204] (2/4) Exclude cut with ID _65263_210_22356_1_1535763598691_3921037_401-346646-0_sp1.1 from training. Duration: 0.963625 2023-10-03 13:37:14,368 WARNING [train.py:1204] (2/4) Exclude cut with ID _12555_210_8750_1_1530075594402_3780239_449-267986-0_sp1.1 from training. Duration: 0.7545625 2023-10-03 13:37:15,684 WARNING [train.py:1204] (2/4) Exclude cut with ID _69884_210_9771_1_1535846392818_4166169_430-201020-0_sp1.1 from training. Duration: 0.8818125 2023-10-03 13:37:16,939 INFO [train.py:1046] (2/4) Epoch 37, batch 1600, loss[loss=0.1562, simple_loss=0.2498, pruned_loss=0.0313, over 24298.00 frames. ], tot_loss[loss=0.1587, simple_loss=0.2383, pruned_loss=0.03959, over 4723292.07 frames. ], batch size: 74, lr: 2.76e-03, grad_scale: 16.0 2023-10-03 13:37:17,028 WARNING [train.py:1204] (2/4) Exclude cut with ID _8394_210_6702_1_1528590504316_3540189_379-49457-0_sp1.1 from training. Duration: 0.7909375 2023-10-03 13:37:20,924 WARNING [train.py:1204] (2/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_200-105218-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 13:37:22,660 WARNING [train.py:1204] (2/4) Exclude cut with ID _3867_210_1883_1_1526641200575_4475830_319-349850-0 from training. Duration: 0.67 2023-10-03 13:37:23,964 WARNING [train.py:1204] (2/4) Exclude cut with ID _28317_210_12742_1_1532480302381_8510325_916-195848-0 from training. Duration: 0.92 2023-10-03 13:37:25,999 WARNING [train.py:1204] (2/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_436-186311-0 from training. Duration: 0.65 2023-10-03 13:37:26,517 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.4.encoder.layers.2.whiten, num_groups=1, num_channels=384, metric=2.70 vs. limit=12.0 2023-10-03 13:37:28,656 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_3910_210_1794_1_1526777667_3747260_302-180617-0_sp1.1 from training. Duration: 0.97275 2023-10-03 13:37:30,012 WARNING [train.py:1204] (2/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_102-92751-0 from training. Duration: 0.99 2023-10-03 13:37:31,335 WARNING [train.py:1204] (2/4) Exclude cut with ID _51899_210_20034_1_1534129254466_3669220_265-215428-0_sp1.1 from training. Duration: 0.8909375 2023-10-03 13:37:32,822 WARNING [train.py:1204] (2/4) Exclude cut with ID _58583_210_5236_1_1534726835266_3574249_438-198993-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 13:37:34,807 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.4.encoder.layers.1.feed_forward2.out_whiten, num_groups=1, num_channels=384, metric=11.39 vs. limit=15.0 2023-10-03 13:37:38,514 WARNING [train.py:1204] (2/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_166-166990-0_sp1.1 from training. Duration: 0.87275 2023-10-03 13:37:40,042 WARNING [train.py:1204] (2/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_907-151398-0 from training. Duration: 0.37 2023-10-03 13:37:42,767 WARNING [train.py:1204] (2/4) Exclude cut with ID _38670_210_15379_1_1533448810659_4391059_452-256669-0_sp1.1 from training. Duration: 0.936375 2023-10-03 13:37:42,891 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.balancer2.prob, batch_count=1285646.6666666667, ans=0.125 2023-10-03 13:37:44,046 WARNING [train.py:1204] (2/4) Exclude cut with ID _48677_210_5663_1_1533902405919_3892989_408-40740-0 from training. Duration: 0.83 2023-10-03 13:37:44,090 WARNING [train.py:1204] (2/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_903-158756-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 13:37:44,129 WARNING [train.py:1204] (2/4) Exclude cut with ID _13252_210_1925_1_1530671372047_3336909_397-216497-0 from training. Duration: 0.96 2023-10-03 13:37:45,805 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.feed_forward3.hidden_balancer.prob, batch_count=1285713.3333333333, ans=0.125 2023-10-03 13:37:49,773 WARNING [train.py:1204] (2/4) Exclude cut with ID _55429_210_10626_1_1534467588527_7174143_97-335084-0 from training. Duration: 0.97 2023-10-03 13:37:58,127 WARNING [train.py:1204] (2/4) Exclude cut with ID _45329_210_7005_1_1534157951211_6774339_453-267730-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 13:38:00,744 WARNING [train.py:1204] (2/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_153-97441-0 from training. Duration: 0.56 2023-10-03 13:38:00,805 WARNING [train.py:1204] (2/4) Exclude cut with ID _40901_210_5665_1_1533363789958_3856130_312-329122-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 13:38:00,851 WARNING [train.py:1204] (2/4) Exclude cut with ID _30800_210_22382_1_1532499347904_4515959_97-126471-0_sp1.1 from training. Duration: 0.97275 2023-10-03 13:38:00,851 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_11305_210_1794_1_1529750428_7178148_257-312870-0_sp1.1 from training. Duration: 0.8 2023-10-03 13:38:03,562 WARNING [train.py:1204] (2/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_454-314901-0_sp0.9 from training. Duration: 0.511125 2023-10-03 13:38:07,639 WARNING [train.py:1204] (2/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_282-136641-0_sp1.1 from training. Duration: 0.3818125 2023-10-03 13:38:07,749 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.0.bypass.scale_min, batch_count=1285780.0, ans=0.2 2023-10-03 13:38:10,458 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_46013_210_7234_1_1533776362_3744032_493-160724-0_sp1.1 from training. Duration: 0.963625 2023-10-03 13:38:10,508 WARNING [train.py:1204] (2/4) Exclude cut with ID _67831_210_12392_1_1535612464202_3092612_259-33442-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 13:38:10,554 WARNING [train.py:1204] (2/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_289-62384-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 13:38:10,759 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.conv_module1.balancer1.prob, batch_count=1285780.0, ans=0.125 2023-10-03 13:38:11,960 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6545_210_2040_1_1527774670_4245258_379-14182-0_sp0.9 from training. Duration: 0.888875 2023-10-03 13:38:13,392 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_548-294316-0_sp1.1 from training. Duration: 0.736375 2023-10-03 13:38:13,472 WARNING [train.py:1204] (2/4) Exclude cut with ID _6275_210_2084_1_1528110062213_7669049_857-298942-0_sp1.1 from training. Duration: 0.77275 2023-10-03 13:38:16,199 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6673_210_3273_1_1527767790_3173240_327-306185-0_sp1.1 from training. Duration: 0.7545625 2023-10-03 13:38:22,202 WARNING [train.py:1204] (2/4) Exclude cut with ID _61008_210_10633_1_1534852769198_6974370_814-107647-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 13:38:22,267 WARNING [train.py:1204] (2/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_116-332008-0_sp1.1 from training. Duration: 0.8909375 2023-10-03 13:38:23,685 WARNING [train.py:1204] (2/4) Exclude cut with ID _32704_210_8954_1_1533017488840_6747370_266-329755-0 from training. Duration: 0.89 2023-10-03 13:38:23,688 WARNING [train.py:1204] (2/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_271-269084-0_sp0.9 from training. Duration: 0.92225 2023-10-03 13:38:25,579 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_3171_210_2087_1_1526109084_2330007_66-260583-0 from training. Duration: 0.72 2023-10-03 13:38:30,678 WARNING [train.py:1204] (2/4) Exclude cut with ID _5250_210_5219_1_1527249561806_8692110_1326-276386-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 13:38:31,959 INFO [train.py:1046] (2/4) Epoch 37, batch 1650, loss[loss=0.1416, simple_loss=0.2105, pruned_loss=0.03631, over 22623.00 frames. ], tot_loss[loss=0.1595, simple_loss=0.2389, pruned_loss=0.04004, over 4715120.05 frames. ], batch size: 322, lr: 2.76e-03, grad_scale: 16.0 2023-10-03 13:38:32,107 WARNING [train.py:1204] (2/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_372-30302-0_sp1.1 from training. Duration: 0.7818125 2023-10-03 13:38:33,521 WARNING [train.py:1204] (2/4) Exclude cut with ID _25882_210_3036_1_1532775665815_6479510_631-257029-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 13:38:33,529 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_3691_210_3528_1_1526468169_4062053_355-334432-0 from training. Duration: 0.87 2023-10-03 13:38:33,543 WARNING [train.py:1204] (2/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_839-173017-0_sp1.1 from training. Duration: 0.37275 2023-10-03 13:38:33,557 WARNING [train.py:1204] (2/4) Exclude cut with ID _14998_210_5219_1_1530705582080_7474970_441-91466-0 from training. Duration: 0.97 2023-10-03 13:38:34,939 WARNING [train.py:1204] (2/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_108-53929-0 from training. Duration: 0.59 2023-10-03 13:38:35,175 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.self_attn_weights.pos_emb_skip_rate, batch_count=1285913.3333333333, ans=0.0 2023-10-03 13:38:36,525 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.nonlin_attention.balancer.prob, batch_count=1285913.3333333333, ans=0.125 2023-10-03 13:38:39,071 WARNING [train.py:1204] (2/4) Exclude cut with ID _9389_210_1664_1_1529217337301_5289269_260-1942-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 13:38:40,442 WARNING [train.py:1204] (2/4) Exclude cut with ID _56423_210_5854_1_1534896061049_3165969_198-61547-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 13:38:40,474 WARNING [train.py:1204] (2/4) Exclude cut with ID _44778_210_2054_1_1533798127203_7443090_412-142122-0_sp1.1 from training. Duration: 0.92725 2023-10-03 13:38:41,800 WARNING [train.py:1204] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_88-341718-0_sp1.1 from training. Duration: 0.6 2023-10-03 13:38:43,209 WARNING [train.py:1204] (2/4) Exclude cut with ID _61079_210_4525_1_1534921008198_4011419_334-298697-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 13:38:44,656 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_5465_210_4211_1_1527227253_55357_8-2499-0_sp1.1 from training. Duration: 0.996375 2023-10-03 13:38:47,594 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_447-201565-0_sp1.1 from training. Duration: 0.97275 2023-10-03 13:38:47,605 WARNING [train.py:1204] (2/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_253-161789-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 13:38:47,608 WARNING [train.py:1204] (2/4) Exclude cut with ID _44304_210_15374_1_1533639352765_4613129_302-22084-0_sp0.9 from training. Duration: 0.9555625 2023-10-03 13:38:47,617 WARNING [train.py:1204] (2/4) Exclude cut with ID _26664_210_7770_1_1532258943280_3418270_187-80815-0_sp1.1 from training. Duration: 0.8454375 2023-10-03 13:38:48,936 WARNING [train.py:1204] (2/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_68-212801-0 from training. Duration: 0.73 2023-10-03 13:38:48,968 WARNING [train.py:1204] (2/4) Exclude cut with ID _29345_210_4381_1_1532689168263_3860169_149-257268-0 from training. Duration: 0.81 2023-10-03 13:38:56,823 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_213-103282-0_sp1.1 from training. Duration: 0.7090625 2023-10-03 13:38:56,950 WARNING [train.py:1204] (2/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_156-302675-0_sp1.1 from training. Duration: 0.5 2023-10-03 13:38:58,188 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.582e+02 1.927e+02 2.064e+02 2.320e+02 3.499e+02, threshold=4.128e+02, percent-clipped=0.0 2023-10-03 13:39:06,929 WARNING [train.py:1204] (2/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_125-322172-0 from training. Duration: 0.93 2023-10-03 13:39:08,475 WARNING [train.py:1204] (2/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_493-60474-0_sp1.1 from training. Duration: 0.963625 2023-10-03 13:39:11,160 WARNING [train.py:1204] (2/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_156-302675-0 from training. Duration: 0.55 2023-10-03 13:39:12,575 WARNING [train.py:1204] (2/4) Exclude cut with ID _41314_210_12651_1_1533459476543_3766460_123-267862-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 13:39:12,807 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.balancer1.prob, batch_count=1286046.6666666667, ans=0.125 2023-10-03 13:39:14,066 WARNING [train.py:1204] (2/4) Exclude cut with ID _28427_210_4220_1_1532581240967_3061069_201-143747-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 13:39:15,391 WARNING [train.py:1204] (2/4) Exclude cut with ID _8772_210_1850_1_1528635595972_3664011_96-12756-0_sp1.1 from training. Duration: 0.8909375 2023-10-03 13:39:15,429 WARNING [train.py:1204] (2/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_1347-145675-0_sp1.1 from training. Duration: 0.936375 2023-10-03 13:39:18,130 WARNING [train.py:1204] (2/4) Exclude cut with ID _9235_210_5219_1_1528891154253_8046120_387-159008-0_sp1.1 from training. Duration: 0.7818125 2023-10-03 13:39:18,149 WARNING [train.py:1204] (2/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_400-201896-0_sp1.1 from training. Duration: 0.963625 2023-10-03 13:39:18,390 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.conv_module2.balancer1.prob, batch_count=1286113.3333333333, ans=0.125 2023-10-03 13:39:19,736 WARNING [train.py:1204] (2/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_326-174612-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 13:39:19,772 WARNING [train.py:1204] (2/4) Exclude cut with ID _55844_210_2346_1_1534471212569_7625707_390-116233-0_sp1.1 from training. Duration: 0.963625 2023-10-03 13:39:20,942 WARNING [train.py:1204] (2/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_419-317062-0_sp1.1 from training. Duration: 0.82725 2023-10-03 13:39:22,306 WARNING [train.py:1204] (2/4) Exclude cut with ID _29877_210_6228_1_1532397791106_5276149_266-62291-0_sp0.9 from training. Duration: 0.9666875 2023-10-03 13:39:22,363 WARNING [train.py:1204] (2/4) Exclude cut with ID _7857_210_1534_1_1528286515745_6831470_431-338528-0_sp1.1 from training. Duration: 0.92725 2023-10-03 13:39:22,441 WARNING [train.py:1204] (2/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_87-131427-0_sp0.9 from training. Duration: 0.8666875 2023-10-03 13:39:26,448 WARNING [train.py:1204] (2/4) Exclude cut with ID _24055_210_12651_1_1532310907143_976979_92-59356-0_sp1.1 from training. Duration: 0.82725 2023-10-03 13:39:27,718 WARNING [train.py:1204] (2/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_96-290039-0 from training. Duration: 0.56 2023-10-03 13:39:29,747 WARNING [train.py:1204] (2/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_48-182155-0_sp0.9 from training. Duration: 0.9666875 2023-10-03 13:39:30,967 WARNING [train.py:1204] (2/4) Exclude cut with ID _4626_210_3532_1_1526817652717_3916859_446-206656-0 from training. Duration: 0.76 2023-10-03 13:39:32,796 WARNING [train.py:1204] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_168-32906-0 from training. Duration: 0.69 2023-10-03 13:39:34,040 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_3780_210_2087_1_1526467760_2130483_170-259044-0 from training. Duration: 0.7 2023-10-03 13:39:34,061 WARNING [train.py:1204] (2/4) Exclude cut with ID _65134_210_14763_1_1535335393985_3505368_153-216718-0_sp1.1 from training. Duration: 0.92725 2023-10-03 13:39:34,116 WARNING [train.py:1204] (2/4) Exclude cut with ID _64639_210_5816_1_1535367619437_3528000_461-329156-0_sp1.1 from training. Duration: 0.87275 2023-10-03 13:39:34,155 WARNING [train.py:1204] (2/4) Exclude cut with ID _21989_210_5854_1_1531899214931_3271360_126-347531-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 13:39:34,201 WARNING [train.py:1204] (2/4) Exclude cut with ID _7614_210_4915_1_1529326764092_4537359_96-162197-0_sp1.1 from training. Duration: 0.963625 2023-10-03 13:39:34,202 WARNING [train.py:1204] (2/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_1083-147536-0 from training. Duration: 0.97 2023-10-03 13:39:35,892 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.conv_module2.balancer1.max_abs, batch_count=1286180.0, ans=10.0 2023-10-03 13:39:37,110 WARNING [train.py:1204] (2/4) Exclude cut with ID _64029_210_4381_1_1535178502772_3756459_275-16710-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 13:39:38,538 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7264_210_4938_1_1527939911_8132095_800-91995-0_sp0.9 from training. Duration: 0.9555625 2023-10-03 13:39:39,887 WARNING [train.py:1204] (2/4) Exclude cut with ID _3844_210_1390_1_1526727803582_2871209_50-193879-0_sp1.1 from training. Duration: 0.936375 2023-10-03 13:39:42,478 WARNING [train.py:1204] (2/4) Exclude cut with ID _46952_210_5986_1_1534230010357_3107379_232-344503-0 from training. Duration: 0.96 2023-10-03 13:39:45,109 INFO [train.py:1046] (2/4) Epoch 37, batch 1700, loss[loss=0.1501, simple_loss=0.2349, pruned_loss=0.03269, over 24424.00 frames. ], tot_loss[loss=0.159, simple_loss=0.2382, pruned_loss=0.03993, over 4699262.32 frames. ], batch size: 63, lr: 2.76e-03, grad_scale: 16.0 2023-10-03 13:39:46,591 WARNING [train.py:1204] (2/4) Exclude cut with ID _25808_210_13292_1_1532343514562_7366109_206-14076-0_sp1.1 from training. Duration: 0.936375 2023-10-03 13:39:46,592 WARNING [train.py:1204] (2/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_55-31904-0_sp0.9 from training. Duration: 0.97775 2023-10-03 13:39:47,917 WARNING [train.py:1204] (2/4) Exclude cut with ID _14632_210_9988_1_1530691279392_3923200_161-129530-0 from training. Duration: 0.96 2023-10-03 13:39:49,287 WARNING [train.py:1204] (2/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_770-200430-0_sp1.1 from training. Duration: 0.8090625 2023-10-03 13:39:49,298 WARNING [train.py:1204] (2/4) Exclude cut with ID _6270_210_2084_1_1527850911573_3856420_26-277343-0_sp1.1 from training. Duration: 0.7454375 2023-10-03 13:39:49,299 WARNING [train.py:1204] (2/4) Exclude cut with ID _69884_210_9771_1_1535846392818_4166169_88-23776-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 13:39:50,759 WARNING [train.py:1204] (2/4) Exclude cut with ID _60860_210_8254_1_1534896015875_3557169_245-256666-0_sp1.1 from training. Duration: 0.8818125 2023-10-03 13:39:50,787 WARNING [train.py:1204] (2/4) Exclude cut with ID _5250_210_5219_1_1527249561806_8692110_636-251213-0_sp1.1 from training. Duration: 0.8545625 2023-10-03 13:39:52,100 WARNING [train.py:1204] (2/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_540-131164-0 from training. Duration: 0.92 2023-10-03 13:39:53,582 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_5931_210_2040_1_1527428481_4547999_321-193614-0_sp0.9 from training. Duration: 0.7444375 2023-10-03 13:40:02,144 WARNING [train.py:1204] (2/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_373-225403-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 13:40:04,757 WARNING [train.py:1204] (2/4) Exclude cut with ID _20622_210_3852_1_1531622427080_1093839_57-326027-0_sp1.1 from training. Duration: 0.97275 2023-10-03 13:40:09,220 WARNING [train.py:1204] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_605-257205-0_sp1.1 from training. Duration: 0.536375 2023-10-03 13:40:10,455 WARNING [train.py:1204] (2/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_442-320317-0_sp0.9 from training. Duration: 0.911125 2023-10-03 13:40:10,503 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_35361_210_4238_1_1533262810_7793476_675-87012-0_sp1.1 from training. Duration: 0.8090625 2023-10-03 13:40:11,753 WARNING [train.py:1204] (2/4) Exclude cut with ID _4184_210_2550_1_1526730192013_2219380_306-44736-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 13:40:13,205 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_124-113562-0 from training. Duration: 0.94 2023-10-03 13:40:15,974 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_2821_210_2715_1_1526108385_3763560_73-296673-0_sp0.9 from training. Duration: 0.92225 2023-10-03 13:40:15,990 WARNING [train.py:1204] (2/4) Exclude cut with ID _10301_210_5886_1_1529222275332_3910620_216-46959-0_sp1.1 from training. Duration: 0.92725 2023-10-03 13:40:17,382 WARNING [train.py:1204] (2/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_91-277684-0_sp0.9 from training. Duration: 0.82225 2023-10-03 13:40:17,484 WARNING [train.py:1204] (2/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_351-294650-0_sp0.9 from training. Duration: 0.7 2023-10-03 13:40:20,083 WARNING [train.py:1204] (2/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_209-205365-0 from training. Duration: 0.79 2023-10-03 13:40:20,142 WARNING [train.py:1204] (2/4) Exclude cut with ID _14603_210_4220_1_1530615688441_3097319_97-325053-0 from training. Duration: 0.92 2023-10-03 13:40:21,554 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_49348_210_7927_1_1533954500_3691300_109-276430-0_sp1.1 from training. Duration: 0.92725 2023-10-03 13:40:22,933 WARNING [train.py:1204] (2/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_912-47473-0 from training. Duration: 0.68 2023-10-03 13:40:23,020 WARNING [train.py:1204] (2/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_96-320736-0_sp0.9 from training. Duration: 0.9555625 2023-10-03 13:40:31,722 WARNING [train.py:1204] (2/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_1252-235481-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 13:40:32,986 WARNING [train.py:1204] (2/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_813-261554-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 13:40:34,286 WARNING [train.py:1204] (2/4) Exclude cut with ID _21632_210_2253_1_1531994001765_3744140_110-274654-0_sp0.9 from training. Duration: 0.911125 2023-10-03 13:40:35,689 WARNING [train.py:1204] (2/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_381-317791-0_sp1.1 from training. Duration: 0.663625 2023-10-03 13:40:35,701 WARNING [train.py:1204] (2/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_51-117576-0_sp1.1 from training. Duration: 0.363625 2023-10-03 13:40:35,725 WARNING [train.py:1204] (2/4) Exclude cut with ID _42321_210_7770_1_1533468631352_4366895_104-215915-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 13:40:37,160 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_40896_210_15710_1_1533433956_7100802_216-22824-0_sp1.1 from training. Duration: 0.92725 2023-10-03 13:40:37,161 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_14831_210_7927_1_1530700200_5583610_181-27467-0 from training. Duration: 0.91 2023-10-03 13:40:38,501 WARNING [train.py:1204] (2/4) Exclude cut with ID _30304_210_6319_1_1532476827261_7582060_1024-265645-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 13:40:38,504 WARNING [train.py:1204] (2/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_412-233904-0_sp1.1 from training. Duration: 0.936375 2023-10-03 13:40:38,521 WARNING [train.py:1204] (2/4) Exclude cut with ID _30191_210_1664_1_1533189915047_7196670_911-223461-0_sp1.1 from training. Duration: 0.92725 2023-10-03 13:40:38,538 WARNING [train.py:1204] (2/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_558-148266-0_sp1.1 from training. Duration: 0.963625 2023-10-03 13:40:41,230 WARNING [train.py:1204] (2/4) Exclude cut with ID _58480_210_12392_1_1534811121111_3646160_212-164495-0_sp1.1 from training. Duration: 0.936375 2023-10-03 13:40:41,231 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_454-276429-0_sp0.9 from training. Duration: 0.9666875 2023-10-03 13:40:42,555 WARNING [train.py:1204] (2/4) Exclude cut with ID _24736_210_3279_1_1532332894218_2838700_73-197086-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 13:40:42,605 WARNING [train.py:1204] (2/4) Exclude cut with ID _5294_210_1747_1_1527249643928_3696179_529-242298-0_sp1.1 from training. Duration: 0.863625 2023-10-03 13:40:44,047 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_53555_210_5632_1_1534595220_7232297_345-171438-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 13:40:48,295 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_28211_210_6388_1_1532861506_4523224_22-25766-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 13:40:49,635 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7002_210_1794_1_1527939683_5157286_163-87552-0 from training. Duration: 0.86 2023-10-03 13:40:51,005 WARNING [train.py:1204] (2/4) Exclude cut with ID _56927_210_9969_1_1534848970840_3695229_409-225129-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 13:40:51,107 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_26192_210_7927_1_1532137269_1040115_21-219331-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 13:40:51,788 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.3.encoder.layers.0.self_attn_weights.whiten_keys, num_groups=8, num_channels=256, metric=4.51 vs. limit=6.0 2023-10-03 13:40:54,371 WARNING [train.py:1204] (2/4) Exclude cut with ID _15012_210_1968_1_1530856820429_5095350_662-210469-0 from training. Duration: 0.57 2023-10-03 13:40:58,903 INFO [train.py:1046] (2/4) Epoch 37, batch 1750, loss[loss=0.1627, simple_loss=0.2337, pruned_loss=0.04585, over 23790.00 frames. ], tot_loss[loss=0.1588, simple_loss=0.2376, pruned_loss=0.03997, over 4693012.28 frames. ], batch size: 212, lr: 2.76e-03, grad_scale: 16.0 2023-10-03 13:41:00,933 WARNING [train.py:1204] (2/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_582-191016-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 13:41:02,396 WARNING [train.py:1204] (2/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_270-210032-0_sp1.1 from training. Duration: 0.963625 2023-10-03 13:41:02,422 WARNING [train.py:1204] (2/4) Exclude cut with ID _6789_210_1534_1_1527857937218_3118950_262-48853-0_sp0.9 from training. Duration: 0.62225 2023-10-03 13:41:03,755 WARNING [train.py:1204] (2/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_847-41334-0 from training. Duration: 0.44 2023-10-03 13:41:03,785 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_40896_210_15710_1_1533433956_7100802_573-46532-0_sp1.1 from training. Duration: 0.92725 2023-10-03 13:41:06,456 WARNING [train.py:1204] (2/4) Exclude cut with ID _21881_210_3525_1_1531825242757_3798380_247-102591-0_sp1.1 from training. Duration: 0.8909375 2023-10-03 13:41:06,486 WARNING [train.py:1204] (2/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_200-21395-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 13:41:11,767 WARNING [train.py:1204] (2/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_711-15688-0 from training. Duration: 0.43 2023-10-03 13:41:12,002 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.ff3_skip_rate, batch_count=1286646.6666666667, ans=0.0 2023-10-03 13:41:14,496 WARNING [train.py:1204] (2/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_53-75025-0_sp1.1 from training. Duration: 0.963625 2023-10-03 13:41:15,941 WARNING [train.py:1204] (2/4) Exclude cut with ID _6194_210_1534_1_1527511826160_3941194_321-148641-0 from training. Duration: 0.64 2023-10-03 13:41:17,536 WARNING [train.py:1204] (2/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_337-294431-0_sp1.1 from training. Duration: 0.936375 2023-10-03 13:41:17,638 WARNING [train.py:1204] (2/4) Exclude cut with ID _21632_210_2253_1_1531994001765_3744140_110-274654-0_sp1.1 from training. Duration: 0.7454375 2023-10-03 13:41:21,725 WARNING [train.py:1204] (2/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_135-174080-0_sp1.1 from training. Duration: 0.4818125 2023-10-03 13:41:23,095 WARNING [train.py:1204] (2/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_21-174514-0 from training. Duration: 0.43 2023-10-03 13:41:24,677 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.554e+02 1.864e+02 2.125e+02 2.477e+02 3.687e+02, threshold=4.251e+02, percent-clipped=0.0 2023-10-03 13:41:24,866 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_38965_210_4852_1_1533174906_3484735_215-294589-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 13:41:24,889 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_548-294316-0 from training. Duration: 0.81 2023-10-03 13:41:25,131 INFO [scaling.py:1118] (2/4) WithLoss: name=encoder.encoders.3.encoder.layers.3.self_attn_weights, loss-sum=0.000e+00 2023-10-03 13:41:28,451 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.ff3_skip_rate, batch_count=1286713.3333333333, ans=0.0 2023-10-03 13:41:34,456 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_103-84010-0_sp1.1 from training. Duration: 0.62725 2023-10-03 13:41:35,971 WARNING [train.py:1204] (2/4) Exclude cut with ID _18199_210_6109_1_1531475789864_4288790_295-198247-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 13:41:37,304 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_13757_210_3528_1_1530440682_4107239_103-313224-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 13:41:37,801 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.5.encoder.layers.1.conv_module2.whiten, num_groups=1, num_channels=256, metric=5.99 vs. limit=15.0 2023-10-03 13:41:40,048 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_34886_210_10633_1_1533203882_1755661_67-294673-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 13:41:40,065 WARNING [train.py:1204] (2/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_350-101734-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 13:41:41,533 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_37357_210_17024_1_1533089504_2316121_204-153986-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 13:41:41,679 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=1286780.0, ans=0.1 2023-10-03 13:41:44,217 WARNING [train.py:1204] (2/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_673-115569-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 13:41:45,648 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_489-230703-0_sp0.9 from training. Duration: 0.9444375 2023-10-03 13:41:46,983 WARNING [train.py:1204] (2/4) Exclude cut with ID _5250_210_5219_1_1527249561806_8692110_575-102496-0_sp1.1 from training. Duration: 0.936375 2023-10-03 13:41:48,301 WARNING [train.py:1204] (2/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_669-93163-0 from training. Duration: 0.98 2023-10-03 13:41:49,691 WARNING [train.py:1204] (2/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_249-281349-0_sp0.9 from training. Duration: 0.9444375 2023-10-03 13:41:52,566 WARNING [train.py:1204] (2/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_106-30782-0 from training. Duration: 0.8 2023-10-03 13:41:53,251 WARNING [train.py:1204] (2/4) Exclude cut with ID _8772_210_1850_1_1528635595972_3664011_398-320049-0_sp1.1 from training. Duration: 0.8 2023-10-03 13:41:54,532 WARNING [train.py:1204] (2/4) Exclude cut with ID _44778_210_2054_1_1533798127203_7443090_516-151727-0_sp1.1 from training. Duration: 0.963625 2023-10-03 13:41:55,860 WARNING [train.py:1204] (2/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_325-62802-0_sp1.1 from training. Duration: 0.7909375 2023-10-03 13:41:59,201 WARNING [train.py:1204] (2/4) Exclude cut with ID _14368_210_4220_1_1530532714870_3595529_93-58002-0_sp0.9 from training. Duration: 0.6333125 2023-10-03 13:42:00,558 WARNING [train.py:1204] (2/4) Exclude cut with ID _4681_210_3525_1_1527251040321_1235713_123-24428-0_sp1.1 from training. Duration: 0.436375 2023-10-03 13:42:01,898 WARNING [train.py:1204] (2/4) Exclude cut with ID _5250_210_5219_1_1527249561806_8692110_1185-56473-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 13:42:03,784 WARNING [train.py:1204] (2/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_96-244299-0_sp1.1 from training. Duration: 0.8 2023-10-03 13:42:07,927 WARNING [train.py:1204] (2/4) Exclude cut with ID _59500_210_8254_1_1534809610680_3736009_104-72815-0_sp1.1 from training. Duration: 0.963625 2023-10-03 13:42:08,196 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.balancer1.prob, batch_count=1286846.6666666667, ans=0.125 2023-10-03 13:42:10,644 WARNING [train.py:1204] (2/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_468-283113-0_sp1.1 from training. Duration: 0.87275 2023-10-03 13:42:11,017 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.self_attn_weights.pos_emb_skip_rate, batch_count=1286913.3333333333, ans=0.0 2023-10-03 13:42:12,020 INFO [train.py:1046] (2/4) Epoch 37, batch 1800, loss[loss=0.1333, simple_loss=0.2154, pruned_loss=0.02559, over 20367.00 frames. ], tot_loss[loss=0.1578, simple_loss=0.2364, pruned_loss=0.03963, over 4685005.73 frames. ], batch size: 44, lr: 2.76e-03, grad_scale: 16.0 2023-10-03 13:42:12,075 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6239_210_4211_1_1527765584_4306698_528-102186-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 13:42:12,139 WARNING [train.py:1204] (2/4) Exclude cut with ID _45297_210_2721_1_1533715351381_3264949_351-147136-0 from training. Duration: 0.87 2023-10-03 13:42:12,143 WARNING [train.py:1204] (2/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_520-51744-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 13:42:14,893 WARNING [train.py:1204] (2/4) Exclude cut with ID _5166_210_2084_1_1527073197160_4087993_310-114452-0_sp1.1 from training. Duration: 0.536375 2023-10-03 13:42:14,897 WARNING [train.py:1204] (2/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_450-165710-0_sp1.1 from training. Duration: 0.92725 2023-10-03 13:42:14,917 WARNING [train.py:1204] (2/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_280-120942-0_sp1.1 from training. Duration: 0.7 2023-10-03 13:42:14,940 WARNING [train.py:1204] (2/4) Exclude cut with ID _65585_210_12210_1_1535450451584_3764098_112-294301-0_sp0.9 from training. Duration: 0.97775 2023-10-03 13:42:16,295 WARNING [train.py:1204] (2/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_98-51676-0_sp0.9 from training. Duration: 0.811125 2023-10-03 13:42:19,017 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_33348_210_5632_1_1533086912_3324030_85-28583-0_sp1.1 from training. Duration: 0.7454375 2023-10-03 13:42:19,104 WARNING [train.py:1204] (2/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_336-208232-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 13:42:20,468 WARNING [train.py:1204] (2/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_339-104791-0_sp0.9 from training. Duration: 0.5444375 2023-10-03 13:42:23,145 WARNING [train.py:1204] (2/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_559-29840-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 13:42:24,675 WARNING [train.py:1204] (2/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_117-290916-0_sp0.9 from training. Duration: 0.5666875 2023-10-03 13:42:26,490 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_43802_210_2290_1_1533516203_8829504_185-112289-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 13:42:29,797 WARNING [train.py:1204] (2/4) Exclude cut with ID _59256_210_5986_1_1535076085997_3589120_155-298797-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 13:42:32,466 WARNING [train.py:1204] (2/4) Exclude cut with ID _44659_210_2721_1_1534036120061_6614860_246-206531-0_sp1.1 from training. Duration: 0.92725 2023-10-03 13:42:32,514 WARNING [train.py:1204] (2/4) Exclude cut with ID _23818_210_6228_1_1531917063884_3676920_469-151944-0_sp1.1 from training. Duration: 0.92725 2023-10-03 13:42:34,378 WARNING [train.py:1204] (2/4) Exclude cut with ID _13252_210_1925_1_1530671372047_3336909_88-276668-0_sp1.1 from training. Duration: 0.9 2023-10-03 13:42:35,796 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_15101_210_7927_1_1530786415_5033274_320-327527-0_sp1.1 from training. Duration: 0.87275 2023-10-03 13:42:37,225 WARNING [train.py:1204] (2/4) Exclude cut with ID _14998_210_5219_1_1530705582080_7474970_446-112117-0 from training. Duration: 0.9 2023-10-03 13:42:37,287 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_30280_210_1881_1_1532872779_3660139_400-254159-0_sp1.1 from training. Duration: 0.936375 2023-10-03 13:42:40,111 WARNING [train.py:1204] (2/4) Exclude cut with ID _40284_210_2084_1_1533513679814_3704279_158-345341-0_sp1.1 from training. Duration: 0.936375 2023-10-03 13:42:42,141 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.self_attn2.whiten.whitening_limit, batch_count=1287046.6666666667, ans=22.5 2023-10-03 13:42:42,939 WARNING [train.py:1204] (2/4) Exclude cut with ID 3033-130750-0096-55598-0 from training. Duration: 0.83 2023-10-03 13:42:45,644 WARNING [train.py:1204] (2/4) Exclude cut with ID _9389_210_1664_1_1529217337301_5289269_379-128794-0 from training. Duration: 0.85 2023-10-03 13:42:45,659 WARNING [train.py:1204] (2/4) Exclude cut with ID _3675_210_4557_1_1526639934408_4182089_456-18305-0 from training. Duration: 0.76 2023-10-03 13:42:45,698 WARNING [train.py:1204] (2/4) Exclude cut with ID _29853_210_7497_1_1532505600851_6642110_571-249460-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 13:42:46,178 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.4.encoder.layers.2.self_attn2.whiten, num_groups=1, num_channels=384, metric=12.07 vs. limit=22.5 2023-10-03 13:42:48,247 WARNING [train.py:1204] (2/4) Exclude cut with ID _4068_210_2629_1_1526697350395_1546259_230-325568-0_sp1.1 from training. Duration: 0.92725 2023-10-03 13:42:48,248 WARNING [train.py:1204] (2/4) Exclude cut with ID _34415_210_8750_1_1533085213375_3451116_213-267570-0_sp1.1 from training. Duration: 0.963625 2023-10-03 13:42:48,327 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6767_210_3273_1_1527855783_1718932_142-127132-0_sp1.1 from training. Duration: 0.863625 2023-10-03 13:42:55,205 WARNING [train.py:1204] (2/4) Exclude cut with ID IC0653W0001-322925 from training. Duration: 0.896 2023-10-03 13:42:57,156 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_4528_210_3273_1_1527076654_3276777_209-219804-0_sp1.1 from training. Duration: 0.62725 2023-10-03 13:42:58,526 WARNING [train.py:1204] (2/4) Exclude cut with ID _55844_210_2346_1_1534471212569_7625707_187-204971-0_sp1.1 from training. Duration: 0.936375 2023-10-03 13:42:58,647 WARNING [train.py:1204] (2/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_369-325184-0 from training. Duration: 0.99 2023-10-03 13:42:58,681 WARNING [train.py:1204] (2/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_791-45149-0 from training. Duration: 0.86 2023-10-03 13:43:00,479 WARNING [train.py:1204] (2/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_317-211107-0_sp1.1 from training. Duration: 0.836375 2023-10-03 13:43:00,581 WARNING [train.py:1204] (2/4) Exclude cut with ID _30800_210_22382_1_1532499347904_4515959_488-330913-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 13:43:02,008 WARNING [train.py:1204] (2/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_506-42614-0_sp1.1 from training. Duration: 0.7181875 2023-10-03 13:43:02,281 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.conv_module2.balancer2.prob, batch_count=1287113.3333333333, ans=0.125 2023-10-03 13:43:05,813 WARNING [train.py:1204] (2/4) Exclude cut with ID _8696_210_1876_1_1528545632440_3345058_209-54069-0 from training. Duration: 0.67 2023-10-03 13:43:11,325 WARNING [train.py:1204] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_215-202973-0_sp1.1 from training. Duration: 0.8 2023-10-03 13:43:11,399 WARNING [train.py:1204] (2/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_321-215742-0 from training. Duration: 0.5 2023-10-03 13:43:12,781 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_40896_210_15710_1_1533433956_7100802_428-32114-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 13:43:12,800 WARNING [train.py:1204] (2/4) Exclude cut with ID _27013_210_1389_1_1532437173240_3252649_467-263151-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 13:43:12,857 WARNING [train.py:1204] (2/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_734-129761-0_sp0.9 from training. Duration: 0.911125 2023-10-03 13:43:14,133 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_521-214276-0 from training. Duration: 0.44 2023-10-03 13:43:15,622 WARNING [train.py:1204] (2/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_80-147301-0_sp1.1 from training. Duration: 0.763625 2023-10-03 13:43:15,624 WARNING [train.py:1204] (2/4) Exclude cut with ID _9679_210_7497_1_1529129212906_4363929_56-307796-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 13:43:18,435 WARNING [train.py:1204] (2/4) Exclude cut with ID _14832_210_8750_1_1530691351539_3171348_1-54315-0 from training. Duration: 0.87 2023-10-03 13:43:18,436 WARNING [train.py:1204] (2/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_558-155750-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 13:43:20,093 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.balancer1.prob, batch_count=1287180.0, ans=0.125 2023-10-03 13:43:21,919 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_142-309370-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 13:43:21,964 WARNING [train.py:1204] (2/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_18-46756-0_sp0.9 from training. Duration: 0.87775 2023-10-03 13:43:23,047 WARNING [train.py:1204] (2/4) Exclude cut with ID _61148_210_6897_1_1535094022726_4217610_279-212159-0_sp1.1 from training. Duration: 0.936375 2023-10-03 13:43:24,409 WARNING [train.py:1204] (2/4) Exclude cut with ID _21987_210_5854_1_1531821500482_3637038_164-2641-0_sp1.1 from training. Duration: 0.936375 2023-10-03 13:43:24,450 WARNING [train.py:1204] (2/4) Exclude cut with ID _8064_210_5399_1_1528372299291_4282973_395-340852-0_sp1.1 from training. Duration: 0.8454375 2023-10-03 13:43:24,501 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder_embed.convnext.layerdrop_rate, batch_count=1287246.6666666667, ans=0.015 2023-10-03 13:43:25,738 INFO [train.py:1046] (2/4) Epoch 37, batch 1850, loss[loss=0.1569, simple_loss=0.2321, pruned_loss=0.04089, over 23802.00 frames. ], tot_loss[loss=0.1585, simple_loss=0.2369, pruned_loss=0.04, over 4679595.04 frames. ], batch size: 179, lr: 2.76e-03, grad_scale: 16.0 2023-10-03 13:43:25,918 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1077-184261-0_sp1.1 from training. Duration: 0.963625 2023-10-03 13:43:25,933 WARNING [train.py:1204] (2/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_1176-204992-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 13:43:29,187 WARNING [train.py:1204] (2/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_563-347832-0_sp1.1 from training. Duration: 0.8090625 2023-10-03 13:43:29,438 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.feed_forward2.hidden_balancer.prob, batch_count=1287246.6666666667, ans=0.125 2023-10-03 13:43:30,555 WARNING [train.py:1204] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_380-204756-0_sp0.9 from training. Duration: 0.9555625 2023-10-03 13:43:32,179 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.conv_module2.balancer2.prob, batch_count=1287246.6666666667, ans=0.125 2023-10-03 13:43:36,708 WARNING [train.py:1204] (2/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_212-10501-0_sp1.1 from training. Duration: 0.8545625 2023-10-03 13:43:38,041 WARNING [train.py:1204] (2/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_445-117398-0 from training. Duration: 0.89 2023-10-03 13:43:40,779 WARNING [train.py:1204] (2/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_29-280622-0 from training. Duration: 0.82 2023-10-03 13:43:40,969 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.feed_forward2.hidden_balancer.prob, batch_count=1287313.3333333333, ans=0.125 2023-10-03 13:43:43,453 WARNING [train.py:1204] (2/4) Exclude cut with ID _26664_210_7770_1_1532258943280_3418270_187-80815-0 from training. Duration: 0.93 2023-10-03 13:43:46,249 WARNING [train.py:1204] (2/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_414-349640-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 13:43:46,266 WARNING [train.py:1204] (2/4) Exclude cut with ID _45021_210_9994_1_1533619751094_3330288_96-239010-0 from training. Duration: 0.91 2023-10-03 13:43:46,269 WARNING [train.py:1204] (2/4) Exclude cut with ID _5189_210_4557_1_1527248319428_3769229_199-53526-0_sp0.9 from training. Duration: 0.4666875 2023-10-03 13:43:52,036 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.628e+02 1.939e+02 2.095e+02 2.336e+02 4.113e+02, threshold=4.190e+02, percent-clipped=0.0 2023-10-03 13:43:56,473 WARNING [train.py:1204] (2/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_63-101996-0_sp1.1 from training. Duration: 0.8 2023-10-03 13:43:57,813 WARNING [train.py:1204] (2/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_199-86432-0 from training. Duration: 0.98 2023-10-03 13:44:01,096 WARNING [train.py:1204] (2/4) Exclude cut with ID _36399_210_16049_1_1532998771377_3698333_207-259013-0_sp1.1 from training. Duration: 0.92725 2023-10-03 13:44:01,138 WARNING [train.py:1204] (2/4) Exclude cut with ID _62574_210_2655_1_1535180407058_3933249_567-111756-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 13:44:07,333 WARNING [train.py:1204] (2/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_436-318113-0 from training. Duration: 0.75 2023-10-03 13:44:07,379 WARNING [train.py:1204] (2/4) Exclude cut with ID _53379_210_20034_1_1534222027363_3854654_43-305214-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 13:44:07,399 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_15126_210_6753_1_1530778677_2591185_113-157651-0_sp1.1 from training. Duration: 0.6181875 2023-10-03 13:44:08,776 WARNING [train.py:1204] (2/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_146-218763-0_sp1.1 from training. Duration: 0.7818125 2023-10-03 13:44:10,301 WARNING [train.py:1204] (2/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_714-310538-0_sp0.9 from training. Duration: 0.9555625 2023-10-03 13:44:14,378 WARNING [train.py:1204] (2/4) Exclude cut with ID _24271_210_12658_1_1531987270022_7190519_709-16721-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 13:44:16,035 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.conv_module1.balancer1.prob, batch_count=1287446.6666666667, ans=0.125 2023-10-03 13:44:17,120 WARNING [train.py:1204] (2/4) Exclude cut with ID _5190_210_4557_1_1527244368799_373439_28-197030-0_sp0.9 from training. Duration: 0.8 2023-10-03 13:44:17,159 WARNING [train.py:1204] (2/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_267-197536-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 13:44:17,181 WARNING [train.py:1204] (2/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_658-282610-0_sp1.1 from training. Duration: 0.4818125 2023-10-03 13:44:18,431 WARNING [train.py:1204] (2/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_257-67050-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 13:44:19,858 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_14906_210_7927_1_1530754190_9408633_961-53402-0_sp1.1 from training. Duration: 0.936375 2023-10-03 13:44:21,621 WARNING [train.py:1204] (2/4) Exclude cut with ID _60113_210_16271_1_1535164212481_4386079_490-280290-0_sp1.1 from training. Duration: 0.8909375 2023-10-03 13:44:23,158 WARNING [train.py:1204] (2/4) Exclude cut with ID _14100_210_8341_1_1530529320613_3678069_258-224599-0 from training. Duration: 0.77 2023-10-03 13:44:24,536 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6961_210_2087_1_1527937168_6815011_565-326898-0_sp1.1 from training. Duration: 0.936375 2023-10-03 13:44:28,860 WARNING [train.py:1204] (2/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_239-316802-0_sp1.1 from training. Duration: 0.62725 2023-10-03 13:44:30,209 WARNING [train.py:1204] (2/4) Exclude cut with ID _55857_210_2346_1_1534644007831_6898209_745-98391-0_sp1.1 from training. Duration: 0.6909375 2023-10-03 13:44:30,213 WARNING [train.py:1204] (2/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_154-135696-0 from training. Duration: 0.8 2023-10-03 13:44:30,215 WARNING [train.py:1204] (2/4) Exclude cut with ID _14368_210_4220_1_1530532714870_3595529_93-58002-0 from training. Duration: 0.57 2023-10-03 13:44:32,313 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9025W0005-507335 from training. Duration: 0.896 2023-10-03 13:44:33,669 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9029W0034-509435 from training. Duration: 0.64 2023-10-03 13:44:35,100 WARNING [train.py:1204] (2/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_76-203867-0_sp1.1 from training. Duration: 0.6545625 2023-10-03 13:44:35,115 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_46639_210_8341_1_1533726088_4495500_66-346325-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 13:44:35,121 WARNING [train.py:1204] (2/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_145-137790-0_sp0.9 from training. Duration: 0.92225 2023-10-03 13:44:35,134 WARNING [train.py:1204] (2/4) Exclude cut with ID _36522_210_8887_1_1533177030611_1772478_7-327299-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 13:44:36,910 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9042W0009-515615 from training. Duration: 0.896 2023-10-03 13:44:36,921 WARNING [train.py:1204] (2/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_39-248870-0_sp0.9 from training. Duration: 0.7666875 2023-10-03 13:44:36,969 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_35441_210_16554_1_1533347572_4092680_362-242912-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 13:44:38,333 WARNING [train.py:1204] (2/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_216-343007-0_sp1.1 from training. Duration: 0.6 2023-10-03 13:44:39,624 INFO [train.py:1046] (2/4) Epoch 37, batch 1900, loss[loss=0.1643, simple_loss=0.2498, pruned_loss=0.03939, over 23819.00 frames. ], tot_loss[loss=0.1594, simple_loss=0.2382, pruned_loss=0.04033, over 4678521.13 frames. ], batch size: 86, lr: 2.76e-03, grad_scale: 16.0 2023-10-03 13:44:39,667 WARNING [train.py:1204] (2/4) Exclude cut with ID _3867_210_1883_1_1526641200575_4475830_272-215563-0_sp0.9 from training. Duration: 0.6333125 2023-10-03 13:44:39,752 WARNING [train.py:1204] (2/4) Exclude cut with ID _21783_210_11357_1_1531742458730_3762679_553-175603-0_sp1.1 from training. Duration: 0.92725 2023-10-03 13:44:39,766 WARNING [train.py:1204] (2/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_963-187109-0 from training. Duration: 0.99 2023-10-03 13:44:41,200 WARNING [train.py:1204] (2/4) Exclude cut with ID _44973_210_1511_1_1533794381163_3634770_495-211466-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 13:44:41,210 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9068W0002-529085 from training. Duration: 0.896 2023-10-03 13:44:41,224 WARNING [train.py:1204] (2/4) Exclude cut with ID _5294_210_1747_1_1527249643928_3696179_180-100713-0_sp1.1 from training. Duration: 0.736375 2023-10-03 13:44:43,015 WARNING [train.py:1204] (2/4) Exclude cut with ID _35799_210_5854_1_1533263487887_3529071_149-49689-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 13:44:48,465 WARNING [train.py:1204] (2/4) Exclude cut with ID _67725_210_27767_1_1535608756671_5105359_36-220794-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 13:44:51,218 WARNING [train.py:1204] (2/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_338-96520-0_sp1.1 from training. Duration: 0.8545625 2023-10-03 13:44:53,080 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9104W0004-547160 from training. Duration: 0.896 2023-10-03 13:44:53,163 WARNING [train.py:1204] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_330-327861-0 from training. Duration: 0.67 2023-10-03 13:44:54,548 WARNING [train.py:1204] (2/4) Exclude cut with ID _14087_210_2253_1_1530529224512_3014029_156-257607-0_sp0.9 from training. Duration: 0.92225 2023-10-03 13:44:55,930 WARNING [train.py:1204] (2/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_476-322050-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 13:44:55,939 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9118W0017-553385 from training. Duration: 0.896 2023-10-03 13:44:57,261 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9120W0028-554435 from training. Duration: 0.896 2023-10-03 13:45:00,138 WARNING [train.py:1204] (2/4) Exclude cut with ID _56524_210_9642_1_1534572011779_3619170_145-310842-0 from training. Duration: 0.99 2023-10-03 13:45:01,534 WARNING [train.py:1204] (2/4) Exclude cut with ID _51631_210_8254_1_1534129228420_4084794_270-222508-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 13:45:05,154 WARNING [train.py:1204] (2/4) Exclude cut with ID _14268_210_9206_1_1530622771285_4714099_331-105351-0 from training. Duration: 0.65 2023-10-03 13:45:06,575 WARNING [train.py:1204] (2/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_40-129756-0 from training. Duration: 0.81 2023-10-03 13:45:06,898 INFO [scaling.py:1118] (2/4) WithLoss: name=encoder.encoders.4.encoder.layers.1.self_attn_weights, loss-sum=0.000e+00 2023-10-03 13:45:15,798 WARNING [train.py:1204] (2/4) Exclude cut with ID _3541_210_2477_1_1526475408709_3263973_231-104736-0 from training. Duration: 0.78 2023-10-03 13:45:18,568 WARNING [train.py:1204] (2/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_214-291369-0 from training. Duration: 0.63 2023-10-03 13:45:18,582 WARNING [train.py:1204] (2/4) Exclude cut with ID _62574_210_2655_1_1535180407058_3933249_369-84347-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 13:45:18,640 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9210W0111-599240 from training. Duration: 0.896 2023-10-03 13:45:18,643 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_92-246270-0 from training. Duration: 0.85 2023-10-03 13:45:18,851 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.balancer1.prob, batch_count=1287713.3333333333, ans=0.125 2023-10-03 13:45:19,918 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_37152_210_9697_1_1533106573_3844188_29-239070-0 from training. Duration: 0.98 2023-10-03 13:45:19,987 WARNING [train.py:1204] (2/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_239-22076-0 from training. Duration: 0.85 2023-10-03 13:45:19,989 WARNING [train.py:1204] (2/4) Exclude cut with ID _65962_210_29927_1_1535436026519_4174930_368-44639-0_sp1.1 from training. Duration: 0.97275 2023-10-03 13:45:24,507 WARNING [train.py:1204] (2/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_152-212533-0 from training. Duration: 0.97 2023-10-03 13:45:27,242 WARNING [train.py:1204] (2/4) Exclude cut with ID _14603_210_4220_1_1530615688441_3097319_90-31383-0_sp1.1 from training. Duration: 0.7909375 2023-10-03 13:45:29,992 WARNING [train.py:1204] (2/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_196-152868-0_sp1.1 from training. Duration: 0.92725 2023-10-03 13:45:29,993 WARNING [train.py:1204] (2/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_140-181176-0 from training. Duration: 0.92 2023-10-03 13:45:31,445 WARNING [train.py:1204] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_168-32906-0_sp0.9 from training. Duration: 0.7666875 2023-10-03 13:45:34,816 WARNING [train.py:1204] (2/4) Exclude cut with ID _14922_210_6228_1_1530707435273_3693310_13-165512-0 from training. Duration: 0.82 2023-10-03 13:45:34,858 WARNING [train.py:1204] (2/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_381-317791-0_sp0.9 from training. Duration: 0.811125 2023-10-03 13:45:39,787 WARNING [train.py:1204] (2/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_63-267585-0_sp1.1 from training. Duration: 0.8181875 2023-10-03 13:45:39,789 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_326-303945-0_sp1.1 from training. Duration: 0.9 2023-10-03 13:45:39,800 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_194-143381-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 13:45:41,138 WARNING [train.py:1204] (2/4) Exclude cut with ID _39290_210_4220_1_1533862778760_3075118_194-16667-0_sp1.1 from training. Duration: 0.963625 2023-10-03 13:45:41,236 WARNING [train.py:1204] (2/4) Exclude cut with ID _15012_210_1968_1_1530856820429_5095350_662-210469-0_sp0.9 from training. Duration: 0.6333125 2023-10-03 13:45:42,831 WARNING [train.py:1204] (2/4) Exclude cut with ID _4697_210_2069_1_1526815801953_3823098_68-219807-0_sp1.1 from training. Duration: 0.42725 2023-10-03 13:45:44,110 WARNING [train.py:1204] (2/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_321-22821-0_sp0.9 from training. Duration: 0.988875 2023-10-03 13:45:45,608 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_14056_210_4163_1_1530604483_7572827_1037-45317-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 13:45:45,610 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_403-39619-0_sp1.1 from training. Duration: 0.763625 2023-10-03 13:45:49,557 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_20120_210_6753_1_1531643035_5482859_510-96996-0_sp0.9 from training. Duration: 0.9666875 2023-10-03 13:45:49,560 WARNING [train.py:1204] (2/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_225-180155-0_sp1.1 from training. Duration: 0.92725 2023-10-03 13:45:49,595 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_278-272612-0_sp0.9 from training. Duration: 0.811125 2023-10-03 13:45:50,907 WARNING [train.py:1204] (2/4) Exclude cut with ID _53713_210_24116_1_1534314487762_7416751_691-70244-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 13:45:54,084 INFO [train.py:1046] (2/4) Epoch 37, batch 1950, loss[loss=0.1642, simple_loss=0.2446, pruned_loss=0.04188, over 23479.00 frames. ], tot_loss[loss=0.1598, simple_loss=0.2386, pruned_loss=0.0405, over 4681345.97 frames. ], batch size: 93, lr: 2.76e-03, grad_scale: 16.0 2023-10-03 13:45:54,217 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6788_210_1388_1_1527898698_4562021_91-197981-0_sp1.1 from training. Duration: 0.7545625 2023-10-03 13:45:54,468 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.conv_module1.balancer2.prob, batch_count=1287913.3333333333, ans=0.125 2023-10-03 13:45:56,938 WARNING [train.py:1204] (2/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_60-18123-0_sp0.9 from training. Duration: 0.97775 2023-10-03 13:45:56,969 WARNING [train.py:1204] (2/4) Exclude cut with ID _35799_210_5854_1_1533263487887_3529071_5-214617-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 13:45:56,975 WARNING [train.py:1204] (2/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_890-5117-0_sp1.1 from training. Duration: 0.7090625 2023-10-03 13:45:59,667 WARNING [train.py:1204] (2/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_660-211088-0 from training. Duration: 0.62 2023-10-03 13:45:59,723 WARNING [train.py:1204] (2/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_314-126819-0_sp0.9 from training. Duration: 0.5555625 2023-10-03 13:46:01,574 WARNING [train.py:1204] (2/4) Exclude cut with ID _21632_210_2253_1_1531994001765_3744140_362-66225-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 13:46:01,673 WARNING [train.py:1204] (2/4) Exclude cut with ID _61118_210_9527_1_1534897668884_3752630_173-258555-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 13:46:05,654 WARNING [train.py:1204] (2/4) Exclude cut with ID _7719_210_3525_1_1528456153163_4296960_88-250259-0_sp0.9 from training. Duration: 0.9333125 2023-10-03 13:46:05,692 WARNING [train.py:1204] (2/4) Exclude cut with ID _15140_210_9288_1_1530791388450_4880149_539-153708-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 13:46:05,713 WARNING [train.py:1204] (2/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_253-42544-0_sp1.1 from training. Duration: 0.97275 2023-10-03 13:46:07,100 WARNING [train.py:1204] (2/4) Exclude cut with ID _33640_210_2655_1_1533117605479_4855469_479-296877-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 13:46:10,568 WARNING [train.py:1204] (2/4) Exclude cut with ID 3033-130750-0096-55598-0_sp1.1 from training. Duration: 0.7545625 2023-10-03 13:46:11,777 WARNING [train.py:1204] (2/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_191-41023-0_sp1.1 from training. Duration: 0.6454375 2023-10-03 13:46:11,794 WARNING [train.py:1204] (2/4) Exclude cut with ID _8365_210_2064_1_1528592311070_4677690_431-294102-0_sp1.1 from training. Duration: 0.8090625 2023-10-03 13:46:11,836 WARNING [train.py:1204] (2/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_893-296030-0_sp1.1 from training. Duration: 0.97275 2023-10-03 13:46:13,433 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.0.bypass.scale_min, batch_count=1287980.0, ans=0.2 2023-10-03 13:46:16,608 WARNING [train.py:1204] (2/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_401-343820-0_sp1.1 from training. Duration: 0.97275 2023-10-03 13:46:18,063 WARNING [train.py:1204] (2/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_127-566-0_sp1.1 from training. Duration: 0.763625 2023-10-03 13:46:18,065 WARNING [train.py:1204] (2/4) Exclude cut with ID _14100_210_8341_1_1530529320613_3678069_126-272847-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 13:46:18,085 WARNING [train.py:1204] (2/4) Exclude cut with ID _15133_210_6228_1_1530794512074_3080379_29-264458-0_sp1.1 from training. Duration: 0.52725 2023-10-03 13:46:18,086 WARNING [train.py:1204] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_127-119109-0 from training. Duration: 0.72 2023-10-03 13:46:19,460 WARNING [train.py:1204] (2/4) Exclude cut with ID _6537_210_1883_1_1527987559710_3597129_382-133062-0_sp1.1 from training. Duration: 0.6181875 2023-10-03 13:46:19,478 WARNING [train.py:1204] (2/4) Exclude cut with ID _29529_210_7234_1_1532393961455_3977460_391-219682-0_sp1.1 from training. Duration: 0.936375 2023-10-03 13:46:20,727 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.523e+02 1.916e+02 2.137e+02 2.276e+02 3.150e+02, threshold=4.275e+02, percent-clipped=0.0 2023-10-03 13:46:20,812 WARNING [train.py:1204] (2/4) Exclude cut with ID _37537_210_2253_1_1533171758568_3421949_96-137189-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 13:46:23,832 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.feed_forward2.hidden_balancer.prob, batch_count=1288046.6666666667, ans=0.125 2023-10-03 13:46:24,957 WARNING [train.py:1204] (2/4) Exclude cut with ID _58414_210_8254_1_1535421609162_7151182_15-276301-0_sp1.1 from training. Duration: 0.97275 2023-10-03 13:46:27,054 WARNING [train.py:1204] (2/4) Exclude cut with ID _59502_210_12210_1_1535107024698_1835139_110-289074-0_sp1.1 from training. Duration: 0.92725 2023-10-03 13:46:28,562 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.conv_module1.balancer1.prob, batch_count=1288046.6666666667, ans=0.125 2023-10-03 13:46:28,879 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.5.encoder.layers.1.nonlin_attention.whiten2, num_groups=1, num_channels=256, metric=8.11 vs. limit=15.0 2023-10-03 13:46:31,528 WARNING [train.py:1204] (2/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_367-61330-0_sp1.1 from training. Duration: 0.6818125 2023-10-03 13:46:35,494 WARNING [train.py:1204] (2/4) Exclude cut with ID _45297_210_2721_1_1533715351381_3264949_353-9541-0_sp1.1 from training. Duration: 0.8818125 2023-10-03 13:46:36,942 WARNING [train.py:1204] (2/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_85-138257-0_sp0.9 from training. Duration: 0.788875 2023-10-03 13:46:36,969 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_3787_210_2040_1_1526477544_5478586_603-120324-0 from training. Duration: 0.81 2023-10-03 13:46:37,021 WARNING [train.py:1204] (2/4) Exclude cut with ID _42863_210_9070_1_1533902398788_3696920_411-204131-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 13:46:37,279 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.feed_forward2.hidden_balancer.prob, batch_count=1288113.3333333333, ans=0.125 2023-10-03 13:46:43,000 WARNING [train.py:1204] (2/4) Exclude cut with ID _30196_210_1664_1_1533708496522_7074140_822-289909-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 13:46:44,389 WARNING [train.py:1204] (2/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_32-335103-0_sp0.9 from training. Duration: 0.911125 2023-10-03 13:46:44,429 WARNING [train.py:1204] (2/4) Exclude cut with ID _6493_210_1747_1_1528459210424_4012470_513-140967-0_sp0.9 from training. Duration: 0.87775 2023-10-03 13:46:52,138 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_47896_210_20508_1_1534492518_5958868_439-257766-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 13:46:53,578 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_1294-63152-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 13:46:56,134 WARNING [train.py:1204] (2/4) Exclude cut with ID _59170_210_6721_1_1534983633960_4203335_319-166512-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 13:46:57,598 WARNING [train.py:1204] (2/4) Exclude cut with ID _3541_210_2477_1_1526475408709_3263973_146-3470-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 13:46:59,029 WARNING [train.py:1204] (2/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_678-159307-0_sp1.1 from training. Duration: 0.77275 2023-10-03 13:46:59,073 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_343-48822-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 13:47:00,815 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_4692_210_3528_1_1526899755_4323254_107-44401-0 from training. Duration: 0.86 2023-10-03 13:47:00,820 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6291_210_3273_1_1527683649_1078498_74-292608-0_sp1.1 from training. Duration: 0.6545625 2023-10-03 13:47:02,269 WARNING [train.py:1204] (2/4) Exclude cut with ID _55644_210_11856_1_1534593599582_4103719_293-248694-0_sp1.1 from training. Duration: 0.97275 2023-10-03 13:47:03,985 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_3172_210_2087_1_1526292929_3987865_197-97762-0 from training. Duration: 0.75 2023-10-03 13:47:05,386 WARNING [train.py:1204] (2/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_264-348896-0_sp0.9 from training. Duration: 0.9444375 2023-10-03 13:47:08,065 INFO [train.py:1046] (2/4) Epoch 37, batch 2000, loss[loss=0.1449, simple_loss=0.2227, pruned_loss=0.03354, over 24315.00 frames. ], tot_loss[loss=0.1594, simple_loss=0.2389, pruned_loss=0.03996, over 4701394.29 frames. ], batch size: 56, lr: 2.76e-03, grad_scale: 16.0 2023-10-03 13:47:09,493 WARNING [train.py:1204] (2/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_209-205365-0_sp0.9 from training. Duration: 0.87775 2023-10-03 13:47:12,771 WARNING [train.py:1204] (2/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_222-198127-0_sp0.9 from training. Duration: 0.9333125 2023-10-03 13:47:12,810 WARNING [train.py:1204] (2/4) Exclude cut with ID _57572_210_5517_1_1534921219624_3809484_52-43530-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 13:47:15,549 WARNING [train.py:1204] (2/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_96-320736-0_sp1.1 from training. Duration: 0.7818125 2023-10-03 13:47:15,663 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_13091_210_7925_1_1530609447_8987354_1159-225508-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 13:47:19,777 WARNING [train.py:1204] (2/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_237-44332-0 from training. Duration: 0.94 2023-10-03 13:47:21,056 WARNING [train.py:1204] (2/4) Exclude cut with ID _7970_210_1883_1_1528455616296_3472230_25-178407-0_sp0.9 from training. Duration: 0.788875 2023-10-03 13:47:22,544 WARNING [train.py:1204] (2/4) Exclude cut with ID _29003_210_19596_1_1532507391720_3894098_357-257062-0_sp1.1 from training. Duration: 0.936375 2023-10-03 13:47:22,743 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.self_attn_weights.pos_emb_skip_rate, batch_count=1288313.3333333333, ans=0.0 2023-10-03 13:47:25,873 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_809-17156-0 from training. Duration: 0.73 2023-10-03 13:47:27,220 WARNING [train.py:1204] (2/4) Exclude cut with ID _7160_210_1876_1_1527940923231_3703799_302-150728-0_sp1.1 from training. Duration: 0.7090625 2023-10-03 13:47:27,228 WARNING [train.py:1204] (2/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_444-28041-0_sp0.9 from training. Duration: 0.9444375 2023-10-03 13:47:29,986 WARNING [train.py:1204] (2/4) Exclude cut with ID _52892_210_6721_1_1534378941259_4124129_278-251666-0_sp1.1 from training. Duration: 0.92725 2023-10-03 13:47:30,082 WARNING [train.py:1204] (2/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_115-326778-0 from training. Duration: 0.93 2023-10-03 13:47:30,185 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.0.self_attn_weights.pos_emb_skip_rate, batch_count=1288313.3333333333, ans=0.0 2023-10-03 13:47:32,645 WARNING [train.py:1204] (2/4) Exclude cut with ID _41391_210_7994_1_1533603771872_3638201_31-188961-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 13:47:34,426 WARNING [train.py:1204] (2/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_1073-6264-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 13:47:34,451 WARNING [train.py:1204] (2/4) Exclude cut with ID _66868_210_9527_1_1535509287954_4561140_435-259515-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 13:47:34,534 WARNING [train.py:1204] (2/4) Exclude cut with ID _14886_210_4220_1_1530702020355_3516190_159-73748-0 from training. Duration: 0.83 2023-10-03 13:47:34,571 WARNING [train.py:1204] (2/4) Exclude cut with ID _15150_210_8341_1_1530752411803_7256384_469-239509-0_sp1.1 from training. Duration: 0.6181875 2023-10-03 13:47:37,824 WARNING [train.py:1204] (2/4) Exclude cut with ID _8064_210_5399_1_1528372299291_4282973_395-340852-0 from training. Duration: 0.93 2023-10-03 13:47:37,827 WARNING [train.py:1204] (2/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_138-187031-0_sp1.1 from training. Duration: 0.9 2023-10-03 13:47:40,575 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_13757_210_3528_1_1530440682_4107239_48-106347-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 13:47:40,636 WARNING [train.py:1204] (2/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_164-224052-0_sp1.1 from training. Duration: 0.636375 2023-10-03 13:47:40,640 WARNING [train.py:1204] (2/4) Exclude cut with ID _59288_210_5854_1_1535077757774_3412246_408-322648-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 13:47:42,571 WARNING [train.py:1204] (2/4) Exclude cut with ID _30191_210_1664_1_1533189915047_7196670_827-36359-0_sp1.1 from training. Duration: 0.963625 2023-10-03 13:47:42,661 WARNING [train.py:1204] (2/4) Exclude cut with ID _4680_210_3525_1_1527249757953_893386_75-78979-0_sp1.1 from training. Duration: 0.82725 2023-10-03 13:47:42,847 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.feed_forward1.out_proj.dropout_p, batch_count=1288380.0, ans=0.1 2023-10-03 13:47:44,030 WARNING [train.py:1204] (2/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_383-165234-0 from training. Duration: 0.94 2023-10-03 13:47:46,727 WARNING [train.py:1204] (2/4) Exclude cut with ID _19896_210_1883_1_1531709022662_6649569_94-229927-0 from training. Duration: 0.97 2023-10-03 13:47:46,734 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6788_210_1388_1_1527898698_4562021_180-75288-0_sp1.1 from training. Duration: 0.9 2023-10-03 13:47:46,747 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_26433_210_12108_1_1532157158_4056480_54-321349-0_sp1.1 from training. Duration: 0.97275 2023-10-03 13:47:52,698 WARNING [train.py:1204] (2/4) Exclude cut with ID _56108_210_16380_1_1534505446159_3887959_265-161493-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 13:47:54,095 WARNING [train.py:1204] (2/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_395-111602-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 13:47:54,101 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_154-316450-0_sp0.9 from training. Duration: 0.8444375 2023-10-03 13:47:54,158 WARNING [train.py:1204] (2/4) Exclude cut with ID _43552_210_8913_1_1533726031059_3523429_381-116041-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 13:47:57,008 WARNING [train.py:1204] (2/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_65-104124-0_sp1.1 from training. Duration: 0.963625 2023-10-03 13:47:57,072 WARNING [train.py:1204] (2/4) Exclude cut with ID _6536_210_1883_1_1527850783339_3319069_169-23811-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 13:47:58,315 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_289-180904-0_sp0.9 from training. Duration: 0.8444375 2023-10-03 13:47:58,332 WARNING [train.py:1204] (2/4) Exclude cut with ID _56406_210_9288_1_1534899618664_3496149_226-242400-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 13:47:59,631 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_24899_210_16554_1_1532046422_3827462_276-216330-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 13:48:01,144 WARNING [train.py:1204] (2/4) Exclude cut with ID _29345_210_4381_1_1532689168263_3860169_198-174328-0_sp1.1 from training. Duration: 0.82725 2023-10-03 13:48:03,039 WARNING [train.py:1204] (2/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_338-96520-0 from training. Duration: 0.94 2023-10-03 13:48:09,102 WARNING [train.py:1204] (2/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_7-111954-0_sp1.1 from training. Duration: 0.5090625 2023-10-03 13:48:10,462 WARNING [train.py:1204] (2/4) Exclude cut with ID _36692_210_5665_1_1533608967420_398809_8-16261-0_sp1.1 from training. Duration: 0.97275 2023-10-03 13:48:12,121 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.balancer2.prob, batch_count=1288513.3333333333, ans=0.125 2023-10-03 13:48:13,256 WARNING [train.py:1204] (2/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_86-212657-0_sp1.1 from training. Duration: 0.97275 2023-10-03 13:48:13,272 WARNING [train.py:1204] (2/4) Exclude cut with ID _44659_210_2721_1_1534036120061_6614860_626-298884-0_sp1.1 from training. Duration: 0.8818125 2023-10-03 13:48:16,737 WARNING [train.py:1204] (2/4) Exclude cut with ID _49755_210_18053_1_1534581005846_3218709_120-257918-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 13:48:19,477 WARNING [train.py:1204] (2/4) Exclude cut with ID _21881_210_3525_1_1531825242757_3798380_400-113821-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 13:48:19,486 WARNING [train.py:1204] (2/4) Exclude cut with ID _21783_210_11357_1_1531742458730_3762679_387-236773-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 13:48:20,991 WARNING [train.py:1204] (2/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_613-232856-0_sp1.1 from training. Duration: 0.4909375 2023-10-03 13:48:21,018 WARNING [train.py:1204] (2/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_181-78632-0_sp0.9 from training. Duration: 0.8666875 2023-10-03 13:48:22,343 INFO [train.py:1046] (2/4) Epoch 37, batch 2050, loss[loss=0.1406, simple_loss=0.2065, pruned_loss=0.03729, over 22623.00 frames. ], tot_loss[loss=0.1583, simple_loss=0.2378, pruned_loss=0.03943, over 4715470.94 frames. ], batch size: 322, lr: 2.76e-03, grad_scale: 16.0 2023-10-03 13:48:22,438 WARNING [train.py:1204] (2/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_175-17485-0_sp1.1 from training. Duration: 0.97275 2023-10-03 13:48:24,293 WARNING [train.py:1204] (2/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_1004-183422-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 13:48:25,759 WARNING [train.py:1204] (2/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_32-65962-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 13:48:27,187 WARNING [train.py:1204] (2/4) Exclude cut with ID _15458_210_8341_1_1530858614263_3834410_40-298664-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 13:48:31,370 WARNING [train.py:1204] (2/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_380-261863-0_sp1.1 from training. Duration: 0.963625 2023-10-03 13:48:33,211 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_372-173012-0_sp1.1 from training. Duration: 0.863625 2023-10-03 13:48:34,486 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_57-82205-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 13:48:34,540 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_3691_210_3528_1_1526468169_4062053_355-334432-0_sp0.9 from training. Duration: 0.9666875 2023-10-03 13:48:36,459 WARNING [train.py:1204] (2/4) Exclude cut with ID _19023_210_4353_1_1531378892552_5820780_28-36615-0 from training. Duration: 0.97 2023-10-03 13:48:36,464 WARNING [train.py:1204] (2/4) Exclude cut with ID _29853_210_7497_1_1532505600851_6642110_286-251333-0_sp1.1 from training. Duration: 0.92725 2023-10-03 13:48:37,868 WARNING [train.py:1204] (2/4) Exclude cut with ID _31602_210_5854_1_1532677021456_4458250_518-80679-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 13:48:39,225 WARNING [train.py:1204] (2/4) Exclude cut with ID _14922_210_6228_1_1530707435273_3693310_38-187851-0_sp0.9 from training. Duration: 0.688875 2023-10-03 13:48:46,722 WARNING [train.py:1204] (2/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_39-248870-0_sp1.1 from training. Duration: 0.62725 2023-10-03 13:48:46,749 WARNING [train.py:1204] (2/4) Exclude cut with ID _16908_210_5724_1_1531119157248_3772700_154-31455-0_sp1.1 from training. Duration: 0.97275 2023-10-03 13:48:49,436 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8453_210_4211_1_1528502940_8562730_347-330673-0 from training. Duration: 0.72 2023-10-03 13:48:50,797 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.580e+02 1.876e+02 2.031e+02 2.404e+02 3.900e+02, threshold=4.063e+02, percent-clipped=0.0 2023-10-03 13:48:50,872 WARNING [train.py:1204] (2/4) Exclude cut with ID _30191_210_1664_1_1533189915047_7196670_674-204247-0_sp1.1 from training. Duration: 0.97275 2023-10-03 13:48:52,786 WARNING [train.py:1204] (2/4) Exclude cut with ID _8833_210_5219_1_1528592665210_7553059_329-125156-0 from training. Duration: 0.82 2023-10-03 13:48:54,139 WARNING [train.py:1204] (2/4) Exclude cut with ID _4779_210_2084_1_1527318022944_3914893_500-285013-0_sp1.1 from training. Duration: 0.62725 2023-10-03 13:48:56,966 WARNING [train.py:1204] (2/4) Exclude cut with ID _14421_210_8750_1_1530604795825_3542290_190-313578-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 13:48:59,733 WARNING [train.py:1204] (2/4) Exclude cut with ID _35799_210_5854_1_1533263487887_3529071_538-273574-0_sp1.1 from training. Duration: 0.936375 2023-10-03 13:48:59,991 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.feed_forward3.hidden_balancer.prob, batch_count=1288713.3333333333, ans=0.125 2023-10-03 13:49:01,084 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_14928_210_5632_1_1530773461_4714000_454-112198-0_sp1.1 from training. Duration: 0.6 2023-10-03 13:49:01,127 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_468-143126-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 13:49:03,088 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_4528_210_3273_1_1527076654_3276777_258-121681-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 13:49:04,570 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_397-132565-0_sp1.1 from training. Duration: 0.9 2023-10-03 13:49:04,692 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.1.conv_module1.balancer2.prob, batch_count=1288713.3333333333, ans=0.125 2023-10-03 13:49:05,879 WARNING [train.py:1204] (2/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_454-123002-0_sp0.9 from training. Duration: 0.7666875 2023-10-03 13:49:08,043 WARNING [train.py:1204] (2/4) Exclude cut with ID _27052_210_5986_1_1532329197187_3585009_297-86890-0_sp1.1 from training. Duration: 0.936375 2023-10-03 13:49:10,498 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_51225_210_3601_1_1534400634_2327383_55-301844-0_sp1.1 from training. Duration: 0.8454375 2023-10-03 13:49:11,945 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6786_210_1388_1_1527839699_4194495_378-46070-0_sp0.9 from training. Duration: 0.8 2023-10-03 13:49:12,044 WARNING [train.py:1204] (2/4) Exclude cut with ID _42334_210_7770_1_1534206610396_3605880_55-19758-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 13:49:14,117 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.ff2_skip_rate, batch_count=1288780.0, ans=0.0 2023-10-03 13:49:16,641 WARNING [train.py:1204] (2/4) Exclude cut with ID _14884_210_2252_1_1530707459689_5561250_114-201399-0_sp0.9 from training. Duration: 0.8444375 2023-10-03 13:49:19,437 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.1.bypass_mid.scale_min, batch_count=1288780.0, ans=0.2 2023-10-03 13:49:21,943 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_99-251314-0_sp0.9 from training. Duration: 0.9666875 2023-10-03 13:49:23,355 WARNING [train.py:1204] (2/4) Exclude cut with ID _26528_210_9070_1_1532570410842_3851310_497-97218-0 from training. Duration: 0.86 2023-10-03 13:49:29,523 WARNING [train.py:1204] (2/4) Exclude cut with ID _55328_210_27767_1_1534580973260_4088170_230-317241-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 13:49:29,594 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_125-203105-0_sp0.9 from training. Duration: 0.888875 2023-10-03 13:49:32,381 WARNING [train.py:1204] (2/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_55-31904-0_sp1.1 from training. Duration: 0.8 2023-10-03 13:49:33,625 WARNING [train.py:1204] (2/4) Exclude cut with ID _8064_210_5399_1_1528372299291_4282973_18-174647-0 from training. Duration: 0.91 2023-10-03 13:49:33,811 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.conv_module1.balancer2.prob, batch_count=1288846.6666666667, ans=0.125 2023-10-03 13:49:36,691 INFO [train.py:1046] (2/4) Epoch 37, batch 2100, loss[loss=0.155, simple_loss=0.2355, pruned_loss=0.03727, over 23613.00 frames. ], tot_loss[loss=0.1581, simple_loss=0.2377, pruned_loss=0.03922, over 4713004.17 frames. ], batch size: 149, lr: 2.76e-03, grad_scale: 16.0 2023-10-03 13:49:36,844 WARNING [train.py:1204] (2/4) Exclude cut with ID IC0165W0005-81726 from training. Duration: 0.952 2023-10-03 13:49:36,844 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_31583_210_3852_1_1532595851_4024413_148-171847-0_sp1.1 from training. Duration: 0.963625 2023-10-03 13:49:38,193 WARNING [train.py:1204] (2/4) Exclude cut with ID _21989_210_5854_1_1531899214931_3271360_116-138769-0_sp1.1 from training. Duration: 0.936375 2023-10-03 13:49:38,250 WARNING [train.py:1204] (2/4) Exclude cut with ID _66070_210_7640_1_1535441393096_7805310_489-292631-0_sp1.1 from training. Duration: 0.7181875 2023-10-03 13:49:40,042 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_20120_210_6753_1_1531643035_5482859_202-27960-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 13:49:40,050 WARNING [train.py:1204] (2/4) Exclude cut with ID _5294_210_1747_1_1527249643928_3696179_342-270278-0 from training. Duration: 0.55 2023-10-03 13:49:40,086 WARNING [train.py:1204] (2/4) Exclude cut with ID _38323_210_12108_1_1533121233369_3303049_416-11919-0 from training. Duration: 0.96 2023-10-03 13:49:41,480 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6545_210_2040_1_1527774670_4245258_407-48070-0_sp0.9 from training. Duration: 0.8444375 2023-10-03 13:49:46,089 WARNING [train.py:1204] (2/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_503-30719-0_sp1.1 from training. Duration: 0.8909375 2023-10-03 13:49:46,144 WARNING [train.py:1204] (2/4) Exclude cut with ID _49643_210_7989_1_1534640244224_3939031_234-326497-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 13:49:48,910 WARNING [train.py:1204] (2/4) Exclude cut with ID _14170_210_5219_1_1530525949868_4469650_301-77873-0_sp1.1 from training. Duration: 0.963625 2023-10-03 13:49:48,967 WARNING [train.py:1204] (2/4) Exclude cut with ID _45297_210_2721_1_1533715351381_3264949_152-154697-0_sp1.1 from training. Duration: 0.97275 2023-10-03 13:49:48,974 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_3371_210_1794_1_1526384462_5580777_396-339812-0 from training. Duration: 0.85 2023-10-03 13:49:50,379 WARNING [train.py:1204] (2/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_399-156791-0_sp1.1 from training. Duration: 0.7545625 2023-10-03 13:49:50,424 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7696_210_2040_1_1528207167_3716099_117-125434-0 from training. Duration: 0.76 2023-10-03 13:49:50,429 WARNING [train.py:1204] (2/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_96-244299-0 from training. Duration: 0.88 2023-10-03 13:49:51,856 WARNING [train.py:1204] (2/4) Exclude cut with ID _37892_210_19596_1_1533198677897_3964058_519-343462-0_sp1.1 from training. Duration: 0.92725 2023-10-03 13:49:51,879 WARNING [train.py:1204] (2/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_202-183922-0_sp1.1 from training. Duration: 0.77275 2023-10-03 13:49:51,885 WARNING [train.py:1204] (2/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_453-133983-0 from training. Duration: 0.78 2023-10-03 13:49:53,239 WARNING [train.py:1204] (2/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_178-39722-0_sp1.1 from training. Duration: 0.4454375 2023-10-03 13:49:54,719 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.0.ff2_skip_rate, batch_count=1288980.0, ans=0.0 2023-10-03 13:49:59,233 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_15126_210_6753_1_1530778677_2591185_113-157651-0 from training. Duration: 0.68 2023-10-03 13:49:59,234 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_271-234962-0_sp1.1 from training. Duration: 0.7181875 2023-10-03 13:50:01,934 WARNING [train.py:1204] (2/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_195-64967-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 13:50:01,985 WARNING [train.py:1204] (2/4) Exclude cut with ID _43552_210_8913_1_1533726031059_3523429_365-215065-0_sp1.1 from training. Duration: 0.936375 2023-10-03 13:50:05,302 WARNING [train.py:1204] (2/4) Exclude cut with ID _31602_210_5854_1_1532677021456_4458250_249-96604-0_sp0.9 from training. Duration: 0.988875 2023-10-03 13:50:05,343 WARNING [train.py:1204] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_307-158462-0 from training. Duration: 0.69 2023-10-03 13:50:05,386 WARNING [train.py:1204] (2/4) Exclude cut with ID _30918_210_6400_1_1532754078600_3125920_407-223394-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 13:50:05,389 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6673_210_3273_1_1527767790_3173240_351-167954-0_sp0.9 from training. Duration: 0.5333125 2023-10-03 13:50:08,523 WARNING [train.py:1204] (2/4) Exclude cut with ID _4866_210_2577_1_1527404412284_4364659_256-306830-0 from training. Duration: 0.87 2023-10-03 13:50:08,570 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_17613_210_2151_1_1531137569_2366160_222-131118-0_sp1.1 from training. Duration: 0.963625 2023-10-03 13:50:08,582 WARNING [train.py:1204] (2/4) Exclude cut with ID _45560_210_21982_1_1533812603951_3584999_111-6376-0 from training. Duration: 0.84 2023-10-03 13:50:08,610 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_216-73841-0 from training. Duration: 0.96 2023-10-03 13:50:10,035 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_14058_210_4163_1_1530690953_6327180_49-210687-0 from training. Duration: 0.94 2023-10-03 13:50:11,382 WARNING [train.py:1204] (2/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_506-42614-0_sp0.9 from training. Duration: 0.87775 2023-10-03 13:50:14,055 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_3133_210_2087_1_1526106446_2324690_25-332138-0_sp1.1 from training. Duration: 0.82725 2023-10-03 13:50:16,105 WARNING [train.py:1204] (2/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_589-9070-0_sp1.1 from training. Duration: 0.6454375 2023-10-03 13:50:16,213 WARNING [train.py:1204] (2/4) Exclude cut with ID _6194_210_1534_1_1527511826160_3941194_14-282876-0_sp1.1 from training. Duration: 0.6454375 2023-10-03 13:50:17,549 WARNING [train.py:1204] (2/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_1166-90654-0_sp1.1 from training. Duration: 0.92725 2023-10-03 13:50:18,913 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_14056_210_4163_1_1530604483_7572827_218-255858-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 13:50:18,915 WARNING [train.py:1204] (2/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_1285-256622-0 from training. Duration: 0.84 2023-10-03 13:50:18,929 WARNING [train.py:1204] (2/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_205-286261-0_sp1.1 from training. Duration: 0.963625 2023-10-03 13:50:18,949 WARNING [train.py:1204] (2/4) Exclude cut with ID _68777_210_4353_1_1535810390731_3117904_6-154119-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 13:50:20,244 WARNING [train.py:1204] (2/4) Exclude cut with ID _22982_210_1883_1_1531882790447_5449409_565-199393-0_sp1.1 from training. Duration: 0.92725 2023-10-03 13:50:20,263 WARNING [train.py:1204] (2/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_278-154492-0 from training. Duration: 0.76 2023-10-03 13:50:20,442 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.bypass_mid.scale_min, batch_count=1289113.3333333333, ans=0.2 2023-10-03 13:50:21,761 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_426-313557-0 from training. Duration: 0.53 2023-10-03 13:50:23,102 WARNING [train.py:1204] (2/4) Exclude cut with ID _41817_210_18835_1_1533434477160_7423049_555-293448-0 from training. Duration: 0.8 2023-10-03 13:50:27,590 WARNING [train.py:1204] (2/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_32-335103-0_sp1.1 from training. Duration: 0.7454375 2023-10-03 13:50:27,832 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.conv_module2.balancer1.min_positive, batch_count=1289113.3333333333, ans=0.025 2023-10-03 13:50:30,335 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_2655_210_2087_1_1525688959_3897400_202-147557-0_sp1.1 from training. Duration: 0.77275 2023-10-03 13:50:30,365 WARNING [train.py:1204] (2/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_158-250116-0 from training. Duration: 0.62 2023-10-03 13:50:35,827 WARNING [train.py:1204] (2/4) Exclude cut with ID _55429_210_10626_1_1534467588527_7174143_528-88083-0_sp1.1 from training. Duration: 0.963625 2023-10-03 13:50:37,181 WARNING [train.py:1204] (2/4) Exclude cut with ID _8030_210_7497_1_1528444886781_3531089_32-31837-0_sp1.1 from training. Duration: 0.8818125 2023-10-03 13:50:38,449 WARNING [train.py:1204] (2/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_744-86168-0_sp1.1 from training. Duration: 0.97275 2023-10-03 13:50:38,466 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1113-7369-0_sp0.9 from training. Duration: 0.9444375 2023-10-03 13:50:38,490 WARNING [train.py:1204] (2/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_124-345840-0_sp0.9 from training. Duration: 0.57775 2023-10-03 13:50:39,797 WARNING [train.py:1204] (2/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_328-264776-0_sp0.9 from training. Duration: 0.7444375 2023-10-03 13:50:41,341 WARNING [train.py:1204] (2/4) Exclude cut with ID _18895_210_6228_1_1531357274234_3784930_469-166615-0_sp1.1 from training. Duration: 0.963625 2023-10-03 13:50:41,348 WARNING [train.py:1204] (2/4) Exclude cut with ID _8395_210_1531_1_1528597871523_4411579_431-132470-0_sp0.9 from training. Duration: 0.82225 2023-10-03 13:50:42,549 WARNING [train.py:1204] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_554-45617-0_sp1.1 from training. Duration: 0.9 2023-10-03 13:50:42,576 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_35361_210_4238_1_1533262810_7793476_524-188242-0_sp1.1 from training. Duration: 0.92725 2023-10-03 13:50:45,810 WARNING [train.py:1204] (2/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_938-205135-0 from training. Duration: 0.87 2023-10-03 13:50:47,185 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_5931_210_2040_1_1527428481_4547999_321-193614-0 from training. Duration: 0.67 2023-10-03 13:50:47,199 WARNING [train.py:1204] (2/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_445-269521-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 13:50:48,683 WARNING [train.py:1204] (2/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_3-33116-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 13:50:48,685 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_4633_210_2040_1_1526824163_4145343_416-76117-0_sp1.1 from training. Duration: 0.863625 2023-10-03 13:50:48,720 WARNING [train.py:1204] (2/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_1033-274376-0_sp1.1 from training. Duration: 0.7909375 2023-10-03 13:50:50,055 WARNING [train.py:1204] (2/4) Exclude cut with ID _8772_210_1850_1_1528635595972_3664011_398-320049-0_sp0.9 from training. Duration: 0.97775 2023-10-03 13:50:51,380 INFO [train.py:1046] (2/4) Epoch 37, batch 2150, loss[loss=0.1675, simple_loss=0.2479, pruned_loss=0.04357, over 23331.00 frames. ], tot_loss[loss=0.157, simple_loss=0.2356, pruned_loss=0.03914, over 4680433.26 frames. ], batch size: 93, lr: 2.76e-03, grad_scale: 16.0 2023-10-03 13:50:54,487 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.attention_skip_rate, batch_count=1289246.6666666667, ans=0.0 2023-10-03 13:50:55,623 WARNING [train.py:1204] (2/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_321-215742-0_sp0.9 from training. Duration: 0.5555625 2023-10-03 13:50:57,018 WARNING [train.py:1204] (2/4) Exclude cut with ID _28394_210_8913_1_1532430049518_3650118_272-190950-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 13:50:58,852 WARNING [train.py:1204] (2/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_31-299371-0_sp1.1 from training. Duration: 0.92725 2023-10-03 13:51:00,324 WARNING [train.py:1204] (2/4) Exclude cut with ID _55383_210_10635_1_1534468426172_2231669_134-128246-0_sp0.9 from training. Duration: 0.988875 2023-10-03 13:51:00,327 WARNING [train.py:1204] (2/4) Exclude cut with ID _40519_210_13794_1_1533715228655_7337390_887-9547-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 13:51:00,363 WARNING [train.py:1204] (2/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_310-109461-0_sp0.9 from training. Duration: 0.888875 2023-10-03 13:51:04,422 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_2009_210_2071_1_1525481134_4588937_258-250196-0_sp1.1 from training. Duration: 0.92725 2023-10-03 13:51:04,481 WARNING [train.py:1204] (2/4) Exclude cut with ID _9235_210_5219_1_1528891154253_8046120_1186-72702-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 13:51:04,483 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_14831_210_7927_1_1530700200_5583610_181-27467-0_sp1.1 from training. Duration: 0.82725 2023-10-03 13:51:09,601 WARNING [train.py:1204] (2/4) Exclude cut with ID _40402_210_18219_1_1533522324115_2953150_278-132109-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 13:51:09,619 WARNING [train.py:1204] (2/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_261-297503-0 from training. Duration: 0.99 2023-10-03 13:51:13,986 WARNING [train.py:1204] (2/4) Exclude cut with ID _19896_210_1883_1_1531709022662_6649569_313-265903-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 13:51:14,082 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_105-114797-0_sp1.1 from training. Duration: 0.763625 2023-10-03 13:51:15,360 WARNING [train.py:1204] (2/4) Exclude cut with ID _53755_210_8254_1_1534577312941_4707360_9-65843-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 13:51:15,382 WARNING [train.py:1204] (2/4) Exclude cut with ID _18516_210_9206_1_1531704653206_3692406_361-224549-0_sp1.1 from training. Duration: 0.97275 2023-10-03 13:51:15,409 WARNING [train.py:1204] (2/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_403-310975-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 13:51:17,233 WARNING [train.py:1204] (2/4) Exclude cut with ID _5444_210_2151_1_1527418808464_5312210_581-315411-0_sp0.9 from training. Duration: 0.688875 2023-10-03 13:51:17,283 WARNING [train.py:1204] (2/4) Exclude cut with ID _23825_210_13449_1_1531915313526_3497150_464-154904-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 13:51:18,578 WARNING [train.py:1204] (2/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_937-208281-0_sp0.9 from training. Duration: 0.9444375 2023-10-03 13:51:19,871 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.509e+02 1.856e+02 2.047e+02 2.375e+02 3.292e+02, threshold=4.095e+02, percent-clipped=0.0 2023-10-03 13:51:19,943 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_37839_210_7234_1_1533542462_3689303_158-181212-0_sp1.1 from training. Duration: 0.963625 2023-10-03 13:51:20,027 WARNING [train.py:1204] (2/4) Exclude cut with ID _8772_210_1850_1_1528635595972_3664011_398-320049-0 from training. Duration: 0.88 2023-10-03 13:51:21,254 WARNING [train.py:1204] (2/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_718-250907-0_sp1.1 from training. Duration: 0.62725 2023-10-03 13:51:22,592 WARNING [train.py:1204] (2/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_641-126395-0_sp1.1 from training. Duration: 0.92725 2023-10-03 13:51:23,941 WARNING [train.py:1204] (2/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_166-241493-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 13:51:24,029 WARNING [train.py:1204] (2/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_526-62079-0_sp0.9 from training. Duration: 0.7444375 2023-10-03 13:51:25,401 WARNING [train.py:1204] (2/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_439-275083-0_sp1.1 from training. Duration: 0.936375 2023-10-03 13:51:26,842 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_66907_210_21581_1_1535615661_3590177_493-229032-0_sp1.1 from training. Duration: 0.92725 2023-10-03 13:51:26,881 WARNING [train.py:1204] (2/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_89-321955-0_sp1.1 from training. Duration: 0.72725 2023-10-03 13:51:30,031 WARNING [train.py:1204] (2/4) Exclude cut with ID _29857_210_20034_1_1532395886753_3798160_273-226195-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 13:51:30,037 WARNING [train.py:1204] (2/4) Exclude cut with ID _22826_210_9994_1_1531908201522_3279169_404-75889-0 from training. Duration: 0.89 2023-10-03 13:51:30,062 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_271-234962-0_sp0.9 from training. Duration: 0.87775 2023-10-03 13:51:33,180 WARNING [train.py:1204] (2/4) Exclude cut with ID _22961_210_8254_1_1531976366672_4278180_156-339685-0_sp1.1 from training. Duration: 0.97275 2023-10-03 13:51:33,225 WARNING [train.py:1204] (2/4) Exclude cut with ID _37892_210_19596_1_1533198677897_3964058_425-231927-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 13:51:35,211 WARNING [train.py:1204] (2/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_102-288670-0_sp1.1 from training. Duration: 0.97275 2023-10-03 13:51:35,288 WARNING [train.py:1204] (2/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_283-220372-0_sp1.1 from training. Duration: 0.5090625 2023-10-03 13:51:36,641 WARNING [train.py:1204] (2/4) Exclude cut with ID _44304_210_15374_1_1533639352765_4613129_86-1699-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 13:51:36,715 WARNING [train.py:1204] (2/4) Exclude cut with ID _2891_210_2550_1_1525874398521_4427710_327-121590-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 13:51:36,728 WARNING [train.py:1204] (2/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_369-222060-0 from training. Duration: 0.71 2023-10-03 13:51:39,440 WARNING [train.py:1204] (2/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_762-109886-0 from training. Duration: 0.91 2023-10-03 13:51:39,469 WARNING [train.py:1204] (2/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_477-255782-0_sp0.9 from training. Duration: 0.911125 2023-10-03 13:51:39,502 WARNING [train.py:1204] (2/4) Exclude cut with ID IC0669W0061-330906 from training. Duration: 0.768 2023-10-03 13:51:39,547 WARNING [train.py:1204] (2/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_347-213071-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 13:51:40,850 WARNING [train.py:1204] (2/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_507-303370-0_sp1.1 from training. Duration: 0.87275 2023-10-03 13:51:42,170 WARNING [train.py:1204] (2/4) Exclude cut with ID _15012_210_1968_1_1530856820429_5095350_688-178849-0 from training. Duration: 0.64 2023-10-03 13:51:42,353 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.conv_module1.balancer1.prob, batch_count=1289446.6666666667, ans=0.125 2023-10-03 13:51:43,421 WARNING [train.py:1204] (2/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_28-70987-0_sp0.9 from training. Duration: 0.9666875 2023-10-03 13:51:43,423 WARNING [train.py:1204] (2/4) Exclude cut with ID _5910_210_1389_1_1527337972454_5050100_555-159815-0 from training. Duration: 0.98 2023-10-03 13:51:43,443 WARNING [train.py:1204] (2/4) Exclude cut with ID IC0679W0064-335871 from training. Duration: 0.768 2023-10-03 13:51:43,443 WARNING [train.py:1204] (2/4) Exclude cut with ID IC0679W0079-335886 from training. Duration: 0.896 2023-10-03 13:51:43,487 WARNING [train.py:1204] (2/4) Exclude cut with ID _5608_210_2106_1_1527415439089_6819768_909-82767-0 from training. Duration: 0.61 2023-10-03 13:51:44,779 WARNING [train.py:1204] (2/4) Exclude cut with ID _23038_210_9570_1_1532001584158_3652419_411-323967-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 13:51:44,831 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_40896_210_15710_1_1533433956_7100802_350-231748-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 13:51:44,842 WARNING [train.py:1204] (2/4) Exclude cut with ID _5289_210_2084_1_1527678202736_3779299_142-103317-0_sp0.9 from training. Duration: 0.9333125 2023-10-03 13:51:46,228 WARNING [train.py:1204] (2/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_314-144055-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 13:51:48,222 WARNING [train.py:1204] (2/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_177-236809-0_sp0.9 from training. Duration: 0.5444375 2023-10-03 13:51:49,690 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.0.bypass.skip_rate, batch_count=1289513.3333333333, ans=0.035 2023-10-03 13:51:50,876 WARNING [train.py:1204] (2/4) Exclude cut with ID _23038_210_9570_1_1532001584158_3652419_82-334733-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 13:51:50,883 WARNING [train.py:1204] (2/4) Exclude cut with ID _43552_210_8913_1_1533726031059_3523429_225-22526-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 13:51:59,875 WARNING [train.py:1204] (2/4) Exclude cut with ID _30327_210_16049_1_1532484161295_3924169_401-70819-0_sp1.1 from training. Duration: 0.7818125 2023-10-03 13:51:59,918 WARNING [train.py:1204] (2/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_538-32291-0 from training. Duration: 0.83 2023-10-03 13:52:04,714 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_37357_210_17024_1_1533089504_2316121_258-211605-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 13:52:06,056 INFO [train.py:1046] (2/4) Epoch 37, batch 2200, loss[loss=0.156, simple_loss=0.233, pruned_loss=0.0395, over 23235.00 frames. ], tot_loss[loss=0.1571, simple_loss=0.2362, pruned_loss=0.03901, over 4692547.47 frames. ], batch size: 105, lr: 2.76e-03, grad_scale: 16.0 2023-10-03 13:52:09,296 WARNING [train.py:1204] (2/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_550-335198-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 13:52:09,367 WARNING [train.py:1204] (2/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_98-741-0_sp0.9 from training. Duration: 0.888875 2023-10-03 13:52:10,695 WARNING [train.py:1204] (2/4) Exclude cut with ID _38323_210_12108_1_1533121233369_3303049_251-238988-0_sp1.1 from training. Duration: 0.92725 2023-10-03 13:52:12,219 WARNING [train.py:1204] (2/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_259-34038-0_sp0.9 from training. Duration: 0.82225 2023-10-03 13:52:13,587 WARNING [train.py:1204] (2/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_1060-180633-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 13:52:14,583 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.0.layers.0.nonlin_attention.whiten2, num_groups=1, num_channels=192, metric=5.46 vs. limit=15.0 2023-10-03 13:52:14,855 WARNING [train.py:1204] (2/4) Exclude cut with ID _8755_210_1876_1_1528628559357_3258209_211-37473-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 13:52:14,869 WARNING [train.py:1204] (2/4) Exclude cut with ID _4122_210_2084_1_1526713164417_4353280_528-206771-0 from training. Duration: 0.88 2023-10-03 13:52:19,755 WARNING [train.py:1204] (2/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_64-248359-0 from training. Duration: 0.69 2023-10-03 13:52:21,250 WARNING [train.py:1204] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_402-253027-0_sp1.1 from training. Duration: 0.4909375 2023-10-03 13:52:26,665 WARNING [train.py:1204] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_625-102955-0 from training. Duration: 0.77 2023-10-03 13:52:29,453 WARNING [train.py:1204] (2/4) Exclude cut with ID _14998_210_5219_1_1530705582080_7474970_611-219324-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 13:52:32,148 WARNING [train.py:1204] (2/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_718-68505-0_sp0.9 from training. Duration: 0.988875 2023-10-03 13:52:32,202 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_353-182855-0_sp1.1 from training. Duration: 0.87275 2023-10-03 13:52:35,620 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_5960_210_1794_1_1527403865_3990999_515-2613-0_sp0.9 from training. Duration: 0.97775 2023-10-03 13:52:35,649 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_5575_210_1410_1_1527301305_2570170_342-265572-0 from training. Duration: 0.94 2023-10-03 13:52:40,336 WARNING [train.py:1204] (2/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_329-113173-0_sp0.9 from training. Duration: 0.688875 2023-10-03 13:52:41,726 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_15396_210_6890_1_1531308473_1938936_213-312506-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 13:52:41,793 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_521-214276-0_sp1.1 from training. Duration: 0.4 2023-10-03 13:52:43,807 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.2.encoder.layers.2.self_attn2.whiten, num_groups=1, num_channels=384, metric=16.63 vs. limit=22.5 2023-10-03 13:52:44,437 WARNING [train.py:1204] (2/4) Exclude cut with ID _15026_210_8750_1_1530777602316_3291850_211-195043-0_sp1.1 from training. Duration: 0.72725 2023-10-03 13:52:44,553 WARNING [train.py:1204] (2/4) Exclude cut with ID _29343_210_4381_1_1532602757752_3674109_108-257494-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 13:52:45,917 WARNING [train.py:1204] (2/4) Exclude cut with ID _64414_210_17301_1_1535243252048_3757759_403-58151-0_sp1.1 from training. Duration: 0.963625 2023-10-03 13:52:47,310 WARNING [train.py:1204] (2/4) Exclude cut with ID _36512_210_6987_1_1533033143998_3126069_109-177787-0_sp1.1 from training. Duration: 0.92725 2023-10-03 13:52:50,587 WARNING [train.py:1204] (2/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_498-40153-0 from training. Duration: 0.63 2023-10-03 13:52:51,857 WARNING [train.py:1204] (2/4) Exclude cut with ID _56447_210_1389_1_1535074245910_3697189_161-334523-0_sp1.1 from training. Duration: 0.936375 2023-10-03 13:52:51,950 WARNING [train.py:1204] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_238-245655-0 from training. Duration: 0.81 2023-10-03 13:52:54,676 WARNING [train.py:1204] (2/4) Exclude cut with ID _64631_210_27767_1_1535349580068_4165160_108-43865-0_sp1.1 from training. Duration: 0.92725 2023-10-03 13:52:54,682 WARNING [train.py:1204] (2/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_243-42116-0_sp1.1 from training. Duration: 0.663625 2023-10-03 13:52:54,713 WARNING [train.py:1204] (2/4) Exclude cut with ID _66868_210_9527_1_1535509287954_4561140_594-132160-0_sp1.1 from training. Duration: 0.92725 2023-10-03 13:52:57,417 WARNING [train.py:1204] (2/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_48-338976-0_sp0.9 from training. Duration: 0.988875 2023-10-03 13:52:57,457 WARNING [train.py:1204] (2/4) Exclude cut with ID _5250_210_5219_1_1527249561806_8692110_137-100595-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 13:52:57,463 WARNING [train.py:1204] (2/4) Exclude cut with ID _49748_210_24116_1_1533986971737_3158910_12-271669-0_sp1.1 from training. Duration: 0.936375 2023-10-03 13:52:57,480 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_37357_210_17024_1_1533089504_2316121_206-80421-0_sp1.1 from training. Duration: 0.936375 2023-10-03 13:52:58,743 WARNING [train.py:1204] (2/4) Exclude cut with ID _7537_210_2084_1_1528502395431_3924359_113-199933-0_sp1.1 from training. Duration: 0.536375 2023-10-03 13:53:00,181 WARNING [train.py:1204] (2/4) Exclude cut with ID _55440_210_12651_1_1534496713364_2980520_31-59289-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 13:53:01,629 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1083-220122-0_sp0.9 from training. Duration: 0.5666875 2023-10-03 13:53:05,013 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_426-313557-0_sp1.1 from training. Duration: 0.4818125 2023-10-03 13:53:05,054 WARNING [train.py:1204] (2/4) Exclude cut with ID _35799_210_5854_1_1533263487887_3529071_324-94843-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 13:53:07,040 WARNING [train.py:1204] (2/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_264-108685-0_sp1.1 from training. Duration: 0.5 2023-10-03 13:53:09,769 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9012W0006-501141 from training. Duration: 0.896 2023-10-03 13:53:11,181 WARNING [train.py:1204] (2/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_890-5117-0_sp0.9 from training. Duration: 0.8666875 2023-10-03 13:53:12,368 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9023W0008-506301 from training. Duration: 0.896 2023-10-03 13:53:13,731 WARNING [train.py:1204] (2/4) Exclude cut with ID _13916_210_9994_1_1530788341126_3763149_60-191556-0_sp0.9 from training. Duration: 0.67775 2023-10-03 13:53:13,787 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9029W0331-509721 from training. Duration: 0.896 2023-10-03 13:53:16,680 WARNING [train.py:1204] (2/4) Exclude cut with ID _35823_210_6917_1_1533187881914_7297830_567-108596-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 13:53:16,727 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7543_210_6753_1_1528544010_1110402_25-183860-0_sp1.1 from training. Duration: 0.42725 2023-10-03 13:53:19,474 INFO [train.py:1046] (2/4) Epoch 37, batch 2250, loss[loss=0.1577, simple_loss=0.2444, pruned_loss=0.03544, over 24648.00 frames. ], tot_loss[loss=0.1583, simple_loss=0.2375, pruned_loss=0.03949, over 4695828.79 frames. ], batch size: 68, lr: 2.75e-03, grad_scale: 16.0 2023-10-03 13:53:19,515 WARNING [train.py:1204] (2/4) Exclude cut with ID _47278_210_12041_1_1533902418349_3576110_495-260035-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 13:53:20,942 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9051W0015-520281 from training. Duration: 0.9045625 2023-10-03 13:53:24,259 WARNING [train.py:1204] (2/4) Exclude cut with ID _14459_210_2228_1_1531443641946_7516700_707-66193-0_sp1.1 from training. Duration: 0.97275 2023-10-03 13:53:26,893 WARNING [train.py:1204] (2/4) Exclude cut with ID _14087_210_2253_1_1530529224512_3014029_219-224445-0_sp1.1 from training. Duration: 0.763625 2023-10-03 13:53:28,703 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.bypass_mid.scale_min, batch_count=1289913.3333333333, ans=0.2 2023-10-03 13:53:29,947 WARNING [train.py:1204] (2/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_345-167986-0_sp0.9 from training. Duration: 0.9333125 2023-10-03 13:53:31,305 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_34815_210_6388_1_1533381320_3571903_279-305565-0_sp1.1 from training. Duration: 0.836375 2023-10-03 13:53:35,339 WARNING [train.py:1204] (2/4) Exclude cut with ID _21987_210_5854_1_1531821500482_3637038_317-179212-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 13:53:37,105 WARNING [train.py:1204] (2/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_395-37222-0_sp1.1 from training. Duration: 0.6909375 2023-10-03 13:53:38,412 WARNING [train.py:1204] (2/4) Exclude cut with ID _8064_210_5399_1_1528372299291_4282973_168-50337-0_sp1.1 from training. Duration: 0.763625 2023-10-03 13:53:39,812 WARNING [train.py:1204] (2/4) Exclude cut with ID _60113_210_16271_1_1535164212481_4386079_559-167191-0 from training. Duration: 0.93 2023-10-03 13:53:39,833 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_47228_210_4983_1_1533780021_3672027_344-179642-0_sp1.1 from training. Duration: 0.92725 2023-10-03 13:53:39,864 WARNING [train.py:1204] (2/4) Exclude cut with ID _22961_210_8254_1_1531976366672_4278180_317-199546-0_sp0.9 from training. Duration: 0.9555625 2023-10-03 13:53:41,908 WARNING [train.py:1204] (2/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_141-89727-0 from training. Duration: 0.71 2023-10-03 13:53:43,264 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_78-116075-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 13:53:43,285 WARNING [train.py:1204] (2/4) Exclude cut with ID _64086_210_7994_1_1535270417019_7435949_52-235756-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 13:53:45,893 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7615_210_4211_1_1528110965_4580160_72-235211-0_sp1.1 from training. Duration: 0.6909375 2023-10-03 13:53:47,210 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.575e+02 1.849e+02 1.952e+02 2.109e+02 2.912e+02, threshold=3.904e+02, percent-clipped=0.0 2023-10-03 13:53:48,732 WARNING [train.py:1204] (2/4) Exclude cut with ID _14069_210_5632_1_1530512692033_4803390_287-27950-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 13:53:50,148 WARNING [train.py:1204] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_74-48717-0_sp0.9 from training. Duration: 0.6444375 2023-10-03 13:53:50,173 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7544_210_6753_1_1528591327_1471426_172-348777-0_sp1.1 from training. Duration: 0.6 2023-10-03 13:53:50,933 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.3.encoder.layers.1.feed_forward2.out_whiten, num_groups=1, num_channels=512, metric=5.52 vs. limit=15.0 2023-10-03 13:53:52,030 WARNING [train.py:1204] (2/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_220-191080-0 from training. Duration: 0.98 2023-10-03 13:53:53,393 WARNING [train.py:1204] (2/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_1081-137641-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 13:53:54,998 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.out_combiner.scale_min, batch_count=1290046.6666666667, ans=0.2 2023-10-03 13:53:56,094 WARNING [train.py:1204] (2/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_278-235459-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 13:54:00,244 WARNING [train.py:1204] (2/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_1181-77470-0_sp1.1 from training. Duration: 0.936375 2023-10-03 13:54:01,730 WARNING [train.py:1204] (2/4) Exclude cut with ID _60860_210_8254_1_1534896015875_3557169_198-5531-0_sp1.1 from training. Duration: 0.936375 2023-10-03 13:54:03,025 WARNING [train.py:1204] (2/4) Exclude cut with ID _12369_210_4381_1_1530356078581_4388520_17-289781-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 13:54:03,029 WARNING [train.py:1204] (2/4) Exclude cut with ID _50310_210_15488_1_1534068043345_3641080_146-199752-0_sp1.1 from training. Duration: 0.92725 2023-10-03 13:54:06,418 WARNING [train.py:1204] (2/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_969-213354-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 13:54:07,814 WARNING [train.py:1204] (2/4) Exclude cut with ID _70519_210_4163_1_1535961595994_7458230_66-344156-0_sp1.1 from training. Duration: 0.963625 2023-10-03 13:54:12,394 WARNING [train.py:1204] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_380-204756-0_sp1.1 from training. Duration: 0.7818125 2023-10-03 13:54:15,727 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_128-202104-0_sp0.9 from training. Duration: 0.7 2023-10-03 13:54:18,495 WARNING [train.py:1204] (2/4) Exclude cut with ID _4697_210_2069_1_1526815801953_3823098_51-1049-0_sp0.9 from training. Duration: 0.8333125 2023-10-03 13:54:19,784 WARNING [train.py:1204] (2/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_86-66192-0_sp1.1 from training. Duration: 0.5 2023-10-03 13:54:19,846 WARNING [train.py:1204] (2/4) Exclude cut with ID _14069_210_5632_1_1530512692033_4803390_95-1896-0_sp1.1 from training. Duration: 0.8909375 2023-10-03 13:54:25,947 WARNING [train.py:1204] (2/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_29-293896-0_sp1.1 from training. Duration: 0.5545625 2023-10-03 13:54:26,234 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.feed_forward1.out_proj.dropout_p, batch_count=1290180.0, ans=0.1 2023-10-03 13:54:28,647 WARNING [train.py:1204] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_167-137255-0_sp0.9 from training. Duration: 0.711125 2023-10-03 13:54:28,648 WARNING [train.py:1204] (2/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_217-95056-0 from training. Duration: 0.98 2023-10-03 13:54:28,670 WARNING [train.py:1204] (2/4) Exclude cut with ID _47278_210_12041_1_1533902418349_3576110_97-266746-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 13:54:28,719 WARNING [train.py:1204] (2/4) Exclude cut with ID _14224_210_4381_1_1530529062037_4264010_380-343670-0_sp1.1 from training. Duration: 0.8 2023-10-03 13:54:31,351 WARNING [train.py:1204] (2/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_107-332398-0 from training. Duration: 0.78 2023-10-03 13:54:32,570 INFO [train.py:1046] (2/4) Epoch 37, batch 2300, loss[loss=0.1468, simple_loss=0.2253, pruned_loss=0.03416, over 24367.00 frames. ], tot_loss[loss=0.1584, simple_loss=0.238, pruned_loss=0.03936, over 4710297.75 frames. ], batch size: 56, lr: 2.75e-03, grad_scale: 16.0 2023-10-03 13:54:35,394 WARNING [train.py:1204] (2/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_91-267696-0_sp1.1 from training. Duration: 0.7909375 2023-10-03 13:54:35,413 WARNING [train.py:1204] (2/4) Exclude cut with ID _41298_210_9019_1_1533430371450_4822143_498-186993-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 13:54:37,397 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.feed_forward1.hidden_balancer.prob, batch_count=1290246.6666666667, ans=0.125 2023-10-03 13:54:37,426 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.attention_skip_rate, batch_count=1290246.6666666667, ans=0.0 2023-10-03 13:54:42,950 WARNING [train.py:1204] (2/4) Exclude cut with ID _17956_210_4353_1_1531209568125_6979970_54-77610-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 13:54:44,304 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_40047_210_2290_1_1533290519_2374311_257-335162-0_sp1.1 from training. Duration: 0.863625 2023-10-03 13:54:47,590 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9653W0193-675471 from training. Duration: 0.9656875 2023-10-03 13:54:48,395 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.0.conv_skip_rate, batch_count=1290313.3333333333, ans=0.0 2023-10-03 13:54:49,604 WARNING [train.py:1204] (2/4) Exclude cut with ID _25248_210_8750_1_1532062825941_5554180_243-240536-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 13:54:53,938 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_32360_210_4198_1_1532689160_3648436_60-51908-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 13:54:53,946 WARNING [train.py:1204] (2/4) Exclude cut with ID _4092_210_2151_1_1526643044449_3135759_227-213972-0_sp0.9 from training. Duration: 0.6 2023-10-03 13:54:55,154 WARNING [train.py:1204] (2/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_392-173500-0_sp1.1 from training. Duration: 0.97275 2023-10-03 13:54:55,185 WARNING [train.py:1204] (2/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_71-329150-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 13:54:55,187 WARNING [train.py:1204] (2/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_866-173440-0 from training. Duration: 0.9 2023-10-03 13:54:55,267 WARNING [train.py:1204] (2/4) Exclude cut with ID _14832_210_8750_1_1530691351539_3171348_1-54315-0_sp0.9 from training. Duration: 0.9666875 2023-10-03 13:54:57,960 WARNING [train.py:1204] (2/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_430-116139-0_sp1.1 from training. Duration: 0.77275 2023-10-03 13:54:59,886 WARNING [train.py:1204] (2/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_356-45054-0_sp0.9 from training. Duration: 0.9555625 2023-10-03 13:55:01,485 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_3531_210_3528_1_1526294203_5216456_309-227302-0_sp0.9 from training. Duration: 0.7666875 2023-10-03 13:55:04,209 WARNING [train.py:1204] (2/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_301-330455-0_sp1.1 from training. Duration: 0.72725 2023-10-03 13:55:06,961 WARNING [train.py:1204] (2/4) Exclude cut with ID _23557_210_8750_1_1531890106370_6846255_667-145945-0_sp1.1 from training. Duration: 0.92725 2023-10-03 13:55:09,532 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.3.encoder.layers.1.feed_forward3.out_whiten, num_groups=1, num_channels=512, metric=9.85 vs. limit=15.0 2023-10-03 13:55:13,025 WARNING [train.py:1204] (2/4) Exclude cut with ID _14860_210_4915_1_1530702020340_7343240_836-328489-0_sp1.1 from training. Duration: 0.8181875 2023-10-03 13:55:13,065 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_37152_210_9697_1_1533106573_3844188_416-333249-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 13:55:18,361 WARNING [train.py:1204] (2/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_538-32291-0_sp0.9 from training. Duration: 0.92225 2023-10-03 13:55:21,183 WARNING [train.py:1204] (2/4) Exclude cut with ID _32704_210_8954_1_1533017488840_6747370_965-176824-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 13:55:24,002 WARNING [train.py:1204] (2/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_86-19601-0_sp1.1 from training. Duration: 0.77275 2023-10-03 13:55:25,309 WARNING [train.py:1204] (2/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_436-318113-0_sp1.1 from training. Duration: 0.6818125 2023-10-03 13:55:25,347 WARNING [train.py:1204] (2/4) Exclude cut with ID _41817_210_18835_1_1533434477160_7423049_555-293448-0_sp0.9 from training. Duration: 0.888875 2023-10-03 13:55:25,357 WARNING [train.py:1204] (2/4) Exclude cut with ID _4697_210_2069_1_1526815801953_3823098_68-219807-0 from training. Duration: 0.47 2023-10-03 13:55:28,131 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.0.conv_skip_rate, batch_count=1290446.6666666667, ans=0.0 2023-10-03 13:55:29,957 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7544_210_6753_1_1528591327_1471426_33-346031-0_sp1.1 from training. Duration: 0.5545625 2023-10-03 13:55:29,959 WARNING [train.py:1204] (2/4) Exclude cut with ID _49748_210_24116_1_1533986971737_3158910_263-52113-0_sp1.1 from training. Duration: 0.97275 2023-10-03 13:55:30,016 WARNING [train.py:1204] (2/4) Exclude cut with ID _64639_210_5816_1_1535367619437_3528000_340-24580-0_sp1.1 from training. Duration: 0.92725 2023-10-03 13:55:30,040 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_47228_210_4983_1_1533780021_3672027_479-114941-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 13:55:30,077 WARNING [train.py:1204] (2/4) Exclude cut with ID _44585_210_18628_1_1533605350387_4518199_506-119659-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 13:55:31,408 WARNING [train.py:1204] (2/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_314-126819-0_sp1.1 from training. Duration: 0.4545625 2023-10-03 13:55:31,411 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_2541_210_1794_1_1525590269_3084765_156-173713-0_sp0.9 from training. Duration: 0.9 2023-10-03 13:55:32,772 WARNING [train.py:1204] (2/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_108-255430-0 from training. Duration: 0.97 2023-10-03 13:55:32,779 WARNING [train.py:1204] (2/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_791-45149-0_sp1.1 from training. Duration: 0.7818125 2023-10-03 13:55:32,786 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_53378_210_17282_1_1534227054_1261425_10-118295-0_sp1.1 from training. Duration: 0.97275 2023-10-03 13:55:34,088 WARNING [train.py:1204] (2/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_148-158509-0 from training. Duration: 0.41 2023-10-03 13:55:37,733 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.3.encoder.layers.3.nonlin_attention.whiten2, num_groups=1, num_channels=512, metric=5.03 vs. limit=15.0 2023-10-03 13:55:38,455 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_34726_210_4525_1_1533019474_5030395_687-291305-0_sp1.1 from training. Duration: 0.936375 2023-10-03 13:55:41,759 WARNING [train.py:1204] (2/4) Exclude cut with ID _14860_210_4915_1_1530702020340_7343240_264-324537-0_sp1.1 from training. Duration: 0.963625 2023-10-03 13:55:47,295 INFO [train.py:1046] (2/4) Epoch 37, batch 2350, loss[loss=0.1452, simple_loss=0.2248, pruned_loss=0.03276, over 24481.00 frames. ], tot_loss[loss=0.1592, simple_loss=0.2387, pruned_loss=0.03982, over 4699965.35 frames. ], batch size: 63, lr: 2.75e-03, grad_scale: 8.0 2023-10-03 13:55:47,411 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_41251_210_15368_1_1533429767_4820695_591-46386-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 13:55:48,802 WARNING [train.py:1204] (2/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_86-19601-0_sp0.9 from training. Duration: 0.9444375 2023-10-03 13:55:48,830 WARNING [train.py:1204] (2/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_401-255491-0_sp0.9 from training. Duration: 0.77775 2023-10-03 13:55:50,252 WARNING [train.py:1204] (2/4) Exclude cut with ID _8696_210_1876_1_1528545632440_3345058_209-54069-0_sp1.1 from training. Duration: 0.6090625 2023-10-03 13:55:50,261 WARNING [train.py:1204] (2/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_369-325184-0_sp1.1 from training. Duration: 0.9 2023-10-03 13:55:50,318 WARNING [train.py:1204] (2/4) Exclude cut with ID _14687_210_5854_1_1530703838259_3664889_115-239046-0_sp0.9 from training. Duration: 0.7444375 2023-10-03 13:55:52,135 WARNING [train.py:1204] (2/4) Exclude cut with ID _8064_210_5399_1_1528372299291_4282973_168-50337-0 from training. Duration: 0.84 2023-10-03 13:55:52,994 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.1.encoder.layers.0.feed_forward3.out_whiten, num_groups=1, num_channels=256, metric=12.99 vs. limit=15.0 2023-10-03 13:55:56,358 WARNING [train.py:1204] (2/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_909-64151-0_sp1.1 from training. Duration: 0.8545625 2023-10-03 13:55:57,748 WARNING [train.py:1204] (2/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_597-154640-0 from training. Duration: 0.9 2023-10-03 13:55:59,870 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.3.encoder.layers.3.self_attn_weights.whiten_keys, num_groups=8, num_channels=256, metric=2.58 vs. limit=6.0 2023-10-03 13:56:03,737 WARNING [train.py:1204] (2/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_126-274694-0 from training. Duration: 0.98 2023-10-03 13:56:05,218 WARNING [train.py:1204] (2/4) Exclude cut with ID _21783_210_11357_1_1531742458730_3762679_344-269786-0_sp1.1 from training. Duration: 0.97275 2023-10-03 13:56:08,709 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_37818_210_4238_1_1533620910_4720083_322-253154-0_sp1.1 from training. Duration: 0.92725 2023-10-03 13:56:08,712 WARNING [train.py:1204] (2/4) Exclude cut with ID _58895_210_8750_1_1535184081320_3649089_221-15823-0_sp1.1 from training. Duration: 0.92725 2023-10-03 13:56:08,722 WARNING [train.py:1204] (2/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_9-21737-0_sp1.1 from training. Duration: 0.8818125 2023-10-03 13:56:09,999 WARNING [train.py:1204] (2/4) Exclude cut with ID _14069_210_5632_1_1530512692033_4803390_522-341588-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 13:56:11,330 WARNING [train.py:1204] (2/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_363-185703-0 from training. Duration: 0.95 2023-10-03 13:56:14,101 WARNING [train.py:1204] (2/4) Exclude cut with ID _41817_210_18835_1_1533434477160_7423049_265-76216-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 13:56:17,471 WARNING [train.py:1204] (2/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_841-341679-0 from training. Duration: 0.58 2023-10-03 13:56:18,749 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.606e+02 1.950e+02 2.084e+02 2.408e+02 3.908e+02, threshold=4.167e+02, percent-clipped=1.0 2023-10-03 13:56:18,918 WARNING [train.py:1204] (2/4) Exclude cut with ID _55429_210_10626_1_1534467588527_7174143_97-335084-0_sp1.1 from training. Duration: 0.8818125 2023-10-03 13:56:23,300 WARNING [train.py:1204] (2/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_690-126306-0_sp0.9 from training. Duration: 0.8666875 2023-10-03 13:56:23,308 WARNING [train.py:1204] (2/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_102-92751-0_sp1.1 from training. Duration: 0.9 2023-10-03 13:56:24,737 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_3787_210_2040_1_1526477544_5478586_603-120324-0_sp1.1 from training. Duration: 0.736375 2023-10-03 13:56:26,099 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_33348_210_5632_1_1533086912_3324030_85-28583-0 from training. Duration: 0.82 2023-10-03 13:56:27,450 WARNING [train.py:1204] (2/4) Exclude cut with ID _60057_210_10640_1_1534903065793_3333150_362-230377-0_sp1.1 from training. Duration: 0.7909375 2023-10-03 13:56:28,878 WARNING [train.py:1204] (2/4) Exclude cut with ID _60113_210_16271_1_1535164212481_4386079_194-146486-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 13:56:28,891 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_53753_210_7234_1_1534320003_3594148_171-277668-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 13:56:28,926 WARNING [train.py:1204] (2/4) Exclude cut with ID _34423_210_8750_1_1533430798456_3330199_251-332788-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 13:56:33,678 WARNING [train.py:1204] (2/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_663-166973-0_sp0.9 from training. Duration: 0.888875 2023-10-03 13:56:35,010 WARNING [train.py:1204] (2/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_927-43827-0 from training. Duration: 0.74 2023-10-03 13:56:35,074 WARNING [train.py:1204] (2/4) Exclude cut with ID _28323_210_16187_1_1532480395667_4100300_306-74561-0_sp1.1 from training. Duration: 0.8545625 2023-10-03 13:56:37,982 WARNING [train.py:1204] (2/4) Exclude cut with ID _15037_210_2253_1_1530752294104_3267650_139-337989-0_sp1.1 from training. Duration: 0.92725 2023-10-03 13:56:37,995 WARNING [train.py:1204] (2/4) Exclude cut with ID _49039_210_6315_1_1533967537541_3846118_460-263670-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 13:56:39,933 WARNING [train.py:1204] (2/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_186-309161-0 from training. Duration: 0.54 2023-10-03 13:56:40,685 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.3.encoder.layers.1.self_attn1.whiten, num_groups=1, num_channels=512, metric=14.57 vs. limit=22.5 2023-10-03 13:56:41,345 WARNING [train.py:1204] (2/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_87-278686-0_sp1.1 from training. Duration: 0.7 2023-10-03 13:56:44,050 WARNING [train.py:1204] (2/4) Exclude cut with ID _27052_210_5986_1_1532329197187_3585009_314-106499-0 from training. Duration: 0.92 2023-10-03 13:56:44,080 WARNING [train.py:1204] (2/4) Exclude cut with ID _3465_210_3525_1_1526175667060_3574049_412-287526-0_sp0.9 from training. Duration: 0.8 2023-10-03 13:56:48,752 WARNING [train.py:1204] (2/4) Exclude cut with ID _22961_210_8254_1_1531976366672_4278180_317-199546-0 from training. Duration: 0.86 2023-10-03 13:56:48,977 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.feed_forward2.hidden_balancer.prob, batch_count=1290846.6666666667, ans=0.125 2023-10-03 13:56:51,580 WARNING [train.py:1204] (2/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_381-317791-0 from training. Duration: 0.73 2023-10-03 13:56:52,912 WARNING [train.py:1204] (2/4) Exclude cut with ID _44010_210_4525_1_1533884262964_4013620_560-310411-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 13:56:52,918 WARNING [train.py:1204] (2/4) Exclude cut with ID _5294_210_1747_1_1527249643928_3696179_342-270278-0_sp0.9 from training. Duration: 0.611125 2023-10-03 13:56:52,950 WARNING [train.py:1204] (2/4) Exclude cut with ID ID0473W0002-918951 from training. Duration: 0.924 2023-10-03 13:56:54,658 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1083-220122-0 from training. Duration: 0.51 2023-10-03 13:56:56,095 WARNING [train.py:1204] (2/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_138-154812-0 from training. Duration: 0.78 2023-10-03 13:56:58,764 WARNING [train.py:1204] (2/4) Exclude cut with ID _44982_210_1511_1_1534312820108_3726029_331-19420-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 13:57:01,896 INFO [train.py:1046] (2/4) Epoch 37, batch 2400, loss[loss=0.1504, simple_loss=0.2415, pruned_loss=0.02964, over 24509.00 frames. ], tot_loss[loss=0.1581, simple_loss=0.2377, pruned_loss=0.03928, over 4698178.15 frames. ], batch size: 71, lr: 2.75e-03, grad_scale: 8.0 2023-10-03 13:57:03,240 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_2655_210_2087_1_1525688959_3897400_202-147557-0_sp0.9 from training. Duration: 0.9444375 2023-10-03 13:57:04,869 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_23850_210_4238_1_1532232597_1632433_32-185135-0_sp0.9 from training. Duration: 0.9555625 2023-10-03 13:57:08,095 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_14056_210_4163_1_1530604483_7572827_46-93547-0_sp1.1 from training. Duration: 0.863625 2023-10-03 13:57:08,145 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7931_210_1794_1_1528594728_5564059_280-279560-0 from training. Duration: 0.98 2023-10-03 13:57:08,281 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.feed_forward2.hidden_balancer.prob, batch_count=1290913.3333333333, ans=0.125 2023-10-03 13:57:09,430 WARNING [train.py:1204] (2/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_937-208281-0 from training. Duration: 0.85 2023-10-03 13:57:16,352 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_14928_210_5632_1_1530773461_4714000_454-112198-0_sp0.9 from training. Duration: 0.7333125 2023-10-03 13:57:16,353 WARNING [train.py:1204] (2/4) Exclude cut with ID _32704_210_8954_1_1533017488840_6747370_944-153333-0_sp1.1 from training. Duration: 0.963625 2023-10-03 13:57:18,343 WARNING [train.py:1204] (2/4) Exclude cut with ID _14821_210_8750_1_1530680503991_6829359_2-227151-0 from training. Duration: 0.83 2023-10-03 13:57:18,368 WARNING [train.py:1204] (2/4) Exclude cut with ID _28461_210_2253_1_1532771775189_3041299_96-152158-0_sp1.1 from training. Duration: 0.77275 2023-10-03 13:57:19,674 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7837_210_1792_1_1528453445_5841305_654-54195-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 13:57:20,991 WARNING [train.py:1204] (2/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_344-152526-0 from training. Duration: 0.53 2023-10-03 13:57:21,710 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.3.encoder.layers.1.feed_forward3.out_whiten, num_groups=1, num_channels=512, metric=13.86 vs. limit=15.0 2023-10-03 13:57:25,037 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_30167_210_3528_1_1532428012_4772375_480-142230-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 13:57:25,754 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.feed_forward3.out_whiten.whitening_limit, batch_count=1290980.0, ans=15.0 2023-10-03 13:57:29,795 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_160-218721-0 from training. Duration: 0.96 2023-10-03 13:57:34,106 WARNING [train.py:1204] (2/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_65-88242-0_sp0.9 from training. Duration: 0.62225 2023-10-03 13:57:39,251 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_34886_210_10633_1_1533203882_1755661_76-156096-0 from training. Duration: 0.87 2023-10-03 13:57:40,910 WARNING [train.py:1204] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_188-38388-0_sp1.1 from training. Duration: 0.92725 2023-10-03 13:57:42,230 WARNING [train.py:1204] (2/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_81-289335-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 13:57:46,343 WARNING [train.py:1204] (2/4) Exclude cut with ID _59493_210_17282_1_1535078419077_3225879_110-8778-0_sp1.1 from training. Duration: 0.936375 2023-10-03 13:57:46,377 WARNING [train.py:1204] (2/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_188-260011-0 from training. Duration: 0.73 2023-10-03 13:57:48,322 WARNING [train.py:1204] (2/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_453-133983-0_sp0.9 from training. Duration: 0.8666875 2023-10-03 13:57:53,898 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8054_210_7927_1_1528627721_4984094_553-17379-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 13:57:56,615 WARNING [train.py:1204] (2/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_341-171072-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 13:57:59,719 WARNING [train.py:1204] (2/4) Exclude cut with ID _30800_210_22382_1_1532499347904_4515959_265-257286-0_sp1.1 from training. Duration: 0.963625 2023-10-03 13:58:01,089 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_403-39619-0_sp0.9 from training. Duration: 0.9333125 2023-10-03 13:58:01,094 WARNING [train.py:1204] (2/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_464-33931-0_sp0.9 from training. Duration: 0.6 2023-10-03 13:58:01,113 WARNING [train.py:1204] (2/4) Exclude cut with ID _60804_210_9642_1_1534831199859_7268299_632-143863-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 13:58:01,122 WARNING [train.py:1204] (2/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_431-310997-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 13:58:01,332 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.conv_module2.balancer1.prob, batch_count=1291180.0, ans=0.125 2023-10-03 13:58:02,533 WARNING [train.py:1204] (2/4) Exclude cut with ID _31649_210_6228_1_1532584608063_4091078_114-263860-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 13:58:02,545 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6961_210_2087_1_1527937168_6815011_562-165518-0_sp0.9 from training. Duration: 0.8555625 2023-10-03 13:58:04,493 WARNING [train.py:1204] (2/4) Exclude cut with ID _27052_210_5986_1_1532329197187_3585009_338-325870-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 13:58:06,368 WARNING [train.py:1204] (2/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_469-94673-0_sp1.1 from training. Duration: 0.7181875 2023-10-03 13:58:06,393 WARNING [train.py:1204] (2/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_259-34038-0 from training. Duration: 0.74 2023-10-03 13:58:07,750 WARNING [train.py:1204] (2/4) Exclude cut with ID _14922_210_6228_1_1530707435273_3693310_388-129552-0 from training. Duration: 0.96 2023-10-03 13:58:09,239 WARNING [train.py:1204] (2/4) Exclude cut with ID _28394_210_8913_1_1532430049518_3650118_342-251455-0_sp1.1 from training. Duration: 0.936375 2023-10-03 13:58:09,894 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.2.encoder.layers.0.whiten, num_groups=1, num_channels=384, metric=8.34 vs. limit=12.0 2023-10-03 13:58:10,532 WARNING [train.py:1204] (2/4) Exclude cut with ID _53731_210_7994_1_1534553979244_3757214_8-191390-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 13:58:10,578 WARNING [train.py:1204] (2/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_613-232856-0 from training. Duration: 0.54 2023-10-03 13:58:10,748 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.bypass.skip_rate, batch_count=1291180.0, ans=0.04949747468305833 2023-10-03 13:58:11,914 WARNING [train.py:1204] (2/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_557-349534-0 from training. Duration: 0.91 2023-10-03 13:58:11,929 WARNING [train.py:1204] (2/4) Exclude cut with ID _14224_210_4381_1_1530529062037_4264010_379-184223-0 from training. Duration: 0.85 2023-10-03 13:58:11,933 WARNING [train.py:1204] (2/4) Exclude cut with ID IC0125W0001-61807 from training. Duration: 0.966 2023-10-03 13:58:13,349 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_229-45300-0 from training. Duration: 0.57 2023-10-03 13:58:14,642 WARNING [train.py:1204] (2/4) Exclude cut with ID _30304_210_6319_1_1532476827261_7582060_1069-24743-0_sp1.1 from training. Duration: 0.92725 2023-10-03 13:58:14,747 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_41452_210_7291_1_1533437687_3941281_323-172873-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 13:58:14,756 WARNING [train.py:1204] (2/4) Exclude cut with ID _37086_210_9038_1_1532999540891_7021250_802-226709-0_sp1.1 from training. Duration: 0.97275 2023-10-03 13:58:16,060 INFO [train.py:1046] (2/4) Epoch 37, batch 2450, loss[loss=0.1598, simple_loss=0.2469, pruned_loss=0.03638, over 24569.00 frames. ], tot_loss[loss=0.1578, simple_loss=0.2369, pruned_loss=0.03938, over 4696038.91 frames. ], batch size: 71, lr: 2.75e-03, grad_scale: 8.0 2023-10-03 13:58:16,167 WARNING [train.py:1204] (2/4) Exclude cut with ID IC0144W0500-71767 from training. Duration: 0.931875 2023-10-03 13:58:17,910 WARNING [train.py:1204] (2/4) Exclude cut with ID _39846_210_12651_1_1533605310265_2115212_233-129699-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 13:58:17,966 WARNING [train.py:1204] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_574-45541-0_sp0.9 from training. Duration: 0.72225 2023-10-03 13:58:21,992 WARNING [train.py:1204] (2/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_372-298093-0_sp1.1 from training. Duration: 0.72725 2023-10-03 13:58:22,004 WARNING [train.py:1204] (2/4) Exclude cut with ID _62574_210_2655_1_1535180407058_3933249_34-26343-0_sp1.1 from training. Duration: 0.97275 2023-10-03 13:58:26,046 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_512-256238-0_sp1.1 from training. Duration: 0.963625 2023-10-03 13:58:26,052 WARNING [train.py:1204] (2/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_196-55502-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 13:58:26,138 WARNING [train.py:1204] (2/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_195-167011-0 from training. Duration: 0.75 2023-10-03 13:58:32,250 WARNING [train.py:1204] (2/4) Exclude cut with ID _13678_210_7497_1_1530530069226_3463170_345-230416-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 13:58:32,266 WARNING [train.py:1204] (2/4) Exclude cut with ID _51078_210_9070_1_1534161557424_3805250_225-327809-0_sp1.1 from training. Duration: 0.963625 2023-10-03 13:58:37,481 WARNING [train.py:1204] (2/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_18-189198-0_sp1.1 from training. Duration: 0.8181875 2023-10-03 13:58:37,491 WARNING [train.py:1204] (2/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_329-82235-0_sp1.1 from training. Duration: 0.7454375 2023-10-03 13:58:37,499 WARNING [train.py:1204] (2/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_726-270780-0_sp0.9 from training. Duration: 0.9555625 2023-10-03 13:58:38,708 WARNING [train.py:1204] (2/4) Exclude cut with ID _66873_210_5517_1_1535454136506_4293370_654-52713-0 from training. Duration: 0.77 2023-10-03 13:58:41,584 WARNING [train.py:1204] (2/4) Exclude cut with ID _20455_210_5219_1_1531565996775_8098100_191-16006-0_sp1.1 from training. Duration: 0.963625 2023-10-03 13:58:43,032 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_3171_210_2087_1_1526109084_2330007_66-260583-0_sp1.1 from training. Duration: 0.6545625 2023-10-03 13:58:44,277 WARNING [train.py:1204] (2/4) Exclude cut with ID _38670_210_15379_1_1533448810659_4391059_63-129006-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 13:58:47,041 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.569e+02 1.840e+02 2.041e+02 2.225e+02 3.110e+02, threshold=4.081e+02, percent-clipped=0.0 2023-10-03 13:58:47,187 WARNING [train.py:1204] (2/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_393-57888-0_sp0.9 from training. Duration: 0.711125 2023-10-03 13:58:48,583 WARNING [train.py:1204] (2/4) Exclude cut with ID _6275_210_2084_1_1528110062213_7669049_857-298942-0_sp0.9 from training. Duration: 0.9444375 2023-10-03 13:58:50,601 WARNING [train.py:1204] (2/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_239-22076-0_sp0.9 from training. Duration: 0.9444375 2023-10-03 13:58:50,632 WARNING [train.py:1204] (2/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_424-227947-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 13:58:52,040 WARNING [train.py:1204] (2/4) Exclude cut with ID _14069_210_5632_1_1530512692033_4803390_95-1896-0 from training. Duration: 0.98 2023-10-03 13:58:52,118 WARNING [train.py:1204] (2/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_96-244299-0_sp0.9 from training. Duration: 0.97775 2023-10-03 13:58:59,080 WARNING [train.py:1204] (2/4) Exclude cut with ID _46683_210_9642_1_1533730913492_4591116_562-319825-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 13:59:00,503 WARNING [train.py:1204] (2/4) Exclude cut with ID _64639_210_5816_1_1535367619437_3528000_284-173474-0_sp1.1 from training. Duration: 0.963625 2023-10-03 13:59:00,538 WARNING [train.py:1204] (2/4) Exclude cut with ID _29529_210_7234_1_1532393961455_3977460_497-100133-0_sp1.1 from training. Duration: 0.936375 2023-10-03 13:59:02,474 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_12868_210_6753_1_1530701932_530275_18-76322-0_sp0.9 from training. Duration: 0.9666875 2023-10-03 13:59:04,391 WARNING [train.py:1204] (2/4) Exclude cut with ID _50093_210_4220_1_1534417141207_3432299_116-303447-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 13:59:05,640 WARNING [train.py:1204] (2/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_44-79427-0_sp1.1 from training. Duration: 0.8909375 2023-10-03 13:59:06,930 WARNING [train.py:1204] (2/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_887-249555-0 from training. Duration: 0.77 2023-10-03 13:59:10,223 WARNING [train.py:1204] (2/4) Exclude cut with ID _28461_210_2253_1_1532771775189_3041299_96-152158-0_sp0.9 from training. Duration: 0.9444375 2023-10-03 13:59:10,277 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_14058_210_4163_1_1530690953_6327180_49-210687-0_sp1.1 from training. Duration: 0.8545625 2023-10-03 13:59:12,978 WARNING [train.py:1204] (2/4) Exclude cut with ID _40399_210_4353_1_1533305658194_3279406_351-127903-0_sp1.1 from training. Duration: 0.97275 2023-10-03 13:59:12,983 WARNING [train.py:1204] (2/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_184-206939-0_sp1.1 from training. Duration: 0.936375 2023-10-03 13:59:17,297 WARNING [train.py:1204] (2/4) Exclude cut with ID _14100_210_8341_1_1530529320613_3678069_258-224599-0_sp1.1 from training. Duration: 0.7 2023-10-03 13:59:17,312 WARNING [train.py:1204] (2/4) Exclude cut with ID _6493_210_1747_1_1528459210424_4012470_324-323854-0 from training. Duration: 0.92 2023-10-03 13:59:18,590 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_12868_210_6753_1_1530702479_3610564_151-325626-0_sp1.1 from training. Duration: 0.8818125 2023-10-03 13:59:19,296 WARNING [train.py:1204] (2/4) Exclude cut with ID _14528_210_8254_1_1530669700514_4306939_106-43145-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 13:59:19,320 WARNING [train.py:1204] (2/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_686-54383-0 from training. Duration: 0.87 2023-10-03 13:59:19,358 WARNING [train.py:1204] (2/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_801-102362-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 13:59:20,586 WARNING [train.py:1204] (2/4) Exclude cut with ID _48677_210_5663_1_1533902405919_3892989_408-40740-0_sp0.9 from training. Duration: 0.92225 2023-10-03 13:59:20,887 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.conv_module2.balancer2.prob, batch_count=1291513.3333333333, ans=0.125 2023-10-03 13:59:23,402 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.bypass_mid.scale_min, batch_count=1291513.3333333333, ans=0.2 2023-10-03 13:59:24,390 WARNING [train.py:1204] (2/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_448-212135-0_sp1.1 from training. Duration: 0.82725 2023-10-03 13:59:27,115 WARNING [train.py:1204] (2/4) Exclude cut with ID _59170_210_6721_1_1534983633960_4203335_320-124047-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 13:59:27,157 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_4692_210_3528_1_1526899755_4323254_107-44401-0_sp1.1 from training. Duration: 0.7818125 2023-10-03 13:59:29,947 INFO [train.py:1046] (2/4) Epoch 37, batch 2500, loss[loss=0.157, simple_loss=0.235, pruned_loss=0.0395, over 19922.00 frames. ], tot_loss[loss=0.1575, simple_loss=0.2365, pruned_loss=0.03929, over 4696238.89 frames. ], batch size: 43, lr: 2.75e-03, grad_scale: 8.0 2023-10-03 13:59:30,029 WARNING [train.py:1204] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_731-2428-0 from training. Duration: 0.83 2023-10-03 13:59:31,877 WARNING [train.py:1204] (2/4) Exclude cut with ID _29857_210_20034_1_1532395886753_3798160_335-163562-0_sp1.1 from training. Duration: 0.77275 2023-10-03 13:59:36,737 WARNING [train.py:1204] (2/4) Exclude cut with ID _14832_210_8750_1_1530691351539_3171348_171-275004-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 13:59:45,052 WARNING [train.py:1204] (2/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_105-328174-0_sp0.9 from training. Duration: 0.8444375 2023-10-03 13:59:45,092 WARNING [train.py:1204] (2/4) Exclude cut with ID _68965_210_1883_1_1535698962272_3497490_164-118421-0_sp1.1 from training. Duration: 0.936375 2023-10-03 13:59:46,462 WARNING [train.py:1204] (2/4) Exclude cut with ID _19896_210_1883_1_1531709022662_6649569_380-24170-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 13:59:46,467 WARNING [train.py:1204] (2/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_505-205270-0_sp1.1 from training. Duration: 0.3454375 2023-10-03 13:59:53,910 WARNING [train.py:1204] (2/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_427-208901-0_sp1.1 from training. Duration: 0.6181875 2023-10-03 13:59:53,971 WARNING [train.py:1204] (2/4) Exclude cut with ID _23818_210_6228_1_1531917063884_3676920_85-326916-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 13:59:55,313 WARNING [train.py:1204] (2/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_101-293367-0_sp1.1 from training. Duration: 0.57275 2023-10-03 13:59:55,512 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.conv_skip_rate, batch_count=1291646.6666666667, ans=0.0 2023-10-03 13:59:56,649 WARNING [train.py:1204] (2/4) Exclude cut with ID _5409_210_1388_1_1527292850189_5865191_178-127970-0_sp1.1 from training. Duration: 0.5181875 2023-10-03 13:59:56,706 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_403-39619-0 from training. Duration: 0.84 2023-10-03 13:59:58,059 WARNING [train.py:1204] (2/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_427-87841-0_sp1.1 from training. Duration: 0.963625 2023-10-03 13:59:58,142 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder_embed.convnext.layerdrop_rate, batch_count=1291713.3333333333, ans=0.015 2023-10-03 13:59:59,955 WARNING [train.py:1204] (2/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_286-68181-0_sp1.1 from training. Duration: 0.97275 2023-10-03 13:59:59,988 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6673_210_3273_1_1527767790_3173240_327-306185-0 from training. Duration: 0.83 2023-10-03 14:00:00,019 WARNING [train.py:1204] (2/4) Exclude cut with ID _3675_210_4557_1_1526639934408_4182089_106-203713-0_sp1.1 from training. Duration: 0.963625 2023-10-03 14:00:01,329 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_53071_210_16554_1_1534493434_1276796_130-36076-0 from training. Duration: 0.95 2023-10-03 14:00:01,352 WARNING [train.py:1204] (2/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_586-93054-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 14:00:01,630 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.self_attn_weights.pos_emb_skip_rate, batch_count=1291713.3333333333, ans=0.0 2023-10-03 14:00:07,407 WARNING [train.py:1204] (2/4) Exclude cut with ID _23825_210_13449_1_1531915313526_3497150_262-10571-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 14:00:07,483 WARNING [train.py:1204] (2/4) Exclude cut with ID _28783_210_7943_1_1532671164808_7155479_432-67885-0_sp1.1 from training. Duration: 0.97275 2023-10-03 14:00:07,639 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.1.conv_skip_rate, batch_count=1291713.3333333333, ans=0.0 2023-10-03 14:00:10,743 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6287_210_3273_1_1527509506_2653716_182-199887-0_sp0.9 from training. Duration: 0.5666875 2023-10-03 14:00:11,253 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.4.encoder.layers.0.conv_module2.whiten, num_groups=1, num_channels=384, metric=9.75 vs. limit=15.0 2023-10-03 14:00:12,015 WARNING [train.py:1204] (2/4) Exclude cut with ID _3231_210_2106_1_1526205864934_6784220_171-256004-0 from training. Duration: 0.86 2023-10-03 14:00:12,068 WARNING [train.py:1204] (2/4) Exclude cut with ID _69051_210_1531_1_1535885566901_3829538_332-92090-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 14:00:13,494 WARNING [train.py:1204] (2/4) Exclude cut with ID _3675_210_4557_1_1526639934408_4182089_104-324125-0_sp1.1 from training. Duration: 0.963625 2023-10-03 14:00:17,550 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_41764_210_2521_1_1533686110_4089313_501-2119-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 14:00:20,322 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_14056_210_4163_1_1530604483_7572827_56-100988-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 14:00:20,499 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.conv_module1.balancer2.prob, batch_count=1291780.0, ans=0.125 2023-10-03 14:00:24,749 WARNING [train.py:1204] (2/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_398-239819-0_sp1.1 from training. Duration: 0.9 2023-10-03 14:00:30,422 WARNING [train.py:1204] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_74-48717-0_sp1.1 from training. Duration: 0.52725 2023-10-03 14:00:32,163 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=1291846.6666666667, ans=0.1 2023-10-03 14:00:33,339 WARNING [train.py:1204] (2/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_177-49092-0 from training. Duration: 0.91 2023-10-03 14:00:33,364 WARNING [train.py:1204] (2/4) Exclude cut with ID _62337_210_12392_1_1535009273105_3412229_44-176353-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 14:00:33,373 WARNING [train.py:1204] (2/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_110-130961-0_sp1.1 from training. Duration: 0.42725 2023-10-03 14:00:35,390 WARNING [train.py:1204] (2/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_37-189262-0_sp1.1 from training. Duration: 0.8909375 2023-10-03 14:00:35,391 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_3172_210_2087_1_1526292929_3987865_197-97762-0_sp0.9 from training. Duration: 0.8333125 2023-10-03 14:00:35,594 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.conv_skip_rate, batch_count=1291846.6666666667, ans=0.0 2023-10-03 14:00:37,314 WARNING [train.py:1204] (2/4) Exclude cut with ID IC0677W0065-334897 from training. Duration: 0.896 2023-10-03 14:00:37,314 WARNING [train.py:1204] (2/4) Exclude cut with ID IC0677W0080-334912 from training. Duration: 0.768 2023-10-03 14:00:37,325 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_120-41668-0 from training. Duration: 0.7 2023-10-03 14:00:40,083 WARNING [train.py:1204] (2/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_460-205287-0_sp1.1 from training. Duration: 0.963625 2023-10-03 14:00:43,280 WARNING [train.py:1204] (2/4) Exclude cut with ID _14632_210_9988_1_1530691279392_3923200_102-204477-0 from training. Duration: 0.97 2023-10-03 14:00:43,286 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_34815_210_6388_1_1533381320_3571903_279-305565-0 from training. Duration: 0.92 2023-10-03 14:00:44,575 INFO [train.py:1046] (2/4) Epoch 37, batch 2550, loss[loss=0.1558, simple_loss=0.2477, pruned_loss=0.03196, over 24652.00 frames. ], tot_loss[loss=0.1572, simple_loss=0.237, pruned_loss=0.03871, over 4724602.78 frames. ], batch size: 73, lr: 2.75e-03, grad_scale: 8.0 2023-10-03 14:00:44,625 WARNING [train.py:1204] (2/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_91-148831-0_sp1.1 from training. Duration: 0.9 2023-10-03 14:00:44,691 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6291_210_3273_1_1527683649_1078498_82-221196-0 from training. Duration: 0.76 2023-10-03 14:00:47,413 WARNING [train.py:1204] (2/4) Exclude cut with ID _38020_210_5986_1_1533112252259_7006089_493-102745-0 from training. Duration: 0.98 2023-10-03 14:00:50,152 WARNING [train.py:1204] (2/4) Exclude cut with ID _61133_210_9969_1_1535196777227_3696409_40-93442-0_sp1.1 from training. Duration: 0.92725 2023-10-03 14:00:52,870 WARNING [train.py:1204] (2/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_45-258852-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 14:00:52,919 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_53071_210_16554_1_1534493434_1276796_130-36076-0_sp1.1 from training. Duration: 0.863625 2023-10-03 14:00:54,356 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8694_210_6400_1_1529065754_3260756_332-13553-0_sp1.1 from training. Duration: 0.92725 2023-10-03 14:00:56,216 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_405-339986-0 from training. Duration: 0.51 2023-10-03 14:00:56,271 WARNING [train.py:1204] (2/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_927-43827-0_sp1.1 from training. Duration: 0.67275 2023-10-03 14:01:00,277 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_397-147196-0 from training. Duration: 0.8 2023-10-03 14:01:01,672 WARNING [train.py:1204] (2/4) Exclude cut with ID 3557-8342-0013-54691-0_sp1.1 from training. Duration: 0.836375 2023-10-03 14:01:03,343 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_47438_210_5399_1_1533797471_4513356_626-127722-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 14:01:05,356 WARNING [train.py:1204] (2/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_256-323883-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 14:01:05,363 WARNING [train.py:1204] (2/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_839-173017-0_sp0.9 from training. Duration: 0.4555625 2023-10-03 14:01:06,297 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.0.layers.1.nonlin_attention.whiten2, num_groups=1, num_channels=192, metric=5.74 vs. limit=15.0 2023-10-03 14:01:06,712 WARNING [train.py:1204] (2/4) Exclude cut with ID _5289_210_2084_1_1527678202736_3779299_392-261988-0_sp1.1 from training. Duration: 0.5909375 2023-10-03 14:01:06,759 WARNING [train.py:1204] (2/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_119-224384-0_sp1.1 from training. Duration: 0.936375 2023-10-03 14:01:06,798 WARNING [train.py:1204] (2/4) Exclude cut with ID _57523_210_1856_1_1534766294545_3732989_262-267933-0_sp1.1 from training. Duration: 0.963625 2023-10-03 14:01:10,057 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_35361_210_4238_1_1533262810_7793476_675-87012-0_sp0.9 from training. Duration: 0.988875 2023-10-03 14:01:10,075 WARNING [train.py:1204] (2/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_17-90541-0 from training. Duration: 0.94 2023-10-03 14:01:10,105 WARNING [train.py:1204] (2/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_535-14410-0_sp1.1 from training. Duration: 0.42725 2023-10-03 14:01:10,111 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_24673_210_6400_1_1531968633_778659_94-154511-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 14:01:10,114 WARNING [train.py:1204] (2/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_212-10501-0 from training. Duration: 0.94 2023-10-03 14:01:16,194 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.582e+02 1.844e+02 2.001e+02 2.201e+02 4.188e+02, threshold=4.002e+02, percent-clipped=1.0 2023-10-03 14:01:20,372 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder_embed.dropout.p, batch_count=1292046.6666666667, ans=0.1 2023-10-03 14:01:21,710 WARNING [train.py:1204] (2/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_383-285196-0_sp1.1 from training. Duration: 0.8818125 2023-10-03 14:01:25,252 WARNING [train.py:1204] (2/4) Exclude cut with ID _45560_210_21982_1_1533812603951_3584999_238-47020-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 14:01:25,268 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_21500_210_2054_1_1532149310_2716758_278-18053-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 14:01:26,727 WARNING [train.py:1204] (2/4) Exclude cut with ID _56235_210_10640_1_1534557335235_4161099_453-9626-0_sp1.1 from training. Duration: 0.936375 2023-10-03 14:01:26,827 WARNING [train.py:1204] (2/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_153-97441-0_sp1.1 from training. Duration: 0.5090625 2023-10-03 14:01:34,341 WARNING [train.py:1204] (2/4) Exclude cut with ID _67725_210_27767_1_1535608756671_5105359_431-120895-0_sp1.1 from training. Duration: 0.963625 2023-10-03 14:01:37,088 WARNING [train.py:1204] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_574-45541-0_sp1.1 from training. Duration: 0.5909375 2023-10-03 14:01:37,107 WARNING [train.py:1204] (2/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_162-103620-0_sp1.1 from training. Duration: 0.8454375 2023-10-03 14:01:37,118 WARNING [train.py:1204] (2/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_734-129761-0_sp1.1 from training. Duration: 0.7454375 2023-10-03 14:01:37,158 WARNING [train.py:1204] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_620-276793-0_sp1.1 from training. Duration: 0.536375 2023-10-03 14:01:38,441 WARNING [train.py:1204] (2/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_153-97441-0_sp0.9 from training. Duration: 0.62225 2023-10-03 14:01:38,633 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.conv_skip_rate, batch_count=1292113.3333333333, ans=0.0 2023-10-03 14:01:43,597 WARNING [train.py:1204] (2/4) Exclude cut with ID _12733_210_8341_1_1530167424556_3646380_523-85621-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 14:01:43,609 WARNING [train.py:1204] (2/4) Exclude cut with ID _64048_210_10626_1_1535198382802_3835179_352-179834-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 14:01:49,129 WARNING [train.py:1204] (2/4) Exclude cut with ID _55162_210_17071_1_1534408248583_3753480_43-242364-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 14:01:49,144 WARNING [train.py:1204] (2/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_661-264857-0 from training. Duration: 0.93 2023-10-03 14:01:49,157 WARNING [train.py:1204] (2/4) Exclude cut with ID _6274_210_2084_1_1527937583837_7494020_325-237800-0_sp1.1 from training. Duration: 0.97275 2023-10-03 14:01:50,529 WARNING [train.py:1204] (2/4) Exclude cut with ID _55328_210_27767_1_1534580973260_4088170_385-171459-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 14:01:51,946 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_3780_210_2087_1_1526467389_2145572_176-107140-0_sp1.1 from training. Duration: 0.62725 2023-10-03 14:01:52,059 WARNING [train.py:1204] (2/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_375-244976-0_sp1.1 from training. Duration: 0.6818125 2023-10-03 14:01:53,675 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.conv_module1.balancer2.prob, batch_count=1292180.0, ans=0.125 2023-10-03 14:01:54,766 WARNING [train.py:1204] (2/4) Exclude cut with ID _35941_210_15379_1_1532995294515_4023089_309-33162-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 14:01:59,004 INFO [train.py:1046] (2/4) Epoch 37, batch 2600, loss[loss=0.1517, simple_loss=0.2322, pruned_loss=0.03562, over 24450.00 frames. ], tot_loss[loss=0.1584, simple_loss=0.2385, pruned_loss=0.03914, over 4729603.15 frames. ], batch size: 63, lr: 2.75e-03, grad_scale: 8.0 2023-10-03 14:02:00,908 WARNING [train.py:1204] (2/4) Exclude cut with ID _14459_210_2228_1_1531443641946_7516700_1185-22038-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 14:02:01,143 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.balancer1.prob, batch_count=1292246.6666666667, ans=0.125 2023-10-03 14:02:03,674 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_1197-69460-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 14:02:06,350 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9012W0022-501157 from training. Duration: 0.896 2023-10-03 14:02:06,508 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9023W0009-506302 from training. Duration: 0.896 2023-10-03 14:02:08,412 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_2821_210_2715_1_1526108385_3763560_73-296673-0_sp1.1 from training. Duration: 0.7545625 2023-10-03 14:02:08,448 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9025W0113-507442 from training. Duration: 0.896 2023-10-03 14:02:08,518 WARNING [train.py:1204] (2/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_739-2098-0 from training. Duration: 0.92 2023-10-03 14:02:08,535 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9029W0332-509722 from training. Duration: 0.768 2023-10-03 14:02:10,040 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.feed_forward2.hidden_balancer.prob, batch_count=1292246.6666666667, ans=0.125 2023-10-03 14:02:12,483 WARNING [train.py:1204] (2/4) Exclude cut with ID _16908_210_5724_1_1531119157248_3772700_68-101953-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 14:02:12,499 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9042W0011-515617 from training. Duration: 0.768 2023-10-03 14:02:14,357 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_3019_210_2028_1_1526044245_3264865_191-72235-0 from training. Duration: 0.97 2023-10-03 14:02:15,768 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9052W0006-520792 from training. Duration: 0.896 2023-10-03 14:02:17,151 WARNING [train.py:1204] (2/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_245-174553-0_sp1.1 from training. Duration: 0.8 2023-10-03 14:02:18,585 WARNING [train.py:1204] (2/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_202-67017-0 from training. Duration: 0.82 2023-10-03 14:02:18,677 WARNING [train.py:1204] (2/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_304-59227-0 from training. Duration: 0.77 2023-10-03 14:02:21,333 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_3133_210_2087_1_1526106446_2324690_246-125000-0_sp0.9 from training. Duration: 0.688875 2023-10-03 14:02:21,379 WARNING [train.py:1204] (2/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_243-42116-0 from training. Duration: 0.73 2023-10-03 14:02:24,026 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9087W0121-538492 from training. Duration: 0.896 2023-10-03 14:02:24,049 WARNING [train.py:1204] (2/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_120-76843-0 from training. Duration: 0.55 2023-10-03 14:02:31,479 WARNING [train.py:1204] (2/4) Exclude cut with ID _7852_210_5219_1_1528286471316_8139400_850-304621-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 14:02:31,496 WARNING [train.py:1204] (2/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_154-285415-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 14:02:31,512 WARNING [train.py:1204] (2/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_538-278194-0_sp1.1 from training. Duration: 0.92725 2023-10-03 14:02:31,513 WARNING [train.py:1204] (2/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_119-207162-0 from training. Duration: 0.71 2023-10-03 14:02:34,233 WARNING [train.py:1204] (2/4) Exclude cut with ID _2334_210_2667_1_1525608238880_4705129_459-104997-0_sp1.1 from training. Duration: 0.863625 2023-10-03 14:02:38,494 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9147W0019-567847 from training. Duration: 0.768 2023-10-03 14:02:38,720 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.feed_forward3.hidden_balancer.prob, batch_count=1292380.0, ans=0.125 2023-10-03 14:02:45,227 WARNING [train.py:1204] (2/4) Exclude cut with ID _69884_210_9771_1_1535846392818_4166169_228-189439-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 14:02:46,521 WARNING [train.py:1204] (2/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_196-77172-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 14:02:46,591 WARNING [train.py:1204] (2/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_625-338546-0 from training. Duration: 0.88 2023-10-03 14:02:47,860 WARNING [train.py:1204] (2/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_193-327630-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 14:02:47,862 WARNING [train.py:1204] (2/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_391-24006-0_sp1.1 from training. Duration: 0.92725 2023-10-03 14:02:47,933 WARNING [train.py:1204] (2/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_197-28164-0 from training. Duration: 0.8 2023-10-03 14:02:52,069 WARNING [train.py:1204] (2/4) Exclude cut with ID _8395_210_1531_1_1528597871523_4411579_431-132470-0_sp1.1 from training. Duration: 0.67275 2023-10-03 14:02:52,095 WARNING [train.py:1204] (2/4) Exclude cut with ID _35350_210_16040_1_1533040191183_4075299_465-248326-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 14:02:53,423 WARNING [train.py:1204] (2/4) Exclude cut with ID _16756_210_4915_1_1531047803763_7104259_101-141922-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 14:02:57,525 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9212W0005-600172 from training. Duration: 0.896 2023-10-03 14:02:57,547 WARNING [train.py:1204] (2/4) Exclude cut with ID _63186_210_2084_1_1535414479705_3908649_378-166820-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 14:02:57,569 WARNING [train.py:1204] (2/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_52-211683-0_sp1.1 from training. Duration: 0.8181875 2023-10-03 14:03:03,615 WARNING [train.py:1204] (2/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_1147-237729-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 14:03:03,691 WARNING [train.py:1204] (2/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_234-14174-0_sp0.9 from training. Duration: 0.911125 2023-10-03 14:03:03,702 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_454-276429-0 from training. Duration: 0.87 2023-10-03 14:03:03,944 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.balancer2.prob, batch_count=1292513.3333333333, ans=0.125 2023-10-03 14:03:05,012 WARNING [train.py:1204] (2/4) Exclude cut with ID _53713_210_24116_1_1534314487762_7416751_755-165313-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 14:03:07,682 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8278_210_2550_1_1528637099_4220413_436-317072-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 14:03:07,845 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.conv_module2.balancer1.prob, batch_count=1292513.3333333333, ans=0.125 2023-10-03 14:03:08,996 WARNING [train.py:1204] (2/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_28-324729-0_sp1.1 from training. Duration: 0.8545625 2023-10-03 14:03:11,787 INFO [train.py:1046] (2/4) Epoch 37, batch 2650, loss[loss=0.1609, simple_loss=0.2522, pruned_loss=0.0348, over 24550.00 frames. ], tot_loss[loss=0.1588, simple_loss=0.2387, pruned_loss=0.03942, over 4724499.06 frames. ], batch size: 71, lr: 2.75e-03, grad_scale: 8.0 2023-10-03 14:03:12,162 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.conv_module1.balancer2.prob, batch_count=1292580.0, ans=0.125 2023-10-03 14:03:13,796 WARNING [train.py:1204] (2/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_326-165900-0 from training. Duration: 0.69 2023-10-03 14:03:15,731 WARNING [train.py:1204] (2/4) Exclude cut with ID _53489_210_6228_1_1534298357169_3835970_96-30246-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 14:03:17,540 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8275_210_4198_1_1528455320_3892571_491-17707-0_sp1.1 from training. Duration: 0.5818125 2023-10-03 14:03:20,499 WARNING [train.py:1204] (2/4) Exclude cut with ID _14348_210_8341_1_1530599434996_3778640_240-232370-0 from training. Duration: 0.84 2023-10-03 14:03:20,510 WARNING [train.py:1204] (2/4) Exclude cut with ID _45012_210_12651_1_1533608435136_5371380_54-112623-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 14:03:21,770 WARNING [train.py:1204] (2/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_328-264776-0_sp1.1 from training. Duration: 0.6090625 2023-10-03 14:03:23,072 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9325W0001-648427 from training. Duration: 0.768 2023-10-03 14:03:23,081 WARNING [train.py:1204] (2/4) Exclude cut with ID _56480_210_4220_1_1534679942079_3075409_49-74717-0_sp1.1 from training. Duration: 0.936375 2023-10-03 14:03:25,821 WARNING [train.py:1204] (2/4) Exclude cut with ID _59500_210_8254_1_1534809610680_3736009_6-241869-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 14:03:26,152 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.bypass.scale_min, batch_count=1292646.6666666667, ans=0.2 2023-10-03 14:03:26,630 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.1.encoder.layers.1.feed_forward1.out_whiten, num_groups=1, num_channels=256, metric=6.82 vs. limit=15.0 2023-10-03 14:03:27,197 WARNING [train.py:1204] (2/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_172-215932-0_sp0.9 from training. Duration: 0.7333125 2023-10-03 14:03:27,286 WARNING [train.py:1204] (2/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_17-90541-0_sp1.1 from training. Duration: 0.8545625 2023-10-03 14:03:29,362 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.2.encoder.layers.1.self_attn2.whiten, num_groups=1, num_channels=384, metric=16.09 vs. limit=22.5 2023-10-03 14:03:30,093 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_43831_210_2158_1_1533725502_5872791_525-89447-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 14:03:30,801 WARNING [train.py:1204] (2/4) Exclude cut with ID _12302_210_5219_1_1529924446466_8107280_907-182949-0 from training. Duration: 0.92 2023-10-03 14:03:30,814 WARNING [train.py:1204] (2/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_402-102311-0_sp0.9 from training. Duration: 0.7444375 2023-10-03 14:03:32,099 WARNING [train.py:1204] (2/4) Exclude cut with ID _8021_210_7994_1_1528376451358_3653069_179-45206-0_sp0.9 from training. Duration: 0.9555625 2023-10-03 14:03:34,866 WARNING [train.py:1204] (2/4) Exclude cut with ID _40137_210_12210_1_1533290484046_3795180_249-187059-0 from training. Duration: 0.88 2023-10-03 14:03:36,208 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9653W0194-675472 from training. Duration: 0.9658125 2023-10-03 14:03:38,955 WARNING [train.py:1204] (2/4) Exclude cut with ID _51332_210_9988_1_1534244371836_3801270_336-255604-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 14:03:40,403 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_20120_210_6753_1_1531643035_5482859_510-96996-0 from training. Duration: 0.87 2023-10-03 14:03:40,414 WARNING [train.py:1204] (2/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_1167-20437-0_sp1.1 from training. Duration: 0.92725 2023-10-03 14:03:42,360 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_3475_210_2028_1_1526193578_3622569_265-276625-0 from training. Duration: 0.76 2023-10-03 14:03:43,643 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.580e+02 1.832e+02 2.042e+02 2.355e+02 4.298e+02, threshold=4.084e+02, percent-clipped=1.0 2023-10-03 14:03:45,675 WARNING [train.py:1204] (2/4) Exclude cut with ID _14860_210_4915_1_1530702020340_7343240_470-172418-0_sp1.1 from training. Duration: 0.97275 2023-10-03 14:03:45,685 WARNING [train.py:1204] (2/4) Exclude cut with ID _5444_210_2151_1_1527418808464_5312210_581-315411-0_sp1.1 from training. Duration: 0.563625 2023-10-03 14:03:45,700 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_34450_210_9547_1_1533099443_4109050_519-342439-0_sp1.1 from training. Duration: 0.97275 2023-10-03 14:03:47,699 WARNING [train.py:1204] (2/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_50-175961-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 14:03:49,177 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder_embed.convnext.layerdrop_rate, batch_count=1292713.3333333333, ans=0.015 2023-10-03 14:03:50,488 WARNING [train.py:1204] (2/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_372-6869-0 from training. Duration: 0.44 2023-10-03 14:03:50,494 WARNING [train.py:1204] (2/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_3-237434-0 from training. Duration: 0.76 2023-10-03 14:03:53,241 WARNING [train.py:1204] (2/4) Exclude cut with ID _8772_210_1850_1_1528635595972_3664011_247-336448-0_sp1.1 from training. Duration: 0.67275 2023-10-03 14:03:56,113 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_427-78097-0 from training. Duration: 0.8 2023-10-03 14:03:56,128 WARNING [train.py:1204] (2/4) Exclude cut with ID _69884_210_9771_1_1535846392818_4166169_466-252727-0_sp1.1 from training. Duration: 0.97275 2023-10-03 14:03:57,932 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_15972_210_6400_1_1530877934_3405692_316-35478-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 14:03:59,232 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_809-17156-0_sp0.9 from training. Duration: 0.811125 2023-10-03 14:03:59,276 WARNING [train.py:1204] (2/4) Exclude cut with ID _57572_210_5517_1_1534921219624_3809484_594-263474-0_sp1.1 from training. Duration: 0.936375 2023-10-03 14:03:59,304 WARNING [train.py:1204] (2/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_190-228204-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 14:04:00,667 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.0.conv_module1.balancer1.prob, batch_count=1292780.0, ans=0.125 2023-10-03 14:04:01,911 WARNING [train.py:1204] (2/4) Exclude cut with ID _19514_210_9259_1_1531530071330_5084230_117-21193-0_sp1.1 from training. Duration: 0.936375 2023-10-03 14:04:04,543 WARNING [train.py:1204] (2/4) Exclude cut with ID _69584_210_5219_1_1535785133775_4044450_190-21315-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 14:04:04,615 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_53071_210_16554_1_1534493434_1276796_184-302050-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 14:04:04,668 WARNING [train.py:1204] (2/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_120-76843-0_sp1.1 from training. Duration: 0.5 2023-10-03 14:04:05,987 WARNING [train.py:1204] (2/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_318-162017-0_sp1.1 from training. Duration: 0.82725 2023-10-03 14:04:07,439 WARNING [train.py:1204] (2/4) Exclude cut with ID _46067_210_4220_1_1534125571066_3307450_79-46864-0_sp1.1 from training. Duration: 0.92725 2023-10-03 14:04:07,486 WARNING [train.py:1204] (2/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_358-215558-0_sp1.1 from training. Duration: 0.8454375 2023-10-03 14:04:08,866 WARNING [train.py:1204] (2/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_621-66764-0_sp1.1 from training. Duration: 0.92725 2023-10-03 14:04:11,536 WARNING [train.py:1204] (2/4) Exclude cut with ID _42863_210_9070_1_1533902398788_3696920_338-156851-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 14:04:11,567 WARNING [train.py:1204] (2/4) Exclude cut with ID _5289_210_2084_1_1527678202736_3779299_392-261988-0_sp0.9 from training. Duration: 0.72225 2023-10-03 14:04:15,055 WARNING [train.py:1204] (2/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_409-84682-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 14:04:17,008 WARNING [train.py:1204] (2/4) Exclude cut with ID _9235_210_5219_1_1528891154253_8046120_30-326681-0_sp0.9 from training. Duration: 0.92225 2023-10-03 14:04:17,013 WARNING [train.py:1204] (2/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_389-184695-0_sp1.1 from training. Duration: 0.92725 2023-10-03 14:04:17,045 WARNING [train.py:1204] (2/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_442-320317-0 from training. Duration: 0.82 2023-10-03 14:04:20,282 WARNING [train.py:1204] (2/4) Exclude cut with ID _44778_210_2054_1_1533798127203_7443090_756-29911-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 14:04:21,686 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_401-112036-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 14:04:21,776 WARNING [train.py:1204] (2/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_181-308827-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 14:04:22,365 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.3.encoder.layers.2.feed_forward3.out_whiten, num_groups=1, num_channels=512, metric=10.48 vs. limit=15.0 2023-10-03 14:04:24,427 WARNING [train.py:1204] (2/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_332-258725-0_sp1.1 from training. Duration: 0.963625 2023-10-03 14:04:25,774 WARNING [train.py:1204] (2/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_188-260011-0_sp0.9 from training. Duration: 0.811125 2023-10-03 14:04:25,810 WARNING [train.py:1204] (2/4) Exclude cut with ID _49039_210_6315_1_1533967537541_3846118_99-302239-0_sp1.1 from training. Duration: 0.963625 2023-10-03 14:04:27,233 INFO [train.py:1046] (2/4) Epoch 37, batch 2700, loss[loss=0.1566, simple_loss=0.2279, pruned_loss=0.04261, over 23651.00 frames. ], tot_loss[loss=0.1602, simple_loss=0.2398, pruned_loss=0.0403, over 4711896.28 frames. ], batch size: 256, lr: 2.75e-03, grad_scale: 8.0 2023-10-03 14:04:29,139 WARNING [train.py:1204] (2/4) Exclude cut with ID _4122_210_2084_1_1526713164417_4353280_312-294828-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 14:04:29,152 WARNING [train.py:1204] (2/4) Exclude cut with ID _14922_210_6228_1_1530707435273_3693310_303-228738-0 from training. Duration: 0.84 2023-10-03 14:04:31,970 WARNING [train.py:1204] (2/4) Exclude cut with ID _14349_210_8341_1_1530615710818_3442470_115-285975-0_sp0.9 from training. Duration: 0.9555625 2023-10-03 14:04:33,389 WARNING [train.py:1204] (2/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_125-144174-0_sp1.1 from training. Duration: 0.3545625 2023-10-03 14:04:34,853 WARNING [train.py:1204] (2/4) Exclude cut with ID _53565_210_10635_1_1534337218335_3820259_351-54570-0_sp1.1 from training. Duration: 0.97275 2023-10-03 14:04:34,873 WARNING [train.py:1204] (2/4) Exclude cut with ID _28317_210_12742_1_1532480302381_8510325_410-40065-0_sp1.1 from training. Duration: 0.963625 2023-10-03 14:04:34,902 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_38965_210_4852_1_1533174906_3484735_167-339442-0_sp1.1 from training. Duration: 0.963625 2023-10-03 14:04:36,232 WARNING [train.py:1204] (2/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_265-164486-0_sp1.1 from training. Duration: 0.87275 2023-10-03 14:04:36,249 WARNING [train.py:1204] (2/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_725-182484-0_sp1.1 from training. Duration: 0.92725 2023-10-03 14:04:37,548 WARNING [train.py:1204] (2/4) Exclude cut with ID _28317_210_12742_1_1532480302381_8510325_607-213951-0_sp0.9 from training. Duration: 0.9333125 2023-10-03 14:04:37,569 WARNING [train.py:1204] (2/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_605-133395-0_sp1.1 from training. Duration: 0.6 2023-10-03 14:04:37,590 WARNING [train.py:1204] (2/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_351-294650-0 from training. Duration: 0.63 2023-10-03 14:04:38,912 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_14953_210_2151_1_1530701710_5616165_710-165400-0_sp1.1 from training. Duration: 0.7909375 2023-10-03 14:04:40,187 WARNING [train.py:1204] (2/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_237-58433-0_sp1.1 from training. Duration: 0.67275 2023-10-03 14:04:43,244 WARNING [train.py:1204] (2/4) Exclude cut with ID _5190_210_4557_1_1527244368799_373439_28-197030-0_sp1.1 from training. Duration: 0.6545625 2023-10-03 14:04:43,299 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_743-327651-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 14:04:46,132 WARNING [train.py:1204] (2/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_76-203867-0_sp0.9 from training. Duration: 0.8 2023-10-03 14:04:47,480 WARNING [train.py:1204] (2/4) Exclude cut with ID _14603_210_4220_1_1530615688441_3097319_274-274798-0 from training. Duration: 0.98 2023-10-03 14:04:47,523 WARNING [train.py:1204] (2/4) Exclude cut with ID _14087_210_2253_1_1530529224512_3014029_226-242140-0_sp1.1 from training. Duration: 0.736375 2023-10-03 14:04:49,623 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.0.feed_forward1.out_proj.dropout_p, batch_count=1292980.0, ans=0.1 2023-10-03 14:04:52,251 WARNING [train.py:1204] (2/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_352-59958-0_sp1.1 from training. Duration: 0.7818125 2023-10-03 14:04:52,255 WARNING [train.py:1204] (2/4) Exclude cut with ID _39411_210_5986_1_1533538825030_3343649_191-251819-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 14:04:59,131 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7046_210_6753_1_1527895337_6920917_642-106799-0_sp1.1 from training. Duration: 0.836375 2023-10-03 14:04:59,138 WARNING [train.py:1204] (2/4) Exclude cut with ID _22826_210_9994_1_1531908201522_3279169_412-3442-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 14:04:59,164 WARNING [train.py:1204] (2/4) Exclude cut with ID _60057_210_10640_1_1534903065793_3333150_171-181133-0_sp0.9 from training. Duration: 0.97775 2023-10-03 14:04:59,173 WARNING [train.py:1204] (2/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_154-135696-0_sp1.1 from training. Duration: 0.72725 2023-10-03 14:05:02,612 WARNING [train.py:1204] (2/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_148-186805-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 14:05:02,858 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.conv_module1.balancer2.min_abs, batch_count=1293046.6666666667, ans=0.5 2023-10-03 14:05:05,351 WARNING [train.py:1204] (2/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_605-165641-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 14:05:05,363 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_174-277364-0_sp0.9 from training. Duration: 0.788875 2023-10-03 14:05:05,378 WARNING [train.py:1204] (2/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_89-321955-0_sp0.9 from training. Duration: 0.888875 2023-10-03 14:05:05,674 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.ff2_skip_rate, batch_count=1293046.6666666667, ans=0.0 2023-10-03 14:05:09,559 WARNING [train.py:1204] (2/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_438-105429-0_sp1.1 from training. Duration: 0.963625 2023-10-03 14:05:09,563 WARNING [train.py:1204] (2/4) Exclude cut with ID _14970_210_15709_1_1530773543493_1715730_187-20259-0_sp0.9 from training. Duration: 0.988875 2023-10-03 14:05:14,588 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.bypass.scale_min, batch_count=1293113.3333333333, ans=0.2 2023-10-03 14:05:15,667 WARNING [train.py:1204] (2/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_385-193052-0_sp0.9 from training. Duration: 0.9555625 2023-10-03 14:05:15,714 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7113_210_1849_1_1527942685_3448768_194-311626-0_sp1.1 from training. Duration: 0.92725 2023-10-03 14:05:20,965 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_3780_210_2087_1_1526467389_2145572_176-107140-0_sp0.9 from training. Duration: 0.7666875 2023-10-03 14:05:20,966 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_36756_210_7234_1_1533026991_966276_123-213457-0_sp1.1 from training. Duration: 0.936375 2023-10-03 14:05:24,991 WARNING [train.py:1204] (2/4) Exclude cut with ID _23557_210_8750_1_1531890106370_6846255_456-244861-0_sp1.1 from training. Duration: 0.963625 2023-10-03 14:05:26,395 WARNING [train.py:1204] (2/4) Exclude cut with ID _37086_210_9038_1_1532999540891_7021250_242-349170-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 14:05:27,724 WARNING [train.py:1204] (2/4) Exclude cut with ID _41298_210_9019_1_1533430371450_4822143_471-281351-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 14:05:27,822 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_53378_210_17282_1_1534221071_5769509_572-126176-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 14:05:29,975 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_1507-263032-0_sp1.1 from training. Duration: 0.963625 2023-10-03 14:05:30,016 WARNING [train.py:1204] (2/4) Exclude cut with ID _13679_210_7497_1_1530534293662_3355098_411-186088-0_sp1.1 from training. Duration: 0.8545625 2023-10-03 14:05:31,225 WARNING [train.py:1204] (2/4) Exclude cut with ID _29345_210_4381_1_1532689168263_3860169_149-257268-0_sp1.1 from training. Duration: 0.736375 2023-10-03 14:05:32,623 WARNING [train.py:1204] (2/4) Exclude cut with ID _47790_210_16187_1_1533862814066_4207519_381-252014-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 14:05:32,629 WARNING [train.py:1204] (2/4) Exclude cut with ID _35799_210_5854_1_1533263487887_3529071_512-2717-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 14:05:35,382 WARNING [train.py:1204] (2/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_505-205270-0_sp0.9 from training. Duration: 0.42225 2023-10-03 14:05:36,789 WARNING [train.py:1204] (2/4) Exclude cut with ID _14998_210_5219_1_1530705582080_7474970_731-20117-0_sp1.1 from training. Duration: 0.936375 2023-10-03 14:05:40,165 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_37351_210_7925_1_1533012931_3812113_265-85780-0_sp1.1 from training. Duration: 0.8 2023-10-03 14:05:40,173 WARNING [train.py:1204] (2/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_313-134930-0 from training. Duration: 0.97 2023-10-03 14:05:41,438 INFO [train.py:1046] (2/4) Epoch 37, batch 2750, loss[loss=0.168, simple_loss=0.2528, pruned_loss=0.04163, over 24102.00 frames. ], tot_loss[loss=0.1599, simple_loss=0.2398, pruned_loss=0.04004, over 4705730.46 frames. ], batch size: 86, lr: 2.75e-03, grad_scale: 8.0 2023-10-03 14:05:42,853 WARNING [train.py:1204] (2/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_374-138971-0 from training. Duration: 0.94 2023-10-03 14:05:42,895 WARNING [train.py:1204] (2/4) Exclude cut with ID _30304_210_6319_1_1532476827261_7582060_1227-287886-0_sp1.1 from training. Duration: 0.936375 2023-10-03 14:05:44,312 WARNING [train.py:1204] (2/4) Exclude cut with ID _59493_210_17282_1_1535078419077_3225879_332-217544-0_sp1.1 from training. Duration: 0.97275 2023-10-03 14:05:44,347 WARNING [train.py:1204] (2/4) Exclude cut with ID _23514_210_1389_1_1531832139785_3031439_361-119640-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 14:05:48,782 WARNING [train.py:1204] (2/4) Exclude cut with ID _69584_210_5219_1_1535785133775_4044450_142-118058-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 14:05:48,802 WARNING [train.py:1204] (2/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_178-177582-0_sp1.1 from training. Duration: 0.67275 2023-10-03 14:05:48,849 WARNING [train.py:1204] (2/4) Exclude cut with ID _25303_210_17519_1_1532050207576_3635970_41-195572-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 14:05:52,017 WARNING [train.py:1204] (2/4) Exclude cut with ID _55328_210_27767_1_1534580973260_4088170_329-341322-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 14:05:53,411 WARNING [train.py:1204] (2/4) Exclude cut with ID _5608_210_2106_1_1527415439089_6819768_909-82767-0_sp1.1 from training. Duration: 0.5545625 2023-10-03 14:05:54,789 WARNING [train.py:1204] (2/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_261-297503-0_sp1.1 from training. Duration: 0.9 2023-10-03 14:05:54,796 WARNING [train.py:1204] (2/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_459-291706-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 14:05:54,797 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7762_210_1794_1_1528199508_4057408_422-227034-0 from training. Duration: 0.73 2023-10-03 14:05:54,808 WARNING [train.py:1204] (2/4) Exclude cut with ID _3067_210_2667_1_1526212974640_4503168_250-285937-0_sp0.9 from training. Duration: 0.888875 2023-10-03 14:05:54,823 WARNING [train.py:1204] (2/4) Exclude cut with ID _66873_210_5517_1_1535454136506_4293370_332-253745-0_sp1.1 from training. Duration: 0.936375 2023-10-03 14:06:00,830 WARNING [train.py:1204] (2/4) Exclude cut with ID _13938_210_9988_1_1530518718978_3693960_304-112784-0 from training. Duration: 0.46 2023-10-03 14:06:02,221 WARNING [train.py:1204] (2/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_226-267957-0_sp1.1 from training. Duration: 0.92725 2023-10-03 14:06:03,564 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_41822_210_13270_1_1533955976_4379284_608-254567-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 14:06:03,633 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_124-113562-0_sp1.1 from training. Duration: 0.8545625 2023-10-03 14:06:03,663 WARNING [train.py:1204] (2/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_117-290916-0_sp1.1 from training. Duration: 0.463625 2023-10-03 14:06:03,884 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.feed_forward1.hidden_balancer.prob, batch_count=1293313.3333333333, ans=0.125 2023-10-03 14:06:05,067 WARNING [train.py:1204] (2/4) Exclude cut with ID _28427_210_4220_1_1532581240967_3061069_102-66533-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 14:06:06,441 WARNING [train.py:1204] (2/4) Exclude cut with ID _55383_210_10635_1_1534468426172_2231669_134-128246-0_sp1.1 from training. Duration: 0.8090625 2023-10-03 14:06:06,493 WARNING [train.py:1204] (2/4) Exclude cut with ID _45871_210_5665_1_1533864614245_3296060_49-308316-0_sp1.1 from training. Duration: 0.97275 2023-10-03 14:06:07,766 WARNING [train.py:1204] (2/4) Exclude cut with ID _57572_210_5517_1_1534921219624_3809484_176-199205-0_sp1.1 from training. Duration: 0.97275 2023-10-03 14:06:08,509 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.2.encoder.layers.2.conv_module2.whiten, num_groups=1, num_channels=384, metric=6.91 vs. limit=15.0 2023-10-03 14:06:11,298 WARNING [train.py:1204] (2/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_45-265684-0_sp1.1 from training. Duration: 0.6545625 2023-10-03 14:06:11,326 WARNING [train.py:1204] (2/4) Exclude cut with ID _15150_210_8341_1_1530752411803_7256384_469-239509-0_sp0.9 from training. Duration: 0.7555625 2023-10-03 14:06:12,526 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.542e+02 1.957e+02 2.205e+02 2.466e+02 3.615e+02, threshold=4.410e+02, percent-clipped=0.0 2023-10-03 14:06:12,646 WARNING [train.py:1204] (2/4) Exclude cut with ID _28461_210_2253_1_1532771775189_3041299_136-270644-0_sp1.1 from training. Duration: 0.8454375 2023-10-03 14:06:14,132 WARNING [train.py:1204] (2/4) Exclude cut with ID _69584_210_5219_1_1535785133775_4044450_302-47302-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 14:06:15,517 WARNING [train.py:1204] (2/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_166-128941-0_sp1.1 from training. Duration: 0.4909375 2023-10-03 14:06:17,027 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.self_attn_weights.pos_emb_skip_rate, batch_count=1293380.0, ans=0.0 2023-10-03 14:06:20,498 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.2.encoder.layers.1.nonlin_attention.whiten2, num_groups=1, num_channels=384, metric=5.64 vs. limit=15.0 2023-10-03 14:06:21,577 WARNING [train.py:1204] (2/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_559-120504-0_sp1.1 from training. Duration: 0.97275 2023-10-03 14:06:23,031 WARNING [train.py:1204] (2/4) Exclude cut with ID _6456_210_1534_1_1527681634212_3181049_345-252877-0_sp1.1 from training. Duration: 0.5818125 2023-10-03 14:06:23,057 WARNING [train.py:1204] (2/4) Exclude cut with ID _28317_210_12742_1_1532480302381_8510325_902-233003-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 14:06:27,717 WARNING [train.py:1204] (2/4) Exclude cut with ID _36538_210_9994_1_1533204020363_3795620_204-261385-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 14:06:27,721 WARNING [train.py:1204] (2/4) Exclude cut with ID _14821_210_8750_1_1530680503991_6829359_189-297451-0_sp1.1 from training. Duration: 0.5 2023-10-03 14:06:28,409 WARNING [train.py:1204] (2/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_1211-226066-0_sp0.9 from training. Duration: 0.8444375 2023-10-03 14:06:33,860 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6786_210_1388_1_1527839699_4194495_273-149841-0_sp1.1 from training. Duration: 0.836375 2023-10-03 14:06:35,075 WARNING [train.py:1204] (2/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_380-162648-0_sp1.1 from training. Duration: 0.8545625 2023-10-03 14:06:35,076 WARNING [train.py:1204] (2/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_494-199538-0 from training. Duration: 0.75 2023-10-03 14:06:38,073 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.balancer2.prob, batch_count=1293446.6666666667, ans=0.125 2023-10-03 14:06:39,771 WARNING [train.py:1204] (2/4) Exclude cut with ID _49097_210_16049_1_1533952780207_4360979_276-87053-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 14:06:41,159 WARNING [train.py:1204] (2/4) Exclude cut with ID _5444_210_2151_1_1527418808464_5312210_726-27165-0 from training. Duration: 0.67 2023-10-03 14:06:41,522 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.feed_forward2.hidden_balancer.prob, batch_count=1293513.3333333333, ans=0.125 2023-10-03 14:06:44,188 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_128-202104-0_sp1.1 from training. Duration: 0.57275 2023-10-03 14:06:45,621 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_11349_210_6753_1_1529800861_1101207_26-65602-0_sp1.1 from training. Duration: 0.87275 2023-10-03 14:06:45,645 WARNING [train.py:1204] (2/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_159-328157-0 from training. Duration: 0.52 2023-10-03 14:06:45,808 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.feed_forward2.hidden_balancer.prob, batch_count=1293513.3333333333, ans=0.125 2023-10-03 14:06:46,947 WARNING [train.py:1204] (2/4) Exclude cut with ID _14069_210_5632_1_1530512692033_4803390_98-21934-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 14:06:48,340 WARNING [train.py:1204] (2/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_352-59958-0_sp0.9 from training. Duration: 0.9555625 2023-10-03 14:06:48,374 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6163_210_6400_1_1527506746_4263068_283-137811-0 from training. Duration: 0.77 2023-10-03 14:06:48,416 WARNING [train.py:1204] (2/4) Exclude cut with ID _21989_210_5854_1_1531899214931_3271360_133-44759-0_sp1.1 from training. Duration: 0.7 2023-10-03 14:06:54,263 WARNING [train.py:1204] (2/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_21-174514-0_sp0.9 from training. Duration: 0.47775 2023-10-03 14:06:54,296 WARNING [train.py:1204] (2/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_201-78811-0_sp1.1 from training. Duration: 0.963625 2023-10-03 14:06:54,321 WARNING [train.py:1204] (2/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_537-164630-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 14:06:54,989 WARNING [train.py:1204] (2/4) Exclude cut with ID _14087_210_2253_1_1530529224512_3014029_156-257607-0 from training. Duration: 0.83 2023-10-03 14:06:55,000 WARNING [train.py:1204] (2/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_308-154450-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 14:06:56,291 INFO [train.py:1046] (2/4) Epoch 37, batch 2800, loss[loss=0.1631, simple_loss=0.2502, pruned_loss=0.03797, over 24356.00 frames. ], tot_loss[loss=0.1587, simple_loss=0.2376, pruned_loss=0.03994, over 4687784.63 frames. ], batch size: 77, lr: 2.75e-03, grad_scale: 16.0 2023-10-03 14:06:56,333 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_45399_210_4852_1_1533624725_7579737_214-240574-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 14:06:57,768 WARNING [train.py:1204] (2/4) Exclude cut with ID _29303_210_2575_1_1532392237303_7410649_615-305517-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 14:06:59,095 WARNING [train.py:1204] (2/4) Exclude cut with ID IC0125W0002-61808 from training. Duration: 0.708 2023-10-03 14:06:59,096 WARNING [train.py:1204] (2/4) Exclude cut with ID IC0125W0017-61823 from training. Duration: 0.759 2023-10-03 14:06:59,912 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.2.encoder.layers.1.feed_forward1.out_whiten, num_groups=1, num_channels=384, metric=6.80 vs. limit=15.0 2023-10-03 14:07:01,868 WARNING [train.py:1204] (2/4) Exclude cut with ID _53219_210_4525_1_1534834836008_4064825_4-145291-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 14:07:03,294 WARNING [train.py:1204] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_185-174805-0_sp1.1 from training. Duration: 0.6181875 2023-10-03 14:07:03,311 WARNING [train.py:1204] (2/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_635-193046-0_sp1.1 from training. Duration: 0.92725 2023-10-03 14:07:06,087 WARNING [train.py:1204] (2/4) Exclude cut with ID _46067_210_4220_1_1534125571066_3307450_321-258866-0_sp1.1 from training. Duration: 0.97275 2023-10-03 14:07:09,225 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8558_210_1410_1_1528505225_7613406_131-329524-0 from training. Duration: 0.79 2023-10-03 14:07:10,669 WARNING [train.py:1204] (2/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_847-41334-0_sp0.9 from training. Duration: 0.488875 2023-10-03 14:07:12,087 WARNING [train.py:1204] (2/4) Exclude cut with ID _14227_210_4381_1_1530701804286_3940259_125-275642-0 from training. Duration: 0.99 2023-10-03 14:07:12,190 WARNING [train.py:1204] (2/4) Exclude cut with ID _25248_210_8750_1_1532062825941_5554180_248-177255-0_sp1.1 from training. Duration: 0.963625 2023-10-03 14:07:12,464 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=1293646.6666666667, ans=0.1 2023-10-03 14:07:13,512 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_40047_210_2290_1_1533290519_2374311_195-165693-0_sp1.1 from training. Duration: 0.8090625 2023-10-03 14:07:13,516 WARNING [train.py:1204] (2/4) Exclude cut with ID _23109_210_4381_1_1532150828777_901949_29-258672-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 14:07:17,557 WARNING [train.py:1204] (2/4) Exclude cut with ID _14821_210_8750_1_1530680503991_6829359_2-227151-0_sp1.1 from training. Duration: 0.7545625 2023-10-03 14:07:17,591 WARNING [train.py:1204] (2/4) Exclude cut with ID _49643_210_7989_1_1534640244224_3939031_565-53982-0_sp1.1 from training. Duration: 0.963625 2023-10-03 14:07:17,595 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7544_210_6753_1_1528591327_1471426_33-346031-0_sp0.9 from training. Duration: 0.67775 2023-10-03 14:07:19,461 WARNING [train.py:1204] (2/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_95-61782-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 14:07:19,965 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.5.encoder.layers.1.self_attn_weights.whiten_keys, num_groups=4, num_channels=128, metric=1.51 vs. limit=6.0 2023-10-03 14:07:26,983 WARNING [train.py:1204] (2/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_686-54383-0_sp0.9 from training. Duration: 0.9666875 2023-10-03 14:07:27,808 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.conv_skip_rate, batch_count=1293713.3333333333, ans=0.0 2023-10-03 14:07:28,926 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_14056_210_4163_1_1530604483_7572827_505-276823-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 14:07:30,432 WARNING [train.py:1204] (2/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_745-128443-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 14:07:31,818 WARNING [train.py:1204] (2/4) Exclude cut with ID _3844_210_1390_1_1526727803582_2871209_20-251664-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 14:07:31,877 WARNING [train.py:1204] (2/4) Exclude cut with ID _36538_210_9994_1_1533204020363_3795620_487-164628-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 14:07:33,364 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.ff2_skip_rate, batch_count=1293713.3333333333, ans=0.0 2023-10-03 14:07:37,234 WARNING [train.py:1204] (2/4) Exclude cut with ID _5166_210_2084_1_1527073197160_4087993_309-222545-0_sp1.1 from training. Duration: 0.763625 2023-10-03 14:07:37,253 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_3531_210_3528_1_1526294203_5216456_309-227302-0 from training. Duration: 0.69 2023-10-03 14:07:39,097 WARNING [train.py:1204] (2/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_42-192460-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 14:07:40,498 WARNING [train.py:1204] (2/4) Exclude cut with ID _22083_210_8254_1_1531702862439_3678970_49-234546-0_sp1.1 from training. Duration: 0.7545625 2023-10-03 14:07:40,502 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_427-78097-0_sp0.9 from training. Duration: 0.888875 2023-10-03 14:07:40,714 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.feed_forward3.hidden_balancer.prob, batch_count=1293780.0, ans=0.125 2023-10-03 14:07:42,261 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.ff3_skip_rate, batch_count=1293780.0, ans=0.0 2023-10-03 14:07:43,276 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_43802_210_2290_1_1533516203_8829504_378-205254-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 14:07:43,324 WARNING [train.py:1204] (2/4) Exclude cut with ID _45329_210_7005_1_1534157951211_6774339_672-314830-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 14:07:46,165 WARNING [train.py:1204] (2/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_90-30673-0_sp1.1 from training. Duration: 0.763625 2023-10-03 14:07:49,347 WARNING [train.py:1204] (2/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_563-329943-0_sp1.1 from training. Duration: 0.9 2023-10-03 14:07:49,369 WARNING [train.py:1204] (2/4) Exclude cut with ID _21632_210_2253_1_1531994001765_3744140_154-84090-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 14:07:49,370 WARNING [train.py:1204] (2/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_103-336064-0_sp0.9 from training. Duration: 0.5666875 2023-10-03 14:07:51,109 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_43-71989-0_sp0.9 from training. Duration: 0.6333125 2023-10-03 14:07:52,449 WARNING [train.py:1204] (2/4) Exclude cut with ID _3867_210_1883_1_1526641200575_4475830_319-349850-0_sp0.9 from training. Duration: 0.7444375 2023-10-03 14:07:53,786 WARNING [train.py:1204] (2/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_629-179454-0_sp1.1 from training. Duration: 0.963625 2023-10-03 14:07:53,794 WARNING [train.py:1204] (2/4) Exclude cut with ID _65263_210_22356_1_1535763598691_3921037_49-222917-0 from training. Duration: 0.92 2023-10-03 14:07:53,817 WARNING [train.py:1204] (2/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_602-331772-0_sp1.1 from training. Duration: 0.936375 2023-10-03 14:07:53,916 WARNING [train.py:1204] (2/4) Exclude cut with ID _58414_210_8254_1_1535421609162_7151182_292-134978-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 14:07:55,179 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_38793_210_2544_1_1533176114_3849320_200-299292-0_sp1.1 from training. Duration: 0.936375 2023-10-03 14:07:55,470 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.feed_forward1.hidden_balancer.prob, batch_count=1293846.6666666667, ans=0.125 2023-10-03 14:07:56,580 WARNING [train.py:1204] (2/4) Exclude cut with ID _14970_210_15709_1_1530775547361_1966965_6-23328-0 from training. Duration: 0.86 2023-10-03 14:07:58,026 WARNING [train.py:1204] (2/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_386-225511-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 14:07:58,037 WARNING [train.py:1204] (2/4) Exclude cut with ID _38323_210_12108_1_1533121233369_3303049_416-11919-0_sp1.1 from training. Duration: 0.87275 2023-10-03 14:07:58,087 WARNING [train.py:1204] (2/4) Exclude cut with ID _4866_210_2577_1_1527404412284_4364659_256-306830-0_sp1.1 from training. Duration: 0.7909375 2023-10-03 14:08:00,166 WARNING [train.py:1204] (2/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_53-300544-0 from training. Duration: 0.67 2023-10-03 14:08:06,803 WARNING [train.py:1204] (2/4) Exclude cut with ID _41314_210_12651_1_1533459476543_3766460_80-74184-0_sp1.1 from training. Duration: 0.7545625 2023-10-03 14:08:06,825 WARNING [train.py:1204] (2/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_332-256168-0_sp1.1 from training. Duration: 0.6818125 2023-10-03 14:08:06,878 WARNING [train.py:1204] (2/4) Exclude cut with ID _48836_210_12210_1_1534075848293_3904150_429-35475-0_sp1.1 from training. Duration: 0.8090625 2023-10-03 14:08:09,382 INFO [train.py:1046] (2/4) Epoch 37, batch 2850, loss[loss=0.1387, simple_loss=0.2166, pruned_loss=0.03043, over 24358.00 frames. ], tot_loss[loss=0.1587, simple_loss=0.2378, pruned_loss=0.03982, over 4701666.29 frames. ], batch size: 56, lr: 2.75e-03, grad_scale: 16.0 2023-10-03 14:08:10,764 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_144-135833-0_sp1.1 from training. Duration: 0.97275 2023-10-03 14:08:11,801 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.0.layers.1.self_attn2.whiten, num_groups=1, num_channels=192, metric=10.96 vs. limit=22.5 2023-10-03 14:08:12,794 WARNING [train.py:1204] (2/4) Exclude cut with ID _14224_210_4381_1_1530529062037_4264010_380-343670-0_sp0.9 from training. Duration: 0.97775 2023-10-03 14:08:14,133 WARNING [train.py:1204] (2/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_213-265174-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 14:08:14,162 WARNING [train.py:1204] (2/4) Exclude cut with ID _20455_210_5219_1_1531565996775_8098100_86-314283-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 14:08:17,001 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_41750_210_1960_1_1533520678_3948042_68-223813-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 14:08:18,406 WARNING [train.py:1204] (2/4) Exclude cut with ID _55153_210_2253_1_1534384786677_3162096_24-198429-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 14:08:19,763 WARNING [train.py:1204] (2/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_119-207162-0_sp0.9 from training. Duration: 0.788875 2023-10-03 14:08:19,926 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.1.conv_module1.balancer1.prob, batch_count=1293913.3333333333, ans=0.125 2023-10-03 14:08:21,708 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_12868_210_6753_1_1530702479_3610564_148-172861-0 from training. Duration: 0.5 2023-10-03 14:08:26,454 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_5522_210_3273_1_1527249606_3909096_318-203153-0 from training. Duration: 0.43 2023-10-03 14:08:26,467 WARNING [train.py:1204] (2/4) Exclude cut with ID _14884_210_2252_1_1530707459689_5561250_288-189949-0_sp1.1 from training. Duration: 0.97275 2023-10-03 14:08:28,011 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.conv_module2.balancer1.min_positive, batch_count=1293980.0, ans=0.025 2023-10-03 14:08:29,208 WARNING [train.py:1204] (2/4) Exclude cut with ID _5294_210_1747_1_1527249643928_3696179_529-242298-0 from training. Duration: 0.95 2023-10-03 14:08:29,275 WARNING [train.py:1204] (2/4) Exclude cut with ID _36538_210_9994_1_1533204020363_3795620_503-110289-0_sp1.1 from training. Duration: 0.936375 2023-10-03 14:08:32,016 WARNING [train.py:1204] (2/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_385-193052-0 from training. Duration: 0.86 2023-10-03 14:08:32,068 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_456-22141-0 from training. Duration: 0.88 2023-10-03 14:08:34,268 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_38965_210_4852_1_1533174906_3484735_284-117840-0_sp1.1 from training. Duration: 0.936375 2023-10-03 14:08:41,022 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.577e+02 1.871e+02 2.042e+02 2.334e+02 3.256e+02, threshold=4.084e+02, percent-clipped=0.0 2023-10-03 14:08:45,879 WARNING [train.py:1204] (2/4) Exclude cut with ID _32704_210_8954_1_1533017488840_6747370_75-201625-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 14:08:47,265 WARNING [train.py:1204] (2/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_255-312654-0_sp1.1 from training. Duration: 0.82725 2023-10-03 14:08:47,278 WARNING [train.py:1204] (2/4) Exclude cut with ID _47251_210_18628_1_1533814063128_4137250_214-88076-0_sp0.9 from training. Duration: 0.97775 2023-10-03 14:08:48,497 WARNING [train.py:1204] (2/4) Exclude cut with ID _4092_210_2151_1_1526643044449_3135759_227-213972-0_sp1.1 from training. Duration: 0.4909375 2023-10-03 14:08:48,513 WARNING [train.py:1204] (2/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_912-47473-0_sp1.1 from training. Duration: 0.6181875 2023-10-03 14:08:48,542 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6767_210_3273_1_1527855783_1718932_173-256601-0_sp1.1 from training. Duration: 0.72725 2023-10-03 14:08:50,460 WARNING [train.py:1204] (2/4) Exclude cut with ID _6274_210_2084_1_1527937583837_7494020_32-205288-0_sp1.1 from training. Duration: 0.7090625 2023-10-03 14:08:50,497 WARNING [train.py:1204] (2/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_39-248870-0 from training. Duration: 0.69 2023-10-03 14:08:53,228 WARNING [train.py:1204] (2/4) Exclude cut with ID _3541_210_2477_1_1526475408709_3263973_377-296839-0_sp1.1 from training. Duration: 0.5 2023-10-03 14:08:53,243 WARNING [train.py:1204] (2/4) Exclude cut with ID _13679_210_7497_1_1530534293662_3355098_413-56154-0_sp1.1 from training. Duration: 0.963625 2023-10-03 14:08:53,296 WARNING [train.py:1204] (2/4) Exclude cut with ID _14970_210_15709_1_1530775547361_1966965_201-84544-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 14:08:55,172 WARNING [train.py:1204] (2/4) Exclude cut with ID _21783_210_11357_1_1531742458730_3762679_105-95775-0_sp1.1 from training. Duration: 0.936375 2023-10-03 14:08:56,625 WARNING [train.py:1204] (2/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_131-191669-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 14:08:57,943 WARNING [train.py:1204] (2/4) Exclude cut with ID _14860_210_4915_1_1530702020340_7343240_695-4323-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 14:08:58,050 WARNING [train.py:1204] (2/4) Exclude cut with ID _39867_210_9826_1_1533553222030_9344259_991-130565-0_sp1.1 from training. Duration: 0.97275 2023-10-03 14:09:00,928 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_50-3289-0_sp1.1 from training. Duration: 0.82725 2023-10-03 14:09:02,930 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_54-347670-0_sp1.1 from training. Duration: 0.92725 2023-10-03 14:09:02,988 WARNING [train.py:1204] (2/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_200-153445-0_sp1.1 from training. Duration: 0.936375 2023-10-03 14:09:04,383 WARNING [train.py:1204] (2/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_279-302911-0_sp1.1 from training. Duration: 0.97275 2023-10-03 14:09:07,106 WARNING [train.py:1204] (2/4) Exclude cut with ID _14886_210_4220_1_1530702020355_3516190_159-73748-0_sp0.9 from training. Duration: 0.92225 2023-10-03 14:09:10,259 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.conv_skip_rate, batch_count=1294180.0, ans=0.0 2023-10-03 14:09:12,927 WARNING [train.py:1204] (2/4) Exclude cut with ID _53379_210_20034_1_1534222027363_3854654_23-222715-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 14:09:13,050 WARNING [train.py:1204] (2/4) Exclude cut with ID _12555_210_8750_1_1530075594402_3780239_449-267986-0 from training. Duration: 0.83 2023-10-03 14:09:13,061 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7544_210_6753_1_1528587722_3576686_295-258479-0 from training. Duration: 0.84 2023-10-03 14:09:16,317 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7002_210_1794_1_1527935634_4034444_487-236501-0_sp0.9 from training. Duration: 0.7333125 2023-10-03 14:09:16,366 WARNING [train.py:1204] (2/4) Exclude cut with ID _21783_210_11357_1_1531742458730_3762679_244-19107-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 14:09:16,378 WARNING [train.py:1204] (2/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_390-339137-0 from training. Duration: 0.56 2023-10-03 14:09:16,426 WARNING [train.py:1204] (2/4) Exclude cut with ID _7562_210_2084_1_1528887767916_3946229_186-326422-0_sp1.1 from training. Duration: 0.763625 2023-10-03 14:09:16,577 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.0.bypass_mid.scale_min, batch_count=1294180.0, ans=0.2 2023-10-03 14:09:17,750 WARNING [train.py:1204] (2/4) Exclude cut with ID _33640_210_2655_1_1533117605479_4855469_645-145235-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 14:09:17,773 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_25767_210_2161_1_1532307483_3867748_506-171777-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 14:09:19,131 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_5438_210_1794_1_1527336687_2600009_233-58072-0_sp1.1 from training. Duration: 0.7 2023-10-03 14:09:19,131 WARNING [train.py:1204] (2/4) Exclude cut with ID IC0669W0063-330908 from training. Duration: 0.896 2023-10-03 14:09:19,173 WARNING [train.py:1204] (2/4) Exclude cut with ID IC0671W0070-331913 from training. Duration: 0.896 2023-10-03 14:09:19,176 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_258-310570-0_sp1.1 from training. Duration: 0.7454375 2023-10-03 14:09:19,237 WARNING [train.py:1204] (2/4) Exclude cut with ID _9461_210_4557_1_1529060514726_3677451_162-177507-0_sp1.1 from training. Duration: 0.97275 2023-10-03 14:09:19,378 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.1.bypass.skip_rate, batch_count=1294180.0, ans=0.035 2023-10-03 14:09:23,856 WARNING [train.py:1204] (2/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_26-175553-0_sp0.9 from training. Duration: 0.67775 2023-10-03 14:09:23,873 WARNING [train.py:1204] (2/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_7-150481-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 14:09:25,126 INFO [train.py:1046] (2/4) Epoch 37, batch 2900, loss[loss=0.1494, simple_loss=0.2368, pruned_loss=0.03099, over 23267.00 frames. ], tot_loss[loss=0.1588, simple_loss=0.2377, pruned_loss=0.03994, over 4699953.40 frames. ], batch size: 105, lr: 2.75e-03, grad_scale: 16.0 2023-10-03 14:09:25,189 WARNING [train.py:1204] (2/4) Exclude cut with ID _23557_210_8750_1_1531890106370_6846255_72-273262-0_sp1.1 from training. Duration: 0.963625 2023-10-03 14:09:26,961 WARNING [train.py:1204] (2/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_367-61330-0 from training. Duration: 0.75 2023-10-03 14:09:31,015 WARNING [train.py:1204] (2/4) Exclude cut with ID _39867_210_9826_1_1533553222030_9344259_1337-11141-0_sp1.1 from training. Duration: 0.936375 2023-10-03 14:09:31,041 WARNING [train.py:1204] (2/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_101-293367-0 from training. Duration: 0.63 2023-10-03 14:09:32,384 WARNING [train.py:1204] (2/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_10-100367-0 from training. Duration: 0.98 2023-10-03 14:09:33,847 WARNING [train.py:1204] (2/4) Exclude cut with ID _60057_210_10640_1_1534903065793_3333150_171-181133-0_sp1.1 from training. Duration: 0.8 2023-10-03 14:09:33,850 WARNING [train.py:1204] (2/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_844-96989-0_sp0.9 from training. Duration: 0.788875 2023-10-03 14:09:35,852 WARNING [train.py:1204] (2/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_301-144595-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 14:09:35,952 WARNING [train.py:1204] (2/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_115-147343-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 14:09:38,688 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_3171_210_2087_1_1526109084_2330007_37-64334-0_sp1.1 from training. Duration: 0.7454375 2023-10-03 14:09:40,031 WARNING [train.py:1204] (2/4) Exclude cut with ID _19514_210_9259_1_1531530071330_5084230_769-272914-0_sp1.1 from training. Duration: 0.936375 2023-10-03 14:09:41,502 WARNING [train.py:1204] (2/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_76-311963-0_sp0.9 from training. Duration: 0.77775 2023-10-03 14:09:41,546 WARNING [train.py:1204] (2/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_526-62079-0 from training. Duration: 0.67 2023-10-03 14:09:41,592 WARNING [train.py:1204] (2/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_7-111954-0_sp0.9 from training. Duration: 0.62225 2023-10-03 14:09:44,816 WARNING [train.py:1204] (2/4) Exclude cut with ID _57997_210_4163_1_1534766425889_3961049_360-279140-0_sp1.1 from training. Duration: 0.97275 2023-10-03 14:09:46,320 WARNING [train.py:1204] (2/4) Exclude cut with ID _4866_210_2577_1_1527404412284_4364659_133-1696-0 from training. Duration: 0.99 2023-10-03 14:09:47,444 WARNING [train.py:1204] (2/4) Exclude cut with ID _5190_210_4557_1_1527244368799_373439_28-197030-0 from training. Duration: 0.72 2023-10-03 14:09:48,886 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_53-39003-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 14:09:48,889 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6673_210_3273_1_1527767790_3173240_351-167954-0 from training. Duration: 0.48 2023-10-03 14:09:48,904 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_2822_210_2715_1_1526112586_1999587_165-36656-0_sp1.1 from training. Duration: 0.8090625 2023-10-03 14:09:52,138 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_328-264892-0_sp1.1 from training. Duration: 0.92725 2023-10-03 14:09:52,143 WARNING [train.py:1204] (2/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_29-293896-0_sp0.9 from training. Duration: 0.67775 2023-10-03 14:09:55,314 WARNING [train.py:1204] (2/4) Exclude cut with ID _65263_210_22356_1_1535763598691_3921037_465-55448-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 14:09:56,660 WARNING [train.py:1204] (2/4) Exclude cut with ID _21989_210_5854_1_1531899214931_3271360_311-53106-0_sp1.1 from training. Duration: 0.97275 2023-10-03 14:09:59,355 WARNING [train.py:1204] (2/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_389-277915-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 14:10:00,812 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_69630_210_7234_1_1535791094_677837_63-277139-0_sp1.1 from training. Duration: 0.963625 2023-10-03 14:10:04,095 WARNING [train.py:1204] (2/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_85-138257-0 from training. Duration: 0.71 2023-10-03 14:10:04,112 WARNING [train.py:1204] (2/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_686-247600-0 from training. Duration: 0.92 2023-10-03 14:10:04,117 WARNING [train.py:1204] (2/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_108-255430-0_sp1.1 from training. Duration: 0.8818125 2023-10-03 14:10:06,886 WARNING [train.py:1204] (2/4) Exclude cut with ID _14884_210_2252_1_1530707459689_5561250_114-201399-0_sp1.1 from training. Duration: 0.6909375 2023-10-03 14:10:08,302 WARNING [train.py:1204] (2/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_506-42614-0 from training. Duration: 0.79 2023-10-03 14:10:09,841 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_43802_210_2290_1_1533516203_8829504_85-195240-0_sp1.1 from training. Duration: 0.7909375 2023-10-03 14:10:14,534 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_141-36243-0_sp1.1 from training. Duration: 0.97275 2023-10-03 14:10:19,107 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.bypass_mid.scale_min, batch_count=1294446.6666666667, ans=0.2 2023-10-03 14:10:21,792 WARNING [train.py:1204] (2/4) Exclude cut with ID _7852_210_5219_1_1528286471316_8139400_358-287492-0_sp1.1 from training. Duration: 0.87275 2023-10-03 14:10:21,817 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_805-71497-0_sp0.9 from training. Duration: 0.788875 2023-10-03 14:10:24,985 WARNING [train.py:1204] (2/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_178-39722-0 from training. Duration: 0.49 2023-10-03 14:10:27,817 WARNING [train.py:1204] (2/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_428-181925-0_sp1.1 from training. Duration: 0.963625 2023-10-03 14:10:27,830 WARNING [train.py:1204] (2/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_770-200430-0 from training. Duration: 0.89 2023-10-03 14:10:27,876 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_1104-271009-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 14:10:27,926 WARNING [train.py:1204] (2/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_128-296581-0_sp0.9 from training. Duration: 0.82225 2023-10-03 14:10:36,597 WARNING [train.py:1204] (2/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_19-62511-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 14:10:38,009 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_805-71497-0 from training. Duration: 0.71 2023-10-03 14:10:39,301 INFO [train.py:1046] (2/4) Epoch 37, batch 2950, loss[loss=0.1698, simple_loss=0.2443, pruned_loss=0.04769, over 23293.00 frames. ], tot_loss[loss=0.1596, simple_loss=0.2388, pruned_loss=0.04016, over 4705447.44 frames. ], batch size: 119, lr: 2.75e-03, grad_scale: 16.0 2023-10-03 14:10:39,398 WARNING [train.py:1204] (2/4) Exclude cut with ID _44778_210_2054_1_1533798127203_7443090_573-152077-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 14:10:39,403 WARNING [train.py:1204] (2/4) Exclude cut with ID _42875_210_14856_1_1533704397487_4012433_393-177816-0_sp1.1 from training. Duration: 0.963625 2023-10-03 14:10:39,629 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.self_attn_weights.pos_emb_skip_rate, batch_count=1294580.0, ans=0.0 2023-10-03 14:10:40,792 WARNING [train.py:1204] (2/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_278-35159-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 14:10:41,665 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.1.encoder.layers.1.self_attn_weights.whiten_keys, num_groups=4, num_channels=128, metric=2.80 vs. limit=6.0 2023-10-03 14:10:42,143 WARNING [train.py:1204] (2/4) Exclude cut with ID _15037_210_2253_1_1530752294104_3267650_13-130361-0_sp1.1 from training. Duration: 0.9 2023-10-03 14:10:43,975 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_3019_210_2028_1_1526044245_3264865_127-135615-0 from training. Duration: 0.94 2023-10-03 14:10:45,250 WARNING [train.py:1204] (2/4) Exclude cut with ID _28317_210_12742_1_1532480302381_8510325_607-213951-0 from training. Duration: 0.84 2023-10-03 14:10:45,320 WARNING [train.py:1204] (2/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_436-318113-0_sp0.9 from training. Duration: 0.8333125 2023-10-03 14:10:45,320 WARNING [train.py:1204] (2/4) Exclude cut with ID _13938_210_9988_1_1530518718978_3693960_148-152225-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 14:10:49,496 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_2541_210_1794_1_1525590269_3084765_156-173713-0_sp1.1 from training. Duration: 0.736375 2023-10-03 14:10:50,880 WARNING [train.py:1204] (2/4) Exclude cut with ID _28323_210_16187_1_1532480395667_4100300_531-24157-0_sp1.1 from training. Duration: 0.8909375 2023-10-03 14:10:52,909 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.0.bypass_mid.scale_min, batch_count=1294646.6666666667, ans=0.2 2023-10-03 14:10:54,033 WARNING [train.py:1204] (2/4) Exclude cut with ID _29303_210_2575_1_1532392237303_7410649_382-217492-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 14:10:54,080 WARNING [train.py:1204] (2/4) Exclude cut with ID _58895_210_8750_1_1535184081320_3649089_270-158013-0_sp1.1 from training. Duration: 0.92725 2023-10-03 14:10:58,874 WARNING [train.py:1204] (2/4) Exclude cut with ID _29277_210_3279_1_1532581109293_3173069_311-206227-0_sp1.1 from training. Duration: 0.936375 2023-10-03 14:11:00,105 WARNING [train.py:1204] (2/4) Exclude cut with ID _7160_210_1876_1_1527940923231_3703799_282-322611-0_sp0.9 from training. Duration: 0.9666875 2023-10-03 14:11:01,501 WARNING [train.py:1204] (2/4) Exclude cut with ID _53219_210_4525_1_1534834836008_4064825_562-32295-0_sp1.1 from training. Duration: 0.963625 2023-10-03 14:11:02,860 WARNING [train.py:1204] (2/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_408-87330-0_sp1.1 from training. Duration: 0.963625 2023-10-03 14:11:02,874 WARNING [train.py:1204] (2/4) Exclude cut with ID _59202_210_26984_1_1534816810612_3549174_137-318710-0_sp1.1 from training. Duration: 0.8090625 2023-10-03 14:11:03,082 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=1294646.6666666667, ans=0.1 2023-10-03 14:11:04,320 WARNING [train.py:1204] (2/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_372-30302-0 from training. Duration: 0.86 2023-10-03 14:11:09,001 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7543_210_6753_1_1528541907_1399322_178-56269-0_sp1.1 from training. Duration: 0.95275 2023-10-03 14:11:10,187 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.633e+02 1.983e+02 2.182e+02 2.466e+02 3.781e+02, threshold=4.363e+02, percent-clipped=0.0 2023-10-03 14:11:10,279 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9109W0012-549758 from training. Duration: 0.512 2023-10-03 14:11:11,589 WARNING [train.py:1204] (2/4) Exclude cut with ID _5410_210_1388_1_1527397055795_1719020_39-112221-0_sp0.9 from training. Duration: 0.7666875 2023-10-03 14:11:12,964 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9121W0012-554933 from training. Duration: 0.896 2023-10-03 14:11:14,386 WARNING [train.py:1204] (2/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_476-272757-0 from training. Duration: 0.93 2023-10-03 14:11:16,259 WARNING [train.py:1204] (2/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_1085-171718-0_sp1.1 from training. Duration: 0.92725 2023-10-03 14:11:17,569 WARNING [train.py:1204] (2/4) Exclude cut with ID _8480_210_1874_1_1528589391205_6728330_309-139896-0_sp1.1 from training. Duration: 0.736375 2023-10-03 14:11:17,569 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9130W0113-559703 from training. Duration: 0.896 2023-10-03 14:11:17,573 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_292-265928-0_sp1.1 from training. Duration: 0.463625 2023-10-03 14:11:19,016 WARNING [train.py:1204] (2/4) Exclude cut with ID _8772_210_1850_1_1528635595972_3664011_96-12756-0 from training. Duration: 0.98 2023-10-03 14:11:20,438 WARNING [train.py:1204] (2/4) Exclude cut with ID _12556_210_8341_1_1530081024100_7327690_725-193477-0_sp1.1 from training. Duration: 0.8909375 2023-10-03 14:11:20,489 WARNING [train.py:1204] (2/4) Exclude cut with ID _5289_210_2084_1_1527678202736_3779299_142-103317-0_sp1.1 from training. Duration: 0.763625 2023-10-03 14:11:23,190 WARNING [train.py:1204] (2/4) Exclude cut with ID _45012_210_12651_1_1533608435136_5371380_206-51075-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 14:11:23,283 WARNING [train.py:1204] (2/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_176-56120-0_sp1.1 from training. Duration: 0.7545625 2023-10-03 14:11:23,316 WARNING [train.py:1204] (2/4) Exclude cut with ID _40284_210_2084_1_1533513679814_3704279_220-201864-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 14:11:24,627 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9165W0004-576638 from training. Duration: 0.896 2023-10-03 14:11:24,662 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_5438_210_1794_1_1527336687_2600009_218-343336-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 14:11:26,459 WARNING [train.py:1204] (2/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_318-162017-0 from training. Duration: 0.91 2023-10-03 14:11:32,793 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_49544_210_1794_1_1534379770_5262083_483-349298-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 14:11:34,165 WARNING [train.py:1204] (2/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_197-28164-0_sp0.9 from training. Duration: 0.888875 2023-10-03 14:11:34,220 WARNING [train.py:1204] (2/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_643-52007-0 from training. Duration: 0.57 2023-10-03 14:11:34,364 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.bypass.scale_min, batch_count=1294780.0, ans=0.2 2023-10-03 14:11:34,375 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.conv_module1.balancer1.prob, batch_count=1294780.0, ans=0.125 2023-10-03 14:11:35,474 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_38965_210_4852_1_1533174906_3484735_403-332963-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 14:11:36,868 WARNING [train.py:1204] (2/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_340-174176-0 from training. Duration: 0.49 2023-10-03 14:11:39,064 WARNING [train.py:1204] (2/4) Exclude cut with ID _56708_210_13177_1_1535252351282_3422899_363-917-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 14:11:40,167 WARNING [train.py:1204] (2/4) Exclude cut with ID _19896_210_1883_1_1531709022662_6649569_525-159063-0_sp1.1 from training. Duration: 0.936375 2023-10-03 14:11:40,192 WARNING [train.py:1204] (2/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_430-116139-0_sp0.9 from training. Duration: 0.9444375 2023-10-03 14:11:42,961 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_53753_210_7234_1_1534320003_3594148_86-329250-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 14:11:42,971 WARNING [train.py:1204] (2/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_406-275649-0_sp1.1 from training. Duration: 0.3818125 2023-10-03 14:11:43,079 WARNING [train.py:1204] (2/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_1083-147536-0_sp1.1 from training. Duration: 0.8818125 2023-10-03 14:11:45,651 WARNING [train.py:1204] (2/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_54-311741-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 14:11:45,663 WARNING [train.py:1204] (2/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_237-58433-0_sp0.9 from training. Duration: 0.82225 2023-10-03 14:11:45,708 WARNING [train.py:1204] (2/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_98-51676-0_sp1.1 from training. Duration: 0.663625 2023-10-03 14:11:45,921 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.conv_skip_rate, batch_count=1294846.6666666667, ans=0.0 2023-10-03 14:11:47,487 WARNING [train.py:1204] (2/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_386-270472-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 14:11:47,578 WARNING [train.py:1204] (2/4) Exclude cut with ID _19023_210_4353_1_1531378892552_5820780_83-254794-0_sp1.1 from training. Duration: 0.7818125 2023-10-03 14:11:49,027 WARNING [train.py:1204] (2/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_314-93656-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 14:11:49,038 WARNING [train.py:1204] (2/4) Exclude cut with ID _14170_210_5219_1_1530525949868_4469650_336-213530-0 from training. Duration: 0.9 2023-10-03 14:11:50,539 WARNING [train.py:1204] (2/4) Exclude cut with ID _36686_210_5665_1_1533778225913_3646410_65-221120-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 14:11:52,122 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_26433_210_12108_1_1532157158_4056480_271-68178-0_sp1.1 from training. Duration: 0.92725 2023-10-03 14:11:53,420 INFO [train.py:1046] (2/4) Epoch 37, batch 3000, loss[loss=0.1635, simple_loss=0.2392, pruned_loss=0.04391, over 23553.00 frames. ], tot_loss[loss=0.1598, simple_loss=0.2393, pruned_loss=0.04017, over 4717568.38 frames. ], batch size: 149, lr: 2.75e-03, grad_scale: 16.0 2023-10-03 14:11:53,420 INFO [train.py:1069] (2/4) Computing validation loss 2023-10-03 14:12:05,417 INFO [train.py:1078] (2/4) Epoch 37, validation: loss=0.3637, simple_loss=0.2861, pruned_loss=0.2207, over 1125622.00 frames. 2023-10-03 14:12:05,418 INFO [train.py:1079] (2/4) Maximum memory allocated so far is 20967MB 2023-10-03 14:12:05,464 WARNING [train.py:1204] (2/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_46-207599-0_sp0.9 from training. Duration: 0.711125 2023-10-03 14:12:08,634 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9303W0114-637133 from training. Duration: 0.896 2023-10-03 14:12:09,960 WARNING [train.py:1204] (2/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_490-207861-0 from training. Duration: 0.98 2023-10-03 14:12:11,495 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6767_210_3273_1_1527855783_1718932_173-256601-0_sp0.9 from training. Duration: 0.888875 2023-10-03 14:12:11,525 WARNING [train.py:1204] (2/4) Exclude cut with ID _13921_210_9994_1_1530841782272_3503520_327-69677-0_sp1.1 from training. Duration: 0.8181875 2023-10-03 14:12:12,821 WARNING [train.py:1204] (2/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_508-335369-0 from training. Duration: 0.48 2023-10-03 14:12:12,861 WARNING [train.py:1204] (2/4) Exclude cut with ID _64631_210_27767_1_1535349580068_4165160_24-46567-0_sp1.1 from training. Duration: 0.963625 2023-10-03 14:12:14,564 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.balancer2.prob, batch_count=1294913.3333333333, ans=0.125 2023-10-03 14:12:17,774 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.1.feed_forward1.out_proj.dropout_p, batch_count=1294913.3333333333, ans=0.1 2023-10-03 14:12:18,992 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_89-35748-0_sp1.1 from training. Duration: 0.7181875 2023-10-03 14:12:21,935 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.conv_skip_rate, batch_count=1294980.0, ans=0.0 2023-10-03 14:12:30,397 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_26433_210_12108_1_1532157158_4056480_578-262107-0_sp1.1 from training. Duration: 0.9 2023-10-03 14:12:35,020 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.ff2_skip_rate, batch_count=1295046.6666666667, ans=0.0 2023-10-03 14:12:36,727 WARNING [train.py:1204] (2/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_129-337802-0 from training. Duration: 0.72 2023-10-03 14:12:38,181 WARNING [train.py:1204] (2/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_356-167816-0_sp0.9 from training. Duration: 0.8 2023-10-03 14:12:39,656 WARNING [train.py:1204] (2/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_137-116442-0_sp1.1 from training. Duration: 0.7090625 2023-10-03 14:12:41,565 WARNING [train.py:1204] (2/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_358-172940-0_sp1.1 from training. Duration: 0.963625 2023-10-03 14:12:41,600 WARNING [train.py:1204] (2/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_668-344147-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 14:12:42,935 WARNING [train.py:1204] (2/4) Exclude cut with ID _62337_210_12392_1_1535009273105_3412229_255-157667-0_sp1.1 from training. Duration: 0.936375 2023-10-03 14:12:42,938 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_486-151131-0 from training. Duration: 0.85 2023-10-03 14:12:45,737 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_53638_210_21581_1_1534297319_5808449_885-80172-0 from training. Duration: 0.96 2023-10-03 14:12:47,091 WARNING [train.py:1204] (2/4) Exclude cut with ID _50093_210_4220_1_1534417141207_3432299_105-183067-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 14:12:47,145 WARNING [train.py:1204] (2/4) Exclude cut with ID _7562_210_2084_1_1528887767916_3946229_345-169156-0_sp0.9 from training. Duration: 0.6444375 2023-10-03 14:12:49,047 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.bypass.scale_min, batch_count=1295113.3333333333, ans=0.2 2023-10-03 14:12:50,245 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_4528_210_3273_1_1527076654_3276777_209-219804-0_sp0.9 from training. Duration: 0.7666875 2023-10-03 14:12:50,281 WARNING [train.py:1204] (2/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_35-153886-0_sp0.9 from training. Duration: 0.9333125 2023-10-03 14:12:51,587 WARNING [train.py:1204] (2/4) Exclude cut with ID _30800_210_22382_1_1532499347904_4515959_332-121925-0_sp1.1 from training. Duration: 0.92725 2023-10-03 14:12:51,589 WARNING [train.py:1204] (2/4) Exclude cut with ID _59202_210_26984_1_1534816810612_3549174_495-65515-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 14:12:56,952 WARNING [train.py:1204] (2/4) Exclude cut with ID _6274_210_2084_1_1527937583837_7494020_769-168302-0_sp1.1 from training. Duration: 0.6454375 2023-10-03 14:12:57,001 WARNING [train.py:1204] (2/4) Exclude cut with ID _24271_210_12658_1_1531987270022_7190519_708-71345-0_sp1.1 from training. Duration: 0.936375 2023-10-03 14:12:57,002 WARNING [train.py:1204] (2/4) Exclude cut with ID 3033-130750-0096-55598-0_sp0.9 from training. Duration: 0.92225 2023-10-03 14:12:58,403 WARNING [train.py:1204] (2/4) Exclude cut with ID _14087_210_2253_1_1530529224512_3014029_219-224445-0_sp0.9 from training. Duration: 0.9333125 2023-10-03 14:13:03,090 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_174-277364-0 from training. Duration: 0.71 2023-10-03 14:13:03,168 WARNING [train.py:1204] (2/4) Exclude cut with ID _45021_210_9994_1_1533619751094_3330288_96-239010-0_sp1.1 from training. Duration: 0.82725 2023-10-03 14:13:04,459 WARNING [train.py:1204] (2/4) Exclude cut with ID _31602_210_5854_1_1532677021456_4458250_489-305116-0_sp1.1 from training. Duration: 0.97275 2023-10-03 14:13:04,484 WARNING [train.py:1204] (2/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_269-154726-0_sp0.9 from training. Duration: 0.9555625 2023-10-03 14:13:04,648 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.balancer1.prob, batch_count=1295180.0, ans=0.125 2023-10-03 14:13:09,236 WARNING [train.py:1204] (2/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_126-54418-0_sp1.1 from training. Duration: 0.92725 2023-10-03 14:13:10,418 WARNING [train.py:1204] (2/4) Exclude cut with ID _42326_210_7770_1_1533814282531_3707269_182-224159-0_sp1.1 from training. Duration: 0.92725 2023-10-03 14:13:11,799 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7002_210_1794_1_1527939683_5157286_1-281264-0_sp0.9 from training. Duration: 0.488875 2023-10-03 14:13:11,834 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_86-16881-0 from training. Duration: 0.58 2023-10-03 14:13:11,860 WARNING [train.py:1204] (2/4) Exclude cut with ID _19514_210_9259_1_1531530071330_5084230_637-279094-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 14:13:11,887 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_26433_210_12108_1_1532157158_4056480_578-262107-0 from training. Duration: 0.99 2023-10-03 14:13:11,930 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6291_210_3273_1_1527683649_1078498_82-221196-0_sp1.1 from training. Duration: 0.6909375 2023-10-03 14:13:13,361 WARNING [train.py:1204] (2/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_402-102311-0 from training. Duration: 0.67 2023-10-03 14:13:17,961 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1015-343884-0_sp1.1 from training. Duration: 0.563625 2023-10-03 14:13:19,166 INFO [train.py:1046] (2/4) Epoch 37, batch 3050, loss[loss=0.1569, simple_loss=0.2268, pruned_loss=0.04352, over 23817.00 frames. ], tot_loss[loss=0.1601, simple_loss=0.24, pruned_loss=0.04005, over 4733666.03 frames. ], batch size: 212, lr: 2.75e-03, grad_scale: 16.0 2023-10-03 14:13:19,246 WARNING [train.py:1204] (2/4) Exclude cut with ID _13684_210_7497_1_1530619354863_3598359_312-235606-0_sp1.1 from training. Duration: 0.5454375 2023-10-03 14:13:19,274 WARNING [train.py:1204] (2/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_55-31904-0 from training. Duration: 0.88 2023-10-03 14:13:20,661 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_3133_210_2087_1_1526106446_2324690_25-332138-0 from training. Duration: 0.91 2023-10-03 14:13:20,663 WARNING [train.py:1204] (2/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_912-282412-0_sp0.9 from training. Duration: 0.5444375 2023-10-03 14:13:20,848 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.bypass_mid.scale_min, batch_count=1295246.6666666667, ans=0.2 2023-10-03 14:13:22,454 WARNING [train.py:1204] (2/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_458-299435-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 14:13:23,888 WARNING [train.py:1204] (2/4) Exclude cut with ID _69584_210_5219_1_1535785133775_4044450_275-88403-0_sp1.1 from training. Duration: 0.92725 2023-10-03 14:13:23,903 WARNING [train.py:1204] (2/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_841-341679-0_sp1.1 from training. Duration: 0.52725 2023-10-03 14:13:23,917 WARNING [train.py:1204] (2/4) Exclude cut with ID _41391_210_7994_1_1533603771872_3638201_1-54521-0_sp1.1 from training. Duration: 0.97275 2023-10-03 14:13:23,954 WARNING [train.py:1204] (2/4) Exclude cut with ID _56235_210_10640_1_1534557335235_4161099_495-186609-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 14:13:26,680 WARNING [train.py:1204] (2/4) Exclude cut with ID _4681_210_3525_1_1527250689464_120889_15-126760-0 from training. Duration: 0.81 2023-10-03 14:13:28,093 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_3019_210_2028_1_1526044245_3264865_127-135615-0_sp1.1 from training. Duration: 0.8545625 2023-10-03 14:13:30,751 WARNING [train.py:1204] (2/4) Exclude cut with ID _30191_210_1664_1_1533189915047_7196670_915-21561-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 14:13:30,781 WARNING [train.py:1204] (2/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_6-214815-0_sp0.9 from training. Duration: 0.9444375 2023-10-03 14:13:31,022 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.balancer1.prob, batch_count=1295246.6666666667, ans=0.125 2023-10-03 14:13:33,602 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_45399_210_4852_1_1533624725_7579737_190-137494-0_sp1.1 from training. Duration: 0.97275 2023-10-03 14:13:36,916 WARNING [train.py:1204] (2/4) Exclude cut with ID _41314_210_12651_1_1533459476543_3766460_207-132038-0 from training. Duration: 0.94 2023-10-03 14:13:43,043 WARNING [train.py:1204] (2/4) Exclude cut with ID _14603_210_4220_1_1530615688441_3097319_90-31383-0 from training. Duration: 0.87 2023-10-03 14:13:43,088 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_15517_210_6753_1_1530952293_1664795_119-31001-0 from training. Duration: 0.94 2023-10-03 14:13:43,221 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.conv_module2.balancer2.prob, batch_count=1295313.3333333333, ans=0.125 2023-10-03 14:13:44,325 WARNING [train.py:1204] (2/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_684-85342-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 14:13:47,745 WARNING [train.py:1204] (2/4) Exclude cut with ID _14603_210_4220_1_1530615688441_3097319_97-325053-0_sp1.1 from training. Duration: 0.836375 2023-10-03 14:13:49,239 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_68702_210_23689_1_1535799577_3420068_168-25150-0_sp1.1 from training. Duration: 0.97275 2023-10-03 14:13:49,247 WARNING [train.py:1204] (2/4) Exclude cut with ID _65134_210_14763_1_1535335393985_3505368_111-27984-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 14:13:50,504 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.616e+02 1.847e+02 2.000e+02 2.223e+02 2.874e+02, threshold=4.000e+02, percent-clipped=0.0 2023-10-03 14:13:50,634 WARNING [train.py:1204] (2/4) Exclude cut with ID _48678_210_5663_1_1533988830947_3769199_276-226230-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 14:13:53,872 WARNING [train.py:1204] (2/4) Exclude cut with ID _28427_210_4220_1_1532581240967_3061069_43-84928-0_sp1.1 from training. Duration: 0.9 2023-10-03 14:13:53,899 WARNING [train.py:1204] (2/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_158-250116-0_sp1.1 from training. Duration: 0.563625 2023-10-03 14:13:55,173 WARNING [train.py:1204] (2/4) Exclude cut with ID _66873_210_5517_1_1535454136506_4293370_279-332556-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 14:13:55,221 WARNING [train.py:1204] (2/4) Exclude cut with ID _42863_210_9070_1_1533902398788_3696920_488-230146-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 14:13:55,222 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_387-247159-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 14:13:55,431 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.conv_module2.balancer1.prob, batch_count=1295380.0, ans=0.125 2023-10-03 14:13:56,627 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6729_210_2550_1_1528032481_1788725_261-42619-0_sp1.1 from training. Duration: 0.97275 2023-10-03 14:13:59,372 WARNING [train.py:1204] (2/4) Exclude cut with ID _19896_210_1883_1_1531709022662_6649569_831-44724-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 14:14:00,852 WARNING [train.py:1204] (2/4) Exclude cut with ID _55153_210_2253_1_1534384786677_3162096_280-285944-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 14:14:02,112 WARNING [train.py:1204] (2/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_448-4048-0 from training. Duration: 0.96 2023-10-03 14:14:02,166 WARNING [train.py:1204] (2/4) Exclude cut with ID _29678_210_7893_1_1532421042787_3906118_456-202548-0_sp1.1 from training. Duration: 0.97275 2023-10-03 14:14:02,187 WARNING [train.py:1204] (2/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_107-332398-0_sp1.1 from training. Duration: 0.7090625 2023-10-03 14:14:04,961 WARNING [train.py:1204] (2/4) Exclude cut with ID _13921_210_9994_1_1530838702800_3003429_46-149444-0_sp1.1 from training. Duration: 0.8545625 2023-10-03 14:14:06,759 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_257-20733-0_sp0.9 from training. Duration: 0.8444375 2023-10-03 14:14:06,819 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_53753_210_7234_1_1534320003_3594148_385-234809-0_sp1.1 from training. Duration: 0.963625 2023-10-03 14:14:07,038 INFO [scaling.py:1118] (2/4) WithLoss: name=encoder.encoders.2.encoder.layers.1.self_attn_weights, loss-sum=0.000e+00 2023-10-03 14:14:08,111 WARNING [train.py:1204] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_292-215346-0_sp1.1 from training. Duration: 0.8818125 2023-10-03 14:14:12,769 WARNING [train.py:1204] (2/4) Exclude cut with ID _68965_210_1883_1_1535698962272_3497490_196-124268-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 14:14:14,098 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_23801_210_16223_1_1532066392_3294657_570-5439-0_sp1.1 from training. Duration: 0.8818125 2023-10-03 14:14:15,651 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.balancer_ff2.min_abs, batch_count=1295446.6666666667, ans=0.1 2023-10-03 14:14:17,635 WARNING [train.py:1204] (2/4) Exclude cut with ID _5166_210_2084_1_1527073197160_4087993_349-6928-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 14:14:18,913 WARNING [train.py:1204] (2/4) Exclude cut with ID _60621_210_17071_1_1535013053530_3671739_226-255202-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 14:14:18,917 WARNING [train.py:1204] (2/4) Exclude cut with ID _41817_210_18835_1_1533434477160_7423049_448-130302-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 14:14:20,347 WARNING [train.py:1204] (2/4) Exclude cut with ID _6653_210_2629_1_1527919172441_6732679_758-143677-0_sp1.1 from training. Duration: 0.8909375 2023-10-03 14:14:20,391 WARNING [train.py:1204] (2/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_912-47473-0_sp0.9 from training. Duration: 0.7555625 2023-10-03 14:14:22,248 WARNING [train.py:1204] (2/4) Exclude cut with ID _6694_210_4599_1_1527852637490_3965529_356-169233-0_sp1.1 from training. Duration: 0.9 2023-10-03 14:14:22,326 WARNING [train.py:1204] (2/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_1-99680-0 from training. Duration: 0.67 2023-10-03 14:14:24,938 WARNING [train.py:1204] (2/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_199-86432-0_sp1.1 from training. Duration: 0.8909375 2023-10-03 14:14:24,955 WARNING [train.py:1204] (2/4) Exclude cut with ID _61148_210_6897_1_1535094022726_4217610_362-182430-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 14:14:25,038 WARNING [train.py:1204] (2/4) Exclude cut with ID _49376_210_5986_1_1533985278732_3370740_301-154763-0 from training. Duration: 0.98 2023-10-03 14:14:26,595 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.ff2_skip_rate, batch_count=1295513.3333333333, ans=0.0 2023-10-03 14:14:27,666 WARNING [train.py:1204] (2/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_572-185150-0_sp1.1 from training. Duration: 0.8818125 2023-10-03 14:14:33,151 INFO [train.py:1046] (2/4) Epoch 37, batch 3100, loss[loss=0.1679, simple_loss=0.2354, pruned_loss=0.05024, over 23839.00 frames. ], tot_loss[loss=0.1597, simple_loss=0.2395, pruned_loss=0.03999, over 4726351.53 frames. ], batch size: 150, lr: 2.75e-03, grad_scale: 16.0 2023-10-03 14:14:33,222 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_47896_210_20508_1_1534492518_5958868_558-314572-0_sp1.1 from training. Duration: 0.8818125 2023-10-03 14:14:35,053 WARNING [train.py:1204] (2/4) Exclude cut with ID _45560_210_21982_1_1533812603951_3584999_111-6376-0_sp0.9 from training. Duration: 0.9333125 2023-10-03 14:14:36,551 WARNING [train.py:1204] (2/4) Exclude cut with ID _14170_210_5219_1_1530525949868_4469650_412-317077-0_sp1.1 from training. Duration: 0.6818125 2023-10-03 14:14:37,988 WARNING [train.py:1204] (2/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_718-68505-0 from training. Duration: 0.89 2023-10-03 14:14:41,015 WARNING [train.py:1204] (2/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_77-311404-0 from training. Duration: 0.94 2023-10-03 14:14:41,093 WARNING [train.py:1204] (2/4) Exclude cut with ID _60229_210_27767_1_1534838406158_3655350_264-279573-0 from training. Duration: 0.83 2023-10-03 14:14:42,540 WARNING [train.py:1204] (2/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_106-300985-0_sp1.1 from training. Duration: 0.8181875 2023-10-03 14:14:43,353 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.1.encoder.layers.1.conv_module2.whiten, num_groups=1, num_channels=256, metric=10.12 vs. limit=15.0 2023-10-03 14:14:47,219 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_4183_210_2087_1_1526729100_1869838_79-277189-0_sp1.1 from training. Duration: 0.963625 2023-10-03 14:14:47,228 WARNING [train.py:1204] (2/4) Exclude cut with ID _28427_210_4220_1_1532581240967_3061069_195-295268-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 14:14:48,224 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.1.encoder.layers.0.self_attn_weights.whiten_keys, num_groups=4, num_channels=128, metric=3.48 vs. limit=6.0 2023-10-03 14:14:48,758 WARNING [train.py:1204] (2/4) Exclude cut with ID _4092_210_2151_1_1526643044449_3135759_356-278694-0_sp0.9 from training. Duration: 0.52225 2023-10-03 14:14:53,226 WARNING [train.py:1204] (2/4) Exclude cut with ID _14459_210_2228_1_1531443641946_7516700_777-71446-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 14:14:57,379 WARNING [train.py:1204] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_520-126729-0 from training. Duration: 0.7 2023-10-03 14:15:01,541 WARNING [train.py:1204] (2/4) Exclude cut with ID _13684_210_7497_1_1530619354863_3598359_312-235606-0_sp0.9 from training. Duration: 0.6666875 2023-10-03 14:15:02,852 WARNING [train.py:1204] (2/4) Exclude cut with ID _37537_210_2253_1_1533171758568_3421949_283-349402-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 14:15:02,891 WARNING [train.py:1204] (2/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_29-117932-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 14:15:02,921 WARNING [train.py:1204] (2/4) Exclude cut with ID _29303_210_2575_1_1532392237303_7410649_721-283351-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 14:15:04,226 WARNING [train.py:1204] (2/4) Exclude cut with ID _14886_210_4220_1_1530702020355_3516190_175-229073-0_sp0.9 from training. Duration: 0.5 2023-10-03 14:15:07,297 WARNING [train.py:1204] (2/4) Exclude cut with ID _15458_210_8341_1_1530858614263_3834410_70-268830-0_sp1.1 from training. Duration: 0.97275 2023-10-03 14:15:07,315 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_2295_210_2087_1_1525500993_2407863_234-83104-0 from training. Duration: 0.62 2023-10-03 14:15:07,316 WARNING [train.py:1204] (2/4) Exclude cut with ID _22961_210_8254_1_1531976366672_4278180_317-199546-0_sp1.1 from training. Duration: 0.7818125 2023-10-03 14:15:07,415 WARNING [train.py:1204] (2/4) Exclude cut with ID _27894_210_12392_1_1532415496078_6902439_547-196022-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 14:15:09,262 WARNING [train.py:1204] (2/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_46-207599-0 from training. Duration: 0.64 2023-10-03 14:15:10,699 WARNING [train.py:1204] (2/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_426-326246-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 14:15:10,910 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.conv_skip_rate, batch_count=1295713.3333333333, ans=0.0 2023-10-03 14:15:10,986 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.conv_skip_rate, batch_count=1295713.3333333333, ans=0.0 2023-10-03 14:15:12,386 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.feed_forward3.hidden_balancer.prob, batch_count=1295713.3333333333, ans=0.125 2023-10-03 14:15:13,431 WARNING [train.py:1204] (2/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_445-117398-0_sp0.9 from training. Duration: 0.988875 2023-10-03 14:15:13,496 WARNING [train.py:1204] (2/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_563-347832-0 from training. Duration: 0.89 2023-10-03 14:15:14,851 WARNING [train.py:1204] (2/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_252-216674-0 from training. Duration: 0.54 2023-10-03 14:15:16,328 WARNING [train.py:1204] (2/4) Exclude cut with ID _59611_210_8895_1_1534741198240_3650361_187-212779-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 14:15:18,280 WARNING [train.py:1204] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_282-159367-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 14:15:19,649 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_68702_210_23689_1_1535799577_3420068_33-136883-0_sp1.1 from training. Duration: 0.936375 2023-10-03 14:15:19,660 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_47228_210_4983_1_1533780021_3672027_184-293706-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 14:15:19,688 WARNING [train.py:1204] (2/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_656-188361-0_sp0.9 from training. Duration: 0.9666875 2023-10-03 14:15:20,386 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.3.encoder.layers.2.nonlin_attention.whiten2, num_groups=1, num_channels=512, metric=5.30 vs. limit=15.0 2023-10-03 14:15:21,567 WARNING [train.py:1204] (2/4) Exclude cut with ID _4866_210_2577_1_1527404412284_4364659_352-157930-0_sp0.9 from training. Duration: 0.9 2023-10-03 14:15:21,568 WARNING [train.py:1204] (2/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_210-106047-0_sp1.1 from training. Duration: 0.8818125 2023-10-03 14:15:22,524 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.1.encoder.layers.0.self_attn2.whiten, num_groups=1, num_channels=256, metric=11.65 vs. limit=22.5 2023-10-03 14:15:22,946 WARNING [train.py:1204] (2/4) Exclude cut with ID _9389_210_1664_1_1529217337301_5289269_289-213417-0_sp1.1 from training. Duration: 0.8090625 2023-10-03 14:15:24,363 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_3910_210_1794_1_1526777667_3747260_178-41186-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 14:15:24,366 WARNING [train.py:1204] (2/4) Exclude cut with ID _27052_210_5986_1_1532329197187_3585009_24-172263-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 14:15:24,370 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_5522_210_3273_1_1527249606_3909096_318-203153-0_sp1.1 from training. Duration: 0.3909375 2023-10-03 14:15:29,789 WARNING [train.py:1204] (2/4) Exclude cut with ID _27894_210_12392_1_1532415496078_6902439_222-51982-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 14:15:29,885 WARNING [train.py:1204] (2/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_87-131427-0 from training. Duration: 0.78 2023-10-03 14:15:32,605 WARNING [train.py:1204] (2/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_372-298093-0_sp0.9 from training. Duration: 0.888875 2023-10-03 14:15:33,899 WARNING [train.py:1204] (2/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_663-166973-0 from training. Duration: 0.8 2023-10-03 14:15:33,937 WARNING [train.py:1204] (2/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_429-177469-0_sp1.1 from training. Duration: 0.936375 2023-10-03 14:15:33,976 WARNING [train.py:1204] (2/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_724-172651-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 14:15:35,291 WARNING [train.py:1204] (2/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_103-336064-0 from training. Duration: 0.51 2023-10-03 14:15:47,055 INFO [train.py:1046] (2/4) Epoch 37, batch 3150, loss[loss=0.1406, simple_loss=0.2238, pruned_loss=0.02866, over 24672.00 frames. ], tot_loss[loss=0.1589, simple_loss=0.2385, pruned_loss=0.03964, over 4730003.26 frames. ], batch size: 65, lr: 2.75e-03, grad_scale: 16.0 2023-10-03 14:15:47,119 WARNING [train.py:1204] (2/4) Exclude cut with ID _48677_210_5663_1_1533902405919_3892989_111-11439-0 from training. Duration: 0.96 2023-10-03 14:15:48,619 WARNING [train.py:1204] (2/4) Exclude cut with ID _8085_210_4557_1_1528455633493_3716100_95-103398-0_sp1.1 from training. Duration: 0.963625 2023-10-03 14:15:49,975 WARNING [train.py:1204] (2/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_1041-110837-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 14:15:51,908 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_50050_210_16223_1_1535435796_7290138_1155-121757-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 14:15:51,910 WARNING [train.py:1204] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_602-275260-0_sp0.9 from training. Duration: 0.911125 2023-10-03 14:15:53,743 WARNING [train.py:1204] (2/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_265-164486-0 from training. Duration: 0.96 2023-10-03 14:15:55,134 WARNING [train.py:1204] (2/4) Exclude cut with ID _46067_210_4220_1_1534125571066_3307450_142-229375-0_sp1.1 from training. Duration: 0.963625 2023-10-03 14:15:55,165 WARNING [train.py:1204] (2/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_310-26186-0_sp1.1 from training. Duration: 0.636375 2023-10-03 14:15:55,261 WARNING [train.py:1204] (2/4) Exclude cut with ID _28461_210_2253_1_1532771775189_3041299_136-270644-0 from training. Duration: 0.93 2023-10-03 14:15:57,888 WARNING [train.py:1204] (2/4) Exclude cut with ID _14860_210_4915_1_1530702020340_7343240_438-286372-0_sp1.1 from training. Duration: 0.936375 2023-10-03 14:15:59,360 WARNING [train.py:1204] (2/4) Exclude cut with ID IC0128W0004-63309 from training. Duration: 0.831 2023-10-03 14:16:00,763 WARNING [train.py:1204] (2/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_135-174080-0 from training. Duration: 0.53 2023-10-03 14:16:02,084 WARNING [train.py:1204] (2/4) Exclude cut with ID _41326_210_5854_1_1533778076509_3492080_277-59511-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 14:16:03,397 WARNING [train.py:1204] (2/4) Exclude cut with ID IC0144W0112-71379 from training. Duration: 0.992 2023-10-03 14:16:03,480 WARNING [train.py:1204] (2/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_711-15688-0_sp0.9 from training. Duration: 0.47775 2023-10-03 14:16:06,323 WARNING [train.py:1204] (2/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_237-332027-0 from training. Duration: 0.95 2023-10-03 14:16:06,376 WARNING [train.py:1204] (2/4) Exclude cut with ID _7970_210_1883_1_1528455616296_3472230_25-178407-0 from training. Duration: 0.71 2023-10-03 14:16:08,138 WARNING [train.py:1204] (2/4) Exclude cut with ID _8480_210_1874_1_1528589391205_6728330_309-139896-0 from training. Duration: 0.81 2023-10-03 14:16:08,162 WARNING [train.py:1204] (2/4) Exclude cut with ID _39571_210_13449_1_1533801584143_3651077_319-268877-0_sp1.1 from training. Duration: 0.936375 2023-10-03 14:16:08,165 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_43802_210_2290_1_1533516203_8829504_155-246880-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 14:16:08,261 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_566-122599-0_sp1.1 from training. Duration: 0.936375 2023-10-03 14:16:10,251 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_12868_210_6753_1_1530701932_530275_18-76322-0 from training. Duration: 0.87 2023-10-03 14:16:12,926 WARNING [train.py:1204] (2/4) Exclude cut with ID _56494_210_4220_1_1535284837625_3033169_159-106035-0_sp1.1 from training. Duration: 0.963625 2023-10-03 14:16:12,966 WARNING [train.py:1204] (2/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_98-276013-0_sp1.1 from training. Duration: 0.963625 2023-10-03 14:16:14,364 WARNING [train.py:1204] (2/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_330-216124-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 14:16:15,807 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_177-58005-0_sp0.9 from training. Duration: 0.688875 2023-10-03 14:16:17,451 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.feed_forward3.hidden_balancer.prob, batch_count=1296046.6666666667, ans=0.125 2023-10-03 14:16:18,521 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.509e+02 1.944e+02 2.154e+02 2.658e+02 3.587e+02, threshold=4.309e+02, percent-clipped=0.0 2023-10-03 14:16:19,847 WARNING [train.py:1204] (2/4) Exclude cut with ID _56235_210_10640_1_1534557335235_4161099_343-139898-0 from training. Duration: 0.96 2023-10-03 14:16:19,915 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6961_210_2087_1_1527937168_6815011_395-185060-0_sp1.1 from training. Duration: 0.863625 2023-10-03 14:16:23,700 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7002_210_1794_1_1527939683_5157286_184-229630-0_sp0.9 from training. Duration: 0.82225 2023-10-03 14:16:23,751 WARNING [train.py:1204] (2/4) Exclude cut with ID _59170_210_6721_1_1534983633960_4203335_153-18895-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 14:16:24,069 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.nonlin_attention.balancer.prob, batch_count=1296046.6666666667, ans=0.125 2023-10-03 14:16:25,130 WARNING [train.py:1204] (2/4) Exclude cut with ID _49643_210_7989_1_1534640244224_3939031_598-161926-0 from training. Duration: 0.9 2023-10-03 14:16:26,573 WARNING [train.py:1204] (2/4) Exclude cut with ID _4068_210_2629_1_1526697350395_1546259_224-288693-0 from training. Duration: 0.59 2023-10-03 14:16:27,865 WARNING [train.py:1204] (2/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_718-68505-0_sp1.1 from training. Duration: 0.8090625 2023-10-03 14:16:27,920 WARNING [train.py:1204] (2/4) Exclude cut with ID _4068_210_2629_1_1526697350395_1546259_224-288693-0_sp0.9 from training. Duration: 0.6555625 2023-10-03 14:16:27,932 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_2296_210_2087_1_1525503448_3397335_57-185430-0_sp0.9 from training. Duration: 0.6666875 2023-10-03 14:16:28,637 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.2.encoder.layers.0.conv_module1.whiten, num_groups=1, num_channels=384, metric=10.01 vs. limit=15.0 2023-10-03 14:16:29,229 WARNING [train.py:1204] (2/4) Exclude cut with ID _15458_210_8341_1_1530858614263_3834410_49-235472-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 14:16:29,238 WARNING [train.py:1204] (2/4) Exclude cut with ID _64135_210_2253_1_1535677267876_7608972_498-5039-0_sp0.9 from training. Duration: 0.8666875 2023-10-03 14:16:30,580 WARNING [train.py:1204] (2/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_283-220372-0_sp0.9 from training. Duration: 0.62225 2023-10-03 14:16:30,589 WARNING [train.py:1204] (2/4) Exclude cut with ID _6473_210_1741_1_1527901307396_3815380_108-17335-0_sp0.9 from training. Duration: 0.72225 2023-10-03 14:16:31,975 WARNING [train.py:1204] (2/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_113-92608-0 from training. Duration: 0.54 2023-10-03 14:16:32,022 WARNING [train.py:1204] (2/4) Exclude cut with ID _14170_210_5219_1_1530525949868_4469650_272-249985-0_sp1.1 from training. Duration: 0.6090625 2023-10-03 14:16:33,221 WARNING [train.py:1204] (2/4) Exclude cut with ID _50376_210_13401_1_1534031502101_4217740_518-187390-0_sp1.1 from training. Duration: 0.97275 2023-10-03 14:16:34,776 WARNING [train.py:1204] (2/4) Exclude cut with ID _59738_210_15379_1_1534849512562_3941470_165-248260-0_sp1.1 from training. Duration: 0.92725 2023-10-03 14:16:34,785 WARNING [train.py:1204] (2/4) Exclude cut with ID _23818_210_6228_1_1531917063884_3676920_400-343925-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 14:16:36,125 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6961_210_2087_1_1527937168_6815011_562-165518-0 from training. Duration: 0.77 2023-10-03 14:16:36,169 WARNING [train.py:1204] (2/4) Exclude cut with ID _24271_210_12658_1_1531987270022_7190519_289-207931-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 14:16:37,656 WARNING [train.py:1204] (2/4) Exclude cut with ID _14632_210_9988_1_1530691279392_3923200_332-105904-0 from training. Duration: 0.9 2023-10-03 14:16:37,668 WARNING [train.py:1204] (2/4) Exclude cut with ID _40402_210_18219_1_1533522324115_2953150_203-232438-0_sp1.1 from training. Duration: 0.97275 2023-10-03 14:16:37,753 WARNING [train.py:1204] (2/4) Exclude cut with ID _13252_210_1925_1_1530671372047_3336909_263-168388-0 from training. Duration: 0.86 2023-10-03 14:16:39,041 WARNING [train.py:1204] (2/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_174-205058-0 from training. Duration: 0.95 2023-10-03 14:16:41,045 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_94-195801-0_sp1.1 from training. Duration: 0.8545625 2023-10-03 14:16:41,075 WARNING [train.py:1204] (2/4) Exclude cut with ID _55153_210_2253_1_1534384786677_3162096_269-103204-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 14:16:42,417 WARNING [train.py:1204] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_748-36150-0 from training. Duration: 0.91 2023-10-03 14:16:43,711 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_12868_210_6753_1_1530702479_3610564_148-172861-0_sp1.1 from training. Duration: 0.4545625 2023-10-03 14:16:43,768 WARNING [train.py:1204] (2/4) Exclude cut with ID _67499_210_17282_1_1535509805220_3692430_460-77730-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 14:16:45,242 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.1.feed_forward1.out_proj.dropout_p, batch_count=1296180.0, ans=0.1 2023-10-03 14:16:46,444 WARNING [train.py:1204] (2/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_374-20753-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 14:16:48,253 WARNING [train.py:1204] (2/4) Exclude cut with ID _45329_210_7005_1_1534157951211_6774339_374-289983-0_sp1.1 from training. Duration: 0.97275 2023-10-03 14:16:48,293 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_34886_210_10633_1_1533203882_1755661_76-156096-0_sp0.9 from training. Duration: 0.9666875 2023-10-03 14:16:54,380 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_238-256668-0_sp0.9 from training. Duration: 0.7666875 2023-10-03 14:16:55,630 WARNING [train.py:1204] (2/4) Exclude cut with ID _46683_210_9642_1_1533730913492_4591116_489-146727-0_sp1.1 from training. Duration: 0.97275 2023-10-03 14:16:58,452 WARNING [train.py:1204] (2/4) Exclude cut with ID _13938_210_9988_1_1530518718978_3693960_304-112784-0_sp0.9 from training. Duration: 0.511125 2023-10-03 14:17:01,199 INFO [train.py:1046] (2/4) Epoch 37, batch 3200, loss[loss=0.1724, simple_loss=0.2538, pruned_loss=0.04546, over 24374.00 frames. ], tot_loss[loss=0.1577, simple_loss=0.2369, pruned_loss=0.03928, over 4709275.28 frames. ], batch size: 77, lr: 2.75e-03, grad_scale: 32.0 2023-10-03 14:17:02,640 WARNING [train.py:1204] (2/4) Exclude cut with ID _24271_210_12658_1_1531987270022_7190519_243-46549-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 14:17:02,644 WARNING [train.py:1204] (2/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_78-182912-0_sp1.1 from training. Duration: 0.536375 2023-10-03 14:17:02,855 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.conv_skip_rate, batch_count=1296246.6666666667, ans=0.0 2023-10-03 14:17:06,595 WARNING [train.py:1204] (2/4) Exclude cut with ID _57572_210_5517_1_1534921219624_3809484_121-321334-0_sp1.1 from training. Duration: 0.97275 2023-10-03 14:17:07,950 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_40047_210_2290_1_1533290519_2374311_254-102149-0_sp1.1 from training. Duration: 0.963625 2023-10-03 14:17:07,951 WARNING [train.py:1204] (2/4) Exclude cut with ID _28461_210_2253_1_1532771775189_3041299_96-152158-0 from training. Duration: 0.85 2023-10-03 14:17:10,612 WARNING [train.py:1204] (2/4) Exclude cut with ID _29877_210_6228_1_1532397791106_5276149_161-102979-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 14:17:17,353 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_3787_210_2040_1_1526477544_5478586_603-120324-0_sp0.9 from training. Duration: 0.9 2023-10-03 14:17:18,936 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_41750_210_1960_1_1533520678_3948042_247-213456-0_sp1.1 from training. Duration: 0.97275 2023-10-03 14:17:25,745 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=1296313.3333333333, ans=0.1 2023-10-03 14:17:28,232 WARNING [train.py:1204] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_127-119109-0_sp0.9 from training. Duration: 0.8 2023-10-03 14:17:32,694 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.bypass_mid.scale_min, batch_count=1296380.0, ans=0.2 2023-10-03 14:17:35,260 WARNING [train.py:1204] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_215-202973-0 from training. Duration: 0.88 2023-10-03 14:17:36,641 WARNING [train.py:1204] (2/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_17-320491-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 14:17:38,035 WARNING [train.py:1204] (2/4) Exclude cut with ID _5166_210_2084_1_1527073197160_4087993_310-114452-0 from training. Duration: 0.59 2023-10-03 14:17:39,292 WARNING [train.py:1204] (2/4) Exclude cut with ID _15133_210_6228_1_1530794512074_3080379_29-264458-0_sp0.9 from training. Duration: 0.6444375 2023-10-03 14:17:44,395 WARNING [train.py:1204] (2/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_159-281310-0_sp1.1 from training. Duration: 0.863625 2023-10-03 14:17:44,412 WARNING [train.py:1204] (2/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_195-167011-0_sp0.9 from training. Duration: 0.8333125 2023-10-03 14:17:44,479 WARNING [train.py:1204] (2/4) Exclude cut with ID _14087_210_2253_1_1530529224512_3014029_156-257607-0_sp1.1 from training. Duration: 0.7545625 2023-10-03 14:17:47,324 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_2295_210_2087_1_1525500993_2407863_116-238894-0 from training. Duration: 0.7 2023-10-03 14:17:50,661 WARNING [train.py:1204] (2/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_321-335475-0_sp0.9 from training. Duration: 0.5 2023-10-03 14:17:52,459 WARNING [train.py:1204] (2/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_188-114464-0 from training. Duration: 0.82 2023-10-03 14:17:55,102 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_2592_210_2290_1_1525697025_4882925_157-70512-0 from training. Duration: 0.91 2023-10-03 14:17:56,563 WARNING [train.py:1204] (2/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_138-64210-0_sp1.1 from training. Duration: 0.663625 2023-10-03 14:18:02,150 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_11305_210_1794_1_1529750428_7178148_651-95702-0_sp1.1 from training. Duration: 0.963625 2023-10-03 14:18:02,174 WARNING [train.py:1204] (2/4) Exclude cut with ID _14224_210_4381_1_1530529062037_4264010_379-184223-0_sp0.9 from training. Duration: 0.9444375 2023-10-03 14:18:02,207 WARNING [train.py:1204] (2/4) Exclude cut with ID _8394_210_6702_1_1528590504316_3540189_381-125325-0_sp1.1 from training. Duration: 0.963625 2023-10-03 14:18:03,473 WARNING [train.py:1204] (2/4) Exclude cut with ID IC0622W0021-307659 from training. Duration: 0.896 2023-10-03 14:18:03,476 WARNING [train.py:1204] (2/4) Exclude cut with ID _7624_210_1531_1_1528109802198_4933101_418-349834-0_sp1.1 from training. Duration: 0.5454375 2023-10-03 14:18:07,552 WARNING [train.py:1204] (2/4) Exclude cut with ID _49728_210_12651_1_1534041034294_6788150_607-3416-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 14:18:08,964 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8275_210_4198_1_1528455320_3892571_491-17707-0 from training. Duration: 0.64 2023-10-03 14:18:09,022 WARNING [train.py:1204] (2/4) Exclude cut with ID _5287_210_2084_1_1527418883040_3878670_333-48508-0 from training. Duration: 0.6 2023-10-03 14:18:10,304 WARNING [train.py:1204] (2/4) Exclude cut with ID _8365_210_2064_1_1528592311070_4677690_371-122295-0 from training. Duration: 0.55 2023-10-03 14:18:12,211 WARNING [train.py:1204] (2/4) Exclude cut with ID _14860_210_4915_1_1530702020340_7343240_836-328489-0 from training. Duration: 0.9 2023-10-03 14:18:13,628 WARNING [train.py:1204] (2/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_1033-274376-0_sp0.9 from training. Duration: 0.9666875 2023-10-03 14:18:14,909 INFO [train.py:1046] (2/4) Epoch 37, batch 3250, loss[loss=0.1567, simple_loss=0.2277, pruned_loss=0.04279, over 23806.00 frames. ], tot_loss[loss=0.1578, simple_loss=0.2367, pruned_loss=0.03941, over 4708414.56 frames. ], batch size: 212, lr: 2.75e-03, grad_scale: 32.0 2023-10-03 14:18:16,924 WARNING [train.py:1204] (2/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_475-127455-0_sp0.9 from training. Duration: 0.77775 2023-10-03 14:18:16,930 WARNING [train.py:1204] (2/4) Exclude cut with ID IC0671W0071-331914 from training. Duration: 0.896 2023-10-03 14:18:16,948 WARNING [train.py:1204] (2/4) Exclude cut with ID _30327_210_16049_1_1532484161295_3924169_2-1523-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 14:18:16,951 WARNING [train.py:1204] (2/4) Exclude cut with ID _24314_210_4915_1_1532135078116_7170179_460-208279-0_sp1.1 from training. Duration: 0.97275 2023-10-03 14:18:18,409 WARNING [train.py:1204] (2/4) Exclude cut with ID IC0679W0052-335859 from training. Duration: 0.64 2023-10-03 14:18:22,987 WARNING [train.py:1204] (2/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_3-237434-0_sp1.1 from training. Duration: 0.6909375 2023-10-03 14:18:24,530 WARNING [train.py:1204] (2/4) Exclude cut with ID _69884_210_9771_1_1535846392818_4166169_405-185283-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 14:18:34,278 WARNING [train.py:1204] (2/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_927-83684-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 14:18:34,287 WARNING [train.py:1204] (2/4) Exclude cut with ID _6694_210_4599_1_1527852637490_3965529_356-169233-0 from training. Duration: 0.99 2023-10-03 14:18:34,371 WARNING [train.py:1204] (2/4) Exclude cut with ID _19896_210_1883_1_1531709022662_6649569_562-233001-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 14:18:35,667 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_20120_210_6753_1_1531643035_5482859_354-78629-0_sp1.1 from training. Duration: 0.963625 2023-10-03 14:18:35,668 WARNING [train.py:1204] (2/4) Exclude cut with ID _60135_210_13427_1_1534935557922_3651319_96-168198-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 14:18:37,102 WARNING [train.py:1204] (2/4) Exclude cut with ID _31602_210_5854_1_1532677021456_4458250_249-96604-0_sp1.1 from training. Duration: 0.8090625 2023-10-03 14:18:37,125 WARNING [train.py:1204] (2/4) Exclude cut with ID _14100_210_8341_1_1530529320613_3678069_258-224599-0_sp0.9 from training. Duration: 0.8555625 2023-10-03 14:18:39,880 WARNING [train.py:1204] (2/4) Exclude cut with ID _19708_210_9206_1_1532136650733_4238364_215-160135-0_sp1.1 from training. Duration: 0.97275 2023-10-03 14:18:39,902 WARNING [train.py:1204] (2/4) Exclude cut with ID _21878_210_3525_1_1531738772002_7264330_113-18163-0_sp0.9 from training. Duration: 0.988875 2023-10-03 14:18:39,959 WARNING [train.py:1204] (2/4) Exclude cut with ID _64135_210_2253_1_1535677267876_7608972_552-337340-0_sp1.1 from training. Duration: 0.92725 2023-10-03 14:18:41,193 WARNING [train.py:1204] (2/4) Exclude cut with ID _67725_210_27767_1_1535608756671_5105359_10-12887-0_sp1.1 from training. Duration: 0.97275 2023-10-03 14:18:41,199 WARNING [train.py:1204] (2/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_401-284840-0_sp1.1 from training. Duration: 0.97275 2023-10-03 14:18:41,218 WARNING [train.py:1204] (2/4) Exclude cut with ID _44659_210_2721_1_1534036120061_6614860_483-141285-0_sp1.1 from training. Duration: 0.936375 2023-10-03 14:18:43,173 WARNING [train.py:1204] (2/4) Exclude cut with ID _9461_210_4557_1_1529060514726_3677451_358-332734-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 14:18:43,381 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.attention_skip_rate, batch_count=1296713.3333333333, ans=0.0 2023-10-03 14:18:44,501 WARNING [train.py:1204] (2/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_788-298907-0_sp1.1 from training. Duration: 0.8090625 2023-10-03 14:18:45,924 WARNING [train.py:1204] (2/4) Exclude cut with ID _39411_210_5986_1_1533538825030_3343649_354-267196-0_sp1.1 from training. Duration: 0.92725 2023-10-03 14:18:45,952 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_14953_210_2151_1_1530701710_5616165_192-309792-0_sp1.1 from training. Duration: 0.97275 2023-10-03 14:18:47,722 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.697e+02 1.879e+02 2.099e+02 2.263e+02 3.172e+02, threshold=4.197e+02, percent-clipped=0.0 2023-10-03 14:18:49,176 WARNING [train.py:1204] (2/4) Exclude cut with ID _59170_210_6721_1_1534983633960_4203335_290-151255-0_sp1.1 from training. Duration: 0.92725 2023-10-03 14:18:49,208 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_5575_210_1410_1_1527301305_2570170_249-93769-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 14:18:49,217 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_46013_210_7234_1_1533776362_3744032_463-52378-0_sp1.1 from training. Duration: 0.7818125 2023-10-03 14:18:54,844 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.1.encoder.layers.1.self_attn1.whiten, num_groups=1, num_channels=256, metric=11.30 vs. limit=22.5 2023-10-03 14:18:55,453 WARNING [train.py:1204] (2/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_580-93798-0 from training. Duration: 0.56 2023-10-03 14:18:56,769 WARNING [train.py:1204] (2/4) Exclude cut with ID _19514_210_9259_1_1531530071330_5084230_385-13762-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 14:18:56,774 WARNING [train.py:1204] (2/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_458-67321-0_sp1.1 from training. Duration: 0.8818125 2023-10-03 14:18:58,137 WARNING [train.py:1204] (2/4) Exclude cut with ID _41298_210_9019_1_1533430371450_4822143_188-154791-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 14:18:59,461 WARNING [train.py:1204] (2/4) Exclude cut with ID _15037_210_2253_1_1530752294104_3267650_296-192597-0_sp0.9 from training. Duration: 0.9 2023-10-03 14:18:59,673 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.1.bypass.skip_rate, batch_count=1296780.0, ans=0.035 2023-10-03 14:19:01,068 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.balancer1.prob, batch_count=1296780.0, ans=0.125 2023-10-03 14:19:01,068 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.bypass.scale_min, batch_count=1296780.0, ans=0.2 2023-10-03 14:19:03,784 WARNING [train.py:1204] (2/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_661-264857-0_sp1.1 from training. Duration: 0.8454375 2023-10-03 14:19:11,978 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_34008_210_10431_1_1533345424_8183247_662-317331-0_sp1.1 from training. Duration: 0.8545625 2023-10-03 14:19:12,015 WARNING [train.py:1204] (2/4) Exclude cut with ID _46502_210_13794_1_1533724452198_3657159_241-278329-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 14:19:12,016 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_11349_210_6753_1_1529800861_1101207_26-65602-0 from training. Duration: 0.96 2023-10-03 14:19:12,019 WARNING [train.py:1204] (2/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_448-4048-0_sp1.1 from training. Duration: 0.87275 2023-10-03 14:19:12,026 WARNING [train.py:1204] (2/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_662-275621-0_sp1.1 from training. Duration: 0.3818125 2023-10-03 14:19:12,185 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.conv_module2.balancer2.min_abs, batch_count=1296846.6666666667, ans=0.5 2023-10-03 14:19:12,217 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.bypass.scale_min, batch_count=1296846.6666666667, ans=0.2 2023-10-03 14:19:13,494 WARNING [train.py:1204] (2/4) Exclude cut with ID _13938_210_9988_1_1530518718978_3693960_256-54454-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 14:19:15,102 WARNING [train.py:1204] (2/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_256-32662-0 from training. Duration: 0.9 2023-10-03 14:19:16,938 WARNING [train.py:1204] (2/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_326-17061-0 from training. Duration: 0.94 2023-10-03 14:19:16,977 WARNING [train.py:1204] (2/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_505-97987-0_sp1.1 from training. Duration: 0.7818125 2023-10-03 14:19:18,432 WARNING [train.py:1204] (2/4) Exclude cut with ID _66873_210_5517_1_1535454136506_4293370_316-294364-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 14:19:19,728 WARNING [train.py:1204] (2/4) Exclude cut with ID _19514_210_9259_1_1531530071330_5084230_762-231063-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 14:19:19,793 WARNING [train.py:1204] (2/4) Exclude cut with ID _4697_210_2069_1_1526815801953_3823098_68-219807-0_sp0.9 from training. Duration: 0.52225 2023-10-03 14:19:21,009 WARNING [train.py:1204] (2/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_479-158053-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 14:19:23,625 WARNING [train.py:1204] (2/4) Exclude cut with ID _66112_210_10635_1_1535590236305_3801484_83-38181-0_sp1.1 from training. Duration: 0.936375 2023-10-03 14:19:23,811 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.attention_skip_rate, batch_count=1296846.6666666667, ans=0.0 2023-10-03 14:19:24,932 WARNING [train.py:1204] (2/4) Exclude cut with ID _55857_210_2346_1_1534644007831_6898209_255-146346-0_sp1.1 from training. Duration: 0.963625 2023-10-03 14:19:25,049 WARNING [train.py:1204] (2/4) Exclude cut with ID _48836_210_12210_1_1534075848293_3904150_429-35475-0 from training. Duration: 0.89 2023-10-03 14:19:25,068 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_50154_210_13731_1_1534750862_1080961_119-187163-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 14:19:26,460 WARNING [train.py:1204] (2/4) Exclude cut with ID _8793_210_6310_1_1528542985139_2310009_121-179782-0_sp0.9 from training. Duration: 0.888875 2023-10-03 14:19:26,463 WARNING [train.py:1204] (2/4) Exclude cut with ID _14884_210_2252_1_1530707459689_5561250_591-173406-0 from training. Duration: 0.96 2023-10-03 14:19:29,182 INFO [train.py:1046] (2/4) Epoch 37, batch 3300, loss[loss=0.1657, simple_loss=0.2505, pruned_loss=0.04043, over 24483.00 frames. ], tot_loss[loss=0.1577, simple_loss=0.2369, pruned_loss=0.03926, over 4711336.08 frames. ], batch size: 69, lr: 2.75e-03, grad_scale: 16.0 2023-10-03 14:19:29,318 WARNING [train.py:1204] (2/4) Exclude cut with ID _15133_210_6228_1_1530794512074_3080379_54-198193-0_sp1.1 from training. Duration: 0.8545625 2023-10-03 14:19:29,327 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6545_210_2040_1_1527774670_4245258_407-48070-0 from training. Duration: 0.76 2023-10-03 14:19:32,017 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1032-236863-0 from training. Duration: 0.84 2023-10-03 14:19:33,342 WARNING [train.py:1204] (2/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_507-303370-0 from training. Duration: 0.96 2023-10-03 14:19:33,362 WARNING [train.py:1204] (2/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_164-282366-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 14:19:36,215 WARNING [train.py:1204] (2/4) Exclude cut with ID _39867_210_9826_1_1533553222030_9344259_312-158727-0_sp1.1 from training. Duration: 0.963625 2023-10-03 14:19:37,609 WARNING [train.py:1204] (2/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_174-205058-0_sp1.1 from training. Duration: 0.863625 2023-10-03 14:19:37,631 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_11305_210_1794_1_1529750428_7178148_166-237358-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 14:19:39,059 WARNING [train.py:1204] (2/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_26-175553-0_sp1.1 from training. Duration: 0.5545625 2023-10-03 14:19:40,285 WARNING [train.py:1204] (2/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_96-290039-0_sp1.1 from training. Duration: 0.5090625 2023-10-03 14:19:43,486 WARNING [train.py:1204] (2/4) Exclude cut with ID _53219_210_4525_1_1534834836008_4064825_261-101691-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 14:19:44,887 WARNING [train.py:1204] (2/4) Exclude cut with ID _24271_210_12658_1_1531987270022_7190519_96-293390-0_sp1.1 from training. Duration: 0.936375 2023-10-03 14:19:47,606 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_271-234962-0 from training. Duration: 0.79 2023-10-03 14:19:48,941 WARNING [train.py:1204] (2/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_136-30660-0_sp1.1 from training. Duration: 0.97275 2023-10-03 14:19:48,971 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_20120_210_6753_1_1531643035_5482859_625-281046-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 14:19:50,799 WARNING [train.py:1204] (2/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_333-168193-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 14:19:52,651 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9029W0269-509664 from training. Duration: 0.896 2023-10-03 14:19:52,731 WARNING [train.py:1204] (2/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_450-147393-0_sp1.1 from training. Duration: 0.92725 2023-10-03 14:19:52,783 WARNING [train.py:1204] (2/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_643-52007-0_sp1.1 from training. Duration: 0.5181875 2023-10-03 14:19:53,828 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.5.encoder.layers.1.conv_module1.whiten, num_groups=1, num_channels=256, metric=3.35 vs. limit=15.0 2023-10-03 14:19:54,665 WARNING [train.py:1204] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_330-327861-0_sp0.9 from training. Duration: 0.7444375 2023-10-03 14:19:54,674 WARNING [train.py:1204] (2/4) Exclude cut with ID _47790_210_16187_1_1533862814066_4207519_260-317651-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 14:19:54,698 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9044W0010-516654 from training. Duration: 0.896 2023-10-03 14:19:57,713 INFO [scaling.py:1118] (2/4) WithLoss: name=encoder.encoders.1.encoder.layers.0.self_attn_weights, loss-sum=0.000e+00 2023-10-03 14:19:58,949 WARNING [train.py:1204] (2/4) Exclude cut with ID _14087_210_2253_1_1530529224512_3014029_265-118378-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 14:20:00,178 WARNING [train.py:1204] (2/4) Exclude cut with ID _6194_210_1534_1_1527511826160_3941194_321-148641-0_sp0.9 from training. Duration: 0.711125 2023-10-03 14:20:01,547 WARNING [train.py:1204] (2/4) Exclude cut with ID _54871_210_22356_1_1534665798253_3856359_101-169176-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 14:20:01,550 WARNING [train.py:1204] (2/4) Exclude cut with ID 497-129325-0061-62254-0_sp1.1 from training. Duration: 0.97725 2023-10-03 14:20:03,040 WARNING [train.py:1204] (2/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_795-33252-0_sp1.1 from training. Duration: 0.336375 2023-10-03 14:20:03,059 WARNING [train.py:1204] (2/4) Exclude cut with ID _28317_210_12742_1_1532480302381_8510325_168-246176-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 14:20:05,689 WARNING [train.py:1204] (2/4) Exclude cut with ID _7160_210_1876_1_1527940923231_3703799_120-150943-0_sp1.1 from training. Duration: 0.736375 2023-10-03 14:20:07,106 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9084W0005-536844 from training. Duration: 0.768 2023-10-03 14:20:08,625 WARNING [train.py:1204] (2/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_234-260682-0 from training. Duration: 0.84 2023-10-03 14:20:08,672 WARNING [train.py:1204] (2/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_188-114464-0_sp0.9 from training. Duration: 0.911125 2023-10-03 14:20:12,732 WARNING [train.py:1204] (2/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_81-75849-0 from training. Duration: 0.39 2023-10-03 14:20:14,137 WARNING [train.py:1204] (2/4) Exclude cut with ID _14224_210_4381_1_1530529062037_4264010_379-184223-0_sp1.1 from training. Duration: 0.77275 2023-10-03 14:20:17,487 WARNING [train.py:1204] (2/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_454-123002-0_sp1.1 from training. Duration: 0.62725 2023-10-03 14:20:17,536 WARNING [train.py:1204] (2/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_887-249555-0_sp1.1 from training. Duration: 0.7 2023-10-03 14:20:21,569 WARNING [train.py:1204] (2/4) Exclude cut with ID _31088_210_16446_1_1532520012284_3440919_20-260902-0_sp1.1 from training. Duration: 0.97275 2023-10-03 14:20:21,613 WARNING [train.py:1204] (2/4) Exclude cut with ID _45329_210_7005_1_1534157951211_6774339_856-344574-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 14:20:21,615 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_36756_210_7234_1_1533026991_966276_4-164118-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 14:20:21,637 WARNING [train.py:1204] (2/4) Exclude cut with ID _8365_210_2064_1_1528592311070_4677690_371-122295-0_sp1.1 from training. Duration: 0.5 2023-10-03 14:20:23,495 WARNING [train.py:1204] (2/4) Exclude cut with ID _5910_210_1389_1_1527337972454_5050100_555-159815-0_sp1.1 from training. Duration: 0.8909375 2023-10-03 14:20:23,507 WARNING [train.py:1204] (2/4) Exclude cut with ID _37086_210_9038_1_1532999540891_7021250_469-147941-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 14:20:24,824 WARNING [train.py:1204] (2/4) Exclude cut with ID _3067_210_2667_1_1526212974640_4503168_459-173867-0_sp0.9 from training. Duration: 0.97775 2023-10-03 14:20:26,814 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9156W0006-572499 from training. Duration: 0.896 2023-10-03 14:20:27,520 WARNING [train.py:1204] (2/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_277-210798-0 from training. Duration: 0.55 2023-10-03 14:20:28,958 WARNING [train.py:1204] (2/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_103-336064-0_sp1.1 from training. Duration: 0.463625 2023-10-03 14:20:30,231 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_3414_210_1988_1_1526212718_4813195_331-290945-0_sp1.1 from training. Duration: 0.936375 2023-10-03 14:20:30,232 WARNING [train.py:1204] (2/4) Exclude cut with ID _67499_210_17282_1_1535509805220_3692430_365-29276-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 14:20:31,696 WARNING [train.py:1204] (2/4) Exclude cut with ID _14821_210_8750_1_1530680503991_6829359_69-250847-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 14:20:31,698 WARNING [train.py:1204] (2/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_1102-61301-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 14:20:31,783 WARNING [train.py:1204] (2/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_166-246557-0_sp1.1 from training. Duration: 0.5818125 2023-10-03 14:20:33,093 WARNING [train.py:1204] (2/4) Exclude cut with ID _15351_210_10626_1_1530856635653_3576210_214-129918-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 14:20:34,482 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_120-41668-0_sp0.9 from training. Duration: 0.77775 2023-10-03 14:20:34,725 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.ff3_skip_rate, batch_count=1297180.0, ans=0.0 2023-10-03 14:20:35,830 WARNING [train.py:1204] (2/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_27-72278-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 14:20:35,937 WARNING [train.py:1204] (2/4) Exclude cut with ID _24314_210_4915_1_1532135078116_7170179_521-258868-0_sp1.1 from training. Duration: 0.7181875 2023-10-03 14:20:39,984 WARNING [train.py:1204] (2/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_143-86344-0 from training. Duration: 0.77 2023-10-03 14:20:40,018 WARNING [train.py:1204] (2/4) Exclude cut with ID _53489_210_6228_1_1534298357169_3835970_465-157433-0_sp1.1 from training. Duration: 0.92725 2023-10-03 14:20:41,390 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_248-319353-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 14:20:41,525 WARNING [train.py:1204] (2/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_102-77064-0_sp1.1 from training. Duration: 0.8454375 2023-10-03 14:20:41,669 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.1.ff3_skip_rate, batch_count=1297246.6666666667, ans=0.0 2023-10-03 14:20:42,807 INFO [train.py:1046] (2/4) Epoch 37, batch 3350, loss[loss=0.1722, simple_loss=0.2603, pruned_loss=0.04208, over 24315.00 frames. ], tot_loss[loss=0.1585, simple_loss=0.2381, pruned_loss=0.03945, over 4722354.37 frames. ], batch size: 77, lr: 2.75e-03, grad_scale: 16.0 2023-10-03 14:20:42,856 WARNING [train.py:1204] (2/4) Exclude cut with ID _9389_210_1664_1_1529217337301_5289269_379-128794-0_sp1.1 from training. Duration: 0.77275 2023-10-03 14:20:42,955 WARNING [train.py:1204] (2/4) Exclude cut with ID _3231_210_2106_1_1526205864934_6784220_236-260835-0_sp1.1 from training. Duration: 0.97275 2023-10-03 14:20:45,759 WARNING [train.py:1204] (2/4) Exclude cut with ID _24271_210_12658_1_1531987270022_7190519_184-342363-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 14:20:45,759 WARNING [train.py:1204] (2/4) Exclude cut with ID _27894_210_12392_1_1532415496078_6902439_67-221663-0_sp1.1 from training. Duration: 0.92725 2023-10-03 14:20:50,316 WARNING [train.py:1204] (2/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_304-59227-0_sp1.1 from training. Duration: 0.7 2023-10-03 14:20:51,713 WARNING [train.py:1204] (2/4) Exclude cut with ID _58605_210_12210_1_1534723195939_3904259_458-42232-0_sp1.1 from training. Duration: 0.92725 2023-10-03 14:20:53,103 WARNING [train.py:1204] (2/4) Exclude cut with ID _14922_210_6228_1_1530707435273_3693310_13-165512-0_sp0.9 from training. Duration: 0.911125 2023-10-03 14:20:56,718 WARNING [train.py:1204] (2/4) Exclude cut with ID _58602_210_2346_1_1534903210085_6908830_792-102241-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 14:20:58,779 WARNING [train.py:1204] (2/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_588-125165-0_sp0.9 from training. Duration: 0.92225 2023-10-03 14:21:00,195 WARNING [train.py:1204] (2/4) Exclude cut with ID _54218_210_12210_1_1534413646388_4409259_239-98429-0_sp1.1 from training. Duration: 0.97275 2023-10-03 14:21:01,500 WARNING [train.py:1204] (2/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_538-32291-0_sp1.1 from training. Duration: 0.7545625 2023-10-03 14:21:03,043 WARNING [train.py:1204] (2/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_419-317062-0 from training. Duration: 0.91 2023-10-03 14:21:03,149 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9309W0202-640329 from training. Duration: 0.896 2023-10-03 14:21:04,414 WARNING [train.py:1204] (2/4) Exclude cut with ID _3231_210_2106_1_1526205864934_6784220_419-313291-0_sp1.1 from training. Duration: 0.97275 2023-10-03 14:21:07,205 WARNING [train.py:1204] (2/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_619-344499-0 from training. Duration: 0.63 2023-10-03 14:21:07,219 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_278-272612-0 from training. Duration: 0.73 2023-10-03 14:21:08,622 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_103-84010-0_sp0.9 from training. Duration: 0.7666875 2023-10-03 14:21:08,643 WARNING [train.py:1204] (2/4) Exclude cut with ID _26528_210_9070_1_1532570410842_3851310_497-97218-0_sp1.1 from training. Duration: 0.7818125 2023-10-03 14:21:08,730 WARNING [train.py:1204] (2/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_262-84613-0_sp1.1 from training. Duration: 0.936375 2023-10-03 14:21:08,764 WARNING [train.py:1204] (2/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_387-131538-0 from training. Duration: 0.81 2023-10-03 14:21:10,040 WARNING [train.py:1204] (2/4) Exclude cut with ID _8030_210_7497_1_1528444886781_3531089_224-300489-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 14:21:10,062 WARNING [train.py:1204] (2/4) Exclude cut with ID _14349_210_8341_1_1530615710818_3442470_170-336541-0_sp0.9 from training. Duration: 0.9555625 2023-10-03 14:21:11,444 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_47069_210_15368_1_1533781267_4431320_180-31746-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 14:21:14,157 WARNING [train.py:1204] (2/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_447-7041-0_sp1.1 from training. Duration: 0.92725 2023-10-03 14:21:14,190 WARNING [train.py:1204] (2/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_2-55283-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 14:21:14,251 WARNING [train.py:1204] (2/4) Exclude cut with ID _12555_210_8750_1_1530075594402_3780239_70-181911-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 14:21:14,397 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.attention_skip_rate, batch_count=1297380.0, ans=0.0 2023-10-03 14:21:15,430 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.616e+02 1.884e+02 2.039e+02 2.303e+02 3.240e+02, threshold=4.079e+02, percent-clipped=0.0 2023-10-03 14:21:18,177 WARNING [train.py:1204] (2/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_311-328022-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 14:21:18,474 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.bypass.scale_min, batch_count=1297380.0, ans=0.2 2023-10-03 14:21:21,410 WARNING [train.py:1204] (2/4) Exclude cut with ID _14528_210_8254_1_1530669700514_4306939_330-2987-0_sp1.1 from training. Duration: 0.92725 2023-10-03 14:21:21,463 WARNING [train.py:1204] (2/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_385-184858-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 14:21:26,387 WARNING [train.py:1204] (2/4) Exclude cut with ID _64414_210_17301_1_1535243252048_3757759_348-333360-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 14:21:26,461 WARNING [train.py:1204] (2/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_753-214751-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 14:21:28,568 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_17613_210_2151_1_1531137569_2366160_202-208013-0_sp1.1 from training. Duration: 0.92725 2023-10-03 14:21:28,576 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_43831_210_2158_1_1533725502_5872791_610-129300-0_sp1.1 from training. Duration: 0.936375 2023-10-03 14:21:31,208 WARNING [train.py:1204] (2/4) Exclude cut with ID _54218_210_12210_1_1534413646388_4409259_321-23852-0_sp1.1 from training. Duration: 0.936375 2023-10-03 14:21:35,698 WARNING [train.py:1204] (2/4) Exclude cut with ID _39411_210_5986_1_1533538825030_3343649_173-238248-0 from training. Duration: 0.83 2023-10-03 14:21:35,706 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_405-339986-0_sp0.9 from training. Duration: 0.5666875 2023-10-03 14:21:35,744 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_2009_210_2071_1_1525481134_4588937_385-302100-0 from training. Duration: 0.87 2023-10-03 14:21:37,166 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_161-276996-0_sp1.1 from training. Duration: 0.863625 2023-10-03 14:21:38,481 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_177-58005-0 from training. Duration: 0.62 2023-10-03 14:21:38,552 WARNING [train.py:1204] (2/4) Exclude cut with ID _64639_210_5816_1_1535367619437_3528000_146-343734-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 14:21:39,934 WARNING [train.py:1204] (2/4) Exclude cut with ID _3845_210_1390_1_1526731326141_3523610_72-61441-0_sp1.1 from training. Duration: 0.92725 2023-10-03 14:21:44,262 WARNING [train.py:1204] (2/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_991-125028-0_sp1.1 from training. Duration: 0.936375 2023-10-03 14:21:44,319 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7544_210_6753_1_1528591327_1471426_172-348777-0 from training. Duration: 0.66 2023-10-03 14:21:45,723 WARNING [train.py:1204] (2/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_416-257730-0_sp1.1 from training. Duration: 0.5909375 2023-10-03 14:21:45,803 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1113-7369-0_sp1.1 from training. Duration: 0.77275 2023-10-03 14:21:47,150 WARNING [train.py:1204] (2/4) Exclude cut with ID _40402_210_18219_1_1533522324115_2953150_242-76943-0_sp1.1 from training. Duration: 0.8545625 2023-10-03 14:21:50,179 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.attention_skip_rate, batch_count=1297513.3333333333, ans=0.0 2023-10-03 14:21:52,910 WARNING [train.py:1204] (2/4) Exclude cut with ID _44718_210_5816_1_1534050051869_6469030_190-49648-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 14:21:53,127 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.conv_skip_rate, batch_count=1297513.3333333333, ans=0.0 2023-10-03 14:21:55,695 WARNING [train.py:1204] (2/4) Exclude cut with ID _47251_210_18628_1_1533814063128_4137250_214-88076-0 from training. Duration: 0.88 2023-10-03 14:21:55,737 WARNING [train.py:1204] (2/4) Exclude cut with ID _5585_210_2667_1_1527418813324_8007340_89-332072-0_sp1.1 from training. Duration: 0.5818125 2023-10-03 14:21:55,770 WARNING [train.py:1204] (2/4) Exclude cut with ID _41817_210_18835_1_1533434477160_7423049_555-293448-0_sp1.1 from training. Duration: 0.72725 2023-10-03 14:21:57,596 INFO [train.py:1046] (2/4) Epoch 37, batch 3400, loss[loss=0.1687, simple_loss=0.24, pruned_loss=0.04869, over 23747.00 frames. ], tot_loss[loss=0.1595, simple_loss=0.2393, pruned_loss=0.0399, over 4721338.17 frames. ], batch size: 164, lr: 2.75e-03, grad_scale: 16.0 2023-10-03 14:21:58,334 WARNING [train.py:1204] (2/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_286-331314-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 14:21:59,642 WARNING [train.py:1204] (2/4) Exclude cut with ID _13921_210_9994_1_1530838702800_3003429_46-149444-0 from training. Duration: 0.94 2023-10-03 14:22:01,455 WARNING [train.py:1204] (2/4) Exclude cut with ID _44778_210_2054_1_1533798127203_7443090_768-43595-0_sp1.1 from training. Duration: 0.936375 2023-10-03 14:22:01,464 WARNING [train.py:1204] (2/4) Exclude cut with ID _35355_210_16040_1_1533212988977_4016229_74-190076-0 from training. Duration: 0.8 2023-10-03 14:22:01,727 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.conv_skip_rate, batch_count=1297580.0, ans=0.0 2023-10-03 14:22:02,828 WARNING [train.py:1204] (2/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_1007-316380-0_sp1.1 from training. Duration: 0.963625 2023-10-03 14:22:02,885 WARNING [train.py:1204] (2/4) Exclude cut with ID _23557_210_8750_1_1531890106370_6846255_150-283437-0_sp1.1 from training. Duration: 0.963625 2023-10-03 14:22:04,164 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_3474_210_2028_1_1526188513_4825246_145-309131-0_sp1.1 from training. Duration: 0.67275 2023-10-03 14:22:05,467 WARNING [train.py:1204] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_391-22148-0_sp1.1 from training. Duration: 0.7 2023-10-03 14:22:05,491 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_238-256668-0 from training. Duration: 0.69 2023-10-03 14:22:09,723 WARNING [train.py:1204] (2/4) Exclude cut with ID _8394_210_6702_1_1528590504316_3540189_372-198878-0 from training. Duration: 0.6 2023-10-03 14:22:09,740 WARNING [train.py:1204] (2/4) Exclude cut with ID ID0174W0006-766659 from training. Duration: 0.953 2023-10-03 14:22:09,756 WARNING [train.py:1204] (2/4) Exclude cut with ID _6274_210_2084_1_1527937583837_7494020_668-23019-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 14:22:14,107 WARNING [train.py:1204] (2/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_322-74328-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 14:22:15,204 WARNING [train.py:1204] (2/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_872-264525-0_sp1.1 from training. Duration: 0.5909375 2023-10-03 14:22:15,262 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_53378_210_17282_1_1534221071_5769509_641-286201-0_sp1.1 from training. Duration: 0.97275 2023-10-03 14:22:17,861 WARNING [train.py:1204] (2/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_199-295611-0_sp1.1 from training. Duration: 0.5 2023-10-03 14:22:20,842 WARNING [train.py:1204] (2/4) Exclude cut with ID _5250_210_5219_1_1527249561806_8692110_1270-317971-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 14:22:21,298 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.5.encoder.layers.1.nonlin_attention.whiten2, num_groups=1, num_channels=256, metric=9.58 vs. limit=15.0 2023-10-03 14:22:22,886 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.4.encoder.layers.1.self_attn2.whiten, num_groups=1, num_channels=384, metric=11.07 vs. limit=22.5 2023-10-03 14:22:24,201 WARNING [train.py:1204] (2/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_784-213099-0 from training. Duration: 0.93 2023-10-03 14:22:28,994 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_115-233929-0_sp0.9 from training. Duration: 0.8 2023-10-03 14:22:30,350 WARNING [train.py:1204] (2/4) Exclude cut with ID _54871_210_22356_1_1534665798253_3856359_256-223809-0_sp1.1 from training. Duration: 0.97275 2023-10-03 14:22:30,388 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_20120_210_6753_1_1531643035_5482859_285-87781-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 14:22:31,070 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.2.encoder.layers.1.nonlin_attention.whiten1, num_groups=1, num_channels=288, metric=5.27 vs. limit=10.0 2023-10-03 14:22:32,394 WARNING [train.py:1204] (2/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_619-344499-0_sp1.1 from training. Duration: 0.57275 2023-10-03 14:22:38,382 WARNING [train.py:1204] (2/4) Exclude cut with ID _60229_210_27767_1_1534838406158_3655350_264-279573-0_sp0.9 from training. Duration: 0.92225 2023-10-03 14:22:41,295 WARNING [train.py:1204] (2/4) Exclude cut with ID _7719_210_3525_1_1528456153163_4296960_333-110399-0 from training. Duration: 0.96 2023-10-03 14:22:46,767 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_50423_210_13270_1_1534128598_5021734_759-90833-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 14:22:48,043 WARNING [train.py:1204] (2/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_151-254798-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 14:22:48,077 WARNING [train.py:1204] (2/4) Exclude cut with ID _3465_210_3525_1_1526175667060_3574049_450-203637-0 from training. Duration: 0.92 2023-10-03 14:22:48,107 WARNING [train.py:1204] (2/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_673-36456-0_sp1.1 from training. Duration: 0.936375 2023-10-03 14:22:48,147 WARNING [train.py:1204] (2/4) Exclude cut with ID _36538_210_9994_1_1533204020363_3795620_131-8685-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 14:22:49,513 WARNING [train.py:1204] (2/4) Exclude cut with ID _12556_210_8341_1_1530081024100_7327690_713-55626-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 14:22:50,864 WARNING [train.py:1204] (2/4) Exclude cut with ID _2322_210_2667_1_1525604548971_3656299_440-80494-0_sp0.9 from training. Duration: 0.97775 2023-10-03 14:22:52,481 WARNING [train.py:1204] (2/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_210-325081-0_sp1.1 from training. Duration: 0.97275 2023-10-03 14:22:52,691 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.nonlin_attention.balancer.prob, batch_count=1297780.0, ans=0.125 2023-10-03 14:22:55,855 WARNING [train.py:1204] (2/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_375-244976-0_sp0.9 from training. Duration: 0.8333125 2023-10-03 14:22:55,861 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_15517_210_6753_1_1530952293_1664795_119-31001-0_sp1.1 from training. Duration: 0.8545625 2023-10-03 14:22:57,646 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.bypass.scale_min, batch_count=1297846.6666666667, ans=0.2 2023-10-03 14:23:02,572 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_41452_210_7291_1_1533437687_3941281_319-42326-0_sp1.1 from training. Duration: 0.92725 2023-10-03 14:23:03,872 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_372-173012-0 from training. Duration: 0.95 2023-10-03 14:23:09,925 WARNING [train.py:1204] (2/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_41-31157-0_sp1.1 from training. Duration: 0.5181875 2023-10-03 14:23:12,536 INFO [train.py:1046] (2/4) Epoch 37, batch 3450, loss[loss=0.1817, simple_loss=0.2435, pruned_loss=0.05998, over 19609.00 frames. ], tot_loss[loss=0.1606, simple_loss=0.2403, pruned_loss=0.04042, over 4711727.22 frames. ], batch size: 388, lr: 2.75e-03, grad_scale: 16.0 2023-10-03 14:23:14,049 WARNING [train.py:1204] (2/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_91-212486-0 from training. Duration: 0.68 2023-10-03 14:23:15,145 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.0.layers.1.whiten, num_groups=1, num_channels=192, metric=3.78 vs. limit=12.0 2023-10-03 14:23:15,641 WARNING [train.py:1204] (2/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_1267-272693-0 from training. Duration: 0.73 2023-10-03 14:23:16,913 WARNING [train.py:1204] (2/4) Exclude cut with ID _13938_210_9988_1_1530518718978_3693960_152-67046-0_sp1.1 from training. Duration: 0.936375 2023-10-03 14:23:18,348 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_4528_210_3273_1_1527076654_3276777_342-168068-0_sp1.1 from training. Duration: 0.7909375 2023-10-03 14:23:18,359 WARNING [train.py:1204] (2/4) Exclude cut with ID _7606_210_4915_1_1528894253277_5080690_127-177761-0 from training. Duration: 0.64 2023-10-03 14:23:19,694 WARNING [train.py:1204] (2/4) Exclude cut with ID _45329_210_7005_1_1534157951211_6774339_335-137972-0_sp1.1 from training. Duration: 0.92725 2023-10-03 14:23:19,985 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.feed_forward1.hidden_balancer.prob, batch_count=1297913.3333333333, ans=0.125 2023-10-03 14:23:22,499 WARNING [train.py:1204] (2/4) Exclude cut with ID _5166_210_2084_1_1527073197160_4087993_331-117829-0_sp1.1 from training. Duration: 0.563625 2023-10-03 14:23:27,715 WARNING [train.py:1204] (2/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_125-125818-0_sp1.1 from training. Duration: 0.87275 2023-10-03 14:23:29,093 WARNING [train.py:1204] (2/4) Exclude cut with ID _6789_210_1534_1_1527854471671_2870519_48-294897-0_sp1.1 from training. Duration: 0.963625 2023-10-03 14:23:30,997 WARNING [train.py:1204] (2/4) Exclude cut with ID _6694_210_4599_1_1527852637490_3965529_116-109398-0_sp1.1 from training. Duration: 0.663625 2023-10-03 14:23:31,007 WARNING [train.py:1204] (2/4) Exclude cut with ID _52892_210_6721_1_1534378941259_4124129_222-172685-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 14:23:32,446 WARNING [train.py:1204] (2/4) Exclude cut with ID _50655_210_18628_1_1534229942193_3772154_320-132275-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 14:23:33,992 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.feed_forward1.out_proj.dropout_p, batch_count=1297980.0, ans=0.1 2023-10-03 14:23:38,537 WARNING [train.py:1204] (2/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_76-311963-0 from training. Duration: 0.7 2023-10-03 14:23:45,258 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.596e+02 1.909e+02 2.144e+02 2.320e+02 3.378e+02, threshold=4.287e+02, percent-clipped=0.0 2023-10-03 14:23:45,332 WARNING [train.py:1204] (2/4) Exclude cut with ID _6274_210_2084_1_1527937583837_7494020_32-205288-0 from training. Duration: 0.78 2023-10-03 14:23:45,350 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_86-16881-0_sp0.9 from training. Duration: 0.6444375 2023-10-03 14:23:45,392 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_271-297505-0_sp1.1 from training. Duration: 0.9 2023-10-03 14:23:45,489 WARNING [train.py:1204] (2/4) Exclude cut with ID _57572_210_5517_1_1534921219624_3809484_187-295293-0_sp1.1 from training. Duration: 0.963625 2023-10-03 14:23:52,302 WARNING [train.py:1204] (2/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_563-329943-0 from training. Duration: 0.99 2023-10-03 14:23:52,375 WARNING [train.py:1204] (2/4) Exclude cut with ID _7160_210_1876_1_1527940923231_3703799_302-150728-0_sp0.9 from training. Duration: 0.8666875 2023-10-03 14:23:56,576 WARNING [train.py:1204] (2/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_926-7526-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 14:23:56,606 WARNING [train.py:1204] (2/4) Exclude cut with ID _62574_210_2655_1_1535180407058_3933249_467-22348-0_sp1.1 from training. Duration: 0.936375 2023-10-03 14:23:57,626 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.0.layers.0.conv_module2.whiten, num_groups=1, num_channels=192, metric=8.73 vs. limit=15.0 2023-10-03 14:23:58,523 WARNING [train.py:1204] (2/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_280-161633-0_sp0.9 from training. Duration: 0.811125 2023-10-03 14:23:58,637 WARNING [train.py:1204] (2/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_385-193052-0_sp1.1 from training. Duration: 0.7818125 2023-10-03 14:24:00,644 WARNING [train.py:1204] (2/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_444-28041-0 from training. Duration: 0.85 2023-10-03 14:24:00,649 WARNING [train.py:1204] (2/4) Exclude cut with ID _66070_210_7640_1_1535441393096_7805310_341-249237-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 14:24:01,932 WARNING [train.py:1204] (2/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_425-269826-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 14:24:02,183 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.self_attn_weights.pos_emb_skip_rate, batch_count=1298113.3333333333, ans=0.0 2023-10-03 14:24:02,232 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.conv_module1.balancer2.min_positive, batch_count=1298113.3333333333, ans=0.05 2023-10-03 14:24:03,906 WARNING [train.py:1204] (2/4) Exclude cut with ID _53950_210_5816_1_1534417264919_3862951_124-185597-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 14:24:07,861 WARNING [train.py:1204] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_380-204756-0 from training. Duration: 0.86 2023-10-03 14:24:09,310 WARNING [train.py:1204] (2/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_282-257791-0_sp0.9 from training. Duration: 0.9555625 2023-10-03 14:24:13,501 WARNING [train.py:1204] (2/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_372-131163-0_sp1.1 from training. Duration: 0.8545625 2023-10-03 14:24:14,859 WARNING [train.py:1204] (2/4) Exclude cut with ID _28317_210_12742_1_1532480302381_8510325_447-233513-0_sp1.1 from training. Duration: 0.963625 2023-10-03 14:24:16,327 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_54-118333-0_sp1.1 from training. Duration: 0.97275 2023-10-03 14:24:19,301 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.conv_module2.balancer1.prob, batch_count=1298180.0, ans=0.125 2023-10-03 14:24:21,758 WARNING [train.py:1204] (2/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_388-230239-0_sp1.1 from training. Duration: 0.963625 2023-10-03 14:24:21,769 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_40047_210_2290_1_1533290519_2374311_141-54639-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 14:24:21,826 WARNING [train.py:1204] (2/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_91-267696-0_sp0.9 from training. Duration: 0.9666875 2023-10-03 14:24:23,045 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_14056_210_4163_1_1530604483_7572827_532-137847-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 14:24:26,299 INFO [train.py:1046] (2/4) Epoch 37, batch 3500, loss[loss=0.137, simple_loss=0.2187, pruned_loss=0.02764, over 24339.00 frames. ], tot_loss[loss=0.1598, simple_loss=0.2393, pruned_loss=0.04011, over 4713968.33 frames. ], batch size: 56, lr: 2.75e-03, grad_scale: 16.0 2023-10-03 14:24:27,063 WARNING [train.py:1204] (2/4) Exclude cut with ID _8064_210_5399_1_1528372299291_4282973_155-283231-0_sp1.1 from training. Duration: 0.97275 2023-10-03 14:24:31,163 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_2245_210_2715_1_1525511548_3540254_310-52483-0_sp1.1 from training. Duration: 0.8 2023-10-03 14:24:31,241 WARNING [train.py:1204] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_185-174805-0 from training. Duration: 0.68 2023-10-03 14:24:33,097 WARNING [train.py:1204] (2/4) Exclude cut with ID _15012_210_1968_1_1530856820429_5095350_662-210469-0_sp1.1 from training. Duration: 0.5181875 2023-10-03 14:24:35,850 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_426-313557-0_sp0.9 from training. Duration: 0.588875 2023-10-03 14:24:38,602 WARNING [train.py:1204] (2/4) Exclude cut with ID _42326_210_7770_1_1533814282531_3707269_407-204343-0_sp1.1 from training. Duration: 0.97275 2023-10-03 14:24:38,609 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_258-310570-0 from training. Duration: 0.82 2023-10-03 14:24:43,002 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_4737_210_2040_1_1526998689_3293008_378-178650-0_sp1.1 from training. Duration: 0.863625 2023-10-03 14:24:44,383 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_47896_210_20508_1_1534492518_5958868_432-327451-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 14:24:44,476 WARNING [train.py:1204] (2/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_188-114464-0_sp1.1 from training. Duration: 0.7454375 2023-10-03 14:24:44,483 WARNING [train.py:1204] (2/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_798-106179-0_sp1.1 from training. Duration: 0.7545625 2023-10-03 14:24:45,881 WARNING [train.py:1204] (2/4) Exclude cut with ID _14368_210_4220_1_1530532714870_3595529_143-117421-0_sp1.1 from training. Duration: 0.52725 2023-10-03 14:24:47,203 WARNING [train.py:1204] (2/4) Exclude cut with ID _67499_210_17282_1_1535509805220_3692430_361-78786-0_sp1.1 from training. Duration: 0.963625 2023-10-03 14:24:47,245 WARNING [train.py:1204] (2/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_712-342563-0_sp1.1 from training. Duration: 0.8818125 2023-10-03 14:24:47,302 WARNING [train.py:1204] (2/4) Exclude cut with ID _9235_210_5219_1_1528891154253_8046120_387-159008-0 from training. Duration: 0.86 2023-10-03 14:24:51,251 WARNING [train.py:1204] (2/4) Exclude cut with ID _33640_210_2655_1_1533117605479_4855469_959-18820-0_sp1.1 from training. Duration: 0.963625 2023-10-03 14:24:51,294 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_12868_210_6753_1_1530702479_3610564_133-564-0_sp0.9 from training. Duration: 0.87775 2023-10-03 14:24:52,712 WARNING [train.py:1204] (2/4) Exclude cut with ID _23514_210_1389_1_1531832139785_3031439_12-29287-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 14:24:56,152 WARNING [train.py:1204] (2/4) Exclude cut with ID _23514_210_1389_1_1531832139785_3031439_489-185052-0_sp1.1 from training. Duration: 0.963625 2023-10-03 14:24:58,062 WARNING [train.py:1204] (2/4) Exclude cut with ID _5444_210_2151_1_1527418808464_5312210_634-194174-0 from training. Duration: 0.97 2023-10-03 14:24:59,398 WARNING [train.py:1204] (2/4) Exclude cut with ID _6275_210_2084_1_1528110062213_7669049_91-26919-0_sp1.1 from training. Duration: 0.8818125 2023-10-03 14:25:01,498 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_37351_210_7925_1_1533012931_3812113_516-82303-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 14:25:02,917 WARNING [train.py:1204] (2/4) Exclude cut with ID _41314_210_12651_1_1533459476543_3766460_80-74184-0_sp0.9 from training. Duration: 0.92225 2023-10-03 14:25:03,007 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_9460_210_4211_1_1528954587_10049667_495-202963-0_sp1.1 from training. Duration: 0.963625 2023-10-03 14:25:06,253 WARNING [train.py:1204] (2/4) Exclude cut with ID _14632_210_9988_1_1530691279392_3923200_332-105904-0_sp1.1 from training. Duration: 0.8181875 2023-10-03 14:25:06,270 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_40764_210_1925_1_1533882359_3752345_493-167886-0_sp1.1 from training. Duration: 0.92725 2023-10-03 14:25:06,416 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.0.bypass_mid.scale_min, batch_count=1298380.0, ans=0.2 2023-10-03 14:25:08,785 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_196-289637-0 from training. Duration: 0.75 2023-10-03 14:25:10,141 WARNING [train.py:1204] (2/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_26-175553-0 from training. Duration: 0.61 2023-10-03 14:25:10,188 WARNING [train.py:1204] (2/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_890-5117-0 from training. Duration: 0.78 2023-10-03 14:25:10,229 WARNING [train.py:1204] (2/4) Exclude cut with ID _41314_210_12651_1_1533459476543_3766460_206-306860-0_sp1.1 from training. Duration: 0.92725 2023-10-03 14:25:11,646 WARNING [train.py:1204] (2/4) Exclude cut with ID _25882_210_3036_1_1532775665815_6479510_570-79824-0_sp1.1 from training. Duration: 0.963625 2023-10-03 14:25:11,712 WARNING [train.py:1204] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_731-2428-0_sp1.1 from training. Duration: 0.7545625 2023-10-03 14:25:11,732 WARNING [train.py:1204] (2/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_138-154812-0_sp0.9 from training. Duration: 0.8666875 2023-10-03 14:25:14,561 WARNING [train.py:1204] (2/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_178-39722-0_sp0.9 from training. Duration: 0.5444375 2023-10-03 14:25:14,629 WARNING [train.py:1204] (2/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_1285-256622-0_sp0.9 from training. Duration: 0.9333125 2023-10-03 14:25:18,704 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_41428_210_2161_1_1533774184_3992532_65-271323-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 14:25:20,166 WARNING [train.py:1204] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_308-161067-0 from training. Duration: 0.66 2023-10-03 14:25:20,171 WARNING [train.py:1204] (2/4) Exclude cut with ID _3067_210_2667_1_1526212974640_4503168_459-173867-0 from training. Duration: 0.88 2023-10-03 14:25:20,172 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_2221_210_1988_1_1525597051_4935541_415-296882-0_sp1.1 from training. Duration: 0.7 2023-10-03 14:25:20,378 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.feed_forward1.out_proj.dropout_p, batch_count=1298446.6666666667, ans=0.1 2023-10-03 14:25:24,815 WARNING [train.py:1204] (2/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_354-192266-0_sp1.1 from training. Duration: 0.936375 2023-10-03 14:25:24,886 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_258-310570-0_sp0.9 from training. Duration: 0.911125 2023-10-03 14:25:26,306 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_548-141039-0_sp1.1 from training. Duration: 0.963625 2023-10-03 14:25:28,914 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_15101_210_7927_1_1530786415_5033274_320-327527-0 from training. Duration: 0.96 2023-10-03 14:25:30,227 WARNING [train.py:1204] (2/4) Exclude cut with ID _2895_210_2477_1_1525956785794_4063750_289-11475-0_sp0.9 from training. Duration: 0.911125 2023-10-03 14:25:31,709 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7543_210_6753_1_1528543318_4552_1-85072-0_sp1.1 from training. Duration: 0.7545625 2023-10-03 14:25:31,795 WARNING [train.py:1204] (2/4) Exclude cut with ID _64639_210_5816_1_1535367619437_3528000_461-329156-0 from training. Duration: 0.96 2023-10-03 14:25:34,979 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_211-339622-0 from training. Duration: 0.45 2023-10-03 14:25:37,730 WARNING [train.py:1204] (2/4) Exclude cut with ID _20455_210_5219_1_1531565996775_8098100_402-112739-0_sp1.1 from training. Duration: 0.963625 2023-10-03 14:25:37,813 WARNING [train.py:1204] (2/4) Exclude cut with ID _53489_210_6228_1_1534298357169_3835970_256-115565-0_sp1.1 from training. Duration: 0.936375 2023-10-03 14:25:37,827 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_26326_210_7925_1_1533002493_3592406_360-305260-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 14:25:37,850 WARNING [train.py:1204] (2/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_18-204383-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 14:25:41,959 INFO [train.py:1046] (2/4) Epoch 37, batch 3550, loss[loss=0.1509, simple_loss=0.2283, pruned_loss=0.03681, over 18253.00 frames. ], tot_loss[loss=0.1584, simple_loss=0.238, pruned_loss=0.03943, over 4721516.63 frames. ], batch size: 39, lr: 2.75e-03, grad_scale: 16.0 2023-10-03 14:25:42,048 WARNING [train.py:1204] (2/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_306-232480-0_sp1.1 from training. Duration: 0.87275 2023-10-03 14:25:42,950 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.1.encoder.layers.1.feed_forward2.out_whiten, num_groups=1, num_channels=256, metric=3.68 vs. limit=15.0 2023-10-03 14:25:47,920 WARNING [train.py:1204] (2/4) Exclude cut with ID _41161_210_20034_1_1533431249657_3555314_116-299186-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 14:25:49,268 WARNING [train.py:1204] (2/4) Exclude cut with ID _13913_210_9994_1_1530579615187_3383897_473-277682-0_sp1.1 from training. Duration: 0.3545625 2023-10-03 14:25:53,708 WARNING [train.py:1204] (2/4) Exclude cut with ID _58895_210_8750_1_1535184081320_3649089_49-225702-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 14:25:53,778 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_3080_210_2071_1_1526086102_4365166_312-8726-0_sp1.1 from training. Duration: 0.82725 2023-10-03 14:25:53,914 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.0.conv_module1.balancer1.prob, batch_count=1298580.0, ans=0.125 2023-10-03 14:25:56,879 WARNING [train.py:1204] (2/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_489-190175-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 14:25:58,266 WARNING [train.py:1204] (2/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_345-95717-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 14:25:58,286 WARNING [train.py:1204] (2/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_85-138257-0_sp1.1 from training. Duration: 0.6454375 2023-10-03 14:25:58,585 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.attention_skip_rate, batch_count=1298646.6666666667, ans=0.0 2023-10-03 14:26:01,616 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7046_210_6753_1_1527895337_6920917_783-278109-0_sp1.1 from training. Duration: 0.7 2023-10-03 14:26:01,660 WARNING [train.py:1204] (2/4) Exclude cut with ID _3465_210_3525_1_1526175667060_3574049_450-203637-0_sp1.1 from training. Duration: 0.836375 2023-10-03 14:26:02,899 WARNING [train.py:1204] (2/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_416-132893-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 14:26:02,915 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_2655_210_2087_1_1525688959_3897400_437-337690-0_sp0.9 from training. Duration: 0.7 2023-10-03 14:26:04,904 WARNING [train.py:1204] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_700-183549-0_sp1.1 from training. Duration: 0.6181875 2023-10-03 14:26:06,574 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.feed_forward3.hidden_balancer.prob, batch_count=1298646.6666666667, ans=0.125 2023-10-03 14:26:09,149 WARNING [train.py:1204] (2/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_191-59784-0_sp1.1 from training. Duration: 0.8 2023-10-03 14:26:09,159 WARNING [train.py:1204] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_218-295097-0_sp1.1 from training. Duration: 0.7 2023-10-03 14:26:11,807 WARNING [train.py:1204] (2/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_310-109461-0_sp1.1 from training. Duration: 0.72725 2023-10-03 14:26:11,812 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_33348_210_5632_1_1533086912_3324030_131-321149-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 14:26:11,859 WARNING [train.py:1204] (2/4) Exclude cut with ID _64135_210_2253_1_1535677267876_7608972_133-188266-0_sp1.1 from training. Duration: 0.736375 2023-10-03 14:26:11,887 WARNING [train.py:1204] (2/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_505-205270-0 from training. Duration: 0.38 2023-10-03 14:26:12,038 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.bypass_mid.scale_min, batch_count=1298713.3333333333, ans=0.2 2023-10-03 14:26:13,323 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_67223_210_4238_1_1535612120_2537431_102-88248-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 14:26:14,582 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.534e+02 1.883e+02 2.070e+02 2.275e+02 3.078e+02, threshold=4.140e+02, percent-clipped=0.0 2023-10-03 14:26:14,684 WARNING [train.py:1204] (2/4) Exclude cut with ID _54871_210_22356_1_1534665798253_3856359_60-235433-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 14:26:14,770 WARNING [train.py:1204] (2/4) Exclude cut with ID _5189_210_4557_1_1527248319428_3769229_199-53526-0_sp1.1 from training. Duration: 0.3818125 2023-10-03 14:26:20,393 WARNING [train.py:1204] (2/4) Exclude cut with ID _45329_210_7005_1_1534157951211_6774339_439-90979-0_sp1.1 from training. Duration: 0.963625 2023-10-03 14:26:20,464 WARNING [train.py:1204] (2/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_1162-132880-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 14:26:21,308 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.1.encoder.layers.1.feed_forward3.out_whiten, num_groups=1, num_channels=256, metric=8.70 vs. limit=15.0 2023-10-03 14:26:21,815 WARNING [train.py:1204] (2/4) Exclude cut with ID _56423_210_5854_1_1534896061049_3165969_220-22317-0_sp1.1 from training. Duration: 0.963625 2023-10-03 14:26:23,195 WARNING [train.py:1204] (2/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_909-64151-0 from training. Duration: 0.94 2023-10-03 14:26:25,130 WARNING [train.py:1204] (2/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_465-156500-0_sp0.9 from training. Duration: 0.8 2023-10-03 14:26:26,439 WARNING [train.py:1204] (2/4) Exclude cut with ID _44304_210_15374_1_1533639352765_4613129_302-22084-0 from training. Duration: 0.86 2023-10-03 14:26:26,479 WARNING [train.py:1204] (2/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_663-166973-0_sp1.1 from training. Duration: 0.72725 2023-10-03 14:26:28,391 WARNING [train.py:1204] (2/4) Exclude cut with ID _45329_210_7005_1_1534157951211_6774339_503-287883-0_sp0.9 from training. Duration: 0.97775 2023-10-03 14:26:29,649 WARNING [train.py:1204] (2/4) Exclude cut with ID _60057_210_10640_1_1534903065793_3333150_362-230377-0_sp0.9 from training. Duration: 0.9666875 2023-10-03 14:26:31,786 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_4183_210_2087_1_1526729100_1869838_143-280962-0 from training. Duration: 0.56 2023-10-03 14:26:33,099 WARNING [train.py:1204] (2/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_281-1313-0_sp1.1 from training. Duration: 0.936375 2023-10-03 14:26:39,293 WARNING [train.py:1204] (2/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_64-215118-0_sp1.1 from training. Duration: 0.936375 2023-10-03 14:26:40,655 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_2822_210_2715_1_1526112586_1999587_165-36656-0 from training. Duration: 0.89 2023-10-03 14:26:40,714 WARNING [train.py:1204] (2/4) Exclude cut with ID _22961_210_8254_1_1531976366672_4278180_402-208830-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 14:26:44,805 WARNING [train.py:1204] (2/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_98-193494-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 14:26:44,906 WARNING [train.py:1204] (2/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_383-285196-0 from training. Duration: 0.97 2023-10-03 14:26:52,834 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6767_210_3273_1_1527855783_1718932_142-127132-0 from training. Duration: 0.95 2023-10-03 14:26:52,870 WARNING [train.py:1204] (2/4) Exclude cut with ID _39867_210_9826_1_1533553222030_9344259_1018-339635-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 14:26:52,933 WARNING [train.py:1204] (2/4) Exclude cut with ID _14603_210_4220_1_1530615688441_3097319_274-274798-0_sp1.1 from training. Duration: 0.8909375 2023-10-03 14:26:54,469 WARNING [train.py:1204] (2/4) Exclude cut with ID _2334_210_2667_1_1525608238880_4705129_376-263393-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 14:26:54,536 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder_embed.conv.8.prob, batch_count=1298913.3333333333, ans=0.125 2023-10-03 14:26:55,304 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.0.layers.0.nonlin_attention.whiten2, num_groups=1, num_channels=192, metric=5.07 vs. limit=15.0 2023-10-03 14:26:56,228 INFO [train.py:1046] (2/4) Epoch 37, batch 3600, loss[loss=0.1641, simple_loss=0.2344, pruned_loss=0.04688, over 23676.00 frames. ], tot_loss[loss=0.1583, simple_loss=0.2381, pruned_loss=0.03929, over 4717594.43 frames. ], batch size: 232, lr: 2.75e-03, grad_scale: 32.0 2023-10-03 14:26:56,297 WARNING [train.py:1204] (2/4) Exclude cut with ID _29303_210_2575_1_1532392237303_7410649_363-344675-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 14:26:56,384 WARNING [train.py:1204] (2/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_58-68956-0_sp1.1 from training. Duration: 0.7545625 2023-10-03 14:27:01,222 WARNING [train.py:1204] (2/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_709-311571-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 14:27:04,401 WARNING [train.py:1204] (2/4) Exclude cut with ID _46067_210_4220_1_1534125571066_3307450_136-257407-0_sp1.1 from training. Duration: 0.963625 2023-10-03 14:27:05,827 WARNING [train.py:1204] (2/4) Exclude cut with ID _6493_210_1747_1_1528459210424_4012470_324-323854-0_sp1.1 from training. Duration: 0.836375 2023-10-03 14:27:07,164 WARNING [train.py:1204] (2/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_234-260682-0_sp1.1 from training. Duration: 0.763625 2023-10-03 14:27:07,232 WARNING [train.py:1204] (2/4) Exclude cut with ID _36634_210_1794_1_1533296352450_1399884_55-119451-0_sp1.1 from training. Duration: 0.963625 2023-10-03 14:27:07,236 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_40047_210_2290_1_1533290519_2374311_195-165693-0 from training. Duration: 0.89 2023-10-03 14:27:07,788 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.3.encoder.layers.3.nonlin_attention.whiten2, num_groups=1, num_channels=512, metric=4.22 vs. limit=15.0 2023-10-03 14:27:13,108 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_5522_210_3273_1_1527249606_3909096_366-172508-0_sp0.9 from training. Duration: 0.7333125 2023-10-03 14:27:14,474 WARNING [train.py:1204] (2/4) Exclude cut with ID _21878_210_3525_1_1531738772002_7264330_187-10504-0_sp1.1 from training. Duration: 0.963625 2023-10-03 14:27:16,138 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.feed_forward2.hidden_balancer.prob, batch_count=1298980.0, ans=0.125 2023-10-03 14:27:17,329 WARNING [train.py:1204] (2/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_282-257791-0_sp1.1 from training. Duration: 0.7818125 2023-10-03 14:27:18,872 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_43831_210_2158_1_1533725502_5872791_467-288776-0_sp1.1 from training. Duration: 0.92725 2023-10-03 14:27:20,228 WARNING [train.py:1204] (2/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_138-154812-0_sp1.1 from training. Duration: 0.7090625 2023-10-03 14:27:21,542 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_1154-185930-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 14:27:21,567 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_2655_210_2087_1_1525688959_3897400_52-279899-0 from training. Duration: 0.69 2023-10-03 14:27:22,800 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_141-89113-0_sp1.1 from training. Duration: 0.7818125 2023-10-03 14:27:24,275 WARNING [train.py:1204] (2/4) Exclude cut with ID _55134_210_21982_1_1534590028377_3882170_297-47310-0_sp1.1 from training. Duration: 0.963625 2023-10-03 14:27:24,329 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_486-151131-0_sp1.1 from training. Duration: 0.77275 2023-10-03 14:27:25,803 WARNING [train.py:1204] (2/4) Exclude cut with ID _44778_210_2054_1_1533798127203_7443090_21-232980-0_sp1.1 from training. Duration: 0.936375 2023-10-03 14:27:27,251 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6165_210_3528_1_1527590982_4379841_338-106380-0_sp1.1 from training. Duration: 0.92725 2023-10-03 14:27:27,320 WARNING [train.py:1204] (2/4) Exclude cut with ID _55162_210_17071_1_1534408248583_3753480_355-257218-0_sp1.1 from training. Duration: 0.97275 2023-10-03 14:27:29,346 WARNING [train.py:1204] (2/4) Exclude cut with ID _2895_210_2477_1_1525956785794_4063750_289-11475-0 from training. Duration: 0.82 2023-10-03 14:27:37,177 WARNING [train.py:1204] (2/4) Exclude cut with ID _16756_210_4915_1_1531047803763_7104259_839-183431-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 14:27:37,279 WARNING [train.py:1204] (2/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_427-208901-0_sp0.9 from training. Duration: 0.7555625 2023-10-03 14:27:39,067 WARNING [train.py:1204] (2/4) Exclude cut with ID _36512_210_6987_1_1533033143998_3126069_181-306500-0 from training. Duration: 0.95 2023-10-03 14:27:43,295 WARNING [train.py:1204] (2/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_28-70987-0_sp1.1 from training. Duration: 0.7909375 2023-10-03 14:27:43,496 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.conv_module1.balancer1.prob, batch_count=1299113.3333333333, ans=0.125 2023-10-03 14:27:46,177 WARNING [train.py:1204] (2/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_590-58996-0_sp1.1 from training. Duration: 0.936375 2023-10-03 14:27:48,891 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_48730_210_5399_1_1533885886_2212769_108-47561-0_sp1.1 from training. Duration: 0.936375 2023-10-03 14:27:53,210 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.nonlin_attention.balancer.max_positive, batch_count=1299113.3333333333, ans=0.95 2023-10-03 14:27:54,239 WARNING [train.py:1204] (2/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_401-158239-0_sp0.9 from training. Duration: 0.788875 2023-10-03 14:27:54,261 WARNING [train.py:1204] (2/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_278-154492-0_sp1.1 from training. Duration: 0.6909375 2023-10-03 14:27:54,274 WARNING [train.py:1204] (2/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_137-116442-0 from training. Duration: 0.78 2023-10-03 14:27:55,724 WARNING [train.py:1204] (2/4) Exclude cut with ID _3541_210_2477_1_1526475408709_3263973_189-317158-0 from training. Duration: 0.9 2023-10-03 14:27:57,605 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_47896_210_20508_1_1534492518_5958868_558-314572-0 from training. Duration: 0.97 2023-10-03 14:27:59,085 WARNING [train.py:1204] (2/4) Exclude cut with ID _66070_210_7640_1_1535441393096_7805310_290-40284-0_sp1.1 from training. Duration: 0.97275 2023-10-03 14:28:00,410 WARNING [train.py:1204] (2/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_110-224688-0_sp1.1 from training. Duration: 0.8818125 2023-10-03 14:28:00,497 WARNING [train.py:1204] (2/4) Exclude cut with ID _7719_210_3525_1_1528456153163_4296960_88-250259-0 from training. Duration: 0.84 2023-10-03 14:28:00,545 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8453_210_4211_1_1528502940_8562730_1157-114102-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 14:28:02,317 WARNING [train.py:1204] (2/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_317-175062-0_sp1.1 from training. Duration: 0.7454375 2023-10-03 14:28:02,324 WARNING [train.py:1204] (2/4) Exclude cut with ID _29857_210_20034_1_1532395886753_3798160_254-347254-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 14:28:02,403 WARNING [train.py:1204] (2/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_28-70987-0 from training. Duration: 0.87 2023-10-03 14:28:03,273 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.balancer2.prob, batch_count=1299180.0, ans=0.125 2023-10-03 14:28:04,174 WARNING [train.py:1204] (2/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_321-335475-0 from training. Duration: 0.45 2023-10-03 14:28:05,802 WARNING [train.py:1204] (2/4) Exclude cut with ID _40901_210_5665_1_1533363789958_3856130_356-107330-0_sp1.1 from training. Duration: 0.936375 2023-10-03 14:28:05,863 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_3780_210_2087_1_1526467389_2145572_176-107140-0 from training. Duration: 0.69 2023-10-03 14:28:10,401 WARNING [train.py:1204] (2/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_80-135200-0 from training. Duration: 0.79 2023-10-03 14:28:10,639 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.conv_module1.balancer1.prob, batch_count=1299246.6666666667, ans=0.125 2023-10-03 14:28:11,740 INFO [train.py:1046] (2/4) Epoch 37, batch 3650, loss[loss=0.1606, simple_loss=0.2376, pruned_loss=0.04185, over 23783.00 frames. ], tot_loss[loss=0.1584, simple_loss=0.2382, pruned_loss=0.03926, over 4717451.75 frames. ], batch size: 212, lr: 2.75e-03, grad_scale: 32.0 2023-10-03 14:28:11,819 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_33348_210_5632_1_1533086912_3324030_85-28583-0_sp0.9 from training. Duration: 0.911125 2023-10-03 14:28:12,400 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.4.encoder.layers.1.self_attn2.whiten, num_groups=1, num_channels=384, metric=12.88 vs. limit=22.5 2023-10-03 14:28:14,514 WARNING [train.py:1204] (2/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_29-293896-0 from training. Duration: 0.61 2023-10-03 14:28:15,890 WARNING [train.py:1204] (2/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_194-252800-0 from training. Duration: 0.97 2023-10-03 14:28:17,625 INFO [scaling.py:1118] (2/4) WithLoss: name=encoder.encoders.1.encoder.layers.1.self_attn_weights, loss-sum=0.000e+00 2023-10-03 14:28:18,801 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_360-52446-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 14:28:18,803 WARNING [train.py:1204] (2/4) Exclude cut with ID _9389_210_1664_1_1529217337301_5289269_289-213417-0_sp0.9 from training. Duration: 0.988875 2023-10-03 14:28:18,850 WARNING [train.py:1204] (2/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_143-86344-0_sp0.9 from training. Duration: 0.8555625 2023-10-03 14:28:21,603 WARNING [train.py:1204] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_520-126729-0_sp0.9 from training. Duration: 0.77775 2023-10-03 14:28:22,904 WARNING [train.py:1204] (2/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_76-308663-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 14:28:22,967 WARNING [train.py:1204] (2/4) Exclude cut with ID _8021_210_7994_1_1528376451358_3653069_100-140231-0 from training. Duration: 0.67 2023-10-03 14:28:24,799 WARNING [train.py:1204] (2/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_387-131538-0_sp0.9 from training. Duration: 0.9 2023-10-03 14:28:26,146 WARNING [train.py:1204] (2/4) Exclude cut with ID _6275_210_2084_1_1528110062213_7669049_365-182054-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 14:28:26,201 WARNING [train.py:1204] (2/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_345-340690-0 from training. Duration: 0.93 2023-10-03 14:28:26,340 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.conv_module2.balancer1.prob, batch_count=1299313.3333333333, ans=0.125 2023-10-03 14:28:27,925 WARNING [train.py:1204] (2/4) Exclude cut with ID _5409_210_1388_1_1527292850189_5865191_178-127970-0_sp0.9 from training. Duration: 0.6333125 2023-10-03 14:28:29,296 WARNING [train.py:1204] (2/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_352-12734-0_sp1.1 from training. Duration: 0.92725 2023-10-03 14:28:29,299 WARNING [train.py:1204] (2/4) Exclude cut with ID _65134_210_14763_1_1535335393985_3505368_121-129373-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 14:28:30,840 WARNING [train.py:1204] (2/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_557-324258-0_sp1.1 from training. Duration: 0.736375 2023-10-03 14:28:32,439 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.out_combiner.scale_min, batch_count=1299313.3333333333, ans=0.2 2023-10-03 14:28:33,797 WARNING [train.py:1204] (2/4) Exclude cut with ID _48678_210_5663_1_1533988830947_3769199_306-255886-0 from training. Duration: 0.77 2023-10-03 14:28:35,673 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7544_210_6753_1_1528587000_691468_87-178599-0 from training. Duration: 0.87 2023-10-03 14:28:36,238 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.4.encoder.layers.1.self_attn1.whiten, num_groups=1, num_channels=384, metric=11.51 vs. limit=22.5 2023-10-03 14:28:37,566 WARNING [train.py:1204] (2/4) Exclude cut with ID _2941_210_2577_1_1526199673150_2489759_208-236402-0_sp1.1 from training. Duration: 0.963625 2023-10-03 14:28:40,207 WARNING [train.py:1204] (2/4) Exclude cut with ID _38670_210_15379_1_1533448810659_4391059_65-169397-0 from training. Duration: 0.97 2023-10-03 14:28:40,310 WARNING [train.py:1204] (2/4) Exclude cut with ID _30304_210_6319_1_1532476827261_7582060_306-328175-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 14:28:40,318 WARNING [train.py:1204] (2/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_212-149947-0_sp1.1 from training. Duration: 0.663625 2023-10-03 14:28:40,553 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.conv_module1.balancer1.prob, batch_count=1299380.0, ans=0.125 2023-10-03 14:28:42,201 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.4.encoder.layers.1.self_attn1.whiten, num_groups=1, num_channels=384, metric=11.71 vs. limit=22.5 2023-10-03 14:28:44,297 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.587e+02 1.976e+02 2.171e+02 2.481e+02 4.276e+02, threshold=4.341e+02, percent-clipped=1.0 2023-10-03 14:28:47,150 WARNING [train.py:1204] (2/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_321-22821-0_sp1.1 from training. Duration: 0.8090625 2023-10-03 14:28:48,534 WARNING [train.py:1204] (2/4) Exclude cut with ID _44010_210_4525_1_1533884262964_4013620_483-69110-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 14:28:48,543 WARNING [train.py:1204] (2/4) Exclude cut with ID _14922_210_6228_1_1530707435273_3693310_38-187851-0_sp1.1 from training. Duration: 0.563625 2023-10-03 14:28:50,025 WARNING [train.py:1204] (2/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_129-337802-0_sp0.9 from training. Duration: 0.8 2023-10-03 14:28:51,380 WARNING [train.py:1204] (2/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_268-87909-0_sp1.1 from training. Duration: 0.9 2023-10-03 14:28:54,072 WARNING [train.py:1204] (2/4) Exclude cut with ID _21878_210_3525_1_1531738772002_7264330_559-184862-0_sp1.1 from training. Duration: 0.8545625 2023-10-03 14:28:55,552 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_46032_210_14656_1_1533781412_4507969_237-120766-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 14:28:56,922 WARNING [train.py:1204] (2/4) Exclude cut with ID _28427_210_4220_1_1532581240967_3061069_330-170906-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 14:28:56,929 WARNING [train.py:1204] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_401-51578-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 14:28:58,425 WARNING [train.py:1204] (2/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_209-205365-0_sp1.1 from training. Duration: 0.7181875 2023-10-03 14:28:59,790 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_23801_210_16223_1_1532066392_3294657_57-242170-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 14:29:01,753 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_127-37371-0_sp1.1 from training. Duration: 0.936375 2023-10-03 14:29:10,343 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9171W0019-578905 from training. Duration: 0.896 2023-10-03 14:29:15,084 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_24605_210_1794_1_1532047863_4149755_349-49056-0_sp1.1 from training. Duration: 0.92725 2023-10-03 14:29:15,095 WARNING [train.py:1204] (2/4) Exclude cut with ID _12302_210_5219_1_1529924446466_8107280_897-21237-0_sp1.1 from training. Duration: 0.936375 2023-10-03 14:29:15,265 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.balancer1.prob, batch_count=1299513.3333333333, ans=0.125 2023-10-03 14:29:16,587 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_9460_210_4211_1_1528954587_10049667_480-260269-0_sp0.9 from training. Duration: 0.62225 2023-10-03 14:29:16,630 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_348-152548-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 14:29:16,742 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.0.conv_module1.balancer1.prob, batch_count=1299513.3333333333, ans=0.125 2023-10-03 14:29:18,001 WARNING [train.py:1204] (2/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_301-330455-0_sp0.9 from training. Duration: 0.888875 2023-10-03 14:29:19,419 WARNING [train.py:1204] (2/4) Exclude cut with ID _61008_210_10633_1_1534852769198_6974370_345-91705-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 14:29:20,851 WARNING [train.py:1204] (2/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_304-273433-0 from training. Duration: 0.99 2023-10-03 14:29:20,854 WARNING [train.py:1204] (2/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_359-345972-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 14:29:23,612 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_2296_210_2087_1_1525503448_3397335_57-185430-0_sp1.1 from training. Duration: 0.5454375 2023-10-03 14:29:24,813 INFO [train.py:1046] (2/4) Epoch 37, batch 3700, loss[loss=0.1608, simple_loss=0.2342, pruned_loss=0.04369, over 23614.00 frames. ], tot_loss[loss=0.1585, simple_loss=0.2381, pruned_loss=0.03944, over 4713653.92 frames. ], batch size: 232, lr: 2.74e-03, grad_scale: 32.0 2023-10-03 14:29:26,301 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_1256-120974-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 14:29:27,603 WARNING [train.py:1204] (2/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_372-30302-0_sp0.9 from training. Duration: 0.9555625 2023-10-03 14:29:29,274 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.ff2_skip_rate, batch_count=1299580.0, ans=0.0 2023-10-03 14:29:30,462 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_42012_210_7927_1_1533439936_3871556_451-344262-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 14:29:30,463 WARNING [train.py:1204] (2/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_28-324729-0 from training. Duration: 0.94 2023-10-03 14:29:30,475 WARNING [train.py:1204] (2/4) Exclude cut with ID _64135_210_2253_1_1535677267876_7608972_313-18394-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 14:29:31,841 WARNING [train.py:1204] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_620-276793-0_sp0.9 from training. Duration: 0.6555625 2023-10-03 14:29:33,191 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7544_210_6753_1_1528591327_1471426_172-348777-0_sp0.9 from training. Duration: 0.7333125 2023-10-03 14:29:34,736 WARNING [train.py:1204] (2/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_332-221057-0_sp0.9 from training. Duration: 0.7555625 2023-10-03 14:29:39,389 WARNING [train.py:1204] (2/4) Exclude cut with ID _19023_210_4353_1_1531378892552_5820780_5-300035-0_sp1.1 from training. Duration: 0.92725 2023-10-03 14:29:40,738 WARNING [train.py:1204] (2/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_343-309023-0_sp1.1 from training. Duration: 0.963625 2023-10-03 14:29:40,826 WARNING [train.py:1204] (2/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_37-41919-0_sp1.1 from training. Duration: 0.7909375 2023-10-03 14:29:42,047 WARNING [train.py:1204] (2/4) Exclude cut with ID _3541_210_2477_1_1526475408709_3263973_41-182910-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 14:29:42,110 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_229-45300-0_sp0.9 from training. Duration: 0.6333125 2023-10-03 14:29:45,409 WARNING [train.py:1204] (2/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_40-214716-0_sp1.1 from training. Duration: 0.963625 2023-10-03 14:29:47,336 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9309W0203-640330 from training. Duration: 0.896 2023-10-03 14:29:53,023 WARNING [train.py:1204] (2/4) Exclude cut with ID _60229_210_27767_1_1534838406158_3655350_264-279573-0_sp1.1 from training. Duration: 0.7545625 2023-10-03 14:29:54,306 WARNING [train.py:1204] (2/4) Exclude cut with ID _8394_210_6702_1_1528590504316_3540189_372-198878-0_sp0.9 from training. Duration: 0.6666875 2023-10-03 14:29:55,651 WARNING [train.py:1204] (2/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_359-118926-0_sp1.1 from training. Duration: 0.6545625 2023-10-03 14:29:55,662 WARNING [train.py:1204] (2/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_85-65056-0 from training. Duration: 0.96 2023-10-03 14:29:55,686 WARNING [train.py:1204] (2/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_89-242335-0_sp1.1 from training. Duration: 0.863625 2023-10-03 14:29:58,527 WARNING [train.py:1204] (2/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_128-49444-0_sp1.1 from training. Duration: 0.936375 2023-10-03 14:29:59,305 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.2.encoder.layers.2.feed_forward2.out_whiten, num_groups=1, num_channels=384, metric=4.13 vs. limit=15.0 2023-10-03 14:29:59,839 WARNING [train.py:1204] (2/4) Exclude cut with ID _30327_210_16049_1_1532484161295_3924169_401-70819-0 from training. Duration: 0.86 2023-10-03 14:29:59,920 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_41822_210_13270_1_1533955976_4379284_260-256391-0_sp1.1 from training. Duration: 0.936375 2023-10-03 14:30:01,453 WARNING [train.py:1204] (2/4) Exclude cut with ID _67335_210_5219_1_1535525975652_4275229_599-247317-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 14:30:02,884 WARNING [train.py:1204] (2/4) Exclude cut with ID _29857_210_20034_1_1532395886753_3798160_209-239267-0_sp1.1 from training. Duration: 0.936375 2023-10-03 14:30:04,196 WARNING [train.py:1204] (2/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_46-207599-0_sp1.1 from training. Duration: 0.5818125 2023-10-03 14:30:06,987 WARNING [train.py:1204] (2/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_912-282412-0_sp1.1 from training. Duration: 0.4454375 2023-10-03 14:30:10,305 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_4183_210_2087_1_1526729100_1869838_141-166600-0_sp1.1 from training. Duration: 0.863625 2023-10-03 14:30:10,316 WARNING [train.py:1204] (2/4) Exclude cut with ID _15037_210_2253_1_1530752294104_3267650_296-192597-0 from training. Duration: 0.81 2023-10-03 14:30:12,163 WARNING [train.py:1204] (2/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_571-220562-0_sp1.1 from training. Duration: 0.963625 2023-10-03 14:30:12,184 WARNING [train.py:1204] (2/4) Exclude cut with ID _5287_210_2084_1_1527418883040_3878670_12-221186-0 from training. Duration: 0.98 2023-10-03 14:30:18,321 WARNING [train.py:1204] (2/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_149-85602-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 14:30:18,344 WARNING [train.py:1204] (2/4) Exclude cut with ID _9235_210_5219_1_1528891154253_8046120_1003-101638-0_sp1.1 from training. Duration: 0.663625 2023-10-03 14:30:19,839 WARNING [train.py:1204] (2/4) Exclude cut with ID _55844_210_2346_1_1534471212569_7625707_1099-33689-0_sp1.1 from training. Duration: 0.92725 2023-10-03 14:30:21,170 WARNING [train.py:1204] (2/4) Exclude cut with ID _7606_210_4915_1_1528894253277_5080690_301-10732-0 from training. Duration: 0.81 2023-10-03 14:30:22,651 WARNING [train.py:1204] (2/4) Exclude cut with ID _22826_210_9994_1_1531908201522_3279169_403-34698-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 14:30:22,657 WARNING [train.py:1204] (2/4) Exclude cut with ID _8480_210_1874_1_1528589391205_6728330_755-112625-0_sp0.9 from training. Duration: 0.82225 2023-10-03 14:30:23,919 WARNING [train.py:1204] (2/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_527-238990-0_sp1.1 from training. Duration: 0.8181875 2023-10-03 14:30:23,954 WARNING [train.py:1204] (2/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_426-7328-0_sp1.1 from training. Duration: 0.92725 2023-10-03 14:30:24,098 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.bypass.scale_min, batch_count=1299846.6666666667, ans=0.2 2023-10-03 14:30:26,710 WARNING [train.py:1204] (2/4) Exclude cut with ID _3541_210_2477_1_1526475408709_3263973_189-317158-0_sp1.1 from training. Duration: 0.8181875 2023-10-03 14:30:28,125 WARNING [train.py:1204] (2/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_162-103620-0 from training. Duration: 0.93 2023-10-03 14:30:29,617 WARNING [train.py:1204] (2/4) Exclude cut with ID _22083_210_8254_1_1531702862439_3678970_49-234546-0 from training. Duration: 0.83 2023-10-03 14:30:30,934 WARNING [train.py:1204] (2/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_370-331600-0_sp1.1 from training. Duration: 0.8545625 2023-10-03 14:30:30,951 WARNING [train.py:1204] (2/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_166-59042-0_sp1.1 from training. Duration: 0.97275 2023-10-03 14:30:32,313 WARNING [train.py:1204] (2/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_138-332773-0_sp1.1 from training. Duration: 0.736375 2023-10-03 14:30:33,610 WARNING [train.py:1204] (2/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_90-30673-0_sp0.9 from training. Duration: 0.9333125 2023-10-03 14:30:35,190 WARNING [train.py:1204] (2/4) Exclude cut with ID _27967_210_12140_1_1532588391859_8419130_737-41898-0_sp1.1 from training. Duration: 0.936375 2023-10-03 14:30:36,522 WARNING [train.py:1204] (2/4) Exclude cut with ID _8833_210_5219_1_1528592665210_7553059_329-125156-0_sp1.1 from training. Duration: 0.7454375 2023-10-03 14:30:37,876 INFO [train.py:1046] (2/4) Epoch 37, batch 3750, loss[loss=0.1655, simple_loss=0.2366, pruned_loss=0.04716, over 23717.00 frames. ], tot_loss[loss=0.1594, simple_loss=0.2393, pruned_loss=0.03975, over 4732604.45 frames. ], batch size: 212, lr: 2.74e-03, grad_scale: 32.0 2023-10-03 14:30:37,986 WARNING [train.py:1204] (2/4) Exclude cut with ID _41817_210_18835_1_1533434477160_7423049_352-42537-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 14:30:39,973 WARNING [train.py:1204] (2/4) Exclude cut with ID _8365_210_2064_1_1528592311070_4677690_431-294102-0 from training. Duration: 0.89 2023-10-03 14:30:41,354 WARNING [train.py:1204] (2/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_454-314901-0_sp1.1 from training. Duration: 0.4181875 2023-10-03 14:30:43,532 INFO [scaling.py:1118] (2/4) WithLoss: name=encoder.encoders.5.encoder.layers.0.self_attn_weights, loss-sum=0.000e+00 2023-10-03 14:30:44,612 WARNING [train.py:1204] (2/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_80-135200-0_sp0.9 from training. Duration: 0.87775 2023-10-03 14:30:44,651 WARNING [train.py:1204] (2/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_181-78632-0 from training. Duration: 0.78 2023-10-03 14:30:46,529 WARNING [train.py:1204] (2/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_300-27235-0_sp0.9 from training. Duration: 0.9666875 2023-10-03 14:30:46,591 WARNING [train.py:1204] (2/4) Exclude cut with ID _47790_210_16187_1_1533862814066_4207519_96-177176-0_sp1.1 from training. Duration: 0.97275 2023-10-03 14:30:46,809 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.feed_forward1.out_proj.dropout_p, batch_count=1299913.3333333333, ans=0.1 2023-10-03 14:30:48,000 WARNING [train.py:1204] (2/4) Exclude cut with ID _35941_210_15379_1_1532995294515_4023089_246-348742-0_sp1.1 from training. Duration: 0.97275 2023-10-03 14:30:48,239 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.bypass.scale_min, batch_count=1299913.3333333333, ans=0.2 2023-10-03 14:30:49,372 WARNING [train.py:1204] (2/4) Exclude cut with ID _38670_210_15379_1_1533448810659_4391059_65-169397-0_sp1.1 from training. Duration: 0.8818125 2023-10-03 14:30:50,997 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.nonlin_attention.balancer.prob, batch_count=1299913.3333333333, ans=0.125 2023-10-03 14:30:52,309 WARNING [train.py:1204] (2/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_1003-99642-0_sp1.1 from training. Duration: 0.963625 2023-10-03 14:30:52,552 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=1299980.0, ans=0.1 2023-10-03 14:30:56,264 WARNING [train.py:1204] (2/4) Exclude cut with ID _40402_210_18219_1_1533522324115_2953150_320-14078-0_sp1.1 from training. Duration: 0.863625 2023-10-03 14:30:57,928 WARNING [train.py:1204] (2/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_367-61330-0_sp0.9 from training. Duration: 0.8333125 2023-10-03 14:31:00,594 WARNING [train.py:1204] (2/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_515-298514-0_sp1.1 from training. Duration: 0.92725 2023-10-03 14:31:02,033 WARNING [train.py:1204] (2/4) Exclude cut with ID _53731_210_7994_1_1534553979244_3757214_471-320815-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 14:31:03,401 WARNING [train.py:1204] (2/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_306-232480-0 from training. Duration: 0.96 2023-10-03 14:31:03,481 WARNING [train.py:1204] (2/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_478-162247-0_sp0.9 from training. Duration: 0.911125 2023-10-03 14:31:04,862 WARNING [train.py:1204] (2/4) Exclude cut with ID _56423_210_5854_1_1534896061049_3165969_345-284696-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 14:31:04,889 WARNING [train.py:1204] (2/4) Exclude cut with ID _44778_210_2054_1_1533798127203_7443090_600-122253-0_sp1.1 from training. Duration: 0.963625 2023-10-03 14:31:07,742 WARNING [train.py:1204] (2/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_199-295611-0 from training. Duration: 0.55 2023-10-03 14:31:10,850 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.557e+02 1.862e+02 2.050e+02 2.343e+02 3.351e+02, threshold=4.100e+02, percent-clipped=0.0 2023-10-03 14:31:11,016 WARNING [train.py:1204] (2/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_734-129761-0 from training. Duration: 0.82 2023-10-03 14:31:12,840 WARNING [train.py:1204] (2/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_349-169257-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 14:31:12,877 WARNING [train.py:1204] (2/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_202-67017-0_sp0.9 from training. Duration: 0.911125 2023-10-03 14:31:14,831 WARNING [train.py:1204] (2/4) Exclude cut with ID _16908_210_5724_1_1531119157248_3772700_61-258978-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 14:31:16,463 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=1300046.6666666667, ans=0.1 2023-10-03 14:31:20,132 WARNING [train.py:1204] (2/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_172-156931-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 14:31:20,237 WARNING [train.py:1204] (2/4) Exclude cut with ID _4092_210_2151_1_1526643044449_3135759_356-278694-0_sp1.1 from training. Duration: 0.42725 2023-10-03 14:31:25,704 WARNING [train.py:1204] (2/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_116-332008-0 from training. Duration: 0.98 2023-10-03 14:31:28,536 WARNING [train.py:1204] (2/4) Exclude cut with ID _29003_210_19596_1_1532507391720_3894098_358-114789-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 14:31:31,331 WARNING [train.py:1204] (2/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_152-212533-0_sp1.1 from training. Duration: 0.8818125 2023-10-03 14:31:31,395 WARNING [train.py:1204] (2/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_267-214116-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 14:31:35,506 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7543_210_6753_1_1528541907_1399322_165-323619-0_sp0.9 from training. Duration: 0.7666875 2023-10-03 14:31:37,087 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.ff2_skip_rate, batch_count=1300180.0, ans=0.0 2023-10-03 14:31:38,321 WARNING [train.py:1204] (2/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_577-332600-0_sp0.9 from training. Duration: 0.6333125 2023-10-03 14:31:40,308 WARNING [train.py:1204] (2/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_342-100157-0_sp0.9 from training. Duration: 0.6 2023-10-03 14:31:43,060 WARNING [train.py:1204] (2/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_141-89727-0_sp1.1 from training. Duration: 0.6454375 2023-10-03 14:31:43,298 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.conv_module2.balancer2.prob, batch_count=1300180.0, ans=0.125 2023-10-03 14:31:44,870 WARNING [train.py:1204] (2/4) Exclude cut with ID _3992_210_1747_1_1526644816031_3731060_107-251434-0_sp1.1 from training. Duration: 0.9 2023-10-03 14:31:47,799 WARNING [train.py:1204] (2/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_401-53423-0_sp1.1 from training. Duration: 0.5 2023-10-03 14:31:49,415 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.balancer_na.min_abs, batch_count=1300180.0, ans=0.02 2023-10-03 14:31:50,109 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.0.layers.1.nonlin_attention.whiten2, num_groups=1, num_channels=192, metric=5.79 vs. limit=15.0 2023-10-03 14:31:51,815 INFO [train.py:1046] (2/4) Epoch 37, batch 3800, loss[loss=0.143, simple_loss=0.2321, pruned_loss=0.02693, over 24471.00 frames. ], tot_loss[loss=0.1596, simple_loss=0.2397, pruned_loss=0.03979, over 4733076.90 frames. ], batch size: 63, lr: 2.74e-03, grad_scale: 16.0 2023-10-03 14:31:54,695 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7762_210_1794_1_1528199508_4057408_422-227034-0_sp1.1 from training. Duration: 0.663625 2023-10-03 14:31:57,773 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.feed_forward1.out_proj.dropout_p, batch_count=1300246.6666666667, ans=0.1 2023-10-03 14:31:58,868 WARNING [train.py:1204] (2/4) Exclude cut with ID _4184_210_2550_1_1526730192013_2219380_102-250299-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 14:32:00,178 WARNING [train.py:1204] (2/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_253-308377-0_sp0.9 from training. Duration: 0.5444375 2023-10-03 14:32:00,253 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_271-297505-0 from training. Duration: 0.99 2023-10-03 14:32:02,897 WARNING [train.py:1204] (2/4) Exclude cut with ID _35941_210_15379_1_1532995294515_4023089_213-142973-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 14:32:04,291 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7544_210_6753_1_1528587722_3576686_162-63018-0_sp1.1 from training. Duration: 0.92725 2023-10-03 14:32:04,362 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_365-217119-0_sp0.9 from training. Duration: 0.7 2023-10-03 14:32:07,199 WARNING [train.py:1204] (2/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_662-275621-0_sp0.9 from training. Duration: 0.4666875 2023-10-03 14:32:07,200 WARNING [train.py:1204] (2/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_374-187555-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 14:32:08,576 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_289-180904-0_sp1.1 from training. Duration: 0.6909375 2023-10-03 14:32:12,202 WARNING [train.py:1204] (2/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_189-47206-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 14:32:12,230 WARNING [train.py:1204] (2/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_234-14174-0_sp1.1 from training. Duration: 0.7454375 2023-10-03 14:32:12,267 WARNING [train.py:1204] (2/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_576-76621-0_sp1.1 from training. Duration: 0.963625 2023-10-03 14:32:12,423 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.feed_forward1.hidden_balancer.prob, batch_count=1300313.3333333333, ans=0.125 2023-10-03 14:32:12,468 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.attention_skip_rate, batch_count=1300313.3333333333, ans=0.0 2023-10-03 14:32:13,675 WARNING [train.py:1204] (2/4) Exclude cut with ID _13253_210_1925_1_1530757775819_3577873_253-246386-0 from training. Duration: 0.9 2023-10-03 14:32:16,420 WARNING [train.py:1204] (2/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_372-6869-0_sp1.1 from training. Duration: 0.4 2023-10-03 14:32:16,664 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=1300313.3333333333, ans=0.1 2023-10-03 14:32:17,725 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_21434_210_4238_1_1531916339_1222252_22-54954-0_sp1.1 from training. Duration: 0.936375 2023-10-03 14:32:20,833 WARNING [train.py:1204] (2/4) Exclude cut with ID _29853_210_7497_1_1532505600851_6642110_492-170361-0_sp1.1 from training. Duration: 0.92725 2023-10-03 14:32:23,628 WARNING [train.py:1204] (2/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_505-97987-0_sp0.9 from training. Duration: 0.9555625 2023-10-03 14:32:23,660 WARNING [train.py:1204] (2/4) Exclude cut with ID _8365_210_2064_1_1528592311070_4677690_371-122295-0_sp0.9 from training. Duration: 0.611125 2023-10-03 14:32:25,092 WARNING [train.py:1204] (2/4) Exclude cut with ID _14268_210_9206_1_1530622771285_4714099_637-86630-0_sp1.1 from training. Duration: 0.463625 2023-10-03 14:32:25,106 WARNING [train.py:1204] (2/4) Exclude cut with ID _29857_210_20034_1_1532395886753_3798160_372-5678-0_sp1.1 from training. Duration: 0.963625 2023-10-03 14:32:26,488 WARNING [train.py:1204] (2/4) Exclude cut with ID _52892_210_6721_1_1534378941259_4124129_90-165108-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 14:32:27,932 WARNING [train.py:1204] (2/4) Exclude cut with ID _27052_210_5986_1_1532329197187_3585009_143-339198-0_sp1.1 from training. Duration: 0.963625 2023-10-03 14:32:32,102 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.0.conv_module1.balancer2.prob, batch_count=1300380.0, ans=0.125 2023-10-03 14:32:32,405 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.5.encoder.layers.1.self_attn_weights.whiten_keys, num_groups=4, num_channels=128, metric=1.70 vs. limit=6.0 2023-10-03 14:32:33,308 WARNING [train.py:1204] (2/4) Exclude cut with ID _14268_210_9206_1_1530622771285_4714099_637-86630-0_sp0.9 from training. Duration: 0.5666875 2023-10-03 14:32:33,310 WARNING [train.py:1204] (2/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_401-255491-0 from training. Duration: 0.7 2023-10-03 14:32:34,782 WARNING [train.py:1204] (2/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_178-6946-0_sp1.1 from training. Duration: 0.97275 2023-10-03 14:32:39,248 WARNING [train.py:1204] (2/4) Exclude cut with ID _27485_210_4916_1_1532739191677_3668269_460-163885-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 14:32:39,469 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.bypass_mid.scale_min, batch_count=1300446.6666666667, ans=0.2 2023-10-03 14:32:43,996 WARNING [train.py:1204] (2/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_270-26053-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 14:32:47,093 WARNING [train.py:1204] (2/4) Exclude cut with ID _14170_210_5219_1_1530525949868_4469650_412-317077-0 from training. Duration: 0.75 2023-10-03 14:32:48,822 WARNING [train.py:1204] (2/4) Exclude cut with ID _5289_210_2084_1_1527678202736_3779299_142-103317-0 from training. Duration: 0.84 2023-10-03 14:32:48,885 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_150-143438-0_sp1.1 from training. Duration: 0.92725 2023-10-03 14:32:50,263 WARNING [train.py:1204] (2/4) Exclude cut with ID _27894_210_12392_1_1532415496078_6902439_668-269779-0_sp1.1 from training. Duration: 0.97275 2023-10-03 14:32:50,319 WARNING [train.py:1204] (2/4) Exclude cut with ID _18513_210_9206_1_1531618215223_3862620_56-222075-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 14:32:52,015 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.feed_forward2.hidden_balancer.prob, batch_count=1300513.3333333333, ans=0.125 2023-10-03 14:32:53,133 WARNING [train.py:1204] (2/4) Exclude cut with ID _4092_210_2151_1_1526643044449_3135759_356-278694-0 from training. Duration: 0.47 2023-10-03 14:32:55,826 WARNING [train.py:1204] (2/4) Exclude cut with ID _25808_210_13292_1_1532343514562_7366109_633-47524-0 from training. Duration: 0.99 2023-10-03 14:32:55,845 WARNING [train.py:1204] (2/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_872-264525-0 from training. Duration: 0.65 2023-10-03 14:32:55,882 WARNING [train.py:1204] (2/4) Exclude cut with ID _14632_210_9988_1_1530691279392_3923200_239-232545-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 14:32:57,227 WARNING [train.py:1204] (2/4) Exclude cut with ID _14349_210_8341_1_1530615710818_3442470_153-27432-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 14:33:03,882 INFO [train.py:1046] (2/4) Epoch 37, batch 3850, loss[loss=0.1457, simple_loss=0.2258, pruned_loss=0.03281, over 24495.00 frames. ], tot_loss[loss=0.1588, simple_loss=0.2386, pruned_loss=0.03945, over 4729074.55 frames. ], batch size: 63, lr: 2.74e-03, grad_scale: 16.0 2023-10-03 14:33:03,954 WARNING [train.py:1204] (2/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_37-41919-0_sp0.9 from training. Duration: 0.9666875 2023-10-03 14:33:04,026 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_196-289637-0_sp0.9 from training. Duration: 0.8333125 2023-10-03 14:33:08,324 WARNING [train.py:1204] (2/4) Exclude cut with ID _13678_210_7497_1_1530530069226_3463170_371-3221-0_sp1.1 from training. Duration: 0.8181875 2023-10-03 14:33:10,212 WARNING [train.py:1204] (2/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_557-324258-0 from training. Duration: 0.81 2023-10-03 14:33:10,293 WARNING [train.py:1204] (2/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_1012-279473-0_sp1.1 from training. Duration: 0.6090625 2023-10-03 14:33:11,638 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_66916_210_14656_1_1535725525_326566_7-92649-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 14:33:16,186 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_9460_210_4211_1_1528954587_10049667_480-260269-0_sp1.1 from training. Duration: 0.5090625 2023-10-03 14:33:16,957 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_40896_210_15710_1_1533433956_7100802_121-155001-0_sp1.1 from training. Duration: 0.92725 2023-10-03 14:33:17,126 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.conv_module2.balancer2.prob, batch_count=1300580.0, ans=0.125 2023-10-03 14:33:19,572 WARNING [train.py:1204] (2/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_188-171673-0_sp1.1 from training. Duration: 0.52725 2023-10-03 14:33:19,648 WARNING [train.py:1204] (2/4) Exclude cut with ID _47790_210_16187_1_1533862814066_4207519_2-39653-0 from training. Duration: 0.98 2023-10-03 14:33:23,835 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_23850_210_4238_1_1532232597_1632433_112-229012-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 14:33:26,401 WARNING [train.py:1204] (2/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_626-79528-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 14:33:29,113 WARNING [train.py:1204] (2/4) Exclude cut with ID _30304_210_6319_1_1532476827261_7582060_856-90052-0_sp1.1 from training. Duration: 0.963625 2023-10-03 14:33:29,151 WARNING [train.py:1204] (2/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_122-109023-0_sp0.9 from training. Duration: 0.8444375 2023-10-03 14:33:32,054 WARNING [train.py:1204] (2/4) Exclude cut with ID _64414_210_17301_1_1535243252048_3757759_37-70328-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 14:33:32,115 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_20120_210_6753_1_1531643035_5482859_622-344907-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 14:33:33,396 WARNING [train.py:1204] (2/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_410-321956-0_sp1.1 from training. Duration: 0.92725 2023-10-03 14:33:33,408 WARNING [train.py:1204] (2/4) Exclude cut with ID _7562_210_2084_1_1528887767916_3946229_186-326422-0_sp0.9 from training. Duration: 0.9333125 2023-10-03 14:33:33,499 WARNING [train.py:1204] (2/4) Exclude cut with ID _37537_210_2253_1_1533171758568_3421949_127-285192-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 14:33:35,443 WARNING [train.py:1204] (2/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_723-228168-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 14:33:36,783 WARNING [train.py:1204] (2/4) Exclude cut with ID _13678_210_7497_1_1530530069226_3463170_426-246984-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 14:33:36,800 WARNING [train.py:1204] (2/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_399-156791-0_sp0.9 from training. Duration: 0.92225 2023-10-03 14:33:38,013 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.546e+02 1.860e+02 2.056e+02 2.272e+02 4.240e+02, threshold=4.112e+02, percent-clipped=1.0 2023-10-03 14:33:38,133 WARNING [train.py:1204] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_391-22148-0 from training. Duration: 0.77 2023-10-03 14:33:38,163 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_3475_210_2028_1_1526193578_3622569_64-297072-0 from training. Duration: 0.91 2023-10-03 14:33:39,958 WARNING [train.py:1204] (2/4) Exclude cut with ID _17875_210_2087_1_1531224012907_1821339_225-280015-0_sp1.1 from training. Duration: 0.963625 2023-10-03 14:33:39,976 WARNING [train.py:1204] (2/4) Exclude cut with ID _50655_210_18628_1_1534229942193_3772154_325-246169-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 14:33:41,342 WARNING [train.py:1204] (2/4) Exclude cut with ID _29529_210_7234_1_1532393961455_3977460_597-257348-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 14:33:41,405 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder_embed.convnext.layerdrop_rate, batch_count=1300713.3333333333, ans=0.015 2023-10-03 14:33:42,743 WARNING [train.py:1204] (2/4) Exclude cut with ID _58895_210_8750_1_1535184081320_3649089_77-173485-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 14:33:42,763 WARNING [train.py:1204] (2/4) Exclude cut with ID _6270_210_2084_1_1527850911573_3856420_26-277343-0 from training. Duration: 0.82 2023-10-03 14:33:44,833 WARNING [train.py:1204] (2/4) Exclude cut with ID _14368_210_4220_1_1530532714870_3595529_144-3287-0 from training. Duration: 0.67 2023-10-03 14:33:45,738 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.0.layers.0.feed_forward2.out_whiten, num_groups=1, num_channels=192, metric=7.12 vs. limit=15.0 2023-10-03 14:33:46,235 WARNING [train.py:1204] (2/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_293-145278-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 14:33:47,627 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_40047_210_2290_1_1533290519_2374311_257-335162-0 from training. Duration: 0.95 2023-10-03 14:33:47,921 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.conv_module2.balancer1.prob, batch_count=1300780.0, ans=0.125 2023-10-03 14:33:49,009 WARNING [train.py:1204] (2/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_619-344499-0_sp0.9 from training. Duration: 0.7 2023-10-03 14:33:53,036 WARNING [train.py:1204] (2/4) Exclude cut with ID _19504_210_9994_1_1531648863242_3325220_456-315379-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 14:33:53,122 WARNING [train.py:1204] (2/4) Exclude cut with ID _55857_210_2346_1_1534644007831_6898209_631-221692-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 14:33:58,392 WARNING [train.py:1204] (2/4) Exclude cut with ID _64631_210_27767_1_1535349580068_4165160_324-3682-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 14:33:58,422 WARNING [train.py:1204] (2/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_317-211107-0 from training. Duration: 0.92 2023-10-03 14:34:01,372 WARNING [train.py:1204] (2/4) Exclude cut with ID _6789_210_1534_1_1527857937218_3118950_311-61262-0 from training. Duration: 0.65 2023-10-03 14:34:02,826 WARNING [train.py:1204] (2/4) Exclude cut with ID _28323_210_16187_1_1532480395667_4100300_365-68882-0_sp1.1 from training. Duration: 0.92725 2023-10-03 14:34:02,866 WARNING [train.py:1204] (2/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_816-181825-0_sp1.1 from training. Duration: 0.92725 2023-10-03 14:34:06,663 WARNING [train.py:1204] (2/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_132-269633-0_sp1.1 from training. Duration: 0.6181875 2023-10-03 14:34:06,682 WARNING [train.py:1204] (2/4) Exclude cut with ID _44585_210_18628_1_1533605350387_4518199_280-73534-0_sp1.1 from training. Duration: 0.8090625 2023-10-03 14:34:08,008 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_68095_210_20185_1_1535718226_8888855_1009-164755-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 14:34:08,084 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_40223_210_6228_1_1533298404_4812267_400-103813-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 14:34:09,252 WARNING [train.py:1204] (2/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_491-261923-0_sp1.1 from training. Duration: 0.8909375 2023-10-03 14:34:09,258 WARNING [train.py:1204] (2/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_712-342563-0 from training. Duration: 0.97 2023-10-03 14:34:09,337 WARNING [train.py:1204] (2/4) Exclude cut with ID _41161_210_20034_1_1533431249657_3555314_175-190032-0_sp1.1 from training. Duration: 0.963625 2023-10-03 14:34:11,991 WARNING [train.py:1204] (2/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_788-298907-0 from training. Duration: 0.89 2023-10-03 14:34:12,019 WARNING [train.py:1204] (2/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_225-70961-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 14:34:12,025 WARNING [train.py:1204] (2/4) Exclude cut with ID _58583_210_5236_1_1534726835266_3574249_235-175994-0_sp1.1 from training. Duration: 0.92725 2023-10-03 14:34:15,344 WARNING [train.py:1204] (2/4) Exclude cut with ID _12302_210_5219_1_1529924446466_8107280_907-182949-0_sp1.1 from training. Duration: 0.836375 2023-10-03 14:34:15,399 WARNING [train.py:1204] (2/4) Exclude cut with ID _54871_210_22356_1_1534665798253_3856359_94-193692-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 14:34:16,857 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_4528_210_3273_1_1527076654_3276777_342-168068-0_sp0.9 from training. Duration: 0.9666875 2023-10-03 14:34:18,203 INFO [train.py:1046] (2/4) Epoch 37, batch 3900, loss[loss=0.1458, simple_loss=0.2337, pruned_loss=0.02892, over 24653.00 frames. ], tot_loss[loss=0.1577, simple_loss=0.2371, pruned_loss=0.03921, over 4730615.38 frames. ], batch size: 73, lr: 2.74e-03, grad_scale: 8.0 2023-10-03 14:34:18,310 WARNING [train.py:1204] (2/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_368-74239-0_sp1.1 from training. Duration: 0.92725 2023-10-03 14:34:18,311 WARNING [train.py:1204] (2/4) Exclude cut with ID _44831_210_13427_1_1533816011549_3689035_291-295928-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 14:34:19,641 WARNING [train.py:1204] (2/4) Exclude cut with ID _56406_210_9288_1_1534899618664_3496149_455-131997-0_sp1.1 from training. Duration: 0.97275 2023-10-03 14:34:19,647 WARNING [train.py:1204] (2/4) Exclude cut with ID _6473_210_1741_1_1527901307396_3815380_108-17335-0 from training. Duration: 0.65 2023-10-03 14:34:20,930 WARNING [train.py:1204] (2/4) Exclude cut with ID _14224_210_4381_1_1530529062037_4264010_286-135600-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 14:34:23,703 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_14056_210_4163_1_1530604483_7572827_928-321675-0_sp1.1 from training. Duration: 0.936375 2023-10-03 14:34:26,404 WARNING [train.py:1204] (2/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_81-300212-0_sp1.1 from training. Duration: 0.5454375 2023-10-03 14:34:26,452 WARNING [train.py:1204] (2/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_29-280622-0_sp0.9 from training. Duration: 0.911125 2023-10-03 14:34:27,795 WARNING [train.py:1204] (2/4) Exclude cut with ID _61079_210_4525_1_1534921008198_4011419_228-176447-0_sp1.1 from training. Duration: 0.936375 2023-10-03 14:34:28,085 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.bypass.scale_min, batch_count=1300913.3333333333, ans=0.2 2023-10-03 14:34:29,457 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.conv_module2.balancer2.prob, batch_count=1300913.3333333333, ans=0.125 2023-10-03 14:34:30,566 WARNING [train.py:1204] (2/4) Exclude cut with ID _8394_210_6702_1_1528590504316_3540189_372-198878-0_sp1.1 from training. Duration: 0.5454375 2023-10-03 14:34:30,578 WARNING [train.py:1204] (2/4) Exclude cut with ID _64521_210_14699_1_1535243427259_3815328_244-208725-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 14:34:30,860 INFO [scaling.py:1118] (2/4) WithLoss: name=encoder.encoders.1.encoder.layers.0.self_attn_weights, loss-sum=0.000e+00 2023-10-03 14:34:31,970 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8275_210_4198_1_1528455320_3892571_491-17707-0_sp0.9 from training. Duration: 0.711125 2023-10-03 14:34:33,346 WARNING [train.py:1204] (2/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_375-245677-0 from training. Duration: 0.81 2023-10-03 14:34:33,354 WARNING [train.py:1204] (2/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_690-286055-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 14:34:34,839 WARNING [train.py:1204] (2/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_89-321955-0 from training. Duration: 0.8 2023-10-03 14:34:34,884 WARNING [train.py:1204] (2/4) Exclude cut with ID _18895_210_6228_1_1531357274234_3784930_213-98721-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 14:34:36,336 WARNING [train.py:1204] (2/4) Exclude cut with ID _19708_210_9206_1_1532136650733_4238364_347-65240-0 from training. Duration: 0.97 2023-10-03 14:34:37,883 INFO [scaling.py:1118] (2/4) WithLoss: name=encoder.encoders.3.encoder.layers.0.self_attn_weights, loss-sum=0.000e+00 2023-10-03 14:34:39,362 WARNING [train.py:1204] (2/4) Exclude cut with ID _64135_210_2253_1_1535677267876_7608972_498-5039-0 from training. Duration: 0.78 2023-10-03 14:34:43,577 WARNING [train.py:1204] (2/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_146-276944-0_sp1.1 from training. Duration: 0.7545625 2023-10-03 14:34:43,655 WARNING [train.py:1204] (2/4) Exclude cut with ID _63171_210_15488_1_1535104792794_3295110_155-34488-0_sp1.1 from training. Duration: 0.97275 2023-10-03 14:34:43,674 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_5438_210_1794_1_1527336687_2600009_233-58072-0_sp0.9 from training. Duration: 0.8555625 2023-10-03 14:34:43,722 WARNING [train.py:1204] (2/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_63-101996-0_sp0.9 from training. Duration: 0.97775 2023-10-03 14:34:50,197 WARNING [train.py:1204] (2/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_271-269084-0_sp1.1 from training. Duration: 0.7545625 2023-10-03 14:34:50,431 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.feed_forward3.hidden_balancer.prob, batch_count=1301046.6666666667, ans=0.125 2023-10-03 14:34:51,569 WARNING [train.py:1204] (2/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_345-167986-0_sp1.1 from training. Duration: 0.763625 2023-10-03 14:34:54,203 WARNING [train.py:1204] (2/4) Exclude cut with ID _39290_210_4220_1_1533862778760_3075118_92-311600-0_sp1.1 from training. Duration: 0.8 2023-10-03 14:34:55,433 WARNING [train.py:1204] (2/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_209-65306-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 14:34:55,516 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_26433_210_12108_1_1532157158_4056480_317-335807-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 14:34:55,645 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=1301046.6666666667, ans=0.1 2023-10-03 14:35:01,287 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.feed_forward3.hidden_balancer.prob, batch_count=1301113.3333333333, ans=0.125 2023-10-03 14:35:02,474 WARNING [train.py:1204] (2/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_238-94127-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 14:35:02,582 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.1.feed_forward1.out_proj.dropout_p, batch_count=1301113.3333333333, ans=0.1 2023-10-03 14:35:03,761 WARNING [train.py:1204] (2/4) Exclude cut with ID _41314_210_12651_1_1533459476543_3766460_207-132038-0_sp1.1 from training. Duration: 0.8545625 2023-10-03 14:35:08,209 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.conv_module1.balancer1.prob, batch_count=1301113.3333333333, ans=0.125 2023-10-03 14:35:09,352 WARNING [train.py:1204] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_167-137255-0_sp1.1 from training. Duration: 0.5818125 2023-10-03 14:35:10,626 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_2009_210_2071_1_1525481134_4588937_385-302100-0_sp0.9 from training. Duration: 0.9666875 2023-10-03 14:35:21,523 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_15517_210_6753_1_1530954008_3487336_253-110617-0_sp1.1 from training. Duration: 0.936375 2023-10-03 14:35:23,519 WARNING [train.py:1204] (2/4) Exclude cut with ID _39290_210_4220_1_1533862778760_3075118_92-311600-0_sp0.9 from training. Duration: 0.97775 2023-10-03 14:35:23,647 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.bypass_mid.scale_min, batch_count=1301180.0, ans=0.2 2023-10-03 14:35:24,842 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_42-142473-0 from training. Duration: 0.72 2023-10-03 14:35:24,883 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_2296_210_2087_1_1525503448_3397335_57-185430-0 from training. Duration: 0.6 2023-10-03 14:35:26,117 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_11305_210_1794_1_1529750428_7178148_257-312870-0_sp0.9 from training. Duration: 0.97775 2023-10-03 14:35:26,228 WARNING [train.py:1204] (2/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_222-198127-0 from training. Duration: 0.84 2023-10-03 14:35:27,610 WARNING [train.py:1204] (2/4) Exclude cut with ID _65263_210_22356_1_1535763598691_3921037_423-202699-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 14:35:28,950 WARNING [train.py:1204] (2/4) Exclude cut with ID _15037_210_2253_1_1530752294104_3267650_13-130361-0 from training. Duration: 0.99 2023-10-03 14:35:30,265 INFO [train.py:1046] (2/4) Epoch 37, batch 3950, loss[loss=0.1577, simple_loss=0.2364, pruned_loss=0.03947, over 23688.00 frames. ], tot_loss[loss=0.158, simple_loss=0.2374, pruned_loss=0.03924, over 4736277.18 frames. ], batch size: 232, lr: 2.74e-03, grad_scale: 8.0 2023-10-03 14:35:35,894 WARNING [train.py:1204] (2/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_628-198080-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 14:35:37,290 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_3171_210_2087_1_1526109084_2330007_37-64334-0 from training. Duration: 0.82 2023-10-03 14:35:37,351 WARNING [train.py:1204] (2/4) Exclude cut with ID _5287_210_2084_1_1527418883040_3878670_12-221186-0_sp1.1 from training. Duration: 0.8909375 2023-10-03 14:35:40,118 WARNING [train.py:1204] (2/4) Exclude cut with ID _6653_210_2629_1_1527919172441_6732679_1103-56445-0_sp1.1 from training. Duration: 0.663625 2023-10-03 14:35:41,504 WARNING [train.py:1204] (2/4) Exclude cut with ID _65263_210_22356_1_1535763598691_3921037_169-140068-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 14:35:46,892 WARNING [train.py:1204] (2/4) Exclude cut with ID IC0671W0118-331961 from training. Duration: 0.896 2023-10-03 14:35:48,208 WARNING [train.py:1204] (2/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_402-102311-0_sp1.1 from training. Duration: 0.6090625 2023-10-03 14:35:48,234 WARNING [train.py:1204] (2/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_202-293523-0 from training. Duration: 0.72 2023-10-03 14:35:48,979 WARNING [train.py:1204] (2/4) Exclude cut with ID IC0679W0038-335846 from training. Duration: 0.768 2023-10-03 14:35:50,137 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_37357_210_17024_1_1533089504_2316121_79-198866-0_sp1.1 from training. Duration: 0.936375 2023-10-03 14:35:52,929 WARNING [train.py:1204] (2/4) Exclude cut with ID _7606_210_4915_1_1528894253277_5080690_292-271285-0_sp1.1 from training. Duration: 0.963625 2023-10-03 14:35:52,947 WARNING [train.py:1204] (2/4) Exclude cut with ID _3541_210_2477_1_1526475408709_3263973_377-296839-0_sp0.9 from training. Duration: 0.611125 2023-10-03 14:35:52,950 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_45886_210_17024_1_1533704917_2435333_58-131069-0_sp1.1 from training. Duration: 0.963625 2023-10-03 14:35:54,447 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6961_210_2087_1_1527937168_6815011_395-185060-0 from training. Duration: 0.95 2023-10-03 14:35:57,809 WARNING [train.py:1204] (2/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_304-266479-0_sp1.1 from training. Duration: 0.97275 2023-10-03 14:35:59,113 WARNING [train.py:1204] (2/4) Exclude cut with ID _3867_210_1883_1_1526641200575_4475830_319-349850-0_sp1.1 from training. Duration: 0.6090625 2023-10-03 14:35:59,122 WARNING [train.py:1204] (2/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_304-59227-0_sp0.9 from training. Duration: 0.8555625 2023-10-03 14:35:59,197 WARNING [train.py:1204] (2/4) Exclude cut with ID _8021_210_7994_1_1528376451358_3653069_100-140231-0_sp0.9 from training. Duration: 0.7444375 2023-10-03 14:36:00,502 WARNING [train.py:1204] (2/4) Exclude cut with ID _8833_210_5219_1_1528592665210_7553059_329-125156-0_sp0.9 from training. Duration: 0.911125 2023-10-03 14:36:00,776 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.conv_skip_rate, batch_count=1301380.0, ans=0.0 2023-10-03 14:36:05,920 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.526e+02 1.893e+02 2.072e+02 2.243e+02 3.144e+02, threshold=4.143e+02, percent-clipped=0.0 2023-10-03 14:36:11,596 WARNING [train.py:1204] (2/4) Exclude cut with ID _14459_210_2228_1_1531443641946_7516700_792-287234-0_sp1.1 from training. Duration: 0.92725 2023-10-03 14:36:11,613 WARNING [train.py:1204] (2/4) Exclude cut with ID _19708_210_9206_1_1532136650733_4238364_347-65240-0_sp1.1 from training. Duration: 0.8818125 2023-10-03 14:36:18,075 WARNING [train.py:1204] (2/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_70-27732-0 from training. Duration: 0.94 2023-10-03 14:36:21,506 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.feed_forward3.hidden_balancer.prob, batch_count=1301446.6666666667, ans=0.125 2023-10-03 14:36:24,139 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_397-132565-0 from training. Duration: 0.99 2023-10-03 14:36:24,142 WARNING [train.py:1204] (2/4) Exclude cut with ID _49728_210_12651_1_1534041034294_6788150_624-195301-0 from training. Duration: 0.85 2023-10-03 14:36:24,170 WARNING [train.py:1204] (2/4) Exclude cut with ID _55429_210_10626_1_1534467588527_7174143_377-169391-0_sp1.1 from training. Duration: 0.87275 2023-10-03 14:36:25,550 WARNING [train.py:1204] (2/4) Exclude cut with ID _49376_210_5986_1_1533985278732_3370740_354-344695-0_sp1.1 from training. Duration: 0.9 2023-10-03 14:36:27,094 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.bypass_mid.scale_min, batch_count=1301446.6666666667, ans=0.2 2023-10-03 14:36:27,111 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.conv_module1.balancer1.prob, batch_count=1301446.6666666667, ans=0.125 2023-10-03 14:36:34,362 WARNING [train.py:1204] (2/4) Exclude cut with ID _4697_210_2069_1_1526815801953_3823098_276-96593-0_sp1.1 from training. Duration: 0.7 2023-10-03 14:36:34,377 WARNING [train.py:1204] (2/4) Exclude cut with ID _15012_210_1968_1_1530856820429_5095350_688-178849-0_sp0.9 from training. Duration: 0.711125 2023-10-03 14:36:34,425 WARNING [train.py:1204] (2/4) Exclude cut with ID _25565_210_12392_1_1532423009382_3952098_372-69023-0_sp1.1 from training. Duration: 0.936375 2023-10-03 14:36:35,734 WARNING [train.py:1204] (2/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_61-327062-0_sp1.1 from training. Duration: 0.736375 2023-10-03 14:36:35,761 WARNING [train.py:1204] (2/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_215-141059-0 from training. Duration: 0.62 2023-10-03 14:36:36,831 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.0.layers.0.nonlin_attention.whiten1, num_groups=1, num_channels=144, metric=9.47 vs. limit=10.0 2023-10-03 14:36:39,852 WARNING [train.py:1204] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_6-226750-0_sp1.1 from training. Duration: 0.8545625 2023-10-03 14:36:39,942 WARNING [train.py:1204] (2/4) Exclude cut with ID _14348_210_8341_1_1530599434996_3778640_240-232370-0_sp1.1 from training. Duration: 0.763625 2023-10-03 14:36:43,892 INFO [train.py:1046] (2/4) Epoch 37, batch 4000, loss[loss=0.1692, simple_loss=0.2406, pruned_loss=0.04894, over 22891.00 frames. ], tot_loss[loss=0.1583, simple_loss=0.2376, pruned_loss=0.03951, over 4729975.10 frames. ], batch size: 322, lr: 2.74e-03, grad_scale: 16.0 2023-10-03 14:36:43,990 WARNING [train.py:1204] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_468-189058-0 from training. Duration: 0.96 2023-10-03 14:36:45,804 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=1301580.0, ans=0.1 2023-10-03 14:36:53,497 WARNING [train.py:1204] (2/4) Exclude cut with ID _21987_210_5854_1_1531821500482_3637038_247-93340-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 14:36:59,755 WARNING [train.py:1204] (2/4) Exclude cut with ID _66868_210_9527_1_1535509287954_4561140_436-191602-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 14:37:00,531 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.3.encoder.layers.0.self_attn1.whiten, num_groups=1, num_channels=512, metric=14.74 vs. limit=22.5 2023-10-03 14:37:03,946 WARNING [train.py:1204] (2/4) Exclude cut with ID _18895_210_6228_1_1531357274234_3784930_394-58203-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 14:37:03,996 WARNING [train.py:1204] (2/4) Exclude cut with ID _58605_210_12210_1_1534723195939_3904259_258-281498-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 14:37:05,292 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_33348_210_5632_1_1533086912_3324030_245-292752-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 14:37:05,323 WARNING [train.py:1204] (2/4) Exclude cut with ID _67831_210_12392_1_1535612464202_3092612_346-2148-0 from training. Duration: 0.93 2023-10-03 14:37:06,719 WARNING [train.py:1204] (2/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_369-222060-0_sp0.9 from training. Duration: 0.788875 2023-10-03 14:37:08,065 WARNING [train.py:1204] (2/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_245-174553-0 from training. Duration: 0.88 2023-10-03 14:37:08,070 WARNING [train.py:1204] (2/4) Exclude cut with ID _3541_210_2477_1_1526475408709_3263973_231-104736-0_sp1.1 from training. Duration: 0.7090625 2023-10-03 14:37:08,076 WARNING [train.py:1204] (2/4) Exclude cut with ID _44585_210_18628_1_1533605350387_4518199_280-73534-0 from training. Duration: 0.89 2023-10-03 14:37:09,527 WARNING [train.py:1204] (2/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_22-201476-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 14:37:10,896 WARNING [train.py:1204] (2/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_325-62802-0_sp0.9 from training. Duration: 0.9666875 2023-10-03 14:37:12,193 WARNING [train.py:1204] (2/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_199-274062-0_sp1.1 from training. Duration: 0.97275 2023-10-03 14:37:12,197 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_8-319957-0_sp1.1 from training. Duration: 0.87275 2023-10-03 14:37:12,221 WARNING [train.py:1204] (2/4) Exclude cut with ID _29343_210_4381_1_1532602757752_3674109_224-349726-0_sp1.1 from training. Duration: 0.936375 2023-10-03 14:37:12,225 WARNING [train.py:1204] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_520-126729-0_sp1.1 from training. Duration: 0.636375 2023-10-03 14:37:13,659 WARNING [train.py:1204] (2/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_239-22076-0_sp1.1 from training. Duration: 0.77275 2023-10-03 14:37:15,056 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9012W0011-501146 from training. Duration: 0.896 2023-10-03 14:37:16,408 WARNING [train.py:1204] (2/4) Exclude cut with ID _29857_210_20034_1_1532395886753_3798160_279-185801-0_sp1.1 from training. Duration: 0.82725 2023-10-03 14:37:18,091 WARNING [train.py:1204] (2/4) Exclude cut with ID _43552_210_8913_1_1533726031059_3523429_261-223933-0_sp1.1 from training. Duration: 0.92725 2023-10-03 14:37:22,558 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9029W0025-509426 from training. Duration: 0.896 2023-10-03 14:37:22,630 WARNING [train.py:1204] (2/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_337-181502-0_sp1.1 from training. Duration: 0.5909375 2023-10-03 14:37:22,652 WARNING [train.py:1204] (2/4) Exclude cut with ID _68635_210_9317_1_1535770813410_4315870_159-332599-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 14:37:23,587 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.ff3_skip_rate, batch_count=1301713.3333333333, ans=0.0 2023-10-03 14:37:28,693 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_161-276996-0 from training. Duration: 0.95 2023-10-03 14:37:28,733 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7088_210_1874_1_1527939776_1101113_127-68870-0_sp1.1 from training. Duration: 0.936375 2023-10-03 14:37:32,066 WARNING [train.py:1204] (2/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_322-217220-0_sp1.1 from training. Duration: 0.7818125 2023-10-03 14:37:33,415 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9072W0002-530636 from training. Duration: 0.768 2023-10-03 14:37:34,791 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_340-336408-0_sp1.1 from training. Duration: 0.8454375 2023-10-03 14:37:34,847 WARNING [train.py:1204] (2/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_395-37222-0 from training. Duration: 0.76 2023-10-03 14:37:34,850 WARNING [train.py:1204] (2/4) Exclude cut with ID _64099_210_17301_1_1535526977656_1578188_159-209156-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 14:37:34,920 WARNING [train.py:1204] (2/4) Exclude cut with ID _53950_210_5816_1_1534417264919_3862951_28-29681-0_sp1.1 from training. Duration: 0.92725 2023-10-03 14:37:36,230 WARNING [train.py:1204] (2/4) Exclude cut with ID _4681_210_3525_1_1527250689464_120889_15-126760-0_sp1.1 from training. Duration: 0.736375 2023-10-03 14:37:37,541 WARNING [train.py:1204] (2/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_242-3472-0_sp1.1 from training. Duration: 0.663625 2023-10-03 14:37:37,581 WARNING [train.py:1204] (2/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_436-186311-0_sp0.9 from training. Duration: 0.72225 2023-10-03 14:37:37,600 WARNING [train.py:1204] (2/4) Exclude cut with ID _3675_210_4557_1_1526639934408_4182089_452-172147-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 14:37:39,050 WARNING [train.py:1204] (2/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_80-147301-0 from training. Duration: 0.84 2023-10-03 14:37:40,368 WARNING [train.py:1204] (2/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_351-107588-0_sp1.1 from training. Duration: 0.92725 2023-10-03 14:37:40,488 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9111W0002-550781 from training. Duration: 0.896 2023-10-03 14:37:45,835 WARNING [train.py:1204] (2/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_465-156500-0_sp1.1 from training. Duration: 0.6545625 2023-10-03 14:37:48,098 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.ff2_skip_rate, batch_count=1301846.6666666667, ans=0.0 2023-10-03 14:37:49,323 WARNING [train.py:1204] (2/4) Exclude cut with ID _13938_210_9988_1_1530518718978_3693960_304-112784-0_sp1.1 from training. Duration: 0.4181875 2023-10-03 14:37:50,726 WARNING [train.py:1204] (2/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_202-67017-0_sp1.1 from training. Duration: 0.7454375 2023-10-03 14:37:52,052 WARNING [train.py:1204] (2/4) Exclude cut with ID _58414_210_8254_1_1535421609162_7151182_448-4358-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 14:37:52,125 WARNING [train.py:1204] (2/4) Exclude cut with ID _66840_210_5219_1_1535452168426_4196349_203-243234-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 14:37:53,967 WARNING [train.py:1204] (2/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_201-213513-0_sp1.1 from training. Duration: 0.97275 2023-10-03 14:37:58,131 INFO [train.py:1046] (2/4) Epoch 37, batch 4050, loss[loss=0.1579, simple_loss=0.233, pruned_loss=0.04137, over 23338.00 frames. ], tot_loss[loss=0.1579, simple_loss=0.2376, pruned_loss=0.03908, over 4732804.72 frames. ], batch size: 119, lr: 2.74e-03, grad_scale: 16.0 2023-10-03 14:37:58,192 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6291_210_3273_1_1527683649_1078498_117-187560-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 14:37:59,983 WARNING [train.py:1204] (2/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_135-174080-0_sp0.9 from training. Duration: 0.588875 2023-10-03 14:38:00,152 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.attention_skip_rate, batch_count=1301913.3333333333, ans=0.0 2023-10-03 14:38:01,313 WARNING [train.py:1204] (2/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_45-265684-0 from training. Duration: 0.72 2023-10-03 14:38:02,565 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_435-313305-0_sp1.1 from training. Duration: 0.6454375 2023-10-03 14:38:02,583 WARNING [train.py:1204] (2/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_235-307337-0_sp1.1 from training. Duration: 0.8545625 2023-10-03 14:38:05,156 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7762_210_1794_1_1528199508_4057408_419-125009-0_sp0.9 from training. Duration: 0.92225 2023-10-03 14:38:05,253 WARNING [train.py:1204] (2/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_215-141059-0_sp1.1 from training. Duration: 0.563625 2023-10-03 14:38:06,564 WARNING [train.py:1204] (2/4) Exclude cut with ID _27013_210_1389_1_1532437173240_3252649_222-125573-0_sp1.1 from training. Duration: 0.8909375 2023-10-03 14:38:08,123 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.feed_forward1.hidden_balancer.prob, batch_count=1301913.3333333333, ans=0.125 2023-10-03 14:38:09,336 WARNING [train.py:1204] (2/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_126-274694-0_sp1.1 from training. Duration: 0.8909375 2023-10-03 14:38:12,100 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_2592_210_2290_1_1525697025_4882925_157-70512-0_sp1.1 from training. Duration: 0.82725 2023-10-03 14:38:12,156 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_292-265928-0_sp0.9 from training. Duration: 0.5666875 2023-10-03 14:38:15,162 WARNING [train.py:1204] (2/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_1211-226066-0_sp1.1 from training. Duration: 0.6909375 2023-10-03 14:38:15,209 WARNING [train.py:1204] (2/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_394-265747-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 14:38:19,843 WARNING [train.py:1204] (2/4) Exclude cut with ID _48782_210_12658_1_1534467701544_2300422_239-67139-0_sp1.1 from training. Duration: 0.97275 2023-10-03 14:38:21,244 WARNING [train.py:1204] (2/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_329-113173-0_sp1.1 from training. Duration: 0.563625 2023-10-03 14:38:24,675 WARNING [train.py:1204] (2/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_81-75849-0_sp0.9 from training. Duration: 0.4333125 2023-10-03 14:38:26,623 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.4.encoder.layers.0.self_attn2.whiten, num_groups=1, num_channels=384, metric=14.54 vs. limit=22.5 2023-10-03 14:38:27,251 WARNING [train.py:1204] (2/4) Exclude cut with ID _9235_210_5219_1_1528891154253_8046120_1003-101638-0 from training. Duration: 0.73 2023-10-03 14:38:27,288 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9309W0189-640316 from training. Duration: 0.64 2023-10-03 14:38:30,587 WARNING [train.py:1204] (2/4) Exclude cut with ID _45329_210_7005_1_1534157951211_6774339_503-287883-0_sp1.1 from training. Duration: 0.8 2023-10-03 14:38:33,273 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.494e+02 1.902e+02 2.050e+02 2.347e+02 3.332e+02, threshold=4.101e+02, percent-clipped=0.0 2023-10-03 14:38:36,312 WARNING [train.py:1204] (2/4) Exclude cut with ID _8833_210_5219_1_1528592665210_7553059_611-80313-0 from training. Duration: 0.64 2023-10-03 14:38:37,675 WARNING [train.py:1204] (2/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_335-106626-0_sp1.1 from training. Duration: 0.963625 2023-10-03 14:38:40,398 WARNING [train.py:1204] (2/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_237-44332-0_sp1.1 from training. Duration: 0.8545625 2023-10-03 14:38:43,177 WARNING [train.py:1204] (2/4) Exclude cut with ID _46952_210_5986_1_1534230010357_3107379_178-147341-0_sp1.1 from training. Duration: 0.97275 2023-10-03 14:38:43,216 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_13091_210_7925_1_1530609447_8987354_946-154426-0_sp1.1 from training. Duration: 0.936375 2023-10-03 14:38:43,240 WARNING [train.py:1204] (2/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_326-17061-0_sp1.1 from training. Duration: 0.8545625 2023-10-03 14:38:47,156 WARNING [train.py:1204] (2/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_38-169920-0_sp1.1 from training. Duration: 0.82725 2023-10-03 14:38:49,313 WARNING [train.py:1204] (2/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_283-220372-0 from training. Duration: 0.56 2023-10-03 14:38:49,321 WARNING [train.py:1204] (2/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_252-216674-0_sp1.1 from training. Duration: 0.4909375 2023-10-03 14:38:51,252 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_202-91290-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 14:38:54,015 WARNING [train.py:1204] (2/4) Exclude cut with ID _41314_210_12651_1_1533459476543_3766460_80-74184-0 from training. Duration: 0.83 2023-10-03 14:38:58,693 WARNING [train.py:1204] (2/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_411-271358-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 14:39:02,485 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.5.encoder.layers.0.conv_module1.whiten, num_groups=1, num_channels=256, metric=6.91 vs. limit=15.0 2023-10-03 14:39:04,760 WARNING [train.py:1204] (2/4) Exclude cut with ID _68777_210_4353_1_1535810390731_3117904_129-166012-0 from training. Duration: 0.85 2023-10-03 14:39:06,107 WARNING [train.py:1204] (2/4) Exclude cut with ID _44982_210_1511_1_1534312820108_3726029_410-109759-0_sp1.1 from training. Duration: 0.963625 2023-10-03 14:39:06,111 WARNING [train.py:1204] (2/4) Exclude cut with ID _14224_210_4381_1_1530529062037_4264010_364-22817-0_sp1.1 from training. Duration: 0.6454375 2023-10-03 14:39:07,501 WARNING [train.py:1204] (2/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_89-242335-0 from training. Duration: 0.95 2023-10-03 14:39:07,516 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_12868_210_6753_1_1530702479_3610564_151-325626-0 from training. Duration: 0.97 2023-10-03 14:39:07,517 WARNING [train.py:1204] (2/4) Exclude cut with ID _6275_210_2084_1_1528110062213_7669049_786-264332-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 14:39:11,527 INFO [train.py:1046] (2/4) Epoch 37, batch 4100, loss[loss=0.1616, simple_loss=0.2506, pruned_loss=0.03632, over 24661.00 frames. ], tot_loss[loss=0.1595, simple_loss=0.2392, pruned_loss=0.03995, over 4726779.02 frames. ], batch size: 68, lr: 2.74e-03, grad_scale: 16.0 2023-10-03 14:39:11,598 WARNING [train.py:1204] (2/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_35-153886-0_sp1.1 from training. Duration: 0.763625 2023-10-03 14:39:11,688 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_24007_210_1794_1_1531918925_1663875_116-100916-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 14:39:11,709 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_162-262717-0_sp1.1 from training. Duration: 0.7545625 2023-10-03 14:39:18,982 WARNING [train.py:1204] (2/4) Exclude cut with ID _8263_210_1664_1_1528438953494_294100_40-204731-0 from training. Duration: 0.5 2023-10-03 14:39:20,894 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_12868_210_6753_1_1530702479_3610564_416-293746-0 from training. Duration: 0.9 2023-10-03 14:39:24,074 WARNING [train.py:1204] (2/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_605-219732-0 from training. Duration: 0.94 2023-10-03 14:39:24,165 WARNING [train.py:1204] (2/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_454-123002-0 from training. Duration: 0.69 2023-10-03 14:39:24,170 WARNING [train.py:1204] (2/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_25-304273-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 14:39:24,219 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6860_210_4852_1_1527917457_3833327_451-39702-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 14:39:26,119 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_304-16505-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 14:39:26,140 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6291_210_3273_1_1527683649_1078498_4-266348-0_sp1.1 from training. Duration: 0.7090625 2023-10-03 14:39:27,454 WARNING [train.py:1204] (2/4) Exclude cut with ID ID0149W0001-753701 from training. Duration: 0.989 2023-10-03 14:39:30,147 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_41251_210_15368_1_1533429767_4820695_394-55393-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 14:39:30,229 WARNING [train.py:1204] (2/4) Exclude cut with ID _8793_210_6310_1_1528542985139_2310009_121-179782-0_sp1.1 from training. Duration: 0.72725 2023-10-03 14:39:30,244 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_2729_210_3528_1_1525863505_3882214_421-73047-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 14:39:31,425 WARNING [train.py:1204] (2/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_278-154492-0_sp0.9 from training. Duration: 0.8444375 2023-10-03 14:39:34,766 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.attention_skip_rate, batch_count=1302313.3333333333, ans=0.0 2023-10-03 14:39:37,312 WARNING [train.py:1204] (2/4) Exclude cut with ID _14170_210_5219_1_1530525949868_4469650_412-317077-0_sp0.9 from training. Duration: 0.8333125 2023-10-03 14:39:38,648 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_727-138317-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 14:39:38,690 WARNING [train.py:1204] (2/4) Exclude cut with ID _25808_210_13292_1_1532343514562_7366109_345-335048-0_sp1.1 from training. Duration: 0.92725 2023-10-03 14:39:38,703 WARNING [train.py:1204] (2/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_506-23200-0 from training. Duration: 0.97 2023-10-03 14:39:38,775 WARNING [train.py:1204] (2/4) Exclude cut with ID _44718_210_5816_1_1534050051869_6469030_643-245187-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 14:39:38,779 WARNING [train.py:1204] (2/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_42-186162-0_sp1.1 from training. Duration: 0.836375 2023-10-03 14:39:38,798 WARNING [train.py:1204] (2/4) Exclude cut with ID _3231_210_2106_1_1526205864934_6784220_171-256004-0_sp1.1 from training. Duration: 0.7818125 2023-10-03 14:39:40,149 WARNING [train.py:1204] (2/4) Exclude cut with ID _25914_210_6400_1_1532141317058_644740_43-248478-0_sp1.1 from training. Duration: 0.936375 2023-10-03 14:39:40,206 WARNING [train.py:1204] (2/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_577-287150-0 from training. Duration: 0.82 2023-10-03 14:39:43,068 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_46015_210_7234_1_1533863319_3087837_376-276068-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 14:39:44,362 WARNING [train.py:1204] (2/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_51-200220-0 from training. Duration: 0.56 2023-10-03 14:39:45,677 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_452-96101-0_sp1.1 from training. Duration: 0.963625 2023-10-03 14:39:48,346 WARNING [train.py:1204] (2/4) Exclude cut with ID _13252_210_1925_1_1530671372047_3336909_263-168388-0_sp1.1 from training. Duration: 0.7818125 2023-10-03 14:39:48,347 WARNING [train.py:1204] (2/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_469-94673-0 from training. Duration: 0.79 2023-10-03 14:39:49,748 WARNING [train.py:1204] (2/4) Exclude cut with ID _14922_210_6228_1_1530707435273_3693310_303-228738-0_sp1.1 from training. Duration: 0.763625 2023-10-03 14:39:49,793 WARNING [train.py:1204] (2/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_262-250826-0_sp1.1 from training. Duration: 0.9 2023-10-03 14:39:49,814 WARNING [train.py:1204] (2/4) Exclude cut with ID _68777_210_4353_1_1535810390731_3117904_129-166012-0_sp1.1 from training. Duration: 0.77275 2023-10-03 14:39:50,037 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.feed_forward3.hidden_balancer.prob, batch_count=1302380.0, ans=0.125 2023-10-03 14:39:51,630 WARNING [train.py:1204] (2/4) Exclude cut with ID _58895_210_8750_1_1535184081320_3649089_265-323810-0 from training. Duration: 0.96 2023-10-03 14:39:53,564 WARNING [train.py:1204] (2/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_120-76843-0_sp0.9 from training. Duration: 0.611125 2023-10-03 14:39:54,267 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7544_210_6753_1_1528587722_3576686_295-258479-0_sp0.9 from training. Duration: 0.9333125 2023-10-03 14:39:56,980 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7046_210_6753_1_1527895337_6920917_783-278109-0 from training. Duration: 0.77 2023-10-03 14:39:57,023 WARNING [train.py:1204] (2/4) Exclude cut with ID _42326_210_7770_1_1533814282531_3707269_113-308323-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 14:39:58,267 WARNING [train.py:1204] (2/4) Exclude cut with ID _7852_210_5219_1_1528286471316_8139400_242-105336-0_sp0.9 from training. Duration: 0.8 2023-10-03 14:40:01,079 WARNING [train.py:1204] (2/4) Exclude cut with ID _56235_210_10640_1_1534557335235_4161099_114-277905-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 14:40:05,902 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_13118_210_7925_1_1530755612_7608297_42-66683-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 14:40:07,375 WARNING [train.py:1204] (2/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_901-79438-0_sp1.1 from training. Duration: 0.8545625 2023-10-03 14:40:07,424 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_30280_210_1881_1_1532872779_3660139_211-182194-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 14:40:14,388 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.conv_skip_rate, batch_count=1302513.3333333333, ans=0.0 2023-10-03 14:40:15,565 WARNING [train.py:1204] (2/4) Exclude cut with ID _14898_210_9206_1_1530766812377_4177190_621-45732-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 14:40:15,675 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.1.attention_skip_rate, batch_count=1302513.3333333333, ans=0.0 2023-10-03 14:40:16,821 WARNING [train.py:1204] (2/4) Exclude cut with ID _35350_210_16040_1_1533040191183_4075299_224-133775-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 14:40:20,107 WARNING [train.py:1204] (2/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_147-293311-0_sp1.1 from training. Duration: 0.8545625 2023-10-03 14:40:24,720 WARNING [train.py:1204] (2/4) Exclude cut with ID _12302_210_5219_1_1529924446466_8107280_835-279754-0_sp1.1 from training. Duration: 0.8181875 2023-10-03 14:40:26,398 INFO [train.py:1046] (2/4) Epoch 37, batch 4150, loss[loss=0.1472, simple_loss=0.2146, pruned_loss=0.03993, over 23568.00 frames. ], tot_loss[loss=0.1595, simple_loss=0.2389, pruned_loss=0.03999, over 4711705.68 frames. ], batch size: 256, lr: 2.74e-03, grad_scale: 16.0 2023-10-03 14:40:27,983 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.0.bypass.scale_min, batch_count=1302580.0, ans=0.2 2023-10-03 14:40:29,184 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8453_210_4211_1_1528502940_8562730_347-330673-0_sp0.9 from training. Duration: 0.8 2023-10-03 14:40:30,570 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_213-103282-0_sp0.9 from training. Duration: 0.8666875 2023-10-03 14:40:32,063 WARNING [train.py:1204] (2/4) Exclude cut with ID _49728_210_12651_1_1534041034294_6788150_66-43496-0_sp1.1 from training. Duration: 0.97275 2023-10-03 14:40:32,065 WARNING [train.py:1204] (2/4) Exclude cut with ID _19023_210_4353_1_1531378892552_5820780_28-36615-0_sp1.1 from training. Duration: 0.8818125 2023-10-03 14:40:34,810 WARNING [train.py:1204] (2/4) Exclude cut with ID _14349_210_8341_1_1530615710818_3442470_115-285975-0 from training. Duration: 0.86 2023-10-03 14:40:34,839 WARNING [train.py:1204] (2/4) Exclude cut with ID _12734_210_8341_1_1530183614296_3891929_236-47464-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 14:40:34,892 WARNING [train.py:1204] (2/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_177-236809-0 from training. Duration: 0.49 2023-10-03 14:40:36,279 WARNING [train.py:1204] (2/4) Exclude cut with ID _13678_210_7497_1_1530530069226_3463170_371-3221-0 from training. Duration: 0.9 2023-10-03 14:40:36,294 WARNING [train.py:1204] (2/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_48-338976-0 from training. Duration: 0.89 2023-10-03 14:40:38,271 WARNING [train.py:1204] (2/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_1167-309744-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 14:40:42,395 WARNING [train.py:1204] (2/4) Exclude cut with ID _30327_210_16049_1_1532484161295_3924169_401-70819-0_sp0.9 from training. Duration: 0.9555625 2023-10-03 14:40:42,406 WARNING [train.py:1204] (2/4) Exclude cut with ID _55134_210_21982_1_1534590028377_3882170_346-91022-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 14:40:45,325 WARNING [train.py:1204] (2/4) Exclude cut with ID _14459_210_2228_1_1531443641946_7516700_174-118801-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 14:40:46,725 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_269-278006-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 14:40:46,783 WARNING [train.py:1204] (2/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_390-339137-0_sp0.9 from training. Duration: 0.62225 2023-10-03 14:40:48,672 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.4.encoder.layers.1.nonlin_attention.whiten1, num_groups=1, num_channels=288, metric=4.02 vs. limit=10.0 2023-10-03 14:40:49,372 WARNING [train.py:1204] (2/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_253-308377-0_sp1.1 from training. Duration: 0.4454375 2023-10-03 14:40:49,393 WARNING [train.py:1204] (2/4) Exclude cut with ID _23825_210_13449_1_1531915313526_3497150_240-271367-0_sp1.1 from training. Duration: 0.8818125 2023-10-03 14:40:52,571 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1015-343884-0_sp0.9 from training. Duration: 0.688875 2023-10-03 14:40:54,596 WARNING [train.py:1204] (2/4) Exclude cut with ID _23514_210_1389_1_1531832139785_3031439_335-48210-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 14:40:59,270 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_309-65177-0_sp1.1 from training. Duration: 0.8 2023-10-03 14:40:59,345 WARNING [train.py:1204] (2/4) Exclude cut with ID _14368_210_4220_1_1530532714870_3595529_231-137892-0 from training. Duration: 0.99 2023-10-03 14:41:02,042 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.542e+02 1.895e+02 2.138e+02 2.418e+02 3.497e+02, threshold=4.277e+02, percent-clipped=0.0 2023-10-03 14:41:02,145 WARNING [train.py:1204] (2/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_57-270037-0 from training. Duration: 0.77 2023-10-03 14:41:02,150 WARNING [train.py:1204] (2/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_465-214726-0_sp1.1 from training. Duration: 0.7818125 2023-10-03 14:41:03,553 WARNING [train.py:1204] (2/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_106-300985-0 from training. Duration: 0.9 2023-10-03 14:41:03,558 WARNING [train.py:1204] (2/4) Exclude cut with ID _14632_210_9988_1_1530691279392_3923200_161-129530-0_sp1.1 from training. Duration: 0.87275 2023-10-03 14:41:03,577 WARNING [train.py:1204] (2/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_514-88370-0_sp1.1 from training. Duration: 0.963625 2023-10-03 14:41:03,823 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.balancer1.prob, batch_count=1302713.3333333333, ans=0.125 2023-10-03 14:41:05,050 WARNING [train.py:1204] (2/4) Exclude cut with ID _56495_210_4234_1_1534748080334_4136219_305-334505-0_sp1.1 from training. Duration: 0.92725 2023-10-03 14:41:06,418 WARNING [train.py:1204] (2/4) Exclude cut with ID _42875_210_14856_1_1533704397487_4012433_61-264914-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 14:41:10,944 WARNING [train.py:1204] (2/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_91-148831-0 from training. Duration: 0.99 2023-10-03 14:41:13,698 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_435-313305-0_sp0.9 from training. Duration: 0.788875 2023-10-03 14:41:15,159 WARNING [train.py:1204] (2/4) Exclude cut with ID _16744_210_5986_1_1531724405384_3907567_242-111316-0_sp1.1 from training. Duration: 0.8454375 2023-10-03 14:41:15,222 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_257-20733-0 from training. Duration: 0.76 2023-10-03 14:41:16,916 WARNING [train.py:1204] (2/4) Exclude cut with ID _8064_210_5399_1_1528372299291_4282973_4-91037-0_sp1.1 from training. Duration: 0.8 2023-10-03 14:41:18,396 WARNING [train.py:1204] (2/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_3-276701-0 from training. Duration: 0.91 2023-10-03 14:41:21,088 WARNING [train.py:1204] (2/4) Exclude cut with ID _3992_210_1747_1_1526644816031_3731060_36-147855-0_sp1.1 from training. Duration: 0.7090625 2023-10-03 14:41:21,173 WARNING [train.py:1204] (2/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_1296-298671-0_sp1.1 from training. Duration: 0.963625 2023-10-03 14:41:23,014 WARNING [train.py:1204] (2/4) Exclude cut with ID _68965_210_1883_1_1535698962272_3497490_309-334593-0_sp1.1 from training. Duration: 0.92725 2023-10-03 14:41:24,422 WARNING [train.py:1204] (2/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_128-296581-0 from training. Duration: 0.74 2023-10-03 14:41:24,422 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_17613_210_2151_1_1531137569_2366160_170-299079-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 14:41:24,424 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6287_210_3273_1_1527509506_2653716_182-199887-0_sp1.1 from training. Duration: 0.463625 2023-10-03 14:41:25,792 WARNING [train.py:1204] (2/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_80-135200-0_sp1.1 from training. Duration: 0.7181875 2023-10-03 14:41:27,952 WARNING [train.py:1204] (2/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_34-183685-0 from training. Duration: 0.98 2023-10-03 14:41:27,970 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8278_210_2550_1_1528637099_4220413_418-210155-0_sp1.1 from training. Duration: 0.92725 2023-10-03 14:41:27,974 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8853_210_3273_1_1528633899_818968_51-187685-0_sp0.9 from training. Duration: 0.7666875 2023-10-03 14:41:27,990 WARNING [train.py:1204] (2/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_332-221057-0_sp1.1 from training. Duration: 0.6181875 2023-10-03 14:41:28,082 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_2221_210_1988_1_1525597051_4935541_415-296882-0 from training. Duration: 0.77 2023-10-03 14:41:29,260 WARNING [train.py:1204] (2/4) Exclude cut with ID _33640_210_2655_1_1533117605479_4855469_770-278185-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 14:41:29,303 WARNING [train.py:1204] (2/4) Exclude cut with ID _5287_210_2084_1_1527418883040_3878670_333-48508-0_sp1.1 from training. Duration: 0.5454375 2023-10-03 14:41:30,642 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_43802_210_2290_1_1533516203_8829504_142-276839-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 14:41:32,097 WARNING [train.py:1204] (2/4) Exclude cut with ID _28394_210_8913_1_1532430049518_3650118_126-139371-0_sp1.1 from training. Duration: 0.92725 2023-10-03 14:41:33,342 WARNING [train.py:1204] (2/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_76-203867-0 from training. Duration: 0.72 2023-10-03 14:41:33,389 WARNING [train.py:1204] (2/4) Exclude cut with ID _6194_210_1534_1_1527511826160_3941194_14-282876-0_sp0.9 from training. Duration: 0.788875 2023-10-03 14:41:39,569 WARNING [train.py:1204] (2/4) Exclude cut with ID _35355_210_16040_1_1533212988977_4016229_74-190076-0_sp1.1 from training. Duration: 0.72725 2023-10-03 14:41:40,847 INFO [train.py:1046] (2/4) Epoch 37, batch 4200, loss[loss=0.1572, simple_loss=0.2153, pruned_loss=0.04961, over 22773.00 frames. ], tot_loss[loss=0.159, simple_loss=0.2384, pruned_loss=0.03979, over 4706084.57 frames. ], batch size: 322, lr: 2.74e-03, grad_scale: 16.0 2023-10-03 14:41:41,022 WARNING [train.py:1204] (2/4) Exclude cut with ID _33920_210_4220_1_1533257851500_3084399_198-334063-0 from training. Duration: 0.94 2023-10-03 14:41:42,385 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6291_210_3273_1_1527683649_1078498_4-266348-0_sp0.9 from training. Duration: 0.8666875 2023-10-03 14:41:45,032 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_23856_210_4238_1_1531994785_3087934_122-38075-0_sp1.1 from training. Duration: 0.936375 2023-10-03 14:41:45,129 WARNING [train.py:1204] (2/4) Exclude cut with ID _4697_210_2069_1_1526815801953_3823098_276-96593-0_sp0.9 from training. Duration: 0.8555625 2023-10-03 14:41:46,890 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_308-278734-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 14:41:46,892 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_5190_210_4557_1_1527245078_2965646_353-250603-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 14:41:49,690 WARNING [train.py:1204] (2/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_472-216663-0 from training. Duration: 0.92 2023-10-03 14:41:52,402 WARNING [train.py:1204] (2/4) Exclude cut with ID _7537_210_2084_1_1528502395431_3924359_14-312906-0 from training. Duration: 0.84 2023-10-03 14:41:54,215 WARNING [train.py:1204] (2/4) Exclude cut with ID _29003_210_19596_1_1532507391720_3894098_559-236660-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 14:41:57,002 WARNING [train.py:1204] (2/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_312-204798-0_sp1.1 from training. Duration: 0.8454375 2023-10-03 14:41:59,032 WARNING [train.py:1204] (2/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_17-309070-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 14:42:03,150 WARNING [train.py:1204] (2/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_344-152526-0_sp0.9 from training. Duration: 0.588875 2023-10-03 14:42:04,528 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_41452_210_7291_1_1533437687_3941281_405-11559-0_sp1.1 from training. Duration: 0.82725 2023-10-03 14:42:04,563 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_48730_210_5399_1_1533885886_2212769_157-124052-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 14:42:05,927 WARNING [train.py:1204] (2/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_415-78792-0 from training. Duration: 0.81 2023-10-03 14:42:05,931 WARNING [train.py:1204] (2/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_345-340690-0_sp1.1 from training. Duration: 0.8454375 2023-10-03 14:42:07,291 WARNING [train.py:1204] (2/4) Exclude cut with ID _23825_210_13449_1_1531915313526_3497150_124-8138-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 14:42:07,336 WARNING [train.py:1204] (2/4) Exclude cut with ID _49258_210_7994_1_1534122017863_3738861_194-24864-0_sp1.1 from training. Duration: 0.936375 2023-10-03 14:42:08,621 WARNING [train.py:1204] (2/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_369-222060-0_sp1.1 from training. Duration: 0.6454375 2023-10-03 14:42:08,741 WARNING [train.py:1204] (2/4) Exclude cut with ID _12369_210_4381_1_1530356078581_4388520_480-13146-0_sp0.9 from training. Duration: 0.9444375 2023-10-03 14:42:09,286 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.5.encoder.layers.1.self_attn1.whiten, num_groups=1, num_channels=256, metric=12.81 vs. limit=22.5 2023-10-03 14:42:10,193 WARNING [train.py:1204] (2/4) Exclude cut with ID _13253_210_1925_1_1530757775819_3577873_191-144977-0 from training. Duration: 0.78 2023-10-03 14:42:10,931 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.self_attn2.whiten.whitening_limit, batch_count=1303046.6666666667, ans=22.5 2023-10-03 14:42:11,586 WARNING [train.py:1204] (2/4) Exclude cut with ID _30304_210_6319_1_1532476827261_7582060_1128-140266-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 14:42:16,010 WARNING [train.py:1204] (2/4) Exclude cut with ID _8772_210_1850_1_1528635595972_3664011_247-336448-0_sp0.9 from training. Duration: 0.82225 2023-10-03 14:42:17,927 WARNING [train.py:1204] (2/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_318-59097-0_sp1.1 from training. Duration: 0.6909375 2023-10-03 14:42:19,357 WARNING [train.py:1204] (2/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_540-131164-0_sp1.1 from training. Duration: 0.836375 2023-10-03 14:42:20,736 WARNING [train.py:1204] (2/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_39-110836-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 14:42:20,983 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.feed_forward2.hidden_balancer.prob, batch_count=1303046.6666666667, ans=0.125 2023-10-03 14:42:22,156 WARNING [train.py:1204] (2/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_860-96198-0_sp1.1 from training. Duration: 0.663625 2023-10-03 14:42:22,168 WARNING [train.py:1204] (2/4) Exclude cut with ID _15133_210_6228_1_1530794512074_3080379_29-264458-0 from training. Duration: 0.58 2023-10-03 14:42:23,488 WARNING [train.py:1204] (2/4) Exclude cut with ID _55209_210_1531_1_1534675828996_4597160_797-348343-0_sp1.1 from training. Duration: 0.963625 2023-10-03 14:42:23,677 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.feed_forward2.hidden_balancer.prob, batch_count=1303113.3333333333, ans=0.125 2023-10-03 14:42:25,006 WARNING [train.py:1204] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_23-189234-0_sp1.1 from training. Duration: 0.7545625 2023-10-03 14:42:25,263 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.ff2_skip_rate, batch_count=1303113.3333333333, ans=0.0 2023-10-03 14:42:25,284 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.feed_forward3.hidden_balancer.prob, batch_count=1303113.3333333333, ans=0.125 2023-10-03 14:42:29,458 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.4.encoder.layers.1.feed_forward3.out_whiten, num_groups=1, num_channels=384, metric=12.75 vs. limit=15.0 2023-10-03 14:42:30,206 WARNING [train.py:1204] (2/4) Exclude cut with ID _5410_210_1388_1_1527397055795_1719020_39-112221-0_sp1.1 from training. Duration: 0.62725 2023-10-03 14:42:31,636 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_43802_210_2290_1_1533516203_8829504_100-348441-0_sp1.1 from training. Duration: 0.82725 2023-10-03 14:42:37,144 WARNING [train.py:1204] (2/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_143-86344-0_sp1.1 from training. Duration: 0.7 2023-10-03 14:42:38,639 WARNING [train.py:1204] (2/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_126-29104-0 from training. Duration: 0.85 2023-10-03 14:42:41,317 WARNING [train.py:1204] (2/4) Exclude cut with ID _39846_210_12651_1_1533603651992_1318030_138-31771-0_sp1.1 from training. Duration: 0.963625 2023-10-03 14:42:47,540 WARNING [train.py:1204] (2/4) Exclude cut with ID _2220_210_1819_1_1525509109898_3658680_50-99837-0_sp1.1 from training. Duration: 0.4909375 2023-10-03 14:42:47,609 WARNING [train.py:1204] (2/4) Exclude cut with ID _6274_210_2084_1_1527937583837_7494020_836-210550-0_sp1.1 from training. Duration: 0.97275 2023-10-03 14:42:50,445 WARNING [train.py:1204] (2/4) Exclude cut with ID _28323_210_16187_1_1532480395667_4100300_306-74561-0 from training. Duration: 0.94 2023-10-03 14:42:50,808 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.nonlin_attention.balancer.prob, batch_count=1303180.0, ans=0.125 2023-10-03 14:42:54,554 INFO [train.py:1046] (2/4) Epoch 37, batch 4250, loss[loss=0.1482, simple_loss=0.2186, pruned_loss=0.03887, over 23382.00 frames. ], tot_loss[loss=0.1575, simple_loss=0.2367, pruned_loss=0.03915, over 4692124.09 frames. ], batch size: 285, lr: 2.74e-03, grad_scale: 8.0 2023-10-03 14:42:54,601 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_177-58005-0_sp1.1 from training. Duration: 0.563625 2023-10-03 14:42:57,794 WARNING [train.py:1204] (2/4) Exclude cut with ID _54871_210_22356_1_1534665798253_3856359_19-342632-0_sp1.1 from training. Duration: 0.82725 2023-10-03 14:42:57,814 WARNING [train.py:1204] (2/4) Exclude cut with ID _35355_210_16040_1_1533212988977_4016229_74-190076-0_sp0.9 from training. Duration: 0.888875 2023-10-03 14:43:02,447 WARNING [train.py:1204] (2/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_285-97786-0_sp1.1 from training. Duration: 0.97275 2023-10-03 14:43:02,711 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.conv_module2.balancer2.prob, batch_count=1303246.6666666667, ans=0.125 2023-10-03 14:43:06,780 WARNING [train.py:1204] (2/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_337-181502-0_sp0.9 from training. Duration: 0.72225 2023-10-03 14:43:06,823 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_353-182855-0 from training. Duration: 0.96 2023-10-03 14:43:08,161 WARNING [train.py:1204] (2/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_409-327730-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 14:43:10,941 WARNING [train.py:1204] (2/4) Exclude cut with ID _8030_210_7497_1_1528444886781_3531089_373-19403-0_sp1.1 from training. Duration: 0.97275 2023-10-03 14:43:12,646 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.feed_forward3.hidden_balancer.prob, batch_count=1303313.3333333333, ans=0.125 2023-10-03 14:43:13,677 WARNING [train.py:1204] (2/4) Exclude cut with ID _6789_210_1534_1_1527857937218_3118950_231-340237-0_sp1.1 from training. Duration: 0.936375 2023-10-03 14:43:14,226 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.5.encoder.layers.1.conv_module2.whiten, num_groups=1, num_channels=256, metric=9.29 vs. limit=15.0 2023-10-03 14:43:15,944 INFO [scaling.py:1118] (2/4) WithLoss: name=encoder.encoders.3.encoder.layers.2.self_attn_weights, loss-sum=0.000e+00 2023-10-03 14:43:17,089 WARNING [train.py:1204] (2/4) Exclude cut with ID _6493_210_1747_1_1528459210424_4012470_536-57832-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 14:43:17,102 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7046_210_6753_1_1527895337_6920917_756-116002-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 14:43:19,195 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.ff3_skip_rate, batch_count=1303313.3333333333, ans=0.0 2023-10-03 14:43:20,361 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_37152_210_9697_1_1533106573_3844188_29-239070-0_sp1.1 from training. Duration: 0.8909375 2023-10-03 14:43:20,365 WARNING [train.py:1204] (2/4) Exclude cut with ID _19023_210_4353_1_1531378892552_5820780_61-334539-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 14:43:21,738 WARNING [train.py:1204] (2/4) Exclude cut with ID _14170_210_5219_1_1530525949868_4469650_159-28488-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 14:43:21,815 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_40896_210_15710_1_1533433956_7100802_764-71272-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 14:43:23,230 WARNING [train.py:1204] (2/4) Exclude cut with ID _44902_210_16187_1_1533888006062_3792078_258-178274-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 14:43:25,129 WARNING [train.py:1204] (2/4) Exclude cut with ID _7537_210_2084_1_1528502395431_3924359_14-312906-0_sp1.1 from training. Duration: 0.763625 2023-10-03 14:43:25,226 WARNING [train.py:1204] (2/4) Exclude cut with ID _49643_210_7989_1_1534640244224_3939031_674-116639-0_sp1.1 from training. Duration: 0.963625 2023-10-03 14:43:26,615 WARNING [train.py:1204] (2/4) Exclude cut with ID _22826_210_9994_1_1531908201522_3279169_415-161-0 from training. Duration: 0.98 2023-10-03 14:43:26,818 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.feed_forward3.hidden_balancer.prob, batch_count=1303380.0, ans=0.125 2023-10-03 14:43:29,880 WARNING [train.py:1204] (2/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_264-348896-0 from training. Duration: 0.85 2023-10-03 14:43:29,888 WARNING [train.py:1204] (2/4) Exclude cut with ID _51899_210_20034_1_1534129254466_3669220_233-161492-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 14:43:31,255 WARNING [train.py:1204] (2/4) Exclude cut with ID _31255_210_13427_1_1532865619495_3815143_233-200023-0_sp1.1 from training. Duration: 0.936375 2023-10-03 14:43:31,271 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_23850_210_4238_1_1532232597_1632433_100-229879-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 14:43:32,489 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.590e+02 1.912e+02 2.052e+02 2.334e+02 3.222e+02, threshold=4.104e+02, percent-clipped=0.0 2023-10-03 14:43:32,578 WARNING [train.py:1204] (2/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_375-245677-0_sp0.9 from training. Duration: 0.9 2023-10-03 14:43:32,581 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_40223_210_6228_1_1533298404_4812267_355-317415-0_sp1.1 from training. Duration: 0.97275 2023-10-03 14:43:32,623 WARNING [train.py:1204] (2/4) Exclude cut with ID _9679_210_7497_1_1529129212906_4363929_235-81488-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 14:43:35,479 WARNING [train.py:1204] (2/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_172-215932-0_sp1.1 from training. Duration: 0.6 2023-10-03 14:43:36,771 WARNING [train.py:1204] (2/4) Exclude cut with ID _12369_210_4381_1_1530356078581_4388520_64-298917-0_sp1.1 from training. Duration: 0.8 2023-10-03 14:43:40,879 WARNING [train.py:1204] (2/4) Exclude cut with ID _38020_210_5986_1_1533112252259_7006089_131-53533-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 14:43:42,659 WARNING [train.py:1204] (2/4) Exclude cut with ID _42334_210_7770_1_1534206610396_3605880_392-48154-0_sp1.1 from training. Duration: 0.963625 2023-10-03 14:43:42,717 WARNING [train.py:1204] (2/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_337-181502-0 from training. Duration: 0.65 2023-10-03 14:43:44,012 WARNING [train.py:1204] (2/4) Exclude cut with ID _6267_210_2084_1_1527764658526_3894730_375-219887-0_sp0.9 from training. Duration: 0.7444375 2023-10-03 14:43:44,103 WARNING [train.py:1204] (2/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_60-146189-0 from training. Duration: 0.98 2023-10-03 14:43:46,011 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_3133_210_2087_1_1526106446_2324690_243-143119-0_sp1.1 from training. Duration: 0.7 2023-10-03 14:43:47,351 WARNING [train.py:1204] (2/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_1267-272693-0_sp0.9 from training. Duration: 0.811125 2023-10-03 14:43:48,754 WARNING [train.py:1204] (2/4) Exclude cut with ID _69884_210_9771_1_1535846392818_4166169_83-182645-0_sp1.1 from training. Duration: 0.97275 2023-10-03 14:43:48,770 WARNING [train.py:1204] (2/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_403-292260-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 14:43:50,327 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.ff2_skip_rate, batch_count=1303446.6666666667, ans=0.0 2023-10-03 14:43:51,592 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.balancer1.prob, batch_count=1303446.6666666667, ans=0.125 2023-10-03 14:43:52,869 WARNING [train.py:1204] (2/4) Exclude cut with ID _55429_210_10626_1_1534467588527_7174143_377-169391-0 from training. Duration: 0.96 2023-10-03 14:43:54,188 WARNING [train.py:1204] (2/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_494-199538-0_sp1.1 from training. Duration: 0.6818125 2023-10-03 14:43:54,244 WARNING [train.py:1204] (2/4) Exclude cut with ID _3067_210_2667_1_1526212974640_4503168_119-175776-0_sp1.1 from training. Duration: 0.67275 2023-10-03 14:43:58,342 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.0.nonlin_attention.balancer.max_positive, batch_count=1303513.3333333333, ans=0.95 2023-10-03 14:43:59,518 WARNING [train.py:1204] (2/4) Exclude cut with ID _3231_210_2106_1_1526205864934_6784220_183-303839-0_sp1.1 from training. Duration: 0.97275 2023-10-03 14:43:59,689 WARNING [train.py:1204] (2/4) Exclude cut with ID _18513_210_9206_1_1531618215223_3862620_165-38958-0_sp1.1 from training. Duration: 0.963625 2023-10-03 14:44:01,091 WARNING [train.py:1204] (2/4) Exclude cut with ID _41314_210_12651_1_1533459476543_3766460_87-272068-0_sp1.1 from training. Duration: 0.7545625 2023-10-03 14:44:02,450 WARNING [train.py:1204] (2/4) Exclude cut with ID _3541_210_2477_1_1526475408709_3263973_353-95-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 14:44:03,828 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_2009_210_2071_1_1525481134_4588937_73-302813-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 14:44:03,929 WARNING [train.py:1204] (2/4) Exclude cut with ID _64220_210_9527_1_1535250151772_4875259_88-95011-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 14:44:04,185 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.attention_skip_rate, batch_count=1303513.3333333333, ans=0.0 2023-10-03 14:44:05,255 WARNING [train.py:1204] (2/4) Exclude cut with ID _44902_210_16187_1_1533888006062_3792078_160-12107-0_sp1.1 from training. Duration: 0.92725 2023-10-03 14:44:05,262 WARNING [train.py:1204] (2/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_547-5424-0 from training. Duration: 0.94 2023-10-03 14:44:06,708 WARNING [train.py:1204] (2/4) Exclude cut with ID _21783_210_11357_1_1531742458730_3762679_268-85656-0_sp1.1 from training. Duration: 0.936375 2023-10-03 14:44:09,544 INFO [train.py:1046] (2/4) Epoch 37, batch 4300, loss[loss=0.1561, simple_loss=0.2272, pruned_loss=0.04248, over 23729.00 frames. ], tot_loss[loss=0.1573, simple_loss=0.2363, pruned_loss=0.03917, over 4687913.85 frames. ], batch size: 232, lr: 2.74e-03, grad_scale: 8.0 2023-10-03 14:44:11,436 WARNING [train.py:1204] (2/4) Exclude cut with ID _66210_210_12651_1_1535446812922_3365850_42-284985-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 14:44:11,467 WARNING [train.py:1204] (2/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_465-214726-0_sp0.9 from training. Duration: 0.9555625 2023-10-03 14:44:16,360 WARNING [train.py:1204] (2/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_50-95770-0_sp1.1 from training. Duration: 0.936375 2023-10-03 14:44:23,272 WARNING [train.py:1204] (2/4) Exclude cut with ID _30800_210_22382_1_1532499347904_4515959_269-77567-0_sp1.1 from training. Duration: 0.963625 2023-10-03 14:44:23,284 WARNING [train.py:1204] (2/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_7-111954-0 from training. Duration: 0.56 2023-10-03 14:44:25,052 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_43802_210_2290_1_1533516203_8829504_85-195240-0_sp0.9 from training. Duration: 0.9666875 2023-10-03 14:44:26,932 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_2822_210_2715_1_1526112586_1999587_165-36656-0_sp0.9 from training. Duration: 0.988875 2023-10-03 14:44:28,238 WARNING [train.py:1204] (2/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_181-78632-0_sp1.1 from training. Duration: 0.7090625 2023-10-03 14:44:28,251 WARNING [train.py:1204] (2/4) Exclude cut with ID IC0671W0044-331887 from training. Duration: 0.896 2023-10-03 14:44:29,721 WARNING [train.py:1204] (2/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_266-137104-0_sp1.1 from training. Duration: 0.5909375 2023-10-03 14:44:31,153 WARNING [train.py:1204] (2/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_137-116442-0_sp0.9 from training. Duration: 0.8666875 2023-10-03 14:44:32,756 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.ff3_skip_rate, batch_count=1303646.6666666667, ans=0.0 2023-10-03 14:44:33,903 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_23801_210_16223_1_1532066392_3294657_570-5439-0 from training. Duration: 0.97 2023-10-03 14:44:33,924 WARNING [train.py:1204] (2/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_29-280622-0_sp1.1 from training. Duration: 0.7454375 2023-10-03 14:44:35,210 WARNING [train.py:1204] (2/4) Exclude cut with ID 3557-8342-0013-54691-0 from training. Duration: 0.92 2023-10-03 14:44:37,920 WARNING [train.py:1204] (2/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_711-15688-0_sp1.1 from training. Duration: 0.3909375 2023-10-03 14:44:39,281 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_15126_210_6753_1_1530778677_2591185_120-277707-0_sp1.1 from training. Duration: 0.72725 2023-10-03 14:44:41,955 WARNING [train.py:1204] (2/4) Exclude cut with ID _14970_210_15709_1_1530775547361_1966965_8-169924-0_sp1.1 from training. Duration: 0.836375 2023-10-03 14:44:41,957 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_4692_210_3528_1_1526899755_4323254_107-44401-0_sp0.9 from training. Duration: 0.9555625 2023-10-03 14:44:43,372 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_3475_210_2028_1_1526193578_3622569_265-276625-0_sp0.9 from training. Duration: 0.8444375 2023-10-03 14:44:44,810 WARNING [train.py:1204] (2/4) Exclude cut with ID _55153_210_2253_1_1534384786677_3162096_148-79022-0_sp1.1 from training. Duration: 0.87275 2023-10-03 14:44:46,702 WARNING [train.py:1204] (2/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_292-36336-0_sp1.1 from training. Duration: 0.92725 2023-10-03 14:44:46,718 WARNING [train.py:1204] (2/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_329-82235-0 from training. Duration: 0.82 2023-10-03 14:44:48,143 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_11305_210_1794_1_1529750428_7178148_257-312870-0 from training. Duration: 0.88 2023-10-03 14:44:51,627 WARNING [train.py:1204] (2/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_436-4493-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 14:44:53,190 WARNING [train.py:1204] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_321-54129-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 14:44:53,200 WARNING [train.py:1204] (2/4) Exclude cut with ID _5287_210_2084_1_1527418883040_3878670_333-48508-0_sp0.9 from training. Duration: 0.6666875 2023-10-03 14:44:53,221 WARNING [train.py:1204] (2/4) Exclude cut with ID _26528_210_9070_1_1532570410842_3851310_244-278158-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 14:44:54,510 WARNING [train.py:1204] (2/4) Exclude cut with ID _46952_210_5986_1_1534230010357_3107379_232-344503-0_sp1.1 from training. Duration: 0.87275 2023-10-03 14:44:54,529 WARNING [train.py:1204] (2/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_43-206757-0 from training. Duration: 0.75 2023-10-03 14:44:54,531 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_4183_210_2087_1_1526729100_1869838_141-166600-0 from training. Duration: 0.95 2023-10-03 14:44:55,190 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_296-287678-0 from training. Duration: 0.95 2023-10-03 14:44:56,367 WARNING [train.py:1204] (2/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_265-60343-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 14:44:58,223 WARNING [train.py:1204] (2/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_332-256168-0 from training. Duration: 0.75 2023-10-03 14:44:58,258 WARNING [train.py:1204] (2/4) Exclude cut with ID _13679_210_7497_1_1530534293662_3355098_411-186088-0 from training. Duration: 0.94 2023-10-03 14:45:00,961 WARNING [train.py:1204] (2/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_458-126779-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 14:45:02,323 WARNING [train.py:1204] (2/4) Exclude cut with ID IC0803W0380-397572 from training. Duration: 0.032 2023-10-03 14:45:03,617 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_278-272612-0_sp1.1 from training. Duration: 0.663625 2023-10-03 14:45:06,360 WARNING [train.py:1204] (2/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_491-263101-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 14:45:06,366 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_53555_210_5632_1_1534595220_7232297_334-336778-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 14:45:07,833 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_50-3289-0 from training. Duration: 0.91 2023-10-03 14:45:09,123 WARNING [train.py:1204] (2/4) Exclude cut with ID _8699_210_2069_1_1528630346831_3960409_155-39766-0_sp0.9 from training. Duration: 0.8666875 2023-10-03 14:45:09,127 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_502-82145-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 14:45:09,183 WARNING [train.py:1204] (2/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_550-43132-0_sp1.1 from training. Duration: 0.97275 2023-10-03 14:45:09,204 WARNING [train.py:1204] (2/4) Exclude cut with ID _14998_210_5219_1_1530705582080_7474970_446-112117-0_sp1.1 from training. Duration: 0.8181875 2023-10-03 14:45:09,261 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_3691_210_3528_1_1526468169_4062053_355-334432-0_sp1.1 from training. Duration: 0.7909375 2023-10-03 14:45:12,042 WARNING [train.py:1204] (2/4) Exclude cut with ID _58605_210_12210_1_1534723195939_3904259_239-94720-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 14:45:13,512 WARNING [train.py:1204] (2/4) Exclude cut with ID _49376_210_5986_1_1533985278732_3370740_391-344072-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 14:45:14,956 WARNING [train.py:1204] (2/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_558-322014-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 14:45:14,999 WARNING [train.py:1204] (2/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_256-32662-0_sp1.1 from training. Duration: 0.8181875 2023-10-03 14:45:21,569 WARNING [train.py:1204] (2/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_572-185150-0 from training. Duration: 0.97 2023-10-03 14:45:22,843 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_89-35748-0_sp0.9 from training. Duration: 0.87775 2023-10-03 14:45:24,267 INFO [train.py:1046] (2/4) Epoch 37, batch 4350, loss[loss=0.1596, simple_loss=0.2455, pruned_loss=0.03681, over 24653.00 frames. ], tot_loss[loss=0.1586, simple_loss=0.238, pruned_loss=0.03961, over 4697981.57 frames. ], batch size: 65, lr: 2.74e-03, grad_scale: 8.0 2023-10-03 14:45:26,373 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6291_210_3273_1_1527683649_1078498_88-66747-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 14:45:29,740 WARNING [train.py:1204] (2/4) Exclude cut with ID _19504_210_9994_1_1531648863242_3325220_357-237029-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 14:45:32,311 WARNING [train.py:1204] (2/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_302-2550-0_sp1.1 from training. Duration: 0.736375 2023-10-03 14:45:32,311 WARNING [train.py:1204] (2/4) Exclude cut with ID _9461_210_4557_1_1529060514726_3677451_59-194017-0_sp1.1 from training. Duration: 0.963625 2023-10-03 14:45:37,821 WARNING [train.py:1204] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_665-196155-0_sp1.1 from training. Duration: 0.6090625 2023-10-03 14:45:40,594 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_326-315007-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 14:45:42,097 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.0.self_attn_weights.pos_emb_skip_rate, batch_count=1303980.0, ans=0.0 2023-10-03 14:45:43,544 WARNING [train.py:1204] (2/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_105-328174-0_sp1.1 from training. Duration: 0.6909375 2023-10-03 14:45:43,554 WARNING [train.py:1204] (2/4) Exclude cut with ID _56494_210_4220_1_1535284837625_3033169_106-263722-0_sp1.1 from training. Duration: 0.97275 2023-10-03 14:45:46,428 WARNING [train.py:1204] (2/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_770-200430-0_sp0.9 from training. Duration: 0.988875 2023-10-03 14:45:49,218 WARNING [train.py:1204] (2/4) Exclude cut with ID _65640_210_6310_1_1535368201445_3173250_209-13008-0_sp1.1 from training. Duration: 0.9 2023-10-03 14:45:51,043 WARNING [train.py:1204] (2/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_212-149947-0_sp0.9 from training. Duration: 0.811125 2023-10-03 14:45:57,035 WARNING [train.py:1204] (2/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_188-171673-0 from training. Duration: 0.58 2023-10-03 14:45:57,100 WARNING [train.py:1204] (2/4) Exclude cut with ID _58895_210_8750_1_1535184081320_3649089_286-281724-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 14:45:57,168 WARNING [train.py:1204] (2/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_16-254909-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 14:46:02,715 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.conv_module1.balancer2.prob, batch_count=1304046.6666666667, ans=0.125 2023-10-03 14:46:03,752 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.591e+02 1.872e+02 2.033e+02 2.293e+02 3.578e+02, threshold=4.065e+02, percent-clipped=0.0 2023-10-03 14:46:03,906 WARNING [train.py:1204] (2/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_52-95655-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 14:46:06,799 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.bypass.scale_min, batch_count=1304046.6666666667, ans=0.2 2023-10-03 14:46:07,841 WARNING [train.py:1204] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_554-45617-0 from training. Duration: 0.99 2023-10-03 14:46:10,584 WARNING [train.py:1204] (2/4) Exclude cut with ID _14683_210_5219_1_1530698352608_3974859_473-344611-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 14:46:11,906 WARNING [train.py:1204] (2/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_17-25045-0_sp0.9 from training. Duration: 0.6555625 2023-10-03 14:46:14,767 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9072W0003-530637 from training. Duration: 0.768 2023-10-03 14:46:16,308 WARNING [train.py:1204] (2/4) Exclude cut with ID _38670_210_15379_1_1533448810659_4391059_76-74025-0_sp1.1 from training. Duration: 0.936375 2023-10-03 14:46:16,361 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6291_210_3273_1_1527683649_1078498_74-292608-0_sp0.9 from training. Duration: 0.8 2023-10-03 14:46:17,583 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9084W0025-536862 from training. Duration: 0.896 2023-10-03 14:46:18,882 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9087W0021-538392 from training. Duration: 0.768 2023-10-03 14:46:18,888 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7543_210_6753_1_1528543323_645515_56-163234-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 14:46:18,914 WARNING [train.py:1204] (2/4) Exclude cut with ID _45560_210_21982_1_1533812603951_3584999_44-332643-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 14:46:20,264 WARNING [train.py:1204] (2/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_1285-256622-0_sp1.1 from training. Duration: 0.763625 2023-10-03 14:46:20,978 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.1.encoder.layers.1.whiten, num_groups=1, num_channels=256, metric=3.77 vs. limit=12.0 2023-10-03 14:46:21,604 WARNING [train.py:1204] (2/4) Exclude cut with ID _52781_210_16187_1_1534297729099_1118149_141-325632-0_sp1.1 from training. Duration: 0.936375 2023-10-03 14:46:21,671 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_40896_210_15710_1_1533433956_7100802_1051-137631-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 14:46:22,967 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_13091_210_7925_1_1530609447_8987354_233-18333-0_sp1.1 from training. Duration: 0.97275 2023-10-03 14:46:25,049 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_34008_210_10431_1_1533345424_8183247_662-317331-0 from training. Duration: 0.94 2023-10-03 14:46:25,055 WARNING [train.py:1204] (2/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_291-248920-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 14:46:25,057 WARNING [train.py:1204] (2/4) Exclude cut with ID _65263_210_22356_1_1535763598691_3921037_274-288481-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 14:46:25,081 WARNING [train.py:1204] (2/4) Exclude cut with ID _5189_210_4557_1_1527248319428_3769229_233-168080-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 14:46:26,508 WARNING [train.py:1204] (2/4) Exclude cut with ID _54871_210_22356_1_1534665798253_3856359_135-184281-0 from training. Duration: 0.99 2023-10-03 14:46:26,751 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.balancer2.prob, batch_count=1304180.0, ans=0.125 2023-10-03 14:46:27,902 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9123W0012-555972 from training. Duration: 0.896 2023-10-03 14:46:27,913 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9123W0118-556077 from training. Duration: 0.896 2023-10-03 14:46:27,923 WARNING [train.py:1204] (2/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_406-275649-0 from training. Duration: 0.42 2023-10-03 14:46:31,092 WARNING [train.py:1204] (2/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_309-13684-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 14:46:31,122 WARNING [train.py:1204] (2/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_129-337802-0_sp1.1 from training. Duration: 0.6545625 2023-10-03 14:46:31,140 WARNING [train.py:1204] (2/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_649-266825-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 14:46:31,209 WARNING [train.py:1204] (2/4) Exclude cut with ID _40519_210_13794_1_1533715228655_7337390_895-224742-0_sp1.1 from training. Duration: 0.92725 2023-10-03 14:46:34,389 WARNING [train.py:1204] (2/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_234-14174-0 from training. Duration: 0.82 2023-10-03 14:46:35,827 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9156W0009-572502 from training. Duration: 0.768 2023-10-03 14:46:35,833 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_424-245529-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 14:46:37,888 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.3.encoder.layers.3.self_attn_weights.whiten_keys, num_groups=8, num_channels=256, metric=2.53 vs. limit=6.0 2023-10-03 14:46:38,482 INFO [train.py:1046] (2/4) Epoch 37, batch 4400, loss[loss=0.164, simple_loss=0.2502, pruned_loss=0.03895, over 24661.00 frames. ], tot_loss[loss=0.1586, simple_loss=0.2383, pruned_loss=0.0394, over 4721630.43 frames. ], batch size: 73, lr: 2.74e-03, grad_scale: 16.0 2023-10-03 14:46:38,619 WARNING [train.py:1204] (2/4) Exclude cut with ID _26528_210_9070_1_1532570410842_3851310_232-10149-0_sp1.1 from training. Duration: 0.963625 2023-10-03 14:46:38,632 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_17292_210_6753_1_1531125388_2943734_152-30410-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 14:46:41,290 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_35441_210_16554_1_1533347572_4092680_291-82075-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 14:46:44,017 WARNING [train.py:1204] (2/4) Exclude cut with ID _12369_210_4381_1_1530356078581_4388520_64-298917-0 from training. Duration: 0.88 2023-10-03 14:46:44,037 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_5618_210_1874_1_1527311008_5851_1-214501-0_sp0.9 from training. Duration: 0.7655625 2023-10-03 14:46:44,067 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_9460_210_4211_1_1528954587_10049667_572-37035-0 from training. Duration: 0.8 2023-10-03 14:46:44,093 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9193W0023-590322 from training. Duration: 0.896 2023-10-03 14:46:45,464 WARNING [train.py:1204] (2/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_206-10097-0_sp1.1 from training. Duration: 0.4454375 2023-10-03 14:46:45,476 WARNING [train.py:1204] (2/4) Exclude cut with ID _55134_210_21982_1_1534590028377_3882170_17-51731-0_sp1.1 from training. Duration: 0.963625 2023-10-03 14:46:48,093 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_14953_210_2151_1_1530701710_5616165_710-165400-0 from training. Duration: 0.87 2023-10-03 14:46:48,729 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.3.encoder.layers.2.whiten, num_groups=1, num_channels=512, metric=4.68 vs. limit=12.0 2023-10-03 14:46:50,948 WARNING [train.py:1204] (2/4) Exclude cut with ID _53203_210_2087_1_1534246218309_475590_61-85777-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 14:46:52,328 WARNING [train.py:1204] (2/4) Exclude cut with ID _55844_210_2346_1_1534471212569_7625707_1050-10453-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 14:46:52,346 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9218W0013-603297 from training. Duration: 0.896 2023-10-03 14:46:52,598 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.conv_skip_rate, batch_count=1304313.3333333333, ans=0.0 2023-10-03 14:46:54,748 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.3.encoder.layers.3.whiten, num_groups=1, num_channels=512, metric=3.25 vs. limit=12.0 2023-10-03 14:46:56,103 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_20120_210_6753_1_1531643035_5482859_267-102818-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 14:46:56,104 WARNING [train.py:1204] (2/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_191-41023-0 from training. Duration: 0.71 2023-10-03 14:46:56,148 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9231W0202-610227 from training. Duration: 0.896 2023-10-03 14:46:58,964 WARNING [train.py:1204] (2/4) Exclude cut with ID _6537_210_1883_1_1527987559710_3597129_382-133062-0 from training. Duration: 0.68 2023-10-03 14:47:00,660 WARNING [train.py:1204] (2/4) Exclude cut with ID _28427_210_4220_1_1532581240967_3061069_43-84928-0 from training. Duration: 0.99 2023-10-03 14:47:00,692 WARNING [train.py:1204] (2/4) Exclude cut with ID _59256_210_5986_1_1535076085997_3589120_121-237386-0 from training. Duration: 0.96 2023-10-03 14:47:00,726 WARNING [train.py:1204] (2/4) Exclude cut with ID _8480_210_1874_1_1528589391205_6728330_137-256986-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 14:47:01,995 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_9417_210_1794_1_1529235261_4614131_479-346963-0_sp1.1 from training. Duration: 0.936375 2023-10-03 14:47:02,017 WARNING [train.py:1204] (2/4) Exclude cut with ID _26474_210_9570_1_1532520022699_3623380_267-7362-0_sp1.1 from training. Duration: 0.936375 2023-10-03 14:47:04,751 WARNING [train.py:1204] (2/4) Exclude cut with ID _55328_210_27767_1_1534580973260_4088170_287-61272-0_sp1.1 from training. Duration: 0.97275 2023-10-03 14:47:06,160 WARNING [train.py:1204] (2/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_589-9070-0 from training. Duration: 0.71 2023-10-03 14:47:06,167 WARNING [train.py:1204] (2/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_148-158509-0_sp1.1 from training. Duration: 0.37275 2023-10-03 14:47:08,060 WARNING [train.py:1204] (2/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_164-184548-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 14:47:09,460 WARNING [train.py:1204] (2/4) Exclude cut with ID _14970_210_15709_1_1530775547361_1966965_6-23328-0_sp1.1 from training. Duration: 0.7818125 2023-10-03 14:47:09,465 WARNING [train.py:1204] (2/4) Exclude cut with ID _29303_210_2575_1_1532392237303_7410649_612-165069-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 14:47:10,873 WARNING [train.py:1204] (2/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_383-321224-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 14:47:10,903 WARNING [train.py:1204] (2/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_283-160525-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 14:47:10,922 WARNING [train.py:1204] (2/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_505-97987-0 from training. Duration: 0.86 2023-10-03 14:47:10,992 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9309W0190-640317 from training. Duration: 0.768 2023-10-03 14:47:14,385 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.4.encoder.layers.1.self_attn_weights.whiten_keys, num_groups=4, num_channels=128, metric=2.17 vs. limit=6.0 2023-10-03 14:47:15,150 WARNING [train.py:1204] (2/4) Exclude cut with ID _50093_210_4220_1_1534417141207_3432299_280-297390-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 14:47:15,397 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.balancer2.prob, batch_count=1304380.0, ans=0.125 2023-10-03 14:47:20,558 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_17292_210_6753_1_1531125388_2943734_320-71385-0_sp1.1 from training. Duration: 0.97275 2023-10-03 14:47:22,116 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_213-103282-0 from training. Duration: 0.78 2023-10-03 14:47:25,416 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7615_210_4211_1_1528110965_4580160_72-235211-0_sp0.9 from training. Duration: 0.8444375 2023-10-03 14:47:29,921 WARNING [train.py:1204] (2/4) Exclude cut with ID _14867_210_1968_1_1530684179890_3846969_173-320977-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 14:47:32,103 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7046_210_6753_1_1527895337_6920917_783-278109-0_sp0.9 from training. Duration: 0.8555625 2023-10-03 14:47:32,156 WARNING [train.py:1204] (2/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_90-30673-0 from training. Duration: 0.84 2023-10-03 14:47:32,171 WARNING [train.py:1204] (2/4) Exclude cut with ID _22826_210_9994_1_1531908201522_3279169_415-161-0_sp1.1 from training. Duration: 0.8909375 2023-10-03 14:47:32,184 WARNING [train.py:1204] (2/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_86-66192-0_sp0.9 from training. Duration: 0.611125 2023-10-03 14:47:32,186 WARNING [train.py:1204] (2/4) Exclude cut with ID _4626_210_3532_1_1526817652717_3916859_446-206656-0_sp1.1 from training. Duration: 0.6909375 2023-10-03 14:47:32,241 WARNING [train.py:1204] (2/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_166-128941-0_sp0.9 from training. Duration: 0.6 2023-10-03 14:47:36,976 WARNING [train.py:1204] (2/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_345-167986-0 from training. Duration: 0.84 2023-10-03 14:47:39,608 WARNING [train.py:1204] (2/4) Exclude cut with ID _6275_210_2084_1_1528110062213_7669049_973-198085-0 from training. Duration: 0.75 2023-10-03 14:47:40,984 WARNING [train.py:1204] (2/4) Exclude cut with ID _60057_210_10640_1_1534903065793_3333150_171-181133-0 from training. Duration: 0.88 2023-10-03 14:47:41,004 WARNING [train.py:1204] (2/4) Exclude cut with ID _19504_210_9994_1_1531648863242_3325220_336-196513-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 14:47:41,011 WARNING [train.py:1204] (2/4) Exclude cut with ID _6493_210_1747_1_1528459210424_4012470_513-140967-0 from training. Duration: 0.79 2023-10-03 14:47:42,456 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6673_210_3273_1_1527767790_3173240_327-306185-0_sp0.9 from training. Duration: 0.92225 2023-10-03 14:47:45,248 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1032-236863-0_sp1.1 from training. Duration: 0.763625 2023-10-03 14:47:45,470 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.feed_forward1.hidden_balancer.prob, batch_count=1304513.3333333333, ans=0.125 2023-10-03 14:47:46,692 WARNING [train.py:1204] (2/4) Exclude cut with ID _14867_210_1968_1_1530684179890_3846969_413-29292-0 from training. Duration: 0.9 2023-10-03 14:47:50,647 WARNING [train.py:1204] (2/4) Exclude cut with ID _47251_210_18628_1_1533814063128_4137250_186-343322-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 14:47:51,868 INFO [train.py:1046] (2/4) Epoch 37, batch 4450, loss[loss=0.1699, simple_loss=0.2418, pruned_loss=0.04898, over 23734.00 frames. ], tot_loss[loss=0.1599, simple_loss=0.2396, pruned_loss=0.04006, over 4720695.05 frames. ], batch size: 232, lr: 2.74e-03, grad_scale: 8.0 2023-10-03 14:47:53,285 WARNING [train.py:1204] (2/4) Exclude cut with ID _56235_210_10640_1_1534557335235_4161099_624-46091-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 14:47:55,215 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_791-197891-0_sp0.9 from training. Duration: 0.9333125 2023-10-03 14:47:55,812 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.4.encoder.layers.2.feed_forward3.out_whiten, num_groups=1, num_channels=384, metric=12.70 vs. limit=15.0 2023-10-03 14:48:01,148 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_50050_210_16223_1_1535435796_7290138_786-288457-0_sp1.1 from training. Duration: 0.92725 2023-10-03 14:48:01,164 WARNING [train.py:1204] (2/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_213-227283-0_sp1.1 from training. Duration: 0.963625 2023-10-03 14:48:04,399 WARNING [train.py:1204] (2/4) Exclude cut with ID _42659_210_2070_1_1533607442765_3120380_58-341121-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 14:48:06,380 WARNING [train.py:1204] (2/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_506-23200-0_sp1.1 from training. Duration: 0.8818125 2023-10-03 14:48:06,607 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.conv_module1.balancer1.prob, batch_count=1304646.6666666667, ans=0.125 2023-10-03 14:48:07,829 WARNING [train.py:1204] (2/4) Exclude cut with ID _14170_210_5219_1_1530525949868_4469650_336-213530-0_sp1.1 from training. Duration: 0.8181875 2023-10-03 14:48:07,846 WARNING [train.py:1204] (2/4) Exclude cut with ID _64135_210_2253_1_1535677267876_7608972_180-5312-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 14:48:10,488 WARNING [train.py:1204] (2/4) Exclude cut with ID _12302_210_5219_1_1529924446466_8107280_835-279754-0 from training. Duration: 0.9 2023-10-03 14:48:10,490 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_20120_210_6753_1_1531643035_5482859_510-96996-0_sp1.1 from training. Duration: 0.7909375 2023-10-03 14:48:11,809 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_15517_210_6753_1_1530954008_3487336_216-187834-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 14:48:11,841 WARNING [train.py:1204] (2/4) Exclude cut with ID _19504_210_9994_1_1531648863242_3325220_414-113427-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 14:48:11,842 WARNING [train.py:1204] (2/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_264-108685-0_sp0.9 from training. Duration: 0.611125 2023-10-03 14:48:13,242 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_5438_210_1794_1_1527336687_2600009_104-288274-0_sp0.9 from training. Duration: 0.6444375 2023-10-03 14:48:14,793 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.feed_forward2.hidden_balancer.prob, batch_count=1304646.6666666667, ans=0.125 2023-10-03 14:48:17,479 WARNING [train.py:1204] (2/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_520-39496-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 14:48:18,140 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.3.encoder.layers.0.self_attn2.whiten, num_groups=1, num_channels=512, metric=15.50 vs. limit=22.5 2023-10-03 14:48:18,787 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_37818_210_4238_1_1533620910_4720083_315-308565-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 14:48:18,900 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_2009_210_2071_1_1525481134_4588937_385-302100-0_sp1.1 from training. Duration: 0.7909375 2023-10-03 14:48:20,212 WARNING [train.py:1204] (2/4) Exclude cut with ID _58895_210_8750_1_1535184081320_3649089_277-53219-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 14:48:20,514 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.balancer2.prob, batch_count=1304713.3333333333, ans=0.125 2023-10-03 14:48:21,587 WARNING [train.py:1204] (2/4) Exclude cut with ID _26664_210_7770_1_1532258943280_3418270_121-111258-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 14:48:26,178 WARNING [train.py:1204] (2/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_99-167392-0_sp1.1 from training. Duration: 0.5545625 2023-10-03 14:48:27,457 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1113-7369-0 from training. Duration: 0.85 2023-10-03 14:48:27,487 WARNING [train.py:1204] (2/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_352-59958-0 from training. Duration: 0.86 2023-10-03 14:48:27,492 WARNING [train.py:1204] (2/4) Exclude cut with ID _14886_210_4220_1_1530702020355_3516190_159-73748-0_sp1.1 from training. Duration: 0.7545625 2023-10-03 14:48:27,735 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.feed_forward3.hidden_balancer.prob, batch_count=1304713.3333333333, ans=0.125 2023-10-03 14:48:32,540 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.550e+02 1.931e+02 2.177e+02 2.666e+02 4.430e+02, threshold=4.354e+02, percent-clipped=2.0 2023-10-03 14:48:32,631 WARNING [train.py:1204] (2/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_131-119569-0_sp1.1 from training. Duration: 0.92725 2023-10-03 14:48:34,343 WARNING [train.py:1204] (2/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_216-343007-0 from training. Duration: 0.66 2023-10-03 14:48:37,244 WARNING [train.py:1204] (2/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_113-92608-0_sp0.9 from training. Duration: 0.6 2023-10-03 14:48:41,362 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_40047_210_2290_1_1533290519_2374311_132-123271-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 14:48:41,409 WARNING [train.py:1204] (2/4) Exclude cut with ID _8793_210_6310_1_1528542985139_2310009_121-179782-0 from training. Duration: 0.8 2023-10-03 14:48:41,430 WARNING [train.py:1204] (2/4) Exclude cut with ID _43997_210_4525_1_1533538632555_5189329_283-218639-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 14:48:41,433 WARNING [train.py:1204] (2/4) Exclude cut with ID _49376_210_5986_1_1533985278732_3370740_297-78397-0_sp1.1 from training. Duration: 0.936375 2023-10-03 14:48:42,739 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_3475_210_2028_1_1526193578_3622569_377-166133-0_sp0.9 from training. Duration: 0.9555625 2023-10-03 14:48:42,748 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_520-213149-0_sp1.1 from training. Duration: 0.92725 2023-10-03 14:48:44,198 WARNING [train.py:1204] (2/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_567-144483-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 14:48:47,103 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_4183_210_2087_1_1526729100_1869838_143-280962-0_sp0.9 from training. Duration: 0.62225 2023-10-03 14:48:47,152 WARNING [train.py:1204] (2/4) Exclude cut with ID _49643_210_7989_1_1534640244224_3939031_611-185515-0 from training. Duration: 0.94 2023-10-03 14:48:47,443 INFO [scaling.py:1118] (2/4) WithLoss: name=encoder.encoders.3.encoder.layers.3.self_attn_weights, loss-sum=0.000e+00 2023-10-03 14:48:48,618 WARNING [train.py:1204] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_88-341718-0_sp0.9 from training. Duration: 0.7333125 2023-10-03 14:48:49,391 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.2.encoder.layers.0.conv_module1.whiten, num_groups=1, num_channels=384, metric=6.49 vs. limit=15.0 2023-10-03 14:48:50,788 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.1.encoder.layers.0.feed_forward3.out_whiten, num_groups=1, num_channels=256, metric=13.23 vs. limit=15.0 2023-10-03 14:48:51,238 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_51225_210_3601_1_1534400634_2327383_49-235441-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 14:48:52,582 WARNING [train.py:1204] (2/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_240-294400-0_sp1.1 from training. Duration: 0.936375 2023-10-03 14:48:53,998 WARNING [train.py:1204] (2/4) Exclude cut with ID _55429_210_10626_1_1534467588527_7174143_640-304825-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 14:48:54,028 WARNING [train.py:1204] (2/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_187-221566-0_sp1.1 from training. Duration: 0.3909375 2023-10-03 14:48:55,434 WARNING [train.py:1204] (2/4) Exclude cut with ID _7606_210_4915_1_1528894253277_5080690_127-177761-0_sp0.9 from training. Duration: 0.711125 2023-10-03 14:48:55,569 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.conv_skip_rate, batch_count=1304846.6666666667, ans=0.0 2023-10-03 14:48:59,947 WARNING [train.py:1204] (2/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_430-116139-0 from training. Duration: 0.85 2023-10-03 14:49:01,387 WARNING [train.py:1204] (2/4) Exclude cut with ID _3675_210_4557_1_1526639934408_4182089_456-18305-0_sp0.9 from training. Duration: 0.8444375 2023-10-03 14:49:05,937 INFO [train.py:1046] (2/4) Epoch 37, batch 4500, loss[loss=0.1528, simple_loss=0.2258, pruned_loss=0.03986, over 23783.00 frames. ], tot_loss[loss=0.1599, simple_loss=0.2397, pruned_loss=0.04006, over 4724800.21 frames. ], batch size: 164, lr: 2.74e-03, grad_scale: 8.0 2023-10-03 14:49:06,088 WARNING [train.py:1204] (2/4) Exclude cut with ID _18513_210_9206_1_1531618215223_3862620_402-5957-0_sp1.1 from training. Duration: 0.9 2023-10-03 14:49:09,123 WARNING [train.py:1204] (2/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_465-156500-0 from training. Duration: 0.72 2023-10-03 14:49:09,125 WARNING [train.py:1204] (2/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_714-310538-0 from training. Duration: 0.86 2023-10-03 14:49:10,580 WARNING [train.py:1204] (2/4) Exclude cut with ID _39411_210_5986_1_1533538825030_3343649_335-301187-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 14:49:15,945 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_41750_210_1960_1_1533520678_3948042_241-158616-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 14:49:15,984 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_46013_210_7234_1_1533776362_3744032_50-141535-0_sp1.1 from training. Duration: 0.9 2023-10-03 14:49:17,292 WARNING [train.py:1204] (2/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_43-206757-0_sp0.9 from training. Duration: 0.8333125 2023-10-03 14:49:17,340 WARNING [train.py:1204] (2/4) Exclude cut with ID _22961_210_8254_1_1531976366672_4278180_172-276296-0_sp1.1 from training. Duration: 0.97275 2023-10-03 14:49:17,374 WARNING [train.py:1204] (2/4) Exclude cut with ID _17956_210_4353_1_1531209568125_6979970_1305-301903-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 14:49:17,589 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.feed_forward3.hidden_balancer.prob, batch_count=1304913.3333333333, ans=0.125 2023-10-03 14:49:18,826 WARNING [train.py:1204] (2/4) Exclude cut with ID _28323_210_16187_1_1532480395667_4100300_604-44507-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 14:49:27,833 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.1.feed_forward1.out_proj.dropout_p, batch_count=1304980.0, ans=0.1 2023-10-03 14:49:30,463 WARNING [train.py:1204] (2/4) Exclude cut with ID _23514_210_1389_1_1531832139785_3031439_88-337544-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 14:49:31,812 WARNING [train.py:1204] (2/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_932-3672-0_sp1.1 from training. Duration: 0.963625 2023-10-03 14:49:33,268 WARNING [train.py:1204] (2/4) Exclude cut with ID _46502_210_13794_1_1533724452198_3657159_207-109991-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 14:49:33,337 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_216-73841-0_sp1.1 from training. Duration: 0.87275 2023-10-03 14:49:35,311 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8453_210_4211_1_1528502940_8562730_347-330673-0_sp1.1 from training. Duration: 0.6545625 2023-10-03 14:49:44,444 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_4183_210_2087_1_1526729100_1869838_143-280962-0_sp1.1 from training. Duration: 0.5090625 2023-10-03 14:49:47,411 WARNING [train.py:1204] (2/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_325-113798-0_sp1.1 from training. Duration: 0.77275 2023-10-03 14:49:48,116 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.3.encoder.layers.0.conv_module1.whiten, num_groups=1, num_channels=512, metric=3.20 vs. limit=15.0 2023-10-03 14:49:51,477 WARNING [train.py:1204] (2/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_91-212486-0_sp0.9 from training. Duration: 0.7555625 2023-10-03 14:49:52,945 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8776_210_2040_1_1528638837_4088036_244-278763-0_sp1.1 from training. Duration: 0.8181875 2023-10-03 14:49:52,985 WARNING [train.py:1204] (2/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_106-286130-0 from training. Duration: 0.98 2023-10-03 14:49:54,354 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_2746_210_1988_1_1525872816_3795627_372-205861-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 14:49:55,668 WARNING [train.py:1204] (2/4) Exclude cut with ID _30800_210_22382_1_1532499347904_4515959_240-54646-0_sp1.1 from training. Duration: 0.936375 2023-10-03 14:49:57,002 WARNING [train.py:1204] (2/4) Exclude cut with ID _59500_210_8254_1_1534809610680_3736009_162-180695-0_sp1.1 from training. Duration: 0.936375 2023-10-03 14:49:57,023 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7544_210_6753_1_1528591327_1471426_146-93357-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 14:50:00,340 WARNING [train.py:1204] (2/4) Exclude cut with ID _42657_210_2070_1_1533520654016_3327008_405-87432-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 14:50:00,374 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_4528_210_3273_1_1527076654_3276777_209-219804-0 from training. Duration: 0.69 2023-10-03 14:50:00,374 WARNING [train.py:1204] (2/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_78-182912-0_sp0.9 from training. Duration: 0.6555625 2023-10-03 14:50:00,381 WARNING [train.py:1204] (2/4) Exclude cut with ID _31255_210_13427_1_1532865619495_3815143_322-114998-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 14:50:05,119 WARNING [train.py:1204] (2/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_848-280957-0_sp0.9 from training. Duration: 0.9666875 2023-10-03 14:50:05,137 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7696_210_2040_1_1528207167_3716099_117-125434-0_sp0.9 from training. Duration: 0.8444375 2023-10-03 14:50:08,259 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_1002-109192-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 14:50:11,014 WARNING [train.py:1204] (2/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_138-64210-0_sp0.9 from training. Duration: 0.811125 2023-10-03 14:50:11,036 WARNING [train.py:1204] (2/4) Exclude cut with ID _25808_210_13292_1_1532343514562_7366109_633-47524-0_sp1.1 from training. Duration: 0.9 2023-10-03 14:50:12,256 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.3.encoder.layers.1.feed_forward3.out_whiten, num_groups=1, num_channels=512, metric=13.60 vs. limit=15.0 2023-10-03 14:50:12,803 WARNING [train.py:1204] (2/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_297-5233-0 from training. Duration: 0.95 2023-10-03 14:50:14,187 WARNING [train.py:1204] (2/4) Exclude cut with ID _8394_210_6702_1_1528590504316_3540189_335-274780-0 from training. Duration: 0.78 2023-10-03 14:50:14,192 WARNING [train.py:1204] (2/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_301-330455-0 from training. Duration: 0.8 2023-10-03 14:50:19,400 INFO [train.py:1046] (2/4) Epoch 37, batch 4550, loss[loss=0.1397, simple_loss=0.2069, pruned_loss=0.03628, over 23495.00 frames. ], tot_loss[loss=0.1597, simple_loss=0.2398, pruned_loss=0.03977, over 4737452.27 frames. ], batch size: 256, lr: 2.74e-03, grad_scale: 8.0 2023-10-03 14:50:19,445 WARNING [train.py:1204] (2/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_383-205833-0 from training. Duration: 0.95 2023-10-03 14:50:20,867 WARNING [train.py:1204] (2/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_48-182155-0 from training. Duration: 0.87 2023-10-03 14:50:22,238 WARNING [train.py:1204] (2/4) Exclude cut with ID _14832_210_8750_1_1530691351539_3171348_1-54315-0_sp1.1 from training. Duration: 0.7909375 2023-10-03 14:50:22,526 INFO [scaling.py:1118] (2/4) WithLoss: name=encoder.encoders.5.encoder.layers.0.self_attn_weights, loss-sum=0.000e+00 2023-10-03 14:50:25,016 WARNING [train.py:1204] (2/4) Exclude cut with ID _44982_210_1511_1_1534312820108_3726029_413-100814-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 14:50:25,069 WARNING [train.py:1204] (2/4) Exclude cut with ID _3231_210_2106_1_1526205864934_6784220_104-261392-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 14:50:27,914 WARNING [train.py:1204] (2/4) Exclude cut with ID _24314_210_4915_1_1532135078116_7170179_1-13588-0_sp1.1 from training. Duration: 0.97275 2023-10-03 14:50:31,287 WARNING [train.py:1204] (2/4) Exclude cut with ID _44304_210_15374_1_1533639352765_4613129_141-8356-0_sp1.1 from training. Duration: 0.963625 2023-10-03 14:50:32,641 WARNING [train.py:1204] (2/4) Exclude cut with ID _37892_210_19596_1_1533198677897_3964058_374-335034-0_sp1.1 from training. Duration: 0.936375 2023-10-03 14:50:34,077 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_195-257512-0_sp0.9 from training. Duration: 0.7444375 2023-10-03 14:50:34,079 WARNING [train.py:1204] (2/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_1267-272693-0_sp1.1 from training. Duration: 0.663625 2023-10-03 14:50:34,080 WARNING [train.py:1204] (2/4) Exclude cut with ID _30196_210_1664_1_1533708496522_7074140_690-326155-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 14:50:36,104 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_68847_210_14656_1_1536070214_2586843_38-20717-0_sp1.1 from training. Duration: 0.97275 2023-10-03 14:50:37,424 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_99-251314-0_sp1.1 from training. Duration: 0.7909375 2023-10-03 14:50:39,079 WARNING [train.py:1204] (2/4) Exclude cut with ID _49376_210_5986_1_1533985278732_3370740_225-95630-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 14:50:42,407 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_2655_210_2087_1_1525688959_3897400_202-147557-0 from training. Duration: 0.85 2023-10-03 14:50:43,712 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_5618_210_1874_1_1527311008_5851_1-214501-0_sp1.1 from training. Duration: 0.626375 2023-10-03 14:50:45,078 WARNING [train.py:1204] (2/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_225-43381-0_sp1.1 from training. Duration: 0.8545625 2023-10-03 14:50:45,824 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.3.encoder.layers.1.feed_forward2.out_whiten, num_groups=1, num_channels=512, metric=7.30 vs. limit=15.0 2023-10-03 14:50:46,459 WARNING [train.py:1204] (2/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_242-3472-0 from training. Duration: 0.73 2023-10-03 14:50:49,252 WARNING [train.py:1204] (2/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_339-104791-0 from training. Duration: 0.49 2023-10-03 14:50:50,651 WARNING [train.py:1204] (2/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_488-92261-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 14:50:54,773 WARNING [train.py:1204] (2/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_175-163894-0 from training. Duration: 0.74 2023-10-03 14:50:56,185 WARNING [train.py:1204] (2/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_41-234353-0_sp0.9 from training. Duration: 0.7555625 2023-10-03 14:50:58,876 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.515e+02 1.866e+02 2.085e+02 2.369e+02 3.164e+02, threshold=4.169e+02, percent-clipped=0.0 2023-10-03 14:50:58,973 WARNING [train.py:1204] (2/4) Exclude cut with ID _47251_210_18628_1_1533814063128_4137250_116-275927-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 14:50:59,001 WARNING [train.py:1204] (2/4) Exclude cut with ID _4697_210_2069_1_1526815801953_3823098_61-223324-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 14:50:59,022 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6961_210_2087_1_1527937168_6815011_562-165518-0_sp1.1 from training. Duration: 0.7 2023-10-03 14:51:01,041 WARNING [train.py:1204] (2/4) Exclude cut with ID _21989_210_5854_1_1531899214931_3271360_133-44759-0 from training. Duration: 0.77 2023-10-03 14:51:02,666 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.conv_module1.balancer1.prob, batch_count=1305446.6666666667, ans=0.125 2023-10-03 14:51:04,166 WARNING [train.py:1204] (2/4) Exclude cut with ID _23557_210_8750_1_1531890106370_6846255_77-284923-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 14:51:06,229 WARNING [train.py:1204] (2/4) Exclude cut with ID _13678_210_7497_1_1530530069226_3463170_464-296127-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 14:51:06,247 WARNING [train.py:1204] (2/4) Exclude cut with ID _22613_210_5219_1_1532253599686_4090170_393-36663-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 14:51:08,891 WARNING [train.py:1204] (2/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_235-262124-0_sp0.9 from training. Duration: 0.7444375 2023-10-03 14:51:09,595 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.2.encoder.layers.0.nonlin_attention.whiten1, num_groups=1, num_channels=288, metric=7.63 vs. limit=10.0 2023-10-03 14:51:10,241 WARNING [train.py:1204] (2/4) Exclude cut with ID _8480_210_1874_1_1528589391205_6728330_755-112625-0 from training. Duration: 0.74 2023-10-03 14:51:10,533 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.conv_skip_rate, batch_count=1305446.6666666667, ans=0.0 2023-10-03 14:51:11,553 WARNING [train.py:1204] (2/4) Exclude cut with ID _6456_210_1534_1_1527681634212_3181049_215-309359-0 from training. Duration: 0.88 2023-10-03 14:51:11,577 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_3019_210_2028_1_1526044245_3264865_191-72235-0_sp1.1 from training. Duration: 0.8818125 2023-10-03 14:51:12,905 WARNING [train.py:1204] (2/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_380-162648-0 from training. Duration: 0.94 2023-10-03 14:51:14,942 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_92-46009-0 from training. Duration: 0.54 2023-10-03 14:51:15,187 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.ff3_skip_rate, batch_count=1305446.6666666667, ans=0.0 2023-10-03 14:51:16,263 WARNING [train.py:1204] (2/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_53-300544-0_sp0.9 from training. Duration: 0.7444375 2023-10-03 14:51:16,376 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_68235_210_1925_1_1535863247_4484656_101-250943-0_sp1.1 from training. Duration: 0.97275 2023-10-03 14:51:16,385 WARNING [train.py:1204] (2/4) Exclude cut with ID _14970_210_15709_1_1530775547361_1966965_204-22182-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 14:51:19,018 WARNING [train.py:1204] (2/4) Exclude cut with ID _55153_210_2253_1_1534384786677_3162096_310-130092-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 14:51:19,029 WARNING [train.py:1204] (2/4) Exclude cut with ID _60113_210_16271_1_1535164212481_4386079_559-167191-0_sp1.1 from training. Duration: 0.8454375 2023-10-03 14:51:19,371 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.balancer1.prob, batch_count=1305513.3333333333, ans=0.125 2023-10-03 14:51:20,439 WARNING [train.py:1204] (2/4) Exclude cut with ID _14368_210_4220_1_1530532714870_3595529_93-58002-0_sp1.1 from training. Duration: 0.5181875 2023-10-03 14:51:20,596 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.0.balancer1.prob, batch_count=1305513.3333333333, ans=0.125 2023-10-03 14:51:21,779 WARNING [train.py:1204] (2/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_110-224688-0 from training. Duration: 0.97 2023-10-03 14:51:23,157 WARNING [train.py:1204] (2/4) Exclude cut with ID _40402_210_18219_1_1533522324115_2953150_178-178004-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 14:51:23,166 WARNING [train.py:1204] (2/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_321-335475-0_sp1.1 from training. Duration: 0.4090625 2023-10-03 14:51:23,231 WARNING [train.py:1204] (2/4) Exclude cut with ID _66863_210_18053_1_1535698758334_8590319_1092-271784-0 from training. Duration: 0.96 2023-10-03 14:51:23,238 WARNING [train.py:1204] (2/4) Exclude cut with ID _28317_210_12742_1_1532480302381_8510325_607-213951-0_sp1.1 from training. Duration: 0.763625 2023-10-03 14:51:23,254 WARNING [train.py:1204] (2/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_468-283113-0 from training. Duration: 0.96 2023-10-03 14:51:24,720 WARNING [train.py:1204] (2/4) Exclude cut with ID _14922_210_6228_1_1530707435273_3693310_13-165512-0_sp1.1 from training. Duration: 0.7454375 2023-10-03 14:51:26,123 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_69630_210_7234_1_1535788670_910989_56-169762-0_sp1.1 from training. Duration: 0.92725 2023-10-03 14:51:27,574 WARNING [train.py:1204] (2/4) Exclude cut with ID _67335_210_5219_1_1535525975652_4275229_121-80357-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 14:51:27,607 WARNING [train.py:1204] (2/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_126-12079-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 14:51:27,644 WARNING [train.py:1204] (2/4) Exclude cut with ID _5294_210_1747_1_1527249643928_3696179_342-270278-0_sp1.1 from training. Duration: 0.5 2023-10-03 14:51:28,981 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_30322_210_1925_1_1532843478_826817_35-328264-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 14:51:32,150 WARNING [train.py:1204] (2/4) Exclude cut with ID _8480_210_1874_1_1528589391205_6728330_755-112625-0_sp1.1 from training. Duration: 0.67275 2023-10-03 14:51:34,107 INFO [train.py:1046] (2/4) Epoch 37, batch 4600, loss[loss=0.157, simple_loss=0.2486, pruned_loss=0.03266, over 24436.00 frames. ], tot_loss[loss=0.1586, simple_loss=0.2389, pruned_loss=0.03917, over 4742837.65 frames. ], batch size: 69, lr: 2.74e-03, grad_scale: 8.0 2023-10-03 14:51:34,236 WARNING [train.py:1204] (2/4) Exclude cut with ID _38670_210_15379_1_1533448810659_4391059_132-146412-0_sp1.1 from training. Duration: 0.97275 2023-10-03 14:51:35,641 WARNING [train.py:1204] (2/4) Exclude cut with ID _22020_210_2461_1_1531724164513_6326900_563-39691-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 14:51:38,855 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_9460_210_4211_1_1528954587_10049667_572-37035-0_sp1.1 from training. Duration: 0.72725 2023-10-03 14:51:38,867 WARNING [train.py:1204] (2/4) Exclude cut with ID _68777_210_4353_1_1535810390731_3117904_129-166012-0_sp0.9 from training. Duration: 0.9444375 2023-10-03 14:51:40,221 WARNING [train.py:1204] (2/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_487-91921-0_sp1.1 from training. Duration: 0.963625 2023-10-03 14:51:41,566 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_195-257512-0 from training. Duration: 0.67 2023-10-03 14:51:42,909 WARNING [train.py:1204] (2/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_188-260011-0_sp1.1 from training. Duration: 0.663625 2023-10-03 14:51:45,111 WARNING [train.py:1204] (2/4) Exclude cut with ID _8480_210_1874_1_1528589391205_6728330_309-139896-0_sp0.9 from training. Duration: 0.9 2023-10-03 14:51:46,321 WARNING [train.py:1204] (2/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_866-321248-0_sp1.1 from training. Duration: 0.963625 2023-10-03 14:51:49,085 WARNING [train.py:1204] (2/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_510-216513-0_sp1.1 from training. Duration: 0.97275 2023-10-03 14:51:49,154 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder_embed.conv.2.prob, batch_count=1305646.6666666667, ans=0.125 2023-10-03 14:51:55,986 WARNING [train.py:1204] (2/4) Exclude cut with ID _6653_210_2629_1_1527919172441_6732679_758-143677-0 from training. Duration: 0.98 2023-10-03 14:51:57,333 WARNING [train.py:1204] (2/4) Exclude cut with ID _55134_210_21982_1_1534590028377_3882170_50-214427-0_sp1.1 from training. Duration: 0.97275 2023-10-03 14:52:00,046 WARNING [train.py:1204] (2/4) Exclude cut with ID _31649_210_6228_1_1532584608063_4091078_482-301330-0_sp1.1 from training. Duration: 0.97275 2023-10-03 14:52:02,151 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.nonlin_attention.balancer.max_positive, batch_count=1305713.3333333333, ans=0.95 2023-10-03 14:52:03,837 WARNING [train.py:1204] (2/4) Exclude cut with ID _35799_210_5854_1_1533263487887_3529071_490-326847-0_sp1.1 from training. Duration: 0.9 2023-10-03 14:52:03,851 WARNING [train.py:1204] (2/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_984-90716-0_sp1.1 from training. Duration: 0.963625 2023-10-03 14:52:11,284 WARNING [train.py:1204] (2/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_166-246557-0 from training. Duration: 0.64 2023-10-03 14:52:11,284 WARNING [train.py:1204] (2/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_51-200220-0_sp1.1 from training. Duration: 0.5090625 2023-10-03 14:52:12,595 WARNING [train.py:1204] (2/4) Exclude cut with ID _51382_210_23862_1_1534492801838_3650104_275-92796-0_sp1.1 from training. Duration: 0.936375 2023-10-03 14:52:15,537 WARNING [train.py:1204] (2/4) Exclude cut with ID _29853_210_7497_1_1532505600851_6642110_461-142829-0_sp1.1 from training. Duration: 0.97275 2023-10-03 14:52:16,925 WARNING [train.py:1204] (2/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_788-298907-0_sp0.9 from training. Duration: 0.988875 2023-10-03 14:52:18,396 WARNING [train.py:1204] (2/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_739-2098-0_sp1.1 from training. Duration: 0.836375 2023-10-03 14:52:21,602 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7543_210_6753_1_1528541907_1399322_165-323619-0 from training. Duration: 0.69 2023-10-03 14:52:22,888 WARNING [train.py:1204] (2/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_401-255491-0_sp1.1 from training. Duration: 0.636375 2023-10-03 14:52:25,729 WARNING [train.py:1204] (2/4) Exclude cut with ID _66873_210_5517_1_1535454136506_4293370_24-143283-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 14:52:27,129 WARNING [train.py:1204] (2/4) Exclude cut with ID _14459_210_2228_1_1531443641946_7516700_1221-280439-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 14:52:28,626 WARNING [train.py:1204] (2/4) Exclude cut with ID _59202_210_26984_1_1534816810612_3549174_469-213482-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 14:52:28,628 WARNING [train.py:1204] (2/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_148-158509-0_sp0.9 from training. Duration: 0.4555625 2023-10-03 14:52:29,904 WARNING [train.py:1204] (2/4) Exclude cut with ID _24314_210_4915_1_1532135078116_7170179_863-259095-0_sp1.1 from training. Duration: 0.97275 2023-10-03 14:52:31,195 WARNING [train.py:1204] (2/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_321-22821-0 from training. Duration: 0.89 2023-10-03 14:52:31,219 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_43831_210_2158_1_1533725502_5872791_55-163137-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 14:52:31,264 WARNING [train.py:1204] (2/4) Exclude cut with ID _27430_210_7770_1_1532604689375_1378590_26-25207-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 14:52:33,229 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_17613_210_2151_1_1531137569_2366160_223-316461-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 14:52:34,550 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_53679_210_4852_1_1534556726_4186141_541-213370-0_sp1.1 from training. Duration: 0.963625 2023-10-03 14:52:34,618 WARNING [train.py:1204] (2/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_369-295086-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 14:52:34,675 WARNING [train.py:1204] (2/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_91-267696-0 from training. Duration: 0.87 2023-10-03 14:52:36,121 WARNING [train.py:1204] (2/4) Exclude cut with ID _5294_210_1747_1_1527249643928_3696179_180-100713-0 from training. Duration: 0.81 2023-10-03 14:52:36,144 WARNING [train.py:1204] (2/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_32-335103-0 from training. Duration: 0.82 2023-10-03 14:52:36,151 WARNING [train.py:1204] (2/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_133-312727-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 14:52:36,236 WARNING [train.py:1204] (2/4) Exclude cut with ID _59288_210_5854_1_1535077757774_3412246_391-165934-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 14:52:37,562 WARNING [train.py:1204] (2/4) Exclude cut with ID _28323_210_16187_1_1532480395667_4100300_488-1281-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 14:52:37,646 WARNING [train.py:1204] (2/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_124-193207-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 14:52:48,316 INFO [train.py:1046] (2/4) Epoch 37, batch 4650, loss[loss=0.1501, simple_loss=0.2375, pruned_loss=0.03133, over 24328.00 frames. ], tot_loss[loss=0.1588, simple_loss=0.239, pruned_loss=0.03936, over 4746146.69 frames. ], batch size: 74, lr: 2.74e-03, grad_scale: 8.0 2023-10-03 14:52:48,364 WARNING [train.py:1204] (2/4) Exclude cut with ID _14087_210_2253_1_1530529224512_3014029_226-242140-0_sp0.9 from training. Duration: 0.9 2023-10-03 14:52:49,747 WARNING [train.py:1204] (2/4) Exclude cut with ID _29277_210_3279_1_1532581109293_3173069_392-3694-0_sp1.1 from training. Duration: 0.936375 2023-10-03 14:52:49,777 WARNING [train.py:1204] (2/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_563-329842-0_sp1.1 from training. Duration: 0.97275 2023-10-03 14:52:51,128 WARNING [train.py:1204] (2/4) Exclude cut with ID _58583_210_5236_1_1534726835266_3574249_391-87755-0_sp1.1 from training. Duration: 0.92725 2023-10-03 14:52:51,158 WARNING [train.py:1204] (2/4) Exclude cut with ID _28427_210_4220_1_1532581240967_3061069_281-237827-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 14:52:51,172 WARNING [train.py:1204] (2/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_44-147898-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 14:52:52,588 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_24007_210_1794_1_1531918925_1663875_125-320772-0_sp1.1 from training. Duration: 0.97275 2023-10-03 14:52:55,345 WARNING [train.py:1204] (2/4) Exclude cut with ID _14368_210_4220_1_1530532714870_3595529_143-117421-0 from training. Duration: 0.58 2023-10-03 14:52:59,379 WARNING [train.py:1204] (2/4) Exclude cut with ID _69584_210_5219_1_1535785133775_4044450_325-7656-0_sp0.9 from training. Duration: 0.9555625 2023-10-03 14:53:00,903 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_37351_210_7925_1_1533012931_3812113_265-85780-0 from training. Duration: 0.88 2023-10-03 14:53:00,933 WARNING [train.py:1204] (2/4) Exclude cut with ID _18516_210_9206_1_1531704653206_3692406_331-237896-0_sp1.1 from training. Duration: 0.936375 2023-10-03 14:53:02,932 WARNING [train.py:1204] (2/4) Exclude cut with ID _14629_210_15709_1_1530604298182_956899_6-232695-0 from training. Duration: 0.94 2023-10-03 14:53:02,953 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7544_210_6753_1_1528587000_691468_87-178599-0_sp1.1 from training. Duration: 0.7909375 2023-10-03 14:53:03,015 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_326-303945-0 from training. Duration: 0.99 2023-10-03 14:53:04,316 WARNING [train.py:1204] (2/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_503-30719-0 from training. Duration: 0.98 2023-10-03 14:53:04,324 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_11305_210_1794_1_1529750428_7178148_299-344640-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 14:53:04,382 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_53071_210_16554_1_1534490405_2954066_265-170472-0_sp1.1 from training. Duration: 0.7545625 2023-10-03 14:53:05,852 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_174-277364-0_sp1.1 from training. Duration: 0.6454375 2023-10-03 14:53:07,697 WARNING [train.py:1204] (2/4) Exclude cut with ID _58414_210_8254_1_1535421609162_7151182_193-151884-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 14:53:07,726 WARNING [train.py:1204] (2/4) Exclude cut with ID IC0651W0104-322063 from training. Duration: 0.768 2023-10-03 14:53:10,479 WARNING [train.py:1204] (2/4) Exclude cut with ID _21881_210_3525_1_1531825242757_3798380_139-305982-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 14:53:12,348 WARNING [train.py:1204] (2/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_79-200392-0 from training. Duration: 0.97 2023-10-03 14:53:14,943 WARNING [train.py:1204] (2/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_451-280894-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 14:53:14,951 WARNING [train.py:1204] (2/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_304-273433-0_sp1.1 from training. Duration: 0.9 2023-10-03 14:53:15,023 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_5959_210_1794_1_1527400400_3445726_412-44052-0 from training. Duration: 0.77 2023-10-03 14:53:17,680 WARNING [train.py:1204] (2/4) Exclude cut with ID _4681_210_3525_1_1527251040321_1235713_148-26159-0_sp1.1 from training. Duration: 0.963625 2023-10-03 14:53:20,544 WARNING [train.py:1204] (2/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_887-249555-0_sp0.9 from training. Duration: 0.8555625 2023-10-03 14:53:21,174 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.conv_module1.whiten.whitening_limit, batch_count=1306046.6666666667, ans=15.0 2023-10-03 14:53:23,301 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_25236_210_2158_1_1532051968_280567_37-6860-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 14:53:24,878 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.self_attn_weights.pos_emb_skip_rate, batch_count=1306046.6666666667, ans=0.0 2023-10-03 14:53:27,825 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.571e+02 1.915e+02 2.038e+02 2.271e+02 3.983e+02, threshold=4.075e+02, percent-clipped=0.0 2023-10-03 14:53:27,887 WARNING [train.py:1204] (2/4) Exclude cut with ID _29277_210_3279_1_1532581109293_3173069_417-115527-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 14:53:29,396 WARNING [train.py:1204] (2/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_552-134748-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 14:53:31,093 WARNING [train.py:1204] (2/4) Exclude cut with ID _43176_210_8254_1_1533524417469_4174339_188-174143-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 14:53:31,149 WARNING [train.py:1204] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_127-119109-0_sp1.1 from training. Duration: 0.6545625 2023-10-03 14:53:35,172 WARNING [train.py:1204] (2/4) Exclude cut with ID _14922_210_6228_1_1530707435273_3693310_38-187851-0 from training. Duration: 0.62 2023-10-03 14:53:35,201 WARNING [train.py:1204] (2/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_588-125165-0 from training. Duration: 0.83 2023-10-03 14:53:36,535 WARNING [train.py:1204] (2/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_344-152526-0_sp1.1 from training. Duration: 0.4818125 2023-10-03 14:53:36,537 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7002_210_1794_1_1527935634_4034444_487-236501-0 from training. Duration: 0.66 2023-10-03 14:53:37,925 WARNING [train.py:1204] (2/4) Exclude cut with ID _23825_210_13449_1_1531915313526_3497150_379-16981-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 14:53:42,947 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=1306113.3333333333, ans=0.1 2023-10-03 14:53:46,075 WARNING [train.py:1204] (2/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_297-5233-0_sp1.1 from training. Duration: 0.863625 2023-10-03 14:53:46,080 WARNING [train.py:1204] (2/4) Exclude cut with ID _41298_210_9019_1_1533430371450_4822143_178-283538-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 14:53:46,105 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7046_210_6753_1_1527895337_6920917_642-106799-0 from training. Duration: 0.92 2023-10-03 14:53:47,383 WARNING [train.py:1204] (2/4) Exclude cut with ID _6274_210_2084_1_1527937583837_7494020_532-291952-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 14:53:48,757 WARNING [train.py:1204] (2/4) Exclude cut with ID _47278_210_12041_1_1533902418349_3576110_260-270468-0_sp1.1 from training. Duration: 0.92725 2023-10-03 14:53:50,075 WARNING [train.py:1204] (2/4) Exclude cut with ID _14368_210_4220_1_1530532714870_3595529_144-3287-0_sp0.9 from training. Duration: 0.7444375 2023-10-03 14:53:50,183 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_162-262717-0_sp0.9 from training. Duration: 0.92225 2023-10-03 14:53:51,761 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.conv_module2.balancer1.prob, batch_count=1306180.0, ans=0.125 2023-10-03 14:53:52,951 WARNING [train.py:1204] (2/4) Exclude cut with ID _3465_210_3525_1_1526175667060_3574049_62-335552-0_sp0.9 from training. Duration: 0.9666875 2023-10-03 14:53:52,956 WARNING [train.py:1204] (2/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_726-272852-0_sp1.1 from training. Duration: 0.92725 2023-10-03 14:53:53,171 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.feed_forward1.hidden_balancer.prob, batch_count=1306180.0, ans=0.125 2023-10-03 14:53:54,283 WARNING [train.py:1204] (2/4) Exclude cut with ID _14898_210_9206_1_1530766812377_4177190_26-135492-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 14:53:57,075 WARNING [train.py:1204] (2/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_200-95304-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 14:53:57,104 WARNING [train.py:1204] (2/4) Exclude cut with ID _45012_210_12651_1_1533608435136_5371380_240-37101-0_sp1.1 from training. Duration: 0.8454375 2023-10-03 14:53:57,118 WARNING [train.py:1204] (2/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_132-269633-0_sp0.9 from training. Duration: 0.7555625 2023-10-03 14:53:58,465 WARNING [train.py:1204] (2/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_907-151398-0_sp0.9 from training. Duration: 0.411125 2023-10-03 14:53:58,545 WARNING [train.py:1204] (2/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_45-265684-0_sp0.9 from training. Duration: 0.8 2023-10-03 14:53:59,848 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6291_210_3273_1_1527683649_1078498_74-292608-0 from training. Duration: 0.72 2023-10-03 14:54:01,690 INFO [train.py:1046] (2/4) Epoch 37, batch 4700, loss[loss=0.1733, simple_loss=0.2494, pruned_loss=0.0486, over 23809.00 frames. ], tot_loss[loss=0.1593, simple_loss=0.2392, pruned_loss=0.03966, over 4746289.22 frames. ], batch size: 179, lr: 2.74e-03, grad_scale: 8.0 2023-10-03 14:54:08,994 WARNING [train.py:1204] (2/4) Exclude cut with ID _59202_210_26984_1_1534816810612_3549174_221-84819-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 14:54:09,063 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_33696_210_3852_1_1533082780_2663375_181-325595-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 14:54:09,105 WARNING [train.py:1204] (2/4) Exclude cut with ID _47251_210_18628_1_1533814063128_4137250_351-154105-0_sp1.1 from training. Duration: 0.963625 2023-10-03 14:54:10,973 WARNING [train.py:1204] (2/4) Exclude cut with ID _35501_210_9288_1_1532998507986_3192189_351-334740-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 14:54:12,515 WARNING [train.py:1204] (2/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_342-100157-0_sp1.1 from training. Duration: 0.4909375 2023-10-03 14:54:17,154 WARNING [train.py:1204] (2/4) Exclude cut with ID _6789_210_1534_1_1527857937218_3118950_262-48853-0 from training. Duration: 0.56 2023-10-03 14:54:17,181 WARNING [train.py:1204] (2/4) Exclude cut with ID _69884_210_9771_1_1535846392818_4166169_430-201020-0 from training. Duration: 0.97 2023-10-03 14:54:19,910 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_71-136518-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 14:54:21,218 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_28-291982-0_sp1.1 from training. Duration: 0.8818125 2023-10-03 14:54:21,241 WARNING [train.py:1204] (2/4) Exclude cut with ID _54218_210_12210_1_1534413646388_4409259_437-275326-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 14:54:24,027 WARNING [train.py:1204] (2/4) Exclude cut with ID _55134_210_21982_1_1534590028377_3882170_40-309139-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 14:54:29,519 WARNING [train.py:1204] (2/4) Exclude cut with ID _32704_210_8954_1_1533017488840_6747370_266-329755-0_sp1.1 from training. Duration: 0.8090625 2023-10-03 14:54:29,606 WARNING [train.py:1204] (2/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_21-174514-0_sp1.1 from training. Duration: 0.3909375 2023-10-03 14:54:32,749 WARNING [train.py:1204] (2/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_409-207378-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 14:54:37,305 WARNING [train.py:1204] (2/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_132-269633-0 from training. Duration: 0.68 2023-10-03 14:54:38,601 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_2245_210_2715_1_1525511548_3540254_310-52483-0_sp0.9 from training. Duration: 0.97775 2023-10-03 14:54:40,630 WARNING [train.py:1204] (2/4) Exclude cut with ID _27430_210_7770_1_1532604689375_1378590_47-77403-0_sp1.1 from training. Duration: 0.92725 2023-10-03 14:54:42,893 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.2.encoder.layers.1.nonlin_attention.whiten2, num_groups=1, num_channels=384, metric=6.05 vs. limit=15.0 2023-10-03 14:54:43,573 WARNING [train.py:1204] (2/4) Exclude cut with ID _7562_210_2084_1_1528887767916_3946229_186-326422-0 from training. Duration: 0.84 2023-10-03 14:54:45,488 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_14056_210_4163_1_1530604483_7572827_230-35993-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 14:54:49,638 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_23854_210_1794_1_1531969696_1329157_57-123491-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 14:54:50,922 WARNING [train.py:1204] (2/4) Exclude cut with ID _14747_210_8341_1_1530685797463_3654089_147-131229-0 from training. Duration: 0.76 2023-10-03 14:54:52,263 WARNING [train.py:1204] (2/4) Exclude cut with ID _30191_210_1664_1_1533189915047_7196670_430-19539-0_sp1.1 from training. Duration: 0.92725 2023-10-03 14:54:53,505 WARNING [train.py:1204] (2/4) Exclude cut with ID _67725_210_27767_1_1535608756671_5105359_44-50762-0_sp1.1 from training. Duration: 0.963625 2023-10-03 14:54:56,443 WARNING [train.py:1204] (2/4) Exclude cut with ID _27485_210_4916_1_1532739191677_3668269_217-161755-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 14:54:57,803 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_5931_210_2040_1_1527428481_4547999_321-193614-0_sp1.1 from training. Duration: 0.6090625 2023-10-03 14:54:57,836 WARNING [train.py:1204] (2/4) Exclude cut with ID _6267_210_2084_1_1527764658526_3894730_375-219887-0 from training. Duration: 0.67 2023-10-03 14:54:59,228 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9087W0022-538393 from training. Duration: 0.768 2023-10-03 14:55:00,727 WARNING [train.py:1204] (2/4) Exclude cut with ID _68777_210_4353_1_1535810390731_3117904_83-196459-0_sp1.1 from training. Duration: 0.963625 2023-10-03 14:55:02,125 WARNING [train.py:1204] (2/4) Exclude cut with ID _69884_210_9771_1_1535846392818_4166169_455-343812-0_sp1.1 from training. Duration: 0.92725 2023-10-03 14:55:02,125 WARNING [train.py:1204] (2/4) Exclude cut with ID _13916_210_9994_1_1530788341126_3763149_414-320174-0_sp1.1 from training. Duration: 0.92725 2023-10-03 14:55:02,129 WARNING [train.py:1204] (2/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_398-239819-0 from training. Duration: 0.99 2023-10-03 14:55:04,104 WARNING [train.py:1204] (2/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_199-232314-0_sp1.1 from training. Duration: 0.92725 2023-10-03 14:55:07,560 WARNING [train.py:1204] (2/4) Exclude cut with ID _2334_210_2667_1_1525608238880_4705129_459-104997-0 from training. Duration: 0.95 2023-10-03 14:55:08,466 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.0.layers.1.conv_module1.whiten, num_groups=1, num_channels=192, metric=8.18 vs. limit=15.0 2023-10-03 14:55:10,136 WARNING [train.py:1204] (2/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_441-209395-0_sp1.1 from training. Duration: 0.936375 2023-10-03 14:55:11,522 WARNING [train.py:1204] (2/4) Exclude cut with ID _27052_210_5986_1_1532329197187_3585009_320-80393-0_sp1.1 from training. Duration: 0.97275 2023-10-03 14:55:14,881 WARNING [train.py:1204] (2/4) Exclude cut with ID _21783_210_11357_1_1531742458730_3762679_89-217253-0_sp1.1 from training. Duration: 0.97275 2023-10-03 14:55:16,105 INFO [train.py:1046] (2/4) Epoch 37, batch 4750, loss[loss=0.157, simple_loss=0.2395, pruned_loss=0.0372, over 24465.00 frames. ], tot_loss[loss=0.1593, simple_loss=0.2396, pruned_loss=0.03952, over 4741534.48 frames. ], batch size: 63, lr: 2.74e-03, grad_scale: 8.0 2023-10-03 14:55:16,150 WARNING [train.py:1204] (2/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_453-282588-0_sp1.1 from training. Duration: 0.8909375 2023-10-03 14:55:17,966 WARNING [train.py:1204] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_88-341718-0 from training. Duration: 0.66 2023-10-03 14:55:18,002 WARNING [train.py:1204] (2/4) Exclude cut with ID _43552_210_8913_1_1533726031059_3523429_230-3521-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 14:55:23,233 WARNING [train.py:1204] (2/4) Exclude cut with ID _9679_210_7497_1_1529129212906_4363929_512-313889-0 from training. Duration: 0.81 2023-10-03 14:55:24,608 WARNING [train.py:1204] (2/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_194-252800-0_sp1.1 from training. Duration: 0.8818125 2023-10-03 14:55:24,638 WARNING [train.py:1204] (2/4) Exclude cut with ID _55209_210_1531_1_1534675828996_4597160_239-293541-0_sp1.1 from training. Duration: 0.963625 2023-10-03 14:55:25,980 WARNING [train.py:1204] (2/4) Exclude cut with ID _35799_210_5854_1_1533263487887_3529071_15-325131-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 14:55:28,992 WARNING [train.py:1204] (2/4) Exclude cut with ID _14100_210_8341_1_1530529320613_3678069_279-55760-0 from training. Duration: 0.93 2023-10-03 14:55:29,236 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.attention_skip_rate, batch_count=1306646.6666666667, ans=0.0 2023-10-03 14:55:38,914 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_489-230703-0_sp1.1 from training. Duration: 0.77275 2023-10-03 14:55:39,050 WARNING [train.py:1204] (2/4) Exclude cut with ID _6694_210_4599_1_1527852637490_3965529_144-185051-0 from training. Duration: 0.96 2023-10-03 14:55:40,431 WARNING [train.py:1204] (2/4) Exclude cut with ID _28461_210_2253_1_1532771775189_3041299_1-228155-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 14:55:43,192 WARNING [train.py:1204] (2/4) Exclude cut with ID _32704_210_8954_1_1533017488840_6747370_686-252323-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 14:55:43,195 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_40896_210_15710_1_1533433956_7100802_1075-294633-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 14:55:43,214 WARNING [train.py:1204] (2/4) Exclude cut with ID _22083_210_8254_1_1531702862439_3678970_30-92879-0_sp1.1 from training. Duration: 0.97275 2023-10-03 14:55:43,422 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.ff2_skip_rate, batch_count=1306646.6666666667, ans=0.0 2023-10-03 14:55:44,591 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9245W0121-616888 from training. Duration: 0.896 2023-10-03 14:55:44,594 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6786_210_1388_1_1527839699_4194495_378-46070-0 from training. Duration: 0.72 2023-10-03 14:55:49,201 INFO [scaling.py:1118] (2/4) WithLoss: name=encoder.encoders.1.encoder.layers.0.self_attn_weights, loss-sum=0.000e+00 2023-10-03 14:55:50,368 WARNING [train.py:1204] (2/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_317-175062-0 from training. Duration: 0.82 2023-10-03 14:55:53,593 WARNING [train.py:1204] (2/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_238-89390-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 14:55:54,988 WARNING [train.py:1204] (2/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_524-310811-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 14:55:57,591 WARNING [train.py:1204] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_307-158462-0_sp0.9 from training. Duration: 0.7666875 2023-10-03 14:55:57,592 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9309W0191-640318 from training. Duration: 0.512 2023-10-03 14:55:57,596 WARNING [train.py:1204] (2/4) Exclude cut with ID _55153_210_2253_1_1534384786677_3162096_100-204617-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 14:55:59,039 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.551e+02 1.889e+02 2.021e+02 2.290e+02 3.051e+02, threshold=4.042e+02, percent-clipped=0.0 2023-10-03 14:56:00,520 WARNING [train.py:1204] (2/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_112-149213-0_sp1.1 from training. Duration: 0.863625 2023-10-03 14:56:01,973 WARNING [train.py:1204] (2/4) Exclude cut with ID _67831_210_12392_1_1535612464202_3092612_346-2148-0_sp1.1 from training. Duration: 0.8454375 2023-10-03 14:56:05,333 WARNING [train.py:1204] (2/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_51-117576-0_sp0.9 from training. Duration: 0.4444375 2023-10-03 14:56:05,363 WARNING [train.py:1204] (2/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_300-27235-0 from training. Duration: 0.87 2023-10-03 14:56:06,654 WARNING [train.py:1204] (2/4) Exclude cut with ID _40399_210_4353_1_1533305658194_3279406_195-116825-0_sp1.1 from training. Duration: 0.97275 2023-10-03 14:56:06,678 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7002_210_1794_1_1527939683_5157286_163-87552-0_sp0.9 from training. Duration: 0.9555625 2023-10-03 14:56:06,707 WARNING [train.py:1204] (2/4) Exclude cut with ID _19514_210_9259_1_1531530071330_5084230_560-237597-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 14:56:07,998 WARNING [train.py:1204] (2/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_41-234353-0_sp1.1 from training. Duration: 0.6181875 2023-10-03 14:56:08,023 WARNING [train.py:1204] (2/4) Exclude cut with ID _6274_210_2084_1_1527937583837_7494020_769-168302-0 from training. Duration: 0.71 2023-10-03 14:56:11,323 WARNING [train.py:1204] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_574-45541-0 from training. Duration: 0.65 2023-10-03 14:56:12,737 WARNING [train.py:1204] (2/4) Exclude cut with ID _50310_210_15488_1_1534068043345_3641080_262-67120-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 14:56:14,173 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_53071_210_16554_1_1534493434_1276796_35-11927-0_sp1.1 from training. Duration: 0.963625 2023-10-03 14:56:14,175 WARNING [train.py:1204] (2/4) Exclude cut with ID _3992_210_1747_1_1526644816031_3731060_107-251434-0 from training. Duration: 0.99 2023-10-03 14:56:15,524 WARNING [train.py:1204] (2/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_412-54488-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 14:56:16,912 WARNING [train.py:1204] (2/4) Exclude cut with ID _18516_210_9206_1_1531704653206_3692406_77-258490-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 14:56:19,905 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_40047_210_2290_1_1533290519_2374311_195-165693-0_sp0.9 from training. Duration: 0.988875 2023-10-03 14:56:21,309 WARNING [train.py:1204] (2/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_296-36971-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 14:56:21,372 WARNING [train.py:1204] (2/4) Exclude cut with ID _14970_210_15709_1_1530773543493_1715730_187-20259-0_sp1.1 from training. Duration: 0.8090625 2023-10-03 14:56:25,392 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_53373_210_5399_1_1534229471_4715695_135-305927-0_sp1.1 from training. Duration: 0.936375 2023-10-03 14:56:25,425 WARNING [train.py:1204] (2/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_60-18123-0 from training. Duration: 0.88 2023-10-03 14:56:26,896 WARNING [train.py:1204] (2/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_166-166990-0 from training. Duration: 0.96 2023-10-03 14:56:27,607 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_5438_210_1794_1_1527336687_2600009_233-58072-0 from training. Duration: 0.77 2023-10-03 14:56:30,203 WARNING [train.py:1204] (2/4) Exclude cut with ID _8833_210_5219_1_1528592665210_7553059_611-80313-0_sp0.9 from training. Duration: 0.711125 2023-10-03 14:56:30,225 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_53555_210_5632_1_1534595220_7232297_118-43239-0_sp1.1 from training. Duration: 0.936375 2023-10-03 14:56:31,511 INFO [train.py:1046] (2/4) Epoch 37, batch 4800, loss[loss=0.1462, simple_loss=0.2287, pruned_loss=0.03185, over 23367.00 frames. ], tot_loss[loss=0.1589, simple_loss=0.2396, pruned_loss=0.03914, over 4756132.27 frames. ], batch size: 105, lr: 2.74e-03, grad_scale: 8.0 2023-10-03 14:56:31,597 WARNING [train.py:1204] (2/4) Exclude cut with ID _12369_210_4381_1_1530356078581_4388520_480-13146-0 from training. Duration: 0.85 2023-10-03 14:56:36,238 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_892-28814-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 14:56:36,300 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_24899_210_16554_1_1532046422_3827462_160-21255-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 14:56:42,173 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_154-316450-0_sp1.1 from training. Duration: 0.6909375 2023-10-03 14:56:43,518 WARNING [train.py:1204] (2/4) Exclude cut with ID _30196_210_1664_1_1533708496522_7074140_1225-301232-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 14:56:43,549 WARNING [train.py:1204] (2/4) Exclude cut with ID _28783_210_7943_1_1532671164808_7155479_414-208100-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 14:56:44,856 WARNING [train.py:1204] (2/4) Exclude cut with ID _19023_210_4353_1_1531378892552_5820780_83-254794-0 from training. Duration: 0.86 2023-10-03 14:56:46,225 WARNING [train.py:1204] (2/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_591-83752-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 14:56:46,271 WARNING [train.py:1204] (2/4) Exclude cut with ID _29877_210_6228_1_1532397791106_5276149_266-62291-0_sp1.1 from training. Duration: 0.7909375 2023-10-03 14:56:47,671 WARNING [train.py:1204] (2/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_280-161633-0_sp1.1 from training. Duration: 0.663625 2023-10-03 14:56:49,283 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.conv_skip_rate, batch_count=1306980.0, ans=0.0 2023-10-03 14:56:50,539 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7543_210_6753_1_1528539468_333764_39-156634-0_sp1.1 from training. Duration: 0.97275 2023-10-03 14:56:51,958 WARNING [train.py:1204] (2/4) Exclude cut with ID _67499_210_17282_1_1535509805220_3692430_505-312248-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 14:56:51,981 WARNING [train.py:1204] (2/4) Exclude cut with ID _5294_210_1747_1_1527249643928_3696179_180-100713-0_sp0.9 from training. Duration: 0.9 2023-10-03 14:56:53,713 WARNING [train.py:1204] (2/4) Exclude cut with ID _26528_210_9070_1_1532570410842_3851310_508-148132-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 14:56:53,723 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_211-339622-0_sp1.1 from training. Duration: 0.4090625 2023-10-03 14:56:53,736 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_40896_210_15710_1_1533433956_7100802_339-138356-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 14:56:55,101 WARNING [train.py:1204] (2/4) Exclude cut with ID _27060_210_5986_1_1532674838922_3522549_211-19933-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 14:56:57,215 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.conv_skip_rate, batch_count=1306980.0, ans=0.0 2023-10-03 14:56:58,297 WARNING [train.py:1204] (2/4) Exclude cut with ID _60229_210_27767_1_1534838406158_3655350_457-117923-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 14:57:00,918 WARNING [train.py:1204] (2/4) Exclude cut with ID _55429_210_10626_1_1534467588527_7174143_460-141439-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 14:57:01,191 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.bypass.scale_min, batch_count=1307046.6666666667, ans=0.2 2023-10-03 14:57:02,370 WARNING [train.py:1204] (2/4) Exclude cut with ID _59170_210_6721_1_1534983633960_4203335_263-10780-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 14:57:02,390 WARNING [train.py:1204] (2/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_872-264525-0_sp0.9 from training. Duration: 0.72225 2023-10-03 14:57:02,466 WARNING [train.py:1204] (2/4) Exclude cut with ID _7624_210_1531_1_1528109802198_4933101_418-349834-0_sp0.9 from training. Duration: 0.6666875 2023-10-03 14:57:05,154 WARNING [train.py:1204] (2/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_27-53998-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 14:57:08,304 WARNING [train.py:1204] (2/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_202-183922-0 from training. Duration: 0.85 2023-10-03 14:57:08,317 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7002_210_1794_1_1527939683_5157286_1-281264-0 from training. Duration: 0.44 2023-10-03 14:57:08,482 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.0.balancer2.prob, batch_count=1307046.6666666667, ans=0.125 2023-10-03 14:57:10,185 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_53753_210_7234_1_1534320003_3594148_493-9314-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 14:57:10,204 WARNING [train.py:1204] (2/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_217-95056-0_sp1.1 from training. Duration: 0.8909375 2023-10-03 14:57:11,502 WARNING [train.py:1204] (2/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_406-45086-0_sp1.1 from training. Duration: 0.72725 2023-10-03 14:57:11,509 WARNING [train.py:1204] (2/4) Exclude cut with ID _31649_210_6228_1_1532584608063_4091078_505-213821-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 14:57:11,524 WARNING [train.py:1204] (2/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_317-175062-0_sp0.9 from training. Duration: 0.911125 2023-10-03 14:57:13,089 WARNING [train.py:1204] (2/4) Exclude cut with ID _5444_210_2151_1_1527418808464_5312210_726-27165-0_sp1.1 from training. Duration: 0.6090625 2023-10-03 14:57:13,136 WARNING [train.py:1204] (2/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_494-170437-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 14:57:17,351 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_474-64054-0_sp1.1 from training. Duration: 0.936375 2023-10-03 14:57:18,865 WARNING [train.py:1204] (2/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_521-7556-0_sp1.1 from training. Duration: 0.97275 2023-10-03 14:57:20,274 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_269-348505-0_sp1.1 from training. Duration: 0.92725 2023-10-03 14:57:24,596 WARNING [train.py:1204] (2/4) Exclude cut with ID _21878_210_3525_1_1531738772002_7264330_559-184862-0 from training. Duration: 0.94 2023-10-03 14:57:27,007 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_68702_210_23689_1_1535799577_3420068_92-76040-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 14:57:28,131 WARNING [train.py:1204] (2/4) Exclude cut with ID _22826_210_9994_1_1531908201522_3279169_227-102052-0_sp1.1 from training. Duration: 0.97275 2023-10-03 14:57:28,172 WARNING [train.py:1204] (2/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_718-250907-0_sp0.9 from training. Duration: 0.7666875 2023-10-03 14:57:28,237 WARNING [train.py:1204] (2/4) Exclude cut with ID _14069_210_5632_1_1530512692033_4803390_352-143891-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 14:57:32,568 WARNING [train.py:1204] (2/4) Exclude cut with ID _6267_210_2084_1_1527764658526_3894730_65-106602-0_sp1.1 from training. Duration: 0.936375 2023-10-03 14:57:34,005 WARNING [train.py:1204] (2/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_202-183922-0_sp0.9 from training. Duration: 0.9444375 2023-10-03 14:57:34,012 WARNING [train.py:1204] (2/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_465-268827-0_sp1.1 from training. Duration: 0.97275 2023-10-03 14:57:35,283 WARNING [train.py:1204] (2/4) Exclude cut with ID _12369_210_4381_1_1530356078581_4388520_58-29064-0_sp1.1 from training. Duration: 0.9 2023-10-03 14:57:35,325 WARNING [train.py:1204] (2/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_1-99680-0_sp0.9 from training. Duration: 0.7444375 2023-10-03 14:57:36,680 WARNING [train.py:1204] (2/4) Exclude cut with ID _13253_210_1925_1_1530757775819_3577873_191-144977-0_sp1.1 from training. Duration: 0.7090625 2023-10-03 14:57:41,324 WARNING [train.py:1204] (2/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_276-112684-0_sp1.1 from training. Duration: 0.92725 2023-10-03 14:57:41,333 WARNING [train.py:1204] (2/4) Exclude cut with ID _64639_210_5816_1_1535367619437_3528000_358-411-0_sp1.1 from training. Duration: 0.97275 2023-10-03 14:57:41,339 WARNING [train.py:1204] (2/4) Exclude cut with ID _31649_210_6228_1_1532584608063_4091078_106-21199-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 14:57:42,794 WARNING [train.py:1204] (2/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_901-79438-0 from training. Duration: 0.94 2023-10-03 14:57:45,534 INFO [train.py:1046] (2/4) Epoch 37, batch 4850, loss[loss=0.1654, simple_loss=0.2333, pruned_loss=0.04868, over 23840.00 frames. ], tot_loss[loss=0.1595, simple_loss=0.2395, pruned_loss=0.03971, over 4743342.00 frames. ], batch size: 179, lr: 2.74e-03, grad_scale: 8.0 2023-10-03 14:57:45,672 WARNING [train.py:1204] (2/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_718-250907-0 from training. Duration: 0.69 2023-10-03 14:57:45,678 WARNING [train.py:1204] (2/4) Exclude cut with ID _5287_210_2084_1_1527418883040_3878670_363-194161-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 14:57:45,681 WARNING [train.py:1204] (2/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_586-321321-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 14:57:47,555 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_68900_210_2087_1_1535713157_3708529_164-292661-0_sp1.1 from training. Duration: 0.963625 2023-10-03 14:57:47,556 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_26433_210_12108_1_1532157158_4056480_114-144878-0_sp1.1 from training. Duration: 0.97275 2023-10-03 14:57:50,281 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_15517_210_6753_1_1530954008_3487336_96-341874-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 14:57:57,293 WARNING [train.py:1204] (2/4) Exclude cut with ID _14268_210_9206_1_1530622771285_4714099_637-86630-0 from training. Duration: 0.51 2023-10-03 14:57:59,309 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_28211_210_6388_1_1532861506_4523224_121-320092-0_sp1.1 from training. Duration: 0.92725 2023-10-03 14:58:01,962 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_141-89113-0_sp0.9 from training. Duration: 0.9555625 2023-10-03 14:58:02,271 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.ff2_skip_rate, batch_count=1307313.3333333333, ans=0.0 2023-10-03 14:58:03,309 WARNING [train.py:1204] (2/4) Exclude cut with ID _6789_210_1534_1_1527857937218_3118950_311-61262-0_sp1.1 from training. Duration: 0.5909375 2023-10-03 14:58:03,337 WARNING [train.py:1204] (2/4) Exclude cut with ID _58480_210_12392_1_1534811121111_3646160_389-200461-0_sp1.1 from training. Duration: 0.97275 2023-10-03 14:58:07,544 WARNING [train.py:1204] (2/4) Exclude cut with ID _41326_210_5854_1_1533778076509_3492080_28-156553-0_sp1.1 from training. Duration: 0.92725 2023-10-03 14:58:07,621 WARNING [train.py:1204] (2/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_577-287150-0_sp1.1 from training. Duration: 0.7454375 2023-10-03 14:58:10,701 WARNING [train.py:1204] (2/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_40-129756-0_sp1.1 from training. Duration: 0.736375 2023-10-03 14:58:10,711 WARNING [train.py:1204] (2/4) Exclude cut with ID _60113_210_16271_1_1535164212481_4386079_490-280290-0 from training. Duration: 0.98 2023-10-03 14:58:13,808 WARNING [train.py:1204] (2/4) Exclude cut with ID _58059_210_5854_1_1534901471149_3164808_34-39905-0_sp1.1 from training. Duration: 0.963625 2023-10-03 14:58:15,233 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_791-197891-0_sp1.1 from training. Duration: 0.763625 2023-10-03 14:58:15,253 WARNING [train.py:1204] (2/4) Exclude cut with ID _6789_210_1534_1_1527857937218_3118950_262-48853-0_sp1.1 from training. Duration: 0.5090625 2023-10-03 14:58:17,029 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_2655_210_2087_1_1525688959_3897400_52-279899-0_sp0.9 from training. Duration: 0.7666875 2023-10-03 14:58:17,040 WARNING [train.py:1204] (2/4) Exclude cut with ID _14884_210_2252_1_1530707459689_5561250_114-201399-0 from training. Duration: 0.76 2023-10-03 14:58:19,848 WARNING [train.py:1204] (2/4) Exclude cut with ID _19023_210_4353_1_1531378892552_5820780_83-254794-0_sp0.9 from training. Duration: 0.9555625 2023-10-03 14:58:19,864 WARNING [train.py:1204] (2/4) Exclude cut with ID _53489_210_6228_1_1534298357169_3835970_10-297919-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 14:58:21,445 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.attention_skip_rate, batch_count=1307380.0, ans=0.0 2023-10-03 14:58:25,331 WARNING [train.py:1204] (2/4) Exclude cut with ID _44585_210_18628_1_1533605350387_4518199_418-3055-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 14:58:25,345 WARNING [train.py:1204] (2/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_310-109461-0 from training. Duration: 0.8 2023-10-03 14:58:25,392 WARNING [train.py:1204] (2/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_235-262124-0 from training. Duration: 0.67 2023-10-03 14:58:26,624 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.571e+02 1.885e+02 2.047e+02 2.380e+02 3.001e+02, threshold=4.095e+02, percent-clipped=0.0 2023-10-03 14:58:26,711 WARNING [train.py:1204] (2/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_195-167011-0_sp1.1 from training. Duration: 0.6818125 2023-10-03 14:58:34,842 WARNING [train.py:1204] (2/4) Exclude cut with ID _7852_210_5219_1_1528286471316_8139400_188-55162-0_sp1.1 from training. Duration: 0.936375 2023-10-03 14:58:34,889 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_35361_210_4238_1_1533262810_7793476_675-87012-0 from training. Duration: 0.89 2023-10-03 14:58:34,960 WARNING [train.py:1204] (2/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_465-81427-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 14:58:34,966 WARNING [train.py:1204] (2/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_597-154640-0_sp1.1 from training. Duration: 0.8181875 2023-10-03 14:58:37,620 WARNING [train.py:1204] (2/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_186-309161-0_sp0.9 from training. Duration: 0.6 2023-10-03 14:58:39,043 WARNING [train.py:1204] (2/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_546-119312-0 from training. Duration: 0.99 2023-10-03 14:58:39,047 WARNING [train.py:1204] (2/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_330-240060-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 14:58:39,129 WARNING [train.py:1204] (2/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_477-255782-0 from training. Duration: 0.82 2023-10-03 14:58:39,146 WARNING [train.py:1204] (2/4) Exclude cut with ID _14970_210_15709_1_1530773543493_1715730_150-248939-0_sp1.1 from training. Duration: 0.97275 2023-10-03 14:58:40,973 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_556-206615-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 14:58:41,043 WARNING [train.py:1204] (2/4) Exclude cut with ID _3465_210_3525_1_1526175667060_3574049_308-231735-0 from training. Duration: 0.73 2023-10-03 14:58:44,040 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.conv_module2.balancer2.prob, batch_count=1307513.3333333333, ans=0.125 2023-10-03 14:58:49,896 WARNING [train.py:1204] (2/4) Exclude cut with ID _27485_210_4916_1_1532739191677_3668269_66-236571-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 14:58:55,371 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_2221_210_1988_1_1525597051_4935541_415-296882-0_sp0.9 from training. Duration: 0.8555625 2023-10-03 14:58:55,378 WARNING [train.py:1204] (2/4) Exclude cut with ID _64521_210_14699_1_1535243427259_3815328_231-78630-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 14:58:59,480 INFO [train.py:1046] (2/4) Epoch 37, batch 4900, loss[loss=0.1496, simple_loss=0.2075, pruned_loss=0.04583, over 22626.00 frames. ], tot_loss[loss=0.1586, simple_loss=0.2385, pruned_loss=0.03936, over 4738160.52 frames. ], batch size: 322, lr: 2.74e-03, grad_scale: 8.0 2023-10-03 14:59:01,400 WARNING [train.py:1204] (2/4) Exclude cut with ID _13942_210_9988_1_1530604906036_3733360_499-55676-0 from training. Duration: 0.62 2023-10-03 14:59:01,403 WARNING [train.py:1204] (2/4) Exclude cut with ID _49376_210_5986_1_1533985278732_3370740_301-154763-0_sp1.1 from training. Duration: 0.8909375 2023-10-03 14:59:06,029 WARNING [train.py:1204] (2/4) Exclude cut with ID _27485_210_4916_1_1532739191677_3668269_255-216698-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 14:59:07,372 WARNING [train.py:1204] (2/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_137-334143-0_sp1.1 from training. Duration: 0.97275 2023-10-03 14:59:08,719 WARNING [train.py:1204] (2/4) Exclude cut with ID _6456_210_1534_1_1527681634212_3181049_215-309359-0_sp1.1 from training. Duration: 0.8 2023-10-03 14:59:12,006 WARNING [train.py:1204] (2/4) Exclude cut with ID _65640_210_6310_1_1535368201445_3173250_209-13008-0 from training. Duration: 0.99 2023-10-03 14:59:16,226 WARNING [train.py:1204] (2/4) Exclude cut with ID _12369_210_4381_1_1530356078581_4388520_58-29064-0 from training. Duration: 0.99 2023-10-03 14:59:19,693 WARNING [train.py:1204] (2/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_410-287857-0 from training. Duration: 0.93 2023-10-03 14:59:21,081 WARNING [train.py:1204] (2/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_153-153537-0 from training. Duration: 0.91 2023-10-03 14:59:21,124 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7544_210_6753_1_1528587722_3576686_172-171750-0_sp0.9 from training. Duration: 0.72225 2023-10-03 14:59:22,370 WARNING [train.py:1204] (2/4) Exclude cut with ID _64521_210_14699_1_1535243427259_3815328_464-183853-0_sp1.1 from training. Duration: 0.97275 2023-10-03 14:59:22,384 WARNING [train.py:1204] (2/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_385-210308-0_sp1.1 from training. Duration: 0.963625 2023-10-03 14:59:22,392 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_46013_210_7234_1_1533776362_3744032_354-261865-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 14:59:22,398 WARNING [train.py:1204] (2/4) Exclude cut with ID _3067_210_2667_1_1526212974640_4503168_250-285937-0_sp1.1 from training. Duration: 0.72725 2023-10-03 14:59:23,732 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_40036_210_13405_1_1533648647_1315388_48-146016-0 from training. Duration: 0.93 2023-10-03 14:59:25,161 WARNING [train.py:1204] (2/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_280-161633-0 from training. Duration: 0.73 2023-10-03 14:59:26,538 WARNING [train.py:1204] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_308-161067-0_sp0.9 from training. Duration: 0.7333125 2023-10-03 14:59:29,281 WARNING [train.py:1204] (2/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_68-212801-0_sp0.9 from training. Duration: 0.811125 2023-10-03 14:59:29,340 WARNING [train.py:1204] (2/4) Exclude cut with ID _6789_210_1534_1_1527857937218_3118950_311-61262-0_sp0.9 from training. Duration: 0.72225 2023-10-03 14:59:31,215 WARNING [train.py:1204] (2/4) Exclude cut with ID _14632_210_9988_1_1530691279392_3923200_102-204477-0_sp1.1 from training. Duration: 0.8818125 2023-10-03 14:59:32,487 WARNING [train.py:1204] (2/4) Exclude cut with ID _65242_210_18628_1_1535369393717_3823149_290-117517-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 14:59:32,590 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_69630_210_7234_1_1535788670_910989_35-337140-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 14:59:32,600 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_5618_210_1874_1_1527311008_5851_1-214501-0 from training. Duration: 0.689 2023-10-03 14:59:34,069 WARNING [train.py:1204] (2/4) Exclude cut with ID _8064_210_5399_1_1528372299291_4282973_168-50337-0_sp0.9 from training. Duration: 0.9333125 2023-10-03 14:59:35,880 WARNING [train.py:1204] (2/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_185-14438-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 14:59:35,895 WARNING [train.py:1204] (2/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_61-327062-0 from training. Duration: 0.81 2023-10-03 14:59:35,901 WARNING [train.py:1204] (2/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_98-741-0 from training. Duration: 0.8 2023-10-03 14:59:40,295 WARNING [train.py:1204] (2/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_312-204798-0 from training. Duration: 0.93 2023-10-03 14:59:41,627 WARNING [train.py:1204] (2/4) Exclude cut with ID _14998_210_5219_1_1530705582080_7474970_787-294416-0_sp0.9 from training. Duration: 0.911125 2023-10-03 14:59:44,318 WARNING [train.py:1204] (2/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_140-181176-0_sp1.1 from training. Duration: 0.836375 2023-10-03 14:59:44,354 WARNING [train.py:1204] (2/4) Exclude cut with ID _14687_210_5854_1_1530703838259_3664889_115-239046-0_sp1.1 from training. Duration: 0.6090625 2023-10-03 14:59:44,407 WARNING [train.py:1204] (2/4) Exclude cut with ID _56423_210_5854_1_1534896061049_3165969_370-230614-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 14:59:45,705 WARNING [train.py:1204] (2/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_406-275649-0_sp0.9 from training. Duration: 0.4666875 2023-10-03 14:59:45,717 WARNING [train.py:1204] (2/4) Exclude cut with ID _15150_210_8341_1_1530752411803_7256384_22-138274-0_sp1.1 from training. Duration: 0.87275 2023-10-03 14:59:45,771 WARNING [train.py:1204] (2/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_125-125818-0 from training. Duration: 0.96 2023-10-03 14:59:49,126 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_4399_210_1874_1_1526705485_2400757_220-47974-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 14:59:51,705 WARNING [train.py:1204] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_308-161067-0_sp1.1 from training. Duration: 0.6 2023-10-03 14:59:53,089 WARNING [train.py:1204] (2/4) Exclude cut with ID _53489_210_6228_1_1534298357169_3835970_102-57645-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 14:59:54,556 WARNING [train.py:1204] (2/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_178-177582-0 from training. Duration: 0.74 2023-10-03 14:59:55,943 WARNING [train.py:1204] (2/4) Exclude cut with ID _14349_210_8341_1_1530615710818_3442470_170-336541-0_sp1.1 from training. Duration: 0.7818125 2023-10-03 14:59:56,020 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_521-214276-0_sp0.9 from training. Duration: 0.488875 2023-10-03 14:59:56,065 WARNING [train.py:1204] (2/4) Exclude cut with ID _66070_210_7640_1_1535441393096_7805310_489-292631-0 from training. Duration: 0.79 2023-10-03 15:00:02,245 WARNING [train.py:1204] (2/4) Exclude cut with ID _14069_210_5632_1_1530512692033_4803390_438-340023-0_sp1.1 from training. Duration: 0.936375 2023-10-03 15:00:03,604 WARNING [train.py:1204] (2/4) Exclude cut with ID _7852_210_5219_1_1528286471316_8139400_242-105336-0_sp1.1 from training. Duration: 0.6545625 2023-10-03 15:00:04,987 WARNING [train.py:1204] (2/4) Exclude cut with ID _3867_210_1883_1_1526641200575_4475830_272-215563-0 from training. Duration: 0.57 2023-10-03 15:00:04,995 WARNING [train.py:1204] (2/4) Exclude cut with ID _14368_210_4220_1_1530532714870_3595529_143-117421-0_sp0.9 from training. Duration: 0.6444375 2023-10-03 15:00:05,007 WARNING [train.py:1204] (2/4) Exclude cut with ID _9235_210_5219_1_1528891154253_8046120_30-326681-0_sp1.1 from training. Duration: 0.7545625 2023-10-03 15:00:06,901 WARNING [train.py:1204] (2/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_538-218448-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 15:00:11,434 WARNING [train.py:1204] (2/4) Exclude cut with ID _33640_210_2655_1_1533117605479_4855469_60-192970-0_sp1.1 from training. Duration: 0.963625 2023-10-03 15:00:11,448 WARNING [train.py:1204] (2/4) Exclude cut with ID _65585_210_12210_1_1535450451584_3764098_112-294301-0_sp1.1 from training. Duration: 0.8 2023-10-03 15:00:11,476 WARNING [train.py:1204] (2/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_643-37412-0_sp1.1 from training. Duration: 0.936375 2023-10-03 15:00:12,781 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_9302_210_2550_1_1528889962_4655416_638-301620-0_sp1.1 from training. Duration: 0.42725 2023-10-03 15:00:14,125 INFO [train.py:1046] (2/4) Epoch 37, batch 4950, loss[loss=0.1645, simple_loss=0.2481, pruned_loss=0.04049, over 24420.00 frames. ], tot_loss[loss=0.1578, simple_loss=0.2366, pruned_loss=0.03947, over 4707948.79 frames. ], batch size: 77, lr: 2.74e-03, grad_scale: 8.0 2023-10-03 15:00:14,186 WARNING [train.py:1204] (2/4) Exclude cut with ID _3867_210_1883_1_1526641200575_4475830_272-215563-0_sp1.1 from training. Duration: 0.5181875 2023-10-03 15:00:16,934 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_53679_210_4852_1_1534556726_4186141_358-154143-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 15:00:16,946 WARNING [train.py:1204] (2/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_841-341679-0_sp0.9 from training. Duration: 0.6444375 2023-10-03 15:00:17,106 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.1.nonlin_attention.balancer.prob, batch_count=1307913.3333333333, ans=0.125 2023-10-03 15:00:19,123 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.feed_forward3.hidden_balancer.prob, batch_count=1307913.3333333333, ans=0.125 2023-10-03 15:00:20,275 WARNING [train.py:1204] (2/4) Exclude cut with ID _7160_210_1876_1_1527940923231_3703799_302-150728-0 from training. Duration: 0.78 2023-10-03 15:00:20,310 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_125-203105-0 from training. Duration: 0.8 2023-10-03 15:00:21,601 WARNING [train.py:1204] (2/4) Exclude cut with ID _5585_210_2667_1_1527418813324_8007340_89-332072-0_sp0.9 from training. Duration: 0.711125 2023-10-03 15:00:21,672 WARNING [train.py:1204] (2/4) Exclude cut with ID _3992_210_1747_1_1526644816031_3731060_36-147855-0 from training. Duration: 0.78 2023-10-03 15:00:21,689 WARNING [train.py:1204] (2/4) Exclude cut with ID _59256_210_5986_1_1535076085997_3589120_172-239045-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 15:00:21,702 WARNING [train.py:1204] (2/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_472-216663-0_sp1.1 from training. Duration: 0.836375 2023-10-03 15:00:23,022 WARNING [train.py:1204] (2/4) Exclude cut with ID _36512_210_6987_1_1533033143998_3126069_181-306500-0_sp1.1 from training. Duration: 0.863625 2023-10-03 15:00:23,045 WARNING [train.py:1204] (2/4) Exclude cut with ID _56524_210_9642_1_1534572011779_3619170_492-95008-0_sp1.1 from training. Duration: 0.97275 2023-10-03 15:00:26,176 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_33348_210_5632_1_1533086912_3324030_119-90326-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 15:00:27,475 WARNING [train.py:1204] (2/4) Exclude cut with ID _14368_210_4220_1_1530532714870_3595529_231-137892-0_sp1.1 from training. Duration: 0.9 2023-10-03 15:00:27,607 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.1.conv_skip_rate, batch_count=1307980.0, ans=0.0 2023-10-03 15:00:28,823 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8453_210_4211_1_1528502940_8562730_596-306820-0_sp1.1 from training. Duration: 0.8818125 2023-10-03 15:00:30,310 WARNING [train.py:1204] (2/4) Exclude cut with ID _27052_210_5986_1_1532329197187_3585009_50-242750-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 15:00:31,686 WARNING [train.py:1204] (2/4) Exclude cut with ID _13913_210_9994_1_1530579615187_3383897_358-98435-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 15:00:31,885 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.bypass_mid.scale_min, batch_count=1307980.0, ans=0.2 2023-10-03 15:00:33,299 WARNING [train.py:1204] (2/4) Exclude cut with ID _27013_210_1389_1_1532437173240_3252649_399-229778-0_sp1.1 from training. Duration: 0.963625 2023-10-03 15:00:37,327 WARNING [train.py:1204] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_185-174805-0_sp0.9 from training. Duration: 0.7555625 2023-10-03 15:00:40,679 WARNING [train.py:1204] (2/4) Exclude cut with ID _65585_210_12210_1_1535450451584_3764098_196-221014-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 15:00:40,786 WARNING [train.py:1204] (2/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_202-293523-0_sp1.1 from training. Duration: 0.6545625 2023-10-03 15:00:44,172 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8275_210_4198_1_1528455320_3892571_213-127634-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 15:00:44,215 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_40896_210_15710_1_1533433956_7100802_921-231678-0_sp1.1 from training. Duration: 0.97275 2023-10-03 15:00:44,336 WARNING [train.py:1204] (2/4) Exclude cut with ID _38020_210_5986_1_1533112252259_7006089_493-102745-0_sp1.1 from training. Duration: 0.8909375 2023-10-03 15:00:45,713 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6846_210_6753_1_1527850767_4295158_2-315073-0 from training. Duration: 0.88 2023-10-03 15:00:47,009 WARNING [train.py:1204] (2/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_249-281349-0 from training. Duration: 0.85 2023-10-03 15:00:48,400 WARNING [train.py:1204] (2/4) Exclude cut with ID _18513_210_9206_1_1531618215223_3862620_314-83429-0_sp1.1 from training. Duration: 0.97275 2023-10-03 15:00:50,271 WARNING [train.py:1204] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_215-202973-0_sp0.9 from training. Duration: 0.97775 2023-10-03 15:00:50,281 WARNING [train.py:1204] (2/4) Exclude cut with ID _7719_210_3525_1_1528456153163_4296960_88-250259-0_sp1.1 from training. Duration: 0.763625 2023-10-03 15:00:51,660 WARNING [train.py:1204] (2/4) Exclude cut with ID _7606_210_4915_1_1528894253277_5080690_301-10732-0_sp1.1 from training. Duration: 0.736375 2023-10-03 15:00:51,675 WARNING [train.py:1204] (2/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_818-11907-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 15:00:52,939 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_5959_210_1794_1_1527400400_3445726_412-44052-0_sp1.1 from training. Duration: 0.7 2023-10-03 15:00:54,342 WARNING [train.py:1204] (2/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_6-279195-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 15:00:56,285 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.483e+02 1.953e+02 2.233e+02 2.560e+02 3.668e+02, threshold=4.465e+02, percent-clipped=0.0 2023-10-03 15:00:56,401 WARNING [train.py:1204] (2/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_252-216674-0_sp0.9 from training. Duration: 0.6 2023-10-03 15:00:59,097 WARNING [train.py:1204] (2/4) Exclude cut with ID _14368_210_4220_1_1530532714870_3595529_144-3287-0_sp1.1 from training. Duration: 0.6090625 2023-10-03 15:01:01,791 WARNING [train.py:1204] (2/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_342-3879-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 15:01:01,824 WARNING [train.py:1204] (2/4) Exclude cut with ID _33640_210_2655_1_1533117605479_4855469_584-65630-0_sp1.1 from training. Duration: 0.97275 2023-10-03 15:01:01,908 WARNING [train.py:1204] (2/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_37-189262-0 from training. Duration: 0.98 2023-10-03 15:01:03,200 WARNING [train.py:1204] (2/4) Exclude cut with ID _9389_210_1664_1_1529217337301_5289269_379-128794-0_sp0.9 from training. Duration: 0.9444375 2023-10-03 15:01:03,321 WARNING [train.py:1204] (2/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_119-207162-0_sp1.1 from training. Duration: 0.6454375 2023-10-03 15:01:06,346 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.conv_module1.balancer2.min_abs, batch_count=1308113.3333333333, ans=0.5 2023-10-03 15:01:08,797 WARNING [train.py:1204] (2/4) Exclude cut with ID _2941_210_2577_1_1526199673150_2489759_200-333643-0_sp1.1 from training. Duration: 0.936375 2023-10-03 15:01:09,038 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.out_combiner.scale_min, batch_count=1308113.3333333333, ans=0.2 2023-10-03 15:01:10,541 WARNING [train.py:1204] (2/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_479-125611-0_sp1.1 from training. Duration: 0.87275 2023-10-03 15:01:10,551 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_3475_210_2028_1_1526193578_3622569_64-297072-0_sp1.1 from training. Duration: 0.82725 2023-10-03 15:01:10,590 WARNING [train.py:1204] (2/4) Exclude cut with ID _53565_210_10635_1_1534337218335_3820259_139-346291-0_sp1.1 from training. Duration: 0.97275 2023-10-03 15:01:12,560 WARNING [train.py:1204] (2/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_588-125165-0_sp1.1 from training. Duration: 0.7545625 2023-10-03 15:01:13,826 WARNING [train.py:1204] (2/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_938-205135-0_sp1.1 from training. Duration: 0.7909375 2023-10-03 15:01:15,236 WARNING [train.py:1204] (2/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_216-48707-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 15:01:15,291 WARNING [train.py:1204] (2/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_477-255782-0_sp1.1 from training. Duration: 0.7454375 2023-10-03 15:01:15,336 WARNING [train.py:1204] (2/4) Exclude cut with ID _16909_210_5724_1_1531292079912_4118919_583-92842-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 15:01:16,650 WARNING [train.py:1204] (2/4) Exclude cut with ID _60860_210_8254_1_1534896015875_3557169_245-256666-0 from training. Duration: 0.97 2023-10-03 15:01:21,296 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_45399_210_4852_1_1533624725_7579737_349-116173-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 15:01:25,724 WARNING [train.py:1204] (2/4) Exclude cut with ID _8699_210_2069_1_1528630346831_3960409_155-39766-0 from training. Duration: 0.78 2023-10-03 15:01:27,052 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_86-16881-0_sp1.1 from training. Duration: 0.52725 2023-10-03 15:01:28,802 INFO [train.py:1046] (2/4) Epoch 37, batch 5000, loss[loss=0.1618, simple_loss=0.2283, pruned_loss=0.0476, over 23908.00 frames. ], tot_loss[loss=0.157, simple_loss=0.2358, pruned_loss=0.03911, over 4704898.44 frames. ], batch size: 195, lr: 2.74e-03, grad_scale: 8.0 2023-10-03 15:01:30,484 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=1308246.6666666667, ans=0.1 2023-10-03 15:01:31,832 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.ff2_skip_rate, batch_count=1308246.6666666667, ans=0.0 2023-10-03 15:01:34,510 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_14056_210_4163_1_1530604483_7572827_191-159384-0_sp1.1 from training. Duration: 0.97275 2023-10-03 15:01:34,518 WARNING [train.py:1204] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_307-158462-0_sp1.1 from training. Duration: 0.62725 2023-10-03 15:01:36,014 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7544_210_6753_1_1528591327_1471426_33-346031-0 from training. Duration: 0.61 2023-10-03 15:01:37,255 WARNING [train.py:1204] (2/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_112-149213-0 from training. Duration: 0.95 2023-10-03 15:01:38,729 WARNING [train.py:1204] (2/4) Exclude cut with ID _55844_210_2346_1_1534471212569_7625707_925-191342-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 15:01:41,474 WARNING [train.py:1204] (2/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_87-278686-0 from training. Duration: 0.77 2023-10-03 15:01:41,505 WARNING [train.py:1204] (2/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_415-78792-0_sp1.1 from training. Duration: 0.736375 2023-10-03 15:01:41,517 WARNING [train.py:1204] (2/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_107-332398-0_sp0.9 from training. Duration: 0.8666875 2023-10-03 15:01:42,811 WARNING [train.py:1204] (2/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_72-251255-0 from training. Duration: 0.94 2023-10-03 15:01:42,845 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_431-227273-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 15:01:42,912 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7544_210_6753_1_1528587000_691468_87-178599-0_sp0.9 from training. Duration: 0.9666875 2023-10-03 15:01:44,996 WARNING [train.py:1204] (2/4) Exclude cut with ID _13916_210_9994_1_1530788341126_3763149_60-191556-0 from training. Duration: 0.61 2023-10-03 15:01:44,999 WARNING [train.py:1204] (2/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_746-164782-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 15:01:45,038 WARNING [train.py:1204] (2/4) Exclude cut with ID _33640_210_2655_1_1533117605479_4855469_596-21066-0_sp1.1 from training. Duration: 0.92725 2023-10-03 15:01:46,337 WARNING [train.py:1204] (2/4) Exclude cut with ID _40402_210_18219_1_1533522324115_2953150_320-14078-0 from training. Duration: 0.95 2023-10-03 15:01:47,702 WARNING [train.py:1204] (2/4) Exclude cut with ID _29877_210_6228_1_1532397791106_5276149_266-62291-0 from training. Duration: 0.87 2023-10-03 15:01:48,962 WARNING [train.py:1204] (2/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_176-56120-0_sp0.9 from training. Duration: 0.92225 2023-10-03 15:01:49,015 WARNING [train.py:1204] (2/4) Exclude cut with ID _6275_210_2084_1_1528110062213_7669049_91-26919-0 from training. Duration: 0.97 2023-10-03 15:01:49,022 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_224-322394-0_sp0.9 from training. Duration: 0.7333125 2023-10-03 15:01:49,059 WARNING [train.py:1204] (2/4) Exclude cut with ID _44982_210_1511_1_1534312820108_3726029_586-297059-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 15:01:50,900 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_196-289637-0_sp1.1 from training. Duration: 0.6818125 2023-10-03 15:01:50,901 WARNING [train.py:1204] (2/4) Exclude cut with ID _18513_210_9206_1_1531618215223_3862620_402-5957-0 from training. Duration: 0.99 2023-10-03 15:01:50,907 WARNING [train.py:1204] (2/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_81-300212-0 from training. Duration: 0.6 2023-10-03 15:01:52,222 WARNING [train.py:1204] (2/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_166-128941-0 from training. Duration: 0.54 2023-10-03 15:01:53,547 WARNING [train.py:1204] (2/4) Exclude cut with ID _58059_210_5854_1_1534901471149_3164808_57-309660-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 15:01:53,612 WARNING [train.py:1204] (2/4) Exclude cut with ID _60621_210_17071_1_1535013053530_3671739_73-155814-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 15:01:53,733 WARNING [train.py:1204] (2/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_726-270780-0 from training. Duration: 0.86 2023-10-03 15:01:54,995 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8853_210_3273_1_1528633899_818968_51-187685-0_sp1.1 from training. Duration: 0.62725 2023-10-03 15:01:55,318 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.conv_skip_rate, batch_count=1308313.3333333333, ans=0.0 2023-10-03 15:01:56,489 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_315-212794-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 15:01:56,568 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_53373_210_5399_1_1534229471_4715695_314-122795-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 15:01:57,897 WARNING [train.py:1204] (2/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_187-221566-0_sp0.9 from training. Duration: 0.47775 2023-10-03 15:01:59,902 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_12868_210_6753_1_1530702479_3610564_133-564-0 from training. Duration: 0.79 2023-10-03 15:02:01,056 WARNING [train.py:1204] (2/4) Exclude cut with ID _55153_210_2253_1_1534384786677_3162096_255-228344-0_sp1.1 from training. Duration: 0.936375 2023-10-03 15:02:02,549 WARNING [train.py:1204] (2/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_383-165234-0_sp1.1 from training. Duration: 0.8545625 2023-10-03 15:02:06,808 WARNING [train.py:1204] (2/4) Exclude cut with ID IC0677W0056-334889 from training. Duration: 0.768 2023-10-03 15:02:09,468 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_14953_210_2151_1_1530701710_5616165_710-165400-0_sp0.9 from training. Duration: 0.9666875 2023-10-03 15:02:09,574 WARNING [train.py:1204] (2/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_355-236205-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 15:02:09,575 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_34450_210_9547_1_1533099443_4109050_503-246893-0_sp1.1 from training. Duration: 0.963625 2023-10-03 15:02:13,654 WARNING [train.py:1204] (2/4) Exclude cut with ID _43552_210_8913_1_1533726031059_3523429_25-120161-0 from training. Duration: 0.98 2023-10-03 15:02:13,689 WARNING [train.py:1204] (2/4) Exclude cut with ID _66112_210_10635_1_1535590236305_3801484_205-311930-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 15:02:15,579 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_48049_210_5632_1_1533864553_3519937_185-281218-0_sp1.1 from training. Duration: 0.92725 2023-10-03 15:02:15,608 WARNING [train.py:1204] (2/4) Exclude cut with ID _48782_210_12658_1_1534467701544_2300422_191-332415-0_sp1.1 from training. Duration: 0.97275 2023-10-03 15:02:17,003 WARNING [train.py:1204] (2/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_907-151398-0_sp1.1 from training. Duration: 0.336375 2023-10-03 15:02:17,042 WARNING [train.py:1204] (2/4) Exclude cut with ID _67499_210_17282_1_1535509805220_3692430_20-179346-0_sp1.1 from training. Duration: 0.8818125 2023-10-03 15:02:17,234 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder_embed.convnext.layerdrop_rate, batch_count=1308446.6666666667, ans=0.015 2023-10-03 15:02:19,142 WARNING [train.py:1204] (2/4) Exclude cut with ID _14998_210_5219_1_1530705582080_7474970_441-91466-0_sp1.1 from training. Duration: 0.8818125 2023-10-03 15:02:20,491 WARNING [train.py:1204] (2/4) Exclude cut with ID _46681_210_9642_1_1533893005571_7603359_786-164007-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 15:02:22,210 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.nonlin_attention.balancer.prob, batch_count=1308446.6666666667, ans=0.125 2023-10-03 15:02:25,585 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.0.layers.1.whiten, num_groups=1, num_channels=192, metric=3.62 vs. limit=12.0 2023-10-03 15:02:26,073 WARNING [train.py:1204] (2/4) Exclude cut with ID _14970_210_15709_1_1530773543493_1715730_187-20259-0 from training. Duration: 0.89 2023-10-03 15:02:29,387 WARNING [train.py:1204] (2/4) Exclude cut with ID _12734_210_8341_1_1530183614296_3891929_383-339075-0_sp1.1 from training. Duration: 0.963625 2023-10-03 15:02:29,636 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=1308513.3333333333, ans=0.1 2023-10-03 15:02:37,671 WARNING [train.py:1204] (2/4) Exclude cut with ID _37086_210_9038_1_1532999540891_7021250_834-322225-0_sp1.1 from training. Duration: 0.92725 2023-10-03 15:02:40,252 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_10987_210_4163_1_1529760555_4123902_56-290559-0_sp1.1 from training. Duration: 0.963625 2023-10-03 15:02:40,259 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_92-246270-0_sp0.9 from training. Duration: 0.9444375 2023-10-03 15:02:40,287 WARNING [train.py:1204] (2/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_56-13561-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 15:02:40,327 WARNING [train.py:1204] (2/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_393-57888-0_sp1.1 from training. Duration: 0.5818125 2023-10-03 15:02:41,564 INFO [train.py:1046] (2/4) Epoch 37, batch 5050, loss[loss=0.1766, simple_loss=0.2646, pruned_loss=0.04424, over 23960.00 frames. ], tot_loss[loss=0.1577, simple_loss=0.2367, pruned_loss=0.03935, over 4716753.13 frames. ], batch size: 80, lr: 2.74e-03, grad_scale: 8.0 2023-10-03 15:02:41,624 WARNING [train.py:1204] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_168-32906-0_sp1.1 from training. Duration: 0.62725 2023-10-03 15:02:41,672 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_51225_210_3601_1_1534400634_2327383_127-207553-0_sp1.1 from training. Duration: 0.963625 2023-10-03 15:02:41,869 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.attention_skip_rate, batch_count=1308580.0, ans=0.0 2023-10-03 15:02:41,944 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.feed_forward3.hidden_balancer.prob, batch_count=1308580.0, ans=0.125 2023-10-03 15:02:48,154 WARNING [train.py:1204] (2/4) Exclude cut with ID _58583_210_5236_1_1534726835266_3574249_494-203521-0_sp1.1 from training. Duration: 0.963625 2023-10-03 15:02:48,167 WARNING [train.py:1204] (2/4) Exclude cut with ID _49376_210_5986_1_1533985278732_3370740_354-344695-0 from training. Duration: 0.99 2023-10-03 15:02:49,518 WARNING [train.py:1204] (2/4) Exclude cut with ID _8021_210_7994_1_1528376451358_3653069_179-45206-0_sp1.1 from training. Duration: 0.7818125 2023-10-03 15:02:50,944 WARNING [train.py:1204] (2/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_742-154017-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 15:02:52,312 WARNING [train.py:1204] (2/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_222-198127-0_sp1.1 from training. Duration: 0.763625 2023-10-03 15:02:52,360 WARNING [train.py:1204] (2/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_235-307337-0 from training. Duration: 0.94 2023-10-03 15:02:54,279 WARNING [train.py:1204] (2/4) Exclude cut with ID _8393_210_6702_1_1528505860249_3457328_453-176500-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 15:02:54,323 WARNING [train.py:1204] (2/4) Exclude cut with ID _53565_210_10635_1_1534337218335_3820259_102-179528-0_sp1.1 from training. Duration: 0.97275 2023-10-03 15:02:55,764 WARNING [train.py:1204] (2/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_43-206757-0_sp1.1 from training. Duration: 0.6818125 2023-10-03 15:02:57,145 WARNING [train.py:1204] (2/4) Exclude cut with ID _8394_210_6702_1_1528590504316_3540189_335-274780-0_sp1.1 from training. Duration: 0.7090625 2023-10-03 15:02:58,573 WARNING [train.py:1204] (2/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_128-296581-0_sp1.1 from training. Duration: 0.67275 2023-10-03 15:03:06,095 WARNING [train.py:1204] (2/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_111-112362-0 from training. Duration: 0.91 2023-10-03 15:03:06,146 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_5522_210_3273_1_1527249606_3909096_366-172508-0_sp1.1 from training. Duration: 0.6 2023-10-03 15:03:07,410 WARNING [train.py:1204] (2/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_47-167373-0_sp1.1 from training. Duration: 0.836375 2023-10-03 15:03:08,723 WARNING [train.py:1204] (2/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_41-234353-0 from training. Duration: 0.68 2023-10-03 15:03:08,756 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6291_210_3273_1_1527683649_1078498_82-221196-0_sp0.9 from training. Duration: 0.8444375 2023-10-03 15:03:10,078 WARNING [train.py:1204] (2/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_47-4814-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 15:03:10,113 WARNING [train.py:1204] (2/4) Exclude cut with ID _43541_210_10626_1_1533796233418_3635440_40-247993-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 15:03:11,504 WARNING [train.py:1204] (2/4) Exclude cut with ID _21989_210_5854_1_1531899214931_3271360_133-44759-0_sp0.9 from training. Duration: 0.8555625 2023-10-03 15:03:11,507 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_94-195801-0 from training. Duration: 0.94 2023-10-03 15:03:12,866 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_141-89113-0 from training. Duration: 0.86 2023-10-03 15:03:14,702 WARNING [train.py:1204] (2/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_683-284578-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 15:03:16,091 WARNING [train.py:1204] (2/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_6-214815-0_sp1.1 from training. Duration: 0.77275 2023-10-03 15:03:18,298 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.feed_forward2.hidden_balancer.prob, batch_count=1308713.3333333333, ans=0.125 2023-10-03 15:03:19,504 WARNING [train.py:1204] (2/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_556-212082-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 15:03:19,550 WARNING [train.py:1204] (2/4) Exclude cut with ID _44902_210_16187_1_1533888006062_3792078_149-67212-0 from training. Duration: 0.87 2023-10-03 15:03:20,948 WARNING [train.py:1204] (2/4) Exclude cut with ID _61148_210_6897_1_1535094022726_4217610_333-159440-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 15:03:23,934 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.609e+02 1.822e+02 1.997e+02 2.206e+02 4.245e+02, threshold=3.993e+02, percent-clipped=0.0 2023-10-03 15:03:24,026 WARNING [train.py:1204] (2/4) Exclude cut with ID _3120_210_3525_1_1526036143112_4072980_404-249994-0 from training. Duration: 0.99 2023-10-03 15:03:25,508 WARNING [train.py:1204] (2/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_234-260682-0_sp0.9 from training. Duration: 0.9333125 2023-10-03 15:03:25,534 WARNING [train.py:1204] (2/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_374-138971-0_sp1.1 from training. Duration: 0.8545625 2023-10-03 15:03:26,906 WARNING [train.py:1204] (2/4) Exclude cut with ID _53713_210_24116_1_1534314487762_7416751_743-302085-0_sp1.1 from training. Duration: 0.92725 2023-10-03 15:03:26,970 WARNING [train.py:1204] (2/4) Exclude cut with ID _18516_210_9206_1_1531704653206_3692406_198-334870-0_sp1.1 from training. Duration: 0.836375 2023-10-03 15:03:28,355 WARNING [train.py:1204] (2/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_395-73570-0_sp1.1 from training. Duration: 0.9 2023-10-03 15:03:29,824 WARNING [train.py:1204] (2/4) Exclude cut with ID _46683_210_9642_1_1533730913492_4591116_421-19547-0_sp1.1 from training. Duration: 0.936375 2023-10-03 15:03:31,184 WARNING [train.py:1204] (2/4) Exclude cut with ID _19896_210_1883_1_1531709022662_6649569_714-8668-0_sp1.1 from training. Duration: 0.963625 2023-10-03 15:03:31,225 WARNING [train.py:1204] (2/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_49-101072-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 15:03:31,232 WARNING [train.py:1204] (2/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_220-191080-0_sp1.1 from training. Duration: 0.8909375 2023-10-03 15:03:32,565 WARNING [train.py:1204] (2/4) Exclude cut with ID _3465_210_3525_1_1526175667060_3574049_62-335552-0 from training. Duration: 0.87 2023-10-03 15:03:33,930 WARNING [train.py:1204] (2/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_111-112362-0_sp1.1 from training. Duration: 0.82725 2023-10-03 15:03:35,708 WARNING [train.py:1204] (2/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_318-59097-0_sp0.9 from training. Duration: 0.8444375 2023-10-03 15:03:38,416 WARNING [train.py:1204] (2/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_98-255710-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 15:03:38,423 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9042W0003-515609 from training. Duration: 0.896 2023-10-03 15:03:38,431 WARNING [train.py:1204] (2/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_158-250116-0_sp0.9 from training. Duration: 0.688875 2023-10-03 15:03:39,811 WARNING [train.py:1204] (2/4) Exclude cut with ID _26750_210_5665_1_1532226678442_4063230_7-346885-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 15:03:41,162 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_37839_210_7234_1_1533542462_3689303_357-227078-0_sp1.1 from training. Duration: 0.963625 2023-10-03 15:03:41,184 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9052W0119-520904 from training. Duration: 0.896 2023-10-03 15:03:42,620 WARNING [train.py:1204] (2/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_249-281349-0_sp1.1 from training. Duration: 0.77275 2023-10-03 15:03:42,624 WARNING [train.py:1204] (2/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_147-293311-0 from training. Duration: 0.94 2023-10-03 15:03:42,625 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_158-21166-0_sp1.1 from training. Duration: 0.963625 2023-10-03 15:03:47,718 WARNING [train.py:1204] (2/4) Exclude cut with ID _14100_210_8341_1_1530529320613_3678069_207-332194-0_sp1.1 from training. Duration: 0.92725 2023-10-03 15:03:49,063 WARNING [train.py:1204] (2/4) Exclude cut with ID _9389_210_1664_1_1529217337301_5289269_324-211031-0_sp1.1 from training. Duration: 0.963625 2023-10-03 15:03:49,085 WARNING [train.py:1204] (2/4) Exclude cut with ID _14603_210_4220_1_1530615688441_3097319_133-158308-0 from training. Duration: 0.84 2023-10-03 15:03:49,243 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.conv_skip_rate, batch_count=1308846.6666666667, ans=0.0 2023-10-03 15:03:50,388 WARNING [train.py:1204] (2/4) Exclude cut with ID _13252_210_1925_1_1530671372047_3336909_166-314607-0 from training. Duration: 0.54 2023-10-03 15:03:53,477 WARNING [train.py:1204] (2/4) Exclude cut with ID _32704_210_8954_1_1533017488840_6747370_649-79538-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 15:03:53,485 WARNING [train.py:1204] (2/4) Exclude cut with ID _28394_210_8913_1_1532430049518_3650118_343-115667-0_sp1.1 from training. Duration: 0.97275 2023-10-03 15:03:54,792 WARNING [train.py:1204] (2/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_87-278686-0_sp0.9 from training. Duration: 0.8555625 2023-10-03 15:03:56,088 INFO [train.py:1046] (2/4) Epoch 37, batch 5100, loss[loss=0.1419, simple_loss=0.2233, pruned_loss=0.03028, over 24363.00 frames. ], tot_loss[loss=0.1583, simple_loss=0.2377, pruned_loss=0.03948, over 4725313.17 frames. ], batch size: 56, lr: 2.73e-03, grad_scale: 8.0 2023-10-03 15:03:56,179 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9109W0003-549749 from training. Duration: 0.768 2023-10-03 15:03:57,664 WARNING [train.py:1204] (2/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_126-29104-0_sp1.1 from training. Duration: 0.77275 2023-10-03 15:04:01,588 WARNING [train.py:1204] (2/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_325-113798-0 from training. Duration: 0.85 2023-10-03 15:04:02,973 WARNING [train.py:1204] (2/4) Exclude cut with ID _6694_210_4599_1_1527852637490_3965529_116-109398-0 from training. Duration: 0.73 2023-10-03 15:04:03,019 WARNING [train.py:1204] (2/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_46-57119-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 15:04:06,270 WARNING [train.py:1204] (2/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_546-119312-0_sp1.1 from training. Duration: 0.9 2023-10-03 15:04:09,093 WARNING [train.py:1204] (2/4) Exclude cut with ID _60057_210_10640_1_1534903065793_3333150_468-306939-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 15:04:09,144 WARNING [train.py:1204] (2/4) Exclude cut with ID _15150_210_8341_1_1530752411803_7256384_22-138274-0 from training. Duration: 0.96 2023-10-03 15:04:09,166 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_289-180904-0 from training. Duration: 0.76 2023-10-03 15:04:13,117 WARNING [train.py:1204] (2/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_64-133721-0_sp1.1 from training. Duration: 0.92725 2023-10-03 15:04:14,406 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_195-257512-0_sp1.1 from training. Duration: 0.6090625 2023-10-03 15:04:17,033 WARNING [train.py:1204] (2/4) Exclude cut with ID _66873_210_5517_1_1535454136506_4293370_704-101963-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 15:04:17,606 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.3.encoder.layers.3.conv_module2.whiten, num_groups=1, num_channels=512, metric=4.21 vs. limit=15.0 2023-10-03 15:04:19,168 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.ff3_skip_rate, batch_count=1308980.0, ans=0.0 2023-10-03 15:04:19,439 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.5.encoder.layers.0.nonlin_attention.whiten2, num_groups=1, num_channels=256, metric=3.44 vs. limit=15.0 2023-10-03 15:04:20,425 WARNING [train.py:1204] (2/4) Exclude cut with ID _4366_210_1388_1_1526792053633_3613819_231-211402-0 from training. Duration: 0.94 2023-10-03 15:04:20,472 WARNING [train.py:1204] (2/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_679-190385-0_sp1.1 from training. Duration: 0.97275 2023-10-03 15:04:22,037 WARNING [train.py:1204] (2/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_298-81459-0_sp1.1 from training. Duration: 0.963625 2023-10-03 15:04:22,047 WARNING [train.py:1204] (2/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_658-282610-0_sp0.9 from training. Duration: 0.588875 2023-10-03 15:04:25,403 WARNING [train.py:1204] (2/4) Exclude cut with ID _57572_210_5517_1_1534921219624_3809484_196-283734-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 15:04:26,844 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_53071_210_16554_1_1534490405_2954066_294-198554-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 15:04:26,849 WARNING [train.py:1204] (2/4) Exclude cut with ID _4866_210_2577_1_1527404412284_4364659_352-157930-0 from training. Duration: 0.81 2023-10-03 15:04:27,034 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.conv_skip_rate, batch_count=1309046.6666666667, ans=0.0 2023-10-03 15:04:28,159 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9231W0113-610139 from training. Duration: 0.896 2023-10-03 15:04:29,496 WARNING [train.py:1204] (2/4) Exclude cut with ID _24271_210_12658_1_1531987270022_7190519_638-171906-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 15:04:29,533 WARNING [train.py:1204] (2/4) Exclude cut with ID _14170_210_5219_1_1530525949868_4469650_272-249985-0 from training. Duration: 0.67 2023-10-03 15:04:29,549 WARNING [train.py:1204] (2/4) Exclude cut with ID _64086_210_7994_1_1535270417019_7435949_449-31567-0 from training. Duration: 0.97 2023-10-03 15:04:32,395 WARNING [train.py:1204] (2/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_109-282568-0_sp1.1 from training. Duration: 0.97275 2023-10-03 15:04:34,735 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.balancer2.prob, batch_count=1309046.6666666667, ans=0.125 2023-10-03 15:04:34,740 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.bypass_mid.scale_min, batch_count=1309046.6666666667, ans=0.2 2023-10-03 15:04:39,854 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_121-45934-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 15:04:42,608 WARNING [train.py:1204] (2/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_314-126819-0 from training. Duration: 0.5 2023-10-03 15:04:42,638 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9309W0192-640319 from training. Duration: 0.512 2023-10-03 15:04:43,909 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9310W0003-640649 from training. Duration: 0.896 2023-10-03 15:04:44,074 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.1.conv_module2.balancer2.prob, batch_count=1309113.3333333333, ans=0.125 2023-10-03 15:04:45,862 WARNING [train.py:1204] (2/4) Exclude cut with ID _7160_210_1876_1_1527940923231_3703799_120-150943-0 from training. Duration: 0.81 2023-10-03 15:04:45,863 WARNING [train.py:1204] (2/4) Exclude cut with ID _40380_210_9059_1_1533380594759_4187380_6-33508-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 15:04:47,839 WARNING [train.py:1204] (2/4) Exclude cut with ID _59202_210_26984_1_1534816810612_3549174_137-318710-0 from training. Duration: 0.89 2023-10-03 15:04:51,886 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_435-313305-0 from training. Duration: 0.71 2023-10-03 15:04:55,112 WARNING [train.py:1204] (2/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_91-212486-0_sp1.1 from training. Duration: 0.6181875 2023-10-03 15:04:55,311 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.feed_forward1.hidden_balancer.prob, batch_count=1309180.0, ans=0.125 2023-10-03 15:04:56,546 WARNING [train.py:1204] (2/4) Exclude cut with ID _14603_210_4220_1_1530615688441_3097319_133-158308-0_sp1.1 from training. Duration: 0.763625 2023-10-03 15:04:57,972 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_99-251314-0 from training. Duration: 0.87 2023-10-03 15:05:00,689 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_365-217119-0_sp1.1 from training. Duration: 0.57275 2023-10-03 15:05:02,089 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_3475_210_2028_1_1526193578_3622569_377-166133-0 from training. Duration: 0.86 2023-10-03 15:05:06,233 WARNING [train.py:1204] (2/4) Exclude cut with ID _13916_210_9994_1_1530788341126_3763149_116-21034-0_sp1.1 from training. Duration: 0.87275 2023-10-03 15:05:06,250 WARNING [train.py:1204] (2/4) Exclude cut with ID _64220_210_9527_1_1535250151772_4875259_142-216831-0_sp1.1 from training. Duration: 0.936375 2023-10-03 15:05:06,251 WARNING [train.py:1204] (2/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_490-207861-0_sp1.1 from training. Duration: 0.8909375 2023-10-03 15:05:08,092 WARNING [train.py:1204] (2/4) Exclude cut with ID _41314_210_12651_1_1533459476543_3766460_87-272068-0_sp0.9 from training. Duration: 0.92225 2023-10-03 15:05:08,122 WARNING [train.py:1204] (2/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_18-46756-0_sp1.1 from training. Duration: 0.7181875 2023-10-03 15:05:08,324 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.conv_skip_rate, batch_count=1309246.6666666667, ans=0.0 2023-10-03 15:05:09,433 INFO [train.py:1046] (2/4) Epoch 37, batch 5150, loss[loss=0.1383, simple_loss=0.2198, pruned_loss=0.02842, over 23789.00 frames. ], tot_loss[loss=0.1591, simple_loss=0.2388, pruned_loss=0.03971, over 4737661.96 frames. ], batch size: 149, lr: 2.73e-03, grad_scale: 8.0 2023-10-03 15:05:09,486 WARNING [train.py:1204] (2/4) Exclude cut with ID _39411_210_5986_1_1533538825030_3343649_98-311335-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 15:05:09,555 WARNING [train.py:1204] (2/4) Exclude cut with ID _21878_210_3525_1_1531738772002_7264330_113-18163-0 from training. Duration: 0.89 2023-10-03 15:05:09,557 WARNING [train.py:1204] (2/4) Exclude cut with ID _6653_210_2629_1_1527919172441_6732679_1103-56445-0 from training. Duration: 0.73 2023-10-03 15:05:09,624 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_3474_210_2028_1_1526188513_4825246_145-309131-0 from training. Duration: 0.74 2023-10-03 15:05:09,648 WARNING [train.py:1204] (2/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_120-211630-0_sp1.1 from training. Duration: 0.836375 2023-10-03 15:05:09,662 WARNING [train.py:1204] (2/4) Exclude cut with ID _8030_210_7497_1_1528444886781_3531089_32-31837-0 from training. Duration: 0.97 2023-10-03 15:05:11,059 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_19949_210_2161_1_1531706337_928079_8-160956-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 15:05:12,371 WARNING [train.py:1204] (2/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_321-215742-0_sp1.1 from training. Duration: 0.4545625 2023-10-03 15:05:13,831 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_583-124650-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 15:05:13,927 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_30434_210_13049_1_1532519384_981746_164-182755-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 15:05:15,570 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.bypass.skip_rate, batch_count=1309246.6666666667, ans=0.04949747468305833 2023-10-03 15:05:19,047 WARNING [train.py:1204] (2/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_339-104791-0_sp1.1 from training. Duration: 0.4454375 2023-10-03 15:05:19,078 WARNING [train.py:1204] (2/4) Exclude cut with ID _15026_210_8750_1_1530777602316_3291850_211-195043-0 from training. Duration: 0.8 2023-10-03 15:05:20,423 WARNING [train.py:1204] (2/4) Exclude cut with ID _29877_210_6228_1_1532397791106_5276149_487-182647-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 15:05:21,730 WARNING [train.py:1204] (2/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_127-566-0_sp0.9 from training. Duration: 0.9333125 2023-10-03 15:05:23,816 WARNING [train.py:1204] (2/4) Exclude cut with ID _8263_210_1664_1_1528439452891_970220_68-181805-0_sp0.9 from training. Duration: 0.711125 2023-10-03 15:05:23,825 WARNING [train.py:1204] (2/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_504-72513-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 15:05:23,833 WARNING [train.py:1204] (2/4) Exclude cut with ID _64639_210_5816_1_1535367619437_3528000_324-265233-0_sp1.1 from training. Duration: 0.963625 2023-10-03 15:05:25,143 WARNING [train.py:1204] (2/4) Exclude cut with ID _6456_210_1534_1_1527681634212_3181049_215-309359-0_sp0.9 from training. Duration: 0.97775 2023-10-03 15:05:25,148 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_115-233929-0_sp1.1 from training. Duration: 0.6545625 2023-10-03 15:05:25,195 WARNING [train.py:1204] (2/4) Exclude cut with ID _14087_210_2253_1_1530529224512_3014029_219-224445-0 from training. Duration: 0.84 2023-10-03 15:05:26,769 WARNING [train.py:1204] (2/4) Exclude cut with ID _14867_210_1968_1_1530684179890_3846969_413-29292-0_sp1.1 from training. Duration: 0.8181875 2023-10-03 15:05:28,056 WARNING [train.py:1204] (2/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_1012-279473-0_sp0.9 from training. Duration: 0.7444375 2023-10-03 15:05:31,138 WARNING [train.py:1204] (2/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_605-133395-0_sp0.9 from training. Duration: 0.7333125 2023-10-03 15:05:31,262 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_89-35748-0 from training. Duration: 0.79 2023-10-03 15:05:32,664 WARNING [train.py:1204] (2/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_476-272757-0_sp1.1 from training. Duration: 0.8454375 2023-10-03 15:05:36,858 WARNING [train.py:1204] (2/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_449-15508-0_sp0.9 from training. Duration: 0.911125 2023-10-03 15:05:40,114 WARNING [train.py:1204] (2/4) Exclude cut with ID _29345_210_4381_1_1532689168263_3860169_198-174328-0 from training. Duration: 0.91 2023-10-03 15:05:42,815 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_15517_210_6753_1_1530952293_1664795_125-158157-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 15:05:47,428 WARNING [train.py:1204] (2/4) Exclude cut with ID _25565_210_12392_1_1532423009382_3952098_420-113180-0_sp1.1 from training. Duration: 0.963625 2023-10-03 15:05:48,777 WARNING [train.py:1204] (2/4) Exclude cut with ID _51471_210_1889_1_1534399391390_3094250_236-146773-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 15:05:51,925 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.522e+02 1.958e+02 2.128e+02 2.414e+02 3.634e+02, threshold=4.256e+02, percent-clipped=0.0 2023-10-03 15:05:52,088 WARNING [train.py:1204] (2/4) Exclude cut with ID _30039_210_13270_1_1532484612900_2725150_175-35012-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 15:05:52,121 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_23850_210_4238_1_1532232597_1632433_99-275843-0_sp1.1 from training. Duration: 0.936375 2023-10-03 15:05:53,533 WARNING [train.py:1204] (2/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_658-282610-0 from training. Duration: 0.53 2023-10-03 15:05:59,132 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_37818_210_4238_1_1533620910_4720083_234-65934-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 15:05:59,196 WARNING [train.py:1204] (2/4) Exclude cut with ID _3067_210_2667_1_1526212974640_4503168_459-173867-0_sp1.1 from training. Duration: 0.8 2023-10-03 15:05:59,414 INFO [scaling.py:1118] (2/4) WithLoss: name=encoder.encoders.4.encoder.layers.0.self_attn_weights, loss-sum=0.000e+00 2023-10-03 15:06:00,545 WARNING [train.py:1204] (2/4) Exclude cut with ID _8696_210_1876_1_1528545632440_3345058_209-54069-0_sp0.9 from training. Duration: 0.7444375 2023-10-03 15:06:03,507 WARNING [train.py:1204] (2/4) Exclude cut with ID _62574_210_2655_1_1535180407058_3933249_420-236465-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 15:06:03,586 WARNING [train.py:1204] (2/4) Exclude cut with ID _46681_210_9642_1_1533893005571_7603359_333-4042-0_sp1.1 from training. Duration: 0.936375 2023-10-03 15:06:05,535 WARNING [train.py:1204] (2/4) Exclude cut with ID _3541_210_2477_1_1526475408709_3263973_377-296839-0 from training. Duration: 0.55 2023-10-03 15:06:09,631 WARNING [train.py:1204] (2/4) Exclude cut with ID _43997_210_4525_1_1533538632555_5189329_426-92599-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 15:06:10,992 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_43-71989-0_sp1.1 from training. Duration: 0.5181875 2023-10-03 15:06:13,792 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_2009_210_2071_1_1525481134_4588937_140-345530-0_sp1.1 from training. Duration: 0.963625 2023-10-03 15:06:13,807 WARNING [train.py:1204] (2/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_138-332773-0_sp0.9 from training. Duration: 0.9 2023-10-03 15:06:15,116 WARNING [train.py:1204] (2/4) Exclude cut with ID _13942_210_9988_1_1530604906036_3733360_499-55676-0_sp0.9 from training. Duration: 0.688875 2023-10-03 15:06:15,139 WARNING [train.py:1204] (2/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_259-34038-0_sp1.1 from training. Duration: 0.67275 2023-10-03 15:06:16,756 WARNING [train.py:1204] (2/4) Exclude cut with ID _3120_210_3525_1_1526036143112_4072980_404-249994-0_sp1.1 from training. Duration: 0.9 2023-10-03 15:06:16,789 WARNING [train.py:1204] (2/4) Exclude cut with ID _40402_210_18219_1_1533522324115_2953150_222-108810-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 15:06:17,638 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.1.encoder.layers.0.nonlin_attention.whiten2, num_groups=1, num_channels=256, metric=5.59 vs. limit=15.0 2023-10-03 15:06:22,638 WARNING [train.py:1204] (2/4) Exclude cut with ID _22083_210_8254_1_1531702862439_3678970_49-234546-0_sp0.9 from training. Duration: 0.92225 2023-10-03 15:06:24,015 INFO [train.py:1046] (2/4) Epoch 37, batch 5200, loss[loss=0.1542, simple_loss=0.2328, pruned_loss=0.03783, over 24293.00 frames. ], tot_loss[loss=0.1592, simple_loss=0.2385, pruned_loss=0.03993, over 4730713.18 frames. ], batch size: 56, lr: 2.73e-03, grad_scale: 16.0 2023-10-03 15:06:24,051 WARNING [train.py:1204] (2/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_199-295611-0_sp0.9 from training. Duration: 0.611125 2023-10-03 15:06:27,345 WARNING [train.py:1204] (2/4) Exclude cut with ID _43552_210_8913_1_1533726031059_3523429_43-192945-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 15:06:30,195 WARNING [train.py:1204] (2/4) Exclude cut with ID _14224_210_4381_1_1530529062037_4264010_364-22817-0 from training. Duration: 0.71 2023-10-03 15:06:31,561 WARNING [train.py:1204] (2/4) Exclude cut with ID _14922_210_6228_1_1530707435273_3693310_388-129552-0_sp1.1 from training. Duration: 0.87275 2023-10-03 15:06:31,609 WARNING [train.py:1204] (2/4) Exclude cut with ID _6537_210_1883_1_1527987559710_3597129_190-280227-0_sp1.1 from training. Duration: 0.97275 2023-10-03 15:06:34,374 WARNING [train.py:1204] (2/4) Exclude cut with ID _55844_210_2346_1_1534471212569_7625707_398-56897-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 15:06:35,640 WARNING [train.py:1204] (2/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_48-182155-0_sp1.1 from training. Duration: 0.7909375 2023-10-03 15:06:35,671 WARNING [train.py:1204] (2/4) Exclude cut with ID _29529_210_7234_1_1532393961455_3977460_582-18084-0_sp1.1 from training. Duration: 0.97275 2023-10-03 15:06:38,370 WARNING [train.py:1204] (2/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_912-282412-0 from training. Duration: 0.49 2023-10-03 15:06:40,420 WARNING [train.py:1204] (2/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_332-256168-0_sp0.9 from training. Duration: 0.8333125 2023-10-03 15:06:41,757 WARNING [train.py:1204] (2/4) Exclude cut with ID _64220_210_9527_1_1535250151772_4875259_55-250624-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 15:06:43,206 WARNING [train.py:1204] (2/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_264-108685-0 from training. Duration: 0.55 2023-10-03 15:06:45,897 WARNING [train.py:1204] (2/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_613-232856-0_sp0.9 from training. Duration: 0.6 2023-10-03 15:06:47,230 WARNING [train.py:1204] (2/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_68-212801-0_sp1.1 from training. Duration: 0.663625 2023-10-03 15:06:47,269 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6290_210_3273_1_1527595224_1282787_115-87095-0 from training. Duration: 0.55 2023-10-03 15:06:48,631 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_3080_210_2071_1_1526086102_4365166_312-8726-0 from training. Duration: 0.91 2023-10-03 15:06:50,166 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.conv_module2.balancer1.prob, batch_count=1309646.6666666667, ans=0.125 2023-10-03 15:06:51,874 WARNING [train.py:1204] (2/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_105-63309-0 from training. Duration: 0.98 2023-10-03 15:06:51,940 WARNING [train.py:1204] (2/4) Exclude cut with ID _2941_210_2577_1_1526199673150_2489759_201-160779-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 15:06:51,943 WARNING [train.py:1204] (2/4) Exclude cut with ID ID0406W0226-885434 from training. Duration: 0.997 2023-10-03 15:06:51,949 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_17292_210_6753_1_1531125388_2943734_353-328201-0_sp1.1 from training. Duration: 0.97275 2023-10-03 15:06:55,174 WARNING [train.py:1204] (2/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_572-96029-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 15:06:55,211 WARNING [train.py:1204] (2/4) Exclude cut with ID _14970_210_15709_1_1530775547361_1966965_6-23328-0_sp0.9 from training. Duration: 0.9555625 2023-10-03 15:06:56,522 WARNING [train.py:1204] (2/4) Exclude cut with ID _45012_210_12651_1_1533608435136_5371380_240-37101-0 from training. Duration: 0.93 2023-10-03 15:06:57,830 WARNING [train.py:1204] (2/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_685-90183-0_sp1.1 from training. Duration: 0.936375 2023-10-03 15:06:59,242 WARNING [train.py:1204] (2/4) Exclude cut with ID _13252_210_1925_1_1530671372047_3336909_41-22245-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 15:07:02,049 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_15126_210_6753_1_1530778677_2591185_120-277707-0 from training. Duration: 0.8 2023-10-03 15:07:02,092 WARNING [train.py:1204] (2/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_310-26186-0 from training. Duration: 0.7 2023-10-03 15:07:02,107 WARNING [train.py:1204] (2/4) Exclude cut with ID _14998_210_5219_1_1530705582080_7474970_787-294416-0 from training. Duration: 0.82 2023-10-03 15:07:05,019 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.bypass_mid.scale_min, batch_count=1309713.3333333333, ans=0.2 2023-10-03 15:07:07,598 WARNING [train.py:1204] (2/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_795-33252-0 from training. Duration: 0.37 2023-10-03 15:07:08,940 WARNING [train.py:1204] (2/4) Exclude cut with ID _7537_210_2084_1_1528502395431_3924359_113-199933-0_sp0.9 from training. Duration: 0.6555625 2023-10-03 15:07:14,925 WARNING [train.py:1204] (2/4) Exclude cut with ID _3465_210_3525_1_1526175667060_3574049_308-231735-0_sp0.9 from training. Duration: 0.811125 2023-10-03 15:07:14,943 WARNING [train.py:1204] (2/4) Exclude cut with ID _19504_210_9994_1_1531648863242_3325220_354-296765-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 15:07:17,638 WARNING [train.py:1204] (2/4) Exclude cut with ID _5409_210_1388_1_1527292850189_5865191_178-127970-0 from training. Duration: 0.57 2023-10-03 15:07:17,677 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_1466-248056-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 15:07:17,703 WARNING [train.py:1204] (2/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_535-14410-0_sp0.9 from training. Duration: 0.52225 2023-10-03 15:07:17,706 WARNING [train.py:1204] (2/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_747-129941-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 15:07:18,955 WARNING [train.py:1204] (2/4) Exclude cut with ID _15012_210_1968_1_1530856820429_5095350_688-178849-0_sp1.1 from training. Duration: 0.5818125 2023-10-03 15:07:20,564 WARNING [train.py:1204] (2/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_356-45054-0_sp1.1 from training. Duration: 0.7818125 2023-10-03 15:07:20,657 WARNING [train.py:1204] (2/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_146-276944-0_sp0.9 from training. Duration: 0.92225 2023-10-03 15:07:22,892 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.nonlin_attention.balancer.prob, batch_count=1309846.6666666667, ans=0.125 2023-10-03 15:07:25,352 WARNING [train.py:1204] (2/4) Exclude cut with ID _39204_210_4220_1_1533880547368_3191120_346-180306-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 15:07:27,189 WARNING [train.py:1204] (2/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_641-158017-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 15:07:27,190 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_19952_210_2161_1_1531875469_3851750_274-42137-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 15:07:34,360 WARNING [train.py:1204] (2/4) Exclude cut with ID _60804_210_9642_1_1534831199859_7268299_87-327819-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 15:07:34,430 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_9302_210_2550_1_1528889962_4655416_638-301620-0 from training. Duration: 0.47 2023-10-03 15:07:35,936 WARNING [train.py:1204] (2/4) Exclude cut with ID _44304_210_15374_1_1533639352765_4613129_302-22084-0_sp1.1 from training. Duration: 0.7818125 2023-10-03 15:07:35,952 WARNING [train.py:1204] (2/4) Exclude cut with ID _18513_210_9206_1_1531618215223_3862620_177-227063-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 15:07:37,235 INFO [train.py:1046] (2/4) Epoch 37, batch 5250, loss[loss=0.1564, simple_loss=0.2503, pruned_loss=0.03124, over 24664.00 frames. ], tot_loss[loss=0.159, simple_loss=0.2382, pruned_loss=0.03995, over 4735219.17 frames. ], batch size: 73, lr: 2.73e-03, grad_scale: 16.0 2023-10-03 15:07:37,357 WARNING [train.py:1204] (2/4) Exclude cut with ID _41326_210_5854_1_1533778076509_3492080_510-45444-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 15:07:37,394 WARNING [train.py:1204] (2/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_76-311963-0_sp1.1 from training. Duration: 0.636375 2023-10-03 15:07:38,776 WARNING [train.py:1204] (2/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_156-302675-0_sp0.9 from training. Duration: 0.611125 2023-10-03 15:07:39,073 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.bypass.scale_min, batch_count=1309913.3333333333, ans=0.2 2023-10-03 15:07:40,300 WARNING [train.py:1204] (2/4) Exclude cut with ID _53489_210_6228_1_1534298357169_3835970_367-148411-0_sp1.1 from training. Duration: 0.936375 2023-10-03 15:07:43,605 WARNING [train.py:1204] (2/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_389-144525-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 15:07:43,821 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=1309913.3333333333, ans=0.1 2023-10-03 15:07:44,830 WARNING [train.py:1204] (2/4) Exclude cut with ID _14069_210_5632_1_1530512692033_4803390_158-238427-0_sp1.1 from training. Duration: 0.963625 2023-10-03 15:07:45,049 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.conv_module2.balancer1.prob, batch_count=1309913.3333333333, ans=0.125 2023-10-03 15:07:46,108 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_486-151131-0_sp0.9 from training. Duration: 0.9444375 2023-10-03 15:07:50,309 WARNING [train.py:1204] (2/4) Exclude cut with ID _39846_210_12651_1_1533603651992_1318030_188-71831-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 15:07:51,719 WARNING [train.py:1204] (2/4) Exclude cut with ID _39411_210_5986_1_1533538825030_3343649_173-238248-0_sp1.1 from training. Duration: 0.7545625 2023-10-03 15:07:53,707 WARNING [train.py:1204] (2/4) Exclude cut with ID _20684_210_6917_1_1531567751646_4203930_469-49081-0_sp1.1 from training. Duration: 0.92725 2023-10-03 15:07:56,432 WARNING [train.py:1204] (2/4) Exclude cut with ID _8833_210_5219_1_1528592665210_7553059_611-80313-0_sp1.1 from training. Duration: 0.5818125 2023-10-03 15:07:57,806 WARNING [train.py:1204] (2/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_440-141811-0 from training. Duration: 0.95 2023-10-03 15:07:58,046 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.bypass.scale_min, batch_count=1309980.0, ans=0.2 2023-10-03 15:07:59,119 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_41211_210_2544_1_1533348859_3702910_236-223652-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 15:07:59,208 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_40222_210_6228_1_1533385298_4481939_220-163998-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 15:08:04,425 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.conv_module1.balancer2.prob, batch_count=1309980.0, ans=0.125 2023-10-03 15:08:08,671 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.2.encoder.layers.1.feed_forward3.out_whiten, num_groups=1, num_channels=384, metric=6.99 vs. limit=15.0 2023-10-03 15:08:18,118 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.561e+02 1.876e+02 2.013e+02 2.256e+02 3.803e+02, threshold=4.026e+02, percent-clipped=0.0 2023-10-03 15:08:45,608 INFO [train.py:1046] (2/4) Epoch 37, batch 5300, loss[loss=0.1465, simple_loss=0.2224, pruned_loss=0.03533, over 23741.00 frames. ], tot_loss[loss=0.1583, simple_loss=0.2368, pruned_loss=0.03994, over 4713128.63 frames. ], batch size: 149, lr: 2.73e-03, grad_scale: 8.0 2023-10-03 15:09:00,004 WARNING [train.py:1204] (2/4) Exclude cut with ID _14629_210_15709_1_1530604298182_956899_6-232695-0_sp1.1 from training. Duration: 0.8545625 2023-10-03 15:09:00,024 WARNING [train.py:1204] (2/4) Exclude cut with ID _13921_210_9994_1_1530841782272_3503520_327-69677-0 from training. Duration: 0.9 2023-10-03 15:09:00,050 WARNING [train.py:1204] (2/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_372-298093-0 from training. Duration: 0.8 2023-10-03 15:09:00,063 WARNING [train.py:1204] (2/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_515-139111-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 15:09:00,300 WARNING [train.py:1204] (2/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_85-65056-0_sp1.1 from training. Duration: 0.87275 2023-10-03 15:09:00,341 WARNING [train.py:1204] (2/4) Exclude cut with ID _15113_210_8254_1_1530842438579_3716279_209-54533-0_sp1.1 from training. Duration: 0.87275 2023-10-03 15:09:00,388 WARNING [train.py:1204] (2/4) Exclude cut with ID _70447_210_12392_1_1535972499300_3160420_123-312838-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 15:09:00,416 WARNING [train.py:1204] (2/4) Exclude cut with ID _60113_210_16271_1_1535164212481_4386079_70-63596-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 15:09:00,444 WARNING [train.py:1204] (2/4) Exclude cut with ID _14632_210_9988_1_1530691279392_3923200_225-188509-0_sp1.1 from training. Duration: 0.963625 2023-10-03 15:09:00,493 WARNING [train.py:1204] (2/4) Exclude cut with ID _34734_210_4525_1_1533365977542_4448720_584-180532-0_sp1.1 from training. Duration: 0.97275 2023-10-03 15:09:00,506 WARNING [train.py:1204] (2/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_351-294650-0_sp1.1 from training. Duration: 0.57275 2023-10-03 15:09:01,217 WARNING [train.py:1204] (2/4) Exclude cut with ID _13252_210_1925_1_1530671372047_3336909_263-168388-0_sp0.9 from training. Duration: 0.9555625 2023-10-03 15:09:01,244 WARNING [train.py:1204] (2/4) Exclude cut with ID _22083_210_8254_1_1531702862439_3678970_201-85738-0 from training. Duration: 0.9 2023-10-03 15:09:01,340 WARNING [train.py:1204] (2/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_237-58433-0 from training. Duration: 0.74 2023-10-03 15:09:01,369 WARNING [train.py:1204] (2/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_262-250826-0 from training. Duration: 0.99 2023-10-03 15:09:01,479 WARNING [train.py:1204] (2/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_927-43827-0_sp0.9 from training. Duration: 0.82225 2023-10-03 15:09:01,510 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_2821_210_2715_1_1526108385_3763560_73-296673-0 from training. Duration: 0.83 2023-10-03 15:09:01,586 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6287_210_3273_1_1527509506_2653716_182-199887-0 from training. Duration: 0.51 2023-10-03 15:09:01,721 WARNING [train.py:1204] (2/4) Exclude cut with ID _70519_210_4163_1_1535961595994_7458230_899-80223-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 15:09:01,984 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_210-234949-0_sp1.1 from training. Duration: 0.97275 2023-10-03 15:09:02,027 WARNING [train.py:1204] (2/4) Exclude cut with ID _30196_210_1664_1_1533708496522_7074140_768-278176-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 15:09:02,107 WARNING [train.py:1204] (2/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_315-128283-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 15:09:02,188 WARNING [train.py:1204] (2/4) Exclude cut with ID _14886_210_4220_1_1530702020355_3516190_330-323941-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 15:09:02,474 WARNING [train.py:1204] (2/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_264-348896-0_sp1.1 from training. Duration: 0.77275 2023-10-03 15:09:02,495 WARNING [train.py:1204] (2/4) Exclude cut with ID _37086_210_9038_1_1532999540891_7021250_433-10563-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 15:09:02,538 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_53638_210_21581_1_1534297319_5808449_885-80172-0_sp1.1 from training. Duration: 0.87275 2023-10-03 15:09:02,632 WARNING [train.py:1204] (2/4) Exclude cut with ID _14623_210_1925_1_1530773354962_3632609_157-218771-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 15:09:02,643 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_1368-78165-0_sp1.1 from training. Duration: 0.97275 2023-10-03 15:09:02,647 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_309-65177-0_sp0.9 from training. Duration: 0.97775 2023-10-03 15:09:02,653 WARNING [train.py:1204] (2/4) Exclude cut with ID _21632_210_2253_1_1531994001765_3744140_291-138812-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 15:09:02,665 WARNING [train.py:1204] (2/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_669-93163-0_sp1.1 from training. Duration: 0.8909375 2023-10-03 15:09:03,717 WARNING [train.py:1204] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_292-215346-0 from training. Duration: 0.97 2023-10-03 15:09:03,742 WARNING [train.py:1204] (2/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_286-222073-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 15:09:04,041 WARNING [train.py:1204] (2/4) Exclude cut with ID _59256_210_5986_1_1535076085997_3589120_121-237386-0_sp1.1 from training. Duration: 0.87275 2023-10-03 15:09:04,055 WARNING [train.py:1204] (2/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_96-320736-0 from training. Duration: 0.86 2023-10-03 15:09:04,065 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_128-202104-0 from training. Duration: 0.63 2023-10-03 15:09:04,162 WARNING [train.py:1204] (2/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_242-3472-0_sp0.9 from training. Duration: 0.811125 2023-10-03 15:09:04,209 WARNING [train.py:1204] (2/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_154-74182-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 15:09:04,213 WARNING [train.py:1204] (2/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_187-221566-0 from training. Duration: 0.43 2023-10-03 15:09:04,377 WARNING [train.py:1204] (2/4) Exclude cut with ID _6194_210_1534_1_1527511826160_3941194_14-282876-0 from training. Duration: 0.71 2023-10-03 15:09:04,416 WARNING [train.py:1204] (2/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_660-211088-0_sp1.1 from training. Duration: 0.563625 2023-10-03 15:09:04,753 WARNING [train.py:1204] (2/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_526-62079-0_sp1.1 from training. Duration: 0.6090625 2023-10-03 15:09:04,908 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_92-246270-0_sp1.1 from training. Duration: 0.77275 2023-10-03 15:09:05,001 WARNING [train.py:1204] (2/4) Exclude cut with ID IC0287W0023-142350 from training. Duration: 0.956 2023-10-03 15:09:05,082 WARNING [train.py:1204] (2/4) Exclude cut with ID _58605_210_12210_1_1534723195939_3904259_48-269402-0 from training. Duration: 0.96 2023-10-03 15:09:05,099 WARNING [train.py:1204] (2/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_416-257730-0_sp0.9 from training. Duration: 0.72225 2023-10-03 15:09:05,177 WARNING [train.py:1204] (2/4) Exclude cut with ID _7877_210_6014_1_1528286358099_3750136_185-17516-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 15:09:05,290 WARNING [train.py:1204] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_23-189234-0 from training. Duration: 0.83 2023-10-03 15:09:05,308 WARNING [train.py:1204] (2/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_237-89684-0 from training. Duration: 0.96 2023-10-03 15:09:05,382 WARNING [train.py:1204] (2/4) Exclude cut with ID _13916_210_9994_1_1530788341126_3763149_116-21034-0 from training. Duration: 0.96 2023-10-03 15:09:05,537 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_3133_210_2087_1_1526106446_2324690_246-125000-0_sp1.1 from training. Duration: 0.563625 2023-10-03 15:09:12,148 INFO [train.py:1046] (2/4) Epoch 38, batch 0, loss[loss=0.1636, simple_loss=0.2437, pruned_loss=0.0417, over 23375.00 frames. ], tot_loss[loss=0.1636, simple_loss=0.2437, pruned_loss=0.0417, over 23375.00 frames. ], batch size: 285, lr: 2.70e-03, grad_scale: 16.0 2023-10-03 15:09:12,149 INFO [train.py:1069] (2/4) Computing validation loss 2023-10-03 15:09:24,066 INFO [train.py:1078] (2/4) Epoch 38, validation: loss=0.3257, simple_loss=0.2715, pruned_loss=0.1899, over 1125622.00 frames. 2023-10-03 15:09:24,067 INFO [train.py:1079] (2/4) Maximum memory allocated so far is 20967MB 2023-10-03 15:09:27,029 WARNING [train.py:1204] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_402-253027-0 from training. Duration: 0.54 2023-10-03 15:09:28,397 WARNING [train.py:1204] (2/4) Exclude cut with ID _59288_210_5854_1_1535077757774_3412246_158-289345-0_sp1.1 from training. Duration: 0.92725 2023-10-03 15:09:31,121 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_28211_210_6388_1_1532861506_4523224_128-328162-0_sp1.1 from training. Duration: 0.8181875 2023-10-03 15:09:31,837 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.2.encoder.layers.0.feed_forward3.out_whiten, num_groups=1, num_channels=384, metric=11.07 vs. limit=15.0 2023-10-03 15:09:35,972 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_788-278281-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 15:09:35,980 WARNING [train.py:1204] (2/4) Exclude cut with ID _49728_210_12651_1_1534041034294_6788150_624-195301-0_sp0.9 from training. Duration: 0.9444375 2023-10-03 15:09:36,014 WARNING [train.py:1204] (2/4) Exclude cut with ID _14860_210_4915_1_1530702020340_7343240_267-139775-0_sp1.1 from training. Duration: 0.963625 2023-10-03 15:09:36,180 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.0.feed_forward1.hidden_balancer.prob, batch_count=1310326.6666666667, ans=0.125 2023-10-03 15:09:37,364 WARNING [train.py:1204] (2/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_332-221057-0 from training. Duration: 0.68 2023-10-03 15:09:38,837 WARNING [train.py:1204] (2/4) Exclude cut with ID _40402_210_18219_1_1533522324115_2953150_242-76943-0 from training. Duration: 0.94 2023-10-03 15:09:40,790 WARNING [train.py:1204] (2/4) Exclude cut with ID _16747_210_5986_1_1531811544733_2749109_87-349587-0_sp1.1 from training. Duration: 0.963625 2023-10-03 15:09:40,857 WARNING [train.py:1204] (2/4) Exclude cut with ID _22982_210_1883_1_1531882790447_5449409_505-346452-0_sp1.1 from training. Duration: 0.963625 2023-10-03 15:09:43,438 WARNING [train.py:1204] (2/4) Exclude cut with ID _17956_210_4353_1_1531209568125_6979970_13-192107-0_sp1.1 from training. Duration: 0.963625 2023-10-03 15:09:44,792 WARNING [train.py:1204] (2/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_430-307666-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 15:09:44,842 WARNING [train.py:1204] (2/4) Exclude cut with ID _2895_210_2477_1_1525956785794_4063750_289-11475-0_sp1.1 from training. Duration: 0.7454375 2023-10-03 15:09:44,853 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_3910_210_1794_1_1526742306_334530_28-238909-0_sp0.9 from training. Duration: 0.9 2023-10-03 15:09:46,393 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.self_attn_weights.pos_emb_skip_rate, batch_count=1310393.3333333333, ans=0.0 2023-10-03 15:09:47,621 WARNING [train.py:1204] (2/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_18-130336-0 from training. Duration: 0.79 2023-10-03 15:09:49,080 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_548-294316-0_sp0.9 from training. Duration: 0.9 2023-10-03 15:09:53,996 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.1.feed_forward3.hidden_balancer.prob, batch_count=1310460.0, ans=0.125 2023-10-03 15:09:55,704 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_42-142473-0_sp1.1 from training. Duration: 0.6545625 2023-10-03 15:09:55,710 WARNING [train.py:1204] (2/4) Exclude cut with ID _59170_210_6721_1_1534983633960_4203335_326-48066-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 15:09:57,165 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_45886_210_17024_1_1533709215_471063_7-79165-0 from training. Duration: 0.98 2023-10-03 15:10:00,653 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.3.encoder.layers.0.nonlin_attention.whiten1, num_groups=1, num_channels=384, metric=6.40 vs. limit=10.0 2023-10-03 15:10:01,244 WARNING [train.py:1204] (2/4) Exclude cut with ID _44585_210_18628_1_1533605350387_4518199_280-73534-0_sp0.9 from training. Duration: 0.988875 2023-10-03 15:10:01,252 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_3475_210_2028_1_1526193578_3622569_265-276625-0_sp1.1 from training. Duration: 0.6909375 2023-10-03 15:10:02,567 WARNING [train.py:1204] (2/4) Exclude cut with ID _64631_210_27767_1_1535349580068_4165160_88-225232-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 15:10:07,332 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_205-265518-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 15:10:10,703 WARNING [train.py:1204] (2/4) Exclude cut with ID _40402_210_18219_1_1533522324115_2953150_271-253734-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 15:10:16,166 WARNING [train.py:1204] (2/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_282-257791-0 from training. Duration: 0.86 2023-10-03 15:10:18,871 WARNING [train.py:1204] (2/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_356-45054-0 from training. Duration: 0.86 2023-10-03 15:10:18,913 WARNING [train.py:1204] (2/4) Exclude cut with ID _69584_210_5219_1_1535785133775_4044450_38-348116-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 15:10:18,921 WARNING [train.py:1204] (2/4) Exclude cut with ID _66763_210_17301_1_1535788667617_241750_21-328480-0_sp1.1 from training. Duration: 0.936375 2023-10-03 15:10:20,284 WARNING [train.py:1204] (2/4) Exclude cut with ID _26528_210_9070_1_1532570410842_3851310_497-97218-0_sp0.9 from training. Duration: 0.9555625 2023-10-03 15:10:22,117 WARNING [train.py:1204] (2/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_592-101075-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 15:10:23,611 WARNING [train.py:1204] (2/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_678-159307-0 from training. Duration: 0.85 2023-10-03 15:10:28,030 WARNING [train.py:1204] (2/4) Exclude cut with ID _28427_210_4220_1_1532581240967_3061069_166-319773-0_sp1.1 from training. Duration: 0.936375 2023-10-03 15:10:28,123 WARNING [train.py:1204] (2/4) Exclude cut with ID _29877_210_6228_1_1532397791106_5276149_606-224984-0_sp1.1 from training. Duration: 0.936375 2023-10-03 15:10:32,217 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_125-203105-0_sp1.1 from training. Duration: 0.72725 2023-10-03 15:10:35,252 WARNING [train.py:1204] (2/4) Exclude cut with ID IC0620W0113-306765 from training. Duration: 0.768 2023-10-03 15:10:37,240 WARNING [train.py:1204] (2/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_410-287857-0_sp1.1 from training. Duration: 0.8454375 2023-10-03 15:10:38,649 INFO [train.py:1046] (2/4) Epoch 38, batch 50, loss[loss=0.1524, simple_loss=0.2331, pruned_loss=0.03578, over 24321.00 frames. ], tot_loss[loss=0.1586, simple_loss=0.2401, pruned_loss=0.0385, over 1076209.91 frames. ], batch size: 61, lr: 2.70e-03, grad_scale: 16.0 2023-10-03 15:10:38,791 WARNING [train.py:1204] (2/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_78-95260-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 15:10:40,522 WARNING [train.py:1204] (2/4) Exclude cut with ID _58059_210_5854_1_1534901471149_3164808_6-73080-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 15:10:40,546 WARNING [train.py:1204] (2/4) Exclude cut with ID _5166_210_2084_1_1527073197160_4087993_331-117829-0 from training. Duration: 0.62 2023-10-03 15:10:41,813 WARNING [train.py:1204] (2/4) Exclude cut with ID _14880_210_8895_1_1530680426655_3795259_289-212817-0_sp0.9 from training. Duration: 0.7666875 2023-10-03 15:10:41,841 WARNING [train.py:1204] (2/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_620-144779-0_sp1.1 from training. Duration: 0.92725 2023-10-03 15:10:43,839 WARNING [train.py:1204] (2/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_561-288575-0_sp1.1 from training. Duration: 0.963625 2023-10-03 15:10:45,196 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_22657_210_6890_1_1532086002_1637216_122-188128-0_sp1.1 from training. Duration: 0.963625 2023-10-03 15:10:47,804 WARNING [train.py:1204] (2/4) Exclude cut with ID _16756_210_4915_1_1531047803763_7104259_604-86607-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 15:10:50,120 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.0.layers.1.self_attn2.whiten, num_groups=1, num_channels=192, metric=8.66 vs. limit=22.5 2023-10-03 15:10:51,254 WARNING [train.py:1204] (2/4) Exclude cut with ID _15150_210_8341_1_1530752411803_7256384_469-239509-0 from training. Duration: 0.68 2023-10-03 15:10:51,263 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_10987_210_4163_1_1529760555_4123902_302-87926-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 15:10:52,878 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.conv_module2.balancer1.prob, batch_count=1310726.6666666667, ans=0.125 2023-10-03 15:10:57,137 WARNING [train.py:1204] (2/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_469-94673-0_sp0.9 from training. Duration: 0.87775 2023-10-03 15:10:59,852 WARNING [train.py:1204] (2/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_798-106179-0 from training. Duration: 0.83 2023-10-03 15:11:00,204 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.feed_forward1.out_proj.dropout_p, batch_count=1310726.6666666667, ans=0.1 2023-10-03 15:11:01,357 WARNING [train.py:1204] (2/4) Exclude cut with ID _16744_210_5986_1_1531724405384_3907567_242-111316-0 from training. Duration: 0.93 2023-10-03 15:11:01,580 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.feed_forward1.hidden_balancer.prob, batch_count=1310726.6666666667, ans=0.125 2023-10-03 15:11:02,694 WARNING [train.py:1204] (2/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_938-205135-0_sp0.9 from training. Duration: 0.9666875 2023-10-03 15:11:03,995 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.646e+02 1.896e+02 2.083e+02 2.341e+02 4.077e+02, threshold=4.166e+02, percent-clipped=1.0 2023-10-03 15:11:04,130 WARNING [train.py:1204] (2/4) Exclude cut with ID _64135_210_2253_1_1535677267876_7608972_133-188266-0_sp0.9 from training. Duration: 0.9 2023-10-03 15:11:04,134 WARNING [train.py:1204] (2/4) Exclude cut with ID _44585_210_18628_1_1533605350387_4518199_623-346820-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 15:11:04,207 WARNING [train.py:1204] (2/4) Exclude cut with ID _42326_210_7770_1_1533814282531_3707269_454-266903-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 15:11:04,277 WARNING [train.py:1204] (2/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_5-55893-0_sp1.1 from training. Duration: 0.5 2023-10-03 15:11:05,553 WARNING [train.py:1204] (2/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_577-332600-0_sp1.1 from training. Duration: 0.5181875 2023-10-03 15:11:05,553 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_957-138882-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 15:11:08,998 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.bypass.skip_rate, batch_count=1310793.3333333333, ans=0.07 2023-10-03 15:11:14,226 WARNING [train.py:1204] (2/4) Exclude cut with ID _12556_210_8341_1_1530081024100_7327690_219-38004-0_sp1.1 from training. Duration: 0.97275 2023-10-03 15:11:17,513 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_397-147196-0_sp1.1 from training. Duration: 0.72725 2023-10-03 15:11:17,530 WARNING [train.py:1204] (2/4) Exclude cut with ID _3541_210_2477_1_1526475408709_3263973_231-104736-0_sp0.9 from training. Duration: 0.8666875 2023-10-03 15:11:19,033 WARNING [train.py:1204] (2/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_398-334558-0 from training. Duration: 0.96 2023-10-03 15:11:20,467 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_257-20733-0_sp1.1 from training. Duration: 0.6909375 2023-10-03 15:11:21,889 WARNING [train.py:1204] (2/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_146-282400-0_sp1.1 from training. Duration: 0.6454375 2023-10-03 15:11:21,892 WARNING [train.py:1204] (2/4) Exclude cut with ID _51899_210_20034_1_1534129254466_3669220_265-215428-0 from training. Duration: 0.98 2023-10-03 15:11:23,131 WARNING [train.py:1204] (2/4) Exclude cut with ID _34423_210_8750_1_1533430798456_3330199_219-106541-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 15:11:25,000 WARNING [train.py:1204] (2/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_458-67321-0 from training. Duration: 0.97 2023-10-03 15:11:26,646 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.bypass.scale_min, batch_count=1310860.0, ans=0.2 2023-10-03 15:11:31,323 WARNING [train.py:1204] (2/4) Exclude cut with ID _6653_210_2629_1_1527919172441_6732679_1051-182606-0_sp1.1 from training. Duration: 0.936375 2023-10-03 15:11:31,337 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_53555_210_5632_1_1534595220_7232297_603-4396-0_sp1.1 from training. Duration: 0.97275 2023-10-03 15:11:32,621 WARNING [train.py:1204] (2/4) Exclude cut with ID _54218_210_12210_1_1534413646388_4409259_159-144367-0_sp1.1 from training. Duration: 0.963625 2023-10-03 15:11:33,981 WARNING [train.py:1204] (2/4) Exclude cut with ID _7606_210_4915_1_1528894253277_5080690_301-10732-0_sp0.9 from training. Duration: 0.9 2023-10-03 15:11:33,985 WARNING [train.py:1204] (2/4) Exclude cut with ID _15026_210_8750_1_1530777602316_3291850_211-195043-0_sp0.9 from training. Duration: 0.888875 2023-10-03 15:11:36,544 WARNING [train.py:1204] (2/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_146-282400-0 from training. Duration: 0.71 2023-10-03 15:11:36,580 WARNING [train.py:1204] (2/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_65-88242-0 from training. Duration: 0.56 2023-10-03 15:11:36,763 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.balancer2.prob, batch_count=1310926.6666666667, ans=0.125 2023-10-03 15:11:39,268 WARNING [train.py:1204] (2/4) Exclude cut with ID _55857_210_2346_1_1534644007831_6898209_80-107757-0_sp1.1 from training. Duration: 0.963625 2023-10-03 15:11:39,313 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_15126_210_6753_1_1530778677_2591185_120-277707-0_sp0.9 from training. Duration: 0.888875 2023-10-03 15:11:41,909 WARNING [train.py:1204] (2/4) Exclude cut with ID _44778_210_2054_1_1533798127203_7443090_1033-168593-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 15:11:41,945 WARNING [train.py:1204] (2/4) Exclude cut with ID _46683_210_9642_1_1533730913492_4591116_475-30711-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 15:11:41,975 WARNING [train.py:1204] (2/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_253-308377-0 from training. Duration: 0.49 2023-10-03 15:11:43,757 WARNING [train.py:1204] (2/4) Exclude cut with ID _4680_210_3525_1_1527249757953_893386_75-78979-0 from training. Duration: 0.91 2023-10-03 15:11:45,216 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_5522_210_3273_1_1527249606_3909096_318-203153-0_sp0.9 from training. Duration: 0.47775 2023-10-03 15:11:46,645 WARNING [train.py:1204] (2/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_550-170116-0_sp1.1 from training. Duration: 0.92725 2023-10-03 15:11:46,679 WARNING [train.py:1204] (2/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_277-210798-0_sp0.9 from training. Duration: 0.611125 2023-10-03 15:11:48,070 WARNING [train.py:1204] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_620-276793-0 from training. Duration: 0.59 2023-10-03 15:11:48,088 WARNING [train.py:1204] (2/4) Exclude cut with ID _3067_210_2667_1_1526212974640_4503168_119-175776-0 from training. Duration: 0.74 2023-10-03 15:11:48,153 WARNING [train.py:1204] (2/4) Exclude cut with ID _55209_210_1531_1_1534675828996_4597160_681-195827-0_sp1.1 from training. Duration: 0.92725 2023-10-03 15:11:49,828 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_427-78097-0_sp1.1 from training. Duration: 0.72725 2023-10-03 15:11:51,246 WARNING [train.py:1204] (2/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_96-290039-0_sp0.9 from training. Duration: 0.62225 2023-10-03 15:11:51,255 WARNING [train.py:1204] (2/4) Exclude cut with ID _45329_210_7005_1_1534157951211_6774339_287-292174-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 15:11:52,658 INFO [train.py:1046] (2/4) Epoch 38, batch 100, loss[loss=0.1322, simple_loss=0.2124, pruned_loss=0.026, over 24615.00 frames. ], tot_loss[loss=0.1586, simple_loss=0.2393, pruned_loss=0.03893, over 1892137.20 frames. ], batch size: 60, lr: 2.70e-03, grad_scale: 16.0 2023-10-03 15:11:52,806 WARNING [train.py:1204] (2/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_200-292228-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 15:11:57,467 WARNING [train.py:1204] (2/4) Exclude cut with ID _69884_210_9771_1_1535846392818_4166169_291-194999-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 15:12:00,257 WARNING [train.py:1204] (2/4) Exclude cut with ID _58895_210_8750_1_1535184081320_3649089_265-323810-0_sp1.1 from training. Duration: 0.87275 2023-10-03 15:12:02,191 WARNING [train.py:1204] (2/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_58-68956-0 from training. Duration: 0.83 2023-10-03 15:12:02,196 WARNING [train.py:1204] (2/4) Exclude cut with ID _39867_210_9826_1_1533553222030_9344259_322-115153-0_sp1.1 from training. Duration: 0.963625 2023-10-03 15:12:02,446 INFO [scaling.py:1118] (2/4) WithLoss: name=encoder.encoders.1.encoder.layers.0.self_attn_weights, loss-sum=0.000e+00 2023-10-03 15:12:05,036 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_3371_210_1794_1_1526384462_5580777_396-339812-0_sp1.1 from training. Duration: 0.77275 2023-10-03 15:12:05,055 WARNING [train.py:1204] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_731-2428-0_sp0.9 from training. Duration: 0.92225 2023-10-03 15:12:05,077 WARNING [train.py:1204] (2/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_98-741-0_sp1.1 from training. Duration: 0.72725 2023-10-03 15:12:06,404 WARNING [train.py:1204] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_238-245655-0_sp0.9 from training. Duration: 0.9 2023-10-03 15:12:06,425 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6788_210_1388_1_1527898698_4562021_91-197981-0_sp0.9 from training. Duration: 0.92225 2023-10-03 15:12:07,812 WARNING [train.py:1204] (2/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_860-96198-0 from training. Duration: 0.73 2023-10-03 15:12:08,008 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.balancer1.prob, batch_count=1311060.0, ans=0.125 2023-10-03 15:12:10,476 WARNING [train.py:1204] (2/4) Exclude cut with ID _14268_210_9206_1_1530622771285_4714099_331-105351-0_sp0.9 from training. Duration: 0.72225 2023-10-03 15:12:10,499 WARNING [train.py:1204] (2/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_204-137133-0_sp1.1 from training. Duration: 0.92725 2023-10-03 15:12:10,524 WARNING [train.py:1204] (2/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_5-46496-0_sp1.1 from training. Duration: 0.936375 2023-10-03 15:12:10,532 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_160-218721-0_sp1.1 from training. Duration: 0.87275 2023-10-03 15:12:13,266 WARNING [train.py:1204] (2/4) Exclude cut with ID _45329_210_7005_1_1534157951211_6774339_503-287883-0 from training. Duration: 0.88 2023-10-03 15:12:15,219 WARNING [train.py:1204] (2/4) Exclude cut with ID _57572_210_5517_1_1534921219624_3809484_110-79134-0_sp1.1 from training. Duration: 0.92725 2023-10-03 15:12:16,602 WARNING [train.py:1204] (2/4) Exclude cut with ID _44585_210_18628_1_1533605350387_4518199_421-37641-0_sp1.1 from training. Duration: 0.936375 2023-10-03 15:12:17,074 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.5.encoder.layers.0.nonlin_attention.whiten1, num_groups=1, num_channels=192, metric=3.77 vs. limit=10.0 2023-10-03 15:12:17,322 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.1.encoder.layers.1.feed_forward3.out_whiten, num_groups=1, num_channels=256, metric=8.57 vs. limit=15.0 2023-10-03 15:12:17,931 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_42-142473-0_sp0.9 from training. Duration: 0.8 2023-10-03 15:12:21,210 WARNING [train.py:1204] (2/4) Exclude cut with ID _5166_210_2084_1_1527073197160_4087993_310-114452-0_sp0.9 from training. Duration: 0.6555625 2023-10-03 15:12:24,100 INFO [scaling.py:1118] (2/4) WithLoss: name=encoder.encoders.5.encoder.layers.1.self_attn_weights, loss-sum=0.000e+00 2023-10-03 15:12:25,094 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9029W0120-509520 from training. Duration: 0.768 2023-10-03 15:12:25,117 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9029W0433-509820 from training. Duration: 0.896 2023-10-03 15:12:26,487 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_40222_210_6228_1_1533385298_4481939_163-334586-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 15:12:26,488 WARNING [train.py:1204] (2/4) Exclude cut with ID _48677_210_5663_1_1533902405919_3892989_408-40740-0_sp1.1 from training. Duration: 0.7545625 2023-10-03 15:12:29,750 WARNING [train.py:1204] (2/4) Exclude cut with ID _24314_210_4915_1_1532135078116_7170179_521-258868-0_sp0.9 from training. Duration: 0.87775 2023-10-03 15:12:31,157 WARNING [train.py:1204] (2/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_576-51198-0_sp1.1 from training. Duration: 0.92725 2023-10-03 15:12:32,511 WARNING [train.py:1204] (2/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_306-216871-0_sp1.1 from training. Duration: 0.97275 2023-10-03 15:12:38,132 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.0.layers.0.whiten, num_groups=1, num_channels=192, metric=4.20 vs. limit=12.0 2023-10-03 15:12:38,498 WARNING [train.py:1204] (2/4) Exclude cut with ID _20684_210_6917_1_1531567751646_4203930_463-119982-0_sp1.1 from training. Duration: 0.97275 2023-10-03 15:12:38,556 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9084W0012-536850 from training. Duration: 0.64 2023-10-03 15:12:41,332 WARNING [train.py:1204] (2/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_847-41334-0_sp1.1 from training. Duration: 0.4 2023-10-03 15:12:43,025 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.bypass.scale_min, batch_count=1311193.3333333333, ans=0.2 2023-10-03 15:12:44,190 WARNING [train.py:1204] (2/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_326-165900-0_sp1.1 from training. Duration: 0.62725 2023-10-03 15:12:45,490 WARNING [train.py:1204] (2/4) Exclude cut with ID _7537_210_2084_1_1528502395431_3924359_128-268029-0_sp1.1 from training. Duration: 0.8909375 2023-10-03 15:12:48,679 WARNING [train.py:1204] (2/4) Exclude cut with ID _67499_210_17282_1_1535509805220_3692430_333-155030-0_sp1.1 from training. Duration: 0.97275 2023-10-03 15:12:52,781 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_34008_210_10431_1_1533345424_8183247_588-257177-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 15:12:56,159 WARNING [train.py:1204] (2/4) Exclude cut with ID _25914_210_6400_1_1532141317058_644740_52-96808-0_sp1.1 from training. Duration: 0.82725 2023-10-03 15:12:57,598 WARNING [train.py:1204] (2/4) Exclude cut with ID _56494_210_4220_1_1535284837625_3033169_69-138150-0_sp1.1 from training. Duration: 0.8545625 2023-10-03 15:13:00,291 WARNING [train.py:1204] (2/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_19-143553-0_sp1.1 from training. Duration: 0.97275 2023-10-03 15:13:00,344 WARNING [train.py:1204] (2/4) Exclude cut with ID _21878_210_3525_1_1531738772002_7264330_582-130735-0_sp1.1 from training. Duration: 0.936375 2023-10-03 15:13:03,612 WARNING [train.py:1204] (2/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_943-201356-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 15:13:03,623 WARNING [train.py:1204] (2/4) Exclude cut with ID _3231_210_2106_1_1526205864934_6784220_422-229276-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 15:13:03,639 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_48049_210_5632_1_1533864553_3519937_73-198717-0_sp1.1 from training. Duration: 0.97275 2023-10-03 15:13:03,700 WARNING [train.py:1204] (2/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_329-285801-0 from training. Duration: 0.91 2023-10-03 15:13:03,720 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9171W0114-579000 from training. Duration: 0.896 2023-10-03 15:13:05,016 WARNING [train.py:1204] (2/4) Exclude cut with ID _3120_210_3525_1_1526036143112_4072980_210-132675-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 15:13:05,089 WARNING [train.py:1204] (2/4) Exclude cut with ID _55857_210_2346_1_1534644007831_6898209_745-98391-0_sp0.9 from training. Duration: 0.8444375 2023-10-03 15:13:06,869 INFO [train.py:1046] (2/4) Epoch 38, batch 150, loss[loss=0.1521, simple_loss=0.2339, pruned_loss=0.03509, over 24657.00 frames. ], tot_loss[loss=0.1601, simple_loss=0.2404, pruned_loss=0.03984, over 2534211.06 frames. ], batch size: 65, lr: 2.70e-03, grad_scale: 16.0 2023-10-03 15:13:06,924 WARNING [train.py:1204] (2/4) Exclude cut with ID _5250_210_5219_1_1527249561806_8692110_615-322035-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 15:13:06,928 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_36756_210_7234_1_1533026991_966276_119-315767-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 15:13:06,952 WARNING [train.py:1204] (2/4) Exclude cut with ID _13916_210_9994_1_1530788341126_3763149_60-191556-0_sp1.1 from training. Duration: 0.5545625 2023-10-03 15:13:06,976 WARNING [train.py:1204] (2/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_177-236809-0_sp1.1 from training. Duration: 0.4454375 2023-10-03 15:13:08,335 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7002_210_1794_1_1527939683_5157286_184-229630-0_sp1.1 from training. Duration: 0.67275 2023-10-03 15:13:08,341 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_50428_210_5724_1_1534247532_4126418_464-18322-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 15:13:08,404 WARNING [train.py:1204] (2/4) Exclude cut with ID _35799_210_5854_1_1533263487887_3529071_466-131402-0_sp1.1 from training. Duration: 0.936375 2023-10-03 15:13:08,987 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.4.encoder.layers.2.conv_module2.whiten, num_groups=1, num_channels=384, metric=3.26 vs. limit=15.0 2023-10-03 15:13:09,789 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_1259-132768-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 15:13:11,091 WARNING [train.py:1204] (2/4) Exclude cut with ID _6694_210_4599_1_1527852637490_3965529_144-185051-0_sp1.1 from training. Duration: 0.87275 2023-10-03 15:13:11,122 WARNING [train.py:1204] (2/4) Exclude cut with ID _64639_210_5816_1_1535367619437_3528000_59-133290-0_sp1.1 from training. Duration: 0.963625 2023-10-03 15:13:12,570 WARNING [train.py:1204] (2/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_223-49525-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 15:13:12,856 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.self_attn_weights.pos_emb_skip_rate, batch_count=1311326.6666666667, ans=0.0 2023-10-03 15:13:14,279 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.conv_skip_rate, batch_count=1311326.6666666667, ans=0.0 2023-10-03 15:13:15,372 WARNING [train.py:1204] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_748-36150-0_sp1.1 from training. Duration: 0.82725 2023-10-03 15:13:15,372 WARNING [train.py:1204] (2/4) Exclude cut with ID _19896_210_1883_1_1531709022662_6649569_52-221627-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 15:13:16,690 WARNING [train.py:1204] (2/4) Exclude cut with ID _6789_210_1534_1_1527854471671_2870519_296-115766-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 15:13:19,489 WARNING [train.py:1204] (2/4) Exclude cut with ID _14624_210_1925_1_1530858690657_3642219_100-19871-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 15:13:19,543 WARNING [train.py:1204] (2/4) Exclude cut with ID _12555_210_8750_1_1530075594402_3780239_50-293675-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 15:13:24,060 WARNING [train.py:1204] (2/4) Exclude cut with ID _14880_210_8895_1_1530680426655_3795259_289-212817-0_sp1.1 from training. Duration: 0.62725 2023-10-03 15:13:24,103 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_14056_210_4163_1_1530604483_7572827_388-211626-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 15:13:28,737 WARNING [train.py:1204] (2/4) Exclude cut with ID _5289_210_2084_1_1527678202736_3779299_392-261988-0 from training. Duration: 0.65 2023-10-03 15:13:28,745 WARNING [train.py:1204] (2/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_124-345840-0 from training. Duration: 0.52 2023-10-03 15:13:28,757 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_2655_210_2087_1_1525688959_3897400_437-337690-0 from training. Duration: 0.63 2023-10-03 15:13:31,438 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.624e+02 1.858e+02 2.070e+02 2.351e+02 3.809e+02, threshold=4.141e+02, percent-clipped=0.0 2023-10-03 15:13:31,601 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_3780_210_2087_1_1526467391_239068_13-274633-0_sp1.1 from training. Duration: 0.92725 2023-10-03 15:13:31,606 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_805-71497-0_sp1.1 from training. Duration: 0.6454375 2023-10-03 15:13:32,971 WARNING [train.py:1204] (2/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_434-83000-0_sp1.1 from training. Duration: 0.8909375 2023-10-03 15:13:34,437 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_17613_210_2151_1_1531137569_2366160_285-219531-0_sp1.1 from training. Duration: 0.97275 2023-10-03 15:13:34,450 WARNING [train.py:1204] (2/4) Exclude cut with ID _56108_210_16380_1_1534505446159_3887959_22-37858-0_sp1.1 from training. Duration: 0.936375 2023-10-03 15:13:34,484 WARNING [train.py:1204] (2/4) Exclude cut with ID _28427_210_4220_1_1532581240967_3061069_200-88444-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 15:13:36,415 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_43802_210_2290_1_1533516203_8829504_77-279042-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 15:13:37,782 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9309W0193-640320 from training. Duration: 0.896 2023-10-03 15:13:39,691 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.0.conv_module1.balancer2.prob, batch_count=1311460.0, ans=0.125 2023-10-03 15:13:40,855 WARNING [train.py:1204] (2/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_516-324055-0_sp1.1 from training. Duration: 0.936375 2023-10-03 15:13:41,319 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.5.encoder.layers.1.self_attn2.whiten, num_groups=1, num_channels=256, metric=10.41 vs. limit=22.5 2023-10-03 15:13:44,040 INFO [scaling.py:1118] (2/4) WithLoss: name=encoder.encoders.3.encoder.layers.3.self_attn_weights, loss-sum=0.000e+00 2023-10-03 15:13:45,098 WARNING [train.py:1204] (2/4) Exclude cut with ID _40094_210_16380_1_1533294005773_3641159_10-261240-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 15:13:49,096 WARNING [train.py:1204] (2/4) Exclude cut with ID _6267_210_2084_1_1527764658526_3894730_375-219887-0_sp1.1 from training. Duration: 0.6090625 2023-10-03 15:13:49,184 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6788_210_1388_1_1527898698_4562021_180-75288-0 from training. Duration: 0.99 2023-10-03 15:13:53,170 WARNING [train.py:1204] (2/4) Exclude cut with ID _8030_210_7497_1_1528444886781_3531089_262-86750-0_sp1.1 from training. Duration: 0.736375 2023-10-03 15:13:53,192 WARNING [train.py:1204] (2/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_934-325498-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 15:13:54,487 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_456-22141-0_sp0.9 from training. Duration: 0.97775 2023-10-03 15:13:56,378 WARNING [train.py:1204] (2/4) Exclude cut with ID _49643_210_7989_1_1534640244224_3939031_598-161926-0_sp1.1 from training. Duration: 0.8181875 2023-10-03 15:13:59,115 WARNING [train.py:1204] (2/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_345-49721-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 15:14:00,529 WARNING [train.py:1204] (2/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_440-141811-0_sp1.1 from training. Duration: 0.863625 2023-10-03 15:14:00,616 WARNING [train.py:1204] (2/4) Exclude cut with ID _64048_210_10626_1_1535198382802_3835179_394-212120-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 15:14:02,047 WARNING [train.py:1204] (2/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_91-277684-0 from training. Duration: 0.74 2023-10-03 15:14:05,673 WARNING [train.py:1204] (2/4) Exclude cut with ID _23038_210_9570_1_1532001584158_3652419_191-346343-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 15:14:07,578 WARNING [train.py:1204] (2/4) Exclude cut with ID _15140_210_9288_1_1530791388450_4880149_351-347974-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 15:14:07,611 WARNING [train.py:1204] (2/4) Exclude cut with ID _58605_210_12210_1_1534723195939_3904259_48-269402-0_sp1.1 from training. Duration: 0.87275 2023-10-03 15:14:07,617 WARNING [train.py:1204] (2/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_401-53423-0_sp0.9 from training. Duration: 0.611125 2023-10-03 15:14:10,733 WARNING [train.py:1204] (2/4) Exclude cut with ID _42659_210_2070_1_1533607442765_3120380_158-49004-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 15:14:11,207 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.self_attn1.whiten.whitening_limit, batch_count=1311593.3333333333, ans=22.5 2023-10-03 15:14:12,043 WARNING [train.py:1204] (2/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_81-75849-0_sp1.1 from training. Duration: 0.3545625 2023-10-03 15:14:13,452 WARNING [train.py:1204] (2/4) Exclude cut with ID _27052_210_5986_1_1532329197187_3585009_314-106499-0_sp1.1 from training. Duration: 0.836375 2023-10-03 15:14:14,206 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.3.encoder.layers.0.conv_module1.whiten, num_groups=1, num_channels=512, metric=3.77 vs. limit=15.0 2023-10-03 15:14:14,861 WARNING [train.py:1204] (2/4) Exclude cut with ID _4626_210_3532_1_1526817652717_3916859_446-206656-0_sp0.9 from training. Duration: 0.8444375 2023-10-03 15:14:16,254 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_53378_210_17282_1_1534221071_5769509_158-114960-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 15:14:18,962 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_3171_210_2087_1_1526109084_2330007_37-64334-0_sp0.9 from training. Duration: 0.911125 2023-10-03 15:14:18,977 WARNING [train.py:1204] (2/4) Exclude cut with ID _5410_210_1388_1_1527397055795_1719020_39-112221-0 from training. Duration: 0.69 2023-10-03 15:14:18,995 WARNING [train.py:1204] (2/4) Exclude cut with ID _8064_210_5399_1_1528372299291_4282973_4-91037-0_sp0.9 from training. Duration: 0.97775 2023-10-03 15:14:19,016 WARNING [train.py:1204] (2/4) Exclude cut with ID ID0079W0443-717795 from training. Duration: 0.541 2023-10-03 15:14:20,217 INFO [train.py:1046] (2/4) Epoch 38, batch 200, loss[loss=0.1475, simple_loss=0.2254, pruned_loss=0.03478, over 23431.00 frames. ], tot_loss[loss=0.159, simple_loss=0.2398, pruned_loss=0.03915, over 3024625.24 frames. ], batch size: 134, lr: 2.70e-03, grad_scale: 16.0 2023-10-03 15:14:22,965 WARNING [train.py:1204] (2/4) Exclude cut with ID _34734_210_4525_1_1533365977542_4448720_766-297928-0_sp1.1 from training. Duration: 0.936375 2023-10-03 15:14:25,890 WARNING [train.py:1204] (2/4) Exclude cut with ID _51471_210_1889_1_1534399391390_3094250_310-337793-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 15:14:25,911 WARNING [train.py:1204] (2/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_678-159307-0_sp0.9 from training. Duration: 0.9444375 2023-10-03 15:14:29,167 WARNING [train.py:1204] (2/4) Exclude cut with ID _50093_210_4220_1_1534417141207_3432299_259-322252-0 from training. Duration: 0.92 2023-10-03 15:14:29,231 WARNING [train.py:1204] (2/4) Exclude cut with ID _47790_210_16187_1_1533862814066_4207519_67-103444-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 15:14:30,478 WARNING [train.py:1204] (2/4) Exclude cut with ID _40551_210_10635_1_1533776424632_3416179_311-247578-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 15:14:31,898 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_4633_210_2040_1_1526824163_4145343_416-76117-0 from training. Duration: 0.95 2023-10-03 15:14:32,625 WARNING [train.py:1204] (2/4) Exclude cut with ID _5166_210_2084_1_1527073197160_4087993_331-117829-0_sp0.9 from training. Duration: 0.688875 2023-10-03 15:14:33,965 WARNING [train.py:1204] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_299-247059-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 15:14:35,392 WARNING [train.py:1204] (2/4) Exclude cut with ID _64002_210_9570_1_1535198605157_38979_2-240845-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 15:14:38,646 WARNING [train.py:1204] (2/4) Exclude cut with ID _4681_210_3525_1_1527250689464_120889_15-126760-0_sp0.9 from training. Duration: 0.9 2023-10-03 15:14:38,657 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_21500_210_2054_1_1532149310_2716758_256-156611-0_sp1.1 from training. Duration: 0.936375 2023-10-03 15:14:38,676 WARNING [train.py:1204] (2/4) Exclude cut with ID _16908_210_5724_1_1531119157248_3772700_624-203729-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 15:14:50,206 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.4.encoder.layers.0.self_attn2.whiten, num_groups=1, num_channels=384, metric=13.83 vs. limit=22.5 2023-10-03 15:14:56,581 WARNING [train.py:1204] (2/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_605-219732-0_sp1.1 from training. Duration: 0.8545625 2023-10-03 15:14:56,609 WARNING [train.py:1204] (2/4) Exclude cut with ID _44718_210_5816_1_1534050051869_6469030_380-167520-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 15:14:58,505 WARNING [train.py:1204] (2/4) Exclude cut with ID _19896_210_1883_1_1531709022662_6649569_94-229927-0_sp1.1 from training. Duration: 0.8818125 2023-10-03 15:14:58,582 WARNING [train.py:1204] (2/4) Exclude cut with ID _19514_210_9259_1_1531530071330_5084230_519-174300-0_sp1.1 from training. Duration: 0.963625 2023-10-03 15:15:01,108 WARNING [train.py:1204] (2/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_340-174176-0_sp0.9 from training. Duration: 0.5444375 2023-10-03 15:15:01,111 WARNING [train.py:1204] (2/4) Exclude cut with ID _8263_210_1664_1_1528439452891_970220_68-181805-0_sp1.1 from training. Duration: 0.5818125 2023-10-03 15:15:03,800 WARNING [train.py:1204] (2/4) Exclude cut with ID _3120_210_3525_1_1526036143112_4072980_348-257723-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 15:15:05,762 WARNING [train.py:1204] (2/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_453-133983-0_sp1.1 from training. Duration: 0.7090625 2023-10-03 15:15:06,439 WARNING [train.py:1204] (2/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_17-86060-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 15:15:07,887 WARNING [train.py:1204] (2/4) Exclude cut with ID _29877_210_6228_1_1532397791106_5276149_228-184657-0_sp1.1 from training. Duration: 0.97275 2023-10-03 15:15:07,998 WARNING [train.py:1204] (2/4) Exclude cut with ID _18516_210_9206_1_1531704653206_3692406_198-334870-0 from training. Duration: 0.92 2023-10-03 15:15:09,343 WARNING [train.py:1204] (2/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_340-174176-0_sp1.1 from training. Duration: 0.4454375 2023-10-03 15:15:09,357 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_3133_210_2087_1_1526106446_2324690_14-139186-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 15:15:11,426 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.1.balancer1.prob, batch_count=1311860.0, ans=0.125 2023-10-03 15:15:12,718 WARNING [train.py:1204] (2/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_126-29104-0_sp0.9 from training. Duration: 0.9444375 2023-10-03 15:15:12,904 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.nonlin_attention.balancer.prob, batch_count=1311860.0, ans=0.125 2023-10-03 15:15:16,988 WARNING [train.py:1204] (2/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_84-10310-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 15:15:23,806 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_15517_210_6753_1_1530954008_3487336_79-273076-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 15:15:23,848 WARNING [train.py:1204] (2/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_371-18169-0_sp0.9 from training. Duration: 0.9555625 2023-10-03 15:15:26,832 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.conv_module1.balancer2.prob, batch_count=1311926.6666666667, ans=0.125 2023-10-03 15:15:30,041 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_68702_210_23689_1_1535799577_3420068_170-92788-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 15:15:31,457 WARNING [train.py:1204] (2/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_78-182912-0 from training. Duration: 0.59 2023-10-03 15:15:32,820 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_3019_210_2028_1_1526044245_3264865_297-12384-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 15:15:32,825 WARNING [train.py:1204] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_147-111137-0_sp1.1 from training. Duration: 0.863625 2023-10-03 15:15:32,830 WARNING [train.py:1204] (2/4) Exclude cut with ID _28323_210_16187_1_1532480395667_4100300_117-8859-0_sp1.1 from training. Duration: 0.97275 2023-10-03 15:15:34,202 INFO [train.py:1046] (2/4) Epoch 38, batch 250, loss[loss=0.1473, simple_loss=0.235, pruned_loss=0.02983, over 24287.00 frames. ], tot_loss[loss=0.1585, simple_loss=0.2388, pruned_loss=0.03905, over 3402042.63 frames. ], batch size: 61, lr: 2.69e-03, grad_scale: 16.0 2023-10-03 15:15:34,259 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7762_210_1794_1_1528199508_4057408_419-125009-0_sp1.1 from training. Duration: 0.7545625 2023-10-03 15:15:34,353 WARNING [train.py:1204] (2/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_268-87909-0 from training. Duration: 0.99 2023-10-03 15:15:36,095 WARNING [train.py:1204] (2/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_118-311357-0_sp1.1 from training. Duration: 0.92725 2023-10-03 15:15:36,108 WARNING [train.py:1204] (2/4) Exclude cut with ID ID0377W0003-870225 from training. Duration: 0.852 2023-10-03 15:15:38,030 WARNING [train.py:1204] (2/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_56-5838-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 15:15:39,392 WARNING [train.py:1204] (2/4) Exclude cut with ID _14603_210_4220_1_1530615688441_3097319_90-31383-0_sp0.9 from training. Duration: 0.9666875 2023-10-03 15:15:41,314 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_31583_210_3852_1_1532595851_4024413_58-86657-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 15:15:41,352 WARNING [train.py:1204] (2/4) Exclude cut with ID _30304_210_6319_1_1532476827261_7582060_344-113875-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 15:15:44,035 WARNING [train.py:1204] (2/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_278-104676-0_sp1.1 from training. Duration: 0.8909375 2023-10-03 15:15:45,289 WARNING [train.py:1204] (2/4) Exclude cut with ID _55844_210_2346_1_1534471212569_7625707_764-2387-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 15:15:46,752 WARNING [train.py:1204] (2/4) Exclude cut with ID _27430_210_7770_1_1532604689375_1378590_157-14963-0_sp1.1 from training. Duration: 0.963625 2023-10-03 15:15:49,167 INFO [scaling.py:1022] (2/4) Whitening: name=encoder_embed.out_whiten, num_groups=1, num_channels=192, metric=7.14 vs. limit=8.0 2023-10-03 15:15:49,619 WARNING [train.py:1204] (2/4) Exclude cut with ID _7160_210_1876_1_1527940923231_3703799_282-322611-0_sp1.1 from training. Duration: 0.7909375 2023-10-03 15:15:55,724 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.3.encoder.layers.1.feed_forward3.out_whiten, num_groups=1, num_channels=512, metric=13.96 vs. limit=15.0 2023-10-03 15:15:59,008 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.554e+02 1.823e+02 2.031e+02 2.288e+02 2.955e+02, threshold=4.062e+02, percent-clipped=0.0 2023-10-03 15:16:01,075 WARNING [train.py:1204] (2/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_190-138640-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 15:16:02,486 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_33348_210_5632_1_1533086912_3324030_63-270922-0_sp1.1 from training. Duration: 0.97275 2023-10-03 15:16:02,510 WARNING [train.py:1204] (2/4) Exclude cut with ID _54871_210_22356_1_1534665798253_3856359_135-184281-0_sp1.1 from training. Duration: 0.9 2023-10-03 15:16:05,992 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.ff2_skip_rate, batch_count=1312126.6666666667, ans=0.0 2023-10-03 15:16:10,411 WARNING [train.py:1204] (2/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_277-210798-0_sp1.1 from training. Duration: 0.5 2023-10-03 15:16:11,730 WARNING [train.py:1204] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_402-253027-0_sp0.9 from training. Duration: 0.6 2023-10-03 15:16:11,984 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.conv_module2.balancer2.prob, batch_count=1312126.6666666667, ans=0.125 2023-10-03 15:16:13,107 WARNING [train.py:1204] (2/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_540-275685-0_sp0.9 from training. Duration: 0.988875 2023-10-03 15:16:13,149 WARNING [train.py:1204] (2/4) Exclude cut with ID _59493_210_17282_1_1535078419077_3225879_118-18960-0_sp1.1 from training. Duration: 0.936375 2023-10-03 15:16:13,226 WARNING [train.py:1204] (2/4) Exclude cut with ID _6194_210_1534_1_1527511826160_3941194_321-148641-0_sp1.1 from training. Duration: 0.5818125 2023-10-03 15:16:13,232 WARNING [train.py:1204] (2/4) Exclude cut with ID _61133_210_9969_1_1535196777227_3696409_333-270325-0_sp1.1 from training. Duration: 0.8454375 2023-10-03 15:16:15,137 WARNING [train.py:1204] (2/4) Exclude cut with ID _49376_210_5986_1_1533985278732_3370740_307-145221-0_sp1.1 from training. Duration: 0.936375 2023-10-03 15:16:19,362 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6846_210_6753_1_1527850767_4295158_2-315073-0_sp0.9 from training. Duration: 0.97775 2023-10-03 15:16:20,831 WARNING [train.py:1204] (2/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_17-25045-0 from training. Duration: 0.59 2023-10-03 15:16:20,850 WARNING [train.py:1204] (2/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_503-333510-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 15:16:22,245 WARNING [train.py:1204] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_238-245655-0_sp1.1 from training. Duration: 0.736375 2023-10-03 15:16:22,266 WARNING [train.py:1204] (2/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_329-82235-0_sp0.9 from training. Duration: 0.911125 2023-10-03 15:16:22,271 WARNING [train.py:1204] (2/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_325-113798-0_sp0.9 from training. Duration: 0.9444375 2023-10-03 15:16:22,521 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.feed_forward1.out_proj.dropout_p, batch_count=1312193.3333333333, ans=0.1 2023-10-03 15:16:23,621 WARNING [train.py:1204] (2/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_395-37222-0_sp0.9 from training. Duration: 0.8444375 2023-10-03 15:16:25,011 WARNING [train.py:1204] (2/4) Exclude cut with ID _64135_210_2253_1_1535677267876_7608972_498-5039-0_sp1.1 from training. Duration: 0.7090625 2023-10-03 15:16:25,020 WARNING [train.py:1204] (2/4) Exclude cut with ID _14922_210_6228_1_1530707435273_3693310_303-228738-0_sp0.9 from training. Duration: 0.9333125 2023-10-03 15:16:25,161 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_577-220899-0_sp1.1 from training. Duration: 0.92725 2023-10-03 15:16:27,923 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_3475_210_2028_1_1526193578_3622569_377-166133-0_sp1.1 from training. Duration: 0.7818125 2023-10-03 15:16:27,949 WARNING [train.py:1204] (2/4) Exclude cut with ID _49376_210_5986_1_1533985278732_3370740_272-313512-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 15:16:30,719 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7762_210_1794_1_1528199508_4057408_422-227034-0_sp0.9 from training. Duration: 0.811125 2023-10-03 15:16:34,193 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.bypass_mid.scale_min, batch_count=1312260.0, ans=0.2 2023-10-03 15:16:35,439 WARNING [train.py:1204] (2/4) Exclude cut with ID _18895_210_6228_1_1531357274234_3784930_74-341428-0_sp1.1 from training. Duration: 0.92725 2023-10-03 15:16:39,332 WARNING [train.py:1204] (2/4) Exclude cut with ID _44902_210_16187_1_1533888006062_3792078_149-67212-0_sp1.1 from training. Duration: 0.7909375 2023-10-03 15:16:43,439 WARNING [train.py:1204] (2/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_243-126020-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 15:16:44,775 WARNING [train.py:1204] (2/4) Exclude cut with ID _54218_210_12210_1_1534413646388_4409259_259-6561-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 15:16:48,232 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_41452_210_7291_1_1533437687_3941281_405-11559-0 from training. Duration: 0.91 2023-10-03 15:16:48,327 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_20120_210_6753_1_1531643035_5482859_759-225084-0_sp1.1 from training. Duration: 0.97275 2023-10-03 15:16:48,340 WARNING [train.py:1204] (2/4) Exclude cut with ID _14747_210_8341_1_1530685797463_3654089_147-131229-0_sp0.9 from training. Duration: 0.8444375 2023-10-03 15:16:49,537 INFO [train.py:1046] (2/4) Epoch 38, batch 300, loss[loss=0.1434, simple_loss=0.2085, pruned_loss=0.03917, over 23426.00 frames. ], tot_loss[loss=0.1572, simple_loss=0.2359, pruned_loss=0.03924, over 3672142.39 frames. ], batch size: 285, lr: 2.69e-03, grad_scale: 16.0 2023-10-03 15:16:49,633 WARNING [train.py:1204] (2/4) Exclude cut with ID _14867_210_1968_1_1530684179890_3846969_319-250868-0 from training. Duration: 0.99 2023-10-03 15:16:49,660 WARNING [train.py:1204] (2/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_580-93798-0_sp0.9 from training. Duration: 0.62225 2023-10-03 15:16:51,051 WARNING [train.py:1204] (2/4) Exclude cut with ID _9679_210_7497_1_1529129212906_4363929_512-313889-0_sp0.9 from training. Duration: 0.9 2023-10-03 15:16:51,061 WARNING [train.py:1204] (2/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_18-189198-0 from training. Duration: 0.9 2023-10-03 15:16:55,208 WARNING [train.py:1204] (2/4) Exclude cut with ID _49755_210_18053_1_1534581005846_3218709_137-64179-0_sp1.1 from training. Duration: 0.92725 2023-10-03 15:16:56,585 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_48049_210_5632_1_1533864553_3519937_306-283794-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 15:16:59,385 WARNING [train.py:1204] (2/4) Exclude cut with ID _43552_210_8913_1_1533726031059_3523429_25-120161-0_sp1.1 from training. Duration: 0.8909375 2023-10-03 15:16:59,446 WARNING [train.py:1204] (2/4) Exclude cut with ID _8833_210_5219_1_1528592665210_7553059_319-349181-0 from training. Duration: 0.99 2023-10-03 15:17:02,746 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_296-332325-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 15:17:02,833 WARNING [train.py:1204] (2/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_436-186311-0_sp1.1 from training. Duration: 0.5909375 2023-10-03 15:17:04,673 WARNING [train.py:1204] (2/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_348-127789-0 from training. Duration: 0.85 2023-10-03 15:17:04,678 WARNING [train.py:1204] (2/4) Exclude cut with ID _3992_210_1747_1_1526644816031_3731060_443-195183-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 15:17:09,200 WARNING [train.py:1204] (2/4) Exclude cut with ID _3465_210_3525_1_1526175667060_3574049_308-231735-0_sp1.1 from training. Duration: 0.663625 2023-10-03 15:17:12,867 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.2.encoder.layers.0.feed_forward3.out_whiten, num_groups=1, num_channels=384, metric=12.64 vs. limit=15.0 2023-10-03 15:17:13,408 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6786_210_1388_1_1527839699_4194495_378-46070-0_sp1.1 from training. Duration: 0.6545625 2023-10-03 15:17:13,458 WARNING [train.py:1204] (2/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_63-101996-0 from training. Duration: 0.88 2023-10-03 15:17:16,950 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_2541_210_1794_1_1525590269_3084765_156-173713-0 from training. Duration: 0.81 2023-10-03 15:17:18,241 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_217-239796-0_sp1.1 from training. Duration: 0.963625 2023-10-03 15:17:19,600 WARNING [train.py:1204] (2/4) Exclude cut with ID _21878_210_3525_1_1531738772002_7264330_530-329400-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 15:17:19,820 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.1.conv_module1.balancer1.max_abs, batch_count=1312460.0, ans=10.0 2023-10-03 15:17:21,068 WARNING [train.py:1204] (2/4) Exclude cut with ID _21783_210_11357_1_1531742458730_3762679_150-62920-0_sp1.1 from training. Duration: 0.963625 2023-10-03 15:17:21,068 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_5522_210_3273_1_1527249606_3909096_366-172508-0 from training. Duration: 0.66 2023-10-03 15:17:21,070 WARNING [train.py:1204] (2/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_303-340446-0_sp1.1 from training. Duration: 0.8181875 2023-10-03 15:17:22,443 WARNING [train.py:1204] (2/4) Exclude cut with ID _8699_210_2069_1_1528630346831_3960409_160-24408-0_sp1.1 from training. Duration: 0.82725 2023-10-03 15:17:22,731 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.conv_module1.balancer2.prob, batch_count=1312460.0, ans=0.125 2023-10-03 15:17:23,881 WARNING [train.py:1204] (2/4) Exclude cut with ID _41314_210_12651_1_1533459476543_3766460_299-187801-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 15:17:23,916 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_34886_210_10633_1_1533203882_1755661_92-345288-0_sp1.1 from training. Duration: 0.97275 2023-10-03 15:17:28,169 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_5438_210_1794_1_1527336687_2600009_104-288274-0_sp1.1 from training. Duration: 0.52725 2023-10-03 15:17:28,174 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_154-316450-0 from training. Duration: 0.76 2023-10-03 15:17:28,240 WARNING [train.py:1204] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_23-189234-0_sp0.9 from training. Duration: 0.92225 2023-10-03 15:17:32,207 WARNING [train.py:1204] (2/4) Exclude cut with ID _4779_210_2084_1_1527318022944_3914893_346-168820-0_sp1.1 from training. Duration: 0.963625 2023-10-03 15:17:32,295 WARNING [train.py:1204] (2/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_5-55893-0 from training. Duration: 0.55 2023-10-03 15:17:34,225 WARNING [train.py:1204] (2/4) Exclude cut with ID _20455_210_5219_1_1531565996775_8098100_180-24517-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 15:17:39,709 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_3133_210_2087_1_1526106446_2324690_243-143119-0_sp0.9 from training. Duration: 0.8555625 2023-10-03 15:17:40,618 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.0.layers.0.conv_module1.whiten, num_groups=1, num_channels=192, metric=11.35 vs. limit=15.0 2023-10-03 15:17:42,408 WARNING [train.py:1204] (2/4) Exclude cut with ID _65585_210_12210_1_1535450451584_3764098_293-212045-0_sp1.1 from training. Duration: 0.92725 2023-10-03 15:17:42,412 WARNING [train.py:1204] (2/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_577-332600-0 from training. Duration: 0.57 2023-10-03 15:17:47,127 WARNING [train.py:1204] (2/4) Exclude cut with ID _30039_210_13270_1_1532484612900_2725150_104-289874-0_sp1.1 from training. Duration: 0.963625 2023-10-03 15:17:47,134 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_3172_210_2087_1_1526292929_3987865_197-97762-0_sp1.1 from training. Duration: 0.6818125 2023-10-03 15:17:49,839 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_256-150908-0_sp1.1 from training. Duration: 0.963625 2023-10-03 15:17:51,268 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_296-287678-0_sp1.1 from training. Duration: 0.863625 2023-10-03 15:17:52,591 WARNING [train.py:1204] (2/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_443-30034-0 from training. Duration: 0.64 2023-10-03 15:17:52,623 WARNING [train.py:1204] (2/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_494-199538-0_sp0.9 from training. Duration: 0.8333125 2023-10-03 15:17:52,673 WARNING [train.py:1204] (2/4) Exclude cut with ID _19708_210_9206_1_1532136650733_4238364_61-162325-0_sp1.1 from training. Duration: 0.97275 2023-10-03 15:17:54,035 WARNING [train.py:1204] (2/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_475-127455-0 from training. Duration: 0.7 2023-10-03 15:17:55,412 WARNING [train.py:1204] (2/4) Exclude cut with ID _48677_210_5663_1_1533902405919_3892989_188-11581-0_sp1.1 from training. Duration: 0.963625 2023-10-03 15:17:55,434 WARNING [train.py:1204] (2/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_136-8381-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 15:17:56,813 WARNING [train.py:1204] (2/4) Exclude cut with ID _55328_210_27767_1_1534580973260_4088170_270-229555-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 15:17:58,269 WARNING [train.py:1204] (2/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_372-266951-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 15:17:58,330 WARNING [train.py:1204] (2/4) Exclude cut with ID _47790_210_16187_1_1533862814066_4207519_14-242149-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 15:17:59,042 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.nonlin_attention.whiten1.whitening_limit, batch_count=1312593.3333333333, ans=10.0 2023-10-03 15:18:02,275 INFO [train.py:1046] (2/4) Epoch 38, batch 350, loss[loss=0.154, simple_loss=0.2251, pruned_loss=0.0415, over 23671.00 frames. ], tot_loss[loss=0.1568, simple_loss=0.2357, pruned_loss=0.039, over 3909719.12 frames. ], batch size: 232, lr: 2.69e-03, grad_scale: 16.0 2023-10-03 15:18:04,261 WARNING [train.py:1204] (2/4) Exclude cut with ID _7877_210_6014_1_1528286358099_3750136_159-76512-0_sp1.1 from training. Duration: 0.936375 2023-10-03 15:18:04,263 WARNING [train.py:1204] (2/4) Exclude cut with ID _8263_210_1664_1_1528438953494_294100_40-204731-0_sp1.1 from training. Duration: 0.4545625 2023-10-03 15:18:06,202 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8278_210_2550_1_1528637099_4220413_694-238413-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 15:18:12,199 WARNING [train.py:1204] (2/4) Exclude cut with ID _47278_210_12041_1_1533902418349_3576110_421-172710-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 15:18:14,980 WARNING [train.py:1204] (2/4) Exclude cut with ID _14970_210_15709_1_1530773543493_1715730_215-282680-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 15:18:16,399 WARNING [train.py:1204] (2/4) Exclude cut with ID _64639_210_5816_1_1535367619437_3528000_306-149449-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 15:18:17,894 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_28-291982-0 from training. Duration: 0.97 2023-10-03 15:18:19,758 WARNING [train.py:1204] (2/4) Exclude cut with ID _55134_210_21982_1_1534590028377_3882170_412-15870-0_sp1.1 from training. Duration: 0.936375 2023-10-03 15:18:19,792 WARNING [train.py:1204] (2/4) Exclude cut with ID _13684_210_7497_1_1530619354863_3598359_312-235606-0 from training. Duration: 0.6 2023-10-03 15:18:21,334 WARNING [train.py:1204] (2/4) Exclude cut with ID _37112_210_2575_1_1532997007673_7268610_347-238993-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 15:18:22,611 WARNING [train.py:1204] (2/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_121-284152-0 from training. Duration: 0.59 2023-10-03 15:18:23,956 WARNING [train.py:1204] (2/4) Exclude cut with ID _2941_210_2577_1_1526199673150_2489759_228-2961-0_sp1.1 from training. Duration: 0.97275 2023-10-03 15:18:25,275 WARNING [train.py:1204] (2/4) Exclude cut with ID _28323_210_16187_1_1532480395667_4100300_531-24157-0 from training. Duration: 0.98 2023-10-03 15:18:26,888 WARNING [train.py:1204] (2/4) Exclude cut with ID _32704_210_8954_1_1533017488840_6747370_266-329755-0_sp0.9 from training. Duration: 0.988875 2023-10-03 15:18:28,150 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.627e+02 1.973e+02 2.217e+02 2.536e+02 3.904e+02, threshold=4.435e+02, percent-clipped=0.0 2023-10-03 15:18:29,581 WARNING [train.py:1204] (2/4) Exclude cut with ID _6653_210_2629_1_1527919172441_6732679_1038-146421-0_sp1.1 from training. Duration: 0.97275 2023-10-03 15:18:29,653 WARNING [train.py:1204] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_391-22148-0_sp0.9 from training. Duration: 0.8555625 2023-10-03 15:18:31,529 WARNING [train.py:1204] (2/4) Exclude cut with ID _64521_210_14699_1_1535243427259_3815328_543-341117-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 15:18:31,540 WARNING [train.py:1204] (2/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_52-168712-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 15:18:31,585 WARNING [train.py:1204] (2/4) Exclude cut with ID _33920_210_4220_1_1533257851500_3084399_198-334063-0_sp1.1 from training. Duration: 0.8545625 2023-10-03 15:18:32,894 WARNING [train.py:1204] (2/4) Exclude cut with ID _50093_210_4220_1_1534417141207_3432299_69-111987-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 15:18:32,939 WARNING [train.py:1204] (2/4) Exclude cut with ID _14821_210_8750_1_1530680503991_6829359_189-297451-0_sp0.9 from training. Duration: 0.611125 2023-10-03 15:18:34,387 WARNING [train.py:1204] (2/4) Exclude cut with ID _8030_210_7497_1_1528444886781_3531089_262-86750-0_sp0.9 from training. Duration: 0.9 2023-10-03 15:18:34,392 WARNING [train.py:1204] (2/4) Exclude cut with ID _24612_210_8913_1_1531998054193_3806359_294-191196-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 15:18:35,267 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.bypass.skip_rate, batch_count=1312793.3333333333, ans=0.09899494936611666 2023-10-03 15:18:40,934 WARNING [train.py:1204] (2/4) Exclude cut with ID _16909_210_5724_1_1531292079912_4118919_707-342896-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 15:18:40,957 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_9460_210_4211_1_1528954587_10049667_572-37035-0_sp0.9 from training. Duration: 0.888875 2023-10-03 15:18:41,009 WARNING [train.py:1204] (2/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_762-109886-0_sp1.1 from training. Duration: 0.82725 2023-10-03 15:18:41,031 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_14953_210_2151_1_1530701710_5616165_519-326393-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 15:18:43,055 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.2.encoder.layers.0.feed_forward1.out_whiten, num_groups=1, num_channels=384, metric=5.54 vs. limit=15.0 2023-10-03 15:18:45,232 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.1.self_attn_weights.pos_emb_skip_rate, batch_count=1312793.3333333333, ans=0.0 2023-10-03 15:18:46,433 WARNING [train.py:1204] (2/4) Exclude cut with ID _25914_210_6400_1_1532141317058_644740_52-96808-0 from training. Duration: 0.91 2023-10-03 15:18:46,437 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_68095_210_20185_1_1535718226_8888855_1079-94525-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 15:18:51,064 WARNING [train.py:1204] (2/4) Exclude cut with ID _14860_210_4915_1_1530702020340_7343240_746-3647-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 15:18:51,065 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_70379_210_7925_1_1535941455_557455_49-101720-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 15:18:52,289 WARNING [train.py:1204] (2/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_8-307291-0_sp1.1 from training. Duration: 0.936375 2023-10-03 15:18:53,685 WARNING [train.py:1204] (2/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_117-290916-0 from training. Duration: 0.51 2023-10-03 15:18:56,553 WARNING [train.py:1204] (2/4) Exclude cut with ID _22020_210_2461_1_1531724164513_6326900_944-339840-0_sp1.1 from training. Duration: 0.963625 2023-10-03 15:18:56,643 WARNING [train.py:1204] (2/4) Exclude cut with ID IC0515W0043-254911 from training. Duration: 0.976 2023-10-03 15:18:57,957 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_3910_210_1794_1_1526742306_334530_28-238909-0 from training. Duration: 0.81 2023-10-03 15:18:57,971 WARNING [train.py:1204] (2/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_269-267275-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 15:18:58,213 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.balancer1.prob, batch_count=1312860.0, ans=0.125 2023-10-03 15:19:01,200 WARNING [train.py:1204] (2/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_72-251255-0_sp1.1 from training. Duration: 0.8545625 2023-10-03 15:19:01,211 WARNING [train.py:1204] (2/4) Exclude cut with ID _14886_210_4220_1_1530702020355_3516190_175-229073-0 from training. Duration: 0.45 2023-10-03 15:19:04,015 WARNING [train.py:1204] (2/4) Exclude cut with ID _40519_210_13794_1_1533715228655_7337390_921-209452-0_sp1.1 from training. Duration: 0.963625 2023-10-03 15:19:06,450 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.2.encoder.layers.1.whiten, num_groups=1, num_channels=384, metric=4.86 vs. limit=12.0 2023-10-03 15:19:07,064 WARNING [train.py:1204] (2/4) Exclude cut with ID _29857_210_20034_1_1532395886753_3798160_335-163562-0_sp0.9 from training. Duration: 0.9444375 2023-10-03 15:19:07,165 WARNING [train.py:1204] (2/4) Exclude cut with ID _58583_210_5236_1_1534726835266_3574249_384-58445-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 15:19:09,065 WARNING [train.py:1204] (2/4) Exclude cut with ID _44011_210_2046_1_1533722401691_3631310_96-315656-0_sp1.1 from training. Duration: 0.963625 2023-10-03 15:19:09,077 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_68484_210_1794_1_1535849804_7678097_549-170613-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 15:19:10,529 WARNING [train.py:1204] (2/4) Exclude cut with ID _37892_210_19596_1_1533198677897_3964058_214-148058-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 15:19:13,345 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.0.feed_forward2.hidden_balancer.prob, batch_count=1312926.6666666667, ans=0.125 2023-10-03 15:19:14,509 WARNING [train.py:1204] (2/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_415-78792-0_sp0.9 from training. Duration: 0.9 2023-10-03 15:19:17,102 INFO [train.py:1046] (2/4) Epoch 38, batch 400, loss[loss=0.1516, simple_loss=0.2285, pruned_loss=0.03737, over 23442.00 frames. ], tot_loss[loss=0.1565, simple_loss=0.2354, pruned_loss=0.03878, over 4086900.55 frames. ], batch size: 134, lr: 2.69e-03, grad_scale: 32.0 2023-10-03 15:19:17,182 WARNING [train.py:1204] (2/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_383-205833-0_sp1.1 from training. Duration: 0.863625 2023-10-03 15:19:17,270 WARNING [train.py:1204] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_167-137255-0 from training. Duration: 0.64 2023-10-03 15:19:17,278 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8275_210_4198_1_1528455320_3892571_484-154786-0_sp1.1 from training. Duration: 0.963625 2023-10-03 15:19:18,618 WARNING [train.py:1204] (2/4) Exclude cut with ID _40284_210_2084_1_1533513679814_3704279_396-319412-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 15:19:20,581 WARNING [train.py:1204] (2/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_280-120942-0_sp0.9 from training. Duration: 0.8555625 2023-10-03 15:19:20,626 WARNING [train.py:1204] (2/4) Exclude cut with ID _45630_210_10556_1_1533949174498_3902049_558-240889-0_sp1.1 from training. Duration: 0.92725 2023-10-03 15:19:23,423 WARNING [train.py:1204] (2/4) Exclude cut with ID _56235_210_10640_1_1534557335235_4161099_197-307458-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 15:19:23,514 WARNING [train.py:1204] (2/4) Exclude cut with ID _16909_210_5724_1_1531292079912_4118919_163-254843-0_sp1.1 from training. Duration: 0.92725 2023-10-03 15:19:26,121 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_103-84010-0 from training. Duration: 0.69 2023-10-03 15:19:28,748 WARNING [train.py:1204] (2/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_401-158239-0 from training. Duration: 0.71 2023-10-03 15:19:28,749 WARNING [train.py:1204] (2/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_899-233658-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 15:19:28,859 WARNING [train.py:1204] (2/4) Exclude cut with ID _23825_210_13449_1_1531915313526_3497150_240-271367-0 from training. Duration: 0.97 2023-10-03 15:19:30,154 WARNING [train.py:1204] (2/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_424-134619-0_sp1.1 from training. Duration: 0.92725 2023-10-03 15:19:32,373 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.feed_forward2.hidden_balancer.prob, batch_count=1313060.0, ans=0.125 2023-10-03 15:19:36,701 WARNING [train.py:1204] (2/4) Exclude cut with ID _44831_210_13427_1_1533816011549_3689035_32-188147-0_sp1.1 from training. Duration: 0.87275 2023-10-03 15:19:36,704 WARNING [train.py:1204] (2/4) Exclude cut with ID _59170_210_6721_1_1534983633960_4203335_314-63879-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 15:19:36,908 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.ff2_skip_rate, batch_count=1313060.0, ans=0.0 2023-10-03 15:19:38,101 WARNING [train.py:1204] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_602-275260-0 from training. Duration: 0.82 2023-10-03 15:19:38,139 WARNING [train.py:1204] (2/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_511-306767-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 15:19:38,165 WARNING [train.py:1204] (2/4) Exclude cut with ID _41817_210_18835_1_1533434477160_7423049_565-201775-0_sp1.1 from training. Duration: 0.92725 2023-10-03 15:19:38,175 WARNING [train.py:1204] (2/4) Exclude cut with ID _28317_210_12742_1_1532480302381_8510325_778-173151-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 15:19:39,607 WARNING [train.py:1204] (2/4) Exclude cut with ID _14998_210_5219_1_1530705582080_7474970_680-51001-0_sp1.1 from training. Duration: 0.963625 2023-10-03 15:19:41,085 WARNING [train.py:1204] (2/4) Exclude cut with ID IC0677W0059-334891 from training. Duration: 0.896 2023-10-03 15:19:43,030 WARNING [train.py:1204] (2/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_370-331600-0 from training. Duration: 0.94 2023-10-03 15:19:47,405 WARNING [train.py:1204] (2/4) Exclude cut with ID _66840_210_5219_1_1535452168426_4196349_446-240192-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 15:19:48,835 WARNING [train.py:1204] (2/4) Exclude cut with ID _65242_210_18628_1_1535369393717_3823149_38-6732-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 15:19:48,890 WARNING [train.py:1204] (2/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_302-2550-0 from training. Duration: 0.81 2023-10-03 15:19:50,205 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_43802_210_2290_1_1533516203_8829504_100-348441-0 from training. Duration: 0.91 2023-10-03 15:19:54,760 WARNING [train.py:1204] (2/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_329-285801-0_sp1.1 from training. Duration: 0.82725 2023-10-03 15:19:55,516 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.2.encoder.layers.0.feed_forward1.out_whiten, num_groups=1, num_channels=384, metric=7.19 vs. limit=15.0 2023-10-03 15:19:57,521 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_30167_210_3528_1_1532428012_4772375_343-334712-0_sp1.1 from training. Duration: 0.936375 2023-10-03 15:20:03,224 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7543_210_6753_1_1528543318_4552_1-85072-0 from training. Duration: 0.83 2023-10-03 15:20:07,205 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7002_210_1794_1_1527935634_4034444_487-236501-0_sp1.1 from training. Duration: 0.6 2023-10-03 15:20:09,860 WARNING [train.py:1204] (2/4) Exclude cut with ID _7160_210_1876_1_1527940923231_3703799_282-322611-0 from training. Duration: 0.87 2023-10-03 15:20:11,185 WARNING [train.py:1204] (2/4) Exclude cut with ID _67335_210_5219_1_1535525975652_4275229_570-262983-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 15:20:12,583 WARNING [train.py:1204] (2/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_625-338546-0_sp1.1 from training. Duration: 0.8 2023-10-03 15:20:12,619 WARNING [train.py:1204] (2/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_176-56120-0 from training. Duration: 0.83 2023-10-03 15:20:16,025 WARNING [train.py:1204] (2/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_963-187109-0_sp1.1 from training. Duration: 0.9 2023-10-03 15:20:16,244 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=1313260.0, ans=0.1 2023-10-03 15:20:18,775 WARNING [train.py:1204] (2/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_121-284152-0_sp0.9 from training. Duration: 0.6555625 2023-10-03 15:20:20,179 WARNING [train.py:1204] (2/4) Exclude cut with ID _45021_210_9994_1_1533619751094_3330288_104-81711-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 15:20:21,663 WARNING [train.py:1204] (2/4) Exclude cut with ID _51382_210_23862_1_1534492801838_3650104_129-343914-0_sp1.1 from training. Duration: 0.936375 2023-10-03 15:20:21,692 WARNING [train.py:1204] (2/4) Exclude cut with ID _14970_210_15709_1_1530775547361_1966965_8-169924-0 from training. Duration: 0.92 2023-10-03 15:20:21,750 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder_embed.convnext.hidden_balancer.prob, batch_count=1313260.0, ans=0.125 2023-10-03 15:20:24,843 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_2295_210_2087_1_1525500993_2407863_234-83104-0_sp1.1 from training. Duration: 0.563625 2023-10-03 15:20:24,897 WARNING [train.py:1204] (2/4) Exclude cut with ID _14687_210_5854_1_1530703838259_3664889_115-239046-0 from training. Duration: 0.67 2023-10-03 15:20:27,565 WARNING [train.py:1204] (2/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_690-126306-0_sp1.1 from training. Duration: 0.7090625 2023-10-03 15:20:27,575 WARNING [train.py:1204] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_625-102955-0_sp0.9 from training. Duration: 0.8555625 2023-10-03 15:20:28,989 WARNING [train.py:1204] (2/4) Exclude cut with ID _9389_210_1664_1_1529217337301_5289269_289-213417-0 from training. Duration: 0.89 2023-10-03 15:20:30,393 WARNING [train.py:1204] (2/4) Exclude cut with ID _13253_210_1925_1_1530757775819_3577873_191-144977-0_sp0.9 from training. Duration: 0.8666875 2023-10-03 15:20:31,637 INFO [train.py:1046] (2/4) Epoch 38, batch 450, loss[loss=0.1822, simple_loss=0.247, pruned_loss=0.05868, over 23677.00 frames. ], tot_loss[loss=0.1569, simple_loss=0.2361, pruned_loss=0.03881, over 4236551.71 frames. ], batch size: 179, lr: 2.69e-03, grad_scale: 16.0 2023-10-03 15:20:31,699 WARNING [train.py:1204] (2/4) Exclude cut with ID _21783_210_11357_1_1531742458730_3762679_325-155703-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 15:20:31,730 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_2295_210_2087_1_1525500993_2407863_116-238894-0_sp0.9 from training. Duration: 0.77775 2023-10-03 15:20:31,852 WARNING [train.py:1204] (2/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_94-126128-0 from training. Duration: 0.86 2023-10-03 15:20:33,426 WARNING [train.py:1204] (2/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_395-310220-0_sp0.9 from training. Duration: 0.988875 2023-10-03 15:20:33,514 WARNING [train.py:1204] (2/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_821-110852-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 15:20:34,830 WARNING [train.py:1204] (2/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_64-248359-0_sp1.1 from training. Duration: 0.62725 2023-10-03 15:20:34,840 WARNING [train.py:1204] (2/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_138-187031-0 from training. Duration: 0.99 2023-10-03 15:20:34,888 WARNING [train.py:1204] (2/4) Exclude cut with ID _4122_210_2084_1_1526713164417_4353280_528-206771-0_sp0.9 from training. Duration: 0.97775 2023-10-03 15:20:36,242 WARNING [train.py:1204] (2/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_844-96989-0_sp1.1 from training. Duration: 0.6454375 2023-10-03 15:20:39,942 WARNING [train.py:1204] (2/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_106-30782-0_sp1.1 from training. Duration: 0.72725 2023-10-03 15:20:50,032 WARNING [train.py:1204] (2/4) Exclude cut with ID _14683_210_5219_1_1530698352608_3974859_281-250040-0_sp1.1 from training. Duration: 0.936375 2023-10-03 15:20:50,075 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_46013_210_7234_1_1533776362_3744032_463-52378-0_sp0.9 from training. Duration: 0.9555625 2023-10-03 15:20:51,507 WARNING [train.py:1204] (2/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_299-34413-0 from training. Duration: 0.65 2023-10-03 15:20:53,321 WARNING [train.py:1204] (2/4) Exclude cut with ID _35799_210_5854_1_1533263487887_3529071_490-326847-0 from training. Duration: 0.99 2023-10-03 15:20:57,487 WARNING [train.py:1204] (2/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_589-9070-0_sp0.9 from training. Duration: 0.788875 2023-10-03 15:20:58,822 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.563e+02 1.857e+02 2.046e+02 2.242e+02 3.489e+02, threshold=4.091e+02, percent-clipped=0.0 2023-10-03 15:20:58,971 WARNING [train.py:1204] (2/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_884-29515-0_sp1.1 from training. Duration: 0.936375 2023-10-03 15:21:01,580 WARNING [train.py:1204] (2/4) Exclude cut with ID _15012_210_1968_1_1530856820429_5095350_474-119995-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 15:21:04,327 WARNING [train.py:1204] (2/4) Exclude cut with ID _65585_210_12210_1_1535450451584_3764098_211-97124-0_sp1.1 from training. Duration: 0.963625 2023-10-03 15:21:05,640 WARNING [train.py:1204] (2/4) Exclude cut with ID _33640_210_2655_1_1533117605479_4855469_666-224463-0_sp1.1 from training. Duration: 0.963625 2023-10-03 15:21:08,443 WARNING [train.py:1204] (2/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_35-153886-0 from training. Duration: 0.84 2023-10-03 15:21:08,488 WARNING [train.py:1204] (2/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_210-106047-0 from training. Duration: 0.97 2023-10-03 15:21:10,303 WARNING [train.py:1204] (2/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_479-125611-0 from training. Duration: 0.96 2023-10-03 15:21:10,332 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_9417_210_1794_1_1529235261_4614131_427-153963-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 15:21:12,265 WARNING [train.py:1204] (2/4) Exclude cut with ID _23818_210_6228_1_1531917063884_3676920_133-170571-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 15:21:12,321 WARNING [train.py:1204] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_602-275260-0_sp1.1 from training. Duration: 0.7454375 2023-10-03 15:21:13,689 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9029W0121-509521 from training. Duration: 0.896 2023-10-03 15:21:13,696 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9029W0434-509821 from training. Duration: 0.64 2023-10-03 15:21:13,719 WARNING [train.py:1204] (2/4) Exclude cut with ID _23038_210_9570_1_1532001584158_3652419_289-307034-0_sp1.1 from training. Duration: 0.936375 2023-10-03 15:21:16,437 WARNING [train.py:1204] (2/4) Exclude cut with ID _29345_210_4381_1_1532689168263_3860169_149-257268-0_sp0.9 from training. Duration: 0.9 2023-10-03 15:21:17,828 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_405-339986-0_sp1.1 from training. Duration: 0.463625 2023-10-03 15:21:21,125 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.feed_forward3.hidden_balancer.prob, batch_count=1313526.6666666667, ans=0.125 2023-10-03 15:21:22,381 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_2655_210_2087_1_1525688959_3897400_437-337690-0_sp1.1 from training. Duration: 0.57275 2023-10-03 15:21:22,408 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7543_210_6753_1_1528541907_1399322_165-323619-0_sp1.1 from training. Duration: 0.62725 2023-10-03 15:21:22,470 WARNING [train.py:1204] (2/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_508-335369-0_sp1.1 from training. Duration: 0.436375 2023-10-03 15:21:24,240 WARNING [train.py:1204] (2/4) Exclude cut with ID _7852_210_5219_1_1528286471316_8139400_358-287492-0 from training. Duration: 0.96 2023-10-03 15:21:25,648 WARNING [train.py:1204] (2/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_322-217220-0_sp0.9 from training. Duration: 0.9555625 2023-10-03 15:21:27,272 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.conv_module1.balancer2.prob, batch_count=1313526.6666666667, ans=0.125 2023-10-03 15:21:28,455 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_3171_210_2087_1_1526109084_2330007_66-260583-0_sp0.9 from training. Duration: 0.8 2023-10-03 15:21:29,707 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8558_210_1410_1_1528505225_7613406_131-329524-0_sp1.1 from training. Duration: 0.7181875 2023-10-03 15:21:31,149 WARNING [train.py:1204] (2/4) Exclude cut with ID _9235_210_5219_1_1528891154253_8046120_30-326681-0 from training. Duration: 0.83 2023-10-03 15:21:35,380 WARNING [train.py:1204] (2/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_58-68956-0_sp0.9 from training. Duration: 0.92225 2023-10-03 15:21:35,452 WARNING [train.py:1204] (2/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_255-312654-0 from training. Duration: 0.91 2023-10-03 15:21:35,512 WARNING [train.py:1204] (2/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_99-167392-0 from training. Duration: 0.61 2023-10-03 15:21:36,869 WARNING [train.py:1204] (2/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_94-126128-0_sp0.9 from training. Duration: 0.9555625 2023-10-03 15:21:39,849 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.conv_skip_rate, batch_count=1313593.3333333333, ans=0.0 2023-10-03 15:21:41,712 WARNING [train.py:1204] (2/4) Exclude cut with ID _68635_210_9317_1_1535770813410_4315870_187-192817-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 15:21:42,906 WARNING [train.py:1204] (2/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_277-204970-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 15:21:43,008 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6163_210_6400_1_1527506746_4263068_283-137811-0_sp0.9 from training. Duration: 0.8555625 2023-10-03 15:21:44,871 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9144W0121-566446 from training. Duration: 0.896 2023-10-03 15:21:46,211 INFO [train.py:1046] (2/4) Epoch 38, batch 500, loss[loss=0.162, simple_loss=0.2528, pruned_loss=0.03563, over 24559.00 frames. ], tot_loss[loss=0.1582, simple_loss=0.2376, pruned_loss=0.03942, over 4335961.35 frames. ], batch size: 71, lr: 2.69e-03, grad_scale: 16.0 2023-10-03 15:21:49,013 WARNING [train.py:1204] (2/4) Exclude cut with ID _49371_210_12071_1_1533952761021_3812190_351-315519-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 15:21:50,428 WARNING [train.py:1204] (2/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_115-326778-0_sp1.1 from training. Duration: 0.8454375 2023-10-03 15:21:52,274 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_17292_210_6753_1_1531125388_2943734_436-198965-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 15:21:52,283 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9165W0117-576751 from training. Duration: 0.896 2023-10-03 15:21:53,665 WARNING [train.py:1204] (2/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_212-149947-0 from training. Duration: 0.73 2023-10-03 15:21:53,673 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_13757_210_3528_1_1530440682_4107239_65-167544-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 15:21:55,612 WARNING [train.py:1204] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_700-183549-0_sp0.9 from training. Duration: 0.7555625 2023-10-03 15:21:59,879 WARNING [train.py:1204] (2/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_216-343007-0_sp0.9 from training. Duration: 0.7333125 2023-10-03 15:22:00,056 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.conv_module2.balancer2.min_abs, batch_count=1313726.6666666667, ans=0.5 2023-10-03 15:22:01,205 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7544_210_6753_1_1528587722_3576686_295-258479-0_sp1.1 from training. Duration: 0.763625 2023-10-03 15:22:02,616 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_365-287106-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 15:22:02,636 WARNING [train.py:1204] (2/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_300-3747-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 15:22:04,005 WARNING [train.py:1204] (2/4) Exclude cut with ID _14069_210_5632_1_1530512692033_4803390_487-303201-0_sp1.1 from training. Duration: 0.92725 2023-10-03 15:22:15,444 WARNING [train.py:1204] (2/4) Exclude cut with ID _14459_210_2228_1_1531443641946_7516700_668-123026-0_sp1.1 from training. Duration: 0.936375 2023-10-03 15:22:15,476 WARNING [train.py:1204] (2/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_108-53929-0_sp1.1 from training. Duration: 0.536375 2023-10-03 15:22:17,266 WARNING [train.py:1204] (2/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_266-137104-0_sp0.9 from training. Duration: 0.72225 2023-10-03 15:22:17,301 WARNING [train.py:1204] (2/4) Exclude cut with ID _12302_210_5219_1_1529924446466_8107280_902-168619-0_sp1.1 from training. Duration: 0.936375 2023-10-03 15:22:17,327 WARNING [train.py:1204] (2/4) Exclude cut with ID _59502_210_12210_1_1535107024698_1835139_146-162495-0 from training. Duration: 0.92 2023-10-03 15:22:18,637 WARNING [train.py:1204] (2/4) Exclude cut with ID _3465_210_3525_1_1526175667060_3574049_412-287526-0_sp1.1 from training. Duration: 0.6545625 2023-10-03 15:22:20,142 WARNING [train.py:1204] (2/4) Exclude cut with ID _7160_210_1876_1_1527940923231_3703799_120-150943-0_sp0.9 from training. Duration: 0.9 2023-10-03 15:22:21,520 WARNING [train.py:1204] (2/4) Exclude cut with ID _15037_210_2253_1_1530752294104_3267650_296-192597-0_sp1.1 from training. Duration: 0.736375 2023-10-03 15:22:21,543 WARNING [train.py:1204] (2/4) Exclude cut with ID _13252_210_1925_1_1530671372047_3336909_397-216497-0_sp1.1 from training. Duration: 0.87275 2023-10-03 15:22:21,564 WARNING [train.py:1204] (2/4) Exclude cut with ID _40094_210_16380_1_1533294005773_3641159_76-269320-0_sp1.1 from training. Duration: 0.936375 2023-10-03 15:22:22,900 WARNING [train.py:1204] (2/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_449-15508-0 from training. Duration: 0.82 2023-10-03 15:22:25,960 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.conv_module2.balancer1.min_positive, batch_count=1313793.3333333333, ans=0.025 2023-10-03 15:22:27,465 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9309W0194-640321 from training. Duration: 0.64 2023-10-03 15:22:30,200 WARNING [train.py:1204] (2/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_284-312178-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 15:22:30,321 WARNING [train.py:1204] (2/4) Exclude cut with ID _29853_210_7497_1_1532505600851_6642110_401-55937-0_sp1.1 from training. Duration: 0.92725 2023-10-03 15:22:32,995 WARNING [train.py:1204] (2/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_716-24918-0_sp1.1 from training. Duration: 0.92725 2023-10-03 15:22:33,014 WARNING [train.py:1204] (2/4) Exclude cut with ID _19637_210_4220_1_1531537167676_3205060_314-305638-0_sp1.1 from training. Duration: 0.92725 2023-10-03 15:22:33,052 WARNING [train.py:1204] (2/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_475-127455-0_sp1.1 from training. Duration: 0.636375 2023-10-03 15:22:33,269 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.1.balancer1.prob, batch_count=1313860.0, ans=0.125 2023-10-03 15:22:34,446 WARNING [train.py:1204] (2/4) Exclude cut with ID _5444_210_2151_1_1527418808464_5312210_581-315411-0 from training. Duration: 0.62 2023-10-03 15:22:36,125 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=1313860.0, ans=0.1 2023-10-03 15:22:37,310 WARNING [train.py:1204] (2/4) Exclude cut with ID _44902_210_16187_1_1533888006062_3792078_149-67212-0_sp0.9 from training. Duration: 0.9666875 2023-10-03 15:22:38,673 WARNING [train.py:1204] (2/4) Exclude cut with ID _31602_210_5854_1_1532677021456_4458250_63-63253-0_sp1.1 from training. Duration: 0.963625 2023-10-03 15:22:41,516 WARNING [train.py:1204] (2/4) Exclude cut with ID _55209_210_1531_1_1534675828996_4597160_660-203992-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 15:22:46,116 WARNING [train.py:1204] (2/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_83-190624-0_sp1.1 from training. Duration: 0.92725 2023-10-03 15:22:49,877 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.bypass.skip_rate, batch_count=1313926.6666666667, ans=0.09899494936611666 2023-10-03 15:22:50,988 WARNING [train.py:1204] (2/4) Exclude cut with ID _58605_210_12210_1_1534723195939_3904259_154-139635-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 15:22:51,212 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.self_attn_weights.pos_emb_skip_rate, batch_count=1313926.6666666667, ans=0.0 2023-10-03 15:22:52,454 WARNING [train.py:1204] (2/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_164-224052-0 from training. Duration: 0.7 2023-10-03 15:22:53,717 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_1126-189071-0_sp1.1 from training. Duration: 0.963625 2023-10-03 15:22:53,737 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_15396_210_6890_1_1531308473_1938936_310-214789-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 15:22:57,050 WARNING [train.py:1204] (2/4) Exclude cut with ID _8395_210_1531_1_1528597871523_4411579_431-132470-0 from training. Duration: 0.74 2023-10-03 15:22:57,107 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_3780_210_2087_1_1526467760_2130483_170-259044-0_sp0.9 from training. Duration: 0.77775 2023-10-03 15:22:57,240 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.0.feed_forward1.hidden_balancer.prob, batch_count=1313926.6666666667, ans=0.125 2023-10-03 15:22:58,959 WARNING [train.py:1204] (2/4) Exclude cut with ID _12733_210_8341_1_1530167424556_3646380_537-230640-0_sp1.1 from training. Duration: 0.963625 2023-10-03 15:23:00,349 INFO [train.py:1046] (2/4) Epoch 38, batch 550, loss[loss=0.1483, simple_loss=0.2365, pruned_loss=0.03007, over 24628.00 frames. ], tot_loss[loss=0.1594, simple_loss=0.2391, pruned_loss=0.03984, over 4424743.55 frames. ], batch size: 65, lr: 2.69e-03, grad_scale: 16.0 2023-10-03 15:23:03,161 WARNING [train.py:1204] (2/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_159-281310-0 from training. Duration: 0.95 2023-10-03 15:23:04,602 WARNING [train.py:1204] (2/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_434-83000-0 from training. Duration: 0.98 2023-10-03 15:23:04,624 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_4405_210_1849_1_1526732205_3593939_493-32464-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 15:23:04,639 WARNING [train.py:1204] (2/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_342-100157-0 from training. Duration: 0.54 2023-10-03 15:23:05,977 WARNING [train.py:1204] (2/4) Exclude cut with ID _44778_210_2054_1_1533798127203_7443090_765-250133-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 15:23:05,982 WARNING [train.py:1204] (2/4) Exclude cut with ID _38670_210_15379_1_1533448810659_4391059_81-312245-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 15:23:06,057 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_50050_210_16223_1_1535435796_7290138_1273-247611-0_sp1.1 from training. Duration: 0.936375 2023-10-03 15:23:07,424 WARNING [train.py:1204] (2/4) Exclude cut with ID _44831_210_13427_1_1533816011549_3689035_399-18591-0_sp1.1 from training. Duration: 0.936375 2023-10-03 15:23:07,439 WARNING [train.py:1204] (2/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_557-324258-0_sp0.9 from training. Duration: 0.9 2023-10-03 15:23:08,684 WARNING [train.py:1204] (2/4) Exclude cut with ID _3465_210_3525_1_1526175667060_3574049_62-335552-0_sp1.1 from training. Duration: 0.7909375 2023-10-03 15:23:11,376 WARNING [train.py:1204] (2/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_673-264125-0_sp1.1 from training. Duration: 0.963625 2023-10-03 15:23:12,711 WARNING [train.py:1204] (2/4) Exclude cut with ID _8030_210_7497_1_1528444886781_3531089_262-86750-0 from training. Duration: 0.81 2023-10-03 15:23:12,741 WARNING [train.py:1204] (2/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_444-28041-0_sp1.1 from training. Duration: 0.77275 2023-10-03 15:23:19,258 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_34008_210_10431_1_1533345424_8183247_516-14858-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 15:23:19,270 WARNING [train.py:1204] (2/4) Exclude cut with ID _32704_210_8954_1_1533017488840_6747370_54-248218-0_sp1.1 from training. Duration: 0.936375 2023-10-03 15:23:19,844 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.5.encoder.layers.0.feed_forward1.out_whiten, num_groups=1, num_channels=256, metric=6.41 vs. limit=15.0 2023-10-03 15:23:20,753 WARNING [train.py:1204] (2/4) Exclude cut with ID _13253_210_1925_1_1530757775819_3577873_300-119782-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 15:23:22,049 WARNING [train.py:1204] (2/4) Exclude cut with ID _39290_210_4220_1_1533862778760_3075118_21-240703-0_sp1.1 from training. Duration: 0.936375 2023-10-03 15:23:26,465 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.614e+02 1.899e+02 2.084e+02 2.370e+02 3.616e+02, threshold=4.167e+02, percent-clipped=0.0 2023-10-03 15:23:27,819 WARNING [train.py:1204] (2/4) Exclude cut with ID 774-127930-0014-10412-0_sp1.1 from training. Duration: 0.95 2023-10-03 15:23:27,891 WARNING [train.py:1204] (2/4) Exclude cut with ID _67499_210_17282_1_1535509805220_3692430_20-179346-0 from training. Duration: 0.97 2023-10-03 15:23:29,774 WARNING [train.py:1204] (2/4) Exclude cut with ID _8365_210_2064_1_1528592311070_4677690_431-294102-0_sp0.9 from training. Duration: 0.988875 2023-10-03 15:23:30,016 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.conv_module1.balancer2.prob, batch_count=1314126.6666666667, ans=0.125 2023-10-03 15:23:33,092 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.4.encoder.layers.0.feed_forward2.out_whiten, num_groups=1, num_channels=384, metric=6.08 vs. limit=15.0 2023-10-03 15:23:34,005 WARNING [train.py:1204] (2/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_365-1492-0_sp1.1 from training. Duration: 0.82725 2023-10-03 15:23:35,322 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_105-114797-0_sp0.9 from training. Duration: 0.9333125 2023-10-03 15:23:36,681 WARNING [train.py:1204] (2/4) Exclude cut with ID _6274_210_2084_1_1527937583837_7494020_769-168302-0_sp0.9 from training. Duration: 0.788875 2023-10-03 15:23:39,456 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_40340_210_9024_1_1533433916_4132944_320-270277-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 15:23:39,470 WARNING [train.py:1204] (2/4) Exclude cut with ID ID0203W0003-781711 from training. Duration: 0.958 2023-10-03 15:23:40,710 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_30434_210_13049_1_1532519384_981746_52-12932-0_sp1.1 from training. Duration: 0.936375 2023-10-03 15:23:40,797 WARNING [train.py:1204] (2/4) Exclude cut with ID _8263_210_1664_1_1528438953494_294100_40-204731-0_sp0.9 from training. Duration: 0.5555625 2023-10-03 15:23:43,694 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1032-236863-0_sp0.9 from training. Duration: 0.9333125 2023-10-03 15:23:43,753 WARNING [train.py:1204] (2/4) Exclude cut with ID _8394_210_6702_1_1528590504316_3540189_335-274780-0_sp0.9 from training. Duration: 0.8666875 2023-10-03 15:23:43,760 WARNING [train.py:1204] (2/4) Exclude cut with ID _66873_210_5517_1_1535454136506_4293370_654-52713-0_sp1.1 from training. Duration: 0.7 2023-10-03 15:23:45,168 WARNING [train.py:1204] (2/4) Exclude cut with ID _5250_210_5219_1_1527249561806_8692110_742-267856-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 15:23:45,272 WARNING [train.py:1204] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_74-48717-0 from training. Duration: 0.58 2023-10-03 15:23:48,555 WARNING [train.py:1204] (2/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_18-46756-0 from training. Duration: 0.79 2023-10-03 15:23:50,504 WARNING [train.py:1204] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_612-56061-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 15:23:50,519 WARNING [train.py:1204] (2/4) Exclude cut with ID _56423_210_5854_1_1534896061049_3165969_412-2403-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 15:23:51,687 WARNING [train.py:1204] (2/4) Exclude cut with ID _62574_210_2655_1_1535180407058_3933249_120-196848-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 15:23:51,688 WARNING [train.py:1204] (2/4) Exclude cut with ID _40519_210_13794_1_1533715228655_7337390_916-51000-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 15:23:53,174 WARNING [train.py:1204] (2/4) Exclude cut with ID _65263_210_22356_1_1535763598691_3921037_108-21902-0_sp1.1 from training. Duration: 0.9 2023-10-03 15:23:54,596 WARNING [train.py:1204] (2/4) Exclude cut with ID _4122_210_2084_1_1526713164417_4353280_528-206771-0_sp1.1 from training. Duration: 0.8 2023-10-03 15:23:59,188 WARNING [train.py:1204] (2/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_43-325657-0_sp1.1 from training. Duration: 0.92725 2023-10-03 15:23:59,237 WARNING [train.py:1204] (2/4) Exclude cut with ID _44659_210_2721_1_1534036120061_6614860_314-82041-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 15:24:00,551 WARNING [train.py:1204] (2/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_81-300212-0_sp0.9 from training. Duration: 0.6666875 2023-10-03 15:24:00,620 WARNING [train.py:1204] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_665-196155-0_sp0.9 from training. Duration: 0.7444375 2023-10-03 15:24:01,986 WARNING [train.py:1204] (2/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_140-336881-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 15:24:03,326 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6290_210_3273_1_1527595224_1282787_115-87095-0_sp0.9 from training. Duration: 0.611125 2023-10-03 15:24:03,376 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_53373_210_5399_1_1534229471_4715695_607-259685-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 15:24:04,735 WARNING [train.py:1204] (2/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_175-163894-0_sp1.1 from training. Duration: 0.67275 2023-10-03 15:24:04,751 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7002_210_1794_1_1527939683_5157286_1-281264-0_sp1.1 from training. Duration: 0.4 2023-10-03 15:24:11,542 WARNING [train.py:1204] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_605-257205-0 from training. Duration: 0.59 2023-10-03 15:24:12,789 INFO [train.py:1046] (2/4) Epoch 38, batch 600, loss[loss=0.137, simple_loss=0.211, pruned_loss=0.03147, over 23723.00 frames. ], tot_loss[loss=0.1593, simple_loss=0.2391, pruned_loss=0.0397, over 4504483.18 frames. ], batch size: 134, lr: 2.69e-03, grad_scale: 16.0 2023-10-03 15:24:14,469 WARNING [train.py:1204] (2/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_454-314901-0 from training. Duration: 0.46 2023-10-03 15:24:15,880 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_46032_210_14656_1_1533781412_4507969_287-130422-0_sp1.1 from training. Duration: 0.97275 2023-10-03 15:24:17,200 WARNING [train.py:1204] (2/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_186-309161-0_sp1.1 from training. Duration: 0.4909375 2023-10-03 15:24:17,232 WARNING [train.py:1204] (2/4) Exclude cut with ID _42659_210_2070_1_1533607442765_3120380_45-342307-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 15:24:23,891 WARNING [train.py:1204] (2/4) Exclude cut with ID _12555_210_8750_1_1530075594402_3780239_478-326818-0_sp1.1 from training. Duration: 0.963625 2023-10-03 15:24:25,192 WARNING [train.py:1204] (2/4) Exclude cut with ID _7606_210_4915_1_1528894253277_5080690_127-177761-0_sp1.1 from training. Duration: 0.5818125 2023-10-03 15:24:28,356 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6788_210_1388_1_1527898698_4562021_91-197981-0 from training. Duration: 0.83 2023-10-03 15:24:30,324 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8558_210_1410_1_1528505225_7613406_131-329524-0_sp0.9 from training. Duration: 0.87775 2023-10-03 15:24:33,022 WARNING [train.py:1204] (2/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_153-153537-0_sp1.1 from training. Duration: 0.82725 2023-10-03 15:24:33,158 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_17292_210_6753_1_1531125388_2943734_285-349105-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 15:24:36,038 WARNING [train.py:1204] (2/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_37-41919-0 from training. Duration: 0.87 2023-10-03 15:24:36,062 WARNING [train.py:1204] (2/4) Exclude cut with ID _60860_210_8254_1_1534896015875_3557169_172-294852-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 15:24:37,528 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.1.feed_forward1.out_proj.dropout_p, batch_count=1314393.3333333333, ans=0.1 2023-10-03 15:24:41,584 WARNING [train.py:1204] (2/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_146-276944-0 from training. Duration: 0.83 2023-10-03 15:24:44,355 WARNING [train.py:1204] (2/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_1254-44082-0_sp1.1 from training. Duration: 0.936375 2023-10-03 15:24:44,368 WARNING [train.py:1204] (2/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_1115-180350-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 15:24:44,389 WARNING [train.py:1204] (2/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_60-146189-0_sp1.1 from training. Duration: 0.8909375 2023-10-03 15:24:47,376 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.ff2_skip_rate, batch_count=1314460.0, ans=0.0 2023-10-03 15:24:50,869 WARNING [train.py:1204] (2/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_517-153649-0_sp1.1 from training. Duration: 0.92725 2023-10-03 15:24:50,882 WARNING [train.py:1204] (2/4) Exclude cut with ID _39571_210_13449_1_1533801584143_3651077_37-266257-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 15:24:50,934 WARNING [train.py:1204] (2/4) Exclude cut with ID _6653_210_2629_1_1527919172441_6732679_527-276118-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 15:24:55,815 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.self_attn_weights.pos_emb_skip_rate, batch_count=1314460.0, ans=0.0 2023-10-03 15:24:57,023 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7696_210_2040_1_1528207167_3716099_117-125434-0_sp1.1 from training. Duration: 0.6909375 2023-10-03 15:25:01,809 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_25767_210_2161_1_1532307483_3867748_478-156360-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 15:25:01,813 WARNING [train.py:1204] (2/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_177-49092-0_sp1.1 from training. Duration: 0.82725 2023-10-03 15:25:03,057 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_11305_210_1794_1_1529750428_7178148_680-317073-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 15:25:07,412 WARNING [train.py:1204] (2/4) Exclude cut with ID _53565_210_10635_1_1534337218335_3820259_287-18120-0 from training. Duration: 0.97 2023-10-03 15:25:12,703 WARNING [train.py:1204] (2/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_498-40153-0_sp1.1 from training. Duration: 0.57275 2023-10-03 15:25:13,949 WARNING [train.py:1204] (2/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_555-63066-0_sp1.1 from training. Duration: 0.963625 2023-10-03 15:25:19,370 WARNING [train.py:1204] (2/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_395-310220-0 from training. Duration: 0.89 2023-10-03 15:25:20,507 WARNING [train.py:1204] (2/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_937-208281-0_sp1.1 from training. Duration: 0.77275 2023-10-03 15:25:21,879 WARNING [train.py:1204] (2/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_120-211630-0 from training. Duration: 0.92 2023-10-03 15:25:21,916 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_454-276429-0_sp1.1 from training. Duration: 0.7909375 2023-10-03 15:25:23,219 WARNING [train.py:1204] (2/4) Exclude cut with ID _22826_210_9994_1_1531908201522_3279169_404-75889-0_sp1.1 from training. Duration: 0.8090625 2023-10-03 15:25:27,774 INFO [train.py:1046] (2/4) Epoch 38, batch 650, loss[loss=0.1757, simple_loss=0.2636, pruned_loss=0.04384, over 24674.00 frames. ], tot_loss[loss=0.1589, simple_loss=0.2382, pruned_loss=0.03981, over 4538015.45 frames. ], batch size: 73, lr: 2.69e-03, grad_scale: 16.0 2023-10-03 15:25:27,855 WARNING [train.py:1204] (2/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_282-136641-0_sp0.9 from training. Duration: 0.4666875 2023-10-03 15:25:27,974 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_809-17156-0_sp1.1 from training. Duration: 0.663625 2023-10-03 15:25:31,644 WARNING [train.py:1204] (2/4) Exclude cut with ID _9235_210_5219_1_1528891154253_8046120_1003-101638-0_sp0.9 from training. Duration: 0.811125 2023-10-03 15:25:33,047 WARNING [train.py:1204] (2/4) Exclude cut with ID _22826_210_9994_1_1531908201522_3279169_404-75889-0_sp0.9 from training. Duration: 0.988875 2023-10-03 15:25:34,430 WARNING [train.py:1204] (2/4) Exclude cut with ID _59288_210_5854_1_1535077757774_3412246_570-295255-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 15:25:34,746 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.conv_skip_rate, batch_count=1314660.0, ans=0.0 2023-10-03 15:25:35,922 WARNING [train.py:1204] (2/4) Exclude cut with ID _3465_210_3525_1_1526175667060_3574049_412-287526-0 from training. Duration: 0.72 2023-10-03 15:25:35,981 WARNING [train.py:1204] (2/4) Exclude cut with ID _44010_210_4525_1_1533884262964_4013620_649-79655-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 15:25:41,433 WARNING [train.py:1204] (2/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_57-270037-0_sp0.9 from training. Duration: 0.8555625 2023-10-03 15:25:41,434 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_34815_210_6388_1_1533381320_3571903_302-255963-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 15:25:44,716 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_47449_210_5399_1_1534076445_4006929_573-301852-0_sp1.1 from training. Duration: 0.97275 2023-10-03 15:25:48,138 WARNING [train.py:1204] (2/4) Exclude cut with ID 6709-74022-0004-86860-0_sp1.1 from training. Duration: 0.9409375 2023-10-03 15:25:49,546 WARNING [train.py:1204] (2/4) Exclude cut with ID _40519_210_13794_1_1533715228655_7337390_111-71991-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 15:25:50,852 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_30167_210_3528_1_1532428012_4772375_169-214186-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 15:25:52,899 WARNING [train.py:1204] (2/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_20-235313-0_sp1.1 from training. Duration: 0.963625 2023-10-03 15:25:54,227 WARNING [train.py:1204] (2/4) Exclude cut with ID _4681_210_3525_1_1527251040321_1235713_123-24428-0_sp0.9 from training. Duration: 0.5333125 2023-10-03 15:25:55,438 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.557e+02 1.877e+02 2.025e+02 2.307e+02 3.864e+02, threshold=4.051e+02, percent-clipped=0.0 2023-10-03 15:25:56,946 WARNING [train.py:1204] (2/4) Exclude cut with ID _31602_210_5854_1_1532677021456_4458250_444-198752-0_sp1.1 from training. Duration: 0.97275 2023-10-03 15:25:57,222 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.feed_forward2.hidden_balancer.prob, batch_count=1314793.3333333333, ans=0.125 2023-10-03 15:25:58,303 WARNING [train.py:1204] (2/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_9-279442-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 15:25:59,679 WARNING [train.py:1204] (2/4) Exclude cut with ID _6275_210_2084_1_1528110062213_7669049_973-198085-0_sp0.9 from training. Duration: 0.8333125 2023-10-03 15:25:59,761 WARNING [train.py:1204] (2/4) Exclude cut with ID _29853_210_7497_1_1532505600851_6642110_567-250221-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 15:26:01,621 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6545_210_2040_1_1527774670_4245258_407-48070-0_sp1.1 from training. Duration: 0.6909375 2023-10-03 15:26:03,053 WARNING [train.py:1204] (2/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_224-343295-0_sp0.9 from training. Duration: 0.611125 2023-10-03 15:26:03,068 WARNING [train.py:1204] (2/4) Exclude cut with ID IC0125W0011-61817 from training. Duration: 0.88 2023-10-03 15:26:03,076 WARNING [train.py:1204] (2/4) Exclude cut with ID _60113_210_16271_1_1535164212481_4386079_435-154371-0_sp1.1 from training. Duration: 0.97275 2023-10-03 15:26:03,088 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_25836_210_16554_1_1532138167_53197_3-9394-0_sp1.1 from training. Duration: 0.92725 2023-10-03 15:26:04,757 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.bypass_mid.scale_min, batch_count=1314793.3333333333, ans=0.2 2023-10-03 15:26:07,223 WARNING [train.py:1204] (2/4) Exclude cut with ID _29343_210_4381_1_1532602757752_3674109_57-55125-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 15:26:08,633 WARNING [train.py:1204] (2/4) Exclude cut with ID _56235_210_10640_1_1534557335235_4161099_267-23560-0_sp1.1 from training. Duration: 0.936375 2023-10-03 15:26:08,665 WARNING [train.py:1204] (2/4) Exclude cut with ID _30196_210_1664_1_1533708496522_7074140_1150-278867-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 15:26:09,493 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.1.encoder.layers.1.self_attn2.whiten, num_groups=1, num_channels=256, metric=10.46 vs. limit=22.5 2023-10-03 15:26:09,963 WARNING [train.py:1204] (2/4) Exclude cut with ID _6270_210_2084_1_1527850911573_3856420_26-277343-0_sp0.9 from training. Duration: 0.911125 2023-10-03 15:26:10,025 WARNING [train.py:1204] (2/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_325-62802-0 from training. Duration: 0.87 2023-10-03 15:26:11,355 WARNING [train.py:1204] (2/4) Exclude cut with ID _56524_210_9642_1_1534572011779_3619170_145-310842-0_sp1.1 from training. Duration: 0.9 2023-10-03 15:26:11,397 WARNING [train.py:1204] (2/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_860-96198-0_sp0.9 from training. Duration: 0.811125 2023-10-03 15:26:11,475 WARNING [train.py:1204] (2/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_17-25045-0_sp1.1 from training. Duration: 0.536375 2023-10-03 15:26:12,769 WARNING [train.py:1204] (2/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_349-69727-0_sp1.1 from training. Duration: 0.936375 2023-10-03 15:26:14,593 WARNING [train.py:1204] (2/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_580-93798-0_sp1.1 from training. Duration: 0.5090625 2023-10-03 15:26:16,044 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7762_210_1794_1_1528199508_4057408_419-125009-0 from training. Duration: 0.83 2023-10-03 15:26:17,948 WARNING [train.py:1204] (2/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_356-167816-0 from training. Duration: 0.72 2023-10-03 15:26:17,978 WARNING [train.py:1204] (2/4) Exclude cut with ID _29345_210_4381_1_1532689168263_3860169_225-97100-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 15:26:19,298 WARNING [train.py:1204] (2/4) Exclude cut with ID _14227_210_4381_1_1530701804286_3940259_374-194712-0_sp1.1 from training. Duration: 0.92725 2023-10-03 15:26:19,325 WARNING [train.py:1204] (2/4) Exclude cut with ID _40137_210_12210_1_1533290484046_3795180_249-187059-0_sp0.9 from training. Duration: 0.97775 2023-10-03 15:26:19,356 WARNING [train.py:1204] (2/4) Exclude cut with ID _22826_210_9994_1_1531908201522_3279169_389-317194-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 15:26:20,782 WARNING [train.py:1204] (2/4) Exclude cut with ID _27485_210_4916_1_1532739191677_3668269_239-122918-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 15:26:26,666 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_2245_210_2715_1_1525511548_3540254_128-117609-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 15:26:26,711 WARNING [train.py:1204] (2/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_160-276178-0_sp1.1 from training. Duration: 0.963625 2023-10-03 15:26:28,043 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_31920_210_10633_1_1532860009_1686094_1-279735-0_sp1.1 from training. Duration: 0.97275 2023-10-03 15:26:28,295 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.ff2_skip_rate, batch_count=1314926.6666666667, ans=0.0 2023-10-03 15:26:31,373 WARNING [train.py:1204] (2/4) Exclude cut with ID _51078_210_9070_1_1534161557424_3805250_325-262383-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 15:26:31,393 WARNING [train.py:1204] (2/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_643-52007-0_sp0.9 from training. Duration: 0.6333125 2023-10-03 15:26:31,442 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_51937_210_5014_1_1534573610_4022025_518-299248-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 15:26:37,017 WARNING [train.py:1204] (2/4) Exclude cut with ID _13252_210_1925_1_1530671372047_3336909_166-314607-0_sp1.1 from training. Duration: 0.4909375 2023-10-03 15:26:37,021 WARNING [train.py:1204] (2/4) Exclude cut with ID _44778_210_2054_1_1533798127203_7443090_1087-282939-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 15:26:37,057 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_23850_210_4238_1_1532232597_1632433_32-185135-0_sp1.1 from training. Duration: 0.7818125 2023-10-03 15:26:38,304 WARNING [train.py:1204] (2/4) Exclude cut with ID _59500_210_8254_1_1534809610680_3736009_145-89359-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 15:26:42,490 INFO [train.py:1046] (2/4) Epoch 38, batch 700, loss[loss=0.1519, simple_loss=0.2373, pruned_loss=0.03323, over 24370.00 frames. ], tot_loss[loss=0.1574, simple_loss=0.2364, pruned_loss=0.0392, over 4575013.13 frames. ], batch size: 61, lr: 2.69e-03, grad_scale: 16.0 2023-10-03 15:26:42,574 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_340-336408-0 from training. Duration: 0.93 2023-10-03 15:26:44,402 WARNING [train.py:1204] (2/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_9-21737-0 from training. Duration: 0.97 2023-10-03 15:26:46,489 WARNING [train.py:1204] (2/4) Exclude cut with ID _2220_210_1819_1_1525509109898_3658680_50-99837-0 from training. Duration: 0.54 2023-10-03 15:26:47,794 WARNING [train.py:1204] (2/4) Exclude cut with ID _30304_210_6319_1_1532476827261_7582060_879-299350-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 15:26:49,185 WARNING [train.py:1204] (2/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_905-160074-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 15:26:50,633 WARNING [train.py:1204] (2/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_395-73570-0 from training. Duration: 0.99 2023-10-03 15:26:55,194 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7264_210_4938_1_1527939911_8132095_800-91995-0_sp1.1 from training. Duration: 0.7818125 2023-10-03 15:26:56,016 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.1.encoder.layers.1.conv_module1.whiten, num_groups=1, num_channels=256, metric=11.82 vs. limit=15.0 2023-10-03 15:26:58,048 WARNING [train.py:1204] (2/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_77-311404-0_sp1.1 from training. Duration: 0.8545625 2023-10-03 15:26:58,308 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.self_attn_weights.pos_emb_skip_rate, batch_count=1315060.0, ans=0.0 2023-10-03 15:27:00,052 WARNING [train.py:1204] (2/4) Exclude cut with ID _13679_210_7497_1_1530534293662_3355098_107-80772-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 15:27:01,492 WARNING [train.py:1204] (2/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_224-343295-0_sp1.1 from training. Duration: 0.5 2023-10-03 15:27:01,532 WARNING [train.py:1204] (2/4) Exclude cut with ID _54871_210_22356_1_1534665798253_3856359_156-253482-0_sp1.1 from training. Duration: 0.963625 2023-10-03 15:27:04,327 WARNING [train.py:1204] (2/4) Exclude cut with ID _49728_210_12651_1_1534041034294_6788150_309-125532-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 15:27:05,840 WARNING [train.py:1204] (2/4) Exclude cut with ID _13913_210_9994_1_1530579615187_3383897_473-277682-0_sp0.9 from training. Duration: 0.4333125 2023-10-03 15:27:05,845 WARNING [train.py:1204] (2/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_3-276701-0_sp1.1 from training. Duration: 0.82725 2023-10-03 15:27:07,239 WARNING [train.py:1204] (2/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_282-136641-0 from training. Duration: 0.42 2023-10-03 15:27:07,418 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.conv_module1.balancer1.min_positive, batch_count=1315060.0, ans=0.025 2023-10-03 15:27:11,162 WARNING [train.py:1204] (2/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_127-566-0 from training. Duration: 0.84 2023-10-03 15:27:12,654 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6163_210_6400_1_1527506746_4263068_283-137811-0_sp1.1 from training. Duration: 0.7 2023-10-03 15:27:14,592 WARNING [train.py:1204] (2/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_34-183685-0_sp1.1 from training. Duration: 0.8909375 2023-10-03 15:27:16,757 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.conv_module1.balancer1.prob, batch_count=1315126.6666666667, ans=0.125 2023-10-03 15:27:17,635 WARNING [train.py:1204] (2/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_154-135696-0_sp0.9 from training. Duration: 0.888875 2023-10-03 15:27:20,363 WARNING [train.py:1204] (2/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_676-51014-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 15:27:21,817 WARNING [train.py:1204] (2/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_38-169920-0 from training. Duration: 0.91 2023-10-03 15:27:23,529 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=1315126.6666666667, ans=0.1 2023-10-03 15:27:24,053 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.2.encoder.layers.1.conv_module2.whiten, num_groups=1, num_channels=384, metric=6.67 vs. limit=15.0 2023-10-03 15:27:26,738 WARNING [train.py:1204] (2/4) Exclude cut with ID _6653_210_2629_1_1527919172441_6732679_747-114465-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 15:27:26,787 WARNING [train.py:1204] (2/4) Exclude cut with ID _8021_210_7994_1_1528376451358_3653069_100-140231-0_sp1.1 from training. Duration: 0.6090625 2023-10-03 15:27:28,109 WARNING [train.py:1204] (2/4) Exclude cut with ID _64135_210_2253_1_1535677267876_7608972_133-188266-0 from training. Duration: 0.81 2023-10-03 15:27:30,216 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.feed_forward2.hidden_balancer.prob, batch_count=1315193.3333333333, ans=0.125 2023-10-03 15:27:32,674 WARNING [train.py:1204] (2/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_714-310538-0_sp1.1 from training. Duration: 0.7818125 2023-10-03 15:27:34,118 WARNING [train.py:1204] (2/4) Exclude cut with ID _39867_210_9826_1_1533553222030_9344259_1134-288498-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 15:27:35,636 WARNING [train.py:1204] (2/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_211-237289-0_sp1.1 from training. Duration: 0.936375 2023-10-03 15:27:41,006 WARNING [train.py:1204] (2/4) Exclude cut with ID _59202_210_26984_1_1534816810612_3549174_137-318710-0_sp0.9 from training. Duration: 0.988875 2023-10-03 15:27:41,046 WARNING [train.py:1204] (2/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_86-66192-0 from training. Duration: 0.55 2023-10-03 15:27:43,874 WARNING [train.py:1204] (2/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_540-275685-0 from training. Duration: 0.89 2023-10-03 15:27:45,211 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8776_210_2040_1_1528638837_4088036_244-278763-0 from training. Duration: 0.9 2023-10-03 15:27:47,667 WARNING [train.py:1204] (2/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_9-219972-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 15:27:49,011 WARNING [train.py:1204] (2/4) Exclude cut with ID _44659_210_2721_1_1534036120061_6614860_656-283823-0_sp1.1 from training. Duration: 0.92725 2023-10-03 15:27:50,389 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_69630_210_7234_1_1535789601_661049_77-216385-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 15:27:53,064 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_19952_210_2161_1_1531875469_3851750_210-308919-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 15:27:53,069 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_4737_210_2040_1_1526998689_3293008_378-178650-0 from training. Duration: 0.95 2023-10-03 15:27:57,712 INFO [train.py:1046] (2/4) Epoch 38, batch 750, loss[loss=0.1471, simple_loss=0.2393, pruned_loss=0.02749, over 24629.00 frames. ], tot_loss[loss=0.1568, simple_loss=0.2358, pruned_loss=0.03889, over 4607762.68 frames. ], batch size: 73, lr: 2.69e-03, grad_scale: 16.0 2023-10-03 15:27:57,797 WARNING [train.py:1204] (2/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_224-343295-0 from training. Duration: 0.55 2023-10-03 15:27:57,819 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7002_210_1794_1_1527939683_5157286_184-229630-0 from training. Duration: 0.74 2023-10-03 15:27:59,061 WARNING [train.py:1204] (2/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_6-214815-0 from training. Duration: 0.85 2023-10-03 15:27:59,148 WARNING [train.py:1204] (2/4) Exclude cut with ID _8699_210_2069_1_1528630346831_3960409_160-24408-0 from training. Duration: 0.91 2023-10-03 15:28:00,517 WARNING [train.py:1204] (2/4) Exclude cut with ID _5585_210_2667_1_1527418813324_8007340_89-332072-0 from training. Duration: 0.64 2023-10-03 15:28:00,559 WARNING [train.py:1204] (2/4) Exclude cut with ID _5250_210_5219_1_1527249561806_8692110_528-259800-0_sp1.1 from training. Duration: 0.963625 2023-10-03 15:28:02,561 WARNING [train.py:1204] (2/4) Exclude cut with ID _55383_210_10635_1_1534468426172_2231669_134-128246-0 from training. Duration: 0.89 2023-10-03 15:28:03,935 WARNING [train.py:1204] (2/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_400-338274-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 15:28:05,396 WARNING [train.py:1204] (2/4) Exclude cut with ID _14224_210_4381_1_1530529062037_4264010_364-22817-0_sp0.9 from training. Duration: 0.788875 2023-10-03 15:28:06,803 WARNING [train.py:1204] (2/4) Exclude cut with ID _42863_210_9070_1_1533902398788_3696920_331-291057-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 15:28:08,208 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_5436_210_1794_1_1527331317_2541494_229-211016-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 15:28:09,497 WARNING [train.py:1204] (2/4) Exclude cut with ID _6456_210_1534_1_1527681634212_3181049_345-252877-0_sp0.9 from training. Duration: 0.711125 2023-10-03 15:28:09,521 WARNING [train.py:1204] (2/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_96-81499-0_sp1.1 from training. Duration: 0.92725 2023-10-03 15:28:10,124 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.5.encoder.layers.0.self_attn2.whiten, num_groups=1, num_channels=256, metric=12.43 vs. limit=22.5 2023-10-03 15:28:11,015 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_5575_210_1410_1_1527301305_2570170_342-265572-0_sp1.1 from training. Duration: 0.8545625 2023-10-03 15:28:11,092 WARNING [train.py:1204] (2/4) Exclude cut with ID _5444_210_2151_1_1527418808464_5312210_726-27165-0_sp0.9 from training. Duration: 0.7444375 2023-10-03 15:28:12,509 WARNING [train.py:1204] (2/4) Exclude cut with ID _22083_210_8254_1_1531702862439_3678970_304-327750-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 15:28:14,458 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.3.encoder.layers.0.feed_forward3.out_whiten, num_groups=1, num_channels=512, metric=12.32 vs. limit=15.0 2023-10-03 15:28:16,327 WARNING [train.py:1204] (2/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_245-226199-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 15:28:16,356 WARNING [train.py:1204] (2/4) Exclude cut with ID _42875_210_14856_1_1533704397487_4012433_102-199499-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 15:28:16,413 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_51225_210_3601_1_1534400634_2327383_55-301844-0 from training. Duration: 0.93 2023-10-03 15:28:18,866 WARNING [train.py:1204] (2/4) Exclude cut with ID _28317_210_12742_1_1532480302381_8510325_916-195848-0_sp1.1 from training. Duration: 0.836375 2023-10-03 15:28:19,978 WARNING [train.py:1204] (2/4) Exclude cut with ID _44718_210_5816_1_1534050051869_6469030_497-311999-0_sp1.1 from training. Duration: 0.936375 2023-10-03 15:28:21,374 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_34886_210_10633_1_1533203882_1755661_91-194765-0_sp1.1 from training. Duration: 0.936375 2023-10-03 15:28:22,820 WARNING [train.py:1204] (2/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_310-26186-0_sp0.9 from training. Duration: 0.77775 2023-10-03 15:28:24,077 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.543e+02 1.901e+02 2.100e+02 2.467e+02 3.467e+02, threshold=4.200e+02, percent-clipped=0.0 2023-10-03 15:28:24,186 WARNING [train.py:1204] (2/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_416-257730-0 from training. Duration: 0.65 2023-10-03 15:28:24,192 WARNING [train.py:1204] (2/4) Exclude cut with ID _9389_210_1664_1_1529217337301_5289269_209-154760-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 15:28:25,750 WARNING [train.py:1204] (2/4) Exclude cut with ID _4092_210_2151_1_1526643044449_3135759_227-213972-0 from training. Duration: 0.54 2023-10-03 15:28:26,016 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.ff2_skip_rate, batch_count=1315460.0, ans=0.0 2023-10-03 15:28:27,061 WARNING [train.py:1204] (2/4) Exclude cut with ID IC0679W0044-335852 from training. Duration: 0.768 2023-10-03 15:28:27,132 WARNING [train.py:1204] (2/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_42-186162-0 from training. Duration: 0.92 2023-10-03 15:28:27,146 WARNING [train.py:1204] (2/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_302-2550-0_sp0.9 from training. Duration: 0.9 2023-10-03 15:28:28,414 WARNING [train.py:1204] (2/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_356-31216-0_sp0.9 from training. Duration: 0.8333125 2023-10-03 15:28:31,696 WARNING [train.py:1204] (2/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_125-322172-0_sp1.1 from training. Duration: 0.8454375 2023-10-03 15:28:37,934 WARNING [train.py:1204] (2/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_141-89727-0_sp0.9 from training. Duration: 0.788875 2023-10-03 15:28:37,957 WARNING [train.py:1204] (2/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_647-337784-0_sp1.1 from training. Duration: 0.97275 2023-10-03 15:28:38,122 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.ff2_skip_rate, batch_count=1315460.0, ans=0.0 2023-10-03 15:28:39,124 WARNING [train.py:1204] (2/4) Exclude cut with ID _6493_210_1747_1_1528459210424_4012470_513-140967-0_sp1.1 from training. Duration: 0.7181875 2023-10-03 15:28:40,535 WARNING [train.py:1204] (2/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_87-266334-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 15:28:40,741 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.ff3_skip_rate, batch_count=1315526.6666666667, ans=0.0 2023-10-03 15:28:43,218 WARNING [train.py:1204] (2/4) Exclude cut with ID _26528_210_9070_1_1532570410842_3851310_526-261338-0_sp1.1 from training. Duration: 0.92725 2023-10-03 15:28:43,268 WARNING [train.py:1204] (2/4) Exclude cut with ID _14224_210_4381_1_1530529062037_4264010_380-343670-0 from training. Duration: 0.88 2023-10-03 15:28:43,318 WARNING [train.py:1204] (2/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_449-15508-0_sp1.1 from training. Duration: 0.7454375 2023-10-03 15:28:46,068 WARNING [train.py:1204] (2/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_372-6869-0_sp0.9 from training. Duration: 0.488875 2023-10-03 15:28:46,133 WARNING [train.py:1204] (2/4) Exclude cut with ID _26907_210_7234_1_1532221740694_3205263_337-2494-0_sp0.9 from training. Duration: 0.97775 2023-10-03 15:28:48,854 WARNING [train.py:1204] (2/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_10-100367-0_sp1.1 from training. Duration: 0.8909375 2023-10-03 15:28:50,508 WARNING [train.py:1204] (2/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_47-167373-0 from training. Duration: 0.92 2023-10-03 15:28:50,576 WARNING [train.py:1204] (2/4) Exclude cut with ID _28323_210_16187_1_1532480395667_4100300_628-282179-0_sp1.1 from training. Duration: 0.97275 2023-10-03 15:28:55,324 WARNING [train.py:1204] (2/4) Exclude cut with ID _20455_210_5219_1_1531565996775_8098100_213-280898-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 15:28:56,758 WARNING [train.py:1204] (2/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_383-316000-0_sp1.1 from training. Duration: 0.8181875 2023-10-03 15:28:58,064 WARNING [train.py:1204] (2/4) Exclude cut with ID _18513_210_9206_1_1531618215223_3862620_16-78180-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 15:28:59,560 WARNING [train.py:1204] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_330-327861-0_sp1.1 from training. Duration: 0.6090625 2023-10-03 15:29:03,867 WARNING [train.py:1204] (2/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_356-31216-0 from training. Duration: 0.75 2023-10-03 15:29:05,189 WARNING [train.py:1204] (2/4) Exclude cut with ID _25248_210_8750_1_1532062825941_5554180_255-148539-0_sp1.1 from training. Duration: 0.963625 2023-10-03 15:29:05,243 WARNING [train.py:1204] (2/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_269-154726-0_sp1.1 from training. Duration: 0.7818125 2023-10-03 15:29:06,866 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.conv_module1.balancer2.prob, batch_count=1315593.3333333333, ans=0.125 2023-10-03 15:29:08,134 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7002_210_1794_1_1527939683_5157286_163-87552-0_sp1.1 from training. Duration: 0.7818125 2023-10-03 15:29:08,165 WARNING [train.py:1204] (2/4) Exclude cut with ID _54871_210_22356_1_1534665798253_3856359_97-151896-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 15:29:09,662 WARNING [train.py:1204] (2/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_207-7026-0_sp1.1 from training. Duration: 0.97275 2023-10-03 15:29:09,682 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_92-46009-0_sp0.9 from training. Duration: 0.6 2023-10-03 15:29:11,385 INFO [train.py:1046] (2/4) Epoch 38, batch 800, loss[loss=0.1562, simple_loss=0.2481, pruned_loss=0.03213, over 24648.00 frames. ], tot_loss[loss=0.1566, simple_loss=0.2363, pruned_loss=0.03846, over 4651845.98 frames. ], batch size: 73, lr: 2.69e-03, grad_scale: 16.0 2023-10-03 15:29:19,461 WARNING [train.py:1204] (2/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_122-178847-0_sp1.1 from training. Duration: 0.97275 2023-10-03 15:29:19,466 WARNING [train.py:1204] (2/4) Exclude cut with ID _44778_210_2054_1_1533798127203_7443090_94-348956-0_sp1.1 from training. Duration: 0.92725 2023-10-03 15:29:20,900 WARNING [train.py:1204] (2/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_1018-244089-0_sp1.1 from training. Duration: 0.963625 2023-10-03 15:29:21,150 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.feed_forward3.hidden_balancer.prob, batch_count=1315660.0, ans=0.125 2023-10-03 15:29:22,176 WARNING [train.py:1204] (2/4) Exclude cut with ID _56235_210_10640_1_1534557335235_4161099_87-118423-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 15:29:22,267 WARNING [train.py:1204] (2/4) Exclude cut with ID _53713_210_24116_1_1534314487762_7416751_504-212292-0_sp1.1 from training. Duration: 0.92725 2023-10-03 15:29:22,293 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_446-150327-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 15:29:24,918 WARNING [train.py:1204] (2/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_682-245989-0_sp1.1 from training. Duration: 0.92725 2023-10-03 15:29:28,867 WARNING [train.py:1204] (2/4) Exclude cut with ID _35723_210_1794_1_1533088341043_628250_8-234524-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 15:29:30,164 WARNING [train.py:1204] (2/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_64-248359-0_sp0.9 from training. Duration: 0.7666875 2023-10-03 15:29:32,930 WARNING [train.py:1204] (2/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_49-97158-0 from training. Duration: 0.76 2023-10-03 15:29:32,993 WARNING [train.py:1204] (2/4) Exclude cut with ID _24314_210_4915_1_1532135078116_7170179_743-915-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 15:29:34,933 WARNING [train.py:1204] (2/4) Exclude cut with ID _61194_210_18628_1_1535007555628_3750909_353-323468-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 15:29:34,959 WARNING [train.py:1204] (2/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_387-131538-0_sp1.1 from training. Duration: 0.736375 2023-10-03 15:29:34,979 WARNING [train.py:1204] (2/4) Exclude cut with ID _44424_210_18163_1_1533623394848_1213110_79-187603-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 15:29:35,638 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.3.encoder.layers.1.self_attn_weights.whiten_keys, num_groups=8, num_channels=256, metric=3.95 vs. limit=6.0 2023-10-03 15:29:36,308 WARNING [train.py:1204] (2/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_86-19601-0 from training. Duration: 0.85 2023-10-03 15:29:36,346 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_50050_210_16223_1_1535435796_7290138_103-283529-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 15:29:36,846 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.5.encoder.layers.1.self_attn_weights.whiten_keys, num_groups=4, num_channels=128, metric=2.02 vs. limit=6.0 2023-10-03 15:29:37,634 WARNING [train.py:1204] (2/4) Exclude cut with ID _13913_210_9994_1_1530579615187_3383897_473-277682-0 from training. Duration: 0.39 2023-10-03 15:29:39,210 WARNING [train.py:1204] (2/4) Exclude cut with ID _42326_210_7770_1_1533814282531_3707269_368-68029-0_sp1.1 from training. Duration: 0.92725 2023-10-03 15:29:41,203 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_41428_210_2161_1_1533774184_3992532_446-339645-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 15:29:42,604 WARNING [train.py:1204] (2/4) Exclude cut with ID _69584_210_5219_1_1535785133775_4044450_325-7656-0_sp1.1 from training. Duration: 0.7818125 2023-10-03 15:29:43,897 WARNING [train.py:1204] (2/4) Exclude cut with ID _21632_210_2253_1_1531994001765_3744140_134-184387-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 15:29:45,301 WARNING [train.py:1204] (2/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_550-184707-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 15:29:45,330 WARNING [train.py:1204] (2/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_106-341780-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 15:29:49,453 WARNING [train.py:1204] (2/4) Exclude cut with ID _45871_210_5665_1_1533864614245_3296060_253-132552-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 15:29:49,497 WARNING [train.py:1204] (2/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_395-310220-0_sp1.1 from training. Duration: 0.8090625 2023-10-03 15:29:49,521 WARNING [train.py:1204] (2/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_795-33252-0_sp0.9 from training. Duration: 0.411125 2023-10-03 15:29:51,062 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=1315793.3333333333, ans=0.1 2023-10-03 15:29:53,223 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9002W0003-495962 from training. Duration: 0.946 2023-10-03 15:29:53,244 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_43-71989-0 from training. Duration: 0.57 2023-10-03 15:29:53,265 WARNING [train.py:1204] (2/4) Exclude cut with ID _8699_210_2069_1_1528630346831_3960409_155-39766-0_sp1.1 from training. Duration: 0.7090625 2023-10-03 15:29:53,278 WARNING [train.py:1204] (2/4) Exclude cut with ID _27894_210_12392_1_1532415496078_6902439_788-232684-0_sp1.1 from training. Duration: 0.97275 2023-10-03 15:29:54,477 WARNING [train.py:1204] (2/4) Exclude cut with ID _56494_210_4220_1_1535284837625_3033169_10-39296-0_sp1.1 from training. Duration: 0.92725 2023-10-03 15:29:54,502 WARNING [train.py:1204] (2/4) Exclude cut with ID _40901_210_5665_1_1533363789958_3856130_341-20819-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 15:29:59,953 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9029W0435-509822 from training. Duration: 0.896 2023-10-03 15:30:01,254 WARNING [train.py:1204] (2/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_51-117576-0 from training. Duration: 0.4 2023-10-03 15:30:02,634 WARNING [train.py:1204] (2/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_197-28164-0_sp1.1 from training. Duration: 0.72725 2023-10-03 15:30:04,097 WARNING [train.py:1204] (2/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_145-137790-0_sp1.1 from training. Duration: 0.7545625 2023-10-03 15:30:08,738 WARNING [train.py:1204] (2/4) Exclude cut with ID _47790_210_16187_1_1533862814066_4207519_2-39653-0_sp1.1 from training. Duration: 0.8909375 2023-10-03 15:30:12,146 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_3780_210_2087_1_1526467760_2130483_186-208240-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 15:30:13,485 WARNING [train.py:1204] (2/4) Exclude cut with ID _24055_210_12651_1_1532310907143_976979_92-59356-0 from training. Duration: 0.91 2023-10-03 15:30:13,528 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_3910_210_1794_1_1526742306_334530_28-238909-0_sp1.1 from training. Duration: 0.736375 2023-10-03 15:30:16,263 WARNING [train.py:1204] (2/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_269-154726-0 from training. Duration: 0.86 2023-10-03 15:30:21,758 WARNING [train.py:1204] (2/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_326-165900-0_sp0.9 from training. Duration: 0.7666875 2023-10-03 15:30:23,669 WARNING [train.py:1204] (2/4) Exclude cut with ID _5294_210_1747_1_1527249643928_3696179_470-276077-0_sp1.1 from training. Duration: 0.963625 2023-10-03 15:30:24,888 INFO [train.py:1046] (2/4) Epoch 38, batch 850, loss[loss=0.1647, simple_loss=0.2431, pruned_loss=0.04321, over 23614.00 frames. ], tot_loss[loss=0.1578, simple_loss=0.2377, pruned_loss=0.03898, over 4662997.93 frames. ], batch size: 149, lr: 2.69e-03, grad_scale: 16.0 2023-10-03 15:30:25,604 WARNING [train.py:1204] (2/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_372-131163-0 from training. Duration: 0.94 2023-10-03 15:30:25,660 WARNING [train.py:1204] (2/4) Exclude cut with ID _45297_210_2721_1_1533715351381_3264949_351-147136-0_sp1.1 from training. Duration: 0.7909375 2023-10-03 15:30:26,940 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_1072-12671-0_sp1.1 from training. Duration: 0.92725 2023-10-03 15:30:27,036 WARNING [train.py:1204] (2/4) Exclude cut with ID _15133_210_6228_1_1530794512074_3080379_54-198193-0 from training. Duration: 0.94 2023-10-03 15:30:27,060 WARNING [train.py:1204] (2/4) Exclude cut with ID _30196_210_1664_1_1533708496522_7074140_1194-270988-0_sp1.1 from training. Duration: 0.97275 2023-10-03 15:30:28,544 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.conv_module1.balancer1.max_abs, batch_count=1315993.3333333333, ans=10.0 2023-10-03 15:30:29,771 WARNING [train.py:1204] (2/4) Exclude cut with ID _50310_210_15488_1_1534068043345_3641080_289-193637-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 15:30:29,881 WARNING [train.py:1204] (2/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_313-263495-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 15:30:31,266 WARNING [train.py:1204] (2/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_18-130336-0_sp1.1 from training. Duration: 0.7181875 2023-10-03 15:30:33,031 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_47896_210_20508_1_1534492518_5958868_646-287209-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 15:30:34,450 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_294-163793-0 from training. Duration: 0.82 2023-10-03 15:30:34,498 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_8-319957-0 from training. Duration: 0.96 2023-10-03 15:30:34,501 WARNING [train.py:1204] (2/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_1211-226066-0 from training. Duration: 0.76 2023-10-03 15:30:35,909 WARNING [train.py:1204] (2/4) Exclude cut with ID _4779_210_2084_1_1527318022944_3914893_500-285013-0_sp0.9 from training. Duration: 0.7666875 2023-10-03 15:30:35,926 WARNING [train.py:1204] (2/4) Exclude cut with ID _22826_210_9994_1_1531908201522_3279169_347-232268-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 15:30:38,563 WARNING [train.py:1204] (2/4) Exclude cut with ID _29877_210_6228_1_1532397791106_5276149_111-55056-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 15:30:40,010 WARNING [train.py:1204] (2/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_812-271646-0_sp1.1 from training. Duration: 0.92725 2023-10-03 15:30:40,041 WARNING [train.py:1204] (2/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_235-262124-0_sp1.1 from training. Duration: 0.6090625 2023-10-03 15:30:40,238 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.attention_skip_rate, batch_count=1316060.0, ans=0.0 2023-10-03 15:30:44,966 WARNING [train.py:1204] (2/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_14-16530-0_sp1.1 from training. Duration: 0.97275 2023-10-03 15:30:46,293 WARNING [train.py:1204] (2/4) Exclude cut with ID _44718_210_5816_1_1534050051869_6469030_568-304512-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 15:30:46,313 WARNING [train.py:1204] (2/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_1033-274376-0 from training. Duration: 0.87 2023-10-03 15:30:50,325 WARNING [train.py:1204] (2/4) Exclude cut with ID _4681_210_3525_1_1527251040321_1235713_123-24428-0 from training. Duration: 0.48 2023-10-03 15:30:51,811 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_37818_210_4238_1_1533620910_4720083_251-288764-0_sp1.1 from training. Duration: 0.97275 2023-10-03 15:30:53,069 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.660e+02 1.915e+02 2.103e+02 2.491e+02 3.805e+02, threshold=4.207e+02, percent-clipped=0.0 2023-10-03 15:30:53,199 WARNING [train.py:1204] (2/4) Exclude cut with ID _65585_210_12210_1_1535450451584_3764098_112-294301-0 from training. Duration: 0.88 2023-10-03 15:30:55,537 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.conv_module2.balancer2.prob, batch_count=1316126.6666666667, ans=0.125 2023-10-03 15:30:58,380 WARNING [train.py:1204] (2/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_329-113173-0 from training. Duration: 0.62 2023-10-03 15:30:59,720 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6767_210_3273_1_1527855783_1718932_173-256601-0 from training. Duration: 0.8 2023-10-03 15:30:59,934 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.bypass.skip_rate, batch_count=1316126.6666666667, ans=0.09899494936611666 2023-10-03 15:31:02,505 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9266W0001-627677 from training. Duration: 0.896 2023-10-03 15:31:02,516 WARNING [train.py:1204] (2/4) Exclude cut with ID _49643_210_7989_1_1534640244224_3939031_611-185515-0_sp1.1 from training. Duration: 0.8545625 2023-10-03 15:31:02,519 WARNING [train.py:1204] (2/4) Exclude cut with ID _31649_210_6228_1_1532584608063_4091078_687-237771-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 15:31:02,530 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_12868_210_6753_1_1530702479_3610564_148-172861-0_sp0.9 from training. Duration: 0.5555625 2023-10-03 15:31:05,786 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_5129_210_6400_1_1527161005_3579054_107-69738-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 15:31:07,231 WARNING [train.py:1204] (2/4) Exclude cut with ID _40137_210_12210_1_1533290484046_3795180_36-301164-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 15:31:07,260 WARNING [train.py:1204] (2/4) Exclude cut with ID _7537_210_2084_1_1528502395431_3924359_113-199933-0 from training. Duration: 0.59 2023-10-03 15:31:08,717 WARNING [train.py:1204] (2/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_547-5424-0_sp1.1 from training. Duration: 0.8545625 2023-10-03 15:31:11,353 WARNING [train.py:1204] (2/4) Exclude cut with ID _43552_210_8913_1_1533726031059_3523429_147-284024-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 15:31:12,786 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_294-163793-0_sp1.1 from training. Duration: 0.7454375 2023-10-03 15:31:12,821 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_3780_210_2087_1_1526467760_2130483_170-259044-0_sp1.1 from training. Duration: 0.636375 2023-10-03 15:31:15,550 WARNING [train.py:1204] (2/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_726-270780-0_sp1.1 from training. Duration: 0.7818125 2023-10-03 15:31:15,642 WARNING [train.py:1204] (2/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_214-291369-0_sp1.1 from training. Duration: 0.57275 2023-10-03 15:31:17,483 WARNING [train.py:1204] (2/4) Exclude cut with ID _21881_210_3525_1_1531825242757_3798380_247-102591-0 from training. Duration: 0.98 2023-10-03 15:31:20,372 WARNING [train.py:1204] (2/4) Exclude cut with ID _8064_210_5399_1_1528372299291_4282973_18-174647-0_sp1.1 from training. Duration: 0.82725 2023-10-03 15:31:20,375 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_415-52611-0_sp1.1 from training. Duration: 0.936375 2023-10-03 15:31:21,747 WARNING [train.py:1204] (2/4) Exclude cut with ID _8394_210_6702_1_1528590504316_3540189_379-49457-0_sp0.9 from training. Duration: 0.9666875 2023-10-03 15:31:21,760 WARNING [train.py:1204] (2/4) Exclude cut with ID _64521_210_14699_1_1535243427259_3815328_368-195106-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 15:31:23,071 WARNING [train.py:1204] (2/4) Exclude cut with ID _28394_210_8913_1_1532430049518_3650118_51-334124-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 15:31:25,808 WARNING [train.py:1204] (2/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_317-161769-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 15:31:29,162 WARNING [train.py:1204] (2/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_40-129756-0_sp0.9 from training. Duration: 0.9 2023-10-03 15:31:29,240 WARNING [train.py:1204] (2/4) Exclude cut with ID _3067_210_2667_1_1526212974640_4503168_119-175776-0_sp0.9 from training. Duration: 0.82225 2023-10-03 15:31:29,288 WARNING [train.py:1204] (2/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_1004-304915-0_sp1.1 from training. Duration: 0.92725 2023-10-03 15:31:31,193 WARNING [train.py:1204] (2/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_359-118926-0_sp0.9 from training. Duration: 0.8 2023-10-03 15:31:38,676 WARNING [train.py:1204] (2/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_51-200220-0_sp0.9 from training. Duration: 0.62225 2023-10-03 15:31:40,032 INFO [train.py:1046] (2/4) Epoch 38, batch 900, loss[loss=0.1768, simple_loss=0.2516, pruned_loss=0.05104, over 23797.00 frames. ], tot_loss[loss=0.159, simple_loss=0.2388, pruned_loss=0.03954, over 4685484.44 frames. ], batch size: 212, lr: 2.69e-03, grad_scale: 16.0 2023-10-03 15:31:40,111 WARNING [train.py:1204] (2/4) Exclude cut with ID _46683_210_9642_1_1533730913492_4591116_530-326202-0_sp1.1 from training. Duration: 0.936375 2023-10-03 15:31:40,160 WARNING [train.py:1204] (2/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_567-135114-0 from training. Duration: 0.87 2023-10-03 15:31:41,524 WARNING [train.py:1204] (2/4) Exclude cut with ID _66863_210_18053_1_1535698758334_8590319_1092-271784-0_sp1.1 from training. Duration: 0.87275 2023-10-03 15:31:41,542 WARNING [train.py:1204] (2/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_762-282614-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 15:31:43,002 WARNING [train.py:1204] (2/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_280-120942-0 from training. Duration: 0.77 2023-10-03 15:31:44,576 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.ff2_skip_rate, batch_count=1316326.6666666667, ans=0.0 2023-10-03 15:31:48,923 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_31583_210_3852_1_1532595851_4024413_27-170931-0_sp1.1 from training. Duration: 0.963625 2023-10-03 15:31:51,686 WARNING [train.py:1204] (2/4) Exclude cut with ID _12369_210_4381_1_1530356078581_4388520_266-271628-0_sp1.1 from training. Duration: 0.92725 2023-10-03 15:31:51,735 WARNING [train.py:1204] (2/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_41-31157-0 from training. Duration: 0.57 2023-10-03 15:31:55,831 WARNING [train.py:1204] (2/4) Exclude cut with ID _21878_210_3525_1_1531738772002_7264330_113-18163-0_sp1.1 from training. Duration: 0.8090625 2023-10-03 15:31:55,858 WARNING [train.py:1204] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_147-111137-0 from training. Duration: 0.95 2023-10-03 15:31:55,940 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1083-220122-0_sp1.1 from training. Duration: 0.463625 2023-10-03 15:31:57,958 WARNING [train.py:1204] (2/4) Exclude cut with ID _14884_210_2252_1_1530707459689_5561250_591-173406-0_sp1.1 from training. Duration: 0.87275 2023-10-03 15:31:57,971 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_24677_210_1794_1_1532002693_4341296_149-303793-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 15:31:59,127 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_12868_210_6753_1_1530702479_3610564_133-564-0_sp1.1 from training. Duration: 0.7181875 2023-10-03 15:31:59,149 WARNING [train.py:1204] (2/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_625-338546-0_sp0.9 from training. Duration: 0.97775 2023-10-03 15:31:59,490 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.conv_skip_rate, batch_count=1316393.3333333333, ans=0.0 2023-10-03 15:32:07,362 WARNING [train.py:1204] (2/4) Exclude cut with ID _25882_210_3036_1_1532775665815_6479510_624-320169-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 15:32:07,369 WARNING [train.py:1204] (2/4) Exclude cut with ID _20622_210_3852_1_1531622427080_1093839_127-325326-0_sp1.1 from training. Duration: 0.92725 2023-10-03 15:32:08,632 WARNING [train.py:1204] (2/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_464-33931-0_sp1.1 from training. Duration: 0.4909375 2023-10-03 15:32:10,162 WARNING [train.py:1204] (2/4) Exclude cut with ID _66112_210_10635_1_1535590236305_3801484_120-304588-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 15:32:14,278 WARNING [train.py:1204] (2/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_110-130961-0 from training. Duration: 0.47 2023-10-03 15:32:15,787 WARNING [train.py:1204] (2/4) Exclude cut with ID _16908_210_5724_1_1531119157248_3772700_64-54020-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 15:32:16,069 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.ff2_skip_rate, batch_count=1316460.0, ans=0.0 2023-10-03 15:32:20,246 WARNING [train.py:1204] (2/4) Exclude cut with ID _4068_210_2629_1_1526697350395_1546259_224-288693-0_sp1.1 from training. Duration: 0.536375 2023-10-03 15:32:20,283 WARNING [train.py:1204] (2/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_191-41023-0_sp0.9 from training. Duration: 0.788875 2023-10-03 15:32:20,346 WARNING [train.py:1204] (2/4) Exclude cut with ID ID0205W0255-783002 from training. Duration: 0.958 2023-10-03 15:32:21,191 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.1.encoder.layers.0.feed_forward2.out_whiten, num_groups=1, num_channels=256, metric=4.13 vs. limit=15.0 2023-10-03 15:32:21,619 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_162-262717-0 from training. Duration: 0.83 2023-10-03 15:32:27,775 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_224-322394-0_sp1.1 from training. Duration: 0.6 2023-10-03 15:32:27,782 WARNING [train.py:1204] (2/4) Exclude cut with ID _4866_210_2577_1_1527404412284_4364659_133-1696-0_sp1.1 from training. Duration: 0.9 2023-10-03 15:32:29,725 WARNING [train.py:1204] (2/4) Exclude cut with ID _6275_210_2084_1_1528110062213_7669049_973-198085-0_sp1.1 from training. Duration: 0.6818125 2023-10-03 15:32:35,311 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_47449_210_5399_1_1534076445_4006929_410-237806-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 15:32:37,003 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7543_210_6753_1_1528543318_4552_1-85072-0_sp0.9 from training. Duration: 0.92225 2023-10-03 15:32:37,132 WARNING [train.py:1204] (2/4) Exclude cut with ID _65263_210_22356_1_1535763598691_3921037_108-21902-0 from training. Duration: 0.99 2023-10-03 15:32:37,136 WARNING [train.py:1204] (2/4) Exclude cut with ID _65585_210_12210_1_1535450451584_3764098_309-174818-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 15:32:38,734 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.feed_forward3.hidden_balancer.prob, batch_count=1316593.3333333333, ans=0.125 2023-10-03 15:32:41,218 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_309-65177-0 from training. Duration: 0.88 2023-10-03 15:32:42,626 WARNING [train.py:1204] (2/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_363-185703-0_sp1.1 from training. Duration: 0.863625 2023-10-03 15:32:42,661 WARNING [train.py:1204] (2/4) Exclude cut with ID _42326_210_7770_1_1533814282531_3707269_442-238692-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 15:32:42,800 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.balancer1.prob, batch_count=1316593.3333333333, ans=0.125 2023-10-03 15:32:45,559 WARNING [train.py:1204] (2/4) Exclude cut with ID _69884_210_9771_1_1535846392818_4166169_226-254446-0_sp1.1 from training. Duration: 0.936375 2023-10-03 15:32:45,583 WARNING [train.py:1204] (2/4) Exclude cut with ID _29853_210_7497_1_1532505600851_6642110_287-299903-0_sp1.1 from training. Duration: 0.963625 2023-10-03 15:32:50,114 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1015-343884-0 from training. Duration: 0.62 2023-10-03 15:32:50,150 WARNING [train.py:1204] (2/4) Exclude cut with ID ID0305W0008-833402 from training. Duration: 0.91 2023-10-03 15:32:51,605 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6673_210_3273_1_1527767790_3173240_351-167954-0_sp1.1 from training. Duration: 0.436375 2023-10-03 15:32:51,630 WARNING [train.py:1204] (2/4) Exclude cut with ID _6456_210_1534_1_1527681634212_3181049_345-252877-0 from training. Duration: 0.64 2023-10-03 15:32:54,237 INFO [train.py:1046] (2/4) Epoch 38, batch 950, loss[loss=0.1512, simple_loss=0.2275, pruned_loss=0.03742, over 24340.00 frames. ], tot_loss[loss=0.1591, simple_loss=0.2389, pruned_loss=0.03966, over 4694541.90 frames. ], batch size: 61, lr: 2.69e-03, grad_scale: 16.0 2023-10-03 15:32:54,360 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_13-205182-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 15:32:57,128 WARNING [train.py:1204] (2/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_427-208901-0 from training. Duration: 0.68 2023-10-03 15:33:02,387 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.0.balancer1.prob, batch_count=1316660.0, ans=0.125 2023-10-03 15:33:03,400 WARNING [train.py:1204] (2/4) Exclude cut with ID _49039_210_6315_1_1533967537541_3846118_9-148921-0_sp1.1 from training. Duration: 0.97275 2023-10-03 15:33:04,900 WARNING [train.py:1204] (2/4) Exclude cut with ID _59493_210_17282_1_1535078419077_3225879_143-70317-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 15:33:04,939 WARNING [train.py:1204] (2/4) Exclude cut with ID _27967_210_12140_1_1532588391859_8419130_1056-236369-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 15:33:06,175 WARNING [train.py:1204] (2/4) Exclude cut with ID _6537_210_1883_1_1527987559710_3597129_382-133062-0_sp0.9 from training. Duration: 0.7555625 2023-10-03 15:33:09,539 WARNING [train.py:1204] (2/4) Exclude cut with ID ID0376W0255-869957 from training. Duration: 0.9510625 2023-10-03 15:33:11,013 WARNING [train.py:1204] (2/4) Exclude cut with ID _49371_210_12071_1_1533952761021_3812190_483-240691-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 15:33:12,283 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_12868_210_6753_1_1530701932_530275_18-76322-0_sp1.1 from training. Duration: 0.7909375 2023-10-03 15:33:12,351 WARNING [train.py:1204] (2/4) Exclude cut with ID _53950_210_5816_1_1534417264919_3862951_367-26344-0_sp1.1 from training. Duration: 0.97275 2023-10-03 15:33:12,377 WARNING [train.py:1204] (2/4) Exclude cut with ID _5444_210_2151_1_1527418808464_5312210_634-194174-0_sp1.1 from training. Duration: 0.8818125 2023-10-03 15:33:13,680 WARNING [train.py:1204] (2/4) Exclude cut with ID _56494_210_4220_1_1535284837625_3033169_69-138150-0 from training. Duration: 0.94 2023-10-03 15:33:13,780 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7543_210_6753_1_1528544010_1110402_25-183860-0_sp0.9 from training. Duration: 0.52225 2023-10-03 15:33:14,339 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.4.encoder.layers.2.feed_forward2.out_whiten, num_groups=1, num_channels=384, metric=7.54 vs. limit=15.0 2023-10-03 15:33:15,183 WARNING [train.py:1204] (2/4) Exclude cut with ID _40284_210_2084_1_1533513679814_3704279_287-191918-0_sp1.1 from training. Duration: 0.963625 2023-10-03 15:33:16,574 WARNING [train.py:1204] (2/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_291-270881-0 from training. Duration: 0.97 2023-10-03 15:33:16,616 WARNING [train.py:1204] (2/4) Exclude cut with ID _12555_210_8750_1_1530075594402_3780239_449-267986-0_sp0.9 from training. Duration: 0.92225 2023-10-03 15:33:21,292 WARNING [train.py:1204] (2/4) Exclude cut with ID _3992_210_1747_1_1526644816031_3731060_268-64520-0_sp1.1 from training. Duration: 0.963625 2023-10-03 15:33:21,299 WARNING [train.py:1204] (2/4) Exclude cut with ID _39411_210_5986_1_1533538825030_3343649_173-238248-0_sp0.9 from training. Duration: 0.92225 2023-10-03 15:33:21,327 WARNING [train.py:1204] (2/4) Exclude cut with ID _32704_210_8954_1_1533017488840_6747370_828-313097-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 15:33:22,508 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.638e+02 1.928e+02 2.082e+02 2.343e+02 3.541e+02, threshold=4.165e+02, percent-clipped=0.0 2023-10-03 15:33:23,899 WARNING [train.py:1204] (2/4) Exclude cut with ID _26907_210_7234_1_1532221740694_3205263_337-2494-0 from training. Duration: 0.88 2023-10-03 15:33:25,365 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_229-45300-0_sp1.1 from training. Duration: 0.5181875 2023-10-03 15:33:28,082 WARNING [train.py:1204] (2/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_300-27235-0_sp1.1 from training. Duration: 0.7909375 2023-10-03 15:33:30,713 WARNING [train.py:1204] (2/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_48-338976-0_sp1.1 from training. Duration: 0.8090625 2023-10-03 15:33:34,810 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_13091_210_7925_1_1530609447_8987354_792-195353-0_sp1.1 from training. Duration: 0.936375 2023-10-03 15:33:34,817 WARNING [train.py:1204] (2/4) Exclude cut with ID _56524_210_9642_1_1534572011779_3619170_311-54500-0_sp1.1 from training. Duration: 0.97275 2023-10-03 15:33:39,594 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7543_210_6753_1_1528544010_1110402_25-183860-0 from training. Duration: 0.47 2023-10-03 15:33:42,097 WARNING [train.py:1204] (2/4) Exclude cut with ID 2411-132532-0017-82279-0_sp1.1 from training. Duration: 0.9681875 2023-10-03 15:33:42,098 WARNING [train.py:1204] (2/4) Exclude cut with ID _14170_210_5219_1_1530525949868_4469650_272-249985-0_sp0.9 from training. Duration: 0.7444375 2023-10-03 15:33:42,153 WARNING [train.py:1204] (2/4) Exclude cut with ID _55644_210_11856_1_1534593599582_4103719_320-229006-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 15:33:42,211 WARNING [train.py:1204] (2/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_177-100554-0_sp1.1 from training. Duration: 0.963625 2023-10-03 15:33:42,213 WARNING [train.py:1204] (2/4) Exclude cut with ID _14998_210_5219_1_1530705582080_7474970_787-294416-0_sp1.1 from training. Duration: 0.7454375 2023-10-03 15:33:46,427 WARNING [train.py:1204] (2/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_318-59097-0 from training. Duration: 0.76 2023-10-03 15:33:46,505 WARNING [train.py:1204] (2/4) Exclude cut with ID _12369_210_4381_1_1530356078581_4388520_480-13146-0_sp1.1 from training. Duration: 0.77275 2023-10-03 15:33:46,722 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.conv_module1.balancer1.prob, batch_count=1316860.0, ans=0.125 2023-10-03 15:33:47,942 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.0.ff3_skip_rate, batch_count=1316860.0, ans=0.0 2023-10-03 15:33:49,140 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_469-338558-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 15:33:50,460 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_40896_210_15710_1_1533433956_7100802_367-126897-0_sp1.1 from training. Duration: 0.963625 2023-10-03 15:33:50,477 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6545_210_2040_1_1527774670_4245258_379-14182-0 from training. Duration: 0.8 2023-10-03 15:33:50,488 WARNING [train.py:1204] (2/4) Exclude cut with ID _21631_210_2253_1_1531907880778_3160339_197-79498-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 15:33:50,498 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_12868_210_6753_1_1530702479_3610564_416-293746-0_sp1.1 from training. Duration: 0.8181875 2023-10-03 15:33:51,796 WARNING [train.py:1204] (2/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_52-211683-0 from training. Duration: 0.9 2023-10-03 15:33:55,316 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.nonlin_attention.balancer.prob, batch_count=1316926.6666666667, ans=0.125 2023-10-03 15:33:57,773 WARNING [train.py:1204] (2/4) Exclude cut with ID _48678_210_5663_1_1533988830947_3769199_306-255886-0_sp0.9 from training. Duration: 0.8555625 2023-10-03 15:33:59,288 WARNING [train.py:1204] (2/4) Exclude cut with ID _65134_210_14763_1_1535335393985_3505368_296-24733-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 15:34:04,360 WARNING [train.py:1204] (2/4) Exclude cut with ID _14898_210_9206_1_1530766812377_4177190_335-99364-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 15:34:05,715 WARNING [train.py:1204] (2/4) Exclude cut with ID _54871_210_22356_1_1534665798253_3856359_19-342632-0 from training. Duration: 0.91 2023-10-03 15:34:05,733 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_53071_210_16554_1_1534490405_2954066_265-170472-0 from training. Duration: 0.83 2023-10-03 15:34:08,416 INFO [train.py:1046] (2/4) Epoch 38, batch 1000, loss[loss=0.168, simple_loss=0.2516, pruned_loss=0.04218, over 23619.00 frames. ], tot_loss[loss=0.1586, simple_loss=0.2386, pruned_loss=0.03931, over 4701819.52 frames. ], batch size: 85, lr: 2.69e-03, grad_scale: 16.0 2023-10-03 15:34:08,617 WARNING [train.py:1204] (2/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_189-298172-0_sp1.1 from training. Duration: 0.963625 2023-10-03 15:34:13,380 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7615_210_4211_1_1528110965_4580160_72-235211-0 from training. Duration: 0.76 2023-10-03 15:34:13,415 WARNING [train.py:1204] (2/4) Exclude cut with ID _47790_210_16187_1_1533862814066_4207519_155-90874-0_sp1.1 from training. Duration: 0.936375 2023-10-03 15:34:17,748 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.conv_module2.balancer1.prob, batch_count=1316993.3333333333, ans=0.125 2023-10-03 15:34:18,833 WARNING [train.py:1204] (2/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_164-216310-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 15:34:20,241 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_46013_210_7234_1_1533776362_3744032_463-52378-0 from training. Duration: 0.86 2023-10-03 15:34:20,245 WARNING [train.py:1204] (2/4) Exclude cut with ID _61133_210_9969_1_1535196777227_3696409_333-270325-0 from training. Duration: 0.93 2023-10-03 15:34:24,335 WARNING [train.py:1204] (2/4) Exclude cut with ID _5172_210_1883_1_1527246062567_3577219_18-101380-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 15:34:24,351 WARNING [train.py:1204] (2/4) Exclude cut with ID _30800_210_22382_1_1532499347904_4515959_577-236858-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 15:34:25,362 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.0.layers.1.self_attn1.whiten, num_groups=1, num_channels=192, metric=12.78 vs. limit=22.5 2023-10-03 15:34:26,339 WARNING [train.py:1204] (2/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_1006-177345-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 15:34:27,870 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6291_210_3273_1_1527683649_1078498_4-266348-0 from training. Duration: 0.78 2023-10-03 15:34:30,078 WARNING [train.py:1204] (2/4) Exclude cut with ID _55857_210_2346_1_1534644007831_6898209_745-98391-0 from training. Duration: 0.76 2023-10-03 15:34:32,047 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.0.layers.1.self_attn2.whiten, num_groups=1, num_channels=192, metric=9.66 vs. limit=22.5 2023-10-03 15:34:32,581 WARNING [train.py:1204] (2/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_375-244976-0 from training. Duration: 0.75 2023-10-03 15:34:33,890 WARNING [train.py:1204] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_4-248404-0_sp1.1 from training. Duration: 0.97275 2023-10-03 15:34:35,279 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8453_210_4211_1_1528502940_8562730_596-306820-0 from training. Duration: 0.97 2023-10-03 15:34:35,569 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.nonlin_attention.balancer.min_positive, batch_count=1317060.0, ans=0.05 2023-10-03 15:34:36,685 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_211-339622-0_sp0.9 from training. Duration: 0.5 2023-10-03 15:34:36,728 WARNING [train.py:1204] (2/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_393-57888-0 from training. Duration: 0.64 2023-10-03 15:34:38,172 WARNING [train.py:1204] (2/4) Exclude cut with ID _23825_210_13449_1_1531915313526_3497150_143-343575-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 15:34:38,234 WARNING [train.py:1204] (2/4) Exclude cut with ID _58605_210_12210_1_1534723195939_3904259_36-12256-0_sp1.1 from training. Duration: 0.936375 2023-10-03 15:34:45,710 WARNING [train.py:1204] (2/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_38-67795-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 15:34:47,012 WARNING [train.py:1204] (2/4) Exclude cut with ID _64631_210_27767_1_1535349580068_4165160_308-327832-0_sp1.1 from training. Duration: 0.92725 2023-10-03 15:34:47,204 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.conv_module1.balancer1.prob, batch_count=1317126.6666666667, ans=0.125 2023-10-03 15:34:48,293 WARNING [train.py:1204] (2/4) Exclude cut with ID _37086_210_9038_1_1532999540891_7021250_726-81176-0_sp1.1 from training. Duration: 0.936375 2023-10-03 15:34:48,357 WARNING [train.py:1204] (2/4) Exclude cut with ID _47790_210_16187_1_1533862814066_4207519_47-141521-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 15:34:48,361 WARNING [train.py:1204] (2/4) Exclude cut with ID _3067_210_2667_1_1526212974640_4503168_250-285937-0 from training. Duration: 0.8 2023-10-03 15:34:48,392 WARNING [train.py:1204] (2/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_239-24000-0_sp1.1 from training. Duration: 0.97275 2023-10-03 15:34:49,783 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_3371_210_1794_1_1526384462_5580777_396-339812-0_sp0.9 from training. Duration: 0.9444375 2023-10-03 15:34:49,828 WARNING [train.py:1204] (2/4) Exclude cut with ID _16909_210_5724_1_1531292079912_4118919_580-173454-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 15:34:51,124 WARNING [train.py:1204] (2/4) Exclude cut with ID IC0124W0002-61308 from training. Duration: 0.982 2023-10-03 15:34:54,466 WARNING [train.py:1204] (2/4) Exclude cut with ID _31602_210_5854_1_1532677021456_4458250_249-96604-0 from training. Duration: 0.89 2023-10-03 15:34:55,864 WARNING [train.py:1204] (2/4) Exclude cut with ID _69584_210_5219_1_1535785133775_4044450_325-7656-0 from training. Duration: 0.86 2023-10-03 15:34:56,006 WARNING [train.py:1204] (2/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_478-162247-0 from training. Duration: 0.82 2023-10-03 15:34:57,616 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.bypass.skip_rate, batch_count=1317193.3333333333, ans=0.07 2023-10-03 15:34:59,101 WARNING [train.py:1204] (2/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_686-54383-0_sp1.1 from training. Duration: 0.7909375 2023-10-03 15:35:05,072 WARNING [train.py:1204] (2/4) Exclude cut with ID _29303_210_2575_1_1532392237303_7410649_85-270092-0_sp1.1 from training. Duration: 0.936375 2023-10-03 15:35:05,082 WARNING [train.py:1204] (2/4) Exclude cut with ID _2322_210_2667_1_1525604548971_3656299_440-80494-0_sp1.1 from training. Duration: 0.8 2023-10-03 15:35:05,225 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.bypass.skip_rate, batch_count=1317193.3333333333, ans=0.04949747468305833 2023-10-03 15:35:06,374 WARNING [train.py:1204] (2/4) Exclude cut with ID _47346_210_9476_1_1533815971553_3536160_201-40644-0_sp1.1 from training. Duration: 0.936375 2023-10-03 15:35:06,478 WARNING [train.py:1204] (2/4) Exclude cut with ID _56235_210_10640_1_1534557335235_4161099_343-139898-0_sp1.1 from training. Duration: 0.87275 2023-10-03 15:35:06,585 WARNING [train.py:1204] (2/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_1012-279473-0 from training. Duration: 0.67 2023-10-03 15:35:08,580 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7931_210_1794_1_1528594728_5564059_280-279560-0_sp1.1 from training. Duration: 0.8909375 2023-10-03 15:35:09,865 WARNING [train.py:1204] (2/4) Exclude cut with ID _8263_210_1664_1_1528439452891_970220_68-181805-0 from training. Duration: 0.64 2023-10-03 15:35:11,212 WARNING [train.py:1204] (2/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_605-133395-0 from training. Duration: 0.66 2023-10-03 15:35:12,509 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_43-171109-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 15:35:12,512 WARNING [train.py:1204] (2/4) Exclude cut with ID _40094_210_16380_1_1533294005773_3641159_107-280739-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 15:35:13,912 WARNING [train.py:1204] (2/4) Exclude cut with ID _14821_210_8750_1_1530680503991_6829359_2-227151-0_sp0.9 from training. Duration: 0.92225 2023-10-03 15:35:15,469 WARNING [train.py:1204] (2/4) Exclude cut with ID _14268_210_9206_1_1530622771285_4714099_331-105351-0_sp1.1 from training. Duration: 0.5909375 2023-10-03 15:35:16,851 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_26326_210_7925_1_1533002493_3592406_451-82741-0_sp1.1 from training. Duration: 0.97275 2023-10-03 15:35:20,828 WARNING [train.py:1204] (2/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_191-59784-0_sp0.9 from training. Duration: 0.97775 2023-10-03 15:35:22,122 INFO [train.py:1046] (2/4) Epoch 38, batch 1050, loss[loss=0.1606, simple_loss=0.2376, pruned_loss=0.04178, over 23702.00 frames. ], tot_loss[loss=0.1575, simple_loss=0.2369, pruned_loss=0.03904, over 4698866.18 frames. ], batch size: 149, lr: 2.69e-03, grad_scale: 16.0 2023-10-03 15:35:22,196 WARNING [train.py:1204] (2/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_445-117398-0_sp1.1 from training. Duration: 0.8090625 2023-10-03 15:35:25,705 WARNING [train.py:1204] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_605-257205-0_sp0.9 from training. Duration: 0.6555625 2023-10-03 15:35:25,775 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_50050_210_16223_1_1535435796_7290138_592-258841-0_sp1.1 from training. Duration: 0.936375 2023-10-03 15:35:27,095 WARNING [train.py:1204] (2/4) Exclude cut with ID _14348_210_8341_1_1530599434996_3778640_240-232370-0_sp0.9 from training. Duration: 0.9333125 2023-10-03 15:35:30,373 WARNING [train.py:1204] (2/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_239-316802-0_sp0.9 from training. Duration: 0.7666875 2023-10-03 15:35:30,490 WARNING [train.py:1204] (2/4) Exclude cut with ID _45560_210_21982_1_1533812603951_3584999_111-6376-0_sp1.1 from training. Duration: 0.763625 2023-10-03 15:35:30,665 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.conv_module1.balancer2.prob, batch_count=1317326.6666666667, ans=0.125 2023-10-03 15:35:34,405 WARNING [train.py:1204] (2/4) Exclude cut with ID _54189_210_16380_1_1534332670039_3911710_606-311481-0_sp1.1 from training. Duration: 0.963625 2023-10-03 15:35:34,463 WARNING [train.py:1204] (2/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_237-332027-0_sp1.1 from training. Duration: 0.863625 2023-10-03 15:35:34,504 WARNING [train.py:1204] (2/4) Exclude cut with ID _66070_210_7640_1_1535441393096_7805310_489-292631-0_sp0.9 from training. Duration: 0.87775 2023-10-03 15:35:37,733 WARNING [train.py:1204] (2/4) Exclude cut with ID _49728_210_12651_1_1534041034294_6788150_624-195301-0_sp1.1 from training. Duration: 0.77275 2023-10-03 15:35:37,786 WARNING [train.py:1204] (2/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_44-79427-0 from training. Duration: 0.98 2023-10-03 15:35:39,178 WARNING [train.py:1204] (2/4) Exclude cut with ID _35350_210_16040_1_1533040191183_4075299_490-198354-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 15:35:39,220 WARNING [train.py:1204] (2/4) Exclude cut with ID _5166_210_2084_1_1527073197160_4087993_309-222545-0 from training. Duration: 0.84 2023-10-03 15:35:41,116 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.4.encoder.layers.1.self_attn1.whiten, num_groups=1, num_channels=384, metric=11.69 vs. limit=22.5 2023-10-03 15:35:41,883 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_13091_210_7925_1_1530609447_8987354_790-199469-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 15:35:41,889 WARNING [train.py:1204] (2/4) Exclude cut with ID _21632_210_2253_1_1531994001765_3744140_110-274654-0 from training. Duration: 0.82 2023-10-03 15:35:41,894 WARNING [train.py:1204] (2/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_498-40153-0_sp0.9 from training. Duration: 0.7 2023-10-03 15:35:47,538 WARNING [train.py:1204] (2/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_363-282909-0_sp1.1 from training. Duration: 0.936375 2023-10-03 15:35:47,735 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.conv_skip_rate, batch_count=1317393.3333333333, ans=0.0 2023-10-03 15:35:48,864 WARNING [train.py:1204] (2/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_178-177582-0_sp0.9 from training. Duration: 0.82225 2023-10-03 15:35:48,883 WARNING [train.py:1204] (2/4) Exclude cut with ID _24612_210_8913_1_1531998054193_3806359_357-35487-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 15:35:50,136 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.487e+02 1.904e+02 2.119e+02 2.413e+02 3.551e+02, threshold=4.238e+02, percent-clipped=0.0 2023-10-03 15:35:51,610 WARNING [train.py:1204] (2/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_138-332773-0 from training. Duration: 0.81 2023-10-03 15:35:51,638 WARNING [train.py:1204] (2/4) Exclude cut with ID _7624_210_1531_1_1528109802198_4933101_418-349834-0 from training. Duration: 0.6 2023-10-03 15:35:51,665 WARNING [train.py:1204] (2/4) Exclude cut with ID _7537_210_2084_1_1528502395431_3924359_14-312906-0_sp0.9 from training. Duration: 0.9333125 2023-10-03 15:35:53,116 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.0.conv_module1.balancer2.prob, batch_count=1317460.0, ans=0.125 2023-10-03 15:35:56,214 WARNING [train.py:1204] (2/4) Exclude cut with ID _60057_210_10640_1_1534903065793_3333150_362-230377-0 from training. Duration: 0.87 2023-10-03 15:35:58,267 WARNING [train.py:1204] (2/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_138-64210-0 from training. Duration: 0.73 2023-10-03 15:35:59,602 WARNING [train.py:1204] (2/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_454-339504-0_sp1.1 from training. Duration: 0.97275 2023-10-03 15:36:02,961 WARNING [train.py:1204] (2/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_508-335369-0_sp0.9 from training. Duration: 0.5333125 2023-10-03 15:36:03,533 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.4.encoder.layers.2.nonlin_attention.whiten2, num_groups=1, num_channels=384, metric=3.63 vs. limit=15.0 2023-10-03 15:36:05,498 WARNING [train.py:1204] (2/4) Exclude cut with ID _5608_210_2106_1_1527415439089_6819768_909-82767-0_sp0.9 from training. Duration: 0.67775 2023-10-03 15:36:05,542 WARNING [train.py:1204] (2/4) Exclude cut with ID _6694_210_4599_1_1527852637490_3965529_99-11611-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 15:36:06,832 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_2655_210_2087_1_1525688959_3897400_52-279899-0_sp1.1 from training. Duration: 0.62725 2023-10-03 15:36:10,212 WARNING [train.py:1204] (2/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_375-245677-0_sp1.1 from training. Duration: 0.736375 2023-10-03 15:36:13,017 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_23850_210_4238_1_1532232597_1632433_32-185135-0 from training. Duration: 0.86 2023-10-03 15:36:15,647 WARNING [train.py:1204] (2/4) Exclude cut with ID _15113_210_8254_1_1530842438579_3716279_209-54533-0 from training. Duration: 0.96 2023-10-03 15:36:15,684 WARNING [train.py:1204] (2/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_401-53423-0 from training. Duration: 0.55 2023-10-03 15:36:15,724 WARNING [train.py:1204] (2/4) Exclude cut with ID _14867_210_1968_1_1530684179890_3846969_319-250868-0_sp1.1 from training. Duration: 0.9 2023-10-03 15:36:15,739 WARNING [train.py:1204] (2/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_106-30782-0_sp0.9 from training. Duration: 0.888875 2023-10-03 15:36:17,091 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_5960_210_1794_1_1527403865_3990999_515-2613-0 from training. Duration: 0.88 2023-10-03 15:36:17,400 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=1317526.6666666667, ans=0.1 2023-10-03 15:36:19,874 WARNING [train.py:1204] (2/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_798-106179-0_sp0.9 from training. Duration: 0.92225 2023-10-03 15:36:22,629 WARNING [train.py:1204] (2/4) Exclude cut with ID _14227_210_4381_1_1530701804286_3940259_125-275642-0_sp1.1 from training. Duration: 0.9 2023-10-03 15:36:22,631 WARNING [train.py:1204] (2/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_79-200392-0_sp1.1 from training. Duration: 0.8818125 2023-10-03 15:36:22,680 WARNING [train.py:1204] (2/4) Exclude cut with ID _48836_210_12210_1_1534075848293_3904150_429-35475-0_sp0.9 from training. Duration: 0.988875 2023-10-03 15:36:23,912 WARNING [train.py:1204] (2/4) Exclude cut with ID _69884_210_9771_1_1535846392818_4166169_224-172517-0_sp1.1 from training. Duration: 0.97275 2023-10-03 15:36:29,974 WARNING [train.py:1204] (2/4) Exclude cut with ID _15113_210_8254_1_1530842438579_3716279_76-232507-0_sp1.1 from training. Duration: 0.97275 2023-10-03 15:36:29,977 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_135-5501-0 from training. Duration: 0.98 2023-10-03 15:36:32,005 WARNING [train.py:1204] (2/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_563-347832-0_sp0.9 from training. Duration: 0.988875 2023-10-03 15:36:32,022 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6786_210_1388_1_1527839699_4194495_273-149841-0 from training. Duration: 0.92 2023-10-03 15:36:33,378 WARNING [train.py:1204] (2/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_172-215932-0 from training. Duration: 0.66 2023-10-03 15:36:33,455 WARNING [train.py:1204] (2/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_939-132910-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 15:36:36,019 INFO [train.py:1046] (2/4) Epoch 38, batch 1100, loss[loss=0.1453, simple_loss=0.2353, pruned_loss=0.02763, over 24492.00 frames. ], tot_loss[loss=0.1569, simple_loss=0.2367, pruned_loss=0.03852, over 4727354.72 frames. ], batch size: 63, lr: 2.69e-03, grad_scale: 16.0 2023-10-03 15:36:38,027 WARNING [train.py:1204] (2/4) Exclude cut with ID _49371_210_12071_1_1533952761021_3812190_16-246001-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 15:36:42,807 WARNING [train.py:1204] (2/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_680-189165-0_sp1.1 from training. Duration: 0.936375 2023-10-03 15:36:48,209 WARNING [train.py:1204] (2/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_53-300544-0_sp1.1 from training. Duration: 0.6090625 2023-10-03 15:36:48,303 WARNING [train.py:1204] (2/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_540-275685-0_sp1.1 from training. Duration: 0.8090625 2023-10-03 15:36:48,320 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_38965_210_4852_1_1533174906_3484735_453-106579-0_sp1.1 from training. Duration: 0.92725 2023-10-03 15:36:49,742 WARNING [train.py:1204] (2/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_105-328174-0 from training. Duration: 0.76 2023-10-03 15:36:51,159 WARNING [train.py:1204] (2/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_279-126205-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 15:36:52,649 WARNING [train.py:1204] (2/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_5-55893-0_sp0.9 from training. Duration: 0.611125 2023-10-03 15:36:53,099 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.4.encoder.layers.2.conv_module2.whiten, num_groups=1, num_channels=384, metric=5.17 vs. limit=15.0 2023-10-03 15:36:54,625 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.4.encoder.layers.1.feed_forward3.out_whiten, num_groups=1, num_channels=384, metric=11.71 vs. limit=15.0 2023-10-03 15:36:55,370 WARNING [train.py:1204] (2/4) Exclude cut with ID _13678_210_7497_1_1530530069226_3463170_141-10835-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 15:36:58,590 WARNING [train.py:1204] (2/4) Exclude cut with ID _22083_210_8254_1_1531702862439_3678970_201-85738-0_sp1.1 from training. Duration: 0.8181875 2023-10-03 15:36:58,622 WARNING [train.py:1204] (2/4) Exclude cut with ID _4697_210_2069_1_1526815801953_3823098_51-1049-0 from training. Duration: 0.75 2023-10-03 15:36:59,976 WARNING [train.py:1204] (2/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_206-10097-0_sp0.9 from training. Duration: 0.5444375 2023-10-03 15:37:01,846 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_4528_210_3273_1_1527076654_3276777_360-63661-0_sp1.1 from training. Duration: 0.92725 2023-10-03 15:37:01,863 WARNING [train.py:1204] (2/4) Exclude cut with ID _64086_210_7994_1_1535270417019_7435949_449-31567-0_sp1.1 from training. Duration: 0.8818125 2023-10-03 15:37:03,339 WARNING [train.py:1204] (2/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_245-174553-0_sp0.9 from training. Duration: 0.97775 2023-10-03 15:37:04,688 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6545_210_2040_1_1527774670_4245258_379-14182-0_sp1.1 from training. Duration: 0.72725 2023-10-03 15:37:10,287 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_3133_210_2087_1_1526106446_2324690_256-206070-0_sp1.1 from training. Duration: 0.963625 2023-10-03 15:37:13,104 WARNING [train.py:1204] (2/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_278-104676-0 from training. Duration: 0.98 2023-10-03 15:37:14,965 WARNING [train.py:1204] (2/4) Exclude cut with ID IC0671W0065-331908 from training. Duration: 0.896 2023-10-03 15:37:15,009 WARNING [train.py:1204] (2/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_244-129917-0_sp1.1 from training. Duration: 0.97275 2023-10-03 15:37:15,610 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.3.encoder.layers.2.conv_module2.whiten, num_groups=1, num_channels=512, metric=4.12 vs. limit=15.0 2023-10-03 15:37:17,665 WARNING [train.py:1204] (2/4) Exclude cut with ID _7852_210_5219_1_1528286471316_8139400_1129-170857-0_sp1.1 from training. Duration: 0.97275 2023-10-03 15:37:19,008 WARNING [train.py:1204] (2/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_443-30034-0_sp0.9 from training. Duration: 0.711125 2023-10-03 15:37:19,045 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_31583_210_3852_1_1532595851_4024413_250-332999-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 15:37:19,142 WARNING [train.py:1204] (2/4) Exclude cut with ID _8772_210_1850_1_1528635595972_3664011_247-336448-0 from training. Duration: 0.74 2023-10-03 15:37:20,545 WARNING [train.py:1204] (2/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_567-135114-0_sp0.9 from training. Duration: 0.9666875 2023-10-03 15:37:20,549 WARNING [train.py:1204] (2/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_557-349534-0_sp1.1 from training. Duration: 0.82725 2023-10-03 15:37:20,561 WARNING [train.py:1204] (2/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_1341-26728-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 15:37:20,614 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_40222_210_6228_1_1533385298_4481939_375-328051-0_sp1.1 from training. Duration: 0.97275 2023-10-03 15:37:20,635 WARNING [train.py:1204] (2/4) Exclude cut with ID _4697_210_2069_1_1526815801953_3823098_276-96593-0 from training. Duration: 0.77 2023-10-03 15:37:22,216 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.conv_module1.balancer2.prob, batch_count=1317860.0, ans=0.125 2023-10-03 15:37:26,132 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_53071_210_16554_1_1534490405_2954066_265-170472-0_sp0.9 from training. Duration: 0.92225 2023-10-03 15:37:26,621 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.5.encoder.layers.0.conv_module1.whiten, num_groups=1, num_channels=256, metric=6.64 vs. limit=15.0 2023-10-03 15:37:27,880 WARNING [train.py:1204] (2/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_365-1492-0 from training. Duration: 0.91 2023-10-03 15:37:29,286 WARNING [train.py:1204] (2/4) Exclude cut with ID _66873_210_5517_1_1535454136506_4293370_654-52713-0_sp0.9 from training. Duration: 0.8555625 2023-10-03 15:37:32,698 WARNING [train.py:1204] (2/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_113-92608-0_sp1.1 from training. Duration: 0.4909375 2023-10-03 15:37:34,302 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.bypass_mid.scale_min, batch_count=1317926.6666666667, ans=0.2 2023-10-03 15:37:36,076 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_46013_210_7234_1_1533776362_3744032_50-141535-0 from training. Duration: 0.99 2023-10-03 15:37:36,084 WARNING [train.py:1204] (2/4) Exclude cut with ID _13942_210_9988_1_1530604906036_3733360_499-55676-0_sp1.1 from training. Duration: 0.563625 2023-10-03 15:37:37,239 WARNING [train.py:1204] (2/4) Exclude cut with ID _30800_210_22382_1_1532499347904_4515959_375-65143-0_sp1.1 from training. Duration: 0.97275 2023-10-03 15:37:39,093 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.5.encoder.layers.1.conv_module1.whiten, num_groups=1, num_channels=256, metric=3.15 vs. limit=15.0 2023-10-03 15:37:39,916 WARNING [train.py:1204] (2/4) Exclude cut with ID _16756_210_4915_1_1531047803763_7104259_199-325608-0_sp1.1 from training. Duration: 0.92725 2023-10-03 15:37:39,940 WARNING [train.py:1204] (2/4) Exclude cut with ID _31649_210_6228_1_1532584608063_4091078_443-124391-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 15:37:42,658 WARNING [train.py:1204] (2/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_146-218763-0 from training. Duration: 0.86 2023-10-03 15:37:42,718 WARNING [train.py:1204] (2/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_567-135114-0_sp1.1 from training. Duration: 0.7909375 2023-10-03 15:37:44,668 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_26326_210_7925_1_1533002493_3592406_462-288299-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 15:37:44,871 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.1.feed_forward2.hidden_balancer.prob, batch_count=1317926.6666666667, ans=0.125 2023-10-03 15:37:45,943 WARNING [train.py:1204] (2/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_322-217220-0 from training. Duration: 0.86 2023-10-03 15:37:45,989 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_238-256668-0_sp1.1 from training. Duration: 0.62725 2023-10-03 15:37:46,227 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.conv_skip_rate, batch_count=1317926.6666666667, ans=0.0 2023-10-03 15:37:47,352 WARNING [train.py:1204] (2/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_371-18169-0 from training. Duration: 0.86 2023-10-03 15:37:48,859 WARNING [train.py:1204] (2/4) Exclude cut with ID _58480_210_12392_1_1534811121111_3646160_23-74512-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 15:37:48,864 WARNING [train.py:1204] (2/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_784-213099-0_sp1.1 from training. Duration: 0.8454375 2023-10-03 15:37:48,953 WARNING [train.py:1204] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_625-102955-0_sp1.1 from training. Duration: 0.7 2023-10-03 15:37:50,248 INFO [train.py:1046] (2/4) Epoch 38, batch 1150, loss[loss=0.1646, simple_loss=0.2408, pruned_loss=0.04425, over 23856.00 frames. ], tot_loss[loss=0.1578, simple_loss=0.238, pruned_loss=0.03881, over 4734083.13 frames. ], batch size: 195, lr: 2.69e-03, grad_scale: 16.0 2023-10-03 15:37:53,106 WARNING [train.py:1204] (2/4) Exclude cut with ID _49748_210_24116_1_1533986971737_3158910_176-174327-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 15:37:55,808 WARNING [train.py:1204] (2/4) Exclude cut with ID _50310_210_15488_1_1534068043345_3641080_432-96140-0_sp1.1 from training. Duration: 0.936375 2023-10-03 15:37:57,174 WARNING [train.py:1204] (2/4) Exclude cut with ID _40402_210_18219_1_1533522324115_2953150_76-14910-0_sp1.1 from training. Duration: 0.92725 2023-10-03 15:37:57,208 WARNING [train.py:1204] (2/4) Exclude cut with ID _21783_210_11357_1_1531742458730_3762679_40-19458-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 15:37:57,242 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_14056_210_4163_1_1530604483_7572827_46-93547-0 from training. Duration: 0.95 2023-10-03 15:37:57,273 WARNING [train.py:1204] (2/4) Exclude cut with ID _21631_210_2253_1_1531907880778_3160339_321-35325-0_sp1.1 from training. Duration: 0.963625 2023-10-03 15:38:00,516 WARNING [train.py:1204] (2/4) Exclude cut with ID _4779_210_2084_1_1527318022944_3914893_500-285013-0 from training. Duration: 0.69 2023-10-03 15:38:00,828 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.conv_module2.balancer2.prob, batch_count=1317993.3333333333, ans=0.125 2023-10-03 15:38:02,427 WARNING [train.py:1204] (2/4) Exclude cut with ID _43552_210_8913_1_1533726031059_3523429_301-93324-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 15:38:02,433 WARNING [train.py:1204] (2/4) Exclude cut with ID _6473_210_1741_1_1527901307396_3815380_108-17335-0_sp1.1 from training. Duration: 0.5909375 2023-10-03 15:38:08,510 WARNING [train.py:1204] (2/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_206-10097-0 from training. Duration: 0.49 2023-10-03 15:38:10,025 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_36756_210_7234_1_1533026991_966276_103-77513-0_sp1.1 from training. Duration: 0.97275 2023-10-03 15:38:11,609 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.nonlin_attention.balancer.prob, batch_count=1318060.0, ans=0.125 2023-10-03 15:38:12,781 WARNING [train.py:1204] (2/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_662-18193-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 15:38:14,074 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_26433_210_12108_1_1532157158_4056480_439-52438-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 15:38:14,110 WARNING [train.py:1204] (2/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_464-33931-0 from training. Duration: 0.54 2023-10-03 15:38:14,123 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_3474_210_2028_1_1526188513_4825246_145-309131-0_sp0.9 from training. Duration: 0.82225 2023-10-03 15:38:14,140 WARNING [train.py:1204] (2/4) Exclude cut with ID _9679_210_7497_1_1529129212906_4363929_446-308321-0_sp1.1 from training. Duration: 0.963625 2023-10-03 15:38:18,538 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.688e+02 1.952e+02 2.181e+02 2.480e+02 4.023e+02, threshold=4.362e+02, percent-clipped=0.0 2023-10-03 15:38:18,710 WARNING [train.py:1204] (2/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_662-275621-0 from training. Duration: 0.42 2023-10-03 15:38:20,061 WARNING [train.py:1204] (2/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_344-337148-0_sp1.1 from training. Duration: 0.97275 2023-10-03 15:38:22,802 WARNING [train.py:1204] (2/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_759-179071-0_sp1.1 from training. Duration: 0.92725 2023-10-03 15:38:27,605 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.2.encoder.layers.2.nonlin_attention.whiten2, num_groups=1, num_channels=384, metric=4.90 vs. limit=15.0 2023-10-03 15:38:33,474 WARNING [train.py:1204] (2/4) Exclude cut with ID _28783_210_7943_1_1532671164808_7155479_293-12188-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 15:38:35,021 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.self_attn_weights.pos_emb_skip_rate, batch_count=1318193.3333333333, ans=0.0 2023-10-03 15:38:40,234 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_17613_210_2151_1_1531137569_2366160_235-320304-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 15:38:40,260 WARNING [train.py:1204] (2/4) Exclude cut with ID _14349_210_8341_1_1530615710818_3442470_170-336541-0 from training. Duration: 0.86 2023-10-03 15:38:42,184 WARNING [train.py:1204] (2/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_375-218677-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 15:38:43,343 WARNING [train.py:1204] (2/4) Exclude cut with ID _28317_210_12742_1_1532480302381_8510325_829-202956-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 15:38:47,642 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9029W0436-509823 from training. Duration: 0.768 2023-10-03 15:38:49,635 WARNING [train.py:1204] (2/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_231-112214-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 15:38:55,118 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9059W0470-524883 from training. Duration: 0.3415625 2023-10-03 15:38:59,221 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_43266_210_1794_1_1533521789_4497218_206-79512-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 15:39:00,599 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_294-163793-0_sp0.9 from training. Duration: 0.911125 2023-10-03 15:39:00,635 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6290_210_3273_1_1527595224_1282787_115-87095-0_sp1.1 from training. Duration: 0.5 2023-10-03 15:39:01,973 WARNING [train.py:1204] (2/4) Exclude cut with ID _14100_210_8341_1_1530529320613_3678069_279-55760-0_sp1.1 from training. Duration: 0.8454375 2023-10-03 15:39:03,907 INFO [train.py:1046] (2/4) Epoch 38, batch 1200, loss[loss=0.1288, simple_loss=0.2092, pruned_loss=0.02415, over 23579.00 frames. ], tot_loss[loss=0.1581, simple_loss=0.2381, pruned_loss=0.03909, over 4734161.38 frames. ], batch size: 52, lr: 2.69e-03, grad_scale: 32.0 2023-10-03 15:39:05,377 WARNING [train.py:1204] (2/4) Exclude cut with ID _14621_210_1925_1_1530686901185_3675420_419-234200-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 15:39:09,992 WARNING [train.py:1204] (2/4) Exclude cut with ID _60229_210_27767_1_1534838406158_3655350_353-213656-0_sp1.1 from training. Duration: 0.863625 2023-10-03 15:39:11,301 WARNING [train.py:1204] (2/4) Exclude cut with ID _48678_210_5663_1_1533988830947_3769199_306-255886-0_sp1.1 from training. Duration: 0.7 2023-10-03 15:39:13,374 WARNING [train.py:1204] (2/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_466-3492-0_sp1.1 from training. Duration: 0.97275 2023-10-03 15:39:13,374 WARNING [train.py:1204] (2/4) Exclude cut with ID _8755_210_1876_1_1528628559357_3258209_199-242167-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 15:39:13,436 WARNING [train.py:1204] (2/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_237-89684-0_sp1.1 from training. Duration: 0.87275 2023-10-03 15:39:16,130 WARNING [train.py:1204] (2/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_70-84121-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 15:39:16,255 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7544_210_6753_1_1528587722_3576686_172-171750-0_sp1.1 from training. Duration: 0.5909375 2023-10-03 15:39:18,981 WARNING [train.py:1204] (2/4) Exclude cut with ID _24271_210_12658_1_1531987270022_7190519_40-321822-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 15:39:19,008 WARNING [train.py:1204] (2/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_227-83612-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 15:39:20,331 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9156W0015-572508 from training. Duration: 0.896 2023-10-03 15:39:23,577 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_292-265928-0 from training. Duration: 0.51 2023-10-03 15:39:25,089 WARNING [train.py:1204] (2/4) Exclude cut with ID _14747_210_8341_1_1530685797463_3654089_147-131229-0_sp1.1 from training. Duration: 0.6909375 2023-10-03 15:39:27,774 WARNING [train.py:1204] (2/4) Exclude cut with ID _45297_210_2721_1_1533715351381_3264949_351-147136-0_sp0.9 from training. Duration: 0.9666875 2023-10-03 15:39:28,461 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.3.encoder.layers.1.self_attn2.whiten, num_groups=1, num_channels=512, metric=13.41 vs. limit=22.5 2023-10-03 15:39:29,289 WARNING [train.py:1204] (2/4) Exclude cut with ID _5250_210_5219_1_1527249561806_8692110_1102-324568-0_sp1.1 from training. Duration: 0.97275 2023-10-03 15:39:32,016 WARNING [train.py:1204] (2/4) Exclude cut with ID _18516_210_9206_1_1531704653206_3692406_465-75446-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 15:39:32,017 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9203W0126-595623 from training. Duration: 0.896 2023-10-03 15:39:32,093 WARNING [train.py:1204] (2/4) Exclude cut with ID _55383_210_10635_1_1534468426172_2231669_139-328410-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 15:39:40,732 WARNING [train.py:1204] (2/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_166-246557-0_sp0.9 from training. Duration: 0.711125 2023-10-03 15:39:40,737 WARNING [train.py:1204] (2/4) Exclude cut with ID _30039_210_13270_1_1532484612900_2725150_239-304872-0_sp1.1 from training. Duration: 0.963625 2023-10-03 15:39:41,840 WARNING [train.py:1204] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_6-226750-0 from training. Duration: 0.94 2023-10-03 15:39:43,152 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_135-5501-0_sp1.1 from training. Duration: 0.8909375 2023-10-03 15:39:44,484 WARNING [train.py:1204] (2/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_359-118926-0 from training. Duration: 0.72 2023-10-03 15:39:48,678 WARNING [train.py:1204] (2/4) Exclude cut with ID _8064_210_5399_1_1528372299291_4282973_4-91037-0 from training. Duration: 0.88 2023-10-03 15:39:48,695 WARNING [train.py:1204] (2/4) Exclude cut with ID _6653_210_2629_1_1527919172441_6732679_415-288492-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 15:39:50,011 WARNING [train.py:1204] (2/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_544-287179-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 15:39:51,859 WARNING [train.py:1204] (2/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_122-32606-0_sp1.1 from training. Duration: 0.92725 2023-10-03 15:39:53,110 WARNING [train.py:1204] (2/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_121-284152-0_sp1.1 from training. Duration: 0.536375 2023-10-03 15:39:54,474 WARNING [train.py:1204] (2/4) Exclude cut with ID _44718_210_5816_1_1534050051869_6469030_350-299477-0_sp1.1 from training. Duration: 0.97275 2023-10-03 15:39:54,492 WARNING [train.py:1204] (2/4) Exclude cut with ID _6653_210_2629_1_1527919172441_6732679_1103-56445-0_sp0.9 from training. Duration: 0.811125 2023-10-03 15:39:55,883 WARNING [train.py:1204] (2/4) Exclude cut with ID _12369_210_4381_1_1530356078581_4388520_64-298917-0_sp0.9 from training. Duration: 0.97775 2023-10-03 15:39:57,178 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_2245_210_2715_1_1525511548_3540254_310-52483-0 from training. Duration: 0.88 2023-10-03 15:39:57,218 WARNING [train.py:1204] (2/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_401-158239-0_sp1.1 from training. Duration: 0.6454375 2023-10-03 15:39:57,261 WARNING [train.py:1204] (2/4) Exclude cut with ID _59502_210_12210_1_1535107024698_1835139_146-162495-0_sp1.1 from training. Duration: 0.836375 2023-10-03 15:39:57,270 WARNING [train.py:1204] (2/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_41-31157-0_sp0.9 from training. Duration: 0.6333125 2023-10-03 15:39:59,976 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_68717_210_3528_1_1535628683_1556759_95-36722-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 15:39:59,996 WARNING [train.py:1204] (2/4) Exclude cut with ID _30196_210_1664_1_1533708496522_7074140_654-162611-0_sp1.1 from training. Duration: 0.92725 2023-10-03 15:40:00,222 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.conv_skip_rate, batch_count=1318526.6666666667, ans=0.0 2023-10-03 15:40:04,007 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_9302_210_2550_1_1528889962_4655416_638-301620-0_sp0.9 from training. Duration: 0.52225 2023-10-03 15:40:07,303 WARNING [train.py:1204] (2/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_478-162247-0_sp1.1 from training. Duration: 0.7454375 2023-10-03 15:40:08,740 WARNING [train.py:1204] (2/4) Exclude cut with ID _45297_210_2721_1_1533715351381_3264949_353-9541-0 from training. Duration: 0.97 2023-10-03 15:40:16,037 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9653W0190-675468 from training. Duration: 0.9935 2023-10-03 15:40:17,487 INFO [train.py:1046] (2/4) Epoch 38, batch 1250, loss[loss=0.1687, simple_loss=0.2576, pruned_loss=0.03992, over 24569.00 frames. ], tot_loss[loss=0.1594, simple_loss=0.2392, pruned_loss=0.03975, over 4732135.33 frames. ], batch size: 71, lr: 2.69e-03, grad_scale: 16.0 2023-10-03 15:40:17,584 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_49730_210_13049_1_1534075294_1830676_214-84436-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 15:40:17,794 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=1318660.0, ans=0.1 2023-10-03 15:40:19,317 WARNING [train.py:1204] (2/4) Exclude cut with ID _50093_210_4220_1_1534417141207_3432299_259-322252-0_sp1.1 from training. Duration: 0.836375 2023-10-03 15:40:20,632 WARNING [train.py:1204] (2/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_599-50220-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 15:40:22,659 WARNING [train.py:1204] (2/4) Exclude cut with ID _21878_210_3525_1_1531738772002_7264330_174-109016-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 15:40:25,413 WARNING [train.py:1204] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_665-196155-0 from training. Duration: 0.67 2023-10-03 15:40:29,600 WARNING [train.py:1204] (2/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_656-188361-0_sp1.1 from training. Duration: 0.7909375 2023-10-03 15:40:29,665 WARNING [train.py:1204] (2/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_60-18123-0_sp1.1 from training. Duration: 0.8 2023-10-03 15:40:30,960 WARNING [train.py:1204] (2/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_535-14410-0 from training. Duration: 0.47 2023-10-03 15:40:32,372 WARNING [train.py:1204] (2/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_94-126128-0_sp1.1 from training. Duration: 0.7818125 2023-10-03 15:40:33,729 WARNING [train.py:1204] (2/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_356-31216-0_sp1.1 from training. Duration: 0.6818125 2023-10-03 15:40:37,871 WARNING [train.py:1204] (2/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_443-30034-0_sp1.1 from training. Duration: 0.5818125 2023-10-03 15:40:39,345 WARNING [train.py:1204] (2/4) Exclude cut with ID _26907_210_7234_1_1532221740694_3205263_337-2494-0_sp1.1 from training. Duration: 0.8 2023-10-03 15:40:39,494 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.self_attn_weights.pos_emb_skip_rate, batch_count=1318726.6666666667, ans=0.0 2023-10-03 15:40:40,711 WARNING [train.py:1204] (2/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_49-97158-0_sp0.9 from training. Duration: 0.8444375 2023-10-03 15:40:40,728 WARNING [train.py:1204] (2/4) Exclude cut with ID _7877_210_6014_1_1528286358099_3750136_553-170128-0_sp1.1 from training. Duration: 0.963625 2023-10-03 15:40:42,692 WARNING [train.py:1204] (2/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_202-293523-0_sp0.9 from training. Duration: 0.8 2023-10-03 15:40:47,226 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.615e+02 1.889e+02 2.073e+02 2.333e+02 3.437e+02, threshold=4.145e+02, percent-clipped=0.0 2023-10-03 15:40:48,016 WARNING [train.py:1204] (2/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_188-171673-0_sp0.9 from training. Duration: 0.6444375 2023-10-03 15:40:48,031 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_120-41668-0_sp1.1 from training. Duration: 0.636375 2023-10-03 15:40:48,036 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_40223_210_6228_1_1533298404_4812267_613-78769-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 15:40:49,262 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_53861_210_5399_1_1534422156_4272077_150-336899-0_sp1.1 from training. Duration: 0.963625 2023-10-03 15:40:50,741 WARNING [train.py:1204] (2/4) Exclude cut with ID _14459_210_2228_1_1531443641946_7516700_1146-56843-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 15:40:52,705 WARNING [train.py:1204] (2/4) Exclude cut with ID _39867_210_9826_1_1533553222030_9344259_1194-299928-0_sp1.1 from training. Duration: 0.92725 2023-10-03 15:40:52,785 WARNING [train.py:1204] (2/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_214-291369-0_sp0.9 from training. Duration: 0.7 2023-10-03 15:40:53,024 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.ff2_skip_rate, batch_count=1318793.3333333333, ans=0.0 2023-10-03 15:40:58,145 WARNING [train.py:1204] (2/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_358-215558-0 from training. Duration: 0.93 2023-10-03 15:40:58,201 WARNING [train.py:1204] (2/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_577-287150-0_sp0.9 from training. Duration: 0.911125 2023-10-03 15:41:00,820 WARNING [train.py:1204] (2/4) Exclude cut with ID _60860_210_8254_1_1534896015875_3557169_229-187367-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 15:41:02,159 WARNING [train.py:1204] (2/4) Exclude cut with ID _6275_210_2084_1_1528110062213_7669049_857-298942-0 from training. Duration: 0.85 2023-10-03 15:41:02,203 WARNING [train.py:1204] (2/4) Exclude cut with ID _40137_210_12210_1_1533290484046_3795180_249-187059-0_sp1.1 from training. Duration: 0.8 2023-10-03 15:41:02,210 WARNING [train.py:1204] (2/4) Exclude cut with ID ID0171W0508-765603 from training. Duration: 0.541 2023-10-03 15:41:02,227 WARNING [train.py:1204] (2/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_521-219439-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 15:41:02,238 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_40223_210_6228_1_1533298404_4812267_741-302616-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 15:41:06,332 WARNING [train.py:1204] (2/4) Exclude cut with ID _65293_210_9070_1_1535545549765_4004210_438-154735-0_sp1.1 from training. Duration: 0.92725 2023-10-03 15:41:10,786 WARNING [train.py:1204] (2/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_564-132950-0_sp1.1 from training. Duration: 0.92725 2023-10-03 15:41:10,822 WARNING [train.py:1204] (2/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_291-270881-0_sp1.1 from training. Duration: 0.8818125 2023-10-03 15:41:13,542 WARNING [train.py:1204] (2/4) Exclude cut with ID _60229_210_27767_1_1534838406158_3655350_353-213656-0 from training. Duration: 0.95 2023-10-03 15:41:13,554 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_3133_210_2087_1_1526106446_2324690_243-143119-0 from training. Duration: 0.77 2023-10-03 15:41:13,578 WARNING [train.py:1204] (2/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_690-126306-0 from training. Duration: 0.78 2023-10-03 15:41:15,082 WARNING [train.py:1204] (2/4) Exclude cut with ID _30304_210_6319_1_1532476827261_7582060_347-95003-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 15:41:16,466 WARNING [train.py:1204] (2/4) Exclude cut with ID _2322_210_2667_1_1525604548971_3656299_440-80494-0 from training. Duration: 0.88 2023-10-03 15:41:16,481 WARNING [train.py:1204] (2/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_370-229434-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 15:41:18,478 WARNING [train.py:1204] (2/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_124-345840-0_sp1.1 from training. Duration: 0.47275 2023-10-03 15:41:18,486 WARNING [train.py:1204] (2/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_165-123581-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 15:41:20,485 WARNING [train.py:1204] (2/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_453-282588-0 from training. Duration: 0.98 2023-10-03 15:41:22,300 WARNING [train.py:1204] (2/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_299-34413-0_sp0.9 from training. Duration: 0.72225 2023-10-03 15:41:22,351 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_40036_210_13405_1_1533648647_1315388_48-146016-0_sp1.1 from training. Duration: 0.8454375 2023-10-03 15:41:22,380 WARNING [train.py:1204] (2/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_99-167392-0_sp0.9 from training. Duration: 0.67775 2023-10-03 15:41:23,805 WARNING [train.py:1204] (2/4) Exclude cut with ID _48677_210_5663_1_1533902405919_3892989_37-207114-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 15:41:25,042 WARNING [train.py:1204] (2/4) Exclude cut with ID _29857_210_20034_1_1532395886753_3798160_335-163562-0 from training. Duration: 0.85 2023-10-03 15:41:27,874 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_36492_210_7927_1_1533261987_4790834_703-343351-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 15:41:27,975 WARNING [train.py:1204] (2/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_80-147301-0_sp0.9 from training. Duration: 0.9333125 2023-10-03 15:41:29,278 WARNING [train.py:1204] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_218-295097-0_sp0.9 from training. Duration: 0.8555625 2023-10-03 15:41:32,045 INFO [train.py:1046] (2/4) Epoch 38, batch 1300, loss[loss=0.1728, simple_loss=0.2522, pruned_loss=0.04671, over 23326.00 frames. ], tot_loss[loss=0.1593, simple_loss=0.2393, pruned_loss=0.03963, over 4736144.76 frames. ], batch size: 93, lr: 2.69e-03, grad_scale: 16.0 2023-10-03 15:41:33,546 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_2295_210_2087_1_1525500993_2407863_116-238894-0_sp1.1 from training. Duration: 0.636375 2023-10-03 15:41:36,377 WARNING [train.py:1204] (2/4) Exclude cut with ID _3231_210_2106_1_1526205864934_6784220_697-171068-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 15:41:36,404 WARNING [train.py:1204] (2/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_102-77064-0 from training. Duration: 0.93 2023-10-03 15:41:39,987 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.conv_module2.balancer2.min_positive, batch_count=1318993.3333333333, ans=0.05 2023-10-03 15:41:41,117 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_13091_210_7925_1_1530609447_8987354_1186-261957-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 15:41:42,489 WARNING [train.py:1204] (2/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_91-277684-0_sp1.1 from training. Duration: 0.67275 2023-10-03 15:41:43,762 WARNING [train.py:1204] (2/4) Exclude cut with ID _31088_210_16446_1_1532520012284_3440919_159-313751-0_sp1.1 from training. Duration: 0.936375 2023-10-03 15:41:45,180 WARNING [train.py:1204] (2/4) Exclude cut with ID _42326_210_7770_1_1533814282531_3707269_227-203721-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 15:41:46,520 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_3531_210_3528_1_1526294203_5216456_309-227302-0_sp1.1 from training. Duration: 0.62725 2023-10-03 15:41:46,578 WARNING [train.py:1204] (2/4) Exclude cut with ID _14087_210_2253_1_1530529224512_3014029_226-242140-0 from training. Duration: 0.81 2023-10-03 15:41:51,973 WARNING [train.py:1204] (2/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_122-109023-0_sp1.1 from training. Duration: 0.6909375 2023-10-03 15:41:53,329 WARNING [train.py:1204] (2/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_660-211088-0_sp0.9 from training. Duration: 0.688875 2023-10-03 15:41:54,683 WARNING [train.py:1204] (2/4) Exclude cut with ID _39290_210_4220_1_1533862778760_3075118_92-311600-0 from training. Duration: 0.88 2023-10-03 15:41:57,377 WARNING [train.py:1204] (2/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_390-339137-0_sp1.1 from training. Duration: 0.5090625 2023-10-03 15:42:01,496 WARNING [train.py:1204] (2/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_449-341946-0_sp1.1 from training. Duration: 0.963625 2023-10-03 15:42:01,568 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_68095_210_20185_1_1535718226_8888855_884-327007-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 15:42:03,015 WARNING [train.py:1204] (2/4) Exclude cut with ID _42326_210_7770_1_1533814282531_3707269_308-222998-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 15:42:04,285 WARNING [train.py:1204] (2/4) Exclude cut with ID _40399_210_4353_1_1533305658194_3279406_344-238708-0_sp1.1 from training. Duration: 0.963625 2023-10-03 15:42:04,362 WARNING [train.py:1204] (2/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_87-131427-0_sp1.1 from training. Duration: 0.7090625 2023-10-03 15:42:05,693 WARNING [train.py:1204] (2/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_110-130961-0_sp0.9 from training. Duration: 0.52225 2023-10-03 15:42:07,583 WARNING [train.py:1204] (2/4) Exclude cut with ID _55153_210_2253_1_1534384786677_3162096_148-79022-0 from training. Duration: 0.96 2023-10-03 15:42:09,252 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.ff3_skip_rate, batch_count=1319126.6666666667, ans=0.0 2023-10-03 15:42:11,872 WARNING [train.py:1204] (2/4) Exclude cut with ID _9679_210_7497_1_1529129212906_4363929_512-313889-0_sp1.1 from training. Duration: 0.736375 2023-10-03 15:42:13,150 WARNING [train.py:1204] (2/4) Exclude cut with ID _7970_210_1883_1_1528455616296_3472230_25-178407-0_sp1.1 from training. Duration: 0.6454375 2023-10-03 15:42:14,537 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_3133_210_2087_1_1526106446_2324690_246-125000-0 from training. Duration: 0.62 2023-10-03 15:42:14,598 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_15126_210_6753_1_1530778677_2591185_113-157651-0_sp0.9 from training. Duration: 0.7555625 2023-10-03 15:42:17,131 WARNING [train.py:1204] (2/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_105-63309-0_sp1.1 from training. Duration: 0.8909375 2023-10-03 15:42:19,177 WARNING [train.py:1204] (2/4) Exclude cut with ID _29877_210_6228_1_1532397791106_5276149_578-177271-0_sp1.1 from training. Duration: 0.936375 2023-10-03 15:42:20,297 WARNING [train.py:1204] (2/4) Exclude cut with ID _44831_210_13427_1_1533816011549_3689035_32-188147-0 from training. Duration: 0.96 2023-10-03 15:42:20,347 WARNING [train.py:1204] (2/4) Exclude cut with ID _12369_210_4381_1_1530356078581_4388520_300-254337-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 15:42:20,359 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_128-241008-0 from training. Duration: 0.94 2023-10-03 15:42:22,332 WARNING [train.py:1204] (2/4) Exclude cut with ID _21878_210_3525_1_1531738772002_7264330_53-279178-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 15:42:26,510 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_41452_210_7291_1_1533437687_3941281_273-334088-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 15:42:26,512 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_128-241008-0_sp1.1 from training. Duration: 0.8545625 2023-10-03 15:42:27,138 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.3.encoder.layers.2.feed_forward1.out_whiten, num_groups=1, num_channels=512, metric=6.88 vs. limit=15.0 2023-10-03 15:42:29,322 WARNING [train.py:1204] (2/4) Exclude cut with ID _8394_210_6702_1_1528590504316_3540189_379-49457-0 from training. Duration: 0.87 2023-10-03 15:42:30,726 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_105-114797-0 from training. Duration: 0.84 2023-10-03 15:42:32,046 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_14928_210_5632_1_1530773461_4714000_454-112198-0 from training. Duration: 0.66 2023-10-03 15:42:33,655 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.feed_forward2.hidden_balancer.prob, batch_count=1319260.0, ans=0.125 2023-10-03 15:42:36,746 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_456-22141-0_sp1.1 from training. Duration: 0.8 2023-10-03 15:42:39,530 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_115-233929-0 from training. Duration: 0.72 2023-10-03 15:42:39,636 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_616-16795-0_sp1.1 from training. Duration: 0.963625 2023-10-03 15:42:45,081 INFO [train.py:1046] (2/4) Epoch 38, batch 1350, loss[loss=0.1611, simple_loss=0.2333, pruned_loss=0.04446, over 23709.00 frames. ], tot_loss[loss=0.1587, simple_loss=0.2384, pruned_loss=0.03951, over 4732418.47 frames. ], batch size: 164, lr: 2.69e-03, grad_scale: 8.0 2023-10-03 15:42:46,556 WARNING [train.py:1204] (2/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_448-212135-0 from training. Duration: 0.91 2023-10-03 15:42:50,433 WARNING [train.py:1204] (2/4) Exclude cut with ID _46683_210_9642_1_1533730913492_4591116_201-205677-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 15:42:51,793 WARNING [train.py:1204] (2/4) Exclude cut with ID _64086_210_7994_1_1535270417019_7435949_43-80483-0_sp1.1 from training. Duration: 0.97275 2023-10-03 15:42:53,322 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=1319326.6666666667, ans=0.1 2023-10-03 15:42:54,519 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_240-134124-0_sp1.1 from training. Duration: 0.963625 2023-10-03 15:42:54,542 WARNING [train.py:1204] (2/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_907-120940-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 15:42:55,923 WARNING [train.py:1204] (2/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_848-280957-0_sp1.1 from training. Duration: 0.7909375 2023-10-03 15:42:55,970 WARNING [train.py:1204] (2/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_243-42116-0_sp0.9 from training. Duration: 0.811125 2023-10-03 15:43:00,135 WARNING [train.py:1204] (2/4) Exclude cut with ID _6694_210_4599_1_1527852637490_3965529_116-109398-0_sp0.9 from training. Duration: 0.811125 2023-10-03 15:43:01,551 WARNING [train.py:1204] (2/4) Exclude cut with ID _24314_210_4915_1_1532135078116_7170179_521-258868-0 from training. Duration: 0.79 2023-10-03 15:43:03,160 WARNING [train.py:1204] (2/4) Exclude cut with ID _2220_210_1819_1_1525509109898_3658680_50-99837-0_sp0.9 from training. Duration: 0.6 2023-10-03 15:43:03,203 WARNING [train.py:1204] (2/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_313-134930-0_sp1.1 from training. Duration: 0.8818125 2023-10-03 15:43:06,584 WARNING [train.py:1204] (2/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_125-144174-0 from training. Duration: 0.39 2023-10-03 15:43:07,927 WARNING [train.py:1204] (2/4) Exclude cut with ID _59288_210_5854_1_1535077757774_3412246_275-293897-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 15:43:09,277 WARNING [train.py:1204] (2/4) Exclude cut with ID _40519_210_13794_1_1533715228655_7337390_722-52880-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 15:43:09,298 WARNING [train.py:1204] (2/4) Exclude cut with ID _7537_210_2084_1_1528502395431_3924359_128-268029-0 from training. Duration: 0.98 2023-10-03 15:43:11,915 WARNING [train.py:1204] (2/4) Exclude cut with ID _8021_210_7994_1_1528376451358_3653069_179-45206-0 from training. Duration: 0.86 2023-10-03 15:43:13,461 WARNING [train.py:1204] (2/4) Exclude cut with ID _13252_210_1925_1_1530671372047_3336909_88-276668-0 from training. Duration: 0.99 2023-10-03 15:43:14,775 WARNING [train.py:1204] (2/4) Exclude cut with ID _22613_210_5219_1_1532253599686_4090170_290-212862-0_sp1.1 from training. Duration: 0.97275 2023-10-03 15:43:14,779 WARNING [train.py:1204] (2/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_465-214726-0 from training. Duration: 0.86 2023-10-03 15:43:16,096 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.487e+02 1.825e+02 1.970e+02 2.155e+02 2.948e+02, threshold=3.940e+02, percent-clipped=0.0 2023-10-03 15:43:20,682 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.conv_module1.balancer2.prob, batch_count=1319460.0, ans=0.125 2023-10-03 15:43:21,953 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.bypass_mid.scale_min, batch_count=1319460.0, ans=0.2 2023-10-03 15:43:23,277 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.feed_forward1.hidden_balancer.prob, batch_count=1319460.0, ans=0.125 2023-10-03 15:43:27,386 WARNING [train.py:1204] (2/4) Exclude cut with ID _56480_210_4220_1_1534679942079_3075409_138-36031-0_sp1.1 from training. Duration: 0.97275 2023-10-03 15:43:29,098 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.self_attn_weights.pos_emb_skip_rate, batch_count=1319526.6666666667, ans=0.0 2023-10-03 15:43:34,445 WARNING [train.py:1204] (2/4) Exclude cut with ID _58605_210_12210_1_1534723195939_3904259_300-285878-0_sp1.1 from training. Duration: 0.97275 2023-10-03 15:43:34,478 WARNING [train.py:1204] (2/4) Exclude cut with ID _40901_210_5665_1_1533363789958_3856130_114-340229-0_sp1.1 from training. Duration: 0.936375 2023-10-03 15:43:36,231 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_489-230703-0 from training. Duration: 0.85 2023-10-03 15:43:39,023 WARNING [train.py:1204] (2/4) Exclude cut with ID _66873_210_5517_1_1535454136506_4293370_322-81243-0_sp1.1 from training. Duration: 0.936375 2023-10-03 15:43:40,360 WARNING [train.py:1204] (2/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_271-269084-0 from training. Duration: 0.83 2023-10-03 15:43:40,379 WARNING [train.py:1204] (2/4) Exclude cut with ID _13252_210_1925_1_1530671372047_3336909_166-314607-0_sp0.9 from training. Duration: 0.6 2023-10-03 15:43:40,577 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.1.feed_forward1.out_proj.dropout_p, batch_count=1319526.6666666667, ans=0.1 2023-10-03 15:43:41,719 WARNING [train.py:1204] (2/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_340-343302-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 15:43:43,172 WARNING [train.py:1204] (2/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_286-112015-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 15:43:44,651 WARNING [train.py:1204] (2/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_328-264776-0 from training. Duration: 0.67 2023-10-03 15:43:44,850 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.self_attn_weights.pos_emb_skip_rate, batch_count=1319593.3333333333, ans=0.0 2023-10-03 15:43:46,144 WARNING [train.py:1204] (2/4) Exclude cut with ID _4866_210_2577_1_1527404412284_4364659_256-306830-0_sp0.9 from training. Duration: 0.9666875 2023-10-03 15:43:47,097 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.1.encoder.layers.0.self_attn_weights.whiten_keys, num_groups=4, num_channels=128, metric=3.77 vs. limit=6.0 2023-10-03 15:43:52,631 WARNING [train.py:1204] (2/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_239-316802-0 from training. Duration: 0.69 2023-10-03 15:43:54,038 WARNING [train.py:1204] (2/4) Exclude cut with ID _29857_210_20034_1_1532395886753_3798160_279-185801-0 from training. Duration: 0.91 2023-10-03 15:43:59,516 INFO [train.py:1046] (2/4) Epoch 38, batch 1400, loss[loss=0.1532, simple_loss=0.2337, pruned_loss=0.03631, over 23312.00 frames. ], tot_loss[loss=0.1578, simple_loss=0.2374, pruned_loss=0.03915, over 4732064.43 frames. ], batch size: 93, lr: 2.69e-03, grad_scale: 8.0 2023-10-03 15:43:59,573 WARNING [train.py:1204] (2/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_656-188361-0 from training. Duration: 0.87 2023-10-03 15:44:00,983 WARNING [train.py:1204] (2/4) Exclude cut with ID _44718_210_5816_1_1534050051869_6469030_100-230287-0_sp1.1 from training. Duration: 0.936375 2023-10-03 15:44:03,877 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8275_210_4198_1_1528455320_3892571_276-215622-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 15:44:05,872 WARNING [train.py:1204] (2/4) Exclude cut with ID _30304_210_6319_1_1532476827261_7582060_698-309430-0_sp1.1 from training. Duration: 0.963625 2023-10-03 15:44:10,088 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_365-217119-0 from training. Duration: 0.63 2023-10-03 15:44:10,197 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7264_210_4938_1_1527939911_8132095_800-91995-0 from training. Duration: 0.86 2023-10-03 15:44:21,917 WARNING [train.py:1204] (2/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_49-97158-0_sp1.1 from training. Duration: 0.6909375 2023-10-03 15:44:25,084 WARNING [train.py:1204] (2/4) Exclude cut with ID _61132_210_9969_1_1535021792964_3755419_61-57720-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 15:44:27,838 WARNING [train.py:1204] (2/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_345-306832-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 15:44:27,863 WARNING [train.py:1204] (2/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_164-224052-0_sp0.9 from training. Duration: 0.77775 2023-10-03 15:44:29,305 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_34886_210_10633_1_1533203882_1755661_76-156096-0_sp1.1 from training. Duration: 0.7909375 2023-10-03 15:44:30,632 WARNING [train.py:1204] (2/4) Exclude cut with ID 4133-6541-0027-40495-0_sp1.1 from training. Duration: 0.9681875 2023-10-03 15:44:40,032 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.4.encoder.layers.2.nonlin_attention.whiten2, num_groups=1, num_channels=384, metric=3.36 vs. limit=15.0 2023-10-03 15:44:40,783 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_514-211165-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 15:44:40,838 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_41211_210_2544_1_1533348859_3702910_97-212695-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 15:44:44,931 WARNING [train.py:1204] (2/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_406-45086-0 from training. Duration: 0.8 2023-10-03 15:44:44,987 WARNING [train.py:1204] (2/4) Exclude cut with ID _65263_210_22356_1_1535763598691_3921037_49-222917-0_sp1.1 from training. Duration: 0.836375 2023-10-03 15:44:45,049 WARNING [train.py:1204] (2/4) Exclude cut with ID _4866_210_2577_1_1527404412284_4364659_352-157930-0_sp1.1 from training. Duration: 0.736375 2023-10-03 15:44:46,966 WARNING [train.py:1204] (2/4) Exclude cut with ID _4366_210_1388_1_1526792053633_3613819_231-211402-0_sp1.1 from training. Duration: 0.8545625 2023-10-03 15:44:47,008 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_98-21576-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 15:44:48,437 WARNING [train.py:1204] (2/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_348-127789-0_sp0.9 from training. Duration: 0.9444375 2023-10-03 15:44:48,460 WARNING [train.py:1204] (2/4) Exclude cut with ID _45871_210_5665_1_1533864614245_3296060_98-329557-0_sp1.1 from training. Duration: 0.97275 2023-10-03 15:44:48,684 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.balancer1.prob, batch_count=1319860.0, ans=0.125 2023-10-03 15:44:49,772 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6846_210_6753_1_1527850767_4295158_2-315073-0_sp1.1 from training. Duration: 0.8 2023-10-03 15:44:51,096 WARNING [train.py:1204] (2/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_145-137790-0 from training. Duration: 0.83 2023-10-03 15:44:51,111 WARNING [train.py:1204] (2/4) Exclude cut with ID _14603_210_4220_1_1530615688441_3097319_133-158308-0_sp0.9 from training. Duration: 0.9333125 2023-10-03 15:44:56,398 WARNING [train.py:1204] (2/4) Exclude cut with ID _14898_210_9206_1_1530766812377_4177190_320-11758-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 15:44:59,284 WARNING [train.py:1204] (2/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_406-45086-0_sp0.9 from training. Duration: 0.888875 2023-10-03 15:45:06,139 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_5438_210_1794_1_1527336687_2600009_104-288274-0 from training. Duration: 0.58 2023-10-03 15:45:06,216 WARNING [train.py:1204] (2/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_108-53929-0_sp0.9 from training. Duration: 0.6555625 2023-10-03 15:45:07,518 WARNING [train.py:1204] (2/4) Exclude cut with ID _30800_210_22382_1_1532499347904_4515959_444-286390-0_sp1.1 from training. Duration: 0.963625 2023-10-03 15:45:10,467 WARNING [train.py:1204] (2/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_125-144174-0_sp0.9 from training. Duration: 0.4333125 2023-10-03 15:45:10,534 WARNING [train.py:1204] (2/4) Exclude cut with ID _55209_210_1531_1_1534675828996_4597160_118-345621-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 15:45:12,001 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_45886_210_17024_1_1533709215_471063_7-79165-0_sp1.1 from training. Duration: 0.8909375 2023-10-03 15:45:13,313 INFO [train.py:1046] (2/4) Epoch 38, batch 1450, loss[loss=0.1453, simple_loss=0.2254, pruned_loss=0.0326, over 24286.00 frames. ], tot_loss[loss=0.1575, simple_loss=0.2367, pruned_loss=0.03909, over 4727572.79 frames. ], batch size: 61, lr: 2.69e-03, grad_scale: 8.0 2023-10-03 15:45:15,162 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=1319993.3333333333, ans=0.1 2023-10-03 15:45:18,162 WARNING [train.py:1204] (2/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_175-163894-0_sp0.9 from training. Duration: 0.82225 2023-10-03 15:45:19,612 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_50188_210_4198_1_1534503621_3646041_353-81682-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 15:45:19,617 WARNING [train.py:1204] (2/4) Exclude cut with ID _17875_210_2087_1_1531224012907_1821339_175-80565-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 15:45:19,636 WARNING [train.py:1204] (2/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_159-328157-0_sp1.1 from training. Duration: 0.47275 2023-10-03 15:45:20,276 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.3.encoder.layers.1.conv_module2.whiten, num_groups=1, num_channels=512, metric=3.93 vs. limit=15.0 2023-10-03 15:45:24,564 WARNING [train.py:1204] (2/4) Exclude cut with ID _60057_210_10640_1_1534903065793_3333150_137-267579-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 15:45:26,325 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_92-46009-0_sp1.1 from training. Duration: 0.4909375 2023-10-03 15:45:29,126 WARNING [train.py:1204] (2/4) Exclude cut with ID _53203_210_2087_1_1534246218309_475590_48-68507-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 15:45:29,142 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_28211_210_6388_1_1532861506_4523224_128-328162-0 from training. Duration: 0.9 2023-10-03 15:45:29,205 WARNING [train.py:1204] (2/4) Exclude cut with ID _4697_210_2069_1_1526815801953_3823098_51-1049-0_sp1.1 from training. Duration: 0.6818125 2023-10-03 15:45:30,536 WARNING [train.py:1204] (2/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_383-316000-0 from training. Duration: 0.9 2023-10-03 15:45:30,592 WARNING [train.py:1204] (2/4) Exclude cut with ID _4779_210_2084_1_1527318022944_3914893_185-295153-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 15:45:31,977 WARNING [train.py:1204] (2/4) Exclude cut with ID _3231_210_2106_1_1526205864934_6784220_171-256004-0_sp0.9 from training. Duration: 0.9555625 2023-10-03 15:45:31,977 WARNING [train.py:1204] (2/4) Exclude cut with ID _41314_210_12651_1_1533459476543_3766460_87-272068-0 from training. Duration: 0.83 2023-10-03 15:45:33,392 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_164-152777-0_sp1.1 from training. Duration: 0.92725 2023-10-03 15:45:33,427 WARNING [train.py:1204] (2/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_18-130336-0_sp0.9 from training. Duration: 0.87775 2023-10-03 15:45:34,695 WARNING [train.py:1204] (2/4) Exclude cut with ID _14886_210_4220_1_1530702020355_3516190_175-229073-0_sp1.1 from training. Duration: 0.4090625 2023-10-03 15:45:34,715 WARNING [train.py:1204] (2/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_791-45149-0_sp0.9 from training. Duration: 0.9555625 2023-10-03 15:45:34,797 WARNING [train.py:1204] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_468-189058-0_sp1.1 from training. Duration: 0.87275 2023-10-03 15:45:36,185 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8278_210_2550_1_1528637099_4220413_32-142780-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 15:45:37,629 WARNING [train.py:1204] (2/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_146-218763-0_sp0.9 from training. Duration: 0.9555625 2023-10-03 15:45:40,941 WARNING [train.py:1204] (2/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_348-127789-0_sp1.1 from training. Duration: 0.77275 2023-10-03 15:45:40,946 WARNING [train.py:1204] (2/4) Exclude cut with ID _47790_210_16187_1_1533862814066_4207519_288-285445-0_sp1.1 from training. Duration: 0.936375 2023-10-03 15:45:42,302 WARNING [train.py:1204] (2/4) Exclude cut with ID _14603_210_4220_1_1530615688441_3097319_308-181732-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 15:45:43,570 WARNING [train.py:1204] (2/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_895-117428-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 15:45:45,229 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.529e+02 1.843e+02 2.021e+02 2.311e+02 3.468e+02, threshold=4.043e+02, percent-clipped=0.0 2023-10-03 15:45:45,409 WARNING [train.py:1204] (2/4) Exclude cut with ID _9235_210_5219_1_1528891154253_8046120_387-159008-0_sp0.9 from training. Duration: 0.9555625 2023-10-03 15:45:45,422 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_37351_210_7925_1_1533012931_3812113_265-85780-0_sp0.9 from training. Duration: 0.97775 2023-10-03 15:45:46,752 WARNING [train.py:1204] (2/4) Exclude cut with ID _40551_210_10635_1_1533776424632_3416179_361-3964-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 15:45:46,791 WARNING [train.py:1204] (2/4) Exclude cut with ID _25882_210_3036_1_1532775665815_6479510_485-122742-0_sp1.1 from training. Duration: 0.97275 2023-10-03 15:45:49,546 WARNING [train.py:1204] (2/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_98-51676-0 from training. Duration: 0.73 2023-10-03 15:45:52,721 WARNING [train.py:1204] (2/4) Exclude cut with ID _30548_210_16040_1_1532608097130_3146969_371-280610-0_sp1.1 from training. Duration: 0.92725 2023-10-03 15:45:57,326 WARNING [train.py:1204] (2/4) Exclude cut with ID IC0671W0081-331924 from training. Duration: 0.896 2023-10-03 15:45:58,698 WARNING [train.py:1204] (2/4) Exclude cut with ID _40284_210_2084_1_1533513679814_3704279_380-160775-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 15:45:58,790 WARNING [train.py:1204] (2/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_57-270037-0_sp1.1 from training. Duration: 0.7 2023-10-03 15:46:00,126 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_853-150237-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 15:46:00,291 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=1320193.3333333333, ans=0.1 2023-10-03 15:46:02,810 WARNING [train.py:1204] (2/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_491-261923-0 from training. Duration: 0.98 2023-10-03 15:46:06,915 WARNING [train.py:1204] (2/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_836-244045-0_sp1.1 from training. Duration: 0.97275 2023-10-03 15:46:08,300 WARNING [train.py:1204] (2/4) Exclude cut with ID _14880_210_8895_1_1530680426655_3795259_289-212817-0 from training. Duration: 0.69 2023-10-03 15:46:09,704 WARNING [train.py:1204] (2/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_303-340446-0 from training. Duration: 0.9 2023-10-03 15:46:11,031 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_37152_210_9697_1_1533106573_3844188_418-41736-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 15:46:14,326 WARNING [train.py:1204] (2/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_371-18169-0_sp1.1 from training. Duration: 0.7818125 2023-10-03 15:46:14,356 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_187-219984-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 15:46:17,610 WARNING [train.py:1204] (2/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_63-267585-0 from training. Duration: 0.9 2023-10-03 15:46:19,034 WARNING [train.py:1204] (2/4) Exclude cut with ID _27013_210_1389_1_1532437173240_3252649_222-125573-0 from training. Duration: 0.98 2023-10-03 15:46:19,088 WARNING [train.py:1204] (2/4) Exclude cut with ID _14821_210_8750_1_1530680503991_6829359_189-297451-0 from training. Duration: 0.55 2023-10-03 15:46:20,449 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_557-163391-0_sp1.1 from training. Duration: 0.97275 2023-10-03 15:46:20,538 WARNING [train.py:1204] (2/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_65-88242-0_sp1.1 from training. Duration: 0.5090625 2023-10-03 15:46:20,720 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.ff3_skip_rate, batch_count=1320260.0, ans=0.0 2023-10-03 15:46:28,091 INFO [train.py:1046] (2/4) Epoch 38, batch 1500, loss[loss=0.167, simple_loss=0.2514, pruned_loss=0.04135, over 23972.00 frames. ], tot_loss[loss=0.158, simple_loss=0.2376, pruned_loss=0.03917, over 4734091.62 frames. ], batch size: 86, lr: 2.69e-03, grad_scale: 8.0 2023-10-03 15:46:31,229 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.attention_skip_rate, batch_count=1320326.6666666667, ans=0.0 2023-10-03 15:46:32,444 WARNING [train.py:1204] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_218-295097-0 from training. Duration: 0.77 2023-10-03 15:46:32,485 WARNING [train.py:1204] (2/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_686-247600-0_sp1.1 from training. Duration: 0.836375 2023-10-03 15:46:32,488 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_397-147196-0_sp0.9 from training. Duration: 0.888875 2023-10-03 15:46:35,007 WARNING [train.py:1204] (2/4) Exclude cut with ID _53950_210_5816_1_1534417264919_3862951_237-136449-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 15:46:35,074 WARNING [train.py:1204] (2/4) Exclude cut with ID _3120_210_3525_1_1526036143112_4072980_74-178400-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 15:46:36,325 WARNING [train.py:1204] (2/4) Exclude cut with ID _5166_210_2084_1_1527073197160_4087993_309-222545-0_sp0.9 from training. Duration: 0.9333125 2023-10-03 15:46:36,603 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.nonlin_attention.balancer.prob, batch_count=1320326.6666666667, ans=0.125 2023-10-03 15:46:37,749 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7544_210_6753_1_1528587722_3576686_172-171750-0 from training. Duration: 0.65 2023-10-03 15:46:39,130 WARNING [train.py:1204] (2/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_3-237434-0_sp0.9 from training. Duration: 0.8444375 2023-10-03 15:46:39,160 WARNING [train.py:1204] (2/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_101-293367-0_sp0.9 from training. Duration: 0.7 2023-10-03 15:46:39,167 WARNING [train.py:1204] (2/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_818-213552-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 15:46:40,533 WARNING [train.py:1204] (2/4) Exclude cut with ID _14349_210_8341_1_1530615710818_3442470_115-285975-0_sp1.1 from training. Duration: 0.7818125 2023-10-03 15:46:41,980 WARNING [train.py:1204] (2/4) Exclude cut with ID _30800_210_22382_1_1532499347904_4515959_291-309749-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 15:46:43,249 WARNING [train.py:1204] (2/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_715-3462-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 15:46:49,071 WARNING [train.py:1204] (2/4) Exclude cut with ID _59502_210_12210_1_1535107024698_1835139_13-331827-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 15:46:49,081 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_4528_210_3273_1_1527076654_3276777_342-168068-0 from training. Duration: 0.87 2023-10-03 15:46:49,147 WARNING [train.py:1204] (2/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_215-141059-0_sp0.9 from training. Duration: 0.688875 2023-10-03 15:46:50,424 WARNING [train.py:1204] (2/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_106-286130-0_sp1.1 from training. Duration: 0.8909375 2023-10-03 15:46:50,510 WARNING [train.py:1204] (2/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_480-77655-0_sp1.1 from training. Duration: 0.97275 2023-10-03 15:46:54,730 WARNING [train.py:1204] (2/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_848-280957-0 from training. Duration: 0.87 2023-10-03 15:46:59,992 WARNING [train.py:1204] (2/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_122-109023-0 from training. Duration: 0.76 2023-10-03 15:47:01,379 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_11305_210_1794_1_1529750428_7178148_645-233480-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 15:47:02,653 WARNING [train.py:1204] (2/4) Exclude cut with ID _7562_210_2084_1_1528887767916_3946229_345-169156-0 from training. Duration: 0.58 2023-10-03 15:47:04,102 WARNING [train.py:1204] (2/4) Exclude cut with ID _7562_210_2084_1_1528887767916_3946229_345-169156-0_sp1.1 from training. Duration: 0.52725 2023-10-03 15:47:05,548 WARNING [train.py:1204] (2/4) Exclude cut with ID _13253_210_1925_1_1530757775819_3577873_253-246386-0_sp1.1 from training. Duration: 0.8181875 2023-10-03 15:47:05,611 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_136-264444-0_sp1.1 from training. Duration: 0.97275 2023-10-03 15:47:06,915 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_2168_210_2290_1_1525606920_1849400_136-222991-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 15:47:07,026 WARNING [train.py:1204] (2/4) Exclude cut with ID _7852_210_5219_1_1528286471316_8139400_242-105336-0 from training. Duration: 0.72 2023-10-03 15:47:08,333 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_5960_210_1794_1_1527403865_3990999_515-2613-0_sp1.1 from training. Duration: 0.8 2023-10-03 15:47:08,362 WARNING [train.py:1204] (2/4) Exclude cut with ID _39688_210_19596_1_1533371626725_4154159_551-167834-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 15:47:08,609 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.feed_forward2.hidden_balancer.prob, batch_count=1320460.0, ans=0.125 2023-10-03 15:47:09,688 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_43802_210_2290_1_1533516203_8829504_85-195240-0 from training. Duration: 0.87 2023-10-03 15:47:09,743 WARNING [train.py:1204] (2/4) Exclude cut with ID _63171_210_15488_1_1535104792794_3295110_113-113534-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 15:47:10,518 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.3.encoder.layers.1.nonlin_attention.whiten1, num_groups=1, num_channels=384, metric=4.48 vs. limit=10.0 2023-10-03 15:47:13,933 WARNING [train.py:1204] (2/4) Exclude cut with ID _8833_210_5219_1_1528592665210_7553059_319-349181-0_sp1.1 from training. Duration: 0.9 2023-10-03 15:47:13,934 WARNING [train.py:1204] (2/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_225-43381-0 from training. Duration: 0.94 2023-10-03 15:47:19,196 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_5959_210_1794_1_1527400400_3445726_412-44052-0_sp0.9 from training. Duration: 0.8555625 2023-10-03 15:47:20,625 WARNING [train.py:1204] (2/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_356-167816-0_sp1.1 from training. Duration: 0.6545625 2023-10-03 15:47:26,006 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9012W0112-501244 from training. Duration: 0.768 2023-10-03 15:47:26,150 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.attention_skip_rate, batch_count=1320593.3333333333, ans=0.0 2023-10-03 15:47:28,458 WARNING [train.py:1204] (2/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_177-326952-0_sp1.1 from training. Duration: 0.92725 2023-10-03 15:47:28,462 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9016W0116-502789 from training. Duration: 0.896 2023-10-03 15:47:29,690 WARNING [train.py:1204] (2/4) Exclude cut with ID _56235_210_10640_1_1534557335235_4161099_557-204815-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 15:47:31,076 WARNING [train.py:1204] (2/4) Exclude cut with ID _55857_210_2346_1_1534644007831_6898209_279-229469-0_sp1.1 from training. Duration: 0.936375 2023-10-03 15:47:31,141 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9029W0375-509764 from training. Duration: 0.896 2023-10-03 15:47:31,262 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.1.attention_skip_rate, batch_count=1320593.3333333333, ans=0.0 2023-10-03 15:47:32,496 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_2295_210_2087_1_1525500993_2407863_234-83104-0_sp0.9 from training. Duration: 0.688875 2023-10-03 15:47:35,346 WARNING [train.py:1204] (2/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_266-137104-0 from training. Duration: 0.65 2023-10-03 15:47:36,804 WARNING [train.py:1204] (2/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_289-163927-0_sp1.1 from training. Duration: 0.92725 2023-10-03 15:47:39,485 WARNING [train.py:1204] (2/4) Exclude cut with ID _12733_210_8341_1_1530167424556_3646380_86-225314-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 15:47:40,821 INFO [train.py:1046] (2/4) Epoch 38, batch 1550, loss[loss=0.1681, simple_loss=0.2588, pruned_loss=0.03869, over 24310.00 frames. ], tot_loss[loss=0.1583, simple_loss=0.2383, pruned_loss=0.03913, over 4731158.90 frames. ], batch size: 77, lr: 2.69e-03, grad_scale: 8.0 2023-10-03 15:47:40,865 WARNING [train.py:1204] (2/4) Exclude cut with ID _7877_210_6014_1_1528286358099_3750136_618-284064-0_sp1.1 from training. Duration: 0.92725 2023-10-03 15:47:40,914 WARNING [train.py:1204] (2/4) Exclude cut with ID _55844_210_2346_1_1534471212569_7625707_1017-31469-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 15:47:40,954 WARNING [train.py:1204] (2/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_617-241446-0_sp1.1 from training. Duration: 0.92725 2023-10-03 15:47:41,006 WARNING [train.py:1204] (2/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_866-173440-0_sp1.1 from training. Duration: 0.8181875 2023-10-03 15:47:43,599 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_9460_210_4211_1_1528954587_10049667_480-260269-0 from training. Duration: 0.56 2023-10-03 15:47:43,636 WARNING [train.py:1204] (2/4) Exclude cut with ID _44659_210_2721_1_1534036120061_6614860_626-298884-0 from training. Duration: 0.97 2023-10-03 15:47:43,677 WARNING [train.py:1204] (2/4) Exclude cut with ID _53565_210_10635_1_1534337218335_3820259_287-18120-0_sp1.1 from training. Duration: 0.8818125 2023-10-03 15:47:44,967 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_791-197891-0 from training. Duration: 0.84 2023-10-03 15:47:45,013 WARNING [train.py:1204] (2/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_527-238990-0 from training. Duration: 0.9 2023-10-03 15:47:48,083 WARNING [train.py:1204] (2/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_449-65969-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 15:47:48,199 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_553-341452-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 15:47:49,427 WARNING [train.py:1204] (2/4) Exclude cut with ID _16909_210_5724_1_1531292079912_4118919_300-47574-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 15:47:49,448 WARNING [train.py:1204] (2/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_70-27732-0_sp1.1 from training. Duration: 0.8545625 2023-10-03 15:47:51,220 WARNING [train.py:1204] (2/4) Exclude cut with ID _44718_210_5816_1_1534050051869_6469030_727-28074-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 15:47:51,269 WARNING [train.py:1204] (2/4) Exclude cut with ID _55857_210_2346_1_1534644007831_6898209_357-123664-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 15:47:54,105 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9123W0004-555964 from training. Duration: 0.896 2023-10-03 15:47:54,134 WARNING [train.py:1204] (2/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_445-145208-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 15:47:54,157 WARNING [train.py:1204] (2/4) Exclude cut with ID _3675_210_4557_1_1526639934408_4182089_456-18305-0_sp1.1 from training. Duration: 0.6909375 2023-10-03 15:47:54,296 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.1.feed_forward1.out_proj.dropout_p, batch_count=1320726.6666666667, ans=0.1 2023-10-03 15:47:55,444 WARNING [train.py:1204] (2/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_1-99680-0_sp1.1 from training. Duration: 0.6090625 2023-10-03 15:47:57,470 WARNING [train.py:1204] (2/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_146-282400-0_sp0.9 from training. Duration: 0.788875 2023-10-03 15:47:57,490 WARNING [train.py:1204] (2/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_844-96989-0 from training. Duration: 0.71 2023-10-03 15:47:59,471 WARNING [train.py:1204] (2/4) Exclude cut with ID _29853_210_7497_1_1532505600851_6642110_128-146364-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 15:47:59,505 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_224-322394-0 from training. Duration: 0.66 2023-10-03 15:48:00,896 WARNING [train.py:1204] (2/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_191-59784-0 from training. Duration: 0.88 2023-10-03 15:48:00,901 WARNING [train.py:1204] (2/4) Exclude cut with ID _12556_210_8341_1_1530081024100_7327690_725-193477-0 from training. Duration: 0.98 2023-10-03 15:48:00,944 WARNING [train.py:1204] (2/4) Exclude cut with ID _5166_210_2084_1_1527073197160_4087993_523-79838-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 15:48:01,113 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.out_combiner.scale_min, batch_count=1320726.6666666667, ans=0.2 2023-10-03 15:48:03,609 WARNING [train.py:1204] (2/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_398-334558-0_sp1.1 from training. Duration: 0.87275 2023-10-03 15:48:05,213 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.bypass.scale_min, batch_count=1320726.6666666667, ans=0.2 2023-10-03 15:48:08,974 WARNING [train.py:1204] (2/4) Exclude cut with ID _6456_210_1534_1_1527681634212_3181049_171-36697-0_sp1.1 from training. Duration: 0.936375 2023-10-03 15:48:11,706 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.581e+02 1.894e+02 2.092e+02 2.413e+02 3.361e+02, threshold=4.184e+02, percent-clipped=0.0 2023-10-03 15:48:11,846 WARNING [train.py:1204] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_700-183549-0 from training. Duration: 0.68 2023-10-03 15:48:11,848 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8853_210_3273_1_1528633899_818968_51-187685-0 from training. Duration: 0.69 2023-10-03 15:48:13,477 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.conv_skip_rate, batch_count=1320793.3333333333, ans=0.0 2023-10-03 15:48:18,874 WARNING [train.py:1204] (2/4) Exclude cut with ID _7719_210_3525_1_1528456153163_4296960_333-110399-0_sp1.1 from training. Duration: 0.87275 2023-10-03 15:48:22,129 WARNING [train.py:1204] (2/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_1022-83740-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 15:48:24,029 WARNING [train.py:1204] (2/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_61-327062-0_sp0.9 from training. Duration: 0.9 2023-10-03 15:48:24,042 WARNING [train.py:1204] (2/4) Exclude cut with ID _47251_210_18628_1_1533814063128_4137250_214-88076-0_sp1.1 from training. Duration: 0.8 2023-10-03 15:48:24,093 WARNING [train.py:1204] (2/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_839-173017-0 from training. Duration: 0.41 2023-10-03 15:48:31,200 WARNING [train.py:1204] (2/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_299-34413-0_sp1.1 from training. Duration: 0.5909375 2023-10-03 15:48:31,953 WARNING [train.py:1204] (2/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_664-159457-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 15:48:34,828 WARNING [train.py:1204] (2/4) Exclude cut with ID _66873_210_5517_1_1535454136506_4293370_483-291760-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 15:48:36,867 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.4.encoder.layers.0.conv_module1.whiten, num_groups=1, num_channels=384, metric=4.58 vs. limit=15.0 2023-10-03 15:48:37,594 WARNING [train.py:1204] (2/4) Exclude cut with ID _30800_210_22382_1_1532499347904_4515959_287-255474-0_sp1.1 from training. Duration: 0.92725 2023-10-03 15:48:37,666 WARNING [train.py:1204] (2/4) Exclude cut with ID _48677_210_5663_1_1533902405919_3892989_111-11439-0_sp1.1 from training. Duration: 0.87275 2023-10-03 15:48:37,677 WARNING [train.py:1204] (2/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_399-156791-0 from training. Duration: 0.83 2023-10-03 15:48:38,843 WARNING [train.py:1204] (2/4) Exclude cut with ID _3992_210_1747_1_1526644816031_3731060_36-147855-0_sp0.9 from training. Duration: 0.8666875 2023-10-03 15:48:40,252 WARNING [train.py:1204] (2/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_442-320317-0_sp1.1 from training. Duration: 0.7454375 2023-10-03 15:48:40,278 WARNING [train.py:1204] (2/4) Exclude cut with ID _55429_210_10626_1_1534467588527_7174143_953-80868-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 15:48:41,624 WARNING [train.py:1204] (2/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_159-328157-0_sp0.9 from training. Duration: 0.57775 2023-10-03 15:48:41,631 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9309W0197-640324 from training. Duration: 0.896 2023-10-03 15:48:41,804 WARNING [train.py:1204] (2/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_361-30822-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 15:48:47,424 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.0.conv_module1.balancer1.prob, batch_count=1320926.6666666667, ans=0.125 2023-10-03 15:48:48,507 WARNING [train.py:1204] (2/4) Exclude cut with ID _5250_210_5219_1_1527249561806_8692110_636-251213-0 from training. Duration: 0.94 2023-10-03 15:48:52,828 WARNING [train.py:1204] (2/4) Exclude cut with ID _49728_210_12651_1_1534041034294_6788150_468-333827-0_sp1.1 from training. Duration: 0.963625 2023-10-03 15:48:54,717 INFO [train.py:1046] (2/4) Epoch 38, batch 1600, loss[loss=0.1699, simple_loss=0.2439, pruned_loss=0.04792, over 23596.00 frames. ], tot_loss[loss=0.1594, simple_loss=0.2391, pruned_loss=0.03989, over 4718447.91 frames. ], batch size: 256, lr: 2.69e-03, grad_scale: 16.0 2023-10-03 15:48:54,814 WARNING [train.py:1204] (2/4) Exclude cut with ID _25882_210_3036_1_1532775665815_6479510_623-191759-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 15:48:54,867 WARNING [train.py:1204] (2/4) Exclude cut with ID _5189_210_4557_1_1527248319428_3769229_199-53526-0 from training. Duration: 0.42 2023-10-03 15:48:56,336 WARNING [train.py:1204] (2/4) Exclude cut with ID _6274_210_2084_1_1527937583837_7494020_32-205288-0_sp0.9 from training. Duration: 0.8666875 2023-10-03 15:48:57,781 WARNING [train.py:1204] (2/4) Exclude cut with ID _65263_210_22356_1_1535763598691_3921037_401-346646-0_sp1.1 from training. Duration: 0.963625 2023-10-03 15:48:57,786 WARNING [train.py:1204] (2/4) Exclude cut with ID _12555_210_8750_1_1530075594402_3780239_449-267986-0_sp1.1 from training. Duration: 0.7545625 2023-10-03 15:48:57,801 WARNING [train.py:1204] (2/4) Exclude cut with ID _69884_210_9771_1_1535846392818_4166169_430-201020-0_sp1.1 from training. Duration: 0.8818125 2023-10-03 15:48:57,893 WARNING [train.py:1204] (2/4) Exclude cut with ID _8394_210_6702_1_1528590504316_3540189_379-49457-0_sp1.1 from training. Duration: 0.7909375 2023-10-03 15:49:02,879 WARNING [train.py:1204] (2/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_200-105218-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 15:49:02,936 WARNING [train.py:1204] (2/4) Exclude cut with ID _3867_210_1883_1_1526641200575_4475830_319-349850-0 from training. Duration: 0.67 2023-10-03 15:49:04,386 WARNING [train.py:1204] (2/4) Exclude cut with ID _28317_210_12742_1_1532480302381_8510325_916-195848-0 from training. Duration: 0.92 2023-10-03 15:49:05,783 WARNING [train.py:1204] (2/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_436-186311-0 from training. Duration: 0.65 2023-10-03 15:49:08,535 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_3910_210_1794_1_1526777667_3747260_302-180617-0_sp1.1 from training. Duration: 0.97275 2023-10-03 15:49:08,765 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.attention_skip_rate, batch_count=1321060.0, ans=0.0 2023-10-03 15:49:09,925 WARNING [train.py:1204] (2/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_102-92751-0 from training. Duration: 0.99 2023-10-03 15:49:11,296 WARNING [train.py:1204] (2/4) Exclude cut with ID _51899_210_20034_1_1534129254466_3669220_265-215428-0_sp1.1 from training. Duration: 0.8909375 2023-10-03 15:49:12,731 WARNING [train.py:1204] (2/4) Exclude cut with ID _58583_210_5236_1_1534726835266_3574249_438-198993-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 15:49:17,021 WARNING [train.py:1204] (2/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_166-166990-0_sp1.1 from training. Duration: 0.87275 2023-10-03 15:49:19,779 WARNING [train.py:1204] (2/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_907-151398-0 from training. Duration: 0.37 2023-10-03 15:49:21,281 WARNING [train.py:1204] (2/4) Exclude cut with ID _38670_210_15379_1_1533448810659_4391059_452-256669-0_sp1.1 from training. Duration: 0.936375 2023-10-03 15:49:21,326 WARNING [train.py:1204] (2/4) Exclude cut with ID _48677_210_5663_1_1533902405919_3892989_408-40740-0 from training. Duration: 0.83 2023-10-03 15:49:22,552 WARNING [train.py:1204] (2/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_903-158756-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 15:49:22,604 WARNING [train.py:1204] (2/4) Exclude cut with ID _13252_210_1925_1_1530671372047_3336909_397-216497-0 from training. Duration: 0.96 2023-10-03 15:49:27,517 WARNING [train.py:1204] (2/4) Exclude cut with ID _55429_210_10626_1_1534467588527_7174143_97-335084-0 from training. Duration: 0.97 2023-10-03 15:49:36,180 WARNING [train.py:1204] (2/4) Exclude cut with ID _45329_210_7005_1_1534157951211_6774339_453-267730-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 15:49:37,504 WARNING [train.py:1204] (2/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_153-97441-0 from training. Duration: 0.56 2023-10-03 15:49:37,569 WARNING [train.py:1204] (2/4) Exclude cut with ID _40901_210_5665_1_1533363789958_3856130_312-329122-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 15:49:38,917 WARNING [train.py:1204] (2/4) Exclude cut with ID _30800_210_22382_1_1532499347904_4515959_97-126471-0_sp1.1 from training. Duration: 0.97275 2023-10-03 15:49:38,917 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_11305_210_1794_1_1529750428_7178148_257-312870-0_sp1.1 from training. Duration: 0.8 2023-10-03 15:49:40,498 WARNING [train.py:1204] (2/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_454-314901-0_sp0.9 from training. Duration: 0.511125 2023-10-03 15:49:44,739 WARNING [train.py:1204] (2/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_282-136641-0_sp1.1 from training. Duration: 0.3818125 2023-10-03 15:49:46,181 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_46013_210_7234_1_1533776362_3744032_493-160724-0_sp1.1 from training. Duration: 0.963625 2023-10-03 15:49:47,578 WARNING [train.py:1204] (2/4) Exclude cut with ID _67831_210_12392_1_1535612464202_3092612_259-33442-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 15:49:47,626 WARNING [train.py:1204] (2/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_289-62384-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 15:49:48,952 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6545_210_2040_1_1527774670_4245258_379-14182-0_sp0.9 from training. Duration: 0.888875 2023-10-03 15:49:51,541 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_548-294316-0_sp1.1 from training. Duration: 0.736375 2023-10-03 15:49:51,625 WARNING [train.py:1204] (2/4) Exclude cut with ID _6275_210_2084_1_1528110062213_7669049_857-298942-0_sp1.1 from training. Duration: 0.77275 2023-10-03 15:49:53,014 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6673_210_3273_1_1527767790_3173240_327-306185-0_sp1.1 from training. Duration: 0.7545625 2023-10-03 15:49:59,478 WARNING [train.py:1204] (2/4) Exclude cut with ID _61008_210_10633_1_1534852769198_6974370_814-107647-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 15:50:01,386 WARNING [train.py:1204] (2/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_116-332008-0_sp1.1 from training. Duration: 0.8909375 2023-10-03 15:50:02,820 WARNING [train.py:1204] (2/4) Exclude cut with ID _32704_210_8954_1_1533017488840_6747370_266-329755-0 from training. Duration: 0.89 2023-10-03 15:50:02,824 WARNING [train.py:1204] (2/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_271-269084-0_sp0.9 from training. Duration: 0.92225 2023-10-03 15:50:04,419 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.feed_forward3.hidden_balancer.prob, batch_count=1321260.0, ans=0.125 2023-10-03 15:50:05,529 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_3171_210_2087_1_1526109084_2330007_66-260583-0 from training. Duration: 0.72 2023-10-03 15:50:08,147 INFO [train.py:1046] (2/4) Epoch 38, batch 1650, loss[loss=0.161, simple_loss=0.2487, pruned_loss=0.0366, over 23959.00 frames. ], tot_loss[loss=0.1594, simple_loss=0.239, pruned_loss=0.03992, over 4724256.52 frames. ], batch size: 86, lr: 2.69e-03, grad_scale: 8.0 2023-10-03 15:50:10,274 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.ff2_skip_rate, batch_count=1321326.6666666667, ans=0.0 2023-10-03 15:50:11,348 WARNING [train.py:1204] (2/4) Exclude cut with ID _5250_210_5219_1_1527249561806_8692110_1326-276386-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 15:50:11,465 WARNING [train.py:1204] (2/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_372-30302-0_sp1.1 from training. Duration: 0.7818125 2023-10-03 15:50:12,834 WARNING [train.py:1204] (2/4) Exclude cut with ID _25882_210_3036_1_1532775665815_6479510_631-257029-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 15:50:12,842 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_3691_210_3528_1_1526468169_4062053_355-334432-0 from training. Duration: 0.87 2023-10-03 15:50:12,851 WARNING [train.py:1204] (2/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_839-173017-0_sp1.1 from training. Duration: 0.37275 2023-10-03 15:50:12,870 WARNING [train.py:1204] (2/4) Exclude cut with ID _14998_210_5219_1_1530705582080_7474970_441-91466-0 from training. Duration: 0.97 2023-10-03 15:50:12,907 WARNING [train.py:1204] (2/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_108-53929-0 from training. Duration: 0.59 2023-10-03 15:50:15,842 WARNING [train.py:1204] (2/4) Exclude cut with ID _9389_210_1664_1_1529217337301_5289269_260-1942-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 15:50:17,117 WARNING [train.py:1204] (2/4) Exclude cut with ID _56423_210_5854_1_1534896061049_3165969_198-61547-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 15:50:17,157 WARNING [train.py:1204] (2/4) Exclude cut with ID _44778_210_2054_1_1533798127203_7443090_412-142122-0_sp1.1 from training. Duration: 0.92725 2023-10-03 15:50:17,184 WARNING [train.py:1204] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_88-341718-0_sp1.1 from training. Duration: 0.6 2023-10-03 15:50:19,575 WARNING [train.py:1204] (2/4) Exclude cut with ID _61079_210_4525_1_1534921008198_4011419_334-298697-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 15:50:22,889 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_5465_210_4211_1_1527227253_55357_8-2499-0_sp1.1 from training. Duration: 0.996375 2023-10-03 15:50:24,250 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_447-201565-0_sp1.1 from training. Duration: 0.97275 2023-10-03 15:50:24,259 WARNING [train.py:1204] (2/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_253-161789-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 15:50:24,263 WARNING [train.py:1204] (2/4) Exclude cut with ID _44304_210_15374_1_1533639352765_4613129_302-22084-0_sp0.9 from training. Duration: 0.9555625 2023-10-03 15:50:24,271 WARNING [train.py:1204] (2/4) Exclude cut with ID _26664_210_7770_1_1532258943280_3418270_187-80815-0_sp1.1 from training. Duration: 0.8454375 2023-10-03 15:50:25,643 WARNING [train.py:1204] (2/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_68-212801-0 from training. Duration: 0.73 2023-10-03 15:50:25,683 WARNING [train.py:1204] (2/4) Exclude cut with ID _29345_210_4381_1_1532689168263_3860169_149-257268-0 from training. Duration: 0.81 2023-10-03 15:50:27,173 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.0.conv_module2.balancer1.prob, batch_count=1321393.3333333333, ans=0.125 2023-10-03 15:50:31,058 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_213-103282-0_sp1.1 from training. Duration: 0.7090625 2023-10-03 15:50:34,286 WARNING [train.py:1204] (2/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_156-302675-0_sp1.1 from training. Duration: 0.5 2023-10-03 15:50:35,913 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.attention_skip_rate, batch_count=1321393.3333333333, ans=0.0 2023-10-03 15:50:38,664 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.conv_skip_rate, batch_count=1321460.0, ans=0.0 2023-10-03 15:50:41,121 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.576e+02 1.937e+02 2.128e+02 2.357e+02 3.873e+02, threshold=4.255e+02, percent-clipped=0.0 2023-10-03 15:50:43,906 WARNING [train.py:1204] (2/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_125-322172-0 from training. Duration: 0.93 2023-10-03 15:50:45,237 WARNING [train.py:1204] (2/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_493-60474-0_sp1.1 from training. Duration: 0.963625 2023-10-03 15:50:46,705 WARNING [train.py:1204] (2/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_156-302675-0 from training. Duration: 0.55 2023-10-03 15:50:49,556 WARNING [train.py:1204] (2/4) Exclude cut with ID _41314_210_12651_1_1533459476543_3766460_123-267862-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 15:50:51,064 WARNING [train.py:1204] (2/4) Exclude cut with ID _28427_210_4220_1_1532581240967_3061069_201-143747-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 15:50:51,090 WARNING [train.py:1204] (2/4) Exclude cut with ID _8772_210_1850_1_1528635595972_3664011_96-12756-0_sp1.1 from training. Duration: 0.8909375 2023-10-03 15:50:52,382 WARNING [train.py:1204] (2/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_1347-145675-0_sp1.1 from training. Duration: 0.936375 2023-10-03 15:50:52,472 WARNING [train.py:1204] (2/4) Exclude cut with ID _9235_210_5219_1_1528891154253_8046120_387-159008-0_sp1.1 from training. Duration: 0.7818125 2023-10-03 15:50:54,246 WARNING [train.py:1204] (2/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_400-201896-0_sp1.1 from training. Duration: 0.963625 2023-10-03 15:50:55,734 WARNING [train.py:1204] (2/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_326-174612-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 15:50:55,774 WARNING [train.py:1204] (2/4) Exclude cut with ID _55844_210_2346_1_1534471212569_7625707_390-116233-0_sp1.1 from training. Duration: 0.963625 2023-10-03 15:50:55,800 WARNING [train.py:1204] (2/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_419-317062-0_sp1.1 from training. Duration: 0.82725 2023-10-03 15:50:55,853 WARNING [train.py:1204] (2/4) Exclude cut with ID _29877_210_6228_1_1532397791106_5276149_266-62291-0_sp0.9 from training. Duration: 0.9666875 2023-10-03 15:50:57,291 WARNING [train.py:1204] (2/4) Exclude cut with ID _7857_210_1534_1_1528286515745_6831470_431-338528-0_sp1.1 from training. Duration: 0.92725 2023-10-03 15:50:57,357 WARNING [train.py:1204] (2/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_87-131427-0_sp0.9 from training. Duration: 0.8666875 2023-10-03 15:51:02,236 WARNING [train.py:1204] (2/4) Exclude cut with ID _24055_210_12651_1_1532310907143_976979_92-59356-0_sp1.1 from training. Duration: 0.82725 2023-10-03 15:51:02,313 WARNING [train.py:1204] (2/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_96-290039-0 from training. Duration: 0.56 2023-10-03 15:51:05,490 WARNING [train.py:1204] (2/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_48-182155-0_sp0.9 from training. Duration: 0.9666875 2023-10-03 15:51:06,812 WARNING [train.py:1204] (2/4) Exclude cut with ID _4626_210_3532_1_1526817652717_3916859_446-206656-0 from training. Duration: 0.76 2023-10-03 15:51:06,896 WARNING [train.py:1204] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_168-32906-0 from training. Duration: 0.69 2023-10-03 15:51:06,921 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_3780_210_2087_1_1526467760_2130483_170-259044-0 from training. Duration: 0.7 2023-10-03 15:51:06,932 WARNING [train.py:1204] (2/4) Exclude cut with ID _65134_210_14763_1_1535335393985_3505368_153-216718-0_sp1.1 from training. Duration: 0.92725 2023-10-03 15:51:08,304 WARNING [train.py:1204] (2/4) Exclude cut with ID _64639_210_5816_1_1535367619437_3528000_461-329156-0_sp1.1 from training. Duration: 0.87275 2023-10-03 15:51:08,342 WARNING [train.py:1204] (2/4) Exclude cut with ID _21989_210_5854_1_1531899214931_3271360_126-347531-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 15:51:09,737 WARNING [train.py:1204] (2/4) Exclude cut with ID _7614_210_4915_1_1529326764092_4537359_96-162197-0_sp1.1 from training. Duration: 0.963625 2023-10-03 15:51:09,738 WARNING [train.py:1204] (2/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_1083-147536-0 from training. Duration: 0.97 2023-10-03 15:51:12,596 WARNING [train.py:1204] (2/4) Exclude cut with ID _64029_210_4381_1_1535178502772_3756459_275-16710-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 15:51:15,245 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7264_210_4938_1_1527939911_8132095_800-91995-0_sp0.9 from training. Duration: 0.9555625 2023-10-03 15:51:15,287 WARNING [train.py:1204] (2/4) Exclude cut with ID _3844_210_1390_1_1526727803582_2871209_50-193879-0_sp1.1 from training. Duration: 0.936375 2023-10-03 15:51:18,007 WARNING [train.py:1204] (2/4) Exclude cut with ID _46952_210_5986_1_1534230010357_3107379_232-344503-0 from training. Duration: 0.96 2023-10-03 15:51:22,523 INFO [train.py:1046] (2/4) Epoch 38, batch 1700, loss[loss=0.126, simple_loss=0.2084, pruned_loss=0.02179, over 24318.00 frames. ], tot_loss[loss=0.1582, simple_loss=0.2378, pruned_loss=0.03927, over 4728905.73 frames. ], batch size: 56, lr: 2.69e-03, grad_scale: 8.0 2023-10-03 15:51:22,621 WARNING [train.py:1204] (2/4) Exclude cut with ID _25808_210_13292_1_1532343514562_7366109_206-14076-0_sp1.1 from training. Duration: 0.936375 2023-10-03 15:51:22,622 WARNING [train.py:1204] (2/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_55-31904-0_sp0.9 from training. Duration: 0.97775 2023-10-03 15:51:22,647 WARNING [train.py:1204] (2/4) Exclude cut with ID _14632_210_9988_1_1530691279392_3923200_161-129530-0 from training. Duration: 0.96 2023-10-03 15:51:23,990 WARNING [train.py:1204] (2/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_770-200430-0_sp1.1 from training. Duration: 0.8090625 2023-10-03 15:51:23,999 WARNING [train.py:1204] (2/4) Exclude cut with ID _6270_210_2084_1_1527850911573_3856420_26-277343-0_sp1.1 from training. Duration: 0.7454375 2023-10-03 15:51:24,000 WARNING [train.py:1204] (2/4) Exclude cut with ID _69884_210_9771_1_1535846392818_4166169_88-23776-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 15:51:25,407 WARNING [train.py:1204] (2/4) Exclude cut with ID _60860_210_8254_1_1534896015875_3557169_245-256666-0_sp1.1 from training. Duration: 0.8818125 2023-10-03 15:51:25,443 WARNING [train.py:1204] (2/4) Exclude cut with ID _5250_210_5219_1_1527249561806_8692110_636-251213-0_sp1.1 from training. Duration: 0.8545625 2023-10-03 15:51:25,463 WARNING [train.py:1204] (2/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_540-131164-0 from training. Duration: 0.92 2023-10-03 15:51:28,773 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_5931_210_2040_1_1527428481_4547999_321-193614-0_sp0.9 from training. Duration: 0.7444375 2023-10-03 15:51:38,020 WARNING [train.py:1204] (2/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_373-225403-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 15:51:40,801 WARNING [train.py:1204] (2/4) Exclude cut with ID _20622_210_3852_1_1531622427080_1093839_57-326027-0_sp1.1 from training. Duration: 0.97275 2023-10-03 15:51:42,504 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.conv_module2.balancer1.min_positive, batch_count=1321726.6666666667, ans=0.025 2023-10-03 15:51:43,885 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.feed_forward1.out_proj.dropout_p, batch_count=1321726.6666666667, ans=0.1 2023-10-03 15:51:46,401 WARNING [train.py:1204] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_605-257205-0_sp1.1 from training. Duration: 0.536375 2023-10-03 15:51:46,429 WARNING [train.py:1204] (2/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_442-320317-0_sp0.9 from training. Duration: 0.911125 2023-10-03 15:51:46,463 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_35361_210_4238_1_1533262810_7793476_675-87012-0_sp1.1 from training. Duration: 0.8090625 2023-10-03 15:51:47,753 WARNING [train.py:1204] (2/4) Exclude cut with ID _4184_210_2550_1_1526730192013_2219380_306-44736-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 15:51:50,524 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_124-113562-0 from training. Duration: 0.94 2023-10-03 15:51:51,967 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_2821_210_2715_1_1526108385_3763560_73-296673-0_sp0.9 from training. Duration: 0.92225 2023-10-03 15:51:51,982 WARNING [train.py:1204] (2/4) Exclude cut with ID _10301_210_5886_1_1529222275332_3910620_216-46959-0_sp1.1 from training. Duration: 0.92725 2023-10-03 15:51:53,812 WARNING [train.py:1204] (2/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_91-277684-0_sp0.9 from training. Duration: 0.82225 2023-10-03 15:51:55,176 WARNING [train.py:1204] (2/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_351-294650-0_sp0.9 from training. Duration: 0.7 2023-10-03 15:51:56,678 WARNING [train.py:1204] (2/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_209-205365-0 from training. Duration: 0.79 2023-10-03 15:51:56,740 WARNING [train.py:1204] (2/4) Exclude cut with ID _14603_210_4220_1_1530615688441_3097319_97-325053-0 from training. Duration: 0.92 2023-10-03 15:51:58,141 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_49348_210_7927_1_1533954500_3691300_109-276430-0_sp1.1 from training. Duration: 0.92725 2023-10-03 15:52:01,525 WARNING [train.py:1204] (2/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_912-47473-0 from training. Duration: 0.68 2023-10-03 15:52:02,976 WARNING [train.py:1204] (2/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_96-320736-0_sp0.9 from training. Duration: 0.9555625 2023-10-03 15:52:03,204 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.ff2_skip_rate, batch_count=1321793.3333333333, ans=0.0 2023-10-03 15:52:11,845 WARNING [train.py:1204] (2/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_1252-235481-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 15:52:11,939 WARNING [train.py:1204] (2/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_813-261554-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 15:52:13,228 WARNING [train.py:1204] (2/4) Exclude cut with ID _21632_210_2253_1_1531994001765_3744140_110-274654-0_sp0.9 from training. Duration: 0.911125 2023-10-03 15:52:14,821 WARNING [train.py:1204] (2/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_381-317791-0_sp1.1 from training. Duration: 0.663625 2023-10-03 15:52:14,834 WARNING [train.py:1204] (2/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_51-117576-0_sp1.1 from training. Duration: 0.363625 2023-10-03 15:52:14,854 WARNING [train.py:1204] (2/4) Exclude cut with ID _42321_210_7770_1_1533468631352_4366895_104-215915-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 15:52:15,525 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.self_attn_weights.whiten_keys.whitening_limit, batch_count=1321860.0, ans=6.0 2023-10-03 15:52:17,583 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_40896_210_15710_1_1533433956_7100802_216-22824-0_sp1.1 from training. Duration: 0.92725 2023-10-03 15:52:17,584 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_14831_210_7927_1_1530700200_5583610_181-27467-0 from training. Duration: 0.91 2023-10-03 15:52:17,640 WARNING [train.py:1204] (2/4) Exclude cut with ID _30304_210_6319_1_1532476827261_7582060_1024-265645-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 15:52:17,649 WARNING [train.py:1204] (2/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_412-233904-0_sp1.1 from training. Duration: 0.936375 2023-10-03 15:52:17,674 WARNING [train.py:1204] (2/4) Exclude cut with ID _30191_210_1664_1_1533189915047_7196670_911-223461-0_sp1.1 from training. Duration: 0.92725 2023-10-03 15:52:18,944 WARNING [train.py:1204] (2/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_558-148266-0_sp1.1 from training. Duration: 0.963625 2023-10-03 15:52:20,431 WARNING [train.py:1204] (2/4) Exclude cut with ID _58480_210_12392_1_1534811121111_3646160_212-164495-0_sp1.1 from training. Duration: 0.936375 2023-10-03 15:52:20,432 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_454-276429-0_sp0.9 from training. Duration: 0.9666875 2023-10-03 15:52:21,700 WARNING [train.py:1204] (2/4) Exclude cut with ID _24736_210_3279_1_1532332894218_2838700_73-197086-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 15:52:21,751 WARNING [train.py:1204] (2/4) Exclude cut with ID _5294_210_1747_1_1527249643928_3696179_529-242298-0_sp1.1 from training. Duration: 0.863625 2023-10-03 15:52:21,788 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_53555_210_5632_1_1534595220_7232297_345-171438-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 15:52:23,856 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.conv_skip_rate, batch_count=1321926.6666666667, ans=0.0 2023-10-03 15:52:26,575 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_28211_210_6388_1_1532861506_4523224_22-25766-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 15:52:27,805 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7002_210_1794_1_1527939683_5157286_163-87552-0 from training. Duration: 0.86 2023-10-03 15:52:29,790 WARNING [train.py:1204] (2/4) Exclude cut with ID _56927_210_9969_1_1534848970840_3695229_409-225129-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 15:52:31,123 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_26192_210_7927_1_1532137269_1040115_21-219331-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 15:52:34,491 WARNING [train.py:1204] (2/4) Exclude cut with ID _15012_210_1968_1_1530856820429_5095350_662-210469-0 from training. Duration: 0.57 2023-10-03 15:52:37,189 INFO [train.py:1046] (2/4) Epoch 38, batch 1750, loss[loss=0.1643, simple_loss=0.2376, pruned_loss=0.04552, over 23836.00 frames. ], tot_loss[loss=0.1571, simple_loss=0.2365, pruned_loss=0.03891, over 4724720.04 frames. ], batch size: 164, lr: 2.68e-03, grad_scale: 8.0 2023-10-03 15:52:40,365 WARNING [train.py:1204] (2/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_582-191016-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 15:52:41,874 WARNING [train.py:1204] (2/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_270-210032-0_sp1.1 from training. Duration: 0.963625 2023-10-03 15:52:41,904 WARNING [train.py:1204] (2/4) Exclude cut with ID _6789_210_1534_1_1527857937218_3118950_262-48853-0_sp0.9 from training. Duration: 0.62225 2023-10-03 15:52:42,171 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.conv_skip_rate, batch_count=1321993.3333333333, ans=0.0 2023-10-03 15:52:43,193 WARNING [train.py:1204] (2/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_847-41334-0 from training. Duration: 0.44 2023-10-03 15:52:43,225 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_40896_210_15710_1_1533433956_7100802_573-46532-0_sp1.1 from training. Duration: 0.92725 2023-10-03 15:52:46,043 WARNING [train.py:1204] (2/4) Exclude cut with ID _21881_210_3525_1_1531825242757_3798380_247-102591-0_sp1.1 from training. Duration: 0.8909375 2023-10-03 15:52:46,080 WARNING [train.py:1204] (2/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_200-21395-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 15:52:48,910 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.bypass.skip_rate, batch_count=1321993.3333333333, ans=0.07 2023-10-03 15:52:50,195 WARNING [train.py:1204] (2/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_711-15688-0 from training. Duration: 0.43 2023-10-03 15:52:53,387 WARNING [train.py:1204] (2/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_53-75025-0_sp1.1 from training. Duration: 0.963625 2023-10-03 15:52:54,856 WARNING [train.py:1204] (2/4) Exclude cut with ID _6194_210_1534_1_1527511826160_3941194_321-148641-0 from training. Duration: 0.64 2023-10-03 15:52:56,112 WARNING [train.py:1204] (2/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_337-294431-0_sp1.1 from training. Duration: 0.936375 2023-10-03 15:52:56,214 WARNING [train.py:1204] (2/4) Exclude cut with ID _21632_210_2253_1_1531994001765_3744140_110-274654-0_sp1.1 from training. Duration: 0.7454375 2023-10-03 15:53:00,123 WARNING [train.py:1204] (2/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_135-174080-0_sp1.1 from training. Duration: 0.4818125 2023-10-03 15:53:00,226 WARNING [train.py:1204] (2/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_21-174514-0 from training. Duration: 0.43 2023-10-03 15:53:03,301 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_38965_210_4852_1_1533174906_3484735_215-294589-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 15:53:03,327 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_548-294316-0 from training. Duration: 0.81 2023-10-03 15:53:09,931 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.580e+02 1.868e+02 2.023e+02 2.283e+02 3.172e+02, threshold=4.046e+02, percent-clipped=0.0 2023-10-03 15:53:12,785 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_103-84010-0_sp1.1 from training. Duration: 0.62725 2023-10-03 15:53:15,448 WARNING [train.py:1204] (2/4) Exclude cut with ID _18199_210_6109_1_1531475789864_4288790_295-198247-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 15:53:15,476 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_13757_210_3528_1_1530440682_4107239_103-313224-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 15:53:18,224 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_34886_210_10633_1_1533203882_1755661_67-294673-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 15:53:19,522 WARNING [train.py:1204] (2/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_350-101734-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 15:53:19,659 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_37357_210_17024_1_1533089504_2316121_204-153986-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 15:53:21,086 WARNING [train.py:1204] (2/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_673-115569-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 15:53:24,178 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_489-230703-0_sp0.9 from training. Duration: 0.9444375 2023-10-03 15:53:25,547 WARNING [train.py:1204] (2/4) Exclude cut with ID _5250_210_5219_1_1527249561806_8692110_575-102496-0_sp1.1 from training. Duration: 0.936375 2023-10-03 15:53:26,778 WARNING [train.py:1204] (2/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_669-93163-0 from training. Duration: 0.98 2023-10-03 15:53:29,400 WARNING [train.py:1204] (2/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_249-281349-0_sp0.9 from training. Duration: 0.9444375 2023-10-03 15:53:32,122 WARNING [train.py:1204] (2/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_106-30782-0 from training. Duration: 0.8 2023-10-03 15:53:32,182 WARNING [train.py:1204] (2/4) Exclude cut with ID _8772_210_1850_1_1528635595972_3664011_398-320049-0_sp1.1 from training. Duration: 0.8 2023-10-03 15:53:34,048 WARNING [train.py:1204] (2/4) Exclude cut with ID _44778_210_2054_1_1533798127203_7443090_516-151727-0_sp1.1 from training. Duration: 0.963625 2023-10-03 15:53:35,449 WARNING [train.py:1204] (2/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_325-62802-0_sp1.1 from training. Duration: 0.7909375 2023-10-03 15:53:36,413 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.attention_skip_rate, batch_count=1322260.0, ans=0.0 2023-10-03 15:53:38,795 WARNING [train.py:1204] (2/4) Exclude cut with ID _14368_210_4220_1_1530532714870_3595529_93-58002-0_sp0.9 from training. Duration: 0.6333125 2023-10-03 15:53:38,826 WARNING [train.py:1204] (2/4) Exclude cut with ID _4681_210_3525_1_1527251040321_1235713_123-24428-0_sp1.1 from training. Duration: 0.436375 2023-10-03 15:53:38,886 WARNING [train.py:1204] (2/4) Exclude cut with ID _5250_210_5219_1_1527249561806_8692110_1185-56473-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 15:53:41,604 WARNING [train.py:1204] (2/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_96-244299-0_sp1.1 from training. Duration: 0.8 2023-10-03 15:53:44,377 WARNING [train.py:1204] (2/4) Exclude cut with ID _59500_210_8254_1_1534809610680_3736009_104-72815-0_sp1.1 from training. Duration: 0.963625 2023-10-03 15:53:45,779 WARNING [train.py:1204] (2/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_468-283113-0_sp1.1 from training. Duration: 0.87275 2023-10-03 15:53:45,945 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.0.self_attn_weights.pos_emb_skip_rate, batch_count=1322260.0, ans=0.0 2023-10-03 15:53:47,299 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6239_210_4211_1_1527765584_4306698_528-102186-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 15:53:48,714 WARNING [train.py:1204] (2/4) Exclude cut with ID _45297_210_2721_1_1533715351381_3264949_351-147136-0 from training. Duration: 0.87 2023-10-03 15:53:48,727 WARNING [train.py:1204] (2/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_520-51744-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 15:53:48,825 WARNING [train.py:1204] (2/4) Exclude cut with ID _5166_210_2084_1_1527073197160_4087993_310-114452-0_sp1.1 from training. Duration: 0.536375 2023-10-03 15:53:48,829 WARNING [train.py:1204] (2/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_450-165710-0_sp1.1 from training. Duration: 0.92725 2023-10-03 15:53:48,840 WARNING [train.py:1204] (2/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_280-120942-0_sp1.1 from training. Duration: 0.7 2023-10-03 15:53:50,047 INFO [train.py:1046] (2/4) Epoch 38, batch 1800, loss[loss=0.1551, simple_loss=0.2245, pruned_loss=0.04284, over 22948.00 frames. ], tot_loss[loss=0.1568, simple_loss=0.2355, pruned_loss=0.03906, over 4717738.76 frames. ], batch size: 322, lr: 2.68e-03, grad_scale: 8.0 2023-10-03 15:53:50,103 WARNING [train.py:1204] (2/4) Exclude cut with ID _65585_210_12210_1_1535450451584_3764098_112-294301-0_sp0.9 from training. Duration: 0.97775 2023-10-03 15:53:50,153 WARNING [train.py:1204] (2/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_98-51676-0_sp0.9 from training. Duration: 0.811125 2023-10-03 15:53:52,913 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_33348_210_5632_1_1533086912_3324030_85-28583-0_sp1.1 from training. Duration: 0.7454375 2023-10-03 15:53:52,988 WARNING [train.py:1204] (2/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_336-208232-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 15:53:53,230 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=1322326.6666666667, ans=0.1 2023-10-03 15:53:56,202 WARNING [train.py:1204] (2/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_339-104791-0_sp0.9 from training. Duration: 0.5444375 2023-10-03 15:53:56,396 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.1.feed_forward1.out_proj.dropout_p, batch_count=1322326.6666666667, ans=0.1 2023-10-03 15:53:58,853 WARNING [train.py:1204] (2/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_559-29840-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 15:54:02,215 WARNING [train.py:1204] (2/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_117-290916-0_sp0.9 from training. Duration: 0.5666875 2023-10-03 15:54:02,303 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_43802_210_2290_1_1533516203_8829504_185-112289-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 15:54:05,565 WARNING [train.py:1204] (2/4) Exclude cut with ID _59256_210_5986_1_1535076085997_3589120_155-298797-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 15:54:08,777 WARNING [train.py:1204] (2/4) Exclude cut with ID _44659_210_2721_1_1534036120061_6614860_246-206531-0_sp1.1 from training. Duration: 0.92725 2023-10-03 15:54:10,071 WARNING [train.py:1204] (2/4) Exclude cut with ID _23818_210_6228_1_1531917063884_3676920_469-151944-0_sp1.1 from training. Duration: 0.92725 2023-10-03 15:54:11,383 WARNING [train.py:1204] (2/4) Exclude cut with ID _13252_210_1925_1_1530671372047_3336909_88-276668-0_sp1.1 from training. Duration: 0.9 2023-10-03 15:54:12,736 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_15101_210_7927_1_1530786415_5033274_320-327527-0_sp1.1 from training. Duration: 0.87275 2023-10-03 15:54:12,744 WARNING [train.py:1204] (2/4) Exclude cut with ID _14998_210_5219_1_1530705582080_7474970_446-112117-0 from training. Duration: 0.9 2023-10-03 15:54:12,810 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_30280_210_1881_1_1532872779_3660139_400-254159-0_sp1.1 from training. Duration: 0.936375 2023-10-03 15:54:14,367 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.ff2_skip_rate, batch_count=1322393.3333333333, ans=0.0 2023-10-03 15:54:16,799 WARNING [train.py:1204] (2/4) Exclude cut with ID _40284_210_2084_1_1533513679814_3704279_158-345341-0_sp1.1 from training. Duration: 0.936375 2023-10-03 15:54:19,515 WARNING [train.py:1204] (2/4) Exclude cut with ID 3033-130750-0096-55598-0 from training. Duration: 0.83 2023-10-03 15:54:22,252 WARNING [train.py:1204] (2/4) Exclude cut with ID _9389_210_1664_1_1529217337301_5289269_379-128794-0 from training. Duration: 0.85 2023-10-03 15:54:23,548 WARNING [train.py:1204] (2/4) Exclude cut with ID _3675_210_4557_1_1526639934408_4182089_456-18305-0 from training. Duration: 0.76 2023-10-03 15:54:23,590 WARNING [train.py:1204] (2/4) Exclude cut with ID _29853_210_7497_1_1532505600851_6642110_571-249460-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 15:54:25,387 WARNING [train.py:1204] (2/4) Exclude cut with ID _4068_210_2629_1_1526697350395_1546259_230-325568-0_sp1.1 from training. Duration: 0.92725 2023-10-03 15:54:25,387 WARNING [train.py:1204] (2/4) Exclude cut with ID _34415_210_8750_1_1533085213375_3451116_213-267570-0_sp1.1 from training. Duration: 0.963625 2023-10-03 15:54:25,462 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6767_210_3273_1_1527855783_1718932_142-127132-0_sp1.1 from training. Duration: 0.863625 2023-10-03 15:54:27,663 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.2.encoder.layers.0.self_attn2.whiten, num_groups=1, num_channels=384, metric=18.90 vs. limit=22.5 2023-10-03 15:54:28,420 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.1.attention_skip_rate, batch_count=1322460.0, ans=0.0 2023-10-03 15:54:32,842 WARNING [train.py:1204] (2/4) Exclude cut with ID IC0653W0001-322925 from training. Duration: 0.896 2023-10-03 15:54:34,255 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_4528_210_3273_1_1527076654_3276777_209-219804-0_sp1.1 from training. Duration: 0.62725 2023-10-03 15:54:36,107 WARNING [train.py:1204] (2/4) Exclude cut with ID _55844_210_2346_1_1534471212569_7625707_187-204971-0_sp1.1 from training. Duration: 0.936375 2023-10-03 15:54:38,137 WARNING [train.py:1204] (2/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_369-325184-0 from training. Duration: 0.99 2023-10-03 15:54:38,171 WARNING [train.py:1204] (2/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_791-45149-0 from training. Duration: 0.86 2023-10-03 15:54:38,206 WARNING [train.py:1204] (2/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_317-211107-0_sp1.1 from training. Duration: 0.836375 2023-10-03 15:54:39,660 WARNING [train.py:1204] (2/4) Exclude cut with ID _30800_210_22382_1_1532499347904_4515959_488-330913-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 15:54:41,036 WARNING [train.py:1204] (2/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_506-42614-0_sp1.1 from training. Duration: 0.7181875 2023-10-03 15:54:46,461 WARNING [train.py:1204] (2/4) Exclude cut with ID _8696_210_1876_1_1528545632440_3345058_209-54069-0 from training. Duration: 0.67 2023-10-03 15:54:52,281 WARNING [train.py:1204] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_215-202973-0_sp1.1 from training. Duration: 0.8 2023-10-03 15:54:53,645 WARNING [train.py:1204] (2/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_321-215742-0 from training. Duration: 0.5 2023-10-03 15:54:53,692 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_40896_210_15710_1_1533433956_7100802_428-32114-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 15:54:53,706 WARNING [train.py:1204] (2/4) Exclude cut with ID _27013_210_1389_1_1532437173240_3252649_467-263151-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 15:54:55,049 WARNING [train.py:1204] (2/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_734-129761-0_sp0.9 from training. Duration: 0.911125 2023-10-03 15:54:56,852 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_521-214276-0 from training. Duration: 0.44 2023-10-03 15:54:57,081 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.feed_forward2.hidden_balancer.prob, batch_count=1322593.3333333333, ans=0.125 2023-10-03 15:54:59,727 WARNING [train.py:1204] (2/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_80-147301-0_sp1.1 from training. Duration: 0.763625 2023-10-03 15:54:59,729 WARNING [train.py:1204] (2/4) Exclude cut with ID _9679_210_7497_1_1529129212906_4363929_56-307796-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 15:55:02,431 WARNING [train.py:1204] (2/4) Exclude cut with ID _14832_210_8750_1_1530691351539_3171348_1-54315-0 from training. Duration: 0.87 2023-10-03 15:55:02,432 WARNING [train.py:1204] (2/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_558-155750-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 15:55:03,818 INFO [train.py:1046] (2/4) Epoch 38, batch 1850, loss[loss=0.1585, simple_loss=0.2447, pruned_loss=0.03615, over 24486.00 frames. ], tot_loss[loss=0.157, simple_loss=0.2361, pruned_loss=0.03895, over 4714192.10 frames. ], batch size: 66, lr: 2.68e-03, grad_scale: 8.0 2023-10-03 15:55:03,950 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_142-309370-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 15:55:03,979 WARNING [train.py:1204] (2/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_18-46756-0_sp0.9 from training. Duration: 0.87775 2023-10-03 15:55:04,123 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.feed_forward1.hidden_balancer.prob, batch_count=1322660.0, ans=0.125 2023-10-03 15:55:05,434 WARNING [train.py:1204] (2/4) Exclude cut with ID _61148_210_6897_1_1535094022726_4217610_279-212159-0_sp1.1 from training. Duration: 0.936375 2023-10-03 15:55:05,533 WARNING [train.py:1204] (2/4) Exclude cut with ID _21987_210_5854_1_1531821500482_3637038_164-2641-0_sp1.1 from training. Duration: 0.936375 2023-10-03 15:55:07,441 WARNING [train.py:1204] (2/4) Exclude cut with ID _8064_210_5399_1_1528372299291_4282973_395-340852-0_sp1.1 from training. Duration: 0.8454375 2023-10-03 15:55:09,543 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1077-184261-0_sp1.1 from training. Duration: 0.963625 2023-10-03 15:55:09,551 WARNING [train.py:1204] (2/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_1176-204992-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 15:55:12,198 WARNING [train.py:1204] (2/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_563-347832-0_sp1.1 from training. Duration: 0.8090625 2023-10-03 15:55:12,257 WARNING [train.py:1204] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_380-204756-0_sp0.9 from training. Duration: 0.9555625 2023-10-03 15:55:17,822 WARNING [train.py:1204] (2/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_212-10501-0_sp1.1 from training. Duration: 0.8545625 2023-10-03 15:55:17,851 WARNING [train.py:1204] (2/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_445-117398-0 from training. Duration: 0.89 2023-10-03 15:55:21,898 WARNING [train.py:1204] (2/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_29-280622-0 from training. Duration: 0.82 2023-10-03 15:55:22,542 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.4.encoder.layers.0.self_attn1.whiten, num_groups=1, num_channels=384, metric=14.68 vs. limit=22.5 2023-10-03 15:55:23,409 WARNING [train.py:1204] (2/4) Exclude cut with ID _26664_210_7770_1_1532258943280_3418270_187-80815-0 from training. Duration: 0.93 2023-10-03 15:55:26,859 INFO [scaling.py:1118] (2/4) WithLoss: name=encoder.encoders.2.encoder.layers.2.self_attn_weights, loss-sum=0.000e+00 2023-10-03 15:55:27,904 WARNING [train.py:1204] (2/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_414-349640-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 15:55:27,932 WARNING [train.py:1204] (2/4) Exclude cut with ID _45021_210_9994_1_1533619751094_3330288_96-239010-0 from training. Duration: 0.91 2023-10-03 15:55:27,934 WARNING [train.py:1204] (2/4) Exclude cut with ID _5189_210_4557_1_1527248319428_3769229_199-53526-0_sp0.9 from training. Duration: 0.4666875 2023-10-03 15:55:30,961 INFO [scaling.py:1118] (2/4) WithLoss: name=encoder.encoders.4.encoder.layers.0.self_attn_weights, loss-sum=0.000e+00 2023-10-03 15:55:36,367 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.645e+02 1.902e+02 2.104e+02 2.357e+02 3.020e+02, threshold=4.208e+02, percent-clipped=0.0 2023-10-03 15:55:38,447 WARNING [train.py:1204] (2/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_63-101996-0_sp1.1 from training. Duration: 0.8 2023-10-03 15:55:39,900 WARNING [train.py:1204] (2/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_199-86432-0 from training. Duration: 0.98 2023-10-03 15:55:42,055 WARNING [train.py:1204] (2/4) Exclude cut with ID _36399_210_16049_1_1532998771377_3698333_207-259013-0_sp1.1 from training. Duration: 0.92725 2023-10-03 15:55:42,096 WARNING [train.py:1204] (2/4) Exclude cut with ID _62574_210_2655_1_1535180407058_3933249_567-111756-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 15:55:42,265 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=1322793.3333333333, ans=0.1 2023-10-03 15:55:45,346 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.4.encoder.layers.2.conv_module2.whiten, num_groups=1, num_channels=384, metric=3.12 vs. limit=15.0 2023-10-03 15:55:46,210 WARNING [train.py:1204] (2/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_436-318113-0 from training. Duration: 0.75 2023-10-03 15:55:47,453 WARNING [train.py:1204] (2/4) Exclude cut with ID _53379_210_20034_1_1534222027363_3854654_43-305214-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 15:55:47,477 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_15126_210_6753_1_1530778677_2591185_113-157651-0_sp1.1 from training. Duration: 0.6181875 2023-10-03 15:55:48,826 WARNING [train.py:1204] (2/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_146-218763-0_sp1.1 from training. Duration: 0.7818125 2023-10-03 15:55:50,355 WARNING [train.py:1204] (2/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_714-310538-0_sp0.9 from training. Duration: 0.9555625 2023-10-03 15:55:53,018 WARNING [train.py:1204] (2/4) Exclude cut with ID _24271_210_12658_1_1531987270022_7190519_709-16721-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 15:55:56,443 WARNING [train.py:1204] (2/4) Exclude cut with ID _5190_210_4557_1_1527244368799_373439_28-197030-0_sp0.9 from training. Duration: 0.8 2023-10-03 15:55:57,777 WARNING [train.py:1204] (2/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_267-197536-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 15:55:57,808 WARNING [train.py:1204] (2/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_658-282610-0_sp1.1 from training. Duration: 0.4818125 2023-10-03 15:55:57,817 WARNING [train.py:1204] (2/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_257-67050-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 15:55:58,006 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.nonlin_attention.balancer.prob, batch_count=1322860.0, ans=0.125 2023-10-03 15:55:59,180 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_14906_210_7927_1_1530754190_9408633_961-53402-0_sp1.1 from training. Duration: 0.936375 2023-10-03 15:56:00,591 WARNING [train.py:1204] (2/4) Exclude cut with ID _60113_210_16271_1_1535164212481_4386079_490-280290-0_sp1.1 from training. Duration: 0.8909375 2023-10-03 15:56:04,827 WARNING [train.py:1204] (2/4) Exclude cut with ID _14100_210_8341_1_1530529320613_3678069_258-224599-0 from training. Duration: 0.77 2023-10-03 15:56:04,878 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6961_210_2087_1_1527937168_6815011_565-326898-0_sp1.1 from training. Duration: 0.936375 2023-10-03 15:56:08,259 WARNING [train.py:1204] (2/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_239-316802-0_sp1.1 from training. Duration: 0.62725 2023-10-03 15:56:09,528 WARNING [train.py:1204] (2/4) Exclude cut with ID _55857_210_2346_1_1534644007831_6898209_745-98391-0_sp1.1 from training. Duration: 0.6909375 2023-10-03 15:56:09,531 WARNING [train.py:1204] (2/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_154-135696-0 from training. Duration: 0.8 2023-10-03 15:56:09,533 WARNING [train.py:1204] (2/4) Exclude cut with ID _14368_210_4220_1_1530532714870_3595529_93-58002-0 from training. Duration: 0.57 2023-10-03 15:56:11,446 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9025W0005-507335 from training. Duration: 0.896 2023-10-03 15:56:12,806 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9029W0034-509435 from training. Duration: 0.64 2023-10-03 15:56:15,445 WARNING [train.py:1204] (2/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_76-203867-0_sp1.1 from training. Duration: 0.6545625 2023-10-03 15:56:15,456 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_46639_210_8341_1_1533726088_4495500_66-346325-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 15:56:15,463 WARNING [train.py:1204] (2/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_145-137790-0_sp0.9 from training. Duration: 0.92225 2023-10-03 15:56:15,477 WARNING [train.py:1204] (2/4) Exclude cut with ID _36522_210_8887_1_1533177030611_1772478_7-327299-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 15:56:15,548 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9042W0009-515615 from training. Duration: 0.896 2023-10-03 15:56:15,556 WARNING [train.py:1204] (2/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_39-248870-0_sp0.9 from training. Duration: 0.7666875 2023-10-03 15:56:15,602 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_35441_210_16554_1_1533347572_4092680_362-242912-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 15:56:15,835 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.conv_module2.balancer2.prob, batch_count=1322926.6666666667, ans=0.125 2023-10-03 15:56:17,040 WARNING [train.py:1204] (2/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_216-343007-0_sp1.1 from training. Duration: 0.6 2023-10-03 15:56:18,355 INFO [train.py:1046] (2/4) Epoch 38, batch 1900, loss[loss=0.1463, simple_loss=0.2382, pruned_loss=0.0272, over 24670.00 frames. ], tot_loss[loss=0.1572, simple_loss=0.2366, pruned_loss=0.03888, over 4712041.44 frames. ], batch size: 73, lr: 2.68e-03, grad_scale: 8.0 2023-10-03 15:56:18,444 WARNING [train.py:1204] (2/4) Exclude cut with ID _3867_210_1883_1_1526641200575_4475830_272-215563-0_sp0.9 from training. Duration: 0.6333125 2023-10-03 15:56:19,786 WARNING [train.py:1204] (2/4) Exclude cut with ID _21783_210_11357_1_1531742458730_3762679_553-175603-0_sp1.1 from training. Duration: 0.92725 2023-10-03 15:56:19,802 WARNING [train.py:1204] (2/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_963-187109-0 from training. Duration: 0.99 2023-10-03 15:56:21,236 WARNING [train.py:1204] (2/4) Exclude cut with ID _44973_210_1511_1_1533794381163_3634770_495-211466-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 15:56:21,246 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9068W0002-529085 from training. Duration: 0.896 2023-10-03 15:56:21,260 WARNING [train.py:1204] (2/4) Exclude cut with ID _5294_210_1747_1_1527249643928_3696179_180-100713-0_sp1.1 from training. Duration: 0.736375 2023-10-03 15:56:22,774 WARNING [train.py:1204] (2/4) Exclude cut with ID _35799_210_5854_1_1533263487887_3529071_149-49689-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 15:56:28,637 WARNING [train.py:1204] (2/4) Exclude cut with ID _67725_210_27767_1_1535608756671_5105359_36-220794-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 15:56:29,578 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.0.layers.0.whiten, num_groups=1, num_channels=192, metric=4.23 vs. limit=12.0 2023-10-03 15:56:30,034 WARNING [train.py:1204] (2/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_338-96520-0_sp1.1 from training. Duration: 0.8545625 2023-10-03 15:56:31,411 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9104W0004-547160 from training. Duration: 0.896 2023-10-03 15:56:31,488 WARNING [train.py:1204] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_330-327861-0 from training. Duration: 0.67 2023-10-03 15:56:32,940 WARNING [train.py:1204] (2/4) Exclude cut with ID _14087_210_2253_1_1530529224512_3014029_156-257607-0_sp0.9 from training. Duration: 0.92225 2023-10-03 15:56:32,994 WARNING [train.py:1204] (2/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_476-322050-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 15:56:33,009 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9118W0017-553385 from training. Duration: 0.896 2023-10-03 15:56:34,321 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9120W0028-554435 from training. Duration: 0.896 2023-10-03 15:56:37,611 WARNING [train.py:1204] (2/4) Exclude cut with ID _56524_210_9642_1_1534572011779_3619170_145-310842-0 from training. Duration: 0.99 2023-10-03 15:56:38,981 WARNING [train.py:1204] (2/4) Exclude cut with ID _51631_210_8254_1_1534129228420_4084794_270-222508-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 15:56:41,287 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=1323060.0, ans=0.1 2023-10-03 15:56:42,380 WARNING [train.py:1204] (2/4) Exclude cut with ID _14268_210_9206_1_1530622771285_4714099_331-105351-0 from training. Duration: 0.65 2023-10-03 15:56:43,854 WARNING [train.py:1204] (2/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_40-129756-0 from training. Duration: 0.81 2023-10-03 15:56:52,313 WARNING [train.py:1204] (2/4) Exclude cut with ID _3541_210_2477_1_1526475408709_3263973_231-104736-0 from training. Duration: 0.78 2023-10-03 15:56:55,589 WARNING [train.py:1204] (2/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_214-291369-0 from training. Duration: 0.63 2023-10-03 15:56:55,595 WARNING [train.py:1204] (2/4) Exclude cut with ID _62574_210_2655_1_1535180407058_3933249_369-84347-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 15:56:57,035 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9210W0111-599240 from training. Duration: 0.896 2023-10-03 15:56:57,039 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_92-246270-0 from training. Duration: 0.85 2023-10-03 15:56:57,081 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_37152_210_9697_1_1533106573_3844188_29-239070-0 from training. Duration: 0.98 2023-10-03 15:56:58,419 WARNING [train.py:1204] (2/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_239-22076-0 from training. Duration: 0.85 2023-10-03 15:56:58,422 WARNING [train.py:1204] (2/4) Exclude cut with ID _65962_210_29927_1_1535436026519_4174930_368-44639-0_sp1.1 from training. Duration: 0.97275 2023-10-03 15:57:02,608 WARNING [train.py:1204] (2/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_152-212533-0 from training. Duration: 0.97 2023-10-03 15:57:02,922 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.bypass.skip_rate, batch_count=1323193.3333333333, ans=0.07 2023-10-03 15:57:05,723 WARNING [train.py:1204] (2/4) Exclude cut with ID _14603_210_4220_1_1530615688441_3097319_90-31383-0_sp1.1 from training. Duration: 0.7909375 2023-10-03 15:57:07,275 WARNING [train.py:1204] (2/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_196-152868-0_sp1.1 from training. Duration: 0.92725 2023-10-03 15:57:07,277 WARNING [train.py:1204] (2/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_140-181176-0 from training. Duration: 0.92 2023-10-03 15:57:08,641 WARNING [train.py:1204] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_168-32906-0_sp0.9 from training. Duration: 0.7666875 2023-10-03 15:57:13,168 WARNING [train.py:1204] (2/4) Exclude cut with ID _14922_210_6228_1_1530707435273_3693310_13-165512-0 from training. Duration: 0.82 2023-10-03 15:57:13,206 WARNING [train.py:1204] (2/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_381-317791-0_sp0.9 from training. Duration: 0.811125 2023-10-03 15:57:18,724 WARNING [train.py:1204] (2/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_63-267585-0_sp1.1 from training. Duration: 0.8181875 2023-10-03 15:57:18,726 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_326-303945-0_sp1.1 from training. Duration: 0.9 2023-10-03 15:57:20,022 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_194-143381-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 15:57:21,365 WARNING [train.py:1204] (2/4) Exclude cut with ID _39290_210_4220_1_1533862778760_3075118_194-16667-0_sp1.1 from training. Duration: 0.963625 2023-10-03 15:57:22,845 WARNING [train.py:1204] (2/4) Exclude cut with ID _15012_210_1968_1_1530856820429_5095350_662-210469-0_sp0.9 from training. Duration: 0.6333125 2023-10-03 15:57:22,876 WARNING [train.py:1204] (2/4) Exclude cut with ID _4697_210_2069_1_1526815801953_3823098_68-219807-0_sp1.1 from training. Duration: 0.42725 2023-10-03 15:57:24,197 WARNING [train.py:1204] (2/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_321-22821-0_sp0.9 from training. Duration: 0.988875 2023-10-03 15:57:26,548 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.0.layers.0.feed_forward3.out_whiten, num_groups=1, num_channels=192, metric=12.32 vs. limit=15.0 2023-10-03 15:57:27,456 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_14056_210_4163_1_1530604483_7572827_1037-45317-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 15:57:27,458 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_403-39619-0_sp1.1 from training. Duration: 0.763625 2023-10-03 15:57:28,938 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_20120_210_6753_1_1531643035_5482859_510-96996-0_sp0.9 from training. Duration: 0.9666875 2023-10-03 15:57:28,941 WARNING [train.py:1204] (2/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_225-180155-0_sp1.1 from training. Duration: 0.92725 2023-10-03 15:57:30,313 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_278-272612-0_sp0.9 from training. Duration: 0.811125 2023-10-03 15:57:31,649 INFO [train.py:1046] (2/4) Epoch 38, batch 1950, loss[loss=0.1668, simple_loss=0.2386, pruned_loss=0.04754, over 23583.00 frames. ], tot_loss[loss=0.1584, simple_loss=0.2376, pruned_loss=0.03958, over 4691599.00 frames. ], batch size: 256, lr: 2.68e-03, grad_scale: 8.0 2023-10-03 15:57:31,686 WARNING [train.py:1204] (2/4) Exclude cut with ID _53713_210_24116_1_1534314487762_7416751_691-70244-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 15:57:34,525 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6788_210_1388_1_1527898698_4562021_91-197981-0_sp1.1 from training. Duration: 0.7545625 2023-10-03 15:57:37,775 WARNING [train.py:1204] (2/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_60-18123-0_sp0.9 from training. Duration: 0.97775 2023-10-03 15:57:37,816 WARNING [train.py:1204] (2/4) Exclude cut with ID _35799_210_5854_1_1533263487887_3529071_5-214617-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 15:57:37,821 WARNING [train.py:1204] (2/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_890-5117-0_sp1.1 from training. Duration: 0.7090625 2023-10-03 15:57:40,870 WARNING [train.py:1204] (2/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_660-211088-0 from training. Duration: 0.62 2023-10-03 15:57:40,936 WARNING [train.py:1204] (2/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_314-126819-0_sp0.9 from training. Duration: 0.5555625 2023-10-03 15:57:40,958 WARNING [train.py:1204] (2/4) Exclude cut with ID _21632_210_2253_1_1531994001765_3744140_362-66225-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 15:57:42,820 WARNING [train.py:1204] (2/4) Exclude cut with ID _61118_210_9527_1_1534897668884_3752630_173-258555-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 15:57:45,503 WARNING [train.py:1204] (2/4) Exclude cut with ID _7719_210_3525_1_1528456153163_4296960_88-250259-0_sp0.9 from training. Duration: 0.9333125 2023-10-03 15:57:46,840 WARNING [train.py:1204] (2/4) Exclude cut with ID _15140_210_9288_1_1530791388450_4880149_539-153708-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 15:57:46,875 WARNING [train.py:1204] (2/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_253-42544-0_sp1.1 from training. Duration: 0.97275 2023-10-03 15:57:48,277 WARNING [train.py:1204] (2/4) Exclude cut with ID _33640_210_2655_1_1533117605479_4855469_479-296877-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 15:57:48,759 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.5.encoder.layers.0.feed_forward3.out_whiten, num_groups=1, num_channels=256, metric=8.15 vs. limit=15.0 2023-10-03 15:57:52,258 WARNING [train.py:1204] (2/4) Exclude cut with ID 3033-130750-0096-55598-0_sp1.1 from training. Duration: 0.7545625 2023-10-03 15:57:52,280 WARNING [train.py:1204] (2/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_191-41023-0_sp1.1 from training. Duration: 0.6454375 2023-10-03 15:57:52,296 WARNING [train.py:1204] (2/4) Exclude cut with ID _8365_210_2064_1_1528592311070_4677690_431-294102-0_sp1.1 from training. Duration: 0.8090625 2023-10-03 15:57:52,327 WARNING [train.py:1204] (2/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_893-296030-0_sp1.1 from training. Duration: 0.97275 2023-10-03 15:57:56,528 WARNING [train.py:1204] (2/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_401-343820-0_sp1.1 from training. Duration: 0.97275 2023-10-03 15:57:59,808 WARNING [train.py:1204] (2/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_127-566-0_sp1.1 from training. Duration: 0.763625 2023-10-03 15:57:59,810 WARNING [train.py:1204] (2/4) Exclude cut with ID _14100_210_8341_1_1530529320613_3678069_126-272847-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 15:57:59,825 WARNING [train.py:1204] (2/4) Exclude cut with ID _15133_210_6228_1_1530794512074_3080379_29-264458-0_sp1.1 from training. Duration: 0.52725 2023-10-03 15:57:59,825 WARNING [train.py:1204] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_127-119109-0 from training. Duration: 0.72 2023-10-03 15:58:01,240 WARNING [train.py:1204] (2/4) Exclude cut with ID _6537_210_1883_1_1527987559710_3597129_382-133062-0_sp1.1 from training. Duration: 0.6181875 2023-10-03 15:58:01,262 WARNING [train.py:1204] (2/4) Exclude cut with ID _29529_210_7234_1_1532393961455_3977460_391-219682-0_sp1.1 from training. Duration: 0.936375 2023-10-03 15:58:02,561 WARNING [train.py:1204] (2/4) Exclude cut with ID _37537_210_2253_1_1533171758568_3421949_96-137189-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 15:58:04,123 WARNING [train.py:1204] (2/4) Exclude cut with ID _58414_210_8254_1_1535421609162_7151182_15-276301-0_sp1.1 from training. Duration: 0.97275 2023-10-03 15:58:05,334 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.744e+02 1.976e+02 2.296e+02 2.556e+02 3.551e+02, threshold=4.593e+02, percent-clipped=0.0 2023-10-03 15:58:06,848 WARNING [train.py:1204] (2/4) Exclude cut with ID _59502_210_12210_1_1535107024698_1835139_110-289074-0_sp1.1 from training. Duration: 0.92725 2023-10-03 15:58:10,244 WARNING [train.py:1204] (2/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_367-61330-0_sp1.1 from training. Duration: 0.6818125 2023-10-03 15:58:13,501 WARNING [train.py:1204] (2/4) Exclude cut with ID _45297_210_2721_1_1533715351381_3264949_353-9541-0_sp1.1 from training. Duration: 0.8818125 2023-10-03 15:58:15,347 WARNING [train.py:1204] (2/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_85-138257-0_sp0.9 from training. Duration: 0.788875 2023-10-03 15:58:15,380 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_3787_210_2040_1_1526477544_5478586_603-120324-0 from training. Duration: 0.81 2023-10-03 15:58:15,423 WARNING [train.py:1204] (2/4) Exclude cut with ID _42863_210_9070_1_1533902398788_3696920_411-204131-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 15:58:18,252 WARNING [train.py:1204] (2/4) Exclude cut with ID _30196_210_1664_1_1533708496522_7074140_822-289909-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 15:58:19,597 WARNING [train.py:1204] (2/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_32-335103-0_sp0.9 from training. Duration: 0.911125 2023-10-03 15:58:20,855 WARNING [train.py:1204] (2/4) Exclude cut with ID _6493_210_1747_1_1528459210424_4012470_513-140967-0_sp0.9 from training. Duration: 0.87775 2023-10-03 15:58:22,518 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.conv_module2.balancer2.prob, batch_count=1323526.6666666667, ans=0.125 2023-10-03 15:58:28,204 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_47896_210_20508_1_1534492518_5958868_439-257766-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 15:58:28,264 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_1294-63152-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 15:58:32,234 WARNING [train.py:1204] (2/4) Exclude cut with ID _59170_210_6721_1_1534983633960_4203335_319-166512-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 15:58:33,668 WARNING [train.py:1204] (2/4) Exclude cut with ID _3541_210_2477_1_1526475408709_3263973_146-3470-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 15:58:37,718 WARNING [train.py:1204] (2/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_678-159307-0_sp1.1 from training. Duration: 0.77275 2023-10-03 15:58:37,755 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_343-48822-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 15:58:37,809 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_4692_210_3528_1_1526899755_4323254_107-44401-0 from training. Duration: 0.86 2023-10-03 15:58:37,814 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6291_210_3273_1_1527683649_1078498_74-292608-0_sp1.1 from training. Duration: 0.6545625 2023-10-03 15:58:39,163 WARNING [train.py:1204] (2/4) Exclude cut with ID _55644_210_11856_1_1534593599582_4103719_293-248694-0_sp1.1 from training. Duration: 0.97275 2023-10-03 15:58:41,059 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_3172_210_2087_1_1526292929_3987865_197-97762-0 from training. Duration: 0.75 2023-10-03 15:58:44,417 WARNING [train.py:1204] (2/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_264-348896-0_sp0.9 from training. Duration: 0.9444375 2023-10-03 15:58:45,749 INFO [train.py:1046] (2/4) Epoch 38, batch 2000, loss[loss=0.147, simple_loss=0.2227, pruned_loss=0.03564, over 23389.00 frames. ], tot_loss[loss=0.1588, simple_loss=0.238, pruned_loss=0.03979, over 4693368.47 frames. ], batch size: 285, lr: 2.68e-03, grad_scale: 16.0 2023-10-03 15:58:47,806 WARNING [train.py:1204] (2/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_209-205365-0_sp0.9 from training. Duration: 0.87775 2023-10-03 15:58:47,874 WARNING [train.py:1204] (2/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_222-198127-0_sp0.9 from training. Duration: 0.9333125 2023-10-03 15:58:49,147 WARNING [train.py:1204] (2/4) Exclude cut with ID _57572_210_5517_1_1534921219624_3809484_52-43530-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 15:58:51,839 WARNING [train.py:1204] (2/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_96-320736-0_sp1.1 from training. Duration: 0.7818125 2023-10-03 15:58:53,218 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_13091_210_7925_1_1530609447_8987354_1159-225508-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 15:58:54,696 WARNING [train.py:1204] (2/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_237-44332-0 from training. Duration: 0.94 2023-10-03 15:58:56,035 WARNING [train.py:1204] (2/4) Exclude cut with ID _7970_210_1883_1_1528455616296_3472230_25-178407-0_sp0.9 from training. Duration: 0.788875 2023-10-03 15:58:57,535 WARNING [train.py:1204] (2/4) Exclude cut with ID _29003_210_19596_1_1532507391720_3894098_357-257062-0_sp1.1 from training. Duration: 0.936375 2023-10-03 15:59:00,774 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_809-17156-0 from training. Duration: 0.73 2023-10-03 15:59:02,090 WARNING [train.py:1204] (2/4) Exclude cut with ID _7160_210_1876_1_1527940923231_3703799_302-150728-0_sp1.1 from training. Duration: 0.7090625 2023-10-03 15:59:02,098 WARNING [train.py:1204] (2/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_444-28041-0_sp0.9 from training. Duration: 0.9444375 2023-10-03 15:59:04,779 WARNING [train.py:1204] (2/4) Exclude cut with ID _52892_210_6721_1_1534378941259_4124129_278-251666-0_sp1.1 from training. Duration: 0.92725 2023-10-03 15:59:06,129 WARNING [train.py:1204] (2/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_115-326778-0 from training. Duration: 0.93 2023-10-03 15:59:06,359 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.conv_module1.balancer2.prob, batch_count=1323726.6666666667, ans=0.125 2023-10-03 15:59:07,491 WARNING [train.py:1204] (2/4) Exclude cut with ID _41391_210_7994_1_1533603771872_3638201_31-188961-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 15:59:08,917 WARNING [train.py:1204] (2/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_1073-6264-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 15:59:08,935 WARNING [train.py:1204] (2/4) Exclude cut with ID _66868_210_9527_1_1535509287954_4561140_435-259515-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 15:59:10,325 WARNING [train.py:1204] (2/4) Exclude cut with ID _14886_210_4220_1_1530702020355_3516190_159-73748-0 from training. Duration: 0.83 2023-10-03 15:59:10,362 WARNING [train.py:1204] (2/4) Exclude cut with ID _15150_210_8341_1_1530752411803_7256384_469-239509-0_sp1.1 from training. Duration: 0.6181875 2023-10-03 15:59:12,313 WARNING [train.py:1204] (2/4) Exclude cut with ID _8064_210_5399_1_1528372299291_4282973_395-340852-0 from training. Duration: 0.93 2023-10-03 15:59:12,316 WARNING [train.py:1204] (2/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_138-187031-0_sp1.1 from training. Duration: 0.9 2023-10-03 15:59:16,968 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_13757_210_3528_1_1530440682_4107239_48-106347-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 15:59:18,767 WARNING [train.py:1204] (2/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_164-224052-0_sp1.1 from training. Duration: 0.636375 2023-10-03 15:59:18,780 WARNING [train.py:1204] (2/4) Exclude cut with ID _59288_210_5854_1_1535077757774_3412246_408-322648-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 15:59:18,826 WARNING [train.py:1204] (2/4) Exclude cut with ID _30191_210_1664_1_1533189915047_7196670_827-36359-0_sp1.1 from training. Duration: 0.963625 2023-10-03 15:59:20,180 WARNING [train.py:1204] (2/4) Exclude cut with ID _4680_210_3525_1_1527249757953_893386_75-78979-0_sp1.1 from training. Duration: 0.82725 2023-10-03 15:59:21,564 WARNING [train.py:1204] (2/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_383-165234-0 from training. Duration: 0.94 2023-10-03 15:59:24,304 WARNING [train.py:1204] (2/4) Exclude cut with ID _19896_210_1883_1_1531709022662_6649569_94-229927-0 from training. Duration: 0.97 2023-10-03 15:59:24,311 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6788_210_1388_1_1527898698_4562021_180-75288-0_sp1.1 from training. Duration: 0.9 2023-10-03 15:59:24,319 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_26433_210_12108_1_1532157158_4056480_54-321349-0_sp1.1 from training. Duration: 0.97275 2023-10-03 15:59:30,471 WARNING [train.py:1204] (2/4) Exclude cut with ID _56108_210_16380_1_1534505446159_3887959_265-161493-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 15:59:31,855 WARNING [train.py:1204] (2/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_395-111602-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 15:59:31,862 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_154-316450-0_sp0.9 from training. Duration: 0.8444375 2023-10-03 15:59:33,253 WARNING [train.py:1204] (2/4) Exclude cut with ID _43552_210_8913_1_1533726031059_3523429_381-116041-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 15:59:34,601 WARNING [train.py:1204] (2/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_65-104124-0_sp1.1 from training. Duration: 0.963625 2023-10-03 15:59:35,972 WARNING [train.py:1204] (2/4) Exclude cut with ID _6536_210_1883_1_1527850783339_3319069_169-23811-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 15:59:36,011 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_289-180904-0_sp0.9 from training. Duration: 0.8444375 2023-10-03 15:59:36,022 WARNING [train.py:1204] (2/4) Exclude cut with ID _56406_210_9288_1_1534899618664_3496149_226-242400-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 15:59:37,406 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_24899_210_16554_1_1532046422_3827462_276-216330-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 15:59:40,214 WARNING [train.py:1204] (2/4) Exclude cut with ID _29345_210_4381_1_1532689168263_3860169_198-174328-0_sp1.1 from training. Duration: 0.82725 2023-10-03 15:59:40,271 WARNING [train.py:1204] (2/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_338-96520-0 from training. Duration: 0.94 2023-10-03 15:59:46,245 WARNING [train.py:1204] (2/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_7-111954-0_sp1.1 from training. Duration: 0.5090625 2023-10-03 15:59:46,903 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.3.encoder.layers.1.conv_module2.whiten, num_groups=1, num_channels=512, metric=3.02 vs. limit=15.0 2023-10-03 15:59:47,580 WARNING [train.py:1204] (2/4) Exclude cut with ID _36692_210_5665_1_1533608967420_398809_8-16261-0_sp1.1 from training. Duration: 0.97275 2023-10-03 15:59:50,996 WARNING [train.py:1204] (2/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_86-212657-0_sp1.1 from training. Duration: 0.97275 2023-10-03 15:59:51,014 WARNING [train.py:1204] (2/4) Exclude cut with ID _44659_210_2721_1_1534036120061_6614860_626-298884-0_sp1.1 from training. Duration: 0.8818125 2023-10-03 15:59:53,917 WARNING [train.py:1204] (2/4) Exclude cut with ID _49755_210_18053_1_1534581005846_3218709_120-257918-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 15:59:56,553 WARNING [train.py:1204] (2/4) Exclude cut with ID _21881_210_3525_1_1531825242757_3798380_400-113821-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 15:59:56,556 WARNING [train.py:1204] (2/4) Exclude cut with ID _21783_210_11357_1_1531742458730_3762679_387-236773-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 15:59:57,952 WARNING [train.py:1204] (2/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_613-232856-0_sp1.1 from training. Duration: 0.4909375 2023-10-03 15:59:57,979 WARNING [train.py:1204] (2/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_181-78632-0_sp0.9 from training. Duration: 0.8666875 2023-10-03 15:59:59,292 INFO [train.py:1046] (2/4) Epoch 38, batch 2050, loss[loss=0.138, simple_loss=0.2031, pruned_loss=0.03642, over 23540.00 frames. ], tot_loss[loss=0.1585, simple_loss=0.2381, pruned_loss=0.03946, over 4702817.26 frames. ], batch size: 256, lr: 2.68e-03, grad_scale: 8.0 2023-10-03 15:59:59,386 WARNING [train.py:1204] (2/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_175-17485-0_sp1.1 from training. Duration: 0.97275 2023-10-03 16:00:01,251 WARNING [train.py:1204] (2/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_1004-183422-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 16:00:02,929 INFO [scaling.py:1118] (2/4) WithLoss: name=encoder.encoders.5.encoder.layers.1.self_attn_weights, loss-sum=0.000e+00 2023-10-03 16:00:04,013 WARNING [train.py:1204] (2/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_32-65962-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 16:00:05,406 WARNING [train.py:1204] (2/4) Exclude cut with ID _15458_210_8341_1_1530858614263_3834410_40-298664-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 16:00:08,372 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=1323993.3333333333, ans=0.1 2023-10-03 16:00:09,574 WARNING [train.py:1204] (2/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_380-261863-0_sp1.1 from training. Duration: 0.963625 2023-10-03 16:00:10,969 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_372-173012-0_sp1.1 from training. Duration: 0.863625 2023-10-03 16:00:11,026 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_57-82205-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 16:00:11,705 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.2.encoder.layers.0.self_attn1.whiten, num_groups=1, num_channels=384, metric=20.81 vs. limit=22.5 2023-10-03 16:00:12,468 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_3691_210_3528_1_1526468169_4062053_355-334432-0_sp0.9 from training. Duration: 0.9666875 2023-10-03 16:00:14,355 WARNING [train.py:1204] (2/4) Exclude cut with ID _19023_210_4353_1_1531378892552_5820780_28-36615-0 from training. Duration: 0.97 2023-10-03 16:00:15,675 WARNING [train.py:1204] (2/4) Exclude cut with ID _29853_210_7497_1_1532505600851_6642110_286-251333-0_sp1.1 from training. Duration: 0.92725 2023-10-03 16:00:16,945 WARNING [train.py:1204] (2/4) Exclude cut with ID _31602_210_5854_1_1532677021456_4458250_518-80679-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 16:00:16,980 WARNING [train.py:1204] (2/4) Exclude cut with ID _14922_210_6228_1_1530707435273_3693310_38-187851-0_sp0.9 from training. Duration: 0.688875 2023-10-03 16:00:21,103 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.feed_forward3.hidden_balancer.prob, batch_count=1324060.0, ans=0.125 2023-10-03 16:00:23,738 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.balancer1.prob, batch_count=1324060.0, ans=0.125 2023-10-03 16:00:26,297 WARNING [train.py:1204] (2/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_39-248870-0_sp1.1 from training. Duration: 0.62725 2023-10-03 16:00:26,315 WARNING [train.py:1204] (2/4) Exclude cut with ID _16908_210_5724_1_1531119157248_3772700_154-31455-0_sp1.1 from training. Duration: 0.97275 2023-10-03 16:00:26,694 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.bypass.skip_rate, batch_count=1324060.0, ans=0.07 2023-10-03 16:00:27,844 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8453_210_4211_1_1528502940_8562730_347-330673-0 from training. Duration: 0.72 2023-10-03 16:00:29,886 WARNING [train.py:1204] (2/4) Exclude cut with ID _30191_210_1664_1_1533189915047_7196670_674-204247-0_sp1.1 from training. Duration: 0.97275 2023-10-03 16:00:29,974 WARNING [train.py:1204] (2/4) Exclude cut with ID _8833_210_5219_1_1528592665210_7553059_329-125156-0 from training. Duration: 0.82 2023-10-03 16:00:31,291 WARNING [train.py:1204] (2/4) Exclude cut with ID _4779_210_2084_1_1527318022944_3914893_500-285013-0_sp1.1 from training. Duration: 0.62725 2023-10-03 16:00:32,733 WARNING [train.py:1204] (2/4) Exclude cut with ID _14421_210_8750_1_1530604795825_3542290_190-313578-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 16:00:35,354 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.628e+02 1.904e+02 2.086e+02 2.285e+02 3.176e+02, threshold=4.171e+02, percent-clipped=0.0 2023-10-03 16:00:35,465 WARNING [train.py:1204] (2/4) Exclude cut with ID _35799_210_5854_1_1533263487887_3529071_538-273574-0_sp1.1 from training. Duration: 0.936375 2023-10-03 16:00:36,709 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_14928_210_5632_1_1530773461_4714000_454-112198-0_sp1.1 from training. Duration: 0.6 2023-10-03 16:00:38,051 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_468-143126-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 16:00:39,527 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_4528_210_3273_1_1527076654_3276777_258-121681-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 16:00:39,605 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_397-132565-0_sp1.1 from training. Duration: 0.9 2023-10-03 16:00:40,843 WARNING [train.py:1204] (2/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_454-123002-0_sp0.9 from training. Duration: 0.7666875 2023-10-03 16:00:44,237 WARNING [train.py:1204] (2/4) Exclude cut with ID _27052_210_5986_1_1532329197187_3585009_297-86890-0_sp1.1 from training. Duration: 0.936375 2023-10-03 16:00:47,539 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_51225_210_3601_1_1534400634_2327383_55-301844-0_sp1.1 from training. Duration: 0.8454375 2023-10-03 16:00:48,443 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.0.layers.0.whiten, num_groups=1, num_channels=192, metric=3.99 vs. limit=12.0 2023-10-03 16:00:50,095 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6786_210_1388_1_1527839699_4194495_378-46070-0_sp0.9 from training. Duration: 0.8 2023-10-03 16:00:50,198 WARNING [train.py:1204] (2/4) Exclude cut with ID _42334_210_7770_1_1534206610396_3605880_55-19758-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 16:00:53,564 WARNING [train.py:1204] (2/4) Exclude cut with ID _14884_210_2252_1_1530707459689_5561250_114-201399-0_sp0.9 from training. Duration: 0.8444375 2023-10-03 16:01:00,722 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_99-251314-0_sp0.9 from training. Duration: 0.9666875 2023-10-03 16:01:00,800 WARNING [train.py:1204] (2/4) Exclude cut with ID _26528_210_9070_1_1532570410842_3851310_497-97218-0 from training. Duration: 0.86 2023-10-03 16:01:05,122 WARNING [train.py:1204] (2/4) Exclude cut with ID _55328_210_27767_1_1534580973260_4088170_230-317241-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 16:01:06,500 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_125-203105-0_sp0.9 from training. Duration: 0.888875 2023-10-03 16:01:07,884 WARNING [train.py:1204] (2/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_55-31904-0_sp1.1 from training. Duration: 0.8 2023-10-03 16:01:08,657 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.2.encoder.layers.0.conv_module1.whiten, num_groups=1, num_channels=384, metric=6.29 vs. limit=15.0 2023-10-03 16:01:09,272 WARNING [train.py:1204] (2/4) Exclude cut with ID _8064_210_5399_1_1528372299291_4282973_18-174647-0 from training. Duration: 0.91 2023-10-03 16:01:12,136 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.balancer1.prob, batch_count=1324326.6666666667, ans=0.125 2023-10-03 16:01:13,244 INFO [train.py:1046] (2/4) Epoch 38, batch 2100, loss[loss=0.1561, simple_loss=0.2441, pruned_loss=0.03406, over 24384.00 frames. ], tot_loss[loss=0.1575, simple_loss=0.2368, pruned_loss=0.03914, over 4713817.79 frames. ], batch size: 77, lr: 2.68e-03, grad_scale: 8.0 2023-10-03 16:01:13,345 WARNING [train.py:1204] (2/4) Exclude cut with ID IC0165W0005-81726 from training. Duration: 0.952 2023-10-03 16:01:13,345 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_31583_210_3852_1_1532595851_4024413_148-171847-0_sp1.1 from training. Duration: 0.963625 2023-10-03 16:01:15,128 WARNING [train.py:1204] (2/4) Exclude cut with ID _21989_210_5854_1_1531899214931_3271360_116-138769-0_sp1.1 from training. Duration: 0.936375 2023-10-03 16:01:15,190 WARNING [train.py:1204] (2/4) Exclude cut with ID _66070_210_7640_1_1535441393096_7805310_489-292631-0_sp1.1 from training. Duration: 0.7181875 2023-10-03 16:01:16,568 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_20120_210_6753_1_1531643035_5482859_202-27960-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 16:01:16,576 WARNING [train.py:1204] (2/4) Exclude cut with ID _5294_210_1747_1_1527249643928_3696179_342-270278-0 from training. Duration: 0.55 2023-10-03 16:01:16,612 WARNING [train.py:1204] (2/4) Exclude cut with ID _38323_210_12108_1_1533121233369_3303049_416-11919-0 from training. Duration: 0.96 2023-10-03 16:01:18,572 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6545_210_2040_1_1527774670_4245258_407-48070-0_sp0.9 from training. Duration: 0.8444375 2023-10-03 16:01:21,530 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.ff2_skip_rate, batch_count=1324326.6666666667, ans=0.0 2023-10-03 16:01:23,189 WARNING [train.py:1204] (2/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_503-30719-0_sp1.1 from training. Duration: 0.8909375 2023-10-03 16:01:23,255 WARNING [train.py:1204] (2/4) Exclude cut with ID _49643_210_7989_1_1534640244224_3939031_234-326497-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 16:01:23,455 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.feed_forward3.hidden_balancer.prob, batch_count=1324326.6666666667, ans=0.125 2023-10-03 16:01:26,073 WARNING [train.py:1204] (2/4) Exclude cut with ID _14170_210_5219_1_1530525949868_4469650_301-77873-0_sp1.1 from training. Duration: 0.963625 2023-10-03 16:01:26,125 WARNING [train.py:1204] (2/4) Exclude cut with ID _45297_210_2721_1_1533715351381_3264949_152-154697-0_sp1.1 from training. Duration: 0.97275 2023-10-03 16:01:26,910 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.1.encoder.layers.1.whiten, num_groups=1, num_channels=256, metric=3.79 vs. limit=12.0 2023-10-03 16:01:27,510 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_3371_210_1794_1_1526384462_5580777_396-339812-0 from training. Duration: 0.85 2023-10-03 16:01:27,601 WARNING [train.py:1204] (2/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_399-156791-0_sp1.1 from training. Duration: 0.7545625 2023-10-03 16:01:28,963 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7696_210_2040_1_1528207167_3716099_117-125434-0 from training. Duration: 0.76 2023-10-03 16:01:28,976 WARNING [train.py:1204] (2/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_96-244299-0 from training. Duration: 0.88 2023-10-03 16:01:30,462 WARNING [train.py:1204] (2/4) Exclude cut with ID _37892_210_19596_1_1533198677897_3964058_519-343462-0_sp1.1 from training. Duration: 0.92725 2023-10-03 16:01:30,486 WARNING [train.py:1204] (2/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_202-183922-0_sp1.1 from training. Duration: 0.77275 2023-10-03 16:01:30,492 WARNING [train.py:1204] (2/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_453-133983-0 from training. Duration: 0.78 2023-10-03 16:01:32,160 WARNING [train.py:1204] (2/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_178-39722-0_sp1.1 from training. Duration: 0.4454375 2023-10-03 16:01:34,128 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.4.encoder.layers.2.whiten, num_groups=1, num_channels=384, metric=2.72 vs. limit=12.0 2023-10-03 16:01:35,143 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.balancer1.prob, batch_count=1324393.3333333333, ans=0.125 2023-10-03 16:01:37,763 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_15126_210_6753_1_1530778677_2591185_113-157651-0 from training. Duration: 0.68 2023-10-03 16:01:37,764 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_271-234962-0_sp1.1 from training. Duration: 0.7181875 2023-10-03 16:01:40,528 WARNING [train.py:1204] (2/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_195-64967-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 16:01:41,825 WARNING [train.py:1204] (2/4) Exclude cut with ID _43552_210_8913_1_1533726031059_3523429_365-215065-0_sp1.1 from training. Duration: 0.936375 2023-10-03 16:01:43,365 WARNING [train.py:1204] (2/4) Exclude cut with ID _31602_210_5854_1_1532677021456_4458250_249-96604-0_sp0.9 from training. Duration: 0.988875 2023-10-03 16:01:43,636 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=1324460.0, ans=0.1 2023-10-03 16:01:45,288 WARNING [train.py:1204] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_307-158462-0 from training. Duration: 0.69 2023-10-03 16:01:45,332 WARNING [train.py:1204] (2/4) Exclude cut with ID _30918_210_6400_1_1532754078600_3125920_407-223394-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 16:01:45,335 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6673_210_3273_1_1527767790_3173240_351-167954-0_sp0.9 from training. Duration: 0.5333125 2023-10-03 16:01:45,480 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.conv_module2.balancer1.max_abs, batch_count=1324460.0, ans=10.0 2023-10-03 16:01:48,011 WARNING [train.py:1204] (2/4) Exclude cut with ID _4866_210_2577_1_1527404412284_4364659_256-306830-0 from training. Duration: 0.87 2023-10-03 16:01:49,224 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_17613_210_2151_1_1531137569_2366160_222-131118-0_sp1.1 from training. Duration: 0.963625 2023-10-03 16:01:49,243 WARNING [train.py:1204] (2/4) Exclude cut with ID _45560_210_21982_1_1533812603951_3584999_111-6376-0 from training. Duration: 0.84 2023-10-03 16:01:49,270 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_216-73841-0 from training. Duration: 0.96 2023-10-03 16:01:51,101 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_14058_210_4163_1_1530690953_6327180_49-210687-0 from training. Duration: 0.94 2023-10-03 16:01:52,487 WARNING [train.py:1204] (2/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_506-42614-0_sp0.9 from training. Duration: 0.87775 2023-10-03 16:01:53,927 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_3133_210_2087_1_1526106446_2324690_25-332138-0_sp1.1 from training. Duration: 0.82725 2023-10-03 16:01:54,689 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.3.encoder.layers.1.self_attn_weights.whiten_keys, num_groups=8, num_channels=256, metric=4.03 vs. limit=6.0 2023-10-03 16:01:55,373 WARNING [train.py:1204] (2/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_589-9070-0_sp1.1 from training. Duration: 0.6454375 2023-10-03 16:01:57,400 WARNING [train.py:1204] (2/4) Exclude cut with ID _6194_210_1534_1_1527511826160_3941194_14-282876-0_sp1.1 from training. Duration: 0.6454375 2023-10-03 16:01:58,883 WARNING [train.py:1204] (2/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_1166-90654-0_sp1.1 from training. Duration: 0.92725 2023-10-03 16:01:58,980 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_14056_210_4163_1_1530604483_7572827_218-255858-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 16:01:58,982 WARNING [train.py:1204] (2/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_1285-256622-0 from training. Duration: 0.84 2023-10-03 16:01:58,994 WARNING [train.py:1204] (2/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_205-286261-0_sp1.1 from training. Duration: 0.963625 2023-10-03 16:02:00,801 WARNING [train.py:1204] (2/4) Exclude cut with ID _68777_210_4353_1_1535810390731_3117904_6-154119-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 16:02:00,852 WARNING [train.py:1204] (2/4) Exclude cut with ID _22982_210_1883_1_1531882790447_5449409_565-199393-0_sp1.1 from training. Duration: 0.92725 2023-10-03 16:02:00,871 WARNING [train.py:1204] (2/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_278-154492-0 from training. Duration: 0.76 2023-10-03 16:02:02,294 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_426-313557-0 from training. Duration: 0.53 2023-10-03 16:02:03,672 WARNING [train.py:1204] (2/4) Exclude cut with ID _41817_210_18835_1_1533434477160_7423049_555-293448-0 from training. Duration: 0.8 2023-10-03 16:02:06,356 WARNING [train.py:1204] (2/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_32-335103-0_sp1.1 from training. Duration: 0.7454375 2023-10-03 16:02:09,196 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_2655_210_2087_1_1525688959_3897400_202-147557-0_sp1.1 from training. Duration: 0.77275 2023-10-03 16:02:09,223 WARNING [train.py:1204] (2/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_158-250116-0 from training. Duration: 0.62 2023-10-03 16:02:16,590 WARNING [train.py:1204] (2/4) Exclude cut with ID _55429_210_10626_1_1534467588527_7174143_528-88083-0_sp1.1 from training. Duration: 0.963625 2023-10-03 16:02:17,991 WARNING [train.py:1204] (2/4) Exclude cut with ID _8030_210_7497_1_1528444886781_3531089_32-31837-0_sp1.1 from training. Duration: 0.8818125 2023-10-03 16:02:19,943 WARNING [train.py:1204] (2/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_744-86168-0_sp1.1 from training. Duration: 0.97275 2023-10-03 16:02:19,952 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1113-7369-0_sp0.9 from training. Duration: 0.9444375 2023-10-03 16:02:19,969 WARNING [train.py:1204] (2/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_124-345840-0_sp0.9 from training. Duration: 0.57775 2023-10-03 16:02:20,013 WARNING [train.py:1204] (2/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_328-264776-0_sp0.9 from training. Duration: 0.7444375 2023-10-03 16:02:21,385 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.balancer1.prob, batch_count=1324593.3333333333, ans=0.125 2023-10-03 16:02:21,950 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.1.encoder.layers.1.whiten, num_groups=1, num_channels=256, metric=3.95 vs. limit=12.0 2023-10-03 16:02:22,585 WARNING [train.py:1204] (2/4) Exclude cut with ID _18895_210_6228_1_1531357274234_3784930_469-166615-0_sp1.1 from training. Duration: 0.963625 2023-10-03 16:02:22,591 WARNING [train.py:1204] (2/4) Exclude cut with ID _8395_210_1531_1_1528597871523_4411579_431-132470-0_sp0.9 from training. Duration: 0.82225 2023-10-03 16:02:23,932 WARNING [train.py:1204] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_554-45617-0_sp1.1 from training. Duration: 0.9 2023-10-03 16:02:23,953 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_35361_210_4238_1_1533262810_7793476_524-188242-0_sp1.1 from training. Duration: 0.92725 2023-10-03 16:02:25,390 WARNING [train.py:1204] (2/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_938-205135-0 from training. Duration: 0.87 2023-10-03 16:02:26,902 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_5931_210_2040_1_1527428481_4547999_321-193614-0 from training. Duration: 0.67 2023-10-03 16:02:26,921 WARNING [train.py:1204] (2/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_445-269521-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 16:02:28,741 INFO [train.py:1046] (2/4) Epoch 38, batch 2150, loss[loss=0.1548, simple_loss=0.24, pruned_loss=0.03483, over 24676.00 frames. ], tot_loss[loss=0.1568, simple_loss=0.236, pruned_loss=0.03881, over 4714780.69 frames. ], batch size: 65, lr: 2.68e-03, grad_scale: 8.0 2023-10-03 16:02:30,208 WARNING [train.py:1204] (2/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_3-33116-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 16:02:30,209 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_4633_210_2040_1_1526824163_4145343_416-76117-0_sp1.1 from training. Duration: 0.863625 2023-10-03 16:02:30,249 WARNING [train.py:1204] (2/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_1033-274376-0_sp1.1 from training. Duration: 0.7909375 2023-10-03 16:02:30,782 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.3.encoder.layers.2.conv_module2.whiten, num_groups=1, num_channels=512, metric=14.85 vs. limit=15.0 2023-10-03 16:02:31,945 WARNING [train.py:1204] (2/4) Exclude cut with ID _8772_210_1850_1_1528635595972_3664011_398-320049-0_sp0.9 from training. Duration: 0.97775 2023-10-03 16:02:36,119 WARNING [train.py:1204] (2/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_321-215742-0_sp0.9 from training. Duration: 0.5555625 2023-10-03 16:02:38,842 WARNING [train.py:1204] (2/4) Exclude cut with ID _28394_210_8913_1_1532430049518_3650118_272-190950-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 16:02:40,156 WARNING [train.py:1204] (2/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_31-299371-0_sp1.1 from training. Duration: 0.92725 2023-10-03 16:02:40,368 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.ff3_skip_rate, batch_count=1324660.0, ans=0.0 2023-10-03 16:02:41,578 WARNING [train.py:1204] (2/4) Exclude cut with ID _55383_210_10635_1_1534468426172_2231669_134-128246-0_sp0.9 from training. Duration: 0.988875 2023-10-03 16:02:41,587 WARNING [train.py:1204] (2/4) Exclude cut with ID _40519_210_13794_1_1533715228655_7337390_887-9547-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 16:02:42,884 WARNING [train.py:1204] (2/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_310-109461-0_sp0.9 from training. Duration: 0.888875 2023-10-03 16:02:45,013 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_2009_210_2071_1_1525481134_4588937_258-250196-0_sp1.1 from training. Duration: 0.92725 2023-10-03 16:02:46,336 WARNING [train.py:1204] (2/4) Exclude cut with ID _9235_210_5219_1_1528891154253_8046120_1186-72702-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 16:02:46,338 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_14831_210_7927_1_1530700200_5583610_181-27467-0_sp1.1 from training. Duration: 0.82725 2023-10-03 16:02:51,028 WARNING [train.py:1204] (2/4) Exclude cut with ID _40402_210_18219_1_1533522324115_2953150_278-132109-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 16:02:51,056 WARNING [train.py:1204] (2/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_261-297503-0 from training. Duration: 0.99 2023-10-03 16:02:56,300 WARNING [train.py:1204] (2/4) Exclude cut with ID _19896_210_1883_1_1531709022662_6649569_313-265903-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 16:02:56,418 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_105-114797-0_sp1.1 from training. Duration: 0.763625 2023-10-03 16:02:57,808 WARNING [train.py:1204] (2/4) Exclude cut with ID _53755_210_8254_1_1534577312941_4707360_9-65843-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 16:02:57,831 WARNING [train.py:1204] (2/4) Exclude cut with ID _18516_210_9206_1_1531704653206_3692406_361-224549-0_sp1.1 from training. Duration: 0.97275 2023-10-03 16:02:58,560 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.3.encoder.layers.0.self_attn1.whiten, num_groups=1, num_channels=512, metric=17.18 vs. limit=22.5 2023-10-03 16:02:59,138 WARNING [train.py:1204] (2/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_403-310975-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 16:02:59,192 WARNING [train.py:1204] (2/4) Exclude cut with ID _5444_210_2151_1_1527418808464_5312210_581-315411-0_sp0.9 from training. Duration: 0.688875 2023-10-03 16:02:59,330 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.conv_module1.balancer2.prob, batch_count=1324793.3333333333, ans=0.125 2023-10-03 16:03:00,855 WARNING [train.py:1204] (2/4) Exclude cut with ID _23825_210_13449_1_1531915313526_3497150_464-154904-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 16:03:00,864 WARNING [train.py:1204] (2/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_937-208281-0_sp0.9 from training. Duration: 0.9444375 2023-10-03 16:03:00,931 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_37839_210_7234_1_1533542462_3689303_158-181212-0_sp1.1 from training. Duration: 0.963625 2023-10-03 16:03:02,308 WARNING [train.py:1204] (2/4) Exclude cut with ID _8772_210_1850_1_1528635595972_3664011_398-320049-0 from training. Duration: 0.88 2023-10-03 16:03:04,123 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.630e+02 1.852e+02 2.034e+02 2.219e+02 3.109e+02, threshold=4.067e+02, percent-clipped=0.0 2023-10-03 16:03:04,206 WARNING [train.py:1204] (2/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_718-250907-0_sp1.1 from training. Duration: 0.62725 2023-10-03 16:03:05,550 WARNING [train.py:1204] (2/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_641-126395-0_sp1.1 from training. Duration: 0.92725 2023-10-03 16:03:05,573 WARNING [train.py:1204] (2/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_166-241493-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 16:03:06,908 WARNING [train.py:1204] (2/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_526-62079-0_sp0.9 from training. Duration: 0.7444375 2023-10-03 16:03:08,329 WARNING [train.py:1204] (2/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_439-275083-0_sp1.1 from training. Duration: 0.936375 2023-10-03 16:03:10,934 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_66907_210_21581_1_1535615661_3590177_493-229032-0_sp1.1 from training. Duration: 0.92725 2023-10-03 16:03:10,961 WARNING [train.py:1204] (2/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_89-321955-0_sp1.1 from training. Duration: 0.72725 2023-10-03 16:03:12,390 WARNING [train.py:1204] (2/4) Exclude cut with ID _29857_210_20034_1_1532395886753_3798160_273-226195-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 16:03:12,405 WARNING [train.py:1204] (2/4) Exclude cut with ID _22826_210_9994_1_1531908201522_3279169_404-75889-0 from training. Duration: 0.89 2023-10-03 16:03:12,431 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_271-234962-0_sp0.9 from training. Duration: 0.87775 2023-10-03 16:03:15,237 WARNING [train.py:1204] (2/4) Exclude cut with ID _22961_210_8254_1_1531976366672_4278180_156-339685-0_sp1.1 from training. Duration: 0.97275 2023-10-03 16:03:17,025 WARNING [train.py:1204] (2/4) Exclude cut with ID _37892_210_19596_1_1533198677897_3964058_425-231927-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 16:03:17,121 WARNING [train.py:1204] (2/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_102-288670-0_sp1.1 from training. Duration: 0.97275 2023-10-03 16:03:18,445 WARNING [train.py:1204] (2/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_283-220372-0_sp1.1 from training. Duration: 0.5090625 2023-10-03 16:03:19,834 WARNING [train.py:1204] (2/4) Exclude cut with ID _44304_210_15374_1_1533639352765_4613129_86-1699-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 16:03:19,907 WARNING [train.py:1204] (2/4) Exclude cut with ID _2891_210_2550_1_1525874398521_4427710_327-121590-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 16:03:19,914 WARNING [train.py:1204] (2/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_369-222060-0 from training. Duration: 0.71 2023-10-03 16:03:23,259 WARNING [train.py:1204] (2/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_762-109886-0 from training. Duration: 0.91 2023-10-03 16:03:23,286 WARNING [train.py:1204] (2/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_477-255782-0_sp0.9 from training. Duration: 0.911125 2023-10-03 16:03:24,563 WARNING [train.py:1204] (2/4) Exclude cut with ID IC0669W0061-330906 from training. Duration: 0.768 2023-10-03 16:03:25,910 WARNING [train.py:1204] (2/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_347-213071-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 16:03:25,936 WARNING [train.py:1204] (2/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_507-303370-0_sp1.1 from training. Duration: 0.87275 2023-10-03 16:03:26,024 WARNING [train.py:1204] (2/4) Exclude cut with ID _15012_210_1968_1_1530856820429_5095350_688-178849-0 from training. Duration: 0.64 2023-10-03 16:03:27,261 WARNING [train.py:1204] (2/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_28-70987-0_sp0.9 from training. Duration: 0.9666875 2023-10-03 16:03:27,264 WARNING [train.py:1204] (2/4) Exclude cut with ID _5910_210_1389_1_1527337972454_5050100_555-159815-0 from training. Duration: 0.98 2023-10-03 16:03:27,291 WARNING [train.py:1204] (2/4) Exclude cut with ID IC0679W0064-335871 from training. Duration: 0.768 2023-10-03 16:03:27,291 WARNING [train.py:1204] (2/4) Exclude cut with ID IC0679W0079-335886 from training. Duration: 0.896 2023-10-03 16:03:27,336 WARNING [train.py:1204] (2/4) Exclude cut with ID _5608_210_2106_1_1527415439089_6819768_909-82767-0 from training. Duration: 0.61 2023-10-03 16:03:28,707 WARNING [train.py:1204] (2/4) Exclude cut with ID _23038_210_9570_1_1532001584158_3652419_411-323967-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 16:03:30,011 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_40896_210_15710_1_1533433956_7100802_350-231748-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 16:03:30,021 WARNING [train.py:1204] (2/4) Exclude cut with ID _5289_210_2084_1_1527678202736_3779299_142-103317-0_sp0.9 from training. Duration: 0.9333125 2023-10-03 16:03:31,364 WARNING [train.py:1204] (2/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_314-144055-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 16:03:32,771 WARNING [train.py:1204] (2/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_177-236809-0_sp0.9 from training. Duration: 0.5444375 2023-10-03 16:03:34,708 WARNING [train.py:1204] (2/4) Exclude cut with ID _23038_210_9570_1_1532001584158_3652419_82-334733-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 16:03:34,722 WARNING [train.py:1204] (2/4) Exclude cut with ID _43552_210_8913_1_1533726031059_3523429_225-22526-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 16:03:38,292 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.feed_forward1.hidden_balancer.prob, batch_count=1324926.6666666667, ans=0.125 2023-10-03 16:03:39,732 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.conv_module2.balancer1.prob, batch_count=1324926.6666666667, ans=0.125 2023-10-03 16:03:42,155 INFO [train.py:1046] (2/4) Epoch 38, batch 2200, loss[loss=0.1751, simple_loss=0.2527, pruned_loss=0.04869, over 23595.00 frames. ], tot_loss[loss=0.1565, simple_loss=0.2361, pruned_loss=0.03848, over 4725196.16 frames. ], batch size: 256, lr: 2.68e-03, grad_scale: 8.0 2023-10-03 16:03:42,263 WARNING [train.py:1204] (2/4) Exclude cut with ID _30327_210_16049_1_1532484161295_3924169_401-70819-0_sp1.1 from training. Duration: 0.7818125 2023-10-03 16:03:42,300 WARNING [train.py:1204] (2/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_538-32291-0 from training. Duration: 0.83 2023-10-03 16:03:46,280 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_37357_210_17024_1_1533089504_2316121_258-211605-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 16:03:49,765 WARNING [train.py:1204] (2/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_550-335198-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 16:03:49,832 WARNING [train.py:1204] (2/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_98-741-0_sp0.9 from training. Duration: 0.888875 2023-10-03 16:03:49,890 WARNING [train.py:1204] (2/4) Exclude cut with ID _38323_210_12108_1_1533121233369_3303049_251-238988-0_sp1.1 from training. Duration: 0.92725 2023-10-03 16:03:51,918 WARNING [train.py:1204] (2/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_259-34038-0_sp0.9 from training. Duration: 0.82225 2023-10-03 16:03:53,165 WARNING [train.py:1204] (2/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_1060-180633-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 16:03:54,418 WARNING [train.py:1204] (2/4) Exclude cut with ID _8755_210_1876_1_1528628559357_3258209_211-37473-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 16:03:54,423 WARNING [train.py:1204] (2/4) Exclude cut with ID _4122_210_2084_1_1526713164417_4353280_528-206771-0 from training. Duration: 0.88 2023-10-03 16:03:58,569 WARNING [train.py:1204] (2/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_64-248359-0 from training. Duration: 0.69 2023-10-03 16:04:01,771 WARNING [train.py:1204] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_402-253027-0_sp1.1 from training. Duration: 0.4909375 2023-10-03 16:04:02,423 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.3.encoder.layers.3.self_attn_weights.whiten_keys, num_groups=8, num_channels=256, metric=2.78 vs. limit=6.0 2023-10-03 16:04:06,442 WARNING [train.py:1204] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_625-102955-0 from training. Duration: 0.77 2023-10-03 16:04:09,304 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.0.conv_skip_rate, batch_count=1325060.0, ans=0.0 2023-10-03 16:04:10,510 WARNING [train.py:1204] (2/4) Exclude cut with ID _14998_210_5219_1_1530705582080_7474970_611-219324-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 16:04:10,706 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.feed_forward2.hidden_balancer.prob, batch_count=1325126.6666666667, ans=0.125 2023-10-03 16:04:11,866 WARNING [train.py:1204] (2/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_718-68505-0_sp0.9 from training. Duration: 0.988875 2023-10-03 16:04:13,179 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_353-182855-0_sp1.1 from training. Duration: 0.87275 2023-10-03 16:04:16,153 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_5960_210_1794_1_1527403865_3990999_515-2613-0_sp0.9 from training. Duration: 0.97775 2023-10-03 16:04:18,086 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_5575_210_1410_1_1527301305_2570170_342-265572-0 from training. Duration: 0.94 2023-10-03 16:04:21,387 WARNING [train.py:1204] (2/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_329-113173-0_sp0.9 from training. Duration: 0.688875 2023-10-03 16:04:21,477 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_15396_210_6890_1_1531308473_1938936_213-312506-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 16:04:21,729 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.conv_module2.balancer1.max_abs, batch_count=1325126.6666666667, ans=10.0 2023-10-03 16:04:22,847 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_521-214276-0_sp1.1 from training. Duration: 0.4 2023-10-03 16:04:24,385 WARNING [train.py:1204] (2/4) Exclude cut with ID _15026_210_8750_1_1530777602316_3291850_211-195043-0_sp1.1 from training. Duration: 0.72725 2023-10-03 16:04:27,002 WARNING [train.py:1204] (2/4) Exclude cut with ID _29343_210_4381_1_1532602757752_3674109_108-257494-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 16:04:27,123 WARNING [train.py:1204] (2/4) Exclude cut with ID _64414_210_17301_1_1535243252048_3757759_403-58151-0_sp1.1 from training. Duration: 0.963625 2023-10-03 16:04:28,521 WARNING [train.py:1204] (2/4) Exclude cut with ID _36512_210_6987_1_1533033143998_3126069_109-177787-0_sp1.1 from training. Duration: 0.92725 2023-10-03 16:04:31,345 WARNING [train.py:1204] (2/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_498-40153-0 from training. Duration: 0.63 2023-10-03 16:04:32,687 WARNING [train.py:1204] (2/4) Exclude cut with ID _56447_210_1389_1_1535074245910_3697189_161-334523-0_sp1.1 from training. Duration: 0.936375 2023-10-03 16:04:32,778 WARNING [train.py:1204] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_238-245655-0 from training. Duration: 0.81 2023-10-03 16:04:36,068 WARNING [train.py:1204] (2/4) Exclude cut with ID _64631_210_27767_1_1535349580068_4165160_108-43865-0_sp1.1 from training. Duration: 0.92725 2023-10-03 16:04:36,073 WARNING [train.py:1204] (2/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_243-42116-0_sp1.1 from training. Duration: 0.663625 2023-10-03 16:04:36,110 WARNING [train.py:1204] (2/4) Exclude cut with ID _66868_210_9527_1_1535509287954_4561140_594-132160-0_sp1.1 from training. Duration: 0.92725 2023-10-03 16:04:37,539 WARNING [train.py:1204] (2/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_48-338976-0_sp0.9 from training. Duration: 0.988875 2023-10-03 16:04:39,447 WARNING [train.py:1204] (2/4) Exclude cut with ID _5250_210_5219_1_1527249561806_8692110_137-100595-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 16:04:39,462 WARNING [train.py:1204] (2/4) Exclude cut with ID _49748_210_24116_1_1533986971737_3158910_12-271669-0_sp1.1 from training. Duration: 0.936375 2023-10-03 16:04:39,490 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_37357_210_17024_1_1533089504_2316121_206-80421-0_sp1.1 from training. Duration: 0.936375 2023-10-03 16:04:39,575 WARNING [train.py:1204] (2/4) Exclude cut with ID _7537_210_2084_1_1528502395431_3924359_113-199933-0_sp1.1 from training. Duration: 0.536375 2023-10-03 16:04:40,886 WARNING [train.py:1204] (2/4) Exclude cut with ID _55440_210_12651_1_1534496713364_2980520_31-59289-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 16:04:42,472 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1083-220122-0_sp0.9 from training. Duration: 0.5666875 2023-10-03 16:04:45,814 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_426-313557-0_sp1.1 from training. Duration: 0.4818125 2023-10-03 16:04:47,064 WARNING [train.py:1204] (2/4) Exclude cut with ID _35799_210_5854_1_1533263487887_3529071_324-94843-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 16:04:48,508 WARNING [train.py:1204] (2/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_264-108685-0_sp1.1 from training. Duration: 0.5 2023-10-03 16:04:51,778 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9012W0006-501141 from training. Duration: 0.896 2023-10-03 16:04:53,139 WARNING [train.py:1204] (2/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_890-5117-0_sp0.9 from training. Duration: 0.8666875 2023-10-03 16:04:53,176 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9023W0008-506301 from training. Duration: 0.896 2023-10-03 16:04:54,630 WARNING [train.py:1204] (2/4) Exclude cut with ID _13916_210_9994_1_1530788341126_3763149_60-191556-0_sp0.9 from training. Duration: 0.67775 2023-10-03 16:04:55,927 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9029W0331-509721 from training. Duration: 0.896 2023-10-03 16:04:56,503 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.5.encoder.layers.0.conv_module1.whiten, num_groups=1, num_channels=256, metric=7.94 vs. limit=15.0 2023-10-03 16:04:57,214 INFO [train.py:1046] (2/4) Epoch 38, batch 2250, loss[loss=0.1417, simple_loss=0.2245, pruned_loss=0.02944, over 20427.00 frames. ], tot_loss[loss=0.1566, simple_loss=0.2361, pruned_loss=0.03854, over 4716662.12 frames. ], batch size: 44, lr: 2.68e-03, grad_scale: 8.0 2023-10-03 16:04:57,320 WARNING [train.py:1204] (2/4) Exclude cut with ID _35823_210_6917_1_1533187881914_7297830_567-108596-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 16:04:58,906 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7543_210_6753_1_1528544010_1110402_25-183860-0_sp1.1 from training. Duration: 0.42725 2023-10-03 16:05:00,245 WARNING [train.py:1204] (2/4) Exclude cut with ID _47278_210_12041_1_1533902418349_3576110_495-260035-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 16:05:00,350 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9051W0015-520281 from training. Duration: 0.9045625 2023-10-03 16:05:01,701 WARNING [train.py:1204] (2/4) Exclude cut with ID _14459_210_2228_1_1531443641946_7516700_707-66193-0_sp1.1 from training. Duration: 0.97275 2023-10-03 16:05:04,903 WARNING [train.py:1204] (2/4) Exclude cut with ID _14087_210_2253_1_1530529224512_3014029_219-224445-0_sp1.1 from training. Duration: 0.763625 2023-10-03 16:05:07,766 WARNING [train.py:1204] (2/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_345-167986-0_sp0.9 from training. Duration: 0.9333125 2023-10-03 16:05:10,478 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_34815_210_6388_1_1533381320_3571903_279-305565-0_sp1.1 from training. Duration: 0.836375 2023-10-03 16:05:15,095 WARNING [train.py:1204] (2/4) Exclude cut with ID _21987_210_5854_1_1531821500482_3637038_317-179212-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 16:05:15,165 WARNING [train.py:1204] (2/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_395-37222-0_sp1.1 from training. Duration: 0.6909375 2023-10-03 16:05:16,558 WARNING [train.py:1204] (2/4) Exclude cut with ID _8064_210_5399_1_1528372299291_4282973_168-50337-0_sp1.1 from training. Duration: 0.763625 2023-10-03 16:05:16,791 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.balancer1.prob, batch_count=1325393.3333333333, ans=0.125 2023-10-03 16:05:19,909 WARNING [train.py:1204] (2/4) Exclude cut with ID _60113_210_16271_1_1535164212481_4386079_559-167191-0 from training. Duration: 0.93 2023-10-03 16:05:19,928 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_47228_210_4983_1_1533780021_3672027_344-179642-0_sp1.1 from training. Duration: 0.92725 2023-10-03 16:05:19,952 WARNING [train.py:1204] (2/4) Exclude cut with ID _22961_210_8254_1_1531976366672_4278180_317-199546-0_sp0.9 from training. Duration: 0.9555625 2023-10-03 16:05:21,976 WARNING [train.py:1204] (2/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_141-89727-0 from training. Duration: 0.71 2023-10-03 16:05:23,272 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_78-116075-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 16:05:23,290 WARNING [train.py:1204] (2/4) Exclude cut with ID _64086_210_7994_1_1535270417019_7435949_52-235756-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 16:05:25,929 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7615_210_4211_1_1528110965_4580160_72-235211-0_sp1.1 from training. Duration: 0.6909375 2023-10-03 16:05:27,660 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.conv_module2.balancer1.prob, batch_count=1325460.0, ans=0.125 2023-10-03 16:05:28,772 WARNING [train.py:1204] (2/4) Exclude cut with ID _14069_210_5632_1_1530512692033_4803390_287-27950-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 16:05:28,876 WARNING [train.py:1204] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_74-48717-0_sp0.9 from training. Duration: 0.6444375 2023-10-03 16:05:30,163 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7544_210_6753_1_1528591327_1471426_172-348777-0_sp1.1 from training. Duration: 0.6 2023-10-03 16:05:31,563 WARNING [train.py:1204] (2/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_220-191080-0 from training. Duration: 0.98 2023-10-03 16:05:32,778 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.610e+02 1.873e+02 2.067e+02 2.203e+02 2.954e+02, threshold=4.134e+02, percent-clipped=0.0 2023-10-03 16:05:32,911 WARNING [train.py:1204] (2/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_1081-137641-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 16:05:36,086 WARNING [train.py:1204] (2/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_278-235459-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 16:05:38,899 WARNING [train.py:1204] (2/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_1181-77470-0_sp1.1 from training. Duration: 0.936375 2023-10-03 16:05:41,779 WARNING [train.py:1204] (2/4) Exclude cut with ID _60860_210_8254_1_1534896015875_3557169_198-5531-0_sp1.1 from training. Duration: 0.936375 2023-10-03 16:05:41,878 WARNING [train.py:1204] (2/4) Exclude cut with ID _12369_210_4381_1_1530356078581_4388520_17-289781-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 16:05:41,891 WARNING [train.py:1204] (2/4) Exclude cut with ID _50310_210_15488_1_1534068043345_3641080_146-199752-0_sp1.1 from training. Duration: 0.92725 2023-10-03 16:05:44,987 WARNING [train.py:1204] (2/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_969-213354-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 16:05:45,216 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.feed_forward3.hidden_balancer.prob, batch_count=1325526.6666666667, ans=0.125 2023-10-03 16:05:46,799 WARNING [train.py:1204] (2/4) Exclude cut with ID _70519_210_4163_1_1535961595994_7458230_66-344156-0_sp1.1 from training. Duration: 0.963625 2023-10-03 16:05:50,918 WARNING [train.py:1204] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_380-204756-0_sp1.1 from training. Duration: 0.7818125 2023-10-03 16:05:54,160 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_128-202104-0_sp0.9 from training. Duration: 0.7 2023-10-03 16:05:59,604 WARNING [train.py:1204] (2/4) Exclude cut with ID _4697_210_2069_1_1526815801953_3823098_51-1049-0_sp0.9 from training. Duration: 0.8333125 2023-10-03 16:05:59,629 WARNING [train.py:1204] (2/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_86-66192-0_sp1.1 from training. Duration: 0.5 2023-10-03 16:05:59,691 WARNING [train.py:1204] (2/4) Exclude cut with ID _14069_210_5632_1_1530512692033_4803390_95-1896-0_sp1.1 from training. Duration: 0.8909375 2023-10-03 16:06:01,213 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.1.nonlin_attention.balancer.prob, batch_count=1325593.3333333333, ans=0.125 2023-10-03 16:06:05,114 WARNING [train.py:1204] (2/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_29-293896-0_sp1.1 from training. Duration: 0.5545625 2023-10-03 16:06:05,377 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.1.balancer1.prob, batch_count=1325593.3333333333, ans=0.125 2023-10-03 16:06:06,932 WARNING [train.py:1204] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_167-137255-0_sp0.9 from training. Duration: 0.711125 2023-10-03 16:06:06,933 WARNING [train.py:1204] (2/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_217-95056-0 from training. Duration: 0.98 2023-10-03 16:06:06,953 WARNING [train.py:1204] (2/4) Exclude cut with ID _47278_210_12041_1_1533902418349_3576110_97-266746-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 16:06:08,312 WARNING [train.py:1204] (2/4) Exclude cut with ID _14224_210_4381_1_1530529062037_4264010_380-343670-0_sp1.1 from training. Duration: 0.8 2023-10-03 16:06:09,795 WARNING [train.py:1204] (2/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_107-332398-0 from training. Duration: 0.78 2023-10-03 16:06:11,258 INFO [train.py:1046] (2/4) Epoch 38, batch 2300, loss[loss=0.2049, simple_loss=0.2691, pruned_loss=0.07029, over 19160.00 frames. ], tot_loss[loss=0.1577, simple_loss=0.2373, pruned_loss=0.03908, over 4720812.27 frames. ], batch size: 388, lr: 2.68e-03, grad_scale: 8.0 2023-10-03 16:06:12,735 WARNING [train.py:1204] (2/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_91-267696-0_sp1.1 from training. Duration: 0.7909375 2023-10-03 16:06:14,002 WARNING [train.py:1204] (2/4) Exclude cut with ID _41298_210_9019_1_1533430371450_4822143_498-186993-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 16:06:14,391 INFO [scaling.py:1118] (2/4) WithLoss: name=encoder.encoders.4.encoder.layers.2.self_attn_weights, loss-sum=0.000e+00 2023-10-03 16:06:19,359 WARNING [train.py:1204] (2/4) Exclude cut with ID _17956_210_4353_1_1531209568125_6979970_54-77610-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 16:06:19,404 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_40047_210_2290_1_1533290519_2374311_257-335162-0_sp1.1 from training. Duration: 0.863625 2023-10-03 16:06:22,108 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9653W0193-675471 from training. Duration: 0.9656875 2023-10-03 16:06:23,922 WARNING [train.py:1204] (2/4) Exclude cut with ID _25248_210_8750_1_1532062825941_5554180_243-240536-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 16:06:29,419 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_32360_210_4198_1_1532689160_3648436_60-51908-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 16:06:29,427 WARNING [train.py:1204] (2/4) Exclude cut with ID _4092_210_2151_1_1526643044449_3135759_227-213972-0_sp0.9 from training. Duration: 0.6 2023-10-03 16:06:29,470 WARNING [train.py:1204] (2/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_392-173500-0_sp1.1 from training. Duration: 0.97275 2023-10-03 16:06:30,928 WARNING [train.py:1204] (2/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_71-329150-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 16:06:30,930 WARNING [train.py:1204] (2/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_866-173440-0 from training. Duration: 0.9 2023-10-03 16:06:31,011 WARNING [train.py:1204] (2/4) Exclude cut with ID _14832_210_8750_1_1530691351539_3171348_1-54315-0_sp0.9 from training. Duration: 0.9666875 2023-10-03 16:06:33,729 WARNING [train.py:1204] (2/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_430-116139-0_sp1.1 from training. Duration: 0.77275 2023-10-03 16:06:33,766 WARNING [train.py:1204] (2/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_356-45054-0_sp0.9 from training. Duration: 0.9555625 2023-10-03 16:06:35,959 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.2.encoder.layers.1.self_attn2.whiten, num_groups=1, num_channels=384, metric=16.79 vs. limit=22.5 2023-10-03 16:06:39,845 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_3531_210_3528_1_1526294203_5216456_309-227302-0_sp0.9 from training. Duration: 0.7666875 2023-10-03 16:06:40,148 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=1325793.3333333333, ans=0.1 2023-10-03 16:06:41,244 WARNING [train.py:1204] (2/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_301-330455-0_sp1.1 from training. Duration: 0.72725 2023-10-03 16:06:41,476 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.feed_forward1.hidden_balancer.prob, batch_count=1325793.3333333333, ans=0.125 2023-10-03 16:06:44,062 WARNING [train.py:1204] (2/4) Exclude cut with ID _23557_210_8750_1_1531890106370_6846255_667-145945-0_sp1.1 from training. Duration: 0.92725 2023-10-03 16:06:50,535 WARNING [train.py:1204] (2/4) Exclude cut with ID _14860_210_4915_1_1530702020340_7343240_836-328489-0_sp1.1 from training. Duration: 0.8181875 2023-10-03 16:06:50,570 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_37152_210_9697_1_1533106573_3844188_416-333249-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 16:06:53,388 WARNING [train.py:1204] (2/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_538-32291-0_sp0.9 from training. Duration: 0.92225 2023-10-03 16:06:55,581 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.bypass.scale_min, batch_count=1325860.0, ans=0.2 2023-10-03 16:06:56,784 WARNING [train.py:1204] (2/4) Exclude cut with ID _32704_210_8954_1_1533017488840_6747370_965-176824-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 16:07:00,900 WARNING [train.py:1204] (2/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_86-19601-0_sp1.1 from training. Duration: 0.77275 2023-10-03 16:07:00,965 WARNING [train.py:1204] (2/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_436-318113-0_sp1.1 from training. Duration: 0.6818125 2023-10-03 16:07:00,995 WARNING [train.py:1204] (2/4) Exclude cut with ID _41817_210_18835_1_1533434477160_7423049_555-293448-0_sp0.9 from training. Duration: 0.888875 2023-10-03 16:07:01,011 WARNING [train.py:1204] (2/4) Exclude cut with ID _4697_210_2069_1_1526815801953_3823098_68-219807-0 from training. Duration: 0.47 2023-10-03 16:07:05,290 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7544_210_6753_1_1528591327_1471426_33-346031-0_sp1.1 from training. Duration: 0.5545625 2023-10-03 16:07:05,292 WARNING [train.py:1204] (2/4) Exclude cut with ID _49748_210_24116_1_1533986971737_3158910_263-52113-0_sp1.1 from training. Duration: 0.97275 2023-10-03 16:07:05,343 WARNING [train.py:1204] (2/4) Exclude cut with ID _64639_210_5816_1_1535367619437_3528000_340-24580-0_sp1.1 from training. Duration: 0.92725 2023-10-03 16:07:05,356 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_47228_210_4983_1_1533780021_3672027_479-114941-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 16:07:05,380 WARNING [train.py:1204] (2/4) Exclude cut with ID _44585_210_18628_1_1533605350387_4518199_506-119659-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 16:07:05,445 WARNING [train.py:1204] (2/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_314-126819-0_sp1.1 from training. Duration: 0.4545625 2023-10-03 16:07:05,448 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_2541_210_1794_1_1525590269_3084765_156-173713-0_sp0.9 from training. Duration: 0.9 2023-10-03 16:07:06,862 WARNING [train.py:1204] (2/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_108-255430-0 from training. Duration: 0.97 2023-10-03 16:07:06,869 WARNING [train.py:1204] (2/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_791-45149-0_sp1.1 from training. Duration: 0.7818125 2023-10-03 16:07:06,875 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_53378_210_17282_1_1534227054_1261425_10-118295-0_sp1.1 from training. Duration: 0.97275 2023-10-03 16:07:06,935 WARNING [train.py:1204] (2/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_148-158509-0 from training. Duration: 0.41 2023-10-03 16:07:15,479 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_34726_210_4525_1_1533019474_5030395_687-291305-0_sp1.1 from training. Duration: 0.936375 2023-10-03 16:07:18,778 WARNING [train.py:1204] (2/4) Exclude cut with ID _14860_210_4915_1_1530702020340_7343240_264-324537-0_sp1.1 from training. Duration: 0.963625 2023-10-03 16:07:19,618 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.2.encoder.layers.2.self_attn1.whiten, num_groups=1, num_channels=384, metric=17.15 vs. limit=22.5 2023-10-03 16:07:22,206 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_41251_210_15368_1_1533429767_4820695_591-46386-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 16:07:22,243 WARNING [train.py:1204] (2/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_86-19601-0_sp0.9 from training. Duration: 0.9444375 2023-10-03 16:07:22,271 WARNING [train.py:1204] (2/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_401-255491-0_sp0.9 from training. Duration: 0.77775 2023-10-03 16:07:24,860 WARNING [train.py:1204] (2/4) Exclude cut with ID _8696_210_1876_1_1528545632440_3345058_209-54069-0_sp1.1 from training. Duration: 0.6090625 2023-10-03 16:07:24,876 WARNING [train.py:1204] (2/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_369-325184-0_sp1.1 from training. Duration: 0.9 2023-10-03 16:07:26,599 INFO [train.py:1046] (2/4) Epoch 38, batch 2350, loss[loss=0.168, simple_loss=0.2382, pruned_loss=0.04885, over 23450.00 frames. ], tot_loss[loss=0.1585, simple_loss=0.2381, pruned_loss=0.03943, over 4720512.12 frames. ], batch size: 119, lr: 2.68e-03, grad_scale: 8.0 2023-10-03 16:07:26,702 WARNING [train.py:1204] (2/4) Exclude cut with ID _14687_210_5854_1_1530703838259_3664889_115-239046-0_sp0.9 from training. Duration: 0.7444375 2023-10-03 16:07:27,945 WARNING [train.py:1204] (2/4) Exclude cut with ID _8064_210_5399_1_1528372299291_4282973_168-50337-0 from training. Duration: 0.84 2023-10-03 16:07:33,796 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.bypass.skip_rate, batch_count=1325993.3333333333, ans=0.04949747468305833 2023-10-03 16:07:35,036 WARNING [train.py:1204] (2/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_909-64151-0_sp1.1 from training. Duration: 0.8545625 2023-10-03 16:07:35,048 WARNING [train.py:1204] (2/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_597-154640-0 from training. Duration: 0.9 2023-10-03 16:07:36,843 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.balancer2.prob, batch_count=1325993.3333333333, ans=0.125 2023-10-03 16:07:37,091 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.5.encoder.layers.0.nonlin_attention.whiten2, num_groups=1, num_channels=256, metric=3.24 vs. limit=15.0 2023-10-03 16:07:39,226 WARNING [train.py:1204] (2/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_126-274694-0 from training. Duration: 0.98 2023-10-03 16:07:43,713 WARNING [train.py:1204] (2/4) Exclude cut with ID _21783_210_11357_1_1531742458730_3762679_344-269786-0_sp1.1 from training. Duration: 0.97275 2023-10-03 16:07:45,336 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=1326060.0, ans=0.1 2023-10-03 16:07:46,496 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_37818_210_4238_1_1533620910_4720083_322-253154-0_sp1.1 from training. Duration: 0.92725 2023-10-03 16:07:46,499 WARNING [train.py:1204] (2/4) Exclude cut with ID _58895_210_8750_1_1535184081320_3649089_221-15823-0_sp1.1 from training. Duration: 0.92725 2023-10-03 16:07:46,509 WARNING [train.py:1204] (2/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_9-21737-0_sp1.1 from training. Duration: 0.8818125 2023-10-03 16:07:47,841 WARNING [train.py:1204] (2/4) Exclude cut with ID _14069_210_5632_1_1530512692033_4803390_522-341588-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 16:07:47,926 WARNING [train.py:1204] (2/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_363-185703-0 from training. Duration: 0.95 2023-10-03 16:07:51,301 WARNING [train.py:1204] (2/4) Exclude cut with ID _41817_210_18835_1_1533434477160_7423049_265-76216-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 16:07:51,906 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.3.encoder.layers.3.self_attn2.whiten, num_groups=1, num_channels=512, metric=10.06 vs. limit=22.5 2023-10-03 16:07:55,940 WARNING [train.py:1204] (2/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_841-341679-0 from training. Duration: 0.58 2023-10-03 16:07:57,359 WARNING [train.py:1204] (2/4) Exclude cut with ID _55429_210_10626_1_1534467588527_7174143_97-335084-0_sp1.1 from training. Duration: 0.8818125 2023-10-03 16:08:02,026 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.592e+02 1.917e+02 2.142e+02 2.413e+02 3.614e+02, threshold=4.285e+02, percent-clipped=0.0 2023-10-03 16:08:02,135 WARNING [train.py:1204] (2/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_690-126306-0_sp0.9 from training. Duration: 0.8666875 2023-10-03 16:08:02,143 WARNING [train.py:1204] (2/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_102-92751-0_sp1.1 from training. Duration: 0.9 2023-10-03 16:08:03,609 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_3787_210_2040_1_1526477544_5478586_603-120324-0_sp1.1 from training. Duration: 0.736375 2023-10-03 16:08:04,920 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_33348_210_5632_1_1533086912_3324030_85-28583-0 from training. Duration: 0.82 2023-10-03 16:08:04,978 WARNING [train.py:1204] (2/4) Exclude cut with ID _60057_210_10640_1_1534903065793_3333150_362-230377-0_sp1.1 from training. Duration: 0.7909375 2023-10-03 16:08:07,756 WARNING [train.py:1204] (2/4) Exclude cut with ID _60113_210_16271_1_1535164212481_4386079_194-146486-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 16:08:07,772 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_53753_210_7234_1_1534320003_3594148_171-277668-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 16:08:09,039 WARNING [train.py:1204] (2/4) Exclude cut with ID _34423_210_8750_1_1533430798456_3330199_251-332788-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 16:08:09,369 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.conv_module2.balancer1.max_abs, batch_count=1326193.3333333333, ans=10.0 2023-10-03 16:08:10,513 WARNING [train.py:1204] (2/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_663-166973-0_sp0.9 from training. Duration: 0.888875 2023-10-03 16:08:13,876 WARNING [train.py:1204] (2/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_927-43827-0 from training. Duration: 0.74 2023-10-03 16:08:13,939 WARNING [train.py:1204] (2/4) Exclude cut with ID _28323_210_16187_1_1532480395667_4100300_306-74561-0_sp1.1 from training. Duration: 0.8545625 2023-10-03 16:08:14,074 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.feed_forward3.hidden_balancer.prob, batch_count=1326193.3333333333, ans=0.125 2023-10-03 16:08:16,637 WARNING [train.py:1204] (2/4) Exclude cut with ID _15037_210_2253_1_1530752294104_3267650_139-337989-0_sp1.1 from training. Duration: 0.92725 2023-10-03 16:08:16,650 WARNING [train.py:1204] (2/4) Exclude cut with ID _49039_210_6315_1_1533967537541_3846118_460-263670-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 16:08:18,083 WARNING [train.py:1204] (2/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_186-309161-0 from training. Duration: 0.54 2023-10-03 16:08:19,848 WARNING [train.py:1204] (2/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_87-278686-0_sp1.1 from training. Duration: 0.7 2023-10-03 16:08:23,174 WARNING [train.py:1204] (2/4) Exclude cut with ID _27052_210_5986_1_1532329197187_3585009_314-106499-0 from training. Duration: 0.92 2023-10-03 16:08:23,191 WARNING [train.py:1204] (2/4) Exclude cut with ID _3465_210_3525_1_1526175667060_3574049_412-287526-0_sp0.9 from training. Duration: 0.8 2023-10-03 16:08:27,242 WARNING [train.py:1204] (2/4) Exclude cut with ID _22961_210_8254_1_1531976366672_4278180_317-199546-0 from training. Duration: 0.86 2023-10-03 16:08:29,439 WARNING [train.py:1204] (2/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_381-317791-0 from training. Duration: 0.73 2023-10-03 16:08:30,709 WARNING [train.py:1204] (2/4) Exclude cut with ID _44010_210_4525_1_1533884262964_4013620_560-310411-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 16:08:30,721 WARNING [train.py:1204] (2/4) Exclude cut with ID _5294_210_1747_1_1527249643928_3696179_342-270278-0_sp0.9 from training. Duration: 0.611125 2023-10-03 16:08:30,746 WARNING [train.py:1204] (2/4) Exclude cut with ID ID0473W0002-918951 from training. Duration: 0.924 2023-10-03 16:08:31,981 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1083-220122-0 from training. Duration: 0.51 2023-10-03 16:08:34,732 WARNING [train.py:1204] (2/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_138-154812-0 from training. Duration: 0.78 2023-10-03 16:08:36,191 WARNING [train.py:1204] (2/4) Exclude cut with ID _44982_210_1511_1_1534312820108_3726029_331-19420-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 16:08:40,329 INFO [train.py:1046] (2/4) Epoch 38, batch 2400, loss[loss=0.1693, simple_loss=0.2351, pruned_loss=0.05178, over 23730.00 frames. ], tot_loss[loss=0.1582, simple_loss=0.2379, pruned_loss=0.0392, over 4722796.93 frames. ], batch size: 179, lr: 2.68e-03, grad_scale: 16.0 2023-10-03 16:08:40,413 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_2655_210_2087_1_1525688959_3897400_202-147557-0_sp0.9 from training. Duration: 0.9444375 2023-10-03 16:08:42,291 INFO [scaling.py:1118] (2/4) WithLoss: name=encoder.encoders.5.encoder.layers.0.self_attn_weights, loss-sum=0.000e+00 2023-10-03 16:08:43,868 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_23850_210_4238_1_1532232597_1632433_32-185135-0_sp0.9 from training. Duration: 0.9555625 2023-10-03 16:08:46,512 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_14056_210_4163_1_1530604483_7572827_46-93547-0_sp1.1 from training. Duration: 0.863625 2023-10-03 16:08:47,922 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7931_210_1794_1_1528594728_5564059_280-279560-0 from training. Duration: 0.98 2023-10-03 16:08:47,955 WARNING [train.py:1204] (2/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_937-208281-0 from training. Duration: 0.85 2023-10-03 16:08:54,233 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_14928_210_5632_1_1530773461_4714000_454-112198-0_sp0.9 from training. Duration: 0.7333125 2023-10-03 16:08:54,234 WARNING [train.py:1204] (2/4) Exclude cut with ID _32704_210_8954_1_1533017488840_6747370_944-153333-0_sp1.1 from training. Duration: 0.963625 2023-10-03 16:08:56,225 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.2.encoder.layers.0.conv_module1.whiten, num_groups=1, num_channels=384, metric=9.75 vs. limit=15.0 2023-10-03 16:08:56,867 WARNING [train.py:1204] (2/4) Exclude cut with ID _14821_210_8750_1_1530680503991_6829359_2-227151-0 from training. Duration: 0.83 2023-10-03 16:08:58,216 WARNING [train.py:1204] (2/4) Exclude cut with ID _28461_210_2253_1_1532771775189_3041299_96-152158-0_sp1.1 from training. Duration: 0.77275 2023-10-03 16:08:59,561 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7837_210_1792_1_1528453445_5841305_654-54195-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 16:08:59,593 WARNING [train.py:1204] (2/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_344-152526-0 from training. Duration: 0.53 2023-10-03 16:08:59,831 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.ff2_skip_rate, batch_count=1326393.3333333333, ans=0.0 2023-10-03 16:09:04,305 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_30167_210_3528_1_1532428012_4772375_480-142230-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 16:09:07,090 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_160-218721-0 from training. Duration: 0.96 2023-10-03 16:09:09,977 WARNING [train.py:1204] (2/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_65-88242-0_sp0.9 from training. Duration: 0.62225 2023-10-03 16:09:14,591 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_34886_210_10633_1_1533203882_1755661_76-156096-0 from training. Duration: 0.87 2023-10-03 16:09:15,979 WARNING [train.py:1204] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_188-38388-0_sp1.1 from training. Duration: 0.92725 2023-10-03 16:09:17,957 WARNING [train.py:1204] (2/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_81-289335-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 16:09:22,336 WARNING [train.py:1204] (2/4) Exclude cut with ID _59493_210_17282_1_1535078419077_3225879_110-8778-0_sp1.1 from training. Duration: 0.936375 2023-10-03 16:09:22,373 WARNING [train.py:1204] (2/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_188-260011-0 from training. Duration: 0.73 2023-10-03 16:09:22,410 WARNING [train.py:1204] (2/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_453-133983-0_sp0.9 from training. Duration: 0.8666875 2023-10-03 16:09:29,526 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8054_210_7927_1_1528627721_4984094_553-17379-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 16:09:32,726 WARNING [train.py:1204] (2/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_341-171072-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 16:09:35,628 WARNING [train.py:1204] (2/4) Exclude cut with ID _30800_210_22382_1_1532499347904_4515959_265-257286-0_sp1.1 from training. Duration: 0.963625 2023-10-03 16:09:36,940 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_403-39619-0_sp0.9 from training. Duration: 0.9333125 2023-10-03 16:09:36,945 WARNING [train.py:1204] (2/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_464-33931-0_sp0.9 from training. Duration: 0.6 2023-10-03 16:09:36,973 WARNING [train.py:1204] (2/4) Exclude cut with ID _60804_210_9642_1_1534831199859_7268299_632-143863-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 16:09:36,986 WARNING [train.py:1204] (2/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_431-310997-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 16:09:37,024 WARNING [train.py:1204] (2/4) Exclude cut with ID _31649_210_6228_1_1532584608063_4091078_114-263860-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 16:09:37,043 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6961_210_2087_1_1527937168_6815011_562-165518-0_sp0.9 from training. Duration: 0.8555625 2023-10-03 16:09:41,356 WARNING [train.py:1204] (2/4) Exclude cut with ID _27052_210_5986_1_1532329197187_3585009_338-325870-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 16:09:42,549 WARNING [train.py:1204] (2/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_469-94673-0_sp1.1 from training. Duration: 0.7181875 2023-10-03 16:09:42,574 WARNING [train.py:1204] (2/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_259-34038-0 from training. Duration: 0.74 2023-10-03 16:09:44,419 WARNING [train.py:1204] (2/4) Exclude cut with ID _14922_210_6228_1_1530707435273_3693310_388-129552-0 from training. Duration: 0.96 2023-10-03 16:09:47,803 WARNING [train.py:1204] (2/4) Exclude cut with ID _28394_210_8913_1_1532430049518_3650118_342-251455-0_sp1.1 from training. Duration: 0.936375 2023-10-03 16:09:47,830 WARNING [train.py:1204] (2/4) Exclude cut with ID _53731_210_7994_1_1534553979244_3757214_8-191390-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 16:09:49,171 WARNING [train.py:1204] (2/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_613-232856-0 from training. Duration: 0.54 2023-10-03 16:09:49,237 WARNING [train.py:1204] (2/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_557-349534-0 from training. Duration: 0.91 2023-10-03 16:09:49,246 WARNING [train.py:1204] (2/4) Exclude cut with ID _14224_210_4381_1_1530529062037_4264010_379-184223-0 from training. Duration: 0.85 2023-10-03 16:09:49,249 WARNING [train.py:1204] (2/4) Exclude cut with ID IC0125W0001-61807 from training. Duration: 0.966 2023-10-03 16:09:51,967 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_229-45300-0 from training. Duration: 0.57 2023-10-03 16:09:53,909 WARNING [train.py:1204] (2/4) Exclude cut with ID _30304_210_6319_1_1532476827261_7582060_1069-24743-0_sp1.1 from training. Duration: 0.92725 2023-10-03 16:09:54,040 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_41452_210_7291_1_1533437687_3941281_323-172873-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 16:09:55,238 INFO [train.py:1046] (2/4) Epoch 38, batch 2450, loss[loss=0.1447, simple_loss=0.2327, pruned_loss=0.02836, over 24484.00 frames. ], tot_loss[loss=0.1575, simple_loss=0.2372, pruned_loss=0.0389, over 4733595.52 frames. ], batch size: 63, lr: 2.68e-03, grad_scale: 16.0 2023-10-03 16:09:55,288 WARNING [train.py:1204] (2/4) Exclude cut with ID _37086_210_9038_1_1532999540891_7021250_802-226709-0_sp1.1 from training. Duration: 0.97275 2023-10-03 16:09:56,970 WARNING [train.py:1204] (2/4) Exclude cut with ID IC0144W0500-71767 from training. Duration: 0.931875 2023-10-03 16:09:57,043 WARNING [train.py:1204] (2/4) Exclude cut with ID _39846_210_12651_1_1533605310265_2115212_233-129699-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 16:09:58,394 WARNING [train.py:1204] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_574-45541-0_sp0.9 from training. Duration: 0.72225 2023-10-03 16:10:01,296 WARNING [train.py:1204] (2/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_372-298093-0_sp1.1 from training. Duration: 0.72725 2023-10-03 16:10:01,310 WARNING [train.py:1204] (2/4) Exclude cut with ID _62574_210_2655_1_1535180407058_3933249_34-26343-0_sp1.1 from training. Duration: 0.97275 2023-10-03 16:10:06,014 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_512-256238-0_sp1.1 from training. Duration: 0.963625 2023-10-03 16:10:06,019 WARNING [train.py:1204] (2/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_196-55502-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 16:10:07,445 WARNING [train.py:1204] (2/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_195-167011-0 from training. Duration: 0.75 2023-10-03 16:10:11,697 WARNING [train.py:1204] (2/4) Exclude cut with ID _13678_210_7497_1_1530530069226_3463170_345-230416-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 16:10:11,712 WARNING [train.py:1204] (2/4) Exclude cut with ID _51078_210_9070_1_1534161557424_3805250_225-327809-0_sp1.1 from training. Duration: 0.963625 2023-10-03 16:10:16,429 WARNING [train.py:1204] (2/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_18-189198-0_sp1.1 from training. Duration: 0.8181875 2023-10-03 16:10:16,440 WARNING [train.py:1204] (2/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_329-82235-0_sp1.1 from training. Duration: 0.7454375 2023-10-03 16:10:16,448 WARNING [train.py:1204] (2/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_726-270780-0_sp0.9 from training. Duration: 0.9555625 2023-10-03 16:10:16,483 WARNING [train.py:1204] (2/4) Exclude cut with ID _66873_210_5517_1_1535454136506_4293370_654-52713-0 from training. Duration: 0.77 2023-10-03 16:10:20,889 WARNING [train.py:1204] (2/4) Exclude cut with ID _20455_210_5219_1_1531565996775_8098100_191-16006-0_sp1.1 from training. Duration: 0.963625 2023-10-03 16:10:22,356 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_3171_210_2087_1_1526109084_2330007_66-260583-0_sp1.1 from training. Duration: 0.6545625 2023-10-03 16:10:23,809 WARNING [train.py:1204] (2/4) Exclude cut with ID _38670_210_15379_1_1533448810659_4391059_63-129006-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 16:10:24,089 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.feed_forward2.hidden_balancer.prob, batch_count=1326793.3333333333, ans=0.125 2023-10-03 16:10:26,905 WARNING [train.py:1204] (2/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_393-57888-0_sp0.9 from training. Duration: 0.711125 2023-10-03 16:10:26,956 WARNING [train.py:1204] (2/4) Exclude cut with ID _6275_210_2084_1_1528110062213_7669049_857-298942-0_sp0.9 from training. Duration: 0.9444375 2023-10-03 16:10:28,385 WARNING [train.py:1204] (2/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_239-22076-0_sp0.9 from training. Duration: 0.9444375 2023-10-03 16:10:29,660 WARNING [train.py:1204] (2/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_424-227947-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 16:10:31,032 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.541e+02 1.921e+02 2.165e+02 2.566e+02 3.578e+02, threshold=4.329e+02, percent-clipped=0.0 2023-10-03 16:10:31,119 WARNING [train.py:1204] (2/4) Exclude cut with ID _14069_210_5632_1_1530512692033_4803390_95-1896-0 from training. Duration: 0.98 2023-10-03 16:10:32,509 WARNING [train.py:1204] (2/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_96-244299-0_sp0.9 from training. Duration: 0.97775 2023-10-03 16:10:38,734 WARNING [train.py:1204] (2/4) Exclude cut with ID _46683_210_9642_1_1533730913492_4591116_562-319825-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 16:10:40,099 WARNING [train.py:1204] (2/4) Exclude cut with ID _64639_210_5816_1_1535367619437_3528000_284-173474-0_sp1.1 from training. Duration: 0.963625 2023-10-03 16:10:41,376 WARNING [train.py:1204] (2/4) Exclude cut with ID _29529_210_7234_1_1532393961455_3977460_497-100133-0_sp1.1 from training. Duration: 0.936375 2023-10-03 16:10:41,405 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_12868_210_6753_1_1530701932_530275_18-76322-0_sp0.9 from training. Duration: 0.9666875 2023-10-03 16:10:42,741 WARNING [train.py:1204] (2/4) Exclude cut with ID _50093_210_4220_1_1534417141207_3432299_116-303447-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 16:10:42,839 WARNING [train.py:1204] (2/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_44-79427-0_sp1.1 from training. Duration: 0.8909375 2023-10-03 16:10:44,220 WARNING [train.py:1204] (2/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_887-249555-0 from training. Duration: 0.77 2023-10-03 16:10:47,605 WARNING [train.py:1204] (2/4) Exclude cut with ID _28461_210_2253_1_1532771775189_3041299_96-152158-0_sp0.9 from training. Duration: 0.9444375 2023-10-03 16:10:47,659 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_14058_210_4163_1_1530690953_6327180_49-210687-0_sp1.1 from training. Duration: 0.8545625 2023-10-03 16:10:50,502 WARNING [train.py:1204] (2/4) Exclude cut with ID _40399_210_4353_1_1533305658194_3279406_351-127903-0_sp1.1 from training. Duration: 0.97275 2023-10-03 16:10:50,509 WARNING [train.py:1204] (2/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_184-206939-0_sp1.1 from training. Duration: 0.936375 2023-10-03 16:10:55,144 WARNING [train.py:1204] (2/4) Exclude cut with ID _14100_210_8341_1_1530529320613_3678069_258-224599-0_sp1.1 from training. Duration: 0.7 2023-10-03 16:10:55,166 WARNING [train.py:1204] (2/4) Exclude cut with ID _6493_210_1747_1_1528459210424_4012470_324-323854-0 from training. Duration: 0.92 2023-10-03 16:10:57,021 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_12868_210_6753_1_1530702479_3610564_151-325626-0_sp1.1 from training. Duration: 0.8818125 2023-10-03 16:10:58,393 WARNING [train.py:1204] (2/4) Exclude cut with ID _14528_210_8254_1_1530669700514_4306939_106-43145-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 16:10:58,407 WARNING [train.py:1204] (2/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_686-54383-0 from training. Duration: 0.87 2023-10-03 16:10:59,757 WARNING [train.py:1204] (2/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_801-102362-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 16:11:01,133 WARNING [train.py:1204] (2/4) Exclude cut with ID _48677_210_5663_1_1533902405919_3892989_408-40740-0_sp0.9 from training. Duration: 0.92225 2023-10-03 16:11:03,929 WARNING [train.py:1204] (2/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_448-212135-0_sp1.1 from training. Duration: 0.82725 2023-10-03 16:11:05,378 WARNING [train.py:1204] (2/4) Exclude cut with ID _59170_210_6721_1_1534983633960_4203335_320-124047-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 16:11:05,414 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_4692_210_3528_1_1526899755_4323254_107-44401-0_sp1.1 from training. Duration: 0.7818125 2023-10-03 16:11:10,252 INFO [train.py:1046] (2/4) Epoch 38, batch 2500, loss[loss=0.1627, simple_loss=0.2477, pruned_loss=0.03886, over 24303.00 frames. ], tot_loss[loss=0.1567, simple_loss=0.2364, pruned_loss=0.03854, over 4749361.18 frames. ], batch size: 74, lr: 2.68e-03, grad_scale: 16.0 2023-10-03 16:11:10,385 WARNING [train.py:1204] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_731-2428-0 from training. Duration: 0.83 2023-10-03 16:11:11,750 WARNING [train.py:1204] (2/4) Exclude cut with ID _29857_210_20034_1_1532395886753_3798160_335-163562-0_sp1.1 from training. Duration: 0.77275 2023-10-03 16:11:16,462 WARNING [train.py:1204] (2/4) Exclude cut with ID _14832_210_8750_1_1530691351539_3171348_171-275004-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 16:11:27,865 WARNING [train.py:1204] (2/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_105-328174-0_sp0.9 from training. Duration: 0.8444375 2023-10-03 16:11:27,891 WARNING [train.py:1204] (2/4) Exclude cut with ID _68965_210_1883_1_1535698962272_3497490_164-118421-0_sp1.1 from training. Duration: 0.936375 2023-10-03 16:11:27,985 WARNING [train.py:1204] (2/4) Exclude cut with ID _19896_210_1883_1_1531709022662_6649569_380-24170-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 16:11:27,991 WARNING [train.py:1204] (2/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_505-205270-0_sp1.1 from training. Duration: 0.3454375 2023-10-03 16:11:28,129 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.feed_forward1.hidden_balancer.prob, batch_count=1327060.0, ans=0.125 2023-10-03 16:11:33,036 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.balancer2.prob, batch_count=1327060.0, ans=0.125 2023-10-03 16:11:35,625 WARNING [train.py:1204] (2/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_427-208901-0_sp1.1 from training. Duration: 0.6181875 2023-10-03 16:11:35,669 WARNING [train.py:1204] (2/4) Exclude cut with ID _23818_210_6228_1_1531917063884_3676920_85-326916-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 16:11:36,995 WARNING [train.py:1204] (2/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_101-293367-0_sp1.1 from training. Duration: 0.57275 2023-10-03 16:11:37,009 WARNING [train.py:1204] (2/4) Exclude cut with ID _5409_210_1388_1_1527292850189_5865191_178-127970-0_sp1.1 from training. Duration: 0.5181875 2023-10-03 16:11:38,323 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_403-39619-0 from training. Duration: 0.84 2023-10-03 16:11:40,335 WARNING [train.py:1204] (2/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_427-87841-0_sp1.1 from training. Duration: 0.963625 2023-10-03 16:11:41,718 WARNING [train.py:1204] (2/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_286-68181-0_sp1.1 from training. Duration: 0.97275 2023-10-03 16:11:43,040 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6673_210_3273_1_1527767790_3173240_327-306185-0 from training. Duration: 0.83 2023-10-03 16:11:43,055 WARNING [train.py:1204] (2/4) Exclude cut with ID _3675_210_4557_1_1526639934408_4182089_106-203713-0_sp1.1 from training. Duration: 0.963625 2023-10-03 16:11:43,115 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_53071_210_16554_1_1534493434_1276796_130-36076-0 from training. Duration: 0.95 2023-10-03 16:11:43,149 WARNING [train.py:1204] (2/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_586-93054-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 16:11:47,738 WARNING [train.py:1204] (2/4) Exclude cut with ID _23825_210_13449_1_1531915313526_3497150_262-10571-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 16:11:48,002 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.conv_module2.balancer1.prob, batch_count=1327126.6666666667, ans=0.125 2023-10-03 16:11:49,039 WARNING [train.py:1204] (2/4) Exclude cut with ID _28783_210_7943_1_1532671164808_7155479_432-67885-0_sp1.1 from training. Duration: 0.97275 2023-10-03 16:11:51,757 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6287_210_3273_1_1527509506_2653716_182-199887-0_sp0.9 from training. Duration: 0.5666875 2023-10-03 16:11:51,814 WARNING [train.py:1204] (2/4) Exclude cut with ID _3231_210_2106_1_1526205864934_6784220_171-256004-0 from training. Duration: 0.86 2023-10-03 16:11:52,028 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.bypass.scale_min, batch_count=1327126.6666666667, ans=0.2 2023-10-03 16:11:53,110 WARNING [train.py:1204] (2/4) Exclude cut with ID _69051_210_1531_1_1535885566901_3829538_332-92090-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 16:11:53,456 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.ff3_skip_rate, batch_count=1327193.3333333333, ans=0.0 2023-10-03 16:11:53,462 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.feed_forward2.hidden_balancer.prob, batch_count=1327193.3333333333, ans=0.125 2023-10-03 16:11:54,562 WARNING [train.py:1204] (2/4) Exclude cut with ID _3675_210_4557_1_1526639934408_4182089_104-324125-0_sp1.1 from training. Duration: 0.963625 2023-10-03 16:11:58,030 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_41764_210_2521_1_1533686110_4089313_501-2119-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 16:12:02,583 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_14056_210_4163_1_1530604483_7572827_56-100988-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 16:12:04,058 WARNING [train.py:1204] (2/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_398-239819-0_sp1.1 from training. Duration: 0.9 2023-10-03 16:12:08,227 WARNING [train.py:1204] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_74-48717-0_sp1.1 from training. Duration: 0.52725 2023-10-03 16:12:10,283 WARNING [train.py:1204] (2/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_177-49092-0 from training. Duration: 0.91 2023-10-03 16:12:10,307 WARNING [train.py:1204] (2/4) Exclude cut with ID _62337_210_12392_1_1535009273105_3412229_44-176353-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 16:12:10,317 WARNING [train.py:1204] (2/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_110-130961-0_sp1.1 from training. Duration: 0.42725 2023-10-03 16:12:12,319 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.0.layers.1.feed_forward3.out_whiten, num_groups=1, num_channels=192, metric=9.18 vs. limit=15.0 2023-10-03 16:12:12,716 WARNING [train.py:1204] (2/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_37-189262-0_sp1.1 from training. Duration: 0.8909375 2023-10-03 16:12:12,717 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_3172_210_2087_1_1526292929_3987865_197-97762-0_sp0.9 from training. Duration: 0.8333125 2023-10-03 16:12:12,830 WARNING [train.py:1204] (2/4) Exclude cut with ID IC0677W0065-334897 from training. Duration: 0.896 2023-10-03 16:12:12,831 WARNING [train.py:1204] (2/4) Exclude cut with ID IC0677W0080-334912 from training. Duration: 0.768 2023-10-03 16:12:14,076 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_120-41668-0 from training. Duration: 0.7 2023-10-03 16:12:17,370 WARNING [train.py:1204] (2/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_460-205287-0_sp1.1 from training. Duration: 0.963625 2023-10-03 16:12:18,993 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.bypass_mid.scale_min, batch_count=1327260.0, ans=0.2 2023-10-03 16:12:20,066 WARNING [train.py:1204] (2/4) Exclude cut with ID _14632_210_9988_1_1530691279392_3923200_102-204477-0 from training. Duration: 0.97 2023-10-03 16:12:20,071 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_34815_210_6388_1_1533381320_3571903_279-305565-0 from training. Duration: 0.92 2023-10-03 16:12:20,132 WARNING [train.py:1204] (2/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_91-148831-0_sp1.1 from training. Duration: 0.9 2023-10-03 16:12:20,194 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6291_210_3273_1_1527683649_1078498_82-221196-0 from training. Duration: 0.76 2023-10-03 16:12:21,860 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.bypass_mid.scale_min, batch_count=1327260.0, ans=0.2 2023-10-03 16:12:22,929 WARNING [train.py:1204] (2/4) Exclude cut with ID _38020_210_5986_1_1533112252259_7006089_493-102745-0 from training. Duration: 0.98 2023-10-03 16:12:24,749 INFO [train.py:1046] (2/4) Epoch 38, batch 2550, loss[loss=0.1644, simple_loss=0.2422, pruned_loss=0.04336, over 23806.00 frames. ], tot_loss[loss=0.1571, simple_loss=0.237, pruned_loss=0.03858, over 4745908.71 frames. ], batch size: 164, lr: 2.68e-03, grad_scale: 16.0 2023-10-03 16:12:24,992 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=1327326.6666666667, ans=0.1 2023-10-03 16:12:26,204 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.0.feed_forward1.hidden_balancer.prob, batch_count=1327326.6666666667, ans=0.125 2023-10-03 16:12:27,396 WARNING [train.py:1204] (2/4) Exclude cut with ID _61133_210_9969_1_1535196777227_3696409_40-93442-0_sp1.1 from training. Duration: 0.92725 2023-10-03 16:12:29,323 WARNING [train.py:1204] (2/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_45-258852-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 16:12:29,365 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_53071_210_16554_1_1534493434_1276796_130-36076-0_sp1.1 from training. Duration: 0.863625 2023-10-03 16:12:30,782 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8694_210_6400_1_1529065754_3260756_332-13553-0_sp1.1 from training. Duration: 0.92725 2023-10-03 16:12:32,097 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_405-339986-0 from training. Duration: 0.51 2023-10-03 16:12:32,151 WARNING [train.py:1204] (2/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_927-43827-0_sp1.1 from training. Duration: 0.67275 2023-10-03 16:12:37,462 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_397-147196-0 from training. Duration: 0.8 2023-10-03 16:12:38,873 WARNING [train.py:1204] (2/4) Exclude cut with ID 3557-8342-0013-54691-0_sp1.1 from training. Duration: 0.836375 2023-10-03 16:12:40,835 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_47438_210_5399_1_1533797471_4513356_626-127722-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 16:12:41,124 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.ff3_skip_rate, batch_count=1327393.3333333333, ans=0.0 2023-10-03 16:12:42,237 WARNING [train.py:1204] (2/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_256-323883-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 16:12:42,242 WARNING [train.py:1204] (2/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_839-173017-0_sp0.9 from training. Duration: 0.4555625 2023-10-03 16:12:43,518 WARNING [train.py:1204] (2/4) Exclude cut with ID _5289_210_2084_1_1527678202736_3779299_392-261988-0_sp1.1 from training. Duration: 0.5909375 2023-10-03 16:12:44,826 WARNING [train.py:1204] (2/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_119-224384-0_sp1.1 from training. Duration: 0.936375 2023-10-03 16:12:44,855 WARNING [train.py:1204] (2/4) Exclude cut with ID _57523_210_1856_1_1534766294545_3732989_262-267933-0_sp1.1 from training. Duration: 0.963625 2023-10-03 16:12:47,653 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_35361_210_4238_1_1533262810_7793476_675-87012-0_sp0.9 from training. Duration: 0.988875 2023-10-03 16:12:47,677 WARNING [train.py:1204] (2/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_17-90541-0 from training. Duration: 0.94 2023-10-03 16:12:47,706 WARNING [train.py:1204] (2/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_535-14410-0_sp1.1 from training. Duration: 0.42725 2023-10-03 16:12:47,718 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_24673_210_6400_1_1531968633_778659_94-154511-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 16:12:47,721 WARNING [train.py:1204] (2/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_212-10501-0 from training. Duration: 0.94 2023-10-03 16:12:59,651 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.548e+02 1.861e+02 2.049e+02 2.382e+02 3.363e+02, threshold=4.098e+02, percent-clipped=0.0 2023-10-03 16:13:03,258 WARNING [train.py:1204] (2/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_383-285196-0_sp1.1 from training. Duration: 0.8818125 2023-10-03 16:13:07,316 WARNING [train.py:1204] (2/4) Exclude cut with ID _45560_210_21982_1_1533812603951_3584999_238-47020-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 16:13:07,325 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_21500_210_2054_1_1532149310_2716758_278-18053-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 16:13:07,339 WARNING [train.py:1204] (2/4) Exclude cut with ID _56235_210_10640_1_1534557335235_4161099_453-9626-0_sp1.1 from training. Duration: 0.936375 2023-10-03 16:13:08,801 WARNING [train.py:1204] (2/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_153-97441-0_sp1.1 from training. Duration: 0.5090625 2023-10-03 16:13:14,912 WARNING [train.py:1204] (2/4) Exclude cut with ID _67725_210_27767_1_1535608756671_5105359_431-120895-0_sp1.1 from training. Duration: 0.963625 2023-10-03 16:13:17,740 WARNING [train.py:1204] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_574-45541-0_sp1.1 from training. Duration: 0.5909375 2023-10-03 16:13:17,765 WARNING [train.py:1204] (2/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_162-103620-0_sp1.1 from training. Duration: 0.8454375 2023-10-03 16:13:17,776 WARNING [train.py:1204] (2/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_734-129761-0_sp1.1 from training. Duration: 0.7454375 2023-10-03 16:13:17,816 WARNING [train.py:1204] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_620-276793-0_sp1.1 from training. Duration: 0.536375 2023-10-03 16:13:19,091 WARNING [train.py:1204] (2/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_153-97441-0_sp0.9 from training. Duration: 0.62225 2023-10-03 16:13:22,376 WARNING [train.py:1204] (2/4) Exclude cut with ID _12733_210_8341_1_1530167424556_3646380_523-85621-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 16:13:22,389 WARNING [train.py:1204] (2/4) Exclude cut with ID _64048_210_10626_1_1535198382802_3835179_352-179834-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 16:13:27,871 WARNING [train.py:1204] (2/4) Exclude cut with ID _55162_210_17071_1_1534408248583_3753480_43-242364-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 16:13:27,900 WARNING [train.py:1204] (2/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_661-264857-0 from training. Duration: 0.93 2023-10-03 16:13:27,908 WARNING [train.py:1204] (2/4) Exclude cut with ID _6274_210_2084_1_1527937583837_7494020_325-237800-0_sp1.1 from training. Duration: 0.97275 2023-10-03 16:13:27,953 WARNING [train.py:1204] (2/4) Exclude cut with ID _55328_210_27767_1_1534580973260_4088170_385-171459-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 16:13:29,435 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_3780_210_2087_1_1526467389_2145572_176-107140-0_sp1.1 from training. Duration: 0.62725 2023-10-03 16:13:29,526 WARNING [train.py:1204] (2/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_375-244976-0_sp1.1 from training. Duration: 0.6818125 2023-10-03 16:13:31,345 WARNING [train.py:1204] (2/4) Exclude cut with ID _35941_210_15379_1_1532995294515_4023089_309-33162-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 16:13:35,974 WARNING [train.py:1204] (2/4) Exclude cut with ID _14459_210_2228_1_1531443641946_7516700_1185-22038-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 16:13:38,554 INFO [train.py:1046] (2/4) Epoch 38, batch 2600, loss[loss=0.1758, simple_loss=0.2473, pruned_loss=0.05217, over 23558.00 frames. ], tot_loss[loss=0.1578, simple_loss=0.2378, pruned_loss=0.03895, over 4734843.21 frames. ], batch size: 256, lr: 2.68e-03, grad_scale: 16.0 2023-10-03 16:13:38,651 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_1197-69460-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 16:13:40,630 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9012W0022-501157 from training. Duration: 0.896 2023-10-03 16:13:42,445 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.5.encoder.layers.1.feed_forward1.out_whiten, num_groups=1, num_channels=256, metric=7.36 vs. limit=15.0 2023-10-03 16:13:43,378 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9023W0009-506302 from training. Duration: 0.896 2023-10-03 16:13:44,681 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_2821_210_2715_1_1526108385_3763560_73-296673-0_sp1.1 from training. Duration: 0.7545625 2023-10-03 16:13:44,712 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9025W0113-507442 from training. Duration: 0.896 2023-10-03 16:13:46,078 WARNING [train.py:1204] (2/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_739-2098-0 from training. Duration: 0.92 2023-10-03 16:13:46,091 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9029W0332-509722 from training. Duration: 0.768 2023-10-03 16:13:46,320 INFO [scaling.py:1118] (2/4) WithLoss: name=encoder.encoders.2.encoder.layers.1.self_attn_weights, loss-sum=0.000e+00 2023-10-03 16:13:48,772 WARNING [train.py:1204] (2/4) Exclude cut with ID _16908_210_5724_1_1531119157248_3772700_68-101953-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 16:13:50,462 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9042W0011-515617 from training. Duration: 0.768 2023-10-03 16:13:50,568 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_3019_210_2028_1_1526044245_3264865_191-72235-0 from training. Duration: 0.97 2023-10-03 16:13:51,957 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9052W0006-520792 from training. Duration: 0.896 2023-10-03 16:13:53,465 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.0.bypass.scale_min, batch_count=1327726.6666666667, ans=0.2 2023-10-03 16:13:53,556 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.ff3_skip_rate, batch_count=1327726.6666666667, ans=0.0 2023-10-03 16:13:54,672 WARNING [train.py:1204] (2/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_245-174553-0_sp1.1 from training. Duration: 0.8 2023-10-03 16:13:56,134 WARNING [train.py:1204] (2/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_202-67017-0 from training. Duration: 0.82 2023-10-03 16:13:56,228 WARNING [train.py:1204] (2/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_304-59227-0 from training. Duration: 0.77 2023-10-03 16:13:56,355 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.bypass.scale_min, batch_count=1327726.6666666667, ans=0.2 2023-10-03 16:13:56,401 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.conv_skip_rate, batch_count=1327726.6666666667, ans=0.0 2023-10-03 16:13:58,793 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_3133_210_2087_1_1526106446_2324690_246-125000-0_sp0.9 from training. Duration: 0.688875 2023-10-03 16:13:58,830 WARNING [train.py:1204] (2/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_243-42116-0 from training. Duration: 0.73 2023-10-03 16:14:02,099 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9087W0121-538492 from training. Duration: 0.896 2023-10-03 16:14:02,114 WARNING [train.py:1204] (2/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_120-76843-0 from training. Duration: 0.55 2023-10-03 16:14:08,139 WARNING [train.py:1204] (2/4) Exclude cut with ID _7852_210_5219_1_1528286471316_8139400_850-304621-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 16:14:09,396 WARNING [train.py:1204] (2/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_154-285415-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 16:14:09,407 WARNING [train.py:1204] (2/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_538-278194-0_sp1.1 from training. Duration: 0.92725 2023-10-03 16:14:09,408 WARNING [train.py:1204] (2/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_119-207162-0 from training. Duration: 0.71 2023-10-03 16:14:12,002 WARNING [train.py:1204] (2/4) Exclude cut with ID _2334_210_2667_1_1525608238880_4705129_459-104997-0_sp1.1 from training. Duration: 0.863625 2023-10-03 16:14:13,019 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.feed_forward1.hidden_balancer.prob, batch_count=1327793.3333333333, ans=0.125 2023-10-03 16:14:17,001 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9147W0019-567847 from training. Duration: 0.768 2023-10-03 16:14:21,636 WARNING [train.py:1204] (2/4) Exclude cut with ID _69884_210_9771_1_1535846392818_4166169_228-189439-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 16:14:23,004 WARNING [train.py:1204] (2/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_196-77172-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 16:14:24,391 WARNING [train.py:1204] (2/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_625-338546-0 from training. Duration: 0.88 2023-10-03 16:14:24,433 WARNING [train.py:1204] (2/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_193-327630-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 16:14:24,435 WARNING [train.py:1204] (2/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_391-24006-0_sp1.1 from training. Duration: 0.92725 2023-10-03 16:14:25,759 WARNING [train.py:1204] (2/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_197-28164-0 from training. Duration: 0.8 2023-10-03 16:14:28,545 WARNING [train.py:1204] (2/4) Exclude cut with ID _8395_210_1531_1_1528597871523_4411579_431-132470-0_sp1.1 from training. Duration: 0.67275 2023-10-03 16:14:28,570 WARNING [train.py:1204] (2/4) Exclude cut with ID _35350_210_16040_1_1533040191183_4075299_465-248326-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 16:14:30,087 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.ff3_skip_rate, batch_count=1327860.0, ans=0.0 2023-10-03 16:14:31,215 WARNING [train.py:1204] (2/4) Exclude cut with ID _16756_210_4915_1_1531047803763_7104259_101-141922-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 16:14:34,538 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9212W0005-600172 from training. Duration: 0.896 2023-10-03 16:14:36,399 WARNING [train.py:1204] (2/4) Exclude cut with ID _63186_210_2084_1_1535414479705_3908649_378-166820-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 16:14:36,421 WARNING [train.py:1204] (2/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_52-211683-0_sp1.1 from training. Duration: 0.8181875 2023-10-03 16:14:40,621 WARNING [train.py:1204] (2/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_1147-237729-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 16:14:41,970 WARNING [train.py:1204] (2/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_234-14174-0_sp0.9 from training. Duration: 0.911125 2023-10-03 16:14:41,984 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_454-276429-0 from training. Duration: 0.87 2023-10-03 16:14:42,211 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.feed_forward1.out_proj.dropout_p, batch_count=1327926.6666666667, ans=0.1 2023-10-03 16:14:43,279 WARNING [train.py:1204] (2/4) Exclude cut with ID _53713_210_24116_1_1534314487762_7416751_755-165313-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 16:14:44,696 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8278_210_2550_1_1528637099_4220413_436-317072-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 16:14:44,754 WARNING [train.py:1204] (2/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_28-324729-0_sp1.1 from training. Duration: 0.8545625 2023-10-03 16:14:51,253 WARNING [train.py:1204] (2/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_326-165900-0 from training. Duration: 0.69 2023-10-03 16:14:52,579 INFO [train.py:1046] (2/4) Epoch 38, batch 2650, loss[loss=0.1565, simple_loss=0.2485, pruned_loss=0.03225, over 24315.00 frames. ], tot_loss[loss=0.1579, simple_loss=0.2382, pruned_loss=0.03881, over 4735571.06 frames. ], batch size: 74, lr: 2.68e-03, grad_scale: 16.0 2023-10-03 16:14:52,660 WARNING [train.py:1204] (2/4) Exclude cut with ID _53489_210_6228_1_1534298357169_3835970_96-30246-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 16:14:54,354 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8275_210_4198_1_1528455320_3892571_491-17707-0_sp1.1 from training. Duration: 0.5818125 2023-10-03 16:14:56,067 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=1327993.3333333333, ans=0.1 2023-10-03 16:14:58,457 WARNING [train.py:1204] (2/4) Exclude cut with ID _14348_210_8341_1_1530599434996_3778640_240-232370-0 from training. Duration: 0.84 2023-10-03 16:14:58,475 WARNING [train.py:1204] (2/4) Exclude cut with ID _45012_210_12651_1_1533608435136_5371380_54-112623-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 16:14:59,814 WARNING [train.py:1204] (2/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_328-264776-0_sp1.1 from training. Duration: 0.6090625 2023-10-03 16:14:59,866 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9325W0001-648427 from training. Duration: 0.768 2023-10-03 16:14:59,882 WARNING [train.py:1204] (2/4) Exclude cut with ID _56480_210_4220_1_1534679942079_3075409_49-74717-0_sp1.1 from training. Duration: 0.936375 2023-10-03 16:15:02,745 WARNING [train.py:1204] (2/4) Exclude cut with ID _59500_210_8254_1_1534809610680_3736009_6-241869-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 16:15:04,782 WARNING [train.py:1204] (2/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_172-215932-0_sp0.9 from training. Duration: 0.7333125 2023-10-03 16:15:06,199 WARNING [train.py:1204] (2/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_17-90541-0_sp1.1 from training. Duration: 0.8545625 2023-10-03 16:15:06,403 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.bypass_mid.scale_min, batch_count=1328060.0, ans=0.2 2023-10-03 16:15:08,867 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_43831_210_2158_1_1533725502_5872791_525-89447-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 16:15:09,155 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.feed_forward1.hidden_balancer.prob, batch_count=1328060.0, ans=0.125 2023-10-03 16:15:10,223 WARNING [train.py:1204] (2/4) Exclude cut with ID _12302_210_5219_1_1529924446466_8107280_907-182949-0 from training. Duration: 0.92 2023-10-03 16:15:10,241 WARNING [train.py:1204] (2/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_402-102311-0_sp0.9 from training. Duration: 0.7444375 2023-10-03 16:15:10,270 WARNING [train.py:1204] (2/4) Exclude cut with ID _8021_210_7994_1_1528376451358_3653069_179-45206-0_sp0.9 from training. Duration: 0.9555625 2023-10-03 16:15:14,418 WARNING [train.py:1204] (2/4) Exclude cut with ID _40137_210_12210_1_1533290484046_3795180_249-187059-0 from training. Duration: 0.88 2023-10-03 16:15:15,807 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9653W0194-675472 from training. Duration: 0.9658125 2023-10-03 16:15:17,626 WARNING [train.py:1204] (2/4) Exclude cut with ID _51332_210_9988_1_1534244371836_3801270_336-255604-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 16:15:20,297 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_20120_210_6753_1_1531643035_5482859_510-96996-0 from training. Duration: 0.87 2023-10-03 16:15:20,316 WARNING [train.py:1204] (2/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_1167-20437-0_sp1.1 from training. Duration: 0.92725 2023-10-03 16:15:22,114 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_3475_210_2028_1_1526193578_3622569_265-276625-0 from training. Duration: 0.76 2023-10-03 16:15:23,823 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.ff3_skip_rate, batch_count=1328126.6666666667, ans=0.0 2023-10-03 16:15:24,971 WARNING [train.py:1204] (2/4) Exclude cut with ID _14860_210_4915_1_1530702020340_7343240_470-172418-0_sp1.1 from training. Duration: 0.97275 2023-10-03 16:15:26,374 WARNING [train.py:1204] (2/4) Exclude cut with ID _5444_210_2151_1_1527418808464_5312210_581-315411-0_sp1.1 from training. Duration: 0.563625 2023-10-03 16:15:26,398 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_34450_210_9547_1_1533099443_4109050_519-342439-0_sp1.1 from training. Duration: 0.97275 2023-10-03 16:15:26,433 WARNING [train.py:1204] (2/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_50-175961-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 16:15:29,039 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.651e+02 1.957e+02 2.182e+02 2.477e+02 3.538e+02, threshold=4.363e+02, percent-clipped=0.0 2023-10-03 16:15:30,500 WARNING [train.py:1204] (2/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_372-6869-0 from training. Duration: 0.44 2023-10-03 16:15:30,513 WARNING [train.py:1204] (2/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_3-237434-0 from training. Duration: 0.76 2023-10-03 16:15:33,280 WARNING [train.py:1204] (2/4) Exclude cut with ID _8772_210_1850_1_1528635595972_3664011_247-336448-0_sp1.1 from training. Duration: 0.67275 2023-10-03 16:15:34,830 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_427-78097-0 from training. Duration: 0.8 2023-10-03 16:15:36,616 WARNING [train.py:1204] (2/4) Exclude cut with ID _69884_210_9771_1_1535846392818_4166169_466-252727-0_sp1.1 from training. Duration: 0.97275 2023-10-03 16:15:36,691 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_15972_210_6400_1_1530877934_3405692_316-35478-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 16:15:38,604 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_809-17156-0_sp0.9 from training. Duration: 0.811125 2023-10-03 16:15:38,636 WARNING [train.py:1204] (2/4) Exclude cut with ID _57572_210_5517_1_1534921219624_3809484_594-263474-0_sp1.1 from training. Duration: 0.936375 2023-10-03 16:15:38,670 WARNING [train.py:1204] (2/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_190-228204-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 16:15:41,416 WARNING [train.py:1204] (2/4) Exclude cut with ID _19514_210_9259_1_1531530071330_5084230_117-21193-0_sp1.1 from training. Duration: 0.936375 2023-10-03 16:15:41,548 WARNING [train.py:1204] (2/4) Exclude cut with ID _69584_210_5219_1_1535785133775_4044450_190-21315-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 16:15:44,182 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_53071_210_16554_1_1534493434_1276796_184-302050-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 16:15:45,532 WARNING [train.py:1204] (2/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_120-76843-0_sp1.1 from training. Duration: 0.5 2023-10-03 16:15:45,619 WARNING [train.py:1204] (2/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_318-162017-0_sp1.1 from training. Duration: 0.82725 2023-10-03 16:15:47,516 WARNING [train.py:1204] (2/4) Exclude cut with ID _46067_210_4220_1_1534125571066_3307450_79-46864-0_sp1.1 from training. Duration: 0.92725 2023-10-03 16:15:47,569 WARNING [train.py:1204] (2/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_358-215558-0_sp1.1 from training. Duration: 0.8454375 2023-10-03 16:15:49,493 WARNING [train.py:1204] (2/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_621-66764-0_sp1.1 from training. Duration: 0.92725 2023-10-03 16:15:50,883 WARNING [train.py:1204] (2/4) Exclude cut with ID _42863_210_9070_1_1533902398788_3696920_338-156851-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 16:15:50,916 WARNING [train.py:1204] (2/4) Exclude cut with ID _5289_210_2084_1_1527678202736_3779299_392-261988-0_sp0.9 from training. Duration: 0.72225 2023-10-03 16:15:53,548 WARNING [train.py:1204] (2/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_409-84682-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 16:15:54,844 WARNING [train.py:1204] (2/4) Exclude cut with ID _9235_210_5219_1_1528891154253_8046120_30-326681-0_sp0.9 from training. Duration: 0.92225 2023-10-03 16:15:54,849 WARNING [train.py:1204] (2/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_389-184695-0_sp1.1 from training. Duration: 0.92725 2023-10-03 16:15:54,896 WARNING [train.py:1204] (2/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_442-320317-0 from training. Duration: 0.82 2023-10-03 16:15:58,962 WARNING [train.py:1204] (2/4) Exclude cut with ID _44778_210_2054_1_1533798127203_7443090_756-29911-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 16:16:00,291 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_401-112036-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 16:16:00,384 WARNING [train.py:1204] (2/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_181-308827-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 16:16:01,826 WARNING [train.py:1204] (2/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_332-258725-0_sp1.1 from training. Duration: 0.963625 2023-10-03 16:16:03,198 WARNING [train.py:1204] (2/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_188-260011-0_sp0.9 from training. Duration: 0.811125 2023-10-03 16:16:03,250 WARNING [train.py:1204] (2/4) Exclude cut with ID _49039_210_6315_1_1533967537541_3846118_99-302239-0_sp1.1 from training. Duration: 0.963625 2023-10-03 16:16:05,054 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=1328326.6666666667, ans=0.1 2023-10-03 16:16:06,459 INFO [train.py:1046] (2/4) Epoch 38, batch 2700, loss[loss=0.1761, simple_loss=0.2493, pruned_loss=0.05142, over 23675.00 frames. ], tot_loss[loss=0.159, simple_loss=0.2393, pruned_loss=0.0394, over 4729413.31 frames. ], batch size: 232, lr: 2.68e-03, grad_scale: 8.0 2023-10-03 16:16:06,530 WARNING [train.py:1204] (2/4) Exclude cut with ID _4122_210_2084_1_1526713164417_4353280_312-294828-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 16:16:06,535 WARNING [train.py:1204] (2/4) Exclude cut with ID _14922_210_6228_1_1530707435273_3693310_303-228738-0 from training. Duration: 0.84 2023-10-03 16:16:09,872 WARNING [train.py:1204] (2/4) Exclude cut with ID _14349_210_8341_1_1530615710818_3442470_115-285975-0_sp0.9 from training. Duration: 0.9555625 2023-10-03 16:16:11,289 WARNING [train.py:1204] (2/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_125-144174-0_sp1.1 from training. Duration: 0.3545625 2023-10-03 16:16:13,908 WARNING [train.py:1204] (2/4) Exclude cut with ID _53565_210_10635_1_1534337218335_3820259_351-54570-0_sp1.1 from training. Duration: 0.97275 2023-10-03 16:16:13,928 WARNING [train.py:1204] (2/4) Exclude cut with ID _28317_210_12742_1_1532480302381_8510325_410-40065-0_sp1.1 from training. Duration: 0.963625 2023-10-03 16:16:13,959 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_38965_210_4852_1_1533174906_3484735_167-339442-0_sp1.1 from training. Duration: 0.963625 2023-10-03 16:16:15,521 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.1.balancer2.prob, batch_count=1328326.6666666667, ans=0.125 2023-10-03 16:16:15,587 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.feed_forward1.out_proj.dropout_p, batch_count=1328326.6666666667, ans=0.1 2023-10-03 16:16:15,605 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.nonlin_attention.balancer.prob, batch_count=1328326.6666666667, ans=0.125 2023-10-03 16:16:16,635 WARNING [train.py:1204] (2/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_265-164486-0_sp1.1 from training. Duration: 0.87275 2023-10-03 16:16:16,645 WARNING [train.py:1204] (2/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_725-182484-0_sp1.1 from training. Duration: 0.92725 2023-10-03 16:16:16,668 WARNING [train.py:1204] (2/4) Exclude cut with ID _28317_210_12742_1_1532480302381_8510325_607-213951-0_sp0.9 from training. Duration: 0.9333125 2023-10-03 16:16:16,682 WARNING [train.py:1204] (2/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_605-133395-0_sp1.1 from training. Duration: 0.6 2023-10-03 16:16:16,712 WARNING [train.py:1204] (2/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_351-294650-0 from training. Duration: 0.63 2023-10-03 16:16:18,680 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_14953_210_2151_1_1530701710_5616165_710-165400-0_sp1.1 from training. Duration: 0.7909375 2023-10-03 16:16:20,066 WARNING [train.py:1204] (2/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_237-58433-0_sp1.1 from training. Duration: 0.67275 2023-10-03 16:16:20,162 WARNING [train.py:1204] (2/4) Exclude cut with ID _5190_210_4557_1_1527244368799_373439_28-197030-0_sp1.1 from training. Duration: 0.6545625 2023-10-03 16:16:21,453 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_743-327651-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 16:16:22,187 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.3.encoder.layers.2.self_attn2.whiten, num_groups=1, num_channels=512, metric=10.94 vs. limit=22.5 2023-10-03 16:16:24,387 WARNING [train.py:1204] (2/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_76-203867-0_sp0.9 from training. Duration: 0.8 2023-10-03 16:16:24,469 WARNING [train.py:1204] (2/4) Exclude cut with ID _14603_210_4220_1_1530615688441_3097319_274-274798-0 from training. Duration: 0.98 2023-10-03 16:16:25,751 WARNING [train.py:1204] (2/4) Exclude cut with ID _14087_210_2253_1_1530529224512_3014029_226-242140-0_sp1.1 from training. Duration: 0.736375 2023-10-03 16:16:31,343 WARNING [train.py:1204] (2/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_352-59958-0_sp1.1 from training. Duration: 0.7818125 2023-10-03 16:16:31,348 WARNING [train.py:1204] (2/4) Exclude cut with ID _39411_210_5986_1_1533538825030_3343649_191-251819-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 16:16:31,885 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.5.encoder.layers.1.nonlin_attention.whiten2, num_groups=1, num_channels=256, metric=8.55 vs. limit=15.0 2023-10-03 16:16:36,016 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7046_210_6753_1_1527895337_6920917_642-106799-0_sp1.1 from training. Duration: 0.836375 2023-10-03 16:16:37,363 WARNING [train.py:1204] (2/4) Exclude cut with ID _22826_210_9994_1_1531908201522_3279169_412-3442-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 16:16:37,382 WARNING [train.py:1204] (2/4) Exclude cut with ID _60057_210_10640_1_1534903065793_3333150_171-181133-0_sp0.9 from training. Duration: 0.97775 2023-10-03 16:16:37,390 WARNING [train.py:1204] (2/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_154-135696-0_sp1.1 from training. Duration: 0.72725 2023-10-03 16:16:40,223 WARNING [train.py:1204] (2/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_148-186805-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 16:16:43,001 WARNING [train.py:1204] (2/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_605-165641-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 16:16:43,014 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_174-277364-0_sp0.9 from training. Duration: 0.788875 2023-10-03 16:16:43,034 WARNING [train.py:1204] (2/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_89-321955-0_sp0.9 from training. Duration: 0.888875 2023-10-03 16:16:43,846 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.self_attn2.whiten.whitening_limit, batch_count=1328460.0, ans=22.5 2023-10-03 16:16:47,656 WARNING [train.py:1204] (2/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_438-105429-0_sp1.1 from training. Duration: 0.963625 2023-10-03 16:16:47,659 WARNING [train.py:1204] (2/4) Exclude cut with ID _14970_210_15709_1_1530773543493_1715730_187-20259-0_sp0.9 from training. Duration: 0.988875 2023-10-03 16:16:57,057 WARNING [train.py:1204] (2/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_385-193052-0_sp0.9 from training. Duration: 0.9555625 2023-10-03 16:16:58,517 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7113_210_1849_1_1527942685_3448768_194-311626-0_sp1.1 from training. Duration: 0.92725 2023-10-03 16:17:01,321 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_3780_210_2087_1_1526467389_2145572_176-107140-0_sp0.9 from training. Duration: 0.7666875 2023-10-03 16:17:01,323 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_36756_210_7234_1_1533026991_966276_123-213457-0_sp1.1 from training. Duration: 0.936375 2023-10-03 16:17:04,717 WARNING [train.py:1204] (2/4) Exclude cut with ID _23557_210_8750_1_1531890106370_6846255_456-244861-0_sp1.1 from training. Duration: 0.963625 2023-10-03 16:17:06,038 WARNING [train.py:1204] (2/4) Exclude cut with ID _37086_210_9038_1_1532999540891_7021250_242-349170-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 16:17:06,113 WARNING [train.py:1204] (2/4) Exclude cut with ID _41298_210_9019_1_1533430371450_4822143_471-281351-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 16:17:06,352 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.conv_module2.balancer2.prob, batch_count=1328593.3333333333, ans=0.125 2023-10-03 16:17:07,459 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_53378_210_17282_1_1534221071_5769509_572-126176-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 16:17:07,724 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.conv_module1.balancer1.min_positive, batch_count=1328593.3333333333, ans=0.025 2023-10-03 16:17:08,842 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_1507-263032-0_sp1.1 from training. Duration: 0.963625 2023-10-03 16:17:10,126 WARNING [train.py:1204] (2/4) Exclude cut with ID _13679_210_7497_1_1530534293662_3355098_411-186088-0_sp1.1 from training. Duration: 0.8545625 2023-10-03 16:17:11,616 WARNING [train.py:1204] (2/4) Exclude cut with ID _29345_210_4381_1_1532689168263_3860169_149-257268-0_sp1.1 from training. Duration: 0.736375 2023-10-03 16:17:13,164 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.conv_skip_rate, batch_count=1328593.3333333333, ans=0.0 2023-10-03 16:17:14,207 WARNING [train.py:1204] (2/4) Exclude cut with ID _47790_210_16187_1_1533862814066_4207519_381-252014-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 16:17:14,213 WARNING [train.py:1204] (2/4) Exclude cut with ID _35799_210_5854_1_1533263487887_3529071_512-2717-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 16:17:17,416 WARNING [train.py:1204] (2/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_505-205270-0_sp0.9 from training. Duration: 0.42225 2023-10-03 16:17:18,808 WARNING [train.py:1204] (2/4) Exclude cut with ID _14998_210_5219_1_1530705582080_7474970_731-20117-0_sp1.1 from training. Duration: 0.936375 2023-10-03 16:17:20,110 INFO [train.py:1046] (2/4) Epoch 38, batch 2750, loss[loss=0.1598, simple_loss=0.246, pruned_loss=0.03681, over 24465.00 frames. ], tot_loss[loss=0.1589, simple_loss=0.239, pruned_loss=0.03943, over 4732589.88 frames. ], batch size: 66, lr: 2.68e-03, grad_scale: 8.0 2023-10-03 16:17:21,969 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_37351_210_7925_1_1533012931_3812113_265-85780-0_sp1.1 from training. Duration: 0.8 2023-10-03 16:17:21,975 WARNING [train.py:1204] (2/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_313-134930-0 from training. Duration: 0.97 2023-10-03 16:17:24,053 WARNING [train.py:1204] (2/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_374-138971-0 from training. Duration: 0.94 2023-10-03 16:17:24,088 WARNING [train.py:1204] (2/4) Exclude cut with ID _30304_210_6319_1_1532476827261_7582060_1227-287886-0_sp1.1 from training. Duration: 0.936375 2023-10-03 16:17:26,905 WARNING [train.py:1204] (2/4) Exclude cut with ID _59493_210_17282_1_1535078419077_3225879_332-217544-0_sp1.1 from training. Duration: 0.97275 2023-10-03 16:17:26,950 WARNING [train.py:1204] (2/4) Exclude cut with ID _23514_210_1389_1_1531832139785_3031439_361-119640-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 16:17:29,620 WARNING [train.py:1204] (2/4) Exclude cut with ID _69584_210_5219_1_1535785133775_4044450_142-118058-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 16:17:29,641 WARNING [train.py:1204] (2/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_178-177582-0_sp1.1 from training. Duration: 0.67275 2023-10-03 16:17:29,686 WARNING [train.py:1204] (2/4) Exclude cut with ID _25303_210_17519_1_1532050207576_3635970_41-195572-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 16:17:29,915 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.0.conv_module2.balancer1.max_abs, batch_count=1328660.0, ans=10.0 2023-10-03 16:17:29,992 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.conv_module2.balancer1.prob, batch_count=1328660.0, ans=0.125 2023-10-03 16:17:31,685 WARNING [train.py:1204] (2/4) Exclude cut with ID _55328_210_27767_1_1534580973260_4088170_329-341322-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 16:17:31,724 WARNING [train.py:1204] (2/4) Exclude cut with ID _5608_210_2106_1_1527415439089_6819768_909-82767-0_sp1.1 from training. Duration: 0.5545625 2023-10-03 16:17:32,989 WARNING [train.py:1204] (2/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_261-297503-0_sp1.1 from training. Duration: 0.9 2023-10-03 16:17:32,997 WARNING [train.py:1204] (2/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_459-291706-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 16:17:32,998 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7762_210_1794_1_1528199508_4057408_422-227034-0 from training. Duration: 0.73 2023-10-03 16:17:33,013 WARNING [train.py:1204] (2/4) Exclude cut with ID _3067_210_2667_1_1526212974640_4503168_250-285937-0_sp0.9 from training. Duration: 0.888875 2023-10-03 16:17:34,397 WARNING [train.py:1204] (2/4) Exclude cut with ID _66873_210_5517_1_1535454136506_4293370_332-253745-0_sp1.1 from training. Duration: 0.936375 2023-10-03 16:17:38,550 WARNING [train.py:1204] (2/4) Exclude cut with ID _13938_210_9988_1_1530518718978_3693960_304-112784-0 from training. Duration: 0.46 2023-10-03 16:17:39,993 WARNING [train.py:1204] (2/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_226-267957-0_sp1.1 from training. Duration: 0.92725 2023-10-03 16:17:40,030 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_41822_210_13270_1_1533955976_4379284_608-254567-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 16:17:41,369 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_124-113562-0_sp1.1 from training. Duration: 0.8545625 2023-10-03 16:17:41,404 WARNING [train.py:1204] (2/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_117-290916-0_sp1.1 from training. Duration: 0.463625 2023-10-03 16:17:42,729 WARNING [train.py:1204] (2/4) Exclude cut with ID _28427_210_4220_1_1532581240967_3061069_102-66533-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 16:17:44,171 WARNING [train.py:1204] (2/4) Exclude cut with ID _55383_210_10635_1_1534468426172_2231669_134-128246-0_sp1.1 from training. Duration: 0.8090625 2023-10-03 16:17:44,217 WARNING [train.py:1204] (2/4) Exclude cut with ID _45871_210_5665_1_1533864614245_3296060_49-308316-0_sp1.1 from training. Duration: 0.97275 2023-10-03 16:17:44,266 WARNING [train.py:1204] (2/4) Exclude cut with ID _57572_210_5517_1_1534921219624_3809484_176-199205-0_sp1.1 from training. Duration: 0.97275 2023-10-03 16:17:49,539 WARNING [train.py:1204] (2/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_45-265684-0_sp1.1 from training. Duration: 0.6545625 2023-10-03 16:17:50,807 WARNING [train.py:1204] (2/4) Exclude cut with ID _15150_210_8341_1_1530752411803_7256384_469-239509-0_sp0.9 from training. Duration: 0.7555625 2023-10-03 16:17:50,852 WARNING [train.py:1204] (2/4) Exclude cut with ID _28461_210_2253_1_1532771775189_3041299_136-270644-0_sp1.1 from training. Duration: 0.8454375 2023-10-03 16:17:52,675 WARNING [train.py:1204] (2/4) Exclude cut with ID _69584_210_5219_1_1535785133775_4044450_302-47302-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 16:17:54,112 WARNING [train.py:1204] (2/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_166-128941-0_sp1.1 from training. Duration: 0.4909375 2023-10-03 16:17:57,237 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.conv_module1.balancer1.prob, batch_count=1328793.3333333333, ans=0.125 2023-10-03 16:17:58,230 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.609e+02 1.929e+02 2.191e+02 2.513e+02 4.361e+02, threshold=4.383e+02, percent-clipped=0.0 2023-10-03 16:17:58,444 WARNING [train.py:1204] (2/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_559-120504-0_sp1.1 from training. Duration: 0.97275 2023-10-03 16:17:59,890 WARNING [train.py:1204] (2/4) Exclude cut with ID _6456_210_1534_1_1527681634212_3181049_345-252877-0_sp1.1 from training. Duration: 0.5818125 2023-10-03 16:18:01,643 WARNING [train.py:1204] (2/4) Exclude cut with ID _28317_210_12742_1_1532480302381_8510325_902-233003-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 16:18:04,580 WARNING [train.py:1204] (2/4) Exclude cut with ID _36538_210_9994_1_1533204020363_3795620_204-261385-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 16:18:04,584 WARNING [train.py:1204] (2/4) Exclude cut with ID _14821_210_8750_1_1530680503991_6829359_189-297451-0_sp1.1 from training. Duration: 0.5 2023-10-03 16:18:04,626 WARNING [train.py:1204] (2/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_1211-226066-0_sp0.9 from training. Duration: 0.8444375 2023-10-03 16:18:11,540 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6786_210_1388_1_1527839699_4194495_273-149841-0_sp1.1 from training. Duration: 0.836375 2023-10-03 16:18:11,715 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.conv_module2.balancer2.prob, batch_count=1328860.0, ans=0.125 2023-10-03 16:18:12,803 WARNING [train.py:1204] (2/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_380-162648-0_sp1.1 from training. Duration: 0.8545625 2023-10-03 16:18:12,804 WARNING [train.py:1204] (2/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_494-199538-0 from training. Duration: 0.75 2023-10-03 16:18:15,596 WARNING [train.py:1204] (2/4) Exclude cut with ID _49097_210_16049_1_1533952780207_4360979_276-87053-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 16:18:18,910 WARNING [train.py:1204] (2/4) Exclude cut with ID _5444_210_2151_1_1527418808464_5312210_726-27165-0 from training. Duration: 0.67 2023-10-03 16:18:24,991 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_128-202104-0_sp1.1 from training. Duration: 0.57275 2023-10-03 16:18:27,751 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_11349_210_6753_1_1529800861_1101207_26-65602-0_sp1.1 from training. Duration: 0.87275 2023-10-03 16:18:27,786 WARNING [train.py:1204] (2/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_159-328157-0 from training. Duration: 0.52 2023-10-03 16:18:29,813 WARNING [train.py:1204] (2/4) Exclude cut with ID _14069_210_5632_1_1530512692033_4803390_98-21934-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 16:18:31,214 WARNING [train.py:1204] (2/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_352-59958-0_sp0.9 from training. Duration: 0.9555625 2023-10-03 16:18:31,252 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6163_210_6400_1_1527506746_4263068_283-137811-0 from training. Duration: 0.77 2023-10-03 16:18:31,281 WARNING [train.py:1204] (2/4) Exclude cut with ID _21989_210_5854_1_1531899214931_3271360_133-44759-0_sp1.1 from training. Duration: 0.7 2023-10-03 16:18:34,102 WARNING [train.py:1204] (2/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_21-174514-0_sp0.9 from training. Duration: 0.47775 2023-10-03 16:18:34,133 WARNING [train.py:1204] (2/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_201-78811-0_sp1.1 from training. Duration: 0.963625 2023-10-03 16:18:34,160 WARNING [train.py:1204] (2/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_537-164630-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 16:18:35,422 INFO [train.py:1046] (2/4) Epoch 38, batch 2800, loss[loss=0.151, simple_loss=0.2294, pruned_loss=0.03628, over 23326.00 frames. ], tot_loss[loss=0.1581, simple_loss=0.2376, pruned_loss=0.03926, over 4725973.71 frames. ], batch size: 93, lr: 2.68e-03, grad_scale: 16.0 2023-10-03 16:18:35,537 WARNING [train.py:1204] (2/4) Exclude cut with ID _14087_210_2253_1_1530529224512_3014029_156-257607-0 from training. Duration: 0.83 2023-10-03 16:18:36,708 WARNING [train.py:1204] (2/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_308-154450-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 16:18:36,758 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_45399_210_4852_1_1533624725_7579737_214-240574-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 16:18:38,208 WARNING [train.py:1204] (2/4) Exclude cut with ID _29303_210_2575_1_1532392237303_7410649_615-305517-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 16:18:39,504 WARNING [train.py:1204] (2/4) Exclude cut with ID IC0125W0002-61808 from training. Duration: 0.708 2023-10-03 16:18:39,505 WARNING [train.py:1204] (2/4) Exclude cut with ID IC0125W0017-61823 from training. Duration: 0.759 2023-10-03 16:18:40,973 WARNING [train.py:1204] (2/4) Exclude cut with ID _53219_210_4525_1_1534834836008_4064825_4-145291-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 16:18:43,693 WARNING [train.py:1204] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_185-174805-0_sp1.1 from training. Duration: 0.6181875 2023-10-03 16:18:43,714 WARNING [train.py:1204] (2/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_635-193046-0_sp1.1 from training. Duration: 0.92725 2023-10-03 16:18:46,491 WARNING [train.py:1204] (2/4) Exclude cut with ID _46067_210_4220_1_1534125571066_3307450_321-258866-0_sp1.1 from training. Duration: 0.97275 2023-10-03 16:18:50,886 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8558_210_1410_1_1528505225_7613406_131-329524-0 from training. Duration: 0.79 2023-10-03 16:18:51,016 WARNING [train.py:1204] (2/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_847-41334-0_sp0.9 from training. Duration: 0.488875 2023-10-03 16:18:51,180 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.bypass.scale_min, batch_count=1329060.0, ans=0.2 2023-10-03 16:18:52,832 WARNING [train.py:1204] (2/4) Exclude cut with ID _14227_210_4381_1_1530701804286_3940259_125-275642-0 from training. Duration: 0.99 2023-10-03 16:18:54,196 WARNING [train.py:1204] (2/4) Exclude cut with ID _25248_210_8750_1_1532062825941_5554180_248-177255-0_sp1.1 from training. Duration: 0.963625 2023-10-03 16:18:55,558 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_40047_210_2290_1_1533290519_2374311_195-165693-0_sp1.1 from training. Duration: 0.8090625 2023-10-03 16:18:55,562 WARNING [train.py:1204] (2/4) Exclude cut with ID _23109_210_4381_1_1532150828777_901949_29-258672-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 16:18:55,946 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.bypass_mid.scale_min, batch_count=1329060.0, ans=0.2 2023-10-03 16:18:58,526 WARNING [train.py:1204] (2/4) Exclude cut with ID _14821_210_8750_1_1530680503991_6829359_2-227151-0_sp1.1 from training. Duration: 0.7545625 2023-10-03 16:18:58,562 WARNING [train.py:1204] (2/4) Exclude cut with ID _49643_210_7989_1_1534640244224_3939031_565-53982-0_sp1.1 from training. Duration: 0.963625 2023-10-03 16:18:58,566 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7544_210_6753_1_1528591327_1471426_33-346031-0_sp0.9 from training. Duration: 0.67775 2023-10-03 16:18:59,925 WARNING [train.py:1204] (2/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_95-61782-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 16:19:08,652 WARNING [train.py:1204] (2/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_686-54383-0_sp0.9 from training. Duration: 0.9666875 2023-10-03 16:19:08,895 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=1329126.6666666667, ans=0.1 2023-10-03 16:19:10,254 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_14056_210_4163_1_1530604483_7572827_505-276823-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 16:19:12,921 WARNING [train.py:1204] (2/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_745-128443-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 16:19:12,973 WARNING [train.py:1204] (2/4) Exclude cut with ID _3844_210_1390_1_1526727803582_2871209_20-251664-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 16:19:14,343 WARNING [train.py:1204] (2/4) Exclude cut with ID _36538_210_9994_1_1533204020363_3795620_487-164628-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 16:19:18,552 WARNING [train.py:1204] (2/4) Exclude cut with ID _5166_210_2084_1_1527073197160_4087993_309-222545-0_sp1.1 from training. Duration: 0.763625 2023-10-03 16:19:18,580 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_3531_210_3528_1_1526294203_5216456_309-227302-0 from training. Duration: 0.69 2023-10-03 16:19:20,305 WARNING [train.py:1204] (2/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_42-192460-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 16:19:22,284 WARNING [train.py:1204] (2/4) Exclude cut with ID _22083_210_8254_1_1531702862439_3678970_49-234546-0_sp1.1 from training. Duration: 0.7545625 2023-10-03 16:19:22,288 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_427-78097-0_sp0.9 from training. Duration: 0.888875 2023-10-03 16:19:26,926 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_43802_210_2290_1_1533516203_8829504_378-205254-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 16:19:26,977 WARNING [train.py:1204] (2/4) Exclude cut with ID _45329_210_7005_1_1534157951211_6774339_672-314830-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 16:19:27,200 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.conv_module1.balancer1.min_positive, batch_count=1329193.3333333333, ans=0.025 2023-10-03 16:19:30,969 WARNING [train.py:1204] (2/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_90-30673-0_sp1.1 from training. Duration: 0.763625 2023-10-03 16:19:32,978 WARNING [train.py:1204] (2/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_563-329943-0_sp1.1 from training. Duration: 0.9 2023-10-03 16:19:33,000 WARNING [train.py:1204] (2/4) Exclude cut with ID _21632_210_2253_1_1531994001765_3744140_154-84090-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 16:19:33,001 WARNING [train.py:1204] (2/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_103-336064-0_sp0.9 from training. Duration: 0.5666875 2023-10-03 16:19:33,044 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_43-71989-0_sp0.9 from training. Duration: 0.6333125 2023-10-03 16:19:33,187 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.feed_forward1.hidden_balancer.prob, batch_count=1329260.0, ans=0.125 2023-10-03 16:19:34,407 WARNING [train.py:1204] (2/4) Exclude cut with ID _3867_210_1883_1_1526641200575_4475830_319-349850-0_sp0.9 from training. Duration: 0.7444375 2023-10-03 16:19:35,769 WARNING [train.py:1204] (2/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_629-179454-0_sp1.1 from training. Duration: 0.963625 2023-10-03 16:19:35,780 WARNING [train.py:1204] (2/4) Exclude cut with ID _65263_210_22356_1_1535763598691_3921037_49-222917-0 from training. Duration: 0.92 2023-10-03 16:19:35,819 WARNING [train.py:1204] (2/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_602-331772-0_sp1.1 from training. Duration: 0.936375 2023-10-03 16:19:35,927 WARNING [train.py:1204] (2/4) Exclude cut with ID _58414_210_8254_1_1535421609162_7151182_292-134978-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 16:19:35,954 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_38793_210_2544_1_1533176114_3849320_200-299292-0_sp1.1 from training. Duration: 0.936375 2023-10-03 16:19:38,616 WARNING [train.py:1204] (2/4) Exclude cut with ID _14970_210_15709_1_1530775547361_1966965_6-23328-0 from training. Duration: 0.86 2023-10-03 16:19:38,862 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.bypass.skip_rate, batch_count=1329260.0, ans=0.09899494936611666 2023-10-03 16:19:39,939 WARNING [train.py:1204] (2/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_386-225511-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 16:19:39,957 WARNING [train.py:1204] (2/4) Exclude cut with ID _38323_210_12108_1_1533121233369_3303049_416-11919-0_sp1.1 from training. Duration: 0.87275 2023-10-03 16:19:40,012 WARNING [train.py:1204] (2/4) Exclude cut with ID _4866_210_2577_1_1527404412284_4364659_256-306830-0_sp1.1 from training. Duration: 0.7909375 2023-10-03 16:19:41,409 WARNING [train.py:1204] (2/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_53-300544-0 from training. Duration: 0.67 2023-10-03 16:19:45,928 INFO [scaling.py:1118] (2/4) WithLoss: name=encoder.encoders.5.encoder.layers.1.self_attn_weights, loss-sum=0.000e+00 2023-10-03 16:19:47,008 WARNING [train.py:1204] (2/4) Exclude cut with ID _41314_210_12651_1_1533459476543_3766460_80-74184-0_sp1.1 from training. Duration: 0.7545625 2023-10-03 16:19:47,021 WARNING [train.py:1204] (2/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_332-256168-0_sp1.1 from training. Duration: 0.6818125 2023-10-03 16:19:47,073 WARNING [train.py:1204] (2/4) Exclude cut with ID _48836_210_12210_1_1534075848293_3904150_429-35475-0_sp1.1 from training. Duration: 0.8090625 2023-10-03 16:19:48,409 INFO [train.py:1046] (2/4) Epoch 38, batch 2850, loss[loss=0.1753, simple_loss=0.2553, pruned_loss=0.04764, over 24529.00 frames. ], tot_loss[loss=0.1576, simple_loss=0.2372, pruned_loss=0.03897, over 4729162.96 frames. ], batch size: 71, lr: 2.68e-03, grad_scale: 16.0 2023-10-03 16:19:48,532 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_144-135833-0_sp1.1 from training. Duration: 0.97275 2023-10-03 16:19:52,723 WARNING [train.py:1204] (2/4) Exclude cut with ID _14224_210_4381_1_1530529062037_4264010_380-343670-0_sp0.9 from training. Duration: 0.97775 2023-10-03 16:19:53,992 WARNING [train.py:1204] (2/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_213-265174-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 16:19:54,026 WARNING [train.py:1204] (2/4) Exclude cut with ID _20455_210_5219_1_1531565996775_8098100_86-314283-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 16:19:54,215 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.feed_forward2.hidden_balancer.prob, batch_count=1329326.6666666667, ans=0.125 2023-10-03 16:19:57,023 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.feed_forward1.out_proj.dropout_p, batch_count=1329326.6666666667, ans=0.1 2023-10-03 16:19:58,164 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_41750_210_1960_1_1533520678_3948042_68-223813-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 16:19:58,228 WARNING [train.py:1204] (2/4) Exclude cut with ID _55153_210_2253_1_1534384786677_3162096_24-198429-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 16:19:59,625 WARNING [train.py:1204] (2/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_119-207162-0_sp0.9 from training. Duration: 0.788875 2023-10-03 16:20:01,039 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_12868_210_6753_1_1530702479_3610564_148-172861-0 from training. Duration: 0.5 2023-10-03 16:20:07,320 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_5522_210_3273_1_1527249606_3909096_318-203153-0 from training. Duration: 0.43 2023-10-03 16:20:07,326 WARNING [train.py:1204] (2/4) Exclude cut with ID _14884_210_2252_1_1530707459689_5561250_288-189949-0_sp1.1 from training. Duration: 0.97275 2023-10-03 16:20:08,038 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.3.encoder.layers.0.conv_module2.whiten, num_groups=1, num_channels=512, metric=3.21 vs. limit=15.0 2023-10-03 16:20:10,023 WARNING [train.py:1204] (2/4) Exclude cut with ID _5294_210_1747_1_1527249643928_3696179_529-242298-0 from training. Duration: 0.95 2023-10-03 16:20:11,337 WARNING [train.py:1204] (2/4) Exclude cut with ID _36538_210_9994_1_1533204020363_3795620_503-110289-0_sp1.1 from training. Duration: 0.936375 2023-10-03 16:20:14,057 WARNING [train.py:1204] (2/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_385-193052-0 from training. Duration: 0.86 2023-10-03 16:20:14,120 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_456-22141-0 from training. Duration: 0.88 2023-10-03 16:20:15,507 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_38965_210_4852_1_1533174906_3484735_284-117840-0_sp1.1 from training. Duration: 0.936375 2023-10-03 16:20:18,964 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.3.encoder.layers.3.feed_forward3.out_whiten, num_groups=1, num_channels=512, metric=12.75 vs. limit=15.0 2023-10-03 16:20:26,108 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.606e+02 1.929e+02 2.175e+02 2.437e+02 3.531e+02, threshold=4.350e+02, percent-clipped=0.0 2023-10-03 16:20:26,297 WARNING [train.py:1204] (2/4) Exclude cut with ID _32704_210_8954_1_1533017488840_6747370_75-201625-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 16:20:27,658 WARNING [train.py:1204] (2/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_255-312654-0_sp1.1 from training. Duration: 0.82725 2023-10-03 16:20:28,913 WARNING [train.py:1204] (2/4) Exclude cut with ID _47251_210_18628_1_1533814063128_4137250_214-88076-0_sp0.9 from training. Duration: 0.97775 2023-10-03 16:20:30,312 WARNING [train.py:1204] (2/4) Exclude cut with ID _4092_210_2151_1_1526643044449_3135759_227-213972-0_sp1.1 from training. Duration: 0.4909375 2023-10-03 16:20:30,328 WARNING [train.py:1204] (2/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_912-47473-0_sp1.1 from training. Duration: 0.6181875 2023-10-03 16:20:30,355 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6767_210_3273_1_1527855783_1718932_173-256601-0_sp1.1 from training. Duration: 0.72725 2023-10-03 16:20:31,908 WARNING [train.py:1204] (2/4) Exclude cut with ID _6274_210_2084_1_1527937583837_7494020_32-205288-0_sp1.1 from training. Duration: 0.7090625 2023-10-03 16:20:33,158 WARNING [train.py:1204] (2/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_39-248870-0 from training. Duration: 0.69 2023-10-03 16:20:35,123 WARNING [train.py:1204] (2/4) Exclude cut with ID _3541_210_2477_1_1526475408709_3263973_377-296839-0_sp1.1 from training. Duration: 0.5 2023-10-03 16:20:35,138 WARNING [train.py:1204] (2/4) Exclude cut with ID _13679_210_7497_1_1530534293662_3355098_413-56154-0_sp1.1 from training. Duration: 0.963625 2023-10-03 16:20:36,539 WARNING [train.py:1204] (2/4) Exclude cut with ID _14970_210_15709_1_1530775547361_1966965_201-84544-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 16:20:36,597 WARNING [train.py:1204] (2/4) Exclude cut with ID _21783_210_11357_1_1531742458730_3762679_105-95775-0_sp1.1 from training. Duration: 0.936375 2023-10-03 16:20:38,166 WARNING [train.py:1204] (2/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_131-191669-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 16:20:39,569 WARNING [train.py:1204] (2/4) Exclude cut with ID _14860_210_4915_1_1530702020340_7343240_695-4323-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 16:20:40,885 WARNING [train.py:1204] (2/4) Exclude cut with ID _39867_210_9826_1_1533553222030_9344259_991-130565-0_sp1.1 from training. Duration: 0.97275 2023-10-03 16:20:42,405 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_50-3289-0_sp1.1 from training. Duration: 0.82725 2023-10-03 16:20:45,103 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_54-347670-0_sp1.1 from training. Duration: 0.92725 2023-10-03 16:20:45,161 WARNING [train.py:1204] (2/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_200-153445-0_sp1.1 from training. Duration: 0.936375 2023-10-03 16:20:45,889 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.feed_forward1.out_whiten.whitening_limit, batch_count=1329526.6666666667, ans=15.0 2023-10-03 16:20:46,478 WARNING [train.py:1204] (2/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_279-302911-0_sp1.1 from training. Duration: 0.97275 2023-10-03 16:20:47,947 WARNING [train.py:1204] (2/4) Exclude cut with ID _14886_210_4220_1_1530702020355_3516190_159-73748-0_sp0.9 from training. Duration: 0.92225 2023-10-03 16:20:49,947 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.3.encoder.layers.0.feed_forward3.out_whiten, num_groups=1, num_channels=512, metric=12.14 vs. limit=15.0 2023-10-03 16:20:50,608 WARNING [train.py:1204] (2/4) Exclude cut with ID _53379_210_20034_1_1534222027363_3854654_23-222715-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 16:20:50,805 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.feed_forward3.hidden_balancer.prob, batch_count=1329593.3333333333, ans=0.125 2023-10-03 16:20:51,994 WARNING [train.py:1204] (2/4) Exclude cut with ID _12555_210_8750_1_1530075594402_3780239_449-267986-0 from training. Duration: 0.83 2023-10-03 16:20:52,235 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.bypass.skip_rate, batch_count=1329593.3333333333, ans=0.04949747468305833 2023-10-03 16:20:54,501 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7544_210_6753_1_1528587722_3576686_295-258479-0 from training. Duration: 0.84 2023-10-03 16:20:55,452 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.0.layers.0.nonlin_attention.whiten2, num_groups=1, num_channels=192, metric=4.94 vs. limit=15.0 2023-10-03 16:20:55,675 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7002_210_1794_1_1527935634_4034444_487-236501-0_sp0.9 from training. Duration: 0.7333125 2023-10-03 16:20:55,737 WARNING [train.py:1204] (2/4) Exclude cut with ID _21783_210_11357_1_1531742458730_3762679_244-19107-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 16:20:55,749 WARNING [train.py:1204] (2/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_390-339137-0 from training. Duration: 0.56 2023-10-03 16:20:57,079 WARNING [train.py:1204] (2/4) Exclude cut with ID _7562_210_2084_1_1528887767916_3946229_186-326422-0_sp1.1 from training. Duration: 0.763625 2023-10-03 16:20:58,550 WARNING [train.py:1204] (2/4) Exclude cut with ID _33640_210_2655_1_1533117605479_4855469_645-145235-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 16:20:58,573 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_25767_210_2161_1_1532307483_3867748_506-171777-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 16:20:58,597 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_5438_210_1794_1_1527336687_2600009_233-58072-0_sp1.1 from training. Duration: 0.7 2023-10-03 16:20:58,597 WARNING [train.py:1204] (2/4) Exclude cut with ID IC0669W0063-330908 from training. Duration: 0.896 2023-10-03 16:20:58,633 WARNING [train.py:1204] (2/4) Exclude cut with ID IC0671W0070-331913 from training. Duration: 0.896 2023-10-03 16:20:58,637 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_258-310570-0_sp1.1 from training. Duration: 0.7454375 2023-10-03 16:20:59,960 WARNING [train.py:1204] (2/4) Exclude cut with ID _9461_210_4557_1_1529060514726_3677451_162-177507-0_sp1.1 from training. Duration: 0.97275 2023-10-03 16:21:02,397 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.0.layers.0.nonlin_attention.whiten1, num_groups=1, num_channels=144, metric=10.02 vs. limit=10.0 2023-10-03 16:21:02,749 INFO [train.py:1046] (2/4) Epoch 38, batch 2900, loss[loss=0.1505, simple_loss=0.2235, pruned_loss=0.03872, over 23519.00 frames. ], tot_loss[loss=0.1578, simple_loss=0.2374, pruned_loss=0.03909, over 4732218.60 frames. ], batch size: 285, lr: 2.68e-03, grad_scale: 16.0 2023-10-03 16:21:04,745 WARNING [train.py:1204] (2/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_26-175553-0_sp0.9 from training. Duration: 0.67775 2023-10-03 16:21:04,768 WARNING [train.py:1204] (2/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_7-150481-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 16:21:06,147 WARNING [train.py:1204] (2/4) Exclude cut with ID _23557_210_8750_1_1531890106370_6846255_72-273262-0_sp1.1 from training. Duration: 0.963625 2023-10-03 16:21:06,241 WARNING [train.py:1204] (2/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_367-61330-0 from training. Duration: 0.75 2023-10-03 16:21:09,090 WARNING [train.py:1204] (2/4) Exclude cut with ID _39867_210_9826_1_1533553222030_9344259_1337-11141-0_sp1.1 from training. Duration: 0.936375 2023-10-03 16:21:10,355 WARNING [train.py:1204] (2/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_101-293367-0 from training. Duration: 0.63 2023-10-03 16:21:11,768 WARNING [train.py:1204] (2/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_10-100367-0 from training. Duration: 0.98 2023-10-03 16:21:13,108 WARNING [train.py:1204] (2/4) Exclude cut with ID _60057_210_10640_1_1534903065793_3333150_171-181133-0_sp1.1 from training. Duration: 0.8 2023-10-03 16:21:13,113 WARNING [train.py:1204] (2/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_844-96989-0_sp0.9 from training. Duration: 0.788875 2023-10-03 16:21:14,435 WARNING [train.py:1204] (2/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_301-144595-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 16:21:15,833 WARNING [train.py:1204] (2/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_115-147343-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 16:21:19,883 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_3171_210_2087_1_1526109084_2330007_37-64334-0_sp1.1 from training. Duration: 0.7454375 2023-10-03 16:21:19,925 WARNING [train.py:1204] (2/4) Exclude cut with ID _19514_210_9259_1_1531530071330_5084230_769-272914-0_sp1.1 from training. Duration: 0.936375 2023-10-03 16:21:23,830 WARNING [train.py:1204] (2/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_76-311963-0_sp0.9 from training. Duration: 0.77775 2023-10-03 16:21:23,866 WARNING [train.py:1204] (2/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_526-62079-0 from training. Duration: 0.67 2023-10-03 16:21:25,132 WARNING [train.py:1204] (2/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_7-111954-0_sp0.9 from training. Duration: 0.62225 2023-10-03 16:21:25,773 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.4.encoder.layers.1.self_attn2.whiten, num_groups=1, num_channels=384, metric=11.31 vs. limit=22.5 2023-10-03 16:21:26,514 WARNING [train.py:1204] (2/4) Exclude cut with ID _57997_210_4163_1_1534766425889_3961049_360-279140-0_sp1.1 from training. Duration: 0.97275 2023-10-03 16:21:26,809 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.feed_forward1.hidden_balancer.prob, batch_count=1329726.6666666667, ans=0.125 2023-10-03 16:21:29,188 WARNING [train.py:1204] (2/4) Exclude cut with ID _4866_210_2577_1_1527404412284_4364659_133-1696-0 from training. Duration: 0.99 2023-10-03 16:21:30,417 WARNING [train.py:1204] (2/4) Exclude cut with ID _5190_210_4557_1_1527244368799_373439_28-197030-0 from training. Duration: 0.72 2023-10-03 16:21:34,470 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_53-39003-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 16:21:34,473 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6673_210_3273_1_1527767790_3173240_351-167954-0 from training. Duration: 0.48 2023-10-03 16:21:34,497 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_2822_210_2715_1_1526112586_1999587_165-36656-0_sp1.1 from training. Duration: 0.8090625 2023-10-03 16:21:35,914 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_328-264892-0_sp1.1 from training. Duration: 0.92725 2023-10-03 16:21:35,920 WARNING [train.py:1204] (2/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_29-293896-0_sp0.9 from training. Duration: 0.67775 2023-10-03 16:21:37,351 WARNING [train.py:1204] (2/4) Exclude cut with ID _65263_210_22356_1_1535763598691_3921037_465-55448-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 16:21:39,329 WARNING [train.py:1204] (2/4) Exclude cut with ID _21989_210_5854_1_1531899214931_3271360_311-53106-0_sp1.1 from training. Duration: 0.97275 2023-10-03 16:21:43,377 WARNING [train.py:1204] (2/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_389-277915-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 16:21:44,820 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_69630_210_7234_1_1535791094_677837_63-277139-0_sp1.1 from training. Duration: 0.963625 2023-10-03 16:21:46,253 WARNING [train.py:1204] (2/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_85-138257-0 from training. Duration: 0.71 2023-10-03 16:21:46,269 WARNING [train.py:1204] (2/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_686-247600-0 from training. Duration: 0.92 2023-10-03 16:21:46,274 WARNING [train.py:1204] (2/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_108-255430-0_sp1.1 from training. Duration: 0.8818125 2023-10-03 16:21:50,315 WARNING [train.py:1204] (2/4) Exclude cut with ID _14884_210_2252_1_1530707459689_5561250_114-201399-0_sp1.1 from training. Duration: 0.6909375 2023-10-03 16:21:52,338 WARNING [train.py:1204] (2/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_506-42614-0 from training. Duration: 0.79 2023-10-03 16:21:53,634 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_43802_210_2290_1_1533516203_8829504_85-195240-0_sp1.1 from training. Duration: 0.7909375 2023-10-03 16:21:58,835 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_141-36243-0_sp1.1 from training. Duration: 0.97275 2023-10-03 16:22:06,993 WARNING [train.py:1204] (2/4) Exclude cut with ID _7852_210_5219_1_1528286471316_8139400_358-287492-0_sp1.1 from training. Duration: 0.87275 2023-10-03 16:22:07,011 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_805-71497-0_sp0.9 from training. Duration: 0.788875 2023-10-03 16:22:08,833 WARNING [train.py:1204] (2/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_178-39722-0 from training. Duration: 0.49 2023-10-03 16:22:11,753 WARNING [train.py:1204] (2/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_428-181925-0_sp1.1 from training. Duration: 0.963625 2023-10-03 16:22:11,757 WARNING [train.py:1204] (2/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_770-200430-0 from training. Duration: 0.89 2023-10-03 16:22:11,806 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_1104-271009-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 16:22:13,123 WARNING [train.py:1204] (2/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_128-296581-0_sp0.9 from training. Duration: 0.82225 2023-10-03 16:22:14,674 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.1.feed_forward1.out_proj.dropout_p, batch_count=1329993.3333333333, ans=0.1 2023-10-03 16:22:15,791 INFO [train.py:1046] (2/4) Epoch 38, batch 2950, loss[loss=0.1633, simple_loss=0.2407, pruned_loss=0.04293, over 23723.00 frames. ], tot_loss[loss=0.1577, simple_loss=0.2376, pruned_loss=0.03889, over 4733508.36 frames. ], batch size: 232, lr: 2.68e-03, grad_scale: 16.0 2023-10-03 16:22:18,579 WARNING [train.py:1204] (2/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_19-62511-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 16:22:19,959 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_805-71497-0 from training. Duration: 0.71 2023-10-03 16:22:21,416 WARNING [train.py:1204] (2/4) Exclude cut with ID _44778_210_2054_1_1533798127203_7443090_573-152077-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 16:22:21,420 WARNING [train.py:1204] (2/4) Exclude cut with ID _42875_210_14856_1_1533704397487_4012433_393-177816-0_sp1.1 from training. Duration: 0.963625 2023-10-03 16:22:22,776 WARNING [train.py:1204] (2/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_278-35159-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 16:22:24,592 WARNING [train.py:1204] (2/4) Exclude cut with ID _15037_210_2253_1_1530752294104_3267650_13-130361-0_sp1.1 from training. Duration: 0.9 2023-10-03 16:22:25,973 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_3019_210_2028_1_1526044245_3264865_127-135615-0 from training. Duration: 0.94 2023-10-03 16:22:26,024 WARNING [train.py:1204] (2/4) Exclude cut with ID _28317_210_12742_1_1532480302381_8510325_607-213951-0 from training. Duration: 0.84 2023-10-03 16:22:27,876 WARNING [train.py:1204] (2/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_436-318113-0_sp0.9 from training. Duration: 0.8333125 2023-10-03 16:22:27,877 WARNING [train.py:1204] (2/4) Exclude cut with ID _13938_210_9988_1_1530518718978_3693960_148-152225-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 16:22:35,380 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_2541_210_1794_1_1525590269_3084765_156-173713-0_sp1.1 from training. Duration: 0.736375 2023-10-03 16:22:37,298 WARNING [train.py:1204] (2/4) Exclude cut with ID _28323_210_16187_1_1532480395667_4100300_531-24157-0_sp1.1 from training. Duration: 0.8909375 2023-10-03 16:22:38,739 WARNING [train.py:1204] (2/4) Exclude cut with ID _29303_210_2575_1_1532392237303_7410649_382-217492-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 16:22:38,770 WARNING [train.py:1204] (2/4) Exclude cut with ID _58895_210_8750_1_1535184081320_3649089_270-158013-0_sp1.1 from training. Duration: 0.92725 2023-10-03 16:22:41,680 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.attention_skip_rate, batch_count=1330060.0, ans=0.0 2023-10-03 16:22:42,777 WARNING [train.py:1204] (2/4) Exclude cut with ID _29277_210_3279_1_1532581109293_3173069_311-206227-0_sp1.1 from training. Duration: 0.936375 2023-10-03 16:22:42,788 WARNING [train.py:1204] (2/4) Exclude cut with ID _7160_210_1876_1_1527940923231_3703799_282-322611-0_sp0.9 from training. Duration: 0.9666875 2023-10-03 16:22:44,192 WARNING [train.py:1204] (2/4) Exclude cut with ID _53219_210_4525_1_1534834836008_4064825_562-32295-0_sp1.1 from training. Duration: 0.963625 2023-10-03 16:22:44,269 WARNING [train.py:1204] (2/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_408-87330-0_sp1.1 from training. Duration: 0.963625 2023-10-03 16:22:45,552 WARNING [train.py:1204] (2/4) Exclude cut with ID _59202_210_26984_1_1534816810612_3549174_137-318710-0_sp1.1 from training. Duration: 0.8090625 2023-10-03 16:22:47,113 WARNING [train.py:1204] (2/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_372-30302-0 from training. Duration: 0.86 2023-10-03 16:22:50,000 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7543_210_6753_1_1528541907_1399322_178-56269-0_sp1.1 from training. Duration: 0.95275 2023-10-03 16:22:51,258 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9109W0012-549758 from training. Duration: 0.512 2023-10-03 16:22:51,335 WARNING [train.py:1204] (2/4) Exclude cut with ID _5410_210_1388_1_1527397055795_1719020_39-112221-0_sp0.9 from training. Duration: 0.7666875 2023-10-03 16:22:52,582 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.673e+02 1.960e+02 2.141e+02 2.460e+02 3.177e+02, threshold=4.282e+02, percent-clipped=0.0 2023-10-03 16:22:53,877 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9121W0012-554933 from training. Duration: 0.896 2023-10-03 16:22:53,973 WARNING [train.py:1204] (2/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_476-272757-0 from training. Duration: 0.93 2023-10-03 16:22:55,251 WARNING [train.py:1204] (2/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_1085-171718-0_sp1.1 from training. Duration: 0.92725 2023-10-03 16:22:57,168 WARNING [train.py:1204] (2/4) Exclude cut with ID _8480_210_1874_1_1528589391205_6728330_309-139896-0_sp1.1 from training. Duration: 0.736375 2023-10-03 16:22:57,168 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9130W0113-559703 from training. Duration: 0.896 2023-10-03 16:22:57,172 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_292-265928-0_sp1.1 from training. Duration: 0.463625 2023-10-03 16:22:59,910 WARNING [train.py:1204] (2/4) Exclude cut with ID _8772_210_1850_1_1528635595972_3664011_96-12756-0 from training. Duration: 0.98 2023-10-03 16:23:01,294 WARNING [train.py:1204] (2/4) Exclude cut with ID _12556_210_8341_1_1530081024100_7327690_725-193477-0_sp1.1 from training. Duration: 0.8909375 2023-10-03 16:23:01,331 WARNING [train.py:1204] (2/4) Exclude cut with ID _5289_210_2084_1_1527678202736_3779299_142-103317-0_sp1.1 from training. Duration: 0.763625 2023-10-03 16:23:03,223 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.bypass_mid.scale_min, batch_count=1330193.3333333333, ans=0.2 2023-10-03 16:23:04,396 WARNING [train.py:1204] (2/4) Exclude cut with ID _45012_210_12651_1_1533608435136_5371380_206-51075-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 16:23:05,897 WARNING [train.py:1204] (2/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_176-56120-0_sp1.1 from training. Duration: 0.7545625 2023-10-03 16:23:05,927 WARNING [train.py:1204] (2/4) Exclude cut with ID _40284_210_2084_1_1533513679814_3704279_220-201864-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 16:23:07,840 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9165W0004-576638 from training. Duration: 0.896 2023-10-03 16:23:07,885 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_5438_210_1794_1_1527336687_2600009_218-343336-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 16:23:09,255 WARNING [train.py:1204] (2/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_318-162017-0 from training. Duration: 0.91 2023-10-03 16:23:10,796 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.balancer1.prob, batch_count=1330193.3333333333, ans=0.125 2023-10-03 16:23:12,164 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.conv_module2.balancer1.max_abs, batch_count=1330193.3333333333, ans=10.0 2023-10-03 16:23:14,727 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_49544_210_1794_1_1534379770_5262083_483-349298-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 16:23:14,793 WARNING [train.py:1204] (2/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_197-28164-0_sp0.9 from training. Duration: 0.888875 2023-10-03 16:23:14,848 WARNING [train.py:1204] (2/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_643-52007-0 from training. Duration: 0.57 2023-10-03 16:23:14,864 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_38965_210_4852_1_1533174906_3484735_403-332963-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 16:23:16,216 WARNING [train.py:1204] (2/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_340-174176-0 from training. Duration: 0.49 2023-10-03 16:23:18,969 WARNING [train.py:1204] (2/4) Exclude cut with ID _56708_210_13177_1_1535252351282_3422899_363-917-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 16:23:20,475 WARNING [train.py:1204] (2/4) Exclude cut with ID _19896_210_1883_1_1531709022662_6649569_525-159063-0_sp1.1 from training. Duration: 0.936375 2023-10-03 16:23:20,500 WARNING [train.py:1204] (2/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_430-116139-0_sp0.9 from training. Duration: 0.9444375 2023-10-03 16:23:21,852 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_53753_210_7234_1_1534320003_3594148_86-329250-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 16:23:21,872 WARNING [train.py:1204] (2/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_406-275649-0_sp1.1 from training. Duration: 0.3818125 2023-10-03 16:23:23,133 WARNING [train.py:1204] (2/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_1083-147536-0_sp1.1 from training. Duration: 0.8818125 2023-10-03 16:23:24,554 WARNING [train.py:1204] (2/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_54-311741-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 16:23:24,558 WARNING [train.py:1204] (2/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_237-58433-0_sp0.9 from training. Duration: 0.82225 2023-10-03 16:23:26,631 WARNING [train.py:1204] (2/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_98-51676-0_sp1.1 from training. Duration: 0.663625 2023-10-03 16:23:27,939 WARNING [train.py:1204] (2/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_386-270472-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 16:23:28,038 WARNING [train.py:1204] (2/4) Exclude cut with ID _19023_210_4353_1_1531378892552_5820780_83-254794-0_sp1.1 from training. Duration: 0.7818125 2023-10-03 16:23:29,353 INFO [train.py:1046] (2/4) Epoch 38, batch 3000, loss[loss=0.1768, simple_loss=0.2432, pruned_loss=0.0552, over 23777.00 frames. ], tot_loss[loss=0.1587, simple_loss=0.2386, pruned_loss=0.03941, over 4743288.58 frames. ], batch size: 195, lr: 2.68e-03, grad_scale: 16.0 2023-10-03 16:23:29,353 INFO [train.py:1069] (2/4) Computing validation loss 2023-10-03 16:23:41,534 INFO [train.py:1078] (2/4) Epoch 38, validation: loss=0.3508, simple_loss=0.2758, pruned_loss=0.2129, over 1125622.00 frames. 2023-10-03 16:23:41,534 INFO [train.py:1079] (2/4) Maximum memory allocated so far is 20967MB 2023-10-03 16:23:41,668 WARNING [train.py:1204] (2/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_314-93656-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 16:23:41,685 WARNING [train.py:1204] (2/4) Exclude cut with ID _14170_210_5219_1_1530525949868_4469650_336-213530-0 from training. Duration: 0.9 2023-10-03 16:23:43,029 WARNING [train.py:1204] (2/4) Exclude cut with ID _36686_210_5665_1_1533778225913_3646410_65-221120-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 16:23:45,801 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_26433_210_12108_1_1532157158_4056480_271-68178-0_sp1.1 from training. Duration: 0.92725 2023-10-03 16:23:45,838 WARNING [train.py:1204] (2/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_46-207599-0_sp0.9 from training. Duration: 0.711125 2023-10-03 16:23:48,736 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9303W0114-637133 from training. Duration: 0.896 2023-10-03 16:23:48,761 WARNING [train.py:1204] (2/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_490-207861-0 from training. Duration: 0.98 2023-10-03 16:23:52,858 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6767_210_3273_1_1527855783_1718932_173-256601-0_sp0.9 from training. Duration: 0.888875 2023-10-03 16:23:52,902 WARNING [train.py:1204] (2/4) Exclude cut with ID _13921_210_9994_1_1530841782272_3503520_327-69677-0_sp1.1 from training. Duration: 0.8181875 2023-10-03 16:23:54,856 WARNING [train.py:1204] (2/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_508-335369-0 from training. Duration: 0.48 2023-10-03 16:23:54,888 WARNING [train.py:1204] (2/4) Exclude cut with ID _64631_210_27767_1_1535349580068_4165160_24-46567-0_sp1.1 from training. Duration: 0.963625 2023-10-03 16:23:55,064 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.feed_forward1.hidden_balancer.prob, batch_count=1330393.3333333333, ans=0.125 2023-10-03 16:24:01,241 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_89-35748-0_sp1.1 from training. Duration: 0.7181875 2023-10-03 16:24:09,198 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_26433_210_12108_1_1532157158_4056480_578-262107-0_sp1.1 from training. Duration: 0.9 2023-10-03 16:24:14,601 WARNING [train.py:1204] (2/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_129-337802-0 from training. Duration: 0.72 2023-10-03 16:24:14,881 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.balancer1.prob, batch_count=1330460.0, ans=0.125 2023-10-03 16:24:16,003 WARNING [train.py:1204] (2/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_356-167816-0_sp0.9 from training. Duration: 0.8 2023-10-03 16:24:17,514 WARNING [train.py:1204] (2/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_137-116442-0_sp1.1 from training. Duration: 0.7090625 2023-10-03 16:24:17,552 WARNING [train.py:1204] (2/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_358-172940-0_sp1.1 from training. Duration: 0.963625 2023-10-03 16:24:17,572 WARNING [train.py:1204] (2/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_668-344147-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 16:24:20,334 WARNING [train.py:1204] (2/4) Exclude cut with ID _62337_210_12392_1_1535009273105_3412229_255-157667-0_sp1.1 from training. Duration: 0.936375 2023-10-03 16:24:20,347 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_486-151131-0 from training. Duration: 0.85 2023-10-03 16:24:23,112 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_53638_210_21581_1_1534297319_5808449_885-80172-0 from training. Duration: 0.96 2023-10-03 16:24:23,205 WARNING [train.py:1204] (2/4) Exclude cut with ID _50093_210_4220_1_1534417141207_3432299_105-183067-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 16:24:24,024 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.balancer1.prob, batch_count=1330460.0, ans=0.125 2023-10-03 16:24:25,048 WARNING [train.py:1204] (2/4) Exclude cut with ID _7562_210_2084_1_1528887767916_3946229_345-169156-0_sp0.9 from training. Duration: 0.6444375 2023-10-03 16:24:27,778 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_4528_210_3273_1_1527076654_3276777_209-219804-0_sp0.9 from training. Duration: 0.7666875 2023-10-03 16:24:27,806 WARNING [train.py:1204] (2/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_35-153886-0_sp0.9 from training. Duration: 0.9333125 2023-10-03 16:24:27,869 WARNING [train.py:1204] (2/4) Exclude cut with ID _30800_210_22382_1_1532499347904_4515959_332-121925-0_sp1.1 from training. Duration: 0.92725 2023-10-03 16:24:27,870 WARNING [train.py:1204] (2/4) Exclude cut with ID _59202_210_26984_1_1534816810612_3549174_495-65515-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 16:24:31,936 WARNING [train.py:1204] (2/4) Exclude cut with ID _6274_210_2084_1_1527937583837_7494020_769-168302-0_sp1.1 from training. Duration: 0.6454375 2023-10-03 16:24:31,989 WARNING [train.py:1204] (2/4) Exclude cut with ID _24271_210_12658_1_1531987270022_7190519_708-71345-0_sp1.1 from training. Duration: 0.936375 2023-10-03 16:24:31,990 WARNING [train.py:1204] (2/4) Exclude cut with ID 3033-130750-0096-55598-0_sp0.9 from training. Duration: 0.92225 2023-10-03 16:24:33,318 WARNING [train.py:1204] (2/4) Exclude cut with ID _14087_210_2253_1_1530529224512_3014029_219-224445-0_sp0.9 from training. Duration: 0.9333125 2023-10-03 16:24:35,281 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_174-277364-0 from training. Duration: 0.71 2023-10-03 16:24:37,104 WARNING [train.py:1204] (2/4) Exclude cut with ID _45021_210_9994_1_1533619751094_3330288_96-239010-0_sp1.1 from training. Duration: 0.82725 2023-10-03 16:24:37,136 WARNING [train.py:1204] (2/4) Exclude cut with ID _31602_210_5854_1_1532677021456_4458250_489-305116-0_sp1.1 from training. Duration: 0.97275 2023-10-03 16:24:38,550 WARNING [train.py:1204] (2/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_269-154726-0_sp0.9 from training. Duration: 0.9555625 2023-10-03 16:24:41,454 WARNING [train.py:1204] (2/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_126-54418-0_sp1.1 from training. Duration: 0.92725 2023-10-03 16:24:41,482 WARNING [train.py:1204] (2/4) Exclude cut with ID _42326_210_7770_1_1533814282531_3707269_182-224159-0_sp1.1 from training. Duration: 0.92725 2023-10-03 16:24:42,769 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7002_210_1794_1_1527939683_5157286_1-281264-0_sp0.9 from training. Duration: 0.488875 2023-10-03 16:24:42,807 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_86-16881-0 from training. Duration: 0.58 2023-10-03 16:24:42,830 WARNING [train.py:1204] (2/4) Exclude cut with ID _19514_210_9259_1_1531530071330_5084230_637-279094-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 16:24:42,865 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_26433_210_12108_1_1532157158_4056480_578-262107-0 from training. Duration: 0.99 2023-10-03 16:24:44,264 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6291_210_3273_1_1527683649_1078498_82-221196-0_sp1.1 from training. Duration: 0.6909375 2023-10-03 16:24:45,594 WARNING [train.py:1204] (2/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_402-102311-0 from training. Duration: 0.67 2023-10-03 16:24:49,552 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1015-343884-0_sp1.1 from training. Duration: 0.563625 2023-10-03 16:24:49,634 WARNING [train.py:1204] (2/4) Exclude cut with ID _13684_210_7497_1_1530619354863_3598359_312-235606-0_sp1.1 from training. Duration: 0.5454375 2023-10-03 16:24:51,552 WARNING [train.py:1204] (2/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_55-31904-0 from training. Duration: 0.88 2023-10-03 16:24:51,645 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_3133_210_2087_1_1526106446_2324690_25-332138-0 from training. Duration: 0.91 2023-10-03 16:24:51,646 WARNING [train.py:1204] (2/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_912-282412-0_sp0.9 from training. Duration: 0.5444375 2023-10-03 16:24:52,874 WARNING [train.py:1204] (2/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_458-299435-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 16:24:54,904 WARNING [train.py:1204] (2/4) Exclude cut with ID _69584_210_5219_1_1535785133775_4044450_275-88403-0_sp1.1 from training. Duration: 0.92725 2023-10-03 16:24:54,912 WARNING [train.py:1204] (2/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_841-341679-0_sp1.1 from training. Duration: 0.52725 2023-10-03 16:24:54,919 WARNING [train.py:1204] (2/4) Exclude cut with ID _41391_210_7994_1_1533603771872_3638201_1-54521-0_sp1.1 from training. Duration: 0.97275 2023-10-03 16:24:54,971 WARNING [train.py:1204] (2/4) Exclude cut with ID _56235_210_10640_1_1534557335235_4161099_495-186609-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 16:24:55,194 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.feed_forward2.hidden_balancer.prob, batch_count=1330660.0, ans=0.125 2023-10-03 16:24:56,281 INFO [train.py:1046] (2/4) Epoch 38, batch 3050, loss[loss=0.203, simple_loss=0.273, pruned_loss=0.06653, over 19525.00 frames. ], tot_loss[loss=0.1596, simple_loss=0.2394, pruned_loss=0.03992, over 4732381.74 frames. ], batch size: 388, lr: 2.68e-03, grad_scale: 16.0 2023-10-03 16:24:58,039 WARNING [train.py:1204] (2/4) Exclude cut with ID _4681_210_3525_1_1527250689464_120889_15-126760-0 from training. Duration: 0.81 2023-10-03 16:24:59,368 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_3019_210_2028_1_1526044245_3264865_127-135615-0_sp1.1 from training. Duration: 0.8545625 2023-10-03 16:25:00,948 WARNING [train.py:1204] (2/4) Exclude cut with ID _30191_210_1664_1_1533189915047_7196670_915-21561-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 16:25:02,314 WARNING [train.py:1204] (2/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_6-214815-0_sp0.9 from training. Duration: 0.9444375 2023-10-03 16:25:02,699 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.feed_forward3.hidden_balancer.prob, batch_count=1330660.0, ans=0.125 2023-10-03 16:25:04,336 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_45399_210_4852_1_1533624725_7579737_190-137494-0_sp1.1 from training. Duration: 0.97275 2023-10-03 16:25:07,483 WARNING [train.py:1204] (2/4) Exclude cut with ID _41314_210_12651_1_1533459476543_3766460_207-132038-0 from training. Duration: 0.94 2023-10-03 16:25:07,678 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.feed_forward1.hidden_balancer.prob, batch_count=1330660.0, ans=0.125 2023-10-03 16:25:14,224 WARNING [train.py:1204] (2/4) Exclude cut with ID _14603_210_4220_1_1530615688441_3097319_90-31383-0 from training. Duration: 0.87 2023-10-03 16:25:14,279 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_15517_210_6753_1_1530952293_1664795_119-31001-0 from training. Duration: 0.94 2023-10-03 16:25:15,634 WARNING [train.py:1204] (2/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_684-85342-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 16:25:18,486 WARNING [train.py:1204] (2/4) Exclude cut with ID _14603_210_4220_1_1530615688441_3097319_97-325053-0_sp1.1 from training. Duration: 0.836375 2023-10-03 16:25:20,652 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_68702_210_23689_1_1535799577_3420068_168-25150-0_sp1.1 from training. Duration: 0.97275 2023-10-03 16:25:21,881 WARNING [train.py:1204] (2/4) Exclude cut with ID _65134_210_14763_1_1535335393985_3505368_111-27984-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 16:25:21,931 WARNING [train.py:1204] (2/4) Exclude cut with ID _48678_210_5663_1_1533988830947_3769199_276-226230-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 16:25:23,449 WARNING [train.py:1204] (2/4) Exclude cut with ID _28427_210_4220_1_1532581240967_3061069_43-84928-0_sp1.1 from training. Duration: 0.9 2023-10-03 16:25:25,200 WARNING [train.py:1204] (2/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_158-250116-0_sp1.1 from training. Duration: 0.563625 2023-10-03 16:25:25,246 WARNING [train.py:1204] (2/4) Exclude cut with ID _66873_210_5517_1_1535454136506_4293370_279-332556-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 16:25:26,668 WARNING [train.py:1204] (2/4) Exclude cut with ID _42863_210_9070_1_1533902398788_3696920_488-230146-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 16:25:26,668 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_387-247159-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 16:25:28,002 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6729_210_2550_1_1528032481_1788725_261-42619-0_sp1.1 from training. Duration: 0.97275 2023-10-03 16:25:30,604 WARNING [train.py:1204] (2/4) Exclude cut with ID _19896_210_1883_1_1531709022662_6649569_831-44724-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 16:25:33,291 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.628e+02 1.918e+02 2.115e+02 2.470e+02 3.368e+02, threshold=4.229e+02, percent-clipped=0.0 2023-10-03 16:25:33,413 WARNING [train.py:1204] (2/4) Exclude cut with ID _55153_210_2253_1_1534384786677_3162096_280-285944-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 16:25:33,452 WARNING [train.py:1204] (2/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_448-4048-0 from training. Duration: 0.96 2023-10-03 16:25:34,786 WARNING [train.py:1204] (2/4) Exclude cut with ID _29678_210_7893_1_1532421042787_3906118_456-202548-0_sp1.1 from training. Duration: 0.97275 2023-10-03 16:25:34,806 WARNING [train.py:1204] (2/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_107-332398-0_sp1.1 from training. Duration: 0.7090625 2023-10-03 16:25:38,795 WARNING [train.py:1204] (2/4) Exclude cut with ID _13921_210_9994_1_1530838702800_3003429_46-149444-0_sp1.1 from training. Duration: 0.8545625 2023-10-03 16:25:38,837 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_257-20733-0_sp0.9 from training. Duration: 0.8444375 2023-10-03 16:25:38,893 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_53753_210_7234_1_1534320003_3594148_385-234809-0_sp1.1 from training. Duration: 0.963625 2023-10-03 16:25:40,269 WARNING [train.py:1204] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_292-215346-0_sp1.1 from training. Duration: 0.8818125 2023-10-03 16:25:41,864 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.nonlin_attention.balancer.prob, batch_count=1330860.0, ans=0.125 2023-10-03 16:25:45,650 WARNING [train.py:1204] (2/4) Exclude cut with ID _68965_210_1883_1_1535698962272_3497490_196-124268-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 16:25:45,707 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_23801_210_16223_1_1532066392_3294657_570-5439-0_sp1.1 from training. Duration: 0.8818125 2023-10-03 16:25:52,653 WARNING [train.py:1204] (2/4) Exclude cut with ID _5166_210_2084_1_1527073197160_4087993_349-6928-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 16:25:54,437 WARNING [train.py:1204] (2/4) Exclude cut with ID _60621_210_17071_1_1535013053530_3671739_226-255202-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 16:25:54,441 WARNING [train.py:1204] (2/4) Exclude cut with ID _41817_210_18835_1_1533434477160_7423049_448-130302-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 16:25:55,863 WARNING [train.py:1204] (2/4) Exclude cut with ID _6653_210_2629_1_1527919172441_6732679_758-143677-0_sp1.1 from training. Duration: 0.8909375 2023-10-03 16:25:55,910 WARNING [train.py:1204] (2/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_912-47473-0_sp0.9 from training. Duration: 0.7555625 2023-10-03 16:25:55,942 WARNING [train.py:1204] (2/4) Exclude cut with ID _6694_210_4599_1_1527852637490_3965529_356-169233-0_sp1.1 from training. Duration: 0.9 2023-10-03 16:25:56,554 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.4.encoder.layers.0.self_attn1.whiten, num_groups=1, num_channels=384, metric=14.27 vs. limit=22.5 2023-10-03 16:25:57,354 WARNING [train.py:1204] (2/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_1-99680-0 from training. Duration: 0.67 2023-10-03 16:25:59,296 WARNING [train.py:1204] (2/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_199-86432-0_sp1.1 from training. Duration: 0.8909375 2023-10-03 16:25:59,314 WARNING [train.py:1204] (2/4) Exclude cut with ID _61148_210_6897_1_1535094022726_4217610_362-182430-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 16:25:59,558 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.self_attn_weights.pos_emb_skip_rate, batch_count=1330926.6666666667, ans=0.0 2023-10-03 16:26:00,635 WARNING [train.py:1204] (2/4) Exclude cut with ID _49376_210_5986_1_1533985278732_3370740_301-154763-0 from training. Duration: 0.98 2023-10-03 16:26:03,418 WARNING [train.py:1204] (2/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_572-185150-0_sp1.1 from training. Duration: 0.8818125 2023-10-03 16:26:05,017 INFO [scaling.py:1118] (2/4) WithLoss: name=encoder.encoders.2.encoder.layers.1.self_attn_weights, loss-sum=0.000e+00 2023-10-03 16:26:07,646 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_47896_210_20508_1_1534492518_5958868_558-314572-0_sp1.1 from training. Duration: 0.8818125 2023-10-03 16:26:10,301 INFO [train.py:1046] (2/4) Epoch 38, batch 3100, loss[loss=0.1567, simple_loss=0.2485, pruned_loss=0.0324, over 24660.00 frames. ], tot_loss[loss=0.1585, simple_loss=0.2381, pruned_loss=0.03946, over 4728859.10 frames. ], batch size: 73, lr: 2.68e-03, grad_scale: 16.0 2023-10-03 16:26:10,369 WARNING [train.py:1204] (2/4) Exclude cut with ID _45560_210_21982_1_1533812603951_3584999_111-6376-0_sp0.9 from training. Duration: 0.9333125 2023-10-03 16:26:12,259 WARNING [train.py:1204] (2/4) Exclude cut with ID _14170_210_5219_1_1530525949868_4469650_412-317077-0_sp1.1 from training. Duration: 0.6818125 2023-10-03 16:26:13,670 WARNING [train.py:1204] (2/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_718-68505-0 from training. Duration: 0.89 2023-10-03 16:26:16,370 WARNING [train.py:1204] (2/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_77-311404-0 from training. Duration: 0.94 2023-10-03 16:26:17,630 WARNING [train.py:1204] (2/4) Exclude cut with ID _60229_210_27767_1_1534838406158_3655350_264-279573-0 from training. Duration: 0.83 2023-10-03 16:26:19,036 WARNING [train.py:1204] (2/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_106-300985-0_sp1.1 from training. Duration: 0.8181875 2023-10-03 16:26:21,163 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_4183_210_2087_1_1526729100_1869838_79-277189-0_sp1.1 from training. Duration: 0.963625 2023-10-03 16:26:21,172 WARNING [train.py:1204] (2/4) Exclude cut with ID _28427_210_4220_1_1532581240967_3061069_195-295268-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 16:26:23,770 WARNING [train.py:1204] (2/4) Exclude cut with ID _4092_210_2151_1_1526643044449_3135759_356-278694-0_sp0.9 from training. Duration: 0.52225 2023-10-03 16:26:26,751 WARNING [train.py:1204] (2/4) Exclude cut with ID _14459_210_2228_1_1531443641946_7516700_777-71446-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 16:26:31,522 WARNING [train.py:1204] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_520-126729-0 from training. Duration: 0.7 2023-10-03 16:26:35,709 WARNING [train.py:1204] (2/4) Exclude cut with ID _13684_210_7497_1_1530619354863_3598359_312-235606-0_sp0.9 from training. Duration: 0.6666875 2023-10-03 16:26:35,742 WARNING [train.py:1204] (2/4) Exclude cut with ID _37537_210_2253_1_1533171758568_3421949_283-349402-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 16:26:35,783 WARNING [train.py:1204] (2/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_29-117932-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 16:26:35,809 WARNING [train.py:1204] (2/4) Exclude cut with ID _29303_210_2575_1_1532392237303_7410649_721-283351-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 16:26:37,013 WARNING [train.py:1204] (2/4) Exclude cut with ID _14886_210_4220_1_1530702020355_3516190_175-229073-0_sp0.9 from training. Duration: 0.5 2023-10-03 16:26:37,192 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.conv_module2.balancer1.prob, batch_count=1331060.0, ans=0.125 2023-10-03 16:26:40,285 WARNING [train.py:1204] (2/4) Exclude cut with ID _15458_210_8341_1_1530858614263_3834410_70-268830-0_sp1.1 from training. Duration: 0.97275 2023-10-03 16:26:40,308 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_2295_210_2087_1_1525500993_2407863_234-83104-0 from training. Duration: 0.62 2023-10-03 16:26:40,309 WARNING [train.py:1204] (2/4) Exclude cut with ID _22961_210_8254_1_1531976366672_4278180_317-199546-0_sp1.1 from training. Duration: 0.7818125 2023-10-03 16:26:43,415 WARNING [train.py:1204] (2/4) Exclude cut with ID _27894_210_12392_1_1532415496078_6902439_547-196022-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 16:26:44,798 WARNING [train.py:1204] (2/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_46-207599-0 from training. Duration: 0.64 2023-10-03 16:26:44,916 WARNING [train.py:1204] (2/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_426-326246-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 16:26:49,034 WARNING [train.py:1204] (2/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_445-117398-0_sp0.9 from training. Duration: 0.988875 2023-10-03 16:26:50,359 WARNING [train.py:1204] (2/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_563-347832-0 from training. Duration: 0.89 2023-10-03 16:26:51,073 WARNING [train.py:1204] (2/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_252-216674-0 from training. Duration: 0.54 2023-10-03 16:26:53,638 WARNING [train.py:1204] (2/4) Exclude cut with ID _59611_210_8895_1_1534741198240_3650361_187-212779-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 16:26:53,713 WARNING [train.py:1204] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_282-159367-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 16:26:56,440 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_68702_210_23689_1_1535799577_3420068_33-136883-0_sp1.1 from training. Duration: 0.936375 2023-10-03 16:26:56,463 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_47228_210_4983_1_1533780021_3672027_184-293706-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 16:26:57,808 WARNING [train.py:1204] (2/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_656-188361-0_sp0.9 from training. Duration: 0.9666875 2023-10-03 16:26:59,156 WARNING [train.py:1204] (2/4) Exclude cut with ID _4866_210_2577_1_1527404412284_4364659_352-157930-0_sp0.9 from training. Duration: 0.9 2023-10-03 16:26:59,156 WARNING [train.py:1204] (2/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_210-106047-0_sp1.1 from training. Duration: 0.8818125 2023-10-03 16:27:00,553 WARNING [train.py:1204] (2/4) Exclude cut with ID _9389_210_1664_1_1529217337301_5289269_289-213417-0_sp1.1 from training. Duration: 0.8090625 2023-10-03 16:27:00,585 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_3910_210_1794_1_1526777667_3747260_178-41186-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 16:27:00,594 WARNING [train.py:1204] (2/4) Exclude cut with ID _27052_210_5986_1_1532329197187_3585009_24-172263-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 16:27:00,597 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_5522_210_3273_1_1527249606_3909096_318-203153-0_sp1.1 from training. Duration: 0.3909375 2023-10-03 16:27:01,135 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.4.encoder.layers.2.self_attn1.whiten, num_groups=1, num_channels=384, metric=11.85 vs. limit=22.5 2023-10-03 16:27:04,319 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.conv_module1.balancer1.prob, batch_count=1331193.3333333333, ans=0.125 2023-10-03 16:27:06,834 WARNING [train.py:1204] (2/4) Exclude cut with ID _27894_210_12392_1_1532415496078_6902439_222-51982-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 16:27:06,938 WARNING [train.py:1204] (2/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_87-131427-0 from training. Duration: 0.78 2023-10-03 16:27:09,640 WARNING [train.py:1204] (2/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_372-298093-0_sp0.9 from training. Duration: 0.888875 2023-10-03 16:27:09,705 WARNING [train.py:1204] (2/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_663-166973-0 from training. Duration: 0.8 2023-10-03 16:27:10,930 WARNING [train.py:1204] (2/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_429-177469-0_sp1.1 from training. Duration: 0.936375 2023-10-03 16:27:10,963 WARNING [train.py:1204] (2/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_724-172651-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 16:27:11,006 WARNING [train.py:1204] (2/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_103-336064-0 from training. Duration: 0.51 2023-10-03 16:27:13,147 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.balancer1.prob, batch_count=1331260.0, ans=0.125 2023-10-03 16:27:20,044 WARNING [train.py:1204] (2/4) Exclude cut with ID _48677_210_5663_1_1533902405919_3892989_111-11439-0 from training. Duration: 0.96 2023-10-03 16:27:23,045 WARNING [train.py:1204] (2/4) Exclude cut with ID _8085_210_4557_1_1528455633493_3716100_95-103398-0_sp1.1 from training. Duration: 0.963625 2023-10-03 16:27:23,105 WARNING [train.py:1204] (2/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_1041-110837-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 16:27:24,353 INFO [train.py:1046] (2/4) Epoch 38, batch 3150, loss[loss=0.1564, simple_loss=0.2295, pruned_loss=0.04162, over 23689.00 frames. ], tot_loss[loss=0.1576, simple_loss=0.2365, pruned_loss=0.03942, over 4715135.45 frames. ], batch size: 120, lr: 2.68e-03, grad_scale: 8.0 2023-10-03 16:27:25,867 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_50050_210_16223_1_1535435796_7290138_1155-121757-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 16:27:25,869 WARNING [train.py:1204] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_602-275260-0_sp0.9 from training. Duration: 0.911125 2023-10-03 16:27:27,217 WARNING [train.py:1204] (2/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_265-164486-0 from training. Duration: 0.96 2023-10-03 16:27:28,574 WARNING [train.py:1204] (2/4) Exclude cut with ID _46067_210_4220_1_1534125571066_3307450_142-229375-0_sp1.1 from training. Duration: 0.963625 2023-10-03 16:27:28,615 WARNING [train.py:1204] (2/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_310-26186-0_sp1.1 from training. Duration: 0.636375 2023-10-03 16:27:29,971 WARNING [train.py:1204] (2/4) Exclude cut with ID _28461_210_2253_1_1532771775189_3041299_136-270644-0 from training. Duration: 0.93 2023-10-03 16:27:33,164 WARNING [train.py:1204] (2/4) Exclude cut with ID _14860_210_4915_1_1530702020340_7343240_438-286372-0_sp1.1 from training. Duration: 0.936375 2023-10-03 16:27:34,600 WARNING [train.py:1204] (2/4) Exclude cut with ID IC0128W0004-63309 from training. Duration: 0.831 2023-10-03 16:27:37,568 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.feed_forward1.out_proj.dropout_p, batch_count=1331393.3333333333, ans=0.1 2023-10-03 16:27:38,650 WARNING [train.py:1204] (2/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_135-174080-0 from training. Duration: 0.53 2023-10-03 16:27:38,684 WARNING [train.py:1204] (2/4) Exclude cut with ID _41326_210_5854_1_1533778076509_3492080_277-59511-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 16:27:40,026 WARNING [train.py:1204] (2/4) Exclude cut with ID IC0144W0112-71379 from training. Duration: 0.992 2023-10-03 16:27:41,383 WARNING [train.py:1204] (2/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_711-15688-0_sp0.9 from training. Duration: 0.47775 2023-10-03 16:27:41,500 WARNING [train.py:1204] (2/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_237-332027-0 from training. Duration: 0.95 2023-10-03 16:27:43,356 WARNING [train.py:1204] (2/4) Exclude cut with ID _7970_210_1883_1_1528455616296_3472230_25-178407-0 from training. Duration: 0.71 2023-10-03 16:27:43,358 WARNING [train.py:1204] (2/4) Exclude cut with ID _8480_210_1874_1_1528589391205_6728330_309-139896-0 from training. Duration: 0.81 2023-10-03 16:27:43,391 WARNING [train.py:1204] (2/4) Exclude cut with ID _39571_210_13449_1_1533801584143_3651077_319-268877-0_sp1.1 from training. Duration: 0.936375 2023-10-03 16:27:43,395 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_43802_210_2290_1_1533516203_8829504_155-246880-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 16:27:43,589 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.1.ff2_skip_rate, batch_count=1331393.3333333333, ans=0.0 2023-10-03 16:27:44,767 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_566-122599-0_sp1.1 from training. Duration: 0.936375 2023-10-03 16:27:44,892 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_12868_210_6753_1_1530701932_530275_18-76322-0 from training. Duration: 0.87 2023-10-03 16:27:45,123 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.conv_module1.balancer1.prob, batch_count=1331393.3333333333, ans=0.125 2023-10-03 16:27:48,298 WARNING [train.py:1204] (2/4) Exclude cut with ID _56494_210_4220_1_1535284837625_3033169_159-106035-0_sp1.1 from training. Duration: 0.963625 2023-10-03 16:27:48,326 WARNING [train.py:1204] (2/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_98-276013-0_sp1.1 from training. Duration: 0.963625 2023-10-03 16:27:48,371 WARNING [train.py:1204] (2/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_330-216124-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 16:27:49,719 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_177-58005-0_sp0.9 from training. Duration: 0.688875 2023-10-03 16:27:54,382 WARNING [train.py:1204] (2/4) Exclude cut with ID _56235_210_10640_1_1534557335235_4161099_343-139898-0 from training. Duration: 0.96 2023-10-03 16:27:54,452 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6961_210_2087_1_1527937168_6815011_395-185060-0_sp1.1 from training. Duration: 0.863625 2023-10-03 16:27:57,152 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7002_210_1794_1_1527939683_5157286_184-229630-0_sp0.9 from training. Duration: 0.82225 2023-10-03 16:27:58,404 WARNING [train.py:1204] (2/4) Exclude cut with ID _59170_210_6721_1_1534983633960_4203335_153-18895-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 16:27:58,452 WARNING [train.py:1204] (2/4) Exclude cut with ID _49643_210_7989_1_1534640244224_3939031_598-161926-0 from training. Duration: 0.9 2023-10-03 16:28:01,320 WARNING [train.py:1204] (2/4) Exclude cut with ID _4068_210_2629_1_1526697350395_1546259_224-288693-0 from training. Duration: 0.59 2023-10-03 16:28:02,633 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.650e+02 1.924e+02 2.139e+02 2.464e+02 3.251e+02, threshold=4.278e+02, percent-clipped=0.0 2023-10-03 16:28:02,709 WARNING [train.py:1204] (2/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_718-68505-0_sp1.1 from training. Duration: 0.8090625 2023-10-03 16:28:02,749 WARNING [train.py:1204] (2/4) Exclude cut with ID _4068_210_2629_1_1526697350395_1546259_224-288693-0_sp0.9 from training. Duration: 0.6555625 2023-10-03 16:28:02,778 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_2296_210_2087_1_1525503448_3397335_57-185430-0_sp0.9 from training. Duration: 0.6666875 2023-10-03 16:28:04,164 WARNING [train.py:1204] (2/4) Exclude cut with ID _15458_210_8341_1_1530858614263_3834410_49-235472-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 16:28:04,167 WARNING [train.py:1204] (2/4) Exclude cut with ID _64135_210_2253_1_1535677267876_7608972_498-5039-0_sp0.9 from training. Duration: 0.8666875 2023-10-03 16:28:06,100 WARNING [train.py:1204] (2/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_283-220372-0_sp0.9 from training. Duration: 0.62225 2023-10-03 16:28:06,115 WARNING [train.py:1204] (2/4) Exclude cut with ID _6473_210_1741_1_1527901307396_3815380_108-17335-0_sp0.9 from training. Duration: 0.72225 2023-10-03 16:28:07,537 WARNING [train.py:1204] (2/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_113-92608-0 from training. Duration: 0.54 2023-10-03 16:28:08,818 WARNING [train.py:1204] (2/4) Exclude cut with ID _14170_210_5219_1_1530525949868_4469650_272-249985-0_sp1.1 from training. Duration: 0.6090625 2023-10-03 16:28:08,832 WARNING [train.py:1204] (2/4) Exclude cut with ID _50376_210_13401_1_1534031502101_4217740_518-187390-0_sp1.1 from training. Duration: 0.97275 2023-10-03 16:28:10,206 WARNING [train.py:1204] (2/4) Exclude cut with ID _59738_210_15379_1_1534849512562_3941470_165-248260-0_sp1.1 from training. Duration: 0.92725 2023-10-03 16:28:10,216 WARNING [train.py:1204] (2/4) Exclude cut with ID _23818_210_6228_1_1531917063884_3676920_400-343925-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 16:28:11,518 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6961_210_2087_1_1527937168_6815011_562-165518-0 from training. Duration: 0.77 2023-10-03 16:28:11,578 WARNING [train.py:1204] (2/4) Exclude cut with ID _24271_210_12658_1_1531987270022_7190519_289-207931-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 16:28:14,721 WARNING [train.py:1204] (2/4) Exclude cut with ID _14632_210_9988_1_1530691279392_3923200_332-105904-0 from training. Duration: 0.9 2023-10-03 16:28:14,733 WARNING [train.py:1204] (2/4) Exclude cut with ID _40402_210_18219_1_1533522324115_2953150_203-232438-0_sp1.1 from training. Duration: 0.97275 2023-10-03 16:28:16,129 WARNING [train.py:1204] (2/4) Exclude cut with ID _13252_210_1925_1_1530671372047_3336909_263-168388-0 from training. Duration: 0.86 2023-10-03 16:28:17,436 WARNING [train.py:1204] (2/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_174-205058-0 from training. Duration: 0.95 2023-10-03 16:28:17,732 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.balancer1.prob, batch_count=1331526.6666666667, ans=0.125 2023-10-03 16:28:19,539 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_94-195801-0_sp1.1 from training. Duration: 0.8545625 2023-10-03 16:28:19,568 WARNING [train.py:1204] (2/4) Exclude cut with ID _55153_210_2253_1_1534384786677_3162096_269-103204-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 16:28:19,709 INFO [scaling.py:1118] (2/4) WithLoss: name=encoder.encoders.0.layers.0.self_attn_weights, loss-sum=0.000e+00 2023-10-03 16:28:20,896 WARNING [train.py:1204] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_748-36150-0 from training. Duration: 0.91 2023-10-03 16:28:22,259 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_12868_210_6753_1_1530702479_3610564_148-172861-0_sp1.1 from training. Duration: 0.4545625 2023-10-03 16:28:23,555 WARNING [train.py:1204] (2/4) Exclude cut with ID _67499_210_17282_1_1535509805220_3692430_460-77730-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 16:28:25,425 WARNING [train.py:1204] (2/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_374-20753-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 16:28:25,527 WARNING [train.py:1204] (2/4) Exclude cut with ID _45329_210_7005_1_1534157951211_6774339_374-289983-0_sp1.1 from training. Duration: 0.97275 2023-10-03 16:28:25,553 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_34886_210_10633_1_1533203882_1755661_76-156096-0_sp0.9 from training. Duration: 0.9666875 2023-10-03 16:28:25,798 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.feed_forward3.hidden_balancer.prob, batch_count=1331593.3333333333, ans=0.125 2023-10-03 16:28:31,110 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_238-256668-0_sp0.9 from training. Duration: 0.7666875 2023-10-03 16:28:32,361 WARNING [train.py:1204] (2/4) Exclude cut with ID _46683_210_9642_1_1533730913492_4591116_489-146727-0_sp1.1 from training. Duration: 0.97275 2023-10-03 16:28:33,793 WARNING [train.py:1204] (2/4) Exclude cut with ID _13938_210_9988_1_1530518718978_3693960_304-112784-0_sp0.9 from training. Duration: 0.511125 2023-10-03 16:28:38,470 INFO [train.py:1046] (2/4) Epoch 38, batch 3200, loss[loss=0.1397, simple_loss=0.218, pruned_loss=0.03069, over 16373.00 frames. ], tot_loss[loss=0.1572, simple_loss=0.2354, pruned_loss=0.03946, over 4699767.17 frames. ], batch size: 35, lr: 2.67e-03, grad_scale: 16.0 2023-10-03 16:28:39,927 WARNING [train.py:1204] (2/4) Exclude cut with ID _24271_210_12658_1_1531987270022_7190519_243-46549-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 16:28:39,931 WARNING [train.py:1204] (2/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_78-182912-0_sp1.1 from training. Duration: 0.536375 2023-10-03 16:28:45,870 WARNING [train.py:1204] (2/4) Exclude cut with ID _57572_210_5517_1_1534921219624_3809484_121-321334-0_sp1.1 from training. Duration: 0.97275 2023-10-03 16:28:45,975 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_40047_210_2290_1_1533290519_2374311_254-102149-0_sp1.1 from training. Duration: 0.963625 2023-10-03 16:28:45,976 WARNING [train.py:1204] (2/4) Exclude cut with ID _28461_210_2253_1_1532771775189_3041299_96-152158-0 from training. Duration: 0.85 2023-10-03 16:28:48,847 WARNING [train.py:1204] (2/4) Exclude cut with ID _29877_210_6228_1_1532397791106_5276149_161-102979-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 16:28:50,918 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_3787_210_2040_1_1526477544_5478586_603-120324-0_sp0.9 from training. Duration: 0.9 2023-10-03 16:28:55,645 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_41750_210_1960_1_1533520678_3948042_247-213456-0_sp1.1 from training. Duration: 0.97275 2023-10-03 16:29:02,591 WARNING [train.py:1204] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_127-119109-0_sp0.9 from training. Duration: 0.8 2023-10-03 16:29:05,544 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.feed_forward1.hidden_balancer.prob, batch_count=1331726.6666666667, ans=0.125 2023-10-03 16:29:10,218 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.attention_skip_rate, batch_count=1331793.3333333333, ans=0.0 2023-10-03 16:29:11,417 WARNING [train.py:1204] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_215-202973-0 from training. Duration: 0.88 2023-10-03 16:29:11,484 WARNING [train.py:1204] (2/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_17-320491-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 16:29:17,170 WARNING [train.py:1204] (2/4) Exclude cut with ID _5166_210_2084_1_1527073197160_4087993_310-114452-0 from training. Duration: 0.59 2023-10-03 16:29:18,953 WARNING [train.py:1204] (2/4) Exclude cut with ID _15133_210_6228_1_1530794512074_3080379_29-264458-0_sp0.9 from training. Duration: 0.6444375 2023-10-03 16:29:21,795 WARNING [train.py:1204] (2/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_159-281310-0_sp1.1 from training. Duration: 0.863625 2023-10-03 16:29:21,812 WARNING [train.py:1204] (2/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_195-167011-0_sp0.9 from training. Duration: 0.8333125 2023-10-03 16:29:21,877 WARNING [train.py:1204] (2/4) Exclude cut with ID _14087_210_2253_1_1530529224512_3014029_156-257607-0_sp1.1 from training. Duration: 0.7545625 2023-10-03 16:29:26,450 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_2295_210_2087_1_1525500993_2407863_116-238894-0 from training. Duration: 0.7 2023-10-03 16:29:26,567 WARNING [train.py:1204] (2/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_321-335475-0_sp0.9 from training. Duration: 0.5 2023-10-03 16:29:28,049 WARNING [train.py:1204] (2/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_188-114464-0 from training. Duration: 0.82 2023-10-03 16:29:30,914 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_2592_210_2290_1_1525697025_4882925_157-70512-0 from training. Duration: 0.91 2023-10-03 16:29:33,476 WARNING [train.py:1204] (2/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_138-64210-0_sp1.1 from training. Duration: 0.663625 2023-10-03 16:29:36,555 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.balancer1.prob, batch_count=1331926.6666666667, ans=0.125 2023-10-03 16:29:39,437 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_11305_210_1794_1_1529750428_7178148_651-95702-0_sp1.1 from training. Duration: 0.963625 2023-10-03 16:29:39,458 WARNING [train.py:1204] (2/4) Exclude cut with ID _14224_210_4381_1_1530529062037_4264010_379-184223-0_sp0.9 from training. Duration: 0.9444375 2023-10-03 16:29:39,482 WARNING [train.py:1204] (2/4) Exclude cut with ID _8394_210_6702_1_1528590504316_3540189_381-125325-0_sp1.1 from training. Duration: 0.963625 2023-10-03 16:29:40,811 WARNING [train.py:1204] (2/4) Exclude cut with ID IC0622W0021-307659 from training. Duration: 0.896 2023-10-03 16:29:40,814 WARNING [train.py:1204] (2/4) Exclude cut with ID _7624_210_1531_1_1528109802198_4933101_418-349834-0_sp1.1 from training. Duration: 0.5454375 2023-10-03 16:29:44,829 WARNING [train.py:1204] (2/4) Exclude cut with ID _49728_210_12651_1_1534041034294_6788150_607-3416-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 16:29:46,100 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8275_210_4198_1_1528455320_3892571_491-17707-0 from training. Duration: 0.64 2023-10-03 16:29:48,095 WARNING [train.py:1204] (2/4) Exclude cut with ID _5287_210_2084_1_1527418883040_3878670_333-48508-0 from training. Duration: 0.6 2023-10-03 16:29:48,174 WARNING [train.py:1204] (2/4) Exclude cut with ID _8365_210_2064_1_1528592311070_4677690_371-122295-0 from training. Duration: 0.55 2023-10-03 16:29:50,134 WARNING [train.py:1204] (2/4) Exclude cut with ID _14860_210_4915_1_1530702020340_7343240_836-328489-0 from training. Duration: 0.9 2023-10-03 16:29:53,176 INFO [train.py:1046] (2/4) Epoch 38, batch 3250, loss[loss=0.1753, simple_loss=0.2601, pruned_loss=0.04525, over 24000.00 frames. ], tot_loss[loss=0.1568, simple_loss=0.2349, pruned_loss=0.03933, over 4689322.27 frames. ], batch size: 80, lr: 2.67e-03, grad_scale: 16.0 2023-10-03 16:29:53,244 WARNING [train.py:1204] (2/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_1033-274376-0_sp0.9 from training. Duration: 0.9666875 2023-10-03 16:29:54,149 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.2.encoder.layers.0.self_attn2.whiten, num_groups=1, num_channels=384, metric=22.46 vs. limit=22.5 2023-10-03 16:29:54,926 WARNING [train.py:1204] (2/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_475-127455-0_sp0.9 from training. Duration: 0.77775 2023-10-03 16:29:54,932 WARNING [train.py:1204] (2/4) Exclude cut with ID IC0671W0071-331914 from training. Duration: 0.896 2023-10-03 16:29:56,276 WARNING [train.py:1204] (2/4) Exclude cut with ID _30327_210_16049_1_1532484161295_3924169_2-1523-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 16:29:56,279 WARNING [train.py:1204] (2/4) Exclude cut with ID _24314_210_4915_1_1532135078116_7170179_460-208279-0_sp1.1 from training. Duration: 0.97275 2023-10-03 16:29:56,398 WARNING [train.py:1204] (2/4) Exclude cut with ID IC0679W0052-335859 from training. Duration: 0.64 2023-10-03 16:30:00,654 WARNING [train.py:1204] (2/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_3-237434-0_sp1.1 from training. Duration: 0.6909375 2023-10-03 16:30:03,396 WARNING [train.py:1204] (2/4) Exclude cut with ID _69884_210_9771_1_1535846392818_4166169_405-185283-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 16:30:12,428 WARNING [train.py:1204] (2/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_927-83684-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 16:30:12,443 WARNING [train.py:1204] (2/4) Exclude cut with ID _6694_210_4599_1_1527852637490_3965529_356-169233-0 from training. Duration: 0.99 2023-10-03 16:30:13,778 WARNING [train.py:1204] (2/4) Exclude cut with ID _19896_210_1883_1_1531709022662_6649569_562-233001-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 16:30:13,829 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_20120_210_6753_1_1531643035_5482859_354-78629-0_sp1.1 from training. Duration: 0.963625 2023-10-03 16:30:13,830 WARNING [train.py:1204] (2/4) Exclude cut with ID _60135_210_13427_1_1534935557922_3651319_96-168198-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 16:30:15,313 WARNING [train.py:1204] (2/4) Exclude cut with ID _31602_210_5854_1_1532677021456_4458250_249-96604-0_sp1.1 from training. Duration: 0.8090625 2023-10-03 16:30:15,344 WARNING [train.py:1204] (2/4) Exclude cut with ID _14100_210_8341_1_1530529320613_3678069_258-224599-0_sp0.9 from training. Duration: 0.8555625 2023-10-03 16:30:18,011 WARNING [train.py:1204] (2/4) Exclude cut with ID _19708_210_9206_1_1532136650733_4238364_215-160135-0_sp1.1 from training. Duration: 0.97275 2023-10-03 16:30:19,799 WARNING [train.py:1204] (2/4) Exclude cut with ID _21878_210_3525_1_1531738772002_7264330_113-18163-0_sp0.9 from training. Duration: 0.988875 2023-10-03 16:30:19,835 WARNING [train.py:1204] (2/4) Exclude cut with ID _64135_210_2253_1_1535677267876_7608972_552-337340-0_sp1.1 from training. Duration: 0.92725 2023-10-03 16:30:19,869 WARNING [train.py:1204] (2/4) Exclude cut with ID _67725_210_27767_1_1535608756671_5105359_10-12887-0_sp1.1 from training. Duration: 0.97275 2023-10-03 16:30:19,875 WARNING [train.py:1204] (2/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_401-284840-0_sp1.1 from training. Duration: 0.97275 2023-10-03 16:30:19,907 WARNING [train.py:1204] (2/4) Exclude cut with ID _44659_210_2721_1_1534036120061_6614860_483-141285-0_sp1.1 from training. Duration: 0.936375 2023-10-03 16:30:22,573 WARNING [train.py:1204] (2/4) Exclude cut with ID _9461_210_4557_1_1529060514726_3677451_358-332734-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 16:30:24,548 WARNING [train.py:1204] (2/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_788-298907-0_sp1.1 from training. Duration: 0.8090625 2023-10-03 16:30:26,420 WARNING [train.py:1204] (2/4) Exclude cut with ID _39411_210_5986_1_1533538825030_3343649_354-267196-0_sp1.1 from training. Duration: 0.92725 2023-10-03 16:30:26,440 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_14953_210_2151_1_1530701710_5616165_192-309792-0_sp1.1 from training. Duration: 0.97275 2023-10-03 16:30:27,867 WARNING [train.py:1204] (2/4) Exclude cut with ID _59170_210_6721_1_1534983633960_4203335_290-151255-0_sp1.1 from training. Duration: 0.92725 2023-10-03 16:30:29,153 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_5575_210_1410_1_1527301305_2570170_249-93769-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 16:30:29,171 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_46013_210_7234_1_1533776362_3744032_463-52378-0_sp1.1 from training. Duration: 0.7818125 2023-10-03 16:30:33,137 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.564e+02 2.003e+02 2.220e+02 2.605e+02 4.440e+02, threshold=4.440e+02, percent-clipped=1.0 2023-10-03 16:30:34,553 WARNING [train.py:1204] (2/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_580-93798-0 from training. Duration: 0.56 2023-10-03 16:30:34,607 WARNING [train.py:1204] (2/4) Exclude cut with ID _19514_210_9259_1_1531530071330_5084230_385-13762-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 16:30:34,618 WARNING [train.py:1204] (2/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_458-67321-0_sp1.1 from training. Duration: 0.8818125 2023-10-03 16:30:35,981 WARNING [train.py:1204] (2/4) Exclude cut with ID _41298_210_9019_1_1533430371450_4822143_188-154791-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 16:30:37,351 WARNING [train.py:1204] (2/4) Exclude cut with ID _15037_210_2253_1_1530752294104_3267650_296-192597-0_sp0.9 from training. Duration: 0.9 2023-10-03 16:30:37,588 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.conv_module1.balancer2.min_positive, batch_count=1332193.3333333333, ans=0.05 2023-10-03 16:30:41,575 WARNING [train.py:1204] (2/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_661-264857-0_sp1.1 from training. Duration: 0.8454375 2023-10-03 16:30:50,923 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_34008_210_10431_1_1533345424_8183247_662-317331-0_sp1.1 from training. Duration: 0.8545625 2023-10-03 16:30:50,946 WARNING [train.py:1204] (2/4) Exclude cut with ID _46502_210_13794_1_1533724452198_3657159_241-278329-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 16:30:50,947 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_11349_210_6753_1_1529800861_1101207_26-65602-0 from training. Duration: 0.96 2023-10-03 16:30:50,953 WARNING [train.py:1204] (2/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_448-4048-0_sp1.1 from training. Duration: 0.87275 2023-10-03 16:30:50,960 WARNING [train.py:1204] (2/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_662-275621-0_sp1.1 from training. Duration: 0.3818125 2023-10-03 16:30:51,014 WARNING [train.py:1204] (2/4) Exclude cut with ID _13938_210_9988_1_1530518718978_3693960_256-54454-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 16:30:51,204 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.bypass_mid.scale_min, batch_count=1332260.0, ans=0.2 2023-10-03 16:30:51,273 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.bypass.scale_min, batch_count=1332260.0, ans=0.2 2023-10-03 16:30:53,740 WARNING [train.py:1204] (2/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_256-32662-0 from training. Duration: 0.9 2023-10-03 16:30:53,793 WARNING [train.py:1204] (2/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_326-17061-0 from training. Duration: 0.94 2023-10-03 16:30:53,822 WARNING [train.py:1204] (2/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_505-97987-0_sp1.1 from training. Duration: 0.7818125 2023-10-03 16:30:55,289 WARNING [train.py:1204] (2/4) Exclude cut with ID _66873_210_5517_1_1535454136506_4293370_316-294364-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 16:30:57,253 WARNING [train.py:1204] (2/4) Exclude cut with ID _19514_210_9259_1_1531530071330_5084230_762-231063-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 16:30:58,566 WARNING [train.py:1204] (2/4) Exclude cut with ID _4697_210_2069_1_1526815801953_3823098_68-219807-0_sp0.9 from training. Duration: 0.52225 2023-10-03 16:30:58,610 WARNING [train.py:1204] (2/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_479-158053-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 16:31:00,119 WARNING [train.py:1204] (2/4) Exclude cut with ID _66112_210_10635_1_1535590236305_3801484_83-38181-0_sp1.1 from training. Duration: 0.936375 2023-10-03 16:31:00,133 WARNING [train.py:1204] (2/4) Exclude cut with ID _55857_210_2346_1_1534644007831_6898209_255-146346-0_sp1.1 from training. Duration: 0.963625 2023-10-03 16:31:01,524 WARNING [train.py:1204] (2/4) Exclude cut with ID _48836_210_12210_1_1534075848293_3904150_429-35475-0 from training. Duration: 0.89 2023-10-03 16:31:01,542 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_50154_210_13731_1_1534750862_1080961_119-187163-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 16:31:02,998 WARNING [train.py:1204] (2/4) Exclude cut with ID _8793_210_6310_1_1528542985139_2310009_121-179782-0_sp0.9 from training. Duration: 0.888875 2023-10-03 16:31:03,001 WARNING [train.py:1204] (2/4) Exclude cut with ID _14884_210_2252_1_1530707459689_5561250_591-173406-0 from training. Duration: 0.96 2023-10-03 16:31:06,785 INFO [train.py:1046] (2/4) Epoch 38, batch 3300, loss[loss=0.169, simple_loss=0.2404, pruned_loss=0.04882, over 23829.00 frames. ], tot_loss[loss=0.1574, simple_loss=0.2361, pruned_loss=0.03933, over 4689547.86 frames. ], batch size: 164, lr: 2.67e-03, grad_scale: 8.0 2023-10-03 16:31:06,832 WARNING [train.py:1204] (2/4) Exclude cut with ID _15133_210_6228_1_1530794512074_3080379_54-198193-0_sp1.1 from training. Duration: 0.8545625 2023-10-03 16:31:06,842 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6545_210_2040_1_1527774670_4245258_407-48070-0 from training. Duration: 0.76 2023-10-03 16:31:08,238 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1032-236863-0 from training. Duration: 0.84 2023-10-03 16:31:10,067 WARNING [train.py:1204] (2/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_507-303370-0 from training. Duration: 0.96 2023-10-03 16:31:10,087 WARNING [train.py:1204] (2/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_164-282366-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 16:31:15,803 WARNING [train.py:1204] (2/4) Exclude cut with ID _39867_210_9826_1_1533553222030_9344259_312-158727-0_sp1.1 from training. Duration: 0.963625 2023-10-03 16:31:15,892 WARNING [train.py:1204] (2/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_174-205058-0_sp1.1 from training. Duration: 0.863625 2023-10-03 16:31:17,316 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_11305_210_1794_1_1529750428_7178148_166-237358-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 16:31:19,955 WARNING [train.py:1204] (2/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_26-175553-0_sp1.1 from training. Duration: 0.5545625 2023-10-03 16:31:19,974 WARNING [train.py:1204] (2/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_96-290039-0_sp1.1 from training. Duration: 0.5090625 2023-10-03 16:31:21,431 WARNING [train.py:1204] (2/4) Exclude cut with ID _53219_210_4525_1_1534834836008_4064825_261-101691-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 16:31:23,729 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.1.encoder.layers.0.nonlin_attention.whiten2, num_groups=1, num_channels=256, metric=5.66 vs. limit=15.0 2023-10-03 16:31:24,207 WARNING [train.py:1204] (2/4) Exclude cut with ID _24271_210_12658_1_1531987270022_7190519_96-293390-0_sp1.1 from training. Duration: 0.936375 2023-10-03 16:31:27,602 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_271-234962-0 from training. Duration: 0.79 2023-10-03 16:31:28,794 WARNING [train.py:1204] (2/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_136-30660-0_sp1.1 from training. Duration: 0.97275 2023-10-03 16:31:28,816 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_20120_210_6753_1_1531643035_5482859_625-281046-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 16:31:31,979 WARNING [train.py:1204] (2/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_333-168193-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 16:31:32,055 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9029W0269-509664 from training. Duration: 0.896 2023-10-03 16:31:33,482 WARNING [train.py:1204] (2/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_450-147393-0_sp1.1 from training. Duration: 0.92725 2023-10-03 16:31:33,528 WARNING [train.py:1204] (2/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_643-52007-0_sp1.1 from training. Duration: 0.5181875 2023-10-03 16:31:33,629 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.0.feed_forward1.hidden_balancer.prob, batch_count=1332393.3333333333, ans=0.125 2023-10-03 16:31:34,874 WARNING [train.py:1204] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_330-327861-0_sp0.9 from training. Duration: 0.7444375 2023-10-03 16:31:34,882 WARNING [train.py:1204] (2/4) Exclude cut with ID _47790_210_16187_1_1533862814066_4207519_260-317651-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 16:31:34,900 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9044W0010-516654 from training. Duration: 0.896 2023-10-03 16:31:37,745 WARNING [train.py:1204] (2/4) Exclude cut with ID _14087_210_2253_1_1530529224512_3014029_265-118378-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 16:31:38,251 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.4.encoder.layers.2.nonlin_attention.whiten2, num_groups=1, num_channels=384, metric=3.39 vs. limit=15.0 2023-10-03 16:31:38,942 WARNING [train.py:1204] (2/4) Exclude cut with ID _6194_210_1534_1_1527511826160_3941194_321-148641-0_sp0.9 from training. Duration: 0.711125 2023-10-03 16:31:41,587 WARNING [train.py:1204] (2/4) Exclude cut with ID _54871_210_22356_1_1534665798253_3856359_101-169176-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 16:31:41,589 WARNING [train.py:1204] (2/4) Exclude cut with ID 497-129325-0061-62254-0_sp1.1 from training. Duration: 0.97725 2023-10-03 16:31:43,524 WARNING [train.py:1204] (2/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_795-33252-0_sp1.1 from training. Duration: 0.336375 2023-10-03 16:31:43,546 WARNING [train.py:1204] (2/4) Exclude cut with ID _28317_210_12742_1_1532480302381_8510325_168-246176-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 16:31:44,878 WARNING [train.py:1204] (2/4) Exclude cut with ID _7160_210_1876_1_1527940923231_3703799_120-150943-0_sp1.1 from training. Duration: 0.736375 2023-10-03 16:31:46,337 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9084W0005-536844 from training. Duration: 0.768 2023-10-03 16:31:47,884 WARNING [train.py:1204] (2/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_234-260682-0 from training. Duration: 0.84 2023-10-03 16:31:49,812 WARNING [train.py:1204] (2/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_188-114464-0_sp0.9 from training. Duration: 0.911125 2023-10-03 16:31:52,516 WARNING [train.py:1204] (2/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_81-75849-0 from training. Duration: 0.39 2023-10-03 16:31:53,944 WARNING [train.py:1204] (2/4) Exclude cut with ID _14224_210_4381_1_1530529062037_4264010_379-184223-0_sp1.1 from training. Duration: 0.77275 2023-10-03 16:31:55,455 WARNING [train.py:1204] (2/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_454-123002-0_sp1.1 from training. Duration: 0.62725 2023-10-03 16:31:56,799 WARNING [train.py:1204] (2/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_887-249555-0_sp1.1 from training. Duration: 0.7 2023-10-03 16:31:58,265 WARNING [train.py:1204] (2/4) Exclude cut with ID _31088_210_16446_1_1532520012284_3440919_20-260902-0_sp1.1 from training. Duration: 0.97275 2023-10-03 16:31:59,552 WARNING [train.py:1204] (2/4) Exclude cut with ID _45329_210_7005_1_1534157951211_6774339_856-344574-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 16:31:59,554 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_36756_210_7234_1_1533026991_966276_4-164118-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 16:31:59,569 WARNING [train.py:1204] (2/4) Exclude cut with ID _8365_210_2064_1_1528592311070_4677690_371-122295-0_sp1.1 from training. Duration: 0.5 2023-10-03 16:32:01,118 WARNING [train.py:1204] (2/4) Exclude cut with ID _5910_210_1389_1_1527337972454_5050100_555-159815-0_sp1.1 from training. Duration: 0.8909375 2023-10-03 16:32:01,128 WARNING [train.py:1204] (2/4) Exclude cut with ID _37086_210_9038_1_1532999540891_7021250_469-147941-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 16:32:01,192 WARNING [train.py:1204] (2/4) Exclude cut with ID _3067_210_2667_1_1526212974640_4503168_459-173867-0_sp0.9 from training. Duration: 0.97775 2023-10-03 16:32:03,178 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9156W0006-572499 from training. Duration: 0.896 2023-10-03 16:32:04,535 WARNING [train.py:1204] (2/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_277-210798-0 from training. Duration: 0.55 2023-10-03 16:32:05,106 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.4.encoder.layers.2.self_attn_weights.whiten_keys, num_groups=4, num_channels=128, metric=1.70 vs. limit=6.0 2023-10-03 16:32:06,031 WARNING [train.py:1204] (2/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_103-336064-0_sp1.1 from training. Duration: 0.463625 2023-10-03 16:32:07,349 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_3414_210_1988_1_1526212718_4813195_331-290945-0_sp1.1 from training. Duration: 0.936375 2023-10-03 16:32:07,350 WARNING [train.py:1204] (2/4) Exclude cut with ID _67499_210_17282_1_1535509805220_3692430_365-29276-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 16:32:08,752 WARNING [train.py:1204] (2/4) Exclude cut with ID _14821_210_8750_1_1530680503991_6829359_69-250847-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 16:32:08,753 WARNING [train.py:1204] (2/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_1102-61301-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 16:32:08,969 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.bypass_mid.scale_min, batch_count=1332593.3333333333, ans=0.2 2023-10-03 16:32:10,163 WARNING [train.py:1204] (2/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_166-246557-0_sp1.1 from training. Duration: 0.5818125 2023-10-03 16:32:11,578 WARNING [train.py:1204] (2/4) Exclude cut with ID _15351_210_10626_1_1530856635653_3576210_214-129918-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 16:32:11,590 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_120-41668-0_sp0.9 from training. Duration: 0.77775 2023-10-03 16:32:12,866 WARNING [train.py:1204] (2/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_27-72278-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 16:32:14,263 WARNING [train.py:1204] (2/4) Exclude cut with ID _24314_210_4915_1_1532135078116_7170179_521-258868-0_sp1.1 from training. Duration: 0.7181875 2023-10-03 16:32:16,210 WARNING [train.py:1204] (2/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_143-86344-0 from training. Duration: 0.77 2023-10-03 16:32:18,070 WARNING [train.py:1204] (2/4) Exclude cut with ID _53489_210_6228_1_1534298357169_3835970_465-157433-0_sp1.1 from training. Duration: 0.92725 2023-10-03 16:32:18,157 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_248-319353-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 16:32:20,873 INFO [train.py:1046] (2/4) Epoch 38, batch 3350, loss[loss=0.1537, simple_loss=0.2334, pruned_loss=0.03697, over 23415.00 frames. ], tot_loss[loss=0.1575, simple_loss=0.237, pruned_loss=0.03903, over 4704772.91 frames. ], batch size: 105, lr: 2.67e-03, grad_scale: 8.0 2023-10-03 16:32:20,945 WARNING [train.py:1204] (2/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_102-77064-0_sp1.1 from training. Duration: 0.8454375 2023-10-03 16:32:22,304 WARNING [train.py:1204] (2/4) Exclude cut with ID _9389_210_1664_1_1529217337301_5289269_379-128794-0_sp1.1 from training. Duration: 0.77275 2023-10-03 16:32:22,582 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=1332660.0, ans=0.1 2023-10-03 16:32:23,756 WARNING [train.py:1204] (2/4) Exclude cut with ID _3231_210_2106_1_1526205864934_6784220_236-260835-0_sp1.1 from training. Duration: 0.97275 2023-10-03 16:32:23,892 WARNING [train.py:1204] (2/4) Exclude cut with ID _24271_210_12658_1_1531987270022_7190519_184-342363-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 16:32:23,893 WARNING [train.py:1204] (2/4) Exclude cut with ID _27894_210_12392_1_1532415496078_6902439_67-221663-0_sp1.1 from training. Duration: 0.92725 2023-10-03 16:32:27,989 WARNING [train.py:1204] (2/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_304-59227-0_sp1.1 from training. Duration: 0.7 2023-10-03 16:32:29,400 WARNING [train.py:1204] (2/4) Exclude cut with ID _58605_210_12210_1_1534723195939_3904259_458-42232-0_sp1.1 from training. Duration: 0.92725 2023-10-03 16:32:29,485 WARNING [train.py:1204] (2/4) Exclude cut with ID _14922_210_6228_1_1530707435273_3693310_13-165512-0_sp0.9 from training. Duration: 0.911125 2023-10-03 16:32:29,630 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.balancer1.prob, batch_count=1332660.0, ans=0.125 2023-10-03 16:32:33,807 WARNING [train.py:1204] (2/4) Exclude cut with ID _58602_210_2346_1_1534903210085_6908830_792-102241-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 16:32:35,713 WARNING [train.py:1204] (2/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_588-125165-0_sp0.9 from training. Duration: 0.92225 2023-10-03 16:32:37,015 WARNING [train.py:1204] (2/4) Exclude cut with ID _54218_210_12210_1_1534413646388_4409259_239-98429-0_sp1.1 from training. Duration: 0.97275 2023-10-03 16:32:37,079 WARNING [train.py:1204] (2/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_538-32291-0_sp1.1 from training. Duration: 0.7545625 2023-10-03 16:32:38,453 WARNING [train.py:1204] (2/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_419-317062-0 from training. Duration: 0.91 2023-10-03 16:32:39,954 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9309W0202-640329 from training. Duration: 0.896 2023-10-03 16:32:39,981 WARNING [train.py:1204] (2/4) Exclude cut with ID _3231_210_2106_1_1526205864934_6784220_419-313291-0_sp1.1 from training. Duration: 0.97275 2023-10-03 16:32:42,701 WARNING [train.py:1204] (2/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_619-344499-0 from training. Duration: 0.63 2023-10-03 16:32:42,712 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_278-272612-0 from training. Duration: 0.73 2023-10-03 16:32:44,114 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_103-84010-0_sp0.9 from training. Duration: 0.7666875 2023-10-03 16:32:44,133 WARNING [train.py:1204] (2/4) Exclude cut with ID _26528_210_9070_1_1532570410842_3851310_497-97218-0_sp1.1 from training. Duration: 0.7818125 2023-10-03 16:32:44,219 WARNING [train.py:1204] (2/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_262-84613-0_sp1.1 from training. Duration: 0.936375 2023-10-03 16:32:44,255 WARNING [train.py:1204] (2/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_387-131538-0 from training. Duration: 0.81 2023-10-03 16:32:45,617 WARNING [train.py:1204] (2/4) Exclude cut with ID _8030_210_7497_1_1528444886781_3531089_224-300489-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 16:32:45,633 WARNING [train.py:1204] (2/4) Exclude cut with ID _14349_210_8341_1_1530615710818_3442470_170-336541-0_sp0.9 from training. Duration: 0.9555625 2023-10-03 16:32:47,756 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_47069_210_15368_1_1533781267_4431320_180-31746-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 16:32:49,130 WARNING [train.py:1204] (2/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_447-7041-0_sp1.1 from training. Duration: 0.92725 2023-10-03 16:32:50,371 WARNING [train.py:1204] (2/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_2-55283-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 16:32:50,420 WARNING [train.py:1204] (2/4) Exclude cut with ID _12555_210_8750_1_1530075594402_3780239_70-181911-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 16:32:53,201 WARNING [train.py:1204] (2/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_311-328022-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 16:32:54,627 WARNING [train.py:1204] (2/4) Exclude cut with ID _14528_210_8254_1_1530669700514_4306939_330-2987-0_sp1.1 from training. Duration: 0.92725 2023-10-03 16:32:54,684 WARNING [train.py:1204] (2/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_385-184858-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 16:32:58,892 WARNING [train.py:1204] (2/4) Exclude cut with ID _64414_210_17301_1_1535243252048_3757759_348-333360-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 16:32:58,966 WARNING [train.py:1204] (2/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_753-214751-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 16:33:00,140 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.721e+02 1.992e+02 2.171e+02 2.466e+02 3.497e+02, threshold=4.342e+02, percent-clipped=0.0 2023-10-03 16:33:00,336 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_17613_210_2151_1_1531137569_2366160_202-208013-0_sp1.1 from training. Duration: 0.92725 2023-10-03 16:33:00,999 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_43831_210_2158_1_1533725502_5872791_610-129300-0_sp1.1 from training. Duration: 0.936375 2023-10-03 16:33:03,682 WARNING [train.py:1204] (2/4) Exclude cut with ID _54218_210_12210_1_1534413646388_4409259_321-23852-0_sp1.1 from training. Duration: 0.936375 2023-10-03 16:33:05,088 WARNING [train.py:1204] (2/4) Exclude cut with ID _39411_210_5986_1_1533538825030_3343649_173-238248-0 from training. Duration: 0.83 2023-10-03 16:33:06,386 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_405-339986-0_sp0.9 from training. Duration: 0.5666875 2023-10-03 16:33:06,436 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_2009_210_2071_1_1525481134_4588937_385-302100-0 from training. Duration: 0.87 2023-10-03 16:33:06,484 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_161-276996-0_sp1.1 from training. Duration: 0.863625 2023-10-03 16:33:07,813 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_177-58005-0 from training. Duration: 0.62 2023-10-03 16:33:08,081 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.feed_forward1.hidden_balancer.prob, batch_count=1332860.0, ans=0.125 2023-10-03 16:33:09,301 WARNING [train.py:1204] (2/4) Exclude cut with ID _64639_210_5816_1_1535367619437_3528000_146-343734-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 16:33:10,676 WARNING [train.py:1204] (2/4) Exclude cut with ID _3845_210_1390_1_1526731326141_3523610_72-61441-0_sp1.1 from training. Duration: 0.92725 2023-10-03 16:33:16,826 WARNING [train.py:1204] (2/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_991-125028-0_sp1.1 from training. Duration: 0.936375 2023-10-03 16:33:18,170 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7544_210_6753_1_1528591327_1471426_172-348777-0 from training. Duration: 0.66 2023-10-03 16:33:18,213 WARNING [train.py:1204] (2/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_416-257730-0_sp1.1 from training. Duration: 0.5909375 2023-10-03 16:33:18,347 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.0.balancer1.prob, batch_count=1332926.6666666667, ans=0.125 2023-10-03 16:33:20,855 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1113-7369-0_sp1.1 from training. Duration: 0.77275 2023-10-03 16:33:22,234 WARNING [train.py:1204] (2/4) Exclude cut with ID _40402_210_18219_1_1533522324115_2953150_242-76943-0_sp1.1 from training. Duration: 0.8545625 2023-10-03 16:33:27,682 WARNING [train.py:1204] (2/4) Exclude cut with ID _44718_210_5816_1_1534050051869_6469030_190-49648-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 16:33:29,032 WARNING [train.py:1204] (2/4) Exclude cut with ID _47251_210_18628_1_1533814063128_4137250_214-88076-0 from training. Duration: 0.88 2023-10-03 16:33:29,078 WARNING [train.py:1204] (2/4) Exclude cut with ID _5585_210_2667_1_1527418813324_8007340_89-332072-0_sp1.1 from training. Duration: 0.5818125 2023-10-03 16:33:30,396 WARNING [train.py:1204] (2/4) Exclude cut with ID _41817_210_18835_1_1533434477160_7423049_555-293448-0_sp1.1 from training. Duration: 0.72725 2023-10-03 16:33:31,826 WARNING [train.py:1204] (2/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_286-331314-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 16:33:33,594 INFO [train.py:1046] (2/4) Epoch 38, batch 3400, loss[loss=0.1414, simple_loss=0.2205, pruned_loss=0.03114, over 24342.00 frames. ], tot_loss[loss=0.1589, simple_loss=0.2382, pruned_loss=0.03975, over 4710050.81 frames. ], batch size: 56, lr: 2.67e-03, grad_scale: 8.0 2023-10-03 16:33:33,648 WARNING [train.py:1204] (2/4) Exclude cut with ID _13921_210_9994_1_1530838702800_3003429_46-149444-0 from training. Duration: 0.94 2023-10-03 16:33:33,723 WARNING [train.py:1204] (2/4) Exclude cut with ID _44778_210_2054_1_1533798127203_7443090_768-43595-0_sp1.1 from training. Duration: 0.936375 2023-10-03 16:33:35,549 WARNING [train.py:1204] (2/4) Exclude cut with ID _35355_210_16040_1_1533212988977_4016229_74-190076-0 from training. Duration: 0.8 2023-10-03 16:33:35,790 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.ff2_skip_rate, batch_count=1332993.3333333333, ans=0.0 2023-10-03 16:33:36,951 WARNING [train.py:1204] (2/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_1007-316380-0_sp1.1 from training. Duration: 0.963625 2023-10-03 16:33:36,990 WARNING [train.py:1204] (2/4) Exclude cut with ID _23557_210_8750_1_1531890106370_6846255_150-283437-0_sp1.1 from training. Duration: 0.963625 2023-10-03 16:33:38,271 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_3474_210_2028_1_1526188513_4825246_145-309131-0_sp1.1 from training. Duration: 0.67275 2023-10-03 16:33:38,367 WARNING [train.py:1204] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_391-22148-0_sp1.1 from training. Duration: 0.7 2023-10-03 16:33:39,639 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_238-256668-0 from training. Duration: 0.69 2023-10-03 16:33:43,965 WARNING [train.py:1204] (2/4) Exclude cut with ID _8394_210_6702_1_1528590504316_3540189_372-198878-0 from training. Duration: 0.6 2023-10-03 16:33:43,975 WARNING [train.py:1204] (2/4) Exclude cut with ID ID0174W0006-766659 from training. Duration: 0.953 2023-10-03 16:33:43,992 WARNING [train.py:1204] (2/4) Exclude cut with ID _6274_210_2084_1_1527937583837_7494020_668-23019-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 16:33:45,523 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.conv_module1.balancer1.prob, batch_count=1332993.3333333333, ans=0.125 2023-10-03 16:33:48,736 WARNING [train.py:1204] (2/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_322-74328-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 16:33:48,744 WARNING [train.py:1204] (2/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_872-264525-0_sp1.1 from training. Duration: 0.5909375 2023-10-03 16:33:48,785 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_53378_210_17282_1_1534221071_5769509_641-286201-0_sp1.1 from training. Duration: 0.97275 2023-10-03 16:33:48,874 WARNING [train.py:1204] (2/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_199-295611-0_sp1.1 from training. Duration: 0.5 2023-10-03 16:33:49,085 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.bypass.scale_min, batch_count=1333060.0, ans=0.2 2023-10-03 16:33:55,535 WARNING [train.py:1204] (2/4) Exclude cut with ID _5250_210_5219_1_1527249561806_8692110_1270-317971-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 16:33:56,810 WARNING [train.py:1204] (2/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_784-213099-0 from training. Duration: 0.93 2023-10-03 16:33:58,519 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.attention_skip_rate, batch_count=1333060.0, ans=0.0 2023-10-03 16:34:01,210 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_115-233929-0_sp0.9 from training. Duration: 0.8 2023-10-03 16:34:01,361 WARNING [train.py:1204] (2/4) Exclude cut with ID _54871_210_22356_1_1534665798253_3856359_256-223809-0_sp1.1 from training. Duration: 0.97275 2023-10-03 16:34:02,630 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_20120_210_6753_1_1531643035_5482859_285-87781-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 16:34:04,557 WARNING [train.py:1204] (2/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_619-344499-0_sp1.1 from training. Duration: 0.57275 2023-10-03 16:34:10,520 WARNING [train.py:1204] (2/4) Exclude cut with ID _60229_210_27767_1_1534838406158_3655350_264-279573-0_sp0.9 from training. Duration: 0.92225 2023-10-03 16:34:13,385 WARNING [train.py:1204] (2/4) Exclude cut with ID _7719_210_3525_1_1528456153163_4296960_333-110399-0 from training. Duration: 0.96 2023-10-03 16:34:19,846 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_50423_210_13270_1_1534128598_5021734_759-90833-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 16:34:21,126 WARNING [train.py:1204] (2/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_151-254798-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 16:34:21,158 WARNING [train.py:1204] (2/4) Exclude cut with ID _3465_210_3525_1_1526175667060_3574049_450-203637-0 from training. Duration: 0.92 2023-10-03 16:34:21,200 WARNING [train.py:1204] (2/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_673-36456-0_sp1.1 from training. Duration: 0.936375 2023-10-03 16:34:21,247 WARNING [train.py:1204] (2/4) Exclude cut with ID _36538_210_9994_1_1533204020363_3795620_131-8685-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 16:34:22,661 WARNING [train.py:1204] (2/4) Exclude cut with ID _12556_210_8341_1_1530081024100_7327690_713-55626-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 16:34:22,707 WARNING [train.py:1204] (2/4) Exclude cut with ID _2322_210_2667_1_1525604548971_3656299_440-80494-0_sp0.9 from training. Duration: 0.97775 2023-10-03 16:34:25,447 WARNING [train.py:1204] (2/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_210-325081-0_sp1.1 from training. Duration: 0.97275 2023-10-03 16:34:27,343 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.4.encoder.layers.0.conv_module2.whiten, num_groups=1, num_channels=384, metric=9.73 vs. limit=15.0 2023-10-03 16:34:29,467 WARNING [train.py:1204] (2/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_375-244976-0_sp0.9 from training. Duration: 0.8333125 2023-10-03 16:34:29,473 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_15517_210_6753_1_1530952293_1664795_119-31001-0_sp1.1 from training. Duration: 0.8545625 2023-10-03 16:34:34,164 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_41452_210_7291_1_1533437687_3941281_319-42326-0_sp1.1 from training. Duration: 0.92725 2023-10-03 16:34:35,503 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_372-173012-0 from training. Duration: 0.95 2023-10-03 16:34:41,366 WARNING [train.py:1204] (2/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_41-31157-0_sp1.1 from training. Duration: 0.5181875 2023-10-03 16:34:45,656 WARNING [train.py:1204] (2/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_91-212486-0 from training. Duration: 0.68 2023-10-03 16:34:45,935 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=1333326.6666666667, ans=0.1 2023-10-03 16:34:47,546 INFO [train.py:1046] (2/4) Epoch 38, batch 3450, loss[loss=0.1491, simple_loss=0.2116, pruned_loss=0.04329, over 22715.00 frames. ], tot_loss[loss=0.1594, simple_loss=0.2387, pruned_loss=0.04006, over 4701810.28 frames. ], batch size: 322, lr: 2.67e-03, grad_scale: 8.0 2023-10-03 16:34:51,426 WARNING [train.py:1204] (2/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_1267-272693-0 from training. Duration: 0.73 2023-10-03 16:34:52,741 WARNING [train.py:1204] (2/4) Exclude cut with ID _13938_210_9988_1_1530518718978_3693960_152-67046-0_sp1.1 from training. Duration: 0.936375 2023-10-03 16:34:54,165 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_4528_210_3273_1_1527076654_3276777_342-168068-0_sp1.1 from training. Duration: 0.7909375 2023-10-03 16:34:54,174 WARNING [train.py:1204] (2/4) Exclude cut with ID _7606_210_4915_1_1528894253277_5080690_127-177761-0 from training. Duration: 0.64 2023-10-03 16:34:54,239 WARNING [train.py:1204] (2/4) Exclude cut with ID _45329_210_7005_1_1534157951211_6774339_335-137972-0_sp1.1 from training. Duration: 0.92725 2023-10-03 16:34:58,279 WARNING [train.py:1204] (2/4) Exclude cut with ID _5166_210_2084_1_1527073197160_4087993_331-117829-0_sp1.1 from training. Duration: 0.563625 2023-10-03 16:35:03,688 WARNING [train.py:1204] (2/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_125-125818-0_sp1.1 from training. Duration: 0.87275 2023-10-03 16:35:03,747 WARNING [train.py:1204] (2/4) Exclude cut with ID _6789_210_1534_1_1527854471671_2870519_48-294897-0_sp1.1 from training. Duration: 0.963625 2023-10-03 16:35:05,599 WARNING [train.py:1204] (2/4) Exclude cut with ID _6694_210_4599_1_1527852637490_3965529_116-109398-0_sp1.1 from training. Duration: 0.663625 2023-10-03 16:35:05,616 WARNING [train.py:1204] (2/4) Exclude cut with ID _52892_210_6721_1_1534378941259_4124129_222-172685-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 16:35:07,168 WARNING [train.py:1204] (2/4) Exclude cut with ID _50655_210_18628_1_1534229942193_3772154_320-132275-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 16:35:13,108 WARNING [train.py:1204] (2/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_76-311963-0 from training. Duration: 0.7 2023-10-03 16:35:17,134 WARNING [train.py:1204] (2/4) Exclude cut with ID _6274_210_2084_1_1527937583837_7494020_32-205288-0 from training. Duration: 0.78 2023-10-03 16:35:17,157 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_86-16881-0_sp0.9 from training. Duration: 0.6444375 2023-10-03 16:35:17,199 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_271-297505-0_sp1.1 from training. Duration: 0.9 2023-10-03 16:35:19,224 WARNING [train.py:1204] (2/4) Exclude cut with ID _57572_210_5517_1_1534921219624_3809484_187-295293-0_sp1.1 from training. Duration: 0.963625 2023-10-03 16:35:24,735 WARNING [train.py:1204] (2/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_563-329943-0 from training. Duration: 0.99 2023-10-03 16:35:26,076 WARNING [train.py:1204] (2/4) Exclude cut with ID _7160_210_1876_1_1527940923231_3703799_302-150728-0_sp0.9 from training. Duration: 0.8666875 2023-10-03 16:35:26,325 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.conv_module2.balancer1.prob, batch_count=1333460.0, ans=0.125 2023-10-03 16:35:28,779 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.563e+02 1.894e+02 2.065e+02 2.341e+02 2.920e+02, threshold=4.130e+02, percent-clipped=0.0 2023-10-03 16:35:28,936 WARNING [train.py:1204] (2/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_926-7526-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 16:35:28,965 WARNING [train.py:1204] (2/4) Exclude cut with ID _62574_210_2655_1_1535180407058_3933249_467-22348-0_sp1.1 from training. Duration: 0.936375 2023-10-03 16:35:29,161 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.balancer2.prob, batch_count=1333460.0, ans=0.125 2023-10-03 16:35:29,215 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.feed_forward1.out_proj.dropout_p, batch_count=1333460.0, ans=0.1 2023-10-03 16:35:30,299 WARNING [train.py:1204] (2/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_280-161633-0_sp0.9 from training. Duration: 0.811125 2023-10-03 16:35:32,294 WARNING [train.py:1204] (2/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_385-193052-0_sp1.1 from training. Duration: 0.7818125 2023-10-03 16:35:33,677 WARNING [train.py:1204] (2/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_444-28041-0 from training. Duration: 0.85 2023-10-03 16:35:33,682 WARNING [train.py:1204] (2/4) Exclude cut with ID _66070_210_7640_1_1535441393096_7805310_341-249237-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 16:35:34,988 WARNING [train.py:1204] (2/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_425-269826-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 16:35:37,602 WARNING [train.py:1204] (2/4) Exclude cut with ID _53950_210_5816_1_1534417264919_3862951_124-185597-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 16:35:38,212 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.4.encoder.layers.1.feed_forward1.out_whiten, num_groups=1, num_channels=384, metric=8.30 vs. limit=15.0 2023-10-03 16:35:40,272 WARNING [train.py:1204] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_380-204756-0 from training. Duration: 0.86 2023-10-03 16:35:43,546 WARNING [train.py:1204] (2/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_282-257791-0_sp0.9 from training. Duration: 0.9555625 2023-10-03 16:35:48,380 WARNING [train.py:1204] (2/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_372-131163-0_sp1.1 from training. Duration: 0.8545625 2023-10-03 16:35:51,519 WARNING [train.py:1204] (2/4) Exclude cut with ID _28317_210_12742_1_1532480302381_8510325_447-233513-0_sp1.1 from training. Duration: 0.963625 2023-10-03 16:35:54,358 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_54-118333-0_sp1.1 from training. Duration: 0.97275 2023-10-03 16:35:58,579 WARNING [train.py:1204] (2/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_388-230239-0_sp1.1 from training. Duration: 0.963625 2023-10-03 16:35:58,595 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_40047_210_2290_1_1533290519_2374311_141-54639-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 16:35:59,894 WARNING [train.py:1204] (2/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_91-267696-0_sp0.9 from training. Duration: 0.9666875 2023-10-03 16:35:59,930 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_14056_210_4163_1_1530604483_7572827_532-137847-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 16:36:02,545 INFO [train.py:1046] (2/4) Epoch 38, batch 3500, loss[loss=0.1493, simple_loss=0.2284, pruned_loss=0.03513, over 24662.00 frames. ], tot_loss[loss=0.1586, simple_loss=0.2376, pruned_loss=0.03984, over 4693300.53 frames. ], batch size: 65, lr: 2.67e-03, grad_scale: 8.0 2023-10-03 16:36:04,578 WARNING [train.py:1204] (2/4) Exclude cut with ID _8064_210_5399_1_1528372299291_4282973_155-283231-0_sp1.1 from training. Duration: 0.97275 2023-10-03 16:36:06,054 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_2245_210_2715_1_1525511548_3540254_310-52483-0_sp1.1 from training. Duration: 0.8 2023-10-03 16:36:07,438 WARNING [train.py:1204] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_185-174805-0 from training. Duration: 0.68 2023-10-03 16:36:08,898 WARNING [train.py:1204] (2/4) Exclude cut with ID _15012_210_1968_1_1530856820429_5095350_662-210469-0_sp1.1 from training. Duration: 0.5181875 2023-10-03 16:36:11,740 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_426-313557-0_sp0.9 from training. Duration: 0.588875 2023-10-03 16:36:14,965 WARNING [train.py:1204] (2/4) Exclude cut with ID _42326_210_7770_1_1533814282531_3707269_407-204343-0_sp1.1 from training. Duration: 0.97275 2023-10-03 16:36:14,973 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_258-310570-0 from training. Duration: 0.82 2023-10-03 16:36:20,981 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_4737_210_2040_1_1526998689_3293008_378-178650-0_sp1.1 from training. Duration: 0.863625 2023-10-03 16:36:21,074 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_47896_210_20508_1_1534492518_5958868_432-327451-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 16:36:23,029 WARNING [train.py:1204] (2/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_188-114464-0_sp1.1 from training. Duration: 0.7454375 2023-10-03 16:36:23,036 WARNING [train.py:1204] (2/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_798-106179-0_sp1.1 from training. Duration: 0.7545625 2023-10-03 16:36:24,285 WARNING [train.py:1204] (2/4) Exclude cut with ID _14368_210_4220_1_1530532714870_3595529_143-117421-0_sp1.1 from training. Duration: 0.52725 2023-10-03 16:36:24,324 WARNING [train.py:1204] (2/4) Exclude cut with ID _67499_210_17282_1_1535509805220_3692430_361-78786-0_sp1.1 from training. Duration: 0.963625 2023-10-03 16:36:24,361 WARNING [train.py:1204] (2/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_712-342563-0_sp1.1 from training. Duration: 0.8818125 2023-10-03 16:36:25,824 WARNING [train.py:1204] (2/4) Exclude cut with ID _9235_210_5219_1_1528891154253_8046120_387-159008-0 from training. Duration: 0.86 2023-10-03 16:36:28,593 WARNING [train.py:1204] (2/4) Exclude cut with ID _33640_210_2655_1_1533117605479_4855469_959-18820-0_sp1.1 from training. Duration: 0.963625 2023-10-03 16:36:29,846 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_12868_210_6753_1_1530702479_3610564_133-564-0_sp0.9 from training. Duration: 0.87775 2023-10-03 16:36:29,965 WARNING [train.py:1204] (2/4) Exclude cut with ID _23514_210_1389_1_1531832139785_3031439_12-29287-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 16:36:34,713 WARNING [train.py:1204] (2/4) Exclude cut with ID _23514_210_1389_1_1531832139785_3031439_489-185052-0_sp1.1 from training. Duration: 0.963625 2023-10-03 16:36:36,086 WARNING [train.py:1204] (2/4) Exclude cut with ID _5444_210_2151_1_1527418808464_5312210_634-194174-0 from training. Duration: 0.97 2023-10-03 16:36:36,111 WARNING [train.py:1204] (2/4) Exclude cut with ID _6275_210_2084_1_1528110062213_7669049_91-26919-0_sp1.1 from training. Duration: 0.8818125 2023-10-03 16:36:38,927 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_37351_210_7925_1_1533012931_3812113_516-82303-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 16:36:41,470 WARNING [train.py:1204] (2/4) Exclude cut with ID _41314_210_12651_1_1533459476543_3766460_80-74184-0_sp0.9 from training. Duration: 0.92225 2023-10-03 16:36:41,717 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.bypass.skip_rate, batch_count=1333793.3333333333, ans=0.09899494936611666 2023-10-03 16:36:42,824 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_9460_210_4211_1_1528954587_10049667_495-202963-0_sp1.1 from training. Duration: 0.963625 2023-10-03 16:36:44,165 WARNING [train.py:1204] (2/4) Exclude cut with ID _14632_210_9988_1_1530691279392_3923200_332-105904-0_sp1.1 from training. Duration: 0.8181875 2023-10-03 16:36:44,181 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_40764_210_1925_1_1533882359_3752345_493-167886-0_sp1.1 from training. Duration: 0.92725 2023-10-03 16:36:44,758 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.4.encoder.layers.1.nonlin_attention.whiten2, num_groups=1, num_channels=384, metric=3.31 vs. limit=15.0 2023-10-03 16:36:45,551 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_196-289637-0 from training. Duration: 0.75 2023-10-03 16:36:45,620 WARNING [train.py:1204] (2/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_26-175553-0 from training. Duration: 0.61 2023-10-03 16:36:47,533 WARNING [train.py:1204] (2/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_890-5117-0 from training. Duration: 0.78 2023-10-03 16:36:47,783 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.bypass.skip_rate, batch_count=1333860.0, ans=0.07 2023-10-03 16:36:48,820 WARNING [train.py:1204] (2/4) Exclude cut with ID _41314_210_12651_1_1533459476543_3766460_206-306860-0_sp1.1 from training. Duration: 0.92725 2023-10-03 16:36:48,949 WARNING [train.py:1204] (2/4) Exclude cut with ID _25882_210_3036_1_1532775665815_6479510_570-79824-0_sp1.1 from training. Duration: 0.963625 2023-10-03 16:36:49,039 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder_embed.convnext.hidden_balancer.prob, batch_count=1333860.0, ans=0.125 2023-10-03 16:36:50,668 WARNING [train.py:1204] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_731-2428-0_sp1.1 from training. Duration: 0.7545625 2023-10-03 16:36:50,703 WARNING [train.py:1204] (2/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_138-154812-0_sp0.9 from training. Duration: 0.8666875 2023-10-03 16:36:53,383 WARNING [train.py:1204] (2/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_178-39722-0_sp0.9 from training. Duration: 0.5444375 2023-10-03 16:36:53,440 WARNING [train.py:1204] (2/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_1285-256622-0_sp0.9 from training. Duration: 0.9333125 2023-10-03 16:36:59,577 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_41428_210_2161_1_1533774184_3992532_65-271323-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 16:37:00,871 WARNING [train.py:1204] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_308-161067-0 from training. Duration: 0.66 2023-10-03 16:37:00,879 WARNING [train.py:1204] (2/4) Exclude cut with ID _3067_210_2667_1_1526212974640_4503168_459-173867-0 from training. Duration: 0.88 2023-10-03 16:37:00,880 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_2221_210_1988_1_1525597051_4935541_415-296882-0_sp1.1 from training. Duration: 0.7 2023-10-03 16:37:03,529 WARNING [train.py:1204] (2/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_354-192266-0_sp1.1 from training. Duration: 0.936375 2023-10-03 16:37:05,345 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_258-310570-0_sp0.9 from training. Duration: 0.911125 2023-10-03 16:37:06,746 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_548-141039-0_sp1.1 from training. Duration: 0.963625 2023-10-03 16:37:09,502 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_15101_210_7927_1_1530786415_5033274_320-327527-0 from training. Duration: 0.96 2023-10-03 16:37:09,556 WARNING [train.py:1204] (2/4) Exclude cut with ID _2895_210_2477_1_1525956785794_4063750_289-11475-0_sp0.9 from training. Duration: 0.911125 2023-10-03 16:37:10,937 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7543_210_6753_1_1528543318_4552_1-85072-0_sp1.1 from training. Duration: 0.7545625 2023-10-03 16:37:12,329 WARNING [train.py:1204] (2/4) Exclude cut with ID _64639_210_5816_1_1535367619437_3528000_461-329156-0 from training. Duration: 0.96 2023-10-03 16:37:13,770 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_211-339622-0 from training. Duration: 0.45 2023-10-03 16:37:16,911 INFO [train.py:1046] (2/4) Epoch 38, batch 3550, loss[loss=0.1525, simple_loss=0.2266, pruned_loss=0.03917, over 23368.00 frames. ], tot_loss[loss=0.1576, simple_loss=0.2363, pruned_loss=0.03947, over 4690693.65 frames. ], batch size: 285, lr: 2.67e-03, grad_scale: 8.0 2023-10-03 16:37:16,997 WARNING [train.py:1204] (2/4) Exclude cut with ID _20455_210_5219_1_1531565996775_8098100_402-112739-0_sp1.1 from training. Duration: 0.963625 2023-10-03 16:37:17,085 WARNING [train.py:1204] (2/4) Exclude cut with ID _53489_210_6228_1_1534298357169_3835970_256-115565-0_sp1.1 from training. Duration: 0.936375 2023-10-03 16:37:18,384 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_26326_210_7925_1_1533002493_3592406_360-305260-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 16:37:18,415 WARNING [train.py:1204] (2/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_18-204383-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 16:37:22,494 WARNING [train.py:1204] (2/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_306-232480-0_sp1.1 from training. Duration: 0.87275 2023-10-03 16:37:30,080 WARNING [train.py:1204] (2/4) Exclude cut with ID _41161_210_20034_1_1533431249657_3555314_116-299186-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 16:37:32,707 WARNING [train.py:1204] (2/4) Exclude cut with ID _13913_210_9994_1_1530579615187_3383897_473-277682-0_sp1.1 from training. Duration: 0.3545625 2023-10-03 16:37:36,052 WARNING [train.py:1204] (2/4) Exclude cut with ID _58895_210_8750_1_1535184081320_3649089_49-225702-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 16:37:36,125 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_3080_210_2071_1_1526086102_4365166_312-8726-0_sp1.1 from training. Duration: 0.82725 2023-10-03 16:37:37,522 WARNING [train.py:1204] (2/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_489-190175-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 16:37:37,589 WARNING [train.py:1204] (2/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_345-95717-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 16:37:37,602 WARNING [train.py:1204] (2/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_85-138257-0_sp1.1 from training. Duration: 0.6454375 2023-10-03 16:37:39,308 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.bypass.scale_min, batch_count=1334060.0, ans=0.2 2023-10-03 16:37:40,541 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7046_210_6753_1_1527895337_6920917_783-278109-0_sp1.1 from training. Duration: 0.7 2023-10-03 16:37:41,819 WARNING [train.py:1204] (2/4) Exclude cut with ID _3465_210_3525_1_1526175667060_3574049_450-203637-0_sp1.1 from training. Duration: 0.836375 2023-10-03 16:37:41,865 WARNING [train.py:1204] (2/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_416-132893-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 16:37:43,168 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_2655_210_2087_1_1525688959_3897400_437-337690-0_sp0.9 from training. Duration: 0.7 2023-10-03 16:37:43,231 WARNING [train.py:1204] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_700-183549-0_sp1.1 from training. Duration: 0.6181875 2023-10-03 16:37:47,962 WARNING [train.py:1204] (2/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_191-59784-0_sp1.1 from training. Duration: 0.8 2023-10-03 16:37:47,973 WARNING [train.py:1204] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_218-295097-0_sp1.1 from training. Duration: 0.7 2023-10-03 16:37:49,372 WARNING [train.py:1204] (2/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_310-109461-0_sp1.1 from training. Duration: 0.72725 2023-10-03 16:37:49,376 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_33348_210_5632_1_1533086912_3324030_131-321149-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 16:37:50,793 WARNING [train.py:1204] (2/4) Exclude cut with ID _64135_210_2253_1_1535677267876_7608972_133-188266-0_sp1.1 from training. Duration: 0.736375 2023-10-03 16:37:50,811 WARNING [train.py:1204] (2/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_505-205270-0 from training. Duration: 0.38 2023-10-03 16:37:50,822 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_67223_210_4238_1_1535612120_2537431_102-88248-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 16:37:50,937 WARNING [train.py:1204] (2/4) Exclude cut with ID _54871_210_22356_1_1534665798253_3856359_60-235433-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 16:37:52,941 WARNING [train.py:1204] (2/4) Exclude cut with ID _5189_210_4557_1_1527248319428_3769229_199-53526-0_sp1.1 from training. Duration: 0.3818125 2023-10-03 16:37:57,015 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.575e+02 2.030e+02 2.255e+02 2.585e+02 3.418e+02, threshold=4.510e+02, percent-clipped=0.0 2023-10-03 16:37:58,466 WARNING [train.py:1204] (2/4) Exclude cut with ID _45329_210_7005_1_1534157951211_6774339_439-90979-0_sp1.1 from training. Duration: 0.963625 2023-10-03 16:37:58,534 WARNING [train.py:1204] (2/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_1162-132880-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 16:37:59,855 WARNING [train.py:1204] (2/4) Exclude cut with ID _56423_210_5854_1_1534896061049_3165969_220-22317-0_sp1.1 from training. Duration: 0.963625 2023-10-03 16:38:01,292 WARNING [train.py:1204] (2/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_909-64151-0 from training. Duration: 0.94 2023-10-03 16:38:01,494 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.conv_module1.balancer2.prob, batch_count=1334193.3333333333, ans=0.125 2023-10-03 16:38:02,639 WARNING [train.py:1204] (2/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_465-156500-0_sp0.9 from training. Duration: 0.8 2023-10-03 16:38:04,555 WARNING [train.py:1204] (2/4) Exclude cut with ID _44304_210_15374_1_1533639352765_4613129_302-22084-0 from training. Duration: 0.86 2023-10-03 16:38:04,598 WARNING [train.py:1204] (2/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_663-166973-0_sp1.1 from training. Duration: 0.72725 2023-10-03 16:38:06,045 WARNING [train.py:1204] (2/4) Exclude cut with ID _45329_210_7005_1_1534157951211_6774339_503-287883-0_sp0.9 from training. Duration: 0.97775 2023-10-03 16:38:07,346 WARNING [train.py:1204] (2/4) Exclude cut with ID _60057_210_10640_1_1534903065793_3333150_362-230377-0_sp0.9 from training. Duration: 0.9666875 2023-10-03 16:38:10,149 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_4183_210_2087_1_1526729100_1869838_143-280962-0 from training. Duration: 0.56 2023-10-03 16:38:11,459 WARNING [train.py:1204] (2/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_281-1313-0_sp1.1 from training. Duration: 0.936375 2023-10-03 16:38:17,611 WARNING [train.py:1204] (2/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_64-215118-0_sp1.1 from training. Duration: 0.936375 2023-10-03 16:38:17,678 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_2822_210_2715_1_1526112586_1999587_165-36656-0 from training. Duration: 0.89 2023-10-03 16:38:17,722 WARNING [train.py:1204] (2/4) Exclude cut with ID _22961_210_8254_1_1531976366672_4278180_402-208830-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 16:38:19,191 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.conv_module2.balancer1.prob, batch_count=1334260.0, ans=0.125 2023-10-03 16:38:21,723 WARNING [train.py:1204] (2/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_98-193494-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 16:38:21,818 WARNING [train.py:1204] (2/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_383-285196-0 from training. Duration: 0.97 2023-10-03 16:38:29,118 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6767_210_3273_1_1527855783_1718932_142-127132-0 from training. Duration: 0.95 2023-10-03 16:38:29,146 WARNING [train.py:1204] (2/4) Exclude cut with ID _39867_210_9826_1_1533553222030_9344259_1018-339635-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 16:38:29,216 WARNING [train.py:1204] (2/4) Exclude cut with ID _14603_210_4220_1_1530615688441_3097319_274-274798-0_sp1.1 from training. Duration: 0.8909375 2023-10-03 16:38:29,588 INFO [scaling.py:1118] (2/4) WithLoss: name=encoder.encoders.5.encoder.layers.0.self_attn_weights, loss-sum=0.000e+00 2023-10-03 16:38:30,633 INFO [train.py:1046] (2/4) Epoch 38, batch 3600, loss[loss=0.1371, simple_loss=0.2161, pruned_loss=0.02912, over 24420.00 frames. ], tot_loss[loss=0.1571, simple_loss=0.2356, pruned_loss=0.03929, over 4688088.15 frames. ], batch size: 58, lr: 2.67e-03, grad_scale: 16.0 2023-10-03 16:38:30,777 WARNING [train.py:1204] (2/4) Exclude cut with ID _2334_210_2667_1_1525608238880_4705129_376-263393-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 16:38:32,756 WARNING [train.py:1204] (2/4) Exclude cut with ID _29303_210_2575_1_1532392237303_7410649_363-344675-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 16:38:32,836 WARNING [train.py:1204] (2/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_58-68956-0_sp1.1 from training. Duration: 0.7545625 2023-10-03 16:38:33,102 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.ff2_skip_rate, batch_count=1334326.6666666667, ans=0.0 2023-10-03 16:38:33,109 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.self_attn_weights.pos_emb_skip_rate, batch_count=1334326.6666666667, ans=0.0 2023-10-03 16:38:35,335 WARNING [train.py:1204] (2/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_709-311571-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 16:38:36,682 WARNING [train.py:1204] (2/4) Exclude cut with ID _46067_210_4220_1_1534125571066_3307450_136-257407-0_sp1.1 from training. Duration: 0.963625 2023-10-03 16:38:38,025 WARNING [train.py:1204] (2/4) Exclude cut with ID _6493_210_1747_1_1528459210424_4012470_324-323854-0_sp1.1 from training. Duration: 0.836375 2023-10-03 16:38:39,255 WARNING [train.py:1204] (2/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_234-260682-0_sp1.1 from training. Duration: 0.763625 2023-10-03 16:38:39,314 WARNING [train.py:1204] (2/4) Exclude cut with ID _36634_210_1794_1_1533296352450_1399884_55-119451-0_sp1.1 from training. Duration: 0.963625 2023-10-03 16:38:39,317 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_40047_210_2290_1_1533290519_2374311_195-165693-0 from training. Duration: 0.89 2023-10-03 16:38:42,070 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_5522_210_3273_1_1527249606_3909096_366-172508-0_sp0.9 from training. Duration: 0.7333125 2023-10-03 16:38:42,147 WARNING [train.py:1204] (2/4) Exclude cut with ID _21878_210_3525_1_1531738772002_7264330_187-10504-0_sp1.1 from training. Duration: 0.963625 2023-10-03 16:38:45,441 WARNING [train.py:1204] (2/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_282-257791-0_sp1.1 from training. Duration: 0.7818125 2023-10-03 16:38:48,803 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_43831_210_2158_1_1533725502_5872791_467-288776-0_sp1.1 from training. Duration: 0.92725 2023-10-03 16:38:48,898 WARNING [train.py:1204] (2/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_138-154812-0_sp1.1 from training. Duration: 0.7090625 2023-10-03 16:38:48,940 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_1154-185930-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 16:38:50,725 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_2655_210_2087_1_1525688959_3897400_52-279899-0 from training. Duration: 0.69 2023-10-03 16:38:50,796 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_141-89113-0_sp1.1 from training. Duration: 0.7818125 2023-10-03 16:38:53,587 WARNING [train.py:1204] (2/4) Exclude cut with ID _55134_210_21982_1_1534590028377_3882170_297-47310-0_sp1.1 from training. Duration: 0.963625 2023-10-03 16:38:54,783 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_486-151131-0_sp1.1 from training. Duration: 0.77275 2023-10-03 16:38:56,156 WARNING [train.py:1204] (2/4) Exclude cut with ID _44778_210_2054_1_1533798127203_7443090_21-232980-0_sp1.1 from training. Duration: 0.936375 2023-10-03 16:38:57,568 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6165_210_3528_1_1527590982_4379841_338-106380-0_sp1.1 from training. Duration: 0.92725 2023-10-03 16:38:57,637 WARNING [train.py:1204] (2/4) Exclude cut with ID _55162_210_17071_1_1534408248583_3753480_355-257218-0_sp1.1 from training. Duration: 0.97275 2023-10-03 16:38:58,924 WARNING [train.py:1204] (2/4) Exclude cut with ID _2895_210_2477_1_1525956785794_4063750_289-11475-0 from training. Duration: 0.82 2023-10-03 16:39:07,494 WARNING [train.py:1204] (2/4) Exclude cut with ID _16756_210_4915_1_1531047803763_7104259_839-183431-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 16:39:07,590 WARNING [train.py:1204] (2/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_427-208901-0_sp0.9 from training. Duration: 0.7555625 2023-10-03 16:39:08,932 WARNING [train.py:1204] (2/4) Exclude cut with ID _36512_210_6987_1_1533033143998_3126069_181-306500-0 from training. Duration: 0.95 2023-10-03 16:39:09,103 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.1.feed_forward1.out_proj.dropout_p, batch_count=1334460.0, ans=0.1 2023-10-03 16:39:13,705 WARNING [train.py:1204] (2/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_28-70987-0_sp1.1 from training. Duration: 0.7909375 2023-10-03 16:39:15,996 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.2.encoder.layers.0.conv_module1.whiten, num_groups=1, num_channels=384, metric=6.45 vs. limit=15.0 2023-10-03 16:39:18,246 WARNING [train.py:1204] (2/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_590-58996-0_sp1.1 from training. Duration: 0.936375 2023-10-03 16:39:21,462 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_48730_210_5399_1_1533885886_2212769_108-47561-0_sp1.1 from training. Duration: 0.936375 2023-10-03 16:39:25,850 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.bypass.scale_min, batch_count=1334526.6666666667, ans=0.2 2023-10-03 16:39:27,065 WARNING [train.py:1204] (2/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_401-158239-0_sp0.9 from training. Duration: 0.788875 2023-10-03 16:39:28,468 WARNING [train.py:1204] (2/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_278-154492-0_sp1.1 from training. Duration: 0.6909375 2023-10-03 16:39:28,475 WARNING [train.py:1204] (2/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_137-116442-0 from training. Duration: 0.78 2023-10-03 16:39:29,980 WARNING [train.py:1204] (2/4) Exclude cut with ID _3541_210_2477_1_1526475408709_3263973_189-317158-0 from training. Duration: 0.9 2023-10-03 16:39:30,181 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.feed_forward1.out_proj.dropout_p, batch_count=1334593.3333333333, ans=0.1 2023-10-03 16:39:31,352 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_47896_210_20508_1_1534492518_5958868_558-314572-0 from training. Duration: 0.97 2023-10-03 16:39:32,789 WARNING [train.py:1204] (2/4) Exclude cut with ID _66070_210_7640_1_1535441393096_7805310_290-40284-0_sp1.1 from training. Duration: 0.97275 2023-10-03 16:39:33,453 WARNING [train.py:1204] (2/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_110-224688-0_sp1.1 from training. Duration: 0.8818125 2023-10-03 16:39:34,745 WARNING [train.py:1204] (2/4) Exclude cut with ID _7719_210_3525_1_1528456153163_4296960_88-250259-0 from training. Duration: 0.84 2023-10-03 16:39:34,795 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8453_210_4211_1_1528502940_8562730_1157-114102-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 16:39:34,820 WARNING [train.py:1204] (2/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_317-175062-0_sp1.1 from training. Duration: 0.7454375 2023-10-03 16:39:34,825 WARNING [train.py:1204] (2/4) Exclude cut with ID _29857_210_20034_1_1532395886753_3798160_254-347254-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 16:39:36,232 WARNING [train.py:1204] (2/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_28-70987-0 from training. Duration: 0.87 2023-10-03 16:39:36,306 WARNING [train.py:1204] (2/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_321-335475-0 from training. Duration: 0.45 2023-10-03 16:39:37,783 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.0.feed_forward1.out_proj.dropout_p, batch_count=1334593.3333333333, ans=0.1 2023-10-03 16:39:39,066 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder_embed.conv.8.prob, batch_count=1334593.3333333333, ans=0.125 2023-10-03 16:39:40,333 WARNING [train.py:1204] (2/4) Exclude cut with ID _40901_210_5665_1_1533363789958_3856130_356-107330-0_sp1.1 from training. Duration: 0.936375 2023-10-03 16:39:40,400 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_3780_210_2087_1_1526467389_2145572_176-107140-0 from training. Duration: 0.69 2023-10-03 16:39:43,941 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.2.encoder.layers.2.conv_module1.whiten, num_groups=1, num_channels=384, metric=6.62 vs. limit=15.0 2023-10-03 16:39:44,416 INFO [train.py:1046] (2/4) Epoch 38, batch 3650, loss[loss=0.1604, simple_loss=0.2327, pruned_loss=0.04411, over 23410.00 frames. ], tot_loss[loss=0.1574, simple_loss=0.2362, pruned_loss=0.03928, over 4698888.96 frames. ], batch size: 285, lr: 2.67e-03, grad_scale: 16.0 2023-10-03 16:39:46,151 WARNING [train.py:1204] (2/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_80-135200-0 from training. Duration: 0.79 2023-10-03 16:39:48,100 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_33348_210_5632_1_1533086912_3324030_85-28583-0_sp0.9 from training. Duration: 0.911125 2023-10-03 16:39:54,584 WARNING [train.py:1204] (2/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_29-293896-0 from training. Duration: 0.61 2023-10-03 16:39:54,713 WARNING [train.py:1204] (2/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_194-252800-0 from training. Duration: 0.97 2023-10-03 16:39:57,471 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_360-52446-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 16:39:57,473 WARNING [train.py:1204] (2/4) Exclude cut with ID _9389_210_1664_1_1529217337301_5289269_289-213417-0_sp0.9 from training. Duration: 0.988875 2023-10-03 16:39:57,527 WARNING [train.py:1204] (2/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_143-86344-0_sp0.9 from training. Duration: 0.8555625 2023-10-03 16:39:57,722 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.self_attn_weights.pos_emb_skip_rate, batch_count=1334660.0, ans=0.0 2023-10-03 16:40:00,399 WARNING [train.py:1204] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_520-126729-0_sp0.9 from training. Duration: 0.77775 2023-10-03 16:40:00,430 WARNING [train.py:1204] (2/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_76-308663-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 16:40:02,303 WARNING [train.py:1204] (2/4) Exclude cut with ID _8021_210_7994_1_1528376451358_3653069_100-140231-0 from training. Duration: 0.67 2023-10-03 16:40:02,361 WARNING [train.py:1204] (2/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_387-131538-0_sp0.9 from training. Duration: 0.9 2023-10-03 16:40:02,383 WARNING [train.py:1204] (2/4) Exclude cut with ID _6275_210_2084_1_1528110062213_7669049_365-182054-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 16:40:03,516 WARNING [train.py:1204] (2/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_345-340690-0 from training. Duration: 0.93 2023-10-03 16:40:04,863 WARNING [train.py:1204] (2/4) Exclude cut with ID _5409_210_1388_1_1527292850189_5865191_178-127970-0_sp0.9 from training. Duration: 0.6333125 2023-10-03 16:40:04,918 WARNING [train.py:1204] (2/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_352-12734-0_sp1.1 from training. Duration: 0.92725 2023-10-03 16:40:04,921 WARNING [train.py:1204] (2/4) Exclude cut with ID _65134_210_14763_1_1535335393985_3505368_121-129373-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 16:40:07,535 WARNING [train.py:1204] (2/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_557-324258-0_sp1.1 from training. Duration: 0.736375 2023-10-03 16:40:09,040 WARNING [train.py:1204] (2/4) Exclude cut with ID _48678_210_5663_1_1533988830947_3769199_306-255886-0 from training. Duration: 0.77 2023-10-03 16:40:11,664 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7544_210_6753_1_1528587000_691468_87-178599-0 from training. Duration: 0.87 2023-10-03 16:40:12,968 WARNING [train.py:1204] (2/4) Exclude cut with ID _2941_210_2577_1_1526199673150_2489759_208-236402-0_sp1.1 from training. Duration: 0.963625 2023-10-03 16:40:14,404 WARNING [train.py:1204] (2/4) Exclude cut with ID _38670_210_15379_1_1533448810659_4391059_65-169397-0 from training. Duration: 0.97 2023-10-03 16:40:15,730 WARNING [train.py:1204] (2/4) Exclude cut with ID _30304_210_6319_1_1532476827261_7582060_306-328175-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 16:40:15,746 WARNING [train.py:1204] (2/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_212-149947-0_sp1.1 from training. Duration: 0.663625 2023-10-03 16:40:18,651 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.bypass_mid.scale_min, batch_count=1334793.3333333333, ans=0.2 2023-10-03 16:40:22,209 WARNING [train.py:1204] (2/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_321-22821-0_sp1.1 from training. Duration: 0.8090625 2023-10-03 16:40:25,393 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.584e+02 1.902e+02 2.048e+02 2.259e+02 3.256e+02, threshold=4.095e+02, percent-clipped=0.0 2023-10-03 16:40:25,488 WARNING [train.py:1204] (2/4) Exclude cut with ID _44010_210_4525_1_1533884262964_4013620_483-69110-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 16:40:25,507 WARNING [train.py:1204] (2/4) Exclude cut with ID _14922_210_6228_1_1530707435273_3693310_38-187851-0_sp1.1 from training. Duration: 0.563625 2023-10-03 16:40:26,872 WARNING [train.py:1204] (2/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_129-337802-0_sp0.9 from training. Duration: 0.8 2023-10-03 16:40:26,943 WARNING [train.py:1204] (2/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_268-87909-0_sp1.1 from training. Duration: 0.9 2023-10-03 16:40:28,383 WARNING [train.py:1204] (2/4) Exclude cut with ID _21878_210_3525_1_1531738772002_7264330_559-184862-0_sp1.1 from training. Duration: 0.8545625 2023-10-03 16:40:31,232 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_46032_210_14656_1_1533781412_4507969_237-120766-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 16:40:33,036 WARNING [train.py:1204] (2/4) Exclude cut with ID _28427_210_4220_1_1532581240967_3061069_330-170906-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 16:40:33,049 WARNING [train.py:1204] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_401-51578-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 16:40:34,450 WARNING [train.py:1204] (2/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_209-205365-0_sp1.1 from training. Duration: 0.7181875 2023-10-03 16:40:35,706 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_23801_210_16223_1_1532066392_3294657_57-242170-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 16:40:37,151 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_127-37371-0_sp1.1 from training. Duration: 0.936375 2023-10-03 16:40:37,350 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=1334860.0, ans=0.1 2023-10-03 16:40:41,348 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9171W0019-578905 from training. Duration: 0.896 2023-10-03 16:40:44,077 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_24605_210_1794_1_1532047863_4149755_349-49056-0_sp1.1 from training. Duration: 0.92725 2023-10-03 16:40:44,087 WARNING [train.py:1204] (2/4) Exclude cut with ID _12302_210_5219_1_1529924446466_8107280_897-21237-0_sp1.1 from training. Duration: 0.936375 2023-10-03 16:40:46,626 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_9460_210_4211_1_1528954587_10049667_480-260269-0_sp0.9 from training. Duration: 0.62225 2023-10-03 16:40:46,683 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_348-152548-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 16:40:47,979 WARNING [train.py:1204] (2/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_301-330455-0_sp0.9 from training. Duration: 0.888875 2023-10-03 16:40:48,680 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.3.encoder.layers.1.conv_module2.whiten, num_groups=1, num_channels=512, metric=3.09 vs. limit=15.0 2023-10-03 16:40:49,847 WARNING [train.py:1204] (2/4) Exclude cut with ID _61008_210_10633_1_1534852769198_6974370_345-91705-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 16:40:52,590 WARNING [train.py:1204] (2/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_304-273433-0 from training. Duration: 0.99 2023-10-03 16:40:52,593 WARNING [train.py:1204] (2/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_359-345972-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 16:40:56,635 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_2296_210_2087_1_1525503448_3397335_57-185430-0_sp1.1 from training. Duration: 0.5454375 2023-10-03 16:40:56,836 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.conv_module2.balancer2.prob, batch_count=1334926.6666666667, ans=0.125 2023-10-03 16:40:58,043 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_1256-120974-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 16:40:58,226 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.conv_module2.balancer1.prob, batch_count=1334993.3333333333, ans=0.125 2023-10-03 16:40:59,336 INFO [train.py:1046] (2/4) Epoch 38, batch 3700, loss[loss=0.1547, simple_loss=0.2463, pruned_loss=0.03156, over 24586.00 frames. ], tot_loss[loss=0.1578, simple_loss=0.2368, pruned_loss=0.03943, over 4716376.55 frames. ], batch size: 71, lr: 2.67e-03, grad_scale: 8.0 2023-10-03 16:40:59,398 WARNING [train.py:1204] (2/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_372-30302-0_sp0.9 from training. Duration: 0.9555625 2023-10-03 16:41:02,140 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_42012_210_7927_1_1533439936_3871556_451-344262-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 16:41:02,141 WARNING [train.py:1204] (2/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_28-324729-0 from training. Duration: 0.94 2023-10-03 16:41:02,157 WARNING [train.py:1204] (2/4) Exclude cut with ID _64135_210_2253_1_1535677267876_7608972_313-18394-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 16:41:03,404 WARNING [train.py:1204] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_620-276793-0_sp0.9 from training. Duration: 0.6555625 2023-10-03 16:41:03,429 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7544_210_6753_1_1528591327_1471426_172-348777-0_sp0.9 from training. Duration: 0.7333125 2023-10-03 16:41:07,981 WARNING [train.py:1204] (2/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_332-221057-0_sp0.9 from training. Duration: 0.7555625 2023-10-03 16:41:08,182 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.0.ff3_skip_rate, batch_count=1334993.3333333333, ans=0.0 2023-10-03 16:41:10,900 WARNING [train.py:1204] (2/4) Exclude cut with ID _19023_210_4353_1_1531378892552_5820780_5-300035-0_sp1.1 from training. Duration: 0.92725 2023-10-03 16:41:10,951 WARNING [train.py:1204] (2/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_343-309023-0_sp1.1 from training. Duration: 0.963625 2023-10-03 16:41:12,184 WARNING [train.py:1204] (2/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_37-41919-0_sp1.1 from training. Duration: 0.7909375 2023-10-03 16:41:13,500 WARNING [train.py:1204] (2/4) Exclude cut with ID _3541_210_2477_1_1526475408709_3263973_41-182910-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 16:41:13,560 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_229-45300-0_sp0.9 from training. Duration: 0.6333125 2023-10-03 16:41:13,855 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.feed_forward1.out_proj.dropout_p, batch_count=1335060.0, ans=0.1 2023-10-03 16:41:14,930 WARNING [train.py:1204] (2/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_40-214716-0_sp1.1 from training. Duration: 0.963625 2023-10-03 16:41:16,351 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9309W0203-640330 from training. Duration: 0.896 2023-10-03 16:41:21,144 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.3.encoder.layers.2.conv_module2.whiten, num_groups=1, num_channels=512, metric=5.28 vs. limit=15.0 2023-10-03 16:41:23,831 WARNING [train.py:1204] (2/4) Exclude cut with ID _60229_210_27767_1_1534838406158_3655350_264-279573-0_sp1.1 from training. Duration: 0.7545625 2023-10-03 16:41:23,869 WARNING [train.py:1204] (2/4) Exclude cut with ID _8394_210_6702_1_1528590504316_3540189_372-198878-0_sp0.9 from training. Duration: 0.6666875 2023-10-03 16:41:26,987 WARNING [train.py:1204] (2/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_359-118926-0_sp1.1 from training. Duration: 0.6545625 2023-10-03 16:41:27,000 WARNING [train.py:1204] (2/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_85-65056-0 from training. Duration: 0.96 2023-10-03 16:41:27,018 WARNING [train.py:1204] (2/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_89-242335-0_sp1.1 from training. Duration: 0.863625 2023-10-03 16:41:31,696 WARNING [train.py:1204] (2/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_128-49444-0_sp1.1 from training. Duration: 0.936375 2023-10-03 16:41:31,776 WARNING [train.py:1204] (2/4) Exclude cut with ID _30327_210_16049_1_1532484161295_3924169_401-70819-0 from training. Duration: 0.86 2023-10-03 16:41:33,279 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_41822_210_13270_1_1533955976_4379284_260-256391-0_sp1.1 from training. Duration: 0.936375 2023-10-03 16:41:34,649 WARNING [train.py:1204] (2/4) Exclude cut with ID _67335_210_5219_1_1535525975652_4275229_599-247317-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 16:41:35,416 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.2.encoder.layers.1.conv_module2.whiten, num_groups=1, num_channels=384, metric=4.55 vs. limit=15.0 2023-10-03 16:41:39,004 WARNING [train.py:1204] (2/4) Exclude cut with ID _29857_210_20034_1_1532395886753_3798160_209-239267-0_sp1.1 from training. Duration: 0.936375 2023-10-03 16:41:39,036 WARNING [train.py:1204] (2/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_46-207599-0_sp1.1 from training. Duration: 0.5818125 2023-10-03 16:41:41,746 WARNING [train.py:1204] (2/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_912-282412-0_sp1.1 from training. Duration: 0.4454375 2023-10-03 16:41:45,969 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_4183_210_2087_1_1526729100_1869838_141-166600-0_sp1.1 from training. Duration: 0.863625 2023-10-03 16:41:45,973 WARNING [train.py:1204] (2/4) Exclude cut with ID _15037_210_2253_1_1530752294104_3267650_296-192597-0 from training. Duration: 0.81 2023-10-03 16:41:46,026 WARNING [train.py:1204] (2/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_571-220562-0_sp1.1 from training. Duration: 0.963625 2023-10-03 16:41:46,052 WARNING [train.py:1204] (2/4) Exclude cut with ID _5287_210_2084_1_1527418883040_3878670_12-221186-0 from training. Duration: 0.98 2023-10-03 16:41:50,392 WARNING [train.py:1204] (2/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_149-85602-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 16:41:50,424 WARNING [train.py:1204] (2/4) Exclude cut with ID _9235_210_5219_1_1528891154253_8046120_1003-101638-0_sp1.1 from training. Duration: 0.663625 2023-10-03 16:41:53,165 WARNING [train.py:1204] (2/4) Exclude cut with ID _55844_210_2346_1_1534471212569_7625707_1099-33689-0_sp1.1 from training. Duration: 0.92725 2023-10-03 16:41:54,468 WARNING [train.py:1204] (2/4) Exclude cut with ID _7606_210_4915_1_1528894253277_5080690_301-10732-0 from training. Duration: 0.81 2023-10-03 16:41:56,401 WARNING [train.py:1204] (2/4) Exclude cut with ID _22826_210_9994_1_1531908201522_3279169_403-34698-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 16:41:56,407 WARNING [train.py:1204] (2/4) Exclude cut with ID _8480_210_1874_1_1528589391205_6728330_755-112625-0_sp0.9 from training. Duration: 0.82225 2023-10-03 16:41:56,431 WARNING [train.py:1204] (2/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_527-238990-0_sp1.1 from training. Duration: 0.8181875 2023-10-03 16:41:56,461 WARNING [train.py:1204] (2/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_426-7328-0_sp1.1 from training. Duration: 0.92725 2023-10-03 16:42:01,568 WARNING [train.py:1204] (2/4) Exclude cut with ID _3541_210_2477_1_1526475408709_3263973_189-317158-0_sp1.1 from training. Duration: 0.8181875 2023-10-03 16:42:02,803 WARNING [train.py:1204] (2/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_162-103620-0 from training. Duration: 0.93 2023-10-03 16:42:04,292 WARNING [train.py:1204] (2/4) Exclude cut with ID _22083_210_8254_1_1531702862439_3678970_49-234546-0 from training. Duration: 0.83 2023-10-03 16:42:05,636 WARNING [train.py:1204] (2/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_370-331600-0_sp1.1 from training. Duration: 0.8545625 2023-10-03 16:42:05,648 WARNING [train.py:1204] (2/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_166-59042-0_sp1.1 from training. Duration: 0.97275 2023-10-03 16:42:07,008 WARNING [train.py:1204] (2/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_138-332773-0_sp1.1 from training. Duration: 0.736375 2023-10-03 16:42:08,305 WARNING [train.py:1204] (2/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_90-30673-0_sp0.9 from training. Duration: 0.9333125 2023-10-03 16:42:11,569 WARNING [train.py:1204] (2/4) Exclude cut with ID _27967_210_12140_1_1532588391859_8419130_737-41898-0_sp1.1 from training. Duration: 0.936375 2023-10-03 16:42:11,676 WARNING [train.py:1204] (2/4) Exclude cut with ID _8833_210_5219_1_1528592665210_7553059_329-125156-0_sp1.1 from training. Duration: 0.7454375 2023-10-03 16:42:12,974 INFO [train.py:1046] (2/4) Epoch 38, batch 3750, loss[loss=0.1566, simple_loss=0.2352, pruned_loss=0.03902, over 23424.00 frames. ], tot_loss[loss=0.1583, simple_loss=0.2376, pruned_loss=0.03952, over 4723329.65 frames. ], batch size: 106, lr: 2.67e-03, grad_scale: 8.0 2023-10-03 16:42:13,054 WARNING [train.py:1204] (2/4) Exclude cut with ID _41817_210_18835_1_1533434477160_7423049_352-42537-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 16:42:14,573 WARNING [train.py:1204] (2/4) Exclude cut with ID _8365_210_2064_1_1528592311070_4677690_431-294102-0 from training. Duration: 0.89 2023-10-03 16:42:15,974 WARNING [train.py:1204] (2/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_454-314901-0_sp1.1 from training. Duration: 0.4181875 2023-10-03 16:42:18,671 WARNING [train.py:1204] (2/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_80-135200-0_sp0.9 from training. Duration: 0.87775 2023-10-03 16:42:18,723 WARNING [train.py:1204] (2/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_181-78632-0 from training. Duration: 0.78 2023-10-03 16:42:18,777 WARNING [train.py:1204] (2/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_300-27235-0_sp0.9 from training. Duration: 0.9666875 2023-10-03 16:42:20,080 WARNING [train.py:1204] (2/4) Exclude cut with ID _47790_210_16187_1_1533862814066_4207519_96-177176-0_sp1.1 from training. Duration: 0.97275 2023-10-03 16:42:20,178 WARNING [train.py:1204] (2/4) Exclude cut with ID _35941_210_15379_1_1532995294515_4023089_246-348742-0_sp1.1 from training. Duration: 0.97275 2023-10-03 16:42:21,506 WARNING [train.py:1204] (2/4) Exclude cut with ID _38670_210_15379_1_1533448810659_4391059_65-169397-0_sp1.1 from training. Duration: 0.8818125 2023-10-03 16:42:24,939 WARNING [train.py:1204] (2/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_1003-99642-0_sp1.1 from training. Duration: 0.963625 2023-10-03 16:42:31,224 WARNING [train.py:1204] (2/4) Exclude cut with ID _40402_210_18219_1_1533522324115_2953150_320-14078-0_sp1.1 from training. Duration: 0.863625 2023-10-03 16:42:31,318 WARNING [train.py:1204] (2/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_367-61330-0_sp0.9 from training. Duration: 0.8333125 2023-10-03 16:42:32,692 WARNING [train.py:1204] (2/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_515-298514-0_sp1.1 from training. Duration: 0.92725 2023-10-03 16:42:36,660 WARNING [train.py:1204] (2/4) Exclude cut with ID _53731_210_7994_1_1534553979244_3757214_471-320815-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 16:42:36,715 WARNING [train.py:1204] (2/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_306-232480-0 from training. Duration: 0.96 2023-10-03 16:42:38,035 WARNING [train.py:1204] (2/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_478-162247-0_sp0.9 from training. Duration: 0.911125 2023-10-03 16:42:38,153 WARNING [train.py:1204] (2/4) Exclude cut with ID _56423_210_5854_1_1534896061049_3165969_345-284696-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 16:42:39,502 WARNING [train.py:1204] (2/4) Exclude cut with ID _44778_210_2054_1_1533798127203_7443090_600-122253-0_sp1.1 from training. Duration: 0.963625 2023-10-03 16:42:41,599 WARNING [train.py:1204] (2/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_199-295611-0 from training. Duration: 0.55 2023-10-03 16:42:45,680 WARNING [train.py:1204] (2/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_734-129761-0 from training. Duration: 0.82 2023-10-03 16:42:45,780 WARNING [train.py:1204] (2/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_349-169257-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 16:42:45,820 WARNING [train.py:1204] (2/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_202-67017-0_sp0.9 from training. Duration: 0.911125 2023-10-03 16:42:47,227 WARNING [train.py:1204] (2/4) Exclude cut with ID _16908_210_5724_1_1531119157248_3772700_61-258978-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 16:42:53,211 WARNING [train.py:1204] (2/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_172-156931-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 16:42:54,480 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.636e+02 1.979e+02 2.269e+02 2.767e+02 4.342e+02, threshold=4.539e+02, percent-clipped=2.0 2023-10-03 16:42:54,590 WARNING [train.py:1204] (2/4) Exclude cut with ID _4092_210_2151_1_1526643044449_3135759_356-278694-0_sp1.1 from training. Duration: 0.42725 2023-10-03 16:42:57,810 WARNING [train.py:1204] (2/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_116-332008-0 from training. Duration: 0.98 2023-10-03 16:43:01,200 WARNING [train.py:1204] (2/4) Exclude cut with ID _29003_210_19596_1_1532507391720_3894098_358-114789-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 16:43:04,091 WARNING [train.py:1204] (2/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_152-212533-0_sp1.1 from training. Duration: 0.8818125 2023-10-03 16:43:04,148 WARNING [train.py:1204] (2/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_267-214116-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 16:43:08,229 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7543_210_6753_1_1528541907_1399322_165-323619-0_sp0.9 from training. Duration: 0.7666875 2023-10-03 16:43:11,551 WARNING [train.py:1204] (2/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_577-332600-0_sp0.9 from training. Duration: 0.6333125 2023-10-03 16:43:13,049 WARNING [train.py:1204] (2/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_342-100157-0_sp0.9 from training. Duration: 0.6 2023-10-03 16:43:13,341 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.balancer2.prob, batch_count=1335593.3333333333, ans=0.125 2023-10-03 16:43:14,505 WARNING [train.py:1204] (2/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_141-89727-0_sp1.1 from training. Duration: 0.6454375 2023-10-03 16:43:15,907 WARNING [train.py:1204] (2/4) Exclude cut with ID _3992_210_1747_1_1526644816031_3731060_107-251434-0_sp1.1 from training. Duration: 0.9 2023-10-03 16:43:16,230 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.balancer2.prob, batch_count=1335593.3333333333, ans=0.125 2023-10-03 16:43:17,327 WARNING [train.py:1204] (2/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_401-53423-0_sp1.1 from training. Duration: 0.5 2023-10-03 16:43:24,440 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7762_210_1794_1_1528199508_4057408_422-227034-0_sp1.1 from training. Duration: 0.663625 2023-10-03 16:43:27,553 INFO [train.py:1046] (2/4) Epoch 38, batch 3800, loss[loss=0.1457, simple_loss=0.2208, pruned_loss=0.03529, over 19299.00 frames. ], tot_loss[loss=0.1581, simple_loss=0.2375, pruned_loss=0.03941, over 4713983.86 frames. ], batch size: 42, lr: 2.67e-03, grad_scale: 8.0 2023-10-03 16:43:30,838 WARNING [train.py:1204] (2/4) Exclude cut with ID _4184_210_2550_1_1526730192013_2219380_102-250299-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 16:43:30,903 WARNING [train.py:1204] (2/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_253-308377-0_sp0.9 from training. Duration: 0.5444375 2023-10-03 16:43:32,214 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_271-297505-0 from training. Duration: 0.99 2023-10-03 16:43:33,632 WARNING [train.py:1204] (2/4) Exclude cut with ID _35941_210_15379_1_1532995294515_4023089_213-142973-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 16:43:35,022 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7544_210_6753_1_1528587722_3576686_162-63018-0_sp1.1 from training. Duration: 0.92725 2023-10-03 16:43:36,418 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_365-217119-0_sp0.9 from training. Duration: 0.7 2023-10-03 16:43:37,845 WARNING [train.py:1204] (2/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_662-275621-0_sp0.9 from training. Duration: 0.4666875 2023-10-03 16:43:37,846 WARNING [train.py:1204] (2/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_374-187555-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 16:43:39,193 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_289-180904-0_sp1.1 from training. Duration: 0.6909375 2023-10-03 16:43:42,389 WARNING [train.py:1204] (2/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_189-47206-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 16:43:42,424 WARNING [train.py:1204] (2/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_234-14174-0_sp1.1 from training. Duration: 0.7454375 2023-10-03 16:43:43,796 WARNING [train.py:1204] (2/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_576-76621-0_sp1.1 from training. Duration: 0.963625 2023-10-03 16:43:45,153 WARNING [train.py:1204] (2/4) Exclude cut with ID _13253_210_1925_1_1530757775819_3577873_253-246386-0 from training. Duration: 0.9 2023-10-03 16:43:47,932 WARNING [train.py:1204] (2/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_372-6869-0_sp1.1 from training. Duration: 0.4 2023-10-03 16:43:49,239 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_21434_210_4238_1_1531916339_1222252_22-54954-0_sp1.1 from training. Duration: 0.936375 2023-10-03 16:43:51,973 WARNING [train.py:1204] (2/4) Exclude cut with ID _29853_210_7497_1_1532505600851_6642110_492-170361-0_sp1.1 from training. Duration: 0.92725 2023-10-03 16:43:53,407 WARNING [train.py:1204] (2/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_505-97987-0_sp0.9 from training. Duration: 0.9555625 2023-10-03 16:43:55,395 WARNING [train.py:1204] (2/4) Exclude cut with ID _8365_210_2064_1_1528592311070_4677690_371-122295-0_sp0.9 from training. Duration: 0.611125 2023-10-03 16:43:57,313 WARNING [train.py:1204] (2/4) Exclude cut with ID _14268_210_9206_1_1530622771285_4714099_637-86630-0_sp1.1 from training. Duration: 0.463625 2023-10-03 16:43:57,906 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.conv_module1.whiten.whitening_limit, batch_count=1335793.3333333333, ans=15.0 2023-10-03 16:43:58,587 WARNING [train.py:1204] (2/4) Exclude cut with ID _29857_210_20034_1_1532395886753_3798160_372-5678-0_sp1.1 from training. Duration: 0.963625 2023-10-03 16:44:00,053 WARNING [train.py:1204] (2/4) Exclude cut with ID _52892_210_6721_1_1534378941259_4124129_90-165108-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 16:44:00,123 WARNING [train.py:1204] (2/4) Exclude cut with ID _27052_210_5986_1_1532329197187_3585009_143-339198-0_sp1.1 from training. Duration: 0.963625 2023-10-03 16:44:04,943 WARNING [train.py:1204] (2/4) Exclude cut with ID _14268_210_9206_1_1530622771285_4714099_637-86630-0_sp0.9 from training. Duration: 0.5666875 2023-10-03 16:44:04,946 WARNING [train.py:1204] (2/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_401-255491-0 from training. Duration: 0.7 2023-10-03 16:44:07,565 WARNING [train.py:1204] (2/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_178-6946-0_sp1.1 from training. Duration: 0.97275 2023-10-03 16:44:13,197 WARNING [train.py:1204] (2/4) Exclude cut with ID _27485_210_4916_1_1532739191677_3668269_460-163885-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 16:44:16,619 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.feed_forward3.hidden_balancer.prob, batch_count=1335860.0, ans=0.125 2023-10-03 16:44:19,246 WARNING [train.py:1204] (2/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_270-26053-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 16:44:21,931 WARNING [train.py:1204] (2/4) Exclude cut with ID _14170_210_5219_1_1530525949868_4469650_412-317077-0 from training. Duration: 0.75 2023-10-03 16:44:23,347 WARNING [train.py:1204] (2/4) Exclude cut with ID _5289_210_2084_1_1527678202736_3779299_142-103317-0 from training. Duration: 0.84 2023-10-03 16:44:23,397 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_150-143438-0_sp1.1 from training. Duration: 0.92725 2023-10-03 16:44:26,689 WARNING [train.py:1204] (2/4) Exclude cut with ID _27894_210_12392_1_1532415496078_6902439_668-269779-0_sp1.1 from training. Duration: 0.97275 2023-10-03 16:44:26,757 WARNING [train.py:1204] (2/4) Exclude cut with ID _18513_210_9206_1_1531618215223_3862620_56-222075-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 16:44:29,872 WARNING [train.py:1204] (2/4) Exclude cut with ID _4092_210_2151_1_1526643044449_3135759_356-278694-0 from training. Duration: 0.47 2023-10-03 16:44:32,756 WARNING [train.py:1204] (2/4) Exclude cut with ID _25808_210_13292_1_1532343514562_7366109_633-47524-0 from training. Duration: 0.99 2023-10-03 16:44:32,773 WARNING [train.py:1204] (2/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_872-264525-0 from training. Duration: 0.65 2023-10-03 16:44:34,255 WARNING [train.py:1204] (2/4) Exclude cut with ID _14632_210_9988_1_1530691279392_3923200_239-232545-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 16:44:35,580 WARNING [train.py:1204] (2/4) Exclude cut with ID _14349_210_8341_1_1530615710818_3442470_153-27432-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 16:44:40,462 WARNING [train.py:1204] (2/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_37-41919-0_sp0.9 from training. Duration: 0.9666875 2023-10-03 16:44:41,696 INFO [train.py:1046] (2/4) Epoch 38, batch 3850, loss[loss=0.1444, simple_loss=0.2231, pruned_loss=0.03286, over 24302.00 frames. ], tot_loss[loss=0.1569, simple_loss=0.236, pruned_loss=0.03886, over 4712615.51 frames. ], batch size: 56, lr: 2.67e-03, grad_scale: 8.0 2023-10-03 16:44:41,788 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_196-289637-0_sp0.9 from training. Duration: 0.8333125 2023-10-03 16:44:44,092 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.1.encoder.layers.1.whiten, num_groups=1, num_channels=256, metric=3.78 vs. limit=12.0 2023-10-03 16:44:46,051 WARNING [train.py:1204] (2/4) Exclude cut with ID _13678_210_7497_1_1530530069226_3463170_371-3221-0_sp1.1 from training. Duration: 0.8181875 2023-10-03 16:44:46,134 WARNING [train.py:1204] (2/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_557-324258-0 from training. Duration: 0.81 2023-10-03 16:44:47,981 WARNING [train.py:1204] (2/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_1012-279473-0_sp1.1 from training. Duration: 0.6090625 2023-10-03 16:44:49,327 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_66916_210_14656_1_1535725525_326566_7-92649-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 16:44:52,170 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_9460_210_4211_1_1528954587_10049667_480-260269-0_sp1.1 from training. Duration: 0.5090625 2023-10-03 16:44:53,766 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_40896_210_15710_1_1533433956_7100802_121-155001-0_sp1.1 from training. Duration: 0.92725 2023-10-03 16:44:56,973 WARNING [train.py:1204] (2/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_188-171673-0_sp1.1 from training. Duration: 0.52725 2023-10-03 16:44:57,060 WARNING [train.py:1204] (2/4) Exclude cut with ID _47790_210_16187_1_1533862814066_4207519_2-39653-0 from training. Duration: 0.98 2023-10-03 16:45:03,179 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_23850_210_4238_1_1532232597_1632433_112-229012-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 16:45:04,615 WARNING [train.py:1204] (2/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_626-79528-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 16:45:06,475 WARNING [train.py:1204] (2/4) Exclude cut with ID _30304_210_6319_1_1532476827261_7582060_856-90052-0_sp1.1 from training. Duration: 0.963625 2023-10-03 16:45:07,848 WARNING [train.py:1204] (2/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_122-109023-0_sp0.9 from training. Duration: 0.8444375 2023-10-03 16:45:10,589 WARNING [train.py:1204] (2/4) Exclude cut with ID _64414_210_17301_1_1535243252048_3757759_37-70328-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 16:45:10,650 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_20120_210_6753_1_1531643035_5482859_622-344907-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 16:45:10,700 WARNING [train.py:1204] (2/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_410-321956-0_sp1.1 from training. Duration: 0.92725 2023-10-03 16:45:10,714 WARNING [train.py:1204] (2/4) Exclude cut with ID _7562_210_2084_1_1528887767916_3946229_186-326422-0_sp0.9 from training. Duration: 0.9333125 2023-10-03 16:45:13,392 WARNING [train.py:1204] (2/4) Exclude cut with ID _37537_210_2253_1_1533171758568_3421949_127-285192-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 16:45:16,500 WARNING [train.py:1204] (2/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_723-228168-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 16:45:16,580 WARNING [train.py:1204] (2/4) Exclude cut with ID _13678_210_7497_1_1530530069226_3463170_426-246984-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 16:45:17,816 WARNING [train.py:1204] (2/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_399-156791-0_sp0.9 from training. Duration: 0.92225 2023-10-03 16:45:19,115 WARNING [train.py:1204] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_391-22148-0 from training. Duration: 0.77 2023-10-03 16:45:19,140 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_3475_210_2028_1_1526193578_3622569_64-297072-0 from training. Duration: 0.91 2023-10-03 16:45:20,675 WARNING [train.py:1204] (2/4) Exclude cut with ID _17875_210_2087_1_1531224012907_1821339_225-280015-0_sp1.1 from training. Duration: 0.963625 2023-10-03 16:45:20,701 WARNING [train.py:1204] (2/4) Exclude cut with ID _50655_210_18628_1_1534229942193_3772154_325-246169-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 16:45:22,196 WARNING [train.py:1204] (2/4) Exclude cut with ID _29529_210_7234_1_1532393961455_3977460_597-257348-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 16:45:22,236 WARNING [train.py:1204] (2/4) Exclude cut with ID _58895_210_8750_1_1535184081320_3649089_77-173485-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 16:45:23,443 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.568e+02 1.919e+02 2.123e+02 2.422e+02 3.840e+02, threshold=4.245e+02, percent-clipped=0.0 2023-10-03 16:45:23,523 WARNING [train.py:1204] (2/4) Exclude cut with ID _6270_210_2084_1_1527850911573_3856420_26-277343-0 from training. Duration: 0.82 2023-10-03 16:45:26,287 WARNING [train.py:1204] (2/4) Exclude cut with ID _14368_210_4220_1_1530532714870_3595529_144-3287-0 from training. Duration: 0.67 2023-10-03 16:45:27,655 WARNING [train.py:1204] (2/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_293-145278-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 16:45:29,306 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_40047_210_2290_1_1533290519_2374311_257-335162-0 from training. Duration: 0.95 2023-10-03 16:45:33,213 WARNING [train.py:1204] (2/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_619-344499-0_sp0.9 from training. Duration: 0.7 2023-10-03 16:45:34,831 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.conv_module1.balancer2.prob, batch_count=1336193.3333333333, ans=0.125 2023-10-03 16:45:37,879 WARNING [train.py:1204] (2/4) Exclude cut with ID _19504_210_9994_1_1531648863242_3325220_456-315379-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 16:45:39,356 WARNING [train.py:1204] (2/4) Exclude cut with ID _55857_210_2346_1_1534644007831_6898209_631-221692-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 16:45:43,577 WARNING [train.py:1204] (2/4) Exclude cut with ID _64631_210_27767_1_1535349580068_4165160_324-3682-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 16:45:43,603 WARNING [train.py:1204] (2/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_317-211107-0 from training. Duration: 0.92 2023-10-03 16:45:46,407 WARNING [train.py:1204] (2/4) Exclude cut with ID _6789_210_1534_1_1527857937218_3118950_311-61262-0 from training. Duration: 0.65 2023-10-03 16:45:48,444 WARNING [train.py:1204] (2/4) Exclude cut with ID _28323_210_16187_1_1532480395667_4100300_365-68882-0_sp1.1 from training. Duration: 0.92725 2023-10-03 16:45:49,498 WARNING [train.py:1204] (2/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_816-181825-0_sp1.1 from training. Duration: 0.92725 2023-10-03 16:45:50,921 WARNING [train.py:1204] (2/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_132-269633-0_sp1.1 from training. Duration: 0.6181875 2023-10-03 16:45:50,931 WARNING [train.py:1204] (2/4) Exclude cut with ID _44585_210_18628_1_1533605350387_4518199_280-73534-0_sp1.1 from training. Duration: 0.8090625 2023-10-03 16:45:52,336 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_68095_210_20185_1_1535718226_8888855_1009-164755-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 16:45:53,630 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_40223_210_6228_1_1533298404_4812267_400-103813-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 16:45:53,630 WARNING [train.py:1204] (2/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_491-261923-0_sp1.1 from training. Duration: 0.8909375 2023-10-03 16:45:53,636 WARNING [train.py:1204] (2/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_712-342563-0 from training. Duration: 0.97 2023-10-03 16:45:53,706 WARNING [train.py:1204] (2/4) Exclude cut with ID _41161_210_20034_1_1533431249657_3555314_175-190032-0_sp1.1 from training. Duration: 0.963625 2023-10-03 16:45:55,091 WARNING [train.py:1204] (2/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_788-298907-0 from training. Duration: 0.89 2023-10-03 16:45:55,107 WARNING [train.py:1204] (2/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_225-70961-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 16:45:56,389 INFO [train.py:1046] (2/4) Epoch 38, batch 3900, loss[loss=0.1557, simple_loss=0.2346, pruned_loss=0.0384, over 23267.00 frames. ], tot_loss[loss=0.1564, simple_loss=0.2354, pruned_loss=0.03868, over 4711962.79 frames. ], batch size: 93, lr: 2.67e-03, grad_scale: 8.0 2023-10-03 16:45:56,435 WARNING [train.py:1204] (2/4) Exclude cut with ID _58583_210_5236_1_1534726835266_3574249_235-175994-0_sp1.1 from training. Duration: 0.92725 2023-10-03 16:45:59,278 WARNING [train.py:1204] (2/4) Exclude cut with ID _12302_210_5219_1_1529924446466_8107280_907-182949-0_sp1.1 from training. Duration: 0.836375 2023-10-03 16:45:59,319 WARNING [train.py:1204] (2/4) Exclude cut with ID _54871_210_22356_1_1534665798253_3856359_94-193692-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 16:45:59,416 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_4528_210_3273_1_1527076654_3276777_342-168068-0_sp0.9 from training. Duration: 0.9666875 2023-10-03 16:45:59,546 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.conv_module2.balancer2.min_abs, batch_count=1336326.6666666667, ans=0.5 2023-10-03 16:46:00,694 WARNING [train.py:1204] (2/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_368-74239-0_sp1.1 from training. Duration: 0.92725 2023-10-03 16:46:00,695 WARNING [train.py:1204] (2/4) Exclude cut with ID _44831_210_13427_1_1533816011549_3689035_291-295928-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 16:46:00,769 WARNING [train.py:1204] (2/4) Exclude cut with ID _56406_210_9288_1_1534899618664_3496149_455-131997-0_sp1.1 from training. Duration: 0.97275 2023-10-03 16:46:00,777 WARNING [train.py:1204] (2/4) Exclude cut with ID _6473_210_1741_1_1527901307396_3815380_108-17335-0 from training. Duration: 0.65 2023-10-03 16:46:02,726 WARNING [train.py:1204] (2/4) Exclude cut with ID _14224_210_4381_1_1530529062037_4264010_286-135600-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 16:46:03,385 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.2.encoder.layers.2.conv_module2.whiten, num_groups=1, num_channels=384, metric=6.53 vs. limit=15.0 2023-10-03 16:46:07,485 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_14056_210_4163_1_1530604483_7572827_928-321675-0_sp1.1 from training. Duration: 0.936375 2023-10-03 16:46:07,792 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.conv_skip_rate, batch_count=1336326.6666666667, ans=0.0 2023-10-03 16:46:09,311 WARNING [train.py:1204] (2/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_81-300212-0_sp1.1 from training. Duration: 0.5454375 2023-10-03 16:46:09,350 WARNING [train.py:1204] (2/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_29-280622-0_sp0.9 from training. Duration: 0.911125 2023-10-03 16:46:09,437 WARNING [train.py:1204] (2/4) Exclude cut with ID _61079_210_4525_1_1534921008198_4011419_228-176447-0_sp1.1 from training. Duration: 0.936375 2023-10-03 16:46:12,231 WARNING [train.py:1204] (2/4) Exclude cut with ID _8394_210_6702_1_1528590504316_3540189_372-198878-0_sp1.1 from training. Duration: 0.5454375 2023-10-03 16:46:12,243 WARNING [train.py:1204] (2/4) Exclude cut with ID _64521_210_14699_1_1535243427259_3815328_244-208725-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 16:46:13,654 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8275_210_4198_1_1528455320_3892571_491-17707-0_sp0.9 from training. Duration: 0.711125 2023-10-03 16:46:13,793 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.balancer2.prob, batch_count=1336393.3333333333, ans=0.125 2023-10-03 16:46:15,257 WARNING [train.py:1204] (2/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_375-245677-0 from training. Duration: 0.81 2023-10-03 16:46:16,486 WARNING [train.py:1204] (2/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_690-286055-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 16:46:17,844 WARNING [train.py:1204] (2/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_89-321955-0 from training. Duration: 0.8 2023-10-03 16:46:17,892 WARNING [train.py:1204] (2/4) Exclude cut with ID _18895_210_6228_1_1531357274234_3784930_213-98721-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 16:46:19,216 WARNING [train.py:1204] (2/4) Exclude cut with ID _19708_210_9206_1_1532136650733_4238364_347-65240-0 from training. Duration: 0.97 2023-10-03 16:46:21,297 WARNING [train.py:1204] (2/4) Exclude cut with ID _64135_210_2253_1_1535677267876_7608972_498-5039-0 from training. Duration: 0.78 2023-10-03 16:46:26,821 WARNING [train.py:1204] (2/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_146-276944-0_sp1.1 from training. Duration: 0.7545625 2023-10-03 16:46:26,915 WARNING [train.py:1204] (2/4) Exclude cut with ID _63171_210_15488_1_1535104792794_3295110_155-34488-0_sp1.1 from training. Duration: 0.97275 2023-10-03 16:46:26,928 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_5438_210_1794_1_1527336687_2600009_233-58072-0_sp0.9 from training. Duration: 0.8555625 2023-10-03 16:46:28,258 WARNING [train.py:1204] (2/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_63-101996-0_sp0.9 from training. Duration: 0.97775 2023-10-03 16:46:32,413 WARNING [train.py:1204] (2/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_271-269084-0_sp1.1 from training. Duration: 0.7545625 2023-10-03 16:46:35,135 WARNING [train.py:1204] (2/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_345-167986-0_sp1.1 from training. Duration: 0.763625 2023-10-03 16:46:37,176 WARNING [train.py:1204] (2/4) Exclude cut with ID _39290_210_4220_1_1533862778760_3075118_92-311600-0_sp1.1 from training. Duration: 0.8 2023-10-03 16:46:37,182 WARNING [train.py:1204] (2/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_209-65306-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 16:46:38,998 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_26433_210_12108_1_1532157158_4056480_317-335807-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 16:46:45,104 WARNING [train.py:1204] (2/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_238-94127-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 16:46:45,150 WARNING [train.py:1204] (2/4) Exclude cut with ID _41314_210_12651_1_1533459476543_3766460_207-132038-0_sp1.1 from training. Duration: 0.8545625 2023-10-03 16:46:51,881 WARNING [train.py:1204] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_167-137255-0_sp1.1 from training. Duration: 0.5818125 2023-10-03 16:46:53,659 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_2009_210_2071_1_1525481134_4588937_385-302100-0_sp0.9 from training. Duration: 0.9666875 2023-10-03 16:46:58,141 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.conv_skip_rate, batch_count=1336593.3333333333, ans=0.0 2023-10-03 16:47:02,251 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_15517_210_6753_1_1530954008_3487336_253-110617-0_sp1.1 from training. Duration: 0.936375 2023-10-03 16:47:03,733 WARNING [train.py:1204] (2/4) Exclude cut with ID _39290_210_4220_1_1533862778760_3075118_92-311600-0_sp0.9 from training. Duration: 0.97775 2023-10-03 16:47:03,774 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_42-142473-0 from training. Duration: 0.72 2023-10-03 16:47:05,124 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_2296_210_2087_1_1525503448_3397335_57-185430-0 from training. Duration: 0.6 2023-10-03 16:47:05,135 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_11305_210_1794_1_1529750428_7178148_257-312870-0_sp0.9 from training. Duration: 0.97775 2023-10-03 16:47:05,251 WARNING [train.py:1204] (2/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_222-198127-0 from training. Duration: 0.84 2023-10-03 16:47:08,469 WARNING [train.py:1204] (2/4) Exclude cut with ID _65263_210_22356_1_1535763598691_3921037_423-202699-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 16:47:08,530 WARNING [train.py:1204] (2/4) Exclude cut with ID _15037_210_2253_1_1530752294104_3267650_13-130361-0 from training. Duration: 0.99 2023-10-03 16:47:11,655 INFO [train.py:1046] (2/4) Epoch 38, batch 3950, loss[loss=0.1655, simple_loss=0.2513, pruned_loss=0.03987, over 24660.00 frames. ], tot_loss[loss=0.1565, simple_loss=0.2356, pruned_loss=0.0387, over 4710855.14 frames. ], batch size: 68, lr: 2.67e-03, grad_scale: 8.0 2023-10-03 16:47:17,785 WARNING [train.py:1204] (2/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_628-198080-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 16:47:17,882 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_3171_210_2087_1_1526109084_2330007_37-64334-0 from training. Duration: 0.82 2023-10-03 16:47:19,154 WARNING [train.py:1204] (2/4) Exclude cut with ID _5287_210_2084_1_1527418883040_3878670_12-221186-0_sp1.1 from training. Duration: 0.8909375 2023-10-03 16:47:20,612 WARNING [train.py:1204] (2/4) Exclude cut with ID _6653_210_2629_1_1527919172441_6732679_1103-56445-0_sp1.1 from training. Duration: 0.663625 2023-10-03 16:47:21,342 WARNING [train.py:1204] (2/4) Exclude cut with ID _65263_210_22356_1_1535763598691_3921037_169-140068-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 16:47:26,829 WARNING [train.py:1204] (2/4) Exclude cut with ID IC0671W0118-331961 from training. Duration: 0.896 2023-10-03 16:47:26,882 WARNING [train.py:1204] (2/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_402-102311-0_sp1.1 from training. Duration: 0.6090625 2023-10-03 16:47:26,911 WARNING [train.py:1204] (2/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_202-293523-0 from training. Duration: 0.72 2023-10-03 16:47:28,338 WARNING [train.py:1204] (2/4) Exclude cut with ID IC0679W0038-335846 from training. Duration: 0.768 2023-10-03 16:47:28,369 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_37357_210_17024_1_1533089504_2316121_79-198866-0_sp1.1 from training. Duration: 0.936375 2023-10-03 16:47:32,377 WARNING [train.py:1204] (2/4) Exclude cut with ID _7606_210_4915_1_1528894253277_5080690_292-271285-0_sp1.1 from training. Duration: 0.963625 2023-10-03 16:47:32,396 WARNING [train.py:1204] (2/4) Exclude cut with ID _3541_210_2477_1_1526475408709_3263973_377-296839-0_sp0.9 from training. Duration: 0.611125 2023-10-03 16:47:32,399 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_45886_210_17024_1_1533704917_2435333_58-131069-0_sp1.1 from training. Duration: 0.963625 2023-10-03 16:47:33,936 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6961_210_2087_1_1527937168_6815011_395-185060-0 from training. Duration: 0.95 2023-10-03 16:47:37,152 WARNING [train.py:1204] (2/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_304-266479-0_sp1.1 from training. Duration: 0.97275 2023-10-03 16:47:37,220 WARNING [train.py:1204] (2/4) Exclude cut with ID _3867_210_1883_1_1526641200575_4475830_319-349850-0_sp1.1 from training. Duration: 0.6090625 2023-10-03 16:47:37,228 WARNING [train.py:1204] (2/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_304-59227-0_sp0.9 from training. Duration: 0.8555625 2023-10-03 16:47:38,556 WARNING [train.py:1204] (2/4) Exclude cut with ID _8021_210_7994_1_1528376451358_3653069_100-140231-0_sp0.9 from training. Duration: 0.7444375 2023-10-03 16:47:38,613 WARNING [train.py:1204] (2/4) Exclude cut with ID _8833_210_5219_1_1528592665210_7553059_329-125156-0_sp0.9 from training. Duration: 0.911125 2023-10-03 16:47:49,315 WARNING [train.py:1204] (2/4) Exclude cut with ID _14459_210_2228_1_1531443641946_7516700_792-287234-0_sp1.1 from training. Duration: 0.92725 2023-10-03 16:47:49,337 WARNING [train.py:1204] (2/4) Exclude cut with ID _19708_210_9206_1_1532136650733_4238364_347-65240-0_sp1.1 from training. Duration: 0.8818125 2023-10-03 16:47:53,691 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.560e+02 1.916e+02 2.036e+02 2.336e+02 4.528e+02, threshold=4.072e+02, percent-clipped=1.0 2023-10-03 16:47:55,188 WARNING [train.py:1204] (2/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_70-27732-0 from training. Duration: 0.94 2023-10-03 16:48:00,683 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_397-132565-0 from training. Duration: 0.99 2023-10-03 16:48:00,686 WARNING [train.py:1204] (2/4) Exclude cut with ID _49728_210_12651_1_1534041034294_6788150_624-195301-0 from training. Duration: 0.85 2023-10-03 16:48:01,865 WARNING [train.py:1204] (2/4) Exclude cut with ID _55429_210_10626_1_1534467588527_7174143_377-169391-0_sp1.1 from training. Duration: 0.87275 2023-10-03 16:48:01,972 WARNING [train.py:1204] (2/4) Exclude cut with ID _49376_210_5986_1_1533985278732_3370740_354-344695-0_sp1.1 from training. Duration: 0.9 2023-10-03 16:48:04,997 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.bypass.skip_rate, batch_count=1336860.0, ans=0.09899494936611666 2023-10-03 16:48:06,892 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.conv_skip_rate, batch_count=1336860.0, ans=0.0 2023-10-03 16:48:06,925 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=1336860.0, ans=0.1 2023-10-03 16:48:09,512 WARNING [train.py:1204] (2/4) Exclude cut with ID _4697_210_2069_1_1526815801953_3823098_276-96593-0_sp1.1 from training. Duration: 0.7 2023-10-03 16:48:09,520 WARNING [train.py:1204] (2/4) Exclude cut with ID _15012_210_1968_1_1530856820429_5095350_688-178849-0_sp0.9 from training. Duration: 0.711125 2023-10-03 16:48:09,574 WARNING [train.py:1204] (2/4) Exclude cut with ID _25565_210_12392_1_1532423009382_3952098_372-69023-0_sp1.1 from training. Duration: 0.936375 2023-10-03 16:48:10,840 WARNING [train.py:1204] (2/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_61-327062-0_sp1.1 from training. Duration: 0.736375 2023-10-03 16:48:10,878 WARNING [train.py:1204] (2/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_215-141059-0 from training. Duration: 0.62 2023-10-03 16:48:15,439 WARNING [train.py:1204] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_6-226750-0_sp1.1 from training. Duration: 0.8545625 2023-10-03 16:48:15,534 WARNING [train.py:1204] (2/4) Exclude cut with ID _14348_210_8341_1_1530599434996_3778640_240-232370-0_sp1.1 from training. Duration: 0.763625 2023-10-03 16:48:15,760 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.feed_forward1.hidden_balancer.prob, batch_count=1336926.6666666667, ans=0.125 2023-10-03 16:48:18,378 WARNING [train.py:1204] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_468-189058-0 from training. Duration: 0.96 2023-10-03 16:48:19,893 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.0.conv_module2.balancer2.min_abs, batch_count=1336926.6666666667, ans=0.5 2023-10-03 16:48:25,542 INFO [train.py:1046] (2/4) Epoch 38, batch 4000, loss[loss=0.1528, simple_loss=0.2284, pruned_loss=0.03864, over 23692.00 frames. ], tot_loss[loss=0.1568, simple_loss=0.236, pruned_loss=0.03874, over 4704479.31 frames. ], batch size: 232, lr: 2.67e-03, grad_scale: 16.0 2023-10-03 16:48:27,049 WARNING [train.py:1204] (2/4) Exclude cut with ID _21987_210_5854_1_1531821500482_3637038_247-93340-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 16:48:34,053 WARNING [train.py:1204] (2/4) Exclude cut with ID _66868_210_9527_1_1535509287954_4561140_436-191602-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 16:48:34,345 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.conv_module2.balancer2.min_abs, batch_count=1336993.3333333333, ans=0.5 2023-10-03 16:48:37,596 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=1336993.3333333333, ans=0.1 2023-10-03 16:48:38,705 WARNING [train.py:1204] (2/4) Exclude cut with ID _18895_210_6228_1_1531357274234_3784930_394-58203-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 16:48:40,643 WARNING [train.py:1204] (2/4) Exclude cut with ID _58605_210_12210_1_1534723195939_3904259_258-281498-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 16:48:40,691 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_33348_210_5632_1_1533086912_3324030_245-292752-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 16:48:41,980 WARNING [train.py:1204] (2/4) Exclude cut with ID _67831_210_12392_1_1535612464202_3092612_346-2148-0 from training. Duration: 0.93 2023-10-03 16:48:42,037 WARNING [train.py:1204] (2/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_369-222060-0_sp0.9 from training. Duration: 0.788875 2023-10-03 16:48:43,850 WARNING [train.py:1204] (2/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_245-174553-0 from training. Duration: 0.88 2023-10-03 16:48:43,863 WARNING [train.py:1204] (2/4) Exclude cut with ID _3541_210_2477_1_1526475408709_3263973_231-104736-0_sp1.1 from training. Duration: 0.7090625 2023-10-03 16:48:43,875 WARNING [train.py:1204] (2/4) Exclude cut with ID _44585_210_18628_1_1533605350387_4518199_280-73534-0 from training. Duration: 0.89 2023-10-03 16:48:46,580 WARNING [train.py:1204] (2/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_22-201476-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 16:48:48,163 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=1337060.0, ans=0.1 2023-10-03 16:48:49,359 WARNING [train.py:1204] (2/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_325-62802-0_sp0.9 from training. Duration: 0.9666875 2023-10-03 16:48:49,372 WARNING [train.py:1204] (2/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_199-274062-0_sp1.1 from training. Duration: 0.97275 2023-10-03 16:48:49,375 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_8-319957-0_sp1.1 from training. Duration: 0.87275 2023-10-03 16:48:50,777 WARNING [train.py:1204] (2/4) Exclude cut with ID _29343_210_4381_1_1532602757752_3674109_224-349726-0_sp1.1 from training. Duration: 0.936375 2023-10-03 16:48:50,782 WARNING [train.py:1204] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_520-126729-0_sp1.1 from training. Duration: 0.636375 2023-10-03 16:48:52,153 WARNING [train.py:1204] (2/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_239-22076-0_sp1.1 from training. Duration: 0.77275 2023-10-03 16:48:54,267 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9012W0011-501146 from training. Duration: 0.896 2023-10-03 16:48:54,344 WARNING [train.py:1204] (2/4) Exclude cut with ID _29857_210_20034_1_1532395886753_3798160_279-185801-0_sp1.1 from training. Duration: 0.82725 2023-10-03 16:48:55,614 WARNING [train.py:1204] (2/4) Exclude cut with ID _43552_210_8913_1_1533726031059_3523429_261-223933-0_sp1.1 from training. Duration: 0.92725 2023-10-03 16:48:58,343 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9029W0025-509426 from training. Duration: 0.896 2023-10-03 16:48:59,613 WARNING [train.py:1204] (2/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_337-181502-0_sp1.1 from training. Duration: 0.5909375 2023-10-03 16:48:59,633 WARNING [train.py:1204] (2/4) Exclude cut with ID _68635_210_9317_1_1535770813410_4315870_159-332599-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 16:49:06,422 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_161-276996-0 from training. Duration: 0.95 2023-10-03 16:49:06,615 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.conv_module2.balancer1.min_positive, batch_count=1337126.6666666667, ans=0.025 2023-10-03 16:49:07,742 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7088_210_1874_1_1527939776_1101113_127-68870-0_sp1.1 from training. Duration: 0.936375 2023-10-03 16:49:11,109 WARNING [train.py:1204] (2/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_322-217220-0_sp1.1 from training. Duration: 0.7818125 2023-10-03 16:49:11,163 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9072W0002-530636 from training. Duration: 0.768 2023-10-03 16:49:12,524 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_340-336408-0_sp1.1 from training. Duration: 0.8454375 2023-10-03 16:49:12,593 WARNING [train.py:1204] (2/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_395-37222-0 from training. Duration: 0.76 2023-10-03 16:49:12,596 WARNING [train.py:1204] (2/4) Exclude cut with ID _64099_210_17301_1_1535526977656_1578188_159-209156-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 16:49:14,508 WARNING [train.py:1204] (2/4) Exclude cut with ID _53950_210_5816_1_1534417264919_3862951_28-29681-0_sp1.1 from training. Duration: 0.92725 2023-10-03 16:49:15,967 WARNING [train.py:1204] (2/4) Exclude cut with ID _4681_210_3525_1_1527250689464_120889_15-126760-0_sp1.1 from training. Duration: 0.736375 2023-10-03 16:49:17,356 WARNING [train.py:1204] (2/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_242-3472-0_sp1.1 from training. Duration: 0.663625 2023-10-03 16:49:17,388 WARNING [train.py:1204] (2/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_436-186311-0_sp0.9 from training. Duration: 0.72225 2023-10-03 16:49:17,403 WARNING [train.py:1204] (2/4) Exclude cut with ID _3675_210_4557_1_1526639934408_4182089_452-172147-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 16:49:18,705 WARNING [train.py:1204] (2/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_80-147301-0 from training. Duration: 0.84 2023-10-03 16:49:18,731 WARNING [train.py:1204] (2/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_351-107588-0_sp1.1 from training. Duration: 0.92725 2023-10-03 16:49:22,138 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9111W0002-550781 from training. Duration: 0.896 2023-10-03 16:49:28,098 WARNING [train.py:1204] (2/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_465-156500-0_sp1.1 from training. Duration: 0.6545625 2023-10-03 16:49:29,547 WARNING [train.py:1204] (2/4) Exclude cut with ID _13938_210_9988_1_1530518718978_3693960_304-112784-0_sp1.1 from training. Duration: 0.4181875 2023-10-03 16:49:32,260 WARNING [train.py:1204] (2/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_202-67017-0_sp1.1 from training. Duration: 0.7454375 2023-10-03 16:49:33,584 WARNING [train.py:1204] (2/4) Exclude cut with ID _58414_210_8254_1_1535421609162_7151182_448-4358-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 16:49:33,654 WARNING [train.py:1204] (2/4) Exclude cut with ID _66840_210_5219_1_1535452168426_4196349_203-243234-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 16:49:34,991 WARNING [train.py:1204] (2/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_201-213513-0_sp1.1 from training. Duration: 0.97275 2023-10-03 16:49:37,877 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6291_210_3273_1_1527683649_1078498_117-187560-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 16:49:39,158 INFO [train.py:1046] (2/4) Epoch 38, batch 4050, loss[loss=0.161, simple_loss=0.2364, pruned_loss=0.04275, over 23284.00 frames. ], tot_loss[loss=0.1576, simple_loss=0.2372, pruned_loss=0.03898, over 4715719.67 frames. ], batch size: 93, lr: 2.67e-03, grad_scale: 8.0 2023-10-03 16:49:41,039 WARNING [train.py:1204] (2/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_135-174080-0_sp0.9 from training. Duration: 0.588875 2023-10-03 16:49:41,089 WARNING [train.py:1204] (2/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_45-265684-0 from training. Duration: 0.72 2023-10-03 16:49:44,186 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_435-313305-0_sp1.1 from training. Duration: 0.6454375 2023-10-03 16:49:44,213 WARNING [train.py:1204] (2/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_235-307337-0_sp1.1 from training. Duration: 0.8545625 2023-10-03 16:49:45,635 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7762_210_1794_1_1528199508_4057408_419-125009-0_sp0.9 from training. Duration: 0.92225 2023-10-03 16:49:47,002 WARNING [train.py:1204] (2/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_215-141059-0_sp1.1 from training. Duration: 0.563625 2023-10-03 16:49:47,086 WARNING [train.py:1204] (2/4) Exclude cut with ID _27013_210_1389_1_1532437173240_3252649_222-125573-0_sp1.1 from training. Duration: 0.8909375 2023-10-03 16:49:50,071 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.conv_skip_rate, batch_count=1337326.6666666667, ans=0.0 2023-10-03 16:49:51,659 WARNING [train.py:1204] (2/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_126-274694-0_sp1.1 from training. Duration: 0.8909375 2023-10-03 16:49:54,286 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_2592_210_2290_1_1525697025_4882925_157-70512-0_sp1.1 from training. Duration: 0.82725 2023-10-03 16:49:54,336 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_292-265928-0_sp0.9 from training. Duration: 0.5666875 2023-10-03 16:49:54,590 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.feed_forward3.hidden_balancer.prob, batch_count=1337393.3333333333, ans=0.125 2023-10-03 16:49:56,293 WARNING [train.py:1204] (2/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_1211-226066-0_sp1.1 from training. Duration: 0.6909375 2023-10-03 16:49:57,643 WARNING [train.py:1204] (2/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_394-265747-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 16:50:01,701 WARNING [train.py:1204] (2/4) Exclude cut with ID _48782_210_12658_1_1534467701544_2300422_239-67139-0_sp1.1 from training. Duration: 0.97275 2023-10-03 16:50:04,540 WARNING [train.py:1204] (2/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_329-113173-0_sp1.1 from training. Duration: 0.563625 2023-10-03 16:50:07,407 WARNING [train.py:1204] (2/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_81-75849-0_sp0.9 from training. Duration: 0.4333125 2023-10-03 16:50:08,809 WARNING [train.py:1204] (2/4) Exclude cut with ID _9235_210_5219_1_1528891154253_8046120_1003-101638-0 from training. Duration: 0.73 2023-10-03 16:50:08,840 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9309W0189-640316 from training. Duration: 0.64 2023-10-03 16:50:10,225 WARNING [train.py:1204] (2/4) Exclude cut with ID _45329_210_7005_1_1534157951211_6774339_503-287883-0_sp1.1 from training. Duration: 0.8 2023-10-03 16:50:16,856 WARNING [train.py:1204] (2/4) Exclude cut with ID _8833_210_5219_1_1528592665210_7553059_611-80313-0 from training. Duration: 0.64 2023-10-03 16:50:17,067 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.feed_forward2.hidden_balancer.prob, batch_count=1337460.0, ans=0.125 2023-10-03 16:50:18,168 WARNING [train.py:1204] (2/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_335-106626-0_sp1.1 from training. Duration: 0.963625 2023-10-03 16:50:20,935 WARNING [train.py:1204] (2/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_237-44332-0_sp1.1 from training. Duration: 0.8545625 2023-10-03 16:50:22,139 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.575e+02 1.926e+02 2.112e+02 2.341e+02 3.122e+02, threshold=4.223e+02, percent-clipped=0.0 2023-10-03 16:50:24,025 WARNING [train.py:1204] (2/4) Exclude cut with ID _46952_210_5986_1_1534230010357_3107379_178-147341-0_sp1.1 from training. Duration: 0.97275 2023-10-03 16:50:24,073 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_13091_210_7925_1_1530609447_8987354_946-154426-0_sp1.1 from training. Duration: 0.936375 2023-10-03 16:50:24,089 WARNING [train.py:1204] (2/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_326-17061-0_sp1.1 from training. Duration: 0.8545625 2023-10-03 16:50:28,576 WARNING [train.py:1204] (2/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_38-169920-0_sp1.1 from training. Duration: 0.82725 2023-10-03 16:50:31,343 WARNING [train.py:1204] (2/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_283-220372-0 from training. Duration: 0.56 2023-10-03 16:50:32,693 WARNING [train.py:1204] (2/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_252-216674-0_sp1.1 from training. Duration: 0.4909375 2023-10-03 16:50:34,059 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_202-91290-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 16:50:34,174 WARNING [train.py:1204] (2/4) Exclude cut with ID _41314_210_12651_1_1533459476543_3766460_80-74184-0 from training. Duration: 0.83 2023-10-03 16:50:38,413 WARNING [train.py:1204] (2/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_411-271358-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 16:50:41,476 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.0.conv_skip_rate, batch_count=1337593.3333333333, ans=0.0 2023-10-03 16:50:44,856 WARNING [train.py:1204] (2/4) Exclude cut with ID _68777_210_4353_1_1535810390731_3117904_129-166012-0 from training. Duration: 0.85 2023-10-03 16:50:47,577 WARNING [train.py:1204] (2/4) Exclude cut with ID _44982_210_1511_1_1534312820108_3726029_410-109759-0_sp1.1 from training. Duration: 0.963625 2023-10-03 16:50:47,582 WARNING [train.py:1204] (2/4) Exclude cut with ID _14224_210_4381_1_1530529062037_4264010_364-22817-0_sp1.1 from training. Duration: 0.6454375 2023-10-03 16:50:50,797 WARNING [train.py:1204] (2/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_89-242335-0 from training. Duration: 0.95 2023-10-03 16:50:50,806 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_12868_210_6753_1_1530702479_3610564_151-325626-0 from training. Duration: 0.97 2023-10-03 16:50:50,806 WARNING [train.py:1204] (2/4) Exclude cut with ID _6275_210_2084_1_1528110062213_7669049_786-264332-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 16:50:52,258 WARNING [train.py:1204] (2/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_35-153886-0_sp1.1 from training. Duration: 0.763625 2023-10-03 16:50:53,510 INFO [train.py:1046] (2/4) Epoch 38, batch 4100, loss[loss=0.1584, simple_loss=0.2449, pruned_loss=0.03596, over 24469.00 frames. ], tot_loss[loss=0.1589, simple_loss=0.2383, pruned_loss=0.03973, over 4709276.76 frames. ], batch size: 63, lr: 2.67e-03, grad_scale: 8.0 2023-10-03 16:50:53,640 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_24007_210_1794_1_1531918925_1663875_116-100916-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 16:50:54,907 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_162-262717-0_sp1.1 from training. Duration: 0.7545625 2023-10-03 16:50:58,484 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.out_combiner.scale_min, batch_count=1337660.0, ans=0.2 2023-10-03 16:50:59,820 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.balancer1.prob, batch_count=1337660.0, ans=0.125 2023-10-03 16:51:01,102 WARNING [train.py:1204] (2/4) Exclude cut with ID _8263_210_1664_1_1528438953494_294100_40-204731-0 from training. Duration: 0.5 2023-10-03 16:51:02,439 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_12868_210_6753_1_1530702479_3610564_416-293746-0 from training. Duration: 0.9 2023-10-03 16:51:03,835 WARNING [train.py:1204] (2/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_605-219732-0 from training. Duration: 0.94 2023-10-03 16:51:05,319 WARNING [train.py:1204] (2/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_454-123002-0 from training. Duration: 0.69 2023-10-03 16:51:05,325 WARNING [train.py:1204] (2/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_25-304273-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 16:51:05,397 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6860_210_4852_1_1527917457_3833327_451-39702-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 16:51:06,724 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_304-16505-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 16:51:06,736 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6291_210_3273_1_1527683649_1078498_4-266348-0_sp1.1 from training. Duration: 0.7090625 2023-10-03 16:51:06,802 WARNING [train.py:1204] (2/4) Exclude cut with ID ID0149W0001-753701 from training. Duration: 0.989 2023-10-03 16:51:09,592 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_41251_210_15368_1_1533429767_4820695_394-55393-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 16:51:10,902 WARNING [train.py:1204] (2/4) Exclude cut with ID _8793_210_6310_1_1528542985139_2310009_121-179782-0_sp1.1 from training. Duration: 0.72725 2023-10-03 16:51:10,923 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_2729_210_3528_1_1525863505_3882214_421-73047-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 16:51:12,252 WARNING [train.py:1204] (2/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_278-154492-0_sp0.9 from training. Duration: 0.8444375 2023-10-03 16:51:16,717 WARNING [train.py:1204] (2/4) Exclude cut with ID _14170_210_5219_1_1530525949868_4469650_412-317077-0_sp0.9 from training. Duration: 0.8333125 2023-10-03 16:51:18,116 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_727-138317-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 16:51:18,171 WARNING [train.py:1204] (2/4) Exclude cut with ID _25808_210_13292_1_1532343514562_7366109_345-335048-0_sp1.1 from training. Duration: 0.92725 2023-10-03 16:51:20,087 WARNING [train.py:1204] (2/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_506-23200-0 from training. Duration: 0.97 2023-10-03 16:51:20,171 WARNING [train.py:1204] (2/4) Exclude cut with ID _44718_210_5816_1_1534050051869_6469030_643-245187-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 16:51:20,175 WARNING [train.py:1204] (2/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_42-186162-0_sp1.1 from training. Duration: 0.836375 2023-10-03 16:51:20,194 WARNING [train.py:1204] (2/4) Exclude cut with ID _3231_210_2106_1_1526205864934_6784220_171-256004-0_sp1.1 from training. Duration: 0.7818125 2023-10-03 16:51:20,215 WARNING [train.py:1204] (2/4) Exclude cut with ID _25914_210_6400_1_1532141317058_644740_43-248478-0_sp1.1 from training. Duration: 0.936375 2023-10-03 16:51:20,833 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.3.encoder.layers.3.feed_forward3.out_whiten, num_groups=1, num_channels=512, metric=11.69 vs. limit=15.0 2023-10-03 16:51:21,553 WARNING [train.py:1204] (2/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_577-287150-0 from training. Duration: 0.82 2023-10-03 16:51:22,982 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_46015_210_7234_1_1533863319_3087837_376-276068-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 16:51:26,134 WARNING [train.py:1204] (2/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_51-200220-0 from training. Duration: 0.56 2023-10-03 16:51:26,225 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_452-96101-0_sp1.1 from training. Duration: 0.963625 2023-10-03 16:51:29,364 WARNING [train.py:1204] (2/4) Exclude cut with ID _13252_210_1925_1_1530671372047_3336909_263-168388-0_sp1.1 from training. Duration: 0.7818125 2023-10-03 16:51:29,365 WARNING [train.py:1204] (2/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_469-94673-0 from training. Duration: 0.79 2023-10-03 16:51:29,469 WARNING [train.py:1204] (2/4) Exclude cut with ID _14922_210_6228_1_1530707435273_3693310_303-228738-0_sp1.1 from training. Duration: 0.763625 2023-10-03 16:51:30,766 WARNING [train.py:1204] (2/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_262-250826-0_sp1.1 from training. Duration: 0.9 2023-10-03 16:51:30,785 WARNING [train.py:1204] (2/4) Exclude cut with ID _68777_210_4353_1_1535810390731_3117904_129-166012-0_sp1.1 from training. Duration: 0.77275 2023-10-03 16:51:32,249 WARNING [train.py:1204] (2/4) Exclude cut with ID _58895_210_8750_1_1535184081320_3649089_265-323810-0 from training. Duration: 0.96 2023-10-03 16:51:32,481 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.attention_skip_rate, batch_count=1337793.3333333333, ans=0.0 2023-10-03 16:51:34,911 WARNING [train.py:1204] (2/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_120-76843-0_sp0.9 from training. Duration: 0.611125 2023-10-03 16:51:34,977 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7544_210_6753_1_1528587722_3576686_295-258479-0_sp0.9 from training. Duration: 0.9333125 2023-10-03 16:51:39,008 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7046_210_6753_1_1527895337_6920917_783-278109-0 from training. Duration: 0.77 2023-10-03 16:51:39,062 WARNING [train.py:1204] (2/4) Exclude cut with ID _42326_210_7770_1_1533814282531_3707269_113-308323-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 16:51:39,102 WARNING [train.py:1204] (2/4) Exclude cut with ID _7852_210_5219_1_1528286471316_8139400_242-105336-0_sp0.9 from training. Duration: 0.8 2023-10-03 16:51:41,893 WARNING [train.py:1204] (2/4) Exclude cut with ID _56235_210_10640_1_1534557335235_4161099_114-277905-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 16:51:46,169 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_13118_210_7925_1_1530755612_7608297_42-66683-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 16:51:50,999 WARNING [train.py:1204] (2/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_901-79438-0_sp1.1 from training. Duration: 0.8545625 2023-10-03 16:51:51,052 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_30280_210_1881_1_1532872779_3660139_211-182194-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 16:51:59,540 WARNING [train.py:1204] (2/4) Exclude cut with ID _14898_210_9206_1_1530766812377_4177190_621-45732-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 16:51:59,553 WARNING [train.py:1204] (2/4) Exclude cut with ID _35350_210_16040_1_1533040191183_4075299_224-133775-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 16:52:03,674 WARNING [train.py:1204] (2/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_147-293311-0_sp1.1 from training. Duration: 0.8545625 2023-10-03 16:52:03,854 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.0.feed_forward3.hidden_balancer.prob, batch_count=1337926.6666666667, ans=0.125 2023-10-03 16:52:05,111 WARNING [train.py:1204] (2/4) Exclude cut with ID _12302_210_5219_1_1529924446466_8107280_835-279754-0_sp1.1 from training. Duration: 0.8181875 2023-10-03 16:52:07,680 INFO [train.py:1046] (2/4) Epoch 38, batch 4150, loss[loss=0.1713, simple_loss=0.2577, pruned_loss=0.04243, over 24469.00 frames. ], tot_loss[loss=0.1586, simple_loss=0.2378, pruned_loss=0.03972, over 4716474.93 frames. ], batch size: 69, lr: 2.67e-03, grad_scale: 8.0 2023-10-03 16:52:09,156 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8453_210_4211_1_1528502940_8562730_347-330673-0_sp0.9 from training. Duration: 0.8 2023-10-03 16:52:10,525 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_213-103282-0_sp0.9 from training. Duration: 0.8666875 2023-10-03 16:52:10,603 WARNING [train.py:1204] (2/4) Exclude cut with ID _49728_210_12651_1_1534041034294_6788150_66-43496-0_sp1.1 from training. Duration: 0.97275 2023-10-03 16:52:10,611 WARNING [train.py:1204] (2/4) Exclude cut with ID _19023_210_4353_1_1531378892552_5820780_28-36615-0_sp1.1 from training. Duration: 0.8818125 2023-10-03 16:52:10,743 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.bypass.scale_min, batch_count=1337993.3333333333, ans=0.2 2023-10-03 16:52:12,135 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=1337993.3333333333, ans=0.1 2023-10-03 16:52:13,351 WARNING [train.py:1204] (2/4) Exclude cut with ID _14349_210_8341_1_1530615710818_3442470_115-285975-0 from training. Duration: 0.86 2023-10-03 16:52:13,370 WARNING [train.py:1204] (2/4) Exclude cut with ID _12734_210_8341_1_1530183614296_3891929_236-47464-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 16:52:13,422 WARNING [train.py:1204] (2/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_177-236809-0 from training. Duration: 0.49 2023-10-03 16:52:13,479 WARNING [train.py:1204] (2/4) Exclude cut with ID _13678_210_7497_1_1530530069226_3463170_371-3221-0 from training. Duration: 0.9 2023-10-03 16:52:14,746 WARNING [train.py:1204] (2/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_48-338976-0 from training. Duration: 0.89 2023-10-03 16:52:15,009 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.balancer1.prob, batch_count=1337993.3333333333, ans=0.125 2023-10-03 16:52:16,691 WARNING [train.py:1204] (2/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_1167-309744-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 16:52:20,069 WARNING [train.py:1204] (2/4) Exclude cut with ID _30327_210_16049_1_1532484161295_3924169_401-70819-0_sp0.9 from training. Duration: 0.9555625 2023-10-03 16:52:21,308 WARNING [train.py:1204] (2/4) Exclude cut with ID _55134_210_21982_1_1534590028377_3882170_346-91022-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 16:52:24,576 WARNING [train.py:1204] (2/4) Exclude cut with ID _14459_210_2228_1_1531443641946_7516700_174-118801-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 16:52:25,957 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_269-278006-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 16:52:26,656 WARNING [train.py:1204] (2/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_390-339137-0_sp0.9 from training. Duration: 0.62225 2023-10-03 16:52:29,094 WARNING [train.py:1204] (2/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_253-308377-0_sp1.1 from training. Duration: 0.4454375 2023-10-03 16:52:29,116 WARNING [train.py:1204] (2/4) Exclude cut with ID _23825_210_13449_1_1531915313526_3497150_240-271367-0_sp1.1 from training. Duration: 0.8818125 2023-10-03 16:52:30,671 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1015-343884-0_sp0.9 from training. Duration: 0.688875 2023-10-03 16:52:34,721 WARNING [train.py:1204] (2/4) Exclude cut with ID _23514_210_1389_1_1531832139785_3031439_335-48210-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 16:52:38,915 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_309-65177-0_sp1.1 from training. Duration: 0.8 2023-10-03 16:52:38,987 WARNING [train.py:1204] (2/4) Exclude cut with ID _14368_210_4220_1_1530532714870_3595529_231-137892-0 from training. Duration: 0.99 2023-10-03 16:52:41,061 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.3.encoder.layers.0.nonlin_attention.whiten1, num_groups=1, num_channels=384, metric=7.30 vs. limit=10.0 2023-10-03 16:52:43,002 WARNING [train.py:1204] (2/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_57-270037-0 from training. Duration: 0.77 2023-10-03 16:52:43,007 WARNING [train.py:1204] (2/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_465-214726-0_sp1.1 from training. Duration: 0.7818125 2023-10-03 16:52:44,349 WARNING [train.py:1204] (2/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_106-300985-0 from training. Duration: 0.9 2023-10-03 16:52:44,361 WARNING [train.py:1204] (2/4) Exclude cut with ID _14632_210_9988_1_1530691279392_3923200_161-129530-0_sp1.1 from training. Duration: 0.87275 2023-10-03 16:52:44,378 WARNING [train.py:1204] (2/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_514-88370-0_sp1.1 from training. Duration: 0.963625 2023-10-03 16:52:47,132 WARNING [train.py:1204] (2/4) Exclude cut with ID _56495_210_4234_1_1534748080334_4136219_305-334505-0_sp1.1 from training. Duration: 0.92725 2023-10-03 16:52:48,493 WARNING [train.py:1204] (2/4) Exclude cut with ID _42875_210_14856_1_1533704397487_4012433_61-264914-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 16:52:50,421 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.604e+02 1.901e+02 2.086e+02 2.272e+02 3.701e+02, threshold=4.173e+02, percent-clipped=0.0 2023-10-03 16:52:50,633 WARNING [train.py:1204] (2/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_91-148831-0 from training. Duration: 0.99 2023-10-03 16:52:55,323 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_435-313305-0_sp0.9 from training. Duration: 0.788875 2023-10-03 16:52:57,157 WARNING [train.py:1204] (2/4) Exclude cut with ID _16744_210_5986_1_1531724405384_3907567_242-111316-0_sp1.1 from training. Duration: 0.8454375 2023-10-03 16:52:58,496 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_257-20733-0 from training. Duration: 0.76 2023-10-03 16:52:58,543 WARNING [train.py:1204] (2/4) Exclude cut with ID _8064_210_5399_1_1528372299291_4282973_4-91037-0_sp1.1 from training. Duration: 0.8 2023-10-03 16:53:00,505 WARNING [train.py:1204] (2/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_3-276701-0 from training. Duration: 0.91 2023-10-03 16:53:01,984 WARNING [train.py:1204] (2/4) Exclude cut with ID _3992_210_1747_1_1526644816031_3731060_36-147855-0_sp1.1 from training. Duration: 0.7090625 2023-10-03 16:53:03,267 WARNING [train.py:1204] (2/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_1296-298671-0_sp1.1 from training. Duration: 0.963625 2023-10-03 16:53:04,629 WARNING [train.py:1204] (2/4) Exclude cut with ID _68965_210_1883_1_1535698962272_3497490_309-334593-0_sp1.1 from training. Duration: 0.92725 2023-10-03 16:53:06,002 WARNING [train.py:1204] (2/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_128-296581-0 from training. Duration: 0.74 2023-10-03 16:53:06,003 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_17613_210_2151_1_1531137569_2366160_170-299079-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 16:53:06,005 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6287_210_3273_1_1527509506_2653716_182-199887-0_sp1.1 from training. Duration: 0.463625 2023-10-03 16:53:08,661 WARNING [train.py:1204] (2/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_80-135200-0_sp1.1 from training. Duration: 0.7181875 2023-10-03 16:53:11,485 WARNING [train.py:1204] (2/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_34-183685-0 from training. Duration: 0.98 2023-10-03 16:53:11,508 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8278_210_2550_1_1528637099_4220413_418-210155-0_sp1.1 from training. Duration: 0.92725 2023-10-03 16:53:11,512 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8853_210_3273_1_1528633899_818968_51-187685-0_sp0.9 from training. Duration: 0.7666875 2023-10-03 16:53:11,532 WARNING [train.py:1204] (2/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_332-221057-0_sp1.1 from training. Duration: 0.6181875 2023-10-03 16:53:12,900 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_2221_210_1988_1_1525597051_4935541_415-296882-0 from training. Duration: 0.77 2023-10-03 16:53:12,922 WARNING [train.py:1204] (2/4) Exclude cut with ID _33640_210_2655_1_1533117605479_4855469_770-278185-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 16:53:12,967 WARNING [train.py:1204] (2/4) Exclude cut with ID _5287_210_2084_1_1527418883040_3878670_333-48508-0_sp1.1 from training. Duration: 0.5454375 2023-10-03 16:53:13,020 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_43802_210_2290_1_1533516203_8829504_142-276839-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 16:53:14,477 WARNING [train.py:1204] (2/4) Exclude cut with ID _28394_210_8913_1_1532430049518_3650118_126-139371-0_sp1.1 from training. Duration: 0.92725 2023-10-03 16:53:14,496 WARNING [train.py:1204] (2/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_76-203867-0 from training. Duration: 0.72 2023-10-03 16:53:15,898 WARNING [train.py:1204] (2/4) Exclude cut with ID _6194_210_1534_1_1527511826160_3941194_14-282876-0_sp0.9 from training. Duration: 0.788875 2023-10-03 16:53:22,137 INFO [train.py:1046] (2/4) Epoch 38, batch 4200, loss[loss=0.1703, simple_loss=0.2522, pruned_loss=0.04421, over 23275.00 frames. ], tot_loss[loss=0.1583, simple_loss=0.2372, pruned_loss=0.03966, over 4714009.42 frames. ], batch size: 93, lr: 2.67e-03, grad_scale: 8.0 2023-10-03 16:53:22,207 WARNING [train.py:1204] (2/4) Exclude cut with ID _35355_210_16040_1_1533212988977_4016229_74-190076-0_sp1.1 from training. Duration: 0.72725 2023-10-03 16:53:23,650 WARNING [train.py:1204] (2/4) Exclude cut with ID _33920_210_4220_1_1533257851500_3084399_198-334063-0 from training. Duration: 0.94 2023-10-03 16:53:25,157 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6291_210_3273_1_1527683649_1078498_4-266348-0_sp0.9 from training. Duration: 0.8666875 2023-10-03 16:53:27,829 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_23856_210_4238_1_1531994785_3087934_122-38075-0_sp1.1 from training. Duration: 0.936375 2023-10-03 16:53:29,898 WARNING [train.py:1204] (2/4) Exclude cut with ID _4697_210_2069_1_1526815801953_3823098_276-96593-0_sp0.9 from training. Duration: 0.8555625 2023-10-03 16:53:29,948 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_308-278734-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 16:53:29,949 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_5190_210_4557_1_1527245078_2965646_353-250603-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 16:53:33,919 WARNING [train.py:1204] (2/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_472-216663-0 from training. Duration: 0.92 2023-10-03 16:53:35,497 WARNING [train.py:1204] (2/4) Exclude cut with ID _7537_210_2084_1_1528502395431_3924359_14-312906-0 from training. Duration: 0.84 2023-10-03 16:53:35,536 WARNING [train.py:1204] (2/4) Exclude cut with ID _29003_210_19596_1_1532507391720_3894098_559-236660-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 16:53:37,016 WARNING [train.py:1204] (2/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_312-204798-0_sp1.1 from training. Duration: 0.8454375 2023-10-03 16:53:39,783 WARNING [train.py:1204] (2/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_17-309070-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 16:53:41,903 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.0.layers.1.conv_module1.whiten, num_groups=1, num_channels=192, metric=12.15 vs. limit=15.0 2023-10-03 16:53:42,370 WARNING [train.py:1204] (2/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_344-152526-0_sp0.9 from training. Duration: 0.588875 2023-10-03 16:53:43,806 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_41452_210_7291_1_1533437687_3941281_405-11559-0_sp1.1 from training. Duration: 0.82725 2023-10-03 16:53:43,829 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_48730_210_5399_1_1533885886_2212769_157-124052-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 16:53:45,215 WARNING [train.py:1204] (2/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_415-78792-0 from training. Duration: 0.81 2023-10-03 16:53:45,219 WARNING [train.py:1204] (2/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_345-340690-0_sp1.1 from training. Duration: 0.8454375 2023-10-03 16:53:46,636 WARNING [train.py:1204] (2/4) Exclude cut with ID _23825_210_13449_1_1531915313526_3497150_124-8138-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 16:53:47,960 WARNING [train.py:1204] (2/4) Exclude cut with ID _49258_210_7994_1_1534122017863_3738861_194-24864-0_sp1.1 from training. Duration: 0.936375 2023-10-03 16:53:47,985 WARNING [train.py:1204] (2/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_369-222060-0_sp1.1 from training. Duration: 0.6454375 2023-10-03 16:53:49,943 WARNING [train.py:1204] (2/4) Exclude cut with ID _12369_210_4381_1_1530356078581_4388520_480-13146-0_sp0.9 from training. Duration: 0.9444375 2023-10-03 16:53:51,337 WARNING [train.py:1204] (2/4) Exclude cut with ID _13253_210_1925_1_1530757775819_3577873_191-144977-0 from training. Duration: 0.78 2023-10-03 16:53:51,358 WARNING [train.py:1204] (2/4) Exclude cut with ID _30304_210_6319_1_1532476827261_7582060_1128-140266-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 16:53:54,697 WARNING [train.py:1204] (2/4) Exclude cut with ID _8772_210_1850_1_1528635595972_3664011_247-336448-0_sp0.9 from training. Duration: 0.82225 2023-10-03 16:53:56,784 WARNING [train.py:1204] (2/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_318-59097-0_sp1.1 from training. Duration: 0.6909375 2023-10-03 16:53:58,052 WARNING [train.py:1204] (2/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_540-131164-0_sp1.1 from training. Duration: 0.836375 2023-10-03 16:53:59,848 WARNING [train.py:1204] (2/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_39-110836-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 16:54:02,553 WARNING [train.py:1204] (2/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_860-96198-0_sp1.1 from training. Duration: 0.663625 2023-10-03 16:54:02,560 WARNING [train.py:1204] (2/4) Exclude cut with ID _15133_210_6228_1_1530794512074_3080379_29-264458-0 from training. Duration: 0.58 2023-10-03 16:54:03,915 WARNING [train.py:1204] (2/4) Exclude cut with ID _55209_210_1531_1_1534675828996_4597160_797-348343-0_sp1.1 from training. Duration: 0.963625 2023-10-03 16:54:04,130 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.feed_forward1.out_proj.dropout_p, batch_count=1338460.0, ans=0.1 2023-10-03 16:54:05,229 WARNING [train.py:1204] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_23-189234-0_sp1.1 from training. Duration: 0.7545625 2023-10-03 16:54:08,243 WARNING [train.py:1204] (2/4) Exclude cut with ID _5410_210_1388_1_1527397055795_1719020_39-112221-0_sp1.1 from training. Duration: 0.62725 2023-10-03 16:54:10,921 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_43802_210_2290_1_1533516203_8829504_100-348441-0_sp1.1 from training. Duration: 0.82725 2023-10-03 16:54:15,135 WARNING [train.py:1204] (2/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_143-86344-0_sp1.1 from training. Duration: 0.7 2023-10-03 16:54:18,351 WARNING [train.py:1204] (2/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_126-29104-0 from training. Duration: 0.85 2023-10-03 16:54:21,220 WARNING [train.py:1204] (2/4) Exclude cut with ID _39846_210_12651_1_1533603651992_1318030_138-31771-0_sp1.1 from training. Duration: 0.963625 2023-10-03 16:54:24,645 WARNING [train.py:1204] (2/4) Exclude cut with ID _2220_210_1819_1_1525509109898_3658680_50-99837-0_sp1.1 from training. Duration: 0.4909375 2023-10-03 16:54:25,987 WARNING [train.py:1204] (2/4) Exclude cut with ID _6274_210_2084_1_1527937583837_7494020_836-210550-0_sp1.1 from training. Duration: 0.97275 2023-10-03 16:54:27,989 WARNING [train.py:1204] (2/4) Exclude cut with ID _28323_210_16187_1_1532480395667_4100300_306-74561-0 from training. Duration: 0.94 2023-10-03 16:54:31,295 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_177-58005-0_sp1.1 from training. Duration: 0.563625 2023-10-03 16:54:35,295 WARNING [train.py:1204] (2/4) Exclude cut with ID _54871_210_22356_1_1534665798253_3856359_19-342632-0_sp1.1 from training. Duration: 0.82725 2023-10-03 16:54:35,316 WARNING [train.py:1204] (2/4) Exclude cut with ID _35355_210_16040_1_1533212988977_4016229_74-190076-0_sp0.9 from training. Duration: 0.888875 2023-10-03 16:54:36,546 INFO [train.py:1046] (2/4) Epoch 38, batch 4250, loss[loss=0.1643, simple_loss=0.2381, pruned_loss=0.04522, over 23843.00 frames. ], tot_loss[loss=0.157, simple_loss=0.2354, pruned_loss=0.03927, over 4707609.90 frames. ], batch size: 179, lr: 2.67e-03, grad_scale: 8.0 2023-10-03 16:54:39,540 WARNING [train.py:1204] (2/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_285-97786-0_sp1.1 from training. Duration: 0.97275 2023-10-03 16:54:45,057 WARNING [train.py:1204] (2/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_337-181502-0_sp0.9 from training. Duration: 0.72225 2023-10-03 16:54:45,292 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.attention_skip_rate, batch_count=1338660.0, ans=0.0 2023-10-03 16:54:46,353 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_353-182855-0 from training. Duration: 0.96 2023-10-03 16:54:46,378 WARNING [train.py:1204] (2/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_409-327730-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 16:54:49,763 WARNING [train.py:1204] (2/4) Exclude cut with ID _8030_210_7497_1_1528444886781_3531089_373-19403-0_sp1.1 from training. Duration: 0.97275 2023-10-03 16:54:53,944 WARNING [train.py:1204] (2/4) Exclude cut with ID _6789_210_1534_1_1527857937218_3118950_231-340237-0_sp1.1 from training. Duration: 0.936375 2023-10-03 16:54:57,272 WARNING [train.py:1204] (2/4) Exclude cut with ID _6493_210_1747_1_1528459210424_4012470_536-57832-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 16:54:57,288 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7046_210_6753_1_1527895337_6920917_756-116002-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 16:54:59,294 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_37152_210_9697_1_1533106573_3844188_29-239070-0_sp1.1 from training. Duration: 0.8909375 2023-10-03 16:54:59,296 WARNING [train.py:1204] (2/4) Exclude cut with ID _19023_210_4353_1_1531378892552_5820780_61-334539-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 16:55:00,681 WARNING [train.py:1204] (2/4) Exclude cut with ID _14170_210_5219_1_1530525949868_4469650_159-28488-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 16:55:02,659 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_40896_210_15710_1_1533433956_7100802_764-71272-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 16:55:04,108 WARNING [train.py:1204] (2/4) Exclude cut with ID _44902_210_16187_1_1533888006062_3792078_258-178274-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 16:55:05,399 WARNING [train.py:1204] (2/4) Exclude cut with ID _7537_210_2084_1_1528502395431_3924359_14-312906-0_sp1.1 from training. Duration: 0.763625 2023-10-03 16:55:06,848 WARNING [train.py:1204] (2/4) Exclude cut with ID _49643_210_7989_1_1534640244224_3939031_674-116639-0_sp1.1 from training. Duration: 0.963625 2023-10-03 16:55:08,189 WARNING [train.py:1204] (2/4) Exclude cut with ID _22826_210_9994_1_1531908201522_3279169_415-161-0 from training. Duration: 0.98 2023-10-03 16:55:08,460 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.balancer_ff3.min_abs, batch_count=1338793.3333333333, ans=0.2 2023-10-03 16:55:10,833 WARNING [train.py:1204] (2/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_264-348896-0 from training. Duration: 0.85 2023-10-03 16:55:10,841 WARNING [train.py:1204] (2/4) Exclude cut with ID _51899_210_20034_1_1534129254466_3669220_233-161492-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 16:55:12,182 WARNING [train.py:1204] (2/4) Exclude cut with ID _31255_210_13427_1_1532865619495_3815143_233-200023-0_sp1.1 from training. Duration: 0.936375 2023-10-03 16:55:12,204 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_23850_210_4238_1_1532232597_1632433_100-229879-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 16:55:13,510 WARNING [train.py:1204] (2/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_375-245677-0_sp0.9 from training. Duration: 0.9 2023-10-03 16:55:13,513 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_40223_210_6228_1_1533298404_4812267_355-317415-0_sp1.1 from training. Duration: 0.97275 2023-10-03 16:55:14,853 WARNING [train.py:1204] (2/4) Exclude cut with ID _9679_210_7497_1_1529129212906_4363929_235-81488-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 16:55:16,522 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.conv_skip_rate, batch_count=1338793.3333333333, ans=0.0 2023-10-03 16:55:18,903 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.623e+02 1.873e+02 2.084e+02 2.323e+02 3.046e+02, threshold=4.167e+02, percent-clipped=0.0 2023-10-03 16:55:18,989 WARNING [train.py:1204] (2/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_172-215932-0_sp1.1 from training. Duration: 0.6 2023-10-03 16:55:19,040 WARNING [train.py:1204] (2/4) Exclude cut with ID _12369_210_4381_1_1530356078581_4388520_64-298917-0_sp1.1 from training. Duration: 0.8 2023-10-03 16:55:24,919 WARNING [train.py:1204] (2/4) Exclude cut with ID _38020_210_5986_1_1533112252259_7006089_131-53533-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 16:55:26,860 WARNING [train.py:1204] (2/4) Exclude cut with ID _42334_210_7770_1_1534206610396_3605880_392-48154-0_sp1.1 from training. Duration: 0.963625 2023-10-03 16:55:26,940 WARNING [train.py:1204] (2/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_337-181502-0 from training. Duration: 0.65 2023-10-03 16:55:26,951 WARNING [train.py:1204] (2/4) Exclude cut with ID _6267_210_2084_1_1527764658526_3894730_375-219887-0_sp0.9 from training. Duration: 0.7444375 2023-10-03 16:55:28,886 WARNING [train.py:1204] (2/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_60-146189-0 from training. Duration: 0.98 2023-10-03 16:55:28,977 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_3133_210_2087_1_1526106446_2324690_243-143119-0_sp1.1 from training. Duration: 0.7 2023-10-03 16:55:29,249 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.bypass.skip_rate, batch_count=1338860.0, ans=0.04949747468305833 2023-10-03 16:55:30,387 WARNING [train.py:1204] (2/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_1267-272693-0_sp0.9 from training. Duration: 0.811125 2023-10-03 16:55:31,792 WARNING [train.py:1204] (2/4) Exclude cut with ID _69884_210_9771_1_1535846392818_4166169_83-182645-0_sp1.1 from training. Duration: 0.97275 2023-10-03 16:55:31,817 WARNING [train.py:1204] (2/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_403-292260-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 16:55:33,286 WARNING [train.py:1204] (2/4) Exclude cut with ID _55429_210_10626_1_1534467588527_7174143_377-169391-0 from training. Duration: 0.96 2023-10-03 16:55:36,538 WARNING [train.py:1204] (2/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_494-199538-0_sp1.1 from training. Duration: 0.6818125 2023-10-03 16:55:36,599 WARNING [train.py:1204] (2/4) Exclude cut with ID _3067_210_2667_1_1526212974640_4503168_119-175776-0_sp1.1 from training. Duration: 0.67275 2023-10-03 16:55:36,866 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.conv_skip_rate, batch_count=1338926.6666666667, ans=0.0 2023-10-03 16:55:39,499 WARNING [train.py:1204] (2/4) Exclude cut with ID _3231_210_2106_1_1526205864934_6784220_183-303839-0_sp1.1 from training. Duration: 0.97275 2023-10-03 16:55:42,192 WARNING [train.py:1204] (2/4) Exclude cut with ID _18513_210_9206_1_1531618215223_3862620_165-38958-0_sp1.1 from training. Duration: 0.963625 2023-10-03 16:55:43,565 WARNING [train.py:1204] (2/4) Exclude cut with ID _41314_210_12651_1_1533459476543_3766460_87-272068-0_sp1.1 from training. Duration: 0.7545625 2023-10-03 16:55:43,658 WARNING [train.py:1204] (2/4) Exclude cut with ID _3541_210_2477_1_1526475408709_3263973_353-95-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 16:55:45,051 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_2009_210_2071_1_1525481134_4588937_73-302813-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 16:55:47,577 WARNING [train.py:1204] (2/4) Exclude cut with ID _64220_210_9527_1_1535250151772_4875259_88-95011-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 16:55:48,972 WARNING [train.py:1204] (2/4) Exclude cut with ID _44902_210_16187_1_1533888006062_3792078_160-12107-0_sp1.1 from training. Duration: 0.92725 2023-10-03 16:55:48,986 WARNING [train.py:1204] (2/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_547-5424-0 from training. Duration: 0.94 2023-10-03 16:55:50,857 INFO [train.py:1046] (2/4) Epoch 38, batch 4300, loss[loss=0.1424, simple_loss=0.2191, pruned_loss=0.03282, over 22807.00 frames. ], tot_loss[loss=0.1569, simple_loss=0.2354, pruned_loss=0.0392, over 4707181.40 frames. ], batch size: 322, lr: 2.67e-03, grad_scale: 8.0 2023-10-03 16:55:52,327 WARNING [train.py:1204] (2/4) Exclude cut with ID _21783_210_11357_1_1531742458730_3762679_268-85656-0_sp1.1 from training. Duration: 0.936375 2023-10-03 16:55:57,932 WARNING [train.py:1204] (2/4) Exclude cut with ID _66210_210_12651_1_1535446812922_3365850_42-284985-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 16:55:58,558 WARNING [train.py:1204] (2/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_465-214726-0_sp0.9 from training. Duration: 0.9555625 2023-10-03 16:55:58,838 INFO [scaling.py:1118] (2/4) WithLoss: name=encoder.encoders.0.layers.1.self_attn_weights, loss-sum=0.000e+00 2023-10-03 16:56:01,347 WARNING [train.py:1204] (2/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_50-95770-0_sp1.1 from training. Duration: 0.936375 2023-10-03 16:56:06,226 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=1339060.0, ans=0.1 2023-10-03 16:56:07,426 WARNING [train.py:1204] (2/4) Exclude cut with ID _30800_210_22382_1_1532499347904_4515959_269-77567-0_sp1.1 from training. Duration: 0.963625 2023-10-03 16:56:07,435 WARNING [train.py:1204] (2/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_7-111954-0 from training. Duration: 0.56 2023-10-03 16:56:08,787 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_43802_210_2290_1_1533516203_8829504_85-195240-0_sp0.9 from training. Duration: 0.9666875 2023-10-03 16:56:11,572 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_2822_210_2715_1_1526112586_1999587_165-36656-0_sp0.9 from training. Duration: 0.988875 2023-10-03 16:56:11,594 WARNING [train.py:1204] (2/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_181-78632-0_sp1.1 from training. Duration: 0.7090625 2023-10-03 16:56:11,605 WARNING [train.py:1204] (2/4) Exclude cut with ID IC0671W0044-331887 from training. Duration: 0.896 2023-10-03 16:56:13,203 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.bypass_mid.scale_min, batch_count=1339060.0, ans=0.2 2023-10-03 16:56:16,981 WARNING [train.py:1204] (2/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_266-137104-0_sp1.1 from training. Duration: 0.5909375 2023-10-03 16:56:17,107 WARNING [train.py:1204] (2/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_137-116442-0_sp0.9 from training. Duration: 0.8666875 2023-10-03 16:56:18,684 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.feed_forward1.out_proj.dropout_p, batch_count=1339126.6666666667, ans=0.1 2023-10-03 16:56:19,907 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_23801_210_16223_1_1532066392_3294657_570-5439-0 from training. Duration: 0.97 2023-10-03 16:56:21,144 WARNING [train.py:1204] (2/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_29-280622-0_sp1.1 from training. Duration: 0.7454375 2023-10-03 16:56:21,166 WARNING [train.py:1204] (2/4) Exclude cut with ID 3557-8342-0013-54691-0 from training. Duration: 0.92 2023-10-03 16:56:21,325 WARNING [train.py:1204] (2/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_711-15688-0_sp1.1 from training. Duration: 0.3909375 2023-10-03 16:56:23,401 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_15126_210_6753_1_1530778677_2591185_120-277707-0_sp1.1 from training. Duration: 0.72725 2023-10-03 16:56:26,107 WARNING [train.py:1204] (2/4) Exclude cut with ID _14970_210_15709_1_1530775547361_1966965_8-169924-0_sp1.1 from training. Duration: 0.836375 2023-10-03 16:56:26,109 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_4692_210_3528_1_1526899755_4323254_107-44401-0_sp0.9 from training. Duration: 0.9555625 2023-10-03 16:56:26,195 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_3475_210_2028_1_1526193578_3622569_265-276625-0_sp0.9 from training. Duration: 0.8444375 2023-10-03 16:56:28,166 WARNING [train.py:1204] (2/4) Exclude cut with ID _55153_210_2253_1_1534384786677_3162096_148-79022-0_sp1.1 from training. Duration: 0.87275 2023-10-03 16:56:29,360 WARNING [train.py:1204] (2/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_292-36336-0_sp1.1 from training. Duration: 0.92725 2023-10-03 16:56:29,370 WARNING [train.py:1204] (2/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_329-82235-0 from training. Duration: 0.82 2023-10-03 16:56:30,729 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_11305_210_1794_1_1529750428_7178148_257-312870-0 from training. Duration: 0.88 2023-10-03 16:56:32,223 WARNING [train.py:1204] (2/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_436-4493-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 16:56:35,365 WARNING [train.py:1204] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_321-54129-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 16:56:35,380 WARNING [train.py:1204] (2/4) Exclude cut with ID _5287_210_2084_1_1527418883040_3878670_333-48508-0_sp0.9 from training. Duration: 0.6666875 2023-10-03 16:56:36,739 WARNING [train.py:1204] (2/4) Exclude cut with ID _26528_210_9070_1_1532570410842_3851310_244-278158-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 16:56:36,782 WARNING [train.py:1204] (2/4) Exclude cut with ID _46952_210_5986_1_1534230010357_3107379_232-344503-0_sp1.1 from training. Duration: 0.87275 2023-10-03 16:56:36,792 WARNING [train.py:1204] (2/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_43-206757-0 from training. Duration: 0.75 2023-10-03 16:56:36,794 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_4183_210_2087_1_1526729100_1869838_141-166600-0 from training. Duration: 0.95 2023-10-03 16:56:36,849 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_296-287678-0 from training. Duration: 0.95 2023-10-03 16:56:38,117 WARNING [train.py:1204] (2/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_265-60343-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 16:56:38,153 WARNING [train.py:1204] (2/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_332-256168-0 from training. Duration: 0.75 2023-10-03 16:56:38,191 WARNING [train.py:1204] (2/4) Exclude cut with ID _13679_210_7497_1_1530534293662_3355098_411-186088-0 from training. Duration: 0.94 2023-10-03 16:56:38,682 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.5.encoder.layers.1.self_attn1.whiten, num_groups=1, num_channels=256, metric=12.61 vs. limit=22.5 2023-10-03 16:56:42,330 WARNING [train.py:1204] (2/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_458-126779-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 16:56:44,898 WARNING [train.py:1204] (2/4) Exclude cut with ID IC0803W0380-397572 from training. Duration: 0.032 2023-10-03 16:56:44,956 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_278-272612-0_sp1.1 from training. Duration: 0.663625 2023-10-03 16:56:45,295 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.bypass.scale_min, batch_count=1339193.3333333333, ans=0.2 2023-10-03 16:56:46,383 WARNING [train.py:1204] (2/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_491-263101-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 16:56:46,394 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_53555_210_5632_1_1534595220_7232297_334-336778-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 16:56:49,124 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_50-3289-0 from training. Duration: 0.91 2023-10-03 16:56:49,342 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.conv_module2.balancer1.prob, batch_count=1339260.0, ans=0.125 2023-10-03 16:56:50,545 WARNING [train.py:1204] (2/4) Exclude cut with ID _8699_210_2069_1_1528630346831_3960409_155-39766-0_sp0.9 from training. Duration: 0.8666875 2023-10-03 16:56:50,549 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_502-82145-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 16:56:50,595 WARNING [train.py:1204] (2/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_550-43132-0_sp1.1 from training. Duration: 0.97275 2023-10-03 16:56:52,326 WARNING [train.py:1204] (2/4) Exclude cut with ID _14998_210_5219_1_1530705582080_7474970_446-112117-0_sp1.1 from training. Duration: 0.8181875 2023-10-03 16:56:52,390 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_3691_210_3528_1_1526468169_4062053_355-334432-0_sp1.1 from training. Duration: 0.7909375 2023-10-03 16:56:55,075 WARNING [train.py:1204] (2/4) Exclude cut with ID _58605_210_12210_1_1534723195939_3904259_239-94720-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 16:56:57,148 WARNING [train.py:1204] (2/4) Exclude cut with ID _49376_210_5986_1_1533985278732_3370740_391-344072-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 16:56:58,406 WARNING [train.py:1204] (2/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_558-322014-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 16:56:59,723 WARNING [train.py:1204] (2/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_256-32662-0_sp1.1 from training. Duration: 0.8181875 2023-10-03 16:56:59,981 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.nonlin_attention.balancer.prob, batch_count=1339260.0, ans=0.125 2023-10-03 16:57:03,491 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.5.encoder.layers.1.feed_forward1.out_whiten, num_groups=1, num_channels=256, metric=6.54 vs. limit=15.0 2023-10-03 16:57:04,232 INFO [train.py:1046] (2/4) Epoch 38, batch 4350, loss[loss=0.1692, simple_loss=0.2407, pruned_loss=0.04886, over 23460.00 frames. ], tot_loss[loss=0.1576, simple_loss=0.2363, pruned_loss=0.0394, over 4707989.10 frames. ], batch size: 285, lr: 2.67e-03, grad_scale: 8.0 2023-10-03 16:57:06,982 WARNING [train.py:1204] (2/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_572-185150-0 from training. Duration: 0.97 2023-10-03 16:57:07,017 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_89-35748-0_sp0.9 from training. Duration: 0.87775 2023-10-03 16:57:10,683 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6291_210_3273_1_1527683649_1078498_88-66747-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 16:57:13,470 WARNING [train.py:1204] (2/4) Exclude cut with ID _19504_210_9994_1_1531648863242_3325220_357-237029-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 16:57:16,184 WARNING [train.py:1204] (2/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_302-2550-0_sp1.1 from training. Duration: 0.736375 2023-10-03 16:57:16,185 WARNING [train.py:1204] (2/4) Exclude cut with ID _9461_210_4557_1_1529060514726_3677451_59-194017-0_sp1.1 from training. Duration: 0.963625 2023-10-03 16:57:17,802 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=1339393.3333333333, ans=0.1 2023-10-03 16:57:20,256 WARNING [train.py:1204] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_665-196155-0_sp1.1 from training. Duration: 0.6090625 2023-10-03 16:57:23,100 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_326-315007-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 16:57:26,403 WARNING [train.py:1204] (2/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_105-328174-0_sp1.1 from training. Duration: 0.6909375 2023-10-03 16:57:26,411 WARNING [train.py:1204] (2/4) Exclude cut with ID _56494_210_4220_1_1535284837625_3033169_106-263722-0_sp1.1 from training. Duration: 0.97275 2023-10-03 16:57:26,617 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.conv_module1.balancer2.prob, batch_count=1339393.3333333333, ans=0.125 2023-10-03 16:57:29,762 WARNING [train.py:1204] (2/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_770-200430-0_sp0.9 from training. Duration: 0.988875 2023-10-03 16:57:30,015 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.ff2_skip_rate, batch_count=1339393.3333333333, ans=0.0 2023-10-03 16:57:31,730 WARNING [train.py:1204] (2/4) Exclude cut with ID _65640_210_6310_1_1535368201445_3173250_209-13008-0_sp1.1 from training. Duration: 0.9 2023-10-03 16:57:31,845 WARNING [train.py:1204] (2/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_212-149947-0_sp0.9 from training. Duration: 0.811125 2023-10-03 16:57:37,238 WARNING [train.py:1204] (2/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_188-171673-0 from training. Duration: 0.58 2023-10-03 16:57:37,302 WARNING [train.py:1204] (2/4) Exclude cut with ID _58895_210_8750_1_1535184081320_3649089_286-281724-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 16:57:37,365 WARNING [train.py:1204] (2/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_16-254909-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 16:57:41,855 WARNING [train.py:1204] (2/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_52-95655-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 16:57:44,449 WARNING [train.py:1204] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_554-45617-0 from training. Duration: 0.99 2023-10-03 16:57:47,221 WARNING [train.py:1204] (2/4) Exclude cut with ID _14683_210_5219_1_1530698352608_3974859_473-344611-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 16:57:48,467 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.582e+02 1.931e+02 2.144e+02 2.418e+02 3.398e+02, threshold=4.287e+02, percent-clipped=0.0 2023-10-03 16:57:48,567 WARNING [train.py:1204] (2/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_17-25045-0_sp0.9 from training. Duration: 0.6555625 2023-10-03 16:57:54,506 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9072W0003-530637 from training. Duration: 0.768 2023-10-03 16:57:54,608 WARNING [train.py:1204] (2/4) Exclude cut with ID _38670_210_15379_1_1533448810659_4391059_76-74025-0_sp1.1 from training. Duration: 0.936375 2023-10-03 16:57:56,027 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6291_210_3273_1_1527683649_1078498_74-292608-0_sp0.9 from training. Duration: 0.8 2023-10-03 16:57:57,365 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9084W0025-536862 from training. Duration: 0.896 2023-10-03 16:57:58,782 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9087W0021-538392 from training. Duration: 0.768 2023-10-03 16:57:58,796 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7543_210_6753_1_1528543323_645515_56-163234-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 16:57:58,817 WARNING [train.py:1204] (2/4) Exclude cut with ID _45560_210_21982_1_1533812603951_3584999_44-332643-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 16:58:00,180 WARNING [train.py:1204] (2/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_1285-256622-0_sp1.1 from training. Duration: 0.763625 2023-10-03 16:58:00,882 WARNING [train.py:1204] (2/4) Exclude cut with ID _52781_210_16187_1_1534297729099_1118149_141-325632-0_sp1.1 from training. Duration: 0.936375 2023-10-03 16:58:02,068 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_40896_210_15710_1_1533433956_7100802_1051-137631-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 16:58:02,103 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_13091_210_7925_1_1530609447_8987354_233-18333-0_sp1.1 from training. Duration: 0.97275 2023-10-03 16:58:04,773 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_34008_210_10431_1_1533345424_8183247_662-317331-0 from training. Duration: 0.94 2023-10-03 16:58:04,779 WARNING [train.py:1204] (2/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_291-248920-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 16:58:04,781 WARNING [train.py:1204] (2/4) Exclude cut with ID _65263_210_22356_1_1535763598691_3921037_274-288481-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 16:58:04,802 WARNING [train.py:1204] (2/4) Exclude cut with ID _5189_210_4557_1_1527248319428_3769229_233-168080-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 16:58:04,878 WARNING [train.py:1204] (2/4) Exclude cut with ID _54871_210_22356_1_1534665798253_3856359_135-184281-0 from training. Duration: 0.99 2023-10-03 16:58:06,252 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9123W0012-555972 from training. Duration: 0.896 2023-10-03 16:58:06,257 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9123W0118-556077 from training. Duration: 0.896 2023-10-03 16:58:06,891 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.whiten.whitening_limit, batch_count=1339593.3333333333, ans=12.0 2023-10-03 16:58:08,006 WARNING [train.py:1204] (2/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_406-275649-0 from training. Duration: 0.42 2023-10-03 16:58:10,785 WARNING [train.py:1204] (2/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_309-13684-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 16:58:10,810 WARNING [train.py:1204] (2/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_129-337802-0_sp1.1 from training. Duration: 0.6545625 2023-10-03 16:58:10,827 WARNING [train.py:1204] (2/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_649-266825-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 16:58:12,195 WARNING [train.py:1204] (2/4) Exclude cut with ID _40519_210_13794_1_1533715228655_7337390_895-224742-0_sp1.1 from training. Duration: 0.92725 2023-10-03 16:58:14,904 WARNING [train.py:1204] (2/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_234-14174-0 from training. Duration: 0.82 2023-10-03 16:58:15,123 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.0.attention_skip_rate, batch_count=1339593.3333333333, ans=0.0 2023-10-03 16:58:16,378 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9156W0009-572502 from training. Duration: 0.768 2023-10-03 16:58:16,384 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_424-245529-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 16:58:17,642 INFO [train.py:1046] (2/4) Epoch 38, batch 4400, loss[loss=0.163, simple_loss=0.2376, pruned_loss=0.04418, over 23798.00 frames. ], tot_loss[loss=0.1583, simple_loss=0.237, pruned_loss=0.03982, over 4712519.02 frames. ], batch size: 212, lr: 2.67e-03, grad_scale: 8.0 2023-10-03 16:58:20,434 WARNING [train.py:1204] (2/4) Exclude cut with ID _26528_210_9070_1_1532570410842_3851310_232-10149-0_sp1.1 from training. Duration: 0.963625 2023-10-03 16:58:20,445 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_17292_210_6753_1_1531125388_2943734_152-30410-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 16:58:21,896 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_35441_210_16554_1_1533347572_4092680_291-82075-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 16:58:23,916 WARNING [train.py:1204] (2/4) Exclude cut with ID _12369_210_4381_1_1530356078581_4388520_64-298917-0 from training. Duration: 0.88 2023-10-03 16:58:23,946 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_5618_210_1874_1_1527311008_5851_1-214501-0_sp0.9 from training. Duration: 0.7655625 2023-10-03 16:58:24,095 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.conv_skip_rate, batch_count=1339660.0, ans=0.0 2023-10-03 16:58:25,220 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_9460_210_4211_1_1528954587_10049667_572-37035-0 from training. Duration: 0.8 2023-10-03 16:58:25,246 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9193W0023-590322 from training. Duration: 0.896 2023-10-03 16:58:26,641 WARNING [train.py:1204] (2/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_206-10097-0_sp1.1 from training. Duration: 0.4454375 2023-10-03 16:58:26,653 WARNING [train.py:1204] (2/4) Exclude cut with ID _55134_210_21982_1_1534590028377_3882170_17-51731-0_sp1.1 from training. Duration: 0.963625 2023-10-03 16:58:29,269 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_14953_210_2151_1_1530701710_5616165_710-165400-0 from training. Duration: 0.87 2023-10-03 16:58:30,657 WARNING [train.py:1204] (2/4) Exclude cut with ID _53203_210_2087_1_1534246218309_475590_61-85777-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 16:58:32,674 WARNING [train.py:1204] (2/4) Exclude cut with ID _55844_210_2346_1_1534471212569_7625707_1050-10453-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 16:58:32,683 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9218W0013-603297 from training. Duration: 0.896 2023-10-03 16:58:35,490 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_20120_210_6753_1_1531643035_5482859_267-102818-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 16:58:35,491 WARNING [train.py:1204] (2/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_191-41023-0 from training. Duration: 0.71 2023-10-03 16:58:36,802 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9231W0202-610227 from training. Duration: 0.896 2023-10-03 16:58:39,949 WARNING [train.py:1204] (2/4) Exclude cut with ID _6537_210_1883_1_1527987559710_3597129_382-133062-0 from training. Duration: 0.68 2023-10-03 16:58:39,985 WARNING [train.py:1204] (2/4) Exclude cut with ID _28427_210_4220_1_1532581240967_3061069_43-84928-0 from training. Duration: 0.99 2023-10-03 16:58:41,198 WARNING [train.py:1204] (2/4) Exclude cut with ID _59256_210_5986_1_1535076085997_3589120_121-237386-0 from training. Duration: 0.96 2023-10-03 16:58:41,232 WARNING [train.py:1204] (2/4) Exclude cut with ID _8480_210_1874_1_1528589391205_6728330_137-256986-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 16:58:42,561 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_9417_210_1794_1_1529235261_4614131_479-346963-0_sp1.1 from training. Duration: 0.936375 2023-10-03 16:58:42,591 WARNING [train.py:1204] (2/4) Exclude cut with ID _26474_210_9570_1_1532520022699_3623380_267-7362-0_sp1.1 from training. Duration: 0.936375 2023-10-03 16:58:43,917 WARNING [train.py:1204] (2/4) Exclude cut with ID _55328_210_27767_1_1534580973260_4088170_287-61272-0_sp1.1 from training. Duration: 0.97275 2023-10-03 16:58:44,230 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.balancer2.prob, batch_count=1339726.6666666667, ans=0.125 2023-10-03 16:58:45,307 WARNING [train.py:1204] (2/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_589-9070-0 from training. Duration: 0.71 2023-10-03 16:58:45,316 WARNING [train.py:1204] (2/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_148-158509-0_sp1.1 from training. Duration: 0.37275 2023-10-03 16:58:46,729 WARNING [train.py:1204] (2/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_164-184548-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 16:58:46,897 WARNING [train.py:1204] (2/4) Exclude cut with ID _14970_210_15709_1_1530775547361_1966965_6-23328-0_sp1.1 from training. Duration: 0.7818125 2023-10-03 16:58:46,902 WARNING [train.py:1204] (2/4) Exclude cut with ID _29303_210_2575_1_1532392237303_7410649_612-165069-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 16:58:49,525 WARNING [train.py:1204] (2/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_383-321224-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 16:58:49,558 WARNING [train.py:1204] (2/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_283-160525-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 16:58:49,574 WARNING [train.py:1204] (2/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_505-97987-0 from training. Duration: 0.86 2023-10-03 16:58:52,701 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9309W0190-640317 from training. Duration: 0.768 2023-10-03 16:58:54,177 WARNING [train.py:1204] (2/4) Exclude cut with ID _50093_210_4220_1_1534417141207_3432299_280-297390-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 16:58:59,660 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_17292_210_6753_1_1531125388_2943734_320-71385-0_sp1.1 from training. Duration: 0.97275 2023-10-03 16:59:02,805 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_213-103282-0 from training. Duration: 0.78 2023-10-03 16:59:06,053 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7615_210_4211_1_1528110965_4580160_72-235211-0_sp0.9 from training. Duration: 0.8444375 2023-10-03 16:59:08,831 WARNING [train.py:1204] (2/4) Exclude cut with ID _14867_210_1968_1_1530684179890_3846969_173-320977-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 16:59:12,176 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7046_210_6753_1_1527895337_6920917_783-278109-0_sp0.9 from training. Duration: 0.8555625 2023-10-03 16:59:12,219 WARNING [train.py:1204] (2/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_90-30673-0 from training. Duration: 0.84 2023-10-03 16:59:12,240 WARNING [train.py:1204] (2/4) Exclude cut with ID _22826_210_9994_1_1531908201522_3279169_415-161-0_sp1.1 from training. Duration: 0.8909375 2023-10-03 16:59:13,596 WARNING [train.py:1204] (2/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_86-66192-0_sp0.9 from training. Duration: 0.611125 2023-10-03 16:59:13,599 WARNING [train.py:1204] (2/4) Exclude cut with ID _4626_210_3532_1_1526817652717_3916859_446-206656-0_sp1.1 from training. Duration: 0.6909375 2023-10-03 16:59:13,663 WARNING [train.py:1204] (2/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_166-128941-0_sp0.9 from training. Duration: 0.6 2023-10-03 16:59:15,307 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=1339926.6666666667, ans=0.1 2023-10-03 16:59:16,503 WARNING [train.py:1204] (2/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_345-167986-0 from training. Duration: 0.84 2023-10-03 16:59:19,227 WARNING [train.py:1204] (2/4) Exclude cut with ID _6275_210_2084_1_1528110062213_7669049_973-198085-0 from training. Duration: 0.75 2023-10-03 16:59:20,579 WARNING [train.py:1204] (2/4) Exclude cut with ID _60057_210_10640_1_1534903065793_3333150_171-181133-0 from training. Duration: 0.88 2023-10-03 16:59:20,598 WARNING [train.py:1204] (2/4) Exclude cut with ID _19504_210_9994_1_1531648863242_3325220_336-196513-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 16:59:20,605 WARNING [train.py:1204] (2/4) Exclude cut with ID _6493_210_1747_1_1528459210424_4012470_513-140967-0 from training. Duration: 0.79 2023-10-03 16:59:22,483 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6673_210_3273_1_1527767790_3173240_327-306185-0_sp0.9 from training. Duration: 0.92225 2023-10-03 16:59:22,605 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.0.ff2_skip_rate, batch_count=1339926.6666666667, ans=0.0 2023-10-03 16:59:25,273 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1032-236863-0_sp1.1 from training. Duration: 0.763625 2023-10-03 16:59:27,961 WARNING [train.py:1204] (2/4) Exclude cut with ID _14867_210_1968_1_1530684179890_3846969_413-29292-0 from training. Duration: 0.9 2023-10-03 16:59:30,599 INFO [train.py:1046] (2/4) Epoch 38, batch 4450, loss[loss=0.1524, simple_loss=0.2282, pruned_loss=0.03832, over 23733.00 frames. ], tot_loss[loss=0.1584, simple_loss=0.2378, pruned_loss=0.03945, over 4733393.55 frames. ], batch size: 135, lr: 2.67e-03, grad_scale: 8.0 2023-10-03 16:59:32,368 WARNING [train.py:1204] (2/4) Exclude cut with ID _47251_210_18628_1_1533814063128_4137250_186-343322-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 16:59:35,490 WARNING [train.py:1204] (2/4) Exclude cut with ID _56235_210_10640_1_1534557335235_4161099_624-46091-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 16:59:36,776 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_791-197891-0_sp0.9 from training. Duration: 0.9333125 2023-10-03 16:59:42,907 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_50050_210_16223_1_1535435796_7290138_786-288457-0_sp1.1 from training. Duration: 0.92725 2023-10-03 16:59:44,687 WARNING [train.py:1204] (2/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_213-227283-0_sp1.1 from training. Duration: 0.963625 2023-10-03 16:59:47,406 WARNING [train.py:1204] (2/4) Exclude cut with ID _42659_210_2070_1_1533607442765_3120380_58-341121-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 16:59:50,234 WARNING [train.py:1204] (2/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_506-23200-0_sp1.1 from training. Duration: 0.8818125 2023-10-03 16:59:50,945 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.3.encoder.layers.2.nonlin_attention.whiten2, num_groups=1, num_channels=512, metric=5.94 vs. limit=15.0 2023-10-03 16:59:52,885 WARNING [train.py:1204] (2/4) Exclude cut with ID _14170_210_5219_1_1530525949868_4469650_336-213530-0_sp1.1 from training. Duration: 0.8181875 2023-10-03 16:59:52,916 WARNING [train.py:1204] (2/4) Exclude cut with ID _64135_210_2253_1_1535677267876_7608972_180-5312-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 16:59:54,245 WARNING [train.py:1204] (2/4) Exclude cut with ID _12302_210_5219_1_1529924446466_8107280_835-279754-0 from training. Duration: 0.9 2023-10-03 16:59:54,247 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_20120_210_6753_1_1531643035_5482859_510-96996-0_sp1.1 from training. Duration: 0.7909375 2023-10-03 16:59:55,676 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_15517_210_6753_1_1530954008_3487336_216-187834-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 16:59:55,700 WARNING [train.py:1204] (2/4) Exclude cut with ID _19504_210_9994_1_1531648863242_3325220_414-113427-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 16:59:55,701 WARNING [train.py:1204] (2/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_264-108685-0_sp0.9 from training. Duration: 0.611125 2023-10-03 16:59:58,903 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_5438_210_1794_1_1527336687_2600009_104-288274-0_sp0.9 from training. Duration: 0.6444375 2023-10-03 17:00:02,013 WARNING [train.py:1204] (2/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_520-39496-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 17:00:02,048 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_37818_210_4238_1_1533620910_4720083_315-308565-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 17:00:03,848 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_2009_210_2071_1_1525481134_4588937_385-302100-0_sp1.1 from training. Duration: 0.7909375 2023-10-03 17:00:03,905 WARNING [train.py:1204] (2/4) Exclude cut with ID _58895_210_8750_1_1535184081320_3649089_277-53219-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 17:00:05,251 WARNING [train.py:1204] (2/4) Exclude cut with ID _26664_210_7770_1_1532258943280_3418270_121-111258-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 17:00:09,785 WARNING [train.py:1204] (2/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_99-167392-0_sp1.1 from training. Duration: 0.5545625 2023-10-03 17:00:10,941 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1113-7369-0 from training. Duration: 0.85 2023-10-03 17:00:10,956 WARNING [train.py:1204] (2/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_352-59958-0 from training. Duration: 0.86 2023-10-03 17:00:10,961 WARNING [train.py:1204] (2/4) Exclude cut with ID _14886_210_4220_1_1530702020355_3516190_159-73748-0_sp1.1 from training. Duration: 0.7545625 2023-10-03 17:00:12,409 WARNING [train.py:1204] (2/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_131-119569-0_sp1.1 from training. Duration: 0.92725 2023-10-03 17:00:14,438 WARNING [train.py:1204] (2/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_216-343007-0 from training. Duration: 0.66 2023-10-03 17:00:15,717 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.495e+02 1.913e+02 2.140e+02 2.381e+02 4.249e+02, threshold=4.280e+02, percent-clipped=0.0 2023-10-03 17:00:19,692 WARNING [train.py:1204] (2/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_113-92608-0_sp0.9 from training. Duration: 0.6 2023-10-03 17:00:22,607 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_40047_210_2290_1_1533290519_2374311_132-123271-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 17:00:22,656 WARNING [train.py:1204] (2/4) Exclude cut with ID _8793_210_6310_1_1528542985139_2310009_121-179782-0 from training. Duration: 0.8 2023-10-03 17:00:22,670 WARNING [train.py:1204] (2/4) Exclude cut with ID _43997_210_4525_1_1533538632555_5189329_283-218639-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 17:00:22,673 WARNING [train.py:1204] (2/4) Exclude cut with ID _49376_210_5986_1_1533985278732_3370740_297-78397-0_sp1.1 from training. Duration: 0.936375 2023-10-03 17:00:22,695 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_3475_210_2028_1_1526193578_3622569_377-166133-0_sp0.9 from training. Duration: 0.9555625 2023-10-03 17:00:22,701 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_520-213149-0_sp1.1 from training. Duration: 0.92725 2023-10-03 17:00:24,077 WARNING [train.py:1204] (2/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_567-144483-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 17:00:27,525 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_4183_210_2087_1_1526729100_1869838_143-280962-0_sp0.9 from training. Duration: 0.62225 2023-10-03 17:00:28,841 WARNING [train.py:1204] (2/4) Exclude cut with ID _49643_210_7989_1_1534640244224_3939031_611-185515-0 from training. Duration: 0.94 2023-10-03 17:00:29,160 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.conv_module1.balancer1.max_abs, batch_count=1340260.0, ans=10.0 2023-10-03 17:00:30,244 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.attention_skip_rate, batch_count=1340260.0, ans=0.0 2023-10-03 17:00:31,492 WARNING [train.py:1204] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_88-341718-0_sp0.9 from training. Duration: 0.7333125 2023-10-03 17:00:32,918 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_51225_210_3601_1_1534400634_2327383_49-235441-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 17:00:34,774 WARNING [train.py:1204] (2/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_240-294400-0_sp1.1 from training. Duration: 0.936375 2023-10-03 17:00:36,162 WARNING [train.py:1204] (2/4) Exclude cut with ID _55429_210_10626_1_1534467588527_7174143_640-304825-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 17:00:36,198 WARNING [train.py:1204] (2/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_187-221566-0_sp1.1 from training. Duration: 0.3909375 2023-10-03 17:00:37,709 WARNING [train.py:1204] (2/4) Exclude cut with ID _7606_210_4915_1_1528894253277_5080690_127-177761-0_sp0.9 from training. Duration: 0.711125 2023-10-03 17:00:42,318 WARNING [train.py:1204] (2/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_430-116139-0 from training. Duration: 0.85 2023-10-03 17:00:42,435 WARNING [train.py:1204] (2/4) Exclude cut with ID _3675_210_4557_1_1526639934408_4182089_456-18305-0_sp0.9 from training. Duration: 0.8444375 2023-10-03 17:00:44,998 INFO [train.py:1046] (2/4) Epoch 38, batch 4500, loss[loss=0.1699, simple_loss=0.2452, pruned_loss=0.04724, over 23153.00 frames. ], tot_loss[loss=0.159, simple_loss=0.2385, pruned_loss=0.03976, over 4730842.95 frames. ], batch size: 105, lr: 2.67e-03, grad_scale: 8.0 2023-10-03 17:00:47,171 WARNING [train.py:1204] (2/4) Exclude cut with ID _18513_210_9206_1_1531618215223_3862620_402-5957-0_sp1.1 from training. Duration: 0.9 2023-10-03 17:00:48,591 WARNING [train.py:1204] (2/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_465-156500-0 from training. Duration: 0.72 2023-10-03 17:00:48,592 WARNING [train.py:1204] (2/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_714-310538-0 from training. Duration: 0.86 2023-10-03 17:00:51,340 WARNING [train.py:1204] (2/4) Exclude cut with ID _39411_210_5986_1_1533538825030_3343649_335-301187-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 17:00:52,135 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.0.layers.1.feed_forward1.out_whiten, num_groups=1, num_channels=192, metric=7.30 vs. limit=15.0 2023-10-03 17:00:55,370 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_41750_210_1960_1_1533520678_3948042_241-158616-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 17:00:55,407 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_46013_210_7234_1_1533776362_3744032_50-141535-0_sp1.1 from training. Duration: 0.9 2023-10-03 17:00:56,880 WARNING [train.py:1204] (2/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_43-206757-0_sp0.9 from training. Duration: 0.8333125 2023-10-03 17:00:56,939 WARNING [train.py:1204] (2/4) Exclude cut with ID _22961_210_8254_1_1531976366672_4278180_172-276296-0_sp1.1 from training. Duration: 0.97275 2023-10-03 17:00:57,183 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=1340326.6666666667, ans=0.1 2023-10-03 17:00:58,773 WARNING [train.py:1204] (2/4) Exclude cut with ID _17956_210_4353_1_1531209568125_6979970_1305-301903-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 17:00:58,822 WARNING [train.py:1204] (2/4) Exclude cut with ID _28323_210_16187_1_1532480395667_4100300_604-44507-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 17:01:09,428 WARNING [train.py:1204] (2/4) Exclude cut with ID _23514_210_1389_1_1531832139785_3031439_88-337544-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 17:01:09,498 WARNING [train.py:1204] (2/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_932-3672-0_sp1.1 from training. Duration: 0.963625 2023-10-03 17:01:10,045 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.3.encoder.layers.3.conv_module2.whiten, num_groups=1, num_channels=512, metric=5.24 vs. limit=15.0 2023-10-03 17:01:12,182 WARNING [train.py:1204] (2/4) Exclude cut with ID _46502_210_13794_1_1533724452198_3657159_207-109991-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 17:01:12,234 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_216-73841-0_sp1.1 from training. Duration: 0.87275 2023-10-03 17:01:13,607 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8453_210_4211_1_1528502940_8562730_347-330673-0_sp1.1 from training. Duration: 0.6545625 2023-10-03 17:01:16,879 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.ff2_skip_rate, batch_count=1340460.0, ans=0.0 2023-10-03 17:01:22,168 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_4183_210_2087_1_1526729100_1869838_143-280962-0_sp1.1 from training. Duration: 0.5090625 2023-10-03 17:01:26,225 WARNING [train.py:1204] (2/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_325-113798-0_sp1.1 from training. Duration: 0.77275 2023-10-03 17:01:30,363 WARNING [train.py:1204] (2/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_91-212486-0_sp0.9 from training. Duration: 0.7555625 2023-10-03 17:01:30,554 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.conv_module1.balancer1.prob, batch_count=1340526.6666666667, ans=0.125 2023-10-03 17:01:33,757 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8776_210_2040_1_1528638837_4088036_244-278763-0_sp1.1 from training. Duration: 0.8181875 2023-10-03 17:01:33,801 WARNING [train.py:1204] (2/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_106-286130-0 from training. Duration: 0.98 2023-10-03 17:01:35,575 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_2746_210_1988_1_1525872816_3795627_372-205861-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 17:01:35,619 WARNING [train.py:1204] (2/4) Exclude cut with ID _30800_210_22382_1_1532499347904_4515959_240-54646-0_sp1.1 from training. Duration: 0.936375 2023-10-03 17:01:37,116 WARNING [train.py:1204] (2/4) Exclude cut with ID _59500_210_8254_1_1534809610680_3736009_162-180695-0_sp1.1 from training. Duration: 0.936375 2023-10-03 17:01:37,149 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7544_210_6753_1_1528591327_1471426_146-93357-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 17:01:38,593 WARNING [train.py:1204] (2/4) Exclude cut with ID _42657_210_2070_1_1533520654016_3327008_405-87432-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 17:01:39,234 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_4528_210_3273_1_1527076654_3276777_209-219804-0 from training. Duration: 0.69 2023-10-03 17:01:39,234 WARNING [train.py:1204] (2/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_78-182912-0_sp0.9 from training. Duration: 0.6555625 2023-10-03 17:01:39,249 WARNING [train.py:1204] (2/4) Exclude cut with ID _31255_210_13427_1_1532865619495_3815143_322-114998-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 17:01:40,654 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.feed_forward3.hidden_balancer.prob, batch_count=1340526.6666666667, ans=0.125 2023-10-03 17:01:44,519 WARNING [train.py:1204] (2/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_848-280957-0_sp0.9 from training. Duration: 0.9666875 2023-10-03 17:01:44,539 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7696_210_2040_1_1528207167_3716099_117-125434-0_sp0.9 from training. Duration: 0.8444375 2023-10-03 17:01:47,052 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_1002-109192-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 17:01:50,355 WARNING [train.py:1204] (2/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_138-64210-0_sp0.9 from training. Duration: 0.811125 2023-10-03 17:01:50,370 WARNING [train.py:1204] (2/4) Exclude cut with ID _25808_210_13292_1_1532343514562_7366109_633-47524-0_sp1.1 from training. Duration: 0.9 2023-10-03 17:01:51,774 WARNING [train.py:1204] (2/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_297-5233-0 from training. Duration: 0.95 2023-10-03 17:01:53,156 WARNING [train.py:1204] (2/4) Exclude cut with ID _8394_210_6702_1_1528590504316_3540189_335-274780-0 from training. Duration: 0.78 2023-10-03 17:01:53,168 WARNING [train.py:1204] (2/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_301-330455-0 from training. Duration: 0.8 2023-10-03 17:01:55,989 WARNING [train.py:1204] (2/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_383-205833-0 from training. Duration: 0.95 2023-10-03 17:01:57,968 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.2.encoder.layers.0.conv_module1.whiten, num_groups=1, num_channels=384, metric=8.23 vs. limit=15.0 2023-10-03 17:01:58,549 INFO [train.py:1046] (2/4) Epoch 38, batch 4550, loss[loss=0.1447, simple_loss=0.225, pruned_loss=0.0322, over 23549.00 frames. ], tot_loss[loss=0.1584, simple_loss=0.2374, pruned_loss=0.03969, over 4712045.38 frames. ], batch size: 134, lr: 2.67e-03, grad_scale: 8.0 2023-10-03 17:01:58,858 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.conv_skip_rate, batch_count=1340660.0, ans=0.0 2023-10-03 17:01:59,986 WARNING [train.py:1204] (2/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_48-182155-0 from training. Duration: 0.87 2023-10-03 17:02:01,338 WARNING [train.py:1204] (2/4) Exclude cut with ID _14832_210_8750_1_1530691351539_3171348_1-54315-0_sp1.1 from training. Duration: 0.7909375 2023-10-03 17:02:03,452 WARNING [train.py:1204] (2/4) Exclude cut with ID _44982_210_1511_1_1534312820108_3726029_413-100814-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 17:02:03,505 WARNING [train.py:1204] (2/4) Exclude cut with ID _3231_210_2106_1_1526205864934_6784220_104-261392-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 17:02:06,854 WARNING [train.py:1204] (2/4) Exclude cut with ID _24314_210_4915_1_1532135078116_7170179_1-13588-0_sp1.1 from training. Duration: 0.97275 2023-10-03 17:02:07,368 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.5.encoder.layers.1.self_attn1.whiten, num_groups=1, num_channels=256, metric=11.03 vs. limit=22.5 2023-10-03 17:02:11,359 WARNING [train.py:1204] (2/4) Exclude cut with ID _44304_210_15374_1_1533639352765_4613129_141-8356-0_sp1.1 from training. Duration: 0.963625 2023-10-03 17:02:12,781 WARNING [train.py:1204] (2/4) Exclude cut with ID _37892_210_19596_1_1533198677897_3964058_374-335034-0_sp1.1 from training. Duration: 0.936375 2023-10-03 17:02:14,168 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_195-257512-0_sp0.9 from training. Duration: 0.7444375 2023-10-03 17:02:14,170 WARNING [train.py:1204] (2/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_1267-272693-0_sp1.1 from training. Duration: 0.663625 2023-10-03 17:02:14,171 WARNING [train.py:1204] (2/4) Exclude cut with ID _30196_210_1664_1_1533708496522_7074140_690-326155-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 17:02:15,754 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_68847_210_14656_1_1536070214_2586843_38-20717-0_sp1.1 from training. Duration: 0.97275 2023-10-03 17:02:15,804 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_99-251314-0_sp1.1 from training. Duration: 0.7909375 2023-10-03 17:02:21,662 WARNING [train.py:1204] (2/4) Exclude cut with ID _49376_210_5986_1_1533985278732_3370740_225-95630-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 17:02:23,258 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_2655_210_2087_1_1525688959_3897400_202-147557-0 from training. Duration: 0.85 2023-10-03 17:02:24,603 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_5618_210_1874_1_1527311008_5851_1-214501-0_sp1.1 from training. Duration: 0.626375 2023-10-03 17:02:24,676 WARNING [train.py:1204] (2/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_225-43381-0_sp1.1 from training. Duration: 0.8545625 2023-10-03 17:02:26,105 WARNING [train.py:1204] (2/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_242-3472-0 from training. Duration: 0.73 2023-10-03 17:02:27,735 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.conv_skip_rate, batch_count=1340793.3333333333, ans=0.0 2023-10-03 17:02:30,647 WARNING [train.py:1204] (2/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_339-104791-0 from training. Duration: 0.49 2023-10-03 17:02:30,709 WARNING [train.py:1204] (2/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_488-92261-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 17:02:33,969 WARNING [train.py:1204] (2/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_175-163894-0 from training. Duration: 0.74 2023-10-03 17:02:34,093 WARNING [train.py:1204] (2/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_41-234353-0_sp0.9 from training. Duration: 0.7555625 2023-10-03 17:02:35,564 WARNING [train.py:1204] (2/4) Exclude cut with ID _47251_210_18628_1_1533814063128_4137250_116-275927-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 17:02:35,600 WARNING [train.py:1204] (2/4) Exclude cut with ID _4697_210_2069_1_1526815801953_3823098_61-223324-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 17:02:36,886 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6961_210_2087_1_1527937168_6815011_562-165518-0_sp1.1 from training. Duration: 0.7 2023-10-03 17:02:38,710 WARNING [train.py:1204] (2/4) Exclude cut with ID _21989_210_5854_1_1531899214931_3271360_133-44759-0 from training. Duration: 0.77 2023-10-03 17:02:40,273 WARNING [train.py:1204] (2/4) Exclude cut with ID _23557_210_8750_1_1531890106370_6846255_77-284923-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 17:02:42,856 WARNING [train.py:1204] (2/4) Exclude cut with ID _13678_210_7497_1_1530530069226_3463170_464-296127-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 17:02:42,876 WARNING [train.py:1204] (2/4) Exclude cut with ID _22613_210_5219_1_1532253599686_4090170_393-36663-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 17:02:44,103 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.677e+02 1.957e+02 2.138e+02 2.462e+02 4.431e+02, threshold=4.276e+02, percent-clipped=1.0 2023-10-03 17:02:44,264 WARNING [train.py:1204] (2/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_235-262124-0_sp0.9 from training. Duration: 0.7444375 2023-10-03 17:02:47,425 WARNING [train.py:1204] (2/4) Exclude cut with ID _8480_210_1874_1_1528589391205_6728330_755-112625-0 from training. Duration: 0.74 2023-10-03 17:02:47,484 WARNING [train.py:1204] (2/4) Exclude cut with ID _6456_210_1534_1_1527681634212_3181049_215-309359-0 from training. Duration: 0.88 2023-10-03 17:02:47,506 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_3019_210_2028_1_1526044245_3264865_191-72235-0_sp1.1 from training. Duration: 0.8818125 2023-10-03 17:02:48,816 WARNING [train.py:1204] (2/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_380-162648-0 from training. Duration: 0.94 2023-10-03 17:02:50,255 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_92-46009-0 from training. Duration: 0.54 2023-10-03 17:02:50,274 WARNING [train.py:1204] (2/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_53-300544-0_sp0.9 from training. Duration: 0.7444375 2023-10-03 17:02:51,341 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.0.layers.0.self_attn_weights.whiten_keys, num_groups=4, num_channels=128, metric=2.74 vs. limit=6.0 2023-10-03 17:02:51,658 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_68235_210_1925_1_1535863247_4484656_101-250943-0_sp1.1 from training. Duration: 0.97275 2023-10-03 17:02:51,674 WARNING [train.py:1204] (2/4) Exclude cut with ID _14970_210_15709_1_1530775547361_1966965_204-22182-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 17:02:51,751 WARNING [train.py:1204] (2/4) Exclude cut with ID _55153_210_2253_1_1534384786677_3162096_310-130092-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 17:02:51,915 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.attention_skip_rate, batch_count=1340860.0, ans=0.0 2023-10-03 17:02:53,062 WARNING [train.py:1204] (2/4) Exclude cut with ID _60113_210_16271_1_1535164212481_4386079_559-167191-0_sp1.1 from training. Duration: 0.8454375 2023-10-03 17:02:53,395 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.bypass_mid.scale_min, batch_count=1340860.0, ans=0.2 2023-10-03 17:02:54,531 WARNING [train.py:1204] (2/4) Exclude cut with ID _14368_210_4220_1_1530532714870_3595529_93-58002-0_sp1.1 from training. Duration: 0.5181875 2023-10-03 17:02:54,578 WARNING [train.py:1204] (2/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_110-224688-0 from training. Duration: 0.97 2023-10-03 17:02:55,998 WARNING [train.py:1204] (2/4) Exclude cut with ID _40402_210_18219_1_1533522324115_2953150_178-178004-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 17:02:56,007 WARNING [train.py:1204] (2/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_321-335475-0_sp1.1 from training. Duration: 0.4090625 2023-10-03 17:02:57,874 WARNING [train.py:1204] (2/4) Exclude cut with ID _66863_210_18053_1_1535698758334_8590319_1092-271784-0 from training. Duration: 0.96 2023-10-03 17:02:57,888 WARNING [train.py:1204] (2/4) Exclude cut with ID _28317_210_12742_1_1532480302381_8510325_607-213951-0_sp1.1 from training. Duration: 0.763625 2023-10-03 17:02:59,169 WARNING [train.py:1204] (2/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_468-283113-0 from training. Duration: 0.96 2023-10-03 17:03:01,978 WARNING [train.py:1204] (2/4) Exclude cut with ID _14922_210_6228_1_1530707435273_3693310_13-165512-0_sp1.1 from training. Duration: 0.7454375 2023-10-03 17:03:01,997 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_69630_210_7234_1_1535788670_910989_56-169762-0_sp1.1 from training. Duration: 0.92725 2023-10-03 17:03:03,976 WARNING [train.py:1204] (2/4) Exclude cut with ID _67335_210_5219_1_1535525975652_4275229_121-80357-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 17:03:05,410 WARNING [train.py:1204] (2/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_126-12079-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 17:03:05,453 WARNING [train.py:1204] (2/4) Exclude cut with ID _5294_210_1747_1_1527249643928_3696179_342-270278-0_sp1.1 from training. Duration: 0.5 2023-10-03 17:03:07,357 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_30322_210_1925_1_1532843478_826817_35-328264-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 17:03:10,079 WARNING [train.py:1204] (2/4) Exclude cut with ID _8480_210_1874_1_1528589391205_6728330_755-112625-0_sp1.1 from training. Duration: 0.67275 2023-10-03 17:03:12,848 WARNING [train.py:1204] (2/4) Exclude cut with ID _38670_210_15379_1_1533448810659_4391059_132-146412-0_sp1.1 from training. Duration: 0.97275 2023-10-03 17:03:12,930 WARNING [train.py:1204] (2/4) Exclude cut with ID _22020_210_2461_1_1531724164513_6326900_563-39691-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 17:03:14,311 INFO [train.py:1046] (2/4) Epoch 38, batch 4600, loss[loss=0.1722, simple_loss=0.2564, pruned_loss=0.04399, over 24116.00 frames. ], tot_loss[loss=0.1574, simple_loss=0.2363, pruned_loss=0.03928, over 4709620.69 frames. ], batch size: 80, lr: 2.67e-03, grad_scale: 8.0 2023-10-03 17:03:15,868 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_9460_210_4211_1_1528954587_10049667_572-37035-0_sp1.1 from training. Duration: 0.72725 2023-10-03 17:03:17,535 WARNING [train.py:1204] (2/4) Exclude cut with ID _68777_210_4353_1_1535810390731_3117904_129-166012-0_sp0.9 from training. Duration: 0.9444375 2023-10-03 17:03:17,592 WARNING [train.py:1204] (2/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_487-91921-0_sp1.1 from training. Duration: 0.963625 2023-10-03 17:03:18,972 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_195-257512-0 from training. Duration: 0.67 2023-10-03 17:03:20,311 WARNING [train.py:1204] (2/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_188-260011-0_sp1.1 from training. Duration: 0.663625 2023-10-03 17:03:23,169 WARNING [train.py:1204] (2/4) Exclude cut with ID _8480_210_1874_1_1528589391205_6728330_309-139896-0_sp0.9 from training. Duration: 0.9 2023-10-03 17:03:24,548 WARNING [train.py:1204] (2/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_866-321248-0_sp1.1 from training. Duration: 0.963625 2023-10-03 17:03:26,052 WARNING [train.py:1204] (2/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_510-216513-0_sp1.1 from training. Duration: 0.97275 2023-10-03 17:03:34,882 WARNING [train.py:1204] (2/4) Exclude cut with ID _6653_210_2629_1_1527919172441_6732679_758-143677-0 from training. Duration: 0.98 2023-10-03 17:03:35,050 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.bypass.skip_rate, batch_count=1341060.0, ans=0.04949747468305833 2023-10-03 17:03:36,737 WARNING [train.py:1204] (2/4) Exclude cut with ID _55134_210_21982_1_1534590028377_3882170_50-214427-0_sp1.1 from training. Duration: 0.97275 2023-10-03 17:03:40,104 WARNING [train.py:1204] (2/4) Exclude cut with ID _31649_210_6228_1_1532584608063_4091078_482-301330-0_sp1.1 from training. Duration: 0.97275 2023-10-03 17:03:42,898 WARNING [train.py:1204] (2/4) Exclude cut with ID _35799_210_5854_1_1533263487887_3529071_490-326847-0_sp1.1 from training. Duration: 0.9 2023-10-03 17:03:42,906 WARNING [train.py:1204] (2/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_984-90716-0_sp1.1 from training. Duration: 0.963625 2023-10-03 17:03:46,180 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.feed_forward2.hidden_balancer.prob, batch_count=1341126.6666666667, ans=0.125 2023-10-03 17:03:46,209 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.bypass.scale_min, batch_count=1341126.6666666667, ans=0.2 2023-10-03 17:03:47,307 WARNING [train.py:1204] (2/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_166-246557-0 from training. Duration: 0.64 2023-10-03 17:03:47,307 WARNING [train.py:1204] (2/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_51-200220-0_sp1.1 from training. Duration: 0.5090625 2023-10-03 17:03:48,612 WARNING [train.py:1204] (2/4) Exclude cut with ID _51382_210_23862_1_1534492801838_3650104_275-92796-0_sp1.1 from training. Duration: 0.936375 2023-10-03 17:03:52,173 WARNING [train.py:1204] (2/4) Exclude cut with ID _29853_210_7497_1_1532505600851_6642110_461-142829-0_sp1.1 from training. Duration: 0.97275 2023-10-03 17:03:53,524 WARNING [train.py:1204] (2/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_788-298907-0_sp0.9 from training. Duration: 0.988875 2023-10-03 17:03:54,927 WARNING [train.py:1204] (2/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_739-2098-0_sp1.1 from training. Duration: 0.836375 2023-10-03 17:04:00,757 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7543_210_6753_1_1528541907_1399322_165-323619-0 from training. Duration: 0.69 2023-10-03 17:04:02,127 WARNING [train.py:1204] (2/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_401-255491-0_sp1.1 from training. Duration: 0.636375 2023-10-03 17:04:02,928 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.3.encoder.layers.1.feed_forward1.out_whiten, num_groups=1, num_channels=512, metric=8.80 vs. limit=15.0 2023-10-03 17:04:03,838 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.1.ff2_skip_rate, batch_count=1341193.3333333333, ans=0.0 2023-10-03 17:04:05,023 WARNING [train.py:1204] (2/4) Exclude cut with ID _66873_210_5517_1_1535454136506_4293370_24-143283-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 17:04:05,113 WARNING [train.py:1204] (2/4) Exclude cut with ID _14459_210_2228_1_1531443641946_7516700_1221-280439-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 17:04:07,798 WARNING [train.py:1204] (2/4) Exclude cut with ID _59202_210_26984_1_1534816810612_3549174_469-213482-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 17:04:07,799 WARNING [train.py:1204] (2/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_148-158509-0_sp0.9 from training. Duration: 0.4555625 2023-10-03 17:04:09,692 WARNING [train.py:1204] (2/4) Exclude cut with ID _24314_210_4915_1_1532135078116_7170179_863-259095-0_sp1.1 from training. Duration: 0.97275 2023-10-03 17:04:10,409 WARNING [train.py:1204] (2/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_321-22821-0 from training. Duration: 0.89 2023-10-03 17:04:11,740 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_43831_210_2158_1_1533725502_5872791_55-163137-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 17:04:11,789 WARNING [train.py:1204] (2/4) Exclude cut with ID _27430_210_7770_1_1532604689375_1378590_26-25207-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 17:04:13,210 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_17613_210_2151_1_1531137569_2366160_223-316461-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 17:04:14,670 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_53679_210_4852_1_1534556726_4186141_541-213370-0_sp1.1 from training. Duration: 0.963625 2023-10-03 17:04:15,958 WARNING [train.py:1204] (2/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_369-295086-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 17:04:16,008 WARNING [train.py:1204] (2/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_91-267696-0 from training. Duration: 0.87 2023-10-03 17:04:16,053 WARNING [train.py:1204] (2/4) Exclude cut with ID _5294_210_1747_1_1527249643928_3696179_180-100713-0 from training. Duration: 0.81 2023-10-03 17:04:17,301 WARNING [train.py:1204] (2/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_32-335103-0 from training. Duration: 0.82 2023-10-03 17:04:17,307 WARNING [train.py:1204] (2/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_133-312727-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 17:04:18,694 WARNING [train.py:1204] (2/4) Exclude cut with ID _59288_210_5854_1_1535077757774_3412246_391-165934-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 17:04:18,746 WARNING [train.py:1204] (2/4) Exclude cut with ID _28323_210_16187_1_1532480395667_4100300_488-1281-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 17:04:20,115 WARNING [train.py:1204] (2/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_124-193207-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 17:04:27,851 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder_embed.conv.5.prob, batch_count=1341326.6666666667, ans=0.125 2023-10-03 17:04:28,951 INFO [train.py:1046] (2/4) Epoch 38, batch 4650, loss[loss=0.1499, simple_loss=0.2192, pruned_loss=0.04024, over 22790.00 frames. ], tot_loss[loss=0.1577, simple_loss=0.2367, pruned_loss=0.03935, over 4702010.67 frames. ], batch size: 322, lr: 2.67e-03, grad_scale: 8.0 2023-10-03 17:04:29,001 WARNING [train.py:1204] (2/4) Exclude cut with ID _14087_210_2253_1_1530529224512_3014029_226-242140-0_sp0.9 from training. Duration: 0.9 2023-10-03 17:04:31,063 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.bypass_mid.scale_min, batch_count=1341326.6666666667, ans=0.2 2023-10-03 17:04:32,230 WARNING [train.py:1204] (2/4) Exclude cut with ID _29277_210_3279_1_1532581109293_3173069_392-3694-0_sp1.1 from training. Duration: 0.936375 2023-10-03 17:04:32,260 WARNING [train.py:1204] (2/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_563-329842-0_sp1.1 from training. Duration: 0.97275 2023-10-03 17:04:32,326 WARNING [train.py:1204] (2/4) Exclude cut with ID _58583_210_5236_1_1534726835266_3574249_391-87755-0_sp1.1 from training. Duration: 0.92725 2023-10-03 17:04:32,346 WARNING [train.py:1204] (2/4) Exclude cut with ID _28427_210_4220_1_1532581240967_3061069_281-237827-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 17:04:32,360 WARNING [train.py:1204] (2/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_44-147898-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 17:04:33,742 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_24007_210_1794_1_1531918925_1663875_125-320772-0_sp1.1 from training. Duration: 0.97275 2023-10-03 17:04:36,573 WARNING [train.py:1204] (2/4) Exclude cut with ID _14368_210_4220_1_1530532714870_3595529_143-117421-0 from training. Duration: 0.58 2023-10-03 17:04:40,214 WARNING [train.py:1204] (2/4) Exclude cut with ID _69584_210_5219_1_1535785133775_4044450_325-7656-0_sp0.9 from training. Duration: 0.9555625 2023-10-03 17:04:41,636 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_37351_210_7925_1_1533012931_3812113_265-85780-0 from training. Duration: 0.88 2023-10-03 17:04:41,657 WARNING [train.py:1204] (2/4) Exclude cut with ID _18516_210_9206_1_1531704653206_3692406_331-237896-0_sp1.1 from training. Duration: 0.936375 2023-10-03 17:04:43,016 WARNING [train.py:1204] (2/4) Exclude cut with ID _14629_210_15709_1_1530604298182_956899_6-232695-0 from training. Duration: 0.94 2023-10-03 17:04:43,042 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7544_210_6753_1_1528587000_691468_87-178599-0_sp1.1 from training. Duration: 0.7909375 2023-10-03 17:04:44,418 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_326-303945-0 from training. Duration: 0.99 2023-10-03 17:04:44,439 WARNING [train.py:1204] (2/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_503-30719-0 from training. Duration: 0.98 2023-10-03 17:04:44,447 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_11305_210_1794_1_1529750428_7178148_299-344640-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 17:04:45,778 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_53071_210_16554_1_1534490405_2954066_265-170472-0_sp1.1 from training. Duration: 0.7545625 2023-10-03 17:04:48,545 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_174-277364-0_sp1.1 from training. Duration: 0.6454375 2023-10-03 17:04:49,861 WARNING [train.py:1204] (2/4) Exclude cut with ID _58414_210_8254_1_1535421609162_7151182_193-151884-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 17:04:49,887 WARNING [train.py:1204] (2/4) Exclude cut with ID IC0651W0104-322063 from training. Duration: 0.768 2023-10-03 17:04:51,370 WARNING [train.py:1204] (2/4) Exclude cut with ID _21881_210_3525_1_1531825242757_3798380_139-305982-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 17:04:52,864 WARNING [train.py:1204] (2/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_79-200392-0 from training. Duration: 0.97 2023-10-03 17:04:53,073 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.conv_skip_rate, batch_count=1341393.3333333333, ans=0.0 2023-10-03 17:04:57,350 WARNING [train.py:1204] (2/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_451-280894-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 17:04:57,366 WARNING [train.py:1204] (2/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_304-273433-0_sp1.1 from training. Duration: 0.9 2023-10-03 17:04:58,680 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_5959_210_1794_1_1527400400_3445726_412-44052-0 from training. Duration: 0.77 2023-10-03 17:05:00,633 WARNING [train.py:1204] (2/4) Exclude cut with ID _4681_210_3525_1_1527251040321_1235713_148-26159-0_sp1.1 from training. Duration: 0.963625 2023-10-03 17:05:00,798 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.conv_skip_rate, batch_count=1341460.0, ans=0.0 2023-10-03 17:05:00,899 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.conv_skip_rate, batch_count=1341460.0, ans=0.0 2023-10-03 17:05:03,369 WARNING [train.py:1204] (2/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_887-249555-0_sp0.9 from training. Duration: 0.8555625 2023-10-03 17:05:06,270 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_25236_210_2158_1_1532051968_280567_37-6860-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 17:05:10,889 WARNING [train.py:1204] (2/4) Exclude cut with ID _29277_210_3279_1_1532581109293_3173069_417-115527-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 17:05:13,570 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.579e+02 1.819e+02 2.027e+02 2.299e+02 3.451e+02, threshold=4.055e+02, percent-clipped=0.0 2023-10-03 17:05:13,652 WARNING [train.py:1204] (2/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_552-134748-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 17:05:13,717 WARNING [train.py:1204] (2/4) Exclude cut with ID _43176_210_8254_1_1533524417469_4174339_188-174143-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 17:05:13,751 WARNING [train.py:1204] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_127-119109-0_sp1.1 from training. Duration: 0.6545625 2023-10-03 17:05:13,979 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.conv_module1.balancer1.min_positive, batch_count=1341526.6666666667, ans=0.025 2023-10-03 17:05:15,219 WARNING [train.py:1204] (2/4) Exclude cut with ID _14922_210_6228_1_1530707435273_3693310_38-187851-0 from training. Duration: 0.62 2023-10-03 17:05:15,254 WARNING [train.py:1204] (2/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_588-125165-0 from training. Duration: 0.83 2023-10-03 17:05:16,550 WARNING [train.py:1204] (2/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_344-152526-0_sp1.1 from training. Duration: 0.4818125 2023-10-03 17:05:16,551 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7002_210_1794_1_1527935634_4034444_487-236501-0 from training. Duration: 0.66 2023-10-03 17:05:16,733 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.1.ff3_skip_rate, batch_count=1341526.6666666667, ans=0.0 2023-10-03 17:05:18,009 WARNING [train.py:1204] (2/4) Exclude cut with ID _23825_210_13449_1_1531915313526_3497150_379-16981-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 17:05:19,444 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.1.conv_module1.balancer1.prob, batch_count=1341526.6666666667, ans=0.125 2023-10-03 17:05:20,726 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.1.self_attn_weights.pos_emb_skip_rate, batch_count=1341526.6666666667, ans=0.0 2023-10-03 17:05:25,295 WARNING [train.py:1204] (2/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_297-5233-0_sp1.1 from training. Duration: 0.863625 2023-10-03 17:05:25,299 WARNING [train.py:1204] (2/4) Exclude cut with ID _41298_210_9019_1_1533430371450_4822143_178-283538-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 17:05:25,326 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7046_210_6753_1_1527895337_6920917_642-106799-0 from training. Duration: 0.92 2023-10-03 17:05:25,339 WARNING [train.py:1204] (2/4) Exclude cut with ID _6274_210_2084_1_1527937583837_7494020_532-291952-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 17:05:26,846 WARNING [train.py:1204] (2/4) Exclude cut with ID _47278_210_12041_1_1533902418349_3576110_260-270468-0_sp1.1 from training. Duration: 0.92725 2023-10-03 17:05:26,851 WARNING [train.py:1204] (2/4) Exclude cut with ID _14368_210_4220_1_1530532714870_3595529_144-3287-0_sp0.9 from training. Duration: 0.7444375 2023-10-03 17:05:28,173 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_162-262717-0_sp0.9 from training. Duration: 0.92225 2023-10-03 17:05:31,506 WARNING [train.py:1204] (2/4) Exclude cut with ID _3465_210_3525_1_1526175667060_3574049_62-335552-0_sp0.9 from training. Duration: 0.9666875 2023-10-03 17:05:31,511 WARNING [train.py:1204] (2/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_726-272852-0_sp1.1 from training. Duration: 0.92725 2023-10-03 17:05:32,912 WARNING [train.py:1204] (2/4) Exclude cut with ID _14898_210_9206_1_1530766812377_4177190_26-135492-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 17:05:36,867 WARNING [train.py:1204] (2/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_200-95304-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 17:05:36,914 WARNING [train.py:1204] (2/4) Exclude cut with ID _45012_210_12651_1_1533608435136_5371380_240-37101-0_sp1.1 from training. Duration: 0.8454375 2023-10-03 17:05:36,930 WARNING [train.py:1204] (2/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_132-269633-0_sp0.9 from training. Duration: 0.7555625 2023-10-03 17:05:38,709 WARNING [train.py:1204] (2/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_907-151398-0_sp0.9 from training. Duration: 0.411125 2023-10-03 17:05:40,515 WARNING [train.py:1204] (2/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_45-265684-0_sp0.9 from training. Duration: 0.8 2023-10-03 17:05:40,596 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6291_210_3273_1_1527683649_1078498_74-292608-0 from training. Duration: 0.72 2023-10-03 17:05:43,351 INFO [train.py:1046] (2/4) Epoch 38, batch 4700, loss[loss=0.1562, simple_loss=0.2394, pruned_loss=0.03649, over 23878.00 frames. ], tot_loss[loss=0.1584, simple_loss=0.2377, pruned_loss=0.03954, over 4696480.35 frames. ], batch size: 86, lr: 2.66e-03, grad_scale: 8.0 2023-10-03 17:05:47,688 WARNING [train.py:1204] (2/4) Exclude cut with ID _59202_210_26984_1_1534816810612_3549174_221-84819-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 17:05:48,867 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_33696_210_3852_1_1533082780_2663375_181-325595-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 17:05:48,919 WARNING [train.py:1204] (2/4) Exclude cut with ID _47251_210_18628_1_1533814063128_4137250_351-154105-0_sp1.1 from training. Duration: 0.963625 2023-10-03 17:05:49,002 WARNING [train.py:1204] (2/4) Exclude cut with ID _35501_210_9288_1_1532998507986_3192189_351-334740-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 17:05:51,785 WARNING [train.py:1204] (2/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_342-100157-0_sp1.1 from training. Duration: 0.4909375 2023-10-03 17:05:57,714 WARNING [train.py:1204] (2/4) Exclude cut with ID _6789_210_1534_1_1527857937218_3118950_262-48853-0 from training. Duration: 0.56 2023-10-03 17:05:57,748 WARNING [train.py:1204] (2/4) Exclude cut with ID _69884_210_9771_1_1535846392818_4166169_430-201020-0 from training. Duration: 0.97 2023-10-03 17:06:00,502 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_71-136518-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 17:06:00,584 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_28-291982-0_sp1.1 from training. Duration: 0.8818125 2023-10-03 17:06:01,871 WARNING [train.py:1204] (2/4) Exclude cut with ID _54218_210_12210_1_1534413646388_4409259_437-275326-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 17:06:03,928 WARNING [train.py:1204] (2/4) Exclude cut with ID _55134_210_21982_1_1534590028377_3882170_40-309139-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 17:06:08,917 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.out_combiner.scale_min, batch_count=1341726.6666666667, ans=0.2 2023-10-03 17:06:09,898 WARNING [train.py:1204] (2/4) Exclude cut with ID _32704_210_8954_1_1533017488840_6747370_266-329755-0_sp1.1 from training. Duration: 0.8090625 2023-10-03 17:06:11,163 WARNING [train.py:1204] (2/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_21-174514-0_sp1.1 from training. Duration: 0.3909375 2023-10-03 17:06:13,892 WARNING [train.py:1204] (2/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_409-207378-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 17:06:19,329 WARNING [train.py:1204] (2/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_132-269633-0 from training. Duration: 0.68 2023-10-03 17:06:19,431 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_2245_210_2715_1_1525511548_3540254_310-52483-0_sp0.9 from training. Duration: 0.97775 2023-10-03 17:06:22,110 WARNING [train.py:1204] (2/4) Exclude cut with ID _27430_210_7770_1_1532604689375_1378590_47-77403-0_sp1.1 from training. Duration: 0.92725 2023-10-03 17:06:26,579 WARNING [train.py:1204] (2/4) Exclude cut with ID _7562_210_2084_1_1528887767916_3946229_186-326422-0 from training. Duration: 0.84 2023-10-03 17:06:26,781 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.nonlin_attention.balancer.prob, batch_count=1341860.0, ans=0.125 2023-10-03 17:06:29,286 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_14056_210_4163_1_1530604483_7572827_230-35993-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 17:06:32,204 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_23854_210_1794_1_1531969696_1329157_57-123491-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 17:06:32,253 WARNING [train.py:1204] (2/4) Exclude cut with ID _14747_210_8341_1_1530685797463_3654089_147-131229-0 from training. Duration: 0.76 2023-10-03 17:06:34,140 WARNING [train.py:1204] (2/4) Exclude cut with ID _30191_210_1664_1_1533189915047_7196670_430-19539-0_sp1.1 from training. Duration: 0.92725 2023-10-03 17:06:34,162 WARNING [train.py:1204] (2/4) Exclude cut with ID _67725_210_27767_1_1535608756671_5105359_44-50762-0_sp1.1 from training. Duration: 0.963625 2023-10-03 17:06:36,975 WARNING [train.py:1204] (2/4) Exclude cut with ID _27485_210_4916_1_1532739191677_3668269_217-161755-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 17:06:37,009 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_5931_210_2040_1_1527428481_4547999_321-193614-0_sp1.1 from training. Duration: 0.6090625 2023-10-03 17:06:37,032 WARNING [train.py:1204] (2/4) Exclude cut with ID _6267_210_2084_1_1527764658526_3894730_375-219887-0 from training. Duration: 0.67 2023-10-03 17:06:37,196 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.ff3_skip_rate, batch_count=1341860.0, ans=0.0 2023-10-03 17:06:38,868 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9087W0022-538393 from training. Duration: 0.768 2023-10-03 17:06:40,901 WARNING [train.py:1204] (2/4) Exclude cut with ID _68777_210_4353_1_1535810390731_3117904_83-196459-0_sp1.1 from training. Duration: 0.963625 2023-10-03 17:06:42,164 WARNING [train.py:1204] (2/4) Exclude cut with ID _69884_210_9771_1_1535846392818_4166169_455-343812-0_sp1.1 from training. Duration: 0.92725 2023-10-03 17:06:42,165 WARNING [train.py:1204] (2/4) Exclude cut with ID _13916_210_9994_1_1530788341126_3763149_414-320174-0_sp1.1 from training. Duration: 0.92725 2023-10-03 17:06:42,169 WARNING [train.py:1204] (2/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_398-239819-0 from training. Duration: 0.99 2023-10-03 17:06:43,564 WARNING [train.py:1204] (2/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_199-232314-0_sp1.1 from training. Duration: 0.92725 2023-10-03 17:06:46,424 WARNING [train.py:1204] (2/4) Exclude cut with ID _2334_210_2667_1_1525608238880_4705129_459-104997-0 from training. Duration: 0.95 2023-10-03 17:06:49,128 WARNING [train.py:1204] (2/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_441-209395-0_sp1.1 from training. Duration: 0.936375 2023-10-03 17:06:50,498 WARNING [train.py:1204] (2/4) Exclude cut with ID _27052_210_5986_1_1532329197187_3585009_320-80393-0_sp1.1 from training. Duration: 0.97275 2023-10-03 17:06:53,444 WARNING [train.py:1204] (2/4) Exclude cut with ID _21783_210_11357_1_1531742458730_3762679_89-217253-0_sp1.1 from training. Duration: 0.97275 2023-10-03 17:06:55,193 WARNING [train.py:1204] (2/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_453-282588-0_sp1.1 from training. Duration: 0.8909375 2023-10-03 17:06:55,322 WARNING [train.py:1204] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_88-341718-0 from training. Duration: 0.66 2023-10-03 17:06:56,619 INFO [train.py:1046] (2/4) Epoch 38, batch 4750, loss[loss=0.1625, simple_loss=0.2423, pruned_loss=0.04138, over 23270.00 frames. ], tot_loss[loss=0.1584, simple_loss=0.2379, pruned_loss=0.03945, over 4698834.81 frames. ], batch size: 93, lr: 2.66e-03, grad_scale: 8.0 2023-10-03 17:06:56,672 WARNING [train.py:1204] (2/4) Exclude cut with ID _43552_210_8913_1_1533726031059_3523429_230-3521-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 17:06:59,411 WARNING [train.py:1204] (2/4) Exclude cut with ID _9679_210_7497_1_1529129212906_4363929_512-313889-0 from training. Duration: 0.81 2023-10-03 17:07:00,754 WARNING [train.py:1204] (2/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_194-252800-0_sp1.1 from training. Duration: 0.8818125 2023-10-03 17:07:00,781 WARNING [train.py:1204] (2/4) Exclude cut with ID _55209_210_1531_1_1534675828996_4597160_239-293541-0_sp1.1 from training. Duration: 0.963625 2023-10-03 17:07:02,211 WARNING [train.py:1204] (2/4) Exclude cut with ID _35799_210_5854_1_1533263487887_3529071_15-325131-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 17:07:05,731 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.conv_module2.balancer1.prob, batch_count=1341993.3333333333, ans=0.125 2023-10-03 17:07:09,808 WARNING [train.py:1204] (2/4) Exclude cut with ID _14100_210_8341_1_1530529320613_3678069_279-55760-0 from training. Duration: 0.93 2023-10-03 17:07:13,375 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_489-230703-0_sp1.1 from training. Duration: 0.77275 2023-10-03 17:07:14,654 WARNING [train.py:1204] (2/4) Exclude cut with ID _6694_210_4599_1_1527852637490_3965529_144-185051-0 from training. Duration: 0.96 2023-10-03 17:07:16,013 WARNING [train.py:1204] (2/4) Exclude cut with ID _28461_210_2253_1_1532771775189_3041299_1-228155-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 17:07:20,054 WARNING [train.py:1204] (2/4) Exclude cut with ID _32704_210_8954_1_1533017488840_6747370_686-252323-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 17:07:20,056 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_40896_210_15710_1_1533433956_7100802_1075-294633-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 17:07:20,070 WARNING [train.py:1204] (2/4) Exclude cut with ID _22083_210_8254_1_1531702862439_3678970_30-92879-0_sp1.1 from training. Duration: 0.97275 2023-10-03 17:07:21,466 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9245W0121-616888 from training. Duration: 0.896 2023-10-03 17:07:21,469 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6786_210_1388_1_1527839699_4194495_378-46070-0 from training. Duration: 0.72 2023-10-03 17:07:28,665 WARNING [train.py:1204] (2/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_317-175062-0 from training. Duration: 0.82 2023-10-03 17:07:31,339 WARNING [train.py:1204] (2/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_238-89390-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 17:07:32,996 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.attention_skip_rate, batch_count=1342126.6666666667, ans=0.0 2023-10-03 17:07:34,073 WARNING [train.py:1204] (2/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_524-310811-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 17:07:36,947 WARNING [train.py:1204] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_307-158462-0_sp0.9 from training. Duration: 0.7666875 2023-10-03 17:07:36,954 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9309W0191-640318 from training. Duration: 0.512 2023-10-03 17:07:36,958 WARNING [train.py:1204] (2/4) Exclude cut with ID _55153_210_2253_1_1534384786677_3162096_100-204617-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 17:07:38,496 WARNING [train.py:1204] (2/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_112-149213-0_sp1.1 from training. Duration: 0.863625 2023-10-03 17:07:39,788 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.552e+02 1.879e+02 2.097e+02 2.401e+02 3.213e+02, threshold=4.194e+02, percent-clipped=0.0 2023-10-03 17:07:41,796 WARNING [train.py:1204] (2/4) Exclude cut with ID _67831_210_12392_1_1535612464202_3092612_346-2148-0_sp1.1 from training. Duration: 0.8454375 2023-10-03 17:07:43,732 WARNING [train.py:1204] (2/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_51-117576-0_sp0.9 from training. Duration: 0.4444375 2023-10-03 17:07:43,753 WARNING [train.py:1204] (2/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_300-27235-0 from training. Duration: 0.87 2023-10-03 17:07:44,012 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.conv_module1.balancer1.prob, batch_count=1342193.3333333333, ans=0.125 2023-10-03 17:07:45,112 WARNING [train.py:1204] (2/4) Exclude cut with ID _40399_210_4353_1_1533305658194_3279406_195-116825-0_sp1.1 from training. Duration: 0.97275 2023-10-03 17:07:45,133 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7002_210_1794_1_1527939683_5157286_163-87552-0_sp0.9 from training. Duration: 0.9555625 2023-10-03 17:07:45,168 WARNING [train.py:1204] (2/4) Exclude cut with ID _19514_210_9259_1_1531530071330_5084230_560-237597-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 17:07:46,471 WARNING [train.py:1204] (2/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_41-234353-0_sp1.1 from training. Duration: 0.6181875 2023-10-03 17:07:46,502 WARNING [train.py:1204] (2/4) Exclude cut with ID _6274_210_2084_1_1527937583837_7494020_769-168302-0 from training. Duration: 0.71 2023-10-03 17:07:47,231 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.whiten.whitening_limit, batch_count=1342193.3333333333, ans=12.0 2023-10-03 17:07:49,779 WARNING [train.py:1204] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_574-45541-0 from training. Duration: 0.65 2023-10-03 17:07:51,199 WARNING [train.py:1204] (2/4) Exclude cut with ID _50310_210_15488_1_1534068043345_3641080_262-67120-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 17:07:53,928 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_53071_210_16554_1_1534493434_1276796_35-11927-0_sp1.1 from training. Duration: 0.963625 2023-10-03 17:07:53,930 WARNING [train.py:1204] (2/4) Exclude cut with ID _3992_210_1747_1_1526644816031_3731060_107-251434-0 from training. Duration: 0.99 2023-10-03 17:07:55,267 WARNING [train.py:1204] (2/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_412-54488-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 17:07:55,369 WARNING [train.py:1204] (2/4) Exclude cut with ID _18516_210_9206_1_1531704653206_3692406_77-258490-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 17:07:58,158 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_40047_210_2290_1_1533290519_2374311_195-165693-0_sp0.9 from training. Duration: 0.988875 2023-10-03 17:07:59,884 WARNING [train.py:1204] (2/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_296-36971-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 17:07:59,941 WARNING [train.py:1204] (2/4) Exclude cut with ID _14970_210_15709_1_1530773543493_1715730_187-20259-0_sp1.1 from training. Duration: 0.8090625 2023-10-03 17:08:01,516 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_53373_210_5399_1_1534229471_4715695_135-305927-0_sp1.1 from training. Duration: 0.936375 2023-10-03 17:08:01,542 WARNING [train.py:1204] (2/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_60-18123-0 from training. Duration: 0.88 2023-10-03 17:08:02,833 WARNING [train.py:1204] (2/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_166-166990-0 from training. Duration: 0.96 2023-10-03 17:08:04,181 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_5438_210_1794_1_1527336687_2600009_233-58072-0 from training. Duration: 0.77 2023-10-03 17:08:06,918 WARNING [train.py:1204] (2/4) Exclude cut with ID _8833_210_5219_1_1528592665210_7553059_611-80313-0_sp0.9 from training. Duration: 0.711125 2023-10-03 17:08:06,949 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_53555_210_5632_1_1534595220_7232297_118-43239-0_sp1.1 from training. Duration: 0.936375 2023-10-03 17:08:07,031 WARNING [train.py:1204] (2/4) Exclude cut with ID _12369_210_4381_1_1530356078581_4388520_480-13146-0 from training. Duration: 0.85 2023-10-03 17:08:10,171 INFO [train.py:1046] (2/4) Epoch 38, batch 4800, loss[loss=0.1728, simple_loss=0.2465, pruned_loss=0.04955, over 22869.00 frames. ], tot_loss[loss=0.1593, simple_loss=0.2394, pruned_loss=0.03959, over 4704599.19 frames. ], batch size: 322, lr: 2.66e-03, grad_scale: 16.0 2023-10-03 17:08:13,451 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_892-28814-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 17:08:13,492 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_24899_210_16554_1_1532046422_3827462_160-21255-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 17:08:18,265 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_154-316450-0_sp1.1 from training. Duration: 0.6909375 2023-10-03 17:08:18,452 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.feed_forward2.hidden_balancer.prob, batch_count=1342326.6666666667, ans=0.125 2023-10-03 17:08:18,453 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.bypass_mid.scale_min, batch_count=1342326.6666666667, ans=0.2 2023-10-03 17:08:19,664 WARNING [train.py:1204] (2/4) Exclude cut with ID _30196_210_1664_1_1533708496522_7074140_1225-301232-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 17:08:21,011 WARNING [train.py:1204] (2/4) Exclude cut with ID _28783_210_7943_1_1532671164808_7155479_414-208100-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 17:08:21,071 WARNING [train.py:1204] (2/4) Exclude cut with ID _19023_210_4353_1_1531378892552_5820780_83-254794-0 from training. Duration: 0.86 2023-10-03 17:08:22,526 WARNING [train.py:1204] (2/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_591-83752-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 17:08:22,563 WARNING [train.py:1204] (2/4) Exclude cut with ID _29877_210_6228_1_1532397791106_5276149_266-62291-0_sp1.1 from training. Duration: 0.7909375 2023-10-03 17:08:22,675 WARNING [train.py:1204] (2/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_280-161633-0_sp1.1 from training. Duration: 0.663625 2023-10-03 17:08:26,687 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7543_210_6753_1_1528539468_333764_39-156634-0_sp1.1 from training. Duration: 0.97275 2023-10-03 17:08:28,650 WARNING [train.py:1204] (2/4) Exclude cut with ID _67499_210_17282_1_1535509805220_3692430_505-312248-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 17:08:28,827 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.bypass.skip_rate, batch_count=1342393.3333333333, ans=0.07 2023-10-03 17:08:29,913 WARNING [train.py:1204] (2/4) Exclude cut with ID _5294_210_1747_1_1527249643928_3696179_180-100713-0_sp0.9 from training. Duration: 0.9 2023-10-03 17:08:30,004 WARNING [train.py:1204] (2/4) Exclude cut with ID _26528_210_9070_1_1532570410842_3851310_508-148132-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 17:08:30,013 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_211-339622-0_sp1.1 from training. Duration: 0.4090625 2023-10-03 17:08:30,033 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_40896_210_15710_1_1533433956_7100802_339-138356-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 17:08:31,414 WARNING [train.py:1204] (2/4) Exclude cut with ID _27060_210_5986_1_1532674838922_3522549_211-19933-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 17:08:34,106 WARNING [train.py:1204] (2/4) Exclude cut with ID _60229_210_27767_1_1534838406158_3655350_457-117923-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 17:08:35,580 WARNING [train.py:1204] (2/4) Exclude cut with ID _55429_210_10626_1_1534467588527_7174143_460-141439-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 17:08:36,948 WARNING [train.py:1204] (2/4) Exclude cut with ID _59170_210_6721_1_1534983633960_4203335_263-10780-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 17:08:38,261 WARNING [train.py:1204] (2/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_872-264525-0_sp0.9 from training. Duration: 0.72225 2023-10-03 17:08:39,725 WARNING [train.py:1204] (2/4) Exclude cut with ID _7624_210_1531_1_1528109802198_4933101_418-349834-0_sp0.9 from training. Duration: 0.6666875 2023-10-03 17:08:41,732 WARNING [train.py:1204] (2/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_27-53998-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 17:08:43,133 WARNING [train.py:1204] (2/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_202-183922-0 from training. Duration: 0.85 2023-10-03 17:08:43,145 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7002_210_1794_1_1527939683_5157286_1-281264-0 from training. Duration: 0.44 2023-10-03 17:08:43,219 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_53753_210_7234_1_1534320003_3594148_493-9314-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 17:08:43,232 WARNING [train.py:1204] (2/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_217-95056-0_sp1.1 from training. Duration: 0.8909375 2023-10-03 17:08:43,277 WARNING [train.py:1204] (2/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_406-45086-0_sp1.1 from training. Duration: 0.72725 2023-10-03 17:08:43,283 WARNING [train.py:1204] (2/4) Exclude cut with ID _31649_210_6228_1_1532584608063_4091078_505-213821-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 17:08:43,456 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.feed_forward3.hidden_balancer.prob, batch_count=1342460.0, ans=0.125 2023-10-03 17:08:44,571 WARNING [train.py:1204] (2/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_317-175062-0_sp0.9 from training. Duration: 0.911125 2023-10-03 17:08:46,748 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.ff3_skip_rate, batch_count=1342460.0, ans=0.0 2023-10-03 17:08:47,808 WARNING [train.py:1204] (2/4) Exclude cut with ID _5444_210_2151_1_1527418808464_5312210_726-27165-0_sp1.1 from training. Duration: 0.6090625 2023-10-03 17:08:47,857 WARNING [train.py:1204] (2/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_494-170437-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 17:08:50,638 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_474-64054-0_sp1.1 from training. Duration: 0.936375 2023-10-03 17:08:53,416 WARNING [train.py:1204] (2/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_521-7556-0_sp1.1 from training. Duration: 0.97275 2023-10-03 17:08:54,775 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_269-348505-0_sp1.1 from training. Duration: 0.92725 2023-10-03 17:09:00,808 WARNING [train.py:1204] (2/4) Exclude cut with ID _21878_210_3525_1_1531738772002_7264330_559-184862-0 from training. Duration: 0.94 2023-10-03 17:09:02,149 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_68702_210_23689_1_1535799577_3420068_92-76040-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 17:09:02,196 WARNING [train.py:1204] (2/4) Exclude cut with ID _22826_210_9994_1_1531908201522_3279169_227-102052-0_sp1.1 from training. Duration: 0.97275 2023-10-03 17:09:02,234 WARNING [train.py:1204] (2/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_718-250907-0_sp0.9 from training. Duration: 0.7666875 2023-10-03 17:09:03,576 WARNING [train.py:1204] (2/4) Exclude cut with ID _14069_210_5632_1_1530512692033_4803390_352-143891-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 17:09:07,690 WARNING [train.py:1204] (2/4) Exclude cut with ID _6267_210_2084_1_1527764658526_3894730_65-106602-0_sp1.1 from training. Duration: 0.936375 2023-10-03 17:09:07,920 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.out_combiner.scale_min, batch_count=1342593.3333333333, ans=0.2 2023-10-03 17:09:09,006 WARNING [train.py:1204] (2/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_202-183922-0_sp0.9 from training. Duration: 0.9444375 2023-10-03 17:09:09,020 WARNING [train.py:1204] (2/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_465-268827-0_sp1.1 from training. Duration: 0.97275 2023-10-03 17:09:09,087 WARNING [train.py:1204] (2/4) Exclude cut with ID _12369_210_4381_1_1530356078581_4388520_58-29064-0_sp1.1 from training. Duration: 0.9 2023-10-03 17:09:10,390 WARNING [train.py:1204] (2/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_1-99680-0_sp0.9 from training. Duration: 0.7444375 2023-10-03 17:09:10,444 WARNING [train.py:1204] (2/4) Exclude cut with ID _13253_210_1925_1_1530757775819_3577873_191-144977-0_sp1.1 from training. Duration: 0.7090625 2023-10-03 17:09:15,582 WARNING [train.py:1204] (2/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_276-112684-0_sp1.1 from training. Duration: 0.92725 2023-10-03 17:09:15,591 WARNING [train.py:1204] (2/4) Exclude cut with ID _64639_210_5816_1_1535367619437_3528000_358-411-0_sp1.1 from training. Duration: 0.97275 2023-10-03 17:09:15,604 WARNING [train.py:1204] (2/4) Exclude cut with ID _31649_210_6228_1_1532584608063_4091078_106-21199-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 17:09:18,206 WARNING [train.py:1204] (2/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_901-79438-0 from training. Duration: 0.94 2023-10-03 17:09:20,299 WARNING [train.py:1204] (2/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_718-250907-0 from training. Duration: 0.69 2023-10-03 17:09:20,312 WARNING [train.py:1204] (2/4) Exclude cut with ID _5287_210_2084_1_1527418883040_3878670_363-194161-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 17:09:20,315 WARNING [train.py:1204] (2/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_586-321321-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 17:09:21,707 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_68900_210_2087_1_1535713157_3708529_164-292661-0_sp1.1 from training. Duration: 0.963625 2023-10-03 17:09:21,708 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_26433_210_12108_1_1532157158_4056480_114-144878-0_sp1.1 from training. Duration: 0.97275 2023-10-03 17:09:21,876 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.attention_skip_rate, batch_count=1342593.3333333333, ans=0.0 2023-10-03 17:09:24,432 INFO [train.py:1046] (2/4) Epoch 38, batch 4850, loss[loss=0.1532, simple_loss=0.2303, pruned_loss=0.03807, over 20367.00 frames. ], tot_loss[loss=0.1591, simple_loss=0.2394, pruned_loss=0.03935, over 4707705.51 frames. ], batch size: 44, lr: 2.66e-03, grad_scale: 16.0 2023-10-03 17:09:24,511 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_15517_210_6753_1_1530954008_3487336_96-341874-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 17:09:31,770 WARNING [train.py:1204] (2/4) Exclude cut with ID _14268_210_9206_1_1530622771285_4714099_637-86630-0 from training. Duration: 0.51 2023-10-03 17:09:33,792 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_28211_210_6388_1_1532861506_4523224_121-320092-0_sp1.1 from training. Duration: 0.92725 2023-10-03 17:09:37,984 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_141-89113-0_sp0.9 from training. Duration: 0.9555625 2023-10-03 17:09:39,349 WARNING [train.py:1204] (2/4) Exclude cut with ID _6789_210_1534_1_1527857937218_3118950_311-61262-0_sp1.1 from training. Duration: 0.5909375 2023-10-03 17:09:39,480 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.1.balancer2.prob, batch_count=1342726.6666666667, ans=0.125 2023-10-03 17:09:40,611 WARNING [train.py:1204] (2/4) Exclude cut with ID _58480_210_12392_1_1534811121111_3646160_389-200461-0_sp1.1 from training. Duration: 0.97275 2023-10-03 17:09:43,313 WARNING [train.py:1204] (2/4) Exclude cut with ID _41326_210_5854_1_1533778076509_3492080_28-156553-0_sp1.1 from training. Duration: 0.92725 2023-10-03 17:09:44,686 WARNING [train.py:1204] (2/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_577-287150-0_sp1.1 from training. Duration: 0.7454375 2023-10-03 17:09:46,408 WARNING [train.py:1204] (2/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_40-129756-0_sp1.1 from training. Duration: 0.736375 2023-10-03 17:09:46,424 WARNING [train.py:1204] (2/4) Exclude cut with ID _60113_210_16271_1_1535164212481_4386079_490-280290-0 from training. Duration: 0.98 2023-10-03 17:09:52,313 WARNING [train.py:1204] (2/4) Exclude cut with ID _58059_210_5854_1_1534901471149_3164808_34-39905-0_sp1.1 from training. Duration: 0.963625 2023-10-03 17:09:52,569 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.0.conv_module2.balancer1.prob, batch_count=1342793.3333333333, ans=0.125 2023-10-03 17:09:53,720 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_791-197891-0_sp1.1 from training. Duration: 0.763625 2023-10-03 17:09:53,735 WARNING [train.py:1204] (2/4) Exclude cut with ID _6789_210_1534_1_1527857937218_3118950_262-48853-0_sp1.1 from training. Duration: 0.5090625 2023-10-03 17:09:55,080 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_2655_210_2087_1_1525688959_3897400_52-279899-0_sp0.9 from training. Duration: 0.7666875 2023-10-03 17:09:55,084 WARNING [train.py:1204] (2/4) Exclude cut with ID _14884_210_2252_1_1530707459689_5561250_114-201399-0 from training. Duration: 0.76 2023-10-03 17:09:56,996 WARNING [train.py:1204] (2/4) Exclude cut with ID _19023_210_4353_1_1531378892552_5820780_83-254794-0_sp0.9 from training. Duration: 0.9555625 2023-10-03 17:09:57,016 WARNING [train.py:1204] (2/4) Exclude cut with ID _53489_210_6228_1_1534298357169_3835970_10-297919-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 17:10:02,570 WARNING [train.py:1204] (2/4) Exclude cut with ID _44585_210_18628_1_1533605350387_4518199_418-3055-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 17:10:02,589 WARNING [train.py:1204] (2/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_310-109461-0 from training. Duration: 0.8 2023-10-03 17:10:02,631 WARNING [train.py:1204] (2/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_235-262124-0 from training. Duration: 0.67 2023-10-03 17:10:03,973 WARNING [train.py:1204] (2/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_195-167011-0_sp1.1 from training. Duration: 0.6818125 2023-10-03 17:10:08,522 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.561e+02 1.910e+02 2.157e+02 2.584e+02 3.262e+02, threshold=4.314e+02, percent-clipped=0.0 2023-10-03 17:10:11,518 WARNING [train.py:1204] (2/4) Exclude cut with ID _7852_210_5219_1_1528286471316_8139400_188-55162-0_sp1.1 from training. Duration: 0.936375 2023-10-03 17:10:12,933 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_35361_210_4238_1_1533262810_7793476_675-87012-0 from training. Duration: 0.89 2023-10-03 17:10:14,298 WARNING [train.py:1204] (2/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_465-81427-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 17:10:14,310 WARNING [train.py:1204] (2/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_597-154640-0_sp1.1 from training. Duration: 0.8181875 2023-10-03 17:10:15,694 WARNING [train.py:1204] (2/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_186-309161-0_sp0.9 from training. Duration: 0.6 2023-10-03 17:10:17,484 WARNING [train.py:1204] (2/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_546-119312-0 from training. Duration: 0.99 2023-10-03 17:10:17,488 WARNING [train.py:1204] (2/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_330-240060-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 17:10:17,650 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.nonlin_attention.balancer.prob, batch_count=1342860.0, ans=0.125 2023-10-03 17:10:20,045 WARNING [train.py:1204] (2/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_477-255782-0 from training. Duration: 0.82 2023-10-03 17:10:20,073 WARNING [train.py:1204] (2/4) Exclude cut with ID _14970_210_15709_1_1530773543493_1715730_150-248939-0_sp1.1 from training. Duration: 0.97275 2023-10-03 17:10:20,164 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_556-206615-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 17:10:21,504 WARNING [train.py:1204] (2/4) Exclude cut with ID _3465_210_3525_1_1526175667060_3574049_308-231735-0 from training. Duration: 0.73 2023-10-03 17:10:29,539 WARNING [train.py:1204] (2/4) Exclude cut with ID _27485_210_4916_1_1532739191677_3668269_66-236571-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 17:10:34,984 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_2221_210_1988_1_1525597051_4935541_415-296882-0_sp0.9 from training. Duration: 0.8555625 2023-10-03 17:10:34,992 WARNING [train.py:1204] (2/4) Exclude cut with ID _64521_210_14699_1_1535243427259_3815328_231-78630-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 17:10:38,178 INFO [train.py:1046] (2/4) Epoch 38, batch 4900, loss[loss=0.161, simple_loss=0.2351, pruned_loss=0.04345, over 23865.00 frames. ], tot_loss[loss=0.158, simple_loss=0.2382, pruned_loss=0.03886, over 4714802.19 frames. ], batch size: 195, lr: 2.66e-03, grad_scale: 16.0 2023-10-03 17:10:41,018 WARNING [train.py:1204] (2/4) Exclude cut with ID _13942_210_9988_1_1530604906036_3733360_499-55676-0 from training. Duration: 0.62 2023-10-03 17:10:41,020 WARNING [train.py:1204] (2/4) Exclude cut with ID _49376_210_5986_1_1533985278732_3370740_301-154763-0_sp1.1 from training. Duration: 0.8909375 2023-10-03 17:10:45,262 WARNING [train.py:1204] (2/4) Exclude cut with ID _27485_210_4916_1_1532739191677_3668269_255-216698-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 17:10:47,171 WARNING [train.py:1204] (2/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_137-334143-0_sp1.1 from training. Duration: 0.97275 2023-10-03 17:10:47,190 WARNING [train.py:1204] (2/4) Exclude cut with ID _6456_210_1534_1_1527681634212_3181049_215-309359-0_sp1.1 from training. Duration: 0.8 2023-10-03 17:10:49,859 WARNING [train.py:1204] (2/4) Exclude cut with ID _65640_210_6310_1_1535368201445_3173250_209-13008-0 from training. Duration: 0.99 2023-10-03 17:10:52,743 WARNING [train.py:1204] (2/4) Exclude cut with ID _12369_210_4381_1_1530356078581_4388520_58-29064-0 from training. Duration: 0.99 2023-10-03 17:10:57,649 WARNING [train.py:1204] (2/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_410-287857-0 from training. Duration: 0.93 2023-10-03 17:10:57,899 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.balancer1.prob, batch_count=1343060.0, ans=0.125 2023-10-03 17:10:58,937 WARNING [train.py:1204] (2/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_153-153537-0 from training. Duration: 0.91 2023-10-03 17:10:58,982 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7544_210_6753_1_1528587722_3576686_172-171750-0_sp0.9 from training. Duration: 0.72225 2023-10-03 17:10:59,014 WARNING [train.py:1204] (2/4) Exclude cut with ID _64521_210_14699_1_1535243427259_3815328_464-183853-0_sp1.1 from training. Duration: 0.97275 2023-10-03 17:10:59,027 WARNING [train.py:1204] (2/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_385-210308-0_sp1.1 from training. Duration: 0.963625 2023-10-03 17:10:59,035 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_46013_210_7234_1_1533776362_3744032_354-261865-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 17:10:59,049 WARNING [train.py:1204] (2/4) Exclude cut with ID _3067_210_2667_1_1526212974640_4503168_250-285937-0_sp1.1 from training. Duration: 0.72725 2023-10-03 17:11:00,391 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_40036_210_13405_1_1533648647_1315388_48-146016-0 from training. Duration: 0.93 2023-10-03 17:11:03,181 WARNING [train.py:1204] (2/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_280-161633-0 from training. Duration: 0.73 2023-10-03 17:11:03,225 WARNING [train.py:1204] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_308-161067-0_sp0.9 from training. Duration: 0.7333125 2023-10-03 17:11:04,573 WARNING [train.py:1204] (2/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_68-212801-0_sp0.9 from training. Duration: 0.811125 2023-10-03 17:11:05,902 WARNING [train.py:1204] (2/4) Exclude cut with ID _6789_210_1534_1_1527857937218_3118950_311-61262-0_sp0.9 from training. Duration: 0.72225 2023-10-03 17:11:07,392 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.nonlin_attention.balancer.prob, batch_count=1343126.6666666667, ans=0.125 2023-10-03 17:11:10,719 WARNING [train.py:1204] (2/4) Exclude cut with ID _14632_210_9988_1_1530691279392_3923200_102-204477-0_sp1.1 from training. Duration: 0.8818125 2023-10-03 17:11:10,777 WARNING [train.py:1204] (2/4) Exclude cut with ID _65242_210_18628_1_1535369393717_3823149_290-117517-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 17:11:12,186 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_69630_210_7234_1_1535788670_910989_35-337140-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 17:11:12,195 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_5618_210_1874_1_1527311008_5851_1-214501-0 from training. Duration: 0.689 2023-10-03 17:11:13,672 WARNING [train.py:1204] (2/4) Exclude cut with ID _8064_210_5399_1_1528372299291_4282973_168-50337-0_sp0.9 from training. Duration: 0.9333125 2023-10-03 17:11:15,060 WARNING [train.py:1204] (2/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_185-14438-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 17:11:15,083 WARNING [train.py:1204] (2/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_61-327062-0 from training. Duration: 0.81 2023-10-03 17:11:15,094 WARNING [train.py:1204] (2/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_98-741-0 from training. Duration: 0.8 2023-10-03 17:11:18,423 WARNING [train.py:1204] (2/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_312-204798-0 from training. Duration: 0.93 2023-10-03 17:11:19,856 WARNING [train.py:1204] (2/4) Exclude cut with ID _14998_210_5219_1_1530705582080_7474970_787-294416-0_sp0.9 from training. Duration: 0.911125 2023-10-03 17:11:21,221 WARNING [train.py:1204] (2/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_140-181176-0_sp1.1 from training. Duration: 0.836375 2023-10-03 17:11:21,263 WARNING [train.py:1204] (2/4) Exclude cut with ID _14687_210_5854_1_1530703838259_3664889_115-239046-0_sp1.1 from training. Duration: 0.6090625 2023-10-03 17:11:21,305 WARNING [train.py:1204] (2/4) Exclude cut with ID _56423_210_5854_1_1534896061049_3165969_370-230614-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 17:11:23,049 WARNING [train.py:1204] (2/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_406-275649-0_sp0.9 from training. Duration: 0.4666875 2023-10-03 17:11:24,346 WARNING [train.py:1204] (2/4) Exclude cut with ID _15150_210_8341_1_1530752411803_7256384_22-138274-0_sp1.1 from training. Duration: 0.87275 2023-10-03 17:11:24,389 WARNING [train.py:1204] (2/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_125-125818-0 from training. Duration: 0.96 2023-10-03 17:11:27,812 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_4399_210_1874_1_1526705485_2400757_220-47974-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 17:11:27,944 WARNING [train.py:1204] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_308-161067-0_sp1.1 from training. Duration: 0.6 2023-10-03 17:11:29,329 WARNING [train.py:1204] (2/4) Exclude cut with ID _53489_210_6228_1_1534298357169_3835970_102-57645-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 17:11:33,446 WARNING [train.py:1204] (2/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_178-177582-0 from training. Duration: 0.74 2023-10-03 17:11:34,780 WARNING [train.py:1204] (2/4) Exclude cut with ID _14349_210_8341_1_1530615710818_3442470_170-336541-0_sp1.1 from training. Duration: 0.7818125 2023-10-03 17:11:34,839 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_521-214276-0_sp0.9 from training. Duration: 0.488875 2023-10-03 17:11:34,883 WARNING [train.py:1204] (2/4) Exclude cut with ID _66070_210_7640_1_1535441393096_7805310_489-292631-0 from training. Duration: 0.79 2023-10-03 17:11:43,674 WARNING [train.py:1204] (2/4) Exclude cut with ID _14069_210_5632_1_1530512692033_4803390_438-340023-0_sp1.1 from training. Duration: 0.936375 2023-10-03 17:11:43,751 WARNING [train.py:1204] (2/4) Exclude cut with ID _7852_210_5219_1_1528286471316_8139400_242-105336-0_sp1.1 from training. Duration: 0.6545625 2023-10-03 17:11:45,162 WARNING [train.py:1204] (2/4) Exclude cut with ID _3867_210_1883_1_1526641200575_4475830_272-215563-0 from training. Duration: 0.57 2023-10-03 17:11:45,177 WARNING [train.py:1204] (2/4) Exclude cut with ID _14368_210_4220_1_1530532714870_3595529_143-117421-0_sp0.9 from training. Duration: 0.6444375 2023-10-03 17:11:45,184 WARNING [train.py:1204] (2/4) Exclude cut with ID _9235_210_5219_1_1528891154253_8046120_30-326681-0_sp1.1 from training. Duration: 0.7545625 2023-10-03 17:11:47,006 WARNING [train.py:1204] (2/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_538-218448-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 17:11:49,859 WARNING [train.py:1204] (2/4) Exclude cut with ID _33640_210_2655_1_1533117605479_4855469_60-192970-0_sp1.1 from training. Duration: 0.963625 2023-10-03 17:11:49,866 WARNING [train.py:1204] (2/4) Exclude cut with ID _65585_210_12210_1_1535450451584_3764098_112-294301-0_sp1.1 from training. Duration: 0.8 2023-10-03 17:11:51,108 WARNING [train.py:1204] (2/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_643-37412-0_sp1.1 from training. Duration: 0.936375 2023-10-03 17:11:51,125 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_9302_210_2550_1_1528889962_4655416_638-301620-0_sp1.1 from training. Duration: 0.42725 2023-10-03 17:11:52,777 INFO [train.py:1046] (2/4) Epoch 38, batch 4950, loss[loss=0.1498, simple_loss=0.2412, pruned_loss=0.02922, over 24441.00 frames. ], tot_loss[loss=0.157, simple_loss=0.2367, pruned_loss=0.03862, over 4711804.45 frames. ], batch size: 69, lr: 2.66e-03, grad_scale: 16.0 2023-10-03 17:11:52,857 WARNING [train.py:1204] (2/4) Exclude cut with ID _3867_210_1883_1_1526641200575_4475830_272-215563-0_sp1.1 from training. Duration: 0.5181875 2023-10-03 17:11:55,076 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.ff2_skip_rate, batch_count=1343326.6666666667, ans=0.0 2023-10-03 17:11:56,237 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_53679_210_4852_1_1534556726_4186141_358-154143-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 17:11:56,248 WARNING [train.py:1204] (2/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_841-341679-0_sp0.9 from training. Duration: 0.6444375 2023-10-03 17:11:59,043 WARNING [train.py:1204] (2/4) Exclude cut with ID _7160_210_1876_1_1527940923231_3703799_302-150728-0 from training. Duration: 0.78 2023-10-03 17:12:00,318 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_125-203105-0 from training. Duration: 0.8 2023-10-03 17:12:00,340 WARNING [train.py:1204] (2/4) Exclude cut with ID _5585_210_2667_1_1527418813324_8007340_89-332072-0_sp0.9 from training. Duration: 0.711125 2023-10-03 17:12:01,646 WARNING [train.py:1204] (2/4) Exclude cut with ID _3992_210_1747_1_1526644816031_3731060_36-147855-0 from training. Duration: 0.78 2023-10-03 17:12:01,666 WARNING [train.py:1204] (2/4) Exclude cut with ID _59256_210_5986_1_1535076085997_3589120_172-239045-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 17:12:01,674 WARNING [train.py:1204] (2/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_472-216663-0_sp1.1 from training. Duration: 0.836375 2023-10-03 17:12:01,741 WARNING [train.py:1204] (2/4) Exclude cut with ID _36512_210_6987_1_1533033143998_3126069_181-306500-0_sp1.1 from training. Duration: 0.863625 2023-10-03 17:12:03,064 WARNING [train.py:1204] (2/4) Exclude cut with ID _56524_210_9642_1_1534572011779_3619170_492-95008-0_sp1.1 from training. Duration: 0.97275 2023-10-03 17:12:04,487 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_33348_210_5632_1_1533086912_3324030_119-90326-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 17:12:04,522 WARNING [train.py:1204] (2/4) Exclude cut with ID _14368_210_4220_1_1530532714870_3595529_231-137892-0_sp1.1 from training. Duration: 0.9 2023-10-03 17:12:05,961 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8453_210_4211_1_1528502940_8562730_596-306820-0_sp1.1 from training. Duration: 0.8818125 2023-10-03 17:12:08,609 WARNING [train.py:1204] (2/4) Exclude cut with ID _27052_210_5986_1_1532329197187_3585009_50-242750-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 17:12:10,130 WARNING [train.py:1204] (2/4) Exclude cut with ID _13913_210_9994_1_1530579615187_3383897_358-98435-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 17:12:10,143 WARNING [train.py:1204] (2/4) Exclude cut with ID _27013_210_1389_1_1532437173240_3252649_399-229778-0_sp1.1 from training. Duration: 0.963625 2023-10-03 17:12:13,368 WARNING [train.py:1204] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_185-174805-0_sp0.9 from training. Duration: 0.7555625 2023-10-03 17:12:18,100 WARNING [train.py:1204] (2/4) Exclude cut with ID _65585_210_12210_1_1535450451584_3764098_196-221014-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 17:12:18,219 WARNING [train.py:1204] (2/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_202-293523-0_sp1.1 from training. Duration: 0.6545625 2023-10-03 17:12:19,575 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8275_210_4198_1_1528455320_3892571_213-127634-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 17:12:20,985 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_40896_210_15710_1_1533433956_7100802_921-231678-0_sp1.1 from training. Duration: 0.97275 2023-10-03 17:12:22,679 WARNING [train.py:1204] (2/4) Exclude cut with ID _38020_210_5986_1_1533112252259_7006089_493-102745-0_sp1.1 from training. Duration: 0.8909375 2023-10-03 17:12:24,646 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6846_210_6753_1_1527850767_4295158_2-315073-0 from training. Duration: 0.88 2023-10-03 17:12:24,709 WARNING [train.py:1204] (2/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_249-281349-0 from training. Duration: 0.85 2023-10-03 17:12:26,192 WARNING [train.py:1204] (2/4) Exclude cut with ID _18513_210_9206_1_1531618215223_3862620_314-83429-0_sp1.1 from training. Duration: 0.97275 2023-10-03 17:12:28,772 WARNING [train.py:1204] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_215-202973-0_sp0.9 from training. Duration: 0.97775 2023-10-03 17:12:28,782 WARNING [train.py:1204] (2/4) Exclude cut with ID _7719_210_3525_1_1528456153163_4296960_88-250259-0_sp1.1 from training. Duration: 0.763625 2023-10-03 17:12:30,172 WARNING [train.py:1204] (2/4) Exclude cut with ID _7606_210_4915_1_1528894253277_5080690_301-10732-0_sp1.1 from training. Duration: 0.736375 2023-10-03 17:12:30,181 WARNING [train.py:1204] (2/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_818-11907-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 17:12:32,786 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_5959_210_1794_1_1527400400_3445726_412-44052-0_sp1.1 from training. Duration: 0.7 2023-10-03 17:12:34,184 WARNING [train.py:1204] (2/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_6-279195-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 17:12:37,003 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.566e+02 1.915e+02 2.064e+02 2.439e+02 4.072e+02, threshold=4.129e+02, percent-clipped=0.0 2023-10-03 17:12:37,084 WARNING [train.py:1204] (2/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_252-216674-0_sp0.9 from training. Duration: 0.6 2023-10-03 17:12:38,560 WARNING [train.py:1204] (2/4) Exclude cut with ID _14368_210_4220_1_1530532714870_3595529_144-3287-0_sp1.1 from training. Duration: 0.6090625 2023-10-03 17:12:38,674 WARNING [train.py:1204] (2/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_342-3879-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 17:12:40,038 WARNING [train.py:1204] (2/4) Exclude cut with ID _33640_210_2655_1_1533117605479_4855469_584-65630-0_sp1.1 from training. Duration: 0.97275 2023-10-03 17:12:41,375 WARNING [train.py:1204] (2/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_37-189262-0 from training. Duration: 0.98 2023-10-03 17:12:41,404 WARNING [train.py:1204] (2/4) Exclude cut with ID _9389_210_1664_1_1529217337301_5289269_379-128794-0_sp0.9 from training. Duration: 0.9444375 2023-10-03 17:12:41,539 WARNING [train.py:1204] (2/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_119-207162-0_sp1.1 from training. Duration: 0.6454375 2023-10-03 17:12:45,053 WARNING [train.py:1204] (2/4) Exclude cut with ID _2941_210_2577_1_1526199673150_2489759_200-333643-0_sp1.1 from training. Duration: 0.936375 2023-10-03 17:12:46,329 WARNING [train.py:1204] (2/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_479-125611-0_sp1.1 from training. Duration: 0.87275 2023-10-03 17:12:46,339 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_3475_210_2028_1_1526193578_3622569_64-297072-0_sp1.1 from training. Duration: 0.82725 2023-10-03 17:12:46,378 WARNING [train.py:1204] (2/4) Exclude cut with ID _53565_210_10635_1_1534337218335_3820259_139-346291-0_sp1.1 from training. Duration: 0.97275 2023-10-03 17:12:48,277 WARNING [train.py:1204] (2/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_588-125165-0_sp1.1 from training. Duration: 0.7545625 2023-10-03 17:12:48,345 WARNING [train.py:1204] (2/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_938-205135-0_sp1.1 from training. Duration: 0.7909375 2023-10-03 17:12:51,481 WARNING [train.py:1204] (2/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_216-48707-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 17:12:51,548 WARNING [train.py:1204] (2/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_477-255782-0_sp1.1 from training. Duration: 0.7454375 2023-10-03 17:12:53,420 WARNING [train.py:1204] (2/4) Exclude cut with ID _16909_210_5724_1_1531292079912_4118919_583-92842-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 17:12:54,844 WARNING [train.py:1204] (2/4) Exclude cut with ID _60860_210_8254_1_1534896015875_3557169_245-256666-0 from training. Duration: 0.97 2023-10-03 17:12:57,779 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_45399_210_4852_1_1533624725_7579737_349-116173-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 17:12:59,379 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.bypass.scale_min, batch_count=1343593.3333333333, ans=0.2 2023-10-03 17:13:03,274 WARNING [train.py:1204] (2/4) Exclude cut with ID _8699_210_2069_1_1528630346831_3960409_155-39766-0 from training. Duration: 0.78 2023-10-03 17:13:03,283 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_86-16881-0_sp1.1 from training. Duration: 0.52725 2023-10-03 17:13:07,165 INFO [train.py:1046] (2/4) Epoch 38, batch 5000, loss[loss=0.1381, simple_loss=0.2185, pruned_loss=0.02879, over 24453.00 frames. ], tot_loss[loss=0.1566, simple_loss=0.2361, pruned_loss=0.03857, over 4705979.62 frames. ], batch size: 58, lr: 2.66e-03, grad_scale: 16.0 2023-10-03 17:13:11,479 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_14056_210_4163_1_1530604483_7572827_191-159384-0_sp1.1 from training. Duration: 0.97275 2023-10-03 17:13:11,487 WARNING [train.py:1204] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_307-158462-0_sp1.1 from training. Duration: 0.62725 2023-10-03 17:13:11,590 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7544_210_6753_1_1528591327_1471426_33-346031-0 from training. Duration: 0.61 2023-10-03 17:13:14,238 WARNING [train.py:1204] (2/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_112-149213-0 from training. Duration: 0.95 2023-10-03 17:13:17,010 WARNING [train.py:1204] (2/4) Exclude cut with ID _55844_210_2346_1_1534471212569_7625707_925-191342-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 17:13:18,791 WARNING [train.py:1204] (2/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_87-278686-0 from training. Duration: 0.77 2023-10-03 17:13:18,812 WARNING [train.py:1204] (2/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_415-78792-0_sp1.1 from training. Duration: 0.736375 2023-10-03 17:13:18,822 WARNING [train.py:1204] (2/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_107-332398-0_sp0.9 from training. Duration: 0.8666875 2023-10-03 17:13:19,024 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.conv_skip_rate, batch_count=1343660.0, ans=0.0 2023-10-03 17:13:20,808 WARNING [train.py:1204] (2/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_72-251255-0 from training. Duration: 0.94 2023-10-03 17:13:22,137 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_431-227273-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 17:13:22,200 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7544_210_6753_1_1528587000_691468_87-178599-0_sp0.9 from training. Duration: 0.9666875 2023-10-03 17:13:23,585 WARNING [train.py:1204] (2/4) Exclude cut with ID _13916_210_9994_1_1530788341126_3763149_60-191556-0 from training. Duration: 0.61 2023-10-03 17:13:23,587 WARNING [train.py:1204] (2/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_746-164782-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 17:13:24,847 WARNING [train.py:1204] (2/4) Exclude cut with ID _33640_210_2655_1_1533117605479_4855469_596-21066-0_sp1.1 from training. Duration: 0.92725 2023-10-03 17:13:24,967 WARNING [train.py:1204] (2/4) Exclude cut with ID _40402_210_18219_1_1533522324115_2953150_320-14078-0 from training. Duration: 0.95 2023-10-03 17:13:26,856 WARNING [train.py:1204] (2/4) Exclude cut with ID _29877_210_6228_1_1532397791106_5276149_266-62291-0 from training. Duration: 0.87 2023-10-03 17:13:28,185 WARNING [train.py:1204] (2/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_176-56120-0_sp0.9 from training. Duration: 0.92225 2023-10-03 17:13:28,240 WARNING [train.py:1204] (2/4) Exclude cut with ID _6275_210_2084_1_1528110062213_7669049_91-26919-0 from training. Duration: 0.97 2023-10-03 17:13:28,248 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_224-322394-0_sp0.9 from training. Duration: 0.7333125 2023-10-03 17:13:28,462 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.balancer1.prob, batch_count=1343726.6666666667, ans=0.125 2023-10-03 17:13:29,609 WARNING [train.py:1204] (2/4) Exclude cut with ID _44982_210_1511_1_1534312820108_3726029_586-297059-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 17:13:29,651 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_196-289637-0_sp1.1 from training. Duration: 0.6818125 2023-10-03 17:13:29,653 WARNING [train.py:1204] (2/4) Exclude cut with ID _18513_210_9206_1_1531618215223_3862620_402-5957-0 from training. Duration: 0.99 2023-10-03 17:13:29,658 WARNING [train.py:1204] (2/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_81-300212-0 from training. Duration: 0.6 2023-10-03 17:13:32,317 WARNING [train.py:1204] (2/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_166-128941-0 from training. Duration: 0.54 2023-10-03 17:13:32,331 WARNING [train.py:1204] (2/4) Exclude cut with ID _58059_210_5854_1_1534901471149_3164808_57-309660-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 17:13:32,403 WARNING [train.py:1204] (2/4) Exclude cut with ID _60621_210_17071_1_1535013053530_3671739_73-155814-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 17:13:33,882 WARNING [train.py:1204] (2/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_726-270780-0 from training. Duration: 0.86 2023-10-03 17:13:33,901 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8853_210_3273_1_1528633899_818968_51-187685-0_sp1.1 from training. Duration: 0.62725 2023-10-03 17:13:35,287 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_315-212794-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 17:13:35,447 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.conv_module2.balancer2.prob, batch_count=1343793.3333333333, ans=0.125 2023-10-03 17:13:35,537 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.bypass.skip_rate, batch_count=1343793.3333333333, ans=0.07 2023-10-03 17:13:36,605 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_53373_210_5399_1_1534229471_4715695_314-122795-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 17:13:38,011 WARNING [train.py:1204] (2/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_187-221566-0_sp0.9 from training. Duration: 0.47775 2023-10-03 17:13:39,306 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_12868_210_6753_1_1530702479_3610564_133-564-0 from training. Duration: 0.79 2023-10-03 17:13:39,360 WARNING [train.py:1204] (2/4) Exclude cut with ID _55153_210_2253_1_1534384786677_3162096_255-228344-0_sp1.1 from training. Duration: 0.936375 2023-10-03 17:13:39,508 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.nonlin_attention.balancer.prob, batch_count=1343793.3333333333, ans=0.125 2023-10-03 17:13:39,844 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.5.encoder.layers.0.feed_forward3.out_whiten, num_groups=1, num_channels=256, metric=9.24 vs. limit=15.0 2023-10-03 17:13:41,964 WARNING [train.py:1204] (2/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_383-165234-0_sp1.1 from training. Duration: 0.8545625 2023-10-03 17:13:42,201 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=1343793.3333333333, ans=0.1 2023-10-03 17:13:46,159 WARNING [train.py:1204] (2/4) Exclude cut with ID IC0677W0056-334889 from training. Duration: 0.768 2023-10-03 17:13:48,939 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_14953_210_2151_1_1530701710_5616165_710-165400-0_sp0.9 from training. Duration: 0.9666875 2023-10-03 17:13:49,027 WARNING [train.py:1204] (2/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_355-236205-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 17:13:49,029 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_34450_210_9547_1_1533099443_4109050_503-246893-0_sp1.1 from training. Duration: 0.963625 2023-10-03 17:13:52,325 WARNING [train.py:1204] (2/4) Exclude cut with ID _43552_210_8913_1_1533726031059_3523429_25-120161-0 from training. Duration: 0.98 2023-10-03 17:13:52,374 WARNING [train.py:1204] (2/4) Exclude cut with ID _66112_210_10635_1_1535590236305_3801484_205-311930-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 17:13:53,615 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_48049_210_5632_1_1533864553_3519937_185-281218-0_sp1.1 from training. Duration: 0.92725 2023-10-03 17:13:53,654 WARNING [train.py:1204] (2/4) Exclude cut with ID _48782_210_12658_1_1534467701544_2300422_191-332415-0_sp1.1 from training. Duration: 0.97275 2023-10-03 17:13:55,737 WARNING [train.py:1204] (2/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_907-151398-0_sp1.1 from training. Duration: 0.336375 2023-10-03 17:13:57,005 WARNING [train.py:1204] (2/4) Exclude cut with ID _67499_210_17282_1_1535509805220_3692430_20-179346-0_sp1.1 from training. Duration: 0.8818125 2023-10-03 17:13:58,419 WARNING [train.py:1204] (2/4) Exclude cut with ID _14998_210_5219_1_1530705582080_7474970_441-91466-0_sp1.1 from training. Duration: 0.8818125 2023-10-03 17:13:58,565 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.0.ff2_skip_rate, batch_count=1343860.0, ans=0.0 2023-10-03 17:13:59,800 WARNING [train.py:1204] (2/4) Exclude cut with ID _46681_210_9642_1_1533893005571_7603359_786-164007-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 17:14:01,434 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.conv_module1.balancer2.prob, batch_count=1343860.0, ans=0.125 2023-10-03 17:14:03,886 WARNING [train.py:1204] (2/4) Exclude cut with ID _14970_210_15709_1_1530773543493_1715730_187-20259-0 from training. Duration: 0.89 2023-10-03 17:14:09,363 WARNING [train.py:1204] (2/4) Exclude cut with ID _12734_210_8341_1_1530183614296_3891929_383-339075-0_sp1.1 from training. Duration: 0.963625 2023-10-03 17:14:10,147 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.2.encoder.layers.2.conv_module2.whiten, num_groups=1, num_channels=384, metric=6.85 vs. limit=15.0 2023-10-03 17:14:10,263 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.1.encoder.layers.1.feed_forward1.out_whiten, num_groups=1, num_channels=256, metric=7.01 vs. limit=15.0 2023-10-03 17:14:15,263 INFO [scaling.py:1118] (2/4) WithLoss: name=encoder.encoders.1.encoder.layers.1.self_attn_weights, loss-sum=0.000e+00 2023-10-03 17:14:17,768 WARNING [train.py:1204] (2/4) Exclude cut with ID _37086_210_9038_1_1532999540891_7021250_834-322225-0_sp1.1 from training. Duration: 0.92725 2023-10-03 17:14:19,517 INFO [train.py:1046] (2/4) Epoch 38, batch 5050, loss[loss=0.1621, simple_loss=0.2433, pruned_loss=0.04043, over 23776.00 frames. ], tot_loss[loss=0.1568, simple_loss=0.2365, pruned_loss=0.03857, over 4711694.98 frames. ], batch size: 164, lr: 2.66e-03, grad_scale: 16.0 2023-10-03 17:14:19,567 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_10987_210_4163_1_1529760555_4123902_56-290559-0_sp1.1 from training. Duration: 0.963625 2023-10-03 17:14:19,581 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_92-246270-0_sp0.9 from training. Duration: 0.9444375 2023-10-03 17:14:19,606 WARNING [train.py:1204] (2/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_56-13561-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 17:14:19,639 WARNING [train.py:1204] (2/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_393-57888-0_sp1.1 from training. Duration: 0.5818125 2023-10-03 17:14:21,717 WARNING [train.py:1204] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_168-32906-0_sp1.1 from training. Duration: 0.62725 2023-10-03 17:14:21,773 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_51225_210_3601_1_1534400634_2327383_127-207553-0_sp1.1 from training. Duration: 0.963625 2023-10-03 17:14:26,619 WARNING [train.py:1204] (2/4) Exclude cut with ID _58583_210_5236_1_1534726835266_3574249_494-203521-0_sp1.1 from training. Duration: 0.963625 2023-10-03 17:14:26,634 WARNING [train.py:1204] (2/4) Exclude cut with ID _49376_210_5986_1_1533985278732_3370740_354-344695-0 from training. Duration: 0.99 2023-10-03 17:14:26,712 WARNING [train.py:1204] (2/4) Exclude cut with ID _8021_210_7994_1_1528376451358_3653069_179-45206-0_sp1.1 from training. Duration: 0.7818125 2023-10-03 17:14:28,744 WARNING [train.py:1204] (2/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_742-154017-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 17:14:30,136 WARNING [train.py:1204] (2/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_222-198127-0_sp1.1 from training. Duration: 0.763625 2023-10-03 17:14:31,491 WARNING [train.py:1204] (2/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_235-307337-0 from training. Duration: 0.94 2023-10-03 17:14:31,561 WARNING [train.py:1204] (2/4) Exclude cut with ID _8393_210_6702_1_1528505860249_3457328_453-176500-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 17:14:31,602 WARNING [train.py:1204] (2/4) Exclude cut with ID _53565_210_10635_1_1534337218335_3820259_102-179528-0_sp1.1 from training. Duration: 0.97275 2023-10-03 17:14:31,866 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=1343993.3333333333, ans=0.1 2023-10-03 17:14:34,133 WARNING [train.py:1204] (2/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_43-206757-0_sp1.1 from training. Duration: 0.6818125 2023-10-03 17:14:35,429 WARNING [train.py:1204] (2/4) Exclude cut with ID _8394_210_6702_1_1528590504316_3540189_335-274780-0_sp1.1 from training. Duration: 0.7090625 2023-10-03 17:14:35,466 WARNING [train.py:1204] (2/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_128-296581-0_sp1.1 from training. Duration: 0.67275 2023-10-03 17:14:46,284 WARNING [train.py:1204] (2/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_111-112362-0 from training. Duration: 0.91 2023-10-03 17:14:46,338 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_5522_210_3273_1_1527249606_3909096_366-172508-0_sp1.1 from training. Duration: 0.6 2023-10-03 17:14:47,682 WARNING [train.py:1204] (2/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_47-167373-0_sp1.1 from training. Duration: 0.836375 2023-10-03 17:14:47,716 WARNING [train.py:1204] (2/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_41-234353-0 from training. Duration: 0.68 2023-10-03 17:14:47,885 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.conv_skip_rate, batch_count=1344126.6666666667, ans=0.0 2023-10-03 17:14:49,029 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6291_210_3273_1_1527683649_1078498_82-221196-0_sp0.9 from training. Duration: 0.8444375 2023-10-03 17:14:50,417 WARNING [train.py:1204] (2/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_47-4814-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 17:14:50,441 WARNING [train.py:1204] (2/4) Exclude cut with ID _43541_210_10626_1_1533796233418_3635440_40-247993-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 17:14:50,503 WARNING [train.py:1204] (2/4) Exclude cut with ID _21989_210_5854_1_1531899214931_3271360_133-44759-0_sp0.9 from training. Duration: 0.8555625 2023-10-03 17:14:50,506 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_94-195801-0 from training. Duration: 0.94 2023-10-03 17:14:52,522 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_141-89113-0 from training. Duration: 0.86 2023-10-03 17:14:52,624 WARNING [train.py:1204] (2/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_683-284578-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 17:14:55,875 WARNING [train.py:1204] (2/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_6-214815-0_sp1.1 from training. Duration: 0.77275 2023-10-03 17:14:56,486 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.5.encoder.layers.0.self_attn_weights.whiten_keys, num_groups=4, num_channels=128, metric=1.55 vs. limit=6.0 2023-10-03 17:14:59,606 WARNING [train.py:1204] (2/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_556-212082-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 17:15:00,918 WARNING [train.py:1204] (2/4) Exclude cut with ID _44902_210_16187_1_1533888006062_3792078_149-67212-0 from training. Duration: 0.87 2023-10-03 17:15:02,286 WARNING [train.py:1204] (2/4) Exclude cut with ID _61148_210_6897_1_1535094022726_4217610_333-159440-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 17:15:02,500 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.1.conv_skip_rate, batch_count=1344126.6666666667, ans=0.0 2023-10-03 17:15:02,543 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.conv_skip_rate, batch_count=1344126.6666666667, ans=0.0 2023-10-03 17:15:03,731 WARNING [train.py:1204] (2/4) Exclude cut with ID _3120_210_3525_1_1526036143112_4072980_404-249994-0 from training. Duration: 0.99 2023-10-03 17:15:05,036 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.638e+02 1.875e+02 2.025e+02 2.172e+02 3.153e+02, threshold=4.049e+02, percent-clipped=0.0 2023-10-03 17:15:05,148 WARNING [train.py:1204] (2/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_234-260682-0_sp0.9 from training. Duration: 0.9333125 2023-10-03 17:15:05,180 WARNING [train.py:1204] (2/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_374-138971-0_sp1.1 from training. Duration: 0.8545625 2023-10-03 17:15:06,456 WARNING [train.py:1204] (2/4) Exclude cut with ID _53713_210_24116_1_1534314487762_7416751_743-302085-0_sp1.1 from training. Duration: 0.92725 2023-10-03 17:15:07,931 WARNING [train.py:1204] (2/4) Exclude cut with ID _18516_210_9206_1_1531704653206_3692406_198-334870-0_sp1.1 from training. Duration: 0.836375 2023-10-03 17:15:08,055 WARNING [train.py:1204] (2/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_395-73570-0_sp1.1 from training. Duration: 0.9 2023-10-03 17:15:10,723 WARNING [train.py:1204] (2/4) Exclude cut with ID _46683_210_9642_1_1533730913492_4591116_421-19547-0_sp1.1 from training. Duration: 0.936375 2023-10-03 17:15:10,781 WARNING [train.py:1204] (2/4) Exclude cut with ID _19896_210_1883_1_1531709022662_6649569_714-8668-0_sp1.1 from training. Duration: 0.963625 2023-10-03 17:15:10,952 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.bypass.skip_rate, batch_count=1344193.3333333333, ans=0.04949747468305833 2023-10-03 17:15:12,048 WARNING [train.py:1204] (2/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_49-101072-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 17:15:12,055 WARNING [train.py:1204] (2/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_220-191080-0_sp1.1 from training. Duration: 0.8909375 2023-10-03 17:15:12,095 WARNING [train.py:1204] (2/4) Exclude cut with ID _3465_210_3525_1_1526175667060_3574049_62-335552-0 from training. Duration: 0.87 2023-10-03 17:15:13,545 WARNING [train.py:1204] (2/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_111-112362-0_sp1.1 from training. Duration: 0.82725 2023-10-03 17:15:16,264 WARNING [train.py:1204] (2/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_318-59097-0_sp0.9 from training. Duration: 0.8444375 2023-10-03 17:15:17,819 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.conv_module1.balancer1.prob, batch_count=1344260.0, ans=0.125 2023-10-03 17:15:19,089 WARNING [train.py:1204] (2/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_98-255710-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 17:15:20,335 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9042W0003-515609 from training. Duration: 0.896 2023-10-03 17:15:20,353 WARNING [train.py:1204] (2/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_158-250116-0_sp0.9 from training. Duration: 0.688875 2023-10-03 17:15:20,452 WARNING [train.py:1204] (2/4) Exclude cut with ID _26750_210_5665_1_1532226678442_4063230_7-346885-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 17:15:22,525 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_37839_210_7234_1_1533542462_3689303_357-227078-0_sp1.1 from training. Duration: 0.963625 2023-10-03 17:15:22,557 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9052W0119-520904 from training. Duration: 0.896 2023-10-03 17:15:26,639 WARNING [train.py:1204] (2/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_249-281349-0_sp1.1 from training. Duration: 0.77275 2023-10-03 17:15:26,644 WARNING [train.py:1204] (2/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_147-293311-0 from training. Duration: 0.94 2023-10-03 17:15:26,644 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_158-21166-0_sp1.1 from training. Duration: 0.963625 2023-10-03 17:15:27,403 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.3.encoder.layers.3.conv_module1.whiten, num_groups=1, num_channels=512, metric=7.92 vs. limit=15.0 2023-10-03 17:15:30,383 WARNING [train.py:1204] (2/4) Exclude cut with ID _14100_210_8341_1_1530529320613_3678069_207-332194-0_sp1.1 from training. Duration: 0.92725 2023-10-03 17:15:30,427 WARNING [train.py:1204] (2/4) Exclude cut with ID _9389_210_1664_1_1529217337301_5289269_324-211031-0_sp1.1 from training. Duration: 0.963625 2023-10-03 17:15:30,438 WARNING [train.py:1204] (2/4) Exclude cut with ID _14603_210_4220_1_1530615688441_3097319_133-158308-0 from training. Duration: 0.84 2023-10-03 17:15:33,789 WARNING [train.py:1204] (2/4) Exclude cut with ID _13252_210_1925_1_1530671372047_3336909_166-314607-0 from training. Duration: 0.54 2023-10-03 17:15:35,072 INFO [train.py:1046] (2/4) Epoch 38, batch 5100, loss[loss=0.1614, simple_loss=0.2406, pruned_loss=0.04107, over 23355.00 frames. ], tot_loss[loss=0.1566, simple_loss=0.2366, pruned_loss=0.03828, over 4712655.72 frames. ], batch size: 93, lr: 2.66e-03, grad_scale: 16.0 2023-10-03 17:15:36,465 WARNING [train.py:1204] (2/4) Exclude cut with ID _32704_210_8954_1_1533017488840_6747370_649-79538-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 17:15:36,472 WARNING [train.py:1204] (2/4) Exclude cut with ID _28394_210_8913_1_1532430049518_3650118_343-115667-0_sp1.1 from training. Duration: 0.97275 2023-10-03 17:15:37,824 WARNING [train.py:1204] (2/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_87-278686-0_sp0.9 from training. Duration: 0.8555625 2023-10-03 17:15:39,304 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.1.balancer2.prob, batch_count=1344326.6666666667, ans=0.125 2023-10-03 17:15:41,711 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9109W0003-549749 from training. Duration: 0.768 2023-10-03 17:15:43,250 WARNING [train.py:1204] (2/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_126-29104-0_sp1.1 from training. Duration: 0.77275 2023-10-03 17:15:44,689 WARNING [train.py:1204] (2/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_325-113798-0 from training. Duration: 0.85 2023-10-03 17:15:44,743 WARNING [train.py:1204] (2/4) Exclude cut with ID _6694_210_4599_1_1527852637490_3965529_116-109398-0 from training. Duration: 0.73 2023-10-03 17:15:44,998 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.conv_module2.balancer2.prob, batch_count=1344326.6666666667, ans=0.125 2023-10-03 17:15:46,201 WARNING [train.py:1204] (2/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_46-57119-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 17:15:46,439 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.self_attn_weights.pos_emb_skip_rate, batch_count=1344326.6666666667, ans=0.0 2023-10-03 17:15:47,508 WARNING [train.py:1204] (2/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_546-119312-0_sp1.1 from training. Duration: 0.9 2023-10-03 17:15:48,989 WARNING [train.py:1204] (2/4) Exclude cut with ID _60057_210_10640_1_1534903065793_3333150_468-306939-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 17:15:50,192 WARNING [train.py:1204] (2/4) Exclude cut with ID _15150_210_8341_1_1530752411803_7256384_22-138274-0 from training. Duration: 0.96 2023-10-03 17:15:50,216 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_289-180904-0 from training. Duration: 0.76 2023-10-03 17:15:53,518 WARNING [train.py:1204] (2/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_64-133721-0_sp1.1 from training. Duration: 0.92725 2023-10-03 17:15:53,553 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_195-257512-0_sp1.1 from training. Duration: 0.6090625 2023-10-03 17:15:58,459 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.out_combiner.scale_min, batch_count=1344393.3333333333, ans=0.2 2023-10-03 17:15:59,368 WARNING [train.py:1204] (2/4) Exclude cut with ID _66873_210_5517_1_1535454136506_4293370_704-101963-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 17:16:02,736 WARNING [train.py:1204] (2/4) Exclude cut with ID _4366_210_1388_1_1526792053633_3613819_231-211402-0 from training. Duration: 0.94 2023-10-03 17:16:02,783 WARNING [train.py:1204] (2/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_679-190385-0_sp1.1 from training. Duration: 0.97275 2023-10-03 17:16:06,111 WARNING [train.py:1204] (2/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_298-81459-0_sp1.1 from training. Duration: 0.963625 2023-10-03 17:16:06,120 WARNING [train.py:1204] (2/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_658-282610-0_sp0.9 from training. Duration: 0.588875 2023-10-03 17:16:08,997 WARNING [train.py:1204] (2/4) Exclude cut with ID _57572_210_5517_1_1534921219624_3809484_196-283734-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 17:16:10,428 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_53071_210_16554_1_1534490405_2954066_294-198554-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 17:16:10,433 WARNING [train.py:1204] (2/4) Exclude cut with ID _4866_210_2577_1_1527404412284_4364659_352-157930-0 from training. Duration: 0.81 2023-10-03 17:16:13,205 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9231W0113-610139 from training. Duration: 0.896 2023-10-03 17:16:14,455 WARNING [train.py:1204] (2/4) Exclude cut with ID _24271_210_12658_1_1531987270022_7190519_638-171906-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 17:16:14,497 WARNING [train.py:1204] (2/4) Exclude cut with ID _14170_210_5219_1_1530525949868_4469650_272-249985-0 from training. Duration: 0.67 2023-10-03 17:16:14,507 WARNING [train.py:1204] (2/4) Exclude cut with ID _64086_210_7994_1_1535270417019_7435949_449-31567-0 from training. Duration: 0.97 2023-10-03 17:16:17,420 WARNING [train.py:1204] (2/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_109-282568-0_sp1.1 from training. Duration: 0.97275 2023-10-03 17:16:24,391 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_121-45934-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 17:16:27,644 WARNING [train.py:1204] (2/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_314-126819-0 from training. Duration: 0.5 2023-10-03 17:16:27,672 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9309W0192-640319 from training. Duration: 0.512 2023-10-03 17:16:28,926 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9310W0003-640649 from training. Duration: 0.896 2023-10-03 17:16:29,218 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.conv_skip_rate, batch_count=1344526.6666666667, ans=0.0 2023-10-03 17:16:30,300 WARNING [train.py:1204] (2/4) Exclude cut with ID _7160_210_1876_1_1527940923231_3703799_120-150943-0 from training. Duration: 0.81 2023-10-03 17:16:30,302 WARNING [train.py:1204] (2/4) Exclude cut with ID _40380_210_9059_1_1533380594759_4187380_6-33508-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 17:16:31,686 WARNING [train.py:1204] (2/4) Exclude cut with ID _59202_210_26984_1_1534816810612_3549174_137-318710-0 from training. Duration: 0.89 2023-10-03 17:16:35,136 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_435-313305-0 from training. Duration: 0.71 2023-10-03 17:16:38,411 WARNING [train.py:1204] (2/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_91-212486-0_sp1.1 from training. Duration: 0.6181875 2023-10-03 17:16:39,846 WARNING [train.py:1204] (2/4) Exclude cut with ID _14603_210_4220_1_1530615688441_3097319_133-158308-0_sp1.1 from training. Duration: 0.763625 2023-10-03 17:16:42,483 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_99-251314-0 from training. Duration: 0.87 2023-10-03 17:16:43,867 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_365-217119-0_sp1.1 from training. Duration: 0.57275 2023-10-03 17:16:43,895 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_3475_210_2028_1_1526193578_3622569_377-166133-0 from training. Duration: 0.86 2023-10-03 17:16:48,152 INFO [train.py:1046] (2/4) Epoch 38, batch 5150, loss[loss=0.1597, simple_loss=0.2496, pruned_loss=0.03494, over 24363.00 frames. ], tot_loss[loss=0.1579, simple_loss=0.2382, pruned_loss=0.03883, over 4718983.26 frames. ], batch size: 77, lr: 2.66e-03, grad_scale: 16.0 2023-10-03 17:16:49,673 WARNING [train.py:1204] (2/4) Exclude cut with ID _13916_210_9994_1_1530788341126_3763149_116-21034-0_sp1.1 from training. Duration: 0.87275 2023-10-03 17:16:49,683 WARNING [train.py:1204] (2/4) Exclude cut with ID _64220_210_9527_1_1535250151772_4875259_142-216831-0_sp1.1 from training. Duration: 0.936375 2023-10-03 17:16:49,684 WARNING [train.py:1204] (2/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_490-207861-0_sp1.1 from training. Duration: 0.8909375 2023-10-03 17:16:49,738 WARNING [train.py:1204] (2/4) Exclude cut with ID _41314_210_12651_1_1533459476543_3766460_87-272068-0_sp0.9 from training. Duration: 0.92225 2023-10-03 17:16:49,765 WARNING [train.py:1204] (2/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_18-46756-0_sp1.1 from training. Duration: 0.7181875 2023-10-03 17:16:51,171 WARNING [train.py:1204] (2/4) Exclude cut with ID _39411_210_5986_1_1533538825030_3343649_98-311335-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 17:16:52,432 WARNING [train.py:1204] (2/4) Exclude cut with ID _21878_210_3525_1_1531738772002_7264330_113-18163-0 from training. Duration: 0.89 2023-10-03 17:16:52,435 WARNING [train.py:1204] (2/4) Exclude cut with ID _6653_210_2629_1_1527919172441_6732679_1103-56445-0 from training. Duration: 0.73 2023-10-03 17:16:52,675 INFO [scaling.py:1118] (2/4) WithLoss: name=encoder.encoders.3.encoder.layers.2.self_attn_weights, loss-sum=0.000e+00 2023-10-03 17:16:53,894 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_3474_210_2028_1_1526188513_4825246_145-309131-0 from training. Duration: 0.74 2023-10-03 17:16:53,912 WARNING [train.py:1204] (2/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_120-211630-0_sp1.1 from training. Duration: 0.836375 2023-10-03 17:16:53,922 WARNING [train.py:1204] (2/4) Exclude cut with ID _8030_210_7497_1_1528444886781_3531089_32-31837-0 from training. Duration: 0.97 2023-10-03 17:16:55,763 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_19949_210_2161_1_1531706337_928079_8-160956-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 17:16:57,090 WARNING [train.py:1204] (2/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_321-215742-0_sp1.1 from training. Duration: 0.4545625 2023-10-03 17:16:58,427 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_583-124650-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 17:16:58,625 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.0.feed_forward1.out_proj.dropout_p, batch_count=1344660.0, ans=0.1 2023-10-03 17:16:59,798 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_30434_210_13049_1_1532519384_981746_164-182755-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 17:17:04,614 WARNING [train.py:1204] (2/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_339-104791-0_sp1.1 from training. Duration: 0.4454375 2023-10-03 17:17:04,632 WARNING [train.py:1204] (2/4) Exclude cut with ID _15026_210_8750_1_1530777602316_3291850_211-195043-0 from training. Duration: 0.8 2023-10-03 17:17:06,574 WARNING [train.py:1204] (2/4) Exclude cut with ID _29877_210_6228_1_1532397791106_5276149_487-182647-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 17:17:06,618 WARNING [train.py:1204] (2/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_127-566-0_sp0.9 from training. Duration: 0.9333125 2023-10-03 17:17:09,274 WARNING [train.py:1204] (2/4) Exclude cut with ID _8263_210_1664_1_1528439452891_970220_68-181805-0_sp0.9 from training. Duration: 0.711125 2023-10-03 17:17:09,276 WARNING [train.py:1204] (2/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_504-72513-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 17:17:09,291 WARNING [train.py:1204] (2/4) Exclude cut with ID _64639_210_5816_1_1535367619437_3528000_324-265233-0_sp1.1 from training. Duration: 0.963625 2023-10-03 17:17:09,947 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.3.encoder.layers.3.conv_module1.whiten, num_groups=1, num_channels=512, metric=4.69 vs. limit=15.0 2023-10-03 17:17:10,610 WARNING [train.py:1204] (2/4) Exclude cut with ID _6456_210_1534_1_1527681634212_3181049_215-309359-0_sp0.9 from training. Duration: 0.97775 2023-10-03 17:17:10,614 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_115-233929-0_sp1.1 from training. Duration: 0.6545625 2023-10-03 17:17:10,658 WARNING [train.py:1204] (2/4) Exclude cut with ID _14087_210_2253_1_1530529224512_3014029_219-224445-0 from training. Duration: 0.84 2023-10-03 17:17:13,303 WARNING [train.py:1204] (2/4) Exclude cut with ID _14867_210_1968_1_1530684179890_3846969_413-29292-0_sp1.1 from training. Duration: 0.8181875 2023-10-03 17:17:13,364 WARNING [train.py:1204] (2/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_1012-279473-0_sp0.9 from training. Duration: 0.7444375 2023-10-03 17:17:14,981 WARNING [train.py:1204] (2/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_605-133395-0_sp0.9 from training. Duration: 0.7333125 2023-10-03 17:17:16,257 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_89-35748-0 from training. Duration: 0.79 2023-10-03 17:17:16,361 WARNING [train.py:1204] (2/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_476-272757-0_sp1.1 from training. Duration: 0.8454375 2023-10-03 17:17:21,797 WARNING [train.py:1204] (2/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_449-15508-0_sp0.9 from training. Duration: 0.911125 2023-10-03 17:17:23,266 WARNING [train.py:1204] (2/4) Exclude cut with ID _29345_210_4381_1_1532689168263_3860169_198-174328-0 from training. Duration: 0.91 2023-10-03 17:17:26,168 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=1344793.3333333333, ans=0.1 2023-10-03 17:17:29,087 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_15517_210_6753_1_1530952293_1664795_125-158157-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 17:17:31,770 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.549e+02 1.952e+02 2.144e+02 2.442e+02 5.340e+02, threshold=4.289e+02, percent-clipped=2.0 2023-10-03 17:17:33,205 WARNING [train.py:1204] (2/4) Exclude cut with ID _25565_210_12392_1_1532423009382_3952098_420-113180-0_sp1.1 from training. Duration: 0.963625 2023-10-03 17:17:33,289 WARNING [train.py:1204] (2/4) Exclude cut with ID _51471_210_1889_1_1534399391390_3094250_236-146773-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 17:17:37,460 WARNING [train.py:1204] (2/4) Exclude cut with ID _30039_210_13270_1_1532484612900_2725150_175-35012-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 17:17:38,541 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_23850_210_4238_1_1532232597_1632433_99-275843-0_sp1.1 from training. Duration: 0.936375 2023-10-03 17:17:40,044 WARNING [train.py:1204] (2/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_658-282610-0 from training. Duration: 0.53 2023-10-03 17:17:44,200 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_37818_210_4238_1_1533620910_4720083_234-65934-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 17:17:45,542 WARNING [train.py:1204] (2/4) Exclude cut with ID _3067_210_2667_1_1526212974640_4503168_459-173867-0_sp1.1 from training. Duration: 0.8 2023-10-03 17:17:45,566 WARNING [train.py:1204] (2/4) Exclude cut with ID _8696_210_1876_1_1528545632440_3345058_209-54069-0_sp0.9 from training. Duration: 0.7444375 2023-10-03 17:17:45,810 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.feed_forward2.hidden_balancer.prob, batch_count=1344926.6666666667, ans=0.125 2023-10-03 17:17:48,437 WARNING [train.py:1204] (2/4) Exclude cut with ID _62574_210_2655_1_1535180407058_3933249_420-236465-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 17:17:49,715 WARNING [train.py:1204] (2/4) Exclude cut with ID _46681_210_9642_1_1533893005571_7603359_333-4042-0_sp1.1 from training. Duration: 0.936375 2023-10-03 17:17:51,132 WARNING [train.py:1204] (2/4) Exclude cut with ID _3541_210_2477_1_1526475408709_3263973_377-296839-0 from training. Duration: 0.55 2023-10-03 17:17:55,333 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.0.layers.1.nonlin_attention.whiten2, num_groups=1, num_channels=192, metric=5.98 vs. limit=15.0 2023-10-03 17:17:55,735 WARNING [train.py:1204] (2/4) Exclude cut with ID _43997_210_4525_1_1533538632555_5189329_426-92599-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 17:17:57,036 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_43-71989-0_sp1.1 from training. Duration: 0.5181875 2023-10-03 17:17:58,437 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_2009_210_2071_1_1525481134_4588937_140-345530-0_sp1.1 from training. Duration: 0.963625 2023-10-03 17:17:58,445 WARNING [train.py:1204] (2/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_138-332773-0_sp0.9 from training. Duration: 0.9 2023-10-03 17:17:59,746 WARNING [train.py:1204] (2/4) Exclude cut with ID _13942_210_9988_1_1530604906036_3733360_499-55676-0_sp0.9 from training. Duration: 0.688875 2023-10-03 17:17:59,770 WARNING [train.py:1204] (2/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_259-34038-0_sp1.1 from training. Duration: 0.67275 2023-10-03 17:17:59,780 WARNING [train.py:1204] (2/4) Exclude cut with ID _3120_210_3525_1_1526036143112_4072980_404-249994-0_sp1.1 from training. Duration: 0.9 2023-10-03 17:17:59,815 WARNING [train.py:1204] (2/4) Exclude cut with ID _40402_210_18219_1_1533522324115_2953150_222-108810-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 17:18:01,107 INFO [train.py:1046] (2/4) Epoch 38, batch 5200, loss[loss=0.1372, simple_loss=0.217, pruned_loss=0.02873, over 24570.00 frames. ], tot_loss[loss=0.1585, simple_loss=0.2389, pruned_loss=0.03905, over 4726504.95 frames. ], batch size: 60, lr: 2.66e-03, grad_scale: 32.0 2023-10-03 17:18:03,299 WARNING [train.py:1204] (2/4) Exclude cut with ID _22083_210_8254_1_1531702862439_3678970_49-234546-0_sp0.9 from training. Duration: 0.92225 2023-10-03 17:18:05,378 WARNING [train.py:1204] (2/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_199-295611-0_sp0.9 from training. Duration: 0.611125 2023-10-03 17:18:09,906 WARNING [train.py:1204] (2/4) Exclude cut with ID _43552_210_8913_1_1533726031059_3523429_43-192945-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 17:18:12,734 WARNING [train.py:1204] (2/4) Exclude cut with ID _14224_210_4381_1_1530529062037_4264010_364-22817-0 from training. Duration: 0.71 2023-10-03 17:18:14,253 WARNING [train.py:1204] (2/4) Exclude cut with ID _14922_210_6228_1_1530707435273_3693310_388-129552-0_sp1.1 from training. Duration: 0.87275 2023-10-03 17:18:14,396 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.ff2_skip_rate, batch_count=1344993.3333333333, ans=0.0 2023-10-03 17:18:15,598 WARNING [train.py:1204] (2/4) Exclude cut with ID _6537_210_1883_1_1527987559710_3597129_190-280227-0_sp1.1 from training. Duration: 0.97275 2023-10-03 17:18:16,986 WARNING [train.py:1204] (2/4) Exclude cut with ID _55844_210_2346_1_1534471212569_7625707_398-56897-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 17:18:19,622 WARNING [train.py:1204] (2/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_48-182155-0_sp1.1 from training. Duration: 0.7909375 2023-10-03 17:18:19,641 WARNING [train.py:1204] (2/4) Exclude cut with ID _29529_210_7234_1_1532393961455_3977460_582-18084-0_sp1.1 from training. Duration: 0.97275 2023-10-03 17:18:21,069 WARNING [train.py:1204] (2/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_912-282412-0 from training. Duration: 0.49 2023-10-03 17:18:22,483 WARNING [train.py:1204] (2/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_332-256168-0_sp0.9 from training. Duration: 0.8333125 2023-10-03 17:18:22,996 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.4.encoder.layers.2.whiten, num_groups=1, num_channels=384, metric=3.21 vs. limit=12.0 2023-10-03 17:18:23,852 WARNING [train.py:1204] (2/4) Exclude cut with ID _64220_210_9527_1_1535250151772_4875259_55-250624-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 17:18:27,112 WARNING [train.py:1204] (2/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_264-108685-0 from training. Duration: 0.55 2023-10-03 17:18:28,563 WARNING [train.py:1204] (2/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_613-232856-0_sp0.9 from training. Duration: 0.6 2023-10-03 17:18:28,659 WARNING [train.py:1204] (2/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_68-212801-0_sp1.1 from training. Duration: 0.663625 2023-10-03 17:18:29,999 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6290_210_3273_1_1527595224_1282787_115-87095-0 from training. Duration: 0.55 2023-10-03 17:18:30,041 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_3080_210_2071_1_1526086102_4365166_312-8726-0 from training. Duration: 0.91 2023-10-03 17:18:31,524 WARNING [train.py:1204] (2/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_105-63309-0 from training. Duration: 0.98 2023-10-03 17:18:32,892 WARNING [train.py:1204] (2/4) Exclude cut with ID _2941_210_2577_1_1526199673150_2489759_201-160779-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 17:18:32,895 WARNING [train.py:1204] (2/4) Exclude cut with ID ID0406W0226-885434 from training. Duration: 0.997 2023-10-03 17:18:32,904 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_17292_210_6753_1_1531125388_2943734_353-328201-0_sp1.1 from training. Duration: 0.97275 2023-10-03 17:18:34,681 WARNING [train.py:1204] (2/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_572-96029-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 17:18:34,708 WARNING [train.py:1204] (2/4) Exclude cut with ID _14970_210_15709_1_1530775547361_1966965_6-23328-0_sp0.9 from training. Duration: 0.9555625 2023-10-03 17:18:36,469 WARNING [train.py:1204] (2/4) Exclude cut with ID _45012_210_12651_1_1533608435136_5371380_240-37101-0 from training. Duration: 0.93 2023-10-03 17:18:37,233 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.1.encoder.layers.1.feed_forward1.out_whiten, num_groups=1, num_channels=256, metric=8.82 vs. limit=15.0 2023-10-03 17:18:37,723 WARNING [train.py:1204] (2/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_685-90183-0_sp1.1 from training. Duration: 0.936375 2023-10-03 17:18:39,747 WARNING [train.py:1204] (2/4) Exclude cut with ID _13252_210_1925_1_1530671372047_3336909_41-22245-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 17:18:45,013 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_15126_210_6753_1_1530778677_2591185_120-277707-0 from training. Duration: 0.8 2023-10-03 17:18:45,050 WARNING [train.py:1204] (2/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_310-26186-0 from training. Duration: 0.7 2023-10-03 17:18:45,070 WARNING [train.py:1204] (2/4) Exclude cut with ID _14998_210_5219_1_1530705582080_7474970_787-294416-0 from training. Duration: 0.82 2023-10-03 17:18:49,322 WARNING [train.py:1204] (2/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_795-33252-0 from training. Duration: 0.37 2023-10-03 17:18:50,723 WARNING [train.py:1204] (2/4) Exclude cut with ID _7537_210_2084_1_1528502395431_3924359_113-199933-0_sp0.9 from training. Duration: 0.6555625 2023-10-03 17:18:54,815 WARNING [train.py:1204] (2/4) Exclude cut with ID _3465_210_3525_1_1526175667060_3574049_308-231735-0_sp0.9 from training. Duration: 0.811125 2023-10-03 17:18:54,833 WARNING [train.py:1204] (2/4) Exclude cut with ID _19504_210_9994_1_1531648863242_3325220_354-296765-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 17:18:56,273 WARNING [train.py:1204] (2/4) Exclude cut with ID _5409_210_1388_1_1527292850189_5865191_178-127970-0 from training. Duration: 0.57 2023-10-03 17:18:58,125 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_1466-248056-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 17:18:58,151 WARNING [train.py:1204] (2/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_535-14410-0_sp0.9 from training. Duration: 0.52225 2023-10-03 17:18:58,154 WARNING [train.py:1204] (2/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_747-129941-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 17:18:59,416 WARNING [train.py:1204] (2/4) Exclude cut with ID _15012_210_1968_1_1530856820429_5095350_688-178849-0_sp1.1 from training. Duration: 0.5818125 2023-10-03 17:19:00,986 WARNING [train.py:1204] (2/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_356-45054-0_sp1.1 from training. Duration: 0.7818125 2023-10-03 17:19:02,387 WARNING [train.py:1204] (2/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_146-276944-0_sp0.9 from training. Duration: 0.92225 2023-10-03 17:19:03,889 WARNING [train.py:1204] (2/4) Exclude cut with ID _39204_210_4220_1_1533880547368_3191120_346-180306-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 17:19:07,599 WARNING [train.py:1204] (2/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_641-158017-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 17:19:07,599 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_19952_210_2161_1_1531875469_3851750_274-42137-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 17:19:10,405 WARNING [train.py:1204] (2/4) Exclude cut with ID _60804_210_9642_1_1534831199859_7268299_87-327819-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 17:19:11,905 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_9302_210_2550_1_1528889962_4655416_638-301620-0 from training. Duration: 0.47 2023-10-03 17:19:11,968 WARNING [train.py:1204] (2/4) Exclude cut with ID _44304_210_15374_1_1533639352765_4613129_302-22084-0_sp1.1 from training. Duration: 0.7818125 2023-10-03 17:19:11,985 WARNING [train.py:1204] (2/4) Exclude cut with ID _18513_210_9206_1_1531618215223_3862620_177-227063-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 17:19:14,594 WARNING [train.py:1204] (2/4) Exclude cut with ID _41326_210_5854_1_1533778076509_3492080_510-45444-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 17:19:15,793 INFO [train.py:1046] (2/4) Epoch 38, batch 5250, loss[loss=0.1623, simple_loss=0.2207, pruned_loss=0.05198, over 19281.00 frames. ], tot_loss[loss=0.1578, simple_loss=0.2374, pruned_loss=0.03916, over 4703913.98 frames. ], batch size: 388, lr: 2.66e-03, grad_scale: 32.0 2023-10-03 17:19:15,881 WARNING [train.py:1204] (2/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_76-311963-0_sp1.1 from training. Duration: 0.636375 2023-10-03 17:19:16,006 WARNING [train.py:1204] (2/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_156-302675-0_sp0.9 from training. Duration: 0.611125 2023-10-03 17:19:16,205 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.nonlin_attention.balancer.prob, batch_count=1345326.6666666667, ans=0.125 2023-10-03 17:19:19,062 WARNING [train.py:1204] (2/4) Exclude cut with ID _53489_210_6228_1_1534298357169_3835970_367-148411-0_sp1.1 from training. Duration: 0.936375 2023-10-03 17:19:20,526 WARNING [train.py:1204] (2/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_389-144525-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 17:19:21,838 WARNING [train.py:1204] (2/4) Exclude cut with ID _14069_210_5632_1_1530512692033_4803390_158-238427-0_sp1.1 from training. Duration: 0.963625 2023-10-03 17:19:23,214 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_486-151131-0_sp0.9 from training. Duration: 0.9444375 2023-10-03 17:19:27,294 WARNING [train.py:1204] (2/4) Exclude cut with ID _39846_210_12651_1_1533603651992_1318030_188-71831-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 17:19:30,582 WARNING [train.py:1204] (2/4) Exclude cut with ID _39411_210_5986_1_1533538825030_3343649_173-238248-0_sp1.1 from training. Duration: 0.7545625 2023-10-03 17:19:32,048 WARNING [train.py:1204] (2/4) Exclude cut with ID _20684_210_6917_1_1531567751646_4203930_469-49081-0_sp1.1 from training. Duration: 0.92725 2023-10-03 17:19:33,451 WARNING [train.py:1204] (2/4) Exclude cut with ID _8833_210_5219_1_1528592665210_7553059_611-80313-0_sp1.1 from training. Duration: 0.5818125 2023-10-03 17:19:35,469 WARNING [train.py:1204] (2/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_440-141811-0 from training. Duration: 0.95 2023-10-03 17:19:35,484 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_41211_210_2544_1_1533348859_3702910_236-223652-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 17:19:37,417 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_40222_210_6228_1_1533385298_4481939_220-163998-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 17:19:38,897 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.feed_forward3.hidden_balancer.prob, batch_count=1345393.3333333333, ans=0.125 2023-10-03 17:19:40,026 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.1.feed_forward1.out_proj.dropout_p, batch_count=1345393.3333333333, ans=0.1 2023-10-03 17:19:41,413 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.conv_skip_rate, batch_count=1345393.3333333333, ans=0.0 2023-10-03 17:19:51,601 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.conv_module2.balancer1.max_abs, batch_count=1345460.0, ans=10.0 2023-10-03 17:19:56,825 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.self_attn_weights.pos_emb_skip_rate, batch_count=1345526.6666666667, ans=0.0 2023-10-03 17:19:57,842 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.633e+02 1.916e+02 2.104e+02 2.391e+02 3.735e+02, threshold=4.208e+02, percent-clipped=0.0 2023-10-03 17:20:03,331 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.self_attn_weights.pos_emb_skip_rate, batch_count=1345526.6666666667, ans=0.0 2023-10-03 17:20:09,492 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.feed_forward2.hidden_balancer.prob, batch_count=1345593.3333333333, ans=0.125 2023-10-03 17:20:22,951 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.conv_module2.balancer2.min_abs, batch_count=1345660.0, ans=0.5 2023-10-03 17:20:22,996 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.bypass.scale_min, batch_count=1345660.0, ans=0.2 2023-10-03 17:20:24,122 INFO [train.py:1046] (2/4) Epoch 38, batch 5300, loss[loss=0.1346, simple_loss=0.1886, pruned_loss=0.04029, over 19019.00 frames. ], tot_loss[loss=0.1564, simple_loss=0.2353, pruned_loss=0.03874, over 4706704.91 frames. ], batch size: 388, lr: 2.66e-03, grad_scale: 32.0 2023-10-03 17:20:37,548 INFO [scaling.py:1118] (2/4) WithLoss: name=encoder.encoders.5.encoder.layers.0.self_attn_weights, loss-sum=0.000e+00 2023-10-03 17:20:38,512 WARNING [train.py:1204] (2/4) Exclude cut with ID _14629_210_15709_1_1530604298182_956899_6-232695-0_sp1.1 from training. Duration: 0.8545625 2023-10-03 17:20:38,544 WARNING [train.py:1204] (2/4) Exclude cut with ID _13921_210_9994_1_1530841782272_3503520_327-69677-0 from training. Duration: 0.9 2023-10-03 17:20:38,569 WARNING [train.py:1204] (2/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_372-298093-0 from training. Duration: 0.8 2023-10-03 17:20:38,583 WARNING [train.py:1204] (2/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_515-139111-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 17:20:38,822 WARNING [train.py:1204] (2/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_85-65056-0_sp1.1 from training. Duration: 0.87275 2023-10-03 17:20:38,864 WARNING [train.py:1204] (2/4) Exclude cut with ID _15113_210_8254_1_1530842438579_3716279_209-54533-0_sp1.1 from training. Duration: 0.87275 2023-10-03 17:20:38,913 WARNING [train.py:1204] (2/4) Exclude cut with ID _70447_210_12392_1_1535972499300_3160420_123-312838-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 17:20:38,941 WARNING [train.py:1204] (2/4) Exclude cut with ID _60113_210_16271_1_1535164212481_4386079_70-63596-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 17:20:38,969 WARNING [train.py:1204] (2/4) Exclude cut with ID _14632_210_9988_1_1530691279392_3923200_225-188509-0_sp1.1 from training. Duration: 0.963625 2023-10-03 17:20:39,020 WARNING [train.py:1204] (2/4) Exclude cut with ID _34734_210_4525_1_1533365977542_4448720_584-180532-0_sp1.1 from training. Duration: 0.97275 2023-10-03 17:20:39,033 WARNING [train.py:1204] (2/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_351-294650-0_sp1.1 from training. Duration: 0.57275 2023-10-03 17:20:39,372 WARNING [train.py:1204] (2/4) Exclude cut with ID _13252_210_1925_1_1530671372047_3336909_263-168388-0_sp0.9 from training. Duration: 0.9555625 2023-10-03 17:20:39,400 WARNING [train.py:1204] (2/4) Exclude cut with ID _22083_210_8254_1_1531702862439_3678970_201-85738-0 from training. Duration: 0.9 2023-10-03 17:20:39,499 WARNING [train.py:1204] (2/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_237-58433-0 from training. Duration: 0.74 2023-10-03 17:20:39,528 WARNING [train.py:1204] (2/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_262-250826-0 from training. Duration: 0.99 2023-10-03 17:20:39,633 WARNING [train.py:1204] (2/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_927-43827-0_sp0.9 from training. Duration: 0.82225 2023-10-03 17:20:39,664 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_2821_210_2715_1_1526108385_3763560_73-296673-0 from training. Duration: 0.83 2023-10-03 17:20:39,741 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6287_210_3273_1_1527509506_2653716_182-199887-0 from training. Duration: 0.51 2023-10-03 17:20:39,881 WARNING [train.py:1204] (2/4) Exclude cut with ID _70519_210_4163_1_1535961595994_7458230_899-80223-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 17:20:40,510 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_210-234949-0_sp1.1 from training. Duration: 0.97275 2023-10-03 17:20:40,550 WARNING [train.py:1204] (2/4) Exclude cut with ID _30196_210_1664_1_1533708496522_7074140_768-278176-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 17:20:40,630 WARNING [train.py:1204] (2/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_315-128283-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 17:20:40,710 WARNING [train.py:1204] (2/4) Exclude cut with ID _14886_210_4220_1_1530702020355_3516190_330-323941-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 17:20:40,969 WARNING [train.py:1204] (2/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_264-348896-0_sp1.1 from training. Duration: 0.77275 2023-10-03 17:20:40,992 WARNING [train.py:1204] (2/4) Exclude cut with ID _37086_210_9038_1_1532999540891_7021250_433-10563-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 17:20:41,029 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_53638_210_21581_1_1534297319_5808449_885-80172-0_sp1.1 from training. Duration: 0.87275 2023-10-03 17:20:41,119 WARNING [train.py:1204] (2/4) Exclude cut with ID _14623_210_1925_1_1530773354962_3632609_157-218771-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 17:20:41,130 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_1368-78165-0_sp1.1 from training. Duration: 0.97275 2023-10-03 17:20:41,134 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_309-65177-0_sp0.9 from training. Duration: 0.97775 2023-10-03 17:20:41,139 WARNING [train.py:1204] (2/4) Exclude cut with ID _21632_210_2253_1_1531994001765_3744140_291-138812-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 17:20:41,152 WARNING [train.py:1204] (2/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_669-93163-0_sp1.1 from training. Duration: 0.8909375 2023-10-03 17:20:41,707 WARNING [train.py:1204] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_292-215346-0 from training. Duration: 0.97 2023-10-03 17:20:41,731 WARNING [train.py:1204] (2/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_286-222073-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 17:20:42,027 WARNING [train.py:1204] (2/4) Exclude cut with ID _59256_210_5986_1_1535076085997_3589120_121-237386-0_sp1.1 from training. Duration: 0.87275 2023-10-03 17:20:42,041 WARNING [train.py:1204] (2/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_96-320736-0 from training. Duration: 0.86 2023-10-03 17:20:42,050 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_128-202104-0 from training. Duration: 0.63 2023-10-03 17:20:42,139 WARNING [train.py:1204] (2/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_242-3472-0_sp0.9 from training. Duration: 0.811125 2023-10-03 17:20:42,185 WARNING [train.py:1204] (2/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_154-74182-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 17:20:42,190 WARNING [train.py:1204] (2/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_187-221566-0 from training. Duration: 0.43 2023-10-03 17:20:42,357 WARNING [train.py:1204] (2/4) Exclude cut with ID _6194_210_1534_1_1527511826160_3941194_14-282876-0 from training. Duration: 0.71 2023-10-03 17:20:42,399 WARNING [train.py:1204] (2/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_660-211088-0_sp1.1 from training. Duration: 0.563625 2023-10-03 17:20:43,184 WARNING [train.py:1204] (2/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_526-62079-0_sp1.1 from training. Duration: 0.6090625 2023-10-03 17:20:43,337 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_92-246270-0_sp1.1 from training. Duration: 0.77275 2023-10-03 17:20:43,429 WARNING [train.py:1204] (2/4) Exclude cut with ID IC0287W0023-142350 from training. Duration: 0.956 2023-10-03 17:20:43,508 WARNING [train.py:1204] (2/4) Exclude cut with ID _58605_210_12210_1_1534723195939_3904259_48-269402-0 from training. Duration: 0.96 2023-10-03 17:20:43,524 WARNING [train.py:1204] (2/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_416-257730-0_sp0.9 from training. Duration: 0.72225 2023-10-03 17:20:43,602 WARNING [train.py:1204] (2/4) Exclude cut with ID _7877_210_6014_1_1528286358099_3750136_185-17516-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 17:20:43,709 WARNING [train.py:1204] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_23-189234-0 from training. Duration: 0.83 2023-10-03 17:20:43,728 WARNING [train.py:1204] (2/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_237-89684-0 from training. Duration: 0.96 2023-10-03 17:20:43,802 WARNING [train.py:1204] (2/4) Exclude cut with ID _13916_210_9994_1_1530788341126_3763149_116-21034-0 from training. Duration: 0.96 2023-10-03 17:20:43,973 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_3133_210_2087_1_1526106446_2324690_246-125000-0_sp1.1 from training. Duration: 0.563625 2023-10-03 17:20:50,421 INFO [train.py:1046] (2/4) Epoch 39, batch 0, loss[loss=0.1615, simple_loss=0.2398, pruned_loss=0.04159, over 23270.00 frames. ], tot_loss[loss=0.1615, simple_loss=0.2398, pruned_loss=0.04159, over 23270.00 frames. ], batch size: 93, lr: 2.63e-03, grad_scale: 32.0 2023-10-03 17:20:50,422 INFO [train.py:1069] (2/4) Computing validation loss 2023-10-03 17:21:02,120 INFO [train.py:1078] (2/4) Epoch 39, validation: loss=0.3329, simple_loss=0.2734, pruned_loss=0.1962, over 1125622.00 frames. 2023-10-03 17:21:02,121 INFO [train.py:1079] (2/4) Maximum memory allocated so far is 20967MB 2023-10-03 17:21:02,359 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.0.feed_forward2.hidden_balancer.prob, batch_count=1345740.0, ans=0.125 2023-10-03 17:21:05,420 WARNING [train.py:1204] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_402-253027-0 from training. Duration: 0.54 2023-10-03 17:21:06,763 WARNING [train.py:1204] (2/4) Exclude cut with ID _59288_210_5854_1_1535077757774_3412246_158-289345-0_sp1.1 from training. Duration: 0.92725 2023-10-03 17:21:08,191 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_28211_210_6388_1_1532861506_4523224_128-328162-0_sp1.1 from training. Duration: 0.8181875 2023-10-03 17:21:12,356 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_788-278281-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 17:21:12,363 WARNING [train.py:1204] (2/4) Exclude cut with ID _49728_210_12651_1_1534041034294_6788150_624-195301-0_sp0.9 from training. Duration: 0.9444375 2023-10-03 17:21:12,403 WARNING [train.py:1204] (2/4) Exclude cut with ID _14860_210_4915_1_1530702020340_7343240_267-139775-0_sp1.1 from training. Duration: 0.963625 2023-10-03 17:21:13,729 WARNING [train.py:1204] (2/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_332-221057-0 from training. Duration: 0.68 2023-10-03 17:21:15,064 WARNING [train.py:1204] (2/4) Exclude cut with ID _40402_210_18219_1_1533522324115_2953150_242-76943-0 from training. Duration: 0.94 2023-10-03 17:21:16,556 WARNING [train.py:1204] (2/4) Exclude cut with ID _16747_210_5986_1_1531811544733_2749109_87-349587-0_sp1.1 from training. Duration: 0.963625 2023-10-03 17:21:17,835 WARNING [train.py:1204] (2/4) Exclude cut with ID _22982_210_1883_1_1531882790447_5449409_505-346452-0_sp1.1 from training. Duration: 0.963625 2023-10-03 17:21:19,437 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=1345806.6666666667, ans=0.1 2023-10-03 17:21:20,680 WARNING [train.py:1204] (2/4) Exclude cut with ID _17956_210_4353_1_1531209568125_6979970_13-192107-0_sp1.1 from training. Duration: 0.963625 2023-10-03 17:21:20,908 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.feed_forward3.hidden_balancer.prob, batch_count=1345806.6666666667, ans=0.125 2023-10-03 17:21:21,968 WARNING [train.py:1204] (2/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_430-307666-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 17:21:22,013 WARNING [train.py:1204] (2/4) Exclude cut with ID _2895_210_2477_1_1525956785794_4063750_289-11475-0_sp1.1 from training. Duration: 0.7454375 2023-10-03 17:21:22,025 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_3910_210_1794_1_1526742306_334530_28-238909-0_sp0.9 from training. Duration: 0.9 2023-10-03 17:21:24,623 WARNING [train.py:1204] (2/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_18-130336-0 from training. Duration: 0.79 2023-10-03 17:21:24,751 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_548-294316-0_sp0.9 from training. Duration: 0.9 2023-10-03 17:21:32,349 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_42-142473-0_sp1.1 from training. Duration: 0.6545625 2023-10-03 17:21:32,355 WARNING [train.py:1204] (2/4) Exclude cut with ID _59170_210_6721_1_1534983633960_4203335_326-48066-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 17:21:35,112 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_45886_210_17024_1_1533709215_471063_7-79165-0 from training. Duration: 0.98 2023-10-03 17:21:38,582 WARNING [train.py:1204] (2/4) Exclude cut with ID _44585_210_18628_1_1533605350387_4518199_280-73534-0_sp0.9 from training. Duration: 0.988875 2023-10-03 17:21:38,589 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_3475_210_2028_1_1526193578_3622569_265-276625-0_sp1.1 from training. Duration: 0.6909375 2023-10-03 17:21:41,309 WARNING [train.py:1204] (2/4) Exclude cut with ID _64631_210_27767_1_1535349580068_4165160_88-225232-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 17:21:42,877 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.ff3_skip_rate, batch_count=1345873.3333333333, ans=0.0 2023-10-03 17:21:44,070 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_205-265518-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 17:21:48,320 WARNING [train.py:1204] (2/4) Exclude cut with ID _40402_210_18219_1_1533522324115_2953150_271-253734-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 17:21:48,475 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=1345940.0, ans=0.1 2023-10-03 17:21:54,061 WARNING [train.py:1204] (2/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_282-257791-0 from training. Duration: 0.86 2023-10-03 17:21:54,256 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.conv_module1.balancer1.prob, batch_count=1345940.0, ans=0.125 2023-10-03 17:21:58,141 WARNING [train.py:1204] (2/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_356-45054-0 from training. Duration: 0.86 2023-10-03 17:21:58,187 WARNING [train.py:1204] (2/4) Exclude cut with ID _69584_210_5219_1_1535785133775_4044450_38-348116-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 17:21:58,188 WARNING [train.py:1204] (2/4) Exclude cut with ID _66763_210_17301_1_1535788667617_241750_21-328480-0_sp1.1 from training. Duration: 0.936375 2023-10-03 17:21:59,566 WARNING [train.py:1204] (2/4) Exclude cut with ID _26528_210_9070_1_1532570410842_3851310_497-97218-0_sp0.9 from training. Duration: 0.9555625 2023-10-03 17:21:59,612 WARNING [train.py:1204] (2/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_592-101075-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 17:22:02,334 WARNING [train.py:1204] (2/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_678-159307-0 from training. Duration: 0.85 2023-10-03 17:22:02,542 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.balancer2.prob, batch_count=1346006.6666666667, ans=0.125 2023-10-03 17:22:04,476 WARNING [train.py:1204] (2/4) Exclude cut with ID _28427_210_4220_1_1532581240967_3061069_166-319773-0_sp1.1 from training. Duration: 0.936375 2023-10-03 17:22:06,341 WARNING [train.py:1204] (2/4) Exclude cut with ID _29877_210_6228_1_1532397791106_5276149_606-224984-0_sp1.1 from training. Duration: 0.936375 2023-10-03 17:22:09,157 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_125-203105-0_sp1.1 from training. Duration: 0.72725 2023-10-03 17:22:13,198 WARNING [train.py:1204] (2/4) Exclude cut with ID IC0620W0113-306765 from training. Duration: 0.768 2023-10-03 17:22:13,319 WARNING [train.py:1204] (2/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_410-287857-0_sp1.1 from training. Duration: 0.8454375 2023-10-03 17:22:14,578 INFO [train.py:1046] (2/4) Epoch 39, batch 50, loss[loss=0.1667, simple_loss=0.2412, pruned_loss=0.04616, over 22785.00 frames. ], tot_loss[loss=0.1598, simple_loss=0.2403, pruned_loss=0.03964, over 1076628.46 frames. ], batch size: 322, lr: 2.63e-03, grad_scale: 32.0 2023-10-03 17:22:16,086 WARNING [train.py:1204] (2/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_78-95260-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 17:22:18,862 WARNING [train.py:1204] (2/4) Exclude cut with ID _58059_210_5854_1_1534901471149_3164808_6-73080-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 17:22:18,884 WARNING [train.py:1204] (2/4) Exclude cut with ID _5166_210_2084_1_1527073197160_4087993_331-117829-0 from training. Duration: 0.62 2023-10-03 17:22:20,217 WARNING [train.py:1204] (2/4) Exclude cut with ID _14880_210_8895_1_1530680426655_3795259_289-212817-0_sp0.9 from training. Duration: 0.7666875 2023-10-03 17:22:20,245 WARNING [train.py:1204] (2/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_620-144779-0_sp1.1 from training. Duration: 0.92725 2023-10-03 17:22:22,995 WARNING [train.py:1204] (2/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_561-288575-0_sp1.1 from training. Duration: 0.963625 2023-10-03 17:22:24,341 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_22657_210_6890_1_1532086002_1637216_122-188128-0_sp1.1 from training. Duration: 0.963625 2023-10-03 17:22:27,032 WARNING [train.py:1204] (2/4) Exclude cut with ID _16756_210_4915_1_1531047803763_7104259_604-86607-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 17:22:28,718 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.0.conv_skip_rate, batch_count=1346140.0, ans=0.0 2023-10-03 17:22:29,901 WARNING [train.py:1204] (2/4) Exclude cut with ID _15150_210_8341_1_1530752411803_7256384_469-239509-0 from training. Duration: 0.68 2023-10-03 17:22:29,909 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_10987_210_4163_1_1529760555_4123902_302-87926-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 17:22:35,375 WARNING [train.py:1204] (2/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_469-94673-0_sp0.9 from training. Duration: 0.87775 2023-10-03 17:22:37,363 WARNING [train.py:1204] (2/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_798-106179-0 from training. Duration: 0.83 2023-10-03 17:22:38,769 WARNING [train.py:1204] (2/4) Exclude cut with ID _16744_210_5986_1_1531724405384_3907567_242-111316-0 from training. Duration: 0.93 2023-10-03 17:22:41,247 WARNING [train.py:1204] (2/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_938-205135-0_sp0.9 from training. Duration: 0.9666875 2023-10-03 17:22:41,343 WARNING [train.py:1204] (2/4) Exclude cut with ID _64135_210_2253_1_1535677267876_7608972_133-188266-0_sp0.9 from training. Duration: 0.9 2023-10-03 17:22:41,355 WARNING [train.py:1204] (2/4) Exclude cut with ID _44585_210_18628_1_1533605350387_4518199_623-346820-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 17:22:42,529 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.621e+02 1.895e+02 2.125e+02 2.422e+02 4.892e+02, threshold=4.250e+02, percent-clipped=3.0 2023-10-03 17:22:42,630 WARNING [train.py:1204] (2/4) Exclude cut with ID _42326_210_7770_1_1533814282531_3707269_454-266903-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 17:22:44,029 WARNING [train.py:1204] (2/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_5-55893-0_sp1.1 from training. Duration: 0.5 2023-10-03 17:22:44,067 WARNING [train.py:1204] (2/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_577-332600-0_sp1.1 from training. Duration: 0.5181875 2023-10-03 17:22:44,068 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_957-138882-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 17:22:50,848 WARNING [train.py:1204] (2/4) Exclude cut with ID _12556_210_8341_1_1530081024100_7327690_219-38004-0_sp1.1 from training. Duration: 0.97275 2023-10-03 17:22:52,211 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_397-147196-0_sp1.1 from training. Duration: 0.72725 2023-10-03 17:22:52,237 WARNING [train.py:1204] (2/4) Exclude cut with ID _3541_210_2477_1_1526475408709_3263973_231-104736-0_sp0.9 from training. Duration: 0.8666875 2023-10-03 17:22:52,530 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.nonlin_attention.balancer.prob, batch_count=1346206.6666666667, ans=0.125 2023-10-03 17:22:53,588 WARNING [train.py:1204] (2/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_398-334558-0 from training. Duration: 0.96 2023-10-03 17:22:55,022 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_257-20733-0_sp1.1 from training. Duration: 0.6909375 2023-10-03 17:22:56,418 WARNING [train.py:1204] (2/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_146-282400-0_sp1.1 from training. Duration: 0.6454375 2023-10-03 17:22:56,423 WARNING [train.py:1204] (2/4) Exclude cut with ID _51899_210_20034_1_1534129254466_3669220_265-215428-0 from training. Duration: 0.98 2023-10-03 17:22:56,492 WARNING [train.py:1204] (2/4) Exclude cut with ID _34423_210_8750_1_1533430798456_3330199_219-106541-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 17:22:57,896 WARNING [train.py:1204] (2/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_458-67321-0 from training. Duration: 0.97 2023-10-03 17:23:06,386 WARNING [train.py:1204] (2/4) Exclude cut with ID _6653_210_2629_1_1527919172441_6732679_1051-182606-0_sp1.1 from training. Duration: 0.936375 2023-10-03 17:23:07,606 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_53555_210_5632_1_1534595220_7232297_603-4396-0_sp1.1 from training. Duration: 0.97275 2023-10-03 17:23:07,699 WARNING [train.py:1204] (2/4) Exclude cut with ID _54218_210_12210_1_1534413646388_4409259_159-144367-0_sp1.1 from training. Duration: 0.963625 2023-10-03 17:23:09,089 WARNING [train.py:1204] (2/4) Exclude cut with ID _7606_210_4915_1_1528894253277_5080690_301-10732-0_sp0.9 from training. Duration: 0.9 2023-10-03 17:23:09,093 WARNING [train.py:1204] (2/4) Exclude cut with ID _15026_210_8750_1_1530777602316_3291850_211-195043-0_sp0.9 from training. Duration: 0.888875 2023-10-03 17:23:11,789 WARNING [train.py:1204] (2/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_146-282400-0 from training. Duration: 0.71 2023-10-03 17:23:11,824 WARNING [train.py:1204] (2/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_65-88242-0 from training. Duration: 0.56 2023-10-03 17:23:13,142 WARNING [train.py:1204] (2/4) Exclude cut with ID _55857_210_2346_1_1534644007831_6898209_80-107757-0_sp1.1 from training. Duration: 0.963625 2023-10-03 17:23:13,188 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_15126_210_6753_1_1530778677_2591185_120-277707-0_sp0.9 from training. Duration: 0.888875 2023-10-03 17:23:15,691 WARNING [train.py:1204] (2/4) Exclude cut with ID _44778_210_2054_1_1533798127203_7443090_1033-168593-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 17:23:15,726 WARNING [train.py:1204] (2/4) Exclude cut with ID _46683_210_9642_1_1533730913492_4591116_475-30711-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 17:23:15,758 WARNING [train.py:1204] (2/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_253-308377-0 from training. Duration: 0.49 2023-10-03 17:23:15,807 WARNING [train.py:1204] (2/4) Exclude cut with ID _4680_210_3525_1_1527249757953_893386_75-78979-0 from training. Duration: 0.91 2023-10-03 17:23:17,238 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_5522_210_3273_1_1527249606_3909096_318-203153-0_sp0.9 from training. Duration: 0.47775 2023-10-03 17:23:17,346 WARNING [train.py:1204] (2/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_550-170116-0_sp1.1 from training. Duration: 0.92725 2023-10-03 17:23:18,672 WARNING [train.py:1204] (2/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_277-210798-0_sp0.9 from training. Duration: 0.611125 2023-10-03 17:23:18,725 WARNING [train.py:1204] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_620-276793-0 from training. Duration: 0.59 2023-10-03 17:23:18,743 WARNING [train.py:1204] (2/4) Exclude cut with ID _3067_210_2667_1_1526212974640_4503168_119-175776-0 from training. Duration: 0.74 2023-10-03 17:23:20,119 WARNING [train.py:1204] (2/4) Exclude cut with ID _55209_210_1531_1_1534675828996_4597160_681-195827-0_sp1.1 from training. Duration: 0.92725 2023-10-03 17:23:20,189 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_427-78097-0_sp1.1 from training. Duration: 0.72725 2023-10-03 17:23:22,244 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.3.encoder.layers.1.conv_module2.whiten, num_groups=1, num_channels=512, metric=4.02 vs. limit=15.0 2023-10-03 17:23:22,867 WARNING [train.py:1204] (2/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_96-290039-0_sp0.9 from training. Duration: 0.62225 2023-10-03 17:23:22,880 WARNING [train.py:1204] (2/4) Exclude cut with ID _45329_210_7005_1_1534157951211_6774339_287-292174-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 17:23:24,332 WARNING [train.py:1204] (2/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_200-292228-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 17:23:26,983 INFO [train.py:1046] (2/4) Epoch 39, batch 100, loss[loss=0.1495, simple_loss=0.2253, pruned_loss=0.03681, over 23797.00 frames. ], tot_loss[loss=0.159, simple_loss=0.2393, pruned_loss=0.03932, over 1875419.03 frames. ], batch size: 179, lr: 2.63e-03, grad_scale: 16.0 2023-10-03 17:23:27,084 WARNING [train.py:1204] (2/4) Exclude cut with ID _69884_210_9771_1_1535846392818_4166169_291-194999-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 17:23:29,901 WARNING [train.py:1204] (2/4) Exclude cut with ID _58895_210_8750_1_1535184081320_3649089_265-323810-0_sp1.1 from training. Duration: 0.87275 2023-10-03 17:23:31,244 WARNING [train.py:1204] (2/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_58-68956-0 from training. Duration: 0.83 2023-10-03 17:23:31,248 WARNING [train.py:1204] (2/4) Exclude cut with ID _39867_210_9826_1_1533553222030_9344259_322-115153-0_sp1.1 from training. Duration: 0.963625 2023-10-03 17:23:35,042 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_3371_210_1794_1_1526384462_5580777_396-339812-0_sp1.1 from training. Duration: 0.77275 2023-10-03 17:23:36,293 WARNING [train.py:1204] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_731-2428-0_sp0.9 from training. Duration: 0.92225 2023-10-03 17:23:36,312 WARNING [train.py:1204] (2/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_98-741-0_sp1.1 from training. Duration: 0.72725 2023-10-03 17:23:36,314 WARNING [train.py:1204] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_238-245655-0_sp0.9 from training. Duration: 0.9 2023-10-03 17:23:36,335 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6788_210_1388_1_1527898698_4562021_91-197981-0_sp0.9 from training. Duration: 0.92225 2023-10-03 17:23:36,909 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.5.encoder.layers.0.conv_module1.whiten, num_groups=1, num_channels=256, metric=7.27 vs. limit=15.0 2023-10-03 17:23:37,779 WARNING [train.py:1204] (2/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_860-96198-0 from training. Duration: 0.73 2023-10-03 17:23:40,527 WARNING [train.py:1204] (2/4) Exclude cut with ID _14268_210_9206_1_1530622771285_4714099_331-105351-0_sp0.9 from training. Duration: 0.72225 2023-10-03 17:23:40,563 WARNING [train.py:1204] (2/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_204-137133-0_sp1.1 from training. Duration: 0.92725 2023-10-03 17:23:41,761 WARNING [train.py:1204] (2/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_5-46496-0_sp1.1 from training. Duration: 0.936375 2023-10-03 17:23:41,769 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_160-218721-0_sp1.1 from training. Duration: 0.87275 2023-10-03 17:23:44,716 WARNING [train.py:1204] (2/4) Exclude cut with ID _45329_210_7005_1_1534157951211_6774339_503-287883-0 from training. Duration: 0.88 2023-10-03 17:23:46,089 WARNING [train.py:1204] (2/4) Exclude cut with ID _57572_210_5517_1_1534921219624_3809484_110-79134-0_sp1.1 from training. Duration: 0.92725 2023-10-03 17:23:46,301 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.bypass.scale_min, batch_count=1346473.3333333333, ans=0.2 2023-10-03 17:23:47,396 WARNING [train.py:1204] (2/4) Exclude cut with ID _44585_210_18628_1_1533605350387_4518199_421-37641-0_sp1.1 from training. Duration: 0.936375 2023-10-03 17:23:48,843 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_42-142473-0_sp0.9 from training. Duration: 0.8 2023-10-03 17:23:50,340 WARNING [train.py:1204] (2/4) Exclude cut with ID _5166_210_2084_1_1527073197160_4087993_310-114452-0_sp0.9 from training. Duration: 0.6555625 2023-10-03 17:23:50,880 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.3.encoder.layers.3.whiten, num_groups=1, num_channels=512, metric=4.60 vs. limit=12.0 2023-10-03 17:23:53,219 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9029W0120-509520 from training. Duration: 0.768 2023-10-03 17:23:54,521 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9029W0433-509820 from training. Duration: 0.896 2023-10-03 17:23:55,911 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_40222_210_6228_1_1533385298_4481939_163-334586-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 17:23:55,913 WARNING [train.py:1204] (2/4) Exclude cut with ID _48677_210_5663_1_1533902405919_3892989_408-40740-0_sp1.1 from training. Duration: 0.7545625 2023-10-03 17:23:58,656 WARNING [train.py:1204] (2/4) Exclude cut with ID _24314_210_4915_1_1532135078116_7170179_521-258868-0_sp0.9 from training. Duration: 0.87775 2023-10-03 17:24:00,059 WARNING [train.py:1204] (2/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_576-51198-0_sp1.1 from training. Duration: 0.92725 2023-10-03 17:24:01,415 WARNING [train.py:1204] (2/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_306-216871-0_sp1.1 from training. Duration: 0.97275 2023-10-03 17:24:08,568 WARNING [train.py:1204] (2/4) Exclude cut with ID _20684_210_6917_1_1531567751646_4203930_463-119982-0_sp1.1 from training. Duration: 0.97275 2023-10-03 17:24:09,902 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9084W0012-536850 from training. Duration: 0.64 2023-10-03 17:24:11,360 WARNING [train.py:1204] (2/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_847-41334-0_sp1.1 from training. Duration: 0.4 2023-10-03 17:24:13,095 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.ff3_skip_rate, batch_count=1346606.6666666667, ans=0.0 2023-10-03 17:24:15,495 WARNING [train.py:1204] (2/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_326-165900-0_sp1.1 from training. Duration: 0.62725 2023-10-03 17:24:16,793 WARNING [train.py:1204] (2/4) Exclude cut with ID _7537_210_2084_1_1528502395431_3924359_128-268029-0_sp1.1 from training. Duration: 0.8909375 2023-10-03 17:24:18,158 WARNING [train.py:1204] (2/4) Exclude cut with ID _67499_210_17282_1_1535509805220_3692430_333-155030-0_sp1.1 from training. Duration: 0.97275 2023-10-03 17:24:19,683 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_34008_210_10431_1_1533345424_8183247_588-257177-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 17:24:24,078 WARNING [train.py:1204] (2/4) Exclude cut with ID _25914_210_6400_1_1532141317058_644740_52-96808-0_sp1.1 from training. Duration: 0.82725 2023-10-03 17:24:25,532 WARNING [train.py:1204] (2/4) Exclude cut with ID _56494_210_4220_1_1535284837625_3033169_69-138150-0_sp1.1 from training. Duration: 0.8545625 2023-10-03 17:24:29,499 WARNING [train.py:1204] (2/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_19-143553-0_sp1.1 from training. Duration: 0.97275 2023-10-03 17:24:29,562 WARNING [train.py:1204] (2/4) Exclude cut with ID _21878_210_3525_1_1531738772002_7264330_582-130735-0_sp1.1 from training. Duration: 0.936375 2023-10-03 17:24:30,948 WARNING [train.py:1204] (2/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_943-201356-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 17:24:30,957 WARNING [train.py:1204] (2/4) Exclude cut with ID _3231_210_2106_1_1526205864934_6784220_422-229276-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 17:24:30,980 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_48049_210_5632_1_1533864553_3519937_73-198717-0_sp1.1 from training. Duration: 0.97275 2023-10-03 17:24:32,281 WARNING [train.py:1204] (2/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_329-285801-0 from training. Duration: 0.91 2023-10-03 17:24:32,304 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9171W0114-579000 from training. Duration: 0.896 2023-10-03 17:24:32,322 WARNING [train.py:1204] (2/4) Exclude cut with ID _3120_210_3525_1_1526036143112_4072980_210-132675-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 17:24:34,916 WARNING [train.py:1204] (2/4) Exclude cut with ID _55857_210_2346_1_1534644007831_6898209_745-98391-0_sp0.9 from training. Duration: 0.8444375 2023-10-03 17:24:34,976 WARNING [train.py:1204] (2/4) Exclude cut with ID _5250_210_5219_1_1527249561806_8692110_615-322035-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 17:24:34,988 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_36756_210_7234_1_1533026991_966276_119-315767-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 17:24:35,012 WARNING [train.py:1204] (2/4) Exclude cut with ID _13916_210_9994_1_1530788341126_3763149_60-191556-0_sp1.1 from training. Duration: 0.5545625 2023-10-03 17:24:35,026 WARNING [train.py:1204] (2/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_177-236809-0_sp1.1 from training. Duration: 0.4454375 2023-10-03 17:24:36,892 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7002_210_1794_1_1527939683_5157286_184-229630-0_sp1.1 from training. Duration: 0.67275 2023-10-03 17:24:36,898 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_50428_210_5724_1_1534247532_4126418_464-18322-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 17:24:36,966 WARNING [train.py:1204] (2/4) Exclude cut with ID _35799_210_5854_1_1533263487887_3529071_466-131402-0_sp1.1 from training. Duration: 0.936375 2023-10-03 17:24:37,045 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_1259-132768-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 17:24:39,525 WARNING [train.py:1204] (2/4) Exclude cut with ID _6694_210_4599_1_1527852637490_3965529_144-185051-0_sp1.1 from training. Duration: 0.87275 2023-10-03 17:24:40,684 INFO [train.py:1046] (2/4) Epoch 39, batch 150, loss[loss=0.1453, simple_loss=0.2209, pruned_loss=0.03488, over 24314.00 frames. ], tot_loss[loss=0.1586, simple_loss=0.2392, pruned_loss=0.03902, over 2521425.03 frames. ], batch size: 56, lr: 2.63e-03, grad_scale: 16.0 2023-10-03 17:24:40,731 WARNING [train.py:1204] (2/4) Exclude cut with ID _64639_210_5816_1_1535367619437_3528000_59-133290-0_sp1.1 from training. Duration: 0.963625 2023-10-03 17:24:42,237 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.bypass_mid.scale_min, batch_count=1346740.0, ans=0.2 2023-10-03 17:24:44,767 WARNING [train.py:1204] (2/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_223-49525-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 17:24:46,338 WARNING [train.py:1204] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_748-36150-0_sp1.1 from training. Duration: 0.82725 2023-10-03 17:24:46,338 WARNING [train.py:1204] (2/4) Exclude cut with ID _19896_210_1883_1_1531709022662_6649569_52-221627-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 17:24:47,709 WARNING [train.py:1204] (2/4) Exclude cut with ID _6789_210_1534_1_1527854471671_2870519_296-115766-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 17:24:50,515 WARNING [train.py:1204] (2/4) Exclude cut with ID _14624_210_1925_1_1530858690657_3642219_100-19871-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 17:24:50,555 WARNING [train.py:1204] (2/4) Exclude cut with ID _12555_210_8750_1_1530075594402_3780239_50-293675-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 17:24:53,326 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.conv_module2.balancer1.min_positive, batch_count=1346806.6666666667, ans=0.025 2023-10-03 17:24:54,517 WARNING [train.py:1204] (2/4) Exclude cut with ID _14880_210_8895_1_1530680426655_3795259_289-212817-0_sp1.1 from training. Duration: 0.62725 2023-10-03 17:24:54,705 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.out_combiner.scale_min, batch_count=1346806.6666666667, ans=0.2 2023-10-03 17:24:55,862 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_14056_210_4163_1_1530604483_7572827_388-211626-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 17:24:58,552 WARNING [train.py:1204] (2/4) Exclude cut with ID _5289_210_2084_1_1527678202736_3779299_392-261988-0 from training. Duration: 0.65 2023-10-03 17:24:58,559 WARNING [train.py:1204] (2/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_124-345840-0 from training. Duration: 0.52 2023-10-03 17:24:58,564 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_2655_210_2087_1_1525688959_3897400_437-337690-0 from training. Duration: 0.63 2023-10-03 17:25:01,311 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_3780_210_2087_1_1526467391_239068_13-274633-0_sp1.1 from training. Duration: 0.92725 2023-10-03 17:25:01,316 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_805-71497-0_sp1.1 from training. Duration: 0.6454375 2023-10-03 17:25:02,739 WARNING [train.py:1204] (2/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_434-83000-0_sp1.1 from training. Duration: 0.8909375 2023-10-03 17:25:04,123 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_17613_210_2151_1_1531137569_2366160_285-219531-0_sp1.1 from training. Duration: 0.97275 2023-10-03 17:25:04,128 WARNING [train.py:1204] (2/4) Exclude cut with ID _56108_210_16380_1_1534505446159_3887959_22-37858-0_sp1.1 from training. Duration: 0.936375 2023-10-03 17:25:04,153 WARNING [train.py:1204] (2/4) Exclude cut with ID _28427_210_4220_1_1532581240967_3061069_200-88444-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 17:25:04,212 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_43802_210_2290_1_1533516203_8829504_77-279042-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 17:25:04,316 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9309W0193-640320 from training. Duration: 0.896 2023-10-03 17:25:06,183 WARNING [train.py:1204] (2/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_516-324055-0_sp1.1 from training. Duration: 0.936375 2023-10-03 17:25:07,378 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.556e+02 1.926e+02 2.144e+02 2.350e+02 3.601e+02, threshold=4.288e+02, percent-clipped=0.0 2023-10-03 17:25:13,925 WARNING [train.py:1204] (2/4) Exclude cut with ID _40094_210_16380_1_1533294005773_3641159_10-261240-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 17:25:16,718 WARNING [train.py:1204] (2/4) Exclude cut with ID _6267_210_2084_1_1527764658526_3894730_375-219887-0_sp1.1 from training. Duration: 0.6090625 2023-10-03 17:25:18,170 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6788_210_1388_1_1527898698_4562021_180-75288-0 from training. Duration: 0.99 2023-10-03 17:25:20,983 WARNING [train.py:1204] (2/4) Exclude cut with ID _8030_210_7497_1_1528444886781_3531089_262-86750-0_sp1.1 from training. Duration: 0.736375 2023-10-03 17:25:20,997 WARNING [train.py:1204] (2/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_934-325498-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 17:25:22,139 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_456-22141-0_sp0.9 from training. Duration: 0.97775 2023-10-03 17:25:23,563 WARNING [train.py:1204] (2/4) Exclude cut with ID _49643_210_7989_1_1534640244224_3939031_598-161926-0_sp1.1 from training. Duration: 0.8181875 2023-10-03 17:25:24,940 WARNING [train.py:1204] (2/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_345-49721-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 17:25:25,026 WARNING [train.py:1204] (2/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_440-141811-0_sp1.1 from training. Duration: 0.863625 2023-10-03 17:25:25,579 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.4.encoder.layers.0.whiten, num_groups=1, num_channels=384, metric=5.09 vs. limit=12.0 2023-10-03 17:25:26,386 WARNING [train.py:1204] (2/4) Exclude cut with ID _64048_210_10626_1_1535198382802_3835179_394-212120-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 17:25:26,439 WARNING [train.py:1204] (2/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_91-277684-0 from training. Duration: 0.74 2023-10-03 17:25:31,877 WARNING [train.py:1204] (2/4) Exclude cut with ID _23038_210_9570_1_1532001584158_3652419_191-346343-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 17:25:33,238 WARNING [train.py:1204] (2/4) Exclude cut with ID _15140_210_9288_1_1530791388450_4880149_351-347974-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 17:25:33,277 WARNING [train.py:1204] (2/4) Exclude cut with ID _58605_210_12210_1_1534723195939_3904259_48-269402-0_sp1.1 from training. Duration: 0.87275 2023-10-03 17:25:33,286 WARNING [train.py:1204] (2/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_401-53423-0_sp0.9 from training. Duration: 0.611125 2023-10-03 17:25:34,661 WARNING [train.py:1204] (2/4) Exclude cut with ID _42659_210_2070_1_1533607442765_3120380_158-49004-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 17:25:36,448 WARNING [train.py:1204] (2/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_81-75849-0_sp1.1 from training. Duration: 0.3545625 2023-10-03 17:25:37,336 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.1.encoder.layers.1.conv_module2.whiten, num_groups=1, num_channels=256, metric=10.41 vs. limit=15.0 2023-10-03 17:25:39,754 WARNING [train.py:1204] (2/4) Exclude cut with ID _27052_210_5986_1_1532329197187_3585009_314-106499-0_sp1.1 from training. Duration: 0.836375 2023-10-03 17:25:41,743 WARNING [train.py:1204] (2/4) Exclude cut with ID _4626_210_3532_1_1526817652717_3916859_446-206656-0_sp0.9 from training. Duration: 0.8444375 2023-10-03 17:25:43,059 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_53378_210_17282_1_1534221071_5769509_158-114960-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 17:25:44,449 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_3171_210_2087_1_1526109084_2330007_37-64334-0_sp0.9 from training. Duration: 0.911125 2023-10-03 17:25:44,464 WARNING [train.py:1204] (2/4) Exclude cut with ID _5410_210_1388_1_1527397055795_1719020_39-112221-0 from training. Duration: 0.69 2023-10-03 17:25:44,482 WARNING [train.py:1204] (2/4) Exclude cut with ID _8064_210_5399_1_1528372299291_4282973_4-91037-0_sp0.9 from training. Duration: 0.97775 2023-10-03 17:25:45,822 WARNING [train.py:1204] (2/4) Exclude cut with ID ID0079W0443-717795 from training. Duration: 0.541 2023-10-03 17:25:48,658 WARNING [train.py:1204] (2/4) Exclude cut with ID _34734_210_4525_1_1533365977542_4448720_766-297928-0_sp1.1 from training. Duration: 0.936375 2023-10-03 17:25:51,362 WARNING [train.py:1204] (2/4) Exclude cut with ID _51471_210_1889_1_1534399391390_3094250_310-337793-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 17:25:51,374 WARNING [train.py:1204] (2/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_678-159307-0_sp0.9 from training. Duration: 0.9444375 2023-10-03 17:25:51,555 WARNING [train.py:1204] (2/4) Exclude cut with ID _50093_210_4220_1_1534417141207_3432299_259-322252-0 from training. Duration: 0.92 2023-10-03 17:25:52,885 INFO [train.py:1046] (2/4) Epoch 39, batch 200, loss[loss=0.1538, simple_loss=0.2279, pruned_loss=0.03979, over 23334.00 frames. ], tot_loss[loss=0.1598, simple_loss=0.2406, pruned_loss=0.03944, over 3010651.20 frames. ], batch size: 119, lr: 2.62e-03, grad_scale: 16.0 2023-10-03 17:25:52,960 WARNING [train.py:1204] (2/4) Exclude cut with ID _47790_210_16187_1_1533862814066_4207519_67-103444-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 17:25:52,987 WARNING [train.py:1204] (2/4) Exclude cut with ID _40551_210_10635_1_1533776424632_3416179_311-247578-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 17:25:55,704 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_4633_210_2040_1_1526824163_4145343_416-76117-0 from training. Duration: 0.95 2023-10-03 17:25:57,012 WARNING [train.py:1204] (2/4) Exclude cut with ID _5166_210_2084_1_1527073197160_4087993_331-117829-0_sp0.9 from training. Duration: 0.688875 2023-10-03 17:25:59,444 WARNING [train.py:1204] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_299-247059-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 17:25:59,522 WARNING [train.py:1204] (2/4) Exclude cut with ID _64002_210_9570_1_1535198605157_38979_2-240845-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 17:26:03,635 WARNING [train.py:1204] (2/4) Exclude cut with ID _4681_210_3525_1_1527250689464_120889_15-126760-0_sp0.9 from training. Duration: 0.9 2023-10-03 17:26:03,654 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_21500_210_2054_1_1532149310_2716758_256-156611-0_sp1.1 from training. Duration: 0.936375 2023-10-03 17:26:03,674 WARNING [train.py:1204] (2/4) Exclude cut with ID _16908_210_5724_1_1531119157248_3772700_624-203729-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 17:26:11,027 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.1.encoder.layers.0.self_attn1.whiten, num_groups=1, num_channels=256, metric=12.08 vs. limit=22.5 2023-10-03 17:26:18,408 INFO [scaling.py:1118] (2/4) WithLoss: name=encoder.encoders.2.encoder.layers.0.self_attn_weights, loss-sum=0.000e+00 2023-10-03 17:26:25,225 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=1347206.6666666667, ans=0.1 2023-10-03 17:26:26,352 WARNING [train.py:1204] (2/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_605-219732-0_sp1.1 from training. Duration: 0.8545625 2023-10-03 17:26:26,387 WARNING [train.py:1204] (2/4) Exclude cut with ID _44718_210_5816_1_1534050051869_6469030_380-167520-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 17:26:27,733 WARNING [train.py:1204] (2/4) Exclude cut with ID _19896_210_1883_1_1531709022662_6649569_94-229927-0_sp1.1 from training. Duration: 0.8818125 2023-10-03 17:26:27,791 WARNING [train.py:1204] (2/4) Exclude cut with ID _19514_210_9259_1_1531530071330_5084230_519-174300-0_sp1.1 from training. Duration: 0.963625 2023-10-03 17:26:29,145 WARNING [train.py:1204] (2/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_340-174176-0_sp0.9 from training. Duration: 0.5444375 2023-10-03 17:26:29,149 WARNING [train.py:1204] (2/4) Exclude cut with ID _8263_210_1664_1_1528439452891_970220_68-181805-0_sp1.1 from training. Duration: 0.5818125 2023-10-03 17:26:31,858 WARNING [train.py:1204] (2/4) Exclude cut with ID _3120_210_3525_1_1526036143112_4072980_348-257723-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 17:26:33,230 WARNING [train.py:1204] (2/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_453-133983-0_sp1.1 from training. Duration: 0.7090625 2023-10-03 17:26:34,590 WARNING [train.py:1204] (2/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_17-86060-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 17:26:34,610 WARNING [train.py:1204] (2/4) Exclude cut with ID _29877_210_6228_1_1532397791106_5276149_228-184657-0_sp1.1 from training. Duration: 0.97275 2023-10-03 17:26:34,714 WARNING [train.py:1204] (2/4) Exclude cut with ID _18516_210_9206_1_1531704653206_3692406_198-334870-0 from training. Duration: 0.92 2023-10-03 17:26:36,124 WARNING [train.py:1204] (2/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_340-174176-0_sp1.1 from training. Duration: 0.4454375 2023-10-03 17:26:36,142 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_3133_210_2087_1_1526106446_2324690_14-139186-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 17:26:42,658 WARNING [train.py:1204] (2/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_126-29104-0_sp0.9 from training. Duration: 0.9444375 2023-10-03 17:26:47,535 WARNING [train.py:1204] (2/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_84-10310-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 17:26:54,205 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_15517_210_6753_1_1530954008_3487336_79-273076-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 17:26:55,545 WARNING [train.py:1204] (2/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_371-18169-0_sp0.9 from training. Duration: 0.9555625 2023-10-03 17:27:01,131 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_68702_210_23689_1_1535799577_3420068_170-92788-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 17:27:03,913 WARNING [train.py:1204] (2/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_78-182912-0 from training. Duration: 0.59 2023-10-03 17:27:05,265 INFO [train.py:1046] (2/4) Epoch 39, batch 250, loss[loss=0.1567, simple_loss=0.2348, pruned_loss=0.03926, over 23599.00 frames. ], tot_loss[loss=0.1591, simple_loss=0.2399, pruned_loss=0.03919, over 3393346.43 frames. ], batch size: 135, lr: 2.62e-03, grad_scale: 16.0 2023-10-03 17:27:05,314 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_3019_210_2028_1_1526044245_3264865_297-12384-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 17:27:05,319 WARNING [train.py:1204] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_147-111137-0_sp1.1 from training. Duration: 0.863625 2023-10-03 17:27:05,324 WARNING [train.py:1204] (2/4) Exclude cut with ID _28323_210_16187_1_1532480395667_4100300_117-8859-0_sp1.1 from training. Duration: 0.97275 2023-10-03 17:27:05,401 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7762_210_1794_1_1528199508_4057408_419-125009-0_sp1.1 from training. Duration: 0.7545625 2023-10-03 17:27:05,485 WARNING [train.py:1204] (2/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_268-87909-0 from training. Duration: 0.99 2023-10-03 17:27:06,877 WARNING [train.py:1204] (2/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_118-311357-0_sp1.1 from training. Duration: 0.92725 2023-10-03 17:27:06,891 WARNING [train.py:1204] (2/4) Exclude cut with ID ID0377W0003-870225 from training. Duration: 0.852 2023-10-03 17:27:08,429 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.1.conv_skip_rate, batch_count=1347406.6666666667, ans=0.0 2023-10-03 17:27:09,604 WARNING [train.py:1204] (2/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_56-5838-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 17:27:10,883 WARNING [train.py:1204] (2/4) Exclude cut with ID _14603_210_4220_1_1530615688441_3097319_90-31383-0_sp0.9 from training. Duration: 0.9666875 2023-10-03 17:27:12,341 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_31583_210_3852_1_1532595851_4024413_58-86657-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 17:27:12,375 WARNING [train.py:1204] (2/4) Exclude cut with ID _30304_210_6319_1_1532476827261_7582060_344-113875-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 17:27:14,435 WARNING [train.py:1204] (2/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_278-104676-0_sp1.1 from training. Duration: 0.8909375 2023-10-03 17:27:14,458 WARNING [train.py:1204] (2/4) Exclude cut with ID _55844_210_2346_1_1534471212569_7625707_764-2387-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 17:27:17,746 WARNING [train.py:1204] (2/4) Exclude cut with ID _27430_210_7770_1_1532604689375_1378590_157-14963-0_sp1.1 from training. Duration: 0.963625 2023-10-03 17:27:20,897 WARNING [train.py:1204] (2/4) Exclude cut with ID _7160_210_1876_1_1527940923231_3703799_282-322611-0_sp1.1 from training. Duration: 0.7909375 2023-10-03 17:27:30,567 WARNING [train.py:1204] (2/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_190-138640-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 17:27:31,994 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_33348_210_5632_1_1533086912_3324030_63-270922-0_sp1.1 from training. Duration: 0.97275 2023-10-03 17:27:32,026 WARNING [train.py:1204] (2/4) Exclude cut with ID _54871_210_22356_1_1534665798253_3856359_135-184281-0_sp1.1 from training. Duration: 0.9 2023-10-03 17:27:32,168 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.feed_forward3.hidden_balancer.prob, batch_count=1347473.3333333333, ans=0.125 2023-10-03 17:27:33,344 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.636e+02 1.891e+02 2.094e+02 2.550e+02 3.805e+02, threshold=4.188e+02, percent-clipped=0.0 2023-10-03 17:27:36,967 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.0.layers.1.conv_module1.whiten, num_groups=1, num_channels=192, metric=9.33 vs. limit=15.0 2023-10-03 17:27:37,784 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.feed_forward1.hidden_balancer.prob, batch_count=1347540.0, ans=0.125 2023-10-03 17:27:38,910 WARNING [train.py:1204] (2/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_277-210798-0_sp1.1 from training. Duration: 0.5 2023-10-03 17:27:40,203 WARNING [train.py:1204] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_402-253027-0_sp0.9 from training. Duration: 0.6 2023-10-03 17:27:40,280 WARNING [train.py:1204] (2/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_540-275685-0_sp0.9 from training. Duration: 0.988875 2023-10-03 17:27:41,610 WARNING [train.py:1204] (2/4) Exclude cut with ID _59493_210_17282_1_1535078419077_3225879_118-18960-0_sp1.1 from training. Duration: 0.936375 2023-10-03 17:27:41,793 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.conv_skip_rate, batch_count=1347540.0, ans=0.0 2023-10-03 17:27:43,515 WARNING [train.py:1204] (2/4) Exclude cut with ID _6194_210_1534_1_1527511826160_3941194_321-148641-0_sp1.1 from training. Duration: 0.5818125 2023-10-03 17:27:43,521 WARNING [train.py:1204] (2/4) Exclude cut with ID _61133_210_9969_1_1535196777227_3696409_333-270325-0_sp1.1 from training. Duration: 0.8454375 2023-10-03 17:27:43,584 WARNING [train.py:1204] (2/4) Exclude cut with ID _49376_210_5986_1_1533985278732_3370740_307-145221-0_sp1.1 from training. Duration: 0.936375 2023-10-03 17:27:47,883 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6846_210_6753_1_1527850767_4295158_2-315073-0_sp0.9 from training. Duration: 0.97775 2023-10-03 17:27:49,386 WARNING [train.py:1204] (2/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_17-25045-0 from training. Duration: 0.59 2023-10-03 17:27:49,399 WARNING [train.py:1204] (2/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_503-333510-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 17:27:50,828 WARNING [train.py:1204] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_238-245655-0_sp1.1 from training. Duration: 0.736375 2023-10-03 17:27:52,550 WARNING [train.py:1204] (2/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_329-82235-0_sp0.9 from training. Duration: 0.911125 2023-10-03 17:27:52,562 WARNING [train.py:1204] (2/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_325-113798-0_sp0.9 from training. Duration: 0.9444375 2023-10-03 17:27:52,631 WARNING [train.py:1204] (2/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_395-37222-0_sp0.9 from training. Duration: 0.8444375 2023-10-03 17:27:52,697 WARNING [train.py:1204] (2/4) Exclude cut with ID _64135_210_2253_1_1535677267876_7608972_498-5039-0_sp1.1 from training. Duration: 0.7090625 2023-10-03 17:27:53,929 WARNING [train.py:1204] (2/4) Exclude cut with ID _14922_210_6228_1_1530707435273_3693310_303-228738-0_sp0.9 from training. Duration: 0.9333125 2023-10-03 17:27:55,387 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_577-220899-0_sp1.1 from training. Duration: 0.92725 2023-10-03 17:27:55,577 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.0.conv_module2.balancer1.prob, batch_count=1347606.6666666667, ans=0.125 2023-10-03 17:27:56,751 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_3475_210_2028_1_1526193578_3622569_377-166133-0_sp1.1 from training. Duration: 0.7818125 2023-10-03 17:27:58,053 WARNING [train.py:1204] (2/4) Exclude cut with ID _49376_210_5986_1_1533985278732_3370740_272-313512-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 17:27:58,310 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.feed_forward1.hidden_balancer.prob, batch_count=1347606.6666666667, ans=0.125 2023-10-03 17:28:00,942 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7762_210_1794_1_1528199508_4057408_422-227034-0_sp0.9 from training. Duration: 0.811125 2023-10-03 17:28:03,724 WARNING [train.py:1204] (2/4) Exclude cut with ID _18895_210_6228_1_1531357274234_3784930_74-341428-0_sp1.1 from training. Duration: 0.92725 2023-10-03 17:28:03,869 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.conv_module2.balancer1.prob, batch_count=1347673.3333333333, ans=0.125 2023-10-03 17:28:06,582 WARNING [train.py:1204] (2/4) Exclude cut with ID _44902_210_16187_1_1533888006062_3792078_149-67212-0_sp1.1 from training. Duration: 0.7909375 2023-10-03 17:28:10,156 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.2.encoder.layers.2.nonlin_attention.whiten2, num_groups=1, num_channels=384, metric=4.82 vs. limit=15.0 2023-10-03 17:28:10,771 WARNING [train.py:1204] (2/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_243-126020-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 17:28:13,938 WARNING [train.py:1204] (2/4) Exclude cut with ID _54218_210_12210_1_1534413646388_4409259_259-6561-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 17:28:15,588 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.bypass_mid.scale_min, batch_count=1347673.3333333333, ans=0.2 2023-10-03 17:28:17,342 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_41452_210_7291_1_1533437687_3941281_405-11559-0 from training. Duration: 0.91 2023-10-03 17:28:18,521 INFO [train.py:1046] (2/4) Epoch 39, batch 300, loss[loss=0.1503, simple_loss=0.2257, pruned_loss=0.03742, over 23276.00 frames. ], tot_loss[loss=0.1571, simple_loss=0.2375, pruned_loss=0.03838, over 3688093.61 frames. ], batch size: 119, lr: 2.62e-03, grad_scale: 16.0 2023-10-03 17:28:18,606 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_20120_210_6753_1_1531643035_5482859_759-225084-0_sp1.1 from training. Duration: 0.97275 2023-10-03 17:28:18,619 WARNING [train.py:1204] (2/4) Exclude cut with ID _14747_210_8341_1_1530685797463_3654089_147-131229-0_sp0.9 from training. Duration: 0.8444375 2023-10-03 17:28:20,138 WARNING [train.py:1204] (2/4) Exclude cut with ID _14867_210_1968_1_1530684179890_3846969_319-250868-0 from training. Duration: 0.99 2023-10-03 17:28:21,804 WARNING [train.py:1204] (2/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_580-93798-0_sp0.9 from training. Duration: 0.62225 2023-10-03 17:28:23,202 WARNING [train.py:1204] (2/4) Exclude cut with ID _9679_210_7497_1_1529129212906_4363929_512-313889-0_sp0.9 from training. Duration: 0.9 2023-10-03 17:28:23,211 WARNING [train.py:1204] (2/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_18-189198-0 from training. Duration: 0.9 2023-10-03 17:28:27,486 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.conv_module1.balancer1.min_positive, batch_count=1347740.0, ans=0.025 2023-10-03 17:28:28,566 WARNING [train.py:1204] (2/4) Exclude cut with ID _49755_210_18053_1_1534581005846_3218709_137-64179-0_sp1.1 from training. Duration: 0.92725 2023-10-03 17:28:28,632 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_48049_210_5632_1_1533864553_3519937_306-283794-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 17:28:31,520 WARNING [train.py:1204] (2/4) Exclude cut with ID _43552_210_8913_1_1533726031059_3523429_25-120161-0_sp1.1 from training. Duration: 0.8909375 2023-10-03 17:28:31,580 WARNING [train.py:1204] (2/4) Exclude cut with ID _8833_210_5219_1_1528592665210_7553059_319-349181-0 from training. Duration: 0.99 2023-10-03 17:28:31,759 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.feed_forward1.hidden_balancer.prob, batch_count=1347806.6666666667, ans=0.125 2023-10-03 17:28:34,255 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_296-332325-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 17:28:34,341 WARNING [train.py:1204] (2/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_436-186311-0_sp1.1 from training. Duration: 0.5909375 2023-10-03 17:28:35,647 WARNING [train.py:1204] (2/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_348-127789-0 from training. Duration: 0.85 2023-10-03 17:28:35,652 WARNING [train.py:1204] (2/4) Exclude cut with ID _3992_210_1747_1_1526644816031_3731060_443-195183-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 17:28:39,663 WARNING [train.py:1204] (2/4) Exclude cut with ID _3465_210_3525_1_1526175667060_3574049_308-231735-0_sp1.1 from training. Duration: 0.663625 2023-10-03 17:28:43,004 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6786_210_1388_1_1527839699_4194495_378-46070-0_sp1.1 from training. Duration: 0.6545625 2023-10-03 17:28:44,306 WARNING [train.py:1204] (2/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_63-101996-0 from training. Duration: 0.88 2023-10-03 17:28:47,668 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_2541_210_1794_1_1525590269_3084765_156-173713-0 from training. Duration: 0.81 2023-10-03 17:28:49,452 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_217-239796-0_sp1.1 from training. Duration: 0.963625 2023-10-03 17:28:50,990 WARNING [train.py:1204] (2/4) Exclude cut with ID _21878_210_3525_1_1531738772002_7264330_530-329400-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 17:28:52,384 WARNING [train.py:1204] (2/4) Exclude cut with ID _21783_210_11357_1_1531742458730_3762679_150-62920-0_sp1.1 from training. Duration: 0.963625 2023-10-03 17:28:52,385 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_5522_210_3273_1_1527249606_3909096_366-172508-0 from training. Duration: 0.66 2023-10-03 17:28:52,387 WARNING [train.py:1204] (2/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_303-340446-0_sp1.1 from training. Duration: 0.8181875 2023-10-03 17:28:54,244 WARNING [train.py:1204] (2/4) Exclude cut with ID _8699_210_2069_1_1528630346831_3960409_160-24408-0_sp1.1 from training. Duration: 0.82725 2023-10-03 17:28:54,778 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.5.encoder.layers.1.conv_module1.whiten, num_groups=1, num_channels=256, metric=4.48 vs. limit=15.0 2023-10-03 17:28:55,671 WARNING [train.py:1204] (2/4) Exclude cut with ID _41314_210_12651_1_1533459476543_3766460_299-187801-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 17:28:55,706 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_34886_210_10633_1_1533203882_1755661_92-345288-0_sp1.1 from training. Duration: 0.97275 2023-10-03 17:28:57,261 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.0.conv_module1.balancer2.prob, batch_count=1347873.3333333333, ans=0.125 2023-10-03 17:28:58,417 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_5438_210_1794_1_1527336687_2600009_104-288274-0_sp1.1 from training. Duration: 0.52725 2023-10-03 17:28:58,421 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_154-316450-0 from training. Duration: 0.76 2023-10-03 17:28:59,754 WARNING [train.py:1204] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_23-189234-0_sp0.9 from training. Duration: 0.92225 2023-10-03 17:29:01,135 WARNING [train.py:1204] (2/4) Exclude cut with ID _4779_210_2084_1_1527318022944_3914893_346-168820-0_sp1.1 from training. Duration: 0.963625 2023-10-03 17:29:03,841 WARNING [train.py:1204] (2/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_5-55893-0 from training. Duration: 0.55 2023-10-03 17:29:05,250 WARNING [train.py:1204] (2/4) Exclude cut with ID _20455_210_5219_1_1531565996775_8098100_180-24517-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 17:29:08,101 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_3133_210_2087_1_1526106446_2324690_243-143119-0_sp0.9 from training. Duration: 0.8555625 2023-10-03 17:29:11,537 WARNING [train.py:1204] (2/4) Exclude cut with ID _65585_210_12210_1_1535450451584_3764098_293-212045-0_sp1.1 from training. Duration: 0.92725 2023-10-03 17:29:11,541 WARNING [train.py:1204] (2/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_577-332600-0 from training. Duration: 0.57 2023-10-03 17:29:16,659 WARNING [train.py:1204] (2/4) Exclude cut with ID _30039_210_13270_1_1532484612900_2725150_104-289874-0_sp1.1 from training. Duration: 0.963625 2023-10-03 17:29:16,673 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_3172_210_2087_1_1526292929_3987865_197-97762-0_sp1.1 from training. Duration: 0.6818125 2023-10-03 17:29:16,975 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.balancer1.prob, batch_count=1348006.6666666667, ans=0.125 2023-10-03 17:29:19,383 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_256-150908-0_sp1.1 from training. Duration: 0.963625 2023-10-03 17:29:20,779 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_296-287678-0_sp1.1 from training. Duration: 0.863625 2023-10-03 17:29:20,807 WARNING [train.py:1204] (2/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_443-30034-0 from training. Duration: 0.64 2023-10-03 17:29:20,839 WARNING [train.py:1204] (2/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_494-199538-0_sp0.9 from training. Duration: 0.8333125 2023-10-03 17:29:21,082 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.conv_skip_rate, batch_count=1348006.6666666667, ans=0.0 2023-10-03 17:29:22,176 WARNING [train.py:1204] (2/4) Exclude cut with ID _19708_210_9206_1_1532136650733_4238364_61-162325-0_sp1.1 from training. Duration: 0.97275 2023-10-03 17:29:23,622 WARNING [train.py:1204] (2/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_475-127455-0 from training. Duration: 0.7 2023-10-03 17:29:25,553 WARNING [train.py:1204] (2/4) Exclude cut with ID _48677_210_5663_1_1533902405919_3892989_188-11581-0_sp1.1 from training. Duration: 0.963625 2023-10-03 17:29:25,593 WARNING [train.py:1204] (2/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_136-8381-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 17:29:26,927 WARNING [train.py:1204] (2/4) Exclude cut with ID _55328_210_27767_1_1534580973260_4088170_270-229555-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 17:29:28,285 WARNING [train.py:1204] (2/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_372-266951-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 17:29:28,346 WARNING [train.py:1204] (2/4) Exclude cut with ID _47790_210_16187_1_1533862814066_4207519_14-242149-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 17:29:28,670 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.conv_skip_rate, batch_count=1348006.6666666667, ans=0.0 2023-10-03 17:29:31,257 WARNING [train.py:1204] (2/4) Exclude cut with ID _7877_210_6014_1_1528286358099_3750136_159-76512-0_sp1.1 from training. Duration: 0.936375 2023-10-03 17:29:31,259 WARNING [train.py:1204] (2/4) Exclude cut with ID _8263_210_1664_1_1528438953494_294100_40-204731-0_sp1.1 from training. Duration: 0.4545625 2023-10-03 17:29:32,507 INFO [train.py:1046] (2/4) Epoch 39, batch 350, loss[loss=0.1631, simple_loss=0.232, pruned_loss=0.04709, over 23791.00 frames. ], tot_loss[loss=0.1564, simple_loss=0.2363, pruned_loss=0.03822, over 3913362.85 frames. ], batch size: 195, lr: 2.62e-03, grad_scale: 16.0 2023-10-03 17:29:34,025 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8278_210_2550_1_1528637099_4220413_694-238413-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 17:29:39,332 WARNING [train.py:1204] (2/4) Exclude cut with ID _47278_210_12041_1_1533902418349_3576110_421-172710-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 17:29:42,704 WARNING [train.py:1204] (2/4) Exclude cut with ID _14970_210_15709_1_1530773543493_1715730_215-282680-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 17:29:44,031 WARNING [train.py:1204] (2/4) Exclude cut with ID _64639_210_5816_1_1535367619437_3528000_306-149449-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 17:29:47,780 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_28-291982-0 from training. Duration: 0.97 2023-10-03 17:29:49,321 WARNING [train.py:1204] (2/4) Exclude cut with ID _55134_210_21982_1_1534590028377_3882170_412-15870-0_sp1.1 from training. Duration: 0.936375 2023-10-03 17:29:49,360 WARNING [train.py:1204] (2/4) Exclude cut with ID _13684_210_7497_1_1530619354863_3598359_312-235606-0 from training. Duration: 0.6 2023-10-03 17:29:52,053 WARNING [train.py:1204] (2/4) Exclude cut with ID _37112_210_2575_1_1532997007673_7268610_347-238993-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 17:29:52,115 WARNING [train.py:1204] (2/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_121-284152-0 from training. Duration: 0.59 2023-10-03 17:29:54,022 WARNING [train.py:1204] (2/4) Exclude cut with ID _2941_210_2577_1_1526199673150_2489759_228-2961-0_sp1.1 from training. Duration: 0.97275 2023-10-03 17:29:56,242 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.2.encoder.layers.0.self_attn_weights.whiten_keys, num_groups=4, num_channels=128, metric=3.66 vs. limit=6.0 2023-10-03 17:29:56,776 WARNING [train.py:1204] (2/4) Exclude cut with ID _28323_210_16187_1_1532480395667_4100300_531-24157-0 from training. Duration: 0.98 2023-10-03 17:29:58,201 WARNING [train.py:1204] (2/4) Exclude cut with ID _32704_210_8954_1_1533017488840_6747370_266-329755-0_sp0.9 from training. Duration: 0.988875 2023-10-03 17:29:59,552 WARNING [train.py:1204] (2/4) Exclude cut with ID _6653_210_2629_1_1527919172441_6732679_1038-146421-0_sp1.1 from training. Duration: 0.97275 2023-10-03 17:29:59,640 WARNING [train.py:1204] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_391-22148-0_sp0.9 from training. Duration: 0.8555625 2023-10-03 17:29:59,766 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.1.balancer1.prob, batch_count=1348140.0, ans=0.125 2023-10-03 17:29:59,881 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.conv_skip_rate, batch_count=1348140.0, ans=0.0 2023-10-03 17:30:00,838 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.659e+02 1.879e+02 2.125e+02 2.432e+02 3.754e+02, threshold=4.249e+02, percent-clipped=0.0 2023-10-03 17:30:02,171 WARNING [train.py:1204] (2/4) Exclude cut with ID _64521_210_14699_1_1535243427259_3815328_543-341117-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 17:30:02,191 WARNING [train.py:1204] (2/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_52-168712-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 17:30:02,225 WARNING [train.py:1204] (2/4) Exclude cut with ID _33920_210_4220_1_1533257851500_3084399_198-334063-0_sp1.1 from training. Duration: 0.8545625 2023-10-03 17:30:02,237 WARNING [train.py:1204] (2/4) Exclude cut with ID _50093_210_4220_1_1534417141207_3432299_69-111987-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 17:30:02,282 WARNING [train.py:1204] (2/4) Exclude cut with ID _14821_210_8750_1_1530680503991_6829359_189-297451-0_sp0.9 from training. Duration: 0.611125 2023-10-03 17:30:03,729 WARNING [train.py:1204] (2/4) Exclude cut with ID _8030_210_7497_1_1528444886781_3531089_262-86750-0_sp0.9 from training. Duration: 0.9 2023-10-03 17:30:04,961 WARNING [train.py:1204] (2/4) Exclude cut with ID _24612_210_8913_1_1531998054193_3806359_294-191196-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 17:30:12,074 WARNING [train.py:1204] (2/4) Exclude cut with ID _16909_210_5724_1_1531292079912_4118919_707-342896-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 17:30:13,927 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_9460_210_4211_1_1528954587_10049667_572-37035-0_sp0.9 from training. Duration: 0.888875 2023-10-03 17:30:13,975 WARNING [train.py:1204] (2/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_762-109886-0_sp1.1 from training. Duration: 0.82725 2023-10-03 17:30:14,004 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_14953_210_2151_1_1530701710_5616165_519-326393-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 17:30:19,441 WARNING [train.py:1204] (2/4) Exclude cut with ID _25914_210_6400_1_1532141317058_644740_52-96808-0 from training. Duration: 0.91 2023-10-03 17:30:19,444 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_68095_210_20185_1_1535718226_8888855_1079-94525-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 17:30:22,459 WARNING [train.py:1204] (2/4) Exclude cut with ID _14860_210_4915_1_1530702020340_7343240_746-3647-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 17:30:22,460 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_70379_210_7925_1_1535941455_557455_49-101720-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 17:30:24,212 WARNING [train.py:1204] (2/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_8-307291-0_sp1.1 from training. Duration: 0.936375 2023-10-03 17:30:26,866 WARNING [train.py:1204] (2/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_117-290916-0 from training. Duration: 0.51 2023-10-03 17:30:28,331 WARNING [train.py:1204] (2/4) Exclude cut with ID _22020_210_2461_1_1531724164513_6326900_944-339840-0_sp1.1 from training. Duration: 0.963625 2023-10-03 17:30:29,890 WARNING [train.py:1204] (2/4) Exclude cut with ID IC0515W0043-254911 from training. Duration: 0.976 2023-10-03 17:30:31,304 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_3910_210_1794_1_1526742306_334530_28-238909-0 from training. Duration: 0.81 2023-10-03 17:30:31,324 WARNING [train.py:1204] (2/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_269-267275-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 17:30:34,006 WARNING [train.py:1204] (2/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_72-251255-0_sp1.1 from training. Duration: 0.8545625 2023-10-03 17:30:34,019 WARNING [train.py:1204] (2/4) Exclude cut with ID _14886_210_4220_1_1530702020355_3516190_175-229073-0 from training. Duration: 0.45 2023-10-03 17:30:35,469 WARNING [train.py:1204] (2/4) Exclude cut with ID _40519_210_13794_1_1533715228655_7337390_921-209452-0_sp1.1 from training. Duration: 0.963625 2023-10-03 17:30:36,939 WARNING [train.py:1204] (2/4) Exclude cut with ID _29857_210_20034_1_1532395886753_3798160_335-163562-0_sp0.9 from training. Duration: 0.9444375 2023-10-03 17:30:38,222 WARNING [train.py:1204] (2/4) Exclude cut with ID _58583_210_5236_1_1534726835266_3574249_384-58445-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 17:30:39,646 WARNING [train.py:1204] (2/4) Exclude cut with ID _44011_210_2046_1_1533722401691_3631310_96-315656-0_sp1.1 from training. Duration: 0.963625 2023-10-03 17:30:39,650 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_68484_210_1794_1_1535849804_7678097_549-170613-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 17:30:41,088 WARNING [train.py:1204] (2/4) Exclude cut with ID _37892_210_19596_1_1533198677897_3964058_214-148058-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 17:30:45,723 WARNING [train.py:1204] (2/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_415-78792-0_sp0.9 from training. Duration: 0.9 2023-10-03 17:30:47,669 INFO [train.py:1046] (2/4) Epoch 39, batch 400, loss[loss=0.1528, simple_loss=0.2297, pruned_loss=0.03791, over 23737.00 frames. ], tot_loss[loss=0.1561, simple_loss=0.2356, pruned_loss=0.03828, over 4090354.24 frames. ], batch size: 232, lr: 2.62e-03, grad_scale: 32.0 2023-10-03 17:30:49,004 WARNING [train.py:1204] (2/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_383-205833-0_sp1.1 from training. Duration: 0.863625 2023-10-03 17:30:49,088 WARNING [train.py:1204] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_167-137255-0 from training. Duration: 0.64 2023-10-03 17:30:49,095 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8275_210_4198_1_1528455320_3892571_484-154786-0_sp1.1 from training. Duration: 0.963625 2023-10-03 17:30:50,532 WARNING [train.py:1204] (2/4) Exclude cut with ID _40284_210_2084_1_1533513679814_3704279_396-319412-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 17:30:51,931 WARNING [train.py:1204] (2/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_280-120942-0_sp0.9 from training. Duration: 0.8555625 2023-10-03 17:30:52,153 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.balancer1.prob, batch_count=1348406.6666666667, ans=0.125 2023-10-03 17:30:53,313 WARNING [train.py:1204] (2/4) Exclude cut with ID _45630_210_10556_1_1533949174498_3902049_558-240889-0_sp1.1 from training. Duration: 0.92725 2023-10-03 17:30:56,569 WARNING [train.py:1204] (2/4) Exclude cut with ID _56235_210_10640_1_1534557335235_4161099_197-307458-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 17:30:56,662 WARNING [train.py:1204] (2/4) Exclude cut with ID _16909_210_5724_1_1531292079912_4118919_163-254843-0_sp1.1 from training. Duration: 0.92725 2023-10-03 17:30:59,360 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_103-84010-0 from training. Duration: 0.69 2023-10-03 17:31:00,762 WARNING [train.py:1204] (2/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_401-158239-0 from training. Duration: 0.71 2023-10-03 17:31:00,762 WARNING [train.py:1204] (2/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_899-233658-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 17:31:02,172 WARNING [train.py:1204] (2/4) Exclude cut with ID _23825_210_13449_1_1531915313526_3497150_240-271367-0 from training. Duration: 0.97 2023-10-03 17:31:03,487 WARNING [train.py:1204] (2/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_424-134619-0_sp1.1 from training. Duration: 0.92725 2023-10-03 17:31:04,365 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.2.encoder.layers.1.conv_module1.whiten, num_groups=1, num_channels=384, metric=11.88 vs. limit=15.0 2023-10-03 17:31:07,540 WARNING [train.py:1204] (2/4) Exclude cut with ID _44831_210_13427_1_1533816011549_3689035_32-188147-0_sp1.1 from training. Duration: 0.87275 2023-10-03 17:31:07,544 WARNING [train.py:1204] (2/4) Exclude cut with ID _59170_210_6721_1_1534983633960_4203335_314-63879-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 17:31:07,569 WARNING [train.py:1204] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_602-275260-0 from training. Duration: 0.82 2023-10-03 17:31:07,620 WARNING [train.py:1204] (2/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_511-306767-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 17:31:09,129 WARNING [train.py:1204] (2/4) Exclude cut with ID _41817_210_18835_1_1533434477160_7423049_565-201775-0_sp1.1 from training. Duration: 0.92725 2023-10-03 17:31:09,146 WARNING [train.py:1204] (2/4) Exclude cut with ID _28317_210_12742_1_1532480302381_8510325_778-173151-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 17:31:09,285 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.conv_module1.balancer1.prob, batch_count=1348473.3333333333, ans=0.125 2023-10-03 17:31:10,484 WARNING [train.py:1204] (2/4) Exclude cut with ID _14998_210_5219_1_1530705582080_7474970_680-51001-0_sp1.1 from training. Duration: 0.963625 2023-10-03 17:31:13,271 WARNING [train.py:1204] (2/4) Exclude cut with ID IC0677W0059-334891 from training. Duration: 0.896 2023-10-03 17:31:14,575 WARNING [train.py:1204] (2/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_370-331600-0 from training. Duration: 0.94 2023-10-03 17:31:19,448 WARNING [train.py:1204] (2/4) Exclude cut with ID _66840_210_5219_1_1535452168426_4196349_446-240192-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 17:31:20,745 WARNING [train.py:1204] (2/4) Exclude cut with ID _65242_210_18628_1_1535369393717_3823149_38-6732-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 17:31:22,645 WARNING [train.py:1204] (2/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_302-2550-0 from training. Duration: 0.81 2023-10-03 17:31:22,739 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_43802_210_2290_1_1533516203_8829504_100-348441-0 from training. Duration: 0.91 2023-10-03 17:31:22,866 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.0.balancer1.prob, batch_count=1348540.0, ans=0.125 2023-10-03 17:31:25,505 WARNING [train.py:1204] (2/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_329-285801-0_sp1.1 from training. Duration: 0.82725 2023-10-03 17:31:28,230 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_30167_210_3528_1_1532428012_4772375_343-334712-0_sp1.1 from training. Duration: 0.936375 2023-10-03 17:31:32,895 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7543_210_6753_1_1528543318_4552_1-85072-0 from training. Duration: 0.83 2023-10-03 17:31:33,317 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.5.encoder.layers.0.conv_module2.whiten, num_groups=1, num_channels=256, metric=4.10 vs. limit=15.0 2023-10-03 17:31:36,948 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7002_210_1794_1_1527935634_4034444_487-236501-0_sp1.1 from training. Duration: 0.6 2023-10-03 17:31:38,315 WARNING [train.py:1204] (2/4) Exclude cut with ID _7160_210_1876_1_1527940923231_3703799_282-322611-0 from training. Duration: 0.87 2023-10-03 17:31:39,809 WARNING [train.py:1204] (2/4) Exclude cut with ID _67335_210_5219_1_1535525975652_4275229_570-262983-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 17:31:41,230 WARNING [train.py:1204] (2/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_625-338546-0_sp1.1 from training. Duration: 0.8 2023-10-03 17:31:41,259 WARNING [train.py:1204] (2/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_176-56120-0 from training. Duration: 0.83 2023-10-03 17:31:45,427 WARNING [train.py:1204] (2/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_963-187109-0_sp1.1 from training. Duration: 0.9 2023-10-03 17:31:46,921 WARNING [train.py:1204] (2/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_121-284152-0_sp0.9 from training. Duration: 0.6555625 2023-10-03 17:31:48,908 WARNING [train.py:1204] (2/4) Exclude cut with ID _45021_210_9994_1_1533619751094_3330288_104-81711-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 17:31:52,356 WARNING [train.py:1204] (2/4) Exclude cut with ID _51382_210_23862_1_1534492801838_3650104_129-343914-0_sp1.1 from training. Duration: 0.936375 2023-10-03 17:31:52,393 WARNING [train.py:1204] (2/4) Exclude cut with ID _14970_210_15709_1_1530775547361_1966965_8-169924-0 from training. Duration: 0.92 2023-10-03 17:31:54,374 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_2295_210_2087_1_1525500993_2407863_234-83104-0_sp1.1 from training. Duration: 0.563625 2023-10-03 17:31:55,657 WARNING [train.py:1204] (2/4) Exclude cut with ID _14687_210_5854_1_1530703838259_3664889_115-239046-0 from training. Duration: 0.67 2023-10-03 17:31:58,358 WARNING [train.py:1204] (2/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_690-126306-0_sp1.1 from training. Duration: 0.7090625 2023-10-03 17:31:58,374 WARNING [train.py:1204] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_625-102955-0_sp0.9 from training. Duration: 0.8555625 2023-10-03 17:32:01,538 INFO [train.py:1046] (2/4) Epoch 39, batch 450, loss[loss=0.1568, simple_loss=0.2471, pruned_loss=0.03331, over 24615.00 frames. ], tot_loss[loss=0.156, simple_loss=0.236, pruned_loss=0.03802, over 4239803.14 frames. ], batch size: 73, lr: 2.62e-03, grad_scale: 16.0 2023-10-03 17:32:01,642 WARNING [train.py:1204] (2/4) Exclude cut with ID _9389_210_1664_1_1529217337301_5289269_289-213417-0 from training. Duration: 0.89 2023-10-03 17:32:03,530 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.4.encoder.layers.2.self_attn1.whiten, num_groups=1, num_channels=384, metric=13.95 vs. limit=22.5 2023-10-03 17:32:04,478 WARNING [train.py:1204] (2/4) Exclude cut with ID _13253_210_1925_1_1530757775819_3577873_191-144977-0_sp0.9 from training. Duration: 0.8666875 2023-10-03 17:32:04,544 WARNING [train.py:1204] (2/4) Exclude cut with ID _21783_210_11357_1_1531742458730_3762679_325-155703-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 17:32:04,575 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_2295_210_2087_1_1525500993_2407863_116-238894-0_sp0.9 from training. Duration: 0.77775 2023-10-03 17:32:06,010 WARNING [train.py:1204] (2/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_94-126128-0 from training. Duration: 0.86 2023-10-03 17:32:06,026 WARNING [train.py:1204] (2/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_395-310220-0_sp0.9 from training. Duration: 0.988875 2023-10-03 17:32:07,401 WARNING [train.py:1204] (2/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_821-110852-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 17:32:07,454 WARNING [train.py:1204] (2/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_64-248359-0_sp1.1 from training. Duration: 0.62725 2023-10-03 17:32:07,464 WARNING [train.py:1204] (2/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_138-187031-0 from training. Duration: 0.99 2023-10-03 17:32:07,493 WARNING [train.py:1204] (2/4) Exclude cut with ID _4122_210_2084_1_1526713164417_4353280_528-206771-0_sp0.9 from training. Duration: 0.97775 2023-10-03 17:32:10,177 WARNING [train.py:1204] (2/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_844-96989-0_sp1.1 from training. Duration: 0.6454375 2023-10-03 17:32:10,289 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder_embed.convnext.layerdrop_rate, batch_count=1348740.0, ans=0.015 2023-10-03 17:32:12,881 WARNING [train.py:1204] (2/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_106-30782-0_sp1.1 from training. Duration: 0.72725 2023-10-03 17:32:21,769 WARNING [train.py:1204] (2/4) Exclude cut with ID _14683_210_5219_1_1530698352608_3974859_281-250040-0_sp1.1 from training. Duration: 0.936375 2023-10-03 17:32:23,731 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_46013_210_7234_1_1533776362_3744032_463-52378-0_sp0.9 from training. Duration: 0.9555625 2023-10-03 17:32:25,245 WARNING [train.py:1204] (2/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_299-34413-0 from training. Duration: 0.65 2023-10-03 17:32:27,116 WARNING [train.py:1204] (2/4) Exclude cut with ID _35799_210_5854_1_1533263487887_3529071_490-326847-0 from training. Duration: 0.99 2023-10-03 17:32:29,979 WARNING [train.py:1204] (2/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_589-9070-0_sp0.9 from training. Duration: 0.788875 2023-10-03 17:32:31,360 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.520e+02 1.906e+02 2.093e+02 2.336e+02 3.263e+02, threshold=4.186e+02, percent-clipped=0.0 2023-10-03 17:32:31,507 WARNING [train.py:1204] (2/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_884-29515-0_sp1.1 from training. Duration: 0.936375 2023-10-03 17:32:31,656 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.feed_forward2.hidden_balancer.prob, batch_count=1348873.3333333333, ans=0.125 2023-10-03 17:32:32,984 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.conv_module2.balancer2.prob, batch_count=1348873.3333333333, ans=0.125 2023-10-03 17:32:34,204 WARNING [train.py:1204] (2/4) Exclude cut with ID _15012_210_1968_1_1530856820429_5095350_474-119995-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 17:32:37,602 WARNING [train.py:1204] (2/4) Exclude cut with ID _65585_210_12210_1_1535450451584_3764098_211-97124-0_sp1.1 from training. Duration: 0.963625 2023-10-03 17:32:37,668 WARNING [train.py:1204] (2/4) Exclude cut with ID _33640_210_2655_1_1533117605479_4855469_666-224463-0_sp1.1 from training. Duration: 0.963625 2023-10-03 17:32:40,521 WARNING [train.py:1204] (2/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_35-153886-0 from training. Duration: 0.84 2023-10-03 17:32:40,585 WARNING [train.py:1204] (2/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_210-106047-0 from training. Duration: 0.97 2023-10-03 17:32:43,120 WARNING [train.py:1204] (2/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_479-125611-0 from training. Duration: 0.96 2023-10-03 17:32:43,141 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_9417_210_1794_1_1529235261_4614131_427-153963-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 17:32:44,539 WARNING [train.py:1204] (2/4) Exclude cut with ID _23818_210_6228_1_1531917063884_3676920_133-170571-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 17:32:44,602 WARNING [train.py:1204] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_602-275260-0_sp1.1 from training. Duration: 0.7454375 2023-10-03 17:32:45,998 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9029W0121-509521 from training. Duration: 0.896 2023-10-03 17:32:46,007 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9029W0434-509821 from training. Duration: 0.64 2023-10-03 17:32:47,320 WARNING [train.py:1204] (2/4) Exclude cut with ID _23038_210_9570_1_1532001584158_3652419_289-307034-0_sp1.1 from training. Duration: 0.936375 2023-10-03 17:32:48,691 WARNING [train.py:1204] (2/4) Exclude cut with ID _29345_210_4381_1_1532689168263_3860169_149-257268-0_sp0.9 from training. Duration: 0.9 2023-10-03 17:32:50,490 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_405-339986-0_sp1.1 from training. Duration: 0.463625 2023-10-03 17:32:53,892 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_2655_210_2087_1_1525688959_3897400_437-337690-0_sp1.1 from training. Duration: 0.57275 2023-10-03 17:32:53,943 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7543_210_6753_1_1528541907_1399322_165-323619-0_sp1.1 from training. Duration: 0.62725 2023-10-03 17:32:55,232 WARNING [train.py:1204] (2/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_508-335369-0_sp1.1 from training. Duration: 0.436375 2023-10-03 17:32:56,585 WARNING [train.py:1204] (2/4) Exclude cut with ID _7852_210_5219_1_1528286471316_8139400_358-287492-0 from training. Duration: 0.96 2023-10-03 17:32:58,707 WARNING [train.py:1204] (2/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_322-217220-0_sp0.9 from training. Duration: 0.9555625 2023-10-03 17:33:00,108 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_3171_210_2087_1_1526109084_2330007_66-260583-0_sp0.9 from training. Duration: 0.8 2023-10-03 17:33:00,139 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8558_210_1410_1_1528505225_7613406_131-329524-0_sp1.1 from training. Duration: 0.7181875 2023-10-03 17:33:01,639 WARNING [train.py:1204] (2/4) Exclude cut with ID _9235_210_5219_1_1528891154253_8046120_30-326681-0 from training. Duration: 0.83 2023-10-03 17:33:04,473 WARNING [train.py:1204] (2/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_58-68956-0_sp0.9 from training. Duration: 0.92225 2023-10-03 17:33:05,821 WARNING [train.py:1204] (2/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_255-312654-0 from training. Duration: 0.91 2023-10-03 17:33:05,899 WARNING [train.py:1204] (2/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_99-167392-0 from training. Duration: 0.61 2023-10-03 17:33:07,782 WARNING [train.py:1204] (2/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_94-126128-0_sp0.9 from training. Duration: 0.9555625 2023-10-03 17:33:10,632 WARNING [train.py:1204] (2/4) Exclude cut with ID _68635_210_9317_1_1535770813410_4315870_187-192817-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 17:33:13,328 WARNING [train.py:1204] (2/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_277-204970-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 17:33:14,765 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6163_210_6400_1_1527506746_4263068_283-137811-0_sp0.9 from training. Duration: 0.8555625 2023-10-03 17:33:16,013 INFO [train.py:1046] (2/4) Epoch 39, batch 500, loss[loss=0.1512, simple_loss=0.2293, pruned_loss=0.03658, over 24556.00 frames. ], tot_loss[loss=0.1565, simple_loss=0.2365, pruned_loss=0.03819, over 4338325.82 frames. ], batch size: 60, lr: 2.62e-03, grad_scale: 16.0 2023-10-03 17:33:16,062 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9144W0121-566446 from training. Duration: 0.896 2023-10-03 17:33:18,840 WARNING [train.py:1204] (2/4) Exclude cut with ID _49371_210_12071_1_1533952761021_3812190_351-315519-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 17:33:20,764 WARNING [train.py:1204] (2/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_115-326778-0_sp1.1 from training. Duration: 0.8454375 2023-10-03 17:33:20,824 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_17292_210_6753_1_1531125388_2943734_436-198965-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 17:33:20,833 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9165W0117-576751 from training. Duration: 0.896 2023-10-03 17:33:24,009 WARNING [train.py:1204] (2/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_212-149947-0 from training. Duration: 0.73 2023-10-03 17:33:24,025 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_13757_210_3528_1_1530440682_4107239_65-167544-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 17:33:26,797 WARNING [train.py:1204] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_700-183549-0_sp0.9 from training. Duration: 0.7555625 2023-10-03 17:33:30,200 WARNING [train.py:1204] (2/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_216-343007-0_sp0.9 from training. Duration: 0.7333125 2023-10-03 17:33:32,908 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7544_210_6753_1_1528587722_3576686_295-258479-0_sp1.1 from training. Duration: 0.763625 2023-10-03 17:33:35,613 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_365-287106-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 17:33:35,627 WARNING [train.py:1204] (2/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_300-3747-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 17:33:35,690 WARNING [train.py:1204] (2/4) Exclude cut with ID _14069_210_5632_1_1530512692033_4803390_487-303201-0_sp1.1 from training. Duration: 0.92725 2023-10-03 17:33:46,088 WARNING [train.py:1204] (2/4) Exclude cut with ID _14459_210_2228_1_1531443641946_7516700_668-123026-0_sp1.1 from training. Duration: 0.936375 2023-10-03 17:33:46,110 WARNING [train.py:1204] (2/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_108-53929-0_sp1.1 from training. Duration: 0.536375 2023-10-03 17:33:47,479 WARNING [train.py:1204] (2/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_266-137104-0_sp0.9 from training. Duration: 0.72225 2023-10-03 17:33:47,504 WARNING [train.py:1204] (2/4) Exclude cut with ID _12302_210_5219_1_1529924446466_8107280_902-168619-0_sp1.1 from training. Duration: 0.936375 2023-10-03 17:33:47,522 WARNING [train.py:1204] (2/4) Exclude cut with ID _59502_210_12210_1_1535107024698_1835139_146-162495-0 from training. Duration: 0.92 2023-10-03 17:33:47,545 WARNING [train.py:1204] (2/4) Exclude cut with ID _3465_210_3525_1_1526175667060_3574049_412-287526-0_sp1.1 from training. Duration: 0.6545625 2023-10-03 17:33:50,382 WARNING [train.py:1204] (2/4) Exclude cut with ID _7160_210_1876_1_1527940923231_3703799_120-150943-0_sp0.9 from training. Duration: 0.9 2023-10-03 17:33:51,689 WARNING [train.py:1204] (2/4) Exclude cut with ID _15037_210_2253_1_1530752294104_3267650_296-192597-0_sp1.1 from training. Duration: 0.736375 2023-10-03 17:33:51,721 WARNING [train.py:1204] (2/4) Exclude cut with ID _13252_210_1925_1_1530671372047_3336909_397-216497-0_sp1.1 from training. Duration: 0.87275 2023-10-03 17:33:51,735 WARNING [train.py:1204] (2/4) Exclude cut with ID _40094_210_16380_1_1533294005773_3641159_76-269320-0_sp1.1 from training. Duration: 0.936375 2023-10-03 17:33:53,537 WARNING [train.py:1204] (2/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_449-15508-0 from training. Duration: 0.82 2023-10-03 17:33:58,078 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9309W0194-640321 from training. Duration: 0.64 2023-10-03 17:33:58,902 WARNING [train.py:1204] (2/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_284-312178-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 17:34:00,254 WARNING [train.py:1204] (2/4) Exclude cut with ID _29853_210_7497_1_1532505600851_6642110_401-55937-0_sp1.1 from training. Duration: 0.92725 2023-10-03 17:34:01,666 WARNING [train.py:1204] (2/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_716-24918-0_sp1.1 from training. Duration: 0.92725 2023-10-03 17:34:01,685 WARNING [train.py:1204] (2/4) Exclude cut with ID _19637_210_4220_1_1531537167676_3205060_314-305638-0_sp1.1 from training. Duration: 0.92725 2023-10-03 17:34:01,715 WARNING [train.py:1204] (2/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_475-127455-0_sp1.1 from training. Duration: 0.636375 2023-10-03 17:34:04,410 WARNING [train.py:1204] (2/4) Exclude cut with ID _5444_210_2151_1_1527418808464_5312210_581-315411-0 from training. Duration: 0.62 2023-10-03 17:34:07,148 WARNING [train.py:1204] (2/4) Exclude cut with ID _44902_210_16187_1_1533888006062_3792078_149-67212-0_sp0.9 from training. Duration: 0.9666875 2023-10-03 17:34:07,385 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.conv_skip_rate, batch_count=1349273.3333333333, ans=0.0 2023-10-03 17:34:08,516 WARNING [train.py:1204] (2/4) Exclude cut with ID _31602_210_5854_1_1532677021456_4458250_63-63253-0_sp1.1 from training. Duration: 0.963625 2023-10-03 17:34:11,922 WARNING [train.py:1204] (2/4) Exclude cut with ID _55209_210_1531_1_1534675828996_4597160_660-203992-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 17:34:15,033 WARNING [train.py:1204] (2/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_83-190624-0_sp1.1 from training. Duration: 0.92725 2023-10-03 17:34:17,909 WARNING [train.py:1204] (2/4) Exclude cut with ID _58605_210_12210_1_1534723195939_3904259_154-139635-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 17:34:19,303 WARNING [train.py:1204] (2/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_164-224052-0 from training. Duration: 0.7 2023-10-03 17:34:19,314 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_1126-189071-0_sp1.1 from training. Duration: 0.963625 2023-10-03 17:34:19,324 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_15396_210_6890_1_1531308473_1938936_310-214789-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 17:34:24,367 WARNING [train.py:1204] (2/4) Exclude cut with ID _8395_210_1531_1_1528597871523_4411579_431-132470-0 from training. Duration: 0.74 2023-10-03 17:34:25,566 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_3780_210_2087_1_1526467760_2130483_170-259044-0_sp0.9 from training. Duration: 0.77775 2023-10-03 17:34:28,506 WARNING [train.py:1204] (2/4) Exclude cut with ID _12733_210_8341_1_1530167424556_3646380_537-230640-0_sp1.1 from training. Duration: 0.963625 2023-10-03 17:34:31,315 INFO [train.py:1046] (2/4) Epoch 39, batch 550, loss[loss=0.1775, simple_loss=0.2694, pruned_loss=0.04283, over 24393.00 frames. ], tot_loss[loss=0.1585, simple_loss=0.2386, pruned_loss=0.03924, over 4419534.95 frames. ], batch size: 77, lr: 2.62e-03, grad_scale: 16.0 2023-10-03 17:34:32,959 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.bypass.scale_min, batch_count=1349406.6666666667, ans=0.2 2023-10-03 17:34:34,099 WARNING [train.py:1204] (2/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_159-281310-0 from training. Duration: 0.95 2023-10-03 17:34:35,527 WARNING [train.py:1204] (2/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_434-83000-0 from training. Duration: 0.98 2023-10-03 17:34:35,563 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_4405_210_1849_1_1526732205_3593939_493-32464-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 17:34:35,572 WARNING [train.py:1204] (2/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_342-100157-0 from training. Duration: 0.54 2023-10-03 17:34:36,911 WARNING [train.py:1204] (2/4) Exclude cut with ID _44778_210_2054_1_1533798127203_7443090_765-250133-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 17:34:36,915 WARNING [train.py:1204] (2/4) Exclude cut with ID _38670_210_15379_1_1533448810659_4391059_81-312245-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 17:34:38,321 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_50050_210_16223_1_1535435796_7290138_1273-247611-0_sp1.1 from training. Duration: 0.936375 2023-10-03 17:34:38,480 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.bypass_mid.scale_min, batch_count=1349406.6666666667, ans=0.2 2023-10-03 17:34:39,616 WARNING [train.py:1204] (2/4) Exclude cut with ID _44831_210_13427_1_1533816011549_3689035_399-18591-0_sp1.1 from training. Duration: 0.936375 2023-10-03 17:34:39,638 WARNING [train.py:1204] (2/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_557-324258-0_sp0.9 from training. Duration: 0.9 2023-10-03 17:34:41,023 WARNING [train.py:1204] (2/4) Exclude cut with ID _3465_210_3525_1_1526175667060_3574049_62-335552-0_sp1.1 from training. Duration: 0.7909375 2023-10-03 17:34:41,678 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.2.encoder.layers.1.self_attn_weights.whiten_keys, num_groups=4, num_channels=128, metric=2.97 vs. limit=6.0 2023-10-03 17:34:43,709 WARNING [train.py:1204] (2/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_673-264125-0_sp1.1 from training. Duration: 0.963625 2023-10-03 17:34:45,661 WARNING [train.py:1204] (2/4) Exclude cut with ID _8030_210_7497_1_1528444886781_3531089_262-86750-0 from training. Duration: 0.81 2023-10-03 17:34:45,675 WARNING [train.py:1204] (2/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_444-28041-0_sp1.1 from training. Duration: 0.77275 2023-10-03 17:34:48,573 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_34008_210_10431_1_1533345424_8183247_516-14858-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 17:34:48,586 WARNING [train.py:1204] (2/4) Exclude cut with ID _32704_210_8954_1_1533017488840_6747370_54-248218-0_sp1.1 from training. Duration: 0.936375 2023-10-03 17:34:48,776 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=1349473.3333333333, ans=0.1 2023-10-03 17:34:50,208 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=1349473.3333333333, ans=0.1 2023-10-03 17:34:51,510 WARNING [train.py:1204] (2/4) Exclude cut with ID _13253_210_1925_1_1530757775819_3577873_300-119782-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 17:34:51,585 WARNING [train.py:1204] (2/4) Exclude cut with ID _39290_210_4220_1_1533862778760_3075118_21-240703-0_sp1.1 from training. Duration: 0.936375 2023-10-03 17:34:56,067 WARNING [train.py:1204] (2/4) Exclude cut with ID 774-127930-0014-10412-0_sp1.1 from training. Duration: 0.95 2023-10-03 17:34:58,110 WARNING [train.py:1204] (2/4) Exclude cut with ID _67499_210_17282_1_1535509805220_3692430_20-179346-0 from training. Duration: 0.97 2023-10-03 17:35:01,316 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.474e+02 1.945e+02 2.178e+02 2.458e+02 4.129e+02, threshold=4.356e+02, percent-clipped=0.0 2023-10-03 17:35:01,383 WARNING [train.py:1204] (2/4) Exclude cut with ID _8365_210_2064_1_1528592311070_4677690_431-294102-0_sp0.9 from training. Duration: 0.988875 2023-10-03 17:35:04,275 WARNING [train.py:1204] (2/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_365-1492-0_sp1.1 from training. Duration: 0.82725 2023-10-03 17:35:05,585 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_105-114797-0_sp0.9 from training. Duration: 0.9333125 2023-10-03 17:35:07,048 WARNING [train.py:1204] (2/4) Exclude cut with ID _6274_210_2084_1_1527937583837_7494020_769-168302-0_sp0.9 from training. Duration: 0.788875 2023-10-03 17:35:09,817 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_40340_210_9024_1_1533433916_4132944_320-270277-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 17:35:09,829 WARNING [train.py:1204] (2/4) Exclude cut with ID ID0203W0003-781711 from training. Duration: 0.958 2023-10-03 17:35:11,176 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_30434_210_13049_1_1532519384_981746_52-12932-0_sp1.1 from training. Duration: 0.936375 2023-10-03 17:35:12,609 WARNING [train.py:1204] (2/4) Exclude cut with ID _8263_210_1664_1_1528438953494_294100_40-204731-0_sp0.9 from training. Duration: 0.5555625 2023-10-03 17:35:12,813 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.conv_skip_rate, batch_count=1349540.0, ans=0.0 2023-10-03 17:35:15,797 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1032-236863-0_sp0.9 from training. Duration: 0.9333125 2023-10-03 17:35:17,113 WARNING [train.py:1204] (2/4) Exclude cut with ID _8394_210_6702_1_1528590504316_3540189_335-274780-0_sp0.9 from training. Duration: 0.8666875 2023-10-03 17:35:17,122 WARNING [train.py:1204] (2/4) Exclude cut with ID _66873_210_5517_1_1535454136506_4293370_654-52713-0_sp1.1 from training. Duration: 0.7 2023-10-03 17:35:18,513 WARNING [train.py:1204] (2/4) Exclude cut with ID _5250_210_5219_1_1527249561806_8692110_742-267856-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 17:35:18,614 WARNING [train.py:1204] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_74-48717-0 from training. Duration: 0.58 2023-10-03 17:35:19,976 WARNING [train.py:1204] (2/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_18-46756-0 from training. Duration: 0.79 2023-10-03 17:35:21,345 WARNING [train.py:1204] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_612-56061-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 17:35:21,359 WARNING [train.py:1204] (2/4) Exclude cut with ID _56423_210_5854_1_1534896061049_3165969_412-2403-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 17:35:21,412 WARNING [train.py:1204] (2/4) Exclude cut with ID _62574_210_2655_1_1535180407058_3933249_120-196848-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 17:35:21,413 WARNING [train.py:1204] (2/4) Exclude cut with ID _40519_210_13794_1_1533715228655_7337390_916-51000-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 17:35:25,623 WARNING [train.py:1204] (2/4) Exclude cut with ID _65263_210_22356_1_1535763598691_3921037_108-21902-0_sp1.1 from training. Duration: 0.9 2023-10-03 17:35:25,702 WARNING [train.py:1204] (2/4) Exclude cut with ID _4122_210_2084_1_1526713164417_4353280_528-206771-0_sp1.1 from training. Duration: 0.8 2023-10-03 17:35:28,871 WARNING [train.py:1204] (2/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_43-325657-0_sp1.1 from training. Duration: 0.92725 2023-10-03 17:35:28,927 WARNING [train.py:1204] (2/4) Exclude cut with ID _44659_210_2721_1_1534036120061_6614860_314-82041-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 17:35:30,801 WARNING [train.py:1204] (2/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_81-300212-0_sp0.9 from training. Duration: 0.6666875 2023-10-03 17:35:30,875 WARNING [train.py:1204] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_665-196155-0_sp0.9 from training. Duration: 0.7444375 2023-10-03 17:35:32,214 WARNING [train.py:1204] (2/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_140-336881-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 17:35:33,551 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6290_210_3273_1_1527595224_1282787_115-87095-0_sp0.9 from training. Duration: 0.611125 2023-10-03 17:35:33,608 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_53373_210_5399_1_1534229471_4715695_607-259685-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 17:35:34,982 WARNING [train.py:1204] (2/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_175-163894-0_sp1.1 from training. Duration: 0.67275 2023-10-03 17:35:35,020 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7002_210_1794_1_1527939683_5157286_1-281264-0_sp1.1 from training. Duration: 0.4 2023-10-03 17:35:39,178 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.nonlin_attention.balancer.max_positive, batch_count=1349673.3333333333, ans=0.95 2023-10-03 17:35:40,375 WARNING [train.py:1204] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_605-257205-0 from training. Duration: 0.59 2023-10-03 17:35:40,551 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.nonlin_attention.balancer.prob, batch_count=1349673.3333333333, ans=0.125 2023-10-03 17:35:43,530 WARNING [train.py:1204] (2/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_454-314901-0 from training. Duration: 0.46 2023-10-03 17:35:44,807 INFO [train.py:1046] (2/4) Epoch 39, batch 600, loss[loss=0.1624, simple_loss=0.2476, pruned_loss=0.03867, over 24500.00 frames. ], tot_loss[loss=0.1591, simple_loss=0.2394, pruned_loss=0.0394, over 4491696.64 frames. ], batch size: 66, lr: 2.62e-03, grad_scale: 16.0 2023-10-03 17:35:44,881 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_46032_210_14656_1_1533781412_4507969_287-130422-0_sp1.1 from training. Duration: 0.97275 2023-10-03 17:35:44,897 WARNING [train.py:1204] (2/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_186-309161-0_sp1.1 from training. Duration: 0.4909375 2023-10-03 17:35:46,260 WARNING [train.py:1204] (2/4) Exclude cut with ID _42659_210_2070_1_1533607442765_3120380_45-342307-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 17:35:50,618 WARNING [train.py:1204] (2/4) Exclude cut with ID _12555_210_8750_1_1530075594402_3780239_478-326818-0_sp1.1 from training. Duration: 0.963625 2023-10-03 17:35:53,473 WARNING [train.py:1204] (2/4) Exclude cut with ID _7606_210_4915_1_1528894253277_5080690_127-177761-0_sp1.1 from training. Duration: 0.5818125 2023-10-03 17:35:55,411 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6788_210_1388_1_1527898698_4562021_91-197981-0 from training. Duration: 0.83 2023-10-03 17:35:57,366 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8558_210_1410_1_1528505225_7613406_131-329524-0_sp0.9 from training. Duration: 0.87775 2023-10-03 17:35:59,325 WARNING [train.py:1204] (2/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_153-153537-0_sp1.1 from training. Duration: 0.82725 2023-10-03 17:36:01,987 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_17292_210_6753_1_1531125388_2943734_285-349105-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 17:36:03,372 WARNING [train.py:1204] (2/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_37-41919-0 from training. Duration: 0.87 2023-10-03 17:36:04,695 WARNING [train.py:1204] (2/4) Exclude cut with ID _60860_210_8254_1_1534896015875_3557169_172-294852-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 17:36:07,845 WARNING [train.py:1204] (2/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_146-276944-0 from training. Duration: 0.83 2023-10-03 17:36:09,749 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.3.encoder.layers.2.nonlin_attention.whiten2, num_groups=1, num_channels=512, metric=7.16 vs. limit=15.0 2023-10-03 17:36:13,750 WARNING [train.py:1204] (2/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_1254-44082-0_sp1.1 from training. Duration: 0.936375 2023-10-03 17:36:13,768 WARNING [train.py:1204] (2/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_1115-180350-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 17:36:15,009 WARNING [train.py:1204] (2/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_60-146189-0_sp1.1 from training. Duration: 0.8909375 2023-10-03 17:36:15,333 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.conv_skip_rate, batch_count=1349873.3333333333, ans=0.0 2023-10-03 17:36:22,098 WARNING [train.py:1204] (2/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_517-153649-0_sp1.1 from training. Duration: 0.92725 2023-10-03 17:36:22,110 WARNING [train.py:1204] (2/4) Exclude cut with ID _39571_210_13449_1_1533801584143_3651077_37-266257-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 17:36:22,161 WARNING [train.py:1204] (2/4) Exclude cut with ID _6653_210_2629_1_1527919172441_6732679_527-276118-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 17:36:28,212 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7696_210_2040_1_1528207167_3716099_117-125434-0_sp1.1 from training. Duration: 0.6909375 2023-10-03 17:36:32,151 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_25767_210_2161_1_1532307483_3867748_478-156360-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 17:36:32,156 WARNING [train.py:1204] (2/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_177-49092-0_sp1.1 from training. Duration: 0.82725 2023-10-03 17:36:32,163 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_11305_210_1794_1_1529750428_7178148_680-317073-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 17:36:32,473 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.feed_forward2.hidden_balancer.prob, batch_count=1349940.0, ans=0.125 2023-10-03 17:36:40,093 WARNING [train.py:1204] (2/4) Exclude cut with ID _53565_210_10635_1_1534337218335_3820259_287-18120-0 from training. Duration: 0.97 2023-10-03 17:36:44,852 WARNING [train.py:1204] (2/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_498-40153-0_sp1.1 from training. Duration: 0.57275 2023-10-03 17:36:44,871 WARNING [train.py:1204] (2/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_555-63066-0_sp1.1 from training. Duration: 0.963625 2023-10-03 17:36:49,078 WARNING [train.py:1204] (2/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_395-310220-0 from training. Duration: 0.89 2023-10-03 17:36:49,157 WARNING [train.py:1204] (2/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_937-208281-0_sp1.1 from training. Duration: 0.77275 2023-10-03 17:36:51,857 WARNING [train.py:1204] (2/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_120-211630-0 from training. Duration: 0.92 2023-10-03 17:36:53,130 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_454-276429-0_sp1.1 from training. Duration: 0.7909375 2023-10-03 17:36:53,177 WARNING [train.py:1204] (2/4) Exclude cut with ID _22826_210_9994_1_1531908201522_3279169_404-75889-0_sp1.1 from training. Duration: 0.8090625 2023-10-03 17:36:59,590 INFO [train.py:1046] (2/4) Epoch 39, batch 650, loss[loss=0.1516, simple_loss=0.2306, pruned_loss=0.03626, over 23606.00 frames. ], tot_loss[loss=0.1579, simple_loss=0.238, pruned_loss=0.03888, over 4561157.30 frames. ], batch size: 135, lr: 2.62e-03, grad_scale: 16.0 2023-10-03 17:36:59,670 WARNING [train.py:1204] (2/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_282-136641-0_sp0.9 from training. Duration: 0.4666875 2023-10-03 17:36:59,773 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_809-17156-0_sp1.1 from training. Duration: 0.663625 2023-10-03 17:37:02,460 WARNING [train.py:1204] (2/4) Exclude cut with ID _9235_210_5219_1_1528891154253_8046120_1003-101638-0_sp0.9 from training. Duration: 0.811125 2023-10-03 17:37:03,854 WARNING [train.py:1204] (2/4) Exclude cut with ID _22826_210_9994_1_1531908201522_3279169_404-75889-0_sp0.9 from training. Duration: 0.988875 2023-10-03 17:37:07,161 WARNING [train.py:1204] (2/4) Exclude cut with ID _59288_210_5854_1_1535077757774_3412246_570-295255-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 17:37:08,665 WARNING [train.py:1204] (2/4) Exclude cut with ID _3465_210_3525_1_1526175667060_3574049_412-287526-0 from training. Duration: 0.72 2023-10-03 17:37:08,730 WARNING [train.py:1204] (2/4) Exclude cut with ID _44010_210_4525_1_1533884262964_4013620_649-79655-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 17:37:11,659 WARNING [train.py:1204] (2/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_57-270037-0_sp0.9 from training. Duration: 0.8555625 2023-10-03 17:37:11,659 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_34815_210_6388_1_1533381320_3571903_302-255963-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 17:37:16,297 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_47449_210_5399_1_1534076445_4006929_573-301852-0_sp1.1 from training. Duration: 0.97275 2023-10-03 17:37:20,426 WARNING [train.py:1204] (2/4) Exclude cut with ID 6709-74022-0004-86860-0_sp1.1 from training. Duration: 0.9409375 2023-10-03 17:37:21,889 WARNING [train.py:1204] (2/4) Exclude cut with ID _40519_210_13794_1_1533715228655_7337390_111-71991-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 17:37:21,938 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_30167_210_3528_1_1532428012_4772375_169-214186-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 17:37:26,456 WARNING [train.py:1204] (2/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_20-235313-0_sp1.1 from training. Duration: 0.963625 2023-10-03 17:37:28,187 WARNING [train.py:1204] (2/4) Exclude cut with ID _4681_210_3525_1_1527251040321_1235713_123-24428-0_sp0.9 from training. Duration: 0.5333125 2023-10-03 17:37:29,371 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.560e+02 1.906e+02 2.088e+02 2.290e+02 3.410e+02, threshold=4.176e+02, percent-clipped=0.0 2023-10-03 17:37:30,899 WARNING [train.py:1204] (2/4) Exclude cut with ID _31602_210_5854_1_1532677021456_4458250_444-198752-0_sp1.1 from training. Duration: 0.97275 2023-10-03 17:37:30,946 WARNING [train.py:1204] (2/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_9-279442-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 17:37:32,333 WARNING [train.py:1204] (2/4) Exclude cut with ID _6275_210_2084_1_1528110062213_7669049_973-198085-0_sp0.9 from training. Duration: 0.8333125 2023-10-03 17:37:34,326 WARNING [train.py:1204] (2/4) Exclude cut with ID _29853_210_7497_1_1532505600851_6642110_567-250221-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 17:37:34,426 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6545_210_2040_1_1527774670_4245258_407-48070-0_sp1.1 from training. Duration: 0.6909375 2023-10-03 17:37:38,530 WARNING [train.py:1204] (2/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_224-343295-0_sp0.9 from training. Duration: 0.611125 2023-10-03 17:37:38,551 WARNING [train.py:1204] (2/4) Exclude cut with ID IC0125W0011-61817 from training. Duration: 0.88 2023-10-03 17:37:38,567 WARNING [train.py:1204] (2/4) Exclude cut with ID _60113_210_16271_1_1535164212481_4386079_435-154371-0_sp1.1 from training. Duration: 0.97275 2023-10-03 17:37:38,581 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_25836_210_16554_1_1532138167_53197_3-9394-0_sp1.1 from training. Duration: 0.92725 2023-10-03 17:37:40,492 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.5.encoder.layers.0.conv_module2.whiten, num_groups=1, num_channels=256, metric=3.93 vs. limit=15.0 2023-10-03 17:37:41,434 WARNING [train.py:1204] (2/4) Exclude cut with ID _29343_210_4381_1_1532602757752_3674109_57-55125-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 17:37:42,805 WARNING [train.py:1204] (2/4) Exclude cut with ID _56235_210_10640_1_1534557335235_4161099_267-23560-0_sp1.1 from training. Duration: 0.936375 2023-10-03 17:37:42,829 WARNING [train.py:1204] (2/4) Exclude cut with ID _30196_210_1664_1_1533708496522_7074140_1150-278867-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 17:37:42,886 WARNING [train.py:1204] (2/4) Exclude cut with ID _6270_210_2084_1_1527850911573_3856420_26-277343-0_sp0.9 from training. Duration: 0.911125 2023-10-03 17:37:42,950 WARNING [train.py:1204] (2/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_325-62802-0 from training. Duration: 0.87 2023-10-03 17:37:45,630 WARNING [train.py:1204] (2/4) Exclude cut with ID _56524_210_9642_1_1534572011779_3619170_145-310842-0_sp1.1 from training. Duration: 0.9 2023-10-03 17:37:45,662 WARNING [train.py:1204] (2/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_860-96198-0_sp0.9 from training. Duration: 0.811125 2023-10-03 17:37:47,587 WARNING [train.py:1204] (2/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_17-25045-0_sp1.1 from training. Duration: 0.536375 2023-10-03 17:37:47,607 WARNING [train.py:1204] (2/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_349-69727-0_sp1.1 from training. Duration: 0.936375 2023-10-03 17:37:47,704 WARNING [train.py:1204] (2/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_580-93798-0_sp1.1 from training. Duration: 0.5090625 2023-10-03 17:37:49,065 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7762_210_1794_1_1528199508_4057408_419-125009-0 from training. Duration: 0.83 2023-10-03 17:37:50,449 WARNING [train.py:1204] (2/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_356-167816-0 from training. Duration: 0.72 2023-10-03 17:37:50,478 WARNING [train.py:1204] (2/4) Exclude cut with ID _29345_210_4381_1_1532689168263_3860169_225-97100-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 17:37:50,482 WARNING [train.py:1204] (2/4) Exclude cut with ID _14227_210_4381_1_1530701804286_3940259_374-194712-0_sp1.1 from training. Duration: 0.92725 2023-10-03 17:37:51,885 WARNING [train.py:1204] (2/4) Exclude cut with ID _40137_210_12210_1_1533290484046_3795180_249-187059-0_sp0.9 from training. Duration: 0.97775 2023-10-03 17:37:51,911 WARNING [train.py:1204] (2/4) Exclude cut with ID _22826_210_9994_1_1531908201522_3279169_389-317194-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 17:37:53,348 WARNING [train.py:1204] (2/4) Exclude cut with ID _27485_210_4916_1_1532739191677_3668269_239-122918-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 17:37:59,875 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_2245_210_2715_1_1525511548_3540254_128-117609-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 17:37:59,907 WARNING [train.py:1204] (2/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_160-276178-0_sp1.1 from training. Duration: 0.963625 2023-10-03 17:38:01,240 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_31920_210_10633_1_1532860009_1686094_1-279735-0_sp1.1 from training. Duration: 0.97275 2023-10-03 17:38:02,719 WARNING [train.py:1204] (2/4) Exclude cut with ID _51078_210_9070_1_1534161557424_3805250_325-262383-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 17:38:03,968 WARNING [train.py:1204] (2/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_643-52007-0_sp0.9 from training. Duration: 0.6333125 2023-10-03 17:38:04,022 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_51937_210_5014_1_1534573610_4022025_518-299248-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 17:38:11,654 WARNING [train.py:1204] (2/4) Exclude cut with ID _13252_210_1925_1_1530671372047_3336909_166-314607-0_sp1.1 from training. Duration: 0.4909375 2023-10-03 17:38:12,925 WARNING [train.py:1204] (2/4) Exclude cut with ID _44778_210_2054_1_1533798127203_7443090_1087-282939-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 17:38:12,966 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_23850_210_4238_1_1532232597_1632433_32-185135-0_sp1.1 from training. Duration: 0.7818125 2023-10-03 17:38:12,996 WARNING [train.py:1204] (2/4) Exclude cut with ID _59500_210_8254_1_1534809610680_3736009_145-89359-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 17:38:14,242 INFO [train.py:1046] (2/4) Epoch 39, batch 700, loss[loss=0.149, simple_loss=0.2267, pruned_loss=0.03569, over 23477.00 frames. ], tot_loss[loss=0.1569, simple_loss=0.2367, pruned_loss=0.03854, over 4594261.99 frames. ], batch size: 120, lr: 2.62e-03, grad_scale: 16.0 2023-10-03 17:38:19,041 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_340-336408-0 from training. Duration: 0.93 2023-10-03 17:38:19,113 WARNING [train.py:1204] (2/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_9-21737-0 from training. Duration: 0.97 2023-10-03 17:38:21,849 WARNING [train.py:1204] (2/4) Exclude cut with ID _2220_210_1819_1_1525509109898_3658680_50-99837-0 from training. Duration: 0.54 2023-10-03 17:38:21,916 WARNING [train.py:1204] (2/4) Exclude cut with ID _30304_210_6319_1_1532476827261_7582060_879-299350-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 17:38:23,432 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.conv_module1.balancer1.prob, batch_count=1350406.6666666667, ans=0.125 2023-10-03 17:38:24,585 WARNING [train.py:1204] (2/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_905-160074-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 17:38:24,728 WARNING [train.py:1204] (2/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_395-73570-0 from training. Duration: 0.99 2023-10-03 17:38:29,454 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7264_210_4938_1_1527939911_8132095_800-91995-0_sp1.1 from training. Duration: 0.7818125 2023-10-03 17:38:31,545 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.conv_skip_rate, batch_count=1350473.3333333333, ans=0.0 2023-10-03 17:38:32,709 WARNING [train.py:1204] (2/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_77-311404-0_sp1.1 from training. Duration: 0.8545625 2023-10-03 17:38:34,172 WARNING [train.py:1204] (2/4) Exclude cut with ID _13679_210_7497_1_1530534293662_3355098_107-80772-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 17:38:35,530 WARNING [train.py:1204] (2/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_224-343295-0_sp1.1 from training. Duration: 0.5 2023-10-03 17:38:36,832 WARNING [train.py:1204] (2/4) Exclude cut with ID _54871_210_22356_1_1534665798253_3856359_156-253482-0_sp1.1 from training. Duration: 0.963625 2023-10-03 17:38:38,838 WARNING [train.py:1204] (2/4) Exclude cut with ID _49728_210_12651_1_1534041034294_6788150_309-125532-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 17:38:41,662 WARNING [train.py:1204] (2/4) Exclude cut with ID _13913_210_9994_1_1530579615187_3383897_473-277682-0_sp0.9 from training. Duration: 0.4333125 2023-10-03 17:38:41,667 WARNING [train.py:1204] (2/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_3-276701-0_sp1.1 from training. Duration: 0.82725 2023-10-03 17:38:43,096 WARNING [train.py:1204] (2/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_282-136641-0 from training. Duration: 0.42 2023-10-03 17:38:47,506 WARNING [train.py:1204] (2/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_127-566-0 from training. Duration: 0.84 2023-10-03 17:38:50,217 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6163_210_6400_1_1527506746_4263068_283-137811-0_sp1.1 from training. Duration: 0.7 2023-10-03 17:38:51,632 WARNING [train.py:1204] (2/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_34-183685-0_sp1.1 from training. Duration: 0.8909375 2023-10-03 17:38:51,748 WARNING [train.py:1204] (2/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_154-135696-0_sp0.9 from training. Duration: 0.888875 2023-10-03 17:38:52,497 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.3.encoder.layers.0.self_attn2.whiten, num_groups=1, num_channels=512, metric=14.16 vs. limit=22.5 2023-10-03 17:38:55,930 WARNING [train.py:1204] (2/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_676-51014-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 17:38:57,710 WARNING [train.py:1204] (2/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_38-169920-0 from training. Duration: 0.91 2023-10-03 17:39:00,679 WARNING [train.py:1204] (2/4) Exclude cut with ID _6653_210_2629_1_1527919172441_6732679_747-114465-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 17:39:02,631 WARNING [train.py:1204] (2/4) Exclude cut with ID _8021_210_7994_1_1528376451358_3653069_100-140231-0_sp1.1 from training. Duration: 0.6090625 2023-10-03 17:39:02,654 WARNING [train.py:1204] (2/4) Exclude cut with ID _64135_210_2253_1_1535677267876_7608972_133-188266-0 from training. Duration: 0.81 2023-10-03 17:39:06,722 WARNING [train.py:1204] (2/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_714-310538-0_sp1.1 from training. Duration: 0.7818125 2023-10-03 17:39:08,084 WARNING [train.py:1204] (2/4) Exclude cut with ID _39867_210_9826_1_1533553222030_9344259_1134-288498-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 17:39:11,222 WARNING [train.py:1204] (2/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_211-237289-0_sp1.1 from training. Duration: 0.936375 2023-10-03 17:39:15,814 WARNING [train.py:1204] (2/4) Exclude cut with ID _59202_210_26984_1_1534816810612_3549174_137-318710-0_sp0.9 from training. Duration: 0.988875 2023-10-03 17:39:17,102 WARNING [train.py:1204] (2/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_86-66192-0 from training. Duration: 0.55 2023-10-03 17:39:20,563 WARNING [train.py:1204] (2/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_540-275685-0 from training. Duration: 0.89 2023-10-03 17:39:20,600 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8776_210_2040_1_1528638837_4088036_244-278763-0 from training. Duration: 0.9 2023-10-03 17:39:23,253 WARNING [train.py:1204] (2/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_9-219972-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 17:39:24,668 WARNING [train.py:1204] (2/4) Exclude cut with ID _44659_210_2721_1_1534036120061_6614860_656-283823-0_sp1.1 from training. Duration: 0.92725 2023-10-03 17:39:26,083 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_69630_210_7234_1_1535789601_661049_77-216385-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 17:39:28,097 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.5.encoder.layers.0.conv_module1.whiten, num_groups=1, num_channels=256, metric=7.28 vs. limit=15.0 2023-10-03 17:39:28,848 INFO [train.py:1046] (2/4) Epoch 39, batch 750, loss[loss=0.1508, simple_loss=0.2378, pruned_loss=0.03194, over 24468.00 frames. ], tot_loss[loss=0.157, simple_loss=0.2369, pruned_loss=0.03852, over 4633690.42 frames. ], batch size: 66, lr: 2.62e-03, grad_scale: 16.0 2023-10-03 17:39:28,904 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_19952_210_2161_1_1531875469_3851750_210-308919-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 17:39:28,910 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_4737_210_2040_1_1526998689_3293008_378-178650-0 from training. Duration: 0.95 2023-10-03 17:39:32,881 WARNING [train.py:1204] (2/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_224-343295-0 from training. Duration: 0.55 2023-10-03 17:39:34,159 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7002_210_1794_1_1527939683_5157286_184-229630-0 from training. Duration: 0.74 2023-10-03 17:39:34,190 WARNING [train.py:1204] (2/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_6-214815-0 from training. Duration: 0.85 2023-10-03 17:39:35,581 WARNING [train.py:1204] (2/4) Exclude cut with ID _8699_210_2069_1_1528630346831_3960409_160-24408-0 from training. Duration: 0.91 2023-10-03 17:39:36,888 WARNING [train.py:1204] (2/4) Exclude cut with ID _5585_210_2667_1_1527418813324_8007340_89-332072-0 from training. Duration: 0.64 2023-10-03 17:39:36,921 WARNING [train.py:1204] (2/4) Exclude cut with ID _5250_210_5219_1_1527249561806_8692110_528-259800-0_sp1.1 from training. Duration: 0.963625 2023-10-03 17:39:37,001 WARNING [train.py:1204] (2/4) Exclude cut with ID _55383_210_10635_1_1534468426172_2231669_134-128246-0 from training. Duration: 0.89 2023-10-03 17:39:38,267 WARNING [train.py:1204] (2/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_400-338274-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 17:39:39,701 WARNING [train.py:1204] (2/4) Exclude cut with ID _14224_210_4381_1_1530529062037_4264010_364-22817-0_sp0.9 from training. Duration: 0.788875 2023-10-03 17:39:41,046 WARNING [train.py:1204] (2/4) Exclude cut with ID _42863_210_9070_1_1533902398788_3696920_331-291057-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 17:39:43,043 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_5436_210_1794_1_1527331317_2541494_229-211016-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 17:39:43,080 WARNING [train.py:1204] (2/4) Exclude cut with ID _6456_210_1534_1_1527681634212_3181049_345-252877-0_sp0.9 from training. Duration: 0.711125 2023-10-03 17:39:44,364 WARNING [train.py:1204] (2/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_96-81499-0_sp1.1 from training. Duration: 0.92725 2023-10-03 17:39:47,155 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_5575_210_1410_1_1527301305_2570170_342-265572-0_sp1.1 from training. Duration: 0.8545625 2023-10-03 17:39:47,221 WARNING [train.py:1204] (2/4) Exclude cut with ID _5444_210_2151_1_1527418808464_5312210_726-27165-0_sp0.9 from training. Duration: 0.7444375 2023-10-03 17:39:48,625 WARNING [train.py:1204] (2/4) Exclude cut with ID _22083_210_8254_1_1531702862439_3678970_304-327750-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 17:39:50,552 WARNING [train.py:1204] (2/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_245-226199-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 17:39:50,680 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.0.feed_forward1.out_proj.dropout_p, batch_count=1350806.6666666667, ans=0.1 2023-10-03 17:39:51,886 WARNING [train.py:1204] (2/4) Exclude cut with ID _42875_210_14856_1_1533704397487_4012433_102-199499-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 17:39:51,948 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_51225_210_3601_1_1534400634_2327383_55-301844-0 from training. Duration: 0.93 2023-10-03 17:39:53,235 WARNING [train.py:1204] (2/4) Exclude cut with ID _28317_210_12742_1_1532480302381_8510325_916-195848-0_sp1.1 from training. Duration: 0.836375 2023-10-03 17:39:53,305 WARNING [train.py:1204] (2/4) Exclude cut with ID _44718_210_5816_1_1534050051869_6469030_497-311999-0_sp1.1 from training. Duration: 0.936375 2023-10-03 17:39:56,051 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_34886_210_10633_1_1533203882_1755661_91-194765-0_sp1.1 from training. Duration: 0.936375 2023-10-03 17:39:57,474 WARNING [train.py:1204] (2/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_310-26186-0_sp0.9 from training. Duration: 0.77775 2023-10-03 17:39:58,637 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.693e+02 2.002e+02 2.322e+02 2.663e+02 4.203e+02, threshold=4.644e+02, percent-clipped=1.0 2023-10-03 17:39:58,740 WARNING [train.py:1204] (2/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_416-257730-0 from training. Duration: 0.65 2023-10-03 17:39:58,745 WARNING [train.py:1204] (2/4) Exclude cut with ID _9389_210_1664_1_1529217337301_5289269_209-154760-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 17:40:00,158 WARNING [train.py:1204] (2/4) Exclude cut with ID _4092_210_2151_1_1526643044449_3135759_227-213972-0 from training. Duration: 0.54 2023-10-03 17:40:00,185 WARNING [train.py:1204] (2/4) Exclude cut with ID IC0679W0044-335852 from training. Duration: 0.768 2023-10-03 17:40:02,152 WARNING [train.py:1204] (2/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_42-186162-0 from training. Duration: 0.92 2023-10-03 17:40:02,165 WARNING [train.py:1204] (2/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_302-2550-0_sp0.9 from training. Duration: 0.9 2023-10-03 17:40:02,178 WARNING [train.py:1204] (2/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_356-31216-0_sp0.9 from training. Duration: 0.8333125 2023-10-03 17:40:03,611 WARNING [train.py:1204] (2/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_125-322172-0_sp1.1 from training. Duration: 0.8454375 2023-10-03 17:40:11,047 WARNING [train.py:1204] (2/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_141-89727-0_sp0.9 from training. Duration: 0.788875 2023-10-03 17:40:11,064 WARNING [train.py:1204] (2/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_647-337784-0_sp1.1 from training. Duration: 0.97275 2023-10-03 17:40:11,080 WARNING [train.py:1204] (2/4) Exclude cut with ID _6493_210_1747_1_1528459210424_4012470_513-140967-0_sp1.1 from training. Duration: 0.7181875 2023-10-03 17:40:13,614 WARNING [train.py:1204] (2/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_87-266334-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 17:40:13,720 WARNING [train.py:1204] (2/4) Exclude cut with ID _26528_210_9070_1_1532570410842_3851310_526-261338-0_sp1.1 from training. Duration: 0.92725 2023-10-03 17:40:15,033 WARNING [train.py:1204] (2/4) Exclude cut with ID _14224_210_4381_1_1530529062037_4264010_380-343670-0 from training. Duration: 0.88 2023-10-03 17:40:15,092 WARNING [train.py:1204] (2/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_449-15508-0_sp1.1 from training. Duration: 0.7454375 2023-10-03 17:40:15,340 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.bypass_mid.scale_min, batch_count=1350940.0, ans=0.2 2023-10-03 17:40:17,801 WARNING [train.py:1204] (2/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_372-6869-0_sp0.9 from training. Duration: 0.488875 2023-10-03 17:40:17,879 WARNING [train.py:1204] (2/4) Exclude cut with ID _26907_210_7234_1_1532221740694_3205263_337-2494-0_sp0.9 from training. Duration: 0.97775 2023-10-03 17:40:18,015 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.0.bypass.scale_min, batch_count=1350940.0, ans=0.2 2023-10-03 17:40:21,099 WARNING [train.py:1204] (2/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_10-100367-0_sp1.1 from training. Duration: 0.8909375 2023-10-03 17:40:21,126 WARNING [train.py:1204] (2/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_47-167373-0 from training. Duration: 0.92 2023-10-03 17:40:21,182 WARNING [train.py:1204] (2/4) Exclude cut with ID _28323_210_16187_1_1532480395667_4100300_628-282179-0_sp1.1 from training. Duration: 0.97275 2023-10-03 17:40:24,019 WARNING [train.py:1204] (2/4) Exclude cut with ID _20455_210_5219_1_1531565996775_8098100_213-280898-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 17:40:25,401 WARNING [train.py:1204] (2/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_383-316000-0_sp1.1 from training. Duration: 0.8181875 2023-10-03 17:40:25,545 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.conv_module1.balancer1.min_positive, batch_count=1350940.0, ans=0.025 2023-10-03 17:40:26,702 WARNING [train.py:1204] (2/4) Exclude cut with ID _18513_210_9206_1_1531618215223_3862620_16-78180-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 17:40:28,609 WARNING [train.py:1204] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_330-327861-0_sp1.1 from training. Duration: 0.6090625 2023-10-03 17:40:32,005 WARNING [train.py:1204] (2/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_356-31216-0 from training. Duration: 0.75 2023-10-03 17:40:32,027 WARNING [train.py:1204] (2/4) Exclude cut with ID _25248_210_8750_1_1532062825941_5554180_255-148539-0_sp1.1 from training. Duration: 0.963625 2023-10-03 17:40:33,368 WARNING [train.py:1204] (2/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_269-154726-0_sp1.1 from training. Duration: 0.7818125 2023-10-03 17:40:36,191 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7002_210_1794_1_1527939683_5157286_163-87552-0_sp1.1 from training. Duration: 0.7818125 2023-10-03 17:40:37,486 WARNING [train.py:1204] (2/4) Exclude cut with ID _54871_210_22356_1_1534665798253_3856359_97-151896-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 17:40:39,538 WARNING [train.py:1204] (2/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_207-7026-0_sp1.1 from training. Duration: 0.97275 2023-10-03 17:40:39,557 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_92-46009-0_sp0.9 from training. Duration: 0.6 2023-10-03 17:40:43,528 INFO [train.py:1046] (2/4) Epoch 39, batch 800, loss[loss=0.1535, simple_loss=0.2316, pruned_loss=0.03773, over 20423.00 frames. ], tot_loss[loss=0.157, simple_loss=0.2369, pruned_loss=0.03854, over 4649443.03 frames. ], batch size: 44, lr: 2.62e-03, grad_scale: 32.0 2023-10-03 17:40:46,235 WARNING [train.py:1204] (2/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_122-178847-0_sp1.1 from training. Duration: 0.97275 2023-10-03 17:40:46,239 WARNING [train.py:1204] (2/4) Exclude cut with ID _44778_210_2054_1_1533798127203_7443090_94-348956-0_sp1.1 from training. Duration: 0.92725 2023-10-03 17:40:47,677 WARNING [train.py:1204] (2/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_1018-244089-0_sp1.1 from training. Duration: 0.963625 2023-10-03 17:40:47,689 WARNING [train.py:1204] (2/4) Exclude cut with ID _56235_210_10640_1_1534557335235_4161099_87-118423-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 17:40:50,751 WARNING [train.py:1204] (2/4) Exclude cut with ID _53713_210_24116_1_1534314487762_7416751_504-212292-0_sp1.1 from training. Duration: 0.92725 2023-10-03 17:40:50,771 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_446-150327-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 17:40:52,146 WARNING [train.py:1204] (2/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_682-245989-0_sp1.1 from training. Duration: 0.92725 2023-10-03 17:40:57,597 WARNING [train.py:1204] (2/4) Exclude cut with ID _35723_210_1794_1_1533088341043_628250_8-234524-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 17:40:57,659 WARNING [train.py:1204] (2/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_64-248359-0_sp0.9 from training. Duration: 0.7666875 2023-10-03 17:40:59,726 WARNING [train.py:1204] (2/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_49-97158-0 from training. Duration: 0.76 2023-10-03 17:41:01,049 WARNING [train.py:1204] (2/4) Exclude cut with ID _24314_210_4915_1_1532135078116_7170179_743-915-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 17:41:01,140 WARNING [train.py:1204] (2/4) Exclude cut with ID _61194_210_18628_1_1535007555628_3750909_353-323468-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 17:41:02,426 WARNING [train.py:1204] (2/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_387-131538-0_sp1.1 from training. Duration: 0.736375 2023-10-03 17:41:02,456 WARNING [train.py:1204] (2/4) Exclude cut with ID _44424_210_18163_1_1533623394848_1213110_79-187603-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 17:41:02,487 WARNING [train.py:1204] (2/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_86-19601-0 from training. Duration: 0.85 2023-10-03 17:41:02,510 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_50050_210_16223_1_1535435796_7290138_103-283529-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 17:41:02,548 WARNING [train.py:1204] (2/4) Exclude cut with ID _13913_210_9994_1_1530579615187_3383897_473-277682-0 from training. Duration: 0.39 2023-10-03 17:41:05,228 WARNING [train.py:1204] (2/4) Exclude cut with ID _42326_210_7770_1_1533814282531_3707269_368-68029-0_sp1.1 from training. Duration: 0.92725 2023-10-03 17:41:07,839 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_41428_210_2161_1_1533774184_3992532_446-339645-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 17:41:09,722 WARNING [train.py:1204] (2/4) Exclude cut with ID _69584_210_5219_1_1535785133775_4044450_325-7656-0_sp1.1 from training. Duration: 0.7818125 2023-10-03 17:41:09,731 WARNING [train.py:1204] (2/4) Exclude cut with ID _21632_210_2253_1_1531994001765_3744140_134-184387-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 17:41:12,464 WARNING [train.py:1204] (2/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_550-184707-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 17:41:13,748 WARNING [train.py:1204] (2/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_106-341780-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 17:41:16,788 WARNING [train.py:1204] (2/4) Exclude cut with ID _45871_210_5665_1_1533864614245_3296060_253-132552-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 17:41:18,187 WARNING [train.py:1204] (2/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_395-310220-0_sp1.1 from training. Duration: 0.8090625 2023-10-03 17:41:18,208 WARNING [train.py:1204] (2/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_795-33252-0_sp0.9 from training. Duration: 0.411125 2023-10-03 17:41:21,378 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9002W0003-495962 from training. Duration: 0.946 2023-10-03 17:41:22,620 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_43-71989-0 from training. Duration: 0.57 2023-10-03 17:41:22,640 WARNING [train.py:1204] (2/4) Exclude cut with ID _8699_210_2069_1_1528630346831_3960409_155-39766-0_sp1.1 from training. Duration: 0.7090625 2023-10-03 17:41:22,658 WARNING [train.py:1204] (2/4) Exclude cut with ID _27894_210_12392_1_1532415496078_6902439_788-232684-0_sp1.1 from training. Duration: 0.97275 2023-10-03 17:41:24,025 WARNING [train.py:1204] (2/4) Exclude cut with ID _56494_210_4220_1_1535284837625_3033169_10-39296-0_sp1.1 from training. Duration: 0.92725 2023-10-03 17:41:25,305 WARNING [train.py:1204] (2/4) Exclude cut with ID _40901_210_5665_1_1533363789958_3856130_341-20819-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 17:41:30,167 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9029W0435-509822 from training. Duration: 0.896 2023-10-03 17:41:30,217 WARNING [train.py:1204] (2/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_51-117576-0 from training. Duration: 0.4 2023-10-03 17:41:32,216 WARNING [train.py:1204] (2/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_197-28164-0_sp1.1 from training. Duration: 0.72725 2023-10-03 17:41:33,591 WARNING [train.py:1204] (2/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_145-137790-0_sp1.1 from training. Duration: 0.7545625 2023-10-03 17:41:37,689 WARNING [train.py:1204] (2/4) Exclude cut with ID _47790_210_16187_1_1533862814066_4207519_2-39653-0_sp1.1 from training. Duration: 0.8909375 2023-10-03 17:41:39,137 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_3780_210_2087_1_1526467760_2130483_186-208240-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 17:41:40,466 WARNING [train.py:1204] (2/4) Exclude cut with ID _24055_210_12651_1_1532310907143_976979_92-59356-0 from training. Duration: 0.91 2023-10-03 17:41:40,512 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_3910_210_1794_1_1526742306_334530_28-238909-0_sp1.1 from training. Duration: 0.736375 2023-10-03 17:41:40,651 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.1.feed_forward2.hidden_balancer.prob, batch_count=1351340.0, ans=0.125 2023-10-03 17:41:43,739 WARNING [train.py:1204] (2/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_269-154726-0 from training. Duration: 0.86 2023-10-03 17:41:49,320 WARNING [train.py:1204] (2/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_326-165900-0_sp0.9 from training. Duration: 0.7666875 2023-10-03 17:41:51,048 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.conv_module1.balancer1.min_positive, batch_count=1351340.0, ans=0.025 2023-10-03 17:41:52,218 WARNING [train.py:1204] (2/4) Exclude cut with ID _5294_210_1747_1_1527249643928_3696179_470-276077-0_sp1.1 from training. Duration: 0.963625 2023-10-03 17:41:53,569 WARNING [train.py:1204] (2/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_372-131163-0 from training. Duration: 0.94 2023-10-03 17:41:55,234 WARNING [train.py:1204] (2/4) Exclude cut with ID _45297_210_2721_1_1533715351381_3264949_351-147136-0_sp1.1 from training. Duration: 0.7909375 2023-10-03 17:41:55,311 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_1072-12671-0_sp1.1 from training. Duration: 0.92725 2023-10-03 17:41:56,650 INFO [train.py:1046] (2/4) Epoch 39, batch 850, loss[loss=0.1432, simple_loss=0.2214, pruned_loss=0.03253, over 23526.00 frames. ], tot_loss[loss=0.1578, simple_loss=0.2378, pruned_loss=0.03893, over 4661082.24 frames. ], batch size: 134, lr: 2.62e-03, grad_scale: 16.0 2023-10-03 17:41:56,719 WARNING [train.py:1204] (2/4) Exclude cut with ID _15133_210_6228_1_1530794512074_3080379_54-198193-0 from training. Duration: 0.94 2023-10-03 17:41:56,737 WARNING [train.py:1204] (2/4) Exclude cut with ID _30196_210_1664_1_1533708496522_7074140_1194-270988-0_sp1.1 from training. Duration: 0.97275 2023-10-03 17:41:58,130 WARNING [train.py:1204] (2/4) Exclude cut with ID _50310_210_15488_1_1534068043345_3641080_289-193637-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 17:41:59,526 WARNING [train.py:1204] (2/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_313-263495-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 17:42:01,486 WARNING [train.py:1204] (2/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_18-130336-0_sp1.1 from training. Duration: 0.7181875 2023-10-03 17:42:02,796 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_47896_210_20508_1_1534492518_5958868_646-287209-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 17:42:04,189 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_294-163793-0 from training. Duration: 0.82 2023-10-03 17:42:04,402 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.feed_forward2.hidden_balancer.prob, batch_count=1351406.6666666667, ans=0.125 2023-10-03 17:42:05,485 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_8-319957-0 from training. Duration: 0.96 2023-10-03 17:42:05,494 WARNING [train.py:1204] (2/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_1211-226066-0 from training. Duration: 0.76 2023-10-03 17:42:08,197 WARNING [train.py:1204] (2/4) Exclude cut with ID _4779_210_2084_1_1527318022944_3914893_500-285013-0_sp0.9 from training. Duration: 0.7666875 2023-10-03 17:42:08,224 WARNING [train.py:1204] (2/4) Exclude cut with ID _22826_210_9994_1_1531908201522_3279169_347-232268-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 17:42:09,712 WARNING [train.py:1204] (2/4) Exclude cut with ID _29877_210_6228_1_1532397791106_5276149_111-55056-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 17:42:09,736 WARNING [train.py:1204] (2/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_812-271646-0_sp1.1 from training. Duration: 0.92725 2023-10-03 17:42:09,775 WARNING [train.py:1204] (2/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_235-262124-0_sp1.1 from training. Duration: 0.6090625 2023-10-03 17:42:13,217 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.self_attn_weights.pos_emb_skip_rate, batch_count=1351473.3333333333, ans=0.0 2023-10-03 17:42:14,389 WARNING [train.py:1204] (2/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_14-16530-0_sp1.1 from training. Duration: 0.97275 2023-10-03 17:42:14,429 WARNING [train.py:1204] (2/4) Exclude cut with ID _44718_210_5816_1_1534050051869_6469030_568-304512-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 17:42:14,448 WARNING [train.py:1204] (2/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_1033-274376-0 from training. Duration: 0.87 2023-10-03 17:42:18,573 WARNING [train.py:1204] (2/4) Exclude cut with ID _4681_210_3525_1_1527251040321_1235713_123-24428-0 from training. Duration: 0.48 2023-10-03 17:42:22,740 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_37818_210_4238_1_1533620910_4720083_251-288764-0_sp1.1 from training. Duration: 0.97275 2023-10-03 17:42:23,206 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.5.encoder.layers.0.feed_forward3.out_whiten, num_groups=1, num_channels=256, metric=9.64 vs. limit=15.0 2023-10-03 17:42:24,143 WARNING [train.py:1204] (2/4) Exclude cut with ID _65585_210_12210_1_1535450451584_3764098_112-294301-0 from training. Duration: 0.88 2023-10-03 17:42:24,337 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.ff2_skip_rate, batch_count=1351540.0, ans=0.0 2023-10-03 17:42:26,740 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.581e+02 1.921e+02 2.098e+02 2.491e+02 3.402e+02, threshold=4.195e+02, percent-clipped=0.0 2023-10-03 17:42:28,213 WARNING [train.py:1204] (2/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_329-113173-0 from training. Duration: 0.62 2023-10-03 17:42:28,464 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=1351540.0, ans=0.1 2023-10-03 17:42:31,370 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6767_210_3273_1_1527855783_1718932_173-256601-0 from training. Duration: 0.8 2023-10-03 17:42:34,612 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9266W0001-627677 from training. Duration: 0.896 2023-10-03 17:42:34,623 WARNING [train.py:1204] (2/4) Exclude cut with ID _49643_210_7989_1_1534640244224_3939031_611-185515-0_sp1.1 from training. Duration: 0.8545625 2023-10-03 17:42:34,627 WARNING [train.py:1204] (2/4) Exclude cut with ID _31649_210_6228_1_1532584608063_4091078_687-237771-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 17:42:34,637 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_12868_210_6753_1_1530702479_3610564_148-172861-0_sp0.9 from training. Duration: 0.5555625 2023-10-03 17:42:37,864 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_5129_210_6400_1_1527161005_3579054_107-69738-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 17:42:39,172 WARNING [train.py:1204] (2/4) Exclude cut with ID _40137_210_12210_1_1533290484046_3795180_36-301164-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 17:42:39,195 WARNING [train.py:1204] (2/4) Exclude cut with ID _7537_210_2084_1_1528502395431_3924359_113-199933-0 from training. Duration: 0.59 2023-10-03 17:42:40,698 WARNING [train.py:1204] (2/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_547-5424-0_sp1.1 from training. Duration: 0.8545625 2023-10-03 17:42:41,908 WARNING [train.py:1204] (2/4) Exclude cut with ID _43552_210_8913_1_1533726031059_3523429_147-284024-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 17:42:43,381 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_294-163793-0_sp1.1 from training. Duration: 0.7454375 2023-10-03 17:42:43,406 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_3780_210_2087_1_1526467760_2130483_170-259044-0_sp1.1 from training. Duration: 0.636375 2023-10-03 17:42:45,323 WARNING [train.py:1204] (2/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_726-270780-0_sp1.1 from training. Duration: 0.7818125 2023-10-03 17:42:46,428 WARNING [train.py:1204] (2/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_214-291369-0_sp1.1 from training. Duration: 0.57275 2023-10-03 17:42:46,482 WARNING [train.py:1204] (2/4) Exclude cut with ID _21881_210_3525_1_1531825242757_3798380_247-102591-0 from training. Duration: 0.98 2023-10-03 17:42:48,062 WARNING [train.py:1204] (2/4) Exclude cut with ID _8064_210_5399_1_1528372299291_4282973_18-174647-0_sp1.1 from training. Duration: 0.82725 2023-10-03 17:42:48,064 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_415-52611-0_sp1.1 from training. Duration: 0.936375 2023-10-03 17:42:49,571 WARNING [train.py:1204] (2/4) Exclude cut with ID _8394_210_6702_1_1528590504316_3540189_379-49457-0_sp0.9 from training. Duration: 0.9666875 2023-10-03 17:42:49,584 WARNING [train.py:1204] (2/4) Exclude cut with ID _64521_210_14699_1_1535243427259_3815328_368-195106-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 17:42:50,837 WARNING [train.py:1204] (2/4) Exclude cut with ID _28394_210_8913_1_1532430049518_3650118_51-334124-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 17:42:55,110 WARNING [train.py:1204] (2/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_317-161769-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 17:42:56,412 WARNING [train.py:1204] (2/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_40-129756-0_sp0.9 from training. Duration: 0.9 2023-10-03 17:42:57,869 WARNING [train.py:1204] (2/4) Exclude cut with ID _3067_210_2667_1_1526212974640_4503168_119-175776-0_sp0.9 from training. Duration: 0.82225 2023-10-03 17:42:59,275 WARNING [train.py:1204] (2/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_1004-304915-0_sp1.1 from training. Duration: 0.92725 2023-10-03 17:43:00,668 WARNING [train.py:1204] (2/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_359-118926-0_sp0.9 from training. Duration: 0.8 2023-10-03 17:43:06,149 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.0.layers.1.feed_forward2.out_whiten, num_groups=1, num_channels=192, metric=4.50 vs. limit=15.0 2023-10-03 17:43:08,472 WARNING [train.py:1204] (2/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_51-200220-0_sp0.9 from training. Duration: 0.62225 2023-10-03 17:43:08,700 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.balancer1.prob, batch_count=1351740.0, ans=0.125 2023-10-03 17:43:09,830 INFO [train.py:1046] (2/4) Epoch 39, batch 900, loss[loss=0.1656, simple_loss=0.2515, pruned_loss=0.03989, over 24423.00 frames. ], tot_loss[loss=0.1589, simple_loss=0.239, pruned_loss=0.03937, over 4660616.21 frames. ], batch size: 77, lr: 2.62e-03, grad_scale: 16.0 2023-10-03 17:43:09,978 WARNING [train.py:1204] (2/4) Exclude cut with ID _46683_210_9642_1_1533730913492_4591116_530-326202-0_sp1.1 from training. Duration: 0.936375 2023-10-03 17:43:11,800 WARNING [train.py:1204] (2/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_567-135114-0 from training. Duration: 0.87 2023-10-03 17:43:11,842 WARNING [train.py:1204] (2/4) Exclude cut with ID _66863_210_18053_1_1535698758334_8590319_1092-271784-0_sp1.1 from training. Duration: 0.87275 2023-10-03 17:43:11,852 WARNING [train.py:1204] (2/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_762-282614-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 17:43:12,585 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.1.encoder.layers.1.feed_forward2.out_whiten, num_groups=1, num_channels=256, metric=3.61 vs. limit=15.0 2023-10-03 17:43:13,371 WARNING [train.py:1204] (2/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_280-120942-0 from training. Duration: 0.77 2023-10-03 17:43:19,488 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_31583_210_3852_1_1532595851_4024413_27-170931-0_sp1.1 from training. Duration: 0.963625 2023-10-03 17:43:20,933 WARNING [train.py:1204] (2/4) Exclude cut with ID _12369_210_4381_1_1530356078581_4388520_266-271628-0_sp1.1 from training. Duration: 0.92725 2023-10-03 17:43:20,969 WARNING [train.py:1204] (2/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_41-31157-0 from training. Duration: 0.57 2023-10-03 17:43:23,609 WARNING [train.py:1204] (2/4) Exclude cut with ID _21878_210_3525_1_1531738772002_7264330_113-18163-0_sp1.1 from training. Duration: 0.8090625 2023-10-03 17:43:25,040 WARNING [train.py:1204] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_147-111137-0 from training. Duration: 0.95 2023-10-03 17:43:26,346 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1083-220122-0_sp1.1 from training. Duration: 0.463625 2023-10-03 17:43:27,702 WARNING [train.py:1204] (2/4) Exclude cut with ID _14884_210_2252_1_1530707459689_5561250_591-173406-0_sp1.1 from training. Duration: 0.87275 2023-10-03 17:43:27,707 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_24677_210_1794_1_1532002693_4341296_149-303793-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 17:43:27,750 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_12868_210_6753_1_1530702479_3610564_133-564-0_sp1.1 from training. Duration: 0.7181875 2023-10-03 17:43:29,021 WARNING [train.py:1204] (2/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_625-338546-0_sp0.9 from training. Duration: 0.97775 2023-10-03 17:43:31,314 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.3.encoder.layers.3.nonlin_attention.whiten2, num_groups=1, num_channels=512, metric=4.63 vs. limit=15.0 2023-10-03 17:43:35,568 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.conv_module2.balancer1.prob, batch_count=1351806.6666666667, ans=0.125 2023-10-03 17:43:39,841 WARNING [train.py:1204] (2/4) Exclude cut with ID _25882_210_3036_1_1532775665815_6479510_624-320169-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 17:43:39,854 WARNING [train.py:1204] (2/4) Exclude cut with ID _20622_210_3852_1_1531622427080_1093839_127-325326-0_sp1.1 from training. Duration: 0.92725 2023-10-03 17:43:39,882 WARNING [train.py:1204] (2/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_464-33931-0_sp1.1 from training. Duration: 0.4909375 2023-10-03 17:43:43,343 WARNING [train.py:1204] (2/4) Exclude cut with ID _66112_210_10635_1_1535590236305_3801484_120-304588-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 17:43:47,819 WARNING [train.py:1204] (2/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_110-130961-0 from training. Duration: 0.47 2023-10-03 17:43:50,536 WARNING [train.py:1204] (2/4) Exclude cut with ID _16908_210_5724_1_1531119157248_3772700_64-54020-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 17:43:53,461 WARNING [train.py:1204] (2/4) Exclude cut with ID _4068_210_2629_1_1526697350395_1546259_224-288693-0_sp1.1 from training. Duration: 0.536375 2023-10-03 17:43:54,754 WARNING [train.py:1204] (2/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_191-41023-0_sp0.9 from training. Duration: 0.788875 2023-10-03 17:43:54,812 WARNING [train.py:1204] (2/4) Exclude cut with ID ID0205W0255-783002 from training. Duration: 0.958 2023-10-03 17:43:54,874 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_162-262717-0 from training. Duration: 0.83 2023-10-03 17:44:01,564 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_224-322394-0_sp1.1 from training. Duration: 0.6 2023-10-03 17:44:01,570 WARNING [train.py:1204] (2/4) Exclude cut with ID _4866_210_2577_1_1527404412284_4364659_133-1696-0_sp1.1 from training. Duration: 0.9 2023-10-03 17:44:01,618 WARNING [train.py:1204] (2/4) Exclude cut with ID _6275_210_2084_1_1528110062213_7669049_973-198085-0_sp1.1 from training. Duration: 0.6818125 2023-10-03 17:44:08,213 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_47449_210_5399_1_1534076445_4006929_410-237806-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 17:44:08,223 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7543_210_6753_1_1528543318_4552_1-85072-0_sp0.9 from training. Duration: 0.92225 2023-10-03 17:44:11,426 WARNING [train.py:1204] (2/4) Exclude cut with ID _65263_210_22356_1_1535763598691_3921037_108-21902-0 from training. Duration: 0.99 2023-10-03 17:44:11,431 WARNING [train.py:1204] (2/4) Exclude cut with ID _65585_210_12210_1_1535450451584_3764098_309-174818-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 17:44:13,075 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.nonlin_attention.balancer.max_positive, batch_count=1352006.6666666667, ans=0.95 2023-10-03 17:44:14,305 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_309-65177-0 from training. Duration: 0.88 2023-10-03 17:44:15,701 WARNING [train.py:1204] (2/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_363-185703-0_sp1.1 from training. Duration: 0.863625 2023-10-03 17:44:15,728 WARNING [train.py:1204] (2/4) Exclude cut with ID _42326_210_7770_1_1533814282531_3707269_442-238692-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 17:44:17,103 WARNING [train.py:1204] (2/4) Exclude cut with ID _69884_210_9771_1_1535846392818_4166169_226-254446-0_sp1.1 from training. Duration: 0.936375 2023-10-03 17:44:17,129 WARNING [train.py:1204] (2/4) Exclude cut with ID _29853_210_7497_1_1532505600851_6642110_287-299903-0_sp1.1 from training. Duration: 0.963625 2023-10-03 17:44:21,935 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1015-343884-0 from training. Duration: 0.62 2023-10-03 17:44:21,974 WARNING [train.py:1204] (2/4) Exclude cut with ID ID0305W0008-833402 from training. Duration: 0.91 2023-10-03 17:44:23,417 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6673_210_3273_1_1527767790_3173240_351-167954-0_sp1.1 from training. Duration: 0.436375 2023-10-03 17:44:23,432 WARNING [train.py:1204] (2/4) Exclude cut with ID _6456_210_1534_1_1527681634212_3181049_345-252877-0 from training. Duration: 0.64 2023-10-03 17:44:24,777 INFO [train.py:1046] (2/4) Epoch 39, batch 950, loss[loss=0.1722, simple_loss=0.2557, pruned_loss=0.04437, over 24331.00 frames. ], tot_loss[loss=0.1584, simple_loss=0.2387, pruned_loss=0.03909, over 4679615.90 frames. ], batch size: 77, lr: 2.62e-03, grad_scale: 16.0 2023-10-03 17:44:26,272 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_13-205182-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 17:44:26,499 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.feed_forward1.out_proj.dropout_p, batch_count=1352073.3333333333, ans=0.1 2023-10-03 17:44:30,202 WARNING [train.py:1204] (2/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_427-208901-0 from training. Duration: 0.68 2023-10-03 17:44:33,167 WARNING [train.py:1204] (2/4) Exclude cut with ID _49039_210_6315_1_1533967537541_3846118_9-148921-0_sp1.1 from training. Duration: 0.97275 2023-10-03 17:44:36,467 WARNING [train.py:1204] (2/4) Exclude cut with ID _59493_210_17282_1_1535078419077_3225879_143-70317-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 17:44:36,500 WARNING [train.py:1204] (2/4) Exclude cut with ID _27967_210_12140_1_1532588391859_8419130_1056-236369-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 17:44:36,538 WARNING [train.py:1204] (2/4) Exclude cut with ID _6537_210_1883_1_1527987559710_3597129_382-133062-0_sp0.9 from training. Duration: 0.7555625 2023-10-03 17:44:39,238 WARNING [train.py:1204] (2/4) Exclude cut with ID ID0376W0255-869957 from training. Duration: 0.9510625 2023-10-03 17:44:43,714 WARNING [train.py:1204] (2/4) Exclude cut with ID _49371_210_12071_1_1533952761021_3812190_483-240691-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 17:44:45,128 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_12868_210_6753_1_1530701932_530275_18-76322-0_sp1.1 from training. Duration: 0.7909375 2023-10-03 17:44:45,184 WARNING [train.py:1204] (2/4) Exclude cut with ID _53950_210_5816_1_1534417264919_3862951_367-26344-0_sp1.1 from training. Duration: 0.97275 2023-10-03 17:44:45,208 WARNING [train.py:1204] (2/4) Exclude cut with ID _5444_210_2151_1_1527418808464_5312210_634-194174-0_sp1.1 from training. Duration: 0.8818125 2023-10-03 17:44:45,224 WARNING [train.py:1204] (2/4) Exclude cut with ID _56494_210_4220_1_1535284837625_3033169_69-138150-0 from training. Duration: 0.94 2023-10-03 17:44:46,495 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7543_210_6753_1_1528544010_1110402_25-183860-0_sp0.9 from training. Duration: 0.52225 2023-10-03 17:44:46,717 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.balancer2.prob, batch_count=1352140.0, ans=0.125 2023-10-03 17:44:47,055 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.4.encoder.layers.0.self_attn2.whiten, num_groups=1, num_channels=384, metric=13.76 vs. limit=22.5 2023-10-03 17:44:47,905 WARNING [train.py:1204] (2/4) Exclude cut with ID _40284_210_2084_1_1533513679814_3704279_287-191918-0_sp1.1 from training. Duration: 0.963625 2023-10-03 17:44:49,822 WARNING [train.py:1204] (2/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_291-270881-0 from training. Duration: 0.97 2023-10-03 17:44:49,869 WARNING [train.py:1204] (2/4) Exclude cut with ID _12555_210_8750_1_1530075594402_3780239_449-267986-0_sp0.9 from training. Duration: 0.92225 2023-10-03 17:44:53,831 WARNING [train.py:1204] (2/4) Exclude cut with ID _3992_210_1747_1_1526644816031_3731060_268-64520-0_sp1.1 from training. Duration: 0.963625 2023-10-03 17:44:53,838 WARNING [train.py:1204] (2/4) Exclude cut with ID _39411_210_5986_1_1533538825030_3343649_173-238248-0_sp0.9 from training. Duration: 0.92225 2023-10-03 17:44:53,859 WARNING [train.py:1204] (2/4) Exclude cut with ID _32704_210_8954_1_1533017488840_6747370_828-313097-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 17:44:55,211 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.580e+02 1.980e+02 2.279e+02 2.726e+02 3.992e+02, threshold=4.557e+02, percent-clipped=0.0 2023-10-03 17:44:55,296 WARNING [train.py:1204] (2/4) Exclude cut with ID _26907_210_7234_1_1532221740694_3205263_337-2494-0 from training. Duration: 0.88 2023-10-03 17:44:56,658 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_229-45300-0_sp1.1 from training. Duration: 0.5181875 2023-10-03 17:44:58,057 WARNING [train.py:1204] (2/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_300-27235-0_sp1.1 from training. Duration: 0.7909375 2023-10-03 17:45:00,623 WARNING [train.py:1204] (2/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_48-338976-0_sp1.1 from training. Duration: 0.8090625 2023-10-03 17:45:02,286 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_13091_210_7925_1_1530609447_8987354_792-195353-0_sp1.1 from training. Duration: 0.936375 2023-10-03 17:45:02,297 WARNING [train.py:1204] (2/4) Exclude cut with ID _56524_210_9642_1_1534572011779_3619170_311-54500-0_sp1.1 from training. Duration: 0.97275 2023-10-03 17:45:08,281 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7543_210_6753_1_1528544010_1110402_25-183860-0 from training. Duration: 0.47 2023-10-03 17:45:08,476 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.self_attn_weights.pos_emb_skip_rate, batch_count=1352273.3333333333, ans=0.0 2023-10-03 17:45:10,189 WARNING [train.py:1204] (2/4) Exclude cut with ID 2411-132532-0017-82279-0_sp1.1 from training. Duration: 0.9681875 2023-10-03 17:45:10,190 WARNING [train.py:1204] (2/4) Exclude cut with ID _14170_210_5219_1_1530525949868_4469650_272-249985-0_sp0.9 from training. Duration: 0.7444375 2023-10-03 17:45:11,553 WARNING [train.py:1204] (2/4) Exclude cut with ID _55644_210_11856_1_1534593599582_4103719_320-229006-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 17:45:11,616 WARNING [train.py:1204] (2/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_177-100554-0_sp1.1 from training. Duration: 0.963625 2023-10-03 17:45:11,618 WARNING [train.py:1204] (2/4) Exclude cut with ID _14998_210_5219_1_1530705582080_7474970_787-294416-0_sp1.1 from training. Duration: 0.7454375 2023-10-03 17:45:15,858 WARNING [train.py:1204] (2/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_318-59097-0 from training. Duration: 0.76 2023-10-03 17:45:15,927 WARNING [train.py:1204] (2/4) Exclude cut with ID _12369_210_4381_1_1530356078581_4388520_480-13146-0_sp1.1 from training. Duration: 0.77275 2023-10-03 17:45:19,256 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_469-338558-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 17:45:19,325 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_40896_210_15710_1_1533433956_7100802_367-126897-0_sp1.1 from training. Duration: 0.963625 2023-10-03 17:45:19,341 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6545_210_2040_1_1527774670_4245258_379-14182-0 from training. Duration: 0.8 2023-10-03 17:45:19,352 WARNING [train.py:1204] (2/4) Exclude cut with ID _21631_210_2253_1_1531907880778_3160339_197-79498-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 17:45:19,359 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_12868_210_6753_1_1530702479_3610564_416-293746-0_sp1.1 from training. Duration: 0.8181875 2023-10-03 17:45:19,538 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.conv_skip_rate, batch_count=1352273.3333333333, ans=0.0 2023-10-03 17:45:20,614 WARNING [train.py:1204] (2/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_52-211683-0 from training. Duration: 0.9 2023-10-03 17:45:22,246 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.attention_skip_rate, batch_count=1352340.0, ans=0.0 2023-10-03 17:45:24,714 WARNING [train.py:1204] (2/4) Exclude cut with ID _48678_210_5663_1_1533988830947_3769199_306-255886-0_sp0.9 from training. Duration: 0.8555625 2023-10-03 17:45:26,094 WARNING [train.py:1204] (2/4) Exclude cut with ID _65134_210_14763_1_1535335393985_3505368_296-24733-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 17:45:30,382 WARNING [train.py:1204] (2/4) Exclude cut with ID _14898_210_9206_1_1530766812377_4177190_335-99364-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 17:45:33,443 WARNING [train.py:1204] (2/4) Exclude cut with ID _54871_210_22356_1_1534665798253_3856359_19-342632-0 from training. Duration: 0.91 2023-10-03 17:45:33,474 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_53071_210_16554_1_1534490405_2954066_265-170472-0 from training. Duration: 0.83 2023-10-03 17:45:38,060 INFO [train.py:1046] (2/4) Epoch 39, batch 1000, loss[loss=0.1633, simple_loss=0.2509, pruned_loss=0.03791, over 24573.00 frames. ], tot_loss[loss=0.158, simple_loss=0.2377, pruned_loss=0.03908, over 4684525.52 frames. ], batch size: 71, lr: 2.62e-03, grad_scale: 16.0 2023-10-03 17:45:38,195 WARNING [train.py:1204] (2/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_189-298172-0_sp1.1 from training. Duration: 0.963625 2023-10-03 17:45:41,669 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7615_210_4211_1_1528110965_4580160_72-235211-0 from training. Duration: 0.76 2023-10-03 17:45:41,711 WARNING [train.py:1204] (2/4) Exclude cut with ID _47790_210_16187_1_1533862814066_4207519_155-90874-0_sp1.1 from training. Duration: 0.936375 2023-10-03 17:45:44,617 WARNING [train.py:1204] (2/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_164-216310-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 17:45:47,264 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_46013_210_7234_1_1533776362_3744032_463-52378-0 from training. Duration: 0.86 2023-10-03 17:45:47,267 WARNING [train.py:1204] (2/4) Exclude cut with ID _61133_210_9969_1_1535196777227_3696409_333-270325-0 from training. Duration: 0.93 2023-10-03 17:45:51,755 WARNING [train.py:1204] (2/4) Exclude cut with ID _5172_210_1883_1_1527246062567_3577219_18-101380-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 17:45:51,763 WARNING [train.py:1204] (2/4) Exclude cut with ID _30800_210_22382_1_1532499347904_4515959_577-236858-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 17:45:53,206 WARNING [train.py:1204] (2/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_1006-177345-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 17:45:55,822 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6291_210_3273_1_1527683649_1078498_4-266348-0 from training. Duration: 0.78 2023-10-03 17:46:00,064 WARNING [train.py:1204] (2/4) Exclude cut with ID _55857_210_2346_1_1534644007831_6898209_745-98391-0 from training. Duration: 0.76 2023-10-03 17:46:02,693 WARNING [train.py:1204] (2/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_375-244976-0 from training. Duration: 0.75 2023-10-03 17:46:02,747 WARNING [train.py:1204] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_4-248404-0_sp1.1 from training. Duration: 0.97275 2023-10-03 17:46:03,010 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.balancer2.prob, batch_count=1352473.3333333333, ans=0.125 2023-10-03 17:46:04,696 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8453_210_4211_1_1528502940_8562730_596-306820-0 from training. Duration: 0.97 2023-10-03 17:46:07,914 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_211-339622-0_sp0.9 from training. Duration: 0.5 2023-10-03 17:46:07,948 WARNING [train.py:1204] (2/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_393-57888-0 from training. Duration: 0.64 2023-10-03 17:46:08,101 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.bypass_mid.scale_min, batch_count=1352540.0, ans=0.2 2023-10-03 17:46:09,364 WARNING [train.py:1204] (2/4) Exclude cut with ID _23825_210_13449_1_1531915313526_3497150_143-343575-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 17:46:10,662 WARNING [train.py:1204] (2/4) Exclude cut with ID _58605_210_12210_1_1534723195939_3904259_36-12256-0_sp1.1 from training. Duration: 0.936375 2023-10-03 17:46:18,850 WARNING [train.py:1204] (2/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_38-67795-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 17:46:20,278 WARNING [train.py:1204] (2/4) Exclude cut with ID _64631_210_27767_1_1535349580068_4165160_308-327832-0_sp1.1 from training. Duration: 0.92725 2023-10-03 17:46:20,332 WARNING [train.py:1204] (2/4) Exclude cut with ID _37086_210_9038_1_1532999540891_7021250_726-81176-0_sp1.1 from training. Duration: 0.936375 2023-10-03 17:46:20,387 WARNING [train.py:1204] (2/4) Exclude cut with ID _47790_210_16187_1_1533862814066_4207519_47-141521-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 17:46:20,391 WARNING [train.py:1204] (2/4) Exclude cut with ID _3067_210_2667_1_1526212974640_4503168_250-285937-0 from training. Duration: 0.8 2023-10-03 17:46:20,418 WARNING [train.py:1204] (2/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_239-24000-0_sp1.1 from training. Duration: 0.97275 2023-10-03 17:46:21,781 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_3371_210_1794_1_1526384462_5580777_396-339812-0_sp0.9 from training. Duration: 0.9444375 2023-10-03 17:46:21,828 WARNING [train.py:1204] (2/4) Exclude cut with ID _16909_210_5724_1_1531292079912_4118919_580-173454-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 17:46:23,163 WARNING [train.py:1204] (2/4) Exclude cut with ID IC0124W0002-61308 from training. Duration: 0.982 2023-10-03 17:46:26,044 WARNING [train.py:1204] (2/4) Exclude cut with ID _31602_210_5854_1_1532677021456_4458250_249-96604-0 from training. Duration: 0.89 2023-10-03 17:46:27,352 WARNING [train.py:1204] (2/4) Exclude cut with ID _69584_210_5219_1_1535785133775_4044450_325-7656-0 from training. Duration: 0.86 2023-10-03 17:46:27,935 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.3.encoder.layers.3.feed_forward3.out_whiten, num_groups=1, num_channels=512, metric=12.95 vs. limit=15.0 2023-10-03 17:46:28,750 WARNING [train.py:1204] (2/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_478-162247-0 from training. Duration: 0.82 2023-10-03 17:46:30,715 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.4.encoder.layers.1.self_attn2.whiten, num_groups=1, num_channels=384, metric=11.74 vs. limit=22.5 2023-10-03 17:46:31,447 WARNING [train.py:1204] (2/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_686-54383-0_sp1.1 from training. Duration: 0.7909375 2023-10-03 17:46:37,650 WARNING [train.py:1204] (2/4) Exclude cut with ID _29303_210_2575_1_1532392237303_7410649_85-270092-0_sp1.1 from training. Duration: 0.936375 2023-10-03 17:46:37,666 WARNING [train.py:1204] (2/4) Exclude cut with ID _2322_210_2667_1_1525604548971_3656299_440-80494-0_sp1.1 from training. Duration: 0.8 2023-10-03 17:46:39,483 WARNING [train.py:1204] (2/4) Exclude cut with ID _47346_210_9476_1_1533815971553_3536160_201-40644-0_sp1.1 from training. Duration: 0.936375 2023-10-03 17:46:40,831 WARNING [train.py:1204] (2/4) Exclude cut with ID _56235_210_10640_1_1534557335235_4161099_343-139898-0_sp1.1 from training. Duration: 0.87275 2023-10-03 17:46:42,279 WARNING [train.py:1204] (2/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_1012-279473-0 from training. Duration: 0.67 2023-10-03 17:46:43,651 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7931_210_1794_1_1528594728_5564059_280-279560-0_sp1.1 from training. Duration: 0.8909375 2023-10-03 17:46:43,687 WARNING [train.py:1204] (2/4) Exclude cut with ID _8263_210_1664_1_1528439452891_970220_68-181805-0 from training. Duration: 0.64 2023-10-03 17:46:44,964 WARNING [train.py:1204] (2/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_605-133395-0 from training. Duration: 0.66 2023-10-03 17:46:46,955 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_43-171109-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 17:46:46,958 WARNING [train.py:1204] (2/4) Exclude cut with ID _40094_210_16380_1_1533294005773_3641159_107-280739-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 17:46:48,843 WARNING [train.py:1204] (2/4) Exclude cut with ID _14821_210_8750_1_1530680503991_6829359_2-227151-0_sp0.9 from training. Duration: 0.92225 2023-10-03 17:46:50,367 WARNING [train.py:1204] (2/4) Exclude cut with ID _14268_210_9206_1_1530622771285_4714099_331-105351-0_sp1.1 from training. Duration: 0.5909375 2023-10-03 17:46:50,793 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.5.encoder.layers.0.feed_forward2.out_whiten, num_groups=1, num_channels=256, metric=7.82 vs. limit=15.0 2023-10-03 17:46:53,101 INFO [train.py:1046] (2/4) Epoch 39, batch 1050, loss[loss=0.1532, simple_loss=0.2424, pruned_loss=0.03194, over 24648.00 frames. ], tot_loss[loss=0.1566, simple_loss=0.2356, pruned_loss=0.03882, over 4683251.57 frames. ], batch size: 73, lr: 2.62e-03, grad_scale: 16.0 2023-10-03 17:46:53,151 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_26326_210_7925_1_1533002493_3592406_451-82741-0_sp1.1 from training. Duration: 0.97275 2023-10-03 17:46:55,894 WARNING [train.py:1204] (2/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_191-59784-0_sp0.9 from training. Duration: 0.97775 2023-10-03 17:46:57,292 WARNING [train.py:1204] (2/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_445-117398-0_sp1.1 from training. Duration: 0.8090625 2023-10-03 17:46:58,710 WARNING [train.py:1204] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_605-257205-0_sp0.9 from training. Duration: 0.6555625 2023-10-03 17:46:58,774 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_50050_210_16223_1_1535435796_7290138_592-258841-0_sp1.1 from training. Duration: 0.936375 2023-10-03 17:47:00,164 WARNING [train.py:1204] (2/4) Exclude cut with ID _14348_210_8341_1_1530599434996_3778640_240-232370-0_sp0.9 from training. Duration: 0.9333125 2023-10-03 17:47:01,599 WARNING [train.py:1204] (2/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_239-316802-0_sp0.9 from training. Duration: 0.7666875 2023-10-03 17:47:04,319 WARNING [train.py:1204] (2/4) Exclude cut with ID _45560_210_21982_1_1533812603951_3584999_111-6376-0_sp1.1 from training. Duration: 0.763625 2023-10-03 17:47:05,473 INFO [scaling.py:1022] (2/4) Whitening: name=encoder_embed.convnext.out_whiten, num_groups=1, num_channels=128, metric=4.78 vs. limit=5.0 2023-10-03 17:47:06,186 WARNING [train.py:1204] (2/4) Exclude cut with ID _54189_210_16380_1_1534332670039_3911710_606-311481-0_sp1.1 from training. Duration: 0.963625 2023-10-03 17:47:07,579 WARNING [train.py:1204] (2/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_237-332027-0_sp1.1 from training. Duration: 0.863625 2023-10-03 17:47:08,863 WARNING [train.py:1204] (2/4) Exclude cut with ID _66070_210_7640_1_1535441393096_7805310_489-292631-0_sp0.9 from training. Duration: 0.87775 2023-10-03 17:47:09,075 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.conv_skip_rate, batch_count=1352806.6666666667, ans=0.0 2023-10-03 17:47:10,306 WARNING [train.py:1204] (2/4) Exclude cut with ID _49728_210_12651_1_1534041034294_6788150_624-195301-0_sp1.1 from training. Duration: 0.77275 2023-10-03 17:47:10,364 WARNING [train.py:1204] (2/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_44-79427-0 from training. Duration: 0.98 2023-10-03 17:47:12,306 WARNING [train.py:1204] (2/4) Exclude cut with ID _35350_210_16040_1_1533040191183_4075299_490-198354-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 17:47:13,585 WARNING [train.py:1204] (2/4) Exclude cut with ID _5166_210_2084_1_1527073197160_4087993_309-222545-0 from training. Duration: 0.84 2023-10-03 17:47:13,759 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_13091_210_7925_1_1530609447_8987354_790-199469-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 17:47:13,765 WARNING [train.py:1204] (2/4) Exclude cut with ID _21632_210_2253_1_1531994001765_3744140_110-274654-0 from training. Duration: 0.82 2023-10-03 17:47:13,777 WARNING [train.py:1204] (2/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_498-40153-0_sp0.9 from training. Duration: 0.7 2023-10-03 17:47:19,261 INFO [scaling.py:1118] (2/4) WithLoss: name=encoder.encoders.1.encoder.layers.0.self_attn_weights, loss-sum=0.000e+00 2023-10-03 17:47:20,509 WARNING [train.py:1204] (2/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_363-282909-0_sp1.1 from training. Duration: 0.936375 2023-10-03 17:47:21,672 WARNING [train.py:1204] (2/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_178-177582-0_sp0.9 from training. Duration: 0.82225 2023-10-03 17:47:21,692 WARNING [train.py:1204] (2/4) Exclude cut with ID _24612_210_8913_1_1531998054193_3806359_357-35487-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 17:47:24,357 WARNING [train.py:1204] (2/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_138-332773-0 from training. Duration: 0.81 2023-10-03 17:47:24,377 WARNING [train.py:1204] (2/4) Exclude cut with ID _7624_210_1531_1_1528109802198_4933101_418-349834-0 from training. Duration: 0.6 2023-10-03 17:47:25,570 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.619e+02 1.919e+02 2.049e+02 2.517e+02 3.582e+02, threshold=4.099e+02, percent-clipped=0.0 2023-10-03 17:47:25,639 WARNING [train.py:1204] (2/4) Exclude cut with ID _7537_210_2084_1_1528502395431_3924359_14-312906-0_sp0.9 from training. Duration: 0.9333125 2023-10-03 17:47:27,101 WARNING [train.py:1204] (2/4) Exclude cut with ID _60057_210_10640_1_1534903065793_3333150_362-230377-0 from training. Duration: 0.87 2023-10-03 17:47:27,318 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.1.ff2_skip_rate, batch_count=1352873.3333333333, ans=0.0 2023-10-03 17:47:28,627 WARNING [train.py:1204] (2/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_138-64210-0 from training. Duration: 0.73 2023-10-03 17:47:30,005 WARNING [train.py:1204] (2/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_454-339504-0_sp1.1 from training. Duration: 0.97275 2023-10-03 17:47:34,473 WARNING [train.py:1204] (2/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_508-335369-0_sp0.9 from training. Duration: 0.5333125 2023-10-03 17:47:35,949 WARNING [train.py:1204] (2/4) Exclude cut with ID _5608_210_2106_1_1527415439089_6819768_909-82767-0_sp0.9 from training. Duration: 0.67775 2023-10-03 17:47:37,257 WARNING [train.py:1204] (2/4) Exclude cut with ID _6694_210_4599_1_1527852637490_3965529_99-11611-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 17:47:38,584 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_2655_210_2087_1_1525688959_3897400_52-279899-0_sp1.1 from training. Duration: 0.62725 2023-10-03 17:47:41,510 WARNING [train.py:1204] (2/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_375-245677-0_sp1.1 from training. Duration: 0.736375 2023-10-03 17:47:41,637 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.ff2_skip_rate, batch_count=1352940.0, ans=0.0 2023-10-03 17:47:46,885 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_23850_210_4238_1_1532232597_1632433_32-185135-0 from training. Duration: 0.86 2023-10-03 17:47:48,887 WARNING [train.py:1204] (2/4) Exclude cut with ID _15113_210_8254_1_1530842438579_3716279_209-54533-0 from training. Duration: 0.96 2023-10-03 17:47:48,927 WARNING [train.py:1204] (2/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_401-53423-0 from training. Duration: 0.55 2023-10-03 17:47:48,958 WARNING [train.py:1204] (2/4) Exclude cut with ID _14867_210_1968_1_1530684179890_3846969_319-250868-0_sp1.1 from training. Duration: 0.9 2023-10-03 17:47:50,263 WARNING [train.py:1204] (2/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_106-30782-0_sp0.9 from training. Duration: 0.888875 2023-10-03 17:47:51,742 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_5960_210_1794_1_1527403865_3990999_515-2613-0 from training. Duration: 0.88 2023-10-03 17:47:55,232 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.3.encoder.layers.1.nonlin_attention.whiten1, num_groups=1, num_channels=384, metric=5.36 vs. limit=10.0 2023-10-03 17:47:55,952 WARNING [train.py:1204] (2/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_798-106179-0_sp0.9 from training. Duration: 0.92225 2023-10-03 17:47:57,363 WARNING [train.py:1204] (2/4) Exclude cut with ID _14227_210_4381_1_1530701804286_3940259_125-275642-0_sp1.1 from training. Duration: 0.9 2023-10-03 17:47:57,365 WARNING [train.py:1204] (2/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_79-200392-0_sp1.1 from training. Duration: 0.8818125 2023-10-03 17:47:57,411 WARNING [train.py:1204] (2/4) Exclude cut with ID _48836_210_12210_1_1534075848293_3904150_429-35475-0_sp0.9 from training. Duration: 0.988875 2023-10-03 17:47:57,426 WARNING [train.py:1204] (2/4) Exclude cut with ID _69884_210_9771_1_1535846392818_4166169_224-172517-0_sp1.1 from training. Duration: 0.97275 2023-10-03 17:48:01,585 WARNING [train.py:1204] (2/4) Exclude cut with ID _15113_210_8254_1_1530842438579_3716279_76-232507-0_sp1.1 from training. Duration: 0.97275 2023-10-03 17:48:01,588 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_135-5501-0 from training. Duration: 0.98 2023-10-03 17:48:03,025 WARNING [train.py:1204] (2/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_563-347832-0_sp0.9 from training. Duration: 0.988875 2023-10-03 17:48:04,832 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6786_210_1388_1_1527839699_4194495_273-149841-0 from training. Duration: 0.92 2023-10-03 17:48:04,858 WARNING [train.py:1204] (2/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_172-215932-0 from training. Duration: 0.66 2023-10-03 17:48:04,923 WARNING [train.py:1204] (2/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_939-132910-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 17:48:06,601 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.conv_module2.balancer1.min_positive, batch_count=1353073.3333333333, ans=0.025 2023-10-03 17:48:07,645 INFO [train.py:1046] (2/4) Epoch 39, batch 1100, loss[loss=0.1521, simple_loss=0.2264, pruned_loss=0.03894, over 24385.00 frames. ], tot_loss[loss=0.1566, simple_loss=0.2357, pruned_loss=0.03875, over 4689185.09 frames. ], batch size: 58, lr: 2.62e-03, grad_scale: 8.0 2023-10-03 17:48:09,055 WARNING [train.py:1204] (2/4) Exclude cut with ID _49371_210_12071_1_1533952761021_3812190_16-246001-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 17:48:14,435 WARNING [train.py:1204] (2/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_680-189165-0_sp1.1 from training. Duration: 0.936375 2023-10-03 17:48:17,144 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.3.encoder.layers.3.feed_forward2.out_whiten, num_groups=1, num_channels=512, metric=4.77 vs. limit=15.0 2023-10-03 17:48:19,047 WARNING [train.py:1204] (2/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_53-300544-0_sp1.1 from training. Duration: 0.6090625 2023-10-03 17:48:20,134 WARNING [train.py:1204] (2/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_540-275685-0_sp1.1 from training. Duration: 0.8090625 2023-10-03 17:48:21,604 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_38965_210_4852_1_1533174906_3484735_453-106579-0_sp1.1 from training. Duration: 0.92725 2023-10-03 17:48:21,640 WARNING [train.py:1204] (2/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_105-328174-0 from training. Duration: 0.76 2023-10-03 17:48:21,872 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.conv_module1.balancer2.prob, batch_count=1353140.0, ans=0.125 2023-10-03 17:48:23,036 WARNING [train.py:1204] (2/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_279-126205-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 17:48:24,472 WARNING [train.py:1204] (2/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_5-55893-0_sp0.9 from training. Duration: 0.611125 2023-10-03 17:48:27,203 WARNING [train.py:1204] (2/4) Exclude cut with ID _13678_210_7497_1_1530530069226_3463170_141-10835-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 17:48:27,244 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder_embed.convnext.hidden_balancer.prob, batch_count=1353140.0, ans=0.125 2023-10-03 17:48:30,048 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.conv_module2.balancer1.prob, batch_count=1353140.0, ans=0.125 2023-10-03 17:48:31,215 WARNING [train.py:1204] (2/4) Exclude cut with ID _22083_210_8254_1_1531702862439_3678970_201-85738-0_sp1.1 from training. Duration: 0.8181875 2023-10-03 17:48:32,505 WARNING [train.py:1204] (2/4) Exclude cut with ID _4697_210_2069_1_1526815801953_3823098_51-1049-0 from training. Duration: 0.75 2023-10-03 17:48:32,685 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.conv_module2.balancer2.min_positive, batch_count=1353140.0, ans=0.05 2023-10-03 17:48:33,955 WARNING [train.py:1204] (2/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_206-10097-0_sp0.9 from training. Duration: 0.5444375 2023-10-03 17:48:34,040 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_4528_210_3273_1_1527076654_3276777_360-63661-0_sp1.1 from training. Duration: 0.92725 2023-10-03 17:48:34,056 WARNING [train.py:1204] (2/4) Exclude cut with ID _64086_210_7994_1_1535270417019_7435949_449-31567-0_sp1.1 from training. Duration: 0.8818125 2023-10-03 17:48:36,820 WARNING [train.py:1204] (2/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_245-174553-0_sp0.9 from training. Duration: 0.97775 2023-10-03 17:48:38,851 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6545_210_2040_1_1527774670_4245258_379-14182-0_sp1.1 from training. Duration: 0.72725 2023-10-03 17:48:42,946 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_3133_210_2087_1_1526106446_2324690_256-206070-0_sp1.1 from training. Duration: 0.963625 2023-10-03 17:48:44,579 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.self_attn_weights.pos_emb_skip_rate, batch_count=1353206.6666666667, ans=0.0 2023-10-03 17:48:44,606 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.conv_module1.balancer1.prob, batch_count=1353206.6666666667, ans=0.125 2023-10-03 17:48:45,662 WARNING [train.py:1204] (2/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_278-104676-0 from training. Duration: 0.98 2023-10-03 17:48:45,734 WARNING [train.py:1204] (2/4) Exclude cut with ID IC0671W0065-331908 from training. Duration: 0.896 2023-10-03 17:48:47,434 WARNING [train.py:1204] (2/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_244-129917-0_sp1.1 from training. Duration: 0.97275 2023-10-03 17:48:48,859 WARNING [train.py:1204] (2/4) Exclude cut with ID _7852_210_5219_1_1528286471316_8139400_1129-170857-0_sp1.1 from training. Duration: 0.97275 2023-10-03 17:48:51,245 WARNING [train.py:1204] (2/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_443-30034-0_sp0.9 from training. Duration: 0.711125 2023-10-03 17:48:51,275 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_31583_210_3852_1_1532595851_4024413_250-332999-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 17:48:52,559 WARNING [train.py:1204] (2/4) Exclude cut with ID _8772_210_1850_1_1528635595972_3664011_247-336448-0 from training. Duration: 0.74 2023-10-03 17:48:52,738 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.bypass.skip_rate, batch_count=1353273.3333333333, ans=0.04949747468305833 2023-10-03 17:48:53,846 WARNING [train.py:1204] (2/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_567-135114-0_sp0.9 from training. Duration: 0.9666875 2023-10-03 17:48:53,850 WARNING [train.py:1204] (2/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_557-349534-0_sp1.1 from training. Duration: 0.82725 2023-10-03 17:48:53,862 WARNING [train.py:1204] (2/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_1341-26728-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 17:48:53,933 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_40222_210_6228_1_1533385298_4481939_375-328051-0_sp1.1 from training. Duration: 0.97275 2023-10-03 17:48:53,987 WARNING [train.py:1204] (2/4) Exclude cut with ID _4697_210_2069_1_1526815801953_3823098_276-96593-0 from training. Duration: 0.77 2023-10-03 17:48:59,647 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_53071_210_16554_1_1534490405_2954066_265-170472-0_sp0.9 from training. Duration: 0.92225 2023-10-03 17:48:59,669 WARNING [train.py:1204] (2/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_365-1492-0 from training. Duration: 0.91 2023-10-03 17:49:01,000 WARNING [train.py:1204] (2/4) Exclude cut with ID _66873_210_5517_1_1535454136506_4293370_654-52713-0_sp0.9 from training. Duration: 0.8555625 2023-10-03 17:49:05,523 WARNING [train.py:1204] (2/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_113-92608-0_sp1.1 from training. Duration: 0.4909375 2023-10-03 17:49:08,697 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_46013_210_7234_1_1533776362_3744032_50-141535-0 from training. Duration: 0.99 2023-10-03 17:49:08,714 WARNING [train.py:1204] (2/4) Exclude cut with ID _13942_210_9988_1_1530604906036_3733360_499-55676-0_sp1.1 from training. Duration: 0.563625 2023-10-03 17:49:10,108 WARNING [train.py:1204] (2/4) Exclude cut with ID _30800_210_22382_1_1532499347904_4515959_375-65143-0_sp1.1 from training. Duration: 0.97275 2023-10-03 17:49:11,708 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.balancer2.prob, batch_count=1353340.0, ans=0.125 2023-10-03 17:49:11,731 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.bypass.scale_min, batch_count=1353340.0, ans=0.2 2023-10-03 17:49:12,932 WARNING [train.py:1204] (2/4) Exclude cut with ID _16756_210_4915_1_1531047803763_7104259_199-325608-0_sp1.1 from training. Duration: 0.92725 2023-10-03 17:49:12,969 WARNING [train.py:1204] (2/4) Exclude cut with ID _31649_210_6228_1_1532584608063_4091078_443-124391-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 17:49:14,331 WARNING [train.py:1204] (2/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_146-218763-0 from training. Duration: 0.86 2023-10-03 17:49:16,146 WARNING [train.py:1204] (2/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_567-135114-0_sp1.1 from training. Duration: 0.7909375 2023-10-03 17:49:16,201 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_26326_210_7925_1_1533002493_3592406_462-288299-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 17:49:16,359 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.conv_module1.balancer1.prob, batch_count=1353340.0, ans=0.125 2023-10-03 17:49:18,973 WARNING [train.py:1204] (2/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_322-217220-0 from training. Duration: 0.86 2023-10-03 17:49:19,009 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_238-256668-0_sp1.1 from training. Duration: 0.62725 2023-10-03 17:49:20,880 WARNING [train.py:1204] (2/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_371-18169-0 from training. Duration: 0.86 2023-10-03 17:49:22,617 INFO [train.py:1046] (2/4) Epoch 39, batch 1150, loss[loss=0.1439, simple_loss=0.2292, pruned_loss=0.02927, over 24429.00 frames. ], tot_loss[loss=0.1567, simple_loss=0.236, pruned_loss=0.03875, over 4689806.64 frames. ], batch size: 63, lr: 2.62e-03, grad_scale: 8.0 2023-10-03 17:49:22,673 WARNING [train.py:1204] (2/4) Exclude cut with ID _58480_210_12392_1_1534811121111_3646160_23-74512-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 17:49:22,678 WARNING [train.py:1204] (2/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_784-213099-0_sp1.1 from training. Duration: 0.8454375 2023-10-03 17:49:22,770 WARNING [train.py:1204] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_625-102955-0_sp1.1 from training. Duration: 0.7 2023-10-03 17:49:29,609 WARNING [train.py:1204] (2/4) Exclude cut with ID _49748_210_24116_1_1533986971737_3158910_176-174327-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 17:49:29,895 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=1353406.6666666667, ans=0.1 2023-10-03 17:49:31,110 WARNING [train.py:1204] (2/4) Exclude cut with ID _50310_210_15488_1_1534068043345_3641080_432-96140-0_sp1.1 from training. Duration: 0.936375 2023-10-03 17:49:32,478 WARNING [train.py:1204] (2/4) Exclude cut with ID _40402_210_18219_1_1533522324115_2953150_76-14910-0_sp1.1 from training. Duration: 0.92725 2023-10-03 17:49:32,518 WARNING [train.py:1204] (2/4) Exclude cut with ID _21783_210_11357_1_1531742458730_3762679_40-19458-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 17:49:32,552 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_14056_210_4163_1_1530604483_7572827_46-93547-0 from training. Duration: 0.95 2023-10-03 17:49:33,881 WARNING [train.py:1204] (2/4) Exclude cut with ID _21631_210_2253_1_1531907880778_3160339_321-35325-0_sp1.1 from training. Duration: 0.963625 2023-10-03 17:49:35,352 WARNING [train.py:1204] (2/4) Exclude cut with ID _4779_210_2084_1_1527318022944_3914893_500-285013-0 from training. Duration: 0.69 2023-10-03 17:49:36,660 WARNING [train.py:1204] (2/4) Exclude cut with ID _43552_210_8913_1_1533726031059_3523429_301-93324-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 17:49:36,666 WARNING [train.py:1204] (2/4) Exclude cut with ID _6473_210_1741_1_1527901307396_3815380_108-17335-0_sp1.1 from training. Duration: 0.5909375 2023-10-03 17:49:42,658 WARNING [train.py:1204] (2/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_206-10097-0 from training. Duration: 0.49 2023-10-03 17:49:44,051 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_36756_210_7234_1_1533026991_966276_103-77513-0_sp1.1 from training. Duration: 0.97275 2023-10-03 17:49:48,609 WARNING [train.py:1204] (2/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_662-18193-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 17:49:49,928 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_26433_210_12108_1_1532157158_4056480_439-52438-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 17:49:49,979 WARNING [train.py:1204] (2/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_464-33931-0 from training. Duration: 0.54 2023-10-03 17:49:49,985 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_3474_210_2028_1_1526188513_4825246_145-309131-0_sp0.9 from training. Duration: 0.82225 2023-10-03 17:49:50,002 WARNING [train.py:1204] (2/4) Exclude cut with ID _9679_210_7497_1_1529129212906_4363929_446-308321-0_sp1.1 from training. Duration: 0.963625 2023-10-03 17:49:55,046 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.525e+02 1.879e+02 2.025e+02 2.261e+02 4.283e+02, threshold=4.051e+02, percent-clipped=1.0 2023-10-03 17:49:55,179 WARNING [train.py:1204] (2/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_662-275621-0 from training. Duration: 0.42 2023-10-03 17:49:56,459 WARNING [train.py:1204] (2/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_344-337148-0_sp1.1 from training. Duration: 0.97275 2023-10-03 17:49:57,807 WARNING [train.py:1204] (2/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_759-179071-0_sp1.1 from training. Duration: 0.92725 2023-10-03 17:50:03,552 WARNING [train.py:1204] (2/4) Exclude cut with ID _28783_210_7943_1_1532671164808_7155479_293-12188-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 17:50:10,083 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_17613_210_2151_1_1531137569_2366160_235-320304-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 17:50:10,108 WARNING [train.py:1204] (2/4) Exclude cut with ID _14349_210_8341_1_1530615710818_3442470_170-336541-0 from training. Duration: 0.86 2023-10-03 17:50:10,359 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.attention_skip_rate, batch_count=1353606.6666666667, ans=0.0 2023-10-03 17:50:11,537 WARNING [train.py:1204] (2/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_375-218677-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 17:50:11,587 WARNING [train.py:1204] (2/4) Exclude cut with ID _28317_210_12742_1_1532480302381_8510325_829-202956-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 17:50:18,343 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9029W0436-509823 from training. Duration: 0.768 2023-10-03 17:50:21,591 WARNING [train.py:1204] (2/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_231-112214-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 17:50:27,755 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9059W0470-524883 from training. Duration: 0.3415625 2023-10-03 17:50:31,884 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_43266_210_1794_1_1533521789_4497218_206-79512-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 17:50:33,199 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_294-163793-0_sp0.9 from training. Duration: 0.911125 2023-10-03 17:50:33,219 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6290_210_3273_1_1527595224_1282787_115-87095-0_sp1.1 from training. Duration: 0.5 2023-10-03 17:50:34,540 WARNING [train.py:1204] (2/4) Exclude cut with ID _14100_210_8341_1_1530529320613_3678069_279-55760-0_sp1.1 from training. Duration: 0.8454375 2023-10-03 17:50:35,958 INFO [train.py:1046] (2/4) Epoch 39, batch 1200, loss[loss=0.1573, simple_loss=0.25, pruned_loss=0.03228, over 24320.00 frames. ], tot_loss[loss=0.1576, simple_loss=0.2371, pruned_loss=0.03908, over 4681118.71 frames. ], batch size: 74, lr: 2.62e-03, grad_scale: 16.0 2023-10-03 17:50:38,773 WARNING [train.py:1204] (2/4) Exclude cut with ID _14621_210_1925_1_1530686901185_3675420_419-234200-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 17:50:41,669 WARNING [train.py:1204] (2/4) Exclude cut with ID _60229_210_27767_1_1534838406158_3655350_353-213656-0_sp1.1 from training. Duration: 0.863625 2023-10-03 17:50:41,678 WARNING [train.py:1204] (2/4) Exclude cut with ID _48678_210_5663_1_1533988830947_3769199_306-255886-0_sp1.1 from training. Duration: 0.7 2023-10-03 17:50:41,851 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.feed_forward1.hidden_balancer.prob, batch_count=1353740.0, ans=0.125 2023-10-03 17:50:43,079 WARNING [train.py:1204] (2/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_466-3492-0_sp1.1 from training. Duration: 0.97275 2023-10-03 17:50:43,080 WARNING [train.py:1204] (2/4) Exclude cut with ID _8755_210_1876_1_1528628559357_3258209_199-242167-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 17:50:43,141 WARNING [train.py:1204] (2/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_237-89684-0_sp1.1 from training. Duration: 0.87275 2023-10-03 17:50:45,797 WARNING [train.py:1204] (2/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_70-84121-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 17:50:46,012 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.bypass.skip_rate, batch_count=1353740.0, ans=0.07 2023-10-03 17:50:48,922 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7544_210_6753_1_1528587722_3576686_172-171750-0_sp1.1 from training. Duration: 0.5909375 2023-10-03 17:50:50,269 WARNING [train.py:1204] (2/4) Exclude cut with ID _24271_210_12658_1_1531987270022_7190519_40-321822-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 17:50:50,293 WARNING [train.py:1204] (2/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_227-83612-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 17:50:53,999 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9156W0015-572508 from training. Duration: 0.896 2023-10-03 17:50:57,300 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_292-265928-0 from training. Duration: 0.51 2023-10-03 17:51:00,092 WARNING [train.py:1204] (2/4) Exclude cut with ID _14747_210_8341_1_1530685797463_3654089_147-131229-0_sp1.1 from training. Duration: 0.6909375 2023-10-03 17:51:02,820 WARNING [train.py:1204] (2/4) Exclude cut with ID _45297_210_2721_1_1533715351381_3264949_351-147136-0_sp0.9 from training. Duration: 0.9666875 2023-10-03 17:51:04,258 WARNING [train.py:1204] (2/4) Exclude cut with ID _5250_210_5219_1_1527249561806_8692110_1102-324568-0_sp1.1 from training. Duration: 0.97275 2023-10-03 17:51:04,379 WARNING [train.py:1204] (2/4) Exclude cut with ID _18516_210_9206_1_1531704653206_3692406_465-75446-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 17:51:04,381 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9203W0126-595623 from training. Duration: 0.896 2023-10-03 17:51:05,756 WARNING [train.py:1204] (2/4) Exclude cut with ID _55383_210_10635_1_1534468426172_2231669_139-328410-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 17:51:07,797 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.2.encoder.layers.0.feed_forward1.out_whiten, num_groups=1, num_channels=384, metric=5.91 vs. limit=15.0 2023-10-03 17:51:12,663 WARNING [train.py:1204] (2/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_166-246557-0_sp0.9 from training. Duration: 0.711125 2023-10-03 17:51:12,667 WARNING [train.py:1204] (2/4) Exclude cut with ID _30039_210_13270_1_1532484612900_2725150_239-304872-0_sp1.1 from training. Duration: 0.963625 2023-10-03 17:51:13,983 WARNING [train.py:1204] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_6-226750-0 from training. Duration: 0.94 2023-10-03 17:51:15,376 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_135-5501-0_sp1.1 from training. Duration: 0.8909375 2023-10-03 17:51:15,645 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.bypass_mid.scale_min, batch_count=1353873.3333333333, ans=0.2 2023-10-03 17:51:18,174 WARNING [train.py:1204] (2/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_359-118926-0 from training. Duration: 0.72 2023-10-03 17:51:23,306 WARNING [train.py:1204] (2/4) Exclude cut with ID _8064_210_5399_1_1528372299291_4282973_4-91037-0 from training. Duration: 0.88 2023-10-03 17:51:23,330 WARNING [train.py:1204] (2/4) Exclude cut with ID _6653_210_2629_1_1527919172441_6732679_415-288492-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 17:51:25,042 WARNING [train.py:1204] (2/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_544-287179-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 17:51:25,153 WARNING [train.py:1204] (2/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_122-32606-0_sp1.1 from training. Duration: 0.92725 2023-10-03 17:51:26,928 WARNING [train.py:1204] (2/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_121-284152-0_sp1.1 from training. Duration: 0.536375 2023-10-03 17:51:28,298 WARNING [train.py:1204] (2/4) Exclude cut with ID _44718_210_5816_1_1534050051869_6469030_350-299477-0_sp1.1 from training. Duration: 0.97275 2023-10-03 17:51:28,310 WARNING [train.py:1204] (2/4) Exclude cut with ID _6653_210_2629_1_1527919172441_6732679_1103-56445-0_sp0.9 from training. Duration: 0.811125 2023-10-03 17:51:28,358 WARNING [train.py:1204] (2/4) Exclude cut with ID _12369_210_4381_1_1530356078581_4388520_64-298917-0_sp0.9 from training. Duration: 0.97775 2023-10-03 17:51:28,417 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_2245_210_2715_1_1525511548_3540254_310-52483-0 from training. Duration: 0.88 2023-10-03 17:51:29,830 WARNING [train.py:1204] (2/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_401-158239-0_sp1.1 from training. Duration: 0.6454375 2023-10-03 17:51:29,883 WARNING [train.py:1204] (2/4) Exclude cut with ID _59502_210_12210_1_1535107024698_1835139_146-162495-0_sp1.1 from training. Duration: 0.836375 2023-10-03 17:51:29,894 WARNING [train.py:1204] (2/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_41-31157-0_sp0.9 from training. Duration: 0.6333125 2023-10-03 17:51:32,518 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_68717_210_3528_1_1535628683_1556759_95-36722-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 17:51:32,525 WARNING [train.py:1204] (2/4) Exclude cut with ID _30196_210_1664_1_1533708496522_7074140_654-162611-0_sp1.1 from training. Duration: 0.92725 2023-10-03 17:51:35,325 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_9302_210_2550_1_1528889962_4655416_638-301620-0_sp0.9 from training. Duration: 0.52225 2023-10-03 17:51:37,998 WARNING [train.py:1204] (2/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_478-162247-0_sp1.1 from training. Duration: 0.7454375 2023-10-03 17:51:39,521 WARNING [train.py:1204] (2/4) Exclude cut with ID _45297_210_2721_1_1533715351381_3264949_353-9541-0 from training. Duration: 0.97 2023-10-03 17:51:42,356 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9653W0190-675468 from training. Duration: 0.9935 2023-10-03 17:51:43,784 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_49730_210_13049_1_1534075294_1830676_214-84436-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 17:51:46,460 WARNING [train.py:1204] (2/4) Exclude cut with ID _50093_210_4220_1_1534417141207_3432299_259-322252-0_sp1.1 from training. Duration: 0.836375 2023-10-03 17:51:48,313 WARNING [train.py:1204] (2/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_599-50220-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 17:51:49,623 INFO [train.py:1046] (2/4) Epoch 39, batch 1250, loss[loss=0.1405, simple_loss=0.2206, pruned_loss=0.03019, over 24414.00 frames. ], tot_loss[loss=0.1583, simple_loss=0.2376, pruned_loss=0.03951, over 4690965.73 frames. ], batch size: 58, lr: 2.62e-03, grad_scale: 16.0 2023-10-03 17:51:49,755 WARNING [train.py:1204] (2/4) Exclude cut with ID _21878_210_3525_1_1531738772002_7264330_174-109016-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 17:51:55,215 WARNING [train.py:1204] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_665-196155-0 from training. Duration: 0.67 2023-10-03 17:51:59,400 WARNING [train.py:1204] (2/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_656-188361-0_sp1.1 from training. Duration: 0.7909375 2023-10-03 17:51:59,475 WARNING [train.py:1204] (2/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_60-18123-0_sp1.1 from training. Duration: 0.8 2023-10-03 17:52:00,791 WARNING [train.py:1204] (2/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_535-14410-0 from training. Duration: 0.47 2023-10-03 17:52:03,527 WARNING [train.py:1204] (2/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_94-126128-0_sp1.1 from training. Duration: 0.7818125 2023-10-03 17:52:03,773 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.conv_skip_rate, batch_count=1354140.0, ans=0.0 2023-10-03 17:52:04,935 WARNING [train.py:1204] (2/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_356-31216-0_sp1.1 from training. Duration: 0.6818125 2023-10-03 17:52:09,169 WARNING [train.py:1204] (2/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_443-30034-0_sp1.1 from training. Duration: 0.5818125 2023-10-03 17:52:09,248 WARNING [train.py:1204] (2/4) Exclude cut with ID _26907_210_7234_1_1532221740694_3205263_337-2494-0_sp1.1 from training. Duration: 0.8 2023-10-03 17:52:10,620 WARNING [train.py:1204] (2/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_49-97158-0_sp0.9 from training. Duration: 0.8444375 2023-10-03 17:52:10,637 WARNING [train.py:1204] (2/4) Exclude cut with ID _7877_210_6014_1_1528286358099_3750136_553-170128-0_sp1.1 from training. Duration: 0.963625 2023-10-03 17:52:12,039 WARNING [train.py:1204] (2/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_202-293523-0_sp0.9 from training. Duration: 0.8 2023-10-03 17:52:16,121 WARNING [train.py:1204] (2/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_188-171673-0_sp0.9 from training. Duration: 0.6444375 2023-10-03 17:52:16,145 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_120-41668-0_sp1.1 from training. Duration: 0.636375 2023-10-03 17:52:16,149 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_40223_210_6228_1_1533298404_4812267_613-78769-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 17:52:17,525 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_53861_210_5399_1_1534422156_4272077_150-336899-0_sp1.1 from training. Duration: 0.963625 2023-10-03 17:52:17,598 WARNING [train.py:1204] (2/4) Exclude cut with ID _14459_210_2228_1_1531443641946_7516700_1146-56843-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 17:52:21,936 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.534e+02 1.935e+02 2.085e+02 2.357e+02 3.026e+02, threshold=4.170e+02, percent-clipped=0.0 2023-10-03 17:52:23,827 WARNING [train.py:1204] (2/4) Exclude cut with ID _39867_210_9826_1_1533553222030_9344259_1194-299928-0_sp1.1 from training. Duration: 0.92725 2023-10-03 17:52:23,908 WARNING [train.py:1204] (2/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_214-291369-0_sp0.9 from training. Duration: 0.7 2023-10-03 17:52:24,634 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.1.encoder.layers.0.conv_module2.whiten, num_groups=1, num_channels=256, metric=6.36 vs. limit=15.0 2023-10-03 17:52:28,594 WARNING [train.py:1204] (2/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_358-215558-0 from training. Duration: 0.93 2023-10-03 17:52:29,898 WARNING [train.py:1204] (2/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_577-287150-0_sp0.9 from training. Duration: 0.911125 2023-10-03 17:52:31,515 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.0.bypass.scale_min, batch_count=1354206.6666666667, ans=0.2 2023-10-03 17:52:32,666 WARNING [train.py:1204] (2/4) Exclude cut with ID _60860_210_8254_1_1534896015875_3557169_229-187367-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 17:52:33,415 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.3.encoder.layers.0.nonlin_attention.whiten2, num_groups=1, num_channels=512, metric=6.70 vs. limit=15.0 2023-10-03 17:52:34,070 WARNING [train.py:1204] (2/4) Exclude cut with ID _6275_210_2084_1_1528110062213_7669049_857-298942-0 from training. Duration: 0.85 2023-10-03 17:52:34,114 WARNING [train.py:1204] (2/4) Exclude cut with ID _40137_210_12210_1_1533290484046_3795180_249-187059-0_sp1.1 from training. Duration: 0.8 2023-10-03 17:52:34,121 WARNING [train.py:1204] (2/4) Exclude cut with ID ID0171W0508-765603 from training. Duration: 0.541 2023-10-03 17:52:34,138 WARNING [train.py:1204] (2/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_521-219439-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 17:52:34,144 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_40223_210_6228_1_1533298404_4812267_741-302616-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 17:52:38,277 WARNING [train.py:1204] (2/4) Exclude cut with ID _65293_210_9070_1_1535545549765_4004210_438-154735-0_sp1.1 from training. Duration: 0.92725 2023-10-03 17:52:39,774 WARNING [train.py:1204] (2/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_564-132950-0_sp1.1 from training. Duration: 0.92725 2023-10-03 17:52:40,965 WARNING [train.py:1204] (2/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_291-270881-0_sp1.1 from training. Duration: 0.8818125 2023-10-03 17:52:41,106 WARNING [train.py:1204] (2/4) Exclude cut with ID _60229_210_27767_1_1534838406158_3655350_353-213656-0 from training. Duration: 0.95 2023-10-03 17:52:41,121 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_3133_210_2087_1_1526106446_2324690_243-143119-0 from training. Duration: 0.77 2023-10-03 17:52:42,418 WARNING [train.py:1204] (2/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_690-126306-0 from training. Duration: 0.78 2023-10-03 17:52:44,013 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=1354273.3333333333, ans=0.1 2023-10-03 17:52:46,463 WARNING [train.py:1204] (2/4) Exclude cut with ID _30304_210_6319_1_1532476827261_7582060_347-95003-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 17:52:47,801 WARNING [train.py:1204] (2/4) Exclude cut with ID _2322_210_2667_1_1525604548971_3656299_440-80494-0 from training. Duration: 0.88 2023-10-03 17:52:47,836 WARNING [train.py:1204] (2/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_370-229434-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 17:52:49,394 WARNING [train.py:1204] (2/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_124-345840-0_sp1.1 from training. Duration: 0.47275 2023-10-03 17:52:49,402 WARNING [train.py:1204] (2/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_165-123581-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 17:52:52,634 WARNING [train.py:1204] (2/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_453-282588-0 from training. Duration: 0.98 2023-10-03 17:52:52,659 WARNING [train.py:1204] (2/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_299-34413-0_sp0.9 from training. Duration: 0.72225 2023-10-03 17:52:52,693 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_40036_210_13405_1_1533648647_1315388_48-146016-0_sp1.1 from training. Duration: 0.8454375 2023-10-03 17:52:52,719 WARNING [train.py:1204] (2/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_99-167392-0_sp0.9 from training. Duration: 0.67775 2023-10-03 17:52:54,447 WARNING [train.py:1204] (2/4) Exclude cut with ID _48677_210_5663_1_1533902405919_3892989_37-207114-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 17:52:54,565 WARNING [train.py:1204] (2/4) Exclude cut with ID _29857_210_20034_1_1532395886753_3798160_335-163562-0 from training. Duration: 0.85 2023-10-03 17:52:59,185 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_36492_210_7927_1_1533261987_4790834_703-343351-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 17:53:00,488 WARNING [train.py:1204] (2/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_80-147301-0_sp0.9 from training. Duration: 0.9333125 2023-10-03 17:53:01,978 WARNING [train.py:1204] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_218-295097-0_sp0.9 from training. Duration: 0.8555625 2023-10-03 17:53:03,232 INFO [train.py:1046] (2/4) Epoch 39, batch 1300, loss[loss=0.1461, simple_loss=0.2279, pruned_loss=0.03218, over 24303.00 frames. ], tot_loss[loss=0.1587, simple_loss=0.2382, pruned_loss=0.0396, over 4696350.28 frames. ], batch size: 61, lr: 2.62e-03, grad_scale: 16.0 2023-10-03 17:53:03,537 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.conv_module1.balancer2.prob, batch_count=1354406.6666666667, ans=0.125 2023-10-03 17:53:03,546 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.feed_forward1.hidden_balancer.prob, batch_count=1354406.6666666667, ans=0.125 2023-10-03 17:53:04,705 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_2295_210_2087_1_1525500993_2407863_116-238894-0_sp1.1 from training. Duration: 0.636375 2023-10-03 17:53:06,342 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.nonlin_attention.balancer.prob, batch_count=1354406.6666666667, ans=0.125 2023-10-03 17:53:07,537 WARNING [train.py:1204] (2/4) Exclude cut with ID _3231_210_2106_1_1526205864934_6784220_697-171068-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 17:53:07,556 WARNING [train.py:1204] (2/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_102-77064-0 from training. Duration: 0.93 2023-10-03 17:53:11,578 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_13091_210_7925_1_1530609447_8987354_1186-261957-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 17:53:12,987 WARNING [train.py:1204] (2/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_91-277684-0_sp1.1 from training. Duration: 0.67275 2023-10-03 17:53:14,255 WARNING [train.py:1204] (2/4) Exclude cut with ID _31088_210_16446_1_1532520012284_3440919_159-313751-0_sp1.1 from training. Duration: 0.936375 2023-10-03 17:53:15,643 WARNING [train.py:1204] (2/4) Exclude cut with ID _42326_210_7770_1_1533814282531_3707269_227-203721-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 17:53:17,134 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_3531_210_3528_1_1526294203_5216456_309-227302-0_sp1.1 from training. Duration: 0.62725 2023-10-03 17:53:17,197 WARNING [train.py:1204] (2/4) Exclude cut with ID _14087_210_2253_1_1530529224512_3014029_226-242140-0 from training. Duration: 0.81 2023-10-03 17:53:21,364 WARNING [train.py:1204] (2/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_122-109023-0_sp1.1 from training. Duration: 0.6909375 2023-10-03 17:53:21,458 WARNING [train.py:1204] (2/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_660-211088-0_sp0.9 from training. Duration: 0.688875 2023-10-03 17:53:23,438 WARNING [train.py:1204] (2/4) Exclude cut with ID _39290_210_4220_1_1533862778760_3075118_92-311600-0 from training. Duration: 0.88 2023-10-03 17:53:26,456 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.3.encoder.layers.1.feed_forward2.out_whiten, num_groups=1, num_channels=512, metric=6.74 vs. limit=15.0 2023-10-03 17:53:27,032 WARNING [train.py:1204] (2/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_390-339137-0_sp1.1 from training. Duration: 0.5090625 2023-10-03 17:53:29,816 WARNING [train.py:1204] (2/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_449-341946-0_sp1.1 from training. Duration: 0.963625 2023-10-03 17:53:31,215 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_68095_210_20185_1_1535718226_8888855_884-327007-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 17:53:32,684 WARNING [train.py:1204] (2/4) Exclude cut with ID _42326_210_7770_1_1533814282531_3707269_308-222998-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 17:53:34,024 WARNING [train.py:1204] (2/4) Exclude cut with ID _40399_210_4353_1_1533305658194_3279406_344-238708-0_sp1.1 from training. Duration: 0.963625 2023-10-03 17:53:35,322 WARNING [train.py:1204] (2/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_87-131427-0_sp1.1 from training. Duration: 0.7090625 2023-10-03 17:53:36,630 WARNING [train.py:1204] (2/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_110-130961-0_sp0.9 from training. Duration: 0.52225 2023-10-03 17:53:36,677 WARNING [train.py:1204] (2/4) Exclude cut with ID _55153_210_2253_1_1534384786677_3162096_148-79022-0 from training. Duration: 0.96 2023-10-03 17:53:40,947 WARNING [train.py:1204] (2/4) Exclude cut with ID _9679_210_7497_1_1529129212906_4363929_512-313889-0_sp1.1 from training. Duration: 0.736375 2023-10-03 17:53:40,966 WARNING [train.py:1204] (2/4) Exclude cut with ID _7970_210_1883_1_1528455616296_3472230_25-178407-0_sp1.1 from training. Duration: 0.6454375 2023-10-03 17:53:42,401 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_3133_210_2087_1_1526106446_2324690_246-125000-0 from training. Duration: 0.62 2023-10-03 17:53:43,720 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_15126_210_6753_1_1530778677_2591185_113-157651-0_sp0.9 from training. Duration: 0.7555625 2023-10-03 17:53:45,138 WARNING [train.py:1204] (2/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_105-63309-0_sp1.1 from training. Duration: 0.8909375 2023-10-03 17:53:47,845 WARNING [train.py:1204] (2/4) Exclude cut with ID _29877_210_6228_1_1532397791106_5276149_578-177271-0_sp1.1 from training. Duration: 0.936375 2023-10-03 17:53:49,160 WARNING [train.py:1204] (2/4) Exclude cut with ID _44831_210_13427_1_1533816011549_3689035_32-188147-0 from training. Duration: 0.96 2023-10-03 17:53:49,211 WARNING [train.py:1204] (2/4) Exclude cut with ID _12369_210_4381_1_1530356078581_4388520_300-254337-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 17:53:49,223 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_128-241008-0 from training. Duration: 0.94 2023-10-03 17:53:52,043 WARNING [train.py:1204] (2/4) Exclude cut with ID _21878_210_3525_1_1531738772002_7264330_53-279178-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 17:53:56,062 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_41452_210_7291_1_1533437687_3941281_273-334088-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 17:53:57,246 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_128-241008-0_sp1.1 from training. Duration: 0.8545625 2023-10-03 17:53:58,803 WARNING [train.py:1204] (2/4) Exclude cut with ID _8394_210_6702_1_1528590504316_3540189_379-49457-0 from training. Duration: 0.87 2023-10-03 17:54:01,726 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_105-114797-0 from training. Duration: 0.84 2023-10-03 17:54:01,825 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_14928_210_5632_1_1530773461_4714000_454-112198-0 from training. Duration: 0.66 2023-10-03 17:54:04,809 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.bypass.scale_min, batch_count=1354673.3333333333, ans=0.2 2023-10-03 17:54:04,884 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.attention_skip_rate, batch_count=1354673.3333333333, ans=0.0 2023-10-03 17:54:05,894 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_456-22141-0_sp1.1 from training. Duration: 0.8 2023-10-03 17:54:06,116 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.ff2_skip_rate, batch_count=1354673.3333333333, ans=0.0 2023-10-03 17:54:07,329 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_115-233929-0 from training. Duration: 0.72 2023-10-03 17:54:09,969 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_616-16795-0_sp1.1 from training. Duration: 0.963625 2023-10-03 17:54:14,353 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.bypass_mid.scale_min, batch_count=1354740.0, ans=0.2 2023-10-03 17:54:14,722 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.4.encoder.layers.0.self_attn2.whiten, num_groups=1, num_channels=384, metric=14.50 vs. limit=22.5 2023-10-03 17:54:15,385 INFO [train.py:1046] (2/4) Epoch 39, batch 1350, loss[loss=0.1519, simple_loss=0.2443, pruned_loss=0.02976, over 24281.00 frames. ], tot_loss[loss=0.158, simple_loss=0.2377, pruned_loss=0.03917, over 4716837.70 frames. ], batch size: 74, lr: 2.62e-03, grad_scale: 8.0 2023-10-03 17:54:15,527 WARNING [train.py:1204] (2/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_448-212135-0 from training. Duration: 0.91 2023-10-03 17:54:16,130 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.3.encoder.layers.1.nonlin_attention.whiten2, num_groups=1, num_channels=512, metric=6.62 vs. limit=15.0 2023-10-03 17:54:17,047 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.balancer2.prob, batch_count=1354740.0, ans=0.125 2023-10-03 17:54:18,740 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.4.encoder.layers.2.nonlin_attention.whiten1, num_groups=1, num_channels=288, metric=4.26 vs. limit=10.0 2023-10-03 17:54:19,473 WARNING [train.py:1204] (2/4) Exclude cut with ID _46683_210_9642_1_1533730913492_4591116_201-205677-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 17:54:20,888 WARNING [train.py:1204] (2/4) Exclude cut with ID _64086_210_7994_1_1535270417019_7435949_43-80483-0_sp1.1 from training. Duration: 0.97275 2023-10-03 17:54:23,063 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_240-134124-0_sp1.1 from training. Duration: 0.963625 2023-10-03 17:54:24,994 WARNING [train.py:1204] (2/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_907-120940-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 17:54:28,409 WARNING [train.py:1204] (2/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_848-280957-0_sp1.1 from training. Duration: 0.7909375 2023-10-03 17:54:28,450 WARNING [train.py:1204] (2/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_243-42116-0_sp0.9 from training. Duration: 0.811125 2023-10-03 17:54:32,344 WARNING [train.py:1204] (2/4) Exclude cut with ID _6694_210_4599_1_1527852637490_3965529_116-109398-0_sp0.9 from training. Duration: 0.811125 2023-10-03 17:54:32,454 WARNING [train.py:1204] (2/4) Exclude cut with ID _24314_210_4915_1_1532135078116_7170179_521-258868-0 from training. Duration: 0.79 2023-10-03 17:54:33,833 WARNING [train.py:1204] (2/4) Exclude cut with ID _2220_210_1819_1_1525509109898_3658680_50-99837-0_sp0.9 from training. Duration: 0.6 2023-10-03 17:54:33,876 WARNING [train.py:1204] (2/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_313-134930-0_sp1.1 from training. Duration: 0.8818125 2023-10-03 17:54:35,474 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.bypass.skip_rate, batch_count=1354806.6666666667, ans=0.07 2023-10-03 17:54:36,607 WARNING [train.py:1204] (2/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_125-144174-0 from training. Duration: 0.39 2023-10-03 17:54:37,947 WARNING [train.py:1204] (2/4) Exclude cut with ID _59288_210_5854_1_1535077757774_3412246_275-293897-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 17:54:39,468 WARNING [train.py:1204] (2/4) Exclude cut with ID _40519_210_13794_1_1533715228655_7337390_722-52880-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 17:54:39,489 WARNING [train.py:1204] (2/4) Exclude cut with ID _7537_210_2084_1_1528502395431_3924359_128-268029-0 from training. Duration: 0.98 2023-10-03 17:54:41,068 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.conv_skip_rate, batch_count=1354806.6666666667, ans=0.0 2023-10-03 17:54:42,250 WARNING [train.py:1204] (2/4) Exclude cut with ID _8021_210_7994_1_1528376451358_3653069_179-45206-0 from training. Duration: 0.86 2023-10-03 17:54:43,644 WARNING [train.py:1204] (2/4) Exclude cut with ID _13252_210_1925_1_1530671372047_3336909_88-276668-0 from training. Duration: 0.99 2023-10-03 17:54:46,270 WARNING [train.py:1204] (2/4) Exclude cut with ID _22613_210_5219_1_1532253599686_4090170_290-212862-0_sp1.1 from training. Duration: 0.97275 2023-10-03 17:54:46,275 WARNING [train.py:1204] (2/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_465-214726-0 from training. Duration: 0.86 2023-10-03 17:54:48,894 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.630e+02 1.867e+02 2.085e+02 2.496e+02 4.197e+02, threshold=4.170e+02, percent-clipped=1.0 2023-10-03 17:54:56,995 WARNING [train.py:1204] (2/4) Exclude cut with ID _56480_210_4220_1_1534679942079_3075409_138-36031-0_sp1.1 from training. Duration: 0.97275 2023-10-03 17:55:04,479 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.feed_forward1.hidden_balancer.prob, batch_count=1354940.0, ans=0.125 2023-10-03 17:55:04,520 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.attention_skip_rate, batch_count=1354940.0, ans=0.0 2023-10-03 17:55:05,730 WARNING [train.py:1204] (2/4) Exclude cut with ID _58605_210_12210_1_1534723195939_3904259_300-285878-0_sp1.1 from training. Duration: 0.97275 2023-10-03 17:55:05,756 WARNING [train.py:1204] (2/4) Exclude cut with ID _40901_210_5665_1_1533363789958_3856130_114-340229-0_sp1.1 from training. Duration: 0.936375 2023-10-03 17:55:05,796 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_489-230703-0 from training. Duration: 0.85 2023-10-03 17:55:08,523 WARNING [train.py:1204] (2/4) Exclude cut with ID _66873_210_5517_1_1535454136506_4293370_322-81243-0_sp1.1 from training. Duration: 0.936375 2023-10-03 17:55:09,926 WARNING [train.py:1204] (2/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_271-269084-0 from training. Duration: 0.83 2023-10-03 17:55:09,937 WARNING [train.py:1204] (2/4) Exclude cut with ID _13252_210_1925_1_1530671372047_3336909_166-314607-0_sp0.9 from training. Duration: 0.6 2023-10-03 17:55:10,006 WARNING [train.py:1204] (2/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_340-343302-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 17:55:12,697 WARNING [train.py:1204] (2/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_286-112015-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 17:55:14,099 WARNING [train.py:1204] (2/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_328-264776-0 from training. Duration: 0.67 2023-10-03 17:55:15,470 WARNING [train.py:1204] (2/4) Exclude cut with ID _4866_210_2577_1_1527404412284_4364659_256-306830-0_sp0.9 from training. Duration: 0.9666875 2023-10-03 17:55:18,482 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.balancer2.prob, batch_count=1355006.6666666667, ans=0.125 2023-10-03 17:55:20,999 WARNING [train.py:1204] (2/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_239-316802-0 from training. Duration: 0.69 2023-10-03 17:55:23,111 WARNING [train.py:1204] (2/4) Exclude cut with ID _29857_210_20034_1_1532395886753_3798160_279-185801-0 from training. Duration: 0.91 2023-10-03 17:55:29,754 INFO [train.py:1046] (2/4) Epoch 39, batch 1400, loss[loss=0.1523, simple_loss=0.2273, pruned_loss=0.03864, over 24471.00 frames. ], tot_loss[loss=0.1567, simple_loss=0.2363, pruned_loss=0.03858, over 4716596.56 frames. ], batch size: 58, lr: 2.62e-03, grad_scale: 8.0 2023-10-03 17:55:29,896 WARNING [train.py:1204] (2/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_656-188361-0 from training. Duration: 0.87 2023-10-03 17:55:31,302 WARNING [train.py:1204] (2/4) Exclude cut with ID _44718_210_5816_1_1534050051869_6469030_100-230287-0_sp1.1 from training. Duration: 0.936375 2023-10-03 17:55:35,401 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8275_210_4198_1_1528455320_3892571_276-215622-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 17:55:35,456 WARNING [train.py:1204] (2/4) Exclude cut with ID _30304_210_6319_1_1532476827261_7582060_698-309430-0_sp1.1 from training. Duration: 0.963625 2023-10-03 17:55:37,486 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.4.encoder.layers.2.self_attn1.whiten, num_groups=1, num_channels=384, metric=11.62 vs. limit=22.5 2023-10-03 17:55:39,704 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_365-217119-0 from training. Duration: 0.63 2023-10-03 17:55:41,250 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7264_210_4938_1_1527939911_8132095_800-91995-0 from training. Duration: 0.86 2023-10-03 17:55:48,314 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.bypass.scale_min, batch_count=1355140.0, ans=0.2 2023-10-03 17:55:49,410 WARNING [train.py:1204] (2/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_49-97158-0_sp1.1 from training. Duration: 0.6909375 2023-10-03 17:55:51,119 WARNING [train.py:1204] (2/4) Exclude cut with ID _61132_210_9969_1_1535021792964_3755419_61-57720-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 17:55:51,766 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.3.encoder.layers.2.nonlin_attention.whiten2, num_groups=1, num_channels=512, metric=5.47 vs. limit=15.0 2023-10-03 17:55:54,422 WARNING [train.py:1204] (2/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_345-306832-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 17:55:54,458 WARNING [train.py:1204] (2/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_164-224052-0_sp0.9 from training. Duration: 0.77775 2023-10-03 17:55:58,126 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_34886_210_10633_1_1533203882_1755661_76-156096-0_sp1.1 from training. Duration: 0.7909375 2023-10-03 17:55:59,447 WARNING [train.py:1204] (2/4) Exclude cut with ID 4133-6541-0027-40495-0_sp1.1 from training. Duration: 0.9681875 2023-10-03 17:56:05,680 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.3.encoder.layers.2.feed_forward1.out_whiten, num_groups=1, num_channels=512, metric=6.86 vs. limit=15.0 2023-10-03 17:56:05,839 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.1.encoder.layers.1.nonlin_attention.whiten2, num_groups=1, num_channels=256, metric=6.74 vs. limit=15.0 2023-10-03 17:56:07,808 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_514-211165-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 17:56:09,149 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_41211_210_2544_1_1533348859_3702910_97-212695-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 17:56:11,974 WARNING [train.py:1204] (2/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_406-45086-0 from training. Duration: 0.8 2023-10-03 17:56:13,284 WARNING [train.py:1204] (2/4) Exclude cut with ID _65263_210_22356_1_1535763598691_3921037_49-222917-0_sp1.1 from training. Duration: 0.836375 2023-10-03 17:56:14,580 WARNING [train.py:1204] (2/4) Exclude cut with ID _4866_210_2577_1_1527404412284_4364659_352-157930-0_sp1.1 from training. Duration: 0.736375 2023-10-03 17:56:14,646 WARNING [train.py:1204] (2/4) Exclude cut with ID _4366_210_1388_1_1526792053633_3613819_231-211402-0_sp1.1 from training. Duration: 0.8545625 2023-10-03 17:56:15,942 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_98-21576-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 17:56:17,336 WARNING [train.py:1204] (2/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_348-127789-0_sp0.9 from training. Duration: 0.9444375 2023-10-03 17:56:17,364 WARNING [train.py:1204] (2/4) Exclude cut with ID _45871_210_5665_1_1533864614245_3296060_98-329557-0_sp1.1 from training. Duration: 0.97275 2023-10-03 17:56:18,777 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6846_210_6753_1_1527850767_4295158_2-315073-0_sp1.1 from training. Duration: 0.8 2023-10-03 17:56:20,165 WARNING [train.py:1204] (2/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_145-137790-0 from training. Duration: 0.83 2023-10-03 17:56:20,184 WARNING [train.py:1204] (2/4) Exclude cut with ID _14603_210_4220_1_1530615688441_3097319_133-158308-0_sp0.9 from training. Duration: 0.9333125 2023-10-03 17:56:22,903 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.0.layers.0.self_attn2.whiten, num_groups=1, num_channels=192, metric=11.51 vs. limit=22.5 2023-10-03 17:56:24,973 WARNING [train.py:1204] (2/4) Exclude cut with ID _14898_210_9206_1_1530766812377_4177190_320-11758-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 17:56:29,840 WARNING [train.py:1204] (2/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_406-45086-0_sp0.9 from training. Duration: 0.888875 2023-10-03 17:56:36,010 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_5438_210_1794_1_1527336687_2600009_104-288274-0 from training. Duration: 0.58 2023-10-03 17:56:37,367 WARNING [train.py:1204] (2/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_108-53929-0_sp0.9 from training. Duration: 0.6555625 2023-10-03 17:56:38,755 WARNING [train.py:1204] (2/4) Exclude cut with ID _30800_210_22382_1_1532499347904_4515959_444-286390-0_sp1.1 from training. Duration: 0.963625 2023-10-03 17:56:40,552 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.5.encoder.layers.1.feed_forward1.out_whiten, num_groups=1, num_channels=256, metric=7.82 vs. limit=15.0 2023-10-03 17:56:41,378 WARNING [train.py:1204] (2/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_125-144174-0_sp0.9 from training. Duration: 0.4333125 2023-10-03 17:56:42,692 INFO [train.py:1046] (2/4) Epoch 39, batch 1450, loss[loss=0.1658, simple_loss=0.2476, pruned_loss=0.04203, over 23983.00 frames. ], tot_loss[loss=0.1565, simple_loss=0.2365, pruned_loss=0.0383, over 4721172.13 frames. ], batch size: 86, lr: 2.62e-03, grad_scale: 8.0 2023-10-03 17:56:42,739 WARNING [train.py:1204] (2/4) Exclude cut with ID _55209_210_1531_1_1534675828996_4597160_118-345621-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 17:56:42,876 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_45886_210_17024_1_1533709215_471063_7-79165-0_sp1.1 from training. Duration: 0.8909375 2023-10-03 17:56:44,391 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.conv_module2.balancer2.prob, batch_count=1355406.6666666667, ans=0.125 2023-10-03 17:56:46,909 WARNING [train.py:1204] (2/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_175-163894-0_sp0.9 from training. Duration: 0.82225 2023-10-03 17:56:49,613 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_50188_210_4198_1_1534503621_3646041_353-81682-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 17:56:49,625 WARNING [train.py:1204] (2/4) Exclude cut with ID _17875_210_2087_1_1531224012907_1821339_175-80565-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 17:56:49,636 WARNING [train.py:1204] (2/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_159-328157-0_sp1.1 from training. Duration: 0.47275 2023-10-03 17:56:55,902 WARNING [train.py:1204] (2/4) Exclude cut with ID _60057_210_10640_1_1534903065793_3333150_137-267579-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 17:56:57,254 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_92-46009-0_sp1.1 from training. Duration: 0.4909375 2023-10-03 17:56:58,662 WARNING [train.py:1204] (2/4) Exclude cut with ID _53203_210_2087_1_1534246218309_475590_48-68507-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 17:56:58,677 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_28211_210_6388_1_1532861506_4523224_128-328162-0 from training. Duration: 0.9 2023-10-03 17:57:00,064 WARNING [train.py:1204] (2/4) Exclude cut with ID _4697_210_2069_1_1526815801953_3823098_51-1049-0_sp1.1 from training. Duration: 0.6818125 2023-10-03 17:57:01,377 WARNING [train.py:1204] (2/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_383-316000-0 from training. Duration: 0.9 2023-10-03 17:57:01,433 WARNING [train.py:1204] (2/4) Exclude cut with ID _4779_210_2084_1_1527318022944_3914893_185-295153-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 17:57:03,481 WARNING [train.py:1204] (2/4) Exclude cut with ID _3231_210_2106_1_1526205864934_6784220_171-256004-0_sp0.9 from training. Duration: 0.9555625 2023-10-03 17:57:03,482 WARNING [train.py:1204] (2/4) Exclude cut with ID _41314_210_12651_1_1533459476543_3766460_87-272068-0 from training. Duration: 0.83 2023-10-03 17:57:05,322 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_164-152777-0_sp1.1 from training. Duration: 0.92725 2023-10-03 17:57:05,360 WARNING [train.py:1204] (2/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_18-130336-0_sp0.9 from training. Duration: 0.87775 2023-10-03 17:57:05,419 WARNING [train.py:1204] (2/4) Exclude cut with ID _14886_210_4220_1_1530702020355_3516190_175-229073-0_sp1.1 from training. Duration: 0.4090625 2023-10-03 17:57:05,432 WARNING [train.py:1204] (2/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_791-45149-0_sp0.9 from training. Duration: 0.9555625 2023-10-03 17:57:06,761 WARNING [train.py:1204] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_468-189058-0_sp1.1 from training. Duration: 0.87275 2023-10-03 17:57:08,103 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8278_210_2550_1_1528637099_4220413_32-142780-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 17:57:10,830 WARNING [train.py:1204] (2/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_146-218763-0_sp0.9 from training. Duration: 0.9555625 2023-10-03 17:57:11,030 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.conv_skip_rate, batch_count=1355540.0, ans=0.0 2023-10-03 17:57:13,644 WARNING [train.py:1204] (2/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_348-127789-0_sp1.1 from training. Duration: 0.77275 2023-10-03 17:57:14,959 WARNING [train.py:1204] (2/4) Exclude cut with ID _47790_210_16187_1_1533862814066_4207519_288-285445-0_sp1.1 from training. Duration: 0.936375 2023-10-03 17:57:16,181 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.601e+02 1.879e+02 2.005e+02 2.239e+02 6.029e+02, threshold=4.010e+02, percent-clipped=2.0 2023-10-03 17:57:16,331 WARNING [train.py:1204] (2/4) Exclude cut with ID _14603_210_4220_1_1530615688441_3097319_308-181732-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 17:57:17,648 WARNING [train.py:1204] (2/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_895-117428-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 17:57:19,048 WARNING [train.py:1204] (2/4) Exclude cut with ID _9235_210_5219_1_1528891154253_8046120_387-159008-0_sp0.9 from training. Duration: 0.9555625 2023-10-03 17:57:19,069 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_37351_210_7925_1_1533012931_3812113_265-85780-0_sp0.9 from training. Duration: 0.97775 2023-10-03 17:57:19,092 WARNING [train.py:1204] (2/4) Exclude cut with ID _40551_210_10635_1_1533776424632_3416179_361-3964-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 17:57:19,128 WARNING [train.py:1204] (2/4) Exclude cut with ID _25882_210_3036_1_1532775665815_6479510_485-122742-0_sp1.1 from training. Duration: 0.97275 2023-10-03 17:57:21,354 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.0.layers.1.self_attn_weights.whiten_keys, num_groups=4, num_channels=128, metric=2.24 vs. limit=6.0 2023-10-03 17:57:21,827 WARNING [train.py:1204] (2/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_98-51676-0 from training. Duration: 0.73 2023-10-03 17:57:26,444 WARNING [train.py:1204] (2/4) Exclude cut with ID _30548_210_16040_1_1532608097130_3146969_371-280610-0_sp1.1 from training. Duration: 0.92725 2023-10-03 17:57:29,213 WARNING [train.py:1204] (2/4) Exclude cut with ID IC0671W0081-331924 from training. Duration: 0.896 2023-10-03 17:57:30,610 WARNING [train.py:1204] (2/4) Exclude cut with ID _40284_210_2084_1_1533513679814_3704279_380-160775-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 17:57:32,017 WARNING [train.py:1204] (2/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_57-270037-0_sp1.1 from training. Duration: 0.7 2023-10-03 17:57:33,905 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_853-150237-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 17:57:35,886 WARNING [train.py:1204] (2/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_491-261923-0 from training. Duration: 0.98 2023-10-03 17:57:38,770 WARNING [train.py:1204] (2/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_836-244045-0_sp1.1 from training. Duration: 0.97275 2023-10-03 17:57:38,845 WARNING [train.py:1204] (2/4) Exclude cut with ID _14880_210_8895_1_1530680426655_3795259_289-212817-0 from training. Duration: 0.69 2023-10-03 17:57:40,188 WARNING [train.py:1204] (2/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_303-340446-0 from training. Duration: 0.9 2023-10-03 17:57:41,555 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_37152_210_9697_1_1533106573_3844188_418-41736-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 17:57:45,544 WARNING [train.py:1204] (2/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_371-18169-0_sp1.1 from training. Duration: 0.7818125 2023-10-03 17:57:46,866 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_187-219984-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 17:57:48,224 WARNING [train.py:1204] (2/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_63-267585-0 from training. Duration: 0.9 2023-10-03 17:57:49,681 WARNING [train.py:1204] (2/4) Exclude cut with ID _27013_210_1389_1_1532437173240_3252649_222-125573-0 from training. Duration: 0.98 2023-10-03 17:57:49,727 WARNING [train.py:1204] (2/4) Exclude cut with ID _14821_210_8750_1_1530680503991_6829359_189-297451-0 from training. Duration: 0.55 2023-10-03 17:57:51,092 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_557-163391-0_sp1.1 from training. Duration: 0.97275 2023-10-03 17:57:51,609 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.5.encoder.layers.0.feed_forward2.out_whiten, num_groups=1, num_channels=256, metric=6.81 vs. limit=15.0 2023-10-03 17:57:52,462 WARNING [train.py:1204] (2/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_65-88242-0_sp1.1 from training. Duration: 0.5090625 2023-10-03 17:57:55,814 INFO [train.py:1046] (2/4) Epoch 39, batch 1500, loss[loss=0.1809, simple_loss=0.2603, pruned_loss=0.05078, over 23230.00 frames. ], tot_loss[loss=0.1568, simple_loss=0.2371, pruned_loss=0.03829, over 4732929.18 frames. ], batch size: 93, lr: 2.62e-03, grad_scale: 8.0 2023-10-03 17:58:01,731 WARNING [train.py:1204] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_218-295097-0 from training. Duration: 0.77 2023-10-03 17:58:01,750 WARNING [train.py:1204] (2/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_686-247600-0_sp1.1 from training. Duration: 0.836375 2023-10-03 17:58:01,754 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_397-147196-0_sp0.9 from training. Duration: 0.888875 2023-10-03 17:58:01,881 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.1.ff2_skip_rate, batch_count=1355740.0, ans=0.0 2023-10-03 17:58:05,051 WARNING [train.py:1204] (2/4) Exclude cut with ID _53950_210_5816_1_1534417264919_3862951_237-136449-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 17:58:05,129 WARNING [train.py:1204] (2/4) Exclude cut with ID _3120_210_3525_1_1526036143112_4072980_74-178400-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 17:58:06,533 WARNING [train.py:1204] (2/4) Exclude cut with ID _5166_210_2084_1_1527073197160_4087993_309-222545-0_sp0.9 from training. Duration: 0.9333125 2023-10-03 17:58:08,544 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7544_210_6753_1_1528587722_3576686_172-171750-0 from training. Duration: 0.65 2023-10-03 17:58:08,664 WARNING [train.py:1204] (2/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_3-237434-0_sp0.9 from training. Duration: 0.8444375 2023-10-03 17:58:09,879 WARNING [train.py:1204] (2/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_101-293367-0_sp0.9 from training. Duration: 0.7 2023-10-03 17:58:09,886 WARNING [train.py:1204] (2/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_818-213552-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 17:58:11,220 WARNING [train.py:1204] (2/4) Exclude cut with ID _14349_210_8341_1_1530615710818_3442470_115-285975-0_sp1.1 from training. Duration: 0.7818125 2023-10-03 17:58:13,923 WARNING [train.py:1204] (2/4) Exclude cut with ID _30800_210_22382_1_1532499347904_4515959_291-309749-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 17:58:14,006 WARNING [train.py:1204] (2/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_715-3462-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 17:58:18,264 WARNING [train.py:1204] (2/4) Exclude cut with ID _59502_210_12210_1_1535107024698_1835139_13-331827-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 17:58:18,274 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_4528_210_3273_1_1527076654_3276777_342-168068-0 from training. Duration: 0.87 2023-10-03 17:58:19,628 WARNING [train.py:1204] (2/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_215-141059-0_sp0.9 from training. Duration: 0.688875 2023-10-03 17:58:19,663 WARNING [train.py:1204] (2/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_106-286130-0_sp1.1 from training. Duration: 0.8909375 2023-10-03 17:58:21,508 WARNING [train.py:1204] (2/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_480-77655-0_sp1.1 from training. Duration: 0.97275 2023-10-03 17:58:24,246 WARNING [train.py:1204] (2/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_848-280957-0 from training. Duration: 0.87 2023-10-03 17:58:28,536 WARNING [train.py:1204] (2/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_122-109023-0 from training. Duration: 0.76 2023-10-03 17:58:29,878 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_11305_210_1794_1_1529750428_7178148_645-233480-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 17:58:29,939 WARNING [train.py:1204] (2/4) Exclude cut with ID _7562_210_2084_1_1528887767916_3946229_345-169156-0 from training. Duration: 0.58 2023-10-03 17:58:30,102 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.bypass.skip_rate, batch_count=1355873.3333333333, ans=0.07 2023-10-03 17:58:33,028 WARNING [train.py:1204] (2/4) Exclude cut with ID _7562_210_2084_1_1528887767916_3946229_345-169156-0_sp1.1 from training. Duration: 0.52725 2023-10-03 17:58:34,402 WARNING [train.py:1204] (2/4) Exclude cut with ID _13253_210_1925_1_1530757775819_3577873_253-246386-0_sp1.1 from training. Duration: 0.8181875 2023-10-03 17:58:35,576 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_136-264444-0_sp1.1 from training. Duration: 0.97275 2023-10-03 17:58:35,591 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_2168_210_2290_1_1525606920_1849400_136-222991-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 17:58:39,162 WARNING [train.py:1204] (2/4) Exclude cut with ID _7852_210_5219_1_1528286471316_8139400_242-105336-0 from training. Duration: 0.72 2023-10-03 17:58:39,204 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_5960_210_1794_1_1527403865_3990999_515-2613-0_sp1.1 from training. Duration: 0.8 2023-10-03 17:58:39,225 WARNING [train.py:1204] (2/4) Exclude cut with ID _39688_210_19596_1_1533371626725_4154159_551-167834-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 17:58:39,276 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_43802_210_2290_1_1533516203_8829504_85-195240-0 from training. Duration: 0.87 2023-10-03 17:58:40,596 WARNING [train.py:1204] (2/4) Exclude cut with ID _63171_210_15488_1_1535104792794_3295110_113-113534-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 17:58:46,089 WARNING [train.py:1204] (2/4) Exclude cut with ID _8833_210_5219_1_1528592665210_7553059_319-349181-0_sp1.1 from training. Duration: 0.9 2023-10-03 17:58:46,090 WARNING [train.py:1204] (2/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_225-43381-0 from training. Duration: 0.94 2023-10-03 17:58:50,323 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_5959_210_1794_1_1527400400_3445726_412-44052-0_sp0.9 from training. Duration: 0.8555625 2023-10-03 17:58:50,357 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder_embed.convnext.hidden_balancer.prob, batch_count=1355940.0, ans=0.125 2023-10-03 17:58:51,694 WARNING [train.py:1204] (2/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_356-167816-0_sp1.1 from training. Duration: 0.6545625 2023-10-03 17:58:55,565 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9012W0112-501244 from training. Duration: 0.768 2023-10-03 17:58:56,900 WARNING [train.py:1204] (2/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_177-326952-0_sp1.1 from training. Duration: 0.92725 2023-10-03 17:58:56,904 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9016W0116-502789 from training. Duration: 0.896 2023-10-03 17:58:58,345 WARNING [train.py:1204] (2/4) Exclude cut with ID _56235_210_10640_1_1534557335235_4161099_557-204815-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 17:58:59,712 WARNING [train.py:1204] (2/4) Exclude cut with ID _55857_210_2346_1_1534644007831_6898209_279-229469-0_sp1.1 from training. Duration: 0.936375 2023-10-03 17:58:59,784 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9029W0375-509764 from training. Duration: 0.896 2023-10-03 17:58:59,853 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_2295_210_2087_1_1525500993_2407863_234-83104-0_sp0.9 from training. Duration: 0.688875 2023-10-03 17:59:03,922 WARNING [train.py:1204] (2/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_266-137104-0 from training. Duration: 0.65 2023-10-03 17:59:05,849 WARNING [train.py:1204] (2/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_289-163927-0_sp1.1 from training. Duration: 0.92725 2023-10-03 17:59:10,660 INFO [train.py:1046] (2/4) Epoch 39, batch 1550, loss[loss=0.1582, simple_loss=0.2498, pruned_loss=0.0333, over 24312.00 frames. ], tot_loss[loss=0.1577, simple_loss=0.238, pruned_loss=0.03872, over 4737675.36 frames. ], batch size: 74, lr: 2.62e-03, grad_scale: 8.0 2023-10-03 17:59:10,706 WARNING [train.py:1204] (2/4) Exclude cut with ID _12733_210_8341_1_1530167424556_3646380_86-225314-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 17:59:10,719 WARNING [train.py:1204] (2/4) Exclude cut with ID _7877_210_6014_1_1528286358099_3750136_618-284064-0_sp1.1 from training. Duration: 0.92725 2023-10-03 17:59:10,765 WARNING [train.py:1204] (2/4) Exclude cut with ID _55844_210_2346_1_1534471212569_7625707_1017-31469-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 17:59:10,802 WARNING [train.py:1204] (2/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_617-241446-0_sp1.1 from training. Duration: 0.92725 2023-10-03 17:59:10,982 INFO [scaling.py:1118] (2/4) WithLoss: name=encoder.encoders.1.encoder.layers.1.self_attn_weights, loss-sum=0.000e+00 2023-10-03 17:59:12,109 WARNING [train.py:1204] (2/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_866-173440-0_sp1.1 from training. Duration: 0.8181875 2023-10-03 17:59:13,599 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_9460_210_4211_1_1528954587_10049667_480-260269-0 from training. Duration: 0.56 2023-10-03 17:59:15,009 WARNING [train.py:1204] (2/4) Exclude cut with ID _44659_210_2721_1_1534036120061_6614860_626-298884-0 from training. Duration: 0.97 2023-10-03 17:59:15,066 WARNING [train.py:1204] (2/4) Exclude cut with ID _53565_210_10635_1_1534337218335_3820259_287-18120-0_sp1.1 from training. Duration: 0.8818125 2023-10-03 17:59:15,124 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_791-197891-0 from training. Duration: 0.84 2023-10-03 17:59:16,348 WARNING [train.py:1204] (2/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_527-238990-0 from training. Duration: 0.9 2023-10-03 17:59:17,773 WARNING [train.py:1204] (2/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_449-65969-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 17:59:19,186 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_553-341452-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 17:59:19,225 WARNING [train.py:1204] (2/4) Exclude cut with ID _16909_210_5724_1_1531292079912_4118919_300-47574-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 17:59:19,237 WARNING [train.py:1204] (2/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_70-27732-0_sp1.1 from training. Duration: 0.8545625 2023-10-03 17:59:20,650 WARNING [train.py:1204] (2/4) Exclude cut with ID _44718_210_5816_1_1534050051869_6469030_727-28074-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 17:59:21,912 WARNING [train.py:1204] (2/4) Exclude cut with ID _55857_210_2346_1_1534644007831_6898209_357-123664-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 17:59:22,470 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.5.encoder.layers.0.self_attn1.whiten, num_groups=1, num_channels=256, metric=11.67 vs. limit=22.5 2023-10-03 17:59:24,772 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9123W0004-555964 from training. Duration: 0.896 2023-10-03 17:59:24,800 WARNING [train.py:1204] (2/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_445-145208-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 17:59:26,662 WARNING [train.py:1204] (2/4) Exclude cut with ID _3675_210_4557_1_1526639934408_4182089_456-18305-0_sp1.1 from training. Duration: 0.6909375 2023-10-03 17:59:26,727 WARNING [train.py:1204] (2/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_1-99680-0_sp1.1 from training. Duration: 0.6090625 2023-10-03 17:59:27,036 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.feed_forward1.hidden_balancer.prob, batch_count=1356140.0, ans=0.125 2023-10-03 17:59:28,161 WARNING [train.py:1204] (2/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_146-282400-0_sp0.9 from training. Duration: 0.788875 2023-10-03 17:59:28,400 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.conv_module2.balancer2.prob, batch_count=1356140.0, ans=0.125 2023-10-03 17:59:29,465 WARNING [train.py:1204] (2/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_844-96989-0 from training. Duration: 0.71 2023-10-03 17:59:30,832 WARNING [train.py:1204] (2/4) Exclude cut with ID _29853_210_7497_1_1532505600851_6642110_128-146364-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 17:59:30,855 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_224-322394-0 from training. Duration: 0.66 2023-10-03 17:59:32,324 WARNING [train.py:1204] (2/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_191-59784-0 from training. Duration: 0.88 2023-10-03 17:59:32,329 WARNING [train.py:1204] (2/4) Exclude cut with ID _12556_210_8341_1_1530081024100_7327690_725-193477-0 from training. Duration: 0.98 2023-10-03 17:59:33,636 WARNING [train.py:1204] (2/4) Exclude cut with ID _5166_210_2084_1_1527073197160_4087993_523-79838-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 17:59:35,328 WARNING [train.py:1204] (2/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_398-334558-0_sp1.1 from training. Duration: 0.87275 2023-10-03 17:59:38,175 WARNING [train.py:1204] (2/4) Exclude cut with ID _6456_210_1534_1_1527681634212_3181049_171-36697-0_sp1.1 from training. Duration: 0.936375 2023-10-03 17:59:41,811 WARNING [train.py:1204] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_700-183549-0 from training. Duration: 0.68 2023-10-03 17:59:41,813 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8853_210_3273_1_1528633899_818968_51-187685-0 from training. Duration: 0.69 2023-10-03 17:59:43,436 INFO [scaling.py:1118] (2/4) WithLoss: name=encoder.encoders.2.encoder.layers.0.self_attn_weights, loss-sum=0.000e+00 2023-10-03 17:59:43,896 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.3.encoder.layers.2.self_attn_weights.whiten_keys, num_groups=8, num_channels=256, metric=3.48 vs. limit=6.0 2023-10-03 17:59:44,432 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.576e+02 1.873e+02 2.058e+02 2.310e+02 4.421e+02, threshold=4.116e+02, percent-clipped=1.0 2023-10-03 17:59:47,429 WARNING [train.py:1204] (2/4) Exclude cut with ID _7719_210_3525_1_1528456153163_4296960_333-110399-0_sp1.1 from training. Duration: 0.87275 2023-10-03 17:59:51,511 WARNING [train.py:1204] (2/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_1022-83740-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 17:59:51,539 WARNING [train.py:1204] (2/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_61-327062-0_sp0.9 from training. Duration: 0.9 2023-10-03 17:59:51,545 WARNING [train.py:1204] (2/4) Exclude cut with ID _47251_210_18628_1_1533814063128_4137250_214-88076-0_sp1.1 from training. Duration: 0.8 2023-10-03 17:59:52,828 WARNING [train.py:1204] (2/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_839-173017-0 from training. Duration: 0.41 2023-10-03 17:59:57,447 WARNING [train.py:1204] (2/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_299-34413-0_sp1.1 from training. Duration: 0.5909375 2023-10-03 17:59:58,811 WARNING [train.py:1204] (2/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_664-159457-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 18:00:00,263 WARNING [train.py:1204] (2/4) Exclude cut with ID _66873_210_5517_1_1535454136506_4293370_483-291760-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 18:00:00,902 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.4.encoder.layers.0.feed_forward1.out_whiten, num_groups=1, num_channels=384, metric=6.04 vs. limit=15.0 2023-10-03 18:00:02,987 WARNING [train.py:1204] (2/4) Exclude cut with ID _30800_210_22382_1_1532499347904_4515959_287-255474-0_sp1.1 from training. Duration: 0.92725 2023-10-03 18:00:03,035 WARNING [train.py:1204] (2/4) Exclude cut with ID _48677_210_5663_1_1533902405919_3892989_111-11439-0_sp1.1 from training. Duration: 0.87275 2023-10-03 18:00:03,042 WARNING [train.py:1204] (2/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_399-156791-0 from training. Duration: 0.83 2023-10-03 18:00:04,941 WARNING [train.py:1204] (2/4) Exclude cut with ID _3992_210_1747_1_1526644816031_3731060_36-147855-0_sp0.9 from training. Duration: 0.8666875 2023-10-03 18:00:07,642 WARNING [train.py:1204] (2/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_442-320317-0_sp1.1 from training. Duration: 0.7454375 2023-10-03 18:00:07,673 WARNING [train.py:1204] (2/4) Exclude cut with ID _55429_210_10626_1_1534467588527_7174143_953-80868-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 18:00:08,978 WARNING [train.py:1204] (2/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_159-328157-0_sp0.9 from training. Duration: 0.57775 2023-10-03 18:00:08,985 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9309W0197-640324 from training. Duration: 0.896 2023-10-03 18:00:11,051 WARNING [train.py:1204] (2/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_361-30822-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 18:00:17,079 WARNING [train.py:1204] (2/4) Exclude cut with ID _5250_210_5219_1_1527249561806_8692110_636-251213-0 from training. Duration: 0.94 2023-10-03 18:00:20,797 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.3.encoder.layers.0.conv_module1.whiten, num_groups=1, num_channels=512, metric=4.03 vs. limit=15.0 2023-10-03 18:00:21,454 WARNING [train.py:1204] (2/4) Exclude cut with ID _49728_210_12651_1_1534041034294_6788150_468-333827-0_sp1.1 from training. Duration: 0.963625 2023-10-03 18:00:23,234 WARNING [train.py:1204] (2/4) Exclude cut with ID _25882_210_3036_1_1532775665815_6479510_623-191759-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 18:00:23,291 WARNING [train.py:1204] (2/4) Exclude cut with ID _5189_210_4557_1_1527248319428_3769229_199-53526-0 from training. Duration: 0.42 2023-10-03 18:00:24,601 INFO [train.py:1046] (2/4) Epoch 39, batch 1600, loss[loss=0.1462, simple_loss=0.2279, pruned_loss=0.0323, over 24328.00 frames. ], tot_loss[loss=0.1578, simple_loss=0.2383, pruned_loss=0.03862, over 4736837.94 frames. ], batch size: 61, lr: 2.62e-03, grad_scale: 16.0 2023-10-03 18:00:24,733 WARNING [train.py:1204] (2/4) Exclude cut with ID _6274_210_2084_1_1527937583837_7494020_32-205288-0_sp0.9 from training. Duration: 0.8666875 2023-10-03 18:00:26,085 WARNING [train.py:1204] (2/4) Exclude cut with ID _65263_210_22356_1_1535763598691_3921037_401-346646-0_sp1.1 from training. Duration: 0.963625 2023-10-03 18:00:26,090 WARNING [train.py:1204] (2/4) Exclude cut with ID _12555_210_8750_1_1530075594402_3780239_449-267986-0_sp1.1 from training. Duration: 0.7545625 2023-10-03 18:00:26,103 WARNING [train.py:1204] (2/4) Exclude cut with ID _69884_210_9771_1_1535846392818_4166169_430-201020-0_sp1.1 from training. Duration: 0.8818125 2023-10-03 18:00:27,431 WARNING [train.py:1204] (2/4) Exclude cut with ID _8394_210_6702_1_1528590504316_3540189_379-49457-0_sp1.1 from training. Duration: 0.7909375 2023-10-03 18:00:28,917 WARNING [train.py:1204] (2/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_200-105218-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 18:00:30,374 WARNING [train.py:1204] (2/4) Exclude cut with ID _3867_210_1883_1_1526641200575_4475830_319-349850-0 from training. Duration: 0.67 2023-10-03 18:00:31,800 WARNING [train.py:1204] (2/4) Exclude cut with ID _28317_210_12742_1_1532480302381_8510325_916-195848-0 from training. Duration: 0.92 2023-10-03 18:00:31,930 WARNING [train.py:1204] (2/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_436-186311-0 from training. Duration: 0.65 2023-10-03 18:00:34,577 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_3910_210_1794_1_1526777667_3747260_302-180617-0_sp1.1 from training. Duration: 0.97275 2023-10-03 18:00:36,388 WARNING [train.py:1204] (2/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_102-92751-0 from training. Duration: 0.99 2023-10-03 18:00:36,457 WARNING [train.py:1204] (2/4) Exclude cut with ID _51899_210_20034_1_1534129254466_3669220_265-215428-0_sp1.1 from training. Duration: 0.8909375 2023-10-03 18:00:39,707 WARNING [train.py:1204] (2/4) Exclude cut with ID _58583_210_5236_1_1534726835266_3574249_438-198993-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 18:00:41,384 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.bypass.skip_rate, batch_count=1356473.3333333333, ans=0.09899494936611666 2023-10-03 18:00:43,240 WARNING [train.py:1204] (2/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_166-166990-0_sp1.1 from training. Duration: 0.87275 2023-10-03 18:00:45,890 WARNING [train.py:1204] (2/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_907-151398-0 from training. Duration: 0.37 2023-10-03 18:00:49,045 WARNING [train.py:1204] (2/4) Exclude cut with ID _38670_210_15379_1_1533448810659_4391059_452-256669-0_sp1.1 from training. Duration: 0.936375 2023-10-03 18:00:49,088 WARNING [train.py:1204] (2/4) Exclude cut with ID _48677_210_5663_1_1533902405919_3892989_408-40740-0 from training. Duration: 0.83 2023-10-03 18:00:50,458 WARNING [train.py:1204] (2/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_903-158756-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 18:00:51,785 WARNING [train.py:1204] (2/4) Exclude cut with ID _13252_210_1925_1_1530671372047_3336909_397-216497-0 from training. Duration: 0.96 2023-10-03 18:00:57,385 WARNING [train.py:1204] (2/4) Exclude cut with ID _55429_210_10626_1_1534467588527_7174143_97-335084-0 from training. Duration: 0.97 2023-10-03 18:01:04,043 WARNING [train.py:1204] (2/4) Exclude cut with ID _45329_210_7005_1_1534157951211_6774339_453-267730-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 18:01:05,554 WARNING [train.py:1204] (2/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_153-97441-0 from training. Duration: 0.56 2023-10-03 18:01:07,346 WARNING [train.py:1204] (2/4) Exclude cut with ID _40901_210_5665_1_1533363789958_3856130_312-329122-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 18:01:07,401 WARNING [train.py:1204] (2/4) Exclude cut with ID _30800_210_22382_1_1532499347904_4515959_97-126471-0_sp1.1 from training. Duration: 0.97275 2023-10-03 18:01:07,402 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_11305_210_1794_1_1529750428_7178148_257-312870-0_sp1.1 from training. Duration: 0.8 2023-10-03 18:01:08,932 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.conv_module2.balancer2.prob, batch_count=1356606.6666666667, ans=0.125 2023-10-03 18:01:10,235 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.conv_skip_rate, batch_count=1356606.6666666667, ans=0.0 2023-10-03 18:01:11,955 WARNING [train.py:1204] (2/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_454-314901-0_sp0.9 from training. Duration: 0.511125 2023-10-03 18:01:16,089 WARNING [train.py:1204] (2/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_282-136641-0_sp1.1 from training. Duration: 0.3818125 2023-10-03 18:01:17,909 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_46013_210_7234_1_1533776362_3744032_493-160724-0_sp1.1 from training. Duration: 0.963625 2023-10-03 18:01:19,362 WARNING [train.py:1204] (2/4) Exclude cut with ID _67831_210_12392_1_1535612464202_3092612_259-33442-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 18:01:19,431 WARNING [train.py:1204] (2/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_289-62384-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 18:01:20,719 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6545_210_2040_1_1527774670_4245258_379-14182-0_sp0.9 from training. Duration: 0.888875 2023-10-03 18:01:22,816 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_548-294316-0_sp1.1 from training. Duration: 0.736375 2023-10-03 18:01:22,908 WARNING [train.py:1204] (2/4) Exclude cut with ID _6275_210_2084_1_1528110062213_7669049_857-298942-0_sp1.1 from training. Duration: 0.77275 2023-10-03 18:01:24,287 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6673_210_3273_1_1527767790_3173240_327-306185-0_sp1.1 from training. Duration: 0.7545625 2023-10-03 18:01:30,935 WARNING [train.py:1204] (2/4) Exclude cut with ID _61008_210_10633_1_1534852769198_6974370_814-107647-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 18:01:32,241 WARNING [train.py:1204] (2/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_116-332008-0_sp1.1 from training. Duration: 0.8909375 2023-10-03 18:01:33,712 WARNING [train.py:1204] (2/4) Exclude cut with ID _32704_210_8954_1_1533017488840_6747370_266-329755-0 from training. Duration: 0.89 2023-10-03 18:01:33,724 WARNING [train.py:1204] (2/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_271-269084-0_sp0.9 from training. Duration: 0.92225 2023-10-03 18:01:35,059 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_3171_210_2087_1_1526109084_2330007_66-260583-0 from training. Duration: 0.72 2023-10-03 18:01:37,241 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.4.encoder.layers.0.nonlin_attention.whiten1, num_groups=1, num_channels=288, metric=8.38 vs. limit=10.0 2023-10-03 18:01:37,826 INFO [train.py:1046] (2/4) Epoch 39, batch 1650, loss[loss=0.1283, simple_loss=0.2129, pruned_loss=0.02191, over 24613.00 frames. ], tot_loss[loss=0.1579, simple_loss=0.2385, pruned_loss=0.03872, over 4731898.36 frames. ], batch size: 60, lr: 2.62e-03, grad_scale: 16.0 2023-10-03 18:01:41,114 WARNING [train.py:1204] (2/4) Exclude cut with ID _5250_210_5219_1_1527249561806_8692110_1326-276386-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 18:01:42,521 WARNING [train.py:1204] (2/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_372-30302-0_sp1.1 from training. Duration: 0.7818125 2023-10-03 18:01:42,595 WARNING [train.py:1204] (2/4) Exclude cut with ID _25882_210_3036_1_1532775665815_6479510_631-257029-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 18:01:42,603 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_3691_210_3528_1_1526468169_4062053_355-334432-0 from training. Duration: 0.87 2023-10-03 18:01:42,610 WARNING [train.py:1204] (2/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_839-173017-0_sp1.1 from training. Duration: 0.37275 2023-10-03 18:01:42,618 WARNING [train.py:1204] (2/4) Exclude cut with ID _14998_210_5219_1_1530705582080_7474970_441-91466-0 from training. Duration: 0.97 2023-10-03 18:01:42,649 WARNING [train.py:1204] (2/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_108-53929-0 from training. Duration: 0.59 2023-10-03 18:01:42,801 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.ff2_skip_rate, batch_count=1356740.0, ans=0.0 2023-10-03 18:01:47,284 WARNING [train.py:1204] (2/4) Exclude cut with ID _9389_210_1664_1_1529217337301_5289269_260-1942-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 18:01:49,207 WARNING [train.py:1204] (2/4) Exclude cut with ID _56423_210_5854_1_1534896061049_3165969_198-61547-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 18:01:49,252 WARNING [train.py:1204] (2/4) Exclude cut with ID _44778_210_2054_1_1533798127203_7443090_412-142122-0_sp1.1 from training. Duration: 0.92725 2023-10-03 18:01:49,270 WARNING [train.py:1204] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_88-341718-0_sp1.1 from training. Duration: 0.6 2023-10-03 18:01:52,559 WARNING [train.py:1204] (2/4) Exclude cut with ID _61079_210_4525_1_1534921008198_4011419_334-298697-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 18:01:52,708 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_5465_210_4211_1_1527227253_55357_8-2499-0_sp1.1 from training. Duration: 0.996375 2023-10-03 18:01:54,179 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_447-201565-0_sp1.1 from training. Duration: 0.97275 2023-10-03 18:01:54,185 WARNING [train.py:1204] (2/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_253-161789-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 18:01:54,188 WARNING [train.py:1204] (2/4) Exclude cut with ID _44304_210_15374_1_1533639352765_4613129_302-22084-0_sp0.9 from training. Duration: 0.9555625 2023-10-03 18:01:54,197 WARNING [train.py:1204] (2/4) Exclude cut with ID _26664_210_7770_1_1532258943280_3418270_187-80815-0_sp1.1 from training. Duration: 0.8454375 2023-10-03 18:01:56,787 WARNING [train.py:1204] (2/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_68-212801-0 from training. Duration: 0.73 2023-10-03 18:01:56,817 WARNING [train.py:1204] (2/4) Exclude cut with ID _29345_210_4381_1_1532689168263_3860169_149-257268-0 from training. Duration: 0.81 2023-10-03 18:02:01,159 INFO [scaling.py:1118] (2/4) WithLoss: name=encoder.encoders.0.layers.1.self_attn_weights, loss-sum=0.000e+00 2023-10-03 18:02:02,321 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_213-103282-0_sp1.1 from training. Duration: 0.7090625 2023-10-03 18:02:03,785 WARNING [train.py:1204] (2/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_156-302675-0_sp1.1 from training. Duration: 0.5 2023-10-03 18:02:08,210 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.balancer1.prob, batch_count=1356873.3333333333, ans=0.125 2023-10-03 18:02:11,425 WARNING [train.py:1204] (2/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_125-322172-0 from training. Duration: 0.93 2023-10-03 18:02:12,619 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.632e+02 1.940e+02 2.128e+02 2.413e+02 3.924e+02, threshold=4.257e+02, percent-clipped=0.0 2023-10-03 18:02:12,744 WARNING [train.py:1204] (2/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_493-60474-0_sp1.1 from training. Duration: 0.963625 2023-10-03 18:02:14,798 WARNING [train.py:1204] (2/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_156-302675-0 from training. Duration: 0.55 2023-10-03 18:02:15,044 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.feed_forward3.hidden_balancer.prob, batch_count=1356873.3333333333, ans=0.125 2023-10-03 18:02:18,198 WARNING [train.py:1204] (2/4) Exclude cut with ID _41314_210_12651_1_1533459476543_3766460_123-267862-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 18:02:20,923 WARNING [train.py:1204] (2/4) Exclude cut with ID _28427_210_4220_1_1532581240967_3061069_201-143747-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 18:02:20,954 WARNING [train.py:1204] (2/4) Exclude cut with ID _8772_210_1850_1_1528635595972_3664011_96-12756-0_sp1.1 from training. Duration: 0.8909375 2023-10-03 18:02:21,122 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.1.feed_forward1.out_proj.dropout_p, batch_count=1356873.3333333333, ans=0.1 2023-10-03 18:02:22,784 WARNING [train.py:1204] (2/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_1347-145675-0_sp1.1 from training. Duration: 0.936375 2023-10-03 18:02:22,887 WARNING [train.py:1204] (2/4) Exclude cut with ID _9235_210_5219_1_1528891154253_8046120_387-159008-0_sp1.1 from training. Duration: 0.7818125 2023-10-03 18:02:22,906 WARNING [train.py:1204] (2/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_400-201896-0_sp1.1 from training. Duration: 0.963625 2023-10-03 18:02:25,749 WARNING [train.py:1204] (2/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_326-174612-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 18:02:25,790 WARNING [train.py:1204] (2/4) Exclude cut with ID _55844_210_2346_1_1534471212569_7625707_390-116233-0_sp1.1 from training. Duration: 0.963625 2023-10-03 18:02:27,102 WARNING [train.py:1204] (2/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_419-317062-0_sp1.1 from training. Duration: 0.82725 2023-10-03 18:02:27,146 WARNING [train.py:1204] (2/4) Exclude cut with ID _29877_210_6228_1_1532397791106_5276149_266-62291-0_sp0.9 from training. Duration: 0.9666875 2023-10-03 18:02:28,442 WARNING [train.py:1204] (2/4) Exclude cut with ID _7857_210_1534_1_1528286515745_6831470_431-338528-0_sp1.1 from training. Duration: 0.92725 2023-10-03 18:02:28,521 WARNING [train.py:1204] (2/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_87-131427-0_sp0.9 from training. Duration: 0.8666875 2023-10-03 18:02:31,303 WARNING [train.py:1204] (2/4) Exclude cut with ID _24055_210_12651_1_1532310907143_976979_92-59356-0_sp1.1 from training. Duration: 0.82725 2023-10-03 18:02:33,849 WARNING [train.py:1204] (2/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_96-290039-0 from training. Duration: 0.56 2023-10-03 18:02:35,211 WARNING [train.py:1204] (2/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_48-182155-0_sp0.9 from training. Duration: 0.9666875 2023-10-03 18:02:35,242 WARNING [train.py:1204] (2/4) Exclude cut with ID _4626_210_3532_1_1526817652717_3916859_446-206656-0 from training. Duration: 0.76 2023-10-03 18:02:36,518 WARNING [train.py:1204] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_168-32906-0 from training. Duration: 0.69 2023-10-03 18:02:36,534 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_3780_210_2087_1_1526467760_2130483_170-259044-0 from training. Duration: 0.7 2023-10-03 18:02:36,546 WARNING [train.py:1204] (2/4) Exclude cut with ID _65134_210_14763_1_1535335393985_3505368_153-216718-0_sp1.1 from training. Duration: 0.92725 2023-10-03 18:02:37,954 WARNING [train.py:1204] (2/4) Exclude cut with ID _64639_210_5816_1_1535367619437_3528000_461-329156-0_sp1.1 from training. Duration: 0.87275 2023-10-03 18:02:37,991 WARNING [train.py:1204] (2/4) Exclude cut with ID _21989_210_5854_1_1531899214931_3271360_126-347531-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 18:02:38,028 WARNING [train.py:1204] (2/4) Exclude cut with ID _7614_210_4915_1_1529326764092_4537359_96-162197-0_sp1.1 from training. Duration: 0.963625 2023-10-03 18:02:38,029 WARNING [train.py:1204] (2/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_1083-147536-0 from training. Duration: 0.97 2023-10-03 18:02:42,230 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.bypass.skip_rate, batch_count=1357006.6666666667, ans=0.04949747468305833 2023-10-03 18:02:43,117 WARNING [train.py:1204] (2/4) Exclude cut with ID _64029_210_4381_1_1535178502772_3756459_275-16710-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 18:02:44,551 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7264_210_4938_1_1527939911_8132095_800-91995-0_sp0.9 from training. Duration: 0.9555625 2023-10-03 18:02:44,718 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.bypass.skip_rate, batch_count=1357006.6666666667, ans=0.07 2023-10-03 18:02:45,822 WARNING [train.py:1204] (2/4) Exclude cut with ID _3844_210_1390_1_1526727803582_2871209_50-193879-0_sp1.1 from training. Duration: 0.936375 2023-10-03 18:02:47,767 WARNING [train.py:1204] (2/4) Exclude cut with ID _46952_210_5986_1_1534230010357_3107379_232-344503-0 from training. Duration: 0.96 2023-10-03 18:02:49,432 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.feed_forward2.hidden_balancer.prob, batch_count=1357006.6666666667, ans=0.125 2023-10-03 18:02:51,324 WARNING [train.py:1204] (2/4) Exclude cut with ID _25808_210_13292_1_1532343514562_7366109_206-14076-0_sp1.1 from training. Duration: 0.936375 2023-10-03 18:02:51,325 WARNING [train.py:1204] (2/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_55-31904-0_sp0.9 from training. Duration: 0.97775 2023-10-03 18:02:51,354 WARNING [train.py:1204] (2/4) Exclude cut with ID _14632_210_9988_1_1530691279392_3923200_161-129530-0 from training. Duration: 0.96 2023-10-03 18:02:52,753 WARNING [train.py:1204] (2/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_770-200430-0_sp1.1 from training. Duration: 0.8090625 2023-10-03 18:02:52,763 WARNING [train.py:1204] (2/4) Exclude cut with ID _6270_210_2084_1_1527850911573_3856420_26-277343-0_sp1.1 from training. Duration: 0.7454375 2023-10-03 18:02:52,764 WARNING [train.py:1204] (2/4) Exclude cut with ID _69884_210_9771_1_1535846392818_4166169_88-23776-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 18:02:53,988 INFO [train.py:1046] (2/4) Epoch 39, batch 1700, loss[loss=0.1474, simple_loss=0.2104, pruned_loss=0.04221, over 22696.00 frames. ], tot_loss[loss=0.1578, simple_loss=0.2371, pruned_loss=0.03922, over 4713005.20 frames. ], batch size: 322, lr: 2.62e-03, grad_scale: 16.0 2023-10-03 18:02:54,323 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.feed_forward2.hidden_balancer.prob, batch_count=1357073.3333333333, ans=0.125 2023-10-03 18:02:55,441 WARNING [train.py:1204] (2/4) Exclude cut with ID _60860_210_8254_1_1534896015875_3557169_245-256666-0_sp1.1 from training. Duration: 0.8818125 2023-10-03 18:02:55,472 WARNING [train.py:1204] (2/4) Exclude cut with ID _5250_210_5219_1_1527249561806_8692110_636-251213-0_sp1.1 from training. Duration: 0.8545625 2023-10-03 18:02:55,495 WARNING [train.py:1204] (2/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_540-131164-0 from training. Duration: 0.92 2023-10-03 18:02:56,955 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_5931_210_2040_1_1527428481_4547999_321-193614-0_sp0.9 from training. Duration: 0.7444375 2023-10-03 18:03:03,753 WARNING [train.py:1204] (2/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_373-225403-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 18:03:07,681 WARNING [train.py:1204] (2/4) Exclude cut with ID _20622_210_3852_1_1531622427080_1093839_57-326027-0_sp1.1 from training. Duration: 0.97275 2023-10-03 18:03:14,012 WARNING [train.py:1204] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_605-257205-0_sp1.1 from training. Duration: 0.536375 2023-10-03 18:03:14,035 WARNING [train.py:1204] (2/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_442-320317-0_sp0.9 from training. Duration: 0.911125 2023-10-03 18:03:14,076 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_35361_210_4238_1_1533262810_7793476_675-87012-0_sp1.1 from training. Duration: 0.8090625 2023-10-03 18:03:15,403 WARNING [train.py:1204] (2/4) Exclude cut with ID _4184_210_2550_1_1526730192013_2219380_306-44736-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 18:03:17,283 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_124-113562-0 from training. Duration: 0.94 2023-10-03 18:03:19,213 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_2821_210_2715_1_1526108385_3763560_73-296673-0_sp0.9 from training. Duration: 0.92225 2023-10-03 18:03:20,509 WARNING [train.py:1204] (2/4) Exclude cut with ID _10301_210_5886_1_1529222275332_3910620_216-46959-0_sp1.1 from training. Duration: 0.92725 2023-10-03 18:03:21,993 WARNING [train.py:1204] (2/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_91-277684-0_sp0.9 from training. Duration: 0.82225 2023-10-03 18:03:23,324 WARNING [train.py:1204] (2/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_351-294650-0_sp0.9 from training. Duration: 0.7 2023-10-03 18:03:24,720 WARNING [train.py:1204] (2/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_209-205365-0 from training. Duration: 0.79 2023-10-03 18:03:26,103 WARNING [train.py:1204] (2/4) Exclude cut with ID _14603_210_4220_1_1530615688441_3097319_97-325053-0 from training. Duration: 0.92 2023-10-03 18:03:27,505 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_49348_210_7927_1_1533954500_3691300_109-276430-0_sp1.1 from training. Duration: 0.92725 2023-10-03 18:03:28,940 WARNING [train.py:1204] (2/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_912-47473-0 from training. Duration: 0.68 2023-10-03 18:03:30,246 WARNING [train.py:1204] (2/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_96-320736-0_sp0.9 from training. Duration: 0.9555625 2023-10-03 18:03:37,206 WARNING [train.py:1204] (2/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_1252-235481-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 18:03:37,418 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.bypass.scale_min, batch_count=1357273.3333333333, ans=0.2 2023-10-03 18:03:38,501 WARNING [train.py:1204] (2/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_813-261554-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 18:03:38,572 WARNING [train.py:1204] (2/4) Exclude cut with ID _21632_210_2253_1_1531994001765_3744140_110-274654-0_sp0.9 from training. Duration: 0.911125 2023-10-03 18:03:40,625 WARNING [train.py:1204] (2/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_381-317791-0_sp1.1 from training. Duration: 0.663625 2023-10-03 18:03:40,645 WARNING [train.py:1204] (2/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_51-117576-0_sp1.1 from training. Duration: 0.363625 2023-10-03 18:03:42,200 WARNING [train.py:1204] (2/4) Exclude cut with ID _42321_210_7770_1_1533468631352_4366895_104-215915-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 18:03:44,871 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_40896_210_15710_1_1533433956_7100802_216-22824-0_sp1.1 from training. Duration: 0.92725 2023-10-03 18:03:44,872 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_14831_210_7927_1_1530700200_5583610_181-27467-0 from training. Duration: 0.91 2023-10-03 18:03:46,296 WARNING [train.py:1204] (2/4) Exclude cut with ID _30304_210_6319_1_1532476827261_7582060_1024-265645-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 18:03:46,305 WARNING [train.py:1204] (2/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_412-233904-0_sp1.1 from training. Duration: 0.936375 2023-10-03 18:03:47,660 WARNING [train.py:1204] (2/4) Exclude cut with ID _30191_210_1664_1_1533189915047_7196670_911-223461-0_sp1.1 from training. Duration: 0.92725 2023-10-03 18:03:47,669 WARNING [train.py:1204] (2/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_558-148266-0_sp1.1 from training. Duration: 0.963625 2023-10-03 18:03:51,359 WARNING [train.py:1204] (2/4) Exclude cut with ID _58480_210_12392_1_1534811121111_3646160_212-164495-0_sp1.1 from training. Duration: 0.936375 2023-10-03 18:03:51,360 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_454-276429-0_sp0.9 from training. Duration: 0.9666875 2023-10-03 18:03:52,792 WARNING [train.py:1204] (2/4) Exclude cut with ID _24736_210_3279_1_1532332894218_2838700_73-197086-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 18:03:52,853 WARNING [train.py:1204] (2/4) Exclude cut with ID _5294_210_1747_1_1527249643928_3696179_529-242298-0_sp1.1 from training. Duration: 0.863625 2023-10-03 18:03:54,072 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_53555_210_5632_1_1534595220_7232297_345-171438-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 18:03:58,382 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_28211_210_6388_1_1532861506_4523224_22-25766-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 18:03:59,679 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7002_210_1794_1_1527939683_5157286_163-87552-0 from training. Duration: 0.86 2023-10-03 18:03:59,820 WARNING [train.py:1204] (2/4) Exclude cut with ID _56927_210_9969_1_1534848970840_3695229_409-225129-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 18:04:02,431 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_26192_210_7927_1_1532137269_1040115_21-219331-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 18:04:02,576 WARNING [train.py:1204] (2/4) Exclude cut with ID _15012_210_1968_1_1530856820429_5095350_662-210469-0 from training. Duration: 0.57 2023-10-03 18:04:06,626 INFO [train.py:1046] (2/4) Epoch 39, batch 1750, loss[loss=0.1709, simple_loss=0.2589, pruned_loss=0.0415, over 24584.00 frames. ], tot_loss[loss=0.1571, simple_loss=0.2366, pruned_loss=0.03882, over 4718088.96 frames. ], batch size: 71, lr: 2.61e-03, grad_scale: 16.0 2023-10-03 18:04:09,536 WARNING [train.py:1204] (2/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_582-191016-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 18:04:12,789 WARNING [train.py:1204] (2/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_270-210032-0_sp1.1 from training. Duration: 0.963625 2023-10-03 18:04:12,825 WARNING [train.py:1204] (2/4) Exclude cut with ID _6789_210_1534_1_1527857937218_3118950_262-48853-0_sp0.9 from training. Duration: 0.62225 2023-10-03 18:04:14,165 WARNING [train.py:1204] (2/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_847-41334-0 from training. Duration: 0.44 2023-10-03 18:04:14,198 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_40896_210_15710_1_1533433956_7100802_573-46532-0_sp1.1 from training. Duration: 0.92725 2023-10-03 18:04:17,409 WARNING [train.py:1204] (2/4) Exclude cut with ID _21881_210_3525_1_1531825242757_3798380_247-102591-0_sp1.1 from training. Duration: 0.8909375 2023-10-03 18:04:17,442 WARNING [train.py:1204] (2/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_200-21395-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 18:04:20,809 WARNING [train.py:1204] (2/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_711-15688-0 from training. Duration: 0.43 2023-10-03 18:04:21,445 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.3.encoder.layers.3.nonlin_attention.whiten2, num_groups=1, num_channels=512, metric=4.50 vs. limit=15.0 2023-10-03 18:04:24,046 WARNING [train.py:1204] (2/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_53-75025-0_sp1.1 from training. Duration: 0.963625 2023-10-03 18:04:26,639 WARNING [train.py:1204] (2/4) Exclude cut with ID _6194_210_1534_1_1527511826160_3941194_321-148641-0 from training. Duration: 0.64 2023-10-03 18:04:26,664 WARNING [train.py:1204] (2/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_337-294431-0_sp1.1 from training. Duration: 0.936375 2023-10-03 18:04:26,761 WARNING [train.py:1204] (2/4) Exclude cut with ID _21632_210_2253_1_1531994001765_3744140_110-274654-0_sp1.1 from training. Duration: 0.7454375 2023-10-03 18:04:27,007 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.bypass.scale_min, batch_count=1357473.3333333333, ans=0.2 2023-10-03 18:04:29,711 WARNING [train.py:1204] (2/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_135-174080-0_sp1.1 from training. Duration: 0.4818125 2023-10-03 18:04:31,096 WARNING [train.py:1204] (2/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_21-174514-0 from training. Duration: 0.43 2023-10-03 18:04:32,617 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_38965_210_4852_1_1533174906_3484735_215-294589-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 18:04:32,639 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_548-294316-0 from training. Duration: 0.81 2023-10-03 18:04:34,224 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.0.ff2_skip_rate, batch_count=1357473.3333333333, ans=0.0 2023-10-03 18:04:39,560 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_103-84010-0_sp1.1 from training. Duration: 0.62725 2023-10-03 18:04:41,282 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.600e+02 1.897e+02 2.069e+02 2.382e+02 4.230e+02, threshold=4.138e+02, percent-clipped=0.0 2023-10-03 18:04:44,136 WARNING [train.py:1204] (2/4) Exclude cut with ID _18199_210_6109_1_1531475789864_4288790_295-198247-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 18:04:44,163 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_13757_210_3528_1_1530440682_4107239_103-313224-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 18:04:48,699 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_34886_210_10633_1_1533203882_1755661_67-294673-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 18:04:48,710 WARNING [train.py:1204] (2/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_350-101734-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 18:04:50,169 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_37357_210_17024_1_1533089504_2316121_204-153986-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 18:04:52,189 WARNING [train.py:1204] (2/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_673-115569-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 18:04:52,950 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.3.encoder.layers.0.feed_forward3.out_whiten, num_groups=1, num_channels=512, metric=12.32 vs. limit=15.0 2023-10-03 18:04:53,628 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_489-230703-0_sp0.9 from training. Duration: 0.9444375 2023-10-03 18:04:53,675 WARNING [train.py:1204] (2/4) Exclude cut with ID _5250_210_5219_1_1527249561806_8692110_575-102496-0_sp1.1 from training. Duration: 0.936375 2023-10-03 18:04:55,042 WARNING [train.py:1204] (2/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_669-93163-0 from training. Duration: 0.98 2023-10-03 18:04:57,789 WARNING [train.py:1204] (2/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_249-281349-0_sp0.9 from training. Duration: 0.9444375 2023-10-03 18:05:00,537 WARNING [train.py:1204] (2/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_106-30782-0 from training. Duration: 0.8 2023-10-03 18:05:01,925 WARNING [train.py:1204] (2/4) Exclude cut with ID _8772_210_1850_1_1528635595972_3664011_398-320049-0_sp1.1 from training. Duration: 0.8 2023-10-03 18:05:03,341 WARNING [train.py:1204] (2/4) Exclude cut with ID _44778_210_2054_1_1533798127203_7443090_516-151727-0_sp1.1 from training. Duration: 0.963625 2023-10-03 18:05:03,394 WARNING [train.py:1204] (2/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_325-62802-0_sp1.1 from training. Duration: 0.7909375 2023-10-03 18:05:06,232 WARNING [train.py:1204] (2/4) Exclude cut with ID _14368_210_4220_1_1530532714870_3595529_93-58002-0_sp0.9 from training. Duration: 0.6333125 2023-10-03 18:05:07,539 WARNING [train.py:1204] (2/4) Exclude cut with ID _4681_210_3525_1_1527251040321_1235713_123-24428-0_sp1.1 from training. Duration: 0.436375 2023-10-03 18:05:07,597 WARNING [train.py:1204] (2/4) Exclude cut with ID _5250_210_5219_1_1527249561806_8692110_1185-56473-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 18:05:10,238 WARNING [train.py:1204] (2/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_96-244299-0_sp1.1 from training. Duration: 0.8 2023-10-03 18:05:10,455 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.conv_module1.balancer2.prob, batch_count=1357673.3333333333, ans=0.125 2023-10-03 18:05:10,513 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.feed_forward1.out_proj.dropout_p, batch_count=1357673.3333333333, ans=0.1 2023-10-03 18:05:13,599 WARNING [train.py:1204] (2/4) Exclude cut with ID _59500_210_8254_1_1534809610680_3736009_104-72815-0_sp1.1 from training. Duration: 0.963625 2023-10-03 18:05:16,821 WARNING [train.py:1204] (2/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_468-283113-0_sp1.1 from training. Duration: 0.87275 2023-10-03 18:05:18,246 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6239_210_4211_1_1527765584_4306698_528-102186-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 18:05:18,309 WARNING [train.py:1204] (2/4) Exclude cut with ID _45297_210_2721_1_1533715351381_3264949_351-147136-0 from training. Duration: 0.87 2023-10-03 18:05:18,314 WARNING [train.py:1204] (2/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_520-51744-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 18:05:19,798 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.bypass_mid.scale_min, batch_count=1357740.0, ans=0.2 2023-10-03 18:05:21,279 INFO [train.py:1046] (2/4) Epoch 39, batch 1800, loss[loss=0.1482, simple_loss=0.2258, pruned_loss=0.03529, over 24389.00 frames. ], tot_loss[loss=0.1563, simple_loss=0.2359, pruned_loss=0.03831, over 4713294.47 frames. ], batch size: 58, lr: 2.61e-03, grad_scale: 16.0 2023-10-03 18:05:21,340 WARNING [train.py:1204] (2/4) Exclude cut with ID _5166_210_2084_1_1527073197160_4087993_310-114452-0_sp1.1 from training. Duration: 0.536375 2023-10-03 18:05:21,343 WARNING [train.py:1204] (2/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_450-165710-0_sp1.1 from training. Duration: 0.92725 2023-10-03 18:05:21,353 WARNING [train.py:1204] (2/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_280-120942-0_sp1.1 from training. Duration: 0.7 2023-10-03 18:05:21,375 WARNING [train.py:1204] (2/4) Exclude cut with ID _65585_210_12210_1_1535450451584_3764098_112-294301-0_sp0.9 from training. Duration: 0.97775 2023-10-03 18:05:21,434 WARNING [train.py:1204] (2/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_98-51676-0_sp0.9 from training. Duration: 0.811125 2023-10-03 18:05:26,051 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_33348_210_5632_1_1533086912_3324030_85-28583-0_sp1.1 from training. Duration: 0.7454375 2023-10-03 18:05:26,257 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.conv_module2.balancer2.prob, batch_count=1357740.0, ans=0.125 2023-10-03 18:05:27,388 WARNING [train.py:1204] (2/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_336-208232-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 18:05:28,797 WARNING [train.py:1204] (2/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_339-104791-0_sp0.9 from training. Duration: 0.5444375 2023-10-03 18:05:31,514 WARNING [train.py:1204] (2/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_559-29840-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 18:05:33,104 WARNING [train.py:1204] (2/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_117-290916-0_sp0.9 from training. Duration: 0.5666875 2023-10-03 18:05:34,480 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_43802_210_2290_1_1533516203_8829504_185-112289-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 18:05:37,189 WARNING [train.py:1204] (2/4) Exclude cut with ID _59256_210_5986_1_1535076085997_3589120_155-298797-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 18:05:38,739 WARNING [train.py:1204] (2/4) Exclude cut with ID _44659_210_2721_1_1534036120061_6614860_246-206531-0_sp1.1 from training. Duration: 0.92725 2023-10-03 18:05:40,164 WARNING [train.py:1204] (2/4) Exclude cut with ID _23818_210_6228_1_1531917063884_3676920_469-151944-0_sp1.1 from training. Duration: 0.92725 2023-10-03 18:05:40,238 WARNING [train.py:1204] (2/4) Exclude cut with ID _13252_210_1925_1_1530671372047_3336909_88-276668-0_sp1.1 from training. Duration: 0.9 2023-10-03 18:05:41,701 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_15101_210_7927_1_1530786415_5033274_320-327527-0_sp1.1 from training. Duration: 0.87275 2023-10-03 18:05:42,922 WARNING [train.py:1204] (2/4) Exclude cut with ID _14998_210_5219_1_1530705582080_7474970_446-112117-0 from training. Duration: 0.9 2023-10-03 18:05:43,002 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_30280_210_1881_1_1532872779_3660139_400-254159-0_sp1.1 from training. Duration: 0.936375 2023-10-03 18:05:48,060 WARNING [train.py:1204] (2/4) Exclude cut with ID _40284_210_2084_1_1533513679814_3704279_158-345341-0_sp1.1 from training. Duration: 0.936375 2023-10-03 18:05:50,104 WARNING [train.py:1204] (2/4) Exclude cut with ID 3033-130750-0096-55598-0 from training. Duration: 0.83 2023-10-03 18:05:53,361 WARNING [train.py:1204] (2/4) Exclude cut with ID _9389_210_1664_1_1529217337301_5289269_379-128794-0 from training. Duration: 0.85 2023-10-03 18:05:53,383 WARNING [train.py:1204] (2/4) Exclude cut with ID _3675_210_4557_1_1526639934408_4182089_456-18305-0 from training. Duration: 0.76 2023-10-03 18:05:53,415 WARNING [train.py:1204] (2/4) Exclude cut with ID _29853_210_7497_1_1532505600851_6642110_571-249460-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 18:05:55,977 WARNING [train.py:1204] (2/4) Exclude cut with ID _4068_210_2629_1_1526697350395_1546259_230-325568-0_sp1.1 from training. Duration: 0.92725 2023-10-03 18:05:55,978 WARNING [train.py:1204] (2/4) Exclude cut with ID _34415_210_8750_1_1533085213375_3451116_213-267570-0_sp1.1 from training. Duration: 0.963625 2023-10-03 18:05:57,354 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6767_210_3273_1_1527855783_1718932_142-127132-0_sp1.1 from training. Duration: 0.863625 2023-10-03 18:06:01,606 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.ff2_skip_rate, batch_count=1357873.3333333333, ans=0.0 2023-10-03 18:06:04,130 WARNING [train.py:1204] (2/4) Exclude cut with ID IC0653W0001-322925 from training. Duration: 0.896 2023-10-03 18:06:05,558 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_4528_210_3273_1_1527076654_3276777_209-219804-0_sp1.1 from training. Duration: 0.62725 2023-10-03 18:06:06,931 WARNING [train.py:1204] (2/4) Exclude cut with ID _55844_210_2346_1_1534471212569_7625707_187-204971-0_sp1.1 from training. Duration: 0.936375 2023-10-03 18:06:07,051 WARNING [train.py:1204] (2/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_369-325184-0 from training. Duration: 0.99 2023-10-03 18:06:08,359 WARNING [train.py:1204] (2/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_791-45149-0 from training. Duration: 0.86 2023-10-03 18:06:08,404 WARNING [train.py:1204] (2/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_317-211107-0_sp1.1 from training. Duration: 0.836375 2023-10-03 18:06:09,831 WARNING [train.py:1204] (2/4) Exclude cut with ID _30800_210_22382_1_1532499347904_4515959_488-330913-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 18:06:10,097 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.feed_forward1.hidden_balancer.prob, batch_count=1357940.0, ans=0.125 2023-10-03 18:06:11,238 WARNING [train.py:1204] (2/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_506-42614-0_sp1.1 from training. Duration: 0.7181875 2023-10-03 18:06:15,199 WARNING [train.py:1204] (2/4) Exclude cut with ID _8696_210_1876_1_1528545632440_3345058_209-54069-0 from training. Duration: 0.67 2023-10-03 18:06:22,299 WARNING [train.py:1204] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_215-202973-0_sp1.1 from training. Duration: 0.8 2023-10-03 18:06:24,097 WARNING [train.py:1204] (2/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_321-215742-0 from training. Duration: 0.5 2023-10-03 18:06:24,152 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_40896_210_15710_1_1533433956_7100802_428-32114-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 18:06:24,160 WARNING [train.py:1204] (2/4) Exclude cut with ID _27013_210_1389_1_1532437173240_3252649_467-263151-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 18:06:25,499 WARNING [train.py:1204] (2/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_734-129761-0_sp0.9 from training. Duration: 0.911125 2023-10-03 18:06:25,534 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_521-214276-0 from training. Duration: 0.44 2023-10-03 18:06:27,024 WARNING [train.py:1204] (2/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_80-147301-0_sp1.1 from training. Duration: 0.763625 2023-10-03 18:06:28,426 WARNING [train.py:1204] (2/4) Exclude cut with ID _9679_210_7497_1_1529129212906_4363929_56-307796-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 18:06:31,317 WARNING [train.py:1204] (2/4) Exclude cut with ID _14832_210_8750_1_1530691351539_3171348_1-54315-0 from training. Duration: 0.87 2023-10-03 18:06:31,318 WARNING [train.py:1204] (2/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_558-155750-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 18:06:34,068 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_142-309370-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 18:06:34,105 WARNING [train.py:1204] (2/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_18-46756-0_sp0.9 from training. Duration: 0.87775 2023-10-03 18:06:34,112 WARNING [train.py:1204] (2/4) Exclude cut with ID _61148_210_6897_1_1535094022726_4217610_279-212159-0_sp1.1 from training. Duration: 0.936375 2023-10-03 18:06:35,318 INFO [train.py:1046] (2/4) Epoch 39, batch 1850, loss[loss=0.1416, simple_loss=0.226, pruned_loss=0.02861, over 24652.00 frames. ], tot_loss[loss=0.1566, simple_loss=0.2365, pruned_loss=0.03833, over 4712006.95 frames. ], batch size: 68, lr: 2.61e-03, grad_scale: 16.0 2023-10-03 18:06:36,780 WARNING [train.py:1204] (2/4) Exclude cut with ID _21987_210_5854_1_1531821500482_3637038_164-2641-0_sp1.1 from training. Duration: 0.936375 2023-10-03 18:06:38,120 WARNING [train.py:1204] (2/4) Exclude cut with ID _8064_210_5399_1_1528372299291_4282973_395-340852-0_sp1.1 from training. Duration: 0.8454375 2023-10-03 18:06:40,857 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1077-184261-0_sp1.1 from training. Duration: 0.963625 2023-10-03 18:06:40,865 WARNING [train.py:1204] (2/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_1176-204992-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 18:06:43,678 WARNING [train.py:1204] (2/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_563-347832-0_sp1.1 from training. Duration: 0.8090625 2023-10-03 18:06:43,752 WARNING [train.py:1204] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_380-204756-0_sp0.9 from training. Duration: 0.9555625 2023-10-03 18:06:49,410 WARNING [train.py:1204] (2/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_212-10501-0_sp1.1 from training. Duration: 0.8545625 2023-10-03 18:06:51,153 WARNING [train.py:1204] (2/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_445-117398-0 from training. Duration: 0.89 2023-10-03 18:06:53,516 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.nonlin_attention.balancer.prob, batch_count=1358140.0, ans=0.125 2023-10-03 18:06:54,652 WARNING [train.py:1204] (2/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_29-280622-0 from training. Duration: 0.82 2023-10-03 18:06:56,278 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.bypass.scale_min, batch_count=1358140.0, ans=0.2 2023-10-03 18:06:57,938 WARNING [train.py:1204] (2/4) Exclude cut with ID _26664_210_7770_1_1532258943280_3418270_187-80815-0 from training. Duration: 0.93 2023-10-03 18:07:00,842 WARNING [train.py:1204] (2/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_414-349640-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 18:07:00,862 WARNING [train.py:1204] (2/4) Exclude cut with ID _45021_210_9994_1_1533619751094_3330288_96-239010-0 from training. Duration: 0.91 2023-10-03 18:07:00,865 WARNING [train.py:1204] (2/4) Exclude cut with ID _5189_210_4557_1_1527248319428_3769229_199-53526-0_sp0.9 from training. Duration: 0.4666875 2023-10-03 18:07:05,112 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.nonlin_attention.balancer.prob, batch_count=1358206.6666666667, ans=0.125 2023-10-03 18:07:08,964 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.584e+02 1.977e+02 2.198e+02 2.562e+02 3.885e+02, threshold=4.397e+02, percent-clipped=0.0 2023-10-03 18:07:10,690 INFO [scaling.py:1118] (2/4) WithLoss: name=encoder.encoders.0.layers.1.self_attn_weights, loss-sum=0.000e+00 2023-10-03 18:07:11,957 WARNING [train.py:1204] (2/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_63-101996-0_sp1.1 from training. Duration: 0.8 2023-10-03 18:07:13,362 WARNING [train.py:1204] (2/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_199-86432-0 from training. Duration: 0.98 2023-10-03 18:07:15,574 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.2.encoder.layers.0.self_attn_weights.whiten_keys, num_groups=4, num_channels=128, metric=3.74 vs. limit=6.0 2023-10-03 18:07:16,081 WARNING [train.py:1204] (2/4) Exclude cut with ID _36399_210_16049_1_1532998771377_3698333_207-259013-0_sp1.1 from training. Duration: 0.92725 2023-10-03 18:07:16,123 WARNING [train.py:1204] (2/4) Exclude cut with ID _62574_210_2655_1_1535180407058_3933249_567-111756-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 18:07:19,026 WARNING [train.py:1204] (2/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_436-318113-0 from training. Duration: 0.75 2023-10-03 18:07:20,350 WARNING [train.py:1204] (2/4) Exclude cut with ID _53379_210_20034_1_1534222027363_3854654_43-305214-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 18:07:20,370 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_15126_210_6753_1_1530778677_2591185_113-157651-0_sp1.1 from training. Duration: 0.6181875 2023-10-03 18:07:21,884 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.feed_forward1.hidden_balancer.prob, batch_count=1358273.3333333333, ans=0.125 2023-10-03 18:07:23,679 WARNING [train.py:1204] (2/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_146-218763-0_sp1.1 from training. Duration: 0.7818125 2023-10-03 18:07:24,892 WARNING [train.py:1204] (2/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_714-310538-0_sp0.9 from training. Duration: 0.9555625 2023-10-03 18:07:25,058 WARNING [train.py:1204] (2/4) Exclude cut with ID _24271_210_12658_1_1531987270022_7190519_709-16721-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 18:07:28,430 WARNING [train.py:1204] (2/4) Exclude cut with ID _5190_210_4557_1_1527244368799_373439_28-197030-0_sp0.9 from training. Duration: 0.8 2023-10-03 18:07:30,389 WARNING [train.py:1204] (2/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_267-197536-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 18:07:30,420 WARNING [train.py:1204] (2/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_658-282610-0_sp1.1 from training. Duration: 0.4818125 2023-10-03 18:07:30,436 WARNING [train.py:1204] (2/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_257-67050-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 18:07:31,846 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_14906_210_7927_1_1530754190_9408633_961-53402-0_sp1.1 from training. Duration: 0.936375 2023-10-03 18:07:33,212 WARNING [train.py:1204] (2/4) Exclude cut with ID _60113_210_16271_1_1535164212481_4386079_490-280290-0_sp1.1 from training. Duration: 0.8909375 2023-10-03 18:07:36,022 WARNING [train.py:1204] (2/4) Exclude cut with ID _14100_210_8341_1_1530529320613_3678069_258-224599-0 from training. Duration: 0.77 2023-10-03 18:07:36,726 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.2.encoder.layers.1.conv_module2.whiten, num_groups=1, num_channels=384, metric=5.91 vs. limit=15.0 2023-10-03 18:07:37,362 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6961_210_2087_1_1527937168_6815011_565-326898-0_sp1.1 from training. Duration: 0.936375 2023-10-03 18:07:40,323 WARNING [train.py:1204] (2/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_239-316802-0_sp1.1 from training. Duration: 0.62725 2023-10-03 18:07:40,393 WARNING [train.py:1204] (2/4) Exclude cut with ID _55857_210_2346_1_1534644007831_6898209_745-98391-0_sp1.1 from training. Duration: 0.6909375 2023-10-03 18:07:40,397 WARNING [train.py:1204] (2/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_154-135696-0 from training. Duration: 0.8 2023-10-03 18:07:40,399 WARNING [train.py:1204] (2/4) Exclude cut with ID _14368_210_4220_1_1530532714870_3595529_93-58002-0 from training. Duration: 0.57 2023-10-03 18:07:43,164 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9025W0005-507335 from training. Duration: 0.896 2023-10-03 18:07:43,241 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9029W0034-509435 from training. Duration: 0.64 2023-10-03 18:07:44,651 WARNING [train.py:1204] (2/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_76-203867-0_sp1.1 from training. Duration: 0.6545625 2023-10-03 18:07:44,659 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_46639_210_8341_1_1533726088_4495500_66-346325-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 18:07:44,665 WARNING [train.py:1204] (2/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_145-137790-0_sp0.9 from training. Duration: 0.92225 2023-10-03 18:07:44,678 WARNING [train.py:1204] (2/4) Exclude cut with ID _36522_210_8887_1_1533177030611_1772478_7-327299-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 18:07:46,024 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9042W0009-515615 from training. Duration: 0.896 2023-10-03 18:07:46,033 WARNING [train.py:1204] (2/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_39-248870-0_sp0.9 from training. Duration: 0.7666875 2023-10-03 18:07:47,323 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_35441_210_16554_1_1533347572_4092680_362-242912-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 18:07:47,404 WARNING [train.py:1204] (2/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_216-343007-0_sp1.1 from training. Duration: 0.6 2023-10-03 18:07:48,758 INFO [train.py:1046] (2/4) Epoch 39, batch 1900, loss[loss=0.16, simple_loss=0.2505, pruned_loss=0.03475, over 24508.00 frames. ], tot_loss[loss=0.157, simple_loss=0.2368, pruned_loss=0.03859, over 4704488.71 frames. ], batch size: 66, lr: 2.61e-03, grad_scale: 16.0 2023-10-03 18:07:48,862 WARNING [train.py:1204] (2/4) Exclude cut with ID _3867_210_1883_1_1526641200575_4475830_272-215563-0_sp0.9 from training. Duration: 0.6333125 2023-10-03 18:07:50,398 WARNING [train.py:1204] (2/4) Exclude cut with ID _21783_210_11357_1_1531742458730_3762679_553-175603-0_sp1.1 from training. Duration: 0.92725 2023-10-03 18:07:51,558 WARNING [train.py:1204] (2/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_963-187109-0 from training. Duration: 0.99 2023-10-03 18:07:55,275 WARNING [train.py:1204] (2/4) Exclude cut with ID _44973_210_1511_1_1533794381163_3634770_495-211466-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 18:07:55,286 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9068W0002-529085 from training. Duration: 0.896 2023-10-03 18:07:55,308 WARNING [train.py:1204] (2/4) Exclude cut with ID _5294_210_1747_1_1527249643928_3696179_180-100713-0_sp1.1 from training. Duration: 0.736375 2023-10-03 18:07:55,383 WARNING [train.py:1204] (2/4) Exclude cut with ID _35799_210_5854_1_1533263487887_3529071_149-49689-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 18:08:00,245 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=1358406.6666666667, ans=0.1 2023-10-03 18:08:03,219 WARNING [train.py:1204] (2/4) Exclude cut with ID _67725_210_27767_1_1535608756671_5105359_36-220794-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 18:08:06,025 WARNING [train.py:1204] (2/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_338-96520-0_sp1.1 from training. Duration: 0.8545625 2023-10-03 18:08:07,387 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9104W0004-547160 from training. Duration: 0.896 2023-10-03 18:08:08,760 WARNING [train.py:1204] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_330-327861-0 from training. Duration: 0.67 2023-10-03 18:08:10,195 WARNING [train.py:1204] (2/4) Exclude cut with ID _14087_210_2253_1_1530529224512_3014029_156-257607-0_sp0.9 from training. Duration: 0.92225 2023-10-03 18:08:10,252 WARNING [train.py:1204] (2/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_476-322050-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 18:08:10,261 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9118W0017-553385 from training. Duration: 0.896 2023-10-03 18:08:10,289 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9120W0028-554435 from training. Duration: 0.896 2023-10-03 18:08:13,078 WARNING [train.py:1204] (2/4) Exclude cut with ID _56524_210_9642_1_1534572011779_3619170_145-310842-0 from training. Duration: 0.99 2023-10-03 18:08:14,535 WARNING [train.py:1204] (2/4) Exclude cut with ID _51631_210_8254_1_1534129228420_4084794_270-222508-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 18:08:16,089 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.conv_module1.balancer2.prob, batch_count=1358473.3333333333, ans=0.125 2023-10-03 18:08:17,178 WARNING [train.py:1204] (2/4) Exclude cut with ID _14268_210_9206_1_1530622771285_4714099_331-105351-0 from training. Duration: 0.65 2023-10-03 18:08:18,650 WARNING [train.py:1204] (2/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_40-129756-0 from training. Duration: 0.81 2023-10-03 18:08:28,708 WARNING [train.py:1204] (2/4) Exclude cut with ID _3541_210_2477_1_1526475408709_3263973_231-104736-0 from training. Duration: 0.78 2023-10-03 18:08:30,776 WARNING [train.py:1204] (2/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_214-291369-0 from training. Duration: 0.63 2023-10-03 18:08:30,786 WARNING [train.py:1204] (2/4) Exclude cut with ID _62574_210_2655_1_1535180407058_3933249_369-84347-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 18:08:32,582 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9210W0111-599240 from training. Duration: 0.896 2023-10-03 18:08:32,586 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_92-246270-0 from training. Duration: 0.85 2023-10-03 18:08:32,755 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=1358606.6666666667, ans=0.1 2023-10-03 18:08:33,858 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_37152_210_9697_1_1533106573_3844188_29-239070-0 from training. Duration: 0.98 2023-10-03 18:08:35,200 WARNING [train.py:1204] (2/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_239-22076-0 from training. Duration: 0.85 2023-10-03 18:08:35,203 WARNING [train.py:1204] (2/4) Exclude cut with ID _65962_210_29927_1_1535436026519_4174930_368-44639-0_sp1.1 from training. Duration: 0.97275 2023-10-03 18:08:35,448 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.attention_skip_rate, batch_count=1358606.6666666667, ans=0.0 2023-10-03 18:08:37,393 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.0.layers.1.nonlin_attention.whiten2, num_groups=1, num_channels=192, metric=6.60 vs. limit=15.0 2023-10-03 18:08:39,196 WARNING [train.py:1204] (2/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_152-212533-0 from training. Duration: 0.97 2023-10-03 18:08:41,898 WARNING [train.py:1204] (2/4) Exclude cut with ID _14603_210_4220_1_1530615688441_3097319_90-31383-0_sp1.1 from training. Duration: 0.7909375 2023-10-03 18:08:44,735 WARNING [train.py:1204] (2/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_196-152868-0_sp1.1 from training. Duration: 0.92725 2023-10-03 18:08:44,737 WARNING [train.py:1204] (2/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_140-181176-0 from training. Duration: 0.92 2023-10-03 18:08:46,196 WARNING [train.py:1204] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_168-32906-0_sp0.9 from training. Duration: 0.7666875 2023-10-03 18:08:48,983 WARNING [train.py:1204] (2/4) Exclude cut with ID _14922_210_6228_1_1530707435273_3693310_13-165512-0 from training. Duration: 0.82 2023-10-03 18:08:49,020 WARNING [train.py:1204] (2/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_381-317791-0_sp0.9 from training. Duration: 0.811125 2023-10-03 18:08:55,615 WARNING [train.py:1204] (2/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_63-267585-0_sp1.1 from training. Duration: 0.8181875 2023-10-03 18:08:55,617 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_326-303945-0_sp1.1 from training. Duration: 0.9 2023-10-03 18:08:55,630 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_194-143381-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 18:08:56,874 WARNING [train.py:1204] (2/4) Exclude cut with ID _39290_210_4220_1_1533862778760_3075118_194-16667-0_sp1.1 from training. Duration: 0.963625 2023-10-03 18:08:58,214 WARNING [train.py:1204] (2/4) Exclude cut with ID _15012_210_1968_1_1530856820429_5095350_662-210469-0_sp0.9 from training. Duration: 0.6333125 2023-10-03 18:08:59,493 WARNING [train.py:1204] (2/4) Exclude cut with ID _4697_210_2069_1_1526815801953_3823098_68-219807-0_sp1.1 from training. Duration: 0.42725 2023-10-03 18:09:00,857 WARNING [train.py:1204] (2/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_321-22821-0_sp0.9 from training. Duration: 0.988875 2023-10-03 18:09:02,732 INFO [train.py:1046] (2/4) Epoch 39, batch 1950, loss[loss=0.1666, simple_loss=0.2434, pruned_loss=0.04488, over 23464.00 frames. ], tot_loss[loss=0.158, simple_loss=0.2379, pruned_loss=0.03909, over 4718229.86 frames. ], batch size: 285, lr: 2.61e-03, grad_scale: 16.0 2023-10-03 18:09:04,208 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_14056_210_4163_1_1530604483_7572827_1037-45317-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 18:09:04,210 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_403-39619-0_sp1.1 from training. Duration: 0.763625 2023-10-03 18:09:06,334 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_20120_210_6753_1_1531643035_5482859_510-96996-0_sp0.9 from training. Duration: 0.9666875 2023-10-03 18:09:06,336 WARNING [train.py:1204] (2/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_225-180155-0_sp1.1 from training. Duration: 0.92725 2023-10-03 18:09:06,371 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_278-272612-0_sp0.9 from training. Duration: 0.811125 2023-10-03 18:09:07,729 WARNING [train.py:1204] (2/4) Exclude cut with ID _53713_210_24116_1_1534314487762_7416751_691-70244-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 18:09:11,907 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6788_210_1388_1_1527898698_4562021_91-197981-0_sp1.1 from training. Duration: 0.7545625 2023-10-03 18:09:12,143 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.ff2_skip_rate, batch_count=1358740.0, ans=0.0 2023-10-03 18:09:13,271 WARNING [train.py:1204] (2/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_60-18123-0_sp0.9 from training. Duration: 0.97775 2023-10-03 18:09:14,597 WARNING [train.py:1204] (2/4) Exclude cut with ID _35799_210_5854_1_1533263487887_3529071_5-214617-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 18:09:14,601 WARNING [train.py:1204] (2/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_890-5117-0_sp1.1 from training. Duration: 0.7090625 2023-10-03 18:09:16,137 WARNING [train.py:1204] (2/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_660-211088-0 from training. Duration: 0.62 2023-10-03 18:09:16,207 WARNING [train.py:1204] (2/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_314-126819-0_sp0.9 from training. Duration: 0.5555625 2023-10-03 18:09:17,601 WARNING [train.py:1204] (2/4) Exclude cut with ID _21632_210_2253_1_1531994001765_3744140_362-66225-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 18:09:18,931 WARNING [train.py:1204] (2/4) Exclude cut with ID _61118_210_9527_1_1534897668884_3752630_173-258555-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 18:09:20,389 WARNING [train.py:1204] (2/4) Exclude cut with ID _7719_210_3525_1_1528456153163_4296960_88-250259-0_sp0.9 from training. Duration: 0.9333125 2023-10-03 18:09:21,696 WARNING [train.py:1204] (2/4) Exclude cut with ID _15140_210_9288_1_1530791388450_4880149_539-153708-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 18:09:21,730 WARNING [train.py:1204] (2/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_253-42544-0_sp1.1 from training. Duration: 0.97275 2023-10-03 18:09:23,232 WARNING [train.py:1204] (2/4) Exclude cut with ID _33640_210_2655_1_1533117605479_4855469_479-296877-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 18:09:26,877 WARNING [train.py:1204] (2/4) Exclude cut with ID 3033-130750-0096-55598-0_sp1.1 from training. Duration: 0.7545625 2023-10-03 18:09:26,888 WARNING [train.py:1204] (2/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_191-41023-0_sp1.1 from training. Duration: 0.6454375 2023-10-03 18:09:26,904 WARNING [train.py:1204] (2/4) Exclude cut with ID _8365_210_2064_1_1528592311070_4677690_431-294102-0_sp1.1 from training. Duration: 0.8090625 2023-10-03 18:09:28,214 WARNING [train.py:1204] (2/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_893-296030-0_sp1.1 from training. Duration: 0.97275 2023-10-03 18:09:30,947 WARNING [train.py:1204] (2/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_401-343820-0_sp1.1 from training. Duration: 0.97275 2023-10-03 18:09:34,289 WARNING [train.py:1204] (2/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_127-566-0_sp1.1 from training. Duration: 0.763625 2023-10-03 18:09:34,291 WARNING [train.py:1204] (2/4) Exclude cut with ID _14100_210_8341_1_1530529320613_3678069_126-272847-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 18:09:34,313 WARNING [train.py:1204] (2/4) Exclude cut with ID _15133_210_6228_1_1530794512074_3080379_29-264458-0_sp1.1 from training. Duration: 0.52725 2023-10-03 18:09:34,313 WARNING [train.py:1204] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_127-119109-0 from training. Duration: 0.72 2023-10-03 18:09:35,713 WARNING [train.py:1204] (2/4) Exclude cut with ID _6537_210_1883_1_1527987559710_3597129_382-133062-0_sp1.1 from training. Duration: 0.6181875 2023-10-03 18:09:35,731 WARNING [train.py:1204] (2/4) Exclude cut with ID _29529_210_7234_1_1532393961455_3977460_391-219682-0_sp1.1 from training. Duration: 0.936375 2023-10-03 18:09:35,773 WARNING [train.py:1204] (2/4) Exclude cut with ID _37537_210_2253_1_1533171758568_3421949_96-137189-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 18:09:35,967 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.conv_skip_rate, batch_count=1358873.3333333333, ans=0.0 2023-10-03 18:09:37,504 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.707e+02 1.953e+02 2.190e+02 2.399e+02 3.415e+02, threshold=4.380e+02, percent-clipped=0.0 2023-10-03 18:09:39,032 WARNING [train.py:1204] (2/4) Exclude cut with ID _58414_210_8254_1_1535421609162_7151182_15-276301-0_sp1.1 from training. Duration: 0.97275 2023-10-03 18:09:41,752 WARNING [train.py:1204] (2/4) Exclude cut with ID _59502_210_12210_1_1535107024698_1835139_110-289074-0_sp1.1 from training. Duration: 0.92725 2023-10-03 18:09:45,846 WARNING [train.py:1204] (2/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_367-61330-0_sp1.1 from training. Duration: 0.6818125 2023-10-03 18:09:48,542 WARNING [train.py:1204] (2/4) Exclude cut with ID _45297_210_2721_1_1533715351381_3264949_353-9541-0_sp1.1 from training. Duration: 0.8818125 2023-10-03 18:09:48,576 WARNING [train.py:1204] (2/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_85-138257-0_sp0.9 from training. Duration: 0.788875 2023-10-03 18:09:48,614 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_3787_210_2040_1_1526477544_5478586_603-120324-0 from training. Duration: 0.81 2023-10-03 18:09:49,977 WARNING [train.py:1204] (2/4) Exclude cut with ID _42863_210_9070_1_1533902398788_3696920_411-204131-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 18:09:53,945 WARNING [train.py:1204] (2/4) Exclude cut with ID _30196_210_1664_1_1533708496522_7074140_822-289909-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 18:09:54,645 WARNING [train.py:1204] (2/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_32-335103-0_sp0.9 from training. Duration: 0.911125 2023-10-03 18:09:55,913 WARNING [train.py:1204] (2/4) Exclude cut with ID _6493_210_1747_1_1528459210424_4012470_513-140967-0_sp0.9 from training. Duration: 0.87775 2023-10-03 18:10:04,193 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_47896_210_20508_1_1534492518_5958868_439-257766-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 18:10:05,560 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_1294-63152-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 18:10:08,776 WARNING [train.py:1204] (2/4) Exclude cut with ID _59170_210_6721_1_1534983633960_4203335_319-166512-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 18:10:10,148 WARNING [train.py:1204] (2/4) Exclude cut with ID _3541_210_2477_1_1526475408709_3263973_146-3470-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 18:10:12,788 WARNING [train.py:1204] (2/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_678-159307-0_sp1.1 from training. Duration: 0.77275 2023-10-03 18:10:12,831 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_343-48822-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 18:10:14,179 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_4692_210_3528_1_1526899755_4323254_107-44401-0 from training. Duration: 0.86 2023-10-03 18:10:14,184 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6291_210_3273_1_1527683649_1078498_74-292608-0_sp1.1 from training. Duration: 0.6545625 2023-10-03 18:10:15,660 WARNING [train.py:1204] (2/4) Exclude cut with ID _55644_210_11856_1_1534593599582_4103719_293-248694-0_sp1.1 from training. Duration: 0.97275 2023-10-03 18:10:15,736 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_3172_210_2087_1_1526292929_3987865_197-97762-0 from training. Duration: 0.75 2023-10-03 18:10:17,157 INFO [train.py:1046] (2/4) Epoch 39, batch 2000, loss[loss=0.1886, simple_loss=0.2667, pruned_loss=0.05524, over 24362.00 frames. ], tot_loss[loss=0.1594, simple_loss=0.2393, pruned_loss=0.03978, over 4710234.43 frames. ], batch size: 77, lr: 2.61e-03, grad_scale: 32.0 2023-10-03 18:10:17,322 WARNING [train.py:1204] (2/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_264-348896-0_sp0.9 from training. Duration: 0.9444375 2023-10-03 18:10:20,023 WARNING [train.py:1204] (2/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_209-205365-0_sp0.9 from training. Duration: 0.87775 2023-10-03 18:10:20,090 WARNING [train.py:1204] (2/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_222-198127-0_sp0.9 from training. Duration: 0.9333125 2023-10-03 18:10:20,853 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.2.encoder.layers.2.feed_forward2.out_whiten, num_groups=1, num_channels=384, metric=4.63 vs. limit=15.0 2023-10-03 18:10:21,493 WARNING [train.py:1204] (2/4) Exclude cut with ID _57572_210_5517_1_1534921219624_3809484_52-43530-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 18:10:22,856 WARNING [train.py:1204] (2/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_96-320736-0_sp1.1 from training. Duration: 0.7818125 2023-10-03 18:10:24,835 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_13091_210_7925_1_1530609447_8987354_1159-225508-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 18:10:28,786 WARNING [train.py:1204] (2/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_237-44332-0 from training. Duration: 0.94 2023-10-03 18:10:28,837 WARNING [train.py:1204] (2/4) Exclude cut with ID _7970_210_1883_1_1528455616296_3472230_25-178407-0_sp0.9 from training. Duration: 0.788875 2023-10-03 18:10:32,459 WARNING [train.py:1204] (2/4) Exclude cut with ID _29003_210_19596_1_1532507391720_3894098_357-257062-0_sp1.1 from training. Duration: 0.936375 2023-10-03 18:10:33,871 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_809-17156-0 from training. Duration: 0.73 2023-10-03 18:10:35,313 WARNING [train.py:1204] (2/4) Exclude cut with ID _7160_210_1876_1_1527940923231_3703799_302-150728-0_sp1.1 from training. Duration: 0.7090625 2023-10-03 18:10:35,338 WARNING [train.py:1204] (2/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_444-28041-0_sp0.9 from training. Duration: 0.9444375 2023-10-03 18:10:38,497 WARNING [train.py:1204] (2/4) Exclude cut with ID _52892_210_6721_1_1534378941259_4124129_278-251666-0_sp1.1 from training. Duration: 0.92725 2023-10-03 18:10:39,811 WARNING [train.py:1204] (2/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_115-326778-0 from training. Duration: 0.93 2023-10-03 18:10:41,289 WARNING [train.py:1204] (2/4) Exclude cut with ID _41391_210_7994_1_1533603771872_3638201_31-188961-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 18:10:43,920 WARNING [train.py:1204] (2/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_1073-6264-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 18:10:43,939 WARNING [train.py:1204] (2/4) Exclude cut with ID _66868_210_9527_1_1535509287954_4561140_435-259515-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 18:10:45,353 WARNING [train.py:1204] (2/4) Exclude cut with ID _14886_210_4220_1_1530702020355_3516190_159-73748-0 from training. Duration: 0.83 2023-10-03 18:10:46,676 WARNING [train.py:1204] (2/4) Exclude cut with ID _15150_210_8341_1_1530752411803_7256384_469-239509-0_sp1.1 from training. Duration: 0.6181875 2023-10-03 18:10:47,025 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.ff3_skip_rate, batch_count=1359206.6666666667, ans=0.0 2023-10-03 18:10:48,080 WARNING [train.py:1204] (2/4) Exclude cut with ID _8064_210_5399_1_1528372299291_4282973_395-340852-0 from training. Duration: 0.93 2023-10-03 18:10:48,083 WARNING [train.py:1204] (2/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_138-187031-0_sp1.1 from training. Duration: 0.9 2023-10-03 18:10:50,791 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_13757_210_3528_1_1530440682_4107239_48-106347-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 18:10:52,134 WARNING [train.py:1204] (2/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_164-224052-0_sp1.1 from training. Duration: 0.636375 2023-10-03 18:10:52,138 WARNING [train.py:1204] (2/4) Exclude cut with ID _59288_210_5854_1_1535077757774_3412246_408-322648-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 18:10:52,171 WARNING [train.py:1204] (2/4) Exclude cut with ID _30191_210_1664_1_1533189915047_7196670_827-36359-0_sp1.1 from training. Duration: 0.963625 2023-10-03 18:10:53,531 WARNING [train.py:1204] (2/4) Exclude cut with ID _4680_210_3525_1_1527249757953_893386_75-78979-0_sp1.1 from training. Duration: 0.82725 2023-10-03 18:10:53,579 WARNING [train.py:1204] (2/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_383-165234-0 from training. Duration: 0.94 2023-10-03 18:10:58,025 WARNING [train.py:1204] (2/4) Exclude cut with ID _19896_210_1883_1_1531709022662_6649569_94-229927-0 from training. Duration: 0.97 2023-10-03 18:10:58,032 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6788_210_1388_1_1527898698_4562021_180-75288-0_sp1.1 from training. Duration: 0.9 2023-10-03 18:10:58,039 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_26433_210_12108_1_1532157158_4056480_54-321349-0_sp1.1 from training. Duration: 0.97275 2023-10-03 18:11:00,160 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.4.encoder.layers.1.feed_forward2.out_whiten, num_groups=1, num_channels=384, metric=9.33 vs. limit=15.0 2023-10-03 18:11:01,505 WARNING [train.py:1204] (2/4) Exclude cut with ID _56108_210_16380_1_1534505446159_3887959_265-161493-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 18:11:01,789 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.balancer1.prob, batch_count=1359273.3333333333, ans=0.125 2023-10-03 18:11:02,971 WARNING [train.py:1204] (2/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_395-111602-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 18:11:02,990 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_154-316450-0_sp0.9 from training. Duration: 0.8444375 2023-10-03 18:11:04,910 WARNING [train.py:1204] (2/4) Exclude cut with ID _43552_210_8913_1_1533726031059_3523429_381-116041-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 18:11:06,631 WARNING [train.py:1204] (2/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_65-104124-0_sp1.1 from training. Duration: 0.963625 2023-10-03 18:11:08,059 WARNING [train.py:1204] (2/4) Exclude cut with ID _6536_210_1883_1_1527850783339_3319069_169-23811-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 18:11:08,092 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_289-180904-0_sp0.9 from training. Duration: 0.8444375 2023-10-03 18:11:08,103 WARNING [train.py:1204] (2/4) Exclude cut with ID _56406_210_9288_1_1534899618664_3496149_226-242400-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 18:11:08,212 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_24899_210_16554_1_1532046422_3827462_276-216330-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 18:11:10,956 WARNING [train.py:1204] (2/4) Exclude cut with ID _29345_210_4381_1_1532689168263_3860169_198-174328-0_sp1.1 from training. Duration: 0.82725 2023-10-03 18:11:12,256 WARNING [train.py:1204] (2/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_338-96520-0 from training. Duration: 0.94 2023-10-03 18:11:15,104 WARNING [train.py:1204] (2/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_7-111954-0_sp1.1 from training. Duration: 0.5090625 2023-10-03 18:11:16,526 WARNING [train.py:1204] (2/4) Exclude cut with ID _36692_210_5665_1_1533608967420_398809_8-16261-0_sp1.1 from training. Duration: 0.97275 2023-10-03 18:11:19,450 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.ff3_skip_rate, batch_count=1359340.0, ans=0.0 2023-10-03 18:11:20,575 WARNING [train.py:1204] (2/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_86-212657-0_sp1.1 from training. Duration: 0.97275 2023-10-03 18:11:20,582 WARNING [train.py:1204] (2/4) Exclude cut with ID _44659_210_2721_1_1534036120061_6614860_626-298884-0_sp1.1 from training. Duration: 0.8818125 2023-10-03 18:11:20,741 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.0.feed_forward3.hidden_balancer.prob, batch_count=1359340.0, ans=0.125 2023-10-03 18:11:25,316 WARNING [train.py:1204] (2/4) Exclude cut with ID _49755_210_18053_1_1534581005846_3218709_120-257918-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 18:11:26,773 WARNING [train.py:1204] (2/4) Exclude cut with ID _21881_210_3525_1_1531825242757_3798380_400-113821-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 18:11:26,776 WARNING [train.py:1204] (2/4) Exclude cut with ID _21783_210_11357_1_1531742458730_3762679_387-236773-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 18:11:28,095 WARNING [train.py:1204] (2/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_613-232856-0_sp1.1 from training. Duration: 0.4909375 2023-10-03 18:11:28,124 WARNING [train.py:1204] (2/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_181-78632-0_sp0.9 from training. Duration: 0.8666875 2023-10-03 18:11:29,442 WARNING [train.py:1204] (2/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_175-17485-0_sp1.1 from training. Duration: 0.97275 2023-10-03 18:11:29,515 WARNING [train.py:1204] (2/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_1004-183422-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 18:11:30,755 INFO [train.py:1046] (2/4) Epoch 39, batch 2050, loss[loss=0.1485, simple_loss=0.2139, pruned_loss=0.04154, over 23363.00 frames. ], tot_loss[loss=0.1591, simple_loss=0.2386, pruned_loss=0.03975, over 4705487.31 frames. ], batch size: 285, lr: 2.61e-03, grad_scale: 8.0 2023-10-03 18:11:34,675 WARNING [train.py:1204] (2/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_32-65962-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 18:11:36,002 WARNING [train.py:1204] (2/4) Exclude cut with ID _15458_210_8341_1_1530858614263_3834410_40-298664-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 18:11:41,501 WARNING [train.py:1204] (2/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_380-261863-0_sp1.1 from training. Duration: 0.963625 2023-10-03 18:11:43,034 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_372-173012-0_sp1.1 from training. Duration: 0.863625 2023-10-03 18:11:44,402 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_57-82205-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 18:11:44,453 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_3691_210_3528_1_1526468169_4062053_355-334432-0_sp0.9 from training. Duration: 0.9666875 2023-10-03 18:11:45,882 WARNING [train.py:1204] (2/4) Exclude cut with ID _19023_210_4353_1_1531378892552_5820780_28-36615-0 from training. Duration: 0.97 2023-10-03 18:11:45,887 WARNING [train.py:1204] (2/4) Exclude cut with ID _29853_210_7497_1_1532505600851_6642110_286-251333-0_sp1.1 from training. Duration: 0.92725 2023-10-03 18:11:47,408 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.feed_forward2.hidden_balancer.prob, batch_count=1359473.3333333333, ans=0.125 2023-10-03 18:11:48,520 WARNING [train.py:1204] (2/4) Exclude cut with ID _31602_210_5854_1_1532677021456_4458250_518-80679-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 18:11:48,568 WARNING [train.py:1204] (2/4) Exclude cut with ID _14922_210_6228_1_1530707435273_3693310_38-187851-0_sp0.9 from training. Duration: 0.688875 2023-10-03 18:11:57,455 WARNING [train.py:1204] (2/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_39-248870-0_sp1.1 from training. Duration: 0.62725 2023-10-03 18:11:57,477 WARNING [train.py:1204] (2/4) Exclude cut with ID _16908_210_5724_1_1531119157248_3772700_154-31455-0_sp1.1 from training. Duration: 0.97275 2023-10-03 18:11:58,875 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8453_210_4211_1_1528502940_8562730_347-330673-0 from training. Duration: 0.72 2023-10-03 18:11:59,026 WARNING [train.py:1204] (2/4) Exclude cut with ID _30191_210_1664_1_1533189915047_7196670_674-204247-0_sp1.1 from training. Duration: 0.97275 2023-10-03 18:12:01,043 WARNING [train.py:1204] (2/4) Exclude cut with ID _8833_210_5219_1_1528592665210_7553059_329-125156-0 from training. Duration: 0.82 2023-10-03 18:12:01,088 WARNING [train.py:1204] (2/4) Exclude cut with ID _4779_210_2084_1_1527318022944_3914893_500-285013-0_sp1.1 from training. Duration: 0.62725 2023-10-03 18:12:06,136 WARNING [train.py:1204] (2/4) Exclude cut with ID _14421_210_8750_1_1530604795825_3542290_190-313578-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 18:12:07,476 WARNING [train.py:1204] (2/4) Exclude cut with ID _35799_210_5854_1_1533263487887_3529071_538-273574-0_sp1.1 from training. Duration: 0.936375 2023-10-03 18:12:08,629 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.636e+02 1.929e+02 2.107e+02 2.301e+02 3.275e+02, threshold=4.215e+02, percent-clipped=0.0 2023-10-03 18:12:08,777 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_14928_210_5632_1_1530773461_4714000_454-112198-0_sp1.1 from training. Duration: 0.6 2023-10-03 18:12:10,074 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_468-143126-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 18:12:12,644 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_4528_210_3273_1_1527076654_3276777_258-121681-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 18:12:12,733 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_397-132565-0_sp1.1 from training. Duration: 0.9 2023-10-03 18:12:12,751 WARNING [train.py:1204] (2/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_454-123002-0_sp0.9 from training. Duration: 0.7666875 2023-10-03 18:12:16,842 WARNING [train.py:1204] (2/4) Exclude cut with ID _27052_210_5986_1_1532329197187_3585009_297-86890-0_sp1.1 from training. Duration: 0.936375 2023-10-03 18:12:18,289 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_51225_210_3601_1_1534400634_2327383_55-301844-0_sp1.1 from training. Duration: 0.8454375 2023-10-03 18:12:21,011 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6786_210_1388_1_1527839699_4194495_378-46070-0_sp0.9 from training. Duration: 0.8 2023-10-03 18:12:22,420 WARNING [train.py:1204] (2/4) Exclude cut with ID _42334_210_7770_1_1534206610396_3605880_55-19758-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 18:12:25,700 WARNING [train.py:1204] (2/4) Exclude cut with ID _14884_210_2252_1_1530707459689_5561250_114-201399-0_sp0.9 from training. Duration: 0.8444375 2023-10-03 18:12:31,209 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_99-251314-0_sp0.9 from training. Duration: 0.9666875 2023-10-03 18:12:32,553 WARNING [train.py:1204] (2/4) Exclude cut with ID _26528_210_9070_1_1532570410842_3851310_497-97218-0 from training. Duration: 0.86 2023-10-03 18:12:37,771 WARNING [train.py:1204] (2/4) Exclude cut with ID _55328_210_27767_1_1534580973260_4088170_230-317241-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 18:12:39,613 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_125-203105-0_sp0.9 from training. Duration: 0.888875 2023-10-03 18:12:41,013 WARNING [train.py:1204] (2/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_55-31904-0_sp1.1 from training. Duration: 0.8 2023-10-03 18:12:42,503 WARNING [train.py:1204] (2/4) Exclude cut with ID _8064_210_5399_1_1528372299291_4282973_18-174647-0 from training. Duration: 0.91 2023-10-03 18:12:45,285 INFO [train.py:1046] (2/4) Epoch 39, batch 2100, loss[loss=0.1393, simple_loss=0.1944, pruned_loss=0.04206, over 19466.00 frames. ], tot_loss[loss=0.1581, simple_loss=0.2372, pruned_loss=0.03949, over 4691011.16 frames. ], batch size: 388, lr: 2.61e-03, grad_scale: 8.0 2023-10-03 18:12:46,599 WARNING [train.py:1204] (2/4) Exclude cut with ID IC0165W0005-81726 from training. Duration: 0.952 2023-10-03 18:12:46,599 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_31583_210_3852_1_1532595851_4024413_148-171847-0_sp1.1 from training. Duration: 0.963625 2023-10-03 18:12:46,657 WARNING [train.py:1204] (2/4) Exclude cut with ID _21989_210_5854_1_1531899214931_3271360_116-138769-0_sp1.1 from training. Duration: 0.936375 2023-10-03 18:12:46,712 WARNING [train.py:1204] (2/4) Exclude cut with ID _66070_210_7640_1_1535441393096_7805310_489-292631-0_sp1.1 from training. Duration: 0.7181875 2023-10-03 18:12:48,054 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_20120_210_6753_1_1531643035_5482859_202-27960-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 18:12:48,070 WARNING [train.py:1204] (2/4) Exclude cut with ID _5294_210_1747_1_1527249643928_3696179_342-270278-0 from training. Duration: 0.55 2023-10-03 18:12:48,101 WARNING [train.py:1204] (2/4) Exclude cut with ID _38323_210_12108_1_1533121233369_3303049_416-11919-0 from training. Duration: 0.96 2023-10-03 18:12:50,742 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6545_210_2040_1_1527774670_4245258_407-48070-0_sp0.9 from training. Duration: 0.8444375 2023-10-03 18:12:53,627 WARNING [train.py:1204] (2/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_503-30719-0_sp1.1 from training. Duration: 0.8909375 2023-10-03 18:12:55,629 WARNING [train.py:1204] (2/4) Exclude cut with ID _49643_210_7989_1_1534640244224_3939031_234-326497-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 18:12:57,058 WARNING [train.py:1204] (2/4) Exclude cut with ID _14170_210_5219_1_1530525949868_4469650_301-77873-0_sp1.1 from training. Duration: 0.963625 2023-10-03 18:12:57,113 WARNING [train.py:1204] (2/4) Exclude cut with ID _45297_210_2721_1_1533715351381_3264949_152-154697-0_sp1.1 from training. Duration: 0.97275 2023-10-03 18:12:57,127 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_3371_210_1794_1_1526384462_5580777_396-339812-0 from training. Duration: 0.85 2023-10-03 18:12:58,465 WARNING [train.py:1204] (2/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_399-156791-0_sp1.1 from training. Duration: 0.7545625 2023-10-03 18:12:59,047 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.4.encoder.layers.1.self_attn2.whiten, num_groups=1, num_channels=384, metric=12.05 vs. limit=22.5 2023-10-03 18:12:59,762 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7696_210_2040_1_1528207167_3716099_117-125434-0 from training. Duration: 0.76 2023-10-03 18:12:59,774 WARNING [train.py:1204] (2/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_96-244299-0 from training. Duration: 0.88 2023-10-03 18:13:01,238 WARNING [train.py:1204] (2/4) Exclude cut with ID _37892_210_19596_1_1533198677897_3964058_519-343462-0_sp1.1 from training. Duration: 0.92725 2023-10-03 18:13:02,587 WARNING [train.py:1204] (2/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_202-183922-0_sp1.1 from training. Duration: 0.77275 2023-10-03 18:13:02,594 WARNING [train.py:1204] (2/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_453-133983-0 from training. Duration: 0.78 2023-10-03 18:13:02,637 WARNING [train.py:1204] (2/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_178-39722-0_sp1.1 from training. Duration: 0.4454375 2023-10-03 18:13:02,889 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.bypass.scale_min, batch_count=1359806.6666666667, ans=0.2 2023-10-03 18:13:09,572 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_15126_210_6753_1_1530778677_2591185_113-157651-0 from training. Duration: 0.68 2023-10-03 18:13:09,573 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_271-234962-0_sp1.1 from training. Duration: 0.7181875 2023-10-03 18:13:11,042 WARNING [train.py:1204] (2/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_195-64967-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 18:13:11,081 WARNING [train.py:1204] (2/4) Exclude cut with ID _43552_210_8913_1_1533726031059_3523429_365-215065-0_sp1.1 from training. Duration: 0.936375 2023-10-03 18:13:14,016 WARNING [train.py:1204] (2/4) Exclude cut with ID _31602_210_5854_1_1532677021456_4458250_249-96604-0_sp0.9 from training. Duration: 0.988875 2023-10-03 18:13:14,171 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.feed_forward3.hidden_balancer.prob, batch_count=1359873.3333333333, ans=0.125 2023-10-03 18:13:15,417 WARNING [train.py:1204] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_307-158462-0 from training. Duration: 0.69 2023-10-03 18:13:15,467 WARNING [train.py:1204] (2/4) Exclude cut with ID _30918_210_6400_1_1532754078600_3125920_407-223394-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 18:13:16,592 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6673_210_3273_1_1527767790_3173240_351-167954-0_sp0.9 from training. Duration: 0.5333125 2023-10-03 18:13:18,051 WARNING [train.py:1204] (2/4) Exclude cut with ID _4866_210_2577_1_1527404412284_4364659_256-306830-0 from training. Duration: 0.87 2023-10-03 18:13:19,354 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_17613_210_2151_1_1531137569_2366160_222-131118-0_sp1.1 from training. Duration: 0.963625 2023-10-03 18:13:19,365 WARNING [train.py:1204] (2/4) Exclude cut with ID _45560_210_21982_1_1533812603951_3584999_111-6376-0 from training. Duration: 0.84 2023-10-03 18:13:19,392 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_216-73841-0 from training. Duration: 0.96 2023-10-03 18:13:20,780 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_14058_210_4163_1_1530690953_6327180_49-210687-0 from training. Duration: 0.94 2023-10-03 18:13:22,206 WARNING [train.py:1204] (2/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_506-42614-0_sp0.9 from training. Duration: 0.87775 2023-10-03 18:13:23,669 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_3133_210_2087_1_1526106446_2324690_25-332138-0_sp1.1 from training. Duration: 0.82725 2023-10-03 18:13:24,729 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.0.layers.0.feed_forward1.out_whiten, num_groups=1, num_channels=192, metric=8.30 vs. limit=15.0 2023-10-03 18:13:26,953 WARNING [train.py:1204] (2/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_589-9070-0_sp1.1 from training. Duration: 0.6454375 2023-10-03 18:13:28,342 WARNING [train.py:1204] (2/4) Exclude cut with ID _6194_210_1534_1_1527511826160_3941194_14-282876-0_sp1.1 from training. Duration: 0.6454375 2023-10-03 18:13:29,719 WARNING [train.py:1204] (2/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_1166-90654-0_sp1.1 from training. Duration: 0.92725 2023-10-03 18:13:31,117 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_14056_210_4163_1_1530604483_7572827_218-255858-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 18:13:31,118 WARNING [train.py:1204] (2/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_1285-256622-0 from training. Duration: 0.84 2023-10-03 18:13:31,139 WARNING [train.py:1204] (2/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_205-286261-0_sp1.1 from training. Duration: 0.963625 2023-10-03 18:13:31,154 WARNING [train.py:1204] (2/4) Exclude cut with ID _68777_210_4353_1_1535810390731_3117904_6-154119-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 18:13:32,449 WARNING [train.py:1204] (2/4) Exclude cut with ID _22982_210_1883_1_1531882790447_5449409_565-199393-0_sp1.1 from training. Duration: 0.92725 2023-10-03 18:13:32,459 WARNING [train.py:1204] (2/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_278-154492-0 from training. Duration: 0.76 2023-10-03 18:13:34,409 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_426-313557-0 from training. Duration: 0.53 2023-10-03 18:13:35,912 WARNING [train.py:1204] (2/4) Exclude cut with ID _41817_210_18835_1_1533434477160_7423049_555-293448-0 from training. Duration: 0.8 2023-10-03 18:13:41,023 WARNING [train.py:1204] (2/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_32-335103-0_sp1.1 from training. Duration: 0.7454375 2023-10-03 18:13:46,412 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_2655_210_2087_1_1525688959_3897400_202-147557-0_sp1.1 from training. Duration: 0.77275 2023-10-03 18:13:46,435 WARNING [train.py:1204] (2/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_158-250116-0 from training. Duration: 0.62 2023-10-03 18:13:50,876 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.1.conv_module2.balancer2.min_abs, batch_count=1360006.6666666667, ans=0.5 2023-10-03 18:13:52,091 WARNING [train.py:1204] (2/4) Exclude cut with ID _55429_210_10626_1_1534467588527_7174143_528-88083-0_sp1.1 from training. Duration: 0.963625 2023-10-03 18:13:53,580 WARNING [train.py:1204] (2/4) Exclude cut with ID _8030_210_7497_1_1528444886781_3531089_32-31837-0_sp1.1 from training. Duration: 0.8818125 2023-10-03 18:13:54,916 WARNING [train.py:1204] (2/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_744-86168-0_sp1.1 from training. Duration: 0.97275 2023-10-03 18:13:54,931 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1113-7369-0_sp0.9 from training. Duration: 0.9444375 2023-10-03 18:13:54,956 WARNING [train.py:1204] (2/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_124-345840-0_sp0.9 from training. Duration: 0.57775 2023-10-03 18:13:55,000 WARNING [train.py:1204] (2/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_328-264776-0_sp0.9 from training. Duration: 0.7444375 2023-10-03 18:13:57,739 WARNING [train.py:1204] (2/4) Exclude cut with ID _18895_210_6228_1_1531357274234_3784930_469-166615-0_sp1.1 from training. Duration: 0.963625 2023-10-03 18:13:57,754 WARNING [train.py:1204] (2/4) Exclude cut with ID _8395_210_1531_1_1528597871523_4411579_431-132470-0_sp0.9 from training. Duration: 0.82225 2023-10-03 18:13:59,011 WARNING [train.py:1204] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_554-45617-0_sp1.1 from training. Duration: 0.9 2023-10-03 18:13:59,033 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_35361_210_4238_1_1533262810_7793476_524-188242-0_sp1.1 from training. Duration: 0.92725 2023-10-03 18:14:00,988 WARNING [train.py:1204] (2/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_938-205135-0 from training. Duration: 0.87 2023-10-03 18:14:02,215 INFO [train.py:1046] (2/4) Epoch 39, batch 2150, loss[loss=0.1507, simple_loss=0.2237, pruned_loss=0.03883, over 23515.00 frames. ], tot_loss[loss=0.1572, simple_loss=0.2364, pruned_loss=0.03902, over 4699991.81 frames. ], batch size: 285, lr: 2.61e-03, grad_scale: 8.0 2023-10-03 18:14:02,297 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_5931_210_2040_1_1527428481_4547999_321-193614-0 from training. Duration: 0.67 2023-10-03 18:14:02,316 WARNING [train.py:1204] (2/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_445-269521-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 18:14:05,197 WARNING [train.py:1204] (2/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_3-33116-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 18:14:05,198 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_4633_210_2040_1_1526824163_4145343_416-76117-0_sp1.1 from training. Duration: 0.863625 2023-10-03 18:14:05,235 WARNING [train.py:1204] (2/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_1033-274376-0_sp1.1 from training. Duration: 0.7909375 2023-10-03 18:14:06,542 WARNING [train.py:1204] (2/4) Exclude cut with ID _8772_210_1850_1_1528635595972_3664011_398-320049-0_sp0.9 from training. Duration: 0.97775 2023-10-03 18:14:11,892 WARNING [train.py:1204] (2/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_321-215742-0_sp0.9 from training. Duration: 0.5555625 2023-10-03 18:14:13,317 WARNING [train.py:1204] (2/4) Exclude cut with ID _28394_210_8913_1_1532430049518_3650118_272-190950-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 18:14:13,438 WARNING [train.py:1204] (2/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_31-299371-0_sp1.1 from training. Duration: 0.92725 2023-10-03 18:14:16,808 WARNING [train.py:1204] (2/4) Exclude cut with ID _55383_210_10635_1_1534468426172_2231669_134-128246-0_sp0.9 from training. Duration: 0.988875 2023-10-03 18:14:16,811 WARNING [train.py:1204] (2/4) Exclude cut with ID _40519_210_13794_1_1533715228655_7337390_887-9547-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 18:14:16,874 WARNING [train.py:1204] (2/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_310-109461-0_sp0.9 from training. Duration: 0.888875 2023-10-03 18:14:18,193 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_2009_210_2071_1_1525481134_4588937_258-250196-0_sp1.1 from training. Duration: 0.92725 2023-10-03 18:14:19,537 WARNING [train.py:1204] (2/4) Exclude cut with ID _9235_210_5219_1_1528891154253_8046120_1186-72702-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 18:14:19,539 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_14831_210_7927_1_1530700200_5583610_181-27467-0_sp1.1 from training. Duration: 0.82725 2023-10-03 18:14:23,605 WARNING [train.py:1204] (2/4) Exclude cut with ID _40402_210_18219_1_1533522324115_2953150_278-132109-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 18:14:23,628 WARNING [train.py:1204] (2/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_261-297503-0 from training. Duration: 0.99 2023-10-03 18:14:28,362 WARNING [train.py:1204] (2/4) Exclude cut with ID _19896_210_1883_1_1531709022662_6649569_313-265903-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 18:14:29,720 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_105-114797-0_sp1.1 from training. Duration: 0.763625 2023-10-03 18:14:31,106 WARNING [train.py:1204] (2/4) Exclude cut with ID _53755_210_8254_1_1534577312941_4707360_9-65843-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 18:14:31,141 WARNING [train.py:1204] (2/4) Exclude cut with ID _18516_210_9206_1_1531704653206_3692406_361-224549-0_sp1.1 from training. Duration: 0.97275 2023-10-03 18:14:31,163 WARNING [train.py:1204] (2/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_403-310975-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 18:14:31,340 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=1360206.6666666667, ans=0.1 2023-10-03 18:14:32,419 WARNING [train.py:1204] (2/4) Exclude cut with ID _5444_210_2151_1_1527418808464_5312210_581-315411-0_sp0.9 from training. Duration: 0.688875 2023-10-03 18:14:32,466 WARNING [train.py:1204] (2/4) Exclude cut with ID _23825_210_13449_1_1531915313526_3497150_464-154904-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 18:14:32,475 WARNING [train.py:1204] (2/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_937-208281-0_sp0.9 from training. Duration: 0.9444375 2023-10-03 18:14:33,103 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.3.encoder.layers.1.whiten, num_groups=1, num_channels=512, metric=5.05 vs. limit=12.0 2023-10-03 18:14:33,849 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_37839_210_7234_1_1533542462_3689303_158-181212-0_sp1.1 from training. Duration: 0.963625 2023-10-03 18:14:33,935 WARNING [train.py:1204] (2/4) Exclude cut with ID _8772_210_1850_1_1528635595972_3664011_398-320049-0 from training. Duration: 0.88 2023-10-03 18:14:35,268 WARNING [train.py:1204] (2/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_718-250907-0_sp1.1 from training. Duration: 0.62725 2023-10-03 18:14:37,800 WARNING [train.py:1204] (2/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_641-126395-0_sp1.1 from training. Duration: 0.92725 2023-10-03 18:14:37,841 WARNING [train.py:1204] (2/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_166-241493-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 18:14:39,432 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.625e+02 1.915e+02 2.126e+02 2.508e+02 4.642e+02, threshold=4.251e+02, percent-clipped=1.0 2023-10-03 18:14:39,574 WARNING [train.py:1204] (2/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_526-62079-0_sp0.9 from training. Duration: 0.7444375 2023-10-03 18:14:42,751 WARNING [train.py:1204] (2/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_439-275083-0_sp1.1 from training. Duration: 0.936375 2023-10-03 18:14:44,159 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_66907_210_21581_1_1535615661_3590177_493-229032-0_sp1.1 from training. Duration: 0.92725 2023-10-03 18:14:46,206 WARNING [train.py:1204] (2/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_89-321955-0_sp1.1 from training. Duration: 0.72725 2023-10-03 18:14:46,986 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.1.encoder.layers.1.feed_forward1.out_whiten, num_groups=1, num_channels=256, metric=6.86 vs. limit=15.0 2023-10-03 18:14:47,580 WARNING [train.py:1204] (2/4) Exclude cut with ID _29857_210_20034_1_1532395886753_3798160_273-226195-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 18:14:47,593 WARNING [train.py:1204] (2/4) Exclude cut with ID _22826_210_9994_1_1531908201522_3279169_404-75889-0 from training. Duration: 0.89 2023-10-03 18:14:47,622 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_271-234962-0_sp0.9 from training. Duration: 0.87775 2023-10-03 18:14:50,483 WARNING [train.py:1204] (2/4) Exclude cut with ID _22961_210_8254_1_1531976366672_4278180_156-339685-0_sp1.1 from training. Duration: 0.97275 2023-10-03 18:14:50,541 WARNING [train.py:1204] (2/4) Exclude cut with ID _37892_210_19596_1_1533198677897_3964058_425-231927-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 18:14:53,184 WARNING [train.py:1204] (2/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_102-288670-0_sp1.1 from training. Duration: 0.97275 2023-10-03 18:14:53,273 WARNING [train.py:1204] (2/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_283-220372-0_sp1.1 from training. Duration: 0.5090625 2023-10-03 18:14:53,342 WARNING [train.py:1204] (2/4) Exclude cut with ID _44304_210_15374_1_1533639352765_4613129_86-1699-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 18:14:54,691 WARNING [train.py:1204] (2/4) Exclude cut with ID _2891_210_2550_1_1525874398521_4427710_327-121590-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 18:14:54,704 WARNING [train.py:1204] (2/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_369-222060-0 from training. Duration: 0.71 2023-10-03 18:14:56,093 WARNING [train.py:1204] (2/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_762-109886-0 from training. Duration: 0.91 2023-10-03 18:14:56,121 WARNING [train.py:1204] (2/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_477-255782-0_sp0.9 from training. Duration: 0.911125 2023-10-03 18:14:57,496 WARNING [train.py:1204] (2/4) Exclude cut with ID IC0669W0061-330906 from training. Duration: 0.768 2023-10-03 18:14:57,551 WARNING [train.py:1204] (2/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_347-213071-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 18:14:58,854 WARNING [train.py:1204] (2/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_507-303370-0_sp1.1 from training. Duration: 0.87275 2023-10-03 18:14:58,928 WARNING [train.py:1204] (2/4) Exclude cut with ID _15012_210_1968_1_1530856820429_5095350_688-178849-0 from training. Duration: 0.64 2023-10-03 18:14:58,933 WARNING [train.py:1204] (2/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_28-70987-0_sp0.9 from training. Duration: 0.9666875 2023-10-03 18:14:58,935 WARNING [train.py:1204] (2/4) Exclude cut with ID _5910_210_1389_1_1527337972454_5050100_555-159815-0 from training. Duration: 0.98 2023-10-03 18:15:00,776 WARNING [train.py:1204] (2/4) Exclude cut with ID IC0679W0064-335871 from training. Duration: 0.768 2023-10-03 18:15:00,776 WARNING [train.py:1204] (2/4) Exclude cut with ID IC0679W0079-335886 from training. Duration: 0.896 2023-10-03 18:15:00,827 WARNING [train.py:1204] (2/4) Exclude cut with ID _5608_210_2106_1_1527415439089_6819768_909-82767-0 from training. Duration: 0.61 2023-10-03 18:15:02,084 WARNING [train.py:1204] (2/4) Exclude cut with ID _23038_210_9570_1_1532001584158_3652419_411-323967-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 18:15:02,142 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_40896_210_15710_1_1533433956_7100802_350-231748-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 18:15:02,152 WARNING [train.py:1204] (2/4) Exclude cut with ID _5289_210_2084_1_1527678202736_3779299_142-103317-0_sp0.9 from training. Duration: 0.9333125 2023-10-03 18:15:03,458 WARNING [train.py:1204] (2/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_314-144055-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 18:15:04,817 WARNING [train.py:1204] (2/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_177-236809-0_sp0.9 from training. Duration: 0.5444375 2023-10-03 18:15:06,200 WARNING [train.py:1204] (2/4) Exclude cut with ID _23038_210_9570_1_1532001584158_3652419_82-334733-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 18:15:06,208 WARNING [train.py:1204] (2/4) Exclude cut with ID _43552_210_8913_1_1533726031059_3523429_225-22526-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 18:15:15,500 WARNING [train.py:1204] (2/4) Exclude cut with ID _30327_210_16049_1_1532484161295_3924169_401-70819-0_sp1.1 from training. Duration: 0.7818125 2023-10-03 18:15:15,537 WARNING [train.py:1204] (2/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_538-32291-0 from training. Duration: 0.83 2023-10-03 18:15:17,344 INFO [train.py:1046] (2/4) Epoch 39, batch 2200, loss[loss=0.1432, simple_loss=0.2222, pruned_loss=0.03209, over 24589.00 frames. ], tot_loss[loss=0.1569, simple_loss=0.2363, pruned_loss=0.03878, over 4705997.38 frames. ], batch size: 60, lr: 2.61e-03, grad_scale: 8.0 2023-10-03 18:15:20,235 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_37357_210_17024_1_1533089504_2316121_258-211605-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 18:15:24,346 WARNING [train.py:1204] (2/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_550-335198-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 18:15:25,632 WARNING [train.py:1204] (2/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_98-741-0_sp0.9 from training. Duration: 0.888875 2023-10-03 18:15:26,971 WARNING [train.py:1204] (2/4) Exclude cut with ID _38323_210_12108_1_1533121233369_3303049_251-238988-0_sp1.1 from training. Duration: 0.92725 2023-10-03 18:15:27,063 WARNING [train.py:1204] (2/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_259-34038-0_sp0.9 from training. Duration: 0.82225 2023-10-03 18:15:28,618 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.bypass.scale_min, batch_count=1360406.6666666667, ans=0.2 2023-10-03 18:15:29,889 WARNING [train.py:1204] (2/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_1060-180633-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 18:15:29,933 WARNING [train.py:1204] (2/4) Exclude cut with ID _8755_210_1876_1_1528628559357_3258209_211-37473-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 18:15:29,941 WARNING [train.py:1204] (2/4) Exclude cut with ID _4122_210_2084_1_1526713164417_4353280_528-206771-0 from training. Duration: 0.88 2023-10-03 18:15:33,463 WARNING [train.py:1204] (2/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_64-248359-0 from training. Duration: 0.69 2023-10-03 18:15:34,849 WARNING [train.py:1204] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_402-253027-0_sp1.1 from training. Duration: 0.4909375 2023-10-03 18:15:40,928 WARNING [train.py:1204] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_625-102955-0 from training. Duration: 0.77 2023-10-03 18:15:43,615 WARNING [train.py:1204] (2/4) Exclude cut with ID _14998_210_5219_1_1530705582080_7474970_611-219324-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 18:15:45,521 WARNING [train.py:1204] (2/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_718-68505-0_sp0.9 from training. Duration: 0.988875 2023-10-03 18:15:47,446 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_353-182855-0_sp1.1 from training. Duration: 0.87275 2023-10-03 18:15:50,288 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_5960_210_1794_1_1527403865_3990999_515-2613-0_sp0.9 from training. Duration: 0.97775 2023-10-03 18:15:50,307 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_5575_210_1410_1_1527301305_2570170_342-265572-0 from training. Duration: 0.94 2023-10-03 18:15:54,589 WARNING [train.py:1204] (2/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_329-113173-0_sp0.9 from training. Duration: 0.688875 2023-10-03 18:15:54,689 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_15396_210_6890_1_1531308473_1938936_213-312506-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 18:15:56,097 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_521-214276-0_sp1.1 from training. Duration: 0.4 2023-10-03 18:15:57,596 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.attention_skip_rate, batch_count=1360540.0, ans=0.0 2023-10-03 18:15:58,783 WARNING [train.py:1204] (2/4) Exclude cut with ID _15026_210_8750_1_1530777602316_3291850_211-195043-0_sp1.1 from training. Duration: 0.72725 2023-10-03 18:16:00,170 WARNING [train.py:1204] (2/4) Exclude cut with ID _29343_210_4381_1_1532602757752_3674109_108-257494-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 18:16:02,122 WARNING [train.py:1204] (2/4) Exclude cut with ID _64414_210_17301_1_1535243252048_3757759_403-58151-0_sp1.1 from training. Duration: 0.963625 2023-10-03 18:16:03,496 WARNING [train.py:1204] (2/4) Exclude cut with ID _36512_210_6987_1_1533033143998_3126069_109-177787-0_sp1.1 from training. Duration: 0.92725 2023-10-03 18:16:04,869 WARNING [train.py:1204] (2/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_498-40153-0 from training. Duration: 0.63 2023-10-03 18:16:06,181 WARNING [train.py:1204] (2/4) Exclude cut with ID _56447_210_1389_1_1535074245910_3697189_161-334523-0_sp1.1 from training. Duration: 0.936375 2023-10-03 18:16:06,267 WARNING [train.py:1204] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_238-245655-0 from training. Duration: 0.81 2023-10-03 18:16:08,897 WARNING [train.py:1204] (2/4) Exclude cut with ID _64631_210_27767_1_1535349580068_4165160_108-43865-0_sp1.1 from training. Duration: 0.92725 2023-10-03 18:16:08,903 WARNING [train.py:1204] (2/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_243-42116-0_sp1.1 from training. Duration: 0.663625 2023-10-03 18:16:08,933 WARNING [train.py:1204] (2/4) Exclude cut with ID _66868_210_9527_1_1535509287954_4561140_594-132160-0_sp1.1 from training. Duration: 0.92725 2023-10-03 18:16:10,967 WARNING [train.py:1204] (2/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_48-338976-0_sp0.9 from training. Duration: 0.988875 2023-10-03 18:16:12,261 WARNING [train.py:1204] (2/4) Exclude cut with ID _5250_210_5219_1_1527249561806_8692110_137-100595-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 18:16:12,267 WARNING [train.py:1204] (2/4) Exclude cut with ID _49748_210_24116_1_1533986971737_3158910_12-271669-0_sp1.1 from training. Duration: 0.936375 2023-10-03 18:16:12,285 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_37357_210_17024_1_1533089504_2316121_206-80421-0_sp1.1 from training. Duration: 0.936375 2023-10-03 18:16:12,379 WARNING [train.py:1204] (2/4) Exclude cut with ID _7537_210_2084_1_1528502395431_3924359_113-199933-0_sp1.1 from training. Duration: 0.536375 2023-10-03 18:16:14,114 WARNING [train.py:1204] (2/4) Exclude cut with ID _55440_210_12651_1_1534496713364_2980520_31-59289-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 18:16:15,561 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1083-220122-0_sp0.9 from training. Duration: 0.5666875 2023-10-03 18:16:20,045 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_426-313557-0_sp1.1 from training. Duration: 0.4818125 2023-10-03 18:16:20,095 WARNING [train.py:1204] (2/4) Exclude cut with ID _35799_210_5854_1_1533263487887_3529071_324-94843-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 18:16:22,767 WARNING [train.py:1204] (2/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_264-108685-0_sp1.1 from training. Duration: 0.5 2023-10-03 18:16:22,853 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9012W0006-501141 from training. Duration: 0.896 2023-10-03 18:16:23,552 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.whiten.whitening_limit, batch_count=1360673.3333333333, ans=12.0 2023-10-03 18:16:24,236 WARNING [train.py:1204] (2/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_890-5117-0_sp0.9 from training. Duration: 0.8666875 2023-10-03 18:16:25,608 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9023W0008-506301 from training. Duration: 0.896 2023-10-03 18:16:26,946 WARNING [train.py:1204] (2/4) Exclude cut with ID _13916_210_9994_1_1530788341126_3763149_60-191556-0_sp0.9 from training. Duration: 0.67775 2023-10-03 18:16:26,989 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9029W0331-509721 from training. Duration: 0.896 2023-10-03 18:16:30,192 WARNING [train.py:1204] (2/4) Exclude cut with ID _35823_210_6917_1_1533187881914_7297830_567-108596-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 18:16:30,235 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7543_210_6753_1_1528544010_1110402_25-183860-0_sp1.1 from training. Duration: 0.42725 2023-10-03 18:16:31,495 INFO [train.py:1046] (2/4) Epoch 39, batch 2250, loss[loss=0.1446, simple_loss=0.2319, pruned_loss=0.02861, over 24614.00 frames. ], tot_loss[loss=0.1583, simple_loss=0.2378, pruned_loss=0.03945, over 4712420.14 frames. ], batch size: 60, lr: 2.61e-03, grad_scale: 8.0 2023-10-03 18:16:31,587 WARNING [train.py:1204] (2/4) Exclude cut with ID _47278_210_12041_1_1533902418349_3576110_495-260035-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 18:16:32,950 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9051W0015-520281 from training. Duration: 0.9045625 2023-10-03 18:16:34,242 WARNING [train.py:1204] (2/4) Exclude cut with ID _14459_210_2228_1_1531443641946_7516700_707-66193-0_sp1.1 from training. Duration: 0.97275 2023-10-03 18:16:35,766 WARNING [train.py:1204] (2/4) Exclude cut with ID _14087_210_2253_1_1530529224512_3014029_219-224445-0_sp1.1 from training. Duration: 0.763625 2023-10-03 18:16:37,229 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.1.bypass_mid.scale_min, batch_count=1360740.0, ans=0.2 2023-10-03 18:16:38,693 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.balancer2.prob, batch_count=1360740.0, ans=0.125 2023-10-03 18:16:42,918 WARNING [train.py:1204] (2/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_345-167986-0_sp0.9 from training. Duration: 0.9333125 2023-10-03 18:16:44,704 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_34815_210_6388_1_1533381320_3571903_279-305565-0_sp1.1 from training. Duration: 0.836375 2023-10-03 18:16:47,393 WARNING [train.py:1204] (2/4) Exclude cut with ID _21987_210_5854_1_1531821500482_3637038_317-179212-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 18:16:48,721 WARNING [train.py:1204] (2/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_395-37222-0_sp1.1 from training. Duration: 0.6909375 2023-10-03 18:16:50,628 WARNING [train.py:1204] (2/4) Exclude cut with ID _8064_210_5399_1_1528372299291_4282973_168-50337-0_sp1.1 from training. Duration: 0.763625 2023-10-03 18:16:53,325 WARNING [train.py:1204] (2/4) Exclude cut with ID _60113_210_16271_1_1535164212481_4386079_559-167191-0 from training. Duration: 0.93 2023-10-03 18:16:53,342 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_47228_210_4983_1_1533780021_3672027_344-179642-0_sp1.1 from training. Duration: 0.92725 2023-10-03 18:16:54,652 WARNING [train.py:1204] (2/4) Exclude cut with ID _22961_210_8254_1_1531976366672_4278180_317-199546-0_sp0.9 from training. Duration: 0.9555625 2023-10-03 18:16:57,512 WARNING [train.py:1204] (2/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_141-89727-0 from training. Duration: 0.71 2023-10-03 18:16:57,583 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_78-116075-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 18:16:57,603 WARNING [train.py:1204] (2/4) Exclude cut with ID _64086_210_7994_1_1535270417019_7435949_52-235756-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 18:16:58,962 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7615_210_4211_1_1528110965_4580160_72-235211-0_sp1.1 from training. Duration: 0.6909375 2023-10-03 18:17:06,247 WARNING [train.py:1204] (2/4) Exclude cut with ID _14069_210_5632_1_1530512692033_4803390_287-27950-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 18:17:07,582 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.461e+02 1.878e+02 2.042e+02 2.352e+02 3.262e+02, threshold=4.083e+02, percent-clipped=0.0 2023-10-03 18:17:07,653 WARNING [train.py:1204] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_74-48717-0_sp0.9 from training. Duration: 0.6444375 2023-10-03 18:17:07,677 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7544_210_6753_1_1528591327_1471426_172-348777-0_sp1.1 from training. Duration: 0.6 2023-10-03 18:17:09,082 WARNING [train.py:1204] (2/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_220-191080-0 from training. Duration: 0.98 2023-10-03 18:17:10,417 WARNING [train.py:1204] (2/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_1081-137641-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 18:17:11,854 WARNING [train.py:1204] (2/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_278-235459-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 18:17:16,644 WARNING [train.py:1204] (2/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_1181-77470-0_sp1.1 from training. Duration: 0.936375 2023-10-03 18:17:18,029 WARNING [train.py:1204] (2/4) Exclude cut with ID _60860_210_8254_1_1534896015875_3557169_198-5531-0_sp1.1 from training. Duration: 0.936375 2023-10-03 18:17:19,397 WARNING [train.py:1204] (2/4) Exclude cut with ID _12369_210_4381_1_1530356078581_4388520_17-289781-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 18:17:19,402 WARNING [train.py:1204] (2/4) Exclude cut with ID _50310_210_15488_1_1534068043345_3641080_146-199752-0_sp1.1 from training. Duration: 0.92725 2023-10-03 18:17:22,734 WARNING [train.py:1204] (2/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_969-213354-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 18:17:24,163 WARNING [train.py:1204] (2/4) Exclude cut with ID _70519_210_4163_1_1535961595994_7458230_66-344156-0_sp1.1 from training. Duration: 0.963625 2023-10-03 18:17:27,182 WARNING [train.py:1204] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_380-204756-0_sp1.1 from training. Duration: 0.7818125 2023-10-03 18:17:29,957 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_128-202104-0_sp0.9 from training. Duration: 0.7 2023-10-03 18:17:34,280 WARNING [train.py:1204] (2/4) Exclude cut with ID _4697_210_2069_1_1526815801953_3823098_51-1049-0_sp0.9 from training. Duration: 0.8333125 2023-10-03 18:17:34,305 WARNING [train.py:1204] (2/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_86-66192-0_sp1.1 from training. Duration: 0.5 2023-10-03 18:17:35,592 WARNING [train.py:1204] (2/4) Exclude cut with ID _14069_210_5632_1_1530512692033_4803390_95-1896-0_sp1.1 from training. Duration: 0.8909375 2023-10-03 18:17:37,808 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.conv_module2.balancer1.prob, batch_count=1361006.6666666667, ans=0.125 2023-10-03 18:17:39,013 WARNING [train.py:1204] (2/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_29-293896-0_sp1.1 from training. Duration: 0.5545625 2023-10-03 18:17:39,523 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.4.encoder.layers.1.feed_forward3.out_whiten, num_groups=1, num_channels=384, metric=10.69 vs. limit=15.0 2023-10-03 18:17:40,426 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.0.bypass.skip_rate, batch_count=1361006.6666666667, ans=0.035 2023-10-03 18:17:41,696 WARNING [train.py:1204] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_167-137255-0_sp0.9 from training. Duration: 0.711125 2023-10-03 18:17:41,697 WARNING [train.py:1204] (2/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_217-95056-0 from training. Duration: 0.98 2023-10-03 18:17:41,724 WARNING [train.py:1204] (2/4) Exclude cut with ID _47278_210_12041_1_1533902418349_3576110_97-266746-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 18:17:43,046 WARNING [train.py:1204] (2/4) Exclude cut with ID _14224_210_4381_1_1530529062037_4264010_380-343670-0_sp1.1 from training. Duration: 0.8 2023-10-03 18:17:44,278 INFO [train.py:1046] (2/4) Epoch 39, batch 2300, loss[loss=0.1824, simple_loss=0.2549, pruned_loss=0.05494, over 23582.00 frames. ], tot_loss[loss=0.1582, simple_loss=0.238, pruned_loss=0.03916, over 4719590.60 frames. ], batch size: 256, lr: 2.61e-03, grad_scale: 8.0 2023-10-03 18:17:44,400 WARNING [train.py:1204] (2/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_107-332398-0 from training. Duration: 0.78 2023-10-03 18:17:45,946 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=1361073.3333333333, ans=0.1 2023-10-03 18:17:47,614 WARNING [train.py:1204] (2/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_91-267696-0_sp1.1 from training. Duration: 0.7909375 2023-10-03 18:17:49,467 WARNING [train.py:1204] (2/4) Exclude cut with ID _41298_210_9019_1_1533430371450_4822143_498-186993-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 18:17:55,548 WARNING [train.py:1204] (2/4) Exclude cut with ID _17956_210_4353_1_1531209568125_6979970_54-77610-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 18:17:55,592 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_40047_210_2290_1_1533290519_2374311_257-335162-0_sp1.1 from training. Duration: 0.863625 2023-10-03 18:17:58,217 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9653W0193-675471 from training. Duration: 0.9656875 2023-10-03 18:17:58,701 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.5.encoder.layers.1.self_attn2.whiten, num_groups=1, num_channels=256, metric=12.06 vs. limit=22.5 2023-10-03 18:17:59,629 WARNING [train.py:1204] (2/4) Exclude cut with ID _25248_210_8750_1_1532062825941_5554180_243-240536-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 18:17:59,853 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.bypass.scale_min, batch_count=1361140.0, ans=0.2 2023-10-03 18:18:05,273 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_32360_210_4198_1_1532689160_3648436_60-51908-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 18:18:06,537 WARNING [train.py:1204] (2/4) Exclude cut with ID _4092_210_2151_1_1526643044449_3135759_227-213972-0_sp0.9 from training. Duration: 0.6 2023-10-03 18:18:06,574 WARNING [train.py:1204] (2/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_392-173500-0_sp1.1 from training. Duration: 0.97275 2023-10-03 18:18:06,606 WARNING [train.py:1204] (2/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_71-329150-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 18:18:06,607 WARNING [train.py:1204] (2/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_866-173440-0 from training. Duration: 0.9 2023-10-03 18:18:08,535 WARNING [train.py:1204] (2/4) Exclude cut with ID _14832_210_8750_1_1530691351539_3171348_1-54315-0_sp0.9 from training. Duration: 0.9666875 2023-10-03 18:18:09,971 WARNING [train.py:1204] (2/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_430-116139-0_sp1.1 from training. Duration: 0.77275 2023-10-03 18:18:10,004 WARNING [train.py:1204] (2/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_356-45054-0_sp0.9 from training. Duration: 0.9555625 2023-10-03 18:18:15,530 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_3531_210_3528_1_1526294203_5216456_309-227302-0_sp0.9 from training. Duration: 0.7666875 2023-10-03 18:18:18,278 WARNING [train.py:1204] (2/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_301-330455-0_sp1.1 from training. Duration: 0.72725 2023-10-03 18:18:22,229 WARNING [train.py:1204] (2/4) Exclude cut with ID _23557_210_8750_1_1531890106370_6846255_667-145945-0_sp1.1 from training. Duration: 0.92725 2023-10-03 18:18:26,846 WARNING [train.py:1204] (2/4) Exclude cut with ID _14860_210_4915_1_1530702020340_7343240_836-328489-0_sp1.1 from training. Duration: 0.8181875 2023-10-03 18:18:28,133 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_37152_210_9697_1_1533106573_3844188_416-333249-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 18:18:30,809 WARNING [train.py:1204] (2/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_538-32291-0_sp0.9 from training. Duration: 0.92225 2023-10-03 18:18:30,968 WARNING [train.py:1204] (2/4) Exclude cut with ID _32704_210_8954_1_1533017488840_6747370_965-176824-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 18:18:35,186 WARNING [train.py:1204] (2/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_86-19601-0_sp1.1 from training. Duration: 0.77275 2023-10-03 18:18:36,492 WARNING [train.py:1204] (2/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_436-318113-0_sp1.1 from training. Duration: 0.6818125 2023-10-03 18:18:36,524 WARNING [train.py:1204] (2/4) Exclude cut with ID _41817_210_18835_1_1533434477160_7423049_555-293448-0_sp0.9 from training. Duration: 0.888875 2023-10-03 18:18:36,535 WARNING [train.py:1204] (2/4) Exclude cut with ID _4697_210_2069_1_1526815801953_3823098_68-219807-0 from training. Duration: 0.47 2023-10-03 18:18:40,844 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7544_210_6753_1_1528591327_1471426_33-346031-0_sp1.1 from training. Duration: 0.5545625 2023-10-03 18:18:40,846 WARNING [train.py:1204] (2/4) Exclude cut with ID _49748_210_24116_1_1533986971737_3158910_263-52113-0_sp1.1 from training. Duration: 0.97275 2023-10-03 18:18:42,229 WARNING [train.py:1204] (2/4) Exclude cut with ID _64639_210_5816_1_1535367619437_3528000_340-24580-0_sp1.1 from training. Duration: 0.92725 2023-10-03 18:18:42,252 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_47228_210_4983_1_1533780021_3672027_479-114941-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 18:18:42,286 WARNING [train.py:1204] (2/4) Exclude cut with ID _44585_210_18628_1_1533605350387_4518199_506-119659-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 18:18:42,356 WARNING [train.py:1204] (2/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_314-126819-0_sp1.1 from training. Duration: 0.4545625 2023-10-03 18:18:42,359 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_2541_210_1794_1_1525590269_3084765_156-173713-0_sp0.9 from training. Duration: 0.9 2023-10-03 18:18:43,741 WARNING [train.py:1204] (2/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_108-255430-0 from training. Duration: 0.97 2023-10-03 18:18:43,748 WARNING [train.py:1204] (2/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_791-45149-0_sp1.1 from training. Duration: 0.7818125 2023-10-03 18:18:43,754 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_53378_210_17282_1_1534227054_1261425_10-118295-0_sp1.1 from training. Duration: 0.97275 2023-10-03 18:18:43,811 WARNING [train.py:1204] (2/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_148-158509-0 from training. Duration: 0.41 2023-10-03 18:18:49,570 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_34726_210_4525_1_1533019474_5030395_687-291305-0_sp1.1 from training. Duration: 0.936375 2023-10-03 18:18:52,946 WARNING [train.py:1204] (2/4) Exclude cut with ID _14860_210_4915_1_1530702020340_7343240_264-324537-0_sp1.1 from training. Duration: 0.963625 2023-10-03 18:18:56,317 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_41251_210_15368_1_1533429767_4820695_591-46386-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 18:18:56,356 WARNING [train.py:1204] (2/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_86-19601-0_sp0.9 from training. Duration: 0.9444375 2023-10-03 18:18:57,656 WARNING [train.py:1204] (2/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_401-255491-0_sp0.9 from training. Duration: 0.77775 2023-10-03 18:18:59,026 INFO [train.py:1046] (2/4) Epoch 39, batch 2350, loss[loss=0.1657, simple_loss=0.2502, pruned_loss=0.04056, over 23536.00 frames. ], tot_loss[loss=0.1589, simple_loss=0.2389, pruned_loss=0.03945, over 4712300.41 frames. ], batch size: 135, lr: 2.61e-03, grad_scale: 8.0 2023-10-03 18:19:00,509 WARNING [train.py:1204] (2/4) Exclude cut with ID _8696_210_1876_1_1528545632440_3345058_209-54069-0_sp1.1 from training. Duration: 0.6090625 2023-10-03 18:19:00,520 WARNING [train.py:1204] (2/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_369-325184-0_sp1.1 from training. Duration: 0.9 2023-10-03 18:19:00,590 WARNING [train.py:1204] (2/4) Exclude cut with ID _14687_210_5854_1_1530703838259_3664889_115-239046-0_sp0.9 from training. Duration: 0.7444375 2023-10-03 18:19:00,632 WARNING [train.py:1204] (2/4) Exclude cut with ID _8064_210_5399_1_1528372299291_4282973_168-50337-0 from training. Duration: 0.84 2023-10-03 18:19:04,801 WARNING [train.py:1204] (2/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_909-64151-0_sp1.1 from training. Duration: 0.8545625 2023-10-03 18:19:04,822 WARNING [train.py:1204] (2/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_597-154640-0 from training. Duration: 0.9 2023-10-03 18:19:08,339 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.feed_forward1.hidden_balancer.prob, batch_count=1361406.6666666667, ans=0.125 2023-10-03 18:19:10,813 WARNING [train.py:1204] (2/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_126-274694-0 from training. Duration: 0.98 2023-10-03 18:19:12,222 WARNING [train.py:1204] (2/4) Exclude cut with ID _21783_210_11357_1_1531742458730_3762679_344-269786-0_sp1.1 from training. Duration: 0.97275 2023-10-03 18:19:15,559 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_37818_210_4238_1_1533620910_4720083_322-253154-0_sp1.1 from training. Duration: 0.92725 2023-10-03 18:19:15,569 WARNING [train.py:1204] (2/4) Exclude cut with ID _58895_210_8750_1_1535184081320_3649089_221-15823-0_sp1.1 from training. Duration: 0.92725 2023-10-03 18:19:16,855 WARNING [train.py:1204] (2/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_9-21737-0_sp1.1 from training. Duration: 0.8818125 2023-10-03 18:19:16,896 WARNING [train.py:1204] (2/4) Exclude cut with ID _14069_210_5632_1_1530512692033_4803390_522-341588-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 18:19:18,245 WARNING [train.py:1204] (2/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_363-185703-0 from training. Duration: 0.95 2023-10-03 18:19:20,169 WARNING [train.py:1204] (2/4) Exclude cut with ID _41817_210_18835_1_1533434477160_7423049_265-76216-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 18:19:26,230 WARNING [train.py:1204] (2/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_841-341679-0 from training. Duration: 0.58 2023-10-03 18:19:27,618 WARNING [train.py:1204] (2/4) Exclude cut with ID _55429_210_10626_1_1534467588527_7174143_97-335084-0_sp1.1 from training. Duration: 0.8818125 2023-10-03 18:19:31,642 WARNING [train.py:1204] (2/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_690-126306-0_sp0.9 from training. Duration: 0.8666875 2023-10-03 18:19:31,651 WARNING [train.py:1204] (2/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_102-92751-0_sp1.1 from training. Duration: 0.9 2023-10-03 18:19:34,554 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_3787_210_2040_1_1526477544_5478586_603-120324-0_sp1.1 from training. Duration: 0.736375 2023-10-03 18:19:35,812 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.597e+02 1.969e+02 2.119e+02 2.541e+02 4.388e+02, threshold=4.238e+02, percent-clipped=2.0 2023-10-03 18:19:35,969 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_33348_210_5632_1_1533086912_3324030_85-28583-0 from training. Duration: 0.82 2023-10-03 18:19:37,940 WARNING [train.py:1204] (2/4) Exclude cut with ID _60057_210_10640_1_1534903065793_3333150_362-230377-0_sp1.1 from training. Duration: 0.7909375 2023-10-03 18:19:38,085 WARNING [train.py:1204] (2/4) Exclude cut with ID _60113_210_16271_1_1535164212481_4386079_194-146486-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 18:19:39,169 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_53753_210_7234_1_1534320003_3594148_171-277668-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 18:19:39,198 WARNING [train.py:1204] (2/4) Exclude cut with ID _34423_210_8750_1_1533430798456_3330199_251-332788-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 18:19:42,112 WARNING [train.py:1204] (2/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_663-166973-0_sp0.9 from training. Duration: 0.888875 2023-10-03 18:19:43,508 WARNING [train.py:1204] (2/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_927-43827-0 from training. Duration: 0.74 2023-10-03 18:19:43,570 WARNING [train.py:1204] (2/4) Exclude cut with ID _28323_210_16187_1_1532480395667_4100300_306-74561-0_sp1.1 from training. Duration: 0.8545625 2023-10-03 18:19:45,137 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.bypass.scale_min, batch_count=1361606.6666666667, ans=0.2 2023-10-03 18:19:46,743 WARNING [train.py:1204] (2/4) Exclude cut with ID _15037_210_2253_1_1530752294104_3267650_139-337989-0_sp1.1 from training. Duration: 0.92725 2023-10-03 18:19:46,765 WARNING [train.py:1204] (2/4) Exclude cut with ID _49039_210_6315_1_1533967537541_3846118_460-263670-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 18:19:49,857 WARNING [train.py:1204] (2/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_186-309161-0 from training. Duration: 0.54 2023-10-03 18:19:49,940 WARNING [train.py:1204] (2/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_87-278686-0_sp1.1 from training. Duration: 0.7 2023-10-03 18:19:51,370 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=1361606.6666666667, ans=0.1 2023-10-03 18:19:54,310 WARNING [train.py:1204] (2/4) Exclude cut with ID _27052_210_5986_1_1532329197187_3585009_314-106499-0 from training. Duration: 0.92 2023-10-03 18:19:54,331 WARNING [train.py:1204] (2/4) Exclude cut with ID _3465_210_3525_1_1526175667060_3574049_412-287526-0_sp0.9 from training. Duration: 0.8 2023-10-03 18:19:58,557 WARNING [train.py:1204] (2/4) Exclude cut with ID _22961_210_8254_1_1531976366672_4278180_317-199546-0 from training. Duration: 0.86 2023-10-03 18:20:00,640 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.5.encoder.layers.0.conv_module1.whiten, num_groups=1, num_channels=256, metric=6.38 vs. limit=15.0 2023-10-03 18:20:01,435 WARNING [train.py:1204] (2/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_381-317791-0 from training. Duration: 0.73 2023-10-03 18:20:02,744 WARNING [train.py:1204] (2/4) Exclude cut with ID _44010_210_4525_1_1533884262964_4013620_560-310411-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 18:20:02,750 WARNING [train.py:1204] (2/4) Exclude cut with ID _5294_210_1747_1_1527249643928_3696179_342-270278-0_sp0.9 from training. Duration: 0.611125 2023-10-03 18:20:02,777 WARNING [train.py:1204] (2/4) Exclude cut with ID ID0473W0002-918951 from training. Duration: 0.924 2023-10-03 18:20:02,807 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1083-220122-0 from training. Duration: 0.51 2023-10-03 18:20:05,526 WARNING [train.py:1204] (2/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_138-154812-0 from training. Duration: 0.78 2023-10-03 18:20:07,484 WARNING [train.py:1204] (2/4) Exclude cut with ID _44982_210_1511_1_1534312820108_3726029_331-19420-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 18:20:11,701 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_2655_210_2087_1_1525688959_3897400_202-147557-0_sp0.9 from training. Duration: 0.9444375 2023-10-03 18:20:13,063 INFO [train.py:1046] (2/4) Epoch 39, batch 2400, loss[loss=0.1438, simple_loss=0.2252, pruned_loss=0.03116, over 24327.00 frames. ], tot_loss[loss=0.1587, simple_loss=0.2385, pruned_loss=0.03948, over 4714756.49 frames. ], batch size: 61, lr: 2.61e-03, grad_scale: 16.0 2023-10-03 18:20:14,563 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_23850_210_4238_1_1532232597_1632433_32-185135-0_sp0.9 from training. Duration: 0.9555625 2023-10-03 18:20:16,013 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_14056_210_4163_1_1530604483_7572827_46-93547-0_sp1.1 from training. Duration: 0.863625 2023-10-03 18:20:17,296 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7931_210_1794_1_1528594728_5564059_280-279560-0 from training. Duration: 0.98 2023-10-03 18:20:17,343 WARNING [train.py:1204] (2/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_937-208281-0 from training. Duration: 0.85 2023-10-03 18:20:22,764 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_14928_210_5632_1_1530773461_4714000_454-112198-0_sp0.9 from training. Duration: 0.7333125 2023-10-03 18:20:22,765 WARNING [train.py:1204] (2/4) Exclude cut with ID _32704_210_8954_1_1533017488840_6747370_944-153333-0_sp1.1 from training. Duration: 0.963625 2023-10-03 18:20:26,818 WARNING [train.py:1204] (2/4) Exclude cut with ID _14821_210_8750_1_1530680503991_6829359_2-227151-0 from training. Duration: 0.83 2023-10-03 18:20:26,841 WARNING [train.py:1204] (2/4) Exclude cut with ID _28461_210_2253_1_1532771775189_3041299_96-152158-0_sp1.1 from training. Duration: 0.77275 2023-10-03 18:20:29,450 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7837_210_1792_1_1528453445_5841305_654-54195-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 18:20:30,746 WARNING [train.py:1204] (2/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_344-152526-0 from training. Duration: 0.53 2023-10-03 18:20:32,533 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.ff3_skip_rate, batch_count=1361806.6666666667, ans=0.0 2023-10-03 18:20:33,846 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_30167_210_3528_1_1532428012_4772375_480-142230-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 18:20:38,519 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_160-218721-0 from training. Duration: 0.96 2023-10-03 18:20:42,306 WARNING [train.py:1204] (2/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_65-88242-0_sp0.9 from training. Duration: 0.62225 2023-10-03 18:20:46,519 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_34886_210_10633_1_1533203882_1755661_76-156096-0 from training. Duration: 0.87 2023-10-03 18:20:48,439 WARNING [train.py:1204] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_188-38388-0_sp1.1 from training. Duration: 0.92725 2023-10-03 18:20:49,820 WARNING [train.py:1204] (2/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_81-289335-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 18:20:53,078 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=1361873.3333333333, ans=0.1 2023-10-03 18:20:54,920 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.feed_forward1.out_proj.dropout_p, batch_count=1361873.3333333333, ans=0.1 2023-10-03 18:20:56,136 WARNING [train.py:1204] (2/4) Exclude cut with ID _59493_210_17282_1_1535078419077_3225879_110-8778-0_sp1.1 from training. Duration: 0.936375 2023-10-03 18:20:57,442 WARNING [train.py:1204] (2/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_188-260011-0 from training. Duration: 0.73 2023-10-03 18:20:58,797 WARNING [train.py:1204] (2/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_453-133983-0_sp0.9 from training. Duration: 0.8666875 2023-10-03 18:21:01,816 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8054_210_7927_1_1528627721_4984094_553-17379-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 18:21:03,294 WARNING [train.py:1204] (2/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_341-171072-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 18:21:03,453 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.conv_module2.balancer2.prob, batch_count=1361940.0, ans=0.125 2023-10-03 18:21:05,929 WARNING [train.py:1204] (2/4) Exclude cut with ID _30800_210_22382_1_1532499347904_4515959_265-257286-0_sp1.1 from training. Duration: 0.963625 2023-10-03 18:21:06,022 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_403-39619-0_sp0.9 from training. Duration: 0.9333125 2023-10-03 18:21:06,027 WARNING [train.py:1204] (2/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_464-33931-0_sp0.9 from training. Duration: 0.6 2023-10-03 18:21:06,056 WARNING [train.py:1204] (2/4) Exclude cut with ID _60804_210_9642_1_1534831199859_7268299_632-143863-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 18:21:06,063 WARNING [train.py:1204] (2/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_431-310997-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 18:21:07,981 WARNING [train.py:1204] (2/4) Exclude cut with ID _31649_210_6228_1_1532584608063_4091078_114-263860-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 18:21:07,999 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6961_210_2087_1_1527937168_6815011_562-165518-0_sp0.9 from training. Duration: 0.8555625 2023-10-03 18:21:13,285 WARNING [train.py:1204] (2/4) Exclude cut with ID _27052_210_5986_1_1532329197187_3585009_338-325870-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 18:21:13,334 WARNING [train.py:1204] (2/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_469-94673-0_sp1.1 from training. Duration: 0.7181875 2023-10-03 18:21:13,369 WARNING [train.py:1204] (2/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_259-34038-0 from training. Duration: 0.74 2023-10-03 18:21:14,626 WARNING [train.py:1204] (2/4) Exclude cut with ID _14922_210_6228_1_1530707435273_3693310_388-129552-0 from training. Duration: 0.96 2023-10-03 18:21:14,916 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.bypass.skip_rate, batch_count=1362006.6666666667, ans=0.04949747468305833 2023-10-03 18:21:16,729 WARNING [train.py:1204] (2/4) Exclude cut with ID _28394_210_8913_1_1532430049518_3650118_342-251455-0_sp1.1 from training. Duration: 0.936375 2023-10-03 18:21:16,747 WARNING [train.py:1204] (2/4) Exclude cut with ID _53731_210_7994_1_1534553979244_3757214_8-191390-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 18:21:18,112 WARNING [train.py:1204] (2/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_613-232856-0 from training. Duration: 0.54 2023-10-03 18:21:18,180 WARNING [train.py:1204] (2/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_557-349534-0 from training. Duration: 0.91 2023-10-03 18:21:18,188 WARNING [train.py:1204] (2/4) Exclude cut with ID _14224_210_4381_1_1530529062037_4264010_379-184223-0 from training. Duration: 0.85 2023-10-03 18:21:18,192 WARNING [train.py:1204] (2/4) Exclude cut with ID IC0125W0001-61807 from training. Duration: 0.966 2023-10-03 18:21:19,581 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_229-45300-0 from training. Duration: 0.57 2023-10-03 18:21:21,255 WARNING [train.py:1204] (2/4) Exclude cut with ID _30304_210_6319_1_1532476827261_7582060_1069-24743-0_sp1.1 from training. Duration: 0.92725 2023-10-03 18:21:21,363 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_41452_210_7291_1_1533437687_3941281_323-172873-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 18:21:21,381 WARNING [train.py:1204] (2/4) Exclude cut with ID _37086_210_9038_1_1532999540891_7021250_802-226709-0_sp1.1 from training. Duration: 0.97275 2023-10-03 18:21:22,855 WARNING [train.py:1204] (2/4) Exclude cut with ID IC0144W0500-71767 from training. Duration: 0.931875 2023-10-03 18:21:24,682 WARNING [train.py:1204] (2/4) Exclude cut with ID _39846_210_12651_1_1533605310265_2115212_233-129699-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 18:21:25,992 WARNING [train.py:1204] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_574-45541-0_sp0.9 from training. Duration: 0.72225 2023-10-03 18:21:27,239 INFO [train.py:1046] (2/4) Epoch 39, batch 2450, loss[loss=0.1473, simple_loss=0.2221, pruned_loss=0.03621, over 23222.00 frames. ], tot_loss[loss=0.1572, simple_loss=0.2366, pruned_loss=0.03883, over 4710341.13 frames. ], batch size: 119, lr: 2.61e-03, grad_scale: 8.0 2023-10-03 18:21:30,162 WARNING [train.py:1204] (2/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_372-298093-0_sp1.1 from training. Duration: 0.72725 2023-10-03 18:21:30,175 WARNING [train.py:1204] (2/4) Exclude cut with ID _62574_210_2655_1_1535180407058_3933249_34-26343-0_sp1.1 from training. Duration: 0.97275 2023-10-03 18:21:34,384 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_512-256238-0_sp1.1 from training. Duration: 0.963625 2023-10-03 18:21:34,396 WARNING [train.py:1204] (2/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_196-55502-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 18:21:35,851 WARNING [train.py:1204] (2/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_195-167011-0 from training. Duration: 0.75 2023-10-03 18:21:36,648 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.2.encoder.layers.0.feed_forward1.out_whiten, num_groups=1, num_channels=384, metric=5.63 vs. limit=15.0 2023-10-03 18:21:41,985 WARNING [train.py:1204] (2/4) Exclude cut with ID _13678_210_7497_1_1530530069226_3463170_345-230416-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 18:21:42,002 WARNING [train.py:1204] (2/4) Exclude cut with ID _51078_210_9070_1_1534161557424_3805250_225-327809-0_sp1.1 from training. Duration: 0.963625 2023-10-03 18:21:44,692 WARNING [train.py:1204] (2/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_18-189198-0_sp1.1 from training. Duration: 0.8181875 2023-10-03 18:21:44,710 WARNING [train.py:1204] (2/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_329-82235-0_sp1.1 from training. Duration: 0.7454375 2023-10-03 18:21:44,725 WARNING [train.py:1204] (2/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_726-270780-0_sp0.9 from training. Duration: 0.9555625 2023-10-03 18:21:45,994 WARNING [train.py:1204] (2/4) Exclude cut with ID _66873_210_5517_1_1535454136506_4293370_654-52713-0 from training. Duration: 0.77 2023-10-03 18:21:46,198 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.out_combiner.scale_min, batch_count=1362140.0, ans=0.2 2023-10-03 18:21:50,465 WARNING [train.py:1204] (2/4) Exclude cut with ID _20455_210_5219_1_1531565996775_8098100_191-16006-0_sp1.1 from training. Duration: 0.963625 2023-10-03 18:21:53,162 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_3171_210_2087_1_1526109084_2330007_66-260583-0_sp1.1 from training. Duration: 0.6545625 2023-10-03 18:21:53,215 WARNING [train.py:1204] (2/4) Exclude cut with ID _38670_210_15379_1_1533448810659_4391059_63-129006-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 18:21:58,096 WARNING [train.py:1204] (2/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_393-57888-0_sp0.9 from training. Duration: 0.711125 2023-10-03 18:21:58,135 WARNING [train.py:1204] (2/4) Exclude cut with ID _6275_210_2084_1_1528110062213_7669049_857-298942-0_sp0.9 from training. Duration: 0.9444375 2023-10-03 18:21:58,280 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.balancer1.prob, batch_count=1362206.6666666667, ans=0.125 2023-10-03 18:22:00,645 WARNING [train.py:1204] (2/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_239-22076-0_sp0.9 from training. Duration: 0.9444375 2023-10-03 18:22:00,902 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.conv_module2.balancer1.min_positive, batch_count=1362206.6666666667, ans=0.025 2023-10-03 18:22:01,906 WARNING [train.py:1204] (2/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_424-227947-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 18:22:03,391 WARNING [train.py:1204] (2/4) Exclude cut with ID _14069_210_5632_1_1530512692033_4803390_95-1896-0 from training. Duration: 0.98 2023-10-03 18:22:03,475 WARNING [train.py:1204] (2/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_96-244299-0_sp0.9 from training. Duration: 0.97775 2023-10-03 18:22:04,681 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.636e+02 1.889e+02 2.081e+02 2.399e+02 4.481e+02, threshold=4.162e+02, percent-clipped=1.0 2023-10-03 18:22:10,578 WARNING [train.py:1204] (2/4) Exclude cut with ID _46683_210_9642_1_1533730913492_4591116_562-319825-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 18:22:12,619 WARNING [train.py:1204] (2/4) Exclude cut with ID _64639_210_5816_1_1535367619437_3528000_284-173474-0_sp1.1 from training. Duration: 0.963625 2023-10-03 18:22:12,640 WARNING [train.py:1204] (2/4) Exclude cut with ID _29529_210_7234_1_1532393961455_3977460_497-100133-0_sp1.1 from training. Duration: 0.936375 2023-10-03 18:22:12,667 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_12868_210_6753_1_1530701932_530275_18-76322-0_sp0.9 from training. Duration: 0.9666875 2023-10-03 18:22:12,708 WARNING [train.py:1204] (2/4) Exclude cut with ID _50093_210_4220_1_1534417141207_3432299_116-303447-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 18:22:14,005 WARNING [train.py:1204] (2/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_44-79427-0_sp1.1 from training. Duration: 0.8909375 2023-10-03 18:22:14,073 WARNING [train.py:1204] (2/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_887-249555-0 from training. Duration: 0.77 2023-10-03 18:22:15,617 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.feed_forward1.out_proj.dropout_p, batch_count=1362273.3333333333, ans=0.1 2023-10-03 18:22:15,692 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.bypass.skip_rate, batch_count=1362273.3333333333, ans=0.07 2023-10-03 18:22:16,793 WARNING [train.py:1204] (2/4) Exclude cut with ID _28461_210_2253_1_1532771775189_3041299_96-152158-0_sp0.9 from training. Duration: 0.9444375 2023-10-03 18:22:16,839 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_14058_210_4163_1_1530690953_6327180_49-210687-0_sp1.1 from training. Duration: 0.8545625 2023-10-03 18:22:20,041 WARNING [train.py:1204] (2/4) Exclude cut with ID _40399_210_4353_1_1533305658194_3279406_351-127903-0_sp1.1 from training. Duration: 0.97275 2023-10-03 18:22:20,047 WARNING [train.py:1204] (2/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_184-206939-0_sp1.1 from training. Duration: 0.936375 2023-10-03 18:22:20,304 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.bypass_mid.scale_min, batch_count=1362273.3333333333, ans=0.2 2023-10-03 18:22:24,108 WARNING [train.py:1204] (2/4) Exclude cut with ID _14100_210_8341_1_1530529320613_3678069_258-224599-0_sp1.1 from training. Duration: 0.7 2023-10-03 18:22:25,785 WARNING [train.py:1204] (2/4) Exclude cut with ID _6493_210_1747_1_1528459210424_4012470_324-323854-0 from training. Duration: 0.92 2023-10-03 18:22:25,881 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_12868_210_6753_1_1530702479_3610564_151-325626-0_sp1.1 from training. Duration: 0.8818125 2023-10-03 18:22:29,035 WARNING [train.py:1204] (2/4) Exclude cut with ID _14528_210_8254_1_1530669700514_4306939_106-43145-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 18:22:29,058 WARNING [train.py:1204] (2/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_686-54383-0 from training. Duration: 0.87 2023-10-03 18:22:29,096 WARNING [train.py:1204] (2/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_801-102362-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 18:22:30,467 WARNING [train.py:1204] (2/4) Exclude cut with ID _48677_210_5663_1_1533902405919_3892989_408-40740-0_sp0.9 from training. Duration: 0.92225 2023-10-03 18:22:30,728 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.ff3_skip_rate, batch_count=1362340.0, ans=0.0 2023-10-03 18:22:34,640 WARNING [train.py:1204] (2/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_448-212135-0_sp1.1 from training. Duration: 0.82725 2023-10-03 18:22:35,996 WARNING [train.py:1204] (2/4) Exclude cut with ID _59170_210_6721_1_1534983633960_4203335_320-124047-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 18:22:37,484 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_4692_210_3528_1_1526899755_4323254_107-44401-0_sp1.1 from training. Duration: 0.7818125 2023-10-03 18:22:40,305 INFO [train.py:1046] (2/4) Epoch 39, batch 2500, loss[loss=0.1441, simple_loss=0.2174, pruned_loss=0.03535, over 23630.00 frames. ], tot_loss[loss=0.1566, simple_loss=0.2359, pruned_loss=0.03865, over 4705643.16 frames. ], batch size: 149, lr: 2.61e-03, grad_scale: 8.0 2023-10-03 18:22:40,430 WARNING [train.py:1204] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_731-2428-0 from training. Duration: 0.83 2023-10-03 18:22:41,717 WARNING [train.py:1204] (2/4) Exclude cut with ID _29857_210_20034_1_1532395886753_3798160_335-163562-0_sp1.1 from training. Duration: 0.77275 2023-10-03 18:22:47,859 WARNING [train.py:1204] (2/4) Exclude cut with ID _14832_210_8750_1_1530691351539_3171348_171-275004-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 18:22:55,372 WARNING [train.py:1204] (2/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_105-328174-0_sp0.9 from training. Duration: 0.8444375 2023-10-03 18:22:55,397 WARNING [train.py:1204] (2/4) Exclude cut with ID _68965_210_1883_1_1535698962272_3497490_164-118421-0_sp1.1 from training. Duration: 0.936375 2023-10-03 18:22:57,038 WARNING [train.py:1204] (2/4) Exclude cut with ID _19896_210_1883_1_1531709022662_6649569_380-24170-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 18:22:57,043 WARNING [train.py:1204] (2/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_505-205270-0_sp1.1 from training. Duration: 0.3454375 2023-10-03 18:23:04,330 WARNING [train.py:1204] (2/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_427-208901-0_sp1.1 from training. Duration: 0.6181875 2023-10-03 18:23:05,576 WARNING [train.py:1204] (2/4) Exclude cut with ID _23818_210_6228_1_1531917063884_3676920_85-326916-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 18:23:05,660 WARNING [train.py:1204] (2/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_101-293367-0_sp1.1 from training. Duration: 0.57275 2023-10-03 18:23:05,674 WARNING [train.py:1204] (2/4) Exclude cut with ID _5409_210_1388_1_1527292850189_5865191_178-127970-0_sp1.1 from training. Duration: 0.5181875 2023-10-03 18:23:07,206 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_403-39619-0 from training. Duration: 0.84 2023-10-03 18:23:08,475 WARNING [train.py:1204] (2/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_427-87841-0_sp1.1 from training. Duration: 0.963625 2023-10-03 18:23:08,515 WARNING [train.py:1204] (2/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_286-68181-0_sp1.1 from training. Duration: 0.97275 2023-10-03 18:23:09,906 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6673_210_3273_1_1527767790_3173240_327-306185-0 from training. Duration: 0.83 2023-10-03 18:23:09,930 WARNING [train.py:1204] (2/4) Exclude cut with ID _3675_210_4557_1_1526639934408_4182089_106-203713-0_sp1.1 from training. Duration: 0.963625 2023-10-03 18:23:09,980 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_53071_210_16554_1_1534493434_1276796_130-36076-0 from training. Duration: 0.95 2023-10-03 18:23:10,017 WARNING [train.py:1204] (2/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_586-93054-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 18:23:14,749 WARNING [train.py:1204] (2/4) Exclude cut with ID _23825_210_13449_1_1531915313526_3497150_262-10571-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 18:23:14,826 WARNING [train.py:1204] (2/4) Exclude cut with ID _28783_210_7943_1_1532671164808_7155479_432-67885-0_sp1.1 from training. Duration: 0.97275 2023-10-03 18:23:16,334 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6287_210_3273_1_1527509506_2653716_182-199887-0_sp0.9 from training. Duration: 0.5666875 2023-10-03 18:23:18,163 WARNING [train.py:1204] (2/4) Exclude cut with ID _3231_210_2106_1_1526205864934_6784220_171-256004-0 from training. Duration: 0.86 2023-10-03 18:23:18,224 WARNING [train.py:1204] (2/4) Exclude cut with ID _69051_210_1531_1_1535885566901_3829538_332-92090-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 18:23:20,874 WARNING [train.py:1204] (2/4) Exclude cut with ID _3675_210_4557_1_1526639934408_4182089_104-324125-0_sp1.1 from training. Duration: 0.963625 2023-10-03 18:23:23,671 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_41764_210_2521_1_1533686110_4089313_501-2119-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 18:23:27,082 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_14056_210_4163_1_1530604483_7572827_56-100988-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 18:23:30,179 WARNING [train.py:1204] (2/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_398-239819-0_sp1.1 from training. Duration: 0.9 2023-10-03 18:23:35,512 WARNING [train.py:1204] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_74-48717-0_sp1.1 from training. Duration: 0.52725 2023-10-03 18:23:36,969 WARNING [train.py:1204] (2/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_177-49092-0 from training. Duration: 0.91 2023-10-03 18:23:36,986 WARNING [train.py:1204] (2/4) Exclude cut with ID _62337_210_12392_1_1535009273105_3412229_44-176353-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 18:23:36,995 WARNING [train.py:1204] (2/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_110-130961-0_sp1.1 from training. Duration: 0.42725 2023-10-03 18:23:38,512 WARNING [train.py:1204] (2/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_37-189262-0_sp1.1 from training. Duration: 0.8909375 2023-10-03 18:23:38,512 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_3172_210_2087_1_1526292929_3987865_197-97762-0_sp0.9 from training. Duration: 0.8333125 2023-10-03 18:23:41,747 WARNING [train.py:1204] (2/4) Exclude cut with ID IC0677W0065-334897 from training. Duration: 0.896 2023-10-03 18:23:41,748 WARNING [train.py:1204] (2/4) Exclude cut with ID IC0677W0080-334912 from training. Duration: 0.768 2023-10-03 18:23:41,753 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_120-41668-0 from training. Duration: 0.7 2023-10-03 18:23:42,081 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.feed_forward1.hidden_balancer.prob, batch_count=1362673.3333333333, ans=0.125 2023-10-03 18:23:43,311 WARNING [train.py:1204] (2/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_460-205287-0_sp1.1 from training. Duration: 0.963625 2023-10-03 18:23:46,002 WARNING [train.py:1204] (2/4) Exclude cut with ID _14632_210_9988_1_1530691279392_3923200_102-204477-0 from training. Duration: 0.97 2023-10-03 18:23:46,007 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_34815_210_6388_1_1533381320_3571903_279-305565-0 from training. Duration: 0.92 2023-10-03 18:23:46,075 WARNING [train.py:1204] (2/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_91-148831-0_sp1.1 from training. Duration: 0.9 2023-10-03 18:23:46,132 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6291_210_3273_1_1527683649_1078498_82-221196-0 from training. Duration: 0.76 2023-10-03 18:23:46,249 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.0.bypass_mid.scale_min, batch_count=1362673.3333333333, ans=0.2 2023-10-03 18:23:50,611 WARNING [train.py:1204] (2/4) Exclude cut with ID _38020_210_5986_1_1533112252259_7006089_493-102745-0 from training. Duration: 0.98 2023-10-03 18:23:52,145 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder_embed.dropout.p, batch_count=1362673.3333333333, ans=0.1 2023-10-03 18:23:55,072 INFO [train.py:1046] (2/4) Epoch 39, batch 2550, loss[loss=0.1543, simple_loss=0.2265, pruned_loss=0.04103, over 23749.00 frames. ], tot_loss[loss=0.157, simple_loss=0.2364, pruned_loss=0.03879, over 4710230.44 frames. ], batch size: 179, lr: 2.61e-03, grad_scale: 8.0 2023-10-03 18:23:55,165 WARNING [train.py:1204] (2/4) Exclude cut with ID _61133_210_9969_1_1535196777227_3696409_40-93442-0_sp1.1 from training. Duration: 0.92725 2023-10-03 18:23:55,951 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.3.encoder.layers.1.nonlin_attention.whiten2, num_groups=1, num_channels=512, metric=5.20 vs. limit=15.0 2023-10-03 18:23:56,573 WARNING [train.py:1204] (2/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_45-258852-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 18:23:56,611 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_53071_210_16554_1_1534493434_1276796_130-36076-0_sp1.1 from training. Duration: 0.863625 2023-10-03 18:23:58,588 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8694_210_6400_1_1529065754_3260756_332-13553-0_sp1.1 from training. Duration: 0.92725 2023-10-03 18:23:59,846 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_405-339986-0 from training. Duration: 0.51 2023-10-03 18:23:59,904 WARNING [train.py:1204] (2/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_927-43827-0_sp1.1 from training. Duration: 0.67275 2023-10-03 18:24:04,149 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_397-147196-0 from training. Duration: 0.8 2023-10-03 18:24:05,414 WARNING [train.py:1204] (2/4) Exclude cut with ID 3557-8342-0013-54691-0_sp1.1 from training. Duration: 0.836375 2023-10-03 18:24:08,171 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_47438_210_5399_1_1533797471_4513356_626-127722-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 18:24:09,713 WARNING [train.py:1204] (2/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_256-323883-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 18:24:09,718 WARNING [train.py:1204] (2/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_839-173017-0_sp0.9 from training. Duration: 0.4555625 2023-10-03 18:24:10,959 WARNING [train.py:1204] (2/4) Exclude cut with ID _5289_210_2084_1_1527678202736_3779299_392-261988-0_sp1.1 from training. Duration: 0.5909375 2023-10-03 18:24:10,998 WARNING [train.py:1204] (2/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_119-224384-0_sp1.1 from training. Duration: 0.936375 2023-10-03 18:24:11,684 WARNING [train.py:1204] (2/4) Exclude cut with ID _57523_210_1856_1_1534766294545_3732989_262-267933-0_sp1.1 from training. Duration: 0.963625 2023-10-03 18:24:14,297 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.self_attn_weights.pos_emb_skip_rate, batch_count=1362806.6666666667, ans=0.0 2023-10-03 18:24:15,506 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_35361_210_4238_1_1533262810_7793476_675-87012-0_sp0.9 from training. Duration: 0.988875 2023-10-03 18:24:15,532 WARNING [train.py:1204] (2/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_17-90541-0 from training. Duration: 0.94 2023-10-03 18:24:15,560 WARNING [train.py:1204] (2/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_535-14410-0_sp1.1 from training. Duration: 0.42725 2023-10-03 18:24:15,566 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_24673_210_6400_1_1531968633_778659_94-154511-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 18:24:15,569 WARNING [train.py:1204] (2/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_212-10501-0 from training. Duration: 0.94 2023-10-03 18:24:25,842 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.1.ff2_skip_rate, batch_count=1362873.3333333333, ans=0.0 2023-10-03 18:24:27,699 WARNING [train.py:1204] (2/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_383-285196-0_sp1.1 from training. Duration: 0.8818125 2023-10-03 18:24:33,651 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.541e+02 1.933e+02 2.149e+02 2.358e+02 3.387e+02, threshold=4.299e+02, percent-clipped=0.0 2023-10-03 18:24:33,775 WARNING [train.py:1204] (2/4) Exclude cut with ID _45560_210_21982_1_1533812603951_3584999_238-47020-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 18:24:33,786 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_21500_210_2054_1_1532149310_2716758_278-18053-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 18:24:33,802 WARNING [train.py:1204] (2/4) Exclude cut with ID _56235_210_10640_1_1534557335235_4161099_453-9626-0_sp1.1 from training. Duration: 0.936375 2023-10-03 18:24:35,162 WARNING [train.py:1204] (2/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_153-97441-0_sp1.1 from training. Duration: 0.5090625 2023-10-03 18:24:36,924 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.bypass.scale_min, batch_count=1362873.3333333333, ans=0.2 2023-10-03 18:24:42,139 WARNING [train.py:1204] (2/4) Exclude cut with ID _67725_210_27767_1_1535608756671_5105359_431-120895-0_sp1.1 from training. Duration: 0.963625 2023-10-03 18:24:45,352 WARNING [train.py:1204] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_574-45541-0_sp1.1 from training. Duration: 0.5909375 2023-10-03 18:24:46,550 WARNING [train.py:1204] (2/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_162-103620-0_sp1.1 from training. Duration: 0.8454375 2023-10-03 18:24:46,572 WARNING [train.py:1204] (2/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_734-129761-0_sp1.1 from training. Duration: 0.7454375 2023-10-03 18:24:46,601 WARNING [train.py:1204] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_620-276793-0_sp1.1 from training. Duration: 0.536375 2023-10-03 18:24:46,644 WARNING [train.py:1204] (2/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_153-97441-0_sp0.9 from training. Duration: 0.62225 2023-10-03 18:24:49,675 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.attention_skip_rate, batch_count=1362940.0, ans=0.0 2023-10-03 18:24:50,919 WARNING [train.py:1204] (2/4) Exclude cut with ID _12733_210_8341_1_1530167424556_3646380_523-85621-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 18:24:50,931 WARNING [train.py:1204] (2/4) Exclude cut with ID _64048_210_10626_1_1535198382802_3835179_352-179834-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 18:24:55,600 WARNING [train.py:1204] (2/4) Exclude cut with ID _55162_210_17071_1_1534408248583_3753480_43-242364-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 18:24:55,629 WARNING [train.py:1204] (2/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_661-264857-0 from training. Duration: 0.93 2023-10-03 18:24:55,636 WARNING [train.py:1204] (2/4) Exclude cut with ID _6274_210_2084_1_1527937583837_7494020_325-237800-0_sp1.1 from training. Duration: 0.97275 2023-10-03 18:24:56,950 WARNING [train.py:1204] (2/4) Exclude cut with ID _55328_210_27767_1_1534580973260_4088170_385-171459-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 18:24:58,123 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_3780_210_2087_1_1526467389_2145572_176-107140-0_sp1.1 from training. Duration: 0.62725 2023-10-03 18:24:59,616 WARNING [train.py:1204] (2/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_375-244976-0_sp1.1 from training. Duration: 0.6818125 2023-10-03 18:25:00,300 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.3.encoder.layers.2.nonlin_attention.whiten2, num_groups=1, num_channels=512, metric=5.63 vs. limit=15.0 2023-10-03 18:25:01,275 WARNING [train.py:1204] (2/4) Exclude cut with ID _35941_210_15379_1_1532995294515_4023089_309-33162-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 18:25:06,237 WARNING [train.py:1204] (2/4) Exclude cut with ID _14459_210_2228_1_1531443641946_7516700_1185-22038-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 18:25:07,712 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.1.ff3_skip_rate, batch_count=1363073.3333333333, ans=0.0 2023-10-03 18:25:08,877 INFO [train.py:1046] (2/4) Epoch 39, batch 2600, loss[loss=0.1627, simple_loss=0.2508, pruned_loss=0.03735, over 24641.00 frames. ], tot_loss[loss=0.1578, simple_loss=0.2375, pruned_loss=0.039, over 4725694.72 frames. ], batch size: 68, lr: 2.61e-03, grad_scale: 8.0 2023-10-03 18:25:08,981 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_1197-69460-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 18:25:10,454 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9012W0022-501157 from training. Duration: 0.896 2023-10-03 18:25:12,196 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.conv_module2.balancer2.prob, batch_count=1363073.3333333333, ans=0.125 2023-10-03 18:25:13,269 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9023W0009-506302 from training. Duration: 0.896 2023-10-03 18:25:14,546 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_2821_210_2715_1_1526108385_3763560_73-296673-0_sp1.1 from training. Duration: 0.7545625 2023-10-03 18:25:14,584 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9025W0113-507442 from training. Duration: 0.896 2023-10-03 18:25:14,653 WARNING [train.py:1204] (2/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_739-2098-0 from training. Duration: 0.92 2023-10-03 18:25:14,665 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9029W0332-509722 from training. Duration: 0.768 2023-10-03 18:25:17,898 WARNING [train.py:1204] (2/4) Exclude cut with ID _16908_210_5724_1_1531119157248_3772700_68-101953-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 18:25:17,923 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9042W0011-515617 from training. Duration: 0.768 2023-10-03 18:25:19,228 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_3019_210_2028_1_1526044245_3264865_191-72235-0 from training. Duration: 0.97 2023-10-03 18:25:19,316 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9052W0006-520792 from training. Duration: 0.896 2023-10-03 18:25:22,138 WARNING [train.py:1204] (2/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_245-174553-0_sp1.1 from training. Duration: 0.8 2023-10-03 18:25:24,737 WARNING [train.py:1204] (2/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_202-67017-0 from training. Duration: 0.82 2023-10-03 18:25:26,629 WARNING [train.py:1204] (2/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_304-59227-0 from training. Duration: 0.77 2023-10-03 18:25:27,948 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_3133_210_2087_1_1526106446_2324690_246-125000-0_sp0.9 from training. Duration: 0.688875 2023-10-03 18:25:27,980 WARNING [train.py:1204] (2/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_243-42116-0 from training. Duration: 0.73 2023-10-03 18:25:29,469 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9087W0121-538492 from training. Duration: 0.896 2023-10-03 18:25:29,483 WARNING [train.py:1204] (2/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_120-76843-0 from training. Duration: 0.55 2023-10-03 18:25:40,184 WARNING [train.py:1204] (2/4) Exclude cut with ID _7852_210_5219_1_1528286471316_8139400_850-304621-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 18:25:40,204 WARNING [train.py:1204] (2/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_154-285415-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 18:25:40,216 WARNING [train.py:1204] (2/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_538-278194-0_sp1.1 from training. Duration: 0.92725 2023-10-03 18:25:40,216 WARNING [train.py:1204] (2/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_119-207162-0 from training. Duration: 0.71 2023-10-03 18:25:40,342 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder_embed.conv.2.prob, batch_count=1363206.6666666667, ans=0.125 2023-10-03 18:25:42,919 WARNING [train.py:1204] (2/4) Exclude cut with ID _2334_210_2667_1_1525608238880_4705129_459-104997-0_sp1.1 from training. Duration: 0.863625 2023-10-03 18:25:47,814 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9147W0019-567847 from training. Duration: 0.768 2023-10-03 18:25:53,642 WARNING [train.py:1204] (2/4) Exclude cut with ID _69884_210_9771_1_1535846392818_4166169_228-189439-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 18:25:53,669 WARNING [train.py:1204] (2/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_196-77172-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 18:25:53,746 WARNING [train.py:1204] (2/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_625-338546-0 from training. Duration: 0.88 2023-10-03 18:25:55,012 WARNING [train.py:1204] (2/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_193-327630-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 18:25:55,014 WARNING [train.py:1204] (2/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_391-24006-0_sp1.1 from training. Duration: 0.92725 2023-10-03 18:25:55,080 WARNING [train.py:1204] (2/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_197-28164-0 from training. Duration: 0.8 2023-10-03 18:25:57,109 WARNING [train.py:1204] (2/4) Exclude cut with ID _8395_210_1531_1_1528597871523_4411579_431-132470-0_sp1.1 from training. Duration: 0.67275 2023-10-03 18:25:57,129 WARNING [train.py:1204] (2/4) Exclude cut with ID _35350_210_16040_1_1533040191183_4075299_465-248326-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 18:25:57,392 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.conv_module1.balancer2.prob, batch_count=1363273.3333333333, ans=0.125 2023-10-03 18:25:58,476 WARNING [train.py:1204] (2/4) Exclude cut with ID _16756_210_4915_1_1531047803763_7104259_101-141922-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 18:26:02,971 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9212W0005-600172 from training. Duration: 0.896 2023-10-03 18:26:03,007 WARNING [train.py:1204] (2/4) Exclude cut with ID _63186_210_2084_1_1535414479705_3908649_378-166820-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 18:26:03,019 WARNING [train.py:1204] (2/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_52-211683-0_sp1.1 from training. Duration: 0.8181875 2023-10-03 18:26:09,081 WARNING [train.py:1204] (2/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_1147-237729-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 18:26:10,377 WARNING [train.py:1204] (2/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_234-14174-0_sp0.9 from training. Duration: 0.911125 2023-10-03 18:26:11,757 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_454-276429-0 from training. Duration: 0.87 2023-10-03 18:26:13,037 WARNING [train.py:1204] (2/4) Exclude cut with ID _53713_210_24116_1_1534314487762_7416751_755-165313-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 18:26:14,528 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8278_210_2550_1_1528637099_4220413_436-317072-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 18:26:15,747 WARNING [train.py:1204] (2/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_28-324729-0_sp1.1 from training. Duration: 0.8545625 2023-10-03 18:26:22,033 WARNING [train.py:1204] (2/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_326-165900-0 from training. Duration: 0.69 2023-10-03 18:26:23,439 INFO [train.py:1046] (2/4) Epoch 39, batch 2650, loss[loss=0.1515, simple_loss=0.2343, pruned_loss=0.03431, over 24659.00 frames. ], tot_loss[loss=0.1583, simple_loss=0.2381, pruned_loss=0.03927, over 4712898.11 frames. ], batch size: 65, lr: 2.61e-03, grad_scale: 8.0 2023-10-03 18:26:23,524 WARNING [train.py:1204] (2/4) Exclude cut with ID _53489_210_6228_1_1534298357169_3835970_96-30246-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 18:26:23,694 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8275_210_4198_1_1528455320_3892571_491-17707-0_sp1.1 from training. Duration: 0.5818125 2023-10-03 18:26:27,732 WARNING [train.py:1204] (2/4) Exclude cut with ID _14348_210_8341_1_1530599434996_3778640_240-232370-0 from training. Duration: 0.84 2023-10-03 18:26:27,742 WARNING [train.py:1204] (2/4) Exclude cut with ID _45012_210_12651_1_1533608435136_5371380_54-112623-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 18:26:29,719 WARNING [train.py:1204] (2/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_328-264776-0_sp1.1 from training. Duration: 0.6090625 2023-10-03 18:26:30,971 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9325W0001-648427 from training. Duration: 0.768 2023-10-03 18:26:30,979 WARNING [train.py:1204] (2/4) Exclude cut with ID _56480_210_4220_1_1534679942079_3075409_49-74717-0_sp1.1 from training. Duration: 0.936375 2023-10-03 18:26:32,451 WARNING [train.py:1204] (2/4) Exclude cut with ID _59500_210_8254_1_1534809610680_3736009_6-241869-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 18:26:32,600 WARNING [train.py:1204] (2/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_172-215932-0_sp0.9 from training. Duration: 0.7333125 2023-10-03 18:26:34,472 WARNING [train.py:1204] (2/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_17-90541-0_sp1.1 from training. Duration: 0.8545625 2023-10-03 18:26:37,158 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_43831_210_2158_1_1533725502_5872791_525-89447-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 18:26:38,966 WARNING [train.py:1204] (2/4) Exclude cut with ID _12302_210_5219_1_1529924446466_8107280_907-182949-0 from training. Duration: 0.92 2023-10-03 18:26:38,987 WARNING [train.py:1204] (2/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_402-102311-0_sp0.9 from training. Duration: 0.7444375 2023-10-03 18:26:39,017 WARNING [train.py:1204] (2/4) Exclude cut with ID _8021_210_7994_1_1528376451358_3653069_179-45206-0_sp0.9 from training. Duration: 0.9555625 2023-10-03 18:26:40,508 WARNING [train.py:1204] (2/4) Exclude cut with ID _40137_210_12210_1_1533290484046_3795180_249-187059-0 from training. Duration: 0.88 2023-10-03 18:26:40,776 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.bypass_mid.scale_min, batch_count=1363473.3333333333, ans=0.2 2023-10-03 18:26:41,990 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9653W0194-675472 from training. Duration: 0.9658125 2023-10-03 18:26:44,674 WARNING [train.py:1204] (2/4) Exclude cut with ID _51332_210_9988_1_1534244371836_3801270_336-255604-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 18:26:49,178 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_20120_210_6753_1_1531643035_5482859_510-96996-0 from training. Duration: 0.87 2023-10-03 18:26:49,190 WARNING [train.py:1204] (2/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_1167-20437-0_sp1.1 from training. Duration: 0.92725 2023-10-03 18:26:50,553 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_3475_210_2028_1_1526193578_3622569_265-276625-0 from training. Duration: 0.76 2023-10-03 18:26:52,107 WARNING [train.py:1204] (2/4) Exclude cut with ID _14860_210_4915_1_1530702020340_7343240_470-172418-0_sp1.1 from training. Duration: 0.97275 2023-10-03 18:26:52,115 WARNING [train.py:1204] (2/4) Exclude cut with ID _5444_210_2151_1_1527418808464_5312210_581-315411-0_sp1.1 from training. Duration: 0.563625 2023-10-03 18:26:52,129 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_34450_210_9547_1_1533099443_4109050_519-342439-0_sp1.1 from training. Duration: 0.97275 2023-10-03 18:26:53,454 WARNING [train.py:1204] (2/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_50-175961-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 18:26:55,089 WARNING [train.py:1204] (2/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_372-6869-0 from training. Duration: 0.44 2023-10-03 18:26:55,094 WARNING [train.py:1204] (2/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_3-237434-0 from training. Duration: 0.76 2023-10-03 18:26:58,534 WARNING [train.py:1204] (2/4) Exclude cut with ID _8772_210_1850_1_1528635595972_3664011_247-336448-0_sp1.1 from training. Duration: 0.67275 2023-10-03 18:27:01,269 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_427-78097-0 from training. Duration: 0.8 2023-10-03 18:27:01,284 WARNING [train.py:1204] (2/4) Exclude cut with ID _69884_210_9771_1_1535846392818_4166169_466-252727-0_sp1.1 from training. Duration: 0.97275 2023-10-03 18:27:02,907 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.627e+02 2.015e+02 2.271e+02 2.609e+02 3.479e+02, threshold=4.541e+02, percent-clipped=0.0 2023-10-03 18:27:03,015 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_15972_210_6400_1_1530877934_3405692_316-35478-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 18:27:04,408 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_809-17156-0_sp0.9 from training. Duration: 0.811125 2023-10-03 18:27:04,454 WARNING [train.py:1204] (2/4) Exclude cut with ID _57572_210_5517_1_1534921219624_3809484_594-263474-0_sp1.1 from training. Duration: 0.936375 2023-10-03 18:27:06,397 WARNING [train.py:1204] (2/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_190-228204-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 18:27:07,820 WARNING [train.py:1204] (2/4) Exclude cut with ID _19514_210_9259_1_1531530071330_5084230_117-21193-0_sp1.1 from training. Duration: 0.936375 2023-10-03 18:27:09,220 WARNING [train.py:1204] (2/4) Exclude cut with ID _69584_210_5219_1_1535785133775_4044450_190-21315-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 18:27:10,526 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_53071_210_16554_1_1534493434_1276796_184-302050-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 18:27:11,807 WARNING [train.py:1204] (2/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_120-76843-0_sp1.1 from training. Duration: 0.5 2023-10-03 18:27:13,194 WARNING [train.py:1204] (2/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_318-162017-0_sp1.1 from training. Duration: 0.82725 2023-10-03 18:27:14,599 WARNING [train.py:1204] (2/4) Exclude cut with ID _46067_210_4220_1_1534125571066_3307450_79-46864-0_sp1.1 from training. Duration: 0.92725 2023-10-03 18:27:15,953 WARNING [train.py:1204] (2/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_358-215558-0_sp1.1 from training. Duration: 0.8454375 2023-10-03 18:27:16,041 WARNING [train.py:1204] (2/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_621-66764-0_sp1.1 from training. Duration: 0.92725 2023-10-03 18:27:18,082 WARNING [train.py:1204] (2/4) Exclude cut with ID _42863_210_9070_1_1533902398788_3696920_338-156851-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 18:27:19,183 WARNING [train.py:1204] (2/4) Exclude cut with ID _5289_210_2084_1_1527678202736_3779299_392-261988-0_sp0.9 from training. Duration: 0.72225 2023-10-03 18:27:20,626 WARNING [train.py:1204] (2/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_409-84682-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 18:27:22,076 WARNING [train.py:1204] (2/4) Exclude cut with ID _9235_210_5219_1_1528891154253_8046120_30-326681-0_sp0.9 from training. Duration: 0.92225 2023-10-03 18:27:22,082 WARNING [train.py:1204] (2/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_389-184695-0_sp1.1 from training. Duration: 0.92725 2023-10-03 18:27:22,113 WARNING [train.py:1204] (2/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_442-320317-0 from training. Duration: 0.82 2023-10-03 18:27:25,026 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.ff2_skip_rate, batch_count=1363673.3333333333, ans=0.0 2023-10-03 18:27:25,135 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.bypass_mid.scale_min, batch_count=1363673.3333333333, ans=0.2 2023-10-03 18:27:26,231 WARNING [train.py:1204] (2/4) Exclude cut with ID _44778_210_2054_1_1533798127203_7443090_756-29911-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 18:27:28,062 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_401-112036-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 18:27:29,494 WARNING [train.py:1204] (2/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_181-308827-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 18:27:31,464 WARNING [train.py:1204] (2/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_332-258725-0_sp1.1 from training. Duration: 0.963625 2023-10-03 18:27:32,889 WARNING [train.py:1204] (2/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_188-260011-0_sp0.9 from training. Duration: 0.811125 2023-10-03 18:27:32,942 WARNING [train.py:1204] (2/4) Exclude cut with ID _49039_210_6315_1_1533967537541_3846118_99-302239-0_sp1.1 from training. Duration: 0.963625 2023-10-03 18:27:34,515 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.feed_forward1.out_proj.dropout_p, batch_count=1363673.3333333333, ans=0.1 2023-10-03 18:27:36,867 WARNING [train.py:1204] (2/4) Exclude cut with ID _4122_210_2084_1_1526713164417_4353280_312-294828-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 18:27:36,876 WARNING [train.py:1204] (2/4) Exclude cut with ID _14922_210_6228_1_1530707435273_3693310_303-228738-0 from training. Duration: 0.84 2023-10-03 18:27:38,190 INFO [train.py:1046] (2/4) Epoch 39, batch 2700, loss[loss=0.1552, simple_loss=0.2376, pruned_loss=0.03638, over 23374.00 frames. ], tot_loss[loss=0.1588, simple_loss=0.2389, pruned_loss=0.03934, over 4714358.86 frames. ], batch size: 119, lr: 2.61e-03, grad_scale: 8.0 2023-10-03 18:27:39,709 WARNING [train.py:1204] (2/4) Exclude cut with ID _14349_210_8341_1_1530615710818_3442470_115-285975-0_sp0.9 from training. Duration: 0.9555625 2023-10-03 18:27:40,027 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.conv_module1.balancer1.prob, batch_count=1363740.0, ans=0.125 2023-10-03 18:27:41,725 WARNING [train.py:1204] (2/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_125-144174-0_sp1.1 from training. Duration: 0.3545625 2023-10-03 18:27:43,258 WARNING [train.py:1204] (2/4) Exclude cut with ID _53565_210_10635_1_1534337218335_3820259_351-54570-0_sp1.1 from training. Duration: 0.97275 2023-10-03 18:27:43,281 WARNING [train.py:1204] (2/4) Exclude cut with ID _28317_210_12742_1_1532480302381_8510325_410-40065-0_sp1.1 from training. Duration: 0.963625 2023-10-03 18:27:44,622 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_38965_210_4852_1_1533174906_3484735_167-339442-0_sp1.1 from training. Duration: 0.963625 2023-10-03 18:27:45,973 WARNING [train.py:1204] (2/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_265-164486-0_sp1.1 from training. Duration: 0.87275 2023-10-03 18:27:45,983 WARNING [train.py:1204] (2/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_725-182484-0_sp1.1 from training. Duration: 0.92725 2023-10-03 18:27:46,014 WARNING [train.py:1204] (2/4) Exclude cut with ID _28317_210_12742_1_1532480302381_8510325_607-213951-0_sp0.9 from training. Duration: 0.9333125 2023-10-03 18:27:46,034 WARNING [train.py:1204] (2/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_605-133395-0_sp1.1 from training. Duration: 0.6 2023-10-03 18:27:47,324 WARNING [train.py:1204] (2/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_351-294650-0 from training. Duration: 0.63 2023-10-03 18:27:48,674 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_14953_210_2151_1_1530701710_5616165_710-165400-0_sp1.1 from training. Duration: 0.7909375 2023-10-03 18:27:48,913 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.bypass_mid.scale_min, batch_count=1363740.0, ans=0.2 2023-10-03 18:27:50,001 WARNING [train.py:1204] (2/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_237-58433-0_sp1.1 from training. Duration: 0.67275 2023-10-03 18:27:50,098 WARNING [train.py:1204] (2/4) Exclude cut with ID _5190_210_4557_1_1527244368799_373439_28-197030-0_sp1.1 from training. Duration: 0.6545625 2023-10-03 18:27:50,749 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_743-327651-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 18:27:54,880 WARNING [train.py:1204] (2/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_76-203867-0_sp0.9 from training. Duration: 0.8 2023-10-03 18:27:54,969 WARNING [train.py:1204] (2/4) Exclude cut with ID _14603_210_4220_1_1530615688441_3097319_274-274798-0 from training. Duration: 0.98 2023-10-03 18:27:56,299 WARNING [train.py:1204] (2/4) Exclude cut with ID _14087_210_2253_1_1530529224512_3014029_226-242140-0_sp1.1 from training. Duration: 0.736375 2023-10-03 18:27:59,165 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.self_attn_weights.pos_emb_skip_rate, batch_count=1363806.6666666667, ans=0.0 2023-10-03 18:28:00,848 WARNING [train.py:1204] (2/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_352-59958-0_sp1.1 from training. Duration: 0.7818125 2023-10-03 18:28:00,852 WARNING [train.py:1204] (2/4) Exclude cut with ID _39411_210_5986_1_1533538825030_3343649_191-251819-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 18:28:07,071 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.conv_skip_rate, batch_count=1363873.3333333333, ans=0.0 2023-10-03 18:28:08,237 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7046_210_6753_1_1527895337_6920917_642-106799-0_sp1.1 from training. Duration: 0.836375 2023-10-03 18:28:08,243 WARNING [train.py:1204] (2/4) Exclude cut with ID _22826_210_9994_1_1531908201522_3279169_412-3442-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 18:28:09,039 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.2.encoder.layers.0.feed_forward1.out_whiten, num_groups=1, num_channels=384, metric=5.13 vs. limit=15.0 2023-10-03 18:28:09,527 WARNING [train.py:1204] (2/4) Exclude cut with ID _60057_210_10640_1_1534903065793_3333150_171-181133-0_sp0.9 from training. Duration: 0.97775 2023-10-03 18:28:09,537 WARNING [train.py:1204] (2/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_154-135696-0_sp1.1 from training. Duration: 0.72725 2023-10-03 18:28:12,378 WARNING [train.py:1204] (2/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_148-186805-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 18:28:15,058 WARNING [train.py:1204] (2/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_605-165641-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 18:28:15,065 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_174-277364-0_sp0.9 from training. Duration: 0.788875 2023-10-03 18:28:15,080 WARNING [train.py:1204] (2/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_89-321955-0_sp0.9 from training. Duration: 0.888875 2023-10-03 18:28:19,579 WARNING [train.py:1204] (2/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_438-105429-0_sp1.1 from training. Duration: 0.963625 2023-10-03 18:28:19,589 WARNING [train.py:1204] (2/4) Exclude cut with ID _14970_210_15709_1_1530773543493_1715730_187-20259-0_sp0.9 from training. Duration: 0.988875 2023-10-03 18:28:27,363 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.nonlin_attention.balancer.prob, batch_count=1363940.0, ans=0.125 2023-10-03 18:28:28,489 WARNING [train.py:1204] (2/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_385-193052-0_sp0.9 from training. Duration: 0.9555625 2023-10-03 18:28:28,536 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7113_210_1849_1_1527942685_3448768_194-311626-0_sp1.1 from training. Duration: 0.92725 2023-10-03 18:28:33,175 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_3780_210_2087_1_1526467389_2145572_176-107140-0_sp0.9 from training. Duration: 0.7666875 2023-10-03 18:28:33,177 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_36756_210_7234_1_1533026991_966276_123-213457-0_sp1.1 from training. Duration: 0.936375 2023-10-03 18:28:36,621 WARNING [train.py:1204] (2/4) Exclude cut with ID _23557_210_8750_1_1531890106370_6846255_456-244861-0_sp1.1 from training. Duration: 0.963625 2023-10-03 18:28:37,946 WARNING [train.py:1204] (2/4) Exclude cut with ID _37086_210_9038_1_1532999540891_7021250_242-349170-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 18:28:38,006 WARNING [train.py:1204] (2/4) Exclude cut with ID _41298_210_9019_1_1533430371450_4822143_471-281351-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 18:28:39,298 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_53378_210_17282_1_1534221071_5769509_572-126176-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 18:28:40,614 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_1507-263032-0_sp1.1 from training. Duration: 0.963625 2023-10-03 18:28:40,651 WARNING [train.py:1204] (2/4) Exclude cut with ID _13679_210_7497_1_1530534293662_3355098_411-186088-0_sp1.1 from training. Duration: 0.8545625 2023-10-03 18:28:43,337 WARNING [train.py:1204] (2/4) Exclude cut with ID _29345_210_4381_1_1532689168263_3860169_149-257268-0_sp1.1 from training. Duration: 0.736375 2023-10-03 18:28:44,702 WARNING [train.py:1204] (2/4) Exclude cut with ID _47790_210_16187_1_1533862814066_4207519_381-252014-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 18:28:44,710 WARNING [train.py:1204] (2/4) Exclude cut with ID _35799_210_5854_1_1533263487887_3529071_512-2717-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 18:28:48,181 WARNING [train.py:1204] (2/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_505-205270-0_sp0.9 from training. Duration: 0.42225 2023-10-03 18:28:49,643 WARNING [train.py:1204] (2/4) Exclude cut with ID _14998_210_5219_1_1530705582080_7474970_731-20117-0_sp1.1 from training. Duration: 0.936375 2023-10-03 18:28:51,249 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.conv_module1.balancer2.prob, batch_count=1364073.3333333333, ans=0.125 2023-10-03 18:28:52,248 INFO [train.py:1046] (2/4) Epoch 39, batch 2750, loss[loss=0.1476, simple_loss=0.2302, pruned_loss=0.0325, over 24305.00 frames. ], tot_loss[loss=0.1583, simple_loss=0.2384, pruned_loss=0.03914, over 4710248.19 frames. ], batch size: 61, lr: 2.61e-03, grad_scale: 8.0 2023-10-03 18:28:52,361 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_37351_210_7925_1_1533012931_3812113_265-85780-0_sp1.1 from training. Duration: 0.8 2023-10-03 18:28:52,368 WARNING [train.py:1204] (2/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_313-134930-0 from training. Duration: 0.97 2023-10-03 18:28:55,129 WARNING [train.py:1204] (2/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_374-138971-0 from training. Duration: 0.94 2023-10-03 18:28:55,169 WARNING [train.py:1204] (2/4) Exclude cut with ID _30304_210_6319_1_1532476827261_7582060_1227-287886-0_sp1.1 from training. Duration: 0.936375 2023-10-03 18:28:57,985 WARNING [train.py:1204] (2/4) Exclude cut with ID _59493_210_17282_1_1535078419077_3225879_332-217544-0_sp1.1 from training. Duration: 0.97275 2023-10-03 18:28:59,794 WARNING [train.py:1204] (2/4) Exclude cut with ID _23514_210_1389_1_1531832139785_3031439_361-119640-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 18:28:59,891 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder_embed.convnext.layerdrop_rate, batch_count=1364073.3333333333, ans=0.015 2023-10-03 18:29:01,261 WARNING [train.py:1204] (2/4) Exclude cut with ID _69584_210_5219_1_1535785133775_4044450_142-118058-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 18:29:01,280 WARNING [train.py:1204] (2/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_178-177582-0_sp1.1 from training. Duration: 0.67275 2023-10-03 18:29:01,315 WARNING [train.py:1204] (2/4) Exclude cut with ID _25303_210_17519_1_1532050207576_3635970_41-195572-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 18:29:05,796 WARNING [train.py:1204] (2/4) Exclude cut with ID _55328_210_27767_1_1534580973260_4088170_329-341322-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 18:29:06,016 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.conv_module1.balancer2.prob, batch_count=1364140.0, ans=0.125 2023-10-03 18:29:07,135 WARNING [train.py:1204] (2/4) Exclude cut with ID _5608_210_2106_1_1527415439089_6819768_909-82767-0_sp1.1 from training. Duration: 0.5545625 2023-10-03 18:29:07,170 WARNING [train.py:1204] (2/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_261-297503-0_sp1.1 from training. Duration: 0.9 2023-10-03 18:29:07,177 WARNING [train.py:1204] (2/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_459-291706-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 18:29:07,178 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7762_210_1794_1_1528199508_4057408_422-227034-0 from training. Duration: 0.73 2023-10-03 18:29:07,187 WARNING [train.py:1204] (2/4) Exclude cut with ID _3067_210_2667_1_1526212974640_4503168_250-285937-0_sp0.9 from training. Duration: 0.888875 2023-10-03 18:29:07,210 WARNING [train.py:1204] (2/4) Exclude cut with ID _66873_210_5517_1_1535454136506_4293370_332-253745-0_sp1.1 from training. Duration: 0.936375 2023-10-03 18:29:10,655 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.bypass.scale_min, batch_count=1364140.0, ans=0.2 2023-10-03 18:29:13,259 WARNING [train.py:1204] (2/4) Exclude cut with ID _13938_210_9988_1_1530518718978_3693960_304-112784-0 from training. Duration: 0.46 2023-10-03 18:29:14,697 WARNING [train.py:1204] (2/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_226-267957-0_sp1.1 from training. Duration: 0.92725 2023-10-03 18:29:14,735 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_41822_210_13270_1_1533955976_4379284_608-254567-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 18:29:14,797 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_124-113562-0_sp1.1 from training. Duration: 0.8545625 2023-10-03 18:29:16,094 WARNING [train.py:1204] (2/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_117-290916-0_sp1.1 from training. Duration: 0.463625 2023-10-03 18:29:16,165 WARNING [train.py:1204] (2/4) Exclude cut with ID _28427_210_4220_1_1532581240967_3061069_102-66533-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 18:29:16,343 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=1364140.0, ans=0.1 2023-10-03 18:29:19,403 WARNING [train.py:1204] (2/4) Exclude cut with ID _55383_210_10635_1_1534468426172_2231669_134-128246-0_sp1.1 from training. Duration: 0.8090625 2023-10-03 18:29:20,784 WARNING [train.py:1204] (2/4) Exclude cut with ID _45871_210_5665_1_1533864614245_3296060_49-308316-0_sp1.1 from training. Duration: 0.97275 2023-10-03 18:29:20,837 WARNING [train.py:1204] (2/4) Exclude cut with ID _57572_210_5517_1_1534921219624_3809484_176-199205-0_sp1.1 from training. Duration: 0.97275 2023-10-03 18:29:22,356 WARNING [train.py:1204] (2/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_45-265684-0_sp1.1 from training. Duration: 0.6545625 2023-10-03 18:29:22,382 WARNING [train.py:1204] (2/4) Exclude cut with ID _15150_210_8341_1_1530752411803_7256384_469-239509-0_sp0.9 from training. Duration: 0.7555625 2023-10-03 18:29:22,415 WARNING [train.py:1204] (2/4) Exclude cut with ID _28461_210_2253_1_1532771775189_3041299_136-270644-0_sp1.1 from training. Duration: 0.8454375 2023-10-03 18:29:23,823 WARNING [train.py:1204] (2/4) Exclude cut with ID _69584_210_5219_1_1535785133775_4044450_302-47302-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 18:29:23,919 WARNING [train.py:1204] (2/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_166-128941-0_sp1.1 from training. Duration: 0.4909375 2023-10-03 18:29:31,657 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.598e+02 1.877e+02 2.030e+02 2.207e+02 3.015e+02, threshold=4.060e+02, percent-clipped=0.0 2023-10-03 18:29:31,801 WARNING [train.py:1204] (2/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_559-120504-0_sp1.1 from training. Duration: 0.97275 2023-10-03 18:29:33,234 WARNING [train.py:1204] (2/4) Exclude cut with ID _6456_210_1534_1_1527681634212_3181049_345-252877-0_sp1.1 from training. Duration: 0.5818125 2023-10-03 18:29:33,272 WARNING [train.py:1204] (2/4) Exclude cut with ID _28317_210_12742_1_1532480302381_8510325_902-233003-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 18:29:37,402 WARNING [train.py:1204] (2/4) Exclude cut with ID _36538_210_9994_1_1533204020363_3795620_204-261385-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 18:29:37,406 WARNING [train.py:1204] (2/4) Exclude cut with ID _14821_210_8750_1_1530680503991_6829359_189-297451-0_sp1.1 from training. Duration: 0.5 2023-10-03 18:29:39,179 WARNING [train.py:1204] (2/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_1211-226066-0_sp0.9 from training. Duration: 0.8444375 2023-10-03 18:29:41,198 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.4.encoder.layers.1.feed_forward3.out_whiten, num_groups=1, num_channels=384, metric=12.38 vs. limit=15.0 2023-10-03 18:29:45,899 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6786_210_1388_1_1527839699_4194495_273-149841-0_sp1.1 from training. Duration: 0.836375 2023-10-03 18:29:45,944 WARNING [train.py:1204] (2/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_380-162648-0_sp1.1 from training. Duration: 0.8545625 2023-10-03 18:29:45,945 WARNING [train.py:1204] (2/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_494-199538-0 from training. Duration: 0.75 2023-10-03 18:29:50,663 WARNING [train.py:1204] (2/4) Exclude cut with ID _49097_210_16049_1_1533952780207_4360979_276-87053-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 18:29:52,194 WARNING [train.py:1204] (2/4) Exclude cut with ID _5444_210_2151_1_1527418808464_5312210_726-27165-0 from training. Duration: 0.67 2023-10-03 18:29:57,616 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_128-202104-0_sp1.1 from training. Duration: 0.57275 2023-10-03 18:29:59,569 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_11349_210_6753_1_1529800861_1101207_26-65602-0_sp1.1 from training. Duration: 0.87275 2023-10-03 18:29:59,598 WARNING [train.py:1204] (2/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_159-328157-0 from training. Duration: 0.52 2023-10-03 18:29:59,673 WARNING [train.py:1204] (2/4) Exclude cut with ID _14069_210_5632_1_1530512692033_4803390_98-21934-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 18:30:01,095 WARNING [train.py:1204] (2/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_352-59958-0_sp0.9 from training. Duration: 0.9555625 2023-10-03 18:30:02,801 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6163_210_6400_1_1527506746_4263068_283-137811-0 from training. Duration: 0.77 2023-10-03 18:30:02,828 WARNING [train.py:1204] (2/4) Exclude cut with ID _21989_210_5854_1_1531899214931_3271360_133-44759-0_sp1.1 from training. Duration: 0.7 2023-10-03 18:30:05,628 WARNING [train.py:1204] (2/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_21-174514-0_sp0.9 from training. Duration: 0.47775 2023-10-03 18:30:05,651 WARNING [train.py:1204] (2/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_201-78811-0_sp1.1 from training. Duration: 0.963625 2023-10-03 18:30:06,860 INFO [train.py:1046] (2/4) Epoch 39, batch 2800, loss[loss=0.1376, simple_loss=0.185, pruned_loss=0.04512, over 19063.00 frames. ], tot_loss[loss=0.1575, simple_loss=0.2371, pruned_loss=0.0389, over 4695105.36 frames. ], batch size: 389, lr: 2.61e-03, grad_scale: 16.0 2023-10-03 18:30:06,925 WARNING [train.py:1204] (2/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_537-164630-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 18:30:08,885 WARNING [train.py:1204] (2/4) Exclude cut with ID _14087_210_2253_1_1530529224512_3014029_156-257607-0 from training. Duration: 0.83 2023-10-03 18:30:08,903 WARNING [train.py:1204] (2/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_308-154450-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 18:30:08,948 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_45399_210_4852_1_1533624725_7579737_214-240574-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 18:30:11,568 WARNING [train.py:1204] (2/4) Exclude cut with ID _29303_210_2575_1_1532392237303_7410649_615-305517-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 18:30:11,603 WARNING [train.py:1204] (2/4) Exclude cut with ID IC0125W0002-61808 from training. Duration: 0.708 2023-10-03 18:30:11,604 WARNING [train.py:1204] (2/4) Exclude cut with ID IC0125W0017-61823 from training. Duration: 0.759 2023-10-03 18:30:11,844 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.ff3_skip_rate, batch_count=1364406.6666666667, ans=0.0 2023-10-03 18:30:14,447 WARNING [train.py:1204] (2/4) Exclude cut with ID _53219_210_4525_1_1534834836008_4064825_4-145291-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 18:30:17,173 WARNING [train.py:1204] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_185-174805-0_sp1.1 from training. Duration: 0.6181875 2023-10-03 18:30:17,188 WARNING [train.py:1204] (2/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_635-193046-0_sp1.1 from training. Duration: 0.92725 2023-10-03 18:30:20,357 WARNING [train.py:1204] (2/4) Exclude cut with ID _46067_210_4220_1_1534125571066_3307450_321-258866-0_sp1.1 from training. Duration: 0.97275 2023-10-03 18:30:21,823 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8558_210_1410_1_1528505225_7613406_131-329524-0 from training. Duration: 0.79 2023-10-03 18:30:23,161 WARNING [train.py:1204] (2/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_847-41334-0_sp0.9 from training. Duration: 0.488875 2023-10-03 18:30:24,483 WARNING [train.py:1204] (2/4) Exclude cut with ID _14227_210_4381_1_1530701804286_3940259_125-275642-0 from training. Duration: 0.99 2023-10-03 18:30:26,446 WARNING [train.py:1204] (2/4) Exclude cut with ID _25248_210_8750_1_1532062825941_5554180_248-177255-0_sp1.1 from training. Duration: 0.963625 2023-10-03 18:30:26,492 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_40047_210_2290_1_1533290519_2374311_195-165693-0_sp1.1 from training. Duration: 0.8090625 2023-10-03 18:30:26,495 WARNING [train.py:1204] (2/4) Exclude cut with ID _23109_210_4381_1_1532150828777_901949_29-258672-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 18:30:30,322 WARNING [train.py:1204] (2/4) Exclude cut with ID _14821_210_8750_1_1530680503991_6829359_2-227151-0_sp1.1 from training. Duration: 0.7545625 2023-10-03 18:30:30,362 WARNING [train.py:1204] (2/4) Exclude cut with ID _49643_210_7989_1_1534640244224_3939031_565-53982-0_sp1.1 from training. Duration: 0.963625 2023-10-03 18:30:30,365 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7544_210_6753_1_1528591327_1471426_33-346031-0_sp0.9 from training. Duration: 0.67775 2023-10-03 18:30:32,225 WARNING [train.py:1204] (2/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_95-61782-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 18:30:39,979 WARNING [train.py:1204] (2/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_686-54383-0_sp0.9 from training. Duration: 0.9666875 2023-10-03 18:30:41,328 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_14056_210_4163_1_1530604483_7572827_505-276823-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 18:30:42,753 WARNING [train.py:1204] (2/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_745-128443-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 18:30:42,974 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.conv_skip_rate, batch_count=1364540.0, ans=0.0 2023-10-03 18:30:43,656 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.0.layers.1.self_attn1.whiten, num_groups=1, num_channels=192, metric=12.51 vs. limit=22.5 2023-10-03 18:30:44,051 WARNING [train.py:1204] (2/4) Exclude cut with ID _3844_210_1390_1_1526727803582_2871209_20-251664-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 18:30:44,226 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=1364540.0, ans=0.1 2023-10-03 18:30:44,608 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.4.encoder.layers.1.nonlin_attention.whiten1, num_groups=1, num_channels=288, metric=4.19 vs. limit=10.0 2023-10-03 18:30:45,366 WARNING [train.py:1204] (2/4) Exclude cut with ID _36538_210_9994_1_1533204020363_3795620_487-164628-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 18:30:51,228 WARNING [train.py:1204] (2/4) Exclude cut with ID _5166_210_2084_1_1527073197160_4087993_309-222545-0_sp1.1 from training. Duration: 0.763625 2023-10-03 18:30:51,242 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_3531_210_3528_1_1526294203_5216456_309-227302-0 from training. Duration: 0.69 2023-10-03 18:30:51,303 WARNING [train.py:1204] (2/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_42-192460-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 18:30:51,529 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.conv_module1.balancer1.prob, batch_count=1364606.6666666667, ans=0.125 2023-10-03 18:30:52,779 WARNING [train.py:1204] (2/4) Exclude cut with ID _22083_210_8254_1_1531702862439_3678970_49-234546-0_sp1.1 from training. Duration: 0.7545625 2023-10-03 18:30:52,790 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_427-78097-0_sp0.9 from training. Duration: 0.888875 2023-10-03 18:30:55,828 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.conv_module1.balancer2.prob, batch_count=1364606.6666666667, ans=0.125 2023-10-03 18:30:56,910 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_43802_210_2290_1_1533516203_8829504_378-205254-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 18:30:58,672 WARNING [train.py:1204] (2/4) Exclude cut with ID _45329_210_7005_1_1534157951211_6774339_672-314830-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 18:31:02,828 WARNING [train.py:1204] (2/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_90-30673-0_sp1.1 from training. Duration: 0.763625 2023-10-03 18:31:04,881 WARNING [train.py:1204] (2/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_563-329943-0_sp1.1 from training. Duration: 0.9 2023-10-03 18:31:04,909 WARNING [train.py:1204] (2/4) Exclude cut with ID _21632_210_2253_1_1531994001765_3744140_154-84090-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 18:31:04,910 WARNING [train.py:1204] (2/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_103-336064-0_sp0.9 from training. Duration: 0.5666875 2023-10-03 18:31:04,958 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_43-71989-0_sp0.9 from training. Duration: 0.6333125 2023-10-03 18:31:05,010 WARNING [train.py:1204] (2/4) Exclude cut with ID _3867_210_1883_1_1526641200575_4475830_319-349850-0_sp0.9 from training. Duration: 0.7444375 2023-10-03 18:31:06,367 WARNING [train.py:1204] (2/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_629-179454-0_sp1.1 from training. Duration: 0.963625 2023-10-03 18:31:06,374 WARNING [train.py:1204] (2/4) Exclude cut with ID _65263_210_22356_1_1535763598691_3921037_49-222917-0 from training. Duration: 0.92 2023-10-03 18:31:07,635 WARNING [train.py:1204] (2/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_602-331772-0_sp1.1 from training. Duration: 0.936375 2023-10-03 18:31:07,721 WARNING [train.py:1204] (2/4) Exclude cut with ID _58414_210_8254_1_1535421609162_7151182_292-134978-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 18:31:07,754 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_38793_210_2544_1_1533176114_3849320_200-299292-0_sp1.1 from training. Duration: 0.936375 2023-10-03 18:31:09,270 WARNING [train.py:1204] (2/4) Exclude cut with ID _14970_210_15709_1_1530775547361_1966965_6-23328-0 from training. Duration: 0.86 2023-10-03 18:31:11,234 WARNING [train.py:1204] (2/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_386-225511-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 18:31:11,250 WARNING [train.py:1204] (2/4) Exclude cut with ID _38323_210_12108_1_1533121233369_3303049_416-11919-0_sp1.1 from training. Duration: 0.87275 2023-10-03 18:31:12,543 WARNING [train.py:1204] (2/4) Exclude cut with ID _4866_210_2577_1_1527404412284_4364659_256-306830-0_sp1.1 from training. Duration: 0.7909375 2023-10-03 18:31:12,658 WARNING [train.py:1204] (2/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_53-300544-0 from training. Duration: 0.67 2023-10-03 18:31:18,698 WARNING [train.py:1204] (2/4) Exclude cut with ID _41314_210_12651_1_1533459476543_3766460_80-74184-0_sp1.1 from training. Duration: 0.7545625 2023-10-03 18:31:18,712 WARNING [train.py:1204] (2/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_332-256168-0_sp1.1 from training. Duration: 0.6818125 2023-10-03 18:31:20,039 WARNING [train.py:1204] (2/4) Exclude cut with ID _48836_210_12210_1_1534075848293_3904150_429-35475-0_sp1.1 from training. Duration: 0.8090625 2023-10-03 18:31:20,727 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.2.encoder.layers.1.feed_forward2.out_whiten, num_groups=1, num_channels=384, metric=3.89 vs. limit=15.0 2023-10-03 18:31:21,327 INFO [train.py:1046] (2/4) Epoch 39, batch 2850, loss[loss=0.1529, simple_loss=0.2424, pruned_loss=0.03165, over 24082.00 frames. ], tot_loss[loss=0.1567, simple_loss=0.2363, pruned_loss=0.03858, over 4694404.36 frames. ], batch size: 80, lr: 2.61e-03, grad_scale: 8.0 2023-10-03 18:31:21,495 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_144-135833-0_sp1.1 from training. Duration: 0.97275 2023-10-03 18:31:25,621 WARNING [train.py:1204] (2/4) Exclude cut with ID _14224_210_4381_1_1530529062037_4264010_380-343670-0_sp0.9 from training. Duration: 0.97775 2023-10-03 18:31:25,644 WARNING [train.py:1204] (2/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_213-265174-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 18:31:26,984 WARNING [train.py:1204] (2/4) Exclude cut with ID _20455_210_5219_1_1531565996775_8098100_86-314283-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 18:31:30,239 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_41750_210_1960_1_1533520678_3948042_68-223813-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 18:31:31,507 WARNING [train.py:1204] (2/4) Exclude cut with ID _55153_210_2253_1_1534384786677_3162096_24-198429-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 18:31:32,925 WARNING [train.py:1204] (2/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_119-207162-0_sp0.9 from training. Duration: 0.788875 2023-10-03 18:31:34,265 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_12868_210_6753_1_1530702479_3610564_148-172861-0 from training. Duration: 0.5 2023-10-03 18:31:41,587 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_5522_210_3273_1_1527249606_3909096_318-203153-0 from training. Duration: 0.43 2023-10-03 18:31:41,594 WARNING [train.py:1204] (2/4) Exclude cut with ID _14884_210_2252_1_1530707459689_5561250_288-189949-0_sp1.1 from training. Duration: 0.97275 2023-10-03 18:31:43,024 WARNING [train.py:1204] (2/4) Exclude cut with ID _5294_210_1747_1_1527249643928_3696179_529-242298-0 from training. Duration: 0.95 2023-10-03 18:31:43,087 WARNING [train.py:1204] (2/4) Exclude cut with ID _36538_210_9994_1_1533204020363_3795620_503-110289-0_sp1.1 from training. Duration: 0.936375 2023-10-03 18:31:46,175 WARNING [train.py:1204] (2/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_385-193052-0 from training. Duration: 0.86 2023-10-03 18:31:46,229 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_456-22141-0 from training. Duration: 0.88 2023-10-03 18:31:47,687 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_38965_210_4852_1_1533174906_3484735_284-117840-0_sp1.1 from training. Duration: 0.936375 2023-10-03 18:31:57,806 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.0.attention_skip_rate, batch_count=1364873.3333333333, ans=0.0 2023-10-03 18:31:59,100 WARNING [train.py:1204] (2/4) Exclude cut with ID _32704_210_8954_1_1533017488840_6747370_75-201625-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 18:32:00,731 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.497e+02 1.914e+02 2.254e+02 2.782e+02 3.876e+02, threshold=4.507e+02, percent-clipped=0.0 2023-10-03 18:32:00,828 WARNING [train.py:1204] (2/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_255-312654-0_sp1.1 from training. Duration: 0.82725 2023-10-03 18:32:00,838 WARNING [train.py:1204] (2/4) Exclude cut with ID _47251_210_18628_1_1533814063128_4137250_214-88076-0_sp0.9 from training. Duration: 0.97775 2023-10-03 18:32:02,229 WARNING [train.py:1204] (2/4) Exclude cut with ID _4092_210_2151_1_1526643044449_3135759_227-213972-0_sp1.1 from training. Duration: 0.4909375 2023-10-03 18:32:02,251 WARNING [train.py:1204] (2/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_912-47473-0_sp1.1 from training. Duration: 0.6181875 2023-10-03 18:32:03,471 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6767_210_3273_1_1527855783_1718932_173-256601-0_sp1.1 from training. Duration: 0.72725 2023-10-03 18:32:05,144 WARNING [train.py:1204] (2/4) Exclude cut with ID _6274_210_2084_1_1527937583837_7494020_32-205288-0_sp1.1 from training. Duration: 0.7090625 2023-10-03 18:32:05,166 WARNING [train.py:1204] (2/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_39-248870-0 from training. Duration: 0.69 2023-10-03 18:32:06,856 WARNING [train.py:1204] (2/4) Exclude cut with ID _3541_210_2477_1_1526475408709_3263973_377-296839-0_sp1.1 from training. Duration: 0.5 2023-10-03 18:32:06,879 WARNING [train.py:1204] (2/4) Exclude cut with ID _13679_210_7497_1_1530534293662_3355098_413-56154-0_sp1.1 from training. Duration: 0.963625 2023-10-03 18:32:08,178 WARNING [train.py:1204] (2/4) Exclude cut with ID _14970_210_15709_1_1530775547361_1966965_201-84544-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 18:32:08,246 WARNING [train.py:1204] (2/4) Exclude cut with ID _21783_210_11357_1_1531742458730_3762679_105-95775-0_sp1.1 from training. Duration: 0.936375 2023-10-03 18:32:09,666 WARNING [train.py:1204] (2/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_131-191669-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 18:32:09,831 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.bypass.scale_min, batch_count=1364940.0, ans=0.2 2023-10-03 18:32:11,029 WARNING [train.py:1204] (2/4) Exclude cut with ID _14860_210_4915_1_1530702020340_7343240_695-4323-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 18:32:11,127 WARNING [train.py:1204] (2/4) Exclude cut with ID _39867_210_9826_1_1533553222030_9344259_991-130565-0_sp1.1 from training. Duration: 0.97275 2023-10-03 18:32:12,405 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_50-3289-0_sp1.1 from training. Duration: 0.82725 2023-10-03 18:32:15,064 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_54-347670-0_sp1.1 from training. Duration: 0.92725 2023-10-03 18:32:15,123 WARNING [train.py:1204] (2/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_200-153445-0_sp1.1 from training. Duration: 0.936375 2023-10-03 18:32:17,039 WARNING [train.py:1204] (2/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_279-302911-0_sp1.1 from training. Duration: 0.97275 2023-10-03 18:32:18,819 WARNING [train.py:1204] (2/4) Exclude cut with ID _14886_210_4220_1_1530702020355_3516190_159-73748-0_sp0.9 from training. Duration: 0.92225 2023-10-03 18:32:24,191 WARNING [train.py:1204] (2/4) Exclude cut with ID _53379_210_20034_1_1534222027363_3854654_23-222715-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 18:32:25,677 WARNING [train.py:1204] (2/4) Exclude cut with ID _12555_210_8750_1_1530075594402_3780239_449-267986-0 from training. Duration: 0.83 2023-10-03 18:32:25,914 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.feed_forward1.hidden_balancer.prob, batch_count=1365006.6666666667, ans=0.125 2023-10-03 18:32:26,940 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7544_210_6753_1_1528587722_3576686_295-258479-0 from training. Duration: 0.84 2023-10-03 18:32:28,493 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7002_210_1794_1_1527935634_4034444_487-236501-0_sp0.9 from training. Duration: 0.7333125 2023-10-03 18:32:28,552 WARNING [train.py:1204] (2/4) Exclude cut with ID _21783_210_11357_1_1531742458730_3762679_244-19107-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 18:32:28,566 WARNING [train.py:1204] (2/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_390-339137-0 from training. Duration: 0.56 2023-10-03 18:32:29,832 WARNING [train.py:1204] (2/4) Exclude cut with ID _7562_210_2084_1_1528887767916_3946229_186-326422-0_sp1.1 from training. Duration: 0.763625 2023-10-03 18:32:29,901 WARNING [train.py:1204] (2/4) Exclude cut with ID _33640_210_2655_1_1533117605479_4855469_645-145235-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 18:32:29,918 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_25767_210_2161_1_1532307483_3867748_506-171777-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 18:32:31,426 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_5438_210_1794_1_1527336687_2600009_233-58072-0_sp1.1 from training. Duration: 0.7 2023-10-03 18:32:31,426 WARNING [train.py:1204] (2/4) Exclude cut with ID IC0669W0063-330908 from training. Duration: 0.896 2023-10-03 18:32:32,087 WARNING [train.py:1204] (2/4) Exclude cut with ID IC0671W0070-331913 from training. Duration: 0.896 2023-10-03 18:32:32,090 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_258-310570-0_sp1.1 from training. Duration: 0.7454375 2023-10-03 18:32:33,192 WARNING [train.py:1204] (2/4) Exclude cut with ID _9461_210_4557_1_1529060514726_3677451_162-177507-0_sp1.1 from training. Duration: 0.97275 2023-10-03 18:32:34,610 INFO [train.py:1046] (2/4) Epoch 39, batch 2900, loss[loss=0.1567, simple_loss=0.231, pruned_loss=0.04119, over 23349.00 frames. ], tot_loss[loss=0.1563, simple_loss=0.2359, pruned_loss=0.0384, over 4689100.29 frames. ], batch size: 285, lr: 2.61e-03, grad_scale: 8.0 2023-10-03 18:32:39,225 WARNING [train.py:1204] (2/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_26-175553-0_sp0.9 from training. Duration: 0.67775 2023-10-03 18:32:39,254 WARNING [train.py:1204] (2/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_7-150481-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 18:32:40,691 WARNING [train.py:1204] (2/4) Exclude cut with ID _23557_210_8750_1_1531890106370_6846255_72-273262-0_sp1.1 from training. Duration: 0.963625 2023-10-03 18:32:40,784 WARNING [train.py:1204] (2/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_367-61330-0 from training. Duration: 0.75 2023-10-03 18:32:42,326 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.feed_forward1.hidden_balancer.prob, batch_count=1365073.3333333333, ans=0.125 2023-10-03 18:32:44,792 WARNING [train.py:1204] (2/4) Exclude cut with ID _39867_210_9826_1_1533553222030_9344259_1337-11141-0_sp1.1 from training. Duration: 0.936375 2023-10-03 18:32:44,811 WARNING [train.py:1204] (2/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_101-293367-0 from training. Duration: 0.63 2023-10-03 18:32:44,896 WARNING [train.py:1204] (2/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_10-100367-0 from training. Duration: 0.98 2023-10-03 18:32:47,765 WARNING [train.py:1204] (2/4) Exclude cut with ID _60057_210_10640_1_1534903065793_3333150_171-181133-0_sp1.1 from training. Duration: 0.8 2023-10-03 18:32:47,768 WARNING [train.py:1204] (2/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_844-96989-0_sp0.9 from training. Duration: 0.788875 2023-10-03 18:32:48,009 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.feed_forward3.hidden_balancer.prob, batch_count=1365140.0, ans=0.125 2023-10-03 18:32:49,722 WARNING [train.py:1204] (2/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_301-144595-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 18:32:51,126 WARNING [train.py:1204] (2/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_115-147343-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 18:32:54,227 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_3171_210_2087_1_1526109084_2330007_37-64334-0_sp1.1 from training. Duration: 0.7454375 2023-10-03 18:32:55,514 WARNING [train.py:1204] (2/4) Exclude cut with ID _19514_210_9259_1_1531530071330_5084230_769-272914-0_sp1.1 from training. Duration: 0.936375 2023-10-03 18:32:58,400 WARNING [train.py:1204] (2/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_76-311963-0_sp0.9 from training. Duration: 0.77775 2023-10-03 18:32:58,434 WARNING [train.py:1204] (2/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_526-62079-0 from training. Duration: 0.67 2023-10-03 18:32:59,748 WARNING [train.py:1204] (2/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_7-111954-0_sp0.9 from training. Duration: 0.62225 2023-10-03 18:33:01,152 WARNING [train.py:1204] (2/4) Exclude cut with ID _57997_210_4163_1_1534766425889_3961049_360-279140-0_sp1.1 from training. Duration: 0.97275 2023-10-03 18:33:03,997 WARNING [train.py:1204] (2/4) Exclude cut with ID _4866_210_2577_1_1527404412284_4364659_133-1696-0 from training. Duration: 0.99 2023-10-03 18:33:04,070 WARNING [train.py:1204] (2/4) Exclude cut with ID _5190_210_4557_1_1527244368799_373439_28-197030-0 from training. Duration: 0.72 2023-10-03 18:33:08,659 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_53-39003-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 18:33:08,662 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6673_210_3273_1_1527767790_3173240_351-167954-0 from training. Duration: 0.48 2023-10-03 18:33:08,685 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_2822_210_2715_1_1526112586_1999587_165-36656-0_sp1.1 from training. Duration: 0.8090625 2023-10-03 18:33:11,197 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.4.encoder.layers.1.feed_forward1.out_whiten, num_groups=1, num_channels=384, metric=9.03 vs. limit=15.0 2023-10-03 18:33:11,864 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_328-264892-0_sp1.1 from training. Duration: 0.92725 2023-10-03 18:33:11,869 WARNING [train.py:1204] (2/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_29-293896-0_sp0.9 from training. Duration: 0.67775 2023-10-03 18:33:13,376 WARNING [train.py:1204] (2/4) Exclude cut with ID _65263_210_22356_1_1535763598691_3921037_465-55448-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 18:33:14,858 WARNING [train.py:1204] (2/4) Exclude cut with ID _21989_210_5854_1_1531899214931_3271360_311-53106-0_sp1.1 from training. Duration: 0.97275 2023-10-03 18:33:16,410 WARNING [train.py:1204] (2/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_389-277915-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 18:33:20,194 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.3.encoder.layers.0.nonlin_attention.whiten1, num_groups=1, num_channels=384, metric=6.70 vs. limit=10.0 2023-10-03 18:33:20,782 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_69630_210_7234_1_1535791094_677837_63-277139-0_sp1.1 from training. Duration: 0.963625 2023-10-03 18:33:22,144 WARNING [train.py:1204] (2/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_85-138257-0 from training. Duration: 0.71 2023-10-03 18:33:22,165 WARNING [train.py:1204] (2/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_686-247600-0 from training. Duration: 0.92 2023-10-03 18:33:22,170 WARNING [train.py:1204] (2/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_108-255430-0_sp1.1 from training. Duration: 0.8818125 2023-10-03 18:33:25,007 WARNING [train.py:1204] (2/4) Exclude cut with ID _14884_210_2252_1_1530707459689_5561250_114-201399-0_sp1.1 from training. Duration: 0.6909375 2023-10-03 18:33:28,369 WARNING [train.py:1204] (2/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_506-42614-0 from training. Duration: 0.79 2023-10-03 18:33:28,462 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_43802_210_2290_1_1533516203_8829504_85-195240-0_sp1.1 from training. Duration: 0.7909375 2023-10-03 18:33:28,743 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.bypass.scale_min, batch_count=1365273.3333333333, ans=0.2 2023-10-03 18:33:34,187 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_141-36243-0_sp1.1 from training. Duration: 0.97275 2023-10-03 18:33:41,886 WARNING [train.py:1204] (2/4) Exclude cut with ID _7852_210_5219_1_1528286471316_8139400_358-287492-0_sp1.1 from training. Duration: 0.87275 2023-10-03 18:33:41,912 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_805-71497-0_sp0.9 from training. Duration: 0.788875 2023-10-03 18:33:43,307 WARNING [train.py:1204] (2/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_178-39722-0 from training. Duration: 0.49 2023-10-03 18:33:44,927 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.conv_skip_rate, batch_count=1365340.0, ans=0.0 2023-10-03 18:33:46,100 WARNING [train.py:1204] (2/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_428-181925-0_sp1.1 from training. Duration: 0.963625 2023-10-03 18:33:46,104 WARNING [train.py:1204] (2/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_770-200430-0 from training. Duration: 0.89 2023-10-03 18:33:47,866 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_1104-271009-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 18:33:47,904 WARNING [train.py:1204] (2/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_128-296581-0_sp0.9 from training. Duration: 0.82225 2023-10-03 18:33:49,289 INFO [train.py:1046] (2/4) Epoch 39, batch 2950, loss[loss=0.1464, simple_loss=0.2323, pruned_loss=0.03029, over 24311.00 frames. ], tot_loss[loss=0.1569, simple_loss=0.2369, pruned_loss=0.03839, over 4705070.25 frames. ], batch size: 61, lr: 2.61e-03, grad_scale: 8.0 2023-10-03 18:33:52,100 WARNING [train.py:1204] (2/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_19-62511-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 18:33:53,495 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_805-71497-0 from training. Duration: 0.71 2023-10-03 18:33:55,337 WARNING [train.py:1204] (2/4) Exclude cut with ID _44778_210_2054_1_1533798127203_7443090_573-152077-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 18:33:55,342 WARNING [train.py:1204] (2/4) Exclude cut with ID _42875_210_14856_1_1533704397487_4012433_393-177816-0_sp1.1 from training. Duration: 0.963625 2023-10-03 18:33:55,448 WARNING [train.py:1204] (2/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_278-35159-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 18:33:55,695 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.bypass.scale_min, batch_count=1365406.6666666667, ans=0.2 2023-10-03 18:33:56,815 WARNING [train.py:1204] (2/4) Exclude cut with ID _15037_210_2253_1_1530752294104_3267650_13-130361-0_sp1.1 from training. Duration: 0.9 2023-10-03 18:33:59,493 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_3019_210_2028_1_1526044245_3264865_127-135615-0 from training. Duration: 0.94 2023-10-03 18:33:59,557 WARNING [train.py:1204] (2/4) Exclude cut with ID _28317_210_12742_1_1532480302381_8510325_607-213951-0 from training. Duration: 0.84 2023-10-03 18:34:00,908 WARNING [train.py:1204] (2/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_436-318113-0_sp0.9 from training. Duration: 0.8333125 2023-10-03 18:34:00,909 WARNING [train.py:1204] (2/4) Exclude cut with ID _13938_210_9988_1_1530518718978_3693960_148-152225-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 18:34:05,181 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_2541_210_1794_1_1525590269_3084765_156-173713-0_sp1.1 from training. Duration: 0.736375 2023-10-03 18:34:06,655 WARNING [train.py:1204] (2/4) Exclude cut with ID _28323_210_16187_1_1532480395667_4100300_531-24157-0_sp1.1 from training. Duration: 0.8909375 2023-10-03 18:34:10,137 WARNING [train.py:1204] (2/4) Exclude cut with ID _29303_210_2575_1_1532392237303_7410649_382-217492-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 18:34:10,187 WARNING [train.py:1204] (2/4) Exclude cut with ID _58895_210_8750_1_1535184081320_3649089_270-158013-0_sp1.1 from training. Duration: 0.92725 2023-10-03 18:34:10,387 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.ff3_skip_rate, batch_count=1365473.3333333333, ans=0.0 2023-10-03 18:34:14,102 WARNING [train.py:1204] (2/4) Exclude cut with ID _29277_210_3279_1_1532581109293_3173069_311-206227-0_sp1.1 from training. Duration: 0.936375 2023-10-03 18:34:14,131 WARNING [train.py:1204] (2/4) Exclude cut with ID _7160_210_1876_1_1527940923231_3703799_282-322611-0_sp0.9 from training. Duration: 0.9666875 2023-10-03 18:34:14,460 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=1365473.3333333333, ans=0.1 2023-10-03 18:34:15,583 WARNING [train.py:1204] (2/4) Exclude cut with ID _53219_210_4525_1_1534834836008_4064825_562-32295-0_sp1.1 from training. Duration: 0.963625 2023-10-03 18:34:17,376 WARNING [train.py:1204] (2/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_408-87330-0_sp1.1 from training. Duration: 0.963625 2023-10-03 18:34:17,383 WARNING [train.py:1204] (2/4) Exclude cut with ID _59202_210_26984_1_1534816810612_3549174_137-318710-0_sp1.1 from training. Duration: 0.8090625 2023-10-03 18:34:18,832 WARNING [train.py:1204] (2/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_372-30302-0 from training. Duration: 0.86 2023-10-03 18:34:24,134 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7543_210_6753_1_1528541907_1399322_178-56269-0_sp1.1 from training. Duration: 0.95275 2023-10-03 18:34:24,168 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9109W0012-549758 from training. Duration: 0.512 2023-10-03 18:34:25,843 WARNING [train.py:1204] (2/4) Exclude cut with ID _5410_210_1388_1_1527397055795_1719020_39-112221-0_sp0.9 from training. Duration: 0.7666875 2023-10-03 18:34:27,251 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9121W0012-554933 from training. Duration: 0.896 2023-10-03 18:34:28,556 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.636e+02 1.923e+02 2.142e+02 2.477e+02 3.548e+02, threshold=4.284e+02, percent-clipped=0.0 2023-10-03 18:34:28,630 WARNING [train.py:1204] (2/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_476-272757-0 from training. Duration: 0.93 2023-10-03 18:34:28,650 WARNING [train.py:1204] (2/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_1085-171718-0_sp1.1 from training. Duration: 0.92725 2023-10-03 18:34:28,714 WARNING [train.py:1204] (2/4) Exclude cut with ID _8480_210_1874_1_1528589391205_6728330_309-139896-0_sp1.1 from training. Duration: 0.736375 2023-10-03 18:34:28,714 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9130W0113-559703 from training. Duration: 0.896 2023-10-03 18:34:28,718 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_292-265928-0_sp1.1 from training. Duration: 0.463625 2023-10-03 18:34:28,857 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.1.self_attn_weights.pos_emb_skip_rate, batch_count=1365540.0, ans=0.0 2023-10-03 18:34:31,441 WARNING [train.py:1204] (2/4) Exclude cut with ID _8772_210_1850_1_1528635595972_3664011_96-12756-0 from training. Duration: 0.98 2023-10-03 18:34:32,785 WARNING [train.py:1204] (2/4) Exclude cut with ID _12556_210_8341_1_1530081024100_7327690_725-193477-0_sp1.1 from training. Duration: 0.8909375 2023-10-03 18:34:32,826 WARNING [train.py:1204] (2/4) Exclude cut with ID _5289_210_2084_1_1527678202736_3779299_142-103317-0_sp1.1 from training. Duration: 0.763625 2023-10-03 18:34:35,561 WARNING [train.py:1204] (2/4) Exclude cut with ID _45012_210_12651_1_1533608435136_5371380_206-51075-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 18:34:35,664 WARNING [train.py:1204] (2/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_176-56120-0_sp1.1 from training. Duration: 0.7545625 2023-10-03 18:34:37,524 WARNING [train.py:1204] (2/4) Exclude cut with ID _40284_210_2084_1_1533513679814_3704279_220-201864-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 18:34:37,558 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9165W0004-576638 from training. Duration: 0.896 2023-10-03 18:34:38,895 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_5438_210_1794_1_1527336687_2600009_218-343336-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 18:34:38,917 WARNING [train.py:1204] (2/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_318-162017-0 from training. Duration: 0.91 2023-10-03 18:34:42,416 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.bypass.scale_min, batch_count=1365606.6666666667, ans=0.2 2023-10-03 18:34:43,647 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_49544_210_1794_1_1534379770_5262083_483-349298-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 18:34:45,029 WARNING [train.py:1204] (2/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_197-28164-0_sp0.9 from training. Duration: 0.888875 2023-10-03 18:34:46,313 WARNING [train.py:1204] (2/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_643-52007-0 from training. Duration: 0.57 2023-10-03 18:34:46,330 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_38965_210_4852_1_1533174906_3484735_403-332963-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 18:34:47,799 WARNING [train.py:1204] (2/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_340-174176-0 from training. Duration: 0.49 2023-10-03 18:34:49,922 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.1.balancer2.prob, batch_count=1365673.3333333333, ans=0.125 2023-10-03 18:34:51,219 WARNING [train.py:1204] (2/4) Exclude cut with ID _56708_210_13177_1_1535252351282_3422899_363-917-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 18:34:51,328 WARNING [train.py:1204] (2/4) Exclude cut with ID _19896_210_1883_1_1531709022662_6649569_525-159063-0_sp1.1 from training. Duration: 0.936375 2023-10-03 18:34:52,593 WARNING [train.py:1204] (2/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_430-116139-0_sp0.9 from training. Duration: 0.9444375 2023-10-03 18:34:53,134 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.3.encoder.layers.3.conv_module1.whiten, num_groups=1, num_channels=512, metric=7.41 vs. limit=15.0 2023-10-03 18:34:54,067 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_53753_210_7234_1_1534320003_3594148_86-329250-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 18:34:54,077 WARNING [train.py:1204] (2/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_406-275649-0_sp1.1 from training. Duration: 0.3818125 2023-10-03 18:34:55,992 WARNING [train.py:1204] (2/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_1083-147536-0_sp1.1 from training. Duration: 0.8818125 2023-10-03 18:34:57,347 WARNING [train.py:1204] (2/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_54-311741-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 18:34:57,352 WARNING [train.py:1204] (2/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_237-58433-0_sp0.9 from training. Duration: 0.82225 2023-10-03 18:34:57,399 WARNING [train.py:1204] (2/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_98-51676-0_sp1.1 from training. Duration: 0.663625 2023-10-03 18:34:58,736 WARNING [train.py:1204] (2/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_386-270472-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 18:35:00,095 WARNING [train.py:1204] (2/4) Exclude cut with ID _19023_210_4353_1_1531378892552_5820780_83-254794-0_sp1.1 from training. Duration: 0.7818125 2023-10-03 18:35:01,456 WARNING [train.py:1204] (2/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_314-93656-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 18:35:02,757 INFO [train.py:1046] (2/4) Epoch 39, batch 3000, loss[loss=0.1565, simple_loss=0.2301, pruned_loss=0.04145, over 23786.00 frames. ], tot_loss[loss=0.1574, simple_loss=0.2378, pruned_loss=0.0385, over 4720467.97 frames. ], batch size: 164, lr: 2.61e-03, grad_scale: 8.0 2023-10-03 18:35:02,758 INFO [train.py:1069] (2/4) Computing validation loss 2023-10-03 18:35:14,728 INFO [train.py:1078] (2/4) Epoch 39, validation: loss=0.3532, simple_loss=0.2838, pruned_loss=0.2113, over 1125622.00 frames. 2023-10-03 18:35:14,728 INFO [train.py:1079] (2/4) Maximum memory allocated so far is 20967MB 2023-10-03 18:35:14,768 WARNING [train.py:1204] (2/4) Exclude cut with ID _14170_210_5219_1_1530525949868_4469650_336-213530-0 from training. Duration: 0.9 2023-10-03 18:35:16,135 WARNING [train.py:1204] (2/4) Exclude cut with ID _36686_210_5665_1_1533778225913_3646410_65-221120-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 18:35:19,438 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_26433_210_12108_1_1532157158_4056480_271-68178-0_sp1.1 from training. Duration: 0.92725 2023-10-03 18:35:19,483 WARNING [train.py:1204] (2/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_46-207599-0_sp0.9 from training. Duration: 0.711125 2023-10-03 18:35:23,615 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9303W0114-637133 from training. Duration: 0.896 2023-10-03 18:35:23,657 WARNING [train.py:1204] (2/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_490-207861-0 from training. Duration: 0.98 2023-10-03 18:35:25,676 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6767_210_3273_1_1527855783_1718932_173-256601-0_sp0.9 from training. Duration: 0.888875 2023-10-03 18:35:27,064 WARNING [train.py:1204] (2/4) Exclude cut with ID _13921_210_9994_1_1530841782272_3503520_327-69677-0_sp1.1 from training. Duration: 0.8181875 2023-10-03 18:35:27,118 WARNING [train.py:1204] (2/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_508-335369-0 from training. Duration: 0.48 2023-10-03 18:35:27,151 WARNING [train.py:1204] (2/4) Exclude cut with ID _64631_210_27767_1_1535349580068_4165160_24-46567-0_sp1.1 from training. Duration: 0.963625 2023-10-03 18:35:32,576 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_89-35748-0_sp1.1 from training. Duration: 0.7181875 2023-10-03 18:35:32,720 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.feed_forward2.hidden_balancer.prob, batch_count=1365806.6666666667, ans=0.125 2023-10-03 18:35:44,082 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_26433_210_12108_1_1532157158_4056480_578-262107-0_sp1.1 from training. Duration: 0.9 2023-10-03 18:35:46,325 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.feed_forward2.hidden_balancer.prob, batch_count=1365873.3333333333, ans=0.125 2023-10-03 18:35:48,873 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.1.attention_skip_rate, batch_count=1365873.3333333333, ans=0.0 2023-10-03 18:35:48,940 INFO [scaling.py:1118] (2/4) WithLoss: name=encoder.encoders.1.encoder.layers.1.self_attn_weights, loss-sum=0.000e+00 2023-10-03 18:35:50,689 WARNING [train.py:1204] (2/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_129-337802-0 from training. Duration: 0.72 2023-10-03 18:35:52,089 WARNING [train.py:1204] (2/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_356-167816-0_sp0.9 from training. Duration: 0.8 2023-10-03 18:35:52,327 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.conv_skip_rate, batch_count=1365873.3333333333, ans=0.0 2023-10-03 18:35:52,328 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.ff2_skip_rate, batch_count=1365873.3333333333, ans=0.0 2023-10-03 18:35:53,863 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.5.encoder.layers.1.feed_forward2.out_whiten, num_groups=1, num_channels=256, metric=9.85 vs. limit=15.0 2023-10-03 18:35:54,751 WARNING [train.py:1204] (2/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_137-116442-0_sp1.1 from training. Duration: 0.7090625 2023-10-03 18:35:54,795 WARNING [train.py:1204] (2/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_358-172940-0_sp1.1 from training. Duration: 0.963625 2023-10-03 18:35:54,815 WARNING [train.py:1204] (2/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_668-344147-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 18:35:58,149 WARNING [train.py:1204] (2/4) Exclude cut with ID _62337_210_12392_1_1535009273105_3412229_255-157667-0_sp1.1 from training. Duration: 0.936375 2023-10-03 18:35:58,152 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_486-151131-0 from training. Duration: 0.85 2023-10-03 18:35:59,548 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_53638_210_21581_1_1534297319_5808449_885-80172-0 from training. Duration: 0.96 2023-10-03 18:36:00,870 WARNING [train.py:1204] (2/4) Exclude cut with ID _50093_210_4220_1_1534417141207_3432299_105-183067-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 18:36:00,927 WARNING [train.py:1204] (2/4) Exclude cut with ID _7562_210_2084_1_1528887767916_3946229_345-169156-0_sp0.9 from training. Duration: 0.6444375 2023-10-03 18:36:02,421 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_4528_210_3273_1_1527076654_3276777_209-219804-0_sp0.9 from training. Duration: 0.7666875 2023-10-03 18:36:02,446 WARNING [train.py:1204] (2/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_35-153886-0_sp0.9 from training. Duration: 0.9333125 2023-10-03 18:36:02,666 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.conv_module1.balancer2.prob, batch_count=1365940.0, ans=0.125 2023-10-03 18:36:03,767 WARNING [train.py:1204] (2/4) Exclude cut with ID _30800_210_22382_1_1532499347904_4515959_332-121925-0_sp1.1 from training. Duration: 0.92725 2023-10-03 18:36:03,769 WARNING [train.py:1204] (2/4) Exclude cut with ID _59202_210_26984_1_1534816810612_3549174_495-65515-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 18:36:06,486 WARNING [train.py:1204] (2/4) Exclude cut with ID _6274_210_2084_1_1527937583837_7494020_769-168302-0_sp1.1 from training. Duration: 0.6454375 2023-10-03 18:36:06,523 WARNING [train.py:1204] (2/4) Exclude cut with ID _24271_210_12658_1_1531987270022_7190519_708-71345-0_sp1.1 from training. Duration: 0.936375 2023-10-03 18:36:06,524 WARNING [train.py:1204] (2/4) Exclude cut with ID 3033-130750-0096-55598-0_sp0.9 from training. Duration: 0.92225 2023-10-03 18:36:08,486 WARNING [train.py:1204] (2/4) Exclude cut with ID _14087_210_2253_1_1530529224512_3014029_219-224445-0_sp0.9 from training. Duration: 0.9333125 2023-10-03 18:36:11,348 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_174-277364-0 from training. Duration: 0.71 2023-10-03 18:36:12,792 WARNING [train.py:1204] (2/4) Exclude cut with ID _45021_210_9994_1_1533619751094_3330288_96-239010-0_sp1.1 from training. Duration: 0.82725 2023-10-03 18:36:12,821 WARNING [train.py:1204] (2/4) Exclude cut with ID _31602_210_5854_1_1532677021456_4458250_489-305116-0_sp1.1 from training. Duration: 0.97275 2023-10-03 18:36:12,848 WARNING [train.py:1204] (2/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_269-154726-0_sp0.9 from training. Duration: 0.9555625 2023-10-03 18:36:17,047 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.0.layers.1.nonlin_attention.whiten1, num_groups=1, num_channels=144, metric=5.51 vs. limit=10.0 2023-10-03 18:36:17,350 WARNING [train.py:1204] (2/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_126-54418-0_sp1.1 from training. Duration: 0.92725 2023-10-03 18:36:17,388 WARNING [train.py:1204] (2/4) Exclude cut with ID _42326_210_7770_1_1533814282531_3707269_182-224159-0_sp1.1 from training. Duration: 0.92725 2023-10-03 18:36:18,955 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.bypass.scale_min, batch_count=1366006.6666666667, ans=0.2 2023-10-03 18:36:20,495 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7002_210_1794_1_1527939683_5157286_1-281264-0_sp0.9 from training. Duration: 0.488875 2023-10-03 18:36:20,527 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_86-16881-0 from training. Duration: 0.58 2023-10-03 18:36:20,562 WARNING [train.py:1204] (2/4) Exclude cut with ID _19514_210_9259_1_1531530071330_5084230_637-279094-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 18:36:20,608 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_26433_210_12108_1_1532157158_4056480_578-262107-0 from training. Duration: 0.99 2023-10-03 18:36:21,918 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6291_210_3273_1_1527683649_1078498_82-221196-0_sp1.1 from training. Duration: 0.6909375 2023-10-03 18:36:23,283 WARNING [train.py:1204] (2/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_402-102311-0 from training. Duration: 0.67 2023-10-03 18:36:24,734 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1015-343884-0_sp1.1 from training. Duration: 0.563625 2023-10-03 18:36:26,147 WARNING [train.py:1204] (2/4) Exclude cut with ID _13684_210_7497_1_1530619354863_3598359_312-235606-0_sp1.1 from training. Duration: 0.5454375 2023-10-03 18:36:26,177 WARNING [train.py:1204] (2/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_55-31904-0 from training. Duration: 0.88 2023-10-03 18:36:28,104 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_3133_210_2087_1_1526106446_2324690_25-332138-0 from training. Duration: 0.91 2023-10-03 18:36:28,106 WARNING [train.py:1204] (2/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_912-282412-0_sp0.9 from training. Duration: 0.5444375 2023-10-03 18:36:29,434 INFO [train.py:1046] (2/4) Epoch 39, batch 3050, loss[loss=0.1481, simple_loss=0.2265, pruned_loss=0.03484, over 23377.00 frames. ], tot_loss[loss=0.1572, simple_loss=0.2378, pruned_loss=0.03834, over 4722051.61 frames. ], batch size: 119, lr: 2.61e-03, grad_scale: 8.0 2023-10-03 18:36:29,511 WARNING [train.py:1204] (2/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_458-299435-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 18:36:30,811 WARNING [train.py:1204] (2/4) Exclude cut with ID _69584_210_5219_1_1535785133775_4044450_275-88403-0_sp1.1 from training. Duration: 0.92725 2023-10-03 18:36:30,831 WARNING [train.py:1204] (2/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_841-341679-0_sp1.1 from training. Duration: 0.52725 2023-10-03 18:36:30,838 WARNING [train.py:1204] (2/4) Exclude cut with ID _41391_210_7994_1_1533603771872_3638201_1-54521-0_sp1.1 from training. Duration: 0.97275 2023-10-03 18:36:32,169 WARNING [train.py:1204] (2/4) Exclude cut with ID _56235_210_10640_1_1534557335235_4161099_495-186609-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 18:36:33,605 WARNING [train.py:1204] (2/4) Exclude cut with ID _4681_210_3525_1_1527250689464_120889_15-126760-0 from training. Duration: 0.81 2023-10-03 18:36:33,842 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=1366073.3333333333, ans=0.1 2023-10-03 18:36:33,909 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.feed_forward1.out_proj.dropout_p, batch_count=1366073.3333333333, ans=0.1 2023-10-03 18:36:35,089 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_3019_210_2028_1_1526044245_3264865_127-135615-0_sp1.1 from training. Duration: 0.8545625 2023-10-03 18:36:38,278 WARNING [train.py:1204] (2/4) Exclude cut with ID _30191_210_1664_1_1533189915047_7196670_915-21561-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 18:36:38,322 WARNING [train.py:1204] (2/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_6-214815-0_sp0.9 from training. Duration: 0.9444375 2023-10-03 18:36:41,082 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_45399_210_4852_1_1533624725_7579737_190-137494-0_sp1.1 from training. Duration: 0.97275 2023-10-03 18:36:43,869 WARNING [train.py:1204] (2/4) Exclude cut with ID _41314_210_12651_1_1533459476543_3766460_207-132038-0 from training. Duration: 0.94 2023-10-03 18:36:48,385 WARNING [train.py:1204] (2/4) Exclude cut with ID _14603_210_4220_1_1530615688441_3097319_90-31383-0 from training. Duration: 0.87 2023-10-03 18:36:48,415 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_15517_210_6753_1_1530952293_1664795_119-31001-0 from training. Duration: 0.94 2023-10-03 18:36:50,278 WARNING [train.py:1204] (2/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_684-85342-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 18:36:52,967 WARNING [train.py:1204] (2/4) Exclude cut with ID _14603_210_4220_1_1530615688441_3097319_97-325053-0_sp1.1 from training. Duration: 0.836375 2023-10-03 18:36:55,676 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_68702_210_23689_1_1535799577_3420068_168-25150-0_sp1.1 from training. Duration: 0.97275 2023-10-03 18:36:57,388 WARNING [train.py:1204] (2/4) Exclude cut with ID _65134_210_14763_1_1535335393985_3505368_111-27984-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 18:36:57,444 WARNING [train.py:1204] (2/4) Exclude cut with ID _48678_210_5663_1_1533988830947_3769199_276-226230-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 18:36:58,244 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.1.encoder.layers.1.whiten, num_groups=1, num_channels=256, metric=3.90 vs. limit=12.0 2023-10-03 18:37:00,216 WARNING [train.py:1204] (2/4) Exclude cut with ID _28427_210_4220_1_1532581240967_3061069_43-84928-0_sp1.1 from training. Duration: 0.9 2023-10-03 18:37:01,594 WARNING [train.py:1204] (2/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_158-250116-0_sp1.1 from training. Duration: 0.563625 2023-10-03 18:37:01,617 WARNING [train.py:1204] (2/4) Exclude cut with ID _66873_210_5517_1_1535454136506_4293370_279-332556-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 18:37:01,657 WARNING [train.py:1204] (2/4) Exclude cut with ID _42863_210_9070_1_1533902398788_3696920_488-230146-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 18:37:01,657 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_387-247159-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 18:37:03,005 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6729_210_2550_1_1528032481_1788725_261-42619-0_sp1.1 from training. Duration: 0.97275 2023-10-03 18:37:03,288 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.conv_module1.balancer2.min_abs, batch_count=1366206.6666666667, ans=0.5 2023-10-03 18:37:04,376 WARNING [train.py:1204] (2/4) Exclude cut with ID _19896_210_1883_1_1531709022662_6649569_831-44724-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 18:37:07,590 WARNING [train.py:1204] (2/4) Exclude cut with ID _55153_210_2253_1_1534384786677_3162096_280-285944-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 18:37:07,619 WARNING [train.py:1204] (2/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_448-4048-0 from training. Duration: 0.96 2023-10-03 18:37:08,974 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.579e+02 1.946e+02 2.164e+02 2.475e+02 3.663e+02, threshold=4.327e+02, percent-clipped=0.0 2023-10-03 18:37:09,049 WARNING [train.py:1204] (2/4) Exclude cut with ID _29678_210_7893_1_1532421042787_3906118_456-202548-0_sp1.1 from training. Duration: 0.97275 2023-10-03 18:37:09,070 WARNING [train.py:1204] (2/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_107-332398-0_sp1.1 from training. Duration: 0.7090625 2023-10-03 18:37:11,836 WARNING [train.py:1204] (2/4) Exclude cut with ID _13921_210_9994_1_1530838702800_3003429_46-149444-0_sp1.1 from training. Duration: 0.8545625 2023-10-03 18:37:11,874 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_257-20733-0_sp0.9 from training. Duration: 0.8444375 2023-10-03 18:37:13,235 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_53753_210_7234_1_1534320003_3594148_385-234809-0_sp1.1 from training. Duration: 0.963625 2023-10-03 18:37:13,293 WARNING [train.py:1204] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_292-215346-0_sp1.1 from training. Duration: 0.8818125 2023-10-03 18:37:20,383 WARNING [train.py:1204] (2/4) Exclude cut with ID _68965_210_1883_1_1535698962272_3497490_196-124268-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 18:37:20,423 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_23801_210_16223_1_1532066392_3294657_570-5439-0_sp1.1 from training. Duration: 0.8818125 2023-10-03 18:37:26,595 WARNING [train.py:1204] (2/4) Exclude cut with ID _5166_210_2084_1_1527073197160_4087993_349-6928-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 18:37:28,383 WARNING [train.py:1204] (2/4) Exclude cut with ID _60621_210_17071_1_1535013053530_3671739_226-255202-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 18:37:28,387 WARNING [train.py:1204] (2/4) Exclude cut with ID _41817_210_18835_1_1533434477160_7423049_448-130302-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 18:37:29,729 WARNING [train.py:1204] (2/4) Exclude cut with ID _6653_210_2629_1_1527919172441_6732679_758-143677-0_sp1.1 from training. Duration: 0.8909375 2023-10-03 18:37:31,082 WARNING [train.py:1204] (2/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_912-47473-0_sp0.9 from training. Duration: 0.7555625 2023-10-03 18:37:31,112 WARNING [train.py:1204] (2/4) Exclude cut with ID _6694_210_4599_1_1527852637490_3965529_356-169233-0_sp1.1 from training. Duration: 0.9 2023-10-03 18:37:31,204 WARNING [train.py:1204] (2/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_1-99680-0 from training. Duration: 0.67 2023-10-03 18:37:33,958 WARNING [train.py:1204] (2/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_199-86432-0_sp1.1 from training. Duration: 0.8909375 2023-10-03 18:37:33,981 WARNING [train.py:1204] (2/4) Exclude cut with ID _61148_210_6897_1_1535094022726_4217610_362-182430-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 18:37:34,140 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.conv_skip_rate, batch_count=1366340.0, ans=0.0 2023-10-03 18:37:35,336 WARNING [train.py:1204] (2/4) Exclude cut with ID _49376_210_5986_1_1533985278732_3370740_301-154763-0 from training. Duration: 0.98 2023-10-03 18:37:36,787 WARNING [train.py:1204] (2/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_572-185150-0_sp1.1 from training. Duration: 0.8818125 2023-10-03 18:37:42,835 INFO [train.py:1046] (2/4) Epoch 39, batch 3100, loss[loss=0.1429, simple_loss=0.2275, pruned_loss=0.02918, over 24501.00 frames. ], tot_loss[loss=0.1573, simple_loss=0.2377, pruned_loss=0.03844, over 4724544.14 frames. ], batch size: 63, lr: 2.61e-03, grad_scale: 8.0 2023-10-03 18:37:42,904 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_47896_210_20508_1_1534492518_5958868_558-314572-0_sp1.1 from training. Duration: 0.8818125 2023-10-03 18:37:44,311 WARNING [train.py:1204] (2/4) Exclude cut with ID _45560_210_21982_1_1533812603951_3584999_111-6376-0_sp0.9 from training. Duration: 0.9333125 2023-10-03 18:37:45,796 WARNING [train.py:1204] (2/4) Exclude cut with ID _14170_210_5219_1_1530525949868_4469650_412-317077-0_sp1.1 from training. Duration: 0.6818125 2023-10-03 18:37:47,223 WARNING [train.py:1204] (2/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_718-68505-0 from training. Duration: 0.89 2023-10-03 18:37:49,944 WARNING [train.py:1204] (2/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_77-311404-0 from training. Duration: 0.94 2023-10-03 18:37:51,887 WARNING [train.py:1204] (2/4) Exclude cut with ID _60229_210_27767_1_1534838406158_3655350_264-279573-0 from training. Duration: 0.83 2023-10-03 18:37:53,492 WARNING [train.py:1204] (2/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_106-300985-0_sp1.1 from training. Duration: 0.8181875 2023-10-03 18:37:57,642 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_4183_210_2087_1_1526729100_1869838_79-277189-0_sp1.1 from training. Duration: 0.963625 2023-10-03 18:37:57,652 WARNING [train.py:1204] (2/4) Exclude cut with ID _28427_210_4220_1_1532581240967_3061069_195-295268-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 18:37:59,641 WARNING [train.py:1204] (2/4) Exclude cut with ID _4092_210_2151_1_1526643044449_3135759_356-278694-0_sp0.9 from training. Duration: 0.52225 2023-10-03 18:38:03,737 WARNING [train.py:1204] (2/4) Exclude cut with ID _14459_210_2228_1_1531443641946_7516700_777-71446-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 18:38:05,254 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.conv_module1.balancer2.prob, batch_count=1366473.3333333333, ans=0.125 2023-10-03 18:38:08,040 WARNING [train.py:1204] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_520-126729-0 from training. Duration: 0.7 2023-10-03 18:38:08,341 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.bypass.skip_rate, batch_count=1366473.3333333333, ans=0.07 2023-10-03 18:38:12,821 WARNING [train.py:1204] (2/4) Exclude cut with ID _13684_210_7497_1_1530619354863_3598359_312-235606-0_sp0.9 from training. Duration: 0.6666875 2023-10-03 18:38:12,865 WARNING [train.py:1204] (2/4) Exclude cut with ID _37537_210_2253_1_1533171758568_3421949_283-349402-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 18:38:14,090 WARNING [train.py:1204] (2/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_29-117932-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 18:38:14,123 WARNING [train.py:1204] (2/4) Exclude cut with ID _29303_210_2575_1_1532392237303_7410649_721-283351-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 18:38:15,435 WARNING [train.py:1204] (2/4) Exclude cut with ID _14886_210_4220_1_1530702020355_3516190_175-229073-0_sp0.9 from training. Duration: 0.5 2023-10-03 18:38:16,868 WARNING [train.py:1204] (2/4) Exclude cut with ID _15458_210_8341_1_1530858614263_3834410_70-268830-0_sp1.1 from training. Duration: 0.97275 2023-10-03 18:38:18,105 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_2295_210_2087_1_1525500993_2407863_234-83104-0 from training. Duration: 0.62 2023-10-03 18:38:18,106 WARNING [train.py:1204] (2/4) Exclude cut with ID _22961_210_8254_1_1531976366672_4278180_317-199546-0_sp1.1 from training. Duration: 0.7818125 2023-10-03 18:38:18,214 WARNING [train.py:1204] (2/4) Exclude cut with ID _27894_210_12392_1_1532415496078_6902439_547-196022-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 18:38:21,226 WARNING [train.py:1204] (2/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_46-207599-0 from training. Duration: 0.64 2023-10-03 18:38:23,071 WARNING [train.py:1204] (2/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_426-326246-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 18:38:25,925 WARNING [train.py:1204] (2/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_445-117398-0_sp0.9 from training. Duration: 0.988875 2023-10-03 18:38:25,987 WARNING [train.py:1204] (2/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_563-347832-0 from training. Duration: 0.89 2023-10-03 18:38:28,638 WARNING [train.py:1204] (2/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_252-216674-0 from training. Duration: 0.54 2023-10-03 18:38:28,717 WARNING [train.py:1204] (2/4) Exclude cut with ID _59611_210_8895_1_1534741198240_3650361_187-212779-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 18:38:30,046 WARNING [train.py:1204] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_282-159367-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 18:38:31,874 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_68702_210_23689_1_1535799577_3420068_33-136883-0_sp1.1 from training. Duration: 0.936375 2023-10-03 18:38:31,885 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_47228_210_4983_1_1533780021_3672027_184-293706-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 18:38:31,907 WARNING [train.py:1204] (2/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_656-188361-0_sp0.9 from training. Duration: 0.9666875 2023-10-03 18:38:33,288 WARNING [train.py:1204] (2/4) Exclude cut with ID _4866_210_2577_1_1527404412284_4364659_352-157930-0_sp0.9 from training. Duration: 0.9 2023-10-03 18:38:33,289 WARNING [train.py:1204] (2/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_210-106047-0_sp1.1 from training. Duration: 0.8818125 2023-10-03 18:38:36,241 WARNING [train.py:1204] (2/4) Exclude cut with ID _9389_210_1664_1_1529217337301_5289269_289-213417-0_sp1.1 from training. Duration: 0.8090625 2023-10-03 18:38:37,492 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_3910_210_1794_1_1526777667_3747260_178-41186-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 18:38:37,502 WARNING [train.py:1204] (2/4) Exclude cut with ID _27052_210_5986_1_1532329197187_3585009_24-172263-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 18:38:37,506 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_5522_210_3273_1_1527249606_3909096_318-203153-0_sp1.1 from training. Duration: 0.3909375 2023-10-03 18:38:40,488 WARNING [train.py:1204] (2/4) Exclude cut with ID _27894_210_12392_1_1532415496078_6902439_222-51982-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 18:38:42,255 WARNING [train.py:1204] (2/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_87-131427-0 from training. Duration: 0.78 2023-10-03 18:38:43,684 WARNING [train.py:1204] (2/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_372-298093-0_sp0.9 from training. Duration: 0.888875 2023-10-03 18:38:44,974 WARNING [train.py:1204] (2/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_663-166973-0 from training. Duration: 0.8 2023-10-03 18:38:45,010 WARNING [train.py:1204] (2/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_429-177469-0_sp1.1 from training. Duration: 0.936375 2023-10-03 18:38:45,047 WARNING [train.py:1204] (2/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_724-172651-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 18:38:46,401 WARNING [train.py:1204] (2/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_103-336064-0 from training. Duration: 0.51 2023-10-03 18:38:46,660 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.bypass.scale_min, batch_count=1366673.3333333333, ans=0.2 2023-10-03 18:38:55,436 WARNING [train.py:1204] (2/4) Exclude cut with ID _48677_210_5663_1_1533902405919_3892989_111-11439-0 from training. Duration: 0.96 2023-10-03 18:38:56,918 INFO [train.py:1046] (2/4) Epoch 39, batch 3150, loss[loss=0.1592, simple_loss=0.2356, pruned_loss=0.04139, over 14331.00 frames. ], tot_loss[loss=0.1565, simple_loss=0.2364, pruned_loss=0.03825, over 4707429.77 frames. ], batch size: 30, lr: 2.61e-03, grad_scale: 8.0 2023-10-03 18:38:58,282 WARNING [train.py:1204] (2/4) Exclude cut with ID _8085_210_4557_1_1528455633493_3716100_95-103398-0_sp1.1 from training. Duration: 0.963625 2023-10-03 18:38:58,339 WARNING [train.py:1204] (2/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_1041-110837-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 18:38:59,868 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_50050_210_16223_1_1535435796_7290138_1155-121757-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 18:38:59,877 WARNING [train.py:1204] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_602-275260-0_sp0.9 from training. Duration: 0.911125 2023-10-03 18:39:01,725 WARNING [train.py:1204] (2/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_265-164486-0 from training. Duration: 0.96 2023-10-03 18:39:01,819 WARNING [train.py:1204] (2/4) Exclude cut with ID _46067_210_4220_1_1534125571066_3307450_142-229375-0_sp1.1 from training. Duration: 0.963625 2023-10-03 18:39:01,859 WARNING [train.py:1204] (2/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_310-26186-0_sp1.1 from training. Duration: 0.636375 2023-10-03 18:39:04,423 WARNING [train.py:1204] (2/4) Exclude cut with ID _28461_210_2253_1_1532771775189_3041299_136-270644-0 from training. Duration: 0.93 2023-10-03 18:39:04,673 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.conv_module2.balancer2.prob, batch_count=1366740.0, ans=0.125 2023-10-03 18:39:05,905 WARNING [train.py:1204] (2/4) Exclude cut with ID _14860_210_4915_1_1530702020340_7343240_438-286372-0_sp1.1 from training. Duration: 0.936375 2023-10-03 18:39:06,117 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.conv_module1.balancer1.prob, batch_count=1366740.0, ans=0.125 2023-10-03 18:39:07,354 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.conv_module1.balancer2.prob, batch_count=1366740.0, ans=0.125 2023-10-03 18:39:07,378 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.feed_forward1.hidden_balancer.prob, batch_count=1366740.0, ans=0.125 2023-10-03 18:39:08,655 WARNING [train.py:1204] (2/4) Exclude cut with ID IC0128W0004-63309 from training. Duration: 0.831 2023-10-03 18:39:08,939 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.conv_module1.balancer1.prob, batch_count=1366740.0, ans=0.125 2023-10-03 18:39:10,106 WARNING [train.py:1204] (2/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_135-174080-0 from training. Duration: 0.53 2023-10-03 18:39:10,306 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.balancer2.prob, batch_count=1366806.6666666667, ans=0.125 2023-10-03 18:39:12,103 WARNING [train.py:1204] (2/4) Exclude cut with ID _41326_210_5854_1_1533778076509_3492080_277-59511-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 18:39:12,456 INFO [scaling.py:1118] (2/4) WithLoss: name=encoder.encoders.5.encoder.layers.1.self_attn_weights, loss-sum=0.000e+00 2023-10-03 18:39:13,335 WARNING [train.py:1204] (2/4) Exclude cut with ID IC0144W0112-71379 from training. Duration: 0.992 2023-10-03 18:39:13,400 WARNING [train.py:1204] (2/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_711-15688-0_sp0.9 from training. Duration: 0.47775 2023-10-03 18:39:14,908 WARNING [train.py:1204] (2/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_237-332027-0 from training. Duration: 0.95 2023-10-03 18:39:16,198 WARNING [train.py:1204] (2/4) Exclude cut with ID _7970_210_1883_1_1528455616296_3472230_25-178407-0 from training. Duration: 0.71 2023-10-03 18:39:16,200 WARNING [train.py:1204] (2/4) Exclude cut with ID _8480_210_1874_1_1528589391205_6728330_309-139896-0 from training. Duration: 0.81 2023-10-03 18:39:16,231 WARNING [train.py:1204] (2/4) Exclude cut with ID _39571_210_13449_1_1533801584143_3651077_319-268877-0_sp1.1 from training. Duration: 0.936375 2023-10-03 18:39:16,234 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_43802_210_2290_1_1533516203_8829504_155-246880-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 18:39:19,375 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_566-122599-0_sp1.1 from training. Duration: 0.936375 2023-10-03 18:39:19,507 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_12868_210_6753_1_1530701932_530275_18-76322-0 from training. Duration: 0.87 2023-10-03 18:39:22,150 WARNING [train.py:1204] (2/4) Exclude cut with ID _56494_210_4220_1_1535284837625_3033169_159-106035-0_sp1.1 from training. Duration: 0.963625 2023-10-03 18:39:22,186 WARNING [train.py:1204] (2/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_98-276013-0_sp1.1 from training. Duration: 0.963625 2023-10-03 18:39:24,038 WARNING [train.py:1204] (2/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_330-216124-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 18:39:24,178 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_177-58005-0_sp0.9 from training. Duration: 0.688875 2023-10-03 18:39:29,549 WARNING [train.py:1204] (2/4) Exclude cut with ID _56235_210_10640_1_1534557335235_4161099_343-139898-0 from training. Duration: 0.96 2023-10-03 18:39:29,609 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6961_210_2087_1_1527937168_6815011_395-185060-0_sp1.1 from training. Duration: 0.863625 2023-10-03 18:39:30,871 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7002_210_1794_1_1527939683_5157286_184-229630-0_sp0.9 from training. Duration: 0.82225 2023-10-03 18:39:30,926 WARNING [train.py:1204] (2/4) Exclude cut with ID _59170_210_6721_1_1534983633960_4203335_153-18895-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 18:39:30,972 WARNING [train.py:1204] (2/4) Exclude cut with ID _49643_210_7989_1_1534640244224_3939031_598-161926-0 from training. Duration: 0.9 2023-10-03 18:39:31,244 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.attention_skip_rate, batch_count=1366873.3333333333, ans=0.0 2023-10-03 18:39:34,390 WARNING [train.py:1204] (2/4) Exclude cut with ID _4068_210_2629_1_1526697350395_1546259_224-288693-0 from training. Duration: 0.59 2023-10-03 18:39:34,443 WARNING [train.py:1204] (2/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_718-68505-0_sp1.1 from training. Duration: 0.8090625 2023-10-03 18:39:35,829 WARNING [train.py:1204] (2/4) Exclude cut with ID _4068_210_2629_1_1526697350395_1546259_224-288693-0_sp0.9 from training. Duration: 0.6555625 2023-10-03 18:39:37,010 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.547e+02 1.857e+02 2.083e+02 2.385e+02 4.094e+02, threshold=4.165e+02, percent-clipped=0.0 2023-10-03 18:39:37,043 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_2296_210_2087_1_1525503448_3397335_57-185430-0_sp0.9 from training. Duration: 0.6666875 2023-10-03 18:39:37,103 WARNING [train.py:1204] (2/4) Exclude cut with ID _15458_210_8341_1_1530858614263_3834410_49-235472-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 18:39:37,105 WARNING [train.py:1204] (2/4) Exclude cut with ID _64135_210_2253_1_1535677267876_7608972_498-5039-0_sp0.9 from training. Duration: 0.8666875 2023-10-03 18:39:38,536 WARNING [train.py:1204] (2/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_283-220372-0_sp0.9 from training. Duration: 0.62225 2023-10-03 18:39:38,544 WARNING [train.py:1204] (2/4) Exclude cut with ID _6473_210_1741_1_1527901307396_3815380_108-17335-0_sp0.9 from training. Duration: 0.72225 2023-10-03 18:39:38,622 WARNING [train.py:1204] (2/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_113-92608-0 from training. Duration: 0.54 2023-10-03 18:39:39,844 WARNING [train.py:1204] (2/4) Exclude cut with ID _14170_210_5219_1_1530525949868_4469650_272-249985-0_sp1.1 from training. Duration: 0.6090625 2023-10-03 18:39:39,860 WARNING [train.py:1204] (2/4) Exclude cut with ID _50376_210_13401_1_1534031502101_4217740_518-187390-0_sp1.1 from training. Duration: 0.97275 2023-10-03 18:39:41,352 WARNING [train.py:1204] (2/4) Exclude cut with ID _59738_210_15379_1_1534849512562_3941470_165-248260-0_sp1.1 from training. Duration: 0.92725 2023-10-03 18:39:41,361 WARNING [train.py:1204] (2/4) Exclude cut with ID _23818_210_6228_1_1531917063884_3676920_400-343925-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 18:39:43,191 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6961_210_2087_1_1527937168_6815011_562-165518-0 from training. Duration: 0.77 2023-10-03 18:39:43,261 WARNING [train.py:1204] (2/4) Exclude cut with ID _24271_210_12658_1_1531987270022_7190519_289-207931-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 18:39:45,956 WARNING [train.py:1204] (2/4) Exclude cut with ID _14632_210_9988_1_1530691279392_3923200_332-105904-0 from training. Duration: 0.9 2023-10-03 18:39:45,978 WARNING [train.py:1204] (2/4) Exclude cut with ID _40402_210_18219_1_1533522324115_2953150_203-232438-0_sp1.1 from training. Duration: 0.97275 2023-10-03 18:39:47,407 WARNING [train.py:1204] (2/4) Exclude cut with ID _13252_210_1925_1_1530671372047_3336909_263-168388-0 from training. Duration: 0.86 2023-10-03 18:39:47,485 WARNING [train.py:1204] (2/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_174-205058-0 from training. Duration: 0.95 2023-10-03 18:39:49,307 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_94-195801-0_sp1.1 from training. Duration: 0.8545625 2023-10-03 18:39:49,344 WARNING [train.py:1204] (2/4) Exclude cut with ID _55153_210_2253_1_1534384786677_3162096_269-103204-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 18:39:50,701 WARNING [train.py:1204] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_748-36150-0 from training. Duration: 0.91 2023-10-03 18:39:52,548 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_12868_210_6753_1_1530702479_3610564_148-172861-0_sp1.1 from training. Duration: 0.4545625 2023-10-03 18:39:53,914 WARNING [train.py:1204] (2/4) Exclude cut with ID _67499_210_17282_1_1535509805220_3692430_460-77730-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 18:39:56,673 WARNING [train.py:1204] (2/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_374-20753-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 18:39:57,969 WARNING [train.py:1204] (2/4) Exclude cut with ID _45329_210_7005_1_1534157951211_6774339_374-289983-0_sp1.1 from training. Duration: 0.97275 2023-10-03 18:39:58,004 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_34886_210_10633_1_1533203882_1755661_76-156096-0_sp0.9 from training. Duration: 0.9666875 2023-10-03 18:40:02,785 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_238-256668-0_sp0.9 from training. Duration: 0.7666875 2023-10-03 18:40:03,945 WARNING [train.py:1204] (2/4) Exclude cut with ID _46683_210_9642_1_1533730913492_4591116_489-146727-0_sp1.1 from training. Duration: 0.97275 2023-10-03 18:40:06,743 WARNING [train.py:1204] (2/4) Exclude cut with ID _13938_210_9988_1_1530518718978_3693960_304-112784-0_sp0.9 from training. Duration: 0.511125 2023-10-03 18:40:10,899 INFO [train.py:1046] (2/4) Epoch 39, batch 3200, loss[loss=0.1628, simple_loss=0.2357, pruned_loss=0.0449, over 23609.00 frames. ], tot_loss[loss=0.156, simple_loss=0.2355, pruned_loss=0.03823, over 4706571.08 frames. ], batch size: 256, lr: 2.61e-03, grad_scale: 16.0 2023-10-03 18:40:11,006 WARNING [train.py:1204] (2/4) Exclude cut with ID _24271_210_12658_1_1531987270022_7190519_243-46549-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 18:40:11,011 WARNING [train.py:1204] (2/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_78-182912-0_sp1.1 from training. Duration: 0.536375 2023-10-03 18:40:13,894 WARNING [train.py:1204] (2/4) Exclude cut with ID _57572_210_5517_1_1534921219624_3809484_121-321334-0_sp1.1 from training. Duration: 0.97275 2023-10-03 18:40:15,684 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_40047_210_2290_1_1533290519_2374311_254-102149-0_sp1.1 from training. Duration: 0.963625 2023-10-03 18:40:15,685 WARNING [train.py:1204] (2/4) Exclude cut with ID _28461_210_2253_1_1532771775189_3041299_96-152158-0 from training. Duration: 0.85 2023-10-03 18:40:18,495 WARNING [train.py:1204] (2/4) Exclude cut with ID _29877_210_6228_1_1532397791106_5276149_161-102979-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 18:40:24,969 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_3787_210_2040_1_1526477544_5478586_603-120324-0_sp0.9 from training. Duration: 0.9 2023-10-03 18:40:26,607 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_41750_210_1960_1_1533520678_3948042_247-213456-0_sp1.1 from training. Duration: 0.97275 2023-10-03 18:40:29,939 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.3.encoder.layers.2.feed_forward3.out_whiten, num_groups=1, num_channels=512, metric=7.16 vs. limit=15.0 2023-10-03 18:40:36,626 WARNING [train.py:1204] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_127-119109-0_sp0.9 from training. Duration: 0.8 2023-10-03 18:40:45,013 WARNING [train.py:1204] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_215-202973-0 from training. Duration: 0.88 2023-10-03 18:40:46,437 WARNING [train.py:1204] (2/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_17-320491-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 18:40:50,024 WARNING [train.py:1204] (2/4) Exclude cut with ID _5166_210_2084_1_1527073197160_4087993_310-114452-0 from training. Duration: 0.59 2023-10-03 18:40:51,872 WARNING [train.py:1204] (2/4) Exclude cut with ID _15133_210_6228_1_1530794512074_3080379_29-264458-0_sp0.9 from training. Duration: 0.6444375 2023-10-03 18:40:54,779 WARNING [train.py:1204] (2/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_159-281310-0_sp1.1 from training. Duration: 0.863625 2023-10-03 18:40:54,795 WARNING [train.py:1204] (2/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_195-167011-0_sp0.9 from training. Duration: 0.8333125 2023-10-03 18:40:56,111 WARNING [train.py:1204] (2/4) Exclude cut with ID _14087_210_2253_1_1530529224512_3014029_156-257607-0_sp1.1 from training. Duration: 0.7545625 2023-10-03 18:40:58,925 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_2295_210_2087_1_1525500993_2407863_116-238894-0 from training. Duration: 0.7 2023-10-03 18:41:00,342 WARNING [train.py:1204] (2/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_321-335475-0_sp0.9 from training. Duration: 0.5 2023-10-03 18:41:01,741 WARNING [train.py:1204] (2/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_188-114464-0 from training. Duration: 0.82 2023-10-03 18:41:04,434 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_2592_210_2290_1_1525697025_4882925_157-70512-0 from training. Duration: 0.91 2023-10-03 18:41:05,948 WARNING [train.py:1204] (2/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_138-64210-0_sp1.1 from training. Duration: 0.663625 2023-10-03 18:41:06,143 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=1367273.3333333333, ans=0.1 2023-10-03 18:41:10,725 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.ff3_skip_rate, batch_count=1367340.0, ans=0.0 2023-10-03 18:41:13,217 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_11305_210_1794_1_1529750428_7178148_651-95702-0_sp1.1 from training. Duration: 0.963625 2023-10-03 18:41:13,231 WARNING [train.py:1204] (2/4) Exclude cut with ID _14224_210_4381_1_1530529062037_4264010_379-184223-0_sp0.9 from training. Duration: 0.9444375 2023-10-03 18:41:14,663 WARNING [train.py:1204] (2/4) Exclude cut with ID _8394_210_6702_1_1528590504316_3540189_381-125325-0_sp1.1 from training. Duration: 0.963625 2023-10-03 18:41:16,070 WARNING [train.py:1204] (2/4) Exclude cut with ID IC0622W0021-307659 from training. Duration: 0.896 2023-10-03 18:41:16,073 WARNING [train.py:1204] (2/4) Exclude cut with ID _7624_210_1531_1_1528109802198_4933101_418-349834-0_sp1.1 from training. Duration: 0.5454375 2023-10-03 18:41:19,573 WARNING [train.py:1204] (2/4) Exclude cut with ID _49728_210_12651_1_1534041034294_6788150_607-3416-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 18:41:22,664 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8275_210_4198_1_1528455320_3892571_491-17707-0 from training. Duration: 0.64 2023-10-03 18:41:22,725 WARNING [train.py:1204] (2/4) Exclude cut with ID _5287_210_2084_1_1527418883040_3878670_333-48508-0 from training. Duration: 0.6 2023-10-03 18:41:25,936 INFO [train.py:1046] (2/4) Epoch 39, batch 3250, loss[loss=0.1644, simple_loss=0.2482, pruned_loss=0.04029, over 24551.00 frames. ], tot_loss[loss=0.1563, simple_loss=0.2354, pruned_loss=0.03863, over 4703999.19 frames. ], batch size: 71, lr: 2.61e-03, grad_scale: 16.0 2023-10-03 18:41:25,998 WARNING [train.py:1204] (2/4) Exclude cut with ID _8365_210_2064_1_1528592311070_4677690_371-122295-0 from training. Duration: 0.55 2023-10-03 18:41:27,373 WARNING [train.py:1204] (2/4) Exclude cut with ID _14860_210_4915_1_1530702020340_7343240_836-328489-0 from training. Duration: 0.9 2023-10-03 18:41:28,755 WARNING [train.py:1204] (2/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_1033-274376-0_sp0.9 from training. Duration: 0.9666875 2023-10-03 18:41:30,262 WARNING [train.py:1204] (2/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_475-127455-0_sp0.9 from training. Duration: 0.77775 2023-10-03 18:41:31,626 WARNING [train.py:1204] (2/4) Exclude cut with ID IC0671W0071-331914 from training. Duration: 0.896 2023-10-03 18:41:31,658 WARNING [train.py:1204] (2/4) Exclude cut with ID _30327_210_16049_1_1532484161295_3924169_2-1523-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 18:41:31,661 WARNING [train.py:1204] (2/4) Exclude cut with ID _24314_210_4915_1_1532135078116_7170179_460-208279-0_sp1.1 from training. Duration: 0.97275 2023-10-03 18:41:32,983 WARNING [train.py:1204] (2/4) Exclude cut with ID IC0679W0052-335859 from training. Duration: 0.64 2023-10-03 18:41:37,084 WARNING [train.py:1204] (2/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_3-237434-0_sp1.1 from training. Duration: 0.6909375 2023-10-03 18:41:39,848 WARNING [train.py:1204] (2/4) Exclude cut with ID _69884_210_9771_1_1535846392818_4166169_405-185283-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 18:41:47,081 WARNING [train.py:1204] (2/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_927-83684-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 18:41:48,315 WARNING [train.py:1204] (2/4) Exclude cut with ID _6694_210_4599_1_1527852637490_3965529_356-169233-0 from training. Duration: 0.99 2023-10-03 18:41:48,400 WARNING [train.py:1204] (2/4) Exclude cut with ID _19896_210_1883_1_1531709022662_6649569_562-233001-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 18:41:49,754 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_20120_210_6753_1_1531643035_5482859_354-78629-0_sp1.1 from training. Duration: 0.963625 2023-10-03 18:41:49,756 WARNING [train.py:1204] (2/4) Exclude cut with ID _60135_210_13427_1_1534935557922_3651319_96-168198-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 18:41:51,126 WARNING [train.py:1204] (2/4) Exclude cut with ID _31602_210_5854_1_1532677021456_4458250_249-96604-0_sp1.1 from training. Duration: 0.8090625 2023-10-03 18:41:51,155 WARNING [train.py:1204] (2/4) Exclude cut with ID _14100_210_8341_1_1530529320613_3678069_258-224599-0_sp0.9 from training. Duration: 0.8555625 2023-10-03 18:41:54,582 WARNING [train.py:1204] (2/4) Exclude cut with ID _19708_210_9206_1_1532136650733_4238364_215-160135-0_sp1.1 from training. Duration: 0.97275 2023-10-03 18:41:54,604 WARNING [train.py:1204] (2/4) Exclude cut with ID _21878_210_3525_1_1531738772002_7264330_113-18163-0_sp0.9 from training. Duration: 0.988875 2023-10-03 18:41:54,644 WARNING [train.py:1204] (2/4) Exclude cut with ID _64135_210_2253_1_1535677267876_7608972_552-337340-0_sp1.1 from training. Duration: 0.92725 2023-10-03 18:41:54,672 WARNING [train.py:1204] (2/4) Exclude cut with ID _67725_210_27767_1_1535608756671_5105359_10-12887-0_sp1.1 from training. Duration: 0.97275 2023-10-03 18:41:54,677 WARNING [train.py:1204] (2/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_401-284840-0_sp1.1 from training. Duration: 0.97275 2023-10-03 18:41:54,696 WARNING [train.py:1204] (2/4) Exclude cut with ID _44659_210_2721_1_1534036120061_6614860_483-141285-0_sp1.1 from training. Duration: 0.936375 2023-10-03 18:41:58,027 WARNING [train.py:1204] (2/4) Exclude cut with ID _9461_210_4557_1_1529060514726_3677451_358-332734-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 18:41:59,873 WARNING [train.py:1204] (2/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_788-298907-0_sp1.1 from training. Duration: 0.8090625 2023-10-03 18:42:01,318 WARNING [train.py:1204] (2/4) Exclude cut with ID _39411_210_5986_1_1533538825030_3343649_354-267196-0_sp1.1 from training. Duration: 0.92725 2023-10-03 18:42:02,567 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_14953_210_2151_1_1530701710_5616165_192-309792-0_sp1.1 from training. Duration: 0.97275 2023-10-03 18:42:03,922 WARNING [train.py:1204] (2/4) Exclude cut with ID _59170_210_6721_1_1534983633960_4203335_290-151255-0_sp1.1 from training. Duration: 0.92725 2023-10-03 18:42:03,953 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_5575_210_1410_1_1527301305_2570170_249-93769-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 18:42:03,970 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_46013_210_7234_1_1533776362_3744032_463-52378-0_sp1.1 from training. Duration: 0.7818125 2023-10-03 18:42:05,207 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.490e+02 1.905e+02 2.104e+02 2.395e+02 3.285e+02, threshold=4.208e+02, percent-clipped=0.0 2023-10-03 18:42:09,465 WARNING [train.py:1204] (2/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_580-93798-0 from training. Duration: 0.56 2023-10-03 18:42:09,518 WARNING [train.py:1204] (2/4) Exclude cut with ID _19514_210_9259_1_1531530071330_5084230_385-13762-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 18:42:09,530 WARNING [train.py:1204] (2/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_458-67321-0_sp1.1 from training. Duration: 0.8818125 2023-10-03 18:42:10,780 WARNING [train.py:1204] (2/4) Exclude cut with ID _41298_210_9019_1_1533430371450_4822143_188-154791-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 18:42:12,091 WARNING [train.py:1204] (2/4) Exclude cut with ID _15037_210_2253_1_1530752294104_3267650_296-192597-0_sp0.9 from training. Duration: 0.9 2023-10-03 18:42:18,175 WARNING [train.py:1204] (2/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_661-264857-0_sp1.1 from training. Duration: 0.8454375 2023-10-03 18:42:22,641 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.bypass.skip_rate, batch_count=1367673.3333333333, ans=0.09899494936611666 2023-10-03 18:42:24,387 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_34008_210_10431_1_1533345424_8183247_662-317331-0_sp1.1 from training. Duration: 0.8545625 2023-10-03 18:42:25,578 WARNING [train.py:1204] (2/4) Exclude cut with ID _46502_210_13794_1_1533724452198_3657159_241-278329-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 18:42:25,578 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_11349_210_6753_1_1529800861_1101207_26-65602-0 from training. Duration: 0.96 2023-10-03 18:42:25,582 WARNING [train.py:1204] (2/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_448-4048-0_sp1.1 from training. Duration: 0.87275 2023-10-03 18:42:25,591 WARNING [train.py:1204] (2/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_662-275621-0_sp1.1 from training. Duration: 0.3818125 2023-10-03 18:42:25,636 WARNING [train.py:1204] (2/4) Exclude cut with ID _13938_210_9988_1_1530518718978_3693960_256-54454-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 18:42:25,847 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.ff2_skip_rate, batch_count=1367673.3333333333, ans=0.0 2023-10-03 18:42:29,132 WARNING [train.py:1204] (2/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_256-32662-0 from training. Duration: 0.9 2023-10-03 18:42:29,166 WARNING [train.py:1204] (2/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_326-17061-0 from training. Duration: 0.94 2023-10-03 18:42:29,206 WARNING [train.py:1204] (2/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_505-97987-0_sp1.1 from training. Duration: 0.7818125 2023-10-03 18:42:30,601 WARNING [train.py:1204] (2/4) Exclude cut with ID _66873_210_5517_1_1535454136506_4293370_316-294364-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 18:42:31,927 WARNING [train.py:1204] (2/4) Exclude cut with ID _19514_210_9259_1_1531530071330_5084230_762-231063-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 18:42:31,980 WARNING [train.py:1204] (2/4) Exclude cut with ID _4697_210_2069_1_1526815801953_3823098_68-219807-0_sp0.9 from training. Duration: 0.52225 2023-10-03 18:42:33,313 WARNING [train.py:1204] (2/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_479-158053-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 18:42:33,601 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.conv_module2.balancer1.max_abs, batch_count=1367673.3333333333, ans=10.0 2023-10-03 18:42:36,059 WARNING [train.py:1204] (2/4) Exclude cut with ID _66112_210_10635_1_1535590236305_3801484_83-38181-0_sp1.1 from training. Duration: 0.936375 2023-10-03 18:42:36,088 WARNING [train.py:1204] (2/4) Exclude cut with ID _55857_210_2346_1_1534644007831_6898209_255-146346-0_sp1.1 from training. Duration: 0.963625 2023-10-03 18:42:37,444 WARNING [train.py:1204] (2/4) Exclude cut with ID _48836_210_12210_1_1534075848293_3904150_429-35475-0 from training. Duration: 0.89 2023-10-03 18:42:37,464 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_50154_210_13731_1_1534750862_1080961_119-187163-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 18:42:38,905 INFO [train.py:1046] (2/4) Epoch 39, batch 3300, loss[loss=0.1415, simple_loss=0.2221, pruned_loss=0.03046, over 24311.00 frames. ], tot_loss[loss=0.1575, simple_loss=0.237, pruned_loss=0.03902, over 4704588.95 frames. ], batch size: 56, lr: 2.60e-03, grad_scale: 16.0 2023-10-03 18:42:40,387 WARNING [train.py:1204] (2/4) Exclude cut with ID _8793_210_6310_1_1528542985139_2310009_121-179782-0_sp0.9 from training. Duration: 0.888875 2023-10-03 18:42:40,390 WARNING [train.py:1204] (2/4) Exclude cut with ID _14884_210_2252_1_1530707459689_5561250_591-173406-0 from training. Duration: 0.96 2023-10-03 18:42:41,891 WARNING [train.py:1204] (2/4) Exclude cut with ID _15133_210_6228_1_1530794512074_3080379_54-198193-0_sp1.1 from training. Duration: 0.8545625 2023-10-03 18:42:41,910 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6545_210_2040_1_1527774670_4245258_407-48070-0 from training. Duration: 0.76 2023-10-03 18:42:43,951 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1032-236863-0 from training. Duration: 0.84 2023-10-03 18:42:45,243 WARNING [train.py:1204] (2/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_507-303370-0 from training. Duration: 0.96 2023-10-03 18:42:46,513 WARNING [train.py:1204] (2/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_164-282366-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 18:42:49,372 WARNING [train.py:1204] (2/4) Exclude cut with ID _39867_210_9826_1_1533553222030_9344259_312-158727-0_sp1.1 from training. Duration: 0.963625 2023-10-03 18:42:50,714 WARNING [train.py:1204] (2/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_174-205058-0_sp1.1 from training. Duration: 0.863625 2023-10-03 18:42:50,747 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_11305_210_1794_1_1529750428_7178148_166-237358-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 18:42:53,407 WARNING [train.py:1204] (2/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_26-175553-0_sp1.1 from training. Duration: 0.5545625 2023-10-03 18:42:53,436 WARNING [train.py:1204] (2/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_96-290039-0_sp1.1 from training. Duration: 0.5090625 2023-10-03 18:42:54,357 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.feed_forward2.hidden_balancer.prob, batch_count=1367806.6666666667, ans=0.125 2023-10-03 18:42:59,079 WARNING [train.py:1204] (2/4) Exclude cut with ID _53219_210_4525_1_1534834836008_4064825_261-101691-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 18:42:59,193 WARNING [train.py:1204] (2/4) Exclude cut with ID _24271_210_12658_1_1531987270022_7190519_96-293390-0_sp1.1 from training. Duration: 0.936375 2023-10-03 18:43:02,020 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_271-234962-0 from training. Duration: 0.79 2023-10-03 18:43:03,473 WARNING [train.py:1204] (2/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_136-30660-0_sp1.1 from training. Duration: 0.97275 2023-10-03 18:43:03,487 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_20120_210_6753_1_1531643035_5482859_625-281046-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 18:43:03,580 WARNING [train.py:1204] (2/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_333-168193-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 18:43:04,957 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9029W0269-509664 from training. Duration: 0.896 2023-10-03 18:43:05,046 WARNING [train.py:1204] (2/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_450-147393-0_sp1.1 from training. Duration: 0.92725 2023-10-03 18:43:06,269 WARNING [train.py:1204] (2/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_643-52007-0_sp1.1 from training. Duration: 0.5181875 2023-10-03 18:43:06,342 WARNING [train.py:1204] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_330-327861-0_sp0.9 from training. Duration: 0.7444375 2023-10-03 18:43:06,356 WARNING [train.py:1204] (2/4) Exclude cut with ID _47790_210_16187_1_1533862814066_4207519_260-317651-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 18:43:07,597 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9044W0010-516654 from training. Duration: 0.896 2023-10-03 18:43:11,796 WARNING [train.py:1204] (2/4) Exclude cut with ID _14087_210_2253_1_1530529224512_3014029_265-118378-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 18:43:11,802 WARNING [train.py:1204] (2/4) Exclude cut with ID _6194_210_1534_1_1527511826160_3941194_321-148641-0_sp0.9 from training. Duration: 0.711125 2023-10-03 18:43:13,228 WARNING [train.py:1204] (2/4) Exclude cut with ID _54871_210_22356_1_1534665798253_3856359_101-169176-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 18:43:13,233 WARNING [train.py:1204] (2/4) Exclude cut with ID 497-129325-0061-62254-0_sp1.1 from training. Duration: 0.97725 2023-10-03 18:43:14,600 WARNING [train.py:1204] (2/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_795-33252-0_sp1.1 from training. Duration: 0.336375 2023-10-03 18:43:14,614 WARNING [train.py:1204] (2/4) Exclude cut with ID _28317_210_12742_1_1532480302381_8510325_168-246176-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 18:43:16,563 WARNING [train.py:1204] (2/4) Exclude cut with ID _7160_210_1876_1_1527940923231_3703799_120-150943-0_sp1.1 from training. Duration: 0.736375 2023-10-03 18:43:19,239 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9084W0005-536844 from training. Duration: 0.768 2023-10-03 18:43:20,646 WARNING [train.py:1204] (2/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_234-260682-0 from training. Duration: 0.84 2023-10-03 18:43:20,690 WARNING [train.py:1204] (2/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_188-114464-0_sp0.9 from training. Duration: 0.911125 2023-10-03 18:43:22,104 WARNING [train.py:1204] (2/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_81-75849-0 from training. Duration: 0.39 2023-10-03 18:43:24,128 WARNING [train.py:1204] (2/4) Exclude cut with ID _14224_210_4381_1_1530529062037_4264010_379-184223-0_sp1.1 from training. Duration: 0.77275 2023-10-03 18:43:27,430 WARNING [train.py:1204] (2/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_454-123002-0_sp1.1 from training. Duration: 0.62725 2023-10-03 18:43:28,646 WARNING [train.py:1204] (2/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_887-249555-0_sp1.1 from training. Duration: 0.7 2023-10-03 18:43:31,364 WARNING [train.py:1204] (2/4) Exclude cut with ID _31088_210_16446_1_1532520012284_3440919_20-260902-0_sp1.1 from training. Duration: 0.97275 2023-10-03 18:43:31,414 WARNING [train.py:1204] (2/4) Exclude cut with ID _45329_210_7005_1_1534157951211_6774339_856-344574-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 18:43:31,416 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_36756_210_7234_1_1533026991_966276_4-164118-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 18:43:32,790 WARNING [train.py:1204] (2/4) Exclude cut with ID _8365_210_2064_1_1528592311070_4677690_371-122295-0_sp1.1 from training. Duration: 0.5 2023-10-03 18:43:34,191 WARNING [train.py:1204] (2/4) Exclude cut with ID _5910_210_1389_1_1527337972454_5050100_555-159815-0_sp1.1 from training. Duration: 0.8909375 2023-10-03 18:43:34,201 WARNING [train.py:1204] (2/4) Exclude cut with ID _37086_210_9038_1_1532999540891_7021250_469-147941-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 18:43:35,536 WARNING [train.py:1204] (2/4) Exclude cut with ID _3067_210_2667_1_1526212974640_4503168_459-173867-0_sp0.9 from training. Duration: 0.97775 2023-10-03 18:43:38,549 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9156W0006-572499 from training. Duration: 0.896 2023-10-03 18:43:38,624 WARNING [train.py:1204] (2/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_277-210798-0 from training. Duration: 0.55 2023-10-03 18:43:40,016 WARNING [train.py:1204] (2/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_103-336064-0_sp1.1 from training. Duration: 0.463625 2023-10-03 18:43:40,069 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_3414_210_1988_1_1526212718_4813195_331-290945-0_sp1.1 from training. Duration: 0.936375 2023-10-03 18:43:40,070 WARNING [train.py:1204] (2/4) Exclude cut with ID _67499_210_17282_1_1535509805220_3692430_365-29276-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 18:43:41,477 WARNING [train.py:1204] (2/4) Exclude cut with ID _14821_210_8750_1_1530680503991_6829359_69-250847-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 18:43:41,478 WARNING [train.py:1204] (2/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_1102-61301-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 18:43:42,819 WARNING [train.py:1204] (2/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_166-246557-0_sp1.1 from training. Duration: 0.5818125 2023-10-03 18:43:44,174 WARNING [train.py:1204] (2/4) Exclude cut with ID _15351_210_10626_1_1530856635653_3576210_214-129918-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 18:43:44,192 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_120-41668-0_sp0.9 from training. Duration: 0.77775 2023-10-03 18:43:45,459 WARNING [train.py:1204] (2/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_27-72278-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 18:43:46,843 WARNING [train.py:1204] (2/4) Exclude cut with ID _24314_210_4915_1_1532135078116_7170179_521-258868-0_sp1.1 from training. Duration: 0.7181875 2023-10-03 18:43:48,257 WARNING [train.py:1204] (2/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_143-86344-0 from training. Duration: 0.77 2023-10-03 18:43:50,158 WARNING [train.py:1204] (2/4) Exclude cut with ID _53489_210_6228_1_1534298357169_3835970_465-157433-0_sp1.1 from training. Duration: 0.92725 2023-10-03 18:43:51,495 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_248-319353-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 18:43:52,788 INFO [train.py:1046] (2/4) Epoch 39, batch 3350, loss[loss=0.1367, simple_loss=0.2207, pruned_loss=0.02639, over 24306.00 frames. ], tot_loss[loss=0.1585, simple_loss=0.238, pruned_loss=0.03952, over 4702223.84 frames. ], batch size: 56, lr: 2.60e-03, grad_scale: 16.0 2023-10-03 18:43:52,915 WARNING [train.py:1204] (2/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_102-77064-0_sp1.1 from training. Duration: 0.8454375 2023-10-03 18:43:52,941 WARNING [train.py:1204] (2/4) Exclude cut with ID _9389_210_1664_1_1529217337301_5289269_379-128794-0_sp1.1 from training. Duration: 0.77275 2023-10-03 18:43:55,044 WARNING [train.py:1204] (2/4) Exclude cut with ID _3231_210_2106_1_1526205864934_6784220_236-260835-0_sp1.1 from training. Duration: 0.97275 2023-10-03 18:43:56,412 WARNING [train.py:1204] (2/4) Exclude cut with ID _24271_210_12658_1_1531987270022_7190519_184-342363-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 18:43:56,413 WARNING [train.py:1204] (2/4) Exclude cut with ID _27894_210_12392_1_1532415496078_6902439_67-221663-0_sp1.1 from training. Duration: 0.92725 2023-10-03 18:44:01,120 WARNING [train.py:1204] (2/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_304-59227-0_sp1.1 from training. Duration: 0.7 2023-10-03 18:44:02,516 WARNING [train.py:1204] (2/4) Exclude cut with ID _58605_210_12210_1_1534723195939_3904259_458-42232-0_sp1.1 from training. Duration: 0.92725 2023-10-03 18:44:02,864 INFO [scaling.py:1118] (2/4) WithLoss: name=encoder.encoders.5.encoder.layers.1.self_attn_weights, loss-sum=0.000e+00 2023-10-03 18:44:03,842 WARNING [train.py:1204] (2/4) Exclude cut with ID _14922_210_6228_1_1530707435273_3693310_13-165512-0_sp0.9 from training. Duration: 0.911125 2023-10-03 18:44:06,634 WARNING [train.py:1204] (2/4) Exclude cut with ID _58602_210_2346_1_1534903210085_6908830_792-102241-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 18:44:06,765 WARNING [train.py:1204] (2/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_588-125165-0_sp0.9 from training. Duration: 0.92225 2023-10-03 18:44:08,068 WARNING [train.py:1204] (2/4) Exclude cut with ID _54218_210_12210_1_1534413646388_4409259_239-98429-0_sp1.1 from training. Duration: 0.97275 2023-10-03 18:44:08,297 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.feed_forward3.hidden_balancer.prob, batch_count=1368140.0, ans=0.125 2023-10-03 18:44:09,415 WARNING [train.py:1204] (2/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_538-32291-0_sp1.1 from training. Duration: 0.7545625 2023-10-03 18:44:09,509 WARNING [train.py:1204] (2/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_419-317062-0 from training. Duration: 0.91 2023-10-03 18:44:12,134 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9309W0202-640329 from training. Duration: 0.896 2023-10-03 18:44:12,174 WARNING [train.py:1204] (2/4) Exclude cut with ID _3231_210_2106_1_1526205864934_6784220_419-313291-0_sp1.1 from training. Duration: 0.97275 2023-10-03 18:44:16,134 WARNING [train.py:1204] (2/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_619-344499-0 from training. Duration: 0.63 2023-10-03 18:44:16,153 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_278-272612-0 from training. Duration: 0.73 2023-10-03 18:44:16,427 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.bypass_mid.scale_min, batch_count=1368140.0, ans=0.2 2023-10-03 18:44:17,472 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_103-84010-0_sp0.9 from training. Duration: 0.7666875 2023-10-03 18:44:17,501 WARNING [train.py:1204] (2/4) Exclude cut with ID _26528_210_9070_1_1532570410842_3851310_497-97218-0_sp1.1 from training. Duration: 0.7818125 2023-10-03 18:44:18,923 WARNING [train.py:1204] (2/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_262-84613-0_sp1.1 from training. Duration: 0.936375 2023-10-03 18:44:18,961 WARNING [train.py:1204] (2/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_387-131538-0 from training. Duration: 0.81 2023-10-03 18:44:18,985 WARNING [train.py:1204] (2/4) Exclude cut with ID _8030_210_7497_1_1528444886781_3531089_224-300489-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 18:44:19,011 WARNING [train.py:1204] (2/4) Exclude cut with ID _14349_210_8341_1_1530615710818_3442470_170-336541-0_sp0.9 from training. Duration: 0.9555625 2023-10-03 18:44:22,290 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_47069_210_15368_1_1533781267_4431320_180-31746-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 18:44:24,900 WARNING [train.py:1204] (2/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_447-7041-0_sp1.1 from training. Duration: 0.92725 2023-10-03 18:44:24,939 WARNING [train.py:1204] (2/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_2-55283-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 18:44:26,372 WARNING [train.py:1204] (2/4) Exclude cut with ID _12555_210_8750_1_1530075594402_3780239_70-181911-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 18:44:31,584 WARNING [train.py:1204] (2/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_311-328022-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 18:44:31,850 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=1368206.6666666667, ans=0.1 2023-10-03 18:44:31,919 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.conv_skip_rate, batch_count=1368206.6666666667, ans=0.0 2023-10-03 18:44:32,928 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.633e+02 1.940e+02 2.156e+02 2.529e+02 3.650e+02, threshold=4.311e+02, percent-clipped=0.0 2023-10-03 18:44:33,036 WARNING [train.py:1204] (2/4) Exclude cut with ID _14528_210_8254_1_1530669700514_4306939_330-2987-0_sp1.1 from training. Duration: 0.92725 2023-10-03 18:44:34,484 WARNING [train.py:1204] (2/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_385-184858-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 18:44:34,756 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.balancer2.prob, batch_count=1368206.6666666667, ans=0.125 2023-10-03 18:44:37,280 WARNING [train.py:1204] (2/4) Exclude cut with ID _64414_210_17301_1_1535243252048_3757759_348-333360-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 18:44:37,340 WARNING [train.py:1204] (2/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_753-214751-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 18:44:40,004 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_17613_210_2151_1_1531137569_2366160_202-208013-0_sp1.1 from training. Duration: 0.92725 2023-10-03 18:44:40,013 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_43831_210_2158_1_1533725502_5872791_610-129300-0_sp1.1 from training. Duration: 0.936375 2023-10-03 18:44:40,395 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.feed_forward2.hidden_balancer.prob, batch_count=1368273.3333333333, ans=0.125 2023-10-03 18:44:41,419 WARNING [train.py:1204] (2/4) Exclude cut with ID _54218_210_12210_1_1534413646388_4409259_321-23852-0_sp1.1 from training. Duration: 0.936375 2023-10-03 18:44:43,980 WARNING [train.py:1204] (2/4) Exclude cut with ID _39411_210_5986_1_1533538825030_3343649_173-238248-0 from training. Duration: 0.83 2023-10-03 18:44:43,996 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_405-339986-0_sp0.9 from training. Duration: 0.5666875 2023-10-03 18:44:44,020 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_2009_210_2071_1_1525481134_4588937_385-302100-0 from training. Duration: 0.87 2023-10-03 18:44:44,058 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_161-276996-0_sp1.1 from training. Duration: 0.863625 2023-10-03 18:44:45,367 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_177-58005-0 from training. Duration: 0.62 2023-10-03 18:44:46,599 WARNING [train.py:1204] (2/4) Exclude cut with ID _64639_210_5816_1_1535367619437_3528000_146-343734-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 18:44:48,029 WARNING [train.py:1204] (2/4) Exclude cut with ID _3845_210_1390_1_1526731326141_3523610_72-61441-0_sp1.1 from training. Duration: 0.92725 2023-10-03 18:44:54,265 WARNING [train.py:1204] (2/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_991-125028-0_sp1.1 from training. Duration: 0.936375 2023-10-03 18:44:55,616 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7544_210_6753_1_1528591327_1471426_172-348777-0 from training. Duration: 0.66 2023-10-03 18:44:55,672 WARNING [train.py:1204] (2/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_416-257730-0_sp1.1 from training. Duration: 0.5909375 2023-10-03 18:44:56,976 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1113-7369-0_sp1.1 from training. Duration: 0.77275 2023-10-03 18:44:58,953 WARNING [train.py:1204] (2/4) Exclude cut with ID _40402_210_18219_1_1533522324115_2953150_242-76943-0_sp1.1 from training. Duration: 0.8545625 2023-10-03 18:45:03,001 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.0.layers.1.conv_module1.whiten, num_groups=1, num_channels=192, metric=7.54 vs. limit=15.0 2023-10-03 18:45:04,783 WARNING [train.py:1204] (2/4) Exclude cut with ID _44718_210_5816_1_1534050051869_6469030_190-49648-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 18:45:06,152 INFO [train.py:1046] (2/4) Epoch 39, batch 3400, loss[loss=0.1579, simple_loss=0.2391, pruned_loss=0.03835, over 23694.00 frames. ], tot_loss[loss=0.1594, simple_loss=0.2392, pruned_loss=0.03976, over 4714861.95 frames. ], batch size: 149, lr: 2.60e-03, grad_scale: 16.0 2023-10-03 18:45:07,630 WARNING [train.py:1204] (2/4) Exclude cut with ID _47251_210_18628_1_1533814063128_4137250_214-88076-0 from training. Duration: 0.88 2023-10-03 18:45:07,662 WARNING [train.py:1204] (2/4) Exclude cut with ID _5585_210_2667_1_1527418813324_8007340_89-332072-0_sp1.1 from training. Duration: 0.5818125 2023-10-03 18:45:07,685 WARNING [train.py:1204] (2/4) Exclude cut with ID _41817_210_18835_1_1533434477160_7423049_555-293448-0_sp1.1 from training. Duration: 0.72725 2023-10-03 18:45:09,032 WARNING [train.py:1204] (2/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_286-331314-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 18:45:09,089 WARNING [train.py:1204] (2/4) Exclude cut with ID _13921_210_9994_1_1530838702800_3003429_46-149444-0 from training. Duration: 0.94 2023-10-03 18:45:09,215 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.1.ff3_skip_rate, batch_count=1368406.6666666667, ans=0.0 2023-10-03 18:45:10,434 WARNING [train.py:1204] (2/4) Exclude cut with ID _44778_210_2054_1_1533798127203_7443090_768-43595-0_sp1.1 from training. Duration: 0.936375 2023-10-03 18:45:10,449 WARNING [train.py:1204] (2/4) Exclude cut with ID _35355_210_16040_1_1533212988977_4016229_74-190076-0 from training. Duration: 0.8 2023-10-03 18:45:11,827 WARNING [train.py:1204] (2/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_1007-316380-0_sp1.1 from training. Duration: 0.963625 2023-10-03 18:45:13,138 WARNING [train.py:1204] (2/4) Exclude cut with ID _23557_210_8750_1_1531890106370_6846255_150-283437-0_sp1.1 from training. Duration: 0.963625 2023-10-03 18:45:13,329 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.feed_forward3.hidden_balancer.prob, batch_count=1368406.6666666667, ans=0.125 2023-10-03 18:45:14,450 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_3474_210_2028_1_1526188513_4825246_145-309131-0_sp1.1 from training. Duration: 0.67275 2023-10-03 18:45:15,828 WARNING [train.py:1204] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_391-22148-0_sp1.1 from training. Duration: 0.7 2023-10-03 18:45:15,845 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_238-256668-0 from training. Duration: 0.69 2023-10-03 18:45:18,896 WARNING [train.py:1204] (2/4) Exclude cut with ID _8394_210_6702_1_1528590504316_3540189_372-198878-0 from training. Duration: 0.6 2023-10-03 18:45:18,907 WARNING [train.py:1204] (2/4) Exclude cut with ID ID0174W0006-766659 from training. Duration: 0.953 2023-10-03 18:45:19,122 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.conv_module1.balancer1.min_positive, batch_count=1368473.3333333333, ans=0.025 2023-10-03 18:45:20,139 WARNING [train.py:1204] (2/4) Exclude cut with ID _6274_210_2084_1_1527937583837_7494020_668-23019-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 18:45:21,642 WARNING [train.py:1204] (2/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_322-74328-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 18:45:21,648 WARNING [train.py:1204] (2/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_872-264525-0_sp1.1 from training. Duration: 0.5909375 2023-10-03 18:45:22,996 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_53378_210_17282_1_1534221071_5769509_641-286201-0_sp1.1 from training. Duration: 0.97275 2023-10-03 18:45:23,102 WARNING [train.py:1204] (2/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_199-295611-0_sp1.1 from training. Duration: 0.5 2023-10-03 18:45:28,661 WARNING [train.py:1204] (2/4) Exclude cut with ID _5250_210_5219_1_1527249561806_8692110_1270-317971-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 18:45:29,821 WARNING [train.py:1204] (2/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_784-213099-0 from training. Duration: 0.93 2023-10-03 18:45:35,818 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_115-233929-0_sp0.9 from training. Duration: 0.8 2023-10-03 18:45:37,233 WARNING [train.py:1204] (2/4) Exclude cut with ID _54871_210_22356_1_1534665798253_3856359_256-223809-0_sp1.1 from training. Duration: 0.97275 2023-10-03 18:45:38,559 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_20120_210_6753_1_1531643035_5482859_285-87781-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 18:45:38,639 WARNING [train.py:1204] (2/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_619-344499-0_sp1.1 from training. Duration: 0.57275 2023-10-03 18:45:42,922 WARNING [train.py:1204] (2/4) Exclude cut with ID _60229_210_27767_1_1534838406158_3655350_264-279573-0_sp0.9 from training. Duration: 0.92225 2023-10-03 18:45:46,908 WARNING [train.py:1204] (2/4) Exclude cut with ID _7719_210_3525_1_1528456153163_4296960_333-110399-0 from training. Duration: 0.96 2023-10-03 18:45:50,949 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_50423_210_13270_1_1534128598_5021734_759-90833-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 18:45:52,945 WARNING [train.py:1204] (2/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_151-254798-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 18:45:52,985 WARNING [train.py:1204] (2/4) Exclude cut with ID _3465_210_3525_1_1526175667060_3574049_450-203637-0 from training. Duration: 0.92 2023-10-03 18:45:53,017 WARNING [train.py:1204] (2/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_673-36456-0_sp1.1 from training. Duration: 0.936375 2023-10-03 18:45:53,059 WARNING [train.py:1204] (2/4) Exclude cut with ID _36538_210_9994_1_1533204020363_3795620_131-8685-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 18:45:54,806 WARNING [train.py:1204] (2/4) Exclude cut with ID _12556_210_8341_1_1530081024100_7327690_713-55626-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 18:45:54,839 WARNING [train.py:1204] (2/4) Exclude cut with ID _2322_210_2667_1_1525604548971_3656299_440-80494-0_sp0.9 from training. Duration: 0.97775 2023-10-03 18:45:59,799 WARNING [train.py:1204] (2/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_210-325081-0_sp1.1 from training. Duration: 0.97275 2023-10-03 18:46:03,945 WARNING [train.py:1204] (2/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_375-244976-0_sp0.9 from training. Duration: 0.8333125 2023-10-03 18:46:03,951 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_15517_210_6753_1_1530952293_1664795_119-31001-0_sp1.1 from training. Duration: 0.8545625 2023-10-03 18:46:05,652 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_41452_210_7291_1_1533437687_3941281_319-42326-0_sp1.1 from training. Duration: 0.92725 2023-10-03 18:46:07,066 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_372-173012-0 from training. Duration: 0.95 2023-10-03 18:46:13,798 WARNING [train.py:1204] (2/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_41-31157-0_sp1.1 from training. Duration: 0.5181875 2023-10-03 18:46:16,642 WARNING [train.py:1204] (2/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_91-212486-0 from training. Duration: 0.68 2023-10-03 18:46:19,271 INFO [train.py:1046] (2/4) Epoch 39, batch 3450, loss[loss=0.1595, simple_loss=0.2501, pruned_loss=0.03442, over 24553.00 frames. ], tot_loss[loss=0.1593, simple_loss=0.239, pruned_loss=0.03976, over 4707646.92 frames. ], batch size: 71, lr: 2.60e-03, grad_scale: 16.0 2023-10-03 18:46:20,833 WARNING [train.py:1204] (2/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_1267-272693-0 from training. Duration: 0.73 2023-10-03 18:46:22,546 WARNING [train.py:1204] (2/4) Exclude cut with ID _13938_210_9988_1_1530518718978_3693960_152-67046-0_sp1.1 from training. Duration: 0.936375 2023-10-03 18:46:25,823 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_4528_210_3273_1_1527076654_3276777_342-168068-0_sp1.1 from training. Duration: 0.7909375 2023-10-03 18:46:25,831 WARNING [train.py:1204] (2/4) Exclude cut with ID _7606_210_4915_1_1528894253277_5080690_127-177761-0 from training. Duration: 0.64 2023-10-03 18:46:27,163 WARNING [train.py:1204] (2/4) Exclude cut with ID _45329_210_7005_1_1534157951211_6774339_335-137972-0_sp1.1 from training. Duration: 0.92725 2023-10-03 18:46:31,124 WARNING [train.py:1204] (2/4) Exclude cut with ID _5166_210_2084_1_1527073197160_4087993_331-117829-0_sp1.1 from training. Duration: 0.563625 2023-10-03 18:46:36,760 WARNING [train.py:1204] (2/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_125-125818-0_sp1.1 from training. Duration: 0.87275 2023-10-03 18:46:36,810 WARNING [train.py:1204] (2/4) Exclude cut with ID _6789_210_1534_1_1527854471671_2870519_48-294897-0_sp1.1 from training. Duration: 0.963625 2023-10-03 18:46:38,117 WARNING [train.py:1204] (2/4) Exclude cut with ID _6694_210_4599_1_1527852637490_3965529_116-109398-0_sp1.1 from training. Duration: 0.663625 2023-10-03 18:46:38,127 WARNING [train.py:1204] (2/4) Exclude cut with ID _52892_210_6721_1_1534378941259_4124129_222-172685-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 18:46:39,572 WARNING [train.py:1204] (2/4) Exclude cut with ID _50655_210_18628_1_1534229942193_3772154_320-132275-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 18:46:46,424 WARNING [train.py:1204] (2/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_76-311963-0 from training. Duration: 0.7 2023-10-03 18:46:50,673 WARNING [train.py:1204] (2/4) Exclude cut with ID _6274_210_2084_1_1527937583837_7494020_32-205288-0 from training. Duration: 0.78 2023-10-03 18:46:50,690 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_86-16881-0_sp0.9 from training. Duration: 0.6444375 2023-10-03 18:46:50,734 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_271-297505-0_sp1.1 from training. Duration: 0.9 2023-10-03 18:46:53,423 WARNING [train.py:1204] (2/4) Exclude cut with ID _57572_210_5517_1_1534921219624_3809484_187-295293-0_sp1.1 from training. Duration: 0.963625 2023-10-03 18:46:59,748 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.732e+02 1.946e+02 2.153e+02 2.434e+02 3.371e+02, threshold=4.307e+02, percent-clipped=0.0 2023-10-03 18:46:59,829 WARNING [train.py:1204] (2/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_563-329943-0 from training. Duration: 0.99 2023-10-03 18:46:59,896 WARNING [train.py:1204] (2/4) Exclude cut with ID _7160_210_1876_1_1527940923231_3703799_302-150728-0_sp0.9 from training. Duration: 0.8666875 2023-10-03 18:47:03,189 WARNING [train.py:1204] (2/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_926-7526-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 18:47:03,215 WARNING [train.py:1204] (2/4) Exclude cut with ID _62574_210_2655_1_1535180407058_3933249_467-22348-0_sp1.1 from training. Duration: 0.936375 2023-10-03 18:47:04,604 WARNING [train.py:1204] (2/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_280-161633-0_sp0.9 from training. Duration: 0.811125 2023-10-03 18:47:05,967 WARNING [train.py:1204] (2/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_385-193052-0_sp1.1 from training. Duration: 0.7818125 2023-10-03 18:47:07,332 WARNING [train.py:1204] (2/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_444-28041-0 from training. Duration: 0.85 2023-10-03 18:47:07,339 WARNING [train.py:1204] (2/4) Exclude cut with ID _66070_210_7640_1_1535441393096_7805310_341-249237-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 18:47:08,702 WARNING [train.py:1204] (2/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_425-269826-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 18:47:11,442 WARNING [train.py:1204] (2/4) Exclude cut with ID _53950_210_5816_1_1534417264919_3862951_124-185597-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 18:47:14,240 WARNING [train.py:1204] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_380-204756-0 from training. Duration: 0.86 2023-10-03 18:47:17,054 WARNING [train.py:1204] (2/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_282-257791-0_sp0.9 from training. Duration: 0.9555625 2023-10-03 18:47:21,457 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.feed_forward1.out_proj.dropout_p, batch_count=1369006.6666666667, ans=0.1 2023-10-03 18:47:22,563 WARNING [train.py:1204] (2/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_372-131163-0_sp1.1 from training. Duration: 0.8545625 2023-10-03 18:47:23,939 WARNING [train.py:1204] (2/4) Exclude cut with ID _28317_210_12742_1_1532480302381_8510325_447-233513-0_sp1.1 from training. Duration: 0.963625 2023-10-03 18:47:25,849 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_54-118333-0_sp1.1 from training. Duration: 0.97275 2023-10-03 18:47:30,525 WARNING [train.py:1204] (2/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_388-230239-0_sp1.1 from training. Duration: 0.963625 2023-10-03 18:47:30,544 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_40047_210_2290_1_1533290519_2374311_141-54639-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 18:47:31,867 WARNING [train.py:1204] (2/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_91-267696-0_sp0.9 from training. Duration: 0.9666875 2023-10-03 18:47:33,223 INFO [train.py:1046] (2/4) Epoch 39, batch 3500, loss[loss=0.1432, simple_loss=0.2229, pruned_loss=0.03179, over 17388.00 frames. ], tot_loss[loss=0.1577, simple_loss=0.2376, pruned_loss=0.03895, over 4708178.27 frames. ], batch size: 37, lr: 2.60e-03, grad_scale: 16.0 2023-10-03 18:47:33,291 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_14056_210_4163_1_1530604483_7572827_532-137847-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 18:47:37,388 WARNING [train.py:1204] (2/4) Exclude cut with ID _8064_210_5399_1_1528372299291_4282973_155-283231-0_sp1.1 from training. Duration: 0.97275 2023-10-03 18:47:41,653 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_2245_210_2715_1_1525511548_3540254_310-52483-0_sp1.1 from training. Duration: 0.8 2023-10-03 18:47:43,011 WARNING [train.py:1204] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_185-174805-0 from training. Duration: 0.68 2023-10-03 18:47:43,333 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.feed_forward3.hidden_balancer.prob, batch_count=1369073.3333333333, ans=0.125 2023-10-03 18:47:44,426 WARNING [train.py:1204] (2/4) Exclude cut with ID _15012_210_1968_1_1530856820429_5095350_662-210469-0_sp1.1 from training. Duration: 0.5181875 2023-10-03 18:47:45,851 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_426-313557-0_sp0.9 from training. Duration: 0.588875 2023-10-03 18:47:47,404 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.conv_module1.balancer2.prob, batch_count=1369140.0, ans=0.125 2023-10-03 18:47:48,672 WARNING [train.py:1204] (2/4) Exclude cut with ID _42326_210_7770_1_1533814282531_3707269_407-204343-0_sp1.1 from training. Duration: 0.97275 2023-10-03 18:47:48,687 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_258-310570-0 from training. Duration: 0.82 2023-10-03 18:47:51,618 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_4737_210_2040_1_1526998689_3293008_378-178650-0_sp1.1 from training. Duration: 0.863625 2023-10-03 18:47:54,232 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_47896_210_20508_1_1534492518_5958868_432-327451-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 18:47:54,324 WARNING [train.py:1204] (2/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_188-114464-0_sp1.1 from training. Duration: 0.7454375 2023-10-03 18:47:54,331 WARNING [train.py:1204] (2/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_798-106179-0_sp1.1 from training. Duration: 0.7545625 2023-10-03 18:47:54,366 WARNING [train.py:1204] (2/4) Exclude cut with ID _14368_210_4220_1_1530532714870_3595529_143-117421-0_sp1.1 from training. Duration: 0.52725 2023-10-03 18:47:55,689 WARNING [train.py:1204] (2/4) Exclude cut with ID _67499_210_17282_1_1535509805220_3692430_361-78786-0_sp1.1 from training. Duration: 0.963625 2023-10-03 18:47:55,733 WARNING [train.py:1204] (2/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_712-342563-0_sp1.1 from training. Duration: 0.8818125 2023-10-03 18:47:55,776 WARNING [train.py:1204] (2/4) Exclude cut with ID _9235_210_5219_1_1528891154253_8046120_387-159008-0 from training. Duration: 0.86 2023-10-03 18:48:00,093 WARNING [train.py:1204] (2/4) Exclude cut with ID _33640_210_2655_1_1533117605479_4855469_959-18820-0_sp1.1 from training. Duration: 0.963625 2023-10-03 18:48:00,120 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_12868_210_6753_1_1530702479_3610564_133-564-0_sp0.9 from training. Duration: 0.87775 2023-10-03 18:48:01,358 WARNING [train.py:1204] (2/4) Exclude cut with ID _23514_210_1389_1_1531832139785_3031439_12-29287-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 18:48:05,473 WARNING [train.py:1204] (2/4) Exclude cut with ID _23514_210_1389_1_1531832139785_3031439_489-185052-0_sp1.1 from training. Duration: 0.963625 2023-10-03 18:48:05,538 WARNING [train.py:1204] (2/4) Exclude cut with ID _5444_210_2151_1_1527418808464_5312210_634-194174-0 from training. Duration: 0.97 2023-10-03 18:48:05,557 WARNING [train.py:1204] (2/4) Exclude cut with ID _6275_210_2084_1_1528110062213_7669049_91-26919-0_sp1.1 from training. Duration: 0.8818125 2023-10-03 18:48:09,592 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_37351_210_7925_1_1533012931_3812113_516-82303-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 18:48:09,692 WARNING [train.py:1204] (2/4) Exclude cut with ID _41314_210_12651_1_1533459476543_3766460_80-74184-0_sp0.9 from training. Duration: 0.92225 2023-10-03 18:48:12,283 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_9460_210_4211_1_1528954587_10049667_495-202963-0_sp1.1 from training. Duration: 0.963625 2023-10-03 18:48:13,720 WARNING [train.py:1204] (2/4) Exclude cut with ID _14632_210_9988_1_1530691279392_3923200_332-105904-0_sp1.1 from training. Duration: 0.8181875 2023-10-03 18:48:13,735 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_40764_210_1925_1_1533882359_3752345_493-167886-0_sp1.1 from training. Duration: 0.92725 2023-10-03 18:48:13,904 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.conv_module1.balancer1.prob, batch_count=1369206.6666666667, ans=0.125 2023-10-03 18:48:15,116 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_196-289637-0 from training. Duration: 0.75 2023-10-03 18:48:16,426 WARNING [train.py:1204] (2/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_26-175553-0 from training. Duration: 0.61 2023-10-03 18:48:16,470 WARNING [train.py:1204] (2/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_890-5117-0 from training. Duration: 0.78 2023-10-03 18:48:16,517 WARNING [train.py:1204] (2/4) Exclude cut with ID _41314_210_12651_1_1533459476543_3766460_206-306860-0_sp1.1 from training. Duration: 0.92725 2023-10-03 18:48:17,930 WARNING [train.py:1204] (2/4) Exclude cut with ID _25882_210_3036_1_1532775665815_6479510_570-79824-0_sp1.1 from training. Duration: 0.963625 2023-10-03 18:48:19,415 WARNING [train.py:1204] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_731-2428-0_sp1.1 from training. Duration: 0.7545625 2023-10-03 18:48:19,452 WARNING [train.py:1204] (2/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_138-154812-0_sp0.9 from training. Duration: 0.8666875 2023-10-03 18:48:23,260 WARNING [train.py:1204] (2/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_178-39722-0_sp0.9 from training. Duration: 0.5444375 2023-10-03 18:48:24,633 WARNING [train.py:1204] (2/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_1285-256622-0_sp0.9 from training. Duration: 0.9333125 2023-10-03 18:48:31,888 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_41428_210_2161_1_1533774184_3992532_65-271323-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 18:48:33,337 WARNING [train.py:1204] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_308-161067-0 from training. Duration: 0.66 2023-10-03 18:48:33,342 WARNING [train.py:1204] (2/4) Exclude cut with ID _3067_210_2667_1_1526212974640_4503168_459-173867-0 from training. Duration: 0.88 2023-10-03 18:48:33,343 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_2221_210_1988_1_1525597051_4935541_415-296882-0_sp1.1 from training. Duration: 0.7 2023-10-03 18:48:33,696 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.attention_skip_rate, batch_count=1369340.0, ans=0.0 2023-10-03 18:48:34,900 WARNING [train.py:1204] (2/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_354-192266-0_sp1.1 from training. Duration: 0.936375 2023-10-03 18:48:36,248 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_258-310570-0_sp0.9 from training. Duration: 0.911125 2023-10-03 18:48:36,356 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_548-141039-0_sp1.1 from training. Duration: 0.963625 2023-10-03 18:48:39,117 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_15101_210_7927_1_1530786415_5033274_320-327527-0 from training. Duration: 0.96 2023-10-03 18:48:39,173 WARNING [train.py:1204] (2/4) Exclude cut with ID _2895_210_2477_1_1525956785794_4063750_289-11475-0_sp0.9 from training. Duration: 0.911125 2023-10-03 18:48:39,452 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.balancer_na.min_abs, batch_count=1369340.0, ans=0.02 2023-10-03 18:48:40,561 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7543_210_6753_1_1528543318_4552_1-85072-0_sp1.1 from training. Duration: 0.7545625 2023-10-03 18:48:43,035 WARNING [train.py:1204] (2/4) Exclude cut with ID _64639_210_5816_1_1535367619437_3528000_461-329156-0 from training. Duration: 0.96 2023-10-03 18:48:44,455 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_211-339622-0 from training. Duration: 0.45 2023-10-03 18:48:45,730 INFO [train.py:1046] (2/4) Epoch 39, batch 3550, loss[loss=0.1569, simple_loss=0.2386, pruned_loss=0.03757, over 23312.00 frames. ], tot_loss[loss=0.1566, simple_loss=0.2359, pruned_loss=0.0387, over 4688981.70 frames. ], batch size: 119, lr: 2.60e-03, grad_scale: 16.0 2023-10-03 18:48:45,811 WARNING [train.py:1204] (2/4) Exclude cut with ID _20455_210_5219_1_1531565996775_8098100_402-112739-0_sp1.1 from training. Duration: 0.963625 2023-10-03 18:48:47,197 WARNING [train.py:1204] (2/4) Exclude cut with ID _53489_210_6228_1_1534298357169_3835970_256-115565-0_sp1.1 from training. Duration: 0.936375 2023-10-03 18:48:47,222 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_26326_210_7925_1_1533002493_3592406_360-305260-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 18:48:47,248 WARNING [train.py:1204] (2/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_18-204383-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 18:48:47,527 INFO [scaling.py:1118] (2/4) WithLoss: name=encoder.encoders.2.encoder.layers.2.self_attn_weights, loss-sum=0.000e+00 2023-10-03 18:48:50,069 WARNING [train.py:1204] (2/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_306-232480-0_sp1.1 from training. Duration: 0.87275 2023-10-03 18:48:58,128 WARNING [train.py:1204] (2/4) Exclude cut with ID _41161_210_20034_1_1533431249657_3555314_116-299186-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 18:48:59,496 WARNING [train.py:1204] (2/4) Exclude cut with ID _13913_210_9994_1_1530579615187_3383897_473-277682-0_sp1.1 from training. Duration: 0.3545625 2023-10-03 18:49:01,644 WARNING [train.py:1204] (2/4) Exclude cut with ID _58895_210_8750_1_1535184081320_3649089_49-225702-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 18:49:03,009 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_3080_210_2071_1_1526086102_4365166_312-8726-0_sp1.1 from training. Duration: 0.82725 2023-10-03 18:49:04,376 WARNING [train.py:1204] (2/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_489-190175-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 18:49:06,873 WARNING [train.py:1204] (2/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_345-95717-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 18:49:06,884 WARNING [train.py:1204] (2/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_85-138257-0_sp1.1 from training. Duration: 0.6454375 2023-10-03 18:49:09,754 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7046_210_6753_1_1527895337_6920917_783-278109-0_sp1.1 from training. Duration: 0.7 2023-10-03 18:49:09,779 WARNING [train.py:1204] (2/4) Exclude cut with ID _3465_210_3525_1_1526175667060_3574049_450-203637-0_sp1.1 from training. Duration: 0.836375 2023-10-03 18:49:09,826 WARNING [train.py:1204] (2/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_416-132893-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 18:49:09,840 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_2655_210_2087_1_1525688959_3897400_437-337690-0_sp0.9 from training. Duration: 0.7 2023-10-03 18:49:11,181 WARNING [train.py:1204] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_700-183549-0_sp1.1 from training. Duration: 0.6181875 2023-10-03 18:49:16,771 WARNING [train.py:1204] (2/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_191-59784-0_sp1.1 from training. Duration: 0.8 2023-10-03 18:49:16,787 WARNING [train.py:1204] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_218-295097-0_sp1.1 from training. Duration: 0.7 2023-10-03 18:49:18,213 WARNING [train.py:1204] (2/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_310-109461-0_sp1.1 from training. Duration: 0.72725 2023-10-03 18:49:18,217 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_33348_210_5632_1_1533086912_3324030_131-321149-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 18:49:19,585 WARNING [train.py:1204] (2/4) Exclude cut with ID _64135_210_2253_1_1535677267876_7608972_133-188266-0_sp1.1 from training. Duration: 0.736375 2023-10-03 18:49:19,603 WARNING [train.py:1204] (2/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_505-205270-0 from training. Duration: 0.38 2023-10-03 18:49:19,613 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_67223_210_4238_1_1535612120_2537431_102-88248-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 18:49:19,725 WARNING [train.py:1204] (2/4) Exclude cut with ID _54871_210_22356_1_1534665798253_3856359_60-235433-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 18:49:21,022 WARNING [train.py:1204] (2/4) Exclude cut with ID _5189_210_4557_1_1527248319428_3769229_199-53526-0_sp1.1 from training. Duration: 0.3818125 2023-10-03 18:49:24,755 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.conv_module1.balancer2.prob, batch_count=1369540.0, ans=0.125 2023-10-03 18:49:25,663 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.536e+02 1.892e+02 2.071e+02 2.363e+02 4.365e+02, threshold=4.143e+02, percent-clipped=1.0 2023-10-03 18:49:26,020 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.bypass_mid.scale_min, batch_count=1369540.0, ans=0.2 2023-10-03 18:49:27,144 WARNING [train.py:1204] (2/4) Exclude cut with ID _45329_210_7005_1_1534157951211_6774339_439-90979-0_sp1.1 from training. Duration: 0.963625 2023-10-03 18:49:27,212 WARNING [train.py:1204] (2/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_1162-132880-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 18:49:28,521 WARNING [train.py:1204] (2/4) Exclude cut with ID _56423_210_5854_1_1534896061049_3165969_220-22317-0_sp1.1 from training. Duration: 0.963625 2023-10-03 18:49:31,239 WARNING [train.py:1204] (2/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_909-64151-0 from training. Duration: 0.94 2023-10-03 18:49:31,288 WARNING [train.py:1204] (2/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_465-156500-0_sp0.9 from training. Duration: 0.8 2023-10-03 18:49:33,033 WARNING [train.py:1204] (2/4) Exclude cut with ID _44304_210_15374_1_1533639352765_4613129_302-22084-0 from training. Duration: 0.86 2023-10-03 18:49:33,078 WARNING [train.py:1204] (2/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_663-166973-0_sp1.1 from training. Duration: 0.72725 2023-10-03 18:49:34,517 WARNING [train.py:1204] (2/4) Exclude cut with ID _45329_210_7005_1_1534157951211_6774339_503-287883-0_sp0.9 from training. Duration: 0.97775 2023-10-03 18:49:34,535 WARNING [train.py:1204] (2/4) Exclude cut with ID _60057_210_10640_1_1534903065793_3333150_362-230377-0_sp0.9 from training. Duration: 0.9666875 2023-10-03 18:49:38,571 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_4183_210_2087_1_1526729100_1869838_143-280962-0 from training. Duration: 0.56 2023-10-03 18:49:38,669 WARNING [train.py:1204] (2/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_281-1313-0_sp1.1 from training. Duration: 0.936375 2023-10-03 18:49:46,910 WARNING [train.py:1204] (2/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_64-215118-0_sp1.1 from training. Duration: 0.936375 2023-10-03 18:49:46,970 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_2822_210_2715_1_1526112586_1999587_165-36656-0 from training. Duration: 0.89 2023-10-03 18:49:47,029 WARNING [train.py:1204] (2/4) Exclude cut with ID _22961_210_8254_1_1531976366672_4278180_402-208830-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 18:49:51,078 WARNING [train.py:1204] (2/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_98-193494-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 18:49:52,510 WARNING [train.py:1204] (2/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_383-285196-0 from training. Duration: 0.97 2023-10-03 18:49:59,627 INFO [train.py:1046] (2/4) Epoch 39, batch 3600, loss[loss=0.1471, simple_loss=0.2265, pruned_loss=0.03387, over 23359.00 frames. ], tot_loss[loss=0.1562, simple_loss=0.2354, pruned_loss=0.03852, over 4689061.91 frames. ], batch size: 119, lr: 2.60e-03, grad_scale: 32.0 2023-10-03 18:49:59,683 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6767_210_3273_1_1527855783_1718932_142-127132-0 from training. Duration: 0.95 2023-10-03 18:49:59,711 WARNING [train.py:1204] (2/4) Exclude cut with ID _39867_210_9826_1_1533553222030_9344259_1018-339635-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 18:50:01,089 WARNING [train.py:1204] (2/4) Exclude cut with ID _14603_210_4220_1_1530615688441_3097319_274-274798-0_sp1.1 from training. Duration: 0.8909375 2023-10-03 18:50:02,527 WARNING [train.py:1204] (2/4) Exclude cut with ID _2334_210_2667_1_1525608238880_4705129_376-263393-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 18:50:02,584 WARNING [train.py:1204] (2/4) Exclude cut with ID _29303_210_2575_1_1532392237303_7410649_363-344675-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 18:50:04,237 WARNING [train.py:1204] (2/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_58-68956-0_sp1.1 from training. Duration: 0.7545625 2023-10-03 18:50:08,294 WARNING [train.py:1204] (2/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_709-311571-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 18:50:09,645 WARNING [train.py:1204] (2/4) Exclude cut with ID _46067_210_4220_1_1534125571066_3307450_136-257407-0_sp1.1 from training. Duration: 0.963625 2023-10-03 18:50:09,769 WARNING [train.py:1204] (2/4) Exclude cut with ID _6493_210_1747_1_1528459210424_4012470_324-323854-0_sp1.1 from training. Duration: 0.836375 2023-10-03 18:50:11,139 WARNING [train.py:1204] (2/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_234-260682-0_sp1.1 from training. Duration: 0.763625 2023-10-03 18:50:11,207 WARNING [train.py:1204] (2/4) Exclude cut with ID _36634_210_1794_1_1533296352450_1399884_55-119451-0_sp1.1 from training. Duration: 0.963625 2023-10-03 18:50:11,210 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_40047_210_2290_1_1533290519_2374311_195-165693-0 from training. Duration: 0.89 2023-10-03 18:50:15,252 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_5522_210_3273_1_1527249606_3909096_366-172508-0_sp0.9 from training. Duration: 0.7333125 2023-10-03 18:50:16,643 WARNING [train.py:1204] (2/4) Exclude cut with ID _21878_210_3525_1_1531738772002_7264330_187-10504-0_sp1.1 from training. Duration: 0.963625 2023-10-03 18:50:18,228 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.nonlin_attention.balancer.prob, batch_count=1369806.6666666667, ans=0.125 2023-10-03 18:50:19,670 WARNING [train.py:1204] (2/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_282-257791-0_sp1.1 from training. Duration: 0.7818125 2023-10-03 18:50:22,566 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_43831_210_2158_1_1533725502_5872791_467-288776-0_sp1.1 from training. Duration: 0.92725 2023-10-03 18:50:23,986 WARNING [train.py:1204] (2/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_138-154812-0_sp1.1 from training. Duration: 0.7090625 2023-10-03 18:50:24,033 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_1154-185930-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 18:50:24,050 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_2655_210_2087_1_1525688959_3897400_52-279899-0 from training. Duration: 0.69 2023-10-03 18:50:25,303 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_141-89113-0_sp1.1 from training. Duration: 0.7818125 2023-10-03 18:50:28,711 WARNING [train.py:1204] (2/4) Exclude cut with ID _55134_210_21982_1_1534590028377_3882170_297-47310-0_sp1.1 from training. Duration: 0.963625 2023-10-03 18:50:28,785 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_486-151131-0_sp1.1 from training. Duration: 0.77275 2023-10-03 18:50:31,412 WARNING [train.py:1204] (2/4) Exclude cut with ID _44778_210_2054_1_1533798127203_7443090_21-232980-0_sp1.1 from training. Duration: 0.936375 2023-10-03 18:50:32,784 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6165_210_3528_1_1527590982_4379841_338-106380-0_sp1.1 from training. Duration: 0.92725 2023-10-03 18:50:34,147 WARNING [train.py:1204] (2/4) Exclude cut with ID _55162_210_17071_1_1534408248583_3753480_355-257218-0_sp1.1 from training. Duration: 0.97275 2023-10-03 18:50:36,081 WARNING [train.py:1204] (2/4) Exclude cut with ID _2895_210_2477_1_1525956785794_4063750_289-11475-0 from training. Duration: 0.82 2023-10-03 18:50:41,750 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.bypass.scale_min, batch_count=1369940.0, ans=0.2 2023-10-03 18:50:42,951 WARNING [train.py:1204] (2/4) Exclude cut with ID _16756_210_4915_1_1531047803763_7104259_839-183431-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 18:50:44,277 WARNING [train.py:1204] (2/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_427-208901-0_sp0.9 from training. Duration: 0.7555625 2023-10-03 18:50:44,332 WARNING [train.py:1204] (2/4) Exclude cut with ID _36512_210_6987_1_1533033143998_3126069_181-306500-0 from training. Duration: 0.95 2023-10-03 18:50:44,563 INFO [scaling.py:1118] (2/4) WithLoss: name=encoder.encoders.3.encoder.layers.2.self_attn_weights, loss-sum=0.000e+00 2023-10-03 18:50:49,783 WARNING [train.py:1204] (2/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_28-70987-0_sp1.1 from training. Duration: 0.7909375 2023-10-03 18:50:53,993 WARNING [train.py:1204] (2/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_590-58996-0_sp1.1 from training. Duration: 0.936375 2023-10-03 18:50:56,764 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_48730_210_5399_1_1533885886_2212769_108-47561-0_sp1.1 from training. Duration: 0.936375 2023-10-03 18:51:02,115 WARNING [train.py:1204] (2/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_401-158239-0_sp0.9 from training. Duration: 0.788875 2023-10-03 18:51:02,128 WARNING [train.py:1204] (2/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_278-154492-0_sp1.1 from training. Duration: 0.6909375 2023-10-03 18:51:02,142 WARNING [train.py:1204] (2/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_137-116442-0 from training. Duration: 0.78 2023-10-03 18:51:03,528 WARNING [train.py:1204] (2/4) Exclude cut with ID _3541_210_2477_1_1526475408709_3263973_189-317158-0 from training. Duration: 0.9 2023-10-03 18:51:04,859 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_47896_210_20508_1_1534492518_5958868_558-314572-0 from training. Duration: 0.97 2023-10-03 18:51:04,991 INFO [scaling.py:1118] (2/4) WithLoss: name=encoder.encoders.0.layers.1.self_attn_weights, loss-sum=0.000e+00 2023-10-03 18:51:08,053 WARNING [train.py:1204] (2/4) Exclude cut with ID _66070_210_7640_1_1535441393096_7805310_290-40284-0_sp1.1 from training. Duration: 0.97275 2023-10-03 18:51:08,093 WARNING [train.py:1204] (2/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_110-224688-0_sp1.1 from training. Duration: 0.8818125 2023-10-03 18:51:08,165 WARNING [train.py:1204] (2/4) Exclude cut with ID _7719_210_3525_1_1528456153163_4296960_88-250259-0 from training. Duration: 0.84 2023-10-03 18:51:09,482 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8453_210_4211_1_1528502940_8562730_1157-114102-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 18:51:09,520 WARNING [train.py:1204] (2/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_317-175062-0_sp1.1 from training. Duration: 0.7454375 2023-10-03 18:51:09,532 WARNING [train.py:1204] (2/4) Exclude cut with ID _29857_210_20034_1_1532395886753_3798160_254-347254-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 18:51:10,903 WARNING [train.py:1204] (2/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_28-70987-0 from training. Duration: 0.87 2023-10-03 18:51:10,971 WARNING [train.py:1204] (2/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_321-335475-0 from training. Duration: 0.45 2023-10-03 18:51:12,181 INFO [train.py:1046] (2/4) Epoch 39, batch 3650, loss[loss=0.1757, simple_loss=0.2614, pruned_loss=0.04504, over 24085.00 frames. ], tot_loss[loss=0.1565, simple_loss=0.236, pruned_loss=0.03852, over 4697601.78 frames. ], batch size: 80, lr: 2.60e-03, grad_scale: 16.0 2023-10-03 18:51:14,984 WARNING [train.py:1204] (2/4) Exclude cut with ID _40901_210_5665_1_1533363789958_3856130_356-107330-0_sp1.1 from training. Duration: 0.936375 2023-10-03 18:51:16,318 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_3780_210_2087_1_1526467389_2145572_176-107140-0 from training. Duration: 0.69 2023-10-03 18:51:19,504 INFO [scaling.py:1118] (2/4) WithLoss: name=encoder.encoders.4.encoder.layers.0.self_attn_weights, loss-sum=0.000e+00 2023-10-03 18:51:20,573 WARNING [train.py:1204] (2/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_80-135200-0 from training. Duration: 0.79 2023-10-03 18:51:22,037 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_33348_210_5632_1_1533086912_3324030_85-28583-0_sp0.9 from training. Duration: 0.911125 2023-10-03 18:51:24,929 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.conv_module2.balancer2.prob, batch_count=1370140.0, ans=0.125 2023-10-03 18:51:24,973 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.feed_forward2.hidden_balancer.prob, batch_count=1370140.0, ans=0.125 2023-10-03 18:51:26,091 WARNING [train.py:1204] (2/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_29-293896-0 from training. Duration: 0.61 2023-10-03 18:51:27,385 WARNING [train.py:1204] (2/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_194-252800-0 from training. Duration: 0.97 2023-10-03 18:51:31,390 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_360-52446-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 18:51:31,391 WARNING [train.py:1204] (2/4) Exclude cut with ID _9389_210_1664_1_1529217337301_5289269_289-213417-0_sp0.9 from training. Duration: 0.988875 2023-10-03 18:51:31,448 WARNING [train.py:1204] (2/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_143-86344-0_sp0.9 from training. Duration: 0.8555625 2023-10-03 18:51:34,302 WARNING [train.py:1204] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_520-126729-0_sp0.9 from training. Duration: 0.77775 2023-10-03 18:51:34,322 WARNING [train.py:1204] (2/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_76-308663-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 18:51:34,435 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.0.feed_forward1.out_proj.dropout_p, batch_count=1370140.0, ans=0.1 2023-10-03 18:51:36,139 WARNING [train.py:1204] (2/4) Exclude cut with ID _8021_210_7994_1_1528376451358_3653069_100-140231-0 from training. Duration: 0.67 2023-10-03 18:51:36,201 WARNING [train.py:1204] (2/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_387-131538-0_sp0.9 from training. Duration: 0.9 2023-10-03 18:51:36,230 WARNING [train.py:1204] (2/4) Exclude cut with ID _6275_210_2084_1_1528110062213_7669049_365-182054-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 18:51:37,500 WARNING [train.py:1204] (2/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_345-340690-0 from training. Duration: 0.93 2023-10-03 18:51:37,589 WARNING [train.py:1204] (2/4) Exclude cut with ID _5409_210_1388_1_1527292850189_5865191_178-127970-0_sp0.9 from training. Duration: 0.6333125 2023-10-03 18:51:37,635 WARNING [train.py:1204] (2/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_352-12734-0_sp1.1 from training. Duration: 0.92725 2023-10-03 18:51:38,985 WARNING [train.py:1204] (2/4) Exclude cut with ID _65134_210_14763_1_1535335393985_3505368_121-129373-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 18:51:40,532 WARNING [train.py:1204] (2/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_557-324258-0_sp1.1 from training. Duration: 0.736375 2023-10-03 18:51:43,288 WARNING [train.py:1204] (2/4) Exclude cut with ID _48678_210_5663_1_1533988830947_3769199_306-255886-0 from training. Duration: 0.77 2023-10-03 18:51:43,375 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7544_210_6753_1_1528587000_691468_87-178599-0 from training. Duration: 0.87 2023-10-03 18:51:43,432 WARNING [train.py:1204] (2/4) Exclude cut with ID _2941_210_2577_1_1526199673150_2489759_208-236402-0_sp1.1 from training. Duration: 0.963625 2023-10-03 18:51:46,202 WARNING [train.py:1204] (2/4) Exclude cut with ID _38670_210_15379_1_1533448810659_4391059_65-169397-0 from training. Duration: 0.97 2023-10-03 18:51:47,605 WARNING [train.py:1204] (2/4) Exclude cut with ID _30304_210_6319_1_1532476827261_7582060_306-328175-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 18:51:47,621 WARNING [train.py:1204] (2/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_212-149947-0_sp1.1 from training. Duration: 0.663625 2023-10-03 18:51:53,156 WARNING [train.py:1204] (2/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_321-22821-0_sp1.1 from training. Duration: 0.8090625 2023-10-03 18:51:54,268 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.486e+02 1.929e+02 2.156e+02 2.406e+02 3.410e+02, threshold=4.311e+02, percent-clipped=0.0 2023-10-03 18:51:55,782 WARNING [train.py:1204] (2/4) Exclude cut with ID _44010_210_4525_1_1533884262964_4013620_483-69110-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 18:51:55,791 WARNING [train.py:1204] (2/4) Exclude cut with ID _14922_210_6228_1_1530707435273_3693310_38-187851-0_sp1.1 from training. Duration: 0.563625 2023-10-03 18:51:57,194 WARNING [train.py:1204] (2/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_129-337802-0_sp0.9 from training. Duration: 0.8 2023-10-03 18:51:59,131 WARNING [train.py:1204] (2/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_268-87909-0_sp1.1 from training. Duration: 0.9 2023-10-03 18:52:01,220 WARNING [train.py:1204] (2/4) Exclude cut with ID _21878_210_3525_1_1531738772002_7264330_559-184862-0_sp1.1 from training. Duration: 0.8545625 2023-10-03 18:52:03,827 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_46032_210_14656_1_1533781412_4507969_237-120766-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 18:52:05,163 WARNING [train.py:1204] (2/4) Exclude cut with ID _28427_210_4220_1_1532581240967_3061069_330-170906-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 18:52:05,175 WARNING [train.py:1204] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_401-51578-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 18:52:07,159 WARNING [train.py:1204] (2/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_209-205365-0_sp1.1 from training. Duration: 0.7181875 2023-10-03 18:52:08,524 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_23801_210_16223_1_1532066392_3294657_57-242170-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 18:52:09,895 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_127-37371-0_sp1.1 from training. Duration: 0.936375 2023-10-03 18:52:15,659 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9171W0019-578905 from training. Duration: 0.896 2023-10-03 18:52:18,409 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_24605_210_1794_1_1532047863_4149755_349-49056-0_sp1.1 from training. Duration: 0.92725 2023-10-03 18:52:19,649 WARNING [train.py:1204] (2/4) Exclude cut with ID _12302_210_5219_1_1529924446466_8107280_897-21237-0_sp1.1 from training. Duration: 0.936375 2023-10-03 18:52:21,030 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_9460_210_4211_1_1528954587_10049667_480-260269-0_sp0.9 from training. Duration: 0.62225 2023-10-03 18:52:21,077 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_348-152548-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 18:52:22,434 WARNING [train.py:1204] (2/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_301-330455-0_sp0.9 from training. Duration: 0.888875 2023-10-03 18:52:23,836 WARNING [train.py:1204] (2/4) Exclude cut with ID _61008_210_10633_1_1534852769198_6974370_345-91705-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 18:52:25,215 INFO [train.py:1046] (2/4) Epoch 39, batch 3700, loss[loss=0.1627, simple_loss=0.2341, pruned_loss=0.04563, over 23745.00 frames. ], tot_loss[loss=0.1573, simple_loss=0.2369, pruned_loss=0.03891, over 4700803.49 frames. ], batch size: 164, lr: 2.60e-03, grad_scale: 8.0 2023-10-03 18:52:25,332 WARNING [train.py:1204] (2/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_304-273433-0 from training. Duration: 0.99 2023-10-03 18:52:25,338 WARNING [train.py:1204] (2/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_359-345972-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 18:52:28,024 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_2296_210_2087_1_1525503448_3397335_57-185430-0_sp1.1 from training. Duration: 0.5454375 2023-10-03 18:52:29,944 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_1256-120974-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 18:52:30,008 WARNING [train.py:1204] (2/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_372-30302-0_sp0.9 from training. Duration: 0.9555625 2023-10-03 18:52:33,280 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_42012_210_7927_1_1533439936_3871556_451-344262-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 18:52:33,280 WARNING [train.py:1204] (2/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_28-324729-0 from training. Duration: 0.94 2023-10-03 18:52:33,296 WARNING [train.py:1204] (2/4) Exclude cut with ID _64135_210_2253_1_1535677267876_7608972_313-18394-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 18:52:35,014 WARNING [train.py:1204] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_620-276793-0_sp0.9 from training. Duration: 0.6555625 2023-10-03 18:52:35,051 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7544_210_6753_1_1528591327_1471426_172-348777-0_sp0.9 from training. Duration: 0.7333125 2023-10-03 18:52:39,619 WARNING [train.py:1204] (2/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_332-221057-0_sp0.9 from training. Duration: 0.7555625 2023-10-03 18:52:42,377 WARNING [train.py:1204] (2/4) Exclude cut with ID _19023_210_4353_1_1531378892552_5820780_5-300035-0_sp1.1 from training. Duration: 0.92725 2023-10-03 18:52:42,429 WARNING [train.py:1204] (2/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_343-309023-0_sp1.1 from training. Duration: 0.963625 2023-10-03 18:52:45,116 WARNING [train.py:1204] (2/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_37-41919-0_sp1.1 from training. Duration: 0.7909375 2023-10-03 18:52:45,157 WARNING [train.py:1204] (2/4) Exclude cut with ID _3541_210_2477_1_1526475408709_3263973_41-182910-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 18:52:46,499 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_229-45300-0_sp0.9 from training. Duration: 0.6333125 2023-10-03 18:52:47,940 WARNING [train.py:1204] (2/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_40-214716-0_sp1.1 from training. Duration: 0.963625 2023-10-03 18:52:49,300 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9309W0203-640330 from training. Duration: 0.896 2023-10-03 18:52:55,031 WARNING [train.py:1204] (2/4) Exclude cut with ID _60229_210_27767_1_1534838406158_3655350_264-279573-0_sp1.1 from training. Duration: 0.7545625 2023-10-03 18:52:55,064 WARNING [train.py:1204] (2/4) Exclude cut with ID _8394_210_6702_1_1528590504316_3540189_372-198878-0_sp0.9 from training. Duration: 0.6666875 2023-10-03 18:52:56,506 WARNING [train.py:1204] (2/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_359-118926-0_sp1.1 from training. Duration: 0.6545625 2023-10-03 18:52:56,517 WARNING [train.py:1204] (2/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_85-65056-0 from training. Duration: 0.96 2023-10-03 18:52:56,543 WARNING [train.py:1204] (2/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_89-242335-0_sp1.1 from training. Duration: 0.863625 2023-10-03 18:52:59,428 WARNING [train.py:1204] (2/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_128-49444-0_sp1.1 from training. Duration: 0.936375 2023-10-03 18:53:01,349 WARNING [train.py:1204] (2/4) Exclude cut with ID _30327_210_16049_1_1532484161295_3924169_401-70819-0 from training. Duration: 0.86 2023-10-03 18:53:01,419 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_41822_210_13270_1_1533955976_4379284_260-256391-0_sp1.1 from training. Duration: 0.936375 2023-10-03 18:53:04,563 WARNING [train.py:1204] (2/4) Exclude cut with ID _67335_210_5219_1_1535525975652_4275229_599-247317-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 18:53:08,316 WARNING [train.py:1204] (2/4) Exclude cut with ID _29857_210_20034_1_1532395886753_3798160_209-239267-0_sp1.1 from training. Duration: 0.936375 2023-10-03 18:53:09,607 WARNING [train.py:1204] (2/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_46-207599-0_sp1.1 from training. Duration: 0.5818125 2023-10-03 18:53:11,157 WARNING [train.py:1204] (2/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_912-282412-0_sp1.1 from training. Duration: 0.4454375 2023-10-03 18:53:13,960 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_4183_210_2087_1_1526729100_1869838_141-166600-0_sp1.1 from training. Duration: 0.863625 2023-10-03 18:53:15,355 WARNING [train.py:1204] (2/4) Exclude cut with ID _15037_210_2253_1_1530752294104_3267650_296-192597-0 from training. Duration: 0.81 2023-10-03 18:53:15,409 WARNING [train.py:1204] (2/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_571-220562-0_sp1.1 from training. Duration: 0.963625 2023-10-03 18:53:15,432 WARNING [train.py:1204] (2/4) Exclude cut with ID _5287_210_2084_1_1527418883040_3878670_12-221186-0 from training. Duration: 0.98 2023-10-03 18:53:18,373 WARNING [train.py:1204] (2/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_149-85602-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 18:53:19,698 WARNING [train.py:1204] (2/4) Exclude cut with ID _9235_210_5219_1_1528891154253_8046120_1003-101638-0_sp1.1 from training. Duration: 0.663625 2023-10-03 18:53:22,416 WARNING [train.py:1204] (2/4) Exclude cut with ID _55844_210_2346_1_1534471212569_7625707_1099-33689-0_sp1.1 from training. Duration: 0.92725 2023-10-03 18:53:24,006 WARNING [train.py:1204] (2/4) Exclude cut with ID _7606_210_4915_1_1528894253277_5080690_301-10732-0 from training. Duration: 0.81 2023-10-03 18:53:24,138 WARNING [train.py:1204] (2/4) Exclude cut with ID _22826_210_9994_1_1531908201522_3279169_403-34698-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 18:53:24,144 WARNING [train.py:1204] (2/4) Exclude cut with ID _8480_210_1874_1_1528589391205_6728330_755-112625-0_sp0.9 from training. Duration: 0.82225 2023-10-03 18:53:24,162 WARNING [train.py:1204] (2/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_527-238990-0_sp1.1 from training. Duration: 0.8181875 2023-10-03 18:53:25,504 WARNING [train.py:1204] (2/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_426-7328-0_sp1.1 from training. Duration: 0.92725 2023-10-03 18:53:28,362 WARNING [train.py:1204] (2/4) Exclude cut with ID _3541_210_2477_1_1526475408709_3263973_189-317158-0_sp1.1 from training. Duration: 0.8181875 2023-10-03 18:53:29,705 WARNING [train.py:1204] (2/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_162-103620-0 from training. Duration: 0.93 2023-10-03 18:53:31,551 WARNING [train.py:1204] (2/4) Exclude cut with ID _22083_210_8254_1_1531702862439_3678970_49-234546-0 from training. Duration: 0.83 2023-10-03 18:53:33,373 WARNING [train.py:1204] (2/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_370-331600-0_sp1.1 from training. Duration: 0.8545625 2023-10-03 18:53:33,385 WARNING [train.py:1204] (2/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_166-59042-0_sp1.1 from training. Duration: 0.97275 2023-10-03 18:53:35,219 WARNING [train.py:1204] (2/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_138-332773-0_sp1.1 from training. Duration: 0.736375 2023-10-03 18:53:35,290 WARNING [train.py:1204] (2/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_90-30673-0_sp0.9 from training. Duration: 0.9333125 2023-10-03 18:53:39,218 WARNING [train.py:1204] (2/4) Exclude cut with ID _27967_210_12140_1_1532588391859_8419130_737-41898-0_sp1.1 from training. Duration: 0.936375 2023-10-03 18:53:40,450 INFO [train.py:1046] (2/4) Epoch 39, batch 3750, loss[loss=0.1632, simple_loss=0.2371, pruned_loss=0.04471, over 23747.00 frames. ], tot_loss[loss=0.1586, simple_loss=0.2384, pruned_loss=0.0394, over 4709156.49 frames. ], batch size: 164, lr: 2.60e-03, grad_scale: 8.0 2023-10-03 18:53:40,561 WARNING [train.py:1204] (2/4) Exclude cut with ID _8833_210_5219_1_1528592665210_7553059_329-125156-0_sp1.1 from training. Duration: 0.7454375 2023-10-03 18:53:42,535 WARNING [train.py:1204] (2/4) Exclude cut with ID _41817_210_18835_1_1533434477160_7423049_352-42537-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 18:53:43,907 WARNING [train.py:1204] (2/4) Exclude cut with ID _8365_210_2064_1_1528592311070_4677690_431-294102-0 from training. Duration: 0.89 2023-10-03 18:53:44,253 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.conv_skip_rate, batch_count=1370740.0, ans=0.0 2023-10-03 18:53:45,369 WARNING [train.py:1204] (2/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_454-314901-0_sp1.1 from training. Duration: 0.4181875 2023-10-03 18:53:48,137 WARNING [train.py:1204] (2/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_80-135200-0_sp0.9 from training. Duration: 0.87775 2023-10-03 18:53:48,194 WARNING [train.py:1204] (2/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_181-78632-0 from training. Duration: 0.78 2023-10-03 18:53:49,573 WARNING [train.py:1204] (2/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_300-27235-0_sp0.9 from training. Duration: 0.9666875 2023-10-03 18:53:50,910 WARNING [train.py:1204] (2/4) Exclude cut with ID _47790_210_16187_1_1533862814066_4207519_96-177176-0_sp1.1 from training. Duration: 0.97275 2023-10-03 18:53:52,390 WARNING [train.py:1204] (2/4) Exclude cut with ID _35941_210_15379_1_1532995294515_4023089_246-348742-0_sp1.1 from training. Duration: 0.97275 2023-10-03 18:53:53,840 WARNING [train.py:1204] (2/4) Exclude cut with ID _38670_210_15379_1_1533448810659_4391059_65-169397-0_sp1.1 from training. Duration: 0.8818125 2023-10-03 18:53:57,932 WARNING [train.py:1204] (2/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_1003-99642-0_sp1.1 from training. Duration: 0.963625 2023-10-03 18:54:02,071 WARNING [train.py:1204] (2/4) Exclude cut with ID _40402_210_18219_1_1533522324115_2953150_320-14078-0_sp1.1 from training. Duration: 0.863625 2023-10-03 18:54:02,169 WARNING [train.py:1204] (2/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_367-61330-0_sp0.9 from training. Duration: 0.8333125 2023-10-03 18:54:04,217 WARNING [train.py:1204] (2/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_515-298514-0_sp1.1 from training. Duration: 0.92725 2023-10-03 18:54:07,430 WARNING [train.py:1204] (2/4) Exclude cut with ID _53731_210_7994_1_1534553979244_3757214_471-320815-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 18:54:08,705 WARNING [train.py:1204] (2/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_306-232480-0 from training. Duration: 0.96 2023-10-03 18:54:10,481 WARNING [train.py:1204] (2/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_478-162247-0_sp0.9 from training. Duration: 0.911125 2023-10-03 18:54:11,856 WARNING [train.py:1204] (2/4) Exclude cut with ID _56423_210_5854_1_1534896061049_3165969_345-284696-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 18:54:11,884 WARNING [train.py:1204] (2/4) Exclude cut with ID _44778_210_2054_1_1533798127203_7443090_600-122253-0_sp1.1 from training. Duration: 0.963625 2023-10-03 18:54:12,112 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.ff3_skip_rate, batch_count=1370873.3333333333, ans=0.0 2023-10-03 18:54:14,602 WARNING [train.py:1204] (2/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_199-295611-0 from training. Duration: 0.55 2023-10-03 18:54:17,989 WARNING [train.py:1204] (2/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_734-129761-0 from training. Duration: 0.82 2023-10-03 18:54:19,331 WARNING [train.py:1204] (2/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_349-169257-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 18:54:20,695 WARNING [train.py:1204] (2/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_202-67017-0_sp0.9 from training. Duration: 0.911125 2023-10-03 18:54:22,116 WARNING [train.py:1204] (2/4) Exclude cut with ID _16908_210_5724_1_1531119157248_3772700_61-258978-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 18:54:23,402 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.605e+02 2.003e+02 2.174e+02 2.569e+02 4.236e+02, threshold=4.347e+02, percent-clipped=0.0 2023-10-03 18:54:27,666 WARNING [train.py:1204] (2/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_172-156931-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 18:54:29,128 WARNING [train.py:1204] (2/4) Exclude cut with ID _4092_210_2151_1_1526643044449_3135759_356-278694-0_sp1.1 from training. Duration: 0.42725 2023-10-03 18:54:31,899 WARNING [train.py:1204] (2/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_116-332008-0 from training. Duration: 0.98 2023-10-03 18:54:33,901 WARNING [train.py:1204] (2/4) Exclude cut with ID _29003_210_19596_1_1532507391720_3894098_358-114789-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 18:54:37,066 WARNING [train.py:1204] (2/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_152-212533-0_sp1.1 from training. Duration: 0.8818125 2023-10-03 18:54:38,832 WARNING [train.py:1204] (2/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_267-214116-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 18:54:41,632 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7543_210_6753_1_1528541907_1399322_165-323619-0_sp0.9 from training. Duration: 0.7666875 2023-10-03 18:54:44,388 WARNING [train.py:1204] (2/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_577-332600-0_sp0.9 from training. Duration: 0.6333125 2023-10-03 18:54:47,514 WARNING [train.py:1204] (2/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_342-100157-0_sp0.9 from training. Duration: 0.6 2023-10-03 18:54:49,106 WARNING [train.py:1204] (2/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_141-89727-0_sp1.1 from training. Duration: 0.6454375 2023-10-03 18:54:49,216 WARNING [train.py:1204] (2/4) Exclude cut with ID _3992_210_1747_1_1526644816031_3731060_107-251434-0_sp1.1 from training. Duration: 0.9 2023-10-03 18:54:50,626 WARNING [train.py:1204] (2/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_401-53423-0_sp1.1 from training. Duration: 0.5 2023-10-03 18:54:54,789 INFO [train.py:1046] (2/4) Epoch 39, batch 3800, loss[loss=0.1419, simple_loss=0.2259, pruned_loss=0.02898, over 24319.00 frames. ], tot_loss[loss=0.1582, simple_loss=0.238, pruned_loss=0.0392, over 4719416.62 frames. ], batch size: 61, lr: 2.60e-03, grad_scale: 8.0 2023-10-03 18:54:58,957 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7762_210_1794_1_1528199508_4057408_422-227034-0_sp1.1 from training. Duration: 0.663625 2023-10-03 18:55:02,515 WARNING [train.py:1204] (2/4) Exclude cut with ID _4184_210_2550_1_1526730192013_2219380_102-250299-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 18:55:03,853 WARNING [train.py:1204] (2/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_253-308377-0_sp0.9 from training. Duration: 0.5444375 2023-10-03 18:55:03,925 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_271-297505-0 from training. Duration: 0.99 2023-10-03 18:55:05,277 WARNING [train.py:1204] (2/4) Exclude cut with ID _35941_210_15379_1_1532995294515_4023089_213-142973-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 18:55:07,282 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7544_210_6753_1_1528587722_3576686_162-63018-0_sp1.1 from training. Duration: 0.92725 2023-10-03 18:55:08,632 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_365-217119-0_sp0.9 from training. Duration: 0.7 2023-10-03 18:55:11,454 WARNING [train.py:1204] (2/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_662-275621-0_sp0.9 from training. Duration: 0.4666875 2023-10-03 18:55:11,455 WARNING [train.py:1204] (2/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_374-187555-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 18:55:11,527 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_289-180904-0_sp1.1 from training. Duration: 0.6909375 2023-10-03 18:55:12,896 WARNING [train.py:1204] (2/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_189-47206-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 18:55:12,924 WARNING [train.py:1204] (2/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_234-14174-0_sp1.1 from training. Duration: 0.7454375 2023-10-03 18:55:13,085 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.conv_module2.balancer1.prob, batch_count=1371140.0, ans=0.125 2023-10-03 18:55:14,275 WARNING [train.py:1204] (2/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_576-76621-0_sp1.1 from training. Duration: 0.963625 2023-10-03 18:55:14,353 WARNING [train.py:1204] (2/4) Exclude cut with ID _13253_210_1925_1_1530757775819_3577873_253-246386-0 from training. Duration: 0.9 2023-10-03 18:55:19,114 WARNING [train.py:1204] (2/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_372-6869-0_sp1.1 from training. Duration: 0.4 2023-10-03 18:55:19,164 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_21434_210_4238_1_1531916339_1222252_22-54954-0_sp1.1 from training. Duration: 0.936375 2023-10-03 18:55:20,591 WARNING [train.py:1204] (2/4) Exclude cut with ID _29853_210_7497_1_1532505600851_6642110_492-170361-0_sp1.1 from training. Duration: 0.92725 2023-10-03 18:55:23,246 WARNING [train.py:1204] (2/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_505-97987-0_sp0.9 from training. Duration: 0.9555625 2023-10-03 18:55:23,291 WARNING [train.py:1204] (2/4) Exclude cut with ID _8365_210_2064_1_1528592311070_4677690_371-122295-0_sp0.9 from training. Duration: 0.611125 2023-10-03 18:55:25,947 WARNING [train.py:1204] (2/4) Exclude cut with ID _14268_210_9206_1_1530622771285_4714099_637-86630-0_sp1.1 from training. Duration: 0.463625 2023-10-03 18:55:25,967 WARNING [train.py:1204] (2/4) Exclude cut with ID _29857_210_20034_1_1532395886753_3798160_372-5678-0_sp1.1 from training. Duration: 0.963625 2023-10-03 18:55:27,372 WARNING [train.py:1204] (2/4) Exclude cut with ID _52892_210_6721_1_1534378941259_4124129_90-165108-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 18:55:30,552 WARNING [train.py:1204] (2/4) Exclude cut with ID _27052_210_5986_1_1532329197187_3585009_143-339198-0_sp1.1 from training. Duration: 0.963625 2023-10-03 18:55:33,354 WARNING [train.py:1204] (2/4) Exclude cut with ID _14268_210_9206_1_1530622771285_4714099_637-86630-0_sp0.9 from training. Duration: 0.5666875 2023-10-03 18:55:33,356 WARNING [train.py:1204] (2/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_401-255491-0 from training. Duration: 0.7 2023-10-03 18:55:36,632 WARNING [train.py:1204] (2/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_178-6946-0_sp1.1 from training. Duration: 0.97275 2023-10-03 18:55:42,207 WARNING [train.py:1204] (2/4) Exclude cut with ID _27485_210_4916_1_1532739191677_3668269_460-163885-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 18:55:49,642 WARNING [train.py:1204] (2/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_270-26053-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 18:55:51,152 WARNING [train.py:1204] (2/4) Exclude cut with ID _14170_210_5219_1_1530525949868_4469650_412-317077-0 from training. Duration: 0.75 2023-10-03 18:55:53,903 WARNING [train.py:1204] (2/4) Exclude cut with ID _5289_210_2084_1_1527678202736_3779299_142-103317-0 from training. Duration: 0.84 2023-10-03 18:55:53,948 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_150-143438-0_sp1.1 from training. Duration: 0.92725 2023-10-03 18:55:54,230 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.ff2_skip_rate, batch_count=1371340.0, ans=0.0 2023-10-03 18:55:55,486 WARNING [train.py:1204] (2/4) Exclude cut with ID _27894_210_12392_1_1532415496078_6902439_668-269779-0_sp1.1 from training. Duration: 0.97275 2023-10-03 18:55:56,799 WARNING [train.py:1204] (2/4) Exclude cut with ID _18513_210_9206_1_1531618215223_3862620_56-222075-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 18:55:58,143 WARNING [train.py:1204] (2/4) Exclude cut with ID _4092_210_2151_1_1526643044449_3135759_356-278694-0 from training. Duration: 0.47 2023-10-03 18:56:01,057 WARNING [train.py:1204] (2/4) Exclude cut with ID _25808_210_13292_1_1532343514562_7366109_633-47524-0 from training. Duration: 0.99 2023-10-03 18:56:02,874 WARNING [train.py:1204] (2/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_872-264525-0 from training. Duration: 0.65 2023-10-03 18:56:02,911 WARNING [train.py:1204] (2/4) Exclude cut with ID _14632_210_9988_1_1530691279392_3923200_239-232545-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 18:56:04,272 WARNING [train.py:1204] (2/4) Exclude cut with ID _14349_210_8341_1_1530615710818_3442470_153-27432-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 18:56:09,366 INFO [train.py:1046] (2/4) Epoch 39, batch 3850, loss[loss=0.1416, simple_loss=0.224, pruned_loss=0.02957, over 24588.00 frames. ], tot_loss[loss=0.1573, simple_loss=0.2368, pruned_loss=0.03884, over 4728048.53 frames. ], batch size: 60, lr: 2.60e-03, grad_scale: 8.0 2023-10-03 18:56:09,448 WARNING [train.py:1204] (2/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_37-41919-0_sp0.9 from training. Duration: 0.9666875 2023-10-03 18:56:10,815 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_196-289637-0_sp0.9 from training. Duration: 0.8333125 2023-10-03 18:56:16,159 WARNING [train.py:1204] (2/4) Exclude cut with ID _13678_210_7497_1_1530530069226_3463170_371-3221-0_sp1.1 from training. Duration: 0.8181875 2023-10-03 18:56:17,447 WARNING [train.py:1204] (2/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_557-324258-0 from training. Duration: 0.81 2023-10-03 18:56:17,529 WARNING [train.py:1204] (2/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_1012-279473-0_sp1.1 from training. Duration: 0.6090625 2023-10-03 18:56:19,453 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_66916_210_14656_1_1535725525_326566_7-92649-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 18:56:22,272 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_9460_210_4211_1_1528954587_10049667_480-260269-0_sp1.1 from training. Duration: 0.5090625 2023-10-03 18:56:23,840 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_40896_210_15710_1_1533433956_7100802_121-155001-0_sp1.1 from training. Duration: 0.92725 2023-10-03 18:56:26,406 WARNING [train.py:1204] (2/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_188-171673-0_sp1.1 from training. Duration: 0.52725 2023-10-03 18:56:27,744 WARNING [train.py:1204] (2/4) Exclude cut with ID _47790_210_16187_1_1533862814066_4207519_2-39653-0 from training. Duration: 0.98 2023-10-03 18:56:30,826 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_23850_210_4238_1_1532232597_1632433_112-229012-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 18:56:34,156 WARNING [train.py:1204] (2/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_626-79528-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 18:56:35,582 WARNING [train.py:1204] (2/4) Exclude cut with ID _30304_210_6319_1_1532476827261_7582060_856-90052-0_sp1.1 from training. Duration: 0.963625 2023-10-03 18:56:35,621 WARNING [train.py:1204] (2/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_122-109023-0_sp0.9 from training. Duration: 0.8444375 2023-10-03 18:56:39,402 WARNING [train.py:1204] (2/4) Exclude cut with ID _64414_210_17301_1_1535243252048_3757759_37-70328-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 18:56:40,728 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_20120_210_6753_1_1531643035_5482859_622-344907-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 18:56:40,794 WARNING [train.py:1204] (2/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_410-321956-0_sp1.1 from training. Duration: 0.92725 2023-10-03 18:56:40,806 WARNING [train.py:1204] (2/4) Exclude cut with ID _7562_210_2084_1_1528887767916_3946229_186-326422-0_sp0.9 from training. Duration: 0.9333125 2023-10-03 18:56:42,219 WARNING [train.py:1204] (2/4) Exclude cut with ID _37537_210_2253_1_1533171758568_3421949_127-285192-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 18:56:42,364 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.conv_module1.balancer1.prob, batch_count=1371540.0, ans=0.125 2023-10-03 18:56:43,529 WARNING [train.py:1204] (2/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_723-228168-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 18:56:44,927 WARNING [train.py:1204] (2/4) Exclude cut with ID _13678_210_7497_1_1530530069226_3463170_426-246984-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 18:56:44,941 WARNING [train.py:1204] (2/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_399-156791-0_sp0.9 from training. Duration: 0.92225 2023-10-03 18:56:46,707 WARNING [train.py:1204] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_391-22148-0 from training. Duration: 0.77 2023-10-03 18:56:46,730 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_3475_210_2028_1_1526193578_3622569_64-297072-0 from training. Duration: 0.91 2023-10-03 18:56:46,803 WARNING [train.py:1204] (2/4) Exclude cut with ID _17875_210_2087_1_1531224012907_1821339_225-280015-0_sp1.1 from training. Duration: 0.963625 2023-10-03 18:56:48,121 WARNING [train.py:1204] (2/4) Exclude cut with ID _50655_210_18628_1_1534229942193_3772154_325-246169-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 18:56:49,608 WARNING [train.py:1204] (2/4) Exclude cut with ID _29529_210_7234_1_1532393961455_3977460_597-257348-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 18:56:50,963 WARNING [train.py:1204] (2/4) Exclude cut with ID _58895_210_8750_1_1535184081320_3649089_77-173485-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 18:56:50,982 WARNING [train.py:1204] (2/4) Exclude cut with ID _6270_210_2084_1_1527850911573_3856420_26-277343-0 from training. Duration: 0.82 2023-10-03 18:56:52,337 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.416e+02 1.999e+02 2.176e+02 2.496e+02 3.894e+02, threshold=4.352e+02, percent-clipped=0.0 2023-10-03 18:56:52,477 WARNING [train.py:1204] (2/4) Exclude cut with ID _14368_210_4220_1_1530532714870_3595529_144-3287-0 from training. Duration: 0.67 2023-10-03 18:56:52,634 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.feed_forward3.hidden_balancer.prob, batch_count=1371606.6666666667, ans=0.125 2023-10-03 18:56:54,026 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.ff3_skip_rate, batch_count=1371606.6666666667, ans=0.0 2023-10-03 18:56:55,235 WARNING [train.py:1204] (2/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_293-145278-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 18:56:56,593 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_40047_210_2290_1_1533290519_2374311_257-335162-0 from training. Duration: 0.95 2023-10-03 18:56:56,728 WARNING [train.py:1204] (2/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_619-344499-0_sp0.9 from training. Duration: 0.7 2023-10-03 18:56:59,609 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.bypass_mid.scale_min, batch_count=1371606.6666666667, ans=0.2 2023-10-03 18:57:00,804 WARNING [train.py:1204] (2/4) Exclude cut with ID _19504_210_9994_1_1531648863242_3325220_456-315379-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 18:57:01,058 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.ff2_skip_rate, batch_count=1371606.6666666667, ans=0.0 2023-10-03 18:57:02,169 WARNING [train.py:1204] (2/4) Exclude cut with ID _55857_210_2346_1_1534644007831_6898209_631-221692-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 18:57:06,044 WARNING [train.py:1204] (2/4) Exclude cut with ID _64631_210_27767_1_1535349580068_4165160_324-3682-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 18:57:07,378 WARNING [train.py:1204] (2/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_317-211107-0 from training. Duration: 0.92 2023-10-03 18:57:10,162 WARNING [train.py:1204] (2/4) Exclude cut with ID _6789_210_1534_1_1527857937218_3118950_311-61262-0 from training. Duration: 0.65 2023-10-03 18:57:12,830 WARNING [train.py:1204] (2/4) Exclude cut with ID _28323_210_16187_1_1532480395667_4100300_365-68882-0_sp1.1 from training. Duration: 0.92725 2023-10-03 18:57:12,861 WARNING [train.py:1204] (2/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_816-181825-0_sp1.1 from training. Duration: 0.92725 2023-10-03 18:57:15,600 WARNING [train.py:1204] (2/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_132-269633-0_sp1.1 from training. Duration: 0.6181875 2023-10-03 18:57:15,620 WARNING [train.py:1204] (2/4) Exclude cut with ID _44585_210_18628_1_1533605350387_4518199_280-73534-0_sp1.1 from training. Duration: 0.8090625 2023-10-03 18:57:15,667 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_68095_210_20185_1_1535718226_8888855_1009-164755-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 18:57:17,614 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_40223_210_6228_1_1533298404_4812267_400-103813-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 18:57:17,615 WARNING [train.py:1204] (2/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_491-261923-0_sp1.1 from training. Duration: 0.8909375 2023-10-03 18:57:17,629 WARNING [train.py:1204] (2/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_712-342563-0 from training. Duration: 0.97 2023-10-03 18:57:19,054 WARNING [train.py:1204] (2/4) Exclude cut with ID _41161_210_20034_1_1533431249657_3555314_175-190032-0_sp1.1 from training. Duration: 0.963625 2023-10-03 18:57:20,490 WARNING [train.py:1204] (2/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_788-298907-0 from training. Duration: 0.89 2023-10-03 18:57:20,515 WARNING [train.py:1204] (2/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_225-70961-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 18:57:20,521 WARNING [train.py:1204] (2/4) Exclude cut with ID _58583_210_5236_1_1534726835266_3574249_235-175994-0_sp1.1 from training. Duration: 0.92725 2023-10-03 18:57:23,169 INFO [train.py:1046] (2/4) Epoch 39, batch 3900, loss[loss=0.1416, simple_loss=0.2269, pruned_loss=0.02822, over 24606.00 frames. ], tot_loss[loss=0.1562, simple_loss=0.2354, pruned_loss=0.03855, over 4721758.31 frames. ], batch size: 60, lr: 2.60e-03, grad_scale: 8.0 2023-10-03 18:57:23,285 WARNING [train.py:1204] (2/4) Exclude cut with ID _12302_210_5219_1_1529924446466_8107280_907-182949-0_sp1.1 from training. Duration: 0.836375 2023-10-03 18:57:24,771 WARNING [train.py:1204] (2/4) Exclude cut with ID _54871_210_22356_1_1534665798253_3856359_94-193692-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 18:57:26,113 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_4528_210_3273_1_1527076654_3276777_342-168068-0_sp0.9 from training. Duration: 0.9666875 2023-10-03 18:57:26,162 WARNING [train.py:1204] (2/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_368-74239-0_sp1.1 from training. Duration: 0.92725 2023-10-03 18:57:26,164 WARNING [train.py:1204] (2/4) Exclude cut with ID _44831_210_13427_1_1533816011549_3689035_291-295928-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 18:57:27,551 WARNING [train.py:1204] (2/4) Exclude cut with ID _56406_210_9288_1_1534899618664_3496149_455-131997-0_sp1.1 from training. Duration: 0.97275 2023-10-03 18:57:27,565 WARNING [train.py:1204] (2/4) Exclude cut with ID _6473_210_1741_1_1527901307396_3815380_108-17335-0 from training. Duration: 0.65 2023-10-03 18:57:27,609 WARNING [train.py:1204] (2/4) Exclude cut with ID _14224_210_4381_1_1530529062037_4264010_286-135600-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 18:57:31,916 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_14056_210_4163_1_1530604483_7572827_928-321675-0_sp1.1 from training. Duration: 0.936375 2023-10-03 18:57:31,989 WARNING [train.py:1204] (2/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_81-300212-0_sp1.1 from training. Duration: 0.5454375 2023-10-03 18:57:33,311 WARNING [train.py:1204] (2/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_29-280622-0_sp0.9 from training. Duration: 0.911125 2023-10-03 18:57:35,083 WARNING [train.py:1204] (2/4) Exclude cut with ID _61079_210_4525_1_1534921008198_4011419_228-176447-0_sp1.1 from training. Duration: 0.936375 2023-10-03 18:57:37,072 WARNING [train.py:1204] (2/4) Exclude cut with ID _8394_210_6702_1_1528590504316_3540189_372-198878-0_sp1.1 from training. Duration: 0.5454375 2023-10-03 18:57:37,090 WARNING [train.py:1204] (2/4) Exclude cut with ID _64521_210_14699_1_1535243427259_3815328_244-208725-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 18:57:40,188 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8275_210_4198_1_1528455320_3892571_491-17707-0_sp0.9 from training. Duration: 0.711125 2023-10-03 18:57:40,299 WARNING [train.py:1204] (2/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_375-245677-0 from training. Duration: 0.81 2023-10-03 18:57:40,306 WARNING [train.py:1204] (2/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_690-286055-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 18:57:43,090 WARNING [train.py:1204] (2/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_89-321955-0 from training. Duration: 0.8 2023-10-03 18:57:43,141 WARNING [train.py:1204] (2/4) Exclude cut with ID _18895_210_6228_1_1531357274234_3784930_213-98721-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 18:57:44,456 WARNING [train.py:1204] (2/4) Exclude cut with ID _19708_210_9206_1_1532136650733_4238364_347-65240-0 from training. Duration: 0.97 2023-10-03 18:57:45,829 WARNING [train.py:1204] (2/4) Exclude cut with ID _64135_210_2253_1_1535677267876_7608972_498-5039-0 from training. Duration: 0.78 2023-10-03 18:57:49,924 WARNING [train.py:1204] (2/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_146-276944-0_sp1.1 from training. Duration: 0.7545625 2023-10-03 18:57:51,790 WARNING [train.py:1204] (2/4) Exclude cut with ID _63171_210_15488_1_1535104792794_3295110_155-34488-0_sp1.1 from training. Duration: 0.97275 2023-10-03 18:57:51,809 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_5438_210_1794_1_1527336687_2600009_233-58072-0_sp0.9 from training. Duration: 0.8555625 2023-10-03 18:57:53,122 WARNING [train.py:1204] (2/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_63-101996-0_sp0.9 from training. Duration: 0.97775 2023-10-03 18:57:57,295 WARNING [train.py:1204] (2/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_271-269084-0_sp1.1 from training. Duration: 0.7545625 2023-10-03 18:58:00,008 WARNING [train.py:1204] (2/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_345-167986-0_sp1.1 from training. Duration: 0.763625 2023-10-03 18:58:00,171 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.ff3_skip_rate, batch_count=1371873.3333333333, ans=0.0 2023-10-03 18:58:02,790 WARNING [train.py:1204] (2/4) Exclude cut with ID _39290_210_4220_1_1533862778760_3075118_92-311600-0_sp1.1 from training. Duration: 0.8 2023-10-03 18:58:02,796 WARNING [train.py:1204] (2/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_209-65306-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 18:58:04,143 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_26433_210_12108_1_1532157158_4056480_317-335807-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 18:58:07,317 INFO [scaling.py:1118] (2/4) WithLoss: name=encoder.encoders.5.encoder.layers.1.self_attn_weights, loss-sum=0.000e+00 2023-10-03 18:58:10,673 WARNING [train.py:1204] (2/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_238-94127-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 18:58:11,963 WARNING [train.py:1204] (2/4) Exclude cut with ID _41314_210_12651_1_1533459476543_3766460_207-132038-0_sp1.1 from training. Duration: 0.8545625 2023-10-03 18:58:15,603 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.balancer_ff2.min_abs, batch_count=1371940.0, ans=0.1 2023-10-03 18:58:18,297 WARNING [train.py:1204] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_167-137255-0_sp1.1 from training. Duration: 0.5818125 2023-10-03 18:58:18,581 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.nonlin_attention.balancer.prob, batch_count=1371940.0, ans=0.125 2023-10-03 18:58:19,607 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_2009_210_2071_1_1525481134_4588937_385-302100-0_sp0.9 from training. Duration: 0.9666875 2023-10-03 18:58:27,304 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_15517_210_6753_1_1530954008_3487336_253-110617-0_sp1.1 from training. Duration: 0.936375 2023-10-03 18:58:27,457 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.feed_forward1.hidden_balancer.prob, batch_count=1372006.6666666667, ans=0.125 2023-10-03 18:58:30,055 WARNING [train.py:1204] (2/4) Exclude cut with ID _39290_210_4220_1_1533862778760_3075118_92-311600-0_sp0.9 from training. Duration: 0.97775 2023-10-03 18:58:31,502 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_42-142473-0 from training. Duration: 0.72 2023-10-03 18:58:31,540 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_2296_210_2087_1_1525503448_3397335_57-185430-0 from training. Duration: 0.6 2023-10-03 18:58:31,552 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_11305_210_1794_1_1529750428_7178148_257-312870-0_sp0.9 from training. Duration: 0.97775 2023-10-03 18:58:33,171 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.conv_module1.balancer1.prob, batch_count=1372006.6666666667, ans=0.125 2023-10-03 18:58:34,135 WARNING [train.py:1204] (2/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_222-198127-0 from training. Duration: 0.84 2023-10-03 18:58:34,226 WARNING [train.py:1204] (2/4) Exclude cut with ID _65263_210_22356_1_1535763598691_3921037_423-202699-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 18:58:35,595 WARNING [train.py:1204] (2/4) Exclude cut with ID _15037_210_2253_1_1530752294104_3267650_13-130361-0 from training. Duration: 0.99 2023-10-03 18:58:37,055 INFO [train.py:1046] (2/4) Epoch 39, batch 3950, loss[loss=0.1608, simple_loss=0.2317, pruned_loss=0.04497, over 22656.00 frames. ], tot_loss[loss=0.1561, simple_loss=0.2354, pruned_loss=0.03836, over 4716565.30 frames. ], batch size: 322, lr: 2.60e-03, grad_scale: 8.0 2023-10-03 18:58:40,971 WARNING [train.py:1204] (2/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_628-198080-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 18:58:42,370 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_3171_210_2087_1_1526109084_2330007_37-64334-0 from training. Duration: 0.82 2023-10-03 18:58:42,418 WARNING [train.py:1204] (2/4) Exclude cut with ID _5287_210_2084_1_1527418883040_3878670_12-221186-0_sp1.1 from training. Duration: 0.8909375 2023-10-03 18:58:45,640 WARNING [train.py:1204] (2/4) Exclude cut with ID _6653_210_2629_1_1527919172441_6732679_1103-56445-0_sp1.1 from training. Duration: 0.663625 2023-10-03 18:58:46,939 WARNING [train.py:1204] (2/4) Exclude cut with ID _65263_210_22356_1_1535763598691_3921037_169-140068-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 18:58:51,150 WARNING [train.py:1204] (2/4) Exclude cut with ID IC0671W0118-331961 from training. Duration: 0.896 2023-10-03 18:58:51,197 WARNING [train.py:1204] (2/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_402-102311-0_sp1.1 from training. Duration: 0.6090625 2023-10-03 18:58:52,520 WARNING [train.py:1204] (2/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_202-293523-0 from training. Duration: 0.72 2023-10-03 18:58:52,579 WARNING [train.py:1204] (2/4) Exclude cut with ID IC0679W0038-335846 from training. Duration: 0.768 2023-10-03 18:58:52,601 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_37357_210_17024_1_1533089504_2316121_79-198866-0_sp1.1 from training. Duration: 0.936375 2023-10-03 18:58:54,610 WARNING [train.py:1204] (2/4) Exclude cut with ID _7606_210_4915_1_1528894253277_5080690_292-271285-0_sp1.1 from training. Duration: 0.963625 2023-10-03 18:58:54,621 WARNING [train.py:1204] (2/4) Exclude cut with ID _3541_210_2477_1_1526475408709_3263973_377-296839-0_sp0.9 from training. Duration: 0.611125 2023-10-03 18:58:54,624 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_45886_210_17024_1_1533704917_2435333_58-131069-0_sp1.1 from training. Duration: 0.963625 2023-10-03 18:58:56,141 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.1.bypass_mid.scale_min, batch_count=1372140.0, ans=0.2 2023-10-03 18:58:57,413 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6961_210_2087_1_1527937168_6815011_395-185060-0 from training. Duration: 0.95 2023-10-03 18:58:58,954 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.ff2_skip_rate, batch_count=1372140.0, ans=0.0 2023-10-03 18:59:00,105 WARNING [train.py:1204] (2/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_304-266479-0_sp1.1 from training. Duration: 0.97275 2023-10-03 18:59:01,374 WARNING [train.py:1204] (2/4) Exclude cut with ID _3867_210_1883_1_1526641200575_4475830_319-349850-0_sp1.1 from training. Duration: 0.6090625 2023-10-03 18:59:01,382 WARNING [train.py:1204] (2/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_304-59227-0_sp0.9 from training. Duration: 0.8555625 2023-10-03 18:59:02,769 WARNING [train.py:1204] (2/4) Exclude cut with ID _8021_210_7994_1_1528376451358_3653069_100-140231-0_sp0.9 from training. Duration: 0.7444375 2023-10-03 18:59:02,826 WARNING [train.py:1204] (2/4) Exclude cut with ID _8833_210_5219_1_1528592665210_7553059_329-125156-0_sp0.9 from training. Duration: 0.911125 2023-10-03 18:59:06,509 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.attention_skip_rate, batch_count=1372206.6666666667, ans=0.0 2023-10-03 18:59:12,848 WARNING [train.py:1204] (2/4) Exclude cut with ID _14459_210_2228_1_1531443641946_7516700_792-287234-0_sp1.1 from training. Duration: 0.92725 2023-10-03 18:59:12,879 WARNING [train.py:1204] (2/4) Exclude cut with ID _19708_210_9206_1_1532136650733_4238364_347-65240-0_sp1.1 from training. Duration: 0.8818125 2023-10-03 18:59:14,531 INFO [scaling.py:1118] (2/4) WithLoss: name=encoder.encoders.4.encoder.layers.2.self_attn_weights, loss-sum=0.000e+00 2023-10-03 18:59:18,568 WARNING [train.py:1204] (2/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_70-27732-0 from training. Duration: 0.94 2023-10-03 18:59:21,280 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.622e+02 1.928e+02 2.107e+02 2.479e+02 4.773e+02, threshold=4.215e+02, percent-clipped=2.0 2023-10-03 18:59:21,730 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=1372273.3333333333, ans=0.1 2023-10-03 18:59:22,959 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_397-132565-0 from training. Duration: 0.99 2023-10-03 18:59:22,962 WARNING [train.py:1204] (2/4) Exclude cut with ID _49728_210_12651_1_1534041034294_6788150_624-195301-0 from training. Duration: 0.85 2023-10-03 18:59:22,989 WARNING [train.py:1204] (2/4) Exclude cut with ID _55429_210_10626_1_1534467588527_7174143_377-169391-0_sp1.1 from training. Duration: 0.87275 2023-10-03 18:59:24,753 WARNING [train.py:1204] (2/4) Exclude cut with ID _49376_210_5986_1_1533985278732_3370740_354-344695-0_sp1.1 from training. Duration: 0.9 2023-10-03 18:59:26,306 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.nonlin_attention.balancer.prob, batch_count=1372273.3333333333, ans=0.125 2023-10-03 18:59:31,783 WARNING [train.py:1204] (2/4) Exclude cut with ID _4697_210_2069_1_1526815801953_3823098_276-96593-0_sp1.1 from training. Duration: 0.7 2023-10-03 18:59:31,792 WARNING [train.py:1204] (2/4) Exclude cut with ID _15012_210_1968_1_1530856820429_5095350_688-178849-0_sp0.9 from training. Duration: 0.711125 2023-10-03 18:59:31,850 WARNING [train.py:1204] (2/4) Exclude cut with ID _25565_210_12392_1_1532423009382_3952098_372-69023-0_sp1.1 from training. Duration: 0.936375 2023-10-03 18:59:31,876 WARNING [train.py:1204] (2/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_61-327062-0_sp1.1 from training. Duration: 0.736375 2023-10-03 18:59:32,525 WARNING [train.py:1204] (2/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_215-141059-0 from training. Duration: 0.62 2023-10-03 18:59:34,070 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.feed_forward3.hidden_balancer.prob, batch_count=1372273.3333333333, ans=0.125 2023-10-03 18:59:37,350 WARNING [train.py:1204] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_6-226750-0_sp1.1 from training. Duration: 0.8545625 2023-10-03 18:59:38,796 WARNING [train.py:1204] (2/4) Exclude cut with ID _14348_210_8341_1_1530599434996_3778640_240-232370-0_sp1.1 from training. Duration: 0.763625 2023-10-03 18:59:43,318 WARNING [train.py:1204] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_468-189058-0 from training. Duration: 0.96 2023-10-03 18:59:52,975 INFO [train.py:1046] (2/4) Epoch 39, batch 4000, loss[loss=0.1521, simple_loss=0.2283, pruned_loss=0.03793, over 23308.00 frames. ], tot_loss[loss=0.1566, simple_loss=0.2358, pruned_loss=0.03874, over 4702658.25 frames. ], batch size: 119, lr: 2.60e-03, grad_scale: 16.0 2023-10-03 18:59:53,097 WARNING [train.py:1204] (2/4) Exclude cut with ID _21987_210_5854_1_1531821500482_3637038_247-93340-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 19:00:01,886 WARNING [train.py:1204] (2/4) Exclude cut with ID _66868_210_9527_1_1535509287954_4561140_436-191602-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 19:00:05,968 WARNING [train.py:1204] (2/4) Exclude cut with ID _18895_210_6228_1_1531357274234_3784930_394-58203-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 19:00:06,600 WARNING [train.py:1204] (2/4) Exclude cut with ID _58605_210_12210_1_1534723195939_3904259_258-281498-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 19:00:07,978 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_33348_210_5632_1_1533086912_3324030_245-292752-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 19:00:08,003 WARNING [train.py:1204] (2/4) Exclude cut with ID _67831_210_12392_1_1535612464202_3092612_346-2148-0 from training. Duration: 0.93 2023-10-03 19:00:09,368 WARNING [train.py:1204] (2/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_369-222060-0_sp0.9 from training. Duration: 0.788875 2023-10-03 19:00:10,627 WARNING [train.py:1204] (2/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_245-174553-0 from training. Duration: 0.88 2023-10-03 19:00:10,639 WARNING [train.py:1204] (2/4) Exclude cut with ID _3541_210_2477_1_1526475408709_3263973_231-104736-0_sp1.1 from training. Duration: 0.7090625 2023-10-03 19:00:10,646 WARNING [train.py:1204] (2/4) Exclude cut with ID _44585_210_18628_1_1533605350387_4518199_280-73534-0 from training. Duration: 0.89 2023-10-03 19:00:12,101 WARNING [train.py:1204] (2/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_22-201476-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 19:00:15,509 WARNING [train.py:1204] (2/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_325-62802-0_sp0.9 from training. Duration: 0.9666875 2023-10-03 19:00:16,748 WARNING [train.py:1204] (2/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_199-274062-0_sp1.1 from training. Duration: 0.97275 2023-10-03 19:00:16,751 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_8-319957-0_sp1.1 from training. Duration: 0.87275 2023-10-03 19:00:16,790 WARNING [train.py:1204] (2/4) Exclude cut with ID _29343_210_4381_1_1532602757752_3674109_224-349726-0_sp1.1 from training. Duration: 0.936375 2023-10-03 19:00:16,794 WARNING [train.py:1204] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_520-126729-0_sp1.1 from training. Duration: 0.636375 2023-10-03 19:00:19,407 WARNING [train.py:1204] (2/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_239-22076-0_sp1.1 from training. Duration: 0.77275 2023-10-03 19:00:20,789 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9012W0011-501146 from training. Duration: 0.896 2023-10-03 19:00:22,117 WARNING [train.py:1204] (2/4) Exclude cut with ID _29857_210_20034_1_1532395886753_3798160_279-185801-0_sp1.1 from training. Duration: 0.82725 2023-10-03 19:00:22,168 WARNING [train.py:1204] (2/4) Exclude cut with ID _43552_210_8913_1_1533726031059_3523429_261-223933-0_sp1.1 from training. Duration: 0.92725 2023-10-03 19:00:24,928 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9029W0025-509426 from training. Duration: 0.896 2023-10-03 19:00:25,014 WARNING [train.py:1204] (2/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_337-181502-0_sp1.1 from training. Duration: 0.5909375 2023-10-03 19:00:25,028 WARNING [train.py:1204] (2/4) Exclude cut with ID _68635_210_9317_1_1535770813410_4315870_159-332599-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 19:00:32,316 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_161-276996-0 from training. Duration: 0.95 2023-10-03 19:00:32,359 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7088_210_1874_1_1527939776_1101113_127-68870-0_sp1.1 from training. Duration: 0.936375 2023-10-03 19:00:33,833 WARNING [train.py:1204] (2/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_322-217220-0_sp1.1 from training. Duration: 0.7818125 2023-10-03 19:00:35,225 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9072W0002-530636 from training. Duration: 0.768 2023-10-03 19:00:36,618 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_340-336408-0_sp1.1 from training. Duration: 0.8454375 2023-10-03 19:00:36,690 WARNING [train.py:1204] (2/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_395-37222-0 from training. Duration: 0.76 2023-10-03 19:00:36,695 WARNING [train.py:1204] (2/4) Exclude cut with ID _64099_210_17301_1_1535526977656_1578188_159-209156-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 19:00:38,702 WARNING [train.py:1204] (2/4) Exclude cut with ID _53950_210_5816_1_1534417264919_3862951_28-29681-0_sp1.1 from training. Duration: 0.92725 2023-10-03 19:00:39,936 WARNING [train.py:1204] (2/4) Exclude cut with ID _4681_210_3525_1_1527250689464_120889_15-126760-0_sp1.1 from training. Duration: 0.736375 2023-10-03 19:00:41,297 WARNING [train.py:1204] (2/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_242-3472-0_sp1.1 from training. Duration: 0.663625 2023-10-03 19:00:42,620 WARNING [train.py:1204] (2/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_436-186311-0_sp0.9 from training. Duration: 0.72225 2023-10-03 19:00:42,635 WARNING [train.py:1204] (2/4) Exclude cut with ID _3675_210_4557_1_1526639934408_4182089_452-172147-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 19:00:42,772 WARNING [train.py:1204] (2/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_80-147301-0 from training. Duration: 0.84 2023-10-03 19:00:44,711 WARNING [train.py:1204] (2/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_351-107588-0_sp1.1 from training. Duration: 0.92725 2023-10-03 19:00:44,938 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.attention_skip_rate, batch_count=1372606.6666666667, ans=0.0 2023-10-03 19:00:46,107 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9111W0002-550781 from training. Duration: 0.896 2023-10-03 19:00:50,245 WARNING [train.py:1204] (2/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_465-156500-0_sp1.1 from training. Duration: 0.6545625 2023-10-03 19:00:51,878 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.conv_module2.balancer2.min_positive, batch_count=1372673.3333333333, ans=0.05 2023-10-03 19:00:53,081 WARNING [train.py:1204] (2/4) Exclude cut with ID _13938_210_9988_1_1530518718978_3693960_304-112784-0_sp1.1 from training. Duration: 0.4181875 2023-10-03 19:00:54,549 WARNING [train.py:1204] (2/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_202-67017-0_sp1.1 from training. Duration: 0.7454375 2023-10-03 19:00:55,772 WARNING [train.py:1204] (2/4) Exclude cut with ID _58414_210_8254_1_1535421609162_7151182_448-4358-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 19:00:57,180 WARNING [train.py:1204] (2/4) Exclude cut with ID _66840_210_5219_1_1535452168426_4196349_203-243234-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 19:00:57,258 WARNING [train.py:1204] (2/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_201-213513-0_sp1.1 from training. Duration: 0.97275 2023-10-03 19:01:02,018 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6291_210_3273_1_1527683649_1078498_117-187560-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 19:01:03,468 WARNING [train.py:1204] (2/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_135-174080-0_sp0.9 from training. Duration: 0.588875 2023-10-03 19:01:04,693 WARNING [train.py:1204] (2/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_45-265684-0 from training. Duration: 0.72 2023-10-03 19:01:06,073 INFO [train.py:1046] (2/4) Epoch 39, batch 4050, loss[loss=0.1445, simple_loss=0.2243, pruned_loss=0.03228, over 24275.00 frames. ], tot_loss[loss=0.1568, simple_loss=0.236, pruned_loss=0.0388, over 4708446.87 frames. ], batch size: 56, lr: 2.60e-03, grad_scale: 16.0 2023-10-03 19:01:06,830 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_435-313305-0_sp1.1 from training. Duration: 0.6454375 2023-10-03 19:01:06,867 WARNING [train.py:1204] (2/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_235-307337-0_sp1.1 from training. Duration: 0.8545625 2023-10-03 19:01:08,013 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7762_210_1794_1_1528199508_4057408_419-125009-0_sp0.9 from training. Duration: 0.92225 2023-10-03 19:01:09,435 WARNING [train.py:1204] (2/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_215-141059-0_sp1.1 from training. Duration: 0.563625 2023-10-03 19:01:09,525 WARNING [train.py:1204] (2/4) Exclude cut with ID _27013_210_1389_1_1532437173240_3252649_222-125573-0_sp1.1 from training. Duration: 0.8909375 2023-10-03 19:01:14,607 WARNING [train.py:1204] (2/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_126-274694-0_sp1.1 from training. Duration: 0.8909375 2023-10-03 19:01:18,737 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_2592_210_2290_1_1525697025_4882925_157-70512-0_sp1.1 from training. Duration: 0.82725 2023-10-03 19:01:18,791 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_292-265928-0_sp0.9 from training. Duration: 0.5666875 2023-10-03 19:01:21,415 WARNING [train.py:1204] (2/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_1211-226066-0_sp1.1 from training. Duration: 0.6909375 2023-10-03 19:01:21,472 WARNING [train.py:1204] (2/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_394-265747-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 19:01:21,594 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.1.conv_skip_rate, batch_count=1372806.6666666667, ans=0.0 2023-10-03 19:01:25,725 WARNING [train.py:1204] (2/4) Exclude cut with ID _48782_210_12658_1_1534467701544_2300422_239-67139-0_sp1.1 from training. Duration: 0.97275 2023-10-03 19:01:27,106 WARNING [train.py:1204] (2/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_329-113173-0_sp1.1 from training. Duration: 0.563625 2023-10-03 19:01:29,905 WARNING [train.py:1204] (2/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_81-75849-0_sp0.9 from training. Duration: 0.4333125 2023-10-03 19:01:32,525 WARNING [train.py:1204] (2/4) Exclude cut with ID _9235_210_5219_1_1528891154253_8046120_1003-101638-0 from training. Duration: 0.73 2023-10-03 19:01:32,556 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9309W0189-640316 from training. Duration: 0.64 2023-10-03 19:01:33,943 WARNING [train.py:1204] (2/4) Exclude cut with ID _45329_210_7005_1_1534157951211_6774339_503-287883-0_sp1.1 from training. Duration: 0.8 2023-10-03 19:01:41,020 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.4.encoder.layers.0.nonlin_attention.whiten2, num_groups=1, num_channels=384, metric=8.83 vs. limit=15.0 2023-10-03 19:01:41,587 WARNING [train.py:1204] (2/4) Exclude cut with ID _8833_210_5219_1_1528592665210_7553059_611-80313-0 from training. Duration: 0.64 2023-10-03 19:01:43,449 WARNING [train.py:1204] (2/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_335-106626-0_sp1.1 from training. Duration: 0.963625 2023-10-03 19:01:46,340 WARNING [train.py:1204] (2/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_237-44332-0_sp1.1 from training. Duration: 0.8545625 2023-10-03 19:01:49,651 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.572e+02 1.853e+02 2.035e+02 2.339e+02 3.700e+02, threshold=4.069e+02, percent-clipped=0.0 2023-10-03 19:01:49,736 WARNING [train.py:1204] (2/4) Exclude cut with ID _46952_210_5986_1_1534230010357_3107379_178-147341-0_sp1.1 from training. Duration: 0.97275 2023-10-03 19:01:49,790 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_13091_210_7925_1_1530609447_8987354_946-154426-0_sp1.1 from training. Duration: 0.936375 2023-10-03 19:01:49,806 WARNING [train.py:1204] (2/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_326-17061-0_sp1.1 from training. Duration: 0.8545625 2023-10-03 19:01:52,602 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.self_attn_weights.pos_emb_skip_rate, batch_count=1372940.0, ans=0.0 2023-10-03 19:01:53,746 WARNING [train.py:1204] (2/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_38-169920-0_sp1.1 from training. Duration: 0.82725 2023-10-03 19:01:57,758 WARNING [train.py:1204] (2/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_283-220372-0 from training. Duration: 0.56 2023-10-03 19:01:57,773 WARNING [train.py:1204] (2/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_252-216674-0_sp1.1 from training. Duration: 0.4909375 2023-10-03 19:01:59,186 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_202-91290-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 19:02:01,967 WARNING [train.py:1204] (2/4) Exclude cut with ID _41314_210_12651_1_1533459476543_3766460_80-74184-0 from training. Duration: 0.83 2023-10-03 19:02:07,419 WARNING [train.py:1204] (2/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_411-271358-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 19:02:13,016 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.0.layers.0.feed_forward2.out_whiten, num_groups=1, num_channels=192, metric=7.15 vs. limit=15.0 2023-10-03 19:02:13,471 WARNING [train.py:1204] (2/4) Exclude cut with ID _68777_210_4353_1_1535810390731_3117904_129-166012-0 from training. Duration: 0.85 2023-10-03 19:02:15,142 WARNING [train.py:1204] (2/4) Exclude cut with ID _44982_210_1511_1_1534312820108_3726029_410-109759-0_sp1.1 from training. Duration: 0.963625 2023-10-03 19:02:15,146 WARNING [train.py:1204] (2/4) Exclude cut with ID _14224_210_4381_1_1530529062037_4264010_364-22817-0_sp1.1 from training. Duration: 0.6454375 2023-10-03 19:02:15,290 WARNING [train.py:1204] (2/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_89-242335-0 from training. Duration: 0.95 2023-10-03 19:02:15,297 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_12868_210_6753_1_1530702479_3610564_151-325626-0 from training. Duration: 0.97 2023-10-03 19:02:15,298 WARNING [train.py:1204] (2/4) Exclude cut with ID _6275_210_2084_1_1528110062213_7669049_786-264332-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 19:02:17,317 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.conv_module1.balancer2.prob, batch_count=1373006.6666666667, ans=0.125 2023-10-03 19:02:18,449 WARNING [train.py:1204] (2/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_35-153886-0_sp1.1 from training. Duration: 0.763625 2023-10-03 19:02:19,745 INFO [train.py:1046] (2/4) Epoch 39, batch 4100, loss[loss=0.1459, simple_loss=0.2199, pruned_loss=0.03592, over 24340.00 frames. ], tot_loss[loss=0.1576, simple_loss=0.2367, pruned_loss=0.03925, over 4716907.78 frames. ], batch size: 56, lr: 2.60e-03, grad_scale: 16.0 2023-10-03 19:02:19,853 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_24007_210_1794_1_1531918925_1663875_116-100916-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 19:02:21,118 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_162-262717-0_sp1.1 from training. Duration: 0.7545625 2023-10-03 19:02:27,286 WARNING [train.py:1204] (2/4) Exclude cut with ID _8263_210_1664_1_1528438953494_294100_40-204731-0 from training. Duration: 0.5 2023-10-03 19:02:28,710 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_12868_210_6753_1_1530702479_3610564_416-293746-0 from training. Duration: 0.9 2023-10-03 19:02:30,137 WARNING [train.py:1204] (2/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_605-219732-0 from training. Duration: 0.94 2023-10-03 19:02:31,503 WARNING [train.py:1204] (2/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_454-123002-0 from training. Duration: 0.69 2023-10-03 19:02:31,518 WARNING [train.py:1204] (2/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_25-304273-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 19:02:31,568 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6860_210_4852_1_1527917457_3833327_451-39702-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 19:02:32,905 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_304-16505-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 19:02:32,918 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6291_210_3273_1_1527683649_1078498_4-266348-0_sp1.1 from training. Duration: 0.7090625 2023-10-03 19:02:34,393 WARNING [train.py:1204] (2/4) Exclude cut with ID ID0149W0001-753701 from training. Duration: 0.989 2023-10-03 19:02:37,273 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_41251_210_15368_1_1533429767_4820695_394-55393-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 19:02:37,351 WARNING [train.py:1204] (2/4) Exclude cut with ID _8793_210_6310_1_1528542985139_2310009_121-179782-0_sp1.1 from training. Duration: 0.72725 2023-10-03 19:02:37,364 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_2729_210_3528_1_1525863505_3882214_421-73047-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 19:02:38,755 WARNING [train.py:1204] (2/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_278-154492-0_sp0.9 from training. Duration: 0.8444375 2023-10-03 19:02:42,614 WARNING [train.py:1204] (2/4) Exclude cut with ID _14170_210_5219_1_1530525949868_4469650_412-317077-0_sp0.9 from training. Duration: 0.8333125 2023-10-03 19:02:43,898 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_727-138317-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 19:02:43,942 WARNING [train.py:1204] (2/4) Exclude cut with ID _25808_210_13292_1_1532343514562_7366109_345-335048-0_sp1.1 from training. Duration: 0.92725 2023-10-03 19:02:45,195 WARNING [train.py:1204] (2/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_506-23200-0 from training. Duration: 0.97 2023-10-03 19:02:45,280 WARNING [train.py:1204] (2/4) Exclude cut with ID _44718_210_5816_1_1534050051869_6469030_643-245187-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 19:02:45,284 WARNING [train.py:1204] (2/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_42-186162-0_sp1.1 from training. Duration: 0.836375 2023-10-03 19:02:46,614 WARNING [train.py:1204] (2/4) Exclude cut with ID _3231_210_2106_1_1526205864934_6784220_171-256004-0_sp1.1 from training. Duration: 0.7818125 2023-10-03 19:02:46,647 WARNING [train.py:1204] (2/4) Exclude cut with ID _25914_210_6400_1_1532141317058_644740_43-248478-0_sp1.1 from training. Duration: 0.936375 2023-10-03 19:02:46,687 WARNING [train.py:1204] (2/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_577-287150-0 from training. Duration: 0.82 2023-10-03 19:02:50,333 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_46015_210_7234_1_1533863319_3087837_376-276068-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 19:02:51,706 WARNING [train.py:1204] (2/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_51-200220-0 from training. Duration: 0.56 2023-10-03 19:02:53,094 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_452-96101-0_sp1.1 from training. Duration: 0.963625 2023-10-03 19:02:54,540 WARNING [train.py:1204] (2/4) Exclude cut with ID _13252_210_1925_1_1530671372047_3336909_263-168388-0_sp1.1 from training. Duration: 0.7818125 2023-10-03 19:02:54,541 WARNING [train.py:1204] (2/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_469-94673-0 from training. Duration: 0.79 2023-10-03 19:02:57,356 WARNING [train.py:1204] (2/4) Exclude cut with ID _14922_210_6228_1_1530707435273_3693310_303-228738-0_sp1.1 from training. Duration: 0.763625 2023-10-03 19:02:57,426 WARNING [train.py:1204] (2/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_262-250826-0_sp1.1 from training. Duration: 0.9 2023-10-03 19:02:57,454 WARNING [train.py:1204] (2/4) Exclude cut with ID _68777_210_4353_1_1535810390731_3117904_129-166012-0_sp1.1 from training. Duration: 0.77275 2023-10-03 19:03:00,085 WARNING [train.py:1204] (2/4) Exclude cut with ID _58895_210_8750_1_1535184081320_3649089_265-323810-0 from training. Duration: 0.96 2023-10-03 19:03:02,840 WARNING [train.py:1204] (2/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_120-76843-0_sp0.9 from training. Duration: 0.611125 2023-10-03 19:03:02,916 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7544_210_6753_1_1528587722_3576686_295-258479-0_sp0.9 from training. Duration: 0.9333125 2023-10-03 19:03:05,601 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7046_210_6753_1_1527895337_6920917_783-278109-0 from training. Duration: 0.77 2023-10-03 19:03:05,649 WARNING [train.py:1204] (2/4) Exclude cut with ID _42326_210_7770_1_1533814282531_3707269_113-308323-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 19:03:05,682 WARNING [train.py:1204] (2/4) Exclude cut with ID _7852_210_5219_1_1528286471316_8139400_242-105336-0_sp0.9 from training. Duration: 0.8 2023-10-03 19:03:07,169 WARNING [train.py:1204] (2/4) Exclude cut with ID _56235_210_10640_1_1534557335235_4161099_114-277905-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 19:03:13,055 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_13118_210_7925_1_1530755612_7608297_42-66683-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 19:03:16,220 WARNING [train.py:1204] (2/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_901-79438-0_sp1.1 from training. Duration: 0.8545625 2023-10-03 19:03:16,271 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_30280_210_1881_1_1532872779_3660139_211-182194-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 19:03:26,900 WARNING [train.py:1204] (2/4) Exclude cut with ID _14898_210_9206_1_1530766812377_4177190_621-45732-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 19:03:26,908 WARNING [train.py:1204] (2/4) Exclude cut with ID _35350_210_16040_1_1533040191183_4075299_224-133775-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 19:03:29,809 WARNING [train.py:1204] (2/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_147-293311-0_sp1.1 from training. Duration: 0.8545625 2023-10-03 19:03:32,558 WARNING [train.py:1204] (2/4) Exclude cut with ID _12302_210_5219_1_1529924446466_8107280_835-279754-0_sp1.1 from training. Duration: 0.8181875 2023-10-03 19:03:33,858 INFO [train.py:1046] (2/4) Epoch 39, batch 4150, loss[loss=0.1497, simple_loss=0.232, pruned_loss=0.03375, over 24581.00 frames. ], tot_loss[loss=0.1578, simple_loss=0.2368, pruned_loss=0.03943, over 4708258.06 frames. ], batch size: 60, lr: 2.60e-03, grad_scale: 8.0 2023-10-03 19:03:35,379 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8453_210_4211_1_1528502940_8562730_347-330673-0_sp0.9 from training. Duration: 0.8 2023-10-03 19:03:36,707 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_213-103282-0_sp0.9 from training. Duration: 0.8666875 2023-10-03 19:03:38,020 WARNING [train.py:1204] (2/4) Exclude cut with ID _49728_210_12651_1_1534041034294_6788150_66-43496-0_sp1.1 from training. Duration: 0.97275 2023-10-03 19:03:38,022 WARNING [train.py:1204] (2/4) Exclude cut with ID _19023_210_4353_1_1531378892552_5820780_28-36615-0_sp1.1 from training. Duration: 0.8818125 2023-10-03 19:03:39,516 WARNING [train.py:1204] (2/4) Exclude cut with ID _14349_210_8341_1_1530615710818_3442470_115-285975-0 from training. Duration: 0.86 2023-10-03 19:03:40,790 WARNING [train.py:1204] (2/4) Exclude cut with ID _12734_210_8341_1_1530183614296_3891929_236-47464-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 19:03:42,155 WARNING [train.py:1204] (2/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_177-236809-0 from training. Duration: 0.49 2023-10-03 19:03:42,842 WARNING [train.py:1204] (2/4) Exclude cut with ID _13678_210_7497_1_1530530069226_3463170_371-3221-0 from training. Duration: 0.9 2023-10-03 19:03:42,862 WARNING [train.py:1204] (2/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_48-338976-0 from training. Duration: 0.89 2023-10-03 19:03:44,097 WARNING [train.py:1204] (2/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_1167-309744-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 19:03:48,806 WARNING [train.py:1204] (2/4) Exclude cut with ID _30327_210_16049_1_1532484161295_3924169_401-70819-0_sp0.9 from training. Duration: 0.9555625 2023-10-03 19:03:48,814 WARNING [train.py:1204] (2/4) Exclude cut with ID _55134_210_21982_1_1534590028377_3882170_346-91022-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 19:03:55,197 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.conv_module1.balancer1.prob, batch_count=1373473.3333333333, ans=0.125 2023-10-03 19:03:56,305 WARNING [train.py:1204] (2/4) Exclude cut with ID _14459_210_2228_1_1531443641946_7516700_174-118801-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 19:03:56,377 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_269-278006-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 19:03:57,714 WARNING [train.py:1204] (2/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_390-339137-0_sp0.9 from training. Duration: 0.62225 2023-10-03 19:03:59,124 WARNING [train.py:1204] (2/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_253-308377-0_sp1.1 from training. Duration: 0.4454375 2023-10-03 19:03:59,141 WARNING [train.py:1204] (2/4) Exclude cut with ID _23825_210_13449_1_1531915313526_3497150_240-271367-0_sp1.1 from training. Duration: 0.8818125 2023-10-03 19:03:59,371 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.conv_module2.balancer1.prob, batch_count=1373473.3333333333, ans=0.125 2023-10-03 19:04:00,542 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1015-343884-0_sp0.9 from training. Duration: 0.688875 2023-10-03 19:04:03,513 WARNING [train.py:1204] (2/4) Exclude cut with ID _23514_210_1389_1_1531832139785_3031439_335-48210-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 19:04:07,472 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_309-65177-0_sp1.1 from training. Duration: 0.8 2023-10-03 19:04:07,541 WARNING [train.py:1204] (2/4) Exclude cut with ID _14368_210_4220_1_1530532714870_3595529_231-137892-0 from training. Duration: 0.99 2023-10-03 19:04:08,969 WARNING [train.py:1204] (2/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_57-270037-0 from training. Duration: 0.77 2023-10-03 19:04:10,300 WARNING [train.py:1204] (2/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_465-214726-0_sp1.1 from training. Duration: 0.7818125 2023-10-03 19:04:11,674 WARNING [train.py:1204] (2/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_106-300985-0 from training. Duration: 0.9 2023-10-03 19:04:11,679 WARNING [train.py:1204] (2/4) Exclude cut with ID _14632_210_9988_1_1530691279392_3923200_161-129530-0_sp1.1 from training. Duration: 0.87275 2023-10-03 19:04:12,918 WARNING [train.py:1204] (2/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_514-88370-0_sp1.1 from training. Duration: 0.963625 2023-10-03 19:04:13,790 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.1.ff2_skip_rate, batch_count=1373540.0, ans=0.0 2023-10-03 19:04:13,838 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.feed_forward2.hidden_balancer.prob, batch_count=1373540.0, ans=0.125 2023-10-03 19:04:14,905 WARNING [train.py:1204] (2/4) Exclude cut with ID _56495_210_4234_1_1534748080334_4136219_305-334505-0_sp1.1 from training. Duration: 0.92725 2023-10-03 19:04:16,361 WARNING [train.py:1204] (2/4) Exclude cut with ID _42875_210_14856_1_1533704397487_4012433_61-264914-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 19:04:17,049 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.3.encoder.layers.3.nonlin_attention.whiten2, num_groups=1, num_channels=512, metric=5.22 vs. limit=15.0 2023-10-03 19:04:17,650 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.610e+02 1.923e+02 2.108e+02 2.478e+02 3.681e+02, threshold=4.216e+02, percent-clipped=0.0 2023-10-03 19:04:18,042 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.balancer1.prob, batch_count=1373606.6666666667, ans=0.125 2023-10-03 19:04:19,845 WARNING [train.py:1204] (2/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_91-148831-0 from training. Duration: 0.99 2023-10-03 19:04:20,052 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=1373606.6666666667, ans=0.1 2023-10-03 19:04:22,531 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_435-313305-0_sp0.9 from training. Duration: 0.788875 2023-10-03 19:04:24,482 WARNING [train.py:1204] (2/4) Exclude cut with ID _16744_210_5986_1_1531724405384_3907567_242-111316-0_sp1.1 from training. Duration: 0.8454375 2023-10-03 19:04:25,800 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_257-20733-0 from training. Duration: 0.76 2023-10-03 19:04:27,090 WARNING [train.py:1204] (2/4) Exclude cut with ID _8064_210_5399_1_1528372299291_4282973_4-91037-0_sp1.1 from training. Duration: 0.8 2023-10-03 19:04:27,180 WARNING [train.py:1204] (2/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_3-276701-0 from training. Duration: 0.91 2023-10-03 19:04:28,542 WARNING [train.py:1204] (2/4) Exclude cut with ID _3992_210_1747_1_1526644816031_3731060_36-147855-0_sp1.1 from training. Duration: 0.7090625 2023-10-03 19:04:29,914 WARNING [train.py:1204] (2/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_1296-298671-0_sp1.1 from training. Duration: 0.963625 2023-10-03 19:04:29,998 WARNING [train.py:1204] (2/4) Exclude cut with ID _68965_210_1883_1_1535698962272_3497490_309-334593-0_sp1.1 from training. Duration: 0.92725 2023-10-03 19:04:31,314 WARNING [train.py:1204] (2/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_128-296581-0 from training. Duration: 0.74 2023-10-03 19:04:31,314 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_17613_210_2151_1_1531137569_2366160_170-299079-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 19:04:31,317 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6287_210_3273_1_1527509506_2653716_182-199887-0_sp1.1 from training. Duration: 0.463625 2023-10-03 19:04:31,568 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.bypass_mid.scale_min, batch_count=1373673.3333333333, ans=0.2 2023-10-03 19:04:32,757 WARNING [train.py:1204] (2/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_80-135200-0_sp1.1 from training. Duration: 0.7181875 2023-10-03 19:04:35,511 WARNING [train.py:1204] (2/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_34-183685-0 from training. Duration: 0.98 2023-10-03 19:04:35,528 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8278_210_2550_1_1528637099_4220413_418-210155-0_sp1.1 from training. Duration: 0.92725 2023-10-03 19:04:35,532 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8853_210_3273_1_1528633899_818968_51-187685-0_sp0.9 from training. Duration: 0.7666875 2023-10-03 19:04:36,775 WARNING [train.py:1204] (2/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_332-221057-0_sp1.1 from training. Duration: 0.6181875 2023-10-03 19:04:36,853 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_2221_210_1988_1_1525597051_4935541_415-296882-0 from training. Duration: 0.77 2023-10-03 19:04:36,880 WARNING [train.py:1204] (2/4) Exclude cut with ID _33640_210_2655_1_1533117605479_4855469_770-278185-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 19:04:38,177 WARNING [train.py:1204] (2/4) Exclude cut with ID _5287_210_2084_1_1527418883040_3878670_333-48508-0_sp1.1 from training. Duration: 0.5454375 2023-10-03 19:04:38,241 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_43802_210_2290_1_1533516203_8829504_142-276839-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 19:04:39,594 WARNING [train.py:1204] (2/4) Exclude cut with ID _28394_210_8913_1_1532430049518_3650118_126-139371-0_sp1.1 from training. Duration: 0.92725 2023-10-03 19:04:41,507 WARNING [train.py:1204] (2/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_76-203867-0 from training. Duration: 0.72 2023-10-03 19:04:41,564 WARNING [train.py:1204] (2/4) Exclude cut with ID _6194_210_1534_1_1527511826160_3941194_14-282876-0_sp0.9 from training. Duration: 0.788875 2023-10-03 19:04:46,933 INFO [train.py:1046] (2/4) Epoch 39, batch 4200, loss[loss=0.1286, simple_loss=0.2059, pruned_loss=0.02569, over 21229.00 frames. ], tot_loss[loss=0.1571, simple_loss=0.2358, pruned_loss=0.03919, over 4701119.55 frames. ], batch size: 46, lr: 2.60e-03, grad_scale: 8.0 2023-10-03 19:04:47,045 WARNING [train.py:1204] (2/4) Exclude cut with ID _35355_210_16040_1_1533212988977_4016229_74-190076-0_sp1.1 from training. Duration: 0.72725 2023-10-03 19:04:49,210 WARNING [train.py:1204] (2/4) Exclude cut with ID _33920_210_4220_1_1533257851500_3084399_198-334063-0 from training. Duration: 0.94 2023-10-03 19:04:50,585 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6291_210_3273_1_1527683649_1078498_4-266348-0_sp0.9 from training. Duration: 0.8666875 2023-10-03 19:04:52,544 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_23856_210_4238_1_1531994785_3087934_122-38075-0_sp1.1 from training. Duration: 0.936375 2023-10-03 19:04:55,745 WARNING [train.py:1204] (2/4) Exclude cut with ID _4697_210_2069_1_1526815801953_3823098_276-96593-0_sp0.9 from training. Duration: 0.8555625 2023-10-03 19:04:55,803 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_308-278734-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 19:04:55,804 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_5190_210_4557_1_1527245078_2965646_353-250603-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 19:04:58,457 WARNING [train.py:1204] (2/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_472-216663-0 from training. Duration: 0.92 2023-10-03 19:05:01,263 WARNING [train.py:1204] (2/4) Exclude cut with ID _7537_210_2084_1_1528502395431_3924359_14-312906-0 from training. Duration: 0.84 2023-10-03 19:05:01,307 WARNING [train.py:1204] (2/4) Exclude cut with ID _29003_210_19596_1_1532507391720_3894098_559-236660-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 19:05:02,732 WARNING [train.py:1204] (2/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_312-204798-0_sp1.1 from training. Duration: 0.8454375 2023-10-03 19:05:02,885 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.bypass.scale_min, batch_count=1373806.6666666667, ans=0.2 2023-10-03 19:05:06,913 WARNING [train.py:1204] (2/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_17-309070-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 19:05:07,127 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.balancer1.prob, batch_count=1373806.6666666667, ans=0.125 2023-10-03 19:05:08,510 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.conv_module1.balancer2.prob, batch_count=1373806.6666666667, ans=0.125 2023-10-03 19:05:09,638 WARNING [train.py:1204] (2/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_344-152526-0_sp0.9 from training. Duration: 0.588875 2023-10-03 19:05:11,068 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_41452_210_7291_1_1533437687_3941281_405-11559-0_sp1.1 from training. Duration: 0.82725 2023-10-03 19:05:11,099 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_48730_210_5399_1_1533885886_2212769_157-124052-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 19:05:11,146 WARNING [train.py:1204] (2/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_415-78792-0 from training. Duration: 0.81 2023-10-03 19:05:11,150 WARNING [train.py:1204] (2/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_345-340690-0_sp1.1 from training. Duration: 0.8454375 2023-10-03 19:05:13,164 WARNING [train.py:1204] (2/4) Exclude cut with ID _23825_210_13449_1_1531915313526_3497150_124-8138-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 19:05:14,458 WARNING [train.py:1204] (2/4) Exclude cut with ID _49258_210_7994_1_1534122017863_3738861_194-24864-0_sp1.1 from training. Duration: 0.936375 2023-10-03 19:05:14,475 WARNING [train.py:1204] (2/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_369-222060-0_sp1.1 from training. Duration: 0.6454375 2023-10-03 19:05:16,034 WARNING [train.py:1204] (2/4) Exclude cut with ID _12369_210_4381_1_1530356078581_4388520_480-13146-0_sp0.9 from training. Duration: 0.9444375 2023-10-03 19:05:16,182 INFO [scaling.py:1118] (2/4) WithLoss: name=encoder.encoders.1.encoder.layers.0.self_attn_weights, loss-sum=0.000e+00 2023-10-03 19:05:18,662 WARNING [train.py:1204] (2/4) Exclude cut with ID _13253_210_1925_1_1530757775819_3577873_191-144977-0 from training. Duration: 0.78 2023-10-03 19:05:18,687 WARNING [train.py:1204] (2/4) Exclude cut with ID _30304_210_6319_1_1532476827261_7582060_1128-140266-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 19:05:23,311 WARNING [train.py:1204] (2/4) Exclude cut with ID _8772_210_1850_1_1528635595972_3664011_247-336448-0_sp0.9 from training. Duration: 0.82225 2023-10-03 19:05:25,343 WARNING [train.py:1204] (2/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_318-59097-0_sp1.1 from training. Duration: 0.6909375 2023-10-03 19:05:26,813 WARNING [train.py:1204] (2/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_540-131164-0_sp1.1 from training. Duration: 0.836375 2023-10-03 19:05:28,280 WARNING [train.py:1204] (2/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_39-110836-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 19:05:29,712 WARNING [train.py:1204] (2/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_860-96198-0_sp1.1 from training. Duration: 0.663625 2023-10-03 19:05:31,018 WARNING [train.py:1204] (2/4) Exclude cut with ID _15133_210_6228_1_1530794512074_3080379_29-264458-0 from training. Duration: 0.58 2023-10-03 19:05:31,045 WARNING [train.py:1204] (2/4) Exclude cut with ID _55209_210_1531_1_1534675828996_4597160_797-348343-0_sp1.1 from training. Duration: 0.963625 2023-10-03 19:05:32,426 WARNING [train.py:1204] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_23-189234-0_sp1.1 from training. Duration: 0.7545625 2023-10-03 19:05:36,502 WARNING [train.py:1204] (2/4) Exclude cut with ID _5410_210_1388_1_1527397055795_1719020_39-112221-0_sp1.1 from training. Duration: 0.62725 2023-10-03 19:05:39,245 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_43802_210_2290_1_1533516203_8829504_100-348441-0_sp1.1 from training. Duration: 0.82725 2023-10-03 19:05:43,519 WARNING [train.py:1204] (2/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_143-86344-0_sp1.1 from training. Duration: 0.7 2023-10-03 19:05:44,462 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.ff3_skip_rate, batch_count=1373940.0, ans=0.0 2023-10-03 19:05:46,752 WARNING [train.py:1204] (2/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_126-29104-0 from training. Duration: 0.85 2023-10-03 19:05:48,166 WARNING [train.py:1204] (2/4) Exclude cut with ID _39846_210_12651_1_1533603651992_1318030_138-31771-0_sp1.1 from training. Duration: 0.963625 2023-10-03 19:05:48,420 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.balancer2.prob, batch_count=1374006.6666666667, ans=0.125 2023-10-03 19:05:53,127 WARNING [train.py:1204] (2/4) Exclude cut with ID _2220_210_1819_1_1525509109898_3658680_50-99837-0_sp1.1 from training. Duration: 0.4909375 2023-10-03 19:05:53,189 WARNING [train.py:1204] (2/4) Exclude cut with ID _6274_210_2084_1_1527937583837_7494020_836-210550-0_sp1.1 from training. Duration: 0.97275 2023-10-03 19:05:55,150 WARNING [train.py:1204] (2/4) Exclude cut with ID _28323_210_16187_1_1532480395667_4100300_306-74561-0 from training. Duration: 0.94 2023-10-03 19:05:59,375 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_177-58005-0_sp1.1 from training. Duration: 0.563625 2023-10-03 19:06:01,968 INFO [train.py:1046] (2/4) Epoch 39, batch 4250, loss[loss=0.1114, simple_loss=0.1669, pruned_loss=0.02797, over 19347.00 frames. ], tot_loss[loss=0.1564, simple_loss=0.235, pruned_loss=0.03885, over 4700463.71 frames. ], batch size: 388, lr: 2.60e-03, grad_scale: 8.0 2023-10-03 19:06:03,400 WARNING [train.py:1204] (2/4) Exclude cut with ID _54871_210_22356_1_1534665798253_3856359_19-342632-0_sp1.1 from training. Duration: 0.82725 2023-10-03 19:06:03,414 WARNING [train.py:1204] (2/4) Exclude cut with ID _35355_210_16040_1_1533212988977_4016229_74-190076-0_sp0.9 from training. Duration: 0.888875 2023-10-03 19:06:06,153 WARNING [train.py:1204] (2/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_285-97786-0_sp1.1 from training. Duration: 0.97275 2023-10-03 19:06:06,368 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.conv_module1.balancer2.prob, batch_count=1374073.3333333333, ans=0.125 2023-10-03 19:06:11,605 WARNING [train.py:1204] (2/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_337-181502-0_sp0.9 from training. Duration: 0.72225 2023-10-03 19:06:11,658 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_353-182855-0 from training. Duration: 0.96 2023-10-03 19:06:12,999 WARNING [train.py:1204] (2/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_409-327730-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 19:06:15,232 WARNING [train.py:1204] (2/4) Exclude cut with ID _8030_210_7497_1_1528444886781_3531089_373-19403-0_sp1.1 from training. Duration: 0.97275 2023-10-03 19:06:18,378 WARNING [train.py:1204] (2/4) Exclude cut with ID _6789_210_1534_1_1527857937218_3118950_231-340237-0_sp1.1 from training. Duration: 0.936375 2023-10-03 19:06:26,019 WARNING [train.py:1204] (2/4) Exclude cut with ID _6493_210_1747_1_1528459210424_4012470_536-57832-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 19:06:26,040 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7046_210_6753_1_1527895337_6920917_756-116002-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 19:06:26,300 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=1374140.0, ans=0.1 2023-10-03 19:06:27,532 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_37152_210_9697_1_1533106573_3844188_29-239070-0_sp1.1 from training. Duration: 0.8909375 2023-10-03 19:06:27,535 WARNING [train.py:1204] (2/4) Exclude cut with ID _19023_210_4353_1_1531378892552_5820780_61-334539-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 19:06:28,986 WARNING [train.py:1204] (2/4) Exclude cut with ID _14170_210_5219_1_1530525949868_4469650_159-28488-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 19:06:29,208 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.balancer2.prob, batch_count=1374140.0, ans=0.125 2023-10-03 19:06:30,370 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_40896_210_15710_1_1533433956_7100802_764-71272-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 19:06:31,814 WARNING [train.py:1204] (2/4) Exclude cut with ID _44902_210_16187_1_1533888006062_3792078_258-178274-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 19:06:33,361 WARNING [train.py:1204] (2/4) Exclude cut with ID _7537_210_2084_1_1528502395431_3924359_14-312906-0_sp1.1 from training. Duration: 0.763625 2023-10-03 19:06:33,466 WARNING [train.py:1204] (2/4) Exclude cut with ID _49643_210_7989_1_1534640244224_3939031_674-116639-0_sp1.1 from training. Duration: 0.963625 2023-10-03 19:06:36,309 WARNING [train.py:1204] (2/4) Exclude cut with ID _22826_210_9994_1_1531908201522_3279169_415-161-0 from training. Duration: 0.98 2023-10-03 19:06:36,533 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.conv_skip_rate, batch_count=1374206.6666666667, ans=0.0 2023-10-03 19:06:37,910 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.bypass_mid.scale_min, batch_count=1374206.6666666667, ans=0.2 2023-10-03 19:06:38,979 WARNING [train.py:1204] (2/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_264-348896-0 from training. Duration: 0.85 2023-10-03 19:06:38,987 WARNING [train.py:1204] (2/4) Exclude cut with ID _51899_210_20034_1_1534129254466_3669220_233-161492-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 19:06:39,050 WARNING [train.py:1204] (2/4) Exclude cut with ID _31255_210_13427_1_1532865619495_3815143_233-200023-0_sp1.1 from training. Duration: 0.936375 2023-10-03 19:06:40,383 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_23850_210_4238_1_1532232597_1632433_100-229879-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 19:06:41,774 WARNING [train.py:1204] (2/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_375-245677-0_sp0.9 from training. Duration: 0.9 2023-10-03 19:06:41,778 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_40223_210_6228_1_1533298404_4812267_355-317415-0_sp1.1 from training. Duration: 0.97275 2023-10-03 19:06:41,831 WARNING [train.py:1204] (2/4) Exclude cut with ID _9679_210_7497_1_1529129212906_4363929_235-81488-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 19:06:41,942 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.1.conv_module2.balancer2.prob, batch_count=1374206.6666666667, ans=0.125 2023-10-03 19:06:44,689 WARNING [train.py:1204] (2/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_172-215932-0_sp1.1 from training. Duration: 0.6 2023-10-03 19:06:45,935 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.578e+02 1.890e+02 2.070e+02 2.259e+02 3.425e+02, threshold=4.141e+02, percent-clipped=0.0 2023-10-03 19:06:46,057 WARNING [train.py:1204] (2/4) Exclude cut with ID _12369_210_4381_1_1530356078581_4388520_64-298917-0_sp1.1 from training. Duration: 0.8 2023-10-03 19:06:52,064 WARNING [train.py:1204] (2/4) Exclude cut with ID _38020_210_5986_1_1533112252259_7006089_131-53533-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 19:06:53,490 WARNING [train.py:1204] (2/4) Exclude cut with ID _42334_210_7770_1_1534206610396_3605880_392-48154-0_sp1.1 from training. Duration: 0.963625 2023-10-03 19:06:54,896 WARNING [train.py:1204] (2/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_337-181502-0 from training. Duration: 0.65 2023-10-03 19:06:54,904 WARNING [train.py:1204] (2/4) Exclude cut with ID _6267_210_2084_1_1527764658526_3894730_375-219887-0_sp0.9 from training. Duration: 0.7444375 2023-10-03 19:06:56,664 WARNING [train.py:1204] (2/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_60-146189-0 from training. Duration: 0.98 2023-10-03 19:06:56,755 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_3133_210_2087_1_1526106446_2324690_243-143119-0_sp1.1 from training. Duration: 0.7 2023-10-03 19:06:58,647 WARNING [train.py:1204] (2/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_1267-272693-0_sp0.9 from training. Duration: 0.811125 2023-10-03 19:07:01,250 WARNING [train.py:1204] (2/4) Exclude cut with ID _69884_210_9771_1_1535846392818_4166169_83-182645-0_sp1.1 from training. Duration: 0.97275 2023-10-03 19:07:01,277 WARNING [train.py:1204] (2/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_403-292260-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 19:07:02,748 WARNING [train.py:1204] (2/4) Exclude cut with ID _55429_210_10626_1_1534467588527_7174143_377-169391-0 from training. Duration: 0.96 2023-10-03 19:07:03,626 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.1.encoder.layers.0.nonlin_attention.whiten1, num_groups=1, num_channels=192, metric=4.80 vs. limit=10.0 2023-10-03 19:07:04,047 WARNING [train.py:1204] (2/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_494-199538-0_sp1.1 from training. Duration: 0.6818125 2023-10-03 19:07:04,093 WARNING [train.py:1204] (2/4) Exclude cut with ID _3067_210_2667_1_1526212974640_4503168_119-175776-0_sp1.1 from training. Duration: 0.67275 2023-10-03 19:07:08,379 WARNING [train.py:1204] (2/4) Exclude cut with ID _3231_210_2106_1_1526205864934_6784220_183-303839-0_sp1.1 from training. Duration: 0.97275 2023-10-03 19:07:10,971 WARNING [train.py:1204] (2/4) Exclude cut with ID _18513_210_9206_1_1531618215223_3862620_165-38958-0_sp1.1 from training. Duration: 0.963625 2023-10-03 19:07:12,298 WARNING [train.py:1204] (2/4) Exclude cut with ID _41314_210_12651_1_1533459476543_3766460_87-272068-0_sp1.1 from training. Duration: 0.7545625 2023-10-03 19:07:13,660 WARNING [train.py:1204] (2/4) Exclude cut with ID _3541_210_2477_1_1526475408709_3263973_353-95-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 19:07:13,773 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_2009_210_2071_1_1525481134_4588937_73-302813-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 19:07:15,098 INFO [train.py:1046] (2/4) Epoch 39, batch 4300, loss[loss=0.155, simple_loss=0.2485, pruned_loss=0.03072, over 24666.00 frames. ], tot_loss[loss=0.1562, simple_loss=0.2357, pruned_loss=0.03839, over 4710623.41 frames. ], batch size: 73, lr: 2.60e-03, grad_scale: 8.0 2023-10-03 19:07:15,235 WARNING [train.py:1204] (2/4) Exclude cut with ID _64220_210_9527_1_1535250151772_4875259_88-95011-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 19:07:16,648 WARNING [train.py:1204] (2/4) Exclude cut with ID _44902_210_16187_1_1533888006062_3792078_160-12107-0_sp1.1 from training. Duration: 0.92725 2023-10-03 19:07:16,655 WARNING [train.py:1204] (2/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_547-5424-0 from training. Duration: 0.94 2023-10-03 19:07:16,775 WARNING [train.py:1204] (2/4) Exclude cut with ID _21783_210_11357_1_1531742458730_3762679_268-85656-0_sp1.1 from training. Duration: 0.936375 2023-10-03 19:07:22,734 WARNING [train.py:1204] (2/4) Exclude cut with ID _66210_210_12651_1_1535446812922_3365850_42-284985-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 19:07:22,767 WARNING [train.py:1204] (2/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_465-214726-0_sp0.9 from training. Duration: 0.9555625 2023-10-03 19:07:26,619 WARNING [train.py:1204] (2/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_50-95770-0_sp1.1 from training. Duration: 0.936375 2023-10-03 19:07:33,567 WARNING [train.py:1204] (2/4) Exclude cut with ID _30800_210_22382_1_1532499347904_4515959_269-77567-0_sp1.1 from training. Duration: 0.963625 2023-10-03 19:07:33,569 WARNING [train.py:1204] (2/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_7-111954-0 from training. Duration: 0.56 2023-10-03 19:07:36,137 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_43802_210_2290_1_1533516203_8829504_85-195240-0_sp0.9 from training. Duration: 0.9666875 2023-10-03 19:07:37,558 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_2822_210_2715_1_1526112586_1999587_165-36656-0_sp0.9 from training. Duration: 0.988875 2023-10-03 19:07:38,847 WARNING [train.py:1204] (2/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_181-78632-0_sp1.1 from training. Duration: 0.7090625 2023-10-03 19:07:38,866 WARNING [train.py:1204] (2/4) Exclude cut with ID IC0671W0044-331887 from training. Duration: 0.896 2023-10-03 19:07:41,621 WARNING [train.py:1204] (2/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_266-137104-0_sp1.1 from training. Duration: 0.5909375 2023-10-03 19:07:44,273 WARNING [train.py:1204] (2/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_137-116442-0_sp0.9 from training. Duration: 0.8666875 2023-10-03 19:07:44,610 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.bypass_mid.scale_min, batch_count=1374540.0, ans=0.2 2023-10-03 19:07:45,763 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_23801_210_16223_1_1532066392_3294657_570-5439-0 from training. Duration: 0.97 2023-10-03 19:07:45,775 WARNING [train.py:1204] (2/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_29-280622-0_sp1.1 from training. Duration: 0.7454375 2023-10-03 19:07:46,978 WARNING [train.py:1204] (2/4) Exclude cut with ID 3557-8342-0013-54691-0 from training. Duration: 0.92 2023-10-03 19:07:48,852 WARNING [train.py:1204] (2/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_711-15688-0_sp1.1 from training. Duration: 0.3909375 2023-10-03 19:07:50,219 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_15126_210_6753_1_1530778677_2591185_120-277707-0_sp1.1 from training. Duration: 0.72725 2023-10-03 19:07:53,475 WARNING [train.py:1204] (2/4) Exclude cut with ID _14970_210_15709_1_1530775547361_1966965_8-169924-0_sp1.1 from training. Duration: 0.836375 2023-10-03 19:07:53,476 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_4692_210_3528_1_1526899755_4323254_107-44401-0_sp0.9 from training. Duration: 0.9555625 2023-10-03 19:07:53,582 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_3475_210_2028_1_1526193578_3622569_265-276625-0_sp0.9 from training. Duration: 0.8444375 2023-10-03 19:07:56,784 WARNING [train.py:1204] (2/4) Exclude cut with ID _55153_210_2253_1_1534384786677_3162096_148-79022-0_sp1.1 from training. Duration: 0.87275 2023-10-03 19:07:56,866 WARNING [train.py:1204] (2/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_292-36336-0_sp1.1 from training. Duration: 0.92725 2023-10-03 19:07:56,876 WARNING [train.py:1204] (2/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_329-82235-0 from training. Duration: 0.82 2023-10-03 19:07:58,680 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_11305_210_1794_1_1529750428_7178148_257-312870-0 from training. Duration: 0.88 2023-10-03 19:08:00,255 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=1374606.6666666667, ans=0.1 2023-10-03 19:08:01,311 WARNING [train.py:1204] (2/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_436-4493-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 19:08:04,153 WARNING [train.py:1204] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_321-54129-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 19:08:04,162 WARNING [train.py:1204] (2/4) Exclude cut with ID _5287_210_2084_1_1527418883040_3878670_333-48508-0_sp0.9 from training. Duration: 0.6666875 2023-10-03 19:08:04,175 WARNING [train.py:1204] (2/4) Exclude cut with ID _26528_210_9070_1_1532570410842_3851310_244-278158-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 19:08:04,209 WARNING [train.py:1204] (2/4) Exclude cut with ID _46952_210_5986_1_1534230010357_3107379_232-344503-0_sp1.1 from training. Duration: 0.87275 2023-10-03 19:08:04,218 WARNING [train.py:1204] (2/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_43-206757-0 from training. Duration: 0.75 2023-10-03 19:08:04,220 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_4183_210_2087_1_1526729100_1869838_141-166600-0 from training. Duration: 0.95 2023-10-03 19:08:04,441 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.conv_module2.balancer2.min_positive, batch_count=1374606.6666666667, ans=0.05 2023-10-03 19:08:05,526 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_296-287678-0 from training. Duration: 0.95 2023-10-03 19:08:05,589 WARNING [train.py:1204] (2/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_265-60343-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 19:08:05,627 WARNING [train.py:1204] (2/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_332-256168-0 from training. Duration: 0.75 2023-10-03 19:08:05,661 WARNING [train.py:1204] (2/4) Exclude cut with ID _13679_210_7497_1_1530534293662_3355098_411-186088-0 from training. Duration: 0.94 2023-10-03 19:08:09,799 WARNING [train.py:1204] (2/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_458-126779-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 19:08:11,024 WARNING [train.py:1204] (2/4) Exclude cut with ID IC0803W0380-397572 from training. Duration: 0.032 2023-10-03 19:08:11,497 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.feed_forward3.out_whiten.whitening_limit, batch_count=1374606.6666666667, ans=15.0 2023-10-03 19:08:12,659 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_278-272612-0_sp1.1 from training. Duration: 0.663625 2023-10-03 19:08:12,798 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.0.feed_forward1.out_proj.dropout_p, batch_count=1374673.3333333333, ans=0.1 2023-10-03 19:08:15,364 WARNING [train.py:1204] (2/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_491-263101-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 19:08:15,369 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_53555_210_5632_1_1534595220_7232297_334-336778-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 19:08:16,760 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_50-3289-0 from training. Duration: 0.91 2023-10-03 19:08:18,728 WARNING [train.py:1204] (2/4) Exclude cut with ID _8699_210_2069_1_1528630346831_3960409_155-39766-0_sp0.9 from training. Duration: 0.8666875 2023-10-03 19:08:18,732 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_502-82145-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 19:08:20,066 WARNING [train.py:1204] (2/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_550-43132-0_sp1.1 from training. Duration: 0.97275 2023-10-03 19:08:20,094 WARNING [train.py:1204] (2/4) Exclude cut with ID _14998_210_5219_1_1530705582080_7474970_446-112117-0_sp1.1 from training. Duration: 0.8181875 2023-10-03 19:08:21,531 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_3691_210_3528_1_1526468169_4062053_355-334432-0_sp1.1 from training. Duration: 0.7909375 2023-10-03 19:08:23,519 WARNING [train.py:1204] (2/4) Exclude cut with ID _58605_210_12210_1_1534723195939_3904259_239-94720-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 19:08:23,685 WARNING [train.py:1204] (2/4) Exclude cut with ID _49376_210_5986_1_1533985278732_3370740_391-344072-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 19:08:25,020 WARNING [train.py:1204] (2/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_558-322014-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 19:08:26,277 WARNING [train.py:1204] (2/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_256-32662-0_sp1.1 from training. Duration: 0.8181875 2023-10-03 19:08:28,965 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.3.encoder.layers.0.feed_forward3.out_whiten, num_groups=1, num_channels=512, metric=12.13 vs. limit=15.0 2023-10-03 19:08:29,485 INFO [train.py:1046] (2/4) Epoch 39, batch 4350, loss[loss=0.15, simple_loss=0.2369, pruned_loss=0.03155, over 24653.00 frames. ], tot_loss[loss=0.1569, simple_loss=0.2363, pruned_loss=0.03874, over 4702876.76 frames. ], batch size: 65, lr: 2.60e-03, grad_scale: 8.0 2023-10-03 19:08:32,797 WARNING [train.py:1204] (2/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_572-185150-0 from training. Duration: 0.97 2023-10-03 19:08:32,840 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_89-35748-0_sp0.9 from training. Duration: 0.87775 2023-10-03 19:08:38,465 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6291_210_3273_1_1527683649_1078498_88-66747-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 19:08:41,141 WARNING [train.py:1204] (2/4) Exclude cut with ID _19504_210_9994_1_1531648863242_3325220_357-237029-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 19:08:42,715 WARNING [train.py:1204] (2/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_302-2550-0_sp1.1 from training. Duration: 0.736375 2023-10-03 19:08:42,715 WARNING [train.py:1204] (2/4) Exclude cut with ID _9461_210_4557_1_1529060514726_3677451_59-194017-0_sp1.1 from training. Duration: 0.963625 2023-10-03 19:08:45,518 WARNING [train.py:1204] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_665-196155-0_sp1.1 from training. Duration: 0.6090625 2023-10-03 19:08:48,831 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_326-315007-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 19:08:51,503 WARNING [train.py:1204] (2/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_105-328174-0_sp1.1 from training. Duration: 0.6909375 2023-10-03 19:08:51,511 WARNING [train.py:1204] (2/4) Exclude cut with ID _56494_210_4220_1_1535284837625_3033169_106-263722-0_sp1.1 from training. Duration: 0.97275 2023-10-03 19:08:54,631 WARNING [train.py:1204] (2/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_770-200430-0_sp0.9 from training. Duration: 0.988875 2023-10-03 19:08:56,499 WARNING [train.py:1204] (2/4) Exclude cut with ID _65640_210_6310_1_1535368201445_3173250_209-13008-0_sp1.1 from training. Duration: 0.9 2023-10-03 19:08:57,919 WARNING [train.py:1204] (2/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_212-149947-0_sp0.9 from training. Duration: 0.811125 2023-10-03 19:09:05,151 WARNING [train.py:1204] (2/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_188-171673-0 from training. Duration: 0.58 2023-10-03 19:09:05,229 WARNING [train.py:1204] (2/4) Exclude cut with ID _58895_210_8750_1_1535184081320_3649089_286-281724-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 19:09:06,603 WARNING [train.py:1204] (2/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_16-254909-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 19:09:10,837 WARNING [train.py:1204] (2/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_52-95655-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 19:09:12,305 WARNING [train.py:1204] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_554-45617-0 from training. Duration: 0.99 2023-10-03 19:09:13,508 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.595e+02 1.954e+02 2.148e+02 2.453e+02 3.516e+02, threshold=4.296e+02, percent-clipped=0.0 2023-10-03 19:09:15,027 WARNING [train.py:1204] (2/4) Exclude cut with ID _14683_210_5219_1_1530698352608_3974859_473-344611-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 19:09:17,605 WARNING [train.py:1204] (2/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_17-25045-0_sp0.9 from training. Duration: 0.6555625 2023-10-03 19:09:19,141 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.0.layers.1.feed_forward3.out_whiten, num_groups=1, num_channels=192, metric=11.63 vs. limit=15.0 2023-10-03 19:09:21,008 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9072W0003-530637 from training. Duration: 0.768 2023-10-03 19:09:22,410 WARNING [train.py:1204] (2/4) Exclude cut with ID _38670_210_15379_1_1533448810659_4391059_76-74025-0_sp1.1 from training. Duration: 0.936375 2023-10-03 19:09:23,786 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6291_210_3273_1_1527683649_1078498_74-292608-0_sp0.9 from training. Duration: 0.8 2023-10-03 19:09:25,044 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9084W0025-536862 from training. Duration: 0.896 2023-10-03 19:09:25,119 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9087W0021-538392 from training. Duration: 0.768 2023-10-03 19:09:25,124 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7543_210_6753_1_1528543323_645515_56-163234-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 19:09:25,155 WARNING [train.py:1204] (2/4) Exclude cut with ID _45560_210_21982_1_1533812603951_3584999_44-332643-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 19:09:27,174 WARNING [train.py:1204] (2/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_1285-256622-0_sp1.1 from training. Duration: 0.763625 2023-10-03 19:09:27,348 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.1.feed_forward1.out_proj.dropout_p, batch_count=1375006.6666666667, ans=0.1 2023-10-03 19:09:28,577 WARNING [train.py:1204] (2/4) Exclude cut with ID _52781_210_16187_1_1534297729099_1118149_141-325632-0_sp1.1 from training. Duration: 0.936375 2023-10-03 19:09:28,638 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_40896_210_15710_1_1533433956_7100802_1051-137631-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 19:09:28,683 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_13091_210_7925_1_1530609447_8987354_233-18333-0_sp1.1 from training. Duration: 0.97275 2023-10-03 19:09:31,396 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_34008_210_10431_1_1533345424_8183247_662-317331-0 from training. Duration: 0.94 2023-10-03 19:09:33,162 WARNING [train.py:1204] (2/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_291-248920-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 19:09:33,164 WARNING [train.py:1204] (2/4) Exclude cut with ID _65263_210_22356_1_1535763598691_3921037_274-288481-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 19:09:33,186 WARNING [train.py:1204] (2/4) Exclude cut with ID _5189_210_4557_1_1527248319428_3769229_233-168080-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 19:09:33,257 WARNING [train.py:1204] (2/4) Exclude cut with ID _54871_210_22356_1_1534665798253_3856359_135-184281-0 from training. Duration: 0.99 2023-10-03 19:09:33,455 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.conv_module1.balancer1.prob, batch_count=1375006.6666666667, ans=0.125 2023-10-03 19:09:34,670 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9123W0012-555972 from training. Duration: 0.896 2023-10-03 19:09:34,674 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9123W0118-556077 from training. Duration: 0.896 2023-10-03 19:09:34,682 WARNING [train.py:1204] (2/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_406-275649-0 from training. Duration: 0.42 2023-10-03 19:09:37,484 WARNING [train.py:1204] (2/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_309-13684-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 19:09:37,511 WARNING [train.py:1204] (2/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_129-337802-0_sp1.1 from training. Duration: 0.6545625 2023-10-03 19:09:37,539 WARNING [train.py:1204] (2/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_649-266825-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 19:09:38,853 WARNING [train.py:1204] (2/4) Exclude cut with ID _40519_210_13794_1_1533715228655_7337390_895-224742-0_sp1.1 from training. Duration: 0.92725 2023-10-03 19:09:40,275 WARNING [train.py:1204] (2/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_234-14174-0 from training. Duration: 0.82 2023-10-03 19:09:40,789 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.5.encoder.layers.0.nonlin_attention.whiten2, num_groups=1, num_channels=256, metric=3.38 vs. limit=15.0 2023-10-03 19:09:41,651 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9156W0009-572502 from training. Duration: 0.768 2023-10-03 19:09:41,664 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_424-245529-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 19:09:42,944 INFO [train.py:1046] (2/4) Epoch 39, batch 4400, loss[loss=0.1538, simple_loss=0.2413, pruned_loss=0.03312, over 24645.00 frames. ], tot_loss[loss=0.1577, simple_loss=0.2372, pruned_loss=0.03914, over 4710668.66 frames. ], batch size: 68, lr: 2.60e-03, grad_scale: 16.0 2023-10-03 19:09:43,261 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.feed_forward1.hidden_balancer.prob, batch_count=1375073.3333333333, ans=0.125 2023-10-03 19:09:45,802 WARNING [train.py:1204] (2/4) Exclude cut with ID _26528_210_9070_1_1532570410842_3851310_232-10149-0_sp1.1 from training. Duration: 0.963625 2023-10-03 19:09:45,810 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_17292_210_6753_1_1531125388_2943734_152-30410-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 19:09:47,154 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_35441_210_16554_1_1533347572_4092680_291-82075-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 19:09:49,027 WARNING [train.py:1204] (2/4) Exclude cut with ID _12369_210_4381_1_1530356078581_4388520_64-298917-0 from training. Duration: 0.88 2023-10-03 19:09:49,062 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_5618_210_1874_1_1527311008_5851_1-214501-0_sp0.9 from training. Duration: 0.7655625 2023-10-03 19:09:49,097 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_9460_210_4211_1_1528954587_10049667_572-37035-0 from training. Duration: 0.8 2023-10-03 19:09:50,370 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9193W0023-590322 from training. Duration: 0.896 2023-10-03 19:09:50,630 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.feed_forward1.hidden_balancer.prob, batch_count=1375073.3333333333, ans=0.125 2023-10-03 19:09:51,792 WARNING [train.py:1204] (2/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_206-10097-0_sp1.1 from training. Duration: 0.4454375 2023-10-03 19:09:51,818 WARNING [train.py:1204] (2/4) Exclude cut with ID _55134_210_21982_1_1534590028377_3882170_17-51731-0_sp1.1 from training. Duration: 0.963625 2023-10-03 19:09:53,237 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_14953_210_2151_1_1530701710_5616165_710-165400-0 from training. Duration: 0.87 2023-10-03 19:09:56,834 WARNING [train.py:1204] (2/4) Exclude cut with ID _53203_210_2087_1_1534246218309_475590_61-85777-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 19:09:56,923 WARNING [train.py:1204] (2/4) Exclude cut with ID _55844_210_2346_1_1534471212569_7625707_1050-10453-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 19:09:58,335 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9218W0013-603297 from training. Duration: 0.896 2023-10-03 19:09:59,843 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_20120_210_6753_1_1531643035_5482859_267-102818-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 19:09:59,844 WARNING [train.py:1204] (2/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_191-41023-0 from training. Duration: 0.71 2023-10-03 19:09:59,967 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.balancer2.prob, batch_count=1375140.0, ans=0.125 2023-10-03 19:10:01,187 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9231W0202-610227 from training. Duration: 0.896 2023-10-03 19:10:05,623 WARNING [train.py:1204] (2/4) Exclude cut with ID _6537_210_1883_1_1527987559710_3597129_382-133062-0 from training. Duration: 0.68 2023-10-03 19:10:06,972 WARNING [train.py:1204] (2/4) Exclude cut with ID _28427_210_4220_1_1532581240967_3061069_43-84928-0 from training. Duration: 0.99 2023-10-03 19:10:07,012 WARNING [train.py:1204] (2/4) Exclude cut with ID _59256_210_5986_1_1535076085997_3589120_121-237386-0 from training. Duration: 0.96 2023-10-03 19:10:08,248 WARNING [train.py:1204] (2/4) Exclude cut with ID _8480_210_1874_1_1528589391205_6728330_137-256986-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 19:10:09,635 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_9417_210_1794_1_1529235261_4614131_479-346963-0_sp1.1 from training. Duration: 0.936375 2023-10-03 19:10:09,660 WARNING [train.py:1204] (2/4) Exclude cut with ID _26474_210_9570_1_1532520022699_3623380_267-7362-0_sp1.1 from training. Duration: 0.936375 2023-10-03 19:10:11,063 WARNING [train.py:1204] (2/4) Exclude cut with ID _55328_210_27767_1_1534580973260_4088170_287-61272-0_sp1.1 from training. Duration: 0.97275 2023-10-03 19:10:12,531 WARNING [train.py:1204] (2/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_589-9070-0 from training. Duration: 0.71 2023-10-03 19:10:12,538 WARNING [train.py:1204] (2/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_148-158509-0_sp1.1 from training. Duration: 0.37275 2023-10-03 19:10:13,882 WARNING [train.py:1204] (2/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_164-184548-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 19:10:16,634 WARNING [train.py:1204] (2/4) Exclude cut with ID _14970_210_15709_1_1530775547361_1966965_6-23328-0_sp1.1 from training. Duration: 0.7818125 2023-10-03 19:10:16,639 WARNING [train.py:1204] (2/4) Exclude cut with ID _29303_210_2575_1_1532392237303_7410649_612-165069-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 19:10:16,972 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.conv_skip_rate, batch_count=1375206.6666666667, ans=0.0 2023-10-03 19:10:18,046 WARNING [train.py:1204] (2/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_383-321224-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 19:10:18,081 WARNING [train.py:1204] (2/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_283-160525-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 19:10:18,091 WARNING [train.py:1204] (2/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_505-97987-0 from training. Duration: 0.86 2023-10-03 19:10:19,399 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9309W0190-640317 from training. Duration: 0.768 2023-10-03 19:10:24,140 WARNING [train.py:1204] (2/4) Exclude cut with ID _50093_210_4220_1_1534417141207_3432299_280-297390-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 19:10:30,254 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_17292_210_6753_1_1531125388_2943734_320-71385-0_sp1.1 from training. Duration: 0.97275 2023-10-03 19:10:32,254 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_213-103282-0 from training. Duration: 0.78 2023-10-03 19:10:36,358 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7615_210_4211_1_1528110965_4580160_72-235211-0_sp0.9 from training. Duration: 0.8444375 2023-10-03 19:10:37,850 WARNING [train.py:1204] (2/4) Exclude cut with ID _14867_210_1968_1_1530684179890_3846969_173-320977-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 19:10:41,168 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7046_210_6753_1_1527895337_6920917_783-278109-0_sp0.9 from training. Duration: 0.8555625 2023-10-03 19:10:42,455 WARNING [train.py:1204] (2/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_90-30673-0 from training. Duration: 0.84 2023-10-03 19:10:42,484 WARNING [train.py:1204] (2/4) Exclude cut with ID _22826_210_9994_1_1531908201522_3279169_415-161-0_sp1.1 from training. Duration: 0.8909375 2023-10-03 19:10:42,501 WARNING [train.py:1204] (2/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_86-66192-0_sp0.9 from training. Duration: 0.611125 2023-10-03 19:10:42,503 WARNING [train.py:1204] (2/4) Exclude cut with ID _4626_210_3532_1_1526817652717_3916859_446-206656-0_sp1.1 from training. Duration: 0.6909375 2023-10-03 19:10:43,698 WARNING [train.py:1204] (2/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_166-128941-0_sp0.9 from training. Duration: 0.6 2023-10-03 19:10:47,911 WARNING [train.py:1204] (2/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_345-167986-0 from training. Duration: 0.84 2023-10-03 19:10:50,556 WARNING [train.py:1204] (2/4) Exclude cut with ID _6275_210_2084_1_1528110062213_7669049_973-198085-0 from training. Duration: 0.75 2023-10-03 19:10:52,388 WARNING [train.py:1204] (2/4) Exclude cut with ID _60057_210_10640_1_1534903065793_3333150_171-181133-0 from training. Duration: 0.88 2023-10-03 19:10:52,401 WARNING [train.py:1204] (2/4) Exclude cut with ID _19504_210_9994_1_1531648863242_3325220_336-196513-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 19:10:52,410 WARNING [train.py:1204] (2/4) Exclude cut with ID _6493_210_1747_1_1528459210424_4012470_513-140967-0 from training. Duration: 0.79 2023-10-03 19:10:53,903 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6673_210_3273_1_1527767790_3173240_327-306185-0_sp0.9 from training. Duration: 0.92225 2023-10-03 19:10:56,639 INFO [train.py:1046] (2/4) Epoch 39, batch 4450, loss[loss=0.205, simple_loss=0.2723, pruned_loss=0.06882, over 19174.00 frames. ], tot_loss[loss=0.1588, simple_loss=0.2383, pruned_loss=0.03971, over 4705503.35 frames. ], batch size: 388, lr: 2.60e-03, grad_scale: 16.0 2023-10-03 19:10:56,706 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1032-236863-0_sp1.1 from training. Duration: 0.763625 2023-10-03 19:10:58,206 WARNING [train.py:1204] (2/4) Exclude cut with ID _14867_210_1968_1_1530684179890_3846969_413-29292-0 from training. Duration: 0.9 2023-10-03 19:11:01,655 WARNING [train.py:1204] (2/4) Exclude cut with ID _47251_210_18628_1_1533814063128_4137250_186-343322-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 19:11:03,680 WARNING [train.py:1204] (2/4) Exclude cut with ID _56235_210_10640_1_1534557335235_4161099_624-46091-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 19:11:04,943 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_791-197891-0_sp0.9 from training. Duration: 0.9333125 2023-10-03 19:11:10,559 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_50050_210_16223_1_1535435796_7290138_786-288457-0_sp1.1 from training. Duration: 0.92725 2023-10-03 19:11:10,582 WARNING [train.py:1204] (2/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_213-227283-0_sp1.1 from training. Duration: 0.963625 2023-10-03 19:11:15,207 WARNING [train.py:1204] (2/4) Exclude cut with ID _42659_210_2070_1_1533607442765_3120380_58-341121-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 19:11:17,923 WARNING [train.py:1204] (2/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_506-23200-0_sp1.1 from training. Duration: 0.8818125 2023-10-03 19:11:19,989 WARNING [train.py:1204] (2/4) Exclude cut with ID _14170_210_5219_1_1530525949868_4469650_336-213530-0_sp1.1 from training. Duration: 0.8181875 2023-10-03 19:11:20,008 WARNING [train.py:1204] (2/4) Exclude cut with ID _64135_210_2253_1_1535677267876_7608972_180-5312-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 19:11:21,166 WARNING [train.py:1204] (2/4) Exclude cut with ID _12302_210_5219_1_1529924446466_8107280_835-279754-0 from training. Duration: 0.9 2023-10-03 19:11:21,168 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_20120_210_6753_1_1531643035_5482859_510-96996-0_sp1.1 from training. Duration: 0.7909375 2023-10-03 19:11:21,909 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.1.encoder.layers.0.conv_module2.whiten, num_groups=1, num_channels=256, metric=6.71 vs. limit=15.0 2023-10-03 19:11:22,645 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_15517_210_6753_1_1530954008_3487336_216-187834-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 19:11:23,932 WARNING [train.py:1204] (2/4) Exclude cut with ID _19504_210_9994_1_1531648863242_3325220_414-113427-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 19:11:23,933 WARNING [train.py:1204] (2/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_264-108685-0_sp0.9 from training. Duration: 0.611125 2023-10-03 19:11:26,589 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_5438_210_1794_1_1527336687_2600009_104-288274-0_sp0.9 from training. Duration: 0.6444375 2023-10-03 19:11:31,320 WARNING [train.py:1204] (2/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_520-39496-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 19:11:31,374 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_37818_210_4238_1_1533620910_4720083_315-308565-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 19:11:32,749 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_2009_210_2071_1_1525481134_4588937_385-302100-0_sp1.1 from training. Duration: 0.7909375 2023-10-03 19:11:32,788 WARNING [train.py:1204] (2/4) Exclude cut with ID _58895_210_8750_1_1535184081320_3649089_277-53219-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 19:11:34,166 WARNING [train.py:1204] (2/4) Exclude cut with ID _26664_210_7770_1_1532258943280_3418270_121-111258-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 19:11:37,474 WARNING [train.py:1204] (2/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_99-167392-0_sp1.1 from training. Duration: 0.5545625 2023-10-03 19:11:38,877 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1113-7369-0 from training. Duration: 0.85 2023-10-03 19:11:38,893 WARNING [train.py:1204] (2/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_352-59958-0 from training. Duration: 0.86 2023-10-03 19:11:38,905 WARNING [train.py:1204] (2/4) Exclude cut with ID _14886_210_4220_1_1530702020355_3516190_159-73748-0_sp1.1 from training. Duration: 0.7545625 2023-10-03 19:11:41,517 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.535e+02 1.929e+02 2.104e+02 2.393e+02 3.740e+02, threshold=4.208e+02, percent-clipped=0.0 2023-10-03 19:11:41,621 WARNING [train.py:1204] (2/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_131-119569-0_sp1.1 from training. Duration: 0.92725 2023-10-03 19:11:41,697 WARNING [train.py:1204] (2/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_216-343007-0 from training. Duration: 0.66 2023-10-03 19:11:45,657 WARNING [train.py:1204] (2/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_113-92608-0_sp0.9 from training. Duration: 0.6 2023-10-03 19:11:49,073 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_40047_210_2290_1_1533290519_2374311_132-123271-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 19:11:50,446 WARNING [train.py:1204] (2/4) Exclude cut with ID _8793_210_6310_1_1528542985139_2310009_121-179782-0 from training. Duration: 0.8 2023-10-03 19:11:50,461 WARNING [train.py:1204] (2/4) Exclude cut with ID _43997_210_4525_1_1533538632555_5189329_283-218639-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 19:11:50,464 WARNING [train.py:1204] (2/4) Exclude cut with ID _49376_210_5986_1_1533985278732_3370740_297-78397-0_sp1.1 from training. Duration: 0.936375 2023-10-03 19:11:51,784 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_3475_210_2028_1_1526193578_3622569_377-166133-0_sp0.9 from training. Duration: 0.9555625 2023-10-03 19:11:51,797 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_520-213149-0_sp1.1 from training. Duration: 0.92725 2023-10-03 19:11:53,226 WARNING [train.py:1204] (2/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_567-144483-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 19:11:53,414 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.feed_forward1.hidden_balancer.prob, batch_count=1375606.6666666667, ans=0.125 2023-10-03 19:11:56,022 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_4183_210_2087_1_1526729100_1869838_143-280962-0_sp0.9 from training. Duration: 0.62225 2023-10-03 19:11:56,073 WARNING [train.py:1204] (2/4) Exclude cut with ID _49643_210_7989_1_1534640244224_3939031_611-185515-0 from training. Duration: 0.94 2023-10-03 19:11:59,236 WARNING [train.py:1204] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_88-341718-0_sp0.9 from training. Duration: 0.7333125 2023-10-03 19:11:59,448 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.balancer_ff2.min_abs, batch_count=1375673.3333333333, ans=0.1 2023-10-03 19:12:00,653 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_51225_210_3601_1_1534400634_2327383_49-235441-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 19:12:00,742 WARNING [train.py:1204] (2/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_240-294400-0_sp1.1 from training. Duration: 0.936375 2023-10-03 19:12:03,405 WARNING [train.py:1204] (2/4) Exclude cut with ID _55429_210_10626_1_1534467588527_7174143_640-304825-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 19:12:03,425 WARNING [train.py:1204] (2/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_187-221566-0_sp1.1 from training. Duration: 0.3909375 2023-10-03 19:12:06,848 WARNING [train.py:1204] (2/4) Exclude cut with ID _7606_210_4915_1_1528894253277_5080690_127-177761-0_sp0.9 from training. Duration: 0.711125 2023-10-03 19:12:09,618 WARNING [train.py:1204] (2/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_430-116139-0 from training. Duration: 0.85 2023-10-03 19:12:10,909 INFO [train.py:1046] (2/4) Epoch 39, batch 4500, loss[loss=0.1426, simple_loss=0.2155, pruned_loss=0.03489, over 24423.00 frames. ], tot_loss[loss=0.1592, simple_loss=0.2385, pruned_loss=0.03996, over 4700498.82 frames. ], batch size: 58, lr: 2.60e-03, grad_scale: 16.0 2023-10-03 19:12:12,301 WARNING [train.py:1204] (2/4) Exclude cut with ID _3675_210_4557_1_1526639934408_4182089_456-18305-0_sp0.9 from training. Duration: 0.8444375 2023-10-03 19:12:14,596 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.2.encoder.layers.1.self_attn1.whiten, num_groups=1, num_channels=384, metric=17.92 vs. limit=22.5 2023-10-03 19:12:15,178 WARNING [train.py:1204] (2/4) Exclude cut with ID _18513_210_9206_1_1531618215223_3862620_402-5957-0_sp1.1 from training. Duration: 0.9 2023-10-03 19:12:16,293 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.0.layers.0.self_attn1.whiten, num_groups=1, num_channels=192, metric=13.03 vs. limit=22.5 2023-10-03 19:12:16,666 WARNING [train.py:1204] (2/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_465-156500-0 from training. Duration: 0.72 2023-10-03 19:12:16,667 WARNING [train.py:1204] (2/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_714-310538-0 from training. Duration: 0.86 2023-10-03 19:12:18,686 WARNING [train.py:1204] (2/4) Exclude cut with ID _39411_210_5986_1_1533538825030_3343649_335-301187-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 19:12:24,792 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_41750_210_1960_1_1533520678_3948042_241-158616-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 19:12:24,846 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_46013_210_7234_1_1533776362_3744032_50-141535-0_sp1.1 from training. Duration: 0.9 2023-10-03 19:12:24,920 WARNING [train.py:1204] (2/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_43-206757-0_sp0.9 from training. Duration: 0.8333125 2023-10-03 19:12:26,305 WARNING [train.py:1204] (2/4) Exclude cut with ID _22961_210_8254_1_1531976366672_4278180_172-276296-0_sp1.1 from training. Duration: 0.97275 2023-10-03 19:12:26,340 WARNING [train.py:1204] (2/4) Exclude cut with ID _17956_210_4353_1_1531209568125_6979970_1305-301903-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 19:12:27,592 WARNING [train.py:1204] (2/4) Exclude cut with ID _28323_210_16187_1_1532480395667_4100300_604-44507-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 19:12:31,425 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.self_attn_weights.whiten_keys.whitening_limit, batch_count=1375806.6666666667, ans=6.0 2023-10-03 19:12:39,698 WARNING [train.py:1204] (2/4) Exclude cut with ID _23514_210_1389_1_1531832139785_3031439_88-337544-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 19:12:39,774 WARNING [train.py:1204] (2/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_932-3672-0_sp1.1 from training. Duration: 0.963625 2023-10-03 19:12:42,531 WARNING [train.py:1204] (2/4) Exclude cut with ID _46502_210_13794_1_1533724452198_3657159_207-109991-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 19:12:43,853 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_216-73841-0_sp1.1 from training. Duration: 0.87275 2023-10-03 19:12:43,943 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8453_210_4211_1_1528502940_8562730_347-330673-0_sp1.1 from training. Duration: 0.6545625 2023-10-03 19:12:49,892 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_4183_210_2087_1_1526729100_1869838_143-280962-0_sp1.1 from training. Duration: 0.5090625 2023-10-03 19:12:54,606 WARNING [train.py:1204] (2/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_325-113798-0_sp1.1 from training. Duration: 0.77275 2023-10-03 19:12:57,460 WARNING [train.py:1204] (2/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_91-212486-0_sp0.9 from training. Duration: 0.7555625 2023-10-03 19:13:00,180 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8776_210_2040_1_1528638837_4088036_244-278763-0_sp1.1 from training. Duration: 0.8181875 2023-10-03 19:13:01,948 WARNING [train.py:1204] (2/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_106-286130-0 from training. Duration: 0.98 2023-10-03 19:13:03,262 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_2746_210_1988_1_1525872816_3795627_372-205861-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 19:13:03,308 WARNING [train.py:1204] (2/4) Exclude cut with ID _30800_210_22382_1_1532499347904_4515959_240-54646-0_sp1.1 from training. Duration: 0.936375 2023-10-03 19:13:06,047 WARNING [train.py:1204] (2/4) Exclude cut with ID _59500_210_8254_1_1534809610680_3736009_162-180695-0_sp1.1 from training. Duration: 0.936375 2023-10-03 19:13:06,071 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7544_210_6753_1_1528591327_1471426_146-93357-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 19:13:06,236 WARNING [train.py:1204] (2/4) Exclude cut with ID _42657_210_2070_1_1533520654016_3327008_405-87432-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 19:13:07,977 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_4528_210_3273_1_1527076654_3276777_209-219804-0 from training. Duration: 0.69 2023-10-03 19:13:07,978 WARNING [train.py:1204] (2/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_78-182912-0_sp0.9 from training. Duration: 0.6555625 2023-10-03 19:13:07,985 WARNING [train.py:1204] (2/4) Exclude cut with ID _31255_210_13427_1_1532865619495_3815143_322-114998-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 19:13:13,657 WARNING [train.py:1204] (2/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_848-280957-0_sp0.9 from training. Duration: 0.9666875 2023-10-03 19:13:15,001 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7696_210_2040_1_1528207167_3716099_117-125434-0_sp0.9 from training. Duration: 0.8444375 2023-10-03 19:13:16,586 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_1002-109192-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 19:13:19,202 WARNING [train.py:1204] (2/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_138-64210-0_sp0.9 from training. Duration: 0.811125 2023-10-03 19:13:19,225 WARNING [train.py:1204] (2/4) Exclude cut with ID _25808_210_13292_1_1532343514562_7366109_633-47524-0_sp1.1 from training. Duration: 0.9 2023-10-03 19:13:19,332 WARNING [train.py:1204] (2/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_297-5233-0 from training. Duration: 0.95 2023-10-03 19:13:22,063 WARNING [train.py:1204] (2/4) Exclude cut with ID _8394_210_6702_1_1528590504316_3540189_335-274780-0 from training. Duration: 0.78 2023-10-03 19:13:23,270 WARNING [train.py:1204] (2/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_301-330455-0 from training. Duration: 0.8 2023-10-03 19:13:25,973 INFO [train.py:1046] (2/4) Epoch 39, batch 4550, loss[loss=0.1553, simple_loss=0.2276, pruned_loss=0.04153, over 23973.00 frames. ], tot_loss[loss=0.1579, simple_loss=0.2371, pruned_loss=0.03929, over 4701732.04 frames. ], batch size: 196, lr: 2.60e-03, grad_scale: 16.0 2023-10-03 19:13:26,123 WARNING [train.py:1204] (2/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_383-205833-0 from training. Duration: 0.95 2023-10-03 19:13:27,596 WARNING [train.py:1204] (2/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_48-182155-0 from training. Duration: 0.87 2023-10-03 19:13:28,903 WARNING [train.py:1204] (2/4) Exclude cut with ID _14832_210_8750_1_1530691351539_3171348_1-54315-0_sp1.1 from training. Duration: 0.7909375 2023-10-03 19:13:33,276 WARNING [train.py:1204] (2/4) Exclude cut with ID _44982_210_1511_1_1534312820108_3726029_413-100814-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 19:13:33,337 WARNING [train.py:1204] (2/4) Exclude cut with ID _3231_210_2106_1_1526205864934_6784220_104-261392-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 19:13:36,027 WARNING [train.py:1204] (2/4) Exclude cut with ID _24314_210_4915_1_1532135078116_7170179_1-13588-0_sp1.1 from training. Duration: 0.97275 2023-10-03 19:13:38,868 WARNING [train.py:1204] (2/4) Exclude cut with ID _44304_210_15374_1_1533639352765_4613129_141-8356-0_sp1.1 from training. Duration: 0.963625 2023-10-03 19:13:40,778 WARNING [train.py:1204] (2/4) Exclude cut with ID _37892_210_19596_1_1533198677897_3964058_374-335034-0_sp1.1 from training. Duration: 0.936375 2023-10-03 19:13:40,995 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.self_attn_weights.pos_emb_skip_rate, batch_count=1376140.0, ans=0.0 2023-10-03 19:13:43,485 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_195-257512-0_sp0.9 from training. Duration: 0.7444375 2023-10-03 19:13:43,487 WARNING [train.py:1204] (2/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_1267-272693-0_sp1.1 from training. Duration: 0.663625 2023-10-03 19:13:43,488 WARNING [train.py:1204] (2/4) Exclude cut with ID _30196_210_1664_1_1533708496522_7074140_690-326155-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 19:13:45,034 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_68847_210_14656_1_1536070214_2586843_38-20717-0_sp1.1 from training. Duration: 0.97275 2023-10-03 19:13:46,325 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_99-251314-0_sp1.1 from training. Duration: 0.7909375 2023-10-03 19:13:48,423 WARNING [train.py:1204] (2/4) Exclude cut with ID _49376_210_5986_1_1533985278732_3370740_225-95630-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 19:13:51,061 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_2655_210_2087_1_1525688959_3897400_202-147557-0 from training. Duration: 0.85 2023-10-03 19:13:52,866 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_5618_210_1874_1_1527311008_5851_1-214501-0_sp1.1 from training. Duration: 0.626375 2023-10-03 19:13:52,937 WARNING [train.py:1204] (2/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_225-43381-0_sp1.1 from training. Duration: 0.8545625 2023-10-03 19:13:54,421 WARNING [train.py:1204] (2/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_242-3472-0 from training. Duration: 0.73 2023-10-03 19:14:00,285 WARNING [train.py:1204] (2/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_339-104791-0 from training. Duration: 0.49 2023-10-03 19:14:00,344 WARNING [train.py:1204] (2/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_488-92261-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 19:14:01,845 WARNING [train.py:1204] (2/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_175-163894-0 from training. Duration: 0.74 2023-10-03 19:14:04,613 WARNING [train.py:1204] (2/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_41-234353-0_sp0.9 from training. Duration: 0.7555625 2023-10-03 19:14:08,642 WARNING [train.py:1204] (2/4) Exclude cut with ID _47251_210_18628_1_1533814063128_4137250_116-275927-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 19:14:08,672 WARNING [train.py:1204] (2/4) Exclude cut with ID _4697_210_2069_1_1526815801953_3823098_61-223324-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 19:14:09,908 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.625e+02 1.921e+02 2.129e+02 2.564e+02 3.877e+02, threshold=4.258e+02, percent-clipped=0.0 2023-10-03 19:14:09,976 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6961_210_2087_1_1527937168_6815011_562-165518-0_sp1.1 from training. Duration: 0.7 2023-10-03 19:14:10,160 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder_embed.dropout.p, batch_count=1376273.3333333333, ans=0.1 2023-10-03 19:14:11,406 WARNING [train.py:1204] (2/4) Exclude cut with ID _21989_210_5854_1_1531899214931_3271360_133-44759-0 from training. Duration: 0.77 2023-10-03 19:14:13,465 WARNING [train.py:1204] (2/4) Exclude cut with ID _23557_210_8750_1_1531890106370_6846255_77-284923-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 19:14:15,789 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.1.encoder.layers.1.conv_module2.whiten, num_groups=1, num_channels=256, metric=12.37 vs. limit=15.0 2023-10-03 19:14:16,305 WARNING [train.py:1204] (2/4) Exclude cut with ID _13678_210_7497_1_1530530069226_3463170_464-296127-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 19:14:16,326 WARNING [train.py:1204] (2/4) Exclude cut with ID _22613_210_5219_1_1532253599686_4090170_393-36663-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 19:14:17,702 WARNING [train.py:1204] (2/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_235-262124-0_sp0.9 from training. Duration: 0.7444375 2023-10-03 19:14:19,645 WARNING [train.py:1204] (2/4) Exclude cut with ID _8480_210_1874_1_1528589391205_6728330_755-112625-0 from training. Duration: 0.74 2023-10-03 19:14:20,940 WARNING [train.py:1204] (2/4) Exclude cut with ID _6456_210_1534_1_1527681634212_3181049_215-309359-0 from training. Duration: 0.88 2023-10-03 19:14:20,956 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_3019_210_2028_1_1526044245_3264865_191-72235-0_sp1.1 from training. Duration: 0.8818125 2023-10-03 19:14:21,457 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.5.encoder.layers.1.feed_forward3.out_whiten, num_groups=1, num_channels=256, metric=14.07 vs. limit=15.0 2023-10-03 19:14:22,298 WARNING [train.py:1204] (2/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_380-162648-0 from training. Duration: 0.94 2023-10-03 19:14:23,711 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_92-46009-0 from training. Duration: 0.54 2023-10-03 19:14:23,734 WARNING [train.py:1204] (2/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_53-300544-0_sp0.9 from training. Duration: 0.7444375 2023-10-03 19:14:25,199 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_68235_210_1925_1_1535863247_4484656_101-250943-0_sp1.1 from training. Duration: 0.97275 2023-10-03 19:14:25,208 WARNING [train.py:1204] (2/4) Exclude cut with ID _14970_210_15709_1_1530775547361_1966965_204-22182-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 19:14:25,298 WARNING [train.py:1204] (2/4) Exclude cut with ID _55153_210_2253_1_1534384786677_3162096_310-130092-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 19:14:25,314 WARNING [train.py:1204] (2/4) Exclude cut with ID _60113_210_16271_1_1535164212481_4386079_559-167191-0_sp1.1 from training. Duration: 0.8454375 2023-10-03 19:14:27,262 WARNING [train.py:1204] (2/4) Exclude cut with ID _14368_210_4220_1_1530532714870_3595529_93-58002-0_sp1.1 from training. Duration: 0.5181875 2023-10-03 19:14:28,596 WARNING [train.py:1204] (2/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_110-224688-0 from training. Duration: 0.97 2023-10-03 19:14:31,554 WARNING [train.py:1204] (2/4) Exclude cut with ID _40402_210_18219_1_1533522324115_2953150_178-178004-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 19:14:31,571 WARNING [train.py:1204] (2/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_321-335475-0_sp1.1 from training. Duration: 0.4090625 2023-10-03 19:14:31,624 WARNING [train.py:1204] (2/4) Exclude cut with ID _66863_210_18053_1_1535698758334_8590319_1092-271784-0 from training. Duration: 0.96 2023-10-03 19:14:31,630 WARNING [train.py:1204] (2/4) Exclude cut with ID _28317_210_12742_1_1532480302381_8510325_607-213951-0_sp1.1 from training. Duration: 0.763625 2023-10-03 19:14:31,644 WARNING [train.py:1204] (2/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_468-283113-0 from training. Duration: 0.96 2023-10-03 19:14:35,708 WARNING [train.py:1204] (2/4) Exclude cut with ID _14922_210_6228_1_1530707435273_3693310_13-165512-0_sp1.1 from training. Duration: 0.7454375 2023-10-03 19:14:35,720 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_69630_210_7234_1_1535788670_910989_56-169762-0_sp1.1 from training. Duration: 0.92725 2023-10-03 19:14:37,217 WARNING [train.py:1204] (2/4) Exclude cut with ID _67335_210_5219_1_1535525975652_4275229_121-80357-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 19:14:37,258 WARNING [train.py:1204] (2/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_126-12079-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 19:14:38,557 WARNING [train.py:1204] (2/4) Exclude cut with ID _5294_210_1747_1_1527249643928_3696179_342-270278-0_sp1.1 from training. Duration: 0.5 2023-10-03 19:14:40,462 INFO [train.py:1046] (2/4) Epoch 39, batch 4600, loss[loss=0.1431, simple_loss=0.2301, pruned_loss=0.02806, over 24440.00 frames. ], tot_loss[loss=0.1569, simple_loss=0.2357, pruned_loss=0.03908, over 4698729.42 frames. ], batch size: 63, lr: 2.60e-03, grad_scale: 16.0 2023-10-03 19:14:40,525 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_30322_210_1925_1_1532843478_826817_35-328264-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 19:14:41,935 WARNING [train.py:1204] (2/4) Exclude cut with ID _8480_210_1874_1_1528589391205_6728330_755-112625-0_sp1.1 from training. Duration: 0.67275 2023-10-03 19:14:42,069 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.feed_forward2.hidden_balancer.prob, batch_count=1376406.6666666667, ans=0.125 2023-10-03 19:14:44,750 WARNING [train.py:1204] (2/4) Exclude cut with ID _38670_210_15379_1_1533448810659_4391059_132-146412-0_sp1.1 from training. Duration: 0.97275 2023-10-03 19:14:46,074 WARNING [train.py:1204] (2/4) Exclude cut with ID _22020_210_2461_1_1531724164513_6326900_563-39691-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 19:14:47,595 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_9460_210_4211_1_1528954587_10049667_572-37035-0_sp1.1 from training. Duration: 0.72725 2023-10-03 19:14:47,613 WARNING [train.py:1204] (2/4) Exclude cut with ID _68777_210_4353_1_1535810390731_3117904_129-166012-0_sp0.9 from training. Duration: 0.9444375 2023-10-03 19:14:47,682 WARNING [train.py:1204] (2/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_487-91921-0_sp1.1 from training. Duration: 0.963625 2023-10-03 19:14:49,704 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_195-257512-0 from training. Duration: 0.67 2023-10-03 19:14:49,806 WARNING [train.py:1204] (2/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_188-260011-0_sp1.1 from training. Duration: 0.663625 2023-10-03 19:14:53,618 WARNING [train.py:1204] (2/4) Exclude cut with ID _8480_210_1874_1_1528589391205_6728330_309-139896-0_sp0.9 from training. Duration: 0.9 2023-10-03 19:14:54,954 WARNING [train.py:1204] (2/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_866-321248-0_sp1.1 from training. Duration: 0.963625 2023-10-03 19:14:57,045 WARNING [train.py:1204] (2/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_510-216513-0_sp1.1 from training. Duration: 0.97275 2023-10-03 19:15:04,491 WARNING [train.py:1204] (2/4) Exclude cut with ID _6653_210_2629_1_1527919172441_6732679_758-143677-0 from training. Duration: 0.98 2023-10-03 19:15:04,593 WARNING [train.py:1204] (2/4) Exclude cut with ID _55134_210_21982_1_1534590028377_3882170_50-214427-0_sp1.1 from training. Duration: 0.97275 2023-10-03 19:15:07,850 WARNING [train.py:1204] (2/4) Exclude cut with ID _31649_210_6228_1_1532584608063_4091078_482-301330-0_sp1.1 from training. Duration: 0.97275 2023-10-03 19:15:11,847 WARNING [train.py:1204] (2/4) Exclude cut with ID _35799_210_5854_1_1533263487887_3529071_490-326847-0_sp1.1 from training. Duration: 0.9 2023-10-03 19:15:11,854 WARNING [train.py:1204] (2/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_984-90716-0_sp1.1 from training. Duration: 0.963625 2023-10-03 19:15:16,276 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.bypass.skip_rate, batch_count=1376540.0, ans=0.09899494936611666 2023-10-03 19:15:17,424 WARNING [train.py:1204] (2/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_166-246557-0 from training. Duration: 0.64 2023-10-03 19:15:17,424 WARNING [train.py:1204] (2/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_51-200220-0_sp1.1 from training. Duration: 0.5090625 2023-10-03 19:15:17,501 WARNING [train.py:1204] (2/4) Exclude cut with ID _51382_210_23862_1_1534492801838_3650104_275-92796-0_sp1.1 from training. Duration: 0.936375 2023-10-03 19:15:23,302 WARNING [train.py:1204] (2/4) Exclude cut with ID _29853_210_7497_1_1532505600851_6642110_461-142829-0_sp1.1 from training. Duration: 0.97275 2023-10-03 19:15:23,342 WARNING [train.py:1204] (2/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_788-298907-0_sp0.9 from training. Duration: 0.988875 2023-10-03 19:15:24,821 WARNING [train.py:1204] (2/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_739-2098-0_sp1.1 from training. Duration: 0.836375 2023-10-03 19:15:28,872 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7543_210_6753_1_1528541907_1399322_165-323619-0 from training. Duration: 0.69 2023-10-03 19:15:28,964 WARNING [train.py:1204] (2/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_401-255491-0_sp1.1 from training. Duration: 0.636375 2023-10-03 19:15:34,183 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=1376606.6666666667, ans=0.1 2023-10-03 19:15:36,644 WARNING [train.py:1204] (2/4) Exclude cut with ID _66873_210_5517_1_1535454136506_4293370_24-143283-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 19:15:36,720 WARNING [train.py:1204] (2/4) Exclude cut with ID _14459_210_2228_1_1531443641946_7516700_1221-280439-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 19:15:38,533 WARNING [train.py:1204] (2/4) Exclude cut with ID _59202_210_26984_1_1534816810612_3549174_469-213482-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 19:15:38,536 WARNING [train.py:1204] (2/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_148-158509-0_sp0.9 from training. Duration: 0.4555625 2023-10-03 19:15:38,570 WARNING [train.py:1204] (2/4) Exclude cut with ID _24314_210_4915_1_1532135078116_7170179_863-259095-0_sp1.1 from training. Duration: 0.97275 2023-10-03 19:15:38,629 WARNING [train.py:1204] (2/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_321-22821-0 from training. Duration: 0.89 2023-10-03 19:15:39,972 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_43831_210_2158_1_1533725502_5872791_55-163137-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 19:15:40,033 WARNING [train.py:1204] (2/4) Exclude cut with ID _27430_210_7770_1_1532604689375_1378590_26-25207-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 19:15:40,208 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.bypass_mid.scale_min, batch_count=1376673.3333333333, ans=0.2 2023-10-03 19:15:42,667 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_17613_210_2151_1_1531137569_2366160_223-316461-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 19:15:42,739 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_53679_210_4852_1_1534556726_4186141_541-213370-0_sp1.1 from training. Duration: 0.963625 2023-10-03 19:15:43,994 WARNING [train.py:1204] (2/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_369-295086-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 19:15:45,307 WARNING [train.py:1204] (2/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_91-267696-0 from training. Duration: 0.87 2023-10-03 19:15:45,360 WARNING [train.py:1204] (2/4) Exclude cut with ID _5294_210_1747_1_1527249643928_3696179_180-100713-0 from training. Duration: 0.81 2023-10-03 19:15:45,383 WARNING [train.py:1204] (2/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_32-335103-0 from training. Duration: 0.82 2023-10-03 19:15:45,388 WARNING [train.py:1204] (2/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_133-312727-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 19:15:46,761 WARNING [train.py:1204] (2/4) Exclude cut with ID _59288_210_5854_1_1535077757774_3412246_391-165934-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 19:15:46,810 WARNING [train.py:1204] (2/4) Exclude cut with ID _28323_210_16187_1_1532480395667_4100300_488-1281-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 19:15:48,175 WARNING [train.py:1204] (2/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_124-193207-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 19:15:53,989 INFO [train.py:1046] (2/4) Epoch 39, batch 4650, loss[loss=0.1698, simple_loss=0.2547, pruned_loss=0.04247, over 24019.00 frames. ], tot_loss[loss=0.1568, simple_loss=0.2359, pruned_loss=0.03886, over 4699411.28 frames. ], batch size: 86, lr: 2.60e-03, grad_scale: 16.0 2023-10-03 19:15:55,737 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.conv_skip_rate, batch_count=1376740.0, ans=0.0 2023-10-03 19:15:56,960 WARNING [train.py:1204] (2/4) Exclude cut with ID _14087_210_2253_1_1530529224512_3014029_226-242140-0_sp0.9 from training. Duration: 0.9 2023-10-03 19:16:00,202 WARNING [train.py:1204] (2/4) Exclude cut with ID _29277_210_3279_1_1532581109293_3173069_392-3694-0_sp1.1 from training. Duration: 0.936375 2023-10-03 19:16:00,238 WARNING [train.py:1204] (2/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_563-329842-0_sp1.1 from training. Duration: 0.97275 2023-10-03 19:16:01,611 WARNING [train.py:1204] (2/4) Exclude cut with ID _58583_210_5236_1_1534726835266_3574249_391-87755-0_sp1.1 from training. Duration: 0.92725 2023-10-03 19:16:01,631 WARNING [train.py:1204] (2/4) Exclude cut with ID _28427_210_4220_1_1532581240967_3061069_281-237827-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 19:16:01,653 WARNING [train.py:1204] (2/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_44-147898-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 19:16:01,731 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_24007_210_1794_1_1531918925_1663875_125-320772-0_sp1.1 from training. Duration: 0.97275 2023-10-03 19:16:05,049 WARNING [train.py:1204] (2/4) Exclude cut with ID _14368_210_4220_1_1530532714870_3595529_143-117421-0 from training. Duration: 0.58 2023-10-03 19:16:05,799 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.2.encoder.layers.0.conv_module2.whiten, num_groups=1, num_channels=384, metric=5.64 vs. limit=15.0 2023-10-03 19:16:10,997 WARNING [train.py:1204] (2/4) Exclude cut with ID _69584_210_5219_1_1535785133775_4044450_325-7656-0_sp0.9 from training. Duration: 0.9555625 2023-10-03 19:16:12,438 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_37351_210_7925_1_1533012931_3812113_265-85780-0 from training. Duration: 0.88 2023-10-03 19:16:13,745 WARNING [train.py:1204] (2/4) Exclude cut with ID _18516_210_9206_1_1531704653206_3692406_331-237896-0_sp1.1 from training. Duration: 0.936375 2023-10-03 19:16:15,053 WARNING [train.py:1204] (2/4) Exclude cut with ID _14629_210_15709_1_1530604298182_956899_6-232695-0 from training. Duration: 0.94 2023-10-03 19:16:15,082 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7544_210_6753_1_1528587000_691468_87-178599-0_sp1.1 from training. Duration: 0.7909375 2023-10-03 19:16:15,136 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_326-303945-0 from training. Duration: 0.99 2023-10-03 19:16:16,429 WARNING [train.py:1204] (2/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_503-30719-0 from training. Duration: 0.98 2023-10-03 19:16:16,444 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_11305_210_1794_1_1529750428_7178148_299-344640-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 19:16:16,507 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_53071_210_16554_1_1534490405_2954066_265-170472-0_sp1.1 from training. Duration: 0.7545625 2023-10-03 19:16:18,000 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.self_attn_weights.pos_emb_skip_rate, batch_count=1376806.6666666667, ans=0.0 2023-10-03 19:16:19,293 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_174-277364-0_sp1.1 from training. Duration: 0.6454375 2023-10-03 19:16:20,698 WARNING [train.py:1204] (2/4) Exclude cut with ID _58414_210_8254_1_1535421609162_7151182_193-151884-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 19:16:20,723 WARNING [train.py:1204] (2/4) Exclude cut with ID IC0651W0104-322063 from training. Duration: 0.768 2023-10-03 19:16:24,095 WARNING [train.py:1204] (2/4) Exclude cut with ID _21881_210_3525_1_1531825242757_3798380_139-305982-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 19:16:25,544 WARNING [train.py:1204] (2/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_79-200392-0 from training. Duration: 0.97 2023-10-03 19:16:26,926 WARNING [train.py:1204] (2/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_451-280894-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 19:16:26,943 WARNING [train.py:1204] (2/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_304-273433-0_sp1.1 from training. Duration: 0.9 2023-10-03 19:16:28,248 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_5959_210_1794_1_1527400400_3445726_412-44052-0 from training. Duration: 0.77 2023-10-03 19:16:28,593 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.self_attn_weights.pos_emb_skip_rate, batch_count=1376873.3333333333, ans=0.0 2023-10-03 19:16:29,668 WARNING [train.py:1204] (2/4) Exclude cut with ID _4681_210_3525_1_1527251040321_1235713_148-26159-0_sp1.1 from training. Duration: 0.963625 2023-10-03 19:16:33,455 WARNING [train.py:1204] (2/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_887-249555-0_sp0.9 from training. Duration: 0.8555625 2023-10-03 19:16:36,131 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_25236_210_2158_1_1532051968_280567_37-6860-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 19:16:38,879 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.547e+02 1.924e+02 2.085e+02 2.377e+02 3.719e+02, threshold=4.171e+02, percent-clipped=0.0 2023-10-03 19:16:39,286 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.feed_forward1.out_proj.dropout_p, batch_count=1376940.0, ans=0.1 2023-10-03 19:16:40,549 WARNING [train.py:1204] (2/4) Exclude cut with ID _29277_210_3279_1_1532581109293_3173069_417-115527-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 19:16:42,442 WARNING [train.py:1204] (2/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_552-134748-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 19:16:43,802 WARNING [train.py:1204] (2/4) Exclude cut with ID _43176_210_8254_1_1533524417469_4174339_188-174143-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 19:16:45,146 WARNING [train.py:1204] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_127-119109-0_sp1.1 from training. Duration: 0.6545625 2023-10-03 19:16:46,644 WARNING [train.py:1204] (2/4) Exclude cut with ID _14922_210_6228_1_1530707435273_3693310_38-187851-0 from training. Duration: 0.62 2023-10-03 19:16:47,937 WARNING [train.py:1204] (2/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_588-125165-0 from training. Duration: 0.83 2023-10-03 19:16:48,014 WARNING [train.py:1204] (2/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_344-152526-0_sp1.1 from training. Duration: 0.4818125 2023-10-03 19:16:48,015 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7002_210_1794_1_1527935634_4034444_487-236501-0 from training. Duration: 0.66 2023-10-03 19:16:49,574 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=1376940.0, ans=0.1 2023-10-03 19:16:49,856 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.5.encoder.layers.1.self_attn_weights.whiten_keys, num_groups=4, num_channels=128, metric=1.47 vs. limit=6.0 2023-10-03 19:16:50,661 WARNING [train.py:1204] (2/4) Exclude cut with ID _23825_210_13449_1_1531915313526_3497150_379-16981-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 19:16:56,799 WARNING [train.py:1204] (2/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_297-5233-0_sp1.1 from training. Duration: 0.863625 2023-10-03 19:16:56,806 WARNING [train.py:1204] (2/4) Exclude cut with ID _41298_210_9019_1_1533430371450_4822143_178-283538-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 19:16:58,201 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7046_210_6753_1_1527895337_6920917_642-106799-0 from training. Duration: 0.92 2023-10-03 19:16:58,223 WARNING [train.py:1204] (2/4) Exclude cut with ID _6274_210_2084_1_1527937583837_7494020_532-291952-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 19:16:59,618 WARNING [train.py:1204] (2/4) Exclude cut with ID _47278_210_12041_1_1533902418349_3576110_260-270468-0_sp1.1 from training. Duration: 0.92725 2023-10-03 19:16:59,624 WARNING [train.py:1204] (2/4) Exclude cut with ID _14368_210_4220_1_1530532714870_3595529_144-3287-0_sp0.9 from training. Duration: 0.7444375 2023-10-03 19:16:59,731 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.1.conv_module2.balancer1.min_positive, batch_count=1377006.6666666667, ans=0.025 2023-10-03 19:17:00,993 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_162-262717-0_sp0.9 from training. Duration: 0.92225 2023-10-03 19:17:04,136 WARNING [train.py:1204] (2/4) Exclude cut with ID _3465_210_3525_1_1526175667060_3574049_62-335552-0_sp0.9 from training. Duration: 0.9666875 2023-10-03 19:17:04,141 WARNING [train.py:1204] (2/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_726-272852-0_sp1.1 from training. Duration: 0.92725 2023-10-03 19:17:05,939 WARNING [train.py:1204] (2/4) Exclude cut with ID _14898_210_9206_1_1530766812377_4177190_26-135492-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 19:17:08,719 INFO [train.py:1046] (2/4) Epoch 39, batch 4700, loss[loss=0.147, simple_loss=0.2255, pruned_loss=0.03422, over 23683.00 frames. ], tot_loss[loss=0.1573, simple_loss=0.2368, pruned_loss=0.03893, over 4708995.25 frames. ], batch size: 232, lr: 2.60e-03, grad_scale: 16.0 2023-10-03 19:17:08,841 WARNING [train.py:1204] (2/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_200-95304-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 19:17:10,110 WARNING [train.py:1204] (2/4) Exclude cut with ID _45012_210_12651_1_1533608435136_5371380_240-37101-0_sp1.1 from training. Duration: 0.8454375 2023-10-03 19:17:10,118 WARNING [train.py:1204] (2/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_132-269633-0_sp0.9 from training. Duration: 0.7555625 2023-10-03 19:17:10,297 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.feed_forward3.hidden_balancer.prob, batch_count=1377073.3333333333, ans=0.125 2023-10-03 19:17:11,425 WARNING [train.py:1204] (2/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_907-151398-0_sp0.9 from training. Duration: 0.411125 2023-10-03 19:17:13,435 WARNING [train.py:1204] (2/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_45-265684-0_sp0.9 from training. Duration: 0.8 2023-10-03 19:17:14,756 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6291_210_3273_1_1527683649_1078498_74-292608-0 from training. Duration: 0.72 2023-10-03 19:17:21,852 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.feed_forward3.hidden_balancer.prob, batch_count=1377140.0, ans=0.125 2023-10-03 19:17:22,297 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.3.encoder.layers.2.feed_forward3.out_whiten, num_groups=1, num_channels=512, metric=6.10 vs. limit=15.0 2023-10-03 19:17:23,014 WARNING [train.py:1204] (2/4) Exclude cut with ID _59202_210_26984_1_1534816810612_3549174_221-84819-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 19:17:23,083 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_33696_210_3852_1_1533082780_2663375_181-325595-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 19:17:23,123 WARNING [train.py:1204] (2/4) Exclude cut with ID _47251_210_18628_1_1533814063128_4137250_351-154105-0_sp1.1 from training. Duration: 0.963625 2023-10-03 19:17:24,533 WARNING [train.py:1204] (2/4) Exclude cut with ID _35501_210_9288_1_1532998507986_3192189_351-334740-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 19:17:24,826 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=1377140.0, ans=0.1 2023-10-03 19:17:26,429 WARNING [train.py:1204] (2/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_342-100157-0_sp1.1 from training. Duration: 0.4909375 2023-10-03 19:17:29,283 WARNING [train.py:1204] (2/4) Exclude cut with ID _6789_210_1534_1_1527857937218_3118950_262-48853-0 from training. Duration: 0.56 2023-10-03 19:17:29,311 WARNING [train.py:1204] (2/4) Exclude cut with ID _69884_210_9771_1_1535846392818_4166169_430-201020-0 from training. Duration: 0.97 2023-10-03 19:17:31,858 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_71-136518-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 19:17:33,518 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_28-291982-0_sp1.1 from training. Duration: 0.8818125 2023-10-03 19:17:33,561 WARNING [train.py:1204] (2/4) Exclude cut with ID _54218_210_12210_1_1534413646388_4409259_437-275326-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 19:17:35,562 WARNING [train.py:1204] (2/4) Exclude cut with ID _55134_210_21982_1_1534590028377_3882170_40-309139-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 19:17:42,281 WARNING [train.py:1204] (2/4) Exclude cut with ID _32704_210_8954_1_1533017488840_6747370_266-329755-0_sp1.1 from training. Duration: 0.8090625 2023-10-03 19:17:43,964 WARNING [train.py:1204] (2/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_21-174514-0_sp1.1 from training. Duration: 0.3909375 2023-10-03 19:17:46,755 WARNING [train.py:1204] (2/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_409-207378-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 19:17:51,012 WARNING [train.py:1204] (2/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_132-269633-0 from training. Duration: 0.68 2023-10-03 19:17:52,413 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_2245_210_2715_1_1525511548_3540254_310-52483-0_sp0.9 from training. Duration: 0.97775 2023-10-03 19:17:55,174 WARNING [train.py:1204] (2/4) Exclude cut with ID _27430_210_7770_1_1532604689375_1378590_47-77403-0_sp1.1 from training. Duration: 0.92725 2023-10-03 19:18:01,134 WARNING [train.py:1204] (2/4) Exclude cut with ID _7562_210_2084_1_1528887767916_3946229_186-326422-0 from training. Duration: 0.84 2023-10-03 19:18:02,522 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_14056_210_4163_1_1530604483_7572827_230-35993-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 19:18:02,834 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.nonlin_attention.balancer.prob, batch_count=1377273.3333333333, ans=0.125 2023-10-03 19:18:05,577 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_23854_210_1794_1_1531969696_1329157_57-123491-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 19:18:05,627 WARNING [train.py:1204] (2/4) Exclude cut with ID _14747_210_8341_1_1530685797463_3654089_147-131229-0 from training. Duration: 0.76 2023-10-03 19:18:07,596 WARNING [train.py:1204] (2/4) Exclude cut with ID _30191_210_1664_1_1533189915047_7196670_430-19539-0_sp1.1 from training. Duration: 0.92725 2023-10-03 19:18:07,611 WARNING [train.py:1204] (2/4) Exclude cut with ID _67725_210_27767_1_1535608756671_5105359_44-50762-0_sp1.1 from training. Duration: 0.963625 2023-10-03 19:18:11,765 WARNING [train.py:1204] (2/4) Exclude cut with ID _27485_210_4916_1_1532739191677_3668269_217-161755-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 19:18:11,813 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_5931_210_2040_1_1527428481_4547999_321-193614-0_sp1.1 from training. Duration: 0.6090625 2023-10-03 19:18:11,832 WARNING [train.py:1204] (2/4) Exclude cut with ID _6267_210_2084_1_1527764658526_3894730_375-219887-0 from training. Duration: 0.67 2023-10-03 19:18:12,279 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.5.encoder.layers.0.self_attn_weights.whiten_keys, num_groups=4, num_channels=128, metric=1.88 vs. limit=6.0 2023-10-03 19:18:13,285 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.bypass.scale_min, batch_count=1377340.0, ans=0.2 2023-10-03 19:18:14,512 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9087W0022-538393 from training. Duration: 0.768 2023-10-03 19:18:14,639 WARNING [train.py:1204] (2/4) Exclude cut with ID _68777_210_4353_1_1535810390731_3117904_83-196459-0_sp1.1 from training. Duration: 0.963625 2023-10-03 19:18:17,879 WARNING [train.py:1204] (2/4) Exclude cut with ID _69884_210_9771_1_1535846392818_4166169_455-343812-0_sp1.1 from training. Duration: 0.92725 2023-10-03 19:18:17,879 WARNING [train.py:1204] (2/4) Exclude cut with ID _13916_210_9994_1_1530788341126_3763149_414-320174-0_sp1.1 from training. Duration: 0.92725 2023-10-03 19:18:17,883 WARNING [train.py:1204] (2/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_398-239819-0 from training. Duration: 0.99 2023-10-03 19:18:19,285 WARNING [train.py:1204] (2/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_199-232314-0_sp1.1 from training. Duration: 0.92725 2023-10-03 19:18:21,984 INFO [train.py:1046] (2/4) Epoch 39, batch 4750, loss[loss=0.1578, simple_loss=0.236, pruned_loss=0.03979, over 23285.00 frames. ], tot_loss[loss=0.1577, simple_loss=0.2372, pruned_loss=0.03905, over 4720118.72 frames. ], batch size: 105, lr: 2.60e-03, grad_scale: 16.0 2023-10-03 19:18:22,108 WARNING [train.py:1204] (2/4) Exclude cut with ID _2334_210_2667_1_1525608238880_4705129_459-104997-0 from training. Duration: 0.95 2023-10-03 19:18:26,059 WARNING [train.py:1204] (2/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_441-209395-0_sp1.1 from training. Duration: 0.936375 2023-10-03 19:18:27,500 WARNING [train.py:1204] (2/4) Exclude cut with ID _27052_210_5986_1_1532329197187_3585009_320-80393-0_sp1.1 from training. Duration: 0.97275 2023-10-03 19:18:30,835 WARNING [train.py:1204] (2/4) Exclude cut with ID _21783_210_11357_1_1531742458730_3762679_89-217253-0_sp1.1 from training. Duration: 0.97275 2023-10-03 19:18:30,856 WARNING [train.py:1204] (2/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_453-282588-0_sp1.1 from training. Duration: 0.8909375 2023-10-03 19:18:33,558 WARNING [train.py:1204] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_88-341718-0 from training. Duration: 0.66 2023-10-03 19:18:33,593 WARNING [train.py:1204] (2/4) Exclude cut with ID _43552_210_8913_1_1533726031059_3523429_230-3521-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 19:18:36,391 WARNING [train.py:1204] (2/4) Exclude cut with ID _9679_210_7497_1_1529129212906_4363929_512-313889-0 from training. Duration: 0.81 2023-10-03 19:18:36,518 WARNING [train.py:1204] (2/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_194-252800-0_sp1.1 from training. Duration: 0.8818125 2023-10-03 19:18:38,244 WARNING [train.py:1204] (2/4) Exclude cut with ID _55209_210_1531_1_1534675828996_4597160_239-293541-0_sp1.1 from training. Duration: 0.963625 2023-10-03 19:18:38,288 WARNING [train.py:1204] (2/4) Exclude cut with ID _35799_210_5854_1_1533263487887_3529071_15-325131-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 19:18:44,815 WARNING [train.py:1204] (2/4) Exclude cut with ID _14100_210_8341_1_1530529320613_3678069_279-55760-0 from training. Duration: 0.93 2023-10-03 19:18:48,986 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_489-230703-0_sp1.1 from training. Duration: 0.77275 2023-10-03 19:18:50,416 WARNING [train.py:1204] (2/4) Exclude cut with ID _6694_210_4599_1_1527852637490_3965529_144-185051-0 from training. Duration: 0.96 2023-10-03 19:18:51,748 WARNING [train.py:1204] (2/4) Exclude cut with ID _28461_210_2253_1_1532771775189_3041299_1-228155-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 19:18:55,909 WARNING [train.py:1204] (2/4) Exclude cut with ID _32704_210_8954_1_1533017488840_6747370_686-252323-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 19:18:55,914 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_40896_210_15710_1_1533433956_7100802_1075-294633-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 19:18:55,931 WARNING [train.py:1204] (2/4) Exclude cut with ID _22083_210_8254_1_1531702862439_3678970_30-92879-0_sp1.1 from training. Duration: 0.97275 2023-10-03 19:18:57,302 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9245W0121-616888 from training. Duration: 0.896 2023-10-03 19:18:57,314 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6786_210_1388_1_1527839699_4194495_378-46070-0 from training. Duration: 0.72 2023-10-03 19:19:02,133 WARNING [train.py:1204] (2/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_317-175062-0 from training. Duration: 0.82 2023-10-03 19:19:04,932 WARNING [train.py:1204] (2/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_238-89390-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 19:19:05,055 WARNING [train.py:1204] (2/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_524-310811-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 19:19:06,316 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.495e+02 1.868e+02 2.072e+02 2.356e+02 4.427e+02, threshold=4.144e+02, percent-clipped=1.0 2023-10-03 19:19:08,315 WARNING [train.py:1204] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_307-158462-0_sp0.9 from training. Duration: 0.7666875 2023-10-03 19:19:08,316 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9309W0191-640318 from training. Duration: 0.512 2023-10-03 19:19:08,320 WARNING [train.py:1204] (2/4) Exclude cut with ID _55153_210_2253_1_1534384786677_3162096_100-204617-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 19:19:09,914 WARNING [train.py:1204] (2/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_112-149213-0_sp1.1 from training. Duration: 0.863625 2023-10-03 19:19:13,411 WARNING [train.py:1204] (2/4) Exclude cut with ID _67831_210_12392_1_1535612464202_3092612_346-2148-0_sp1.1 from training. Duration: 0.8454375 2023-10-03 19:19:14,870 WARNING [train.py:1204] (2/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_51-117576-0_sp0.9 from training. Duration: 0.4444375 2023-10-03 19:19:14,891 WARNING [train.py:1204] (2/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_300-27235-0 from training. Duration: 0.87 2023-10-03 19:19:14,943 WARNING [train.py:1204] (2/4) Exclude cut with ID _40399_210_4353_1_1533305658194_3279406_195-116825-0_sp1.1 from training. Duration: 0.97275 2023-10-03 19:19:16,630 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7002_210_1794_1_1527939683_5157286_163-87552-0_sp0.9 from training. Duration: 0.9555625 2023-10-03 19:19:16,674 WARNING [train.py:1204] (2/4) Exclude cut with ID _19514_210_9259_1_1531530071330_5084230_560-237597-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 19:19:18,032 WARNING [train.py:1204] (2/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_41-234353-0_sp1.1 from training. Duration: 0.6181875 2023-10-03 19:19:18,050 WARNING [train.py:1204] (2/4) Exclude cut with ID _6274_210_2084_1_1527937583837_7494020_769-168302-0 from training. Duration: 0.71 2023-10-03 19:19:20,868 WARNING [train.py:1204] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_574-45541-0 from training. Duration: 0.65 2023-10-03 19:19:23,626 WARNING [train.py:1204] (2/4) Exclude cut with ID _50310_210_15488_1_1534068043345_3641080_262-67120-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 19:19:26,429 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_53071_210_16554_1_1534493434_1276796_35-11927-0_sp1.1 from training. Duration: 0.963625 2023-10-03 19:19:26,431 WARNING [train.py:1204] (2/4) Exclude cut with ID _3992_210_1747_1_1526644816031_3731060_107-251434-0 from training. Duration: 0.99 2023-10-03 19:19:27,744 WARNING [train.py:1204] (2/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_412-54488-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 19:19:27,835 WARNING [train.py:1204] (2/4) Exclude cut with ID _18516_210_9206_1_1531704653206_3692406_77-258490-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 19:19:31,194 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_40047_210_2290_1_1533290519_2374311_195-165693-0_sp0.9 from training. Duration: 0.988875 2023-10-03 19:19:32,538 WARNING [train.py:1204] (2/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_296-36971-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 19:19:32,588 WARNING [train.py:1204] (2/4) Exclude cut with ID _14970_210_15709_1_1530773543493_1715730_187-20259-0_sp1.1 from training. Duration: 0.8090625 2023-10-03 19:19:36,595 INFO [train.py:1046] (2/4) Epoch 39, batch 4800, loss[loss=0.1774, simple_loss=0.2465, pruned_loss=0.05416, over 23666.00 frames. ], tot_loss[loss=0.1581, simple_loss=0.2378, pruned_loss=0.0392, over 4709706.13 frames. ], batch size: 232, lr: 2.60e-03, grad_scale: 32.0 2023-10-03 19:19:36,643 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_53373_210_5399_1_1534229471_4715695_135-305927-0_sp1.1 from training. Duration: 0.936375 2023-10-03 19:19:36,675 WARNING [train.py:1204] (2/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_60-18123-0 from training. Duration: 0.88 2023-10-03 19:19:38,089 WARNING [train.py:1204] (2/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_166-166990-0 from training. Duration: 0.96 2023-10-03 19:19:39,503 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_5438_210_1794_1_1527336687_2600009_233-58072-0 from training. Duration: 0.77 2023-10-03 19:19:42,686 WARNING [train.py:1204] (2/4) Exclude cut with ID _8833_210_5219_1_1528592665210_7553059_611-80313-0_sp0.9 from training. Duration: 0.711125 2023-10-03 19:19:42,707 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_53555_210_5632_1_1534595220_7232297_118-43239-0_sp1.1 from training. Duration: 0.936375 2023-10-03 19:19:42,795 WARNING [train.py:1204] (2/4) Exclude cut with ID _12369_210_4381_1_1530356078581_4388520_480-13146-0 from training. Duration: 0.85 2023-10-03 19:19:45,718 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.nonlin_attention.balancer.prob, batch_count=1377740.0, ans=0.125 2023-10-03 19:19:48,901 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_892-28814-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 19:19:48,953 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_24899_210_16554_1_1532046422_3827462_160-21255-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 19:19:54,369 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_154-316450-0_sp1.1 from training. Duration: 0.6909375 2023-10-03 19:19:55,641 WARNING [train.py:1204] (2/4) Exclude cut with ID _30196_210_1664_1_1533708496522_7074140_1225-301232-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 19:19:56,912 WARNING [train.py:1204] (2/4) Exclude cut with ID _28783_210_7943_1_1532671164808_7155479_414-208100-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 19:19:56,967 WARNING [train.py:1204] (2/4) Exclude cut with ID _19023_210_4353_1_1531378892552_5820780_83-254794-0 from training. Duration: 0.86 2023-10-03 19:19:58,308 WARNING [train.py:1204] (2/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_591-83752-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 19:19:58,351 WARNING [train.py:1204] (2/4) Exclude cut with ID _29877_210_6228_1_1532397791106_5276149_266-62291-0_sp1.1 from training. Duration: 0.7909375 2023-10-03 19:20:01,416 WARNING [train.py:1204] (2/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_280-161633-0_sp1.1 from training. Duration: 0.663625 2023-10-03 19:20:04,311 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7543_210_6753_1_1528539468_333764_39-156634-0_sp1.1 from training. Duration: 0.97275 2023-10-03 19:20:05,826 WARNING [train.py:1204] (2/4) Exclude cut with ID _67499_210_17282_1_1535509805220_3692430_505-312248-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 19:20:05,848 WARNING [train.py:1204] (2/4) Exclude cut with ID _5294_210_1747_1_1527249643928_3696179_180-100713-0_sp0.9 from training. Duration: 0.9 2023-10-03 19:20:07,241 WARNING [train.py:1204] (2/4) Exclude cut with ID _26528_210_9070_1_1532570410842_3851310_508-148132-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 19:20:07,250 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_211-339622-0_sp1.1 from training. Duration: 0.4090625 2023-10-03 19:20:07,266 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_40896_210_15710_1_1533433956_7100802_339-138356-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 19:20:09,969 WARNING [train.py:1204] (2/4) Exclude cut with ID _27060_210_5986_1_1532674838922_3522549_211-19933-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 19:20:11,391 WARNING [train.py:1204] (2/4) Exclude cut with ID _60229_210_27767_1_1534838406158_3655350_457-117923-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 19:20:12,921 WARNING [train.py:1204] (2/4) Exclude cut with ID _55429_210_10626_1_1534467588527_7174143_460-141439-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 19:20:16,270 WARNING [train.py:1204] (2/4) Exclude cut with ID _59170_210_6721_1_1534983633960_4203335_263-10780-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 19:20:16,282 WARNING [train.py:1204] (2/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_872-264525-0_sp0.9 from training. Duration: 0.72225 2023-10-03 19:20:16,373 WARNING [train.py:1204] (2/4) Exclude cut with ID _7624_210_1531_1_1528109802198_4933101_418-349834-0_sp0.9 from training. Duration: 0.6666875 2023-10-03 19:20:17,701 WARNING [train.py:1204] (2/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_27-53998-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 19:20:19,876 WARNING [train.py:1204] (2/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_202-183922-0 from training. Duration: 0.85 2023-10-03 19:20:19,889 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7002_210_1794_1_1527939683_5157286_1-281264-0 from training. Duration: 0.44 2023-10-03 19:20:21,256 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_53753_210_7234_1_1534320003_3594148_493-9314-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 19:20:21,269 WARNING [train.py:1204] (2/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_217-95056-0_sp1.1 from training. Duration: 0.8909375 2023-10-03 19:20:21,309 WARNING [train.py:1204] (2/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_406-45086-0_sp1.1 from training. Duration: 0.72725 2023-10-03 19:20:21,315 WARNING [train.py:1204] (2/4) Exclude cut with ID _31649_210_6228_1_1532584608063_4091078_505-213821-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 19:20:21,323 WARNING [train.py:1204] (2/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_317-175062-0_sp0.9 from training. Duration: 0.911125 2023-10-03 19:20:23,248 WARNING [train.py:1204] (2/4) Exclude cut with ID _5444_210_2151_1_1527418808464_5312210_726-27165-0_sp1.1 from training. Duration: 0.6090625 2023-10-03 19:20:23,287 WARNING [train.py:1204] (2/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_494-170437-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 19:20:25,142 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.5.encoder.layers.0.conv_module1.whiten, num_groups=1, num_channels=256, metric=7.79 vs. limit=15.0 2023-10-03 19:20:27,832 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_474-64054-0_sp1.1 from training. Duration: 0.936375 2023-10-03 19:20:30,444 WARNING [train.py:1204] (2/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_521-7556-0_sp1.1 from training. Duration: 0.97275 2023-10-03 19:20:30,611 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.feed_forward1.hidden_balancer.prob, batch_count=1377940.0, ans=0.125 2023-10-03 19:20:31,900 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_269-348505-0_sp1.1 from training. Duration: 0.92725 2023-10-03 19:20:37,363 WARNING [train.py:1204] (2/4) Exclude cut with ID _21878_210_3525_1_1531738772002_7264330_559-184862-0 from training. Duration: 0.94 2023-10-03 19:20:37,391 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_68702_210_23689_1_1535799577_3420068_92-76040-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 19:20:37,426 WARNING [train.py:1204] (2/4) Exclude cut with ID _22826_210_9994_1_1531908201522_3279169_227-102052-0_sp1.1 from training. Duration: 0.97275 2023-10-03 19:20:37,463 WARNING [train.py:1204] (2/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_718-250907-0_sp0.9 from training. Duration: 0.7666875 2023-10-03 19:20:38,808 WARNING [train.py:1204] (2/4) Exclude cut with ID _14069_210_5632_1_1530512692033_4803390_352-143891-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 19:20:39,275 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.5.encoder.layers.1.conv_module2.whiten, num_groups=1, num_channels=256, metric=5.21 vs. limit=15.0 2023-10-03 19:20:42,899 WARNING [train.py:1204] (2/4) Exclude cut with ID _6267_210_2084_1_1527764658526_3894730_65-106602-0_sp1.1 from training. Duration: 0.936375 2023-10-03 19:20:44,193 WARNING [train.py:1204] (2/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_202-183922-0_sp0.9 from training. Duration: 0.9444375 2023-10-03 19:20:44,206 WARNING [train.py:1204] (2/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_465-268827-0_sp1.1 from training. Duration: 0.97275 2023-10-03 19:20:44,253 WARNING [train.py:1204] (2/4) Exclude cut with ID _12369_210_4381_1_1530356078581_4388520_58-29064-0_sp1.1 from training. Duration: 0.9 2023-10-03 19:20:46,014 WARNING [train.py:1204] (2/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_1-99680-0_sp0.9 from training. Duration: 0.7444375 2023-10-03 19:20:46,079 WARNING [train.py:1204] (2/4) Exclude cut with ID _13253_210_1925_1_1530757775819_3577873_191-144977-0_sp1.1 from training. Duration: 0.7090625 2023-10-03 19:20:49,868 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.conv_module2.balancer2.min_abs, batch_count=1378073.3333333333, ans=0.5 2023-10-03 19:20:49,926 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.feed_forward1.hidden_balancer.prob, batch_count=1378073.3333333333, ans=0.125 2023-10-03 19:20:51,024 INFO [train.py:1046] (2/4) Epoch 39, batch 4850, loss[loss=0.1574, simple_loss=0.245, pruned_loss=0.03486, over 24420.00 frames. ], tot_loss[loss=0.1591, simple_loss=0.2386, pruned_loss=0.03983, over 4694180.84 frames. ], batch size: 69, lr: 2.60e-03, grad_scale: 16.0 2023-10-03 19:20:51,130 WARNING [train.py:1204] (2/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_276-112684-0_sp1.1 from training. Duration: 0.92725 2023-10-03 19:20:51,145 WARNING [train.py:1204] (2/4) Exclude cut with ID _64639_210_5816_1_1535367619437_3528000_358-411-0_sp1.1 from training. Duration: 0.97275 2023-10-03 19:20:51,152 WARNING [train.py:1204] (2/4) Exclude cut with ID _31649_210_6228_1_1532584608063_4091078_106-21199-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 19:20:52,440 WARNING [train.py:1204] (2/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_901-79438-0 from training. Duration: 0.94 2023-10-03 19:20:53,225 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.2.encoder.layers.2.feed_forward3.out_whiten, num_groups=1, num_channels=384, metric=9.86 vs. limit=15.0 2023-10-03 19:20:55,194 WARNING [train.py:1204] (2/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_718-250907-0 from training. Duration: 0.69 2023-10-03 19:20:55,201 WARNING [train.py:1204] (2/4) Exclude cut with ID _5287_210_2084_1_1527418883040_3878670_363-194161-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 19:20:55,204 WARNING [train.py:1204] (2/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_586-321321-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 19:20:55,282 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_68900_210_2087_1_1535713157_3708529_164-292661-0_sp1.1 from training. Duration: 0.963625 2023-10-03 19:20:55,283 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_26433_210_12108_1_1532157158_4056480_114-144878-0_sp1.1 from training. Duration: 0.97275 2023-10-03 19:20:59,312 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_15517_210_6753_1_1530954008_3487336_96-341874-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 19:21:02,886 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.2.encoder.layers.0.self_attn2.whiten, num_groups=1, num_channels=384, metric=18.11 vs. limit=22.5 2023-10-03 19:21:04,727 WARNING [train.py:1204] (2/4) Exclude cut with ID _14268_210_9206_1_1530622771285_4714099_637-86630-0 from training. Duration: 0.51 2023-10-03 19:21:07,466 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_28211_210_6388_1_1532861506_4523224_121-320092-0_sp1.1 from training. Duration: 0.92725 2023-10-03 19:21:11,688 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_141-89113-0_sp0.9 from training. Duration: 0.9555625 2023-10-03 19:21:11,772 WARNING [train.py:1204] (2/4) Exclude cut with ID _6789_210_1534_1_1527857937218_3118950_311-61262-0_sp1.1 from training. Duration: 0.5909375 2023-10-03 19:21:11,791 WARNING [train.py:1204] (2/4) Exclude cut with ID _58480_210_12392_1_1534811121111_3646160_389-200461-0_sp1.1 from training. Duration: 0.97275 2023-10-03 19:21:17,795 WARNING [train.py:1204] (2/4) Exclude cut with ID _41326_210_5854_1_1533778076509_3492080_28-156553-0_sp1.1 from training. Duration: 0.92725 2023-10-03 19:21:19,510 WARNING [train.py:1204] (2/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_577-287150-0_sp1.1 from training. Duration: 0.7454375 2023-10-03 19:21:20,948 WARNING [train.py:1204] (2/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_40-129756-0_sp1.1 from training. Duration: 0.736375 2023-10-03 19:21:20,958 WARNING [train.py:1204] (2/4) Exclude cut with ID _60113_210_16271_1_1535164212481_4386079_490-280290-0 from training. Duration: 0.98 2023-10-03 19:21:22,437 WARNING [train.py:1204] (2/4) Exclude cut with ID _58059_210_5854_1_1534901471149_3164808_34-39905-0_sp1.1 from training. Duration: 0.963625 2023-10-03 19:21:25,195 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_791-197891-0_sp1.1 from training. Duration: 0.763625 2023-10-03 19:21:25,209 WARNING [train.py:1204] (2/4) Exclude cut with ID _6789_210_1534_1_1527857937218_3118950_262-48853-0_sp1.1 from training. Duration: 0.5090625 2023-10-03 19:21:25,281 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_2655_210_2087_1_1525688959_3897400_52-279899-0_sp0.9 from training. Duration: 0.7666875 2023-10-03 19:21:25,285 WARNING [train.py:1204] (2/4) Exclude cut with ID _14884_210_2252_1_1530707459689_5561250_114-201399-0 from training. Duration: 0.76 2023-10-03 19:21:27,810 WARNING [train.py:1204] (2/4) Exclude cut with ID _19023_210_4353_1_1531378892552_5820780_83-254794-0_sp0.9 from training. Duration: 0.9555625 2023-10-03 19:21:27,832 WARNING [train.py:1204] (2/4) Exclude cut with ID _53489_210_6228_1_1534298357169_3835970_10-297919-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 19:21:28,363 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.4.encoder.layers.2.conv_module2.whiten, num_groups=1, num_channels=384, metric=2.97 vs. limit=15.0 2023-10-03 19:21:33,042 WARNING [train.py:1204] (2/4) Exclude cut with ID _44585_210_18628_1_1533605350387_4518199_418-3055-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 19:21:33,055 WARNING [train.py:1204] (2/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_310-109461-0 from training. Duration: 0.8 2023-10-03 19:21:34,411 WARNING [train.py:1204] (2/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_235-262124-0 from training. Duration: 0.67 2023-10-03 19:21:34,504 WARNING [train.py:1204] (2/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_195-167011-0_sp1.1 from training. Duration: 0.6818125 2023-10-03 19:21:34,624 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.1.feed_forward1.out_proj.dropout_p, batch_count=1378273.3333333333, ans=0.1 2023-10-03 19:21:36,940 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.652e+02 1.898e+02 2.062e+02 2.421e+02 3.265e+02, threshold=4.123e+02, percent-clipped=0.0 2023-10-03 19:21:39,975 WARNING [train.py:1204] (2/4) Exclude cut with ID _7852_210_5219_1_1528286471316_8139400_188-55162-0_sp1.1 from training. Duration: 0.936375 2023-10-03 19:21:41,331 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_35361_210_4238_1_1533262810_7793476_675-87012-0 from training. Duration: 0.89 2023-10-03 19:21:41,400 WARNING [train.py:1204] (2/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_465-81427-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 19:21:41,412 WARNING [train.py:1204] (2/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_597-154640-0_sp1.1 from training. Duration: 0.8181875 2023-10-03 19:21:44,631 WARNING [train.py:1204] (2/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_186-309161-0_sp0.9 from training. Duration: 0.6 2023-10-03 19:21:45,996 WARNING [train.py:1204] (2/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_546-119312-0 from training. Duration: 0.99 2023-10-03 19:21:46,000 WARNING [train.py:1204] (2/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_330-240060-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 19:21:47,277 WARNING [train.py:1204] (2/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_477-255782-0 from training. Duration: 0.82 2023-10-03 19:21:47,302 WARNING [train.py:1204] (2/4) Exclude cut with ID _14970_210_15709_1_1530773543493_1715730_150-248939-0_sp1.1 from training. Duration: 0.97275 2023-10-03 19:21:47,513 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.conv_module1.balancer2.prob, batch_count=1378273.3333333333, ans=0.125 2023-10-03 19:21:49,102 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_556-206615-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 19:21:49,168 WARNING [train.py:1204] (2/4) Exclude cut with ID _3465_210_3525_1_1526175667060_3574049_308-231735-0 from training. Duration: 0.73 2023-10-03 19:21:57,637 WARNING [train.py:1204] (2/4) Exclude cut with ID _27485_210_4916_1_1532739191677_3668269_66-236571-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 19:22:04,073 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_2221_210_1988_1_1525597051_4935541_415-296882-0_sp0.9 from training. Duration: 0.8555625 2023-10-03 19:22:04,080 WARNING [train.py:1204] (2/4) Exclude cut with ID _64521_210_14699_1_1535243427259_3815328_231-78630-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 19:22:05,440 INFO [train.py:1046] (2/4) Epoch 39, batch 4900, loss[loss=0.1319, simple_loss=0.1907, pruned_loss=0.03658, over 19100.00 frames. ], tot_loss[loss=0.1582, simple_loss=0.2375, pruned_loss=0.03949, over 4686601.26 frames. ], batch size: 388, lr: 2.59e-03, grad_scale: 16.0 2023-10-03 19:22:06,984 WARNING [train.py:1204] (2/4) Exclude cut with ID _13942_210_9988_1_1530604906036_3733360_499-55676-0 from training. Duration: 0.62 2023-10-03 19:22:06,993 WARNING [train.py:1204] (2/4) Exclude cut with ID _49376_210_5986_1_1533985278732_3370740_301-154763-0_sp1.1 from training. Duration: 0.8909375 2023-10-03 19:22:11,304 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.1.conv_module1.balancer1.prob, batch_count=1378406.6666666667, ans=0.125 2023-10-03 19:22:12,928 WARNING [train.py:1204] (2/4) Exclude cut with ID _27485_210_4916_1_1532739191677_3668269_255-216698-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 19:22:14,483 WARNING [train.py:1204] (2/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_137-334143-0_sp1.1 from training. Duration: 0.97275 2023-10-03 19:22:14,509 WARNING [train.py:1204] (2/4) Exclude cut with ID _6456_210_1534_1_1527681634212_3181049_215-309359-0_sp1.1 from training. Duration: 0.8 2023-10-03 19:22:18,486 WARNING [train.py:1204] (2/4) Exclude cut with ID _65640_210_6310_1_1535368201445_3173250_209-13008-0 from training. Duration: 0.99 2023-10-03 19:22:24,661 WARNING [train.py:1204] (2/4) Exclude cut with ID _12369_210_4381_1_1530356078581_4388520_58-29064-0 from training. Duration: 0.99 2023-10-03 19:22:25,565 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.1.encoder.layers.0.nonlin_attention.whiten1, num_groups=1, num_channels=192, metric=4.65 vs. limit=10.0 2023-10-03 19:22:28,855 WARNING [train.py:1204] (2/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_410-287857-0 from training. Duration: 0.93 2023-10-03 19:22:28,933 WARNING [train.py:1204] (2/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_153-153537-0 from training. Duration: 0.91 2023-10-03 19:22:28,970 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7544_210_6753_1_1528587722_3576686_172-171750-0_sp0.9 from training. Duration: 0.72225 2023-10-03 19:22:28,999 WARNING [train.py:1204] (2/4) Exclude cut with ID _64521_210_14699_1_1535243427259_3815328_464-183853-0_sp1.1 from training. Duration: 0.97275 2023-10-03 19:22:29,012 WARNING [train.py:1204] (2/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_385-210308-0_sp1.1 from training. Duration: 0.963625 2023-10-03 19:22:30,854 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_46013_210_7234_1_1533776362_3744032_354-261865-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 19:22:30,861 WARNING [train.py:1204] (2/4) Exclude cut with ID _3067_210_2667_1_1526212974640_4503168_250-285937-0_sp1.1 from training. Duration: 0.72725 2023-10-03 19:22:30,926 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_40036_210_13405_1_1533648647_1315388_48-146016-0 from training. Duration: 0.93 2023-10-03 19:22:34,193 WARNING [train.py:1204] (2/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_280-161633-0 from training. Duration: 0.73 2023-10-03 19:22:35,530 WARNING [train.py:1204] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_308-161067-0_sp0.9 from training. Duration: 0.7333125 2023-10-03 19:22:35,632 WARNING [train.py:1204] (2/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_68-212801-0_sp0.9 from training. Duration: 0.811125 2023-10-03 19:22:36,807 WARNING [train.py:1204] (2/4) Exclude cut with ID _6789_210_1534_1_1527857937218_3118950_311-61262-0_sp0.9 from training. Duration: 0.72225 2023-10-03 19:22:38,342 WARNING [train.py:1204] (2/4) Exclude cut with ID _14632_210_9988_1_1530691279392_3923200_102-204477-0_sp1.1 from training. Duration: 0.8818125 2023-10-03 19:22:39,757 WARNING [train.py:1204] (2/4) Exclude cut with ID _65242_210_18628_1_1535369393717_3823149_290-117517-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 19:22:40,016 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=1378540.0, ans=0.1 2023-10-03 19:22:41,167 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_69630_210_7234_1_1535788670_910989_35-337140-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 19:22:41,185 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_5618_210_1874_1_1527311008_5851_1-214501-0 from training. Duration: 0.689 2023-10-03 19:22:42,566 WARNING [train.py:1204] (2/4) Exclude cut with ID _8064_210_5399_1_1528372299291_4282973_168-50337-0_sp0.9 from training. Duration: 0.9333125 2023-10-03 19:22:42,649 WARNING [train.py:1204] (2/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_185-14438-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 19:22:42,672 WARNING [train.py:1204] (2/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_61-327062-0 from training. Duration: 0.81 2023-10-03 19:22:42,678 WARNING [train.py:1204] (2/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_98-741-0 from training. Duration: 0.8 2023-10-03 19:22:47,358 WARNING [train.py:1204] (2/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_312-204798-0 from training. Duration: 0.93 2023-10-03 19:22:49,226 WARNING [train.py:1204] (2/4) Exclude cut with ID _14998_210_5219_1_1530705582080_7474970_787-294416-0_sp0.9 from training. Duration: 0.911125 2023-10-03 19:22:50,644 WARNING [train.py:1204] (2/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_140-181176-0_sp1.1 from training. Duration: 0.836375 2023-10-03 19:22:51,952 WARNING [train.py:1204] (2/4) Exclude cut with ID _14687_210_5854_1_1530703838259_3664889_115-239046-0_sp1.1 from training. Duration: 0.6090625 2023-10-03 19:22:51,997 WARNING [train.py:1204] (2/4) Exclude cut with ID _56423_210_5854_1_1534896061049_3165969_370-230614-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 19:22:52,039 WARNING [train.py:1204] (2/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_406-275649-0_sp0.9 from training. Duration: 0.4666875 2023-10-03 19:22:52,049 WARNING [train.py:1204] (2/4) Exclude cut with ID _15150_210_8341_1_1530752411803_7256384_22-138274-0_sp1.1 from training. Duration: 0.87275 2023-10-03 19:22:53,367 WARNING [train.py:1204] (2/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_125-125818-0 from training. Duration: 0.96 2023-10-03 19:22:56,244 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_4399_210_1874_1_1526705485_2400757_220-47974-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 19:22:57,635 WARNING [train.py:1204] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_308-161067-0_sp1.1 from training. Duration: 0.6 2023-10-03 19:22:59,461 WARNING [train.py:1204] (2/4) Exclude cut with ID _53489_210_6228_1_1534298357169_3835970_102-57645-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 19:23:01,119 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=1378606.6666666667, ans=0.1 2023-10-03 19:23:02,766 WARNING [train.py:1204] (2/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_178-177582-0 from training. Duration: 0.74 2023-10-03 19:23:02,836 WARNING [train.py:1204] (2/4) Exclude cut with ID _14349_210_8341_1_1530615710818_3442470_170-336541-0_sp1.1 from training. Duration: 0.7818125 2023-10-03 19:23:04,450 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_521-214276-0_sp0.9 from training. Duration: 0.488875 2023-10-03 19:23:05,718 WARNING [train.py:1204] (2/4) Exclude cut with ID _66070_210_7640_1_1535441393096_7805310_489-292631-0 from training. Duration: 0.79 2023-10-03 19:23:13,059 WARNING [train.py:1204] (2/4) Exclude cut with ID _14069_210_5632_1_1530512692033_4803390_438-340023-0_sp1.1 from training. Duration: 0.936375 2023-10-03 19:23:13,156 WARNING [train.py:1204] (2/4) Exclude cut with ID _7852_210_5219_1_1528286471316_8139400_242-105336-0_sp1.1 from training. Duration: 0.6545625 2023-10-03 19:23:15,805 WARNING [train.py:1204] (2/4) Exclude cut with ID _3867_210_1883_1_1526641200575_4475830_272-215563-0 from training. Duration: 0.57 2023-10-03 19:23:15,812 WARNING [train.py:1204] (2/4) Exclude cut with ID _14368_210_4220_1_1530532714870_3595529_143-117421-0_sp0.9 from training. Duration: 0.6444375 2023-10-03 19:23:15,824 WARNING [train.py:1204] (2/4) Exclude cut with ID _9235_210_5219_1_1528891154253_8046120_30-326681-0_sp1.1 from training. Duration: 0.7545625 2023-10-03 19:23:15,978 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.conv_module2.balancer1.prob, batch_count=1378673.3333333333, ans=0.125 2023-10-03 19:23:17,188 WARNING [train.py:1204] (2/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_538-218448-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 19:23:20,019 INFO [train.py:1046] (2/4) Epoch 39, batch 4950, loss[loss=0.151, simple_loss=0.2428, pruned_loss=0.02962, over 24660.00 frames. ], tot_loss[loss=0.157, simple_loss=0.2362, pruned_loss=0.03886, over 4703335.48 frames. ], batch size: 73, lr: 2.59e-03, grad_scale: 8.0 2023-10-03 19:23:20,123 WARNING [train.py:1204] (2/4) Exclude cut with ID _33640_210_2655_1_1533117605479_4855469_60-192970-0_sp1.1 from training. Duration: 0.963625 2023-10-03 19:23:20,130 WARNING [train.py:1204] (2/4) Exclude cut with ID _65585_210_12210_1_1535450451584_3764098_112-294301-0_sp1.1 from training. Duration: 0.8 2023-10-03 19:23:20,165 WARNING [train.py:1204] (2/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_643-37412-0_sp1.1 from training. Duration: 0.936375 2023-10-03 19:23:20,182 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_9302_210_2550_1_1528889962_4655416_638-301620-0_sp1.1 from training. Duration: 0.42725 2023-10-03 19:23:20,292 WARNING [train.py:1204] (2/4) Exclude cut with ID _3867_210_1883_1_1526641200575_4475830_272-215563-0_sp1.1 from training. Duration: 0.5181875 2023-10-03 19:23:22,207 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.1.feed_forward1.out_proj.dropout_p, batch_count=1378740.0, ans=0.1 2023-10-03 19:23:23,502 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_53679_210_4852_1_1534556726_4186141_358-154143-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 19:23:23,513 WARNING [train.py:1204] (2/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_841-341679-0_sp0.9 from training. Duration: 0.6444375 2023-10-03 19:23:27,606 WARNING [train.py:1204] (2/4) Exclude cut with ID _7160_210_1876_1_1527940923231_3703799_302-150728-0 from training. Duration: 0.78 2023-10-03 19:23:27,640 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_125-203105-0 from training. Duration: 0.8 2023-10-03 19:23:28,904 WARNING [train.py:1204] (2/4) Exclude cut with ID _5585_210_2667_1_1527418813324_8007340_89-332072-0_sp0.9 from training. Duration: 0.711125 2023-10-03 19:23:28,987 WARNING [train.py:1204] (2/4) Exclude cut with ID _3992_210_1747_1_1526644816031_3731060_36-147855-0 from training. Duration: 0.78 2023-10-03 19:23:29,005 WARNING [train.py:1204] (2/4) Exclude cut with ID _59256_210_5986_1_1535076085997_3589120_172-239045-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 19:23:29,014 WARNING [train.py:1204] (2/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_472-216663-0_sp1.1 from training. Duration: 0.836375 2023-10-03 19:23:30,760 WARNING [train.py:1204] (2/4) Exclude cut with ID _36512_210_6987_1_1533033143998_3126069_181-306500-0_sp1.1 from training. Duration: 0.863625 2023-10-03 19:23:30,783 WARNING [train.py:1204] (2/4) Exclude cut with ID _56524_210_9642_1_1534572011779_3619170_492-95008-0_sp1.1 from training. Duration: 0.97275 2023-10-03 19:23:32,239 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_33348_210_5632_1_1533086912_3324030_119-90326-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 19:23:33,646 WARNING [train.py:1204] (2/4) Exclude cut with ID _14368_210_4220_1_1530532714870_3595529_231-137892-0_sp1.1 from training. Duration: 0.9 2023-10-03 19:23:36,389 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8453_210_4211_1_1528502940_8562730_596-306820-0_sp1.1 from training. Duration: 0.8818125 2023-10-03 19:23:36,462 WARNING [train.py:1204] (2/4) Exclude cut with ID _27052_210_5986_1_1532329197187_3585009_50-242750-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 19:23:39,175 WARNING [train.py:1204] (2/4) Exclude cut with ID _13913_210_9994_1_1530579615187_3383897_358-98435-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 19:23:39,196 WARNING [train.py:1204] (2/4) Exclude cut with ID _27013_210_1389_1_1532437173240_3252649_399-229778-0_sp1.1 from training. Duration: 0.963625 2023-10-03 19:23:42,428 WARNING [train.py:1204] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_185-174805-0_sp0.9 from training. Duration: 0.7555625 2023-10-03 19:23:46,548 WARNING [train.py:1204] (2/4) Exclude cut with ID _65585_210_12210_1_1535450451584_3764098_196-221014-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 19:23:47,953 WARNING [train.py:1204] (2/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_202-293523-0_sp1.1 from training. Duration: 0.6545625 2023-10-03 19:23:49,340 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8275_210_4198_1_1528455320_3892571_213-127634-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 19:23:49,538 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.balancer2.prob, batch_count=1378873.3333333333, ans=0.125 2023-10-03 19:23:50,653 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_40896_210_15710_1_1533433956_7100802_921-231678-0_sp1.1 from training. Duration: 0.97275 2023-10-03 19:23:52,538 WARNING [train.py:1204] (2/4) Exclude cut with ID _38020_210_5986_1_1533112252259_7006089_493-102745-0_sp1.1 from training. Duration: 0.8909375 2023-10-03 19:23:53,941 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6846_210_6753_1_1527850767_4295158_2-315073-0 from training. Duration: 0.88 2023-10-03 19:23:54,140 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.conv_module2.balancer1.prob, batch_count=1378873.3333333333, ans=0.125 2023-10-03 19:23:55,254 WARNING [train.py:1204] (2/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_249-281349-0 from training. Duration: 0.85 2023-10-03 19:23:57,934 WARNING [train.py:1204] (2/4) Exclude cut with ID _18513_210_9206_1_1531618215223_3862620_314-83429-0_sp1.1 from training. Duration: 0.97275 2023-10-03 19:24:01,175 WARNING [train.py:1204] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_215-202973-0_sp0.9 from training. Duration: 0.97775 2023-10-03 19:24:01,192 WARNING [train.py:1204] (2/4) Exclude cut with ID _7719_210_3525_1_1528456153163_4296960_88-250259-0_sp1.1 from training. Duration: 0.763625 2023-10-03 19:24:03,171 WARNING [train.py:1204] (2/4) Exclude cut with ID _7606_210_4915_1_1528894253277_5080690_301-10732-0_sp1.1 from training. Duration: 0.736375 2023-10-03 19:24:03,181 WARNING [train.py:1204] (2/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_818-11907-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 19:24:04,486 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_5959_210_1794_1_1527400400_3445726_412-44052-0_sp1.1 from training. Duration: 0.7 2023-10-03 19:24:05,911 WARNING [train.py:1204] (2/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_6-279195-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 19:24:07,183 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.599e+02 1.841e+02 1.990e+02 2.204e+02 4.323e+02, threshold=3.980e+02, percent-clipped=1.0 2023-10-03 19:24:07,369 WARNING [train.py:1204] (2/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_252-216674-0_sp0.9 from training. Duration: 0.6 2023-10-03 19:24:08,738 WARNING [train.py:1204] (2/4) Exclude cut with ID _14368_210_4220_1_1530532714870_3595529_144-3287-0_sp1.1 from training. Duration: 0.6090625 2023-10-03 19:24:10,239 WARNING [train.py:1204] (2/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_342-3879-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 19:24:10,271 WARNING [train.py:1204] (2/4) Exclude cut with ID _33640_210_2655_1_1533117605479_4855469_584-65630-0_sp1.1 from training. Duration: 0.97275 2023-10-03 19:24:11,655 WARNING [train.py:1204] (2/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_37-189262-0 from training. Duration: 0.98 2023-10-03 19:24:12,945 WARNING [train.py:1204] (2/4) Exclude cut with ID _9389_210_1664_1_1529217337301_5289269_379-128794-0_sp0.9 from training. Duration: 0.9444375 2023-10-03 19:24:15,065 WARNING [train.py:1204] (2/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_119-207162-0_sp1.1 from training. Duration: 0.6454375 2023-10-03 19:24:18,115 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.conv_module1.balancer2.prob, batch_count=1379006.6666666667, ans=0.125 2023-10-03 19:24:19,219 WARNING [train.py:1204] (2/4) Exclude cut with ID _2941_210_2577_1_1526199673150_2489759_200-333643-0_sp1.1 from training. Duration: 0.936375 2023-10-03 19:24:20,575 WARNING [train.py:1204] (2/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_479-125611-0_sp1.1 from training. Duration: 0.87275 2023-10-03 19:24:20,592 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_3475_210_2028_1_1526193578_3622569_64-297072-0_sp1.1 from training. Duration: 0.82725 2023-10-03 19:24:20,629 WARNING [train.py:1204] (2/4) Exclude cut with ID _53565_210_10635_1_1534337218335_3820259_139-346291-0_sp1.1 from training. Duration: 0.97275 2023-10-03 19:24:21,914 WARNING [train.py:1204] (2/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_588-125165-0_sp1.1 from training. Duration: 0.7545625 2023-10-03 19:24:21,988 WARNING [train.py:1204] (2/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_938-205135-0_sp1.1 from training. Duration: 0.7909375 2023-10-03 19:24:24,766 WARNING [train.py:1204] (2/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_216-48707-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 19:24:24,830 WARNING [train.py:1204] (2/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_477-255782-0_sp1.1 from training. Duration: 0.7454375 2023-10-03 19:24:24,866 WARNING [train.py:1204] (2/4) Exclude cut with ID _16909_210_5724_1_1531292079912_4118919_583-92842-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 19:24:26,722 WARNING [train.py:1204] (2/4) Exclude cut with ID _60860_210_8254_1_1534896015875_3557169_245-256666-0 from training. Duration: 0.97 2023-10-03 19:24:29,579 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_45399_210_4852_1_1533624725_7579737_349-116173-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 19:24:34,106 INFO [train.py:1046] (2/4) Epoch 39, batch 5000, loss[loss=0.1438, simple_loss=0.2005, pruned_loss=0.04357, over 19139.00 frames. ], tot_loss[loss=0.1562, simple_loss=0.2356, pruned_loss=0.03837, over 4712476.04 frames. ], batch size: 388, lr: 2.59e-03, grad_scale: 8.0 2023-10-03 19:24:34,254 WARNING [train.py:1204] (2/4) Exclude cut with ID _8699_210_2069_1_1528630346831_3960409_155-39766-0 from training. Duration: 0.78 2023-10-03 19:24:34,270 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_86-16881-0_sp1.1 from training. Duration: 0.52725 2023-10-03 19:24:41,097 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_14056_210_4163_1_1530604483_7572827_191-159384-0_sp1.1 from training. Duration: 0.97275 2023-10-03 19:24:41,105 WARNING [train.py:1204] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_307-158462-0_sp1.1 from training. Duration: 0.62725 2023-10-03 19:24:41,208 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7544_210_6753_1_1528591327_1471426_33-346031-0 from training. Duration: 0.61 2023-10-03 19:24:42,583 WARNING [train.py:1204] (2/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_112-149213-0 from training. Duration: 0.95 2023-10-03 19:24:44,366 WARNING [train.py:1204] (2/4) Exclude cut with ID _55844_210_2346_1_1534471212569_7625707_925-191342-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 19:24:47,107 WARNING [train.py:1204] (2/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_87-278686-0 from training. Duration: 0.77 2023-10-03 19:24:47,133 WARNING [train.py:1204] (2/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_415-78792-0_sp1.1 from training. Duration: 0.736375 2023-10-03 19:24:47,150 WARNING [train.py:1204] (2/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_107-332398-0_sp0.9 from training. Duration: 0.8666875 2023-10-03 19:24:48,386 WARNING [train.py:1204] (2/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_72-251255-0 from training. Duration: 0.94 2023-10-03 19:24:48,416 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_431-227273-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 19:24:49,783 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7544_210_6753_1_1528587000_691468_87-178599-0_sp0.9 from training. Duration: 0.9666875 2023-10-03 19:24:49,876 WARNING [train.py:1204] (2/4) Exclude cut with ID _13916_210_9994_1_1530788341126_3763149_60-191556-0 from training. Duration: 0.61 2023-10-03 19:24:49,878 WARNING [train.py:1204] (2/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_746-164782-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 19:24:51,134 WARNING [train.py:1204] (2/4) Exclude cut with ID _33640_210_2655_1_1533117605479_4855469_596-21066-0_sp1.1 from training. Duration: 0.92725 2023-10-03 19:24:52,614 WARNING [train.py:1204] (2/4) Exclude cut with ID _40402_210_18219_1_1533522324115_2953150_320-14078-0 from training. Duration: 0.95 2023-10-03 19:24:53,923 WARNING [train.py:1204] (2/4) Exclude cut with ID _29877_210_6228_1_1532397791106_5276149_266-62291-0 from training. Duration: 0.87 2023-10-03 19:24:54,004 WARNING [train.py:1204] (2/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_176-56120-0_sp0.9 from training. Duration: 0.92225 2023-10-03 19:24:54,041 WARNING [train.py:1204] (2/4) Exclude cut with ID _6275_210_2084_1_1528110062213_7669049_91-26919-0 from training. Duration: 0.97 2023-10-03 19:24:54,053 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_224-322394-0_sp0.9 from training. Duration: 0.7333125 2023-10-03 19:24:55,708 WARNING [train.py:1204] (2/4) Exclude cut with ID _44982_210_1511_1_1534312820108_3726029_586-297059-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 19:24:55,753 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_196-289637-0_sp1.1 from training. Duration: 0.6818125 2023-10-03 19:24:55,755 WARNING [train.py:1204] (2/4) Exclude cut with ID _18513_210_9206_1_1531618215223_3862620_402-5957-0 from training. Duration: 0.99 2023-10-03 19:24:55,761 WARNING [train.py:1204] (2/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_81-300212-0 from training. Duration: 0.6 2023-10-03 19:24:56,038 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.conv_module2.balancer1.prob, batch_count=1379140.0, ans=0.125 2023-10-03 19:24:57,233 WARNING [train.py:1204] (2/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_166-128941-0 from training. Duration: 0.54 2023-10-03 19:24:58,545 WARNING [train.py:1204] (2/4) Exclude cut with ID _58059_210_5854_1_1534901471149_3164808_57-309660-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 19:24:58,607 WARNING [train.py:1204] (2/4) Exclude cut with ID _60621_210_17071_1_1535013053530_3671739_73-155814-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 19:25:00,415 WARNING [train.py:1204] (2/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_726-270780-0 from training. Duration: 0.86 2023-10-03 19:25:00,427 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8853_210_3273_1_1528633899_818968_51-187685-0_sp1.1 from training. Duration: 0.62725 2023-10-03 19:25:02,446 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_315-212794-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 19:25:03,833 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_53373_210_5399_1_1534229471_4715695_314-122795-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 19:25:05,272 WARNING [train.py:1204] (2/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_187-221566-0_sp0.9 from training. Duration: 0.47775 2023-10-03 19:25:06,585 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_12868_210_6753_1_1530702479_3610564_133-564-0 from training. Duration: 0.79 2023-10-03 19:25:07,951 WARNING [train.py:1204] (2/4) Exclude cut with ID _55153_210_2253_1_1534384786677_3162096_255-228344-0_sp1.1 from training. Duration: 0.936375 2023-10-03 19:25:09,322 WARNING [train.py:1204] (2/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_383-165234-0_sp1.1 from training. Duration: 0.8545625 2023-10-03 19:25:12,171 WARNING [train.py:1204] (2/4) Exclude cut with ID IC0677W0056-334889 from training. Duration: 0.768 2023-10-03 19:25:15,415 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_14953_210_2151_1_1530701710_5616165_710-165400-0_sp0.9 from training. Duration: 0.9666875 2023-10-03 19:25:16,815 WARNING [train.py:1204] (2/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_355-236205-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 19:25:16,817 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_34450_210_9547_1_1533099443_4109050_503-246893-0_sp1.1 from training. Duration: 0.963625 2023-10-03 19:25:20,909 WARNING [train.py:1204] (2/4) Exclude cut with ID _43552_210_8913_1_1533726031059_3523429_25-120161-0 from training. Duration: 0.98 2023-10-03 19:25:20,944 WARNING [train.py:1204] (2/4) Exclude cut with ID _66112_210_10635_1_1535590236305_3801484_205-311930-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 19:25:20,964 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_48049_210_5632_1_1533864553_3519937_185-281218-0_sp1.1 from training. Duration: 0.92725 2023-10-03 19:25:22,330 WARNING [train.py:1204] (2/4) Exclude cut with ID _48782_210_12658_1_1534467701544_2300422_191-332415-0_sp1.1 from training. Duration: 0.97275 2023-10-03 19:25:23,771 WARNING [train.py:1204] (2/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_907-151398-0_sp1.1 from training. Duration: 0.336375 2023-10-03 19:25:23,811 WARNING [train.py:1204] (2/4) Exclude cut with ID _67499_210_17282_1_1535509805220_3692430_20-179346-0_sp1.1 from training. Duration: 0.8818125 2023-10-03 19:25:28,375 WARNING [train.py:1204] (2/4) Exclude cut with ID _14998_210_5219_1_1530705582080_7474970_441-91466-0_sp1.1 from training. Duration: 0.8818125 2023-10-03 19:25:28,455 WARNING [train.py:1204] (2/4) Exclude cut with ID _46681_210_9642_1_1533893005571_7603359_786-164007-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 19:25:34,881 WARNING [train.py:1204] (2/4) Exclude cut with ID _14970_210_15709_1_1530773543493_1715730_187-20259-0 from training. Duration: 0.89 2023-10-03 19:25:39,049 WARNING [train.py:1204] (2/4) Exclude cut with ID _12734_210_8341_1_1530183614296_3891929_383-339075-0_sp1.1 from training. Duration: 0.963625 2023-10-03 19:25:39,343 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.ff2_skip_rate, batch_count=1379340.0, ans=0.0 2023-10-03 19:25:44,006 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.0.layers.0.conv_module1.whiten, num_groups=1, num_channels=192, metric=9.68 vs. limit=15.0 2023-10-03 19:25:46,988 INFO [train.py:1046] (2/4) Epoch 39, batch 5050, loss[loss=0.1554, simple_loss=0.2441, pruned_loss=0.03339, over 24556.00 frames. ], tot_loss[loss=0.157, simple_loss=0.2363, pruned_loss=0.03883, over 4705806.52 frames. ], batch size: 71, lr: 2.59e-03, grad_scale: 8.0 2023-10-03 19:25:50,170 WARNING [train.py:1204] (2/4) Exclude cut with ID _37086_210_9038_1_1532999540891_7021250_834-322225-0_sp1.1 from training. Duration: 0.92725 2023-10-03 19:25:51,613 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_10987_210_4163_1_1529760555_4123902_56-290559-0_sp1.1 from training. Duration: 0.963625 2023-10-03 19:25:51,620 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_92-246270-0_sp0.9 from training. Duration: 0.9444375 2023-10-03 19:25:51,640 WARNING [train.py:1204] (2/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_56-13561-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 19:25:51,672 WARNING [train.py:1204] (2/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_393-57888-0_sp1.1 from training. Duration: 0.5818125 2023-10-03 19:25:51,692 WARNING [train.py:1204] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_168-32906-0_sp1.1 from training. Duration: 0.62725 2023-10-03 19:25:53,015 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_51225_210_3601_1_1534400634_2327383_127-207553-0_sp1.1 from training. Duration: 0.963625 2023-10-03 19:25:55,901 WARNING [train.py:1204] (2/4) Exclude cut with ID _58583_210_5236_1_1534726835266_3574249_494-203521-0_sp1.1 from training. Duration: 0.963625 2023-10-03 19:25:55,913 WARNING [train.py:1204] (2/4) Exclude cut with ID _49376_210_5986_1_1533985278732_3370740_354-344695-0 from training. Duration: 0.99 2023-10-03 19:25:57,443 WARNING [train.py:1204] (2/4) Exclude cut with ID _8021_210_7994_1_1528376451358_3653069_179-45206-0_sp1.1 from training. Duration: 0.7818125 2023-10-03 19:26:00,015 WARNING [train.py:1204] (2/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_742-154017-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 19:26:02,018 WARNING [train.py:1204] (2/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_222-198127-0_sp1.1 from training. Duration: 0.763625 2023-10-03 19:26:02,062 WARNING [train.py:1204] (2/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_235-307337-0 from training. Duration: 0.94 2023-10-03 19:26:04,022 WARNING [train.py:1204] (2/4) Exclude cut with ID _8393_210_6702_1_1528505860249_3457328_453-176500-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 19:26:05,402 WARNING [train.py:1204] (2/4) Exclude cut with ID _53565_210_10635_1_1534337218335_3820259_102-179528-0_sp1.1 from training. Duration: 0.97275 2023-10-03 19:26:06,891 WARNING [train.py:1204] (2/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_43-206757-0_sp1.1 from training. Duration: 0.6818125 2023-10-03 19:26:08,818 WARNING [train.py:1204] (2/4) Exclude cut with ID _8394_210_6702_1_1528590504316_3540189_335-274780-0_sp1.1 from training. Duration: 0.7090625 2023-10-03 19:26:08,861 WARNING [train.py:1204] (2/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_128-296581-0_sp1.1 from training. Duration: 0.67275 2023-10-03 19:26:11,744 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=1379473.3333333333, ans=0.1 2023-10-03 19:26:18,430 WARNING [train.py:1204] (2/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_111-112362-0 from training. Duration: 0.91 2023-10-03 19:26:18,470 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_5522_210_3273_1_1527249606_3909096_366-172508-0_sp1.1 from training. Duration: 0.6 2023-10-03 19:26:19,777 WARNING [train.py:1204] (2/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_47-167373-0_sp1.1 from training. Duration: 0.836375 2023-10-03 19:26:19,828 WARNING [train.py:1204] (2/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_41-234353-0 from training. Duration: 0.68 2023-10-03 19:26:21,149 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6291_210_3273_1_1527683649_1078498_82-221196-0_sp0.9 from training. Duration: 0.8444375 2023-10-03 19:26:23,124 WARNING [train.py:1204] (2/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_47-4814-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 19:26:23,145 WARNING [train.py:1204] (2/4) Exclude cut with ID _43541_210_10626_1_1533796233418_3635440_40-247993-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 19:26:24,492 WARNING [train.py:1204] (2/4) Exclude cut with ID _21989_210_5854_1_1531899214931_3271360_133-44759-0_sp0.9 from training. Duration: 0.8555625 2023-10-03 19:26:24,495 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_94-195801-0 from training. Duration: 0.94 2023-10-03 19:26:25,789 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_141-89113-0 from training. Duration: 0.86 2023-10-03 19:26:25,869 WARNING [train.py:1204] (2/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_683-284578-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 19:26:27,244 WARNING [train.py:1204] (2/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_6-214815-0_sp1.1 from training. Duration: 0.77275 2023-10-03 19:26:28,772 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.attention_skip_rate, batch_count=1379540.0, ans=0.0 2023-10-03 19:26:31,261 WARNING [train.py:1204] (2/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_556-212082-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 19:26:31,310 WARNING [train.py:1204] (2/4) Exclude cut with ID _44902_210_16187_1_1533888006062_3792078_149-67212-0 from training. Duration: 0.87 2023-10-03 19:26:32,797 WARNING [train.py:1204] (2/4) Exclude cut with ID _61148_210_6897_1_1535094022726_4217610_333-159440-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 19:26:34,001 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.616e+02 1.934e+02 2.073e+02 2.452e+02 3.577e+02, threshold=4.146e+02, percent-clipped=0.0 2023-10-03 19:26:37,296 WARNING [train.py:1204] (2/4) Exclude cut with ID _3120_210_3525_1_1526036143112_4072980_404-249994-0 from training. Duration: 0.99 2023-10-03 19:26:38,739 WARNING [train.py:1204] (2/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_234-260682-0_sp0.9 from training. Duration: 0.9333125 2023-10-03 19:26:38,757 WARNING [train.py:1204] (2/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_374-138971-0_sp1.1 from training. Duration: 0.8545625 2023-10-03 19:26:40,682 WARNING [train.py:1204] (2/4) Exclude cut with ID _53713_210_24116_1_1534314487762_7416751_743-302085-0_sp1.1 from training. Duration: 0.92725 2023-10-03 19:26:40,741 WARNING [train.py:1204] (2/4) Exclude cut with ID _18516_210_9206_1_1531704653206_3692406_198-334870-0_sp1.1 from training. Duration: 0.836375 2023-10-03 19:26:42,154 WARNING [train.py:1204] (2/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_395-73570-0_sp1.1 from training. Duration: 0.9 2023-10-03 19:26:43,612 WARNING [train.py:1204] (2/4) Exclude cut with ID _46683_210_9642_1_1533730913492_4591116_421-19547-0_sp1.1 from training. Duration: 0.936375 2023-10-03 19:26:43,660 WARNING [train.py:1204] (2/4) Exclude cut with ID _19896_210_1883_1_1531709022662_6649569_714-8668-0_sp1.1 from training. Duration: 0.963625 2023-10-03 19:26:43,687 WARNING [train.py:1204] (2/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_49-101072-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 19:26:43,693 WARNING [train.py:1204] (2/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_220-191080-0_sp1.1 from training. Duration: 0.8909375 2023-10-03 19:26:45,048 WARNING [train.py:1204] (2/4) Exclude cut with ID _3465_210_3525_1_1526175667060_3574049_62-335552-0 from training. Duration: 0.87 2023-10-03 19:26:46,441 WARNING [train.py:1204] (2/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_111-112362-0_sp1.1 from training. Duration: 0.82725 2023-10-03 19:26:46,570 WARNING [train.py:1204] (2/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_318-59097-0_sp0.9 from training. Duration: 0.8444375 2023-10-03 19:26:46,727 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.bypass.skip_rate, batch_count=1379673.3333333333, ans=0.07 2023-10-03 19:26:50,590 WARNING [train.py:1204] (2/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_98-255710-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 19:26:50,596 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9042W0003-515609 from training. Duration: 0.896 2023-10-03 19:26:50,605 WARNING [train.py:1204] (2/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_158-250116-0_sp0.9 from training. Duration: 0.688875 2023-10-03 19:26:52,615 WARNING [train.py:1204] (2/4) Exclude cut with ID _26750_210_5665_1_1532226678442_4063230_7-346885-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 19:26:52,818 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.bypass.skip_rate, batch_count=1379673.3333333333, ans=0.07 2023-10-03 19:26:53,914 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_37839_210_7234_1_1533542462_3689303_357-227078-0_sp1.1 from training. Duration: 0.963625 2023-10-03 19:26:53,934 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9052W0119-520904 from training. Duration: 0.896 2023-10-03 19:26:55,503 WARNING [train.py:1204] (2/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_249-281349-0_sp1.1 from training. Duration: 0.77275 2023-10-03 19:26:55,508 WARNING [train.py:1204] (2/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_147-293311-0 from training. Duration: 0.94 2023-10-03 19:26:55,508 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_158-21166-0_sp1.1 from training. Duration: 0.963625 2023-10-03 19:26:57,077 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.self_attn_weights.pos_emb_skip_rate, batch_count=1379673.3333333333, ans=0.0 2023-10-03 19:27:00,716 INFO [train.py:1046] (2/4) Epoch 39, batch 5100, loss[loss=0.1466, simple_loss=0.2237, pruned_loss=0.03475, over 24369.00 frames. ], tot_loss[loss=0.157, simple_loss=0.2365, pruned_loss=0.03878, over 4706041.81 frames. ], batch size: 56, lr: 2.59e-03, grad_scale: 8.0 2023-10-03 19:27:00,769 WARNING [train.py:1204] (2/4) Exclude cut with ID _14100_210_8341_1_1530529320613_3678069_207-332194-0_sp1.1 from training. Duration: 0.92725 2023-10-03 19:27:00,810 WARNING [train.py:1204] (2/4) Exclude cut with ID _9389_210_1664_1_1529217337301_5289269_324-211031-0_sp1.1 from training. Duration: 0.963625 2023-10-03 19:27:00,828 WARNING [train.py:1204] (2/4) Exclude cut with ID _14603_210_4220_1_1530615688441_3097319_133-158308-0 from training. Duration: 0.84 2023-10-03 19:27:01,120 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=1379740.0, ans=0.1 2023-10-03 19:27:02,213 WARNING [train.py:1204] (2/4) Exclude cut with ID _13252_210_1925_1_1530671372047_3336909_166-314607-0 from training. Duration: 0.54 2023-10-03 19:27:06,099 WARNING [train.py:1204] (2/4) Exclude cut with ID _32704_210_8954_1_1533017488840_6747370_649-79538-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 19:27:06,111 WARNING [train.py:1204] (2/4) Exclude cut with ID _28394_210_8913_1_1532430049518_3650118_343-115667-0_sp1.1 from training. Duration: 0.97275 2023-10-03 19:27:06,157 WARNING [train.py:1204] (2/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_87-278686-0_sp0.9 from training. Duration: 0.8555625 2023-10-03 19:27:08,864 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9109W0003-549749 from training. Duration: 0.768 2023-10-03 19:27:09,080 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.feed_forward2.hidden_balancer.prob, batch_count=1379740.0, ans=0.125 2023-10-03 19:27:10,215 WARNING [train.py:1204] (2/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_126-29104-0_sp1.1 from training. Duration: 0.77275 2023-10-03 19:27:13,680 WARNING [train.py:1204] (2/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_325-113798-0 from training. Duration: 0.85 2023-10-03 19:27:15,052 WARNING [train.py:1204] (2/4) Exclude cut with ID _6694_210_4599_1_1527852637490_3965529_116-109398-0 from training. Duration: 0.73 2023-10-03 19:27:16,368 WARNING [train.py:1204] (2/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_46-57119-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 19:27:17,735 WARNING [train.py:1204] (2/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_546-119312-0_sp1.1 from training. Duration: 0.9 2023-10-03 19:27:19,132 WARNING [train.py:1204] (2/4) Exclude cut with ID _60057_210_10640_1_1534903065793_3333150_468-306939-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 19:27:19,182 WARNING [train.py:1204] (2/4) Exclude cut with ID _15150_210_8341_1_1530752411803_7256384_22-138274-0 from training. Duration: 0.96 2023-10-03 19:27:20,533 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_289-180904-0 from training. Duration: 0.76 2023-10-03 19:27:23,447 WARNING [train.py:1204] (2/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_64-133721-0_sp1.1 from training. Duration: 0.92725 2023-10-03 19:27:25,266 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_195-257512-0_sp1.1 from training. Duration: 0.6090625 2023-10-03 19:27:29,404 WARNING [train.py:1204] (2/4) Exclude cut with ID _66873_210_5517_1_1535454136506_4293370_704-101963-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 19:27:33,517 WARNING [train.py:1204] (2/4) Exclude cut with ID _4366_210_1388_1_1526792053633_3613819_231-211402-0 from training. Duration: 0.94 2023-10-03 19:27:33,563 WARNING [train.py:1204] (2/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_679-190385-0_sp1.1 from training. Duration: 0.97275 2023-10-03 19:27:34,944 WARNING [train.py:1204] (2/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_298-81459-0_sp1.1 from training. Duration: 0.963625 2023-10-03 19:27:36,762 WARNING [train.py:1204] (2/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_658-282610-0_sp0.9 from training. Duration: 0.588875 2023-10-03 19:27:38,217 WARNING [train.py:1204] (2/4) Exclude cut with ID _57572_210_5517_1_1534921219624_3809484_196-283734-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 19:27:39,647 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_53071_210_16554_1_1534490405_2954066_294-198554-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 19:27:39,652 WARNING [train.py:1204] (2/4) Exclude cut with ID _4866_210_2577_1_1527404412284_4364659_352-157930-0 from training. Duration: 0.81 2023-10-03 19:27:41,002 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9231W0113-610139 from training. Duration: 0.896 2023-10-03 19:27:41,045 WARNING [train.py:1204] (2/4) Exclude cut with ID _24271_210_12658_1_1531987270022_7190519_638-171906-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 19:27:42,826 WARNING [train.py:1204] (2/4) Exclude cut with ID _14170_210_5219_1_1530525949868_4469650_272-249985-0 from training. Duration: 0.67 2023-10-03 19:27:42,847 WARNING [train.py:1204] (2/4) Exclude cut with ID _64086_210_7994_1_1535270417019_7435949_449-31567-0 from training. Duration: 0.97 2023-10-03 19:27:45,657 WARNING [train.py:1204] (2/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_109-282568-0_sp1.1 from training. Duration: 0.97275 2023-10-03 19:27:52,503 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_121-45934-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 19:27:55,848 WARNING [train.py:1204] (2/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_314-126819-0 from training. Duration: 0.5 2023-10-03 19:27:57,122 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9309W0192-640319 from training. Duration: 0.512 2023-10-03 19:27:57,136 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9310W0003-640649 from training. Duration: 0.896 2023-10-03 19:28:00,121 WARNING [train.py:1204] (2/4) Exclude cut with ID _7160_210_1876_1_1527940923231_3703799_120-150943-0 from training. Duration: 0.81 2023-10-03 19:28:00,122 WARNING [train.py:1204] (2/4) Exclude cut with ID _40380_210_9059_1_1533380594759_4187380_6-33508-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 19:28:02,932 WARNING [train.py:1204] (2/4) Exclude cut with ID _59202_210_26984_1_1534816810612_3549174_137-318710-0 from training. Duration: 0.89 2023-10-03 19:28:06,237 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_435-313305-0 from training. Duration: 0.71 2023-10-03 19:28:07,669 WARNING [train.py:1204] (2/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_91-212486-0_sp1.1 from training. Duration: 0.6181875 2023-10-03 19:28:10,731 WARNING [train.py:1204] (2/4) Exclude cut with ID _14603_210_4220_1_1530615688441_3097319_133-158308-0_sp1.1 from training. Duration: 0.763625 2023-10-03 19:28:12,414 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_99-251314-0 from training. Duration: 0.87 2023-10-03 19:28:13,746 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_365-217119-0_sp1.1 from training. Duration: 0.57275 2023-10-03 19:28:13,773 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_3475_210_2028_1_1526193578_3622569_377-166133-0 from training. Duration: 0.86 2023-10-03 19:28:15,610 INFO [train.py:1046] (2/4) Epoch 39, batch 5150, loss[loss=0.152, simple_loss=0.2473, pruned_loss=0.02835, over 24351.00 frames. ], tot_loss[loss=0.1575, simple_loss=0.237, pruned_loss=0.03901, over 4702858.51 frames. ], batch size: 74, lr: 2.59e-03, grad_scale: 8.0 2023-10-03 19:28:18,647 WARNING [train.py:1204] (2/4) Exclude cut with ID _13916_210_9994_1_1530788341126_3763149_116-21034-0_sp1.1 from training. Duration: 0.87275 2023-10-03 19:28:18,663 WARNING [train.py:1204] (2/4) Exclude cut with ID _64220_210_9527_1_1535250151772_4875259_142-216831-0_sp1.1 from training. Duration: 0.936375 2023-10-03 19:28:18,664 WARNING [train.py:1204] (2/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_490-207861-0_sp1.1 from training. Duration: 0.8909375 2023-10-03 19:28:19,996 WARNING [train.py:1204] (2/4) Exclude cut with ID _41314_210_12651_1_1533459476543_3766460_87-272068-0_sp0.9 from training. Duration: 0.92225 2023-10-03 19:28:20,027 WARNING [train.py:1204] (2/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_18-46756-0_sp1.1 from training. Duration: 0.7181875 2023-10-03 19:28:21,400 WARNING [train.py:1204] (2/4) Exclude cut with ID _39411_210_5986_1_1533538825030_3343649_98-311335-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 19:28:21,482 WARNING [train.py:1204] (2/4) Exclude cut with ID _21878_210_3525_1_1531738772002_7264330_113-18163-0 from training. Duration: 0.89 2023-10-03 19:28:21,484 WARNING [train.py:1204] (2/4) Exclude cut with ID _6653_210_2629_1_1527919172441_6732679_1103-56445-0 from training. Duration: 0.73 2023-10-03 19:28:21,524 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_3474_210_2028_1_1526188513_4825246_145-309131-0 from training. Duration: 0.74 2023-10-03 19:28:22,695 WARNING [train.py:1204] (2/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_120-211630-0_sp1.1 from training. Duration: 0.836375 2023-10-03 19:28:22,706 WARNING [train.py:1204] (2/4) Exclude cut with ID _8030_210_7497_1_1528444886781_3531089_32-31837-0 from training. Duration: 0.97 2023-10-03 19:28:23,958 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_19949_210_2161_1_1531706337_928079_8-160956-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 19:28:24,009 WARNING [train.py:1204] (2/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_321-215742-0_sp1.1 from training. Duration: 0.4545625 2023-10-03 19:28:27,306 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_583-124650-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 19:28:28,669 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_30434_210_13049_1_1532519384_981746_164-182755-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 19:28:31,703 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.feed_forward2.hidden_balancer.prob, batch_count=1380140.0, ans=0.125 2023-10-03 19:28:32,922 WARNING [train.py:1204] (2/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_339-104791-0_sp1.1 from training. Duration: 0.4454375 2023-10-03 19:28:32,952 WARNING [train.py:1204] (2/4) Exclude cut with ID _15026_210_8750_1_1530777602316_3291850_211-195043-0 from training. Duration: 0.8 2023-10-03 19:28:34,882 WARNING [train.py:1204] (2/4) Exclude cut with ID _29877_210_6228_1_1532397791106_5276149_487-182647-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 19:28:34,939 WARNING [train.py:1204] (2/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_127-566-0_sp0.9 from training. Duration: 0.9333125 2023-10-03 19:28:36,285 WARNING [train.py:1204] (2/4) Exclude cut with ID _8263_210_1664_1_1528439452891_970220_68-181805-0_sp0.9 from training. Duration: 0.711125 2023-10-03 19:28:36,287 WARNING [train.py:1204] (2/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_504-72513-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 19:28:36,295 WARNING [train.py:1204] (2/4) Exclude cut with ID _64639_210_5816_1_1535367619437_3528000_324-265233-0_sp1.1 from training. Duration: 0.963625 2023-10-03 19:28:37,580 WARNING [train.py:1204] (2/4) Exclude cut with ID _6456_210_1534_1_1527681634212_3181049_215-309359-0_sp0.9 from training. Duration: 0.97775 2023-10-03 19:28:37,585 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_115-233929-0_sp1.1 from training. Duration: 0.6545625 2023-10-03 19:28:37,619 WARNING [train.py:1204] (2/4) Exclude cut with ID _14087_210_2253_1_1530529224512_3014029_219-224445-0 from training. Duration: 0.84 2023-10-03 19:28:40,862 WARNING [train.py:1204] (2/4) Exclude cut with ID _14867_210_1968_1_1530684179890_3846969_413-29292-0_sp1.1 from training. Duration: 0.8181875 2023-10-03 19:28:40,924 WARNING [train.py:1204] (2/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_1012-279473-0_sp0.9 from training. Duration: 0.7444375 2023-10-03 19:28:43,471 WARNING [train.py:1204] (2/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_605-133395-0_sp0.9 from training. Duration: 0.7333125 2023-10-03 19:28:43,723 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.nonlin_attention.balancer.prob, batch_count=1380206.6666666667, ans=0.125 2023-10-03 19:28:45,299 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_89-35748-0 from training. Duration: 0.79 2023-10-03 19:28:45,405 WARNING [train.py:1204] (2/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_476-272757-0_sp1.1 from training. Duration: 0.8454375 2023-10-03 19:28:49,632 WARNING [train.py:1204] (2/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_449-15508-0_sp0.9 from training. Duration: 0.911125 2023-10-03 19:28:52,297 WARNING [train.py:1204] (2/4) Exclude cut with ID _29345_210_4381_1_1532689168263_3860169_198-174328-0 from training. Duration: 0.91 2023-10-03 19:28:55,718 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_15517_210_6753_1_1530952293_1664795_125-158157-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 19:29:01,426 WARNING [train.py:1204] (2/4) Exclude cut with ID _25565_210_12392_1_1532423009382_3952098_420-113180-0_sp1.1 from training. Duration: 0.963625 2023-10-03 19:29:02,719 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.590e+02 1.915e+02 2.051e+02 2.368e+02 4.802e+02, threshold=4.101e+02, percent-clipped=1.0 2023-10-03 19:29:02,789 WARNING [train.py:1204] (2/4) Exclude cut with ID _51471_210_1889_1_1534399391390_3094250_236-146773-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 19:29:05,002 WARNING [train.py:1204] (2/4) Exclude cut with ID _30039_210_13270_1_1532484612900_2725150_175-35012-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 19:29:05,042 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_23850_210_4238_1_1532232597_1632433_99-275843-0_sp1.1 from training. Duration: 0.936375 2023-10-03 19:29:07,758 WARNING [train.py:1204] (2/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_658-282610-0 from training. Duration: 0.53 2023-10-03 19:29:10,944 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_37818_210_4238_1_1533620910_4720083_234-65934-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 19:29:12,462 WARNING [train.py:1204] (2/4) Exclude cut with ID _3067_210_2667_1_1526212974640_4503168_459-173867-0_sp1.1 from training. Duration: 0.8 2023-10-03 19:29:12,488 WARNING [train.py:1204] (2/4) Exclude cut with ID _8696_210_1876_1_1528545632440_3345058_209-54069-0_sp0.9 from training. Duration: 0.7444375 2023-10-03 19:29:14,101 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.conv_skip_rate, batch_count=1380340.0, ans=0.0 2023-10-03 19:29:16,975 WARNING [train.py:1204] (2/4) Exclude cut with ID _62574_210_2655_1_1535180407058_3933249_420-236465-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 19:29:18,285 WARNING [train.py:1204] (2/4) Exclude cut with ID _46681_210_9642_1_1533893005571_7603359_333-4042-0_sp1.1 from training. Duration: 0.936375 2023-10-03 19:29:18,369 WARNING [train.py:1204] (2/4) Exclude cut with ID _3541_210_2477_1_1526475408709_3263973_377-296839-0 from training. Duration: 0.55 2023-10-03 19:29:23,162 WARNING [train.py:1204] (2/4) Exclude cut with ID _43997_210_4525_1_1533538632555_5189329_426-92599-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 19:29:23,361 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.ff2_skip_rate, batch_count=1380340.0, ans=0.0 2023-10-03 19:29:24,589 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_43-71989-0_sp1.1 from training. Duration: 0.5181875 2023-10-03 19:29:26,031 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_2009_210_2071_1_1525481134_4588937_140-345530-0_sp1.1 from training. Duration: 0.963625 2023-10-03 19:29:27,420 WARNING [train.py:1204] (2/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_138-332773-0_sp0.9 from training. Duration: 0.9 2023-10-03 19:29:27,493 WARNING [train.py:1204] (2/4) Exclude cut with ID _13942_210_9988_1_1530604906036_3733360_499-55676-0_sp0.9 from training. Duration: 0.688875 2023-10-03 19:29:27,513 WARNING [train.py:1204] (2/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_259-34038-0_sp1.1 from training. Duration: 0.67275 2023-10-03 19:29:27,529 WARNING [train.py:1204] (2/4) Exclude cut with ID _3120_210_3525_1_1526036143112_4072980_404-249994-0_sp1.1 from training. Duration: 0.9 2023-10-03 19:29:28,759 WARNING [train.py:1204] (2/4) Exclude cut with ID _40402_210_18219_1_1533522324115_2953150_222-108810-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 19:29:30,175 INFO [train.py:1046] (2/4) Epoch 39, batch 5200, loss[loss=0.1836, simple_loss=0.2683, pruned_loss=0.0495, over 24314.00 frames. ], tot_loss[loss=0.1589, simple_loss=0.2381, pruned_loss=0.0398, over 4699756.92 frames. ], batch size: 77, lr: 2.59e-03, grad_scale: 16.0 2023-10-03 19:29:30,476 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.feed_forward1.out_proj.dropout_p, batch_count=1380406.6666666667, ans=0.1 2023-10-03 19:29:31,666 WARNING [train.py:1204] (2/4) Exclude cut with ID _22083_210_8254_1_1531702862439_3678970_49-234546-0_sp0.9 from training. Duration: 0.92225 2023-10-03 19:29:33,018 WARNING [train.py:1204] (2/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_199-295611-0_sp0.9 from training. Duration: 0.611125 2023-10-03 19:29:36,372 WARNING [train.py:1204] (2/4) Exclude cut with ID _43552_210_8913_1_1533726031059_3523429_43-192945-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 19:29:38,079 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.bypass.scale_min, batch_count=1380406.6666666667, ans=0.2 2023-10-03 19:29:39,092 WARNING [train.py:1204] (2/4) Exclude cut with ID _14224_210_4381_1_1530529062037_4264010_364-22817-0 from training. Duration: 0.71 2023-10-03 19:29:41,015 WARNING [train.py:1204] (2/4) Exclude cut with ID _14922_210_6228_1_1530707435273_3693310_388-129552-0_sp1.1 from training. Duration: 0.87275 2023-10-03 19:29:41,062 WARNING [train.py:1204] (2/4) Exclude cut with ID _6537_210_1883_1_1527987559710_3597129_190-280227-0_sp1.1 from training. Duration: 0.97275 2023-10-03 19:29:41,316 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.conv_module1.balancer1.prob, batch_count=1380406.6666666667, ans=0.125 2023-10-03 19:29:43,727 WARNING [train.py:1204] (2/4) Exclude cut with ID _55844_210_2346_1_1534471212569_7625707_398-56897-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 19:29:45,512 WARNING [train.py:1204] (2/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_48-182155-0_sp1.1 from training. Duration: 0.7909375 2023-10-03 19:29:45,545 WARNING [train.py:1204] (2/4) Exclude cut with ID _29529_210_7234_1_1532393961455_3977460_582-18084-0_sp1.1 from training. Duration: 0.97275 2023-10-03 19:29:46,992 WARNING [train.py:1204] (2/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_912-282412-0 from training. Duration: 0.49 2023-10-03 19:29:49,753 WARNING [train.py:1204] (2/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_332-256168-0_sp0.9 from training. Duration: 0.8333125 2023-10-03 19:29:51,068 WARNING [train.py:1204] (2/4) Exclude cut with ID _64220_210_9527_1_1535250151772_4875259_55-250624-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 19:29:52,490 WARNING [train.py:1204] (2/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_264-108685-0 from training. Duration: 0.55 2023-10-03 19:29:55,824 WARNING [train.py:1204] (2/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_613-232856-0_sp0.9 from training. Duration: 0.6 2023-10-03 19:29:55,951 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.conv_module1.balancer1.prob, batch_count=1380473.3333333333, ans=0.125 2023-10-03 19:29:57,172 WARNING [train.py:1204] (2/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_68-212801-0_sp1.1 from training. Duration: 0.663625 2023-10-03 19:29:58,489 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6290_210_3273_1_1527595224_1282787_115-87095-0 from training. Duration: 0.55 2023-10-03 19:29:58,538 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_3080_210_2071_1_1526086102_4365166_312-8726-0 from training. Duration: 0.91 2023-10-03 19:30:01,212 WARNING [train.py:1204] (2/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_105-63309-0 from training. Duration: 0.98 2023-10-03 19:30:02,512 WARNING [train.py:1204] (2/4) Exclude cut with ID _2941_210_2577_1_1526199673150_2489759_201-160779-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 19:30:02,515 WARNING [train.py:1204] (2/4) Exclude cut with ID ID0406W0226-885434 from training. Duration: 0.997 2023-10-03 19:30:02,521 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_17292_210_6753_1_1531125388_2943734_353-328201-0_sp1.1 from training. Duration: 0.97275 2023-10-03 19:30:03,973 WARNING [train.py:1204] (2/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_572-96029-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 19:30:05,754 WARNING [train.py:1204] (2/4) Exclude cut with ID _14970_210_15709_1_1530775547361_1966965_6-23328-0_sp0.9 from training. Duration: 0.9555625 2023-10-03 19:30:07,128 WARNING [train.py:1204] (2/4) Exclude cut with ID _45012_210_12651_1_1533608435136_5371380_240-37101-0 from training. Duration: 0.93 2023-10-03 19:30:07,195 WARNING [train.py:1204] (2/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_685-90183-0_sp1.1 from training. Duration: 0.936375 2023-10-03 19:30:08,647 WARNING [train.py:1204] (2/4) Exclude cut with ID _13252_210_1925_1_1530671372047_3336909_41-22245-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 19:30:12,681 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_15126_210_6753_1_1530778677_2591185_120-277707-0 from training. Duration: 0.8 2023-10-03 19:30:12,732 WARNING [train.py:1204] (2/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_310-26186-0 from training. Duration: 0.7 2023-10-03 19:30:12,747 WARNING [train.py:1204] (2/4) Exclude cut with ID _14998_210_5219_1_1530705582080_7474970_787-294416-0 from training. Duration: 0.82 2023-10-03 19:30:17,429 WARNING [train.py:1204] (2/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_795-33252-0 from training. Duration: 0.37 2023-10-03 19:30:17,482 WARNING [train.py:1204] (2/4) Exclude cut with ID _7537_210_2084_1_1528502395431_3924359_113-199933-0_sp0.9 from training. Duration: 0.6555625 2023-10-03 19:30:22,227 WARNING [train.py:1204] (2/4) Exclude cut with ID _3465_210_3525_1_1526175667060_3574049_308-231735-0_sp0.9 from training. Duration: 0.811125 2023-10-03 19:30:22,246 WARNING [train.py:1204] (2/4) Exclude cut with ID _19504_210_9994_1_1531648863242_3325220_354-296765-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 19:30:24,051 WARNING [train.py:1204] (2/4) Exclude cut with ID _5409_210_1388_1_1527292850189_5865191_178-127970-0 from training. Duration: 0.57 2023-10-03 19:30:24,092 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_1466-248056-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 19:30:24,108 WARNING [train.py:1204] (2/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_535-14410-0_sp0.9 from training. Duration: 0.52225 2023-10-03 19:30:24,117 WARNING [train.py:1204] (2/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_747-129941-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 19:30:24,313 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.feed_forward3.hidden_balancer.prob, batch_count=1380606.6666666667, ans=0.125 2023-10-03 19:30:25,344 WARNING [train.py:1204] (2/4) Exclude cut with ID _15012_210_1968_1_1530856820429_5095350_688-178849-0_sp1.1 from training. Duration: 0.5818125 2023-10-03 19:30:29,372 WARNING [train.py:1204] (2/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_356-45054-0_sp1.1 from training. Duration: 0.7818125 2023-10-03 19:30:30,755 WARNING [train.py:1204] (2/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_146-276944-0_sp0.9 from training. Duration: 0.92225 2023-10-03 19:30:33,559 WARNING [train.py:1204] (2/4) Exclude cut with ID _39204_210_4220_1_1533880547368_3191120_346-180306-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 19:30:35,470 WARNING [train.py:1204] (2/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_641-158017-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 19:30:35,471 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_19952_210_2161_1_1531875469_3851750_274-42137-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 19:30:40,959 WARNING [train.py:1204] (2/4) Exclude cut with ID _60804_210_9642_1_1534831199859_7268299_87-327819-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 19:30:42,275 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_9302_210_2550_1_1528889962_4655416_638-301620-0 from training. Duration: 0.47 2023-10-03 19:30:43,596 INFO [train.py:1046] (2/4) Epoch 39, batch 5250, loss[loss=0.1572, simple_loss=0.2492, pruned_loss=0.03265, over 24434.00 frames. ], tot_loss[loss=0.1582, simple_loss=0.2377, pruned_loss=0.03937, over 4700324.66 frames. ], batch size: 69, lr: 2.59e-03, grad_scale: 4.0 2023-10-03 19:30:43,663 WARNING [train.py:1204] (2/4) Exclude cut with ID _44304_210_15374_1_1533639352765_4613129_302-22084-0_sp1.1 from training. Duration: 0.7818125 2023-10-03 19:30:43,673 WARNING [train.py:1204] (2/4) Exclude cut with ID _18513_210_9206_1_1531618215223_3862620_177-227063-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 19:30:43,768 WARNING [train.py:1204] (2/4) Exclude cut with ID _41326_210_5854_1_1533778076509_3492080_510-45444-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 19:30:45,766 WARNING [train.py:1204] (2/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_76-311963-0_sp1.1 from training. Duration: 0.636375 2023-10-03 19:30:45,861 WARNING [train.py:1204] (2/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_156-302675-0_sp0.9 from training. Duration: 0.611125 2023-10-03 19:30:47,327 WARNING [train.py:1204] (2/4) Exclude cut with ID _53489_210_6228_1_1534298357169_3835970_367-148411-0_sp1.1 from training. Duration: 0.936375 2023-10-03 19:30:47,530 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.conv_module2.balancer2.prob, batch_count=1380740.0, ans=0.125 2023-10-03 19:30:50,752 WARNING [train.py:1204] (2/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_389-144525-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 19:30:52,544 WARNING [train.py:1204] (2/4) Exclude cut with ID _14069_210_5632_1_1530512692033_4803390_158-238427-0_sp1.1 from training. Duration: 0.963625 2023-10-03 19:30:52,631 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_486-151131-0_sp0.9 from training. Duration: 0.9444375 2023-10-03 19:30:55,471 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.conv_module2.balancer2.prob, batch_count=1380740.0, ans=0.125 2023-10-03 19:30:59,311 WARNING [train.py:1204] (2/4) Exclude cut with ID _39846_210_12651_1_1533603651992_1318030_188-71831-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 19:31:00,678 WARNING [train.py:1204] (2/4) Exclude cut with ID _39411_210_5986_1_1533538825030_3343649_173-238248-0_sp1.1 from training. Duration: 0.7545625 2023-10-03 19:31:02,124 WARNING [train.py:1204] (2/4) Exclude cut with ID _20684_210_6917_1_1531567751646_4203930_469-49081-0_sp1.1 from training. Duration: 0.92725 2023-10-03 19:31:04,812 WARNING [train.py:1204] (2/4) Exclude cut with ID _8833_210_5219_1_1528592665210_7553059_611-80313-0_sp1.1 from training. Duration: 0.5818125 2023-10-03 19:31:04,953 WARNING [train.py:1204] (2/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_440-141811-0 from training. Duration: 0.95 2023-10-03 19:31:06,853 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_41211_210_2544_1_1533348859_3702910_236-223652-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 19:31:06,943 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_40222_210_6228_1_1533385298_4481939_220-163998-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 19:31:11,079 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=1380806.6666666667, ans=0.1 2023-10-03 19:31:12,376 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.conv_module1.balancer2.prob, batch_count=1380873.3333333333, ans=0.125 2023-10-03 19:31:26,012 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.3.encoder.layers.0.nonlin_attention.whiten2, num_groups=1, num_channels=512, metric=5.87 vs. limit=15.0 2023-10-03 19:31:28,184 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.attention_skip_rate, batch_count=1380940.0, ans=0.0 2023-10-03 19:31:31,832 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.542e+02 1.831e+02 2.019e+02 2.223e+02 3.571e+02, threshold=4.037e+02, percent-clipped=0.0 2023-10-03 19:31:33,795 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.conv_module1.balancer1.prob, batch_count=1380940.0, ans=0.125 2023-10-03 19:31:44,453 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.1.nonlin_attention.balancer.prob, batch_count=1381006.6666666667, ans=0.125 2023-10-03 19:31:45,882 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.conv_module1.balancer2.min_abs, batch_count=1381006.6666666667, ans=0.5 2023-10-03 19:31:53,162 INFO [train.py:1046] (2/4) Epoch 39, batch 5300, loss[loss=0.1706, simple_loss=0.2507, pruned_loss=0.04529, over 23403.00 frames. ], tot_loss[loss=0.1573, simple_loss=0.2363, pruned_loss=0.03913, over 4693050.66 frames. ], batch size: 93, lr: 2.59e-03, grad_scale: 8.0 2023-10-03 19:32:08,569 WARNING [train.py:1204] (2/4) Exclude cut with ID _14629_210_15709_1_1530604298182_956899_6-232695-0_sp1.1 from training. Duration: 0.8545625 2023-10-03 19:32:08,604 WARNING [train.py:1204] (2/4) Exclude cut with ID _13921_210_9994_1_1530841782272_3503520_327-69677-0 from training. Duration: 0.9 2023-10-03 19:32:08,630 WARNING [train.py:1204] (2/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_372-298093-0 from training. Duration: 0.8 2023-10-03 19:32:08,643 WARNING [train.py:1204] (2/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_515-139111-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 19:32:08,889 WARNING [train.py:1204] (2/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_85-65056-0_sp1.1 from training. Duration: 0.87275 2023-10-03 19:32:08,931 WARNING [train.py:1204] (2/4) Exclude cut with ID _15113_210_8254_1_1530842438579_3716279_209-54533-0_sp1.1 from training. Duration: 0.87275 2023-10-03 19:32:08,978 WARNING [train.py:1204] (2/4) Exclude cut with ID _70447_210_12392_1_1535972499300_3160420_123-312838-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 19:32:09,008 WARNING [train.py:1204] (2/4) Exclude cut with ID _60113_210_16271_1_1535164212481_4386079_70-63596-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 19:32:09,036 WARNING [train.py:1204] (2/4) Exclude cut with ID _14632_210_9988_1_1530691279392_3923200_225-188509-0_sp1.1 from training. Duration: 0.963625 2023-10-03 19:32:09,087 WARNING [train.py:1204] (2/4) Exclude cut with ID _34734_210_4525_1_1533365977542_4448720_584-180532-0_sp1.1 from training. Duration: 0.97275 2023-10-03 19:32:09,101 WARNING [train.py:1204] (2/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_351-294650-0_sp1.1 from training. Duration: 0.57275 2023-10-03 19:32:09,439 WARNING [train.py:1204] (2/4) Exclude cut with ID _13252_210_1925_1_1530671372047_3336909_263-168388-0_sp0.9 from training. Duration: 0.9555625 2023-10-03 19:32:09,466 WARNING [train.py:1204] (2/4) Exclude cut with ID _22083_210_8254_1_1531702862439_3678970_201-85738-0 from training. Duration: 0.9 2023-10-03 19:32:09,558 WARNING [train.py:1204] (2/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_237-58433-0 from training. Duration: 0.74 2023-10-03 19:32:09,586 WARNING [train.py:1204] (2/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_262-250826-0 from training. Duration: 0.99 2023-10-03 19:32:09,689 WARNING [train.py:1204] (2/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_927-43827-0_sp0.9 from training. Duration: 0.82225 2023-10-03 19:32:09,719 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_2821_210_2715_1_1526108385_3763560_73-296673-0 from training. Duration: 0.83 2023-10-03 19:32:09,795 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6287_210_3273_1_1527509506_2653716_182-199887-0 from training. Duration: 0.51 2023-10-03 19:32:09,928 WARNING [train.py:1204] (2/4) Exclude cut with ID _70519_210_4163_1_1535961595994_7458230_899-80223-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 19:32:10,618 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_210-234949-0_sp1.1 from training. Duration: 0.97275 2023-10-03 19:32:10,657 WARNING [train.py:1204] (2/4) Exclude cut with ID _30196_210_1664_1_1533708496522_7074140_768-278176-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 19:32:10,737 WARNING [train.py:1204] (2/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_315-128283-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 19:32:10,820 WARNING [train.py:1204] (2/4) Exclude cut with ID _14886_210_4220_1_1530702020355_3516190_330-323941-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 19:32:11,094 WARNING [train.py:1204] (2/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_264-348896-0_sp1.1 from training. Duration: 0.77275 2023-10-03 19:32:11,115 WARNING [train.py:1204] (2/4) Exclude cut with ID _37086_210_9038_1_1532999540891_7021250_433-10563-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 19:32:11,153 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_53638_210_21581_1_1534297319_5808449_885-80172-0_sp1.1 from training. Duration: 0.87275 2023-10-03 19:32:11,251 WARNING [train.py:1204] (2/4) Exclude cut with ID _14623_210_1925_1_1530773354962_3632609_157-218771-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 19:32:11,262 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_1368-78165-0_sp1.1 from training. Duration: 0.97275 2023-10-03 19:32:11,266 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_309-65177-0_sp0.9 from training. Duration: 0.97775 2023-10-03 19:32:11,271 WARNING [train.py:1204] (2/4) Exclude cut with ID _21632_210_2253_1_1531994001765_3744140_291-138812-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 19:32:11,283 WARNING [train.py:1204] (2/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_669-93163-0_sp1.1 from training. Duration: 0.8909375 2023-10-03 19:32:11,838 WARNING [train.py:1204] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_292-215346-0 from training. Duration: 0.97 2023-10-03 19:32:11,863 WARNING [train.py:1204] (2/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_286-222073-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 19:32:12,158 WARNING [train.py:1204] (2/4) Exclude cut with ID _59256_210_5986_1_1535076085997_3589120_121-237386-0_sp1.1 from training. Duration: 0.87275 2023-10-03 19:32:12,172 WARNING [train.py:1204] (2/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_96-320736-0 from training. Duration: 0.86 2023-10-03 19:32:12,182 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_128-202104-0 from training. Duration: 0.63 2023-10-03 19:32:12,279 WARNING [train.py:1204] (2/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_242-3472-0_sp0.9 from training. Duration: 0.811125 2023-10-03 19:32:12,330 WARNING [train.py:1204] (2/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_154-74182-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 19:32:12,334 WARNING [train.py:1204] (2/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_187-221566-0 from training. Duration: 0.43 2023-10-03 19:32:12,509 WARNING [train.py:1204] (2/4) Exclude cut with ID _6194_210_1534_1_1527511826160_3941194_14-282876-0 from training. Duration: 0.71 2023-10-03 19:32:12,552 WARNING [train.py:1204] (2/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_660-211088-0_sp1.1 from training. Duration: 0.563625 2023-10-03 19:32:13,365 WARNING [train.py:1204] (2/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_526-62079-0_sp1.1 from training. Duration: 0.6090625 2023-10-03 19:32:13,514 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_92-246270-0_sp1.1 from training. Duration: 0.77275 2023-10-03 19:32:13,601 WARNING [train.py:1204] (2/4) Exclude cut with ID IC0287W0023-142350 from training. Duration: 0.956 2023-10-03 19:32:13,675 WARNING [train.py:1204] (2/4) Exclude cut with ID _58605_210_12210_1_1534723195939_3904259_48-269402-0 from training. Duration: 0.96 2023-10-03 19:32:13,689 WARNING [train.py:1204] (2/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_416-257730-0_sp0.9 from training. Duration: 0.72225 2023-10-03 19:32:13,763 WARNING [train.py:1204] (2/4) Exclude cut with ID _7877_210_6014_1_1528286358099_3750136_185-17516-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 19:32:13,868 WARNING [train.py:1204] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_23-189234-0 from training. Duration: 0.83 2023-10-03 19:32:13,885 WARNING [train.py:1204] (2/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_237-89684-0 from training. Duration: 0.96 2023-10-03 19:32:13,957 WARNING [train.py:1204] (2/4) Exclude cut with ID _13916_210_9994_1_1530788341126_3763149_116-21034-0 from training. Duration: 0.96 2023-10-03 19:32:14,095 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_3133_210_2087_1_1526106446_2324690_246-125000-0_sp1.1 from training. Duration: 0.563625 2023-10-03 19:32:21,178 INFO [train.py:1046] (2/4) Epoch 40, batch 0, loss[loss=0.1526, simple_loss=0.2291, pruned_loss=0.03807, over 23373.00 frames. ], tot_loss[loss=0.1526, simple_loss=0.2291, pruned_loss=0.03807, over 23373.00 frames. ], batch size: 106, lr: 2.56e-03, grad_scale: 16.0 2023-10-03 19:32:21,179 INFO [train.py:1069] (2/4) Computing validation loss 2023-10-03 19:32:32,915 INFO [train.py:1078] (2/4) Epoch 40, validation: loss=0.3547, simple_loss=0.2733, pruned_loss=0.2181, over 1125622.00 frames. 2023-10-03 19:32:32,916 INFO [train.py:1079] (2/4) Maximum memory allocated so far is 20967MB 2023-10-03 19:32:34,341 WARNING [train.py:1204] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_402-253027-0 from training. Duration: 0.54 2023-10-03 19:32:34,401 WARNING [train.py:1204] (2/4) Exclude cut with ID _59288_210_5854_1_1535077757774_3412246_158-289345-0_sp1.1 from training. Duration: 0.92725 2023-10-03 19:32:37,101 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_28211_210_6388_1_1532861506_4523224_128-328162-0_sp1.1 from training. Duration: 0.8181875 2023-10-03 19:32:41,382 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_788-278281-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 19:32:41,390 WARNING [train.py:1204] (2/4) Exclude cut with ID _49728_210_12651_1_1534041034294_6788150_624-195301-0_sp0.9 from training. Duration: 0.9444375 2023-10-03 19:32:41,423 WARNING [train.py:1204] (2/4) Exclude cut with ID _14860_210_4915_1_1530702020340_7343240_267-139775-0_sp1.1 from training. Duration: 0.963625 2023-10-03 19:32:42,791 WARNING [train.py:1204] (2/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_332-221057-0 from training. Duration: 0.68 2023-10-03 19:32:44,182 WARNING [train.py:1204] (2/4) Exclude cut with ID _40402_210_18219_1_1533522324115_2953150_242-76943-0 from training. Duration: 0.94 2023-10-03 19:32:48,837 WARNING [train.py:1204] (2/4) Exclude cut with ID _16747_210_5986_1_1531811544733_2749109_87-349587-0_sp1.1 from training. Duration: 0.963625 2023-10-03 19:32:48,907 WARNING [train.py:1204] (2/4) Exclude cut with ID _22982_210_1883_1_1531882790447_5449409_505-346452-0_sp1.1 from training. Duration: 0.963625 2023-10-03 19:32:53,666 WARNING [train.py:1204] (2/4) Exclude cut with ID _17956_210_4353_1_1531209568125_6979970_13-192107-0_sp1.1 from training. Duration: 0.963625 2023-10-03 19:32:53,714 WARNING [train.py:1204] (2/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_430-307666-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 19:32:55,026 WARNING [train.py:1204] (2/4) Exclude cut with ID _2895_210_2477_1_1525956785794_4063750_289-11475-0_sp1.1 from training. Duration: 0.7454375 2023-10-03 19:32:55,046 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_3910_210_1794_1_1526742306_334530_28-238909-0_sp0.9 from training. Duration: 0.9 2023-10-03 19:32:57,083 WARNING [train.py:1204] (2/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_18-130336-0 from training. Duration: 0.79 2023-10-03 19:32:58,460 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_548-294316-0_sp0.9 from training. Duration: 0.9 2023-10-03 19:33:07,461 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_42-142473-0_sp1.1 from training. Duration: 0.6545625 2023-10-03 19:33:07,467 WARNING [train.py:1204] (2/4) Exclude cut with ID _59170_210_6721_1_1534983633960_4203335_326-48066-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 19:33:09,023 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_45886_210_17024_1_1533709215_471063_7-79165-0 from training. Duration: 0.98 2023-10-03 19:33:13,255 WARNING [train.py:1204] (2/4) Exclude cut with ID _44585_210_18628_1_1533605350387_4518199_280-73534-0_sp0.9 from training. Duration: 0.988875 2023-10-03 19:33:13,262 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_3475_210_2028_1_1526193578_3622569_265-276625-0_sp1.1 from training. Duration: 0.6909375 2023-10-03 19:33:14,734 WARNING [train.py:1204] (2/4) Exclude cut with ID _64631_210_27767_1_1535349580068_4165160_88-225232-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 19:33:16,165 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_205-265518-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 19:33:20,920 WARNING [train.py:1204] (2/4) Exclude cut with ID _40402_210_18219_1_1533522324115_2953150_271-253734-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 19:33:21,111 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=1381360.0, ans=0.1 2023-10-03 19:33:25,665 WARNING [train.py:1204] (2/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_282-257791-0 from training. Duration: 0.86 2023-10-03 19:33:29,240 WARNING [train.py:1204] (2/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_356-45054-0 from training. Duration: 0.86 2023-10-03 19:33:29,281 WARNING [train.py:1204] (2/4) Exclude cut with ID _69584_210_5219_1_1535785133775_4044450_38-348116-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 19:33:29,283 WARNING [train.py:1204] (2/4) Exclude cut with ID _66763_210_17301_1_1535788667617_241750_21-328480-0_sp1.1 from training. Duration: 0.936375 2023-10-03 19:33:30,603 WARNING [train.py:1204] (2/4) Exclude cut with ID _26528_210_9070_1_1532570410842_3851310_497-97218-0_sp0.9 from training. Duration: 0.9555625 2023-10-03 19:33:30,658 WARNING [train.py:1204] (2/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_592-101075-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 19:33:33,326 WARNING [train.py:1204] (2/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_678-159307-0 from training. Duration: 0.85 2023-10-03 19:33:36,483 WARNING [train.py:1204] (2/4) Exclude cut with ID _28427_210_4220_1_1532581240967_3061069_166-319773-0_sp1.1 from training. Duration: 0.936375 2023-10-03 19:33:36,570 WARNING [train.py:1204] (2/4) Exclude cut with ID _29877_210_6228_1_1532397791106_5276149_606-224984-0_sp1.1 from training. Duration: 0.936375 2023-10-03 19:33:39,424 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_125-203105-0_sp1.1 from training. Duration: 0.72725 2023-10-03 19:33:42,252 WARNING [train.py:1204] (2/4) Exclude cut with ID IC0620W0113-306765 from training. Duration: 0.768 2023-10-03 19:33:44,871 WARNING [train.py:1204] (2/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_410-287857-0_sp1.1 from training. Duration: 0.8454375 2023-10-03 19:33:46,341 WARNING [train.py:1204] (2/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_78-95260-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 19:33:47,706 INFO [train.py:1046] (2/4) Epoch 40, batch 50, loss[loss=0.162, simple_loss=0.2519, pruned_loss=0.03607, over 23411.00 frames. ], tot_loss[loss=0.1607, simple_loss=0.2414, pruned_loss=0.04004, over 1065427.81 frames. ], batch size: 93, lr: 2.56e-03, grad_scale: 16.0 2023-10-03 19:33:47,874 WARNING [train.py:1204] (2/4) Exclude cut with ID _58059_210_5854_1_1534901471149_3164808_6-73080-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 19:33:47,888 WARNING [train.py:1204] (2/4) Exclude cut with ID _5166_210_2084_1_1527073197160_4087993_331-117829-0 from training. Duration: 0.62 2023-10-03 19:33:49,134 WARNING [train.py:1204] (2/4) Exclude cut with ID _14880_210_8895_1_1530680426655_3795259_289-212817-0_sp0.9 from training. Duration: 0.7666875 2023-10-03 19:33:49,167 WARNING [train.py:1204] (2/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_620-144779-0_sp1.1 from training. Duration: 0.92725 2023-10-03 19:33:52,390 WARNING [train.py:1204] (2/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_561-288575-0_sp1.1 from training. Duration: 0.963625 2023-10-03 19:33:53,759 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_22657_210_6890_1_1532086002_1637216_122-188128-0_sp1.1 from training. Duration: 0.963625 2023-10-03 19:33:55,682 WARNING [train.py:1204] (2/4) Exclude cut with ID _16756_210_4915_1_1531047803763_7104259_604-86607-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 19:33:58,999 WARNING [train.py:1204] (2/4) Exclude cut with ID _15150_210_8341_1_1530752411803_7256384_469-239509-0 from training. Duration: 0.68 2023-10-03 19:33:59,007 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_10987_210_4163_1_1529760555_4123902_302-87926-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 19:34:05,043 WARNING [train.py:1204] (2/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_469-94673-0_sp0.9 from training. Duration: 0.87775 2023-10-03 19:34:06,474 WARNING [train.py:1204] (2/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_798-106179-0 from training. Duration: 0.83 2023-10-03 19:34:07,847 WARNING [train.py:1204] (2/4) Exclude cut with ID _16744_210_5986_1_1531724405384_3907567_242-111316-0 from training. Duration: 0.93 2023-10-03 19:34:07,995 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.1.feed_forward1.out_proj.dropout_p, batch_count=1381560.0, ans=0.1 2023-10-03 19:34:09,253 WARNING [train.py:1204] (2/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_938-205135-0_sp0.9 from training. Duration: 0.9666875 2023-10-03 19:34:10,643 WARNING [train.py:1204] (2/4) Exclude cut with ID _64135_210_2253_1_1535677267876_7608972_133-188266-0_sp0.9 from training. Duration: 0.9 2023-10-03 19:34:10,647 WARNING [train.py:1204] (2/4) Exclude cut with ID _44585_210_18628_1_1533605350387_4518199_623-346820-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 19:34:11,990 WARNING [train.py:1204] (2/4) Exclude cut with ID _42326_210_7770_1_1533814282531_3707269_454-266903-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 19:34:13,388 WARNING [train.py:1204] (2/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_5-55893-0_sp1.1 from training. Duration: 0.5 2023-10-03 19:34:13,420 WARNING [train.py:1204] (2/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_577-332600-0_sp1.1 from training. Duration: 0.5181875 2023-10-03 19:34:13,421 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_957-138882-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 19:34:19,955 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.618e+02 1.937e+02 2.112e+02 2.333e+02 3.745e+02, threshold=4.224e+02, percent-clipped=0.0 2023-10-03 19:34:21,970 WARNING [train.py:1204] (2/4) Exclude cut with ID _12556_210_8341_1_1530081024100_7327690_219-38004-0_sp1.1 from training. Duration: 0.97275 2023-10-03 19:34:22,075 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_397-147196-0_sp1.1 from training. Duration: 0.72725 2023-10-03 19:34:23,341 WARNING [train.py:1204] (2/4) Exclude cut with ID _3541_210_2477_1_1526475408709_3263973_231-104736-0_sp0.9 from training. Duration: 0.8666875 2023-10-03 19:34:24,660 WARNING [train.py:1204] (2/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_398-334558-0 from training. Duration: 0.96 2023-10-03 19:34:26,603 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_257-20733-0_sp1.1 from training. Duration: 0.6909375 2023-10-03 19:34:26,681 WARNING [train.py:1204] (2/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_146-282400-0_sp1.1 from training. Duration: 0.6454375 2023-10-03 19:34:26,685 WARNING [train.py:1204] (2/4) Exclude cut with ID _51899_210_20034_1_1534129254466_3669220_265-215428-0 from training. Duration: 0.98 2023-10-03 19:34:26,743 WARNING [train.py:1204] (2/4) Exclude cut with ID _34423_210_8750_1_1533430798456_3330199_219-106541-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 19:34:29,804 WARNING [train.py:1204] (2/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_458-67321-0 from training. Duration: 0.97 2023-10-03 19:34:32,664 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.balancer_ff3.min_abs, batch_count=1381693.3333333333, ans=0.2 2023-10-03 19:34:35,336 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.0.layers.1.self_attn_weights.whiten_keys, num_groups=4, num_channels=128, metric=2.93 vs. limit=6.0 2023-10-03 19:34:38,468 WARNING [train.py:1204] (2/4) Exclude cut with ID _6653_210_2629_1_1527919172441_6732679_1051-182606-0_sp1.1 from training. Duration: 0.936375 2023-10-03 19:34:38,485 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_53555_210_5632_1_1534595220_7232297_603-4396-0_sp1.1 from training. Duration: 0.97275 2023-10-03 19:34:39,335 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.1.encoder.layers.0.feed_forward1.out_whiten, num_groups=1, num_channels=256, metric=4.79 vs. limit=15.0 2023-10-03 19:34:39,831 WARNING [train.py:1204] (2/4) Exclude cut with ID _54218_210_12210_1_1534413646388_4409259_159-144367-0_sp1.1 from training. Duration: 0.963625 2023-10-03 19:34:41,151 WARNING [train.py:1204] (2/4) Exclude cut with ID _7606_210_4915_1_1528894253277_5080690_301-10732-0_sp0.9 from training. Duration: 0.9 2023-10-03 19:34:41,154 WARNING [train.py:1204] (2/4) Exclude cut with ID _15026_210_8750_1_1530777602316_3291850_211-195043-0_sp0.9 from training. Duration: 0.888875 2023-10-03 19:34:42,645 WARNING [train.py:1204] (2/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_146-282400-0 from training. Duration: 0.71 2023-10-03 19:34:43,859 WARNING [train.py:1204] (2/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_65-88242-0 from training. Duration: 0.56 2023-10-03 19:34:45,348 WARNING [train.py:1204] (2/4) Exclude cut with ID _55857_210_2346_1_1534644007831_6898209_80-107757-0_sp1.1 from training. Duration: 0.963625 2023-10-03 19:34:45,395 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_15126_210_6753_1_1530778677_2591185_120-277707-0_sp0.9 from training. Duration: 0.888875 2023-10-03 19:34:46,787 WARNING [train.py:1204] (2/4) Exclude cut with ID _44778_210_2054_1_1533798127203_7443090_1033-168593-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 19:34:48,127 WARNING [train.py:1204] (2/4) Exclude cut with ID _46683_210_9642_1_1533730913492_4591116_475-30711-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 19:34:48,152 WARNING [train.py:1204] (2/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_253-308377-0 from training. Duration: 0.49 2023-10-03 19:34:48,225 WARNING [train.py:1204] (2/4) Exclude cut with ID _4680_210_3525_1_1527249757953_893386_75-78979-0 from training. Duration: 0.91 2023-10-03 19:34:49,474 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_5522_210_3273_1_1527249606_3909096_318-203153-0_sp0.9 from training. Duration: 0.47775 2023-10-03 19:34:50,833 WARNING [train.py:1204] (2/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_550-170116-0_sp1.1 from training. Duration: 0.92725 2023-10-03 19:34:50,877 WARNING [train.py:1204] (2/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_277-210798-0_sp0.9 from training. Duration: 0.611125 2023-10-03 19:34:50,942 WARNING [train.py:1204] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_620-276793-0 from training. Duration: 0.59 2023-10-03 19:34:52,735 WARNING [train.py:1204] (2/4) Exclude cut with ID _3067_210_2667_1_1526212974640_4503168_119-175776-0 from training. Duration: 0.74 2023-10-03 19:34:52,822 WARNING [train.py:1204] (2/4) Exclude cut with ID _55209_210_1531_1_1534675828996_4597160_681-195827-0_sp1.1 from training. Duration: 0.92725 2023-10-03 19:34:54,185 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_427-78097-0_sp1.1 from training. Duration: 0.72725 2023-10-03 19:34:55,637 WARNING [train.py:1204] (2/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_96-290039-0_sp0.9 from training. Duration: 0.62225 2023-10-03 19:34:55,652 WARNING [train.py:1204] (2/4) Exclude cut with ID _45329_210_7005_1_1534157951211_6774339_287-292174-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 19:34:57,085 WARNING [train.py:1204] (2/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_200-292228-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 19:35:01,787 INFO [train.py:1046] (2/4) Epoch 40, batch 100, loss[loss=0.1664, simple_loss=0.2546, pruned_loss=0.03907, over 24649.00 frames. ], tot_loss[loss=0.1598, simple_loss=0.24, pruned_loss=0.03975, over 1868235.71 frames. ], batch size: 68, lr: 2.56e-03, grad_scale: 8.0 2023-10-03 19:35:01,841 WARNING [train.py:1204] (2/4) Exclude cut with ID _69884_210_9771_1_1535846392818_4166169_291-194999-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 19:35:03,825 WARNING [train.py:1204] (2/4) Exclude cut with ID _58895_210_8750_1_1535184081320_3649089_265-323810-0_sp1.1 from training. Duration: 0.87275 2023-10-03 19:35:06,567 WARNING [train.py:1204] (2/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_58-68956-0 from training. Duration: 0.83 2023-10-03 19:35:06,579 WARNING [train.py:1204] (2/4) Exclude cut with ID _39867_210_9826_1_1533553222030_9344259_322-115153-0_sp1.1 from training. Duration: 0.963625 2023-10-03 19:35:12,057 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_3371_210_1794_1_1526384462_5580777_396-339812-0_sp1.1 from training. Duration: 0.77275 2023-10-03 19:35:12,068 WARNING [train.py:1204] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_731-2428-0_sp0.9 from training. Duration: 0.92225 2023-10-03 19:35:12,086 WARNING [train.py:1204] (2/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_98-741-0_sp1.1 from training. Duration: 0.72725 2023-10-03 19:35:12,088 WARNING [train.py:1204] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_238-245655-0_sp0.9 from training. Duration: 0.9 2023-10-03 19:35:12,114 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6788_210_1388_1_1527898698_4562021_91-197981-0_sp0.9 from training. Duration: 0.92225 2023-10-03 19:35:13,461 WARNING [train.py:1204] (2/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_860-96198-0 from training. Duration: 0.73 2023-10-03 19:35:14,977 WARNING [train.py:1204] (2/4) Exclude cut with ID _14268_210_9206_1_1530622771285_4714099_331-105351-0_sp0.9 from training. Duration: 0.72225 2023-10-03 19:35:14,998 WARNING [train.py:1204] (2/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_204-137133-0_sp1.1 from training. Duration: 0.92725 2023-10-03 19:35:16,290 WARNING [train.py:1204] (2/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_5-46496-0_sp1.1 from training. Duration: 0.936375 2023-10-03 19:35:16,301 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_160-218721-0_sp1.1 from training. Duration: 0.87275 2023-10-03 19:35:20,585 WARNING [train.py:1204] (2/4) Exclude cut with ID _45329_210_7005_1_1534157951211_6774339_503-287883-0 from training. Duration: 0.88 2023-10-03 19:35:20,659 WARNING [train.py:1204] (2/4) Exclude cut with ID _57572_210_5517_1_1534921219624_3809484_110-79134-0_sp1.1 from training. Duration: 0.92725 2023-10-03 19:35:21,806 WARNING [train.py:1204] (2/4) Exclude cut with ID _44585_210_18628_1_1533605350387_4518199_421-37641-0_sp1.1 from training. Duration: 0.936375 2023-10-03 19:35:23,251 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_42-142473-0_sp0.9 from training. Duration: 0.8 2023-10-03 19:35:25,199 WARNING [train.py:1204] (2/4) Exclude cut with ID _5166_210_2084_1_1527073197160_4087993_310-114452-0_sp0.9 from training. Duration: 0.6555625 2023-10-03 19:35:29,179 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9029W0120-509520 from training. Duration: 0.768 2023-10-03 19:35:29,202 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9029W0433-509820 from training. Duration: 0.896 2023-10-03 19:35:29,368 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.bypass.skip_rate, batch_count=1381960.0, ans=0.07 2023-10-03 19:35:32,390 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_40222_210_6228_1_1533385298_4481939_163-334586-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 19:35:32,390 WARNING [train.py:1204] (2/4) Exclude cut with ID _48677_210_5663_1_1533902405919_3892989_408-40740-0_sp1.1 from training. Duration: 0.7545625 2023-10-03 19:35:37,032 WARNING [train.py:1204] (2/4) Exclude cut with ID _24314_210_4915_1_1532135078116_7170179_521-258868-0_sp0.9 from training. Duration: 0.87775 2023-10-03 19:35:38,589 WARNING [train.py:1204] (2/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_576-51198-0_sp1.1 from training. Duration: 0.92725 2023-10-03 19:35:40,536 WARNING [train.py:1204] (2/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_306-216871-0_sp1.1 from training. Duration: 0.97275 2023-10-03 19:35:42,320 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.feed_forward1.out_proj.dropout_p, batch_count=1381960.0, ans=0.1 2023-10-03 19:35:44,796 WARNING [train.py:1204] (2/4) Exclude cut with ID _20684_210_6917_1_1531567751646_4203930_463-119982-0_sp1.1 from training. Duration: 0.97275 2023-10-03 19:35:46,137 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9084W0012-536850 from training. Duration: 0.64 2023-10-03 19:35:47,683 WARNING [train.py:1204] (2/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_847-41334-0_sp1.1 from training. Duration: 0.4 2023-10-03 19:35:47,900 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.feed_forward3.hidden_balancer.prob, batch_count=1382026.6666666667, ans=0.125 2023-10-03 19:35:50,452 WARNING [train.py:1204] (2/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_326-165900-0_sp1.1 from training. Duration: 0.62725 2023-10-03 19:35:51,741 WARNING [train.py:1204] (2/4) Exclude cut with ID _7537_210_2084_1_1528502395431_3924359_128-268029-0_sp1.1 from training. Duration: 0.8909375 2023-10-03 19:35:54,484 WARNING [train.py:1204] (2/4) Exclude cut with ID _67499_210_17282_1_1535509805220_3692430_333-155030-0_sp1.1 from training. Duration: 0.97275 2023-10-03 19:35:57,704 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_34008_210_10431_1_1533345424_8183247_588-257177-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 19:35:59,228 WARNING [train.py:1204] (2/4) Exclude cut with ID _25914_210_6400_1_1532141317058_644740_52-96808-0_sp1.1 from training. Duration: 0.82725 2023-10-03 19:36:00,644 WARNING [train.py:1204] (2/4) Exclude cut with ID _56494_210_4220_1_1535284837625_3033169_69-138150-0_sp1.1 from training. Duration: 0.8545625 2023-10-03 19:36:03,951 WARNING [train.py:1204] (2/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_19-143553-0_sp1.1 from training. Duration: 0.97275 2023-10-03 19:36:05,311 WARNING [train.py:1204] (2/4) Exclude cut with ID _21878_210_3525_1_1531738772002_7264330_582-130735-0_sp1.1 from training. Duration: 0.936375 2023-10-03 19:36:05,430 WARNING [train.py:1204] (2/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_943-201356-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 19:36:05,440 WARNING [train.py:1204] (2/4) Exclude cut with ID _3231_210_2106_1_1526205864934_6784220_422-229276-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 19:36:07,222 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_48049_210_5632_1_1533864553_3519937_73-198717-0_sp1.1 from training. Duration: 0.97275 2023-10-03 19:36:08,682 WARNING [train.py:1204] (2/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_329-285801-0 from training. Duration: 0.91 2023-10-03 19:36:08,705 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9171W0114-579000 from training. Duration: 0.896 2023-10-03 19:36:08,716 WARNING [train.py:1204] (2/4) Exclude cut with ID _3120_210_3525_1_1526036143112_4072980_210-132675-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 19:36:10,065 WARNING [train.py:1204] (2/4) Exclude cut with ID _55857_210_2346_1_1534644007831_6898209_745-98391-0_sp0.9 from training. Duration: 0.8444375 2023-10-03 19:36:11,520 WARNING [train.py:1204] (2/4) Exclude cut with ID _5250_210_5219_1_1527249561806_8692110_615-322035-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 19:36:11,525 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_36756_210_7234_1_1533026991_966276_119-315767-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 19:36:11,541 WARNING [train.py:1204] (2/4) Exclude cut with ID _13916_210_9994_1_1530788341126_3763149_60-191556-0_sp1.1 from training. Duration: 0.5545625 2023-10-03 19:36:11,565 WARNING [train.py:1204] (2/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_177-236809-0_sp1.1 from training. Duration: 0.4454375 2023-10-03 19:36:11,617 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7002_210_1794_1_1527939683_5157286_184-229630-0_sp1.1 from training. Duration: 0.67275 2023-10-03 19:36:11,623 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_50428_210_5724_1_1534247532_4126418_464-18322-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 19:36:11,801 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.conv_module1.balancer2.prob, batch_count=1382093.3333333333, ans=0.125 2023-10-03 19:36:13,229 WARNING [train.py:1204] (2/4) Exclude cut with ID _35799_210_5854_1_1533263487887_3529071_466-131402-0_sp1.1 from training. Duration: 0.936375 2023-10-03 19:36:13,314 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_1259-132768-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 19:36:14,774 WARNING [train.py:1204] (2/4) Exclude cut with ID _6694_210_4599_1_1527852637490_3965529_144-185051-0_sp1.1 from training. Duration: 0.87275 2023-10-03 19:36:14,815 WARNING [train.py:1204] (2/4) Exclude cut with ID _64639_210_5816_1_1535367619437_3528000_59-133290-0_sp1.1 from training. Duration: 0.963625 2023-10-03 19:36:16,168 INFO [train.py:1046] (2/4) Epoch 40, batch 150, loss[loss=0.1462, simple_loss=0.2332, pruned_loss=0.02963, over 24672.00 frames. ], tot_loss[loss=0.158, simple_loss=0.2388, pruned_loss=0.03856, over 2512509.08 frames. ], batch size: 68, lr: 2.56e-03, grad_scale: 8.0 2023-10-03 19:36:16,436 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.feed_forward2.hidden_balancer.prob, batch_count=1382160.0, ans=0.125 2023-10-03 19:36:17,770 WARNING [train.py:1204] (2/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_223-49525-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 19:36:20,606 WARNING [train.py:1204] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_748-36150-0_sp1.1 from training. Duration: 0.82725 2023-10-03 19:36:20,606 WARNING [train.py:1204] (2/4) Exclude cut with ID _19896_210_1883_1_1531709022662_6649569_52-221627-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 19:36:20,655 WARNING [train.py:1204] (2/4) Exclude cut with ID _6789_210_1534_1_1527854471671_2870519_296-115766-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 19:36:20,872 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.self_attn_weights.pos_emb_skip_rate, batch_count=1382160.0, ans=0.0 2023-10-03 19:36:23,553 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.feed_forward1.hidden_balancer.prob, batch_count=1382160.0, ans=0.125 2023-10-03 19:36:24,566 WARNING [train.py:1204] (2/4) Exclude cut with ID _14624_210_1925_1_1530858690657_3642219_100-19871-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 19:36:24,614 WARNING [train.py:1204] (2/4) Exclude cut with ID _12555_210_8750_1_1530075594402_3780239_50-293675-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 19:36:26,232 WARNING [train.py:1204] (2/4) Exclude cut with ID _14880_210_8895_1_1530680426655_3795259_289-212817-0_sp1.1 from training. Duration: 0.62725 2023-10-03 19:36:28,063 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_14056_210_4163_1_1530604483_7572827_388-211626-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 19:36:32,941 WARNING [train.py:1204] (2/4) Exclude cut with ID _5289_210_2084_1_1527678202736_3779299_392-261988-0 from training. Duration: 0.65 2023-10-03 19:36:32,949 WARNING [train.py:1204] (2/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_124-345840-0 from training. Duration: 0.52 2023-10-03 19:36:33,146 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=1382226.6666666667, ans=0.1 2023-10-03 19:36:34,125 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_2655_210_2087_1_1525688959_3897400_437-337690-0 from training. Duration: 0.63 2023-10-03 19:36:35,735 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_3780_210_2087_1_1526467391_239068_13-274633-0_sp1.1 from training. Duration: 0.92725 2023-10-03 19:36:35,739 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_805-71497-0_sp1.1 from training. Duration: 0.6454375 2023-10-03 19:36:37,067 WARNING [train.py:1204] (2/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_434-83000-0_sp1.1 from training. Duration: 0.8909375 2023-10-03 19:36:38,475 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_17613_210_2151_1_1531137569_2366160_285-219531-0_sp1.1 from training. Duration: 0.97275 2023-10-03 19:36:38,480 WARNING [train.py:1204] (2/4) Exclude cut with ID _56108_210_16380_1_1534505446159_3887959_22-37858-0_sp1.1 from training. Duration: 0.936375 2023-10-03 19:36:38,518 WARNING [train.py:1204] (2/4) Exclude cut with ID _28427_210_4220_1_1532581240967_3061069_200-88444-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 19:36:40,366 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_43802_210_2290_1_1533516203_8829504_77-279042-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 19:36:41,824 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9309W0193-640320 from training. Duration: 0.896 2023-10-03 19:36:45,741 WARNING [train.py:1204] (2/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_516-324055-0_sp1.1 from training. Duration: 0.936375 2023-10-03 19:36:49,007 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.599e+02 1.890e+02 2.037e+02 2.278e+02 3.667e+02, threshold=4.074e+02, percent-clipped=0.0 2023-10-03 19:36:49,142 WARNING [train.py:1204] (2/4) Exclude cut with ID _40094_210_16380_1_1533294005773_3641159_10-261240-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 19:36:52,594 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.2.encoder.layers.2.whiten, num_groups=1, num_channels=384, metric=4.26 vs. limit=12.0 2023-10-03 19:36:53,365 WARNING [train.py:1204] (2/4) Exclude cut with ID _6267_210_2084_1_1527764658526_3894730_375-219887-0_sp1.1 from training. Duration: 0.6090625 2023-10-03 19:36:53,441 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6788_210_1388_1_1527898698_4562021_180-75288-0 from training. Duration: 0.99 2023-10-03 19:36:56,244 WARNING [train.py:1204] (2/4) Exclude cut with ID _8030_210_7497_1_1528444886781_3531089_262-86750-0_sp1.1 from training. Duration: 0.736375 2023-10-03 19:36:56,259 WARNING [train.py:1204] (2/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_934-325498-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 19:36:56,280 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_456-22141-0_sp0.9 from training. Duration: 0.97775 2023-10-03 19:36:59,584 WARNING [train.py:1204] (2/4) Exclude cut with ID _49643_210_7989_1_1534640244224_3939031_598-161926-0_sp1.1 from training. Duration: 0.8181875 2023-10-03 19:36:59,695 WARNING [train.py:1204] (2/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_345-49721-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 19:37:01,055 WARNING [train.py:1204] (2/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_440-141811-0_sp1.1 from training. Duration: 0.863625 2023-10-03 19:37:01,136 WARNING [train.py:1204] (2/4) Exclude cut with ID _64048_210_10626_1_1535198382802_3835179_394-212120-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 19:37:02,481 WARNING [train.py:1204] (2/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_91-277684-0 from training. Duration: 0.74 2023-10-03 19:37:07,038 WARNING [train.py:1204] (2/4) Exclude cut with ID _23038_210_9570_1_1532001584158_3652419_191-346343-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 19:37:08,339 WARNING [train.py:1204] (2/4) Exclude cut with ID _15140_210_9288_1_1530791388450_4880149_351-347974-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 19:37:08,374 WARNING [train.py:1204] (2/4) Exclude cut with ID _58605_210_12210_1_1534723195939_3904259_48-269402-0_sp1.1 from training. Duration: 0.87275 2023-10-03 19:37:08,383 WARNING [train.py:1204] (2/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_401-53423-0_sp0.9 from training. Duration: 0.611125 2023-10-03 19:37:08,642 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.nonlin_attention.balancer.prob, batch_count=1382360.0, ans=0.125 2023-10-03 19:37:11,706 WARNING [train.py:1204] (2/4) Exclude cut with ID _42659_210_2070_1_1533607442765_3120380_158-49004-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 19:37:13,152 WARNING [train.py:1204] (2/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_81-75849-0_sp1.1 from training. Duration: 0.3545625 2023-10-03 19:37:15,026 WARNING [train.py:1204] (2/4) Exclude cut with ID _27052_210_5986_1_1532329197187_3585009_314-106499-0_sp1.1 from training. Duration: 0.836375 2023-10-03 19:37:16,365 WARNING [train.py:1204] (2/4) Exclude cut with ID _4626_210_3532_1_1526817652717_3916859_446-206656-0_sp0.9 from training. Duration: 0.8444375 2023-10-03 19:37:17,691 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_53378_210_17282_1_1534221071_5769509_158-114960-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 19:37:18,433 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.1.encoder.layers.0.whiten, num_groups=1, num_channels=256, metric=5.15 vs. limit=12.0 2023-10-03 19:37:20,231 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_3171_210_2087_1_1526109084_2330007_37-64334-0_sp0.9 from training. Duration: 0.911125 2023-10-03 19:37:20,248 WARNING [train.py:1204] (2/4) Exclude cut with ID _5410_210_1388_1_1527397055795_1719020_39-112221-0 from training. Duration: 0.69 2023-10-03 19:37:20,267 WARNING [train.py:1204] (2/4) Exclude cut with ID _8064_210_5399_1_1528372299291_4282973_4-91037-0_sp0.9 from training. Duration: 0.97775 2023-10-03 19:37:20,293 WARNING [train.py:1204] (2/4) Exclude cut with ID ID0079W0443-717795 from training. Duration: 0.541 2023-10-03 19:37:21,934 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.feed_forward1.hidden_balancer.prob, batch_count=1382426.6666666667, ans=0.125 2023-10-03 19:37:22,243 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.4.encoder.layers.1.feed_forward1.out_whiten, num_groups=1, num_channels=384, metric=9.51 vs. limit=15.0 2023-10-03 19:37:23,049 WARNING [train.py:1204] (2/4) Exclude cut with ID _34734_210_4525_1_1533365977542_4448720_766-297928-0_sp1.1 from training. Duration: 0.936375 2023-10-03 19:37:26,394 WARNING [train.py:1204] (2/4) Exclude cut with ID _51471_210_1889_1_1534399391390_3094250_310-337793-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 19:37:26,411 WARNING [train.py:1204] (2/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_678-159307-0_sp0.9 from training. Duration: 0.9444375 2023-10-03 19:37:30,421 INFO [train.py:1046] (2/4) Epoch 40, batch 200, loss[loss=0.1754, simple_loss=0.2554, pruned_loss=0.04768, over 23779.00 frames. ], tot_loss[loss=0.1592, simple_loss=0.2397, pruned_loss=0.0394, over 3005548.25 frames. ], batch size: 212, lr: 2.56e-03, grad_scale: 8.0 2023-10-03 19:37:30,459 WARNING [train.py:1204] (2/4) Exclude cut with ID _50093_210_4220_1_1534417141207_3432299_259-322252-0 from training. Duration: 0.92 2023-10-03 19:37:30,521 WARNING [train.py:1204] (2/4) Exclude cut with ID _47790_210_16187_1_1533862814066_4207519_67-103444-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 19:37:30,546 WARNING [train.py:1204] (2/4) Exclude cut with ID _40551_210_10635_1_1533776424632_3416179_311-247578-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 19:37:33,227 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_4633_210_2040_1_1526824163_4145343_416-76117-0 from training. Duration: 0.95 2023-10-03 19:37:35,008 WARNING [train.py:1204] (2/4) Exclude cut with ID _5166_210_2084_1_1527073197160_4087993_331-117829-0_sp0.9 from training. Duration: 0.688875 2023-10-03 19:37:36,426 WARNING [train.py:1204] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_299-247059-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 19:37:37,642 WARNING [train.py:1204] (2/4) Exclude cut with ID _64002_210_9570_1_1535198605157_38979_2-240845-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 19:37:43,497 WARNING [train.py:1204] (2/4) Exclude cut with ID _4681_210_3525_1_1527250689464_120889_15-126760-0_sp0.9 from training. Duration: 0.9 2023-10-03 19:37:43,509 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_21500_210_2054_1_1532149310_2716758_256-156611-0_sp1.1 from training. Duration: 0.936375 2023-10-03 19:37:43,524 WARNING [train.py:1204] (2/4) Exclude cut with ID _16908_210_5724_1_1531119157248_3772700_624-203729-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 19:37:55,272 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.conv_module1.balancer1.prob, batch_count=1382560.0, ans=0.125 2023-10-03 19:38:02,480 WARNING [train.py:1204] (2/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_605-219732-0_sp1.1 from training. Duration: 0.8545625 2023-10-03 19:38:02,521 WARNING [train.py:1204] (2/4) Exclude cut with ID _44718_210_5816_1_1534050051869_6469030_380-167520-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 19:38:03,785 WARNING [train.py:1204] (2/4) Exclude cut with ID _19896_210_1883_1_1531709022662_6649569_94-229927-0_sp1.1 from training. Duration: 0.8818125 2023-10-03 19:38:04,049 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.bypass.skip_rate, batch_count=1382626.6666666667, ans=0.07 2023-10-03 19:38:05,111 WARNING [train.py:1204] (2/4) Exclude cut with ID _19514_210_9259_1_1531530071330_5084230_519-174300-0_sp1.1 from training. Duration: 0.963625 2023-10-03 19:38:05,191 WARNING [train.py:1204] (2/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_340-174176-0_sp0.9 from training. Duration: 0.5444375 2023-10-03 19:38:05,195 WARNING [train.py:1204] (2/4) Exclude cut with ID _8263_210_1664_1_1528439452891_970220_68-181805-0_sp1.1 from training. Duration: 0.5818125 2023-10-03 19:38:08,347 WARNING [train.py:1204] (2/4) Exclude cut with ID _3120_210_3525_1_1526036143112_4072980_348-257723-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 19:38:08,436 WARNING [train.py:1204] (2/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_453-133983-0_sp1.1 from training. Duration: 0.7090625 2023-10-03 19:38:09,815 WARNING [train.py:1204] (2/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_17-86060-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 19:38:09,844 WARNING [train.py:1204] (2/4) Exclude cut with ID _29877_210_6228_1_1532397791106_5276149_228-184657-0_sp1.1 from training. Duration: 0.97275 2023-10-03 19:38:11,221 WARNING [train.py:1204] (2/4) Exclude cut with ID _18516_210_9206_1_1531704653206_3692406_198-334870-0 from training. Duration: 0.92 2023-10-03 19:38:12,577 WARNING [train.py:1204] (2/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_340-174176-0_sp1.1 from training. Duration: 0.4454375 2023-10-03 19:38:13,923 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_3133_210_2087_1_1526106446_2324690_14-139186-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 19:38:16,138 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.feed_forward2.hidden_balancer.prob, batch_count=1382693.3333333333, ans=0.125 2023-10-03 19:38:17,265 WARNING [train.py:1204] (2/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_126-29104-0_sp0.9 from training. Duration: 0.9444375 2023-10-03 19:38:22,743 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.2.encoder.layers.2.self_attn_weights.whiten_keys, num_groups=4, num_channels=128, metric=3.18 vs. limit=6.0 2023-10-03 19:38:23,333 WARNING [train.py:1204] (2/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_84-10310-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 19:38:28,964 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_15517_210_6753_1_1530954008_3487336_79-273076-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 19:38:29,014 WARNING [train.py:1204] (2/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_371-18169-0_sp0.9 from training. Duration: 0.9555625 2023-10-03 19:38:36,458 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_68702_210_23689_1_1535799577_3420068_170-92788-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 19:38:37,844 WARNING [train.py:1204] (2/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_78-182912-0 from training. Duration: 0.59 2023-10-03 19:38:39,762 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_3019_210_2028_1_1526044245_3264865_297-12384-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 19:38:39,768 WARNING [train.py:1204] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_147-111137-0_sp1.1 from training. Duration: 0.863625 2023-10-03 19:38:39,774 WARNING [train.py:1204] (2/4) Exclude cut with ID _28323_210_16187_1_1532480395667_4100300_117-8859-0_sp1.1 from training. Duration: 0.97275 2023-10-03 19:38:41,116 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7762_210_1794_1_1528199508_4057408_419-125009-0_sp1.1 from training. Duration: 0.7545625 2023-10-03 19:38:41,208 WARNING [train.py:1204] (2/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_268-87909-0 from training. Duration: 0.99 2023-10-03 19:38:43,788 INFO [train.py:1046] (2/4) Epoch 40, batch 250, loss[loss=0.1466, simple_loss=0.2125, pruned_loss=0.0404, over 23637.00 frames. ], tot_loss[loss=0.1585, simple_loss=0.2388, pruned_loss=0.03912, over 3375852.38 frames. ], batch size: 256, lr: 2.56e-03, grad_scale: 8.0 2023-10-03 19:38:43,839 WARNING [train.py:1204] (2/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_118-311357-0_sp1.1 from training. Duration: 0.92725 2023-10-03 19:38:43,861 WARNING [train.py:1204] (2/4) Exclude cut with ID ID0377W0003-870225 from training. Duration: 0.852 2023-10-03 19:38:47,239 WARNING [train.py:1204] (2/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_56-5838-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 19:38:48,598 WARNING [train.py:1204] (2/4) Exclude cut with ID _14603_210_4220_1_1530615688441_3097319_90-31383-0_sp0.9 from training. Duration: 0.9666875 2023-10-03 19:38:48,686 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_31583_210_3852_1_1532595851_4024413_58-86657-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 19:38:48,723 WARNING [train.py:1204] (2/4) Exclude cut with ID _30304_210_6319_1_1532476827261_7582060_344-113875-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 19:38:50,889 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.ff3_skip_rate, batch_count=1382826.6666666667, ans=0.0 2023-10-03 19:38:51,827 WARNING [train.py:1204] (2/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_278-104676-0_sp1.1 from training. Duration: 0.8909375 2023-10-03 19:38:53,123 WARNING [train.py:1204] (2/4) Exclude cut with ID _55844_210_2346_1_1534471212569_7625707_764-2387-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 19:38:54,564 WARNING [train.py:1204] (2/4) Exclude cut with ID _27430_210_7770_1_1532604689375_1378590_157-14963-0_sp1.1 from training. Duration: 0.963625 2023-10-03 19:38:54,877 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.conv_module1.balancer2.prob, batch_count=1382826.6666666667, ans=0.125 2023-10-03 19:38:57,244 WARNING [train.py:1204] (2/4) Exclude cut with ID _7160_210_1876_1_1527940923231_3703799_282-322611-0_sp1.1 from training. Duration: 0.7909375 2023-10-03 19:39:07,015 WARNING [train.py:1204] (2/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_190-138640-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 19:39:10,357 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_33348_210_5632_1_1533086912_3324030_63-270922-0_sp1.1 from training. Duration: 0.97275 2023-10-03 19:39:12,369 WARNING [train.py:1204] (2/4) Exclude cut with ID _54871_210_22356_1_1534665798253_3856359_135-184281-0_sp1.1 from training. Duration: 0.9 2023-10-03 19:39:14,032 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.balancer1.prob, batch_count=1382960.0, ans=0.125 2023-10-03 19:39:16,358 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.534e+02 1.944e+02 2.127e+02 2.472e+02 3.844e+02, threshold=4.253e+02, percent-clipped=0.0 2023-10-03 19:39:16,590 WARNING [train.py:1204] (2/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_277-210798-0_sp1.1 from training. Duration: 0.5 2023-10-03 19:39:16,636 WARNING [train.py:1204] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_402-253027-0_sp0.9 from training. Duration: 0.6 2023-10-03 19:39:18,493 WARNING [train.py:1204] (2/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_540-275685-0_sp0.9 from training. Duration: 0.988875 2023-10-03 19:39:19,903 WARNING [train.py:1204] (2/4) Exclude cut with ID _59493_210_17282_1_1535078419077_3225879_118-18960-0_sp1.1 from training. Duration: 0.936375 2023-10-03 19:39:21,299 WARNING [train.py:1204] (2/4) Exclude cut with ID _6194_210_1534_1_1527511826160_3941194_321-148641-0_sp1.1 from training. Duration: 0.5818125 2023-10-03 19:39:21,307 WARNING [train.py:1204] (2/4) Exclude cut with ID _61133_210_9969_1_1535196777227_3696409_333-270325-0_sp1.1 from training. Duration: 0.8454375 2023-10-03 19:39:21,388 WARNING [train.py:1204] (2/4) Exclude cut with ID _49376_210_5986_1_1533985278732_3370740_307-145221-0_sp1.1 from training. Duration: 0.936375 2023-10-03 19:39:21,533 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=1382960.0, ans=0.1 2023-10-03 19:39:23,495 INFO [scaling.py:1118] (2/4) WithLoss: name=encoder.encoders.3.encoder.layers.3.self_attn_weights, loss-sum=0.000e+00 2023-10-03 19:39:24,440 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6846_210_6753_1_1527850767_4295158_2-315073-0_sp0.9 from training. Duration: 0.97775 2023-10-03 19:39:27,075 WARNING [train.py:1204] (2/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_17-25045-0 from training. Duration: 0.59 2023-10-03 19:39:27,101 WARNING [train.py:1204] (2/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_503-333510-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 19:39:28,489 WARNING [train.py:1204] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_238-245655-0_sp1.1 from training. Duration: 0.736375 2023-10-03 19:39:28,518 WARNING [train.py:1204] (2/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_329-82235-0_sp0.9 from training. Duration: 0.911125 2023-10-03 19:39:28,527 WARNING [train.py:1204] (2/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_325-113798-0_sp0.9 from training. Duration: 0.9444375 2023-10-03 19:39:29,910 WARNING [train.py:1204] (2/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_395-37222-0_sp0.9 from training. Duration: 0.8444375 2023-10-03 19:39:31,264 WARNING [train.py:1204] (2/4) Exclude cut with ID _64135_210_2253_1_1535677267876_7608972_498-5039-0_sp1.1 from training. Duration: 0.7090625 2023-10-03 19:39:31,273 WARNING [train.py:1204] (2/4) Exclude cut with ID _14922_210_6228_1_1530707435273_3693310_303-228738-0_sp0.9 from training. Duration: 0.9333125 2023-10-03 19:39:32,687 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_577-220899-0_sp1.1 from training. Duration: 0.92725 2023-10-03 19:39:34,115 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_3475_210_2028_1_1526193578_3622569_377-166133-0_sp1.1 from training. Duration: 0.7818125 2023-10-03 19:39:34,143 WARNING [train.py:1204] (2/4) Exclude cut with ID _49376_210_5986_1_1533985278732_3370740_272-313512-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 19:39:40,226 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7762_210_1794_1_1528199508_4057408_422-227034-0_sp0.9 from training. Duration: 0.811125 2023-10-03 19:39:43,579 WARNING [train.py:1204] (2/4) Exclude cut with ID _18895_210_6228_1_1531357274234_3784930_74-341428-0_sp1.1 from training. Duration: 0.92725 2023-10-03 19:39:46,399 WARNING [train.py:1204] (2/4) Exclude cut with ID _44902_210_16187_1_1533888006062_3792078_149-67212-0_sp1.1 from training. Duration: 0.7909375 2023-10-03 19:39:50,952 WARNING [train.py:1204] (2/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_243-126020-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 19:39:53,007 WARNING [train.py:1204] (2/4) Exclude cut with ID _54218_210_12210_1_1534413646388_4409259_259-6561-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 19:39:55,497 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_41452_210_7291_1_1533437687_3941281_405-11559-0 from training. Duration: 0.91 2023-10-03 19:39:56,844 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_20120_210_6753_1_1531643035_5482859_759-225084-0_sp1.1 from training. Duration: 0.97275 2023-10-03 19:39:58,061 INFO [train.py:1046] (2/4) Epoch 40, batch 300, loss[loss=0.1422, simple_loss=0.2268, pruned_loss=0.02885, over 24632.00 frames. ], tot_loss[loss=0.1569, simple_loss=0.2367, pruned_loss=0.03856, over 3676959.05 frames. ], batch size: 68, lr: 2.56e-03, grad_scale: 8.0 2023-10-03 19:39:58,101 WARNING [train.py:1204] (2/4) Exclude cut with ID _14747_210_8341_1_1530685797463_3654089_147-131229-0_sp0.9 from training. Duration: 0.8444375 2023-10-03 19:39:59,520 WARNING [train.py:1204] (2/4) Exclude cut with ID _14867_210_1968_1_1530684179890_3846969_319-250868-0 from training. Duration: 0.99 2023-10-03 19:39:59,553 WARNING [train.py:1204] (2/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_580-93798-0_sp0.9 from training. Duration: 0.62225 2023-10-03 19:40:00,895 WARNING [train.py:1204] (2/4) Exclude cut with ID _9679_210_7497_1_1529129212906_4363929_512-313889-0_sp0.9 from training. Duration: 0.9 2023-10-03 19:40:00,910 WARNING [train.py:1204] (2/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_18-189198-0 from training. Duration: 0.9 2023-10-03 19:40:03,746 WARNING [train.py:1204] (2/4) Exclude cut with ID _49755_210_18053_1_1534581005846_3218709_137-64179-0_sp1.1 from training. Duration: 0.92725 2023-10-03 19:40:05,128 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_48049_210_5632_1_1533864553_3519937_306-283794-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 19:40:06,638 WARNING [train.py:1204] (2/4) Exclude cut with ID _43552_210_8913_1_1533726031059_3523429_25-120161-0_sp1.1 from training. Duration: 0.8909375 2023-10-03 19:40:08,562 WARNING [train.py:1204] (2/4) Exclude cut with ID _8833_210_5219_1_1528592665210_7553059_319-349181-0 from training. Duration: 0.99 2023-10-03 19:40:09,819 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_296-332325-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 19:40:11,303 WARNING [train.py:1204] (2/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_436-186311-0_sp1.1 from training. Duration: 0.5909375 2023-10-03 19:40:11,314 WARNING [train.py:1204] (2/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_348-127789-0 from training. Duration: 0.85 2023-10-03 19:40:11,319 WARNING [train.py:1204] (2/4) Exclude cut with ID _3992_210_1747_1_1526644816031_3731060_443-195183-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 19:40:14,135 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.2.encoder.layers.2.self_attn_weights.whiten_keys, num_groups=4, num_channels=128, metric=2.98 vs. limit=6.0 2023-10-03 19:40:16,126 WARNING [train.py:1204] (2/4) Exclude cut with ID _3465_210_3525_1_1526175667060_3574049_308-231735-0_sp1.1 from training. Duration: 0.663625 2023-10-03 19:40:19,564 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6786_210_1388_1_1527839699_4194495_378-46070-0_sp1.1 from training. Duration: 0.6545625 2023-10-03 19:40:19,623 WARNING [train.py:1204] (2/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_63-101996-0 from training. Duration: 0.88 2023-10-03 19:40:25,508 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_2541_210_1794_1_1525590269_3084765_156-173713-0 from training. Duration: 0.81 2023-10-03 19:40:25,539 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_217-239796-0_sp1.1 from training. Duration: 0.963625 2023-10-03 19:40:26,994 WARNING [train.py:1204] (2/4) Exclude cut with ID _21878_210_3525_1_1531738772002_7264330_530-329400-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 19:40:28,487 WARNING [train.py:1204] (2/4) Exclude cut with ID _21783_210_11357_1_1531742458730_3762679_150-62920-0_sp1.1 from training. Duration: 0.963625 2023-10-03 19:40:28,488 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_5522_210_3273_1_1527249606_3909096_366-172508-0 from training. Duration: 0.66 2023-10-03 19:40:28,490 WARNING [train.py:1204] (2/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_303-340446-0_sp1.1 from training. Duration: 0.8181875 2023-10-03 19:40:31,094 WARNING [train.py:1204] (2/4) Exclude cut with ID _8699_210_2069_1_1528630346831_3960409_160-24408-0_sp1.1 from training. Duration: 0.82725 2023-10-03 19:40:32,543 WARNING [train.py:1204] (2/4) Exclude cut with ID _41314_210_12651_1_1533459476543_3766460_299-187801-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 19:40:33,880 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_34886_210_10633_1_1533203882_1755661_92-345288-0_sp1.1 from training. Duration: 0.97275 2023-10-03 19:40:36,675 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_5438_210_1794_1_1527336687_2600009_104-288274-0_sp1.1 from training. Duration: 0.52725 2023-10-03 19:40:36,679 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_154-316450-0 from training. Duration: 0.76 2023-10-03 19:40:38,490 WARNING [train.py:1204] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_23-189234-0_sp0.9 from training. Duration: 0.92225 2023-10-03 19:40:41,283 WARNING [train.py:1204] (2/4) Exclude cut with ID _4779_210_2084_1_1527318022944_3914893_346-168820-0_sp1.1 from training. Duration: 0.963625 2023-10-03 19:40:44,353 WARNING [train.py:1204] (2/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_5-55893-0 from training. Duration: 0.55 2023-10-03 19:40:45,721 WARNING [train.py:1204] (2/4) Exclude cut with ID _20455_210_5219_1_1531565996775_8098100_180-24517-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 19:40:49,786 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_3133_210_2087_1_1526106446_2324690_243-143119-0_sp0.9 from training. Duration: 0.8555625 2023-10-03 19:40:53,107 WARNING [train.py:1204] (2/4) Exclude cut with ID _65585_210_12210_1_1535450451584_3764098_293-212045-0_sp1.1 from training. Duration: 0.92725 2023-10-03 19:40:53,111 WARNING [train.py:1204] (2/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_577-332600-0 from training. Duration: 0.57 2023-10-03 19:40:57,381 WARNING [train.py:1204] (2/4) Exclude cut with ID _30039_210_13270_1_1532484612900_2725150_104-289874-0_sp1.1 from training. Duration: 0.963625 2023-10-03 19:40:57,388 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_3172_210_2087_1_1526292929_3987865_197-97762-0_sp1.1 from training. Duration: 0.6818125 2023-10-03 19:41:00,181 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_256-150908-0_sp1.1 from training. Duration: 0.963625 2023-10-03 19:41:01,566 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_296-287678-0_sp1.1 from training. Duration: 0.863625 2023-10-03 19:41:01,601 WARNING [train.py:1204] (2/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_443-30034-0 from training. Duration: 0.64 2023-10-03 19:41:01,634 WARNING [train.py:1204] (2/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_494-199538-0_sp0.9 from training. Duration: 0.8333125 2023-10-03 19:41:01,682 WARNING [train.py:1204] (2/4) Exclude cut with ID _19708_210_9206_1_1532136650733_4238364_61-162325-0_sp1.1 from training. Duration: 0.97275 2023-10-03 19:41:04,401 WARNING [train.py:1204] (2/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_475-127455-0 from training. Duration: 0.7 2023-10-03 19:41:05,742 WARNING [train.py:1204] (2/4) Exclude cut with ID _48677_210_5663_1_1533902405919_3892989_188-11581-0_sp1.1 from training. Duration: 0.963625 2023-10-03 19:41:05,772 WARNING [train.py:1204] (2/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_136-8381-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 19:41:07,176 WARNING [train.py:1204] (2/4) Exclude cut with ID _55328_210_27767_1_1534580973260_4088170_270-229555-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 19:41:08,469 WARNING [train.py:1204] (2/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_372-266951-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 19:41:08,532 WARNING [train.py:1204] (2/4) Exclude cut with ID _47790_210_16187_1_1533862814066_4207519_14-242149-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 19:41:11,606 INFO [train.py:1046] (2/4) Epoch 40, batch 350, loss[loss=0.1487, simple_loss=0.2268, pruned_loss=0.03525, over 23378.00 frames. ], tot_loss[loss=0.1554, simple_loss=0.2348, pruned_loss=0.03804, over 3890650.23 frames. ], batch size: 119, lr: 2.56e-03, grad_scale: 8.0 2023-10-03 19:41:13,123 WARNING [train.py:1204] (2/4) Exclude cut with ID _7877_210_6014_1_1528286358099_3750136_159-76512-0_sp1.1 from training. Duration: 0.936375 2023-10-03 19:41:13,125 WARNING [train.py:1204] (2/4) Exclude cut with ID _8263_210_1664_1_1528438953494_294100_40-204731-0_sp1.1 from training. Duration: 0.4545625 2023-10-03 19:41:14,786 INFO [scaling.py:1118] (2/4) WithLoss: name=encoder.encoders.4.encoder.layers.0.self_attn_weights, loss-sum=0.000e+00 2023-10-03 19:41:16,304 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8278_210_2550_1_1528637099_4220413_694-238413-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 19:41:20,682 WARNING [train.py:1204] (2/4) Exclude cut with ID _47278_210_12041_1_1533902418349_3576110_421-172710-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 19:41:22,626 WARNING [train.py:1204] (2/4) Exclude cut with ID _14970_210_15709_1_1530773543493_1715730_215-282680-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 19:41:22,663 WARNING [train.py:1204] (2/4) Exclude cut with ID _64639_210_5816_1_1535367619437_3528000_306-149449-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 19:41:26,203 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.out_combiner.scale_min, batch_count=1383560.0, ans=0.2 2023-10-03 19:41:27,214 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_28-291982-0 from training. Duration: 0.97 2023-10-03 19:41:28,573 WARNING [train.py:1204] (2/4) Exclude cut with ID _55134_210_21982_1_1534590028377_3882170_412-15870-0_sp1.1 from training. Duration: 0.936375 2023-10-03 19:41:29,730 WARNING [train.py:1204] (2/4) Exclude cut with ID _13684_210_7497_1_1530619354863_3598359_312-235606-0 from training. Duration: 0.6 2023-10-03 19:41:32,614 WARNING [train.py:1204] (2/4) Exclude cut with ID _37112_210_2575_1_1532997007673_7268610_347-238993-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 19:41:32,657 WARNING [train.py:1204] (2/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_121-284152-0 from training. Duration: 0.59 2023-10-03 19:41:32,718 WARNING [train.py:1204] (2/4) Exclude cut with ID _2941_210_2577_1_1526199673150_2489759_228-2961-0_sp1.1 from training. Duration: 0.97275 2023-10-03 19:41:35,408 WARNING [train.py:1204] (2/4) Exclude cut with ID _28323_210_16187_1_1532480395667_4100300_531-24157-0 from training. Duration: 0.98 2023-10-03 19:41:38,281 WARNING [train.py:1204] (2/4) Exclude cut with ID _32704_210_8954_1_1533017488840_6747370_266-329755-0_sp0.9 from training. Duration: 0.988875 2023-10-03 19:41:39,616 WARNING [train.py:1204] (2/4) Exclude cut with ID _6653_210_2629_1_1527919172441_6732679_1038-146421-0_sp1.1 from training. Duration: 0.97275 2023-10-03 19:41:41,378 WARNING [train.py:1204] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_391-22148-0_sp0.9 from training. Duration: 0.8555625 2023-10-03 19:41:42,747 WARNING [train.py:1204] (2/4) Exclude cut with ID _64521_210_14699_1_1535243427259_3815328_543-341117-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 19:41:42,757 WARNING [train.py:1204] (2/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_52-168712-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 19:41:44,058 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.650e+02 1.933e+02 2.130e+02 2.384e+02 3.625e+02, threshold=4.259e+02, percent-clipped=0.0 2023-10-03 19:41:44,138 WARNING [train.py:1204] (2/4) Exclude cut with ID _33920_210_4220_1_1533257851500_3084399_198-334063-0_sp1.1 from training. Duration: 0.8545625 2023-10-03 19:41:44,157 WARNING [train.py:1204] (2/4) Exclude cut with ID _50093_210_4220_1_1534417141207_3432299_69-111987-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 19:41:45,358 WARNING [train.py:1204] (2/4) Exclude cut with ID _14821_210_8750_1_1530680503991_6829359_189-297451-0_sp0.9 from training. Duration: 0.611125 2023-10-03 19:41:45,483 WARNING [train.py:1204] (2/4) Exclude cut with ID _8030_210_7497_1_1528444886781_3531089_262-86750-0_sp0.9 from training. Duration: 0.9 2023-10-03 19:41:47,178 WARNING [train.py:1204] (2/4) Exclude cut with ID _24612_210_8913_1_1531998054193_3806359_294-191196-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 19:41:53,287 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.3.encoder.layers.3.self_attn_weights.whiten_keys, num_groups=8, num_channels=256, metric=2.50 vs. limit=6.0 2023-10-03 19:41:53,992 WARNING [train.py:1204] (2/4) Exclude cut with ID _16909_210_5724_1_1531292079912_4118919_707-342896-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 19:41:54,009 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_9460_210_4211_1_1528954587_10049667_572-37035-0_sp0.9 from training. Duration: 0.888875 2023-10-03 19:41:54,644 WARNING [train.py:1204] (2/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_762-109886-0_sp1.1 from training. Duration: 0.82725 2023-10-03 19:41:54,666 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_14953_210_2151_1_1530701710_5616165_519-326393-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 19:42:00,069 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.0.layers.1.feed_forward3.out_whiten, num_groups=1, num_channels=192, metric=8.25 vs. limit=15.0 2023-10-03 19:42:00,556 WARNING [train.py:1204] (2/4) Exclude cut with ID _25914_210_6400_1_1532141317058_644740_52-96808-0 from training. Duration: 0.91 2023-10-03 19:42:00,560 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_68095_210_20185_1_1535718226_8888855_1079-94525-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 19:42:02,173 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.conv_module1.balancer2.prob, batch_count=1383693.3333333333, ans=0.125 2023-10-03 19:42:03,386 WARNING [train.py:1204] (2/4) Exclude cut with ID _14860_210_4915_1_1530702020340_7343240_746-3647-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 19:42:03,387 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_70379_210_7925_1_1535941455_557455_49-101720-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 19:42:04,665 WARNING [train.py:1204] (2/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_8-307291-0_sp1.1 from training. Duration: 0.936375 2023-10-03 19:42:06,116 WARNING [train.py:1204] (2/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_117-290916-0 from training. Duration: 0.51 2023-10-03 19:42:08,775 WARNING [train.py:1204] (2/4) Exclude cut with ID _22020_210_2461_1_1531724164513_6326900_944-339840-0_sp1.1 from training. Duration: 0.963625 2023-10-03 19:42:10,178 WARNING [train.py:1204] (2/4) Exclude cut with ID IC0515W0043-254911 from training. Duration: 0.976 2023-10-03 19:42:10,514 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.conv_module1.balancer2.prob, batch_count=1383760.0, ans=0.125 2023-10-03 19:42:10,515 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.conv_module1.balancer2.min_positive, batch_count=1383760.0, ans=0.05 2023-10-03 19:42:12,139 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_3910_210_1794_1_1526742306_334530_28-238909-0 from training. Duration: 0.81 2023-10-03 19:42:12,167 WARNING [train.py:1204] (2/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_269-267275-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 19:42:12,381 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.balancer2.prob, batch_count=1383760.0, ans=0.125 2023-10-03 19:42:14,863 WARNING [train.py:1204] (2/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_72-251255-0_sp1.1 from training. Duration: 0.8545625 2023-10-03 19:42:14,875 WARNING [train.py:1204] (2/4) Exclude cut with ID _14886_210_4220_1_1530702020355_3516190_175-229073-0 from training. Duration: 0.45 2023-10-03 19:42:16,288 WARNING [train.py:1204] (2/4) Exclude cut with ID _40519_210_13794_1_1533715228655_7337390_921-209452-0_sp1.1 from training. Duration: 0.963625 2023-10-03 19:42:19,619 WARNING [train.py:1204] (2/4) Exclude cut with ID _29857_210_20034_1_1532395886753_3798160_335-163562-0_sp0.9 from training. Duration: 0.9444375 2023-10-03 19:42:21,016 WARNING [train.py:1204] (2/4) Exclude cut with ID _58583_210_5236_1_1534726835266_3574249_384-58445-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 19:42:23,622 WARNING [train.py:1204] (2/4) Exclude cut with ID _44011_210_2046_1_1533722401691_3631310_96-315656-0_sp1.1 from training. Duration: 0.963625 2023-10-03 19:42:23,627 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_68484_210_1794_1_1535849804_7678097_549-170613-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 19:42:24,933 INFO [train.py:1046] (2/4) Epoch 40, batch 400, loss[loss=0.1518, simple_loss=0.2428, pruned_loss=0.03039, over 24460.00 frames. ], tot_loss[loss=0.1551, simple_loss=0.2349, pruned_loss=0.03762, over 4078137.37 frames. ], batch size: 69, lr: 2.56e-03, grad_scale: 16.0 2023-10-03 19:42:25,057 WARNING [train.py:1204] (2/4) Exclude cut with ID _37892_210_19596_1_1533198677897_3964058_214-148058-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 19:42:28,149 WARNING [train.py:1204] (2/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_415-78792-0_sp0.9 from training. Duration: 0.9 2023-10-03 19:42:28,441 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.bypass.scale_min, batch_count=1383826.6666666667, ans=0.2 2023-10-03 19:42:29,586 WARNING [train.py:1204] (2/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_383-205833-0_sp1.1 from training. Duration: 0.863625 2023-10-03 19:42:29,673 WARNING [train.py:1204] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_167-137255-0 from training. Duration: 0.64 2023-10-03 19:42:29,681 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8275_210_4198_1_1528455320_3892571_484-154786-0_sp1.1 from training. Duration: 0.963625 2023-10-03 19:42:31,528 WARNING [train.py:1204] (2/4) Exclude cut with ID _40284_210_2084_1_1533513679814_3704279_396-319412-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 19:42:32,919 WARNING [train.py:1204] (2/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_280-120942-0_sp0.9 from training. Duration: 0.8555625 2023-10-03 19:42:34,210 WARNING [train.py:1204] (2/4) Exclude cut with ID _45630_210_10556_1_1533949174498_3902049_558-240889-0_sp1.1 from training. Duration: 0.92725 2023-10-03 19:42:35,635 WARNING [train.py:1204] (2/4) Exclude cut with ID _56235_210_10640_1_1534557335235_4161099_197-307458-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 19:42:36,983 WARNING [train.py:1204] (2/4) Exclude cut with ID _16909_210_5724_1_1531292079912_4118919_163-254843-0_sp1.1 from training. Duration: 0.92725 2023-10-03 19:42:38,406 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_103-84010-0 from training. Duration: 0.69 2023-10-03 19:42:40,206 WARNING [train.py:1204] (2/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_401-158239-0 from training. Duration: 0.71 2023-10-03 19:42:40,207 WARNING [train.py:1204] (2/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_899-233658-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 19:42:40,375 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.conv_module1.balancer2.prob, batch_count=1383893.3333333333, ans=0.125 2023-10-03 19:42:41,581 WARNING [train.py:1204] (2/4) Exclude cut with ID _23825_210_13449_1_1531915313526_3497150_240-271367-0 from training. Duration: 0.97 2023-10-03 19:42:41,777 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.feed_forward3.hidden_balancer.prob, batch_count=1383893.3333333333, ans=0.125 2023-10-03 19:42:42,886 WARNING [train.py:1204] (2/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_424-134619-0_sp1.1 from training. Duration: 0.92725 2023-10-03 19:42:45,756 WARNING [train.py:1204] (2/4) Exclude cut with ID _44831_210_13427_1_1533816011549_3689035_32-188147-0_sp1.1 from training. Duration: 0.87275 2023-10-03 19:42:45,759 WARNING [train.py:1204] (2/4) Exclude cut with ID _59170_210_6721_1_1534983633960_4203335_314-63879-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 19:42:45,779 WARNING [train.py:1204] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_602-275260-0 from training. Duration: 0.82 2023-10-03 19:42:45,921 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.0.conv_module2.balancer2.prob, batch_count=1383893.3333333333, ans=0.125 2023-10-03 19:42:47,628 WARNING [train.py:1204] (2/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_511-306767-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 19:42:47,669 WARNING [train.py:1204] (2/4) Exclude cut with ID _41817_210_18835_1_1533434477160_7423049_565-201775-0_sp1.1 from training. Duration: 0.92725 2023-10-03 19:42:47,678 WARNING [train.py:1204] (2/4) Exclude cut with ID _28317_210_12742_1_1532480302381_8510325_778-173151-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 19:42:47,709 WARNING [train.py:1204] (2/4) Exclude cut with ID _14998_210_5219_1_1530705582080_7474970_680-51001-0_sp1.1 from training. Duration: 0.963625 2023-10-03 19:42:50,524 WARNING [train.py:1204] (2/4) Exclude cut with ID IC0677W0059-334891 from training. Duration: 0.896 2023-10-03 19:42:50,564 WARNING [train.py:1204] (2/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_370-331600-0 from training. Duration: 0.94 2023-10-03 19:42:56,297 WARNING [train.py:1204] (2/4) Exclude cut with ID _66840_210_5219_1_1535452168426_4196349_446-240192-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 19:42:57,661 WARNING [train.py:1204] (2/4) Exclude cut with ID _65242_210_18628_1_1535369393717_3823149_38-6732-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 19:42:57,720 WARNING [train.py:1204] (2/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_302-2550-0 from training. Duration: 0.81 2023-10-03 19:42:57,807 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_43802_210_2290_1_1533516203_8829504_100-348441-0 from training. Duration: 0.91 2023-10-03 19:43:00,575 WARNING [train.py:1204] (2/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_329-285801-0_sp1.1 from training. Duration: 0.82725 2023-10-03 19:43:05,352 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_30167_210_3528_1_1532428012_4772375_343-334712-0_sp1.1 from training. Duration: 0.936375 2023-10-03 19:43:11,118 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7543_210_6753_1_1528543318_4552_1-85072-0 from training. Duration: 0.83 2023-10-03 19:43:11,315 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.bypass.skip_rate, batch_count=1384026.6666666667, ans=0.07 2023-10-03 19:43:14,556 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7002_210_1794_1_1527935634_4034444_487-236501-0_sp1.1 from training. Duration: 0.6 2023-10-03 19:43:14,892 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.ff3_skip_rate, batch_count=1384026.6666666667, ans=0.0 2023-10-03 19:43:15,959 WARNING [train.py:1204] (2/4) Exclude cut with ID _7160_210_1876_1_1527940923231_3703799_282-322611-0 from training. Duration: 0.87 2023-10-03 19:43:17,385 WARNING [train.py:1204] (2/4) Exclude cut with ID _67335_210_5219_1_1535525975652_4275229_570-262983-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 19:43:17,503 WARNING [train.py:1204] (2/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_625-338546-0_sp1.1 from training. Duration: 0.8 2023-10-03 19:43:17,527 WARNING [train.py:1204] (2/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_176-56120-0 from training. Duration: 0.83 2023-10-03 19:43:21,543 WARNING [train.py:1204] (2/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_963-187109-0_sp1.1 from training. Duration: 0.9 2023-10-03 19:43:24,036 WARNING [train.py:1204] (2/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_121-284152-0_sp0.9 from training. Duration: 0.6555625 2023-10-03 19:43:25,311 WARNING [train.py:1204] (2/4) Exclude cut with ID _45021_210_9994_1_1533619751094_3330288_104-81711-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 19:43:26,810 WARNING [train.py:1204] (2/4) Exclude cut with ID _51382_210_23862_1_1534492801838_3650104_129-343914-0_sp1.1 from training. Duration: 0.936375 2023-10-03 19:43:26,839 WARNING [train.py:1204] (2/4) Exclude cut with ID _14970_210_15709_1_1530775547361_1966965_8-169924-0 from training. Duration: 0.92 2023-10-03 19:43:28,430 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_2295_210_2087_1_1525500993_2407863_234-83104-0_sp1.1 from training. Duration: 0.563625 2023-10-03 19:43:29,779 WARNING [train.py:1204] (2/4) Exclude cut with ID _14687_210_5854_1_1530703838259_3664889_115-239046-0 from training. Duration: 0.67 2023-10-03 19:43:34,219 WARNING [train.py:1204] (2/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_690-126306-0_sp1.1 from training. Duration: 0.7090625 2023-10-03 19:43:34,227 WARNING [train.py:1204] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_625-102955-0_sp0.9 from training. Duration: 0.8555625 2023-10-03 19:43:35,605 WARNING [train.py:1204] (2/4) Exclude cut with ID _9389_210_1664_1_1529217337301_5289269_289-213417-0 from training. Duration: 0.89 2023-10-03 19:43:38,416 WARNING [train.py:1204] (2/4) Exclude cut with ID _13253_210_1925_1_1530757775819_3577873_191-144977-0_sp0.9 from training. Duration: 0.8666875 2023-10-03 19:43:38,485 WARNING [train.py:1204] (2/4) Exclude cut with ID _21783_210_11357_1_1531742458730_3762679_325-155703-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 19:43:38,749 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.ff3_skip_rate, batch_count=1384160.0, ans=0.0 2023-10-03 19:43:39,820 INFO [train.py:1046] (2/4) Epoch 40, batch 450, loss[loss=0.1405, simple_loss=0.2259, pruned_loss=0.02753, over 16854.00 frames. ], tot_loss[loss=0.1555, simple_loss=0.2352, pruned_loss=0.03787, over 4198124.24 frames. ], batch size: 36, lr: 2.56e-03, grad_scale: 16.0 2023-10-03 19:43:39,874 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_2295_210_2087_1_1525500993_2407863_116-238894-0_sp0.9 from training. Duration: 0.77775 2023-10-03 19:43:39,982 WARNING [train.py:1204] (2/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_94-126128-0 from training. Duration: 0.86 2023-10-03 19:43:41,757 WARNING [train.py:1204] (2/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_395-310220-0_sp0.9 from training. Duration: 0.988875 2023-10-03 19:43:41,845 WARNING [train.py:1204] (2/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_821-110852-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 19:43:43,183 WARNING [train.py:1204] (2/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_64-248359-0_sp1.1 from training. Duration: 0.62725 2023-10-03 19:43:43,202 WARNING [train.py:1204] (2/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_138-187031-0 from training. Duration: 0.99 2023-10-03 19:43:43,236 WARNING [train.py:1204] (2/4) Exclude cut with ID _4122_210_2084_1_1526713164417_4353280_528-206771-0_sp0.9 from training. Duration: 0.97775 2023-10-03 19:43:44,593 WARNING [train.py:1204] (2/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_844-96989-0_sp1.1 from training. Duration: 0.6454375 2023-10-03 19:43:47,289 WARNING [train.py:1204] (2/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_106-30782-0_sp1.1 from training. Duration: 0.72725 2023-10-03 19:43:56,438 WARNING [train.py:1204] (2/4) Exclude cut with ID _14683_210_5219_1_1530698352608_3974859_281-250040-0_sp1.1 from training. Duration: 0.936375 2023-10-03 19:43:57,781 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_46013_210_7234_1_1533776362_3744032_463-52378-0_sp0.9 from training. Duration: 0.9555625 2023-10-03 19:43:59,256 WARNING [train.py:1204] (2/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_299-34413-0 from training. Duration: 0.65 2023-10-03 19:44:00,613 WARNING [train.py:1204] (2/4) Exclude cut with ID _35799_210_5854_1_1533263487887_3529071_490-326847-0 from training. Duration: 0.99 2023-10-03 19:44:02,182 WARNING [train.py:1204] (2/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_589-9070-0_sp0.9 from training. Duration: 0.788875 2023-10-03 19:44:05,345 WARNING [train.py:1204] (2/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_884-29515-0_sp1.1 from training. Duration: 0.936375 2023-10-03 19:44:06,748 WARNING [train.py:1204] (2/4) Exclude cut with ID _15012_210_1968_1_1530856820429_5095350_474-119995-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 19:44:08,203 WARNING [train.py:1204] (2/4) Exclude cut with ID _65585_210_12210_1_1535450451584_3764098_211-97124-0_sp1.1 from training. Duration: 0.963625 2023-10-03 19:44:10,096 WARNING [train.py:1204] (2/4) Exclude cut with ID _33640_210_2655_1_1533117605479_4855469_666-224463-0_sp1.1 from training. Duration: 0.963625 2023-10-03 19:44:12,664 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.551e+02 1.927e+02 2.083e+02 2.312e+02 3.254e+02, threshold=4.166e+02, percent-clipped=0.0 2023-10-03 19:44:12,800 WARNING [train.py:1204] (2/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_35-153886-0 from training. Duration: 0.84 2023-10-03 19:44:14,061 WARNING [train.py:1204] (2/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_210-106047-0 from training. Duration: 0.97 2023-10-03 19:44:15,451 WARNING [train.py:1204] (2/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_479-125611-0 from training. Duration: 0.96 2023-10-03 19:44:16,799 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_9417_210_1794_1_1529235261_4614131_427-153963-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 19:44:16,890 WARNING [train.py:1204] (2/4) Exclude cut with ID _23818_210_6228_1_1531917063884_3676920_133-170571-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 19:44:18,421 WARNING [train.py:1204] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_602-275260-0_sp1.1 from training. Duration: 0.7454375 2023-10-03 19:44:19,837 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9029W0121-509521 from training. Duration: 0.896 2023-10-03 19:44:19,852 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9029W0434-509821 from training. Duration: 0.64 2023-10-03 19:44:19,875 WARNING [train.py:1204] (2/4) Exclude cut with ID _23038_210_9570_1_1532001584158_3652419_289-307034-0_sp1.1 from training. Duration: 0.936375 2023-10-03 19:44:23,096 WARNING [train.py:1204] (2/4) Exclude cut with ID _29345_210_4381_1_1532689168263_3860169_149-257268-0_sp0.9 from training. Duration: 0.9 2023-10-03 19:44:24,945 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_405-339986-0_sp1.1 from training. Duration: 0.463625 2023-10-03 19:44:27,651 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_2655_210_2087_1_1525688959_3897400_437-337690-0_sp1.1 from training. Duration: 0.57275 2023-10-03 19:44:27,684 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7543_210_6753_1_1528541907_1399322_165-323619-0_sp1.1 from training. Duration: 0.62725 2023-10-03 19:44:29,053 WARNING [train.py:1204] (2/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_508-335369-0_sp1.1 from training. Duration: 0.436375 2023-10-03 19:44:29,111 WARNING [train.py:1204] (2/4) Exclude cut with ID _7852_210_5219_1_1528286471316_8139400_358-287492-0 from training. Duration: 0.96 2023-10-03 19:44:31,778 WARNING [train.py:1204] (2/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_322-217220-0_sp0.9 from training. Duration: 0.9555625 2023-10-03 19:44:34,522 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_3171_210_2087_1_1526109084_2330007_66-260583-0_sp0.9 from training. Duration: 0.8 2023-10-03 19:44:34,555 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8558_210_1410_1_1528505225_7613406_131-329524-0_sp1.1 from training. Duration: 0.7181875 2023-10-03 19:44:37,873 WARNING [train.py:1204] (2/4) Exclude cut with ID _9235_210_5219_1_1528891154253_8046120_30-326681-0 from training. Duration: 0.83 2023-10-03 19:44:39,384 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.conv_module1.balancer2.min_abs, batch_count=1384426.6666666667, ans=0.5 2023-10-03 19:44:42,474 WARNING [train.py:1204] (2/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_58-68956-0_sp0.9 from training. Duration: 0.92225 2023-10-03 19:44:43,768 WARNING [train.py:1204] (2/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_255-312654-0 from training. Duration: 0.91 2023-10-03 19:44:43,843 WARNING [train.py:1204] (2/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_99-167392-0 from training. Duration: 0.61 2023-10-03 19:44:45,362 WARNING [train.py:1204] (2/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_94-126128-0_sp0.9 from training. Duration: 0.9555625 2023-10-03 19:44:48,394 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=1384426.6666666667, ans=0.1 2023-10-03 19:44:49,559 WARNING [train.py:1204] (2/4) Exclude cut with ID _68635_210_9317_1_1535770813410_4315870_187-192817-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 19:44:50,735 WARNING [train.py:1204] (2/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_277-204970-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 19:44:50,849 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6163_210_6400_1_1527506746_4263068_283-137811-0_sp0.9 from training. Duration: 0.8555625 2023-10-03 19:44:50,870 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9144W0121-566446 from training. Duration: 0.896 2023-10-03 19:44:53,560 INFO [train.py:1046] (2/4) Epoch 40, batch 500, loss[loss=0.1685, simple_loss=0.2416, pruned_loss=0.04768, over 23512.00 frames. ], tot_loss[loss=0.1558, simple_loss=0.2357, pruned_loss=0.03796, over 4315805.29 frames. ], batch size: 256, lr: 2.56e-03, grad_scale: 16.0 2023-10-03 19:44:53,746 WARNING [train.py:1204] (2/4) Exclude cut with ID _49371_210_12071_1_1533952761021_3812190_351-315519-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 19:44:55,735 WARNING [train.py:1204] (2/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_115-326778-0_sp1.1 from training. Duration: 0.8454375 2023-10-03 19:44:55,787 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_17292_210_6753_1_1531125388_2943734_436-198965-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 19:44:57,012 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9165W0117-576751 from training. Duration: 0.896 2023-10-03 19:44:58,461 WARNING [train.py:1204] (2/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_212-149947-0 from training. Duration: 0.73 2023-10-03 19:44:58,470 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_13757_210_3528_1_1530440682_4107239_65-167544-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 19:45:01,204 WARNING [train.py:1204] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_700-183549-0_sp0.9 from training. Duration: 0.7555625 2023-10-03 19:45:03,987 WARNING [train.py:1204] (2/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_216-343007-0_sp0.9 from training. Duration: 0.7333125 2023-10-03 19:45:05,396 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7544_210_6753_1_1528587722_3576686_295-258479-0_sp1.1 from training. Duration: 0.763625 2023-10-03 19:45:05,651 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.self_attn_weights.pos_emb_skip_rate, batch_count=1384493.3333333333, ans=0.0 2023-10-03 19:45:06,927 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.conv_skip_rate, batch_count=1384560.0, ans=0.0 2023-10-03 19:45:08,547 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_365-287106-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 19:45:08,568 WARNING [train.py:1204] (2/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_300-3747-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 19:45:08,623 WARNING [train.py:1204] (2/4) Exclude cut with ID _14069_210_5632_1_1530512692033_4803390_487-303201-0_sp1.1 from training. Duration: 0.92725 2023-10-03 19:45:18,594 WARNING [train.py:1204] (2/4) Exclude cut with ID _14459_210_2228_1_1531443641946_7516700_668-123026-0_sp1.1 from training. Duration: 0.936375 2023-10-03 19:45:18,617 WARNING [train.py:1204] (2/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_108-53929-0_sp1.1 from training. Duration: 0.536375 2023-10-03 19:45:18,873 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.ff3_skip_rate, batch_count=1384560.0, ans=0.0 2023-10-03 19:45:19,976 WARNING [train.py:1204] (2/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_266-137104-0_sp0.9 from training. Duration: 0.72225 2023-10-03 19:45:20,018 WARNING [train.py:1204] (2/4) Exclude cut with ID _12302_210_5219_1_1529924446466_8107280_902-168619-0_sp1.1 from training. Duration: 0.936375 2023-10-03 19:45:20,043 WARNING [train.py:1204] (2/4) Exclude cut with ID _59502_210_12210_1_1535107024698_1835139_146-162495-0 from training. Duration: 0.92 2023-10-03 19:45:21,361 WARNING [train.py:1204] (2/4) Exclude cut with ID _3465_210_3525_1_1526175667060_3574049_412-287526-0_sp1.1 from training. Duration: 0.6545625 2023-10-03 19:45:24,906 WARNING [train.py:1204] (2/4) Exclude cut with ID _7160_210_1876_1_1527940923231_3703799_120-150943-0_sp0.9 from training. Duration: 0.9 2023-10-03 19:45:26,869 WARNING [train.py:1204] (2/4) Exclude cut with ID _15037_210_2253_1_1530752294104_3267650_296-192597-0_sp1.1 from training. Duration: 0.736375 2023-10-03 19:45:26,895 WARNING [train.py:1204] (2/4) Exclude cut with ID _13252_210_1925_1_1530671372047_3336909_397-216497-0_sp1.1 from training. Duration: 0.87275 2023-10-03 19:45:28,128 WARNING [train.py:1204] (2/4) Exclude cut with ID _40094_210_16380_1_1533294005773_3641159_76-269320-0_sp1.1 from training. Duration: 0.936375 2023-10-03 19:45:28,196 WARNING [train.py:1204] (2/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_449-15508-0 from training. Duration: 0.82 2023-10-03 19:45:28,464 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=1384626.6666666667, ans=0.1 2023-10-03 19:45:30,979 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9309W0194-640321 from training. Duration: 0.64 2023-10-03 19:45:33,734 WARNING [train.py:1204] (2/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_284-312178-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 19:45:35,148 WARNING [train.py:1204] (2/4) Exclude cut with ID _29853_210_7497_1_1532505600851_6642110_401-55937-0_sp1.1 from training. Duration: 0.92725 2023-10-03 19:45:36,533 WARNING [train.py:1204] (2/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_716-24918-0_sp1.1 from training. Duration: 0.92725 2023-10-03 19:45:36,552 WARNING [train.py:1204] (2/4) Exclude cut with ID _19637_210_4220_1_1531537167676_3205060_314-305638-0_sp1.1 from training. Duration: 0.92725 2023-10-03 19:45:36,580 WARNING [train.py:1204] (2/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_475-127455-0_sp1.1 from training. Duration: 0.636375 2023-10-03 19:45:37,999 WARNING [train.py:1204] (2/4) Exclude cut with ID _5444_210_2151_1_1527418808464_5312210_581-315411-0 from training. Duration: 0.62 2023-10-03 19:45:41,382 WARNING [train.py:1204] (2/4) Exclude cut with ID _44902_210_16187_1_1533888006062_3792078_149-67212-0_sp0.9 from training. Duration: 0.9666875 2023-10-03 19:45:41,605 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.balancer1.prob, batch_count=1384693.3333333333, ans=0.125 2023-10-03 19:45:42,661 WARNING [train.py:1204] (2/4) Exclude cut with ID _31602_210_5854_1_1532677021456_4458250_63-63253-0_sp1.1 from training. Duration: 0.963625 2023-10-03 19:45:44,260 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.feed_forward1.hidden_balancer.prob, batch_count=1384693.3333333333, ans=0.125 2023-10-03 19:45:45,464 WARNING [train.py:1204] (2/4) Exclude cut with ID _55209_210_1531_1_1534675828996_4597160_660-203992-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 19:45:46,926 WARNING [train.py:1204] (2/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_83-190624-0_sp1.1 from training. Duration: 0.92725 2023-10-03 19:45:51,271 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.feed_forward2.hidden_balancer.prob, batch_count=1384760.0, ans=0.125 2023-10-03 19:45:52,429 WARNING [train.py:1204] (2/4) Exclude cut with ID _58605_210_12210_1_1534723195939_3904259_154-139635-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 19:45:56,032 WARNING [train.py:1204] (2/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_164-224052-0 from training. Duration: 0.7 2023-10-03 19:45:56,044 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_1126-189071-0_sp1.1 from training. Duration: 0.963625 2023-10-03 19:45:56,068 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_15396_210_6890_1_1531308473_1938936_310-214789-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 19:46:00,006 WARNING [train.py:1204] (2/4) Exclude cut with ID _8395_210_1531_1_1528597871523_4411579_431-132470-0 from training. Duration: 0.74 2023-10-03 19:46:00,068 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_3780_210_2087_1_1526467760_2130483_170-259044-0_sp0.9 from training. Duration: 0.77775 2023-10-03 19:46:02,786 WARNING [train.py:1204] (2/4) Exclude cut with ID _12733_210_8341_1_1530167424556_3646380_537-230640-0_sp1.1 from training. Duration: 0.963625 2023-10-03 19:46:04,372 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.1.ff3_skip_rate, batch_count=1384760.0, ans=0.0 2023-10-03 19:46:06,875 INFO [train.py:1046] (2/4) Epoch 40, batch 550, loss[loss=0.1633, simple_loss=0.2324, pruned_loss=0.04707, over 23753.00 frames. ], tot_loss[loss=0.1575, simple_loss=0.2374, pruned_loss=0.03876, over 4406767.26 frames. ], batch size: 212, lr: 2.56e-03, grad_scale: 8.0 2023-10-03 19:46:06,976 WARNING [train.py:1204] (2/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_159-281310-0 from training. Duration: 0.95 2023-10-03 19:46:08,535 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.conv_module1.balancer2.min_positive, batch_count=1384826.6666666667, ans=0.05 2023-10-03 19:46:10,237 WARNING [train.py:1204] (2/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_434-83000-0 from training. Duration: 0.98 2023-10-03 19:46:10,261 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_4405_210_1849_1_1526732205_3593939_493-32464-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 19:46:10,270 WARNING [train.py:1204] (2/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_342-100157-0 from training. Duration: 0.54 2023-10-03 19:46:10,343 WARNING [train.py:1204] (2/4) Exclude cut with ID _44778_210_2054_1_1533798127203_7443090_765-250133-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 19:46:11,645 WARNING [train.py:1204] (2/4) Exclude cut with ID _38670_210_15379_1_1533448810659_4391059_81-312245-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 19:46:13,590 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_50050_210_16223_1_1535435796_7290138_1273-247611-0_sp1.1 from training. Duration: 0.936375 2023-10-03 19:46:14,920 WARNING [train.py:1204] (2/4) Exclude cut with ID _44831_210_13427_1_1533816011549_3689035_399-18591-0_sp1.1 from training. Duration: 0.936375 2023-10-03 19:46:14,943 WARNING [train.py:1204] (2/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_557-324258-0_sp0.9 from training. Duration: 0.9 2023-10-03 19:46:15,021 WARNING [train.py:1204] (2/4) Exclude cut with ID _3465_210_3525_1_1526175667060_3574049_62-335552-0_sp1.1 from training. Duration: 0.7909375 2023-10-03 19:46:17,706 WARNING [train.py:1204] (2/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_673-264125-0_sp1.1 from training. Duration: 0.963625 2023-10-03 19:46:19,060 WARNING [train.py:1204] (2/4) Exclude cut with ID _8030_210_7497_1_1528444886781_3531089_262-86750-0 from training. Duration: 0.81 2023-10-03 19:46:19,083 WARNING [train.py:1204] (2/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_444-28041-0_sp1.1 from training. Duration: 0.77275 2023-10-03 19:46:24,742 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_34008_210_10431_1_1533345424_8183247_516-14858-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 19:46:24,755 WARNING [train.py:1204] (2/4) Exclude cut with ID _32704_210_8954_1_1533017488840_6747370_54-248218-0_sp1.1 from training. Duration: 0.936375 2023-10-03 19:46:27,935 WARNING [train.py:1204] (2/4) Exclude cut with ID _13253_210_1925_1_1530757775819_3577873_300-119782-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 19:46:29,307 WARNING [train.py:1204] (2/4) Exclude cut with ID _39290_210_4220_1_1533862778760_3075118_21-240703-0_sp1.1 from training. Duration: 0.936375 2023-10-03 19:46:29,550 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=1384893.3333333333, ans=0.1 2023-10-03 19:46:33,807 WARNING [train.py:1204] (2/4) Exclude cut with ID 774-127930-0014-10412-0_sp1.1 from training. Duration: 0.95 2023-10-03 19:46:33,871 WARNING [train.py:1204] (2/4) Exclude cut with ID _67499_210_17282_1_1535509805220_3692430_20-179346-0 from training. Duration: 0.97 2023-10-03 19:46:35,184 WARNING [train.py:1204] (2/4) Exclude cut with ID _8365_210_2064_1_1528592311070_4677690_431-294102-0_sp0.9 from training. Duration: 0.988875 2023-10-03 19:46:40,535 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.581e+02 1.820e+02 1.987e+02 2.259e+02 2.913e+02, threshold=3.975e+02, percent-clipped=0.0 2023-10-03 19:46:42,079 WARNING [train.py:1204] (2/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_365-1492-0_sp1.1 from training. Duration: 0.82725 2023-10-03 19:46:42,106 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_105-114797-0_sp0.9 from training. Duration: 0.9333125 2023-10-03 19:46:44,077 WARNING [train.py:1204] (2/4) Exclude cut with ID _6274_210_2084_1_1527937583837_7494020_769-168302-0_sp0.9 from training. Duration: 0.788875 2023-10-03 19:46:45,701 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.0.feed_forward3.hidden_balancer.prob, batch_count=1384960.0, ans=0.125 2023-10-03 19:46:46,854 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_40340_210_9024_1_1533433916_4132944_320-270277-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 19:46:46,860 WARNING [train.py:1204] (2/4) Exclude cut with ID ID0203W0003-781711 from training. Duration: 0.958 2023-10-03 19:46:46,937 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_30434_210_13049_1_1532519384_981746_52-12932-0_sp1.1 from training. Duration: 0.936375 2023-10-03 19:46:48,893 WARNING [train.py:1204] (2/4) Exclude cut with ID _8263_210_1664_1_1528438953494_294100_40-204731-0_sp0.9 from training. Duration: 0.5555625 2023-10-03 19:46:51,661 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1032-236863-0_sp0.9 from training. Duration: 0.9333125 2023-10-03 19:46:52,960 WARNING [train.py:1204] (2/4) Exclude cut with ID _8394_210_6702_1_1528590504316_3540189_335-274780-0_sp0.9 from training. Duration: 0.8666875 2023-10-03 19:46:52,970 WARNING [train.py:1204] (2/4) Exclude cut with ID _66873_210_5517_1_1535454136506_4293370_654-52713-0_sp1.1 from training. Duration: 0.7 2023-10-03 19:46:53,054 WARNING [train.py:1204] (2/4) Exclude cut with ID _5250_210_5219_1_1527249561806_8692110_742-267856-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 19:46:54,486 WARNING [train.py:1204] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_74-48717-0 from training. Duration: 0.58 2023-10-03 19:46:54,572 WARNING [train.py:1204] (2/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_18-46756-0 from training. Duration: 0.79 2023-10-03 19:46:56,555 WARNING [train.py:1204] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_612-56061-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 19:46:56,571 WARNING [train.py:1204] (2/4) Exclude cut with ID _56423_210_5854_1_1534896061049_3165969_412-2403-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 19:46:57,779 WARNING [train.py:1204] (2/4) Exclude cut with ID _62574_210_2655_1_1535180407058_3933249_120-196848-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 19:46:57,780 WARNING [train.py:1204] (2/4) Exclude cut with ID _40519_210_13794_1_1533715228655_7337390_916-51000-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 19:47:00,495 WARNING [train.py:1204] (2/4) Exclude cut with ID _65263_210_22356_1_1535763598691_3921037_108-21902-0_sp1.1 from training. Duration: 0.9 2023-10-03 19:47:00,584 WARNING [train.py:1204] (2/4) Exclude cut with ID _4122_210_2084_1_1526713164417_4353280_528-206771-0_sp1.1 from training. Duration: 0.8 2023-10-03 19:47:01,081 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.nonlin_attention.whiten2.whitening_limit, batch_count=1385026.6666666667, ans=15.0 2023-10-03 19:47:03,697 WARNING [train.py:1204] (2/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_43-325657-0_sp1.1 from training. Duration: 0.92725 2023-10-03 19:47:05,072 WARNING [train.py:1204] (2/4) Exclude cut with ID _44659_210_2721_1_1534036120061_6614860_314-82041-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 19:47:06,405 WARNING [train.py:1204] (2/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_81-300212-0_sp0.9 from training. Duration: 0.6666875 2023-10-03 19:47:06,478 WARNING [train.py:1204] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_665-196155-0_sp0.9 from training. Duration: 0.7444375 2023-10-03 19:47:07,858 WARNING [train.py:1204] (2/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_140-336881-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 19:47:07,935 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6290_210_3273_1_1527595224_1282787_115-87095-0_sp0.9 from training. Duration: 0.611125 2023-10-03 19:47:08,215 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=1385093.3333333333, ans=0.1 2023-10-03 19:47:09,328 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_53373_210_5399_1_1534229471_4715695_607-259685-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 19:47:10,615 WARNING [train.py:1204] (2/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_175-163894-0_sp1.1 from training. Duration: 0.67275 2023-10-03 19:47:10,638 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7002_210_1794_1_1527939683_5157286_1-281264-0_sp1.1 from training. Duration: 0.4 2023-10-03 19:47:12,254 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.conv_module2.balancer1.prob, batch_count=1385093.3333333333, ans=0.125 2023-10-03 19:47:15,335 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.nonlin_attention.balancer.prob, batch_count=1385093.3333333333, ans=0.125 2023-10-03 19:47:18,295 WARNING [train.py:1204] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_605-257205-0 from training. Duration: 0.59 2023-10-03 19:47:19,800 WARNING [train.py:1204] (2/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_454-314901-0 from training. Duration: 0.46 2023-10-03 19:47:21,116 INFO [train.py:1046] (2/4) Epoch 40, batch 600, loss[loss=0.1358, simple_loss=0.2191, pruned_loss=0.02622, over 24607.00 frames. ], tot_loss[loss=0.1578, simple_loss=0.2377, pruned_loss=0.03899, over 4477017.83 frames. ], batch size: 60, lr: 2.56e-03, grad_scale: 8.0 2023-10-03 19:47:21,232 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_46032_210_14656_1_1533781412_4507969_287-130422-0_sp1.1 from training. Duration: 0.97275 2023-10-03 19:47:21,248 WARNING [train.py:1204] (2/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_186-309161-0_sp1.1 from training. Duration: 0.4909375 2023-10-03 19:47:21,296 WARNING [train.py:1204] (2/4) Exclude cut with ID _42659_210_2070_1_1533607442765_3120380_45-342307-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 19:47:21,404 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.0.conv_module2.balancer1.prob, batch_count=1385160.0, ans=0.125 2023-10-03 19:47:21,460 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.bypass.scale_min, batch_count=1385160.0, ans=0.2 2023-10-03 19:47:21,730 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.5.encoder.layers.1.self_attn_weights.whiten_keys, num_groups=4, num_channels=128, metric=1.84 vs. limit=6.0 2023-10-03 19:47:29,682 WARNING [train.py:1204] (2/4) Exclude cut with ID _12555_210_8750_1_1530075594402_3780239_478-326818-0_sp1.1 from training. Duration: 0.963625 2023-10-03 19:47:31,113 WARNING [train.py:1204] (2/4) Exclude cut with ID _7606_210_4915_1_1528894253277_5080690_127-177761-0_sp1.1 from training. Duration: 0.5818125 2023-10-03 19:47:32,534 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6788_210_1388_1_1527898698_4562021_91-197981-0 from training. Duration: 0.83 2023-10-03 19:47:32,714 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.conv_module1.balancer1.prob, batch_count=1385160.0, ans=0.125 2023-10-03 19:47:34,446 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8558_210_1410_1_1528505225_7613406_131-329524-0_sp0.9 from training. Duration: 0.87775 2023-10-03 19:47:37,153 WARNING [train.py:1204] (2/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_153-153537-0_sp1.1 from training. Duration: 0.82725 2023-10-03 19:47:38,593 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_17292_210_6753_1_1531125388_2943734_285-349105-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 19:47:40,033 WARNING [train.py:1204] (2/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_37-41919-0 from training. Duration: 0.87 2023-10-03 19:47:40,062 WARNING [train.py:1204] (2/4) Exclude cut with ID _60860_210_8254_1_1534896015875_3557169_172-294852-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 19:47:46,704 WARNING [train.py:1204] (2/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_146-276944-0 from training. Duration: 0.83 2023-10-03 19:47:49,360 WARNING [train.py:1204] (2/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_1254-44082-0_sp1.1 from training. Duration: 0.936375 2023-10-03 19:47:49,372 WARNING [train.py:1204] (2/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_1115-180350-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 19:47:49,393 WARNING [train.py:1204] (2/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_60-146189-0_sp1.1 from training. Duration: 0.8909375 2023-10-03 19:47:51,069 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.conv_module2.balancer1.prob, batch_count=1385293.3333333333, ans=0.125 2023-10-03 19:47:55,444 WARNING [train.py:1204] (2/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_517-153649-0_sp1.1 from training. Duration: 0.92725 2023-10-03 19:47:55,674 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.balancer2.prob, batch_count=1385293.3333333333, ans=0.125 2023-10-03 19:47:56,690 WARNING [train.py:1204] (2/4) Exclude cut with ID _39571_210_13449_1_1533801584143_3651077_37-266257-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 19:47:56,735 WARNING [train.py:1204] (2/4) Exclude cut with ID _6653_210_2629_1_1527919172441_6732679_527-276118-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 19:48:02,587 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7696_210_2040_1_1528207167_3716099_117-125434-0_sp1.1 from training. Duration: 0.6909375 2023-10-03 19:48:07,372 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_25767_210_2161_1_1532307483_3867748_478-156360-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 19:48:07,383 WARNING [train.py:1204] (2/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_177-49092-0_sp1.1 from training. Duration: 0.82725 2023-10-03 19:48:07,390 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_11305_210_1794_1_1529750428_7178148_680-317073-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 19:48:12,463 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.4.encoder.layers.1.feed_forward1.out_whiten, num_groups=1, num_channels=384, metric=8.84 vs. limit=15.0 2023-10-03 19:48:14,891 WARNING [train.py:1204] (2/4) Exclude cut with ID _53565_210_10635_1_1534337218335_3820259_287-18120-0 from training. Duration: 0.97 2023-10-03 19:48:20,795 WARNING [train.py:1204] (2/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_498-40153-0_sp1.1 from training. Duration: 0.57275 2023-10-03 19:48:20,815 WARNING [train.py:1204] (2/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_555-63066-0_sp1.1 from training. Duration: 0.963625 2023-10-03 19:48:22,908 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.1.encoder.layers.1.feed_forward3.out_whiten, num_groups=1, num_channels=256, metric=7.02 vs. limit=15.0 2023-10-03 19:48:23,513 WARNING [train.py:1204] (2/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_395-310220-0 from training. Duration: 0.89 2023-10-03 19:48:25,464 WARNING [train.py:1204] (2/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_937-208281-0_sp1.1 from training. Duration: 0.77275 2023-10-03 19:48:26,877 WARNING [train.py:1204] (2/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_120-211630-0 from training. Duration: 0.92 2023-10-03 19:48:26,903 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_454-276429-0_sp1.1 from training. Duration: 0.7909375 2023-10-03 19:48:27,067 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.conv_module1.balancer1.max_abs, batch_count=1385426.6666666667, ans=10.0 2023-10-03 19:48:28,246 WARNING [train.py:1204] (2/4) Exclude cut with ID _22826_210_9994_1_1531908201522_3279169_404-75889-0_sp1.1 from training. Duration: 0.8090625 2023-10-03 19:48:33,857 WARNING [train.py:1204] (2/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_282-136641-0_sp0.9 from training. Duration: 0.4666875 2023-10-03 19:48:35,763 INFO [train.py:1046] (2/4) Epoch 40, batch 650, loss[loss=0.1545, simple_loss=0.2455, pruned_loss=0.03171, over 24682.00 frames. ], tot_loss[loss=0.1571, simple_loss=0.2366, pruned_loss=0.03876, over 4522738.23 frames. ], batch size: 65, lr: 2.55e-03, grad_scale: 4.0 2023-10-03 19:48:35,835 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_809-17156-0_sp1.1 from training. Duration: 0.663625 2023-10-03 19:48:37,168 WARNING [train.py:1204] (2/4) Exclude cut with ID _9235_210_5219_1_1528891154253_8046120_1003-101638-0_sp0.9 from training. Duration: 0.811125 2023-10-03 19:48:38,510 WARNING [train.py:1204] (2/4) Exclude cut with ID _22826_210_9994_1_1531908201522_3279169_404-75889-0_sp0.9 from training. Duration: 0.988875 2023-10-03 19:48:39,909 WARNING [train.py:1204] (2/4) Exclude cut with ID _59288_210_5854_1_1535077757774_3412246_570-295255-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 19:48:42,260 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.4.encoder.layers.0.self_attn1.whiten, num_groups=1, num_channels=384, metric=15.12 vs. limit=22.5 2023-10-03 19:48:43,067 WARNING [train.py:1204] (2/4) Exclude cut with ID _3465_210_3525_1_1526175667060_3574049_412-287526-0 from training. Duration: 0.72 2023-10-03 19:48:44,919 WARNING [train.py:1204] (2/4) Exclude cut with ID _44010_210_4525_1_1533884262964_4013620_649-79655-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 19:48:50,712 WARNING [train.py:1204] (2/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_57-270037-0_sp0.9 from training. Duration: 0.8555625 2023-10-03 19:48:50,713 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_34815_210_6388_1_1533381320_3571903_302-255963-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 19:48:53,548 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_47449_210_5399_1_1534076445_4006929_573-301852-0_sp1.1 from training. Duration: 0.97275 2023-10-03 19:48:54,307 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder_embed.convnext.out_balancer.prob, batch_count=1385560.0, ans=0.125 2023-10-03 19:48:55,886 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.conv_module2.balancer2.min_abs, batch_count=1385560.0, ans=0.5 2023-10-03 19:48:56,872 WARNING [train.py:1204] (2/4) Exclude cut with ID 6709-74022-0004-86860-0_sp1.1 from training. Duration: 0.9409375 2023-10-03 19:48:58,276 WARNING [train.py:1204] (2/4) Exclude cut with ID _40519_210_13794_1_1533715228655_7337390_111-71991-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 19:48:59,495 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_30167_210_3528_1_1532428012_4772375_169-214186-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 19:49:02,243 WARNING [train.py:1204] (2/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_20-235313-0_sp1.1 from training. Duration: 0.963625 2023-10-03 19:49:02,292 WARNING [train.py:1204] (2/4) Exclude cut with ID _4681_210_3525_1_1527251040321_1235713_123-24428-0_sp0.9 from training. Duration: 0.5333125 2023-10-03 19:49:02,418 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.self_attn_weights.pos_emb_skip_rate, batch_count=1385560.0, ans=0.0 2023-10-03 19:49:05,079 WARNING [train.py:1204] (2/4) Exclude cut with ID _31602_210_5854_1_1532677021456_4458250_444-198752-0_sp1.1 from training. Duration: 0.97275 2023-10-03 19:49:06,377 WARNING [train.py:1204] (2/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_9-279442-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 19:49:06,428 WARNING [train.py:1204] (2/4) Exclude cut with ID _6275_210_2084_1_1528110062213_7669049_973-198085-0_sp0.9 from training. Duration: 0.8333125 2023-10-03 19:49:06,508 WARNING [train.py:1204] (2/4) Exclude cut with ID _29853_210_7497_1_1532505600851_6642110_567-250221-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 19:49:08,353 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6545_210_2040_1_1527774670_4245258_407-48070-0_sp1.1 from training. Duration: 0.6909375 2023-10-03 19:49:09,843 WARNING [train.py:1204] (2/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_224-343295-0_sp0.9 from training. Duration: 0.611125 2023-10-03 19:49:11,449 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.617e+02 1.985e+02 2.192e+02 2.479e+02 3.880e+02, threshold=4.384e+02, percent-clipped=0.0 2023-10-03 19:49:11,531 WARNING [train.py:1204] (2/4) Exclude cut with ID IC0125W0011-61817 from training. Duration: 0.88 2023-10-03 19:49:11,546 WARNING [train.py:1204] (2/4) Exclude cut with ID _60113_210_16271_1_1535164212481_4386079_435-154371-0_sp1.1 from training. Duration: 0.97275 2023-10-03 19:49:11,565 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_25836_210_16554_1_1532138167_53197_3-9394-0_sp1.1 from training. Duration: 0.92725 2023-10-03 19:49:14,723 WARNING [train.py:1204] (2/4) Exclude cut with ID _29343_210_4381_1_1532602757752_3674109_57-55125-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 19:49:14,817 WARNING [train.py:1204] (2/4) Exclude cut with ID _56235_210_10640_1_1534557335235_4161099_267-23560-0_sp1.1 from training. Duration: 0.936375 2023-10-03 19:49:16,147 WARNING [train.py:1204] (2/4) Exclude cut with ID _30196_210_1664_1_1533708496522_7074140_1150-278867-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 19:49:16,209 WARNING [train.py:1204] (2/4) Exclude cut with ID _6270_210_2084_1_1527850911573_3856420_26-277343-0_sp0.9 from training. Duration: 0.911125 2023-10-03 19:49:17,627 WARNING [train.py:1204] (2/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_325-62802-0 from training. Duration: 0.87 2023-10-03 19:49:19,008 WARNING [train.py:1204] (2/4) Exclude cut with ID _56524_210_9642_1_1534572011779_3619170_145-310842-0_sp1.1 from training. Duration: 0.9 2023-10-03 19:49:20,358 WARNING [train.py:1204] (2/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_860-96198-0_sp0.9 from training. Duration: 0.811125 2023-10-03 19:49:20,450 WARNING [train.py:1204] (2/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_17-25045-0_sp1.1 from training. Duration: 0.536375 2023-10-03 19:49:21,732 WARNING [train.py:1204] (2/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_349-69727-0_sp1.1 from training. Duration: 0.936375 2023-10-03 19:49:23,062 WARNING [train.py:1204] (2/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_580-93798-0_sp1.1 from training. Duration: 0.5090625 2023-10-03 19:49:24,420 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7762_210_1794_1_1528199508_4057408_419-125009-0 from training. Duration: 0.83 2023-10-03 19:49:26,181 WARNING [train.py:1204] (2/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_356-167816-0 from training. Duration: 0.72 2023-10-03 19:49:26,204 WARNING [train.py:1204] (2/4) Exclude cut with ID _29345_210_4381_1_1532689168263_3860169_225-97100-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 19:49:26,208 WARNING [train.py:1204] (2/4) Exclude cut with ID _14227_210_4381_1_1530701804286_3940259_374-194712-0_sp1.1 from training. Duration: 0.92725 2023-10-03 19:49:26,241 WARNING [train.py:1204] (2/4) Exclude cut with ID _40137_210_12210_1_1533290484046_3795180_249-187059-0_sp0.9 from training. Duration: 0.97775 2023-10-03 19:49:26,275 WARNING [train.py:1204] (2/4) Exclude cut with ID _22826_210_9994_1_1531908201522_3279169_389-317194-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 19:49:28,982 WARNING [train.py:1204] (2/4) Exclude cut with ID _27485_210_4916_1_1532739191677_3668269_239-122918-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 19:49:33,143 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_2245_210_2715_1_1525511548_3540254_128-117609-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 19:49:33,187 WARNING [train.py:1204] (2/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_160-276178-0_sp1.1 from training. Duration: 0.963625 2023-10-03 19:49:33,389 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.bypass.skip_rate, batch_count=1385760.0, ans=0.07 2023-10-03 19:49:34,923 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_31920_210_10633_1_1532860009_1686094_1-279735-0_sp1.1 from training. Duration: 0.97275 2023-10-03 19:49:37,610 WARNING [train.py:1204] (2/4) Exclude cut with ID _51078_210_9070_1_1534161557424_3805250_325-262383-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 19:49:37,636 WARNING [train.py:1204] (2/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_643-52007-0_sp0.9 from training. Duration: 0.6333125 2023-10-03 19:49:38,951 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_51937_210_5014_1_1534573610_4022025_518-299248-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 19:49:42,862 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.5.encoder.layers.0.whiten, num_groups=1, num_channels=256, metric=2.28 vs. limit=12.0 2023-10-03 19:49:46,991 WARNING [train.py:1204] (2/4) Exclude cut with ID _13252_210_1925_1_1530671372047_3336909_166-314607-0_sp1.1 from training. Duration: 0.4909375 2023-10-03 19:49:47,002 WARNING [train.py:1204] (2/4) Exclude cut with ID _44778_210_2054_1_1533798127203_7443090_1087-282939-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 19:49:47,035 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_23850_210_4238_1_1532232597_1632433_32-185135-0_sp1.1 from training. Duration: 0.7818125 2023-10-03 19:49:47,168 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.conv_skip_rate, batch_count=1385760.0, ans=0.0 2023-10-03 19:49:48,442 WARNING [train.py:1204] (2/4) Exclude cut with ID _59500_210_8254_1_1534809610680_3736009_145-89359-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 19:49:49,721 INFO [train.py:1046] (2/4) Epoch 40, batch 700, loss[loss=0.1528, simple_loss=0.2248, pruned_loss=0.04037, over 23755.00 frames. ], tot_loss[loss=0.1559, simple_loss=0.2353, pruned_loss=0.03821, over 4566048.58 frames. ], batch size: 232, lr: 2.55e-03, grad_scale: 8.0 2023-10-03 19:49:52,540 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_340-336408-0 from training. Duration: 0.93 2023-10-03 19:49:52,608 WARNING [train.py:1204] (2/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_9-21737-0 from training. Duration: 0.97 2023-10-03 19:49:56,083 WARNING [train.py:1204] (2/4) Exclude cut with ID _2220_210_1819_1_1525509109898_3658680_50-99837-0 from training. Duration: 0.54 2023-10-03 19:49:56,139 WARNING [train.py:1204] (2/4) Exclude cut with ID _30304_210_6319_1_1532476827261_7582060_879-299350-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 19:49:57,490 WARNING [train.py:1204] (2/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_905-160074-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 19:49:58,923 WARNING [train.py:1204] (2/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_395-73570-0 from training. Duration: 0.99 2023-10-03 19:50:02,784 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7264_210_4938_1_1527939911_8132095_800-91995-0_sp1.1 from training. Duration: 0.7818125 2023-10-03 19:50:04,717 WARNING [train.py:1204] (2/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_77-311404-0_sp1.1 from training. Duration: 0.8545625 2023-10-03 19:50:07,337 WARNING [train.py:1204] (2/4) Exclude cut with ID _13679_210_7497_1_1530534293662_3355098_107-80772-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 19:50:08,787 WARNING [train.py:1204] (2/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_224-343295-0_sp1.1 from training. Duration: 0.5 2023-10-03 19:50:08,817 WARNING [train.py:1204] (2/4) Exclude cut with ID _54871_210_22356_1_1534665798253_3856359_156-253482-0_sp1.1 from training. Duration: 0.963625 2023-10-03 19:50:10,380 WARNING [train.py:1204] (2/4) Exclude cut with ID _49728_210_12651_1_1534041034294_6788150_309-125532-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 19:50:13,117 WARNING [train.py:1204] (2/4) Exclude cut with ID _13913_210_9994_1_1530579615187_3383897_473-277682-0_sp0.9 from training. Duration: 0.4333125 2023-10-03 19:50:13,129 WARNING [train.py:1204] (2/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_3-276701-0_sp1.1 from training. Duration: 0.82725 2023-10-03 19:50:16,317 WARNING [train.py:1204] (2/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_282-136641-0 from training. Duration: 0.42 2023-10-03 19:50:19,063 WARNING [train.py:1204] (2/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_127-566-0 from training. Duration: 0.84 2023-10-03 19:50:21,936 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6163_210_6400_1_1527506746_4263068_283-137811-0_sp1.1 from training. Duration: 0.7 2023-10-03 19:50:21,965 WARNING [train.py:1204] (2/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_34-183685-0_sp1.1 from training. Duration: 0.8909375 2023-10-03 19:50:23,839 WARNING [train.py:1204] (2/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_154-135696-0_sp0.9 from training. Duration: 0.888875 2023-10-03 19:50:28,063 WARNING [train.py:1204] (2/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_676-51014-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 19:50:29,370 WARNING [train.py:1204] (2/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_38-169920-0 from training. Duration: 0.91 2023-10-03 19:50:33,495 WARNING [train.py:1204] (2/4) Exclude cut with ID _6653_210_2629_1_1527919172441_6732679_747-114465-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 19:50:35,398 WARNING [train.py:1204] (2/4) Exclude cut with ID _8021_210_7994_1_1528376451358_3653069_100-140231-0_sp1.1 from training. Duration: 0.6090625 2023-10-03 19:50:35,430 WARNING [train.py:1204] (2/4) Exclude cut with ID _64135_210_2253_1_1535677267876_7608972_133-188266-0 from training. Duration: 0.81 2023-10-03 19:50:35,678 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.conv_skip_rate, batch_count=1386026.6666666667, ans=0.0 2023-10-03 19:50:39,587 WARNING [train.py:1204] (2/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_714-310538-0_sp1.1 from training. Duration: 0.7818125 2023-10-03 19:50:39,680 WARNING [train.py:1204] (2/4) Exclude cut with ID _39867_210_9826_1_1533553222030_9344259_1134-288498-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 19:50:42,411 WARNING [train.py:1204] (2/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_211-237289-0_sp1.1 from training. Duration: 0.936375 2023-10-03 19:50:49,076 WARNING [train.py:1204] (2/4) Exclude cut with ID _59202_210_26984_1_1534816810612_3549174_137-318710-0_sp0.9 from training. Duration: 0.988875 2023-10-03 19:50:50,364 WARNING [train.py:1204] (2/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_86-66192-0 from training. Duration: 0.55 2023-10-03 19:50:53,179 WARNING [train.py:1204] (2/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_540-275685-0 from training. Duration: 0.89 2023-10-03 19:50:53,221 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8776_210_2040_1_1528638837_4088036_244-278763-0 from training. Duration: 0.9 2023-10-03 19:50:56,788 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.balancer1.prob, batch_count=1386093.3333333333, ans=0.125 2023-10-03 19:50:57,851 WARNING [train.py:1204] (2/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_9-219972-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 19:50:59,331 WARNING [train.py:1204] (2/4) Exclude cut with ID _44659_210_2721_1_1534036120061_6614860_656-283823-0_sp1.1 from training. Duration: 0.92725 2023-10-03 19:50:59,402 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_69630_210_7234_1_1535789601_661049_77-216385-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 19:51:00,813 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_19952_210_2161_1_1531875469_3851750_210-308919-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 19:51:00,826 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_4737_210_2040_1_1526998689_3293008_378-178650-0 from training. Duration: 0.95 2023-10-03 19:51:03,468 INFO [train.py:1046] (2/4) Epoch 40, batch 750, loss[loss=0.169, simple_loss=0.2532, pruned_loss=0.04244, over 23393.00 frames. ], tot_loss[loss=0.1555, simple_loss=0.2349, pruned_loss=0.03802, over 4605956.05 frames. ], batch size: 93, lr: 2.55e-03, grad_scale: 8.0 2023-10-03 19:51:05,004 WARNING [train.py:1204] (2/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_224-343295-0 from training. Duration: 0.55 2023-10-03 19:51:05,027 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7002_210_1794_1_1527939683_5157286_184-229630-0 from training. Duration: 0.74 2023-10-03 19:51:06,283 WARNING [train.py:1204] (2/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_6-214815-0 from training. Duration: 0.85 2023-10-03 19:51:06,373 WARNING [train.py:1204] (2/4) Exclude cut with ID _8699_210_2069_1_1528630346831_3960409_160-24408-0 from training. Duration: 0.91 2023-10-03 19:51:06,404 WARNING [train.py:1204] (2/4) Exclude cut with ID _5585_210_2667_1_1527418813324_8007340_89-332072-0 from training. Duration: 0.64 2023-10-03 19:51:08,381 WARNING [train.py:1204] (2/4) Exclude cut with ID _5250_210_5219_1_1527249561806_8692110_528-259800-0_sp1.1 from training. Duration: 0.963625 2023-10-03 19:51:08,455 WARNING [train.py:1204] (2/4) Exclude cut with ID _55383_210_10635_1_1534468426172_2231669_134-128246-0 from training. Duration: 0.89 2023-10-03 19:51:09,837 WARNING [train.py:1204] (2/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_400-338274-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 19:51:11,149 WARNING [train.py:1204] (2/4) Exclude cut with ID _14224_210_4381_1_1530529062037_4264010_364-22817-0_sp0.9 from training. Duration: 0.788875 2023-10-03 19:51:12,593 WARNING [train.py:1204] (2/4) Exclude cut with ID _42863_210_9070_1_1533902398788_3696920_331-291057-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 19:51:14,069 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_5436_210_1794_1_1527331317_2541494_229-211016-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 19:51:15,812 WARNING [train.py:1204] (2/4) Exclude cut with ID _6456_210_1534_1_1527681634212_3181049_345-252877-0_sp0.9 from training. Duration: 0.711125 2023-10-03 19:51:15,845 WARNING [train.py:1204] (2/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_96-81499-0_sp1.1 from training. Duration: 0.92725 2023-10-03 19:51:18,557 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_5575_210_1410_1_1527301305_2570170_342-265572-0_sp1.1 from training. Duration: 0.8545625 2023-10-03 19:51:18,625 WARNING [train.py:1204] (2/4) Exclude cut with ID _5444_210_2151_1_1527418808464_5312210_726-27165-0_sp0.9 from training. Duration: 0.7444375 2023-10-03 19:51:20,477 WARNING [train.py:1204] (2/4) Exclude cut with ID _22083_210_8254_1_1531702862439_3678970_304-327750-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 19:51:23,169 WARNING [train.py:1204] (2/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_245-226199-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 19:51:24,270 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.5.encoder.layers.1.nonlin_attention.whiten1, num_groups=1, num_channels=192, metric=2.85 vs. limit=10.0 2023-10-03 19:51:24,980 WARNING [train.py:1204] (2/4) Exclude cut with ID _42875_210_14856_1_1533704397487_4012433_102-199499-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 19:51:25,036 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_51225_210_3601_1_1534400634_2327383_55-301844-0 from training. Duration: 0.93 2023-10-03 19:51:25,641 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.3.encoder.layers.2.self_attn2.whiten, num_groups=1, num_channels=512, metric=11.03 vs. limit=22.5 2023-10-03 19:51:26,372 WARNING [train.py:1204] (2/4) Exclude cut with ID _28317_210_12742_1_1532480302381_8510325_916-195848-0_sp1.1 from training. Duration: 0.836375 2023-10-03 19:51:26,434 WARNING [train.py:1204] (2/4) Exclude cut with ID _44718_210_5816_1_1534050051869_6469030_497-311999-0_sp1.1 from training. Duration: 0.936375 2023-10-03 19:51:29,231 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_34886_210_10633_1_1533203882_1755661_91-194765-0_sp1.1 from training. Duration: 0.936375 2023-10-03 19:51:29,351 WARNING [train.py:1204] (2/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_310-26186-0_sp0.9 from training. Duration: 0.77775 2023-10-03 19:51:30,718 WARNING [train.py:1204] (2/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_416-257730-0 from training. Duration: 0.65 2023-10-03 19:51:32,033 WARNING [train.py:1204] (2/4) Exclude cut with ID _9389_210_1664_1_1529217337301_5289269_209-154760-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 19:51:33,543 WARNING [train.py:1204] (2/4) Exclude cut with ID _4092_210_2151_1_1526643044449_3135759_227-213972-0 from training. Duration: 0.54 2023-10-03 19:51:34,852 WARNING [train.py:1204] (2/4) Exclude cut with ID IC0679W0044-335852 from training. Duration: 0.768 2023-10-03 19:51:34,909 WARNING [train.py:1204] (2/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_42-186162-0 from training. Duration: 0.92 2023-10-03 19:51:34,915 WARNING [train.py:1204] (2/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_302-2550-0_sp0.9 from training. Duration: 0.9 2023-10-03 19:51:34,937 WARNING [train.py:1204] (2/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_356-31216-0_sp0.9 from training. Duration: 0.8333125 2023-10-03 19:51:37,574 WARNING [train.py:1204] (2/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_125-322172-0_sp1.1 from training. Duration: 0.8454375 2023-10-03 19:51:37,789 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.nonlin_attention.balancer.prob, batch_count=1386293.3333333333, ans=0.125 2023-10-03 19:51:39,314 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.577e+02 1.912e+02 2.088e+02 2.370e+02 3.919e+02, threshold=4.175e+02, percent-clipped=0.0 2023-10-03 19:51:43,641 WARNING [train.py:1204] (2/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_141-89727-0_sp0.9 from training. Duration: 0.788875 2023-10-03 19:51:44,931 WARNING [train.py:1204] (2/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_647-337784-0_sp1.1 from training. Duration: 0.97275 2023-10-03 19:51:44,947 WARNING [train.py:1204] (2/4) Exclude cut with ID _6493_210_1747_1_1528459210424_4012470_513-140967-0_sp1.1 from training. Duration: 0.7181875 2023-10-03 19:51:46,941 WARNING [train.py:1204] (2/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_87-266334-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 19:51:48,394 WARNING [train.py:1204] (2/4) Exclude cut with ID _26528_210_9070_1_1532570410842_3851310_526-261338-0_sp1.1 from training. Duration: 0.92725 2023-10-03 19:51:49,768 WARNING [train.py:1204] (2/4) Exclude cut with ID _14224_210_4381_1_1530529062037_4264010_380-343670-0 from training. Duration: 0.88 2023-10-03 19:51:49,817 WARNING [train.py:1204] (2/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_449-15508-0_sp1.1 from training. Duration: 0.7454375 2023-10-03 19:51:51,178 WARNING [train.py:1204] (2/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_372-6869-0_sp0.9 from training. Duration: 0.488875 2023-10-03 19:51:53,087 WARNING [train.py:1204] (2/4) Exclude cut with ID _26907_210_7234_1_1532221740694_3205263_337-2494-0_sp0.9 from training. Duration: 0.97775 2023-10-03 19:51:53,424 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.conv_module2.balancer2.min_positive, batch_count=1386360.0, ans=0.05 2023-10-03 19:51:54,649 WARNING [train.py:1204] (2/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_10-100367-0_sp1.1 from training. Duration: 0.8909375 2023-10-03 19:51:54,685 WARNING [train.py:1204] (2/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_47-167373-0 from training. Duration: 0.92 2023-10-03 19:51:56,633 WARNING [train.py:1204] (2/4) Exclude cut with ID _28323_210_16187_1_1532480395667_4100300_628-282179-0_sp1.1 from training. Duration: 0.97275 2023-10-03 19:52:01,904 WARNING [train.py:1204] (2/4) Exclude cut with ID _20455_210_5219_1_1531565996775_8098100_213-280898-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 19:52:04,538 WARNING [train.py:1204] (2/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_383-316000-0_sp1.1 from training. Duration: 0.8181875 2023-10-03 19:52:04,578 WARNING [train.py:1204] (2/4) Exclude cut with ID _18513_210_9206_1_1531618215223_3862620_16-78180-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 19:52:04,840 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.bypass.skip_rate, batch_count=1386426.6666666667, ans=0.07 2023-10-03 19:52:06,029 WARNING [train.py:1204] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_330-327861-0_sp1.1 from training. Duration: 0.6090625 2023-10-03 19:52:08,853 WARNING [train.py:1204] (2/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_356-31216-0 from training. Duration: 0.75 2023-10-03 19:52:10,249 WARNING [train.py:1204] (2/4) Exclude cut with ID _25248_210_8750_1_1532062825941_5554180_255-148539-0_sp1.1 from training. Duration: 0.963625 2023-10-03 19:52:10,306 WARNING [train.py:1204] (2/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_269-154726-0_sp1.1 from training. Duration: 0.7818125 2023-10-03 19:52:14,827 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7002_210_1794_1_1527939683_5157286_163-87552-0_sp1.1 from training. Duration: 0.7818125 2023-10-03 19:52:16,078 WARNING [train.py:1204] (2/4) Exclude cut with ID _54871_210_22356_1_1534665798253_3856359_97-151896-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 19:52:18,012 INFO [train.py:1046] (2/4) Epoch 40, batch 800, loss[loss=0.1482, simple_loss=0.2405, pruned_loss=0.02797, over 24668.00 frames. ], tot_loss[loss=0.1558, simple_loss=0.2352, pruned_loss=0.03821, over 4617535.87 frames. ], batch size: 73, lr: 2.55e-03, grad_scale: 8.0 2023-10-03 19:52:18,361 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.conv_module2.balancer2.prob, batch_count=1386493.3333333333, ans=0.125 2023-10-03 19:52:19,487 WARNING [train.py:1204] (2/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_207-7026-0_sp1.1 from training. Duration: 0.97275 2023-10-03 19:52:19,506 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_92-46009-0_sp0.9 from training. Duration: 0.6 2023-10-03 19:52:19,778 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.bypass.scale_min, batch_count=1386493.3333333333, ans=0.2 2023-10-03 19:52:28,347 WARNING [train.py:1204] (2/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_122-178847-0_sp1.1 from training. Duration: 0.97275 2023-10-03 19:52:28,352 WARNING [train.py:1204] (2/4) Exclude cut with ID _44778_210_2054_1_1533798127203_7443090_94-348956-0_sp1.1 from training. Duration: 0.92725 2023-10-03 19:52:30,494 WARNING [train.py:1204] (2/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_1018-244089-0_sp1.1 from training. Duration: 0.963625 2023-10-03 19:52:30,506 WARNING [train.py:1204] (2/4) Exclude cut with ID _56235_210_10640_1_1534557335235_4161099_87-118423-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 19:52:30,677 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.bypass.skip_rate, batch_count=1386493.3333333333, ans=0.04949747468305833 2023-10-03 19:52:31,706 WARNING [train.py:1204] (2/4) Exclude cut with ID _53713_210_24116_1_1534314487762_7416751_504-212292-0_sp1.1 from training. Duration: 0.92725 2023-10-03 19:52:31,733 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_446-150327-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 19:52:33,115 WARNING [train.py:1204] (2/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_682-245989-0_sp1.1 from training. Duration: 0.92725 2023-10-03 19:52:35,906 WARNING [train.py:1204] (2/4) Exclude cut with ID _35723_210_1794_1_1533088341043_628250_8-234524-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 19:52:37,231 WARNING [train.py:1204] (2/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_64-248359-0_sp0.9 from training. Duration: 0.7666875 2023-10-03 19:52:39,954 WARNING [train.py:1204] (2/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_49-97158-0 from training. Duration: 0.76 2023-10-03 19:52:40,012 WARNING [train.py:1204] (2/4) Exclude cut with ID _24314_210_4915_1_1532135078116_7170179_743-915-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 19:52:41,302 WARNING [train.py:1204] (2/4) Exclude cut with ID _61194_210_18628_1_1535007555628_3750909_353-323468-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 19:52:41,323 WARNING [train.py:1204] (2/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_387-131538-0_sp1.1 from training. Duration: 0.736375 2023-10-03 19:52:41,357 WARNING [train.py:1204] (2/4) Exclude cut with ID _44424_210_18163_1_1533623394848_1213110_79-187603-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 19:52:41,386 WARNING [train.py:1204] (2/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_86-19601-0 from training. Duration: 0.85 2023-10-03 19:52:42,645 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_50050_210_16223_1_1535435796_7290138_103-283529-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 19:52:42,675 WARNING [train.py:1204] (2/4) Exclude cut with ID _13913_210_9994_1_1530579615187_3383897_473-277682-0 from training. Duration: 0.39 2023-10-03 19:52:45,951 WARNING [train.py:1204] (2/4) Exclude cut with ID _42326_210_7770_1_1533814282531_3707269_368-68029-0_sp1.1 from training. Duration: 0.92725 2023-10-03 19:52:47,386 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_41428_210_2161_1_1533774184_3992532_446-339645-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 19:52:49,385 WARNING [train.py:1204] (2/4) Exclude cut with ID _69584_210_5219_1_1535785133775_4044450_325-7656-0_sp1.1 from training. Duration: 0.7818125 2023-10-03 19:52:49,393 WARNING [train.py:1204] (2/4) Exclude cut with ID _21632_210_2253_1_1531994001765_3744140_134-184387-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 19:52:52,155 WARNING [train.py:1204] (2/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_550-184707-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 19:52:52,172 WARNING [train.py:1204] (2/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_106-341780-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 19:53:00,695 WARNING [train.py:1204] (2/4) Exclude cut with ID _45871_210_5665_1_1533864614245_3296060_253-132552-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 19:53:00,734 WARNING [train.py:1204] (2/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_395-310220-0_sp1.1 from training. Duration: 0.8090625 2023-10-03 19:53:00,754 WARNING [train.py:1204] (2/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_795-33252-0_sp0.9 from training. Duration: 0.411125 2023-10-03 19:53:04,141 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9002W0003-495962 from training. Duration: 0.946 2023-10-03 19:53:04,168 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_43-71989-0 from training. Duration: 0.57 2023-10-03 19:53:04,187 WARNING [train.py:1204] (2/4) Exclude cut with ID _8699_210_2069_1_1528630346831_3960409_155-39766-0_sp1.1 from training. Duration: 0.7090625 2023-10-03 19:53:04,206 WARNING [train.py:1204] (2/4) Exclude cut with ID _27894_210_12392_1_1532415496078_6902439_788-232684-0_sp1.1 from training. Duration: 0.97275 2023-10-03 19:53:06,870 WARNING [train.py:1204] (2/4) Exclude cut with ID _56494_210_4220_1_1535284837625_3033169_10-39296-0_sp1.1 from training. Duration: 0.92725 2023-10-03 19:53:08,145 WARNING [train.py:1204] (2/4) Exclude cut with ID _40901_210_5665_1_1533363789958_3856130_341-20819-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 19:53:11,075 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9029W0435-509822 from training. Duration: 0.896 2023-10-03 19:53:11,125 WARNING [train.py:1204] (2/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_51-117576-0 from training. Duration: 0.4 2023-10-03 19:53:12,601 WARNING [train.py:1204] (2/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_197-28164-0_sp1.1 from training. Duration: 0.72725 2023-10-03 19:53:14,831 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.0.layers.1.self_attn_weights.whiten_keys, num_groups=4, num_channels=128, metric=2.36 vs. limit=6.0 2023-10-03 19:53:15,882 WARNING [train.py:1204] (2/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_145-137790-0_sp1.1 from training. Duration: 0.7545625 2023-10-03 19:53:18,646 WARNING [train.py:1204] (2/4) Exclude cut with ID _47790_210_16187_1_1533862814066_4207519_2-39653-0_sp1.1 from training. Duration: 0.8909375 2023-10-03 19:53:20,264 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_3780_210_2087_1_1526467760_2130483_186-208240-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 19:53:21,639 WARNING [train.py:1204] (2/4) Exclude cut with ID _24055_210_12651_1_1532310907143_976979_92-59356-0 from training. Duration: 0.91 2023-10-03 19:53:23,424 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_3910_210_1794_1_1526742306_334530_28-238909-0_sp1.1 from training. Duration: 0.736375 2023-10-03 19:53:24,983 WARNING [train.py:1204] (2/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_269-154726-0 from training. Duration: 0.86 2023-10-03 19:53:30,531 WARNING [train.py:1204] (2/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_326-165900-0_sp0.9 from training. Duration: 0.7666875 2023-10-03 19:53:35,553 INFO [train.py:1046] (2/4) Epoch 40, batch 850, loss[loss=0.1599, simple_loss=0.2293, pruned_loss=0.04527, over 23921.00 frames. ], tot_loss[loss=0.157, simple_loss=0.2367, pruned_loss=0.03867, over 4640113.23 frames. ], batch size: 195, lr: 2.55e-03, grad_scale: 8.0 2023-10-03 19:53:35,612 WARNING [train.py:1204] (2/4) Exclude cut with ID _5294_210_1747_1_1527249643928_3696179_470-276077-0_sp1.1 from training. Duration: 0.963625 2023-10-03 19:53:37,050 WARNING [train.py:1204] (2/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_372-131163-0 from training. Duration: 0.94 2023-10-03 19:53:37,091 WARNING [train.py:1204] (2/4) Exclude cut with ID _45297_210_2721_1_1533715351381_3264949_351-147136-0_sp1.1 from training. Duration: 0.7909375 2023-10-03 19:53:38,421 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_1072-12671-0_sp1.1 from training. Duration: 0.92725 2023-10-03 19:53:38,525 WARNING [train.py:1204] (2/4) Exclude cut with ID _15133_210_6228_1_1530794512074_3080379_54-198193-0 from training. Duration: 0.94 2023-10-03 19:53:39,796 WARNING [train.py:1204] (2/4) Exclude cut with ID _30196_210_1664_1_1533708496522_7074140_1194-270988-0_sp1.1 from training. Duration: 0.97275 2023-10-03 19:53:41,152 WARNING [train.py:1204] (2/4) Exclude cut with ID _50310_210_15488_1_1534068043345_3641080_289-193637-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 19:53:42,562 WARNING [train.py:1204] (2/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_313-263495-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 19:53:43,938 WARNING [train.py:1204] (2/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_18-130336-0_sp1.1 from training. Duration: 0.7181875 2023-10-03 19:53:45,330 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_47896_210_20508_1_1534492518_5958868_646-287209-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 19:53:46,761 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_294-163793-0 from training. Duration: 0.82 2023-10-03 19:53:46,804 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_8-319957-0 from training. Duration: 0.96 2023-10-03 19:53:46,807 WARNING [train.py:1204] (2/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_1211-226066-0 from training. Duration: 0.76 2023-10-03 19:53:48,796 WARNING [train.py:1204] (2/4) Exclude cut with ID _4779_210_2084_1_1527318022944_3914893_500-285013-0_sp0.9 from training. Duration: 0.7666875 2023-10-03 19:53:50,090 WARNING [train.py:1204] (2/4) Exclude cut with ID _22826_210_9994_1_1531908201522_3279169_347-232268-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 19:53:51,531 WARNING [train.py:1204] (2/4) Exclude cut with ID _29877_210_6228_1_1532397791106_5276149_111-55056-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 19:53:51,564 WARNING [train.py:1204] (2/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_812-271646-0_sp1.1 from training. Duration: 0.92725 2023-10-03 19:53:51,591 WARNING [train.py:1204] (2/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_235-262124-0_sp1.1 from training. Duration: 0.6090625 2023-10-03 19:53:53,802 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.conv_module2.balancer1.prob, batch_count=1386893.3333333333, ans=0.125 2023-10-03 19:53:56,411 WARNING [train.py:1204] (2/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_14-16530-0_sp1.1 from training. Duration: 0.97275 2023-10-03 19:53:56,442 WARNING [train.py:1204] (2/4) Exclude cut with ID _44718_210_5816_1_1534050051869_6469030_568-304512-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 19:53:57,714 WARNING [train.py:1204] (2/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_1033-274376-0 from training. Duration: 0.87 2023-10-03 19:54:01,839 WARNING [train.py:1204] (2/4) Exclude cut with ID _4681_210_3525_1_1527251040321_1235713_123-24428-0 from training. Duration: 0.48 2023-10-03 19:54:02,075 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.feed_forward1.hidden_balancer.prob, batch_count=1386893.3333333333, ans=0.125 2023-10-03 19:54:04,831 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.feed_forward1.hidden_balancer.prob, batch_count=1386960.0, ans=0.125 2023-10-03 19:54:05,949 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_37818_210_4238_1_1533620910_4720083_251-288764-0_sp1.1 from training. Duration: 0.97275 2023-10-03 19:54:07,802 WARNING [train.py:1204] (2/4) Exclude cut with ID _65585_210_12210_1_1535450451584_3764098_112-294301-0 from training. Duration: 0.88 2023-10-03 19:54:08,095 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.conv_module1.balancer1.prob, batch_count=1386960.0, ans=0.125 2023-10-03 19:54:09,938 WARNING [train.py:1204] (2/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_329-113173-0 from training. Duration: 0.62 2023-10-03 19:54:11,330 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6767_210_3273_1_1527855783_1718932_173-256601-0 from training. Duration: 0.8 2023-10-03 19:54:12,668 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.605e+02 1.964e+02 2.128e+02 2.466e+02 3.367e+02, threshold=4.255e+02, percent-clipped=0.0 2023-10-03 19:54:12,823 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9266W0001-627677 from training. Duration: 0.896 2023-10-03 19:54:14,150 WARNING [train.py:1204] (2/4) Exclude cut with ID _49643_210_7989_1_1534640244224_3939031_611-185515-0_sp1.1 from training. Duration: 0.8545625 2023-10-03 19:54:14,154 WARNING [train.py:1204] (2/4) Exclude cut with ID _31649_210_6228_1_1532584608063_4091078_687-237771-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 19:54:14,172 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_12868_210_6753_1_1530702479_3610564_148-172861-0_sp0.9 from training. Duration: 0.5555625 2023-10-03 19:54:15,586 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_5129_210_6400_1_1527161005_3579054_107-69738-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 19:54:16,955 WARNING [train.py:1204] (2/4) Exclude cut with ID _40137_210_12210_1_1533290484046_3795180_36-301164-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 19:54:18,687 WARNING [train.py:1204] (2/4) Exclude cut with ID _7537_210_2084_1_1528502395431_3924359_113-199933-0 from training. Duration: 0.59 2023-10-03 19:54:19,323 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.4.encoder.layers.1.self_attn2.whiten, num_groups=1, num_channels=384, metric=11.79 vs. limit=22.5 2023-10-03 19:54:20,241 WARNING [train.py:1204] (2/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_547-5424-0_sp1.1 from training. Duration: 0.8545625 2023-10-03 19:54:20,282 WARNING [train.py:1204] (2/4) Exclude cut with ID _43552_210_8913_1_1533726031059_3523429_147-284024-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 19:54:21,662 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_294-163793-0_sp1.1 from training. Duration: 0.7454375 2023-10-03 19:54:23,011 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_3780_210_2087_1_1526467760_2130483_170-259044-0_sp1.1 from training. Duration: 0.636375 2023-10-03 19:54:24,415 WARNING [train.py:1204] (2/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_726-270780-0_sp1.1 from training. Duration: 0.7818125 2023-10-03 19:54:26,459 WARNING [train.py:1204] (2/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_214-291369-0_sp1.1 from training. Duration: 0.57275 2023-10-03 19:54:26,519 WARNING [train.py:1204] (2/4) Exclude cut with ID _21881_210_3525_1_1531825242757_3798380_247-102591-0 from training. Duration: 0.98 2023-10-03 19:54:29,361 WARNING [train.py:1204] (2/4) Exclude cut with ID _8064_210_5399_1_1528372299291_4282973_18-174647-0_sp1.1 from training. Duration: 0.82725 2023-10-03 19:54:29,363 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_415-52611-0_sp1.1 from training. Duration: 0.936375 2023-10-03 19:54:30,697 WARNING [train.py:1204] (2/4) Exclude cut with ID _8394_210_6702_1_1528590504316_3540189_379-49457-0_sp0.9 from training. Duration: 0.9666875 2023-10-03 19:54:30,723 WARNING [train.py:1204] (2/4) Exclude cut with ID _64521_210_14699_1_1535243427259_3815328_368-195106-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 19:54:32,043 WARNING [train.py:1204] (2/4) Exclude cut with ID _28394_210_8913_1_1532430049518_3650118_51-334124-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 19:54:33,450 WARNING [train.py:1204] (2/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_317-161769-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 19:54:36,643 WARNING [train.py:1204] (2/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_40-129756-0_sp0.9 from training. Duration: 0.9 2023-10-03 19:54:36,746 WARNING [train.py:1204] (2/4) Exclude cut with ID _3067_210_2667_1_1526212974640_4503168_119-175776-0_sp0.9 from training. Duration: 0.82225 2023-10-03 19:54:36,789 WARNING [train.py:1204] (2/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_1004-304915-0_sp1.1 from training. Duration: 0.92725 2023-10-03 19:54:38,134 WARNING [train.py:1204] (2/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_359-118926-0_sp0.9 from training. Duration: 0.8 2023-10-03 19:54:45,419 WARNING [train.py:1204] (2/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_51-200220-0_sp0.9 from training. Duration: 0.62225 2023-10-03 19:54:46,732 WARNING [train.py:1204] (2/4) Exclude cut with ID _46683_210_9642_1_1533730913492_4591116_530-326202-0_sp1.1 from training. Duration: 0.936375 2023-10-03 19:54:46,784 WARNING [train.py:1204] (2/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_567-135114-0 from training. Duration: 0.87 2023-10-03 19:54:46,808 WARNING [train.py:1204] (2/4) Exclude cut with ID _66863_210_18053_1_1535698758334_8590319_1092-271784-0_sp1.1 from training. Duration: 0.87275 2023-10-03 19:54:48,676 WARNING [train.py:1204] (2/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_762-282614-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 19:54:49,345 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.4.encoder.layers.0.conv_module2.whiten, num_groups=1, num_channels=384, metric=8.15 vs. limit=15.0 2023-10-03 19:54:50,060 INFO [train.py:1046] (2/4) Epoch 40, batch 900, loss[loss=0.1722, simple_loss=0.251, pruned_loss=0.04665, over 23423.00 frames. ], tot_loss[loss=0.1575, simple_loss=0.2372, pruned_loss=0.03888, over 4660563.14 frames. ], batch size: 285, lr: 2.55e-03, grad_scale: 8.0 2023-10-03 19:54:50,132 WARNING [train.py:1204] (2/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_280-120942-0 from training. Duration: 0.77 2023-10-03 19:54:56,602 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.out_combiner.scale_min, batch_count=1387160.0, ans=0.2 2023-10-03 19:54:57,628 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_31583_210_3852_1_1532595851_4024413_27-170931-0_sp1.1 from training. Duration: 0.963625 2023-10-03 19:54:59,163 WARNING [train.py:1204] (2/4) Exclude cut with ID _12369_210_4381_1_1530356078581_4388520_266-271628-0_sp1.1 from training. Duration: 0.92725 2023-10-03 19:55:00,558 WARNING [train.py:1204] (2/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_41-31157-0 from training. Duration: 0.57 2023-10-03 19:55:03,361 WARNING [train.py:1204] (2/4) Exclude cut with ID _21878_210_3525_1_1531738772002_7264330_113-18163-0_sp1.1 from training. Duration: 0.8090625 2023-10-03 19:55:03,397 WARNING [train.py:1204] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_147-111137-0 from training. Duration: 0.95 2023-10-03 19:55:04,096 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1083-220122-0_sp1.1 from training. Duration: 0.463625 2023-10-03 19:55:05,319 WARNING [train.py:1204] (2/4) Exclude cut with ID _14884_210_2252_1_1530707459689_5561250_591-173406-0_sp1.1 from training. Duration: 0.87275 2023-10-03 19:55:05,325 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_24677_210_1794_1_1532002693_4341296_149-303793-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 19:55:05,376 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_12868_210_6753_1_1530702479_3610564_133-564-0_sp1.1 from training. Duration: 0.7181875 2023-10-03 19:55:06,707 WARNING [train.py:1204] (2/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_625-338546-0_sp0.9 from training. Duration: 0.97775 2023-10-03 19:55:13,233 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=1387226.6666666667, ans=0.1 2023-10-03 19:55:16,934 WARNING [train.py:1204] (2/4) Exclude cut with ID _25882_210_3036_1_1532775665815_6479510_624-320169-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 19:55:16,943 WARNING [train.py:1204] (2/4) Exclude cut with ID _20622_210_3852_1_1531622427080_1093839_127-325326-0_sp1.1 from training. Duration: 0.92725 2023-10-03 19:55:16,978 WARNING [train.py:1204] (2/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_464-33931-0_sp1.1 from training. Duration: 0.4909375 2023-10-03 19:55:19,709 WARNING [train.py:1204] (2/4) Exclude cut with ID _66112_210_10635_1_1535590236305_3801484_120-304588-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 19:55:24,530 WARNING [train.py:1204] (2/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_110-130961-0 from training. Duration: 0.47 2023-10-03 19:55:25,985 WARNING [train.py:1204] (2/4) Exclude cut with ID _16908_210_5724_1_1531119157248_3772700_64-54020-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 19:55:29,405 WARNING [train.py:1204] (2/4) Exclude cut with ID _4068_210_2629_1_1526697350395_1546259_224-288693-0_sp1.1 from training. Duration: 0.536375 2023-10-03 19:55:30,711 WARNING [train.py:1204] (2/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_191-41023-0_sp0.9 from training. Duration: 0.788875 2023-10-03 19:55:32,026 WARNING [train.py:1204] (2/4) Exclude cut with ID ID0205W0255-783002 from training. Duration: 0.958 2023-10-03 19:55:32,281 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.self_attn_weights.pos_emb_skip_rate, batch_count=1387293.3333333333, ans=0.0 2023-10-03 19:55:33,319 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_162-262717-0 from training. Duration: 0.83 2023-10-03 19:55:37,978 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_224-322394-0_sp1.1 from training. Duration: 0.6 2023-10-03 19:55:37,986 WARNING [train.py:1204] (2/4) Exclude cut with ID _4866_210_2577_1_1527404412284_4364659_133-1696-0_sp1.1 from training. Duration: 0.9 2023-10-03 19:55:39,716 WARNING [train.py:1204] (2/4) Exclude cut with ID _6275_210_2084_1_1528110062213_7669049_973-198085-0_sp1.1 from training. Duration: 0.6818125 2023-10-03 19:55:41,360 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.bypass.skip_rate, batch_count=1387360.0, ans=0.07 2023-10-03 19:55:44,001 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_47449_210_5399_1_1534076445_4006929_410-237806-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 19:55:44,011 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7543_210_6753_1_1528543318_4552_1-85072-0_sp0.9 from training. Duration: 0.92225 2023-10-03 19:55:44,182 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.0.feed_forward1.out_proj.dropout_p, batch_count=1387360.0, ans=0.1 2023-10-03 19:55:45,405 WARNING [train.py:1204] (2/4) Exclude cut with ID _65263_210_22356_1_1535763598691_3921037_108-21902-0 from training. Duration: 0.99 2023-10-03 19:55:45,419 WARNING [train.py:1204] (2/4) Exclude cut with ID _65585_210_12210_1_1535450451584_3764098_309-174818-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 19:55:48,281 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_309-65177-0 from training. Duration: 0.88 2023-10-03 19:55:49,594 WARNING [train.py:1204] (2/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_363-185703-0_sp1.1 from training. Duration: 0.863625 2023-10-03 19:55:49,625 WARNING [train.py:1204] (2/4) Exclude cut with ID _42326_210_7770_1_1533814282531_3707269_442-238692-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 19:55:49,772 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.conv_module2.balancer2.prob, batch_count=1387426.6666666667, ans=0.125 2023-10-03 19:55:51,433 WARNING [train.py:1204] (2/4) Exclude cut with ID _69884_210_9771_1_1535846392818_4166169_226-254446-0_sp1.1 from training. Duration: 0.936375 2023-10-03 19:55:51,456 WARNING [train.py:1204] (2/4) Exclude cut with ID _29853_210_7497_1_1532505600851_6642110_287-299903-0_sp1.1 from training. Duration: 0.963625 2023-10-03 19:55:56,189 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1015-343884-0 from training. Duration: 0.62 2023-10-03 19:55:56,222 WARNING [train.py:1204] (2/4) Exclude cut with ID ID0305W0008-833402 from training. Duration: 0.91 2023-10-03 19:55:57,561 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6673_210_3273_1_1527767790_3173240_351-167954-0_sp1.1 from training. Duration: 0.436375 2023-10-03 19:55:58,754 WARNING [train.py:1204] (2/4) Exclude cut with ID _6456_210_1534_1_1527681634212_3181049_345-252877-0 from training. Duration: 0.64 2023-10-03 19:56:00,263 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_13-205182-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 19:56:03,651 WARNING [train.py:1204] (2/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_427-208901-0 from training. Duration: 0.68 2023-10-03 19:56:05,003 INFO [train.py:1046] (2/4) Epoch 40, batch 950, loss[loss=0.1487, simple_loss=0.23, pruned_loss=0.03373, over 24322.00 frames. ], tot_loss[loss=0.1581, simple_loss=0.2376, pruned_loss=0.0393, over 4673505.96 frames. ], batch size: 61, lr: 2.55e-03, grad_scale: 8.0 2023-10-03 19:56:07,751 WARNING [train.py:1204] (2/4) Exclude cut with ID _49039_210_6315_1_1533967537541_3846118_9-148921-0_sp1.1 from training. Duration: 0.97275 2023-10-03 19:56:11,042 WARNING [train.py:1204] (2/4) Exclude cut with ID _59493_210_17282_1_1535078419077_3225879_143-70317-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 19:56:11,201 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.feed_forward1.hidden_balancer.prob, batch_count=1387493.3333333333, ans=0.125 2023-10-03 19:56:12,393 WARNING [train.py:1204] (2/4) Exclude cut with ID _27967_210_12140_1_1532588391859_8419130_1056-236369-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 19:56:12,432 WARNING [train.py:1204] (2/4) Exclude cut with ID _6537_210_1883_1_1527987559710_3597129_382-133062-0_sp0.9 from training. Duration: 0.7555625 2023-10-03 19:56:13,884 WARNING [train.py:1204] (2/4) Exclude cut with ID ID0376W0255-869957 from training. Duration: 0.9510625 2023-10-03 19:56:16,636 WARNING [train.py:1204] (2/4) Exclude cut with ID _49371_210_12071_1_1533952761021_3812190_483-240691-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 19:56:17,998 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_12868_210_6753_1_1530701932_530275_18-76322-0_sp1.1 from training. Duration: 0.7909375 2023-10-03 19:56:19,918 WARNING [train.py:1204] (2/4) Exclude cut with ID _53950_210_5816_1_1534417264919_3862951_367-26344-0_sp1.1 from training. Duration: 0.97275 2023-10-03 19:56:19,945 WARNING [train.py:1204] (2/4) Exclude cut with ID _5444_210_2151_1_1527418808464_5312210_634-194174-0_sp1.1 from training. Duration: 0.8818125 2023-10-03 19:56:19,961 WARNING [train.py:1204] (2/4) Exclude cut with ID _56494_210_4220_1_1535284837625_3033169_69-138150-0 from training. Duration: 0.94 2023-10-03 19:56:20,108 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.1.self_attn_weights.pos_emb_skip_rate, batch_count=1387560.0, ans=0.0 2023-10-03 19:56:21,480 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7543_210_6753_1_1528544010_1110402_25-183860-0_sp0.9 from training. Duration: 0.52225 2023-10-03 19:56:22,863 WARNING [train.py:1204] (2/4) Exclude cut with ID _40284_210_2084_1_1533513679814_3704279_287-191918-0_sp1.1 from training. Duration: 0.963625 2023-10-03 19:56:24,214 WARNING [train.py:1204] (2/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_291-270881-0 from training. Duration: 0.97 2023-10-03 19:56:26,093 WARNING [train.py:1204] (2/4) Exclude cut with ID _12555_210_8750_1_1530075594402_3780239_449-267986-0_sp0.9 from training. Duration: 0.92225 2023-10-03 19:56:27,715 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=1387560.0, ans=0.1 2023-10-03 19:56:30,146 WARNING [train.py:1204] (2/4) Exclude cut with ID _3992_210_1747_1_1526644816031_3731060_268-64520-0_sp1.1 from training. Duration: 0.963625 2023-10-03 19:56:30,161 WARNING [train.py:1204] (2/4) Exclude cut with ID _39411_210_5986_1_1533538825030_3343649_173-238248-0_sp0.9 from training. Duration: 0.92225 2023-10-03 19:56:30,186 WARNING [train.py:1204] (2/4) Exclude cut with ID _32704_210_8954_1_1533017488840_6747370_828-313097-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 19:56:31,577 WARNING [train.py:1204] (2/4) Exclude cut with ID _26907_210_7234_1_1532221740694_3205263_337-2494-0 from training. Duration: 0.88 2023-10-03 19:56:34,282 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_229-45300-0_sp1.1 from training. Duration: 0.5181875 2023-10-03 19:56:34,437 WARNING [train.py:1204] (2/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_300-27235-0_sp1.1 from training. Duration: 0.7909375 2023-10-03 19:56:36,422 WARNING [train.py:1204] (2/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_48-338976-0_sp1.1 from training. Duration: 0.8090625 2023-10-03 19:56:41,686 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.738e+02 1.970e+02 2.172e+02 2.506e+02 3.661e+02, threshold=4.343e+02, percent-clipped=0.0 2023-10-03 19:56:41,778 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_13091_210_7925_1_1530609447_8987354_792-195353-0_sp1.1 from training. Duration: 0.936375 2023-10-03 19:56:41,793 WARNING [train.py:1204] (2/4) Exclude cut with ID _56524_210_9642_1_1534572011779_3619170_311-54500-0_sp1.1 from training. Duration: 0.97275 2023-10-03 19:56:44,000 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7543_210_6753_1_1528544010_1110402_25-183860-0 from training. Duration: 0.47 2023-10-03 19:56:47,296 WARNING [train.py:1204] (2/4) Exclude cut with ID 2411-132532-0017-82279-0_sp1.1 from training. Duration: 0.9681875 2023-10-03 19:56:47,297 WARNING [train.py:1204] (2/4) Exclude cut with ID _14170_210_5219_1_1530525949868_4469650_272-249985-0_sp0.9 from training. Duration: 0.7444375 2023-10-03 19:56:48,654 WARNING [train.py:1204] (2/4) Exclude cut with ID _55644_210_11856_1_1534593599582_4103719_320-229006-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 19:56:48,715 WARNING [train.py:1204] (2/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_177-100554-0_sp1.1 from training. Duration: 0.963625 2023-10-03 19:56:48,717 WARNING [train.py:1204] (2/4) Exclude cut with ID _14998_210_5219_1_1530705582080_7474970_787-294416-0_sp1.1 from training. Duration: 0.7454375 2023-10-03 19:56:52,702 WARNING [train.py:1204] (2/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_318-59097-0 from training. Duration: 0.76 2023-10-03 19:56:54,064 WARNING [train.py:1204] (2/4) Exclude cut with ID _12369_210_4381_1_1530356078581_4388520_480-13146-0_sp1.1 from training. Duration: 0.77275 2023-10-03 19:56:55,490 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_469-338558-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 19:56:57,341 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_40896_210_15710_1_1533433956_7100802_367-126897-0_sp1.1 from training. Duration: 0.963625 2023-10-03 19:56:57,358 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6545_210_2040_1_1527774670_4245258_379-14182-0 from training. Duration: 0.8 2023-10-03 19:56:57,382 WARNING [train.py:1204] (2/4) Exclude cut with ID _21631_210_2253_1_1531907880778_3160339_197-79498-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 19:56:57,393 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_12868_210_6753_1_1530702479_3610564_416-293746-0_sp1.1 from training. Duration: 0.8181875 2023-10-03 19:56:58,770 WARNING [train.py:1204] (2/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_52-211683-0 from training. Duration: 0.9 2023-10-03 19:57:01,596 WARNING [train.py:1204] (2/4) Exclude cut with ID _48678_210_5663_1_1533988830947_3769199_306-255886-0_sp0.9 from training. Duration: 0.8555625 2023-10-03 19:57:05,879 WARNING [train.py:1204] (2/4) Exclude cut with ID _65134_210_14763_1_1535335393985_3505368_296-24733-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 19:57:10,399 WARNING [train.py:1204] (2/4) Exclude cut with ID _14898_210_9206_1_1530766812377_4177190_335-99364-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 19:57:11,776 WARNING [train.py:1204] (2/4) Exclude cut with ID _54871_210_22356_1_1534665798253_3856359_19-342632-0 from training. Duration: 0.91 2023-10-03 19:57:11,794 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_53071_210_16554_1_1534490405_2954066_265-170472-0 from training. Duration: 0.83 2023-10-03 19:57:13,713 WARNING [train.py:1204] (2/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_189-298172-0_sp1.1 from training. Duration: 0.963625 2023-10-03 19:57:19,399 INFO [train.py:1046] (2/4) Epoch 40, batch 1000, loss[loss=0.1506, simple_loss=0.2381, pruned_loss=0.03158, over 24309.00 frames. ], tot_loss[loss=0.1572, simple_loss=0.2368, pruned_loss=0.03885, over 4688730.53 frames. ], batch size: 74, lr: 2.55e-03, grad_scale: 8.0 2023-10-03 19:57:20,756 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7615_210_4211_1_1528110965_4580160_72-235211-0 from training. Duration: 0.76 2023-10-03 19:57:20,807 WARNING [train.py:1204] (2/4) Exclude cut with ID _47790_210_16187_1_1533862814066_4207519_155-90874-0_sp1.1 from training. Duration: 0.936375 2023-10-03 19:57:25,113 WARNING [train.py:1204] (2/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_164-216310-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 19:57:25,226 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_46013_210_7234_1_1533776362_3744032_463-52378-0 from training. Duration: 0.86 2023-10-03 19:57:25,229 WARNING [train.py:1204] (2/4) Exclude cut with ID _61133_210_9969_1_1535196777227_3696409_333-270325-0 from training. Duration: 0.93 2023-10-03 19:57:31,349 WARNING [train.py:1204] (2/4) Exclude cut with ID _5172_210_1883_1_1527246062567_3577219_18-101380-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 19:57:31,357 WARNING [train.py:1204] (2/4) Exclude cut with ID _30800_210_22382_1_1532499347904_4515959_577-236858-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 19:57:32,731 WARNING [train.py:1204] (2/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_1006-177345-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 19:57:34,278 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6291_210_3273_1_1527683649_1078498_4-266348-0 from training. Duration: 0.78 2023-10-03 19:57:39,033 WARNING [train.py:1204] (2/4) Exclude cut with ID _55857_210_2346_1_1534644007831_6898209_745-98391-0 from training. Duration: 0.76 2023-10-03 19:57:40,328 WARNING [train.py:1204] (2/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_375-244976-0 from training. Duration: 0.75 2023-10-03 19:57:40,375 WARNING [train.py:1204] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_4-248404-0_sp1.1 from training. Duration: 0.97275 2023-10-03 19:57:43,026 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8453_210_4211_1_1528502940_8562730_596-306820-0 from training. Duration: 0.97 2023-10-03 19:57:44,997 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_211-339622-0_sp0.9 from training. Duration: 0.5 2023-10-03 19:57:45,038 WARNING [train.py:1204] (2/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_393-57888-0 from training. Duration: 0.64 2023-10-03 19:57:46,399 WARNING [train.py:1204] (2/4) Exclude cut with ID _23825_210_13449_1_1531915313526_3497150_143-343575-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 19:57:47,650 WARNING [train.py:1204] (2/4) Exclude cut with ID _58605_210_12210_1_1534723195939_3904259_36-12256-0_sp1.1 from training. Duration: 0.936375 2023-10-03 19:57:57,917 WARNING [train.py:1204] (2/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_38-67795-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 19:57:57,972 WARNING [train.py:1204] (2/4) Exclude cut with ID _64631_210_27767_1_1535349580068_4165160_308-327832-0_sp1.1 from training. Duration: 0.92725 2023-10-03 19:57:58,035 WARNING [train.py:1204] (2/4) Exclude cut with ID _37086_210_9038_1_1532999540891_7021250_726-81176-0_sp1.1 from training. Duration: 0.936375 2023-10-03 19:57:58,086 WARNING [train.py:1204] (2/4) Exclude cut with ID _47790_210_16187_1_1533862814066_4207519_47-141521-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 19:57:58,097 WARNING [train.py:1204] (2/4) Exclude cut with ID _3067_210_2667_1_1526212974640_4503168_250-285937-0 from training. Duration: 0.8 2023-10-03 19:57:58,122 WARNING [train.py:1204] (2/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_239-24000-0_sp1.1 from training. Duration: 0.97275 2023-10-03 19:57:59,413 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_3371_210_1794_1_1526384462_5580777_396-339812-0_sp0.9 from training. Duration: 0.9444375 2023-10-03 19:57:59,460 WARNING [train.py:1204] (2/4) Exclude cut with ID _16909_210_5724_1_1531292079912_4118919_580-173454-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 19:58:01,252 WARNING [train.py:1204] (2/4) Exclude cut with ID IC0124W0002-61308 from training. Duration: 0.982 2023-10-03 19:58:05,350 WARNING [train.py:1204] (2/4) Exclude cut with ID _31602_210_5854_1_1532677021456_4458250_249-96604-0 from training. Duration: 0.89 2023-10-03 19:58:06,805 WARNING [train.py:1204] (2/4) Exclude cut with ID _69584_210_5219_1_1535785133775_4044450_325-7656-0 from training. Duration: 0.86 2023-10-03 19:58:08,254 WARNING [train.py:1204] (2/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_478-162247-0 from training. Duration: 0.82 2023-10-03 19:58:09,678 WARNING [train.py:1204] (2/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_686-54383-0_sp1.1 from training. Duration: 0.7909375 2023-10-03 19:58:10,681 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.0.layers.1.conv_module2.whiten, num_groups=1, num_channels=192, metric=9.37 vs. limit=15.0 2023-10-03 19:58:14,576 WARNING [train.py:1204] (2/4) Exclude cut with ID _29303_210_2575_1_1532392237303_7410649_85-270092-0_sp1.1 from training. Duration: 0.936375 2023-10-03 19:58:14,586 WARNING [train.py:1204] (2/4) Exclude cut with ID _2322_210_2667_1_1525604548971_3656299_440-80494-0_sp1.1 from training. Duration: 0.8 2023-10-03 19:58:14,626 WARNING [train.py:1204] (2/4) Exclude cut with ID _47346_210_9476_1_1533815971553_3536160_201-40644-0_sp1.1 from training. Duration: 0.936375 2023-10-03 19:58:16,056 WARNING [train.py:1204] (2/4) Exclude cut with ID _56235_210_10640_1_1534557335235_4161099_343-139898-0_sp1.1 from training. Duration: 0.87275 2023-10-03 19:58:17,899 WARNING [train.py:1204] (2/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_1012-279473-0 from training. Duration: 0.67 2023-10-03 19:58:20,654 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7931_210_1794_1_1528594728_5564059_280-279560-0_sp1.1 from training. Duration: 0.8909375 2023-10-03 19:58:20,701 WARNING [train.py:1204] (2/4) Exclude cut with ID _8263_210_1664_1_1528439452891_970220_68-181805-0 from training. Duration: 0.64 2023-10-03 19:58:22,000 WARNING [train.py:1204] (2/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_605-133395-0 from training. Duration: 0.66 2023-10-03 19:58:23,401 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_43-171109-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 19:58:23,404 WARNING [train.py:1204] (2/4) Exclude cut with ID _40094_210_16380_1_1533294005773_3641159_107-280739-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 19:58:26,158 WARNING [train.py:1204] (2/4) Exclude cut with ID _14821_210_8750_1_1530680503991_6829359_2-227151-0_sp0.9 from training. Duration: 0.92225 2023-10-03 19:58:28,956 WARNING [train.py:1204] (2/4) Exclude cut with ID _14268_210_9206_1_1530622771285_4714099_331-105351-0_sp1.1 from training. Duration: 0.5909375 2023-10-03 19:58:30,336 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_26326_210_7925_1_1533002493_3592406_451-82741-0_sp1.1 from training. Duration: 0.97275 2023-10-03 19:58:32,434 WARNING [train.py:1204] (2/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_191-59784-0_sp0.9 from training. Duration: 0.97775 2023-10-03 19:58:32,535 WARNING [train.py:1204] (2/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_445-117398-0_sp1.1 from training. Duration: 0.8090625 2023-10-03 19:58:33,950 INFO [train.py:1046] (2/4) Epoch 40, batch 1050, loss[loss=0.1737, simple_loss=0.257, pruned_loss=0.04523, over 23599.00 frames. ], tot_loss[loss=0.1566, simple_loss=0.2361, pruned_loss=0.03855, over 4697600.54 frames. ], batch size: 85, lr: 2.55e-03, grad_scale: 8.0 2023-10-03 19:58:35,381 WARNING [train.py:1204] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_605-257205-0_sp0.9 from training. Duration: 0.6555625 2023-10-03 19:58:36,691 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_50050_210_16223_1_1535435796_7290138_592-258841-0_sp1.1 from training. Duration: 0.936375 2023-10-03 19:58:39,367 WARNING [train.py:1204] (2/4) Exclude cut with ID _14348_210_8341_1_1530599434996_3778640_240-232370-0_sp0.9 from training. Duration: 0.9333125 2023-10-03 19:58:42,694 WARNING [train.py:1204] (2/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_239-316802-0_sp0.9 from training. Duration: 0.7666875 2023-10-03 19:58:44,157 WARNING [train.py:1204] (2/4) Exclude cut with ID _45560_210_21982_1_1533812603951_3584999_111-6376-0_sp1.1 from training. Duration: 0.763625 2023-10-03 19:58:46,887 WARNING [train.py:1204] (2/4) Exclude cut with ID _54189_210_16380_1_1534332670039_3911710_606-311481-0_sp1.1 from training. Duration: 0.963625 2023-10-03 19:58:48,612 WARNING [train.py:1204] (2/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_237-332027-0_sp1.1 from training. Duration: 0.863625 2023-10-03 19:58:48,636 WARNING [train.py:1204] (2/4) Exclude cut with ID _66070_210_7640_1_1535441393096_7805310_489-292631-0_sp0.9 from training. Duration: 0.87775 2023-10-03 19:58:48,719 WARNING [train.py:1204] (2/4) Exclude cut with ID _49728_210_12651_1_1534041034294_6788150_624-195301-0_sp1.1 from training. Duration: 0.77275 2023-10-03 19:58:50,631 WARNING [train.py:1204] (2/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_44-79427-0 from training. Duration: 0.98 2023-10-03 19:58:52,038 WARNING [train.py:1204] (2/4) Exclude cut with ID _35350_210_16040_1_1533040191183_4075299_490-198354-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 19:58:52,086 WARNING [train.py:1204] (2/4) Exclude cut with ID _5166_210_2084_1_1527073197160_4087993_309-222545-0 from training. Duration: 0.84 2023-10-03 19:58:54,827 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_13091_210_7925_1_1530609447_8987354_790-199469-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 19:58:56,063 WARNING [train.py:1204] (2/4) Exclude cut with ID _21632_210_2253_1_1531994001765_3744140_110-274654-0 from training. Duration: 0.82 2023-10-03 19:58:56,077 WARNING [train.py:1204] (2/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_498-40153-0_sp0.9 from training. Duration: 0.7 2023-10-03 19:59:01,699 WARNING [train.py:1204] (2/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_363-282909-0_sp1.1 from training. Duration: 0.936375 2023-10-03 19:59:01,755 WARNING [train.py:1204] (2/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_178-177582-0_sp0.9 from training. Duration: 0.82225 2023-10-03 19:59:01,772 WARNING [train.py:1204] (2/4) Exclude cut with ID _24612_210_8913_1_1531998054193_3806359_357-35487-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 19:59:04,948 WARNING [train.py:1204] (2/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_138-332773-0 from training. Duration: 0.81 2023-10-03 19:59:04,967 WARNING [train.py:1204] (2/4) Exclude cut with ID _7624_210_1531_1_1528109802198_4933101_418-349834-0 from training. Duration: 0.6 2023-10-03 19:59:04,994 WARNING [train.py:1204] (2/4) Exclude cut with ID _7537_210_2084_1_1528502395431_3924359_14-312906-0_sp0.9 from training. Duration: 0.9333125 2023-10-03 19:59:08,986 WARNING [train.py:1204] (2/4) Exclude cut with ID _60057_210_10640_1_1534903065793_3333150_362-230377-0 from training. Duration: 0.87 2023-10-03 19:59:09,197 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=1388293.3333333333, ans=0.1 2023-10-03 19:59:10,268 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.564e+02 1.893e+02 2.073e+02 2.289e+02 3.386e+02, threshold=4.145e+02, percent-clipped=0.0 2023-10-03 19:59:10,482 WARNING [train.py:1204] (2/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_138-64210-0 from training. Duration: 0.73 2023-10-03 19:59:11,837 WARNING [train.py:1204] (2/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_454-339504-0_sp1.1 from training. Duration: 0.97275 2023-10-03 19:59:12,037 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.balancer2.prob, batch_count=1388293.3333333333, ans=0.125 2023-10-03 19:59:12,408 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.4.encoder.layers.2.self_attn2.whiten, num_groups=1, num_channels=384, metric=11.18 vs. limit=22.5 2023-10-03 19:59:15,314 WARNING [train.py:1204] (2/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_508-335369-0_sp0.9 from training. Duration: 0.5333125 2023-10-03 19:59:18,383 WARNING [train.py:1204] (2/4) Exclude cut with ID _5608_210_2106_1_1527415439089_6819768_909-82767-0_sp0.9 from training. Duration: 0.67775 2023-10-03 19:59:18,413 WARNING [train.py:1204] (2/4) Exclude cut with ID _6694_210_4599_1_1527852637490_3965529_99-11611-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 19:59:19,969 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_2655_210_2087_1_1525688959_3897400_52-279899-0_sp1.1 from training. Duration: 0.62725 2023-10-03 19:59:22,774 WARNING [train.py:1204] (2/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_375-245677-0_sp1.1 from training. Duration: 0.736375 2023-10-03 19:59:24,837 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.ff2_skip_rate, batch_count=1388360.0, ans=0.0 2023-10-03 19:59:27,390 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_23850_210_4238_1_1532232597_1632433_32-185135-0 from training. Duration: 0.86 2023-10-03 19:59:27,641 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.balancer2.prob, batch_count=1388360.0, ans=0.125 2023-10-03 19:59:28,756 WARNING [train.py:1204] (2/4) Exclude cut with ID _15113_210_8254_1_1530842438579_3716279_209-54533-0 from training. Duration: 0.96 2023-10-03 19:59:28,794 WARNING [train.py:1204] (2/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_401-53423-0 from training. Duration: 0.55 2023-10-03 19:59:28,822 WARNING [train.py:1204] (2/4) Exclude cut with ID _14867_210_1968_1_1530684179890_3846969_319-250868-0_sp1.1 from training. Duration: 0.9 2023-10-03 19:59:30,059 WARNING [train.py:1204] (2/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_106-30782-0_sp0.9 from training. Duration: 0.888875 2023-10-03 19:59:31,516 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_5960_210_1794_1_1527403865_3990999_515-2613-0 from training. Duration: 0.88 2023-10-03 19:59:33,110 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.0.nonlin_attention.balancer.prob, batch_count=1388426.6666666667, ans=0.125 2023-10-03 19:59:34,535 WARNING [train.py:1204] (2/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_798-106179-0_sp0.9 from training. Duration: 0.92225 2023-10-03 19:59:36,428 WARNING [train.py:1204] (2/4) Exclude cut with ID _14227_210_4381_1_1530701804286_3940259_125-275642-0_sp1.1 from training. Duration: 0.9 2023-10-03 19:59:36,431 WARNING [train.py:1204] (2/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_79-200392-0_sp1.1 from training. Duration: 0.8818125 2023-10-03 19:59:37,794 WARNING [train.py:1204] (2/4) Exclude cut with ID _48836_210_12210_1_1534075848293_3904150_429-35475-0_sp0.9 from training. Duration: 0.988875 2023-10-03 19:59:37,826 WARNING [train.py:1204] (2/4) Exclude cut with ID _69884_210_9771_1_1535846392818_4166169_224-172517-0_sp1.1 from training. Duration: 0.97275 2023-10-03 19:59:40,872 WARNING [train.py:1204] (2/4) Exclude cut with ID _15113_210_8254_1_1530842438579_3716279_76-232507-0_sp1.1 from training. Duration: 0.97275 2023-10-03 19:59:40,875 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_135-5501-0 from training. Duration: 0.98 2023-10-03 19:59:43,482 WARNING [train.py:1204] (2/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_563-347832-0_sp0.9 from training. Duration: 0.988875 2023-10-03 19:59:43,500 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6786_210_1388_1_1527839699_4194495_273-149841-0 from training. Duration: 0.92 2023-10-03 19:59:43,518 WARNING [train.py:1204] (2/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_172-215932-0 from training. Duration: 0.66 2023-10-03 19:59:44,839 WARNING [train.py:1204] (2/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_939-132910-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 19:59:47,895 INFO [train.py:1046] (2/4) Epoch 40, batch 1100, loss[loss=0.1572, simple_loss=0.2299, pruned_loss=0.04229, over 23692.00 frames. ], tot_loss[loss=0.1559, simple_loss=0.2356, pruned_loss=0.03806, over 4706804.36 frames. ], batch size: 150, lr: 2.55e-03, grad_scale: 8.0 2023-10-03 19:59:49,293 WARNING [train.py:1204] (2/4) Exclude cut with ID _49371_210_12071_1_1533952761021_3812190_16-246001-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 19:59:54,684 WARNING [train.py:1204] (2/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_680-189165-0_sp1.1 from training. Duration: 0.936375 2023-10-03 19:59:57,650 WARNING [train.py:1204] (2/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_53-300544-0_sp1.1 from training. Duration: 0.6090625 2023-10-03 20:00:00,371 WARNING [train.py:1204] (2/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_540-275685-0_sp1.1 from training. Duration: 0.8090625 2023-10-03 20:00:00,395 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_38965_210_4852_1_1533174906_3484735_453-106579-0_sp1.1 from training. Duration: 0.92725 2023-10-03 20:00:01,753 WARNING [train.py:1204] (2/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_105-328174-0 from training. Duration: 0.76 2023-10-03 20:00:01,869 WARNING [train.py:1204] (2/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_279-126205-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 20:00:02,707 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.1.encoder.layers.0.self_attn2.whiten, num_groups=1, num_channels=256, metric=11.74 vs. limit=22.5 2023-10-03 20:00:04,250 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.2.encoder.layers.1.nonlin_attention.whiten1, num_groups=1, num_channels=288, metric=5.41 vs. limit=10.0 2023-10-03 20:00:04,794 WARNING [train.py:1204] (2/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_5-55893-0_sp0.9 from training. Duration: 0.611125 2023-10-03 20:00:06,551 WARNING [train.py:1204] (2/4) Exclude cut with ID _13678_210_7497_1_1530530069226_3463170_141-10835-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 20:00:07,948 WARNING [train.py:1204] (2/4) Exclude cut with ID _22083_210_8254_1_1531702862439_3678970_201-85738-0_sp1.1 from training. Duration: 0.8181875 2023-10-03 20:00:09,252 WARNING [train.py:1204] (2/4) Exclude cut with ID _4697_210_2069_1_1526815801953_3823098_51-1049-0 from training. Duration: 0.75 2023-10-03 20:00:10,575 WARNING [train.py:1204] (2/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_206-10097-0_sp0.9 from training. Duration: 0.5444375 2023-10-03 20:00:10,675 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_4528_210_3273_1_1527076654_3276777_360-63661-0_sp1.1 from training. Duration: 0.92725 2023-10-03 20:00:10,682 WARNING [train.py:1204] (2/4) Exclude cut with ID _64086_210_7994_1_1535270417019_7435949_449-31567-0_sp1.1 from training. Duration: 0.8818125 2023-10-03 20:00:14,714 WARNING [train.py:1204] (2/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_245-174553-0_sp0.9 from training. Duration: 0.97775 2023-10-03 20:00:16,083 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6545_210_2040_1_1527774670_4245258_379-14182-0_sp1.1 from training. Duration: 0.72725 2023-10-03 20:00:20,916 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_3133_210_2087_1_1526106446_2324690_256-206070-0_sp1.1 from training. Duration: 0.963625 2023-10-03 20:00:24,578 WARNING [train.py:1204] (2/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_278-104676-0 from training. Duration: 0.98 2023-10-03 20:00:25,901 WARNING [train.py:1204] (2/4) Exclude cut with ID IC0671W0065-331908 from training. Duration: 0.896 2023-10-03 20:00:27,214 WARNING [train.py:1204] (2/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_244-129917-0_sp1.1 from training. Duration: 0.97275 2023-10-03 20:00:28,660 WARNING [train.py:1204] (2/4) Exclude cut with ID _7852_210_5219_1_1528286471316_8139400_1129-170857-0_sp1.1 from training. Duration: 0.97275 2023-10-03 20:00:31,272 WARNING [train.py:1204] (2/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_443-30034-0_sp0.9 from training. Duration: 0.711125 2023-10-03 20:00:31,324 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_31583_210_3852_1_1532595851_4024413_250-332999-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 20:00:34,012 WARNING [train.py:1204] (2/4) Exclude cut with ID _8772_210_1850_1_1528635595972_3664011_247-336448-0 from training. Duration: 0.74 2023-10-03 20:00:35,337 WARNING [train.py:1204] (2/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_567-135114-0_sp0.9 from training. Duration: 0.9666875 2023-10-03 20:00:35,341 WARNING [train.py:1204] (2/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_557-349534-0_sp1.1 from training. Duration: 0.82725 2023-10-03 20:00:35,353 WARNING [train.py:1204] (2/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_1341-26728-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 20:00:35,409 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_40222_210_6228_1_1533385298_4481939_375-328051-0_sp1.1 from training. Duration: 0.97275 2023-10-03 20:00:35,437 WARNING [train.py:1204] (2/4) Exclude cut with ID _4697_210_2069_1_1526815801953_3823098_276-96593-0 from training. Duration: 0.77 2023-10-03 20:00:41,400 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_53071_210_16554_1_1534490405_2954066_265-170472-0_sp0.9 from training. Duration: 0.92225 2023-10-03 20:00:41,418 WARNING [train.py:1204] (2/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_365-1492-0 from training. Duration: 0.91 2023-10-03 20:00:44,128 WARNING [train.py:1204] (2/4) Exclude cut with ID _66873_210_5517_1_1535454136506_4293370_654-52713-0_sp0.9 from training. Duration: 0.8555625 2023-10-03 20:00:46,972 WARNING [train.py:1204] (2/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_113-92608-0_sp1.1 from training. Duration: 0.4909375 2023-10-03 20:00:50,285 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_46013_210_7234_1_1533776362_3744032_50-141535-0 from training. Duration: 0.99 2023-10-03 20:00:50,301 WARNING [train.py:1204] (2/4) Exclude cut with ID _13942_210_9988_1_1530604906036_3733360_499-55676-0_sp1.1 from training. Duration: 0.563625 2023-10-03 20:00:51,686 WARNING [train.py:1204] (2/4) Exclude cut with ID _30800_210_22382_1_1532499347904_4515959_375-65143-0_sp1.1 from training. Duration: 0.97275 2023-10-03 20:00:53,507 WARNING [train.py:1204] (2/4) Exclude cut with ID _16756_210_4915_1_1531047803763_7104259_199-325608-0_sp1.1 from training. Duration: 0.92725 2023-10-03 20:00:55,336 WARNING [train.py:1204] (2/4) Exclude cut with ID _31649_210_6228_1_1532584608063_4091078_443-124391-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 20:00:55,427 WARNING [train.py:1204] (2/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_146-218763-0 from training. Duration: 0.86 2023-10-03 20:00:55,461 WARNING [train.py:1204] (2/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_567-135114-0_sp1.1 from training. Duration: 0.7909375 2023-10-03 20:00:55,500 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_26326_210_7925_1_1533002493_3592406_462-288299-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 20:00:58,182 WARNING [train.py:1204] (2/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_322-217220-0 from training. Duration: 0.86 2023-10-03 20:00:58,217 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_238-256668-0_sp1.1 from training. Duration: 0.62725 2023-10-03 20:00:58,263 WARNING [train.py:1204] (2/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_371-18169-0 from training. Duration: 0.86 2023-10-03 20:00:59,631 WARNING [train.py:1204] (2/4) Exclude cut with ID _58480_210_12392_1_1534811121111_3646160_23-74512-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 20:00:59,636 WARNING [train.py:1204] (2/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_784-213099-0_sp1.1 from training. Duration: 0.8454375 2023-10-03 20:01:01,104 WARNING [train.py:1204] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_625-102955-0_sp1.1 from training. Duration: 0.7 2023-10-03 20:01:02,410 INFO [train.py:1046] (2/4) Epoch 40, batch 1150, loss[loss=0.1631, simple_loss=0.2383, pruned_loss=0.04393, over 23695.00 frames. ], tot_loss[loss=0.1562, simple_loss=0.2361, pruned_loss=0.03818, over 4708814.04 frames. ], batch size: 232, lr: 2.55e-03, grad_scale: 8.0 2023-10-03 20:01:06,686 WARNING [train.py:1204] (2/4) Exclude cut with ID _49748_210_24116_1_1533986971737_3158910_176-174327-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 20:01:06,974 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.conv_module1.balancer1.max_abs, batch_count=1388826.6666666667, ans=10.0 2023-10-03 20:01:09,429 WARNING [train.py:1204] (2/4) Exclude cut with ID _50310_210_15488_1_1534068043345_3641080_432-96140-0_sp1.1 from training. Duration: 0.936375 2023-10-03 20:01:09,644 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.conv_module2.balancer2.min_positive, batch_count=1388826.6666666667, ans=0.05 2023-10-03 20:01:11,435 WARNING [train.py:1204] (2/4) Exclude cut with ID _40402_210_18219_1_1533522324115_2953150_76-14910-0_sp1.1 from training. Duration: 0.92725 2023-10-03 20:01:11,460 WARNING [train.py:1204] (2/4) Exclude cut with ID _21783_210_11357_1_1531742458730_3762679_40-19458-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 20:01:11,485 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_14056_210_4163_1_1530604483_7572827_46-93547-0 from training. Duration: 0.95 2023-10-03 20:01:12,773 WARNING [train.py:1204] (2/4) Exclude cut with ID _21631_210_2253_1_1531907880778_3160339_321-35325-0_sp1.1 from training. Duration: 0.963625 2023-10-03 20:01:14,163 WARNING [train.py:1204] (2/4) Exclude cut with ID _4779_210_2084_1_1527318022944_3914893_500-285013-0 from training. Duration: 0.69 2023-10-03 20:01:15,547 WARNING [train.py:1204] (2/4) Exclude cut with ID _43552_210_8913_1_1533726031059_3523429_301-93324-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 20:01:15,554 WARNING [train.py:1204] (2/4) Exclude cut with ID _6473_210_1741_1_1527901307396_3815380_108-17335-0_sp1.1 from training. Duration: 0.5909375 2023-10-03 20:01:20,441 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.feed_forward3.hidden_balancer.prob, batch_count=1388893.3333333333, ans=0.125 2023-10-03 20:01:21,597 WARNING [train.py:1204] (2/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_206-10097-0 from training. Duration: 0.49 2023-10-03 20:01:24,996 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_36756_210_7234_1_1533026991_966276_103-77513-0_sp1.1 from training. Duration: 0.97275 2023-10-03 20:01:29,612 WARNING [train.py:1204] (2/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_662-18193-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 20:01:30,910 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_26433_210_12108_1_1532157158_4056480_439-52438-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 20:01:30,970 WARNING [train.py:1204] (2/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_464-33931-0 from training. Duration: 0.54 2023-10-03 20:01:30,976 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_3474_210_2028_1_1526188513_4825246_145-309131-0_sp0.9 from training. Duration: 0.82225 2023-10-03 20:01:30,987 WARNING [train.py:1204] (2/4) Exclude cut with ID _9679_210_7497_1_1529129212906_4363929_446-308321-0_sp1.1 from training. Duration: 0.963625 2023-10-03 20:01:35,048 WARNING [train.py:1204] (2/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_662-275621-0 from training. Duration: 0.42 2023-10-03 20:01:35,322 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.bypass.skip_rate, batch_count=1388960.0, ans=0.09899494936611666 2023-10-03 20:01:36,385 WARNING [train.py:1204] (2/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_344-337148-0_sp1.1 from training. Duration: 0.97275 2023-10-03 20:01:36,518 WARNING [train.py:1204] (2/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_759-179071-0_sp1.1 from training. Duration: 0.92725 2023-10-03 20:01:39,028 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.652e+02 2.015e+02 2.225e+02 2.485e+02 5.014e+02, threshold=4.451e+02, percent-clipped=1.0 2023-10-03 20:01:45,300 WARNING [train.py:1204] (2/4) Exclude cut with ID _28783_210_7943_1_1532671164808_7155479_293-12188-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 20:01:46,004 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.2.encoder.layers.0.self_attn1.whiten, num_groups=1, num_channels=384, metric=17.31 vs. limit=22.5 2023-10-03 20:01:51,242 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_17613_210_2151_1_1531137569_2366160_235-320304-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 20:01:51,264 WARNING [train.py:1204] (2/4) Exclude cut with ID _14349_210_8341_1_1530615710818_3442470_170-336541-0 from training. Duration: 0.86 2023-10-03 20:01:52,584 WARNING [train.py:1204] (2/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_375-218677-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 20:01:52,637 WARNING [train.py:1204] (2/4) Exclude cut with ID _28317_210_12742_1_1532480302381_8510325_829-202956-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 20:01:57,338 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9029W0436-509823 from training. Duration: 0.768 2023-10-03 20:01:59,528 WARNING [train.py:1204] (2/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_231-112214-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 20:02:04,953 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9059W0470-524883 from training. Duration: 0.3415625 2023-10-03 20:02:09,316 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_43266_210_1794_1_1533521789_4497218_206-79512-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 20:02:09,401 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_294-163793-0_sp0.9 from training. Duration: 0.911125 2023-10-03 20:02:10,644 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6290_210_3273_1_1527595224_1282787_115-87095-0_sp1.1 from training. Duration: 0.5 2023-10-03 20:02:10,697 WARNING [train.py:1204] (2/4) Exclude cut with ID _14100_210_8341_1_1530529320613_3678069_279-55760-0_sp1.1 from training. Duration: 0.8454375 2023-10-03 20:02:13,861 WARNING [train.py:1204] (2/4) Exclude cut with ID _14621_210_1925_1_1530686901185_3675420_419-234200-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 20:02:16,506 INFO [train.py:1046] (2/4) Epoch 40, batch 1200, loss[loss=0.1524, simple_loss=0.2455, pruned_loss=0.02962, over 24326.00 frames. ], tot_loss[loss=0.1572, simple_loss=0.2371, pruned_loss=0.0387, over 4709860.18 frames. ], batch size: 74, lr: 2.55e-03, grad_scale: 16.0 2023-10-03 20:02:18,068 WARNING [train.py:1204] (2/4) Exclude cut with ID _60229_210_27767_1_1534838406158_3655350_353-213656-0_sp1.1 from training. Duration: 0.863625 2023-10-03 20:02:18,085 WARNING [train.py:1204] (2/4) Exclude cut with ID _48678_210_5663_1_1533988830947_3769199_306-255886-0_sp1.1 from training. Duration: 0.7 2023-10-03 20:02:21,321 WARNING [train.py:1204] (2/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_466-3492-0_sp1.1 from training. Duration: 0.97275 2023-10-03 20:02:21,322 WARNING [train.py:1204] (2/4) Exclude cut with ID _8755_210_1876_1_1528628559357_3258209_199-242167-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 20:02:22,640 WARNING [train.py:1204] (2/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_237-89684-0_sp1.1 from training. Duration: 0.87275 2023-10-03 20:02:24,575 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.0.bypass.skip_rate, batch_count=1389160.0, ans=0.035 2023-10-03 20:02:25,699 WARNING [train.py:1204] (2/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_70-84121-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 20:02:27,112 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7544_210_6753_1_1528587722_3576686_172-171750-0_sp1.1 from training. Duration: 0.5909375 2023-10-03 20:02:28,551 WARNING [train.py:1204] (2/4) Exclude cut with ID _24271_210_12658_1_1531987270022_7190519_40-321822-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 20:02:28,568 WARNING [train.py:1204] (2/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_227-83612-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 20:02:31,872 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9156W0015-572508 from training. Duration: 0.896 2023-10-03 20:02:32,146 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.conv_skip_rate, batch_count=1389226.6666666667, ans=0.0 2023-10-03 20:02:34,622 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_292-265928-0 from training. Duration: 0.51 2023-10-03 20:02:36,169 WARNING [train.py:1204] (2/4) Exclude cut with ID _14747_210_8341_1_1530685797463_3654089_147-131229-0_sp1.1 from training. Duration: 0.6909375 2023-10-03 20:02:37,573 WARNING [train.py:1204] (2/4) Exclude cut with ID _45297_210_2721_1_1533715351381_3264949_351-147136-0_sp0.9 from training. Duration: 0.9666875 2023-10-03 20:02:40,407 WARNING [train.py:1204] (2/4) Exclude cut with ID _5250_210_5219_1_1527249561806_8692110_1102-324568-0_sp1.1 from training. Duration: 0.97275 2023-10-03 20:02:41,811 WARNING [train.py:1204] (2/4) Exclude cut with ID _18516_210_9206_1_1531704653206_3692406_465-75446-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 20:02:41,812 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9203W0126-595623 from training. Duration: 0.896 2023-10-03 20:02:43,200 WARNING [train.py:1204] (2/4) Exclude cut with ID _55383_210_10635_1_1534468426172_2231669_139-328410-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 20:02:49,496 WARNING [train.py:1204] (2/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_166-246557-0_sp0.9 from training. Duration: 0.711125 2023-10-03 20:02:49,503 WARNING [train.py:1204] (2/4) Exclude cut with ID _30039_210_13270_1_1532484612900_2725150_239-304872-0_sp1.1 from training. Duration: 0.963625 2023-10-03 20:02:49,527 WARNING [train.py:1204] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_6-226750-0 from training. Duration: 0.94 2023-10-03 20:02:50,777 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_135-5501-0_sp1.1 from training. Duration: 0.8909375 2023-10-03 20:02:53,118 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.balancer2.prob, batch_count=1389293.3333333333, ans=0.125 2023-10-03 20:02:54,846 WARNING [train.py:1204] (2/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_359-118926-0 from training. Duration: 0.72 2023-10-03 20:02:55,040 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.balancer2.prob, batch_count=1389293.3333333333, ans=0.125 2023-10-03 20:02:57,948 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.conv_skip_rate, batch_count=1389293.3333333333, ans=0.0 2023-10-03 20:03:00,855 WARNING [train.py:1204] (2/4) Exclude cut with ID _8064_210_5399_1_1528372299291_4282973_4-91037-0 from training. Duration: 0.88 2023-10-03 20:03:00,890 WARNING [train.py:1204] (2/4) Exclude cut with ID _6653_210_2629_1_1527919172441_6732679_415-288492-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 20:03:02,264 WARNING [train.py:1204] (2/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_544-287179-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 20:03:03,568 WARNING [train.py:1204] (2/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_122-32606-0_sp1.1 from training. Duration: 0.92725 2023-10-03 20:03:04,849 WARNING [train.py:1204] (2/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_121-284152-0_sp1.1 from training. Duration: 0.536375 2023-10-03 20:03:06,232 WARNING [train.py:1204] (2/4) Exclude cut with ID _44718_210_5816_1_1534050051869_6469030_350-299477-0_sp1.1 from training. Duration: 0.97275 2023-10-03 20:03:06,250 WARNING [train.py:1204] (2/4) Exclude cut with ID _6653_210_2629_1_1527919172441_6732679_1103-56445-0_sp0.9 from training. Duration: 0.811125 2023-10-03 20:03:06,306 WARNING [train.py:1204] (2/4) Exclude cut with ID _12369_210_4381_1_1530356078581_4388520_64-298917-0_sp0.9 from training. Duration: 0.97775 2023-10-03 20:03:07,621 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_2245_210_2715_1_1525511548_3540254_310-52483-0 from training. Duration: 0.88 2023-10-03 20:03:07,682 WARNING [train.py:1204] (2/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_401-158239-0_sp1.1 from training. Duration: 0.6454375 2023-10-03 20:03:09,027 WARNING [train.py:1204] (2/4) Exclude cut with ID _59502_210_12210_1_1535107024698_1835139_146-162495-0_sp1.1 from training. Duration: 0.836375 2023-10-03 20:03:09,045 WARNING [train.py:1204] (2/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_41-31157-0_sp0.9 from training. Duration: 0.6333125 2023-10-03 20:03:11,781 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_68717_210_3528_1_1535628683_1556759_95-36722-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 20:03:11,796 WARNING [train.py:1204] (2/4) Exclude cut with ID _30196_210_1664_1_1533708496522_7074140_654-162611-0_sp1.1 from training. Duration: 0.92725 2023-10-03 20:03:12,003 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.feed_forward3.hidden_balancer.prob, batch_count=1389360.0, ans=0.125 2023-10-03 20:03:12,308 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.5.encoder.layers.1.feed_forward1.out_whiten, num_groups=1, num_channels=256, metric=7.50 vs. limit=15.0 2023-10-03 20:03:16,554 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_9302_210_2550_1_1528889962_4655416_638-301620-0_sp0.9 from training. Duration: 0.52225 2023-10-03 20:03:17,978 WARNING [train.py:1204] (2/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_478-162247-0_sp1.1 from training. Duration: 0.7454375 2023-10-03 20:03:20,756 WARNING [train.py:1204] (2/4) Exclude cut with ID _45297_210_2721_1_1533715351381_3264949_353-9541-0 from training. Duration: 0.97 2023-10-03 20:03:25,508 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9653W0190-675468 from training. Duration: 0.9935 2023-10-03 20:03:26,400 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.1.encoder.layers.0.feed_forward2.out_whiten, num_groups=1, num_channels=256, metric=4.13 vs. limit=15.0 2023-10-03 20:03:26,857 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_49730_210_13049_1_1534075294_1830676_214-84436-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 20:03:28,899 WARNING [train.py:1204] (2/4) Exclude cut with ID _50093_210_4220_1_1534417141207_3432299_259-322252-0_sp1.1 from training. Duration: 0.836375 2023-10-03 20:03:30,385 WARNING [train.py:1204] (2/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_599-50220-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 20:03:31,677 INFO [train.py:1046] (2/4) Epoch 40, batch 1250, loss[loss=0.1574, simple_loss=0.2401, pruned_loss=0.03733, over 23405.00 frames. ], tot_loss[loss=0.1585, simple_loss=0.2385, pruned_loss=0.03923, over 4706789.49 frames. ], batch size: 93, lr: 2.55e-03, grad_scale: 8.0 2023-10-03 20:03:31,804 WARNING [train.py:1204] (2/4) Exclude cut with ID _21878_210_3525_1_1531738772002_7264330_174-109016-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 20:03:35,104 WARNING [train.py:1204] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_665-196155-0 from training. Duration: 0.67 2023-10-03 20:03:37,855 WARNING [train.py:1204] (2/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_656-188361-0_sp1.1 from training. Duration: 0.7909375 2023-10-03 20:03:39,345 WARNING [train.py:1204] (2/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_60-18123-0_sp1.1 from training. Duration: 0.8 2023-10-03 20:03:40,663 WARNING [train.py:1204] (2/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_535-14410-0 from training. Duration: 0.47 2023-10-03 20:03:42,180 WARNING [train.py:1204] (2/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_94-126128-0_sp1.1 from training. Duration: 0.7818125 2023-10-03 20:03:42,257 WARNING [train.py:1204] (2/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_356-31216-0_sp1.1 from training. Duration: 0.6818125 2023-10-03 20:03:48,193 WARNING [train.py:1204] (2/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_443-30034-0_sp1.1 from training. Duration: 0.5818125 2023-10-03 20:03:48,275 WARNING [train.py:1204] (2/4) Exclude cut with ID _26907_210_7234_1_1532221740694_3205263_337-2494-0_sp1.1 from training. Duration: 0.8 2023-10-03 20:03:49,538 WARNING [train.py:1204] (2/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_49-97158-0_sp0.9 from training. Duration: 0.8444375 2023-10-03 20:03:49,556 WARNING [train.py:1204] (2/4) Exclude cut with ID _7877_210_6014_1_1528286358099_3750136_553-170128-0_sp1.1 from training. Duration: 0.963625 2023-10-03 20:03:52,352 WARNING [train.py:1204] (2/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_202-293523-0_sp0.9 from training. Duration: 0.8 2023-10-03 20:03:55,234 WARNING [train.py:1204] (2/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_188-171673-0_sp0.9 from training. Duration: 0.6444375 2023-10-03 20:03:55,249 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_120-41668-0_sp1.1 from training. Duration: 0.636375 2023-10-03 20:03:55,254 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_40223_210_6228_1_1533298404_4812267_613-78769-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 20:03:55,372 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.self_attn_weights.pos_emb_skip_rate, batch_count=1389560.0, ans=0.0 2023-10-03 20:03:57,178 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.balancer2.prob, batch_count=1389560.0, ans=0.125 2023-10-03 20:03:58,382 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_53861_210_5399_1_1534422156_4272077_150-336899-0_sp1.1 from training. Duration: 0.963625 2023-10-03 20:03:58,451 WARNING [train.py:1204] (2/4) Exclude cut with ID _14459_210_2228_1_1531443641946_7516700_1146-56843-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 20:04:01,668 WARNING [train.py:1204] (2/4) Exclude cut with ID _39867_210_9826_1_1533553222030_9344259_1194-299928-0_sp1.1 from training. Duration: 0.92725 2023-10-03 20:04:02,994 WARNING [train.py:1204] (2/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_214-291369-0_sp0.9 from training. Duration: 0.7 2023-10-03 20:04:09,017 WARNING [train.py:1204] (2/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_358-215558-0 from training. Duration: 0.93 2023-10-03 20:04:09,079 WARNING [train.py:1204] (2/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_577-287150-0_sp0.9 from training. Duration: 0.911125 2023-10-03 20:04:10,317 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.574e+02 1.900e+02 2.073e+02 2.356e+02 3.253e+02, threshold=4.146e+02, percent-clipped=0.0 2023-10-03 20:04:13,165 WARNING [train.py:1204] (2/4) Exclude cut with ID _60860_210_8254_1_1534896015875_3557169_229-187367-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 20:04:13,214 WARNING [train.py:1204] (2/4) Exclude cut with ID _6275_210_2084_1_1528110062213_7669049_857-298942-0 from training. Duration: 0.85 2023-10-03 20:04:14,619 WARNING [train.py:1204] (2/4) Exclude cut with ID _40137_210_12210_1_1533290484046_3795180_249-187059-0_sp1.1 from training. Duration: 0.8 2023-10-03 20:04:14,626 WARNING [train.py:1204] (2/4) Exclude cut with ID ID0171W0508-765603 from training. Duration: 0.541 2023-10-03 20:04:14,656 WARNING [train.py:1204] (2/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_521-219439-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 20:04:14,661 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_40223_210_6228_1_1533298404_4812267_741-302616-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 20:04:18,888 WARNING [train.py:1204] (2/4) Exclude cut with ID _65293_210_9070_1_1535545549765_4004210_438-154735-0_sp1.1 from training. Duration: 0.92725 2023-10-03 20:04:22,219 WARNING [train.py:1204] (2/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_564-132950-0_sp1.1 from training. Duration: 0.92725 2023-10-03 20:04:23,456 WARNING [train.py:1204] (2/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_291-270881-0_sp1.1 from training. Duration: 0.8818125 2023-10-03 20:04:23,555 WARNING [train.py:1204] (2/4) Exclude cut with ID _60229_210_27767_1_1534838406158_3655350_353-213656-0 from training. Duration: 0.95 2023-10-03 20:04:23,572 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_3133_210_2087_1_1526106446_2324690_243-143119-0 from training. Duration: 0.77 2023-10-03 20:04:24,880 WARNING [train.py:1204] (2/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_690-126306-0 from training. Duration: 0.78 2023-10-03 20:04:27,068 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.3.encoder.layers.1.conv_module1.whiten, num_groups=1, num_channels=512, metric=3.48 vs. limit=15.0 2023-10-03 20:04:27,811 WARNING [train.py:1204] (2/4) Exclude cut with ID _30304_210_6319_1_1532476827261_7582060_347-95003-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 20:04:29,127 WARNING [train.py:1204] (2/4) Exclude cut with ID _2322_210_2667_1_1525604548971_3656299_440-80494-0 from training. Duration: 0.88 2023-10-03 20:04:29,153 WARNING [train.py:1204] (2/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_370-229434-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 20:04:31,256 WARNING [train.py:1204] (2/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_124-345840-0_sp1.1 from training. Duration: 0.47275 2023-10-03 20:04:32,604 WARNING [train.py:1204] (2/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_165-123581-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 20:04:34,020 WARNING [train.py:1204] (2/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_453-282588-0 from training. Duration: 0.98 2023-10-03 20:04:34,046 WARNING [train.py:1204] (2/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_299-34413-0_sp0.9 from training. Duration: 0.72225 2023-10-03 20:04:34,091 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_40036_210_13405_1_1533648647_1315388_48-146016-0_sp1.1 from training. Duration: 0.8454375 2023-10-03 20:04:34,119 WARNING [train.py:1204] (2/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_99-167392-0_sp0.9 from training. Duration: 0.67775 2023-10-03 20:04:35,453 WARNING [train.py:1204] (2/4) Exclude cut with ID _48677_210_5663_1_1533902405919_3892989_37-207114-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 20:04:38,117 WARNING [train.py:1204] (2/4) Exclude cut with ID _29857_210_20034_1_1532395886753_3798160_335-163562-0 from training. Duration: 0.85 2023-10-03 20:04:38,302 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.1.bypass_mid.scale_min, batch_count=1389760.0, ans=0.2 2023-10-03 20:04:40,143 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_36492_210_7927_1_1533261987_4790834_703-343351-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 20:04:41,594 WARNING [train.py:1204] (2/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_80-147301-0_sp0.9 from training. Duration: 0.9333125 2023-10-03 20:04:43,030 WARNING [train.py:1204] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_218-295097-0_sp0.9 from training. Duration: 0.8555625 2023-10-03 20:04:45,733 INFO [train.py:1046] (2/4) Epoch 40, batch 1300, loss[loss=0.1653, simple_loss=0.2544, pruned_loss=0.03814, over 24284.00 frames. ], tot_loss[loss=0.1578, simple_loss=0.238, pruned_loss=0.03878, over 4729411.16 frames. ], batch size: 74, lr: 2.55e-03, grad_scale: 8.0 2023-10-03 20:04:45,862 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_2295_210_2087_1_1525500993_2407863_116-238894-0_sp1.1 from training. Duration: 0.636375 2023-10-03 20:04:47,294 WARNING [train.py:1204] (2/4) Exclude cut with ID _3231_210_2106_1_1526205864934_6784220_697-171068-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 20:04:49,211 WARNING [train.py:1204] (2/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_102-77064-0 from training. Duration: 0.93 2023-10-03 20:04:49,435 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.self_attn_weights.pos_emb_skip_rate, batch_count=1389826.6666666667, ans=0.0 2023-10-03 20:04:54,557 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_13091_210_7925_1_1530609447_8987354_1186-261957-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 20:04:55,969 WARNING [train.py:1204] (2/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_91-277684-0_sp1.1 from training. Duration: 0.67275 2023-10-03 20:04:57,364 WARNING [train.py:1204] (2/4) Exclude cut with ID _31088_210_16446_1_1532520012284_3440919_159-313751-0_sp1.1 from training. Duration: 0.936375 2023-10-03 20:04:58,818 WARNING [train.py:1204] (2/4) Exclude cut with ID _42326_210_7770_1_1533814282531_3707269_227-203721-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 20:05:00,136 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_3531_210_3528_1_1526294203_5216456_309-227302-0_sp1.1 from training. Duration: 0.62725 2023-10-03 20:05:00,203 WARNING [train.py:1204] (2/4) Exclude cut with ID _14087_210_2253_1_1530529224512_3014029_226-242140-0 from training. Duration: 0.81 2023-10-03 20:05:06,654 WARNING [train.py:1204] (2/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_122-109023-0_sp1.1 from training. Duration: 0.6909375 2023-10-03 20:05:06,741 WARNING [train.py:1204] (2/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_660-211088-0_sp0.9 from training. Duration: 0.688875 2023-10-03 20:05:07,204 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.4.encoder.layers.2.nonlin_attention.whiten2, num_groups=1, num_channels=384, metric=3.67 vs. limit=15.0 2023-10-03 20:05:09,408 WARNING [train.py:1204] (2/4) Exclude cut with ID _39290_210_4220_1_1533862778760_3075118_92-311600-0 from training. Duration: 0.88 2023-10-03 20:05:12,176 WARNING [train.py:1204] (2/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_390-339137-0_sp1.1 from training. Duration: 0.5090625 2023-10-03 20:05:14,224 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.feed_forward3.hidden_balancer.prob, batch_count=1389960.0, ans=0.125 2023-10-03 20:05:16,664 WARNING [train.py:1204] (2/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_449-341946-0_sp1.1 from training. Duration: 0.963625 2023-10-03 20:05:16,736 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_68095_210_20185_1_1535718226_8888855_884-327007-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 20:05:16,873 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.1.feed_forward1.out_proj.dropout_p, batch_count=1389960.0, ans=0.1 2023-10-03 20:05:18,124 WARNING [train.py:1204] (2/4) Exclude cut with ID _42326_210_7770_1_1533814282531_3707269_308-222998-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 20:05:18,362 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.conv_module2.balancer1.min_positive, batch_count=1389960.0, ans=0.025 2023-10-03 20:05:18,754 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.3.encoder.layers.2.conv_module2.whiten, num_groups=1, num_channels=512, metric=11.11 vs. limit=15.0 2023-10-03 20:05:19,541 WARNING [train.py:1204] (2/4) Exclude cut with ID _40399_210_4353_1_1533305658194_3279406_344-238708-0_sp1.1 from training. Duration: 0.963625 2023-10-03 20:05:19,612 WARNING [train.py:1204] (2/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_87-131427-0_sp1.1 from training. Duration: 0.7090625 2023-10-03 20:05:21,562 WARNING [train.py:1204] (2/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_110-130961-0_sp0.9 from training. Duration: 0.52225 2023-10-03 20:05:22,848 WARNING [train.py:1204] (2/4) Exclude cut with ID _55153_210_2253_1_1534384786677_3162096_148-79022-0 from training. Duration: 0.96 2023-10-03 20:05:24,376 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.feed_forward2.hidden_balancer.prob, batch_count=1389960.0, ans=0.125 2023-10-03 20:05:27,008 WARNING [train.py:1204] (2/4) Exclude cut with ID _9679_210_7497_1_1529129212906_4363929_512-313889-0_sp1.1 from training. Duration: 0.736375 2023-10-03 20:05:27,027 WARNING [train.py:1204] (2/4) Exclude cut with ID _7970_210_1883_1_1528455616296_3472230_25-178407-0_sp1.1 from training. Duration: 0.6454375 2023-10-03 20:05:29,785 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_3133_210_2087_1_1526106446_2324690_246-125000-0 from training. Duration: 0.62 2023-10-03 20:05:29,850 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_15126_210_6753_1_1530778677_2591185_113-157651-0_sp0.9 from training. Duration: 0.7555625 2023-10-03 20:05:31,851 WARNING [train.py:1204] (2/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_105-63309-0_sp1.1 from training. Duration: 0.8909375 2023-10-03 20:05:32,046 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.feed_forward1.hidden_balancer.prob, batch_count=1390026.6666666667, ans=0.125 2023-10-03 20:05:34,503 WARNING [train.py:1204] (2/4) Exclude cut with ID _29877_210_6228_1_1532397791106_5276149_578-177271-0_sp1.1 from training. Duration: 0.936375 2023-10-03 20:05:34,553 WARNING [train.py:1204] (2/4) Exclude cut with ID _44831_210_13427_1_1533816011549_3689035_32-188147-0 from training. Duration: 0.96 2023-10-03 20:05:35,924 WARNING [train.py:1204] (2/4) Exclude cut with ID _12369_210_4381_1_1530356078581_4388520_300-254337-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 20:05:35,937 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_128-241008-0 from training. Duration: 0.94 2023-10-03 20:05:37,333 WARNING [train.py:1204] (2/4) Exclude cut with ID _21878_210_3525_1_1531738772002_7264330_53-279178-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 20:05:41,274 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_41452_210_7291_1_1533437687_3941281_273-334088-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 20:05:41,278 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_128-241008-0_sp1.1 from training. Duration: 0.8545625 2023-10-03 20:05:42,782 WARNING [train.py:1204] (2/4) Exclude cut with ID _8394_210_6702_1_1528590504316_3540189_379-49457-0 from training. Duration: 0.87 2023-10-03 20:05:44,704 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_105-114797-0 from training. Duration: 0.84 2023-10-03 20:05:44,789 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_14928_210_5632_1_1530773461_4714000_454-112198-0 from training. Duration: 0.66 2023-10-03 20:05:50,066 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_456-22141-0_sp1.1 from training. Duration: 0.8 2023-10-03 20:05:53,264 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_115-233929-0 from training. Duration: 0.72 2023-10-03 20:05:54,653 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_616-16795-0_sp1.1 from training. Duration: 0.963625 2023-10-03 20:05:59,083 INFO [train.py:1046] (2/4) Epoch 40, batch 1350, loss[loss=0.1523, simple_loss=0.2331, pruned_loss=0.03579, over 24485.00 frames. ], tot_loss[loss=0.1571, simple_loss=0.2365, pruned_loss=0.03882, over 4718889.11 frames. ], batch size: 63, lr: 2.55e-03, grad_scale: 8.0 2023-10-03 20:06:01,226 WARNING [train.py:1204] (2/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_448-212135-0 from training. Duration: 0.91 2023-10-03 20:06:04,331 WARNING [train.py:1204] (2/4) Exclude cut with ID _46683_210_9642_1_1533730913492_4591116_201-205677-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 20:06:05,756 WARNING [train.py:1204] (2/4) Exclude cut with ID _64086_210_7994_1_1535270417019_7435949_43-80483-0_sp1.1 from training. Duration: 0.97275 2023-10-03 20:06:07,370 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_240-134124-0_sp1.1 from training. Duration: 0.963625 2023-10-03 20:06:08,699 WARNING [train.py:1204] (2/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_907-120940-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 20:06:10,129 WARNING [train.py:1204] (2/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_848-280957-0_sp1.1 from training. Duration: 0.7909375 2023-10-03 20:06:11,397 WARNING [train.py:1204] (2/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_243-42116-0_sp0.9 from training. Duration: 0.811125 2023-10-03 20:06:16,072 WARNING [train.py:1204] (2/4) Exclude cut with ID _6694_210_4599_1_1527852637490_3965529_116-109398-0_sp0.9 from training. Duration: 0.811125 2023-10-03 20:06:17,511 WARNING [train.py:1204] (2/4) Exclude cut with ID _24314_210_4915_1_1532135078116_7170179_521-258868-0 from training. Duration: 0.79 2023-10-03 20:06:17,619 WARNING [train.py:1204] (2/4) Exclude cut with ID _2220_210_1819_1_1525509109898_3658680_50-99837-0_sp0.9 from training. Duration: 0.6 2023-10-03 20:06:17,660 WARNING [train.py:1204] (2/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_313-134930-0_sp1.1 from training. Duration: 0.8818125 2023-10-03 20:06:20,304 WARNING [train.py:1204] (2/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_125-144174-0 from training. Duration: 0.39 2023-10-03 20:06:22,278 WARNING [train.py:1204] (2/4) Exclude cut with ID _59288_210_5854_1_1535077757774_3412246_275-293897-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 20:06:23,604 WARNING [train.py:1204] (2/4) Exclude cut with ID _40519_210_13794_1_1533715228655_7337390_722-52880-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 20:06:23,618 WARNING [train.py:1204] (2/4) Exclude cut with ID _7537_210_2084_1_1528502395431_3924359_128-268029-0 from training. Duration: 0.98 2023-10-03 20:06:26,277 WARNING [train.py:1204] (2/4) Exclude cut with ID _8021_210_7994_1_1528376451358_3653069_179-45206-0 from training. Duration: 0.86 2023-10-03 20:06:26,412 WARNING [train.py:1204] (2/4) Exclude cut with ID _13252_210_1925_1_1530671372047_3336909_88-276668-0 from training. Duration: 0.99 2023-10-03 20:06:27,750 WARNING [train.py:1204] (2/4) Exclude cut with ID _22613_210_5219_1_1532253599686_4090170_290-212862-0_sp1.1 from training. Duration: 0.97275 2023-10-03 20:06:27,754 WARNING [train.py:1204] (2/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_465-214726-0 from training. Duration: 0.86 2023-10-03 20:06:38,028 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.449e+02 1.913e+02 2.157e+02 2.390e+02 3.072e+02, threshold=4.314e+02, percent-clipped=0.0 2023-10-03 20:06:39,558 WARNING [train.py:1204] (2/4) Exclude cut with ID _56480_210_4220_1_1534679942079_3075409_138-36031-0_sp1.1 from training. Duration: 0.97275 2023-10-03 20:06:42,972 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.bypass.skip_rate, batch_count=1390360.0, ans=0.07 2023-10-03 20:06:49,695 WARNING [train.py:1204] (2/4) Exclude cut with ID _58605_210_12210_1_1534723195939_3904259_300-285878-0_sp1.1 from training. Duration: 0.97275 2023-10-03 20:06:49,720 WARNING [train.py:1204] (2/4) Exclude cut with ID _40901_210_5665_1_1533363789958_3856130_114-340229-0_sp1.1 from training. Duration: 0.936375 2023-10-03 20:06:51,078 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_489-230703-0 from training. Duration: 0.85 2023-10-03 20:06:54,411 WARNING [train.py:1204] (2/4) Exclude cut with ID _66873_210_5517_1_1535454136506_4293370_322-81243-0_sp1.1 from training. Duration: 0.936375 2023-10-03 20:06:54,502 WARNING [train.py:1204] (2/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_271-269084-0 from training. Duration: 0.83 2023-10-03 20:06:54,520 WARNING [train.py:1204] (2/4) Exclude cut with ID _13252_210_1925_1_1530671372047_3336909_166-314607-0_sp0.9 from training. Duration: 0.6 2023-10-03 20:06:55,850 WARNING [train.py:1204] (2/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_340-343302-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 20:06:58,529 WARNING [train.py:1204] (2/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_286-112015-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 20:07:01,631 WARNING [train.py:1204] (2/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_328-264776-0 from training. Duration: 0.67 2023-10-03 20:07:04,903 WARNING [train.py:1204] (2/4) Exclude cut with ID _4866_210_2577_1_1527404412284_4364659_256-306830-0_sp0.9 from training. Duration: 0.9666875 2023-10-03 20:07:05,254 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.out_combiner.scale_min, batch_count=1390426.6666666667, ans=0.2 2023-10-03 20:07:07,467 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.1.feed_forward3.out_whiten.whitening_limit, batch_count=1390426.6666666667, ans=15.0 2023-10-03 20:07:07,993 WARNING [train.py:1204] (2/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_239-316802-0 from training. Duration: 0.69 2023-10-03 20:07:10,729 WARNING [train.py:1204] (2/4) Exclude cut with ID _29857_210_20034_1_1532395886753_3798160_279-185801-0 from training. Duration: 0.91 2023-10-03 20:07:13,509 INFO [train.py:1046] (2/4) Epoch 40, batch 1400, loss[loss=0.1455, simple_loss=0.2152, pruned_loss=0.0379, over 23447.00 frames. ], tot_loss[loss=0.1559, simple_loss=0.2351, pruned_loss=0.0384, over 4715466.36 frames. ], batch size: 285, lr: 2.55e-03, grad_scale: 8.0 2023-10-03 20:07:16,760 WARNING [train.py:1204] (2/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_656-188361-0 from training. Duration: 0.87 2023-10-03 20:07:18,285 WARNING [train.py:1204] (2/4) Exclude cut with ID _44718_210_5816_1_1534050051869_6469030_100-230287-0_sp1.1 from training. Duration: 0.936375 2023-10-03 20:07:21,058 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8275_210_4198_1_1528455320_3892571_276-215622-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 20:07:22,385 WARNING [train.py:1204] (2/4) Exclude cut with ID _30304_210_6319_1_1532476827261_7582060_698-309430-0_sp1.1 from training. Duration: 0.963625 2023-10-03 20:07:26,841 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_365-217119-0 from training. Duration: 0.63 2023-10-03 20:07:28,277 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7264_210_4938_1_1527939911_8132095_800-91995-0 from training. Duration: 0.86 2023-10-03 20:07:36,384 WARNING [train.py:1204] (2/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_49-97158-0_sp1.1 from training. Duration: 0.6909375 2023-10-03 20:07:39,073 WARNING [train.py:1204] (2/4) Exclude cut with ID _61132_210_9969_1_1535021792964_3755419_61-57720-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 20:07:40,530 WARNING [train.py:1204] (2/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_345-306832-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 20:07:40,555 WARNING [train.py:1204] (2/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_164-224052-0_sp0.9 from training. Duration: 0.77775 2023-10-03 20:07:44,501 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_34886_210_10633_1_1533203882_1755661_76-156096-0_sp1.1 from training. Duration: 0.7909375 2023-10-03 20:07:46,451 WARNING [train.py:1204] (2/4) Exclude cut with ID 4133-6541-0027-40495-0_sp1.1 from training. Duration: 0.9681875 2023-10-03 20:07:54,129 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_514-211165-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 20:07:55,419 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_41211_210_2544_1_1533348859_3702910_97-212695-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 20:07:56,086 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.3.encoder.layers.1.feed_forward1.out_whiten, num_groups=1, num_channels=512, metric=8.49 vs. limit=15.0 2023-10-03 20:07:57,401 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.3.encoder.layers.3.conv_module2.whiten, num_groups=1, num_channels=512, metric=10.75 vs. limit=15.0 2023-10-03 20:07:59,522 WARNING [train.py:1204] (2/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_406-45086-0 from training. Duration: 0.8 2023-10-03 20:08:00,948 WARNING [train.py:1204] (2/4) Exclude cut with ID _65263_210_22356_1_1535763598691_3921037_49-222917-0_sp1.1 from training. Duration: 0.836375 2023-10-03 20:08:01,002 WARNING [train.py:1204] (2/4) Exclude cut with ID _4866_210_2577_1_1527404412284_4364659_352-157930-0_sp1.1 from training. Duration: 0.736375 2023-10-03 20:08:02,823 WARNING [train.py:1204] (2/4) Exclude cut with ID _4366_210_1388_1_1526792053633_3613819_231-211402-0_sp1.1 from training. Duration: 0.8545625 2023-10-03 20:08:02,876 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_98-21576-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 20:08:04,853 WARNING [train.py:1204] (2/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_348-127789-0_sp0.9 from training. Duration: 0.9444375 2023-10-03 20:08:04,870 WARNING [train.py:1204] (2/4) Exclude cut with ID _45871_210_5665_1_1533864614245_3296060_98-329557-0_sp1.1 from training. Duration: 0.97275 2023-10-03 20:08:04,939 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6846_210_6753_1_1527850767_4295158_2-315073-0_sp1.1 from training. Duration: 0.8 2023-10-03 20:08:06,413 WARNING [train.py:1204] (2/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_145-137790-0 from training. Duration: 0.83 2023-10-03 20:08:06,427 WARNING [train.py:1204] (2/4) Exclude cut with ID _14603_210_4220_1_1530615688441_3097319_133-158308-0_sp0.9 from training. Duration: 0.9333125 2023-10-03 20:08:10,546 WARNING [train.py:1204] (2/4) Exclude cut with ID _14898_210_9206_1_1530766812377_4177190_320-11758-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 20:08:13,411 WARNING [train.py:1204] (2/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_406-45086-0_sp0.9 from training. Duration: 0.888875 2023-10-03 20:08:22,162 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_5438_210_1794_1_1527336687_2600009_104-288274-0 from training. Duration: 0.58 2023-10-03 20:08:22,240 WARNING [train.py:1204] (2/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_108-53929-0_sp0.9 from training. Duration: 0.6555625 2023-10-03 20:08:23,579 WARNING [train.py:1204] (2/4) Exclude cut with ID _30800_210_22382_1_1532499347904_4515959_444-286390-0_sp1.1 from training. Duration: 0.963625 2023-10-03 20:08:26,262 WARNING [train.py:1204] (2/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_125-144174-0_sp0.9 from training. Duration: 0.4333125 2023-10-03 20:08:26,345 WARNING [train.py:1204] (2/4) Exclude cut with ID _55209_210_1531_1_1534675828996_4597160_118-345621-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 20:08:26,610 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=1390826.6666666667, ans=0.1 2023-10-03 20:08:27,594 INFO [train.py:1046] (2/4) Epoch 40, batch 1450, loss[loss=0.1474, simple_loss=0.221, pruned_loss=0.03691, over 23749.00 frames. ], tot_loss[loss=0.1555, simple_loss=0.2349, pruned_loss=0.03804, over 4712988.62 frames. ], batch size: 232, lr: 2.55e-03, grad_scale: 8.0 2023-10-03 20:08:29,073 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_45886_210_17024_1_1533709215_471063_7-79165-0_sp1.1 from training. Duration: 0.8909375 2023-10-03 20:08:31,980 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.conv_skip_rate, batch_count=1390826.6666666667, ans=0.0 2023-10-03 20:08:33,612 WARNING [train.py:1204] (2/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_175-163894-0_sp0.9 from training. Duration: 0.82225 2023-10-03 20:08:35,038 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_50188_210_4198_1_1534503621_3646041_353-81682-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 20:08:35,044 WARNING [train.py:1204] (2/4) Exclude cut with ID _17875_210_2087_1_1531224012907_1821339_175-80565-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 20:08:36,955 WARNING [train.py:1204] (2/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_159-328157-0_sp1.1 from training. Duration: 0.47275 2023-10-03 20:08:41,008 WARNING [train.py:1204] (2/4) Exclude cut with ID _60057_210_10640_1_1534903065793_3333150_137-267579-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 20:08:41,052 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_92-46009-0_sp1.1 from training. Duration: 0.4909375 2023-10-03 20:08:42,430 WARNING [train.py:1204] (2/4) Exclude cut with ID _53203_210_2087_1_1534246218309_475590_48-68507-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 20:08:42,458 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_28211_210_6388_1_1532861506_4523224_128-328162-0 from training. Duration: 0.9 2023-10-03 20:08:43,761 WARNING [train.py:1204] (2/4) Exclude cut with ID _4697_210_2069_1_1526815801953_3823098_51-1049-0_sp1.1 from training. Duration: 0.6818125 2023-10-03 20:08:45,136 WARNING [train.py:1204] (2/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_383-316000-0 from training. Duration: 0.9 2023-10-03 20:08:45,187 WARNING [train.py:1204] (2/4) Exclude cut with ID _4779_210_2084_1_1527318022944_3914893_185-295153-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 20:08:46,517 WARNING [train.py:1204] (2/4) Exclude cut with ID _3231_210_2106_1_1526205864934_6784220_171-256004-0_sp0.9 from training. Duration: 0.9555625 2023-10-03 20:08:46,517 WARNING [train.py:1204] (2/4) Exclude cut with ID _41314_210_12651_1_1533459476543_3766460_87-272068-0 from training. Duration: 0.83 2023-10-03 20:08:47,953 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_164-152777-0_sp1.1 from training. Duration: 0.92725 2023-10-03 20:08:47,987 WARNING [train.py:1204] (2/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_18-130336-0_sp0.9 from training. Duration: 0.87775 2023-10-03 20:08:49,336 WARNING [train.py:1204] (2/4) Exclude cut with ID _14886_210_4220_1_1530702020355_3516190_175-229073-0_sp1.1 from training. Duration: 0.4090625 2023-10-03 20:08:49,355 WARNING [train.py:1204] (2/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_791-45149-0_sp0.9 from training. Duration: 0.9555625 2023-10-03 20:08:50,716 WARNING [train.py:1204] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_468-189058-0_sp1.1 from training. Duration: 0.87275 2023-10-03 20:08:53,823 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8278_210_2550_1_1528637099_4220413_32-142780-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 20:08:55,722 WARNING [train.py:1204] (2/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_146-218763-0_sp0.9 from training. Duration: 0.9555625 2023-10-03 20:08:59,967 WARNING [train.py:1204] (2/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_348-127789-0_sp1.1 from training. Duration: 0.77275 2023-10-03 20:08:59,973 WARNING [train.py:1204] (2/4) Exclude cut with ID _47790_210_16187_1_1533862814066_4207519_288-285445-0_sp1.1 from training. Duration: 0.936375 2023-10-03 20:09:01,411 WARNING [train.py:1204] (2/4) Exclude cut with ID _14603_210_4220_1_1530615688441_3097319_308-181732-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 20:09:01,442 WARNING [train.py:1204] (2/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_895-117428-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 20:09:02,987 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.nonlin_attention.balancer.prob, batch_count=1390960.0, ans=0.125 2023-10-03 20:09:04,195 WARNING [train.py:1204] (2/4) Exclude cut with ID _9235_210_5219_1_1528891154253_8046120_387-159008-0_sp0.9 from training. Duration: 0.9555625 2023-10-03 20:09:04,213 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_37351_210_7925_1_1533012931_3812113_265-85780-0_sp0.9 from training. Duration: 0.97775 2023-10-03 20:09:04,227 WARNING [train.py:1204] (2/4) Exclude cut with ID _40551_210_10635_1_1533776424632_3416179_361-3964-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 20:09:04,484 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.feed_forward2.hidden_balancer.prob, batch_count=1390960.0, ans=0.125 2023-10-03 20:09:05,922 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.610e+02 1.904e+02 2.065e+02 2.334e+02 4.319e+02, threshold=4.131e+02, percent-clipped=1.0 2023-10-03 20:09:05,988 WARNING [train.py:1204] (2/4) Exclude cut with ID _25882_210_3036_1_1532775665815_6479510_485-122742-0_sp1.1 from training. Duration: 0.97275 2023-10-03 20:09:09,070 WARNING [train.py:1204] (2/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_98-51676-0 from training. Duration: 0.73 2023-10-03 20:09:10,591 WARNING [train.py:1204] (2/4) Exclude cut with ID _30548_210_16040_1_1532608097130_3146969_371-280610-0_sp1.1 from training. Duration: 0.92725 2023-10-03 20:09:12,170 WARNING [train.py:1204] (2/4) Exclude cut with ID IC0671W0081-331924 from training. Duration: 0.896 2023-10-03 20:09:14,844 WARNING [train.py:1204] (2/4) Exclude cut with ID _40284_210_2084_1_1533513679814_3704279_380-160775-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 20:09:16,108 WARNING [train.py:1204] (2/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_57-270037-0_sp1.1 from training. Duration: 0.7 2023-10-03 20:09:17,425 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_853-150237-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 20:09:18,729 WARNING [train.py:1204] (2/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_491-261923-0 from training. Duration: 0.98 2023-10-03 20:09:22,115 WARNING [train.py:1204] (2/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_836-244045-0_sp1.1 from training. Duration: 0.97275 2023-10-03 20:09:23,512 WARNING [train.py:1204] (2/4) Exclude cut with ID _14880_210_8895_1_1530680426655_3795259_289-212817-0 from training. Duration: 0.69 2023-10-03 20:09:24,939 WARNING [train.py:1204] (2/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_303-340446-0 from training. Duration: 0.9 2023-10-03 20:09:26,913 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_37152_210_9697_1_1533106573_3844188_418-41736-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 20:09:28,291 WARNING [train.py:1204] (2/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_371-18169-0_sp1.1 from training. Duration: 0.7818125 2023-10-03 20:09:28,337 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_187-219984-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 20:09:29,760 WARNING [train.py:1204] (2/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_63-267585-0 from training. Duration: 0.9 2023-10-03 20:09:32,902 WARNING [train.py:1204] (2/4) Exclude cut with ID _27013_210_1389_1_1532437173240_3252649_222-125573-0 from training. Duration: 0.98 2023-10-03 20:09:34,236 WARNING [train.py:1204] (2/4) Exclude cut with ID _14821_210_8750_1_1530680503991_6829359_189-297451-0 from training. Duration: 0.55 2023-10-03 20:09:36,123 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_557-163391-0_sp1.1 from training. Duration: 0.97275 2023-10-03 20:09:37,464 WARNING [train.py:1204] (2/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_65-88242-0_sp1.1 from training. Duration: 0.5090625 2023-10-03 20:09:41,638 INFO [train.py:1046] (2/4) Epoch 40, batch 1500, loss[loss=0.1294, simple_loss=0.2058, pruned_loss=0.02647, over 21772.00 frames. ], tot_loss[loss=0.1557, simple_loss=0.235, pruned_loss=0.0382, over 4716408.02 frames. ], batch size: 48, lr: 2.55e-03, grad_scale: 8.0 2023-10-03 20:09:47,213 WARNING [train.py:1204] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_218-295097-0 from training. Duration: 0.77 2023-10-03 20:09:47,246 WARNING [train.py:1204] (2/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_686-247600-0_sp1.1 from training. Duration: 0.836375 2023-10-03 20:09:47,250 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_397-147196-0_sp0.9 from training. Duration: 0.888875 2023-10-03 20:09:48,532 WARNING [train.py:1204] (2/4) Exclude cut with ID _53950_210_5816_1_1534417264919_3862951_237-136449-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 20:09:48,592 WARNING [train.py:1204] (2/4) Exclude cut with ID _3120_210_3525_1_1526036143112_4072980_74-178400-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 20:09:49,845 WARNING [train.py:1204] (2/4) Exclude cut with ID _5166_210_2084_1_1527073197160_4087993_309-222545-0_sp0.9 from training. Duration: 0.9333125 2023-10-03 20:09:51,236 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7544_210_6753_1_1528587722_3576686_172-171750-0 from training. Duration: 0.65 2023-10-03 20:09:51,333 WARNING [train.py:1204] (2/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_3-237434-0_sp0.9 from training. Duration: 0.8444375 2023-10-03 20:09:53,091 WARNING [train.py:1204] (2/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_101-293367-0_sp0.9 from training. Duration: 0.7 2023-10-03 20:09:53,106 WARNING [train.py:1204] (2/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_818-213552-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 20:09:53,179 WARNING [train.py:1204] (2/4) Exclude cut with ID _14349_210_8341_1_1530615710818_3442470_115-285975-0_sp1.1 from training. Duration: 0.7818125 2023-10-03 20:09:56,484 WARNING [train.py:1204] (2/4) Exclude cut with ID _30800_210_22382_1_1532499347904_4515959_291-309749-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 20:09:57,867 WARNING [train.py:1204] (2/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_715-3462-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 20:10:03,304 WARNING [train.py:1204] (2/4) Exclude cut with ID _59502_210_12210_1_1535107024698_1835139_13-331827-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 20:10:03,314 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_4528_210_3273_1_1527076654_3276777_342-168068-0 from training. Duration: 0.87 2023-10-03 20:10:05,183 WARNING [train.py:1204] (2/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_215-141059-0_sp0.9 from training. Duration: 0.688875 2023-10-03 20:10:05,210 WARNING [train.py:1204] (2/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_106-286130-0_sp1.1 from training. Duration: 0.8909375 2023-10-03 20:10:07,095 WARNING [train.py:1204] (2/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_480-77655-0_sp1.1 from training. Duration: 0.97275 2023-10-03 20:10:09,833 WARNING [train.py:1204] (2/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_848-280957-0 from training. Duration: 0.87 2023-10-03 20:10:12,930 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.ff2_skip_rate, batch_count=1391293.3333333333, ans=0.0 2023-10-03 20:10:14,078 WARNING [train.py:1204] (2/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_122-109023-0 from training. Duration: 0.76 2023-10-03 20:10:15,498 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_11305_210_1794_1_1529750428_7178148_645-233480-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 20:10:15,684 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.conv_skip_rate, batch_count=1391293.3333333333, ans=0.0 2023-10-03 20:10:16,861 WARNING [train.py:1204] (2/4) Exclude cut with ID _7562_210_2084_1_1528887767916_3946229_345-169156-0 from training. Duration: 0.58 2023-10-03 20:10:18,351 WARNING [train.py:1204] (2/4) Exclude cut with ID _7562_210_2084_1_1528887767916_3946229_345-169156-0_sp1.1 from training. Duration: 0.52725 2023-10-03 20:10:21,001 WARNING [train.py:1204] (2/4) Exclude cut with ID _13253_210_1925_1_1530757775819_3577873_253-246386-0_sp1.1 from training. Duration: 0.8181875 2023-10-03 20:10:22,433 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_136-264444-0_sp1.1 from training. Duration: 0.97275 2023-10-03 20:10:22,447 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_2168_210_2290_1_1525606920_1849400_136-222991-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 20:10:23,822 WARNING [train.py:1204] (2/4) Exclude cut with ID _7852_210_5219_1_1528286471316_8139400_242-105336-0 from training. Duration: 0.72 2023-10-03 20:10:23,862 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_5960_210_1794_1_1527403865_3990999_515-2613-0_sp1.1 from training. Duration: 0.8 2023-10-03 20:10:23,883 WARNING [train.py:1204] (2/4) Exclude cut with ID _39688_210_19596_1_1533371626725_4154159_551-167834-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 20:10:24,115 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.bypass.skip_rate, batch_count=1391360.0, ans=0.07 2023-10-03 20:10:25,710 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_43802_210_2290_1_1533516203_8829504_85-195240-0 from training. Duration: 0.87 2023-10-03 20:10:25,771 WARNING [train.py:1204] (2/4) Exclude cut with ID _63171_210_15488_1_1535104792794_3295110_113-113534-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 20:10:31,972 WARNING [train.py:1204] (2/4) Exclude cut with ID _8833_210_5219_1_1528592665210_7553059_319-349181-0_sp1.1 from training. Duration: 0.9 2023-10-03 20:10:31,974 WARNING [train.py:1204] (2/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_225-43381-0 from training. Duration: 0.94 2023-10-03 20:10:35,006 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_5959_210_1794_1_1527400400_3445726_412-44052-0_sp0.9 from training. Duration: 0.8555625 2023-10-03 20:10:36,324 WARNING [train.py:1204] (2/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_356-167816-0_sp1.1 from training. Duration: 0.6545625 2023-10-03 20:10:38,546 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.balancer_ff2.min_abs, batch_count=1391360.0, ans=0.1 2023-10-03 20:10:40,943 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9012W0112-501244 from training. Duration: 0.768 2023-10-03 20:10:41,003 WARNING [train.py:1204] (2/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_177-326952-0_sp1.1 from training. Duration: 0.92725 2023-10-03 20:10:41,007 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9016W0116-502789 from training. Duration: 0.896 2023-10-03 20:10:42,414 WARNING [train.py:1204] (2/4) Exclude cut with ID _56235_210_10640_1_1534557335235_4161099_557-204815-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 20:10:43,751 WARNING [train.py:1204] (2/4) Exclude cut with ID _55857_210_2346_1_1534644007831_6898209_279-229469-0_sp1.1 from training. Duration: 0.936375 2023-10-03 20:10:43,828 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9029W0375-509764 from training. Duration: 0.896 2023-10-03 20:10:45,155 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_2295_210_2087_1_1525500993_2407863_234-83104-0_sp0.9 from training. Duration: 0.688875 2023-10-03 20:10:47,847 WARNING [train.py:1204] (2/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_266-137104-0 from training. Duration: 0.65 2023-10-03 20:10:49,171 WARNING [train.py:1204] (2/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_289-163927-0_sp1.1 from training. Duration: 0.92725 2023-10-03 20:10:51,927 WARNING [train.py:1204] (2/4) Exclude cut with ID _12733_210_8341_1_1530167424556_3646380_86-225314-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 20:10:51,946 WARNING [train.py:1204] (2/4) Exclude cut with ID _7877_210_6014_1_1528286358099_3750136_618-284064-0_sp1.1 from training. Duration: 0.92725 2023-10-03 20:10:51,985 WARNING [train.py:1204] (2/4) Exclude cut with ID _55844_210_2346_1_1534471212569_7625707_1017-31469-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 20:10:53,679 WARNING [train.py:1204] (2/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_617-241446-0_sp1.1 from training. Duration: 0.92725 2023-10-03 20:10:53,733 WARNING [train.py:1204] (2/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_866-173440-0_sp1.1 from training. Duration: 0.8181875 2023-10-03 20:10:54,969 INFO [train.py:1046] (2/4) Epoch 40, batch 1550, loss[loss=0.1439, simple_loss=0.2344, pruned_loss=0.02668, over 24466.00 frames. ], tot_loss[loss=0.1555, simple_loss=0.235, pruned_loss=0.03801, over 4721189.91 frames. ], batch size: 63, lr: 2.55e-03, grad_scale: 8.0 2023-10-03 20:10:55,112 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_9460_210_4211_1_1528954587_10049667_480-260269-0 from training. Duration: 0.56 2023-10-03 20:10:56,424 WARNING [train.py:1204] (2/4) Exclude cut with ID _44659_210_2721_1_1534036120061_6614860_626-298884-0 from training. Duration: 0.97 2023-10-03 20:10:56,472 WARNING [train.py:1204] (2/4) Exclude cut with ID _53565_210_10635_1_1534337218335_3820259_287-18120-0_sp1.1 from training. Duration: 0.8818125 2023-10-03 20:10:58,435 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_791-197891-0 from training. Duration: 0.84 2023-10-03 20:10:58,491 WARNING [train.py:1204] (2/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_527-238990-0 from training. Duration: 0.9 2023-10-03 20:10:59,982 WARNING [train.py:1204] (2/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_449-65969-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 20:11:01,331 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_553-341452-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 20:11:01,462 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.1.conv_module1.balancer1.prob, batch_count=1391493.3333333333, ans=0.125 2023-10-03 20:11:02,644 WARNING [train.py:1204] (2/4) Exclude cut with ID _16909_210_5724_1_1531292079912_4118919_300-47574-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 20:11:02,663 WARNING [train.py:1204] (2/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_70-27732-0_sp1.1 from training. Duration: 0.8545625 2023-10-03 20:11:02,832 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.balancer_ff2.min_abs, batch_count=1391493.3333333333, ans=0.1 2023-10-03 20:11:05,296 WARNING [train.py:1204] (2/4) Exclude cut with ID _44718_210_5816_1_1534050051869_6469030_727-28074-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 20:11:05,340 WARNING [train.py:1204] (2/4) Exclude cut with ID _55857_210_2346_1_1534644007831_6898209_357-123664-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 20:11:09,106 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9123W0004-555964 from training. Duration: 0.896 2023-10-03 20:11:10,384 WARNING [train.py:1204] (2/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_445-145208-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 20:11:10,417 WARNING [train.py:1204] (2/4) Exclude cut with ID _3675_210_4557_1_1526639934408_4182089_456-18305-0_sp1.1 from training. Duration: 0.6909375 2023-10-03 20:11:10,491 WARNING [train.py:1204] (2/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_1-99680-0_sp1.1 from training. Duration: 0.6090625 2023-10-03 20:11:11,895 WARNING [train.py:1204] (2/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_146-282400-0_sp0.9 from training. Duration: 0.788875 2023-10-03 20:11:11,909 WARNING [train.py:1204] (2/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_844-96989-0 from training. Duration: 0.71 2023-10-03 20:11:14,729 WARNING [train.py:1204] (2/4) Exclude cut with ID _29853_210_7497_1_1532505600851_6642110_128-146364-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 20:11:14,766 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_224-322394-0 from training. Duration: 0.66 2023-10-03 20:11:17,302 WARNING [train.py:1204] (2/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_191-59784-0 from training. Duration: 0.88 2023-10-03 20:11:17,307 WARNING [train.py:1204] (2/4) Exclude cut with ID _12556_210_8341_1_1530081024100_7327690_725-193477-0 from training. Duration: 0.98 2023-10-03 20:11:17,359 WARNING [train.py:1204] (2/4) Exclude cut with ID _5166_210_2084_1_1527073197160_4087993_523-79838-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 20:11:18,770 WARNING [train.py:1204] (2/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_398-334558-0_sp1.1 from training. Duration: 0.87275 2023-10-03 20:11:22,864 WARNING [train.py:1204] (2/4) Exclude cut with ID _6456_210_1534_1_1527681634212_3181049_171-36697-0_sp1.1 from training. Duration: 0.936375 2023-10-03 20:11:24,346 WARNING [train.py:1204] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_700-183549-0 from training. Duration: 0.68 2023-10-03 20:11:24,348 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8853_210_3273_1_1528633899_818968_51-187685-0 from training. Duration: 0.69 2023-10-03 20:11:27,973 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.balancer1.prob, batch_count=1391626.6666666667, ans=0.125 2023-10-03 20:11:33,599 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.503e+02 1.912e+02 2.147e+02 2.431e+02 4.744e+02, threshold=4.295e+02, percent-clipped=1.0 2023-10-03 20:11:33,753 WARNING [train.py:1204] (2/4) Exclude cut with ID _7719_210_3525_1_1528456153163_4296960_333-110399-0_sp1.1 from training. Duration: 0.87275 2023-10-03 20:11:39,280 WARNING [train.py:1204] (2/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_1022-83740-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 20:11:39,306 WARNING [train.py:1204] (2/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_61-327062-0_sp0.9 from training. Duration: 0.9 2023-10-03 20:11:39,312 WARNING [train.py:1204] (2/4) Exclude cut with ID _47251_210_18628_1_1533814063128_4137250_214-88076-0_sp1.1 from training. Duration: 0.8 2023-10-03 20:11:39,358 WARNING [train.py:1204] (2/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_839-173017-0 from training. Duration: 0.41 2023-10-03 20:11:45,518 WARNING [train.py:1204] (2/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_299-34413-0_sp1.1 from training. Duration: 0.5909375 2023-10-03 20:11:46,854 WARNING [train.py:1204] (2/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_664-159457-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 20:11:48,990 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.3.encoder.layers.3.feed_forward3.out_whiten, num_groups=1, num_channels=512, metric=13.87 vs. limit=15.0 2023-10-03 20:11:49,081 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.2.encoder.layers.2.nonlin_attention.whiten1, num_groups=1, num_channels=288, metric=6.12 vs. limit=10.0 2023-10-03 20:11:49,792 WARNING [train.py:1204] (2/4) Exclude cut with ID _66873_210_5517_1_1535454136506_4293370_483-291760-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 20:11:51,439 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.conv_module1.balancer2.prob, batch_count=1391693.3333333333, ans=0.125 2023-10-03 20:11:52,509 WARNING [train.py:1204] (2/4) Exclude cut with ID _30800_210_22382_1_1532499347904_4515959_287-255474-0_sp1.1 from training. Duration: 0.92725 2023-10-03 20:11:52,548 WARNING [train.py:1204] (2/4) Exclude cut with ID _48677_210_5663_1_1533902405919_3892989_111-11439-0_sp1.1 from training. Duration: 0.87275 2023-10-03 20:11:52,555 WARNING [train.py:1204] (2/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_399-156791-0 from training. Duration: 0.83 2023-10-03 20:11:52,596 WARNING [train.py:1204] (2/4) Exclude cut with ID _3992_210_1747_1_1526644816031_3731060_36-147855-0_sp0.9 from training. Duration: 0.8666875 2023-10-03 20:11:54,095 WARNING [train.py:1204] (2/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_442-320317-0_sp1.1 from training. Duration: 0.7454375 2023-10-03 20:11:54,114 WARNING [train.py:1204] (2/4) Exclude cut with ID _55429_210_10626_1_1534467588527_7174143_953-80868-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 20:11:55,599 WARNING [train.py:1204] (2/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_159-328157-0_sp0.9 from training. Duration: 0.57775 2023-10-03 20:11:55,606 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9309W0197-640324 from training. Duration: 0.896 2023-10-03 20:11:57,040 WARNING [train.py:1204] (2/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_361-30822-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 20:11:59,234 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.0.layers.1.conv_module1.whiten, num_groups=1, num_channels=192, metric=8.98 vs. limit=15.0 2023-10-03 20:12:03,565 WARNING [train.py:1204] (2/4) Exclude cut with ID _5250_210_5219_1_1527249561806_8692110_636-251213-0 from training. Duration: 0.94 2023-10-03 20:12:05,159 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.conv_module1.balancer2.prob, batch_count=1391760.0, ans=0.125 2023-10-03 20:12:08,038 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=1391826.6666666667, ans=0.1 2023-10-03 20:12:09,088 INFO [train.py:1046] (2/4) Epoch 40, batch 1600, loss[loss=0.1598, simple_loss=0.2472, pruned_loss=0.03615, over 24453.00 frames. ], tot_loss[loss=0.1568, simple_loss=0.2368, pruned_loss=0.0384, over 4712153.10 frames. ], batch size: 69, lr: 2.55e-03, grad_scale: 16.0 2023-10-03 20:12:09,141 WARNING [train.py:1204] (2/4) Exclude cut with ID _49728_210_12651_1_1534041034294_6788150_468-333827-0_sp1.1 from training. Duration: 0.963625 2023-10-03 20:12:09,211 WARNING [train.py:1204] (2/4) Exclude cut with ID _25882_210_3036_1_1532775665815_6479510_623-191759-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 20:12:09,264 WARNING [train.py:1204] (2/4) Exclude cut with ID _5189_210_4557_1_1527248319428_3769229_199-53526-0 from training. Duration: 0.42 2023-10-03 20:12:11,182 WARNING [train.py:1204] (2/4) Exclude cut with ID _6274_210_2084_1_1527937583837_7494020_32-205288-0_sp0.9 from training. Duration: 0.8666875 2023-10-03 20:12:12,572 WARNING [train.py:1204] (2/4) Exclude cut with ID _65263_210_22356_1_1535763598691_3921037_401-346646-0_sp1.1 from training. Duration: 0.963625 2023-10-03 20:12:12,576 WARNING [train.py:1204] (2/4) Exclude cut with ID _12555_210_8750_1_1530075594402_3780239_449-267986-0_sp1.1 from training. Duration: 0.7545625 2023-10-03 20:12:12,591 WARNING [train.py:1204] (2/4) Exclude cut with ID _69884_210_9771_1_1535846392818_4166169_430-201020-0_sp1.1 from training. Duration: 0.8818125 2023-10-03 20:12:13,840 WARNING [train.py:1204] (2/4) Exclude cut with ID _8394_210_6702_1_1528590504316_3540189_379-49457-0_sp1.1 from training. Duration: 0.7909375 2023-10-03 20:12:17,852 WARNING [train.py:1204] (2/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_200-105218-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 20:12:17,907 WARNING [train.py:1204] (2/4) Exclude cut with ID _3867_210_1883_1_1526641200575_4475830_319-349850-0 from training. Duration: 0.67 2023-10-03 20:12:17,975 WARNING [train.py:1204] (2/4) Exclude cut with ID _28317_210_12742_1_1532480302381_8510325_916-195848-0 from training. Duration: 0.92 2023-10-03 20:12:19,461 WARNING [train.py:1204] (2/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_436-186311-0 from training. Duration: 0.65 2023-10-03 20:12:20,357 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.1.encoder.layers.1.self_attn2.whiten, num_groups=1, num_channels=256, metric=9.70 vs. limit=22.5 2023-10-03 20:12:20,987 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_3910_210_1794_1_1526777667_3747260_302-180617-0_sp1.1 from training. Duration: 0.97275 2023-10-03 20:12:22,427 WARNING [train.py:1204] (2/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_102-92751-0 from training. Duration: 0.99 2023-10-03 20:12:22,491 WARNING [train.py:1204] (2/4) Exclude cut with ID _51899_210_20034_1_1534129254466_3669220_265-215428-0_sp1.1 from training. Duration: 0.8909375 2023-10-03 20:12:25,094 WARNING [train.py:1204] (2/4) Exclude cut with ID _58583_210_5236_1_1534726835266_3574249_438-198993-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 20:12:29,639 WARNING [train.py:1204] (2/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_166-166990-0_sp1.1 from training. Duration: 0.87275 2023-10-03 20:12:33,078 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.feed_forward3.hidden_balancer.prob, batch_count=1391893.3333333333, ans=0.125 2023-10-03 20:12:34,267 WARNING [train.py:1204] (2/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_907-151398-0 from training. Duration: 0.37 2023-10-03 20:12:34,492 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.bypass.skip_rate, batch_count=1391893.3333333333, ans=0.07 2023-10-03 20:12:34,520 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.bypass_mid.scale_min, batch_count=1391893.3333333333, ans=0.2 2023-10-03 20:12:34,544 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.bypass_mid.scale_min, batch_count=1391893.3333333333, ans=0.2 2023-10-03 20:12:36,957 WARNING [train.py:1204] (2/4) Exclude cut with ID _38670_210_15379_1_1533448810659_4391059_452-256669-0_sp1.1 from training. Duration: 0.936375 2023-10-03 20:12:38,331 WARNING [train.py:1204] (2/4) Exclude cut with ID _48677_210_5663_1_1533902405919_3892989_408-40740-0 from training. Duration: 0.83 2023-10-03 20:12:38,384 WARNING [train.py:1204] (2/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_903-158756-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 20:12:39,147 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.2.encoder.layers.0.nonlin_attention.whiten1, num_groups=1, num_channels=288, metric=7.17 vs. limit=10.0 2023-10-03 20:12:40,297 WARNING [train.py:1204] (2/4) Exclude cut with ID _13252_210_1925_1_1530671372047_3336909_397-216497-0 from training. Duration: 0.96 2023-10-03 20:12:45,218 WARNING [train.py:1204] (2/4) Exclude cut with ID _55429_210_10626_1_1534467588527_7174143_97-335084-0 from training. Duration: 0.97 2023-10-03 20:12:51,275 WARNING [train.py:1204] (2/4) Exclude cut with ID _45329_210_7005_1_1534157951211_6774339_453-267730-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 20:12:51,356 WARNING [train.py:1204] (2/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_153-97441-0 from training. Duration: 0.56 2023-10-03 20:12:52,698 WARNING [train.py:1204] (2/4) Exclude cut with ID _40901_210_5665_1_1533363789958_3856130_312-329122-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 20:12:52,748 WARNING [train.py:1204] (2/4) Exclude cut with ID _30800_210_22382_1_1532499347904_4515959_97-126471-0_sp1.1 from training. Duration: 0.97275 2023-10-03 20:12:52,748 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_11305_210_1794_1_1529750428_7178148_257-312870-0_sp1.1 from training. Duration: 0.8 2023-10-03 20:12:54,740 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.3.encoder.layers.2.self_attn2.whiten, num_groups=1, num_channels=512, metric=10.97 vs. limit=22.5 2023-10-03 20:12:55,569 WARNING [train.py:1204] (2/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_454-314901-0_sp0.9 from training. Duration: 0.511125 2023-10-03 20:12:59,639 WARNING [train.py:1204] (2/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_282-136641-0_sp1.1 from training. Duration: 0.3818125 2023-10-03 20:13:02,809 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_46013_210_7234_1_1533776362_3744032_493-160724-0_sp1.1 from training. Duration: 0.963625 2023-10-03 20:13:02,855 WARNING [train.py:1204] (2/4) Exclude cut with ID _67831_210_12392_1_1535612464202_3092612_259-33442-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 20:13:04,155 WARNING [train.py:1204] (2/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_289-62384-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 20:13:04,219 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6545_210_2040_1_1527774670_4245258_379-14182-0_sp0.9 from training. Duration: 0.888875 2023-10-03 20:13:06,143 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_548-294316-0_sp1.1 from training. Duration: 0.736375 2023-10-03 20:13:06,371 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.self_attn_weights.pos_emb_skip_rate, batch_count=1392026.6666666667, ans=0.0 2023-10-03 20:13:08,709 WARNING [train.py:1204] (2/4) Exclude cut with ID _6275_210_2084_1_1528110062213_7669049_857-298942-0_sp1.1 from training. Duration: 0.77275 2023-10-03 20:13:08,959 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.conv_module1.balancer1.prob, batch_count=1392093.3333333333, ans=0.125 2023-10-03 20:13:10,640 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6673_210_3273_1_1527767790_3173240_327-306185-0_sp1.1 from training. Duration: 0.7545625 2023-10-03 20:13:14,130 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.ff2_skip_rate, batch_count=1392093.3333333333, ans=0.0 2023-10-03 20:13:16,724 WARNING [train.py:1204] (2/4) Exclude cut with ID _61008_210_10633_1_1534852769198_6974370_814-107647-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 20:13:16,799 WARNING [train.py:1204] (2/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_116-332008-0_sp1.1 from training. Duration: 0.8909375 2023-10-03 20:13:18,275 WARNING [train.py:1204] (2/4) Exclude cut with ID _32704_210_8954_1_1533017488840_6747370_266-329755-0 from training. Duration: 0.89 2023-10-03 20:13:18,277 WARNING [train.py:1204] (2/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_271-269084-0_sp0.9 from training. Duration: 0.92225 2023-10-03 20:13:18,397 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.conv_skip_rate, batch_count=1392093.3333333333, ans=0.0 2023-10-03 20:13:20,932 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_3171_210_2087_1_1526109084_2330007_66-260583-0 from training. Duration: 0.72 2023-10-03 20:13:23,580 INFO [train.py:1046] (2/4) Epoch 40, batch 1650, loss[loss=0.1606, simple_loss=0.2514, pruned_loss=0.03493, over 24289.00 frames. ], tot_loss[loss=0.157, simple_loss=0.2372, pruned_loss=0.03835, over 4726409.11 frames. ], batch size: 74, lr: 2.55e-03, grad_scale: 16.0 2023-10-03 20:13:23,695 WARNING [train.py:1204] (2/4) Exclude cut with ID _5250_210_5219_1_1527249561806_8692110_1326-276386-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 20:13:25,295 INFO [scaling.py:1118] (2/4) WithLoss: name=encoder.encoders.1.encoder.layers.0.self_attn_weights, loss-sum=0.000e+00 2023-10-03 20:13:26,409 WARNING [train.py:1204] (2/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_372-30302-0_sp1.1 from training. Duration: 0.7818125 2023-10-03 20:13:26,481 WARNING [train.py:1204] (2/4) Exclude cut with ID _25882_210_3036_1_1532775665815_6479510_631-257029-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 20:13:27,717 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_3691_210_3528_1_1526468169_4062053_355-334432-0 from training. Duration: 0.87 2023-10-03 20:13:27,725 WARNING [train.py:1204] (2/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_839-173017-0_sp1.1 from training. Duration: 0.37275 2023-10-03 20:13:27,733 WARNING [train.py:1204] (2/4) Exclude cut with ID _14998_210_5219_1_1530705582080_7474970_441-91466-0 from training. Duration: 0.97 2023-10-03 20:13:27,784 WARNING [train.py:1204] (2/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_108-53929-0 from training. Duration: 0.59 2023-10-03 20:13:33,182 WARNING [train.py:1204] (2/4) Exclude cut with ID _9389_210_1664_1_1529217337301_5289269_260-1942-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 20:13:33,248 WARNING [train.py:1204] (2/4) Exclude cut with ID _56423_210_5854_1_1534896061049_3165969_198-61547-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 20:13:33,281 WARNING [train.py:1204] (2/4) Exclude cut with ID _44778_210_2054_1_1533798127203_7443090_412-142122-0_sp1.1 from training. Duration: 0.92725 2023-10-03 20:13:33,298 WARNING [train.py:1204] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_88-341718-0_sp1.1 from training. Duration: 0.6 2023-10-03 20:13:36,588 WARNING [train.py:1204] (2/4) Exclude cut with ID _61079_210_4525_1_1534921008198_4011419_334-298697-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 20:13:38,509 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_5465_210_4211_1_1527227253_55357_8-2499-0_sp1.1 from training. Duration: 0.996375 2023-10-03 20:13:41,893 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_447-201565-0_sp1.1 from training. Duration: 0.97275 2023-10-03 20:13:41,899 WARNING [train.py:1204] (2/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_253-161789-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 20:13:41,902 WARNING [train.py:1204] (2/4) Exclude cut with ID _44304_210_15374_1_1533639352765_4613129_302-22084-0_sp0.9 from training. Duration: 0.9555625 2023-10-03 20:13:41,911 WARNING [train.py:1204] (2/4) Exclude cut with ID _26664_210_7770_1_1532258943280_3418270_187-80815-0_sp1.1 from training. Duration: 0.8454375 2023-10-03 20:13:41,993 WARNING [train.py:1204] (2/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_68-212801-0 from training. Duration: 0.73 2023-10-03 20:13:43,163 WARNING [train.py:1204] (2/4) Exclude cut with ID _29345_210_4381_1_1532689168263_3860169_149-257268-0 from training. Duration: 0.81 2023-10-03 20:13:46,543 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.2.encoder.layers.1.self_attn1.whiten, num_groups=1, num_channels=384, metric=16.95 vs. limit=22.5 2023-10-03 20:13:48,652 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_213-103282-0_sp1.1 from training. Duration: 0.7090625 2023-10-03 20:13:50,052 WARNING [train.py:1204] (2/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_156-302675-0_sp1.1 from training. Duration: 0.5 2023-10-03 20:13:58,310 WARNING [train.py:1204] (2/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_125-322172-0 from training. Duration: 0.93 2023-10-03 20:13:58,371 WARNING [train.py:1204] (2/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_493-60474-0_sp1.1 from training. Duration: 0.963625 2023-10-03 20:13:59,796 WARNING [train.py:1204] (2/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_156-302675-0 from training. Duration: 0.55 2023-10-03 20:14:01,155 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.642e+02 1.943e+02 2.116e+02 2.383e+02 3.563e+02, threshold=4.232e+02, percent-clipped=0.0 2023-10-03 20:14:01,400 WARNING [train.py:1204] (2/4) Exclude cut with ID _41314_210_12651_1_1533459476543_3766460_123-267862-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 20:14:04,433 WARNING [train.py:1204] (2/4) Exclude cut with ID _28427_210_4220_1_1532581240967_3061069_201-143747-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 20:14:04,481 WARNING [train.py:1204] (2/4) Exclude cut with ID _8772_210_1850_1_1528635595972_3664011_96-12756-0_sp1.1 from training. Duration: 0.8909375 2023-10-03 20:14:06,332 WARNING [train.py:1204] (2/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_1347-145675-0_sp1.1 from training. Duration: 0.936375 2023-10-03 20:14:08,174 WARNING [train.py:1204] (2/4) Exclude cut with ID _9235_210_5219_1_1528891154253_8046120_387-159008-0_sp1.1 from training. Duration: 0.7818125 2023-10-03 20:14:08,200 WARNING [train.py:1204] (2/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_400-201896-0_sp1.1 from training. Duration: 0.963625 2023-10-03 20:14:10,211 WARNING [train.py:1204] (2/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_326-174612-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 20:14:11,614 WARNING [train.py:1204] (2/4) Exclude cut with ID _55844_210_2346_1_1534471212569_7625707_390-116233-0_sp1.1 from training. Duration: 0.963625 2023-10-03 20:14:11,646 WARNING [train.py:1204] (2/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_419-317062-0_sp1.1 from training. Duration: 0.82725 2023-10-03 20:14:11,833 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.nonlin_attention.balancer.prob, batch_count=1392360.0, ans=0.125 2023-10-03 20:14:12,961 WARNING [train.py:1204] (2/4) Exclude cut with ID _29877_210_6228_1_1532397791106_5276149_266-62291-0_sp0.9 from training. Duration: 0.9666875 2023-10-03 20:14:14,302 WARNING [train.py:1204] (2/4) Exclude cut with ID _7857_210_1534_1_1528286515745_6831470_431-338528-0_sp1.1 from training. Duration: 0.92725 2023-10-03 20:14:15,640 WARNING [train.py:1204] (2/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_87-131427-0_sp0.9 from training. Duration: 0.8666875 2023-10-03 20:14:17,121 WARNING [train.py:1204] (2/4) Exclude cut with ID _24055_210_12651_1_1532310907143_976979_92-59356-0_sp1.1 from training. Duration: 0.82725 2023-10-03 20:14:17,211 WARNING [train.py:1204] (2/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_96-290039-0 from training. Duration: 0.56 2023-10-03 20:14:18,557 WARNING [train.py:1204] (2/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_48-182155-0_sp0.9 from training. Duration: 0.9666875 2023-10-03 20:14:18,593 WARNING [train.py:1204] (2/4) Exclude cut with ID _4626_210_3532_1_1526817652717_3916859_446-206656-0 from training. Duration: 0.76 2023-10-03 20:14:19,985 WARNING [train.py:1204] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_168-32906-0 from training. Duration: 0.69 2023-10-03 20:14:21,299 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_3780_210_2087_1_1526467760_2130483_170-259044-0 from training. Duration: 0.7 2023-10-03 20:14:21,317 WARNING [train.py:1204] (2/4) Exclude cut with ID _65134_210_14763_1_1535335393985_3505368_153-216718-0_sp1.1 from training. Duration: 0.92725 2023-10-03 20:14:21,378 WARNING [train.py:1204] (2/4) Exclude cut with ID _64639_210_5816_1_1535367619437_3528000_461-329156-0_sp1.1 from training. Duration: 0.87275 2023-10-03 20:14:22,645 WARNING [train.py:1204] (2/4) Exclude cut with ID _21989_210_5854_1_1531899214931_3271360_126-347531-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 20:14:22,700 WARNING [train.py:1204] (2/4) Exclude cut with ID _7614_210_4915_1_1529326764092_4537359_96-162197-0_sp1.1 from training. Duration: 0.963625 2023-10-03 20:14:22,701 WARNING [train.py:1204] (2/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_1083-147536-0 from training. Duration: 0.97 2023-10-03 20:14:25,601 WARNING [train.py:1204] (2/4) Exclude cut with ID _64029_210_4381_1_1535178502772_3756459_275-16710-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 20:14:28,247 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7264_210_4938_1_1527939911_8132095_800-91995-0_sp0.9 from training. Duration: 0.9555625 2023-10-03 20:14:28,459 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.feed_forward3.hidden_balancer.prob, batch_count=1392426.6666666667, ans=0.125 2023-10-03 20:14:29,610 WARNING [train.py:1204] (2/4) Exclude cut with ID _3844_210_1390_1_1526727803582_2871209_50-193879-0_sp1.1 from training. Duration: 0.936375 2023-10-03 20:14:30,178 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.4.encoder.layers.2.nonlin_attention.whiten1, num_groups=1, num_channels=288, metric=5.51 vs. limit=10.0 2023-10-03 20:14:31,438 WARNING [train.py:1204] (2/4) Exclude cut with ID _46952_210_5986_1_1534230010357_3107379_232-344503-0 from training. Duration: 0.96 2023-10-03 20:14:35,551 WARNING [train.py:1204] (2/4) Exclude cut with ID _25808_210_13292_1_1532343514562_7366109_206-14076-0_sp1.1 from training. Duration: 0.936375 2023-10-03 20:14:35,552 WARNING [train.py:1204] (2/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_55-31904-0_sp0.9 from training. Duration: 0.97775 2023-10-03 20:14:35,572 WARNING [train.py:1204] (2/4) Exclude cut with ID _14632_210_9988_1_1530691279392_3923200_161-129530-0 from training. Duration: 0.96 2023-10-03 20:14:37,329 INFO [train.py:1046] (2/4) Epoch 40, batch 1700, loss[loss=0.1511, simple_loss=0.2458, pruned_loss=0.02817, over 24286.00 frames. ], tot_loss[loss=0.1579, simple_loss=0.2377, pruned_loss=0.03904, over 4702720.45 frames. ], batch size: 74, lr: 2.55e-03, grad_scale: 16.0 2023-10-03 20:14:37,385 WARNING [train.py:1204] (2/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_770-200430-0_sp1.1 from training. Duration: 0.8090625 2023-10-03 20:14:37,395 WARNING [train.py:1204] (2/4) Exclude cut with ID _6270_210_2084_1_1527850911573_3856420_26-277343-0_sp1.1 from training. Duration: 0.7454375 2023-10-03 20:14:37,396 WARNING [train.py:1204] (2/4) Exclude cut with ID _69884_210_9771_1_1535846392818_4166169_88-23776-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 20:14:40,562 WARNING [train.py:1204] (2/4) Exclude cut with ID _60860_210_8254_1_1534896015875_3557169_245-256666-0_sp1.1 from training. Duration: 0.8818125 2023-10-03 20:14:40,598 WARNING [train.py:1204] (2/4) Exclude cut with ID _5250_210_5219_1_1527249561806_8692110_636-251213-0_sp1.1 from training. Duration: 0.8545625 2023-10-03 20:14:41,850 WARNING [train.py:1204] (2/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_540-131164-0 from training. Duration: 0.92 2023-10-03 20:14:44,623 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_5931_210_2040_1_1527428481_4547999_321-193614-0_sp0.9 from training. Duration: 0.7444375 2023-10-03 20:14:48,995 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.self_attn_weights.pos_emb_skip_rate, batch_count=1392493.3333333333, ans=0.0 2023-10-03 20:14:50,278 WARNING [train.py:1204] (2/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_373-225403-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 20:14:53,183 WARNING [train.py:1204] (2/4) Exclude cut with ID _20622_210_3852_1_1531622427080_1093839_57-326027-0_sp1.1 from training. Duration: 0.97275 2023-10-03 20:14:57,615 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=1392560.0, ans=0.1 2023-10-03 20:15:00,151 WARNING [train.py:1204] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_605-257205-0_sp1.1 from training. Duration: 0.536375 2023-10-03 20:15:00,179 WARNING [train.py:1204] (2/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_442-320317-0_sp0.9 from training. Duration: 0.911125 2023-10-03 20:15:02,123 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_35361_210_4238_1_1533262810_7793476_675-87012-0_sp1.1 from training. Duration: 0.8090625 2023-10-03 20:15:02,150 WARNING [train.py:1204] (2/4) Exclude cut with ID _4184_210_2550_1_1526730192013_2219380_306-44736-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 20:15:03,621 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_124-113562-0 from training. Duration: 0.94 2023-10-03 20:15:04,998 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_2821_210_2715_1_1526108385_3763560_73-296673-0_sp0.9 from training. Duration: 0.92225 2023-10-03 20:15:06,306 WARNING [train.py:1204] (2/4) Exclude cut with ID _10301_210_5886_1_1529222275332_3910620_216-46959-0_sp1.1 from training. Duration: 0.92725 2023-10-03 20:15:06,424 WARNING [train.py:1204] (2/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_91-277684-0_sp0.9 from training. Duration: 0.82225 2023-10-03 20:15:09,534 WARNING [train.py:1204] (2/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_351-294650-0_sp0.9 from training. Duration: 0.7 2023-10-03 20:15:10,909 WARNING [train.py:1204] (2/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_209-205365-0 from training. Duration: 0.79 2023-10-03 20:15:12,764 WARNING [train.py:1204] (2/4) Exclude cut with ID _14603_210_4220_1_1530615688441_3097319_97-325053-0 from training. Duration: 0.92 2023-10-03 20:15:12,863 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_49348_210_7927_1_1533954500_3691300_109-276430-0_sp1.1 from training. Duration: 0.92725 2023-10-03 20:15:14,309 WARNING [train.py:1204] (2/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_912-47473-0 from training. Duration: 0.68 2023-10-03 20:15:15,644 WARNING [train.py:1204] (2/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_96-320736-0_sp0.9 from training. Duration: 0.9555625 2023-10-03 20:15:22,437 WARNING [train.py:1204] (2/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_1252-235481-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 20:15:23,905 WARNING [train.py:1204] (2/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_813-261554-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 20:15:23,974 WARNING [train.py:1204] (2/4) Exclude cut with ID _21632_210_2253_1_1531994001765_3744140_110-274654-0_sp0.9 from training. Duration: 0.911125 2023-10-03 20:15:26,686 WARNING [train.py:1204] (2/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_381-317791-0_sp1.1 from training. Duration: 0.663625 2023-10-03 20:15:26,699 WARNING [train.py:1204] (2/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_51-117576-0_sp1.1 from training. Duration: 0.363625 2023-10-03 20:15:26,726 WARNING [train.py:1204] (2/4) Exclude cut with ID _42321_210_7770_1_1533468631352_4366895_104-215915-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 20:15:30,705 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_40896_210_15710_1_1533433956_7100802_216-22824-0_sp1.1 from training. Duration: 0.92725 2023-10-03 20:15:30,706 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_14831_210_7927_1_1530700200_5583610_181-27467-0 from training. Duration: 0.91 2023-10-03 20:15:30,771 WARNING [train.py:1204] (2/4) Exclude cut with ID _30304_210_6319_1_1532476827261_7582060_1024-265645-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 20:15:30,774 WARNING [train.py:1204] (2/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_412-233904-0_sp1.1 from training. Duration: 0.936375 2023-10-03 20:15:32,541 WARNING [train.py:1204] (2/4) Exclude cut with ID _30191_210_1664_1_1533189915047_7196670_911-223461-0_sp1.1 from training. Duration: 0.92725 2023-10-03 20:15:32,549 WARNING [train.py:1204] (2/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_558-148266-0_sp1.1 from training. Duration: 0.963625 2023-10-03 20:15:34,015 WARNING [train.py:1204] (2/4) Exclude cut with ID _58480_210_12392_1_1534811121111_3646160_212-164495-0_sp1.1 from training. Duration: 0.936375 2023-10-03 20:15:34,016 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_454-276429-0_sp0.9 from training. Duration: 0.9666875 2023-10-03 20:15:35,430 WARNING [train.py:1204] (2/4) Exclude cut with ID _24736_210_3279_1_1532332894218_2838700_73-197086-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 20:15:35,489 WARNING [train.py:1204] (2/4) Exclude cut with ID _5294_210_1747_1_1527249643928_3696179_529-242298-0_sp1.1 from training. Duration: 0.863625 2023-10-03 20:15:35,521 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_53555_210_5632_1_1534595220_7232297_345-171438-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 20:15:41,291 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_28211_210_6388_1_1532861506_4523224_22-25766-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 20:15:42,029 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7002_210_1794_1_1527939683_5157286_163-87552-0 from training. Duration: 0.86 2023-10-03 20:15:44,788 WARNING [train.py:1204] (2/4) Exclude cut with ID _56927_210_9969_1_1534848970840_3695229_409-225129-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 20:15:46,203 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_26192_210_7927_1_1532137269_1040115_21-219331-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 20:15:48,830 WARNING [train.py:1204] (2/4) Exclude cut with ID _15012_210_1968_1_1530856820429_5095350_662-210469-0 from training. Duration: 0.57 2023-10-03 20:15:50,205 INFO [train.py:1046] (2/4) Epoch 40, batch 1750, loss[loss=0.1487, simple_loss=0.2214, pruned_loss=0.03802, over 23650.00 frames. ], tot_loss[loss=0.1565, simple_loss=0.2361, pruned_loss=0.03841, over 4707637.66 frames. ], batch size: 256, lr: 2.55e-03, grad_scale: 8.0 2023-10-03 20:15:53,152 WARNING [train.py:1204] (2/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_582-191016-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 20:15:54,774 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.feed_forward1.hidden_balancer.prob, batch_count=1392826.6666666667, ans=0.125 2023-10-03 20:15:55,870 WARNING [train.py:1204] (2/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_270-210032-0_sp1.1 from training. Duration: 0.963625 2023-10-03 20:15:56,060 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.feed_forward2.hidden_balancer.prob, batch_count=1392826.6666666667, ans=0.125 2023-10-03 20:15:57,134 WARNING [train.py:1204] (2/4) Exclude cut with ID _6789_210_1534_1_1527857937218_3118950_262-48853-0_sp0.9 from training. Duration: 0.62225 2023-10-03 20:15:57,195 WARNING [train.py:1204] (2/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_847-41334-0 from training. Duration: 0.44 2023-10-03 20:15:58,424 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_40896_210_15710_1_1533433956_7100802_573-46532-0_sp1.1 from training. Duration: 0.92725 2023-10-03 20:16:00,068 WARNING [train.py:1204] (2/4) Exclude cut with ID _21881_210_3525_1_1531825242757_3798380_247-102591-0_sp1.1 from training. Duration: 0.8909375 2023-10-03 20:16:00,090 WARNING [train.py:1204] (2/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_200-21395-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 20:16:06,138 WARNING [train.py:1204] (2/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_711-15688-0 from training. Duration: 0.43 2023-10-03 20:16:07,553 WARNING [train.py:1204] (2/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_53-75025-0_sp1.1 from training. Duration: 0.963625 2023-10-03 20:16:09,513 WARNING [train.py:1204] (2/4) Exclude cut with ID _6194_210_1534_1_1527511826160_3941194_321-148641-0 from training. Duration: 0.64 2023-10-03 20:16:09,543 WARNING [train.py:1204] (2/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_337-294431-0_sp1.1 from training. Duration: 0.936375 2023-10-03 20:16:10,947 WARNING [train.py:1204] (2/4) Exclude cut with ID _21632_210_2253_1_1531994001765_3744140_110-274654-0_sp1.1 from training. Duration: 0.7454375 2023-10-03 20:16:14,037 WARNING [train.py:1204] (2/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_135-174080-0_sp1.1 from training. Duration: 0.4818125 2023-10-03 20:16:15,451 WARNING [train.py:1204] (2/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_21-174514-0 from training. Duration: 0.43 2023-10-03 20:16:16,943 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_38965_210_4852_1_1533174906_3484735_215-294589-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 20:16:18,332 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_548-294316-0 from training. Duration: 0.81 2023-10-03 20:16:22,913 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.conv_skip_rate, batch_count=1392960.0, ans=0.0 2023-10-03 20:16:25,339 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_103-84010-0_sp1.1 from training. Duration: 0.62725 2023-10-03 20:16:26,788 WARNING [train.py:1204] (2/4) Exclude cut with ID _18199_210_6109_1_1531475789864_4288790_295-198247-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 20:16:26,815 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_13757_210_3528_1_1530440682_4107239_103-313224-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 20:16:29,453 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.665e+02 1.982e+02 2.207e+02 2.647e+02 3.651e+02, threshold=4.414e+02, percent-clipped=0.0 2023-10-03 20:16:29,598 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_34886_210_10633_1_1533203882_1755661_67-294673-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 20:16:29,608 WARNING [train.py:1204] (2/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_350-101734-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 20:16:31,013 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_37357_210_17024_1_1533089504_2316121_204-153986-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 20:16:34,546 WARNING [train.py:1204] (2/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_673-115569-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 20:16:34,740 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.feed_forward1.hidden_balancer.prob, batch_count=1393026.6666666667, ans=0.125 2023-10-03 20:16:37,577 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_489-230703-0_sp0.9 from training. Duration: 0.9444375 2023-10-03 20:16:37,734 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.0.nonlin_attention.balancer.prob, batch_count=1393026.6666666667, ans=0.125 2023-10-03 20:16:38,999 WARNING [train.py:1204] (2/4) Exclude cut with ID _5250_210_5219_1_1527249561806_8692110_575-102496-0_sp1.1 from training. Duration: 0.936375 2023-10-03 20:16:39,076 WARNING [train.py:1204] (2/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_669-93163-0 from training. Duration: 0.98 2023-10-03 20:16:42,350 WARNING [train.py:1204] (2/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_249-281349-0_sp0.9 from training. Duration: 0.9444375 2023-10-03 20:16:43,858 WARNING [train.py:1204] (2/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_106-30782-0 from training. Duration: 0.8 2023-10-03 20:16:45,292 WARNING [train.py:1204] (2/4) Exclude cut with ID _8772_210_1850_1_1528635595972_3664011_398-320049-0_sp1.1 from training. Duration: 0.8 2023-10-03 20:16:46,657 WARNING [train.py:1204] (2/4) Exclude cut with ID _44778_210_2054_1_1533798127203_7443090_516-151727-0_sp1.1 from training. Duration: 0.963625 2023-10-03 20:16:46,720 WARNING [train.py:1204] (2/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_325-62802-0_sp1.1 from training. Duration: 0.7909375 2023-10-03 20:16:50,770 WARNING [train.py:1204] (2/4) Exclude cut with ID _14368_210_4220_1_1530532714870_3595529_93-58002-0_sp0.9 from training. Duration: 0.6333125 2023-10-03 20:16:50,803 WARNING [train.py:1204] (2/4) Exclude cut with ID _4681_210_3525_1_1527251040321_1235713_123-24428-0_sp1.1 from training. Duration: 0.436375 2023-10-03 20:16:50,863 WARNING [train.py:1204] (2/4) Exclude cut with ID _5250_210_5219_1_1527249561806_8692110_1185-56473-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 20:16:51,440 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.4.encoder.layers.1.self_attn1.whiten, num_groups=1, num_channels=384, metric=11.41 vs. limit=22.5 2023-10-03 20:16:52,247 WARNING [train.py:1204] (2/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_96-244299-0_sp1.1 from training. Duration: 0.8 2023-10-03 20:16:54,980 WARNING [train.py:1204] (2/4) Exclude cut with ID _59500_210_8254_1_1534809610680_3736009_104-72815-0_sp1.1 from training. Duration: 0.963625 2023-10-03 20:16:59,060 WARNING [train.py:1204] (2/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_468-283113-0_sp1.1 from training. Duration: 0.87275 2023-10-03 20:16:59,175 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6239_210_4211_1_1527765584_4306698_528-102186-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 20:17:02,335 WARNING [train.py:1204] (2/4) Exclude cut with ID _45297_210_2721_1_1533715351381_3264949_351-147136-0 from training. Duration: 0.87 2023-10-03 20:17:02,339 WARNING [train.py:1204] (2/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_520-51744-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 20:17:03,614 INFO [train.py:1046] (2/4) Epoch 40, batch 1800, loss[loss=0.1642, simple_loss=0.2591, pruned_loss=0.03471, over 24520.00 frames. ], tot_loss[loss=0.1559, simple_loss=0.2353, pruned_loss=0.0382, over 4704640.94 frames. ], batch size: 71, lr: 2.55e-03, grad_scale: 8.0 2023-10-03 20:17:03,689 WARNING [train.py:1204] (2/4) Exclude cut with ID _5166_210_2084_1_1527073197160_4087993_310-114452-0_sp1.1 from training. Duration: 0.536375 2023-10-03 20:17:03,692 WARNING [train.py:1204] (2/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_450-165710-0_sp1.1 from training. Duration: 0.92725 2023-10-03 20:17:03,703 WARNING [train.py:1204] (2/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_280-120942-0_sp1.1 from training. Duration: 0.7 2023-10-03 20:17:03,736 WARNING [train.py:1204] (2/4) Exclude cut with ID _65585_210_12210_1_1535450451584_3764098_112-294301-0_sp0.9 from training. Duration: 0.97775 2023-10-03 20:17:03,806 WARNING [train.py:1204] (2/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_98-51676-0_sp0.9 from training. Duration: 0.811125 2023-10-03 20:17:07,232 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_33348_210_5632_1_1533086912_3324030_85-28583-0_sp1.1 from training. Duration: 0.7454375 2023-10-03 20:17:08,584 WARNING [train.py:1204] (2/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_336-208232-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 20:17:11,228 WARNING [train.py:1204] (2/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_339-104791-0_sp0.9 from training. Duration: 0.5444375 2023-10-03 20:17:13,328 WARNING [train.py:1204] (2/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_559-29840-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 20:17:16,123 WARNING [train.py:1204] (2/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_117-290916-0_sp0.9 from training. Duration: 0.5666875 2023-10-03 20:17:17,593 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_43802_210_2290_1_1533516203_8829504_185-112289-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 20:17:20,324 WARNING [train.py:1204] (2/4) Exclude cut with ID _59256_210_5986_1_1535076085997_3589120_155-298797-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 20:17:21,813 WARNING [train.py:1204] (2/4) Exclude cut with ID _44659_210_2721_1_1534036120061_6614860_246-206531-0_sp1.1 from training. Duration: 0.92725 2023-10-03 20:17:23,124 WARNING [train.py:1204] (2/4) Exclude cut with ID _23818_210_6228_1_1531917063884_3676920_469-151944-0_sp1.1 from training. Duration: 0.92725 2023-10-03 20:17:23,204 WARNING [train.py:1204] (2/4) Exclude cut with ID _13252_210_1925_1_1530671372047_3336909_88-276668-0_sp1.1 from training. Duration: 0.9 2023-10-03 20:17:25,911 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_15101_210_7927_1_1530786415_5033274_320-327527-0_sp1.1 from training. Duration: 0.87275 2023-10-03 20:17:25,928 WARNING [train.py:1204] (2/4) Exclude cut with ID _14998_210_5219_1_1530705582080_7474970_446-112117-0 from training. Duration: 0.9 2023-10-03 20:17:27,191 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_30280_210_1881_1_1532872779_3660139_400-254159-0_sp1.1 from training. Duration: 0.936375 2023-10-03 20:17:29,922 WARNING [train.py:1204] (2/4) Exclude cut with ID _40284_210_2084_1_1533513679814_3704279_158-345341-0_sp1.1 from training. Duration: 0.936375 2023-10-03 20:17:34,585 WARNING [train.py:1204] (2/4) Exclude cut with ID 3033-130750-0096-55598-0 from training. Duration: 0.83 2023-10-03 20:17:36,554 WARNING [train.py:1204] (2/4) Exclude cut with ID _9389_210_1664_1_1529217337301_5289269_379-128794-0 from training. Duration: 0.85 2023-10-03 20:17:37,804 WARNING [train.py:1204] (2/4) Exclude cut with ID _3675_210_4557_1_1526639934408_4182089_456-18305-0 from training. Duration: 0.76 2023-10-03 20:17:37,831 WARNING [train.py:1204] (2/4) Exclude cut with ID _29853_210_7497_1_1532505600851_6642110_571-249460-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 20:17:39,219 WARNING [train.py:1204] (2/4) Exclude cut with ID _4068_210_2629_1_1526697350395_1546259_230-325568-0_sp1.1 from training. Duration: 0.92725 2023-10-03 20:17:39,220 WARNING [train.py:1204] (2/4) Exclude cut with ID _34415_210_8750_1_1533085213375_3451116_213-267570-0_sp1.1 from training. Duration: 0.963625 2023-10-03 20:17:39,287 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6767_210_3273_1_1527855783_1718932_142-127132-0_sp1.1 from training. Duration: 0.863625 2023-10-03 20:17:46,159 WARNING [train.py:1204] (2/4) Exclude cut with ID IC0653W0001-322925 from training. Duration: 0.896 2023-10-03 20:17:47,510 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_4528_210_3273_1_1527076654_3276777_209-219804-0_sp1.1 from training. Duration: 0.62725 2023-10-03 20:17:50,166 WARNING [train.py:1204] (2/4) Exclude cut with ID _55844_210_2346_1_1534471212569_7625707_187-204971-0_sp1.1 from training. Duration: 0.936375 2023-10-03 20:17:51,583 WARNING [train.py:1204] (2/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_369-325184-0 from training. Duration: 0.99 2023-10-03 20:17:52,908 WARNING [train.py:1204] (2/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_791-45149-0 from training. Duration: 0.86 2023-10-03 20:17:52,945 WARNING [train.py:1204] (2/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_317-211107-0_sp1.1 from training. Duration: 0.836375 2023-10-03 20:17:53,040 WARNING [train.py:1204] (2/4) Exclude cut with ID _30800_210_22382_1_1532499347904_4515959_488-330913-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 20:17:54,559 WARNING [train.py:1204] (2/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_506-42614-0_sp1.1 from training. Duration: 0.7181875 2023-10-03 20:17:59,876 WARNING [train.py:1204] (2/4) Exclude cut with ID _8696_210_1876_1_1528545632440_3345058_209-54069-0 from training. Duration: 0.67 2023-10-03 20:18:05,407 WARNING [train.py:1204] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_215-202973-0_sp1.1 from training. Duration: 0.8 2023-10-03 20:18:07,344 WARNING [train.py:1204] (2/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_321-215742-0 from training. Duration: 0.5 2023-10-03 20:18:07,397 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_40896_210_15710_1_1533433956_7100802_428-32114-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 20:18:07,405 WARNING [train.py:1204] (2/4) Exclude cut with ID _27013_210_1389_1_1532437173240_3252649_467-263151-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 20:18:08,865 WARNING [train.py:1204] (2/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_734-129761-0_sp0.9 from training. Duration: 0.911125 2023-10-03 20:18:08,907 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_521-214276-0 from training. Duration: 0.44 2023-10-03 20:18:10,516 WARNING [train.py:1204] (2/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_80-147301-0_sp1.1 from training. Duration: 0.763625 2023-10-03 20:18:10,518 WARNING [train.py:1204] (2/4) Exclude cut with ID _9679_210_7497_1_1529129212906_4363929_56-307796-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 20:18:13,261 WARNING [train.py:1204] (2/4) Exclude cut with ID _14832_210_8750_1_1530691351539_3171348_1-54315-0 from training. Duration: 0.87 2023-10-03 20:18:13,262 WARNING [train.py:1204] (2/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_558-155750-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 20:18:15,129 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_142-309370-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 20:18:16,385 WARNING [train.py:1204] (2/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_18-46756-0_sp0.9 from training. Duration: 0.87775 2023-10-03 20:18:16,392 WARNING [train.py:1204] (2/4) Exclude cut with ID _61148_210_6897_1_1535094022726_4217610_279-212159-0_sp1.1 from training. Duration: 0.936375 2023-10-03 20:18:16,501 WARNING [train.py:1204] (2/4) Exclude cut with ID _21987_210_5854_1_1531821500482_3637038_164-2641-0_sp1.1 from training. Duration: 0.936375 2023-10-03 20:18:18,159 INFO [train.py:1046] (2/4) Epoch 40, batch 1850, loss[loss=0.1696, simple_loss=0.2576, pruned_loss=0.04081, over 24367.00 frames. ], tot_loss[loss=0.156, simple_loss=0.2358, pruned_loss=0.03811, over 4704089.84 frames. ], batch size: 77, lr: 2.55e-03, grad_scale: 8.0 2023-10-03 20:18:18,228 WARNING [train.py:1204] (2/4) Exclude cut with ID _8064_210_5399_1_1528372299291_4282973_395-340852-0_sp1.1 from training. Duration: 0.8454375 2023-10-03 20:18:19,726 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1077-184261-0_sp1.1 from training. Duration: 0.963625 2023-10-03 20:18:19,734 WARNING [train.py:1204] (2/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_1176-204992-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 20:18:22,509 WARNING [train.py:1204] (2/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_563-347832-0_sp1.1 from training. Duration: 0.8090625 2023-10-03 20:18:22,587 WARNING [train.py:1204] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_380-204756-0_sp0.9 from training. Duration: 0.9555625 2023-10-03 20:18:29,413 WARNING [train.py:1204] (2/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_212-10501-0_sp1.1 from training. Duration: 0.8545625 2023-10-03 20:18:30,767 WARNING [train.py:1204] (2/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_445-117398-0 from training. Duration: 0.89 2023-10-03 20:18:33,609 WARNING [train.py:1204] (2/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_29-280622-0 from training. Duration: 0.82 2023-10-03 20:18:36,899 WARNING [train.py:1204] (2/4) Exclude cut with ID _26664_210_7770_1_1532258943280_3418270_187-80815-0 from training. Duration: 0.93 2023-10-03 20:18:39,730 WARNING [train.py:1204] (2/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_414-349640-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 20:18:41,122 WARNING [train.py:1204] (2/4) Exclude cut with ID _45021_210_9994_1_1533619751094_3330288_96-239010-0 from training. Duration: 0.91 2023-10-03 20:18:41,124 WARNING [train.py:1204] (2/4) Exclude cut with ID _5189_210_4557_1_1527248319428_3769229_199-53526-0_sp0.9 from training. Duration: 0.4666875 2023-10-03 20:18:45,978 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.conv_module1.balancer1.prob, batch_count=1393626.6666666667, ans=0.125 2023-10-03 20:18:52,917 WARNING [train.py:1204] (2/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_63-101996-0_sp1.1 from training. Duration: 0.8 2023-10-03 20:18:54,353 WARNING [train.py:1204] (2/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_199-86432-0 from training. Duration: 0.98 2023-10-03 20:18:57,063 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.598e+02 1.932e+02 2.118e+02 2.522e+02 3.488e+02, threshold=4.237e+02, percent-clipped=0.0 2023-10-03 20:18:57,159 WARNING [train.py:1204] (2/4) Exclude cut with ID _36399_210_16049_1_1532998771377_3698333_207-259013-0_sp1.1 from training. Duration: 0.92725 2023-10-03 20:18:57,193 WARNING [train.py:1204] (2/4) Exclude cut with ID _62574_210_2655_1_1535180407058_3933249_567-111756-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 20:18:59,962 WARNING [train.py:1204] (2/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_436-318113-0 from training. Duration: 0.75 2023-10-03 20:19:01,304 WARNING [train.py:1204] (2/4) Exclude cut with ID _53379_210_20034_1_1534222027363_3854654_43-305214-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 20:19:01,318 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_15126_210_6753_1_1530778677_2591185_113-157651-0_sp1.1 from training. Duration: 0.6181875 2023-10-03 20:19:01,430 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder_embed.dropout.p, batch_count=1393693.3333333333, ans=0.1 2023-10-03 20:19:02,710 WARNING [train.py:1204] (2/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_146-218763-0_sp1.1 from training. Duration: 0.7818125 2023-10-03 20:19:02,847 WARNING [train.py:1204] (2/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_714-310538-0_sp0.9 from training. Duration: 0.9555625 2023-10-03 20:19:05,888 WARNING [train.py:1204] (2/4) Exclude cut with ID _24271_210_12658_1_1531987270022_7190519_709-16721-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 20:19:09,329 WARNING [train.py:1204] (2/4) Exclude cut with ID _5190_210_4557_1_1527244368799_373439_28-197030-0_sp0.9 from training. Duration: 0.8 2023-10-03 20:19:10,468 WARNING [train.py:1204] (2/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_267-197536-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 20:19:10,490 WARNING [train.py:1204] (2/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_658-282610-0_sp1.1 from training. Duration: 0.4818125 2023-10-03 20:19:10,505 WARNING [train.py:1204] (2/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_257-67050-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 20:19:13,201 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_14906_210_7927_1_1530754190_9408633_961-53402-0_sp1.1 from training. Duration: 0.936375 2023-10-03 20:19:14,616 WARNING [train.py:1204] (2/4) Exclude cut with ID _60113_210_16271_1_1535164212481_4386079_490-280290-0_sp1.1 from training. Duration: 0.8909375 2023-10-03 20:19:18,086 WARNING [train.py:1204] (2/4) Exclude cut with ID _14100_210_8341_1_1530529320613_3678069_258-224599-0 from training. Duration: 0.77 2023-10-03 20:19:19,227 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6961_210_2087_1_1527937168_6815011_565-326898-0_sp1.1 from training. Duration: 0.936375 2023-10-03 20:19:22,608 WARNING [train.py:1204] (2/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_239-316802-0_sp1.1 from training. Duration: 0.62725 2023-10-03 20:19:23,186 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.4.encoder.layers.1.self_attn2.whiten, num_groups=1, num_channels=384, metric=11.83 vs. limit=22.5 2023-10-03 20:19:23,992 WARNING [train.py:1204] (2/4) Exclude cut with ID _55857_210_2346_1_1534644007831_6898209_745-98391-0_sp1.1 from training. Duration: 0.6909375 2023-10-03 20:19:23,995 WARNING [train.py:1204] (2/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_154-135696-0 from training. Duration: 0.8 2023-10-03 20:19:23,997 WARNING [train.py:1204] (2/4) Exclude cut with ID _14368_210_4220_1_1530532714870_3595529_93-58002-0 from training. Duration: 0.57 2023-10-03 20:19:25,408 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9025W0005-507335 from training. Duration: 0.896 2023-10-03 20:19:25,494 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9029W0034-509435 from training. Duration: 0.64 2023-10-03 20:19:25,649 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.1.conv_skip_rate, batch_count=1393760.0, ans=0.0 2023-10-03 20:19:26,850 WARNING [train.py:1204] (2/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_76-203867-0_sp1.1 from training. Duration: 0.6545625 2023-10-03 20:19:28,172 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_46639_210_8341_1_1533726088_4495500_66-346325-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 20:19:28,186 WARNING [train.py:1204] (2/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_145-137790-0_sp0.9 from training. Duration: 0.92225 2023-10-03 20:19:28,200 WARNING [train.py:1204] (2/4) Exclude cut with ID _36522_210_8887_1_1533177030611_1772478_7-327299-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 20:19:28,265 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9042W0009-515615 from training. Duration: 0.896 2023-10-03 20:19:28,431 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.conv_module1.balancer1.prob, batch_count=1393760.0, ans=0.125 2023-10-03 20:19:29,518 WARNING [train.py:1204] (2/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_39-248870-0_sp0.9 from training. Duration: 0.7666875 2023-10-03 20:19:29,561 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_35441_210_16554_1_1533347572_4092680_362-242912-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 20:19:29,633 WARNING [train.py:1204] (2/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_216-343007-0_sp1.1 from training. Duration: 0.6 2023-10-03 20:19:30,875 INFO [train.py:1046] (2/4) Epoch 40, batch 1900, loss[loss=0.1666, simple_loss=0.2423, pruned_loss=0.04544, over 23606.00 frames. ], tot_loss[loss=0.156, simple_loss=0.2361, pruned_loss=0.03797, over 4714965.95 frames. ], batch size: 256, lr: 2.55e-03, grad_scale: 8.0 2023-10-03 20:19:30,978 WARNING [train.py:1204] (2/4) Exclude cut with ID _3867_210_1883_1_1526641200575_4475830_272-215563-0_sp0.9 from training. Duration: 0.6333125 2023-10-03 20:19:31,062 WARNING [train.py:1204] (2/4) Exclude cut with ID _21783_210_11357_1_1531742458730_3762679_553-175603-0_sp1.1 from training. Duration: 0.92725 2023-10-03 20:19:31,070 WARNING [train.py:1204] (2/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_963-187109-0 from training. Duration: 0.99 2023-10-03 20:19:33,771 WARNING [train.py:1204] (2/4) Exclude cut with ID _44973_210_1511_1_1533794381163_3634770_495-211466-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 20:19:33,782 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9068W0002-529085 from training. Duration: 0.896 2023-10-03 20:19:33,799 WARNING [train.py:1204] (2/4) Exclude cut with ID _5294_210_1747_1_1527249643928_3696179_180-100713-0_sp1.1 from training. Duration: 0.736375 2023-10-03 20:19:33,894 WARNING [train.py:1204] (2/4) Exclude cut with ID _35799_210_5854_1_1533263487887_3529071_149-49689-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 20:19:39,313 WARNING [train.py:1204] (2/4) Exclude cut with ID _67725_210_27767_1_1535608756671_5105359_36-220794-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 20:19:41,233 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.4.encoder.layers.2.nonlin_attention.whiten2, num_groups=1, num_channels=384, metric=3.40 vs. limit=15.0 2023-10-03 20:19:42,069 WARNING [train.py:1204] (2/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_338-96520-0_sp1.1 from training. Duration: 0.8545625 2023-10-03 20:19:42,172 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9104W0004-547160 from training. Duration: 0.896 2023-10-03 20:19:43,523 WARNING [train.py:1204] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_330-327861-0 from training. Duration: 0.67 2023-10-03 20:19:44,850 WARNING [train.py:1204] (2/4) Exclude cut with ID _14087_210_2253_1_1530529224512_3014029_156-257607-0_sp0.9 from training. Duration: 0.92225 2023-10-03 20:19:45,564 WARNING [train.py:1204] (2/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_476-322050-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 20:19:45,572 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9118W0017-553385 from training. Duration: 0.896 2023-10-03 20:19:45,805 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.self_attn_weights.pos_emb_skip_rate, batch_count=1393893.3333333333, ans=0.0 2023-10-03 20:19:46,901 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9120W0028-554435 from training. Duration: 0.896 2023-10-03 20:19:50,317 WARNING [train.py:1204] (2/4) Exclude cut with ID _56524_210_9642_1_1534572011779_3619170_145-310842-0 from training. Duration: 0.99 2023-10-03 20:19:50,443 WARNING [train.py:1204] (2/4) Exclude cut with ID _51631_210_8254_1_1534129228420_4084794_270-222508-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 20:19:52,309 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.4.encoder.layers.0.self_attn_weights.whiten_keys, num_groups=4, num_channels=128, metric=1.62 vs. limit=6.0 2023-10-03 20:19:55,773 WARNING [train.py:1204] (2/4) Exclude cut with ID _14268_210_9206_1_1530622771285_4714099_331-105351-0 from training. Duration: 0.65 2023-10-03 20:19:58,451 WARNING [train.py:1204] (2/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_40-129756-0 from training. Duration: 0.81 2023-10-03 20:19:58,532 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder_embed.conv.8.prob, batch_count=1393893.3333333333, ans=0.125 2023-10-03 20:20:10,018 WARNING [train.py:1204] (2/4) Exclude cut with ID _3541_210_2477_1_1526475408709_3263973_231-104736-0 from training. Duration: 0.78 2023-10-03 20:20:11,505 WARNING [train.py:1204] (2/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_214-291369-0 from training. Duration: 0.63 2023-10-03 20:20:12,862 WARNING [train.py:1204] (2/4) Exclude cut with ID _62574_210_2655_1_1535180407058_3933249_369-84347-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 20:20:12,921 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9210W0111-599240 from training. Duration: 0.896 2023-10-03 20:20:12,925 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_92-246270-0 from training. Duration: 0.85 2023-10-03 20:20:12,959 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_37152_210_9697_1_1533106573_3844188_29-239070-0 from training. Duration: 0.98 2023-10-03 20:20:14,787 WARNING [train.py:1204] (2/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_239-22076-0 from training. Duration: 0.85 2023-10-03 20:20:14,798 WARNING [train.py:1204] (2/4) Exclude cut with ID _65962_210_29927_1_1535436026519_4174930_368-44639-0_sp1.1 from training. Duration: 0.97275 2023-10-03 20:20:17,664 WARNING [train.py:1204] (2/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_152-212533-0 from training. Duration: 0.97 2023-10-03 20:20:19,653 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.1.feed_forward1.out_proj.dropout_p, batch_count=1394026.6666666667, ans=0.1 2023-10-03 20:20:19,727 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.bypass.skip_rate, batch_count=1394026.6666666667, ans=0.09899494936611666 2023-10-03 20:20:22,226 WARNING [train.py:1204] (2/4) Exclude cut with ID _14603_210_4220_1_1530615688441_3097319_90-31383-0_sp1.1 from training. Duration: 0.7909375 2023-10-03 20:20:22,465 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.conv_skip_rate, batch_count=1394026.6666666667, ans=0.0 2023-10-03 20:20:23,116 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.1.encoder.layers.1.self_attn_weights.whiten_keys, num_groups=4, num_channels=128, metric=2.77 vs. limit=6.0 2023-10-03 20:20:25,476 WARNING [train.py:1204] (2/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_196-152868-0_sp1.1 from training. Duration: 0.92725 2023-10-03 20:20:25,478 WARNING [train.py:1204] (2/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_140-181176-0 from training. Duration: 0.92 2023-10-03 20:20:26,911 WARNING [train.py:1204] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_168-32906-0_sp0.9 from training. Duration: 0.7666875 2023-10-03 20:20:29,820 WARNING [train.py:1204] (2/4) Exclude cut with ID _14922_210_6228_1_1530707435273_3693310_13-165512-0 from training. Duration: 0.82 2023-10-03 20:20:31,169 WARNING [train.py:1204] (2/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_381-317791-0_sp0.9 from training. Duration: 0.811125 2023-10-03 20:20:35,674 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.1.attention_skip_rate, batch_count=1394093.3333333333, ans=0.0 2023-10-03 20:20:36,868 WARNING [train.py:1204] (2/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_63-267585-0_sp1.1 from training. Duration: 0.8181875 2023-10-03 20:20:36,871 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_326-303945-0_sp1.1 from training. Duration: 0.9 2023-10-03 20:20:36,890 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_194-143381-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 20:20:38,321 WARNING [train.py:1204] (2/4) Exclude cut with ID _39290_210_4220_1_1533862778760_3075118_194-16667-0_sp1.1 from training. Duration: 0.963625 2023-10-03 20:20:40,338 WARNING [train.py:1204] (2/4) Exclude cut with ID _15012_210_1968_1_1530856820429_5095350_662-210469-0_sp0.9 from training. Duration: 0.6333125 2023-10-03 20:20:40,378 WARNING [train.py:1204] (2/4) Exclude cut with ID _4697_210_2069_1_1526815801953_3823098_68-219807-0_sp1.1 from training. Duration: 0.42725 2023-10-03 20:20:40,426 WARNING [train.py:1204] (2/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_321-22821-0_sp0.9 from training. Duration: 0.988875 2023-10-03 20:20:43,693 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_14056_210_4163_1_1530604483_7572827_1037-45317-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 20:20:43,695 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_403-39619-0_sp1.1 from training. Duration: 0.763625 2023-10-03 20:20:46,358 INFO [train.py:1046] (2/4) Epoch 40, batch 1950, loss[loss=0.1848, simple_loss=0.2483, pruned_loss=0.06067, over 23729.00 frames. ], tot_loss[loss=0.1569, simple_loss=0.2368, pruned_loss=0.03849, over 4710372.63 frames. ], batch size: 164, lr: 2.55e-03, grad_scale: 8.0 2023-10-03 20:20:46,422 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_20120_210_6753_1_1531643035_5482859_510-96996-0_sp0.9 from training. Duration: 0.9666875 2023-10-03 20:20:46,424 WARNING [train.py:1204] (2/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_225-180155-0_sp1.1 from training. Duration: 0.92725 2023-10-03 20:20:46,473 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_278-272612-0_sp0.9 from training. Duration: 0.811125 2023-10-03 20:20:46,602 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.0.nonlin_attention.balancer.prob, batch_count=1394160.0, ans=0.125 2023-10-03 20:20:47,837 WARNING [train.py:1204] (2/4) Exclude cut with ID _53713_210_24116_1_1534314487762_7416751_691-70244-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 20:20:50,121 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=1394160.0, ans=0.1 2023-10-03 20:20:51,077 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6788_210_1388_1_1527898698_4562021_91-197981-0_sp1.1 from training. Duration: 0.7545625 2023-10-03 20:20:52,557 WARNING [train.py:1204] (2/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_60-18123-0_sp0.9 from training. Duration: 0.97775 2023-10-03 20:20:53,846 WARNING [train.py:1204] (2/4) Exclude cut with ID _35799_210_5854_1_1533263487887_3529071_5-214617-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 20:20:53,852 WARNING [train.py:1204] (2/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_890-5117-0_sp1.1 from training. Duration: 0.7090625 2023-10-03 20:20:55,322 WARNING [train.py:1204] (2/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_660-211088-0 from training. Duration: 0.62 2023-10-03 20:20:57,248 WARNING [train.py:1204] (2/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_314-126819-0_sp0.9 from training. Duration: 0.5555625 2023-10-03 20:20:57,272 WARNING [train.py:1204] (2/4) Exclude cut with ID _21632_210_2253_1_1531994001765_3744140_362-66225-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 20:20:58,649 WARNING [train.py:1204] (2/4) Exclude cut with ID _61118_210_9527_1_1534897668884_3752630_173-258555-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 20:21:01,386 WARNING [train.py:1204] (2/4) Exclude cut with ID _7719_210_3525_1_1528456153163_4296960_88-250259-0_sp0.9 from training. Duration: 0.9333125 2023-10-03 20:21:02,633 WARNING [train.py:1204] (2/4) Exclude cut with ID _15140_210_9288_1_1530791388450_4880149_539-153708-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 20:21:02,656 WARNING [train.py:1204] (2/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_253-42544-0_sp1.1 from training. Duration: 0.97275 2023-10-03 20:21:05,324 WARNING [train.py:1204] (2/4) Exclude cut with ID _33640_210_2655_1_1533117605479_4855469_479-296877-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 20:21:07,313 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.5.encoder.layers.0.self_attn1.whiten, num_groups=1, num_channels=256, metric=12.51 vs. limit=22.5 2023-10-03 20:21:08,145 WARNING [train.py:1204] (2/4) Exclude cut with ID 3033-130750-0096-55598-0_sp1.1 from training. Duration: 0.7545625 2023-10-03 20:21:08,155 WARNING [train.py:1204] (2/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_191-41023-0_sp1.1 from training. Duration: 0.6454375 2023-10-03 20:21:08,178 WARNING [train.py:1204] (2/4) Exclude cut with ID _8365_210_2064_1_1528592311070_4677690_431-294102-0_sp1.1 from training. Duration: 0.8090625 2023-10-03 20:21:08,208 WARNING [train.py:1204] (2/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_893-296030-0_sp1.1 from training. Duration: 0.97275 2023-10-03 20:21:13,992 WARNING [train.py:1204] (2/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_401-343820-0_sp1.1 from training. Duration: 0.97275 2023-10-03 20:21:15,537 WARNING [train.py:1204] (2/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_127-566-0_sp1.1 from training. Duration: 0.763625 2023-10-03 20:21:15,538 WARNING [train.py:1204] (2/4) Exclude cut with ID _14100_210_8341_1_1530529320613_3678069_126-272847-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 20:21:15,553 WARNING [train.py:1204] (2/4) Exclude cut with ID _15133_210_6228_1_1530794512074_3080379_29-264458-0_sp1.1 from training. Duration: 0.52725 2023-10-03 20:21:15,554 WARNING [train.py:1204] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_127-119109-0 from training. Duration: 0.72 2023-10-03 20:21:17,599 WARNING [train.py:1204] (2/4) Exclude cut with ID _6537_210_1883_1_1527987559710_3597129_382-133062-0_sp1.1 from training. Duration: 0.6181875 2023-10-03 20:21:17,617 WARNING [train.py:1204] (2/4) Exclude cut with ID _29529_210_7234_1_1532393961455_3977460_391-219682-0_sp1.1 from training. Duration: 0.936375 2023-10-03 20:21:19,079 WARNING [train.py:1204] (2/4) Exclude cut with ID _37537_210_2253_1_1533171758568_3421949_96-137189-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 20:21:20,606 WARNING [train.py:1204] (2/4) Exclude cut with ID _58414_210_8254_1_1535421609162_7151182_15-276301-0_sp1.1 from training. Duration: 0.97275 2023-10-03 20:21:23,816 WARNING [train.py:1204] (2/4) Exclude cut with ID _59502_210_12210_1_1535107024698_1835139_110-289074-0_sp1.1 from training. Duration: 0.92725 2023-10-03 20:21:26,796 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.630e+02 1.972e+02 2.252e+02 2.613e+02 4.035e+02, threshold=4.504e+02, percent-clipped=0.0 2023-10-03 20:21:26,918 WARNING [train.py:1204] (2/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_367-61330-0_sp1.1 from training. Duration: 0.6818125 2023-10-03 20:21:29,684 WARNING [train.py:1204] (2/4) Exclude cut with ID _45297_210_2721_1_1533715351381_3264949_353-9541-0_sp1.1 from training. Duration: 0.8818125 2023-10-03 20:21:31,015 WARNING [train.py:1204] (2/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_85-138257-0_sp0.9 from training. Duration: 0.788875 2023-10-03 20:21:31,041 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_3787_210_2040_1_1526477544_5478586_603-120324-0 from training. Duration: 0.81 2023-10-03 20:21:31,092 WARNING [train.py:1204] (2/4) Exclude cut with ID _42863_210_9070_1_1533902398788_3696920_411-204131-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 20:21:35,288 WARNING [train.py:1204] (2/4) Exclude cut with ID _30196_210_1664_1_1533708496522_7074140_822-289909-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 20:21:35,375 WARNING [train.py:1204] (2/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_32-335103-0_sp0.9 from training. Duration: 0.911125 2023-10-03 20:21:36,744 WARNING [train.py:1204] (2/4) Exclude cut with ID _6493_210_1747_1_1528459210424_4012470_513-140967-0_sp0.9 from training. Duration: 0.87775 2023-10-03 20:21:45,463 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_47896_210_20508_1_1534492518_5958868_439-257766-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 20:21:48,150 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_1294-63152-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 20:21:49,954 WARNING [train.py:1204] (2/4) Exclude cut with ID _59170_210_6721_1_1534983633960_4203335_319-166512-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 20:21:52,993 WARNING [train.py:1204] (2/4) Exclude cut with ID _3541_210_2477_1_1526475408709_3263973_146-3470-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 20:21:54,519 WARNING [train.py:1204] (2/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_678-159307-0_sp1.1 from training. Duration: 0.77275 2023-10-03 20:21:54,552 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_343-48822-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 20:21:55,981 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_4692_210_3528_1_1526899755_4323254_107-44401-0 from training. Duration: 0.86 2023-10-03 20:21:55,992 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6291_210_3273_1_1527683649_1078498_74-292608-0_sp1.1 from training. Duration: 0.6545625 2023-10-03 20:21:57,374 WARNING [train.py:1204] (2/4) Exclude cut with ID _55644_210_11856_1_1534593599582_4103719_293-248694-0_sp1.1 from training. Duration: 0.97275 2023-10-03 20:21:59,216 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_3172_210_2087_1_1526292929_3987865_197-97762-0 from training. Duration: 0.75 2023-10-03 20:22:00,500 INFO [train.py:1046] (2/4) Epoch 40, batch 2000, loss[loss=0.172, simple_loss=0.2584, pruned_loss=0.0428, over 24674.00 frames. ], tot_loss[loss=0.1575, simple_loss=0.2374, pruned_loss=0.03878, over 4708785.87 frames. ], batch size: 73, lr: 2.55e-03, grad_scale: 16.0 2023-10-03 20:22:00,650 WARNING [train.py:1204] (2/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_264-348896-0_sp0.9 from training. Duration: 0.9444375 2023-10-03 20:22:04,734 WARNING [train.py:1204] (2/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_209-205365-0_sp0.9 from training. Duration: 0.87775 2023-10-03 20:22:04,815 WARNING [train.py:1204] (2/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_222-198127-0_sp0.9 from training. Duration: 0.9333125 2023-10-03 20:22:06,131 WARNING [train.py:1204] (2/4) Exclude cut with ID _57572_210_5517_1_1534921219624_3809484_52-43530-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 20:22:07,502 WARNING [train.py:1204] (2/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_96-320736-0_sp1.1 from training. Duration: 0.7818125 2023-10-03 20:22:09,006 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_13091_210_7925_1_1530609447_8987354_1159-225508-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 20:22:11,803 WARNING [train.py:1204] (2/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_237-44332-0 from training. Duration: 0.94 2023-10-03 20:22:13,148 WARNING [train.py:1204] (2/4) Exclude cut with ID _7970_210_1883_1_1528455616296_3472230_25-178407-0_sp0.9 from training. Duration: 0.788875 2023-10-03 20:22:13,849 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.3.encoder.layers.2.self_attn2.whiten, num_groups=1, num_channels=512, metric=13.51 vs. limit=22.5 2023-10-03 20:22:14,610 WARNING [train.py:1204] (2/4) Exclude cut with ID _29003_210_19596_1_1532507391720_3894098_357-257062-0_sp1.1 from training. Duration: 0.936375 2023-10-03 20:22:16,640 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_809-17156-0 from training. Duration: 0.73 2023-10-03 20:22:17,998 WARNING [train.py:1204] (2/4) Exclude cut with ID _7160_210_1876_1_1527940923231_3703799_302-150728-0_sp1.1 from training. Duration: 0.7090625 2023-10-03 20:22:18,012 WARNING [train.py:1204] (2/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_444-28041-0_sp0.9 from training. Duration: 0.9444375 2023-10-03 20:22:19,995 WARNING [train.py:1204] (2/4) Exclude cut with ID _52892_210_6721_1_1534378941259_4124129_278-251666-0_sp1.1 from training. Duration: 0.92725 2023-10-03 20:22:23,138 WARNING [train.py:1204] (2/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_115-326778-0 from training. Duration: 0.93 2023-10-03 20:22:24,533 WARNING [train.py:1204] (2/4) Exclude cut with ID _41391_210_7994_1_1533603771872_3638201_31-188961-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 20:22:27,604 WARNING [train.py:1204] (2/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_1073-6264-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 20:22:27,622 WARNING [train.py:1204] (2/4) Exclude cut with ID _66868_210_9527_1_1535509287954_4561140_435-259515-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 20:22:27,716 WARNING [train.py:1204] (2/4) Exclude cut with ID _14886_210_4220_1_1530702020355_3516190_159-73748-0 from training. Duration: 0.83 2023-10-03 20:22:28,992 WARNING [train.py:1204] (2/4) Exclude cut with ID _15150_210_8341_1_1530752411803_7256384_469-239509-0_sp1.1 from training. Duration: 0.6181875 2023-10-03 20:22:30,446 WARNING [train.py:1204] (2/4) Exclude cut with ID _8064_210_5399_1_1528372299291_4282973_395-340852-0 from training. Duration: 0.93 2023-10-03 20:22:30,449 WARNING [train.py:1204] (2/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_138-187031-0_sp1.1 from training. Duration: 0.9 2023-10-03 20:22:33,255 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_13757_210_3528_1_1530440682_4107239_48-106347-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 20:22:34,504 WARNING [train.py:1204] (2/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_164-224052-0_sp1.1 from training. Duration: 0.636375 2023-10-03 20:22:34,508 WARNING [train.py:1204] (2/4) Exclude cut with ID _59288_210_5854_1_1535077757774_3412246_408-322648-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 20:22:34,557 WARNING [train.py:1204] (2/4) Exclude cut with ID _30191_210_1664_1_1533189915047_7196670_827-36359-0_sp1.1 from training. Duration: 0.963625 2023-10-03 20:22:35,876 WARNING [train.py:1204] (2/4) Exclude cut with ID _4680_210_3525_1_1527249757953_893386_75-78979-0_sp1.1 from training. Duration: 0.82725 2023-10-03 20:22:37,177 WARNING [train.py:1204] (2/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_383-165234-0 from training. Duration: 0.94 2023-10-03 20:22:38,941 WARNING [train.py:1204] (2/4) Exclude cut with ID _19896_210_1883_1_1531709022662_6649569_94-229927-0 from training. Duration: 0.97 2023-10-03 20:22:38,947 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6788_210_1388_1_1527898698_4562021_180-75288-0_sp1.1 from training. Duration: 0.9 2023-10-03 20:22:38,957 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_26433_210_12108_1_1532157158_4056480_54-321349-0_sp1.1 from training. Duration: 0.97275 2023-10-03 20:22:44,571 WARNING [train.py:1204] (2/4) Exclude cut with ID _56108_210_16380_1_1534505446159_3887959_265-161493-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 20:22:46,635 WARNING [train.py:1204] (2/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_395-111602-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 20:22:46,640 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_154-316450-0_sp0.9 from training. Duration: 0.8444375 2023-10-03 20:22:47,943 WARNING [train.py:1204] (2/4) Exclude cut with ID _43552_210_8913_1_1533726031059_3523429_381-116041-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 20:22:49,826 WARNING [train.py:1204] (2/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_65-104124-0_sp1.1 from training. Duration: 0.963625 2023-10-03 20:22:49,888 WARNING [train.py:1204] (2/4) Exclude cut with ID _6536_210_1883_1_1527850783339_3319069_169-23811-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 20:22:49,920 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_289-180904-0_sp0.9 from training. Duration: 0.8444375 2023-10-03 20:22:49,931 WARNING [train.py:1204] (2/4) Exclude cut with ID _56406_210_9288_1_1534899618664_3496149_226-242400-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 20:22:52,606 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_24899_210_16554_1_1532046422_3827462_276-216330-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 20:22:55,763 WARNING [train.py:1204] (2/4) Exclude cut with ID _29345_210_4381_1_1532689168263_3860169_198-174328-0_sp1.1 from training. Duration: 0.82725 2023-10-03 20:22:57,117 WARNING [train.py:1204] (2/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_338-96520-0 from training. Duration: 0.94 2023-10-03 20:23:01,741 WARNING [train.py:1204] (2/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_7-111954-0_sp1.1 from training. Duration: 0.5090625 2023-10-03 20:23:03,057 WARNING [train.py:1204] (2/4) Exclude cut with ID _36692_210_5665_1_1533608967420_398809_8-16261-0_sp1.1 from training. Duration: 0.97275 2023-10-03 20:23:07,108 WARNING [train.py:1204] (2/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_86-212657-0_sp1.1 from training. Duration: 0.97275 2023-10-03 20:23:07,121 WARNING [train.py:1204] (2/4) Exclude cut with ID _44659_210_2721_1_1534036120061_6614860_626-298884-0_sp1.1 from training. Duration: 0.8818125 2023-10-03 20:23:08,758 WARNING [train.py:1204] (2/4) Exclude cut with ID _49755_210_18053_1_1534581005846_3218709_120-257918-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 20:23:11,563 WARNING [train.py:1204] (2/4) Exclude cut with ID _21881_210_3525_1_1531825242757_3798380_400-113821-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 20:23:11,566 WARNING [train.py:1204] (2/4) Exclude cut with ID _21783_210_11357_1_1531742458730_3762679_387-236773-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 20:23:11,850 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.conv_module1.balancer2.min_abs, batch_count=1394760.0, ans=0.5 2023-10-03 20:23:12,076 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.5.encoder.layers.1.feed_forward1.out_whiten, num_groups=1, num_channels=256, metric=6.02 vs. limit=15.0 2023-10-03 20:23:12,851 WARNING [train.py:1204] (2/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_613-232856-0_sp1.1 from training. Duration: 0.4909375 2023-10-03 20:23:12,873 WARNING [train.py:1204] (2/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_181-78632-0_sp0.9 from training. Duration: 0.8666875 2023-10-03 20:23:14,229 INFO [train.py:1046] (2/4) Epoch 40, batch 2050, loss[loss=0.1547, simple_loss=0.2269, pruned_loss=0.04126, over 24316.00 frames. ], tot_loss[loss=0.1572, simple_loss=0.2368, pruned_loss=0.03875, over 4699618.17 frames. ], batch size: 56, lr: 2.55e-03, grad_scale: 16.0 2023-10-03 20:23:14,801 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.5.encoder.layers.1.self_attn_weights.whiten_keys, num_groups=4, num_channels=128, metric=1.92 vs. limit=6.0 2023-10-03 20:23:15,519 WARNING [train.py:1204] (2/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_175-17485-0_sp1.1 from training. Duration: 0.97275 2023-10-03 20:23:15,605 WARNING [train.py:1204] (2/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_1004-183422-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 20:23:17,690 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.2.encoder.layers.1.nonlin_attention.whiten2, num_groups=1, num_channels=384, metric=5.13 vs. limit=15.0 2023-10-03 20:23:18,795 WARNING [train.py:1204] (2/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_32-65962-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 20:23:20,703 WARNING [train.py:1204] (2/4) Exclude cut with ID _15458_210_8341_1_1530858614263_3834410_40-298664-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 20:23:21,407 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.3.encoder.layers.0.feed_forward3.out_whiten, num_groups=1, num_channels=512, metric=11.12 vs. limit=15.0 2023-10-03 20:23:25,258 WARNING [train.py:1204] (2/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_380-261863-0_sp1.1 from training. Duration: 0.963625 2023-10-03 20:23:26,748 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_372-173012-0_sp1.1 from training. Duration: 0.863625 2023-10-03 20:23:26,809 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_57-82205-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 20:23:28,158 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_3691_210_3528_1_1526468169_4062053_355-334432-0_sp0.9 from training. Duration: 0.9666875 2023-10-03 20:23:30,129 WARNING [train.py:1204] (2/4) Exclude cut with ID _19023_210_4353_1_1531378892552_5820780_28-36615-0 from training. Duration: 0.97 2023-10-03 20:23:30,133 WARNING [train.py:1204] (2/4) Exclude cut with ID _29853_210_7497_1_1532505600851_6642110_286-251333-0_sp1.1 from training. Duration: 0.92725 2023-10-03 20:23:31,529 WARNING [train.py:1204] (2/4) Exclude cut with ID _31602_210_5854_1_1532677021456_4458250_518-80679-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 20:23:31,568 WARNING [train.py:1204] (2/4) Exclude cut with ID _14922_210_6228_1_1530707435273_3693310_38-187851-0_sp0.9 from training. Duration: 0.688875 2023-10-03 20:23:39,592 WARNING [train.py:1204] (2/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_39-248870-0_sp1.1 from training. Duration: 0.62725 2023-10-03 20:23:39,610 WARNING [train.py:1204] (2/4) Exclude cut with ID _16908_210_5724_1_1531119157248_3772700_154-31455-0_sp1.1 from training. Duration: 0.97275 2023-10-03 20:23:40,995 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8453_210_4211_1_1528502940_8562730_347-330673-0 from training. Duration: 0.72 2023-10-03 20:23:45,086 WARNING [train.py:1204] (2/4) Exclude cut with ID _30191_210_1664_1_1533189915047_7196670_674-204247-0_sp1.1 from training. Duration: 0.97275 2023-10-03 20:23:46,508 WARNING [train.py:1204] (2/4) Exclude cut with ID _8833_210_5219_1_1528592665210_7553059_329-125156-0 from training. Duration: 0.82 2023-10-03 20:23:46,538 WARNING [train.py:1204] (2/4) Exclude cut with ID _4779_210_2084_1_1527318022944_3914893_500-285013-0_sp1.1 from training. Duration: 0.62725 2023-10-03 20:23:49,982 WARNING [train.py:1204] (2/4) Exclude cut with ID _14421_210_8750_1_1530604795825_3542290_190-313578-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 20:23:53,157 WARNING [train.py:1204] (2/4) Exclude cut with ID _35799_210_5854_1_1533263487887_3529071_538-273574-0_sp1.1 from training. Duration: 0.936375 2023-10-03 20:23:54,420 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.660e+02 1.914e+02 2.146e+02 2.312e+02 3.091e+02, threshold=4.293e+02, percent-clipped=0.0 2023-10-03 20:23:54,499 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_14928_210_5632_1_1530773461_4714000_454-112198-0_sp1.1 from training. Duration: 0.6 2023-10-03 20:23:54,554 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_468-143126-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 20:23:55,857 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_4528_210_3273_1_1527076654_3276777_258-121681-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 20:23:56,150 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.bypass_mid.scale_min, batch_count=1394960.0, ans=0.2 2023-10-03 20:23:57,202 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_397-132565-0_sp1.1 from training. Duration: 0.9 2023-10-03 20:23:57,234 WARNING [train.py:1204] (2/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_454-123002-0_sp0.9 from training. Duration: 0.7666875 2023-10-03 20:24:00,471 WARNING [train.py:1204] (2/4) Exclude cut with ID _27052_210_5986_1_1532329197187_3585009_297-86890-0_sp1.1 from training. Duration: 0.936375 2023-10-03 20:24:01,911 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_51225_210_3601_1_1534400634_2327383_55-301844-0_sp1.1 from training. Duration: 0.8454375 2023-10-03 20:24:04,624 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6786_210_1388_1_1527839699_4194495_378-46070-0_sp0.9 from training. Duration: 0.8 2023-10-03 20:24:04,715 WARNING [train.py:1204] (2/4) Exclude cut with ID _42334_210_7770_1_1534206610396_3605880_55-19758-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 20:24:08,766 WARNING [train.py:1204] (2/4) Exclude cut with ID _14884_210_2252_1_1530707459689_5561250_114-201399-0_sp0.9 from training. Duration: 0.8444375 2023-10-03 20:24:08,976 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.bypass_mid.scale_min, batch_count=1395026.6666666667, ans=0.2 2023-10-03 20:24:14,251 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_99-251314-0_sp0.9 from training. Duration: 0.9666875 2023-10-03 20:24:15,599 WARNING [train.py:1204] (2/4) Exclude cut with ID _26528_210_9070_1_1532570410842_3851310_497-97218-0 from training. Duration: 0.86 2023-10-03 20:24:20,951 WARNING [train.py:1204] (2/4) Exclude cut with ID _55328_210_27767_1_1534580973260_4088170_230-317241-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 20:24:21,024 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_125-203105-0_sp0.9 from training. Duration: 0.888875 2023-10-03 20:24:22,494 WARNING [train.py:1204] (2/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_55-31904-0_sp1.1 from training. Duration: 0.8 2023-10-03 20:24:24,437 WARNING [train.py:1204] (2/4) Exclude cut with ID _8064_210_5399_1_1528372299291_4282973_18-174647-0 from training. Duration: 0.91 2023-10-03 20:24:28,245 INFO [train.py:1046] (2/4) Epoch 40, batch 2100, loss[loss=0.1322, simple_loss=0.1885, pruned_loss=0.03796, over 19291.00 frames. ], tot_loss[loss=0.1561, simple_loss=0.2355, pruned_loss=0.03833, over 4704610.84 frames. ], batch size: 388, lr: 2.55e-03, grad_scale: 16.0 2023-10-03 20:24:28,327 WARNING [train.py:1204] (2/4) Exclude cut with ID IC0165W0005-81726 from training. Duration: 0.952 2023-10-03 20:24:28,327 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_31583_210_3852_1_1532595851_4024413_148-171847-0_sp1.1 from training. Duration: 0.963625 2023-10-03 20:24:28,381 WARNING [train.py:1204] (2/4) Exclude cut with ID _21989_210_5854_1_1531899214931_3271360_116-138769-0_sp1.1 from training. Duration: 0.936375 2023-10-03 20:24:28,436 WARNING [train.py:1204] (2/4) Exclude cut with ID _66070_210_7640_1_1535441393096_7805310_489-292631-0_sp1.1 from training. Duration: 0.7181875 2023-10-03 20:24:30,379 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_20120_210_6753_1_1531643035_5482859_202-27960-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 20:24:31,657 WARNING [train.py:1204] (2/4) Exclude cut with ID _5294_210_1747_1_1527249643928_3696179_342-270278-0 from training. Duration: 0.55 2023-10-03 20:24:31,700 WARNING [train.py:1204] (2/4) Exclude cut with ID _38323_210_12108_1_1533121233369_3303049_416-11919-0 from training. Duration: 0.96 2023-10-03 20:24:33,113 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6545_210_2040_1_1527774670_4245258_407-48070-0_sp0.9 from training. Duration: 0.8444375 2023-10-03 20:24:34,592 WARNING [train.py:1204] (2/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_503-30719-0_sp1.1 from training. Duration: 0.8909375 2023-10-03 20:24:35,045 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.5.encoder.layers.1.self_attn2.whiten, num_groups=1, num_channels=256, metric=13.18 vs. limit=22.5 2023-10-03 20:24:35,881 WARNING [train.py:1204] (2/4) Exclude cut with ID _49643_210_7989_1_1534640244224_3939031_234-326497-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 20:24:37,358 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=1395160.0, ans=0.1 2023-10-03 20:24:38,571 WARNING [train.py:1204] (2/4) Exclude cut with ID _14170_210_5219_1_1530525949868_4469650_301-77873-0_sp1.1 from training. Duration: 0.963625 2023-10-03 20:24:39,874 WARNING [train.py:1204] (2/4) Exclude cut with ID _45297_210_2721_1_1533715351381_3264949_152-154697-0_sp1.1 from training. Duration: 0.97275 2023-10-03 20:24:39,880 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_3371_210_1794_1_1526384462_5580777_396-339812-0 from training. Duration: 0.85 2023-10-03 20:24:41,256 WARNING [train.py:1204] (2/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_399-156791-0_sp1.1 from training. Duration: 0.7545625 2023-10-03 20:24:42,721 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7696_210_2040_1_1528207167_3716099_117-125434-0 from training. Duration: 0.76 2023-10-03 20:24:42,726 WARNING [train.py:1204] (2/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_96-244299-0 from training. Duration: 0.88 2023-10-03 20:24:44,126 WARNING [train.py:1204] (2/4) Exclude cut with ID _37892_210_19596_1_1533198677897_3964058_519-343462-0_sp1.1 from training. Duration: 0.92725 2023-10-03 20:24:44,156 WARNING [train.py:1204] (2/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_202-183922-0_sp1.1 from training. Duration: 0.77275 2023-10-03 20:24:44,162 WARNING [train.py:1204] (2/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_453-133983-0 from training. Duration: 0.78 2023-10-03 20:24:44,192 WARNING [train.py:1204] (2/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_178-39722-0_sp1.1 from training. Duration: 0.4454375 2023-10-03 20:24:50,702 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_15126_210_6753_1_1530778677_2591185_113-157651-0 from training. Duration: 0.68 2023-10-03 20:24:50,703 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_271-234962-0_sp1.1 from training. Duration: 0.7181875 2023-10-03 20:24:53,474 WARNING [train.py:1204] (2/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_195-64967-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 20:24:53,506 WARNING [train.py:1204] (2/4) Exclude cut with ID _43552_210_8913_1_1533726031059_3523429_365-215065-0_sp1.1 from training. Duration: 0.936375 2023-10-03 20:24:56,749 WARNING [train.py:1204] (2/4) Exclude cut with ID _31602_210_5854_1_1532677021456_4458250_249-96604-0_sp0.9 from training. Duration: 0.988875 2023-10-03 20:24:56,928 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=1395293.3333333333, ans=0.1 2023-10-03 20:24:58,155 WARNING [train.py:1204] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_307-158462-0 from training. Duration: 0.69 2023-10-03 20:24:58,198 WARNING [train.py:1204] (2/4) Exclude cut with ID _30918_210_6400_1_1532754078600_3125920_407-223394-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 20:24:58,201 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6673_210_3273_1_1527767790_3173240_351-167954-0_sp0.9 from training. Duration: 0.5333125 2023-10-03 20:24:59,609 WARNING [train.py:1204] (2/4) Exclude cut with ID _4866_210_2577_1_1527404412284_4364659_256-306830-0 from training. Duration: 0.87 2023-10-03 20:25:01,428 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_17613_210_2151_1_1531137569_2366160_222-131118-0_sp1.1 from training. Duration: 0.963625 2023-10-03 20:25:01,451 WARNING [train.py:1204] (2/4) Exclude cut with ID _45560_210_21982_1_1533812603951_3584999_111-6376-0 from training. Duration: 0.84 2023-10-03 20:25:01,484 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_216-73841-0 from training. Duration: 0.96 2023-10-03 20:25:02,798 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_14058_210_4163_1_1530690953_6327180_49-210687-0 from training. Duration: 0.94 2023-10-03 20:25:05,377 WARNING [train.py:1204] (2/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_506-42614-0_sp0.9 from training. Duration: 0.87775 2023-10-03 20:25:06,784 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_3133_210_2087_1_1526106446_2324690_25-332138-0_sp1.1 from training. Duration: 0.82725 2023-10-03 20:25:08,282 WARNING [train.py:1204] (2/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_589-9070-0_sp1.1 from training. Duration: 0.6454375 2023-10-03 20:25:09,688 WARNING [train.py:1204] (2/4) Exclude cut with ID _6194_210_1534_1_1527511826160_3941194_14-282876-0_sp1.1 from training. Duration: 0.6454375 2023-10-03 20:25:09,786 WARNING [train.py:1204] (2/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_1166-90654-0_sp1.1 from training. Duration: 0.92725 2023-10-03 20:25:11,198 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_14056_210_4163_1_1530604483_7572827_218-255858-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 20:25:11,200 WARNING [train.py:1204] (2/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_1285-256622-0 from training. Duration: 0.84 2023-10-03 20:25:12,501 WARNING [train.py:1204] (2/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_205-286261-0_sp1.1 from training. Duration: 0.963625 2023-10-03 20:25:12,523 WARNING [train.py:1204] (2/4) Exclude cut with ID _68777_210_4353_1_1535810390731_3117904_6-154119-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 20:25:12,588 WARNING [train.py:1204] (2/4) Exclude cut with ID _22982_210_1883_1_1531882790447_5449409_565-199393-0_sp1.1 from training. Duration: 0.92725 2023-10-03 20:25:12,597 WARNING [train.py:1204] (2/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_278-154492-0 from training. Duration: 0.76 2023-10-03 20:25:12,782 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.conv_module1.balancer2.prob, batch_count=1395360.0, ans=0.125 2023-10-03 20:25:14,427 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_426-313557-0 from training. Duration: 0.53 2023-10-03 20:25:15,796 WARNING [train.py:1204] (2/4) Exclude cut with ID _41817_210_18835_1_1533434477160_7423049_555-293448-0 from training. Duration: 0.8 2023-10-03 20:25:17,602 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.ff2_skip_rate, batch_count=1395360.0, ans=0.0 2023-10-03 20:25:20,217 WARNING [train.py:1204] (2/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_32-335103-0_sp1.1 from training. Duration: 0.7454375 2023-10-03 20:25:23,036 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_2655_210_2087_1_1525688959_3897400_202-147557-0_sp1.1 from training. Duration: 0.77275 2023-10-03 20:25:24,347 WARNING [train.py:1204] (2/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_158-250116-0 from training. Duration: 0.62 2023-10-03 20:25:27,820 WARNING [train.py:1204] (2/4) Exclude cut with ID _55429_210_10626_1_1534467588527_7174143_528-88083-0_sp1.1 from training. Duration: 0.963625 2023-10-03 20:25:31,181 WARNING [train.py:1204] (2/4) Exclude cut with ID _8030_210_7497_1_1528444886781_3531089_32-31837-0_sp1.1 from training. Duration: 0.8818125 2023-10-03 20:25:31,202 WARNING [train.py:1204] (2/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_744-86168-0_sp1.1 from training. Duration: 0.97275 2023-10-03 20:25:32,420 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1113-7369-0_sp0.9 from training. Duration: 0.9444375 2023-10-03 20:25:32,445 WARNING [train.py:1204] (2/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_124-345840-0_sp0.9 from training. Duration: 0.57775 2023-10-03 20:25:32,499 WARNING [train.py:1204] (2/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_328-264776-0_sp0.9 from training. Duration: 0.7444375 2023-10-03 20:25:33,846 WARNING [train.py:1204] (2/4) Exclude cut with ID _18895_210_6228_1_1531357274234_3784930_469-166615-0_sp1.1 from training. Duration: 0.963625 2023-10-03 20:25:33,854 WARNING [train.py:1204] (2/4) Exclude cut with ID _8395_210_1531_1_1528597871523_4411579_431-132470-0_sp0.9 from training. Duration: 0.82225 2023-10-03 20:25:35,161 WARNING [train.py:1204] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_554-45617-0_sp1.1 from training. Duration: 0.9 2023-10-03 20:25:35,181 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_35361_210_4238_1_1533262810_7793476_524-188242-0_sp1.1 from training. Duration: 0.92725 2023-10-03 20:25:36,715 WARNING [train.py:1204] (2/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_938-205135-0 from training. Duration: 0.87 2023-10-03 20:25:38,076 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_5931_210_2040_1_1527428481_4547999_321-193614-0 from training. Duration: 0.67 2023-10-03 20:25:38,095 WARNING [train.py:1204] (2/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_445-269521-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 20:25:40,657 WARNING [train.py:1204] (2/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_3-33116-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 20:25:40,659 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_4633_210_2040_1_1526824163_4145343_416-76117-0_sp1.1 from training. Duration: 0.863625 2023-10-03 20:25:40,685 WARNING [train.py:1204] (2/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_1033-274376-0_sp1.1 from training. Duration: 0.7909375 2023-10-03 20:25:42,013 INFO [train.py:1046] (2/4) Epoch 40, batch 2150, loss[loss=0.1637, simple_loss=0.247, pruned_loss=0.04022, over 23959.00 frames. ], tot_loss[loss=0.1551, simple_loss=0.2344, pruned_loss=0.03784, over 4701891.21 frames. ], batch size: 80, lr: 2.55e-03, grad_scale: 16.0 2023-10-03 20:25:42,069 WARNING [train.py:1204] (2/4) Exclude cut with ID _8772_210_1850_1_1528635595972_3664011_398-320049-0_sp0.9 from training. Duration: 0.97775 2023-10-03 20:25:45,522 WARNING [train.py:1204] (2/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_321-215742-0_sp0.9 from training. Duration: 0.5555625 2023-10-03 20:25:47,001 WARNING [train.py:1204] (2/4) Exclude cut with ID _28394_210_8913_1_1532430049518_3650118_272-190950-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 20:25:48,333 WARNING [train.py:1204] (2/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_31-299371-0_sp1.1 from training. Duration: 0.92725 2023-10-03 20:25:51,032 WARNING [train.py:1204] (2/4) Exclude cut with ID _55383_210_10635_1_1534468426172_2231669_134-128246-0_sp0.9 from training. Duration: 0.988875 2023-10-03 20:25:51,035 WARNING [train.py:1204] (2/4) Exclude cut with ID _40519_210_13794_1_1533715228655_7337390_887-9547-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 20:25:51,071 WARNING [train.py:1204] (2/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_310-109461-0_sp0.9 from training. Duration: 0.888875 2023-10-03 20:25:51,191 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.0.self_attn_weights.pos_emb_skip_rate, batch_count=1395493.3333333333, ans=0.0 2023-10-03 20:25:55,497 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_2009_210_2071_1_1525481134_4588937_258-250196-0_sp1.1 from training. Duration: 0.92725 2023-10-03 20:25:55,557 WARNING [train.py:1204] (2/4) Exclude cut with ID _9235_210_5219_1_1528891154253_8046120_1186-72702-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 20:25:55,560 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_14831_210_7927_1_1530700200_5583610_181-27467-0_sp1.1 from training. Duration: 0.82725 2023-10-03 20:25:57,178 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.bypass.scale_min, batch_count=1395560.0, ans=0.2 2023-10-03 20:25:59,639 WARNING [train.py:1204] (2/4) Exclude cut with ID _40402_210_18219_1_1533522324115_2953150_278-132109-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 20:25:59,664 WARNING [train.py:1204] (2/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_261-297503-0 from training. Duration: 0.99 2023-10-03 20:26:04,223 WARNING [train.py:1204] (2/4) Exclude cut with ID _19896_210_1883_1_1531709022662_6649569_313-265903-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 20:26:05,592 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_105-114797-0_sp1.1 from training. Duration: 0.763625 2023-10-03 20:26:05,679 WARNING [train.py:1204] (2/4) Exclude cut with ID _53755_210_8254_1_1534577312941_4707360_9-65843-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 20:26:05,699 WARNING [train.py:1204] (2/4) Exclude cut with ID _18516_210_9206_1_1531704653206_3692406_361-224549-0_sp1.1 from training. Duration: 0.97275 2023-10-03 20:26:05,848 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.ff3_skip_rate, batch_count=1395560.0, ans=0.0 2023-10-03 20:26:07,008 WARNING [train.py:1204] (2/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_403-310975-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 20:26:07,042 WARNING [train.py:1204] (2/4) Exclude cut with ID _5444_210_2151_1_1527418808464_5312210_581-315411-0_sp0.9 from training. Duration: 0.688875 2023-10-03 20:26:07,105 WARNING [train.py:1204] (2/4) Exclude cut with ID _23825_210_13449_1_1531915313526_3497150_464-154904-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 20:26:07,114 WARNING [train.py:1204] (2/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_937-208281-0_sp0.9 from training. Duration: 0.9444375 2023-10-03 20:26:08,510 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_37839_210_7234_1_1533542462_3689303_158-181212-0_sp1.1 from training. Duration: 0.963625 2023-10-03 20:26:08,594 WARNING [train.py:1204] (2/4) Exclude cut with ID _8772_210_1850_1_1528635595972_3664011_398-320049-0 from training. Duration: 0.88 2023-10-03 20:26:08,776 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.attention_skip_rate, batch_count=1395560.0, ans=0.0 2023-10-03 20:26:09,989 WARNING [train.py:1204] (2/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_718-250907-0_sp1.1 from training. Duration: 0.62725 2023-10-03 20:26:11,955 WARNING [train.py:1204] (2/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_641-126395-0_sp1.1 from training. Duration: 0.92725 2023-10-03 20:26:13,710 WARNING [train.py:1204] (2/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_166-241493-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 20:26:15,115 WARNING [train.py:1204] (2/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_526-62079-0_sp0.9 from training. Duration: 0.7444375 2023-10-03 20:26:16,057 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.1.encoder.layers.0.self_attn1.whiten, num_groups=1, num_channels=256, metric=13.36 vs. limit=22.5 2023-10-03 20:26:16,474 WARNING [train.py:1204] (2/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_439-275083-0_sp1.1 from training. Duration: 0.936375 2023-10-03 20:26:16,636 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.bypass.skip_rate, batch_count=1395626.6666666667, ans=0.07 2023-10-03 20:26:19,141 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_66907_210_21581_1_1535615661_3590177_493-229032-0_sp1.1 from training. Duration: 0.92725 2023-10-03 20:26:19,168 WARNING [train.py:1204] (2/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_89-321955-0_sp1.1 from training. Duration: 0.72725 2023-10-03 20:26:19,419 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.conv_skip_rate, batch_count=1395626.6666666667, ans=0.0 2023-10-03 20:26:21,716 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.573e+02 1.888e+02 2.102e+02 2.294e+02 3.502e+02, threshold=4.204e+02, percent-clipped=0.0 2023-10-03 20:26:21,860 WARNING [train.py:1204] (2/4) Exclude cut with ID _29857_210_20034_1_1532395886753_3798160_273-226195-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 20:26:21,867 WARNING [train.py:1204] (2/4) Exclude cut with ID _22826_210_9994_1_1531908201522_3279169_404-75889-0 from training. Duration: 0.89 2023-10-03 20:26:21,899 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_271-234962-0_sp0.9 from training. Duration: 0.87775 2023-10-03 20:26:25,306 WARNING [train.py:1204] (2/4) Exclude cut with ID _22961_210_8254_1_1531976366672_4278180_156-339685-0_sp1.1 from training. Duration: 0.97275 2023-10-03 20:26:25,558 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.attention_skip_rate, batch_count=1395693.3333333333, ans=0.0 2023-10-03 20:26:26,510 WARNING [train.py:1204] (2/4) Exclude cut with ID _37892_210_19596_1_1533198677897_3964058_425-231927-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 20:26:27,856 WARNING [train.py:1204] (2/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_102-288670-0_sp1.1 from training. Duration: 0.97275 2023-10-03 20:26:27,943 WARNING [train.py:1204] (2/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_283-220372-0_sp1.1 from training. Duration: 0.5090625 2023-10-03 20:26:28,191 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.conv_module2.balancer2.prob, batch_count=1395693.3333333333, ans=0.125 2023-10-03 20:26:28,504 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.3.encoder.layers.3.feed_forward3.out_whiten, num_groups=1, num_channels=512, metric=13.38 vs. limit=15.0 2023-10-03 20:26:29,271 WARNING [train.py:1204] (2/4) Exclude cut with ID _44304_210_15374_1_1533639352765_4613129_86-1699-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 20:26:30,654 WARNING [train.py:1204] (2/4) Exclude cut with ID _2891_210_2550_1_1525874398521_4427710_327-121590-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 20:26:30,669 WARNING [train.py:1204] (2/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_369-222060-0 from training. Duration: 0.71 2023-10-03 20:26:32,092 WARNING [train.py:1204] (2/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_762-109886-0 from training. Duration: 0.91 2023-10-03 20:26:32,122 WARNING [train.py:1204] (2/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_477-255782-0_sp0.9 from training. Duration: 0.911125 2023-10-03 20:26:32,156 WARNING [train.py:1204] (2/4) Exclude cut with ID IC0669W0061-330906 from training. Duration: 0.768 2023-10-03 20:26:32,197 WARNING [train.py:1204] (2/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_347-213071-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 20:26:32,231 WARNING [train.py:1204] (2/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_507-303370-0_sp1.1 from training. Duration: 0.87275 2023-10-03 20:26:32,328 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.0.attention_skip_rate, batch_count=1395693.3333333333, ans=0.0 2023-10-03 20:26:32,476 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.out_combiner.scale_min, batch_count=1395693.3333333333, ans=0.2 2023-10-03 20:26:34,168 WARNING [train.py:1204] (2/4) Exclude cut with ID _15012_210_1968_1_1530856820429_5095350_688-178849-0 from training. Duration: 0.64 2023-10-03 20:26:34,180 WARNING [train.py:1204] (2/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_28-70987-0_sp0.9 from training. Duration: 0.9666875 2023-10-03 20:26:34,182 WARNING [train.py:1204] (2/4) Exclude cut with ID _5910_210_1389_1_1527337972454_5050100_555-159815-0 from training. Duration: 0.98 2023-10-03 20:26:34,201 WARNING [train.py:1204] (2/4) Exclude cut with ID IC0679W0064-335871 from training. Duration: 0.768 2023-10-03 20:26:34,202 WARNING [train.py:1204] (2/4) Exclude cut with ID IC0679W0079-335886 from training. Duration: 0.896 2023-10-03 20:26:35,401 WARNING [train.py:1204] (2/4) Exclude cut with ID _5608_210_2106_1_1527415439089_6819768_909-82767-0 from training. Duration: 0.61 2023-10-03 20:26:36,790 WARNING [train.py:1204] (2/4) Exclude cut with ID _23038_210_9570_1_1532001584158_3652419_411-323967-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 20:26:36,866 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_40896_210_15710_1_1533433956_7100802_350-231748-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 20:26:38,113 WARNING [train.py:1204] (2/4) Exclude cut with ID _5289_210_2084_1_1527678202736_3779299_142-103317-0_sp0.9 from training. Duration: 0.9333125 2023-10-03 20:26:39,445 WARNING [train.py:1204] (2/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_314-144055-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 20:26:39,513 WARNING [train.py:1204] (2/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_177-236809-0_sp0.9 from training. Duration: 0.5444375 2023-10-03 20:26:39,712 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.ff3_skip_rate, batch_count=1395760.0, ans=0.0 2023-10-03 20:26:41,399 WARNING [train.py:1204] (2/4) Exclude cut with ID _23038_210_9570_1_1532001584158_3652419_82-334733-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 20:26:41,406 WARNING [train.py:1204] (2/4) Exclude cut with ID _43552_210_8913_1_1533726031059_3523429_225-22526-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 20:26:50,397 WARNING [train.py:1204] (2/4) Exclude cut with ID _30327_210_16049_1_1532484161295_3924169_401-70819-0_sp1.1 from training. Duration: 0.7818125 2023-10-03 20:26:51,715 WARNING [train.py:1204] (2/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_538-32291-0 from training. Duration: 0.83 2023-10-03 20:26:52,004 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.balancer2.prob, batch_count=1395760.0, ans=0.125 2023-10-03 20:26:54,604 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_37357_210_17024_1_1533089504_2316121_258-211605-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 20:26:56,438 INFO [train.py:1046] (2/4) Epoch 40, batch 2200, loss[loss=0.16, simple_loss=0.2522, pruned_loss=0.03392, over 24563.00 frames. ], tot_loss[loss=0.1554, simple_loss=0.2348, pruned_loss=0.03802, over 4702073.80 frames. ], batch size: 71, lr: 2.55e-03, grad_scale: 16.0 2023-10-03 20:27:01,814 WARNING [train.py:1204] (2/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_550-335198-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 20:27:01,876 WARNING [train.py:1204] (2/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_98-741-0_sp0.9 from training. Duration: 0.888875 2023-10-03 20:27:03,150 WARNING [train.py:1204] (2/4) Exclude cut with ID _38323_210_12108_1_1533121233369_3303049_251-238988-0_sp1.1 from training. Duration: 0.92725 2023-10-03 20:27:03,238 WARNING [train.py:1204] (2/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_259-34038-0_sp0.9 from training. Duration: 0.82225 2023-10-03 20:27:06,546 WARNING [train.py:1204] (2/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_1060-180633-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 20:27:06,577 WARNING [train.py:1204] (2/4) Exclude cut with ID _8755_210_1876_1_1528628559357_3258209_211-37473-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 20:27:06,583 WARNING [train.py:1204] (2/4) Exclude cut with ID _4122_210_2084_1_1526713164417_4353280_528-206771-0 from training. Duration: 0.88 2023-10-03 20:27:10,874 WARNING [train.py:1204] (2/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_64-248359-0 from training. Duration: 0.69 2023-10-03 20:27:14,007 WARNING [train.py:1204] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_402-253027-0_sp1.1 from training. Duration: 0.4909375 2023-10-03 20:27:21,185 WARNING [train.py:1204] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_625-102955-0 from training. Duration: 0.77 2023-10-03 20:27:22,842 WARNING [train.py:1204] (2/4) Exclude cut with ID _14998_210_5219_1_1530705582080_7474970_611-219324-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 20:27:24,240 WARNING [train.py:1204] (2/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_718-68505-0_sp0.9 from training. Duration: 0.988875 2023-10-03 20:27:24,289 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_353-182855-0_sp1.1 from training. Duration: 0.87275 2023-10-03 20:27:25,959 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.out_combiner.scale_min, batch_count=1395960.0, ans=0.2 2023-10-03 20:27:27,020 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_5960_210_1794_1_1527403865_3990999_515-2613-0_sp0.9 from training. Duration: 0.97775 2023-10-03 20:27:27,048 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_5575_210_1410_1_1527301305_2570170_342-265572-0 from training. Duration: 0.94 2023-10-03 20:27:30,331 WARNING [train.py:1204] (2/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_329-113173-0_sp0.9 from training. Duration: 0.688875 2023-10-03 20:27:30,529 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.bypass_mid.scale_min, batch_count=1395960.0, ans=0.2 2023-10-03 20:27:31,697 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_15396_210_6890_1_1531308473_1938936_213-312506-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 20:27:31,762 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_521-214276-0_sp1.1 from training. Duration: 0.4 2023-10-03 20:27:34,796 WARNING [train.py:1204] (2/4) Exclude cut with ID _15026_210_8750_1_1530777602316_3291850_211-195043-0_sp1.1 from training. Duration: 0.72725 2023-10-03 20:27:36,729 WARNING [train.py:1204] (2/4) Exclude cut with ID _29343_210_4381_1_1532602757752_3674109_108-257494-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 20:27:38,238 WARNING [train.py:1204] (2/4) Exclude cut with ID _64414_210_17301_1_1535243252048_3757759_403-58151-0_sp1.1 from training. Duration: 0.963625 2023-10-03 20:27:39,538 WARNING [train.py:1204] (2/4) Exclude cut with ID _36512_210_6987_1_1533033143998_3126069_109-177787-0_sp1.1 from training. Duration: 0.92725 2023-10-03 20:27:41,018 WARNING [train.py:1204] (2/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_498-40153-0 from training. Duration: 0.63 2023-10-03 20:27:44,021 WARNING [train.py:1204] (2/4) Exclude cut with ID _56447_210_1389_1_1535074245910_3697189_161-334523-0_sp1.1 from training. Duration: 0.936375 2023-10-03 20:27:45,496 WARNING [train.py:1204] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_238-245655-0 from training. Duration: 0.81 2023-10-03 20:27:48,818 WARNING [train.py:1204] (2/4) Exclude cut with ID _64631_210_27767_1_1535349580068_4165160_108-43865-0_sp1.1 from training. Duration: 0.92725 2023-10-03 20:27:48,823 WARNING [train.py:1204] (2/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_243-42116-0_sp1.1 from training. Duration: 0.663625 2023-10-03 20:27:50,104 WARNING [train.py:1204] (2/4) Exclude cut with ID _66868_210_9527_1_1535509287954_4561140_594-132160-0_sp1.1 from training. Duration: 0.92725 2023-10-03 20:27:51,538 WARNING [train.py:1204] (2/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_48-338976-0_sp0.9 from training. Duration: 0.988875 2023-10-03 20:27:52,841 WARNING [train.py:1204] (2/4) Exclude cut with ID _5250_210_5219_1_1527249561806_8692110_137-100595-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 20:27:52,847 WARNING [train.py:1204] (2/4) Exclude cut with ID _49748_210_24116_1_1533986971737_3158910_12-271669-0_sp1.1 from training. Duration: 0.936375 2023-10-03 20:27:52,872 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_37357_210_17024_1_1533089504_2316121_206-80421-0_sp1.1 from training. Duration: 0.936375 2023-10-03 20:27:54,249 WARNING [train.py:1204] (2/4) Exclude cut with ID _7537_210_2084_1_1528502395431_3924359_113-199933-0_sp1.1 from training. Duration: 0.536375 2023-10-03 20:27:55,616 WARNING [train.py:1204] (2/4) Exclude cut with ID _55440_210_12651_1_1534496713364_2980520_31-59289-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 20:27:57,087 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1083-220122-0_sp0.9 from training. Duration: 0.5666875 2023-10-03 20:27:59,180 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_426-313557-0_sp1.1 from training. Duration: 0.4818125 2023-10-03 20:28:00,283 WARNING [train.py:1204] (2/4) Exclude cut with ID _35799_210_5854_1_1533263487887_3529071_324-94843-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 20:28:01,705 WARNING [train.py:1204] (2/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_264-108685-0_sp1.1 from training. Duration: 0.5 2023-10-03 20:28:03,086 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9012W0006-501141 from training. Duration: 0.896 2023-10-03 20:28:05,818 WARNING [train.py:1204] (2/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_890-5117-0_sp0.9 from training. Duration: 0.8666875 2023-10-03 20:28:05,862 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9023W0008-506301 from training. Duration: 0.896 2023-10-03 20:28:07,258 WARNING [train.py:1204] (2/4) Exclude cut with ID _13916_210_9994_1_1530788341126_3763149_60-191556-0_sp0.9 from training. Duration: 0.67775 2023-10-03 20:28:07,303 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9029W0331-509721 from training. Duration: 0.896 2023-10-03 20:28:08,657 WARNING [train.py:1204] (2/4) Exclude cut with ID _35823_210_6917_1_1533187881914_7297830_567-108596-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 20:28:09,904 INFO [train.py:1046] (2/4) Epoch 40, batch 2250, loss[loss=0.1488, simple_loss=0.2273, pruned_loss=0.03511, over 23233.00 frames. ], tot_loss[loss=0.1559, simple_loss=0.2353, pruned_loss=0.03828, over 4714955.74 frames. ], batch size: 105, lr: 2.55e-03, grad_scale: 16.0 2023-10-03 20:28:09,950 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7543_210_6753_1_1528544010_1110402_25-183860-0_sp1.1 from training. Duration: 0.42725 2023-10-03 20:28:12,027 WARNING [train.py:1204] (2/4) Exclude cut with ID _47278_210_12041_1_1533902418349_3576110_495-260035-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 20:28:13,498 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9051W0015-520281 from training. Duration: 0.9045625 2023-10-03 20:28:14,766 WARNING [train.py:1204] (2/4) Exclude cut with ID _14459_210_2228_1_1531443641946_7516700_707-66193-0_sp1.1 from training. Duration: 0.97275 2023-10-03 20:28:18,023 WARNING [train.py:1204] (2/4) Exclude cut with ID _14087_210_2253_1_1530529224512_3014029_219-224445-0_sp1.1 from training. Duration: 0.763625 2023-10-03 20:28:21,567 WARNING [train.py:1204] (2/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_345-167986-0_sp0.9 from training. Duration: 0.9333125 2023-10-03 20:28:22,912 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_34815_210_6388_1_1533381320_3571903_279-305565-0_sp1.1 from training. Duration: 0.836375 2023-10-03 20:28:25,862 WARNING [train.py:1204] (2/4) Exclude cut with ID _21987_210_5854_1_1531821500482_3637038_317-179212-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 20:28:27,101 WARNING [train.py:1204] (2/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_395-37222-0_sp1.1 from training. Duration: 0.6909375 2023-10-03 20:28:27,206 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.0.feed_forward3.hidden_balancer.prob, batch_count=1396226.6666666667, ans=0.125 2023-10-03 20:28:28,431 WARNING [train.py:1204] (2/4) Exclude cut with ID _8064_210_5399_1_1528372299291_4282973_168-50337-0_sp1.1 from training. Duration: 0.763625 2023-10-03 20:28:30,552 WARNING [train.py:1204] (2/4) Exclude cut with ID _60113_210_16271_1_1535164212481_4386079_559-167191-0 from training. Duration: 0.93 2023-10-03 20:28:31,566 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_47228_210_4983_1_1533780021_3672027_344-179642-0_sp1.1 from training. Duration: 0.92725 2023-10-03 20:28:31,592 WARNING [train.py:1204] (2/4) Exclude cut with ID _22961_210_8254_1_1531976366672_4278180_317-199546-0_sp0.9 from training. Duration: 0.9555625 2023-10-03 20:28:33,310 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.attention_skip_rate, batch_count=1396226.6666666667, ans=0.0 2023-10-03 20:28:34,523 WARNING [train.py:1204] (2/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_141-89727-0 from training. Duration: 0.71 2023-10-03 20:28:34,683 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.conv_skip_rate, batch_count=1396226.6666666667, ans=0.0 2023-10-03 20:28:35,772 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_78-116075-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 20:28:35,783 WARNING [train.py:1204] (2/4) Exclude cut with ID _64086_210_7994_1_1535270417019_7435949_52-235756-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 20:28:37,171 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7615_210_4211_1_1528110965_4580160_72-235211-0_sp1.1 from training. Duration: 0.6909375 2023-10-03 20:28:43,376 WARNING [train.py:1204] (2/4) Exclude cut with ID _14069_210_5632_1_1530512692033_4803390_287-27950-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 20:28:44,782 WARNING [train.py:1204] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_74-48717-0_sp0.9 from training. Duration: 0.6444375 2023-10-03 20:28:44,805 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7544_210_6753_1_1528591327_1471426_172-348777-0_sp1.1 from training. Duration: 0.6 2023-10-03 20:28:46,160 WARNING [train.py:1204] (2/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_220-191080-0 from training. Duration: 0.98 2023-10-03 20:28:47,468 WARNING [train.py:1204] (2/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_1081-137641-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 20:28:50,638 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.442e+02 1.901e+02 2.050e+02 2.328e+02 3.368e+02, threshold=4.100e+02, percent-clipped=0.0 2023-10-03 20:28:50,702 WARNING [train.py:1204] (2/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_278-235459-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 20:28:52,380 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.balancer1.prob, batch_count=1396293.3333333333, ans=0.125 2023-10-03 20:28:55,421 WARNING [train.py:1204] (2/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_1181-77470-0_sp1.1 from training. Duration: 0.936375 2023-10-03 20:28:56,867 WARNING [train.py:1204] (2/4) Exclude cut with ID _60860_210_8254_1_1534896015875_3557169_198-5531-0_sp1.1 from training. Duration: 0.936375 2023-10-03 20:28:58,207 WARNING [train.py:1204] (2/4) Exclude cut with ID _12369_210_4381_1_1530356078581_4388520_17-289781-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 20:28:58,212 WARNING [train.py:1204] (2/4) Exclude cut with ID _50310_210_15488_1_1534068043345_3641080_146-199752-0_sp1.1 from training. Duration: 0.92725 2023-10-03 20:28:59,651 WARNING [train.py:1204] (2/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_969-213354-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 20:29:01,049 WARNING [train.py:1204] (2/4) Exclude cut with ID _70519_210_4163_1_1535961595994_7458230_66-344156-0_sp1.1 from training. Duration: 0.963625 2023-10-03 20:29:05,736 WARNING [train.py:1204] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_380-204756-0_sp1.1 from training. Duration: 0.7818125 2023-10-03 20:29:07,200 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_128-202104-0_sp0.9 from training. Duration: 0.7 2023-10-03 20:29:10,084 WARNING [train.py:1204] (2/4) Exclude cut with ID _4697_210_2069_1_1526815801953_3823098_51-1049-0_sp0.9 from training. Duration: 0.8333125 2023-10-03 20:29:10,113 WARNING [train.py:1204] (2/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_86-66192-0_sp1.1 from training. Duration: 0.5 2023-10-03 20:29:11,991 WARNING [train.py:1204] (2/4) Exclude cut with ID _14069_210_5632_1_1530512692033_4803390_95-1896-0_sp1.1 from training. Duration: 0.8909375 2023-10-03 20:29:16,215 WARNING [train.py:1204] (2/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_29-293896-0_sp1.1 from training. Duration: 0.5545625 2023-10-03 20:29:19,027 WARNING [train.py:1204] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_167-137255-0_sp0.9 from training. Duration: 0.711125 2023-10-03 20:29:19,027 WARNING [train.py:1204] (2/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_217-95056-0 from training. Duration: 0.98 2023-10-03 20:29:19,042 WARNING [train.py:1204] (2/4) Exclude cut with ID _47278_210_12041_1_1533902418349_3576110_97-266746-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 20:29:19,103 WARNING [train.py:1204] (2/4) Exclude cut with ID _14224_210_4381_1_1530529062037_4264010_380-343670-0_sp1.1 from training. Duration: 0.8 2023-10-03 20:29:23,712 WARNING [train.py:1204] (2/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_107-332398-0 from training. Duration: 0.78 2023-10-03 20:29:25,040 INFO [train.py:1046] (2/4) Epoch 40, batch 2300, loss[loss=0.1616, simple_loss=0.2489, pruned_loss=0.03719, over 24059.00 frames. ], tot_loss[loss=0.1558, simple_loss=0.2353, pruned_loss=0.03813, over 4726723.35 frames. ], batch size: 80, lr: 2.54e-03, grad_scale: 16.0 2023-10-03 20:29:25,181 WARNING [train.py:1204] (2/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_91-267696-0_sp1.1 from training. Duration: 0.7909375 2023-10-03 20:29:25,208 WARNING [train.py:1204] (2/4) Exclude cut with ID _41298_210_9019_1_1533430371450_4822143_498-186993-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 20:29:29,567 WARNING [train.py:1204] (2/4) Exclude cut with ID _17956_210_4353_1_1531209568125_6979970_54-77610-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 20:29:30,846 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_40047_210_2290_1_1533290519_2374311_257-335162-0_sp1.1 from training. Duration: 0.863625 2023-10-03 20:29:33,973 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9653W0193-675471 from training. Duration: 0.9656875 2023-10-03 20:29:35,259 WARNING [train.py:1204] (2/4) Exclude cut with ID _25248_210_8750_1_1532062825941_5554180_243-240536-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 20:29:35,532 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.conv_skip_rate, batch_count=1396493.3333333333, ans=0.0 2023-10-03 20:29:40,934 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_32360_210_4198_1_1532689160_3648436_60-51908-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 20:29:40,945 WARNING [train.py:1204] (2/4) Exclude cut with ID _4092_210_2151_1_1526643044449_3135759_227-213972-0_sp0.9 from training. Duration: 0.6 2023-10-03 20:29:42,778 WARNING [train.py:1204] (2/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_392-173500-0_sp1.1 from training. Duration: 0.97275 2023-10-03 20:29:42,817 WARNING [train.py:1204] (2/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_71-329150-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 20:29:42,819 WARNING [train.py:1204] (2/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_866-173440-0 from training. Duration: 0.9 2023-10-03 20:29:42,895 WARNING [train.py:1204] (2/4) Exclude cut with ID _14832_210_8750_1_1530691351539_3171348_1-54315-0_sp0.9 from training. Duration: 0.9666875 2023-10-03 20:29:45,728 WARNING [train.py:1204] (2/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_430-116139-0_sp1.1 from training. Duration: 0.77275 2023-10-03 20:29:45,753 WARNING [train.py:1204] (2/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_356-45054-0_sp0.9 from training. Duration: 0.9555625 2023-10-03 20:29:45,995 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.feed_forward3.hidden_balancer.prob, batch_count=1396560.0, ans=0.125 2023-10-03 20:29:49,055 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_3531_210_3528_1_1526294203_5216456_309-227302-0_sp0.9 from training. Duration: 0.7666875 2023-10-03 20:29:49,284 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.conv_module1.balancer1.prob, batch_count=1396560.0, ans=0.125 2023-10-03 20:29:52,344 WARNING [train.py:1204] (2/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_301-330455-0_sp1.1 from training. Duration: 0.72725 2023-10-03 20:29:53,018 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.3.encoder.layers.2.conv_module1.whiten, num_groups=1, num_channels=512, metric=7.70 vs. limit=15.0 2023-10-03 20:29:55,089 WARNING [train.py:1204] (2/4) Exclude cut with ID _23557_210_8750_1_1531890106370_6846255_667-145945-0_sp1.1 from training. Duration: 0.92725 2023-10-03 20:30:00,469 WARNING [train.py:1204] (2/4) Exclude cut with ID _14860_210_4915_1_1530702020340_7343240_836-328489-0_sp1.1 from training. Duration: 0.8181875 2023-10-03 20:30:00,511 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_37152_210_9697_1_1533106573_3844188_416-333249-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 20:30:05,305 WARNING [train.py:1204] (2/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_538-32291-0_sp0.9 from training. Duration: 0.92225 2023-10-03 20:30:05,492 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.balancer2.prob, batch_count=1396626.6666666667, ans=0.125 2023-10-03 20:30:06,671 WARNING [train.py:1204] (2/4) Exclude cut with ID _32704_210_8954_1_1533017488840_6747370_965-176824-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 20:30:09,513 WARNING [train.py:1204] (2/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_86-19601-0_sp1.1 from training. Duration: 0.77275 2023-10-03 20:30:10,864 WARNING [train.py:1204] (2/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_436-318113-0_sp1.1 from training. Duration: 0.6818125 2023-10-03 20:30:12,174 WARNING [train.py:1204] (2/4) Exclude cut with ID _41817_210_18835_1_1533434477160_7423049_555-293448-0_sp0.9 from training. Duration: 0.888875 2023-10-03 20:30:12,185 WARNING [train.py:1204] (2/4) Exclude cut with ID _4697_210_2069_1_1526815801953_3823098_68-219807-0 from training. Duration: 0.47 2023-10-03 20:30:18,289 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7544_210_6753_1_1528591327_1471426_33-346031-0_sp1.1 from training. Duration: 0.5545625 2023-10-03 20:30:18,291 WARNING [train.py:1204] (2/4) Exclude cut with ID _49748_210_24116_1_1533986971737_3158910_263-52113-0_sp1.1 from training. Duration: 0.97275 2023-10-03 20:30:18,330 WARNING [train.py:1204] (2/4) Exclude cut with ID _64639_210_5816_1_1535367619437_3528000_340-24580-0_sp1.1 from training. Duration: 0.92725 2023-10-03 20:30:18,344 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_47228_210_4983_1_1533780021_3672027_479-114941-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 20:30:18,382 WARNING [train.py:1204] (2/4) Exclude cut with ID _44585_210_18628_1_1533605350387_4518199_506-119659-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 20:30:18,453 WARNING [train.py:1204] (2/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_314-126819-0_sp1.1 from training. Duration: 0.4545625 2023-10-03 20:30:19,100 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.3.encoder.layers.2.feed_forward3.out_whiten, num_groups=1, num_channels=512, metric=7.45 vs. limit=15.0 2023-10-03 20:30:19,788 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_2541_210_1794_1_1525590269_3084765_156-173713-0_sp0.9 from training. Duration: 0.9 2023-10-03 20:30:19,849 WARNING [train.py:1204] (2/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_108-255430-0 from training. Duration: 0.97 2023-10-03 20:30:19,857 WARNING [train.py:1204] (2/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_791-45149-0_sp1.1 from training. Duration: 0.7818125 2023-10-03 20:30:19,863 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_53378_210_17282_1_1534227054_1261425_10-118295-0_sp1.1 from training. Duration: 0.97275 2023-10-03 20:30:20,843 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.0.layers.0.feed_forward2.out_whiten, num_groups=1, num_channels=192, metric=5.81 vs. limit=15.0 2023-10-03 20:30:21,240 WARNING [train.py:1204] (2/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_148-158509-0 from training. Duration: 0.41 2023-10-03 20:30:27,919 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.3.encoder.layers.3.feed_forward2.out_whiten, num_groups=1, num_channels=512, metric=4.46 vs. limit=15.0 2023-10-03 20:30:28,604 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_34726_210_4525_1_1533019474_5030395_687-291305-0_sp1.1 from training. Duration: 0.936375 2023-10-03 20:30:30,289 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.balancer_ff3.min_abs, batch_count=1396760.0, ans=0.2 2023-10-03 20:30:32,637 WARNING [train.py:1204] (2/4) Exclude cut with ID _14860_210_4915_1_1530702020340_7343240_264-324537-0_sp1.1 from training. Duration: 0.963625 2023-10-03 20:30:34,203 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_41251_210_15368_1_1533429767_4820695_591-46386-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 20:30:35,560 WARNING [train.py:1204] (2/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_86-19601-0_sp0.9 from training. Duration: 0.9444375 2023-10-03 20:30:35,587 WARNING [train.py:1204] (2/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_401-255491-0_sp0.9 from training. Duration: 0.77775 2023-10-03 20:30:36,975 WARNING [train.py:1204] (2/4) Exclude cut with ID _8696_210_1876_1_1528545632440_3345058_209-54069-0_sp1.1 from training. Duration: 0.6090625 2023-10-03 20:30:38,905 INFO [train.py:1046] (2/4) Epoch 40, batch 2350, loss[loss=0.1405, simple_loss=0.2197, pruned_loss=0.0306, over 19482.00 frames. ], tot_loss[loss=0.1562, simple_loss=0.2359, pruned_loss=0.03825, over 4721431.80 frames. ], batch size: 42, lr: 2.54e-03, grad_scale: 16.0 2023-10-03 20:30:38,967 WARNING [train.py:1204] (2/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_369-325184-0_sp1.1 from training. Duration: 0.9 2023-10-03 20:30:39,044 WARNING [train.py:1204] (2/4) Exclude cut with ID _14687_210_5854_1_1530703838259_3664889_115-239046-0_sp0.9 from training. Duration: 0.7444375 2023-10-03 20:30:39,294 INFO [scaling.py:1118] (2/4) WithLoss: name=encoder.encoders.4.encoder.layers.2.self_attn_weights, loss-sum=0.000e+00 2023-10-03 20:30:40,391 WARNING [train.py:1204] (2/4) Exclude cut with ID _8064_210_5399_1_1528372299291_4282973_168-50337-0 from training. Duration: 0.84 2023-10-03 20:30:45,863 WARNING [train.py:1204] (2/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_909-64151-0_sp1.1 from training. Duration: 0.8545625 2023-10-03 20:30:45,880 WARNING [train.py:1204] (2/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_597-154640-0 from training. Duration: 0.9 2023-10-03 20:30:49,734 WARNING [train.py:1204] (2/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_126-274694-0 from training. Duration: 0.98 2023-10-03 20:30:52,518 WARNING [train.py:1204] (2/4) Exclude cut with ID _21783_210_11357_1_1531742458730_3762679_344-269786-0_sp1.1 from training. Duration: 0.97275 2023-10-03 20:30:54,049 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.bypass.scale_min, batch_count=1396893.3333333333, ans=0.2 2023-10-03 20:30:56,897 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_37818_210_4238_1_1533620910_4720083_322-253154-0_sp1.1 from training. Duration: 0.92725 2023-10-03 20:30:56,900 WARNING [train.py:1204] (2/4) Exclude cut with ID _58895_210_8750_1_1535184081320_3649089_221-15823-0_sp1.1 from training. Duration: 0.92725 2023-10-03 20:30:56,917 WARNING [train.py:1204] (2/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_9-21737-0_sp1.1 from training. Duration: 0.8818125 2023-10-03 20:30:56,962 WARNING [train.py:1204] (2/4) Exclude cut with ID _14069_210_5632_1_1530512692033_4803390_522-341588-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 20:30:58,393 WARNING [train.py:1204] (2/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_363-185703-0 from training. Duration: 0.95 2023-10-03 20:30:59,997 WARNING [train.py:1204] (2/4) Exclude cut with ID _41817_210_18835_1_1533434477160_7423049_265-76216-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 20:31:00,126 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.conv_module2.balancer2.prob, batch_count=1396893.3333333333, ans=0.125 2023-10-03 20:31:06,662 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.3.encoder.layers.2.nonlin_attention.whiten1, num_groups=1, num_channels=384, metric=5.60 vs. limit=10.0 2023-10-03 20:31:07,270 WARNING [train.py:1204] (2/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_841-341679-0 from training. Duration: 0.58 2023-10-03 20:31:08,680 WARNING [train.py:1204] (2/4) Exclude cut with ID _55429_210_10626_1_1534467588527_7174143_97-335084-0_sp1.1 from training. Duration: 0.8818125 2023-10-03 20:31:10,092 WARNING [train.py:1204] (2/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_690-126306-0_sp0.9 from training. Duration: 0.8666875 2023-10-03 20:31:10,106 WARNING [train.py:1204] (2/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_102-92751-0_sp1.1 from training. Duration: 0.9 2023-10-03 20:31:11,614 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.bypass.skip_rate, batch_count=1396960.0, ans=0.04949747468305833 2023-10-03 20:31:12,907 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_3787_210_2040_1_1526477544_5478586_603-120324-0_sp1.1 from training. Duration: 0.736375 2023-10-03 20:31:14,359 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_33348_210_5632_1_1533086912_3324030_85-28583-0 from training. Duration: 0.82 2023-10-03 20:31:14,421 WARNING [train.py:1204] (2/4) Exclude cut with ID _60057_210_10640_1_1534903065793_3333150_362-230377-0_sp1.1 from training. Duration: 0.7909375 2023-10-03 20:31:17,754 WARNING [train.py:1204] (2/4) Exclude cut with ID _60113_210_16271_1_1535164212481_4386079_194-146486-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 20:31:17,767 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_53753_210_7234_1_1534320003_3594148_171-277668-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 20:31:17,795 WARNING [train.py:1204] (2/4) Exclude cut with ID _34423_210_8750_1_1533430798456_3330199_251-332788-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 20:31:19,436 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.621e+02 2.056e+02 2.202e+02 2.513e+02 4.106e+02, threshold=4.405e+02, percent-clipped=1.0 2023-10-03 20:31:21,015 WARNING [train.py:1204] (2/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_663-166973-0_sp0.9 from training. Duration: 0.888875 2023-10-03 20:31:23,590 WARNING [train.py:1204] (2/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_927-43827-0 from training. Duration: 0.74 2023-10-03 20:31:24,940 WARNING [train.py:1204] (2/4) Exclude cut with ID _28323_210_16187_1_1532480395667_4100300_306-74561-0_sp1.1 from training. Duration: 0.8545625 2023-10-03 20:31:26,548 WARNING [train.py:1204] (2/4) Exclude cut with ID _15037_210_2253_1_1530752294104_3267650_139-337989-0_sp1.1 from training. Duration: 0.92725 2023-10-03 20:31:26,565 WARNING [train.py:1204] (2/4) Exclude cut with ID _49039_210_6315_1_1533967537541_3846118_460-263670-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 20:31:29,942 WARNING [train.py:1204] (2/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_186-309161-0 from training. Duration: 0.54 2023-10-03 20:31:30,018 WARNING [train.py:1204] (2/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_87-278686-0_sp1.1 from training. Duration: 0.7 2023-10-03 20:31:32,225 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.1.encoder.layers.1.conv_module1.whiten, num_groups=1, num_channels=256, metric=12.54 vs. limit=15.0 2023-10-03 20:31:32,742 WARNING [train.py:1204] (2/4) Exclude cut with ID _27052_210_5986_1_1532329197187_3585009_314-106499-0 from training. Duration: 0.92 2023-10-03 20:31:32,758 WARNING [train.py:1204] (2/4) Exclude cut with ID _3465_210_3525_1_1526175667060_3574049_412-287526-0_sp0.9 from training. Duration: 0.8 2023-10-03 20:31:38,591 WARNING [train.py:1204] (2/4) Exclude cut with ID _22961_210_8254_1_1531976366672_4278180_317-199546-0 from training. Duration: 0.86 2023-10-03 20:31:40,022 WARNING [train.py:1204] (2/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_381-317791-0 from training. Duration: 0.73 2023-10-03 20:31:41,389 WARNING [train.py:1204] (2/4) Exclude cut with ID _44010_210_4525_1_1533884262964_4013620_560-310411-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 20:31:41,396 WARNING [train.py:1204] (2/4) Exclude cut with ID _5294_210_1747_1_1527249643928_3696179_342-270278-0_sp0.9 from training. Duration: 0.611125 2023-10-03 20:31:41,426 WARNING [train.py:1204] (2/4) Exclude cut with ID ID0473W0002-918951 from training. Duration: 0.924 2023-10-03 20:31:41,444 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1083-220122-0 from training. Duration: 0.51 2023-10-03 20:31:44,159 WARNING [train.py:1204] (2/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_138-154812-0 from training. Duration: 0.78 2023-10-03 20:31:46,842 WARNING [train.py:1204] (2/4) Exclude cut with ID _44982_210_1511_1_1534312820108_3726029_331-19420-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 20:31:50,648 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_2655_210_2087_1_1525688959_3897400_202-147557-0_sp0.9 from training. Duration: 0.9444375 2023-10-03 20:31:53,162 INFO [train.py:1046] (2/4) Epoch 40, batch 2400, loss[loss=0.1626, simple_loss=0.2478, pruned_loss=0.0387, over 24087.00 frames. ], tot_loss[loss=0.1563, simple_loss=0.2358, pruned_loss=0.03839, over 4717421.84 frames. ], batch size: 80, lr: 2.54e-03, grad_scale: 32.0 2023-10-03 20:31:54,735 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_23850_210_4238_1_1532232597_1632433_32-185135-0_sp0.9 from training. Duration: 0.9555625 2023-10-03 20:31:57,484 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_14056_210_4163_1_1530604483_7572827_46-93547-0_sp1.1 from training. Duration: 0.863625 2023-10-03 20:31:59,314 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7931_210_1794_1_1528594728_5564059_280-279560-0 from training. Duration: 0.98 2023-10-03 20:31:59,351 WARNING [train.py:1204] (2/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_937-208281-0 from training. Duration: 0.85 2023-10-03 20:32:04,111 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.4.encoder.layers.0.self_attn_weights.whiten_keys, num_groups=4, num_channels=128, metric=1.58 vs. limit=6.0 2023-10-03 20:32:06,343 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_14928_210_5632_1_1530773461_4714000_454-112198-0_sp0.9 from training. Duration: 0.7333125 2023-10-03 20:32:06,344 WARNING [train.py:1204] (2/4) Exclude cut with ID _32704_210_8954_1_1533017488840_6747370_944-153333-0_sp1.1 from training. Duration: 0.963625 2023-10-03 20:32:08,227 WARNING [train.py:1204] (2/4) Exclude cut with ID _14821_210_8750_1_1530680503991_6829359_2-227151-0 from training. Duration: 0.83 2023-10-03 20:32:08,251 WARNING [train.py:1204] (2/4) Exclude cut with ID _28461_210_2253_1_1532771775189_3041299_96-152158-0_sp1.1 from training. Duration: 0.77275 2023-10-03 20:32:09,556 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7837_210_1792_1_1528453445_5841305_654-54195-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 20:32:09,606 WARNING [train.py:1204] (2/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_344-152526-0 from training. Duration: 0.53 2023-10-03 20:32:13,820 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.bypass.skip_rate, batch_count=1397226.6666666667, ans=0.07 2023-10-03 20:32:16,452 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_30167_210_3528_1_1532428012_4772375_480-142230-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 20:32:17,940 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_160-218721-0 from training. Duration: 0.96 2023-10-03 20:32:22,700 WARNING [train.py:1204] (2/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_65-88242-0_sp0.9 from training. Duration: 0.62225 2023-10-03 20:32:24,416 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.nonlin_attention.balancer.prob, batch_count=1397293.3333333333, ans=0.125 2023-10-03 20:32:26,928 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_34886_210_10633_1_1533203882_1755661_76-156096-0 from training. Duration: 0.87 2023-10-03 20:32:27,129 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.ff3_skip_rate, batch_count=1397293.3333333333, ans=0.0 2023-10-03 20:32:28,359 WARNING [train.py:1204] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_188-38388-0_sp1.1 from training. Duration: 0.92725 2023-10-03 20:32:30,278 WARNING [train.py:1204] (2/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_81-289335-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 20:32:33,378 WARNING [train.py:1204] (2/4) Exclude cut with ID _59493_210_17282_1_1535078419077_3225879_110-8778-0_sp1.1 from training. Duration: 0.936375 2023-10-03 20:32:33,419 WARNING [train.py:1204] (2/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_188-260011-0 from training. Duration: 0.73 2023-10-03 20:32:34,699 WARNING [train.py:1204] (2/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_453-133983-0_sp0.9 from training. Duration: 0.8666875 2023-10-03 20:32:41,926 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8054_210_7927_1_1528627721_4984094_553-17379-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 20:32:42,169 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.feed_forward2.hidden_balancer.prob, batch_count=1397360.0, ans=0.125 2023-10-03 20:32:44,621 WARNING [train.py:1204] (2/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_341-171072-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 20:32:47,356 WARNING [train.py:1204] (2/4) Exclude cut with ID _30800_210_22382_1_1532499347904_4515959_265-257286-0_sp1.1 from training. Duration: 0.963625 2023-10-03 20:32:48,770 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_403-39619-0_sp0.9 from training. Duration: 0.9333125 2023-10-03 20:32:48,775 WARNING [train.py:1204] (2/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_464-33931-0_sp0.9 from training. Duration: 0.6 2023-10-03 20:32:48,797 WARNING [train.py:1204] (2/4) Exclude cut with ID _60804_210_9642_1_1534831199859_7268299_632-143863-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 20:32:48,804 WARNING [train.py:1204] (2/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_431-310997-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 20:32:50,353 WARNING [train.py:1204] (2/4) Exclude cut with ID _31649_210_6228_1_1532584608063_4091078_114-263860-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 20:32:50,365 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6961_210_2087_1_1527937168_6815011_562-165518-0_sp0.9 from training. Duration: 0.8555625 2023-10-03 20:32:54,031 WARNING [train.py:1204] (2/4) Exclude cut with ID _27052_210_5986_1_1532329197187_3585009_338-325870-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 20:32:55,356 WARNING [train.py:1204] (2/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_469-94673-0_sp1.1 from training. Duration: 0.7181875 2023-10-03 20:32:55,388 WARNING [train.py:1204] (2/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_259-34038-0 from training. Duration: 0.74 2023-10-03 20:32:58,706 WARNING [train.py:1204] (2/4) Exclude cut with ID _14922_210_6228_1_1530707435273_3693310_388-129552-0 from training. Duration: 0.96 2023-10-03 20:33:00,194 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.attention_skip_rate, batch_count=1397426.6666666667, ans=0.0 2023-10-03 20:33:01,343 WARNING [train.py:1204] (2/4) Exclude cut with ID _28394_210_8913_1_1532430049518_3650118_342-251455-0_sp1.1 from training. Duration: 0.936375 2023-10-03 20:33:01,366 WARNING [train.py:1204] (2/4) Exclude cut with ID _53731_210_7994_1_1534553979244_3757214_8-191390-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 20:33:01,397 WARNING [train.py:1204] (2/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_613-232856-0 from training. Duration: 0.54 2023-10-03 20:33:02,820 WARNING [train.py:1204] (2/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_557-349534-0 from training. Duration: 0.91 2023-10-03 20:33:02,829 WARNING [train.py:1204] (2/4) Exclude cut with ID _14224_210_4381_1_1530529062037_4264010_379-184223-0 from training. Duration: 0.85 2023-10-03 20:33:02,832 WARNING [train.py:1204] (2/4) Exclude cut with ID IC0125W0001-61807 from training. Duration: 0.966 2023-10-03 20:33:04,242 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_229-45300-0 from training. Duration: 0.57 2023-10-03 20:33:04,323 WARNING [train.py:1204] (2/4) Exclude cut with ID _30304_210_6319_1_1532476827261_7582060_1069-24743-0_sp1.1 from training. Duration: 0.92725 2023-10-03 20:33:05,685 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_41452_210_7291_1_1533437687_3941281_323-172873-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 20:33:05,695 WARNING [train.py:1204] (2/4) Exclude cut with ID _37086_210_9038_1_1532999540891_7021250_802-226709-0_sp1.1 from training. Duration: 0.97275 2023-10-03 20:33:06,989 INFO [train.py:1046] (2/4) Epoch 40, batch 2450, loss[loss=0.1444, simple_loss=0.2289, pruned_loss=0.02994, over 24654.00 frames. ], tot_loss[loss=0.1556, simple_loss=0.2351, pruned_loss=0.03804, over 4728373.56 frames. ], batch size: 65, lr: 2.54e-03, grad_scale: 32.0 2023-10-03 20:33:07,082 WARNING [train.py:1204] (2/4) Exclude cut with ID IC0144W0500-71767 from training. Duration: 0.931875 2023-10-03 20:33:07,144 WARNING [train.py:1204] (2/4) Exclude cut with ID _39846_210_12651_1_1533605310265_2115212_233-129699-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 20:33:08,490 WARNING [train.py:1204] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_574-45541-0_sp0.9 from training. Duration: 0.72225 2023-10-03 20:33:08,726 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.feed_forward2.hidden_balancer.prob, batch_count=1397493.3333333333, ans=0.125 2023-10-03 20:33:11,687 WARNING [train.py:1204] (2/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_372-298093-0_sp1.1 from training. Duration: 0.72725 2023-10-03 20:33:11,698 WARNING [train.py:1204] (2/4) Exclude cut with ID _62574_210_2655_1_1535180407058_3933249_34-26343-0_sp1.1 from training. Duration: 0.97275 2023-10-03 20:33:15,784 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_512-256238-0_sp1.1 from training. Duration: 0.963625 2023-10-03 20:33:15,789 WARNING [train.py:1204] (2/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_196-55502-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 20:33:17,126 WARNING [train.py:1204] (2/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_195-167011-0 from training. Duration: 0.75 2023-10-03 20:33:18,831 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.conv_skip_rate, batch_count=1397493.3333333333, ans=0.0 2023-10-03 20:33:22,685 WARNING [train.py:1204] (2/4) Exclude cut with ID _13678_210_7497_1_1530530069226_3463170_345-230416-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 20:33:22,702 WARNING [train.py:1204] (2/4) Exclude cut with ID _51078_210_9070_1_1534161557424_3805250_225-327809-0_sp1.1 from training. Duration: 0.963625 2023-10-03 20:33:26,545 WARNING [train.py:1204] (2/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_18-189198-0_sp1.1 from training. Duration: 0.8181875 2023-10-03 20:33:27,858 WARNING [train.py:1204] (2/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_329-82235-0_sp1.1 from training. Duration: 0.7454375 2023-10-03 20:33:27,874 WARNING [train.py:1204] (2/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_726-270780-0_sp0.9 from training. Duration: 0.9555625 2023-10-03 20:33:27,908 WARNING [train.py:1204] (2/4) Exclude cut with ID _66873_210_5517_1_1535454136506_4293370_654-52713-0 from training. Duration: 0.77 2023-10-03 20:33:29,636 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.ff3_skip_rate, batch_count=1397560.0, ans=0.0 2023-10-03 20:33:32,068 WARNING [train.py:1204] (2/4) Exclude cut with ID _20455_210_5219_1_1531565996775_8098100_191-16006-0_sp1.1 from training. Duration: 0.963625 2023-10-03 20:33:34,136 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_3171_210_2087_1_1526109084_2330007_66-260583-0_sp1.1 from training. Duration: 0.6545625 2023-10-03 20:33:34,194 WARNING [train.py:1204] (2/4) Exclude cut with ID _38670_210_15379_1_1533448810659_4391059_63-129006-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 20:33:34,393 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=1397560.0, ans=0.1 2023-10-03 20:33:34,966 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.2.encoder.layers.0.nonlin_attention.whiten1, num_groups=1, num_channels=288, metric=8.22 vs. limit=10.0 2023-10-03 20:33:37,370 WARNING [train.py:1204] (2/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_393-57888-0_sp0.9 from training. Duration: 0.711125 2023-10-03 20:33:37,407 WARNING [train.py:1204] (2/4) Exclude cut with ID _6275_210_2084_1_1528110062213_7669049_857-298942-0_sp0.9 from training. Duration: 0.9444375 2023-10-03 20:33:38,830 WARNING [train.py:1204] (2/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_239-22076-0_sp0.9 from training. Duration: 0.9444375 2023-10-03 20:33:38,856 WARNING [train.py:1204] (2/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_424-227947-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 20:33:40,275 WARNING [train.py:1204] (2/4) Exclude cut with ID _14069_210_5632_1_1530512692033_4803390_95-1896-0 from training. Duration: 0.98 2023-10-03 20:33:41,591 WARNING [train.py:1204] (2/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_96-244299-0_sp0.9 from training. Duration: 0.97775 2023-10-03 20:33:45,798 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.0.feed_forward1.out_proj.dropout_p, batch_count=1397626.6666666667, ans=0.1 2023-10-03 20:33:46,904 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.624e+02 1.910e+02 2.084e+02 2.402e+02 3.633e+02, threshold=4.168e+02, percent-clipped=0.0 2023-10-03 20:33:49,889 WARNING [train.py:1204] (2/4) Exclude cut with ID _46683_210_9642_1_1533730913492_4591116_562-319825-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 20:33:49,997 WARNING [train.py:1204] (2/4) Exclude cut with ID _64639_210_5816_1_1535367619437_3528000_284-173474-0_sp1.1 from training. Duration: 0.963625 2023-10-03 20:33:50,017 WARNING [train.py:1204] (2/4) Exclude cut with ID _29529_210_7234_1_1532393961455_3977460_497-100133-0_sp1.1 from training. Duration: 0.936375 2023-10-03 20:33:51,409 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_12868_210_6753_1_1530701932_530275_18-76322-0_sp0.9 from training. Duration: 0.9666875 2023-10-03 20:33:51,451 WARNING [train.py:1204] (2/4) Exclude cut with ID _50093_210_4220_1_1534417141207_3432299_116-303447-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 20:33:52,661 WARNING [train.py:1204] (2/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_44-79427-0_sp1.1 from training. Duration: 0.8909375 2023-10-03 20:33:54,596 WARNING [train.py:1204] (2/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_887-249555-0 from training. Duration: 0.77 2023-10-03 20:33:57,936 WARNING [train.py:1204] (2/4) Exclude cut with ID _28461_210_2253_1_1532771775189_3041299_96-152158-0_sp0.9 from training. Duration: 0.9444375 2023-10-03 20:33:57,986 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_14058_210_4163_1_1530690953_6327180_49-210687-0_sp1.1 from training. Duration: 0.8545625 2023-10-03 20:33:59,467 WARNING [train.py:1204] (2/4) Exclude cut with ID _40399_210_4353_1_1533305658194_3279406_351-127903-0_sp1.1 from training. Duration: 0.97275 2023-10-03 20:34:01,336 WARNING [train.py:1204] (2/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_184-206939-0_sp1.1 from training. Duration: 0.936375 2023-10-03 20:34:04,323 WARNING [train.py:1204] (2/4) Exclude cut with ID _14100_210_8341_1_1530529320613_3678069_258-224599-0_sp1.1 from training. Duration: 0.7 2023-10-03 20:34:04,338 WARNING [train.py:1204] (2/4) Exclude cut with ID _6493_210_1747_1_1528459210424_4012470_324-323854-0 from training. Duration: 0.92 2023-10-03 20:34:05,651 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_12868_210_6753_1_1530702479_3610564_151-325626-0_sp1.1 from training. Duration: 0.8818125 2023-10-03 20:34:06,974 WARNING [train.py:1204] (2/4) Exclude cut with ID _14528_210_8254_1_1530669700514_4306939_106-43145-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 20:34:08,312 WARNING [train.py:1204] (2/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_686-54383-0 from training. Duration: 0.87 2023-10-03 20:34:08,951 WARNING [train.py:1204] (2/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_801-102362-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 20:34:09,032 WARNING [train.py:1204] (2/4) Exclude cut with ID _48677_210_5663_1_1533902405919_3892989_408-40740-0_sp0.9 from training. Duration: 0.92225 2023-10-03 20:34:10,431 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.feed_forward1.hidden_balancer.prob, batch_count=1397760.0, ans=0.125 2023-10-03 20:34:13,149 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.1.bypass_mid.scale_min, batch_count=1397760.0, ans=0.2 2023-10-03 20:34:14,393 WARNING [train.py:1204] (2/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_448-212135-0_sp1.1 from training. Duration: 0.82725 2023-10-03 20:34:15,813 WARNING [train.py:1204] (2/4) Exclude cut with ID _59170_210_6721_1_1534983633960_4203335_320-124047-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 20:34:15,855 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_4692_210_3528_1_1526899755_4323254_107-44401-0_sp1.1 from training. Duration: 0.7818125 2023-10-03 20:34:15,956 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=1397760.0, ans=0.1 2023-10-03 20:34:20,054 WARNING [train.py:1204] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_731-2428-0 from training. Duration: 0.83 2023-10-03 20:34:21,411 INFO [train.py:1046] (2/4) Epoch 40, batch 2500, loss[loss=0.1452, simple_loss=0.2337, pruned_loss=0.02835, over 24294.00 frames. ], tot_loss[loss=0.1553, simple_loss=0.2347, pruned_loss=0.03791, over 4720605.29 frames. ], batch size: 61, lr: 2.54e-03, grad_scale: 32.0 2023-10-03 20:34:21,524 WARNING [train.py:1204] (2/4) Exclude cut with ID _29857_210_20034_1_1532395886753_3798160_335-163562-0_sp1.1 from training. Duration: 0.77275 2023-10-03 20:34:29,152 WARNING [train.py:1204] (2/4) Exclude cut with ID _14832_210_8750_1_1530691351539_3171348_171-275004-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 20:34:36,801 WARNING [train.py:1204] (2/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_105-328174-0_sp0.9 from training. Duration: 0.8444375 2023-10-03 20:34:36,827 WARNING [train.py:1204] (2/4) Exclude cut with ID _68965_210_1883_1_1535698962272_3497490_164-118421-0_sp1.1 from training. Duration: 0.936375 2023-10-03 20:34:38,168 WARNING [train.py:1204] (2/4) Exclude cut with ID _19896_210_1883_1_1531709022662_6649569_380-24170-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 20:34:38,173 WARNING [train.py:1204] (2/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_505-205270-0_sp1.1 from training. Duration: 0.3454375 2023-10-03 20:34:44,610 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.1.conv_module1.balancer2.prob, batch_count=1397893.3333333333, ans=0.125 2023-10-03 20:34:45,750 WARNING [train.py:1204] (2/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_427-208901-0_sp1.1 from training. Duration: 0.6181875 2023-10-03 20:34:45,787 WARNING [train.py:1204] (2/4) Exclude cut with ID _23818_210_6228_1_1531917063884_3676920_85-326916-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 20:34:45,885 WARNING [train.py:1204] (2/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_101-293367-0_sp1.1 from training. Duration: 0.57275 2023-10-03 20:34:45,893 WARNING [train.py:1204] (2/4) Exclude cut with ID _5409_210_1388_1_1527292850189_5865191_178-127970-0_sp1.1 from training. Duration: 0.5181875 2023-10-03 20:34:47,261 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_403-39619-0 from training. Duration: 0.84 2023-10-03 20:34:48,640 WARNING [train.py:1204] (2/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_427-87841-0_sp1.1 from training. Duration: 0.963625 2023-10-03 20:34:48,695 WARNING [train.py:1204] (2/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_286-68181-0_sp1.1 from training. Duration: 0.97275 2023-10-03 20:34:49,930 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6673_210_3273_1_1527767790_3173240_327-306185-0 from training. Duration: 0.83 2023-10-03 20:34:49,953 WARNING [train.py:1204] (2/4) Exclude cut with ID _3675_210_4557_1_1526639934408_4182089_106-203713-0_sp1.1 from training. Duration: 0.963625 2023-10-03 20:34:50,003 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_53071_210_16554_1_1534493434_1276796_130-36076-0 from training. Duration: 0.95 2023-10-03 20:34:50,031 WARNING [train.py:1204] (2/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_586-93054-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 20:34:55,995 WARNING [train.py:1204] (2/4) Exclude cut with ID _23825_210_13449_1_1531915313526_3497150_262-10571-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 20:34:57,306 WARNING [train.py:1204] (2/4) Exclude cut with ID _28783_210_7943_1_1532671164808_7155479_432-67885-0_sp1.1 from training. Duration: 0.97275 2023-10-03 20:34:57,497 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.0.feed_forward1.out_proj.dropout_p, batch_count=1397960.0, ans=0.1 2023-10-03 20:34:57,529 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.self_attn_weights.pos_emb_skip_rate, batch_count=1397960.0, ans=0.0 2023-10-03 20:34:59,371 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6287_210_3273_1_1527509506_2653716_182-199887-0_sp0.9 from training. Duration: 0.5666875 2023-10-03 20:35:00,735 WARNING [train.py:1204] (2/4) Exclude cut with ID _3231_210_2106_1_1526205864934_6784220_171-256004-0 from training. Duration: 0.86 2023-10-03 20:35:00,785 WARNING [train.py:1204] (2/4) Exclude cut with ID _69051_210_1531_1_1535885566901_3829538_332-92090-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 20:35:01,710 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.0.layers.0.nonlin_attention.whiten1, num_groups=1, num_channels=144, metric=9.43 vs. limit=10.0 2023-10-03 20:35:04,171 WARNING [train.py:1204] (2/4) Exclude cut with ID _3675_210_4557_1_1526639934408_4182089_104-324125-0_sp1.1 from training. Duration: 0.963625 2023-10-03 20:35:06,908 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_41764_210_2521_1_1533686110_4089313_501-2119-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 20:35:09,663 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_14056_210_4163_1_1530604483_7572827_56-100988-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 20:35:11,850 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.ff3_skip_rate, batch_count=1398026.6666666667, ans=0.0 2023-10-03 20:35:12,757 WARNING [train.py:1204] (2/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_398-239819-0_sp1.1 from training. Duration: 0.9 2023-10-03 20:35:16,902 WARNING [train.py:1204] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_74-48717-0_sp1.1 from training. Duration: 0.52725 2023-10-03 20:35:19,603 WARNING [train.py:1204] (2/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_177-49092-0 from training. Duration: 0.91 2023-10-03 20:35:19,628 WARNING [train.py:1204] (2/4) Exclude cut with ID _62337_210_12392_1_1535009273105_3412229_44-176353-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 20:35:19,643 WARNING [train.py:1204] (2/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_110-130961-0_sp1.1 from training. Duration: 0.42725 2023-10-03 20:35:21,100 WARNING [train.py:1204] (2/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_37-189262-0_sp1.1 from training. Duration: 0.8909375 2023-10-03 20:35:21,101 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_3172_210_2087_1_1526292929_3987865_197-97762-0_sp0.9 from training. Duration: 0.8333125 2023-10-03 20:35:22,539 WARNING [train.py:1204] (2/4) Exclude cut with ID IC0677W0065-334897 from training. Duration: 0.896 2023-10-03 20:35:22,539 WARNING [train.py:1204] (2/4) Exclude cut with ID IC0677W0080-334912 from training. Duration: 0.768 2023-10-03 20:35:22,544 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_120-41668-0 from training. Duration: 0.7 2023-10-03 20:35:24,481 WARNING [train.py:1204] (2/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_460-205287-0_sp1.1 from training. Duration: 0.963625 2023-10-03 20:35:27,106 WARNING [train.py:1204] (2/4) Exclude cut with ID _14632_210_9988_1_1530691279392_3923200_102-204477-0 from training. Duration: 0.97 2023-10-03 20:35:27,119 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_34815_210_6388_1_1533381320_3571903_279-305565-0 from training. Duration: 0.92 2023-10-03 20:35:28,415 WARNING [train.py:1204] (2/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_91-148831-0_sp1.1 from training. Duration: 0.9 2023-10-03 20:35:28,473 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6291_210_3273_1_1527683649_1078498_82-221196-0 from training. Duration: 0.76 2023-10-03 20:35:28,667 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.bypass.skip_rate, batch_count=1398093.3333333333, ans=0.09899494936611666 2023-10-03 20:35:33,043 WARNING [train.py:1204] (2/4) Exclude cut with ID _38020_210_5986_1_1533112252259_7006089_493-102745-0 from training. Duration: 0.98 2023-10-03 20:35:34,484 WARNING [train.py:1204] (2/4) Exclude cut with ID _61133_210_9969_1_1535196777227_3696409_40-93442-0_sp1.1 from training. Duration: 0.92725 2023-10-03 20:35:34,622 WARNING [train.py:1204] (2/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_45-258852-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 20:35:35,915 INFO [train.py:1046] (2/4) Epoch 40, batch 2550, loss[loss=0.1533, simple_loss=0.2419, pruned_loss=0.03233, over 24604.00 frames. ], tot_loss[loss=0.1558, simple_loss=0.2351, pruned_loss=0.03819, over 4720872.62 frames. ], batch size: 65, lr: 2.54e-03, grad_scale: 16.0 2023-10-03 20:35:35,978 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_53071_210_16554_1_1534493434_1276796_130-36076-0_sp1.1 from training. Duration: 0.863625 2023-10-03 20:35:38,636 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8694_210_6400_1_1529065754_3260756_332-13553-0_sp1.1 from training. Duration: 0.92725 2023-10-03 20:35:38,710 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_405-339986-0 from training. Duration: 0.51 2023-10-03 20:35:40,571 WARNING [train.py:1204] (2/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_927-43827-0_sp1.1 from training. Duration: 0.67275 2023-10-03 20:35:43,399 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_397-147196-0 from training. Duration: 0.8 2023-10-03 20:35:44,836 WARNING [train.py:1204] (2/4) Exclude cut with ID 3557-8342-0013-54691-0_sp1.1 from training. Duration: 0.836375 2023-10-03 20:35:47,537 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_47438_210_5399_1_1533797471_4513356_626-127722-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 20:35:47,817 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=1398160.0, ans=0.1 2023-10-03 20:35:48,996 WARNING [train.py:1204] (2/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_256-323883-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 20:35:49,000 WARNING [train.py:1204] (2/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_839-173017-0_sp0.9 from training. Duration: 0.4555625 2023-10-03 20:35:49,030 WARNING [train.py:1204] (2/4) Exclude cut with ID _5289_210_2084_1_1527678202736_3779299_392-261988-0_sp1.1 from training. Duration: 0.5909375 2023-10-03 20:35:50,276 WARNING [train.py:1204] (2/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_119-224384-0_sp1.1 from training. Duration: 0.936375 2023-10-03 20:35:50,307 WARNING [train.py:1204] (2/4) Exclude cut with ID _57523_210_1856_1_1534766294545_3732989_262-267933-0_sp1.1 from training. Duration: 0.963625 2023-10-03 20:35:53,727 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_35361_210_4238_1_1533262810_7793476_675-87012-0_sp0.9 from training. Duration: 0.988875 2023-10-03 20:35:53,748 WARNING [train.py:1204] (2/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_17-90541-0 from training. Duration: 0.94 2023-10-03 20:35:53,787 WARNING [train.py:1204] (2/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_535-14410-0_sp1.1 from training. Duration: 0.42725 2023-10-03 20:35:53,792 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_24673_210_6400_1_1531968633_778659_94-154511-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 20:35:53,795 WARNING [train.py:1204] (2/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_212-10501-0 from training. Duration: 0.94 2023-10-03 20:36:05,782 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder_embed.convnext.layerdrop_rate, batch_count=1398293.3333333333, ans=0.015 2023-10-03 20:36:07,114 WARNING [train.py:1204] (2/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_383-285196-0_sp1.1 from training. Duration: 0.8818125 2023-10-03 20:36:11,656 WARNING [train.py:1204] (2/4) Exclude cut with ID _45560_210_21982_1_1533812603951_3584999_238-47020-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 20:36:11,665 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_21500_210_2054_1_1532149310_2716758_278-18053-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 20:36:11,680 WARNING [train.py:1204] (2/4) Exclude cut with ID _56235_210_10640_1_1534557335235_4161099_453-9626-0_sp1.1 from training. Duration: 0.936375 2023-10-03 20:36:13,091 WARNING [train.py:1204] (2/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_153-97441-0_sp1.1 from training. Duration: 0.5090625 2023-10-03 20:36:16,968 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.652e+02 1.930e+02 2.153e+02 2.349e+02 3.303e+02, threshold=4.307e+02, percent-clipped=0.0 2023-10-03 20:36:21,070 WARNING [train.py:1204] (2/4) Exclude cut with ID _67725_210_27767_1_1535608756671_5105359_431-120895-0_sp1.1 from training. Duration: 0.963625 2023-10-03 20:36:22,521 WARNING [train.py:1204] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_574-45541-0_sp1.1 from training. Duration: 0.5909375 2023-10-03 20:36:22,550 WARNING [train.py:1204] (2/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_162-103620-0_sp1.1 from training. Duration: 0.8454375 2023-10-03 20:36:22,560 WARNING [train.py:1204] (2/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_734-129761-0_sp1.1 from training. Duration: 0.7454375 2023-10-03 20:36:23,888 WARNING [train.py:1204] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_620-276793-0_sp1.1 from training. Duration: 0.536375 2023-10-03 20:36:23,930 WARNING [train.py:1204] (2/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_153-97441-0_sp0.9 from training. Duration: 0.62225 2023-10-03 20:36:27,212 WARNING [train.py:1204] (2/4) Exclude cut with ID _12733_210_8341_1_1530167424556_3646380_523-85621-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 20:36:27,224 WARNING [train.py:1204] (2/4) Exclude cut with ID _64048_210_10626_1_1535198382802_3835179_352-179834-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 20:36:31,987 WARNING [train.py:1204] (2/4) Exclude cut with ID _55162_210_17071_1_1534408248583_3753480_43-242364-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 20:36:32,005 WARNING [train.py:1204] (2/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_661-264857-0 from training. Duration: 0.93 2023-10-03 20:36:32,012 WARNING [train.py:1204] (2/4) Exclude cut with ID _6274_210_2084_1_1527937583837_7494020_325-237800-0_sp1.1 from training. Duration: 0.97275 2023-10-03 20:36:33,407 WARNING [train.py:1204] (2/4) Exclude cut with ID _55328_210_27767_1_1534580973260_4088170_385-171459-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 20:36:33,486 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_3780_210_2087_1_1526467389_2145572_176-107140-0_sp1.1 from training. Duration: 0.62725 2023-10-03 20:36:33,816 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=1398426.6666666667, ans=0.1 2023-10-03 20:36:34,826 WARNING [train.py:1204] (2/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_375-244976-0_sp1.1 from training. Duration: 0.6818125 2023-10-03 20:36:36,190 WARNING [train.py:1204] (2/4) Exclude cut with ID _35941_210_15379_1_1532995294515_4023089_309-33162-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 20:36:43,365 WARNING [train.py:1204] (2/4) Exclude cut with ID _14459_210_2228_1_1531443641946_7516700_1185-22038-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 20:36:46,156 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_1197-69460-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 20:36:48,920 INFO [train.py:1046] (2/4) Epoch 40, batch 2600, loss[loss=0.1624, simple_loss=0.2537, pruned_loss=0.03558, over 24457.00 frames. ], tot_loss[loss=0.1566, simple_loss=0.2361, pruned_loss=0.03856, over 4722468.32 frames. ], batch size: 69, lr: 2.54e-03, grad_scale: 16.0 2023-10-03 20:36:48,966 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9012W0022-501157 from training. Duration: 0.896 2023-10-03 20:36:50,470 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9023W0009-506302 from training. Duration: 0.896 2023-10-03 20:36:50,485 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_2821_210_2715_1_1526108385_3763560_73-296673-0_sp1.1 from training. Duration: 0.7545625 2023-10-03 20:36:51,839 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9025W0113-507442 from training. Duration: 0.896 2023-10-03 20:36:51,921 WARNING [train.py:1204] (2/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_739-2098-0 from training. Duration: 0.92 2023-10-03 20:36:51,933 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9029W0332-509722 from training. Duration: 0.768 2023-10-03 20:36:54,699 WARNING [train.py:1204] (2/4) Exclude cut with ID _16908_210_5724_1_1531119157248_3772700_68-101953-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 20:36:54,715 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9042W0011-515617 from training. Duration: 0.768 2023-10-03 20:36:56,687 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_3019_210_2028_1_1526044245_3264865_191-72235-0 from training. Duration: 0.97 2023-10-03 20:36:59,219 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9052W0006-520792 from training. Duration: 0.896 2023-10-03 20:37:01,144 WARNING [train.py:1204] (2/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_245-174553-0_sp1.1 from training. Duration: 0.8 2023-10-03 20:37:02,610 WARNING [train.py:1204] (2/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_202-67017-0 from training. Duration: 0.82 2023-10-03 20:37:04,498 WARNING [train.py:1204] (2/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_304-59227-0 from training. Duration: 0.77 2023-10-03 20:37:04,670 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.conv_module2.balancer2.prob, batch_count=1398560.0, ans=0.125 2023-10-03 20:37:05,854 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_3133_210_2087_1_1526106446_2324690_246-125000-0_sp0.9 from training. Duration: 0.688875 2023-10-03 20:37:05,903 WARNING [train.py:1204] (2/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_243-42116-0 from training. Duration: 0.73 2023-10-03 20:37:08,617 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9087W0121-538492 from training. Duration: 0.896 2023-10-03 20:37:08,632 WARNING [train.py:1204] (2/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_120-76843-0 from training. Duration: 0.55 2023-10-03 20:37:16,065 WARNING [train.py:1204] (2/4) Exclude cut with ID _7852_210_5219_1_1528286471316_8139400_850-304621-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 20:37:17,246 WARNING [train.py:1204] (2/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_154-285415-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 20:37:17,258 WARNING [train.py:1204] (2/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_538-278194-0_sp1.1 from training. Duration: 0.92725 2023-10-03 20:37:17,259 WARNING [train.py:1204] (2/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_119-207162-0 from training. Duration: 0.71 2023-10-03 20:37:18,717 WARNING [train.py:1204] (2/4) Exclude cut with ID _2334_210_2667_1_1525608238880_4705129_459-104997-0_sp1.1 from training. Duration: 0.863625 2023-10-03 20:37:19,097 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.self_attn_weights.pos_emb_skip_rate, batch_count=1398626.6666666667, ans=0.0 2023-10-03 20:37:22,400 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.2.encoder.layers.2.whiten, num_groups=1, num_channels=384, metric=3.58 vs. limit=12.0 2023-10-03 20:37:24,285 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9147W0019-567847 from training. Duration: 0.768 2023-10-03 20:37:29,363 WARNING [train.py:1204] (2/4) Exclude cut with ID _69884_210_9771_1_1535846392818_4166169_228-189439-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 20:37:31,160 WARNING [train.py:1204] (2/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_196-77172-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 20:37:32,522 WARNING [train.py:1204] (2/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_625-338546-0 from training. Duration: 0.88 2023-10-03 20:37:32,585 WARNING [train.py:1204] (2/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_193-327630-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 20:37:32,587 WARNING [train.py:1204] (2/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_391-24006-0_sp1.1 from training. Duration: 0.92725 2023-10-03 20:37:33,907 WARNING [train.py:1204] (2/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_197-28164-0 from training. Duration: 0.8 2023-10-03 20:37:37,298 WARNING [train.py:1204] (2/4) Exclude cut with ID _8395_210_1531_1_1528597871523_4411579_431-132470-0_sp1.1 from training. Duration: 0.67275 2023-10-03 20:37:37,325 WARNING [train.py:1204] (2/4) Exclude cut with ID _35350_210_16040_1_1533040191183_4075299_465-248326-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 20:37:38,737 WARNING [train.py:1204] (2/4) Exclude cut with ID _16756_210_4915_1_1531047803763_7104259_101-141922-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 20:37:41,589 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9212W0005-600172 from training. Duration: 0.896 2023-10-03 20:37:41,615 WARNING [train.py:1204] (2/4) Exclude cut with ID _63186_210_2084_1_1535414479705_3908649_378-166820-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 20:37:42,922 WARNING [train.py:1204] (2/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_52-211683-0_sp1.1 from training. Duration: 0.8181875 2023-10-03 20:37:47,648 WARNING [train.py:1204] (2/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_1147-237729-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 20:37:47,731 WARNING [train.py:1204] (2/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_234-14174-0_sp0.9 from training. Duration: 0.911125 2023-10-03 20:37:47,743 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_454-276429-0 from training. Duration: 0.87 2023-10-03 20:37:49,023 WARNING [train.py:1204] (2/4) Exclude cut with ID _53713_210_24116_1_1534314487762_7416751_755-165313-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 20:37:50,486 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8278_210_2550_1_1528637099_4220413_436-317072-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 20:37:51,897 WARNING [train.py:1204] (2/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_28-324729-0_sp1.1 from training. Duration: 0.8545625 2023-10-03 20:37:53,604 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.feed_forward1.hidden_balancer.prob, batch_count=1398760.0, ans=0.125 2023-10-03 20:37:54,673 WARNING [train.py:1204] (2/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_326-165900-0 from training. Duration: 0.69 2023-10-03 20:37:56,055 WARNING [train.py:1204] (2/4) Exclude cut with ID _53489_210_6228_1_1534298357169_3835970_96-30246-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 20:37:58,023 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8275_210_4198_1_1528455320_3892571_491-17707-0_sp1.1 from training. Duration: 0.5818125 2023-10-03 20:38:02,237 WARNING [train.py:1204] (2/4) Exclude cut with ID _14348_210_8341_1_1530599434996_3778640_240-232370-0 from training. Duration: 0.84 2023-10-03 20:38:02,248 WARNING [train.py:1204] (2/4) Exclude cut with ID _45012_210_12651_1_1533608435136_5371380_54-112623-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 20:38:03,472 INFO [train.py:1046] (2/4) Epoch 40, batch 2650, loss[loss=0.1691, simple_loss=0.2449, pruned_loss=0.04669, over 23469.00 frames. ], tot_loss[loss=0.1569, simple_loss=0.2362, pruned_loss=0.03878, over 4716811.50 frames. ], batch size: 285, lr: 2.54e-03, grad_scale: 16.0 2023-10-03 20:38:03,546 WARNING [train.py:1204] (2/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_328-264776-0_sp1.1 from training. Duration: 0.6090625 2023-10-03 20:38:04,892 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9325W0001-648427 from training. Duration: 0.768 2023-10-03 20:38:04,908 WARNING [train.py:1204] (2/4) Exclude cut with ID _56480_210_4220_1_1534679942079_3075409_49-74717-0_sp1.1 from training. Duration: 0.936375 2023-10-03 20:38:07,751 WARNING [train.py:1204] (2/4) Exclude cut with ID _59500_210_8254_1_1534809610680_3736009_6-241869-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 20:38:09,639 WARNING [train.py:1204] (2/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_172-215932-0_sp0.9 from training. Duration: 0.7333125 2023-10-03 20:38:11,020 WARNING [train.py:1204] (2/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_17-90541-0_sp1.1 from training. Duration: 0.8545625 2023-10-03 20:38:14,423 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_43831_210_2158_1_1533725502_5872791_525-89447-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 20:38:15,644 WARNING [train.py:1204] (2/4) Exclude cut with ID _12302_210_5219_1_1529924446466_8107280_907-182949-0 from training. Duration: 0.92 2023-10-03 20:38:15,665 WARNING [train.py:1204] (2/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_402-102311-0_sp0.9 from training. Duration: 0.7444375 2023-10-03 20:38:16,936 WARNING [train.py:1204] (2/4) Exclude cut with ID _8021_210_7994_1_1528376451358_3653069_179-45206-0_sp0.9 from training. Duration: 0.9555625 2023-10-03 20:38:19,807 WARNING [train.py:1204] (2/4) Exclude cut with ID _40137_210_12210_1_1533290484046_3795180_249-187059-0 from training. Duration: 0.88 2023-10-03 20:38:21,167 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9653W0194-675472 from training. Duration: 0.9658125 2023-10-03 20:38:23,842 WARNING [train.py:1204] (2/4) Exclude cut with ID _51332_210_9988_1_1534244371836_3801270_336-255604-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 20:38:25,255 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_20120_210_6753_1_1531643035_5482859_510-96996-0 from training. Duration: 0.87 2023-10-03 20:38:25,267 WARNING [train.py:1204] (2/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_1167-20437-0_sp1.1 from training. Duration: 0.92725 2023-10-03 20:38:26,715 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_3475_210_2028_1_1526193578_3622569_265-276625-0 from training. Duration: 0.76 2023-10-03 20:38:30,473 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.4.encoder.layers.1.conv_module2.whiten, num_groups=1, num_channels=384, metric=2.76 vs. limit=15.0 2023-10-03 20:38:30,611 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.3.encoder.layers.0.feed_forward2.out_whiten, num_groups=1, num_channels=512, metric=3.62 vs. limit=15.0 2023-10-03 20:38:31,671 WARNING [train.py:1204] (2/4) Exclude cut with ID _14860_210_4915_1_1530702020340_7343240_470-172418-0_sp1.1 from training. Duration: 0.97275 2023-10-03 20:38:31,686 WARNING [train.py:1204] (2/4) Exclude cut with ID _5444_210_2151_1_1527418808464_5312210_581-315411-0_sp1.1 from training. Duration: 0.563625 2023-10-03 20:38:32,929 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_34450_210_9547_1_1533099443_4109050_519-342439-0_sp1.1 from training. Duration: 0.97275 2023-10-03 20:38:32,954 WARNING [train.py:1204] (2/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_50-175961-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 20:38:35,735 WARNING [train.py:1204] (2/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_372-6869-0 from training. Duration: 0.44 2023-10-03 20:38:35,740 WARNING [train.py:1204] (2/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_3-237434-0 from training. Duration: 0.76 2023-10-03 20:38:38,485 WARNING [train.py:1204] (2/4) Exclude cut with ID _8772_210_1850_1_1528635595972_3664011_247-336448-0_sp1.1 from training. Duration: 0.67275 2023-10-03 20:38:41,836 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_427-78097-0 from training. Duration: 0.8 2023-10-03 20:38:41,854 WARNING [train.py:1204] (2/4) Exclude cut with ID _69884_210_9771_1_1535846392818_4166169_466-252727-0_sp1.1 from training. Duration: 0.97275 2023-10-03 20:38:43,855 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_15972_210_6400_1_1530877934_3405692_316-35478-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 20:38:43,892 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_809-17156-0_sp0.9 from training. Duration: 0.811125 2023-10-03 20:38:45,162 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.644e+02 2.003e+02 2.143e+02 2.550e+02 3.121e+02, threshold=4.287e+02, percent-clipped=0.0 2023-10-03 20:38:45,241 WARNING [train.py:1204] (2/4) Exclude cut with ID _57572_210_5517_1_1534921219624_3809484_594-263474-0_sp1.1 from training. Duration: 0.936375 2023-10-03 20:38:45,276 WARNING [train.py:1204] (2/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_190-228204-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 20:38:46,651 WARNING [train.py:1204] (2/4) Exclude cut with ID _19514_210_9259_1_1531530071330_5084230_117-21193-0_sp1.1 from training. Duration: 0.936375 2023-10-03 20:38:48,032 WARNING [train.py:1204] (2/4) Exclude cut with ID _69584_210_5219_1_1535785133775_4044450_190-21315-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 20:38:50,917 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_53071_210_16554_1_1534493434_1276796_184-302050-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 20:38:52,270 WARNING [train.py:1204] (2/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_120-76843-0_sp1.1 from training. Duration: 0.5 2023-10-03 20:38:53,650 WARNING [train.py:1204] (2/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_318-162017-0_sp1.1 from training. Duration: 0.82725 2023-10-03 20:38:56,432 WARNING [train.py:1204] (2/4) Exclude cut with ID _46067_210_4220_1_1534125571066_3307450_79-46864-0_sp1.1 from training. Duration: 0.92725 2023-10-03 20:38:56,490 WARNING [train.py:1204] (2/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_358-215558-0_sp1.1 from training. Duration: 0.8454375 2023-10-03 20:38:57,820 WARNING [train.py:1204] (2/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_621-66764-0_sp1.1 from training. Duration: 0.92725 2023-10-03 20:38:59,243 WARNING [train.py:1204] (2/4) Exclude cut with ID _42863_210_9070_1_1533902398788_3696920_338-156851-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 20:39:00,562 WARNING [train.py:1204] (2/4) Exclude cut with ID _5289_210_2084_1_1527678202736_3779299_392-261988-0_sp0.9 from training. Duration: 0.72225 2023-10-03 20:39:04,397 WARNING [train.py:1204] (2/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_409-84682-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 20:39:04,475 WARNING [train.py:1204] (2/4) Exclude cut with ID _9235_210_5219_1_1528891154253_8046120_30-326681-0_sp0.9 from training. Duration: 0.92225 2023-10-03 20:39:04,479 WARNING [train.py:1204] (2/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_389-184695-0_sp1.1 from training. Duration: 0.92725 2023-10-03 20:39:05,752 WARNING [train.py:1204] (2/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_442-320317-0 from training. Duration: 0.82 2023-10-03 20:39:08,614 WARNING [train.py:1204] (2/4) Exclude cut with ID _44778_210_2054_1_1533798127203_7443090_756-29911-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 20:39:10,011 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_401-112036-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 20:39:12,639 WARNING [train.py:1204] (2/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_181-308827-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 20:39:14,136 WARNING [train.py:1204] (2/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_332-258725-0_sp1.1 from training. Duration: 0.963625 2023-10-03 20:39:14,210 WARNING [train.py:1204] (2/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_188-260011-0_sp0.9 from training. Duration: 0.811125 2023-10-03 20:39:15,920 WARNING [train.py:1204] (2/4) Exclude cut with ID _49039_210_6315_1_1533967537541_3846118_99-302239-0_sp1.1 from training. Duration: 0.963625 2023-10-03 20:39:17,263 INFO [train.py:1046] (2/4) Epoch 40, batch 2700, loss[loss=0.1516, simple_loss=0.2389, pruned_loss=0.03215, over 24457.00 frames. ], tot_loss[loss=0.1582, simple_loss=0.2376, pruned_loss=0.03937, over 4709767.69 frames. ], batch size: 63, lr: 2.54e-03, grad_scale: 8.0 2023-10-03 20:39:18,640 WARNING [train.py:1204] (2/4) Exclude cut with ID _4122_210_2084_1_1526713164417_4353280_312-294828-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 20:39:18,646 WARNING [train.py:1204] (2/4) Exclude cut with ID _14922_210_6228_1_1530707435273_3693310_303-228738-0 from training. Duration: 0.84 2023-10-03 20:39:21,883 WARNING [train.py:1204] (2/4) Exclude cut with ID _14349_210_8341_1_1530615710818_3442470_115-285975-0_sp0.9 from training. Duration: 0.9555625 2023-10-03 20:39:23,267 WARNING [train.py:1204] (2/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_125-144174-0_sp1.1 from training. Duration: 0.3545625 2023-10-03 20:39:24,784 WARNING [train.py:1204] (2/4) Exclude cut with ID _53565_210_10635_1_1534337218335_3820259_351-54570-0_sp1.1 from training. Duration: 0.97275 2023-10-03 20:39:26,116 WARNING [train.py:1204] (2/4) Exclude cut with ID _28317_210_12742_1_1532480302381_8510325_410-40065-0_sp1.1 from training. Duration: 0.963625 2023-10-03 20:39:26,143 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_38965_210_4852_1_1533174906_3484735_167-339442-0_sp1.1 from training. Duration: 0.963625 2023-10-03 20:39:26,224 WARNING [train.py:1204] (2/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_265-164486-0_sp1.1 from training. Duration: 0.87275 2023-10-03 20:39:26,234 WARNING [train.py:1204] (2/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_725-182484-0_sp1.1 from training. Duration: 0.92725 2023-10-03 20:39:27,502 WARNING [train.py:1204] (2/4) Exclude cut with ID _28317_210_12742_1_1532480302381_8510325_607-213951-0_sp0.9 from training. Duration: 0.9333125 2023-10-03 20:39:27,516 WARNING [train.py:1204] (2/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_605-133395-0_sp1.1 from training. Duration: 0.6 2023-10-03 20:39:27,530 WARNING [train.py:1204] (2/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_351-294650-0 from training. Duration: 0.63 2023-10-03 20:39:28,873 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_14953_210_2151_1_1530701710_5616165_710-165400-0_sp1.1 from training. Duration: 0.7909375 2023-10-03 20:39:30,304 WARNING [train.py:1204] (2/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_237-58433-0_sp1.1 from training. Duration: 0.67275 2023-10-03 20:39:31,672 WARNING [train.py:1204] (2/4) Exclude cut with ID _5190_210_4557_1_1527244368799_373439_28-197030-0_sp1.1 from training. Duration: 0.6545625 2023-10-03 20:39:31,718 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_743-327651-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 20:39:36,851 WARNING [train.py:1204] (2/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_76-203867-0_sp0.9 from training. Duration: 0.8 2023-10-03 20:39:36,941 WARNING [train.py:1204] (2/4) Exclude cut with ID _14603_210_4220_1_1530615688441_3097319_274-274798-0 from training. Duration: 0.98 2023-10-03 20:39:38,266 WARNING [train.py:1204] (2/4) Exclude cut with ID _14087_210_2253_1_1530529224512_3014029_226-242140-0_sp1.1 from training. Duration: 0.736375 2023-10-03 20:39:42,431 WARNING [train.py:1204] (2/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_352-59958-0_sp1.1 from training. Duration: 0.7818125 2023-10-03 20:39:42,435 WARNING [train.py:1204] (2/4) Exclude cut with ID _39411_210_5986_1_1533538825030_3343649_191-251819-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 20:39:47,530 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.self_attn_weights.whiten_keys.whitening_limit, batch_count=1399293.3333333333, ans=6.0 2023-10-03 20:39:48,399 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7046_210_6753_1_1527895337_6920917_642-106799-0_sp1.1 from training. Duration: 0.836375 2023-10-03 20:39:48,406 WARNING [train.py:1204] (2/4) Exclude cut with ID _22826_210_9994_1_1531908201522_3279169_412-3442-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 20:39:48,434 WARNING [train.py:1204] (2/4) Exclude cut with ID _60057_210_10640_1_1534903065793_3333150_171-181133-0_sp0.9 from training. Duration: 0.97775 2023-10-03 20:39:48,443 WARNING [train.py:1204] (2/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_154-135696-0_sp1.1 from training. Duration: 0.72725 2023-10-03 20:39:52,618 WARNING [train.py:1204] (2/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_148-186805-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 20:39:55,100 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.3.encoder.layers.2.self_attn2.whiten, num_groups=1, num_channels=512, metric=18.20 vs. limit=22.5 2023-10-03 20:39:55,852 WARNING [train.py:1204] (2/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_605-165641-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 20:39:55,867 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_174-277364-0_sp0.9 from training. Duration: 0.788875 2023-10-03 20:39:55,880 WARNING [train.py:1204] (2/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_89-321955-0_sp0.9 from training. Duration: 0.888875 2023-10-03 20:39:57,409 WARNING [train.py:1204] (2/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_438-105429-0_sp1.1 from training. Duration: 0.963625 2023-10-03 20:39:57,413 WARNING [train.py:1204] (2/4) Exclude cut with ID _14970_210_15709_1_1530773543493_1715730_187-20259-0_sp0.9 from training. Duration: 0.988875 2023-10-03 20:40:06,764 WARNING [train.py:1204] (2/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_385-193052-0_sp0.9 from training. Duration: 0.9555625 2023-10-03 20:40:06,814 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7113_210_1849_1_1527942685_3448768_194-311626-0_sp1.1 from training. Duration: 0.92725 2023-10-03 20:40:09,776 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.self_attn_weights.pos_emb_skip_rate, batch_count=1399360.0, ans=0.0 2023-10-03 20:40:10,987 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_3780_210_2087_1_1526467389_2145572_176-107140-0_sp0.9 from training. Duration: 0.7666875 2023-10-03 20:40:10,988 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_36756_210_7234_1_1533026991_966276_123-213457-0_sp1.1 from training. Duration: 0.936375 2023-10-03 20:40:11,231 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.feed_forward3.hidden_balancer.prob, batch_count=1399360.0, ans=0.125 2023-10-03 20:40:14,566 WARNING [train.py:1204] (2/4) Exclude cut with ID _23557_210_8750_1_1531890106370_6846255_456-244861-0_sp1.1 from training. Duration: 0.963625 2023-10-03 20:40:15,814 WARNING [train.py:1204] (2/4) Exclude cut with ID _37086_210_9038_1_1532999540891_7021250_242-349170-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 20:40:17,187 WARNING [train.py:1204] (2/4) Exclude cut with ID _41298_210_9019_1_1533430371450_4822143_471-281351-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 20:40:18,649 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_53378_210_17282_1_1534221071_5769509_572-126176-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 20:40:20,004 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_1507-263032-0_sp1.1 from training. Duration: 0.963625 2023-10-03 20:40:20,041 WARNING [train.py:1204] (2/4) Exclude cut with ID _13679_210_7497_1_1530534293662_3355098_411-186088-0_sp1.1 from training. Duration: 0.8545625 2023-10-03 20:40:22,709 WARNING [train.py:1204] (2/4) Exclude cut with ID _29345_210_4381_1_1532689168263_3860169_149-257268-0_sp1.1 from training. Duration: 0.736375 2023-10-03 20:40:24,055 WARNING [train.py:1204] (2/4) Exclude cut with ID _47790_210_16187_1_1533862814066_4207519_381-252014-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 20:40:24,066 WARNING [train.py:1204] (2/4) Exclude cut with ID _35799_210_5854_1_1533263487887_3529071_512-2717-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 20:40:24,192 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.bypass_mid.scale_min, batch_count=1399426.6666666667, ans=0.2 2023-10-03 20:40:28,691 WARNING [train.py:1204] (2/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_505-205270-0_sp0.9 from training. Duration: 0.42225 2023-10-03 20:40:30,048 WARNING [train.py:1204] (2/4) Exclude cut with ID _14998_210_5219_1_1530705582080_7474970_731-20117-0_sp1.1 from training. Duration: 0.936375 2023-10-03 20:40:31,495 INFO [train.py:1046] (2/4) Epoch 40, batch 2750, loss[loss=0.1653, simple_loss=0.2565, pruned_loss=0.03709, over 24306.00 frames. ], tot_loss[loss=0.1582, simple_loss=0.2377, pruned_loss=0.03935, over 4703513.57 frames. ], batch size: 74, lr: 2.54e-03, grad_scale: 8.0 2023-10-03 20:40:31,588 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_37351_210_7925_1_1533012931_3812113_265-85780-0_sp1.1 from training. Duration: 0.8 2023-10-03 20:40:31,596 WARNING [train.py:1204] (2/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_313-134930-0 from training. Duration: 0.97 2023-10-03 20:40:32,992 WARNING [train.py:1204] (2/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_374-138971-0 from training. Duration: 0.94 2023-10-03 20:40:33,039 WARNING [train.py:1204] (2/4) Exclude cut with ID _30304_210_6319_1_1532476827261_7582060_1227-287886-0_sp1.1 from training. Duration: 0.936375 2023-10-03 20:40:36,415 WARNING [train.py:1204] (2/4) Exclude cut with ID _59493_210_17282_1_1535078419077_3225879_332-217544-0_sp1.1 from training. Duration: 0.97275 2023-10-03 20:40:38,165 WARNING [train.py:1204] (2/4) Exclude cut with ID _23514_210_1389_1_1531832139785_3031439_361-119640-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 20:40:39,584 WARNING [train.py:1204] (2/4) Exclude cut with ID _69584_210_5219_1_1535785133775_4044450_142-118058-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 20:40:39,603 WARNING [train.py:1204] (2/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_178-177582-0_sp1.1 from training. Duration: 0.67275 2023-10-03 20:40:40,901 WARNING [train.py:1204] (2/4) Exclude cut with ID _25303_210_17519_1_1532050207576_3635970_41-195572-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 20:40:43,659 WARNING [train.py:1204] (2/4) Exclude cut with ID _55328_210_27767_1_1534580973260_4088170_329-341322-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 20:40:43,701 WARNING [train.py:1204] (2/4) Exclude cut with ID _5608_210_2106_1_1527415439089_6819768_909-82767-0_sp1.1 from training. Duration: 0.5545625 2023-10-03 20:40:45,026 WARNING [train.py:1204] (2/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_261-297503-0_sp1.1 from training. Duration: 0.9 2023-10-03 20:40:45,035 WARNING [train.py:1204] (2/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_459-291706-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 20:40:45,036 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7762_210_1794_1_1528199508_4057408_422-227034-0 from training. Duration: 0.73 2023-10-03 20:40:45,045 WARNING [train.py:1204] (2/4) Exclude cut with ID _3067_210_2667_1_1526212974640_4503168_250-285937-0_sp0.9 from training. Duration: 0.888875 2023-10-03 20:40:45,060 WARNING [train.py:1204] (2/4) Exclude cut with ID _66873_210_5517_1_1535454136506_4293370_332-253745-0_sp1.1 from training. Duration: 0.936375 2023-10-03 20:40:49,792 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.bypass.scale_min, batch_count=1399560.0, ans=0.2 2023-10-03 20:40:50,926 WARNING [train.py:1204] (2/4) Exclude cut with ID _13938_210_9988_1_1530518718978_3693960_304-112784-0 from training. Duration: 0.46 2023-10-03 20:40:52,346 WARNING [train.py:1204] (2/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_226-267957-0_sp1.1 from training. Duration: 0.92725 2023-10-03 20:40:52,384 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_41822_210_13270_1_1533955976_4379284_608-254567-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 20:40:53,750 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_124-113562-0_sp1.1 from training. Duration: 0.8545625 2023-10-03 20:40:53,779 WARNING [train.py:1204] (2/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_117-290916-0_sp1.1 from training. Duration: 0.463625 2023-10-03 20:40:55,239 WARNING [train.py:1204] (2/4) Exclude cut with ID _28427_210_4220_1_1532581240967_3061069_102-66533-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 20:40:56,536 WARNING [train.py:1204] (2/4) Exclude cut with ID _55383_210_10635_1_1534468426172_2231669_134-128246-0_sp1.1 from training. Duration: 0.8090625 2023-10-03 20:40:56,585 WARNING [train.py:1204] (2/4) Exclude cut with ID _45871_210_5665_1_1533864614245_3296060_49-308316-0_sp1.1 from training. Duration: 0.97275 2023-10-03 20:40:56,634 WARNING [train.py:1204] (2/4) Exclude cut with ID _57572_210_5517_1_1534921219624_3809484_176-199205-0_sp1.1 from training. Duration: 0.97275 2023-10-03 20:40:57,243 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.3.encoder.layers.0.self_attn1.whiten, num_groups=1, num_channels=512, metric=17.53 vs. limit=22.5 2023-10-03 20:41:01,192 WARNING [train.py:1204] (2/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_45-265684-0_sp1.1 from training. Duration: 0.6545625 2023-10-03 20:41:01,213 WARNING [train.py:1204] (2/4) Exclude cut with ID _15150_210_8341_1_1530752411803_7256384_469-239509-0_sp0.9 from training. Duration: 0.7555625 2023-10-03 20:41:01,256 WARNING [train.py:1204] (2/4) Exclude cut with ID _28461_210_2253_1_1532771775189_3041299_136-270644-0_sp1.1 from training. Duration: 0.8454375 2023-10-03 20:41:03,853 WARNING [train.py:1204] (2/4) Exclude cut with ID _69584_210_5219_1_1535785133775_4044450_302-47302-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 20:41:05,766 WARNING [train.py:1204] (2/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_166-128941-0_sp1.1 from training. Duration: 0.4909375 2023-10-03 20:41:08,198 INFO [scaling.py:1022] (2/4) Whitening: name=encoder_embed.out_whiten, num_groups=1, num_channels=192, metric=7.20 vs. limit=8.0 2023-10-03 20:41:13,380 WARNING [train.py:1204] (2/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_559-120504-0_sp1.1 from training. Duration: 0.97275 2023-10-03 20:41:14,584 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.643e+02 1.978e+02 2.171e+02 2.705e+02 3.940e+02, threshold=4.342e+02, percent-clipped=0.0 2023-10-03 20:41:14,772 WARNING [train.py:1204] (2/4) Exclude cut with ID _6456_210_1534_1_1527681634212_3181049_345-252877-0_sp1.1 from training. Duration: 0.5818125 2023-10-03 20:41:14,805 WARNING [train.py:1204] (2/4) Exclude cut with ID _28317_210_12742_1_1532480302381_8510325_902-233003-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 20:41:19,622 WARNING [train.py:1204] (2/4) Exclude cut with ID _36538_210_9994_1_1533204020363_3795620_204-261385-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 20:41:19,627 WARNING [train.py:1204] (2/4) Exclude cut with ID _14821_210_8750_1_1530680503991_6829359_189-297451-0_sp1.1 from training. Duration: 0.5 2023-10-03 20:41:19,685 WARNING [train.py:1204] (2/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_1211-226066-0_sp0.9 from training. Duration: 0.8444375 2023-10-03 20:41:19,981 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.feed_forward1.hidden_balancer.prob, batch_count=1399693.3333333333, ans=0.125 2023-10-03 20:41:22,636 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.bypass.skip_rate, batch_count=1399693.3333333333, ans=0.09899494936611666 2023-10-03 20:41:25,152 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6786_210_1388_1_1527839699_4194495_273-149841-0_sp1.1 from training. Duration: 0.836375 2023-10-03 20:41:25,191 WARNING [train.py:1204] (2/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_380-162648-0_sp1.1 from training. Duration: 0.8545625 2023-10-03 20:41:25,191 WARNING [train.py:1204] (2/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_494-199538-0 from training. Duration: 0.75 2023-10-03 20:41:29,707 WARNING [train.py:1204] (2/4) Exclude cut with ID _49097_210_16049_1_1533952780207_4360979_276-87053-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 20:41:31,141 WARNING [train.py:1204] (2/4) Exclude cut with ID _5444_210_2151_1_1527418808464_5312210_726-27165-0 from training. Duration: 0.67 2023-10-03 20:41:34,233 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=1399760.0, ans=0.1 2023-10-03 20:41:36,051 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_128-202104-0_sp1.1 from training. Duration: 0.57275 2023-10-03 20:41:38,805 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_11349_210_6753_1_1529800861_1101207_26-65602-0_sp1.1 from training. Duration: 0.87275 2023-10-03 20:41:40,539 WARNING [train.py:1204] (2/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_159-328157-0 from training. Duration: 0.52 2023-10-03 20:41:40,642 WARNING [train.py:1204] (2/4) Exclude cut with ID _14069_210_5632_1_1530512692033_4803390_98-21934-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 20:41:42,063 WARNING [train.py:1204] (2/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_352-59958-0_sp0.9 from training. Duration: 0.9555625 2023-10-03 20:41:42,111 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6163_210_6400_1_1527506746_4263068_283-137811-0 from training. Duration: 0.77 2023-10-03 20:41:42,145 WARNING [train.py:1204] (2/4) Exclude cut with ID _21989_210_5854_1_1531899214931_3271360_133-44759-0_sp1.1 from training. Duration: 0.7 2023-10-03 20:41:46,239 INFO [train.py:1046] (2/4) Epoch 40, batch 2800, loss[loss=0.1526, simple_loss=0.2371, pruned_loss=0.03408, over 19290.00 frames. ], tot_loss[loss=0.1571, simple_loss=0.2365, pruned_loss=0.03886, over 4704814.87 frames. ], batch size: 42, lr: 2.54e-03, grad_scale: 16.0 2023-10-03 20:41:46,298 WARNING [train.py:1204] (2/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_21-174514-0_sp0.9 from training. Duration: 0.47775 2023-10-03 20:41:46,342 WARNING [train.py:1204] (2/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_201-78811-0_sp1.1 from training. Duration: 0.963625 2023-10-03 20:41:46,484 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.0.balancer2.prob, batch_count=1399826.6666666667, ans=0.125 2023-10-03 20:41:47,580 WARNING [train.py:1204] (2/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_537-164630-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 20:41:47,653 WARNING [train.py:1204] (2/4) Exclude cut with ID _14087_210_2253_1_1530529224512_3014029_156-257607-0 from training. Duration: 0.83 2023-10-03 20:41:47,663 WARNING [train.py:1204] (2/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_308-154450-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 20:41:49,648 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_45399_210_4852_1_1533624725_7579737_214-240574-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 20:41:50,995 WARNING [train.py:1204] (2/4) Exclude cut with ID _29303_210_2575_1_1532392237303_7410649_615-305517-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 20:41:51,039 WARNING [train.py:1204] (2/4) Exclude cut with ID IC0125W0002-61808 from training. Duration: 0.708 2023-10-03 20:41:51,039 WARNING [train.py:1204] (2/4) Exclude cut with ID IC0125W0017-61823 from training. Duration: 0.759 2023-10-03 20:41:53,856 WARNING [train.py:1204] (2/4) Exclude cut with ID _53219_210_4525_1_1534834836008_4064825_4-145291-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 20:41:54,372 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.5.encoder.layers.1.conv_module2.whiten, num_groups=1, num_channels=256, metric=5.22 vs. limit=15.0 2023-10-03 20:41:55,284 WARNING [train.py:1204] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_185-174805-0_sp1.1 from training. Duration: 0.6181875 2023-10-03 20:41:55,304 WARNING [train.py:1204] (2/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_635-193046-0_sp1.1 from training. Duration: 0.92725 2023-10-03 20:41:58,319 WARNING [train.py:1204] (2/4) Exclude cut with ID _46067_210_4220_1_1534125571066_3307450_321-258866-0_sp1.1 from training. Duration: 0.97275 2023-10-03 20:42:01,005 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8558_210_1410_1_1528505225_7613406_131-329524-0 from training. Duration: 0.79 2023-10-03 20:42:01,198 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.conv_module2.balancer1.prob, batch_count=1399893.3333333333, ans=0.125 2023-10-03 20:42:03,745 WARNING [train.py:1204] (2/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_847-41334-0_sp0.9 from training. Duration: 0.488875 2023-10-03 20:42:05,727 WARNING [train.py:1204] (2/4) Exclude cut with ID _14227_210_4381_1_1530701804286_3940259_125-275642-0 from training. Duration: 0.99 2023-10-03 20:42:05,835 WARNING [train.py:1204] (2/4) Exclude cut with ID _25248_210_8750_1_1532062825941_5554180_248-177255-0_sp1.1 from training. Duration: 0.963625 2023-10-03 20:42:07,168 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_40047_210_2290_1_1533290519_2374311_195-165693-0_sp1.1 from training. Duration: 0.8090625 2023-10-03 20:42:07,172 WARNING [train.py:1204] (2/4) Exclude cut with ID _23109_210_4381_1_1532150828777_901949_29-258672-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 20:42:11,746 WARNING [train.py:1204] (2/4) Exclude cut with ID _14821_210_8750_1_1530680503991_6829359_2-227151-0_sp1.1 from training. Duration: 0.7545625 2023-10-03 20:42:11,782 WARNING [train.py:1204] (2/4) Exclude cut with ID _49643_210_7989_1_1534640244224_3939031_565-53982-0_sp1.1 from training. Duration: 0.963625 2023-10-03 20:42:11,786 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7544_210_6753_1_1528591327_1471426_33-346031-0_sp0.9 from training. Duration: 0.67775 2023-10-03 20:42:13,253 WARNING [train.py:1204] (2/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_95-61782-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 20:42:21,978 WARNING [train.py:1204] (2/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_686-54383-0_sp0.9 from training. Duration: 0.9666875 2023-10-03 20:42:23,393 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_14056_210_4163_1_1530604483_7572827_505-276823-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 20:42:25,185 WARNING [train.py:1204] (2/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_745-128443-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 20:42:25,231 WARNING [train.py:1204] (2/4) Exclude cut with ID _3844_210_1390_1_1526727803582_2871209_20-251664-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 20:42:26,570 WARNING [train.py:1204] (2/4) Exclude cut with ID _36538_210_9994_1_1533204020363_3795620_487-164628-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 20:42:31,259 WARNING [train.py:1204] (2/4) Exclude cut with ID _5166_210_2084_1_1527073197160_4087993_309-222545-0_sp1.1 from training. Duration: 0.763625 2023-10-03 20:42:31,273 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_3531_210_3528_1_1526294203_5216456_309-227302-0 from training. Duration: 0.69 2023-10-03 20:42:31,452 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.balancer1.prob, batch_count=1400026.6666666667, ans=0.125 2023-10-03 20:42:32,744 WARNING [train.py:1204] (2/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_42-192460-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 20:42:32,818 WARNING [train.py:1204] (2/4) Exclude cut with ID _22083_210_8254_1_1531702862439_3678970_49-234546-0_sp1.1 from training. Duration: 0.7545625 2023-10-03 20:42:32,823 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_427-78097-0_sp0.9 from training. Duration: 0.888875 2023-10-03 20:42:37,517 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_43802_210_2290_1_1533516203_8829504_378-205254-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 20:42:37,561 WARNING [train.py:1204] (2/4) Exclude cut with ID _45329_210_7005_1_1534157951211_6774339_672-314830-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 20:42:42,087 WARNING [train.py:1204] (2/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_90-30673-0_sp1.1 from training. Duration: 0.763625 2023-10-03 20:42:43,481 WARNING [train.py:1204] (2/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_563-329943-0_sp1.1 from training. Duration: 0.9 2023-10-03 20:42:43,496 WARNING [train.py:1204] (2/4) Exclude cut with ID _21632_210_2253_1_1531994001765_3744140_154-84090-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 20:42:43,497 WARNING [train.py:1204] (2/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_103-336064-0_sp0.9 from training. Duration: 0.5666875 2023-10-03 20:42:43,555 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_43-71989-0_sp0.9 from training. Duration: 0.6333125 2023-10-03 20:42:45,000 WARNING [train.py:1204] (2/4) Exclude cut with ID _3867_210_1883_1_1526641200575_4475830_319-349850-0_sp0.9 from training. Duration: 0.7444375 2023-10-03 20:42:46,302 WARNING [train.py:1204] (2/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_629-179454-0_sp1.1 from training. Duration: 0.963625 2023-10-03 20:42:46,311 WARNING [train.py:1204] (2/4) Exclude cut with ID _65263_210_22356_1_1535763598691_3921037_49-222917-0 from training. Duration: 0.92 2023-10-03 20:42:46,351 WARNING [train.py:1204] (2/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_602-331772-0_sp1.1 from training. Duration: 0.936375 2023-10-03 20:42:47,725 WARNING [train.py:1204] (2/4) Exclude cut with ID _58414_210_8254_1_1535421609162_7151182_292-134978-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 20:42:47,766 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_38793_210_2544_1_1533176114_3849320_200-299292-0_sp1.1 from training. Duration: 0.936375 2023-10-03 20:42:49,623 WARNING [train.py:1204] (2/4) Exclude cut with ID _14970_210_15709_1_1530775547361_1966965_6-23328-0 from training. Duration: 0.86 2023-10-03 20:42:50,067 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.5.encoder.layers.1.whiten, num_groups=1, num_channels=256, metric=1.93 vs. limit=12.0 2023-10-03 20:42:50,924 WARNING [train.py:1204] (2/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_386-225511-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 20:42:50,936 WARNING [train.py:1204] (2/4) Exclude cut with ID _38323_210_12108_1_1533121233369_3303049_416-11919-0_sp1.1 from training. Duration: 0.87275 2023-10-03 20:42:52,289 WARNING [train.py:1204] (2/4) Exclude cut with ID _4866_210_2577_1_1527404412284_4364659_256-306830-0_sp1.1 from training. Duration: 0.7909375 2023-10-03 20:42:52,994 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.2.encoder.layers.2.nonlin_attention.whiten1, num_groups=1, num_channels=288, metric=8.31 vs. limit=10.0 2023-10-03 20:42:53,730 WARNING [train.py:1204] (2/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_53-300544-0 from training. Duration: 0.67 2023-10-03 20:42:59,198 WARNING [train.py:1204] (2/4) Exclude cut with ID _41314_210_12651_1_1533459476543_3766460_80-74184-0_sp1.1 from training. Duration: 0.7545625 2023-10-03 20:43:00,982 INFO [train.py:1046] (2/4) Epoch 40, batch 2850, loss[loss=0.16, simple_loss=0.2447, pruned_loss=0.03767, over 23493.00 frames. ], tot_loss[loss=0.1561, simple_loss=0.2353, pruned_loss=0.03844, over 4693501.82 frames. ], batch size: 93, lr: 2.54e-03, grad_scale: 16.0 2023-10-03 20:43:01,037 WARNING [train.py:1204] (2/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_332-256168-0_sp1.1 from training. Duration: 0.6818125 2023-10-03 20:43:01,083 WARNING [train.py:1204] (2/4) Exclude cut with ID _48836_210_12210_1_1534075848293_3904150_429-35475-0_sp1.1 from training. Duration: 0.8090625 2023-10-03 20:43:02,549 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_144-135833-0_sp1.1 from training. Duration: 0.97275 2023-10-03 20:43:05,382 WARNING [train.py:1204] (2/4) Exclude cut with ID _14224_210_4381_1_1530529062037_4264010_380-343670-0_sp0.9 from training. Duration: 0.97775 2023-10-03 20:43:06,743 WARNING [train.py:1204] (2/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_213-265174-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 20:43:06,775 WARNING [train.py:1204] (2/4) Exclude cut with ID _20455_210_5219_1_1531565996775_8098100_86-314283-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 20:43:09,909 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_41750_210_1960_1_1533520678_3948042_68-223813-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 20:43:11,194 WARNING [train.py:1204] (2/4) Exclude cut with ID _55153_210_2253_1_1534384786677_3162096_24-198429-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 20:43:13,055 WARNING [train.py:1204] (2/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_119-207162-0_sp0.9 from training. Duration: 0.788875 2023-10-03 20:43:13,119 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_12868_210_6753_1_1530702479_3610564_148-172861-0 from training. Duration: 0.5 2023-10-03 20:43:19,103 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_5522_210_3273_1_1527249606_3909096_318-203153-0 from training. Duration: 0.43 2023-10-03 20:43:19,110 WARNING [train.py:1204] (2/4) Exclude cut with ID _14884_210_2252_1_1530707459689_5561250_288-189949-0_sp1.1 from training. Duration: 0.97275 2023-10-03 20:43:20,606 WARNING [train.py:1204] (2/4) Exclude cut with ID _5294_210_1747_1_1527249643928_3696179_529-242298-0 from training. Duration: 0.95 2023-10-03 20:43:21,938 WARNING [train.py:1204] (2/4) Exclude cut with ID _36538_210_9994_1_1533204020363_3795620_503-110289-0_sp1.1 from training. Duration: 0.936375 2023-10-03 20:43:23,402 WARNING [train.py:1204] (2/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_385-193052-0 from training. Duration: 0.86 2023-10-03 20:43:24,612 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_456-22141-0 from training. Duration: 0.88 2023-10-03 20:43:25,947 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_38965_210_4852_1_1533174906_3484735_284-117840-0_sp1.1 from training. Duration: 0.936375 2023-10-03 20:43:37,525 WARNING [train.py:1204] (2/4) Exclude cut with ID _32704_210_8954_1_1533017488840_6747370_75-201625-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 20:43:39,387 WARNING [train.py:1204] (2/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_255-312654-0_sp1.1 from training. Duration: 0.82725 2023-10-03 20:43:39,399 WARNING [train.py:1204] (2/4) Exclude cut with ID _47251_210_18628_1_1533814063128_4137250_214-88076-0_sp0.9 from training. Duration: 0.97775 2023-10-03 20:43:39,491 WARNING [train.py:1204] (2/4) Exclude cut with ID _4092_210_2151_1_1526643044449_3135759_227-213972-0_sp1.1 from training. Duration: 0.4909375 2023-10-03 20:43:40,829 WARNING [train.py:1204] (2/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_912-47473-0_sp1.1 from training. Duration: 0.6181875 2023-10-03 20:43:40,865 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6767_210_3273_1_1527855783_1718932_173-256601-0_sp1.1 from training. Duration: 0.72725 2023-10-03 20:43:42,316 WARNING [train.py:1204] (2/4) Exclude cut with ID _6274_210_2084_1_1527937583837_7494020_32-205288-0_sp1.1 from training. Duration: 0.7090625 2023-10-03 20:43:44,037 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.644e+02 1.957e+02 2.192e+02 2.497e+02 4.123e+02, threshold=4.384e+02, percent-clipped=0.0 2023-10-03 20:43:44,119 WARNING [train.py:1204] (2/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_39-248870-0 from training. Duration: 0.69 2023-10-03 20:43:45,504 WARNING [train.py:1204] (2/4) Exclude cut with ID _3541_210_2477_1_1526475408709_3263973_377-296839-0_sp1.1 from training. Duration: 0.5 2023-10-03 20:43:45,519 WARNING [train.py:1204] (2/4) Exclude cut with ID _13679_210_7497_1_1530534293662_3355098_413-56154-0_sp1.1 from training. Duration: 0.963625 2023-10-03 20:43:46,865 WARNING [train.py:1204] (2/4) Exclude cut with ID _14970_210_15709_1_1530775547361_1966965_201-84544-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 20:43:46,923 WARNING [train.py:1204] (2/4) Exclude cut with ID _21783_210_11357_1_1531742458730_3762679_105-95775-0_sp1.1 from training. Duration: 0.936375 2023-10-03 20:43:50,132 WARNING [train.py:1204] (2/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_131-191669-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 20:43:50,173 WARNING [train.py:1204] (2/4) Exclude cut with ID _14860_210_4915_1_1530702020340_7343240_695-4323-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 20:43:50,255 WARNING [train.py:1204] (2/4) Exclude cut with ID _39867_210_9826_1_1533553222030_9344259_991-130565-0_sp1.1 from training. Duration: 0.97275 2023-10-03 20:43:51,627 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_50-3289-0_sp1.1 from training. Duration: 0.82725 2023-10-03 20:43:52,967 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_54-347670-0_sp1.1 from training. Duration: 0.92725 2023-10-03 20:43:54,339 WARNING [train.py:1204] (2/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_200-153445-0_sp1.1 from training. Duration: 0.936375 2023-10-03 20:43:55,656 WARNING [train.py:1204] (2/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_279-302911-0_sp1.1 from training. Duration: 0.97275 2023-10-03 20:43:56,972 WARNING [train.py:1204] (2/4) Exclude cut with ID _14886_210_4220_1_1530702020355_3516190_159-73748-0_sp0.9 from training. Duration: 0.92225 2023-10-03 20:43:59,765 WARNING [train.py:1204] (2/4) Exclude cut with ID _53379_210_20034_1_1534222027363_3854654_23-222715-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 20:44:02,772 WARNING [train.py:1204] (2/4) Exclude cut with ID _12555_210_8750_1_1530075594402_3780239_449-267986-0 from training. Duration: 0.83 2023-10-03 20:44:02,790 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7544_210_6753_1_1528587722_3576686_295-258479-0 from training. Duration: 0.84 2023-10-03 20:44:05,352 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7002_210_1794_1_1527935634_4034444_487-236501-0_sp0.9 from training. Duration: 0.7333125 2023-10-03 20:44:06,721 WARNING [train.py:1204] (2/4) Exclude cut with ID _21783_210_11357_1_1531742458730_3762679_244-19107-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 20:44:06,735 WARNING [train.py:1204] (2/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_390-339137-0 from training. Duration: 0.56 2023-10-03 20:44:08,136 WARNING [train.py:1204] (2/4) Exclude cut with ID _7562_210_2084_1_1528887767916_3946229_186-326422-0_sp1.1 from training. Duration: 0.763625 2023-10-03 20:44:08,201 WARNING [train.py:1204] (2/4) Exclude cut with ID _33640_210_2655_1_1533117605479_4855469_645-145235-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 20:44:08,220 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_25767_210_2161_1_1532307483_3867748_506-171777-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 20:44:08,244 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_5438_210_1794_1_1527336687_2600009_233-58072-0_sp1.1 from training. Duration: 0.7 2023-10-03 20:44:08,244 WARNING [train.py:1204] (2/4) Exclude cut with ID IC0669W0063-330908 from training. Duration: 0.896 2023-10-03 20:44:10,133 WARNING [train.py:1204] (2/4) Exclude cut with ID IC0671W0070-331913 from training. Duration: 0.896 2023-10-03 20:44:10,137 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_258-310570-0_sp1.1 from training. Duration: 0.7454375 2023-10-03 20:44:10,192 WARNING [train.py:1204] (2/4) Exclude cut with ID _9461_210_4557_1_1529060514726_3677451_162-177507-0_sp1.1 from training. Duration: 0.97275 2023-10-03 20:44:14,220 INFO [train.py:1046] (2/4) Epoch 40, batch 2900, loss[loss=0.1657, simple_loss=0.2441, pruned_loss=0.04363, over 23170.00 frames. ], tot_loss[loss=0.1562, simple_loss=0.2351, pruned_loss=0.0386, over 4682199.05 frames. ], batch size: 105, lr: 2.54e-03, grad_scale: 16.0 2023-10-03 20:44:16,078 WARNING [train.py:1204] (2/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_26-175553-0_sp0.9 from training. Duration: 0.67775 2023-10-03 20:44:16,094 WARNING [train.py:1204] (2/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_7-150481-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 20:44:16,152 WARNING [train.py:1204] (2/4) Exclude cut with ID _23557_210_8750_1_1531890106370_6846255_72-273262-0_sp1.1 from training. Duration: 0.963625 2023-10-03 20:44:16,239 WARNING [train.py:1204] (2/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_367-61330-0 from training. Duration: 0.75 2023-10-03 20:44:20,682 WARNING [train.py:1204] (2/4) Exclude cut with ID _39867_210_9826_1_1533553222030_9344259_1337-11141-0_sp1.1 from training. Duration: 0.936375 2023-10-03 20:44:20,703 WARNING [train.py:1204] (2/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_101-293367-0 from training. Duration: 0.63 2023-10-03 20:44:22,011 WARNING [train.py:1204] (2/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_10-100367-0 from training. Duration: 0.98 2023-10-03 20:44:23,455 WARNING [train.py:1204] (2/4) Exclude cut with ID _60057_210_10640_1_1534903065793_3333150_171-181133-0_sp1.1 from training. Duration: 0.8 2023-10-03 20:44:23,457 WARNING [train.py:1204] (2/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_844-96989-0_sp0.9 from training. Duration: 0.788875 2023-10-03 20:44:24,946 WARNING [train.py:1204] (2/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_301-144595-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 20:44:25,224 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.bypass.skip_rate, batch_count=1400493.3333333333, ans=0.07 2023-10-03 20:44:26,293 WARNING [train.py:1204] (2/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_115-147343-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 20:44:29,041 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_3171_210_2087_1_1526109084_2330007_37-64334-0_sp1.1 from training. Duration: 0.7454375 2023-10-03 20:44:30,367 WARNING [train.py:1204] (2/4) Exclude cut with ID _19514_210_9259_1_1531530071330_5084230_769-272914-0_sp1.1 from training. Duration: 0.936375 2023-10-03 20:44:32,256 WARNING [train.py:1204] (2/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_76-311963-0_sp0.9 from training. Duration: 0.77775 2023-10-03 20:44:32,288 WARNING [train.py:1204] (2/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_526-62079-0 from training. Duration: 0.67 2023-10-03 20:44:32,336 WARNING [train.py:1204] (2/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_7-111954-0_sp0.9 from training. Duration: 0.62225 2023-10-03 20:44:35,057 WARNING [train.py:1204] (2/4) Exclude cut with ID _57997_210_4163_1_1534766425889_3961049_360-279140-0_sp1.1 from training. Duration: 0.97275 2023-10-03 20:44:37,814 WARNING [train.py:1204] (2/4) Exclude cut with ID _4866_210_2577_1_1527404412284_4364659_133-1696-0 from training. Duration: 0.99 2023-10-03 20:44:39,740 WARNING [train.py:1204] (2/4) Exclude cut with ID _5190_210_4557_1_1527244368799_373439_28-197030-0 from training. Duration: 0.72 2023-10-03 20:44:41,443 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.ff3_skip_rate, batch_count=1400560.0, ans=0.0 2023-10-03 20:44:43,884 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_53-39003-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 20:44:43,887 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6673_210_3273_1_1527767790_3173240_351-167954-0 from training. Duration: 0.48 2023-10-03 20:44:43,912 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_2822_210_2715_1_1526112586_1999587_165-36656-0_sp1.1 from training. Duration: 0.8090625 2023-10-03 20:44:47,251 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_328-264892-0_sp1.1 from training. Duration: 0.92725 2023-10-03 20:44:47,256 WARNING [train.py:1204] (2/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_29-293896-0_sp0.9 from training. Duration: 0.67775 2023-10-03 20:44:48,757 WARNING [train.py:1204] (2/4) Exclude cut with ID _65263_210_22356_1_1535763598691_3921037_465-55448-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 20:44:50,468 WARNING [train.py:1204] (2/4) Exclude cut with ID _21989_210_5854_1_1531899214931_3271360_311-53106-0_sp1.1 from training. Duration: 0.97275 2023-10-03 20:44:53,281 WARNING [train.py:1204] (2/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_389-277915-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 20:44:55,914 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_69630_210_7234_1_1535791094_677837_63-277139-0_sp1.1 from training. Duration: 0.963625 2023-10-03 20:44:56,055 WARNING [train.py:1204] (2/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_85-138257-0 from training. Duration: 0.71 2023-10-03 20:44:56,071 WARNING [train.py:1204] (2/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_686-247600-0 from training. Duration: 0.92 2023-10-03 20:44:57,261 WARNING [train.py:1204] (2/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_108-255430-0_sp1.1 from training. Duration: 0.8818125 2023-10-03 20:45:01,407 WARNING [train.py:1204] (2/4) Exclude cut with ID _14884_210_2252_1_1530707459689_5561250_114-201399-0_sp1.1 from training. Duration: 0.6909375 2023-10-03 20:45:04,346 WARNING [train.py:1204] (2/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_506-42614-0 from training. Duration: 0.79 2023-10-03 20:45:06,065 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_43802_210_2290_1_1533516203_8829504_85-195240-0_sp1.1 from training. Duration: 0.7909375 2023-10-03 20:45:12,101 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_141-36243-0_sp1.1 from training. Duration: 0.97275 2023-10-03 20:45:21,818 WARNING [train.py:1204] (2/4) Exclude cut with ID _7852_210_5219_1_1528286471316_8139400_358-287492-0_sp1.1 from training. Duration: 0.87275 2023-10-03 20:45:21,848 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_805-71497-0_sp0.9 from training. Duration: 0.788875 2023-10-03 20:45:23,273 WARNING [train.py:1204] (2/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_178-39722-0 from training. Duration: 0.49 2023-10-03 20:45:27,244 WARNING [train.py:1204] (2/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_428-181925-0_sp1.1 from training. Duration: 0.963625 2023-10-03 20:45:27,248 WARNING [train.py:1204] (2/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_770-200430-0 from training. Duration: 0.89 2023-10-03 20:45:28,538 INFO [train.py:1046] (2/4) Epoch 40, batch 2950, loss[loss=0.1627, simple_loss=0.2392, pruned_loss=0.04305, over 23689.00 frames. ], tot_loss[loss=0.1569, simple_loss=0.2364, pruned_loss=0.03876, over 4691582.16 frames. ], batch size: 232, lr: 2.54e-03, grad_scale: 16.0 2023-10-03 20:45:28,649 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_1104-271009-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 20:45:29,963 WARNING [train.py:1204] (2/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_128-296581-0_sp0.9 from training. Duration: 0.82225 2023-10-03 20:45:34,287 WARNING [train.py:1204] (2/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_19-62511-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 20:45:35,601 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_805-71497-0 from training. Duration: 0.71 2023-10-03 20:45:35,668 WARNING [train.py:1204] (2/4) Exclude cut with ID _44778_210_2054_1_1533798127203_7443090_573-152077-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 20:45:35,673 WARNING [train.py:1204] (2/4) Exclude cut with ID _42875_210_14856_1_1533704397487_4012433_393-177816-0_sp1.1 from training. Duration: 0.963625 2023-10-03 20:45:37,035 WARNING [train.py:1204] (2/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_278-35159-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 20:45:38,196 WARNING [train.py:1204] (2/4) Exclude cut with ID _15037_210_2253_1_1530752294104_3267650_13-130361-0_sp1.1 from training. Duration: 0.9 2023-10-03 20:45:40,846 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_3019_210_2028_1_1526044245_3264865_127-135615-0 from training. Duration: 0.94 2023-10-03 20:45:40,900 WARNING [train.py:1204] (2/4) Exclude cut with ID _28317_210_12742_1_1532480302381_8510325_607-213951-0 from training. Duration: 0.84 2023-10-03 20:45:42,849 WARNING [train.py:1204] (2/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_436-318113-0_sp0.9 from training. Duration: 0.8333125 2023-10-03 20:45:42,850 WARNING [train.py:1204] (2/4) Exclude cut with ID _13938_210_9988_1_1530518718978_3693960_148-152225-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 20:45:48,933 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_2541_210_1794_1_1525590269_3084765_156-173713-0_sp1.1 from training. Duration: 0.736375 2023-10-03 20:45:50,400 WARNING [train.py:1204] (2/4) Exclude cut with ID _28323_210_16187_1_1532480395667_4100300_531-24157-0_sp1.1 from training. Duration: 0.8909375 2023-10-03 20:45:52,413 WARNING [train.py:1204] (2/4) Exclude cut with ID _29303_210_2575_1_1532392237303_7410649_382-217492-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 20:45:53,684 WARNING [train.py:1204] (2/4) Exclude cut with ID _58895_210_8750_1_1535184081320_3649089_270-158013-0_sp1.1 from training. Duration: 0.92725 2023-10-03 20:45:55,196 WARNING [train.py:1204] (2/4) Exclude cut with ID _29277_210_3279_1_1532581109293_3173069_311-206227-0_sp1.1 from training. Duration: 0.936375 2023-10-03 20:45:55,207 WARNING [train.py:1204] (2/4) Exclude cut with ID _7160_210_1876_1_1527940923231_3703799_282-322611-0_sp0.9 from training. Duration: 0.9666875 2023-10-03 20:45:56,607 WARNING [train.py:1204] (2/4) Exclude cut with ID _53219_210_4525_1_1534834836008_4064825_562-32295-0_sp1.1 from training. Duration: 0.963625 2023-10-03 20:45:57,974 WARNING [train.py:1204] (2/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_408-87330-0_sp1.1 from training. Duration: 0.963625 2023-10-03 20:45:57,987 WARNING [train.py:1204] (2/4) Exclude cut with ID _59202_210_26984_1_1534816810612_3549174_137-318710-0_sp1.1 from training. Duration: 0.8090625 2023-10-03 20:46:00,820 WARNING [train.py:1204] (2/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_372-30302-0 from training. Duration: 0.86 2023-10-03 20:46:02,351 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.balancer2.prob, batch_count=1400960.0, ans=0.125 2023-10-03 20:46:06,315 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7543_210_6753_1_1528541907_1399322_178-56269-0_sp1.1 from training. Duration: 0.95275 2023-10-03 20:46:06,333 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9109W0012-549758 from training. Duration: 0.512 2023-10-03 20:46:07,589 WARNING [train.py:1204] (2/4) Exclude cut with ID _5410_210_1388_1_1527397055795_1719020_39-112221-0_sp0.9 from training. Duration: 0.7666875 2023-10-03 20:46:09,044 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9121W0012-554933 from training. Duration: 0.896 2023-10-03 20:46:09,289 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.balancer1.prob, batch_count=1400960.0, ans=0.125 2023-10-03 20:46:10,282 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.618e+02 1.938e+02 2.092e+02 2.435e+02 3.514e+02, threshold=4.184e+02, percent-clipped=0.0 2023-10-03 20:46:11,744 WARNING [train.py:1204] (2/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_476-272757-0 from training. Duration: 0.93 2023-10-03 20:46:11,767 WARNING [train.py:1204] (2/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_1085-171718-0_sp1.1 from training. Duration: 0.92725 2023-10-03 20:46:11,823 WARNING [train.py:1204] (2/4) Exclude cut with ID _8480_210_1874_1_1528589391205_6728330_309-139896-0_sp1.1 from training. Duration: 0.736375 2023-10-03 20:46:11,824 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9130W0113-559703 from training. Duration: 0.896 2023-10-03 20:46:11,829 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_292-265928-0_sp1.1 from training. Duration: 0.463625 2023-10-03 20:46:13,483 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=1401026.6666666667, ans=0.1 2023-10-03 20:46:15,212 WARNING [train.py:1204] (2/4) Exclude cut with ID _8772_210_1850_1_1528635595972_3664011_96-12756-0 from training. Duration: 0.98 2023-10-03 20:46:15,291 WARNING [train.py:1204] (2/4) Exclude cut with ID _12556_210_8341_1_1530081024100_7327690_725-193477-0_sp1.1 from training. Duration: 0.8909375 2023-10-03 20:46:16,660 WARNING [train.py:1204] (2/4) Exclude cut with ID _5289_210_2084_1_1527678202736_3779299_142-103317-0_sp1.1 from training. Duration: 0.763625 2023-10-03 20:46:19,370 WARNING [train.py:1204] (2/4) Exclude cut with ID _45012_210_12651_1_1533608435136_5371380_206-51075-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 20:46:19,476 WARNING [train.py:1204] (2/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_176-56120-0_sp1.1 from training. Duration: 0.7545625 2023-10-03 20:46:19,508 WARNING [train.py:1204] (2/4) Exclude cut with ID _40284_210_2084_1_1533513679814_3704279_220-201864-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 20:46:19,529 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9165W0004-576638 from training. Duration: 0.896 2023-10-03 20:46:20,761 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_5438_210_1794_1_1527336687_2600009_218-343336-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 20:46:20,785 WARNING [train.py:1204] (2/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_318-162017-0 from training. Duration: 0.91 2023-10-03 20:46:24,343 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_49544_210_1794_1_1534379770_5262083_483-349298-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 20:46:25,709 WARNING [train.py:1204] (2/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_197-28164-0_sp0.9 from training. Duration: 0.888875 2023-10-03 20:46:26,958 WARNING [train.py:1204] (2/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_643-52007-0 from training. Duration: 0.57 2023-10-03 20:46:26,983 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_38965_210_4852_1_1533174906_3484735_403-332963-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 20:46:28,383 WARNING [train.py:1204] (2/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_340-174176-0 from training. Duration: 0.49 2023-10-03 20:46:29,841 WARNING [train.py:1204] (2/4) Exclude cut with ID _56708_210_13177_1_1535252351282_3422899_363-917-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 20:46:31,321 WARNING [train.py:1204] (2/4) Exclude cut with ID _19896_210_1883_1_1531709022662_6649569_525-159063-0_sp1.1 from training. Duration: 0.936375 2023-10-03 20:46:32,586 WARNING [train.py:1204] (2/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_430-116139-0_sp0.9 from training. Duration: 0.9444375 2023-10-03 20:46:35,218 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_53753_210_7234_1_1534320003_3594148_86-329250-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 20:46:35,235 WARNING [train.py:1204] (2/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_406-275649-0_sp1.1 from training. Duration: 0.3818125 2023-10-03 20:46:37,880 WARNING [train.py:1204] (2/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_1083-147536-0_sp1.1 from training. Duration: 0.8818125 2023-10-03 20:46:37,939 WARNING [train.py:1204] (2/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_54-311741-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 20:46:37,944 WARNING [train.py:1204] (2/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_237-58433-0_sp0.9 from training. Duration: 0.82225 2023-10-03 20:46:39,276 WARNING [train.py:1204] (2/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_98-51676-0_sp1.1 from training. Duration: 0.663625 2023-10-03 20:46:39,330 WARNING [train.py:1204] (2/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_386-270472-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 20:46:40,683 INFO [train.py:1046] (2/4) Epoch 40, batch 3000, loss[loss=0.158, simple_loss=0.2484, pruned_loss=0.0338, over 24293.00 frames. ], tot_loss[loss=0.158, simple_loss=0.2374, pruned_loss=0.03928, over 4692654.21 frames. ], batch size: 77, lr: 2.54e-03, grad_scale: 16.0 2023-10-03 20:46:40,683 INFO [train.py:1069] (2/4) Computing validation loss 2023-10-03 20:46:52,702 INFO [train.py:1078] (2/4) Epoch 40, validation: loss=0.3553, simple_loss=0.2798, pruned_loss=0.2154, over 1125622.00 frames. 2023-10-03 20:46:52,703 INFO [train.py:1079] (2/4) Maximum memory allocated so far is 20967MB 2023-10-03 20:46:52,763 WARNING [train.py:1204] (2/4) Exclude cut with ID _19023_210_4353_1_1531378892552_5820780_83-254794-0_sp1.1 from training. Duration: 0.7818125 2023-10-03 20:46:54,076 WARNING [train.py:1204] (2/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_314-93656-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 20:46:54,103 WARNING [train.py:1204] (2/4) Exclude cut with ID _14170_210_5219_1_1530525949868_4469650_336-213530-0 from training. Duration: 0.9 2023-10-03 20:46:55,501 WARNING [train.py:1204] (2/4) Exclude cut with ID _36686_210_5665_1_1533778225913_3646410_65-221120-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 20:46:59,438 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_26433_210_12108_1_1532157158_4056480_271-68178-0_sp1.1 from training. Duration: 0.92725 2023-10-03 20:46:59,475 WARNING [train.py:1204] (2/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_46-207599-0_sp0.9 from training. Duration: 0.711125 2023-10-03 20:47:02,433 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9303W0114-637133 from training. Duration: 0.896 2023-10-03 20:47:02,465 WARNING [train.py:1204] (2/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_490-207861-0 from training. Duration: 0.98 2023-10-03 20:47:05,228 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6767_210_3273_1_1527855783_1718932_173-256601-0_sp0.9 from training. Duration: 0.888875 2023-10-03 20:47:06,528 WARNING [train.py:1204] (2/4) Exclude cut with ID _13921_210_9994_1_1530841782272_3503520_327-69677-0_sp1.1 from training. Duration: 0.8181875 2023-10-03 20:47:06,581 WARNING [train.py:1204] (2/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_508-335369-0 from training. Duration: 0.48 2023-10-03 20:47:06,615 WARNING [train.py:1204] (2/4) Exclude cut with ID _64631_210_27767_1_1535349580068_4165160_24-46567-0_sp1.1 from training. Duration: 0.963625 2023-10-03 20:47:13,871 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_89-35748-0_sp1.1 from training. Duration: 0.7181875 2023-10-03 20:47:17,372 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.out_combiner.scale_min, batch_count=1401226.6666666667, ans=0.2 2023-10-03 20:47:21,858 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_26433_210_12108_1_1532157158_4056480_578-262107-0_sp1.1 from training. Duration: 0.9 2023-10-03 20:47:23,512 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.balancer2.prob, batch_count=1401293.3333333333, ans=0.125 2023-10-03 20:47:26,598 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.feed_forward2.hidden_balancer.prob, batch_count=1401293.3333333333, ans=0.125 2023-10-03 20:47:27,262 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.0.layers.1.self_attn1.whiten, num_groups=1, num_channels=192, metric=12.37 vs. limit=22.5 2023-10-03 20:47:29,021 WARNING [train.py:1204] (2/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_129-337802-0 from training. Duration: 0.72 2023-10-03 20:47:29,109 WARNING [train.py:1204] (2/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_356-167816-0_sp0.9 from training. Duration: 0.8 2023-10-03 20:47:32,327 WARNING [train.py:1204] (2/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_137-116442-0_sp1.1 from training. Duration: 0.7090625 2023-10-03 20:47:32,357 WARNING [train.py:1204] (2/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_358-172940-0_sp1.1 from training. Duration: 0.963625 2023-10-03 20:47:32,377 WARNING [train.py:1204] (2/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_668-344147-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 20:47:33,776 WARNING [train.py:1204] (2/4) Exclude cut with ID _62337_210_12392_1_1535009273105_3412229_255-157667-0_sp1.1 from training. Duration: 0.936375 2023-10-03 20:47:33,779 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_486-151131-0 from training. Duration: 0.85 2023-10-03 20:47:33,936 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.ff2_skip_rate, batch_count=1401293.3333333333, ans=0.0 2023-10-03 20:47:36,578 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_53638_210_21581_1_1534297319_5808449_885-80172-0 from training. Duration: 0.96 2023-10-03 20:47:37,831 WARNING [train.py:1204] (2/4) Exclude cut with ID _50093_210_4220_1_1534417141207_3432299_105-183067-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 20:47:37,887 WARNING [train.py:1204] (2/4) Exclude cut with ID _7562_210_2084_1_1528887767916_3946229_345-169156-0_sp0.9 from training. Duration: 0.6444375 2023-10-03 20:47:41,077 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_4528_210_3273_1_1527076654_3276777_209-219804-0_sp0.9 from training. Duration: 0.7666875 2023-10-03 20:47:41,104 WARNING [train.py:1204] (2/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_35-153886-0_sp0.9 from training. Duration: 0.9333125 2023-10-03 20:47:42,575 WARNING [train.py:1204] (2/4) Exclude cut with ID _30800_210_22382_1_1532499347904_4515959_332-121925-0_sp1.1 from training. Duration: 0.92725 2023-10-03 20:47:42,577 WARNING [train.py:1204] (2/4) Exclude cut with ID _59202_210_26984_1_1534816810612_3549174_495-65515-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 20:47:45,303 WARNING [train.py:1204] (2/4) Exclude cut with ID _6274_210_2084_1_1527937583837_7494020_769-168302-0_sp1.1 from training. Duration: 0.6454375 2023-10-03 20:47:46,583 WARNING [train.py:1204] (2/4) Exclude cut with ID _24271_210_12658_1_1531987270022_7190519_708-71345-0_sp1.1 from training. Duration: 0.936375 2023-10-03 20:47:46,583 WARNING [train.py:1204] (2/4) Exclude cut with ID 3033-130750-0096-55598-0_sp0.9 from training. Duration: 0.92225 2023-10-03 20:47:48,541 WARNING [train.py:1204] (2/4) Exclude cut with ID _14087_210_2253_1_1530529224512_3014029_219-224445-0_sp0.9 from training. Duration: 0.9333125 2023-10-03 20:47:50,386 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_174-277364-0 from training. Duration: 0.71 2023-10-03 20:47:51,773 WARNING [train.py:1204] (2/4) Exclude cut with ID _45021_210_9994_1_1533619751094_3330288_96-239010-0_sp1.1 from training. Duration: 0.82725 2023-10-03 20:47:51,802 WARNING [train.py:1204] (2/4) Exclude cut with ID _31602_210_5854_1_1532677021456_4458250_489-305116-0_sp1.1 from training. Duration: 0.97275 2023-10-03 20:47:51,828 WARNING [train.py:1204] (2/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_269-154726-0_sp0.9 from training. Duration: 0.9555625 2023-10-03 20:47:56,337 WARNING [train.py:1204] (2/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_126-54418-0_sp1.1 from training. Duration: 0.92725 2023-10-03 20:47:56,360 WARNING [train.py:1204] (2/4) Exclude cut with ID _42326_210_7770_1_1533814282531_3707269_182-224159-0_sp1.1 from training. Duration: 0.92725 2023-10-03 20:47:57,742 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7002_210_1794_1_1527939683_5157286_1-281264-0_sp0.9 from training. Duration: 0.488875 2023-10-03 20:47:59,105 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_86-16881-0 from training. Duration: 0.58 2023-10-03 20:47:59,129 WARNING [train.py:1204] (2/4) Exclude cut with ID _19514_210_9259_1_1531530071330_5084230_637-279094-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 20:47:59,153 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_26433_210_12108_1_1532157158_4056480_578-262107-0 from training. Duration: 0.99 2023-10-03 20:48:00,563 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6291_210_3273_1_1527683649_1078498_82-221196-0_sp1.1 from training. Duration: 0.6909375 2023-10-03 20:48:01,918 WARNING [train.py:1204] (2/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_402-102311-0 from training. Duration: 0.67 2023-10-03 20:48:05,796 INFO [train.py:1046] (2/4) Epoch 40, batch 3050, loss[loss=0.1442, simple_loss=0.222, pruned_loss=0.03318, over 24630.00 frames. ], tot_loss[loss=0.1586, simple_loss=0.2382, pruned_loss=0.03944, over 4700349.40 frames. ], batch size: 60, lr: 2.54e-03, grad_scale: 16.0 2023-10-03 20:48:05,848 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1015-343884-0_sp1.1 from training. Duration: 0.563625 2023-10-03 20:48:05,937 WARNING [train.py:1204] (2/4) Exclude cut with ID _13684_210_7497_1_1530619354863_3598359_312-235606-0_sp1.1 from training. Duration: 0.5454375 2023-10-03 20:48:07,212 WARNING [train.py:1204] (2/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_55-31904-0 from training. Duration: 0.88 2023-10-03 20:48:07,286 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_3133_210_2087_1_1526106446_2324690_25-332138-0 from training. Duration: 0.91 2023-10-03 20:48:07,288 WARNING [train.py:1204] (2/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_912-282412-0_sp0.9 from training. Duration: 0.5444375 2023-10-03 20:48:08,628 WARNING [train.py:1204] (2/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_458-299435-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 20:48:10,381 WARNING [train.py:1204] (2/4) Exclude cut with ID _69584_210_5219_1_1535785133775_4044450_275-88403-0_sp1.1 from training. Duration: 0.92725 2023-10-03 20:48:10,396 WARNING [train.py:1204] (2/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_841-341679-0_sp1.1 from training. Duration: 0.52725 2023-10-03 20:48:10,403 WARNING [train.py:1204] (2/4) Exclude cut with ID _41391_210_7994_1_1533603771872_3638201_1-54521-0_sp1.1 from training. Duration: 0.97275 2023-10-03 20:48:11,770 WARNING [train.py:1204] (2/4) Exclude cut with ID _56235_210_10640_1_1534557335235_4161099_495-186609-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 20:48:13,236 WARNING [train.py:1204] (2/4) Exclude cut with ID _4681_210_3525_1_1527250689464_120889_15-126760-0 from training. Duration: 0.81 2023-10-03 20:48:14,676 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_3019_210_2028_1_1526044245_3264865_127-135615-0_sp1.1 from training. Duration: 0.8545625 2023-10-03 20:48:16,737 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.1.encoder.layers.1.nonlin_attention.whiten2, num_groups=1, num_channels=256, metric=6.72 vs. limit=15.0 2023-10-03 20:48:17,463 WARNING [train.py:1204] (2/4) Exclude cut with ID _30191_210_1664_1_1533189915047_7196670_915-21561-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 20:48:18,848 WARNING [train.py:1204] (2/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_6-214815-0_sp0.9 from training. Duration: 0.9444375 2023-10-03 20:48:21,528 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_45399_210_4852_1_1533624725_7579737_190-137494-0_sp1.1 from training. Duration: 0.97275 2023-10-03 20:48:24,064 WARNING [train.py:1204] (2/4) Exclude cut with ID _41314_210_12651_1_1533459476543_3766460_207-132038-0 from training. Duration: 0.94 2023-10-03 20:48:26,616 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.3.encoder.layers.1.feed_forward3.out_whiten, num_groups=1, num_channels=512, metric=9.87 vs. limit=15.0 2023-10-03 20:48:29,946 WARNING [train.py:1204] (2/4) Exclude cut with ID _14603_210_4220_1_1530615688441_3097319_90-31383-0 from training. Duration: 0.87 2023-10-03 20:48:29,992 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_15517_210_6753_1_1530952293_1664795_119-31001-0 from training. Duration: 0.94 2023-10-03 20:48:31,290 WARNING [train.py:1204] (2/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_684-85342-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 20:48:32,062 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.3.encoder.layers.0.feed_forward2.out_whiten, num_groups=1, num_channels=512, metric=3.44 vs. limit=15.0 2023-10-03 20:48:34,069 WARNING [train.py:1204] (2/4) Exclude cut with ID _14603_210_4220_1_1530615688441_3097319_97-325053-0_sp1.1 from training. Duration: 0.836375 2023-10-03 20:48:37,046 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_68702_210_23689_1_1535799577_3420068_168-25150-0_sp1.1 from training. Duration: 0.97275 2023-10-03 20:48:37,053 WARNING [train.py:1204] (2/4) Exclude cut with ID _65134_210_14763_1_1535335393985_3505368_111-27984-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 20:48:38,450 WARNING [train.py:1204] (2/4) Exclude cut with ID _48678_210_5663_1_1533988830947_3769199_276-226230-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 20:48:39,919 WARNING [train.py:1204] (2/4) Exclude cut with ID _28427_210_4220_1_1532581240967_3061069_43-84928-0_sp1.1 from training. Duration: 0.9 2023-10-03 20:48:41,745 WARNING [train.py:1204] (2/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_158-250116-0_sp1.1 from training. Duration: 0.563625 2023-10-03 20:48:41,784 WARNING [train.py:1204] (2/4) Exclude cut with ID _66873_210_5517_1_1535454136506_4293370_279-332556-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 20:48:41,813 WARNING [train.py:1204] (2/4) Exclude cut with ID _42863_210_9070_1_1533902398788_3696920_488-230146-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 20:48:41,814 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_387-247159-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 20:48:44,433 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6729_210_2550_1_1528032481_1788725_261-42619-0_sp1.1 from training. Duration: 0.97275 2023-10-03 20:48:44,785 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.conv_skip_rate, batch_count=1401626.6666666667, ans=0.0 2023-10-03 20:48:45,838 WARNING [train.py:1204] (2/4) Exclude cut with ID _19896_210_1883_1_1531709022662_6649569_831-44724-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 20:48:48,606 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.484e+02 1.898e+02 2.071e+02 2.314e+02 3.328e+02, threshold=4.142e+02, percent-clipped=0.0 2023-10-03 20:48:48,736 WARNING [train.py:1204] (2/4) Exclude cut with ID _55153_210_2253_1_1534384786677_3162096_280-285944-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 20:48:48,764 WARNING [train.py:1204] (2/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_448-4048-0 from training. Duration: 0.96 2023-10-03 20:48:50,618 WARNING [train.py:1204] (2/4) Exclude cut with ID _29678_210_7893_1_1532421042787_3906118_456-202548-0_sp1.1 from training. Duration: 0.97275 2023-10-03 20:48:50,638 WARNING [train.py:1204] (2/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_107-332398-0_sp1.1 from training. Duration: 0.7090625 2023-10-03 20:48:53,885 WARNING [train.py:1204] (2/4) Exclude cut with ID _13921_210_9994_1_1530838702800_3003429_46-149444-0_sp1.1 from training. Duration: 0.8545625 2023-10-03 20:48:53,938 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_257-20733-0_sp0.9 from training. Duration: 0.8444375 2023-10-03 20:48:55,226 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_53753_210_7234_1_1534320003_3594148_385-234809-0_sp1.1 from training. Duration: 0.963625 2023-10-03 20:48:55,283 WARNING [train.py:1204] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_292-215346-0_sp1.1 from training. Duration: 0.8818125 2023-10-03 20:49:01,089 WARNING [train.py:1204] (2/4) Exclude cut with ID _68965_210_1883_1_1535698962272_3497490_196-124268-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 20:49:01,136 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_23801_210_16223_1_1532066392_3294657_570-5439-0_sp1.1 from training. Duration: 0.8818125 2023-10-03 20:49:04,436 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.4.encoder.layers.2.self_attn1.whiten, num_groups=1, num_channels=384, metric=11.70 vs. limit=22.5 2023-10-03 20:49:06,683 WARNING [train.py:1204] (2/4) Exclude cut with ID _5166_210_2084_1_1527073197160_4087993_349-6928-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 20:49:06,727 WARNING [train.py:1204] (2/4) Exclude cut with ID _60621_210_17071_1_1535013053530_3671739_226-255202-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 20:49:06,731 WARNING [train.py:1204] (2/4) Exclude cut with ID _41817_210_18835_1_1533434477160_7423049_448-130302-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 20:49:08,160 WARNING [train.py:1204] (2/4) Exclude cut with ID _6653_210_2629_1_1527919172441_6732679_758-143677-0_sp1.1 from training. Duration: 0.8909375 2023-10-03 20:49:08,193 WARNING [train.py:1204] (2/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_912-47473-0_sp0.9 from training. Duration: 0.7555625 2023-10-03 20:49:09,520 WARNING [train.py:1204] (2/4) Exclude cut with ID _6694_210_4599_1_1527852637490_3965529_356-169233-0_sp1.1 from training. Duration: 0.9 2023-10-03 20:49:09,593 WARNING [train.py:1204] (2/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_1-99680-0 from training. Duration: 0.67 2023-10-03 20:49:10,980 WARNING [train.py:1204] (2/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_199-86432-0_sp1.1 from training. Duration: 0.8909375 2023-10-03 20:49:11,186 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.bypass_mid.scale_min, batch_count=1401760.0, ans=0.2 2023-10-03 20:49:12,824 WARNING [train.py:1204] (2/4) Exclude cut with ID _61148_210_6897_1_1535094022726_4217610_362-182430-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 20:49:12,909 WARNING [train.py:1204] (2/4) Exclude cut with ID _49376_210_5986_1_1533985278732_3370740_301-154763-0 from training. Duration: 0.98 2023-10-03 20:49:14,449 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.1.conv_module2.balancer1.prob, batch_count=1401760.0, ans=0.125 2023-10-03 20:49:14,911 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.3.encoder.layers.1.conv_module1.whiten, num_groups=1, num_channels=512, metric=4.05 vs. limit=15.0 2023-10-03 20:49:15,603 WARNING [train.py:1204] (2/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_572-185150-0_sp1.1 from training. Duration: 0.8818125 2023-10-03 20:49:20,218 INFO [train.py:1046] (2/4) Epoch 40, batch 3100, loss[loss=0.1869, simple_loss=0.24, pruned_loss=0.06687, over 19824.00 frames. ], tot_loss[loss=0.1581, simple_loss=0.2377, pruned_loss=0.03927, over 4699410.75 frames. ], batch size: 388, lr: 2.54e-03, grad_scale: 16.0 2023-10-03 20:49:20,253 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_47896_210_20508_1_1534492518_5958868_558-314572-0_sp1.1 from training. Duration: 0.8818125 2023-10-03 20:49:21,656 WARNING [train.py:1204] (2/4) Exclude cut with ID _45560_210_21982_1_1533812603951_3584999_111-6376-0_sp0.9 from training. Duration: 0.9333125 2023-10-03 20:49:23,070 WARNING [train.py:1204] (2/4) Exclude cut with ID _14170_210_5219_1_1530525949868_4469650_412-317077-0_sp1.1 from training. Duration: 0.6818125 2023-10-03 20:49:25,168 WARNING [train.py:1204] (2/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_718-68505-0 from training. Duration: 0.89 2023-10-03 20:49:27,960 WARNING [train.py:1204] (2/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_77-311404-0 from training. Duration: 0.94 2023-10-03 20:49:30,443 WARNING [train.py:1204] (2/4) Exclude cut with ID _60229_210_27767_1_1534838406158_3655350_264-279573-0 from training. Duration: 0.83 2023-10-03 20:49:31,811 WARNING [train.py:1204] (2/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_106-300985-0_sp1.1 from training. Duration: 0.8181875 2023-10-03 20:49:33,351 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_4183_210_2087_1_1526729100_1869838_79-277189-0_sp1.1 from training. Duration: 0.963625 2023-10-03 20:49:33,361 WARNING [train.py:1204] (2/4) Exclude cut with ID _28427_210_4220_1_1532581240967_3061069_195-295268-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 20:49:36,221 WARNING [train.py:1204] (2/4) Exclude cut with ID _4092_210_2151_1_1526643044449_3135759_356-278694-0_sp0.9 from training. Duration: 0.52225 2023-10-03 20:49:37,176 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.0.layers.1.feed_forward1.out_whiten, num_groups=1, num_channels=192, metric=4.80 vs. limit=15.0 2023-10-03 20:49:38,988 WARNING [train.py:1204] (2/4) Exclude cut with ID _14459_210_2228_1_1531443641946_7516700_777-71446-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 20:49:44,070 WARNING [train.py:1204] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_520-126729-0 from training. Duration: 0.7 2023-10-03 20:49:50,025 WARNING [train.py:1204] (2/4) Exclude cut with ID _13684_210_7497_1_1530619354863_3598359_312-235606-0_sp0.9 from training. Duration: 0.6666875 2023-10-03 20:49:50,067 WARNING [train.py:1204] (2/4) Exclude cut with ID _37537_210_2253_1_1533171758568_3421949_283-349402-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 20:49:50,107 WARNING [train.py:1204] (2/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_29-117932-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 20:49:50,135 WARNING [train.py:1204] (2/4) Exclude cut with ID _29303_210_2575_1_1532392237303_7410649_721-283351-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 20:49:51,550 WARNING [train.py:1204] (2/4) Exclude cut with ID _14886_210_4220_1_1530702020355_3516190_175-229073-0_sp0.9 from training. Duration: 0.5 2023-10-03 20:49:53,014 WARNING [train.py:1204] (2/4) Exclude cut with ID _15458_210_8341_1_1530858614263_3834410_70-268830-0_sp1.1 from training. Duration: 0.97275 2023-10-03 20:49:53,031 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_2295_210_2087_1_1525500993_2407863_234-83104-0 from training. Duration: 0.62 2023-10-03 20:49:53,032 WARNING [train.py:1204] (2/4) Exclude cut with ID _22961_210_8254_1_1531976366672_4278180_317-199546-0_sp1.1 from training. Duration: 0.7818125 2023-10-03 20:49:55,170 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.1.encoder.layers.0.conv_module2.whiten, num_groups=1, num_channels=256, metric=5.55 vs. limit=15.0 2023-10-03 20:49:55,693 WARNING [train.py:1204] (2/4) Exclude cut with ID _27894_210_12392_1_1532415496078_6902439_547-196022-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 20:49:55,794 WARNING [train.py:1204] (2/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_46-207599-0 from training. Duration: 0.64 2023-10-03 20:49:57,662 WARNING [train.py:1204] (2/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_426-326246-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 20:50:00,433 WARNING [train.py:1204] (2/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_445-117398-0_sp0.9 from training. Duration: 0.988875 2023-10-03 20:50:01,947 WARNING [train.py:1204] (2/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_563-347832-0 from training. Duration: 0.89 2023-10-03 20:50:03,406 WARNING [train.py:1204] (2/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_252-216674-0 from training. Duration: 0.54 2023-10-03 20:50:04,700 WARNING [train.py:1204] (2/4) Exclude cut with ID _59611_210_8895_1_1534741198240_3650361_187-212779-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 20:50:04,758 WARNING [train.py:1204] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_282-159367-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 20:50:07,494 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_68702_210_23689_1_1535799577_3420068_33-136883-0_sp1.1 from training. Duration: 0.936375 2023-10-03 20:50:07,514 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_47228_210_4983_1_1533780021_3672027_184-293706-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 20:50:07,533 WARNING [train.py:1204] (2/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_656-188361-0_sp0.9 from training. Duration: 0.9666875 2023-10-03 20:50:08,890 WARNING [train.py:1204] (2/4) Exclude cut with ID _4866_210_2577_1_1527404412284_4364659_352-157930-0_sp0.9 from training. Duration: 0.9 2023-10-03 20:50:08,890 WARNING [train.py:1204] (2/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_210-106047-0_sp1.1 from training. Duration: 0.8818125 2023-10-03 20:50:10,322 WARNING [train.py:1204] (2/4) Exclude cut with ID _9389_210_1664_1_1529217337301_5289269_289-213417-0_sp1.1 from training. Duration: 0.8090625 2023-10-03 20:50:10,350 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_3910_210_1794_1_1526777667_3747260_178-41186-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 20:50:12,300 WARNING [train.py:1204] (2/4) Exclude cut with ID _27052_210_5986_1_1532329197187_3585009_24-172263-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 20:50:12,304 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_5522_210_3273_1_1527249606_3909096_318-203153-0_sp1.1 from training. Duration: 0.3909375 2023-10-03 20:50:16,384 WARNING [train.py:1204] (2/4) Exclude cut with ID _27894_210_12392_1_1532415496078_6902439_222-51982-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 20:50:19,594 WARNING [train.py:1204] (2/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_87-131427-0 from training. Duration: 0.78 2023-10-03 20:50:21,522 WARNING [train.py:1204] (2/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_372-298093-0_sp0.9 from training. Duration: 0.888875 2023-10-03 20:50:21,577 WARNING [train.py:1204] (2/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_663-166973-0 from training. Duration: 0.8 2023-10-03 20:50:22,847 WARNING [train.py:1204] (2/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_429-177469-0_sp1.1 from training. Duration: 0.936375 2023-10-03 20:50:22,884 WARNING [train.py:1204] (2/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_724-172651-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 20:50:24,193 WARNING [train.py:1204] (2/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_103-336064-0 from training. Duration: 0.51 2023-10-03 20:50:32,963 WARNING [train.py:1204] (2/4) Exclude cut with ID _48677_210_5663_1_1533902405919_3892989_111-11439-0 from training. Duration: 0.96 2023-10-03 20:50:34,343 INFO [train.py:1046] (2/4) Epoch 40, batch 3150, loss[loss=0.1512, simple_loss=0.2394, pruned_loss=0.0315, over 24461.00 frames. ], tot_loss[loss=0.1571, simple_loss=0.2362, pruned_loss=0.03896, over 4688404.58 frames. ], batch size: 63, lr: 2.54e-03, grad_scale: 8.0 2023-10-03 20:50:34,433 WARNING [train.py:1204] (2/4) Exclude cut with ID _8085_210_4557_1_1528455633493_3716100_95-103398-0_sp1.1 from training. Duration: 0.963625 2023-10-03 20:50:34,519 WARNING [train.py:1204] (2/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_1041-110837-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 20:50:37,316 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_50050_210_16223_1_1535435796_7290138_1155-121757-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 20:50:37,318 WARNING [train.py:1204] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_602-275260-0_sp0.9 from training. Duration: 0.911125 2023-10-03 20:50:37,383 WARNING [train.py:1204] (2/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_265-164486-0 from training. Duration: 0.96 2023-10-03 20:50:38,768 WARNING [train.py:1204] (2/4) Exclude cut with ID _46067_210_4220_1_1534125571066_3307450_142-229375-0_sp1.1 from training. Duration: 0.963625 2023-10-03 20:50:38,803 WARNING [train.py:1204] (2/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_310-26186-0_sp1.1 from training. Duration: 0.636375 2023-10-03 20:50:40,239 WARNING [train.py:1204] (2/4) Exclude cut with ID _28461_210_2253_1_1532771775189_3041299_136-270644-0 from training. Duration: 0.93 2023-10-03 20:50:41,592 WARNING [train.py:1204] (2/4) Exclude cut with ID _14860_210_4915_1_1530702020340_7343240_438-286372-0_sp1.1 from training. Duration: 0.936375 2023-10-03 20:50:44,740 WARNING [train.py:1204] (2/4) Exclude cut with ID IC0128W0004-63309 from training. Duration: 0.831 2023-10-03 20:50:48,151 WARNING [train.py:1204] (2/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_135-174080-0 from training. Duration: 0.53 2023-10-03 20:50:49,463 WARNING [train.py:1204] (2/4) Exclude cut with ID _41326_210_5854_1_1533778076509_3492080_277-59511-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 20:50:49,544 WARNING [train.py:1204] (2/4) Exclude cut with ID IC0144W0112-71379 from training. Duration: 0.992 2023-10-03 20:50:50,886 WARNING [train.py:1204] (2/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_711-15688-0_sp0.9 from training. Duration: 0.47775 2023-10-03 20:50:54,055 WARNING [train.py:1204] (2/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_237-332027-0 from training. Duration: 0.95 2023-10-03 20:50:54,110 WARNING [train.py:1204] (2/4) Exclude cut with ID _7970_210_1883_1_1528455616296_3472230_25-178407-0 from training. Duration: 0.71 2023-10-03 20:50:54,114 WARNING [train.py:1204] (2/4) Exclude cut with ID _8480_210_1874_1_1528589391205_6728330_309-139896-0 from training. Duration: 0.81 2023-10-03 20:50:54,135 WARNING [train.py:1204] (2/4) Exclude cut with ID _39571_210_13449_1_1533801584143_3651077_319-268877-0_sp1.1 from training. Duration: 0.936375 2023-10-03 20:50:54,139 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_43802_210_2290_1_1533516203_8829504_155-246880-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 20:50:55,513 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_566-122599-0_sp1.1 from training. Duration: 0.936375 2023-10-03 20:50:55,619 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_12868_210_6753_1_1530701932_530275_18-76322-0 from training. Duration: 0.87 2023-10-03 20:50:57,593 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.self_attn_weights.pos_emb_skip_rate, batch_count=1402226.6666666667, ans=0.0 2023-10-03 20:50:58,682 WARNING [train.py:1204] (2/4) Exclude cut with ID _56494_210_4220_1_1535284837625_3033169_159-106035-0_sp1.1 from training. Duration: 0.963625 2023-10-03 20:50:58,718 WARNING [train.py:1204] (2/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_98-276013-0_sp1.1 from training. Duration: 0.963625 2023-10-03 20:51:00,126 WARNING [train.py:1204] (2/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_330-216124-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 20:51:02,889 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_177-58005-0_sp0.9 from training. Duration: 0.688875 2023-10-03 20:51:05,744 WARNING [train.py:1204] (2/4) Exclude cut with ID _56235_210_10640_1_1534557335235_4161099_343-139898-0 from training. Duration: 0.96 2023-10-03 20:51:07,111 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6961_210_2087_1_1527937168_6815011_395-185060-0_sp1.1 from training. Duration: 0.863625 2023-10-03 20:51:08,614 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7002_210_1794_1_1527939683_5157286_184-229630-0_sp0.9 from training. Duration: 0.82225 2023-10-03 20:51:09,904 WARNING [train.py:1204] (2/4) Exclude cut with ID _59170_210_6721_1_1534983633960_4203335_153-18895-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 20:51:11,264 WARNING [train.py:1204] (2/4) Exclude cut with ID _49643_210_7989_1_1534640244224_3939031_598-161926-0 from training. Duration: 0.9 2023-10-03 20:51:14,444 WARNING [train.py:1204] (2/4) Exclude cut with ID _4068_210_2629_1_1526697350395_1546259_224-288693-0 from training. Duration: 0.59 2023-10-03 20:51:14,504 WARNING [train.py:1204] (2/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_718-68505-0_sp1.1 from training. Duration: 0.8090625 2023-10-03 20:51:15,951 WARNING [train.py:1204] (2/4) Exclude cut with ID _4068_210_2629_1_1526697350395_1546259_224-288693-0_sp0.9 from training. Duration: 0.6555625 2023-10-03 20:51:15,963 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_2296_210_2087_1_1525503448_3397335_57-185430-0_sp0.9 from training. Duration: 0.6666875 2023-10-03 20:51:16,010 WARNING [train.py:1204] (2/4) Exclude cut with ID _15458_210_8341_1_1530858614263_3834410_49-235472-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 20:51:16,013 WARNING [train.py:1204] (2/4) Exclude cut with ID _64135_210_2253_1_1535677267876_7608972_498-5039-0_sp0.9 from training. Duration: 0.8666875 2023-10-03 20:51:17,359 WARNING [train.py:1204] (2/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_283-220372-0_sp0.9 from training. Duration: 0.62225 2023-10-03 20:51:17,368 WARNING [train.py:1204] (2/4) Exclude cut with ID _6473_210_1741_1_1527901307396_3815380_108-17335-0_sp0.9 from training. Duration: 0.72225 2023-10-03 20:51:18,688 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.586e+02 1.948e+02 2.114e+02 2.510e+02 3.900e+02, threshold=4.229e+02, percent-clipped=0.0 2023-10-03 20:51:18,774 WARNING [train.py:1204] (2/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_113-92608-0 from training. Duration: 0.54 2023-10-03 20:51:18,827 WARNING [train.py:1204] (2/4) Exclude cut with ID _14170_210_5219_1_1530525949868_4469650_272-249985-0_sp1.1 from training. Duration: 0.6090625 2023-10-03 20:51:18,843 WARNING [train.py:1204] (2/4) Exclude cut with ID _50376_210_13401_1_1534031502101_4217740_518-187390-0_sp1.1 from training. Duration: 0.97275 2023-10-03 20:51:20,875 WARNING [train.py:1204] (2/4) Exclude cut with ID _59738_210_15379_1_1534849512562_3941470_165-248260-0_sp1.1 from training. Duration: 0.92725 2023-10-03 20:51:21,123 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.out_combiner.scale_min, batch_count=1402360.0, ans=0.2 2023-10-03 20:51:22,213 WARNING [train.py:1204] (2/4) Exclude cut with ID _23818_210_6228_1_1531917063884_3676920_400-343925-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 20:51:22,279 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6961_210_2087_1_1527937168_6815011_562-165518-0 from training. Duration: 0.77 2023-10-03 20:51:22,333 WARNING [train.py:1204] (2/4) Exclude cut with ID _24271_210_12658_1_1531987270022_7190519_289-207931-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 20:51:22,924 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.4.encoder.layers.0.self_attn2.whiten, num_groups=1, num_channels=384, metric=14.70 vs. limit=22.5 2023-10-03 20:51:23,761 WARNING [train.py:1204] (2/4) Exclude cut with ID _14632_210_9988_1_1530691279392_3923200_332-105904-0 from training. Duration: 0.9 2023-10-03 20:51:25,496 WARNING [train.py:1204] (2/4) Exclude cut with ID _40402_210_18219_1_1533522324115_2953150_203-232438-0_sp1.1 from training. Duration: 0.97275 2023-10-03 20:51:25,592 WARNING [train.py:1204] (2/4) Exclude cut with ID _13252_210_1925_1_1530671372047_3336909_263-168388-0 from training. Duration: 0.86 2023-10-03 20:51:25,657 WARNING [train.py:1204] (2/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_174-205058-0 from training. Duration: 0.95 2023-10-03 20:51:28,783 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_94-195801-0_sp1.1 from training. Duration: 0.8545625 2023-10-03 20:51:28,811 WARNING [train.py:1204] (2/4) Exclude cut with ID _55153_210_2253_1_1534384786677_3162096_269-103204-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 20:51:30,173 WARNING [train.py:1204] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_748-36150-0 from training. Duration: 0.91 2023-10-03 20:51:30,247 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_12868_210_6753_1_1530702479_3610564_148-172861-0_sp1.1 from training. Duration: 0.4545625 2023-10-03 20:51:30,297 WARNING [train.py:1204] (2/4) Exclude cut with ID _67499_210_17282_1_1535509805220_3692430_460-77730-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 20:51:31,738 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.1.feed_forward2.hidden_balancer.prob, batch_count=1402360.0, ans=0.125 2023-10-03 20:51:33,297 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.out_combiner.scale_min, batch_count=1402426.6666666667, ans=0.2 2023-10-03 20:51:34,368 WARNING [train.py:1204] (2/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_374-20753-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 20:51:34,477 WARNING [train.py:1204] (2/4) Exclude cut with ID _45329_210_7005_1_1534157951211_6774339_374-289983-0_sp1.1 from training. Duration: 0.97275 2023-10-03 20:51:35,788 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_34886_210_10633_1_1533203882_1755661_76-156096-0_sp0.9 from training. Duration: 0.9666875 2023-10-03 20:51:41,301 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_238-256668-0_sp0.9 from training. Duration: 0.7666875 2023-10-03 20:51:41,350 WARNING [train.py:1204] (2/4) Exclude cut with ID _46683_210_9642_1_1533730913492_4591116_489-146727-0_sp1.1 from training. Duration: 0.97275 2023-10-03 20:51:42,091 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.2.encoder.layers.2.conv_module2.whiten, num_groups=1, num_channels=384, metric=5.18 vs. limit=15.0 2023-10-03 20:51:44,680 WARNING [train.py:1204] (2/4) Exclude cut with ID _13938_210_9988_1_1530518718978_3693960_304-112784-0_sp0.9 from training. Duration: 0.511125 2023-10-03 20:51:48,349 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.1.encoder.layers.0.self_attn_weights.whiten_keys, num_groups=4, num_channels=128, metric=3.85 vs. limit=6.0 2023-10-03 20:51:48,681 INFO [train.py:1046] (2/4) Epoch 40, batch 3200, loss[loss=0.1456, simple_loss=0.2227, pruned_loss=0.03424, over 23375.00 frames. ], tot_loss[loss=0.1556, simple_loss=0.2348, pruned_loss=0.03822, over 4694396.40 frames. ], batch size: 119, lr: 2.54e-03, grad_scale: 16.0 2023-10-03 20:51:48,837 WARNING [train.py:1204] (2/4) Exclude cut with ID _24271_210_12658_1_1531987270022_7190519_243-46549-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 20:51:48,841 WARNING [train.py:1204] (2/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_78-182912-0_sp1.1 from training. Duration: 0.536375 2023-10-03 20:51:53,557 WARNING [train.py:1204] (2/4) Exclude cut with ID _57572_210_5517_1_1534921219624_3809484_121-321334-0_sp1.1 from training. Duration: 0.97275 2023-10-03 20:51:54,906 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_40047_210_2290_1_1533290519_2374311_254-102149-0_sp1.1 from training. Duration: 0.963625 2023-10-03 20:51:54,907 WARNING [train.py:1204] (2/4) Exclude cut with ID _28461_210_2253_1_1532771775189_3041299_96-152158-0 from training. Duration: 0.85 2023-10-03 20:51:56,418 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.conv_module2.balancer2.prob, batch_count=1402493.3333333333, ans=0.125 2023-10-03 20:52:00,103 WARNING [train.py:1204] (2/4) Exclude cut with ID _29877_210_6228_1_1532397791106_5276149_161-102979-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 20:52:02,967 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_3787_210_2040_1_1526477544_5478586_603-120324-0_sp0.9 from training. Duration: 0.9 2023-10-03 20:52:05,794 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_41750_210_1960_1_1533520678_3948042_247-213456-0_sp1.1 from training. Duration: 0.97275 2023-10-03 20:52:14,605 WARNING [train.py:1204] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_127-119109-0_sp0.9 from training. Duration: 0.8 2023-10-03 20:52:20,274 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.bypass_mid.scale_min, batch_count=1402626.6666666667, ans=0.2 2023-10-03 20:52:23,553 WARNING [train.py:1204] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_215-202973-0 from training. Duration: 0.88 2023-10-03 20:52:23,736 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.balancer2.prob, batch_count=1402626.6666666667, ans=0.125 2023-10-03 20:52:24,946 WARNING [train.py:1204] (2/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_17-320491-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 20:52:28,019 WARNING [train.py:1204] (2/4) Exclude cut with ID _5166_210_2084_1_1527073197160_4087993_310-114452-0 from training. Duration: 0.59 2023-10-03 20:52:29,401 WARNING [train.py:1204] (2/4) Exclude cut with ID _15133_210_6228_1_1530794512074_3080379_29-264458-0_sp0.9 from training. Duration: 0.6444375 2023-10-03 20:52:32,756 WARNING [train.py:1204] (2/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_159-281310-0_sp1.1 from training. Duration: 0.863625 2023-10-03 20:52:32,772 WARNING [train.py:1204] (2/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_195-167011-0_sp0.9 from training. Duration: 0.8333125 2023-10-03 20:52:34,249 WARNING [train.py:1204] (2/4) Exclude cut with ID _14087_210_2253_1_1530529224512_3014029_156-257607-0_sp1.1 from training. Duration: 0.7545625 2023-10-03 20:52:38,342 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_2295_210_2087_1_1525500993_2407863_116-238894-0 from training. Duration: 0.7 2023-10-03 20:52:39,777 WARNING [train.py:1204] (2/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_321-335475-0_sp0.9 from training. Duration: 0.5 2023-10-03 20:52:42,583 WARNING [train.py:1204] (2/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_188-114464-0 from training. Duration: 0.82 2023-10-03 20:52:45,272 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_2592_210_2290_1_1525697025_4882925_157-70512-0 from training. Duration: 0.91 2023-10-03 20:52:45,408 WARNING [train.py:1204] (2/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_138-64210-0_sp1.1 from training. Duration: 0.663625 2023-10-03 20:52:51,582 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_11305_210_1794_1_1529750428_7178148_651-95702-0_sp1.1 from training. Duration: 0.963625 2023-10-03 20:52:52,923 WARNING [train.py:1204] (2/4) Exclude cut with ID _14224_210_4381_1_1530529062037_4264010_379-184223-0_sp0.9 from training. Duration: 0.9444375 2023-10-03 20:52:52,941 WARNING [train.py:1204] (2/4) Exclude cut with ID _8394_210_6702_1_1528590504316_3540189_381-125325-0_sp1.1 from training. Duration: 0.963625 2023-10-03 20:52:52,996 WARNING [train.py:1204] (2/4) Exclude cut with ID IC0622W0021-307659 from training. Duration: 0.896 2023-10-03 20:52:52,999 WARNING [train.py:1204] (2/4) Exclude cut with ID _7624_210_1531_1_1528109802198_4933101_418-349834-0_sp1.1 from training. Duration: 0.5454375 2023-10-03 20:52:57,669 WARNING [train.py:1204] (2/4) Exclude cut with ID _49728_210_12651_1_1534041034294_6788150_607-3416-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 20:52:59,408 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8275_210_4198_1_1528455320_3892571_491-17707-0 from training. Duration: 0.64 2023-10-03 20:53:00,711 WARNING [train.py:1204] (2/4) Exclude cut with ID _5287_210_2084_1_1527418883040_3878670_333-48508-0 from training. Duration: 0.6 2023-10-03 20:53:00,790 WARNING [train.py:1204] (2/4) Exclude cut with ID _8365_210_2064_1_1528592311070_4677690_371-122295-0 from training. Duration: 0.55 2023-10-03 20:53:01,026 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.balancer2.prob, batch_count=1402760.0, ans=0.125 2023-10-03 20:53:02,211 WARNING [train.py:1204] (2/4) Exclude cut with ID _14860_210_4915_1_1530702020340_7343240_836-328489-0 from training. Duration: 0.9 2023-10-03 20:53:04,245 INFO [train.py:1046] (2/4) Epoch 40, batch 3250, loss[loss=0.1508, simple_loss=0.2241, pruned_loss=0.03872, over 23551.00 frames. ], tot_loss[loss=0.1557, simple_loss=0.2354, pruned_loss=0.03794, over 4713305.47 frames. ], batch size: 256, lr: 2.54e-03, grad_scale: 16.0 2023-10-03 20:53:04,311 WARNING [train.py:1204] (2/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_1033-274376-0_sp0.9 from training. Duration: 0.9666875 2023-10-03 20:53:04,930 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.conv_module2.whiten.whitening_limit, batch_count=1402826.6666666667, ans=15.0 2023-10-03 20:53:06,974 WARNING [train.py:1204] (2/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_475-127455-0_sp0.9 from training. Duration: 0.77775 2023-10-03 20:53:06,987 WARNING [train.py:1204] (2/4) Exclude cut with ID IC0671W0071-331914 from training. Duration: 0.896 2023-10-03 20:53:07,014 WARNING [train.py:1204] (2/4) Exclude cut with ID _30327_210_16049_1_1532484161295_3924169_2-1523-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 20:53:07,016 WARNING [train.py:1204] (2/4) Exclude cut with ID _24314_210_4915_1_1532135078116_7170179_460-208279-0_sp1.1 from training. Duration: 0.97275 2023-10-03 20:53:09,569 WARNING [train.py:1204] (2/4) Exclude cut with ID IC0679W0052-335859 from training. Duration: 0.64 2023-10-03 20:53:12,417 WARNING [train.py:1204] (2/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_3-237434-0_sp1.1 from training. Duration: 0.6909375 2023-10-03 20:53:15,584 WARNING [train.py:1204] (2/4) Exclude cut with ID _69884_210_9771_1_1535846392818_4166169_405-185283-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 20:53:21,072 WARNING [train.py:1204] (2/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_927-83684-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 20:53:22,400 WARNING [train.py:1204] (2/4) Exclude cut with ID _6694_210_4599_1_1527852637490_3965529_356-169233-0 from training. Duration: 0.99 2023-10-03 20:53:23,778 WARNING [train.py:1204] (2/4) Exclude cut with ID _19896_210_1883_1_1531709022662_6649569_562-233001-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 20:53:23,811 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_20120_210_6753_1_1531643035_5482859_354-78629-0_sp1.1 from training. Duration: 0.963625 2023-10-03 20:53:23,812 WARNING [train.py:1204] (2/4) Exclude cut with ID _60135_210_13427_1_1534935557922_3651319_96-168198-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 20:53:25,916 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.conv_skip_rate, batch_count=1402893.3333333333, ans=0.0 2023-10-03 20:53:27,112 WARNING [train.py:1204] (2/4) Exclude cut with ID _31602_210_5854_1_1532677021456_4458250_249-96604-0_sp1.1 from training. Duration: 0.8090625 2023-10-03 20:53:27,145 WARNING [train.py:1204] (2/4) Exclude cut with ID _14100_210_8341_1_1530529320613_3678069_258-224599-0_sp0.9 from training. Duration: 0.8555625 2023-10-03 20:53:30,440 WARNING [train.py:1204] (2/4) Exclude cut with ID _19708_210_9206_1_1532136650733_4238364_215-160135-0_sp1.1 from training. Duration: 0.97275 2023-10-03 20:53:30,468 WARNING [train.py:1204] (2/4) Exclude cut with ID _21878_210_3525_1_1531738772002_7264330_113-18163-0_sp0.9 from training. Duration: 0.988875 2023-10-03 20:53:30,513 WARNING [train.py:1204] (2/4) Exclude cut with ID _64135_210_2253_1_1535677267876_7608972_552-337340-0_sp1.1 from training. Duration: 0.92725 2023-10-03 20:53:30,535 WARNING [train.py:1204] (2/4) Exclude cut with ID _67725_210_27767_1_1535608756671_5105359_10-12887-0_sp1.1 from training. Duration: 0.97275 2023-10-03 20:53:30,541 WARNING [train.py:1204] (2/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_401-284840-0_sp1.1 from training. Duration: 0.97275 2023-10-03 20:53:30,565 WARNING [train.py:1204] (2/4) Exclude cut with ID _44659_210_2721_1_1534036120061_6614860_483-141285-0_sp1.1 from training. Duration: 0.936375 2023-10-03 20:53:33,475 WARNING [train.py:1204] (2/4) Exclude cut with ID _9461_210_4557_1_1529060514726_3677451_358-332734-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 20:53:33,692 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.feed_forward2.hidden_balancer.prob, batch_count=1402960.0, ans=0.125 2023-10-03 20:53:34,854 WARNING [train.py:1204] (2/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_788-298907-0_sp1.1 from training. Duration: 0.8090625 2023-10-03 20:53:38,135 WARNING [train.py:1204] (2/4) Exclude cut with ID _39411_210_5986_1_1533538825030_3343649_354-267196-0_sp1.1 from training. Duration: 0.92725 2023-10-03 20:53:38,163 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_14953_210_2151_1_1530701710_5616165_192-309792-0_sp1.1 from training. Duration: 0.97275 2023-10-03 20:53:39,508 WARNING [train.py:1204] (2/4) Exclude cut with ID _59170_210_6721_1_1534983633960_4203335_290-151255-0_sp1.1 from training. Duration: 0.92725 2023-10-03 20:53:40,841 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_5575_210_1410_1_1527301305_2570170_249-93769-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 20:53:40,850 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_46013_210_7234_1_1533776362_3744032_463-52378-0_sp1.1 from training. Duration: 0.7818125 2023-10-03 20:53:45,142 WARNING [train.py:1204] (2/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_580-93798-0 from training. Duration: 0.56 2023-10-03 20:53:45,197 WARNING [train.py:1204] (2/4) Exclude cut with ID _19514_210_9259_1_1531530071330_5084230_385-13762-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 20:53:45,202 WARNING [train.py:1204] (2/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_458-67321-0_sp1.1 from training. Duration: 0.8818125 2023-10-03 20:53:47,042 WARNING [train.py:1204] (2/4) Exclude cut with ID _41298_210_9019_1_1533430371450_4822143_188-154791-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 20:53:48,327 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.532e+02 1.926e+02 2.163e+02 2.567e+02 5.244e+02, threshold=4.326e+02, percent-clipped=4.0 2023-10-03 20:53:48,427 WARNING [train.py:1204] (2/4) Exclude cut with ID _15037_210_2253_1_1530752294104_3267650_296-192597-0_sp0.9 from training. Duration: 0.9 2023-10-03 20:53:55,140 WARNING [train.py:1204] (2/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_661-264857-0_sp1.1 from training. Duration: 0.8454375 2023-10-03 20:54:01,614 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_34008_210_10431_1_1533345424_8183247_662-317331-0_sp1.1 from training. Duration: 0.8545625 2023-10-03 20:54:01,652 WARNING [train.py:1204] (2/4) Exclude cut with ID _46502_210_13794_1_1533724452198_3657159_241-278329-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 20:54:01,652 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_11349_210_6753_1_1529800861_1101207_26-65602-0 from training. Duration: 0.96 2023-10-03 20:54:01,656 WARNING [train.py:1204] (2/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_448-4048-0_sp1.1 from training. Duration: 0.87275 2023-10-03 20:54:01,670 WARNING [train.py:1204] (2/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_662-275621-0_sp1.1 from training. Duration: 0.3818125 2023-10-03 20:54:01,709 WARNING [train.py:1204] (2/4) Exclude cut with ID _13938_210_9988_1_1530518718978_3693960_256-54454-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 20:54:05,682 WARNING [train.py:1204] (2/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_256-32662-0 from training. Duration: 0.9 2023-10-03 20:54:05,723 WARNING [train.py:1204] (2/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_326-17061-0 from training. Duration: 0.94 2023-10-03 20:54:05,760 WARNING [train.py:1204] (2/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_505-97987-0_sp1.1 from training. Duration: 0.7818125 2023-10-03 20:54:07,725 WARNING [train.py:1204] (2/4) Exclude cut with ID _66873_210_5517_1_1535454136506_4293370_316-294364-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 20:54:09,006 WARNING [train.py:1204] (2/4) Exclude cut with ID _19514_210_9259_1_1531530071330_5084230_762-231063-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 20:54:09,070 WARNING [train.py:1204] (2/4) Exclude cut with ID _4697_210_2069_1_1526815801953_3823098_68-219807-0_sp0.9 from training. Duration: 0.52225 2023-10-03 20:54:10,350 WARNING [train.py:1204] (2/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_479-158053-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 20:54:14,383 WARNING [train.py:1204] (2/4) Exclude cut with ID _66112_210_10635_1_1535590236305_3801484_83-38181-0_sp1.1 from training. Duration: 0.936375 2023-10-03 20:54:14,403 WARNING [train.py:1204] (2/4) Exclude cut with ID _55857_210_2346_1_1534644007831_6898209_255-146346-0_sp1.1 from training. Duration: 0.963625 2023-10-03 20:54:15,844 WARNING [train.py:1204] (2/4) Exclude cut with ID _48836_210_12210_1_1534075848293_3904150_429-35475-0 from training. Duration: 0.89 2023-10-03 20:54:15,857 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_50154_210_13731_1_1534750862_1080961_119-187163-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 20:54:17,215 INFO [train.py:1046] (2/4) Epoch 40, batch 3300, loss[loss=0.1423, simple_loss=0.2172, pruned_loss=0.03365, over 24320.00 frames. ], tot_loss[loss=0.1562, simple_loss=0.2361, pruned_loss=0.03815, over 4720538.53 frames. ], batch size: 56, lr: 2.54e-03, grad_scale: 16.0 2023-10-03 20:54:18,560 WARNING [train.py:1204] (2/4) Exclude cut with ID _8793_210_6310_1_1528542985139_2310009_121-179782-0_sp0.9 from training. Duration: 0.888875 2023-10-03 20:54:18,563 WARNING [train.py:1204] (2/4) Exclude cut with ID _14884_210_2252_1_1530707459689_5561250_591-173406-0 from training. Duration: 0.96 2023-10-03 20:54:20,758 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.nonlin_attention.balancer.prob, batch_count=1403160.0, ans=0.125 2023-10-03 20:54:21,919 WARNING [train.py:1204] (2/4) Exclude cut with ID _15133_210_6228_1_1530794512074_3080379_54-198193-0_sp1.1 from training. Duration: 0.8545625 2023-10-03 20:54:21,928 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6545_210_2040_1_1527774670_4245258_407-48070-0 from training. Duration: 0.76 2023-10-03 20:54:22,063 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1032-236863-0 from training. Duration: 0.84 2023-10-03 20:54:24,869 WARNING [train.py:1204] (2/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_507-303370-0 from training. Duration: 0.96 2023-10-03 20:54:24,891 WARNING [train.py:1204] (2/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_164-282366-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 20:54:27,779 WARNING [train.py:1204] (2/4) Exclude cut with ID _39867_210_9826_1_1533553222030_9344259_312-158727-0_sp1.1 from training. Duration: 0.963625 2023-10-03 20:54:29,116 WARNING [train.py:1204] (2/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_174-205058-0_sp1.1 from training. Duration: 0.863625 2023-10-03 20:54:29,768 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_11305_210_1794_1_1529750428_7178148_166-237358-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 20:54:31,158 WARNING [train.py:1204] (2/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_26-175553-0_sp1.1 from training. Duration: 0.5545625 2023-10-03 20:54:31,176 WARNING [train.py:1204] (2/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_96-290039-0_sp1.1 from training. Duration: 0.5090625 2023-10-03 20:54:33,252 WARNING [train.py:1204] (2/4) Exclude cut with ID _53219_210_4525_1_1534834836008_4064825_261-101691-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 20:54:34,615 WARNING [train.py:1204] (2/4) Exclude cut with ID _24271_210_12658_1_1531987270022_7190519_96-293390-0_sp1.1 from training. Duration: 0.936375 2023-10-03 20:54:39,151 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_271-234962-0 from training. Duration: 0.79 2023-10-03 20:54:40,545 WARNING [train.py:1204] (2/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_136-30660-0_sp1.1 from training. Duration: 0.97275 2023-10-03 20:54:40,569 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_20120_210_6753_1_1531643035_5482859_625-281046-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 20:54:41,946 WARNING [train.py:1204] (2/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_333-168193-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 20:54:43,360 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9029W0269-509664 from training. Duration: 0.896 2023-10-03 20:54:43,443 WARNING [train.py:1204] (2/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_450-147393-0_sp1.1 from training. Duration: 0.92725 2023-10-03 20:54:44,703 WARNING [train.py:1204] (2/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_643-52007-0_sp1.1 from training. Duration: 0.5181875 2023-10-03 20:54:44,779 WARNING [train.py:1204] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_330-327861-0_sp0.9 from training. Duration: 0.7444375 2023-10-03 20:54:44,795 WARNING [train.py:1204] (2/4) Exclude cut with ID _47790_210_16187_1_1533862814066_4207519_260-317651-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 20:54:45,047 INFO [scaling.py:1118] (2/4) WithLoss: name=encoder.encoders.4.encoder.layers.0.self_attn_weights, loss-sum=0.000e+00 2023-10-03 20:54:46,089 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9044W0010-516654 from training. Duration: 0.896 2023-10-03 20:54:50,148 WARNING [train.py:1204] (2/4) Exclude cut with ID _14087_210_2253_1_1530529224512_3014029_265-118378-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 20:54:50,160 WARNING [train.py:1204] (2/4) Exclude cut with ID _6194_210_1534_1_1527511826160_3941194_321-148641-0_sp0.9 from training. Duration: 0.711125 2023-10-03 20:54:52,172 WARNING [train.py:1204] (2/4) Exclude cut with ID _54871_210_22356_1_1534665798253_3856359_101-169176-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 20:54:52,174 WARNING [train.py:1204] (2/4) Exclude cut with ID 497-129325-0061-62254-0_sp1.1 from training. Duration: 0.97725 2023-10-03 20:54:52,427 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=1403293.3333333333, ans=0.1 2023-10-03 20:54:53,554 WARNING [train.py:1204] (2/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_795-33252-0_sp1.1 from training. Duration: 0.336375 2023-10-03 20:54:53,569 WARNING [train.py:1204] (2/4) Exclude cut with ID _28317_210_12742_1_1532480302381_8510325_168-246176-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 20:54:54,979 WARNING [train.py:1204] (2/4) Exclude cut with ID _7160_210_1876_1_1527940923231_3703799_120-150943-0_sp1.1 from training. Duration: 0.736375 2023-10-03 20:54:56,362 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9084W0005-536844 from training. Duration: 0.768 2023-10-03 20:54:57,816 WARNING [train.py:1204] (2/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_234-260682-0 from training. Duration: 0.84 2023-10-03 20:54:59,114 WARNING [train.py:1204] (2/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_188-114464-0_sp0.9 from training. Duration: 0.911125 2023-10-03 20:55:02,444 WARNING [train.py:1204] (2/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_81-75849-0 from training. Duration: 0.39 2023-10-03 20:55:03,928 WARNING [train.py:1204] (2/4) Exclude cut with ID _14224_210_4381_1_1530529062037_4264010_379-184223-0_sp1.1 from training. Duration: 0.77275 2023-10-03 20:55:06,652 WARNING [train.py:1204] (2/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_454-123002-0_sp1.1 from training. Duration: 0.62725 2023-10-03 20:55:07,805 WARNING [train.py:1204] (2/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_887-249555-0_sp1.1 from training. Duration: 0.7 2023-10-03 20:55:11,013 WARNING [train.py:1204] (2/4) Exclude cut with ID _31088_210_16446_1_1532520012284_3440919_20-260902-0_sp1.1 from training. Duration: 0.97275 2023-10-03 20:55:12,362 WARNING [train.py:1204] (2/4) Exclude cut with ID _45329_210_7005_1_1534157951211_6774339_856-344574-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 20:55:12,364 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_36756_210_7234_1_1533026991_966276_4-164118-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 20:55:12,378 WARNING [train.py:1204] (2/4) Exclude cut with ID _8365_210_2064_1_1528592311070_4677690_371-122295-0_sp1.1 from training. Duration: 0.5 2023-10-03 20:55:13,851 WARNING [train.py:1204] (2/4) Exclude cut with ID _5910_210_1389_1_1527337972454_5050100_555-159815-0_sp1.1 from training. Duration: 0.8909375 2023-10-03 20:55:13,860 WARNING [train.py:1204] (2/4) Exclude cut with ID _37086_210_9038_1_1532999540891_7021250_469-147941-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 20:55:15,232 WARNING [train.py:1204] (2/4) Exclude cut with ID _3067_210_2667_1_1526212974640_4503168_459-173867-0_sp0.9 from training. Duration: 0.97775 2023-10-03 20:55:16,673 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9156W0006-572499 from training. Duration: 0.896 2023-10-03 20:55:18,105 WARNING [train.py:1204] (2/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_277-210798-0 from training. Duration: 0.55 2023-10-03 20:55:19,004 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.2.encoder.layers.0.conv_module1.whiten, num_groups=1, num_channels=384, metric=7.14 vs. limit=15.0 2023-10-03 20:55:19,609 WARNING [train.py:1204] (2/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_103-336064-0_sp1.1 from training. Duration: 0.463625 2023-10-03 20:55:20,982 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_3414_210_1988_1_1526212718_4813195_331-290945-0_sp1.1 from training. Duration: 0.936375 2023-10-03 20:55:20,983 WARNING [train.py:1204] (2/4) Exclude cut with ID _67499_210_17282_1_1535509805220_3692430_365-29276-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 20:55:21,309 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.ff3_skip_rate, batch_count=1403426.6666666667, ans=0.0 2023-10-03 20:55:22,740 WARNING [train.py:1204] (2/4) Exclude cut with ID _14821_210_8750_1_1530680503991_6829359_69-250847-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 20:55:22,742 WARNING [train.py:1204] (2/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_1102-61301-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 20:55:22,851 WARNING [train.py:1204] (2/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_166-246557-0_sp1.1 from training. Duration: 0.5818125 2023-10-03 20:55:24,134 WARNING [train.py:1204] (2/4) Exclude cut with ID _15351_210_10626_1_1530856635653_3576210_214-129918-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 20:55:24,147 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_120-41668-0_sp0.9 from training. Duration: 0.77775 2023-10-03 20:55:25,546 WARNING [train.py:1204] (2/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_27-72278-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 20:55:26,985 WARNING [train.py:1204] (2/4) Exclude cut with ID _24314_210_4915_1_1532135078116_7170179_521-258868-0_sp1.1 from training. Duration: 0.7181875 2023-10-03 20:55:29,699 WARNING [train.py:1204] (2/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_143-86344-0 from training. Duration: 0.77 2023-10-03 20:55:29,733 WARNING [train.py:1204] (2/4) Exclude cut with ID _53489_210_6228_1_1534298357169_3835970_465-157433-0_sp1.1 from training. Duration: 0.92725 2023-10-03 20:55:30,263 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.5.encoder.layers.1.self_attn_weights.whiten_keys, num_groups=4, num_channels=128, metric=1.76 vs. limit=6.0 2023-10-03 20:55:31,099 INFO [train.py:1046] (2/4) Epoch 40, batch 3350, loss[loss=0.1447, simple_loss=0.221, pruned_loss=0.03421, over 20788.00 frames. ], tot_loss[loss=0.1563, simple_loss=0.2367, pruned_loss=0.03793, over 4721121.10 frames. ], batch size: 45, lr: 2.54e-03, grad_scale: 16.0 2023-10-03 20:55:31,226 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_248-319353-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 20:55:32,978 WARNING [train.py:1204] (2/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_102-77064-0_sp1.1 from training. Duration: 0.8454375 2023-10-03 20:55:33,001 WARNING [train.py:1204] (2/4) Exclude cut with ID _9389_210_1664_1_1529217337301_5289269_379-128794-0_sp1.1 from training. Duration: 0.77275 2023-10-03 20:55:36,229 WARNING [train.py:1204] (2/4) Exclude cut with ID _3231_210_2106_1_1526205864934_6784220_236-260835-0_sp1.1 from training. Duration: 0.97275 2023-10-03 20:55:37,718 WARNING [train.py:1204] (2/4) Exclude cut with ID _24271_210_12658_1_1531987270022_7190519_184-342363-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 20:55:37,725 WARNING [train.py:1204] (2/4) Exclude cut with ID _27894_210_12392_1_1532415496078_6902439_67-221663-0_sp1.1 from training. Duration: 0.92725 2023-10-03 20:55:40,519 WARNING [train.py:1204] (2/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_304-59227-0_sp1.1 from training. Duration: 0.7 2023-10-03 20:55:43,641 WARNING [train.py:1204] (2/4) Exclude cut with ID _58605_210_12210_1_1534723195939_3904259_458-42232-0_sp1.1 from training. Duration: 0.92725 2023-10-03 20:55:43,711 WARNING [train.py:1204] (2/4) Exclude cut with ID _14922_210_6228_1_1530707435273_3693310_13-165512-0_sp0.9 from training. Duration: 0.911125 2023-10-03 20:55:46,391 WARNING [train.py:1204] (2/4) Exclude cut with ID _58602_210_2346_1_1534903210085_6908830_792-102241-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 20:55:47,734 WARNING [train.py:1204] (2/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_588-125165-0_sp0.9 from training. Duration: 0.92225 2023-10-03 20:55:49,079 WARNING [train.py:1204] (2/4) Exclude cut with ID _54218_210_12210_1_1534413646388_4409259_239-98429-0_sp1.1 from training. Duration: 0.97275 2023-10-03 20:55:50,465 WARNING [train.py:1204] (2/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_538-32291-0_sp1.1 from training. Duration: 0.7545625 2023-10-03 20:55:51,849 WARNING [train.py:1204] (2/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_419-317062-0 from training. Duration: 0.91 2023-10-03 20:55:53,727 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9309W0202-640329 from training. Duration: 0.896 2023-10-03 20:55:53,769 WARNING [train.py:1204] (2/4) Exclude cut with ID _3231_210_2106_1_1526205864934_6784220_419-313291-0_sp1.1 from training. Duration: 0.97275 2023-10-03 20:55:56,602 WARNING [train.py:1204] (2/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_619-344499-0 from training. Duration: 0.63 2023-10-03 20:55:56,614 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_278-272612-0 from training. Duration: 0.73 2023-10-03 20:55:56,702 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_103-84010-0_sp0.9 from training. Duration: 0.7666875 2023-10-03 20:55:56,721 WARNING [train.py:1204] (2/4) Exclude cut with ID _26528_210_9070_1_1532570410842_3851310_497-97218-0_sp1.1 from training. Duration: 0.7818125 2023-10-03 20:55:56,928 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.feed_forward1.hidden_balancer.prob, batch_count=1403560.0, ans=0.125 2023-10-03 20:55:58,169 WARNING [train.py:1204] (2/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_262-84613-0_sp1.1 from training. Duration: 0.936375 2023-10-03 20:55:58,215 WARNING [train.py:1204] (2/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_387-131538-0 from training. Duration: 0.81 2023-10-03 20:55:59,444 WARNING [train.py:1204] (2/4) Exclude cut with ID _8030_210_7497_1_1528444886781_3531089_224-300489-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 20:55:59,461 WARNING [train.py:1204] (2/4) Exclude cut with ID _14349_210_8341_1_1530615710818_3442470_170-336541-0_sp0.9 from training. Duration: 0.9555625 2023-10-03 20:55:59,798 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.attention_skip_rate, batch_count=1403626.6666666667, ans=0.0 2023-10-03 20:56:00,877 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_47069_210_15368_1_1533781267_4431320_180-31746-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 20:56:02,781 WARNING [train.py:1204] (2/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_447-7041-0_sp1.1 from training. Duration: 0.92725 2023-10-03 20:56:02,814 WARNING [train.py:1204] (2/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_2-55283-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 20:56:04,170 WARNING [train.py:1204] (2/4) Exclude cut with ID _12555_210_8750_1_1530075594402_3780239_70-181911-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 20:56:07,554 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.bypass.scale_min, batch_count=1403626.6666666667, ans=0.2 2023-10-03 20:56:09,954 WARNING [train.py:1204] (2/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_311-328022-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 20:56:10,089 WARNING [train.py:1204] (2/4) Exclude cut with ID _14528_210_8254_1_1530669700514_4306939_330-2987-0_sp1.1 from training. Duration: 0.92725 2023-10-03 20:56:10,130 WARNING [train.py:1204] (2/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_385-184858-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 20:56:14,851 WARNING [train.py:1204] (2/4) Exclude cut with ID _64414_210_17301_1_1535243252048_3757759_348-333360-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 20:56:16,072 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.647e+02 1.915e+02 2.091e+02 2.361e+02 5.355e+02, threshold=4.181e+02, percent-clipped=1.0 2023-10-03 20:56:16,167 WARNING [train.py:1204] (2/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_753-214751-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 20:56:17,646 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_17613_210_2151_1_1531137569_2366160_202-208013-0_sp1.1 from training. Duration: 0.92725 2023-10-03 20:56:17,654 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_43831_210_2158_1_1533725502_5872791_610-129300-0_sp1.1 from training. Duration: 0.936375 2023-10-03 20:56:19,077 WARNING [train.py:1204] (2/4) Exclude cut with ID _54218_210_12210_1_1534413646388_4409259_321-23852-0_sp1.1 from training. Duration: 0.936375 2023-10-03 20:56:20,577 WARNING [train.py:1204] (2/4) Exclude cut with ID _39411_210_5986_1_1533538825030_3343649_173-238248-0 from training. Duration: 0.83 2023-10-03 20:56:21,766 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_405-339986-0_sp0.9 from training. Duration: 0.5666875 2023-10-03 20:56:21,791 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_2009_210_2071_1_1525481134_4588937_385-302100-0 from training. Duration: 0.87 2023-10-03 20:56:21,833 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_161-276996-0_sp1.1 from training. Duration: 0.863625 2023-10-03 20:56:21,930 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_177-58005-0 from training. Duration: 0.62 2023-10-03 20:56:23,712 WARNING [train.py:1204] (2/4) Exclude cut with ID _64639_210_5816_1_1535367619437_3528000_146-343734-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 20:56:25,021 WARNING [train.py:1204] (2/4) Exclude cut with ID _3845_210_1390_1_1526731326141_3523610_72-61441-0_sp1.1 from training. Duration: 0.92725 2023-10-03 20:56:26,680 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.ff2_skip_rate, batch_count=1403693.3333333333, ans=0.0 2023-10-03 20:56:29,520 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.feed_forward2.hidden_balancer.prob, batch_count=1403760.0, ans=0.125 2023-10-03 20:56:29,540 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.self_attn_weights.pos_emb_skip_rate, batch_count=1403760.0, ans=0.0 2023-10-03 20:56:32,139 WARNING [train.py:1204] (2/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_991-125028-0_sp1.1 from training. Duration: 0.936375 2023-10-03 20:56:33,474 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7544_210_6753_1_1528591327_1471426_172-348777-0 from training. Duration: 0.66 2023-10-03 20:56:33,525 WARNING [train.py:1204] (2/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_416-257730-0_sp1.1 from training. Duration: 0.5909375 2023-10-03 20:56:35,285 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1113-7369-0_sp1.1 from training. Duration: 0.77275 2023-10-03 20:56:37,311 WARNING [train.py:1204] (2/4) Exclude cut with ID _40402_210_18219_1_1533522324115_2953150_242-76943-0_sp1.1 from training. Duration: 0.8545625 2023-10-03 20:56:41,674 WARNING [train.py:1204] (2/4) Exclude cut with ID _44718_210_5816_1_1534050051869_6469030_190-49648-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 20:56:44,949 WARNING [train.py:1204] (2/4) Exclude cut with ID _47251_210_18628_1_1533814063128_4137250_214-88076-0 from training. Duration: 0.88 2023-10-03 20:56:46,169 INFO [train.py:1046] (2/4) Epoch 40, batch 3400, loss[loss=0.169, simple_loss=0.2523, pruned_loss=0.04279, over 23354.00 frames. ], tot_loss[loss=0.1572, simple_loss=0.2376, pruned_loss=0.03841, over 4720727.81 frames. ], batch size: 93, lr: 2.54e-03, grad_scale: 16.0 2023-10-03 20:56:46,225 WARNING [train.py:1204] (2/4) Exclude cut with ID _5585_210_2667_1_1527418813324_8007340_89-332072-0_sp1.1 from training. Duration: 0.5818125 2023-10-03 20:56:46,250 WARNING [train.py:1204] (2/4) Exclude cut with ID _41817_210_18835_1_1533434477160_7423049_555-293448-0_sp1.1 from training. Duration: 0.72725 2023-10-03 20:56:47,633 WARNING [train.py:1204] (2/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_286-331314-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 20:56:47,695 WARNING [train.py:1204] (2/4) Exclude cut with ID _13921_210_9994_1_1530838702800_3003429_46-149444-0 from training. Duration: 0.94 2023-10-03 20:56:49,079 WARNING [train.py:1204] (2/4) Exclude cut with ID _44778_210_2054_1_1533798127203_7443090_768-43595-0_sp1.1 from training. Duration: 0.936375 2023-10-03 20:56:49,096 WARNING [train.py:1204] (2/4) Exclude cut with ID _35355_210_16040_1_1533212988977_4016229_74-190076-0 from training. Duration: 0.8 2023-10-03 20:56:50,502 WARNING [train.py:1204] (2/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_1007-316380-0_sp1.1 from training. Duration: 0.963625 2023-10-03 20:56:50,541 WARNING [train.py:1204] (2/4) Exclude cut with ID _23557_210_8750_1_1531890106370_6846255_150-283437-0_sp1.1 from training. Duration: 0.963625 2023-10-03 20:56:52,314 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_3474_210_2028_1_1526188513_4825246_145-309131-0_sp1.1 from training. Duration: 0.67275 2023-10-03 20:56:53,697 WARNING [train.py:1204] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_391-22148-0_sp1.1 from training. Duration: 0.7 2023-10-03 20:56:53,718 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_238-256668-0 from training. Duration: 0.69 2023-10-03 20:56:57,958 WARNING [train.py:1204] (2/4) Exclude cut with ID _8394_210_6702_1_1528590504316_3540189_372-198878-0 from training. Duration: 0.6 2023-10-03 20:56:57,968 WARNING [train.py:1204] (2/4) Exclude cut with ID ID0174W0006-766659 from training. Duration: 0.953 2023-10-03 20:56:57,997 WARNING [train.py:1204] (2/4) Exclude cut with ID _6274_210_2084_1_1527937583837_7494020_668-23019-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 20:57:02,161 WARNING [train.py:1204] (2/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_322-74328-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 20:57:02,167 WARNING [train.py:1204] (2/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_872-264525-0_sp1.1 from training. Duration: 0.5909375 2023-10-03 20:57:02,214 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_53378_210_17282_1_1534221071_5769509_641-286201-0_sp1.1 from training. Duration: 0.97275 2023-10-03 20:57:03,560 WARNING [train.py:1204] (2/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_199-295611-0_sp1.1 from training. Duration: 0.5 2023-10-03 20:57:09,884 WARNING [train.py:1204] (2/4) Exclude cut with ID _5250_210_5219_1_1527249561806_8692110_1270-317971-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 20:57:11,230 WARNING [train.py:1204] (2/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_784-213099-0 from training. Duration: 0.93 2023-10-03 20:57:14,549 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_115-233929-0_sp0.9 from training. Duration: 0.8 2023-10-03 20:57:17,126 WARNING [train.py:1204] (2/4) Exclude cut with ID _54871_210_22356_1_1534665798253_3856359_256-223809-0_sp1.1 from training. Duration: 0.97275 2023-10-03 20:57:17,160 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_20120_210_6753_1_1531643035_5482859_285-87781-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 20:57:17,801 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.4.encoder.layers.0.nonlin_attention.whiten2, num_groups=1, num_channels=384, metric=10.37 vs. limit=15.0 2023-10-03 20:57:18,505 WARNING [train.py:1204] (2/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_619-344499-0_sp1.1 from training. Duration: 0.57275 2023-10-03 20:57:23,162 WARNING [train.py:1204] (2/4) Exclude cut with ID _60229_210_27767_1_1534838406158_3655350_264-279573-0_sp0.9 from training. Duration: 0.92225 2023-10-03 20:57:26,271 WARNING [train.py:1204] (2/4) Exclude cut with ID _7719_210_3525_1_1528456153163_4296960_333-110399-0 from training. Duration: 0.96 2023-10-03 20:57:31,675 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_50423_210_13270_1_1534128598_5021734_759-90833-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 20:57:33,058 WARNING [train.py:1204] (2/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_151-254798-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 20:57:33,107 WARNING [train.py:1204] (2/4) Exclude cut with ID _3465_210_3525_1_1526175667060_3574049_450-203637-0 from training. Duration: 0.92 2023-10-03 20:57:34,827 WARNING [train.py:1204] (2/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_673-36456-0_sp1.1 from training. Duration: 0.936375 2023-10-03 20:57:34,879 WARNING [train.py:1204] (2/4) Exclude cut with ID _36538_210_9994_1_1533204020363_3795620_131-8685-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 20:57:36,230 WARNING [train.py:1204] (2/4) Exclude cut with ID _12556_210_8341_1_1530081024100_7327690_713-55626-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 20:57:36,298 WARNING [train.py:1204] (2/4) Exclude cut with ID _2322_210_2667_1_1525604548971_3656299_440-80494-0_sp0.9 from training. Duration: 0.97775 2023-10-03 20:57:39,711 WARNING [train.py:1204] (2/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_210-325081-0_sp1.1 from training. Duration: 0.97275 2023-10-03 20:57:43,800 WARNING [train.py:1204] (2/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_375-244976-0_sp0.9 from training. Duration: 0.8333125 2023-10-03 20:57:43,806 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_15517_210_6753_1_1530952293_1664795_119-31001-0_sp1.1 from training. Duration: 0.8545625 2023-10-03 20:57:48,489 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_41452_210_7291_1_1533437687_3941281_319-42326-0_sp1.1 from training. Duration: 0.92725 2023-10-03 20:57:49,879 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_372-173012-0 from training. Duration: 0.95 2023-10-03 20:57:55,710 WARNING [train.py:1204] (2/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_41-31157-0_sp1.1 from training. Duration: 0.5181875 2023-10-03 20:57:59,852 INFO [train.py:1046] (2/4) Epoch 40, batch 3450, loss[loss=0.1323, simple_loss=0.2138, pruned_loss=0.02541, over 24284.00 frames. ], tot_loss[loss=0.1569, simple_loss=0.2374, pruned_loss=0.03817, over 4728483.78 frames. ], batch size: 56, lr: 2.54e-03, grad_scale: 16.0 2023-10-03 20:57:59,970 WARNING [train.py:1204] (2/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_91-212486-0 from training. Duration: 0.68 2023-10-03 20:58:01,572 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.conv_module1.balancer1.prob, batch_count=1404160.0, ans=0.125 2023-10-03 20:58:04,079 WARNING [train.py:1204] (2/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_1267-272693-0 from training. Duration: 0.73 2023-10-03 20:58:05,407 WARNING [train.py:1204] (2/4) Exclude cut with ID _13938_210_9988_1_1530518718978_3693960_152-67046-0_sp1.1 from training. Duration: 0.936375 2023-10-03 20:58:07,389 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_4528_210_3273_1_1527076654_3276777_342-168068-0_sp1.1 from training. Duration: 0.7909375 2023-10-03 20:58:07,397 WARNING [train.py:1204] (2/4) Exclude cut with ID _7606_210_4915_1_1528894253277_5080690_127-177761-0 from training. Duration: 0.64 2023-10-03 20:58:07,470 WARNING [train.py:1204] (2/4) Exclude cut with ID _45329_210_7005_1_1534157951211_6774339_335-137972-0_sp1.1 from training. Duration: 0.92725 2023-10-03 20:58:12,113 WARNING [train.py:1204] (2/4) Exclude cut with ID _5166_210_2084_1_1527073197160_4087993_331-117829-0_sp1.1 from training. Duration: 0.563625 2023-10-03 20:58:16,328 WARNING [train.py:1204] (2/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_125-125818-0_sp1.1 from training. Duration: 0.87275 2023-10-03 20:58:18,111 WARNING [train.py:1204] (2/4) Exclude cut with ID _6789_210_1534_1_1527854471671_2870519_48-294897-0_sp1.1 from training. Duration: 0.963625 2023-10-03 20:58:18,190 WARNING [train.py:1204] (2/4) Exclude cut with ID _6694_210_4599_1_1527852637490_3965529_116-109398-0_sp1.1 from training. Duration: 0.663625 2023-10-03 20:58:18,206 WARNING [train.py:1204] (2/4) Exclude cut with ID _52892_210_6721_1_1534378941259_4124129_222-172685-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 20:58:20,966 WARNING [train.py:1204] (2/4) Exclude cut with ID _50655_210_18628_1_1534229942193_3772154_320-132275-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 20:58:27,019 WARNING [train.py:1204] (2/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_76-311963-0 from training. Duration: 0.7 2023-10-03 20:58:29,901 WARNING [train.py:1204] (2/4) Exclude cut with ID _6274_210_2084_1_1527937583837_7494020_32-205288-0 from training. Duration: 0.78 2023-10-03 20:58:31,208 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_86-16881-0_sp0.9 from training. Duration: 0.6444375 2023-10-03 20:58:31,258 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_271-297505-0_sp1.1 from training. Duration: 0.9 2023-10-03 20:58:32,642 WARNING [train.py:1204] (2/4) Exclude cut with ID _57572_210_5517_1_1534921219624_3809484_187-295293-0_sp1.1 from training. Duration: 0.963625 2023-10-03 20:58:39,195 WARNING [train.py:1204] (2/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_563-329943-0 from training. Duration: 0.99 2023-10-03 20:58:39,267 WARNING [train.py:1204] (2/4) Exclude cut with ID _7160_210_1876_1_1527940923231_3703799_302-150728-0_sp0.9 from training. Duration: 0.8666875 2023-10-03 20:58:42,070 WARNING [train.py:1204] (2/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_926-7526-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 20:58:42,106 WARNING [train.py:1204] (2/4) Exclude cut with ID _62574_210_2655_1_1535180407058_3933249_467-22348-0_sp1.1 from training. Duration: 0.936375 2023-10-03 20:58:43,478 WARNING [train.py:1204] (2/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_280-161633-0_sp0.9 from training. Duration: 0.811125 2023-10-03 20:58:44,697 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.578e+02 1.925e+02 2.071e+02 2.347e+02 3.387e+02, threshold=4.143e+02, percent-clipped=0.0 2023-10-03 20:58:46,069 WARNING [train.py:1204] (2/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_385-193052-0_sp1.1 from training. Duration: 0.7818125 2023-10-03 20:58:47,801 WARNING [train.py:1204] (2/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_444-28041-0 from training. Duration: 0.85 2023-10-03 20:58:47,808 WARNING [train.py:1204] (2/4) Exclude cut with ID _66070_210_7640_1_1535441393096_7805310_341-249237-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 20:58:47,904 WARNING [train.py:1204] (2/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_425-269826-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 20:58:50,654 WARNING [train.py:1204] (2/4) Exclude cut with ID _53950_210_5816_1_1534417264919_3862951_124-185597-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 20:58:53,360 WARNING [train.py:1204] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_380-204756-0 from training. Duration: 0.86 2023-10-03 20:58:58,033 WARNING [train.py:1204] (2/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_282-257791-0_sp0.9 from training. Duration: 0.9555625 2023-10-03 20:59:01,340 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.4.encoder.layers.1.self_attn2.whiten, num_groups=1, num_channels=384, metric=11.36 vs. limit=22.5 2023-10-03 20:59:03,559 WARNING [train.py:1204] (2/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_372-131163-0_sp1.1 from training. Duration: 0.8545625 2023-10-03 20:59:03,840 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.feed_forward3.hidden_balancer.prob, batch_count=1404426.6666666667, ans=0.125 2023-10-03 20:59:04,953 WARNING [train.py:1204] (2/4) Exclude cut with ID _28317_210_12742_1_1532480302381_8510325_447-233513-0_sp1.1 from training. Duration: 0.963625 2023-10-03 20:59:08,197 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_54-118333-0_sp1.1 from training. Duration: 0.97275 2023-10-03 20:59:12,336 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.conv_module1.whiten.whitening_limit, batch_count=1404426.6666666667, ans=15.0 2023-10-03 20:59:12,996 WARNING [train.py:1204] (2/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_388-230239-0_sp1.1 from training. Duration: 0.963625 2023-10-03 20:59:13,016 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_40047_210_2290_1_1533290519_2374311_141-54639-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 20:59:13,087 WARNING [train.py:1204] (2/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_91-267696-0_sp0.9 from training. Duration: 0.9666875 2023-10-03 20:59:14,335 INFO [train.py:1046] (2/4) Epoch 40, batch 3500, loss[loss=0.1721, simple_loss=0.2553, pruned_loss=0.04448, over 24074.00 frames. ], tot_loss[loss=0.1564, simple_loss=0.2364, pruned_loss=0.0382, over 4700597.05 frames. ], batch size: 80, lr: 2.54e-03, grad_scale: 8.0 2023-10-03 20:59:14,392 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_14056_210_4163_1_1530604483_7572827_532-137847-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 20:59:18,447 WARNING [train.py:1204] (2/4) Exclude cut with ID _8064_210_5399_1_1528372299291_4282973_155-283231-0_sp1.1 from training. Duration: 0.97275 2023-10-03 20:59:20,463 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_2245_210_2715_1_1525511548_3540254_310-52483-0_sp1.1 from training. Duration: 0.8 2023-10-03 20:59:21,824 WARNING [train.py:1204] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_185-174805-0 from training. Duration: 0.68 2023-10-03 20:59:22,571 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.3.encoder.layers.0.feed_forward3.out_whiten, num_groups=1, num_channels=512, metric=9.80 vs. limit=15.0 2023-10-03 20:59:23,252 WARNING [train.py:1204] (2/4) Exclude cut with ID _15012_210_1968_1_1530856820429_5095350_662-210469-0_sp1.1 from training. Duration: 0.5181875 2023-10-03 20:59:26,537 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_426-313557-0_sp0.9 from training. Duration: 0.588875 2023-10-03 20:59:28,017 WARNING [train.py:1204] (2/4) Exclude cut with ID _42326_210_7770_1_1533814282531_3707269_407-204343-0_sp1.1 from training. Duration: 0.97275 2023-10-03 20:59:29,328 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_258-310570-0 from training. Duration: 0.82 2023-10-03 20:59:30,098 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.2.encoder.layers.0.self_attn2.whiten, num_groups=1, num_channels=384, metric=17.60 vs. limit=22.5 2023-10-03 20:59:32,206 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_4737_210_2040_1_1526998689_3293008_378-178650-0_sp1.1 from training. Duration: 0.863625 2023-10-03 20:59:33,515 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_47896_210_20508_1_1534492518_5958868_432-327451-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 20:59:33,598 WARNING [train.py:1204] (2/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_188-114464-0_sp1.1 from training. Duration: 0.7454375 2023-10-03 20:59:33,605 WARNING [train.py:1204] (2/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_798-106179-0_sp1.1 from training. Duration: 0.7545625 2023-10-03 20:59:33,643 WARNING [train.py:1204] (2/4) Exclude cut with ID _14368_210_4220_1_1530532714870_3595529_143-117421-0_sp1.1 from training. Duration: 0.52725 2023-10-03 20:59:34,945 WARNING [train.py:1204] (2/4) Exclude cut with ID _67499_210_17282_1_1535509805220_3692430_361-78786-0_sp1.1 from training. Duration: 0.963625 2023-10-03 20:59:34,991 WARNING [train.py:1204] (2/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_712-342563-0_sp1.1 from training. Duration: 0.8818125 2023-10-03 20:59:35,041 WARNING [train.py:1204] (2/4) Exclude cut with ID _9235_210_5219_1_1528891154253_8046120_387-159008-0 from training. Duration: 0.86 2023-10-03 20:59:38,880 WARNING [train.py:1204] (2/4) Exclude cut with ID _33640_210_2655_1_1533117605479_4855469_959-18820-0_sp1.1 from training. Duration: 0.963625 2023-10-03 20:59:40,182 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_12868_210_6753_1_1530702479_3610564_133-564-0_sp0.9 from training. Duration: 0.87775 2023-10-03 20:59:41,590 WARNING [train.py:1204] (2/4) Exclude cut with ID _23514_210_1389_1_1531832139785_3031439_12-29287-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 20:59:44,535 WARNING [train.py:1204] (2/4) Exclude cut with ID _23514_210_1389_1_1531832139785_3031439_489-185052-0_sp1.1 from training. Duration: 0.963625 2023-10-03 20:59:44,595 WARNING [train.py:1204] (2/4) Exclude cut with ID _5444_210_2151_1_1527418808464_5312210_634-194174-0 from training. Duration: 0.97 2023-10-03 20:59:45,867 WARNING [train.py:1204] (2/4) Exclude cut with ID _6275_210_2084_1_1528110062213_7669049_91-26919-0_sp1.1 from training. Duration: 0.8818125 2023-10-03 20:59:50,466 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_37351_210_7925_1_1533012931_3812113_516-82303-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 20:59:51,840 WARNING [train.py:1204] (2/4) Exclude cut with ID _41314_210_12651_1_1533459476543_3766460_80-74184-0_sp0.9 from training. Duration: 0.92225 2023-10-03 20:59:52,017 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.self_attn_weights.pos_emb_skip_rate, batch_count=1404626.6666666667, ans=0.0 2023-10-03 20:59:52,760 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.0.layers.0.feed_forward2.out_whiten, num_groups=1, num_channels=192, metric=5.98 vs. limit=15.0 2023-10-03 20:59:53,234 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_9460_210_4211_1_1528954587_10049667_495-202963-0_sp1.1 from training. Duration: 0.963625 2023-10-03 20:59:54,678 WARNING [train.py:1204] (2/4) Exclude cut with ID _14632_210_9988_1_1530691279392_3923200_332-105904-0_sp1.1 from training. Duration: 0.8181875 2023-10-03 20:59:56,514 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_40764_210_1925_1_1533882359_3752345_493-167886-0_sp1.1 from training. Duration: 0.92725 2023-10-03 20:59:57,934 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_196-289637-0 from training. Duration: 0.75 2023-10-03 20:59:58,004 WARNING [train.py:1204] (2/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_26-175553-0 from training. Duration: 0.61 2023-10-03 20:59:58,039 WARNING [train.py:1204] (2/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_890-5117-0 from training. Duration: 0.78 2023-10-03 20:59:59,364 WARNING [train.py:1204] (2/4) Exclude cut with ID _41314_210_12651_1_1533459476543_3766460_206-306860-0_sp1.1 from training. Duration: 0.92725 2023-10-03 21:00:00,838 WARNING [train.py:1204] (2/4) Exclude cut with ID _25882_210_3036_1_1532775665815_6479510_570-79824-0_sp1.1 from training. Duration: 0.963625 2023-10-03 21:00:02,178 WARNING [train.py:1204] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_731-2428-0_sp1.1 from training. Duration: 0.7545625 2023-10-03 21:00:02,199 WARNING [train.py:1204] (2/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_138-154812-0_sp0.9 from training. Duration: 0.8666875 2023-10-03 21:00:03,793 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.attention_skip_rate, batch_count=1404693.3333333333, ans=0.0 2023-10-03 21:00:04,847 WARNING [train.py:1204] (2/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_178-39722-0_sp0.9 from training. Duration: 0.5444375 2023-10-03 21:00:05,112 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.feed_forward1.out_proj.dropout_p, batch_count=1404693.3333333333, ans=0.1 2023-10-03 21:00:06,193 WARNING [train.py:1204] (2/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_1285-256622-0_sp0.9 from training. Duration: 0.9333125 2023-10-03 21:00:08,524 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_41428_210_2161_1_1533774184_3992532_65-271323-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 21:00:11,222 WARNING [train.py:1204] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_308-161067-0 from training. Duration: 0.66 2023-10-03 21:00:11,227 WARNING [train.py:1204] (2/4) Exclude cut with ID _3067_210_2667_1_1526212974640_4503168_459-173867-0 from training. Duration: 0.88 2023-10-03 21:00:11,228 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_2221_210_1988_1_1525597051_4935541_415-296882-0_sp1.1 from training. Duration: 0.7 2023-10-03 21:00:11,608 INFO [scaling.py:1118] (2/4) WithLoss: name=encoder.encoders.4.encoder.layers.2.self_attn_weights, loss-sum=0.000e+00 2023-10-03 21:00:13,889 WARNING [train.py:1204] (2/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_354-192266-0_sp1.1 from training. Duration: 0.936375 2023-10-03 21:00:15,365 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_258-310570-0_sp0.9 from training. Duration: 0.911125 2023-10-03 21:00:16,790 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_548-141039-0_sp1.1 from training. Duration: 0.963625 2023-10-03 21:00:19,604 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_15101_210_7927_1_1530786415_5033274_320-327527-0 from training. Duration: 0.96 2023-10-03 21:00:19,663 WARNING [train.py:1204] (2/4) Exclude cut with ID _2895_210_2477_1_1525956785794_4063750_289-11475-0_sp0.9 from training. Duration: 0.911125 2023-10-03 21:00:21,091 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7543_210_6753_1_1528543318_4552_1-85072-0_sp1.1 from training. Duration: 0.7545625 2023-10-03 21:00:23,096 WARNING [train.py:1204] (2/4) Exclude cut with ID _64639_210_5816_1_1535367619437_3528000_461-329156-0 from training. Duration: 0.96 2023-10-03 21:00:24,537 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_211-339622-0 from training. Duration: 0.45 2023-10-03 21:00:27,937 WARNING [train.py:1204] (2/4) Exclude cut with ID _20455_210_5219_1_1531565996775_8098100_402-112739-0_sp1.1 from training. Duration: 0.963625 2023-10-03 21:00:29,203 INFO [train.py:1046] (2/4) Epoch 40, batch 3550, loss[loss=0.1401, simple_loss=0.2182, pruned_loss=0.03106, over 24629.00 frames. ], tot_loss[loss=0.1555, simple_loss=0.2347, pruned_loss=0.03814, over 4690033.86 frames. ], batch size: 60, lr: 2.54e-03, grad_scale: 8.0 2023-10-03 21:00:29,325 WARNING [train.py:1204] (2/4) Exclude cut with ID _53489_210_6228_1_1534298357169_3835970_256-115565-0_sp1.1 from training. Duration: 0.936375 2023-10-03 21:00:29,348 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_26326_210_7925_1_1533002493_3592406_360-305260-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 21:00:30,684 WARNING [train.py:1204] (2/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_18-204383-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 21:00:33,417 WARNING [train.py:1204] (2/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_306-232480-0_sp1.1 from training. Duration: 0.87275 2023-10-03 21:00:42,367 WARNING [train.py:1204] (2/4) Exclude cut with ID _41161_210_20034_1_1533431249657_3555314_116-299186-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 21:00:42,485 WARNING [train.py:1204] (2/4) Exclude cut with ID _13913_210_9994_1_1530579615187_3383897_473-277682-0_sp1.1 from training. Duration: 0.3545625 2023-10-03 21:00:45,209 WARNING [train.py:1204] (2/4) Exclude cut with ID _58895_210_8750_1_1535184081320_3649089_49-225702-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 21:00:46,515 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_3080_210_2071_1_1526086102_4365166_312-8726-0_sp1.1 from training. Duration: 0.82725 2023-10-03 21:00:46,690 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.1.conv_skip_rate, batch_count=1404893.3333333333, ans=0.0 2023-10-03 21:00:47,910 WARNING [train.py:1204] (2/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_489-190175-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 21:00:49,344 WARNING [train.py:1204] (2/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_345-95717-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 21:00:49,367 WARNING [train.py:1204] (2/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_85-138257-0_sp1.1 from training. Duration: 0.6454375 2023-10-03 21:00:53,390 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7046_210_6753_1_1527895337_6920917_783-278109-0_sp1.1 from training. Duration: 0.7 2023-10-03 21:00:53,417 WARNING [train.py:1204] (2/4) Exclude cut with ID _3465_210_3525_1_1526175667060_3574049_450-203637-0_sp1.1 from training. Duration: 0.836375 2023-10-03 21:00:53,465 WARNING [train.py:1204] (2/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_416-132893-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 21:00:53,489 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_2655_210_2087_1_1525688959_3897400_437-337690-0_sp0.9 from training. Duration: 0.7 2023-10-03 21:00:55,338 WARNING [train.py:1204] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_700-183549-0_sp1.1 from training. Duration: 0.6181875 2023-10-03 21:01:00,002 WARNING [train.py:1204] (2/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_191-59784-0_sp1.1 from training. Duration: 0.8 2023-10-03 21:01:00,013 WARNING [train.py:1204] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_218-295097-0_sp1.1 from training. Duration: 0.7 2023-10-03 21:01:01,220 WARNING [train.py:1204] (2/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_310-109461-0_sp1.1 from training. Duration: 0.72725 2023-10-03 21:01:01,225 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_33348_210_5632_1_1533086912_3324030_131-321149-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 21:01:01,292 WARNING [train.py:1204] (2/4) Exclude cut with ID _64135_210_2253_1_1535677267876_7608972_133-188266-0_sp1.1 from training. Duration: 0.736375 2023-10-03 21:01:01,310 WARNING [train.py:1204] (2/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_505-205270-0 from training. Duration: 0.38 2023-10-03 21:01:01,320 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_67223_210_4238_1_1535612120_2537431_102-88248-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 21:01:03,945 WARNING [train.py:1204] (2/4) Exclude cut with ID _54871_210_22356_1_1534665798253_3856359_60-235433-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 21:01:04,026 WARNING [train.py:1204] (2/4) Exclude cut with ID _5189_210_4557_1_1527248319428_3769229_199-53526-0_sp1.1 from training. Duration: 0.3818125 2023-10-03 21:01:09,202 WARNING [train.py:1204] (2/4) Exclude cut with ID _45329_210_7005_1_1534157951211_6774339_439-90979-0_sp1.1 from training. Duration: 0.963625 2023-10-03 21:01:10,437 WARNING [train.py:1204] (2/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_1162-132880-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 21:01:10,509 WARNING [train.py:1204] (2/4) Exclude cut with ID _56423_210_5854_1_1534896061049_3165969_220-22317-0_sp1.1 from training. Duration: 0.963625 2023-10-03 21:01:13,188 WARNING [train.py:1204] (2/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_909-64151-0 from training. Duration: 0.94 2023-10-03 21:01:13,264 WARNING [train.py:1204] (2/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_465-156500-0_sp0.9 from training. Duration: 0.8 2023-10-03 21:01:14,497 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.605e+02 1.948e+02 2.132e+02 2.372e+02 3.710e+02, threshold=4.263e+02, percent-clipped=0.0 2023-10-03 21:01:14,632 WARNING [train.py:1204] (2/4) Exclude cut with ID _44304_210_15374_1_1533639352765_4613129_302-22084-0 from training. Duration: 0.86 2023-10-03 21:01:15,787 WARNING [train.py:1204] (2/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_663-166973-0_sp1.1 from training. Duration: 0.72725 2023-10-03 21:01:16,062 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.conv_skip_rate, batch_count=1405026.6666666667, ans=0.0 2023-10-03 21:01:17,184 WARNING [train.py:1204] (2/4) Exclude cut with ID _45329_210_7005_1_1534157951211_6774339_503-287883-0_sp0.9 from training. Duration: 0.97775 2023-10-03 21:01:18,621 WARNING [train.py:1204] (2/4) Exclude cut with ID _60057_210_10640_1_1534903065793_3333150_362-230377-0_sp0.9 from training. Duration: 0.9666875 2023-10-03 21:01:22,882 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_4183_210_2087_1_1526729100_1869838_143-280962-0 from training. Duration: 0.56 2023-10-03 21:01:22,971 WARNING [train.py:1204] (2/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_281-1313-0_sp1.1 from training. Duration: 0.936375 2023-10-03 21:01:23,206 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.conv_module1.balancer2.prob, batch_count=1405026.6666666667, ans=0.125 2023-10-03 21:01:27,789 WARNING [train.py:1204] (2/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_64-215118-0_sp1.1 from training. Duration: 0.936375 2023-10-03 21:01:29,145 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_2822_210_2715_1_1526112586_1999587_165-36656-0 from training. Duration: 0.89 2023-10-03 21:01:29,209 WARNING [train.py:1204] (2/4) Exclude cut with ID _22961_210_8254_1_1531976366672_4278180_402-208830-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 21:01:29,428 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.conv_skip_rate, batch_count=1405093.3333333333, ans=0.0 2023-10-03 21:01:30,731 WARNING [train.py:1204] (2/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_98-193494-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 21:01:32,072 WARNING [train.py:1204] (2/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_383-285196-0 from training. Duration: 0.97 2023-10-03 21:01:38,578 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6767_210_3273_1_1527855783_1718932_142-127132-0 from training. Duration: 0.95 2023-10-03 21:01:38,614 WARNING [train.py:1204] (2/4) Exclude cut with ID _39867_210_9826_1_1533553222030_9344259_1018-339635-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 21:01:39,941 WARNING [train.py:1204] (2/4) Exclude cut with ID _14603_210_4220_1_1530615688441_3097319_274-274798-0_sp1.1 from training. Duration: 0.8909375 2023-10-03 21:01:41,396 WARNING [train.py:1204] (2/4) Exclude cut with ID _2334_210_2667_1_1525608238880_4705129_376-263393-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 21:01:42,641 INFO [train.py:1046] (2/4) Epoch 40, batch 3600, loss[loss=0.1731, simple_loss=0.2587, pruned_loss=0.04377, over 24292.00 frames. ], tot_loss[loss=0.1558, simple_loss=0.2352, pruned_loss=0.03818, over 4706694.17 frames. ], batch size: 77, lr: 2.54e-03, grad_scale: 16.0 2023-10-03 21:01:42,722 WARNING [train.py:1204] (2/4) Exclude cut with ID _29303_210_2575_1_1532392237303_7410649_363-344675-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 21:01:44,072 WARNING [train.py:1204] (2/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_58-68956-0_sp1.1 from training. Duration: 0.7545625 2023-10-03 21:01:46,753 WARNING [train.py:1204] (2/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_709-311571-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 21:01:48,204 WARNING [train.py:1204] (2/4) Exclude cut with ID _46067_210_4220_1_1534125571066_3307450_136-257407-0_sp1.1 from training. Duration: 0.963625 2023-10-03 21:01:48,281 WARNING [train.py:1204] (2/4) Exclude cut with ID _6493_210_1747_1_1528459210424_4012470_324-323854-0_sp1.1 from training. Duration: 0.836375 2023-10-03 21:01:48,501 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.conv_module2.balancer2.prob, batch_count=1405160.0, ans=0.125 2023-10-03 21:01:49,511 WARNING [train.py:1204] (2/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_234-260682-0_sp1.1 from training. Duration: 0.763625 2023-10-03 21:01:49,573 WARNING [train.py:1204] (2/4) Exclude cut with ID _36634_210_1794_1_1533296352450_1399884_55-119451-0_sp1.1 from training. Duration: 0.963625 2023-10-03 21:01:49,577 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_40047_210_2290_1_1533290519_2374311_195-165693-0 from training. Duration: 0.89 2023-10-03 21:01:54,181 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_5522_210_3273_1_1527249606_3909096_366-172508-0_sp0.9 from training. Duration: 0.7333125 2023-10-03 21:01:54,263 WARNING [train.py:1204] (2/4) Exclude cut with ID _21878_210_3525_1_1531738772002_7264330_187-10504-0_sp1.1 from training. Duration: 0.963625 2023-10-03 21:01:57,007 WARNING [train.py:1204] (2/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_282-257791-0_sp1.1 from training. Duration: 0.7818125 2023-10-03 21:01:59,710 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_43831_210_2158_1_1533725502_5872791_467-288776-0_sp1.1 from training. Duration: 0.92725 2023-10-03 21:01:59,807 WARNING [train.py:1204] (2/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_138-154812-0_sp1.1 from training. Duration: 0.7090625 2023-10-03 21:02:01,146 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_1154-185930-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 21:02:01,173 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_2655_210_2087_1_1525688959_3897400_52-279899-0 from training. Duration: 0.69 2023-10-03 21:02:02,558 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_141-89113-0_sp1.1 from training. Duration: 0.7818125 2023-10-03 21:02:05,830 WARNING [train.py:1204] (2/4) Exclude cut with ID _55134_210_21982_1_1534590028377_3882170_297-47310-0_sp1.1 from training. Duration: 0.963625 2023-10-03 21:02:05,909 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_486-151131-0_sp1.1 from training. Duration: 0.77275 2023-10-03 21:02:07,836 WARNING [train.py:1204] (2/4) Exclude cut with ID _44778_210_2054_1_1533798127203_7443090_21-232980-0_sp1.1 from training. Duration: 0.936375 2023-10-03 21:02:10,564 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6165_210_3528_1_1527590982_4379841_338-106380-0_sp1.1 from training. Duration: 0.92725 2023-10-03 21:02:10,629 WARNING [train.py:1204] (2/4) Exclude cut with ID _55162_210_17071_1_1534408248583_3753480_355-257218-0_sp1.1 from training. Duration: 0.97275 2023-10-03 21:02:10,867 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.balancer2.prob, batch_count=1405293.3333333333, ans=0.125 2023-10-03 21:02:13,262 WARNING [train.py:1204] (2/4) Exclude cut with ID _2895_210_2477_1_1525956785794_4063750_289-11475-0 from training. Duration: 0.82 2023-10-03 21:02:20,327 WARNING [train.py:1204] (2/4) Exclude cut with ID _16756_210_4915_1_1531047803763_7104259_839-183431-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 21:02:22,348 WARNING [train.py:1204] (2/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_427-208901-0_sp0.9 from training. Duration: 0.7555625 2023-10-03 21:02:23,717 WARNING [train.py:1204] (2/4) Exclude cut with ID _36512_210_6987_1_1533033143998_3126069_181-306500-0 from training. Duration: 0.95 2023-10-03 21:02:27,128 WARNING [train.py:1204] (2/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_28-70987-0_sp1.1 from training. Duration: 0.7909375 2023-10-03 21:02:31,318 WARNING [train.py:1204] (2/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_590-58996-0_sp1.1 from training. Duration: 0.936375 2023-10-03 21:02:34,084 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_48730_210_5399_1_1533885886_2212769_108-47561-0_sp1.1 from training. Duration: 0.936375 2023-10-03 21:02:35,708 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.feed_forward1.out_proj.dropout_p, batch_count=1405360.0, ans=0.1 2023-10-03 21:02:41,999 WARNING [train.py:1204] (2/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_401-158239-0_sp0.9 from training. Duration: 0.788875 2023-10-03 21:02:42,012 WARNING [train.py:1204] (2/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_278-154492-0_sp1.1 from training. Duration: 0.6909375 2023-10-03 21:02:42,017 WARNING [train.py:1204] (2/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_137-116442-0 from training. Duration: 0.78 2023-10-03 21:02:43,402 WARNING [train.py:1204] (2/4) Exclude cut with ID _3541_210_2477_1_1526475408709_3263973_189-317158-0 from training. Duration: 0.9 2023-10-03 21:02:44,825 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_47896_210_20508_1_1534492518_5958868_558-314572-0 from training. Duration: 0.97 2023-10-03 21:02:46,265 WARNING [train.py:1204] (2/4) Exclude cut with ID _66070_210_7640_1_1535441393096_7805310_290-40284-0_sp1.1 from training. Duration: 0.97275 2023-10-03 21:02:47,630 WARNING [train.py:1204] (2/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_110-224688-0_sp1.1 from training. Duration: 0.8818125 2023-10-03 21:02:47,721 WARNING [train.py:1204] (2/4) Exclude cut with ID _7719_210_3525_1_1528456153163_4296960_88-250259-0 from training. Duration: 0.84 2023-10-03 21:02:49,008 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8453_210_4211_1_1528502940_8562730_1157-114102-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 21:02:49,047 WARNING [train.py:1204] (2/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_317-175062-0_sp1.1 from training. Duration: 0.7454375 2023-10-03 21:02:49,060 WARNING [train.py:1204] (2/4) Exclude cut with ID _29857_210_20034_1_1532395886753_3798160_254-347254-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 21:02:50,524 WARNING [train.py:1204] (2/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_28-70987-0 from training. Duration: 0.87 2023-10-03 21:02:51,988 WARNING [train.py:1204] (2/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_321-335475-0 from training. Duration: 0.45 2023-10-03 21:02:56,509 INFO [train.py:1046] (2/4) Epoch 40, batch 3650, loss[loss=0.1651, simple_loss=0.2476, pruned_loss=0.04127, over 24027.00 frames. ], tot_loss[loss=0.1564, simple_loss=0.2364, pruned_loss=0.03822, over 4718090.01 frames. ], batch size: 86, lr: 2.54e-03, grad_scale: 16.0 2023-10-03 21:02:56,559 WARNING [train.py:1204] (2/4) Exclude cut with ID _40901_210_5665_1_1533363789958_3856130_356-107330-0_sp1.1 from training. Duration: 0.936375 2023-10-03 21:02:56,632 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_3780_210_2087_1_1526467389_2145572_176-107140-0 from training. Duration: 0.69 2023-10-03 21:03:01,320 WARNING [train.py:1204] (2/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_80-135200-0 from training. Duration: 0.79 2023-10-03 21:03:01,419 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_33348_210_5632_1_1533086912_3324030_85-28583-0_sp0.9 from training. Duration: 0.911125 2023-10-03 21:03:05,612 WARNING [train.py:1204] (2/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_29-293896-0 from training. Duration: 0.61 2023-10-03 21:03:07,153 WARNING [train.py:1204] (2/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_194-252800-0 from training. Duration: 0.97 2023-10-03 21:03:13,614 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_360-52446-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 21:03:13,616 WARNING [train.py:1204] (2/4) Exclude cut with ID _9389_210_1664_1_1529217337301_5289269_289-213417-0_sp0.9 from training. Duration: 0.988875 2023-10-03 21:03:13,673 WARNING [train.py:1204] (2/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_143-86344-0_sp0.9 from training. Duration: 0.8555625 2023-10-03 21:03:16,417 WARNING [train.py:1204] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_520-126729-0_sp0.9 from training. Duration: 0.77775 2023-10-03 21:03:16,435 WARNING [train.py:1204] (2/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_76-308663-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 21:03:17,745 WARNING [train.py:1204] (2/4) Exclude cut with ID _8021_210_7994_1_1528376451358_3653069_100-140231-0 from training. Duration: 0.67 2023-10-03 21:03:17,813 WARNING [train.py:1204] (2/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_387-131538-0_sp0.9 from training. Duration: 0.9 2023-10-03 21:03:17,982 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.conv_skip_rate, batch_count=1405560.0, ans=0.0 2023-10-03 21:03:19,174 WARNING [train.py:1204] (2/4) Exclude cut with ID _6275_210_2084_1_1528110062213_7669049_365-182054-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 21:03:20,601 WARNING [train.py:1204] (2/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_345-340690-0 from training. Duration: 0.93 2023-10-03 21:03:20,689 WARNING [train.py:1204] (2/4) Exclude cut with ID _5409_210_1388_1_1527292850189_5865191_178-127970-0_sp0.9 from training. Duration: 0.6333125 2023-10-03 21:03:22,014 WARNING [train.py:1204] (2/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_352-12734-0_sp1.1 from training. Duration: 0.92725 2023-10-03 21:03:22,017 WARNING [train.py:1204] (2/4) Exclude cut with ID _65134_210_14763_1_1535335393985_3505368_121-129373-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 21:03:23,433 WARNING [train.py:1204] (2/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_557-324258-0_sp1.1 from training. Duration: 0.736375 2023-10-03 21:03:25,000 WARNING [train.py:1204] (2/4) Exclude cut with ID _48678_210_5663_1_1533988830947_3769199_306-255886-0 from training. Duration: 0.77 2023-10-03 21:03:26,375 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7544_210_6753_1_1528587000_691468_87-178599-0 from training. Duration: 0.87 2023-10-03 21:03:27,631 WARNING [train.py:1204] (2/4) Exclude cut with ID _2941_210_2577_1_1526199673150_2489759_208-236402-0_sp1.1 from training. Duration: 0.963625 2023-10-03 21:03:29,573 WARNING [train.py:1204] (2/4) Exclude cut with ID _38670_210_15379_1_1533448810659_4391059_65-169397-0 from training. Duration: 0.97 2023-10-03 21:03:30,923 WARNING [train.py:1204] (2/4) Exclude cut with ID _30304_210_6319_1_1532476827261_7582060_306-328175-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 21:03:30,931 WARNING [train.py:1204] (2/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_212-149947-0_sp1.1 from training. Duration: 0.663625 2023-10-03 21:03:35,257 WARNING [train.py:1204] (2/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_321-22821-0_sp1.1 from training. Duration: 0.8090625 2023-10-03 21:03:37,956 WARNING [train.py:1204] (2/4) Exclude cut with ID _44010_210_4525_1_1533884262964_4013620_483-69110-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 21:03:37,964 WARNING [train.py:1204] (2/4) Exclude cut with ID _14922_210_6228_1_1530707435273_3693310_38-187851-0_sp1.1 from training. Duration: 0.563625 2023-10-03 21:03:39,325 WARNING [train.py:1204] (2/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_129-337802-0_sp0.9 from training. Duration: 0.8 2023-10-03 21:03:41,118 WARNING [train.py:1204] (2/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_268-87909-0_sp1.1 from training. Duration: 0.9 2023-10-03 21:03:42,371 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.645e+02 1.999e+02 2.151e+02 2.370e+02 3.014e+02, threshold=4.301e+02, percent-clipped=0.0 2023-10-03 21:03:44,202 WARNING [train.py:1204] (2/4) Exclude cut with ID _21878_210_3525_1_1531738772002_7264330_559-184862-0_sp1.1 from training. Duration: 0.8545625 2023-10-03 21:03:45,227 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.0.layers.0.nonlin_attention.whiten2, num_groups=1, num_channels=192, metric=4.63 vs. limit=15.0 2023-10-03 21:03:45,763 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_46032_210_14656_1_1533781412_4507969_237-120766-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 21:03:47,100 WARNING [train.py:1204] (2/4) Exclude cut with ID _28427_210_4220_1_1532581240967_3061069_330-170906-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 21:03:47,118 WARNING [train.py:1204] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_401-51578-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 21:03:48,497 WARNING [train.py:1204] (2/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_209-205365-0_sp1.1 from training. Duration: 0.7181875 2023-10-03 21:03:49,811 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_23801_210_16223_1_1532066392_3294657_57-242170-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 21:03:49,866 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_127-37371-0_sp1.1 from training. Duration: 0.936375 2023-10-03 21:03:51,751 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.5.encoder.layers.0.self_attn_weights.whiten_keys, num_groups=4, num_channels=128, metric=1.87 vs. limit=6.0 2023-10-03 21:03:52,788 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.attention_skip_rate, batch_count=1405693.3333333333, ans=0.0 2023-10-03 21:03:55,218 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9171W0019-578905 from training. Duration: 0.896 2023-10-03 21:03:59,747 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_24605_210_1794_1_1532047863_4149755_349-49056-0_sp1.1 from training. Duration: 0.92725 2023-10-03 21:03:59,767 WARNING [train.py:1204] (2/4) Exclude cut with ID _12302_210_5219_1_1529924446466_8107280_897-21237-0_sp1.1 from training. Duration: 0.936375 2023-10-03 21:03:59,969 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.1.feed_forward3.hidden_balancer.prob, batch_count=1405760.0, ans=0.125 2023-10-03 21:04:01,798 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_9460_210_4211_1_1528954587_10049667_480-260269-0_sp0.9 from training. Duration: 0.62225 2023-10-03 21:04:01,858 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_348-152548-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 21:04:03,233 WARNING [train.py:1204] (2/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_301-330455-0_sp0.9 from training. Duration: 0.888875 2023-10-03 21:04:03,346 WARNING [train.py:1204] (2/4) Exclude cut with ID _61008_210_10633_1_1534852769198_6974370_345-91705-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 21:04:04,645 WARNING [train.py:1204] (2/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_304-273433-0 from training. Duration: 0.99 2023-10-03 21:04:04,648 WARNING [train.py:1204] (2/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_359-345972-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 21:04:07,310 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_2296_210_2087_1_1525503448_3397335_57-185430-0_sp1.1 from training. Duration: 0.5454375 2023-10-03 21:04:10,589 INFO [train.py:1046] (2/4) Epoch 40, batch 3700, loss[loss=0.1685, simple_loss=0.2406, pruned_loss=0.04822, over 22729.00 frames. ], tot_loss[loss=0.1574, simple_loss=0.2376, pruned_loss=0.03863, over 4713812.52 frames. ], batch size: 322, lr: 2.54e-03, grad_scale: 16.0 2023-10-03 21:04:10,629 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_1256-120974-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 21:04:10,694 WARNING [train.py:1204] (2/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_372-30302-0_sp0.9 from training. Duration: 0.9555625 2023-10-03 21:04:12,042 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_42012_210_7927_1_1533439936_3871556_451-344262-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 21:04:12,043 WARNING [train.py:1204] (2/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_28-324729-0 from training. Duration: 0.94 2023-10-03 21:04:12,053 WARNING [train.py:1204] (2/4) Exclude cut with ID _64135_210_2253_1_1535677267876_7608972_313-18394-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 21:04:14,060 WARNING [train.py:1204] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_620-276793-0_sp0.9 from training. Duration: 0.6555625 2023-10-03 21:04:15,410 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7544_210_6753_1_1528591327_1471426_172-348777-0_sp0.9 from training. Duration: 0.7333125 2023-10-03 21:04:16,895 WARNING [train.py:1204] (2/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_332-221057-0_sp0.9 from training. Duration: 0.7555625 2023-10-03 21:04:19,660 WARNING [train.py:1204] (2/4) Exclude cut with ID _19023_210_4353_1_1531378892552_5820780_5-300035-0_sp1.1 from training. Duration: 0.92725 2023-10-03 21:04:20,958 WARNING [train.py:1204] (2/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_343-309023-0_sp1.1 from training. Duration: 0.963625 2023-10-03 21:04:21,034 WARNING [train.py:1204] (2/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_37-41919-0_sp1.1 from training. Duration: 0.7909375 2023-10-03 21:04:22,343 WARNING [train.py:1204] (2/4) Exclude cut with ID _3541_210_2477_1_1526475408709_3263973_41-182910-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 21:04:22,408 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_229-45300-0_sp0.9 from training. Duration: 0.6333125 2023-10-03 21:04:22,610 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.ff2_skip_rate, batch_count=1405826.6666666667, ans=0.0 2023-10-03 21:04:25,069 WARNING [train.py:1204] (2/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_40-214716-0_sp1.1 from training. Duration: 0.963625 2023-10-03 21:04:26,986 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9309W0203-640330 from training. Duration: 0.896 2023-10-03 21:04:34,427 WARNING [train.py:1204] (2/4) Exclude cut with ID _60229_210_27767_1_1534838406158_3655350_264-279573-0_sp1.1 from training. Duration: 0.7545625 2023-10-03 21:04:35,752 WARNING [train.py:1204] (2/4) Exclude cut with ID _8394_210_6702_1_1528590504316_3540189_372-198878-0_sp0.9 from training. Duration: 0.6666875 2023-10-03 21:04:35,851 WARNING [train.py:1204] (2/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_359-118926-0_sp1.1 from training. Duration: 0.6545625 2023-10-03 21:04:37,057 WARNING [train.py:1204] (2/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_85-65056-0 from training. Duration: 0.96 2023-10-03 21:04:37,078 WARNING [train.py:1204] (2/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_89-242335-0_sp1.1 from training. Duration: 0.863625 2023-10-03 21:04:39,137 WARNING [train.py:1204] (2/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_128-49444-0_sp1.1 from training. Duration: 0.936375 2023-10-03 21:04:40,617 WARNING [train.py:1204] (2/4) Exclude cut with ID _30327_210_16049_1_1532484161295_3924169_401-70819-0 from training. Duration: 0.86 2023-10-03 21:04:43,139 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_41822_210_13270_1_1533955976_4379284_260-256391-0_sp1.1 from training. Duration: 0.936375 2023-10-03 21:04:45,129 WARNING [train.py:1204] (2/4) Exclude cut with ID _67335_210_5219_1_1535525975652_4275229_599-247317-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 21:04:47,874 WARNING [train.py:1204] (2/4) Exclude cut with ID _29857_210_20034_1_1532395886753_3798160_209-239267-0_sp1.1 from training. Duration: 0.936375 2023-10-03 21:04:49,224 WARNING [train.py:1204] (2/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_46-207599-0_sp1.1 from training. Duration: 0.5818125 2023-10-03 21:04:50,658 WARNING [train.py:1204] (2/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_912-282412-0_sp1.1 from training. Duration: 0.4454375 2023-10-03 21:04:54,944 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_4183_210_2087_1_1526729100_1869838_141-166600-0_sp1.1 from training. Duration: 0.863625 2023-10-03 21:04:54,953 WARNING [train.py:1204] (2/4) Exclude cut with ID _15037_210_2253_1_1530752294104_3267650_296-192597-0 from training. Duration: 0.81 2023-10-03 21:04:56,288 WARNING [train.py:1204] (2/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_571-220562-0_sp1.1 from training. Duration: 0.963625 2023-10-03 21:04:56,306 WARNING [train.py:1204] (2/4) Exclude cut with ID _5287_210_2084_1_1527418883040_3878670_12-221186-0 from training. Duration: 0.98 2023-10-03 21:05:02,210 WARNING [train.py:1204] (2/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_149-85602-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 21:05:02,233 WARNING [train.py:1204] (2/4) Exclude cut with ID _9235_210_5219_1_1528891154253_8046120_1003-101638-0_sp1.1 from training. Duration: 0.663625 2023-10-03 21:05:04,038 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.bypass_mid.scale_min, batch_count=1406026.6666666667, ans=0.2 2023-10-03 21:05:05,181 WARNING [train.py:1204] (2/4) Exclude cut with ID _55844_210_2346_1_1534471212569_7625707_1099-33689-0_sp1.1 from training. Duration: 0.92725 2023-10-03 21:05:06,952 WARNING [train.py:1204] (2/4) Exclude cut with ID _7606_210_4915_1_1528894253277_5080690_301-10732-0 from training. Duration: 0.81 2023-10-03 21:05:08,434 WARNING [train.py:1204] (2/4) Exclude cut with ID _22826_210_9994_1_1531908201522_3279169_403-34698-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 21:05:08,440 WARNING [train.py:1204] (2/4) Exclude cut with ID _8480_210_1874_1_1528589391205_6728330_755-112625-0_sp0.9 from training. Duration: 0.82225 2023-10-03 21:05:08,463 WARNING [train.py:1204] (2/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_527-238990-0_sp1.1 from training. Duration: 0.8181875 2023-10-03 21:05:08,487 WARNING [train.py:1204] (2/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_426-7328-0_sp1.1 from training. Duration: 0.92725 2023-10-03 21:05:10,058 WARNING [train.py:1204] (2/4) Exclude cut with ID _3541_210_2477_1_1526475408709_3263973_189-317158-0_sp1.1 from training. Duration: 0.8181875 2023-10-03 21:05:12,034 WARNING [train.py:1204] (2/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_162-103620-0 from training. Duration: 0.93 2023-10-03 21:05:13,527 WARNING [train.py:1204] (2/4) Exclude cut with ID _22083_210_8254_1_1531702862439_3678970_49-234546-0 from training. Duration: 0.83 2023-10-03 21:05:13,679 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.conv_module1.balancer1.max_abs, batch_count=1406093.3333333333, ans=10.0 2023-10-03 21:05:13,720 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.conv_module2.balancer1.prob, batch_count=1406093.3333333333, ans=0.125 2023-10-03 21:05:14,782 WARNING [train.py:1204] (2/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_370-331600-0_sp1.1 from training. Duration: 0.8545625 2023-10-03 21:05:14,795 WARNING [train.py:1204] (2/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_166-59042-0_sp1.1 from training. Duration: 0.97275 2023-10-03 21:05:16,836 WARNING [train.py:1204] (2/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_138-332773-0_sp1.1 from training. Duration: 0.736375 2023-10-03 21:05:17,914 WARNING [train.py:1204] (2/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_90-30673-0_sp0.9 from training. Duration: 0.9333125 2023-10-03 21:05:20,637 WARNING [train.py:1204] (2/4) Exclude cut with ID _27967_210_12140_1_1532588391859_8419130_737-41898-0_sp1.1 from training. Duration: 0.936375 2023-10-03 21:05:23,365 WARNING [train.py:1204] (2/4) Exclude cut with ID _8833_210_5219_1_1528592665210_7553059_329-125156-0_sp1.1 from training. Duration: 0.7454375 2023-10-03 21:05:23,453 WARNING [train.py:1204] (2/4) Exclude cut with ID _41817_210_18835_1_1533434477160_7423049_352-42537-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 21:05:24,764 INFO [train.py:1046] (2/4) Epoch 40, batch 3750, loss[loss=0.1727, simple_loss=0.257, pruned_loss=0.0442, over 24021.00 frames. ], tot_loss[loss=0.1583, simple_loss=0.2384, pruned_loss=0.03907, over 4721721.11 frames. ], batch size: 80, lr: 2.54e-03, grad_scale: 16.0 2023-10-03 21:05:24,851 WARNING [train.py:1204] (2/4) Exclude cut with ID _8365_210_2064_1_1528592311070_4677690_431-294102-0 from training. Duration: 0.89 2023-10-03 21:05:27,528 WARNING [train.py:1204] (2/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_454-314901-0_sp1.1 from training. Duration: 0.4181875 2023-10-03 21:05:28,972 WARNING [train.py:1204] (2/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_80-135200-0_sp0.9 from training. Duration: 0.87775 2023-10-03 21:05:29,024 WARNING [train.py:1204] (2/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_181-78632-0 from training. Duration: 0.78 2023-10-03 21:05:31,056 WARNING [train.py:1204] (2/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_300-27235-0_sp0.9 from training. Duration: 0.9666875 2023-10-03 21:05:32,482 WARNING [train.py:1204] (2/4) Exclude cut with ID _47790_210_16187_1_1533862814066_4207519_96-177176-0_sp1.1 from training. Duration: 0.97275 2023-10-03 21:05:32,588 WARNING [train.py:1204] (2/4) Exclude cut with ID _35941_210_15379_1_1532995294515_4023089_246-348742-0_sp1.1 from training. Duration: 0.97275 2023-10-03 21:05:33,946 WARNING [train.py:1204] (2/4) Exclude cut with ID _38670_210_15379_1_1533448810659_4391059_65-169397-0_sp1.1 from training. Duration: 0.8818125 2023-10-03 21:05:38,567 WARNING [train.py:1204] (2/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_1003-99642-0_sp1.1 from training. Duration: 0.963625 2023-10-03 21:05:40,818 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.1.encoder.layers.1.self_attn2.whiten, num_groups=1, num_channels=256, metric=8.88 vs. limit=22.5 2023-10-03 21:05:41,353 WARNING [train.py:1204] (2/4) Exclude cut with ID _40402_210_18219_1_1533522324115_2953150_320-14078-0_sp1.1 from training. Duration: 0.863625 2023-10-03 21:05:41,462 WARNING [train.py:1204] (2/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_367-61330-0_sp0.9 from training. Duration: 0.8333125 2023-10-03 21:05:43,382 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.4.encoder.layers.1.feed_forward2.out_whiten, num_groups=1, num_channels=384, metric=9.74 vs. limit=15.0 2023-10-03 21:05:44,175 WARNING [train.py:1204] (2/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_515-298514-0_sp1.1 from training. Duration: 0.92725 2023-10-03 21:05:46,751 WARNING [train.py:1204] (2/4) Exclude cut with ID _53731_210_7994_1_1534553979244_3757214_471-320815-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 21:05:46,806 WARNING [train.py:1204] (2/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_306-232480-0 from training. Duration: 0.96 2023-10-03 21:05:48,065 WARNING [train.py:1204] (2/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_478-162247-0_sp0.9 from training. Duration: 0.911125 2023-10-03 21:05:48,621 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.4.encoder.layers.2.conv_module2.whiten, num_groups=1, num_channels=384, metric=3.94 vs. limit=15.0 2023-10-03 21:05:49,447 WARNING [train.py:1204] (2/4) Exclude cut with ID _56423_210_5854_1_1534896061049_3165969_345-284696-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 21:05:50,786 WARNING [train.py:1204] (2/4) Exclude cut with ID _44778_210_2054_1_1533798127203_7443090_600-122253-0_sp1.1 from training. Duration: 0.963625 2023-10-03 21:05:53,570 WARNING [train.py:1204] (2/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_199-295611-0 from training. Duration: 0.55 2023-10-03 21:05:56,397 WARNING [train.py:1204] (2/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_734-129761-0 from training. Duration: 0.82 2023-10-03 21:05:59,081 WARNING [train.py:1204] (2/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_349-169257-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 21:05:59,120 WARNING [train.py:1204] (2/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_202-67017-0_sp0.9 from training. Duration: 0.911125 2023-10-03 21:06:01,062 WARNING [train.py:1204] (2/4) Exclude cut with ID _16908_210_5724_1_1531119157248_3772700_61-258978-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 21:06:06,461 WARNING [train.py:1204] (2/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_172-156931-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 21:06:07,859 WARNING [train.py:1204] (2/4) Exclude cut with ID _4092_210_2151_1_1526643044449_3135759_356-278694-0_sp1.1 from training. Duration: 0.42725 2023-10-03 21:06:11,030 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.638e+02 1.980e+02 2.164e+02 2.553e+02 4.062e+02, threshold=4.329e+02, percent-clipped=0.0 2023-10-03 21:06:12,564 WARNING [train.py:1204] (2/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_116-332008-0 from training. Duration: 0.98 2023-10-03 21:06:15,312 WARNING [train.py:1204] (2/4) Exclude cut with ID _29003_210_19596_1_1532507391720_3894098_358-114789-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 21:06:18,856 WARNING [train.py:1204] (2/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_152-212533-0_sp1.1 from training. Duration: 0.8818125 2023-10-03 21:06:18,921 WARNING [train.py:1204] (2/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_267-214116-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 21:06:21,811 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7543_210_6753_1_1528541907_1399322_165-323619-0_sp0.9 from training. Duration: 0.7666875 2023-10-03 21:06:22,057 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.balancer1.prob, batch_count=1406360.0, ans=0.125 2023-10-03 21:06:25,796 WARNING [train.py:1204] (2/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_577-332600-0_sp0.9 from training. Duration: 0.6333125 2023-10-03 21:06:27,148 WARNING [train.py:1204] (2/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_342-100157-0_sp0.9 from training. Duration: 0.6 2023-10-03 21:06:29,748 WARNING [train.py:1204] (2/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_141-89727-0_sp1.1 from training. Duration: 0.6454375 2023-10-03 21:06:31,131 WARNING [train.py:1204] (2/4) Exclude cut with ID _3992_210_1747_1_1526644816031_3731060_107-251434-0_sp1.1 from training. Duration: 0.9 2023-10-03 21:06:31,320 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.bypass_mid.scale_min, batch_count=1406426.6666666667, ans=0.2 2023-10-03 21:06:34,424 WARNING [train.py:1204] (2/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_401-53423-0_sp1.1 from training. Duration: 0.5 2023-10-03 21:06:38,472 INFO [train.py:1046] (2/4) Epoch 40, batch 3800, loss[loss=0.1577, simple_loss=0.229, pruned_loss=0.04323, over 23781.00 frames. ], tot_loss[loss=0.1581, simple_loss=0.2378, pruned_loss=0.03919, over 4726427.53 frames. ], batch size: 164, lr: 2.54e-03, grad_scale: 16.0 2023-10-03 21:06:40,593 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7762_210_1794_1_1528199508_4057408_422-227034-0_sp1.1 from training. Duration: 0.663625 2023-10-03 21:06:44,680 WARNING [train.py:1204] (2/4) Exclude cut with ID _4184_210_2550_1_1526730192013_2219380_102-250299-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 21:06:46,098 WARNING [train.py:1204] (2/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_253-308377-0_sp0.9 from training. Duration: 0.5444375 2023-10-03 21:06:47,883 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_271-297505-0 from training. Duration: 0.99 2023-10-03 21:06:49,768 WARNING [train.py:1204] (2/4) Exclude cut with ID _35941_210_15379_1_1532995294515_4023089_213-142973-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 21:06:52,496 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7544_210_6753_1_1528587722_3576686_162-63018-0_sp1.1 from training. Duration: 0.92725 2023-10-03 21:06:52,578 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_365-217119-0_sp0.9 from training. Duration: 0.7 2023-10-03 21:06:55,310 WARNING [train.py:1204] (2/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_662-275621-0_sp0.9 from training. Duration: 0.4666875 2023-10-03 21:06:55,312 WARNING [train.py:1204] (2/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_374-187555-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 21:06:56,703 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_289-180904-0_sp1.1 from training. Duration: 0.6909375 2023-10-03 21:06:58,057 WARNING [train.py:1204] (2/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_189-47206-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 21:06:59,335 WARNING [train.py:1204] (2/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_234-14174-0_sp1.1 from training. Duration: 0.7454375 2023-10-03 21:06:59,369 WARNING [train.py:1204] (2/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_576-76621-0_sp1.1 from training. Duration: 0.963625 2023-10-03 21:06:59,473 WARNING [train.py:1204] (2/4) Exclude cut with ID _13253_210_1925_1_1530757775819_3577873_253-246386-0 from training. Duration: 0.9 2023-10-03 21:07:04,004 WARNING [train.py:1204] (2/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_372-6869-0_sp1.1 from training. Duration: 0.4 2023-10-03 21:07:05,324 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_21434_210_4238_1_1531916339_1222252_22-54954-0_sp1.1 from training. Duration: 0.936375 2023-10-03 21:07:06,720 WARNING [train.py:1204] (2/4) Exclude cut with ID _29853_210_7497_1_1532505600851_6642110_492-170361-0_sp1.1 from training. Duration: 0.92725 2023-10-03 21:07:09,499 WARNING [train.py:1204] (2/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_505-97987-0_sp0.9 from training. Duration: 0.9555625 2023-10-03 21:07:09,532 WARNING [train.py:1204] (2/4) Exclude cut with ID _8365_210_2064_1_1528592311070_4677690_371-122295-0_sp0.9 from training. Duration: 0.611125 2023-10-03 21:07:10,940 WARNING [train.py:1204] (2/4) Exclude cut with ID _14268_210_9206_1_1530622771285_4714099_637-86630-0_sp1.1 from training. Duration: 0.463625 2023-10-03 21:07:10,952 WARNING [train.py:1204] (2/4) Exclude cut with ID _29857_210_20034_1_1532395886753_3798160_372-5678-0_sp1.1 from training. Duration: 0.963625 2023-10-03 21:07:12,347 WARNING [train.py:1204] (2/4) Exclude cut with ID _52892_210_6721_1_1534378941259_4124129_90-165108-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 21:07:14,115 WARNING [train.py:1204] (2/4) Exclude cut with ID _27052_210_5986_1_1532329197187_3585009_143-339198-0_sp1.1 from training. Duration: 0.963625 2023-10-03 21:07:17,262 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.self_attn_weights.pos_emb_skip_rate, batch_count=1406626.6666666667, ans=0.0 2023-10-03 21:07:18,504 WARNING [train.py:1204] (2/4) Exclude cut with ID _14268_210_9206_1_1530622771285_4714099_637-86630-0_sp0.9 from training. Duration: 0.5666875 2023-10-03 21:07:18,506 WARNING [train.py:1204] (2/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_401-255491-0 from training. Duration: 0.7 2023-10-03 21:07:20,513 WARNING [train.py:1204] (2/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_178-6946-0_sp1.1 from training. Duration: 0.97275 2023-10-03 21:07:27,406 WARNING [train.py:1204] (2/4) Exclude cut with ID _27485_210_4916_1_1532739191677_3668269_460-163885-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 21:07:32,064 WARNING [train.py:1204] (2/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_270-26053-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 21:07:33,459 WARNING [train.py:1204] (2/4) Exclude cut with ID _14170_210_5219_1_1530525949868_4469650_412-317077-0 from training. Duration: 0.75 2023-10-03 21:07:34,906 WARNING [train.py:1204] (2/4) Exclude cut with ID _5289_210_2084_1_1527678202736_3779299_142-103317-0 from training. Duration: 0.84 2023-10-03 21:07:34,952 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_150-143438-0_sp1.1 from training. Duration: 0.92725 2023-10-03 21:07:36,186 WARNING [train.py:1204] (2/4) Exclude cut with ID _27894_210_12392_1_1532415496078_6902439_668-269779-0_sp1.1 from training. Duration: 0.97275 2023-10-03 21:07:37,731 WARNING [train.py:1204] (2/4) Exclude cut with ID _18513_210_9206_1_1531618215223_3862620_56-222075-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 21:07:40,366 WARNING [train.py:1204] (2/4) Exclude cut with ID _4092_210_2151_1_1526643044449_3135759_356-278694-0 from training. Duration: 0.47 2023-10-03 21:07:43,358 WARNING [train.py:1204] (2/4) Exclude cut with ID _25808_210_13292_1_1532343514562_7366109_633-47524-0 from training. Duration: 0.99 2023-10-03 21:07:45,210 WARNING [train.py:1204] (2/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_872-264525-0 from training. Duration: 0.65 2023-10-03 21:07:45,239 WARNING [train.py:1204] (2/4) Exclude cut with ID _14632_210_9988_1_1530691279392_3923200_239-232545-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 21:07:45,333 WARNING [train.py:1204] (2/4) Exclude cut with ID _14349_210_8341_1_1530615710818_3442470_153-27432-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 21:07:51,266 WARNING [train.py:1204] (2/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_37-41919-0_sp0.9 from training. Duration: 0.9666875 2023-10-03 21:07:51,336 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_196-289637-0_sp0.9 from training. Duration: 0.8333125 2023-10-03 21:07:52,907 INFO [train.py:1046] (2/4) Epoch 40, batch 3850, loss[loss=0.149, simple_loss=0.2157, pruned_loss=0.04119, over 23570.00 frames. ], tot_loss[loss=0.1567, simple_loss=0.2365, pruned_loss=0.03849, over 4716582.76 frames. ], batch size: 256, lr: 2.54e-03, grad_scale: 16.0 2023-10-03 21:07:57,220 WARNING [train.py:1204] (2/4) Exclude cut with ID _13678_210_7497_1_1530530069226_3463170_371-3221-0_sp1.1 from training. Duration: 0.8181875 2023-10-03 21:07:58,619 WARNING [train.py:1204] (2/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_557-324258-0 from training. Duration: 0.81 2023-10-03 21:07:58,703 WARNING [train.py:1204] (2/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_1012-279473-0_sp1.1 from training. Duration: 0.6090625 2023-10-03 21:08:00,090 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_66916_210_14656_1_1535725525_326566_7-92649-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 21:08:02,698 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.2.encoder.layers.0.whiten, num_groups=1, num_channels=384, metric=5.53 vs. limit=12.0 2023-10-03 21:08:03,305 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_9460_210_4211_1_1528954587_10049667_480-260269-0_sp1.1 from training. Duration: 0.5090625 2023-10-03 21:08:05,992 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_40896_210_15710_1_1533433956_7100802_121-155001-0_sp1.1 from training. Duration: 0.92725 2023-10-03 21:08:07,426 WARNING [train.py:1204] (2/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_188-171673-0_sp1.1 from training. Duration: 0.52725 2023-10-03 21:08:08,827 WARNING [train.py:1204] (2/4) Exclude cut with ID _47790_210_16187_1_1533862814066_4207519_2-39653-0 from training. Duration: 0.98 2023-10-03 21:08:16,176 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_23850_210_4238_1_1532232597_1632433_112-229012-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 21:08:17,708 WARNING [train.py:1204] (2/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_626-79528-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 21:08:19,694 WARNING [train.py:1204] (2/4) Exclude cut with ID _30304_210_6319_1_1532476827261_7582060_856-90052-0_sp1.1 from training. Duration: 0.963625 2023-10-03 21:08:19,740 WARNING [train.py:1204] (2/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_122-109023-0_sp0.9 from training. Duration: 0.8444375 2023-10-03 21:08:24,436 WARNING [train.py:1204] (2/4) Exclude cut with ID _64414_210_17301_1_1535243252048_3757759_37-70328-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 21:08:24,497 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_20120_210_6753_1_1531643035_5482859_622-344907-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 21:08:25,820 WARNING [train.py:1204] (2/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_410-321956-0_sp1.1 from training. Duration: 0.92725 2023-10-03 21:08:25,832 WARNING [train.py:1204] (2/4) Exclude cut with ID _7562_210_2084_1_1528887767916_3946229_186-326422-0_sp0.9 from training. Duration: 0.9333125 2023-10-03 21:08:27,255 WARNING [train.py:1204] (2/4) Exclude cut with ID _37537_210_2253_1_1533171758568_3421949_127-285192-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 21:08:28,694 WARNING [train.py:1204] (2/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_723-228168-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 21:08:30,012 WARNING [train.py:1204] (2/4) Exclude cut with ID _13678_210_7497_1_1530530069226_3463170_426-246984-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 21:08:30,027 WARNING [train.py:1204] (2/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_399-156791-0_sp0.9 from training. Duration: 0.92225 2023-10-03 21:08:31,374 WARNING [train.py:1204] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_391-22148-0 from training. Duration: 0.77 2023-10-03 21:08:31,402 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_3475_210_2028_1_1526193578_3622569_64-297072-0 from training. Duration: 0.91 2023-10-03 21:08:31,475 WARNING [train.py:1204] (2/4) Exclude cut with ID _17875_210_2087_1_1531224012907_1821339_225-280015-0_sp1.1 from training. Duration: 0.963625 2023-10-03 21:08:32,770 WARNING [train.py:1204] (2/4) Exclude cut with ID _50655_210_18628_1_1534229942193_3772154_325-246169-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 21:08:34,920 WARNING [train.py:1204] (2/4) Exclude cut with ID _29529_210_7234_1_1532393961455_3977460_597-257348-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 21:08:36,251 WARNING [train.py:1204] (2/4) Exclude cut with ID _58895_210_8750_1_1535184081320_3649089_77-173485-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 21:08:36,276 WARNING [train.py:1204] (2/4) Exclude cut with ID _6270_210_2084_1_1527850911573_3856420_26-277343-0 from training. Duration: 0.82 2023-10-03 21:08:39,124 WARNING [train.py:1204] (2/4) Exclude cut with ID _14368_210_4220_1_1530532714870_3595529_144-3287-0 from training. Duration: 0.67 2023-10-03 21:08:40,421 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.710e+02 1.946e+02 2.190e+02 2.397e+02 4.110e+02, threshold=4.380e+02, percent-clipped=0.0 2023-10-03 21:08:40,498 WARNING [train.py:1204] (2/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_293-145278-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 21:08:41,871 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_40047_210_2290_1_1533290519_2374311_257-335162-0 from training. Duration: 0.95 2023-10-03 21:08:43,321 WARNING [train.py:1204] (2/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_619-344499-0_sp0.9 from training. Duration: 0.7 2023-10-03 21:08:49,937 WARNING [train.py:1204] (2/4) Exclude cut with ID _19504_210_9994_1_1531648863242_3325220_456-315379-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 21:08:50,016 WARNING [train.py:1204] (2/4) Exclude cut with ID _55857_210_2346_1_1534644007831_6898209_631-221692-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 21:08:54,646 WARNING [train.py:1204] (2/4) Exclude cut with ID _64631_210_27767_1_1535349580068_4165160_324-3682-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 21:08:54,667 WARNING [train.py:1204] (2/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_317-211107-0 from training. Duration: 0.92 2023-10-03 21:08:55,211 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.4.encoder.layers.0.conv_module2.whiten, num_groups=1, num_channels=384, metric=9.36 vs. limit=15.0 2023-10-03 21:08:57,348 WARNING [train.py:1204] (2/4) Exclude cut with ID _6789_210_1534_1_1527857937218_3118950_311-61262-0 from training. Duration: 0.65 2023-10-03 21:09:00,222 WARNING [train.py:1204] (2/4) Exclude cut with ID _28323_210_16187_1_1532480395667_4100300_365-68882-0_sp1.1 from training. Duration: 0.92725 2023-10-03 21:09:01,494 WARNING [train.py:1204] (2/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_816-181825-0_sp1.1 from training. Duration: 0.92725 2023-10-03 21:09:04,268 WARNING [train.py:1204] (2/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_132-269633-0_sp1.1 from training. Duration: 0.6181875 2023-10-03 21:09:04,285 WARNING [train.py:1204] (2/4) Exclude cut with ID _44585_210_18628_1_1533605350387_4518199_280-73534-0_sp1.1 from training. Duration: 0.8090625 2023-10-03 21:09:05,635 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_68095_210_20185_1_1535718226_8888855_1009-164755-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 21:09:05,699 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_40223_210_6228_1_1533298404_4812267_400-103813-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 21:09:05,699 WARNING [train.py:1204] (2/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_491-261923-0_sp1.1 from training. Duration: 0.8909375 2023-10-03 21:09:05,711 WARNING [train.py:1204] (2/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_712-342563-0 from training. Duration: 0.97 2023-10-03 21:09:05,776 WARNING [train.py:1204] (2/4) Exclude cut with ID _41161_210_20034_1_1533431249657_3555314_175-190032-0_sp1.1 from training. Duration: 0.963625 2023-10-03 21:09:07,622 INFO [train.py:1046] (2/4) Epoch 40, batch 3900, loss[loss=0.1682, simple_loss=0.2323, pruned_loss=0.05204, over 22816.00 frames. ], tot_loss[loss=0.1558, simple_loss=0.2354, pruned_loss=0.03815, over 4720246.60 frames. ], batch size: 322, lr: 2.54e-03, grad_scale: 8.0 2023-10-03 21:09:09,085 WARNING [train.py:1204] (2/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_788-298907-0 from training. Duration: 0.89 2023-10-03 21:09:09,109 WARNING [train.py:1204] (2/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_225-70961-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 21:09:09,122 WARNING [train.py:1204] (2/4) Exclude cut with ID _58583_210_5236_1_1534726835266_3574249_235-175994-0_sp1.1 from training. Duration: 0.92725 2023-10-03 21:09:11,725 WARNING [train.py:1204] (2/4) Exclude cut with ID _12302_210_5219_1_1529924446466_8107280_907-182949-0_sp1.1 from training. Duration: 0.836375 2023-10-03 21:09:11,770 WARNING [train.py:1204] (2/4) Exclude cut with ID _54871_210_22356_1_1534665798253_3856359_94-193692-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 21:09:13,100 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_4528_210_3273_1_1527076654_3276777_342-168068-0_sp0.9 from training. Duration: 0.9666875 2023-10-03 21:09:14,444 WARNING [train.py:1204] (2/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_368-74239-0_sp1.1 from training. Duration: 0.92725 2023-10-03 21:09:14,445 WARNING [train.py:1204] (2/4) Exclude cut with ID _44831_210_13427_1_1533816011549_3689035_291-295928-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 21:09:14,510 WARNING [train.py:1204] (2/4) Exclude cut with ID _56406_210_9288_1_1534899618664_3496149_455-131997-0_sp1.1 from training. Duration: 0.97275 2023-10-03 21:09:14,517 WARNING [train.py:1204] (2/4) Exclude cut with ID _6473_210_1741_1_1527901307396_3815380_108-17335-0 from training. Duration: 0.65 2023-10-03 21:09:14,563 WARNING [train.py:1204] (2/4) Exclude cut with ID _14224_210_4381_1_1530529062037_4264010_286-135600-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 21:09:15,042 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.4.encoder.layers.2.conv_module2.whiten, num_groups=1, num_channels=384, metric=9.98 vs. limit=15.0 2023-10-03 21:09:19,279 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_14056_210_4163_1_1530604483_7572827_928-321675-0_sp1.1 from training. Duration: 0.936375 2023-10-03 21:09:21,080 WARNING [train.py:1204] (2/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_81-300212-0_sp1.1 from training. Duration: 0.5454375 2023-10-03 21:09:21,110 WARNING [train.py:1204] (2/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_29-280622-0_sp0.9 from training. Duration: 0.911125 2023-10-03 21:09:22,454 WARNING [train.py:1204] (2/4) Exclude cut with ID _61079_210_4525_1_1534921008198_4011419_228-176447-0_sp1.1 from training. Duration: 0.936375 2023-10-03 21:09:23,919 WARNING [train.py:1204] (2/4) Exclude cut with ID _8394_210_6702_1_1528590504316_3540189_372-198878-0_sp1.1 from training. Duration: 0.5454375 2023-10-03 21:09:23,939 WARNING [train.py:1204] (2/4) Exclude cut with ID _64521_210_14699_1_1535243427259_3815328_244-208725-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 21:09:25,964 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8275_210_4198_1_1528455320_3892571_491-17707-0_sp0.9 from training. Duration: 0.711125 2023-10-03 21:09:27,349 WARNING [train.py:1204] (2/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_375-245677-0 from training. Duration: 0.81 2023-10-03 21:09:27,356 WARNING [train.py:1204] (2/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_690-286055-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 21:09:29,996 WARNING [train.py:1204] (2/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_89-321955-0 from training. Duration: 0.8 2023-10-03 21:09:30,040 WARNING [train.py:1204] (2/4) Exclude cut with ID _18895_210_6228_1_1531357274234_3784930_213-98721-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 21:09:31,362 WARNING [train.py:1204] (2/4) Exclude cut with ID _19708_210_9206_1_1532136650733_4238364_347-65240-0 from training. Duration: 0.97 2023-10-03 21:09:31,489 WARNING [train.py:1204] (2/4) Exclude cut with ID _64135_210_2253_1_1535677267876_7608972_498-5039-0 from training. Duration: 0.78 2023-10-03 21:09:35,586 WARNING [train.py:1204] (2/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_146-276944-0_sp1.1 from training. Duration: 0.7545625 2023-10-03 21:09:37,520 WARNING [train.py:1204] (2/4) Exclude cut with ID _63171_210_15488_1_1535104792794_3295110_155-34488-0_sp1.1 from training. Duration: 0.97275 2023-10-03 21:09:38,817 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_5438_210_1794_1_1527336687_2600009_233-58072-0_sp0.9 from training. Duration: 0.8555625 2023-10-03 21:09:38,867 WARNING [train.py:1204] (2/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_63-101996-0_sp0.9 from training. Duration: 0.97775 2023-10-03 21:09:41,824 WARNING [train.py:1204] (2/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_271-269084-0_sp1.1 from training. Duration: 0.7545625 2023-10-03 21:09:43,249 WARNING [train.py:1204] (2/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_345-167986-0_sp1.1 from training. Duration: 0.763625 2023-10-03 21:09:45,960 WARNING [train.py:1204] (2/4) Exclude cut with ID _39290_210_4220_1_1533862778760_3075118_92-311600-0_sp1.1 from training. Duration: 0.8 2023-10-03 21:09:45,967 WARNING [train.py:1204] (2/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_209-65306-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 21:09:47,313 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_26433_210_12108_1_1532157158_4056480_317-335807-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 21:09:53,915 WARNING [train.py:1204] (2/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_238-94127-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 21:09:53,957 WARNING [train.py:1204] (2/4) Exclude cut with ID _41314_210_12651_1_1533459476543_3766460_207-132038-0_sp1.1 from training. Duration: 0.8545625 2023-10-03 21:09:57,433 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=1407360.0, ans=0.1 2023-10-03 21:09:59,872 WARNING [train.py:1204] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_167-137255-0_sp1.1 from training. Duration: 0.5818125 2023-10-03 21:10:01,201 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_2009_210_2071_1_1525481134_4588937_385-302100-0_sp0.9 from training. Duration: 0.9666875 2023-10-03 21:10:04,152 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.feed_forward2.hidden_balancer.prob, batch_count=1407360.0, ans=0.125 2023-10-03 21:10:11,671 INFO [scaling.py:1118] (2/4) WithLoss: name=encoder.encoders.3.encoder.layers.1.self_attn_weights, loss-sum=0.000e+00 2023-10-03 21:10:12,686 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_15517_210_6753_1_1530954008_3487336_253-110617-0_sp1.1 from training. Duration: 0.936375 2023-10-03 21:10:14,223 WARNING [train.py:1204] (2/4) Exclude cut with ID _39290_210_4220_1_1533862778760_3075118_92-311600-0_sp0.9 from training. Duration: 0.97775 2023-10-03 21:10:14,264 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_42-142473-0 from training. Duration: 0.72 2023-10-03 21:10:14,472 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.conv_module1.balancer1.prob, batch_count=1407426.6666666667, ans=0.125 2023-10-03 21:10:15,539 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_2296_210_2087_1_1525503448_3397335_57-185430-0 from training. Duration: 0.6 2023-10-03 21:10:15,566 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_11305_210_1794_1_1529750428_7178148_257-312870-0_sp0.9 from training. Duration: 0.97775 2023-10-03 21:10:17,009 WARNING [train.py:1204] (2/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_222-198127-0 from training. Duration: 0.84 2023-10-03 21:10:18,370 WARNING [train.py:1204] (2/4) Exclude cut with ID _65263_210_22356_1_1535763598691_3921037_423-202699-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 21:10:19,680 WARNING [train.py:1204] (2/4) Exclude cut with ID _15037_210_2253_1_1530752294104_3267650_13-130361-0 from training. Duration: 0.99 2023-10-03 21:10:21,440 INFO [train.py:1046] (2/4) Epoch 40, batch 3950, loss[loss=0.1541, simple_loss=0.2431, pruned_loss=0.03254, over 24568.00 frames. ], tot_loss[loss=0.1554, simple_loss=0.2353, pruned_loss=0.03779, over 4721820.23 frames. ], batch size: 71, lr: 2.53e-03, grad_scale: 8.0 2023-10-03 21:10:21,768 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.attention_skip_rate, batch_count=1407493.3333333333, ans=0.0 2023-10-03 21:10:26,389 WARNING [train.py:1204] (2/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_628-198080-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 21:10:26,469 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_3171_210_2087_1_1526109084_2330007_37-64334-0 from training. Duration: 0.82 2023-10-03 21:10:27,766 WARNING [train.py:1204] (2/4) Exclude cut with ID _5287_210_2084_1_1527418883040_3878670_12-221186-0_sp1.1 from training. Duration: 0.8909375 2023-10-03 21:10:29,610 WARNING [train.py:1204] (2/4) Exclude cut with ID _6653_210_2629_1_1527919172441_6732679_1103-56445-0_sp1.1 from training. Duration: 0.663625 2023-10-03 21:10:32,252 WARNING [train.py:1204] (2/4) Exclude cut with ID _65263_210_22356_1_1535763598691_3921037_169-140068-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 21:10:36,487 WARNING [train.py:1204] (2/4) Exclude cut with ID IC0671W0118-331961 from training. Duration: 0.896 2023-10-03 21:10:37,958 WARNING [train.py:1204] (2/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_402-102311-0_sp1.1 from training. Duration: 0.6090625 2023-10-03 21:10:37,977 WARNING [train.py:1204] (2/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_202-293523-0 from training. Duration: 0.72 2023-10-03 21:10:38,038 WARNING [train.py:1204] (2/4) Exclude cut with ID IC0679W0038-335846 from training. Duration: 0.768 2023-10-03 21:10:39,979 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_37357_210_17024_1_1533089504_2316121_79-198866-0_sp1.1 from training. Duration: 0.936375 2023-10-03 21:10:41,412 WARNING [train.py:1204] (2/4) Exclude cut with ID _7606_210_4915_1_1528894253277_5080690_292-271285-0_sp1.1 from training. Duration: 0.963625 2023-10-03 21:10:42,710 WARNING [train.py:1204] (2/4) Exclude cut with ID _3541_210_2477_1_1526475408709_3263973_377-296839-0_sp0.9 from training. Duration: 0.611125 2023-10-03 21:10:42,713 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_45886_210_17024_1_1533704917_2435333_58-131069-0_sp1.1 from training. Duration: 0.963625 2023-10-03 21:10:45,539 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6961_210_2087_1_1527937168_6815011_395-185060-0 from training. Duration: 0.95 2023-10-03 21:10:48,776 WARNING [train.py:1204] (2/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_304-266479-0_sp1.1 from training. Duration: 0.97275 2023-10-03 21:10:48,843 WARNING [train.py:1204] (2/4) Exclude cut with ID _3867_210_1883_1_1526641200575_4475830_319-349850-0_sp1.1 from training. Duration: 0.6090625 2023-10-03 21:10:48,851 WARNING [train.py:1204] (2/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_304-59227-0_sp0.9 from training. Duration: 0.8555625 2023-10-03 21:10:49,511 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.3.encoder.layers.1.feed_forward3.out_whiten, num_groups=1, num_channels=512, metric=9.50 vs. limit=15.0 2023-10-03 21:10:50,152 WARNING [train.py:1204] (2/4) Exclude cut with ID _8021_210_7994_1_1528376451358_3653069_100-140231-0_sp0.9 from training. Duration: 0.7444375 2023-10-03 21:10:51,510 WARNING [train.py:1204] (2/4) Exclude cut with ID _8833_210_5219_1_1528592665210_7553059_329-125156-0_sp0.9 from training. Duration: 0.911125 2023-10-03 21:10:53,157 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.0.conv_module1.balancer2.prob, batch_count=1407626.6666666667, ans=0.125 2023-10-03 21:10:57,951 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.conv_module2.balancer1.prob, batch_count=1407626.6666666667, ans=0.125 2023-10-03 21:11:03,847 WARNING [train.py:1204] (2/4) Exclude cut with ID _14459_210_2228_1_1531443641946_7516700_792-287234-0_sp1.1 from training. Duration: 0.92725 2023-10-03 21:11:03,873 WARNING [train.py:1204] (2/4) Exclude cut with ID _19708_210_9206_1_1532136650733_4238364_347-65240-0_sp1.1 from training. Duration: 0.8818125 2023-10-03 21:11:09,377 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.584e+02 1.960e+02 2.144e+02 2.428e+02 3.730e+02, threshold=4.288e+02, percent-clipped=0.0 2023-10-03 21:11:09,454 WARNING [train.py:1204] (2/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_70-27732-0 from training. Duration: 0.94 2023-10-03 21:11:14,218 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_397-132565-0 from training. Duration: 0.99 2023-10-03 21:11:14,222 WARNING [train.py:1204] (2/4) Exclude cut with ID _49728_210_12651_1_1534041034294_6788150_624-195301-0 from training. Duration: 0.85 2023-10-03 21:11:14,257 WARNING [train.py:1204] (2/4) Exclude cut with ID _55429_210_10626_1_1534467588527_7174143_377-169391-0_sp1.1 from training. Duration: 0.87275 2023-10-03 21:11:15,622 WARNING [train.py:1204] (2/4) Exclude cut with ID _49376_210_5986_1_1533985278732_3370740_354-344695-0_sp1.1 from training. Duration: 0.9 2023-10-03 21:11:17,200 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.ff2_skip_rate, batch_count=1407693.3333333333, ans=0.0 2023-10-03 21:11:23,056 WARNING [train.py:1204] (2/4) Exclude cut with ID _4697_210_2069_1_1526815801953_3823098_276-96593-0_sp1.1 from training. Duration: 0.7 2023-10-03 21:11:23,067 WARNING [train.py:1204] (2/4) Exclude cut with ID _15012_210_1968_1_1530856820429_5095350_688-178849-0_sp0.9 from training. Duration: 0.711125 2023-10-03 21:11:24,390 WARNING [train.py:1204] (2/4) Exclude cut with ID _25565_210_12392_1_1532423009382_3952098_372-69023-0_sp1.1 from training. Duration: 0.936375 2023-10-03 21:11:24,419 WARNING [train.py:1204] (2/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_61-327062-0_sp1.1 from training. Duration: 0.736375 2023-10-03 21:11:24,457 WARNING [train.py:1204] (2/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_215-141059-0 from training. Duration: 0.62 2023-10-03 21:11:26,087 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.balancer2.prob, batch_count=1407760.0, ans=0.125 2023-10-03 21:11:29,108 WARNING [train.py:1204] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_6-226750-0_sp1.1 from training. Duration: 0.8545625 2023-10-03 21:11:30,464 WARNING [train.py:1204] (2/4) Exclude cut with ID _14348_210_8341_1_1530599434996_3778640_240-232370-0_sp1.1 from training. Duration: 0.763625 2023-10-03 21:11:35,155 WARNING [train.py:1204] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_468-189058-0 from training. Duration: 0.96 2023-10-03 21:11:36,499 INFO [train.py:1046] (2/4) Epoch 40, batch 4000, loss[loss=0.161, simple_loss=0.25, pruned_loss=0.03595, over 24564.00 frames. ], tot_loss[loss=0.1557, simple_loss=0.2358, pruned_loss=0.03777, over 4732118.77 frames. ], batch size: 71, lr: 2.53e-03, grad_scale: 16.0 2023-10-03 21:11:42,706 WARNING [train.py:1204] (2/4) Exclude cut with ID _21987_210_5854_1_1531821500482_3637038_247-93340-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 21:11:43,193 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.5.encoder.layers.0.feed_forward2.out_whiten, num_groups=1, num_channels=256, metric=8.46 vs. limit=15.0 2023-10-03 21:11:48,358 WARNING [train.py:1204] (2/4) Exclude cut with ID _66868_210_9527_1_1535509287954_4561140_436-191602-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 21:11:48,619 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.feed_forward2.hidden_balancer.prob, batch_count=1407826.6666666667, ans=0.125 2023-10-03 21:11:51,754 WARNING [train.py:1204] (2/4) Exclude cut with ID _18895_210_6228_1_1531357274234_3784930_394-58203-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 21:11:51,801 WARNING [train.py:1204] (2/4) Exclude cut with ID _58605_210_12210_1_1534723195939_3904259_258-281498-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 21:11:53,091 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_33348_210_5632_1_1533086912_3324030_245-292752-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 21:11:53,110 WARNING [train.py:1204] (2/4) Exclude cut with ID _67831_210_12392_1_1535612464202_3092612_346-2148-0 from training. Duration: 0.93 2023-10-03 21:11:54,474 WARNING [train.py:1204] (2/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_369-222060-0_sp0.9 from training. Duration: 0.788875 2023-10-03 21:11:54,545 WARNING [train.py:1204] (2/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_245-174553-0 from training. Duration: 0.88 2023-10-03 21:11:54,557 WARNING [train.py:1204] (2/4) Exclude cut with ID _3541_210_2477_1_1526475408709_3263973_231-104736-0_sp1.1 from training. Duration: 0.7090625 2023-10-03 21:11:54,563 WARNING [train.py:1204] (2/4) Exclude cut with ID _44585_210_18628_1_1533605350387_4518199_280-73534-0 from training. Duration: 0.89 2023-10-03 21:11:57,278 WARNING [train.py:1204] (2/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_22-201476-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 21:12:02,366 WARNING [train.py:1204] (2/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_325-62802-0_sp0.9 from training. Duration: 0.9666875 2023-10-03 21:12:02,379 WARNING [train.py:1204] (2/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_199-274062-0_sp1.1 from training. Duration: 0.97275 2023-10-03 21:12:02,382 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_8-319957-0_sp1.1 from training. Duration: 0.87275 2023-10-03 21:12:02,405 WARNING [train.py:1204] (2/4) Exclude cut with ID _29343_210_4381_1_1532602757752_3674109_224-349726-0_sp1.1 from training. Duration: 0.936375 2023-10-03 21:12:02,410 WARNING [train.py:1204] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_520-126729-0_sp1.1 from training. Duration: 0.636375 2023-10-03 21:12:03,804 WARNING [train.py:1204] (2/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_239-22076-0_sp1.1 from training. Duration: 0.77275 2023-10-03 21:12:06,369 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9012W0011-501146 from training. Duration: 0.896 2023-10-03 21:12:07,713 WARNING [train.py:1204] (2/4) Exclude cut with ID _29857_210_20034_1_1532395886753_3798160_279-185801-0_sp1.1 from training. Duration: 0.82725 2023-10-03 21:12:07,764 WARNING [train.py:1204] (2/4) Exclude cut with ID _43552_210_8913_1_1533726031059_3523429_261-223933-0_sp1.1 from training. Duration: 0.92725 2023-10-03 21:12:09,240 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9029W0025-509426 from training. Duration: 0.896 2023-10-03 21:12:11,748 WARNING [train.py:1204] (2/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_337-181502-0_sp1.1 from training. Duration: 0.5909375 2023-10-03 21:12:11,776 WARNING [train.py:1204] (2/4) Exclude cut with ID _68635_210_9317_1_1535770813410_4315870_159-332599-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 21:12:17,948 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_161-276996-0 from training. Duration: 0.95 2023-10-03 21:12:19,292 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7088_210_1874_1_1527939776_1101113_127-68870-0_sp1.1 from training. Duration: 0.936375 2023-10-03 21:12:20,763 WARNING [train.py:1204] (2/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_322-217220-0_sp1.1 from training. Duration: 0.7818125 2023-10-03 21:12:21,994 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9072W0002-530636 from training. Duration: 0.768 2023-10-03 21:12:23,890 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_340-336408-0_sp1.1 from training. Duration: 0.8454375 2023-10-03 21:12:25,197 WARNING [train.py:1204] (2/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_395-37222-0 from training. Duration: 0.76 2023-10-03 21:12:25,200 WARNING [train.py:1204] (2/4) Exclude cut with ID _64099_210_17301_1_1535526977656_1578188_159-209156-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 21:12:25,283 WARNING [train.py:1204] (2/4) Exclude cut with ID _53950_210_5816_1_1534417264919_3862951_28-29681-0_sp1.1 from training. Duration: 0.92725 2023-10-03 21:12:25,475 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.feed_forward3.hidden_balancer.prob, batch_count=1408026.6666666667, ans=0.125 2023-10-03 21:12:25,541 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.bypass.scale_min, batch_count=1408026.6666666667, ans=0.2 2023-10-03 21:12:26,654 WARNING [train.py:1204] (2/4) Exclude cut with ID _4681_210_3525_1_1527250689464_120889_15-126760-0_sp1.1 from training. Duration: 0.736375 2023-10-03 21:12:28,054 WARNING [train.py:1204] (2/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_242-3472-0_sp1.1 from training. Duration: 0.663625 2023-10-03 21:12:28,109 WARNING [train.py:1204] (2/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_436-186311-0_sp0.9 from training. Duration: 0.72225 2023-10-03 21:12:28,124 WARNING [train.py:1204] (2/4) Exclude cut with ID _3675_210_4557_1_1526639934408_4182089_452-172147-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 21:12:29,500 WARNING [train.py:1204] (2/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_80-147301-0 from training. Duration: 0.84 2023-10-03 21:12:29,531 WARNING [train.py:1204] (2/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_351-107588-0_sp1.1 from training. Duration: 0.92725 2023-10-03 21:12:31,607 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9111W0002-550781 from training. Duration: 0.896 2023-10-03 21:12:37,419 WARNING [train.py:1204] (2/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_465-156500-0_sp1.1 from training. Duration: 0.6545625 2023-10-03 21:12:40,260 WARNING [train.py:1204] (2/4) Exclude cut with ID _13938_210_9988_1_1530518718978_3693960_304-112784-0_sp1.1 from training. Duration: 0.4181875 2023-10-03 21:12:44,930 WARNING [train.py:1204] (2/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_202-67017-0_sp1.1 from training. Duration: 0.7454375 2023-10-03 21:12:44,965 WARNING [train.py:1204] (2/4) Exclude cut with ID _58414_210_8254_1_1535421609162_7151182_448-4358-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 21:12:46,300 WARNING [train.py:1204] (2/4) Exclude cut with ID _66840_210_5219_1_1535452168426_4196349_203-243234-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 21:12:47,695 WARNING [train.py:1204] (2/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_201-213513-0_sp1.1 from training. Duration: 0.97275 2023-10-03 21:12:50,420 INFO [train.py:1046] (2/4) Epoch 40, batch 4050, loss[loss=0.1605, simple_loss=0.2426, pruned_loss=0.03918, over 23213.00 frames. ], tot_loss[loss=0.1564, simple_loss=0.2369, pruned_loss=0.03797, over 4738784.65 frames. ], batch size: 105, lr: 2.53e-03, grad_scale: 16.0 2023-10-03 21:12:50,535 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6291_210_3273_1_1527683649_1078498_117-187560-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 21:12:53,379 WARNING [train.py:1204] (2/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_135-174080-0_sp0.9 from training. Duration: 0.588875 2023-10-03 21:12:54,676 WARNING [train.py:1204] (2/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_45-265684-0 from training. Duration: 0.72 2023-10-03 21:12:54,840 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.nonlin_attention.balancer.prob, batch_count=1408160.0, ans=0.125 2023-10-03 21:12:57,725 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_435-313305-0_sp1.1 from training. Duration: 0.6454375 2023-10-03 21:12:57,750 WARNING [train.py:1204] (2/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_235-307337-0_sp1.1 from training. Duration: 0.8545625 2023-10-03 21:12:59,084 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7762_210_1794_1_1528199508_4057408_419-125009-0_sp0.9 from training. Duration: 0.92225 2023-10-03 21:13:00,599 WARNING [train.py:1204] (2/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_215-141059-0_sp1.1 from training. Duration: 0.563625 2023-10-03 21:13:00,695 WARNING [train.py:1204] (2/4) Exclude cut with ID _27013_210_1389_1_1532437173240_3252649_222-125573-0_sp1.1 from training. Duration: 0.8909375 2023-10-03 21:13:05,384 WARNING [train.py:1204] (2/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_126-274694-0_sp1.1 from training. Duration: 0.8909375 2023-10-03 21:13:06,847 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_2592_210_2290_1_1525697025_4882925_157-70512-0_sp1.1 from training. Duration: 0.82725 2023-10-03 21:13:08,475 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_292-265928-0_sp0.9 from training. Duration: 0.5666875 2023-10-03 21:13:09,840 WARNING [train.py:1204] (2/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_1211-226066-0_sp1.1 from training. Duration: 0.6909375 2023-10-03 21:13:09,899 WARNING [train.py:1204] (2/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_394-265747-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 21:13:13,005 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.conv_module1.balancer1.prob, batch_count=1408226.6666666667, ans=0.125 2023-10-03 21:13:14,057 WARNING [train.py:1204] (2/4) Exclude cut with ID _48782_210_12658_1_1534467701544_2300422_239-67139-0_sp1.1 from training. Duration: 0.97275 2023-10-03 21:13:15,502 WARNING [train.py:1204] (2/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_329-113173-0_sp1.1 from training. Duration: 0.563625 2023-10-03 21:13:17,567 WARNING [train.py:1204] (2/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_81-75849-0_sp0.9 from training. Duration: 0.4333125 2023-10-03 21:13:20,326 WARNING [train.py:1204] (2/4) Exclude cut with ID _9235_210_5219_1_1528891154253_8046120_1003-101638-0 from training. Duration: 0.73 2023-10-03 21:13:21,669 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9309W0189-640316 from training. Duration: 0.64 2023-10-03 21:13:23,128 WARNING [train.py:1204] (2/4) Exclude cut with ID _45329_210_7005_1_1534157951211_6774339_503-287883-0_sp1.1 from training. Duration: 0.8 2023-10-03 21:13:30,656 WARNING [train.py:1204] (2/4) Exclude cut with ID _8833_210_5219_1_1528592665210_7553059_611-80313-0 from training. Duration: 0.64 2023-10-03 21:13:30,720 WARNING [train.py:1204] (2/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_335-106626-0_sp1.1 from training. Duration: 0.963625 2023-10-03 21:13:30,867 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.1.conv_module2.balancer2.prob, batch_count=1408293.3333333333, ans=0.125 2023-10-03 21:13:33,575 WARNING [train.py:1204] (2/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_237-44332-0_sp1.1 from training. Duration: 0.8545625 2023-10-03 21:13:33,771 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.ff3_skip_rate, batch_count=1408360.0, ans=0.0 2023-10-03 21:13:36,651 WARNING [train.py:1204] (2/4) Exclude cut with ID _46952_210_5986_1_1534230010357_3107379_178-147341-0_sp1.1 from training. Duration: 0.97275 2023-10-03 21:13:36,685 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_13091_210_7925_1_1530609447_8987354_946-154426-0_sp1.1 from training. Duration: 0.936375 2023-10-03 21:13:36,706 WARNING [train.py:1204] (2/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_326-17061-0_sp1.1 from training. Duration: 0.8545625 2023-10-03 21:13:37,894 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.523e+02 1.894e+02 2.100e+02 2.406e+02 4.274e+02, threshold=4.201e+02, percent-clipped=0.0 2023-10-03 21:13:41,138 WARNING [train.py:1204] (2/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_38-169920-0_sp1.1 from training. Duration: 0.82725 2023-10-03 21:13:43,947 WARNING [train.py:1204] (2/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_283-220372-0 from training. Duration: 0.56 2023-10-03 21:13:43,963 WARNING [train.py:1204] (2/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_252-216674-0_sp1.1 from training. Duration: 0.4909375 2023-10-03 21:13:45,358 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_202-91290-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 21:13:45,553 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.bypass_mid.scale_min, batch_count=1408360.0, ans=0.2 2023-10-03 21:13:45,585 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.bypass_mid.scale_min, batch_count=1408360.0, ans=0.2 2023-10-03 21:13:46,750 WARNING [train.py:1204] (2/4) Exclude cut with ID _41314_210_12651_1_1533459476543_3766460_80-74184-0 from training. Duration: 0.83 2023-10-03 21:13:51,257 WARNING [train.py:1204] (2/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_411-271358-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 21:13:58,767 WARNING [train.py:1204] (2/4) Exclude cut with ID _68777_210_4353_1_1535810390731_3117904_129-166012-0 from training. Duration: 0.85 2023-10-03 21:13:58,852 WARNING [train.py:1204] (2/4) Exclude cut with ID _44982_210_1511_1_1534312820108_3726029_410-109759-0_sp1.1 from training. Duration: 0.963625 2023-10-03 21:13:58,857 WARNING [train.py:1204] (2/4) Exclude cut with ID _14224_210_4381_1_1530529062037_4264010_364-22817-0_sp1.1 from training. Duration: 0.6454375 2023-10-03 21:14:01,547 WARNING [train.py:1204] (2/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_89-242335-0 from training. Duration: 0.95 2023-10-03 21:14:01,561 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_12868_210_6753_1_1530702479_3610564_151-325626-0 from training. Duration: 0.97 2023-10-03 21:14:01,562 WARNING [train.py:1204] (2/4) Exclude cut with ID _6275_210_2084_1_1528110062213_7669049_786-264332-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 21:14:04,139 INFO [train.py:1046] (2/4) Epoch 40, batch 4100, loss[loss=0.1561, simple_loss=0.2372, pruned_loss=0.03756, over 23677.00 frames. ], tot_loss[loss=0.1568, simple_loss=0.2373, pruned_loss=0.03808, over 4737886.07 frames. ], batch size: 232, lr: 2.53e-03, grad_scale: 16.0 2023-10-03 21:14:04,260 WARNING [train.py:1204] (2/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_35-153886-0_sp1.1 from training. Duration: 0.763625 2023-10-03 21:14:05,610 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_24007_210_1794_1_1531918925_1663875_116-100916-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 21:14:05,625 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_162-262717-0_sp1.1 from training. Duration: 0.7545625 2023-10-03 21:14:10,469 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.0.layers.1.conv_module1.whiten, num_groups=1, num_channels=192, metric=8.56 vs. limit=15.0 2023-10-03 21:14:13,459 WARNING [train.py:1204] (2/4) Exclude cut with ID _8263_210_1664_1_1528438953494_294100_40-204731-0 from training. Duration: 0.5 2023-10-03 21:14:14,843 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_12868_210_6753_1_1530702479_3610564_416-293746-0 from training. Duration: 0.9 2023-10-03 21:14:14,987 WARNING [train.py:1204] (2/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_605-219732-0 from training. Duration: 0.94 2023-10-03 21:14:16,365 WARNING [train.py:1204] (2/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_454-123002-0 from training. Duration: 0.69 2023-10-03 21:14:16,373 WARNING [train.py:1204] (2/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_25-304273-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 21:14:18,179 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6860_210_4852_1_1527917457_3833327_451-39702-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 21:14:18,212 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_304-16505-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 21:14:18,230 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6291_210_3273_1_1527683649_1078498_4-266348-0_sp1.1 from training. Duration: 0.7090625 2023-10-03 21:14:18,393 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.conv_skip_rate, batch_count=1408560.0, ans=0.0 2023-10-03 21:14:19,656 WARNING [train.py:1204] (2/4) Exclude cut with ID ID0149W0001-753701 from training. Duration: 0.989 2023-10-03 21:14:22,352 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_41251_210_15368_1_1533429767_4820695_394-55393-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 21:14:23,744 WARNING [train.py:1204] (2/4) Exclude cut with ID _8793_210_6310_1_1528542985139_2310009_121-179782-0_sp1.1 from training. Duration: 0.72725 2023-10-03 21:14:23,770 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_2729_210_3528_1_1525863505_3882214_421-73047-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 21:14:25,077 WARNING [train.py:1204] (2/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_278-154492-0_sp0.9 from training. Duration: 0.8444375 2023-10-03 21:14:27,977 WARNING [train.py:1204] (2/4) Exclude cut with ID _14170_210_5219_1_1530525949868_4469650_412-317077-0_sp0.9 from training. Duration: 0.8333125 2023-10-03 21:14:29,468 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_727-138317-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 21:14:29,514 WARNING [train.py:1204] (2/4) Exclude cut with ID _25808_210_13292_1_1532343514562_7366109_345-335048-0_sp1.1 from training. Duration: 0.92725 2023-10-03 21:14:31,431 WARNING [train.py:1204] (2/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_506-23200-0 from training. Duration: 0.97 2023-10-03 21:14:32,815 WARNING [train.py:1204] (2/4) Exclude cut with ID _44718_210_5816_1_1534050051869_6469030_643-245187-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 21:14:32,819 WARNING [train.py:1204] (2/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_42-186162-0_sp1.1 from training. Duration: 0.836375 2023-10-03 21:14:32,831 WARNING [train.py:1204] (2/4) Exclude cut with ID _3231_210_2106_1_1526205864934_6784220_171-256004-0_sp1.1 from training. Duration: 0.7818125 2023-10-03 21:14:32,849 WARNING [train.py:1204] (2/4) Exclude cut with ID _25914_210_6400_1_1532141317058_644740_43-248478-0_sp1.1 from training. Duration: 0.936375 2023-10-03 21:14:32,901 WARNING [train.py:1204] (2/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_577-287150-0 from training. Duration: 0.82 2023-10-03 21:14:35,693 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_46015_210_7234_1_1533863319_3087837_376-276068-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 21:14:37,488 WARNING [train.py:1204] (2/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_51-200220-0 from training. Duration: 0.56 2023-10-03 21:14:40,110 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_452-96101-0_sp1.1 from training. Duration: 0.963625 2023-10-03 21:14:41,625 WARNING [train.py:1204] (2/4) Exclude cut with ID _13252_210_1925_1_1530671372047_3336909_263-168388-0_sp1.1 from training. Duration: 0.7818125 2023-10-03 21:14:41,626 WARNING [train.py:1204] (2/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_469-94673-0 from training. Duration: 0.79 2023-10-03 21:14:44,824 WARNING [train.py:1204] (2/4) Exclude cut with ID _14922_210_6228_1_1530707435273_3693310_303-228738-0_sp1.1 from training. Duration: 0.763625 2023-10-03 21:14:44,872 WARNING [train.py:1204] (2/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_262-250826-0_sp1.1 from training. Duration: 0.9 2023-10-03 21:14:44,902 WARNING [train.py:1204] (2/4) Exclude cut with ID _68777_210_4353_1_1535810390731_3117904_129-166012-0_sp1.1 from training. Duration: 0.77275 2023-10-03 21:14:47,685 WARNING [train.py:1204] (2/4) Exclude cut with ID _58895_210_8750_1_1535184081320_3649089_265-323810-0 from training. Duration: 0.96 2023-10-03 21:14:49,022 WARNING [train.py:1204] (2/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_120-76843-0_sp0.9 from training. Duration: 0.611125 2023-10-03 21:14:49,074 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7544_210_6753_1_1528587722_3576686_295-258479-0_sp0.9 from training. Duration: 0.9333125 2023-10-03 21:14:52,204 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7046_210_6753_1_1527895337_6920917_783-278109-0 from training. Duration: 0.77 2023-10-03 21:14:52,254 WARNING [train.py:1204] (2/4) Exclude cut with ID _42326_210_7770_1_1533814282531_3707269_113-308323-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 21:14:53,555 WARNING [train.py:1204] (2/4) Exclude cut with ID _7852_210_5219_1_1528286471316_8139400_242-105336-0_sp0.9 from training. Duration: 0.8 2023-10-03 21:14:56,258 WARNING [train.py:1204] (2/4) Exclude cut with ID _56235_210_10640_1_1534557335235_4161099_114-277905-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 21:14:59,805 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_13118_210_7925_1_1530755612_7608297_42-66683-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 21:15:02,579 WARNING [train.py:1204] (2/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_901-79438-0_sp1.1 from training. Duration: 0.8545625 2023-10-03 21:15:02,633 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_30280_210_1881_1_1532872779_3660139_211-182194-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 21:15:10,032 WARNING [train.py:1204] (2/4) Exclude cut with ID _14898_210_9206_1_1530766812377_4177190_621-45732-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 21:15:10,040 WARNING [train.py:1204] (2/4) Exclude cut with ID _35350_210_16040_1_1533040191183_4075299_224-133775-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 21:15:13,398 WARNING [train.py:1204] (2/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_147-293311-0_sp1.1 from training. Duration: 0.8545625 2023-10-03 21:15:16,169 WARNING [train.py:1204] (2/4) Exclude cut with ID _12302_210_5219_1_1529924446466_8107280_835-279754-0_sp1.1 from training. Duration: 0.8181875 2023-10-03 21:15:16,426 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.bypass.scale_min, batch_count=1408760.0, ans=0.2 2023-10-03 21:15:19,453 INFO [train.py:1046] (2/4) Epoch 40, batch 4150, loss[loss=0.1431, simple_loss=0.2239, pruned_loss=0.03115, over 24321.00 frames. ], tot_loss[loss=0.1564, simple_loss=0.2371, pruned_loss=0.03779, over 4735828.38 frames. ], batch size: 61, lr: 2.53e-03, grad_scale: 16.0 2023-10-03 21:15:19,523 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8453_210_4211_1_1528502940_8562730_347-330673-0_sp0.9 from training. Duration: 0.8 2023-10-03 21:15:19,612 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_213-103282-0_sp0.9 from training. Duration: 0.8666875 2023-10-03 21:15:20,940 WARNING [train.py:1204] (2/4) Exclude cut with ID _49728_210_12651_1_1534041034294_6788150_66-43496-0_sp1.1 from training. Duration: 0.97275 2023-10-03 21:15:20,949 WARNING [train.py:1204] (2/4) Exclude cut with ID _19023_210_4353_1_1531378892552_5820780_28-36615-0_sp1.1 from training. Duration: 0.8818125 2023-10-03 21:15:22,367 WARNING [train.py:1204] (2/4) Exclude cut with ID _14349_210_8341_1_1530615710818_3442470_115-285975-0 from training. Duration: 0.86 2023-10-03 21:15:23,665 WARNING [train.py:1204] (2/4) Exclude cut with ID _12734_210_8341_1_1530183614296_3891929_236-47464-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 21:15:23,715 WARNING [train.py:1204] (2/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_177-236809-0 from training. Duration: 0.49 2023-10-03 21:15:25,041 WARNING [train.py:1204] (2/4) Exclude cut with ID _13678_210_7497_1_1530530069226_3463170_371-3221-0 from training. Duration: 0.9 2023-10-03 21:15:25,062 WARNING [train.py:1204] (2/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_48-338976-0 from training. Duration: 0.89 2023-10-03 21:15:26,802 WARNING [train.py:1204] (2/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_1167-309744-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 21:15:32,129 WARNING [train.py:1204] (2/4) Exclude cut with ID _30327_210_16049_1_1532484161295_3924169_401-70819-0_sp0.9 from training. Duration: 0.9555625 2023-10-03 21:15:32,144 WARNING [train.py:1204] (2/4) Exclude cut with ID _55134_210_21982_1_1534590028377_3882170_346-91022-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 21:15:35,096 WARNING [train.py:1204] (2/4) Exclude cut with ID _14459_210_2228_1_1531443641946_7516700_174-118801-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 21:15:35,155 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_269-278006-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 21:15:36,966 WARNING [train.py:1204] (2/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_390-339137-0_sp0.9 from training. Duration: 0.62225 2023-10-03 21:15:38,316 WARNING [train.py:1204] (2/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_253-308377-0_sp1.1 from training. Duration: 0.4454375 2023-10-03 21:15:38,334 WARNING [train.py:1204] (2/4) Exclude cut with ID _23825_210_13449_1_1531915313526_3497150_240-271367-0_sp1.1 from training. Duration: 0.8818125 2023-10-03 21:15:40,928 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1015-343884-0_sp0.9 from training. Duration: 0.688875 2023-10-03 21:15:45,542 WARNING [train.py:1204] (2/4) Exclude cut with ID _23514_210_1389_1_1531832139785_3031439_335-48210-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 21:15:50,287 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_309-65177-0_sp1.1 from training. Duration: 0.8 2023-10-03 21:15:51,584 WARNING [train.py:1204] (2/4) Exclude cut with ID _14368_210_4220_1_1530532714870_3595529_231-137892-0 from training. Duration: 0.99 2023-10-03 21:15:53,050 WARNING [train.py:1204] (2/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_57-270037-0 from training. Duration: 0.77 2023-10-03 21:15:53,061 WARNING [train.py:1204] (2/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_465-214726-0_sp1.1 from training. Duration: 0.7818125 2023-10-03 21:15:53,173 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.0.bypass.scale_min, batch_count=1408960.0, ans=0.2 2023-10-03 21:15:54,457 WARNING [train.py:1204] (2/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_106-300985-0 from training. Duration: 0.9 2023-10-03 21:15:54,462 WARNING [train.py:1204] (2/4) Exclude cut with ID _14632_210_9988_1_1530691279392_3923200_161-129530-0_sp1.1 from training. Duration: 0.87275 2023-10-03 21:15:54,480 WARNING [train.py:1204] (2/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_514-88370-0_sp1.1 from training. Duration: 0.963625 2023-10-03 21:15:57,144 WARNING [train.py:1204] (2/4) Exclude cut with ID _56495_210_4234_1_1534748080334_4136219_305-334505-0_sp1.1 from training. Duration: 0.92725 2023-10-03 21:15:58,445 WARNING [train.py:1204] (2/4) Exclude cut with ID _42875_210_14856_1_1533704397487_4012433_61-264914-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 21:16:01,851 WARNING [train.py:1204] (2/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_91-148831-0 from training. Duration: 0.99 2023-10-03 21:16:04,708 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_435-313305-0_sp0.9 from training. Duration: 0.788875 2023-10-03 21:16:06,151 WARNING [train.py:1204] (2/4) Exclude cut with ID _16744_210_5986_1_1531724405384_3907567_242-111316-0_sp1.1 from training. Duration: 0.8454375 2023-10-03 21:16:07,373 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.668e+02 2.015e+02 2.186e+02 2.538e+02 3.737e+02, threshold=4.372e+02, percent-clipped=0.0 2023-10-03 21:16:07,464 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_257-20733-0 from training. Duration: 0.76 2023-10-03 21:16:07,514 WARNING [train.py:1204] (2/4) Exclude cut with ID _8064_210_5399_1_1528372299291_4282973_4-91037-0_sp1.1 from training. Duration: 0.8 2023-10-03 21:16:09,378 WARNING [train.py:1204] (2/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_3-276701-0 from training. Duration: 0.91 2023-10-03 21:16:09,597 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.attention_skip_rate, batch_count=1409026.6666666667, ans=0.0 2023-10-03 21:16:10,862 WARNING [train.py:1204] (2/4) Exclude cut with ID _3992_210_1747_1_1526644816031_3731060_36-147855-0_sp1.1 from training. Duration: 0.7090625 2023-10-03 21:16:12,905 WARNING [train.py:1204] (2/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_1296-298671-0_sp1.1 from training. Duration: 0.963625 2023-10-03 21:16:14,192 WARNING [train.py:1204] (2/4) Exclude cut with ID _68965_210_1883_1_1535698962272_3497490_309-334593-0_sp1.1 from training. Duration: 0.92725 2023-10-03 21:16:16,782 WARNING [train.py:1204] (2/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_128-296581-0 from training. Duration: 0.74 2023-10-03 21:16:16,783 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_17613_210_2151_1_1531137569_2366160_170-299079-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 21:16:16,785 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6287_210_3273_1_1527509506_2653716_182-199887-0_sp1.1 from training. Duration: 0.463625 2023-10-03 21:16:16,929 WARNING [train.py:1204] (2/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_80-135200-0_sp1.1 from training. Duration: 0.7181875 2023-10-03 21:16:20,223 WARNING [train.py:1204] (2/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_34-183685-0 from training. Duration: 0.98 2023-10-03 21:16:20,243 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8278_210_2550_1_1528637099_4220413_418-210155-0_sp1.1 from training. Duration: 0.92725 2023-10-03 21:16:20,248 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8853_210_3273_1_1528633899_818968_51-187685-0_sp0.9 from training. Duration: 0.7666875 2023-10-03 21:16:20,267 WARNING [train.py:1204] (2/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_332-221057-0_sp1.1 from training. Duration: 0.6181875 2023-10-03 21:16:21,652 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_2221_210_1988_1_1525597051_4935541_415-296882-0 from training. Duration: 0.77 2023-10-03 21:16:21,672 WARNING [train.py:1204] (2/4) Exclude cut with ID _33640_210_2655_1_1533117605479_4855469_770-278185-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 21:16:21,719 WARNING [train.py:1204] (2/4) Exclude cut with ID _5287_210_2084_1_1527418883040_3878670_333-48508-0_sp1.1 from training. Duration: 0.5454375 2023-10-03 21:16:23,153 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_43802_210_2290_1_1533516203_8829504_142-276839-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 21:16:24,563 WARNING [train.py:1204] (2/4) Exclude cut with ID _28394_210_8913_1_1532430049518_3650118_126-139371-0_sp1.1 from training. Duration: 0.92725 2023-10-03 21:16:24,582 WARNING [train.py:1204] (2/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_76-203867-0 from training. Duration: 0.72 2023-10-03 21:16:24,623 WARNING [train.py:1204] (2/4) Exclude cut with ID _6194_210_1534_1_1527511826160_3941194_14-282876-0_sp0.9 from training. Duration: 0.788875 2023-10-03 21:16:26,798 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.3.encoder.layers.1.conv_module2.whiten, num_groups=1, num_channels=512, metric=3.10 vs. limit=15.0 2023-10-03 21:16:30,584 WARNING [train.py:1204] (2/4) Exclude cut with ID _35355_210_16040_1_1533212988977_4016229_74-190076-0_sp1.1 from training. Duration: 0.72725 2023-10-03 21:16:31,943 WARNING [train.py:1204] (2/4) Exclude cut with ID _33920_210_4220_1_1533257851500_3084399_198-334063-0 from training. Duration: 0.94 2023-10-03 21:16:33,340 INFO [train.py:1046] (2/4) Epoch 40, batch 4200, loss[loss=0.1475, simple_loss=0.2088, pruned_loss=0.04307, over 23382.00 frames. ], tot_loss[loss=0.1559, simple_loss=0.2359, pruned_loss=0.03802, over 4724150.79 frames. ], batch size: 285, lr: 2.53e-03, grad_scale: 8.0 2023-10-03 21:16:33,392 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6291_210_3273_1_1527683649_1078498_4-266348-0_sp0.9 from training. Duration: 0.8666875 2023-10-03 21:16:36,070 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_23856_210_4238_1_1531994785_3087934_122-38075-0_sp1.1 from training. Duration: 0.936375 2023-10-03 21:16:37,345 WARNING [train.py:1204] (2/4) Exclude cut with ID _4697_210_2069_1_1526815801953_3823098_276-96593-0_sp0.9 from training. Duration: 0.8555625 2023-10-03 21:16:37,546 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.conv_module2.balancer1.prob, batch_count=1409160.0, ans=0.125 2023-10-03 21:16:38,679 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_308-278734-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 21:16:38,688 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_5190_210_4557_1_1527245078_2965646_353-250603-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 21:16:40,721 WARNING [train.py:1204] (2/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_472-216663-0 from training. Duration: 0.92 2023-10-03 21:16:44,004 WARNING [train.py:1204] (2/4) Exclude cut with ID _7537_210_2084_1_1528502395431_3924359_14-312906-0 from training. Duration: 0.84 2023-10-03 21:16:44,054 WARNING [train.py:1204] (2/4) Exclude cut with ID _29003_210_19596_1_1532507391720_3894098_559-236660-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 21:16:45,470 WARNING [train.py:1204] (2/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_312-204798-0_sp1.1 from training. Duration: 0.8454375 2023-10-03 21:16:47,521 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.bypass_mid.scale_min, batch_count=1409226.6666666667, ans=0.2 2023-10-03 21:16:48,729 WARNING [train.py:1204] (2/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_17-309070-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 21:16:51,281 WARNING [train.py:1204] (2/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_344-152526-0_sp0.9 from training. Duration: 0.588875 2023-10-03 21:16:54,043 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_41452_210_7291_1_1533437687_3941281_405-11559-0_sp1.1 from training. Duration: 0.82725 2023-10-03 21:16:54,070 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_48730_210_5399_1_1533885886_2212769_157-124052-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 21:16:55,334 WARNING [train.py:1204] (2/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_415-78792-0 from training. Duration: 0.81 2023-10-03 21:16:55,339 WARNING [train.py:1204] (2/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_345-340690-0_sp1.1 from training. Duration: 0.8454375 2023-10-03 21:16:56,728 WARNING [train.py:1204] (2/4) Exclude cut with ID _23825_210_13449_1_1531915313526_3497150_124-8138-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 21:16:56,777 WARNING [train.py:1204] (2/4) Exclude cut with ID _49258_210_7994_1_1534122017863_3738861_194-24864-0_sp1.1 from training. Duration: 0.936375 2023-10-03 21:16:56,800 WARNING [train.py:1204] (2/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_369-222060-0_sp1.1 from training. Duration: 0.6454375 2023-10-03 21:16:58,194 WARNING [train.py:1204] (2/4) Exclude cut with ID _12369_210_4381_1_1530356078581_4388520_480-13146-0_sp0.9 from training. Duration: 0.9444375 2023-10-03 21:17:01,600 WARNING [train.py:1204] (2/4) Exclude cut with ID _13253_210_1925_1_1530757775819_3577873_191-144977-0 from training. Duration: 0.78 2023-10-03 21:17:02,796 WARNING [train.py:1204] (2/4) Exclude cut with ID _30304_210_6319_1_1532476827261_7582060_1128-140266-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 21:17:05,789 WARNING [train.py:1204] (2/4) Exclude cut with ID _8772_210_1850_1_1528635595972_3664011_247-336448-0_sp0.9 from training. Duration: 0.82225 2023-10-03 21:17:07,188 WARNING [train.py:1204] (2/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_318-59097-0_sp1.1 from training. Duration: 0.6909375 2023-10-03 21:17:09,865 WARNING [train.py:1204] (2/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_540-131164-0_sp1.1 from training. Duration: 0.836375 2023-10-03 21:17:11,812 WARNING [train.py:1204] (2/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_39-110836-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 21:17:13,177 WARNING [train.py:1204] (2/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_860-96198-0_sp1.1 from training. Duration: 0.663625 2023-10-03 21:17:13,184 WARNING [train.py:1204] (2/4) Exclude cut with ID _15133_210_6228_1_1530794512074_3080379_29-264458-0 from training. Duration: 0.58 2023-10-03 21:17:13,209 WARNING [train.py:1204] (2/4) Exclude cut with ID _55209_210_1531_1_1534675828996_4597160_797-348343-0_sp1.1 from training. Duration: 0.963625 2023-10-03 21:17:15,255 WARNING [train.py:1204] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_23-189234-0_sp1.1 from training. Duration: 0.7545625 2023-10-03 21:17:20,181 WARNING [train.py:1204] (2/4) Exclude cut with ID _5410_210_1388_1_1527397055795_1719020_39-112221-0_sp1.1 from training. Duration: 0.62725 2023-10-03 21:17:22,704 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_43802_210_2290_1_1533516203_8829504_100-348441-0_sp1.1 from training. Duration: 0.82725 2023-10-03 21:17:26,970 WARNING [train.py:1204] (2/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_143-86344-0_sp1.1 from training. Duration: 0.7 2023-10-03 21:17:29,716 WARNING [train.py:1204] (2/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_126-29104-0 from training. Duration: 0.85 2023-10-03 21:17:29,979 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=1409360.0, ans=0.1 2023-10-03 21:17:31,767 WARNING [train.py:1204] (2/4) Exclude cut with ID _39846_210_12651_1_1533603651992_1318030_138-31771-0_sp1.1 from training. Duration: 0.963625 2023-10-03 21:17:35,873 WARNING [train.py:1204] (2/4) Exclude cut with ID _2220_210_1819_1_1525509109898_3658680_50-99837-0_sp1.1 from training. Duration: 0.4909375 2023-10-03 21:17:37,238 WARNING [train.py:1204] (2/4) Exclude cut with ID _6274_210_2084_1_1527937583837_7494020_836-210550-0_sp1.1 from training. Duration: 0.97275 2023-10-03 21:17:38,622 WARNING [train.py:1204] (2/4) Exclude cut with ID _28323_210_16187_1_1532480395667_4100300_306-74561-0 from training. Duration: 0.94 2023-10-03 21:17:42,420 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.3.encoder.layers.2.self_attn2.whiten, num_groups=1, num_channels=512, metric=12.25 vs. limit=22.5 2023-10-03 21:17:44,599 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_177-58005-0_sp1.1 from training. Duration: 0.563625 2023-10-03 21:17:44,837 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=1409426.6666666667, ans=0.1 2023-10-03 21:17:48,463 INFO [train.py:1046] (2/4) Epoch 40, batch 4250, loss[loss=0.157, simple_loss=0.251, pruned_loss=0.0315, over 24667.00 frames. ], tot_loss[loss=0.1551, simple_loss=0.2347, pruned_loss=0.0378, over 4721222.93 frames. ], batch size: 73, lr: 2.53e-03, grad_scale: 8.0 2023-10-03 21:17:50,025 WARNING [train.py:1204] (2/4) Exclude cut with ID _54871_210_22356_1_1534665798253_3856359_19-342632-0_sp1.1 from training. Duration: 0.82725 2023-10-03 21:17:50,039 WARNING [train.py:1204] (2/4) Exclude cut with ID _35355_210_16040_1_1533212988977_4016229_74-190076-0_sp0.9 from training. Duration: 0.888875 2023-10-03 21:17:51,459 WARNING [train.py:1204] (2/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_285-97786-0_sp1.1 from training. Duration: 0.97275 2023-10-03 21:17:56,859 WARNING [train.py:1204] (2/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_337-181502-0_sp0.9 from training. Duration: 0.72225 2023-10-03 21:17:56,902 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_353-182855-0 from training. Duration: 0.96 2023-10-03 21:17:56,938 WARNING [train.py:1204] (2/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_409-327730-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 21:17:59,778 WARNING [train.py:1204] (2/4) Exclude cut with ID _8030_210_7497_1_1528444886781_3531089_373-19403-0_sp1.1 from training. Duration: 0.97275 2023-10-03 21:18:02,928 WARNING [train.py:1204] (2/4) Exclude cut with ID _6789_210_1534_1_1527857937218_3118950_231-340237-0_sp1.1 from training. Duration: 0.936375 2023-10-03 21:18:07,200 WARNING [train.py:1204] (2/4) Exclude cut with ID _6493_210_1747_1_1528459210424_4012470_536-57832-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 21:18:07,213 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7046_210_6753_1_1527895337_6920917_756-116002-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 21:18:08,668 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_37152_210_9697_1_1533106573_3844188_29-239070-0_sp1.1 from training. Duration: 0.8909375 2023-10-03 21:18:08,670 WARNING [train.py:1204] (2/4) Exclude cut with ID _19023_210_4353_1_1531378892552_5820780_61-334539-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 21:18:11,401 WARNING [train.py:1204] (2/4) Exclude cut with ID _14170_210_5219_1_1530525949868_4469650_159-28488-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 21:18:11,516 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_40896_210_15710_1_1533433956_7100802_764-71272-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 21:18:13,411 WARNING [train.py:1204] (2/4) Exclude cut with ID _44902_210_16187_1_1533888006062_3792078_258-178274-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 21:18:15,404 WARNING [train.py:1204] (2/4) Exclude cut with ID _7537_210_2084_1_1528502395431_3924359_14-312906-0_sp1.1 from training. Duration: 0.763625 2023-10-03 21:18:15,504 WARNING [train.py:1204] (2/4) Exclude cut with ID _49643_210_7989_1_1534640244224_3939031_674-116639-0_sp1.1 from training. Duration: 0.963625 2023-10-03 21:18:16,886 WARNING [train.py:1204] (2/4) Exclude cut with ID _22826_210_9994_1_1531908201522_3279169_415-161-0 from training. Duration: 0.98 2023-10-03 21:18:21,486 WARNING [train.py:1204] (2/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_264-348896-0 from training. Duration: 0.85 2023-10-03 21:18:21,495 WARNING [train.py:1204] (2/4) Exclude cut with ID _51899_210_20034_1_1534129254466_3669220_233-161492-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 21:18:22,799 WARNING [train.py:1204] (2/4) Exclude cut with ID _31255_210_13427_1_1532865619495_3815143_233-200023-0_sp1.1 from training. Duration: 0.936375 2023-10-03 21:18:22,823 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_23850_210_4238_1_1532232597_1632433_100-229879-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 21:18:24,230 WARNING [train.py:1204] (2/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_375-245677-0_sp0.9 from training. Duration: 0.9 2023-10-03 21:18:24,233 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_40223_210_6228_1_1533298404_4812267_355-317415-0_sp1.1 from training. Duration: 0.97275 2023-10-03 21:18:24,266 WARNING [train.py:1204] (2/4) Exclude cut with ID _9679_210_7497_1_1529129212906_4363929_235-81488-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 21:18:27,184 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.balancer_ff2.min_abs, batch_count=1409626.6666666667, ans=0.1 2023-10-03 21:18:29,603 WARNING [train.py:1204] (2/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_172-215932-0_sp1.1 from training. Duration: 0.6 2023-10-03 21:18:29,672 WARNING [train.py:1204] (2/4) Exclude cut with ID _12369_210_4381_1_1530356078581_4388520_64-298917-0_sp1.1 from training. Duration: 0.8 2023-10-03 21:18:34,301 WARNING [train.py:1204] (2/4) Exclude cut with ID _38020_210_5986_1_1533112252259_7006089_131-53533-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 21:18:35,697 WARNING [train.py:1204] (2/4) Exclude cut with ID _42334_210_7770_1_1534206610396_3605880_392-48154-0_sp1.1 from training. Duration: 0.963625 2023-10-03 21:18:36,952 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.647e+02 1.902e+02 2.076e+02 2.432e+02 4.125e+02, threshold=4.152e+02, percent-clipped=0.0 2023-10-03 21:18:37,063 WARNING [train.py:1204] (2/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_337-181502-0 from training. Duration: 0.65 2023-10-03 21:18:37,079 WARNING [train.py:1204] (2/4) Exclude cut with ID _6267_210_2084_1_1527764658526_3894730_375-219887-0_sp0.9 from training. Duration: 0.7444375 2023-10-03 21:18:38,416 WARNING [train.py:1204] (2/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_60-146189-0 from training. Duration: 0.98 2023-10-03 21:18:39,835 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_3133_210_2087_1_1526106446_2324690_243-143119-0_sp1.1 from training. Duration: 0.7 2023-10-03 21:18:39,981 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.1.ff3_skip_rate, batch_count=1409693.3333333333, ans=0.0 2023-10-03 21:18:42,523 WARNING [train.py:1204] (2/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_1267-272693-0_sp0.9 from training. Duration: 0.811125 2023-10-03 21:18:43,890 WARNING [train.py:1204] (2/4) Exclude cut with ID _69884_210_9771_1_1535846392818_4166169_83-182645-0_sp1.1 from training. Duration: 0.97275 2023-10-03 21:18:43,909 WARNING [train.py:1204] (2/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_403-292260-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 21:18:47,571 WARNING [train.py:1204] (2/4) Exclude cut with ID _55429_210_10626_1_1534467588527_7174143_377-169391-0 from training. Duration: 0.96 2023-10-03 21:18:48,957 WARNING [train.py:1204] (2/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_494-199538-0_sp1.1 from training. Duration: 0.6818125 2023-10-03 21:18:49,006 WARNING [train.py:1204] (2/4) Exclude cut with ID _3067_210_2667_1_1526212974640_4503168_119-175776-0_sp1.1 from training. Duration: 0.67275 2023-10-03 21:18:53,635 WARNING [train.py:1204] (2/4) Exclude cut with ID _3231_210_2106_1_1526205864934_6784220_183-303839-0_sp1.1 from training. Duration: 0.97275 2023-10-03 21:18:55,104 WARNING [train.py:1204] (2/4) Exclude cut with ID _18513_210_9206_1_1531618215223_3862620_165-38958-0_sp1.1 from training. Duration: 0.963625 2023-10-03 21:18:56,454 WARNING [train.py:1204] (2/4) Exclude cut with ID _41314_210_12651_1_1533459476543_3766460_87-272068-0_sp1.1 from training. Duration: 0.7545625 2023-10-03 21:18:57,829 WARNING [train.py:1204] (2/4) Exclude cut with ID _3541_210_2477_1_1526475408709_3263973_353-95-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 21:18:57,935 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_2009_210_2071_1_1525481134_4588937_73-302813-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 21:19:00,589 WARNING [train.py:1204] (2/4) Exclude cut with ID _64220_210_9527_1_1535250151772_4875259_88-95011-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 21:19:00,653 WARNING [train.py:1204] (2/4) Exclude cut with ID _44902_210_16187_1_1533888006062_3792078_160-12107-0_sp1.1 from training. Duration: 0.92725 2023-10-03 21:19:00,660 WARNING [train.py:1204] (2/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_547-5424-0 from training. Duration: 0.94 2023-10-03 21:19:02,334 INFO [train.py:1046] (2/4) Epoch 40, batch 4300, loss[loss=0.1447, simple_loss=0.2208, pruned_loss=0.03436, over 23368.00 frames. ], tot_loss[loss=0.155, simple_loss=0.2347, pruned_loss=0.03766, over 4730949.90 frames. ], batch size: 285, lr: 2.53e-03, grad_scale: 8.0 2023-10-03 21:19:02,458 WARNING [train.py:1204] (2/4) Exclude cut with ID _21783_210_11357_1_1531742458730_3762679_268-85656-0_sp1.1 from training. Duration: 0.936375 2023-10-03 21:19:06,732 WARNING [train.py:1204] (2/4) Exclude cut with ID _66210_210_12651_1_1535446812922_3365850_42-284985-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 21:19:06,762 WARNING [train.py:1204] (2/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_465-214726-0_sp0.9 from training. Duration: 0.9555625 2023-10-03 21:19:09,508 WARNING [train.py:1204] (2/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_50-95770-0_sp1.1 from training. Duration: 0.936375 2023-10-03 21:19:18,859 WARNING [train.py:1204] (2/4) Exclude cut with ID _30800_210_22382_1_1532499347904_4515959_269-77567-0_sp1.1 from training. Duration: 0.963625 2023-10-03 21:19:18,862 WARNING [train.py:1204] (2/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_7-111954-0 from training. Duration: 0.56 2023-10-03 21:19:20,253 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_43802_210_2290_1_1533516203_8829504_85-195240-0_sp0.9 from training. Duration: 0.9666875 2023-10-03 21:19:21,682 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_2822_210_2715_1_1526112586_1999587_165-36656-0_sp0.9 from training. Duration: 0.988875 2023-10-03 21:19:21,697 WARNING [train.py:1204] (2/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_181-78632-0_sp1.1 from training. Duration: 0.7090625 2023-10-03 21:19:23,411 WARNING [train.py:1204] (2/4) Exclude cut with ID IC0671W0044-331887 from training. Duration: 0.896 2023-10-03 21:19:25,005 WARNING [train.py:1204] (2/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_266-137104-0_sp1.1 from training. Duration: 0.5909375 2023-10-03 21:19:26,319 WARNING [train.py:1204] (2/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_137-116442-0_sp0.9 from training. Duration: 0.8666875 2023-10-03 21:19:26,439 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.0.attention_skip_rate, batch_count=1409893.3333333333, ans=0.0 2023-10-03 21:19:27,925 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.balancer2.prob, batch_count=1409893.3333333333, ans=0.125 2023-10-03 21:19:28,175 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.4.encoder.layers.2.conv_module2.whiten, num_groups=1, num_channels=384, metric=8.56 vs. limit=15.0 2023-10-03 21:19:30,863 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_23801_210_16223_1_1532066392_3294657_570-5439-0 from training. Duration: 0.97 2023-10-03 21:19:32,238 WARNING [train.py:1204] (2/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_29-280622-0_sp1.1 from training. Duration: 0.7454375 2023-10-03 21:19:32,261 WARNING [train.py:1204] (2/4) Exclude cut with ID 3557-8342-0013-54691-0 from training. Duration: 0.92 2023-10-03 21:19:32,489 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.0.conv_module2.balancer2.prob, batch_count=1409960.0, ans=0.125 2023-10-03 21:19:33,754 WARNING [train.py:1204] (2/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_711-15688-0_sp1.1 from training. Duration: 0.3909375 2023-10-03 21:19:33,987 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.bypass.scale_min, batch_count=1409960.0, ans=0.2 2023-10-03 21:19:35,195 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_15126_210_6753_1_1530778677_2591185_120-277707-0_sp1.1 from training. Duration: 0.72725 2023-10-03 21:19:37,900 WARNING [train.py:1204] (2/4) Exclude cut with ID _14970_210_15709_1_1530775547361_1966965_8-169924-0_sp1.1 from training. Duration: 0.836375 2023-10-03 21:19:37,902 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_4692_210_3528_1_1526899755_4323254_107-44401-0_sp0.9 from training. Duration: 0.9555625 2023-10-03 21:19:40,512 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_3475_210_2028_1_1526193578_3622569_265-276625-0_sp0.9 from training. Duration: 0.8444375 2023-10-03 21:19:40,614 WARNING [train.py:1204] (2/4) Exclude cut with ID _55153_210_2253_1_1534384786677_3162096_148-79022-0_sp1.1 from training. Duration: 0.87275 2023-10-03 21:19:41,881 WARNING [train.py:1204] (2/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_292-36336-0_sp1.1 from training. Duration: 0.92725 2023-10-03 21:19:41,903 WARNING [train.py:1204] (2/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_329-82235-0 from training. Duration: 0.82 2023-10-03 21:19:43,164 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_11305_210_1794_1_1529750428_7178148_257-312870-0 from training. Duration: 0.88 2023-10-03 21:19:46,526 WARNING [train.py:1204] (2/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_436-4493-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 21:19:48,160 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.conv_skip_rate, batch_count=1410026.6666666667, ans=0.0 2023-10-03 21:19:49,307 WARNING [train.py:1204] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_321-54129-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 21:19:49,315 WARNING [train.py:1204] (2/4) Exclude cut with ID _5287_210_2084_1_1527418883040_3878670_333-48508-0_sp0.9 from training. Duration: 0.6666875 2023-10-03 21:19:49,329 WARNING [train.py:1204] (2/4) Exclude cut with ID _26528_210_9070_1_1532570410842_3851310_244-278158-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 21:19:49,364 WARNING [train.py:1204] (2/4) Exclude cut with ID _46952_210_5986_1_1534230010357_3107379_232-344503-0_sp1.1 from training. Duration: 0.87275 2023-10-03 21:19:49,380 WARNING [train.py:1204] (2/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_43-206757-0 from training. Duration: 0.75 2023-10-03 21:19:49,382 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_4183_210_2087_1_1526729100_1869838_141-166600-0 from training. Duration: 0.95 2023-10-03 21:19:49,427 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_296-287678-0 from training. Duration: 0.95 2023-10-03 21:19:51,293 WARNING [train.py:1204] (2/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_265-60343-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 21:19:51,331 WARNING [train.py:1204] (2/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_332-256168-0 from training. Duration: 0.75 2023-10-03 21:19:52,605 WARNING [train.py:1204] (2/4) Exclude cut with ID _13679_210_7497_1_1530534293662_3355098_411-186088-0 from training. Duration: 0.94 2023-10-03 21:19:54,276 INFO [scaling.py:1118] (2/4) WithLoss: name=encoder.encoders.4.encoder.layers.1.self_attn_weights, loss-sum=0.000e+00 2023-10-03 21:19:54,735 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.2.encoder.layers.2.conv_module2.whiten, num_groups=1, num_channels=384, metric=5.75 vs. limit=15.0 2023-10-03 21:19:56,657 WARNING [train.py:1204] (2/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_458-126779-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 21:19:58,538 WARNING [train.py:1204] (2/4) Exclude cut with ID IC0803W0380-397572 from training. Duration: 0.032 2023-10-03 21:20:00,286 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_278-272612-0_sp1.1 from training. Duration: 0.663625 2023-10-03 21:20:01,668 WARNING [train.py:1204] (2/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_491-263101-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 21:20:01,673 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_53555_210_5632_1_1534595220_7232297_334-336778-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 21:20:02,017 INFO [scaling.py:1118] (2/4) WithLoss: name=encoder.encoders.1.encoder.layers.0.self_attn_weights, loss-sum=0.000e+00 2023-10-03 21:20:03,174 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_50-3289-0 from training. Duration: 0.91 2023-10-03 21:20:03,244 WARNING [train.py:1204] (2/4) Exclude cut with ID _8699_210_2069_1_1528630346831_3960409_155-39766-0_sp0.9 from training. Duration: 0.8666875 2023-10-03 21:20:03,248 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_502-82145-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 21:20:04,614 WARNING [train.py:1204] (2/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_550-43132-0_sp1.1 from training. Duration: 0.97275 2023-10-03 21:20:04,648 WARNING [train.py:1204] (2/4) Exclude cut with ID _14998_210_5219_1_1530705582080_7474970_446-112117-0_sp1.1 from training. Duration: 0.8181875 2023-10-03 21:20:05,894 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_3691_210_3528_1_1526468169_4062053_355-334432-0_sp1.1 from training. Duration: 0.7909375 2023-10-03 21:20:07,403 WARNING [train.py:1204] (2/4) Exclude cut with ID _58605_210_12210_1_1534723195939_3904259_239-94720-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 21:20:08,725 WARNING [train.py:1204] (2/4) Exclude cut with ID _49376_210_5986_1_1533985278732_3370740_391-344072-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 21:20:09,764 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.0.layers.0.nonlin_attention.whiten1, num_groups=1, num_channels=144, metric=10.06 vs. limit=10.0 2023-10-03 21:20:10,087 WARNING [train.py:1204] (2/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_558-322014-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 21:20:10,124 WARNING [train.py:1204] (2/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_256-32662-0_sp1.1 from training. Duration: 0.8181875 2023-10-03 21:20:15,525 INFO [train.py:1046] (2/4) Epoch 40, batch 4350, loss[loss=0.158, simple_loss=0.2419, pruned_loss=0.03708, over 23456.00 frames. ], tot_loss[loss=0.1559, simple_loss=0.2361, pruned_loss=0.03786, over 4724118.07 frames. ], batch size: 93, lr: 2.53e-03, grad_scale: 8.0 2023-10-03 21:20:16,985 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.0.layers.1.self_attn2.whiten, num_groups=1, num_channels=192, metric=9.50 vs. limit=22.5 2023-10-03 21:20:17,271 WARNING [train.py:1204] (2/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_572-185150-0 from training. Duration: 0.97 2023-10-03 21:20:17,300 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_89-35748-0_sp0.9 from training. Duration: 0.87775 2023-10-03 21:20:21,463 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6291_210_3273_1_1527683649_1078498_88-66747-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 21:20:23,374 WARNING [train.py:1204] (2/4) Exclude cut with ID _19504_210_9994_1_1531648863242_3325220_357-237029-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 21:20:23,620 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.balancer2.prob, batch_count=1410160.0, ans=0.125 2023-10-03 21:20:27,349 WARNING [train.py:1204] (2/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_302-2550-0_sp1.1 from training. Duration: 0.736375 2023-10-03 21:20:27,349 WARNING [train.py:1204] (2/4) Exclude cut with ID _9461_210_4557_1_1529060514726_3677451_59-194017-0_sp1.1 from training. Duration: 0.963625 2023-10-03 21:20:33,412 WARNING [train.py:1204] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_665-196155-0_sp1.1 from training. Duration: 0.6090625 2023-10-03 21:20:36,300 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_326-315007-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 21:20:38,983 WARNING [train.py:1204] (2/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_105-328174-0_sp1.1 from training. Duration: 0.6909375 2023-10-03 21:20:38,998 WARNING [train.py:1204] (2/4) Exclude cut with ID _56494_210_4220_1_1535284837625_3033169_106-263722-0_sp1.1 from training. Duration: 0.97275 2023-10-03 21:20:41,796 WARNING [train.py:1204] (2/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_770-200430-0_sp0.9 from training. Duration: 0.988875 2023-10-03 21:20:43,160 WARNING [train.py:1204] (2/4) Exclude cut with ID _65640_210_6310_1_1535368201445_3173250_209-13008-0_sp1.1 from training. Duration: 0.9 2023-10-03 21:20:45,860 WARNING [train.py:1204] (2/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_212-149947-0_sp0.9 from training. Duration: 0.811125 2023-10-03 21:20:50,647 WARNING [train.py:1204] (2/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_188-171673-0 from training. Duration: 0.58 2023-10-03 21:20:50,722 WARNING [train.py:1204] (2/4) Exclude cut with ID _58895_210_8750_1_1535184081320_3649089_286-281724-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 21:20:52,512 WARNING [train.py:1204] (2/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_16-254909-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 21:20:56,659 WARNING [train.py:1204] (2/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_52-95655-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 21:20:58,071 WARNING [train.py:1204] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_554-45617-0 from training. Duration: 0.99 2023-10-03 21:21:02,816 WARNING [train.py:1204] (2/4) Exclude cut with ID _14683_210_5219_1_1530698352608_3974859_473-344611-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 21:21:02,923 WARNING [train.py:1204] (2/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_17-25045-0_sp0.9 from training. Duration: 0.6555625 2023-10-03 21:21:04,097 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.636e+02 2.055e+02 2.301e+02 2.803e+02 4.605e+02, threshold=4.602e+02, percent-clipped=1.0 2023-10-03 21:21:08,374 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9072W0003-530637 from training. Duration: 0.768 2023-10-03 21:21:08,482 WARNING [train.py:1204] (2/4) Exclude cut with ID _38670_210_15379_1_1533448810659_4391059_76-74025-0_sp1.1 from training. Duration: 0.936375 2023-10-03 21:21:09,750 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6291_210_3273_1_1527683649_1078498_74-292608-0_sp0.9 from training. Duration: 0.8 2023-10-03 21:21:11,071 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9084W0025-536862 from training. Duration: 0.896 2023-10-03 21:21:13,752 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9087W0021-538392 from training. Duration: 0.768 2023-10-03 21:21:13,757 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7543_210_6753_1_1528543323_645515_56-163234-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 21:21:13,786 WARNING [train.py:1204] (2/4) Exclude cut with ID _45560_210_21982_1_1533812603951_3584999_44-332643-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 21:21:15,135 WARNING [train.py:1204] (2/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_1285-256622-0_sp1.1 from training. Duration: 0.763625 2023-10-03 21:21:15,200 WARNING [train.py:1204] (2/4) Exclude cut with ID _52781_210_16187_1_1534297729099_1118149_141-325632-0_sp1.1 from training. Duration: 0.936375 2023-10-03 21:21:16,562 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_40896_210_15710_1_1533433956_7100802_1051-137631-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 21:21:16,601 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_13091_210_7925_1_1530609447_8987354_233-18333-0_sp1.1 from training. Duration: 0.97275 2023-10-03 21:21:19,315 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_34008_210_10431_1_1533345424_8183247_662-317331-0 from training. Duration: 0.94 2023-10-03 21:21:20,643 WARNING [train.py:1204] (2/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_291-248920-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 21:21:20,645 WARNING [train.py:1204] (2/4) Exclude cut with ID _65263_210_22356_1_1535763598691_3921037_274-288481-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 21:21:20,674 WARNING [train.py:1204] (2/4) Exclude cut with ID _5189_210_4557_1_1527248319428_3769229_233-168080-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 21:21:20,732 WARNING [train.py:1204] (2/4) Exclude cut with ID _54871_210_22356_1_1534665798253_3856359_135-184281-0 from training. Duration: 0.99 2023-10-03 21:21:22,480 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9123W0012-555972 from training. Duration: 0.896 2023-10-03 21:21:22,491 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9123W0118-556077 from training. Duration: 0.896 2023-10-03 21:21:22,500 WARNING [train.py:1204] (2/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_406-275649-0 from training. Duration: 0.42 2023-10-03 21:21:25,901 WARNING [train.py:1204] (2/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_309-13684-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 21:21:25,921 WARNING [train.py:1204] (2/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_129-337802-0_sp1.1 from training. Duration: 0.6545625 2023-10-03 21:21:25,948 WARNING [train.py:1204] (2/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_649-266825-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 21:21:26,013 WARNING [train.py:1204] (2/4) Exclude cut with ID _40519_210_13794_1_1533715228655_7337390_895-224742-0_sp1.1 from training. Duration: 0.92725 2023-10-03 21:21:27,413 WARNING [train.py:1204] (2/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_234-14174-0 from training. Duration: 0.82 2023-10-03 21:21:27,679 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.conv_module2.balancer1.max_abs, batch_count=1410493.3333333333, ans=10.0 2023-10-03 21:21:28,585 INFO [train.py:1046] (2/4) Epoch 40, batch 4400, loss[loss=0.1555, simple_loss=0.2342, pruned_loss=0.03838, over 23715.00 frames. ], tot_loss[loss=0.156, simple_loss=0.2366, pruned_loss=0.03768, over 4739923.82 frames. ], batch size: 232, lr: 2.53e-03, grad_scale: 16.0 2023-10-03 21:21:28,752 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9156W0009-572502 from training. Duration: 0.768 2023-10-03 21:21:28,968 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.conv_skip_rate, batch_count=1410493.3333333333, ans=0.0 2023-10-03 21:21:30,099 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_424-245529-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 21:21:30,505 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.feed_forward3.hidden_balancer.prob, batch_count=1410493.3333333333, ans=0.125 2023-10-03 21:21:33,875 WARNING [train.py:1204] (2/4) Exclude cut with ID _26528_210_9070_1_1532570410842_3851310_232-10149-0_sp1.1 from training. Duration: 0.963625 2023-10-03 21:21:33,883 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_17292_210_6753_1_1531125388_2943734_152-30410-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 21:21:35,260 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_35441_210_16554_1_1533347572_4092680_291-82075-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 21:21:38,059 WARNING [train.py:1204] (2/4) Exclude cut with ID _12369_210_4381_1_1530356078581_4388520_64-298917-0 from training. Duration: 0.88 2023-10-03 21:21:38,086 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_5618_210_1874_1_1527311008_5851_1-214501-0_sp0.9 from training. Duration: 0.7655625 2023-10-03 21:21:38,114 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_9460_210_4211_1_1528954587_10049667_572-37035-0 from training. Duration: 0.8 2023-10-03 21:21:38,132 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9193W0023-590322 from training. Duration: 0.896 2023-10-03 21:21:39,523 WARNING [train.py:1204] (2/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_206-10097-0_sp1.1 from training. Duration: 0.4454375 2023-10-03 21:21:39,535 WARNING [train.py:1204] (2/4) Exclude cut with ID _55134_210_21982_1_1534590028377_3882170_17-51731-0_sp1.1 from training. Duration: 0.963625 2023-10-03 21:21:40,957 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_14953_210_2151_1_1530701710_5616165_710-165400-0 from training. Duration: 0.87 2023-10-03 21:21:43,654 WARNING [train.py:1204] (2/4) Exclude cut with ID _53203_210_2087_1_1534246218309_475590_61-85777-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 21:21:45,019 WARNING [train.py:1204] (2/4) Exclude cut with ID _55844_210_2346_1_1534471212569_7625707_1050-10453-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 21:21:45,028 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9218W0013-603297 from training. Duration: 0.896 2023-10-03 21:21:47,754 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_20120_210_6753_1_1531643035_5482859_267-102818-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 21:21:47,755 WARNING [train.py:1204] (2/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_191-41023-0 from training. Duration: 0.71 2023-10-03 21:21:47,797 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9231W0202-610227 from training. Duration: 0.896 2023-10-03 21:21:49,319 WARNING [train.py:1204] (2/4) Exclude cut with ID _6537_210_1883_1_1527987559710_3597129_382-133062-0 from training. Duration: 0.68 2023-10-03 21:21:51,798 WARNING [train.py:1204] (2/4) Exclude cut with ID _28427_210_4220_1_1532581240967_3061069_43-84928-0 from training. Duration: 0.99 2023-10-03 21:21:51,832 WARNING [train.py:1204] (2/4) Exclude cut with ID _59256_210_5986_1_1535076085997_3589120_121-237386-0 from training. Duration: 0.96 2023-10-03 21:21:51,866 WARNING [train.py:1204] (2/4) Exclude cut with ID _8480_210_1874_1_1528589391205_6728330_137-256986-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 21:21:53,039 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_9417_210_1794_1_1529235261_4614131_479-346963-0_sp1.1 from training. Duration: 0.936375 2023-10-03 21:21:54,358 WARNING [train.py:1204] (2/4) Exclude cut with ID _26474_210_9570_1_1532520022699_3623380_267-7362-0_sp1.1 from training. Duration: 0.936375 2023-10-03 21:21:54,437 WARNING [train.py:1204] (2/4) Exclude cut with ID _55328_210_27767_1_1534580973260_4088170_287-61272-0_sp1.1 from training. Duration: 0.97275 2023-10-03 21:21:55,830 WARNING [train.py:1204] (2/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_589-9070-0 from training. Duration: 0.71 2023-10-03 21:21:57,115 WARNING [train.py:1204] (2/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_148-158509-0_sp1.1 from training. Duration: 0.37275 2023-10-03 21:21:57,183 WARNING [train.py:1204] (2/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_164-184548-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 21:21:59,906 WARNING [train.py:1204] (2/4) Exclude cut with ID _14970_210_15709_1_1530775547361_1966965_6-23328-0_sp1.1 from training. Duration: 0.7818125 2023-10-03 21:21:59,919 WARNING [train.py:1204] (2/4) Exclude cut with ID _29303_210_2575_1_1532392237303_7410649_612-165069-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 21:22:01,727 WARNING [train.py:1204] (2/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_383-321224-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 21:22:03,595 WARNING [train.py:1204] (2/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_283-160525-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 21:22:03,614 WARNING [train.py:1204] (2/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_505-97987-0 from training. Duration: 0.86 2023-10-03 21:22:03,681 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9309W0190-640317 from training. Duration: 0.768 2023-10-03 21:22:06,464 WARNING [train.py:1204] (2/4) Exclude cut with ID _50093_210_4220_1_1534417141207_3432299_280-297390-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 21:22:12,259 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_17292_210_6753_1_1531125388_2943734_320-71385-0_sp1.1 from training. Duration: 0.97275 2023-10-03 21:22:15,058 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_213-103282-0 from training. Duration: 0.78 2023-10-03 21:22:19,308 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7615_210_4211_1_1528110965_4580160_72-235211-0_sp0.9 from training. Duration: 0.8444375 2023-10-03 21:22:23,141 WARNING [train.py:1204] (2/4) Exclude cut with ID _14867_210_1968_1_1530684179890_3846969_173-320977-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 21:22:25,878 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7046_210_6753_1_1527895337_6920917_783-278109-0_sp0.9 from training. Duration: 0.8555625 2023-10-03 21:22:25,917 WARNING [train.py:1204] (2/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_90-30673-0 from training. Duration: 0.84 2023-10-03 21:22:25,940 WARNING [train.py:1204] (2/4) Exclude cut with ID _22826_210_9994_1_1531908201522_3279169_415-161-0_sp1.1 from training. Duration: 0.8909375 2023-10-03 21:22:25,951 WARNING [train.py:1204] (2/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_86-66192-0_sp0.9 from training. Duration: 0.611125 2023-10-03 21:22:25,953 WARNING [train.py:1204] (2/4) Exclude cut with ID _4626_210_3532_1_1526817652717_3916859_446-206656-0_sp1.1 from training. Duration: 0.6909375 2023-10-03 21:22:27,191 WARNING [train.py:1204] (2/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_166-128941-0_sp0.9 from training. Duration: 0.6 2023-10-03 21:22:30,009 WARNING [train.py:1204] (2/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_345-167986-0 from training. Duration: 0.84 2023-10-03 21:22:32,930 WARNING [train.py:1204] (2/4) Exclude cut with ID _6275_210_2084_1_1528110062213_7669049_973-198085-0 from training. Duration: 0.75 2023-10-03 21:22:34,775 WARNING [train.py:1204] (2/4) Exclude cut with ID _60057_210_10640_1_1534903065793_3333150_171-181133-0 from training. Duration: 0.88 2023-10-03 21:22:34,798 WARNING [train.py:1204] (2/4) Exclude cut with ID _19504_210_9994_1_1531648863242_3325220_336-196513-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 21:22:34,805 WARNING [train.py:1204] (2/4) Exclude cut with ID _6493_210_1747_1_1528459210424_4012470_513-140967-0 from training. Duration: 0.79 2023-10-03 21:22:36,763 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6673_210_3273_1_1527767790_3173240_327-306185-0_sp0.9 from training. Duration: 0.92225 2023-10-03 21:22:39,533 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1032-236863-0_sp1.1 from training. Duration: 0.763625 2023-10-03 21:22:40,988 WARNING [train.py:1204] (2/4) Exclude cut with ID _14867_210_1968_1_1530684179890_3846969_413-29292-0 from training. Duration: 0.9 2023-10-03 21:22:43,744 INFO [train.py:1046] (2/4) Epoch 40, batch 4450, loss[loss=0.1631, simple_loss=0.2471, pruned_loss=0.03951, over 23212.00 frames. ], tot_loss[loss=0.1572, simple_loss=0.2376, pruned_loss=0.03839, over 4730813.16 frames. ], batch size: 105, lr: 2.53e-03, grad_scale: 16.0 2023-10-03 21:22:43,827 WARNING [train.py:1204] (2/4) Exclude cut with ID _47251_210_18628_1_1533814063128_4137250_186-343322-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 21:22:46,512 WARNING [train.py:1204] (2/4) Exclude cut with ID _56235_210_10640_1_1534557335235_4161099_624-46091-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 21:22:46,559 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_791-197891-0_sp0.9 from training. Duration: 0.9333125 2023-10-03 21:22:52,581 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_50050_210_16223_1_1535435796_7290138_786-288457-0_sp1.1 from training. Duration: 0.92725 2023-10-03 21:22:53,217 WARNING [train.py:1204] (2/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_213-227283-0_sp1.1 from training. Duration: 0.963625 2023-10-03 21:22:53,436 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.nonlin_attention.balancer.max_positive, batch_count=1410826.6666666667, ans=0.95 2023-10-03 21:22:56,062 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.feed_forward2.hidden_balancer.prob, batch_count=1410826.6666666667, ans=0.125 2023-10-03 21:22:57,102 WARNING [train.py:1204] (2/4) Exclude cut with ID _42659_210_2070_1_1533607442765_3120380_58-341121-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 21:22:58,491 WARNING [train.py:1204] (2/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_506-23200-0_sp1.1 from training. Duration: 0.8818125 2023-10-03 21:22:59,974 WARNING [train.py:1204] (2/4) Exclude cut with ID _14170_210_5219_1_1530525949868_4469650_336-213530-0_sp1.1 from training. Duration: 0.8181875 2023-10-03 21:23:00,007 WARNING [train.py:1204] (2/4) Exclude cut with ID _64135_210_2253_1_1535677267876_7608972_180-5312-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 21:23:03,133 WARNING [train.py:1204] (2/4) Exclude cut with ID _12302_210_5219_1_1529924446466_8107280_835-279754-0 from training. Duration: 0.9 2023-10-03 21:23:03,135 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_20120_210_6753_1_1531643035_5482859_510-96996-0_sp1.1 from training. Duration: 0.7909375 2023-10-03 21:23:04,927 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_15517_210_6753_1_1530954008_3487336_216-187834-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 21:23:04,955 WARNING [train.py:1204] (2/4) Exclude cut with ID _19504_210_9994_1_1531648863242_3325220_414-113427-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 21:23:04,956 WARNING [train.py:1204] (2/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_264-108685-0_sp0.9 from training. Duration: 0.611125 2023-10-03 21:23:06,639 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.bypass.skip_rate, batch_count=1410893.3333333333, ans=0.04949747468305833 2023-10-03 21:23:07,740 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_5438_210_1794_1_1527336687_2600009_104-288274-0_sp0.9 from training. Duration: 0.6444375 2023-10-03 21:23:10,658 WARNING [train.py:1204] (2/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_520-39496-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 21:23:11,901 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_37818_210_4238_1_1533620910_4720083_315-308565-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 21:23:12,671 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.2.encoder.layers.2.feed_forward3.out_whiten, num_groups=1, num_channels=384, metric=7.71 vs. limit=15.0 2023-10-03 21:23:13,346 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_2009_210_2071_1_1525481134_4588937_385-302100-0_sp1.1 from training. Duration: 0.7909375 2023-10-03 21:23:13,386 WARNING [train.py:1204] (2/4) Exclude cut with ID _58895_210_8750_1_1535184081320_3649089_277-53219-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 21:23:14,724 WARNING [train.py:1204] (2/4) Exclude cut with ID _26664_210_7770_1_1532258943280_3418270_121-111258-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 21:23:16,261 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.conv_module2.balancer2.prob, batch_count=1410960.0, ans=0.125 2023-10-03 21:23:17,459 WARNING [train.py:1204] (2/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_99-167392-0_sp1.1 from training. Duration: 0.5545625 2023-10-03 21:23:18,071 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.4.encoder.layers.1.nonlin_attention.whiten2, num_groups=1, num_channels=384, metric=3.15 vs. limit=15.0 2023-10-03 21:23:18,808 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1113-7369-0 from training. Duration: 0.85 2023-10-03 21:23:18,823 WARNING [train.py:1204] (2/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_352-59958-0 from training. Duration: 0.86 2023-10-03 21:23:18,828 WARNING [train.py:1204] (2/4) Exclude cut with ID _14886_210_4220_1_1530702020355_3516190_159-73748-0_sp1.1 from training. Duration: 0.7545625 2023-10-03 21:23:22,023 WARNING [train.py:1204] (2/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_131-119569-0_sp1.1 from training. Duration: 0.92725 2023-10-03 21:23:23,957 WARNING [train.py:1204] (2/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_216-343007-0 from training. Duration: 0.66 2023-10-03 21:23:26,559 WARNING [train.py:1204] (2/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_113-92608-0_sp0.9 from training. Duration: 0.6 2023-10-03 21:23:27,306 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.2.encoder.layers.1.feed_forward1.out_whiten, num_groups=1, num_channels=384, metric=5.74 vs. limit=15.0 2023-10-03 21:23:30,665 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_40047_210_2290_1_1533290519_2374311_132-123271-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 21:23:32,400 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.525e+02 1.956e+02 2.159e+02 2.589e+02 3.505e+02, threshold=4.319e+02, percent-clipped=0.0 2023-10-03 21:23:32,504 WARNING [train.py:1204] (2/4) Exclude cut with ID _8793_210_6310_1_1528542985139_2310009_121-179782-0 from training. Duration: 0.8 2023-10-03 21:23:32,528 WARNING [train.py:1204] (2/4) Exclude cut with ID _43997_210_4525_1_1533538632555_5189329_283-218639-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 21:23:32,531 WARNING [train.py:1204] (2/4) Exclude cut with ID _49376_210_5986_1_1533985278732_3370740_297-78397-0_sp1.1 from training. Duration: 0.936375 2023-10-03 21:23:32,545 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_3475_210_2028_1_1526193578_3622569_377-166133-0_sp0.9 from training. Duration: 0.9555625 2023-10-03 21:23:32,552 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_520-213149-0_sp1.1 from training. Duration: 0.92725 2023-10-03 21:23:33,947 WARNING [train.py:1204] (2/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_567-144483-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 21:23:37,117 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.1.conv_module2.balancer2.prob, batch_count=1411026.6666666667, ans=0.125 2023-10-03 21:23:38,340 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_4183_210_2087_1_1526729100_1869838_143-280962-0_sp0.9 from training. Duration: 0.62225 2023-10-03 21:23:39,653 WARNING [train.py:1204] (2/4) Exclude cut with ID _49643_210_7989_1_1534640244224_3939031_611-185515-0 from training. Duration: 0.94 2023-10-03 21:23:41,042 WARNING [train.py:1204] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_88-341718-0_sp0.9 from training. Duration: 0.7333125 2023-10-03 21:23:42,450 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_51225_210_3601_1_1534400634_2327383_49-235441-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 21:23:43,847 WARNING [train.py:1204] (2/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_240-294400-0_sp1.1 from training. Duration: 0.936375 2023-10-03 21:23:46,476 WARNING [train.py:1204] (2/4) Exclude cut with ID _55429_210_10626_1_1534467588527_7174143_640-304825-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 21:23:47,786 WARNING [train.py:1204] (2/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_187-221566-0_sp1.1 from training. Duration: 0.3909375 2023-10-03 21:23:50,472 WARNING [train.py:1204] (2/4) Exclude cut with ID _7606_210_4915_1_1528894253277_5080690_127-177761-0_sp0.9 from training. Duration: 0.711125 2023-10-03 21:23:52,054 WARNING [train.py:1204] (2/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_430-116139-0 from training. Duration: 0.85 2023-10-03 21:23:53,978 WARNING [train.py:1204] (2/4) Exclude cut with ID _3675_210_4557_1_1526639934408_4182089_456-18305-0_sp0.9 from training. Duration: 0.8444375 2023-10-03 21:23:57,166 INFO [train.py:1046] (2/4) Epoch 40, batch 4500, loss[loss=0.1647, simple_loss=0.2531, pruned_loss=0.03817, over 24424.00 frames. ], tot_loss[loss=0.1574, simple_loss=0.2378, pruned_loss=0.03852, over 4725836.68 frames. ], batch size: 77, lr: 2.53e-03, grad_scale: 16.0 2023-10-03 21:23:59,925 WARNING [train.py:1204] (2/4) Exclude cut with ID _18513_210_9206_1_1531618215223_3862620_402-5957-0_sp1.1 from training. Duration: 0.9 2023-10-03 21:24:01,302 WARNING [train.py:1204] (2/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_465-156500-0 from training. Duration: 0.72 2023-10-03 21:24:01,305 WARNING [train.py:1204] (2/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_714-310538-0 from training. Duration: 0.86 2023-10-03 21:24:02,729 WARNING [train.py:1204] (2/4) Exclude cut with ID _39411_210_5986_1_1533538825030_3343649_335-301187-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 21:24:06,129 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_41750_210_1960_1_1533520678_3948042_241-158616-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 21:24:06,355 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=1411160.0, ans=0.1 2023-10-03 21:24:07,523 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_46013_210_7234_1_1533776362_3744032_50-141535-0_sp1.1 from training. Duration: 0.9 2023-10-03 21:24:07,590 WARNING [train.py:1204] (2/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_43-206757-0_sp0.9 from training. Duration: 0.8333125 2023-10-03 21:24:09,355 WARNING [train.py:1204] (2/4) Exclude cut with ID _22961_210_8254_1_1531976366672_4278180_172-276296-0_sp1.1 from training. Duration: 0.97275 2023-10-03 21:24:09,384 WARNING [train.py:1204] (2/4) Exclude cut with ID _17956_210_4353_1_1531209568125_6979970_1305-301903-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 21:24:09,575 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.out_combiner.scale_min, batch_count=1411160.0, ans=0.2 2023-10-03 21:24:10,698 WARNING [train.py:1204] (2/4) Exclude cut with ID _28323_210_16187_1_1532480395667_4100300_604-44507-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 21:24:13,696 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.bypass.skip_rate, batch_count=1411226.6666666667, ans=0.04949747468305833 2023-10-03 21:24:19,379 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.attention_skip_rate, batch_count=1411226.6666666667, ans=0.0 2023-10-03 21:24:20,614 WARNING [train.py:1204] (2/4) Exclude cut with ID _23514_210_1389_1_1531832139785_3031439_88-337544-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 21:24:21,978 WARNING [train.py:1204] (2/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_932-3672-0_sp1.1 from training. Duration: 0.963625 2023-10-03 21:24:23,455 WARNING [train.py:1204] (2/4) Exclude cut with ID _46502_210_13794_1_1533724452198_3657159_207-109991-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 21:24:25,441 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_216-73841-0_sp1.1 from training. Duration: 0.87275 2023-10-03 21:24:26,411 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.conv_skip_rate, batch_count=1411293.3333333333, ans=0.0 2023-10-03 21:24:27,235 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8453_210_4211_1_1528502940_8562730_347-330673-0_sp1.1 from training. Duration: 0.6545625 2023-10-03 21:24:32,595 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_4183_210_2087_1_1526729100_1869838_143-280962-0_sp1.1 from training. Duration: 0.5090625 2023-10-03 21:24:35,476 WARNING [train.py:1204] (2/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_325-113798-0_sp1.1 from training. Duration: 0.77275 2023-10-03 21:24:40,324 WARNING [train.py:1204] (2/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_91-212486-0_sp0.9 from training. Duration: 0.7555625 2023-10-03 21:24:43,132 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8776_210_2040_1_1528638837_4088036_244-278763-0_sp1.1 from training. Duration: 0.8181875 2023-10-03 21:24:43,164 WARNING [train.py:1204] (2/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_106-286130-0 from training. Duration: 0.98 2023-10-03 21:24:43,242 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_2746_210_1988_1_1525872816_3795627_372-205861-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 21:24:43,282 WARNING [train.py:1204] (2/4) Exclude cut with ID _30800_210_22382_1_1532499347904_4515959_240-54646-0_sp1.1 from training. Duration: 0.936375 2023-10-03 21:24:46,000 WARNING [train.py:1204] (2/4) Exclude cut with ID _59500_210_8254_1_1534809610680_3736009_162-180695-0_sp1.1 from training. Duration: 0.936375 2023-10-03 21:24:47,321 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7544_210_6753_1_1528591327_1471426_146-93357-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 21:24:50,176 WARNING [train.py:1204] (2/4) Exclude cut with ID _42657_210_2070_1_1533520654016_3327008_405-87432-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 21:24:50,213 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_4528_210_3273_1_1527076654_3276777_209-219804-0 from training. Duration: 0.69 2023-10-03 21:24:50,214 WARNING [train.py:1204] (2/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_78-182912-0_sp0.9 from training. Duration: 0.6555625 2023-10-03 21:24:50,221 WARNING [train.py:1204] (2/4) Exclude cut with ID _31255_210_13427_1_1532865619495_3815143_322-114998-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 21:24:54,933 WARNING [train.py:1204] (2/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_848-280957-0_sp0.9 from training. Duration: 0.9666875 2023-10-03 21:24:54,953 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7696_210_2040_1_1528207167_3716099_117-125434-0_sp0.9 from training. Duration: 0.8444375 2023-10-03 21:24:58,282 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_1002-109192-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 21:25:02,208 WARNING [train.py:1204] (2/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_138-64210-0_sp0.9 from training. Duration: 0.811125 2023-10-03 21:25:02,229 WARNING [train.py:1204] (2/4) Exclude cut with ID _25808_210_13292_1_1532343514562_7366109_633-47524-0_sp1.1 from training. Duration: 0.9 2023-10-03 21:25:03,662 WARNING [train.py:1204] (2/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_297-5233-0 from training. Duration: 0.95 2023-10-03 21:25:05,077 WARNING [train.py:1204] (2/4) Exclude cut with ID _8394_210_6702_1_1528590504316_3540189_335-274780-0 from training. Duration: 0.78 2023-10-03 21:25:05,082 WARNING [train.py:1204] (2/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_301-330455-0 from training. Duration: 0.8 2023-10-03 21:25:08,429 WARNING [train.py:1204] (2/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_383-205833-0 from training. Duration: 0.95 2023-10-03 21:25:10,021 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.0.self_attn_weights.pos_emb_skip_rate, batch_count=1411493.3333333333, ans=0.0 2023-10-03 21:25:11,773 INFO [train.py:1046] (2/4) Epoch 40, batch 4550, loss[loss=0.1549, simple_loss=0.2441, pruned_loss=0.03283, over 24301.00 frames. ], tot_loss[loss=0.1571, simple_loss=0.2372, pruned_loss=0.03845, over 4728695.28 frames. ], batch size: 77, lr: 2.53e-03, grad_scale: 16.0 2023-10-03 21:25:11,839 WARNING [train.py:1204] (2/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_48-182155-0 from training. Duration: 0.87 2023-10-03 21:25:12,934 INFO [scaling.py:1022] (2/4) Whitening: name=encoder_embed.convnext.out_whiten, num_groups=1, num_channels=128, metric=4.85 vs. limit=5.0 2023-10-03 21:25:13,187 WARNING [train.py:1204] (2/4) Exclude cut with ID _14832_210_8750_1_1530691351539_3171348_1-54315-0_sp1.1 from training. Duration: 0.7909375 2023-10-03 21:25:17,187 WARNING [train.py:1204] (2/4) Exclude cut with ID _44982_210_1511_1_1534312820108_3726029_413-100814-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 21:25:17,248 WARNING [train.py:1204] (2/4) Exclude cut with ID _3231_210_2106_1_1526205864934_6784220_104-261392-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 21:25:21,401 WARNING [train.py:1204] (2/4) Exclude cut with ID _24314_210_4915_1_1532135078116_7170179_1-13588-0_sp1.1 from training. Duration: 0.97275 2023-10-03 21:25:24,571 WARNING [train.py:1204] (2/4) Exclude cut with ID _44304_210_15374_1_1533639352765_4613129_141-8356-0_sp1.1 from training. Duration: 0.963625 2023-10-03 21:25:26,165 WARNING [train.py:1204] (2/4) Exclude cut with ID _37892_210_19596_1_1533198677897_3964058_374-335034-0_sp1.1 from training. Duration: 0.936375 2023-10-03 21:25:27,543 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_195-257512-0_sp0.9 from training. Duration: 0.7444375 2023-10-03 21:25:27,545 WARNING [train.py:1204] (2/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_1267-272693-0_sp1.1 from training. Duration: 0.663625 2023-10-03 21:25:27,545 WARNING [train.py:1204] (2/4) Exclude cut with ID _30196_210_1664_1_1533708496522_7074140_690-326155-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 21:25:29,706 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_68847_210_14656_1_1536070214_2586843_38-20717-0_sp1.1 from training. Duration: 0.97275 2023-10-03 21:25:30,852 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_99-251314-0_sp1.1 from training. Duration: 0.7909375 2023-10-03 21:25:33,594 WARNING [train.py:1204] (2/4) Exclude cut with ID _49376_210_5986_1_1533985278732_3370740_225-95630-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 21:25:36,739 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_2655_210_2087_1_1525688959_3897400_202-147557-0 from training. Duration: 0.85 2023-10-03 21:25:36,792 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_5618_210_1874_1_1527311008_5851_1-214501-0_sp1.1 from training. Duration: 0.626375 2023-10-03 21:25:38,107 WARNING [train.py:1204] (2/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_225-43381-0_sp1.1 from training. Duration: 0.8545625 2023-10-03 21:25:39,484 WARNING [train.py:1204] (2/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_242-3472-0 from training. Duration: 0.73 2023-10-03 21:25:44,029 WARNING [train.py:1204] (2/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_339-104791-0 from training. Duration: 0.49 2023-10-03 21:25:44,098 WARNING [train.py:1204] (2/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_488-92261-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 21:25:46,930 WARNING [train.py:1204] (2/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_175-163894-0 from training. Duration: 0.74 2023-10-03 21:25:48,342 WARNING [train.py:1204] (2/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_41-234353-0_sp0.9 from training. Duration: 0.7555625 2023-10-03 21:25:52,351 WARNING [train.py:1204] (2/4) Exclude cut with ID _47251_210_18628_1_1533814063128_4137250_116-275927-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 21:25:52,389 WARNING [train.py:1204] (2/4) Exclude cut with ID _4697_210_2069_1_1526815801953_3823098_61-223324-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 21:25:53,640 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6961_210_2087_1_1527937168_6815011_562-165518-0_sp1.1 from training. Duration: 0.7 2023-10-03 21:25:55,713 WARNING [train.py:1204] (2/4) Exclude cut with ID _21989_210_5854_1_1531899214931_3271360_133-44759-0 from training. Duration: 0.77 2023-10-03 21:25:57,193 WARNING [train.py:1204] (2/4) Exclude cut with ID _23557_210_8750_1_1531890106370_6846255_77-284923-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 21:25:59,968 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.501e+02 1.936e+02 2.088e+02 2.339e+02 3.946e+02, threshold=4.176e+02, percent-clipped=0.0 2023-10-03 21:26:00,071 WARNING [train.py:1204] (2/4) Exclude cut with ID _13678_210_7497_1_1530530069226_3463170_464-296127-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 21:26:00,090 WARNING [train.py:1204] (2/4) Exclude cut with ID _22613_210_5219_1_1532253599686_4090170_393-36663-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 21:26:02,052 WARNING [train.py:1204] (2/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_235-262124-0_sp0.9 from training. Duration: 0.7444375 2023-10-03 21:26:03,211 WARNING [train.py:1204] (2/4) Exclude cut with ID _8480_210_1874_1_1528589391205_6728330_755-112625-0 from training. Duration: 0.74 2023-10-03 21:26:03,271 WARNING [train.py:1204] (2/4) Exclude cut with ID _6456_210_1534_1_1527681634212_3181049_215-309359-0 from training. Duration: 0.88 2023-10-03 21:26:03,295 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_3019_210_2028_1_1526044245_3264865_191-72235-0_sp1.1 from training. Duration: 0.8818125 2023-10-03 21:26:04,639 WARNING [train.py:1204] (2/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_380-162648-0 from training. Duration: 0.94 2023-10-03 21:26:06,082 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_92-46009-0 from training. Duration: 0.54 2023-10-03 21:26:06,099 WARNING [train.py:1204] (2/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_53-300544-0_sp0.9 from training. Duration: 0.7444375 2023-10-03 21:26:08,031 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_68235_210_1925_1_1535863247_4484656_101-250943-0_sp1.1 from training. Duration: 0.97275 2023-10-03 21:26:08,040 WARNING [train.py:1204] (2/4) Exclude cut with ID _14970_210_15709_1_1530775547361_1966965_204-22182-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 21:26:09,400 WARNING [train.py:1204] (2/4) Exclude cut with ID _55153_210_2253_1_1534384786677_3162096_310-130092-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 21:26:09,417 WARNING [train.py:1204] (2/4) Exclude cut with ID _60113_210_16271_1_1535164212481_4386079_559-167191-0_sp1.1 from training. Duration: 0.8454375 2023-10-03 21:26:12,182 WARNING [train.py:1204] (2/4) Exclude cut with ID _14368_210_4220_1_1530532714870_3595529_93-58002-0_sp1.1 from training. Duration: 0.5181875 2023-10-03 21:26:12,234 WARNING [train.py:1204] (2/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_110-224688-0 from training. Duration: 0.97 2023-10-03 21:26:14,150 WARNING [train.py:1204] (2/4) Exclude cut with ID _40402_210_18219_1_1533522324115_2953150_178-178004-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 21:26:14,167 WARNING [train.py:1204] (2/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_321-335475-0_sp1.1 from training. Duration: 0.4090625 2023-10-03 21:26:15,492 WARNING [train.py:1204] (2/4) Exclude cut with ID _66863_210_18053_1_1535698758334_8590319_1092-271784-0 from training. Duration: 0.96 2023-10-03 21:26:15,499 WARNING [train.py:1204] (2/4) Exclude cut with ID _28317_210_12742_1_1532480302381_8510325_607-213951-0_sp1.1 from training. Duration: 0.763625 2023-10-03 21:26:15,516 WARNING [train.py:1204] (2/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_468-283113-0 from training. Duration: 0.96 2023-10-03 21:26:17,159 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.nonlin_attention.balancer.prob, batch_count=1411760.0, ans=0.125 2023-10-03 21:26:18,364 WARNING [train.py:1204] (2/4) Exclude cut with ID _14922_210_6228_1_1530707435273_3693310_13-165512-0_sp1.1 from training. Duration: 0.7454375 2023-10-03 21:26:18,381 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_69630_210_7234_1_1535788670_910989_56-169762-0_sp1.1 from training. Duration: 0.92725 2023-10-03 21:26:19,814 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.conv_skip_rate, batch_count=1411760.0, ans=0.0 2023-10-03 21:26:21,062 WARNING [train.py:1204] (2/4) Exclude cut with ID _67335_210_5219_1_1535525975652_4275229_121-80357-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 21:26:22,312 WARNING [train.py:1204] (2/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_126-12079-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 21:26:22,353 WARNING [train.py:1204] (2/4) Exclude cut with ID _5294_210_1747_1_1527249643928_3696179_342-270278-0_sp1.1 from training. Duration: 0.5 2023-10-03 21:26:23,751 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_30322_210_1925_1_1532843478_826817_35-328264-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 21:26:25,680 INFO [train.py:1046] (2/4) Epoch 40, batch 4600, loss[loss=0.1576, simple_loss=0.2293, pruned_loss=0.04299, over 23378.00 frames. ], tot_loss[loss=0.1561, simple_loss=0.236, pruned_loss=0.03809, over 4726902.20 frames. ], batch size: 285, lr: 2.53e-03, grad_scale: 16.0 2023-10-03 21:26:25,818 WARNING [train.py:1204] (2/4) Exclude cut with ID _8480_210_1874_1_1528589391205_6728330_755-112625-0_sp1.1 from training. Duration: 0.67275 2023-10-03 21:26:29,934 WARNING [train.py:1204] (2/4) Exclude cut with ID _38670_210_15379_1_1533448810659_4391059_132-146412-0_sp1.1 from training. Duration: 0.97275 2023-10-03 21:26:30,012 WARNING [train.py:1204] (2/4) Exclude cut with ID _22020_210_2461_1_1531724164513_6326900_563-39691-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 21:26:32,121 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_9460_210_4211_1_1528954587_10049667_572-37035-0_sp1.1 from training. Duration: 0.72725 2023-10-03 21:26:33,344 WARNING [train.py:1204] (2/4) Exclude cut with ID _68777_210_4353_1_1535810390731_3117904_129-166012-0_sp0.9 from training. Duration: 0.9444375 2023-10-03 21:26:33,409 WARNING [train.py:1204] (2/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_487-91921-0_sp1.1 from training. Duration: 0.963625 2023-10-03 21:26:33,612 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.bypass_mid.scale_min, batch_count=1411826.6666666667, ans=0.2 2023-10-03 21:26:33,691 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.conv_module2.balancer1.min_positive, batch_count=1411826.6666666667, ans=0.025 2023-10-03 21:26:34,978 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_195-257512-0 from training. Duration: 0.67 2023-10-03 21:26:36,299 WARNING [train.py:1204] (2/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_188-260011-0_sp1.1 from training. Duration: 0.663625 2023-10-03 21:26:36,636 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.conv_module1.balancer1.max_abs, batch_count=1411826.6666666667, ans=10.0 2023-10-03 21:26:40,725 WARNING [train.py:1204] (2/4) Exclude cut with ID _8480_210_1874_1_1528589391205_6728330_309-139896-0_sp0.9 from training. Duration: 0.9 2023-10-03 21:26:40,773 WARNING [train.py:1204] (2/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_866-321248-0_sp1.1 from training. Duration: 0.963625 2023-10-03 21:26:42,324 WARNING [train.py:1204] (2/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_510-216513-0_sp1.1 from training. Duration: 0.97275 2023-10-03 21:26:49,579 WARNING [train.py:1204] (2/4) Exclude cut with ID _6653_210_2629_1_1527919172441_6732679_758-143677-0 from training. Duration: 0.98 2023-10-03 21:26:50,920 WARNING [train.py:1204] (2/4) Exclude cut with ID _55134_210_21982_1_1534590028377_3882170_50-214427-0_sp1.1 from training. Duration: 0.97275 2023-10-03 21:26:54,849 WARNING [train.py:1204] (2/4) Exclude cut with ID _31649_210_6228_1_1532584608063_4091078_482-301330-0_sp1.1 from training. Duration: 0.97275 2023-10-03 21:26:58,065 WARNING [train.py:1204] (2/4) Exclude cut with ID _35799_210_5854_1_1533263487887_3529071_490-326847-0_sp1.1 from training. Duration: 0.9 2023-10-03 21:26:58,080 WARNING [train.py:1204] (2/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_984-90716-0_sp1.1 from training. Duration: 0.963625 2023-10-03 21:27:04,480 WARNING [train.py:1204] (2/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_166-246557-0 from training. Duration: 0.64 2023-10-03 21:27:04,480 WARNING [train.py:1204] (2/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_51-200220-0_sp1.1 from training. Duration: 0.5090625 2023-10-03 21:27:04,564 WARNING [train.py:1204] (2/4) Exclude cut with ID _51382_210_23862_1_1534492801838_3650104_275-92796-0_sp1.1 from training. Duration: 0.936375 2023-10-03 21:27:06,482 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.2.encoder.layers.2.feed_forward3.out_whiten, num_groups=1, num_channels=384, metric=10.08 vs. limit=15.0 2023-10-03 21:27:08,619 WARNING [train.py:1204] (2/4) Exclude cut with ID _29853_210_7497_1_1532505600851_6642110_461-142829-0_sp1.1 from training. Duration: 0.97275 2023-10-03 21:27:08,660 WARNING [train.py:1204] (2/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_788-298907-0_sp0.9 from training. Duration: 0.988875 2023-10-03 21:27:10,099 WARNING [train.py:1204] (2/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_739-2098-0_sp1.1 from training. Duration: 0.836375 2023-10-03 21:27:14,596 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7543_210_6753_1_1528541907_1399322_165-323619-0 from training. Duration: 0.69 2023-10-03 21:27:16,437 WARNING [train.py:1204] (2/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_401-255491-0_sp1.1 from training. Duration: 0.636375 2023-10-03 21:27:19,297 WARNING [train.py:1204] (2/4) Exclude cut with ID _66873_210_5517_1_1535454136506_4293370_24-143283-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 21:27:20,620 WARNING [train.py:1204] (2/4) Exclude cut with ID _14459_210_2228_1_1531443641946_7516700_1221-280439-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 21:27:23,334 WARNING [train.py:1204] (2/4) Exclude cut with ID _59202_210_26984_1_1534816810612_3549174_469-213482-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 21:27:23,335 WARNING [train.py:1204] (2/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_148-158509-0_sp0.9 from training. Duration: 0.4555625 2023-10-03 21:27:24,768 WARNING [train.py:1204] (2/4) Exclude cut with ID _24314_210_4915_1_1532135078116_7170179_863-259095-0_sp1.1 from training. Duration: 0.97275 2023-10-03 21:27:24,830 WARNING [train.py:1204] (2/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_321-22821-0 from training. Duration: 0.89 2023-10-03 21:27:24,857 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_43831_210_2158_1_1533725502_5872791_55-163137-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 21:27:24,902 WARNING [train.py:1204] (2/4) Exclude cut with ID _27430_210_7770_1_1532604689375_1378590_26-25207-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 21:27:26,258 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_17613_210_2151_1_1531137569_2366160_223-316461-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 21:27:28,176 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_53679_210_4852_1_1534556726_4186141_541-213370-0_sp1.1 from training. Duration: 0.963625 2023-10-03 21:27:28,259 WARNING [train.py:1204] (2/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_369-295086-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 21:27:29,616 WARNING [train.py:1204] (2/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_91-267696-0 from training. Duration: 0.87 2023-10-03 21:27:29,662 WARNING [train.py:1204] (2/4) Exclude cut with ID _5294_210_1747_1_1527249643928_3696179_180-100713-0 from training. Duration: 0.81 2023-10-03 21:27:30,931 WARNING [train.py:1204] (2/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_32-335103-0 from training. Duration: 0.82 2023-10-03 21:27:30,945 WARNING [train.py:1204] (2/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_133-312727-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 21:27:32,425 WARNING [train.py:1204] (2/4) Exclude cut with ID _59288_210_5854_1_1535077757774_3412246_391-165934-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 21:27:33,672 WARNING [train.py:1204] (2/4) Exclude cut with ID _28323_210_16187_1_1532480395667_4100300_488-1281-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 21:27:33,746 WARNING [train.py:1204] (2/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_124-193207-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 21:27:39,888 INFO [train.py:1046] (2/4) Epoch 40, batch 4650, loss[loss=0.1482, simple_loss=0.2331, pruned_loss=0.03163, over 24664.00 frames. ], tot_loss[loss=0.1555, simple_loss=0.2356, pruned_loss=0.03773, over 4729219.98 frames. ], batch size: 65, lr: 2.53e-03, grad_scale: 16.0 2023-10-03 21:27:41,837 WARNING [train.py:1204] (2/4) Exclude cut with ID _14087_210_2253_1_1530529224512_3014029_226-242140-0_sp0.9 from training. Duration: 0.9 2023-10-03 21:27:44,524 WARNING [train.py:1204] (2/4) Exclude cut with ID _29277_210_3279_1_1532581109293_3173069_392-3694-0_sp1.1 from training. Duration: 0.936375 2023-10-03 21:27:45,773 WARNING [train.py:1204] (2/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_563-329842-0_sp1.1 from training. Duration: 0.97275 2023-10-03 21:27:47,080 WARNING [train.py:1204] (2/4) Exclude cut with ID _58583_210_5236_1_1534726835266_3574249_391-87755-0_sp1.1 from training. Duration: 0.92725 2023-10-03 21:27:47,105 WARNING [train.py:1204] (2/4) Exclude cut with ID _28427_210_4220_1_1532581240967_3061069_281-237827-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 21:27:47,128 WARNING [train.py:1204] (2/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_44-147898-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 21:27:49,068 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_24007_210_1794_1_1531918925_1663875_125-320772-0_sp1.1 from training. Duration: 0.97275 2023-10-03 21:27:50,547 WARNING [train.py:1204] (2/4) Exclude cut with ID _14368_210_4220_1_1530532714870_3595529_143-117421-0 from training. Duration: 0.58 2023-10-03 21:27:53,403 WARNING [train.py:1204] (2/4) Exclude cut with ID _69584_210_5219_1_1535785133775_4044450_325-7656-0_sp0.9 from training. Duration: 0.9555625 2023-10-03 21:27:55,021 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.bypass.skip_rate, batch_count=1412226.6666666667, ans=0.07 2023-10-03 21:27:56,083 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_37351_210_7925_1_1533012931_3812113_265-85780-0 from training. Duration: 0.88 2023-10-03 21:27:56,107 WARNING [train.py:1204] (2/4) Exclude cut with ID _18516_210_9206_1_1531704653206_3692406_331-237896-0_sp1.1 from training. Duration: 0.936375 2023-10-03 21:27:57,989 WARNING [train.py:1204] (2/4) Exclude cut with ID _14629_210_15709_1_1530604298182_956899_6-232695-0 from training. Duration: 0.94 2023-10-03 21:27:58,023 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7544_210_6753_1_1528587000_691468_87-178599-0_sp1.1 from training. Duration: 0.7909375 2023-10-03 21:27:59,295 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_326-303945-0 from training. Duration: 0.99 2023-10-03 21:27:59,318 WARNING [train.py:1204] (2/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_503-30719-0 from training. Duration: 0.98 2023-10-03 21:27:59,332 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_11305_210_1794_1_1529750428_7178148_299-344640-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 21:28:00,684 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_53071_210_16554_1_1534490405_2954066_265-170472-0_sp1.1 from training. Duration: 0.7545625 2023-10-03 21:28:00,879 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.nonlin_attention.balancer.max_positive, batch_count=1412226.6666666667, ans=0.95 2023-10-03 21:28:04,070 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_174-277364-0_sp1.1 from training. Duration: 0.6454375 2023-10-03 21:28:06,546 WARNING [train.py:1204] (2/4) Exclude cut with ID _58414_210_8254_1_1535421609162_7151182_193-151884-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 21:28:06,578 WARNING [train.py:1204] (2/4) Exclude cut with ID IC0651W0104-322063 from training. Duration: 0.768 2023-10-03 21:28:06,892 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.balancer1.prob, batch_count=1412226.6666666667, ans=0.125 2023-10-03 21:28:09,403 WARNING [train.py:1204] (2/4) Exclude cut with ID _21881_210_3525_1_1531825242757_3798380_139-305982-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 21:28:11,107 WARNING [train.py:1204] (2/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_79-200392-0 from training. Duration: 0.97 2023-10-03 21:28:13,824 WARNING [train.py:1204] (2/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_451-280894-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 21:28:13,838 WARNING [train.py:1204] (2/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_304-273433-0_sp1.1 from training. Duration: 0.9 2023-10-03 21:28:15,205 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_5959_210_1794_1_1527400400_3445726_412-44052-0 from training. Duration: 0.77 2023-10-03 21:28:15,308 WARNING [train.py:1204] (2/4) Exclude cut with ID _4681_210_3525_1_1527251040321_1235713_148-26159-0_sp1.1 from training. Duration: 0.963625 2023-10-03 21:28:15,440 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.1.conv_module2.balancer2.min_positive, batch_count=1412293.3333333333, ans=0.05 2023-10-03 21:28:19,412 WARNING [train.py:1204] (2/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_887-249555-0_sp0.9 from training. Duration: 0.8555625 2023-10-03 21:28:19,605 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.feed_forward1.hidden_balancer.prob, batch_count=1412293.3333333333, ans=0.125 2023-10-03 21:28:24,010 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_25236_210_2158_1_1532051968_280567_37-6860-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 21:28:24,705 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.3.encoder.layers.3.feed_forward2.out_whiten, num_groups=1, num_channels=512, metric=5.11 vs. limit=15.0 2023-10-03 21:28:28,248 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.592e+02 1.889e+02 2.069e+02 2.261e+02 3.657e+02, threshold=4.138e+02, percent-clipped=0.0 2023-10-03 21:28:28,346 WARNING [train.py:1204] (2/4) Exclude cut with ID _29277_210_3279_1_1532581109293_3173069_417-115527-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 21:28:30,335 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.3.encoder.layers.3.whiten, num_groups=1, num_channels=512, metric=3.38 vs. limit=12.0 2023-10-03 21:28:31,513 WARNING [train.py:1204] (2/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_552-134748-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 21:28:31,585 WARNING [train.py:1204] (2/4) Exclude cut with ID _43176_210_8254_1_1533524417469_4174339_188-174143-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 21:28:32,878 WARNING [train.py:1204] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_127-119109-0_sp1.1 from training. Duration: 0.6545625 2023-10-03 21:28:34,327 WARNING [train.py:1204] (2/4) Exclude cut with ID _14922_210_6228_1_1530707435273_3693310_38-187851-0 from training. Duration: 0.62 2023-10-03 21:28:34,364 WARNING [train.py:1204] (2/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_588-125165-0 from training. Duration: 0.83 2023-10-03 21:28:36,255 WARNING [train.py:1204] (2/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_344-152526-0_sp1.1 from training. Duration: 0.4818125 2023-10-03 21:28:36,257 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7002_210_1794_1_1527935634_4034444_487-236501-0 from training. Duration: 0.66 2023-10-03 21:28:38,808 WARNING [train.py:1204] (2/4) Exclude cut with ID _23825_210_13449_1_1531915313526_3497150_379-16981-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 21:28:46,181 WARNING [train.py:1204] (2/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_297-5233-0_sp1.1 from training. Duration: 0.863625 2023-10-03 21:28:46,185 WARNING [train.py:1204] (2/4) Exclude cut with ID _41298_210_9019_1_1533430371450_4822143_178-283538-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 21:28:46,224 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7046_210_6753_1_1527895337_6920917_642-106799-0 from training. Duration: 0.92 2023-10-03 21:28:47,451 WARNING [train.py:1204] (2/4) Exclude cut with ID _6274_210_2084_1_1527937583837_7494020_532-291952-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 21:28:50,172 WARNING [train.py:1204] (2/4) Exclude cut with ID _47278_210_12041_1_1533902418349_3576110_260-270468-0_sp1.1 from training. Duration: 0.92725 2023-10-03 21:28:50,178 WARNING [train.py:1204] (2/4) Exclude cut with ID _14368_210_4220_1_1530532714870_3595529_144-3287-0_sp0.9 from training. Duration: 0.7444375 2023-10-03 21:28:50,291 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_162-262717-0_sp0.9 from training. Duration: 0.92225 2023-10-03 21:28:51,709 WARNING [train.py:1204] (2/4) Exclude cut with ID _3465_210_3525_1_1526175667060_3574049_62-335552-0_sp0.9 from training. Duration: 0.9666875 2023-10-03 21:28:52,972 INFO [train.py:1046] (2/4) Epoch 40, batch 4700, loss[loss=0.1719, simple_loss=0.2571, pruned_loss=0.04337, over 24442.00 frames. ], tot_loss[loss=0.156, simple_loss=0.2361, pruned_loss=0.03793, over 4731933.41 frames. ], batch size: 77, lr: 2.53e-03, grad_scale: 16.0 2023-10-03 21:28:53,019 WARNING [train.py:1204] (2/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_726-272852-0_sp1.1 from training. Duration: 0.92725 2023-10-03 21:28:53,098 WARNING [train.py:1204] (2/4) Exclude cut with ID _14898_210_9206_1_1530766812377_4177190_26-135492-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 21:28:56,485 WARNING [train.py:1204] (2/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_200-95304-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 21:28:56,529 WARNING [train.py:1204] (2/4) Exclude cut with ID _45012_210_12651_1_1533608435136_5371380_240-37101-0_sp1.1 from training. Duration: 0.8454375 2023-10-03 21:28:56,537 WARNING [train.py:1204] (2/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_132-269633-0_sp0.9 from training. Duration: 0.7555625 2023-10-03 21:28:57,787 WARNING [train.py:1204] (2/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_907-151398-0_sp0.9 from training. Duration: 0.411125 2023-10-03 21:28:59,185 WARNING [train.py:1204] (2/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_45-265684-0_sp0.9 from training. Duration: 0.8 2023-10-03 21:28:59,385 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.attention_skip_rate, batch_count=1412493.3333333333, ans=0.0 2023-10-03 21:29:00,468 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6291_210_3273_1_1527683649_1078498_74-292608-0 from training. Duration: 0.72 2023-10-03 21:29:05,357 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.conv_module2.balancer1.min_positive, batch_count=1412493.3333333333, ans=0.025 2023-10-03 21:29:07,813 WARNING [train.py:1204] (2/4) Exclude cut with ID _59202_210_26984_1_1534816810612_3549174_221-84819-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 21:29:09,163 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_33696_210_3852_1_1533082780_2663375_181-325595-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 21:29:09,201 WARNING [train.py:1204] (2/4) Exclude cut with ID _47251_210_18628_1_1533814063128_4137250_351-154105-0_sp1.1 from training. Duration: 0.963625 2023-10-03 21:29:10,595 WARNING [train.py:1204] (2/4) Exclude cut with ID _35501_210_9288_1_1532998507986_3192189_351-334740-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 21:29:10,854 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.balancer2.prob, batch_count=1412560.0, ans=0.125 2023-10-03 21:29:11,965 WARNING [train.py:1204] (2/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_342-100157-0_sp1.1 from training. Duration: 0.4909375 2023-10-03 21:29:16,726 WARNING [train.py:1204] (2/4) Exclude cut with ID _6789_210_1534_1_1527857937218_3118950_262-48853-0 from training. Duration: 0.56 2023-10-03 21:29:16,764 WARNING [train.py:1204] (2/4) Exclude cut with ID _69884_210_9771_1_1535846392818_4166169_430-201020-0 from training. Duration: 0.97 2023-10-03 21:29:18,192 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_71-136518-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 21:29:19,477 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_28-291982-0_sp1.1 from training. Duration: 0.8818125 2023-10-03 21:29:19,510 WARNING [train.py:1204] (2/4) Exclude cut with ID _54218_210_12210_1_1534413646388_4409259_437-275326-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 21:29:21,009 WARNING [train.py:1204] (2/4) Exclude cut with ID _55134_210_21982_1_1534590028377_3882170_40-309139-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 21:29:26,989 WARNING [train.py:1204] (2/4) Exclude cut with ID _32704_210_8954_1_1533017488840_6747370_266-329755-0_sp1.1 from training. Duration: 0.8090625 2023-10-03 21:29:27,068 WARNING [train.py:1204] (2/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_21-174514-0_sp1.1 from training. Duration: 0.3909375 2023-10-03 21:29:30,383 WARNING [train.py:1204] (2/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_409-207378-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 21:29:33,748 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.4.encoder.layers.0.self_attn1.whiten, num_groups=1, num_channels=384, metric=14.42 vs. limit=22.5 2023-10-03 21:29:36,444 WARNING [train.py:1204] (2/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_132-269633-0 from training. Duration: 0.68 2023-10-03 21:29:37,785 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_2245_210_2715_1_1525511548_3540254_310-52483-0_sp0.9 from training. Duration: 0.97775 2023-10-03 21:29:39,274 WARNING [train.py:1204] (2/4) Exclude cut with ID _27430_210_7770_1_1532604689375_1378590_47-77403-0_sp1.1 from training. Duration: 0.92725 2023-10-03 21:29:43,639 WARNING [train.py:1204] (2/4) Exclude cut with ID _7562_210_2084_1_1528887767916_3946229_186-326422-0 from training. Duration: 0.84 2023-10-03 21:29:45,426 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_14056_210_4163_1_1530604483_7572827_230-35993-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 21:29:48,464 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.conv_module1.balancer1.min_positive, batch_count=1412693.3333333333, ans=0.025 2023-10-03 21:29:49,632 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_23854_210_1794_1_1531969696_1329157_57-123491-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 21:29:49,680 WARNING [train.py:1204] (2/4) Exclude cut with ID _14747_210_8341_1_1530685797463_3654089_147-131229-0 from training. Duration: 0.76 2023-10-03 21:29:49,984 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=1412693.3333333333, ans=0.1 2023-10-03 21:29:51,058 WARNING [train.py:1204] (2/4) Exclude cut with ID _30191_210_1664_1_1533189915047_7196670_430-19539-0_sp1.1 from training. Duration: 0.92725 2023-10-03 21:29:51,089 WARNING [train.py:1204] (2/4) Exclude cut with ID _67725_210_27767_1_1535608756671_5105359_44-50762-0_sp1.1 from training. Duration: 0.963625 2023-10-03 21:29:55,708 WARNING [train.py:1204] (2/4) Exclude cut with ID _27485_210_4916_1_1532739191677_3668269_217-161755-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 21:29:55,759 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_5931_210_2040_1_1527428481_4547999_321-193614-0_sp1.1 from training. Duration: 0.6090625 2023-10-03 21:29:55,775 WARNING [train.py:1204] (2/4) Exclude cut with ID _6267_210_2084_1_1527764658526_3894730_375-219887-0 from training. Duration: 0.67 2023-10-03 21:29:57,021 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9087W0022-538393 from training. Duration: 0.768 2023-10-03 21:29:58,523 WARNING [train.py:1204] (2/4) Exclude cut with ID _68777_210_4353_1_1535810390731_3117904_83-196459-0_sp1.1 from training. Duration: 0.963625 2023-10-03 21:29:58,647 WARNING [train.py:1204] (2/4) Exclude cut with ID _69884_210_9771_1_1535846392818_4166169_455-343812-0_sp1.1 from training. Duration: 0.92725 2023-10-03 21:29:58,647 WARNING [train.py:1204] (2/4) Exclude cut with ID _13916_210_9994_1_1530788341126_3763149_414-320174-0_sp1.1 from training. Duration: 0.92725 2023-10-03 21:29:58,651 WARNING [train.py:1204] (2/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_398-239819-0 from training. Duration: 0.99 2023-10-03 21:30:00,579 WARNING [train.py:1204] (2/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_199-232314-0_sp1.1 from training. Duration: 0.92725 2023-10-03 21:30:05,716 WARNING [train.py:1204] (2/4) Exclude cut with ID _2334_210_2667_1_1525608238880_4705129_459-104997-0 from training. Duration: 0.95 2023-10-03 21:30:07,643 INFO [train.py:1046] (2/4) Epoch 40, batch 4750, loss[loss=0.1486, simple_loss=0.2351, pruned_loss=0.03105, over 24489.00 frames. ], tot_loss[loss=0.1574, simple_loss=0.2372, pruned_loss=0.03877, over 4703281.20 frames. ], batch size: 66, lr: 2.53e-03, grad_scale: 8.0 2023-10-03 21:30:07,812 WARNING [train.py:1204] (2/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_441-209395-0_sp1.1 from training. Duration: 0.936375 2023-10-03 21:30:08,614 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.2.encoder.layers.1.feed_forward1.out_whiten, num_groups=1, num_channels=384, metric=6.20 vs. limit=15.0 2023-10-03 21:30:09,174 WARNING [train.py:1204] (2/4) Exclude cut with ID _27052_210_5986_1_1532329197187_3585009_320-80393-0_sp1.1 from training. Duration: 0.97275 2023-10-03 21:30:13,278 WARNING [train.py:1204] (2/4) Exclude cut with ID _21783_210_11357_1_1531742458730_3762679_89-217253-0_sp1.1 from training. Duration: 0.97275 2023-10-03 21:30:13,293 WARNING [train.py:1204] (2/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_453-282588-0_sp1.1 from training. Duration: 0.8909375 2023-10-03 21:30:13,456 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.ff2_skip_rate, batch_count=1412826.6666666667, ans=0.0 2023-10-03 21:30:14,743 WARNING [train.py:1204] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_88-341718-0 from training. Duration: 0.66 2023-10-03 21:30:14,911 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.conv_module2.balancer1.prob, batch_count=1412826.6666666667, ans=0.125 2023-10-03 21:30:16,046 WARNING [train.py:1204] (2/4) Exclude cut with ID _43552_210_8913_1_1533726031059_3523429_230-3521-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 21:30:18,018 WARNING [train.py:1204] (2/4) Exclude cut with ID _9679_210_7497_1_1529129212906_4363929_512-313889-0 from training. Duration: 0.81 2023-10-03 21:30:18,220 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.conv_module2.balancer1.prob, batch_count=1412826.6666666667, ans=0.125 2023-10-03 21:30:19,542 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.conv_module2.balancer1.prob, batch_count=1412826.6666666667, ans=0.125 2023-10-03 21:30:20,695 WARNING [train.py:1204] (2/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_194-252800-0_sp1.1 from training. Duration: 0.8818125 2023-10-03 21:30:20,722 WARNING [train.py:1204] (2/4) Exclude cut with ID _55209_210_1531_1_1534675828996_4597160_239-293541-0_sp1.1 from training. Duration: 0.963625 2023-10-03 21:30:20,767 WARNING [train.py:1204] (2/4) Exclude cut with ID _35799_210_5854_1_1533263487887_3529071_15-325131-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 21:30:21,029 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.conv_module2.balancer1.max_abs, batch_count=1412893.3333333333, ans=10.0 2023-10-03 21:30:25,046 WARNING [train.py:1204] (2/4) Exclude cut with ID _14100_210_8341_1_1530529320613_3678069_279-55760-0 from training. Duration: 0.93 2023-10-03 21:30:29,880 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_489-230703-0_sp1.1 from training. Duration: 0.77275 2023-10-03 21:30:31,292 WARNING [train.py:1204] (2/4) Exclude cut with ID _6694_210_4599_1_1527852637490_3965529_144-185051-0 from training. Duration: 0.96 2023-10-03 21:30:32,640 WARNING [train.py:1204] (2/4) Exclude cut with ID _28461_210_2253_1_1532771775189_3041299_1-228155-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 21:30:35,323 WARNING [train.py:1204] (2/4) Exclude cut with ID _32704_210_8954_1_1533017488840_6747370_686-252323-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 21:30:35,325 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_40896_210_15710_1_1533433956_7100802_1075-294633-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 21:30:36,655 WARNING [train.py:1204] (2/4) Exclude cut with ID _22083_210_8254_1_1531702862439_3678970_30-92879-0_sp1.1 from training. Duration: 0.97275 2023-10-03 21:30:36,735 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9245W0121-616888 from training. Duration: 0.896 2023-10-03 21:30:36,738 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6786_210_1388_1_1527839699_4194495_378-46070-0 from training. Duration: 0.72 2023-10-03 21:30:44,175 WARNING [train.py:1204] (2/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_317-175062-0 from training. Duration: 0.82 2023-10-03 21:30:47,514 WARNING [train.py:1204] (2/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_238-89390-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 21:30:48,940 WARNING [train.py:1204] (2/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_524-310811-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 21:30:50,380 WARNING [train.py:1204] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_307-158462-0_sp0.9 from training. Duration: 0.7666875 2023-10-03 21:30:50,381 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9309W0191-640318 from training. Duration: 0.512 2023-10-03 21:30:50,385 WARNING [train.py:1204] (2/4) Exclude cut with ID _55153_210_2253_1_1534384786677_3162096_100-204617-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 21:30:53,215 WARNING [train.py:1204] (2/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_112-149213-0_sp1.1 from training. Duration: 0.863625 2023-10-03 21:30:56,037 WARNING [train.py:1204] (2/4) Exclude cut with ID _67831_210_12392_1_1535612464202_3092612_346-2148-0_sp1.1 from training. Duration: 0.8454375 2023-10-03 21:30:56,214 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.1.feed_forward2.hidden_balancer.prob, batch_count=1413026.6666666667, ans=0.125 2023-10-03 21:30:57,303 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.633e+02 1.943e+02 2.101e+02 2.308e+02 3.375e+02, threshold=4.202e+02, percent-clipped=0.0 2023-10-03 21:30:59,356 WARNING [train.py:1204] (2/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_51-117576-0_sp0.9 from training. Duration: 0.4444375 2023-10-03 21:30:59,396 WARNING [train.py:1204] (2/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_300-27235-0 from training. Duration: 0.87 2023-10-03 21:31:00,694 WARNING [train.py:1204] (2/4) Exclude cut with ID _40399_210_4353_1_1533305658194_3279406_195-116825-0_sp1.1 from training. Duration: 0.97275 2023-10-03 21:31:00,722 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7002_210_1794_1_1527939683_5157286_163-87552-0_sp0.9 from training. Duration: 0.9555625 2023-10-03 21:31:00,757 WARNING [train.py:1204] (2/4) Exclude cut with ID _19514_210_9259_1_1531530071330_5084230_560-237597-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 21:31:02,624 WARNING [train.py:1204] (2/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_41-234353-0_sp1.1 from training. Duration: 0.6181875 2023-10-03 21:31:03,922 WARNING [train.py:1204] (2/4) Exclude cut with ID _6274_210_2084_1_1527937583837_7494020_769-168302-0 from training. Duration: 0.71 2023-10-03 21:31:06,797 WARNING [train.py:1204] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_574-45541-0 from training. Duration: 0.65 2023-10-03 21:31:09,556 WARNING [train.py:1204] (2/4) Exclude cut with ID _50310_210_15488_1_1534068043345_3641080_262-67120-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 21:31:12,340 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_53071_210_16554_1_1534493434_1276796_35-11927-0_sp1.1 from training. Duration: 0.963625 2023-10-03 21:31:12,342 WARNING [train.py:1204] (2/4) Exclude cut with ID _3992_210_1747_1_1526644816031_3731060_107-251434-0 from training. Duration: 0.99 2023-10-03 21:31:12,389 WARNING [train.py:1204] (2/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_412-54488-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 21:31:12,479 WARNING [train.py:1204] (2/4) Exclude cut with ID _18516_210_9206_1_1531704653206_3692406_77-258490-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 21:31:15,755 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_40047_210_2290_1_1533290519_2374311_195-165693-0_sp0.9 from training. Duration: 0.988875 2023-10-03 21:31:15,804 WARNING [train.py:1204] (2/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_296-36971-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 21:31:17,128 WARNING [train.py:1204] (2/4) Exclude cut with ID _14970_210_15709_1_1530773543493_1715730_187-20259-0_sp1.1 from training. Duration: 0.8090625 2023-10-03 21:31:20,243 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_53373_210_5399_1_1534229471_4715695_135-305927-0_sp1.1 from training. Duration: 0.936375 2023-10-03 21:31:21,560 INFO [train.py:1046] (2/4) Epoch 40, batch 4800, loss[loss=0.1642, simple_loss=0.2501, pruned_loss=0.03916, over 23736.00 frames. ], tot_loss[loss=0.1576, simple_loss=0.2376, pruned_loss=0.03883, over 4712659.41 frames. ], batch size: 85, lr: 2.53e-03, grad_scale: 16.0 2023-10-03 21:31:21,610 WARNING [train.py:1204] (2/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_60-18123-0 from training. Duration: 0.88 2023-10-03 21:31:21,692 WARNING [train.py:1204] (2/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_166-166990-0 from training. Duration: 0.96 2023-10-03 21:31:22,957 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_5438_210_1794_1_1527336687_2600009_233-58072-0 from training. Duration: 0.77 2023-10-03 21:31:23,068 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder_embed.conv.2.prob, batch_count=1413160.0, ans=0.125 2023-10-03 21:31:24,424 WARNING [train.py:1204] (2/4) Exclude cut with ID _8833_210_5219_1_1528592665210_7553059_611-80313-0_sp0.9 from training. Duration: 0.711125 2023-10-03 21:31:25,802 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_53555_210_5632_1_1534595220_7232297_118-43239-0_sp1.1 from training. Duration: 0.936375 2023-10-03 21:31:26,030 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.conv_module1.balancer1.prob, batch_count=1413160.0, ans=0.125 2023-10-03 21:31:27,074 WARNING [train.py:1204] (2/4) Exclude cut with ID _12369_210_4381_1_1530356078581_4388520_480-13146-0 from training. Duration: 0.85 2023-10-03 21:31:31,719 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_892-28814-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 21:31:33,733 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_24899_210_16554_1_1532046422_3827462_160-21255-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 21:31:37,795 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_154-316450-0_sp1.1 from training. Duration: 0.6909375 2023-10-03 21:31:39,184 WARNING [train.py:1204] (2/4) Exclude cut with ID _30196_210_1664_1_1533708496522_7074140_1225-301232-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 21:31:39,221 WARNING [train.py:1204] (2/4) Exclude cut with ID _28783_210_7943_1_1532671164808_7155479_414-208100-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 21:31:40,594 WARNING [train.py:1204] (2/4) Exclude cut with ID _19023_210_4353_1_1531378892552_5820780_83-254794-0 from training. Duration: 0.86 2023-10-03 21:31:40,659 WARNING [train.py:1204] (2/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_591-83752-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 21:31:42,016 WARNING [train.py:1204] (2/4) Exclude cut with ID _29877_210_6228_1_1532397791106_5276149_266-62291-0_sp1.1 from training. Duration: 0.7909375 2023-10-03 21:31:43,407 WARNING [train.py:1204] (2/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_280-161633-0_sp1.1 from training. Duration: 0.663625 2023-10-03 21:31:48,069 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7543_210_6753_1_1528539468_333764_39-156634-0_sp1.1 from training. Duration: 0.97275 2023-10-03 21:31:48,187 WARNING [train.py:1204] (2/4) Exclude cut with ID _67499_210_17282_1_1535509805220_3692430_505-312248-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 21:31:48,218 WARNING [train.py:1204] (2/4) Exclude cut with ID _5294_210_1747_1_1527249643928_3696179_180-100713-0_sp0.9 from training. Duration: 0.9 2023-10-03 21:31:49,689 WARNING [train.py:1204] (2/4) Exclude cut with ID _26528_210_9070_1_1532570410842_3851310_508-148132-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 21:31:49,701 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_211-339622-0_sp1.1 from training. Duration: 0.4090625 2023-10-03 21:31:49,713 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_40896_210_15710_1_1533433956_7100802_339-138356-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 21:31:51,534 WARNING [train.py:1204] (2/4) Exclude cut with ID _27060_210_5986_1_1532674838922_3522549_211-19933-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 21:31:53,035 WARNING [train.py:1204] (2/4) Exclude cut with ID _60229_210_27767_1_1534838406158_3655350_457-117923-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 21:31:55,720 WARNING [train.py:1204] (2/4) Exclude cut with ID _55429_210_10626_1_1534467588527_7174143_460-141439-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 21:31:58,399 WARNING [train.py:1204] (2/4) Exclude cut with ID _59170_210_6721_1_1534983633960_4203335_263-10780-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 21:31:58,412 WARNING [train.py:1204] (2/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_872-264525-0_sp0.9 from training. Duration: 0.72225 2023-10-03 21:32:02,852 WARNING [train.py:1204] (2/4) Exclude cut with ID _7624_210_1531_1_1528109802198_4933101_418-349834-0_sp0.9 from training. Duration: 0.6666875 2023-10-03 21:32:02,955 WARNING [train.py:1204] (2/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_27-53998-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 21:32:04,868 WARNING [train.py:1204] (2/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_202-183922-0 from training. Duration: 0.85 2023-10-03 21:32:04,888 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7002_210_1794_1_1527939683_5157286_1-281264-0 from training. Duration: 0.44 2023-10-03 21:32:06,245 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_53753_210_7234_1_1534320003_3594148_493-9314-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 21:32:06,258 WARNING [train.py:1204] (2/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_217-95056-0_sp1.1 from training. Duration: 0.8909375 2023-10-03 21:32:07,543 WARNING [train.py:1204] (2/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_406-45086-0_sp1.1 from training. Duration: 0.72725 2023-10-03 21:32:07,549 WARNING [train.py:1204] (2/4) Exclude cut with ID _31649_210_6228_1_1532584608063_4091078_505-213821-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 21:32:07,564 WARNING [train.py:1204] (2/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_317-175062-0_sp0.9 from training. Duration: 0.911125 2023-10-03 21:32:10,220 WARNING [train.py:1204] (2/4) Exclude cut with ID _5444_210_2151_1_1527418808464_5312210_726-27165-0_sp1.1 from training. Duration: 0.6090625 2023-10-03 21:32:11,529 WARNING [train.py:1204] (2/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_494-170437-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 21:32:13,033 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_474-64054-0_sp1.1 from training. Duration: 0.936375 2023-10-03 21:32:15,821 WARNING [train.py:1204] (2/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_521-7556-0_sp1.1 from training. Duration: 0.97275 2023-10-03 21:32:19,049 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_269-348505-0_sp1.1 from training. Duration: 0.92725 2023-10-03 21:32:20,614 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.self_attn_weights.pos_emb_skip_rate, batch_count=1413360.0, ans=0.0 2023-10-03 21:32:22,022 WARNING [train.py:1204] (2/4) Exclude cut with ID _21878_210_3525_1_1531738772002_7264330_559-184862-0 from training. Duration: 0.94 2023-10-03 21:32:22,059 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_68702_210_23689_1_1535799577_3420068_92-76040-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 21:32:22,087 WARNING [train.py:1204] (2/4) Exclude cut with ID _22826_210_9994_1_1531908201522_3279169_227-102052-0_sp1.1 from training. Duration: 0.97275 2023-10-03 21:32:22,125 WARNING [train.py:1204] (2/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_718-250907-0_sp0.9 from training. Duration: 0.7666875 2023-10-03 21:32:23,524 WARNING [train.py:1204] (2/4) Exclude cut with ID _14069_210_5632_1_1530512692033_4803390_352-143891-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 21:32:26,702 WARNING [train.py:1204] (2/4) Exclude cut with ID _6267_210_2084_1_1527764658526_3894730_65-106602-0_sp1.1 from training. Duration: 0.936375 2023-10-03 21:32:28,072 WARNING [train.py:1204] (2/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_202-183922-0_sp0.9 from training. Duration: 0.9444375 2023-10-03 21:32:28,080 WARNING [train.py:1204] (2/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_465-268827-0_sp1.1 from training. Duration: 0.97275 2023-10-03 21:32:28,141 WARNING [train.py:1204] (2/4) Exclude cut with ID _12369_210_4381_1_1530356078581_4388520_58-29064-0_sp1.1 from training. Duration: 0.9 2023-10-03 21:32:29,451 WARNING [train.py:1204] (2/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_1-99680-0_sp0.9 from training. Duration: 0.7444375 2023-10-03 21:32:29,519 WARNING [train.py:1204] (2/4) Exclude cut with ID _13253_210_1925_1_1530757775819_3577873_191-144977-0_sp1.1 from training. Duration: 0.7090625 2023-10-03 21:32:29,662 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.balancer1.prob, batch_count=1413426.6666666667, ans=0.125 2023-10-03 21:32:31,750 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.attention_skip_rate, batch_count=1413426.6666666667, ans=0.0 2023-10-03 21:32:34,207 WARNING [train.py:1204] (2/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_276-112684-0_sp1.1 from training. Duration: 0.92725 2023-10-03 21:32:34,223 WARNING [train.py:1204] (2/4) Exclude cut with ID _64639_210_5816_1_1535367619437_3528000_358-411-0_sp1.1 from training. Duration: 0.97275 2023-10-03 21:32:34,237 WARNING [train.py:1204] (2/4) Exclude cut with ID _31649_210_6228_1_1532584608063_4091078_106-21199-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 21:32:34,555 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.conv_skip_rate, batch_count=1413426.6666666667, ans=0.0 2023-10-03 21:32:35,596 WARNING [train.py:1204] (2/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_901-79438-0 from training. Duration: 0.94 2023-10-03 21:32:37,075 WARNING [train.py:1204] (2/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_718-250907-0 from training. Duration: 0.69 2023-10-03 21:32:37,080 WARNING [train.py:1204] (2/4) Exclude cut with ID _5287_210_2084_1_1527418883040_3878670_363-194161-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 21:32:37,090 WARNING [train.py:1204] (2/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_586-321321-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 21:32:38,326 INFO [train.py:1046] (2/4) Epoch 40, batch 4850, loss[loss=0.142, simple_loss=0.2252, pruned_loss=0.02934, over 24586.00 frames. ], tot_loss[loss=0.1585, simple_loss=0.2382, pruned_loss=0.0394, over 4707834.38 frames. ], batch size: 60, lr: 2.53e-03, grad_scale: 16.0 2023-10-03 21:32:38,436 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_68900_210_2087_1_1535713157_3708529_164-292661-0_sp1.1 from training. Duration: 0.963625 2023-10-03 21:32:38,437 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_26433_210_12108_1_1532157158_4056480_114-144878-0_sp1.1 from training. Duration: 0.97275 2023-10-03 21:32:40,082 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.bypass.scale_min, batch_count=1413493.3333333333, ans=0.2 2023-10-03 21:32:41,387 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_15517_210_6753_1_1530954008_3487336_96-341874-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 21:32:41,602 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.ff2_skip_rate, batch_count=1413493.3333333333, ans=0.0 2023-10-03 21:32:47,129 WARNING [train.py:1204] (2/4) Exclude cut with ID _14268_210_9206_1_1530622771285_4714099_637-86630-0 from training. Duration: 0.51 2023-10-03 21:32:48,907 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_28211_210_6388_1_1532861506_4523224_121-320092-0_sp1.1 from training. Duration: 0.92725 2023-10-03 21:32:52,330 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_141-89113-0_sp0.9 from training. Duration: 0.9555625 2023-10-03 21:32:53,739 WARNING [train.py:1204] (2/4) Exclude cut with ID _6789_210_1534_1_1527857937218_3118950_311-61262-0_sp1.1 from training. Duration: 0.5909375 2023-10-03 21:32:53,763 WARNING [train.py:1204] (2/4) Exclude cut with ID _58480_210_12392_1_1534811121111_3646160_389-200461-0_sp1.1 from training. Duration: 0.97275 2023-10-03 21:32:57,953 WARNING [train.py:1204] (2/4) Exclude cut with ID _41326_210_5854_1_1533778076509_3492080_28-156553-0_sp1.1 from training. Duration: 0.92725 2023-10-03 21:32:59,562 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.conv_module1.balancer2.min_abs, batch_count=1413560.0, ans=0.5 2023-10-03 21:33:00,993 WARNING [train.py:1204] (2/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_577-287150-0_sp1.1 from training. Duration: 0.7454375 2023-10-03 21:33:01,081 WARNING [train.py:1204] (2/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_40-129756-0_sp1.1 from training. Duration: 0.736375 2023-10-03 21:33:01,097 WARNING [train.py:1204] (2/4) Exclude cut with ID _60113_210_16271_1_1535164212481_4386079_490-280290-0 from training. Duration: 0.98 2023-10-03 21:33:03,964 INFO [scaling.py:1022] (2/4) Whitening: name=encoder_embed.out_whiten, num_groups=1, num_channels=192, metric=7.28 vs. limit=8.0 2023-10-03 21:33:04,362 WARNING [train.py:1204] (2/4) Exclude cut with ID _58059_210_5854_1_1534901471149_3164808_34-39905-0_sp1.1 from training. Duration: 0.963625 2023-10-03 21:33:06,030 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.balancer2.prob, batch_count=1413560.0, ans=0.125 2023-10-03 21:33:07,100 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_791-197891-0_sp1.1 from training. Duration: 0.763625 2023-10-03 21:33:07,120 WARNING [train.py:1204] (2/4) Exclude cut with ID _6789_210_1534_1_1527857937218_3118950_262-48853-0_sp1.1 from training. Duration: 0.5090625 2023-10-03 21:33:08,451 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_2655_210_2087_1_1525688959_3897400_52-279899-0_sp0.9 from training. Duration: 0.7666875 2023-10-03 21:33:08,455 WARNING [train.py:1204] (2/4) Exclude cut with ID _14884_210_2252_1_1530707459689_5561250_114-201399-0 from training. Duration: 0.76 2023-10-03 21:33:12,482 WARNING [train.py:1204] (2/4) Exclude cut with ID _19023_210_4353_1_1531378892552_5820780_83-254794-0_sp0.9 from training. Duration: 0.9555625 2023-10-03 21:33:12,498 WARNING [train.py:1204] (2/4) Exclude cut with ID _53489_210_6228_1_1534298357169_3835970_10-297919-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 21:33:16,712 WARNING [train.py:1204] (2/4) Exclude cut with ID _44585_210_18628_1_1533605350387_4518199_418-3055-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 21:33:16,725 WARNING [train.py:1204] (2/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_310-109461-0 from training. Duration: 0.8 2023-10-03 21:33:16,770 WARNING [train.py:1204] (2/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_235-262124-0 from training. Duration: 0.67 2023-10-03 21:33:18,514 WARNING [train.py:1204] (2/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_195-167011-0_sp1.1 from training. Duration: 0.6818125 2023-10-03 21:33:25,335 WARNING [train.py:1204] (2/4) Exclude cut with ID _7852_210_5219_1_1528286471316_8139400_188-55162-0_sp1.1 from training. Duration: 0.936375 2023-10-03 21:33:27,300 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_35361_210_4238_1_1533262810_7793476_675-87012-0 from training. Duration: 0.89 2023-10-03 21:33:27,380 WARNING [train.py:1204] (2/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_465-81427-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 21:33:27,387 WARNING [train.py:1204] (2/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_597-154640-0_sp1.1 from training. Duration: 0.8181875 2023-10-03 21:33:28,687 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.626e+02 1.943e+02 2.210e+02 2.558e+02 3.580e+02, threshold=4.421e+02, percent-clipped=0.0 2023-10-03 21:33:28,819 WARNING [train.py:1204] (2/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_186-309161-0_sp0.9 from training. Duration: 0.6 2023-10-03 21:33:30,194 WARNING [train.py:1204] (2/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_546-119312-0 from training. Duration: 0.99 2023-10-03 21:33:30,198 WARNING [train.py:1204] (2/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_330-240060-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 21:33:30,285 WARNING [train.py:1204] (2/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_477-255782-0 from training. Duration: 0.82 2023-10-03 21:33:30,434 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.conv_module2.balancer1.prob, batch_count=1413693.3333333333, ans=0.125 2023-10-03 21:33:31,550 WARNING [train.py:1204] (2/4) Exclude cut with ID _14970_210_15709_1_1530773543493_1715730_150-248939-0_sp1.1 from training. Duration: 0.97275 2023-10-03 21:33:32,215 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.3.encoder.layers.3.feed_forward2.out_whiten, num_groups=1, num_channels=512, metric=8.46 vs. limit=15.0 2023-10-03 21:33:33,391 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_556-206615-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 21:33:34,092 WARNING [train.py:1204] (2/4) Exclude cut with ID _3465_210_3525_1_1526175667060_3574049_308-231735-0 from training. Duration: 0.73 2023-10-03 21:33:43,692 WARNING [train.py:1204] (2/4) Exclude cut with ID _27485_210_4916_1_1532739191677_3668269_66-236571-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 21:33:48,013 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_2221_210_1988_1_1525597051_4935541_415-296882-0_sp0.9 from training. Duration: 0.8555625 2023-10-03 21:33:49,321 WARNING [train.py:1204] (2/4) Exclude cut with ID _64521_210_14699_1_1535243427259_3815328_231-78630-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 21:33:52,407 INFO [train.py:1046] (2/4) Epoch 40, batch 4900, loss[loss=0.1481, simple_loss=0.2199, pruned_loss=0.03821, over 23651.00 frames. ], tot_loss[loss=0.1578, simple_loss=0.237, pruned_loss=0.03928, over 4704832.28 frames. ], batch size: 149, lr: 2.53e-03, grad_scale: 16.0 2023-10-03 21:33:55,119 WARNING [train.py:1204] (2/4) Exclude cut with ID _13942_210_9988_1_1530604906036_3733360_499-55676-0 from training. Duration: 0.62 2023-10-03 21:33:55,121 WARNING [train.py:1204] (2/4) Exclude cut with ID _49376_210_5986_1_1533985278732_3370740_301-154763-0_sp1.1 from training. Duration: 0.8909375 2023-10-03 21:33:59,939 WARNING [train.py:1204] (2/4) Exclude cut with ID _27485_210_4916_1_1532739191677_3668269_255-216698-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 21:34:01,296 WARNING [train.py:1204] (2/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_137-334143-0_sp1.1 from training. Duration: 0.97275 2023-10-03 21:34:01,321 WARNING [train.py:1204] (2/4) Exclude cut with ID _6456_210_1534_1_1527681634212_3181049_215-309359-0_sp1.1 from training. Duration: 0.8 2023-10-03 21:34:04,931 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.2.encoder.layers.0.nonlin_attention.whiten1, num_groups=1, num_channels=288, metric=8.42 vs. limit=10.0 2023-10-03 21:34:05,939 WARNING [train.py:1204] (2/4) Exclude cut with ID _65640_210_6310_1_1535368201445_3173250_209-13008-0 from training. Duration: 0.99 2023-10-03 21:34:09,459 WARNING [train.py:1204] (2/4) Exclude cut with ID _12369_210_4381_1_1530356078581_4388520_58-29064-0 from training. Duration: 0.99 2023-10-03 21:34:13,507 WARNING [train.py:1204] (2/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_410-287857-0 from training. Duration: 0.93 2023-10-03 21:34:14,837 WARNING [train.py:1204] (2/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_153-153537-0 from training. Duration: 0.91 2023-10-03 21:34:14,869 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7544_210_6753_1_1528587722_3576686_172-171750-0_sp0.9 from training. Duration: 0.72225 2023-10-03 21:34:14,910 WARNING [train.py:1204] (2/4) Exclude cut with ID _64521_210_14699_1_1535243427259_3815328_464-183853-0_sp1.1 from training. Duration: 0.97275 2023-10-03 21:34:14,924 WARNING [train.py:1204] (2/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_385-210308-0_sp1.1 from training. Duration: 0.963625 2023-10-03 21:34:14,931 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_46013_210_7234_1_1533776362_3744032_354-261865-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 21:34:14,938 WARNING [train.py:1204] (2/4) Exclude cut with ID _3067_210_2667_1_1526212974640_4503168_250-285937-0_sp1.1 from training. Duration: 0.72725 2023-10-03 21:34:15,007 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_40036_210_13405_1_1533648647_1315388_48-146016-0 from training. Duration: 0.93 2023-10-03 21:34:17,725 WARNING [train.py:1204] (2/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_280-161633-0 from training. Duration: 0.73 2023-10-03 21:34:18,968 WARNING [train.py:1204] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_308-161067-0_sp0.9 from training. Duration: 0.7333125 2023-10-03 21:34:20,887 WARNING [train.py:1204] (2/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_68-212801-0_sp0.9 from training. Duration: 0.811125 2023-10-03 21:34:20,950 WARNING [train.py:1204] (2/4) Exclude cut with ID _6789_210_1534_1_1527857937218_3118950_311-61262-0_sp0.9 from training. Duration: 0.72225 2023-10-03 21:34:23,629 WARNING [train.py:1204] (2/4) Exclude cut with ID _14632_210_9988_1_1530691279392_3923200_102-204477-0_sp1.1 from training. Duration: 0.8818125 2023-10-03 21:34:24,934 WARNING [train.py:1204] (2/4) Exclude cut with ID _65242_210_18628_1_1535369393717_3823149_290-117517-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 21:34:26,993 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_69630_210_7234_1_1535788670_910989_35-337140-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 21:34:27,001 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_5618_210_1874_1_1527311008_5851_1-214501-0 from training. Duration: 0.689 2023-10-03 21:34:28,467 WARNING [train.py:1204] (2/4) Exclude cut with ID _8064_210_5399_1_1528372299291_4282973_168-50337-0_sp0.9 from training. Duration: 0.9333125 2023-10-03 21:34:29,828 WARNING [train.py:1204] (2/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_185-14438-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 21:34:29,842 WARNING [train.py:1204] (2/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_61-327062-0 from training. Duration: 0.81 2023-10-03 21:34:29,848 WARNING [train.py:1204] (2/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_98-741-0 from training. Duration: 0.8 2023-10-03 21:34:32,699 WARNING [train.py:1204] (2/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_312-204798-0 from training. Duration: 0.93 2023-10-03 21:34:34,584 WARNING [train.py:1204] (2/4) Exclude cut with ID _14998_210_5219_1_1530705582080_7474970_787-294416-0_sp0.9 from training. Duration: 0.911125 2023-10-03 21:34:34,676 WARNING [train.py:1204] (2/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_140-181176-0_sp1.1 from training. Duration: 0.836375 2023-10-03 21:34:36,018 WARNING [train.py:1204] (2/4) Exclude cut with ID _14687_210_5854_1_1530703838259_3664889_115-239046-0_sp1.1 from training. Duration: 0.6090625 2023-10-03 21:34:37,357 WARNING [train.py:1204] (2/4) Exclude cut with ID _56423_210_5854_1_1534896061049_3165969_370-230614-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 21:34:37,412 WARNING [train.py:1204] (2/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_406-275649-0_sp0.9 from training. Duration: 0.4666875 2023-10-03 21:34:37,431 WARNING [train.py:1204] (2/4) Exclude cut with ID _15150_210_8341_1_1530752411803_7256384_22-138274-0_sp1.1 from training. Duration: 0.87275 2023-10-03 21:34:39,166 WARNING [train.py:1204] (2/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_125-125818-0 from training. Duration: 0.96 2023-10-03 21:34:40,535 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder_embed.conv.5.prob, batch_count=1414026.6666666667, ans=0.125 2023-10-03 21:34:41,788 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_4399_210_1874_1_1526705485_2400757_220-47974-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 21:34:43,223 WARNING [train.py:1204] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_308-161067-0_sp1.1 from training. Duration: 0.6 2023-10-03 21:34:44,973 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.conv_module1.balancer1.prob, batch_count=1414026.6666666667, ans=0.125 2023-10-03 21:34:46,015 WARNING [train.py:1204] (2/4) Exclude cut with ID _53489_210_6228_1_1534298357169_3835970_102-57645-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 21:34:47,594 WARNING [train.py:1204] (2/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_178-177582-0 from training. Duration: 0.74 2023-10-03 21:34:49,435 WARNING [train.py:1204] (2/4) Exclude cut with ID _14349_210_8341_1_1530615710818_3442470_170-336541-0_sp1.1 from training. Duration: 0.7818125 2023-10-03 21:34:50,746 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_521-214276-0_sp0.9 from training. Duration: 0.488875 2023-10-03 21:34:50,787 WARNING [train.py:1204] (2/4) Exclude cut with ID _66070_210_7640_1_1535441393096_7805310_489-292631-0 from training. Duration: 0.79 2023-10-03 21:34:52,712 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.4.encoder.layers.1.self_attn2.whiten, num_groups=1, num_channels=384, metric=11.43 vs. limit=22.5 2023-10-03 21:34:58,184 WARNING [train.py:1204] (2/4) Exclude cut with ID _14069_210_5632_1_1530512692033_4803390_438-340023-0_sp1.1 from training. Duration: 0.936375 2023-10-03 21:34:59,558 WARNING [train.py:1204] (2/4) Exclude cut with ID _7852_210_5219_1_1528286471316_8139400_242-105336-0_sp1.1 from training. Duration: 0.6545625 2023-10-03 21:35:00,994 WARNING [train.py:1204] (2/4) Exclude cut with ID _3867_210_1883_1_1526641200575_4475830_272-215563-0 from training. Duration: 0.57 2023-10-03 21:35:01,002 WARNING [train.py:1204] (2/4) Exclude cut with ID _14368_210_4220_1_1530532714870_3595529_143-117421-0_sp0.9 from training. Duration: 0.6444375 2023-10-03 21:35:01,008 WARNING [train.py:1204] (2/4) Exclude cut with ID _9235_210_5219_1_1528891154253_8046120_30-326681-0_sp1.1 from training. Duration: 0.7545625 2023-10-03 21:35:02,484 WARNING [train.py:1204] (2/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_538-218448-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 21:35:06,473 INFO [train.py:1046] (2/4) Epoch 40, batch 4950, loss[loss=0.1503, simple_loss=0.218, pruned_loss=0.04129, over 22695.00 frames. ], tot_loss[loss=0.1566, simple_loss=0.2363, pruned_loss=0.03839, over 4726374.18 frames. ], batch size: 322, lr: 2.53e-03, grad_scale: 16.0 2023-10-03 21:35:06,629 WARNING [train.py:1204] (2/4) Exclude cut with ID _33640_210_2655_1_1533117605479_4855469_60-192970-0_sp1.1 from training. Duration: 0.963625 2023-10-03 21:35:06,636 WARNING [train.py:1204] (2/4) Exclude cut with ID _65585_210_12210_1_1535450451584_3764098_112-294301-0_sp1.1 from training. Duration: 0.8 2023-10-03 21:35:07,892 WARNING [train.py:1204] (2/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_643-37412-0_sp1.1 from training. Duration: 0.936375 2023-10-03 21:35:07,917 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_9302_210_2550_1_1528889962_4655416_638-301620-0_sp1.1 from training. Duration: 0.42725 2023-10-03 21:35:08,023 WARNING [train.py:1204] (2/4) Exclude cut with ID _3867_210_1883_1_1526641200575_4475830_272-215563-0_sp1.1 from training. Duration: 0.5181875 2023-10-03 21:35:13,045 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_53679_210_4852_1_1534556726_4186141_358-154143-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 21:35:13,056 WARNING [train.py:1204] (2/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_841-341679-0_sp0.9 from training. Duration: 0.6444375 2023-10-03 21:35:13,308 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.conv_module1.balancer1.prob, batch_count=1414160.0, ans=0.125 2023-10-03 21:35:14,409 WARNING [train.py:1204] (2/4) Exclude cut with ID _7160_210_1876_1_1527940923231_3703799_302-150728-0 from training. Duration: 0.78 2023-10-03 21:35:15,827 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_125-203105-0 from training. Duration: 0.8 2023-10-03 21:35:15,842 WARNING [train.py:1204] (2/4) Exclude cut with ID _5585_210_2667_1_1527418813324_8007340_89-332072-0_sp0.9 from training. Duration: 0.711125 2023-10-03 21:35:15,920 WARNING [train.py:1204] (2/4) Exclude cut with ID _3992_210_1747_1_1526644816031_3731060_36-147855-0 from training. Duration: 0.78 2023-10-03 21:35:17,228 WARNING [train.py:1204] (2/4) Exclude cut with ID _59256_210_5986_1_1535076085997_3589120_172-239045-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 21:35:17,238 WARNING [train.py:1204] (2/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_472-216663-0_sp1.1 from training. Duration: 0.836375 2023-10-03 21:35:17,305 WARNING [train.py:1204] (2/4) Exclude cut with ID _36512_210_6987_1_1533033143998_3126069_181-306500-0_sp1.1 from training. Duration: 0.863625 2023-10-03 21:35:17,323 WARNING [train.py:1204] (2/4) Exclude cut with ID _56524_210_9642_1_1534572011779_3619170_492-95008-0_sp1.1 from training. Duration: 0.97275 2023-10-03 21:35:20,460 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_33348_210_5632_1_1533086912_3324030_119-90326-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 21:35:20,498 WARNING [train.py:1204] (2/4) Exclude cut with ID _14368_210_4220_1_1530532714870_3595529_231-137892-0_sp1.1 from training. Duration: 0.9 2023-10-03 21:35:20,619 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8453_210_4211_1_1528502940_8562730_596-306820-0_sp1.1 from training. Duration: 0.8818125 2023-10-03 21:35:21,977 WARNING [train.py:1204] (2/4) Exclude cut with ID _27052_210_5986_1_1532329197187_3585009_50-242750-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 21:35:24,656 WARNING [train.py:1204] (2/4) Exclude cut with ID _13913_210_9994_1_1530579615187_3383897_358-98435-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 21:35:24,675 WARNING [train.py:1204] (2/4) Exclude cut with ID _27013_210_1389_1_1532437173240_3252649_399-229778-0_sp1.1 from training. Duration: 0.963625 2023-10-03 21:35:24,852 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.conv_module2.balancer1.prob, batch_count=1414226.6666666667, ans=0.125 2023-10-03 21:35:29,437 WARNING [train.py:1204] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_185-174805-0_sp0.9 from training. Duration: 0.7555625 2023-10-03 21:35:33,560 WARNING [train.py:1204] (2/4) Exclude cut with ID _65585_210_12210_1_1535450451584_3764098_196-221014-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 21:35:36,191 WARNING [train.py:1204] (2/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_202-293523-0_sp1.1 from training. Duration: 0.6545625 2023-10-03 21:35:36,456 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.conv_module1.balancer2.min_positive, batch_count=1414293.3333333333, ans=0.05 2023-10-03 21:35:37,578 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8275_210_4198_1_1528455320_3892571_213-127634-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 21:35:37,633 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_40896_210_15710_1_1533433956_7100802_921-231678-0_sp1.1 from training. Duration: 0.97275 2023-10-03 21:35:39,487 WARNING [train.py:1204] (2/4) Exclude cut with ID _38020_210_5986_1_1533112252259_7006089_493-102745-0_sp1.1 from training. Duration: 0.8909375 2023-10-03 21:35:42,279 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6846_210_6753_1_1527850767_4295158_2-315073-0 from training. Duration: 0.88 2023-10-03 21:35:44,160 WARNING [train.py:1204] (2/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_249-281349-0 from training. Duration: 0.85 2023-10-03 21:35:45,393 WARNING [train.py:1204] (2/4) Exclude cut with ID _18513_210_9206_1_1531618215223_3862620_314-83429-0_sp1.1 from training. Duration: 0.97275 2023-10-03 21:35:46,870 WARNING [train.py:1204] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_215-202973-0_sp0.9 from training. Duration: 0.97775 2023-10-03 21:35:46,879 WARNING [train.py:1204] (2/4) Exclude cut with ID _7719_210_3525_1_1528456153163_4296960_88-250259-0_sp1.1 from training. Duration: 0.763625 2023-10-03 21:35:48,288 WARNING [train.py:1204] (2/4) Exclude cut with ID _7606_210_4915_1_1528894253277_5080690_301-10732-0_sp1.1 from training. Duration: 0.736375 2023-10-03 21:35:48,297 WARNING [train.py:1204] (2/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_818-11907-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 21:35:50,168 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_5959_210_1794_1_1527400400_3445726_412-44052-0_sp1.1 from training. Duration: 0.7 2023-10-03 21:35:52,167 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.3.encoder.layers.2.nonlin_attention.whiten1, num_groups=1, num_channels=384, metric=4.52 vs. limit=10.0 2023-10-03 21:35:52,764 WARNING [train.py:1204] (2/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_6-279195-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 21:35:54,191 WARNING [train.py:1204] (2/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_252-216674-0_sp0.9 from training. Duration: 0.6 2023-10-03 21:35:56,891 WARNING [train.py:1204] (2/4) Exclude cut with ID _14368_210_4220_1_1530532714870_3595529_144-3287-0_sp1.1 from training. Duration: 0.6090625 2023-10-03 21:35:58,275 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.588e+02 2.018e+02 2.178e+02 2.432e+02 3.766e+02, threshold=4.356e+02, percent-clipped=0.0 2023-10-03 21:35:58,350 WARNING [train.py:1204] (2/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_342-3879-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 21:35:58,376 WARNING [train.py:1204] (2/4) Exclude cut with ID _33640_210_2655_1_1533117605479_4855469_584-65630-0_sp1.1 from training. Duration: 0.97275 2023-10-03 21:35:58,451 WARNING [train.py:1204] (2/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_37-189262-0 from training. Duration: 0.98 2023-10-03 21:35:58,480 WARNING [train.py:1204] (2/4) Exclude cut with ID _9389_210_1664_1_1529217337301_5289269_379-128794-0_sp0.9 from training. Duration: 0.9444375 2023-10-03 21:36:00,288 WARNING [train.py:1204] (2/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_119-207162-0_sp1.1 from training. Duration: 0.6454375 2023-10-03 21:36:02,036 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.conv_skip_rate, batch_count=1414360.0, ans=0.0 2023-10-03 21:36:03,193 WARNING [train.py:1204] (2/4) Exclude cut with ID _2941_210_2577_1_1526199673150_2489759_200-333643-0_sp1.1 from training. Duration: 0.936375 2023-10-03 21:36:04,502 WARNING [train.py:1204] (2/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_479-125611-0_sp1.1 from training. Duration: 0.87275 2023-10-03 21:36:04,512 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_3475_210_2028_1_1526193578_3622569_64-297072-0_sp1.1 from training. Duration: 0.82725 2023-10-03 21:36:04,548 WARNING [train.py:1204] (2/4) Exclude cut with ID _53565_210_10635_1_1534337218335_3820259_139-346291-0_sp1.1 from training. Duration: 0.97275 2023-10-03 21:36:05,878 WARNING [train.py:1204] (2/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_588-125165-0_sp1.1 from training. Duration: 0.7545625 2023-10-03 21:36:07,261 WARNING [train.py:1204] (2/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_938-205135-0_sp1.1 from training. Duration: 0.7909375 2023-10-03 21:36:08,655 WARNING [train.py:1204] (2/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_216-48707-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 21:36:09,964 WARNING [train.py:1204] (2/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_477-255782-0_sp1.1 from training. Duration: 0.7454375 2023-10-03 21:36:10,001 WARNING [train.py:1204] (2/4) Exclude cut with ID _16909_210_5724_1_1531292079912_4118919_583-92842-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 21:36:11,890 WARNING [train.py:1204] (2/4) Exclude cut with ID _60860_210_8254_1_1534896015875_3557169_245-256666-0 from training. Duration: 0.97 2023-10-03 21:36:16,032 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_45399_210_4852_1_1533624725_7579737_349-116173-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 21:36:17,608 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.conv_module1.balancer2.prob, batch_count=1414426.6666666667, ans=0.125 2023-10-03 21:36:20,743 INFO [train.py:1046] (2/4) Epoch 40, batch 5000, loss[loss=0.1652, simple_loss=0.2365, pruned_loss=0.04697, over 23901.00 frames. ], tot_loss[loss=0.156, simple_loss=0.2359, pruned_loss=0.03807, over 4732701.82 frames. ], batch size: 195, lr: 2.53e-03, grad_scale: 8.0 2023-10-03 21:36:20,863 WARNING [train.py:1204] (2/4) Exclude cut with ID _8699_210_2069_1_1528630346831_3960409_155-39766-0 from training. Duration: 0.78 2023-10-03 21:36:20,872 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_86-16881-0_sp1.1 from training. Duration: 0.52725 2023-10-03 21:36:28,011 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_14056_210_4163_1_1530604483_7572827_191-159384-0_sp1.1 from training. Duration: 0.97275 2023-10-03 21:36:28,021 WARNING [train.py:1204] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_307-158462-0_sp1.1 from training. Duration: 0.62725 2023-10-03 21:36:29,933 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7544_210_6753_1_1528591327_1471426_33-346031-0 from training. Duration: 0.61 2023-10-03 21:36:31,354 WARNING [train.py:1204] (2/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_112-149213-0 from training. Duration: 0.95 2023-10-03 21:36:34,085 WARNING [train.py:1204] (2/4) Exclude cut with ID _55844_210_2346_1_1534471212569_7625707_925-191342-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 21:36:35,506 WARNING [train.py:1204] (2/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_87-278686-0 from training. Duration: 0.77 2023-10-03 21:36:35,526 WARNING [train.py:1204] (2/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_415-78792-0_sp1.1 from training. Duration: 0.736375 2023-10-03 21:36:35,536 WARNING [train.py:1204] (2/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_107-332398-0_sp0.9 from training. Duration: 0.8666875 2023-10-03 21:36:35,624 WARNING [train.py:1204] (2/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_72-251255-0 from training. Duration: 0.94 2023-10-03 21:36:36,989 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_431-227273-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 21:36:37,063 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7544_210_6753_1_1528587000_691468_87-178599-0_sp0.9 from training. Duration: 0.9666875 2023-10-03 21:36:38,382 WARNING [train.py:1204] (2/4) Exclude cut with ID _13916_210_9994_1_1530788341126_3763149_60-191556-0 from training. Duration: 0.61 2023-10-03 21:36:38,391 WARNING [train.py:1204] (2/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_746-164782-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 21:36:38,423 WARNING [train.py:1204] (2/4) Exclude cut with ID _33640_210_2655_1_1533117605479_4855469_596-21066-0_sp1.1 from training. Duration: 0.92725 2023-10-03 21:36:39,885 WARNING [train.py:1204] (2/4) Exclude cut with ID _40402_210_18219_1_1533522324115_2953150_320-14078-0 from training. Duration: 0.95 2023-10-03 21:36:41,144 WARNING [train.py:1204] (2/4) Exclude cut with ID _29877_210_6228_1_1532397791106_5276149_266-62291-0 from training. Duration: 0.87 2023-10-03 21:36:41,226 WARNING [train.py:1204] (2/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_176-56120-0_sp0.9 from training. Duration: 0.92225 2023-10-03 21:36:41,270 WARNING [train.py:1204] (2/4) Exclude cut with ID _6275_210_2084_1_1528110062213_7669049_91-26919-0 from training. Duration: 0.97 2023-10-03 21:36:41,278 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_224-322394-0_sp0.9 from training. Duration: 0.7333125 2023-10-03 21:36:43,168 WARNING [train.py:1204] (2/4) Exclude cut with ID _44982_210_1511_1_1534312820108_3726029_586-297059-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 21:36:43,218 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_196-289637-0_sp1.1 from training. Duration: 0.6818125 2023-10-03 21:36:43,219 WARNING [train.py:1204] (2/4) Exclude cut with ID _18513_210_9206_1_1531618215223_3862620_402-5957-0 from training. Duration: 0.99 2023-10-03 21:36:44,486 WARNING [train.py:1204] (2/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_81-300212-0 from training. Duration: 0.6 2023-10-03 21:36:47,236 WARNING [train.py:1204] (2/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_166-128941-0 from training. Duration: 0.54 2023-10-03 21:36:47,264 WARNING [train.py:1204] (2/4) Exclude cut with ID _58059_210_5854_1_1534901471149_3164808_57-309660-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 21:36:47,332 WARNING [train.py:1204] (2/4) Exclude cut with ID _60621_210_17071_1_1535013053530_3671739_73-155814-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 21:36:48,736 WARNING [train.py:1204] (2/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_726-270780-0 from training. Duration: 0.86 2023-10-03 21:36:48,748 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8853_210_3273_1_1528633899_818968_51-187685-0_sp1.1 from training. Duration: 0.62725 2023-10-03 21:36:51,589 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_315-212794-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 21:36:51,670 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_53373_210_5399_1_1534229471_4715695_314-122795-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 21:36:53,606 WARNING [train.py:1204] (2/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_187-221566-0_sp0.9 from training. Duration: 0.47775 2023-10-03 21:36:54,851 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_12868_210_6753_1_1530702479_3610564_133-564-0 from training. Duration: 0.79 2023-10-03 21:36:54,902 WARNING [train.py:1204] (2/4) Exclude cut with ID _55153_210_2253_1_1534384786677_3162096_255-228344-0_sp1.1 from training. Duration: 0.936375 2023-10-03 21:36:57,669 WARNING [train.py:1204] (2/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_383-165234-0_sp1.1 from training. Duration: 0.8545625 2023-10-03 21:36:59,488 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.0.conv_module2.balancer2.prob, batch_count=1414626.6666666667, ans=0.125 2023-10-03 21:37:02,569 WARNING [train.py:1204] (2/4) Exclude cut with ID IC0677W0056-334889 from training. Duration: 0.768 2023-10-03 21:37:04,063 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_14953_210_2151_1_1530701710_5616165_710-165400-0_sp0.9 from training. Duration: 0.9666875 2023-10-03 21:37:05,425 WARNING [train.py:1204] (2/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_355-236205-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 21:37:05,427 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_34450_210_9547_1_1533099443_4109050_503-246893-0_sp1.1 from training. Duration: 0.963625 2023-10-03 21:37:06,850 WARNING [train.py:1204] (2/4) Exclude cut with ID _43552_210_8913_1_1533726031059_3523429_25-120161-0 from training. Duration: 0.98 2023-10-03 21:37:08,195 WARNING [train.py:1204] (2/4) Exclude cut with ID _66112_210_10635_1_1535590236305_3801484_205-311930-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 21:37:08,216 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_48049_210_5632_1_1533864553_3519937_185-281218-0_sp1.1 from training. Duration: 0.92725 2023-10-03 21:37:08,266 WARNING [train.py:1204] (2/4) Exclude cut with ID _48782_210_12658_1_1534467701544_2300422_191-332415-0_sp1.1 from training. Duration: 0.97275 2023-10-03 21:37:09,881 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=1414693.3333333333, ans=0.1 2023-10-03 21:37:10,875 WARNING [train.py:1204] (2/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_907-151398-0_sp1.1 from training. Duration: 0.336375 2023-10-03 21:37:12,635 WARNING [train.py:1204] (2/4) Exclude cut with ID _67499_210_17282_1_1535509805220_3692430_20-179346-0_sp1.1 from training. Duration: 0.8818125 2023-10-03 21:37:14,130 WARNING [train.py:1204] (2/4) Exclude cut with ID _14998_210_5219_1_1530705582080_7474970_441-91466-0_sp1.1 from training. Duration: 0.8818125 2023-10-03 21:37:16,046 WARNING [train.py:1204] (2/4) Exclude cut with ID _46681_210_9642_1_1533893005571_7603359_786-164007-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 21:37:22,002 WARNING [train.py:1204] (2/4) Exclude cut with ID _14970_210_15709_1_1530773543493_1715730_187-20259-0 from training. Duration: 0.89 2023-10-03 21:37:24,907 WARNING [train.py:1204] (2/4) Exclude cut with ID _12734_210_8341_1_1530183614296_3891929_383-339075-0_sp1.1 from training. Duration: 0.963625 2023-10-03 21:37:28,155 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.bypass.skip_rate, batch_count=1414760.0, ans=0.07 2023-10-03 21:37:33,986 WARNING [train.py:1204] (2/4) Exclude cut with ID _37086_210_9038_1_1532999540891_7021250_834-322225-0_sp1.1 from training. Duration: 0.92725 2023-10-03 21:37:34,097 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_10987_210_4163_1_1529760555_4123902_56-290559-0_sp1.1 from training. Duration: 0.963625 2023-10-03 21:37:34,104 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_92-246270-0_sp0.9 from training. Duration: 0.9444375 2023-10-03 21:37:35,410 INFO [train.py:1046] (2/4) Epoch 40, batch 5050, loss[loss=0.1442, simple_loss=0.2315, pruned_loss=0.02847, over 24487.00 frames. ], tot_loss[loss=0.1561, simple_loss=0.2362, pruned_loss=0.03797, over 4719837.36 frames. ], batch size: 63, lr: 2.53e-03, grad_scale: 8.0 2023-10-03 21:37:35,474 WARNING [train.py:1204] (2/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_56-13561-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 21:37:35,507 WARNING [train.py:1204] (2/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_393-57888-0_sp1.1 from training. Duration: 0.5818125 2023-10-03 21:37:36,807 WARNING [train.py:1204] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_168-32906-0_sp1.1 from training. Duration: 0.62725 2023-10-03 21:37:36,860 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_51225_210_3601_1_1534400634_2327383_127-207553-0_sp1.1 from training. Duration: 0.963625 2023-10-03 21:37:40,974 WARNING [train.py:1204] (2/4) Exclude cut with ID _58583_210_5236_1_1534726835266_3574249_494-203521-0_sp1.1 from training. Duration: 0.963625 2023-10-03 21:37:40,992 WARNING [train.py:1204] (2/4) Exclude cut with ID _49376_210_5986_1_1533985278732_3370740_354-344695-0 from training. Duration: 0.99 2023-10-03 21:37:42,329 WARNING [train.py:1204] (2/4) Exclude cut with ID _8021_210_7994_1_1528376451358_3653069_179-45206-0_sp1.1 from training. Duration: 0.7818125 2023-10-03 21:37:43,813 WARNING [train.py:1204] (2/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_742-154017-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 21:37:45,202 WARNING [train.py:1204] (2/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_222-198127-0_sp1.1 from training. Duration: 0.763625 2023-10-03 21:37:45,253 WARNING [train.py:1204] (2/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_235-307337-0 from training. Duration: 0.94 2023-10-03 21:37:47,265 WARNING [train.py:1204] (2/4) Exclude cut with ID _8393_210_6702_1_1528505860249_3457328_453-176500-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 21:37:48,521 WARNING [train.py:1204] (2/4) Exclude cut with ID _53565_210_10635_1_1534337218335_3820259_102-179528-0_sp1.1 from training. Duration: 0.97275 2023-10-03 21:37:49,970 WARNING [train.py:1204] (2/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_43-206757-0_sp1.1 from training. Duration: 0.6818125 2023-10-03 21:37:51,930 WARNING [train.py:1204] (2/4) Exclude cut with ID _8394_210_6702_1_1528590504316_3540189_335-274780-0_sp1.1 from training. Duration: 0.7090625 2023-10-03 21:37:51,972 WARNING [train.py:1204] (2/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_128-296581-0_sp1.1 from training. Duration: 0.67275 2023-10-03 21:37:58,961 WARNING [train.py:1204] (2/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_111-112362-0 from training. Duration: 0.91 2023-10-03 21:37:59,013 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_5522_210_3273_1_1527249606_3909096_366-172508-0_sp1.1 from training. Duration: 0.6 2023-10-03 21:38:01,938 WARNING [train.py:1204] (2/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_47-167373-0_sp1.1 from training. Duration: 0.836375 2023-10-03 21:38:01,971 WARNING [train.py:1204] (2/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_41-234353-0 from training. Duration: 0.68 2023-10-03 21:38:02,015 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6291_210_3273_1_1527683649_1078498_82-221196-0_sp0.9 from training. Duration: 0.8444375 2023-10-03 21:38:03,315 WARNING [train.py:1204] (2/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_47-4814-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 21:38:03,348 WARNING [train.py:1204] (2/4) Exclude cut with ID _43541_210_10626_1_1533796233418_3635440_40-247993-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 21:38:04,731 WARNING [train.py:1204] (2/4) Exclude cut with ID _21989_210_5854_1_1531899214931_3271360_133-44759-0_sp0.9 from training. Duration: 0.8555625 2023-10-03 21:38:04,734 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_94-195801-0 from training. Duration: 0.94 2023-10-03 21:38:04,816 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_141-89113-0 from training. Duration: 0.86 2023-10-03 21:38:06,177 WARNING [train.py:1204] (2/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_683-284578-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 21:38:07,637 WARNING [train.py:1204] (2/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_6-214815-0_sp1.1 from training. Duration: 0.77275 2023-10-03 21:38:10,255 WARNING [train.py:1204] (2/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_556-212082-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 21:38:10,526 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=1414960.0, ans=0.1 2023-10-03 21:38:11,597 WARNING [train.py:1204] (2/4) Exclude cut with ID _44902_210_16187_1_1533888006062_3792078_149-67212-0 from training. Duration: 0.87 2023-10-03 21:38:14,874 WARNING [train.py:1204] (2/4) Exclude cut with ID _61148_210_6897_1_1535094022726_4217610_333-159440-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 21:38:18,352 WARNING [train.py:1204] (2/4) Exclude cut with ID _3120_210_3525_1_1526036143112_4072980_404-249994-0 from training. Duration: 0.99 2023-10-03 21:38:18,446 WARNING [train.py:1204] (2/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_234-260682-0_sp0.9 from training. Duration: 0.9333125 2023-10-03 21:38:18,473 WARNING [train.py:1204] (2/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_374-138971-0_sp1.1 from training. Duration: 0.8545625 2023-10-03 21:38:19,936 WARNING [train.py:1204] (2/4) Exclude cut with ID _53713_210_24116_1_1534314487762_7416751_743-302085-0_sp1.1 from training. Duration: 0.92725 2023-10-03 21:38:20,007 WARNING [train.py:1204] (2/4) Exclude cut with ID _18516_210_9206_1_1531704653206_3692406_198-334870-0_sp1.1 from training. Duration: 0.836375 2023-10-03 21:38:21,973 WARNING [train.py:1204] (2/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_395-73570-0_sp1.1 from training. Duration: 0.9 2023-10-03 21:38:24,647 WARNING [train.py:1204] (2/4) Exclude cut with ID _46683_210_9642_1_1533730913492_4591116_421-19547-0_sp1.1 from training. Duration: 0.936375 2023-10-03 21:38:24,688 WARNING [train.py:1204] (2/4) Exclude cut with ID _19896_210_1883_1_1531709022662_6649569_714-8668-0_sp1.1 from training. Duration: 0.963625 2023-10-03 21:38:24,724 WARNING [train.py:1204] (2/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_49-101072-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 21:38:24,731 WARNING [train.py:1204] (2/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_220-191080-0_sp1.1 from training. Duration: 0.8909375 2023-10-03 21:38:24,758 WARNING [train.py:1204] (2/4) Exclude cut with ID _3465_210_3525_1_1526175667060_3574049_62-335552-0 from training. Duration: 0.87 2023-10-03 21:38:27,410 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.583e+02 1.936e+02 2.098e+02 2.306e+02 3.294e+02, threshold=4.196e+02, percent-clipped=0.0 2023-10-03 21:38:27,478 WARNING [train.py:1204] (2/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_111-112362-0_sp1.1 from training. Duration: 0.82725 2023-10-03 21:38:27,612 WARNING [train.py:1204] (2/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_318-59097-0_sp0.9 from training. Duration: 0.8444375 2023-10-03 21:38:32,292 WARNING [train.py:1204] (2/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_98-255710-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 21:38:32,299 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9042W0003-515609 from training. Duration: 0.896 2023-10-03 21:38:32,307 WARNING [train.py:1204] (2/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_158-250116-0_sp0.9 from training. Duration: 0.688875 2023-10-03 21:38:35,080 WARNING [train.py:1204] (2/4) Exclude cut with ID _26750_210_5665_1_1532226678442_4063230_7-346885-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 21:38:36,354 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_37839_210_7234_1_1533542462_3689303_357-227078-0_sp1.1 from training. Duration: 0.963625 2023-10-03 21:38:36,383 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9052W0119-520904 from training. Duration: 0.896 2023-10-03 21:38:36,561 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.attention_skip_rate, batch_count=1415093.3333333333, ans=0.0 2023-10-03 21:38:39,111 WARNING [train.py:1204] (2/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_249-281349-0_sp1.1 from training. Duration: 0.77275 2023-10-03 21:38:39,116 WARNING [train.py:1204] (2/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_147-293311-0 from training. Duration: 0.94 2023-10-03 21:38:39,116 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_158-21166-0_sp1.1 from training. Duration: 0.963625 2023-10-03 21:38:41,984 WARNING [train.py:1204] (2/4) Exclude cut with ID _14100_210_8341_1_1530529320613_3678069_207-332194-0_sp1.1 from training. Duration: 0.92725 2023-10-03 21:38:42,015 WARNING [train.py:1204] (2/4) Exclude cut with ID _9389_210_1664_1_1529217337301_5289269_324-211031-0_sp1.1 from training. Duration: 0.963625 2023-10-03 21:38:42,037 WARNING [train.py:1204] (2/4) Exclude cut with ID _14603_210_4220_1_1530615688441_3097319_133-158308-0 from training. Duration: 0.84 2023-10-03 21:38:43,442 WARNING [train.py:1204] (2/4) Exclude cut with ID _13252_210_1925_1_1530671372047_3336909_166-314607-0 from training. Duration: 0.54 2023-10-03 21:38:47,315 WARNING [train.py:1204] (2/4) Exclude cut with ID _32704_210_8954_1_1533017488840_6747370_649-79538-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 21:38:47,328 WARNING [train.py:1204] (2/4) Exclude cut with ID _28394_210_8913_1_1532430049518_3650118_343-115667-0_sp1.1 from training. Duration: 0.97275 2023-10-03 21:38:48,429 WARNING [train.py:1204] (2/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_87-278686-0_sp0.9 from training. Duration: 0.8555625 2023-10-03 21:38:49,728 INFO [train.py:1046] (2/4) Epoch 40, batch 5100, loss[loss=0.169, simple_loss=0.2444, pruned_loss=0.04679, over 23442.00 frames. ], tot_loss[loss=0.1572, simple_loss=0.2377, pruned_loss=0.03837, over 4725068.36 frames. ], batch size: 285, lr: 2.53e-03, grad_scale: 8.0 2023-10-03 21:38:51,114 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9109W0003-549749 from training. Duration: 0.768 2023-10-03 21:38:53,093 WARNING [train.py:1204] (2/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_126-29104-0_sp1.1 from training. Duration: 0.77275 2023-10-03 21:38:54,662 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.feed_forward1.out_proj.dropout_p, batch_count=1415160.0, ans=0.1 2023-10-03 21:38:55,903 WARNING [train.py:1204] (2/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_325-113798-0 from training. Duration: 0.85 2023-10-03 21:38:57,207 WARNING [train.py:1204] (2/4) Exclude cut with ID _6694_210_4599_1_1527852637490_3965529_116-109398-0 from training. Duration: 0.73 2023-10-03 21:38:57,260 WARNING [train.py:1204] (2/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_46-57119-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 21:38:58,738 WARNING [train.py:1204] (2/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_546-119312-0_sp1.1 from training. Duration: 0.9 2023-10-03 21:39:01,883 WARNING [train.py:1204] (2/4) Exclude cut with ID _60057_210_10640_1_1534903065793_3333150_468-306939-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 21:39:03,228 WARNING [train.py:1204] (2/4) Exclude cut with ID _15150_210_8341_1_1530752411803_7256384_22-138274-0 from training. Duration: 0.96 2023-10-03 21:39:03,243 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_289-180904-0 from training. Duration: 0.76 2023-10-03 21:39:07,430 WARNING [train.py:1204] (2/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_64-133721-0_sp1.1 from training. Duration: 0.92725 2023-10-03 21:39:08,700 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_195-257512-0_sp1.1 from training. Duration: 0.6090625 2023-10-03 21:39:11,548 WARNING [train.py:1204] (2/4) Exclude cut with ID _66873_210_5517_1_1535454136506_4293370_704-101963-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 21:39:14,346 WARNING [train.py:1204] (2/4) Exclude cut with ID _4366_210_1388_1_1526792053633_3613819_231-211402-0 from training. Duration: 0.94 2023-10-03 21:39:15,733 WARNING [train.py:1204] (2/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_679-190385-0_sp1.1 from training. Duration: 0.97275 2023-10-03 21:39:17,109 WARNING [train.py:1204] (2/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_298-81459-0_sp1.1 from training. Duration: 0.963625 2023-10-03 21:39:17,117 WARNING [train.py:1204] (2/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_658-282610-0_sp0.9 from training. Duration: 0.588875 2023-10-03 21:39:20,533 WARNING [train.py:1204] (2/4) Exclude cut with ID _57572_210_5517_1_1534921219624_3809484_196-283734-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 21:39:22,519 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_53071_210_16554_1_1534490405_2954066_294-198554-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 21:39:22,524 WARNING [train.py:1204] (2/4) Exclude cut with ID _4866_210_2577_1_1527404412284_4364659_352-157930-0 from training. Duration: 0.81 2023-10-03 21:39:23,939 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9231W0113-610139 from training. Duration: 0.896 2023-10-03 21:39:25,345 WARNING [train.py:1204] (2/4) Exclude cut with ID _24271_210_12658_1_1531987270022_7190519_638-171906-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 21:39:26,643 WARNING [train.py:1204] (2/4) Exclude cut with ID _14170_210_5219_1_1530525949868_4469650_272-249985-0 from training. Duration: 0.67 2023-10-03 21:39:26,653 WARNING [train.py:1204] (2/4) Exclude cut with ID _64086_210_7994_1_1535270417019_7435949_449-31567-0 from training. Duration: 0.97 2023-10-03 21:39:30,852 WARNING [train.py:1204] (2/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_109-282568-0_sp1.1 from training. Duration: 0.97275 2023-10-03 21:39:33,855 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.feed_forward2.hidden_balancer.prob, batch_count=1415360.0, ans=0.125 2023-10-03 21:39:39,692 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_121-45934-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 21:39:41,194 WARNING [train.py:1204] (2/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_314-126819-0 from training. Duration: 0.5 2023-10-03 21:39:42,433 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9309W0192-640319 from training. Duration: 0.512 2023-10-03 21:39:42,447 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9310W0003-640649 from training. Duration: 0.896 2023-10-03 21:39:45,194 WARNING [train.py:1204] (2/4) Exclude cut with ID _7160_210_1876_1_1527940923231_3703799_120-150943-0 from training. Duration: 0.81 2023-10-03 21:39:45,196 WARNING [train.py:1204] (2/4) Exclude cut with ID _40380_210_9059_1_1533380594759_4187380_6-33508-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 21:39:45,540 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.bypass_mid.scale_min, batch_count=1415360.0, ans=0.2 2023-10-03 21:39:46,709 WARNING [train.py:1204] (2/4) Exclude cut with ID _59202_210_26984_1_1534816810612_3549174_137-318710-0 from training. Duration: 0.89 2023-10-03 21:39:50,786 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_435-313305-0 from training. Duration: 0.71 2023-10-03 21:39:52,227 WARNING [train.py:1204] (2/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_91-212486-0_sp1.1 from training. Duration: 0.6181875 2023-10-03 21:39:54,205 WARNING [train.py:1204] (2/4) Exclude cut with ID _14603_210_4220_1_1530615688441_3097319_133-158308-0_sp1.1 from training. Duration: 0.763625 2023-10-03 21:39:55,683 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_99-251314-0 from training. Duration: 0.87 2023-10-03 21:39:59,543 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_365-217119-0_sp1.1 from training. Duration: 0.57275 2023-10-03 21:39:59,584 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_3475_210_2028_1_1526193578_3622569_377-166133-0 from training. Duration: 0.86 2023-10-03 21:40:02,314 INFO [train.py:1046] (2/4) Epoch 40, batch 5150, loss[loss=0.1666, simple_loss=0.2499, pruned_loss=0.04165, over 23990.00 frames. ], tot_loss[loss=0.1574, simple_loss=0.238, pruned_loss=0.03836, over 4731163.49 frames. ], batch size: 80, lr: 2.53e-03, grad_scale: 8.0 2023-10-03 21:40:05,321 WARNING [train.py:1204] (2/4) Exclude cut with ID _13916_210_9994_1_1530788341126_3763149_116-21034-0_sp1.1 from training. Duration: 0.87275 2023-10-03 21:40:05,337 WARNING [train.py:1204] (2/4) Exclude cut with ID _64220_210_9527_1_1535250151772_4875259_142-216831-0_sp1.1 from training. Duration: 0.936375 2023-10-03 21:40:05,337 WARNING [train.py:1204] (2/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_490-207861-0_sp1.1 from training. Duration: 0.8909375 2023-10-03 21:40:06,579 WARNING [train.py:1204] (2/4) Exclude cut with ID _41314_210_12651_1_1533459476543_3766460_87-272068-0_sp0.9 from training. Duration: 0.92225 2023-10-03 21:40:06,599 WARNING [train.py:1204] (2/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_18-46756-0_sp1.1 from training. Duration: 0.7181875 2023-10-03 21:40:06,666 WARNING [train.py:1204] (2/4) Exclude cut with ID _39411_210_5986_1_1533538825030_3343649_98-311335-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 21:40:08,548 WARNING [train.py:1204] (2/4) Exclude cut with ID _21878_210_3525_1_1531738772002_7264330_113-18163-0 from training. Duration: 0.89 2023-10-03 21:40:08,550 WARNING [train.py:1204] (2/4) Exclude cut with ID _6653_210_2629_1_1527919172441_6732679_1103-56445-0 from training. Duration: 0.73 2023-10-03 21:40:08,603 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_3474_210_2028_1_1526188513_4825246_145-309131-0 from training. Duration: 0.74 2023-10-03 21:40:09,941 WARNING [train.py:1204] (2/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_120-211630-0_sp1.1 from training. Duration: 0.836375 2023-10-03 21:40:09,959 WARNING [train.py:1204] (2/4) Exclude cut with ID _8030_210_7497_1_1528444886781_3531089_32-31837-0 from training. Duration: 0.97 2023-10-03 21:40:11,238 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_19949_210_2161_1_1531706337_928079_8-160956-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 21:40:12,581 WARNING [train.py:1204] (2/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_321-215742-0_sp1.1 from training. Duration: 0.4545625 2023-10-03 21:40:13,477 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.1.encoder.layers.0.conv_module2.whiten, num_groups=1, num_channels=256, metric=7.18 vs. limit=15.0 2023-10-03 21:40:13,955 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_583-124650-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 21:40:15,355 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_30434_210_13049_1_1532519384_981746_164-182755-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 21:40:18,372 WARNING [train.py:1204] (2/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_339-104791-0_sp1.1 from training. Duration: 0.4454375 2023-10-03 21:40:19,595 WARNING [train.py:1204] (2/4) Exclude cut with ID _15026_210_8750_1_1530777602316_3291850_211-195043-0 from training. Duration: 0.8 2023-10-03 21:40:20,397 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.2.encoder.layers.2.self_attn1.whiten, num_groups=1, num_channels=384, metric=21.60 vs. limit=22.5 2023-10-03 21:40:21,036 WARNING [train.py:1204] (2/4) Exclude cut with ID _29877_210_6228_1_1532397791106_5276149_487-182647-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 21:40:21,074 WARNING [train.py:1204] (2/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_127-566-0_sp0.9 from training. Duration: 0.9333125 2023-10-03 21:40:22,409 WARNING [train.py:1204] (2/4) Exclude cut with ID _8263_210_1664_1_1528439452891_970220_68-181805-0_sp0.9 from training. Duration: 0.711125 2023-10-03 21:40:22,411 WARNING [train.py:1204] (2/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_504-72513-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 21:40:22,421 WARNING [train.py:1204] (2/4) Exclude cut with ID _64639_210_5816_1_1535367619437_3528000_324-265233-0_sp1.1 from training. Duration: 0.963625 2023-10-03 21:40:22,685 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.feed_forward1.out_proj.dropout_p, batch_count=1415560.0, ans=0.1 2023-10-03 21:40:23,087 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.3.encoder.layers.2.feed_forward3.out_whiten, num_groups=1, num_channels=512, metric=7.19 vs. limit=15.0 2023-10-03 21:40:24,299 WARNING [train.py:1204] (2/4) Exclude cut with ID _6456_210_1534_1_1527681634212_3181049_215-309359-0_sp0.9 from training. Duration: 0.97775 2023-10-03 21:40:24,303 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_115-233929-0_sp1.1 from training. Duration: 0.6545625 2023-10-03 21:40:25,538 WARNING [train.py:1204] (2/4) Exclude cut with ID _14087_210_2253_1_1530529224512_3014029_219-224445-0 from training. Duration: 0.84 2023-10-03 21:40:28,218 WARNING [train.py:1204] (2/4) Exclude cut with ID _14867_210_1968_1_1530684179890_3846969_413-29292-0_sp1.1 from training. Duration: 0.8181875 2023-10-03 21:40:28,255 WARNING [train.py:1204] (2/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_1012-279473-0_sp0.9 from training. Duration: 0.7444375 2023-10-03 21:40:30,859 WARNING [train.py:1204] (2/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_605-133395-0_sp0.9 from training. Duration: 0.7333125 2023-10-03 21:40:30,991 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_89-35748-0 from training. Duration: 0.79 2023-10-03 21:40:32,422 WARNING [train.py:1204] (2/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_476-272757-0_sp1.1 from training. Duration: 0.8454375 2023-10-03 21:40:34,118 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.attention_skip_rate, batch_count=1415626.6666666667, ans=0.0 2023-10-03 21:40:38,291 WARNING [train.py:1204] (2/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_449-15508-0_sp0.9 from training. Duration: 0.911125 2023-10-03 21:40:39,663 WARNING [train.py:1204] (2/4) Exclude cut with ID _29345_210_4381_1_1532689168263_3860169_198-174328-0 from training. Duration: 0.91 2023-10-03 21:40:42,524 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_15517_210_6753_1_1530952293_1664795_125-158157-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 21:40:44,191 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.self_attn_weights.pos_emb_skip_rate, batch_count=1415626.6666666667, ans=0.0 2023-10-03 21:40:49,443 WARNING [train.py:1204] (2/4) Exclude cut with ID _25565_210_12392_1_1532423009382_3952098_420-113180-0_sp1.1 from training. Duration: 0.963625 2023-10-03 21:40:49,532 WARNING [train.py:1204] (2/4) Exclude cut with ID _51471_210_1889_1_1534399391390_3094250_236-146773-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 21:40:52,335 WARNING [train.py:1204] (2/4) Exclude cut with ID _30039_210_13270_1_1532484612900_2725150_175-35012-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 21:40:53,865 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.569e+02 1.916e+02 2.105e+02 2.538e+02 3.935e+02, threshold=4.210e+02, percent-clipped=0.0 2023-10-03 21:40:53,960 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_23850_210_4238_1_1532232597_1632433_99-275843-0_sp1.1 from training. Duration: 0.936375 2023-10-03 21:40:55,358 WARNING [train.py:1204] (2/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_658-282610-0 from training. Duration: 0.53 2023-10-03 21:40:57,369 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.feed_forward1.hidden_balancer.prob, batch_count=1415693.3333333333, ans=0.125 2023-10-03 21:41:00,173 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_37818_210_4238_1_1533620910_4720083_234-65934-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 21:41:00,334 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.balancer1.prob, batch_count=1415760.0, ans=0.125 2023-10-03 21:41:01,492 WARNING [train.py:1204] (2/4) Exclude cut with ID _3067_210_2667_1_1526212974640_4503168_459-173867-0_sp1.1 from training. Duration: 0.8 2023-10-03 21:41:01,523 WARNING [train.py:1204] (2/4) Exclude cut with ID _8696_210_1876_1_1528545632440_3345058_209-54069-0_sp0.9 from training. Duration: 0.7444375 2023-10-03 21:41:05,593 WARNING [train.py:1204] (2/4) Exclude cut with ID _62574_210_2655_1_1535180407058_3933249_420-236465-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 21:41:07,560 WARNING [train.py:1204] (2/4) Exclude cut with ID _46681_210_9642_1_1533893005571_7603359_333-4042-0_sp1.1 from training. Duration: 0.936375 2023-10-03 21:41:07,631 WARNING [train.py:1204] (2/4) Exclude cut with ID _3541_210_2477_1_1526475408709_3263973_377-296839-0 from training. Duration: 0.55 2023-10-03 21:41:11,823 WARNING [train.py:1204] (2/4) Exclude cut with ID _43997_210_4525_1_1533538632555_5189329_426-92599-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 21:41:11,919 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_43-71989-0_sp1.1 from training. Duration: 0.5181875 2023-10-03 21:41:12,162 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.attention_skip_rate, batch_count=1415760.0, ans=0.0 2023-10-03 21:41:13,374 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_2009_210_2071_1_1525481134_4588937_140-345530-0_sp1.1 from training. Duration: 0.963625 2023-10-03 21:41:13,382 WARNING [train.py:1204] (2/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_138-332773-0_sp0.9 from training. Duration: 0.9 2023-10-03 21:41:14,733 WARNING [train.py:1204] (2/4) Exclude cut with ID _13942_210_9988_1_1530604906036_3733360_499-55676-0_sp0.9 from training. Duration: 0.688875 2023-10-03 21:41:14,756 WARNING [train.py:1204] (2/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_259-34038-0_sp1.1 from training. Duration: 0.67275 2023-10-03 21:41:14,766 WARNING [train.py:1204] (2/4) Exclude cut with ID _3120_210_3525_1_1526036143112_4072980_404-249994-0_sp1.1 from training. Duration: 0.9 2023-10-03 21:41:14,798 WARNING [train.py:1204] (2/4) Exclude cut with ID _40402_210_18219_1_1533522324115_2953150_222-108810-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 21:41:16,087 INFO [train.py:1046] (2/4) Epoch 40, batch 5200, loss[loss=0.165, simple_loss=0.2495, pruned_loss=0.0402, over 23767.00 frames. ], tot_loss[loss=0.1576, simple_loss=0.2381, pruned_loss=0.03857, over 4725497.00 frames. ], batch size: 85, lr: 2.53e-03, grad_scale: 16.0 2023-10-03 21:41:18,759 WARNING [train.py:1204] (2/4) Exclude cut with ID _22083_210_8254_1_1531702862439_3678970_49-234546-0_sp0.9 from training. Duration: 0.92225 2023-10-03 21:41:21,379 WARNING [train.py:1204] (2/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_199-295611-0_sp0.9 from training. Duration: 0.611125 2023-10-03 21:41:23,430 WARNING [train.py:1204] (2/4) Exclude cut with ID _43552_210_8913_1_1533726031059_3523429_43-192945-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 21:41:26,743 WARNING [train.py:1204] (2/4) Exclude cut with ID _14224_210_4381_1_1530529062037_4264010_364-22817-0 from training. Duration: 0.71 2023-10-03 21:41:26,831 WARNING [train.py:1204] (2/4) Exclude cut with ID _14922_210_6228_1_1530707435273_3693310_388-129552-0_sp1.1 from training. Duration: 0.87275 2023-10-03 21:41:28,256 WARNING [train.py:1204] (2/4) Exclude cut with ID _6537_210_1883_1_1527987559710_3597129_190-280227-0_sp1.1 from training. Duration: 0.97275 2023-10-03 21:41:31,473 WARNING [train.py:1204] (2/4) Exclude cut with ID _55844_210_2346_1_1534471212569_7625707_398-56897-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 21:41:31,571 WARNING [train.py:1204] (2/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_48-182155-0_sp1.1 from training. Duration: 0.7909375 2023-10-03 21:41:31,597 WARNING [train.py:1204] (2/4) Exclude cut with ID _29529_210_7234_1_1532393961455_3977460_582-18084-0_sp1.1 from training. Duration: 0.97275 2023-10-03 21:41:32,941 WARNING [train.py:1204] (2/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_912-282412-0 from training. Duration: 0.49 2023-10-03 21:41:36,205 WARNING [train.py:1204] (2/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_332-256168-0_sp0.9 from training. Duration: 0.8333125 2023-10-03 21:41:36,251 WARNING [train.py:1204] (2/4) Exclude cut with ID _64220_210_9527_1_1535250151772_4875259_55-250624-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 21:41:39,105 WARNING [train.py:1204] (2/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_264-108685-0 from training. Duration: 0.55 2023-10-03 21:41:41,897 WARNING [train.py:1204] (2/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_613-232856-0_sp0.9 from training. Duration: 0.6 2023-10-03 21:41:43,249 WARNING [train.py:1204] (2/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_68-212801-0_sp1.1 from training. Duration: 0.663625 2023-10-03 21:41:43,297 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6290_210_3273_1_1527595224_1282787_115-87095-0 from training. Duration: 0.55 2023-10-03 21:41:44,540 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_3080_210_2071_1_1526086102_4365166_312-8726-0 from training. Duration: 0.91 2023-10-03 21:41:47,329 WARNING [train.py:1204] (2/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_105-63309-0 from training. Duration: 0.98 2023-10-03 21:41:47,388 WARNING [train.py:1204] (2/4) Exclude cut with ID _2941_210_2577_1_1526199673150_2489759_201-160779-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 21:41:47,392 WARNING [train.py:1204] (2/4) Exclude cut with ID ID0406W0226-885434 from training. Duration: 0.997 2023-10-03 21:41:48,665 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_17292_210_6753_1_1531125388_2943734_353-328201-0_sp1.1 from training. Duration: 0.97275 2023-10-03 21:41:50,007 WARNING [train.py:1204] (2/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_572-96029-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 21:41:50,035 WARNING [train.py:1204] (2/4) Exclude cut with ID _14970_210_15709_1_1530775547361_1966965_6-23328-0_sp0.9 from training. Duration: 0.9555625 2023-10-03 21:41:50,081 WARNING [train.py:1204] (2/4) Exclude cut with ID _45012_210_12651_1_1533608435136_5371380_240-37101-0 from training. Duration: 0.93 2023-10-03 21:41:51,502 WARNING [train.py:1204] (2/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_685-90183-0_sp1.1 from training. Duration: 0.936375 2023-10-03 21:41:51,756 INFO [scaling.py:1118] (2/4) WithLoss: name=encoder.encoders.3.encoder.layers.0.self_attn_weights, loss-sum=0.000e+00 2023-10-03 21:41:51,812 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.conv_module2.balancer1.prob, batch_count=1415960.0, ans=0.125 2023-10-03 21:41:53,735 INFO [scaling.py:1118] (2/4) WithLoss: name=encoder.encoders.2.encoder.layers.1.self_attn_weights, loss-sum=0.000e+00 2023-10-03 21:41:54,657 WARNING [train.py:1204] (2/4) Exclude cut with ID _13252_210_1925_1_1530671372047_3336909_41-22245-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 21:41:56,145 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_15126_210_6753_1_1530778677_2591185_120-277707-0 from training. Duration: 0.8 2023-10-03 21:41:56,172 WARNING [train.py:1204] (2/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_310-26186-0 from training. Duration: 0.7 2023-10-03 21:41:56,313 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.nonlin_attention.balancer.prob, batch_count=1415960.0, ans=0.125 2023-10-03 21:41:57,390 WARNING [train.py:1204] (2/4) Exclude cut with ID _14998_210_5219_1_1530705582080_7474970_787-294416-0 from training. Duration: 0.82 2023-10-03 21:42:02,352 WARNING [train.py:1204] (2/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_795-33252-0 from training. Duration: 0.37 2023-10-03 21:42:03,666 WARNING [train.py:1204] (2/4) Exclude cut with ID _7537_210_2084_1_1528502395431_3924359_113-199933-0_sp0.9 from training. Duration: 0.6555625 2023-10-03 21:42:05,383 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.bypass.scale_min, batch_count=1416026.6666666667, ans=0.2 2023-10-03 21:42:09,888 WARNING [train.py:1204] (2/4) Exclude cut with ID _3465_210_3525_1_1526175667060_3574049_308-231735-0_sp0.9 from training. Duration: 0.811125 2023-10-03 21:42:09,909 WARNING [train.py:1204] (2/4) Exclude cut with ID _19504_210_9994_1_1531648863242_3325220_354-296765-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 21:42:10,031 WARNING [train.py:1204] (2/4) Exclude cut with ID _5409_210_1388_1_1527292850189_5865191_178-127970-0 from training. Duration: 0.57 2023-10-03 21:42:11,297 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_1466-248056-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 21:42:11,314 WARNING [train.py:1204] (2/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_535-14410-0_sp0.9 from training. Duration: 0.52225 2023-10-03 21:42:11,317 WARNING [train.py:1204] (2/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_747-129941-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 21:42:12,601 WARNING [train.py:1204] (2/4) Exclude cut with ID _15012_210_1968_1_1530856820429_5095350_688-178849-0_sp1.1 from training. Duration: 0.5818125 2023-10-03 21:42:14,139 WARNING [train.py:1204] (2/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_356-45054-0_sp1.1 from training. Duration: 0.7818125 2023-10-03 21:42:15,469 WARNING [train.py:1204] (2/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_146-276944-0_sp0.9 from training. Duration: 0.92225 2023-10-03 21:42:17,007 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.conv_skip_rate, batch_count=1416093.3333333333, ans=0.0 2023-10-03 21:42:18,203 WARNING [train.py:1204] (2/4) Exclude cut with ID _39204_210_4220_1_1533880547368_3191120_346-180306-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 21:42:18,283 WARNING [train.py:1204] (2/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_641-158017-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 21:42:18,283 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_19952_210_2161_1_1531875469_3851750_274-42137-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 21:42:18,730 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.5.encoder.layers.1.feed_forward2.out_whiten, num_groups=1, num_channels=256, metric=9.10 vs. limit=15.0 2023-10-03 21:42:24,963 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.conv_module2.balancer1.prob, batch_count=1416093.3333333333, ans=0.125 2023-10-03 21:42:25,904 WARNING [train.py:1204] (2/4) Exclude cut with ID _60804_210_9642_1_1534831199859_7268299_87-327819-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 21:42:25,979 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_9302_210_2550_1_1528889962_4655416_638-301620-0 from training. Duration: 0.47 2023-10-03 21:42:26,038 WARNING [train.py:1204] (2/4) Exclude cut with ID _44304_210_15374_1_1533639352765_4613129_302-22084-0_sp1.1 from training. Duration: 0.7818125 2023-10-03 21:42:26,048 WARNING [train.py:1204] (2/4) Exclude cut with ID _18513_210_9206_1_1531618215223_3862620_177-227063-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 21:42:27,495 WARNING [train.py:1204] (2/4) Exclude cut with ID _41326_210_5854_1_1533778076509_3492080_510-45444-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 21:42:29,370 WARNING [train.py:1204] (2/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_76-311963-0_sp1.1 from training. Duration: 0.636375 2023-10-03 21:42:30,242 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.2.encoder.layers.0.feed_forward2.out_whiten, num_groups=1, num_channels=384, metric=4.35 vs. limit=15.0 2023-10-03 21:42:30,674 INFO [train.py:1046] (2/4) Epoch 40, batch 5250, loss[loss=0.1596, simple_loss=0.2499, pruned_loss=0.0346, over 24660.00 frames. ], tot_loss[loss=0.157, simple_loss=0.2367, pruned_loss=0.03868, over 4711892.95 frames. ], batch size: 73, lr: 2.53e-03, grad_scale: 16.0 2023-10-03 21:42:30,741 WARNING [train.py:1204] (2/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_156-302675-0_sp0.9 from training. Duration: 0.611125 2023-10-03 21:42:33,460 WARNING [train.py:1204] (2/4) Exclude cut with ID _53489_210_6228_1_1534298357169_3835970_367-148411-0_sp1.1 from training. Duration: 0.936375 2023-10-03 21:42:37,469 WARNING [train.py:1204] (2/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_389-144525-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 21:42:37,510 WARNING [train.py:1204] (2/4) Exclude cut with ID _14069_210_5632_1_1530512692033_4803390_158-238427-0_sp1.1 from training. Duration: 0.963625 2023-10-03 21:42:39,286 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_486-151131-0_sp0.9 from training. Duration: 0.9444375 2023-10-03 21:42:43,509 WARNING [train.py:1204] (2/4) Exclude cut with ID _39846_210_12651_1_1533603651992_1318030_188-71831-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 21:42:43,739 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.conv_skip_rate, batch_count=1416226.6666666667, ans=0.0 2023-10-03 21:42:44,915 WARNING [train.py:1204] (2/4) Exclude cut with ID _39411_210_5986_1_1533538825030_3343649_173-238248-0_sp1.1 from training. Duration: 0.7545625 2023-10-03 21:42:47,543 WARNING [train.py:1204] (2/4) Exclude cut with ID _20684_210_6917_1_1531567751646_4203930_469-49081-0_sp1.1 from training. Duration: 0.92725 2023-10-03 21:42:47,640 WARNING [train.py:1204] (2/4) Exclude cut with ID _8833_210_5219_1_1528592665210_7553059_611-80313-0_sp1.1 from training. Duration: 0.5818125 2023-10-03 21:42:50,352 WARNING [train.py:1204] (2/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_440-141811-0 from training. Duration: 0.95 2023-10-03 21:42:51,636 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_41211_210_2544_1_1533348859_3702910_236-223652-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 21:42:52,958 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_40222_210_6228_1_1533385298_4481939_220-163998-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 21:43:18,736 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.664e+02 1.994e+02 2.255e+02 2.624e+02 3.979e+02, threshold=4.511e+02, percent-clipped=0.0 2023-10-03 21:43:35,111 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.conv_module2.balancer1.max_abs, batch_count=1416426.6666666667, ans=10.0 2023-10-03 21:43:38,750 INFO [train.py:1046] (2/4) Epoch 40, batch 5300, loss[loss=0.155, simple_loss=0.2378, pruned_loss=0.03607, over 24320.00 frames. ], tot_loss[loss=0.1566, simple_loss=0.2361, pruned_loss=0.03854, over 4713167.14 frames. ], batch size: 61, lr: 2.53e-03, grad_scale: 16.0 2023-10-03 21:43:38,938 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.bypass_mid.scale_min, batch_count=1416493.3333333333, ans=0.2 2023-10-03 21:43:51,468 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.0.feed_forward2.hidden_balancer.prob, batch_count=1416560.0, ans=0.125 2023-10-03 21:43:52,475 WARNING [train.py:1204] (2/4) Exclude cut with ID _14629_210_15709_1_1530604298182_956899_6-232695-0_sp1.1 from training. Duration: 0.8545625 2023-10-03 21:43:52,504 WARNING [train.py:1204] (2/4) Exclude cut with ID _13921_210_9994_1_1530841782272_3503520_327-69677-0 from training. Duration: 0.9 2023-10-03 21:43:52,542 WARNING [train.py:1204] (2/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_372-298093-0 from training. Duration: 0.8 2023-10-03 21:43:52,562 WARNING [train.py:1204] (2/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_515-139111-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 21:43:52,801 WARNING [train.py:1204] (2/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_85-65056-0_sp1.1 from training. Duration: 0.87275 2023-10-03 21:43:53,212 WARNING [train.py:1204] (2/4) Exclude cut with ID _15113_210_8254_1_1530842438579_3716279_209-54533-0_sp1.1 from training. Duration: 0.87275 2023-10-03 21:43:53,260 WARNING [train.py:1204] (2/4) Exclude cut with ID _70447_210_12392_1_1535972499300_3160420_123-312838-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 21:43:53,290 WARNING [train.py:1204] (2/4) Exclude cut with ID _60113_210_16271_1_1535164212481_4386079_70-63596-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 21:43:53,317 WARNING [train.py:1204] (2/4) Exclude cut with ID _14632_210_9988_1_1530691279392_3923200_225-188509-0_sp1.1 from training. Duration: 0.963625 2023-10-03 21:43:53,367 WARNING [train.py:1204] (2/4) Exclude cut with ID _34734_210_4525_1_1533365977542_4448720_584-180532-0_sp1.1 from training. Duration: 0.97275 2023-10-03 21:43:53,380 WARNING [train.py:1204] (2/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_351-294650-0_sp1.1 from training. Duration: 0.57275 2023-10-03 21:43:53,707 WARNING [train.py:1204] (2/4) Exclude cut with ID _13252_210_1925_1_1530671372047_3336909_263-168388-0_sp0.9 from training. Duration: 0.9555625 2023-10-03 21:43:53,735 WARNING [train.py:1204] (2/4) Exclude cut with ID _22083_210_8254_1_1531702862439_3678970_201-85738-0 from training. Duration: 0.9 2023-10-03 21:43:53,826 WARNING [train.py:1204] (2/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_237-58433-0 from training. Duration: 0.74 2023-10-03 21:43:53,854 WARNING [train.py:1204] (2/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_262-250826-0 from training. Duration: 0.99 2023-10-03 21:43:53,956 WARNING [train.py:1204] (2/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_927-43827-0_sp0.9 from training. Duration: 0.82225 2023-10-03 21:43:53,987 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_2821_210_2715_1_1526108385_3763560_73-296673-0 from training. Duration: 0.83 2023-10-03 21:43:54,062 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6287_210_3273_1_1527509506_2653716_182-199887-0 from training. Duration: 0.51 2023-10-03 21:43:54,193 WARNING [train.py:1204] (2/4) Exclude cut with ID _70519_210_4163_1_1535961595994_7458230_899-80223-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 21:43:54,454 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_210-234949-0_sp1.1 from training. Duration: 0.97275 2023-10-03 21:43:54,494 WARNING [train.py:1204] (2/4) Exclude cut with ID _30196_210_1664_1_1533708496522_7074140_768-278176-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 21:43:54,577 WARNING [train.py:1204] (2/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_315-128283-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 21:43:54,662 WARNING [train.py:1204] (2/4) Exclude cut with ID _14886_210_4220_1_1530702020355_3516190_330-323941-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 21:43:54,929 WARNING [train.py:1204] (2/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_264-348896-0_sp1.1 from training. Duration: 0.77275 2023-10-03 21:43:54,954 WARNING [train.py:1204] (2/4) Exclude cut with ID _37086_210_9038_1_1532999540891_7021250_433-10563-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 21:43:55,400 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_53638_210_21581_1_1534297319_5808449_885-80172-0_sp1.1 from training. Duration: 0.87275 2023-10-03 21:43:55,490 WARNING [train.py:1204] (2/4) Exclude cut with ID _14623_210_1925_1_1530773354962_3632609_157-218771-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 21:43:55,501 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_1368-78165-0_sp1.1 from training. Duration: 0.97275 2023-10-03 21:43:55,505 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_309-65177-0_sp0.9 from training. Duration: 0.97775 2023-10-03 21:43:55,511 WARNING [train.py:1204] (2/4) Exclude cut with ID _21632_210_2253_1_1531994001765_3744140_291-138812-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 21:43:55,523 WARNING [train.py:1204] (2/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_669-93163-0_sp1.1 from training. Duration: 0.8909375 2023-10-03 21:43:56,077 WARNING [train.py:1204] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_292-215346-0 from training. Duration: 0.97 2023-10-03 21:43:56,101 WARNING [train.py:1204] (2/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_286-222073-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 21:43:56,385 WARNING [train.py:1204] (2/4) Exclude cut with ID _59256_210_5986_1_1535076085997_3589120_121-237386-0_sp1.1 from training. Duration: 0.87275 2023-10-03 21:43:56,398 WARNING [train.py:1204] (2/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_96-320736-0 from training. Duration: 0.86 2023-10-03 21:43:56,407 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_128-202104-0 from training. Duration: 0.63 2023-10-03 21:43:56,489 WARNING [train.py:1204] (2/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_242-3472-0_sp0.9 from training. Duration: 0.811125 2023-10-03 21:43:56,533 WARNING [train.py:1204] (2/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_154-74182-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 21:43:56,537 WARNING [train.py:1204] (2/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_187-221566-0 from training. Duration: 0.43 2023-10-03 21:43:56,698 WARNING [train.py:1204] (2/4) Exclude cut with ID _6194_210_1534_1_1527511826160_3941194_14-282876-0 from training. Duration: 0.71 2023-10-03 21:43:56,740 WARNING [train.py:1204] (2/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_660-211088-0_sp1.1 from training. Duration: 0.563625 2023-10-03 21:43:57,086 WARNING [train.py:1204] (2/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_526-62079-0_sp1.1 from training. Duration: 0.6090625 2023-10-03 21:43:57,242 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_92-246270-0_sp1.1 from training. Duration: 0.77275 2023-10-03 21:43:57,334 WARNING [train.py:1204] (2/4) Exclude cut with ID IC0287W0023-142350 from training. Duration: 0.956 2023-10-03 21:43:57,413 WARNING [train.py:1204] (2/4) Exclude cut with ID _58605_210_12210_1_1534723195939_3904259_48-269402-0 from training. Duration: 0.96 2023-10-03 21:43:57,428 WARNING [train.py:1204] (2/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_416-257730-0_sp0.9 from training. Duration: 0.72225 2023-10-03 21:43:57,506 WARNING [train.py:1204] (2/4) Exclude cut with ID _7877_210_6014_1_1528286358099_3750136_185-17516-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 21:43:58,192 WARNING [train.py:1204] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_23-189234-0 from training. Duration: 0.83 2023-10-03 21:43:58,208 WARNING [train.py:1204] (2/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_237-89684-0 from training. Duration: 0.96 2023-10-03 21:43:58,278 WARNING [train.py:1204] (2/4) Exclude cut with ID _13916_210_9994_1_1530788341126_3763149_116-21034-0 from training. Duration: 0.96 2023-10-03 21:43:58,436 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_3133_210_2087_1_1526106446_2324690_246-125000-0_sp1.1 from training. Duration: 0.563625 2023-10-03 21:44:05,017 INFO [train.py:1046] (2/4) Epoch 41, batch 0, loss[loss=0.1317, simple_loss=0.2096, pruned_loss=0.02694, over 24267.00 frames. ], tot_loss[loss=0.1317, simple_loss=0.2096, pruned_loss=0.02694, over 24267.00 frames. ], batch size: 56, lr: 2.50e-03, grad_scale: 32.0 2023-10-03 21:44:05,017 INFO [train.py:1069] (2/4) Computing validation loss 2023-10-03 21:44:17,635 INFO [train.py:1078] (2/4) Epoch 41, validation: loss=0.3341, simple_loss=0.2655, pruned_loss=0.2013, over 1125622.00 frames. 2023-10-03 21:44:17,635 INFO [train.py:1079] (2/4) Maximum memory allocated so far is 20967MB 2023-10-03 21:44:18,015 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.conv_skip_rate, batch_count=1416573.3333333333, ans=0.0 2023-10-03 21:44:19,266 WARNING [train.py:1204] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_402-253027-0 from training. Duration: 0.54 2023-10-03 21:44:21,986 WARNING [train.py:1204] (2/4) Exclude cut with ID _59288_210_5854_1_1535077757774_3412246_158-289345-0_sp1.1 from training. Duration: 0.92725 2023-10-03 21:44:23,431 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_28211_210_6388_1_1532861506_4523224_128-328162-0_sp1.1 from training. Duration: 0.8181875 2023-10-03 21:44:28,231 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_788-278281-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 21:44:28,239 WARNING [train.py:1204] (2/4) Exclude cut with ID _49728_210_12651_1_1534041034294_6788150_624-195301-0_sp0.9 from training. Duration: 0.9444375 2023-10-03 21:44:29,619 WARNING [train.py:1204] (2/4) Exclude cut with ID _14860_210_4915_1_1530702020340_7343240_267-139775-0_sp1.1 from training. Duration: 0.963625 2023-10-03 21:44:29,685 WARNING [train.py:1204] (2/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_332-221057-0 from training. Duration: 0.68 2023-10-03 21:44:32,844 WARNING [train.py:1204] (2/4) Exclude cut with ID _40402_210_18219_1_1533522324115_2953150_242-76943-0 from training. Duration: 0.94 2023-10-03 21:44:35,557 WARNING [train.py:1204] (2/4) Exclude cut with ID _16747_210_5986_1_1531811544733_2749109_87-349587-0_sp1.1 from training. Duration: 0.963625 2023-10-03 21:44:35,629 WARNING [train.py:1204] (2/4) Exclude cut with ID _22982_210_1883_1_1531882790447_5449409_505-346452-0_sp1.1 from training. Duration: 0.963625 2023-10-03 21:44:38,332 WARNING [train.py:1204] (2/4) Exclude cut with ID _17956_210_4353_1_1531209568125_6979970_13-192107-0_sp1.1 from training. Duration: 0.963625 2023-10-03 21:44:38,378 WARNING [train.py:1204] (2/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_430-307666-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 21:44:39,671 WARNING [train.py:1204] (2/4) Exclude cut with ID _2895_210_2477_1_1525956785794_4063750_289-11475-0_sp1.1 from training. Duration: 0.7454375 2023-10-03 21:44:39,692 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_3910_210_1794_1_1526742306_334530_28-238909-0_sp0.9 from training. Duration: 0.9 2023-10-03 21:44:41,769 WARNING [train.py:1204] (2/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_18-130336-0 from training. Duration: 0.79 2023-10-03 21:44:44,503 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_548-294316-0_sp0.9 from training. Duration: 0.9 2023-10-03 21:44:46,823 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.1.encoder.layers.0.whiten, num_groups=1, num_channels=256, metric=5.19 vs. limit=12.0 2023-10-03 21:44:50,153 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_42-142473-0_sp1.1 from training. Duration: 0.6545625 2023-10-03 21:44:50,159 WARNING [train.py:1204] (2/4) Exclude cut with ID _59170_210_6721_1_1534983633960_4203335_326-48066-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 21:44:52,858 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_45886_210_17024_1_1533709215_471063_7-79165-0 from training. Duration: 0.98 2023-10-03 21:44:56,323 WARNING [train.py:1204] (2/4) Exclude cut with ID _44585_210_18628_1_1533605350387_4518199_280-73534-0_sp0.9 from training. Duration: 0.988875 2023-10-03 21:44:56,330 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_3475_210_2028_1_1526193578_3622569_265-276625-0_sp1.1 from training. Duration: 0.6909375 2023-10-03 21:44:58,262 WARNING [train.py:1204] (2/4) Exclude cut with ID _64631_210_27767_1_1535349580068_4165160_88-225232-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 21:45:02,883 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_205-265518-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 21:45:05,800 WARNING [train.py:1204] (2/4) Exclude cut with ID _40402_210_18219_1_1533522324115_2953150_271-253734-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 21:45:12,524 WARNING [train.py:1204] (2/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_282-257791-0 from training. Duration: 0.86 2023-10-03 21:45:15,917 WARNING [train.py:1204] (2/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_356-45054-0 from training. Duration: 0.86 2023-10-03 21:45:15,959 WARNING [train.py:1204] (2/4) Exclude cut with ID _69584_210_5219_1_1535785133775_4044450_38-348116-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 21:45:15,961 WARNING [train.py:1204] (2/4) Exclude cut with ID _66763_210_17301_1_1535788667617_241750_21-328480-0_sp1.1 from training. Duration: 0.936375 2023-10-03 21:45:17,379 WARNING [train.py:1204] (2/4) Exclude cut with ID _26528_210_9070_1_1532570410842_3851310_497-97218-0_sp0.9 from training. Duration: 0.9555625 2023-10-03 21:45:18,696 WARNING [train.py:1204] (2/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_592-101075-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 21:45:18,817 WARNING [train.py:1204] (2/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_678-159307-0 from training. Duration: 0.85 2023-10-03 21:45:20,215 WARNING [train.py:1204] (2/4) Exclude cut with ID _28427_210_4220_1_1532581240967_3061069_166-319773-0_sp1.1 from training. Duration: 0.936375 2023-10-03 21:45:21,561 WARNING [train.py:1204] (2/4) Exclude cut with ID _29877_210_6228_1_1532397791106_5276149_606-224984-0_sp1.1 from training. Duration: 0.936375 2023-10-03 21:45:24,339 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_125-203105-0_sp1.1 from training. Duration: 0.72725 2023-10-03 21:45:27,658 WARNING [train.py:1204] (2/4) Exclude cut with ID IC0620W0113-306765 from training. Duration: 0.768 2023-10-03 21:45:29,181 WARNING [train.py:1204] (2/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_410-287857-0_sp1.1 from training. Duration: 0.8454375 2023-10-03 21:45:31,717 INFO [train.py:1046] (2/4) Epoch 41, batch 50, loss[loss=0.154, simple_loss=0.2307, pruned_loss=0.03866, over 22869.00 frames. ], tot_loss[loss=0.159, simple_loss=0.2384, pruned_loss=0.03982, over 1065793.63 frames. ], batch size: 322, lr: 2.49e-03, grad_scale: 32.0 2023-10-03 21:45:33,131 WARNING [train.py:1204] (2/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_78-95260-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 21:45:36,479 WARNING [train.py:1204] (2/4) Exclude cut with ID _58059_210_5854_1_1534901471149_3164808_6-73080-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 21:45:36,501 WARNING [train.py:1204] (2/4) Exclude cut with ID _5166_210_2084_1_1527073197160_4087993_331-117829-0 from training. Duration: 0.62 2023-10-03 21:45:36,557 WARNING [train.py:1204] (2/4) Exclude cut with ID _14880_210_8895_1_1530680426655_3795259_289-212817-0_sp0.9 from training. Duration: 0.7666875 2023-10-03 21:45:37,831 WARNING [train.py:1204] (2/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_620-144779-0_sp1.1 from training. Duration: 0.92725 2023-10-03 21:45:37,975 WARNING [train.py:1204] (2/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_561-288575-0_sp1.1 from training. Duration: 0.963625 2023-10-03 21:45:40,549 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_22657_210_6890_1_1532086002_1637216_122-188128-0_sp1.1 from training. Duration: 0.963625 2023-10-03 21:45:42,021 WARNING [train.py:1204] (2/4) Exclude cut with ID _16756_210_4915_1_1531047803763_7104259_604-86607-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 21:45:45,278 WARNING [train.py:1204] (2/4) Exclude cut with ID _15150_210_8341_1_1530752411803_7256384_469-239509-0 from training. Duration: 0.68 2023-10-03 21:45:45,293 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_10987_210_4163_1_1529760555_4123902_302-87926-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 21:45:49,600 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=1416973.3333333333, ans=0.1 2023-10-03 21:45:50,823 WARNING [train.py:1204] (2/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_469-94673-0_sp0.9 from training. Duration: 0.87775 2023-10-03 21:45:50,937 WARNING [train.py:1204] (2/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_798-106179-0 from training. Duration: 0.83 2023-10-03 21:45:52,576 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=1416973.3333333333, ans=0.1 2023-10-03 21:45:53,687 WARNING [train.py:1204] (2/4) Exclude cut with ID _16744_210_5986_1_1531724405384_3907567_242-111316-0 from training. Duration: 0.93 2023-10-03 21:45:55,726 WARNING [train.py:1204] (2/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_938-205135-0_sp0.9 from training. Duration: 0.9666875 2023-10-03 21:45:55,831 WARNING [train.py:1204] (2/4) Exclude cut with ID _64135_210_2253_1_1535677267876_7608972_133-188266-0_sp0.9 from training. Duration: 0.9 2023-10-03 21:45:55,836 WARNING [train.py:1204] (2/4) Exclude cut with ID _44585_210_18628_1_1533605350387_4518199_623-346820-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 21:45:57,064 WARNING [train.py:1204] (2/4) Exclude cut with ID _42326_210_7770_1_1533814282531_3707269_454-266903-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 21:45:57,828 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.2.encoder.layers.0.self_attn_weights.whiten_keys, num_groups=4, num_channels=128, metric=4.18 vs. limit=6.0 2023-10-03 21:45:58,391 WARNING [train.py:1204] (2/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_5-55893-0_sp1.1 from training. Duration: 0.5 2023-10-03 21:45:59,718 WARNING [train.py:1204] (2/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_577-332600-0_sp1.1 from training. Duration: 0.5181875 2023-10-03 21:45:59,719 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_957-138882-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 21:46:05,816 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.592e+02 1.939e+02 2.175e+02 2.475e+02 4.066e+02, threshold=4.350e+02, percent-clipped=0.0 2023-10-03 21:46:07,382 WARNING [train.py:1204] (2/4) Exclude cut with ID _12556_210_8341_1_1530081024100_7327690_219-38004-0_sp1.1 from training. Duration: 0.97275 2023-10-03 21:46:08,735 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_397-147196-0_sp1.1 from training. Duration: 0.72725 2023-10-03 21:46:08,744 WARNING [train.py:1204] (2/4) Exclude cut with ID _3541_210_2477_1_1526475408709_3263973_231-104736-0_sp0.9 from training. Duration: 0.8666875 2023-10-03 21:46:08,817 WARNING [train.py:1204] (2/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_398-334558-0 from training. Duration: 0.96 2023-10-03 21:46:10,235 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_257-20733-0_sp1.1 from training. Duration: 0.6909375 2023-10-03 21:46:12,282 WARNING [train.py:1204] (2/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_146-282400-0_sp1.1 from training. Duration: 0.6454375 2023-10-03 21:46:13,559 WARNING [train.py:1204] (2/4) Exclude cut with ID _51899_210_20034_1_1534129254466_3669220_265-215428-0 from training. Duration: 0.98 2023-10-03 21:46:13,628 WARNING [train.py:1204] (2/4) Exclude cut with ID _34423_210_8750_1_1533430798456_3330199_219-106541-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 21:46:16,266 WARNING [train.py:1204] (2/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_458-67321-0 from training. Duration: 0.97 2023-10-03 21:46:17,872 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.0.feed_forward2.hidden_balancer.prob, batch_count=1417106.6666666667, ans=0.125 2023-10-03 21:46:20,845 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.5.encoder.layers.1.self_attn1.whiten, num_groups=1, num_channels=256, metric=11.70 vs. limit=22.5 2023-10-03 21:46:23,077 WARNING [train.py:1204] (2/4) Exclude cut with ID _6653_210_2629_1_1527919172441_6732679_1051-182606-0_sp1.1 from training. Duration: 0.936375 2023-10-03 21:46:23,099 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_53555_210_5632_1_1534595220_7232297_603-4396-0_sp1.1 from training. Duration: 0.97275 2023-10-03 21:46:24,541 WARNING [train.py:1204] (2/4) Exclude cut with ID _54218_210_12210_1_1534413646388_4409259_159-144367-0_sp1.1 from training. Duration: 0.963625 2023-10-03 21:46:25,954 WARNING [train.py:1204] (2/4) Exclude cut with ID _7606_210_4915_1_1528894253277_5080690_301-10732-0_sp0.9 from training. Duration: 0.9 2023-10-03 21:46:25,958 WARNING [train.py:1204] (2/4) Exclude cut with ID _15026_210_8750_1_1530777602316_3291850_211-195043-0_sp0.9 from training. Duration: 0.888875 2023-10-03 21:46:28,099 WARNING [train.py:1204] (2/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_146-282400-0 from training. Duration: 0.71 2023-10-03 21:46:28,136 WARNING [train.py:1204] (2/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_65-88242-0 from training. Duration: 0.56 2023-10-03 21:46:28,400 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.conv_module2.balancer1.min_positive, batch_count=1417106.6666666667, ans=0.025 2023-10-03 21:46:30,861 WARNING [train.py:1204] (2/4) Exclude cut with ID _55857_210_2346_1_1534644007831_6898209_80-107757-0_sp1.1 from training. Duration: 0.963625 2023-10-03 21:46:30,893 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_15126_210_6753_1_1530778677_2591185_120-277707-0_sp0.9 from training. Duration: 0.888875 2023-10-03 21:46:31,103 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.conv_skip_rate, batch_count=1417173.3333333333, ans=0.0 2023-10-03 21:46:33,625 WARNING [train.py:1204] (2/4) Exclude cut with ID _44778_210_2054_1_1533798127203_7443090_1033-168593-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 21:46:33,677 WARNING [train.py:1204] (2/4) Exclude cut with ID _46683_210_9642_1_1533730913492_4591116_475-30711-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 21:46:34,952 WARNING [train.py:1204] (2/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_253-308377-0 from training. Duration: 0.49 2023-10-03 21:46:35,015 WARNING [train.py:1204] (2/4) Exclude cut with ID _4680_210_3525_1_1527249757953_893386_75-78979-0 from training. Duration: 0.91 2023-10-03 21:46:35,273 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.bypass_mid.scale_min, batch_count=1417173.3333333333, ans=0.2 2023-10-03 21:46:36,410 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_5522_210_3273_1_1527249606_3909096_318-203153-0_sp0.9 from training. Duration: 0.47775 2023-10-03 21:46:38,411 WARNING [train.py:1204] (2/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_550-170116-0_sp1.1 from training. Duration: 0.92725 2023-10-03 21:46:38,437 WARNING [train.py:1204] (2/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_277-210798-0_sp0.9 from training. Duration: 0.611125 2023-10-03 21:46:39,747 WARNING [train.py:1204] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_620-276793-0 from training. Duration: 0.59 2023-10-03 21:46:39,766 WARNING [train.py:1204] (2/4) Exclude cut with ID _3067_210_2667_1_1526212974640_4503168_119-175776-0 from training. Duration: 0.74 2023-10-03 21:46:41,136 WARNING [train.py:1204] (2/4) Exclude cut with ID _55209_210_1531_1_1534675828996_4597160_681-195827-0_sp1.1 from training. Duration: 0.92725 2023-10-03 21:46:41,200 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_427-78097-0_sp1.1 from training. Duration: 0.72725 2023-10-03 21:46:44,385 WARNING [train.py:1204] (2/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_96-290039-0_sp0.9 from training. Duration: 0.62225 2023-10-03 21:46:44,402 WARNING [train.py:1204] (2/4) Exclude cut with ID _45329_210_7005_1_1534157951211_6774339_287-292174-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 21:46:45,742 INFO [train.py:1046] (2/4) Epoch 41, batch 100, loss[loss=0.1687, simple_loss=0.2557, pruned_loss=0.04087, over 24565.00 frames. ], tot_loss[loss=0.1585, simple_loss=0.2395, pruned_loss=0.03873, over 1889810.72 frames. ], batch size: 71, lr: 2.49e-03, grad_scale: 16.0 2023-10-03 21:46:45,883 WARNING [train.py:1204] (2/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_200-292228-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 21:46:48,605 WARNING [train.py:1204] (2/4) Exclude cut with ID _69884_210_9771_1_1535846392818_4166169_291-194999-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 21:46:51,381 WARNING [train.py:1204] (2/4) Exclude cut with ID _58895_210_8750_1_1535184081320_3649089_265-323810-0_sp1.1 from training. Duration: 0.87275 2023-10-03 21:46:51,492 WARNING [train.py:1204] (2/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_58-68956-0 from training. Duration: 0.83 2023-10-03 21:46:51,497 WARNING [train.py:1204] (2/4) Exclude cut with ID _39867_210_9826_1_1533553222030_9344259_322-115153-0_sp1.1 from training. Duration: 0.963625 2023-10-03 21:46:55,483 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_3371_210_1794_1_1526384462_5580777_396-339812-0_sp1.1 from training. Duration: 0.77275 2023-10-03 21:46:55,501 WARNING [train.py:1204] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_731-2428-0_sp0.9 from training. Duration: 0.92225 2023-10-03 21:46:55,520 WARNING [train.py:1204] (2/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_98-741-0_sp1.1 from training. Duration: 0.72725 2023-10-03 21:46:55,522 WARNING [train.py:1204] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_238-245655-0_sp0.9 from training. Duration: 0.9 2023-10-03 21:46:55,549 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6788_210_1388_1_1527898698_4562021_91-197981-0_sp0.9 from training. Duration: 0.92225 2023-10-03 21:46:56,966 WARNING [train.py:1204] (2/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_860-96198-0 from training. Duration: 0.73 2023-10-03 21:47:00,238 WARNING [train.py:1204] (2/4) Exclude cut with ID _14268_210_9206_1_1530622771285_4714099_331-105351-0_sp0.9 from training. Duration: 0.72225 2023-10-03 21:47:00,260 WARNING [train.py:1204] (2/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_204-137133-0_sp1.1 from training. Duration: 0.92725 2023-10-03 21:47:00,291 WARNING [train.py:1204] (2/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_5-46496-0_sp1.1 from training. Duration: 0.936375 2023-10-03 21:47:00,307 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_160-218721-0_sp1.1 from training. Duration: 0.87275 2023-10-03 21:47:01,934 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=1417306.6666666667, ans=0.1 2023-10-03 21:47:04,297 WARNING [train.py:1204] (2/4) Exclude cut with ID _45329_210_7005_1_1534157951211_6774339_503-287883-0 from training. Duration: 0.88 2023-10-03 21:47:06,059 WARNING [train.py:1204] (2/4) Exclude cut with ID _57572_210_5517_1_1534921219624_3809484_110-79134-0_sp1.1 from training. Duration: 0.92725 2023-10-03 21:47:06,127 WARNING [train.py:1204] (2/4) Exclude cut with ID _44585_210_18628_1_1533605350387_4518199_421-37641-0_sp1.1 from training. Duration: 0.936375 2023-10-03 21:47:07,988 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_42-142473-0_sp0.9 from training. Duration: 0.8 2023-10-03 21:47:08,216 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.conv_module1.balancer1.min_positive, batch_count=1417306.6666666667, ans=0.025 2023-10-03 21:47:09,461 WARNING [train.py:1204] (2/4) Exclude cut with ID _5166_210_2084_1_1527073197160_4087993_310-114452-0_sp0.9 from training. Duration: 0.6555625 2023-10-03 21:47:12,985 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.ff3_skip_rate, batch_count=1417306.6666666667, ans=0.0 2023-10-03 21:47:14,092 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9029W0120-509520 from training. Duration: 0.768 2023-10-03 21:47:15,318 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9029W0433-509820 from training. Duration: 0.896 2023-10-03 21:47:15,440 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_40222_210_6228_1_1533385298_4481939_163-334586-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 21:47:15,441 WARNING [train.py:1204] (2/4) Exclude cut with ID _48677_210_5663_1_1533902405919_3892989_408-40740-0_sp1.1 from training. Duration: 0.7545625 2023-10-03 21:47:17,056 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.balancer1.prob, batch_count=1417373.3333333333, ans=0.125 2023-10-03 21:47:19,612 WARNING [train.py:1204] (2/4) Exclude cut with ID _24314_210_4915_1_1532135078116_7170179_521-258868-0_sp0.9 from training. Duration: 0.87775 2023-10-03 21:47:19,811 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.ff3_skip_rate, batch_count=1417373.3333333333, ans=0.0 2023-10-03 21:47:21,014 WARNING [train.py:1204] (2/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_576-51198-0_sp1.1 from training. Duration: 0.92725 2023-10-03 21:47:22,312 WARNING [train.py:1204] (2/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_306-216871-0_sp1.1 from training. Duration: 0.97275 2023-10-03 21:47:25,245 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder_embed.conv.8.prob, batch_count=1417373.3333333333, ans=0.125 2023-10-03 21:47:27,842 WARNING [train.py:1204] (2/4) Exclude cut with ID _20684_210_6917_1_1531567751646_4203930_463-119982-0_sp1.1 from training. Duration: 0.97275 2023-10-03 21:47:29,177 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9084W0012-536850 from training. Duration: 0.64 2023-10-03 21:47:30,553 WARNING [train.py:1204] (2/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_847-41334-0_sp1.1 from training. Duration: 0.4 2023-10-03 21:47:32,698 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=1417440.0, ans=0.1 2023-10-03 21:47:33,933 WARNING [train.py:1204] (2/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_326-165900-0_sp1.1 from training. Duration: 0.62725 2023-10-03 21:47:35,627 WARNING [train.py:1204] (2/4) Exclude cut with ID _7537_210_2084_1_1528502395431_3924359_128-268029-0_sp1.1 from training. Duration: 0.8909375 2023-10-03 21:47:37,089 WARNING [train.py:1204] (2/4) Exclude cut with ID _67499_210_17282_1_1535509805220_3692430_333-155030-0_sp1.1 from training. Duration: 0.97275 2023-10-03 21:47:40,346 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_34008_210_10431_1_1533345424_8183247_588-257177-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 21:47:41,781 WARNING [train.py:1204] (2/4) Exclude cut with ID _25914_210_6400_1_1532141317058_644740_52-96808-0_sp1.1 from training. Duration: 0.82725 2023-10-03 21:47:44,407 WARNING [train.py:1204] (2/4) Exclude cut with ID _56494_210_4220_1_1535284837625_3033169_69-138150-0_sp1.1 from training. Duration: 0.8545625 2023-10-03 21:47:45,800 WARNING [train.py:1204] (2/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_19-143553-0_sp1.1 from training. Duration: 0.97275 2023-10-03 21:47:47,119 WARNING [train.py:1204] (2/4) Exclude cut with ID _21878_210_3525_1_1531738772002_7264330_582-130735-0_sp1.1 from training. Duration: 0.936375 2023-10-03 21:47:48,308 WARNING [train.py:1204] (2/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_943-201356-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 21:47:48,318 WARNING [train.py:1204] (2/4) Exclude cut with ID _3231_210_2106_1_1526205864934_6784220_422-229276-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 21:47:49,669 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_48049_210_5632_1_1533864553_3519937_73-198717-0_sp1.1 from training. Duration: 0.97275 2023-10-03 21:47:49,732 WARNING [train.py:1204] (2/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_329-285801-0 from training. Duration: 0.91 2023-10-03 21:47:49,754 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9171W0114-579000 from training. Duration: 0.896 2023-10-03 21:47:51,088 WARNING [train.py:1204] (2/4) Exclude cut with ID _3120_210_3525_1_1526036143112_4072980_210-132675-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 21:47:51,172 WARNING [train.py:1204] (2/4) Exclude cut with ID _55857_210_2346_1_1534644007831_6898209_745-98391-0_sp0.9 from training. Duration: 0.8444375 2023-10-03 21:47:52,552 WARNING [train.py:1204] (2/4) Exclude cut with ID _5250_210_5219_1_1527249561806_8692110_615-322035-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 21:47:52,557 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_36756_210_7234_1_1533026991_966276_119-315767-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 21:47:52,572 WARNING [train.py:1204] (2/4) Exclude cut with ID _13916_210_9994_1_1530788341126_3763149_60-191556-0_sp1.1 from training. Duration: 0.5545625 2023-10-03 21:47:53,792 WARNING [train.py:1204] (2/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_177-236809-0_sp1.1 from training. Duration: 0.4454375 2023-10-03 21:47:53,835 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7002_210_1794_1_1527939683_5157286_184-229630-0_sp1.1 from training. Duration: 0.67275 2023-10-03 21:47:53,841 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_50428_210_5724_1_1534247532_4126418_464-18322-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 21:47:55,134 WARNING [train.py:1204] (2/4) Exclude cut with ID _35799_210_5854_1_1533263487887_3529071_466-131402-0_sp1.1 from training. Duration: 0.936375 2023-10-03 21:47:56,542 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_1259-132768-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 21:47:56,602 WARNING [train.py:1204] (2/4) Exclude cut with ID _6694_210_4599_1_1527852637490_3965529_144-185051-0_sp1.1 from training. Duration: 0.87275 2023-10-03 21:47:57,823 INFO [train.py:1046] (2/4) Epoch 41, batch 150, loss[loss=0.1634, simple_loss=0.2328, pruned_loss=0.04696, over 23833.00 frames. ], tot_loss[loss=0.1589, simple_loss=0.2391, pruned_loss=0.03932, over 2515274.43 frames. ], batch size: 179, lr: 2.49e-03, grad_scale: 16.0 2023-10-03 21:47:57,912 WARNING [train.py:1204] (2/4) Exclude cut with ID _64639_210_5816_1_1535367619437_3528000_59-133290-0_sp1.1 from training. Duration: 0.963625 2023-10-03 21:47:58,160 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.ff2_skip_rate, batch_count=1417573.3333333333, ans=0.0 2023-10-03 21:48:01,872 WARNING [train.py:1204] (2/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_223-49525-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 21:48:05,153 WARNING [train.py:1204] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_748-36150-0_sp1.1 from training. Duration: 0.82725 2023-10-03 21:48:05,154 WARNING [train.py:1204] (2/4) Exclude cut with ID _19896_210_1883_1_1531709022662_6649569_52-221627-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 21:48:05,217 WARNING [train.py:1204] (2/4) Exclude cut with ID _6789_210_1534_1_1527854471671_2870519_296-115766-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 21:48:09,357 WARNING [train.py:1204] (2/4) Exclude cut with ID _14624_210_1925_1_1530858690657_3642219_100-19871-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 21:48:09,403 WARNING [train.py:1204] (2/4) Exclude cut with ID _12555_210_8750_1_1530075594402_3780239_50-293675-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 21:48:12,610 WARNING [train.py:1204] (2/4) Exclude cut with ID _14880_210_8895_1_1530680426655_3795259_289-212817-0_sp1.1 from training. Duration: 0.62725 2023-10-03 21:48:12,660 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_14056_210_4163_1_1530604483_7572827_388-211626-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 21:48:16,132 WARNING [train.py:1204] (2/4) Exclude cut with ID _5289_210_2084_1_1527678202736_3779299_392-261988-0 from training. Duration: 0.65 2023-10-03 21:48:16,140 WARNING [train.py:1204] (2/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_124-345840-0 from training. Duration: 0.52 2023-10-03 21:48:16,152 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_2655_210_2087_1_1525688959_3897400_437-337690-0 from training. Duration: 0.63 2023-10-03 21:48:20,090 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_3780_210_2087_1_1526467391_239068_13-274633-0_sp1.1 from training. Duration: 0.92725 2023-10-03 21:48:20,097 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_805-71497-0_sp1.1 from training. Duration: 0.6454375 2023-10-03 21:48:21,423 WARNING [train.py:1204] (2/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_434-83000-0_sp1.1 from training. Duration: 0.8909375 2023-10-03 21:48:22,835 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_17613_210_2151_1_1531137569_2366160_285-219531-0_sp1.1 from training. Duration: 0.97275 2023-10-03 21:48:22,840 WARNING [train.py:1204] (2/4) Exclude cut with ID _56108_210_16380_1_1534505446159_3887959_22-37858-0_sp1.1 from training. Duration: 0.936375 2023-10-03 21:48:22,868 WARNING [train.py:1204] (2/4) Exclude cut with ID _28427_210_4220_1_1532581240967_3061069_200-88444-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 21:48:22,930 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_43802_210_2290_1_1533516203_8829504_77-279042-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 21:48:24,215 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9309W0193-640320 from training. Duration: 0.896 2023-10-03 21:48:25,685 WARNING [train.py:1204] (2/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_516-324055-0_sp1.1 from training. Duration: 0.936375 2023-10-03 21:48:26,319 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.4.encoder.layers.1.feed_forward3.out_whiten, num_groups=1, num_channels=384, metric=11.70 vs. limit=15.0 2023-10-03 21:48:29,814 WARNING [train.py:1204] (2/4) Exclude cut with ID _40094_210_16380_1_1533294005773_3641159_10-261240-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 21:48:30,016 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.conv_module2.balancer2.prob, batch_count=1417706.6666666667, ans=0.125 2023-10-03 21:48:32,373 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.623e+02 1.887e+02 2.079e+02 2.290e+02 3.335e+02, threshold=4.157e+02, percent-clipped=0.0 2023-10-03 21:48:34,362 WARNING [train.py:1204] (2/4) Exclude cut with ID _6267_210_2084_1_1527764658526_3894730_375-219887-0_sp1.1 from training. Duration: 0.6090625 2023-10-03 21:48:35,801 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6788_210_1388_1_1527898698_4562021_180-75288-0 from training. Duration: 0.99 2023-10-03 21:48:37,939 WARNING [train.py:1204] (2/4) Exclude cut with ID _8030_210_7497_1_1528444886781_3531089_262-86750-0_sp1.1 from training. Duration: 0.736375 2023-10-03 21:48:37,960 WARNING [train.py:1204] (2/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_934-325498-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 21:48:39,284 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_456-22141-0_sp0.9 from training. Duration: 0.97775 2023-10-03 21:48:41,175 WARNING [train.py:1204] (2/4) Exclude cut with ID _49643_210_7989_1_1534640244224_3939031_598-161926-0_sp1.1 from training. Duration: 0.8181875 2023-10-03 21:48:41,457 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.balancer_ff3.min_abs, batch_count=1417773.3333333333, ans=0.2 2023-10-03 21:48:42,632 WARNING [train.py:1204] (2/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_345-49721-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 21:48:44,071 WARNING [train.py:1204] (2/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_440-141811-0_sp1.1 from training. Duration: 0.863625 2023-10-03 21:48:45,911 WARNING [train.py:1204] (2/4) Exclude cut with ID _64048_210_10626_1_1535198382802_3835179_394-212120-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 21:48:47,241 WARNING [train.py:1204] (2/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_91-277684-0 from training. Duration: 0.74 2023-10-03 21:48:51,326 WARNING [train.py:1204] (2/4) Exclude cut with ID _23038_210_9570_1_1532001584158_3652419_191-346343-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 21:48:51,383 WARNING [train.py:1204] (2/4) Exclude cut with ID _15140_210_9288_1_1530791388450_4880149_351-347974-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 21:48:52,708 WARNING [train.py:1204] (2/4) Exclude cut with ID _58605_210_12210_1_1534723195939_3904259_48-269402-0_sp1.1 from training. Duration: 0.87275 2023-10-03 21:48:52,714 WARNING [train.py:1204] (2/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_401-53423-0_sp0.9 from training. Duration: 0.611125 2023-10-03 21:48:55,504 WARNING [train.py:1204] (2/4) Exclude cut with ID _42659_210_2070_1_1533607442765_3120380_158-49004-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 21:48:56,900 WARNING [train.py:1204] (2/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_81-75849-0_sp1.1 from training. Duration: 0.3545625 2023-10-03 21:48:59,586 WARNING [train.py:1204] (2/4) Exclude cut with ID _27052_210_5986_1_1532329197187_3585009_314-106499-0_sp1.1 from training. Duration: 0.836375 2023-10-03 21:49:00,927 WARNING [train.py:1204] (2/4) Exclude cut with ID _4626_210_3532_1_1526817652717_3916859_446-206656-0_sp0.9 from training. Duration: 0.8444375 2023-10-03 21:49:01,020 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_53378_210_17282_1_1534221071_5769509_158-114960-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 21:49:03,596 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_3171_210_2087_1_1526109084_2330007_37-64334-0_sp0.9 from training. Duration: 0.911125 2023-10-03 21:49:03,611 WARNING [train.py:1204] (2/4) Exclude cut with ID _5410_210_1388_1_1527397055795_1719020_39-112221-0 from training. Duration: 0.69 2023-10-03 21:49:03,629 WARNING [train.py:1204] (2/4) Exclude cut with ID _8064_210_5399_1_1528372299291_4282973_4-91037-0_sp0.9 from training. Duration: 0.97775 2023-10-03 21:49:03,651 WARNING [train.py:1204] (2/4) Exclude cut with ID ID0079W0443-717795 from training. Duration: 0.541 2023-10-03 21:49:07,108 WARNING [train.py:1204] (2/4) Exclude cut with ID _34734_210_4525_1_1533365977542_4448720_766-297928-0_sp1.1 from training. Duration: 0.936375 2023-10-03 21:49:11,704 INFO [train.py:1046] (2/4) Epoch 41, batch 200, loss[loss=0.1327, simple_loss=0.2154, pruned_loss=0.02496, over 20351.00 frames. ], tot_loss[loss=0.1579, simple_loss=0.2388, pruned_loss=0.0385, over 3019135.26 frames. ], batch size: 44, lr: 2.49e-03, grad_scale: 16.0 2023-10-03 21:49:11,846 WARNING [train.py:1204] (2/4) Exclude cut with ID _51471_210_1889_1_1534399391390_3094250_310-337793-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 21:49:12,106 INFO [scaling.py:1118] (2/4) WithLoss: name=encoder.encoders.5.encoder.layers.1.self_attn_weights, loss-sum=0.000e+00 2023-10-03 21:49:13,668 WARNING [train.py:1204] (2/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_678-159307-0_sp0.9 from training. Duration: 0.9444375 2023-10-03 21:49:16,474 WARNING [train.py:1204] (2/4) Exclude cut with ID _50093_210_4220_1_1534417141207_3432299_259-322252-0 from training. Duration: 0.92 2023-10-03 21:49:17,793 WARNING [train.py:1204] (2/4) Exclude cut with ID _47790_210_16187_1_1533862814066_4207519_67-103444-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 21:49:17,812 WARNING [train.py:1204] (2/4) Exclude cut with ID _40551_210_10635_1_1533776424632_3416179_311-247578-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 21:49:19,803 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_4633_210_2040_1_1526824163_4145343_416-76117-0 from training. Duration: 0.95 2023-10-03 21:49:21,313 WARNING [train.py:1204] (2/4) Exclude cut with ID _5166_210_2084_1_1527073197160_4087993_331-117829-0_sp0.9 from training. Duration: 0.688875 2023-10-03 21:49:22,659 WARNING [train.py:1204] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_299-247059-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 21:49:22,819 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.conv_module2.balancer1.prob, batch_count=1417906.6666666667, ans=0.125 2023-10-03 21:49:23,965 WARNING [train.py:1204] (2/4) Exclude cut with ID _64002_210_9570_1_1535198605157_38979_2-240845-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 21:49:28,091 WARNING [train.py:1204] (2/4) Exclude cut with ID _4681_210_3525_1_1527250689464_120889_15-126760-0_sp0.9 from training. Duration: 0.9 2023-10-03 21:49:28,103 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_21500_210_2054_1_1532149310_2716758_256-156611-0_sp1.1 from training. Duration: 0.936375 2023-10-03 21:49:28,131 WARNING [train.py:1204] (2/4) Exclude cut with ID _16908_210_5724_1_1531119157248_3772700_624-203729-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 21:49:33,918 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.feed_forward1.hidden_balancer.prob, batch_count=1417973.3333333333, ans=0.125 2023-10-03 21:49:43,458 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.balancer_ff2.min_abs, batch_count=1418040.0, ans=0.1 2023-10-03 21:49:46,092 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.ff3_skip_rate, batch_count=1418040.0, ans=0.0 2023-10-03 21:49:47,256 WARNING [train.py:1204] (2/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_605-219732-0_sp1.1 from training. Duration: 0.8545625 2023-10-03 21:49:47,292 WARNING [train.py:1204] (2/4) Exclude cut with ID _44718_210_5816_1_1534050051869_6469030_380-167520-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 21:49:48,603 WARNING [train.py:1204] (2/4) Exclude cut with ID _19896_210_1883_1_1531709022662_6649569_94-229927-0_sp1.1 from training. Duration: 0.8818125 2023-10-03 21:49:48,657 WARNING [train.py:1204] (2/4) Exclude cut with ID _19514_210_9259_1_1531530071330_5084230_519-174300-0_sp1.1 from training. Duration: 0.963625 2023-10-03 21:49:50,009 WARNING [train.py:1204] (2/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_340-174176-0_sp0.9 from training. Duration: 0.5444375 2023-10-03 21:49:50,013 WARNING [train.py:1204] (2/4) Exclude cut with ID _8263_210_1664_1_1528439452891_970220_68-181805-0_sp1.1 from training. Duration: 0.5818125 2023-10-03 21:49:52,571 WARNING [train.py:1204] (2/4) Exclude cut with ID _3120_210_3525_1_1526036143112_4072980_348-257723-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 21:49:52,644 WARNING [train.py:1204] (2/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_453-133983-0_sp1.1 from training. Duration: 0.7090625 2023-10-03 21:49:54,586 WARNING [train.py:1204] (2/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_17-86060-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 21:49:54,616 WARNING [train.py:1204] (2/4) Exclude cut with ID _29877_210_6228_1_1532397791106_5276149_228-184657-0_sp1.1 from training. Duration: 0.97275 2023-10-03 21:49:57,320 WARNING [train.py:1204] (2/4) Exclude cut with ID _18516_210_9206_1_1531704653206_3692406_198-334870-0 from training. Duration: 0.92 2023-10-03 21:49:57,362 WARNING [train.py:1204] (2/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_340-174176-0_sp1.1 from training. Duration: 0.4454375 2023-10-03 21:49:57,374 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_3133_210_2087_1_1526106446_2324690_14-139186-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 21:50:03,004 WARNING [train.py:1204] (2/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_126-29104-0_sp0.9 from training. Duration: 0.9444375 2023-10-03 21:50:06,039 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.feed_forward1.hidden_balancer.prob, batch_count=1418106.6666666667, ans=0.125 2023-10-03 21:50:07,240 WARNING [train.py:1204] (2/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_84-10310-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 21:50:14,359 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_15517_210_6753_1_1530954008_3487336_79-273076-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 21:50:16,244 WARNING [train.py:1204] (2/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_371-18169-0_sp0.9 from training. Duration: 0.9555625 2023-10-03 21:50:17,890 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.0.feed_forward1.out_proj.dropout_p, batch_count=1418173.3333333333, ans=0.1 2023-10-03 21:50:22,463 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_68702_210_23689_1_1535799577_3420068_170-92788-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 21:50:23,933 WARNING [train.py:1204] (2/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_78-182912-0 from training. Duration: 0.59 2023-10-03 21:50:25,601 INFO [train.py:1046] (2/4) Epoch 41, batch 250, loss[loss=0.1495, simple_loss=0.2341, pruned_loss=0.0325, over 24666.00 frames. ], tot_loss[loss=0.1577, simple_loss=0.2387, pruned_loss=0.03831, over 3403721.95 frames. ], batch size: 68, lr: 2.49e-03, grad_scale: 16.0 2023-10-03 21:50:25,682 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_3019_210_2028_1_1526044245_3264865_297-12384-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 21:50:25,687 WARNING [train.py:1204] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_147-111137-0_sp1.1 from training. Duration: 0.863625 2023-10-03 21:50:25,692 WARNING [train.py:1204] (2/4) Exclude cut with ID _28323_210_16187_1_1532480395667_4100300_117-8859-0_sp1.1 from training. Duration: 0.97275 2023-10-03 21:50:25,774 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7762_210_1794_1_1528199508_4057408_419-125009-0_sp1.1 from training. Duration: 0.7545625 2023-10-03 21:50:28,437 WARNING [train.py:1204] (2/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_268-87909-0 from training. Duration: 0.99 2023-10-03 21:50:28,521 WARNING [train.py:1204] (2/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_118-311357-0_sp1.1 from training. Duration: 0.92725 2023-10-03 21:50:29,711 WARNING [train.py:1204] (2/4) Exclude cut with ID ID0377W0003-870225 from training. Duration: 0.852 2023-10-03 21:50:31,047 WARNING [train.py:1204] (2/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_56-5838-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 21:50:31,144 WARNING [train.py:1204] (2/4) Exclude cut with ID _14603_210_4220_1_1530615688441_3097319_90-31383-0_sp0.9 from training. Duration: 0.9666875 2023-10-03 21:50:32,490 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_31583_210_3852_1_1532595851_4024413_58-86657-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 21:50:33,838 WARNING [train.py:1204] (2/4) Exclude cut with ID _30304_210_6319_1_1532476827261_7582060_344-113875-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 21:50:35,248 WARNING [train.py:1204] (2/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_278-104676-0_sp1.1 from training. Duration: 0.8909375 2023-10-03 21:50:35,280 WARNING [train.py:1204] (2/4) Exclude cut with ID _55844_210_2346_1_1534471212569_7625707_764-2387-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 21:50:37,103 WARNING [train.py:1204] (2/4) Exclude cut with ID _27430_210_7770_1_1532604689375_1378590_157-14963-0_sp1.1 from training. Duration: 0.963625 2023-10-03 21:50:38,626 WARNING [train.py:1204] (2/4) Exclude cut with ID _7160_210_1876_1_1527940923231_3703799_282-322611-0_sp1.1 from training. Duration: 0.7909375 2023-10-03 21:50:47,495 WARNING [train.py:1204] (2/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_190-138640-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 21:50:49,415 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_33348_210_5632_1_1533086912_3324030_63-270922-0_sp1.1 from training. Duration: 0.97275 2023-10-03 21:50:50,852 WARNING [train.py:1204] (2/4) Exclude cut with ID _54871_210_22356_1_1534665798253_3856359_135-184281-0_sp1.1 from training. Duration: 0.9 2023-10-03 21:50:58,155 WARNING [train.py:1204] (2/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_277-210798-0_sp1.1 from training. Duration: 0.5 2023-10-03 21:50:58,228 WARNING [train.py:1204] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_402-253027-0_sp0.9 from training. Duration: 0.6 2023-10-03 21:50:59,623 WARNING [train.py:1204] (2/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_540-275685-0_sp0.9 from training. Duration: 0.988875 2023-10-03 21:50:59,661 WARNING [train.py:1204] (2/4) Exclude cut with ID _59493_210_17282_1_1535078419077_3225879_118-18960-0_sp1.1 from training. Duration: 0.936375 2023-10-03 21:51:00,224 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.3.encoder.layers.2.feed_forward2.out_whiten, num_groups=1, num_channels=512, metric=11.78 vs. limit=15.0 2023-10-03 21:51:00,898 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.605e+02 1.902e+02 2.095e+02 2.391e+02 3.475e+02, threshold=4.190e+02, percent-clipped=0.0 2023-10-03 21:51:01,007 WARNING [train.py:1204] (2/4) Exclude cut with ID _6194_210_1534_1_1527511826160_3941194_321-148641-0_sp1.1 from training. Duration: 0.5818125 2023-10-03 21:51:01,014 WARNING [train.py:1204] (2/4) Exclude cut with ID _61133_210_9969_1_1535196777227_3696409_333-270325-0_sp1.1 from training. Duration: 0.8454375 2023-10-03 21:51:02,374 WARNING [train.py:1204] (2/4) Exclude cut with ID _49376_210_5986_1_1533985278732_3370740_307-145221-0_sp1.1 from training. Duration: 0.936375 2023-10-03 21:51:05,591 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6846_210_6753_1_1527850767_4295158_2-315073-0_sp0.9 from training. Duration: 0.97775 2023-10-03 21:51:08,331 WARNING [train.py:1204] (2/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_17-25045-0 from training. Duration: 0.59 2023-10-03 21:51:08,351 WARNING [train.py:1204] (2/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_503-333510-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 21:51:08,972 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.5.encoder.layers.0.conv_module1.whiten, num_groups=1, num_channels=256, metric=6.59 vs. limit=15.0 2023-10-03 21:51:09,771 WARNING [train.py:1204] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_238-245655-0_sp1.1 from training. Duration: 0.736375 2023-10-03 21:51:11,104 WARNING [train.py:1204] (2/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_329-82235-0_sp0.9 from training. Duration: 0.911125 2023-10-03 21:51:11,109 WARNING [train.py:1204] (2/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_325-113798-0_sp0.9 from training. Duration: 0.9444375 2023-10-03 21:51:12,467 WARNING [train.py:1204] (2/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_395-37222-0_sp0.9 from training. Duration: 0.8444375 2023-10-03 21:51:12,550 WARNING [train.py:1204] (2/4) Exclude cut with ID _64135_210_2253_1_1535677267876_7608972_498-5039-0_sp1.1 from training. Duration: 0.7090625 2023-10-03 21:51:12,565 WARNING [train.py:1204] (2/4) Exclude cut with ID _14922_210_6228_1_1530707435273_3693310_303-228738-0_sp0.9 from training. Duration: 0.9333125 2023-10-03 21:51:14,096 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.ff3_skip_rate, batch_count=1418440.0, ans=0.0 2023-10-03 21:51:15,159 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_577-220899-0_sp1.1 from training. Duration: 0.92725 2023-10-03 21:51:15,249 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_3475_210_2028_1_1526193578_3622569_377-166133-0_sp1.1 from training. Duration: 0.7818125 2023-10-03 21:51:16,572 WARNING [train.py:1204] (2/4) Exclude cut with ID _49376_210_5986_1_1533985278732_3370740_272-313512-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 21:51:21,809 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7762_210_1794_1_1528199508_4057408_422-227034-0_sp0.9 from training. Duration: 0.811125 2023-10-03 21:51:24,789 WARNING [train.py:1204] (2/4) Exclude cut with ID _18895_210_6228_1_1531357274234_3784930_74-341428-0_sp1.1 from training. Duration: 0.92725 2023-10-03 21:51:26,303 WARNING [train.py:1204] (2/4) Exclude cut with ID _44902_210_16187_1_1533888006062_3792078_149-67212-0_sp1.1 from training. Duration: 0.7909375 2023-10-03 21:51:29,566 WARNING [train.py:1204] (2/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_243-126020-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 21:51:30,983 WARNING [train.py:1204] (2/4) Exclude cut with ID _54218_210_12210_1_1534413646388_4409259_259-6561-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 21:51:35,768 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_41452_210_7291_1_1533437687_3941281_405-11559-0 from training. Duration: 0.91 2023-10-03 21:51:37,164 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_20120_210_6753_1_1531643035_5482859_759-225084-0_sp1.1 from training. Duration: 0.97275 2023-10-03 21:51:37,181 WARNING [train.py:1204] (2/4) Exclude cut with ID _14747_210_8341_1_1530685797463_3654089_147-131229-0_sp0.9 from training. Duration: 0.8444375 2023-10-03 21:51:38,611 WARNING [train.py:1204] (2/4) Exclude cut with ID _14867_210_1968_1_1530684179890_3846969_319-250868-0 from training. Duration: 0.99 2023-10-03 21:51:39,830 INFO [train.py:1046] (2/4) Epoch 41, batch 300, loss[loss=0.1686, simple_loss=0.2366, pruned_loss=0.05032, over 23779.00 frames. ], tot_loss[loss=0.1563, simple_loss=0.2367, pruned_loss=0.03797, over 3695358.12 frames. ], batch size: 179, lr: 2.49e-03, grad_scale: 16.0 2023-10-03 21:51:39,904 WARNING [train.py:1204] (2/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_580-93798-0_sp0.9 from training. Duration: 0.62225 2023-10-03 21:51:41,408 WARNING [train.py:1204] (2/4) Exclude cut with ID _9679_210_7497_1_1529129212906_4363929_512-313889-0_sp0.9 from training. Duration: 0.9 2023-10-03 21:51:41,417 WARNING [train.py:1204] (2/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_18-189198-0 from training. Duration: 0.9 2023-10-03 21:51:45,571 WARNING [train.py:1204] (2/4) Exclude cut with ID _49755_210_18053_1_1534581005846_3218709_137-64179-0_sp1.1 from training. Duration: 0.92725 2023-10-03 21:51:47,675 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_48049_210_5632_1_1533864553_3519937_306-283794-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 21:51:51,490 WARNING [train.py:1204] (2/4) Exclude cut with ID _43552_210_8913_1_1533726031059_3523429_25-120161-0_sp1.1 from training. Duration: 0.8909375 2023-10-03 21:51:53,365 WARNING [train.py:1204] (2/4) Exclude cut with ID _8833_210_5219_1_1528592665210_7553059_319-349181-0 from training. Duration: 0.99 2023-10-03 21:51:53,455 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_296-332325-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 21:51:56,079 WARNING [train.py:1204] (2/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_436-186311-0_sp1.1 from training. Duration: 0.5909375 2023-10-03 21:51:56,098 WARNING [train.py:1204] (2/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_348-127789-0 from training. Duration: 0.85 2023-10-03 21:51:56,103 WARNING [train.py:1204] (2/4) Exclude cut with ID _3992_210_1747_1_1526644816031_3731060_443-195183-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 21:52:00,671 WARNING [train.py:1204] (2/4) Exclude cut with ID _3465_210_3525_1_1526175667060_3574049_308-231735-0_sp1.1 from training. Duration: 0.663625 2023-10-03 21:52:05,287 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6786_210_1388_1_1527839699_4194495_378-46070-0_sp1.1 from training. Duration: 0.6545625 2023-10-03 21:52:05,336 WARNING [train.py:1204] (2/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_63-101996-0 from training. Duration: 0.88 2023-10-03 21:52:08,725 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_2541_210_1794_1_1525590269_3084765_156-173713-0 from training. Duration: 0.81 2023-10-03 21:52:08,755 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_217-239796-0_sp1.1 from training. Duration: 0.963625 2023-10-03 21:52:11,642 WARNING [train.py:1204] (2/4) Exclude cut with ID _21878_210_3525_1_1531738772002_7264330_530-329400-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 21:52:13,082 WARNING [train.py:1204] (2/4) Exclude cut with ID _21783_210_11357_1_1531742458730_3762679_150-62920-0_sp1.1 from training. Duration: 0.963625 2023-10-03 21:52:13,083 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_5522_210_3273_1_1527249606_3909096_366-172508-0 from training. Duration: 0.66 2023-10-03 21:52:13,085 WARNING [train.py:1204] (2/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_303-340446-0_sp1.1 from training. Duration: 0.8181875 2023-10-03 21:52:15,785 WARNING [train.py:1204] (2/4) Exclude cut with ID _8699_210_2069_1_1528630346831_3960409_160-24408-0_sp1.1 from training. Duration: 0.82725 2023-10-03 21:52:19,009 WARNING [train.py:1204] (2/4) Exclude cut with ID _41314_210_12651_1_1533459476543_3766460_299-187801-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 21:52:19,036 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_34886_210_10633_1_1533203882_1755661_92-345288-0_sp1.1 from training. Duration: 0.97275 2023-10-03 21:52:22,062 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_5438_210_1794_1_1527336687_2600009_104-288274-0_sp1.1 from training. Duration: 0.52725 2023-10-03 21:52:22,066 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_154-316450-0 from training. Duration: 0.76 2023-10-03 21:52:23,417 WARNING [train.py:1204] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_23-189234-0_sp0.9 from training. Duration: 0.92225 2023-10-03 21:52:25,598 WARNING [train.py:1204] (2/4) Exclude cut with ID _4779_210_2084_1_1527318022944_3914893_346-168820-0_sp1.1 from training. Duration: 0.963625 2023-10-03 21:52:26,981 WARNING [train.py:1204] (2/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_5-55893-0 from training. Duration: 0.55 2023-10-03 21:52:27,295 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=1418773.3333333333, ans=0.1 2023-10-03 21:52:28,373 WARNING [train.py:1204] (2/4) Exclude cut with ID _20455_210_5219_1_1531565996775_8098100_180-24517-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 21:52:31,259 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_3133_210_2087_1_1526106446_2324690_243-143119-0_sp0.9 from training. Duration: 0.8555625 2023-10-03 21:52:36,400 WARNING [train.py:1204] (2/4) Exclude cut with ID _65585_210_12210_1_1535450451584_3764098_293-212045-0_sp1.1 from training. Duration: 0.92725 2023-10-03 21:52:36,404 WARNING [train.py:1204] (2/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_577-332600-0 from training. Duration: 0.57 2023-10-03 21:52:40,739 WARNING [train.py:1204] (2/4) Exclude cut with ID _30039_210_13270_1_1532484612900_2725150_104-289874-0_sp1.1 from training. Duration: 0.963625 2023-10-03 21:52:40,746 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_3172_210_2087_1_1526292929_3987865_197-97762-0_sp1.1 from training. Duration: 0.6818125 2023-10-03 21:52:42,279 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_256-150908-0_sp1.1 from training. Duration: 0.963625 2023-10-03 21:52:42,519 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.self_attn_weights.pos_emb_skip_rate, batch_count=1418840.0, ans=0.0 2023-10-03 21:52:44,928 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_296-287678-0_sp1.1 from training. Duration: 0.863625 2023-10-03 21:52:44,962 WARNING [train.py:1204] (2/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_443-30034-0 from training. Duration: 0.64 2023-10-03 21:52:44,994 WARNING [train.py:1204] (2/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_494-199538-0_sp0.9 from training. Duration: 0.8333125 2023-10-03 21:52:45,557 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.4.encoder.layers.0.self_attn2.whiten, num_groups=1, num_channels=384, metric=14.46 vs. limit=22.5 2023-10-03 21:52:46,287 WARNING [train.py:1204] (2/4) Exclude cut with ID _19708_210_9206_1_1532136650733_4238364_61-162325-0_sp1.1 from training. Duration: 0.97275 2023-10-03 21:52:46,374 WARNING [train.py:1204] (2/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_475-127455-0 from training. Duration: 0.7 2023-10-03 21:52:49,083 WARNING [train.py:1204] (2/4) Exclude cut with ID _48677_210_5663_1_1533902405919_3892989_188-11581-0_sp1.1 from training. Duration: 0.963625 2023-10-03 21:52:49,117 WARNING [train.py:1204] (2/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_136-8381-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 21:52:50,906 WARNING [train.py:1204] (2/4) Exclude cut with ID _55328_210_27767_1_1534580973260_4088170_270-229555-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 21:52:50,935 WARNING [train.py:1204] (2/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_372-266951-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 21:52:51,012 WARNING [train.py:1204] (2/4) Exclude cut with ID _47790_210_16187_1_1533862814066_4207519_14-242149-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 21:52:52,621 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.nonlin_attention.balancer.prob, batch_count=1418840.0, ans=0.125 2023-10-03 21:52:55,530 INFO [train.py:1046] (2/4) Epoch 41, batch 350, loss[loss=0.1436, simple_loss=0.2244, pruned_loss=0.03142, over 24296.00 frames. ], tot_loss[loss=0.1548, simple_loss=0.2345, pruned_loss=0.03753, over 3921764.22 frames. ], batch size: 61, lr: 2.49e-03, grad_scale: 16.0 2023-10-03 21:52:55,595 WARNING [train.py:1204] (2/4) Exclude cut with ID _7877_210_6014_1_1528286358099_3750136_159-76512-0_sp1.1 from training. Duration: 0.936375 2023-10-03 21:52:55,597 WARNING [train.py:1204] (2/4) Exclude cut with ID _8263_210_1664_1_1528438953494_294100_40-204731-0_sp1.1 from training. Duration: 0.4545625 2023-10-03 21:52:58,305 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8278_210_2550_1_1528637099_4220413_694-238413-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 21:52:58,532 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.1.bypass_mid.scale_min, batch_count=1418906.6666666667, ans=0.2 2023-10-03 21:53:02,875 WARNING [train.py:1204] (2/4) Exclude cut with ID _47278_210_12041_1_1533902418349_3576110_421-172710-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 21:53:06,119 WARNING [train.py:1204] (2/4) Exclude cut with ID _14970_210_15709_1_1530773543493_1715730_215-282680-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 21:53:06,155 WARNING [train.py:1204] (2/4) Exclude cut with ID _64639_210_5816_1_1535367619437_3528000_306-149449-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 21:53:08,710 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_28-291982-0 from training. Duration: 0.97 2023-10-03 21:53:10,117 WARNING [train.py:1204] (2/4) Exclude cut with ID _55134_210_21982_1_1534590028377_3882170_412-15870-0_sp1.1 from training. Duration: 0.936375 2023-10-03 21:53:11,409 WARNING [train.py:1204] (2/4) Exclude cut with ID _13684_210_7497_1_1530619354863_3598359_312-235606-0 from training. Duration: 0.6 2023-10-03 21:53:12,931 WARNING [train.py:1204] (2/4) Exclude cut with ID _37112_210_2575_1_1532997007673_7268610_347-238993-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 21:53:12,967 WARNING [train.py:1204] (2/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_121-284152-0 from training. Duration: 0.59 2023-10-03 21:53:14,363 WARNING [train.py:1204] (2/4) Exclude cut with ID _2941_210_2577_1_1526199673150_2489759_228-2961-0_sp1.1 from training. Duration: 0.97275 2023-10-03 21:53:17,088 WARNING [train.py:1204] (2/4) Exclude cut with ID _28323_210_16187_1_1532480395667_4100300_531-24157-0 from training. Duration: 0.98 2023-10-03 21:53:17,207 WARNING [train.py:1204] (2/4) Exclude cut with ID _32704_210_8954_1_1533017488840_6747370_266-329755-0_sp0.9 from training. Duration: 0.988875 2023-10-03 21:53:20,368 WARNING [train.py:1204] (2/4) Exclude cut with ID _6653_210_2629_1_1527919172441_6732679_1038-146421-0_sp1.1 from training. Duration: 0.97275 2023-10-03 21:53:20,448 WARNING [train.py:1204] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_391-22148-0_sp0.9 from training. Duration: 0.8555625 2023-10-03 21:53:21,747 WARNING [train.py:1204] (2/4) Exclude cut with ID _64521_210_14699_1_1535243427259_3815328_543-341117-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 21:53:21,767 WARNING [train.py:1204] (2/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_52-168712-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 21:53:21,813 WARNING [train.py:1204] (2/4) Exclude cut with ID _33920_210_4220_1_1533257851500_3084399_198-334063-0_sp1.1 from training. Duration: 0.8545625 2023-10-03 21:53:21,826 WARNING [train.py:1204] (2/4) Exclude cut with ID _50093_210_4220_1_1534417141207_3432299_69-111987-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 21:53:23,230 WARNING [train.py:1204] (2/4) Exclude cut with ID _14821_210_8750_1_1530680503991_6829359_189-297451-0_sp0.9 from training. Duration: 0.611125 2023-10-03 21:53:23,391 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.1.feed_forward1.out_proj.dropout_p, batch_count=1419040.0, ans=0.1 2023-10-03 21:53:23,510 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.bypass.scale_min, batch_count=1419040.0, ans=0.2 2023-10-03 21:53:24,569 WARNING [train.py:1204] (2/4) Exclude cut with ID _8030_210_7497_1_1528444886781_3531089_262-86750-0_sp0.9 from training. Duration: 0.9 2023-10-03 21:53:24,575 WARNING [train.py:1204] (2/4) Exclude cut with ID _24612_210_8913_1_1531998054193_3806359_294-191196-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 21:53:31,193 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.621e+02 1.922e+02 2.071e+02 2.368e+02 3.597e+02, threshold=4.142e+02, percent-clipped=0.0 2023-10-03 21:53:32,678 WARNING [train.py:1204] (2/4) Exclude cut with ID _16909_210_5724_1_1531292079912_4118919_707-342896-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 21:53:32,693 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_9460_210_4211_1_1528954587_10049667_572-37035-0_sp0.9 from training. Duration: 0.888875 2023-10-03 21:53:32,946 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.nonlin_attention.balancer.prob, batch_count=1419040.0, ans=0.125 2023-10-03 21:53:34,055 WARNING [train.py:1204] (2/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_762-109886-0_sp1.1 from training. Duration: 0.82725 2023-10-03 21:53:34,087 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_14953_210_2151_1_1530701710_5616165_519-326393-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 21:53:38,784 WARNING [train.py:1204] (2/4) Exclude cut with ID _25914_210_6400_1_1532141317058_644740_52-96808-0 from training. Duration: 0.91 2023-10-03 21:53:38,788 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_68095_210_20185_1_1535718226_8888855_1079-94525-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 21:53:38,978 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.self_attn_weights.pos_emb_skip_rate, batch_count=1419106.6666666667, ans=0.0 2023-10-03 21:53:39,076 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.conv_skip_rate, batch_count=1419106.6666666667, ans=0.0 2023-10-03 21:53:44,278 WARNING [train.py:1204] (2/4) Exclude cut with ID _14860_210_4915_1_1530702020340_7343240_746-3647-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 21:53:44,279 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_70379_210_7925_1_1535941455_557455_49-101720-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 21:53:44,292 WARNING [train.py:1204] (2/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_8-307291-0_sp1.1 from training. Duration: 0.936375 2023-10-03 21:53:45,671 WARNING [train.py:1204] (2/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_117-290916-0 from training. Duration: 0.51 2023-10-03 21:53:47,116 WARNING [train.py:1204] (2/4) Exclude cut with ID _22020_210_2461_1_1531724164513_6326900_944-339840-0_sp1.1 from training. Duration: 0.963625 2023-10-03 21:53:48,521 WARNING [train.py:1204] (2/4) Exclude cut with ID IC0515W0043-254911 from training. Duration: 0.976 2023-10-03 21:53:50,415 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_3910_210_1794_1_1526742306_334530_28-238909-0 from training. Duration: 0.81 2023-10-03 21:53:50,429 WARNING [train.py:1204] (2/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_269-267275-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 21:53:53,247 WARNING [train.py:1204] (2/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_72-251255-0_sp1.1 from training. Duration: 0.8545625 2023-10-03 21:53:53,258 WARNING [train.py:1204] (2/4) Exclude cut with ID _14886_210_4220_1_1530702020355_3516190_175-229073-0 from training. Duration: 0.45 2023-10-03 21:53:53,537 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.attention_skip_rate, batch_count=1419173.3333333333, ans=0.0 2023-10-03 21:53:56,704 WARNING [train.py:1204] (2/4) Exclude cut with ID _40519_210_13794_1_1533715228655_7337390_921-209452-0_sp1.1 from training. Duration: 0.963625 2023-10-03 21:53:59,415 WARNING [train.py:1204] (2/4) Exclude cut with ID _29857_210_20034_1_1532395886753_3798160_335-163562-0_sp0.9 from training. Duration: 0.9444375 2023-10-03 21:53:59,521 WARNING [train.py:1204] (2/4) Exclude cut with ID _58583_210_5236_1_1534726835266_3574249_384-58445-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 21:54:02,687 WARNING [train.py:1204] (2/4) Exclude cut with ID _44011_210_2046_1_1533722401691_3631310_96-315656-0_sp1.1 from training. Duration: 0.963625 2023-10-03 21:54:02,704 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_68484_210_1794_1_1535849804_7678097_549-170613-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 21:54:04,300 WARNING [train.py:1204] (2/4) Exclude cut with ID _37892_210_19596_1_1533198677897_3964058_214-148058-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 21:54:07,111 WARNING [train.py:1204] (2/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_415-78792-0_sp0.9 from training. Duration: 0.9 2023-10-03 21:54:07,367 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.bypass_mid.scale_min, batch_count=1419173.3333333333, ans=0.2 2023-10-03 21:54:09,084 WARNING [train.py:1204] (2/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_383-205833-0_sp1.1 from training. Duration: 0.863625 2023-10-03 21:54:10,458 INFO [train.py:1046] (2/4) Epoch 41, batch 400, loss[loss=0.1629, simple_loss=0.2344, pruned_loss=0.04574, over 22776.00 frames. ], tot_loss[loss=0.1548, simple_loss=0.2343, pruned_loss=0.03763, over 4095219.22 frames. ], batch size: 322, lr: 2.49e-03, grad_scale: 32.0 2023-10-03 21:54:10,503 WARNING [train.py:1204] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_167-137255-0 from training. Duration: 0.64 2023-10-03 21:54:10,520 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8275_210_4198_1_1528455320_3892571_484-154786-0_sp1.1 from training. Duration: 0.963625 2023-10-03 21:54:10,577 WARNING [train.py:1204] (2/4) Exclude cut with ID _40284_210_2084_1_1533513679814_3704279_396-319412-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 21:54:12,023 WARNING [train.py:1204] (2/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_280-120942-0_sp0.9 from training. Duration: 0.8555625 2023-10-03 21:54:13,389 WARNING [train.py:1204] (2/4) Exclude cut with ID _45630_210_10556_1_1533949174498_3902049_558-240889-0_sp1.1 from training. Duration: 0.92725 2023-10-03 21:54:13,601 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.ff2_skip_rate, batch_count=1419240.0, ans=0.0 2023-10-03 21:54:14,745 WARNING [train.py:1204] (2/4) Exclude cut with ID _56235_210_10640_1_1534557335235_4161099_197-307458-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 21:54:16,073 WARNING [train.py:1204] (2/4) Exclude cut with ID _16909_210_5724_1_1531292079912_4118919_163-254843-0_sp1.1 from training. Duration: 0.92725 2023-10-03 21:54:18,155 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_103-84010-0 from training. Duration: 0.69 2023-10-03 21:54:20,628 WARNING [train.py:1204] (2/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_401-158239-0 from training. Duration: 0.71 2023-10-03 21:54:20,629 WARNING [train.py:1204] (2/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_899-233658-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 21:54:21,937 WARNING [train.py:1204] (2/4) Exclude cut with ID _23825_210_13449_1_1531915313526_3497150_240-271367-0 from training. Duration: 0.97 2023-10-03 21:54:21,991 WARNING [train.py:1204] (2/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_424-134619-0_sp1.1 from training. Duration: 0.92725 2023-10-03 21:54:27,973 WARNING [train.py:1204] (2/4) Exclude cut with ID _44831_210_13427_1_1533816011549_3689035_32-188147-0_sp1.1 from training. Duration: 0.87275 2023-10-03 21:54:27,985 WARNING [train.py:1204] (2/4) Exclude cut with ID _59170_210_6721_1_1534983633960_4203335_314-63879-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 21:54:28,000 WARNING [train.py:1204] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_602-275260-0 from training. Duration: 0.82 2023-10-03 21:54:28,031 WARNING [train.py:1204] (2/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_511-306767-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 21:54:28,066 WARNING [train.py:1204] (2/4) Exclude cut with ID _41817_210_18835_1_1533434477160_7423049_565-201775-0_sp1.1 from training. Duration: 0.92725 2023-10-03 21:54:28,075 WARNING [train.py:1204] (2/4) Exclude cut with ID _28317_210_12742_1_1532480302381_8510325_778-173151-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 21:54:28,108 WARNING [train.py:1204] (2/4) Exclude cut with ID _14998_210_5219_1_1530705582080_7474970_680-51001-0_sp1.1 from training. Duration: 0.963625 2023-10-03 21:54:31,399 WARNING [train.py:1204] (2/4) Exclude cut with ID IC0677W0059-334891 from training. Duration: 0.896 2023-10-03 21:54:31,444 WARNING [train.py:1204] (2/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_370-331600-0 from training. Duration: 0.94 2023-10-03 21:54:32,959 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.nonlin_attention.balancer.prob, batch_count=1419306.6666666667, ans=0.125 2023-10-03 21:54:37,038 WARNING [train.py:1204] (2/4) Exclude cut with ID _66840_210_5219_1_1535452168426_4196349_446-240192-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 21:54:37,356 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.balancer2.prob, batch_count=1419306.6666666667, ans=0.125 2023-10-03 21:54:38,409 WARNING [train.py:1204] (2/4) Exclude cut with ID _65242_210_18628_1_1535369393717_3823149_38-6732-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 21:54:38,469 WARNING [train.py:1204] (2/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_302-2550-0 from training. Duration: 0.81 2023-10-03 21:54:38,685 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.nonlin_attention.balancer.prob, batch_count=1419373.3333333333, ans=0.125 2023-10-03 21:54:40,453 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_43802_210_2290_1_1533516203_8829504_100-348441-0 from training. Duration: 0.91 2023-10-03 21:54:43,179 WARNING [train.py:1204] (2/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_329-285801-0_sp1.1 from training. Duration: 0.82725 2023-10-03 21:54:44,531 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_30167_210_3528_1_1532428012_4772375_343-334712-0_sp1.1 from training. Duration: 0.936375 2023-10-03 21:54:50,505 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7543_210_6753_1_1528543318_4552_1-85072-0 from training. Duration: 0.83 2023-10-03 21:54:54,490 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7002_210_1794_1_1527935634_4034444_487-236501-0_sp1.1 from training. Duration: 0.6 2023-10-03 21:54:55,864 WARNING [train.py:1204] (2/4) Exclude cut with ID _7160_210_1876_1_1527940923231_3703799_282-322611-0 from training. Duration: 0.87 2023-10-03 21:54:59,038 WARNING [train.py:1204] (2/4) Exclude cut with ID _67335_210_5219_1_1535525975652_4275229_570-262983-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 21:55:00,430 WARNING [train.py:1204] (2/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_625-338546-0_sp1.1 from training. Duration: 0.8 2023-10-03 21:55:00,459 WARNING [train.py:1204] (2/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_176-56120-0 from training. Duration: 0.83 2023-10-03 21:55:03,572 WARNING [train.py:1204] (2/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_963-187109-0_sp1.1 from training. Duration: 0.9 2023-10-03 21:55:03,775 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.feed_forward1.hidden_balancer.prob, batch_count=1419440.0, ans=0.125 2023-10-03 21:55:05,167 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.balancer1.prob, batch_count=1419440.0, ans=0.125 2023-10-03 21:55:05,202 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.feed_forward3.hidden_balancer.prob, batch_count=1419440.0, ans=0.125 2023-10-03 21:55:06,329 WARNING [train.py:1204] (2/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_121-284152-0_sp0.9 from training. Duration: 0.6555625 2023-10-03 21:55:06,442 WARNING [train.py:1204] (2/4) Exclude cut with ID _45021_210_9994_1_1533619751094_3330288_104-81711-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 21:55:10,494 WARNING [train.py:1204] (2/4) Exclude cut with ID _51382_210_23862_1_1534492801838_3650104_129-343914-0_sp1.1 from training. Duration: 0.936375 2023-10-03 21:55:11,866 WARNING [train.py:1204] (2/4) Exclude cut with ID _14970_210_15709_1_1530775547361_1966965_8-169924-0 from training. Duration: 0.92 2023-10-03 21:55:13,989 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_2295_210_2087_1_1525500993_2407863_234-83104-0_sp1.1 from training. Duration: 0.563625 2023-10-03 21:55:15,352 WARNING [train.py:1204] (2/4) Exclude cut with ID _14687_210_5854_1_1530703838259_3664889_115-239046-0 from training. Duration: 0.67 2023-10-03 21:55:16,785 WARNING [train.py:1204] (2/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_690-126306-0_sp1.1 from training. Duration: 0.7090625 2023-10-03 21:55:16,792 WARNING [train.py:1204] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_625-102955-0_sp0.9 from training. Duration: 0.8555625 2023-10-03 21:55:18,296 WARNING [train.py:1204] (2/4) Exclude cut with ID _9389_210_1664_1_1529217337301_5289269_289-213417-0 from training. Duration: 0.89 2023-10-03 21:55:21,404 WARNING [train.py:1204] (2/4) Exclude cut with ID _13253_210_1925_1_1530757775819_3577873_191-144977-0_sp0.9 from training. Duration: 0.8666875 2023-10-03 21:55:22,716 WARNING [train.py:1204] (2/4) Exclude cut with ID _21783_210_11357_1_1531742458730_3762679_325-155703-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 21:55:22,747 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_2295_210_2087_1_1525500993_2407863_116-238894-0_sp0.9 from training. Duration: 0.77775 2023-10-03 21:55:22,859 WARNING [train.py:1204] (2/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_94-126128-0 from training. Duration: 0.86 2023-10-03 21:55:24,090 INFO [train.py:1046] (2/4) Epoch 41, batch 450, loss[loss=0.1567, simple_loss=0.2374, pruned_loss=0.03799, over 23441.00 frames. ], tot_loss[loss=0.1553, simple_loss=0.2351, pruned_loss=0.03779, over 4238281.30 frames. ], batch size: 93, lr: 2.49e-03, grad_scale: 16.0 2023-10-03 21:55:24,151 WARNING [train.py:1204] (2/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_395-310220-0_sp0.9 from training. Duration: 0.988875 2023-10-03 21:55:24,391 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.balancer1.prob, batch_count=1419573.3333333333, ans=0.125 2023-10-03 21:55:25,624 WARNING [train.py:1204] (2/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_821-110852-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 21:55:26,943 WARNING [train.py:1204] (2/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_64-248359-0_sp1.1 from training. Duration: 0.62725 2023-10-03 21:55:26,954 WARNING [train.py:1204] (2/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_138-187031-0 from training. Duration: 0.99 2023-10-03 21:55:26,992 WARNING [train.py:1204] (2/4) Exclude cut with ID _4122_210_2084_1_1526713164417_4353280_528-206771-0_sp0.9 from training. Duration: 0.97775 2023-10-03 21:55:28,390 WARNING [train.py:1204] (2/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_844-96989-0_sp1.1 from training. Duration: 0.6454375 2023-10-03 21:55:31,595 WARNING [train.py:1204] (2/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_106-30782-0_sp1.1 from training. Duration: 0.72725 2023-10-03 21:55:42,025 WARNING [train.py:1204] (2/4) Exclude cut with ID _14683_210_5219_1_1530698352608_3974859_281-250040-0_sp1.1 from training. Duration: 0.936375 2023-10-03 21:55:43,285 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_46013_210_7234_1_1533776362_3744032_463-52378-0_sp0.9 from training. Duration: 0.9555625 2023-10-03 21:55:44,696 WARNING [train.py:1204] (2/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_299-34413-0 from training. Duration: 0.65 2023-10-03 21:55:45,533 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.1.encoder.layers.1.nonlin_attention.whiten1, num_groups=1, num_channels=192, metric=5.70 vs. limit=10.0 2023-10-03 21:55:46,003 WARNING [train.py:1204] (2/4) Exclude cut with ID _35799_210_5854_1_1533263487887_3529071_490-326847-0 from training. Duration: 0.99 2023-10-03 21:55:48,777 WARNING [train.py:1204] (2/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_589-9070-0_sp0.9 from training. Duration: 0.788875 2023-10-03 21:55:52,121 WARNING [train.py:1204] (2/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_884-29515-0_sp1.1 from training. Duration: 0.936375 2023-10-03 21:55:53,606 WARNING [train.py:1204] (2/4) Exclude cut with ID _15012_210_1968_1_1530856820429_5095350_474-119995-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 21:55:54,308 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.2.encoder.layers.0.conv_module1.whiten, num_groups=1, num_channels=384, metric=6.18 vs. limit=15.0 2023-10-03 21:55:56,826 WARNING [train.py:1204] (2/4) Exclude cut with ID _65585_210_12210_1_1535450451584_3764098_211-97124-0_sp1.1 from training. Duration: 0.963625 2023-10-03 21:55:58,221 WARNING [train.py:1204] (2/4) Exclude cut with ID _33640_210_2655_1_1533117605479_4855469_666-224463-0_sp1.1 from training. Duration: 0.963625 2023-10-03 21:55:59,681 WARNING [train.py:1204] (2/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_35-153886-0 from training. Duration: 0.84 2023-10-03 21:56:00,960 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.613e+02 1.852e+02 2.077e+02 2.381e+02 3.655e+02, threshold=4.153e+02, percent-clipped=0.0 2023-10-03 21:56:01,044 WARNING [train.py:1204] (2/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_210-106047-0 from training. Duration: 0.97 2023-10-03 21:56:04,316 WARNING [train.py:1204] (2/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_479-125611-0 from training. Duration: 0.96 2023-10-03 21:56:04,346 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_9417_210_1794_1_1529235261_4614131_427-153963-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 21:56:05,704 WARNING [train.py:1204] (2/4) Exclude cut with ID _23818_210_6228_1_1531917063884_3676920_133-170571-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 21:56:05,779 WARNING [train.py:1204] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_602-275260-0_sp1.1 from training. Duration: 0.7454375 2023-10-03 21:56:07,732 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9029W0121-509521 from training. Duration: 0.896 2023-10-03 21:56:07,740 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9029W0434-509821 from training. Duration: 0.64 2023-10-03 21:56:07,762 WARNING [train.py:1204] (2/4) Exclude cut with ID _23038_210_9570_1_1532001584158_3652419_289-307034-0_sp1.1 from training. Duration: 0.936375 2023-10-03 21:56:09,204 WARNING [train.py:1204] (2/4) Exclude cut with ID _29345_210_4381_1_1532689168263_3860169_149-257268-0_sp0.9 from training. Duration: 0.9 2023-10-03 21:56:10,567 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_405-339986-0_sp1.1 from training. Duration: 0.463625 2023-10-03 21:56:13,334 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_2655_210_2087_1_1525688959_3897400_437-337690-0_sp1.1 from training. Duration: 0.57275 2023-10-03 21:56:13,365 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7543_210_6753_1_1528541907_1399322_165-323619-0_sp1.1 from training. Duration: 0.62725 2023-10-03 21:56:14,647 WARNING [train.py:1204] (2/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_508-335369-0_sp1.1 from training. Duration: 0.436375 2023-10-03 21:56:14,704 WARNING [train.py:1204] (2/4) Exclude cut with ID _7852_210_5219_1_1528286471316_8139400_358-287492-0 from training. Duration: 0.96 2023-10-03 21:56:16,175 WARNING [train.py:1204] (2/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_322-217220-0_sp0.9 from training. Duration: 0.9555625 2023-10-03 21:56:18,863 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_3171_210_2087_1_1526109084_2330007_66-260583-0_sp0.9 from training. Duration: 0.8 2023-10-03 21:56:18,905 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8558_210_1410_1_1528505225_7613406_131-329524-0_sp1.1 from training. Duration: 0.7181875 2023-10-03 21:56:20,308 WARNING [train.py:1204] (2/4) Exclude cut with ID _9235_210_5219_1_1528891154253_8046120_30-326681-0 from training. Duration: 0.83 2023-10-03 21:56:23,636 WARNING [train.py:1204] (2/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_58-68956-0_sp0.9 from training. Duration: 0.92225 2023-10-03 21:56:25,473 WARNING [train.py:1204] (2/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_255-312654-0 from training. Duration: 0.91 2023-10-03 21:56:25,545 WARNING [train.py:1204] (2/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_99-167392-0 from training. Duration: 0.61 2023-10-03 21:56:27,011 WARNING [train.py:1204] (2/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_94-126128-0_sp0.9 from training. Duration: 0.9555625 2023-10-03 21:56:32,332 WARNING [train.py:1204] (2/4) Exclude cut with ID _68635_210_9317_1_1535770813410_4315870_187-192817-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 21:56:32,443 WARNING [train.py:1204] (2/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_277-204970-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 21:56:35,560 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6163_210_6400_1_1527506746_4263068_283-137811-0_sp0.9 from training. Duration: 0.8555625 2023-10-03 21:56:35,586 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9144W0121-566446 from training. Duration: 0.896 2023-10-03 21:56:38,160 INFO [train.py:1046] (2/4) Epoch 41, batch 500, loss[loss=0.1662, simple_loss=0.2383, pruned_loss=0.04701, over 23764.00 frames. ], tot_loss[loss=0.1567, simple_loss=0.2367, pruned_loss=0.03839, over 4346900.12 frames. ], batch size: 179, lr: 2.49e-03, grad_scale: 16.0 2023-10-03 21:56:39,588 WARNING [train.py:1204] (2/4) Exclude cut with ID _49371_210_12071_1_1533952761021_3812190_351-315519-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 21:56:39,649 WARNING [train.py:1204] (2/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_115-326778-0_sp1.1 from training. Duration: 0.8454375 2023-10-03 21:56:41,071 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_17292_210_6753_1_1531125388_2943734_436-198965-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 21:56:41,080 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9165W0117-576751 from training. Duration: 0.896 2023-10-03 21:56:42,442 WARNING [train.py:1204] (2/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_212-149947-0 from training. Duration: 0.73 2023-10-03 21:56:42,450 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_13757_210_3528_1_1530440682_4107239_65-167544-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 21:56:43,173 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.2.encoder.layers.2.feed_forward3.out_whiten, num_groups=1, num_channels=384, metric=11.78 vs. limit=15.0 2023-10-03 21:56:45,249 WARNING [train.py:1204] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_700-183549-0_sp0.9 from training. Duration: 0.7555625 2023-10-03 21:56:49,335 WARNING [train.py:1204] (2/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_216-343007-0_sp0.9 from training. Duration: 0.7333125 2023-10-03 21:56:51,075 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7544_210_6753_1_1528587722_3576686_295-258479-0_sp1.1 from training. Duration: 0.763625 2023-10-03 21:56:53,902 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_365-287106-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 21:56:53,922 WARNING [train.py:1204] (2/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_300-3747-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 21:56:55,175 WARNING [train.py:1204] (2/4) Exclude cut with ID _14069_210_5632_1_1530512692033_4803390_487-303201-0_sp1.1 from training. Duration: 0.92725 2023-10-03 21:57:00,176 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.conv_module2.balancer1.prob, batch_count=1419973.3333333333, ans=0.125 2023-10-03 21:57:05,470 WARNING [train.py:1204] (2/4) Exclude cut with ID _14459_210_2228_1_1531443641946_7516700_668-123026-0_sp1.1 from training. Duration: 0.936375 2023-10-03 21:57:05,505 WARNING [train.py:1204] (2/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_108-53929-0_sp1.1 from training. Duration: 0.536375 2023-10-03 21:57:05,533 WARNING [train.py:1204] (2/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_266-137104-0_sp0.9 from training. Duration: 0.72225 2023-10-03 21:57:07,262 WARNING [train.py:1204] (2/4) Exclude cut with ID _12302_210_5219_1_1529924446466_8107280_902-168619-0_sp1.1 from training. Duration: 0.936375 2023-10-03 21:57:07,288 WARNING [train.py:1204] (2/4) Exclude cut with ID _59502_210_12210_1_1535107024698_1835139_146-162495-0 from training. Duration: 0.92 2023-10-03 21:57:07,303 WARNING [train.py:1204] (2/4) Exclude cut with ID _3465_210_3525_1_1526175667060_3574049_412-287526-0_sp1.1 from training. Duration: 0.6545625 2023-10-03 21:57:10,130 WARNING [train.py:1204] (2/4) Exclude cut with ID _7160_210_1876_1_1527940923231_3703799_120-150943-0_sp0.9 from training. Duration: 0.9 2023-10-03 21:57:11,416 WARNING [train.py:1204] (2/4) Exclude cut with ID _15037_210_2253_1_1530752294104_3267650_296-192597-0_sp1.1 from training. Duration: 0.736375 2023-10-03 21:57:11,444 WARNING [train.py:1204] (2/4) Exclude cut with ID _13252_210_1925_1_1530671372047_3336909_397-216497-0_sp1.1 from training. Duration: 0.87275 2023-10-03 21:57:11,461 WARNING [train.py:1204] (2/4) Exclude cut with ID _40094_210_16380_1_1533294005773_3641159_76-269320-0_sp1.1 from training. Duration: 0.936375 2023-10-03 21:57:11,518 WARNING [train.py:1204] (2/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_449-15508-0 from training. Duration: 0.82 2023-10-03 21:57:14,413 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.attention_skip_rate, batch_count=1420040.0, ans=0.0 2023-10-03 21:57:15,499 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9309W0194-640321 from training. Duration: 0.64 2023-10-03 21:57:16,350 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.1.encoder.layers.1.self_attn2.whiten, num_groups=1, num_channels=256, metric=10.20 vs. limit=22.5 2023-10-03 21:57:16,897 WARNING [train.py:1204] (2/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_284-312178-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 21:57:19,474 WARNING [train.py:1204] (2/4) Exclude cut with ID _29853_210_7497_1_1532505600851_6642110_401-55937-0_sp1.1 from training. Duration: 0.92725 2023-10-03 21:57:19,549 WARNING [train.py:1204] (2/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_716-24918-0_sp1.1 from training. Duration: 0.92725 2023-10-03 21:57:19,580 WARNING [train.py:1204] (2/4) Exclude cut with ID _19637_210_4220_1_1531537167676_3205060_314-305638-0_sp1.1 from training. Duration: 0.92725 2023-10-03 21:57:20,948 WARNING [train.py:1204] (2/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_475-127455-0_sp1.1 from training. Duration: 0.636375 2023-10-03 21:57:22,803 WARNING [train.py:1204] (2/4) Exclude cut with ID _5444_210_2151_1_1527418808464_5312210_581-315411-0 from training. Duration: 0.62 2023-10-03 21:57:24,753 WARNING [train.py:1204] (2/4) Exclude cut with ID _44902_210_16187_1_1533888006062_3792078_149-67212-0_sp0.9 from training. Duration: 0.9666875 2023-10-03 21:57:26,213 WARNING [train.py:1204] (2/4) Exclude cut with ID _31602_210_5854_1_1532677021456_4458250_63-63253-0_sp1.1 from training. Duration: 0.963625 2023-10-03 21:57:26,539 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.conv_module1.balancer1.prob, batch_count=1420106.6666666667, ans=0.125 2023-10-03 21:57:30,337 WARNING [train.py:1204] (2/4) Exclude cut with ID _55209_210_1531_1_1534675828996_4597160_660-203992-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 21:57:33,082 WARNING [train.py:1204] (2/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_83-190624-0_sp1.1 from training. Duration: 0.92725 2023-10-03 21:57:34,608 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=1420173.3333333333, ans=0.1 2023-10-03 21:57:39,631 WARNING [train.py:1204] (2/4) Exclude cut with ID _58605_210_12210_1_1534723195939_3904259_154-139635-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 21:57:42,399 WARNING [train.py:1204] (2/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_164-224052-0 from training. Duration: 0.7 2023-10-03 21:57:42,407 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_1126-189071-0_sp1.1 from training. Duration: 0.963625 2023-10-03 21:57:42,425 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_15396_210_6890_1_1531308473_1938936_310-214789-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 21:57:43,898 WARNING [train.py:1204] (2/4) Exclude cut with ID _8395_210_1531_1_1528597871523_4411579_431-132470-0 from training. Duration: 0.74 2023-10-03 21:57:45,301 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_3780_210_2087_1_1526467760_2130483_170-259044-0_sp0.9 from training. Duration: 0.77775 2023-10-03 21:57:47,956 WARNING [train.py:1204] (2/4) Exclude cut with ID _12733_210_8341_1_1530167424556_3646380_537-230640-0_sp1.1 from training. Duration: 0.963625 2023-10-03 21:57:50,623 INFO [train.py:1046] (2/4) Epoch 41, batch 550, loss[loss=0.1401, simple_loss=0.2193, pruned_loss=0.03043, over 24631.00 frames. ], tot_loss[loss=0.1573, simple_loss=0.2374, pruned_loss=0.03862, over 4431881.64 frames. ], batch size: 60, lr: 2.49e-03, grad_scale: 16.0 2023-10-03 21:57:50,940 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.bypass.skip_rate, batch_count=1420240.0, ans=0.09899494936611666 2023-10-03 21:57:52,221 WARNING [train.py:1204] (2/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_159-281310-0 from training. Duration: 0.95 2023-10-03 21:57:55,959 WARNING [train.py:1204] (2/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_434-83000-0 from training. Duration: 0.98 2023-10-03 21:57:55,979 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_4405_210_1849_1_1526732205_3593939_493-32464-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 21:57:55,988 WARNING [train.py:1204] (2/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_342-100157-0 from training. Duration: 0.54 2023-10-03 21:57:56,049 WARNING [train.py:1204] (2/4) Exclude cut with ID _44778_210_2054_1_1533798127203_7443090_765-250133-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 21:57:56,053 WARNING [train.py:1204] (2/4) Exclude cut with ID _38670_210_15379_1_1533448810659_4391059_81-312245-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 21:57:57,190 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_50050_210_16223_1_1535435796_7290138_1273-247611-0_sp1.1 from training. Duration: 0.936375 2023-10-03 21:57:57,250 WARNING [train.py:1204] (2/4) Exclude cut with ID _44831_210_13427_1_1533816011549_3689035_399-18591-0_sp1.1 from training. Duration: 0.936375 2023-10-03 21:57:57,265 WARNING [train.py:1204] (2/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_557-324258-0_sp0.9 from training. Duration: 0.9 2023-10-03 21:57:58,598 WARNING [train.py:1204] (2/4) Exclude cut with ID _3465_210_3525_1_1526175667060_3574049_62-335552-0_sp1.1 from training. Duration: 0.7909375 2023-10-03 21:58:01,248 WARNING [train.py:1204] (2/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_673-264125-0_sp1.1 from training. Duration: 0.963625 2023-10-03 21:58:03,976 WARNING [train.py:1204] (2/4) Exclude cut with ID _8030_210_7497_1_1528444886781_3531089_262-86750-0 from training. Duration: 0.81 2023-10-03 21:58:03,998 WARNING [train.py:1204] (2/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_444-28041-0_sp1.1 from training. Duration: 0.77275 2023-10-03 21:58:05,739 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.conv_skip_rate, batch_count=1420306.6666666667, ans=0.0 2023-10-03 21:58:06,225 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.1.encoder.layers.1.feed_forward3.out_whiten, num_groups=1, num_channels=256, metric=7.66 vs. limit=15.0 2023-10-03 21:58:07,486 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_34008_210_10431_1_1533345424_8183247_516-14858-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 21:58:07,499 WARNING [train.py:1204] (2/4) Exclude cut with ID _32704_210_8954_1_1533017488840_6747370_54-248218-0_sp1.1 from training. Duration: 0.936375 2023-10-03 21:58:07,650 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.conv_module1.balancer1.min_positive, batch_count=1420306.6666666667, ans=0.025 2023-10-03 21:58:12,093 WARNING [train.py:1204] (2/4) Exclude cut with ID _13253_210_1925_1_1530757775819_3577873_300-119782-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 21:58:12,160 WARNING [train.py:1204] (2/4) Exclude cut with ID _39290_210_4220_1_1533862778760_3075118_21-240703-0_sp1.1 from training. Duration: 0.936375 2023-10-03 21:58:16,401 WARNING [train.py:1204] (2/4) Exclude cut with ID 774-127930-0014-10412-0_sp1.1 from training. Duration: 0.95 2023-10-03 21:58:16,452 WARNING [train.py:1204] (2/4) Exclude cut with ID _67499_210_17282_1_1535509805220_3692430_20-179346-0 from training. Duration: 0.97 2023-10-03 21:58:19,062 WARNING [train.py:1204] (2/4) Exclude cut with ID _8365_210_2064_1_1528592311070_4677690_431-294102-0_sp0.9 from training. Duration: 0.988875 2023-10-03 21:58:19,360 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.attention_skip_rate, batch_count=1420373.3333333333, ans=0.0 2023-10-03 21:58:22,157 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.balancer2.prob, batch_count=1420373.3333333333, ans=0.125 2023-10-03 21:58:24,020 WARNING [train.py:1204] (2/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_365-1492-0_sp1.1 from training. Duration: 0.82725 2023-10-03 21:58:25,131 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_105-114797-0_sp0.9 from training. Duration: 0.9333125 2023-10-03 21:58:26,508 WARNING [train.py:1204] (2/4) Exclude cut with ID _6274_210_2084_1_1527937583837_7494020_769-168302-0_sp0.9 from training. Duration: 0.788875 2023-10-03 21:58:28,301 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.621e+02 1.959e+02 2.274e+02 2.623e+02 4.132e+02, threshold=4.548e+02, percent-clipped=0.0 2023-10-03 21:58:31,085 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_40340_210_9024_1_1533433916_4132944_320-270277-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 21:58:31,090 WARNING [train.py:1204] (2/4) Exclude cut with ID ID0203W0003-781711 from training. Duration: 0.958 2023-10-03 21:58:31,166 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_30434_210_13049_1_1532519384_981746_52-12932-0_sp1.1 from training. Duration: 0.936375 2023-10-03 21:58:32,571 WARNING [train.py:1204] (2/4) Exclude cut with ID _8263_210_1664_1_1528438953494_294100_40-204731-0_sp0.9 from training. Duration: 0.5555625 2023-10-03 21:58:35,337 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1032-236863-0_sp0.9 from training. Duration: 0.9333125 2023-10-03 21:58:36,650 WARNING [train.py:1204] (2/4) Exclude cut with ID _8394_210_6702_1_1528590504316_3540189_335-274780-0_sp0.9 from training. Duration: 0.8666875 2023-10-03 21:58:36,657 WARNING [train.py:1204] (2/4) Exclude cut with ID _66873_210_5517_1_1535454136506_4293370_654-52713-0_sp1.1 from training. Duration: 0.7 2023-10-03 21:58:37,941 WARNING [train.py:1204] (2/4) Exclude cut with ID _5250_210_5219_1_1527249561806_8692110_742-267856-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 21:58:39,421 WARNING [train.py:1204] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_74-48717-0 from training. Duration: 0.58 2023-10-03 21:58:41,230 WARNING [train.py:1204] (2/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_18-46756-0 from training. Duration: 0.79 2023-10-03 21:58:41,285 WARNING [train.py:1204] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_612-56061-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 21:58:42,599 WARNING [train.py:1204] (2/4) Exclude cut with ID _56423_210_5854_1_1534896061049_3165969_412-2403-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 21:58:42,661 WARNING [train.py:1204] (2/4) Exclude cut with ID _62574_210_2655_1_1535180407058_3933249_120-196848-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 21:58:42,662 WARNING [train.py:1204] (2/4) Exclude cut with ID _40519_210_13794_1_1533715228655_7337390_916-51000-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 21:58:45,973 WARNING [train.py:1204] (2/4) Exclude cut with ID _65263_210_22356_1_1535763598691_3921037_108-21902-0_sp1.1 from training. Duration: 0.9 2023-10-03 21:58:46,057 WARNING [train.py:1204] (2/4) Exclude cut with ID _4122_210_2084_1_1526713164417_4353280_528-206771-0_sp1.1 from training. Duration: 0.8 2023-10-03 21:58:48,856 WARNING [train.py:1204] (2/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_43-325657-0_sp1.1 from training. Duration: 0.92725 2023-10-03 21:58:48,904 WARNING [train.py:1204] (2/4) Exclude cut with ID _44659_210_2721_1_1534036120061_6614860_314-82041-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 21:58:50,341 WARNING [train.py:1204] (2/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_81-300212-0_sp0.9 from training. Duration: 0.6666875 2023-10-03 21:58:50,419 WARNING [train.py:1204] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_665-196155-0_sp0.9 from training. Duration: 0.7444375 2023-10-03 21:58:51,748 WARNING [train.py:1204] (2/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_140-336881-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 21:58:53,191 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6290_210_3273_1_1527595224_1282787_115-87095-0_sp0.9 from training. Duration: 0.611125 2023-10-03 21:58:55,135 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_53373_210_5399_1_1534229471_4715695_607-259685-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 21:58:56,516 WARNING [train.py:1204] (2/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_175-163894-0_sp1.1 from training. Duration: 0.67275 2023-10-03 21:58:56,533 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7002_210_1794_1_1527939683_5157286_1-281264-0_sp1.1 from training. Duration: 0.4 2023-10-03 21:59:02,533 WARNING [train.py:1204] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_605-257205-0 from training. Duration: 0.59 2023-10-03 21:59:05,219 INFO [train.py:1046] (2/4) Epoch 41, batch 600, loss[loss=0.1566, simple_loss=0.2471, pruned_loss=0.03298, over 24535.00 frames. ], tot_loss[loss=0.1584, simple_loss=0.238, pruned_loss=0.03941, over 4478394.96 frames. ], batch size: 71, lr: 2.49e-03, grad_scale: 16.0 2023-10-03 21:59:06,630 WARNING [train.py:1204] (2/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_454-314901-0 from training. Duration: 0.46 2023-10-03 21:59:08,142 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_46032_210_14656_1_1533781412_4507969_287-130422-0_sp1.1 from training. Duration: 0.97275 2023-10-03 21:59:08,164 WARNING [train.py:1204] (2/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_186-309161-0_sp1.1 from training. Duration: 0.4909375 2023-10-03 21:59:08,197 WARNING [train.py:1204] (2/4) Exclude cut with ID _42659_210_2070_1_1533607442765_3120380_45-342307-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 21:59:16,274 WARNING [train.py:1204] (2/4) Exclude cut with ID _12555_210_8750_1_1530075594402_3780239_478-326818-0_sp1.1 from training. Duration: 0.963625 2023-10-03 21:59:16,403 WARNING [train.py:1204] (2/4) Exclude cut with ID _7606_210_4915_1_1528894253277_5080690_127-177761-0_sp1.1 from training. Duration: 0.5818125 2023-10-03 21:59:17,822 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6788_210_1388_1_1527898698_4562021_91-197981-0 from training. Duration: 0.83 2023-10-03 21:59:17,954 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.conv_skip_rate, batch_count=1420573.3333333333, ans=0.0 2023-10-03 21:59:20,579 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8558_210_1410_1_1528505225_7613406_131-329524-0_sp0.9 from training. Duration: 0.87775 2023-10-03 21:59:21,815 WARNING [train.py:1204] (2/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_153-153537-0_sp1.1 from training. Duration: 0.82725 2023-10-03 21:59:24,497 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_17292_210_6753_1_1531125388_2943734_285-349105-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 21:59:26,573 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.feed_forward1.hidden_balancer.prob, batch_count=1420640.0, ans=0.125 2023-10-03 21:59:27,720 WARNING [train.py:1204] (2/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_37-41919-0 from training. Duration: 0.87 2023-10-03 21:59:27,747 WARNING [train.py:1204] (2/4) Exclude cut with ID _60860_210_8254_1_1534896015875_3557169_172-294852-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 21:59:33,879 WARNING [train.py:1204] (2/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_146-276944-0 from training. Duration: 0.83 2023-10-03 21:59:36,616 WARNING [train.py:1204] (2/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_1254-44082-0_sp1.1 from training. Duration: 0.936375 2023-10-03 21:59:36,628 WARNING [train.py:1204] (2/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_1115-180350-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 21:59:36,655 WARNING [train.py:1204] (2/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_60-146189-0_sp1.1 from training. Duration: 0.8909375 2023-10-03 21:59:42,139 WARNING [train.py:1204] (2/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_517-153649-0_sp1.1 from training. Duration: 0.92725 2023-10-03 21:59:42,157 WARNING [train.py:1204] (2/4) Exclude cut with ID _39571_210_13449_1_1533801584143_3651077_37-266257-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 21:59:44,078 WARNING [train.py:1204] (2/4) Exclude cut with ID _6653_210_2629_1_1527919172441_6732679_527-276118-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 21:59:50,148 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7696_210_2040_1_1528207167_3716099_117-125434-0_sp1.1 from training. Duration: 0.6909375 2023-10-03 21:59:53,173 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.conv_module1.balancer2.prob, batch_count=1420773.3333333333, ans=0.125 2023-10-03 21:59:54,363 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_25767_210_2161_1_1532307483_3867748_478-156360-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 21:59:54,367 WARNING [train.py:1204] (2/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_177-49092-0_sp1.1 from training. Duration: 0.82725 2023-10-03 21:59:54,380 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_11305_210_1794_1_1529750428_7178148_680-317073-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 22:00:03,830 WARNING [train.py:1204] (2/4) Exclude cut with ID _53565_210_10635_1_1534337218335_3820259_287-18120-0 from training. Duration: 0.97 2023-10-03 22:00:06,718 WARNING [train.py:1204] (2/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_498-40153-0_sp1.1 from training. Duration: 0.57275 2023-10-03 22:00:06,743 WARNING [train.py:1204] (2/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_555-63066-0_sp1.1 from training. Duration: 0.963625 2023-10-03 22:00:08,269 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.feed_forward3.hidden_balancer.prob, batch_count=1420840.0, ans=0.125 2023-10-03 22:00:09,672 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.ff2_skip_rate, batch_count=1420840.0, ans=0.0 2023-10-03 22:00:10,777 WARNING [train.py:1204] (2/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_395-310220-0 from training. Duration: 0.89 2023-10-03 22:00:12,204 WARNING [train.py:1204] (2/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_937-208281-0_sp1.1 from training. Duration: 0.77275 2023-10-03 22:00:12,778 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.4.encoder.layers.2.conv_module1.whiten, num_groups=1, num_channels=384, metric=3.73 vs. limit=15.0 2023-10-03 22:00:15,429 WARNING [train.py:1204] (2/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_120-211630-0 from training. Duration: 0.92 2023-10-03 22:00:17,237 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_454-276429-0_sp1.1 from training. Duration: 0.7909375 2023-10-03 22:00:17,276 WARNING [train.py:1204] (2/4) Exclude cut with ID _22826_210_9994_1_1531908201522_3279169_404-75889-0_sp1.1 from training. Duration: 0.8090625 2023-10-03 22:00:20,120 INFO [train.py:1046] (2/4) Epoch 41, batch 650, loss[loss=0.1564, simple_loss=0.2511, pruned_loss=0.03081, over 24454.00 frames. ], tot_loss[loss=0.157, simple_loss=0.2365, pruned_loss=0.03873, over 4536084.44 frames. ], batch size: 69, lr: 2.49e-03, grad_scale: 16.0 2023-10-03 22:00:21,655 WARNING [train.py:1204] (2/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_282-136641-0_sp0.9 from training. Duration: 0.4666875 2023-10-03 22:00:21,756 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_809-17156-0_sp1.1 from training. Duration: 0.663625 2023-10-03 22:00:24,438 WARNING [train.py:1204] (2/4) Exclude cut with ID _9235_210_5219_1_1528891154253_8046120_1003-101638-0_sp0.9 from training. Duration: 0.811125 2023-10-03 22:00:26,440 WARNING [train.py:1204] (2/4) Exclude cut with ID _22826_210_9994_1_1531908201522_3279169_404-75889-0_sp0.9 from training. Duration: 0.988875 2023-10-03 22:00:27,631 WARNING [train.py:1204] (2/4) Exclude cut with ID _59288_210_5854_1_1535077757774_3412246_570-295255-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 22:00:30,841 WARNING [train.py:1204] (2/4) Exclude cut with ID _3465_210_3525_1_1526175667060_3574049_412-287526-0 from training. Duration: 0.72 2023-10-03 22:00:32,282 WARNING [train.py:1204] (2/4) Exclude cut with ID _44010_210_4525_1_1533884262964_4013620_649-79655-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 22:00:36,364 WARNING [train.py:1204] (2/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_57-270037-0_sp0.9 from training. Duration: 0.8555625 2023-10-03 22:00:36,365 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_34815_210_6388_1_1533381320_3571903_302-255963-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 22:00:40,550 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_47449_210_5399_1_1534076445_4006929_573-301852-0_sp1.1 from training. Duration: 0.97275 2023-10-03 22:00:43,386 WARNING [train.py:1204] (2/4) Exclude cut with ID 6709-74022-0004-86860-0_sp1.1 from training. Duration: 0.9409375 2023-10-03 22:00:44,762 WARNING [train.py:1204] (2/4) Exclude cut with ID _40519_210_13794_1_1533715228655_7337390_111-71991-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 22:00:46,569 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_30167_210_3528_1_1532428012_4772375_169-214186-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 22:00:49,371 WARNING [train.py:1204] (2/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_20-235313-0_sp1.1 from training. Duration: 0.963625 2023-10-03 22:00:49,411 WARNING [train.py:1204] (2/4) Exclude cut with ID _4681_210_3525_1_1527251040321_1235713_123-24428-0_sp0.9 from training. Duration: 0.5333125 2023-10-03 22:00:52,189 WARNING [train.py:1204] (2/4) Exclude cut with ID _31602_210_5854_1_1532677021456_4458250_444-198752-0_sp1.1 from training. Duration: 0.97275 2023-10-03 22:00:53,495 WARNING [train.py:1204] (2/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_9-279442-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 22:00:53,550 WARNING [train.py:1204] (2/4) Exclude cut with ID _6275_210_2084_1_1528110062213_7669049_973-198085-0_sp0.9 from training. Duration: 0.8333125 2023-10-03 22:00:54,907 WARNING [train.py:1204] (2/4) Exclude cut with ID _29853_210_7497_1_1532505600851_6642110_567-250221-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 22:00:56,661 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.581e+02 1.942e+02 2.236e+02 2.477e+02 3.530e+02, threshold=4.472e+02, percent-clipped=0.0 2023-10-03 22:00:56,762 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6545_210_2040_1_1527774670_4245258_407-48070-0_sp1.1 from training. Duration: 0.6909375 2023-10-03 22:00:58,295 WARNING [train.py:1204] (2/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_224-343295-0_sp0.9 from training. Duration: 0.611125 2023-10-03 22:00:58,309 WARNING [train.py:1204] (2/4) Exclude cut with ID IC0125W0011-61817 from training. Duration: 0.88 2023-10-03 22:00:58,317 WARNING [train.py:1204] (2/4) Exclude cut with ID _60113_210_16271_1_1535164212481_4386079_435-154371-0_sp1.1 from training. Duration: 0.97275 2023-10-03 22:00:58,332 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_25836_210_16554_1_1532138167_53197_3-9394-0_sp1.1 from training. Duration: 0.92725 2023-10-03 22:00:58,483 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.balancer2.prob, batch_count=1421040.0, ans=0.125 2023-10-03 22:01:02,358 WARNING [train.py:1204] (2/4) Exclude cut with ID _29343_210_4381_1_1532602757752_3674109_57-55125-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 22:01:02,451 WARNING [train.py:1204] (2/4) Exclude cut with ID _56235_210_10640_1_1534557335235_4161099_267-23560-0_sp1.1 from training. Duration: 0.936375 2023-10-03 22:01:02,476 WARNING [train.py:1204] (2/4) Exclude cut with ID _30196_210_1664_1_1533708496522_7074140_1150-278867-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 22:01:03,730 WARNING [train.py:1204] (2/4) Exclude cut with ID _6270_210_2084_1_1527850911573_3856420_26-277343-0_sp0.9 from training. Duration: 0.911125 2023-10-03 22:01:05,121 WARNING [train.py:1204] (2/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_325-62802-0 from training. Duration: 0.87 2023-10-03 22:01:05,197 WARNING [train.py:1204] (2/4) Exclude cut with ID _56524_210_9642_1_1534572011779_3619170_145-310842-0_sp1.1 from training. Duration: 0.9 2023-10-03 22:01:05,233 WARNING [train.py:1204] (2/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_860-96198-0_sp0.9 from training. Duration: 0.811125 2023-10-03 22:01:06,589 WARNING [train.py:1204] (2/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_17-25045-0_sp1.1 from training. Duration: 0.536375 2023-10-03 22:01:06,605 WARNING [train.py:1204] (2/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_349-69727-0_sp1.1 from training. Duration: 0.936375 2023-10-03 22:01:09,176 WARNING [train.py:1204] (2/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_580-93798-0_sp1.1 from training. Duration: 0.5090625 2023-10-03 22:01:11,851 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7762_210_1794_1_1528199508_4057408_419-125009-0 from training. Duration: 0.83 2023-10-03 22:01:11,958 WARNING [train.py:1204] (2/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_356-167816-0 from training. Duration: 0.72 2023-10-03 22:01:13,280 WARNING [train.py:1204] (2/4) Exclude cut with ID _29345_210_4381_1_1532689168263_3860169_225-97100-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 22:01:13,296 WARNING [train.py:1204] (2/4) Exclude cut with ID _14227_210_4381_1_1530701804286_3940259_374-194712-0_sp1.1 from training. Duration: 0.92725 2023-10-03 22:01:13,330 WARNING [train.py:1204] (2/4) Exclude cut with ID _40137_210_12210_1_1533290484046_3795180_249-187059-0_sp0.9 from training. Duration: 0.97775 2023-10-03 22:01:13,355 WARNING [train.py:1204] (2/4) Exclude cut with ID _22826_210_9994_1_1531908201522_3279169_389-317194-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 22:01:16,533 WARNING [train.py:1204] (2/4) Exclude cut with ID _27485_210_4916_1_1532739191677_3668269_239-122918-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 22:01:21,366 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_2245_210_2715_1_1525511548_3540254_128-117609-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 22:01:21,400 WARNING [train.py:1204] (2/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_160-276178-0_sp1.1 from training. Duration: 0.963625 2023-10-03 22:01:22,769 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_31920_210_10633_1_1532860009_1686094_1-279735-0_sp1.1 from training. Duration: 0.97275 2023-10-03 22:01:25,912 WARNING [train.py:1204] (2/4) Exclude cut with ID _51078_210_9070_1_1534161557424_3805250_325-262383-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 22:01:25,933 WARNING [train.py:1204] (2/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_643-52007-0_sp0.9 from training. Duration: 0.6333125 2023-10-03 22:01:27,281 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_51937_210_5014_1_1534573610_4022025_518-299248-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 22:01:33,466 INFO [train.py:1046] (2/4) Epoch 41, batch 700, loss[loss=0.1687, simple_loss=0.2548, pruned_loss=0.04134, over 23762.00 frames. ], tot_loss[loss=0.156, simple_loss=0.235, pruned_loss=0.03849, over 4561906.84 frames. ], batch size: 85, lr: 2.49e-03, grad_scale: 16.0 2023-10-03 22:01:33,550 WARNING [train.py:1204] (2/4) Exclude cut with ID _13252_210_1925_1_1530671372047_3336909_166-314607-0_sp1.1 from training. Duration: 0.4909375 2023-10-03 22:01:33,554 WARNING [train.py:1204] (2/4) Exclude cut with ID _44778_210_2054_1_1533798127203_7443090_1087-282939-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 22:01:33,590 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_23850_210_4238_1_1532232597_1632433_32-185135-0_sp1.1 from training. Duration: 0.7818125 2023-10-03 22:01:34,747 WARNING [train.py:1204] (2/4) Exclude cut with ID _59500_210_8254_1_1534809610680_3736009_145-89359-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 22:01:37,559 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_340-336408-0 from training. Duration: 0.93 2023-10-03 22:01:37,632 WARNING [train.py:1204] (2/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_9-21737-0 from training. Duration: 0.97 2023-10-03 22:01:41,669 WARNING [train.py:1204] (2/4) Exclude cut with ID _2220_210_1819_1_1525509109898_3658680_50-99837-0 from training. Duration: 0.54 2023-10-03 22:01:41,734 WARNING [train.py:1204] (2/4) Exclude cut with ID _30304_210_6319_1_1532476827261_7582060_879-299350-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 22:01:44,420 WARNING [train.py:1204] (2/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_905-160074-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 22:01:46,291 WARNING [train.py:1204] (2/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_395-73570-0 from training. Duration: 0.99 2023-10-03 22:01:51,072 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7264_210_4938_1_1527939911_8132095_800-91995-0_sp1.1 from training. Duration: 0.7818125 2023-10-03 22:01:52,484 WARNING [train.py:1204] (2/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_77-311404-0_sp1.1 from training. Duration: 0.8545625 2023-10-03 22:01:54,164 WARNING [train.py:1204] (2/4) Exclude cut with ID _13679_210_7497_1_1530534293662_3355098_107-80772-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 22:01:54,905 WARNING [train.py:1204] (2/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_224-343295-0_sp1.1 from training. Duration: 0.5 2023-10-03 22:01:54,938 WARNING [train.py:1204] (2/4) Exclude cut with ID _54871_210_22356_1_1534665798253_3856359_156-253482-0_sp1.1 from training. Duration: 0.963625 2023-10-03 22:01:56,419 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.conv_module2.balancer2.prob, batch_count=1421306.6666666667, ans=0.125 2023-10-03 22:01:59,304 WARNING [train.py:1204] (2/4) Exclude cut with ID _49728_210_12651_1_1534041034294_6788150_309-125532-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 22:02:00,937 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.feed_forward1.hidden_balancer.prob, batch_count=1421306.6666666667, ans=0.125 2023-10-03 22:02:01,993 WARNING [train.py:1204] (2/4) Exclude cut with ID _13913_210_9994_1_1530579615187_3383897_473-277682-0_sp0.9 from training. Duration: 0.4333125 2023-10-03 22:02:01,998 WARNING [train.py:1204] (2/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_3-276701-0_sp1.1 from training. Duration: 0.82725 2023-10-03 22:02:04,654 WARNING [train.py:1204] (2/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_282-136641-0 from training. Duration: 0.42 2023-10-03 22:02:06,109 WARNING [train.py:1204] (2/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_127-566-0 from training. Duration: 0.84 2023-10-03 22:02:08,841 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6163_210_6400_1_1527506746_4263068_283-137811-0_sp1.1 from training. Duration: 0.7 2023-10-03 22:02:08,869 WARNING [train.py:1204] (2/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_34-183685-0_sp1.1 from training. Duration: 0.8909375 2023-10-03 22:02:10,253 WARNING [train.py:1204] (2/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_154-135696-0_sp0.9 from training. Duration: 0.888875 2023-10-03 22:02:14,390 WARNING [train.py:1204] (2/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_676-51014-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 22:02:14,442 WARNING [train.py:1204] (2/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_38-169920-0 from training. Duration: 0.91 2023-10-03 22:02:20,475 WARNING [train.py:1204] (2/4) Exclude cut with ID _6653_210_2629_1_1527919172441_6732679_747-114465-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 22:02:20,526 WARNING [train.py:1204] (2/4) Exclude cut with ID _8021_210_7994_1_1528376451358_3653069_100-140231-0_sp1.1 from training. Duration: 0.6090625 2023-10-03 22:02:20,544 WARNING [train.py:1204] (2/4) Exclude cut with ID _64135_210_2253_1_1535677267876_7608972_133-188266-0 from training. Duration: 0.81 2023-10-03 22:02:25,144 WARNING [train.py:1204] (2/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_714-310538-0_sp1.1 from training. Duration: 0.7818125 2023-10-03 22:02:26,475 WARNING [train.py:1204] (2/4) Exclude cut with ID _39867_210_9826_1_1533553222030_9344259_1134-288498-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 22:02:29,296 WARNING [train.py:1204] (2/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_211-237289-0_sp1.1 from training. Duration: 0.936375 2023-10-03 22:02:34,026 WARNING [train.py:1204] (2/4) Exclude cut with ID _59202_210_26984_1_1534816810612_3549174_137-318710-0_sp0.9 from training. Duration: 0.988875 2023-10-03 22:02:34,058 WARNING [train.py:1204] (2/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_86-66192-0 from training. Duration: 0.55 2023-10-03 22:02:36,929 WARNING [train.py:1204] (2/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_540-275685-0 from training. Duration: 0.89 2023-10-03 22:02:36,959 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8776_210_2040_1_1528638837_4088036_244-278763-0 from training. Duration: 0.9 2023-10-03 22:02:39,653 WARNING [train.py:1204] (2/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_9-219972-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 22:02:41,091 WARNING [train.py:1204] (2/4) Exclude cut with ID _44659_210_2721_1_1534036120061_6614860_656-283823-0_sp1.1 from training. Duration: 0.92725 2023-10-03 22:02:42,377 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_69630_210_7234_1_1535789601_661049_77-216385-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 22:02:45,468 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_19952_210_2161_1_1531875469_3851750_210-308919-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 22:02:45,474 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_4737_210_2040_1_1526998689_3293008_378-178650-0 from training. Duration: 0.95 2023-10-03 22:02:45,824 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.bypass.scale_min, batch_count=1421573.3333333333, ans=0.2 2023-10-03 22:02:47,441 INFO [train.py:1046] (2/4) Epoch 41, batch 750, loss[loss=0.145, simple_loss=0.2256, pruned_loss=0.03226, over 23500.00 frames. ], tot_loss[loss=0.1556, simple_loss=0.2346, pruned_loss=0.03828, over 4588181.30 frames. ], batch size: 134, lr: 2.49e-03, grad_scale: 16.0 2023-10-03 22:02:48,901 WARNING [train.py:1204] (2/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_224-343295-0 from training. Duration: 0.55 2023-10-03 22:02:48,917 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7002_210_1794_1_1527939683_5157286_184-229630-0 from training. Duration: 0.74 2023-10-03 22:02:50,152 WARNING [train.py:1204] (2/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_6-214815-0 from training. Duration: 0.85 2023-10-03 22:02:50,239 WARNING [train.py:1204] (2/4) Exclude cut with ID _8699_210_2069_1_1528630346831_3960409_160-24408-0 from training. Duration: 0.91 2023-10-03 22:02:50,270 WARNING [train.py:1204] (2/4) Exclude cut with ID _5585_210_2667_1_1527418813324_8007340_89-332072-0 from training. Duration: 0.64 2023-10-03 22:02:52,113 WARNING [train.py:1204] (2/4) Exclude cut with ID _5250_210_5219_1_1527249561806_8692110_528-259800-0_sp1.1 from training. Duration: 0.963625 2023-10-03 22:02:53,435 WARNING [train.py:1204] (2/4) Exclude cut with ID _55383_210_10635_1_1534468426172_2231669_134-128246-0 from training. Duration: 0.89 2023-10-03 22:02:54,752 WARNING [train.py:1204] (2/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_400-338274-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 22:02:56,139 WARNING [train.py:1204] (2/4) Exclude cut with ID _14224_210_4381_1_1530529062037_4264010_364-22817-0_sp0.9 from training. Duration: 0.788875 2023-10-03 22:02:56,243 WARNING [train.py:1204] (2/4) Exclude cut with ID _42863_210_9070_1_1533902398788_3696920_331-291057-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 22:02:57,551 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_5436_210_1794_1_1527331317_2541494_229-211016-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 22:02:57,583 WARNING [train.py:1204] (2/4) Exclude cut with ID _6456_210_1534_1_1527681634212_3181049_345-252877-0_sp0.9 from training. Duration: 0.711125 2023-10-03 22:02:58,911 WARNING [train.py:1204] (2/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_96-81499-0_sp1.1 from training. Duration: 0.92725 2023-10-03 22:03:00,412 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_5575_210_1410_1_1527301305_2570170_342-265572-0_sp1.1 from training. Duration: 0.8545625 2023-10-03 22:03:02,218 WARNING [train.py:1204] (2/4) Exclude cut with ID _5444_210_2151_1_1527418808464_5312210_726-27165-0_sp0.9 from training. Duration: 0.7444375 2023-10-03 22:03:06,182 WARNING [train.py:1204] (2/4) Exclude cut with ID _22083_210_8254_1_1531702862439_3678970_304-327750-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 22:03:07,624 WARNING [train.py:1204] (2/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_245-226199-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 22:03:07,653 WARNING [train.py:1204] (2/4) Exclude cut with ID _42875_210_14856_1_1533704397487_4012433_102-199499-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 22:03:08,970 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_51225_210_3601_1_1534400634_2327383_55-301844-0 from training. Duration: 0.93 2023-10-03 22:03:10,361 WARNING [train.py:1204] (2/4) Exclude cut with ID _28317_210_12742_1_1532480302381_8510325_916-195848-0_sp1.1 from training. Duration: 0.836375 2023-10-03 22:03:10,424 WARNING [train.py:1204] (2/4) Exclude cut with ID _44718_210_5816_1_1534050051869_6469030_497-311999-0_sp1.1 from training. Duration: 0.936375 2023-10-03 22:03:11,770 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_34886_210_10633_1_1533203882_1755661_91-194765-0_sp1.1 from training. Duration: 0.936375 2023-10-03 22:03:13,201 WARNING [train.py:1204] (2/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_310-26186-0_sp0.9 from training. Duration: 0.77775 2023-10-03 22:03:14,529 WARNING [train.py:1204] (2/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_416-257730-0 from training. Duration: 0.65 2023-10-03 22:03:14,535 WARNING [train.py:1204] (2/4) Exclude cut with ID _9389_210_1664_1_1529217337301_5289269_209-154760-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 22:03:16,466 WARNING [train.py:1204] (2/4) Exclude cut with ID _4092_210_2151_1_1526643044449_3135759_227-213972-0 from training. Duration: 0.54 2023-10-03 22:03:16,500 WARNING [train.py:1204] (2/4) Exclude cut with ID IC0679W0044-335852 from training. Duration: 0.768 2023-10-03 22:03:17,807 WARNING [train.py:1204] (2/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_42-186162-0 from training. Duration: 0.92 2023-10-03 22:03:17,813 WARNING [train.py:1204] (2/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_302-2550-0_sp0.9 from training. Duration: 0.9 2023-10-03 22:03:17,827 WARNING [train.py:1204] (2/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_356-31216-0_sp0.9 from training. Duration: 0.8333125 2023-10-03 22:03:18,047 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.balancer2.prob, batch_count=1421706.6666666667, ans=0.125 2023-10-03 22:03:21,682 WARNING [train.py:1204] (2/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_125-322172-0_sp1.1 from training. Duration: 0.8454375 2023-10-03 22:03:24,237 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.560e+02 1.879e+02 2.063e+02 2.265e+02 2.874e+02, threshold=4.127e+02, percent-clipped=0.0 2023-10-03 22:03:28,512 WARNING [train.py:1204] (2/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_141-89727-0_sp0.9 from training. Duration: 0.788875 2023-10-03 22:03:28,528 WARNING [train.py:1204] (2/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_647-337784-0_sp1.1 from training. Duration: 0.97275 2023-10-03 22:03:28,538 WARNING [train.py:1204] (2/4) Exclude cut with ID _6493_210_1747_1_1528459210424_4012470_513-140967-0_sp1.1 from training. Duration: 0.7181875 2023-10-03 22:03:31,291 WARNING [train.py:1204] (2/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_87-266334-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 22:03:31,393 WARNING [train.py:1204] (2/4) Exclude cut with ID _26528_210_9070_1_1532570410842_3851310_526-261338-0_sp1.1 from training. Duration: 0.92725 2023-10-03 22:03:31,437 WARNING [train.py:1204] (2/4) Exclude cut with ID _14224_210_4381_1_1530529062037_4264010_380-343670-0 from training. Duration: 0.88 2023-10-03 22:03:33,264 WARNING [train.py:1204] (2/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_449-15508-0_sp1.1 from training. Duration: 0.7454375 2023-10-03 22:03:35,882 WARNING [train.py:1204] (2/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_372-6869-0_sp0.9 from training. Duration: 0.488875 2023-10-03 22:03:37,198 WARNING [train.py:1204] (2/4) Exclude cut with ID _26907_210_7234_1_1532221740694_3205263_337-2494-0_sp0.9 from training. Duration: 0.97775 2023-10-03 22:03:38,708 WARNING [train.py:1204] (2/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_10-100367-0_sp1.1 from training. Duration: 0.8909375 2023-10-03 22:03:39,996 WARNING [train.py:1204] (2/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_47-167373-0 from training. Duration: 0.92 2023-10-03 22:03:40,066 WARNING [train.py:1204] (2/4) Exclude cut with ID _28323_210_16187_1_1532480395667_4100300_628-282179-0_sp1.1 from training. Duration: 0.97275 2023-10-03 22:03:40,519 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.5.encoder.layers.0.self_attn1.whiten, num_groups=1, num_channels=256, metric=11.69 vs. limit=22.5 2023-10-03 22:03:44,229 WARNING [train.py:1204] (2/4) Exclude cut with ID _20455_210_5219_1_1531565996775_8098100_213-280898-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 22:03:45,638 WARNING [train.py:1204] (2/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_383-316000-0_sp1.1 from training. Duration: 0.8181875 2023-10-03 22:03:45,674 WARNING [train.py:1204] (2/4) Exclude cut with ID _18513_210_9206_1_1531618215223_3862620_16-78180-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 22:03:47,876 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.2.encoder.layers.1.self_attn1.whiten, num_groups=1, num_channels=384, metric=18.33 vs. limit=22.5 2023-10-03 22:03:48,897 WARNING [train.py:1204] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_330-327861-0_sp1.1 from training. Duration: 0.6090625 2023-10-03 22:03:51,582 WARNING [train.py:1204] (2/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_356-31216-0 from training. Duration: 0.75 2023-10-03 22:03:52,768 WARNING [train.py:1204] (2/4) Exclude cut with ID _25248_210_8750_1_1532062825941_5554180_255-148539-0_sp1.1 from training. Duration: 0.963625 2023-10-03 22:03:52,812 WARNING [train.py:1204] (2/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_269-154726-0_sp1.1 from training. Duration: 0.7818125 2023-10-03 22:03:53,001 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.conv_module1.balancer2.prob, batch_count=1421840.0, ans=0.125 2023-10-03 22:03:54,251 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7002_210_1794_1_1527939683_5157286_163-87552-0_sp1.1 from training. Duration: 0.7818125 2023-10-03 22:03:55,581 WARNING [train.py:1204] (2/4) Exclude cut with ID _54871_210_22356_1_1534665798253_3856359_97-151896-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 22:03:58,343 WARNING [train.py:1204] (2/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_207-7026-0_sp1.1 from training. Duration: 0.97275 2023-10-03 22:03:58,379 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_92-46009-0_sp0.9 from training. Duration: 0.6 2023-10-03 22:04:00,895 INFO [train.py:1046] (2/4) Epoch 41, batch 800, loss[loss=0.2078, simple_loss=0.2707, pruned_loss=0.0724, over 19133.00 frames. ], tot_loss[loss=0.1565, simple_loss=0.2357, pruned_loss=0.03865, over 4617725.61 frames. ], batch size: 388, lr: 2.49e-03, grad_scale: 32.0 2023-10-03 22:04:06,956 WARNING [train.py:1204] (2/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_122-178847-0_sp1.1 from training. Duration: 0.97275 2023-10-03 22:04:06,960 WARNING [train.py:1204] (2/4) Exclude cut with ID _44778_210_2054_1_1533798127203_7443090_94-348956-0_sp1.1 from training. Duration: 0.92725 2023-10-03 22:04:07,181 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.ff3_skip_rate, batch_count=1421906.6666666667, ans=0.0 2023-10-03 22:04:09,647 WARNING [train.py:1204] (2/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_1018-244089-0_sp1.1 from training. Duration: 0.963625 2023-10-03 22:04:09,666 WARNING [train.py:1204] (2/4) Exclude cut with ID _56235_210_10640_1_1534557335235_4161099_87-118423-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 22:04:10,532 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.1.encoder.layers.0.feed_forward3.out_whiten, num_groups=1, num_channels=256, metric=13.64 vs. limit=15.0 2023-10-03 22:04:11,118 WARNING [train.py:1204] (2/4) Exclude cut with ID _53713_210_24116_1_1534314487762_7416751_504-212292-0_sp1.1 from training. Duration: 0.92725 2023-10-03 22:04:12,412 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_446-150327-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 22:04:12,541 WARNING [train.py:1204] (2/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_682-245989-0_sp1.1 from training. Duration: 0.92725 2023-10-03 22:04:14,072 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.bypass.skip_rate, batch_count=1421973.3333333333, ans=0.09899494936611666 2023-10-03 22:04:17,881 WARNING [train.py:1204] (2/4) Exclude cut with ID _35723_210_1794_1_1533088341043_628250_8-234524-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 22:04:19,852 WARNING [train.py:1204] (2/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_64-248359-0_sp0.9 from training. Duration: 0.7666875 2023-10-03 22:04:22,581 WARNING [train.py:1204] (2/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_49-97158-0 from training. Duration: 0.76 2023-10-03 22:04:22,622 WARNING [train.py:1204] (2/4) Exclude cut with ID _24314_210_4915_1_1532135078116_7170179_743-915-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 22:04:22,711 WARNING [train.py:1204] (2/4) Exclude cut with ID _61194_210_18628_1_1535007555628_3750909_353-323468-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 22:04:24,535 WARNING [train.py:1204] (2/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_387-131538-0_sp1.1 from training. Duration: 0.736375 2023-10-03 22:04:24,556 WARNING [train.py:1204] (2/4) Exclude cut with ID _44424_210_18163_1_1533623394848_1213110_79-187603-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 22:04:24,599 WARNING [train.py:1204] (2/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_86-19601-0 from training. Duration: 0.85 2023-10-03 22:04:24,617 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_50050_210_16223_1_1535435796_7290138_103-283529-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 22:04:25,963 WARNING [train.py:1204] (2/4) Exclude cut with ID _13913_210_9994_1_1530579615187_3383897_473-277682-0 from training. Duration: 0.39 2023-10-03 22:04:28,715 WARNING [train.py:1204] (2/4) Exclude cut with ID _42326_210_7770_1_1533814282531_3707269_368-68029-0_sp1.1 from training. Duration: 0.92725 2023-10-03 22:04:30,155 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_41428_210_2161_1_1533774184_3992532_446-339645-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 22:04:32,707 WARNING [train.py:1204] (2/4) Exclude cut with ID _69584_210_5219_1_1535785133775_4044450_325-7656-0_sp1.1 from training. Duration: 0.7818125 2023-10-03 22:04:32,721 WARNING [train.py:1204] (2/4) Exclude cut with ID _21632_210_2253_1_1531994001765_3744140_134-184387-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 22:04:35,958 WARNING [train.py:1204] (2/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_550-184707-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 22:04:35,985 WARNING [train.py:1204] (2/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_106-341780-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 22:04:38,873 WARNING [train.py:1204] (2/4) Exclude cut with ID _45871_210_5665_1_1533864614245_3296060_253-132552-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 22:04:38,922 WARNING [train.py:1204] (2/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_395-310220-0_sp1.1 from training. Duration: 0.8090625 2023-10-03 22:04:38,936 WARNING [train.py:1204] (2/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_795-33252-0_sp0.9 from training. Duration: 0.411125 2023-10-03 22:04:40,344 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9002W0003-495962 from training. Duration: 0.946 2023-10-03 22:04:40,363 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_43-71989-0 from training. Duration: 0.57 2023-10-03 22:04:40,374 WARNING [train.py:1204] (2/4) Exclude cut with ID _8699_210_2069_1_1528630346831_3960409_155-39766-0_sp1.1 from training. Duration: 0.7090625 2023-10-03 22:04:40,385 WARNING [train.py:1204] (2/4) Exclude cut with ID _27894_210_12392_1_1532415496078_6902439_788-232684-0_sp1.1 from training. Duration: 0.97275 2023-10-03 22:04:41,800 WARNING [train.py:1204] (2/4) Exclude cut with ID _56494_210_4220_1_1535284837625_3033169_10-39296-0_sp1.1 from training. Duration: 0.92725 2023-10-03 22:04:41,819 WARNING [train.py:1204] (2/4) Exclude cut with ID _40901_210_5665_1_1533363789958_3856130_341-20819-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 22:04:46,319 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9029W0435-509822 from training. Duration: 0.896 2023-10-03 22:04:47,541 WARNING [train.py:1204] (2/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_51-117576-0 from training. Duration: 0.4 2023-10-03 22:04:48,981 WARNING [train.py:1204] (2/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_197-28164-0_sp1.1 from training. Duration: 0.72725 2023-10-03 22:04:50,769 WARNING [train.py:1204] (2/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_145-137790-0_sp1.1 from training. Duration: 0.7545625 2023-10-03 22:04:55,615 WARNING [train.py:1204] (2/4) Exclude cut with ID _47790_210_16187_1_1533862814066_4207519_2-39653-0_sp1.1 from training. Duration: 0.8909375 2023-10-03 22:04:58,415 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_3780_210_2087_1_1526467760_2130483_186-208240-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 22:04:59,877 WARNING [train.py:1204] (2/4) Exclude cut with ID _24055_210_12651_1_1532310907143_976979_92-59356-0 from training. Duration: 0.91 2023-10-03 22:04:59,933 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_3910_210_1794_1_1526742306_334530_28-238909-0_sp1.1 from training. Duration: 0.736375 2023-10-03 22:05:00,108 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.bypass.scale_min, batch_count=1422173.3333333333, ans=0.2 2023-10-03 22:05:02,653 WARNING [train.py:1204] (2/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_269-154726-0 from training. Duration: 0.86 2023-10-03 22:05:05,981 WARNING [train.py:1204] (2/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_326-165900-0_sp0.9 from training. Duration: 0.7666875 2023-10-03 22:05:08,723 WARNING [train.py:1204] (2/4) Exclude cut with ID _5294_210_1747_1_1527249643928_3696179_470-276077-0_sp1.1 from training. Duration: 0.963625 2023-10-03 22:05:09,961 WARNING [train.py:1204] (2/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_372-131163-0 from training. Duration: 0.94 2023-10-03 22:05:10,010 WARNING [train.py:1204] (2/4) Exclude cut with ID _45297_210_2721_1_1533715351381_3264949_351-147136-0_sp1.1 from training. Duration: 0.7909375 2023-10-03 22:05:11,362 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_1072-12671-0_sp1.1 from training. Duration: 0.92725 2023-10-03 22:05:12,739 WARNING [train.py:1204] (2/4) Exclude cut with ID _15133_210_6228_1_1530794512074_3080379_54-198193-0 from training. Duration: 0.94 2023-10-03 22:05:12,757 WARNING [train.py:1204] (2/4) Exclude cut with ID _30196_210_1664_1_1533708496522_7074140_1194-270988-0_sp1.1 from training. Duration: 0.97275 2023-10-03 22:05:14,103 INFO [train.py:1046] (2/4) Epoch 41, batch 850, loss[loss=0.1497, simple_loss=0.2246, pruned_loss=0.03739, over 23153.00 frames. ], tot_loss[loss=0.1578, simple_loss=0.2372, pruned_loss=0.03925, over 4619583.67 frames. ], batch size: 119, lr: 2.49e-03, grad_scale: 32.0 2023-10-03 22:05:14,167 WARNING [train.py:1204] (2/4) Exclude cut with ID _50310_210_15488_1_1534068043345_3641080_289-193637-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 22:05:15,489 WARNING [train.py:1204] (2/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_313-263495-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 22:05:17,526 WARNING [train.py:1204] (2/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_18-130336-0_sp1.1 from training. Duration: 0.7181875 2023-10-03 22:05:20,706 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_47896_210_20508_1_1534492518_5958868_646-287209-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 22:05:22,113 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_294-163793-0 from training. Duration: 0.82 2023-10-03 22:05:22,143 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_8-319957-0 from training. Duration: 0.96 2023-10-03 22:05:22,146 WARNING [train.py:1204] (2/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_1211-226066-0 from training. Duration: 0.76 2023-10-03 22:05:24,164 WARNING [train.py:1204] (2/4) Exclude cut with ID _4779_210_2084_1_1527318022944_3914893_500-285013-0_sp0.9 from training. Duration: 0.7666875 2023-10-03 22:05:25,480 WARNING [train.py:1204] (2/4) Exclude cut with ID _22826_210_9994_1_1531908201522_3279169_347-232268-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 22:05:26,936 WARNING [train.py:1204] (2/4) Exclude cut with ID _29877_210_6228_1_1532397791106_5276149_111-55056-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 22:05:26,967 WARNING [train.py:1204] (2/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_812-271646-0_sp1.1 from training. Duration: 0.92725 2023-10-03 22:05:26,991 WARNING [train.py:1204] (2/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_235-262124-0_sp1.1 from training. Duration: 0.6090625 2023-10-03 22:05:31,217 WARNING [train.py:1204] (2/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_14-16530-0_sp1.1 from training. Duration: 0.97275 2023-10-03 22:05:31,257 WARNING [train.py:1204] (2/4) Exclude cut with ID _44718_210_5816_1_1534050051869_6469030_568-304512-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 22:05:32,582 WARNING [train.py:1204] (2/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_1033-274376-0 from training. Duration: 0.87 2023-10-03 22:05:36,214 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.conv_module1.balancer1.min_positive, batch_count=1422306.6666666667, ans=0.025 2023-10-03 22:05:37,357 WARNING [train.py:1204] (2/4) Exclude cut with ID _4681_210_3525_1_1527251040321_1235713_123-24428-0 from training. Duration: 0.48 2023-10-03 22:05:40,200 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_37818_210_4238_1_1533620910_4720083_251-288764-0_sp1.1 from training. Duration: 0.97275 2023-10-03 22:05:40,273 WARNING [train.py:1204] (2/4) Exclude cut with ID _65585_210_12210_1_1535450451584_3764098_112-294301-0 from training. Duration: 0.88 2023-10-03 22:05:43,050 WARNING [train.py:1204] (2/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_329-113173-0 from training. Duration: 0.62 2023-10-03 22:05:44,433 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6767_210_3273_1_1527855783_1718932_173-256601-0 from training. Duration: 0.8 2023-10-03 22:05:47,112 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9266W0001-627677 from training. Duration: 0.896 2023-10-03 22:05:47,122 WARNING [train.py:1204] (2/4) Exclude cut with ID _49643_210_7989_1_1534640244224_3939031_611-185515-0_sp1.1 from training. Duration: 0.8545625 2023-10-03 22:05:47,126 WARNING [train.py:1204] (2/4) Exclude cut with ID _31649_210_6228_1_1532584608063_4091078_687-237771-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 22:05:48,875 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_12868_210_6753_1_1530702479_3610564_148-172861-0_sp0.9 from training. Duration: 0.5555625 2023-10-03 22:05:51,485 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.620e+02 1.958e+02 2.183e+02 2.406e+02 3.404e+02, threshold=4.367e+02, percent-clipped=0.0 2023-10-03 22:05:51,577 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_5129_210_6400_1_1527161005_3579054_107-69738-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 22:05:51,859 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.bypass.skip_rate, batch_count=1422373.3333333333, ans=0.07 2023-10-03 22:05:52,943 WARNING [train.py:1204] (2/4) Exclude cut with ID _40137_210_12210_1_1533290484046_3795180_36-301164-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 22:05:53,592 WARNING [train.py:1204] (2/4) Exclude cut with ID _7537_210_2084_1_1528502395431_3924359_113-199933-0 from training. Duration: 0.59 2023-10-03 22:05:56,768 WARNING [train.py:1204] (2/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_547-5424-0_sp1.1 from training. Duration: 0.8545625 2023-10-03 22:05:56,823 WARNING [train.py:1204] (2/4) Exclude cut with ID _43552_210_8913_1_1533726031059_3523429_147-284024-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 22:05:58,167 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_294-163793-0_sp1.1 from training. Duration: 0.7454375 2023-10-03 22:05:58,193 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_3780_210_2087_1_1526467760_2130483_170-259044-0_sp1.1 from training. Duration: 0.636375 2023-10-03 22:06:00,866 WARNING [train.py:1204] (2/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_726-270780-0_sp1.1 from training. Duration: 0.7818125 2023-10-03 22:06:02,282 WARNING [train.py:1204] (2/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_214-291369-0_sp1.1 from training. Duration: 0.57275 2023-10-03 22:06:02,342 WARNING [train.py:1204] (2/4) Exclude cut with ID _21881_210_3525_1_1531825242757_3798380_247-102591-0 from training. Duration: 0.98 2023-10-03 22:06:06,342 WARNING [train.py:1204] (2/4) Exclude cut with ID _8064_210_5399_1_1528372299291_4282973_18-174647-0_sp1.1 from training. Duration: 0.82725 2023-10-03 22:06:06,345 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_415-52611-0_sp1.1 from training. Duration: 0.936375 2023-10-03 22:06:08,351 WARNING [train.py:1204] (2/4) Exclude cut with ID _8394_210_6702_1_1528590504316_3540189_379-49457-0_sp0.9 from training. Duration: 0.9666875 2023-10-03 22:06:08,373 WARNING [train.py:1204] (2/4) Exclude cut with ID _64521_210_14699_1_1535243427259_3815328_368-195106-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 22:06:08,531 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.conv_skip_rate, batch_count=1422440.0, ans=0.0 2023-10-03 22:06:09,669 WARNING [train.py:1204] (2/4) Exclude cut with ID _28394_210_8913_1_1532430049518_3650118_51-334124-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 22:06:12,324 WARNING [train.py:1204] (2/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_317-161769-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 22:06:15,138 WARNING [train.py:1204] (2/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_40-129756-0_sp0.9 from training. Duration: 0.9 2023-10-03 22:06:16,562 WARNING [train.py:1204] (2/4) Exclude cut with ID _3067_210_2667_1_1526212974640_4503168_119-175776-0_sp0.9 from training. Duration: 0.82225 2023-10-03 22:06:16,605 WARNING [train.py:1204] (2/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_1004-304915-0_sp1.1 from training. Duration: 0.92725 2023-10-03 22:06:17,971 WARNING [train.py:1204] (2/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_359-118926-0_sp0.9 from training. Duration: 0.8 2023-10-03 22:06:19,646 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.1.ff2_skip_rate, batch_count=1422506.6666666667, ans=0.0 2023-10-03 22:06:24,290 WARNING [train.py:1204] (2/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_51-200220-0_sp0.9 from training. Duration: 0.62225 2023-10-03 22:06:25,606 WARNING [train.py:1204] (2/4) Exclude cut with ID _46683_210_9642_1_1533730913492_4591116_530-326202-0_sp1.1 from training. Duration: 0.936375 2023-10-03 22:06:25,667 WARNING [train.py:1204] (2/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_567-135114-0 from training. Duration: 0.87 2023-10-03 22:06:27,603 WARNING [train.py:1204] (2/4) Exclude cut with ID _66863_210_18053_1_1535698758334_8590319_1092-271784-0_sp1.1 from training. Duration: 0.87275 2023-10-03 22:06:27,618 WARNING [train.py:1204] (2/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_762-282614-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 22:06:29,454 INFO [train.py:1046] (2/4) Epoch 41, batch 900, loss[loss=0.173, simple_loss=0.2415, pruned_loss=0.05219, over 23789.00 frames. ], tot_loss[loss=0.1578, simple_loss=0.2376, pruned_loss=0.03897, over 4657504.64 frames. ], batch size: 164, lr: 2.49e-03, grad_scale: 32.0 2023-10-03 22:06:30,944 WARNING [train.py:1204] (2/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_280-120942-0 from training. Duration: 0.77 2023-10-03 22:06:36,465 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_31583_210_3852_1_1532595851_4024413_27-170931-0_sp1.1 from training. Duration: 0.963625 2023-10-03 22:06:40,518 WARNING [train.py:1204] (2/4) Exclude cut with ID _12369_210_4381_1_1530356078581_4388520_266-271628-0_sp1.1 from training. Duration: 0.92725 2023-10-03 22:06:40,567 WARNING [train.py:1204] (2/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_41-31157-0 from training. Duration: 0.57 2023-10-03 22:06:45,364 WARNING [train.py:1204] (2/4) Exclude cut with ID _21878_210_3525_1_1531738772002_7264330_113-18163-0_sp1.1 from training. Duration: 0.8090625 2023-10-03 22:06:45,391 WARNING [train.py:1204] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_147-111137-0 from training. Duration: 0.95 2023-10-03 22:06:46,688 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1083-220122-0_sp1.1 from training. Duration: 0.463625 2023-10-03 22:06:48,058 WARNING [train.py:1204] (2/4) Exclude cut with ID _14884_210_2252_1_1530707459689_5561250_591-173406-0_sp1.1 from training. Duration: 0.87275 2023-10-03 22:06:48,071 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_24677_210_1794_1_1532002693_4341296_149-303793-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 22:06:49,682 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_12868_210_6753_1_1530702479_3610564_133-564-0_sp1.1 from training. Duration: 0.7181875 2023-10-03 22:06:49,704 WARNING [train.py:1204] (2/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_625-338546-0_sp0.9 from training. Duration: 0.97775 2023-10-03 22:06:58,537 WARNING [train.py:1204] (2/4) Exclude cut with ID _25882_210_3036_1_1532775665815_6479510_624-320169-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 22:06:58,545 WARNING [train.py:1204] (2/4) Exclude cut with ID _20622_210_3852_1_1531622427080_1093839_127-325326-0_sp1.1 from training. Duration: 0.92725 2023-10-03 22:06:58,575 WARNING [train.py:1204] (2/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_464-33931-0_sp1.1 from training. Duration: 0.4909375 2023-10-03 22:07:01,929 WARNING [train.py:1204] (2/4) Exclude cut with ID _66112_210_10635_1_1535590236305_3801484_120-304588-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 22:07:02,175 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.feed_forward3.hidden_balancer.prob, batch_count=1422706.6666666667, ans=0.125 2023-10-03 22:07:06,576 WARNING [train.py:1204] (2/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_110-130961-0 from training. Duration: 0.47 2023-10-03 22:07:06,719 WARNING [train.py:1204] (2/4) Exclude cut with ID _16908_210_5724_1_1531119157248_3772700_64-54020-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 22:07:11,003 WARNING [train.py:1204] (2/4) Exclude cut with ID _4068_210_2629_1_1526697350395_1546259_224-288693-0_sp1.1 from training. Duration: 0.536375 2023-10-03 22:07:12,279 WARNING [train.py:1204] (2/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_191-41023-0_sp0.9 from training. Duration: 0.788875 2023-10-03 22:07:12,372 WARNING [train.py:1204] (2/4) Exclude cut with ID ID0205W0255-783002 from training. Duration: 0.958 2023-10-03 22:07:14,332 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_162-262717-0 from training. Duration: 0.83 2023-10-03 22:07:16,368 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.3.encoder.layers.0.feed_forward3.out_whiten, num_groups=1, num_channels=512, metric=13.18 vs. limit=15.0 2023-10-03 22:07:21,133 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_224-322394-0_sp1.1 from training. Duration: 0.6 2023-10-03 22:07:21,140 WARNING [train.py:1204] (2/4) Exclude cut with ID _4866_210_2577_1_1527404412284_4364659_133-1696-0_sp1.1 from training. Duration: 0.9 2023-10-03 22:07:21,204 WARNING [train.py:1204] (2/4) Exclude cut with ID _6275_210_2084_1_1528110062213_7669049_973-198085-0_sp1.1 from training. Duration: 0.6818125 2023-10-03 22:07:26,232 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=1422773.3333333333, ans=0.1 2023-10-03 22:07:27,425 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_47449_210_5399_1_1534076445_4006929_410-237806-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 22:07:27,434 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7543_210_6753_1_1528543318_4552_1-85072-0_sp0.9 from training. Duration: 0.92225 2023-10-03 22:07:29,284 WARNING [train.py:1204] (2/4) Exclude cut with ID _65263_210_22356_1_1535763598691_3921037_108-21902-0 from training. Duration: 0.99 2023-10-03 22:07:29,289 WARNING [train.py:1204] (2/4) Exclude cut with ID _65585_210_12210_1_1535450451584_3764098_309-174818-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 22:07:30,702 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_309-65177-0 from training. Duration: 0.88 2023-10-03 22:07:32,173 WARNING [train.py:1204] (2/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_363-185703-0_sp1.1 from training. Duration: 0.863625 2023-10-03 22:07:33,911 WARNING [train.py:1204] (2/4) Exclude cut with ID _42326_210_7770_1_1533814282531_3707269_442-238692-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 22:07:34,021 WARNING [train.py:1204] (2/4) Exclude cut with ID _69884_210_9771_1_1535846392818_4166169_226-254446-0_sp1.1 from training. Duration: 0.936375 2023-10-03 22:07:34,042 WARNING [train.py:1204] (2/4) Exclude cut with ID _29853_210_7497_1_1532505600851_6642110_287-299903-0_sp1.1 from training. Duration: 0.963625 2023-10-03 22:07:39,642 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1015-343884-0 from training. Duration: 0.62 2023-10-03 22:07:39,681 WARNING [train.py:1204] (2/4) Exclude cut with ID ID0305W0008-833402 from training. Duration: 0.91 2023-10-03 22:07:41,077 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6673_210_3273_1_1527767790_3173240_351-167954-0_sp1.1 from training. Duration: 0.436375 2023-10-03 22:07:41,096 WARNING [train.py:1204] (2/4) Exclude cut with ID _6456_210_1534_1_1527681634212_3181049_345-252877-0 from training. Duration: 0.64 2023-10-03 22:07:43,765 INFO [train.py:1046] (2/4) Epoch 41, batch 950, loss[loss=0.1529, simple_loss=0.2415, pruned_loss=0.03215, over 24457.00 frames. ], tot_loss[loss=0.1582, simple_loss=0.2381, pruned_loss=0.03913, over 4660329.28 frames. ], batch size: 69, lr: 2.49e-03, grad_scale: 32.0 2023-10-03 22:07:45,101 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_13-205182-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 22:07:45,380 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.bypass.skip_rate, batch_count=1422906.6666666667, ans=0.09899494936611666 2023-10-03 22:07:48,440 WARNING [train.py:1204] (2/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_427-208901-0 from training. Duration: 0.68 2023-10-03 22:07:53,943 WARNING [train.py:1204] (2/4) Exclude cut with ID _49039_210_6315_1_1533967537541_3846118_9-148921-0_sp1.1 from training. Duration: 0.97275 2023-10-03 22:07:55,440 WARNING [train.py:1204] (2/4) Exclude cut with ID _59493_210_17282_1_1535078419077_3225879_143-70317-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 22:07:55,695 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.bypass_mid.scale_min, batch_count=1422906.6666666667, ans=0.2 2023-10-03 22:07:57,254 WARNING [train.py:1204] (2/4) Exclude cut with ID _27967_210_12140_1_1532588391859_8419130_1056-236369-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 22:07:58,552 WARNING [train.py:1204] (2/4) Exclude cut with ID _6537_210_1883_1_1527987559710_3597129_382-133062-0_sp0.9 from training. Duration: 0.7555625 2023-10-03 22:08:00,077 WARNING [train.py:1204] (2/4) Exclude cut with ID ID0376W0255-869957 from training. Duration: 0.9510625 2023-10-03 22:08:03,363 WARNING [train.py:1204] (2/4) Exclude cut with ID _49371_210_12071_1_1533952761021_3812190_483-240691-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 22:08:03,432 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_12868_210_6753_1_1530701932_530275_18-76322-0_sp1.1 from training. Duration: 0.7909375 2023-10-03 22:08:03,608 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.self_attn_weights.pos_emb_skip_rate, batch_count=1422973.3333333333, ans=0.0 2023-10-03 22:08:05,361 WARNING [train.py:1204] (2/4) Exclude cut with ID _53950_210_5816_1_1534417264919_3862951_367-26344-0_sp1.1 from training. Duration: 0.97275 2023-10-03 22:08:05,380 WARNING [train.py:1204] (2/4) Exclude cut with ID _5444_210_2151_1_1527418808464_5312210_634-194174-0_sp1.1 from training. Duration: 0.8818125 2023-10-03 22:08:05,413 WARNING [train.py:1204] (2/4) Exclude cut with ID _56494_210_4220_1_1535284837625_3033169_69-138150-0 from training. Duration: 0.94 2023-10-03 22:08:05,610 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.bypass.scale_min, batch_count=1422973.3333333333, ans=0.2 2023-10-03 22:08:06,803 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7543_210_6753_1_1528544010_1110402_25-183860-0_sp0.9 from training. Duration: 0.52225 2023-10-03 22:08:08,232 WARNING [train.py:1204] (2/4) Exclude cut with ID _40284_210_2084_1_1533513679814_3704279_287-191918-0_sp1.1 from training. Duration: 0.963625 2023-10-03 22:08:09,571 WARNING [train.py:1204] (2/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_291-270881-0 from training. Duration: 0.97 2023-10-03 22:08:09,618 WARNING [train.py:1204] (2/4) Exclude cut with ID _12555_210_8750_1_1530075594402_3780239_449-267986-0_sp0.9 from training. Duration: 0.92225 2023-10-03 22:08:12,695 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.feed_forward2.hidden_balancer.prob, batch_count=1423040.0, ans=0.125 2023-10-03 22:08:13,767 WARNING [train.py:1204] (2/4) Exclude cut with ID _3992_210_1747_1_1526644816031_3731060_268-64520-0_sp1.1 from training. Duration: 0.963625 2023-10-03 22:08:13,774 WARNING [train.py:1204] (2/4) Exclude cut with ID _39411_210_5986_1_1533538825030_3343649_173-238248-0_sp0.9 from training. Duration: 0.92225 2023-10-03 22:08:13,808 WARNING [train.py:1204] (2/4) Exclude cut with ID _32704_210_8954_1_1533017488840_6747370_828-313097-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 22:08:13,989 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.feed_forward2.hidden_balancer.prob, batch_count=1423040.0, ans=0.125 2023-10-03 22:08:15,175 WARNING [train.py:1204] (2/4) Exclude cut with ID _26907_210_7234_1_1532221740694_3205263_337-2494-0 from training. Duration: 0.88 2023-10-03 22:08:16,533 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_229-45300-0_sp1.1 from training. Duration: 0.5181875 2023-10-03 22:08:16,724 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.attention_skip_rate, batch_count=1423040.0, ans=0.0 2023-10-03 22:08:21,135 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.567e+02 1.951e+02 2.151e+02 2.455e+02 3.832e+02, threshold=4.302e+02, percent-clipped=0.0 2023-10-03 22:08:21,208 WARNING [train.py:1204] (2/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_300-27235-0_sp1.1 from training. Duration: 0.7909375 2023-10-03 22:08:21,318 WARNING [train.py:1204] (2/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_48-338976-0_sp1.1 from training. Duration: 0.8090625 2023-10-03 22:08:21,545 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.feed_forward1.out_proj.dropout_p, batch_count=1423040.0, ans=0.1 2023-10-03 22:08:25,997 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_13091_210_7925_1_1530609447_8987354_792-195353-0_sp1.1 from training. Duration: 0.936375 2023-10-03 22:08:26,009 WARNING [train.py:1204] (2/4) Exclude cut with ID _56524_210_9642_1_1534572011779_3619170_311-54500-0_sp1.1 from training. Duration: 0.97275 2023-10-03 22:08:30,125 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7543_210_6753_1_1528544010_1110402_25-183860-0 from training. Duration: 0.47 2023-10-03 22:08:30,883 WARNING [train.py:1204] (2/4) Exclude cut with ID 2411-132532-0017-82279-0_sp1.1 from training. Duration: 0.9681875 2023-10-03 22:08:30,884 WARNING [train.py:1204] (2/4) Exclude cut with ID _14170_210_5219_1_1530525949868_4469650_272-249985-0_sp0.9 from training. Duration: 0.7444375 2023-10-03 22:08:32,104 WARNING [train.py:1204] (2/4) Exclude cut with ID _55644_210_11856_1_1534593599582_4103719_320-229006-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 22:08:32,171 WARNING [train.py:1204] (2/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_177-100554-0_sp1.1 from training. Duration: 0.963625 2023-10-03 22:08:32,173 WARNING [train.py:1204] (2/4) Exclude cut with ID _14998_210_5219_1_1530705582080_7474970_787-294416-0_sp1.1 from training. Duration: 0.7454375 2023-10-03 22:08:36,895 WARNING [train.py:1204] (2/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_318-59097-0 from training. Duration: 0.76 2023-10-03 22:08:38,163 WARNING [train.py:1204] (2/4) Exclude cut with ID _12369_210_4381_1_1530356078581_4388520_480-13146-0_sp1.1 from training. Duration: 0.77275 2023-10-03 22:08:38,422 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.bypass.scale_min, batch_count=1423106.6666666667, ans=0.2 2023-10-03 22:08:39,613 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_469-338558-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 22:08:40,949 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_40896_210_15710_1_1533433956_7100802_367-126897-0_sp1.1 from training. Duration: 0.963625 2023-10-03 22:08:40,965 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6545_210_2040_1_1527774670_4245258_379-14182-0 from training. Duration: 0.8 2023-10-03 22:08:40,977 WARNING [train.py:1204] (2/4) Exclude cut with ID _21631_210_2253_1_1531907880778_3160339_197-79498-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 22:08:40,982 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_12868_210_6753_1_1530702479_3610564_416-293746-0_sp1.1 from training. Duration: 0.8181875 2023-10-03 22:08:41,024 WARNING [train.py:1204] (2/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_52-211683-0 from training. Duration: 0.9 2023-10-03 22:08:45,042 WARNING [train.py:1204] (2/4) Exclude cut with ID _48678_210_5663_1_1533988830947_3769199_306-255886-0_sp0.9 from training. Duration: 0.8555625 2023-10-03 22:08:47,965 WARNING [train.py:1204] (2/4) Exclude cut with ID _65134_210_14763_1_1535335393985_3505368_296-24733-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 22:08:50,698 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.1.encoder.layers.1.feed_forward2.out_whiten, num_groups=1, num_channels=256, metric=3.86 vs. limit=15.0 2023-10-03 22:08:52,690 WARNING [train.py:1204] (2/4) Exclude cut with ID _14898_210_9206_1_1530766812377_4177190_335-99364-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 22:08:54,153 WARNING [train.py:1204] (2/4) Exclude cut with ID _54871_210_22356_1_1534665798253_3856359_19-342632-0 from training. Duration: 0.91 2023-10-03 22:08:54,178 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_53071_210_16554_1_1534490405_2954066_265-170472-0 from training. Duration: 0.83 2023-10-03 22:08:57,661 WARNING [train.py:1204] (2/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_189-298172-0_sp1.1 from training. Duration: 0.963625 2023-10-03 22:08:58,932 INFO [train.py:1046] (2/4) Epoch 41, batch 1000, loss[loss=0.1526, simple_loss=0.2334, pruned_loss=0.03586, over 23412.00 frames. ], tot_loss[loss=0.1581, simple_loss=0.2382, pruned_loss=0.03897, over 4670367.18 frames. ], batch size: 134, lr: 2.49e-03, grad_scale: 32.0 2023-10-03 22:09:02,280 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7615_210_4211_1_1528110965_4580160_72-235211-0 from training. Duration: 0.76 2023-10-03 22:09:02,321 WARNING [train.py:1204] (2/4) Exclude cut with ID _47790_210_16187_1_1533862814066_4207519_155-90874-0_sp1.1 from training. Duration: 0.936375 2023-10-03 22:09:08,222 WARNING [train.py:1204] (2/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_164-216310-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 22:09:09,635 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_46013_210_7234_1_1533776362_3744032_463-52378-0 from training. Duration: 0.86 2023-10-03 22:09:09,638 WARNING [train.py:1204] (2/4) Exclude cut with ID _61133_210_9969_1_1535196777227_3696409_333-270325-0 from training. Duration: 0.93 2023-10-03 22:09:13,808 WARNING [train.py:1204] (2/4) Exclude cut with ID _5172_210_1883_1_1527246062567_3577219_18-101380-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 22:09:13,816 WARNING [train.py:1204] (2/4) Exclude cut with ID _30800_210_22382_1_1532499347904_4515959_577-236858-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 22:09:15,213 WARNING [train.py:1204] (2/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_1006-177345-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 22:09:17,964 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6291_210_3273_1_1527683649_1078498_4-266348-0 from training. Duration: 0.78 2023-10-03 22:09:21,201 WARNING [train.py:1204] (2/4) Exclude cut with ID _55857_210_2346_1_1534644007831_6898209_745-98391-0 from training. Duration: 0.76 2023-10-03 22:09:22,633 WARNING [train.py:1204] (2/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_375-244976-0 from training. Duration: 0.75 2023-10-03 22:09:23,873 WARNING [train.py:1204] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_4-248404-0_sp1.1 from training. Duration: 0.97275 2023-10-03 22:09:27,078 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8453_210_4211_1_1528502940_8562730_596-306820-0 from training. Duration: 0.97 2023-10-03 22:09:27,182 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_211-339622-0_sp0.9 from training. Duration: 0.5 2023-10-03 22:09:27,208 WARNING [train.py:1204] (2/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_393-57888-0 from training. Duration: 0.64 2023-10-03 22:09:28,579 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.0.conv_module1.balancer1.prob, batch_count=1423373.3333333333, ans=0.125 2023-10-03 22:09:29,709 WARNING [train.py:1204] (2/4) Exclude cut with ID _23825_210_13449_1_1531915313526_3497150_143-343575-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 22:09:29,772 WARNING [train.py:1204] (2/4) Exclude cut with ID _58605_210_12210_1_1534723195939_3904259_36-12256-0_sp1.1 from training. Duration: 0.936375 2023-10-03 22:09:34,576 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.feed_forward1.out_proj.dropout_p, batch_count=1423373.3333333333, ans=0.1 2023-10-03 22:09:39,227 WARNING [train.py:1204] (2/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_38-67795-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 22:09:39,289 WARNING [train.py:1204] (2/4) Exclude cut with ID _64631_210_27767_1_1535349580068_4165160_308-327832-0_sp1.1 from training. Duration: 0.92725 2023-10-03 22:09:39,343 WARNING [train.py:1204] (2/4) Exclude cut with ID _37086_210_9038_1_1532999540891_7021250_726-81176-0_sp1.1 from training. Duration: 0.936375 2023-10-03 22:09:40,673 WARNING [train.py:1204] (2/4) Exclude cut with ID _47790_210_16187_1_1533862814066_4207519_47-141521-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 22:09:40,685 WARNING [train.py:1204] (2/4) Exclude cut with ID _3067_210_2667_1_1526212974640_4503168_250-285937-0 from training. Duration: 0.8 2023-10-03 22:09:40,710 WARNING [train.py:1204] (2/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_239-24000-0_sp1.1 from training. Duration: 0.97275 2023-10-03 22:09:41,994 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_3371_210_1794_1_1526384462_5580777_396-339812-0_sp0.9 from training. Duration: 0.9444375 2023-10-03 22:09:42,039 WARNING [train.py:1204] (2/4) Exclude cut with ID _16909_210_5724_1_1531292079912_4118919_580-173454-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 22:09:43,383 WARNING [train.py:1204] (2/4) Exclude cut with ID IC0124W0002-61308 from training. Duration: 0.982 2023-10-03 22:09:46,156 WARNING [train.py:1204] (2/4) Exclude cut with ID _31602_210_5854_1_1532677021456_4458250_249-96604-0 from training. Duration: 0.89 2023-10-03 22:09:46,239 WARNING [train.py:1204] (2/4) Exclude cut with ID _69584_210_5219_1_1535785133775_4044450_325-7656-0 from training. Duration: 0.86 2023-10-03 22:09:48,911 WARNING [train.py:1204] (2/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_478-162247-0 from training. Duration: 0.82 2023-10-03 22:09:50,744 WARNING [train.py:1204] (2/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_686-54383-0_sp1.1 from training. Duration: 0.7909375 2023-10-03 22:09:51,698 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.0.layers.1.feed_forward2.out_whiten, num_groups=1, num_channels=192, metric=3.99 vs. limit=15.0 2023-10-03 22:09:56,863 WARNING [train.py:1204] (2/4) Exclude cut with ID _29303_210_2575_1_1532392237303_7410649_85-270092-0_sp1.1 from training. Duration: 0.936375 2023-10-03 22:09:56,873 WARNING [train.py:1204] (2/4) Exclude cut with ID _2322_210_2667_1_1525604548971_3656299_440-80494-0_sp1.1 from training. Duration: 0.8 2023-10-03 22:09:58,155 WARNING [train.py:1204] (2/4) Exclude cut with ID _47346_210_9476_1_1533815971553_3536160_201-40644-0_sp1.1 from training. Duration: 0.936375 2023-10-03 22:09:58,421 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.conv_module2.balancer2.min_abs, batch_count=1423506.6666666667, ans=0.5 2023-10-03 22:09:59,684 WARNING [train.py:1204] (2/4) Exclude cut with ID _56235_210_10640_1_1534557335235_4161099_343-139898-0_sp1.1 from training. Duration: 0.87275 2023-10-03 22:10:01,103 WARNING [train.py:1204] (2/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_1012-279473-0 from training. Duration: 0.67 2023-10-03 22:10:02,516 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7931_210_1794_1_1528594728_5564059_280-279560-0_sp1.1 from training. Duration: 0.8909375 2023-10-03 22:10:04,254 WARNING [train.py:1204] (2/4) Exclude cut with ID _8263_210_1664_1_1528439452891_970220_68-181805-0 from training. Duration: 0.64 2023-10-03 22:10:04,302 WARNING [train.py:1204] (2/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_605-133395-0 from training. Duration: 0.66 2023-10-03 22:10:06,189 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_43-171109-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 22:10:06,192 WARNING [train.py:1204] (2/4) Exclude cut with ID _40094_210_16380_1_1533294005773_3641159_107-280739-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 22:10:08,922 WARNING [train.py:1204] (2/4) Exclude cut with ID _14821_210_8750_1_1530680503991_6829359_2-227151-0_sp0.9 from training. Duration: 0.92225 2023-10-03 22:10:10,429 WARNING [train.py:1204] (2/4) Exclude cut with ID _14268_210_9206_1_1530622771285_4714099_331-105351-0_sp1.1 from training. Duration: 0.5909375 2023-10-03 22:10:10,542 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_26326_210_7925_1_1533002493_3592406_451-82741-0_sp1.1 from training. Duration: 0.97275 2023-10-03 22:10:13,251 INFO [train.py:1046] (2/4) Epoch 41, batch 1050, loss[loss=0.1368, simple_loss=0.2154, pruned_loss=0.02915, over 24404.00 frames. ], tot_loss[loss=0.1567, simple_loss=0.2366, pruned_loss=0.03838, over 4680448.46 frames. ], batch size: 58, lr: 2.49e-03, grad_scale: 32.0 2023-10-03 22:10:14,619 WARNING [train.py:1204] (2/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_191-59784-0_sp0.9 from training. Duration: 0.97775 2023-10-03 22:10:14,715 WARNING [train.py:1204] (2/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_445-117398-0_sp1.1 from training. Duration: 0.8090625 2023-10-03 22:10:17,266 WARNING [train.py:1204] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_605-257205-0_sp0.9 from training. Duration: 0.6555625 2023-10-03 22:10:17,333 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_50050_210_16223_1_1535435796_7290138_592-258841-0_sp1.1 from training. Duration: 0.936375 2023-10-03 22:10:20,517 WARNING [train.py:1204] (2/4) Exclude cut with ID _14348_210_8341_1_1530599434996_3778640_240-232370-0_sp0.9 from training. Duration: 0.9333125 2023-10-03 22:10:22,126 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=1423573.3333333333, ans=0.1 2023-10-03 22:10:23,223 WARNING [train.py:1204] (2/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_239-316802-0_sp0.9 from training. Duration: 0.7666875 2023-10-03 22:10:24,594 WARNING [train.py:1204] (2/4) Exclude cut with ID _45560_210_21982_1_1533812603951_3584999_111-6376-0_sp1.1 from training. Duration: 0.763625 2023-10-03 22:10:26,694 WARNING [train.py:1204] (2/4) Exclude cut with ID _54189_210_16380_1_1534332670039_3911710_606-311481-0_sp1.1 from training. Duration: 0.963625 2023-10-03 22:10:26,753 WARNING [train.py:1204] (2/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_237-332027-0_sp1.1 from training. Duration: 0.863625 2023-10-03 22:10:28,255 WARNING [train.py:1204] (2/4) Exclude cut with ID _66070_210_7640_1_1535441393096_7805310_489-292631-0_sp0.9 from training. Duration: 0.87775 2023-10-03 22:10:29,570 WARNING [train.py:1204] (2/4) Exclude cut with ID _49728_210_12651_1_1534041034294_6788150_624-195301-0_sp1.1 from training. Duration: 0.77275 2023-10-03 22:10:29,618 WARNING [train.py:1204] (2/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_44-79427-0 from training. Duration: 0.98 2023-10-03 22:10:30,920 WARNING [train.py:1204] (2/4) Exclude cut with ID _35350_210_16040_1_1533040191183_4075299_490-198354-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 22:10:30,965 WARNING [train.py:1204] (2/4) Exclude cut with ID _5166_210_2084_1_1527073197160_4087993_309-222545-0 from training. Duration: 0.84 2023-10-03 22:10:34,113 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_13091_210_7925_1_1530609447_8987354_790-199469-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 22:10:34,127 WARNING [train.py:1204] (2/4) Exclude cut with ID _21632_210_2253_1_1531994001765_3744140_110-274654-0 from training. Duration: 0.82 2023-10-03 22:10:34,133 WARNING [train.py:1204] (2/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_498-40153-0_sp0.9 from training. Duration: 0.7 2023-10-03 22:10:41,643 WARNING [train.py:1204] (2/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_363-282909-0_sp1.1 from training. Duration: 0.936375 2023-10-03 22:10:43,033 WARNING [train.py:1204] (2/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_178-177582-0_sp0.9 from training. Duration: 0.82225 2023-10-03 22:10:43,052 WARNING [train.py:1204] (2/4) Exclude cut with ID _24612_210_8913_1_1531998054193_3806359_357-35487-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 22:10:45,925 WARNING [train.py:1204] (2/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_138-332773-0 from training. Duration: 0.81 2023-10-03 22:10:45,951 WARNING [train.py:1204] (2/4) Exclude cut with ID _7624_210_1531_1_1528109802198_4933101_418-349834-0 from training. Duration: 0.6 2023-10-03 22:10:45,970 WARNING [train.py:1204] (2/4) Exclude cut with ID _7537_210_2084_1_1528502395431_3924359_14-312906-0_sp0.9 from training. Duration: 0.9333125 2023-10-03 22:10:46,133 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.1.conv_module2.balancer2.prob, batch_count=1423706.6666666667, ans=0.125 2023-10-03 22:10:46,209 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.bypass.skip_rate, batch_count=1423706.6666666667, ans=0.09899494936611666 2023-10-03 22:10:48,922 WARNING [train.py:1204] (2/4) Exclude cut with ID _60057_210_10640_1_1534903065793_3333150_362-230377-0 from training. Duration: 0.87 2023-10-03 22:10:49,130 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.feed_forward3.hidden_balancer.prob, batch_count=1423706.6666666667, ans=0.125 2023-10-03 22:10:50,170 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.613e+02 1.894e+02 2.056e+02 2.296e+02 3.481e+02, threshold=4.112e+02, percent-clipped=0.0 2023-10-03 22:10:50,421 WARNING [train.py:1204] (2/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_138-64210-0 from training. Duration: 0.73 2023-10-03 22:10:51,768 WARNING [train.py:1204] (2/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_454-339504-0_sp1.1 from training. Duration: 0.97275 2023-10-03 22:10:55,348 WARNING [train.py:1204] (2/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_508-335369-0_sp0.9 from training. Duration: 0.5333125 2023-10-03 22:10:57,007 WARNING [train.py:1204] (2/4) Exclude cut with ID _5608_210_2106_1_1527415439089_6819768_909-82767-0_sp0.9 from training. Duration: 0.67775 2023-10-03 22:10:58,099 WARNING [train.py:1204] (2/4) Exclude cut with ID _6694_210_4599_1_1527852637490_3965529_99-11611-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 22:10:58,163 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_2655_210_2087_1_1525688959_3897400_52-279899-0_sp1.1 from training. Duration: 0.62725 2023-10-03 22:11:02,639 WARNING [train.py:1204] (2/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_375-245677-0_sp1.1 from training. Duration: 0.736375 2023-10-03 22:11:05,469 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_23850_210_4238_1_1532232597_1632433_32-185135-0 from training. Duration: 0.86 2023-10-03 22:11:07,519 WARNING [train.py:1204] (2/4) Exclude cut with ID _15113_210_8254_1_1530842438579_3716279_209-54533-0 from training. Duration: 0.96 2023-10-03 22:11:08,666 WARNING [train.py:1204] (2/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_401-53423-0 from training. Duration: 0.55 2023-10-03 22:11:08,710 WARNING [train.py:1204] (2/4) Exclude cut with ID _14867_210_1968_1_1530684179890_3846969_319-250868-0_sp1.1 from training. Duration: 0.9 2023-10-03 22:11:08,734 WARNING [train.py:1204] (2/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_106-30782-0_sp0.9 from training. Duration: 0.888875 2023-10-03 22:11:10,080 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_5960_210_1794_1_1527403865_3990999_515-2613-0 from training. Duration: 0.88 2023-10-03 22:11:12,898 WARNING [train.py:1204] (2/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_798-106179-0_sp0.9 from training. Duration: 0.92225 2023-10-03 22:11:15,617 WARNING [train.py:1204] (2/4) Exclude cut with ID _14227_210_4381_1_1530701804286_3940259_125-275642-0_sp1.1 from training. Duration: 0.9 2023-10-03 22:11:16,819 WARNING [train.py:1204] (2/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_79-200392-0_sp1.1 from training. Duration: 0.8818125 2023-10-03 22:11:16,858 WARNING [train.py:1204] (2/4) Exclude cut with ID _48836_210_12210_1_1534075848293_3904150_429-35475-0_sp0.9 from training. Duration: 0.988875 2023-10-03 22:11:16,881 WARNING [train.py:1204] (2/4) Exclude cut with ID _69884_210_9771_1_1535846392818_4166169_224-172517-0_sp1.1 from training. Duration: 0.97275 2023-10-03 22:11:18,500 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.0.feed_forward2.hidden_balancer.prob, batch_count=1423840.0, ans=0.125 2023-10-03 22:11:19,753 WARNING [train.py:1204] (2/4) Exclude cut with ID _15113_210_8254_1_1530842438579_3716279_76-232507-0_sp1.1 from training. Duration: 0.97275 2023-10-03 22:11:19,755 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_135-5501-0 from training. Duration: 0.98 2023-10-03 22:11:22,385 WARNING [train.py:1204] (2/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_563-347832-0_sp0.9 from training. Duration: 0.988875 2023-10-03 22:11:22,396 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6786_210_1388_1_1527839699_4194495_273-149841-0 from training. Duration: 0.92 2023-10-03 22:11:22,422 WARNING [train.py:1204] (2/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_172-215932-0 from training. Duration: 0.66 2023-10-03 22:11:22,482 WARNING [train.py:1204] (2/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_939-132910-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 22:11:27,647 INFO [train.py:1046] (2/4) Epoch 41, batch 1100, loss[loss=0.1539, simple_loss=0.2291, pruned_loss=0.03939, over 22026.00 frames. ], tot_loss[loss=0.1559, simple_loss=0.2359, pruned_loss=0.03797, over 4683122.55 frames. ], batch size: 48, lr: 2.49e-03, grad_scale: 32.0 2023-10-03 22:11:29,051 WARNING [train.py:1204] (2/4) Exclude cut with ID _49371_210_12071_1_1533952761021_3812190_16-246001-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 22:11:31,885 WARNING [train.py:1204] (2/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_680-189165-0_sp1.1 from training. Duration: 0.936375 2023-10-03 22:11:35,190 WARNING [train.py:1204] (2/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_53-300544-0_sp1.1 from training. Duration: 0.6090625 2023-10-03 22:11:35,301 WARNING [train.py:1204] (2/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_540-275685-0_sp1.1 from training. Duration: 0.8090625 2023-10-03 22:11:35,317 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_38965_210_4852_1_1533174906_3484735_453-106579-0_sp1.1 from training. Duration: 0.92725 2023-10-03 22:11:37,116 WARNING [train.py:1204] (2/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_105-328174-0 from training. Duration: 0.76 2023-10-03 22:11:38,535 WARNING [train.py:1204] (2/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_279-126205-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 22:11:41,173 WARNING [train.py:1204] (2/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_5-55893-0_sp0.9 from training. Duration: 0.611125 2023-10-03 22:11:42,586 WARNING [train.py:1204] (2/4) Exclude cut with ID _13678_210_7497_1_1530530069226_3463170_141-10835-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 22:11:46,589 WARNING [train.py:1204] (2/4) Exclude cut with ID _22083_210_8254_1_1531702862439_3678970_201-85738-0_sp1.1 from training. Duration: 0.8181875 2023-10-03 22:11:46,609 WARNING [train.py:1204] (2/4) Exclude cut with ID _4697_210_2069_1_1526815801953_3823098_51-1049-0 from training. Duration: 0.75 2023-10-03 22:11:46,694 WARNING [train.py:1204] (2/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_206-10097-0_sp0.9 from training. Duration: 0.5444375 2023-10-03 22:11:48,181 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_4528_210_3273_1_1527076654_3276777_360-63661-0_sp1.1 from training. Duration: 0.92725 2023-10-03 22:11:48,192 WARNING [train.py:1204] (2/4) Exclude cut with ID _64086_210_7994_1_1535270417019_7435949_449-31567-0_sp1.1 from training. Duration: 0.8818125 2023-10-03 22:11:48,718 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.4.encoder.layers.1.conv_module1.whiten, num_groups=1, num_channels=384, metric=5.30 vs. limit=15.0 2023-10-03 22:11:51,061 WARNING [train.py:1204] (2/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_245-174553-0_sp0.9 from training. Duration: 0.97775 2023-10-03 22:11:52,801 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6545_210_2040_1_1527774670_4245258_379-14182-0_sp1.1 from training. Duration: 0.72725 2023-10-03 22:11:56,234 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_3133_210_2087_1_1526106446_2324690_256-206070-0_sp1.1 from training. Duration: 0.963625 2023-10-03 22:11:59,024 WARNING [train.py:1204] (2/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_278-104676-0 from training. Duration: 0.98 2023-10-03 22:11:59,226 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.conv_skip_rate, batch_count=1424040.0, ans=0.0 2023-10-03 22:12:00,431 WARNING [train.py:1204] (2/4) Exclude cut with ID IC0671W0065-331908 from training. Duration: 0.896 2023-10-03 22:12:01,760 WARNING [train.py:1204] (2/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_244-129917-0_sp1.1 from training. Duration: 0.97275 2023-10-03 22:12:03,763 WARNING [train.py:1204] (2/4) Exclude cut with ID _7852_210_5219_1_1528286471316_8139400_1129-170857-0_sp1.1 from training. Duration: 0.97275 2023-10-03 22:12:05,677 WARNING [train.py:1204] (2/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_443-30034-0_sp0.9 from training. Duration: 0.711125 2023-10-03 22:12:05,715 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_31583_210_3852_1_1532595851_4024413_250-332999-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 22:12:06,999 WARNING [train.py:1204] (2/4) Exclude cut with ID _8772_210_1850_1_1528635595972_3664011_247-336448-0 from training. Duration: 0.74 2023-10-03 22:12:08,346 WARNING [train.py:1204] (2/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_567-135114-0_sp0.9 from training. Duration: 0.9666875 2023-10-03 22:12:08,350 WARNING [train.py:1204] (2/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_557-349534-0_sp1.1 from training. Duration: 0.82725 2023-10-03 22:12:08,362 WARNING [train.py:1204] (2/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_1341-26728-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 22:12:08,410 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_40222_210_6228_1_1533385298_4481939_375-328051-0_sp1.1 from training. Duration: 0.97275 2023-10-03 22:12:08,436 WARNING [train.py:1204] (2/4) Exclude cut with ID _4697_210_2069_1_1526815801953_3823098_276-96593-0 from training. Duration: 0.77 2023-10-03 22:12:15,294 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_53071_210_16554_1_1534490405_2954066_265-170472-0_sp0.9 from training. Duration: 0.92225 2023-10-03 22:12:15,312 WARNING [train.py:1204] (2/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_365-1492-0 from training. Duration: 0.91 2023-10-03 22:12:16,720 WARNING [train.py:1204] (2/4) Exclude cut with ID _66873_210_5517_1_1535454136506_4293370_654-52713-0_sp0.9 from training. Duration: 0.8555625 2023-10-03 22:12:22,731 WARNING [train.py:1204] (2/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_113-92608-0_sp1.1 from training. Duration: 0.4909375 2023-10-03 22:12:24,356 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_46013_210_7234_1_1533776362_3744032_50-141535-0 from training. Duration: 0.99 2023-10-03 22:12:24,371 WARNING [train.py:1204] (2/4) Exclude cut with ID _13942_210_9988_1_1530604906036_3733360_499-55676-0_sp1.1 from training. Duration: 0.563625 2023-10-03 22:12:27,575 WARNING [train.py:1204] (2/4) Exclude cut with ID _30800_210_22382_1_1532499347904_4515959_375-65143-0_sp1.1 from training. Duration: 0.97275 2023-10-03 22:12:30,240 WARNING [train.py:1204] (2/4) Exclude cut with ID _16756_210_4915_1_1531047803763_7104259_199-325608-0_sp1.1 from training. Duration: 0.92725 2023-10-03 22:12:30,265 WARNING [train.py:1204] (2/4) Exclude cut with ID _31649_210_6228_1_1532584608063_4091078_443-124391-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 22:12:30,373 WARNING [train.py:1204] (2/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_146-218763-0 from training. Duration: 0.86 2023-10-03 22:12:31,694 WARNING [train.py:1204] (2/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_567-135114-0_sp1.1 from training. Duration: 0.7909375 2023-10-03 22:12:31,746 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_26326_210_7925_1_1533002493_3592406_462-288299-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 22:12:33,203 WARNING [train.py:1204] (2/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_322-217220-0 from training. Duration: 0.86 2023-10-03 22:12:35,005 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_238-256668-0_sp1.1 from training. Duration: 0.62725 2023-10-03 22:12:35,050 WARNING [train.py:1204] (2/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_371-18169-0 from training. Duration: 0.86 2023-10-03 22:12:37,021 WARNING [train.py:1204] (2/4) Exclude cut with ID _58480_210_12392_1_1534811121111_3646160_23-74512-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 22:12:37,026 WARNING [train.py:1204] (2/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_784-213099-0_sp1.1 from training. Duration: 0.8454375 2023-10-03 22:12:38,413 WARNING [train.py:1204] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_625-102955-0_sp1.1 from training. Duration: 0.7 2023-10-03 22:12:42,383 INFO [train.py:1046] (2/4) Epoch 41, batch 1150, loss[loss=0.1506, simple_loss=0.2265, pruned_loss=0.03734, over 23347.00 frames. ], tot_loss[loss=0.1566, simple_loss=0.2364, pruned_loss=0.03838, over 4687708.57 frames. ], batch size: 119, lr: 2.49e-03, grad_scale: 32.0 2023-10-03 22:12:43,845 WARNING [train.py:1204] (2/4) Exclude cut with ID _49748_210_24116_1_1533986971737_3158910_176-174327-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 22:12:46,607 WARNING [train.py:1204] (2/4) Exclude cut with ID _50310_210_15488_1_1534068043345_3641080_432-96140-0_sp1.1 from training. Duration: 0.936375 2023-10-03 22:12:49,343 WARNING [train.py:1204] (2/4) Exclude cut with ID _40402_210_18219_1_1533522324115_2953150_76-14910-0_sp1.1 from training. Duration: 0.92725 2023-10-03 22:12:49,366 WARNING [train.py:1204] (2/4) Exclude cut with ID _21783_210_11357_1_1531742458730_3762679_40-19458-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 22:12:49,405 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_14056_210_4163_1_1530604483_7572827_46-93547-0 from training. Duration: 0.95 2023-10-03 22:12:50,795 WARNING [train.py:1204] (2/4) Exclude cut with ID _21631_210_2253_1_1531907880778_3160339_321-35325-0_sp1.1 from training. Duration: 0.963625 2023-10-03 22:12:51,056 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.feed_forward3.hidden_balancer.prob, batch_count=1424240.0, ans=0.125 2023-10-03 22:12:53,620 WARNING [train.py:1204] (2/4) Exclude cut with ID _4779_210_2084_1_1527318022944_3914893_500-285013-0 from training. Duration: 0.69 2023-10-03 22:12:53,716 WARNING [train.py:1204] (2/4) Exclude cut with ID _43552_210_8913_1_1533726031059_3523429_301-93324-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 22:12:55,444 WARNING [train.py:1204] (2/4) Exclude cut with ID _6473_210_1741_1_1527901307396_3815380_108-17335-0_sp1.1 from training. Duration: 0.5909375 2023-10-03 22:13:01,020 WARNING [train.py:1204] (2/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_206-10097-0 from training. Duration: 0.49 2023-10-03 22:13:02,452 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_36756_210_7234_1_1533026991_966276_103-77513-0_sp1.1 from training. Duration: 0.97275 2023-10-03 22:13:07,125 WARNING [train.py:1204] (2/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_662-18193-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 22:13:07,189 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_26433_210_12108_1_1532157158_4056480_439-52438-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 22:13:08,400 WARNING [train.py:1204] (2/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_464-33931-0 from training. Duration: 0.54 2023-10-03 22:13:08,415 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_3474_210_2028_1_1526188513_4825246_145-309131-0_sp0.9 from training. Duration: 0.82225 2023-10-03 22:13:08,435 WARNING [train.py:1204] (2/4) Exclude cut with ID _9679_210_7497_1_1529129212906_4363929_446-308321-0_sp1.1 from training. Duration: 0.963625 2023-10-03 22:13:11,953 WARNING [train.py:1204] (2/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_662-275621-0 from training. Duration: 0.42 2023-10-03 22:13:13,306 WARNING [train.py:1204] (2/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_344-337148-0_sp1.1 from training. Duration: 0.97275 2023-10-03 22:13:13,559 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.balancer2.prob, batch_count=1424373.3333333333, ans=0.125 2023-10-03 22:13:14,672 WARNING [train.py:1204] (2/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_759-179071-0_sp1.1 from training. Duration: 0.92725 2023-10-03 22:13:18,761 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.649e+02 1.950e+02 2.098e+02 2.374e+02 3.607e+02, threshold=4.196e+02, percent-clipped=0.0 2023-10-03 22:13:21,527 WARNING [train.py:1204] (2/4) Exclude cut with ID _28783_210_7943_1_1532671164808_7155479_293-12188-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 22:13:27,755 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_17613_210_2151_1_1531137569_2366160_235-320304-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 22:13:27,779 WARNING [train.py:1204] (2/4) Exclude cut with ID _14349_210_8341_1_1530615710818_3442470_170-336541-0 from training. Duration: 0.86 2023-10-03 22:13:27,838 WARNING [train.py:1204] (2/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_375-218677-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 22:13:27,883 WARNING [train.py:1204] (2/4) Exclude cut with ID _28317_210_12742_1_1532480302381_8510325_829-202956-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 22:13:35,401 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9029W0436-509823 from training. Duration: 0.768 2023-10-03 22:13:36,751 WARNING [train.py:1204] (2/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_231-112214-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 22:13:44,097 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9059W0470-524883 from training. Duration: 0.3415625 2023-10-03 22:13:47,004 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_43266_210_1794_1_1533521789_4497218_206-79512-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 22:13:48,539 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_294-163793-0_sp0.9 from training. Duration: 0.911125 2023-10-03 22:13:48,560 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6290_210_3273_1_1527595224_1282787_115-87095-0_sp1.1 from training. Duration: 0.5 2023-10-03 22:13:49,917 WARNING [train.py:1204] (2/4) Exclude cut with ID _14100_210_8341_1_1530529320613_3678069_279-55760-0_sp1.1 from training. Duration: 0.8454375 2023-10-03 22:13:51,401 WARNING [train.py:1204] (2/4) Exclude cut with ID _14621_210_1925_1_1530686901185_3675420_419-234200-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 22:13:55,999 INFO [train.py:1046] (2/4) Epoch 41, batch 1200, loss[loss=0.1508, simple_loss=0.2249, pruned_loss=0.03834, over 22815.00 frames. ], tot_loss[loss=0.1572, simple_loss=0.2371, pruned_loss=0.03863, over 4703917.51 frames. ], batch size: 322, lr: 2.49e-03, grad_scale: 32.0 2023-10-03 22:13:57,436 WARNING [train.py:1204] (2/4) Exclude cut with ID _60229_210_27767_1_1534838406158_3655350_353-213656-0_sp1.1 from training. Duration: 0.863625 2023-10-03 22:13:57,445 WARNING [train.py:1204] (2/4) Exclude cut with ID _48678_210_5663_1_1533988830947_3769199_306-255886-0_sp1.1 from training. Duration: 0.7 2023-10-03 22:13:57,661 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.conv_module2.balancer2.prob, batch_count=1424573.3333333333, ans=0.125 2023-10-03 22:13:58,850 WARNING [train.py:1204] (2/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_466-3492-0_sp1.1 from training. Duration: 0.97275 2023-10-03 22:13:58,851 WARNING [train.py:1204] (2/4) Exclude cut with ID _8755_210_1876_1_1528628559357_3258209_199-242167-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 22:14:00,247 WARNING [train.py:1204] (2/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_237-89684-0_sp1.1 from training. Duration: 0.87275 2023-10-03 22:14:01,631 WARNING [train.py:1204] (2/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_70-84121-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 22:14:04,362 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7544_210_6753_1_1528587722_3576686_172-171750-0_sp1.1 from training. Duration: 0.5909375 2023-10-03 22:14:06,383 WARNING [train.py:1204] (2/4) Exclude cut with ID _24271_210_12658_1_1531987270022_7190519_40-321822-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 22:14:07,676 WARNING [train.py:1204] (2/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_227-83612-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 22:14:09,683 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9156W0015-572508 from training. Duration: 0.896 2023-10-03 22:14:11,776 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_292-265928-0 from training. Duration: 0.51 2023-10-03 22:14:17,253 WARNING [train.py:1204] (2/4) Exclude cut with ID _14747_210_8341_1_1530685797463_3654089_147-131229-0_sp1.1 from training. Duration: 0.6909375 2023-10-03 22:14:18,752 WARNING [train.py:1204] (2/4) Exclude cut with ID _45297_210_2721_1_1533715351381_3264949_351-147136-0_sp0.9 from training. Duration: 0.9666875 2023-10-03 22:14:18,865 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.0.conv_skip_rate, batch_count=1424640.0, ans=0.0 2023-10-03 22:14:20,184 WARNING [train.py:1204] (2/4) Exclude cut with ID _5250_210_5219_1_1527249561806_8692110_1102-324568-0_sp1.1 from training. Duration: 0.97275 2023-10-03 22:14:21,714 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.conv_module1.balancer1.min_positive, batch_count=1424640.0, ans=0.025 2023-10-03 22:14:22,917 WARNING [train.py:1204] (2/4) Exclude cut with ID _18516_210_9206_1_1531704653206_3692406_465-75446-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 22:14:22,918 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9203W0126-595623 from training. Duration: 0.896 2023-10-03 22:14:24,295 WARNING [train.py:1204] (2/4) Exclude cut with ID _55383_210_10635_1_1534468426172_2231669_139-328410-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 22:14:31,871 WARNING [train.py:1204] (2/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_166-246557-0_sp0.9 from training. Duration: 0.711125 2023-10-03 22:14:31,875 WARNING [train.py:1204] (2/4) Exclude cut with ID _30039_210_13270_1_1532484612900_2725150_239-304872-0_sp1.1 from training. Duration: 0.963625 2023-10-03 22:14:31,905 WARNING [train.py:1204] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_6-226750-0 from training. Duration: 0.94 2023-10-03 22:14:33,237 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_135-5501-0_sp1.1 from training. Duration: 0.8909375 2023-10-03 22:14:33,802 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.3.encoder.layers.3.feed_forward3.out_whiten, num_groups=1, num_channels=512, metric=13.13 vs. limit=15.0 2023-10-03 22:14:36,067 WARNING [train.py:1204] (2/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_359-118926-0 from training. Duration: 0.72 2023-10-03 22:14:40,751 WARNING [train.py:1204] (2/4) Exclude cut with ID _8064_210_5399_1_1528372299291_4282973_4-91037-0 from training. Duration: 0.88 2023-10-03 22:14:40,770 WARNING [train.py:1204] (2/4) Exclude cut with ID _6653_210_2629_1_1527919172441_6732679_415-288492-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 22:14:40,858 WARNING [train.py:1204] (2/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_544-287179-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 22:14:44,481 WARNING [train.py:1204] (2/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_122-32606-0_sp1.1 from training. Duration: 0.92725 2023-10-03 22:14:44,540 WARNING [train.py:1204] (2/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_121-284152-0_sp1.1 from training. Duration: 0.536375 2023-10-03 22:14:47,266 WARNING [train.py:1204] (2/4) Exclude cut with ID _44718_210_5816_1_1534050051869_6469030_350-299477-0_sp1.1 from training. Duration: 0.97275 2023-10-03 22:14:47,285 WARNING [train.py:1204] (2/4) Exclude cut with ID _6653_210_2629_1_1527919172441_6732679_1103-56445-0_sp0.9 from training. Duration: 0.811125 2023-10-03 22:14:48,614 WARNING [train.py:1204] (2/4) Exclude cut with ID _12369_210_4381_1_1530356078581_4388520_64-298917-0_sp0.9 from training. Duration: 0.97775 2023-10-03 22:14:48,658 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_2245_210_2715_1_1525511548_3540254_310-52483-0 from training. Duration: 0.88 2023-10-03 22:14:48,711 WARNING [train.py:1204] (2/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_401-158239-0_sp1.1 from training. Duration: 0.6454375 2023-10-03 22:14:48,754 WARNING [train.py:1204] (2/4) Exclude cut with ID _59502_210_12210_1_1535107024698_1835139_146-162495-0_sp1.1 from training. Duration: 0.836375 2023-10-03 22:14:48,771 WARNING [train.py:1204] (2/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_41-31157-0_sp0.9 from training. Duration: 0.6333125 2023-10-03 22:14:52,860 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_68717_210_3528_1_1535628683_1556759_95-36722-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 22:14:52,875 WARNING [train.py:1204] (2/4) Exclude cut with ID _30196_210_1664_1_1533708496522_7074140_654-162611-0_sp1.1 from training. Duration: 0.92725 2023-10-03 22:14:56,938 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_9302_210_2550_1_1528889962_4655416_638-301620-0_sp0.9 from training. Duration: 0.52225 2023-10-03 22:14:58,672 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.bypass.skip_rate, batch_count=1424840.0, ans=0.09899494936611666 2023-10-03 22:14:59,825 WARNING [train.py:1204] (2/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_478-162247-0_sp1.1 from training. Duration: 0.7454375 2023-10-03 22:15:01,327 WARNING [train.py:1204] (2/4) Exclude cut with ID _45297_210_2721_1_1533715351381_3264949_353-9541-0 from training. Duration: 0.97 2023-10-03 22:15:07,064 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9653W0190-675468 from training. Duration: 0.9935 2023-10-03 22:15:07,717 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.3.encoder.layers.3.whiten, num_groups=1, num_channels=512, metric=4.07 vs. limit=12.0 2023-10-03 22:15:08,489 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_49730_210_13049_1_1534075294_1830676_214-84436-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 22:15:08,833 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.conv_skip_rate, batch_count=1424906.6666666667, ans=0.0 2023-10-03 22:15:09,870 INFO [train.py:1046] (2/4) Epoch 41, batch 1250, loss[loss=0.1493, simple_loss=0.2279, pruned_loss=0.03529, over 24606.00 frames. ], tot_loss[loss=0.1576, simple_loss=0.2376, pruned_loss=0.03875, over 4713021.12 frames. ], batch size: 60, lr: 2.49e-03, grad_scale: 32.0 2023-10-03 22:15:09,972 WARNING [train.py:1204] (2/4) Exclude cut with ID _50093_210_4220_1_1534417141207_3432299_259-322252-0_sp1.1 from training. Duration: 0.836375 2023-10-03 22:15:11,392 WARNING [train.py:1204] (2/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_599-50220-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 22:15:13,368 WARNING [train.py:1204] (2/4) Exclude cut with ID _21878_210_3525_1_1531738772002_7264330_174-109016-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 22:15:15,363 WARNING [train.py:1204] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_665-196155-0 from training. Duration: 0.67 2023-10-03 22:15:20,629 WARNING [train.py:1204] (2/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_656-188361-0_sp1.1 from training. Duration: 0.7909375 2023-10-03 22:15:20,694 WARNING [train.py:1204] (2/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_60-18123-0_sp1.1 from training. Duration: 0.8 2023-10-03 22:15:22,052 WARNING [train.py:1204] (2/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_535-14410-0 from training. Duration: 0.47 2023-10-03 22:15:22,209 WARNING [train.py:1204] (2/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_94-126128-0_sp1.1 from training. Duration: 0.7818125 2023-10-03 22:15:23,615 WARNING [train.py:1204] (2/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_356-31216-0_sp1.1 from training. Duration: 0.6818125 2023-10-03 22:15:24,986 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder_embed.conv.8.prob, batch_count=1424973.3333333333, ans=0.125 2023-10-03 22:15:28,911 WARNING [train.py:1204] (2/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_443-30034-0_sp1.1 from training. Duration: 0.5818125 2023-10-03 22:15:28,990 WARNING [train.py:1204] (2/4) Exclude cut with ID _26907_210_7234_1_1532221740694_3205263_337-2494-0_sp1.1 from training. Duration: 0.8 2023-10-03 22:15:30,877 WARNING [train.py:1204] (2/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_49-97158-0_sp0.9 from training. Duration: 0.8444375 2023-10-03 22:15:30,888 WARNING [train.py:1204] (2/4) Exclude cut with ID _7877_210_6014_1_1528286358099_3750136_553-170128-0_sp1.1 from training. Duration: 0.963625 2023-10-03 22:15:33,663 WARNING [train.py:1204] (2/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_202-293523-0_sp0.9 from training. Duration: 0.8 2023-10-03 22:15:35,156 WARNING [train.py:1204] (2/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_188-171673-0_sp0.9 from training. Duration: 0.6444375 2023-10-03 22:15:36,461 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_120-41668-0_sp1.1 from training. Duration: 0.636375 2023-10-03 22:15:36,468 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_40223_210_6228_1_1533298404_4812267_613-78769-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 22:15:36,575 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_53861_210_5399_1_1534422156_4272077_150-336899-0_sp1.1 from training. Duration: 0.963625 2023-10-03 22:15:36,637 WARNING [train.py:1204] (2/4) Exclude cut with ID _14459_210_2228_1_1531443641946_7516700_1146-56843-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 22:15:39,443 WARNING [train.py:1204] (2/4) Exclude cut with ID _39867_210_9826_1_1533553222030_9344259_1194-299928-0_sp1.1 from training. Duration: 0.92725 2023-10-03 22:15:40,771 WARNING [train.py:1204] (2/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_214-291369-0_sp0.9 from training. Duration: 0.7 2023-10-03 22:15:47,168 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.556e+02 1.967e+02 2.225e+02 2.437e+02 4.651e+02, threshold=4.451e+02, percent-clipped=1.0 2023-10-03 22:15:47,268 WARNING [train.py:1204] (2/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_358-215558-0 from training. Duration: 0.93 2023-10-03 22:15:47,329 WARNING [train.py:1204] (2/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_577-287150-0_sp0.9 from training. Duration: 0.911125 2023-10-03 22:15:47,461 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.balancer2.prob, batch_count=1425040.0, ans=0.125 2023-10-03 22:15:50,172 WARNING [train.py:1204] (2/4) Exclude cut with ID _60860_210_8254_1_1534896015875_3557169_229-187367-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 22:15:50,230 WARNING [train.py:1204] (2/4) Exclude cut with ID _6275_210_2084_1_1528110062213_7669049_857-298942-0 from training. Duration: 0.85 2023-10-03 22:15:51,622 WARNING [train.py:1204] (2/4) Exclude cut with ID _40137_210_12210_1_1533290484046_3795180_249-187059-0_sp1.1 from training. Duration: 0.8 2023-10-03 22:15:51,629 WARNING [train.py:1204] (2/4) Exclude cut with ID ID0171W0508-765603 from training. Duration: 0.541 2023-10-03 22:15:51,659 WARNING [train.py:1204] (2/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_521-219439-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 22:15:51,664 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_40223_210_6228_1_1533298404_4812267_741-302616-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 22:15:56,830 WARNING [train.py:1204] (2/4) Exclude cut with ID _65293_210_9070_1_1535545549765_4004210_438-154735-0_sp1.1 from training. Duration: 0.92725 2023-10-03 22:16:00,193 WARNING [train.py:1204] (2/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_564-132950-0_sp1.1 from training. Duration: 0.92725 2023-10-03 22:16:00,224 WARNING [train.py:1204] (2/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_291-270881-0_sp1.1 from training. Duration: 0.8818125 2023-10-03 22:16:01,720 WARNING [train.py:1204] (2/4) Exclude cut with ID _60229_210_27767_1_1534838406158_3655350_353-213656-0 from training. Duration: 0.95 2023-10-03 22:16:01,744 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_3133_210_2087_1_1526106446_2324690_243-143119-0 from training. Duration: 0.77 2023-10-03 22:16:01,781 WARNING [train.py:1204] (2/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_690-126306-0 from training. Duration: 0.78 2023-10-03 22:16:04,630 WARNING [train.py:1204] (2/4) Exclude cut with ID _30304_210_6319_1_1532476827261_7582060_347-95003-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 22:16:05,948 WARNING [train.py:1204] (2/4) Exclude cut with ID _2322_210_2667_1_1525604548971_3656299_440-80494-0 from training. Duration: 0.88 2023-10-03 22:16:05,970 WARNING [train.py:1204] (2/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_370-229434-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 22:16:08,843 WARNING [train.py:1204] (2/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_124-345840-0_sp1.1 from training. Duration: 0.47275 2023-10-03 22:16:08,851 WARNING [train.py:1204] (2/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_165-123581-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 22:16:11,570 WARNING [train.py:1204] (2/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_453-282588-0 from training. Duration: 0.98 2023-10-03 22:16:11,593 WARNING [train.py:1204] (2/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_299-34413-0_sp0.9 from training. Duration: 0.72225 2023-10-03 22:16:12,791 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_40036_210_13405_1_1533648647_1315388_48-146016-0_sp1.1 from training. Duration: 0.8454375 2023-10-03 22:16:12,814 WARNING [train.py:1204] (2/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_99-167392-0_sp0.9 from training. Duration: 0.67775 2023-10-03 22:16:14,158 WARNING [train.py:1204] (2/4) Exclude cut with ID _48677_210_5663_1_1533902405919_3892989_37-207114-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 22:16:16,213 WARNING [train.py:1204] (2/4) Exclude cut with ID _29857_210_20034_1_1532395886753_3798160_335-163562-0 from training. Duration: 0.85 2023-10-03 22:16:18,294 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_36492_210_7927_1_1533261987_4790834_703-343351-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 22:16:19,597 WARNING [train.py:1204] (2/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_80-147301-0_sp0.9 from training. Duration: 0.9333125 2023-10-03 22:16:21,016 WARNING [train.py:1204] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_218-295097-0_sp0.9 from training. Duration: 0.8555625 2023-10-03 22:16:23,732 INFO [train.py:1046] (2/4) Epoch 41, batch 1300, loss[loss=0.1455, simple_loss=0.2162, pruned_loss=0.03739, over 23778.00 frames. ], tot_loss[loss=0.1578, simple_loss=0.2381, pruned_loss=0.03882, over 4713881.15 frames. ], batch size: 164, lr: 2.49e-03, grad_scale: 32.0 2023-10-03 22:16:23,792 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_2295_210_2087_1_1525500993_2407863_116-238894-0_sp1.1 from training. Duration: 0.636375 2023-10-03 22:16:25,288 WARNING [train.py:1204] (2/4) Exclude cut with ID _3231_210_2106_1_1526205864934_6784220_697-171068-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 22:16:26,585 WARNING [train.py:1204] (2/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_102-77064-0 from training. Duration: 0.93 2023-10-03 22:16:29,881 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_13091_210_7925_1_1530609447_8987354_1186-261957-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 22:16:32,590 WARNING [train.py:1204] (2/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_91-277684-0_sp1.1 from training. Duration: 0.67275 2023-10-03 22:16:32,676 WARNING [train.py:1204] (2/4) Exclude cut with ID _31088_210_16446_1_1532520012284_3440919_159-313751-0_sp1.1 from training. Duration: 0.936375 2023-10-03 22:16:34,208 WARNING [train.py:1204] (2/4) Exclude cut with ID _42326_210_7770_1_1533814282531_3707269_227-203721-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 22:16:35,594 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_3531_210_3528_1_1526294203_5216456_309-227302-0_sp1.1 from training. Duration: 0.62725 2023-10-03 22:16:35,656 WARNING [train.py:1204] (2/4) Exclude cut with ID _14087_210_2253_1_1530529224512_3014029_226-242140-0 from training. Duration: 0.81 2023-10-03 22:16:41,024 WARNING [train.py:1204] (2/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_122-109023-0_sp1.1 from training. Duration: 0.6909375 2023-10-03 22:16:41,121 WARNING [train.py:1204] (2/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_660-211088-0_sp0.9 from training. Duration: 0.688875 2023-10-03 22:16:42,930 WARNING [train.py:1204] (2/4) Exclude cut with ID _39290_210_4220_1_1533862778760_3075118_92-311600-0 from training. Duration: 0.88 2023-10-03 22:16:47,638 WARNING [train.py:1204] (2/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_390-339137-0_sp1.1 from training. Duration: 0.5090625 2023-10-03 22:16:49,764 WARNING [train.py:1204] (2/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_449-341946-0_sp1.1 from training. Duration: 0.963625 2023-10-03 22:16:52,439 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_68095_210_20185_1_1535718226_8888855_884-327007-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 22:16:53,790 WARNING [train.py:1204] (2/4) Exclude cut with ID _42326_210_7770_1_1533814282531_3707269_308-222998-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 22:16:55,179 WARNING [train.py:1204] (2/4) Exclude cut with ID _40399_210_4353_1_1533305658194_3279406_344-238708-0_sp1.1 from training. Duration: 0.963625 2023-10-03 22:16:55,247 WARNING [train.py:1204] (2/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_87-131427-0_sp1.1 from training. Duration: 0.7090625 2023-10-03 22:16:56,539 WARNING [train.py:1204] (2/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_110-130961-0_sp0.9 from training. Duration: 0.52225 2023-10-03 22:16:57,988 WARNING [train.py:1204] (2/4) Exclude cut with ID _55153_210_2253_1_1534384786677_3162096_148-79022-0 from training. Duration: 0.96 2023-10-03 22:17:02,731 WARNING [train.py:1204] (2/4) Exclude cut with ID _9679_210_7497_1_1529129212906_4363929_512-313889-0_sp1.1 from training. Duration: 0.736375 2023-10-03 22:17:03,975 WARNING [train.py:1204] (2/4) Exclude cut with ID _7970_210_1883_1_1528455616296_3472230_25-178407-0_sp1.1 from training. Duration: 0.6454375 2023-10-03 22:17:05,394 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_3133_210_2087_1_1526106446_2324690_246-125000-0 from training. Duration: 0.62 2023-10-03 22:17:05,450 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_15126_210_6753_1_1530778677_2591185_113-157651-0_sp0.9 from training. Duration: 0.7555625 2023-10-03 22:17:08,121 WARNING [train.py:1204] (2/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_105-63309-0_sp1.1 from training. Duration: 0.8909375 2023-10-03 22:17:10,797 WARNING [train.py:1204] (2/4) Exclude cut with ID _29877_210_6228_1_1532397791106_5276149_578-177271-0_sp1.1 from training. Duration: 0.936375 2023-10-03 22:17:10,845 WARNING [train.py:1204] (2/4) Exclude cut with ID _44831_210_13427_1_1533816011549_3689035_32-188147-0 from training. Duration: 0.96 2023-10-03 22:17:10,898 WARNING [train.py:1204] (2/4) Exclude cut with ID _12369_210_4381_1_1530356078581_4388520_300-254337-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 22:17:10,910 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_128-241008-0 from training. Duration: 0.94 2023-10-03 22:17:12,281 WARNING [train.py:1204] (2/4) Exclude cut with ID _21878_210_3525_1_1531738772002_7264330_53-279178-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 22:17:18,904 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_41452_210_7291_1_1533437687_3941281_273-334088-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 22:17:18,907 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_128-241008-0_sp1.1 from training. Duration: 0.8545625 2023-10-03 22:17:20,501 WARNING [train.py:1204] (2/4) Exclude cut with ID _8394_210_6702_1_1528590504316_3540189_379-49457-0 from training. Duration: 0.87 2023-10-03 22:17:21,941 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_105-114797-0 from training. Duration: 0.84 2023-10-03 22:17:23,381 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_14928_210_5632_1_1530773461_4714000_454-112198-0 from training. Duration: 0.66 2023-10-03 22:17:25,114 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.conv_module1.balancer1.prob, batch_count=1425506.6666666667, ans=0.125 2023-10-03 22:17:26,171 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_456-22141-0_sp1.1 from training. Duration: 0.8 2023-10-03 22:17:28,908 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_115-233929-0 from training. Duration: 0.72 2023-10-03 22:17:31,657 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_616-16795-0_sp1.1 from training. Duration: 0.963625 2023-10-03 22:17:37,915 INFO [train.py:1046] (2/4) Epoch 41, batch 1350, loss[loss=0.166, simple_loss=0.2444, pruned_loss=0.04382, over 23536.00 frames. ], tot_loss[loss=0.1569, simple_loss=0.2367, pruned_loss=0.03859, over 4699832.94 frames. ], batch size: 93, lr: 2.49e-03, grad_scale: 16.0 2023-10-03 22:17:38,005 WARNING [train.py:1204] (2/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_448-212135-0 from training. Duration: 0.91 2023-10-03 22:17:39,660 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.conv_module1.balancer1.prob, batch_count=1425573.3333333333, ans=0.125 2023-10-03 22:17:40,767 WARNING [train.py:1204] (2/4) Exclude cut with ID _46683_210_9642_1_1533730913492_4591116_201-205677-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 22:17:42,250 WARNING [train.py:1204] (2/4) Exclude cut with ID _64086_210_7994_1_1535270417019_7435949_43-80483-0_sp1.1 from training. Duration: 0.97275 2023-10-03 22:17:42,430 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.self_attn_weights.pos_emb_skip_rate, batch_count=1425573.3333333333, ans=0.0 2023-10-03 22:17:46,855 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_240-134124-0_sp1.1 from training. Duration: 0.963625 2023-10-03 22:17:46,886 WARNING [train.py:1204] (2/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_907-120940-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 22:17:48,737 WARNING [train.py:1204] (2/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_848-280957-0_sp1.1 from training. Duration: 0.7909375 2023-10-03 22:17:48,780 WARNING [train.py:1204] (2/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_243-42116-0_sp0.9 from training. Duration: 0.811125 2023-10-03 22:17:51,653 WARNING [train.py:1204] (2/4) Exclude cut with ID _6694_210_4599_1_1527852637490_3965529_116-109398-0_sp0.9 from training. Duration: 0.811125 2023-10-03 22:17:54,172 WARNING [train.py:1204] (2/4) Exclude cut with ID _24314_210_4915_1_1532135078116_7170179_521-258868-0 from training. Duration: 0.79 2023-10-03 22:17:55,471 WARNING [train.py:1204] (2/4) Exclude cut with ID _2220_210_1819_1_1525509109898_3658680_50-99837-0_sp0.9 from training. Duration: 0.6 2023-10-03 22:17:55,519 WARNING [train.py:1204] (2/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_313-134930-0_sp1.1 from training. Duration: 0.8818125 2023-10-03 22:17:55,693 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.feed_forward1.out_proj.dropout_p, batch_count=1425640.0, ans=0.1 2023-10-03 22:17:58,347 WARNING [train.py:1204] (2/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_125-144174-0 from training. Duration: 0.39 2023-10-03 22:17:58,403 WARNING [train.py:1204] (2/4) Exclude cut with ID _59288_210_5854_1_1535077757774_3412246_275-293897-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 22:18:01,143 WARNING [train.py:1204] (2/4) Exclude cut with ID _40519_210_13794_1_1533715228655_7337390_722-52880-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 22:18:01,166 WARNING [train.py:1204] (2/4) Exclude cut with ID _7537_210_2084_1_1528502395431_3924359_128-268029-0 from training. Duration: 0.98 2023-10-03 22:18:01,300 WARNING [train.py:1204] (2/4) Exclude cut with ID _8021_210_7994_1_1528376451358_3653069_179-45206-0 from training. Duration: 0.86 2023-10-03 22:18:01,461 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.conv_module2.balancer2.prob, batch_count=1425640.0, ans=0.125 2023-10-03 22:18:04,516 WARNING [train.py:1204] (2/4) Exclude cut with ID _13252_210_1925_1_1530671372047_3336909_88-276668-0 from training. Duration: 0.99 2023-10-03 22:18:05,962 WARNING [train.py:1204] (2/4) Exclude cut with ID _22613_210_5219_1_1532253599686_4090170_290-212862-0_sp1.1 from training. Duration: 0.97275 2023-10-03 22:18:05,975 WARNING [train.py:1204] (2/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_465-214726-0 from training. Duration: 0.86 2023-10-03 22:18:16,171 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.677e+02 1.965e+02 2.297e+02 2.654e+02 3.860e+02, threshold=4.593e+02, percent-clipped=0.0 2023-10-03 22:18:16,349 WARNING [train.py:1204] (2/4) Exclude cut with ID _56480_210_4220_1_1534679942079_3075409_138-36031-0_sp1.1 from training. Duration: 0.97275 2023-10-03 22:18:21,268 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.self_attn_weights.pos_emb_skip_rate, batch_count=1425773.3333333333, ans=0.0 2023-10-03 22:18:26,662 WARNING [train.py:1204] (2/4) Exclude cut with ID _58605_210_12210_1_1534723195939_3904259_300-285878-0_sp1.1 from training. Duration: 0.97275 2023-10-03 22:18:26,701 WARNING [train.py:1204] (2/4) Exclude cut with ID _40901_210_5665_1_1533363789958_3856130_114-340229-0_sp1.1 from training. Duration: 0.936375 2023-10-03 22:18:28,052 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_489-230703-0 from training. Duration: 0.85 2023-10-03 22:18:32,146 WARNING [train.py:1204] (2/4) Exclude cut with ID _66873_210_5517_1_1535454136506_4293370_322-81243-0_sp1.1 from training. Duration: 0.936375 2023-10-03 22:18:32,233 WARNING [train.py:1204] (2/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_271-269084-0 from training. Duration: 0.83 2023-10-03 22:18:32,246 WARNING [train.py:1204] (2/4) Exclude cut with ID _13252_210_1925_1_1530671372047_3336909_166-314607-0_sp0.9 from training. Duration: 0.6 2023-10-03 22:18:34,088 WARNING [train.py:1204] (2/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_340-343302-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 22:18:35,533 WARNING [train.py:1204] (2/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_286-112015-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 22:18:36,940 WARNING [train.py:1204] (2/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_328-264776-0 from training. Duration: 0.67 2023-10-03 22:18:38,495 WARNING [train.py:1204] (2/4) Exclude cut with ID _4866_210_2577_1_1527404412284_4364659_256-306830-0_sp0.9 from training. Duration: 0.9666875 2023-10-03 22:18:41,399 WARNING [train.py:1204] (2/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_239-316802-0 from training. Duration: 0.69 2023-10-03 22:18:41,605 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.1.conv_module1.balancer2.prob, batch_count=1425840.0, ans=0.125 2023-10-03 22:18:42,701 WARNING [train.py:1204] (2/4) Exclude cut with ID _29857_210_20034_1_1532395886753_3798160_279-185801-0 from training. Duration: 0.91 2023-10-03 22:18:45,014 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.conv_module2.balancer1.max_abs, batch_count=1425840.0, ans=10.0 2023-10-03 22:18:48,190 WARNING [train.py:1204] (2/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_656-188361-0 from training. Duration: 0.87 2023-10-03 22:18:49,602 WARNING [train.py:1204] (2/4) Exclude cut with ID _44718_210_5816_1_1534050051869_6469030_100-230287-0_sp1.1 from training. Duration: 0.936375 2023-10-03 22:18:52,242 INFO [train.py:1046] (2/4) Epoch 41, batch 1400, loss[loss=0.1497, simple_loss=0.2231, pruned_loss=0.03813, over 23762.00 frames. ], tot_loss[loss=0.1563, simple_loss=0.2361, pruned_loss=0.03828, over 4711928.91 frames. ], batch size: 179, lr: 2.49e-03, grad_scale: 16.0 2023-10-03 22:18:53,645 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8275_210_4198_1_1528455320_3892571_276-215622-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 22:18:53,694 WARNING [train.py:1204] (2/4) Exclude cut with ID _30304_210_6319_1_1532476827261_7582060_698-309430-0_sp1.1 from training. Duration: 0.963625 2023-10-03 22:18:57,903 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_365-217119-0 from training. Duration: 0.63 2023-10-03 22:18:59,370 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7264_210_4938_1_1527939911_8132095_800-91995-0 from training. Duration: 0.86 2023-10-03 22:19:10,709 WARNING [train.py:1204] (2/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_49-97158-0_sp1.1 from training. Duration: 0.6909375 2023-10-03 22:19:13,420 WARNING [train.py:1204] (2/4) Exclude cut with ID _61132_210_9969_1_1535021792964_3755419_61-57720-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 22:19:16,654 WARNING [train.py:1204] (2/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_345-306832-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 22:19:16,686 WARNING [train.py:1204] (2/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_164-224052-0_sp0.9 from training. Duration: 0.77775 2023-10-03 22:19:19,394 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=1425973.3333333333, ans=0.1 2023-10-03 22:19:21,999 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_34886_210_10633_1_1533203882_1755661_76-156096-0_sp1.1 from training. Duration: 0.7909375 2023-10-03 22:19:22,085 WARNING [train.py:1204] (2/4) Exclude cut with ID 4133-6541-0027-40495-0_sp1.1 from training. Duration: 0.9681875 2023-10-03 22:19:31,621 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_514-211165-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 22:19:31,676 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_41211_210_2544_1_1533348859_3702910_97-212695-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 22:19:35,795 WARNING [train.py:1204] (2/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_406-45086-0 from training. Duration: 0.8 2023-10-03 22:19:35,849 WARNING [train.py:1204] (2/4) Exclude cut with ID _65263_210_22356_1_1535763598691_3921037_49-222917-0_sp1.1 from training. Duration: 0.836375 2023-10-03 22:19:37,769 WARNING [train.py:1204] (2/4) Exclude cut with ID _4866_210_2577_1_1527404412284_4364659_352-157930-0_sp1.1 from training. Duration: 0.736375 2023-10-03 22:19:37,833 WARNING [train.py:1204] (2/4) Exclude cut with ID _4366_210_1388_1_1526792053633_3613819_231-211402-0_sp1.1 from training. Duration: 0.8545625 2023-10-03 22:19:39,183 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_98-21576-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 22:19:39,262 WARNING [train.py:1204] (2/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_348-127789-0_sp0.9 from training. Duration: 0.9444375 2023-10-03 22:19:40,532 WARNING [train.py:1204] (2/4) Exclude cut with ID _45871_210_5665_1_1533864614245_3296060_98-329557-0_sp1.1 from training. Duration: 0.97275 2023-10-03 22:19:40,584 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6846_210_6753_1_1527850767_4295158_2-315073-0_sp1.1 from training. Duration: 0.8 2023-10-03 22:19:41,950 WARNING [train.py:1204] (2/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_145-137790-0 from training. Duration: 0.83 2023-10-03 22:19:41,973 WARNING [train.py:1204] (2/4) Exclude cut with ID _14603_210_4220_1_1530615688441_3097319_133-158308-0_sp0.9 from training. Duration: 0.9333125 2023-10-03 22:19:46,723 WARNING [train.py:1204] (2/4) Exclude cut with ID _14898_210_9206_1_1530766812377_4177190_320-11758-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 22:19:49,499 WARNING [train.py:1204] (2/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_406-45086-0_sp0.9 from training. Duration: 0.888875 2023-10-03 22:19:58,757 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_5438_210_1794_1_1527336687_2600009_104-288274-0 from training. Duration: 0.58 2023-10-03 22:19:58,832 WARNING [train.py:1204] (2/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_108-53929-0_sp0.9 from training. Duration: 0.6555625 2023-10-03 22:19:58,896 WARNING [train.py:1204] (2/4) Exclude cut with ID _30800_210_22382_1_1532499347904_4515959_444-286390-0_sp1.1 from training. Duration: 0.963625 2023-10-03 22:20:01,711 WARNING [train.py:1204] (2/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_125-144174-0_sp0.9 from training. Duration: 0.4333125 2023-10-03 22:20:01,781 WARNING [train.py:1204] (2/4) Exclude cut with ID _55209_210_1531_1_1534675828996_4597160_118-345621-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 22:20:02,003 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.conv_module1.balancer2.min_positive, batch_count=1426173.3333333333, ans=0.05 2023-10-03 22:20:04,552 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_45886_210_17024_1_1533709215_471063_7-79165-0_sp1.1 from training. Duration: 0.8909375 2023-10-03 22:20:06,480 INFO [train.py:1046] (2/4) Epoch 41, batch 1450, loss[loss=0.1569, simple_loss=0.2484, pruned_loss=0.03273, over 24310.00 frames. ], tot_loss[loss=0.1557, simple_loss=0.2355, pruned_loss=0.03797, over 4711480.34 frames. ], batch size: 74, lr: 2.49e-03, grad_scale: 16.0 2023-10-03 22:20:08,001 WARNING [train.py:1204] (2/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_175-163894-0_sp0.9 from training. Duration: 0.82225 2023-10-03 22:20:09,392 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_50188_210_4198_1_1534503621_3646041_353-81682-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 22:20:09,397 WARNING [train.py:1204] (2/4) Exclude cut with ID _17875_210_2087_1_1531224012907_1821339_175-80565-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 22:20:09,407 WARNING [train.py:1204] (2/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_159-328157-0_sp1.1 from training. Duration: 0.47275 2023-10-03 22:20:11,127 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.nonlin_attention.balancer.prob, batch_count=1426240.0, ans=0.125 2023-10-03 22:20:13,760 WARNING [train.py:1204] (2/4) Exclude cut with ID _60057_210_10640_1_1534903065793_3333150_137-267579-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 22:20:15,459 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_92-46009-0_sp1.1 from training. Duration: 0.4909375 2023-10-03 22:20:15,574 WARNING [train.py:1204] (2/4) Exclude cut with ID _53203_210_2087_1_1534246218309_475590_48-68507-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 22:20:15,589 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_28211_210_6388_1_1532861506_4523224_128-328162-0 from training. Duration: 0.9 2023-10-03 22:20:16,892 WARNING [train.py:1204] (2/4) Exclude cut with ID _4697_210_2069_1_1526815801953_3823098_51-1049-0_sp1.1 from training. Duration: 0.6818125 2023-10-03 22:20:16,975 WARNING [train.py:1204] (2/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_383-316000-0 from training. Duration: 0.9 2023-10-03 22:20:17,028 WARNING [train.py:1204] (2/4) Exclude cut with ID _4779_210_2084_1_1527318022944_3914893_185-295153-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 22:20:18,829 WARNING [train.py:1204] (2/4) Exclude cut with ID _3231_210_2106_1_1526205864934_6784220_171-256004-0_sp0.9 from training. Duration: 0.9555625 2023-10-03 22:20:18,829 WARNING [train.py:1204] (2/4) Exclude cut with ID _41314_210_12651_1_1533459476543_3766460_87-272068-0 from training. Duration: 0.83 2023-10-03 22:20:20,239 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_164-152777-0_sp1.1 from training. Duration: 0.92725 2023-10-03 22:20:22,093 WARNING [train.py:1204] (2/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_18-130336-0_sp0.9 from training. Duration: 0.87775 2023-10-03 22:20:22,158 WARNING [train.py:1204] (2/4) Exclude cut with ID _14886_210_4220_1_1530702020355_3516190_175-229073-0_sp1.1 from training. Duration: 0.4090625 2023-10-03 22:20:22,170 WARNING [train.py:1204] (2/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_791-45149-0_sp0.9 from training. Duration: 0.9555625 2023-10-03 22:20:23,533 WARNING [train.py:1204] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_468-189058-0_sp1.1 from training. Duration: 0.87275 2023-10-03 22:20:24,866 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8278_210_2550_1_1528637099_4220413_32-142780-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 22:20:26,459 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=1426306.6666666667, ans=0.1 2023-10-03 22:20:27,686 WARNING [train.py:1204] (2/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_146-218763-0_sp0.9 from training. Duration: 0.9555625 2023-10-03 22:20:31,786 WARNING [train.py:1204] (2/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_348-127789-0_sp1.1 from training. Duration: 0.77275 2023-10-03 22:20:31,798 WARNING [train.py:1204] (2/4) Exclude cut with ID _47790_210_16187_1_1533862814066_4207519_288-285445-0_sp1.1 from training. Duration: 0.936375 2023-10-03 22:20:34,850 WARNING [train.py:1204] (2/4) Exclude cut with ID _14603_210_4220_1_1530615688441_3097319_308-181732-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 22:20:34,875 WARNING [train.py:1204] (2/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_895-117428-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 22:20:37,605 WARNING [train.py:1204] (2/4) Exclude cut with ID _9235_210_5219_1_1528891154253_8046120_387-159008-0_sp0.9 from training. Duration: 0.9555625 2023-10-03 22:20:37,626 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_37351_210_7925_1_1533012931_3812113_265-85780-0_sp0.9 from training. Duration: 0.97775 2023-10-03 22:20:37,640 WARNING [train.py:1204] (2/4) Exclude cut with ID _40551_210_10635_1_1533776424632_3416179_361-3964-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 22:20:37,687 WARNING [train.py:1204] (2/4) Exclude cut with ID _25882_210_3036_1_1532775665815_6479510_485-122742-0_sp1.1 from training. Duration: 0.97275 2023-10-03 22:20:41,793 WARNING [train.py:1204] (2/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_98-51676-0 from training. Duration: 0.73 2023-10-03 22:20:43,240 WARNING [train.py:1204] (2/4) Exclude cut with ID _30548_210_16040_1_1532608097130_3146969_371-280610-0_sp1.1 from training. Duration: 0.92725 2023-10-03 22:20:45,100 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.615e+02 1.909e+02 2.047e+02 2.238e+02 3.342e+02, threshold=4.094e+02, percent-clipped=0.0 2023-10-03 22:20:47,902 WARNING [train.py:1204] (2/4) Exclude cut with ID IC0671W0081-331924 from training. Duration: 0.896 2023-10-03 22:20:49,404 WARNING [train.py:1204] (2/4) Exclude cut with ID _40284_210_2084_1_1533513679814_3704279_380-160775-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 22:20:50,847 WARNING [train.py:1204] (2/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_57-270037-0_sp1.1 from training. Duration: 0.7 2023-10-03 22:20:52,745 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_853-150237-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 22:20:54,093 WARNING [train.py:1204] (2/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_491-261923-0 from training. Duration: 0.98 2023-10-03 22:21:00,065 WARNING [train.py:1204] (2/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_836-244045-0_sp1.1 from training. Duration: 0.97275 2023-10-03 22:21:00,143 WARNING [train.py:1204] (2/4) Exclude cut with ID _14880_210_8895_1_1530680426655_3795259_289-212817-0 from training. Duration: 0.69 2023-10-03 22:21:01,589 WARNING [train.py:1204] (2/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_303-340446-0 from training. Duration: 0.9 2023-10-03 22:21:03,065 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_37152_210_9697_1_1533106573_3844188_418-41736-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 22:21:07,513 WARNING [train.py:1204] (2/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_371-18169-0_sp1.1 from training. Duration: 0.7818125 2023-10-03 22:21:07,550 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_187-219984-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 22:21:08,909 WARNING [train.py:1204] (2/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_63-267585-0 from training. Duration: 0.9 2023-10-03 22:21:11,554 WARNING [train.py:1204] (2/4) Exclude cut with ID _27013_210_1389_1_1532437173240_3252649_222-125573-0 from training. Duration: 0.98 2023-10-03 22:21:11,619 WARNING [train.py:1204] (2/4) Exclude cut with ID _14821_210_8750_1_1530680503991_6829359_189-297451-0 from training. Duration: 0.55 2023-10-03 22:21:11,723 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_557-163391-0_sp1.1 from training. Duration: 0.97275 2023-10-03 22:21:13,095 WARNING [train.py:1204] (2/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_65-88242-0_sp1.1 from training. Duration: 0.5090625 2023-10-03 22:21:17,166 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.bypass_mid.scale_min, batch_count=1426506.6666666667, ans=0.2 2023-10-03 22:21:20,106 INFO [train.py:1046] (2/4) Epoch 41, batch 1500, loss[loss=0.1474, simple_loss=0.2253, pruned_loss=0.03475, over 23718.00 frames. ], tot_loss[loss=0.1561, simple_loss=0.2361, pruned_loss=0.03802, over 4726190.98 frames. ], batch size: 149, lr: 2.49e-03, grad_scale: 16.0 2023-10-03 22:21:24,379 WARNING [train.py:1204] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_218-295097-0 from training. Duration: 0.77 2023-10-03 22:21:24,421 WARNING [train.py:1204] (2/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_686-247600-0_sp1.1 from training. Duration: 0.836375 2023-10-03 22:21:24,424 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_397-147196-0_sp0.9 from training. Duration: 0.888875 2023-10-03 22:21:25,753 WARNING [train.py:1204] (2/4) Exclude cut with ID _53950_210_5816_1_1534417264919_3862951_237-136449-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 22:21:27,407 WARNING [train.py:1204] (2/4) Exclude cut with ID _3120_210_3525_1_1526036143112_4072980_74-178400-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 22:21:28,111 WARNING [train.py:1204] (2/4) Exclude cut with ID _5166_210_2084_1_1527073197160_4087993_309-222545-0_sp0.9 from training. Duration: 0.9333125 2023-10-03 22:21:29,346 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7544_210_6753_1_1528587722_3576686_172-171750-0 from training. Duration: 0.65 2023-10-03 22:21:30,677 WARNING [train.py:1204] (2/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_3-237434-0_sp0.9 from training. Duration: 0.8444375 2023-10-03 22:21:30,709 WARNING [train.py:1204] (2/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_101-293367-0_sp0.9 from training. Duration: 0.7 2023-10-03 22:21:30,726 WARNING [train.py:1204] (2/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_818-213552-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 22:21:30,796 WARNING [train.py:1204] (2/4) Exclude cut with ID _14349_210_8341_1_1530615710818_3442470_115-285975-0_sp1.1 from training. Duration: 0.7818125 2023-10-03 22:21:32,277 WARNING [train.py:1204] (2/4) Exclude cut with ID _30800_210_22382_1_1532499347904_4515959_291-309749-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 22:21:35,374 WARNING [train.py:1204] (2/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_715-3462-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 22:21:41,095 WARNING [train.py:1204] (2/4) Exclude cut with ID _59502_210_12210_1_1535107024698_1835139_13-331827-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 22:21:41,109 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_4528_210_3273_1_1527076654_3276777_342-168068-0 from training. Duration: 0.87 2023-10-03 22:21:41,188 WARNING [train.py:1204] (2/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_215-141059-0_sp0.9 from training. Duration: 0.688875 2023-10-03 22:21:41,208 WARNING [train.py:1204] (2/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_106-286130-0_sp1.1 from training. Duration: 0.8909375 2023-10-03 22:21:42,535 WARNING [train.py:1204] (2/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_480-77655-0_sp1.1 from training. Duration: 0.97275 2023-10-03 22:21:45,215 WARNING [train.py:1204] (2/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_848-280957-0 from training. Duration: 0.87 2023-10-03 22:21:48,389 WARNING [train.py:1204] (2/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_122-109023-0 from training. Duration: 0.76 2023-10-03 22:21:49,755 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_11305_210_1794_1_1529750428_7178148_645-233480-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 22:21:51,076 WARNING [train.py:1204] (2/4) Exclude cut with ID _7562_210_2084_1_1528887767916_3946229_345-169156-0 from training. Duration: 0.58 2023-10-03 22:21:51,326 INFO [scaling.py:1118] (2/4) WithLoss: name=encoder.encoders.4.encoder.layers.0.self_attn_weights, loss-sum=0.000e+00 2023-10-03 22:21:51,625 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.4.encoder.layers.0.feed_forward3.out_whiten, num_groups=1, num_channels=384, metric=12.72 vs. limit=15.0 2023-10-03 22:21:51,837 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.1.encoder.layers.1.nonlin_attention.whiten1, num_groups=1, num_channels=192, metric=6.27 vs. limit=10.0 2023-10-03 22:21:53,807 WARNING [train.py:1204] (2/4) Exclude cut with ID _7562_210_2084_1_1528887767916_3946229_345-169156-0_sp1.1 from training. Duration: 0.52725 2023-10-03 22:21:55,209 WARNING [train.py:1204] (2/4) Exclude cut with ID _13253_210_1925_1_1530757775819_3577873_253-246386-0_sp1.1 from training. Duration: 0.8181875 2023-10-03 22:21:56,614 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_136-264444-0_sp1.1 from training. Duration: 0.97275 2023-10-03 22:21:56,635 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_2168_210_2290_1_1525606920_1849400_136-222991-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 22:21:58,633 WARNING [train.py:1204] (2/4) Exclude cut with ID _7852_210_5219_1_1528286471316_8139400_242-105336-0 from training. Duration: 0.72 2023-10-03 22:21:58,669 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_5960_210_1794_1_1527403865_3990999_515-2613-0_sp1.1 from training. Duration: 0.8 2023-10-03 22:21:58,692 WARNING [train.py:1204] (2/4) Exclude cut with ID _39688_210_19596_1_1533371626725_4154159_551-167834-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 22:21:58,748 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_43802_210_2290_1_1533516203_8829504_85-195240-0 from training. Duration: 0.87 2023-10-03 22:21:58,791 WARNING [train.py:1204] (2/4) Exclude cut with ID _63171_210_15488_1_1535104792794_3295110_113-113534-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 22:22:06,647 WARNING [train.py:1204] (2/4) Exclude cut with ID _8833_210_5219_1_1528592665210_7553059_319-349181-0_sp1.1 from training. Duration: 0.9 2023-10-03 22:22:06,648 WARNING [train.py:1204] (2/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_225-43381-0 from training. Duration: 0.94 2023-10-03 22:22:08,277 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.balancer1.prob, batch_count=1426773.3333333333, ans=0.125 2023-10-03 22:22:10,841 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_5959_210_1794_1_1527400400_3445726_412-44052-0_sp0.9 from training. Duration: 0.8555625 2023-10-03 22:22:12,245 WARNING [train.py:1204] (2/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_356-167816-0_sp1.1 from training. Duration: 0.6545625 2023-10-03 22:22:15,008 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9012W0112-501244 from training. Duration: 0.768 2023-10-03 22:22:15,046 WARNING [train.py:1204] (2/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_177-326952-0_sp1.1 from training. Duration: 0.92725 2023-10-03 22:22:15,050 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9016W0116-502789 from training. Duration: 0.896 2023-10-03 22:22:17,082 WARNING [train.py:1204] (2/4) Exclude cut with ID _56235_210_10640_1_1534557335235_4161099_557-204815-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 22:22:18,472 WARNING [train.py:1204] (2/4) Exclude cut with ID _55857_210_2346_1_1534644007831_6898209_279-229469-0_sp1.1 from training. Duration: 0.936375 2023-10-03 22:22:19,755 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9029W0375-509764 from training. Duration: 0.896 2023-10-03 22:22:21,233 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_2295_210_2087_1_1525500993_2407863_234-83104-0_sp0.9 from training. Duration: 0.688875 2023-10-03 22:22:24,017 WARNING [train.py:1204] (2/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_266-137104-0 from training. Duration: 0.65 2023-10-03 22:22:25,365 WARNING [train.py:1204] (2/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_289-163927-0_sp1.1 from training. Duration: 0.92725 2023-10-03 22:22:28,323 WARNING [train.py:1204] (2/4) Exclude cut with ID _12733_210_8341_1_1530167424556_3646380_86-225314-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 22:22:28,336 WARNING [train.py:1204] (2/4) Exclude cut with ID _7877_210_6014_1_1528286358099_3750136_618-284064-0_sp1.1 from training. Duration: 0.92725 2023-10-03 22:22:28,382 WARNING [train.py:1204] (2/4) Exclude cut with ID _55844_210_2346_1_1534471212569_7625707_1017-31469-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 22:22:30,193 WARNING [train.py:1204] (2/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_617-241446-0_sp1.1 from training. Duration: 0.92725 2023-10-03 22:22:31,550 WARNING [train.py:1204] (2/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_866-173440-0_sp1.1 from training. Duration: 0.8181875 2023-10-03 22:22:32,873 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_9460_210_4211_1_1528954587_10049667_480-260269-0 from training. Duration: 0.56 2023-10-03 22:22:32,921 WARNING [train.py:1204] (2/4) Exclude cut with ID _44659_210_2721_1_1534036120061_6614860_626-298884-0 from training. Duration: 0.97 2023-10-03 22:22:34,150 INFO [train.py:1046] (2/4) Epoch 41, batch 1550, loss[loss=0.1931, simple_loss=0.2628, pruned_loss=0.06171, over 19321.00 frames. ], tot_loss[loss=0.1562, simple_loss=0.2366, pruned_loss=0.03791, over 4712708.36 frames. ], batch size: 390, lr: 2.49e-03, grad_scale: 16.0 2023-10-03 22:22:34,202 WARNING [train.py:1204] (2/4) Exclude cut with ID _53565_210_10635_1_1534337218335_3820259_287-18120-0_sp1.1 from training. Duration: 0.8818125 2023-10-03 22:22:34,269 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_791-197891-0 from training. Duration: 0.84 2023-10-03 22:22:34,311 WARNING [train.py:1204] (2/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_527-238990-0 from training. Duration: 0.9 2023-10-03 22:22:37,576 WARNING [train.py:1204] (2/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_449-65969-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 22:22:38,864 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_553-341452-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 22:22:40,202 WARNING [train.py:1204] (2/4) Exclude cut with ID _16909_210_5724_1_1531292079912_4118919_300-47574-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 22:22:40,214 WARNING [train.py:1204] (2/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_70-27732-0_sp1.1 from training. Duration: 0.8545625 2023-10-03 22:22:40,314 WARNING [train.py:1204] (2/4) Exclude cut with ID _44718_210_5816_1_1534050051869_6469030_727-28074-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 22:22:41,622 WARNING [train.py:1204] (2/4) Exclude cut with ID _55857_210_2346_1_1534644007831_6898209_357-123664-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 22:22:44,520 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9123W0004-555964 from training. Duration: 0.896 2023-10-03 22:22:44,541 WARNING [train.py:1204] (2/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_445-145208-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 22:22:45,783 WARNING [train.py:1204] (2/4) Exclude cut with ID _3675_210_4557_1_1526639934408_4182089_456-18305-0_sp1.1 from training. Duration: 0.6909375 2023-10-03 22:22:45,848 WARNING [train.py:1204] (2/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_1-99680-0_sp1.1 from training. Duration: 0.6090625 2023-10-03 22:22:47,831 WARNING [train.py:1204] (2/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_146-282400-0_sp0.9 from training. Duration: 0.788875 2023-10-03 22:22:47,844 WARNING [train.py:1204] (2/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_844-96989-0 from training. Duration: 0.71 2023-10-03 22:22:50,492 WARNING [train.py:1204] (2/4) Exclude cut with ID _29853_210_7497_1_1532505600851_6642110_128-146364-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 22:22:50,515 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_224-322394-0 from training. Duration: 0.66 2023-10-03 22:22:51,879 WARNING [train.py:1204] (2/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_191-59784-0 from training. Duration: 0.88 2023-10-03 22:22:51,884 WARNING [train.py:1204] (2/4) Exclude cut with ID _12556_210_8341_1_1530081024100_7327690_725-193477-0 from training. Duration: 0.98 2023-10-03 22:22:51,935 WARNING [train.py:1204] (2/4) Exclude cut with ID _5166_210_2084_1_1527073197160_4087993_523-79838-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 22:22:52,090 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.ff2_skip_rate, batch_count=1426973.3333333333, ans=0.0 2023-10-03 22:22:53,305 WARNING [train.py:1204] (2/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_398-334558-0_sp1.1 from training. Duration: 0.87275 2023-10-03 22:22:54,181 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.1.encoder.layers.0.conv_module1.whiten, num_groups=1, num_channels=256, metric=9.22 vs. limit=15.0 2023-10-03 22:22:57,466 WARNING [train.py:1204] (2/4) Exclude cut with ID _6456_210_1534_1_1527681634212_3181049_171-36697-0_sp1.1 from training. Duration: 0.936375 2023-10-03 22:23:00,914 WARNING [train.py:1204] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_700-183549-0 from training. Duration: 0.68 2023-10-03 22:23:00,917 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8853_210_3273_1_1528633899_818968_51-187685-0 from training. Duration: 0.69 2023-10-03 22:23:08,808 WARNING [train.py:1204] (2/4) Exclude cut with ID _7719_210_3525_1_1528456153163_4296960_333-110399-0_sp1.1 from training. Duration: 0.87275 2023-10-03 22:23:10,278 WARNING [train.py:1204] (2/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_1022-83740-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 22:23:11,701 WARNING [train.py:1204] (2/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_61-327062-0_sp0.9 from training. Duration: 0.9 2023-10-03 22:23:11,714 WARNING [train.py:1204] (2/4) Exclude cut with ID _47251_210_18628_1_1533814063128_4137250_214-88076-0_sp1.1 from training. Duration: 0.8 2023-10-03 22:23:11,772 WARNING [train.py:1204] (2/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_839-173017-0 from training. Duration: 0.41 2023-10-03 22:23:12,043 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.conv_skip_rate, batch_count=1427040.0, ans=0.0 2023-10-03 22:23:13,021 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.576e+02 1.987e+02 2.151e+02 2.393e+02 4.271e+02, threshold=4.303e+02, percent-clipped=1.0 2023-10-03 22:23:16,003 WARNING [train.py:1204] (2/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_299-34413-0_sp1.1 from training. Duration: 0.5909375 2023-10-03 22:23:17,859 WARNING [train.py:1204] (2/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_664-159457-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 22:23:20,605 WARNING [train.py:1204] (2/4) Exclude cut with ID _66873_210_5517_1_1535454136506_4293370_483-291760-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 22:23:23,341 WARNING [train.py:1204] (2/4) Exclude cut with ID _30800_210_22382_1_1532499347904_4515959_287-255474-0_sp1.1 from training. Duration: 0.92725 2023-10-03 22:23:23,389 WARNING [train.py:1204] (2/4) Exclude cut with ID _48677_210_5663_1_1533902405919_3892989_111-11439-0_sp1.1 from training. Duration: 0.87275 2023-10-03 22:23:23,405 WARNING [train.py:1204] (2/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_399-156791-0 from training. Duration: 0.83 2023-10-03 22:23:24,709 WARNING [train.py:1204] (2/4) Exclude cut with ID _3992_210_1747_1_1526644816031_3731060_36-147855-0_sp0.9 from training. Duration: 0.8666875 2023-10-03 22:23:27,542 WARNING [train.py:1204] (2/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_442-320317-0_sp1.1 from training. Duration: 0.7454375 2023-10-03 22:23:27,568 WARNING [train.py:1204] (2/4) Exclude cut with ID _55429_210_10626_1_1534467588527_7174143_953-80868-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 22:23:28,918 WARNING [train.py:1204] (2/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_159-328157-0_sp0.9 from training. Duration: 0.57775 2023-10-03 22:23:28,925 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9309W0197-640324 from training. Duration: 0.896 2023-10-03 22:23:31,093 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.bypass_mid.scale_min, batch_count=1427106.6666666667, ans=0.2 2023-10-03 22:23:31,134 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=1427106.6666666667, ans=0.1 2023-10-03 22:23:32,242 WARNING [train.py:1204] (2/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_361-30822-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 22:23:38,331 WARNING [train.py:1204] (2/4) Exclude cut with ID _5250_210_5219_1_1527249561806_8692110_636-251213-0 from training. Duration: 0.94 2023-10-03 22:23:44,487 WARNING [train.py:1204] (2/4) Exclude cut with ID _49728_210_12651_1_1534041034294_6788150_468-333827-0_sp1.1 from training. Duration: 0.963625 2023-10-03 22:23:44,564 WARNING [train.py:1204] (2/4) Exclude cut with ID _25882_210_3036_1_1532775665815_6479510_623-191759-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 22:23:45,946 WARNING [train.py:1204] (2/4) Exclude cut with ID _5189_210_4557_1_1527248319428_3769229_199-53526-0 from training. Duration: 0.42 2023-10-03 22:23:47,446 WARNING [train.py:1204] (2/4) Exclude cut with ID _6274_210_2084_1_1527937583837_7494020_32-205288-0_sp0.9 from training. Duration: 0.8666875 2023-10-03 22:23:47,527 WARNING [train.py:1204] (2/4) Exclude cut with ID _65263_210_22356_1_1535763598691_3921037_401-346646-0_sp1.1 from training. Duration: 0.963625 2023-10-03 22:23:47,532 WARNING [train.py:1204] (2/4) Exclude cut with ID _12555_210_8750_1_1530075594402_3780239_449-267986-0_sp1.1 from training. Duration: 0.7545625 2023-10-03 22:23:48,765 INFO [train.py:1046] (2/4) Epoch 41, batch 1600, loss[loss=0.1608, simple_loss=0.2444, pruned_loss=0.03854, over 23729.00 frames. ], tot_loss[loss=0.1571, simple_loss=0.2376, pruned_loss=0.03831, over 4718140.16 frames. ], batch size: 85, lr: 2.49e-03, grad_scale: 32.0 2023-10-03 22:23:48,813 WARNING [train.py:1204] (2/4) Exclude cut with ID _69884_210_9771_1_1535846392818_4166169_430-201020-0_sp1.1 from training. Duration: 0.8818125 2023-10-03 22:23:50,171 WARNING [train.py:1204] (2/4) Exclude cut with ID _8394_210_6702_1_1528590504316_3540189_379-49457-0_sp1.1 from training. Duration: 0.7909375 2023-10-03 22:23:53,548 WARNING [train.py:1204] (2/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_200-105218-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 22:23:53,728 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.conv_module1.balancer2.prob, batch_count=1427240.0, ans=0.125 2023-10-03 22:23:54,827 WARNING [train.py:1204] (2/4) Exclude cut with ID _3867_210_1883_1_1526641200575_4475830_319-349850-0 from training. Duration: 0.67 2023-10-03 22:23:54,915 WARNING [train.py:1204] (2/4) Exclude cut with ID _28317_210_12742_1_1532480302381_8510325_916-195848-0 from training. Duration: 0.92 2023-10-03 22:23:57,641 WARNING [train.py:1204] (2/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_436-186311-0 from training. Duration: 0.65 2023-10-03 22:23:59,035 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_3910_210_1794_1_1526777667_3747260_302-180617-0_sp1.1 from training. Duration: 0.97275 2023-10-03 22:24:01,766 WARNING [train.py:1204] (2/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_102-92751-0 from training. Duration: 0.99 2023-10-03 22:24:03,109 WARNING [train.py:1204] (2/4) Exclude cut with ID _51899_210_20034_1_1534129254466_3669220_265-215428-0_sp1.1 from training. Duration: 0.8909375 2023-10-03 22:24:05,169 WARNING [train.py:1204] (2/4) Exclude cut with ID _58583_210_5236_1_1534726835266_3574249_438-198993-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 22:24:09,423 WARNING [train.py:1204] (2/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_166-166990-0_sp1.1 from training. Duration: 0.87275 2023-10-03 22:24:14,723 WARNING [train.py:1204] (2/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_907-151398-0 from training. Duration: 0.37 2023-10-03 22:24:17,474 WARNING [train.py:1204] (2/4) Exclude cut with ID _38670_210_15379_1_1533448810659_4391059_452-256669-0_sp1.1 from training. Duration: 0.936375 2023-10-03 22:24:18,820 WARNING [train.py:1204] (2/4) Exclude cut with ID _48677_210_5663_1_1533902405919_3892989_408-40740-0 from training. Duration: 0.83 2023-10-03 22:24:18,870 WARNING [train.py:1204] (2/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_903-158756-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 22:24:18,919 WARNING [train.py:1204] (2/4) Exclude cut with ID _13252_210_1925_1_1530671372047_3336909_397-216497-0 from training. Duration: 0.96 2023-10-03 22:24:23,025 WARNING [train.py:1204] (2/4) Exclude cut with ID _55429_210_10626_1_1534467588527_7174143_97-335084-0 from training. Duration: 0.97 2023-10-03 22:24:30,466 WARNING [train.py:1204] (2/4) Exclude cut with ID _45329_210_7005_1_1534157951211_6774339_453-267730-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 22:24:31,762 WARNING [train.py:1204] (2/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_153-97441-0 from training. Duration: 0.56 2023-10-03 22:24:31,832 WARNING [train.py:1204] (2/4) Exclude cut with ID _40901_210_5665_1_1533363789958_3856130_312-329122-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 22:24:33,165 WARNING [train.py:1204] (2/4) Exclude cut with ID _30800_210_22382_1_1532499347904_4515959_97-126471-0_sp1.1 from training. Duration: 0.97275 2023-10-03 22:24:33,166 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_11305_210_1794_1_1529750428_7178148_257-312870-0_sp1.1 from training. Duration: 0.8 2023-10-03 22:24:36,577 WARNING [train.py:1204] (2/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_454-314901-0_sp0.9 from training. Duration: 0.511125 2023-10-03 22:24:39,217 WARNING [train.py:1204] (2/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_282-136641-0_sp1.1 from training. Duration: 0.3818125 2023-10-03 22:24:39,767 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.4.encoder.layers.2.feed_forward1.out_whiten, num_groups=1, num_channels=384, metric=8.33 vs. limit=15.0 2023-10-03 22:24:42,383 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_46013_210_7234_1_1533776362_3744032_493-160724-0_sp1.1 from training. Duration: 0.963625 2023-10-03 22:24:42,431 WARNING [train.py:1204] (2/4) Exclude cut with ID _67831_210_12392_1_1535612464202_3092612_259-33442-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 22:24:42,485 WARNING [train.py:1204] (2/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_289-62384-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 22:24:43,743 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6545_210_2040_1_1527774670_4245258_379-14182-0_sp0.9 from training. Duration: 0.888875 2023-10-03 22:24:45,606 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_548-294316-0_sp1.1 from training. Duration: 0.736375 2023-10-03 22:24:45,689 WARNING [train.py:1204] (2/4) Exclude cut with ID _6275_210_2084_1_1528110062213_7669049_857-298942-0_sp1.1 from training. Duration: 0.77275 2023-10-03 22:24:48,328 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6673_210_3273_1_1527767790_3173240_327-306185-0_sp1.1 from training. Duration: 0.7545625 2023-10-03 22:24:48,935 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.4.encoder.layers.1.conv_module2.whiten, num_groups=1, num_channels=384, metric=2.73 vs. limit=15.0 2023-10-03 22:24:54,295 WARNING [train.py:1204] (2/4) Exclude cut with ID _61008_210_10633_1_1534852769198_6974370_814-107647-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 22:24:55,693 WARNING [train.py:1204] (2/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_116-332008-0_sp1.1 from training. Duration: 0.8909375 2023-10-03 22:24:58,431 WARNING [train.py:1204] (2/4) Exclude cut with ID _32704_210_8954_1_1533017488840_6747370_266-329755-0 from training. Duration: 0.89 2023-10-03 22:24:58,434 WARNING [train.py:1204] (2/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_271-269084-0_sp0.9 from training. Duration: 0.92225 2023-10-03 22:24:58,509 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_3171_210_2087_1_1526109084_2330007_66-260583-0 from training. Duration: 0.72 2023-10-03 22:25:02,726 INFO [train.py:1046] (2/4) Epoch 41, batch 1650, loss[loss=0.1635, simple_loss=0.2388, pruned_loss=0.04414, over 24359.00 frames. ], tot_loss[loss=0.1578, simple_loss=0.2383, pruned_loss=0.03868, over 4728008.72 frames. ], batch size: 61, lr: 2.49e-03, grad_scale: 16.0 2023-10-03 22:25:03,340 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.5.encoder.layers.0.feed_forward1.out_whiten, num_groups=1, num_channels=256, metric=6.33 vs. limit=15.0 2023-10-03 22:25:04,192 WARNING [train.py:1204] (2/4) Exclude cut with ID _5250_210_5219_1_1527249561806_8692110_1326-276386-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 22:25:05,598 WARNING [train.py:1204] (2/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_372-30302-0_sp1.1 from training. Duration: 0.7818125 2023-10-03 22:25:06,853 WARNING [train.py:1204] (2/4) Exclude cut with ID _25882_210_3036_1_1532775665815_6479510_631-257029-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 22:25:06,861 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_3691_210_3528_1_1526468169_4062053_355-334432-0 from training. Duration: 0.87 2023-10-03 22:25:06,871 WARNING [train.py:1204] (2/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_839-173017-0_sp1.1 from training. Duration: 0.37275 2023-10-03 22:25:06,885 WARNING [train.py:1204] (2/4) Exclude cut with ID _14998_210_5219_1_1530705582080_7474970_441-91466-0 from training. Duration: 0.97 2023-10-03 22:25:08,325 WARNING [train.py:1204] (2/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_108-53929-0 from training. Duration: 0.59 2023-10-03 22:25:11,690 WARNING [train.py:1204] (2/4) Exclude cut with ID _9389_210_1664_1_1529217337301_5289269_260-1942-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 22:25:12,968 WARNING [train.py:1204] (2/4) Exclude cut with ID _56423_210_5854_1_1534896061049_3165969_198-61547-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 22:25:13,006 WARNING [train.py:1204] (2/4) Exclude cut with ID _44778_210_2054_1_1533798127203_7443090_412-142122-0_sp1.1 from training. Duration: 0.92725 2023-10-03 22:25:13,561 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.4.encoder.layers.2.nonlin_attention.whiten2, num_groups=1, num_channels=384, metric=3.43 vs. limit=15.0 2023-10-03 22:25:14,736 WARNING [train.py:1204] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_88-341718-0_sp1.1 from training. Duration: 0.6 2023-10-03 22:25:17,919 WARNING [train.py:1204] (2/4) Exclude cut with ID _61079_210_4525_1_1534921008198_4011419_334-298697-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 22:25:19,345 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_5465_210_4211_1_1527227253_55357_8-2499-0_sp1.1 from training. Duration: 0.996375 2023-10-03 22:25:22,135 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_447-201565-0_sp1.1 from training. Duration: 0.97275 2023-10-03 22:25:22,141 WARNING [train.py:1204] (2/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_253-161789-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 22:25:22,144 WARNING [train.py:1204] (2/4) Exclude cut with ID _44304_210_15374_1_1533639352765_4613129_302-22084-0_sp0.9 from training. Duration: 0.9555625 2023-10-03 22:25:22,153 WARNING [train.py:1204] (2/4) Exclude cut with ID _26664_210_7770_1_1532258943280_3418270_187-80815-0_sp1.1 from training. Duration: 0.8454375 2023-10-03 22:25:22,228 WARNING [train.py:1204] (2/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_68-212801-0 from training. Duration: 0.73 2023-10-03 22:25:23,560 WARNING [train.py:1204] (2/4) Exclude cut with ID _29345_210_4381_1_1532689168263_3860169_149-257268-0 from training. Duration: 0.81 2023-10-03 22:25:28,369 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_213-103282-0_sp1.1 from training. Duration: 0.7090625 2023-10-03 22:25:29,850 WARNING [train.py:1204] (2/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_156-302675-0_sp1.1 from training. Duration: 0.5 2023-10-03 22:25:36,830 WARNING [train.py:1204] (2/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_125-322172-0 from training. Duration: 0.93 2023-10-03 22:25:36,902 WARNING [train.py:1204] (2/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_493-60474-0_sp1.1 from training. Duration: 0.963625 2023-10-03 22:25:40,272 WARNING [train.py:1204] (2/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_156-302675-0 from training. Duration: 0.55 2023-10-03 22:25:43,250 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.685e+02 1.986e+02 2.197e+02 2.523e+02 3.489e+02, threshold=4.394e+02, percent-clipped=0.0 2023-10-03 22:25:44,658 WARNING [train.py:1204] (2/4) Exclude cut with ID _41314_210_12651_1_1533459476543_3766460_123-267862-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 22:25:47,814 WARNING [train.py:1204] (2/4) Exclude cut with ID _28427_210_4220_1_1532581240967_3061069_201-143747-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 22:25:47,861 WARNING [train.py:1204] (2/4) Exclude cut with ID _8772_210_1850_1_1528635595972_3664011_96-12756-0_sp1.1 from training. Duration: 0.8909375 2023-10-03 22:25:47,898 WARNING [train.py:1204] (2/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_1347-145675-0_sp1.1 from training. Duration: 0.936375 2023-10-03 22:25:49,300 WARNING [train.py:1204] (2/4) Exclude cut with ID _9235_210_5219_1_1528891154253_8046120_387-159008-0_sp1.1 from training. Duration: 0.7818125 2023-10-03 22:25:49,318 WARNING [train.py:1204] (2/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_400-201896-0_sp1.1 from training. Duration: 0.963625 2023-10-03 22:25:50,909 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.0.feed_forward1.out_proj.dropout_p, batch_count=1427773.3333333333, ans=0.1 2023-10-03 22:25:52,119 WARNING [train.py:1204] (2/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_326-174612-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 22:25:53,397 WARNING [train.py:1204] (2/4) Exclude cut with ID _55844_210_2346_1_1534471212569_7625707_390-116233-0_sp1.1 from training. Duration: 0.963625 2023-10-03 22:25:53,438 WARNING [train.py:1204] (2/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_419-317062-0_sp1.1 from training. Duration: 0.82725 2023-10-03 22:25:54,792 WARNING [train.py:1204] (2/4) Exclude cut with ID _29877_210_6228_1_1532397791106_5276149_266-62291-0_sp0.9 from training. Duration: 0.9666875 2023-10-03 22:25:56,631 WARNING [train.py:1204] (2/4) Exclude cut with ID _7857_210_1534_1_1528286515745_6831470_431-338528-0_sp1.1 from training. Duration: 0.92725 2023-10-03 22:25:58,021 WARNING [train.py:1204] (2/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_87-131427-0_sp0.9 from training. Duration: 0.8666875 2023-10-03 22:25:59,705 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.conv_module1.balancer1.max_abs, batch_count=1427773.3333333333, ans=10.0 2023-10-03 22:26:00,781 WARNING [train.py:1204] (2/4) Exclude cut with ID _24055_210_12651_1_1532310907143_976979_92-59356-0_sp1.1 from training. Duration: 0.82725 2023-10-03 22:26:00,877 WARNING [train.py:1204] (2/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_96-290039-0 from training. Duration: 0.56 2023-10-03 22:26:03,539 WARNING [train.py:1204] (2/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_48-182155-0_sp0.9 from training. Duration: 0.9666875 2023-10-03 22:26:03,569 WARNING [train.py:1204] (2/4) Exclude cut with ID _4626_210_3532_1_1526817652717_3916859_446-206656-0 from training. Duration: 0.76 2023-10-03 22:26:04,908 WARNING [train.py:1204] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_168-32906-0 from training. Duration: 0.69 2023-10-03 22:26:04,924 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_3780_210_2087_1_1526467760_2130483_170-259044-0 from training. Duration: 0.7 2023-10-03 22:26:04,937 WARNING [train.py:1204] (2/4) Exclude cut with ID _65134_210_14763_1_1535335393985_3505368_153-216718-0_sp1.1 from training. Duration: 0.92725 2023-10-03 22:26:04,999 WARNING [train.py:1204] (2/4) Exclude cut with ID _64639_210_5816_1_1535367619437_3528000_461-329156-0_sp1.1 from training. Duration: 0.87275 2023-10-03 22:26:05,040 WARNING [train.py:1204] (2/4) Exclude cut with ID _21989_210_5854_1_1531899214931_3271360_126-347531-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 22:26:06,353 WARNING [train.py:1204] (2/4) Exclude cut with ID _7614_210_4915_1_1529326764092_4537359_96-162197-0_sp1.1 from training. Duration: 0.963625 2023-10-03 22:26:06,354 WARNING [train.py:1204] (2/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_1083-147536-0 from training. Duration: 0.97 2023-10-03 22:26:10,470 WARNING [train.py:1204] (2/4) Exclude cut with ID _64029_210_4381_1_1535178502772_3756459_275-16710-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 22:26:10,587 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7264_210_4938_1_1527939911_8132095_800-91995-0_sp0.9 from training. Duration: 0.9555625 2023-10-03 22:26:12,285 WARNING [train.py:1204] (2/4) Exclude cut with ID _3844_210_1390_1_1526727803582_2871209_50-193879-0_sp1.1 from training. Duration: 0.936375 2023-10-03 22:26:15,691 WARNING [train.py:1204] (2/4) Exclude cut with ID _46952_210_5986_1_1534230010357_3107379_232-344503-0 from training. Duration: 0.96 2023-10-03 22:26:17,080 INFO [train.py:1046] (2/4) Epoch 41, batch 1700, loss[loss=0.1416, simple_loss=0.226, pruned_loss=0.02864, over 24492.00 frames. ], tot_loss[loss=0.1572, simple_loss=0.2375, pruned_loss=0.03841, over 4724209.14 frames. ], batch size: 66, lr: 2.49e-03, grad_scale: 16.0 2023-10-03 22:26:18,598 WARNING [train.py:1204] (2/4) Exclude cut with ID _25808_210_13292_1_1532343514562_7366109_206-14076-0_sp1.1 from training. Duration: 0.936375 2023-10-03 22:26:18,599 WARNING [train.py:1204] (2/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_55-31904-0_sp0.9 from training. Duration: 0.97775 2023-10-03 22:26:18,619 WARNING [train.py:1204] (2/4) Exclude cut with ID _14632_210_9988_1_1530691279392_3923200_161-129530-0 from training. Duration: 0.96 2023-10-03 22:26:20,696 WARNING [train.py:1204] (2/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_770-200430-0_sp1.1 from training. Duration: 0.8090625 2023-10-03 22:26:20,705 WARNING [train.py:1204] (2/4) Exclude cut with ID _6270_210_2084_1_1527850911573_3856420_26-277343-0_sp1.1 from training. Duration: 0.7454375 2023-10-03 22:26:20,706 WARNING [train.py:1204] (2/4) Exclude cut with ID _69884_210_9771_1_1535846392818_4166169_88-23776-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 22:26:22,157 WARNING [train.py:1204] (2/4) Exclude cut with ID _60860_210_8254_1_1534896015875_3557169_245-256666-0_sp1.1 from training. Duration: 0.8818125 2023-10-03 22:26:22,195 WARNING [train.py:1204] (2/4) Exclude cut with ID _5250_210_5219_1_1527249561806_8692110_636-251213-0_sp1.1 from training. Duration: 0.8545625 2023-10-03 22:26:22,216 WARNING [train.py:1204] (2/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_540-131164-0 from training. Duration: 0.92 2023-10-03 22:26:23,733 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.feed_forward1.out_proj.dropout_p, batch_count=1427906.6666666667, ans=0.1 2023-10-03 22:26:26,728 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_5931_210_2040_1_1527428481_4547999_321-193614-0_sp0.9 from training. Duration: 0.7444375 2023-10-03 22:26:31,171 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.feed_forward2.hidden_balancer.prob, batch_count=1427973.3333333333, ans=0.125 2023-10-03 22:26:32,364 WARNING [train.py:1204] (2/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_373-225403-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 22:26:36,286 WARNING [train.py:1204] (2/4) Exclude cut with ID _20622_210_3852_1_1531622427080_1093839_57-326027-0_sp1.1 from training. Duration: 0.97275 2023-10-03 22:26:42,876 WARNING [train.py:1204] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_605-257205-0_sp1.1 from training. Duration: 0.536375 2023-10-03 22:26:42,897 WARNING [train.py:1204] (2/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_442-320317-0_sp0.9 from training. Duration: 0.911125 2023-10-03 22:26:42,945 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_35361_210_4238_1_1533262810_7793476_675-87012-0_sp1.1 from training. Duration: 0.8090625 2023-10-03 22:26:42,977 WARNING [train.py:1204] (2/4) Exclude cut with ID _4184_210_2550_1_1526730192013_2219380_306-44736-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 22:26:46,157 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_124-113562-0 from training. Duration: 0.94 2023-10-03 22:26:48,222 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.3.encoder.layers.3.nonlin_attention.whiten2, num_groups=1, num_channels=512, metric=5.29 vs. limit=15.0 2023-10-03 22:26:48,947 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_2821_210_2715_1_1526108385_3763560_73-296673-0_sp0.9 from training. Duration: 0.92225 2023-10-03 22:26:48,961 WARNING [train.py:1204] (2/4) Exclude cut with ID _10301_210_5886_1_1529222275332_3910620_216-46959-0_sp1.1 from training. Duration: 0.92725 2023-10-03 22:26:51,011 WARNING [train.py:1204] (2/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_91-277684-0_sp0.9 from training. Duration: 0.82225 2023-10-03 22:26:52,333 WARNING [train.py:1204] (2/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_351-294650-0_sp0.9 from training. Duration: 0.7 2023-10-03 22:26:53,773 WARNING [train.py:1204] (2/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_209-205365-0 from training. Duration: 0.79 2023-10-03 22:26:53,821 WARNING [train.py:1204] (2/4) Exclude cut with ID _14603_210_4220_1_1530615688441_3097319_97-325053-0 from training. Duration: 0.92 2023-10-03 22:26:55,235 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_49348_210_7927_1_1533954500_3691300_109-276430-0_sp1.1 from training. Duration: 0.92725 2023-10-03 22:26:57,070 WARNING [train.py:1204] (2/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_912-47473-0 from training. Duration: 0.68 2023-10-03 22:26:57,153 WARNING [train.py:1204] (2/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_96-320736-0_sp0.9 from training. Duration: 0.9555625 2023-10-03 22:27:06,795 WARNING [train.py:1204] (2/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_1252-235481-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 22:27:06,904 WARNING [train.py:1204] (2/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_813-261554-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 22:27:08,264 WARNING [train.py:1204] (2/4) Exclude cut with ID _21632_210_2253_1_1531994001765_3744140_110-274654-0_sp0.9 from training. Duration: 0.911125 2023-10-03 22:27:09,639 WARNING [train.py:1204] (2/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_381-317791-0_sp1.1 from training. Duration: 0.663625 2023-10-03 22:27:09,651 WARNING [train.py:1204] (2/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_51-117576-0_sp1.1 from training. Duration: 0.363625 2023-10-03 22:27:09,670 WARNING [train.py:1204] (2/4) Exclude cut with ID _42321_210_7770_1_1533468631352_4366895_104-215915-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 22:27:10,536 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.conv_module1.balancer2.min_abs, batch_count=1428106.6666666667, ans=0.5 2023-10-03 22:27:11,441 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_40896_210_15710_1_1533433956_7100802_216-22824-0_sp1.1 from training. Duration: 0.92725 2023-10-03 22:27:11,441 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_14831_210_7927_1_1530700200_5583610_181-27467-0 from training. Duration: 0.91 2023-10-03 22:27:12,828 WARNING [train.py:1204] (2/4) Exclude cut with ID _30304_210_6319_1_1532476827261_7582060_1024-265645-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 22:27:12,831 WARNING [train.py:1204] (2/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_412-233904-0_sp1.1 from training. Duration: 0.936375 2023-10-03 22:27:12,848 WARNING [train.py:1204] (2/4) Exclude cut with ID _30191_210_1664_1_1533189915047_7196670_911-223461-0_sp1.1 from training. Duration: 0.92725 2023-10-03 22:27:12,856 WARNING [train.py:1204] (2/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_558-148266-0_sp1.1 from training. Duration: 0.963625 2023-10-03 22:27:15,593 WARNING [train.py:1204] (2/4) Exclude cut with ID _58480_210_12392_1_1534811121111_3646160_212-164495-0_sp1.1 from training. Duration: 0.936375 2023-10-03 22:27:15,594 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_454-276429-0_sp0.9 from training. Duration: 0.9666875 2023-10-03 22:27:17,551 WARNING [train.py:1204] (2/4) Exclude cut with ID _24736_210_3279_1_1532332894218_2838700_73-197086-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 22:27:17,604 WARNING [train.py:1204] (2/4) Exclude cut with ID _5294_210_1747_1_1527249643928_3696179_529-242298-0_sp1.1 from training. Duration: 0.863625 2023-10-03 22:27:17,645 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_53555_210_5632_1_1534595220_7232297_345-171438-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 22:27:20,931 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.0.nonlin_attention.balancer.prob, batch_count=1428173.3333333333, ans=0.125 2023-10-03 22:27:23,373 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_28211_210_6388_1_1532861506_4523224_22-25766-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 22:27:25,327 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7002_210_1794_1_1527939683_5157286_163-87552-0 from training. Duration: 0.86 2023-10-03 22:27:26,848 WARNING [train.py:1204] (2/4) Exclude cut with ID _56927_210_9969_1_1534848970840_3695229_409-225129-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 22:27:28,272 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_26192_210_7927_1_1532137269_1040115_21-219331-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 22:27:28,399 WARNING [train.py:1204] (2/4) Exclude cut with ID _15012_210_1968_1_1530856820429_5095350_662-210469-0 from training. Duration: 0.57 2023-10-03 22:27:32,473 INFO [train.py:1046] (2/4) Epoch 41, batch 1750, loss[loss=0.1524, simple_loss=0.2322, pruned_loss=0.0363, over 24483.00 frames. ], tot_loss[loss=0.1558, simple_loss=0.2353, pruned_loss=0.0382, over 4707662.81 frames. ], batch size: 63, lr: 2.49e-03, grad_scale: 16.0 2023-10-03 22:27:32,640 WARNING [train.py:1204] (2/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_582-191016-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 22:27:34,021 WARNING [train.py:1204] (2/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_270-210032-0_sp1.1 from training. Duration: 0.963625 2023-10-03 22:27:34,054 WARNING [train.py:1204] (2/4) Exclude cut with ID _6789_210_1534_1_1527857937218_3118950_262-48853-0_sp0.9 from training. Duration: 0.62225 2023-10-03 22:27:35,426 WARNING [train.py:1204] (2/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_847-41334-0 from training. Duration: 0.44 2023-10-03 22:27:35,455 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_40896_210_15710_1_1533433956_7100802_573-46532-0_sp1.1 from training. Duration: 0.92725 2023-10-03 22:27:37,056 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.feed_forward1.out_proj.dropout_p, batch_count=1428240.0, ans=0.1 2023-10-03 22:27:38,829 WARNING [train.py:1204] (2/4) Exclude cut with ID _21881_210_3525_1_1531825242757_3798380_247-102591-0_sp1.1 from training. Duration: 0.8909375 2023-10-03 22:27:38,850 WARNING [train.py:1204] (2/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_200-21395-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 22:27:44,223 WARNING [train.py:1204] (2/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_711-15688-0 from training. Duration: 0.43 2023-10-03 22:27:45,627 WARNING [train.py:1204] (2/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_53-75025-0_sp1.1 from training. Duration: 0.963625 2023-10-03 22:27:50,174 WARNING [train.py:1204] (2/4) Exclude cut with ID _6194_210_1534_1_1527511826160_3941194_321-148641-0 from training. Duration: 0.64 2023-10-03 22:27:50,198 WARNING [train.py:1204] (2/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_337-294431-0_sp1.1 from training. Duration: 0.936375 2023-10-03 22:27:50,301 WARNING [train.py:1204] (2/4) Exclude cut with ID _21632_210_2253_1_1531994001765_3744140_110-274654-0_sp1.1 from training. Duration: 0.7454375 2023-10-03 22:27:53,376 WARNING [train.py:1204] (2/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_135-174080-0_sp1.1 from training. Duration: 0.4818125 2023-10-03 22:27:54,877 WARNING [train.py:1204] (2/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_21-174514-0 from training. Duration: 0.43 2023-10-03 22:27:57,779 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_38965_210_4852_1_1533174906_3484735_215-294589-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 22:27:57,804 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_548-294316-0 from training. Duration: 0.81 2023-10-03 22:28:04,822 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_103-84010-0_sp1.1 from training. Duration: 0.62725 2023-10-03 22:28:07,719 WARNING [train.py:1204] (2/4) Exclude cut with ID _18199_210_6109_1_1531475789864_4288790_295-198247-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 22:28:07,747 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_13757_210_3528_1_1530440682_4107239_103-313224-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 22:28:11,046 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_34886_210_10633_1_1533203882_1755661_67-294673-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 22:28:11,057 WARNING [train.py:1204] (2/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_350-101734-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 22:28:12,625 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.feed_forward1.out_proj.dropout_p, batch_count=1428373.3333333333, ans=0.1 2023-10-03 22:28:13,608 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.577e+02 1.948e+02 2.173e+02 2.566e+02 4.881e+02, threshold=4.347e+02, percent-clipped=1.0 2023-10-03 22:28:13,713 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_37357_210_17024_1_1533089504_2316121_204-153986-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 22:28:15,111 WARNING [train.py:1204] (2/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_673-115569-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 22:28:16,949 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_489-230703-0_sp0.9 from training. Duration: 0.9444375 2023-10-03 22:28:17,007 WARNING [train.py:1204] (2/4) Exclude cut with ID _5250_210_5219_1_1527249561806_8692110_575-102496-0_sp1.1 from training. Duration: 0.936375 2023-10-03 22:28:17,284 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.ff3_skip_rate, batch_count=1428440.0, ans=0.0 2023-10-03 22:28:18,994 WARNING [train.py:1204] (2/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_669-93163-0 from training. Duration: 0.98 2023-10-03 22:28:20,531 WARNING [train.py:1204] (2/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_249-281349-0_sp0.9 from training. Duration: 0.9444375 2023-10-03 22:28:23,656 WARNING [train.py:1204] (2/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_106-30782-0 from training. Duration: 0.8 2023-10-03 22:28:23,717 WARNING [train.py:1204] (2/4) Exclude cut with ID _8772_210_1850_1_1528635595972_3664011_398-320049-0_sp1.1 from training. Duration: 0.8 2023-10-03 22:28:25,057 WARNING [train.py:1204] (2/4) Exclude cut with ID _44778_210_2054_1_1533798127203_7443090_516-151727-0_sp1.1 from training. Duration: 0.963625 2023-10-03 22:28:26,384 WARNING [train.py:1204] (2/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_325-62802-0_sp1.1 from training. Duration: 0.7909375 2023-10-03 22:28:28,197 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.nonlin_attention.balancer.prob, batch_count=1428440.0, ans=0.125 2023-10-03 22:28:29,305 WARNING [train.py:1204] (2/4) Exclude cut with ID _14368_210_4220_1_1530532714870_3595529_93-58002-0_sp0.9 from training. Duration: 0.6333125 2023-10-03 22:28:30,606 WARNING [train.py:1204] (2/4) Exclude cut with ID _4681_210_3525_1_1527251040321_1235713_123-24428-0_sp1.1 from training. Duration: 0.436375 2023-10-03 22:28:30,665 WARNING [train.py:1204] (2/4) Exclude cut with ID _5250_210_5219_1_1527249561806_8692110_1185-56473-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 22:28:32,098 WARNING [train.py:1204] (2/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_96-244299-0_sp1.1 from training. Duration: 0.8 2023-10-03 22:28:35,023 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.conv_module1.balancer1.prob, batch_count=1428506.6666666667, ans=0.125 2023-10-03 22:28:36,195 WARNING [train.py:1204] (2/4) Exclude cut with ID _59500_210_8254_1_1534809610680_3736009_104-72815-0_sp1.1 from training. Duration: 0.963625 2023-10-03 22:28:37,642 WARNING [train.py:1204] (2/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_468-283113-0_sp1.1 from training. Duration: 0.87275 2023-10-03 22:28:40,798 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6239_210_4211_1_1527765584_4306698_528-102186-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 22:28:40,866 WARNING [train.py:1204] (2/4) Exclude cut with ID _45297_210_2721_1_1533715351381_3264949_351-147136-0 from training. Duration: 0.87 2023-10-03 22:28:40,871 WARNING [train.py:1204] (2/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_520-51744-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 22:28:42,357 WARNING [train.py:1204] (2/4) Exclude cut with ID _5166_210_2084_1_1527073197160_4087993_310-114452-0_sp1.1 from training. Duration: 0.536375 2023-10-03 22:28:42,361 WARNING [train.py:1204] (2/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_450-165710-0_sp1.1 from training. Duration: 0.92725 2023-10-03 22:28:42,372 WARNING [train.py:1204] (2/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_280-120942-0_sp1.1 from training. Duration: 0.7 2023-10-03 22:28:42,398 WARNING [train.py:1204] (2/4) Exclude cut with ID _65585_210_12210_1_1535450451584_3764098_112-294301-0_sp0.9 from training. Duration: 0.97775 2023-10-03 22:28:43,714 WARNING [train.py:1204] (2/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_98-51676-0_sp0.9 from training. Duration: 0.811125 2023-10-03 22:28:46,870 INFO [train.py:1046] (2/4) Epoch 41, batch 1800, loss[loss=0.158, simple_loss=0.247, pruned_loss=0.03454, over 24552.00 frames. ], tot_loss[loss=0.1558, simple_loss=0.2352, pruned_loss=0.03821, over 4698803.72 frames. ], batch size: 71, lr: 2.48e-03, grad_scale: 8.0 2023-10-03 22:28:46,952 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_33348_210_5632_1_1533086912_3324030_85-28583-0_sp1.1 from training. Duration: 0.7454375 2023-10-03 22:28:48,255 WARNING [train.py:1204] (2/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_336-208232-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 22:28:51,435 WARNING [train.py:1204] (2/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_339-104791-0_sp0.9 from training. Duration: 0.5444375 2023-10-03 22:28:54,810 WARNING [train.py:1204] (2/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_559-29840-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 22:28:56,454 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.feed_forward1.out_proj.dropout_p, batch_count=1428573.3333333333, ans=0.1 2023-10-03 22:28:57,576 WARNING [train.py:1204] (2/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_117-290916-0_sp0.9 from training. Duration: 0.5666875 2023-10-03 22:28:57,660 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_43802_210_2290_1_1533516203_8829504_185-112289-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 22:29:00,599 WARNING [train.py:1204] (2/4) Exclude cut with ID _59256_210_5986_1_1535076085997_3589120_155-298797-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 22:29:02,217 WARNING [train.py:1204] (2/4) Exclude cut with ID _44659_210_2721_1_1534036120061_6614860_246-206531-0_sp1.1 from training. Duration: 0.92725 2023-10-03 22:29:03,535 WARNING [train.py:1204] (2/4) Exclude cut with ID _23818_210_6228_1_1531917063884_3676920_469-151944-0_sp1.1 from training. Duration: 0.92725 2023-10-03 22:29:04,853 WARNING [train.py:1204] (2/4) Exclude cut with ID _13252_210_1925_1_1530671372047_3336909_88-276668-0_sp1.1 from training. Duration: 0.9 2023-10-03 22:29:07,602 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_15101_210_7927_1_1530786415_5033274_320-327527-0_sp1.1 from training. Duration: 0.87275 2023-10-03 22:29:07,618 WARNING [train.py:1204] (2/4) Exclude cut with ID _14998_210_5219_1_1530705582080_7474970_446-112117-0 from training. Duration: 0.9 2023-10-03 22:29:08,965 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_30280_210_1881_1_1532872779_3660139_400-254159-0_sp1.1 from training. Duration: 0.936375 2023-10-03 22:29:12,444 WARNING [train.py:1204] (2/4) Exclude cut with ID _40284_210_2084_1_1533513679814_3704279_158-345341-0_sp1.1 from training. Duration: 0.936375 2023-10-03 22:29:16,573 WARNING [train.py:1204] (2/4) Exclude cut with ID 3033-130750-0096-55598-0 from training. Duration: 0.83 2023-10-03 22:29:19,367 WARNING [train.py:1204] (2/4) Exclude cut with ID _9389_210_1664_1_1529217337301_5289269_379-128794-0 from training. Duration: 0.85 2023-10-03 22:29:19,394 WARNING [train.py:1204] (2/4) Exclude cut with ID _3675_210_4557_1_1526639934408_4182089_456-18305-0 from training. Duration: 0.76 2023-10-03 22:29:19,421 WARNING [train.py:1204] (2/4) Exclude cut with ID _29853_210_7497_1_1532505600851_6642110_571-249460-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 22:29:19,508 WARNING [train.py:1204] (2/4) Exclude cut with ID _4068_210_2629_1_1526697350395_1546259_230-325568-0_sp1.1 from training. Duration: 0.92725 2023-10-03 22:29:19,509 WARNING [train.py:1204] (2/4) Exclude cut with ID _34415_210_8750_1_1533085213375_3451116_213-267570-0_sp1.1 from training. Duration: 0.963625 2023-10-03 22:29:21,344 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6767_210_3273_1_1527855783_1718932_142-127132-0_sp1.1 from training. Duration: 0.863625 2023-10-03 22:29:30,340 WARNING [train.py:1204] (2/4) Exclude cut with ID IC0653W0001-322925 from training. Duration: 0.896 2023-10-03 22:29:30,441 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_4528_210_3273_1_1527076654_3276777_209-219804-0_sp1.1 from training. Duration: 0.62725 2023-10-03 22:29:33,134 WARNING [train.py:1204] (2/4) Exclude cut with ID _55844_210_2346_1_1534471212569_7625707_187-204971-0_sp1.1 from training. Duration: 0.936375 2023-10-03 22:29:33,478 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.ff3_skip_rate, batch_count=1428773.3333333333, ans=0.0 2023-10-03 22:29:34,646 WARNING [train.py:1204] (2/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_369-325184-0 from training. Duration: 0.99 2023-10-03 22:29:34,682 WARNING [train.py:1204] (2/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_791-45149-0 from training. Duration: 0.86 2023-10-03 22:29:36,033 WARNING [train.py:1204] (2/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_317-211107-0_sp1.1 from training. Duration: 0.836375 2023-10-03 22:29:37,405 WARNING [train.py:1204] (2/4) Exclude cut with ID _30800_210_22382_1_1532499347904_4515959_488-330913-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 22:29:38,725 WARNING [train.py:1204] (2/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_506-42614-0_sp1.1 from training. Duration: 0.7181875 2023-10-03 22:29:44,720 WARNING [train.py:1204] (2/4) Exclude cut with ID _8696_210_1876_1_1528545632440_3345058_209-54069-0 from training. Duration: 0.67 2023-10-03 22:29:49,026 WARNING [train.py:1204] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_215-202973-0_sp1.1 from training. Duration: 0.8 2023-10-03 22:29:49,102 WARNING [train.py:1204] (2/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_321-215742-0 from training. Duration: 0.5 2023-10-03 22:29:50,390 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_40896_210_15710_1_1533433956_7100802_428-32114-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 22:29:50,398 WARNING [train.py:1204] (2/4) Exclude cut with ID _27013_210_1389_1_1532437173240_3252649_467-263151-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 22:29:50,455 WARNING [train.py:1204] (2/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_734-129761-0_sp0.9 from training. Duration: 0.911125 2023-10-03 22:29:51,739 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_521-214276-0 from training. Duration: 0.44 2023-10-03 22:29:54,912 WARNING [train.py:1204] (2/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_80-147301-0_sp1.1 from training. Duration: 0.763625 2023-10-03 22:29:54,914 WARNING [train.py:1204] (2/4) Exclude cut with ID _9679_210_7497_1_1529129212906_4363929_56-307796-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 22:29:57,626 WARNING [train.py:1204] (2/4) Exclude cut with ID _14832_210_8750_1_1530691351539_3171348_1-54315-0 from training. Duration: 0.87 2023-10-03 22:29:57,626 WARNING [train.py:1204] (2/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_558-155750-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 22:30:00,971 INFO [train.py:1046] (2/4) Epoch 41, batch 1850, loss[loss=0.1541, simple_loss=0.2428, pruned_loss=0.03273, over 23603.00 frames. ], tot_loss[loss=0.1562, simple_loss=0.2359, pruned_loss=0.03826, over 4706096.11 frames. ], batch size: 85, lr: 2.48e-03, grad_scale: 8.0 2023-10-03 22:30:01,020 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_142-309370-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 22:30:01,053 WARNING [train.py:1204] (2/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_18-46756-0_sp0.9 from training. Duration: 0.87775 2023-10-03 22:30:01,060 WARNING [train.py:1204] (2/4) Exclude cut with ID _61148_210_6897_1_1535094022726_4217610_279-212159-0_sp1.1 from training. Duration: 0.936375 2023-10-03 22:30:02,465 WARNING [train.py:1204] (2/4) Exclude cut with ID _21987_210_5854_1_1531821500482_3637038_164-2641-0_sp1.1 from training. Duration: 0.936375 2023-10-03 22:30:03,799 WARNING [train.py:1204] (2/4) Exclude cut with ID _8064_210_5399_1_1528372299291_4282973_395-340852-0_sp1.1 from training. Duration: 0.8454375 2023-10-03 22:30:05,213 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1077-184261-0_sp1.1 from training. Duration: 0.963625 2023-10-03 22:30:05,221 WARNING [train.py:1204] (2/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_1176-204992-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 22:30:07,977 WARNING [train.py:1204] (2/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_563-347832-0_sp1.1 from training. Duration: 0.8090625 2023-10-03 22:30:08,038 WARNING [train.py:1204] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_380-204756-0_sp0.9 from training. Duration: 0.9555625 2023-10-03 22:30:15,397 WARNING [train.py:1204] (2/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_212-10501-0_sp1.1 from training. Duration: 0.8545625 2023-10-03 22:30:15,428 WARNING [train.py:1204] (2/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_445-117398-0 from training. Duration: 0.89 2023-10-03 22:30:18,198 WARNING [train.py:1204] (2/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_29-280622-0 from training. Duration: 0.82 2023-10-03 22:30:18,428 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.feed_forward1.out_proj.dropout_p, batch_count=1428973.3333333333, ans=0.1 2023-10-03 22:30:20,925 WARNING [train.py:1204] (2/4) Exclude cut with ID _26664_210_7770_1_1532258943280_3418270_187-80815-0 from training. Duration: 0.93 2023-10-03 22:30:25,020 WARNING [train.py:1204] (2/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_414-349640-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 22:30:26,909 WARNING [train.py:1204] (2/4) Exclude cut with ID _45021_210_9994_1_1533619751094_3330288_96-239010-0 from training. Duration: 0.91 2023-10-03 22:30:26,919 WARNING [train.py:1204] (2/4) Exclude cut with ID _5189_210_4557_1_1527248319428_3769229_199-53526-0_sp0.9 from training. Duration: 0.4666875 2023-10-03 22:30:29,732 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.2.encoder.layers.2.self_attn1.whiten, num_groups=1, num_channels=384, metric=19.40 vs. limit=22.5 2023-10-03 22:30:36,412 WARNING [train.py:1204] (2/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_63-101996-0_sp1.1 from training. Duration: 0.8 2023-10-03 22:30:39,052 WARNING [train.py:1204] (2/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_199-86432-0 from training. Duration: 0.98 2023-10-03 22:30:40,570 WARNING [train.py:1204] (2/4) Exclude cut with ID _36399_210_16049_1_1532998771377_3698333_207-259013-0_sp1.1 from training. Duration: 0.92725 2023-10-03 22:30:40,608 WARNING [train.py:1204] (2/4) Exclude cut with ID _62574_210_2655_1_1535180407058_3933249_567-111756-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 22:30:41,802 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.623e+02 1.909e+02 2.082e+02 2.293e+02 3.628e+02, threshold=4.164e+02, percent-clipped=0.0 2023-10-03 22:30:43,362 WARNING [train.py:1204] (2/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_436-318113-0 from training. Duration: 0.75 2023-10-03 22:30:44,794 WARNING [train.py:1204] (2/4) Exclude cut with ID _53379_210_20034_1_1534222027363_3854654_43-305214-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 22:30:44,811 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_15126_210_6753_1_1530778677_2591185_113-157651-0_sp1.1 from training. Duration: 0.6181875 2023-10-03 22:30:46,710 WARNING [train.py:1204] (2/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_146-218763-0_sp1.1 from training. Duration: 0.7818125 2023-10-03 22:30:48,134 WARNING [train.py:1204] (2/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_714-310538-0_sp0.9 from training. Duration: 0.9555625 2023-10-03 22:30:50,771 WARNING [train.py:1204] (2/4) Exclude cut with ID _24271_210_12658_1_1531987270022_7190519_709-16721-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 22:30:53,449 WARNING [train.py:1204] (2/4) Exclude cut with ID _5190_210_4557_1_1527244368799_373439_28-197030-0_sp0.9 from training. Duration: 0.8 2023-10-03 22:30:53,478 WARNING [train.py:1204] (2/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_267-197536-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 22:30:53,522 WARNING [train.py:1204] (2/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_658-282610-0_sp1.1 from training. Duration: 0.4818125 2023-10-03 22:30:53,530 WARNING [train.py:1204] (2/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_257-67050-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 22:30:53,689 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.conv_skip_rate, batch_count=1429106.6666666667, ans=0.0 2023-10-03 22:30:55,040 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=1429106.6666666667, ans=0.1 2023-10-03 22:30:56,178 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_14906_210_7927_1_1530754190_9408633_961-53402-0_sp1.1 from training. Duration: 0.936375 2023-10-03 22:30:58,222 WARNING [train.py:1204] (2/4) Exclude cut with ID _60113_210_16271_1_1535164212481_4386079_490-280290-0_sp1.1 from training. Duration: 0.8909375 2023-10-03 22:31:01,483 WARNING [train.py:1204] (2/4) Exclude cut with ID _14100_210_8341_1_1530529320613_3678069_258-224599-0 from training. Duration: 0.77 2023-10-03 22:31:02,830 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6961_210_2087_1_1527937168_6815011_565-326898-0_sp1.1 from training. Duration: 0.936375 2023-10-03 22:31:06,976 WARNING [train.py:1204] (2/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_239-316802-0_sp1.1 from training. Duration: 0.62725 2023-10-03 22:31:08,358 WARNING [train.py:1204] (2/4) Exclude cut with ID _55857_210_2346_1_1534644007831_6898209_745-98391-0_sp1.1 from training. Duration: 0.6909375 2023-10-03 22:31:08,362 WARNING [train.py:1204] (2/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_154-135696-0 from training. Duration: 0.8 2023-10-03 22:31:08,364 WARNING [train.py:1204] (2/4) Exclude cut with ID _14368_210_4220_1_1530532714870_3595529_93-58002-0 from training. Duration: 0.57 2023-10-03 22:31:09,749 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9025W0005-507335 from training. Duration: 0.896 2023-10-03 22:31:09,825 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9029W0034-509435 from training. Duration: 0.64 2023-10-03 22:31:11,243 WARNING [train.py:1204] (2/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_76-203867-0_sp1.1 from training. Duration: 0.6545625 2023-10-03 22:31:11,251 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_46639_210_8341_1_1533726088_4495500_66-346325-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 22:31:12,582 WARNING [train.py:1204] (2/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_145-137790-0_sp0.9 from training. Duration: 0.92225 2023-10-03 22:31:12,608 WARNING [train.py:1204] (2/4) Exclude cut with ID _36522_210_8887_1_1533177030611_1772478_7-327299-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 22:31:12,893 INFO [scaling.py:1118] (2/4) WithLoss: name=encoder.encoders.4.encoder.layers.1.self_attn_weights, loss-sum=0.000e+00 2023-10-03 22:31:13,843 INFO [train.py:1046] (2/4) Epoch 41, batch 1900, loss[loss=0.1611, simple_loss=0.2332, pruned_loss=0.04455, over 23861.00 frames. ], tot_loss[loss=0.1559, simple_loss=0.2354, pruned_loss=0.03817, over 4704398.50 frames. ], batch size: 195, lr: 2.48e-03, grad_scale: 8.0 2023-10-03 22:31:13,927 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9042W0009-515615 from training. Duration: 0.896 2023-10-03 22:31:13,935 WARNING [train.py:1204] (2/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_39-248870-0_sp0.9 from training. Duration: 0.7666875 2023-10-03 22:31:13,974 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_35441_210_16554_1_1533347572_4092680_362-242912-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 22:31:15,356 WARNING [train.py:1204] (2/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_216-343007-0_sp1.1 from training. Duration: 0.6 2023-10-03 22:31:16,671 WARNING [train.py:1204] (2/4) Exclude cut with ID _3867_210_1883_1_1526641200575_4475830_272-215563-0_sp0.9 from training. Duration: 0.6333125 2023-10-03 22:31:17,424 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.0.self_attn_weights.pos_emb_skip_rate, batch_count=1429240.0, ans=0.0 2023-10-03 22:31:18,393 WARNING [train.py:1204] (2/4) Exclude cut with ID _21783_210_11357_1_1531742458730_3762679_553-175603-0_sp1.1 from training. Duration: 0.92725 2023-10-03 22:31:18,404 WARNING [train.py:1204] (2/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_963-187109-0 from training. Duration: 0.99 2023-10-03 22:31:19,864 WARNING [train.py:1204] (2/4) Exclude cut with ID _44973_210_1511_1_1533794381163_3634770_495-211466-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 22:31:19,877 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9068W0002-529085 from training. Duration: 0.896 2023-10-03 22:31:19,898 WARNING [train.py:1204] (2/4) Exclude cut with ID _5294_210_1747_1_1527249643928_3696179_180-100713-0_sp1.1 from training. Duration: 0.736375 2023-10-03 22:31:20,048 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.ff2_skip_rate, batch_count=1429240.0, ans=0.0 2023-10-03 22:31:21,315 WARNING [train.py:1204] (2/4) Exclude cut with ID _35799_210_5854_1_1533263487887_3529071_149-49689-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 22:31:25,690 WARNING [train.py:1204] (2/4) Exclude cut with ID _67725_210_27767_1_1535608756671_5105359_36-220794-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 22:31:28,397 WARNING [train.py:1204] (2/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_338-96520-0_sp1.1 from training. Duration: 0.8545625 2023-10-03 22:31:28,474 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9104W0004-547160 from training. Duration: 0.896 2023-10-03 22:31:30,288 WARNING [train.py:1204] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_330-327861-0 from training. Duration: 0.67 2023-10-03 22:31:30,574 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.feed_forward1.hidden_balancer.prob, batch_count=1429306.6666666667, ans=0.125 2023-10-03 22:31:32,242 WARNING [train.py:1204] (2/4) Exclude cut with ID _14087_210_2253_1_1530529224512_3014029_156-257607-0_sp0.9 from training. Duration: 0.92225 2023-10-03 22:31:32,315 WARNING [train.py:1204] (2/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_476-322050-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 22:31:32,324 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9118W0017-553385 from training. Duration: 0.896 2023-10-03 22:31:32,467 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.balancer2.prob, batch_count=1429306.6666666667, ans=0.125 2023-10-03 22:31:32,481 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.feed_forward3.hidden_balancer.prob, batch_count=1429306.6666666667, ans=0.125 2023-10-03 22:31:33,675 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9120W0028-554435 from training. Duration: 0.896 2023-10-03 22:31:36,639 WARNING [train.py:1204] (2/4) Exclude cut with ID _56524_210_9642_1_1534572011779_3619170_145-310842-0 from training. Duration: 0.99 2023-10-03 22:31:39,350 WARNING [train.py:1204] (2/4) Exclude cut with ID _51631_210_8254_1_1534129228420_4084794_270-222508-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 22:31:42,246 WARNING [train.py:1204] (2/4) Exclude cut with ID _14268_210_9206_1_1530622771285_4714099_331-105351-0 from training. Duration: 0.65 2023-10-03 22:31:42,541 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.attention_skip_rate, batch_count=1429373.3333333333, ans=0.0 2023-10-03 22:31:43,605 WARNING [train.py:1204] (2/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_40-129756-0 from training. Duration: 0.81 2023-10-03 22:31:51,232 WARNING [train.py:1204] (2/4) Exclude cut with ID _3541_210_2477_1_1526475408709_3263973_231-104736-0 from training. Duration: 0.78 2023-10-03 22:31:55,344 WARNING [train.py:1204] (2/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_214-291369-0 from training. Duration: 0.63 2023-10-03 22:31:55,357 WARNING [train.py:1204] (2/4) Exclude cut with ID _62574_210_2655_1_1535180407058_3933249_369-84347-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 22:31:56,644 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9210W0111-599240 from training. Duration: 0.896 2023-10-03 22:31:56,648 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_92-246270-0 from training. Duration: 0.85 2023-10-03 22:31:56,686 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_37152_210_9697_1_1533106573_3844188_29-239070-0 from training. Duration: 0.98 2023-10-03 22:31:57,981 WARNING [train.py:1204] (2/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_239-22076-0 from training. Duration: 0.85 2023-10-03 22:31:57,984 WARNING [train.py:1204] (2/4) Exclude cut with ID _65962_210_29927_1_1535436026519_4174930_368-44639-0_sp1.1 from training. Duration: 0.97275 2023-10-03 22:32:01,892 WARNING [train.py:1204] (2/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_152-212533-0 from training. Duration: 0.97 2023-10-03 22:32:04,703 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=1429440.0, ans=0.1 2023-10-03 22:32:05,998 WARNING [train.py:1204] (2/4) Exclude cut with ID _14603_210_4220_1_1530615688441_3097319_90-31383-0_sp1.1 from training. Duration: 0.7909375 2023-10-03 22:32:07,557 WARNING [train.py:1204] (2/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_196-152868-0_sp1.1 from training. Duration: 0.92725 2023-10-03 22:32:08,894 WARNING [train.py:1204] (2/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_140-181176-0 from training. Duration: 0.92 2023-10-03 22:32:10,275 WARNING [train.py:1204] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_168-32906-0_sp0.9 from training. Duration: 0.7666875 2023-10-03 22:32:13,021 WARNING [train.py:1204] (2/4) Exclude cut with ID _14922_210_6228_1_1530707435273_3693310_13-165512-0 from training. Duration: 0.82 2023-10-03 22:32:13,051 WARNING [train.py:1204] (2/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_381-317791-0_sp0.9 from training. Duration: 0.811125 2023-10-03 22:32:19,010 WARNING [train.py:1204] (2/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_63-267585-0_sp1.1 from training. Duration: 0.8181875 2023-10-03 22:32:19,012 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_326-303945-0_sp1.1 from training. Duration: 0.9 2023-10-03 22:32:20,320 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_194-143381-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 22:32:20,396 WARNING [train.py:1204] (2/4) Exclude cut with ID _39290_210_4220_1_1533862778760_3075118_194-16667-0_sp1.1 from training. Duration: 0.963625 2023-10-03 22:32:21,770 WARNING [train.py:1204] (2/4) Exclude cut with ID _15012_210_1968_1_1530856820429_5095350_662-210469-0_sp0.9 from training. Duration: 0.6333125 2023-10-03 22:32:21,800 WARNING [train.py:1204] (2/4) Exclude cut with ID _4697_210_2069_1_1526815801953_3823098_68-219807-0_sp1.1 from training. Duration: 0.42725 2023-10-03 22:32:23,156 WARNING [train.py:1204] (2/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_321-22821-0_sp0.9 from training. Duration: 0.988875 2023-10-03 22:32:27,042 INFO [train.py:1046] (2/4) Epoch 41, batch 1950, loss[loss=0.1765, simple_loss=0.2624, pruned_loss=0.04533, over 24017.00 frames. ], tot_loss[loss=0.1573, simple_loss=0.2369, pruned_loss=0.0388, over 4702013.87 frames. ], batch size: 80, lr: 2.48e-03, grad_scale: 8.0 2023-10-03 22:32:27,103 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_14056_210_4163_1_1530604483_7572827_1037-45317-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 22:32:27,104 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_403-39619-0_sp1.1 from training. Duration: 0.763625 2023-10-03 22:32:28,498 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_20120_210_6753_1_1531643035_5482859_510-96996-0_sp0.9 from training. Duration: 0.9666875 2023-10-03 22:32:28,500 WARNING [train.py:1204] (2/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_225-180155-0_sp1.1 from training. Duration: 0.92725 2023-10-03 22:32:28,535 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_278-272612-0_sp0.9 from training. Duration: 0.811125 2023-10-03 22:32:30,444 WARNING [train.py:1204] (2/4) Exclude cut with ID _53713_210_24116_1_1534314487762_7416751_691-70244-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 22:32:35,458 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6788_210_1388_1_1527898698_4562021_91-197981-0_sp1.1 from training. Duration: 0.7545625 2023-10-03 22:32:36,872 WARNING [train.py:1204] (2/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_60-18123-0_sp0.9 from training. Duration: 0.97775 2023-10-03 22:32:38,169 WARNING [train.py:1204] (2/4) Exclude cut with ID _35799_210_5854_1_1533263487887_3529071_5-214617-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 22:32:38,173 WARNING [train.py:1204] (2/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_890-5117-0_sp1.1 from training. Duration: 0.7090625 2023-10-03 22:32:39,593 WARNING [train.py:1204] (2/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_660-211088-0 from training. Duration: 0.62 2023-10-03 22:32:40,888 WARNING [train.py:1204] (2/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_314-126819-0_sp0.9 from training. Duration: 0.5555625 2023-10-03 22:32:40,918 WARNING [train.py:1204] (2/4) Exclude cut with ID _21632_210_2253_1_1531994001765_3744140_362-66225-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 22:32:42,259 WARNING [train.py:1204] (2/4) Exclude cut with ID _61118_210_9527_1_1534897668884_3752630_173-258555-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 22:32:45,070 WARNING [train.py:1204] (2/4) Exclude cut with ID _7719_210_3525_1_1528456153163_4296960_88-250259-0_sp0.9 from training. Duration: 0.9333125 2023-10-03 22:32:45,102 WARNING [train.py:1204] (2/4) Exclude cut with ID _15140_210_9288_1_1530791388450_4880149_539-153708-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 22:32:45,130 WARNING [train.py:1204] (2/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_253-42544-0_sp1.1 from training. Duration: 0.97275 2023-10-03 22:32:45,333 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.attention_skip_rate, batch_count=1429640.0, ans=0.0 2023-10-03 22:32:45,374 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.bypass.scale_min, batch_count=1429640.0, ans=0.2 2023-10-03 22:32:46,593 WARNING [train.py:1204] (2/4) Exclude cut with ID _33640_210_2655_1_1533117605479_4855469_479-296877-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 22:32:51,119 WARNING [train.py:1204] (2/4) Exclude cut with ID 3033-130750-0096-55598-0_sp1.1 from training. Duration: 0.7545625 2023-10-03 22:32:51,130 WARNING [train.py:1204] (2/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_191-41023-0_sp1.1 from training. Duration: 0.6454375 2023-10-03 22:32:51,146 WARNING [train.py:1204] (2/4) Exclude cut with ID _8365_210_2064_1_1528592311070_4677690_431-294102-0_sp1.1 from training. Duration: 0.8090625 2023-10-03 22:32:51,183 WARNING [train.py:1204] (2/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_893-296030-0_sp1.1 from training. Duration: 0.97275 2023-10-03 22:32:55,342 WARNING [train.py:1204] (2/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_401-343820-0_sp1.1 from training. Duration: 0.97275 2023-10-03 22:32:59,503 WARNING [train.py:1204] (2/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_127-566-0_sp1.1 from training. Duration: 0.763625 2023-10-03 22:32:59,505 WARNING [train.py:1204] (2/4) Exclude cut with ID _14100_210_8341_1_1530529320613_3678069_126-272847-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 22:32:59,521 WARNING [train.py:1204] (2/4) Exclude cut with ID _15133_210_6228_1_1530794512074_3080379_29-264458-0_sp1.1 from training. Duration: 0.52725 2023-10-03 22:32:59,521 WARNING [train.py:1204] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_127-119109-0 from training. Duration: 0.72 2023-10-03 22:32:59,591 WARNING [train.py:1204] (2/4) Exclude cut with ID _6537_210_1883_1_1527987559710_3597129_382-133062-0_sp1.1 from training. Duration: 0.6181875 2023-10-03 22:33:01,418 WARNING [train.py:1204] (2/4) Exclude cut with ID _29529_210_7234_1_1532393961455_3977460_391-219682-0_sp1.1 from training. Duration: 0.936375 2023-10-03 22:33:01,463 WARNING [train.py:1204] (2/4) Exclude cut with ID _37537_210_2253_1_1533171758568_3421949_96-137189-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 22:33:01,807 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.feed_forward2.hidden_balancer.prob, batch_count=1429706.6666666667, ans=0.125 2023-10-03 22:33:04,630 WARNING [train.py:1204] (2/4) Exclude cut with ID _58414_210_8254_1_1535421609162_7151182_15-276301-0_sp1.1 from training. Duration: 0.97275 2023-10-03 22:33:07,947 WARNING [train.py:1204] (2/4) Exclude cut with ID _59502_210_12210_1_1535107024698_1835139_110-289074-0_sp1.1 from training. Duration: 0.92725 2023-10-03 22:33:09,228 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.661e+02 2.015e+02 2.236e+02 2.546e+02 4.290e+02, threshold=4.473e+02, percent-clipped=1.0 2023-10-03 22:33:09,566 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=1429706.6666666667, ans=0.1 2023-10-03 22:33:12,068 WARNING [train.py:1204] (2/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_367-61330-0_sp1.1 from training. Duration: 0.6818125 2023-10-03 22:33:13,625 WARNING [train.py:1204] (2/4) Exclude cut with ID _45297_210_2721_1_1533715351381_3264949_353-9541-0_sp1.1 from training. Duration: 0.8818125 2023-10-03 22:33:13,654 WARNING [train.py:1204] (2/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_85-138257-0_sp0.9 from training. Duration: 0.788875 2023-10-03 22:33:13,686 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_3787_210_2040_1_1526477544_5478586_603-120324-0 from training. Duration: 0.81 2023-10-03 22:33:14,979 WARNING [train.py:1204] (2/4) Exclude cut with ID _42863_210_9070_1_1533902398788_3696920_411-204131-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 22:33:17,771 WARNING [train.py:1204] (2/4) Exclude cut with ID _30196_210_1664_1_1533708496522_7074140_822-289909-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 22:33:17,845 WARNING [train.py:1204] (2/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_32-335103-0_sp0.9 from training. Duration: 0.911125 2023-10-03 22:33:19,207 WARNING [train.py:1204] (2/4) Exclude cut with ID _6493_210_1747_1_1528459210424_4012470_513-140967-0_sp0.9 from training. Duration: 0.87775 2023-10-03 22:33:26,456 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_47896_210_20508_1_1534492518_5958868_439-257766-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 22:33:27,858 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_1294-63152-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 22:33:31,255 WARNING [train.py:1204] (2/4) Exclude cut with ID _59170_210_6721_1_1534983633960_4203335_319-166512-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 22:33:32,721 WARNING [train.py:1204] (2/4) Exclude cut with ID _3541_210_2477_1_1526475408709_3263973_146-3470-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 22:33:34,703 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.self_attn_weights.pos_emb_skip_rate, batch_count=1429840.0, ans=0.0 2023-10-03 22:33:35,861 WARNING [train.py:1204] (2/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_678-159307-0_sp1.1 from training. Duration: 0.77275 2023-10-03 22:33:35,908 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_343-48822-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 22:33:37,297 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_4692_210_3528_1_1526899755_4323254_107-44401-0 from training. Duration: 0.86 2023-10-03 22:33:37,302 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6291_210_3273_1_1527683649_1078498_74-292608-0_sp1.1 from training. Duration: 0.6545625 2023-10-03 22:33:39,227 WARNING [train.py:1204] (2/4) Exclude cut with ID _55644_210_11856_1_1534593599582_4103719_293-248694-0_sp1.1 from training. Duration: 0.97275 2023-10-03 22:33:39,333 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_3172_210_2087_1_1526292929_3987865_197-97762-0 from training. Duration: 0.75 2023-10-03 22:33:41,900 INFO [train.py:1046] (2/4) Epoch 41, batch 2000, loss[loss=0.1582, simple_loss=0.2363, pruned_loss=0.04002, over 23318.00 frames. ], tot_loss[loss=0.158, simple_loss=0.2377, pruned_loss=0.03912, over 4699725.05 frames. ], batch size: 106, lr: 2.48e-03, grad_scale: 16.0 2023-10-03 22:33:42,040 WARNING [train.py:1204] (2/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_264-348896-0_sp0.9 from training. Duration: 0.9444375 2023-10-03 22:33:42,297 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.ff3_skip_rate, batch_count=1429906.6666666667, ans=0.0 2023-10-03 22:33:44,877 WARNING [train.py:1204] (2/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_209-205365-0_sp0.9 from training. Duration: 0.87775 2023-10-03 22:33:46,192 WARNING [train.py:1204] (2/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_222-198127-0_sp0.9 from training. Duration: 0.9333125 2023-10-03 22:33:46,238 WARNING [train.py:1204] (2/4) Exclude cut with ID _57572_210_5517_1_1534921219624_3809484_52-43530-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 22:33:47,630 WARNING [train.py:1204] (2/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_96-320736-0_sp1.1 from training. Duration: 0.7818125 2023-10-03 22:33:49,460 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_13091_210_7925_1_1530609447_8987354_1159-225508-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 22:33:53,540 WARNING [train.py:1204] (2/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_237-44332-0 from training. Duration: 0.94 2023-10-03 22:33:53,579 WARNING [train.py:1204] (2/4) Exclude cut with ID _7970_210_1883_1_1528455616296_3472230_25-178407-0_sp0.9 from training. Duration: 0.788875 2023-10-03 22:33:56,362 WARNING [train.py:1204] (2/4) Exclude cut with ID _29003_210_19596_1_1532507391720_3894098_357-257062-0_sp1.1 from training. Duration: 0.936375 2023-10-03 22:33:57,799 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_809-17156-0 from training. Duration: 0.73 2023-10-03 22:33:59,569 WARNING [train.py:1204] (2/4) Exclude cut with ID _7160_210_1876_1_1527940923231_3703799_302-150728-0_sp1.1 from training. Duration: 0.7090625 2023-10-03 22:34:00,942 WARNING [train.py:1204] (2/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_444-28041-0_sp0.9 from training. Duration: 0.9444375 2023-10-03 22:34:01,249 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.feed_forward3.hidden_balancer.prob, batch_count=1429973.3333333333, ans=0.125 2023-10-03 22:34:02,398 WARNING [train.py:1204] (2/4) Exclude cut with ID _52892_210_6721_1_1534378941259_4124129_278-251666-0_sp1.1 from training. Duration: 0.92725 2023-10-03 22:34:04,370 WARNING [train.py:1204] (2/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_115-326778-0 from training. Duration: 0.93 2023-10-03 22:34:05,814 WARNING [train.py:1204] (2/4) Exclude cut with ID _41391_210_7994_1_1533603771872_3638201_31-188961-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 22:34:07,199 WARNING [train.py:1204] (2/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_1073-6264-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 22:34:07,217 WARNING [train.py:1204] (2/4) Exclude cut with ID _66868_210_9527_1_1535509287954_4561140_435-259515-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 22:34:08,541 WARNING [train.py:1204] (2/4) Exclude cut with ID _14886_210_4220_1_1530702020355_3516190_159-73748-0 from training. Duration: 0.83 2023-10-03 22:34:08,571 WARNING [train.py:1204] (2/4) Exclude cut with ID _15150_210_8341_1_1530752411803_7256384_469-239509-0_sp1.1 from training. Duration: 0.6181875 2023-10-03 22:34:10,398 WARNING [train.py:1204] (2/4) Exclude cut with ID _8064_210_5399_1_1528372299291_4282973_395-340852-0 from training. Duration: 0.93 2023-10-03 22:34:10,401 WARNING [train.py:1204] (2/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_138-187031-0_sp1.1 from training. Duration: 0.9 2023-10-03 22:34:13,162 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_13757_210_3528_1_1530440682_4107239_48-106347-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 22:34:14,543 WARNING [train.py:1204] (2/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_164-224052-0_sp1.1 from training. Duration: 0.636375 2023-10-03 22:34:14,554 WARNING [train.py:1204] (2/4) Exclude cut with ID _59288_210_5854_1_1535077757774_3412246_408-322648-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 22:34:14,590 WARNING [train.py:1204] (2/4) Exclude cut with ID _30191_210_1664_1_1533189915047_7196670_827-36359-0_sp1.1 from training. Duration: 0.963625 2023-10-03 22:34:15,909 WARNING [train.py:1204] (2/4) Exclude cut with ID _4680_210_3525_1_1527249757953_893386_75-78979-0_sp1.1 from training. Duration: 0.82725 2023-10-03 22:34:17,130 WARNING [train.py:1204] (2/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_383-165234-0 from training. Duration: 0.94 2023-10-03 22:34:18,984 WARNING [train.py:1204] (2/4) Exclude cut with ID _19896_210_1883_1_1531709022662_6649569_94-229927-0 from training. Duration: 0.97 2023-10-03 22:34:18,990 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6788_210_1388_1_1527898698_4562021_180-75288-0_sp1.1 from training. Duration: 0.9 2023-10-03 22:34:18,997 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_26433_210_12108_1_1532157158_4056480_54-321349-0_sp1.1 from training. Duration: 0.97275 2023-10-03 22:34:23,423 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.conv_module1.balancer1.prob, batch_count=1430040.0, ans=0.125 2023-10-03 22:34:24,610 WARNING [train.py:1204] (2/4) Exclude cut with ID _56108_210_16380_1_1534505446159_3887959_265-161493-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 22:34:25,926 WARNING [train.py:1204] (2/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_395-111602-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 22:34:25,932 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_154-316450-0_sp0.9 from training. Duration: 0.8444375 2023-10-03 22:34:27,933 WARNING [train.py:1204] (2/4) Exclude cut with ID _43552_210_8913_1_1533726031059_3523429_381-116041-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 22:34:29,331 WARNING [train.py:1204] (2/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_65-104124-0_sp1.1 from training. Duration: 0.963625 2023-10-03 22:34:30,632 WARNING [train.py:1204] (2/4) Exclude cut with ID _6536_210_1883_1_1527850783339_3319069_169-23811-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 22:34:30,674 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_289-180904-0_sp0.9 from training. Duration: 0.8444375 2023-10-03 22:34:30,692 WARNING [train.py:1204] (2/4) Exclude cut with ID _56406_210_9288_1_1534899618664_3496149_226-242400-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 22:34:32,095 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_24899_210_16554_1_1532046422_3827462_276-216330-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 22:34:34,092 WARNING [train.py:1204] (2/4) Exclude cut with ID _29345_210_4381_1_1532689168263_3860169_198-174328-0_sp1.1 from training. Duration: 0.82725 2023-10-03 22:34:35,536 WARNING [train.py:1204] (2/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_338-96520-0 from training. Duration: 0.94 2023-10-03 22:34:35,660 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.0.bypass.skip_rate, batch_count=1430106.6666666667, ans=0.035 2023-10-03 22:34:39,572 WARNING [train.py:1204] (2/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_7-111954-0_sp1.1 from training. Duration: 0.5090625 2023-10-03 22:34:41,411 WARNING [train.py:1204] (2/4) Exclude cut with ID _36692_210_5665_1_1533608967420_398809_8-16261-0_sp1.1 from training. Duration: 0.97275 2023-10-03 22:34:45,510 WARNING [train.py:1204] (2/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_86-212657-0_sp1.1 from training. Duration: 0.97275 2023-10-03 22:34:45,517 WARNING [train.py:1204] (2/4) Exclude cut with ID _44659_210_2721_1_1534036120061_6614860_626-298884-0_sp1.1 from training. Duration: 0.8818125 2023-10-03 22:34:48,808 WARNING [train.py:1204] (2/4) Exclude cut with ID _49755_210_18053_1_1534581005846_3218709_120-257918-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 22:34:51,485 WARNING [train.py:1204] (2/4) Exclude cut with ID _21881_210_3525_1_1531825242757_3798380_400-113821-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 22:34:51,488 WARNING [train.py:1204] (2/4) Exclude cut with ID _21783_210_11357_1_1531742458730_3762679_387-236773-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 22:34:52,820 WARNING [train.py:1204] (2/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_613-232856-0_sp1.1 from training. Duration: 0.4909375 2023-10-03 22:34:52,856 WARNING [train.py:1204] (2/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_181-78632-0_sp0.9 from training. Duration: 0.8666875 2023-10-03 22:34:54,315 WARNING [train.py:1204] (2/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_175-17485-0_sp1.1 from training. Duration: 0.97275 2023-10-03 22:34:55,658 INFO [train.py:1046] (2/4) Epoch 41, batch 2050, loss[loss=0.1566, simple_loss=0.2518, pruned_loss=0.03073, over 24329.00 frames. ], tot_loss[loss=0.1574, simple_loss=0.2367, pruned_loss=0.039, over 4695231.32 frames. ], batch size: 74, lr: 2.48e-03, grad_scale: 8.0 2023-10-03 22:34:55,715 WARNING [train.py:1204] (2/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_1004-183422-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 22:34:57,237 WARNING [train.py:1204] (2/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_32-65962-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 22:34:58,547 WARNING [train.py:1204] (2/4) Exclude cut with ID _15458_210_8341_1_1530858614263_3834410_40-298664-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 22:35:02,554 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.0.layers.0.conv_module1.whiten, num_groups=1, num_channels=192, metric=12.44 vs. limit=15.0 2023-10-03 22:35:04,761 WARNING [train.py:1204] (2/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_380-261863-0_sp1.1 from training. Duration: 0.963625 2023-10-03 22:35:06,170 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_372-173012-0_sp1.1 from training. Duration: 0.863625 2023-10-03 22:35:07,531 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_57-82205-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 22:35:08,884 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_3691_210_3528_1_1526468169_4062053_355-334432-0_sp0.9 from training. Duration: 0.9666875 2023-10-03 22:35:09,008 WARNING [train.py:1204] (2/4) Exclude cut with ID _19023_210_4353_1_1531378892552_5820780_28-36615-0 from training. Duration: 0.97 2023-10-03 22:35:09,013 WARNING [train.py:1204] (2/4) Exclude cut with ID _29853_210_7497_1_1532505600851_6642110_286-251333-0_sp1.1 from training. Duration: 0.92725 2023-10-03 22:35:10,397 WARNING [train.py:1204] (2/4) Exclude cut with ID _31602_210_5854_1_1532677021456_4458250_518-80679-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 22:35:11,742 WARNING [train.py:1204] (2/4) Exclude cut with ID _14922_210_6228_1_1530707435273_3693310_38-187851-0_sp0.9 from training. Duration: 0.688875 2023-10-03 22:35:18,575 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.self_attn_weights.pos_emb_skip_rate, batch_count=1430306.6666666667, ans=0.0 2023-10-03 22:35:22,517 WARNING [train.py:1204] (2/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_39-248870-0_sp1.1 from training. Duration: 0.62725 2023-10-03 22:35:22,542 WARNING [train.py:1204] (2/4) Exclude cut with ID _16908_210_5724_1_1531119157248_3772700_154-31455-0_sp1.1 from training. Duration: 0.97275 2023-10-03 22:35:23,924 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8453_210_4211_1_1528502940_8562730_347-330673-0 from training. Duration: 0.72 2023-10-03 22:35:25,404 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=1430373.3333333333, ans=0.1 2023-10-03 22:35:26,538 WARNING [train.py:1204] (2/4) Exclude cut with ID _30191_210_1664_1_1533189915047_7196670_674-204247-0_sp1.1 from training. Duration: 0.97275 2023-10-03 22:35:28,401 WARNING [train.py:1204] (2/4) Exclude cut with ID _8833_210_5219_1_1528592665210_7553059_329-125156-0 from training. Duration: 0.82 2023-10-03 22:35:28,431 WARNING [train.py:1204] (2/4) Exclude cut with ID _4779_210_2084_1_1527318022944_3914893_500-285013-0_sp1.1 from training. Duration: 0.62725 2023-10-03 22:35:28,676 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.conv_module2.balancer2.prob, batch_count=1430373.3333333333, ans=0.125 2023-10-03 22:35:31,310 WARNING [train.py:1204] (2/4) Exclude cut with ID _14421_210_8750_1_1530604795825_3542290_190-313578-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 22:35:32,722 WARNING [train.py:1204] (2/4) Exclude cut with ID _35799_210_5854_1_1533263487887_3529071_538-273574-0_sp1.1 from training. Duration: 0.936375 2023-10-03 22:35:34,118 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_14928_210_5632_1_1530773461_4714000_454-112198-0_sp1.1 from training. Duration: 0.6 2023-10-03 22:35:34,167 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_468-143126-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 22:35:36,072 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_4528_210_3273_1_1527076654_3276777_258-121681-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 22:35:36,157 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_397-132565-0_sp1.1 from training. Duration: 0.9 2023-10-03 22:35:37,480 WARNING [train.py:1204] (2/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_454-123002-0_sp0.9 from training. Duration: 0.7666875 2023-10-03 22:35:37,809 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.conv_module2.balancer1.prob, batch_count=1430373.3333333333, ans=0.125 2023-10-03 22:35:38,773 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.668e+02 2.010e+02 2.255e+02 2.586e+02 3.703e+02, threshold=4.510e+02, percent-clipped=0.0 2023-10-03 22:35:39,588 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.4.encoder.layers.0.feed_forward2.out_whiten, num_groups=1, num_channels=384, metric=5.87 vs. limit=15.0 2023-10-03 22:35:40,220 WARNING [train.py:1204] (2/4) Exclude cut with ID _27052_210_5986_1_1532329197187_3585009_297-86890-0_sp1.1 from training. Duration: 0.936375 2023-10-03 22:35:43,319 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_51225_210_3601_1_1534400634_2327383_55-301844-0_sp1.1 from training. Duration: 0.8454375 2023-10-03 22:35:44,817 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6786_210_1388_1_1527839699_4194495_378-46070-0_sp0.9 from training. Duration: 0.8 2023-10-03 22:35:46,119 WARNING [train.py:1204] (2/4) Exclude cut with ID _42334_210_7770_1_1534206610396_3605880_55-19758-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 22:35:49,550 WARNING [train.py:1204] (2/4) Exclude cut with ID _14884_210_2252_1_1530707459689_5561250_114-201399-0_sp0.9 from training. Duration: 0.8444375 2023-10-03 22:35:55,063 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_99-251314-0_sp0.9 from training. Duration: 0.9666875 2023-10-03 22:35:55,142 WARNING [train.py:1204] (2/4) Exclude cut with ID _26528_210_9070_1_1532570410842_3851310_497-97218-0 from training. Duration: 0.86 2023-10-03 22:36:01,123 WARNING [train.py:1204] (2/4) Exclude cut with ID _55328_210_27767_1_1534580973260_4088170_230-317241-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 22:36:01,200 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_125-203105-0_sp0.9 from training. Duration: 0.888875 2023-10-03 22:36:03,763 WARNING [train.py:1204] (2/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_55-31904-0_sp1.1 from training. Duration: 0.8 2023-10-03 22:36:07,064 WARNING [train.py:1204] (2/4) Exclude cut with ID _8064_210_5399_1_1528372299291_4282973_18-174647-0 from training. Duration: 0.91 2023-10-03 22:36:09,756 INFO [train.py:1046] (2/4) Epoch 41, batch 2100, loss[loss=0.1529, simple_loss=0.2231, pruned_loss=0.04133, over 23692.00 frames. ], tot_loss[loss=0.1559, simple_loss=0.2348, pruned_loss=0.03847, over 4675485.78 frames. ], batch size: 232, lr: 2.48e-03, grad_scale: 8.0 2023-10-03 22:36:09,858 WARNING [train.py:1204] (2/4) Exclude cut with ID IC0165W0005-81726 from training. Duration: 0.952 2023-10-03 22:36:09,859 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_31583_210_3852_1_1532595851_4024413_148-171847-0_sp1.1 from training. Duration: 0.963625 2023-10-03 22:36:10,529 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.3.encoder.layers.3.feed_forward2.out_whiten, num_groups=1, num_channels=512, metric=6.46 vs. limit=15.0 2023-10-03 22:36:11,198 WARNING [train.py:1204] (2/4) Exclude cut with ID _21989_210_5854_1_1531899214931_3271360_116-138769-0_sp1.1 from training. Duration: 0.936375 2023-10-03 22:36:11,273 WARNING [train.py:1204] (2/4) Exclude cut with ID _66070_210_7640_1_1535441393096_7805310_489-292631-0_sp1.1 from training. Duration: 0.7181875 2023-10-03 22:36:11,878 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.3.encoder.layers.3.conv_module2.whiten, num_groups=1, num_channels=512, metric=4.94 vs. limit=15.0 2023-10-03 22:36:14,405 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_20120_210_6753_1_1531643035_5482859_202-27960-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 22:36:14,414 WARNING [train.py:1204] (2/4) Exclude cut with ID _5294_210_1747_1_1527249643928_3696179_342-270278-0 from training. Duration: 0.55 2023-10-03 22:36:14,451 WARNING [train.py:1204] (2/4) Exclude cut with ID _38323_210_12108_1_1533121233369_3303049_416-11919-0 from training. Duration: 0.96 2023-10-03 22:36:15,818 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6545_210_2040_1_1527774670_4245258_407-48070-0_sp0.9 from training. Duration: 0.8444375 2023-10-03 22:36:18,977 WARNING [train.py:1204] (2/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_503-30719-0_sp1.1 from training. Duration: 0.8909375 2023-10-03 22:36:19,048 WARNING [train.py:1204] (2/4) Exclude cut with ID _49643_210_7989_1_1534640244224_3939031_234-326497-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 22:36:23,118 WARNING [train.py:1204] (2/4) Exclude cut with ID _14170_210_5219_1_1530525949868_4469650_301-77873-0_sp1.1 from training. Duration: 0.963625 2023-10-03 22:36:23,180 WARNING [train.py:1204] (2/4) Exclude cut with ID _45297_210_2721_1_1533715351381_3264949_152-154697-0_sp1.1 from training. Duration: 0.97275 2023-10-03 22:36:23,186 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_3371_210_1794_1_1526384462_5580777_396-339812-0 from training. Duration: 0.85 2023-10-03 22:36:24,561 WARNING [train.py:1204] (2/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_399-156791-0_sp1.1 from training. Duration: 0.7545625 2023-10-03 22:36:25,851 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7696_210_2040_1_1528207167_3716099_117-125434-0 from training. Duration: 0.76 2023-10-03 22:36:25,865 WARNING [train.py:1204] (2/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_96-244299-0 from training. Duration: 0.88 2023-10-03 22:36:27,259 WARNING [train.py:1204] (2/4) Exclude cut with ID _37892_210_19596_1_1533198677897_3964058_519-343462-0_sp1.1 from training. Duration: 0.92725 2023-10-03 22:36:27,284 WARNING [train.py:1204] (2/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_202-183922-0_sp1.1 from training. Duration: 0.77275 2023-10-03 22:36:27,290 WARNING [train.py:1204] (2/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_453-133983-0 from training. Duration: 0.78 2023-10-03 22:36:27,329 WARNING [train.py:1204] (2/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_178-39722-0_sp1.1 from training. Duration: 0.4454375 2023-10-03 22:36:33,735 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_15126_210_6753_1_1530778677_2591185_113-157651-0 from training. Duration: 0.68 2023-10-03 22:36:33,736 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_271-234962-0_sp1.1 from training. Duration: 0.7181875 2023-10-03 22:36:36,413 WARNING [train.py:1204] (2/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_195-64967-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 22:36:36,448 WARNING [train.py:1204] (2/4) Exclude cut with ID _43552_210_8913_1_1533726031059_3523429_365-215065-0_sp1.1 from training. Duration: 0.936375 2023-10-03 22:36:40,932 WARNING [train.py:1204] (2/4) Exclude cut with ID _31602_210_5854_1_1532677021456_4458250_249-96604-0_sp0.9 from training. Duration: 0.988875 2023-10-03 22:36:40,984 WARNING [train.py:1204] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_307-158462-0 from training. Duration: 0.69 2023-10-03 22:36:41,036 WARNING [train.py:1204] (2/4) Exclude cut with ID _30918_210_6400_1_1532754078600_3125920_407-223394-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 22:36:41,040 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6673_210_3273_1_1527767790_3173240_351-167954-0_sp0.9 from training. Duration: 0.5333125 2023-10-03 22:36:42,652 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.balancer1.prob, batch_count=1430706.6666666667, ans=0.125 2023-10-03 22:36:43,786 WARNING [train.py:1204] (2/4) Exclude cut with ID _4866_210_2577_1_1527404412284_4364659_256-306830-0 from training. Duration: 0.87 2023-10-03 22:36:43,842 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_17613_210_2151_1_1531137569_2366160_222-131118-0_sp1.1 from training. Duration: 0.963625 2023-10-03 22:36:43,856 WARNING [train.py:1204] (2/4) Exclude cut with ID _45560_210_21982_1_1533812603951_3584999_111-6376-0 from training. Duration: 0.84 2023-10-03 22:36:43,876 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_216-73841-0 from training. Duration: 0.96 2023-10-03 22:36:45,689 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_14058_210_4163_1_1530690953_6327180_49-210687-0 from training. Duration: 0.94 2023-10-03 22:36:45,879 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.conv_module1.balancer2.min_positive, batch_count=1430706.6666666667, ans=0.05 2023-10-03 22:36:47,106 WARNING [train.py:1204] (2/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_506-42614-0_sp0.9 from training. Duration: 0.87775 2023-10-03 22:36:49,837 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_3133_210_2087_1_1526106446_2324690_25-332138-0_sp1.1 from training. Duration: 0.82725 2023-10-03 22:36:51,661 WARNING [train.py:1204] (2/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_589-9070-0_sp1.1 from training. Duration: 0.6454375 2023-10-03 22:36:53,034 WARNING [train.py:1204] (2/4) Exclude cut with ID _6194_210_1534_1_1527511826160_3941194_14-282876-0_sp1.1 from training. Duration: 0.6454375 2023-10-03 22:36:53,295 INFO [scaling.py:1118] (2/4) WithLoss: name=encoder.encoders.1.encoder.layers.0.self_attn_weights, loss-sum=0.000e+00 2023-10-03 22:36:54,498 WARNING [train.py:1204] (2/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_1166-90654-0_sp1.1 from training. Duration: 0.92725 2023-10-03 22:36:55,867 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_14056_210_4163_1_1530604483_7572827_218-255858-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 22:36:55,877 WARNING [train.py:1204] (2/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_1285-256622-0 from training. Duration: 0.84 2023-10-03 22:36:55,897 WARNING [train.py:1204] (2/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_205-286261-0_sp1.1 from training. Duration: 0.963625 2023-10-03 22:36:55,912 WARNING [train.py:1204] (2/4) Exclude cut with ID _68777_210_4353_1_1535810390731_3117904_6-154119-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 22:36:57,316 WARNING [train.py:1204] (2/4) Exclude cut with ID _22982_210_1883_1_1531882790447_5449409_565-199393-0_sp1.1 from training. Duration: 0.92725 2023-10-03 22:36:57,326 WARNING [train.py:1204] (2/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_278-154492-0 from training. Duration: 0.76 2023-10-03 22:36:58,732 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_426-313557-0 from training. Duration: 0.53 2023-10-03 22:36:58,932 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.balancer1.prob, batch_count=1430773.3333333333, ans=0.125 2023-10-03 22:37:00,062 WARNING [train.py:1204] (2/4) Exclude cut with ID _41817_210_18835_1_1533434477160_7423049_555-293448-0 from training. Duration: 0.8 2023-10-03 22:37:03,380 WARNING [train.py:1204] (2/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_32-335103-0_sp1.1 from training. Duration: 0.7454375 2023-10-03 22:37:07,364 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_2655_210_2087_1_1525688959_3897400_202-147557-0_sp1.1 from training. Duration: 0.77275 2023-10-03 22:37:07,399 WARNING [train.py:1204] (2/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_158-250116-0 from training. Duration: 0.62 2023-10-03 22:37:12,051 WARNING [train.py:1204] (2/4) Exclude cut with ID _55429_210_10626_1_1534467588527_7174143_528-88083-0_sp1.1 from training. Duration: 0.963625 2023-10-03 22:37:13,490 WARNING [train.py:1204] (2/4) Exclude cut with ID _8030_210_7497_1_1528444886781_3531089_32-31837-0_sp1.1 from training. Duration: 0.8818125 2023-10-03 22:37:15,211 WARNING [train.py:1204] (2/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_744-86168-0_sp1.1 from training. Duration: 0.97275 2023-10-03 22:37:15,227 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1113-7369-0_sp0.9 from training. Duration: 0.9444375 2023-10-03 22:37:15,246 WARNING [train.py:1204] (2/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_124-345840-0_sp0.9 from training. Duration: 0.57775 2023-10-03 22:37:16,475 WARNING [train.py:1204] (2/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_328-264776-0_sp0.9 from training. Duration: 0.7444375 2023-10-03 22:37:17,891 WARNING [train.py:1204] (2/4) Exclude cut with ID _18895_210_6228_1_1531357274234_3784930_469-166615-0_sp1.1 from training. Duration: 0.963625 2023-10-03 22:37:17,905 WARNING [train.py:1204] (2/4) Exclude cut with ID _8395_210_1531_1_1528597871523_4411579_431-132470-0_sp0.9 from training. Duration: 0.82225 2023-10-03 22:37:17,973 WARNING [train.py:1204] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_554-45617-0_sp1.1 from training. Duration: 0.9 2023-10-03 22:37:18,001 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_35361_210_4238_1_1533262810_7793476_524-188242-0_sp1.1 from training. Duration: 0.92725 2023-10-03 22:37:18,216 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.feed_forward2.hidden_balancer.prob, batch_count=1430840.0, ans=0.125 2023-10-03 22:37:21,116 WARNING [train.py:1204] (2/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_938-205135-0 from training. Duration: 0.87 2023-10-03 22:37:22,497 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_5931_210_2040_1_1527428481_4547999_321-193614-0 from training. Duration: 0.67 2023-10-03 22:37:22,510 WARNING [train.py:1204] (2/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_445-269521-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 22:37:23,825 INFO [train.py:1046] (2/4) Epoch 41, batch 2150, loss[loss=0.1588, simple_loss=0.2491, pruned_loss=0.03419, over 24638.00 frames. ], tot_loss[loss=0.1551, simple_loss=0.2344, pruned_loss=0.03787, over 4698252.00 frames. ], batch size: 73, lr: 2.48e-03, grad_scale: 8.0 2023-10-03 22:37:26,687 WARNING [train.py:1204] (2/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_3-33116-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 22:37:26,688 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_4633_210_2040_1_1526824163_4145343_416-76117-0_sp1.1 from training. Duration: 0.863625 2023-10-03 22:37:26,723 WARNING [train.py:1204] (2/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_1033-274376-0_sp1.1 from training. Duration: 0.7909375 2023-10-03 22:37:26,758 WARNING [train.py:1204] (2/4) Exclude cut with ID _8772_210_1850_1_1528635595972_3664011_398-320049-0_sp0.9 from training. Duration: 0.97775 2023-10-03 22:37:31,051 WARNING [train.py:1204] (2/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_321-215742-0_sp0.9 from training. Duration: 0.5555625 2023-10-03 22:37:34,193 WARNING [train.py:1204] (2/4) Exclude cut with ID _28394_210_8913_1_1532430049518_3650118_272-190950-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 22:37:34,279 WARNING [train.py:1204] (2/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_31-299371-0_sp1.1 from training. Duration: 0.92725 2023-10-03 22:37:36,976 WARNING [train.py:1204] (2/4) Exclude cut with ID _55383_210_10635_1_1534468426172_2231669_134-128246-0_sp0.9 from training. Duration: 0.988875 2023-10-03 22:37:36,978 WARNING [train.py:1204] (2/4) Exclude cut with ID _40519_210_13794_1_1533715228655_7337390_887-9547-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 22:37:38,308 WARNING [train.py:1204] (2/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_310-109461-0_sp0.9 from training. Duration: 0.888875 2023-10-03 22:37:42,739 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_2009_210_2071_1_1525481134_4588937_258-250196-0_sp1.1 from training. Duration: 0.92725 2023-10-03 22:37:42,789 WARNING [train.py:1204] (2/4) Exclude cut with ID _9235_210_5219_1_1528891154253_8046120_1186-72702-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 22:37:42,792 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_14831_210_7927_1_1530700200_5583610_181-27467-0_sp1.1 from training. Duration: 0.82725 2023-10-03 22:37:47,411 WARNING [train.py:1204] (2/4) Exclude cut with ID _40402_210_18219_1_1533522324115_2953150_278-132109-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 22:37:47,436 WARNING [train.py:1204] (2/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_261-297503-0 from training. Duration: 0.99 2023-10-03 22:37:47,652 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.1.feed_forward3.hidden_balancer.prob, batch_count=1430973.3333333333, ans=0.125 2023-10-03 22:37:51,456 WARNING [train.py:1204] (2/4) Exclude cut with ID _19896_210_1883_1_1531709022662_6649569_313-265903-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 22:37:53,287 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_105-114797-0_sp1.1 from training. Duration: 0.763625 2023-10-03 22:37:53,377 WARNING [train.py:1204] (2/4) Exclude cut with ID _53755_210_8254_1_1534577312941_4707360_9-65843-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 22:37:54,639 WARNING [train.py:1204] (2/4) Exclude cut with ID _18516_210_9206_1_1531704653206_3692406_361-224549-0_sp1.1 from training. Duration: 0.97275 2023-10-03 22:37:54,673 WARNING [train.py:1204] (2/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_403-310975-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 22:37:55,985 WARNING [train.py:1204] (2/4) Exclude cut with ID _5444_210_2151_1_1527418808464_5312210_581-315411-0_sp0.9 from training. Duration: 0.688875 2023-10-03 22:37:56,021 WARNING [train.py:1204] (2/4) Exclude cut with ID _23825_210_13449_1_1531915313526_3497150_464-154904-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 22:37:56,031 WARNING [train.py:1204] (2/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_937-208281-0_sp0.9 from training. Duration: 0.9444375 2023-10-03 22:37:56,113 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder_embed.conv.2.prob, batch_count=1431040.0, ans=0.125 2023-10-03 22:37:57,403 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_37839_210_7234_1_1533542462_3689303_158-181212-0_sp1.1 from training. Duration: 0.963625 2023-10-03 22:37:57,490 WARNING [train.py:1204] (2/4) Exclude cut with ID _8772_210_1850_1_1528635595972_3664011_398-320049-0 from training. Duration: 0.88 2023-10-03 22:37:58,871 WARNING [train.py:1204] (2/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_718-250907-0_sp1.1 from training. Duration: 0.62725 2023-10-03 22:38:00,164 WARNING [train.py:1204] (2/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_641-126395-0_sp1.1 from training. Duration: 0.92725 2023-10-03 22:38:00,190 WARNING [train.py:1204] (2/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_166-241493-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 22:38:01,594 WARNING [train.py:1204] (2/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_526-62079-0_sp0.9 from training. Duration: 0.7444375 2023-10-03 22:38:02,987 WARNING [train.py:1204] (2/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_439-275083-0_sp1.1 from training. Duration: 0.936375 2023-10-03 22:38:05,960 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.616e+02 1.931e+02 2.155e+02 2.450e+02 3.696e+02, threshold=4.310e+02, percent-clipped=0.0 2023-10-03 22:38:06,075 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_66907_210_21581_1_1535615661_3590177_493-229032-0_sp1.1 from training. Duration: 0.92725 2023-10-03 22:38:06,108 WARNING [train.py:1204] (2/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_89-321955-0_sp1.1 from training. Duration: 0.72725 2023-10-03 22:38:07,498 WARNING [train.py:1204] (2/4) Exclude cut with ID _29857_210_20034_1_1532395886753_3798160_273-226195-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 22:38:07,512 WARNING [train.py:1204] (2/4) Exclude cut with ID _22826_210_9994_1_1531908201522_3279169_404-75889-0 from training. Duration: 0.89 2023-10-03 22:38:07,552 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_271-234962-0_sp0.9 from training. Duration: 0.87775 2023-10-03 22:38:12,269 WARNING [train.py:1204] (2/4) Exclude cut with ID _22961_210_8254_1_1531976366672_4278180_156-339685-0_sp1.1 from training. Duration: 0.97275 2023-10-03 22:38:12,314 WARNING [train.py:1204] (2/4) Exclude cut with ID _37892_210_19596_1_1533198677897_3964058_425-231927-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 22:38:12,448 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.1.feed_forward3.hidden_balancer.prob, batch_count=1431106.6666666667, ans=0.125 2023-10-03 22:38:13,638 WARNING [train.py:1204] (2/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_102-288670-0_sp1.1 from training. Duration: 0.97275 2023-10-03 22:38:15,097 WARNING [train.py:1204] (2/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_283-220372-0_sp1.1 from training. Duration: 0.5090625 2023-10-03 22:38:15,170 WARNING [train.py:1204] (2/4) Exclude cut with ID _44304_210_15374_1_1533639352765_4613129_86-1699-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 22:38:18,380 WARNING [train.py:1204] (2/4) Exclude cut with ID _2891_210_2550_1_1525874398521_4427710_327-121590-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 22:38:18,386 WARNING [train.py:1204] (2/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_369-222060-0 from training. Duration: 0.71 2023-10-03 22:38:19,708 WARNING [train.py:1204] (2/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_762-109886-0 from training. Duration: 0.91 2023-10-03 22:38:19,739 WARNING [train.py:1204] (2/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_477-255782-0_sp0.9 from training. Duration: 0.911125 2023-10-03 22:38:21,038 WARNING [train.py:1204] (2/4) Exclude cut with ID IC0669W0061-330906 from training. Duration: 0.768 2023-10-03 22:38:22,421 WARNING [train.py:1204] (2/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_347-213071-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 22:38:22,460 WARNING [train.py:1204] (2/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_507-303370-0_sp1.1 from training. Duration: 0.87275 2023-10-03 22:38:23,818 WARNING [train.py:1204] (2/4) Exclude cut with ID _15012_210_1968_1_1530856820429_5095350_688-178849-0 from training. Duration: 0.64 2023-10-03 22:38:23,823 WARNING [train.py:1204] (2/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_28-70987-0_sp0.9 from training. Duration: 0.9666875 2023-10-03 22:38:23,826 WARNING [train.py:1204] (2/4) Exclude cut with ID _5910_210_1389_1_1527337972454_5050100_555-159815-0 from training. Duration: 0.98 2023-10-03 22:38:23,855 WARNING [train.py:1204] (2/4) Exclude cut with ID IC0679W0064-335871 from training. Duration: 0.768 2023-10-03 22:38:23,855 WARNING [train.py:1204] (2/4) Exclude cut with ID IC0679W0079-335886 from training. Duration: 0.896 2023-10-03 22:38:24,539 WARNING [train.py:1204] (2/4) Exclude cut with ID _5608_210_2106_1_1527415439089_6819768_909-82767-0 from training. Duration: 0.61 2023-10-03 22:38:25,656 WARNING [train.py:1204] (2/4) Exclude cut with ID _23038_210_9570_1_1532001584158_3652419_411-323967-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 22:38:25,711 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_40896_210_15710_1_1533433956_7100802_350-231748-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 22:38:26,981 WARNING [train.py:1204] (2/4) Exclude cut with ID _5289_210_2084_1_1527678202736_3779299_142-103317-0_sp0.9 from training. Duration: 0.9333125 2023-10-03 22:38:27,051 WARNING [train.py:1204] (2/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_314-144055-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 22:38:27,120 WARNING [train.py:1204] (2/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_177-236809-0_sp0.9 from training. Duration: 0.5444375 2023-10-03 22:38:29,899 WARNING [train.py:1204] (2/4) Exclude cut with ID _23038_210_9570_1_1532001584158_3652419_82-334733-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 22:38:29,907 WARNING [train.py:1204] (2/4) Exclude cut with ID _43552_210_8913_1_1533726031059_3523429_225-22526-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 22:38:37,237 INFO [train.py:1046] (2/4) Epoch 41, batch 2200, loss[loss=0.1509, simple_loss=0.2291, pruned_loss=0.03636, over 23615.00 frames. ], tot_loss[loss=0.1549, simple_loss=0.2345, pruned_loss=0.0376, over 4703293.11 frames. ], batch size: 149, lr: 2.48e-03, grad_scale: 8.0 2023-10-03 22:38:37,365 WARNING [train.py:1204] (2/4) Exclude cut with ID _30327_210_16049_1_1532484161295_3924169_401-70819-0_sp1.1 from training. Duration: 0.7818125 2023-10-03 22:38:38,662 WARNING [train.py:1204] (2/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_538-32291-0 from training. Duration: 0.83 2023-10-03 22:38:41,507 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_37357_210_17024_1_1533089504_2316121_258-211605-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 22:38:41,704 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.1.feed_forward2.hidden_balancer.prob, batch_count=1431240.0, ans=0.125 2023-10-03 22:38:45,023 WARNING [train.py:1204] (2/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_550-335198-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 22:38:45,078 WARNING [train.py:1204] (2/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_98-741-0_sp0.9 from training. Duration: 0.888875 2023-10-03 22:38:46,434 WARNING [train.py:1204] (2/4) Exclude cut with ID _38323_210_12108_1_1533121233369_3303049_251-238988-0_sp1.1 from training. Duration: 0.92725 2023-10-03 22:38:47,745 WARNING [train.py:1204] (2/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_259-34038-0_sp0.9 from training. Duration: 0.82225 2023-10-03 22:38:49,643 WARNING [train.py:1204] (2/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_1060-180633-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 22:38:50,899 WARNING [train.py:1204] (2/4) Exclude cut with ID _8755_210_1876_1_1528628559357_3258209_211-37473-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 22:38:50,904 WARNING [train.py:1204] (2/4) Exclude cut with ID _4122_210_2084_1_1526713164417_4353280_528-206771-0 from training. Duration: 0.88 2023-10-03 22:38:55,556 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.feed_forward2.hidden_balancer.prob, batch_count=1431306.6666666667, ans=0.125 2023-10-03 22:38:56,808 WARNING [train.py:1204] (2/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_64-248359-0 from training. Duration: 0.69 2023-10-03 22:38:58,283 WARNING [train.py:1204] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_402-253027-0_sp1.1 from training. Duration: 0.4909375 2023-10-03 22:39:04,199 WARNING [train.py:1204] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_625-102955-0 from training. Duration: 0.77 2023-10-03 22:39:05,924 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.bypass.scale_min, batch_count=1431373.3333333333, ans=0.2 2023-10-03 22:39:07,121 WARNING [train.py:1204] (2/4) Exclude cut with ID _14998_210_5219_1_1530705582080_7474970_611-219324-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 22:39:07,205 WARNING [train.py:1204] (2/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_718-68505-0_sp0.9 from training. Duration: 0.988875 2023-10-03 22:39:07,249 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_353-182855-0_sp1.1 from training. Duration: 0.87275 2023-10-03 22:39:07,383 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.conv_module2.balancer1.prob, batch_count=1431373.3333333333, ans=0.125 2023-10-03 22:39:11,207 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_5960_210_1794_1_1527403865_3990999_515-2613-0_sp0.9 from training. Duration: 0.97775 2023-10-03 22:39:11,227 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_5575_210_1410_1_1527301305_2570170_342-265572-0 from training. Duration: 0.94 2023-10-03 22:39:15,792 WARNING [train.py:1204] (2/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_329-113173-0_sp0.9 from training. Duration: 0.688875 2023-10-03 22:39:17,179 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_15396_210_6890_1_1531308473_1938936_213-312506-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 22:39:17,238 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_521-214276-0_sp1.1 from training. Duration: 0.4 2023-10-03 22:39:20,426 WARNING [train.py:1204] (2/4) Exclude cut with ID _15026_210_8750_1_1530777602316_3291850_211-195043-0_sp1.1 from training. Duration: 0.72725 2023-10-03 22:39:21,758 WARNING [train.py:1204] (2/4) Exclude cut with ID _29343_210_4381_1_1532602757752_3674109_108-257494-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 22:39:23,772 WARNING [train.py:1204] (2/4) Exclude cut with ID _64414_210_17301_1_1535243252048_3757759_403-58151-0_sp1.1 from training. Duration: 0.963625 2023-10-03 22:39:23,880 WARNING [train.py:1204] (2/4) Exclude cut with ID _36512_210_6987_1_1533033143998_3126069_109-177787-0_sp1.1 from training. Duration: 0.92725 2023-10-03 22:39:26,581 WARNING [train.py:1204] (2/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_498-40153-0 from training. Duration: 0.63 2023-10-03 22:39:26,650 WARNING [train.py:1204] (2/4) Exclude cut with ID _56447_210_1389_1_1535074245910_3697189_161-334523-0_sp1.1 from training. Duration: 0.936375 2023-10-03 22:39:28,043 WARNING [train.py:1204] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_238-245655-0 from training. Duration: 0.81 2023-10-03 22:39:30,695 WARNING [train.py:1204] (2/4) Exclude cut with ID _64631_210_27767_1_1535349580068_4165160_108-43865-0_sp1.1 from training. Duration: 0.92725 2023-10-03 22:39:30,701 WARNING [train.py:1204] (2/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_243-42116-0_sp1.1 from training. Duration: 0.663625 2023-10-03 22:39:30,724 WARNING [train.py:1204] (2/4) Exclude cut with ID _66868_210_9527_1_1535509287954_4561140_594-132160-0_sp1.1 from training. Duration: 0.92725 2023-10-03 22:39:33,884 WARNING [train.py:1204] (2/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_48-338976-0_sp0.9 from training. Duration: 0.988875 2023-10-03 22:39:35,247 WARNING [train.py:1204] (2/4) Exclude cut with ID _5250_210_5219_1_1527249561806_8692110_137-100595-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 22:39:35,253 WARNING [train.py:1204] (2/4) Exclude cut with ID _49748_210_24116_1_1533986971737_3158910_12-271669-0_sp1.1 from training. Duration: 0.936375 2023-10-03 22:39:35,290 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_37357_210_17024_1_1533089504_2316121_206-80421-0_sp1.1 from training. Duration: 0.936375 2023-10-03 22:39:36,806 WARNING [train.py:1204] (2/4) Exclude cut with ID _7537_210_2084_1_1528502395431_3924359_113-199933-0_sp1.1 from training. Duration: 0.536375 2023-10-03 22:39:36,850 WARNING [train.py:1204] (2/4) Exclude cut with ID _55440_210_12651_1_1534496713364_2980520_31-59289-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 22:39:38,268 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1083-220122-0_sp0.9 from training. Duration: 0.5666875 2023-10-03 22:39:39,802 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=1431506.6666666667, ans=0.1 2023-10-03 22:39:41,119 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_426-313557-0_sp1.1 from training. Duration: 0.4818125 2023-10-03 22:39:41,171 WARNING [train.py:1204] (2/4) Exclude cut with ID _35799_210_5854_1_1533263487887_3529071_324-94843-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 22:39:44,328 WARNING [train.py:1204] (2/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_264-108685-0_sp1.1 from training. Duration: 0.5 2023-10-03 22:39:44,408 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9012W0006-501141 from training. Duration: 0.896 2023-10-03 22:39:45,778 WARNING [train.py:1204] (2/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_890-5117-0_sp0.9 from training. Duration: 0.8666875 2023-10-03 22:39:47,315 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9023W0008-506301 from training. Duration: 0.896 2023-10-03 22:39:49,175 WARNING [train.py:1204] (2/4) Exclude cut with ID _13916_210_9994_1_1530788341126_3763149_60-191556-0_sp0.9 from training. Duration: 0.67775 2023-10-03 22:39:49,231 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9029W0331-509721 from training. Duration: 0.896 2023-10-03 22:39:52,451 INFO [train.py:1046] (2/4) Epoch 41, batch 2250, loss[loss=0.1573, simple_loss=0.2468, pruned_loss=0.03392, over 24653.00 frames. ], tot_loss[loss=0.1554, simple_loss=0.2352, pruned_loss=0.0378, over 4712983.56 frames. ], batch size: 68, lr: 2.48e-03, grad_scale: 8.0 2023-10-03 22:39:52,508 WARNING [train.py:1204] (2/4) Exclude cut with ID _35823_210_6917_1_1533187881914_7297830_567-108596-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 22:39:52,564 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7543_210_6753_1_1528544010_1110402_25-183860-0_sp1.1 from training. Duration: 0.42725 2023-10-03 22:39:53,927 WARNING [train.py:1204] (2/4) Exclude cut with ID _47278_210_12041_1_1533902418349_3576110_495-260035-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 22:39:56,509 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9051W0015-520281 from training. Duration: 0.9045625 2023-10-03 22:39:56,630 WARNING [train.py:1204] (2/4) Exclude cut with ID _14459_210_2228_1_1531443641946_7516700_707-66193-0_sp1.1 from training. Duration: 0.97275 2023-10-03 22:39:59,255 WARNING [train.py:1204] (2/4) Exclude cut with ID _14087_210_2253_1_1530529224512_3014029_219-224445-0_sp1.1 from training. Duration: 0.763625 2023-10-03 22:40:03,408 WARNING [train.py:1204] (2/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_345-167986-0_sp0.9 from training. Duration: 0.9333125 2023-10-03 22:40:06,498 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_34815_210_6388_1_1533381320_3571903_279-305565-0_sp1.1 from training. Duration: 0.836375 2023-10-03 22:40:09,341 WARNING [train.py:1204] (2/4) Exclude cut with ID _21987_210_5854_1_1531821500482_3637038_317-179212-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 22:40:09,389 WARNING [train.py:1204] (2/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_395-37222-0_sp1.1 from training. Duration: 0.6909375 2023-10-03 22:40:10,678 WARNING [train.py:1204] (2/4) Exclude cut with ID _8064_210_5399_1_1528372299291_4282973_168-50337-0_sp1.1 from training. Duration: 0.763625 2023-10-03 22:40:12,499 INFO [scaling.py:1118] (2/4) WithLoss: name=encoder.encoders.3.encoder.layers.3.self_attn_weights, loss-sum=0.000e+00 2023-10-03 22:40:14,138 WARNING [train.py:1204] (2/4) Exclude cut with ID _60113_210_16271_1_1535164212481_4386079_559-167191-0 from training. Duration: 0.93 2023-10-03 22:40:14,148 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_47228_210_4983_1_1533780021_3672027_344-179642-0_sp1.1 from training. Duration: 0.92725 2023-10-03 22:40:14,180 WARNING [train.py:1204] (2/4) Exclude cut with ID _22961_210_8254_1_1531976366672_4278180_317-199546-0_sp0.9 from training. Duration: 0.9555625 2023-10-03 22:40:14,394 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.conv_module1.balancer2.prob, batch_count=1431640.0, ans=0.125 2023-10-03 22:40:15,600 WARNING [train.py:1204] (2/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_141-89727-0 from training. Duration: 0.71 2023-10-03 22:40:16,819 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_78-116075-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 22:40:16,830 WARNING [train.py:1204] (2/4) Exclude cut with ID _64086_210_7994_1_1535270417019_7435949_52-235756-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 22:40:18,233 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7615_210_4211_1_1528110965_4580160_72-235211-0_sp1.1 from training. Duration: 0.6909375 2023-10-03 22:40:23,437 WARNING [train.py:1204] (2/4) Exclude cut with ID _14069_210_5632_1_1530512692033_4803390_287-27950-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 22:40:24,818 WARNING [train.py:1204] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_74-48717-0_sp0.9 from training. Duration: 0.6444375 2023-10-03 22:40:24,850 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7544_210_6753_1_1528591327_1471426_172-348777-0_sp1.1 from training. Duration: 0.6 2023-10-03 22:40:26,215 WARNING [train.py:1204] (2/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_220-191080-0 from training. Duration: 0.98 2023-10-03 22:40:27,571 WARNING [train.py:1204] (2/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_1081-137641-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 22:40:28,963 WARNING [train.py:1204] (2/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_278-235459-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 22:40:33,266 WARNING [train.py:1204] (2/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_1181-77470-0_sp1.1 from training. Duration: 0.936375 2023-10-03 22:40:34,425 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.587e+02 1.897e+02 2.082e+02 2.416e+02 4.175e+02, threshold=4.164e+02, percent-clipped=0.0 2023-10-03 22:40:34,594 WARNING [train.py:1204] (2/4) Exclude cut with ID _60860_210_8254_1_1534896015875_3557169_198-5531-0_sp1.1 from training. Duration: 0.936375 2023-10-03 22:40:36,019 WARNING [train.py:1204] (2/4) Exclude cut with ID _12369_210_4381_1_1530356078581_4388520_17-289781-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 22:40:37,818 WARNING [train.py:1204] (2/4) Exclude cut with ID _50310_210_15488_1_1534068043345_3641080_146-199752-0_sp1.1 from training. Duration: 0.92725 2023-10-03 22:40:38,498 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.2.encoder.layers.0.nonlin_attention.whiten1, num_groups=1, num_channels=288, metric=7.84 vs. limit=10.0 2023-10-03 22:40:39,301 WARNING [train.py:1204] (2/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_969-213354-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 22:40:40,652 WARNING [train.py:1204] (2/4) Exclude cut with ID _70519_210_4163_1_1535961595994_7458230_66-344156-0_sp1.1 from training. Duration: 0.963625 2023-10-03 22:40:42,347 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.attention_skip_rate, batch_count=1431773.3333333333, ans=0.0 2023-10-03 22:40:46,653 WARNING [train.py:1204] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_380-204756-0_sp1.1 from training. Duration: 0.7818125 2023-10-03 22:40:48,144 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_128-202104-0_sp0.9 from training. Duration: 0.7 2023-10-03 22:40:54,235 WARNING [train.py:1204] (2/4) Exclude cut with ID _4697_210_2069_1_1526815801953_3823098_51-1049-0_sp0.9 from training. Duration: 0.8333125 2023-10-03 22:40:54,253 WARNING [train.py:1204] (2/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_86-66192-0_sp1.1 from training. Duration: 0.5 2023-10-03 22:40:54,322 WARNING [train.py:1204] (2/4) Exclude cut with ID _14069_210_5632_1_1530512692033_4803390_95-1896-0_sp1.1 from training. Duration: 0.8909375 2023-10-03 22:40:58,879 WARNING [train.py:1204] (2/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_29-293896-0_sp1.1 from training. Duration: 0.5545625 2023-10-03 22:41:01,626 WARNING [train.py:1204] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_167-137255-0_sp0.9 from training. Duration: 0.711125 2023-10-03 22:41:01,626 WARNING [train.py:1204] (2/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_217-95056-0 from training. Duration: 0.98 2023-10-03 22:41:01,767 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.out_combiner.scale_min, batch_count=1431840.0, ans=0.2 2023-10-03 22:41:02,915 WARNING [train.py:1204] (2/4) Exclude cut with ID _47278_210_12041_1_1533902418349_3576110_97-266746-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 22:41:02,955 WARNING [train.py:1204] (2/4) Exclude cut with ID _14224_210_4381_1_1530529062037_4264010_380-343670-0_sp1.1 from training. Duration: 0.8 2023-10-03 22:41:04,422 WARNING [train.py:1204] (2/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_107-332398-0 from training. Duration: 0.78 2023-10-03 22:41:05,696 INFO [train.py:1046] (2/4) Epoch 41, batch 2300, loss[loss=0.1583, simple_loss=0.2344, pruned_loss=0.04107, over 23774.00 frames. ], tot_loss[loss=0.1567, simple_loss=0.2367, pruned_loss=0.03834, over 4700009.94 frames. ], batch size: 164, lr: 2.48e-03, grad_scale: 8.0 2023-10-03 22:41:07,158 WARNING [train.py:1204] (2/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_91-267696-0_sp1.1 from training. Duration: 0.7909375 2023-10-03 22:41:07,179 WARNING [train.py:1204] (2/4) Exclude cut with ID _41298_210_9019_1_1533430371450_4822143_498-186993-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 22:41:14,911 WARNING [train.py:1204] (2/4) Exclude cut with ID _17956_210_4353_1_1531209568125_6979970_54-77610-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 22:41:14,952 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_40047_210_2290_1_1533290519_2374311_257-335162-0_sp1.1 from training. Duration: 0.863625 2023-10-03 22:41:17,725 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9653W0193-675471 from training. Duration: 0.9656875 2023-10-03 22:41:19,121 WARNING [train.py:1204] (2/4) Exclude cut with ID _25248_210_8750_1_1532062825941_5554180_243-240536-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 22:41:26,749 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_32360_210_4198_1_1532689160_3648436_60-51908-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 22:41:26,764 WARNING [train.py:1204] (2/4) Exclude cut with ID _4092_210_2151_1_1526643044449_3135759_227-213972-0_sp0.9 from training. Duration: 0.6 2023-10-03 22:41:26,799 WARNING [train.py:1204] (2/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_392-173500-0_sp1.1 from training. Duration: 0.97275 2023-10-03 22:41:26,838 WARNING [train.py:1204] (2/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_71-329150-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 22:41:26,840 WARNING [train.py:1204] (2/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_866-173440-0 from training. Duration: 0.9 2023-10-03 22:41:28,185 WARNING [train.py:1204] (2/4) Exclude cut with ID _14832_210_8750_1_1530691351539_3171348_1-54315-0_sp0.9 from training. Duration: 0.9666875 2023-10-03 22:41:31,593 WARNING [train.py:1204] (2/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_430-116139-0_sp1.1 from training. Duration: 0.77275 2023-10-03 22:41:31,633 WARNING [train.py:1204] (2/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_356-45054-0_sp0.9 from training. Duration: 0.9555625 2023-10-03 22:41:34,424 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_3531_210_3528_1_1526294203_5216456_309-227302-0_sp0.9 from training. Duration: 0.7666875 2023-10-03 22:41:37,170 WARNING [train.py:1204] (2/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_301-330455-0_sp1.1 from training. Duration: 0.72725 2023-10-03 22:41:37,370 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.balancer1.prob, batch_count=1432040.0, ans=0.125 2023-10-03 22:41:38,710 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.conv_module1.balancer1.min_positive, batch_count=1432040.0, ans=0.025 2023-10-03 22:41:41,227 WARNING [train.py:1204] (2/4) Exclude cut with ID _23557_210_8750_1_1531890106370_6846255_667-145945-0_sp1.1 from training. Duration: 0.92725 2023-10-03 22:41:47,899 WARNING [train.py:1204] (2/4) Exclude cut with ID _14860_210_4915_1_1530702020340_7343240_836-328489-0_sp1.1 from training. Duration: 0.8181875 2023-10-03 22:41:47,948 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_37152_210_9697_1_1533106573_3844188_416-333249-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 22:41:49,402 WARNING [train.py:1204] (2/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_538-32291-0_sp0.9 from training. Duration: 0.92225 2023-10-03 22:41:51,410 WARNING [train.py:1204] (2/4) Exclude cut with ID _32704_210_8954_1_1533017488840_6747370_965-176824-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 22:41:53,604 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.0.layers.0.nonlin_attention.whiten2, num_groups=1, num_channels=192, metric=4.86 vs. limit=15.0 2023-10-03 22:41:54,164 WARNING [train.py:1204] (2/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_86-19601-0_sp1.1 from training. Duration: 0.77275 2023-10-03 22:41:54,396 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.balancer2.prob, batch_count=1432106.6666666667, ans=0.125 2023-10-03 22:41:55,525 WARNING [train.py:1204] (2/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_436-318113-0_sp1.1 from training. Duration: 0.6818125 2023-10-03 22:41:55,551 WARNING [train.py:1204] (2/4) Exclude cut with ID _41817_210_18835_1_1533434477160_7423049_555-293448-0_sp0.9 from training. Duration: 0.888875 2023-10-03 22:41:55,571 WARNING [train.py:1204] (2/4) Exclude cut with ID _4697_210_2069_1_1526815801953_3823098_68-219807-0 from training. Duration: 0.47 2023-10-03 22:41:59,613 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7544_210_6753_1_1528591327_1471426_33-346031-0_sp1.1 from training. Duration: 0.5545625 2023-10-03 22:42:01,285 WARNING [train.py:1204] (2/4) Exclude cut with ID _49748_210_24116_1_1533986971737_3158910_263-52113-0_sp1.1 from training. Duration: 0.97275 2023-10-03 22:42:01,339 WARNING [train.py:1204] (2/4) Exclude cut with ID _64639_210_5816_1_1535367619437_3528000_340-24580-0_sp1.1 from training. Duration: 0.92725 2023-10-03 22:42:01,361 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_47228_210_4983_1_1533780021_3672027_479-114941-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 22:42:01,385 WARNING [train.py:1204] (2/4) Exclude cut with ID _44585_210_18628_1_1533605350387_4518199_506-119659-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 22:42:01,633 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.conv_skip_rate, batch_count=1432106.6666666667, ans=0.0 2023-10-03 22:42:02,879 WARNING [train.py:1204] (2/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_314-126819-0_sp1.1 from training. Duration: 0.4545625 2023-10-03 22:42:02,890 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_2541_210_1794_1_1525590269_3084765_156-173713-0_sp0.9 from training. Duration: 0.9 2023-10-03 22:42:04,184 WARNING [train.py:1204] (2/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_108-255430-0 from training. Duration: 0.97 2023-10-03 22:42:04,193 WARNING [train.py:1204] (2/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_791-45149-0_sp1.1 from training. Duration: 0.7818125 2023-10-03 22:42:04,198 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_53378_210_17282_1_1534227054_1261425_10-118295-0_sp1.1 from training. Duration: 0.97275 2023-10-03 22:42:04,259 WARNING [train.py:1204] (2/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_148-158509-0 from training. Duration: 0.41 2023-10-03 22:42:11,195 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_34726_210_4525_1_1533019474_5030395_687-291305-0_sp1.1 from training. Duration: 0.936375 2023-10-03 22:42:14,533 WARNING [train.py:1204] (2/4) Exclude cut with ID _14860_210_4915_1_1530702020340_7343240_264-324537-0_sp1.1 from training. Duration: 0.963625 2023-10-03 22:42:17,912 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_41251_210_15368_1_1533429767_4820695_591-46386-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 22:42:17,936 WARNING [train.py:1204] (2/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_86-19601-0_sp0.9 from training. Duration: 0.9444375 2023-10-03 22:42:17,968 WARNING [train.py:1204] (2/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_401-255491-0_sp0.9 from training. Duration: 0.77775 2023-10-03 22:42:20,570 INFO [train.py:1046] (2/4) Epoch 41, batch 2350, loss[loss=0.171, simple_loss=0.2565, pruned_loss=0.04275, over 24406.00 frames. ], tot_loss[loss=0.1579, simple_loss=0.238, pruned_loss=0.03886, over 4706245.25 frames. ], batch size: 77, lr: 2.48e-03, grad_scale: 8.0 2023-10-03 22:42:20,641 WARNING [train.py:1204] (2/4) Exclude cut with ID _8696_210_1876_1_1528545632440_3345058_209-54069-0_sp1.1 from training. Duration: 0.6090625 2023-10-03 22:42:20,656 WARNING [train.py:1204] (2/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_369-325184-0_sp1.1 from training. Duration: 0.9 2023-10-03 22:42:21,960 WARNING [train.py:1204] (2/4) Exclude cut with ID _14687_210_5854_1_1530703838259_3664889_115-239046-0_sp0.9 from training. Duration: 0.7444375 2023-10-03 22:42:22,651 WARNING [train.py:1204] (2/4) Exclude cut with ID _8064_210_5399_1_1528372299291_4282973_168-50337-0 from training. Duration: 0.84 2023-10-03 22:42:29,322 WARNING [train.py:1204] (2/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_909-64151-0_sp1.1 from training. Duration: 0.8545625 2023-10-03 22:42:29,332 WARNING [train.py:1204] (2/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_597-154640-0 from training. Duration: 0.9 2023-10-03 22:42:34,046 WARNING [train.py:1204] (2/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_126-274694-0 from training. Duration: 0.98 2023-10-03 22:42:34,256 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.conv_module2.balancer1.min_positive, batch_count=1432306.6666666667, ans=0.025 2023-10-03 22:42:36,664 WARNING [train.py:1204] (2/4) Exclude cut with ID _21783_210_11357_1_1531742458730_3762679_344-269786-0_sp1.1 from training. Duration: 0.97275 2023-10-03 22:42:39,557 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_37818_210_4238_1_1533620910_4720083_322-253154-0_sp1.1 from training. Duration: 0.92725 2023-10-03 22:42:39,560 WARNING [train.py:1204] (2/4) Exclude cut with ID _58895_210_8750_1_1535184081320_3649089_221-15823-0_sp1.1 from training. Duration: 0.92725 2023-10-03 22:42:40,855 WARNING [train.py:1204] (2/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_9-21737-0_sp1.1 from training. Duration: 0.8818125 2023-10-03 22:42:40,888 WARNING [train.py:1204] (2/4) Exclude cut with ID _14069_210_5632_1_1530512692033_4803390_522-341588-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 22:42:42,267 WARNING [train.py:1204] (2/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_363-185703-0 from training. Duration: 0.95 2023-10-03 22:42:45,698 WARNING [train.py:1204] (2/4) Exclude cut with ID _41817_210_18835_1_1533434477160_7423049_265-76216-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 22:42:51,573 WARNING [train.py:1204] (2/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_841-341679-0 from training. Duration: 0.58 2023-10-03 22:42:52,970 WARNING [train.py:1204] (2/4) Exclude cut with ID _55429_210_10626_1_1534467588527_7174143_97-335084-0_sp1.1 from training. Duration: 0.8818125 2023-10-03 22:42:56,287 WARNING [train.py:1204] (2/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_690-126306-0_sp0.9 from training. Duration: 0.8666875 2023-10-03 22:42:56,302 WARNING [train.py:1204] (2/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_102-92751-0_sp1.1 from training. Duration: 0.9 2023-10-03 22:42:58,840 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_3787_210_2040_1_1526477544_5478586_603-120324-0_sp1.1 from training. Duration: 0.736375 2023-10-03 22:42:58,950 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_33348_210_5632_1_1533086912_3324030_85-28583-0 from training. Duration: 0.82 2023-10-03 22:43:00,325 WARNING [train.py:1204] (2/4) Exclude cut with ID _60057_210_10640_1_1534903065793_3333150_362-230377-0_sp1.1 from training. Duration: 0.7909375 2023-10-03 22:43:03,362 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.622e+02 1.889e+02 2.078e+02 2.250e+02 3.278e+02, threshold=4.156e+02, percent-clipped=0.0 2023-10-03 22:43:03,434 WARNING [train.py:1204] (2/4) Exclude cut with ID _60113_210_16271_1_1535164212481_4386079_194-146486-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 22:43:03,454 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_53753_210_7234_1_1534320003_3594148_171-277668-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 22:43:03,489 WARNING [train.py:1204] (2/4) Exclude cut with ID _34423_210_8750_1_1533430798456_3330199_251-332788-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 22:43:04,998 WARNING [train.py:1204] (2/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_663-166973-0_sp0.9 from training. Duration: 0.888875 2023-10-03 22:43:07,556 WARNING [train.py:1204] (2/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_927-43827-0 from training. Duration: 0.74 2023-10-03 22:43:07,601 WARNING [train.py:1204] (2/4) Exclude cut with ID _28323_210_16187_1_1532480395667_4100300_306-74561-0_sp1.1 from training. Duration: 0.8545625 2023-10-03 22:43:10,375 WARNING [train.py:1204] (2/4) Exclude cut with ID _15037_210_2253_1_1530752294104_3267650_139-337989-0_sp1.1 from training. Duration: 0.92725 2023-10-03 22:43:10,388 WARNING [train.py:1204] (2/4) Exclude cut with ID _49039_210_6315_1_1533967537541_3846118_460-263670-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 22:43:13,035 WARNING [train.py:1204] (2/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_186-309161-0 from training. Duration: 0.54 2023-10-03 22:43:13,095 WARNING [train.py:1204] (2/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_87-278686-0_sp1.1 from training. Duration: 0.7 2023-10-03 22:43:16,874 WARNING [train.py:1204] (2/4) Exclude cut with ID _27052_210_5986_1_1532329197187_3585009_314-106499-0 from training. Duration: 0.92 2023-10-03 22:43:16,890 WARNING [train.py:1204] (2/4) Exclude cut with ID _3465_210_3525_1_1526175667060_3574049_412-287526-0_sp0.9 from training. Duration: 0.8 2023-10-03 22:43:22,350 WARNING [train.py:1204] (2/4) Exclude cut with ID _22961_210_8254_1_1531976366672_4278180_317-199546-0 from training. Duration: 0.86 2023-10-03 22:43:23,838 WARNING [train.py:1204] (2/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_381-317791-0 from training. Duration: 0.73 2023-10-03 22:43:24,009 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.nonlin_attention.balancer.prob, batch_count=1432506.6666666667, ans=0.125 2023-10-03 22:43:25,733 WARNING [train.py:1204] (2/4) Exclude cut with ID _44010_210_4525_1_1533884262964_4013620_560-310411-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 22:43:25,745 WARNING [train.py:1204] (2/4) Exclude cut with ID _5294_210_1747_1_1527249643928_3696179_342-270278-0_sp0.9 from training. Duration: 0.611125 2023-10-03 22:43:25,770 WARNING [train.py:1204] (2/4) Exclude cut with ID ID0473W0002-918951 from training. Duration: 0.924 2023-10-03 22:43:26,866 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1083-220122-0 from training. Duration: 0.51 2023-10-03 22:43:29,621 WARNING [train.py:1204] (2/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_138-154812-0 from training. Duration: 0.78 2023-10-03 22:43:31,063 WARNING [train.py:1204] (2/4) Exclude cut with ID _44982_210_1511_1_1534312820108_3726029_331-19420-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 22:43:32,672 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.attention_skip_rate, batch_count=1432573.3333333333, ans=0.0 2023-10-03 22:43:33,780 INFO [train.py:1046] (2/4) Epoch 41, batch 2400, loss[loss=0.1624, simple_loss=0.2448, pruned_loss=0.03997, over 24319.00 frames. ], tot_loss[loss=0.1572, simple_loss=0.237, pruned_loss=0.03867, over 4695718.45 frames. ], batch size: 77, lr: 2.48e-03, grad_scale: 16.0 2023-10-03 22:43:35,728 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_2655_210_2087_1_1525688959_3897400_202-147557-0_sp0.9 from training. Duration: 0.9444375 2023-10-03 22:43:38,572 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_23850_210_4238_1_1532232597_1632433_32-185135-0_sp0.9 from training. Duration: 0.9555625 2023-10-03 22:43:38,685 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_14056_210_4163_1_1530604483_7572827_46-93547-0_sp1.1 from training. Duration: 0.863625 2023-10-03 22:43:39,931 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7931_210_1794_1_1528594728_5564059_280-279560-0 from training. Duration: 0.98 2023-10-03 22:43:39,980 WARNING [train.py:1204] (2/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_937-208281-0 from training. Duration: 0.85 2023-10-03 22:43:42,227 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.0.layers.1.whiten, num_groups=1, num_channels=192, metric=3.70 vs. limit=12.0 2023-10-03 22:43:46,170 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_14928_210_5632_1_1530773461_4714000_454-112198-0_sp0.9 from training. Duration: 0.7333125 2023-10-03 22:43:46,171 WARNING [train.py:1204] (2/4) Exclude cut with ID _32704_210_8954_1_1533017488840_6747370_944-153333-0_sp1.1 from training. Duration: 0.963625 2023-10-03 22:43:47,633 WARNING [train.py:1204] (2/4) Exclude cut with ID _14821_210_8750_1_1530680503991_6829359_2-227151-0 from training. Duration: 0.83 2023-10-03 22:43:48,976 WARNING [train.py:1204] (2/4) Exclude cut with ID _28461_210_2253_1_1532771775189_3041299_96-152158-0_sp1.1 from training. Duration: 0.77275 2023-10-03 22:43:49,048 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7837_210_1792_1_1528453445_5841305_654-54195-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 22:43:49,087 WARNING [train.py:1204] (2/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_344-152526-0 from training. Duration: 0.53 2023-10-03 22:43:56,345 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_30167_210_3528_1_1532428012_4772375_480-142230-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 22:43:59,147 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_160-218721-0 from training. Duration: 0.96 2023-10-03 22:44:03,392 WARNING [train.py:1204] (2/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_65-88242-0_sp0.9 from training. Duration: 0.62225 2023-10-03 22:44:08,119 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_34886_210_10633_1_1533203882_1755661_76-156096-0 from training. Duration: 0.87 2023-10-03 22:44:10,908 WARNING [train.py:1204] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_188-38388-0_sp1.1 from training. Duration: 0.92725 2023-10-03 22:44:11,024 WARNING [train.py:1204] (2/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_81-289335-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 22:44:15,866 WARNING [train.py:1204] (2/4) Exclude cut with ID _59493_210_17282_1_1535078419077_3225879_110-8778-0_sp1.1 from training. Duration: 0.936375 2023-10-03 22:44:15,913 WARNING [train.py:1204] (2/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_188-260011-0 from training. Duration: 0.73 2023-10-03 22:44:15,955 WARNING [train.py:1204] (2/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_453-133983-0_sp0.9 from training. Duration: 0.8666875 2023-10-03 22:44:24,576 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8054_210_7927_1_1528627721_4984094_553-17379-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 22:44:27,317 WARNING [train.py:1204] (2/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_341-171072-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 22:44:30,046 WARNING [train.py:1204] (2/4) Exclude cut with ID _30800_210_22382_1_1532499347904_4515959_265-257286-0_sp1.1 from training. Duration: 0.963625 2023-10-03 22:44:30,133 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_403-39619-0_sp0.9 from training. Duration: 0.9333125 2023-10-03 22:44:30,137 WARNING [train.py:1204] (2/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_464-33931-0_sp0.9 from training. Duration: 0.6 2023-10-03 22:44:30,156 WARNING [train.py:1204] (2/4) Exclude cut with ID _60804_210_9642_1_1534831199859_7268299_632-143863-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 22:44:30,165 WARNING [train.py:1204] (2/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_431-310997-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 22:44:31,605 WARNING [train.py:1204] (2/4) Exclude cut with ID _31649_210_6228_1_1532584608063_4091078_114-263860-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 22:44:31,624 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6961_210_2087_1_1527937168_6815011_562-165518-0_sp0.9 from training. Duration: 0.8555625 2023-10-03 22:44:37,590 WARNING [train.py:1204] (2/4) Exclude cut with ID _27052_210_5986_1_1532329197187_3585009_338-325870-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 22:44:37,640 WARNING [train.py:1204] (2/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_469-94673-0_sp1.1 from training. Duration: 0.7181875 2023-10-03 22:44:37,661 WARNING [train.py:1204] (2/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_259-34038-0 from training. Duration: 0.74 2023-10-03 22:44:38,972 WARNING [train.py:1204] (2/4) Exclude cut with ID _14922_210_6228_1_1530707435273_3693310_388-129552-0 from training. Duration: 0.96 2023-10-03 22:44:40,451 WARNING [train.py:1204] (2/4) Exclude cut with ID _28394_210_8913_1_1532430049518_3650118_342-251455-0_sp1.1 from training. Duration: 0.936375 2023-10-03 22:44:40,469 WARNING [train.py:1204] (2/4) Exclude cut with ID _53731_210_7994_1_1534553979244_3757214_8-191390-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 22:44:41,787 WARNING [train.py:1204] (2/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_613-232856-0 from training. Duration: 0.54 2023-10-03 22:44:41,848 WARNING [train.py:1204] (2/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_557-349534-0 from training. Duration: 0.91 2023-10-03 22:44:41,863 WARNING [train.py:1204] (2/4) Exclude cut with ID _14224_210_4381_1_1530529062037_4264010_379-184223-0 from training. Duration: 0.85 2023-10-03 22:44:41,867 WARNING [train.py:1204] (2/4) Exclude cut with ID IC0125W0001-61807 from training. Duration: 0.966 2023-10-03 22:44:43,250 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_229-45300-0 from training. Duration: 0.57 2023-10-03 22:44:45,218 WARNING [train.py:1204] (2/4) Exclude cut with ID _30304_210_6319_1_1532476827261_7582060_1069-24743-0_sp1.1 from training. Duration: 0.92725 2023-10-03 22:44:47,071 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_41452_210_7291_1_1533437687_3941281_323-172873-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 22:44:47,088 WARNING [train.py:1204] (2/4) Exclude cut with ID _37086_210_9038_1_1532999540891_7021250_802-226709-0_sp1.1 from training. Duration: 0.97275 2023-10-03 22:44:48,328 INFO [train.py:1046] (2/4) Epoch 41, batch 2450, loss[loss=0.141, simple_loss=0.2053, pruned_loss=0.0384, over 22820.00 frames. ], tot_loss[loss=0.1561, simple_loss=0.2352, pruned_loss=0.03843, over 4692755.97 frames. ], batch size: 322, lr: 2.48e-03, grad_scale: 16.0 2023-10-03 22:44:48,472 WARNING [train.py:1204] (2/4) Exclude cut with ID IC0144W0500-71767 from training. Duration: 0.931875 2023-10-03 22:44:49,884 WARNING [train.py:1204] (2/4) Exclude cut with ID _39846_210_12651_1_1533605310265_2115212_233-129699-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 22:44:50,073 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.balancer2.prob, batch_count=1432906.6666666667, ans=0.125 2023-10-03 22:44:51,201 WARNING [train.py:1204] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_574-45541-0_sp0.9 from training. Duration: 0.72225 2023-10-03 22:44:54,315 WARNING [train.py:1204] (2/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_372-298093-0_sp1.1 from training. Duration: 0.72725 2023-10-03 22:44:54,335 WARNING [train.py:1204] (2/4) Exclude cut with ID _62574_210_2655_1_1535180407058_3933249_34-26343-0_sp1.1 from training. Duration: 0.97275 2023-10-03 22:44:58,400 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_512-256238-0_sp1.1 from training. Duration: 0.963625 2023-10-03 22:44:58,405 WARNING [train.py:1204] (2/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_196-55502-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 22:44:58,495 WARNING [train.py:1204] (2/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_195-167011-0 from training. Duration: 0.75 2023-10-03 22:45:04,004 WARNING [train.py:1204] (2/4) Exclude cut with ID _13678_210_7497_1_1530530069226_3463170_345-230416-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 22:45:04,020 WARNING [train.py:1204] (2/4) Exclude cut with ID _51078_210_9070_1_1534161557424_3805250_225-327809-0_sp1.1 from training. Duration: 0.963625 2023-10-03 22:45:08,184 WARNING [train.py:1204] (2/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_18-189198-0_sp1.1 from training. Duration: 0.8181875 2023-10-03 22:45:08,201 WARNING [train.py:1204] (2/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_329-82235-0_sp1.1 from training. Duration: 0.7454375 2023-10-03 22:45:08,215 WARNING [train.py:1204] (2/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_726-270780-0_sp0.9 from training. Duration: 0.9555625 2023-10-03 22:45:10,110 WARNING [train.py:1204] (2/4) Exclude cut with ID _66873_210_5517_1_1535454136506_4293370_654-52713-0 from training. Duration: 0.77 2023-10-03 22:45:12,965 WARNING [train.py:1204] (2/4) Exclude cut with ID _20455_210_5219_1_1531565996775_8098100_191-16006-0_sp1.1 from training. Duration: 0.963625 2023-10-03 22:45:14,320 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_3171_210_2087_1_1526109084_2330007_66-260583-0_sp1.1 from training. Duration: 0.6545625 2023-10-03 22:45:14,366 WARNING [train.py:1204] (2/4) Exclude cut with ID _38670_210_15379_1_1533448810659_4391059_63-129006-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 22:45:14,543 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.attention_skip_rate, batch_count=1432973.3333333333, ans=0.0 2023-10-03 22:45:17,637 WARNING [train.py:1204] (2/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_393-57888-0_sp0.9 from training. Duration: 0.711125 2023-10-03 22:45:19,302 WARNING [train.py:1204] (2/4) Exclude cut with ID _6275_210_2084_1_1528110062213_7669049_857-298942-0_sp0.9 from training. Duration: 0.9444375 2023-10-03 22:45:19,411 WARNING [train.py:1204] (2/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_239-22076-0_sp0.9 from training. Duration: 0.9444375 2023-10-03 22:45:20,730 WARNING [train.py:1204] (2/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_424-227947-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 22:45:22,102 WARNING [train.py:1204] (2/4) Exclude cut with ID _14069_210_5632_1_1530512692033_4803390_95-1896-0 from training. Duration: 0.98 2023-10-03 22:45:24,048 WARNING [train.py:1204] (2/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_96-244299-0_sp0.9 from training. Duration: 0.97775 2023-10-03 22:45:24,218 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.ff2_skip_rate, batch_count=1433040.0, ans=0.0 2023-10-03 22:45:27,070 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.feed_forward2.hidden_balancer.prob, batch_count=1433040.0, ans=0.125 2023-10-03 22:45:32,400 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.566e+02 2.017e+02 2.173e+02 2.483e+02 3.583e+02, threshold=4.346e+02, percent-clipped=0.0 2023-10-03 22:45:32,487 WARNING [train.py:1204] (2/4) Exclude cut with ID _46683_210_9642_1_1533730913492_4591116_562-319825-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 22:45:33,858 WARNING [train.py:1204] (2/4) Exclude cut with ID _64639_210_5816_1_1535367619437_3528000_284-173474-0_sp1.1 from training. Duration: 0.963625 2023-10-03 22:45:33,885 WARNING [train.py:1204] (2/4) Exclude cut with ID _29529_210_7234_1_1532393961455_3977460_497-100133-0_sp1.1 from training. Duration: 0.936375 2023-10-03 22:45:33,911 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_12868_210_6753_1_1530701932_530275_18-76322-0_sp0.9 from training. Duration: 0.9666875 2023-10-03 22:45:33,941 WARNING [train.py:1204] (2/4) Exclude cut with ID _50093_210_4220_1_1534417141207_3432299_116-303447-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 22:45:34,087 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.conv_module1.balancer1.prob, batch_count=1433106.6666666667, ans=0.125 2023-10-03 22:45:34,164 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.conv_module2.balancer1.prob, batch_count=1433106.6666666667, ans=0.125 2023-10-03 22:45:35,307 WARNING [train.py:1204] (2/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_44-79427-0_sp1.1 from training. Duration: 0.8909375 2023-10-03 22:45:35,364 WARNING [train.py:1204] (2/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_887-249555-0 from training. Duration: 0.77 2023-10-03 22:45:37,400 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.5.encoder.layers.0.nonlin_attention.whiten2, num_groups=1, num_channels=256, metric=3.46 vs. limit=15.0 2023-10-03 22:45:39,988 WARNING [train.py:1204] (2/4) Exclude cut with ID _28461_210_2253_1_1532771775189_3041299_96-152158-0_sp0.9 from training. Duration: 0.9444375 2023-10-03 22:45:40,043 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_14058_210_4163_1_1530690953_6327180_49-210687-0_sp1.1 from training. Duration: 0.8545625 2023-10-03 22:45:42,701 WARNING [train.py:1204] (2/4) Exclude cut with ID _40399_210_4353_1_1533305658194_3279406_351-127903-0_sp1.1 from training. Duration: 0.97275 2023-10-03 22:45:42,713 WARNING [train.py:1204] (2/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_184-206939-0_sp1.1 from training. Duration: 0.936375 2023-10-03 22:45:47,540 WARNING [train.py:1204] (2/4) Exclude cut with ID _14100_210_8341_1_1530529320613_3678069_258-224599-0_sp1.1 from training. Duration: 0.7 2023-10-03 22:45:47,556 WARNING [train.py:1204] (2/4) Exclude cut with ID _6493_210_1747_1_1528459210424_4012470_324-323854-0 from training. Duration: 0.92 2023-10-03 22:45:48,970 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_12868_210_6753_1_1530702479_3610564_151-325626-0_sp1.1 from training. Duration: 0.8818125 2023-10-03 22:45:50,724 WARNING [train.py:1204] (2/4) Exclude cut with ID _14528_210_8254_1_1530669700514_4306939_106-43145-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 22:45:50,744 WARNING [train.py:1204] (2/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_686-54383-0 from training. Duration: 0.87 2023-10-03 22:45:50,793 WARNING [train.py:1204] (2/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_801-102362-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 22:45:50,962 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.bypass.scale_min, batch_count=1433173.3333333333, ans=0.2 2023-10-03 22:45:51,047 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.conv_skip_rate, batch_count=1433173.3333333333, ans=0.0 2023-10-03 22:45:52,043 WARNING [train.py:1204] (2/4) Exclude cut with ID _48677_210_5663_1_1533902405919_3892989_408-40740-0_sp0.9 from training. Duration: 0.92225 2023-10-03 22:45:55,438 WARNING [train.py:1204] (2/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_448-212135-0_sp1.1 from training. Duration: 0.82725 2023-10-03 22:45:56,888 WARNING [train.py:1204] (2/4) Exclude cut with ID _59170_210_6721_1_1534983633960_4203335_320-124047-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 22:45:58,197 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_4692_210_3528_1_1526899755_4323254_107-44401-0_sp1.1 from training. Duration: 0.7818125 2023-10-03 22:46:01,104 WARNING [train.py:1204] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_731-2428-0 from training. Duration: 0.83 2023-10-03 22:46:02,345 INFO [train.py:1046] (2/4) Epoch 41, batch 2500, loss[loss=0.1479, simple_loss=0.2402, pruned_loss=0.02776, over 24443.00 frames. ], tot_loss[loss=0.1558, simple_loss=0.2354, pruned_loss=0.03811, over 4700233.06 frames. ], batch size: 66, lr: 2.48e-03, grad_scale: 8.0 2023-10-03 22:46:03,753 WARNING [train.py:1204] (2/4) Exclude cut with ID _29857_210_20034_1_1532395886753_3798160_335-163562-0_sp1.1 from training. Duration: 0.77275 2023-10-03 22:46:07,902 WARNING [train.py:1204] (2/4) Exclude cut with ID _14832_210_8750_1_1530691351539_3171348_171-275004-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 22:46:11,332 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.conv_module1.balancer1.prob, batch_count=1433240.0, ans=0.125 2023-10-03 22:46:17,532 WARNING [train.py:1204] (2/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_105-328174-0_sp0.9 from training. Duration: 0.8444375 2023-10-03 22:46:17,574 WARNING [train.py:1204] (2/4) Exclude cut with ID _68965_210_1883_1_1535698962272_3497490_164-118421-0_sp1.1 from training. Duration: 0.936375 2023-10-03 22:46:19,031 WARNING [train.py:1204] (2/4) Exclude cut with ID _19896_210_1883_1_1531709022662_6649569_380-24170-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 22:46:19,042 WARNING [train.py:1204] (2/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_505-205270-0_sp1.1 from training. Duration: 0.3454375 2023-10-03 22:46:26,166 WARNING [train.py:1204] (2/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_427-208901-0_sp1.1 from training. Duration: 0.6181875 2023-10-03 22:46:26,217 WARNING [train.py:1204] (2/4) Exclude cut with ID _23818_210_6228_1_1531917063884_3676920_85-326916-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 22:46:29,360 WARNING [train.py:1204] (2/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_101-293367-0_sp1.1 from training. Duration: 0.57275 2023-10-03 22:46:29,368 WARNING [train.py:1204] (2/4) Exclude cut with ID _5409_210_1388_1_1527292850189_5865191_178-127970-0_sp1.1 from training. Duration: 0.5181875 2023-10-03 22:46:29,440 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_403-39619-0 from training. Duration: 0.84 2023-10-03 22:46:30,782 WARNING [train.py:1204] (2/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_427-87841-0_sp1.1 from training. Duration: 0.963625 2023-10-03 22:46:30,819 WARNING [train.py:1204] (2/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_286-68181-0_sp1.1 from training. Duration: 0.97275 2023-10-03 22:46:32,129 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6673_210_3273_1_1527767790_3173240_327-306185-0 from training. Duration: 0.83 2023-10-03 22:46:32,151 WARNING [train.py:1204] (2/4) Exclude cut with ID _3675_210_4557_1_1526639934408_4182089_106-203713-0_sp1.1 from training. Duration: 0.963625 2023-10-03 22:46:33,532 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_53071_210_16554_1_1534493434_1276796_130-36076-0 from training. Duration: 0.95 2023-10-03 22:46:33,568 WARNING [train.py:1204] (2/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_586-93054-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 22:46:36,369 WARNING [train.py:1204] (2/4) Exclude cut with ID _23825_210_13449_1_1531915313526_3497150_262-10571-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 22:46:37,749 WARNING [train.py:1204] (2/4) Exclude cut with ID _28783_210_7943_1_1532671164808_7155479_432-67885-0_sp1.1 from training. Duration: 0.97275 2023-10-03 22:46:39,217 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6287_210_3273_1_1527509506_2653716_182-199887-0_sp0.9 from training. Duration: 0.5666875 2023-10-03 22:46:40,482 WARNING [train.py:1204] (2/4) Exclude cut with ID _3231_210_2106_1_1526205864934_6784220_171-256004-0 from training. Duration: 0.86 2023-10-03 22:46:40,529 WARNING [train.py:1204] (2/4) Exclude cut with ID _69051_210_1531_1_1535885566901_3829538_332-92090-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 22:46:42,436 WARNING [train.py:1204] (2/4) Exclude cut with ID _3675_210_4557_1_1526639934408_4182089_104-324125-0_sp1.1 from training. Duration: 0.963625 2023-10-03 22:46:45,875 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_41764_210_2521_1_1533686110_4089313_501-2119-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 22:46:46,046 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.conv_skip_rate, batch_count=1433440.0, ans=0.0 2023-10-03 22:46:50,383 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_14056_210_4163_1_1530604483_7572827_56-100988-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 22:46:53,161 WARNING [train.py:1204] (2/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_398-239819-0_sp1.1 from training. Duration: 0.9 2023-10-03 22:46:57,885 WARNING [train.py:1204] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_74-48717-0_sp1.1 from training. Duration: 0.52725 2023-10-03 22:47:00,663 WARNING [train.py:1204] (2/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_177-49092-0 from training. Duration: 0.91 2023-10-03 22:47:00,681 WARNING [train.py:1204] (2/4) Exclude cut with ID _62337_210_12392_1_1535009273105_3412229_44-176353-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 22:47:00,689 WARNING [train.py:1204] (2/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_110-130961-0_sp1.1 from training. Duration: 0.42725 2023-10-03 22:47:02,070 WARNING [train.py:1204] (2/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_37-189262-0_sp1.1 from training. Duration: 0.8909375 2023-10-03 22:47:02,071 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_3172_210_2087_1_1526292929_3987865_197-97762-0_sp0.9 from training. Duration: 0.8333125 2023-10-03 22:47:04,688 WARNING [train.py:1204] (2/4) Exclude cut with ID IC0677W0065-334897 from training. Duration: 0.896 2023-10-03 22:47:04,689 WARNING [train.py:1204] (2/4) Exclude cut with ID IC0677W0080-334912 from training. Duration: 0.768 2023-10-03 22:47:04,694 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_120-41668-0 from training. Duration: 0.7 2023-10-03 22:47:06,282 WARNING [train.py:1204] (2/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_460-205287-0_sp1.1 from training. Duration: 0.963625 2023-10-03 22:47:06,581 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.attention_skip_rate, batch_count=1433506.6666666667, ans=0.0 2023-10-03 22:47:07,614 WARNING [train.py:1204] (2/4) Exclude cut with ID _14632_210_9988_1_1530691279392_3923200_102-204477-0 from training. Duration: 0.97 2023-10-03 22:47:07,622 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_34815_210_6388_1_1533381320_3571903_279-305565-0 from training. Duration: 0.92 2023-10-03 22:47:08,943 WARNING [train.py:1204] (2/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_91-148831-0_sp1.1 from training. Duration: 0.9 2023-10-03 22:47:10,290 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6291_210_3273_1_1527683649_1078498_82-221196-0 from training. Duration: 0.76 2023-10-03 22:47:13,955 WARNING [train.py:1204] (2/4) Exclude cut with ID _38020_210_5986_1_1533112252259_7006089_493-102745-0 from training. Duration: 0.98 2023-10-03 22:47:16,699 INFO [train.py:1046] (2/4) Epoch 41, batch 2550, loss[loss=0.1438, simple_loss=0.2211, pruned_loss=0.03319, over 24464.00 frames. ], tot_loss[loss=0.1558, simple_loss=0.2356, pruned_loss=0.03804, over 4702048.96 frames. ], batch size: 58, lr: 2.48e-03, grad_scale: 8.0 2023-10-03 22:47:17,242 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.5.encoder.layers.1.self_attn1.whiten, num_groups=1, num_channels=256, metric=13.20 vs. limit=22.5 2023-10-03 22:47:18,683 WARNING [train.py:1204] (2/4) Exclude cut with ID _61133_210_9969_1_1535196777227_3696409_40-93442-0_sp1.1 from training. Duration: 0.92725 2023-10-03 22:47:20,058 WARNING [train.py:1204] (2/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_45-258852-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 22:47:20,094 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_53071_210_16554_1_1534493434_1276796_130-36076-0_sp1.1 from training. Duration: 0.863625 2023-10-03 22:47:20,292 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.0.conv_skip_rate, batch_count=1433573.3333333333, ans=0.0 2023-10-03 22:47:21,466 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8694_210_6400_1_1529065754_3260756_332-13553-0_sp1.1 from training. Duration: 0.92725 2023-10-03 22:47:21,557 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_405-339986-0 from training. Duration: 0.51 2023-10-03 22:47:22,812 WARNING [train.py:1204] (2/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_927-43827-0_sp1.1 from training. Duration: 0.67275 2023-10-03 22:47:26,874 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_397-147196-0 from training. Duration: 0.8 2023-10-03 22:47:28,325 WARNING [train.py:1204] (2/4) Exclude cut with ID 3557-8342-0013-54691-0_sp1.1 from training. Duration: 0.836375 2023-10-03 22:47:30,303 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_47438_210_5399_1_1533797471_4513356_626-127722-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 22:47:33,003 WARNING [train.py:1204] (2/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_256-323883-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 22:47:33,008 WARNING [train.py:1204] (2/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_839-173017-0_sp0.9 from training. Duration: 0.4555625 2023-10-03 22:47:33,053 WARNING [train.py:1204] (2/4) Exclude cut with ID _5289_210_2084_1_1527678202736_3779299_392-261988-0_sp1.1 from training. Duration: 0.5909375 2023-10-03 22:47:33,088 WARNING [train.py:1204] (2/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_119-224384-0_sp1.1 from training. Duration: 0.936375 2023-10-03 22:47:34,513 WARNING [train.py:1204] (2/4) Exclude cut with ID _57523_210_1856_1_1534766294545_3732989_262-267933-0_sp1.1 from training. Duration: 0.963625 2023-10-03 22:47:37,294 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_35361_210_4238_1_1533262810_7793476_675-87012-0_sp0.9 from training. Duration: 0.988875 2023-10-03 22:47:37,312 WARNING [train.py:1204] (2/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_17-90541-0 from training. Duration: 0.94 2023-10-03 22:47:37,355 WARNING [train.py:1204] (2/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_535-14410-0_sp1.1 from training. Duration: 0.42725 2023-10-03 22:47:37,361 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_24673_210_6400_1_1531968633_778659_94-154511-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 22:47:37,366 WARNING [train.py:1204] (2/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_212-10501-0 from training. Duration: 0.94 2023-10-03 22:47:49,869 WARNING [train.py:1204] (2/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_383-285196-0_sp1.1 from training. Duration: 0.8818125 2023-10-03 22:47:53,943 WARNING [train.py:1204] (2/4) Exclude cut with ID _45560_210_21982_1_1533812603951_3584999_238-47020-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 22:47:53,958 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_21500_210_2054_1_1532149310_2716758_278-18053-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 22:47:53,965 WARNING [train.py:1204] (2/4) Exclude cut with ID _56235_210_10640_1_1534557335235_4161099_453-9626-0_sp1.1 from training. Duration: 0.936375 2023-10-03 22:47:54,685 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.2.encoder.layers.2.self_attn2.whiten, num_groups=1, num_channels=384, metric=16.62 vs. limit=22.5 2023-10-03 22:47:55,454 WARNING [train.py:1204] (2/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_153-97441-0_sp1.1 from training. Duration: 0.5090625 2023-10-03 22:48:00,988 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.666e+02 2.004e+02 2.232e+02 2.565e+02 3.401e+02, threshold=4.465e+02, percent-clipped=0.0 2023-10-03 22:48:01,791 WARNING [train.py:1204] (2/4) Exclude cut with ID _67725_210_27767_1_1535608756671_5105359_431-120895-0_sp1.1 from training. Duration: 0.963625 2023-10-03 22:48:04,289 WARNING [train.py:1204] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_574-45541-0_sp1.1 from training. Duration: 0.5909375 2023-10-03 22:48:05,536 WARNING [train.py:1204] (2/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_162-103620-0_sp1.1 from training. Duration: 0.8454375 2023-10-03 22:48:05,555 WARNING [train.py:1204] (2/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_734-129761-0_sp1.1 from training. Duration: 0.7454375 2023-10-03 22:48:06,838 WARNING [train.py:1204] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_620-276793-0_sp1.1 from training. Duration: 0.536375 2023-10-03 22:48:06,869 WARNING [train.py:1204] (2/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_153-97441-0_sp0.9 from training. Duration: 0.62225 2023-10-03 22:48:07,216 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=1433773.3333333333, ans=0.1 2023-10-03 22:48:09,813 WARNING [train.py:1204] (2/4) Exclude cut with ID _12733_210_8341_1_1530167424556_3646380_523-85621-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 22:48:09,840 WARNING [train.py:1204] (2/4) Exclude cut with ID _64048_210_10626_1_1535198382802_3835179_352-179834-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 22:48:15,053 WARNING [train.py:1204] (2/4) Exclude cut with ID _55162_210_17071_1_1534408248583_3753480_43-242364-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 22:48:15,069 WARNING [train.py:1204] (2/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_661-264857-0 from training. Duration: 0.93 2023-10-03 22:48:15,076 WARNING [train.py:1204] (2/4) Exclude cut with ID _6274_210_2084_1_1527937583837_7494020_325-237800-0_sp1.1 from training. Duration: 0.97275 2023-10-03 22:48:15,132 WARNING [train.py:1204] (2/4) Exclude cut with ID _55328_210_27767_1_1534580973260_4088170_385-171459-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 22:48:16,466 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_3780_210_2087_1_1526467389_2145572_176-107140-0_sp1.1 from training. Duration: 0.62725 2023-10-03 22:48:16,773 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.bypass.scale_min, batch_count=1433840.0, ans=0.2 2023-10-03 22:48:17,829 WARNING [train.py:1204] (2/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_375-244976-0_sp1.1 from training. Duration: 0.6818125 2023-10-03 22:48:19,403 WARNING [train.py:1204] (2/4) Exclude cut with ID _35941_210_15379_1_1532995294515_4023089_309-33162-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 22:48:25,372 WARNING [train.py:1204] (2/4) Exclude cut with ID _14459_210_2228_1_1531443641946_7516700_1185-22038-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 22:48:26,024 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.2.encoder.layers.2.self_attn2.whiten, num_groups=1, num_channels=384, metric=17.12 vs. limit=22.5 2023-10-03 22:48:27,961 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_1197-69460-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 22:48:29,492 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9012W0022-501157 from training. Duration: 0.896 2023-10-03 22:48:30,738 INFO [train.py:1046] (2/4) Epoch 41, batch 2600, loss[loss=0.1656, simple_loss=0.2591, pruned_loss=0.03603, over 24453.00 frames. ], tot_loss[loss=0.1566, simple_loss=0.2371, pruned_loss=0.03806, over 4709611.45 frames. ], batch size: 69, lr: 2.48e-03, grad_scale: 8.0 2023-10-03 22:48:33,551 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9023W0009-506302 from training. Duration: 0.896 2023-10-03 22:48:33,565 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_2821_210_2715_1_1526108385_3763560_73-296673-0_sp1.1 from training. Duration: 0.7545625 2023-10-03 22:48:33,608 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9025W0113-507442 from training. Duration: 0.896 2023-10-03 22:48:35,401 WARNING [train.py:1204] (2/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_739-2098-0 from training. Duration: 0.92 2023-10-03 22:48:35,412 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9029W0332-509722 from training. Duration: 0.768 2023-10-03 22:48:38,699 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.5.encoder.layers.1.self_attn_weights.whiten_keys, num_groups=4, num_channels=128, metric=1.58 vs. limit=6.0 2023-10-03 22:48:39,509 WARNING [train.py:1204] (2/4) Exclude cut with ID _16908_210_5724_1_1531119157248_3772700_68-101953-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 22:48:39,524 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9042W0011-515617 from training. Duration: 0.768 2023-10-03 22:48:42,124 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_3019_210_2028_1_1526044245_3264865_191-72235-0 from training. Duration: 0.97 2023-10-03 22:48:43,545 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9052W0006-520792 from training. Duration: 0.896 2023-10-03 22:48:43,759 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.feed_forward1.out_proj.dropout_p, batch_count=1433973.3333333333, ans=0.1 2023-10-03 22:48:44,933 WARNING [train.py:1204] (2/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_245-174553-0_sp1.1 from training. Duration: 0.8 2023-10-03 22:48:46,864 WARNING [train.py:1204] (2/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_202-67017-0 from training. Duration: 0.82 2023-10-03 22:48:47,094 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.attention_skip_rate, batch_count=1433973.3333333333, ans=0.0 2023-10-03 22:48:48,273 WARNING [train.py:1204] (2/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_304-59227-0 from training. Duration: 0.77 2023-10-03 22:48:49,610 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_3133_210_2087_1_1526106446_2324690_246-125000-0_sp0.9 from training. Duration: 0.688875 2023-10-03 22:48:49,650 WARNING [train.py:1204] (2/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_243-42116-0 from training. Duration: 0.73 2023-10-03 22:48:51,108 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9087W0121-538492 from training. Duration: 0.896 2023-10-03 22:48:52,890 WARNING [train.py:1204] (2/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_120-76843-0 from training. Duration: 0.55 2023-10-03 22:48:58,436 WARNING [train.py:1204] (2/4) Exclude cut with ID _7852_210_5219_1_1528286471316_8139400_850-304621-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 22:48:59,105 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.nonlin_attention.whiten2.whitening_limit, batch_count=1434040.0, ans=15.0 2023-10-03 22:48:59,741 WARNING [train.py:1204] (2/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_154-285415-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 22:48:59,752 WARNING [train.py:1204] (2/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_538-278194-0_sp1.1 from training. Duration: 0.92725 2023-10-03 22:48:59,753 WARNING [train.py:1204] (2/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_119-207162-0 from training. Duration: 0.71 2023-10-03 22:49:00,026 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.balancer2.prob, batch_count=1434040.0, ans=0.125 2023-10-03 22:49:02,489 WARNING [train.py:1204] (2/4) Exclude cut with ID _2334_210_2667_1_1525608238880_4705129_459-104997-0_sp1.1 from training. Duration: 0.863625 2023-10-03 22:49:05,951 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.4.encoder.layers.2.nonlin_attention.whiten1, num_groups=1, num_channels=288, metric=4.00 vs. limit=10.0 2023-10-03 22:49:06,664 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9147W0019-567847 from training. Duration: 0.768 2023-10-03 22:49:09,298 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.3.encoder.layers.2.self_attn2.whiten, num_groups=1, num_channels=512, metric=11.02 vs. limit=22.5 2023-10-03 22:49:11,338 WARNING [train.py:1204] (2/4) Exclude cut with ID _69884_210_9771_1_1535846392818_4166169_228-189439-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 22:49:11,363 WARNING [train.py:1204] (2/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_196-77172-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 22:49:12,758 WARNING [train.py:1204] (2/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_625-338546-0 from training. Duration: 0.88 2023-10-03 22:49:12,809 WARNING [train.py:1204] (2/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_193-327630-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 22:49:12,810 WARNING [train.py:1204] (2/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_391-24006-0_sp1.1 from training. Duration: 0.92725 2023-10-03 22:49:13,343 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.4.encoder.layers.2.feed_forward2.out_whiten, num_groups=1, num_channels=384, metric=9.21 vs. limit=15.0 2023-10-03 22:49:14,121 WARNING [train.py:1204] (2/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_197-28164-0 from training. Duration: 0.8 2023-10-03 22:49:17,355 WARNING [train.py:1204] (2/4) Exclude cut with ID _8395_210_1531_1_1528597871523_4411579_431-132470-0_sp1.1 from training. Duration: 0.67275 2023-10-03 22:49:17,390 WARNING [train.py:1204] (2/4) Exclude cut with ID _35350_210_16040_1_1533040191183_4075299_465-248326-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 22:49:18,711 WARNING [train.py:1204] (2/4) Exclude cut with ID _16756_210_4915_1_1531047803763_7104259_101-141922-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 22:49:22,070 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9212W0005-600172 from training. Duration: 0.896 2023-10-03 22:49:22,095 WARNING [train.py:1204] (2/4) Exclude cut with ID _63186_210_2084_1_1535414479705_3908649_378-166820-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 22:49:23,314 WARNING [train.py:1204] (2/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_52-211683-0_sp1.1 from training. Duration: 0.8181875 2023-10-03 22:49:23,680 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.conv_module2.balancer1.prob, batch_count=1434106.6666666667, ans=0.125 2023-10-03 22:49:27,513 WARNING [train.py:1204] (2/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_1147-237729-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 22:49:28,796 WARNING [train.py:1204] (2/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_234-14174-0_sp0.9 from training. Duration: 0.911125 2023-10-03 22:49:28,815 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_454-276429-0 from training. Duration: 0.87 2023-10-03 22:49:30,192 WARNING [train.py:1204] (2/4) Exclude cut with ID _53713_210_24116_1_1534314487762_7416751_755-165313-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 22:49:32,838 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8278_210_2550_1_1528637099_4220413_436-317072-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 22:49:33,102 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.feed_forward1.hidden_balancer.prob, batch_count=1434173.3333333333, ans=0.125 2023-10-03 22:49:34,177 WARNING [train.py:1204] (2/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_28-324729-0_sp1.1 from training. Duration: 0.8545625 2023-10-03 22:49:38,916 WARNING [train.py:1204] (2/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_326-165900-0 from training. Duration: 0.69 2023-10-03 22:49:38,984 WARNING [train.py:1204] (2/4) Exclude cut with ID _53489_210_6228_1_1534298357169_3835970_96-30246-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 22:49:40,452 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8275_210_4198_1_1528455320_3892571_491-17707-0_sp1.1 from training. Duration: 0.5818125 2023-10-03 22:49:43,104 INFO [train.py:1046] (2/4) Epoch 41, batch 2650, loss[loss=0.1614, simple_loss=0.246, pruned_loss=0.03843, over 23418.00 frames. ], tot_loss[loss=0.1568, simple_loss=0.2373, pruned_loss=0.03812, over 4726408.27 frames. ], batch size: 93, lr: 2.48e-03, grad_scale: 4.0 2023-10-03 22:49:45,139 WARNING [train.py:1204] (2/4) Exclude cut with ID _14348_210_8341_1_1530599434996_3778640_240-232370-0 from training. Duration: 0.84 2023-10-03 22:49:45,158 WARNING [train.py:1204] (2/4) Exclude cut with ID _45012_210_12651_1_1533608435136_5371380_54-112623-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 22:49:46,967 WARNING [train.py:1204] (2/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_328-264776-0_sp1.1 from training. Duration: 0.6090625 2023-10-03 22:49:47,016 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9325W0001-648427 from training. Duration: 0.768 2023-10-03 22:49:47,033 WARNING [train.py:1204] (2/4) Exclude cut with ID _56480_210_4220_1_1534679942079_3075409_49-74717-0_sp1.1 from training. Duration: 0.936375 2023-10-03 22:49:48,943 WARNING [train.py:1204] (2/4) Exclude cut with ID _59500_210_8254_1_1534809610680_3736009_6-241869-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 22:49:51,663 WARNING [train.py:1204] (2/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_172-215932-0_sp0.9 from training. Duration: 0.7333125 2023-10-03 22:49:53,059 WARNING [train.py:1204] (2/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_17-90541-0_sp1.1 from training. Duration: 0.8545625 2023-10-03 22:49:54,454 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_43831_210_2158_1_1533725502_5872791_525-89447-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 22:49:55,774 WARNING [train.py:1204] (2/4) Exclude cut with ID _12302_210_5219_1_1529924446466_8107280_907-182949-0 from training. Duration: 0.92 2023-10-03 22:49:55,794 WARNING [train.py:1204] (2/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_402-102311-0_sp0.9 from training. Duration: 0.7444375 2023-10-03 22:49:57,199 WARNING [train.py:1204] (2/4) Exclude cut with ID _8021_210_7994_1_1528376451358_3653069_179-45206-0_sp0.9 from training. Duration: 0.9555625 2023-10-03 22:49:59,917 WARNING [train.py:1204] (2/4) Exclude cut with ID _40137_210_12210_1_1533290484046_3795180_249-187059-0 from training. Duration: 0.88 2023-10-03 22:50:00,141 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.self_attn_weights.pos_emb_skip_rate, batch_count=1434306.6666666667, ans=0.0 2023-10-03 22:50:01,332 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9653W0194-675472 from training. Duration: 0.9658125 2023-10-03 22:50:04,057 WARNING [train.py:1204] (2/4) Exclude cut with ID _51332_210_9988_1_1534244371836_3801270_336-255604-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 22:50:04,340 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.ff2_skip_rate, batch_count=1434306.6666666667, ans=0.0 2023-10-03 22:50:05,626 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_20120_210_6753_1_1531643035_5482859_510-96996-0 from training. Duration: 0.87 2023-10-03 22:50:06,935 WARNING [train.py:1204] (2/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_1167-20437-0_sp1.1 from training. Duration: 0.92725 2023-10-03 22:50:08,323 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_3475_210_2028_1_1526193578_3622569_265-276625-0 from training. Duration: 0.76 2023-10-03 22:50:11,735 WARNING [train.py:1204] (2/4) Exclude cut with ID _14860_210_4915_1_1530702020340_7343240_470-172418-0_sp1.1 from training. Duration: 0.97275 2023-10-03 22:50:11,743 WARNING [train.py:1204] (2/4) Exclude cut with ID _5444_210_2151_1_1527418808464_5312210_581-315411-0_sp1.1 from training. Duration: 0.563625 2023-10-03 22:50:13,128 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_34450_210_9547_1_1533099443_4109050_519-342439-0_sp1.1 from training. Duration: 0.97275 2023-10-03 22:50:13,154 WARNING [train.py:1204] (2/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_50-175961-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 22:50:17,938 WARNING [train.py:1204] (2/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_372-6869-0 from training. Duration: 0.44 2023-10-03 22:50:17,943 WARNING [train.py:1204] (2/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_3-237434-0 from training. Duration: 0.76 2023-10-03 22:50:22,367 WARNING [train.py:1204] (2/4) Exclude cut with ID _8772_210_1850_1_1528635595972_3664011_247-336448-0_sp1.1 from training. Duration: 0.67275 2023-10-03 22:50:26,497 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_427-78097-0 from training. Duration: 0.8 2023-10-03 22:50:26,512 WARNING [train.py:1204] (2/4) Exclude cut with ID _69884_210_9771_1_1535846392818_4166169_466-252727-0_sp1.1 from training. Duration: 0.97275 2023-10-03 22:50:26,582 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_15972_210_6400_1_1530877934_3405692_316-35478-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 22:50:27,867 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_809-17156-0_sp0.9 from training. Duration: 0.811125 2023-10-03 22:50:27,919 WARNING [train.py:1204] (2/4) Exclude cut with ID _57572_210_5517_1_1534921219624_3809484_594-263474-0_sp1.1 from training. Duration: 0.936375 2023-10-03 22:50:29,236 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.637e+02 1.964e+02 2.149e+02 2.506e+02 3.247e+02, threshold=4.298e+02, percent-clipped=0.0 2023-10-03 22:50:29,341 WARNING [train.py:1204] (2/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_190-228204-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 22:50:32,078 WARNING [train.py:1204] (2/4) Exclude cut with ID _19514_210_9259_1_1531530071330_5084230_117-21193-0_sp1.1 from training. Duration: 0.936375 2023-10-03 22:50:33,406 WARNING [train.py:1204] (2/4) Exclude cut with ID _69584_210_5219_1_1535785133775_4044450_190-21315-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 22:50:33,475 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_53071_210_16554_1_1534493434_1276796_184-302050-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 22:50:33,516 WARNING [train.py:1204] (2/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_120-76843-0_sp1.1 from training. Duration: 0.5 2023-10-03 22:50:34,875 WARNING [train.py:1204] (2/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_318-162017-0_sp1.1 from training. Duration: 0.82725 2023-10-03 22:50:34,974 WARNING [train.py:1204] (2/4) Exclude cut with ID _46067_210_4220_1_1534125571066_3307450_79-46864-0_sp1.1 from training. Duration: 0.92725 2023-10-03 22:50:36,316 WARNING [train.py:1204] (2/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_358-215558-0_sp1.1 from training. Duration: 0.8454375 2023-10-03 22:50:37,681 WARNING [train.py:1204] (2/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_621-66764-0_sp1.1 from training. Duration: 0.92725 2023-10-03 22:50:39,140 WARNING [train.py:1204] (2/4) Exclude cut with ID _42863_210_9070_1_1533902398788_3696920_338-156851-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 22:50:39,163 WARNING [train.py:1204] (2/4) Exclude cut with ID _5289_210_2084_1_1527678202736_3779299_392-261988-0_sp0.9 from training. Duration: 0.72225 2023-10-03 22:50:43,790 WARNING [train.py:1204] (2/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_409-84682-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 22:50:45,128 WARNING [train.py:1204] (2/4) Exclude cut with ID _9235_210_5219_1_1528891154253_8046120_30-326681-0_sp0.9 from training. Duration: 0.92225 2023-10-03 22:50:45,132 WARNING [train.py:1204] (2/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_389-184695-0_sp1.1 from training. Duration: 0.92725 2023-10-03 22:50:45,181 WARNING [train.py:1204] (2/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_442-320317-0 from training. Duration: 0.82 2023-10-03 22:50:47,983 WARNING [train.py:1204] (2/4) Exclude cut with ID _44778_210_2054_1_1533798127203_7443090_756-29911-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 22:50:50,020 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_401-112036-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 22:50:51,974 WARNING [train.py:1204] (2/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_181-308827-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 22:50:53,320 WARNING [train.py:1204] (2/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_332-258725-0_sp1.1 from training. Duration: 0.963625 2023-10-03 22:50:53,396 WARNING [train.py:1204] (2/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_188-260011-0_sp0.9 from training. Duration: 0.811125 2023-10-03 22:50:54,811 WARNING [train.py:1204] (2/4) Exclude cut with ID _49039_210_6315_1_1533967537541_3846118_99-302239-0_sp1.1 from training. Duration: 0.963625 2023-10-03 22:50:54,951 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.1.conv_module1.balancer2.prob, batch_count=1434506.6666666667, ans=0.125 2023-10-03 22:50:55,222 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.5.encoder.layers.1.conv_module2.whiten, num_groups=1, num_channels=256, metric=6.19 vs. limit=15.0 2023-10-03 22:50:56,233 WARNING [train.py:1204] (2/4) Exclude cut with ID _4122_210_2084_1_1526713164417_4353280_312-294828-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 22:50:56,239 WARNING [train.py:1204] (2/4) Exclude cut with ID _14922_210_6228_1_1530707435273_3693310_303-228738-0 from training. Duration: 0.84 2023-10-03 22:50:57,390 INFO [train.py:1046] (2/4) Epoch 41, batch 2700, loss[loss=0.1603, simple_loss=0.252, pruned_loss=0.03433, over 24351.00 frames. ], tot_loss[loss=0.1573, simple_loss=0.238, pruned_loss=0.03828, over 4734827.45 frames. ], batch size: 74, lr: 2.48e-03, grad_scale: 8.0 2023-10-03 22:50:58,852 WARNING [train.py:1204] (2/4) Exclude cut with ID _14349_210_8341_1_1530615710818_3442470_115-285975-0_sp0.9 from training. Duration: 0.9555625 2023-10-03 22:50:58,977 WARNING [train.py:1204] (2/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_125-144174-0_sp1.1 from training. Duration: 0.3545625 2023-10-03 22:51:01,705 WARNING [train.py:1204] (2/4) Exclude cut with ID _53565_210_10635_1_1534337218335_3820259_351-54570-0_sp1.1 from training. Duration: 0.97275 2023-10-03 22:51:01,725 WARNING [train.py:1204] (2/4) Exclude cut with ID _28317_210_12742_1_1532480302381_8510325_410-40065-0_sp1.1 from training. Duration: 0.963625 2023-10-03 22:51:01,754 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_38965_210_4852_1_1533174906_3484735_167-339442-0_sp1.1 from training. Duration: 0.963625 2023-10-03 22:51:03,036 WARNING [train.py:1204] (2/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_265-164486-0_sp1.1 from training. Duration: 0.87275 2023-10-03 22:51:03,048 WARNING [train.py:1204] (2/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_725-182484-0_sp1.1 from training. Duration: 0.92725 2023-10-03 22:51:03,078 WARNING [train.py:1204] (2/4) Exclude cut with ID _28317_210_12742_1_1532480302381_8510325_607-213951-0_sp0.9 from training. Duration: 0.9333125 2023-10-03 22:51:03,098 WARNING [train.py:1204] (2/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_605-133395-0_sp1.1 from training. Duration: 0.6 2023-10-03 22:51:04,406 WARNING [train.py:1204] (2/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_351-294650-0 from training. Duration: 0.63 2023-10-03 22:51:04,461 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_14953_210_2151_1_1530701710_5616165_710-165400-0_sp1.1 from training. Duration: 0.7909375 2023-10-03 22:51:07,038 WARNING [train.py:1204] (2/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_237-58433-0_sp1.1 from training. Duration: 0.67275 2023-10-03 22:51:08,389 WARNING [train.py:1204] (2/4) Exclude cut with ID _5190_210_4557_1_1527244368799_373439_28-197030-0_sp1.1 from training. Duration: 0.6545625 2023-10-03 22:51:08,437 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_743-327651-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 22:51:12,053 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.conv_skip_rate, batch_count=1434640.0, ans=0.0 2023-10-03 22:51:13,146 WARNING [train.py:1204] (2/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_76-203867-0_sp0.9 from training. Duration: 0.8 2023-10-03 22:51:13,231 WARNING [train.py:1204] (2/4) Exclude cut with ID _14603_210_4220_1_1530615688441_3097319_274-274798-0 from training. Duration: 0.98 2023-10-03 22:51:13,273 WARNING [train.py:1204] (2/4) Exclude cut with ID _14087_210_2253_1_1530529224512_3014029_226-242140-0_sp1.1 from training. Duration: 0.736375 2023-10-03 22:51:19,390 WARNING [train.py:1204] (2/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_352-59958-0_sp1.1 from training. Duration: 0.7818125 2023-10-03 22:51:19,400 WARNING [train.py:1204] (2/4) Exclude cut with ID _39411_210_5986_1_1533538825030_3343649_191-251819-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 22:51:21,335 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.5.encoder.layers.0.feed_forward3.out_whiten, num_groups=1, num_channels=256, metric=11.78 vs. limit=15.0 2023-10-03 22:51:26,954 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7046_210_6753_1_1527895337_6920917_642-106799-0_sp1.1 from training. Duration: 0.836375 2023-10-03 22:51:26,961 WARNING [train.py:1204] (2/4) Exclude cut with ID _22826_210_9994_1_1531908201522_3279169_412-3442-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 22:51:26,987 WARNING [train.py:1204] (2/4) Exclude cut with ID _60057_210_10640_1_1534903065793_3333150_171-181133-0_sp0.9 from training. Duration: 0.97775 2023-10-03 22:51:26,999 WARNING [train.py:1204] (2/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_154-135696-0_sp1.1 from training. Duration: 0.72725 2023-10-03 22:51:29,778 WARNING [train.py:1204] (2/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_148-186805-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 22:51:32,492 WARNING [train.py:1204] (2/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_605-165641-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 22:51:32,499 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_174-277364-0_sp0.9 from training. Duration: 0.788875 2023-10-03 22:51:32,525 WARNING [train.py:1204] (2/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_89-321955-0_sp0.9 from training. Duration: 0.888875 2023-10-03 22:51:36,508 WARNING [train.py:1204] (2/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_438-105429-0_sp1.1 from training. Duration: 0.963625 2023-10-03 22:51:36,518 WARNING [train.py:1204] (2/4) Exclude cut with ID _14970_210_15709_1_1530773543493_1715730_187-20259-0_sp0.9 from training. Duration: 0.988875 2023-10-03 22:51:38,005 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.0.conv_module1.balancer1.prob, batch_count=1434706.6666666667, ans=0.125 2023-10-03 22:51:45,343 WARNING [train.py:1204] (2/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_385-193052-0_sp0.9 from training. Duration: 0.9555625 2023-10-03 22:51:45,401 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7113_210_1849_1_1527942685_3448768_194-311626-0_sp1.1 from training. Duration: 0.92725 2023-10-03 22:51:50,027 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_3780_210_2087_1_1526467389_2145572_176-107140-0_sp0.9 from training. Duration: 0.7666875 2023-10-03 22:51:50,028 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_36756_210_7234_1_1533026991_966276_123-213457-0_sp1.1 from training. Duration: 0.936375 2023-10-03 22:51:52,996 WARNING [train.py:1204] (2/4) Exclude cut with ID _23557_210_8750_1_1531890106370_6846255_456-244861-0_sp1.1 from training. Duration: 0.963625 2023-10-03 22:51:54,938 WARNING [train.py:1204] (2/4) Exclude cut with ID _37086_210_9038_1_1532999540891_7021250_242-349170-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 22:51:55,015 WARNING [train.py:1204] (2/4) Exclude cut with ID _41298_210_9019_1_1533430371450_4822143_471-281351-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 22:51:56,925 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_53378_210_17282_1_1534221071_5769509_572-126176-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 22:51:58,361 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_1507-263032-0_sp1.1 from training. Duration: 0.963625 2023-10-03 22:51:58,391 WARNING [train.py:1204] (2/4) Exclude cut with ID _13679_210_7497_1_1530534293662_3355098_411-186088-0_sp1.1 from training. Duration: 0.8545625 2023-10-03 22:52:01,105 WARNING [train.py:1204] (2/4) Exclude cut with ID _29345_210_4381_1_1532689168263_3860169_149-257268-0_sp1.1 from training. Duration: 0.736375 2023-10-03 22:52:02,459 WARNING [train.py:1204] (2/4) Exclude cut with ID _47790_210_16187_1_1533862814066_4207519_381-252014-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 22:52:02,466 WARNING [train.py:1204] (2/4) Exclude cut with ID _35799_210_5854_1_1533263487887_3529071_512-2717-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 22:52:06,575 WARNING [train.py:1204] (2/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_505-205270-0_sp0.9 from training. Duration: 0.42225 2023-10-03 22:52:07,870 WARNING [train.py:1204] (2/4) Exclude cut with ID _14998_210_5219_1_1530705582080_7474970_731-20117-0_sp1.1 from training. Duration: 0.936375 2023-10-03 22:52:08,635 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.2.encoder.layers.2.self_attn2.whiten, num_groups=1, num_channels=384, metric=18.45 vs. limit=22.5 2023-10-03 22:52:09,430 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.0.nonlin_attention.balancer.max_positive, batch_count=1434906.6666666667, ans=0.95 2023-10-03 22:52:10,558 INFO [train.py:1046] (2/4) Epoch 41, batch 2750, loss[loss=0.1557, simple_loss=0.2443, pruned_loss=0.03355, over 24604.00 frames. ], tot_loss[loss=0.1569, simple_loss=0.2372, pruned_loss=0.0383, over 4728339.83 frames. ], batch size: 73, lr: 2.48e-03, grad_scale: 4.0 2023-10-03 22:52:10,631 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_37351_210_7925_1_1533012931_3812113_265-85780-0_sp1.1 from training. Duration: 0.8 2023-10-03 22:52:10,643 WARNING [train.py:1204] (2/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_313-134930-0 from training. Duration: 0.97 2023-10-03 22:52:13,855 WARNING [train.py:1204] (2/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_374-138971-0 from training. Duration: 0.94 2023-10-03 22:52:13,898 WARNING [train.py:1204] (2/4) Exclude cut with ID _30304_210_6319_1_1532476827261_7582060_1227-287886-0_sp1.1 from training. Duration: 0.936375 2023-10-03 22:52:15,399 WARNING [train.py:1204] (2/4) Exclude cut with ID _59493_210_17282_1_1535078419077_3225879_332-217544-0_sp1.1 from training. Duration: 0.97275 2023-10-03 22:52:16,690 WARNING [train.py:1204] (2/4) Exclude cut with ID _23514_210_1389_1_1531832139785_3031439_361-119640-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 22:52:18,192 WARNING [train.py:1204] (2/4) Exclude cut with ID _69584_210_5219_1_1535785133775_4044450_142-118058-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 22:52:18,223 WARNING [train.py:1204] (2/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_178-177582-0_sp1.1 from training. Duration: 0.67275 2023-10-03 22:52:19,521 WARNING [train.py:1204] (2/4) Exclude cut with ID _25303_210_17519_1_1532050207576_3635970_41-195572-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 22:52:24,143 WARNING [train.py:1204] (2/4) Exclude cut with ID _55328_210_27767_1_1534580973260_4088170_329-341322-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 22:52:24,179 WARNING [train.py:1204] (2/4) Exclude cut with ID _5608_210_2106_1_1527415439089_6819768_909-82767-0_sp1.1 from training. Duration: 0.5545625 2023-10-03 22:52:24,222 WARNING [train.py:1204] (2/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_261-297503-0_sp1.1 from training. Duration: 0.9 2023-10-03 22:52:24,234 WARNING [train.py:1204] (2/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_459-291706-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 22:52:24,235 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7762_210_1794_1_1528199508_4057408_422-227034-0 from training. Duration: 0.73 2023-10-03 22:52:25,531 WARNING [train.py:1204] (2/4) Exclude cut with ID _3067_210_2667_1_1526212974640_4503168_250-285937-0_sp0.9 from training. Duration: 0.888875 2023-10-03 22:52:25,552 WARNING [train.py:1204] (2/4) Exclude cut with ID _66873_210_5517_1_1535454136506_4293370_332-253745-0_sp1.1 from training. Duration: 0.936375 2023-10-03 22:52:30,424 WARNING [train.py:1204] (2/4) Exclude cut with ID _13938_210_9988_1_1530518718978_3693960_304-112784-0 from training. Duration: 0.46 2023-10-03 22:52:33,172 WARNING [train.py:1204] (2/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_226-267957-0_sp1.1 from training. Duration: 0.92725 2023-10-03 22:52:33,210 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_41822_210_13270_1_1533955976_4379284_608-254567-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 22:52:33,267 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_124-113562-0_sp1.1 from training. Duration: 0.8545625 2023-10-03 22:52:33,293 WARNING [train.py:1204] (2/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_117-290916-0_sp1.1 from training. Duration: 0.463625 2023-10-03 22:52:34,637 WARNING [train.py:1204] (2/4) Exclude cut with ID _28427_210_4220_1_1532581240967_3061069_102-66533-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 22:52:36,001 WARNING [train.py:1204] (2/4) Exclude cut with ID _55383_210_10635_1_1534468426172_2231669_134-128246-0_sp1.1 from training. Duration: 0.8090625 2023-10-03 22:52:37,286 WARNING [train.py:1204] (2/4) Exclude cut with ID _45871_210_5665_1_1533864614245_3296060_49-308316-0_sp1.1 from training. Duration: 0.97275 2023-10-03 22:52:37,348 WARNING [train.py:1204] (2/4) Exclude cut with ID _57572_210_5517_1_1534921219624_3809484_176-199205-0_sp1.1 from training. Duration: 0.97275 2023-10-03 22:52:40,663 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.1.conv_skip_rate, batch_count=1435040.0, ans=0.0 2023-10-03 22:52:41,802 WARNING [train.py:1204] (2/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_45-265684-0_sp1.1 from training. Duration: 0.6545625 2023-10-03 22:52:41,823 WARNING [train.py:1204] (2/4) Exclude cut with ID _15150_210_8341_1_1530752411803_7256384_469-239509-0_sp0.9 from training. Duration: 0.7555625 2023-10-03 22:52:41,863 WARNING [train.py:1204] (2/4) Exclude cut with ID _28461_210_2253_1_1532771775189_3041299_136-270644-0_sp1.1 from training. Duration: 0.8454375 2023-10-03 22:52:44,708 WARNING [train.py:1204] (2/4) Exclude cut with ID _69584_210_5219_1_1535785133775_4044450_302-47302-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 22:52:46,052 WARNING [train.py:1204] (2/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_166-128941-0_sp1.1 from training. Duration: 0.4909375 2023-10-03 22:52:46,834 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.nonlin_attention.whiten2.whitening_limit, batch_count=1435040.0, ans=15.0 2023-10-03 22:52:51,796 WARNING [train.py:1204] (2/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_559-120504-0_sp1.1 from training. Duration: 0.97275 2023-10-03 22:52:53,727 WARNING [train.py:1204] (2/4) Exclude cut with ID _6456_210_1534_1_1527681634212_3181049_345-252877-0_sp1.1 from training. Duration: 0.5818125 2023-10-03 22:52:53,760 WARNING [train.py:1204] (2/4) Exclude cut with ID _28317_210_12742_1_1532480302381_8510325_902-233003-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 22:52:57,729 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.705e+02 1.945e+02 2.112e+02 2.405e+02 3.716e+02, threshold=4.224e+02, percent-clipped=0.0 2023-10-03 22:52:58,416 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.4.encoder.layers.0.feed_forward2.out_whiten, num_groups=1, num_channels=384, metric=5.61 vs. limit=15.0 2023-10-03 22:52:59,199 WARNING [train.py:1204] (2/4) Exclude cut with ID _36538_210_9994_1_1533204020363_3795620_204-261385-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 22:52:59,204 WARNING [train.py:1204] (2/4) Exclude cut with ID _14821_210_8750_1_1530680503991_6829359_189-297451-0_sp1.1 from training. Duration: 0.5 2023-10-03 22:52:59,254 WARNING [train.py:1204] (2/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_1211-226066-0_sp0.9 from training. Duration: 0.8444375 2023-10-03 22:53:03,388 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.conv_skip_rate, batch_count=1435106.6666666667, ans=0.0 2023-10-03 22:53:04,598 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6786_210_1388_1_1527839699_4194495_273-149841-0_sp1.1 from training. Duration: 0.836375 2023-10-03 22:53:04,637 WARNING [train.py:1204] (2/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_380-162648-0_sp1.1 from training. Duration: 0.8545625 2023-10-03 22:53:04,637 WARNING [train.py:1204] (2/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_494-199538-0 from training. Duration: 0.75 2023-10-03 22:53:08,655 WARNING [train.py:1204] (2/4) Exclude cut with ID _49097_210_16049_1_1533952780207_4360979_276-87053-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 22:53:10,708 WARNING [train.py:1204] (2/4) Exclude cut with ID _5444_210_2151_1_1527418808464_5312210_726-27165-0 from training. Duration: 0.67 2023-10-03 22:53:10,851 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.0.self_attn_weights.pos_emb_skip_rate, batch_count=1435173.3333333333, ans=0.0 2023-10-03 22:53:14,729 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_128-202104-0_sp1.1 from training. Duration: 0.57275 2023-10-03 22:53:17,436 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_11349_210_6753_1_1529800861_1101207_26-65602-0_sp1.1 from training. Duration: 0.87275 2023-10-03 22:53:17,461 WARNING [train.py:1204] (2/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_159-328157-0 from training. Duration: 0.52 2023-10-03 22:53:17,558 WARNING [train.py:1204] (2/4) Exclude cut with ID _14069_210_5632_1_1530512692033_4803390_98-21934-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 22:53:20,238 WARNING [train.py:1204] (2/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_352-59958-0_sp0.9 from training. Duration: 0.9555625 2023-10-03 22:53:20,273 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6163_210_6400_1_1527506746_4263068_283-137811-0 from training. Duration: 0.77 2023-10-03 22:53:21,253 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.0.layers.0.conv_module2.whiten, num_groups=1, num_channels=192, metric=10.19 vs. limit=15.0 2023-10-03 22:53:21,571 WARNING [train.py:1204] (2/4) Exclude cut with ID _21989_210_5854_1_1531899214931_3271360_133-44759-0_sp1.1 from training. Duration: 0.7 2023-10-03 22:53:23,472 WARNING [train.py:1204] (2/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_21-174514-0_sp0.9 from training. Duration: 0.47775 2023-10-03 22:53:24,745 INFO [train.py:1046] (2/4) Epoch 41, batch 2800, loss[loss=0.1607, simple_loss=0.2283, pruned_loss=0.04652, over 23910.00 frames. ], tot_loss[loss=0.1561, simple_loss=0.2356, pruned_loss=0.0383, over 4707627.61 frames. ], batch size: 212, lr: 2.48e-03, grad_scale: 8.0 2023-10-03 22:53:24,806 WARNING [train.py:1204] (2/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_201-78811-0_sp1.1 from training. Duration: 0.963625 2023-10-03 22:53:24,842 WARNING [train.py:1204] (2/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_537-164630-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 22:53:26,222 WARNING [train.py:1204] (2/4) Exclude cut with ID _14087_210_2253_1_1530529224512_3014029_156-257607-0 from training. Duration: 0.83 2023-10-03 22:53:26,236 WARNING [train.py:1204] (2/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_308-154450-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 22:53:26,269 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_45399_210_4852_1_1533624725_7579737_214-240574-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 22:53:28,163 WARNING [train.py:1204] (2/4) Exclude cut with ID _29303_210_2575_1_1532392237303_7410649_615-305517-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 22:53:28,199 WARNING [train.py:1204] (2/4) Exclude cut with ID IC0125W0002-61808 from training. Duration: 0.708 2023-10-03 22:53:28,200 WARNING [train.py:1204] (2/4) Exclude cut with ID IC0125W0017-61823 from training. Duration: 0.759 2023-10-03 22:53:32,723 WARNING [train.py:1204] (2/4) Exclude cut with ID _53219_210_4525_1_1534834836008_4064825_4-145291-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 22:53:35,374 WARNING [train.py:1204] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_185-174805-0_sp1.1 from training. Duration: 0.6181875 2023-10-03 22:53:35,396 WARNING [train.py:1204] (2/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_635-193046-0_sp1.1 from training. Duration: 0.92725 2023-10-03 22:53:38,235 WARNING [train.py:1204] (2/4) Exclude cut with ID _46067_210_4220_1_1534125571066_3307450_321-258866-0_sp1.1 from training. Duration: 0.97275 2023-10-03 22:53:41,307 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8558_210_1410_1_1528505225_7613406_131-329524-0 from training. Duration: 0.79 2023-10-03 22:53:42,705 WARNING [train.py:1204] (2/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_847-41334-0_sp0.9 from training. Duration: 0.488875 2023-10-03 22:53:44,100 WARNING [train.py:1204] (2/4) Exclude cut with ID _14227_210_4381_1_1530701804286_3940259_125-275642-0 from training. Duration: 0.99 2023-10-03 22:53:45,448 WARNING [train.py:1204] (2/4) Exclude cut with ID _25248_210_8750_1_1532062825941_5554180_248-177255-0_sp1.1 from training. Duration: 0.963625 2023-10-03 22:53:45,499 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_40047_210_2290_1_1533290519_2374311_195-165693-0_sp1.1 from training. Duration: 0.8090625 2023-10-03 22:53:45,502 WARNING [train.py:1204] (2/4) Exclude cut with ID _23109_210_4381_1_1532150828777_901949_29-258672-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 22:53:49,587 WARNING [train.py:1204] (2/4) Exclude cut with ID _14821_210_8750_1_1530680503991_6829359_2-227151-0_sp1.1 from training. Duration: 0.7545625 2023-10-03 22:53:49,615 WARNING [train.py:1204] (2/4) Exclude cut with ID _49643_210_7989_1_1534640244224_3939031_565-53982-0_sp1.1 from training. Duration: 0.963625 2023-10-03 22:53:49,618 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7544_210_6753_1_1528591327_1471426_33-346031-0_sp0.9 from training. Duration: 0.67775 2023-10-03 22:53:50,863 WARNING [train.py:1204] (2/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_95-61782-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 22:53:58,721 WARNING [train.py:1204] (2/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_686-54383-0_sp0.9 from training. Duration: 0.9666875 2023-10-03 22:54:00,159 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_14056_210_4163_1_1530604483_7572827_505-276823-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 22:54:02,055 WARNING [train.py:1204] (2/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_745-128443-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 22:54:03,383 WARNING [train.py:1204] (2/4) Exclude cut with ID _3844_210_1390_1_1526727803582_2871209_20-251664-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 22:54:03,450 WARNING [train.py:1204] (2/4) Exclude cut with ID _36538_210_9994_1_1533204020363_3795620_487-164628-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 22:54:09,308 WARNING [train.py:1204] (2/4) Exclude cut with ID _5166_210_2084_1_1527073197160_4087993_309-222545-0_sp1.1 from training. Duration: 0.763625 2023-10-03 22:54:09,330 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_3531_210_3528_1_1526294203_5216456_309-227302-0 from training. Duration: 0.69 2023-10-03 22:54:10,666 WARNING [train.py:1204] (2/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_42-192460-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 22:54:10,744 WARNING [train.py:1204] (2/4) Exclude cut with ID _22083_210_8254_1_1531702862439_3678970_49-234546-0_sp1.1 from training. Duration: 0.7545625 2023-10-03 22:54:10,748 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_427-78097-0_sp0.9 from training. Duration: 0.888875 2023-10-03 22:54:10,935 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.ff3_skip_rate, batch_count=1435440.0, ans=0.0 2023-10-03 22:54:14,040 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.4.encoder.layers.2.self_attn_weights.whiten_keys, num_groups=4, num_channels=128, metric=1.75 vs. limit=6.0 2023-10-03 22:54:14,854 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_43802_210_2290_1_1533516203_8829504_378-205254-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 22:54:16,089 WARNING [train.py:1204] (2/4) Exclude cut with ID _45329_210_7005_1_1534157951211_6774339_672-314830-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 22:54:18,997 WARNING [train.py:1204] (2/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_90-30673-0_sp1.1 from training. Duration: 0.763625 2023-10-03 22:54:20,458 WARNING [train.py:1204] (2/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_563-329943-0_sp1.1 from training. Duration: 0.9 2023-10-03 22:54:20,479 WARNING [train.py:1204] (2/4) Exclude cut with ID _21632_210_2253_1_1531994001765_3744140_154-84090-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 22:54:20,480 WARNING [train.py:1204] (2/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_103-336064-0_sp0.9 from training. Duration: 0.5666875 2023-10-03 22:54:21,745 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_43-71989-0_sp0.9 from training. Duration: 0.6333125 2023-10-03 22:54:21,800 WARNING [train.py:1204] (2/4) Exclude cut with ID _3867_210_1883_1_1526641200575_4475830_319-349850-0_sp0.9 from training. Duration: 0.7444375 2023-10-03 22:54:23,346 WARNING [train.py:1204] (2/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_629-179454-0_sp1.1 from training. Duration: 0.963625 2023-10-03 22:54:23,355 WARNING [train.py:1204] (2/4) Exclude cut with ID _65263_210_22356_1_1535763598691_3921037_49-222917-0 from training. Duration: 0.92 2023-10-03 22:54:23,401 WARNING [train.py:1204] (2/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_602-331772-0_sp1.1 from training. Duration: 0.936375 2023-10-03 22:54:25,181 WARNING [train.py:1204] (2/4) Exclude cut with ID _58414_210_8254_1_1535421609162_7151182_292-134978-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 22:54:25,224 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_38793_210_2544_1_1533176114_3849320_200-299292-0_sp1.1 from training. Duration: 0.936375 2023-10-03 22:54:25,746 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.4.encoder.layers.1.whiten, num_groups=1, num_channels=384, metric=4.15 vs. limit=12.0 2023-10-03 22:54:28,290 WARNING [train.py:1204] (2/4) Exclude cut with ID _14970_210_15709_1_1530775547361_1966965_6-23328-0 from training. Duration: 0.86 2023-10-03 22:54:28,375 WARNING [train.py:1204] (2/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_386-225511-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 22:54:28,386 WARNING [train.py:1204] (2/4) Exclude cut with ID _38323_210_12108_1_1533121233369_3303049_416-11919-0_sp1.1 from training. Duration: 0.87275 2023-10-03 22:54:29,667 WARNING [train.py:1204] (2/4) Exclude cut with ID _4866_210_2577_1_1527404412284_4364659_256-306830-0_sp1.1 from training. Duration: 0.7909375 2023-10-03 22:54:29,920 INFO [scaling.py:1118] (2/4) WithLoss: name=encoder.encoders.2.encoder.layers.0.self_attn_weights, loss-sum=0.000e+00 2023-10-03 22:54:31,054 WARNING [train.py:1204] (2/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_53-300544-0 from training. Duration: 0.67 2023-10-03 22:54:33,156 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.3.encoder.layers.3.nonlin_attention.whiten1, num_groups=1, num_channels=384, metric=5.05 vs. limit=10.0 2023-10-03 22:54:36,926 WARNING [train.py:1204] (2/4) Exclude cut with ID _41314_210_12651_1_1533459476543_3766460_80-74184-0_sp1.1 from training. Duration: 0.7545625 2023-10-03 22:54:36,939 WARNING [train.py:1204] (2/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_332-256168-0_sp1.1 from training. Duration: 0.6818125 2023-10-03 22:54:36,995 WARNING [train.py:1204] (2/4) Exclude cut with ID _48836_210_12210_1_1534075848293_3904150_429-35475-0_sp1.1 from training. Duration: 0.8090625 2023-10-03 22:54:38,150 INFO [train.py:1046] (2/4) Epoch 41, batch 2850, loss[loss=0.1523, simple_loss=0.2222, pruned_loss=0.04118, over 23669.00 frames. ], tot_loss[loss=0.155, simple_loss=0.2344, pruned_loss=0.03776, over 4702044.79 frames. ], batch size: 232, lr: 2.48e-03, grad_scale: 8.0 2023-10-03 22:54:40,109 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_144-135833-0_sp1.1 from training. Duration: 0.97275 2023-10-03 22:54:40,308 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.attention_skip_rate, batch_count=1435573.3333333333, ans=0.0 2023-10-03 22:54:44,091 WARNING [train.py:1204] (2/4) Exclude cut with ID _14224_210_4381_1_1530529062037_4264010_380-343670-0_sp0.9 from training. Duration: 0.97775 2023-10-03 22:54:44,124 WARNING [train.py:1204] (2/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_213-265174-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 22:54:44,152 WARNING [train.py:1204] (2/4) Exclude cut with ID _20455_210_5219_1_1531565996775_8098100_86-314283-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 22:54:48,358 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_41750_210_1960_1_1533520678_3948042_68-223813-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 22:54:49,649 WARNING [train.py:1204] (2/4) Exclude cut with ID _55153_210_2253_1_1534384786677_3162096_24-198429-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 22:54:51,144 WARNING [train.py:1204] (2/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_119-207162-0_sp0.9 from training. Duration: 0.788875 2023-10-03 22:54:51,202 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_12868_210_6753_1_1530702479_3610564_148-172861-0 from training. Duration: 0.5 2023-10-03 22:54:54,685 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.3.encoder.layers.3.nonlin_attention.whiten1, num_groups=1, num_channels=384, metric=4.84 vs. limit=10.0 2023-10-03 22:54:57,249 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_5522_210_3273_1_1527249606_3909096_318-203153-0 from training. Duration: 0.43 2023-10-03 22:54:57,255 WARNING [train.py:1204] (2/4) Exclude cut with ID _14884_210_2252_1_1530707459689_5561250_288-189949-0_sp1.1 from training. Duration: 0.97275 2023-10-03 22:54:59,264 WARNING [train.py:1204] (2/4) Exclude cut with ID _5294_210_1747_1_1527249643928_3696179_529-242298-0 from training. Duration: 0.95 2023-10-03 22:55:00,606 WARNING [train.py:1204] (2/4) Exclude cut with ID _36538_210_9994_1_1533204020363_3795620_503-110289-0_sp1.1 from training. Duration: 0.936375 2023-10-03 22:55:02,073 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.ff3_skip_rate, batch_count=1435640.0, ans=0.0 2023-10-03 22:55:03,225 WARNING [train.py:1204] (2/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_385-193052-0 from training. Duration: 0.86 2023-10-03 22:55:03,281 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_456-22141-0 from training. Duration: 0.88 2023-10-03 22:55:03,459 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.feed_forward2.hidden_balancer.prob, batch_count=1435640.0, ans=0.125 2023-10-03 22:55:03,526 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.balancer_ff2.min_abs, batch_count=1435640.0, ans=0.1 2023-10-03 22:55:04,722 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_38965_210_4852_1_1533174906_3484735_284-117840-0_sp1.1 from training. Duration: 0.936375 2023-10-03 22:55:06,437 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.attention_skip_rate, batch_count=1435706.6666666667, ans=0.0 2023-10-03 22:55:11,574 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.0.self_attn_weights.pos_emb_skip_rate, batch_count=1435706.6666666667, ans=0.0 2023-10-03 22:55:17,030 WARNING [train.py:1204] (2/4) Exclude cut with ID _32704_210_8954_1_1533017488840_6747370_75-201625-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 22:55:18,428 WARNING [train.py:1204] (2/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_255-312654-0_sp1.1 from training. Duration: 0.82725 2023-10-03 22:55:18,442 WARNING [train.py:1204] (2/4) Exclude cut with ID _47251_210_18628_1_1533814063128_4137250_214-88076-0_sp0.9 from training. Duration: 0.97775 2023-10-03 22:55:19,802 WARNING [train.py:1204] (2/4) Exclude cut with ID _4092_210_2151_1_1526643044449_3135759_227-213972-0_sp1.1 from training. Duration: 0.4909375 2023-10-03 22:55:19,818 WARNING [train.py:1204] (2/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_912-47473-0_sp1.1 from training. Duration: 0.6181875 2023-10-03 22:55:19,847 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6767_210_3273_1_1527855783_1718932_173-256601-0_sp1.1 from training. Duration: 0.72725 2023-10-03 22:55:22,513 WARNING [train.py:1204] (2/4) Exclude cut with ID _6274_210_2084_1_1527937583837_7494020_32-205288-0_sp1.1 from training. Duration: 0.7090625 2023-10-03 22:55:22,545 WARNING [train.py:1204] (2/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_39-248870-0 from training. Duration: 0.69 2023-10-03 22:55:22,811 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.conv_skip_rate, batch_count=1435773.3333333333, ans=0.0 2023-10-03 22:55:23,921 WARNING [train.py:1204] (2/4) Exclude cut with ID _3541_210_2477_1_1526475408709_3263973_377-296839-0_sp1.1 from training. Duration: 0.5 2023-10-03 22:55:23,935 WARNING [train.py:1204] (2/4) Exclude cut with ID _13679_210_7497_1_1530534293662_3355098_413-56154-0_sp1.1 from training. Duration: 0.963625 2023-10-03 22:55:25,197 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.544e+02 1.888e+02 2.036e+02 2.281e+02 3.028e+02, threshold=4.073e+02, percent-clipped=0.0 2023-10-03 22:55:25,301 WARNING [train.py:1204] (2/4) Exclude cut with ID _14970_210_15709_1_1530775547361_1966965_201-84544-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 22:55:26,594 WARNING [train.py:1204] (2/4) Exclude cut with ID _21783_210_11357_1_1531742458730_3762679_105-95775-0_sp1.1 from training. Duration: 0.936375 2023-10-03 22:55:26,888 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.feed_forward1.hidden_balancer.prob, batch_count=1435773.3333333333, ans=0.125 2023-10-03 22:55:30,258 WARNING [train.py:1204] (2/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_131-191669-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 22:55:30,283 WARNING [train.py:1204] (2/4) Exclude cut with ID _14860_210_4915_1_1530702020340_7343240_695-4323-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 22:55:31,667 WARNING [train.py:1204] (2/4) Exclude cut with ID _39867_210_9826_1_1533553222030_9344259_991-130565-0_sp1.1 from training. Duration: 0.97275 2023-10-03 22:55:33,236 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_50-3289-0_sp1.1 from training. Duration: 0.82725 2023-10-03 22:55:34,701 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_54-347670-0_sp1.1 from training. Duration: 0.92725 2023-10-03 22:55:36,048 WARNING [train.py:1204] (2/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_200-153445-0_sp1.1 from training. Duration: 0.936375 2023-10-03 22:55:36,124 WARNING [train.py:1204] (2/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_279-302911-0_sp1.1 from training. Duration: 0.97275 2023-10-03 22:55:39,539 WARNING [train.py:1204] (2/4) Exclude cut with ID _14886_210_4220_1_1530702020355_3516190_159-73748-0_sp0.9 from training. Duration: 0.92225 2023-10-03 22:55:40,462 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.1.encoder.layers.0.feed_forward1.out_whiten, num_groups=1, num_channels=256, metric=5.06 vs. limit=15.0 2023-10-03 22:55:44,414 WARNING [train.py:1204] (2/4) Exclude cut with ID _53379_210_20034_1_1534222027363_3854654_23-222715-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 22:55:45,660 WARNING [train.py:1204] (2/4) Exclude cut with ID _12555_210_8750_1_1530075594402_3780239_449-267986-0 from training. Duration: 0.83 2023-10-03 22:55:45,671 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7544_210_6753_1_1528587722_3576686_295-258479-0 from training. Duration: 0.84 2023-10-03 22:55:47,116 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7002_210_1794_1_1527935634_4034444_487-236501-0_sp0.9 from training. Duration: 0.7333125 2023-10-03 22:55:47,172 WARNING [train.py:1204] (2/4) Exclude cut with ID _21783_210_11357_1_1531742458730_3762679_244-19107-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 22:55:48,430 WARNING [train.py:1204] (2/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_390-339137-0 from training. Duration: 0.56 2023-10-03 22:55:48,493 WARNING [train.py:1204] (2/4) Exclude cut with ID _7562_210_2084_1_1528887767916_3946229_186-326422-0_sp1.1 from training. Duration: 0.763625 2023-10-03 22:55:49,831 WARNING [train.py:1204] (2/4) Exclude cut with ID _33640_210_2655_1_1533117605479_4855469_645-145235-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 22:55:49,857 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_25767_210_2161_1_1532307483_3867748_506-171777-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 22:55:49,882 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_5438_210_1794_1_1527336687_2600009_233-58072-0_sp1.1 from training. Duration: 0.7 2023-10-03 22:55:49,889 WARNING [train.py:1204] (2/4) Exclude cut with ID IC0669W0063-330908 from training. Duration: 0.896 2023-10-03 22:55:51,204 WARNING [train.py:1204] (2/4) Exclude cut with ID IC0671W0070-331913 from training. Duration: 0.896 2023-10-03 22:55:51,208 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_258-310570-0_sp1.1 from training. Duration: 0.7454375 2023-10-03 22:55:52,492 INFO [train.py:1046] (2/4) Epoch 41, batch 2900, loss[loss=0.166, simple_loss=0.254, pruned_loss=0.03893, over 24680.00 frames. ], tot_loss[loss=0.1553, simple_loss=0.2352, pruned_loss=0.03774, over 4713821.77 frames. ], batch size: 68, lr: 2.48e-03, grad_scale: 8.0 2023-10-03 22:55:52,538 WARNING [train.py:1204] (2/4) Exclude cut with ID _9461_210_4557_1_1529060514726_3677451_162-177507-0_sp1.1 from training. Duration: 0.97275 2023-10-03 22:55:56,784 WARNING [train.py:1204] (2/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_26-175553-0_sp0.9 from training. Duration: 0.67775 2023-10-03 22:55:56,810 WARNING [train.py:1204] (2/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_7-150481-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 22:55:58,053 WARNING [train.py:1204] (2/4) Exclude cut with ID _23557_210_8750_1_1531890106370_6846255_72-273262-0_sp1.1 from training. Duration: 0.963625 2023-10-03 22:55:58,149 WARNING [train.py:1204] (2/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_367-61330-0 from training. Duration: 0.75 2023-10-03 22:56:03,139 WARNING [train.py:1204] (2/4) Exclude cut with ID _39867_210_9826_1_1533553222030_9344259_1337-11141-0_sp1.1 from training. Duration: 0.936375 2023-10-03 22:56:03,164 WARNING [train.py:1204] (2/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_101-293367-0 from training. Duration: 0.63 2023-10-03 22:56:03,333 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.feed_forward1.hidden_balancer.prob, batch_count=1435906.6666666667, ans=0.125 2023-10-03 22:56:03,696 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.3.encoder.layers.2.nonlin_attention.whiten2, num_groups=1, num_channels=512, metric=6.67 vs. limit=15.0 2023-10-03 22:56:04,453 WARNING [train.py:1204] (2/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_10-100367-0 from training. Duration: 0.98 2023-10-03 22:56:05,993 WARNING [train.py:1204] (2/4) Exclude cut with ID _60057_210_10640_1_1534903065793_3333150_171-181133-0_sp1.1 from training. Duration: 0.8 2023-10-03 22:56:06,002 WARNING [train.py:1204] (2/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_844-96989-0_sp0.9 from training. Duration: 0.788875 2023-10-03 22:56:07,443 WARNING [train.py:1204] (2/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_301-144595-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 22:56:07,529 WARNING [train.py:1204] (2/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_115-147343-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 22:56:10,972 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_3171_210_2087_1_1526109084_2330007_37-64334-0_sp1.1 from training. Duration: 0.7454375 2023-10-03 22:56:11,014 WARNING [train.py:1204] (2/4) Exclude cut with ID _19514_210_9259_1_1531530071330_5084230_769-272914-0_sp1.1 from training. Duration: 0.936375 2023-10-03 22:56:15,790 WARNING [train.py:1204] (2/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_76-311963-0_sp0.9 from training. Duration: 0.77775 2023-10-03 22:56:15,828 WARNING [train.py:1204] (2/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_526-62079-0 from training. Duration: 0.67 2023-10-03 22:56:17,205 WARNING [train.py:1204] (2/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_7-111954-0_sp0.9 from training. Duration: 0.62225 2023-10-03 22:56:18,607 WARNING [train.py:1204] (2/4) Exclude cut with ID _57997_210_4163_1_1534766425889_3961049_360-279140-0_sp1.1 from training. Duration: 0.97275 2023-10-03 22:56:20,086 WARNING [train.py:1204] (2/4) Exclude cut with ID _4866_210_2577_1_1527404412284_4364659_133-1696-0 from training. Duration: 0.99 2023-10-03 22:56:20,160 WARNING [train.py:1204] (2/4) Exclude cut with ID _5190_210_4557_1_1527244368799_373439_28-197030-0 from training. Duration: 0.72 2023-10-03 22:56:24,270 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_53-39003-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 22:56:24,273 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6673_210_3273_1_1527767790_3173240_351-167954-0 from training. Duration: 0.48 2023-10-03 22:56:24,295 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_2822_210_2715_1_1526112586_1999587_165-36656-0_sp1.1 from training. Duration: 0.8090625 2023-10-03 22:56:26,934 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_328-264892-0_sp1.1 from training. Duration: 0.92725 2023-10-03 22:56:26,948 WARNING [train.py:1204] (2/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_29-293896-0_sp0.9 from training. Duration: 0.67775 2023-10-03 22:56:28,432 WARNING [train.py:1204] (2/4) Exclude cut with ID _65263_210_22356_1_1535763598691_3921037_465-55448-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 22:56:30,342 WARNING [train.py:1204] (2/4) Exclude cut with ID _21989_210_5854_1_1531899214931_3271360_311-53106-0_sp1.1 from training. Duration: 0.97275 2023-10-03 22:56:33,584 WARNING [train.py:1204] (2/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_389-277915-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 22:56:35,072 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_69630_210_7234_1_1535791094_677837_63-277139-0_sp1.1 from training. Duration: 0.963625 2023-10-03 22:56:37,792 WARNING [train.py:1204] (2/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_85-138257-0 from training. Duration: 0.71 2023-10-03 22:56:37,822 WARNING [train.py:1204] (2/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_686-247600-0 from training. Duration: 0.92 2023-10-03 22:56:37,827 WARNING [train.py:1204] (2/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_108-255430-0_sp1.1 from training. Duration: 0.8818125 2023-10-03 22:56:41,915 WARNING [train.py:1204] (2/4) Exclude cut with ID _14884_210_2252_1_1530707459689_5561250_114-201399-0_sp1.1 from training. Duration: 0.6909375 2023-10-03 22:56:42,179 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.ff2_skip_rate, batch_count=1436106.6666666667, ans=0.0 2023-10-03 22:56:43,331 WARNING [train.py:1204] (2/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_506-42614-0 from training. Duration: 0.79 2023-10-03 22:56:45,280 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_43802_210_2290_1_1533516203_8829504_85-195240-0_sp1.1 from training. Duration: 0.7909375 2023-10-03 22:56:50,843 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_141-36243-0_sp1.1 from training. Duration: 0.97275 2023-10-03 22:56:59,165 WARNING [train.py:1204] (2/4) Exclude cut with ID _7852_210_5219_1_1528286471316_8139400_358-287492-0_sp1.1 from training. Duration: 0.87275 2023-10-03 22:56:59,191 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_805-71497-0_sp0.9 from training. Duration: 0.788875 2023-10-03 22:57:00,449 WARNING [train.py:1204] (2/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_178-39722-0 from training. Duration: 0.49 2023-10-03 22:57:03,828 WARNING [train.py:1204] (2/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_428-181925-0_sp1.1 from training. Duration: 0.963625 2023-10-03 22:57:03,832 WARNING [train.py:1204] (2/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_770-200430-0 from training. Duration: 0.89 2023-10-03 22:57:03,868 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_1104-271009-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 22:57:03,916 WARNING [train.py:1204] (2/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_128-296581-0_sp0.9 from training. Duration: 0.82225 2023-10-03 22:57:06,729 INFO [train.py:1046] (2/4) Epoch 41, batch 2950, loss[loss=0.1471, simple_loss=0.2326, pruned_loss=0.03083, over 24299.00 frames. ], tot_loss[loss=0.1562, simple_loss=0.2361, pruned_loss=0.03815, over 4711296.80 frames. ], batch size: 61, lr: 2.48e-03, grad_scale: 8.0 2023-10-03 22:57:08,275 WARNING [train.py:1204] (2/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_19-62511-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 22:57:09,519 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_805-71497-0 from training. Duration: 0.71 2023-10-03 22:57:09,604 WARNING [train.py:1204] (2/4) Exclude cut with ID _44778_210_2054_1_1533798127203_7443090_573-152077-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 22:57:10,877 WARNING [train.py:1204] (2/4) Exclude cut with ID _42875_210_14856_1_1533704397487_4012433_393-177816-0_sp1.1 from training. Duration: 0.963625 2023-10-03 22:57:11,616 WARNING [train.py:1204] (2/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_278-35159-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 22:57:12,875 WARNING [train.py:1204] (2/4) Exclude cut with ID _15037_210_2253_1_1530752294104_3267650_13-130361-0_sp1.1 from training. Duration: 0.9 2023-10-03 22:57:12,959 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_3019_210_2028_1_1526044245_3264865_127-135615-0 from training. Duration: 0.94 2023-10-03 22:57:14,802 WARNING [train.py:1204] (2/4) Exclude cut with ID _28317_210_12742_1_1532480302381_8510325_607-213951-0 from training. Duration: 0.84 2023-10-03 22:57:15,019 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.feed_forward2.hidden_balancer.prob, batch_count=1436240.0, ans=0.125 2023-10-03 22:57:16,142 WARNING [train.py:1204] (2/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_436-318113-0_sp0.9 from training. Duration: 0.8333125 2023-10-03 22:57:16,143 WARNING [train.py:1204] (2/4) Exclude cut with ID _13938_210_9988_1_1530518718978_3693960_148-152225-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 22:57:22,978 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_2541_210_1794_1_1525590269_3084765_156-173713-0_sp1.1 from training. Duration: 0.736375 2023-10-03 22:57:23,154 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.self_attn_weights.pos_emb_skip_rate, batch_count=1436306.6666666667, ans=0.0 2023-10-03 22:57:25,739 WARNING [train.py:1204] (2/4) Exclude cut with ID _28323_210_16187_1_1532480395667_4100300_531-24157-0_sp1.1 from training. Duration: 0.8909375 2023-10-03 22:57:27,266 WARNING [train.py:1204] (2/4) Exclude cut with ID _29303_210_2575_1_1532392237303_7410649_382-217492-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 22:57:28,557 WARNING [train.py:1204] (2/4) Exclude cut with ID _58895_210_8750_1_1535184081320_3649089_270-158013-0_sp1.1 from training. Duration: 0.92725 2023-10-03 22:57:30,037 WARNING [train.py:1204] (2/4) Exclude cut with ID _29277_210_3279_1_1532581109293_3173069_311-206227-0_sp1.1 from training. Duration: 0.936375 2023-10-03 22:57:30,049 WARNING [train.py:1204] (2/4) Exclude cut with ID _7160_210_1876_1_1527940923231_3703799_282-322611-0_sp0.9 from training. Duration: 0.9666875 2023-10-03 22:57:32,055 WARNING [train.py:1204] (2/4) Exclude cut with ID _53219_210_4525_1_1534834836008_4064825_562-32295-0_sp1.1 from training. Duration: 0.963625 2023-10-03 22:57:33,513 WARNING [train.py:1204] (2/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_408-87330-0_sp1.1 from training. Duration: 0.963625 2023-10-03 22:57:33,521 WARNING [train.py:1204] (2/4) Exclude cut with ID _59202_210_26984_1_1534816810612_3549174_137-318710-0_sp1.1 from training. Duration: 0.8090625 2023-10-03 22:57:36,174 WARNING [train.py:1204] (2/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_372-30302-0 from training. Duration: 0.86 2023-10-03 22:57:36,414 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.nonlin_attention.balancer.prob, batch_count=1436373.3333333333, ans=0.125 2023-10-03 22:57:41,603 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7543_210_6753_1_1528541907_1399322_178-56269-0_sp1.1 from training. Duration: 0.95275 2023-10-03 22:57:41,628 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9109W0012-549758 from training. Duration: 0.512 2023-10-03 22:57:43,063 WARNING [train.py:1204] (2/4) Exclude cut with ID _5410_210_1388_1_1527397055795_1719020_39-112221-0_sp0.9 from training. Duration: 0.7666875 2023-10-03 22:57:45,019 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9121W0012-554933 from training. Duration: 0.896 2023-10-03 22:57:46,291 WARNING [train.py:1204] (2/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_476-272757-0 from training. Duration: 0.93 2023-10-03 22:57:46,311 WARNING [train.py:1204] (2/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_1085-171718-0_sp1.1 from training. Duration: 0.92725 2023-10-03 22:57:47,731 WARNING [train.py:1204] (2/4) Exclude cut with ID _8480_210_1874_1_1528589391205_6728330_309-139896-0_sp1.1 from training. Duration: 0.736375 2023-10-03 22:57:47,731 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9130W0113-559703 from training. Duration: 0.896 2023-10-03 22:57:47,736 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_292-265928-0_sp1.1 from training. Duration: 0.463625 2023-10-03 22:57:49,185 WARNING [train.py:1204] (2/4) Exclude cut with ID _8772_210_1850_1_1528635595972_3664011_96-12756-0 from training. Duration: 0.98 2023-10-03 22:57:50,502 WARNING [train.py:1204] (2/4) Exclude cut with ID _12556_210_8341_1_1530081024100_7327690_725-193477-0_sp1.1 from training. Duration: 0.8909375 2023-10-03 22:57:50,564 WARNING [train.py:1204] (2/4) Exclude cut with ID _5289_210_2084_1_1527678202736_3779299_142-103317-0_sp1.1 from training. Duration: 0.763625 2023-10-03 22:57:53,089 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.633e+02 1.859e+02 2.046e+02 2.288e+02 3.221e+02, threshold=4.092e+02, percent-clipped=0.0 2023-10-03 22:57:53,237 WARNING [train.py:1204] (2/4) Exclude cut with ID _45012_210_12651_1_1533608435136_5371380_206-51075-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 22:57:54,621 WARNING [train.py:1204] (2/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_176-56120-0_sp1.1 from training. Duration: 0.7545625 2023-10-03 22:57:54,644 WARNING [train.py:1204] (2/4) Exclude cut with ID _40284_210_2084_1_1533513679814_3704279_220-201864-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 22:57:54,878 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.conv_module2.balancer1.prob, batch_count=1436440.0, ans=0.125 2023-10-03 22:57:56,096 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9165W0004-576638 from training. Duration: 0.896 2023-10-03 22:57:56,140 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_5438_210_1794_1_1527336687_2600009_218-343336-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 22:57:57,457 WARNING [train.py:1204] (2/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_318-162017-0 from training. Duration: 0.91 2023-10-03 22:58:02,241 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_49544_210_1794_1_1534379770_5262083_483-349298-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 22:58:02,996 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.2.encoder.layers.2.feed_forward1.out_whiten, num_groups=1, num_channels=384, metric=5.02 vs. limit=15.0 2023-10-03 22:58:03,598 WARNING [train.py:1204] (2/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_197-28164-0_sp0.9 from training. Duration: 0.888875 2023-10-03 22:58:05,337 WARNING [train.py:1204] (2/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_643-52007-0 from training. Duration: 0.57 2023-10-03 22:58:05,354 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_38965_210_4852_1_1533174906_3484735_403-332963-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 22:58:06,686 WARNING [train.py:1204] (2/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_340-174176-0 from training. Duration: 0.49 2023-10-03 22:58:09,407 WARNING [train.py:1204] (2/4) Exclude cut with ID _56708_210_13177_1_1535252351282_3422899_363-917-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 22:58:10,706 WARNING [train.py:1204] (2/4) Exclude cut with ID _19896_210_1883_1_1531709022662_6649569_525-159063-0_sp1.1 from training. Duration: 0.936375 2023-10-03 22:58:10,755 WARNING [train.py:1204] (2/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_430-116139-0_sp0.9 from training. Duration: 0.9444375 2023-10-03 22:58:12,121 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_53753_210_7234_1_1534320003_3594148_86-329250-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 22:58:12,138 WARNING [train.py:1204] (2/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_406-275649-0_sp1.1 from training. Duration: 0.3818125 2023-10-03 22:58:13,491 WARNING [train.py:1204] (2/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_1083-147536-0_sp1.1 from training. Duration: 0.8818125 2023-10-03 22:58:15,375 WARNING [train.py:1204] (2/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_54-311741-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 22:58:15,380 WARNING [train.py:1204] (2/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_237-58433-0_sp0.9 from training. Duration: 0.82225 2023-10-03 22:58:15,418 WARNING [train.py:1204] (2/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_98-51676-0_sp1.1 from training. Duration: 0.663625 2023-10-03 22:58:16,081 WARNING [train.py:1204] (2/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_386-270472-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 22:58:17,291 WARNING [train.py:1204] (2/4) Exclude cut with ID _19023_210_4353_1_1531378892552_5820780_83-254794-0_sp1.1 from training. Duration: 0.7818125 2023-10-03 22:58:19,922 INFO [train.py:1046] (2/4) Epoch 41, batch 3000, loss[loss=0.163, simple_loss=0.2367, pruned_loss=0.04467, over 23825.00 frames. ], tot_loss[loss=0.1569, simple_loss=0.2369, pruned_loss=0.03843, over 4715024.15 frames. ], batch size: 195, lr: 2.48e-03, grad_scale: 8.0 2023-10-03 22:58:19,923 INFO [train.py:1069] (2/4) Computing validation loss 2023-10-03 22:58:32,314 INFO [train.py:1078] (2/4) Epoch 41, validation: loss=0.3725, simple_loss=0.2818, pruned_loss=0.2316, over 1125622.00 frames. 2023-10-03 22:58:32,315 INFO [train.py:1079] (2/4) Maximum memory allocated so far is 20967MB 2023-10-03 22:58:32,363 WARNING [train.py:1204] (2/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_314-93656-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 22:58:32,388 WARNING [train.py:1204] (2/4) Exclude cut with ID _14170_210_5219_1_1530525949868_4469650_336-213530-0 from training. Duration: 0.9 2023-10-03 22:58:33,867 WARNING [train.py:1204] (2/4) Exclude cut with ID _36686_210_5665_1_1533778225913_3646410_65-221120-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 22:58:36,499 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_26433_210_12108_1_1532157158_4056480_271-68178-0_sp1.1 from training. Duration: 0.92725 2023-10-03 22:58:37,859 WARNING [train.py:1204] (2/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_46-207599-0_sp0.9 from training. Duration: 0.711125 2023-10-03 22:58:40,798 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9303W0114-637133 from training. Duration: 0.896 2023-10-03 22:58:40,825 WARNING [train.py:1204] (2/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_490-207861-0 from training. Duration: 0.98 2023-10-03 22:58:41,857 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.0.layers.0.self_attn2.whiten, num_groups=1, num_channels=192, metric=11.28 vs. limit=22.5 2023-10-03 22:58:42,250 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6767_210_3273_1_1527855783_1718932_173-256601-0_sp0.9 from training. Duration: 0.888875 2023-10-03 22:58:42,279 WARNING [train.py:1204] (2/4) Exclude cut with ID _13921_210_9994_1_1530841782272_3503520_327-69677-0_sp1.1 from training. Duration: 0.8181875 2023-10-03 22:58:43,602 WARNING [train.py:1204] (2/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_508-335369-0 from training. Duration: 0.48 2023-10-03 22:58:45,343 WARNING [train.py:1204] (2/4) Exclude cut with ID _64631_210_27767_1_1535349580068_4165160_24-46567-0_sp1.1 from training. Duration: 0.963625 2023-10-03 22:58:49,821 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_89-35748-0_sp1.1 from training. Duration: 0.7181875 2023-10-03 22:58:49,998 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.conv_module2.balancer2.prob, batch_count=1436640.0, ans=0.125 2023-10-03 22:58:58,165 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_26433_210_12108_1_1532157158_4056480_578-262107-0_sp1.1 from training. Duration: 0.9 2023-10-03 22:59:04,703 WARNING [train.py:1204] (2/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_129-337802-0 from training. Duration: 0.72 2023-10-03 22:59:06,129 WARNING [train.py:1204] (2/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_356-167816-0_sp0.9 from training. Duration: 0.8 2023-10-03 22:59:08,821 WARNING [train.py:1204] (2/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_137-116442-0_sp1.1 from training. Duration: 0.7090625 2023-10-03 22:59:08,876 WARNING [train.py:1204] (2/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_358-172940-0_sp1.1 from training. Duration: 0.963625 2023-10-03 22:59:10,059 WARNING [train.py:1204] (2/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_668-344147-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 22:59:11,476 WARNING [train.py:1204] (2/4) Exclude cut with ID _62337_210_12392_1_1535009273105_3412229_255-157667-0_sp1.1 from training. Duration: 0.936375 2023-10-03 22:59:11,479 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_486-151131-0 from training. Duration: 0.85 2023-10-03 22:59:14,809 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_53638_210_21581_1_1534297319_5808449_885-80172-0 from training. Duration: 0.96 2023-10-03 22:59:17,461 WARNING [train.py:1204] (2/4) Exclude cut with ID _50093_210_4220_1_1534417141207_3432299_105-183067-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 22:59:17,523 WARNING [train.py:1204] (2/4) Exclude cut with ID _7562_210_2084_1_1528887767916_3946229_345-169156-0_sp0.9 from training. Duration: 0.6444375 2023-10-03 22:59:20,681 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_4528_210_3273_1_1527076654_3276777_209-219804-0_sp0.9 from training. Duration: 0.7666875 2023-10-03 22:59:20,722 WARNING [train.py:1204] (2/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_35-153886-0_sp0.9 from training. Duration: 0.9333125 2023-10-03 22:59:20,771 WARNING [train.py:1204] (2/4) Exclude cut with ID _30800_210_22382_1_1532499347904_4515959_332-121925-0_sp1.1 from training. Duration: 0.92725 2023-10-03 22:59:20,773 WARNING [train.py:1204] (2/4) Exclude cut with ID _59202_210_26984_1_1534816810612_3549174_495-65515-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 22:59:22,410 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.conv_module2.balancer1.prob, batch_count=1436773.3333333333, ans=0.125 2023-10-03 22:59:23,572 WARNING [train.py:1204] (2/4) Exclude cut with ID _6274_210_2084_1_1527937583837_7494020_769-168302-0_sp1.1 from training. Duration: 0.6454375 2023-10-03 22:59:23,617 WARNING [train.py:1204] (2/4) Exclude cut with ID _24271_210_12658_1_1531987270022_7190519_708-71345-0_sp1.1 from training. Duration: 0.936375 2023-10-03 22:59:23,617 WARNING [train.py:1204] (2/4) Exclude cut with ID 3033-130750-0096-55598-0_sp0.9 from training. Duration: 0.92225 2023-10-03 22:59:25,039 WARNING [train.py:1204] (2/4) Exclude cut with ID _14087_210_2253_1_1530529224512_3014029_219-224445-0_sp0.9 from training. Duration: 0.9333125 2023-10-03 22:59:27,755 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_174-277364-0 from training. Duration: 0.71 2023-10-03 22:59:29,093 WARNING [train.py:1204] (2/4) Exclude cut with ID _45021_210_9994_1_1533619751094_3330288_96-239010-0_sp1.1 from training. Duration: 0.82725 2023-10-03 22:59:29,122 WARNING [train.py:1204] (2/4) Exclude cut with ID _31602_210_5854_1_1532677021456_4458250_489-305116-0_sp1.1 from training. Duration: 0.97275 2023-10-03 22:59:29,140 WARNING [train.py:1204] (2/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_269-154726-0_sp0.9 from training. Duration: 0.9555625 2023-10-03 22:59:33,664 WARNING [train.py:1204] (2/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_126-54418-0_sp1.1 from training. Duration: 0.92725 2023-10-03 22:59:33,697 WARNING [train.py:1204] (2/4) Exclude cut with ID _42326_210_7770_1_1533814282531_3707269_182-224159-0_sp1.1 from training. Duration: 0.92725 2023-10-03 22:59:35,709 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7002_210_1794_1_1527939683_5157286_1-281264-0_sp0.9 from training. Duration: 0.488875 2023-10-03 22:59:35,751 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_86-16881-0 from training. Duration: 0.58 2023-10-03 22:59:36,980 WARNING [train.py:1204] (2/4) Exclude cut with ID _19514_210_9259_1_1531530071330_5084230_637-279094-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 22:59:37,006 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_26433_210_12108_1_1532157158_4056480_578-262107-0 from training. Duration: 0.99 2023-10-03 22:59:37,050 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6291_210_3273_1_1527683649_1078498_82-221196-0_sp1.1 from training. Duration: 0.6909375 2023-10-03 22:59:38,447 WARNING [train.py:1204] (2/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_402-102311-0 from training. Duration: 0.67 2023-10-03 22:59:41,232 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1015-343884-0_sp1.1 from training. Duration: 0.563625 2023-10-03 22:59:42,552 WARNING [train.py:1204] (2/4) Exclude cut with ID _13684_210_7497_1_1530619354863_3598359_312-235606-0_sp1.1 from training. Duration: 0.5454375 2023-10-03 22:59:42,583 WARNING [train.py:1204] (2/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_55-31904-0 from training. Duration: 0.88 2023-10-03 22:59:44,384 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_3133_210_2087_1_1526106446_2324690_25-332138-0 from training. Duration: 0.91 2023-10-03 22:59:44,386 WARNING [train.py:1204] (2/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_912-282412-0_sp0.9 from training. Duration: 0.5444375 2023-10-03 22:59:44,450 WARNING [train.py:1204] (2/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_458-299435-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 22:59:45,742 INFO [train.py:1046] (2/4) Epoch 41, batch 3050, loss[loss=0.2031, simple_loss=0.2758, pruned_loss=0.06518, over 19479.00 frames. ], tot_loss[loss=0.1575, simple_loss=0.2376, pruned_loss=0.03872, over 4713044.66 frames. ], batch size: 388, lr: 2.48e-03, grad_scale: 8.0 2023-10-03 22:59:47,119 WARNING [train.py:1204] (2/4) Exclude cut with ID _69584_210_5219_1_1535785133775_4044450_275-88403-0_sp1.1 from training. Duration: 0.92725 2023-10-03 22:59:47,127 WARNING [train.py:1204] (2/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_841-341679-0_sp1.1 from training. Duration: 0.52725 2023-10-03 22:59:47,134 WARNING [train.py:1204] (2/4) Exclude cut with ID _41391_210_7994_1_1533603771872_3638201_1-54521-0_sp1.1 from training. Duration: 0.97275 2023-10-03 22:59:47,187 WARNING [train.py:1204] (2/4) Exclude cut with ID _56235_210_10640_1_1534557335235_4161099_495-186609-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 22:59:49,970 WARNING [train.py:1204] (2/4) Exclude cut with ID _4681_210_3525_1_1527250689464_120889_15-126760-0 from training. Duration: 0.81 2023-10-03 22:59:52,749 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_3019_210_2028_1_1526044245_3264865_127-135615-0_sp1.1 from training. Duration: 0.8545625 2023-10-03 22:59:54,216 WARNING [train.py:1204] (2/4) Exclude cut with ID _30191_210_1664_1_1533189915047_7196670_915-21561-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 22:59:54,248 WARNING [train.py:1204] (2/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_6-214815-0_sp0.9 from training. Duration: 0.9444375 2023-10-03 22:59:55,923 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.conv_skip_rate, batch_count=1436906.6666666667, ans=0.0 2023-10-03 22:59:57,072 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_45399_210_4852_1_1533624725_7579737_190-137494-0_sp1.1 from training. Duration: 0.97275 2023-10-03 22:59:59,976 WARNING [train.py:1204] (2/4) Exclude cut with ID _41314_210_12651_1_1533459476543_3766460_207-132038-0 from training. Duration: 0.94 2023-10-03 23:00:04,805 WARNING [train.py:1204] (2/4) Exclude cut with ID _14603_210_4220_1_1530615688441_3097319_90-31383-0 from training. Duration: 0.87 2023-10-03 23:00:04,918 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.1.balancer1.prob, batch_count=1436973.3333333333, ans=0.125 2023-10-03 23:00:06,245 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_15517_210_6753_1_1530952293_1664795_119-31001-0 from training. Duration: 0.94 2023-10-03 23:00:06,269 WARNING [train.py:1204] (2/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_684-85342-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 23:00:10,258 WARNING [train.py:1204] (2/4) Exclude cut with ID _14603_210_4220_1_1530615688441_3097319_97-325053-0_sp1.1 from training. Duration: 0.836375 2023-10-03 23:00:13,690 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_68702_210_23689_1_1535799577_3420068_168-25150-0_sp1.1 from training. Duration: 0.97275 2023-10-03 23:00:13,704 WARNING [train.py:1204] (2/4) Exclude cut with ID _65134_210_14763_1_1535335393985_3505368_111-27984-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 23:00:15,018 WARNING [train.py:1204] (2/4) Exclude cut with ID _48678_210_5663_1_1533988830947_3769199_276-226230-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 23:00:19,015 WARNING [train.py:1204] (2/4) Exclude cut with ID _28427_210_4220_1_1532581240967_3061069_43-84928-0_sp1.1 from training. Duration: 0.9 2023-10-03 23:00:19,049 WARNING [train.py:1204] (2/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_158-250116-0_sp1.1 from training. Duration: 0.563625 2023-10-03 23:00:19,286 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.feed_forward1.out_proj.dropout_p, batch_count=1437040.0, ans=0.1 2023-10-03 23:00:20,274 WARNING [train.py:1204] (2/4) Exclude cut with ID _66873_210_5517_1_1535454136506_4293370_279-332556-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 23:00:20,313 WARNING [train.py:1204] (2/4) Exclude cut with ID _42863_210_9070_1_1533902398788_3696920_488-230146-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 23:00:20,313 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_387-247159-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 23:00:21,724 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6729_210_2550_1_1528032481_1788725_261-42619-0_sp1.1 from training. Duration: 0.97275 2023-10-03 23:00:23,122 WARNING [train.py:1204] (2/4) Exclude cut with ID _19896_210_1883_1_1531709022662_6649569_831-44724-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 23:00:25,877 WARNING [train.py:1204] (2/4) Exclude cut with ID _55153_210_2253_1_1534384786677_3162096_280-285944-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 23:00:25,907 WARNING [train.py:1204] (2/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_448-4048-0 from training. Duration: 0.96 2023-10-03 23:00:25,964 WARNING [train.py:1204] (2/4) Exclude cut with ID _29678_210_7893_1_1532421042787_3906118_456-202548-0_sp1.1 from training. Duration: 0.97275 2023-10-03 23:00:25,977 WARNING [train.py:1204] (2/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_107-332398-0_sp1.1 from training. Duration: 0.7090625 2023-10-03 23:00:30,057 WARNING [train.py:1204] (2/4) Exclude cut with ID _13921_210_9994_1_1530838702800_3003429_46-149444-0_sp1.1 from training. Duration: 0.8545625 2023-10-03 23:00:31,141 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.602e+02 1.945e+02 2.143e+02 2.357e+02 3.381e+02, threshold=4.287e+02, percent-clipped=0.0 2023-10-03 23:00:31,237 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_257-20733-0_sp0.9 from training. Duration: 0.8444375 2023-10-03 23:00:31,291 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_53753_210_7234_1_1534320003_3594148_385-234809-0_sp1.1 from training. Duration: 0.963625 2023-10-03 23:00:32,596 WARNING [train.py:1204] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_292-215346-0_sp1.1 from training. Duration: 0.8818125 2023-10-03 23:00:37,252 WARNING [train.py:1204] (2/4) Exclude cut with ID _68965_210_1883_1_1535698962272_3497490_196-124268-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 23:00:37,285 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_23801_210_16223_1_1532066392_3294657_570-5439-0_sp1.1 from training. Duration: 0.8818125 2023-10-03 23:00:40,938 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.1.encoder.layers.1.whiten, num_groups=1, num_channels=256, metric=3.78 vs. limit=12.0 2023-10-03 23:00:43,421 WARNING [train.py:1204] (2/4) Exclude cut with ID _5166_210_2084_1_1527073197160_4087993_349-6928-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 23:00:44,712 WARNING [train.py:1204] (2/4) Exclude cut with ID _60621_210_17071_1_1535013053530_3671739_226-255202-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 23:00:44,716 WARNING [train.py:1204] (2/4) Exclude cut with ID _41817_210_18835_1_1533434477160_7423049_448-130302-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 23:00:46,588 WARNING [train.py:1204] (2/4) Exclude cut with ID _6653_210_2629_1_1527919172441_6732679_758-143677-0_sp1.1 from training. Duration: 0.8909375 2023-10-03 23:00:46,628 WARNING [train.py:1204] (2/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_912-47473-0_sp0.9 from training. Duration: 0.7555625 2023-10-03 23:00:48,086 WARNING [train.py:1204] (2/4) Exclude cut with ID _6694_210_4599_1_1527852637490_3965529_356-169233-0_sp1.1 from training. Duration: 0.9 2023-10-03 23:00:49,488 WARNING [train.py:1204] (2/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_1-99680-0 from training. Duration: 0.67 2023-10-03 23:00:49,587 WARNING [train.py:1204] (2/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_199-86432-0_sp1.1 from training. Duration: 0.8909375 2023-10-03 23:00:49,610 WARNING [train.py:1204] (2/4) Exclude cut with ID _61148_210_6897_1_1535094022726_4217610_362-182430-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 23:00:52,186 WARNING [train.py:1204] (2/4) Exclude cut with ID _49376_210_5986_1_1533985278732_3370740_301-154763-0 from training. Duration: 0.98 2023-10-03 23:00:53,596 WARNING [train.py:1204] (2/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_572-185150-0_sp1.1 from training. Duration: 0.8818125 2023-10-03 23:00:57,662 INFO [train.py:1046] (2/4) Epoch 41, batch 3100, loss[loss=0.1407, simple_loss=0.2303, pruned_loss=0.02553, over 24318.00 frames. ], tot_loss[loss=0.1575, simple_loss=0.2381, pruned_loss=0.03844, over 4717439.00 frames. ], batch size: 61, lr: 2.48e-03, grad_scale: 8.0 2023-10-03 23:00:57,805 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_47896_210_20508_1_1534492518_5958868_558-314572-0_sp1.1 from training. Duration: 0.8818125 2023-10-03 23:00:57,900 WARNING [train.py:1204] (2/4) Exclude cut with ID _45560_210_21982_1_1533812603951_3584999_111-6376-0_sp0.9 from training. Duration: 0.9333125 2023-10-03 23:00:59,336 WARNING [train.py:1204] (2/4) Exclude cut with ID _14170_210_5219_1_1530525949868_4469650_412-317077-0_sp1.1 from training. Duration: 0.6818125 2023-10-03 23:01:02,140 WARNING [train.py:1204] (2/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_718-68505-0 from training. Duration: 0.89 2023-10-03 23:01:05,333 WARNING [train.py:1204] (2/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_77-311404-0 from training. Duration: 0.94 2023-10-03 23:01:06,656 WARNING [train.py:1204] (2/4) Exclude cut with ID _60229_210_27767_1_1534838406158_3655350_264-279573-0 from training. Duration: 0.83 2023-10-03 23:01:06,850 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=1437240.0, ans=0.1 2023-10-03 23:01:08,533 WARNING [train.py:1204] (2/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_106-300985-0_sp1.1 from training. Duration: 0.8181875 2023-10-03 23:01:11,756 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_4183_210_2087_1_1526729100_1869838_79-277189-0_sp1.1 from training. Duration: 0.963625 2023-10-03 23:01:11,765 WARNING [train.py:1204] (2/4) Exclude cut with ID _28427_210_4220_1_1532581240967_3061069_195-295268-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 23:01:14,487 WARNING [train.py:1204] (2/4) Exclude cut with ID _4092_210_2151_1_1526643044449_3135759_356-278694-0_sp0.9 from training. Duration: 0.52225 2023-10-03 23:01:16,758 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.bypass.scale_min, batch_count=1437306.6666666667, ans=0.2 2023-10-03 23:01:17,596 WARNING [train.py:1204] (2/4) Exclude cut with ID _14459_210_2228_1_1531443641946_7516700_777-71446-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 23:01:23,162 WARNING [train.py:1204] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_520-126729-0 from training. Duration: 0.7 2023-10-03 23:01:27,373 WARNING [train.py:1204] (2/4) Exclude cut with ID _13684_210_7497_1_1530619354863_3598359_312-235606-0_sp0.9 from training. Duration: 0.6666875 2023-10-03 23:01:27,413 WARNING [train.py:1204] (2/4) Exclude cut with ID _37537_210_2253_1_1533171758568_3421949_283-349402-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 23:01:27,443 WARNING [train.py:1204] (2/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_29-117932-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 23:01:28,771 WARNING [train.py:1204] (2/4) Exclude cut with ID _29303_210_2575_1_1532392237303_7410649_721-283351-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 23:01:28,851 WARNING [train.py:1204] (2/4) Exclude cut with ID _14886_210_4220_1_1530702020355_3516190_175-229073-0_sp0.9 from training. Duration: 0.5 2023-10-03 23:01:31,666 WARNING [train.py:1204] (2/4) Exclude cut with ID _15458_210_8341_1_1530858614263_3834410_70-268830-0_sp1.1 from training. Duration: 0.97275 2023-10-03 23:01:31,688 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_2295_210_2087_1_1525500993_2407863_234-83104-0 from training. Duration: 0.62 2023-10-03 23:01:31,689 WARNING [train.py:1204] (2/4) Exclude cut with ID _22961_210_8254_1_1531976366672_4278180_317-199546-0_sp1.1 from training. Duration: 0.7818125 2023-10-03 23:01:32,989 WARNING [train.py:1204] (2/4) Exclude cut with ID _27894_210_12392_1_1532415496078_6902439_547-196022-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 23:01:33,089 WARNING [train.py:1204] (2/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_46-207599-0 from training. Duration: 0.64 2023-10-03 23:01:33,567 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.4.encoder.layers.2.conv_module2.whiten, num_groups=1, num_channels=384, metric=8.87 vs. limit=15.0 2023-10-03 23:01:34,677 WARNING [train.py:1204] (2/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_426-326246-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 23:01:36,746 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.feed_forward1.hidden_balancer.prob, batch_count=1437373.3333333333, ans=0.125 2023-10-03 23:01:39,736 WARNING [train.py:1204] (2/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_445-117398-0_sp0.9 from training. Duration: 0.988875 2023-10-03 23:01:41,449 WARNING [train.py:1204] (2/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_563-347832-0 from training. Duration: 0.89 2023-10-03 23:01:42,859 WARNING [train.py:1204] (2/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_252-216674-0 from training. Duration: 0.54 2023-10-03 23:01:42,936 WARNING [train.py:1204] (2/4) Exclude cut with ID _59611_210_8895_1_1534741198240_3650361_187-212779-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 23:01:42,991 WARNING [train.py:1204] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_282-159367-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 23:01:45,803 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_68702_210_23689_1_1535799577_3420068_33-136883-0_sp1.1 from training. Duration: 0.936375 2023-10-03 23:01:45,821 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_47228_210_4983_1_1533780021_3672027_184-293706-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 23:01:47,076 WARNING [train.py:1204] (2/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_656-188361-0_sp0.9 from training. Duration: 0.9666875 2023-10-03 23:01:47,805 WARNING [train.py:1204] (2/4) Exclude cut with ID _4866_210_2577_1_1527404412284_4364659_352-157930-0_sp0.9 from training. Duration: 0.9 2023-10-03 23:01:47,806 WARNING [train.py:1204] (2/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_210-106047-0_sp1.1 from training. Duration: 0.8818125 2023-10-03 23:01:50,482 WARNING [train.py:1204] (2/4) Exclude cut with ID _9389_210_1664_1_1529217337301_5289269_289-213417-0_sp1.1 from training. Duration: 0.8090625 2023-10-03 23:01:50,691 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.conv_module2.balancer2.prob, batch_count=1437440.0, ans=0.125 2023-10-03 23:01:51,049 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.3.encoder.layers.1.whiten, num_groups=1, num_channels=512, metric=6.00 vs. limit=12.0 2023-10-03 23:01:51,749 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_3910_210_1794_1_1526777667_3747260_178-41186-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 23:01:51,753 WARNING [train.py:1204] (2/4) Exclude cut with ID _27052_210_5986_1_1532329197187_3585009_24-172263-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 23:01:51,756 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_5522_210_3273_1_1527249606_3909096_318-203153-0_sp1.1 from training. Duration: 0.3909375 2023-10-03 23:01:55,869 WARNING [train.py:1204] (2/4) Exclude cut with ID _27894_210_12392_1_1532415496078_6902439_222-51982-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 23:01:57,292 WARNING [train.py:1204] (2/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_87-131427-0 from training. Duration: 0.78 2023-10-03 23:02:00,060 WARNING [train.py:1204] (2/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_372-298093-0_sp0.9 from training. Duration: 0.888875 2023-10-03 23:02:01,396 WARNING [train.py:1204] (2/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_663-166973-0 from training. Duration: 0.8 2023-10-03 23:02:01,425 WARNING [train.py:1204] (2/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_429-177469-0_sp1.1 from training. Duration: 0.936375 2023-10-03 23:02:01,461 WARNING [train.py:1204] (2/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_724-172651-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 23:02:01,507 WARNING [train.py:1204] (2/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_103-336064-0 from training. Duration: 0.51 2023-10-03 23:02:04,561 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.balancer2.prob, batch_count=1437506.6666666667, ans=0.125 2023-10-03 23:02:12,101 INFO [train.py:1046] (2/4) Epoch 41, batch 3150, loss[loss=0.1489, simple_loss=0.2153, pruned_loss=0.04128, over 23581.00 frames. ], tot_loss[loss=0.1565, simple_loss=0.2366, pruned_loss=0.03815, over 4721966.77 frames. ], batch size: 256, lr: 2.48e-03, grad_scale: 8.0 2023-10-03 23:02:12,170 WARNING [train.py:1204] (2/4) Exclude cut with ID _48677_210_5663_1_1533902405919_3892989_111-11439-0 from training. Duration: 0.96 2023-10-03 23:02:14,172 WARNING [train.py:1204] (2/4) Exclude cut with ID _8085_210_4557_1_1528455633493_3716100_95-103398-0_sp1.1 from training. Duration: 0.963625 2023-10-03 23:02:15,530 WARNING [train.py:1204] (2/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_1041-110837-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 23:02:15,676 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_50050_210_16223_1_1535435796_7290138_1155-121757-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 23:02:15,678 WARNING [train.py:1204] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_602-275260-0_sp0.9 from training. Duration: 0.911125 2023-10-03 23:02:17,084 WARNING [train.py:1204] (2/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_265-164486-0 from training. Duration: 0.96 2023-10-03 23:02:18,428 WARNING [train.py:1204] (2/4) Exclude cut with ID _46067_210_4220_1_1534125571066_3307450_142-229375-0_sp1.1 from training. Duration: 0.963625 2023-10-03 23:02:18,461 WARNING [train.py:1204] (2/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_310-26186-0_sp1.1 from training. Duration: 0.636375 2023-10-03 23:02:20,453 WARNING [train.py:1204] (2/4) Exclude cut with ID _28461_210_2253_1_1532771775189_3041299_136-270644-0 from training. Duration: 0.93 2023-10-03 23:02:21,805 WARNING [train.py:1204] (2/4) Exclude cut with ID _14860_210_4915_1_1530702020340_7343240_438-286372-0_sp1.1 from training. Duration: 0.936375 2023-10-03 23:02:23,472 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.bypass.skip_rate, batch_count=1437573.3333333333, ans=0.07 2023-10-03 23:02:24,505 WARNING [train.py:1204] (2/4) Exclude cut with ID IC0128W0004-63309 from training. Duration: 0.831 2023-10-03 23:02:27,155 WARNING [train.py:1204] (2/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_135-174080-0 from training. Duration: 0.53 2023-10-03 23:02:27,207 WARNING [train.py:1204] (2/4) Exclude cut with ID _41326_210_5854_1_1533778076509_3492080_277-59511-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 23:02:28,522 WARNING [train.py:1204] (2/4) Exclude cut with ID IC0144W0112-71379 from training. Duration: 0.992 2023-10-03 23:02:28,593 WARNING [train.py:1204] (2/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_711-15688-0_sp0.9 from training. Duration: 0.47775 2023-10-03 23:02:29,941 WARNING [train.py:1204] (2/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_237-332027-0 from training. Duration: 0.95 2023-10-03 23:02:29,996 WARNING [train.py:1204] (2/4) Exclude cut with ID _7970_210_1883_1_1528455616296_3472230_25-178407-0 from training. Duration: 0.71 2023-10-03 23:02:29,997 WARNING [train.py:1204] (2/4) Exclude cut with ID _8480_210_1874_1_1528589391205_6728330_309-139896-0 from training. Duration: 0.81 2023-10-03 23:02:30,014 WARNING [train.py:1204] (2/4) Exclude cut with ID _39571_210_13449_1_1533801584143_3651077_319-268877-0_sp1.1 from training. Duration: 0.936375 2023-10-03 23:02:30,017 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_43802_210_2290_1_1533516203_8829504_155-246880-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 23:02:31,389 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_566-122599-0_sp1.1 from training. Duration: 0.936375 2023-10-03 23:02:33,044 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.feed_forward2.hidden_balancer.prob, batch_count=1437640.0, ans=0.125 2023-10-03 23:02:34,252 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_12868_210_6753_1_1530701932_530275_18-76322-0 from training. Duration: 0.87 2023-10-03 23:02:36,127 WARNING [train.py:1204] (2/4) Exclude cut with ID _56494_210_4220_1_1535284837625_3033169_159-106035-0_sp1.1 from training. Duration: 0.963625 2023-10-03 23:02:37,408 WARNING [train.py:1204] (2/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_98-276013-0_sp1.1 from training. Duration: 0.963625 2023-10-03 23:02:37,450 WARNING [train.py:1204] (2/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_330-216124-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 23:02:40,134 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_177-58005-0_sp0.9 from training. Duration: 0.688875 2023-10-03 23:02:43,523 WARNING [train.py:1204] (2/4) Exclude cut with ID _56235_210_10640_1_1534557335235_4161099_343-139898-0 from training. Duration: 0.96 2023-10-03 23:02:43,585 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6961_210_2087_1_1527937168_6815011_395-185060-0_sp1.1 from training. Duration: 0.863625 2023-10-03 23:02:46,708 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7002_210_1794_1_1527939683_5157286_184-229630-0_sp0.9 from training. Duration: 0.82225 2023-10-03 23:02:46,829 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.1.balancer1.prob, batch_count=1437706.6666666667, ans=0.125 2023-10-03 23:02:46,921 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.nonlin_attention.balancer.prob, batch_count=1437706.6666666667, ans=0.125 2023-10-03 23:02:48,000 WARNING [train.py:1204] (2/4) Exclude cut with ID _59170_210_6721_1_1534983633960_4203335_153-18895-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 23:02:48,160 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.balancer1.prob, batch_count=1437706.6666666667, ans=0.125 2023-10-03 23:02:49,891 WARNING [train.py:1204] (2/4) Exclude cut with ID _49643_210_7989_1_1534640244224_3939031_598-161926-0 from training. Duration: 0.9 2023-10-03 23:02:52,754 WARNING [train.py:1204] (2/4) Exclude cut with ID _4068_210_2629_1_1526697350395_1546259_224-288693-0 from training. Duration: 0.59 2023-10-03 23:02:52,811 WARNING [train.py:1204] (2/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_718-68505-0_sp1.1 from training. Duration: 0.8090625 2023-10-03 23:02:52,861 WARNING [train.py:1204] (2/4) Exclude cut with ID _4068_210_2629_1_1526697350395_1546259_224-288693-0_sp0.9 from training. Duration: 0.6555625 2023-10-03 23:02:52,894 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_2296_210_2087_1_1525503448_3397335_57-185430-0_sp0.9 from training. Duration: 0.6666875 2023-10-03 23:02:52,937 WARNING [train.py:1204] (2/4) Exclude cut with ID _15458_210_8341_1_1530858614263_3834410_49-235472-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 23:02:52,939 WARNING [train.py:1204] (2/4) Exclude cut with ID _64135_210_2253_1_1535677267876_7608972_498-5039-0_sp0.9 from training. Duration: 0.8666875 2023-10-03 23:02:55,538 WARNING [train.py:1204] (2/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_283-220372-0_sp0.9 from training. Duration: 0.62225 2023-10-03 23:02:55,548 WARNING [train.py:1204] (2/4) Exclude cut with ID _6473_210_1741_1_1527901307396_3815380_108-17335-0_sp0.9 from training. Duration: 0.72225 2023-10-03 23:02:55,854 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.conv_skip_rate, batch_count=1437773.3333333333, ans=0.0 2023-10-03 23:02:56,926 WARNING [train.py:1204] (2/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_113-92608-0 from training. Duration: 0.54 2023-10-03 23:02:56,973 WARNING [train.py:1204] (2/4) Exclude cut with ID _14170_210_5219_1_1530525949868_4469650_272-249985-0_sp1.1 from training. Duration: 0.6090625 2023-10-03 23:02:56,985 WARNING [train.py:1204] (2/4) Exclude cut with ID _50376_210_13401_1_1534031502101_4217740_518-187390-0_sp1.1 from training. Duration: 0.97275 2023-10-03 23:02:58,354 WARNING [train.py:1204] (2/4) Exclude cut with ID _59738_210_15379_1_1534849512562_3941470_165-248260-0_sp1.1 from training. Duration: 0.92725 2023-10-03 23:02:59,588 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.566e+02 1.918e+02 2.167e+02 2.429e+02 3.852e+02, threshold=4.335e+02, percent-clipped=0.0 2023-10-03 23:02:59,651 WARNING [train.py:1204] (2/4) Exclude cut with ID _23818_210_6228_1_1531917063884_3676920_400-343925-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 23:03:00,979 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6961_210_2087_1_1527937168_6815011_562-165518-0 from training. Duration: 0.77 2023-10-03 23:03:01,045 WARNING [train.py:1204] (2/4) Exclude cut with ID _24271_210_12658_1_1531987270022_7190519_289-207931-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 23:03:02,458 WARNING [train.py:1204] (2/4) Exclude cut with ID _14632_210_9988_1_1530691279392_3923200_332-105904-0 from training. Duration: 0.9 2023-10-03 23:03:02,478 WARNING [train.py:1204] (2/4) Exclude cut with ID _40402_210_18219_1_1533522324115_2953150_203-232438-0_sp1.1 from training. Duration: 0.97275 2023-10-03 23:03:03,804 WARNING [train.py:1204] (2/4) Exclude cut with ID _13252_210_1925_1_1530671372047_3336909_263-168388-0 from training. Duration: 0.86 2023-10-03 23:03:05,157 WARNING [train.py:1204] (2/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_174-205058-0 from training. Duration: 0.95 2023-10-03 23:03:06,547 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_94-195801-0_sp1.1 from training. Duration: 0.8545625 2023-10-03 23:03:06,572 WARNING [train.py:1204] (2/4) Exclude cut with ID _55153_210_2253_1_1534384786677_3162096_269-103204-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 23:03:08,450 WARNING [train.py:1204] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_748-36150-0 from training. Duration: 0.91 2023-10-03 23:03:09,800 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_12868_210_6753_1_1530702479_3610564_148-172861-0_sp1.1 from training. Duration: 0.4545625 2023-10-03 23:03:09,852 WARNING [train.py:1204] (2/4) Exclude cut with ID _67499_210_17282_1_1535509805220_3692430_460-77730-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 23:03:11,473 WARNING [train.py:1204] (2/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_374-20753-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 23:03:12,818 WARNING [train.py:1204] (2/4) Exclude cut with ID _45329_210_7005_1_1534157951211_6774339_374-289983-0_sp1.1 from training. Duration: 0.97275 2023-10-03 23:03:12,858 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_34886_210_10633_1_1533203882_1755661_76-156096-0_sp0.9 from training. Duration: 0.9666875 2023-10-03 23:03:13,101 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.conv_module1.balancer2.min_abs, batch_count=1437840.0, ans=0.5 2023-10-03 23:03:19,342 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_238-256668-0_sp0.9 from training. Duration: 0.7666875 2023-10-03 23:03:20,738 WARNING [train.py:1204] (2/4) Exclude cut with ID _46683_210_9642_1_1533730913492_4591116_489-146727-0_sp1.1 from training. Duration: 0.97275 2023-10-03 23:03:22,208 WARNING [train.py:1204] (2/4) Exclude cut with ID _13938_210_9988_1_1530518718978_3693960_304-112784-0_sp0.9 from training. Duration: 0.511125 2023-10-03 23:03:24,433 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.2.encoder.layers.0.conv_module1.whiten, num_groups=1, num_channels=384, metric=6.18 vs. limit=15.0 2023-10-03 23:03:26,368 INFO [train.py:1046] (2/4) Epoch 41, batch 3200, loss[loss=0.1489, simple_loss=0.2165, pruned_loss=0.04062, over 23664.00 frames. ], tot_loss[loss=0.1558, simple_loss=0.2356, pruned_loss=0.03806, over 4717885.17 frames. ], batch size: 232, lr: 2.48e-03, grad_scale: 16.0 2023-10-03 23:03:27,898 WARNING [train.py:1204] (2/4) Exclude cut with ID _24271_210_12658_1_1531987270022_7190519_243-46549-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 23:03:27,912 WARNING [train.py:1204] (2/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_78-182912-0_sp1.1 from training. Duration: 0.536375 2023-10-03 23:03:29,550 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.nonlin_attention.balancer.prob, batch_count=1437906.6666666667, ans=0.125 2023-10-03 23:03:30,675 WARNING [train.py:1204] (2/4) Exclude cut with ID _57572_210_5517_1_1534921219624_3809484_121-321334-0_sp1.1 from training. Duration: 0.97275 2023-10-03 23:03:30,782 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_40047_210_2290_1_1533290519_2374311_254-102149-0_sp1.1 from training. Duration: 0.963625 2023-10-03 23:03:30,783 WARNING [train.py:1204] (2/4) Exclude cut with ID _28461_210_2253_1_1532771775189_3041299_96-152158-0 from training. Duration: 0.85 2023-10-03 23:03:33,553 WARNING [train.py:1204] (2/4) Exclude cut with ID _29877_210_6228_1_1532397791106_5276149_161-102979-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 23:03:38,139 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_3787_210_2040_1_1526477544_5478586_603-120324-0_sp0.9 from training. Duration: 0.9 2023-10-03 23:03:38,933 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.2.encoder.layers.1.conv_module2.whiten, num_groups=1, num_channels=384, metric=4.58 vs. limit=15.0 2023-10-03 23:03:42,318 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_41750_210_1960_1_1533520678_3948042_247-213456-0_sp1.1 from training. Duration: 0.97275 2023-10-03 23:03:51,846 WARNING [train.py:1204] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_127-119109-0_sp0.9 from training. Duration: 0.8 2023-10-03 23:03:55,018 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.nonlin_attention.balancer.prob, batch_count=1438040.0, ans=0.125 2023-10-03 23:03:58,798 WARNING [train.py:1204] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_215-202973-0 from training. Duration: 0.88 2023-10-03 23:04:00,035 WARNING [train.py:1204] (2/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_17-320491-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 23:04:04,215 WARNING [train.py:1204] (2/4) Exclude cut with ID _5166_210_2084_1_1527073197160_4087993_310-114452-0 from training. Duration: 0.59 2023-10-03 23:04:04,301 WARNING [train.py:1204] (2/4) Exclude cut with ID _15133_210_6228_1_1530794512074_3080379_29-264458-0_sp0.9 from training. Duration: 0.6444375 2023-10-03 23:04:08,886 WARNING [train.py:1204] (2/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_159-281310-0_sp1.1 from training. Duration: 0.863625 2023-10-03 23:04:08,896 WARNING [train.py:1204] (2/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_195-167011-0_sp0.9 from training. Duration: 0.8333125 2023-10-03 23:04:08,975 WARNING [train.py:1204] (2/4) Exclude cut with ID _14087_210_2253_1_1530529224512_3014029_156-257607-0_sp1.1 from training. Duration: 0.7545625 2023-10-03 23:04:13,127 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_2295_210_2087_1_1525500993_2407863_116-238894-0 from training. Duration: 0.7 2023-10-03 23:04:13,231 WARNING [train.py:1204] (2/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_321-335475-0_sp0.9 from training. Duration: 0.5 2023-10-03 23:04:15,427 INFO [scaling.py:1118] (2/4) WithLoss: name=encoder.encoders.4.encoder.layers.1.self_attn_weights, loss-sum=0.000e+00 2023-10-03 23:04:18,102 WARNING [train.py:1204] (2/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_188-114464-0 from training. Duration: 0.82 2023-10-03 23:04:20,892 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_2592_210_2290_1_1525697025_4882925_157-70512-0 from training. Duration: 0.91 2023-10-03 23:04:22,254 WARNING [train.py:1204] (2/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_138-64210-0_sp1.1 from training. Duration: 0.663625 2023-10-03 23:04:25,253 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_11305_210_1794_1_1529750428_7178148_651-95702-0_sp1.1 from training. Duration: 0.963625 2023-10-03 23:04:25,267 WARNING [train.py:1204] (2/4) Exclude cut with ID _14224_210_4381_1_1530529062037_4264010_379-184223-0_sp0.9 from training. Duration: 0.9444375 2023-10-03 23:04:26,595 WARNING [train.py:1204] (2/4) Exclude cut with ID _8394_210_6702_1_1528590504316_3540189_381-125325-0_sp1.1 from training. Duration: 0.963625 2023-10-03 23:04:26,658 WARNING [train.py:1204] (2/4) Exclude cut with ID IC0622W0021-307659 from training. Duration: 0.896 2023-10-03 23:04:26,661 WARNING [train.py:1204] (2/4) Exclude cut with ID _7624_210_1531_1_1528109802198_4933101_418-349834-0_sp1.1 from training. Duration: 0.5454375 2023-10-03 23:04:29,488 WARNING [train.py:1204] (2/4) Exclude cut with ID _49728_210_12651_1_1534041034294_6788150_607-3416-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 23:04:30,831 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8275_210_4198_1_1528455320_3892571_491-17707-0 from training. Duration: 0.64 2023-10-03 23:04:30,873 WARNING [train.py:1204] (2/4) Exclude cut with ID _5287_210_2084_1_1527418883040_3878670_333-48508-0 from training. Duration: 0.6 2023-10-03 23:04:32,280 WARNING [train.py:1204] (2/4) Exclude cut with ID _8365_210_2064_1_1528592311070_4677690_371-122295-0 from training. Duration: 0.55 2023-10-03 23:04:33,646 WARNING [train.py:1204] (2/4) Exclude cut with ID _14860_210_4915_1_1530702020340_7343240_836-328489-0 from training. Duration: 0.9 2023-10-03 23:04:35,708 WARNING [train.py:1204] (2/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_1033-274376-0_sp0.9 from training. Duration: 0.9666875 2023-10-03 23:04:37,088 WARNING [train.py:1204] (2/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_475-127455-0_sp0.9 from training. Duration: 0.77775 2023-10-03 23:04:37,101 WARNING [train.py:1204] (2/4) Exclude cut with ID IC0671W0071-331914 from training. Duration: 0.896 2023-10-03 23:04:37,122 WARNING [train.py:1204] (2/4) Exclude cut with ID _30327_210_16049_1_1532484161295_3924169_2-1523-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 23:04:37,124 WARNING [train.py:1204] (2/4) Exclude cut with ID _24314_210_4915_1_1532135078116_7170179_460-208279-0_sp1.1 from training. Duration: 0.97275 2023-10-03 23:04:39,629 INFO [train.py:1046] (2/4) Epoch 41, batch 3250, loss[loss=0.1545, simple_loss=0.2316, pruned_loss=0.03865, over 23446.00 frames. ], tot_loss[loss=0.1556, simple_loss=0.2347, pruned_loss=0.03827, over 4703955.11 frames. ], batch size: 134, lr: 2.48e-03, grad_scale: 16.0 2023-10-03 23:04:39,724 WARNING [train.py:1204] (2/4) Exclude cut with ID IC0679W0052-335859 from training. Duration: 0.64 2023-10-03 23:04:40,008 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=1438240.0, ans=0.1 2023-10-03 23:04:46,268 WARNING [train.py:1204] (2/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_3-237434-0_sp1.1 from training. Duration: 0.6909375 2023-10-03 23:04:47,832 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.feed_forward2.hidden_balancer.prob, batch_count=1438240.0, ans=0.125 2023-10-03 23:04:48,967 WARNING [train.py:1204] (2/4) Exclude cut with ID _69884_210_9771_1_1535846392818_4166169_405-185283-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 23:04:54,814 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.self_attn_weights.pos_emb_skip_rate, batch_count=1438306.6666666667, ans=0.0 2023-10-03 23:04:55,973 WARNING [train.py:1204] (2/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_927-83684-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 23:04:55,982 WARNING [train.py:1204] (2/4) Exclude cut with ID _6694_210_4599_1_1527852637490_3965529_356-169233-0 from training. Duration: 0.99 2023-10-03 23:04:57,342 WARNING [train.py:1204] (2/4) Exclude cut with ID _19896_210_1883_1_1531709022662_6649569_562-233001-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 23:04:57,382 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_20120_210_6753_1_1531643035_5482859_354-78629-0_sp1.1 from training. Duration: 0.963625 2023-10-03 23:04:57,383 WARNING [train.py:1204] (2/4) Exclude cut with ID _60135_210_13427_1_1534935557922_3651319_96-168198-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 23:04:58,860 WARNING [train.py:1204] (2/4) Exclude cut with ID _31602_210_5854_1_1532677021456_4458250_249-96604-0_sp1.1 from training. Duration: 0.8090625 2023-10-03 23:05:00,116 WARNING [train.py:1204] (2/4) Exclude cut with ID _14100_210_8341_1_1530529320613_3678069_258-224599-0_sp0.9 from training. Duration: 0.8555625 2023-10-03 23:05:00,275 WARNING [train.py:1204] (2/4) Exclude cut with ID _19708_210_9206_1_1532136650733_4238364_215-160135-0_sp1.1 from training. Duration: 0.97275 2023-10-03 23:05:01,661 WARNING [train.py:1204] (2/4) Exclude cut with ID _21878_210_3525_1_1531738772002_7264330_113-18163-0_sp0.9 from training. Duration: 0.988875 2023-10-03 23:05:01,715 WARNING [train.py:1204] (2/4) Exclude cut with ID _64135_210_2253_1_1535677267876_7608972_552-337340-0_sp1.1 from training. Duration: 0.92725 2023-10-03 23:05:02,979 WARNING [train.py:1204] (2/4) Exclude cut with ID _67725_210_27767_1_1535608756671_5105359_10-12887-0_sp1.1 from training. Duration: 0.97275 2023-10-03 23:05:02,985 WARNING [train.py:1204] (2/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_401-284840-0_sp1.1 from training. Duration: 0.97275 2023-10-03 23:05:04,445 WARNING [train.py:1204] (2/4) Exclude cut with ID _44659_210_2721_1_1534036120061_6614860_483-141285-0_sp1.1 from training. Duration: 0.936375 2023-10-03 23:05:07,255 WARNING [train.py:1204] (2/4) Exclude cut with ID _9461_210_4557_1_1529060514726_3677451_358-332734-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 23:05:09,247 WARNING [train.py:1204] (2/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_788-298907-0_sp1.1 from training. Duration: 0.8090625 2023-10-03 23:05:10,847 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.balancer2.prob, batch_count=1438373.3333333333, ans=0.125 2023-10-03 23:05:11,936 WARNING [train.py:1204] (2/4) Exclude cut with ID _39411_210_5986_1_1533538825030_3343649_354-267196-0_sp1.1 from training. Duration: 0.92725 2023-10-03 23:05:11,969 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_14953_210_2151_1_1530701710_5616165_192-309792-0_sp1.1 from training. Duration: 0.97275 2023-10-03 23:05:13,421 WARNING [train.py:1204] (2/4) Exclude cut with ID _59170_210_6721_1_1534983633960_4203335_290-151255-0_sp1.1 from training. Duration: 0.92725 2023-10-03 23:05:14,755 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_5575_210_1410_1_1527301305_2570170_249-93769-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 23:05:14,765 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_46013_210_7234_1_1533776362_3744032_463-52378-0_sp1.1 from training. Duration: 0.7818125 2023-10-03 23:05:20,041 WARNING [train.py:1204] (2/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_580-93798-0 from training. Duration: 0.56 2023-10-03 23:05:20,087 WARNING [train.py:1204] (2/4) Exclude cut with ID _19514_210_9259_1_1531530071330_5084230_385-13762-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 23:05:20,258 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.ff3_skip_rate, batch_count=1438373.3333333333, ans=0.0 2023-10-03 23:05:21,359 WARNING [train.py:1204] (2/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_458-67321-0_sp1.1 from training. Duration: 0.8818125 2023-10-03 23:05:21,461 WARNING [train.py:1204] (2/4) Exclude cut with ID _41298_210_9019_1_1533430371450_4822143_188-154791-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 23:05:22,846 WARNING [train.py:1204] (2/4) Exclude cut with ID _15037_210_2253_1_1530752294104_3267650_296-192597-0_sp0.9 from training. Duration: 0.9 2023-10-03 23:05:26,871 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.688e+02 1.962e+02 2.147e+02 2.402e+02 3.461e+02, threshold=4.294e+02, percent-clipped=0.0 2023-10-03 23:05:27,211 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.conv_module1.balancer1.prob, batch_count=1438440.0, ans=0.125 2023-10-03 23:05:28,503 WARNING [train.py:1204] (2/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_661-264857-0_sp1.1 from training. Duration: 0.8454375 2023-10-03 23:05:34,181 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_34008_210_10431_1_1533345424_8183247_662-317331-0_sp1.1 from training. Duration: 0.8545625 2023-10-03 23:05:34,205 WARNING [train.py:1204] (2/4) Exclude cut with ID _46502_210_13794_1_1533724452198_3657159_241-278329-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 23:05:34,206 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_11349_210_6753_1_1529800861_1101207_26-65602-0 from training. Duration: 0.96 2023-10-03 23:05:35,455 WARNING [train.py:1204] (2/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_448-4048-0_sp1.1 from training. Duration: 0.87275 2023-10-03 23:05:35,462 WARNING [train.py:1204] (2/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_662-275621-0_sp1.1 from training. Duration: 0.3818125 2023-10-03 23:05:35,511 WARNING [train.py:1204] (2/4) Exclude cut with ID _13938_210_9988_1_1530518718978_3693960_256-54454-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 23:05:38,256 WARNING [train.py:1204] (2/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_256-32662-0 from training. Duration: 0.9 2023-10-03 23:05:40,164 WARNING [train.py:1204] (2/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_326-17061-0 from training. Duration: 0.94 2023-10-03 23:05:40,191 WARNING [train.py:1204] (2/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_505-97987-0_sp1.1 from training. Duration: 0.7818125 2023-10-03 23:05:41,614 WARNING [train.py:1204] (2/4) Exclude cut with ID _66873_210_5517_1_1535454136506_4293370_316-294364-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 23:05:41,659 WARNING [train.py:1204] (2/4) Exclude cut with ID _19514_210_9259_1_1531530071330_5084230_762-231063-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 23:05:41,711 WARNING [train.py:1204] (2/4) Exclude cut with ID _4697_210_2069_1_1526815801953_3823098_68-219807-0_sp0.9 from training. Duration: 0.52225 2023-10-03 23:05:41,750 WARNING [train.py:1204] (2/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_479-158053-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 23:05:45,746 WARNING [train.py:1204] (2/4) Exclude cut with ID _66112_210_10635_1_1535590236305_3801484_83-38181-0_sp1.1 from training. Duration: 0.936375 2023-10-03 23:05:45,766 WARNING [train.py:1204] (2/4) Exclude cut with ID _55857_210_2346_1_1534644007831_6898209_255-146346-0_sp1.1 from training. Duration: 0.963625 2023-10-03 23:05:47,702 WARNING [train.py:1204] (2/4) Exclude cut with ID _48836_210_12210_1_1534075848293_3904150_429-35475-0 from training. Duration: 0.89 2023-10-03 23:05:47,735 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder_embed.convnext.layerdrop_rate, batch_count=1438506.6666666667, ans=0.015 2023-10-03 23:05:49,433 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_50154_210_13731_1_1534750862_1080961_119-187163-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 23:05:51,536 WARNING [train.py:1204] (2/4) Exclude cut with ID _8793_210_6310_1_1528542985139_2310009_121-179782-0_sp0.9 from training. Duration: 0.888875 2023-10-03 23:05:51,539 WARNING [train.py:1204] (2/4) Exclude cut with ID _14884_210_2252_1_1530707459689_5561250_591-173406-0 from training. Duration: 0.96 2023-10-03 23:05:54,109 INFO [train.py:1046] (2/4) Epoch 41, batch 3300, loss[loss=0.1871, simple_loss=0.2583, pruned_loss=0.05795, over 19354.00 frames. ], tot_loss[loss=0.1559, simple_loss=0.2353, pruned_loss=0.03826, over 4695346.15 frames. ], batch size: 388, lr: 2.48e-03, grad_scale: 16.0 2023-10-03 23:05:54,230 WARNING [train.py:1204] (2/4) Exclude cut with ID _15133_210_6228_1_1530794512074_3080379_54-198193-0_sp1.1 from training. Duration: 0.8545625 2023-10-03 23:05:54,242 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6545_210_2040_1_1527774670_4245258_407-48070-0 from training. Duration: 0.76 2023-10-03 23:05:56,881 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1032-236863-0 from training. Duration: 0.84 2023-10-03 23:05:56,970 WARNING [train.py:1204] (2/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_507-303370-0 from training. Duration: 0.96 2023-10-03 23:05:56,991 WARNING [train.py:1204] (2/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_164-282366-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 23:06:01,116 WARNING [train.py:1204] (2/4) Exclude cut with ID _39867_210_9826_1_1533553222030_9344259_312-158727-0_sp1.1 from training. Duration: 0.963625 2023-10-03 23:06:03,787 WARNING [train.py:1204] (2/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_174-205058-0_sp1.1 from training. Duration: 0.863625 2023-10-03 23:06:03,826 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_11305_210_1794_1_1529750428_7178148_166-237358-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 23:06:06,501 WARNING [train.py:1204] (2/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_26-175553-0_sp1.1 from training. Duration: 0.5545625 2023-10-03 23:06:06,520 WARNING [train.py:1204] (2/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_96-290039-0_sp1.1 from training. Duration: 0.5090625 2023-10-03 23:06:07,987 WARNING [train.py:1204] (2/4) Exclude cut with ID _53219_210_4525_1_1534834836008_4064825_261-101691-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 23:06:10,625 WARNING [train.py:1204] (2/4) Exclude cut with ID _24271_210_12658_1_1531987270022_7190519_96-293390-0_sp1.1 from training. Duration: 0.936375 2023-10-03 23:06:14,020 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_271-234962-0 from training. Duration: 0.79 2023-10-03 23:06:15,389 WARNING [train.py:1204] (2/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_136-30660-0_sp1.1 from training. Duration: 0.97275 2023-10-03 23:06:15,417 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_20120_210_6753_1_1531643035_5482859_625-281046-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 23:06:16,787 WARNING [train.py:1204] (2/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_333-168193-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 23:06:18,472 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9029W0269-509664 from training. Duration: 0.896 2023-10-03 23:06:18,568 WARNING [train.py:1204] (2/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_450-147393-0_sp1.1 from training. Duration: 0.92725 2023-10-03 23:06:19,885 WARNING [train.py:1204] (2/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_643-52007-0_sp1.1 from training. Duration: 0.5181875 2023-10-03 23:06:19,953 WARNING [train.py:1204] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_330-327861-0_sp0.9 from training. Duration: 0.7444375 2023-10-03 23:06:19,971 WARNING [train.py:1204] (2/4) Exclude cut with ID _47790_210_16187_1_1533862814066_4207519_260-317651-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 23:06:19,994 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9044W0010-516654 from training. Duration: 0.896 2023-10-03 23:06:23,834 WARNING [train.py:1204] (2/4) Exclude cut with ID _14087_210_2253_1_1530529224512_3014029_265-118378-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 23:06:25,059 WARNING [train.py:1204] (2/4) Exclude cut with ID _6194_210_1534_1_1527511826160_3941194_321-148641-0_sp0.9 from training. Duration: 0.711125 2023-10-03 23:06:26,395 WARNING [train.py:1204] (2/4) Exclude cut with ID _54871_210_22356_1_1534665798253_3856359_101-169176-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 23:06:26,398 WARNING [train.py:1204] (2/4) Exclude cut with ID 497-129325-0061-62254-0_sp1.1 from training. Duration: 0.97725 2023-10-03 23:06:27,898 WARNING [train.py:1204] (2/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_795-33252-0_sp1.1 from training. Duration: 0.336375 2023-10-03 23:06:27,922 WARNING [train.py:1204] (2/4) Exclude cut with ID _28317_210_12742_1_1532480302381_8510325_168-246176-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 23:06:29,276 WARNING [train.py:1204] (2/4) Exclude cut with ID _7160_210_1876_1_1527940923231_3703799_120-150943-0_sp1.1 from training. Duration: 0.736375 2023-10-03 23:06:30,774 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9084W0005-536844 from training. Duration: 0.768 2023-10-03 23:06:32,304 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.conv_module1.balancer2.prob, batch_count=1438706.6666666667, ans=0.125 2023-10-03 23:06:33,390 WARNING [train.py:1204] (2/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_234-260682-0 from training. Duration: 0.84 2023-10-03 23:06:33,422 WARNING [train.py:1204] (2/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_188-114464-0_sp0.9 from training. Duration: 0.911125 2023-10-03 23:06:36,070 WARNING [train.py:1204] (2/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_81-75849-0 from training. Duration: 0.39 2023-10-03 23:06:38,684 WARNING [train.py:1204] (2/4) Exclude cut with ID _14224_210_4381_1_1530529062037_4264010_379-184223-0_sp1.1 from training. Duration: 0.77275 2023-10-03 23:06:38,887 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.bypass_mid.scale_min, batch_count=1438773.3333333333, ans=0.2 2023-10-03 23:06:41,396 WARNING [train.py:1204] (2/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_454-123002-0_sp1.1 from training. Duration: 0.62725 2023-10-03 23:06:41,445 WARNING [train.py:1204] (2/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_887-249555-0_sp1.1 from training. Duration: 0.7 2023-10-03 23:06:43,532 WARNING [train.py:1204] (2/4) Exclude cut with ID _31088_210_16446_1_1532520012284_3440919_20-260902-0_sp1.1 from training. Duration: 0.97275 2023-10-03 23:06:44,918 WARNING [train.py:1204] (2/4) Exclude cut with ID _45329_210_7005_1_1534157951211_6774339_856-344574-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 23:06:44,920 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_36756_210_7234_1_1533026991_966276_4-164118-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 23:06:46,173 WARNING [train.py:1204] (2/4) Exclude cut with ID _8365_210_2064_1_1528592311070_4677690_371-122295-0_sp1.1 from training. Duration: 0.5 2023-10-03 23:06:47,614 WARNING [train.py:1204] (2/4) Exclude cut with ID _5910_210_1389_1_1527337972454_5050100_555-159815-0_sp1.1 from training. Duration: 0.8909375 2023-10-03 23:06:47,633 WARNING [train.py:1204] (2/4) Exclude cut with ID _37086_210_9038_1_1532999540891_7021250_469-147941-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 23:06:48,975 WARNING [train.py:1204] (2/4) Exclude cut with ID _3067_210_2667_1_1526212974640_4503168_459-173867-0_sp0.9 from training. Duration: 0.97775 2023-10-03 23:06:50,772 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9156W0006-572499 from training. Duration: 0.896 2023-10-03 23:06:52,744 WARNING [train.py:1204] (2/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_277-210798-0 from training. Duration: 0.55 2023-10-03 23:06:55,438 WARNING [train.py:1204] (2/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_103-336064-0_sp1.1 from training. Duration: 0.463625 2023-10-03 23:06:55,498 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_3414_210_1988_1_1526212718_4813195_331-290945-0_sp1.1 from training. Duration: 0.936375 2023-10-03 23:06:55,499 WARNING [train.py:1204] (2/4) Exclude cut with ID _67499_210_17282_1_1535509805220_3692430_365-29276-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 23:06:58,723 WARNING [train.py:1204] (2/4) Exclude cut with ID _14821_210_8750_1_1530680503991_6829359_69-250847-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 23:06:58,725 WARNING [train.py:1204] (2/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_1102-61301-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 23:06:59,047 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.self_attn_weights.pos_emb_skip_rate, batch_count=1438840.0, ans=0.0 2023-10-03 23:07:00,114 WARNING [train.py:1204] (2/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_166-246557-0_sp1.1 from training. Duration: 0.5818125 2023-10-03 23:07:00,162 WARNING [train.py:1204] (2/4) Exclude cut with ID _15351_210_10626_1_1530856635653_3576210_214-129918-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 23:07:00,191 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_120-41668-0_sp0.9 from training. Duration: 0.77775 2023-10-03 23:07:00,259 WARNING [train.py:1204] (2/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_27-72278-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 23:07:02,886 WARNING [train.py:1204] (2/4) Exclude cut with ID _24314_210_4915_1_1532135078116_7170179_521-258868-0_sp1.1 from training. Duration: 0.7181875 2023-10-03 23:07:03,209 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.feed_forward3.hidden_balancer.prob, batch_count=1438840.0, ans=0.125 2023-10-03 23:07:04,477 WARNING [train.py:1204] (2/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_143-86344-0 from training. Duration: 0.77 2023-10-03 23:07:04,509 WARNING [train.py:1204] (2/4) Exclude cut with ID _53489_210_6228_1_1534298357169_3835970_465-157433-0_sp1.1 from training. Duration: 0.92725 2023-10-03 23:07:05,811 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_248-319353-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 23:07:07,079 INFO [train.py:1046] (2/4) Epoch 41, batch 3350, loss[loss=0.1636, simple_loss=0.2437, pruned_loss=0.04177, over 23392.00 frames. ], tot_loss[loss=0.1569, simple_loss=0.2365, pruned_loss=0.03862, over 4708423.59 frames. ], batch size: 93, lr: 2.48e-03, grad_scale: 8.0 2023-10-03 23:07:07,222 WARNING [train.py:1204] (2/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_102-77064-0_sp1.1 from training. Duration: 0.8454375 2023-10-03 23:07:07,403 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.conv_module2.balancer1.prob, batch_count=1438906.6666666667, ans=0.125 2023-10-03 23:07:08,496 WARNING [train.py:1204] (2/4) Exclude cut with ID _9389_210_1664_1_1529217337301_5289269_379-128794-0_sp1.1 from training. Duration: 0.77275 2023-10-03 23:07:08,701 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.balancer1.prob, batch_count=1438906.6666666667, ans=0.125 2023-10-03 23:07:09,882 WARNING [train.py:1204] (2/4) Exclude cut with ID _3231_210_2106_1_1526205864934_6784220_236-260835-0_sp1.1 from training. Duration: 0.97275 2023-10-03 23:07:10,168 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.bypass.scale_min, batch_count=1438906.6666666667, ans=0.2 2023-10-03 23:07:11,223 WARNING [train.py:1204] (2/4) Exclude cut with ID _24271_210_12658_1_1531987270022_7190519_184-342363-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 23:07:11,224 WARNING [train.py:1204] (2/4) Exclude cut with ID _27894_210_12392_1_1532415496078_6902439_67-221663-0_sp1.1 from training. Duration: 0.92725 2023-10-03 23:07:14,447 WARNING [train.py:1204] (2/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_304-59227-0_sp1.1 from training. Duration: 0.7 2023-10-03 23:07:15,817 WARNING [train.py:1204] (2/4) Exclude cut with ID _58605_210_12210_1_1534723195939_3904259_458-42232-0_sp1.1 from training. Duration: 0.92725 2023-10-03 23:07:15,894 WARNING [train.py:1204] (2/4) Exclude cut with ID _14922_210_6228_1_1530707435273_3693310_13-165512-0_sp0.9 from training. Duration: 0.911125 2023-10-03 23:07:18,735 WARNING [train.py:1204] (2/4) Exclude cut with ID _58602_210_2346_1_1534903210085_6908830_792-102241-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 23:07:18,950 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.conv_module1.balancer2.prob, batch_count=1438906.6666666667, ans=0.125 2023-10-03 23:07:20,103 WARNING [train.py:1204] (2/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_588-125165-0_sp0.9 from training. Duration: 0.92225 2023-10-03 23:07:21,930 WARNING [train.py:1204] (2/4) Exclude cut with ID _54218_210_12210_1_1534413646388_4409259_239-98429-0_sp1.1 from training. Duration: 0.97275 2023-10-03 23:07:22,612 WARNING [train.py:1204] (2/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_538-32291-0_sp1.1 from training. Duration: 0.7545625 2023-10-03 23:07:25,188 WARNING [train.py:1204] (2/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_419-317062-0 from training. Duration: 0.91 2023-10-03 23:07:25,292 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9309W0202-640329 from training. Duration: 0.896 2023-10-03 23:07:25,317 WARNING [train.py:1204] (2/4) Exclude cut with ID _3231_210_2106_1_1526205864934_6784220_419-313291-0_sp1.1 from training. Duration: 0.97275 2023-10-03 23:07:26,266 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.0.layers.1.nonlin_attention.whiten1, num_groups=1, num_channels=144, metric=5.52 vs. limit=10.0 2023-10-03 23:07:28,066 WARNING [train.py:1204] (2/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_619-344499-0 from training. Duration: 0.63 2023-10-03 23:07:29,836 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_278-272612-0 from training. Duration: 0.73 2023-10-03 23:07:31,236 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_103-84010-0_sp0.9 from training. Duration: 0.7666875 2023-10-03 23:07:31,264 WARNING [train.py:1204] (2/4) Exclude cut with ID _26528_210_9070_1_1532570410842_3851310_497-97218-0_sp1.1 from training. Duration: 0.7818125 2023-10-03 23:07:32,669 WARNING [train.py:1204] (2/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_262-84613-0_sp1.1 from training. Duration: 0.936375 2023-10-03 23:07:32,713 WARNING [train.py:1204] (2/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_387-131538-0 from training. Duration: 0.81 2023-10-03 23:07:32,736 WARNING [train.py:1204] (2/4) Exclude cut with ID _8030_210_7497_1_1528444886781_3531089_224-300489-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 23:07:32,753 WARNING [train.py:1204] (2/4) Exclude cut with ID _14349_210_8341_1_1530615710818_3442470_170-336541-0_sp0.9 from training. Duration: 0.9555625 2023-10-03 23:07:35,512 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_47069_210_15368_1_1533781267_4431320_180-31746-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 23:07:36,915 WARNING [train.py:1204] (2/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_447-7041-0_sp1.1 from training. Duration: 0.92725 2023-10-03 23:07:38,219 WARNING [train.py:1204] (2/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_2-55283-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 23:07:38,284 WARNING [train.py:1204] (2/4) Exclude cut with ID _12555_210_8750_1_1530075594402_3780239_70-181911-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 23:07:41,472 WARNING [train.py:1204] (2/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_311-328022-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 23:07:42,876 WARNING [train.py:1204] (2/4) Exclude cut with ID _14528_210_8254_1_1530669700514_4306939_330-2987-0_sp1.1 from training. Duration: 0.92725 2023-10-03 23:07:42,914 WARNING [train.py:1204] (2/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_385-184858-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 23:07:47,057 WARNING [train.py:1204] (2/4) Exclude cut with ID _64414_210_17301_1_1535243252048_3757759_348-333360-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 23:07:48,407 WARNING [train.py:1204] (2/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_753-214751-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 23:07:49,960 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.balancer1.prob, batch_count=1439106.6666666667, ans=0.125 2023-10-03 23:07:51,085 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_17613_210_2151_1_1531137569_2366160_202-208013-0_sp1.1 from training. Duration: 0.92725 2023-10-03 23:07:51,093 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_43831_210_2158_1_1533725502_5872791_610-129300-0_sp1.1 from training. Duration: 0.936375 2023-10-03 23:07:54,210 WARNING [train.py:1204] (2/4) Exclude cut with ID _54218_210_12210_1_1534413646388_4409259_321-23852-0_sp1.1 from training. Duration: 0.936375 2023-10-03 23:07:55,979 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.529e+02 1.914e+02 2.116e+02 2.439e+02 3.109e+02, threshold=4.232e+02, percent-clipped=0.0 2023-10-03 23:07:56,136 WARNING [train.py:1204] (2/4) Exclude cut with ID _39411_210_5986_1_1533538825030_3343649_173-238248-0 from training. Duration: 0.83 2023-10-03 23:07:56,149 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_405-339986-0_sp0.9 from training. Duration: 0.5666875 2023-10-03 23:07:57,464 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_2009_210_2071_1_1525481134_4588937_385-302100-0 from training. Duration: 0.87 2023-10-03 23:07:57,513 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_161-276996-0_sp1.1 from training. Duration: 0.863625 2023-10-03 23:07:58,857 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_177-58005-0 from training. Duration: 0.62 2023-10-03 23:08:00,277 WARNING [train.py:1204] (2/4) Exclude cut with ID _64639_210_5816_1_1535367619437_3528000_146-343734-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 23:08:02,165 WARNING [train.py:1204] (2/4) Exclude cut with ID _3845_210_1390_1_1526731326141_3523610_72-61441-0_sp1.1 from training. Duration: 0.92725 2023-10-03 23:08:07,768 WARNING [train.py:1204] (2/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_991-125028-0_sp1.1 from training. Duration: 0.936375 2023-10-03 23:08:07,840 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7544_210_6753_1_1528591327_1471426_172-348777-0 from training. Duration: 0.66 2023-10-03 23:08:09,081 WARNING [train.py:1204] (2/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_416-257730-0_sp1.1 from training. Duration: 0.5909375 2023-10-03 23:08:10,939 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1113-7369-0_sp1.1 from training. Duration: 0.77275 2023-10-03 23:08:11,195 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.balancer1.prob, batch_count=1439173.3333333333, ans=0.125 2023-10-03 23:08:12,324 WARNING [train.py:1204] (2/4) Exclude cut with ID _40402_210_18219_1_1533522324115_2953150_242-76943-0_sp1.1 from training. Duration: 0.8545625 2023-10-03 23:08:16,556 WARNING [train.py:1204] (2/4) Exclude cut with ID _44718_210_5816_1_1534050051869_6469030_190-49648-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 23:08:17,956 WARNING [train.py:1204] (2/4) Exclude cut with ID _47251_210_18628_1_1533814063128_4137250_214-88076-0 from training. Duration: 0.88 2023-10-03 23:08:19,407 WARNING [train.py:1204] (2/4) Exclude cut with ID _5585_210_2667_1_1527418813324_8007340_89-332072-0_sp1.1 from training. Duration: 0.5818125 2023-10-03 23:08:19,451 WARNING [train.py:1204] (2/4) Exclude cut with ID _41817_210_18835_1_1533434477160_7423049_555-293448-0_sp1.1 from training. Duration: 0.72725 2023-10-03 23:08:20,802 INFO [train.py:1046] (2/4) Epoch 41, batch 3400, loss[loss=0.1593, simple_loss=0.2411, pruned_loss=0.0387, over 23686.00 frames. ], tot_loss[loss=0.1573, simple_loss=0.2372, pruned_loss=0.03866, over 4710033.89 frames. ], batch size: 232, lr: 2.48e-03, grad_scale: 8.0 2023-10-03 23:08:20,861 WARNING [train.py:1204] (2/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_286-331314-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 23:08:20,912 WARNING [train.py:1204] (2/4) Exclude cut with ID _13921_210_9994_1_1530838702800_3003429_46-149444-0 from training. Duration: 0.94 2023-10-03 23:08:22,229 WARNING [train.py:1204] (2/4) Exclude cut with ID _44778_210_2054_1_1533798127203_7443090_768-43595-0_sp1.1 from training. Duration: 0.936375 2023-10-03 23:08:22,246 WARNING [train.py:1204] (2/4) Exclude cut with ID _35355_210_16040_1_1533212988977_4016229_74-190076-0 from training. Duration: 0.8 2023-10-03 23:08:23,680 WARNING [train.py:1204] (2/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_1007-316380-0_sp1.1 from training. Duration: 0.963625 2023-10-03 23:08:23,730 WARNING [train.py:1204] (2/4) Exclude cut with ID _23557_210_8750_1_1531890106370_6846255_150-283437-0_sp1.1 from training. Duration: 0.963625 2023-10-03 23:08:23,764 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_3474_210_2028_1_1526188513_4825246_145-309131-0_sp1.1 from training. Duration: 0.67275 2023-10-03 23:08:25,654 WARNING [train.py:1204] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_391-22148-0_sp1.1 from training. Duration: 0.7 2023-10-03 23:08:25,845 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.conv_module2.balancer2.prob, batch_count=1439240.0, ans=0.125 2023-10-03 23:08:26,840 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_238-256668-0 from training. Duration: 0.69 2023-10-03 23:08:30,918 WARNING [train.py:1204] (2/4) Exclude cut with ID _8394_210_6702_1_1528590504316_3540189_372-198878-0 from training. Duration: 0.6 2023-10-03 23:08:30,929 WARNING [train.py:1204] (2/4) Exclude cut with ID ID0174W0006-766659 from training. Duration: 0.953 2023-10-03 23:08:30,956 WARNING [train.py:1204] (2/4) Exclude cut with ID _6274_210_2084_1_1527937583837_7494020_668-23019-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 23:08:35,641 WARNING [train.py:1204] (2/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_322-74328-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 23:08:35,647 WARNING [train.py:1204] (2/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_872-264525-0_sp1.1 from training. Duration: 0.5909375 2023-10-03 23:08:36,988 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_53378_210_17282_1_1534221071_5769509_641-286201-0_sp1.1 from training. Duration: 0.97275 2023-10-03 23:08:37,070 WARNING [train.py:1204] (2/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_199-295611-0_sp1.1 from training. Duration: 0.5 2023-10-03 23:08:41,981 WARNING [train.py:1204] (2/4) Exclude cut with ID _5250_210_5219_1_1527249561806_8692110_1270-317971-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 23:08:44,584 WARNING [train.py:1204] (2/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_784-213099-0 from training. Duration: 0.93 2023-10-03 23:08:48,811 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_115-233929-0_sp0.9 from training. Duration: 0.8 2023-10-03 23:08:51,513 WARNING [train.py:1204] (2/4) Exclude cut with ID _54871_210_22356_1_1534665798253_3856359_256-223809-0_sp1.1 from training. Duration: 0.97275 2023-10-03 23:08:51,559 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_20120_210_6753_1_1531643035_5482859_285-87781-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 23:08:52,897 WARNING [train.py:1204] (2/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_619-344499-0_sp1.1 from training. Duration: 0.57275 2023-10-03 23:08:59,402 WARNING [train.py:1204] (2/4) Exclude cut with ID _60229_210_27767_1_1534838406158_3655350_264-279573-0_sp0.9 from training. Duration: 0.92225 2023-10-03 23:09:02,312 WARNING [train.py:1204] (2/4) Exclude cut with ID _7719_210_3525_1_1528456153163_4296960_333-110399-0 from training. Duration: 0.96 2023-10-03 23:09:04,664 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.0.layers.0.feed_forward3.out_whiten, num_groups=1, num_channels=192, metric=12.56 vs. limit=15.0 2023-10-03 23:09:08,434 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_50423_210_13270_1_1534128598_5021734_759-90833-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 23:09:08,484 WARNING [train.py:1204] (2/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_151-254798-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 23:09:09,812 WARNING [train.py:1204] (2/4) Exclude cut with ID _3465_210_3525_1_1526175667060_3574049_450-203637-0 from training. Duration: 0.92 2023-10-03 23:09:09,850 WARNING [train.py:1204] (2/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_673-36456-0_sp1.1 from training. Duration: 0.936375 2023-10-03 23:09:11,806 WARNING [train.py:1204] (2/4) Exclude cut with ID _36538_210_9994_1_1533204020363_3795620_131-8685-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 23:09:13,161 WARNING [train.py:1204] (2/4) Exclude cut with ID _12556_210_8341_1_1530081024100_7327690_713-55626-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 23:09:13,206 WARNING [train.py:1204] (2/4) Exclude cut with ID _2322_210_2667_1_1525604548971_3656299_440-80494-0_sp0.9 from training. Duration: 0.97775 2023-10-03 23:09:14,690 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.0.conv_skip_rate, batch_count=1439440.0, ans=0.0 2023-10-03 23:09:14,723 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.conv_skip_rate, batch_count=1439440.0, ans=0.0 2023-10-03 23:09:15,920 WARNING [train.py:1204] (2/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_210-325081-0_sp1.1 from training. Duration: 0.97275 2023-10-03 23:09:18,768 WARNING [train.py:1204] (2/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_375-244976-0_sp0.9 from training. Duration: 0.8333125 2023-10-03 23:09:18,780 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_15517_210_6753_1_1530952293_1664795_119-31001-0_sp1.1 from training. Duration: 0.8545625 2023-10-03 23:09:23,689 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_41452_210_7291_1_1533437687_3941281_319-42326-0_sp1.1 from training. Duration: 0.92725 2023-10-03 23:09:26,357 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_372-173012-0 from training. Duration: 0.95 2023-10-03 23:09:28,647 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.feed_forward1.hidden_balancer.prob, batch_count=1439506.6666666667, ans=0.125 2023-10-03 23:09:30,105 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.conv_skip_rate, batch_count=1439506.6666666667, ans=0.0 2023-10-03 23:09:31,255 WARNING [train.py:1204] (2/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_41-31157-0_sp1.1 from training. Duration: 0.5181875 2023-10-03 23:09:35,193 INFO [train.py:1046] (2/4) Epoch 41, batch 3450, loss[loss=0.1482, simple_loss=0.2382, pruned_loss=0.02909, over 24684.00 frames. ], tot_loss[loss=0.1578, simple_loss=0.2377, pruned_loss=0.03896, over 4701803.95 frames. ], batch size: 73, lr: 2.48e-03, grad_scale: 4.0 2023-10-03 23:09:36,763 WARNING [train.py:1204] (2/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_91-212486-0 from training. Duration: 0.68 2023-10-03 23:09:40,212 WARNING [train.py:1204] (2/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_1267-272693-0 from training. Duration: 0.73 2023-10-03 23:09:40,240 WARNING [train.py:1204] (2/4) Exclude cut with ID _13938_210_9988_1_1530518718978_3693960_152-67046-0_sp1.1 from training. Duration: 0.936375 2023-10-03 23:09:42,911 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_4528_210_3273_1_1527076654_3276777_342-168068-0_sp1.1 from training. Duration: 0.7909375 2023-10-03 23:09:42,919 WARNING [train.py:1204] (2/4) Exclude cut with ID _7606_210_4915_1_1528894253277_5080690_127-177761-0 from training. Duration: 0.64 2023-10-03 23:09:44,847 WARNING [train.py:1204] (2/4) Exclude cut with ID _45329_210_7005_1_1534157951211_6774339_335-137972-0_sp1.1 from training. Duration: 0.92725 2023-10-03 23:09:47,571 WARNING [train.py:1204] (2/4) Exclude cut with ID _5166_210_2084_1_1527073197160_4087993_331-117829-0_sp1.1 from training. Duration: 0.563625 2023-10-03 23:09:51,849 WARNING [train.py:1204] (2/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_125-125818-0_sp1.1 from training. Duration: 0.87275 2023-10-03 23:09:53,147 WARNING [train.py:1204] (2/4) Exclude cut with ID _6789_210_1534_1_1527854471671_2870519_48-294897-0_sp1.1 from training. Duration: 0.963625 2023-10-03 23:09:54,562 WARNING [train.py:1204] (2/4) Exclude cut with ID _6694_210_4599_1_1527852637490_3965529_116-109398-0_sp1.1 from training. Duration: 0.663625 2023-10-03 23:09:54,571 WARNING [train.py:1204] (2/4) Exclude cut with ID _52892_210_6721_1_1534378941259_4124129_222-172685-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 23:09:55,522 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.self_attn_weights.pos_emb_skip_rate, batch_count=1439640.0, ans=0.0 2023-10-03 23:09:56,347 WARNING [train.py:1204] (2/4) Exclude cut with ID _50655_210_18628_1_1534229942193_3772154_320-132275-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 23:10:02,464 WARNING [train.py:1204] (2/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_76-311963-0 from training. Duration: 0.7 2023-10-03 23:10:06,742 WARNING [train.py:1204] (2/4) Exclude cut with ID _6274_210_2084_1_1527937583837_7494020_32-205288-0 from training. Duration: 0.78 2023-10-03 23:10:06,759 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_86-16881-0_sp0.9 from training. Duration: 0.6444375 2023-10-03 23:10:06,803 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_271-297505-0_sp1.1 from training. Duration: 0.9 2023-10-03 23:10:06,907 WARNING [train.py:1204] (2/4) Exclude cut with ID _57572_210_5517_1_1534921219624_3809484_187-295293-0_sp1.1 from training. Duration: 0.963625 2023-10-03 23:10:14,694 WARNING [train.py:1204] (2/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_563-329943-0 from training. Duration: 0.99 2023-10-03 23:10:14,769 WARNING [train.py:1204] (2/4) Exclude cut with ID _7160_210_1876_1_1527940923231_3703799_302-150728-0_sp0.9 from training. Duration: 0.8666875 2023-10-03 23:10:14,926 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.bypass_mid.scale_min, batch_count=1439706.6666666667, ans=0.2 2023-10-03 23:10:17,585 WARNING [train.py:1204] (2/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_926-7526-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 23:10:18,914 WARNING [train.py:1204] (2/4) Exclude cut with ID _62574_210_2655_1_1535180407058_3933249_467-22348-0_sp1.1 from training. Duration: 0.936375 2023-10-03 23:10:20,281 WARNING [train.py:1204] (2/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_280-161633-0_sp0.9 from training. Duration: 0.811125 2023-10-03 23:10:20,396 WARNING [train.py:1204] (2/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_385-193052-0_sp1.1 from training. Duration: 0.7818125 2023-10-03 23:10:21,762 WARNING [train.py:1204] (2/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_444-28041-0 from training. Duration: 0.85 2023-10-03 23:10:21,766 WARNING [train.py:1204] (2/4) Exclude cut with ID _66070_210_7640_1_1535441393096_7805310_341-249237-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 23:10:23,112 WARNING [train.py:1204] (2/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_425-269826-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 23:10:26,226 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.621e+02 1.900e+02 2.129e+02 2.402e+02 3.276e+02, threshold=4.259e+02, percent-clipped=0.0 2023-10-03 23:10:26,356 WARNING [train.py:1204] (2/4) Exclude cut with ID _53950_210_5816_1_1534417264919_3862951_124-185597-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 23:10:29,014 WARNING [train.py:1204] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_380-204756-0 from training. Duration: 0.86 2023-10-03 23:10:33,267 WARNING [train.py:1204] (2/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_282-257791-0_sp0.9 from training. Duration: 0.9555625 2023-10-03 23:10:35,470 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.2.encoder.layers.1.conv_module1.whiten, num_groups=1, num_channels=384, metric=7.00 vs. limit=15.0 2023-10-03 23:10:37,417 WARNING [train.py:1204] (2/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_372-131163-0_sp1.1 from training. Duration: 0.8545625 2023-10-03 23:10:39,382 WARNING [train.py:1204] (2/4) Exclude cut with ID _28317_210_12742_1_1532480302381_8510325_447-233513-0_sp1.1 from training. Duration: 0.963625 2023-10-03 23:10:42,110 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_54-118333-0_sp1.1 from training. Duration: 0.97275 2023-10-03 23:10:46,721 WARNING [train.py:1204] (2/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_388-230239-0_sp1.1 from training. Duration: 0.963625 2023-10-03 23:10:46,735 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_40047_210_2290_1_1533290519_2374311_141-54639-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 23:10:48,207 WARNING [train.py:1204] (2/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_91-267696-0_sp0.9 from training. Duration: 0.9666875 2023-10-03 23:10:49,472 INFO [train.py:1046] (2/4) Epoch 41, batch 3500, loss[loss=0.1485, simple_loss=0.2309, pruned_loss=0.03306, over 20365.00 frames. ], tot_loss[loss=0.1563, simple_loss=0.236, pruned_loss=0.03826, over 4691293.23 frames. ], batch size: 44, lr: 2.47e-03, grad_scale: 8.0 2023-10-03 23:10:49,532 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_14056_210_4163_1_1530604483_7572827_532-137847-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 23:10:52,237 WARNING [train.py:1204] (2/4) Exclude cut with ID _8064_210_5399_1_1528372299291_4282973_155-283231-0_sp1.1 from training. Duration: 0.97275 2023-10-03 23:10:56,941 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_2245_210_2715_1_1525511548_3540254_310-52483-0_sp1.1 from training. Duration: 0.8 2023-10-03 23:10:56,998 WARNING [train.py:1204] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_185-174805-0 from training. Duration: 0.68 2023-10-03 23:10:58,990 WARNING [train.py:1204] (2/4) Exclude cut with ID _15012_210_1968_1_1530856820429_5095350_662-210469-0_sp1.1 from training. Duration: 0.5181875 2023-10-03 23:11:01,755 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_426-313557-0_sp0.9 from training. Duration: 0.588875 2023-10-03 23:11:03,300 WARNING [train.py:1204] (2/4) Exclude cut with ID _42326_210_7770_1_1533814282531_3707269_407-204343-0_sp1.1 from training. Duration: 0.97275 2023-10-03 23:11:03,314 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_258-310570-0 from training. Duration: 0.82 2023-10-03 23:11:07,381 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_4737_210_2040_1_1526998689_3293008_378-178650-0_sp1.1 from training. Duration: 0.863625 2023-10-03 23:11:08,686 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_47896_210_20508_1_1534492518_5958868_432-327451-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 23:11:12,985 WARNING [train.py:1204] (2/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_188-114464-0_sp1.1 from training. Duration: 0.7454375 2023-10-03 23:11:12,997 WARNING [train.py:1204] (2/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_798-106179-0_sp1.1 from training. Duration: 0.7545625 2023-10-03 23:11:14,290 WARNING [train.py:1204] (2/4) Exclude cut with ID _14368_210_4220_1_1530532714870_3595529_143-117421-0_sp1.1 from training. Duration: 0.52725 2023-10-03 23:11:14,327 WARNING [train.py:1204] (2/4) Exclude cut with ID _67499_210_17282_1_1535509805220_3692430_361-78786-0_sp1.1 from training. Duration: 0.963625 2023-10-03 23:11:15,615 WARNING [train.py:1204] (2/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_712-342563-0_sp1.1 from training. Duration: 0.8818125 2023-10-03 23:11:15,677 WARNING [train.py:1204] (2/4) Exclude cut with ID _9235_210_5219_1_1528891154253_8046120_387-159008-0 from training. Duration: 0.86 2023-10-03 23:11:17,446 WARNING [train.py:1204] (2/4) Exclude cut with ID _33640_210_2655_1_1533117605479_4855469_959-18820-0_sp1.1 from training. Duration: 0.963625 2023-10-03 23:11:18,809 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_12868_210_6753_1_1530702479_3610564_133-564-0_sp0.9 from training. Duration: 0.87775 2023-10-03 23:11:18,921 WARNING [train.py:1204] (2/4) Exclude cut with ID _23514_210_1389_1_1531832139785_3031439_12-29287-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 23:11:21,904 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.conv_module2.balancer1.min_positive, batch_count=1440040.0, ans=0.025 2023-10-03 23:11:23,010 WARNING [train.py:1204] (2/4) Exclude cut with ID _23514_210_1389_1_1531832139785_3031439_489-185052-0_sp1.1 from training. Duration: 0.963625 2023-10-03 23:11:24,323 WARNING [train.py:1204] (2/4) Exclude cut with ID _5444_210_2151_1_1527418808464_5312210_634-194174-0 from training. Duration: 0.97 2023-10-03 23:11:24,341 WARNING [train.py:1204] (2/4) Exclude cut with ID _6275_210_2084_1_1528110062213_7669049_91-26919-0_sp1.1 from training. Duration: 0.8818125 2023-10-03 23:11:27,665 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_37351_210_7925_1_1533012931_3812113_516-82303-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 23:11:28,505 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.self_attn_weights.pos_emb_skip_rate, batch_count=1440040.0, ans=0.0 2023-10-03 23:11:29,497 WARNING [train.py:1204] (2/4) Exclude cut with ID _41314_210_12651_1_1533459476543_3766460_80-74184-0_sp0.9 from training. Duration: 0.92225 2023-10-03 23:11:30,888 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_9460_210_4211_1_1528954587_10049667_495-202963-0_sp1.1 from training. Duration: 0.963625 2023-10-03 23:11:32,400 WARNING [train.py:1204] (2/4) Exclude cut with ID _14632_210_9988_1_1530691279392_3923200_332-105904-0_sp1.1 from training. Duration: 0.8181875 2023-10-03 23:11:33,655 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_40764_210_1925_1_1533882359_3752345_493-167886-0_sp1.1 from training. Duration: 0.92725 2023-10-03 23:11:35,037 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_196-289637-0 from training. Duration: 0.75 2023-10-03 23:11:35,240 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.conv_module1.balancer2.prob, batch_count=1440106.6666666667, ans=0.125 2023-10-03 23:11:36,361 WARNING [train.py:1204] (2/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_26-175553-0 from training. Duration: 0.61 2023-10-03 23:11:36,401 WARNING [train.py:1204] (2/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_890-5117-0 from training. Duration: 0.78 2023-10-03 23:11:37,854 WARNING [train.py:1204] (2/4) Exclude cut with ID _41314_210_12651_1_1533459476543_3766460_206-306860-0_sp1.1 from training. Duration: 0.92725 2023-10-03 23:11:37,980 WARNING [train.py:1204] (2/4) Exclude cut with ID _25882_210_3036_1_1532775665815_6479510_570-79824-0_sp1.1 from training. Duration: 0.963625 2023-10-03 23:11:39,268 WARNING [train.py:1204] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_731-2428-0_sp1.1 from training. Duration: 0.7545625 2023-10-03 23:11:39,294 WARNING [train.py:1204] (2/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_138-154812-0_sp0.9 from training. Duration: 0.8666875 2023-10-03 23:11:42,121 WARNING [train.py:1204] (2/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_178-39722-0_sp0.9 from training. Duration: 0.5444375 2023-10-03 23:11:43,445 WARNING [train.py:1204] (2/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_1285-256622-0_sp0.9 from training. Duration: 0.9333125 2023-10-03 23:11:45,442 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=1440106.6666666667, ans=0.1 2023-10-03 23:11:48,507 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_41428_210_2161_1_1533774184_3992532_65-271323-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 23:11:48,744 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=1440106.6666666667, ans=0.1 2023-10-03 23:11:49,932 WARNING [train.py:1204] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_308-161067-0 from training. Duration: 0.66 2023-10-03 23:11:49,937 WARNING [train.py:1204] (2/4) Exclude cut with ID _3067_210_2667_1_1526212974640_4503168_459-173867-0 from training. Duration: 0.88 2023-10-03 23:11:49,938 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_2221_210_1988_1_1525597051_4935541_415-296882-0_sp1.1 from training. Duration: 0.7 2023-10-03 23:11:50,469 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.4.encoder.layers.2.whiten, num_groups=1, num_channels=384, metric=3.77 vs. limit=12.0 2023-10-03 23:11:52,636 WARNING [train.py:1204] (2/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_354-192266-0_sp1.1 from training. Duration: 0.936375 2023-10-03 23:11:52,697 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_258-310570-0_sp0.9 from training. Duration: 0.911125 2023-10-03 23:11:54,131 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_548-141039-0_sp1.1 from training. Duration: 0.963625 2023-10-03 23:11:57,358 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_15101_210_7927_1_1530786415_5033274_320-327527-0 from training. Duration: 0.96 2023-10-03 23:11:57,414 WARNING [train.py:1204] (2/4) Exclude cut with ID _2895_210_2477_1_1525956785794_4063750_289-11475-0_sp0.9 from training. Duration: 0.911125 2023-10-03 23:12:00,610 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7543_210_6753_1_1528543318_4552_1-85072-0_sp1.1 from training. Duration: 0.7545625 2023-10-03 23:12:00,680 WARNING [train.py:1204] (2/4) Exclude cut with ID _64639_210_5816_1_1535367619437_3528000_461-329156-0 from training. Duration: 0.96 2023-10-03 23:12:02,215 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.attention_skip_rate, batch_count=1440173.3333333333, ans=0.0 2023-10-03 23:12:03,271 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_211-339622-0 from training. Duration: 0.45 2023-10-03 23:12:04,683 WARNING [train.py:1204] (2/4) Exclude cut with ID _20455_210_5219_1_1531565996775_8098100_402-112739-0_sp1.1 from training. Duration: 0.963625 2023-10-03 23:12:04,769 WARNING [train.py:1204] (2/4) Exclude cut with ID _53489_210_6228_1_1534298357169_3835970_256-115565-0_sp1.1 from training. Duration: 0.936375 2023-10-03 23:12:04,784 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_26326_210_7925_1_1533002493_3592406_360-305260-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 23:12:06,235 INFO [train.py:1046] (2/4) Epoch 41, batch 3550, loss[loss=0.1595, simple_loss=0.2498, pruned_loss=0.03463, over 24627.00 frames. ], tot_loss[loss=0.1556, simple_loss=0.2352, pruned_loss=0.03795, over 4698731.37 frames. ], batch size: 68, lr: 2.47e-03, grad_scale: 8.0 2023-10-03 23:12:06,281 WARNING [train.py:1204] (2/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_18-204383-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 23:12:08,878 WARNING [train.py:1204] (2/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_306-232480-0_sp1.1 from training. Duration: 0.87275 2023-10-03 23:12:17,639 WARNING [train.py:1204] (2/4) Exclude cut with ID _41161_210_20034_1_1533431249657_3555314_116-299186-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 23:12:19,064 WARNING [train.py:1204] (2/4) Exclude cut with ID _13913_210_9994_1_1530579615187_3383897_473-277682-0_sp1.1 from training. Duration: 0.3545625 2023-10-03 23:12:21,839 WARNING [train.py:1204] (2/4) Exclude cut with ID _58895_210_8750_1_1535184081320_3649089_49-225702-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 23:12:23,178 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_3080_210_2071_1_1526086102_4365166_312-8726-0_sp1.1 from training. Duration: 0.82725 2023-10-03 23:12:25,107 WARNING [train.py:1204] (2/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_489-190175-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 23:12:26,429 WARNING [train.py:1204] (2/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_345-95717-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 23:12:26,441 WARNING [train.py:1204] (2/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_85-138257-0_sp1.1 from training. Duration: 0.6454375 2023-10-03 23:12:28,747 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.0.layers.0.self_attn1.whiten, num_groups=1, num_channels=192, metric=12.06 vs. limit=22.5 2023-10-03 23:12:29,211 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7046_210_6753_1_1527895337_6920917_783-278109-0_sp1.1 from training. Duration: 0.7 2023-10-03 23:12:30,950 WARNING [train.py:1204] (2/4) Exclude cut with ID _3465_210_3525_1_1526175667060_3574049_450-203637-0_sp1.1 from training. Duration: 0.836375 2023-10-03 23:12:30,989 WARNING [train.py:1204] (2/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_416-132893-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 23:12:31,011 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_2655_210_2087_1_1525688959_3897400_437-337690-0_sp0.9 from training. Duration: 0.7 2023-10-03 23:12:32,342 WARNING [train.py:1204] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_700-183549-0_sp1.1 from training. Duration: 0.6181875 2023-10-03 23:12:37,829 WARNING [train.py:1204] (2/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_191-59784-0_sp1.1 from training. Duration: 0.8 2023-10-03 23:12:37,840 WARNING [train.py:1204] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_218-295097-0_sp1.1 from training. Duration: 0.7 2023-10-03 23:12:37,956 WARNING [train.py:1204] (2/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_310-109461-0_sp1.1 from training. Duration: 0.72725 2023-10-03 23:12:39,284 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_33348_210_5632_1_1533086912_3324030_131-321149-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 23:12:39,334 WARNING [train.py:1204] (2/4) Exclude cut with ID _64135_210_2253_1_1535677267876_7608972_133-188266-0_sp1.1 from training. Duration: 0.736375 2023-10-03 23:12:39,358 WARNING [train.py:1204] (2/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_505-205270-0 from training. Duration: 0.38 2023-10-03 23:12:39,368 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_67223_210_4238_1_1535612120_2537431_102-88248-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 23:12:39,458 WARNING [train.py:1204] (2/4) Exclude cut with ID _54871_210_22356_1_1534665798253_3856359_60-235433-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 23:12:40,922 WARNING [train.py:1204] (2/4) Exclude cut with ID _5189_210_4557_1_1527248319428_3769229_199-53526-0_sp1.1 from training. Duration: 0.3818125 2023-10-03 23:12:45,587 WARNING [train.py:1204] (2/4) Exclude cut with ID _45329_210_7005_1_1534157951211_6774339_439-90979-0_sp1.1 from training. Duration: 0.963625 2023-10-03 23:12:47,434 WARNING [train.py:1204] (2/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_1162-132880-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 23:12:48,803 WARNING [train.py:1204] (2/4) Exclude cut with ID _56423_210_5854_1_1534896061049_3165969_220-22317-0_sp1.1 from training. Duration: 0.963625 2023-10-03 23:12:50,130 WARNING [train.py:1204] (2/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_909-64151-0 from training. Duration: 0.94 2023-10-03 23:12:51,440 WARNING [train.py:1204] (2/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_465-156500-0_sp0.9 from training. Duration: 0.8 2023-10-03 23:12:52,936 WARNING [train.py:1204] (2/4) Exclude cut with ID _44304_210_15374_1_1533639352765_4613129_302-22084-0 from training. Duration: 0.86 2023-10-03 23:12:52,985 WARNING [train.py:1204] (2/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_663-166973-0_sp1.1 from training. Duration: 0.72725 2023-10-03 23:12:55,989 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.699e+02 1.904e+02 2.052e+02 2.273e+02 3.106e+02, threshold=4.105e+02, percent-clipped=0.0 2023-10-03 23:12:57,405 WARNING [train.py:1204] (2/4) Exclude cut with ID _45329_210_7005_1_1534157951211_6774339_503-287883-0_sp0.9 from training. Duration: 0.97775 2023-10-03 23:12:57,420 WARNING [train.py:1204] (2/4) Exclude cut with ID _60057_210_10640_1_1534903065793_3333150_362-230377-0_sp0.9 from training. Duration: 0.9666875 2023-10-03 23:13:01,683 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_4183_210_2087_1_1526729100_1869838_143-280962-0 from training. Duration: 0.56 2023-10-03 23:13:01,763 WARNING [train.py:1204] (2/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_281-1313-0_sp1.1 from training. Duration: 0.936375 2023-10-03 23:13:08,123 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.balancer_ff2.min_abs, batch_count=1440506.6666666667, ans=0.1 2023-10-03 23:13:09,159 WARNING [train.py:1204] (2/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_64-215118-0_sp1.1 from training. Duration: 0.936375 2023-10-03 23:13:09,223 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_2822_210_2715_1_1526112586_1999587_165-36656-0 from training. Duration: 0.89 2023-10-03 23:13:09,274 WARNING [train.py:1204] (2/4) Exclude cut with ID _22961_210_8254_1_1531976366672_4278180_402-208830-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 23:13:14,600 WARNING [train.py:1204] (2/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_98-193494-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 23:13:14,686 WARNING [train.py:1204] (2/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_383-285196-0 from training. Duration: 0.97 2023-10-03 23:13:14,961 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.nonlin_attention.balancer.prob, batch_count=1440506.6666666667, ans=0.125 2023-10-03 23:13:19,409 INFO [train.py:1046] (2/4) Epoch 41, batch 3600, loss[loss=0.1602, simple_loss=0.2414, pruned_loss=0.03955, over 23419.00 frames. ], tot_loss[loss=0.1553, simple_loss=0.2352, pruned_loss=0.03772, over 4714492.87 frames. ], batch size: 106, lr: 2.47e-03, grad_scale: 16.0 2023-10-03 23:13:22,669 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6767_210_3273_1_1527855783_1718932_142-127132-0 from training. Duration: 0.95 2023-10-03 23:13:22,710 WARNING [train.py:1204] (2/4) Exclude cut with ID _39867_210_9826_1_1533553222030_9344259_1018-339635-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 23:13:24,065 WARNING [train.py:1204] (2/4) Exclude cut with ID _14603_210_4220_1_1530615688441_3097319_274-274798-0_sp1.1 from training. Duration: 0.8909375 2023-10-03 23:13:25,581 WARNING [train.py:1204] (2/4) Exclude cut with ID _2334_210_2667_1_1525608238880_4705129_376-263393-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 23:13:25,713 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.bypass.skip_rate, batch_count=1440573.3333333333, ans=0.04949747468305833 2023-10-03 23:13:26,909 WARNING [train.py:1204] (2/4) Exclude cut with ID _29303_210_2575_1_1532392237303_7410649_363-344675-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 23:13:27,202 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=1440573.3333333333, ans=0.1 2023-10-03 23:13:28,738 WARNING [train.py:1204] (2/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_58-68956-0_sp1.1 from training. Duration: 0.7545625 2023-10-03 23:13:30,286 WARNING [train.py:1204] (2/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_709-311571-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 23:13:32,913 WARNING [train.py:1204] (2/4) Exclude cut with ID _46067_210_4220_1_1534125571066_3307450_136-257407-0_sp1.1 from training. Duration: 0.963625 2023-10-03 23:13:32,998 WARNING [train.py:1204] (2/4) Exclude cut with ID _6493_210_1747_1_1528459210424_4012470_324-323854-0_sp1.1 from training. Duration: 0.836375 2023-10-03 23:13:33,047 WARNING [train.py:1204] (2/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_234-260682-0_sp1.1 from training. Duration: 0.763625 2023-10-03 23:13:34,430 WARNING [train.py:1204] (2/4) Exclude cut with ID _36634_210_1794_1_1533296352450_1399884_55-119451-0_sp1.1 from training. Duration: 0.963625 2023-10-03 23:13:34,436 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_40047_210_2290_1_1533290519_2374311_195-165693-0 from training. Duration: 0.89 2023-10-03 23:13:37,718 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_5522_210_3273_1_1527249606_3909096_366-172508-0_sp0.9 from training. Duration: 0.7333125 2023-10-03 23:13:37,808 WARNING [train.py:1204] (2/4) Exclude cut with ID _21878_210_3525_1_1531738772002_7264330_187-10504-0_sp1.1 from training. Duration: 0.963625 2023-10-03 23:13:40,504 WARNING [train.py:1204] (2/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_282-257791-0_sp1.1 from training. Duration: 0.7818125 2023-10-03 23:13:43,397 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_43831_210_2158_1_1533725502_5872791_467-288776-0_sp1.1 from training. Duration: 0.92725 2023-10-03 23:13:43,481 WARNING [train.py:1204] (2/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_138-154812-0_sp1.1 from training. Duration: 0.7090625 2023-10-03 23:13:44,622 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_1154-185930-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 23:13:44,640 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_2655_210_2087_1_1525688959_3897400_52-279899-0 from training. Duration: 0.69 2023-10-03 23:13:44,697 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_141-89113-0_sp1.1 from training. Duration: 0.7818125 2023-10-03 23:13:46,790 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.conv_module1.balancer1.prob, batch_count=1440640.0, ans=0.125 2023-10-03 23:13:47,054 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.4.encoder.layers.2.self_attn_weights.whiten_keys, num_groups=4, num_channels=128, metric=1.70 vs. limit=6.0 2023-10-03 23:13:47,850 WARNING [train.py:1204] (2/4) Exclude cut with ID _55134_210_21982_1_1534590028377_3882170_297-47310-0_sp1.1 from training. Duration: 0.963625 2023-10-03 23:13:49,222 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_486-151131-0_sp1.1 from training. Duration: 0.77275 2023-10-03 23:13:49,352 WARNING [train.py:1204] (2/4) Exclude cut with ID _44778_210_2054_1_1533798127203_7443090_21-232980-0_sp1.1 from training. Duration: 0.936375 2023-10-03 23:13:52,375 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6165_210_3528_1_1527590982_4379841_338-106380-0_sp1.1 from training. Duration: 0.92725 2023-10-03 23:13:53,697 WARNING [train.py:1204] (2/4) Exclude cut with ID _55162_210_17071_1_1534408248583_3753480_355-257218-0_sp1.1 from training. Duration: 0.97275 2023-10-03 23:13:53,780 WARNING [train.py:1204] (2/4) Exclude cut with ID _2895_210_2477_1_1525956785794_4063750_289-11475-0 from training. Duration: 0.82 2023-10-03 23:13:59,964 WARNING [train.py:1204] (2/4) Exclude cut with ID _16756_210_4915_1_1531047803763_7104259_839-183431-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 23:14:01,367 WARNING [train.py:1204] (2/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_427-208901-0_sp0.9 from training. Duration: 0.7555625 2023-10-03 23:14:01,408 WARNING [train.py:1204] (2/4) Exclude cut with ID _36512_210_6987_1_1533033143998_3126069_181-306500-0 from training. Duration: 0.95 2023-10-03 23:14:07,281 WARNING [train.py:1204] (2/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_28-70987-0_sp1.1 from training. Duration: 0.7909375 2023-10-03 23:14:08,877 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.balancer1.prob, batch_count=1440773.3333333333, ans=0.125 2023-10-03 23:14:11,426 WARNING [train.py:1204] (2/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_590-58996-0_sp1.1 from training. Duration: 0.936375 2023-10-03 23:14:12,991 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.ff3_skip_rate, batch_count=1440773.3333333333, ans=0.0 2023-10-03 23:14:14,105 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_48730_210_5399_1_1533885886_2212769_108-47561-0_sp1.1 from training. Duration: 0.936375 2023-10-03 23:14:18,840 WARNING [train.py:1204] (2/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_401-158239-0_sp0.9 from training. Duration: 0.788875 2023-10-03 23:14:20,116 WARNING [train.py:1204] (2/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_278-154492-0_sp1.1 from training. Duration: 0.6909375 2023-10-03 23:14:20,122 WARNING [train.py:1204] (2/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_137-116442-0 from training. Duration: 0.78 2023-10-03 23:14:22,120 WARNING [train.py:1204] (2/4) Exclude cut with ID _3541_210_2477_1_1526475408709_3263973_189-317158-0 from training. Duration: 0.9 2023-10-03 23:14:23,539 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_47896_210_20508_1_1534492518_5958868_558-314572-0 from training. Duration: 0.97 2023-10-03 23:14:26,280 WARNING [train.py:1204] (2/4) Exclude cut with ID _66070_210_7640_1_1535441393096_7805310_290-40284-0_sp1.1 from training. Duration: 0.97275 2023-10-03 23:14:26,335 WARNING [train.py:1204] (2/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_110-224688-0_sp1.1 from training. Duration: 0.8818125 2023-10-03 23:14:28,336 WARNING [train.py:1204] (2/4) Exclude cut with ID _7719_210_3525_1_1528456153163_4296960_88-250259-0 from training. Duration: 0.84 2023-10-03 23:14:28,384 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8453_210_4211_1_1528502940_8562730_1157-114102-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 23:14:29,645 WARNING [train.py:1204] (2/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_317-175062-0_sp1.1 from training. Duration: 0.7454375 2023-10-03 23:14:29,651 WARNING [train.py:1204] (2/4) Exclude cut with ID _29857_210_20034_1_1532395886753_3798160_254-347254-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 23:14:29,734 WARNING [train.py:1204] (2/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_28-70987-0 from training. Duration: 0.87 2023-10-03 23:14:31,184 WARNING [train.py:1204] (2/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_321-335475-0 from training. Duration: 0.45 2023-10-03 23:14:34,302 INFO [train.py:1046] (2/4) Epoch 41, batch 3650, loss[loss=0.1528, simple_loss=0.2274, pruned_loss=0.0391, over 23717.00 frames. ], tot_loss[loss=0.1555, simple_loss=0.2351, pruned_loss=0.03789, over 4710747.98 frames. ], batch size: 164, lr: 2.47e-03, grad_scale: 16.0 2023-10-03 23:14:34,418 WARNING [train.py:1204] (2/4) Exclude cut with ID _40901_210_5665_1_1533363789958_3856130_356-107330-0_sp1.1 from training. Duration: 0.936375 2023-10-03 23:14:34,470 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_3780_210_2087_1_1526467389_2145572_176-107140-0 from training. Duration: 0.69 2023-10-03 23:14:34,677 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.out_combiner.scale_min, batch_count=1440906.6666666667, ans=0.2 2023-10-03 23:14:40,015 WARNING [train.py:1204] (2/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_80-135200-0 from training. Duration: 0.79 2023-10-03 23:14:40,129 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_33348_210_5632_1_1533086912_3324030_85-28583-0_sp0.9 from training. Duration: 0.911125 2023-10-03 23:14:43,099 WARNING [train.py:1204] (2/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_29-293896-0 from training. Duration: 0.61 2023-10-03 23:14:43,222 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.1.feed_forward2.hidden_balancer.prob, batch_count=1440906.6666666667, ans=0.125 2023-10-03 23:14:45,786 WARNING [train.py:1204] (2/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_194-252800-0 from training. Duration: 0.97 2023-10-03 23:14:50,531 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_360-52446-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 23:14:50,537 WARNING [train.py:1204] (2/4) Exclude cut with ID _9389_210_1664_1_1529217337301_5289269_289-213417-0_sp0.9 from training. Duration: 0.988875 2023-10-03 23:14:51,813 WARNING [train.py:1204] (2/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_143-86344-0_sp0.9 from training. Duration: 0.8555625 2023-10-03 23:14:53,851 WARNING [train.py:1204] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_520-126729-0_sp0.9 from training. Duration: 0.77775 2023-10-03 23:14:55,151 WARNING [train.py:1204] (2/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_76-308663-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 23:14:55,210 WARNING [train.py:1204] (2/4) Exclude cut with ID _8021_210_7994_1_1528376451358_3653069_100-140231-0 from training. Duration: 0.67 2023-10-03 23:14:55,489 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.bypass.scale_min, batch_count=1440973.3333333333, ans=0.2 2023-10-03 23:14:56,499 WARNING [train.py:1204] (2/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_387-131538-0_sp0.9 from training. Duration: 0.9 2023-10-03 23:14:56,523 WARNING [train.py:1204] (2/4) Exclude cut with ID _6275_210_2084_1_1528110062213_7669049_365-182054-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 23:14:56,565 WARNING [train.py:1204] (2/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_345-340690-0 from training. Duration: 0.93 2023-10-03 23:14:57,939 WARNING [train.py:1204] (2/4) Exclude cut with ID _5409_210_1388_1_1527292850189_5865191_178-127970-0_sp0.9 from training. Duration: 0.6333125 2023-10-03 23:14:59,398 WARNING [train.py:1204] (2/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_352-12734-0_sp1.1 from training. Duration: 0.92725 2023-10-03 23:14:59,401 WARNING [train.py:1204] (2/4) Exclude cut with ID _65134_210_14763_1_1535335393985_3505368_121-129373-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 23:14:59,539 WARNING [train.py:1204] (2/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_557-324258-0_sp1.1 from training. Duration: 0.736375 2023-10-03 23:15:02,261 WARNING [train.py:1204] (2/4) Exclude cut with ID _48678_210_5663_1_1533988830947_3769199_306-255886-0 from training. Duration: 0.77 2023-10-03 23:15:04,788 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7544_210_6753_1_1528587000_691468_87-178599-0 from training. Duration: 0.87 2023-10-03 23:15:06,130 WARNING [train.py:1204] (2/4) Exclude cut with ID _2941_210_2577_1_1526199673150_2489759_208-236402-0_sp1.1 from training. Duration: 0.963625 2023-10-03 23:15:07,482 WARNING [train.py:1204] (2/4) Exclude cut with ID _38670_210_15379_1_1533448810659_4391059_65-169397-0 from training. Duration: 0.97 2023-10-03 23:15:08,815 WARNING [train.py:1204] (2/4) Exclude cut with ID _30304_210_6319_1_1532476827261_7582060_306-328175-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 23:15:08,832 WARNING [train.py:1204] (2/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_212-149947-0_sp1.1 from training. Duration: 0.663625 2023-10-03 23:15:13,086 WARNING [train.py:1204] (2/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_321-22821-0_sp1.1 from training. Duration: 0.8090625 2023-10-03 23:15:15,763 WARNING [train.py:1204] (2/4) Exclude cut with ID _44010_210_4525_1_1533884262964_4013620_483-69110-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 23:15:15,772 WARNING [train.py:1204] (2/4) Exclude cut with ID _14922_210_6228_1_1530707435273_3693310_38-187851-0_sp1.1 from training. Duration: 0.563625 2023-10-03 23:15:15,869 WARNING [train.py:1204] (2/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_129-337802-0_sp0.9 from training. Duration: 0.8 2023-10-03 23:15:17,863 WARNING [train.py:1204] (2/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_268-87909-0_sp1.1 from training. Duration: 0.9 2023-10-03 23:15:18,726 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.1.encoder.layers.0.conv_module2.whiten, num_groups=1, num_channels=256, metric=5.26 vs. limit=15.0 2023-10-03 23:15:20,541 WARNING [train.py:1204] (2/4) Exclude cut with ID _21878_210_3525_1_1531738772002_7264330_559-184862-0_sp1.1 from training. Duration: 0.8545625 2023-10-03 23:15:23,719 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_46032_210_14656_1_1533781412_4507969_237-120766-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 23:15:24,911 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.612e+02 1.947e+02 2.181e+02 2.506e+02 4.142e+02, threshold=4.361e+02, percent-clipped=1.0 2023-10-03 23:15:25,035 WARNING [train.py:1204] (2/4) Exclude cut with ID _28427_210_4220_1_1532581240967_3061069_330-170906-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 23:15:25,047 WARNING [train.py:1204] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_401-51578-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 23:15:25,177 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.ff2_skip_rate, batch_count=1441106.6666666667, ans=0.0 2023-10-03 23:15:27,741 WARNING [train.py:1204] (2/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_209-205365-0_sp1.1 from training. Duration: 0.7181875 2023-10-03 23:15:27,922 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.feed_forward3.hidden_balancer.prob, batch_count=1441106.6666666667, ans=0.125 2023-10-03 23:15:29,171 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_23801_210_16223_1_1532066392_3294657_57-242170-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 23:15:29,237 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_127-37371-0_sp1.1 from training. Duration: 0.936375 2023-10-03 23:15:35,316 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9171W0019-578905 from training. Duration: 0.896 2023-10-03 23:15:38,602 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_24605_210_1794_1_1532047863_4149755_349-49056-0_sp1.1 from training. Duration: 0.92725 2023-10-03 23:15:38,620 WARNING [train.py:1204] (2/4) Exclude cut with ID _12302_210_5219_1_1529924446466_8107280_897-21237-0_sp1.1 from training. Duration: 0.936375 2023-10-03 23:15:38,752 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.bypass.scale_min, batch_count=1441173.3333333333, ans=0.2 2023-10-03 23:15:39,966 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_9460_210_4211_1_1528954587_10049667_480-260269-0_sp0.9 from training. Duration: 0.62225 2023-10-03 23:15:41,309 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_348-152548-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 23:15:42,624 WARNING [train.py:1204] (2/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_301-330455-0_sp0.9 from training. Duration: 0.888875 2023-10-03 23:15:42,724 WARNING [train.py:1204] (2/4) Exclude cut with ID _61008_210_10633_1_1534852769198_6974370_345-91705-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 23:15:44,133 WARNING [train.py:1204] (2/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_304-273433-0 from training. Duration: 0.99 2023-10-03 23:15:44,137 WARNING [train.py:1204] (2/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_359-345972-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 23:15:45,678 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_2296_210_2087_1_1525503448_3397335_57-185430-0_sp1.1 from training. Duration: 0.5454375 2023-10-03 23:15:47,783 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_1256-120974-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 23:15:49,079 INFO [train.py:1046] (2/4) Epoch 41, batch 3700, loss[loss=0.1569, simple_loss=0.2462, pruned_loss=0.03386, over 24623.00 frames. ], tot_loss[loss=0.1557, simple_loss=0.2357, pruned_loss=0.03783, over 4701694.80 frames. ], batch size: 68, lr: 2.47e-03, grad_scale: 16.0 2023-10-03 23:15:49,157 WARNING [train.py:1204] (2/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_372-30302-0_sp0.9 from training. Duration: 0.9555625 2023-10-03 23:15:51,725 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_42012_210_7927_1_1533439936_3871556_451-344262-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 23:15:51,726 WARNING [train.py:1204] (2/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_28-324729-0 from training. Duration: 0.94 2023-10-03 23:15:51,734 WARNING [train.py:1204] (2/4) Exclude cut with ID _64135_210_2253_1_1535677267876_7608972_313-18394-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 23:15:51,821 WARNING [train.py:1204] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_620-276793-0_sp0.9 from training. Duration: 0.6555625 2023-10-03 23:15:51,854 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7544_210_6753_1_1528591327_1471426_172-348777-0_sp0.9 from training. Duration: 0.7333125 2023-10-03 23:15:56,329 WARNING [train.py:1204] (2/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_332-221057-0_sp0.9 from training. Duration: 0.7555625 2023-10-03 23:15:59,161 WARNING [train.py:1204] (2/4) Exclude cut with ID _19023_210_4353_1_1531378892552_5820780_5-300035-0_sp1.1 from training. Duration: 0.92725 2023-10-03 23:15:59,221 WARNING [train.py:1204] (2/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_343-309023-0_sp1.1 from training. Duration: 0.963625 2023-10-03 23:15:59,487 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.ff2_skip_rate, batch_count=1441240.0, ans=0.0 2023-10-03 23:16:00,548 WARNING [train.py:1204] (2/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_37-41919-0_sp1.1 from training. Duration: 0.7909375 2023-10-03 23:16:00,586 WARNING [train.py:1204] (2/4) Exclude cut with ID _3541_210_2477_1_1526475408709_3263973_41-182910-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 23:16:01,956 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_229-45300-0_sp0.9 from training. Duration: 0.6333125 2023-10-03 23:16:05,217 WARNING [train.py:1204] (2/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_40-214716-0_sp1.1 from training. Duration: 0.963625 2023-10-03 23:16:08,189 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9309W0203-640330 from training. Duration: 0.896 2023-10-03 23:16:14,140 WARNING [train.py:1204] (2/4) Exclude cut with ID _60229_210_27767_1_1534838406158_3655350_264-279573-0_sp1.1 from training. Duration: 0.7545625 2023-10-03 23:16:14,176 WARNING [train.py:1204] (2/4) Exclude cut with ID _8394_210_6702_1_1528590504316_3540189_372-198878-0_sp0.9 from training. Duration: 0.6666875 2023-10-03 23:16:17,460 WARNING [train.py:1204] (2/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_359-118926-0_sp1.1 from training. Duration: 0.6545625 2023-10-03 23:16:17,478 WARNING [train.py:1204] (2/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_85-65056-0 from training. Duration: 0.96 2023-10-03 23:16:17,509 WARNING [train.py:1204] (2/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_89-242335-0_sp1.1 from training. Duration: 0.863625 2023-10-03 23:16:17,714 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.conv_module1.balancer1.prob, batch_count=1441373.3333333333, ans=0.125 2023-10-03 23:16:21,508 WARNING [train.py:1204] (2/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_128-49444-0_sp1.1 from training. Duration: 0.936375 2023-10-03 23:16:21,587 WARNING [train.py:1204] (2/4) Exclude cut with ID _30327_210_16049_1_1532484161295_3924169_401-70819-0 from training. Duration: 0.86 2023-10-03 23:16:21,732 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.self_attn_weights.pos_emb_skip_rate, batch_count=1441373.3333333333, ans=0.0 2023-10-03 23:16:21,831 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.attention_skip_rate, batch_count=1441373.3333333333, ans=0.0 2023-10-03 23:16:23,009 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_41822_210_13270_1_1533955976_4379284_260-256391-0_sp1.1 from training. Duration: 0.936375 2023-10-03 23:16:24,389 WARNING [train.py:1204] (2/4) Exclude cut with ID _67335_210_5219_1_1535525975652_4275229_599-247317-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 23:16:26,538 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.2.encoder.layers.1.whiten, num_groups=1, num_channels=384, metric=4.40 vs. limit=12.0 2023-10-03 23:16:27,114 WARNING [train.py:1204] (2/4) Exclude cut with ID _29857_210_20034_1_1532395886753_3798160_209-239267-0_sp1.1 from training. Duration: 0.936375 2023-10-03 23:16:28,414 WARNING [train.py:1204] (2/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_46-207599-0_sp1.1 from training. Duration: 0.5818125 2023-10-03 23:16:30,455 WARNING [train.py:1204] (2/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_912-282412-0_sp1.1 from training. Duration: 0.4454375 2023-10-03 23:16:34,698 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_4183_210_2087_1_1526729100_1869838_141-166600-0_sp1.1 from training. Duration: 0.863625 2023-10-03 23:16:34,702 WARNING [train.py:1204] (2/4) Exclude cut with ID _15037_210_2253_1_1530752294104_3267650_296-192597-0 from training. Duration: 0.81 2023-10-03 23:16:34,748 WARNING [train.py:1204] (2/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_571-220562-0_sp1.1 from training. Duration: 0.963625 2023-10-03 23:16:34,765 WARNING [train.py:1204] (2/4) Exclude cut with ID _5287_210_2084_1_1527418883040_3878670_12-221186-0 from training. Duration: 0.98 2023-10-03 23:16:41,338 WARNING [train.py:1204] (2/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_149-85602-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 23:16:41,360 WARNING [train.py:1204] (2/4) Exclude cut with ID _9235_210_5219_1_1528891154253_8046120_1003-101638-0_sp1.1 from training. Duration: 0.663625 2023-10-03 23:16:44,106 WARNING [train.py:1204] (2/4) Exclude cut with ID _55844_210_2346_1_1534471212569_7625707_1099-33689-0_sp1.1 from training. Duration: 0.92725 2023-10-03 23:16:44,144 WARNING [train.py:1204] (2/4) Exclude cut with ID _7606_210_4915_1_1528894253277_5080690_301-10732-0 from training. Duration: 0.81 2023-10-03 23:16:45,520 WARNING [train.py:1204] (2/4) Exclude cut with ID _22826_210_9994_1_1531908201522_3279169_403-34698-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 23:16:45,533 WARNING [train.py:1204] (2/4) Exclude cut with ID _8480_210_1874_1_1528589391205_6728330_755-112625-0_sp0.9 from training. Duration: 0.82225 2023-10-03 23:16:46,797 WARNING [train.py:1204] (2/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_527-238990-0_sp1.1 from training. Duration: 0.8181875 2023-10-03 23:16:46,819 WARNING [train.py:1204] (2/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_426-7328-0_sp1.1 from training. Duration: 0.92725 2023-10-03 23:16:50,072 WARNING [train.py:1204] (2/4) Exclude cut with ID _3541_210_2477_1_1526475408709_3263973_189-317158-0_sp1.1 from training. Duration: 0.8181875 2023-10-03 23:16:51,356 WARNING [train.py:1204] (2/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_162-103620-0 from training. Duration: 0.93 2023-10-03 23:16:52,709 WARNING [train.py:1204] (2/4) Exclude cut with ID _22083_210_8254_1_1531702862439_3678970_49-234546-0 from training. Duration: 0.83 2023-10-03 23:16:54,043 WARNING [train.py:1204] (2/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_370-331600-0_sp1.1 from training. Duration: 0.8545625 2023-10-03 23:16:54,055 WARNING [train.py:1204] (2/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_166-59042-0_sp1.1 from training. Duration: 0.97275 2023-10-03 23:16:55,369 WARNING [train.py:1204] (2/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_138-332773-0_sp1.1 from training. Duration: 0.736375 2023-10-03 23:16:56,805 WARNING [train.py:1204] (2/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_90-30673-0_sp0.9 from training. Duration: 0.9333125 2023-10-03 23:16:58,343 WARNING [train.py:1204] (2/4) Exclude cut with ID _27967_210_12140_1_1532588391859_8419130_737-41898-0_sp1.1 from training. Duration: 0.936375 2023-10-03 23:16:59,738 WARNING [train.py:1204] (2/4) Exclude cut with ID _8833_210_5219_1_1528592665210_7553059_329-125156-0_sp1.1 from training. Duration: 0.7454375 2023-10-03 23:17:01,172 WARNING [train.py:1204] (2/4) Exclude cut with ID _41817_210_18835_1_1533434477160_7423049_352-42537-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 23:17:03,051 INFO [train.py:1046] (2/4) Epoch 41, batch 3750, loss[loss=0.1449, simple_loss=0.2319, pruned_loss=0.02891, over 24333.00 frames. ], tot_loss[loss=0.1565, simple_loss=0.2367, pruned_loss=0.03811, over 4705071.95 frames. ], batch size: 61, lr: 2.47e-03, grad_scale: 16.0 2023-10-03 23:17:04,600 WARNING [train.py:1204] (2/4) Exclude cut with ID _8365_210_2064_1_1528592311070_4677690_431-294102-0 from training. Duration: 0.89 2023-10-03 23:17:04,729 WARNING [train.py:1204] (2/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_454-314901-0_sp1.1 from training. Duration: 0.4181875 2023-10-03 23:17:07,895 WARNING [train.py:1204] (2/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_80-135200-0_sp0.9 from training. Duration: 0.87775 2023-10-03 23:17:07,944 WARNING [train.py:1204] (2/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_181-78632-0 from training. Duration: 0.78 2023-10-03 23:17:09,811 WARNING [train.py:1204] (2/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_300-27235-0_sp0.9 from training. Duration: 0.9666875 2023-10-03 23:17:11,074 WARNING [train.py:1204] (2/4) Exclude cut with ID _47790_210_16187_1_1533862814066_4207519_96-177176-0_sp1.1 from training. Duration: 0.97275 2023-10-03 23:17:12,471 WARNING [train.py:1204] (2/4) Exclude cut with ID _35941_210_15379_1_1532995294515_4023089_246-348742-0_sp1.1 from training. Duration: 0.97275 2023-10-03 23:17:13,774 WARNING [train.py:1204] (2/4) Exclude cut with ID _38670_210_15379_1_1533448810659_4391059_65-169397-0_sp1.1 from training. Duration: 0.8818125 2023-10-03 23:17:16,554 WARNING [train.py:1204] (2/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_1003-99642-0_sp1.1 from training. Duration: 0.963625 2023-10-03 23:17:20,011 WARNING [train.py:1204] (2/4) Exclude cut with ID _40402_210_18219_1_1533522324115_2953150_320-14078-0_sp1.1 from training. Duration: 0.863625 2023-10-03 23:17:21,367 WARNING [train.py:1204] (2/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_367-61330-0_sp0.9 from training. Duration: 0.8333125 2023-10-03 23:17:22,687 WARNING [train.py:1204] (2/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_515-298514-0_sp1.1 from training. Duration: 0.92725 2023-10-03 23:17:25,636 WARNING [train.py:1204] (2/4) Exclude cut with ID _53731_210_7994_1_1534553979244_3757214_471-320815-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 23:17:25,700 WARNING [train.py:1204] (2/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_306-232480-0 from training. Duration: 0.96 2023-10-03 23:17:27,040 WARNING [train.py:1204] (2/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_478-162247-0_sp0.9 from training. Duration: 0.911125 2023-10-03 23:17:28,386 WARNING [train.py:1204] (2/4) Exclude cut with ID _56423_210_5854_1_1534896061049_3165969_345-284696-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 23:17:28,412 WARNING [train.py:1204] (2/4) Exclude cut with ID _44778_210_2054_1_1533798127203_7443090_600-122253-0_sp1.1 from training. Duration: 0.963625 2023-10-03 23:17:29,943 WARNING [train.py:1204] (2/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_199-295611-0 from training. Duration: 0.55 2023-10-03 23:17:34,789 WARNING [train.py:1204] (2/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_734-129761-0 from training. Duration: 0.82 2023-10-03 23:17:34,890 WARNING [train.py:1204] (2/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_349-169257-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 23:17:36,943 WARNING [train.py:1204] (2/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_202-67017-0_sp0.9 from training. Duration: 0.911125 2023-10-03 23:17:37,158 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.0.bypass_mid.scale_min, batch_count=1441706.6666666667, ans=0.2 2023-10-03 23:17:37,165 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.1.conv_module1.balancer2.min_positive, batch_count=1441706.6666666667, ans=0.05 2023-10-03 23:17:38,279 WARNING [train.py:1204] (2/4) Exclude cut with ID _16908_210_5724_1_1531119157248_3772700_61-258978-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 23:17:42,362 WARNING [train.py:1204] (2/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_172-156931-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 23:17:45,573 WARNING [train.py:1204] (2/4) Exclude cut with ID _4092_210_2151_1_1526643044449_3135759_356-278694-0_sp1.1 from training. Duration: 0.42725 2023-10-03 23:17:49,761 WARNING [train.py:1204] (2/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_116-332008-0 from training. Duration: 0.98 2023-10-03 23:17:51,181 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder_embed.convnext.hidden_balancer.prob, batch_count=1441773.3333333333, ans=0.125 2023-10-03 23:17:52,475 WARNING [train.py:1204] (2/4) Exclude cut with ID _29003_210_19596_1_1532507391720_3894098_358-114789-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 23:17:55,240 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.561e+02 1.936e+02 2.132e+02 2.338e+02 3.284e+02, threshold=4.264e+02, percent-clipped=0.0 2023-10-03 23:17:56,685 WARNING [train.py:1204] (2/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_152-212533-0_sp1.1 from training. Duration: 0.8818125 2023-10-03 23:17:56,744 WARNING [train.py:1204] (2/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_267-214116-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 23:17:59,519 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7543_210_6753_1_1528541907_1399322_165-323619-0_sp0.9 from training. Duration: 0.7666875 2023-10-03 23:18:04,307 WARNING [train.py:1204] (2/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_577-332600-0_sp0.9 from training. Duration: 0.6333125 2023-10-03 23:18:07,005 WARNING [train.py:1204] (2/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_342-100157-0_sp0.9 from training. Duration: 0.6 2023-10-03 23:18:09,645 WARNING [train.py:1204] (2/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_141-89727-0_sp1.1 from training. Duration: 0.6454375 2023-10-03 23:18:10,996 WARNING [train.py:1204] (2/4) Exclude cut with ID _3992_210_1747_1_1526644816031_3731060_107-251434-0_sp1.1 from training. Duration: 0.9 2023-10-03 23:18:12,843 WARNING [train.py:1204] (2/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_401-53423-0_sp1.1 from training. Duration: 0.5 2023-10-03 23:18:17,513 INFO [train.py:1046] (2/4) Epoch 41, batch 3800, loss[loss=0.14, simple_loss=0.2236, pruned_loss=0.02823, over 24486.00 frames. ], tot_loss[loss=0.156, simple_loss=0.2362, pruned_loss=0.03792, over 4706618.97 frames. ], batch size: 66, lr: 2.47e-03, grad_scale: 8.0 2023-10-03 23:18:21,046 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.feed_forward1.out_proj.dropout_p, batch_count=1441906.6666666667, ans=0.1 2023-10-03 23:18:22,150 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7762_210_1794_1_1528199508_4057408_422-227034-0_sp1.1 from training. Duration: 0.663625 2023-10-03 23:18:25,075 WARNING [train.py:1204] (2/4) Exclude cut with ID _4184_210_2550_1_1526730192013_2219380_102-250299-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 23:18:26,379 WARNING [train.py:1204] (2/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_253-308377-0_sp0.9 from training. Duration: 0.5444375 2023-10-03 23:18:26,451 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_271-297505-0 from training. Duration: 0.99 2023-10-03 23:18:27,847 WARNING [train.py:1204] (2/4) Exclude cut with ID _35941_210_15379_1_1532995294515_4023089_213-142973-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 23:18:29,336 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7544_210_6753_1_1528587722_3576686_162-63018-0_sp1.1 from training. Duration: 0.92725 2023-10-03 23:18:30,671 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_365-217119-0_sp0.9 from training. Duration: 0.7 2023-10-03 23:18:32,030 WARNING [train.py:1204] (2/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_662-275621-0_sp0.9 from training. Duration: 0.4666875 2023-10-03 23:18:32,032 WARNING [train.py:1204] (2/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_374-187555-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 23:18:33,793 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_289-180904-0_sp1.1 from training. Duration: 0.6909375 2023-10-03 23:18:34,018 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.conv_skip_rate, batch_count=1441973.3333333333, ans=0.0 2023-10-03 23:18:35,168 WARNING [train.py:1204] (2/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_189-47206-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 23:18:35,197 WARNING [train.py:1204] (2/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_234-14174-0_sp1.1 from training. Duration: 0.7454375 2023-10-03 23:18:35,226 WARNING [train.py:1204] (2/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_576-76621-0_sp1.1 from training. Duration: 0.963625 2023-10-03 23:18:36,741 WARNING [train.py:1204] (2/4) Exclude cut with ID _13253_210_1925_1_1530757775819_3577873_253-246386-0 from training. Duration: 0.9 2023-10-03 23:18:41,323 WARNING [train.py:1204] (2/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_372-6869-0_sp1.1 from training. Duration: 0.4 2023-10-03 23:18:41,388 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_21434_210_4238_1_1531916339_1222252_22-54954-0_sp1.1 from training. Duration: 0.936375 2023-10-03 23:18:44,110 WARNING [train.py:1204] (2/4) Exclude cut with ID _29853_210_7497_1_1532505600851_6642110_492-170361-0_sp1.1 from training. Duration: 0.92725 2023-10-03 23:18:47,405 WARNING [train.py:1204] (2/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_505-97987-0_sp0.9 from training. Duration: 0.9555625 2023-10-03 23:18:47,457 WARNING [train.py:1204] (2/4) Exclude cut with ID _8365_210_2064_1_1528592311070_4677690_371-122295-0_sp0.9 from training. Duration: 0.611125 2023-10-03 23:18:47,766 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.feed_forward3.hidden_balancer.prob, batch_count=1442040.0, ans=0.125 2023-10-03 23:18:50,606 WARNING [train.py:1204] (2/4) Exclude cut with ID _14268_210_9206_1_1530622771285_4714099_637-86630-0_sp1.1 from training. Duration: 0.463625 2023-10-03 23:18:50,619 WARNING [train.py:1204] (2/4) Exclude cut with ID _29857_210_20034_1_1532395886753_3798160_372-5678-0_sp1.1 from training. Duration: 0.963625 2023-10-03 23:18:51,964 WARNING [train.py:1204] (2/4) Exclude cut with ID _52892_210_6721_1_1534378941259_4124129_90-165108-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 23:18:53,334 WARNING [train.py:1204] (2/4) Exclude cut with ID _27052_210_5986_1_1532329197187_3585009_143-339198-0_sp1.1 from training. Duration: 0.963625 2023-10-03 23:18:56,190 WARNING [train.py:1204] (2/4) Exclude cut with ID _14268_210_9206_1_1530622771285_4714099_637-86630-0_sp0.9 from training. Duration: 0.5666875 2023-10-03 23:18:56,192 WARNING [train.py:1204] (2/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_401-255491-0 from training. Duration: 0.7 2023-10-03 23:19:00,110 WARNING [train.py:1204] (2/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_178-6946-0_sp1.1 from training. Duration: 0.97275 2023-10-03 23:19:07,407 WARNING [train.py:1204] (2/4) Exclude cut with ID _27485_210_4916_1_1532739191677_3668269_460-163885-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 23:19:13,263 WARNING [train.py:1204] (2/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_270-26053-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 23:19:14,823 WARNING [train.py:1204] (2/4) Exclude cut with ID _14170_210_5219_1_1530525949868_4469650_412-317077-0 from training. Duration: 0.75 2023-10-03 23:19:16,207 WARNING [train.py:1204] (2/4) Exclude cut with ID _5289_210_2084_1_1527678202736_3779299_142-103317-0 from training. Duration: 0.84 2023-10-03 23:19:17,969 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_150-143438-0_sp1.1 from training. Duration: 0.92725 2023-10-03 23:19:18,104 WARNING [train.py:1204] (2/4) Exclude cut with ID _27894_210_12392_1_1532415496078_6902439_668-269779-0_sp1.1 from training. Duration: 0.97275 2023-10-03 23:19:19,393 WARNING [train.py:1204] (2/4) Exclude cut with ID _18513_210_9206_1_1531618215223_3862620_56-222075-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 23:19:19,606 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.conv_module2.balancer2.min_abs, batch_count=1442173.3333333333, ans=0.5 2023-10-03 23:19:21,217 WARNING [train.py:1204] (2/4) Exclude cut with ID _4092_210_2151_1_1526643044449_3135759_356-278694-0 from training. Duration: 0.47 2023-10-03 23:19:23,995 WARNING [train.py:1204] (2/4) Exclude cut with ID _25808_210_13292_1_1532343514562_7366109_633-47524-0 from training. Duration: 0.99 2023-10-03 23:19:24,013 WARNING [train.py:1204] (2/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_872-264525-0 from training. Duration: 0.65 2023-10-03 23:19:24,044 WARNING [train.py:1204] (2/4) Exclude cut with ID _14632_210_9988_1_1530691279392_3923200_239-232545-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 23:19:25,435 WARNING [train.py:1204] (2/4) Exclude cut with ID _14349_210_8341_1_1530615710818_3442470_153-27432-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 23:19:26,841 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.conv_module2.balancer1.min_positive, batch_count=1442173.3333333333, ans=0.025 2023-10-03 23:19:30,863 INFO [train.py:1046] (2/4) Epoch 41, batch 3850, loss[loss=0.1395, simple_loss=0.2027, pruned_loss=0.03815, over 23515.00 frames. ], tot_loss[loss=0.1552, simple_loss=0.2344, pruned_loss=0.03798, over 4690285.63 frames. ], batch size: 285, lr: 2.47e-03, grad_scale: 8.0 2023-10-03 23:19:32,275 WARNING [train.py:1204] (2/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_37-41919-0_sp0.9 from training. Duration: 0.9666875 2023-10-03 23:19:32,342 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_196-289637-0_sp0.9 from training. Duration: 0.8333125 2023-10-03 23:19:35,303 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.feed_forward1.hidden_balancer.prob, batch_count=1442240.0, ans=0.125 2023-10-03 23:19:35,940 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.0.layers.0.conv_module2.whiten, num_groups=1, num_channels=192, metric=11.69 vs. limit=15.0 2023-10-03 23:19:36,496 WARNING [train.py:1204] (2/4) Exclude cut with ID _13678_210_7497_1_1530530069226_3463170_371-3221-0_sp1.1 from training. Duration: 0.8181875 2023-10-03 23:19:36,607 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.1.self_attn_weights.pos_emb_skip_rate, batch_count=1442240.0, ans=0.0 2023-10-03 23:19:39,632 WARNING [train.py:1204] (2/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_557-324258-0 from training. Duration: 0.81 2023-10-03 23:19:39,704 WARNING [train.py:1204] (2/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_1012-279473-0_sp1.1 from training. Duration: 0.6090625 2023-10-03 23:19:41,088 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_66916_210_14656_1_1535725525_326566_7-92649-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 23:19:44,508 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_9460_210_4211_1_1528954587_10049667_480-260269-0_sp1.1 from training. Duration: 0.5090625 2023-10-03 23:19:46,033 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_40896_210_15710_1_1533433956_7100802_121-155001-0_sp1.1 from training. Duration: 0.92725 2023-10-03 23:19:48,108 WARNING [train.py:1204] (2/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_188-171673-0_sp1.1 from training. Duration: 0.52725 2023-10-03 23:19:48,175 WARNING [train.py:1204] (2/4) Exclude cut with ID _47790_210_16187_1_1533862814066_4207519_2-39653-0 from training. Duration: 0.98 2023-10-03 23:19:55,580 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_23850_210_4238_1_1532232597_1632433_112-229012-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 23:19:58,344 WARNING [train.py:1204] (2/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_626-79528-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 23:19:59,769 WARNING [train.py:1204] (2/4) Exclude cut with ID _30304_210_6319_1_1532476827261_7582060_856-90052-0_sp1.1 from training. Duration: 0.963625 2023-10-03 23:19:59,813 WARNING [train.py:1204] (2/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_122-109023-0_sp0.9 from training. Duration: 0.8444375 2023-10-03 23:20:02,609 WARNING [train.py:1204] (2/4) Exclude cut with ID _64414_210_17301_1_1535243252048_3757759_37-70328-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 23:20:03,912 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_20120_210_6753_1_1531643035_5482859_622-344907-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 23:20:03,972 WARNING [train.py:1204] (2/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_410-321956-0_sp1.1 from training. Duration: 0.92725 2023-10-03 23:20:03,985 WARNING [train.py:1204] (2/4) Exclude cut with ID _7562_210_2084_1_1528887767916_3946229_186-326422-0_sp0.9 from training. Duration: 0.9333125 2023-10-03 23:20:04,071 WARNING [train.py:1204] (2/4) Exclude cut with ID _37537_210_2253_1_1533171758568_3421949_127-285192-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 23:20:05,461 WARNING [train.py:1204] (2/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_723-228168-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 23:20:07,284 WARNING [train.py:1204] (2/4) Exclude cut with ID _13678_210_7497_1_1530530069226_3463170_426-246984-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 23:20:07,305 WARNING [train.py:1204] (2/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_399-156791-0_sp0.9 from training. Duration: 0.92225 2023-10-03 23:20:08,631 WARNING [train.py:1204] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_391-22148-0 from training. Duration: 0.77 2023-10-03 23:20:08,655 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_3475_210_2028_1_1526193578_3622569_64-297072-0 from training. Duration: 0.91 2023-10-03 23:20:10,126 WARNING [train.py:1204] (2/4) Exclude cut with ID _17875_210_2087_1_1531224012907_1821339_225-280015-0_sp1.1 from training. Duration: 0.963625 2023-10-03 23:20:11,407 WARNING [train.py:1204] (2/4) Exclude cut with ID _50655_210_18628_1_1534229942193_3772154_325-246169-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 23:20:13,321 WARNING [train.py:1204] (2/4) Exclude cut with ID _29529_210_7234_1_1532393961455_3977460_597-257348-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 23:20:14,696 WARNING [train.py:1204] (2/4) Exclude cut with ID _58895_210_8750_1_1535184081320_3649089_77-173485-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 23:20:14,716 WARNING [train.py:1204] (2/4) Exclude cut with ID _6270_210_2084_1_1527850911573_3856420_26-277343-0 from training. Duration: 0.82 2023-10-03 23:20:15,016 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.conv_module1.balancer2.min_abs, batch_count=1442440.0, ans=0.5 2023-10-03 23:20:16,228 WARNING [train.py:1204] (2/4) Exclude cut with ID _14368_210_4220_1_1530532714870_3595529_144-3287-0 from training. Duration: 0.67 2023-10-03 23:20:17,781 WARNING [train.py:1204] (2/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_293-145278-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 23:20:19,614 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_40047_210_2290_1_1533290519_2374311_257-335162-0 from training. Duration: 0.95 2023-10-03 23:20:21,424 WARNING [train.py:1204] (2/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_619-344499-0_sp0.9 from training. Duration: 0.7 2023-10-03 23:20:24,013 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.555e+02 1.998e+02 2.145e+02 2.490e+02 4.261e+02, threshold=4.290e+02, percent-clipped=0.0 2023-10-03 23:20:26,900 WARNING [train.py:1204] (2/4) Exclude cut with ID _19504_210_9994_1_1531648863242_3325220_456-315379-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 23:20:27,520 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.4.encoder.layers.2.feed_forward1.out_whiten, num_groups=1, num_channels=384, metric=12.36 vs. limit=15.0 2023-10-03 23:20:28,354 WARNING [train.py:1204] (2/4) Exclude cut with ID _55857_210_2346_1_1534644007831_6898209_631-221692-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 23:20:31,188 WARNING [train.py:1204] (2/4) Exclude cut with ID _64631_210_27767_1_1535349580068_4165160_324-3682-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 23:20:32,471 WARNING [train.py:1204] (2/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_317-211107-0 from training. Duration: 0.92 2023-10-03 23:20:34,032 WARNING [train.py:1204] (2/4) Exclude cut with ID _6789_210_1534_1_1527857937218_3118950_311-61262-0 from training. Duration: 0.65 2023-10-03 23:20:37,176 WARNING [train.py:1204] (2/4) Exclude cut with ID _28323_210_16187_1_1532480395667_4100300_365-68882-0_sp1.1 from training. Duration: 0.92725 2023-10-03 23:20:38,498 WARNING [train.py:1204] (2/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_816-181825-0_sp1.1 from training. Duration: 0.92725 2023-10-03 23:20:40,102 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.conv_skip_rate, batch_count=1442506.6666666667, ans=0.0 2023-10-03 23:20:42,549 WARNING [train.py:1204] (2/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_132-269633-0_sp1.1 from training. Duration: 0.6181875 2023-10-03 23:20:42,567 WARNING [train.py:1204] (2/4) Exclude cut with ID _44585_210_18628_1_1533605350387_4518199_280-73534-0_sp1.1 from training. Duration: 0.8090625 2023-10-03 23:20:42,616 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_68095_210_20185_1_1535718226_8888855_1009-164755-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 23:20:43,905 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_40223_210_6228_1_1533298404_4812267_400-103813-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 23:20:43,906 WARNING [train.py:1204] (2/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_491-261923-0_sp1.1 from training. Duration: 0.8909375 2023-10-03 23:20:43,911 WARNING [train.py:1204] (2/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_712-342563-0 from training. Duration: 0.97 2023-10-03 23:20:45,667 INFO [train.py:1046] (2/4) Epoch 41, batch 3900, loss[loss=0.1405, simple_loss=0.2196, pruned_loss=0.03067, over 24356.00 frames. ], tot_loss[loss=0.1545, simple_loss=0.2339, pruned_loss=0.03751, over 4700250.40 frames. ], batch size: 56, lr: 2.47e-03, grad_scale: 8.0 2023-10-03 23:20:45,759 WARNING [train.py:1204] (2/4) Exclude cut with ID _41161_210_20034_1_1533431249657_3555314_175-190032-0_sp1.1 from training. Duration: 0.963625 2023-10-03 23:20:47,175 WARNING [train.py:1204] (2/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_788-298907-0 from training. Duration: 0.89 2023-10-03 23:20:47,197 WARNING [train.py:1204] (2/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_225-70961-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 23:20:47,203 WARNING [train.py:1204] (2/4) Exclude cut with ID _58583_210_5236_1_1534726835266_3574249_235-175994-0_sp1.1 from training. Duration: 0.92725 2023-10-03 23:20:49,812 WARNING [train.py:1204] (2/4) Exclude cut with ID _12302_210_5219_1_1529924446466_8107280_907-182949-0_sp1.1 from training. Duration: 0.836375 2023-10-03 23:20:49,846 WARNING [train.py:1204] (2/4) Exclude cut with ID _54871_210_22356_1_1534665798253_3856359_94-193692-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 23:20:51,600 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_4528_210_3273_1_1527076654_3276777_342-168068-0_sp0.9 from training. Duration: 0.9666875 2023-10-03 23:20:51,650 WARNING [train.py:1204] (2/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_368-74239-0_sp1.1 from training. Duration: 0.92725 2023-10-03 23:20:51,651 WARNING [train.py:1204] (2/4) Exclude cut with ID _44831_210_13427_1_1533816011549_3689035_291-295928-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 23:20:53,013 WARNING [train.py:1204] (2/4) Exclude cut with ID _56406_210_9288_1_1534899618664_3496149_455-131997-0_sp1.1 from training. Duration: 0.97275 2023-10-03 23:20:53,020 WARNING [train.py:1204] (2/4) Exclude cut with ID _6473_210_1741_1_1527901307396_3815380_108-17335-0 from training. Duration: 0.65 2023-10-03 23:20:54,804 WARNING [train.py:1204] (2/4) Exclude cut with ID _14224_210_4381_1_1530529062037_4264010_286-135600-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 23:20:56,278 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_14056_210_4163_1_1530604483_7572827_928-321675-0_sp1.1 from training. Duration: 0.936375 2023-10-03 23:20:57,812 WARNING [train.py:1204] (2/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_81-300212-0_sp1.1 from training. Duration: 0.5454375 2023-10-03 23:20:59,016 WARNING [train.py:1204] (2/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_29-280622-0_sp0.9 from training. Duration: 0.911125 2023-10-03 23:21:00,533 WARNING [train.py:1204] (2/4) Exclude cut with ID _61079_210_4525_1_1534921008198_4011419_228-176447-0_sp1.1 from training. Duration: 0.936375 2023-10-03 23:21:00,874 INFO [scaling.py:1118] (2/4) WithLoss: name=encoder.encoders.5.encoder.layers.1.self_attn_weights, loss-sum=0.000e+00 2023-10-03 23:21:03,259 WARNING [train.py:1204] (2/4) Exclude cut with ID _8394_210_6702_1_1528590504316_3540189_372-198878-0_sp1.1 from training. Duration: 0.5454375 2023-10-03 23:21:03,277 WARNING [train.py:1204] (2/4) Exclude cut with ID _64521_210_14699_1_1535243427259_3815328_244-208725-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 23:21:04,745 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8275_210_4198_1_1528455320_3892571_491-17707-0_sp0.9 from training. Duration: 0.711125 2023-10-03 23:21:04,946 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.conv_module1.balancer2.prob, batch_count=1442640.0, ans=0.125 2023-10-03 23:21:05,336 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.4.encoder.layers.2.feed_forward2.out_whiten, num_groups=1, num_channels=384, metric=9.54 vs. limit=15.0 2023-10-03 23:21:06,327 WARNING [train.py:1204] (2/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_375-245677-0 from training. Duration: 0.81 2023-10-03 23:21:06,335 WARNING [train.py:1204] (2/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_690-286055-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 23:21:08,133 WARNING [train.py:1204] (2/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_89-321955-0 from training. Duration: 0.8 2023-10-03 23:21:08,165 WARNING [train.py:1204] (2/4) Exclude cut with ID _18895_210_6228_1_1531357274234_3784930_213-98721-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 23:21:08,237 WARNING [train.py:1204] (2/4) Exclude cut with ID _19708_210_9206_1_1532136650733_4238364_347-65240-0 from training. Duration: 0.97 2023-10-03 23:21:10,878 WARNING [train.py:1204] (2/4) Exclude cut with ID _64135_210_2253_1_1535677267876_7608972_498-5039-0 from training. Duration: 0.78 2023-10-03 23:21:15,047 WARNING [train.py:1204] (2/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_146-276944-0_sp1.1 from training. Duration: 0.7545625 2023-10-03 23:21:16,785 WARNING [train.py:1204] (2/4) Exclude cut with ID _63171_210_15488_1_1535104792794_3295110_155-34488-0_sp1.1 from training. Duration: 0.97275 2023-10-03 23:21:16,807 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_5438_210_1794_1_1527336687_2600009_233-58072-0_sp0.9 from training. Duration: 0.8555625 2023-10-03 23:21:16,855 WARNING [train.py:1204] (2/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_63-101996-0_sp0.9 from training. Duration: 0.97775 2023-10-03 23:21:17,359 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.nonlin_attention.whiten2.whitening_limit, batch_count=1442706.6666666667, ans=15.0 2023-10-03 23:21:22,332 WARNING [train.py:1204] (2/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_271-269084-0_sp1.1 from training. Duration: 0.7545625 2023-10-03 23:21:24,306 WARNING [train.py:1204] (2/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_345-167986-0_sp1.1 from training. Duration: 0.763625 2023-10-03 23:21:25,691 WARNING [train.py:1204] (2/4) Exclude cut with ID _39290_210_4220_1_1533862778760_3075118_92-311600-0_sp1.1 from training. Duration: 0.8 2023-10-03 23:21:25,697 WARNING [train.py:1204] (2/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_209-65306-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 23:21:27,040 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_26433_210_12108_1_1532157158_4056480_317-335807-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 23:21:31,929 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.3.encoder.layers.1.conv_module1.whiten, num_groups=1, num_channels=512, metric=4.27 vs. limit=15.0 2023-10-03 23:21:32,623 WARNING [train.py:1204] (2/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_238-94127-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 23:21:32,667 WARNING [train.py:1204] (2/4) Exclude cut with ID _41314_210_12651_1_1533459476543_3766460_207-132038-0_sp1.1 from training. Duration: 0.8545625 2023-10-03 23:21:40,118 WARNING [train.py:1204] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_167-137255-0_sp1.1 from training. Duration: 0.5818125 2023-10-03 23:21:42,922 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_2009_210_2071_1_1525481134_4588937_385-302100-0_sp0.9 from training. Duration: 0.9666875 2023-10-03 23:21:45,280 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.3.encoder.layers.0.self_attn_weights.whiten_keys, num_groups=8, num_channels=256, metric=4.49 vs. limit=6.0 2023-10-03 23:21:50,505 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_15517_210_6753_1_1530954008_3487336_253-110617-0_sp1.1 from training. Duration: 0.936375 2023-10-03 23:21:50,774 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.balancer2.prob, batch_count=1442840.0, ans=0.125 2023-10-03 23:21:50,805 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.feed_forward1.hidden_balancer.prob, batch_count=1442840.0, ans=0.125 2023-10-03 23:21:51,963 WARNING [train.py:1204] (2/4) Exclude cut with ID _39290_210_4220_1_1533862778760_3075118_92-311600-0_sp0.9 from training. Duration: 0.97775 2023-10-03 23:21:53,421 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_42-142473-0 from training. Duration: 0.72 2023-10-03 23:21:53,472 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_2296_210_2087_1_1525503448_3397335_57-185430-0 from training. Duration: 0.6 2023-10-03 23:21:53,486 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_11305_210_1794_1_1529750428_7178148_257-312870-0_sp0.9 from training. Duration: 0.97775 2023-10-03 23:21:53,760 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.attention_skip_rate, batch_count=1442840.0, ans=0.0 2023-10-03 23:21:55,449 WARNING [train.py:1204] (2/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_222-198127-0 from training. Duration: 0.84 2023-10-03 23:21:56,810 WARNING [train.py:1204] (2/4) Exclude cut with ID _65263_210_22356_1_1535763598691_3921037_423-202699-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 23:21:58,111 WARNING [train.py:1204] (2/4) Exclude cut with ID _15037_210_2253_1_1530752294104_3267650_13-130361-0 from training. Duration: 0.99 2023-10-03 23:21:59,415 INFO [train.py:1046] (2/4) Epoch 41, batch 3950, loss[loss=0.1557, simple_loss=0.2413, pruned_loss=0.03508, over 24474.00 frames. ], tot_loss[loss=0.1538, simple_loss=0.2334, pruned_loss=0.03714, over 4700521.61 frames. ], batch size: 66, lr: 2.47e-03, grad_scale: 8.0 2023-10-03 23:22:01,204 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.1.feed_forward1.out_proj.dropout_p, batch_count=1442906.6666666667, ans=0.1 2023-10-03 23:22:02,488 WARNING [train.py:1204] (2/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_628-198080-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 23:22:03,864 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_3171_210_2087_1_1526109084_2330007_37-64334-0 from training. Duration: 0.82 2023-10-03 23:22:03,914 WARNING [train.py:1204] (2/4) Exclude cut with ID _5287_210_2084_1_1527418883040_3878670_12-221186-0_sp1.1 from training. Duration: 0.8909375 2023-10-03 23:22:07,223 WARNING [train.py:1204] (2/4) Exclude cut with ID _6653_210_2629_1_1527919172441_6732679_1103-56445-0_sp1.1 from training. Duration: 0.663625 2023-10-03 23:22:08,551 WARNING [train.py:1204] (2/4) Exclude cut with ID _65263_210_22356_1_1535763598691_3921037_169-140068-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 23:22:14,128 WARNING [train.py:1204] (2/4) Exclude cut with ID IC0671W0118-331961 from training. Duration: 0.896 2023-10-03 23:22:16,014 WARNING [train.py:1204] (2/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_402-102311-0_sp1.1 from training. Duration: 0.6090625 2023-10-03 23:22:16,034 WARNING [train.py:1204] (2/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_202-293523-0 from training. Duration: 0.72 2023-10-03 23:22:16,090 WARNING [train.py:1204] (2/4) Exclude cut with ID IC0679W0038-335846 from training. Duration: 0.768 2023-10-03 23:22:17,467 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_37357_210_17024_1_1533089504_2316121_79-198866-0_sp1.1 from training. Duration: 0.936375 2023-10-03 23:22:20,191 WARNING [train.py:1204] (2/4) Exclude cut with ID _7606_210_4915_1_1528894253277_5080690_292-271285-0_sp1.1 from training. Duration: 0.963625 2023-10-03 23:22:20,209 WARNING [train.py:1204] (2/4) Exclude cut with ID _3541_210_2477_1_1526475408709_3263973_377-296839-0_sp0.9 from training. Duration: 0.611125 2023-10-03 23:22:20,219 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_45886_210_17024_1_1533704917_2435333_58-131069-0_sp1.1 from training. Duration: 0.963625 2023-10-03 23:22:22,058 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.5.encoder.layers.1.self_attn2.whiten, num_groups=1, num_channels=256, metric=12.64 vs. limit=22.5 2023-10-03 23:22:22,851 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6961_210_2087_1_1527937168_6815011_395-185060-0 from training. Duration: 0.95 2023-10-03 23:22:26,712 WARNING [train.py:1204] (2/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_304-266479-0_sp1.1 from training. Duration: 0.97275 2023-10-03 23:22:28,054 WARNING [train.py:1204] (2/4) Exclude cut with ID _3867_210_1883_1_1526641200575_4475830_319-349850-0_sp1.1 from training. Duration: 0.6090625 2023-10-03 23:22:28,072 WARNING [train.py:1204] (2/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_304-59227-0_sp0.9 from training. Duration: 0.8555625 2023-10-03 23:22:28,133 WARNING [train.py:1204] (2/4) Exclude cut with ID _8021_210_7994_1_1528376451358_3653069_100-140231-0_sp0.9 from training. Duration: 0.7444375 2023-10-03 23:22:29,603 WARNING [train.py:1204] (2/4) Exclude cut with ID _8833_210_5219_1_1528592665210_7553059_329-125156-0_sp0.9 from training. Duration: 0.911125 2023-10-03 23:22:34,687 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.3.encoder.layers.3.nonlin_attention.whiten1, num_groups=1, num_channels=384, metric=5.20 vs. limit=10.0 2023-10-03 23:22:38,528 WARNING [train.py:1204] (2/4) Exclude cut with ID _14459_210_2228_1_1531443641946_7516700_792-287234-0_sp1.1 from training. Duration: 0.92725 2023-10-03 23:22:38,544 WARNING [train.py:1204] (2/4) Exclude cut with ID _19708_210_9206_1_1532136650733_4238364_347-65240-0_sp1.1 from training. Duration: 0.8818125 2023-10-03 23:22:41,482 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.conv_module1.balancer2.prob, batch_count=1443040.0, ans=0.125 2023-10-03 23:22:42,676 WARNING [train.py:1204] (2/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_70-27732-0 from training. Duration: 0.94 2023-10-03 23:22:43,005 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.feed_forward3.hidden_balancer.prob, batch_count=1443106.6666666667, ans=0.125 2023-10-03 23:22:47,325 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_397-132565-0 from training. Duration: 0.99 2023-10-03 23:22:47,328 WARNING [train.py:1204] (2/4) Exclude cut with ID _49728_210_12651_1_1534041034294_6788150_624-195301-0 from training. Duration: 0.85 2023-10-03 23:22:47,363 WARNING [train.py:1204] (2/4) Exclude cut with ID _55429_210_10626_1_1534467588527_7174143_377-169391-0_sp1.1 from training. Duration: 0.87275 2023-10-03 23:22:48,819 WARNING [train.py:1204] (2/4) Exclude cut with ID _49376_210_5986_1_1533985278732_3370740_354-344695-0_sp1.1 from training. Duration: 0.9 2023-10-03 23:22:51,418 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.504e+02 1.901e+02 2.096e+02 2.372e+02 3.248e+02, threshold=4.191e+02, percent-clipped=0.0 2023-10-03 23:22:56,157 WARNING [train.py:1204] (2/4) Exclude cut with ID _4697_210_2069_1_1526815801953_3823098_276-96593-0_sp1.1 from training. Duration: 0.7 2023-10-03 23:22:56,172 WARNING [train.py:1204] (2/4) Exclude cut with ID _15012_210_1968_1_1530856820429_5095350_688-178849-0_sp0.9 from training. Duration: 0.711125 2023-10-03 23:22:57,478 WARNING [train.py:1204] (2/4) Exclude cut with ID _25565_210_12392_1_1532423009382_3952098_372-69023-0_sp1.1 from training. Duration: 0.936375 2023-10-03 23:22:59,296 WARNING [train.py:1204] (2/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_61-327062-0_sp1.1 from training. Duration: 0.736375 2023-10-03 23:22:59,334 WARNING [train.py:1204] (2/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_215-141059-0 from training. Duration: 0.62 2023-10-03 23:22:59,684 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.conv_skip_rate, batch_count=1443173.3333333333, ans=0.0 2023-10-03 23:23:02,282 WARNING [train.py:1204] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_6-226750-0_sp1.1 from training. Duration: 0.8545625 2023-10-03 23:23:03,757 WARNING [train.py:1204] (2/4) Exclude cut with ID _14348_210_8341_1_1530599434996_3778640_240-232370-0_sp1.1 from training. Duration: 0.763625 2023-10-03 23:23:06,951 WARNING [train.py:1204] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_468-189058-0 from training. Duration: 0.96 2023-10-03 23:23:13,718 INFO [train.py:1046] (2/4) Epoch 41, batch 4000, loss[loss=0.1443, simple_loss=0.2269, pruned_loss=0.03088, over 24552.00 frames. ], tot_loss[loss=0.1552, simple_loss=0.2346, pruned_loss=0.03789, over 4709660.56 frames. ], batch size: 60, lr: 2.47e-03, grad_scale: 16.0 2023-10-03 23:23:15,620 WARNING [train.py:1204] (2/4) Exclude cut with ID _21987_210_5854_1_1531821500482_3637038_247-93340-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 23:23:22,556 WARNING [train.py:1204] (2/4) Exclude cut with ID _66868_210_9527_1_1535509287954_4561140_436-191602-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 23:23:22,789 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.balancer1.prob, batch_count=1443240.0, ans=0.125 2023-10-03 23:23:26,955 WARNING [train.py:1204] (2/4) Exclude cut with ID _18895_210_6228_1_1531357274234_3784930_394-58203-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 23:23:28,335 WARNING [train.py:1204] (2/4) Exclude cut with ID _58605_210_12210_1_1534723195939_3904259_258-281498-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 23:23:28,369 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_33348_210_5632_1_1533086912_3324030_245-292752-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 23:23:28,387 WARNING [train.py:1204] (2/4) Exclude cut with ID _67831_210_12392_1_1535612464202_3092612_346-2148-0 from training. Duration: 0.93 2023-10-03 23:23:30,243 WARNING [train.py:1204] (2/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_369-222060-0_sp0.9 from training. Duration: 0.788875 2023-10-03 23:23:30,292 WARNING [train.py:1204] (2/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_245-174553-0 from training. Duration: 0.88 2023-10-03 23:23:30,298 WARNING [train.py:1204] (2/4) Exclude cut with ID _3541_210_2477_1_1526475408709_3263973_231-104736-0_sp1.1 from training. Duration: 0.7090625 2023-10-03 23:23:30,313 WARNING [train.py:1204] (2/4) Exclude cut with ID _44585_210_18628_1_1533605350387_4518199_280-73534-0 from training. Duration: 0.89 2023-10-03 23:23:33,008 WARNING [train.py:1204] (2/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_22-201476-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 23:23:35,758 WARNING [train.py:1204] (2/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_325-62802-0_sp0.9 from training. Duration: 0.9666875 2023-10-03 23:23:35,779 WARNING [train.py:1204] (2/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_199-274062-0_sp1.1 from training. Duration: 0.97275 2023-10-03 23:23:35,789 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_8-319957-0_sp1.1 from training. Duration: 0.87275 2023-10-03 23:23:35,811 WARNING [train.py:1204] (2/4) Exclude cut with ID _29343_210_4381_1_1532602757752_3674109_224-349726-0_sp1.1 from training. Duration: 0.936375 2023-10-03 23:23:35,815 WARNING [train.py:1204] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_520-126729-0_sp1.1 from training. Duration: 0.636375 2023-10-03 23:23:37,326 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.feed_forward3.hidden_balancer.prob, batch_count=1443306.6666666667, ans=0.125 2023-10-03 23:23:38,980 WARNING [train.py:1204] (2/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_239-22076-0_sp1.1 from training. Duration: 0.77275 2023-10-03 23:23:40,364 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9012W0011-501146 from training. Duration: 0.896 2023-10-03 23:23:41,637 WARNING [train.py:1204] (2/4) Exclude cut with ID _29857_210_20034_1_1532395886753_3798160_279-185801-0_sp1.1 from training. Duration: 0.82725 2023-10-03 23:23:42,938 WARNING [train.py:1204] (2/4) Exclude cut with ID _43552_210_8913_1_1533726031059_3523429_261-223933-0_sp1.1 from training. Duration: 0.92725 2023-10-03 23:23:44,972 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9029W0025-509426 from training. Duration: 0.896 2023-10-03 23:23:46,341 WARNING [train.py:1204] (2/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_337-181502-0_sp1.1 from training. Duration: 0.5909375 2023-10-03 23:23:46,362 WARNING [train.py:1204] (2/4) Exclude cut with ID _68635_210_9317_1_1535770813410_4315870_159-332599-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 23:23:51,838 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.feed_forward1.out_proj.dropout_p, batch_count=1443373.3333333333, ans=0.1 2023-10-03 23:23:53,134 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_161-276996-0 from training. Duration: 0.95 2023-10-03 23:23:54,491 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7088_210_1874_1_1527939776_1101113_127-68870-0_sp1.1 from training. Duration: 0.936375 2023-10-03 23:23:57,959 WARNING [train.py:1204] (2/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_322-217220-0_sp1.1 from training. Duration: 0.7818125 2023-10-03 23:23:58,013 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9072W0002-530636 from training. Duration: 0.768 2023-10-03 23:24:00,577 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_340-336408-0_sp1.1 from training. Duration: 0.8454375 2023-10-03 23:24:00,645 WARNING [train.py:1204] (2/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_395-37222-0 from training. Duration: 0.76 2023-10-03 23:24:00,648 WARNING [train.py:1204] (2/4) Exclude cut with ID _64099_210_17301_1_1535526977656_1578188_159-209156-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 23:24:01,994 WARNING [train.py:1204] (2/4) Exclude cut with ID _53950_210_5816_1_1534417264919_3862951_28-29681-0_sp1.1 from training. Duration: 0.92725 2023-10-03 23:24:03,389 WARNING [train.py:1204] (2/4) Exclude cut with ID _4681_210_3525_1_1527250689464_120889_15-126760-0_sp1.1 from training. Duration: 0.736375 2023-10-03 23:24:05,483 WARNING [train.py:1204] (2/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_242-3472-0_sp1.1 from training. Duration: 0.663625 2023-10-03 23:24:05,524 WARNING [train.py:1204] (2/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_436-186311-0_sp0.9 from training. Duration: 0.72225 2023-10-03 23:24:05,539 WARNING [train.py:1204] (2/4) Exclude cut with ID _3675_210_4557_1_1526639934408_4182089_452-172147-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 23:24:06,974 WARNING [train.py:1204] (2/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_80-147301-0 from training. Duration: 0.84 2023-10-03 23:24:06,998 WARNING [train.py:1204] (2/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_351-107588-0_sp1.1 from training. Duration: 0.92725 2023-10-03 23:24:08,379 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9111W0002-550781 from training. Duration: 0.896 2023-10-03 23:24:08,572 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.1.feed_forward1.out_proj.dropout_p, batch_count=1443440.0, ans=0.1 2023-10-03 23:24:14,369 WARNING [train.py:1204] (2/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_465-156500-0_sp1.1 from training. Duration: 0.6545625 2023-10-03 23:24:15,904 WARNING [train.py:1204] (2/4) Exclude cut with ID _13938_210_9988_1_1530518718978_3693960_304-112784-0_sp1.1 from training. Duration: 0.4181875 2023-10-03 23:24:16,168 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.attention_skip_rate, batch_count=1443506.6666666667, ans=0.0 2023-10-03 23:24:17,846 WARNING [train.py:1204] (2/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_202-67017-0_sp1.1 from training. Duration: 0.7454375 2023-10-03 23:24:17,893 WARNING [train.py:1204] (2/4) Exclude cut with ID _58414_210_8254_1_1535421609162_7151182_448-4358-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 23:24:17,959 WARNING [train.py:1204] (2/4) Exclude cut with ID _66840_210_5219_1_1535452168426_4196349_203-243234-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 23:24:19,287 WARNING [train.py:1204] (2/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_201-213513-0_sp1.1 from training. Duration: 0.97275 2023-10-03 23:24:22,633 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.5.encoder.layers.1.nonlin_attention.whiten1, num_groups=1, num_channels=192, metric=2.73 vs. limit=10.0 2023-10-03 23:24:25,323 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6291_210_3273_1_1527683649_1078498_117-187560-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 23:24:25,492 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder_embed.convnext.hidden_balancer.prob, batch_count=1443506.6666666667, ans=0.125 2023-10-03 23:24:26,828 WARNING [train.py:1204] (2/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_135-174080-0_sp0.9 from training. Duration: 0.588875 2023-10-03 23:24:26,874 WARNING [train.py:1204] (2/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_45-265684-0 from training. Duration: 0.72 2023-10-03 23:24:28,130 INFO [train.py:1046] (2/4) Epoch 41, batch 4050, loss[loss=0.1554, simple_loss=0.242, pruned_loss=0.03437, over 24348.00 frames. ], tot_loss[loss=0.155, simple_loss=0.2347, pruned_loss=0.03768, over 4704719.87 frames. ], batch size: 77, lr: 2.47e-03, grad_scale: 8.0 2023-10-03 23:24:30,880 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_435-313305-0_sp1.1 from training. Duration: 0.6454375 2023-10-03 23:24:30,901 WARNING [train.py:1204] (2/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_235-307337-0_sp1.1 from training. Duration: 0.8545625 2023-10-03 23:24:32,843 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7762_210_1794_1_1528199508_4057408_419-125009-0_sp0.9 from training. Duration: 0.92225 2023-10-03 23:24:32,943 WARNING [train.py:1204] (2/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_215-141059-0_sp1.1 from training. Duration: 0.563625 2023-10-03 23:24:34,270 WARNING [train.py:1204] (2/4) Exclude cut with ID _27013_210_1389_1_1532437173240_3252649_222-125573-0_sp1.1 from training. Duration: 0.8909375 2023-10-03 23:24:35,855 WARNING [train.py:1204] (2/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_126-274694-0_sp1.1 from training. Duration: 0.8909375 2023-10-03 23:24:39,961 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_2592_210_2290_1_1525697025_4882925_157-70512-0_sp1.1 from training. Duration: 0.82725 2023-10-03 23:24:40,044 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_292-265928-0_sp0.9 from training. Duration: 0.5666875 2023-10-03 23:24:41,865 WARNING [train.py:1204] (2/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_1211-226066-0_sp1.1 from training. Duration: 0.6909375 2023-10-03 23:24:43,180 WARNING [train.py:1204] (2/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_394-265747-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 23:24:46,092 WARNING [train.py:1204] (2/4) Exclude cut with ID _48782_210_12658_1_1534467701544_2300422_239-67139-0_sp1.1 from training. Duration: 0.97275 2023-10-03 23:24:48,689 WARNING [train.py:1204] (2/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_329-113173-0_sp1.1 from training. Duration: 0.563625 2023-10-03 23:24:51,995 WARNING [train.py:1204] (2/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_81-75849-0_sp0.9 from training. Duration: 0.4333125 2023-10-03 23:24:53,406 WARNING [train.py:1204] (2/4) Exclude cut with ID _9235_210_5219_1_1528891154253_8046120_1003-101638-0 from training. Duration: 0.73 2023-10-03 23:24:53,433 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9309W0189-640316 from training. Duration: 0.64 2023-10-03 23:24:56,645 WARNING [train.py:1204] (2/4) Exclude cut with ID _45329_210_7005_1_1534157951211_6774339_503-287883-0_sp1.1 from training. Duration: 0.8 2023-10-03 23:25:03,542 WARNING [train.py:1204] (2/4) Exclude cut with ID _8833_210_5219_1_1528592665210_7553059_611-80313-0 from training. Duration: 0.64 2023-10-03 23:25:03,606 WARNING [train.py:1204] (2/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_335-106626-0_sp1.1 from training. Duration: 0.963625 2023-10-03 23:25:08,357 WARNING [train.py:1204] (2/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_237-44332-0_sp1.1 from training. Duration: 0.8545625 2023-10-03 23:25:12,557 WARNING [train.py:1204] (2/4) Exclude cut with ID _46952_210_5986_1_1534230010357_3107379_178-147341-0_sp1.1 from training. Duration: 0.97275 2023-10-03 23:25:12,619 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_13091_210_7925_1_1530609447_8987354_946-154426-0_sp1.1 from training. Duration: 0.936375 2023-10-03 23:25:12,635 WARNING [train.py:1204] (2/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_326-17061-0_sp1.1 from training. Duration: 0.8545625 2023-10-03 23:25:17,144 WARNING [train.py:1204] (2/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_38-169920-0_sp1.1 from training. Duration: 0.82725 2023-10-03 23:25:18,886 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.5.encoder.layers.1.feed_forward2.out_whiten, num_groups=1, num_channels=256, metric=7.94 vs. limit=15.0 2023-10-03 23:25:21,088 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.606e+02 1.958e+02 2.133e+02 2.374e+02 3.448e+02, threshold=4.266e+02, percent-clipped=0.0 2023-10-03 23:25:21,190 WARNING [train.py:1204] (2/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_283-220372-0 from training. Duration: 0.56 2023-10-03 23:25:21,199 WARNING [train.py:1204] (2/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_252-216674-0_sp1.1 from training. Duration: 0.4909375 2023-10-03 23:25:21,304 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_202-91290-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 23:25:23,361 WARNING [train.py:1204] (2/4) Exclude cut with ID _41314_210_12651_1_1533459476543_3766460_80-74184-0 from training. Duration: 0.83 2023-10-03 23:25:26,287 WARNING [train.py:1204] (2/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_411-271358-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 23:25:33,701 WARNING [train.py:1204] (2/4) Exclude cut with ID _68777_210_4353_1_1535810390731_3117904_129-166012-0 from training. Duration: 0.85 2023-10-03 23:25:35,054 WARNING [train.py:1204] (2/4) Exclude cut with ID _44982_210_1511_1_1534312820108_3726029_410-109759-0_sp1.1 from training. Duration: 0.963625 2023-10-03 23:25:35,059 WARNING [train.py:1204] (2/4) Exclude cut with ID _14224_210_4381_1_1530529062037_4264010_364-22817-0_sp1.1 from training. Duration: 0.6454375 2023-10-03 23:25:37,815 WARNING [train.py:1204] (2/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_89-242335-0 from training. Duration: 0.95 2023-10-03 23:25:37,824 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_12868_210_6753_1_1530702479_3610564_151-325626-0 from training. Duration: 0.97 2023-10-03 23:25:37,825 WARNING [train.py:1204] (2/4) Exclude cut with ID _6275_210_2084_1_1528110062213_7669049_786-264332-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 23:25:39,669 WARNING [train.py:1204] (2/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_35-153886-0_sp1.1 from training. Duration: 0.763625 2023-10-03 23:25:40,976 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_24007_210_1794_1_1531918925_1663875_116-100916-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 23:25:41,005 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_162-262717-0_sp1.1 from training. Duration: 0.7545625 2023-10-03 23:25:42,273 INFO [train.py:1046] (2/4) Epoch 41, batch 4100, loss[loss=0.1715, simple_loss=0.2517, pruned_loss=0.0456, over 23355.00 frames. ], tot_loss[loss=0.1551, simple_loss=0.2352, pruned_loss=0.03753, over 4710825.91 frames. ], batch size: 93, lr: 2.47e-03, grad_scale: 8.0 2023-10-03 23:25:48,404 WARNING [train.py:1204] (2/4) Exclude cut with ID _8263_210_1664_1_1528438953494_294100_40-204731-0 from training. Duration: 0.5 2023-10-03 23:25:49,810 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_12868_210_6753_1_1530702479_3610564_416-293746-0 from training. Duration: 0.9 2023-10-03 23:25:52,409 WARNING [train.py:1204] (2/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_605-219732-0 from training. Duration: 0.94 2023-10-03 23:25:53,767 WARNING [train.py:1204] (2/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_454-123002-0 from training. Duration: 0.69 2023-10-03 23:25:53,779 WARNING [train.py:1204] (2/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_25-304273-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 23:25:53,839 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6860_210_4852_1_1527917457_3833327_451-39702-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 23:25:53,864 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_304-16505-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 23:25:55,213 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6291_210_3273_1_1527683649_1078498_4-266348-0_sp1.1 from training. Duration: 0.7090625 2023-10-03 23:25:55,875 WARNING [train.py:1204] (2/4) Exclude cut with ID ID0149W0001-753701 from training. Duration: 0.989 2023-10-03 23:25:58,604 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_41251_210_15368_1_1533429767_4820695_394-55393-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 23:25:58,684 WARNING [train.py:1204] (2/4) Exclude cut with ID _8793_210_6310_1_1528542985139_2310009_121-179782-0_sp1.1 from training. Duration: 0.72725 2023-10-03 23:25:58,698 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_2729_210_3528_1_1525863505_3882214_421-73047-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 23:26:00,031 WARNING [train.py:1204] (2/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_278-154492-0_sp0.9 from training. Duration: 0.8444375 2023-10-03 23:26:01,768 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.self_attn_weights.pos_emb_skip_rate, batch_count=1443973.3333333333, ans=0.0 2023-10-03 23:26:04,424 WARNING [train.py:1204] (2/4) Exclude cut with ID _14170_210_5219_1_1530525949868_4469650_412-317077-0_sp0.9 from training. Duration: 0.8333125 2023-10-03 23:26:05,816 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_727-138317-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 23:26:05,863 WARNING [train.py:1204] (2/4) Exclude cut with ID _25808_210_13292_1_1532343514562_7366109_345-335048-0_sp1.1 from training. Duration: 0.92725 2023-10-03 23:26:05,878 WARNING [train.py:1204] (2/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_506-23200-0 from training. Duration: 0.97 2023-10-03 23:26:07,263 WARNING [train.py:1204] (2/4) Exclude cut with ID _44718_210_5816_1_1534050051869_6469030_643-245187-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 23:26:07,267 WARNING [train.py:1204] (2/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_42-186162-0_sp1.1 from training. Duration: 0.836375 2023-10-03 23:26:07,290 WARNING [train.py:1204] (2/4) Exclude cut with ID _3231_210_2106_1_1526205864934_6784220_171-256004-0_sp1.1 from training. Duration: 0.7818125 2023-10-03 23:26:07,308 WARNING [train.py:1204] (2/4) Exclude cut with ID _25914_210_6400_1_1532141317058_644740_43-248478-0_sp1.1 from training. Duration: 0.936375 2023-10-03 23:26:07,356 WARNING [train.py:1204] (2/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_577-287150-0 from training. Duration: 0.82 2023-10-03 23:26:13,282 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_46015_210_7234_1_1533863319_3087837_376-276068-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 23:26:13,381 WARNING [train.py:1204] (2/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_51-200220-0 from training. Duration: 0.56 2023-10-03 23:26:16,021 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_452-96101-0_sp1.1 from training. Duration: 0.963625 2023-10-03 23:26:16,231 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.out_combiner.scale_min, batch_count=1444040.0, ans=0.2 2023-10-03 23:26:18,738 WARNING [train.py:1204] (2/4) Exclude cut with ID _13252_210_1925_1_1530671372047_3336909_263-168388-0_sp1.1 from training. Duration: 0.7818125 2023-10-03 23:26:18,739 WARNING [train.py:1204] (2/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_469-94673-0 from training. Duration: 0.79 2023-10-03 23:26:18,855 WARNING [train.py:1204] (2/4) Exclude cut with ID _14922_210_6228_1_1530707435273_3693310_303-228738-0_sp1.1 from training. Duration: 0.763625 2023-10-03 23:26:20,670 WARNING [train.py:1204] (2/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_262-250826-0_sp1.1 from training. Duration: 0.9 2023-10-03 23:26:20,695 WARNING [train.py:1204] (2/4) Exclude cut with ID _68777_210_4353_1_1535810390731_3117904_129-166012-0_sp1.1 from training. Duration: 0.77275 2023-10-03 23:26:23,416 WARNING [train.py:1204] (2/4) Exclude cut with ID _58895_210_8750_1_1535184081320_3649089_265-323810-0 from training. Duration: 0.96 2023-10-03 23:26:23,511 WARNING [train.py:1204] (2/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_120-76843-0_sp0.9 from training. Duration: 0.611125 2023-10-03 23:26:25,537 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7544_210_6753_1_1528587722_3576686_295-258479-0_sp0.9 from training. Duration: 0.9333125 2023-10-03 23:26:26,809 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7046_210_6753_1_1527895337_6920917_783-278109-0 from training. Duration: 0.77 2023-10-03 23:26:26,870 WARNING [train.py:1204] (2/4) Exclude cut with ID _42326_210_7770_1_1533814282531_3707269_113-308323-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 23:26:27,076 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.balancer1.prob, batch_count=1444106.6666666667, ans=0.125 2023-10-03 23:26:28,667 WARNING [train.py:1204] (2/4) Exclude cut with ID _7852_210_5219_1_1528286471316_8139400_242-105336-0_sp0.9 from training. Duration: 0.8 2023-10-03 23:26:29,759 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.0.layers.0.nonlin_attention.whiten1, num_groups=1, num_channels=144, metric=9.54 vs. limit=10.0 2023-10-03 23:26:30,235 WARNING [train.py:1204] (2/4) Exclude cut with ID _56235_210_10640_1_1534557335235_4161099_114-277905-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 23:26:35,823 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_13118_210_7925_1_1530755612_7608297_42-66683-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 23:26:38,583 WARNING [train.py:1204] (2/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_901-79438-0_sp1.1 from training. Duration: 0.8545625 2023-10-03 23:26:38,621 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_30280_210_1881_1_1532872779_3660139_211-182194-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 23:26:38,792 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.0.feed_forward1.out_proj.dropout_p, batch_count=1444106.6666666667, ans=0.1 2023-10-03 23:26:47,517 WARNING [train.py:1204] (2/4) Exclude cut with ID _14898_210_9206_1_1530766812377_4177190_621-45732-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 23:26:47,531 WARNING [train.py:1204] (2/4) Exclude cut with ID _35350_210_16040_1_1533040191183_4075299_224-133775-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 23:26:50,397 WARNING [train.py:1204] (2/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_147-293311-0_sp1.1 from training. Duration: 0.8545625 2023-10-03 23:26:52,993 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.3.encoder.layers.2.feed_forward3.out_whiten, num_groups=1, num_channels=512, metric=6.75 vs. limit=15.0 2023-10-03 23:26:53,632 WARNING [train.py:1204] (2/4) Exclude cut with ID _12302_210_5219_1_1529924446466_8107280_835-279754-0_sp1.1 from training. Duration: 0.8181875 2023-10-03 23:26:56,958 INFO [train.py:1046] (2/4) Epoch 41, batch 4150, loss[loss=0.1484, simple_loss=0.2103, pruned_loss=0.0433, over 22880.00 frames. ], tot_loss[loss=0.1565, simple_loss=0.2365, pruned_loss=0.03823, over 4700220.67 frames. ], batch size: 322, lr: 2.47e-03, grad_scale: 8.0 2023-10-03 23:26:56,999 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8453_210_4211_1_1528502940_8562730_347-330673-0_sp0.9 from training. Duration: 0.8 2023-10-03 23:26:58,403 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_213-103282-0_sp0.9 from training. Duration: 0.8666875 2023-10-03 23:26:59,781 WARNING [train.py:1204] (2/4) Exclude cut with ID _49728_210_12651_1_1534041034294_6788150_66-43496-0_sp1.1 from training. Duration: 0.97275 2023-10-03 23:26:59,784 WARNING [train.py:1204] (2/4) Exclude cut with ID _19023_210_4353_1_1531378892552_5820780_28-36615-0_sp1.1 from training. Duration: 0.8818125 2023-10-03 23:27:01,872 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.nonlin_attention.balancer.prob, batch_count=1444240.0, ans=0.125 2023-10-03 23:27:03,010 WARNING [train.py:1204] (2/4) Exclude cut with ID _14349_210_8341_1_1530615710818_3442470_115-285975-0 from training. Duration: 0.86 2023-10-03 23:27:03,032 WARNING [train.py:1204] (2/4) Exclude cut with ID _12734_210_8341_1_1530183614296_3891929_236-47464-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 23:27:03,084 WARNING [train.py:1204] (2/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_177-236809-0 from training. Duration: 0.49 2023-10-03 23:27:04,462 WARNING [train.py:1204] (2/4) Exclude cut with ID _13678_210_7497_1_1530530069226_3463170_371-3221-0 from training. Duration: 0.9 2023-10-03 23:27:04,478 WARNING [train.py:1204] (2/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_48-338976-0 from training. Duration: 0.89 2023-10-03 23:27:06,961 WARNING [train.py:1204] (2/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_1167-309744-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 23:27:11,725 WARNING [train.py:1204] (2/4) Exclude cut with ID _30327_210_16049_1_1532484161295_3924169_401-70819-0_sp0.9 from training. Duration: 0.9555625 2023-10-03 23:27:11,733 WARNING [train.py:1204] (2/4) Exclude cut with ID _55134_210_21982_1_1534590028377_3882170_346-91022-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 23:27:15,135 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.4.encoder.layers.0.self_attn1.whiten, num_groups=1, num_channels=384, metric=14.91 vs. limit=22.5 2023-10-03 23:27:15,921 WARNING [train.py:1204] (2/4) Exclude cut with ID _14459_210_2228_1_1531443641946_7516700_174-118801-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 23:27:15,985 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_269-278006-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 23:27:17,336 WARNING [train.py:1204] (2/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_390-339137-0_sp0.9 from training. Duration: 0.62225 2023-10-03 23:27:19,975 WARNING [train.py:1204] (2/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_253-308377-0_sp1.1 from training. Duration: 0.4454375 2023-10-03 23:27:19,994 WARNING [train.py:1204] (2/4) Exclude cut with ID _23825_210_13449_1_1531915313526_3497150_240-271367-0_sp1.1 from training. Duration: 0.8818125 2023-10-03 23:27:20,072 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1015-343884-0_sp0.9 from training. Duration: 0.688875 2023-10-03 23:27:23,968 WARNING [train.py:1204] (2/4) Exclude cut with ID _23514_210_1389_1_1531832139785_3031439_335-48210-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 23:27:26,635 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_309-65177-0_sp1.1 from training. Duration: 0.8 2023-10-03 23:27:26,942 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.ff3_skip_rate, batch_count=1444373.3333333333, ans=0.0 2023-10-03 23:27:28,056 WARNING [train.py:1204] (2/4) Exclude cut with ID _14368_210_4220_1_1530532714870_3595529_231-137892-0 from training. Duration: 0.99 2023-10-03 23:27:30,771 WARNING [train.py:1204] (2/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_57-270037-0 from training. Duration: 0.77 2023-10-03 23:27:30,776 WARNING [train.py:1204] (2/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_465-214726-0_sp1.1 from training. Duration: 0.7818125 2023-10-03 23:27:32,089 WARNING [train.py:1204] (2/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_106-300985-0 from training. Duration: 0.9 2023-10-03 23:27:32,094 WARNING [train.py:1204] (2/4) Exclude cut with ID _14632_210_9988_1_1530691279392_3923200_161-129530-0_sp1.1 from training. Duration: 0.87275 2023-10-03 23:27:32,105 WARNING [train.py:1204] (2/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_514-88370-0_sp1.1 from training. Duration: 0.963625 2023-10-03 23:27:34,248 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.ff3_skip_rate, batch_count=1444373.3333333333, ans=0.0 2023-10-03 23:27:35,272 WARNING [train.py:1204] (2/4) Exclude cut with ID _56495_210_4234_1_1534748080334_4136219_305-334505-0_sp1.1 from training. Duration: 0.92725 2023-10-03 23:27:36,751 WARNING [train.py:1204] (2/4) Exclude cut with ID _42875_210_14856_1_1533704397487_4012433_61-264914-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 23:27:40,134 WARNING [train.py:1204] (2/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_91-148831-0 from training. Duration: 0.99 2023-10-03 23:27:42,995 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_435-313305-0_sp0.9 from training. Duration: 0.788875 2023-10-03 23:27:45,498 WARNING [train.py:1204] (2/4) Exclude cut with ID _16744_210_5986_1_1531724405384_3907567_242-111316-0_sp1.1 from training. Duration: 0.8454375 2023-10-03 23:27:45,572 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_257-20733-0 from training. Duration: 0.76 2023-10-03 23:27:45,617 WARNING [train.py:1204] (2/4) Exclude cut with ID _8064_210_5399_1_1528372299291_4282973_4-91037-0_sp1.1 from training. Duration: 0.8 2023-10-03 23:27:46,980 WARNING [train.py:1204] (2/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_3-276701-0 from training. Duration: 0.91 2023-10-03 23:27:48,390 WARNING [train.py:1204] (2/4) Exclude cut with ID _3992_210_1747_1_1526644816031_3731060_36-147855-0_sp1.1 from training. Duration: 0.7090625 2023-10-03 23:27:49,673 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.663e+02 1.998e+02 2.194e+02 2.506e+02 4.254e+02, threshold=4.388e+02, percent-clipped=0.0 2023-10-03 23:27:49,753 WARNING [train.py:1204] (2/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_1296-298671-0_sp1.1 from training. Duration: 0.963625 2023-10-03 23:27:51,103 WARNING [train.py:1204] (2/4) Exclude cut with ID _68965_210_1883_1_1535698962272_3497490_309-334593-0_sp1.1 from training. Duration: 0.92725 2023-10-03 23:27:51,208 WARNING [train.py:1204] (2/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_128-296581-0 from training. Duration: 0.74 2023-10-03 23:27:51,209 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_17613_210_2151_1_1531137569_2366160_170-299079-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 23:27:51,211 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6287_210_3273_1_1527509506_2653716_182-199887-0_sp1.1 from training. Duration: 0.463625 2023-10-03 23:27:55,123 WARNING [train.py:1204] (2/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_80-135200-0_sp1.1 from training. Duration: 0.7181875 2023-10-03 23:27:56,624 WARNING [train.py:1204] (2/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_34-183685-0 from training. Duration: 0.98 2023-10-03 23:27:57,234 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.3.encoder.layers.3.feed_forward2.out_whiten, num_groups=1, num_channels=512, metric=5.53 vs. limit=15.0 2023-10-03 23:27:57,953 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8278_210_2550_1_1528637099_4220413_418-210155-0_sp1.1 from training. Duration: 0.92725 2023-10-03 23:27:57,957 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8853_210_3273_1_1528633899_818968_51-187685-0_sp0.9 from training. Duration: 0.7666875 2023-10-03 23:27:57,980 WARNING [train.py:1204] (2/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_332-221057-0_sp1.1 from training. Duration: 0.6181875 2023-10-03 23:27:58,063 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_2221_210_1988_1_1525597051_4935541_415-296882-0 from training. Duration: 0.77 2023-10-03 23:27:59,322 WARNING [train.py:1204] (2/4) Exclude cut with ID _33640_210_2655_1_1533117605479_4855469_770-278185-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 23:27:59,359 WARNING [train.py:1204] (2/4) Exclude cut with ID _5287_210_2084_1_1527418883040_3878670_333-48508-0_sp1.1 from training. Duration: 0.5454375 2023-10-03 23:28:00,677 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_43802_210_2290_1_1533516203_8829504_142-276839-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 23:28:02,162 WARNING [train.py:1204] (2/4) Exclude cut with ID _28394_210_8913_1_1532430049518_3650118_126-139371-0_sp1.1 from training. Duration: 0.92725 2023-10-03 23:28:02,183 WARNING [train.py:1204] (2/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_76-203867-0 from training. Duration: 0.72 2023-10-03 23:28:02,344 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.nonlin_attention.balancer.prob, batch_count=1444506.6666666667, ans=0.125 2023-10-03 23:28:03,866 WARNING [train.py:1204] (2/4) Exclude cut with ID _6194_210_1534_1_1527511826160_3941194_14-282876-0_sp0.9 from training. Duration: 0.788875 2023-10-03 23:28:08,273 WARNING [train.py:1204] (2/4) Exclude cut with ID _35355_210_16040_1_1533212988977_4016229_74-190076-0_sp1.1 from training. Duration: 0.72725 2023-10-03 23:28:11,380 INFO [train.py:1046] (2/4) Epoch 41, batch 4200, loss[loss=0.1565, simple_loss=0.2269, pruned_loss=0.04303, over 23844.00 frames. ], tot_loss[loss=0.1559, simple_loss=0.2353, pruned_loss=0.03819, over 4699411.66 frames. ], batch size: 195, lr: 2.47e-03, grad_scale: 8.0 2023-10-03 23:28:11,462 WARNING [train.py:1204] (2/4) Exclude cut with ID _33920_210_4220_1_1533257851500_3084399_198-334063-0 from training. Duration: 0.94 2023-10-03 23:28:12,883 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6291_210_3273_1_1527683649_1078498_4-266348-0_sp0.9 from training. Duration: 0.8666875 2023-10-03 23:28:14,409 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_23856_210_4238_1_1531994785_3087934_122-38075-0_sp1.1 from training. Duration: 0.936375 2023-10-03 23:28:15,758 WARNING [train.py:1204] (2/4) Exclude cut with ID _4697_210_2069_1_1526815801953_3823098_276-96593-0_sp0.9 from training. Duration: 0.8555625 2023-10-03 23:28:17,135 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_308-278734-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 23:28:17,136 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_5190_210_4557_1_1527245078_2965646_353-250603-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 23:28:21,157 WARNING [train.py:1204] (2/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_472-216663-0 from training. Duration: 0.92 2023-10-03 23:28:24,028 WARNING [train.py:1204] (2/4) Exclude cut with ID _7537_210_2084_1_1528502395431_3924359_14-312906-0 from training. Duration: 0.84 2023-10-03 23:28:24,077 WARNING [train.py:1204] (2/4) Exclude cut with ID _29003_210_19596_1_1532507391720_3894098_559-236660-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 23:28:26,672 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.2.encoder.layers.2.nonlin_attention.whiten2, num_groups=1, num_channels=384, metric=4.78 vs. limit=15.0 2023-10-03 23:28:27,829 WARNING [train.py:1204] (2/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_312-204798-0_sp1.1 from training. Duration: 0.8454375 2023-10-03 23:28:29,283 WARNING [train.py:1204] (2/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_17-309070-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 23:28:33,247 WARNING [train.py:1204] (2/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_344-152526-0_sp0.9 from training. Duration: 0.588875 2023-10-03 23:28:33,373 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_41452_210_7291_1_1533437687_3941281_405-11559-0_sp1.1 from training. Duration: 0.82725 2023-10-03 23:28:33,398 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_48730_210_5399_1_1533885886_2212769_157-124052-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 23:28:35,311 WARNING [train.py:1204] (2/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_415-78792-0 from training. Duration: 0.81 2023-10-03 23:28:35,315 WARNING [train.py:1204] (2/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_345-340690-0_sp1.1 from training. Duration: 0.8454375 2023-10-03 23:28:36,741 WARNING [train.py:1204] (2/4) Exclude cut with ID _23825_210_13449_1_1531915313526_3497150_124-8138-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 23:28:36,786 WARNING [train.py:1204] (2/4) Exclude cut with ID _49258_210_7994_1_1534122017863_3738861_194-24864-0_sp1.1 from training. Duration: 0.936375 2023-10-03 23:28:36,811 WARNING [train.py:1204] (2/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_369-222060-0_sp1.1 from training. Duration: 0.6454375 2023-10-03 23:28:38,177 WARNING [train.py:1204] (2/4) Exclude cut with ID _12369_210_4381_1_1530356078581_4388520_480-13146-0_sp0.9 from training. Duration: 0.9444375 2023-10-03 23:28:40,882 WARNING [train.py:1204] (2/4) Exclude cut with ID _13253_210_1925_1_1530757775819_3577873_191-144977-0 from training. Duration: 0.78 2023-10-03 23:28:40,913 WARNING [train.py:1204] (2/4) Exclude cut with ID _30304_210_6319_1_1532476827261_7582060_1128-140266-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 23:28:42,980 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.feed_forward1.hidden_balancer.prob, batch_count=1444706.6666666667, ans=0.125 2023-10-03 23:28:45,567 WARNING [train.py:1204] (2/4) Exclude cut with ID _8772_210_1850_1_1528635595972_3664011_247-336448-0_sp0.9 from training. Duration: 0.82225 2023-10-03 23:28:46,926 WARNING [train.py:1204] (2/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_318-59097-0_sp1.1 from training. Duration: 0.6909375 2023-10-03 23:28:48,425 WARNING [train.py:1204] (2/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_540-131164-0_sp1.1 from training. Duration: 0.836375 2023-10-03 23:28:48,534 WARNING [train.py:1204] (2/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_39-110836-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 23:28:49,924 WARNING [train.py:1204] (2/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_860-96198-0_sp1.1 from training. Duration: 0.663625 2023-10-03 23:28:49,932 WARNING [train.py:1204] (2/4) Exclude cut with ID _15133_210_6228_1_1530794512074_3080379_29-264458-0 from training. Duration: 0.58 2023-10-03 23:28:49,951 WARNING [train.py:1204] (2/4) Exclude cut with ID _55209_210_1531_1_1534675828996_4597160_797-348343-0_sp1.1 from training. Duration: 0.963625 2023-10-03 23:28:51,412 WARNING [train.py:1204] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_23-189234-0_sp1.1 from training. Duration: 0.7545625 2023-10-03 23:28:56,100 WARNING [train.py:1204] (2/4) Exclude cut with ID _5410_210_1388_1_1527397055795_1719020_39-112221-0_sp1.1 from training. Duration: 0.62725 2023-10-03 23:28:58,818 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_43802_210_2290_1_1533516203_8829504_100-348441-0_sp1.1 from training. Duration: 0.82725 2023-10-03 23:28:59,109 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.bypass.scale_min, batch_count=1444773.3333333333, ans=0.2 2023-10-03 23:29:03,630 WARNING [train.py:1204] (2/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_143-86344-0_sp1.1 from training. Duration: 0.7 2023-10-03 23:29:06,418 WARNING [train.py:1204] (2/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_126-29104-0 from training. Duration: 0.85 2023-10-03 23:29:09,200 WARNING [train.py:1204] (2/4) Exclude cut with ID _39846_210_12651_1_1533603651992_1318030_138-31771-0_sp1.1 from training. Duration: 0.963625 2023-10-03 23:29:15,260 WARNING [train.py:1204] (2/4) Exclude cut with ID _2220_210_1819_1_1525509109898_3658680_50-99837-0_sp1.1 from training. Duration: 0.4909375 2023-10-03 23:29:15,338 WARNING [train.py:1204] (2/4) Exclude cut with ID _6274_210_2084_1_1527937583837_7494020_836-210550-0_sp1.1 from training. Duration: 0.97275 2023-10-03 23:29:16,785 WARNING [train.py:1204] (2/4) Exclude cut with ID _28323_210_16187_1_1532480395667_4100300_306-74561-0 from training. Duration: 0.94 2023-10-03 23:29:23,495 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_177-58005-0_sp1.1 from training. Duration: 0.563625 2023-10-03 23:29:24,832 INFO [train.py:1046] (2/4) Epoch 41, batch 4250, loss[loss=0.1451, simple_loss=0.2313, pruned_loss=0.02938, over 24656.00 frames. ], tot_loss[loss=0.1552, simple_loss=0.2347, pruned_loss=0.03783, over 4707065.01 frames. ], batch size: 65, lr: 2.47e-03, grad_scale: 8.0 2023-10-03 23:29:28,198 WARNING [train.py:1204] (2/4) Exclude cut with ID _54871_210_22356_1_1534665798253_3856359_19-342632-0_sp1.1 from training. Duration: 0.82725 2023-10-03 23:29:28,212 WARNING [train.py:1204] (2/4) Exclude cut with ID _35355_210_16040_1_1533212988977_4016229_74-190076-0_sp0.9 from training. Duration: 0.888875 2023-10-03 23:29:29,701 WARNING [train.py:1204] (2/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_285-97786-0_sp1.1 from training. Duration: 0.97275 2023-10-03 23:29:35,509 WARNING [train.py:1204] (2/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_337-181502-0_sp0.9 from training. Duration: 0.72225 2023-10-03 23:29:35,767 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.conv_skip_rate, batch_count=1444906.6666666667, ans=0.0 2023-10-03 23:29:36,787 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_353-182855-0 from training. Duration: 0.96 2023-10-03 23:29:36,823 WARNING [train.py:1204] (2/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_409-327730-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 23:29:39,609 WARNING [train.py:1204] (2/4) Exclude cut with ID _8030_210_7497_1_1528444886781_3531089_373-19403-0_sp1.1 from training. Duration: 0.97275 2023-10-03 23:29:41,270 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.conv_module1.balancer1.prob, batch_count=1444973.3333333333, ans=0.125 2023-10-03 23:29:42,436 WARNING [train.py:1204] (2/4) Exclude cut with ID _6789_210_1534_1_1527857937218_3118950_231-340237-0_sp1.1 from training. Duration: 0.936375 2023-10-03 23:29:42,710 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.feed_forward1.out_proj.dropout_p, batch_count=1444973.3333333333, ans=0.1 2023-10-03 23:29:46,998 WARNING [train.py:1204] (2/4) Exclude cut with ID _6493_210_1747_1_1528459210424_4012470_536-57832-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 23:29:47,018 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7046_210_6753_1_1527895337_6920917_756-116002-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 23:29:47,718 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.2.encoder.layers.2.whiten, num_groups=1, num_channels=384, metric=3.52 vs. limit=12.0 2023-10-03 23:29:49,673 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_37152_210_9697_1_1533106573_3844188_29-239070-0_sp1.1 from training. Duration: 0.8909375 2023-10-03 23:29:49,675 WARNING [train.py:1204] (2/4) Exclude cut with ID _19023_210_4353_1_1531378892552_5820780_61-334539-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 23:29:51,101 WARNING [train.py:1204] (2/4) Exclude cut with ID _14170_210_5219_1_1530525949868_4469650_159-28488-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 23:29:51,181 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_40896_210_15710_1_1533433956_7100802_764-71272-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 23:29:53,698 WARNING [train.py:1204] (2/4) Exclude cut with ID _44902_210_16187_1_1533888006062_3792078_258-178274-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 23:29:53,852 WARNING [train.py:1204] (2/4) Exclude cut with ID _7537_210_2084_1_1528502395431_3924359_14-312906-0_sp1.1 from training. Duration: 0.763625 2023-10-03 23:29:56,503 WARNING [train.py:1204] (2/4) Exclude cut with ID _49643_210_7989_1_1534640244224_3939031_674-116639-0_sp1.1 from training. Duration: 0.963625 2023-10-03 23:29:56,597 WARNING [train.py:1204] (2/4) Exclude cut with ID _22826_210_9994_1_1531908201522_3279169_415-161-0 from training. Duration: 0.98 2023-10-03 23:29:59,855 WARNING [train.py:1204] (2/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_264-348896-0 from training. Duration: 0.85 2023-10-03 23:29:59,863 WARNING [train.py:1204] (2/4) Exclude cut with ID _51899_210_20034_1_1534129254466_3669220_233-161492-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 23:29:59,925 WARNING [train.py:1204] (2/4) Exclude cut with ID _31255_210_13427_1_1532865619495_3815143_233-200023-0_sp1.1 from training. Duration: 0.936375 2023-10-03 23:30:01,237 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_23850_210_4238_1_1532232597_1632433_100-229879-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 23:30:01,320 WARNING [train.py:1204] (2/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_375-245677-0_sp0.9 from training. Duration: 0.9 2023-10-03 23:30:01,323 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_40223_210_6228_1_1533298404_4812267_355-317415-0_sp1.1 from training. Duration: 0.97275 2023-10-03 23:30:01,365 WARNING [train.py:1204] (2/4) Exclude cut with ID _9679_210_7497_1_1529129212906_4363929_235-81488-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 23:30:04,708 WARNING [train.py:1204] (2/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_172-215932-0_sp1.1 from training. Duration: 0.6 2023-10-03 23:30:04,768 WARNING [train.py:1204] (2/4) Exclude cut with ID _12369_210_4381_1_1530356078581_4388520_64-298917-0_sp1.1 from training. Duration: 0.8 2023-10-03 23:30:06,237 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.0.conv_module1.balancer1.prob, batch_count=1445040.0, ans=0.125 2023-10-03 23:30:08,718 WARNING [train.py:1204] (2/4) Exclude cut with ID _38020_210_5986_1_1533112252259_7006089_131-53533-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 23:30:08,840 WARNING [train.py:1204] (2/4) Exclude cut with ID _42334_210_7770_1_1534206610396_3605880_392-48154-0_sp1.1 from training. Duration: 0.963625 2023-10-03 23:30:10,639 WARNING [train.py:1204] (2/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_337-181502-0 from training. Duration: 0.65 2023-10-03 23:30:10,646 WARNING [train.py:1204] (2/4) Exclude cut with ID _6267_210_2084_1_1527764658526_3894730_375-219887-0_sp0.9 from training. Duration: 0.7444375 2023-10-03 23:30:12,044 WARNING [train.py:1204] (2/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_60-146189-0 from training. Duration: 0.98 2023-10-03 23:30:13,283 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_3133_210_2087_1_1526106446_2324690_243-143119-0_sp1.1 from training. Duration: 0.7 2023-10-03 23:30:14,677 WARNING [train.py:1204] (2/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_1267-272693-0_sp0.9 from training. Duration: 0.811125 2023-10-03 23:30:14,804 WARNING [train.py:1204] (2/4) Exclude cut with ID _69884_210_9771_1_1535846392818_4166169_83-182645-0_sp1.1 from training. Duration: 0.97275 2023-10-03 23:30:16,161 WARNING [train.py:1204] (2/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_403-292260-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 23:30:16,464 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.nonlin_attention.balancer.prob, batch_count=1445106.6666666667, ans=0.125 2023-10-03 23:30:17,407 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.572e+02 1.864e+02 1.993e+02 2.259e+02 3.017e+02, threshold=3.986e+02, percent-clipped=0.0 2023-10-03 23:30:17,589 WARNING [train.py:1204] (2/4) Exclude cut with ID _55429_210_10626_1_1534467588527_7174143_377-169391-0 from training. Duration: 0.96 2023-10-03 23:30:19,158 WARNING [train.py:1204] (2/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_494-199538-0_sp1.1 from training. Duration: 0.6818125 2023-10-03 23:30:20,450 WARNING [train.py:1204] (2/4) Exclude cut with ID _3067_210_2667_1_1526212974640_4503168_119-175776-0_sp1.1 from training. Duration: 0.67275 2023-10-03 23:30:23,331 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=1445173.3333333333, ans=0.1 2023-10-03 23:30:25,685 WARNING [train.py:1204] (2/4) Exclude cut with ID _3231_210_2106_1_1526205864934_6784220_183-303839-0_sp1.1 from training. Duration: 0.97275 2023-10-03 23:30:27,042 WARNING [train.py:1204] (2/4) Exclude cut with ID _18513_210_9206_1_1531618215223_3862620_165-38958-0_sp1.1 from training. Duration: 0.963625 2023-10-03 23:30:28,395 WARNING [train.py:1204] (2/4) Exclude cut with ID _41314_210_12651_1_1533459476543_3766460_87-272068-0_sp1.1 from training. Duration: 0.7545625 2023-10-03 23:30:28,489 WARNING [train.py:1204] (2/4) Exclude cut with ID _3541_210_2477_1_1526475408709_3263973_353-95-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 23:30:29,899 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_2009_210_2071_1_1525481134_4588937_73-302813-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 23:30:31,446 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.balancer2.prob, batch_count=1445173.3333333333, ans=0.125 2023-10-03 23:30:32,559 WARNING [train.py:1204] (2/4) Exclude cut with ID _64220_210_9527_1_1535250151772_4875259_88-95011-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 23:30:32,635 WARNING [train.py:1204] (2/4) Exclude cut with ID _44902_210_16187_1_1533888006062_3792078_160-12107-0_sp1.1 from training. Duration: 0.92725 2023-10-03 23:30:32,641 WARNING [train.py:1204] (2/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_547-5424-0 from training. Duration: 0.94 2023-10-03 23:30:34,578 WARNING [train.py:1204] (2/4) Exclude cut with ID _21783_210_11357_1_1531742458730_3762679_268-85656-0_sp1.1 from training. Duration: 0.936375 2023-10-03 23:30:36,672 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.2.encoder.layers.0.self_attn1.whiten, num_groups=1, num_channels=384, metric=20.85 vs. limit=22.5 2023-10-03 23:30:39,197 INFO [train.py:1046] (2/4) Epoch 41, batch 4300, loss[loss=0.1439, simple_loss=0.2233, pruned_loss=0.03222, over 24445.00 frames. ], tot_loss[loss=0.1552, simple_loss=0.2345, pruned_loss=0.03799, over 4700409.79 frames. ], batch size: 58, lr: 2.47e-03, grad_scale: 8.0 2023-10-03 23:30:39,320 WARNING [train.py:1204] (2/4) Exclude cut with ID _66210_210_12651_1_1535446812922_3365850_42-284985-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 23:30:40,619 WARNING [train.py:1204] (2/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_465-214726-0_sp0.9 from training. Duration: 0.9555625 2023-10-03 23:30:42,342 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.conv_module2.balancer1.prob, batch_count=1445240.0, ans=0.125 2023-10-03 23:30:43,502 WARNING [train.py:1204] (2/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_50-95770-0_sp1.1 from training. Duration: 0.936375 2023-10-03 23:30:50,974 WARNING [train.py:1204] (2/4) Exclude cut with ID _30800_210_22382_1_1532499347904_4515959_269-77567-0_sp1.1 from training. Duration: 0.963625 2023-10-03 23:30:50,977 WARNING [train.py:1204] (2/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_7-111954-0 from training. Duration: 0.56 2023-10-03 23:30:51,191 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.conv_module2.balancer1.prob, batch_count=1445240.0, ans=0.125 2023-10-03 23:30:52,195 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_43802_210_2290_1_1533516203_8829504_85-195240-0_sp0.9 from training. Duration: 0.9666875 2023-10-03 23:30:53,637 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_2822_210_2715_1_1526112586_1999587_165-36656-0_sp0.9 from training. Duration: 0.988875 2023-10-03 23:30:55,414 WARNING [train.py:1204] (2/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_181-78632-0_sp1.1 from training. Duration: 0.7090625 2023-10-03 23:30:55,426 WARNING [train.py:1204] (2/4) Exclude cut with ID IC0671W0044-331887 from training. Duration: 0.896 2023-10-03 23:30:58,372 WARNING [train.py:1204] (2/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_266-137104-0_sp1.1 from training. Duration: 0.5909375 2023-10-03 23:30:59,863 WARNING [train.py:1204] (2/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_137-116442-0_sp0.9 from training. Duration: 0.8666875 2023-10-03 23:31:01,394 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_23801_210_16223_1_1532066392_3294657_570-5439-0 from training. Duration: 0.97 2023-10-03 23:31:01,407 WARNING [train.py:1204] (2/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_29-280622-0_sp1.1 from training. Duration: 0.7454375 2023-10-03 23:31:01,563 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.conv_skip_rate, batch_count=1445306.6666666667, ans=0.0 2023-10-03 23:31:02,593 WARNING [train.py:1204] (2/4) Exclude cut with ID 3557-8342-0013-54691-0 from training. Duration: 0.92 2023-10-03 23:31:05,711 WARNING [train.py:1204] (2/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_711-15688-0_sp1.1 from training. Duration: 0.3909375 2023-10-03 23:31:06,975 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_15126_210_6753_1_1530778677_2591185_120-277707-0_sp1.1 from training. Duration: 0.72725 2023-10-03 23:31:08,468 WARNING [train.py:1204] (2/4) Exclude cut with ID _14970_210_15709_1_1530775547361_1966965_8-169924-0_sp1.1 from training. Duration: 0.836375 2023-10-03 23:31:08,470 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_4692_210_3528_1_1526899755_4323254_107-44401-0_sp0.9 from training. Duration: 0.9555625 2023-10-03 23:31:10,481 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_3475_210_2028_1_1526193578_3622569_265-276625-0_sp0.9 from training. Duration: 0.8444375 2023-10-03 23:31:13,082 WARNING [train.py:1204] (2/4) Exclude cut with ID _55153_210_2253_1_1534384786677_3162096_148-79022-0_sp1.1 from training. Duration: 0.87275 2023-10-03 23:31:13,172 WARNING [train.py:1204] (2/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_292-36336-0_sp1.1 from training. Duration: 0.92725 2023-10-03 23:31:13,181 WARNING [train.py:1204] (2/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_329-82235-0 from training. Duration: 0.82 2023-10-03 23:31:14,523 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_11305_210_1794_1_1529750428_7178148_257-312870-0 from training. Duration: 0.88 2023-10-03 23:31:15,878 WARNING [train.py:1204] (2/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_436-4493-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 23:31:18,685 WARNING [train.py:1204] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_321-54129-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 23:31:18,694 WARNING [train.py:1204] (2/4) Exclude cut with ID _5287_210_2084_1_1527418883040_3878670_333-48508-0_sp0.9 from training. Duration: 0.6666875 2023-10-03 23:31:18,714 WARNING [train.py:1204] (2/4) Exclude cut with ID _26528_210_9070_1_1532570410842_3851310_244-278158-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 23:31:20,001 WARNING [train.py:1204] (2/4) Exclude cut with ID _46952_210_5986_1_1534230010357_3107379_232-344503-0_sp1.1 from training. Duration: 0.87275 2023-10-03 23:31:20,018 WARNING [train.py:1204] (2/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_43-206757-0 from training. Duration: 0.75 2023-10-03 23:31:20,019 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_4183_210_2087_1_1526729100_1869838_141-166600-0 from training. Duration: 0.95 2023-10-03 23:31:21,213 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_296-287678-0 from training. Duration: 0.95 2023-10-03 23:31:23,023 WARNING [train.py:1204] (2/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_265-60343-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 23:31:23,050 WARNING [train.py:1204] (2/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_332-256168-0 from training. Duration: 0.75 2023-10-03 23:31:23,099 WARNING [train.py:1204] (2/4) Exclude cut with ID _13679_210_7497_1_1530534293662_3355098_411-186088-0 from training. Duration: 0.94 2023-10-03 23:31:27,106 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.2.encoder.layers.2.self_attn2.whiten, num_groups=1, num_channels=384, metric=15.06 vs. limit=22.5 2023-10-03 23:31:27,785 WARNING [train.py:1204] (2/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_458-126779-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 23:31:29,173 WARNING [train.py:1204] (2/4) Exclude cut with ID IC0803W0380-397572 from training. Duration: 0.032 2023-10-03 23:31:29,231 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_278-272612-0_sp1.1 from training. Duration: 0.663625 2023-10-03 23:31:30,642 WARNING [train.py:1204] (2/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_491-263101-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 23:31:31,917 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_53555_210_5632_1_1534595220_7232297_334-336778-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 23:31:35,088 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_50-3289-0 from training. Duration: 0.91 2023-10-03 23:31:35,162 WARNING [train.py:1204] (2/4) Exclude cut with ID _8699_210_2069_1_1528630346831_3960409_155-39766-0_sp0.9 from training. Duration: 0.8666875 2023-10-03 23:31:35,166 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_502-82145-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 23:31:35,216 WARNING [train.py:1204] (2/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_550-43132-0_sp1.1 from training. Duration: 0.97275 2023-10-03 23:31:36,472 WARNING [train.py:1204] (2/4) Exclude cut with ID _14998_210_5219_1_1530705582080_7474970_446-112117-0_sp1.1 from training. Duration: 0.8181875 2023-10-03 23:31:36,546 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_3691_210_3528_1_1526468169_4062053_355-334432-0_sp1.1 from training. Duration: 0.7909375 2023-10-03 23:31:39,895 WARNING [train.py:1204] (2/4) Exclude cut with ID _58605_210_12210_1_1534723195939_3904259_239-94720-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 23:31:42,687 WARNING [train.py:1204] (2/4) Exclude cut with ID _49376_210_5986_1_1533985278732_3370740_391-344072-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 23:31:42,910 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.balancer1.prob, batch_count=1445506.6666666667, ans=0.125 2023-10-03 23:31:43,917 WARNING [train.py:1204] (2/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_558-322014-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 23:31:43,949 WARNING [train.py:1204] (2/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_256-32662-0_sp1.1 from training. Duration: 0.8181875 2023-10-03 23:31:48,312 WARNING [train.py:1204] (2/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_572-185150-0 from training. Duration: 0.97 2023-10-03 23:31:49,569 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_89-35748-0_sp0.9 from training. Duration: 0.87775 2023-10-03 23:31:51,871 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.conv_skip_rate, batch_count=1445573.3333333333, ans=0.0 2023-10-03 23:31:52,851 INFO [train.py:1046] (2/4) Epoch 41, batch 4350, loss[loss=0.1478, simple_loss=0.2345, pruned_loss=0.03059, over 24473.00 frames. ], tot_loss[loss=0.1557, simple_loss=0.2354, pruned_loss=0.03797, over 4704847.33 frames. ], batch size: 63, lr: 2.47e-03, grad_scale: 8.0 2023-10-03 23:31:54,232 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6291_210_3273_1_1527683649_1078498_88-66747-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 23:31:56,804 WARNING [train.py:1204] (2/4) Exclude cut with ID _19504_210_9994_1_1531648863242_3325220_357-237029-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 23:31:58,297 WARNING [train.py:1204] (2/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_302-2550-0_sp1.1 from training. Duration: 0.736375 2023-10-03 23:31:58,297 WARNING [train.py:1204] (2/4) Exclude cut with ID _9461_210_4557_1_1529060514726_3677451_59-194017-0_sp1.1 from training. Duration: 0.963625 2023-10-03 23:32:00,507 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.conv_module1.balancer1.max_abs, batch_count=1445573.3333333333, ans=10.0 2023-10-03 23:32:05,606 WARNING [train.py:1204] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_665-196155-0_sp1.1 from training. Duration: 0.6090625 2023-10-03 23:32:09,548 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.3.encoder.layers.2.conv_module1.whiten, num_groups=1, num_channels=512, metric=5.99 vs. limit=15.0 2023-10-03 23:32:10,176 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_326-315007-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 23:32:13,435 WARNING [train.py:1204] (2/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_105-328174-0_sp1.1 from training. Duration: 0.6909375 2023-10-03 23:32:13,452 WARNING [train.py:1204] (2/4) Exclude cut with ID _56494_210_4220_1_1535284837625_3033169_106-263722-0_sp1.1 from training. Duration: 0.97275 2023-10-03 23:32:14,933 WARNING [train.py:1204] (2/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_770-200430-0_sp0.9 from training. Duration: 0.988875 2023-10-03 23:32:17,633 WARNING [train.py:1204] (2/4) Exclude cut with ID _65640_210_6310_1_1535368201445_3173250_209-13008-0_sp1.1 from training. Duration: 0.9 2023-10-03 23:32:18,988 WARNING [train.py:1204] (2/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_212-149947-0_sp0.9 from training. Duration: 0.811125 2023-10-03 23:32:23,234 WARNING [train.py:1204] (2/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_188-171673-0 from training. Duration: 0.58 2023-10-03 23:32:24,577 WARNING [train.py:1204] (2/4) Exclude cut with ID _58895_210_8750_1_1535184081320_3649089_286-281724-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 23:32:24,639 WARNING [train.py:1204] (2/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_16-254909-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 23:32:29,082 WARNING [train.py:1204] (2/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_52-95655-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 23:32:32,333 WARNING [train.py:1204] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_554-45617-0 from training. Duration: 0.99 2023-10-03 23:32:36,406 WARNING [train.py:1204] (2/4) Exclude cut with ID _14683_210_5219_1_1530698352608_3974859_473-344611-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 23:32:38,417 WARNING [train.py:1204] (2/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_17-25045-0_sp0.9 from training. Duration: 0.6555625 2023-10-03 23:32:42,505 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9072W0003-530637 from training. Duration: 0.768 2023-10-03 23:32:43,390 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.1.encoder.layers.0.feed_forward2.out_whiten, num_groups=1, num_channels=256, metric=4.37 vs. limit=15.0 2023-10-03 23:32:43,876 WARNING [train.py:1204] (2/4) Exclude cut with ID _38670_210_15379_1_1533448810659_4391059_76-74025-0_sp1.1 from training. Duration: 0.936375 2023-10-03 23:32:43,917 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6291_210_3273_1_1527683649_1078498_74-292608-0_sp0.9 from training. Duration: 0.8 2023-10-03 23:32:45,479 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.734e+02 1.921e+02 2.106e+02 2.345e+02 3.480e+02, threshold=4.212e+02, percent-clipped=0.0 2023-10-03 23:32:45,620 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9084W0025-536862 from training. Duration: 0.896 2023-10-03 23:32:46,921 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9087W0021-538392 from training. Duration: 0.768 2023-10-03 23:32:46,933 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7543_210_6753_1_1528543323_645515_56-163234-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 23:32:46,963 WARNING [train.py:1204] (2/4) Exclude cut with ID _45560_210_21982_1_1533812603951_3584999_44-332643-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 23:32:48,274 WARNING [train.py:1204] (2/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_1285-256622-0_sp1.1 from training. Duration: 0.763625 2023-10-03 23:32:49,670 WARNING [train.py:1204] (2/4) Exclude cut with ID _52781_210_16187_1_1534297729099_1118149_141-325632-0_sp1.1 from training. Duration: 0.936375 2023-10-03 23:32:49,730 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_40896_210_15710_1_1533433956_7100802_1051-137631-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 23:32:51,107 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_13091_210_7925_1_1530609447_8987354_233-18333-0_sp1.1 from training. Duration: 0.97275 2023-10-03 23:32:53,857 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_34008_210_10431_1_1533345424_8183247_662-317331-0 from training. Duration: 0.94 2023-10-03 23:32:53,869 WARNING [train.py:1204] (2/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_291-248920-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 23:32:53,871 WARNING [train.py:1204] (2/4) Exclude cut with ID _65263_210_22356_1_1535763598691_3921037_274-288481-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 23:32:55,221 WARNING [train.py:1204] (2/4) Exclude cut with ID _5189_210_4557_1_1527248319428_3769229_233-168080-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 23:32:55,279 WARNING [train.py:1204] (2/4) Exclude cut with ID _54871_210_22356_1_1534665798253_3856359_135-184281-0 from training. Duration: 0.99 2023-10-03 23:32:55,637 INFO [scaling.py:1118] (2/4) WithLoss: name=encoder.encoders.5.encoder.layers.0.self_attn_weights, loss-sum=0.000e+00 2023-10-03 23:32:56,630 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9123W0012-555972 from training. Duration: 0.896 2023-10-03 23:32:56,635 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9123W0118-556077 from training. Duration: 0.896 2023-10-03 23:32:56,652 WARNING [train.py:1204] (2/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_406-275649-0 from training. Duration: 0.42 2023-10-03 23:33:00,029 WARNING [train.py:1204] (2/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_309-13684-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 23:33:00,055 WARNING [train.py:1204] (2/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_129-337802-0_sp1.1 from training. Duration: 0.6545625 2023-10-03 23:33:01,302 WARNING [train.py:1204] (2/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_649-266825-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 23:33:01,370 WARNING [train.py:1204] (2/4) Exclude cut with ID _40519_210_13794_1_1533715228655_7337390_895-224742-0_sp1.1 from training. Duration: 0.92725 2023-10-03 23:33:02,731 WARNING [train.py:1204] (2/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_234-14174-0 from training. Duration: 0.82 2023-10-03 23:33:04,105 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9156W0009-572502 from training. Duration: 0.768 2023-10-03 23:33:05,329 INFO [train.py:1046] (2/4) Epoch 41, batch 4400, loss[loss=0.1494, simple_loss=0.2314, pruned_loss=0.03372, over 24488.00 frames. ], tot_loss[loss=0.1565, simple_loss=0.2365, pruned_loss=0.03828, over 4716645.16 frames. ], batch size: 63, lr: 2.47e-03, grad_scale: 16.0 2023-10-03 23:33:05,386 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_424-245529-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 23:33:05,623 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.ff3_skip_rate, batch_count=1445906.6666666667, ans=0.0 2023-10-03 23:33:08,710 WARNING [train.py:1204] (2/4) Exclude cut with ID _26528_210_9070_1_1532570410842_3851310_232-10149-0_sp1.1 from training. Duration: 0.963625 2023-10-03 23:33:08,718 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_17292_210_6753_1_1531125388_2943734_152-30410-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 23:33:13,156 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_35441_210_16554_1_1533347572_4092680_291-82075-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 23:33:15,893 WARNING [train.py:1204] (2/4) Exclude cut with ID _12369_210_4381_1_1530356078581_4388520_64-298917-0 from training. Duration: 0.88 2023-10-03 23:33:15,924 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_5618_210_1874_1_1527311008_5851_1-214501-0_sp0.9 from training. Duration: 0.7655625 2023-10-03 23:33:17,373 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_9460_210_4211_1_1528954587_10049667_572-37035-0 from training. Duration: 0.8 2023-10-03 23:33:17,405 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9193W0023-590322 from training. Duration: 0.896 2023-10-03 23:33:18,825 WARNING [train.py:1204] (2/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_206-10097-0_sp1.1 from training. Duration: 0.4454375 2023-10-03 23:33:18,847 WARNING [train.py:1204] (2/4) Exclude cut with ID _55134_210_21982_1_1534590028377_3882170_17-51731-0_sp1.1 from training. Duration: 0.963625 2023-10-03 23:33:22,257 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_14953_210_2151_1_1530701710_5616165_710-165400-0 from training. Duration: 0.87 2023-10-03 23:33:23,671 WARNING [train.py:1204] (2/4) Exclude cut with ID _53203_210_2087_1_1534246218309_475590_61-85777-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 23:33:23,759 WARNING [train.py:1204] (2/4) Exclude cut with ID _55844_210_2346_1_1534471212569_7625707_1050-10453-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 23:33:23,768 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9218W0013-603297 from training. Duration: 0.896 2023-10-03 23:33:27,797 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_20120_210_6753_1_1531643035_5482859_267-102818-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 23:33:27,798 WARNING [train.py:1204] (2/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_191-41023-0 from training. Duration: 0.71 2023-10-03 23:33:27,835 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9231W0202-610227 from training. Duration: 0.896 2023-10-03 23:33:28,031 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.0.conv_skip_rate, batch_count=1445973.3333333333, ans=0.0 2023-10-03 23:33:32,562 WARNING [train.py:1204] (2/4) Exclude cut with ID _6537_210_1883_1_1527987559710_3597129_382-133062-0 from training. Duration: 0.68 2023-10-03 23:33:32,598 WARNING [train.py:1204] (2/4) Exclude cut with ID _28427_210_4220_1_1532581240967_3061069_43-84928-0 from training. Duration: 0.99 2023-10-03 23:33:32,622 WARNING [train.py:1204] (2/4) Exclude cut with ID _59256_210_5986_1_1535076085997_3589120_121-237386-0 from training. Duration: 0.96 2023-10-03 23:33:33,689 WARNING [train.py:1204] (2/4) Exclude cut with ID _8480_210_1874_1_1528589391205_6728330_137-256986-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 23:33:33,791 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_9417_210_1794_1_1529235261_4614131_479-346963-0_sp1.1 from training. Duration: 0.936375 2023-10-03 23:33:33,815 WARNING [train.py:1204] (2/4) Exclude cut with ID _26474_210_9570_1_1532520022699_3623380_267-7362-0_sp1.1 from training. Duration: 0.936375 2023-10-03 23:33:35,138 WARNING [train.py:1204] (2/4) Exclude cut with ID _55328_210_27767_1_1534580973260_4088170_287-61272-0_sp1.1 from training. Duration: 0.97275 2023-10-03 23:33:36,623 WARNING [train.py:1204] (2/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_589-9070-0 from training. Duration: 0.71 2023-10-03 23:33:36,630 WARNING [train.py:1204] (2/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_148-158509-0_sp1.1 from training. Duration: 0.37275 2023-10-03 23:33:38,012 WARNING [train.py:1204] (2/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_164-184548-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 23:33:39,440 WARNING [train.py:1204] (2/4) Exclude cut with ID _14970_210_15709_1_1530775547361_1966965_6-23328-0_sp1.1 from training. Duration: 0.7818125 2023-10-03 23:33:39,445 WARNING [train.py:1204] (2/4) Exclude cut with ID _29303_210_2575_1_1532392237303_7410649_612-165069-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 23:33:41,375 WARNING [train.py:1204] (2/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_383-321224-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 23:33:41,403 WARNING [train.py:1204] (2/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_283-160525-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 23:33:41,420 WARNING [train.py:1204] (2/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_505-97987-0 from training. Duration: 0.86 2023-10-03 23:33:42,704 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9309W0190-640317 from training. Duration: 0.768 2023-10-03 23:33:46,112 WARNING [train.py:1204] (2/4) Exclude cut with ID _50093_210_4220_1_1534417141207_3432299_280-297390-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 23:33:54,788 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_17292_210_6753_1_1531125388_2943734_320-71385-0_sp1.1 from training. Duration: 0.97275 2023-10-03 23:33:56,203 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_213-103282-0 from training. Duration: 0.78 2023-10-03 23:34:00,958 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7615_210_4211_1_1528110965_4580160_72-235211-0_sp0.9 from training. Duration: 0.8444375 2023-10-03 23:34:02,437 WARNING [train.py:1204] (2/4) Exclude cut with ID _14867_210_1968_1_1530684179890_3846969_173-320977-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 23:34:05,187 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7046_210_6753_1_1527895337_6920917_783-278109-0_sp0.9 from training. Duration: 0.8555625 2023-10-03 23:34:05,229 WARNING [train.py:1204] (2/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_90-30673-0 from training. Duration: 0.84 2023-10-03 23:34:05,243 WARNING [train.py:1204] (2/4) Exclude cut with ID _22826_210_9994_1_1531908201522_3279169_415-161-0_sp1.1 from training. Duration: 0.8909375 2023-10-03 23:34:05,252 WARNING [train.py:1204] (2/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_86-66192-0_sp0.9 from training. Duration: 0.611125 2023-10-03 23:34:05,255 WARNING [train.py:1204] (2/4) Exclude cut with ID _4626_210_3532_1_1526817652717_3916859_446-206656-0_sp1.1 from training. Duration: 0.6909375 2023-10-03 23:34:06,598 WARNING [train.py:1204] (2/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_166-128941-0_sp0.9 from training. Duration: 0.6 2023-10-03 23:34:10,778 WARNING [train.py:1204] (2/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_345-167986-0 from training. Duration: 0.84 2023-10-03 23:34:13,933 WARNING [train.py:1204] (2/4) Exclude cut with ID _6275_210_2084_1_1528110062213_7669049_973-198085-0 from training. Duration: 0.75 2023-10-03 23:34:14,006 WARNING [train.py:1204] (2/4) Exclude cut with ID _60057_210_10640_1_1534903065793_3333150_171-181133-0 from training. Duration: 0.88 2023-10-03 23:34:14,019 WARNING [train.py:1204] (2/4) Exclude cut with ID _19504_210_9994_1_1531648863242_3325220_336-196513-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 23:34:14,026 WARNING [train.py:1204] (2/4) Exclude cut with ID _6493_210_1747_1_1528459210424_4012470_513-140967-0 from training. Duration: 0.79 2023-10-03 23:34:14,102 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6673_210_3273_1_1527767790_3173240_327-306185-0_sp0.9 from training. Duration: 0.92225 2023-10-03 23:34:17,371 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1032-236863-0_sp1.1 from training. Duration: 0.763625 2023-10-03 23:34:17,656 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.ff3_skip_rate, batch_count=1446173.3333333333, ans=0.0 2023-10-03 23:34:18,805 WARNING [train.py:1204] (2/4) Exclude cut with ID _14867_210_1968_1_1530684179890_3846969_413-29292-0 from training. Duration: 0.9 2023-10-03 23:34:20,050 INFO [train.py:1046] (2/4) Epoch 41, batch 4450, loss[loss=0.1565, simple_loss=0.2436, pruned_loss=0.0347, over 24658.00 frames. ], tot_loss[loss=0.1572, simple_loss=0.2373, pruned_loss=0.03851, over 4716172.55 frames. ], batch size: 65, lr: 2.47e-03, grad_scale: 16.0 2023-10-03 23:34:24,441 WARNING [train.py:1204] (2/4) Exclude cut with ID _47251_210_18628_1_1533814063128_4137250_186-343322-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 23:34:25,859 WARNING [train.py:1204] (2/4) Exclude cut with ID _56235_210_10640_1_1534557335235_4161099_624-46091-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 23:34:25,890 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_791-197891-0_sp0.9 from training. Duration: 0.9333125 2023-10-03 23:34:26,750 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.1.encoder.layers.0.self_attn_weights.whiten_keys, num_groups=4, num_channels=128, metric=3.52 vs. limit=6.0 2023-10-03 23:34:30,105 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.0.conv_module2.balancer1.prob, batch_count=1446240.0, ans=0.125 2023-10-03 23:34:33,135 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_50050_210_16223_1_1535435796_7290138_786-288457-0_sp1.1 from training. Duration: 0.92725 2023-10-03 23:34:33,160 WARNING [train.py:1204] (2/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_213-227283-0_sp1.1 from training. Duration: 0.963625 2023-10-03 23:34:37,199 WARNING [train.py:1204] (2/4) Exclude cut with ID _42659_210_2070_1_1533607442765_3120380_58-341121-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 23:34:38,575 WARNING [train.py:1204] (2/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_506-23200-0_sp1.1 from training. Duration: 0.8818125 2023-10-03 23:34:40,076 WARNING [train.py:1204] (2/4) Exclude cut with ID _14170_210_5219_1_1530525949868_4469650_336-213530-0_sp1.1 from training. Duration: 0.8181875 2023-10-03 23:34:40,098 WARNING [train.py:1204] (2/4) Exclude cut with ID _64135_210_2253_1_1535677267876_7608972_180-5312-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 23:34:42,055 WARNING [train.py:1204] (2/4) Exclude cut with ID _12302_210_5219_1_1529924446466_8107280_835-279754-0 from training. Duration: 0.9 2023-10-03 23:34:42,057 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_20120_210_6753_1_1531643035_5482859_510-96996-0_sp1.1 from training. Duration: 0.7909375 2023-10-03 23:34:42,132 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_15517_210_6753_1_1530954008_3487336_216-187834-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 23:34:42,165 WARNING [train.py:1204] (2/4) Exclude cut with ID _19504_210_9994_1_1531648863242_3325220_414-113427-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 23:34:42,166 WARNING [train.py:1204] (2/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_264-108685-0_sp0.9 from training. Duration: 0.611125 2023-10-03 23:34:44,940 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_5438_210_1794_1_1527336687_2600009_104-288274-0_sp0.9 from training. Duration: 0.6444375 2023-10-03 23:34:50,695 WARNING [train.py:1204] (2/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_520-39496-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 23:34:50,749 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_37818_210_4238_1_1533620910_4720083_315-308565-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 23:34:52,525 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_2009_210_2071_1_1525481134_4588937_385-302100-0_sp1.1 from training. Duration: 0.7909375 2023-10-03 23:34:52,743 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.nonlin_attention.balancer.prob, batch_count=1446373.3333333333, ans=0.125 2023-10-03 23:34:54,010 WARNING [train.py:1204] (2/4) Exclude cut with ID _58895_210_8750_1_1535184081320_3649089_277-53219-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 23:34:55,328 WARNING [train.py:1204] (2/4) Exclude cut with ID _26664_210_7770_1_1532258943280_3418270_121-111258-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 23:34:56,982 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.feed_forward2.hidden_balancer.prob, batch_count=1446373.3333333333, ans=0.125 2023-10-03 23:35:00,104 WARNING [train.py:1204] (2/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_99-167392-0_sp1.1 from training. Duration: 0.5545625 2023-10-03 23:35:01,440 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1113-7369-0 from training. Duration: 0.85 2023-10-03 23:35:01,460 WARNING [train.py:1204] (2/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_352-59958-0 from training. Duration: 0.86 2023-10-03 23:35:01,469 WARNING [train.py:1204] (2/4) Exclude cut with ID _14886_210_4220_1_1530702020355_3516190_159-73748-0_sp1.1 from training. Duration: 0.7545625 2023-10-03 23:35:02,908 WARNING [train.py:1204] (2/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_131-119569-0_sp1.1 from training. Duration: 0.92725 2023-10-03 23:35:04,365 WARNING [train.py:1204] (2/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_216-343007-0 from training. Duration: 0.66 2023-10-03 23:35:08,396 WARNING [train.py:1204] (2/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_113-92608-0_sp0.9 from training. Duration: 0.6 2023-10-03 23:35:11,130 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_40047_210_2290_1_1533290519_2374311_132-123271-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 23:35:12,365 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.550e+02 2.032e+02 2.316e+02 2.697e+02 5.602e+02, threshold=4.632e+02, percent-clipped=2.0 2023-10-03 23:35:12,462 WARNING [train.py:1204] (2/4) Exclude cut with ID _8793_210_6310_1_1528542985139_2310009_121-179782-0 from training. Duration: 0.8 2023-10-03 23:35:12,487 WARNING [train.py:1204] (2/4) Exclude cut with ID _43997_210_4525_1_1533538632555_5189329_283-218639-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 23:35:12,490 WARNING [train.py:1204] (2/4) Exclude cut with ID _49376_210_5986_1_1533985278732_3370740_297-78397-0_sp1.1 from training. Duration: 0.936375 2023-10-03 23:35:12,517 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_3475_210_2028_1_1526193578_3622569_377-166133-0_sp0.9 from training. Duration: 0.9555625 2023-10-03 23:35:13,775 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_520-213149-0_sp1.1 from training. Duration: 0.92725 2023-10-03 23:35:15,765 WARNING [train.py:1204] (2/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_567-144483-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 23:35:19,073 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_4183_210_2087_1_1526729100_1869838_143-280962-0_sp0.9 from training. Duration: 0.62225 2023-10-03 23:35:20,376 WARNING [train.py:1204] (2/4) Exclude cut with ID _49643_210_7989_1_1534640244224_3939031_611-185515-0 from training. Duration: 0.94 2023-10-03 23:35:21,744 WARNING [train.py:1204] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_88-341718-0_sp0.9 from training. Duration: 0.7333125 2023-10-03 23:35:25,059 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_51225_210_3601_1_1534400634_2327383_49-235441-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 23:35:25,135 WARNING [train.py:1204] (2/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_240-294400-0_sp1.1 from training. Duration: 0.936375 2023-10-03 23:35:26,503 WARNING [train.py:1204] (2/4) Exclude cut with ID _55429_210_10626_1_1534467588527_7174143_640-304825-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 23:35:26,526 WARNING [train.py:1204] (2/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_187-221566-0_sp1.1 from training. Duration: 0.3909375 2023-10-03 23:35:27,258 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.3.encoder.layers.0.feed_forward3.out_whiten, num_groups=1, num_channels=512, metric=10.76 vs. limit=15.0 2023-10-03 23:35:28,012 WARNING [train.py:1204] (2/4) Exclude cut with ID _7606_210_4915_1_1528894253277_5080690_127-177761-0_sp0.9 from training. Duration: 0.711125 2023-10-03 23:35:31,248 WARNING [train.py:1204] (2/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_430-116139-0 from training. Duration: 0.85 2023-10-03 23:35:32,654 WARNING [train.py:1204] (2/4) Exclude cut with ID _3675_210_4557_1_1526639934408_4182089_456-18305-0_sp0.9 from training. Duration: 0.8444375 2023-10-03 23:35:33,930 INFO [train.py:1046] (2/4) Epoch 41, batch 4500, loss[loss=0.1968, simple_loss=0.2684, pruned_loss=0.06258, over 19779.00 frames. ], tot_loss[loss=0.1582, simple_loss=0.2384, pruned_loss=0.03902, over 4709479.41 frames. ], batch size: 388, lr: 2.47e-03, grad_scale: 16.0 2023-10-03 23:35:35,533 WARNING [train.py:1204] (2/4) Exclude cut with ID _18513_210_9206_1_1531618215223_3862620_402-5957-0_sp1.1 from training. Duration: 0.9 2023-10-03 23:35:36,844 WARNING [train.py:1204] (2/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_465-156500-0 from training. Duration: 0.72 2023-10-03 23:35:36,845 WARNING [train.py:1204] (2/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_714-310538-0 from training. Duration: 0.86 2023-10-03 23:35:38,252 WARNING [train.py:1204] (2/4) Exclude cut with ID _39411_210_5986_1_1533538825030_3343649_335-301187-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 23:35:42,980 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_41750_210_1960_1_1533520678_3948042_241-158616-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 23:35:43,034 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_46013_210_7234_1_1533776362_3744032_50-141535-0_sp1.1 from training. Duration: 0.9 2023-10-03 23:35:43,100 WARNING [train.py:1204] (2/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_43-206757-0_sp0.9 from training. Duration: 0.8333125 2023-10-03 23:35:44,802 WARNING [train.py:1204] (2/4) Exclude cut with ID _22961_210_8254_1_1531976366672_4278180_172-276296-0_sp1.1 from training. Duration: 0.97275 2023-10-03 23:35:44,833 WARNING [train.py:1204] (2/4) Exclude cut with ID _17956_210_4353_1_1531209568125_6979970_1305-301903-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 23:35:46,156 WARNING [train.py:1204] (2/4) Exclude cut with ID _28323_210_16187_1_1532480395667_4100300_604-44507-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 23:35:50,420 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.attention_skip_rate, batch_count=1446640.0, ans=0.0 2023-10-03 23:35:50,746 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.4.encoder.layers.1.feed_forward2.out_whiten, num_groups=1, num_channels=384, metric=9.82 vs. limit=15.0 2023-10-03 23:35:57,755 WARNING [train.py:1204] (2/4) Exclude cut with ID _23514_210_1389_1_1531832139785_3031439_88-337544-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 23:35:59,663 WARNING [train.py:1204] (2/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_932-3672-0_sp1.1 from training. Duration: 0.963625 2023-10-03 23:36:01,228 WARNING [train.py:1204] (2/4) Exclude cut with ID _46502_210_13794_1_1533724452198_3657159_207-109991-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 23:36:02,612 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_216-73841-0_sp1.1 from training. Duration: 0.87275 2023-10-03 23:36:03,882 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8453_210_4211_1_1528502940_8562730_347-330673-0_sp1.1 from training. Duration: 0.6545625 2023-10-03 23:36:08,145 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_4183_210_2087_1_1526729100_1869838_143-280962-0_sp1.1 from training. Duration: 0.5090625 2023-10-03 23:36:12,280 WARNING [train.py:1204] (2/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_325-113798-0_sp1.1 from training. Duration: 0.77275 2023-10-03 23:36:12,520 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.balancer1.prob, batch_count=1446706.6666666667, ans=0.125 2023-10-03 23:36:12,537 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.balancer2.prob, batch_count=1446706.6666666667, ans=0.125 2023-10-03 23:36:13,762 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.0.conv_skip_rate, batch_count=1446706.6666666667, ans=0.0 2023-10-03 23:36:15,522 WARNING [train.py:1204] (2/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_91-212486-0_sp0.9 from training. Duration: 0.7555625 2023-10-03 23:36:18,268 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8776_210_2040_1_1528638837_4088036_244-278763-0_sp1.1 from training. Duration: 0.8181875 2023-10-03 23:36:18,306 WARNING [train.py:1204] (2/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_106-286130-0 from training. Duration: 0.98 2023-10-03 23:36:19,673 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_2746_210_1988_1_1525872816_3795627_372-205861-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 23:36:19,715 WARNING [train.py:1204] (2/4) Exclude cut with ID _30800_210_22382_1_1532499347904_4515959_240-54646-0_sp1.1 from training. Duration: 0.936375 2023-10-03 23:36:22,797 WARNING [train.py:1204] (2/4) Exclude cut with ID _59500_210_8254_1_1534809610680_3736009_162-180695-0_sp1.1 from training. Duration: 0.936375 2023-10-03 23:36:24,169 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7544_210_6753_1_1528591327_1471426_146-93357-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 23:36:25,629 WARNING [train.py:1204] (2/4) Exclude cut with ID _42657_210_2070_1_1533520654016_3327008_405-87432-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 23:36:25,667 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_4528_210_3273_1_1527076654_3276777_209-219804-0 from training. Duration: 0.69 2023-10-03 23:36:25,667 WARNING [train.py:1204] (2/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_78-182912-0_sp0.9 from training. Duration: 0.6555625 2023-10-03 23:36:26,836 WARNING [train.py:1204] (2/4) Exclude cut with ID _31255_210_13427_1_1532865619495_3815143_322-114998-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 23:36:30,248 WARNING [train.py:1204] (2/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_848-280957-0_sp0.9 from training. Duration: 0.9666875 2023-10-03 23:36:30,268 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7696_210_2040_1_1528207167_3716099_117-125434-0_sp0.9 from training. Duration: 0.8444375 2023-10-03 23:36:34,364 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_1002-109192-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 23:36:34,596 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.bypass_mid.scale_min, batch_count=1446840.0, ans=0.2 2023-10-03 23:36:37,100 WARNING [train.py:1204] (2/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_138-64210-0_sp0.9 from training. Duration: 0.811125 2023-10-03 23:36:37,120 WARNING [train.py:1204] (2/4) Exclude cut with ID _25808_210_13292_1_1532343514562_7366109_633-47524-0_sp1.1 from training. Duration: 0.9 2023-10-03 23:36:38,488 WARNING [train.py:1204] (2/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_297-5233-0 from training. Duration: 0.95 2023-10-03 23:36:40,028 WARNING [train.py:1204] (2/4) Exclude cut with ID _8394_210_6702_1_1528590504316_3540189_335-274780-0 from training. Duration: 0.78 2023-10-03 23:36:40,040 WARNING [train.py:1204] (2/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_301-330455-0 from training. Duration: 0.8 2023-10-03 23:36:42,804 WARNING [train.py:1204] (2/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_383-205833-0 from training. Duration: 0.95 2023-10-03 23:36:46,666 WARNING [train.py:1204] (2/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_48-182155-0 from training. Duration: 0.87 2023-10-03 23:36:47,898 INFO [train.py:1046] (2/4) Epoch 41, batch 4550, loss[loss=0.1587, simple_loss=0.2461, pruned_loss=0.0357, over 24665.00 frames. ], tot_loss[loss=0.1571, simple_loss=0.237, pruned_loss=0.03861, over 4707974.26 frames. ], batch size: 73, lr: 2.47e-03, grad_scale: 16.0 2023-10-03 23:36:48,186 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.self_attn_weights.pos_emb_skip_rate, batch_count=1446906.6666666667, ans=0.0 2023-10-03 23:36:49,185 WARNING [train.py:1204] (2/4) Exclude cut with ID _14832_210_8750_1_1530691351539_3171348_1-54315-0_sp1.1 from training. Duration: 0.7909375 2023-10-03 23:36:51,956 WARNING [train.py:1204] (2/4) Exclude cut with ID _44982_210_1511_1_1534312820108_3726029_413-100814-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 23:36:52,014 WARNING [train.py:1204] (2/4) Exclude cut with ID _3231_210_2106_1_1526205864934_6784220_104-261392-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 23:36:55,412 WARNING [train.py:1204] (2/4) Exclude cut with ID _24314_210_4915_1_1532135078116_7170179_1-13588-0_sp1.1 from training. Duration: 0.97275 2023-10-03 23:37:01,510 WARNING [train.py:1204] (2/4) Exclude cut with ID _44304_210_15374_1_1533639352765_4613129_141-8356-0_sp1.1 from training. Duration: 0.963625 2023-10-03 23:37:02,955 WARNING [train.py:1204] (2/4) Exclude cut with ID _37892_210_19596_1_1533198677897_3964058_374-335034-0_sp1.1 from training. Duration: 0.936375 2023-10-03 23:37:04,418 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_195-257512-0_sp0.9 from training. Duration: 0.7444375 2023-10-03 23:37:04,426 WARNING [train.py:1204] (2/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_1267-272693-0_sp1.1 from training. Duration: 0.663625 2023-10-03 23:37:04,427 WARNING [train.py:1204] (2/4) Exclude cut with ID _30196_210_1664_1_1533708496522_7074140_690-326155-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 23:37:07,070 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_68847_210_14656_1_1536070214_2586843_38-20717-0_sp1.1 from training. Duration: 0.97275 2023-10-03 23:37:07,106 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_99-251314-0_sp1.1 from training. Duration: 0.7909375 2023-10-03 23:37:09,947 WARNING [train.py:1204] (2/4) Exclude cut with ID _49376_210_5986_1_1533985278732_3370740_225-95630-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 23:37:12,635 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_2655_210_2087_1_1525688959_3897400_202-147557-0 from training. Duration: 0.85 2023-10-03 23:37:12,701 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_5618_210_1874_1_1527311008_5851_1-214501-0_sp1.1 from training. Duration: 0.626375 2023-10-03 23:37:14,021 WARNING [train.py:1204] (2/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_225-43381-0_sp1.1 from training. Duration: 0.8545625 2023-10-03 23:37:17,636 WARNING [train.py:1204] (2/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_242-3472-0 from training. Duration: 0.73 2023-10-03 23:37:19,097 WARNING [train.py:1204] (2/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_339-104791-0 from training. Duration: 0.49 2023-10-03 23:37:19,162 WARNING [train.py:1204] (2/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_488-92261-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 23:37:21,964 WARNING [train.py:1204] (2/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_175-163894-0 from training. Duration: 0.74 2023-10-03 23:37:25,202 WARNING [train.py:1204] (2/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_41-234353-0_sp0.9 from training. Duration: 0.7555625 2023-10-03 23:37:29,806 WARNING [train.py:1204] (2/4) Exclude cut with ID _47251_210_18628_1_1533814063128_4137250_116-275927-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 23:37:29,832 WARNING [train.py:1204] (2/4) Exclude cut with ID _4697_210_2069_1_1526815801953_3823098_61-223324-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 23:37:29,843 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6961_210_2087_1_1527937168_6815011_562-165518-0_sp1.1 from training. Duration: 0.7 2023-10-03 23:37:31,204 WARNING [train.py:1204] (2/4) Exclude cut with ID _21989_210_5854_1_1531899214931_3271360_133-44759-0 from training. Duration: 0.77 2023-10-03 23:37:32,641 WARNING [train.py:1204] (2/4) Exclude cut with ID _23557_210_8750_1_1531890106370_6846255_77-284923-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 23:37:34,063 WARNING [train.py:1204] (2/4) Exclude cut with ID _13678_210_7497_1_1530530069226_3463170_464-296127-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 23:37:34,082 WARNING [train.py:1204] (2/4) Exclude cut with ID _22613_210_5219_1_1532253599686_4090170_393-36663-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 23:37:35,456 WARNING [train.py:1204] (2/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_235-262124-0_sp0.9 from training. Duration: 0.7444375 2023-10-03 23:37:36,891 WARNING [train.py:1204] (2/4) Exclude cut with ID _8480_210_1874_1_1528589391205_6728330_755-112625-0 from training. Duration: 0.74 2023-10-03 23:37:38,149 WARNING [train.py:1204] (2/4) Exclude cut with ID _6456_210_1534_1_1527681634212_3181049_215-309359-0 from training. Duration: 0.88 2023-10-03 23:37:38,171 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_3019_210_2028_1_1526044245_3264865_191-72235-0_sp1.1 from training. Duration: 0.8818125 2023-10-03 23:37:39,496 WARNING [train.py:1204] (2/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_380-162648-0 from training. Duration: 0.94 2023-10-03 23:37:40,779 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.595e+02 2.077e+02 2.278e+02 2.553e+02 3.904e+02, threshold=4.555e+02, percent-clipped=0.0 2023-10-03 23:37:40,917 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_92-46009-0 from training. Duration: 0.54 2023-10-03 23:37:40,941 WARNING [train.py:1204] (2/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_53-300544-0_sp0.9 from training. Duration: 0.7444375 2023-10-03 23:37:43,595 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_68235_210_1925_1_1535863247_4484656_101-250943-0_sp1.1 from training. Duration: 0.97275 2023-10-03 23:37:43,604 WARNING [train.py:1204] (2/4) Exclude cut with ID _14970_210_15709_1_1530775547361_1966965_204-22182-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 23:37:45,508 WARNING [train.py:1204] (2/4) Exclude cut with ID _55153_210_2253_1_1534384786677_3162096_310-130092-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 23:37:45,527 WARNING [train.py:1204] (2/4) Exclude cut with ID _60113_210_16271_1_1535164212481_4386079_559-167191-0_sp1.1 from training. Duration: 0.8454375 2023-10-03 23:37:46,902 WARNING [train.py:1204] (2/4) Exclude cut with ID _14368_210_4220_1_1530532714870_3595529_93-58002-0_sp1.1 from training. Duration: 0.5181875 2023-10-03 23:37:46,969 WARNING [train.py:1204] (2/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_110-224688-0 from training. Duration: 0.97 2023-10-03 23:37:47,178 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.conv_skip_rate, batch_count=1447173.3333333333, ans=0.0 2023-10-03 23:37:48,417 WARNING [train.py:1204] (2/4) Exclude cut with ID _40402_210_18219_1_1533522324115_2953150_178-178004-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 23:37:48,426 WARNING [train.py:1204] (2/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_321-335475-0_sp1.1 from training. Duration: 0.4090625 2023-10-03 23:37:50,406 WARNING [train.py:1204] (2/4) Exclude cut with ID _66863_210_18053_1_1535698758334_8590319_1092-271784-0 from training. Duration: 0.96 2023-10-03 23:37:50,418 WARNING [train.py:1204] (2/4) Exclude cut with ID _28317_210_12742_1_1532480302381_8510325_607-213951-0_sp1.1 from training. Duration: 0.763625 2023-10-03 23:37:51,709 WARNING [train.py:1204] (2/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_468-283113-0 from training. Duration: 0.96 2023-10-03 23:37:53,112 WARNING [train.py:1204] (2/4) Exclude cut with ID _14922_210_6228_1_1530707435273_3693310_13-165512-0_sp1.1 from training. Duration: 0.7454375 2023-10-03 23:37:53,123 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_69630_210_7234_1_1535788670_910989_56-169762-0_sp1.1 from training. Duration: 0.92725 2023-10-03 23:37:56,435 WARNING [train.py:1204] (2/4) Exclude cut with ID _67335_210_5219_1_1535525975652_4275229_121-80357-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 23:37:56,479 WARNING [train.py:1204] (2/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_126-12079-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 23:37:56,514 WARNING [train.py:1204] (2/4) Exclude cut with ID _5294_210_1747_1_1527249643928_3696179_342-270278-0_sp1.1 from training. Duration: 0.5 2023-10-03 23:37:59,833 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_30322_210_1925_1_1532843478_826817_35-328264-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 23:37:59,963 WARNING [train.py:1204] (2/4) Exclude cut with ID _8480_210_1874_1_1528589391205_6728330_755-112625-0_sp1.1 from training. Duration: 0.67275 2023-10-03 23:38:00,100 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.0.self_attn_weights.pos_emb_skip_rate, batch_count=1447173.3333333333, ans=0.0 2023-10-03 23:38:02,606 INFO [train.py:1046] (2/4) Epoch 41, batch 4600, loss[loss=0.1492, simple_loss=0.2193, pruned_loss=0.03954, over 23459.00 frames. ], tot_loss[loss=0.1561, simple_loss=0.2358, pruned_loss=0.03825, over 4708412.34 frames. ], batch size: 285, lr: 2.47e-03, grad_scale: 16.0 2023-10-03 23:38:04,009 WARNING [train.py:1204] (2/4) Exclude cut with ID _38670_210_15379_1_1533448810659_4391059_132-146412-0_sp1.1 from training. Duration: 0.97275 2023-10-03 23:38:04,189 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.feed_forward3.hidden_balancer.prob, batch_count=1447240.0, ans=0.125 2023-10-03 23:38:05,310 WARNING [train.py:1204] (2/4) Exclude cut with ID _22020_210_2461_1_1531724164513_6326900_563-39691-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 23:38:05,537 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=1447240.0, ans=0.1 2023-10-03 23:38:08,048 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_9460_210_4211_1_1528954587_10049667_572-37035-0_sp1.1 from training. Duration: 0.72725 2023-10-03 23:38:08,068 WARNING [train.py:1204] (2/4) Exclude cut with ID _68777_210_4353_1_1535810390731_3117904_129-166012-0_sp0.9 from training. Duration: 0.9444375 2023-10-03 23:38:09,423 WARNING [train.py:1204] (2/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_487-91921-0_sp1.1 from training. Duration: 0.963625 2023-10-03 23:38:09,529 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_195-257512-0 from training. Duration: 0.67 2023-10-03 23:38:10,061 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.4.encoder.layers.2.conv_module1.whiten, num_groups=1, num_channels=384, metric=2.69 vs. limit=15.0 2023-10-03 23:38:10,948 WARNING [train.py:1204] (2/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_188-260011-0_sp1.1 from training. Duration: 0.663625 2023-10-03 23:38:13,719 WARNING [train.py:1204] (2/4) Exclude cut with ID _8480_210_1874_1_1528589391205_6728330_309-139896-0_sp0.9 from training. Duration: 0.9 2023-10-03 23:38:13,785 WARNING [train.py:1204] (2/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_866-321248-0_sp1.1 from training. Duration: 0.963625 2023-10-03 23:38:15,829 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.conv_skip_rate, batch_count=1447306.6666666667, ans=0.0 2023-10-03 23:38:15,837 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.conv_module1.balancer2.prob, batch_count=1447306.6666666667, ans=0.125 2023-10-03 23:38:17,045 WARNING [train.py:1204] (2/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_510-216513-0_sp1.1 from training. Duration: 0.97275 2023-10-03 23:38:18,666 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.feed_forward1.out_proj.dropout_p, batch_count=1447306.6666666667, ans=0.1 2023-10-03 23:38:24,547 WARNING [train.py:1204] (2/4) Exclude cut with ID _6653_210_2629_1_1527919172441_6732679_758-143677-0 from training. Duration: 0.98 2023-10-03 23:38:26,349 WARNING [train.py:1204] (2/4) Exclude cut with ID _55134_210_21982_1_1534590028377_3882170_50-214427-0_sp1.1 from training. Duration: 0.97275 2023-10-03 23:38:29,516 WARNING [train.py:1204] (2/4) Exclude cut with ID _31649_210_6228_1_1532584608063_4091078_482-301330-0_sp1.1 from training. Duration: 0.97275 2023-10-03 23:38:30,946 WARNING [train.py:1204] (2/4) Exclude cut with ID _35799_210_5854_1_1533263487887_3529071_490-326847-0_sp1.1 from training. Duration: 0.9 2023-10-03 23:38:30,954 WARNING [train.py:1204] (2/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_984-90716-0_sp1.1 from training. Duration: 0.963625 2023-10-03 23:38:34,082 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.bypass_mid.scale_min, batch_count=1447373.3333333333, ans=0.2 2023-10-03 23:38:35,249 WARNING [train.py:1204] (2/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_166-246557-0 from training. Duration: 0.64 2023-10-03 23:38:35,249 WARNING [train.py:1204] (2/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_51-200220-0_sp1.1 from training. Duration: 0.5090625 2023-10-03 23:38:36,650 WARNING [train.py:1204] (2/4) Exclude cut with ID _51382_210_23862_1_1534492801838_3650104_275-92796-0_sp1.1 from training. Duration: 0.936375 2023-10-03 23:38:36,791 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.0.nonlin_attention.balancer.prob, batch_count=1447373.3333333333, ans=0.125 2023-10-03 23:38:39,820 INFO [scaling.py:1118] (2/4) WithLoss: name=encoder.encoders.5.encoder.layers.1.self_attn_weights, loss-sum=0.000e+00 2023-10-03 23:38:40,956 WARNING [train.py:1204] (2/4) Exclude cut with ID _29853_210_7497_1_1532505600851_6642110_461-142829-0_sp1.1 from training. Duration: 0.97275 2023-10-03 23:38:41,220 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.bypass.scale_min, batch_count=1447373.3333333333, ans=0.2 2023-10-03 23:38:42,292 WARNING [train.py:1204] (2/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_788-298907-0_sp0.9 from training. Duration: 0.988875 2023-10-03 23:38:43,664 WARNING [train.py:1204] (2/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_739-2098-0_sp1.1 from training. Duration: 0.836375 2023-10-03 23:38:43,902 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.balancer1.prob, batch_count=1447373.3333333333, ans=0.125 2023-10-03 23:38:48,243 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7543_210_6753_1_1528541907_1399322_165-323619-0 from training. Duration: 0.69 2023-10-03 23:38:49,593 WARNING [train.py:1204] (2/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_401-255491-0_sp1.1 from training. Duration: 0.636375 2023-10-03 23:38:51,605 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.5.encoder.layers.0.nonlin_attention.whiten1, num_groups=1, num_channels=192, metric=3.68 vs. limit=10.0 2023-10-03 23:38:54,301 WARNING [train.py:1204] (2/4) Exclude cut with ID _66873_210_5517_1_1535454136506_4293370_24-143283-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 23:38:56,205 WARNING [train.py:1204] (2/4) Exclude cut with ID _14459_210_2228_1_1531443641946_7516700_1221-280439-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 23:38:57,753 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=1447440.0, ans=0.1 2023-10-03 23:38:58,940 WARNING [train.py:1204] (2/4) Exclude cut with ID _59202_210_26984_1_1534816810612_3549174_469-213482-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 23:38:58,941 WARNING [train.py:1204] (2/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_148-158509-0_sp0.9 from training. Duration: 0.4555625 2023-10-03 23:38:58,969 WARNING [train.py:1204] (2/4) Exclude cut with ID _24314_210_4915_1_1532135078116_7170179_863-259095-0_sp1.1 from training. Duration: 0.97275 2023-10-03 23:39:00,808 WARNING [train.py:1204] (2/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_321-22821-0 from training. Duration: 0.89 2023-10-03 23:39:00,834 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_43831_210_2158_1_1533725502_5872791_55-163137-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 23:39:01,092 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.balancer2.prob, batch_count=1447506.6666666667, ans=0.125 2023-10-03 23:39:02,155 WARNING [train.py:1204] (2/4) Exclude cut with ID _27430_210_7770_1_1532604689375_1378590_26-25207-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 23:39:02,274 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_17613_210_2151_1_1531137569_2366160_223-316461-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 23:39:03,578 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_53679_210_4852_1_1534556726_4186141_541-213370-0_sp1.1 from training. Duration: 0.963625 2023-10-03 23:39:04,894 WARNING [train.py:1204] (2/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_369-295086-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 23:39:06,242 WARNING [train.py:1204] (2/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_91-267696-0 from training. Duration: 0.87 2023-10-03 23:39:06,298 WARNING [train.py:1204] (2/4) Exclude cut with ID _5294_210_1747_1_1527249643928_3696179_180-100713-0 from training. Duration: 0.81 2023-10-03 23:39:06,329 WARNING [train.py:1204] (2/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_32-335103-0 from training. Duration: 0.82 2023-10-03 23:39:06,335 WARNING [train.py:1204] (2/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_133-312727-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 23:39:07,698 WARNING [train.py:1204] (2/4) Exclude cut with ID _59288_210_5854_1_1535077757774_3412246_391-165934-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 23:39:07,747 WARNING [train.py:1204] (2/4) Exclude cut with ID _28323_210_16187_1_1532480395667_4100300_488-1281-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 23:39:09,173 WARNING [train.py:1204] (2/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_124-193207-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 23:39:12,037 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.0.balancer_na.min_abs, batch_count=1447506.6666666667, ans=0.02 2023-10-03 23:39:16,067 INFO [train.py:1046] (2/4) Epoch 41, batch 4650, loss[loss=0.1672, simple_loss=0.245, pruned_loss=0.04464, over 23462.00 frames. ], tot_loss[loss=0.1559, simple_loss=0.2357, pruned_loss=0.03808, over 4706798.41 frames. ], batch size: 106, lr: 2.47e-03, grad_scale: 16.0 2023-10-03 23:39:17,998 WARNING [train.py:1204] (2/4) Exclude cut with ID _14087_210_2253_1_1530529224512_3014029_226-242140-0_sp0.9 from training. Duration: 0.9 2023-10-03 23:39:22,049 WARNING [train.py:1204] (2/4) Exclude cut with ID _29277_210_3279_1_1532581109293_3173069_392-3694-0_sp1.1 from training. Duration: 0.936375 2023-10-03 23:39:22,087 WARNING [train.py:1204] (2/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_563-329842-0_sp1.1 from training. Duration: 0.97275 2023-10-03 23:39:22,138 WARNING [train.py:1204] (2/4) Exclude cut with ID _58583_210_5236_1_1534726835266_3574249_391-87755-0_sp1.1 from training. Duration: 0.92725 2023-10-03 23:39:23,440 WARNING [train.py:1204] (2/4) Exclude cut with ID _28427_210_4220_1_1532581240967_3061069_281-237827-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 23:39:23,462 WARNING [train.py:1204] (2/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_44-147898-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 23:39:25,392 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_24007_210_1794_1_1531918925_1663875_125-320772-0_sp1.1 from training. Duration: 0.97275 2023-10-03 23:39:26,963 WARNING [train.py:1204] (2/4) Exclude cut with ID _14368_210_4220_1_1530532714870_3595529_143-117421-0 from training. Duration: 0.58 2023-10-03 23:39:31,671 WARNING [train.py:1204] (2/4) Exclude cut with ID _69584_210_5219_1_1535785133775_4044450_325-7656-0_sp0.9 from training. Duration: 0.9555625 2023-10-03 23:39:32,930 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_37351_210_7925_1_1533012931_3812113_265-85780-0 from training. Duration: 0.88 2023-10-03 23:39:32,971 WARNING [train.py:1204] (2/4) Exclude cut with ID _18516_210_9206_1_1531704653206_3692406_331-237896-0_sp1.1 from training. Duration: 0.936375 2023-10-03 23:39:34,259 WARNING [train.py:1204] (2/4) Exclude cut with ID _14629_210_15709_1_1530604298182_956899_6-232695-0 from training. Duration: 0.94 2023-10-03 23:39:34,283 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7544_210_6753_1_1528587000_691468_87-178599-0_sp1.1 from training. Duration: 0.7909375 2023-10-03 23:39:34,342 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_326-303945-0 from training. Duration: 0.99 2023-10-03 23:39:34,365 WARNING [train.py:1204] (2/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_503-30719-0 from training. Duration: 0.98 2023-10-03 23:39:34,372 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_11305_210_1794_1_1529750428_7178148_299-344640-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 23:39:35,733 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_53071_210_16554_1_1534490405_2954066_265-170472-0_sp1.1 from training. Duration: 0.7545625 2023-10-03 23:39:38,679 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_174-277364-0_sp1.1 from training. Duration: 0.6454375 2023-10-03 23:39:40,003 WARNING [train.py:1204] (2/4) Exclude cut with ID _58414_210_8254_1_1535421609162_7151182_193-151884-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 23:39:40,029 WARNING [train.py:1204] (2/4) Exclude cut with ID IC0651W0104-322063 from training. Duration: 0.768 2023-10-03 23:39:44,004 WARNING [train.py:1204] (2/4) Exclude cut with ID _21881_210_3525_1_1531825242757_3798380_139-305982-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 23:39:45,527 WARNING [train.py:1204] (2/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_79-200392-0 from training. Duration: 0.97 2023-10-03 23:39:47,009 WARNING [train.py:1204] (2/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_451-280894-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 23:39:47,017 WARNING [train.py:1204] (2/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_304-273433-0_sp1.1 from training. Duration: 0.9 2023-10-03 23:39:48,952 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_5959_210_1794_1_1527400400_3445726_412-44052-0 from training. Duration: 0.77 2023-10-03 23:39:50,333 WARNING [train.py:1204] (2/4) Exclude cut with ID _4681_210_3525_1_1527251040321_1235713_148-26159-0_sp1.1 from training. Duration: 0.963625 2023-10-03 23:39:52,997 WARNING [train.py:1204] (2/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_887-249555-0_sp0.9 from training. Duration: 0.8555625 2023-10-03 23:39:58,744 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_25236_210_2158_1_1532051968_280567_37-6860-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 23:40:03,523 WARNING [train.py:1204] (2/4) Exclude cut with ID _29277_210_3279_1_1532581109293_3173069_417-115527-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 23:40:06,087 WARNING [train.py:1204] (2/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_552-134748-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 23:40:06,155 WARNING [train.py:1204] (2/4) Exclude cut with ID _43176_210_8254_1_1533524417469_4174339_188-174143-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 23:40:06,192 WARNING [train.py:1204] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_127-119109-0_sp1.1 from training. Duration: 0.6545625 2023-10-03 23:40:08,749 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.663e+02 1.906e+02 2.155e+02 2.545e+02 4.224e+02, threshold=4.311e+02, percent-clipped=0.0 2023-10-03 23:40:08,889 WARNING [train.py:1204] (2/4) Exclude cut with ID _14922_210_6228_1_1530707435273_3693310_38-187851-0 from training. Duration: 0.62 2023-10-03 23:40:08,921 WARNING [train.py:1204] (2/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_588-125165-0 from training. Duration: 0.83 2023-10-03 23:40:10,357 WARNING [train.py:1204] (2/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_344-152526-0_sp1.1 from training. Duration: 0.4818125 2023-10-03 23:40:10,367 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7002_210_1794_1_1527935634_4034444_487-236501-0 from training. Duration: 0.66 2023-10-03 23:40:11,622 WARNING [train.py:1204] (2/4) Exclude cut with ID _23825_210_13449_1_1531915313526_3497150_379-16981-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 23:40:17,286 WARNING [train.py:1204] (2/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_297-5233-0_sp1.1 from training. Duration: 0.863625 2023-10-03 23:40:17,290 WARNING [train.py:1204] (2/4) Exclude cut with ID _41298_210_9019_1_1533430371450_4822143_178-283538-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 23:40:17,324 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7046_210_6753_1_1527895337_6920917_642-106799-0 from training. Duration: 0.92 2023-10-03 23:40:17,340 WARNING [train.py:1204] (2/4) Exclude cut with ID _6274_210_2084_1_1527937583837_7494020_532-291952-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 23:40:19,154 WARNING [train.py:1204] (2/4) Exclude cut with ID _47278_210_12041_1_1533902418349_3576110_260-270468-0_sp1.1 from training. Duration: 0.92725 2023-10-03 23:40:19,159 WARNING [train.py:1204] (2/4) Exclude cut with ID _14368_210_4220_1_1530532714870_3595529_144-3287-0_sp0.9 from training. Duration: 0.7444375 2023-10-03 23:40:20,585 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_162-262717-0_sp0.9 from training. Duration: 0.92225 2023-10-03 23:40:22,042 WARNING [train.py:1204] (2/4) Exclude cut with ID _3465_210_3525_1_1526175667060_3574049_62-335552-0_sp0.9 from training. Duration: 0.9666875 2023-10-03 23:40:22,048 WARNING [train.py:1204] (2/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_726-272852-0_sp1.1 from training. Duration: 0.92725 2023-10-03 23:40:22,118 WARNING [train.py:1204] (2/4) Exclude cut with ID _14898_210_9206_1_1530766812377_4177190_26-135492-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 23:40:24,068 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.5.encoder.layers.0.self_attn_weights.whiten_keys, num_groups=4, num_channels=128, metric=1.74 vs. limit=6.0 2023-10-03 23:40:26,739 WARNING [train.py:1204] (2/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_200-95304-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 23:40:26,784 WARNING [train.py:1204] (2/4) Exclude cut with ID _45012_210_12651_1_1533608435136_5371380_240-37101-0_sp1.1 from training. Duration: 0.8454375 2023-10-03 23:40:26,792 WARNING [train.py:1204] (2/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_132-269633-0_sp0.9 from training. Duration: 0.7555625 2023-10-03 23:40:28,126 WARNING [train.py:1204] (2/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_907-151398-0_sp0.9 from training. Duration: 0.411125 2023-10-03 23:40:29,396 INFO [train.py:1046] (2/4) Epoch 41, batch 4700, loss[loss=0.1621, simple_loss=0.2386, pruned_loss=0.04279, over 23553.00 frames. ], tot_loss[loss=0.1563, simple_loss=0.2364, pruned_loss=0.03812, over 4712655.74 frames. ], batch size: 256, lr: 2.47e-03, grad_scale: 16.0 2023-10-03 23:40:29,479 WARNING [train.py:1204] (2/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_45-265684-0_sp0.9 from training. Duration: 0.8 2023-10-03 23:40:30,838 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6291_210_3273_1_1527683649_1078498_74-292608-0 from training. Duration: 0.72 2023-10-03 23:40:38,453 WARNING [train.py:1204] (2/4) Exclude cut with ID _59202_210_26984_1_1534816810612_3549174_221-84819-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 23:40:39,765 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_33696_210_3852_1_1533082780_2663375_181-325595-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 23:40:41,094 WARNING [train.py:1204] (2/4) Exclude cut with ID _47251_210_18628_1_1533814063128_4137250_351-154105-0_sp1.1 from training. Duration: 0.963625 2023-10-03 23:40:41,181 WARNING [train.py:1204] (2/4) Exclude cut with ID _35501_210_9288_1_1532998507986_3192189_351-334740-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 23:40:43,919 WARNING [train.py:1204] (2/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_342-100157-0_sp1.1 from training. Duration: 0.4909375 2023-10-03 23:40:47,205 WARNING [train.py:1204] (2/4) Exclude cut with ID _6789_210_1534_1_1527857937218_3118950_262-48853-0 from training. Duration: 0.56 2023-10-03 23:40:48,437 WARNING [train.py:1204] (2/4) Exclude cut with ID _69884_210_9771_1_1535846392818_4166169_430-201020-0 from training. Duration: 0.97 2023-10-03 23:40:50,227 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_71-136518-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 23:40:51,557 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_28-291982-0_sp1.1 from training. Duration: 0.8818125 2023-10-03 23:40:51,583 WARNING [train.py:1204] (2/4) Exclude cut with ID _54218_210_12210_1_1534413646388_4409259_437-275326-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 23:40:53,170 WARNING [train.py:1204] (2/4) Exclude cut with ID _55134_210_21982_1_1534590028377_3882170_40-309139-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 23:40:58,502 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.feed_forward2.hidden_balancer.prob, batch_count=1448040.0, ans=0.125 2023-10-03 23:40:59,558 WARNING [train.py:1204] (2/4) Exclude cut with ID _32704_210_8954_1_1533017488840_6747370_266-329755-0_sp1.1 from training. Duration: 0.8090625 2023-10-03 23:41:00,882 WARNING [train.py:1204] (2/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_21-174514-0_sp1.1 from training. Duration: 0.3909375 2023-10-03 23:41:04,154 WARNING [train.py:1204] (2/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_409-207378-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 23:41:09,719 WARNING [train.py:1204] (2/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_132-269633-0 from training. Duration: 0.68 2023-10-03 23:41:10,932 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_2245_210_2715_1_1525511548_3540254_310-52483-0_sp0.9 from training. Duration: 0.97775 2023-10-03 23:41:12,433 WARNING [train.py:1204] (2/4) Exclude cut with ID _27430_210_7770_1_1532604689375_1378590_47-77403-0_sp1.1 from training. Duration: 0.92725 2023-10-03 23:41:16,045 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.conv_module1.balancer1.prob, batch_count=1448106.6666666667, ans=0.125 2023-10-03 23:41:17,221 WARNING [train.py:1204] (2/4) Exclude cut with ID _7562_210_2084_1_1528887767916_3946229_186-326422-0 from training. Duration: 0.84 2023-10-03 23:41:18,717 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.ff2_skip_rate, batch_count=1448106.6666666667, ans=0.0 2023-10-03 23:41:19,828 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_14056_210_4163_1_1530604483_7572827_230-35993-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 23:41:22,716 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_23854_210_1794_1_1531969696_1329157_57-123491-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 23:41:24,153 WARNING [train.py:1204] (2/4) Exclude cut with ID _14747_210_8341_1_1530685797463_3654089_147-131229-0 from training. Duration: 0.76 2023-10-03 23:41:24,321 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=1448106.6666666667, ans=0.1 2023-10-03 23:41:26,936 WARNING [train.py:1204] (2/4) Exclude cut with ID _30191_210_1664_1_1533189915047_7196670_430-19539-0_sp1.1 from training. Duration: 0.92725 2023-10-03 23:41:26,953 WARNING [train.py:1204] (2/4) Exclude cut with ID _67725_210_27767_1_1535608756671_5105359_44-50762-0_sp1.1 from training. Duration: 0.963625 2023-10-03 23:41:28,893 WARNING [train.py:1204] (2/4) Exclude cut with ID _27485_210_4916_1_1532739191677_3668269_217-161755-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 23:41:28,934 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_5931_210_2040_1_1527428481_4547999_321-193614-0_sp1.1 from training. Duration: 0.6090625 2023-10-03 23:41:30,686 WARNING [train.py:1204] (2/4) Exclude cut with ID _6267_210_2084_1_1527764658526_3894730_375-219887-0 from training. Duration: 0.67 2023-10-03 23:41:32,118 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9087W0022-538393 from training. Duration: 0.768 2023-10-03 23:41:33,577 WARNING [train.py:1204] (2/4) Exclude cut with ID _68777_210_4353_1_1535810390731_3117904_83-196459-0_sp1.1 from training. Duration: 0.963625 2023-10-03 23:41:36,201 WARNING [train.py:1204] (2/4) Exclude cut with ID _69884_210_9771_1_1535846392818_4166169_455-343812-0_sp1.1 from training. Duration: 0.92725 2023-10-03 23:41:36,202 WARNING [train.py:1204] (2/4) Exclude cut with ID _13916_210_9994_1_1530788341126_3763149_414-320174-0_sp1.1 from training. Duration: 0.92725 2023-10-03 23:41:36,206 WARNING [train.py:1204] (2/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_398-239819-0 from training. Duration: 0.99 2023-10-03 23:41:38,228 WARNING [train.py:1204] (2/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_199-232314-0_sp1.1 from training. Duration: 0.92725 2023-10-03 23:41:39,536 WARNING [train.py:1204] (2/4) Exclude cut with ID _2334_210_2667_1_1525608238880_4705129_459-104997-0 from training. Duration: 0.95 2023-10-03 23:41:43,427 INFO [train.py:1046] (2/4) Epoch 41, batch 4750, loss[loss=0.1572, simple_loss=0.2506, pruned_loss=0.03191, over 24305.00 frames. ], tot_loss[loss=0.1564, simple_loss=0.2369, pruned_loss=0.03793, over 4724899.60 frames. ], batch size: 74, lr: 2.47e-03, grad_scale: 8.0 2023-10-03 23:41:43,473 WARNING [train.py:1204] (2/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_441-209395-0_sp1.1 from training. Duration: 0.936375 2023-10-03 23:41:43,561 WARNING [train.py:1204] (2/4) Exclude cut with ID _27052_210_5986_1_1532329197187_3585009_320-80393-0_sp1.1 from training. Duration: 0.97275 2023-10-03 23:41:45,141 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.feed_forward1.hidden_balancer.prob, batch_count=1448240.0, ans=0.125 2023-10-03 23:41:46,484 WARNING [train.py:1204] (2/4) Exclude cut with ID _21783_210_11357_1_1531742458730_3762679_89-217253-0_sp1.1 from training. Duration: 0.97275 2023-10-03 23:41:47,750 WARNING [train.py:1204] (2/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_453-282588-0_sp1.1 from training. Duration: 0.8909375 2023-10-03 23:41:49,723 WARNING [train.py:1204] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_88-341718-0 from training. Duration: 0.66 2023-10-03 23:41:49,762 WARNING [train.py:1204] (2/4) Exclude cut with ID _43552_210_8913_1_1533726031059_3523429_230-3521-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 23:41:53,815 WARNING [train.py:1204] (2/4) Exclude cut with ID _9679_210_7497_1_1529129212906_4363929_512-313889-0 from training. Duration: 0.81 2023-10-03 23:41:55,409 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.feed_forward2.hidden_balancer.prob, batch_count=1448240.0, ans=0.125 2023-10-03 23:41:56,509 WARNING [train.py:1204] (2/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_194-252800-0_sp1.1 from training. Duration: 0.8818125 2023-10-03 23:41:56,549 WARNING [train.py:1204] (2/4) Exclude cut with ID _55209_210_1531_1_1534675828996_4597160_239-293541-0_sp1.1 from training. Duration: 0.963625 2023-10-03 23:41:57,942 WARNING [train.py:1204] (2/4) Exclude cut with ID _35799_210_5854_1_1533263487887_3529071_15-325131-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 23:41:59,566 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.feed_forward1.out_proj.dropout_p, batch_count=1448306.6666666667, ans=0.1 2023-10-03 23:42:02,702 WARNING [train.py:1204] (2/4) Exclude cut with ID _14100_210_8341_1_1530529320613_3678069_279-55760-0 from training. Duration: 0.93 2023-10-03 23:42:05,601 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_489-230703-0_sp1.1 from training. Duration: 0.77275 2023-10-03 23:42:08,980 WARNING [train.py:1204] (2/4) Exclude cut with ID _6694_210_4599_1_1527852637490_3965529_144-185051-0 from training. Duration: 0.96 2023-10-03 23:42:09,047 WARNING [train.py:1204] (2/4) Exclude cut with ID _28461_210_2253_1_1532771775189_3041299_1-228155-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 23:42:11,580 WARNING [train.py:1204] (2/4) Exclude cut with ID _32704_210_8954_1_1533017488840_6747370_686-252323-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 23:42:11,582 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_40896_210_15710_1_1533433956_7100802_1075-294633-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 23:42:12,904 WARNING [train.py:1204] (2/4) Exclude cut with ID _22083_210_8254_1_1531702862439_3678970_30-92879-0_sp1.1 from training. Duration: 0.97275 2023-10-03 23:42:14,275 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9245W0121-616888 from training. Duration: 0.896 2023-10-03 23:42:14,278 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6786_210_1388_1_1527839699_4194495_378-46070-0 from training. Duration: 0.72 2023-10-03 23:42:18,471 WARNING [train.py:1204] (2/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_317-175062-0 from training. Duration: 0.82 2023-10-03 23:42:18,765 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.bypass_mid.scale_min, batch_count=1448373.3333333333, ans=0.2 2023-10-03 23:42:20,621 WARNING [train.py:1204] (2/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_238-89390-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 23:42:22,758 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.1.encoder.layers.0.self_attn2.whiten, num_groups=1, num_channels=256, metric=13.69 vs. limit=22.5 2023-10-03 23:42:23,193 WARNING [train.py:1204] (2/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_524-310811-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 23:42:25,975 WARNING [train.py:1204] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_307-158462-0_sp0.9 from training. Duration: 0.7666875 2023-10-03 23:42:25,976 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9309W0191-640318 from training. Duration: 0.512 2023-10-03 23:42:25,980 WARNING [train.py:1204] (2/4) Exclude cut with ID _55153_210_2253_1_1534384786677_3162096_100-204617-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 23:42:28,799 WARNING [train.py:1204] (2/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_112-149213-0_sp1.1 from training. Duration: 0.863625 2023-10-03 23:42:30,224 WARNING [train.py:1204] (2/4) Exclude cut with ID _67831_210_12392_1_1535612464202_3092612_346-2148-0_sp1.1 from training. Duration: 0.8454375 2023-10-03 23:42:33,483 WARNING [train.py:1204] (2/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_51-117576-0_sp0.9 from training. Duration: 0.4444375 2023-10-03 23:42:33,510 WARNING [train.py:1204] (2/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_300-27235-0 from training. Duration: 0.87 2023-10-03 23:42:34,706 WARNING [train.py:1204] (2/4) Exclude cut with ID _40399_210_4353_1_1533305658194_3279406_195-116825-0_sp1.1 from training. Duration: 0.97275 2023-10-03 23:42:34,736 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7002_210_1794_1_1527939683_5157286_163-87552-0_sp0.9 from training. Duration: 0.9555625 2023-10-03 23:42:34,782 WARNING [train.py:1204] (2/4) Exclude cut with ID _19514_210_9259_1_1531530071330_5084230_560-237597-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 23:42:36,154 WARNING [train.py:1204] (2/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_41-234353-0_sp1.1 from training. Duration: 0.6181875 2023-10-03 23:42:36,180 WARNING [train.py:1204] (2/4) Exclude cut with ID _6274_210_2084_1_1527937583837_7494020_769-168302-0 from training. Duration: 0.71 2023-10-03 23:42:38,095 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.617e+02 1.957e+02 2.175e+02 2.344e+02 3.042e+02, threshold=4.350e+02, percent-clipped=0.0 2023-10-03 23:42:39,548 WARNING [train.py:1204] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_574-45541-0 from training. Duration: 0.65 2023-10-03 23:42:40,987 WARNING [train.py:1204] (2/4) Exclude cut with ID _50310_210_15488_1_1534068043345_3641080_262-67120-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 23:42:43,757 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_53071_210_16554_1_1534493434_1276796_35-11927-0_sp1.1 from training. Duration: 0.963625 2023-10-03 23:42:43,759 WARNING [train.py:1204] (2/4) Exclude cut with ID _3992_210_1747_1_1526644816031_3731060_107-251434-0 from training. Duration: 0.99 2023-10-03 23:42:45,110 WARNING [train.py:1204] (2/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_412-54488-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 23:42:45,202 WARNING [train.py:1204] (2/4) Exclude cut with ID _18516_210_9206_1_1531704653206_3692406_77-258490-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 23:42:46,595 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_40047_210_2290_1_1533290519_2374311_195-165693-0_sp0.9 from training. Duration: 0.988875 2023-10-03 23:42:47,893 WARNING [train.py:1204] (2/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_296-36971-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 23:42:49,197 WARNING [train.py:1204] (2/4) Exclude cut with ID _14970_210_15709_1_1530773543493_1715730_187-20259-0_sp1.1 from training. Duration: 0.8090625 2023-10-03 23:42:52,418 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_53373_210_5399_1_1534229471_4715695_135-305927-0_sp1.1 from training. Duration: 0.936375 2023-10-03 23:42:53,738 WARNING [train.py:1204] (2/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_60-18123-0 from training. Duration: 0.88 2023-10-03 23:42:53,814 WARNING [train.py:1204] (2/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_166-166990-0 from training. Duration: 0.96 2023-10-03 23:42:53,895 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_5438_210_1794_1_1527336687_2600009_233-58072-0 from training. Duration: 0.77 2023-10-03 23:42:55,331 WARNING [train.py:1204] (2/4) Exclude cut with ID _8833_210_5219_1_1528592665210_7553059_611-80313-0_sp0.9 from training. Duration: 0.711125 2023-10-03 23:42:55,492 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.nonlin_attention.balancer.prob, batch_count=1448573.3333333333, ans=0.125 2023-10-03 23:42:56,570 INFO [train.py:1046] (2/4) Epoch 41, batch 4800, loss[loss=0.1667, simple_loss=0.2506, pruned_loss=0.04135, over 24352.00 frames. ], tot_loss[loss=0.1576, simple_loss=0.2378, pruned_loss=0.03874, over 4719978.01 frames. ], batch size: 77, lr: 2.47e-03, grad_scale: 16.0 2023-10-03 23:42:56,643 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_53555_210_5632_1_1534595220_7232297_118-43239-0_sp1.1 from training. Duration: 0.936375 2023-10-03 23:42:56,731 WARNING [train.py:1204] (2/4) Exclude cut with ID _12369_210_4381_1_1530356078581_4388520_480-13146-0 from training. Duration: 0.85 2023-10-03 23:43:03,958 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_892-28814-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 23:43:05,891 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_24899_210_16554_1_1532046422_3827462_160-21255-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 23:43:11,966 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_154-316450-0_sp1.1 from training. Duration: 0.6909375 2023-10-03 23:43:12,067 WARNING [train.py:1204] (2/4) Exclude cut with ID _30196_210_1664_1_1533708496522_7074140_1225-301232-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 23:43:13,398 WARNING [train.py:1204] (2/4) Exclude cut with ID _28783_210_7943_1_1532671164808_7155479_414-208100-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 23:43:13,443 WARNING [train.py:1204] (2/4) Exclude cut with ID _19023_210_4353_1_1531378892552_5820780_83-254794-0 from training. Duration: 0.86 2023-10-03 23:43:14,824 WARNING [train.py:1204] (2/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_591-83752-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 23:43:14,875 WARNING [train.py:1204] (2/4) Exclude cut with ID _29877_210_6228_1_1532397791106_5276149_266-62291-0_sp1.1 from training. Duration: 0.7909375 2023-10-03 23:43:15,104 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.feed_forward1.hidden_balancer.prob, batch_count=1448640.0, ans=0.125 2023-10-03 23:43:16,306 WARNING [train.py:1204] (2/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_280-161633-0_sp1.1 from training. Duration: 0.663625 2023-10-03 23:43:20,405 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7543_210_6753_1_1528539468_333764_39-156634-0_sp1.1 from training. Duration: 0.97275 2023-10-03 23:43:22,200 WARNING [train.py:1204] (2/4) Exclude cut with ID _67499_210_17282_1_1535509805220_3692430_505-312248-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 23:43:22,221 WARNING [train.py:1204] (2/4) Exclude cut with ID _5294_210_1747_1_1527249643928_3696179_180-100713-0_sp0.9 from training. Duration: 0.9 2023-10-03 23:43:23,648 WARNING [train.py:1204] (2/4) Exclude cut with ID _26528_210_9070_1_1532570410842_3851310_508-148132-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 23:43:23,658 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_211-339622-0_sp1.1 from training. Duration: 0.4090625 2023-10-03 23:43:23,670 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_40896_210_15710_1_1533433956_7100802_339-138356-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 23:43:24,984 WARNING [train.py:1204] (2/4) Exclude cut with ID _27060_210_5986_1_1532674838922_3522549_211-19933-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 23:43:27,667 WARNING [train.py:1204] (2/4) Exclude cut with ID _60229_210_27767_1_1534838406158_3655350_457-117923-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 23:43:27,856 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.conv_module2.balancer2.prob, batch_count=1448706.6666666667, ans=0.125 2023-10-03 23:43:29,240 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.conv_module1.balancer2.prob, batch_count=1448706.6666666667, ans=0.125 2023-10-03 23:43:30,373 WARNING [train.py:1204] (2/4) Exclude cut with ID _55429_210_10626_1_1534467588527_7174143_460-141439-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 23:43:30,485 WARNING [train.py:1204] (2/4) Exclude cut with ID _59170_210_6721_1_1534983633960_4203335_263-10780-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 23:43:30,497 WARNING [train.py:1204] (2/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_872-264525-0_sp0.9 from training. Duration: 0.72225 2023-10-03 23:43:31,834 WARNING [train.py:1204] (2/4) Exclude cut with ID _7624_210_1531_1_1528109802198_4933101_418-349834-0_sp0.9 from training. Duration: 0.6666875 2023-10-03 23:43:33,133 WARNING [train.py:1204] (2/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_27-53998-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 23:43:34,554 WARNING [train.py:1204] (2/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_202-183922-0 from training. Duration: 0.85 2023-10-03 23:43:34,573 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7002_210_1794_1_1527939683_5157286_1-281264-0 from training. Duration: 0.44 2023-10-03 23:43:36,459 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_53753_210_7234_1_1534320003_3594148_493-9314-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 23:43:36,480 WARNING [train.py:1204] (2/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_217-95056-0_sp1.1 from training. Duration: 0.8909375 2023-10-03 23:43:36,536 WARNING [train.py:1204] (2/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_406-45086-0_sp1.1 from training. Duration: 0.72725 2023-10-03 23:43:38,271 WARNING [train.py:1204] (2/4) Exclude cut with ID _31649_210_6228_1_1532584608063_4091078_505-213821-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 23:43:38,279 WARNING [train.py:1204] (2/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_317-175062-0_sp0.9 from training. Duration: 0.911125 2023-10-03 23:43:38,432 WARNING [train.py:1204] (2/4) Exclude cut with ID _5444_210_2151_1_1527418808464_5312210_726-27165-0_sp1.1 from training. Duration: 0.6090625 2023-10-03 23:43:39,710 WARNING [train.py:1204] (2/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_494-170437-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 23:43:44,373 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.4.encoder.layers.2.self_attn_weights.whiten_keys, num_groups=4, num_channels=128, metric=1.73 vs. limit=6.0 2023-10-03 23:43:45,105 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_474-64054-0_sp1.1 from training. Duration: 0.936375 2023-10-03 23:43:46,713 WARNING [train.py:1204] (2/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_521-7556-0_sp1.1 from training. Duration: 0.97275 2023-10-03 23:43:48,094 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_269-348505-0_sp1.1 from training. Duration: 0.92725 2023-10-03 23:43:48,338 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.balancer1.prob, batch_count=1448773.3333333333, ans=0.125 2023-10-03 23:43:48,822 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.2.encoder.layers.2.conv_module1.whiten, num_groups=1, num_channels=384, metric=6.87 vs. limit=15.0 2023-10-03 23:43:52,851 WARNING [train.py:1204] (2/4) Exclude cut with ID _21878_210_3525_1_1531738772002_7264330_559-184862-0 from training. Duration: 0.94 2023-10-03 23:43:52,879 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_68702_210_23689_1_1535799577_3420068_92-76040-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 23:43:52,917 WARNING [train.py:1204] (2/4) Exclude cut with ID _22826_210_9994_1_1531908201522_3279169_227-102052-0_sp1.1 from training. Duration: 0.97275 2023-10-03 23:43:52,954 WARNING [train.py:1204] (2/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_718-250907-0_sp0.9 from training. Duration: 0.7666875 2023-10-03 23:43:54,322 WARNING [train.py:1204] (2/4) Exclude cut with ID _14069_210_5632_1_1530512692033_4803390_352-143891-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 23:43:55,855 WARNING [train.py:1204] (2/4) Exclude cut with ID _6267_210_2084_1_1527764658526_3894730_65-106602-0_sp1.1 from training. Duration: 0.936375 2023-10-03 23:43:57,194 WARNING [train.py:1204] (2/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_202-183922-0_sp0.9 from training. Duration: 0.9444375 2023-10-03 23:43:57,208 WARNING [train.py:1204] (2/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_465-268827-0_sp1.1 from training. Duration: 0.97275 2023-10-03 23:43:58,559 WARNING [train.py:1204] (2/4) Exclude cut with ID _12369_210_4381_1_1530356078581_4388520_58-29064-0_sp1.1 from training. Duration: 0.9 2023-10-03 23:43:58,594 WARNING [train.py:1204] (2/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_1-99680-0_sp0.9 from training. Duration: 0.7444375 2023-10-03 23:43:58,641 WARNING [train.py:1204] (2/4) Exclude cut with ID _13253_210_1925_1_1530757775819_3577873_191-144977-0_sp1.1 from training. Duration: 0.7090625 2023-10-03 23:44:01,262 WARNING [train.py:1204] (2/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_276-112684-0_sp1.1 from training. Duration: 0.92725 2023-10-03 23:44:01,274 WARNING [train.py:1204] (2/4) Exclude cut with ID _64639_210_5816_1_1535367619437_3528000_358-411-0_sp1.1 from training. Duration: 0.97275 2023-10-03 23:44:01,287 WARNING [train.py:1204] (2/4) Exclude cut with ID _31649_210_6228_1_1532584608063_4091078_106-21199-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 23:44:04,434 WARNING [train.py:1204] (2/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_901-79438-0 from training. Duration: 0.94 2023-10-03 23:44:05,694 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.4.encoder.layers.2.self_attn_weights.whiten_keys, num_groups=4, num_channels=128, metric=2.12 vs. limit=6.0 2023-10-03 23:44:06,392 WARNING [train.py:1204] (2/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_718-250907-0 from training. Duration: 0.69 2023-10-03 23:44:06,404 WARNING [train.py:1204] (2/4) Exclude cut with ID _5287_210_2084_1_1527418883040_3878670_363-194161-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 23:44:06,408 WARNING [train.py:1204] (2/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_586-321321-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 23:44:06,576 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.0.conv_module1.balancer1.prob, batch_count=1448840.0, ans=0.125 2023-10-03 23:44:07,779 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_68900_210_2087_1_1535713157_3708529_164-292661-0_sp1.1 from training. Duration: 0.963625 2023-10-03 23:44:07,780 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_26433_210_12108_1_1532157158_4056480_114-144878-0_sp1.1 from training. Duration: 0.97275 2023-10-03 23:44:10,447 INFO [train.py:1046] (2/4) Epoch 41, batch 4850, loss[loss=0.1533, simple_loss=0.2385, pruned_loss=0.03406, over 24325.00 frames. ], tot_loss[loss=0.1572, simple_loss=0.2373, pruned_loss=0.03858, over 4727581.57 frames. ], batch size: 61, lr: 2.47e-03, grad_scale: 16.0 2023-10-03 23:44:10,541 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_15517_210_6753_1_1530954008_3487336_96-341874-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 23:44:19,185 WARNING [train.py:1204] (2/4) Exclude cut with ID _14268_210_9206_1_1530622771285_4714099_637-86630-0 from training. Duration: 0.51 2023-10-03 23:44:20,540 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_28211_210_6388_1_1532861506_4523224_121-320092-0_sp1.1 from training. Duration: 0.92725 2023-10-03 23:44:23,544 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.balancer1.prob, batch_count=1448973.3333333333, ans=0.125 2023-10-03 23:44:26,305 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_141-89113-0_sp0.9 from training. Duration: 0.9555625 2023-10-03 23:44:27,663 WARNING [train.py:1204] (2/4) Exclude cut with ID _6789_210_1534_1_1527857937218_3118950_311-61262-0_sp1.1 from training. Duration: 0.5909375 2023-10-03 23:44:27,684 WARNING [train.py:1204] (2/4) Exclude cut with ID _58480_210_12392_1_1534811121111_3646160_389-200461-0_sp1.1 from training. Duration: 0.97275 2023-10-03 23:44:30,532 WARNING [train.py:1204] (2/4) Exclude cut with ID _41326_210_5854_1_1533778076509_3492080_28-156553-0_sp1.1 from training. Duration: 0.92725 2023-10-03 23:44:31,952 WARNING [train.py:1204] (2/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_577-287150-0_sp1.1 from training. Duration: 0.7454375 2023-10-03 23:44:33,342 WARNING [train.py:1204] (2/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_40-129756-0_sp1.1 from training. Duration: 0.736375 2023-10-03 23:44:33,354 WARNING [train.py:1204] (2/4) Exclude cut with ID _60113_210_16271_1_1535164212481_4386079_490-280290-0 from training. Duration: 0.98 2023-10-03 23:44:35,335 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder_embed.dropout.p, batch_count=1448973.3333333333, ans=0.1 2023-10-03 23:44:37,278 WARNING [train.py:1204] (2/4) Exclude cut with ID _58059_210_5854_1_1534901471149_3164808_34-39905-0_sp1.1 from training. Duration: 0.963625 2023-10-03 23:44:39,105 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.3.encoder.layers.2.feed_forward3.out_whiten, num_groups=1, num_channels=512, metric=7.02 vs. limit=15.0 2023-10-03 23:44:39,906 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_791-197891-0_sp1.1 from training. Duration: 0.763625 2023-10-03 23:44:39,920 WARNING [train.py:1204] (2/4) Exclude cut with ID _6789_210_1534_1_1527857937218_3118950_262-48853-0_sp1.1 from training. Duration: 0.5090625 2023-10-03 23:44:41,713 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_2655_210_2087_1_1525688959_3897400_52-279899-0_sp0.9 from training. Duration: 0.7666875 2023-10-03 23:44:41,717 WARNING [train.py:1204] (2/4) Exclude cut with ID _14884_210_2252_1_1530707459689_5561250_114-201399-0 from training. Duration: 0.76 2023-10-03 23:44:45,796 WARNING [train.py:1204] (2/4) Exclude cut with ID _19023_210_4353_1_1531378892552_5820780_83-254794-0_sp0.9 from training. Duration: 0.9555625 2023-10-03 23:44:45,811 WARNING [train.py:1204] (2/4) Exclude cut with ID _53489_210_6228_1_1534298357169_3835970_10-297919-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 23:44:48,820 WARNING [train.py:1204] (2/4) Exclude cut with ID _44585_210_18628_1_1533605350387_4518199_418-3055-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 23:44:48,839 WARNING [train.py:1204] (2/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_310-109461-0 from training. Duration: 0.8 2023-10-03 23:44:48,883 WARNING [train.py:1204] (2/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_235-262124-0 from training. Duration: 0.67 2023-10-03 23:44:50,279 WARNING [train.py:1204] (2/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_195-167011-0_sp1.1 from training. Duration: 0.6818125 2023-10-03 23:44:59,183 WARNING [train.py:1204] (2/4) Exclude cut with ID _7852_210_5219_1_1528286471316_8139400_188-55162-0_sp1.1 from training. Duration: 0.936375 2023-10-03 23:44:59,239 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_35361_210_4238_1_1533262810_7793476_675-87012-0 from training. Duration: 0.89 2023-10-03 23:45:00,601 WARNING [train.py:1204] (2/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_465-81427-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 23:45:01,942 WARNING [train.py:1204] (2/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_597-154640-0_sp1.1 from training. Duration: 0.8181875 2023-10-03 23:45:03,256 WARNING [train.py:1204] (2/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_186-309161-0_sp0.9 from training. Duration: 0.6 2023-10-03 23:45:04,520 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.665e+02 2.005e+02 2.187e+02 2.519e+02 4.301e+02, threshold=4.375e+02, percent-clipped=0.0 2023-10-03 23:45:04,646 WARNING [train.py:1204] (2/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_546-119312-0 from training. Duration: 0.99 2023-10-03 23:45:04,653 WARNING [train.py:1204] (2/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_330-240060-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 23:45:04,725 WARNING [train.py:1204] (2/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_477-255782-0 from training. Duration: 0.82 2023-10-03 23:45:04,752 WARNING [train.py:1204] (2/4) Exclude cut with ID _14970_210_15709_1_1530773543493_1715730_150-248939-0_sp1.1 from training. Duration: 0.97275 2023-10-03 23:45:06,617 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_556-206615-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 23:45:06,903 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.out_combiner.scale_min, batch_count=1449106.6666666667, ans=0.2 2023-10-03 23:45:07,932 WARNING [train.py:1204] (2/4) Exclude cut with ID _3465_210_3525_1_1526175667060_3574049_308-231735-0 from training. Duration: 0.73 2023-10-03 23:45:17,225 WARNING [train.py:1204] (2/4) Exclude cut with ID _27485_210_4916_1_1532739191677_3668269_66-236571-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 23:45:22,669 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_2221_210_1988_1_1525597051_4935541_415-296882-0_sp0.9 from training. Duration: 0.8555625 2023-10-03 23:45:22,676 WARNING [train.py:1204] (2/4) Exclude cut with ID _64521_210_14699_1_1535243427259_3815328_231-78630-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 23:45:22,961 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.bypass.skip_rate, batch_count=1449240.0, ans=0.04949747468305833 2023-10-03 23:45:24,432 INFO [train.py:1046] (2/4) Epoch 41, batch 4900, loss[loss=0.1618, simple_loss=0.2265, pruned_loss=0.04853, over 23547.00 frames. ], tot_loss[loss=0.1568, simple_loss=0.2368, pruned_loss=0.03836, over 4736864.90 frames. ], batch size: 256, lr: 2.47e-03, grad_scale: 16.0 2023-10-03 23:45:28,515 WARNING [train.py:1204] (2/4) Exclude cut with ID _13942_210_9988_1_1530604906036_3733360_499-55676-0 from training. Duration: 0.62 2023-10-03 23:45:28,518 WARNING [train.py:1204] (2/4) Exclude cut with ID _49376_210_5986_1_1533985278732_3370740_301-154763-0_sp1.1 from training. Duration: 0.8909375 2023-10-03 23:45:29,676 INFO [scaling.py:1022] (2/4) Whitening: name=encoder_embed.out_whiten, num_groups=1, num_channels=192, metric=6.98 vs. limit=8.0 2023-10-03 23:45:30,332 INFO [scaling.py:1118] (2/4) WithLoss: name=encoder.encoders.4.encoder.layers.0.self_attn_weights, loss-sum=0.000e+00 2023-10-03 23:45:34,137 WARNING [train.py:1204] (2/4) Exclude cut with ID _27485_210_4916_1_1532739191677_3668269_255-216698-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 23:45:34,219 WARNING [train.py:1204] (2/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_137-334143-0_sp1.1 from training. Duration: 0.97275 2023-10-03 23:45:34,239 WARNING [train.py:1204] (2/4) Exclude cut with ID _6456_210_1534_1_1527681634212_3181049_215-309359-0_sp1.1 from training. Duration: 0.8 2023-10-03 23:45:37,535 WARNING [train.py:1204] (2/4) Exclude cut with ID _65640_210_6310_1_1535368201445_3173250_209-13008-0 from training. Duration: 0.99 2023-10-03 23:45:37,834 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.ff2_skip_rate, batch_count=1449306.6666666667, ans=0.0 2023-10-03 23:45:40,426 WARNING [train.py:1204] (2/4) Exclude cut with ID _12369_210_4381_1_1530356078581_4388520_58-29064-0 from training. Duration: 0.99 2023-10-03 23:45:42,603 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.conv_module1.balancer1.prob, batch_count=1449306.6666666667, ans=0.125 2023-10-03 23:45:45,325 WARNING [train.py:1204] (2/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_410-287857-0 from training. Duration: 0.93 2023-10-03 23:45:47,060 WARNING [train.py:1204] (2/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_153-153537-0 from training. Duration: 0.91 2023-10-03 23:45:47,121 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7544_210_6753_1_1528587722_3576686_172-171750-0_sp0.9 from training. Duration: 0.72225 2023-10-03 23:45:47,153 WARNING [train.py:1204] (2/4) Exclude cut with ID _64521_210_14699_1_1535243427259_3815328_464-183853-0_sp1.1 from training. Duration: 0.97275 2023-10-03 23:45:47,173 WARNING [train.py:1204] (2/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_385-210308-0_sp1.1 from training. Duration: 0.963625 2023-10-03 23:45:47,186 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_46013_210_7234_1_1533776362_3744032_354-261865-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 23:45:48,400 WARNING [train.py:1204] (2/4) Exclude cut with ID _3067_210_2667_1_1526212974640_4503168_250-285937-0_sp1.1 from training. Duration: 0.72725 2023-10-03 23:45:48,467 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_40036_210_13405_1_1533648647_1315388_48-146016-0 from training. Duration: 0.93 2023-10-03 23:45:51,184 WARNING [train.py:1204] (2/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_280-161633-0 from training. Duration: 0.73 2023-10-03 23:45:51,228 WARNING [train.py:1204] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_308-161067-0_sp0.9 from training. Duration: 0.7333125 2023-10-03 23:45:53,205 WARNING [train.py:1204] (2/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_68-212801-0_sp0.9 from training. Duration: 0.811125 2023-10-03 23:45:53,262 WARNING [train.py:1204] (2/4) Exclude cut with ID _6789_210_1534_1_1527857937218_3118950_311-61262-0_sp0.9 from training. Duration: 0.72225 2023-10-03 23:45:57,350 WARNING [train.py:1204] (2/4) Exclude cut with ID _14632_210_9988_1_1530691279392_3923200_102-204477-0_sp1.1 from training. Duration: 0.8818125 2023-10-03 23:45:57,405 WARNING [train.py:1204] (2/4) Exclude cut with ID _65242_210_18628_1_1535369393717_3823149_290-117517-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 23:45:58,756 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_69630_210_7234_1_1535788670_910989_35-337140-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 23:45:58,764 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_5618_210_1874_1_1527311008_5851_1-214501-0 from training. Duration: 0.689 2023-10-03 23:46:00,136 WARNING [train.py:1204] (2/4) Exclude cut with ID _8064_210_5399_1_1528372299291_4282973_168-50337-0_sp0.9 from training. Duration: 0.9333125 2023-10-03 23:46:01,501 WARNING [train.py:1204] (2/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_185-14438-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 23:46:01,523 WARNING [train.py:1204] (2/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_61-327062-0 from training. Duration: 0.81 2023-10-03 23:46:01,528 WARNING [train.py:1204] (2/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_98-741-0 from training. Duration: 0.8 2023-10-03 23:46:07,582 WARNING [train.py:1204] (2/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_312-204798-0 from training. Duration: 0.93 2023-10-03 23:46:07,860 INFO [scaling.py:1118] (2/4) WithLoss: name=encoder.encoders.4.encoder.layers.2.self_attn_weights, loss-sum=0.000e+00 2023-10-03 23:46:10,370 WARNING [train.py:1204] (2/4) Exclude cut with ID _14998_210_5219_1_1530705582080_7474970_787-294416-0_sp0.9 from training. Duration: 0.911125 2023-10-03 23:46:11,735 WARNING [train.py:1204] (2/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_140-181176-0_sp1.1 from training. Duration: 0.836375 2023-10-03 23:46:11,786 WARNING [train.py:1204] (2/4) Exclude cut with ID _14687_210_5854_1_1530703838259_3664889_115-239046-0_sp1.1 from training. Duration: 0.6090625 2023-10-03 23:46:11,829 WARNING [train.py:1204] (2/4) Exclude cut with ID _56423_210_5854_1_1534896061049_3165969_370-230614-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 23:46:13,137 WARNING [train.py:1204] (2/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_406-275649-0_sp0.9 from training. Duration: 0.4666875 2023-10-03 23:46:13,148 WARNING [train.py:1204] (2/4) Exclude cut with ID _15150_210_8341_1_1530752411803_7256384_22-138274-0_sp1.1 from training. Duration: 0.87275 2023-10-03 23:46:13,194 WARNING [train.py:1204] (2/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_125-125818-0 from training. Duration: 0.96 2023-10-03 23:46:15,254 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_4399_210_1874_1_1526705485_2400757_220-47974-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 23:46:17,254 WARNING [train.py:1204] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_308-161067-0_sp1.1 from training. Duration: 0.6 2023-10-03 23:46:18,632 WARNING [train.py:1204] (2/4) Exclude cut with ID _53489_210_6228_1_1534298357169_3835970_102-57645-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 23:46:21,411 WARNING [train.py:1204] (2/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_178-177582-0 from training. Duration: 0.74 2023-10-03 23:46:22,721 WARNING [train.py:1204] (2/4) Exclude cut with ID _14349_210_8341_1_1530615710818_3442470_170-336541-0_sp1.1 from training. Duration: 0.7818125 2023-10-03 23:46:22,786 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_521-214276-0_sp0.9 from training. Duration: 0.488875 2023-10-03 23:46:24,134 WARNING [train.py:1204] (2/4) Exclude cut with ID _66070_210_7640_1_1535441393096_7805310_489-292631-0 from training. Duration: 0.79 2023-10-03 23:46:30,249 WARNING [train.py:1204] (2/4) Exclude cut with ID _14069_210_5632_1_1530512692033_4803390_438-340023-0_sp1.1 from training. Duration: 0.936375 2023-10-03 23:46:31,565 WARNING [train.py:1204] (2/4) Exclude cut with ID _7852_210_5219_1_1528286471316_8139400_242-105336-0_sp1.1 from training. Duration: 0.6545625 2023-10-03 23:46:32,991 WARNING [train.py:1204] (2/4) Exclude cut with ID _3867_210_1883_1_1526641200575_4475830_272-215563-0 from training. Duration: 0.57 2023-10-03 23:46:33,000 WARNING [train.py:1204] (2/4) Exclude cut with ID _14368_210_4220_1_1530532714870_3595529_143-117421-0_sp0.9 from training. Duration: 0.6444375 2023-10-03 23:46:33,007 WARNING [train.py:1204] (2/4) Exclude cut with ID _9235_210_5219_1_1528891154253_8046120_30-326681-0_sp1.1 from training. Duration: 0.7545625 2023-10-03 23:46:34,848 WARNING [train.py:1204] (2/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_538-218448-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 23:46:37,741 WARNING [train.py:1204] (2/4) Exclude cut with ID _33640_210_2655_1_1533117605479_4855469_60-192970-0_sp1.1 from training. Duration: 0.963625 2023-10-03 23:46:37,749 WARNING [train.py:1204] (2/4) Exclude cut with ID _65585_210_12210_1_1535450451584_3764098_112-294301-0_sp1.1 from training. Duration: 0.8 2023-10-03 23:46:37,772 WARNING [train.py:1204] (2/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_643-37412-0_sp1.1 from training. Duration: 0.936375 2023-10-03 23:46:37,795 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_9302_210_2550_1_1528889962_4655416_638-301620-0_sp1.1 from training. Duration: 0.42725 2023-10-03 23:46:39,023 INFO [train.py:1046] (2/4) Epoch 41, batch 4950, loss[loss=0.1356, simple_loss=0.215, pruned_loss=0.02811, over 24310.00 frames. ], tot_loss[loss=0.1554, simple_loss=0.2354, pruned_loss=0.03775, over 4739727.22 frames. ], batch size: 56, lr: 2.47e-03, grad_scale: 16.0 2023-10-03 23:46:39,186 WARNING [train.py:1204] (2/4) Exclude cut with ID _3867_210_1883_1_1526641200575_4475830_272-215563-0_sp1.1 from training. Duration: 0.5181875 2023-10-03 23:46:42,306 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_53679_210_4852_1_1534556726_4186141_358-154143-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 23:46:42,318 WARNING [train.py:1204] (2/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_841-341679-0_sp0.9 from training. Duration: 0.6444375 2023-10-03 23:46:45,044 WARNING [train.py:1204] (2/4) Exclude cut with ID _7160_210_1876_1_1527940923231_3703799_302-150728-0 from training. Duration: 0.78 2023-10-03 23:46:45,079 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_125-203105-0 from training. Duration: 0.8 2023-10-03 23:46:45,102 WARNING [train.py:1204] (2/4) Exclude cut with ID _5585_210_2667_1_1527418813324_8007340_89-332072-0_sp0.9 from training. Duration: 0.711125 2023-10-03 23:46:46,520 WARNING [train.py:1204] (2/4) Exclude cut with ID _3992_210_1747_1_1526644816031_3731060_36-147855-0 from training. Duration: 0.78 2023-10-03 23:46:46,544 WARNING [train.py:1204] (2/4) Exclude cut with ID _59256_210_5986_1_1535076085997_3589120_172-239045-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 23:46:46,552 WARNING [train.py:1204] (2/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_472-216663-0_sp1.1 from training. Duration: 0.836375 2023-10-03 23:46:48,515 WARNING [train.py:1204] (2/4) Exclude cut with ID _36512_210_6987_1_1533033143998_3126069_181-306500-0_sp1.1 from training. Duration: 0.863625 2023-10-03 23:46:48,532 WARNING [train.py:1204] (2/4) Exclude cut with ID _56524_210_9642_1_1534572011779_3619170_492-95008-0_sp1.1 from training. Duration: 0.97275 2023-10-03 23:46:51,270 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_33348_210_5632_1_1533086912_3324030_119-90326-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 23:46:51,317 WARNING [train.py:1204] (2/4) Exclude cut with ID _14368_210_4220_1_1530532714870_3595529_231-137892-0_sp1.1 from training. Duration: 0.9 2023-10-03 23:46:53,331 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8453_210_4211_1_1528502940_8562730_596-306820-0_sp1.1 from training. Duration: 0.8818125 2023-10-03 23:46:54,677 WARNING [train.py:1204] (2/4) Exclude cut with ID _27052_210_5986_1_1532329197187_3585009_50-242750-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 23:46:54,861 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.conv_module1.balancer1.prob, batch_count=1449640.0, ans=0.125 2023-10-03 23:46:56,094 WARNING [train.py:1204] (2/4) Exclude cut with ID _13913_210_9994_1_1530579615187_3383897_358-98435-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 23:46:57,347 WARNING [train.py:1204] (2/4) Exclude cut with ID _27013_210_1389_1_1532437173240_3252649_399-229778-0_sp1.1 from training. Duration: 0.963625 2023-10-03 23:47:00,182 WARNING [train.py:1204] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_185-174805-0_sp0.9 from training. Duration: 0.7555625 2023-10-03 23:47:00,347 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.balancer1.prob, batch_count=1449640.0, ans=0.125 2023-10-03 23:47:01,669 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.attention_skip_rate, batch_count=1449640.0, ans=0.0 2023-10-03 23:47:05,916 WARNING [train.py:1204] (2/4) Exclude cut with ID _65585_210_12210_1_1535450451584_3764098_196-221014-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 23:47:07,358 WARNING [train.py:1204] (2/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_202-293523-0_sp1.1 from training. Duration: 0.6545625 2023-10-03 23:47:07,480 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8275_210_4198_1_1528455320_3892571_213-127634-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 23:47:07,524 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_40896_210_15710_1_1533433956_7100802_921-231678-0_sp1.1 from training. Duration: 0.97275 2023-10-03 23:47:08,947 WARNING [train.py:1204] (2/4) Exclude cut with ID _38020_210_5986_1_1533112252259_7006089_493-102745-0_sp1.1 from training. Duration: 0.8909375 2023-10-03 23:47:12,225 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6846_210_6753_1_1527850767_4295158_2-315073-0 from training. Duration: 0.88 2023-10-03 23:47:12,293 WARNING [train.py:1204] (2/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_249-281349-0 from training. Duration: 0.85 2023-10-03 23:47:14,890 WARNING [train.py:1204] (2/4) Exclude cut with ID _18513_210_9206_1_1531618215223_3862620_314-83429-0_sp1.1 from training. Duration: 0.97275 2023-10-03 23:47:16,565 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.conv_module2.balancer1.prob, batch_count=1449706.6666666667, ans=0.125 2023-10-03 23:47:17,709 WARNING [train.py:1204] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_215-202973-0_sp0.9 from training. Duration: 0.97775 2023-10-03 23:47:17,718 WARNING [train.py:1204] (2/4) Exclude cut with ID _7719_210_3525_1_1528456153163_4296960_88-250259-0_sp1.1 from training. Duration: 0.763625 2023-10-03 23:47:19,581 WARNING [train.py:1204] (2/4) Exclude cut with ID _7606_210_4915_1_1528894253277_5080690_301-10732-0_sp1.1 from training. Duration: 0.736375 2023-10-03 23:47:19,590 WARNING [train.py:1204] (2/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_818-11907-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 23:47:20,979 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_5959_210_1794_1_1527400400_3445726_412-44052-0_sp1.1 from training. Duration: 0.7 2023-10-03 23:47:21,198 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.balancer1.prob, batch_count=1449706.6666666667, ans=0.125 2023-10-03 23:47:23,731 WARNING [train.py:1204] (2/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_6-279195-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 23:47:25,137 WARNING [train.py:1204] (2/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_252-216674-0_sp0.9 from training. Duration: 0.6 2023-10-03 23:47:26,955 WARNING [train.py:1204] (2/4) Exclude cut with ID _14368_210_4220_1_1530532714870_3595529_144-3287-0_sp1.1 from training. Duration: 0.6090625 2023-10-03 23:47:28,407 WARNING [train.py:1204] (2/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_342-3879-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 23:47:28,436 WARNING [train.py:1204] (2/4) Exclude cut with ID _33640_210_2655_1_1533117605479_4855469_584-65630-0_sp1.1 from training. Duration: 0.97275 2023-10-03 23:47:28,502 WARNING [train.py:1204] (2/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_37-189262-0 from training. Duration: 0.98 2023-10-03 23:47:28,531 WARNING [train.py:1204] (2/4) Exclude cut with ID _9389_210_1664_1_1529217337301_5289269_379-128794-0_sp0.9 from training. Duration: 0.9444375 2023-10-03 23:47:31,283 WARNING [train.py:1204] (2/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_119-207162-0_sp1.1 from training. Duration: 0.6454375 2023-10-03 23:47:33,935 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.649e+02 1.903e+02 2.164e+02 2.599e+02 4.348e+02, threshold=4.328e+02, percent-clipped=0.0 2023-10-03 23:47:34,111 WARNING [train.py:1204] (2/4) Exclude cut with ID _2941_210_2577_1_1526199673150_2489759_200-333643-0_sp1.1 from training. Duration: 0.936375 2023-10-03 23:47:35,430 WARNING [train.py:1204] (2/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_479-125611-0_sp1.1 from training. Duration: 0.87275 2023-10-03 23:47:37,113 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_3475_210_2028_1_1526193578_3622569_64-297072-0_sp1.1 from training. Duration: 0.82725 2023-10-03 23:47:37,163 WARNING [train.py:1204] (2/4) Exclude cut with ID _53565_210_10635_1_1534337218335_3820259_139-346291-0_sp1.1 from training. Duration: 0.97275 2023-10-03 23:47:38,434 WARNING [train.py:1204] (2/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_588-125165-0_sp1.1 from training. Duration: 0.7545625 2023-10-03 23:47:38,505 WARNING [train.py:1204] (2/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_938-205135-0_sp1.1 from training. Duration: 0.7909375 2023-10-03 23:47:39,972 WARNING [train.py:1204] (2/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_216-48707-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 23:47:41,807 WARNING [train.py:1204] (2/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_477-255782-0_sp1.1 from training. Duration: 0.7454375 2023-10-03 23:47:41,845 WARNING [train.py:1204] (2/4) Exclude cut with ID _16909_210_5724_1_1531292079912_4118919_583-92842-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 23:47:43,261 WARNING [train.py:1204] (2/4) Exclude cut with ID _60860_210_8254_1_1534896015875_3557169_245-256666-0 from training. Duration: 0.97 2023-10-03 23:47:47,342 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_45399_210_4852_1_1533624725_7579737_349-116173-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 23:47:53,055 INFO [train.py:1046] (2/4) Epoch 41, batch 5000, loss[loss=0.1352, simple_loss=0.2151, pruned_loss=0.02763, over 23360.00 frames. ], tot_loss[loss=0.1547, simple_loss=0.2346, pruned_loss=0.03745, over 4728993.37 frames. ], batch size: 119, lr: 2.47e-03, grad_scale: 16.0 2023-10-03 23:47:53,146 WARNING [train.py:1204] (2/4) Exclude cut with ID _8699_210_2069_1_1528630346831_3960409_155-39766-0 from training. Duration: 0.78 2023-10-03 23:47:53,160 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_86-16881-0_sp1.1 from training. Duration: 0.52725 2023-10-03 23:48:00,730 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_14056_210_4163_1_1530604483_7572827_191-159384-0_sp1.1 from training. Duration: 0.97275 2023-10-03 23:48:00,738 WARNING [train.py:1204] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_307-158462-0_sp1.1 from training. Duration: 0.62725 2023-10-03 23:48:00,867 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7544_210_6753_1_1528591327_1471426_33-346031-0 from training. Duration: 0.61 2023-10-03 23:48:02,282 WARNING [train.py:1204] (2/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_112-149213-0 from training. Duration: 0.95 2023-10-03 23:48:04,966 WARNING [train.py:1204] (2/4) Exclude cut with ID _55844_210_2346_1_1534471212569_7625707_925-191342-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 23:48:05,107 WARNING [train.py:1204] (2/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_87-278686-0 from training. Duration: 0.77 2023-10-03 23:48:06,414 WARNING [train.py:1204] (2/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_415-78792-0_sp1.1 from training. Duration: 0.736375 2023-10-03 23:48:06,424 WARNING [train.py:1204] (2/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_107-332398-0_sp0.9 from training. Duration: 0.8666875 2023-10-03 23:48:06,516 WARNING [train.py:1204] (2/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_72-251255-0 from training. Duration: 0.94 2023-10-03 23:48:06,538 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_431-227273-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 23:48:08,344 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7544_210_6753_1_1528587000_691468_87-178599-0_sp0.9 from training. Duration: 0.9666875 2023-10-03 23:48:08,451 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder_embed.convnext.hidden_balancer.prob, batch_count=1449973.3333333333, ans=0.125 2023-10-03 23:48:09,677 WARNING [train.py:1204] (2/4) Exclude cut with ID _13916_210_9994_1_1530788341126_3763149_60-191556-0 from training. Duration: 0.61 2023-10-03 23:48:09,679 WARNING [train.py:1204] (2/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_746-164782-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 23:48:09,716 WARNING [train.py:1204] (2/4) Exclude cut with ID _33640_210_2655_1_1533117605479_4855469_596-21066-0_sp1.1 from training. Duration: 0.92725 2023-10-03 23:48:11,146 WARNING [train.py:1204] (2/4) Exclude cut with ID _40402_210_18219_1_1533522324115_2953150_320-14078-0 from training. Duration: 0.95 2023-10-03 23:48:11,171 WARNING [train.py:1204] (2/4) Exclude cut with ID _29877_210_6228_1_1532397791106_5276149_266-62291-0 from training. Duration: 0.87 2023-10-03 23:48:13,060 WARNING [train.py:1204] (2/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_176-56120-0_sp0.9 from training. Duration: 0.92225 2023-10-03 23:48:13,106 WARNING [train.py:1204] (2/4) Exclude cut with ID _6275_210_2084_1_1528110062213_7669049_91-26919-0 from training. Duration: 0.97 2023-10-03 23:48:13,117 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_224-322394-0_sp0.9 from training. Duration: 0.7333125 2023-10-03 23:48:14,439 WARNING [train.py:1204] (2/4) Exclude cut with ID _44982_210_1511_1_1534312820108_3726029_586-297059-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 23:48:15,713 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_196-289637-0_sp1.1 from training. Duration: 0.6818125 2023-10-03 23:48:15,714 WARNING [train.py:1204] (2/4) Exclude cut with ID _18513_210_9206_1_1531618215223_3862620_402-5957-0 from training. Duration: 0.99 2023-10-03 23:48:15,732 WARNING [train.py:1204] (2/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_81-300212-0 from training. Duration: 0.6 2023-10-03 23:48:17,181 WARNING [train.py:1204] (2/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_166-128941-0 from training. Duration: 0.54 2023-10-03 23:48:17,196 WARNING [train.py:1204] (2/4) Exclude cut with ID _58059_210_5854_1_1534901471149_3164808_57-309660-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 23:48:18,506 WARNING [train.py:1204] (2/4) Exclude cut with ID _60621_210_17071_1_1535013053530_3671739_73-155814-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 23:48:19,959 WARNING [train.py:1204] (2/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_726-270780-0 from training. Duration: 0.86 2023-10-03 23:48:21,248 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8853_210_3273_1_1528633899_818968_51-187685-0_sp1.1 from training. Duration: 0.62725 2023-10-03 23:48:23,249 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_315-212794-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 23:48:23,328 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_53373_210_5399_1_1534229471_4715695_314-122795-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 23:48:23,482 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.ff3_skip_rate, batch_count=1450040.0, ans=0.0 2023-10-03 23:48:24,656 WARNING [train.py:1204] (2/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_187-221566-0_sp0.9 from training. Duration: 0.47775 2023-10-03 23:48:25,233 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.4.encoder.layers.1.self_attn2.whiten, num_groups=1, num_channels=384, metric=11.84 vs. limit=22.5 2023-10-03 23:48:27,358 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_12868_210_6753_1_1530702479_3610564_133-564-0 from training. Duration: 0.79 2023-10-03 23:48:29,182 WARNING [train.py:1204] (2/4) Exclude cut with ID _55153_210_2253_1_1534384786677_3162096_255-228344-0_sp1.1 from training. Duration: 0.936375 2023-10-03 23:48:30,486 WARNING [train.py:1204] (2/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_383-165234-0_sp1.1 from training. Duration: 0.8545625 2023-10-03 23:48:34,972 WARNING [train.py:1204] (2/4) Exclude cut with ID IC0677W0056-334889 from training. Duration: 0.768 2023-10-03 23:48:35,053 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder_embed.dropout.p, batch_count=1450040.0, ans=0.1 2023-10-03 23:48:36,713 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.feed_forward2.hidden_balancer.prob, batch_count=1450106.6666666667, ans=0.125 2023-10-03 23:48:37,822 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_14953_210_2151_1_1530701710_5616165_710-165400-0_sp0.9 from training. Duration: 0.9666875 2023-10-03 23:48:39,269 WARNING [train.py:1204] (2/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_355-236205-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 23:48:39,271 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_34450_210_9547_1_1533099443_4109050_503-246893-0_sp1.1 from training. Duration: 0.963625 2023-10-03 23:48:42,521 WARNING [train.py:1204] (2/4) Exclude cut with ID _43552_210_8913_1_1533726031059_3523429_25-120161-0 from training. Duration: 0.98 2023-10-03 23:48:43,784 WARNING [train.py:1204] (2/4) Exclude cut with ID _66112_210_10635_1_1535590236305_3801484_205-311930-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 23:48:43,813 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_48049_210_5632_1_1533864553_3519937_185-281218-0_sp1.1 from training. Duration: 0.92725 2023-10-03 23:48:43,843 WARNING [train.py:1204] (2/4) Exclude cut with ID _48782_210_12658_1_1534467701544_2300422_191-332415-0_sp1.1 from training. Duration: 0.97275 2023-10-03 23:48:45,758 WARNING [train.py:1204] (2/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_907-151398-0_sp1.1 from training. Duration: 0.336375 2023-10-03 23:48:47,099 WARNING [train.py:1204] (2/4) Exclude cut with ID _67499_210_17282_1_1535509805220_3692430_20-179346-0_sp1.1 from training. Duration: 0.8818125 2023-10-03 23:48:49,818 WARNING [train.py:1204] (2/4) Exclude cut with ID _14998_210_5219_1_1530705582080_7474970_441-91466-0_sp1.1 from training. Duration: 0.8818125 2023-10-03 23:48:51,165 WARNING [train.py:1204] (2/4) Exclude cut with ID _46681_210_9642_1_1533893005571_7603359_786-164007-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 23:48:55,447 WARNING [train.py:1204] (2/4) Exclude cut with ID _14970_210_15709_1_1530773543493_1715730_187-20259-0 from training. Duration: 0.89 2023-10-03 23:48:59,092 WARNING [train.py:1204] (2/4) Exclude cut with ID _12734_210_8341_1_1530183614296_3891929_383-339075-0_sp1.1 from training. Duration: 0.963625 2023-10-03 23:49:07,308 INFO [train.py:1046] (2/4) Epoch 41, batch 5050, loss[loss=0.1649, simple_loss=0.2436, pruned_loss=0.04315, over 23203.00 frames. ], tot_loss[loss=0.1553, simple_loss=0.2355, pruned_loss=0.03756, over 4726084.24 frames. ], batch size: 119, lr: 2.47e-03, grad_scale: 8.0 2023-10-03 23:49:07,396 WARNING [train.py:1204] (2/4) Exclude cut with ID _37086_210_9038_1_1532999540891_7021250_834-322225-0_sp1.1 from training. Duration: 0.92725 2023-10-03 23:49:08,854 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_10987_210_4163_1_1529760555_4123902_56-290559-0_sp1.1 from training. Duration: 0.963625 2023-10-03 23:49:08,867 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_92-246270-0_sp0.9 from training. Duration: 0.9444375 2023-10-03 23:49:10,119 WARNING [train.py:1204] (2/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_56-13561-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 23:49:10,144 WARNING [train.py:1204] (2/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_393-57888-0_sp1.1 from training. Duration: 0.5818125 2023-10-03 23:49:10,173 WARNING [train.py:1204] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_168-32906-0_sp1.1 from training. Duration: 0.62725 2023-10-03 23:49:10,217 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_51225_210_3601_1_1534400634_2327383_127-207553-0_sp1.1 from training. Duration: 0.963625 2023-10-03 23:49:13,652 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.conv_module1.balancer2.prob, batch_count=1450240.0, ans=0.125 2023-10-03 23:49:15,646 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.bypass_mid.scale_min, batch_count=1450240.0, ans=0.2 2023-10-03 23:49:16,792 WARNING [train.py:1204] (2/4) Exclude cut with ID _58583_210_5236_1_1534726835266_3574249_494-203521-0_sp1.1 from training. Duration: 0.963625 2023-10-03 23:49:16,809 WARNING [train.py:1204] (2/4) Exclude cut with ID _49376_210_5986_1_1533985278732_3370740_354-344695-0 from training. Duration: 0.99 2023-10-03 23:49:18,108 WARNING [train.py:1204] (2/4) Exclude cut with ID _8021_210_7994_1_1528376451358_3653069_179-45206-0_sp1.1 from training. Duration: 0.7818125 2023-10-03 23:49:20,941 WARNING [train.py:1204] (2/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_742-154017-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 23:49:22,312 WARNING [train.py:1204] (2/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_222-198127-0_sp1.1 from training. Duration: 0.763625 2023-10-03 23:49:22,357 WARNING [train.py:1204] (2/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_235-307337-0 from training. Duration: 0.94 2023-10-03 23:49:23,682 WARNING [train.py:1204] (2/4) Exclude cut with ID _8393_210_6702_1_1528505860249_3457328_453-176500-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 23:49:23,716 WARNING [train.py:1204] (2/4) Exclude cut with ID _53565_210_10635_1_1534337218335_3820259_102-179528-0_sp1.1 from training. Duration: 0.97275 2023-10-03 23:49:25,212 WARNING [train.py:1204] (2/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_43-206757-0_sp1.1 from training. Duration: 0.6818125 2023-10-03 23:49:26,577 WARNING [train.py:1204] (2/4) Exclude cut with ID _8394_210_6702_1_1528590504316_3540189_335-274780-0_sp1.1 from training. Duration: 0.7090625 2023-10-03 23:49:26,621 WARNING [train.py:1204] (2/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_128-296581-0_sp1.1 from training. Duration: 0.67275 2023-10-03 23:49:36,674 WARNING [train.py:1204] (2/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_111-112362-0 from training. Duration: 0.91 2023-10-03 23:49:36,720 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_5522_210_3273_1_1527249606_3909096_366-172508-0_sp1.1 from training. Duration: 0.6 2023-10-03 23:49:38,087 WARNING [train.py:1204] (2/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_47-167373-0_sp1.1 from training. Duration: 0.836375 2023-10-03 23:49:38,141 WARNING [train.py:1204] (2/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_41-234353-0 from training. Duration: 0.68 2023-10-03 23:49:38,174 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6291_210_3273_1_1527683649_1078498_82-221196-0_sp0.9 from training. Duration: 0.8444375 2023-10-03 23:49:39,553 WARNING [train.py:1204] (2/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_47-4814-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 23:49:39,573 WARNING [train.py:1204] (2/4) Exclude cut with ID _43541_210_10626_1_1533796233418_3635440_40-247993-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 23:49:40,855 WARNING [train.py:1204] (2/4) Exclude cut with ID _21989_210_5854_1_1531899214931_3271360_133-44759-0_sp0.9 from training. Duration: 0.8555625 2023-10-03 23:49:40,857 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_94-195801-0 from training. Duration: 0.94 2023-10-03 23:49:40,933 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_141-89113-0 from training. Duration: 0.86 2023-10-03 23:49:42,838 WARNING [train.py:1204] (2/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_683-284578-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 23:49:44,222 WARNING [train.py:1204] (2/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_6-214815-0_sp1.1 from training. Duration: 0.77275 2023-10-03 23:49:48,574 WARNING [train.py:1204] (2/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_556-212082-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 23:49:48,627 WARNING [train.py:1204] (2/4) Exclude cut with ID _44902_210_16187_1_1533888006062_3792078_149-67212-0 from training. Duration: 0.87 2023-10-03 23:49:49,895 WARNING [train.py:1204] (2/4) Exclude cut with ID _61148_210_6897_1_1535094022726_4217610_333-159440-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 23:49:52,611 WARNING [train.py:1204] (2/4) Exclude cut with ID _3120_210_3525_1_1526036143112_4072980_404-249994-0 from training. Duration: 0.99 2023-10-03 23:49:52,876 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.bypass.scale_min, batch_count=1450440.0, ans=0.2 2023-10-03 23:49:54,009 WARNING [train.py:1204] (2/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_234-260682-0_sp0.9 from training. Duration: 0.9333125 2023-10-03 23:49:54,037 WARNING [train.py:1204] (2/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_374-138971-0_sp1.1 from training. Duration: 0.8545625 2023-10-03 23:49:55,382 WARNING [train.py:1204] (2/4) Exclude cut with ID _53713_210_24116_1_1534314487762_7416751_743-302085-0_sp1.1 from training. Duration: 0.92725 2023-10-03 23:49:55,451 WARNING [train.py:1204] (2/4) Exclude cut with ID _18516_210_9206_1_1531704653206_3692406_198-334870-0_sp1.1 from training. Duration: 0.836375 2023-10-03 23:49:57,186 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.bypass_mid.scale_min, batch_count=1450440.0, ans=0.2 2023-10-03 23:49:58,228 WARNING [train.py:1204] (2/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_395-73570-0_sp1.1 from training. Duration: 0.9 2023-10-03 23:49:58,389 WARNING [train.py:1204] (2/4) Exclude cut with ID _46683_210_9642_1_1533730913492_4591116_421-19547-0_sp1.1 from training. Duration: 0.936375 2023-10-03 23:49:59,366 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.0.layers.0.feed_forward2.out_whiten, num_groups=1, num_channels=192, metric=5.31 vs. limit=15.0 2023-10-03 23:49:59,683 WARNING [train.py:1204] (2/4) Exclude cut with ID _19896_210_1883_1_1531709022662_6649569_714-8668-0_sp1.1 from training. Duration: 0.963625 2023-10-03 23:50:01,689 WARNING [train.py:1204] (2/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_49-101072-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 23:50:01,696 WARNING [train.py:1204] (2/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_220-191080-0_sp1.1 from training. Duration: 0.8909375 2023-10-03 23:50:01,732 WARNING [train.py:1204] (2/4) Exclude cut with ID _3465_210_3525_1_1526175667060_3574049_62-335552-0 from training. Duration: 0.87 2023-10-03 23:50:01,809 WARNING [train.py:1204] (2/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_111-112362-0_sp1.1 from training. Duration: 0.82725 2023-10-03 23:50:03,168 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.711e+02 1.933e+02 2.126e+02 2.444e+02 3.244e+02, threshold=4.252e+02, percent-clipped=0.0 2023-10-03 23:50:04,673 WARNING [train.py:1204] (2/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_318-59097-0_sp0.9 from training. Duration: 0.8444375 2023-10-03 23:50:06,163 WARNING [train.py:1204] (2/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_98-255710-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 23:50:06,176 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9042W0003-515609 from training. Duration: 0.896 2023-10-03 23:50:06,185 WARNING [train.py:1204] (2/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_158-250116-0_sp0.9 from training. Duration: 0.688875 2023-10-03 23:50:06,830 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.3.encoder.layers.3.self_attn2.whiten, num_groups=1, num_channels=512, metric=10.89 vs. limit=22.5 2023-10-03 23:50:07,635 WARNING [train.py:1204] (2/4) Exclude cut with ID _26750_210_5665_1_1532226678442_4063230_7-346885-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 23:50:08,984 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_37839_210_7234_1_1533542462_3689303_357-227078-0_sp1.1 from training. Duration: 0.963625 2023-10-03 23:50:09,012 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9052W0119-520904 from training. Duration: 0.896 2023-10-03 23:50:12,099 WARNING [train.py:1204] (2/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_249-281349-0_sp1.1 from training. Duration: 0.77275 2023-10-03 23:50:12,104 WARNING [train.py:1204] (2/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_147-293311-0 from training. Duration: 0.94 2023-10-03 23:50:12,104 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_158-21166-0_sp1.1 from training. Duration: 0.963625 2023-10-03 23:50:15,530 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.conv_skip_rate, batch_count=1450506.6666666667, ans=0.0 2023-10-03 23:50:16,685 WARNING [train.py:1204] (2/4) Exclude cut with ID _14100_210_8341_1_1530529320613_3678069_207-332194-0_sp1.1 from training. Duration: 0.92725 2023-10-03 23:50:16,948 INFO [scaling.py:1118] (2/4) WithLoss: name=encoder.encoders.5.encoder.layers.1.self_attn_weights, loss-sum=0.000e+00 2023-10-03 23:50:18,019 WARNING [train.py:1204] (2/4) Exclude cut with ID _9389_210_1664_1_1529217337301_5289269_324-211031-0_sp1.1 from training. Duration: 0.963625 2023-10-03 23:50:18,045 WARNING [train.py:1204] (2/4) Exclude cut with ID _14603_210_4220_1_1530615688441_3097319_133-158308-0 from training. Duration: 0.84 2023-10-03 23:50:20,656 INFO [train.py:1046] (2/4) Epoch 41, batch 5100, loss[loss=0.1595, simple_loss=0.252, pruned_loss=0.03353, over 24452.00 frames. ], tot_loss[loss=0.1559, simple_loss=0.2361, pruned_loss=0.03784, over 4720069.60 frames. ], batch size: 69, lr: 2.47e-03, grad_scale: 8.0 2023-10-03 23:50:20,706 WARNING [train.py:1204] (2/4) Exclude cut with ID _13252_210_1925_1_1530671372047_3336909_166-314607-0 from training. Duration: 0.54 2023-10-03 23:50:22,255 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.ff3_skip_rate, batch_count=1450573.3333333333, ans=0.0 2023-10-03 23:50:23,417 WARNING [train.py:1204] (2/4) Exclude cut with ID _32704_210_8954_1_1533017488840_6747370_649-79538-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 23:50:23,424 WARNING [train.py:1204] (2/4) Exclude cut with ID _28394_210_8913_1_1532430049518_3650118_343-115667-0_sp1.1 from training. Duration: 0.97275 2023-10-03 23:50:23,470 WARNING [train.py:1204] (2/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_87-278686-0_sp0.9 from training. Duration: 0.8555625 2023-10-03 23:50:25,055 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.attention_skip_rate, batch_count=1450573.3333333333, ans=0.0 2023-10-03 23:50:26,324 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9109W0003-549749 from training. Duration: 0.768 2023-10-03 23:50:28,990 WARNING [train.py:1204] (2/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_126-29104-0_sp1.1 from training. Duration: 0.77275 2023-10-03 23:50:29,274 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.feed_forward2.hidden_balancer.prob, batch_count=1450573.3333333333, ans=0.125 2023-10-03 23:50:31,804 WARNING [train.py:1204] (2/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_325-113798-0 from training. Duration: 0.85 2023-10-03 23:50:31,848 WARNING [train.py:1204] (2/4) Exclude cut with ID _6694_210_4599_1_1527852637490_3965529_116-109398-0 from training. Duration: 0.73 2023-10-03 23:50:31,909 WARNING [train.py:1204] (2/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_46-57119-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 23:50:33,898 WARNING [train.py:1204] (2/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_546-119312-0_sp1.1 from training. Duration: 0.9 2023-10-03 23:50:35,430 WARNING [train.py:1204] (2/4) Exclude cut with ID _60057_210_10640_1_1534903065793_3333150_468-306939-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 23:50:35,480 WARNING [train.py:1204] (2/4) Exclude cut with ID _15150_210_8341_1_1530752411803_7256384_22-138274-0 from training. Duration: 0.96 2023-10-03 23:50:35,498 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_289-180904-0 from training. Duration: 0.76 2023-10-03 23:50:41,125 WARNING [train.py:1204] (2/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_64-133721-0_sp1.1 from training. Duration: 0.92725 2023-10-03 23:50:41,182 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_195-257512-0_sp1.1 from training. Duration: 0.6090625 2023-10-03 23:50:44,532 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.2.encoder.layers.2.feed_forward1.out_whiten, num_groups=1, num_channels=384, metric=5.94 vs. limit=15.0 2023-10-03 23:50:47,554 WARNING [train.py:1204] (2/4) Exclude cut with ID _66873_210_5517_1_1535454136506_4293370_704-101963-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 23:50:49,062 WARNING [train.py:1204] (2/4) Exclude cut with ID _4366_210_1388_1_1526792053633_3613819_231-211402-0 from training. Duration: 0.94 2023-10-03 23:50:49,116 WARNING [train.py:1204] (2/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_679-190385-0_sp1.1 from training. Duration: 0.97275 2023-10-03 23:50:50,606 WARNING [train.py:1204] (2/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_298-81459-0_sp1.1 from training. Duration: 0.963625 2023-10-03 23:50:50,614 WARNING [train.py:1204] (2/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_658-282610-0_sp0.9 from training. Duration: 0.588875 2023-10-03 23:50:52,516 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.4.encoder.layers.2.feed_forward2.out_whiten, num_groups=1, num_channels=384, metric=8.22 vs. limit=15.0 2023-10-03 23:50:53,281 WARNING [train.py:1204] (2/4) Exclude cut with ID _57572_210_5517_1_1534921219624_3809484_196-283734-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 23:50:53,355 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_53071_210_16554_1_1534490405_2954066_294-198554-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 23:50:53,360 WARNING [train.py:1204] (2/4) Exclude cut with ID _4866_210_2577_1_1527404412284_4364659_352-157930-0 from training. Duration: 0.81 2023-10-03 23:50:56,015 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9231W0113-610139 from training. Duration: 0.896 2023-10-03 23:50:56,067 WARNING [train.py:1204] (2/4) Exclude cut with ID _24271_210_12658_1_1531987270022_7190519_638-171906-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 23:50:57,353 WARNING [train.py:1204] (2/4) Exclude cut with ID _14170_210_5219_1_1530525949868_4469650_272-249985-0 from training. Duration: 0.67 2023-10-03 23:50:57,362 WARNING [train.py:1204] (2/4) Exclude cut with ID _64086_210_7994_1_1535270417019_7435949_449-31567-0 from training. Duration: 0.97 2023-10-03 23:51:00,221 WARNING [train.py:1204] (2/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_109-282568-0_sp1.1 from training. Duration: 0.97275 2023-10-03 23:51:05,366 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.feed_forward3.hidden_balancer.prob, batch_count=1450773.3333333333, ans=0.125 2023-10-03 23:51:09,216 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_121-45934-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 23:51:11,871 WARNING [train.py:1204] (2/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_314-126819-0 from training. Duration: 0.5 2023-10-03 23:51:13,157 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9309W0192-640319 from training. Duration: 0.512 2023-10-03 23:51:13,168 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9310W0003-640649 from training. Duration: 0.896 2023-10-03 23:51:16,612 WARNING [train.py:1204] (2/4) Exclude cut with ID _7160_210_1876_1_1527940923231_3703799_120-150943-0 from training. Duration: 0.81 2023-10-03 23:51:16,613 WARNING [train.py:1204] (2/4) Exclude cut with ID _40380_210_9059_1_1533380594759_4187380_6-33508-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 23:51:18,015 WARNING [train.py:1204] (2/4) Exclude cut with ID _59202_210_26984_1_1534816810612_3549174_137-318710-0 from training. Duration: 0.89 2023-10-03 23:51:22,152 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_435-313305-0 from training. Duration: 0.71 2023-10-03 23:51:23,647 WARNING [train.py:1204] (2/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_91-212486-0_sp1.1 from training. Duration: 0.6181875 2023-10-03 23:51:25,080 WARNING [train.py:1204] (2/4) Exclude cut with ID _14603_210_4220_1_1530615688441_3097319_133-158308-0_sp1.1 from training. Duration: 0.763625 2023-10-03 23:51:26,491 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_99-251314-0 from training. Duration: 0.87 2023-10-03 23:51:27,851 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_365-217119-0_sp1.1 from training. Duration: 0.57275 2023-10-03 23:51:29,168 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_3475_210_2028_1_1526193578_3622569_377-166133-0 from training. Duration: 0.86 2023-10-03 23:51:33,708 INFO [train.py:1046] (2/4) Epoch 41, batch 5150, loss[loss=0.1588, simple_loss=0.2307, pruned_loss=0.04341, over 23779.00 frames. ], tot_loss[loss=0.1568, simple_loss=0.2369, pruned_loss=0.03834, over 4720446.44 frames. ], batch size: 164, lr: 2.47e-03, grad_scale: 8.0 2023-10-03 23:51:33,827 WARNING [train.py:1204] (2/4) Exclude cut with ID _13916_210_9994_1_1530788341126_3763149_116-21034-0_sp1.1 from training. Duration: 0.87275 2023-10-03 23:51:33,837 WARNING [train.py:1204] (2/4) Exclude cut with ID _64220_210_9527_1_1535250151772_4875259_142-216831-0_sp1.1 from training. Duration: 0.936375 2023-10-03 23:51:33,837 WARNING [train.py:1204] (2/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_490-207861-0_sp1.1 from training. Duration: 0.8909375 2023-10-03 23:51:35,809 WARNING [train.py:1204] (2/4) Exclude cut with ID _41314_210_12651_1_1533459476543_3766460_87-272068-0_sp0.9 from training. Duration: 0.92225 2023-10-03 23:51:35,836 WARNING [train.py:1204] (2/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_18-46756-0_sp1.1 from training. Duration: 0.7181875 2023-10-03 23:51:37,303 WARNING [train.py:1204] (2/4) Exclude cut with ID _39411_210_5986_1_1533538825030_3343649_98-311335-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 23:51:37,375 WARNING [train.py:1204] (2/4) Exclude cut with ID _21878_210_3525_1_1531738772002_7264330_113-18163-0 from training. Duration: 0.89 2023-10-03 23:51:37,378 WARNING [train.py:1204] (2/4) Exclude cut with ID _6653_210_2629_1_1527919172441_6732679_1103-56445-0 from training. Duration: 0.73 2023-10-03 23:51:38,671 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_3474_210_2028_1_1526188513_4825246_145-309131-0 from training. Duration: 0.74 2023-10-03 23:51:38,697 WARNING [train.py:1204] (2/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_120-211630-0_sp1.1 from training. Duration: 0.836375 2023-10-03 23:51:38,707 WARNING [train.py:1204] (2/4) Exclude cut with ID _8030_210_7497_1_1528444886781_3531089_32-31837-0 from training. Duration: 0.97 2023-10-03 23:51:40,135 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_19949_210_2161_1_1531706337_928079_8-160956-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 23:51:40,175 WARNING [train.py:1204] (2/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_321-215742-0_sp1.1 from training. Duration: 0.4545625 2023-10-03 23:51:42,806 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_583-124650-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 23:51:42,945 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.1.feed_forward2.hidden_balancer.prob, batch_count=1450906.6666666667, ans=0.125 2023-10-03 23:51:44,135 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_30434_210_13049_1_1532519384_981746_164-182755-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 23:51:45,457 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.2.encoder.layers.2.self_attn_weights.whiten_keys, num_groups=4, num_channels=128, metric=3.20 vs. limit=6.0 2023-10-03 23:51:49,308 WARNING [train.py:1204] (2/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_339-104791-0_sp1.1 from training. Duration: 0.4454375 2023-10-03 23:51:49,340 WARNING [train.py:1204] (2/4) Exclude cut with ID _15026_210_8750_1_1530777602316_3291850_211-195043-0 from training. Duration: 0.8 2023-10-03 23:51:50,759 WARNING [train.py:1204] (2/4) Exclude cut with ID _29877_210_6228_1_1532397791106_5276149_487-182647-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 23:51:52,042 WARNING [train.py:1204] (2/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_127-566-0_sp0.9 from training. Duration: 0.9333125 2023-10-03 23:51:53,494 WARNING [train.py:1204] (2/4) Exclude cut with ID _8263_210_1664_1_1528439452891_970220_68-181805-0_sp0.9 from training. Duration: 0.711125 2023-10-03 23:51:53,496 WARNING [train.py:1204] (2/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_504-72513-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 23:51:53,504 WARNING [train.py:1204] (2/4) Exclude cut with ID _64639_210_5816_1_1535367619437_3528000_324-265233-0_sp1.1 from training. Duration: 0.963625 2023-10-03 23:51:54,807 WARNING [train.py:1204] (2/4) Exclude cut with ID _6456_210_1534_1_1527681634212_3181049_215-309359-0_sp0.9 from training. Duration: 0.97775 2023-10-03 23:51:54,812 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_115-233929-0_sp1.1 from training. Duration: 0.6545625 2023-10-03 23:51:54,945 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.1.ff3_skip_rate, batch_count=1450973.3333333333, ans=0.0 2023-10-03 23:51:56,115 WARNING [train.py:1204] (2/4) Exclude cut with ID _14087_210_2253_1_1530529224512_3014029_219-224445-0 from training. Duration: 0.84 2023-10-03 23:51:57,616 WARNING [train.py:1204] (2/4) Exclude cut with ID _14867_210_1968_1_1530684179890_3846969_413-29292-0_sp1.1 from training. Duration: 0.8181875 2023-10-03 23:51:57,671 WARNING [train.py:1204] (2/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_1012-279473-0_sp0.9 from training. Duration: 0.7444375 2023-10-03 23:51:59,155 WARNING [train.py:1204] (2/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_605-133395-0_sp0.9 from training. Duration: 0.7333125 2023-10-03 23:52:02,288 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_89-35748-0 from training. Duration: 0.79 2023-10-03 23:52:03,670 WARNING [train.py:1204] (2/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_476-272757-0_sp1.1 from training. Duration: 0.8454375 2023-10-03 23:52:08,304 WARNING [train.py:1204] (2/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_449-15508-0_sp0.9 from training. Duration: 0.911125 2023-10-03 23:52:08,453 WARNING [train.py:1204] (2/4) Exclude cut with ID _29345_210_4381_1_1532689168263_3860169_198-174328-0 from training. Duration: 0.91 2023-10-03 23:52:12,471 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_15517_210_6753_1_1530952293_1664795_125-158157-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 23:52:17,481 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.self_attn_weights.pos_emb_skip_rate, batch_count=1451106.6666666667, ans=0.0 2023-10-03 23:52:18,485 WARNING [train.py:1204] (2/4) Exclude cut with ID _25565_210_12392_1_1532423009382_3952098_420-113180-0_sp1.1 from training. Duration: 0.963625 2023-10-03 23:52:19,878 WARNING [train.py:1204] (2/4) Exclude cut with ID _51471_210_1889_1_1534399391390_3094250_236-146773-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 23:52:22,766 WARNING [train.py:1204] (2/4) Exclude cut with ID _30039_210_13270_1_1532484612900_2725150_175-35012-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 23:52:22,802 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_23850_210_4238_1_1532232597_1632433_99-275843-0_sp1.1 from training. Duration: 0.936375 2023-10-03 23:52:24,337 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.conv_module2.balancer2.prob, batch_count=1451106.6666666667, ans=0.125 2023-10-03 23:52:25,490 WARNING [train.py:1204] (2/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_658-282610-0 from training. Duration: 0.53 2023-10-03 23:52:28,962 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.2.encoder.layers.2.whiten, num_groups=1, num_channels=384, metric=4.37 vs. limit=12.0 2023-10-03 23:52:29,391 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.566e+02 1.988e+02 2.282e+02 2.710e+02 3.872e+02, threshold=4.565e+02, percent-clipped=0.0 2023-10-03 23:52:29,505 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_37818_210_4238_1_1533620910_4720083_234-65934-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 23:52:30,889 WARNING [train.py:1204] (2/4) Exclude cut with ID _3067_210_2667_1_1526212974640_4503168_459-173867-0_sp1.1 from training. Duration: 0.8 2023-10-03 23:52:30,916 WARNING [train.py:1204] (2/4) Exclude cut with ID _8696_210_1876_1_1528545632440_3345058_209-54069-0_sp0.9 from training. Duration: 0.7444375 2023-10-03 23:52:31,254 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.ff3_skip_rate, batch_count=1451173.3333333333, ans=0.0 2023-10-03 23:52:34,245 WARNING [train.py:1204] (2/4) Exclude cut with ID _62574_210_2655_1_1535180407058_3933249_420-236465-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 23:52:35,606 WARNING [train.py:1204] (2/4) Exclude cut with ID _46681_210_9642_1_1533893005571_7603359_333-4042-0_sp1.1 from training. Duration: 0.936375 2023-10-03 23:52:36,974 WARNING [train.py:1204] (2/4) Exclude cut with ID _3541_210_2477_1_1526475408709_3263973_377-296839-0 from training. Duration: 0.55 2023-10-03 23:52:41,789 WARNING [train.py:1204] (2/4) Exclude cut with ID _43997_210_4525_1_1533538632555_5189329_426-92599-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 23:52:43,317 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_43-71989-0_sp1.1 from training. Duration: 0.5181875 2023-10-03 23:52:44,765 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_2009_210_2071_1_1525481134_4588937_140-345530-0_sp1.1 from training. Duration: 0.963625 2023-10-03 23:52:44,773 WARNING [train.py:1204] (2/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_138-332773-0_sp0.9 from training. Duration: 0.9 2023-10-03 23:52:46,159 WARNING [train.py:1204] (2/4) Exclude cut with ID _13942_210_9988_1_1530604906036_3733360_499-55676-0_sp0.9 from training. Duration: 0.688875 2023-10-03 23:52:46,178 WARNING [train.py:1204] (2/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_259-34038-0_sp1.1 from training. Duration: 0.67275 2023-10-03 23:52:46,188 WARNING [train.py:1204] (2/4) Exclude cut with ID _3120_210_3525_1_1526036143112_4072980_404-249994-0_sp1.1 from training. Duration: 0.9 2023-10-03 23:52:46,222 WARNING [train.py:1204] (2/4) Exclude cut with ID _40402_210_18219_1_1533522324115_2953150_222-108810-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 23:52:47,995 INFO [train.py:1046] (2/4) Epoch 41, batch 5200, loss[loss=0.1608, simple_loss=0.2336, pruned_loss=0.04402, over 23833.00 frames. ], tot_loss[loss=0.1572, simple_loss=0.2375, pruned_loss=0.03848, over 4713560.45 frames. ], batch size: 179, lr: 2.47e-03, grad_scale: 16.0 2023-10-03 23:52:50,737 WARNING [train.py:1204] (2/4) Exclude cut with ID _22083_210_8254_1_1531702862439_3678970_49-234546-0_sp0.9 from training. Duration: 0.92225 2023-10-03 23:52:52,071 WARNING [train.py:1204] (2/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_199-295611-0_sp0.9 from training. Duration: 0.611125 2023-10-03 23:52:54,816 WARNING [train.py:1204] (2/4) Exclude cut with ID _43552_210_8913_1_1533726031059_3523429_43-192945-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 23:52:58,828 WARNING [train.py:1204] (2/4) Exclude cut with ID _14224_210_4381_1_1530529062037_4264010_364-22817-0 from training. Duration: 0.71 2023-10-03 23:52:58,921 WARNING [train.py:1204] (2/4) Exclude cut with ID _14922_210_6228_1_1530707435273_3693310_388-129552-0_sp1.1 from training. Duration: 0.87275 2023-10-03 23:52:58,963 WARNING [train.py:1204] (2/4) Exclude cut with ID _6537_210_1883_1_1527987559710_3597129_190-280227-0_sp1.1 from training. Duration: 0.97275 2023-10-03 23:53:02,054 WARNING [train.py:1204] (2/4) Exclude cut with ID _55844_210_2346_1_1534471212569_7625707_398-56897-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 23:53:03,512 WARNING [train.py:1204] (2/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_48-182155-0_sp1.1 from training. Duration: 0.7909375 2023-10-03 23:53:03,540 WARNING [train.py:1204] (2/4) Exclude cut with ID _29529_210_7234_1_1532393961455_3977460_582-18084-0_sp1.1 from training. Duration: 0.97275 2023-10-03 23:53:04,938 WARNING [train.py:1204] (2/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_912-282412-0 from training. Duration: 0.49 2023-10-03 23:53:07,621 WARNING [train.py:1204] (2/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_332-256168-0_sp0.9 from training. Duration: 0.8333125 2023-10-03 23:53:09,477 WARNING [train.py:1204] (2/4) Exclude cut with ID _64220_210_9527_1_1535250151772_4875259_55-250624-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 23:53:10,931 WARNING [train.py:1204] (2/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_264-108685-0 from training. Duration: 0.55 2023-10-03 23:53:13,693 WARNING [train.py:1204] (2/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_613-232856-0_sp0.9 from training. Duration: 0.6 2023-10-03 23:53:13,773 WARNING [train.py:1204] (2/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_68-212801-0_sp1.1 from training. Duration: 0.663625 2023-10-03 23:53:15,089 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6290_210_3273_1_1527595224_1282787_115-87095-0 from training. Duration: 0.55 2023-10-03 23:53:15,288 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.conv_module2.balancer1.prob, batch_count=1451373.3333333333, ans=0.125 2023-10-03 23:53:16,324 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_3080_210_2071_1_1526086102_4365166_312-8726-0 from training. Duration: 0.91 2023-10-03 23:53:19,684 WARNING [train.py:1204] (2/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_105-63309-0 from training. Duration: 0.98 2023-10-03 23:53:19,753 WARNING [train.py:1204] (2/4) Exclude cut with ID _2941_210_2577_1_1526199673150_2489759_201-160779-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 23:53:19,756 WARNING [train.py:1204] (2/4) Exclude cut with ID ID0406W0226-885434 from training. Duration: 0.997 2023-10-03 23:53:19,765 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_17292_210_6753_1_1531125388_2943734_353-328201-0_sp1.1 from training. Duration: 0.97275 2023-10-03 23:53:22,421 WARNING [train.py:1204] (2/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_572-96029-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 23:53:22,452 WARNING [train.py:1204] (2/4) Exclude cut with ID _14970_210_15709_1_1530775547361_1966965_6-23328-0_sp0.9 from training. Duration: 0.9555625 2023-10-03 23:53:23,765 WARNING [train.py:1204] (2/4) Exclude cut with ID _45012_210_12651_1_1533608435136_5371380_240-37101-0 from training. Duration: 0.93 2023-10-03 23:53:23,830 WARNING [train.py:1204] (2/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_685-90183-0_sp1.1 from training. Duration: 0.936375 2023-10-03 23:53:24,001 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=1451373.3333333333, ans=0.1 2023-10-03 23:53:25,308 WARNING [train.py:1204] (2/4) Exclude cut with ID _13252_210_1925_1_1530671372047_3336909_41-22245-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 23:53:28,098 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_15126_210_6753_1_1530778677_2591185_120-277707-0 from training. Duration: 0.8 2023-10-03 23:53:28,138 WARNING [train.py:1204] (2/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_310-26186-0 from training. Duration: 0.7 2023-10-03 23:53:28,166 WARNING [train.py:1204] (2/4) Exclude cut with ID _14998_210_5219_1_1530705582080_7474970_787-294416-0 from training. Duration: 0.82 2023-10-03 23:53:32,772 WARNING [train.py:1204] (2/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_795-33252-0 from training. Duration: 0.37 2023-10-03 23:53:32,963 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.balancer1.prob, batch_count=1451440.0, ans=0.125 2023-10-03 23:53:34,165 WARNING [train.py:1204] (2/4) Exclude cut with ID _7537_210_2084_1_1528502395431_3924359_113-199933-0_sp0.9 from training. Duration: 0.6555625 2023-10-03 23:53:40,051 WARNING [train.py:1204] (2/4) Exclude cut with ID _3465_210_3525_1_1526175667060_3574049_308-231735-0_sp0.9 from training. Duration: 0.811125 2023-10-03 23:53:40,082 WARNING [train.py:1204] (2/4) Exclude cut with ID _19504_210_9994_1_1531648863242_3325220_354-296765-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 23:53:42,656 WARNING [train.py:1204] (2/4) Exclude cut with ID _5409_210_1388_1_1527292850189_5865191_178-127970-0 from training. Duration: 0.57 2023-10-03 23:53:42,697 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_1466-248056-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 23:53:42,716 WARNING [train.py:1204] (2/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_535-14410-0_sp0.9 from training. Duration: 0.52225 2023-10-03 23:53:42,719 WARNING [train.py:1204] (2/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_747-129941-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 23:53:44,044 WARNING [train.py:1204] (2/4) Exclude cut with ID _15012_210_1968_1_1530856820429_5095350_688-178849-0_sp1.1 from training. Duration: 0.5818125 2023-10-03 23:53:45,534 WARNING [train.py:1204] (2/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_356-45054-0_sp1.1 from training. Duration: 0.7818125 2023-10-03 23:53:46,937 WARNING [train.py:1204] (2/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_146-276944-0_sp0.9 from training. Duration: 0.92225 2023-10-03 23:53:48,444 WARNING [train.py:1204] (2/4) Exclude cut with ID _39204_210_4220_1_1533880547368_3191120_346-180306-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 23:53:51,535 WARNING [train.py:1204] (2/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_641-158017-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 23:53:51,535 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_19952_210_2161_1_1531875469_3851750_274-42137-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 23:53:55,823 WARNING [train.py:1204] (2/4) Exclude cut with ID _60804_210_9642_1_1534831199859_7268299_87-327819-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 23:53:57,120 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_9302_210_2550_1_1528889962_4655416_638-301620-0 from training. Duration: 0.47 2023-10-03 23:53:57,188 WARNING [train.py:1204] (2/4) Exclude cut with ID _44304_210_15374_1_1533639352765_4613129_302-22084-0_sp1.1 from training. Duration: 0.7818125 2023-10-03 23:53:57,200 WARNING [train.py:1204] (2/4) Exclude cut with ID _18513_210_9206_1_1531618215223_3862620_177-227063-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 23:53:58,679 WARNING [train.py:1204] (2/4) Exclude cut with ID _41326_210_5854_1_1533778076509_3492080_510-45444-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 23:53:58,718 WARNING [train.py:1204] (2/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_76-311963-0_sp1.1 from training. Duration: 0.636375 2023-10-03 23:53:58,911 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.nonlin_attention.balancer.max_positive, batch_count=1451573.3333333333, ans=0.95 2023-10-03 23:54:00,004 INFO [train.py:1046] (2/4) Epoch 41, batch 5250, loss[loss=0.1643, simple_loss=0.2516, pruned_loss=0.03851, over 24268.00 frames. ], tot_loss[loss=0.1565, simple_loss=0.2365, pruned_loss=0.03828, over 4716016.63 frames. ], batch size: 74, lr: 2.47e-03, grad_scale: 16.0 2023-10-03 23:54:00,123 WARNING [train.py:1204] (2/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_156-302675-0_sp0.9 from training. Duration: 0.611125 2023-10-03 23:54:03,250 WARNING [train.py:1204] (2/4) Exclude cut with ID _53489_210_6228_1_1534298357169_3835970_367-148411-0_sp1.1 from training. Duration: 0.936375 2023-10-03 23:54:05,958 WARNING [train.py:1204] (2/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_389-144525-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 23:54:05,999 WARNING [train.py:1204] (2/4) Exclude cut with ID _14069_210_5632_1_1530512692033_4803390_158-238427-0_sp1.1 from training. Duration: 0.963625 2023-10-03 23:54:07,400 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_486-151131-0_sp0.9 from training. Duration: 0.9444375 2023-10-03 23:54:09,749 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.1.encoder.layers.0.feed_forward3.out_whiten, num_groups=1, num_channels=256, metric=10.62 vs. limit=15.0 2023-10-03 23:54:12,324 WARNING [train.py:1204] (2/4) Exclude cut with ID _39846_210_12651_1_1533603651992_1318030_188-71831-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 23:54:13,654 WARNING [train.py:1204] (2/4) Exclude cut with ID _39411_210_5986_1_1533538825030_3343649_173-238248-0_sp1.1 from training. Duration: 0.7545625 2023-10-03 23:54:16,835 WARNING [train.py:1204] (2/4) Exclude cut with ID _20684_210_6917_1_1531567751646_4203930_469-49081-0_sp1.1 from training. Duration: 0.92725 2023-10-03 23:54:18,227 WARNING [train.py:1204] (2/4) Exclude cut with ID _8833_210_5219_1_1528592665210_7553059_611-80313-0_sp1.1 from training. Duration: 0.5818125 2023-10-03 23:54:21,521 WARNING [train.py:1204] (2/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_440-141811-0 from training. Duration: 0.95 2023-10-03 23:54:21,541 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_41211_210_2544_1_1533348859_3702910_236-223652-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 23:54:21,623 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_40222_210_6228_1_1533385298_4481939_220-163998-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 23:54:40,375 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.3.encoder.layers.0.nonlin_attention.whiten2, num_groups=1, num_channels=512, metric=6.33 vs. limit=15.0 2023-10-03 23:54:41,369 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.bypass_mid.scale_min, batch_count=1451773.3333333333, ans=0.2 2023-10-03 23:54:52,782 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.582e+02 1.869e+02 2.005e+02 2.219e+02 3.156e+02, threshold=4.009e+02, percent-clipped=0.0 2023-10-03 23:55:08,761 INFO [train.py:1046] (2/4) Epoch 41, batch 5300, loss[loss=0.1475, simple_loss=0.2351, pruned_loss=0.02999, over 24485.00 frames. ], tot_loss[loss=0.1553, simple_loss=0.2343, pruned_loss=0.03812, over 4704524.23 frames. ], batch size: 63, lr: 2.46e-03, grad_scale: 16.0 2023-10-03 23:55:17,160 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.conv_module2.balancer2.min_abs, batch_count=1451906.6666666667, ans=0.5 2023-10-03 23:55:22,245 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.nonlin_attention.balancer.prob, batch_count=1451973.3333333333, ans=0.125 2023-10-03 23:55:23,224 WARNING [train.py:1204] (2/4) Exclude cut with ID _14629_210_15709_1_1530604298182_956899_6-232695-0_sp1.1 from training. Duration: 0.8545625 2023-10-03 23:55:23,246 WARNING [train.py:1204] (2/4) Exclude cut with ID _13921_210_9994_1_1530841782272_3503520_327-69677-0 from training. Duration: 0.9 2023-10-03 23:55:23,271 WARNING [train.py:1204] (2/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_372-298093-0 from training. Duration: 0.8 2023-10-03 23:55:23,284 WARNING [train.py:1204] (2/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_515-139111-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 23:55:23,528 WARNING [train.py:1204] (2/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_85-65056-0_sp1.1 from training. Duration: 0.87275 2023-10-03 23:55:23,571 WARNING [train.py:1204] (2/4) Exclude cut with ID _15113_210_8254_1_1530842438579_3716279_209-54533-0_sp1.1 from training. Duration: 0.87275 2023-10-03 23:55:23,619 WARNING [train.py:1204] (2/4) Exclude cut with ID _70447_210_12392_1_1535972499300_3160420_123-312838-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 23:55:23,647 WARNING [train.py:1204] (2/4) Exclude cut with ID _60113_210_16271_1_1535164212481_4386079_70-63596-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 23:55:23,675 WARNING [train.py:1204] (2/4) Exclude cut with ID _14632_210_9988_1_1530691279392_3923200_225-188509-0_sp1.1 from training. Duration: 0.963625 2023-10-03 23:55:23,725 WARNING [train.py:1204] (2/4) Exclude cut with ID _34734_210_4525_1_1533365977542_4448720_584-180532-0_sp1.1 from training. Duration: 0.97275 2023-10-03 23:55:23,739 WARNING [train.py:1204] (2/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_351-294650-0_sp1.1 from training. Duration: 0.57275 2023-10-03 23:55:24,076 WARNING [train.py:1204] (2/4) Exclude cut with ID _13252_210_1925_1_1530671372047_3336909_263-168388-0_sp0.9 from training. Duration: 0.9555625 2023-10-03 23:55:24,103 WARNING [train.py:1204] (2/4) Exclude cut with ID _22083_210_8254_1_1531702862439_3678970_201-85738-0 from training. Duration: 0.9 2023-10-03 23:55:24,198 WARNING [train.py:1204] (2/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_237-58433-0 from training. Duration: 0.74 2023-10-03 23:55:24,226 WARNING [train.py:1204] (2/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_262-250826-0 from training. Duration: 0.99 2023-10-03 23:55:24,330 WARNING [train.py:1204] (2/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_927-43827-0_sp0.9 from training. Duration: 0.82225 2023-10-03 23:55:24,362 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_2821_210_2715_1_1526108385_3763560_73-296673-0 from training. Duration: 0.83 2023-10-03 23:55:24,442 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6287_210_3273_1_1527509506_2653716_182-199887-0 from training. Duration: 0.51 2023-10-03 23:55:24,578 WARNING [train.py:1204] (2/4) Exclude cut with ID _70519_210_4163_1_1535961595994_7458230_899-80223-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 23:55:25,222 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_210-234949-0_sp1.1 from training. Duration: 0.97275 2023-10-03 23:55:25,259 WARNING [train.py:1204] (2/4) Exclude cut with ID _30196_210_1664_1_1533708496522_7074140_768-278176-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 23:55:25,337 WARNING [train.py:1204] (2/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_315-128283-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 23:55:25,417 WARNING [train.py:1204] (2/4) Exclude cut with ID _14886_210_4220_1_1530702020355_3516190_330-323941-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 23:55:25,677 WARNING [train.py:1204] (2/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_264-348896-0_sp1.1 from training. Duration: 0.77275 2023-10-03 23:55:25,699 WARNING [train.py:1204] (2/4) Exclude cut with ID _37086_210_9038_1_1532999540891_7021250_433-10563-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 23:55:25,736 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_53638_210_21581_1_1534297319_5808449_885-80172-0_sp1.1 from training. Duration: 0.87275 2023-10-03 23:55:25,830 WARNING [train.py:1204] (2/4) Exclude cut with ID _14623_210_1925_1_1530773354962_3632609_157-218771-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 23:55:25,841 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_1368-78165-0_sp1.1 from training. Duration: 0.97275 2023-10-03 23:55:25,845 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_309-65177-0_sp0.9 from training. Duration: 0.97775 2023-10-03 23:55:25,851 WARNING [train.py:1204] (2/4) Exclude cut with ID _21632_210_2253_1_1531994001765_3744140_291-138812-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 23:55:25,864 WARNING [train.py:1204] (2/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_669-93163-0_sp1.1 from training. Duration: 0.8909375 2023-10-03 23:55:26,449 WARNING [train.py:1204] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_292-215346-0 from training. Duration: 0.97 2023-10-03 23:55:26,473 WARNING [train.py:1204] (2/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_286-222073-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 23:55:26,750 WARNING [train.py:1204] (2/4) Exclude cut with ID _59256_210_5986_1_1535076085997_3589120_121-237386-0_sp1.1 from training. Duration: 0.87275 2023-10-03 23:55:26,763 WARNING [train.py:1204] (2/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_96-320736-0 from training. Duration: 0.86 2023-10-03 23:55:26,772 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_128-202104-0 from training. Duration: 0.63 2023-10-03 23:55:26,855 WARNING [train.py:1204] (2/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_242-3472-0_sp0.9 from training. Duration: 0.811125 2023-10-03 23:55:26,900 WARNING [train.py:1204] (2/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_154-74182-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 23:55:26,904 WARNING [train.py:1204] (2/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_187-221566-0 from training. Duration: 0.43 2023-10-03 23:55:27,064 WARNING [train.py:1204] (2/4) Exclude cut with ID _6194_210_1534_1_1527511826160_3941194_14-282876-0 from training. Duration: 0.71 2023-10-03 23:55:27,104 WARNING [train.py:1204] (2/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_660-211088-0_sp1.1 from training. Duration: 0.563625 2023-10-03 23:55:27,876 WARNING [train.py:1204] (2/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_526-62079-0_sp1.1 from training. Duration: 0.6090625 2023-10-03 23:55:28,038 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_92-246270-0_sp1.1 from training. Duration: 0.77275 2023-10-03 23:55:28,138 WARNING [train.py:1204] (2/4) Exclude cut with ID IC0287W0023-142350 from training. Duration: 0.956 2023-10-03 23:55:28,222 WARNING [train.py:1204] (2/4) Exclude cut with ID _58605_210_12210_1_1534723195939_3904259_48-269402-0 from training. Duration: 0.96 2023-10-03 23:55:28,238 WARNING [train.py:1204] (2/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_416-257730-0_sp0.9 from training. Duration: 0.72225 2023-10-03 23:55:28,325 WARNING [train.py:1204] (2/4) Exclude cut with ID _7877_210_6014_1_1528286358099_3750136_185-17516-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 23:55:28,429 WARNING [train.py:1204] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_23-189234-0 from training. Duration: 0.83 2023-10-03 23:55:28,447 WARNING [train.py:1204] (2/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_237-89684-0 from training. Duration: 0.96 2023-10-03 23:55:28,519 WARNING [train.py:1204] (2/4) Exclude cut with ID _13916_210_9994_1_1530788341126_3763149_116-21034-0 from training. Duration: 0.96 2023-10-03 23:55:28,668 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_3133_210_2087_1_1526106446_2324690_246-125000-0_sp1.1 from training. Duration: 0.563625 2023-10-03 23:55:32,746 INFO [train.py:1046] (2/4) Epoch 42, batch 0, loss[loss=0.1451, simple_loss=0.2259, pruned_loss=0.0322, over 23774.00 frames. ], tot_loss[loss=0.1451, simple_loss=0.2259, pruned_loss=0.0322, over 23774.00 frames. ], batch size: 179, lr: 2.43e-03, grad_scale: 32.0 2023-10-03 23:55:32,747 INFO [train.py:1069] (2/4) Computing validation loss 2023-10-03 23:55:44,908 INFO [train.py:1078] (2/4) Epoch 42, validation: loss=0.3268, simple_loss=0.2729, pruned_loss=0.1903, over 1125622.00 frames. 2023-10-03 23:55:44,909 INFO [train.py:1079] (2/4) Maximum memory allocated so far is 20967MB 2023-10-03 23:55:48,331 WARNING [train.py:1204] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_402-253027-0 from training. Duration: 0.54 2023-10-03 23:55:48,399 WARNING [train.py:1204] (2/4) Exclude cut with ID _59288_210_5854_1_1535077757774_3412246_158-289345-0_sp1.1 from training. Duration: 0.92725 2023-10-03 23:55:48,541 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.self_attn_weights.pos_emb_skip_rate, batch_count=1451986.6666666667, ans=0.0 2023-10-03 23:55:51,101 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_28211_210_6388_1_1532861506_4523224_128-328162-0_sp1.1 from training. Duration: 0.8181875 2023-10-03 23:55:55,383 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_788-278281-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 23:55:55,391 WARNING [train.py:1204] (2/4) Exclude cut with ID _49728_210_12651_1_1534041034294_6788150_624-195301-0_sp0.9 from training. Duration: 0.9444375 2023-10-03 23:55:56,626 WARNING [train.py:1204] (2/4) Exclude cut with ID _14860_210_4915_1_1530702020340_7343240_267-139775-0_sp1.1 from training. Duration: 0.963625 2023-10-03 23:55:56,696 WARNING [train.py:1204] (2/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_332-221057-0 from training. Duration: 0.68 2023-10-03 23:55:58,075 WARNING [train.py:1204] (2/4) Exclude cut with ID _40402_210_18219_1_1533522324115_2953150_242-76943-0 from training. Duration: 0.94 2023-10-03 23:55:59,453 WARNING [train.py:1204] (2/4) Exclude cut with ID _16747_210_5986_1_1531811544733_2749109_87-349587-0_sp1.1 from training. Duration: 0.963625 2023-10-03 23:55:59,514 WARNING [train.py:1204] (2/4) Exclude cut with ID _22982_210_1883_1_1531882790447_5449409_505-346452-0_sp1.1 from training. Duration: 0.963625 2023-10-03 23:56:02,370 WARNING [train.py:1204] (2/4) Exclude cut with ID _17956_210_4353_1_1531209568125_6979970_13-192107-0_sp1.1 from training. Duration: 0.963625 2023-10-03 23:56:02,408 WARNING [train.py:1204] (2/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_430-307666-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 23:56:03,680 WARNING [train.py:1204] (2/4) Exclude cut with ID _2895_210_2477_1_1525956785794_4063750_289-11475-0_sp1.1 from training. Duration: 0.7454375 2023-10-03 23:56:03,691 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_3910_210_1794_1_1526742306_334530_28-238909-0_sp0.9 from training. Duration: 0.9 2023-10-03 23:56:03,902 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.feed_forward2.hidden_balancer.prob, batch_count=1452053.3333333333, ans=0.125 2023-10-03 23:56:05,474 WARNING [train.py:1204] (2/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_18-130336-0 from training. Duration: 0.79 2023-10-03 23:56:05,646 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.0.conv_skip_rate, batch_count=1452053.3333333333, ans=0.0 2023-10-03 23:56:05,668 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.nonlin_attention.balancer.prob, batch_count=1452053.3333333333, ans=0.125 2023-10-03 23:56:06,840 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_548-294316-0_sp0.9 from training. Duration: 0.9 2023-10-03 23:56:14,210 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_42-142473-0_sp1.1 from training. Duration: 0.6545625 2023-10-03 23:56:15,460 WARNING [train.py:1204] (2/4) Exclude cut with ID _59170_210_6721_1_1534983633960_4203335_326-48066-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 23:56:15,738 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.conv_module1.balancer2.min_abs, batch_count=1452120.0, ans=0.5 2023-10-03 23:56:17,020 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_45886_210_17024_1_1533709215_471063_7-79165-0 from training. Duration: 0.98 2023-10-03 23:56:19,759 WARNING [train.py:1204] (2/4) Exclude cut with ID _44585_210_18628_1_1533605350387_4518199_280-73534-0_sp0.9 from training. Duration: 0.988875 2023-10-03 23:56:19,766 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_3475_210_2028_1_1526193578_3622569_265-276625-0_sp1.1 from training. Duration: 0.6909375 2023-10-03 23:56:22,550 WARNING [train.py:1204] (2/4) Exclude cut with ID _64631_210_27767_1_1535349580068_4165160_88-225232-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 23:56:25,599 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.self_attn_weights.pos_emb_skip_rate, batch_count=1452120.0, ans=0.0 2023-10-03 23:56:26,829 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_205-265518-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 23:56:32,237 WARNING [train.py:1204] (2/4) Exclude cut with ID _40402_210_18219_1_1533522324115_2953150_271-253734-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 23:56:36,413 WARNING [train.py:1204] (2/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_282-257791-0 from training. Duration: 0.86 2023-10-03 23:56:41,803 WARNING [train.py:1204] (2/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_356-45054-0 from training. Duration: 0.86 2023-10-03 23:56:41,852 WARNING [train.py:1204] (2/4) Exclude cut with ID _69584_210_5219_1_1535785133775_4044450_38-348116-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 23:56:41,854 WARNING [train.py:1204] (2/4) Exclude cut with ID _66763_210_17301_1_1535788667617_241750_21-328480-0_sp1.1 from training. Duration: 0.936375 2023-10-03 23:56:43,179 WARNING [train.py:1204] (2/4) Exclude cut with ID _26528_210_9070_1_1532570410842_3851310_497-97218-0_sp0.9 from training. Duration: 0.9555625 2023-10-03 23:56:43,233 WARNING [train.py:1204] (2/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_592-101075-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 23:56:46,270 WARNING [train.py:1204] (2/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_678-159307-0 from training. Duration: 0.85 2023-10-03 23:56:48,976 WARNING [train.py:1204] (2/4) Exclude cut with ID _28427_210_4220_1_1532581240967_3061069_166-319773-0_sp1.1 from training. Duration: 0.936375 2023-10-03 23:56:49,053 WARNING [train.py:1204] (2/4) Exclude cut with ID _29877_210_6228_1_1532397791106_5276149_606-224984-0_sp1.1 from training. Duration: 0.936375 2023-10-03 23:56:53,128 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_125-203105-0_sp1.1 from training. Duration: 0.72725 2023-10-03 23:56:55,943 WARNING [train.py:1204] (2/4) Exclude cut with ID IC0620W0113-306765 from training. Duration: 0.768 2023-10-03 23:56:57,393 WARNING [train.py:1204] (2/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_410-287857-0_sp1.1 from training. Duration: 0.8454375 2023-10-03 23:56:58,743 INFO [train.py:1046] (2/4) Epoch 42, batch 50, loss[loss=0.1537, simple_loss=0.2446, pruned_loss=0.0314, over 24307.00 frames. ], tot_loss[loss=0.1589, simple_loss=0.2403, pruned_loss=0.03879, over 1070250.26 frames. ], batch size: 74, lr: 2.43e-03, grad_scale: 16.0 2023-10-03 23:57:01,628 WARNING [train.py:1204] (2/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_78-95260-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 23:57:04,357 WARNING [train.py:1204] (2/4) Exclude cut with ID _58059_210_5854_1_1534901471149_3164808_6-73080-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 23:57:04,378 WARNING [train.py:1204] (2/4) Exclude cut with ID _5166_210_2084_1_1527073197160_4087993_331-117829-0 from training. Duration: 0.62 2023-10-03 23:57:05,672 WARNING [train.py:1204] (2/4) Exclude cut with ID _14880_210_8895_1_1530680426655_3795259_289-212817-0_sp0.9 from training. Duration: 0.7666875 2023-10-03 23:57:05,713 WARNING [train.py:1204] (2/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_620-144779-0_sp1.1 from training. Duration: 0.92725 2023-10-03 23:57:08,432 WARNING [train.py:1204] (2/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_561-288575-0_sp1.1 from training. Duration: 0.963625 2023-10-03 23:57:08,538 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_22657_210_6890_1_1532086002_1637216_122-188128-0_sp1.1 from training. Duration: 0.963625 2023-10-03 23:57:09,295 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.3.encoder.layers.0.self_attn_weights.whiten_keys, num_groups=8, num_channels=256, metric=4.60 vs. limit=6.0 2023-10-03 23:57:11,836 WARNING [train.py:1204] (2/4) Exclude cut with ID _16756_210_4915_1_1531047803763_7104259_604-86607-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 23:57:13,999 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.3.encoder.layers.1.feed_forward1.out_whiten, num_groups=1, num_channels=512, metric=6.68 vs. limit=15.0 2023-10-03 23:57:15,294 WARNING [train.py:1204] (2/4) Exclude cut with ID _15150_210_8341_1_1530752411803_7256384_469-239509-0 from training. Duration: 0.68 2023-10-03 23:57:15,311 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_10987_210_4163_1_1529760555_4123902_302-87926-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 23:57:20,334 WARNING [train.py:1204] (2/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_469-94673-0_sp0.9 from training. Duration: 0.87775 2023-10-03 23:57:21,796 WARNING [train.py:1204] (2/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_798-106179-0 from training. Duration: 0.83 2023-10-03 23:57:23,273 WARNING [train.py:1204] (2/4) Exclude cut with ID _16744_210_5986_1_1531724405384_3907567_242-111316-0 from training. Duration: 0.93 2023-10-03 23:57:25,928 WARNING [train.py:1204] (2/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_938-205135-0_sp0.9 from training. Duration: 0.9666875 2023-10-03 23:57:26,020 WARNING [train.py:1204] (2/4) Exclude cut with ID _64135_210_2253_1_1535677267876_7608972_133-188266-0_sp0.9 from training. Duration: 0.9 2023-10-03 23:57:26,024 WARNING [train.py:1204] (2/4) Exclude cut with ID _44585_210_18628_1_1533605350387_4518199_623-346820-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 23:57:27,347 WARNING [train.py:1204] (2/4) Exclude cut with ID _42326_210_7770_1_1533814282531_3707269_454-266903-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 23:57:28,714 WARNING [train.py:1204] (2/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_5-55893-0_sp1.1 from training. Duration: 0.5 2023-10-03 23:57:28,762 WARNING [train.py:1204] (2/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_577-332600-0_sp1.1 from training. Duration: 0.5181875 2023-10-03 23:57:28,763 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_957-138882-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 23:57:35,779 WARNING [train.py:1204] (2/4) Exclude cut with ID _12556_210_8341_1_1530081024100_7327690_219-38004-0_sp1.1 from training. Duration: 0.97275 2023-10-03 23:57:37,145 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_397-147196-0_sp1.1 from training. Duration: 0.72725 2023-10-03 23:57:37,155 WARNING [train.py:1204] (2/4) Exclude cut with ID _3541_210_2477_1_1526475408709_3263973_231-104736-0_sp0.9 from training. Duration: 0.8666875 2023-10-03 23:57:38,508 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.593e+02 2.003e+02 2.232e+02 2.630e+02 3.790e+02, threshold=4.463e+02, percent-clipped=0.0 2023-10-03 23:57:38,588 WARNING [train.py:1204] (2/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_398-334558-0 from training. Duration: 0.96 2023-10-03 23:57:38,711 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_257-20733-0_sp1.1 from training. Duration: 0.6909375 2023-10-03 23:57:40,060 WARNING [train.py:1204] (2/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_146-282400-0_sp1.1 from training. Duration: 0.6454375 2023-10-03 23:57:40,064 WARNING [train.py:1204] (2/4) Exclude cut with ID _51899_210_20034_1_1534129254466_3669220_265-215428-0 from training. Duration: 0.98 2023-10-03 23:57:40,130 WARNING [train.py:1204] (2/4) Exclude cut with ID _34423_210_8750_1_1533430798456_3330199_219-106541-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 23:57:42,033 WARNING [train.py:1204] (2/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_458-67321-0 from training. Duration: 0.97 2023-10-03 23:57:45,685 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.nonlin_attention.balancer.prob, batch_count=1452520.0, ans=0.125 2023-10-03 23:57:51,740 WARNING [train.py:1204] (2/4) Exclude cut with ID _6653_210_2629_1_1527919172441_6732679_1051-182606-0_sp1.1 from training. Duration: 0.936375 2023-10-03 23:57:51,763 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_53555_210_5632_1_1534595220_7232297_603-4396-0_sp1.1 from training. Duration: 0.97275 2023-10-03 23:57:53,157 WARNING [train.py:1204] (2/4) Exclude cut with ID _54218_210_12210_1_1534413646388_4409259_159-144367-0_sp1.1 from training. Duration: 0.963625 2023-10-03 23:57:53,301 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.balancer2.prob, batch_count=1452520.0, ans=0.125 2023-10-03 23:57:54,491 WARNING [train.py:1204] (2/4) Exclude cut with ID _7606_210_4915_1_1528894253277_5080690_301-10732-0_sp0.9 from training. Duration: 0.9 2023-10-03 23:57:54,495 WARNING [train.py:1204] (2/4) Exclude cut with ID _15026_210_8750_1_1530777602316_3291850_211-195043-0_sp0.9 from training. Duration: 0.888875 2023-10-03 23:57:57,215 WARNING [train.py:1204] (2/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_146-282400-0 from training. Duration: 0.71 2023-10-03 23:57:57,252 WARNING [train.py:1204] (2/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_65-88242-0 from training. Duration: 0.56 2023-10-03 23:57:58,722 WARNING [train.py:1204] (2/4) Exclude cut with ID _55857_210_2346_1_1534644007831_6898209_80-107757-0_sp1.1 from training. Duration: 0.963625 2023-10-03 23:57:58,761 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_15126_210_6753_1_1530778677_2591185_120-277707-0_sp0.9 from training. Duration: 0.888875 2023-10-03 23:58:00,096 WARNING [train.py:1204] (2/4) Exclude cut with ID _44778_210_2054_1_1533798127203_7443090_1033-168593-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 23:58:00,155 WARNING [train.py:1204] (2/4) Exclude cut with ID _46683_210_9642_1_1533730913492_4591116_475-30711-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 23:58:00,178 WARNING [train.py:1204] (2/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_253-308377-0 from training. Duration: 0.49 2023-10-03 23:58:01,493 WARNING [train.py:1204] (2/4) Exclude cut with ID _4680_210_3525_1_1527249757953_893386_75-78979-0 from training. Duration: 0.91 2023-10-03 23:58:02,852 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_5522_210_3273_1_1527249606_3909096_318-203153-0_sp0.9 from training. Duration: 0.47775 2023-10-03 23:58:04,128 WARNING [train.py:1204] (2/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_550-170116-0_sp1.1 from training. Duration: 0.92725 2023-10-03 23:58:05,336 WARNING [train.py:1204] (2/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_277-210798-0_sp0.9 from training. Duration: 0.611125 2023-10-03 23:58:06,692 WARNING [train.py:1204] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_620-276793-0 from training. Duration: 0.59 2023-10-03 23:58:06,702 WARNING [train.py:1204] (2/4) Exclude cut with ID _3067_210_2667_1_1526212974640_4503168_119-175776-0 from training. Duration: 0.74 2023-10-03 23:58:06,775 WARNING [train.py:1204] (2/4) Exclude cut with ID _55209_210_1531_1_1534675828996_4597160_681-195827-0_sp1.1 from training. Duration: 0.92725 2023-10-03 23:58:08,135 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_427-78097-0_sp1.1 from training. Duration: 0.72725 2023-10-03 23:58:08,247 WARNING [train.py:1204] (2/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_96-290039-0_sp0.9 from training. Duration: 0.62225 2023-10-03 23:58:09,524 WARNING [train.py:1204] (2/4) Exclude cut with ID _45329_210_7005_1_1534157951211_6774339_287-292174-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 23:58:12,874 INFO [train.py:1046] (2/4) Epoch 42, batch 100, loss[loss=0.1412, simple_loss=0.222, pruned_loss=0.03024, over 22781.00 frames. ], tot_loss[loss=0.1584, simple_loss=0.2389, pruned_loss=0.03892, over 1874087.56 frames. ], batch size: 50, lr: 2.43e-03, grad_scale: 16.0 2023-10-03 23:58:12,980 WARNING [train.py:1204] (2/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_200-292228-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 23:58:15,643 WARNING [train.py:1204] (2/4) Exclude cut with ID _69884_210_9771_1_1535846392818_4166169_291-194999-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 23:58:19,646 WARNING [train.py:1204] (2/4) Exclude cut with ID _58895_210_8750_1_1535184081320_3649089_265-323810-0_sp1.1 from training. Duration: 0.87275 2023-10-03 23:58:21,687 WARNING [train.py:1204] (2/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_58-68956-0 from training. Duration: 0.83 2023-10-03 23:58:21,691 WARNING [train.py:1204] (2/4) Exclude cut with ID _39867_210_9826_1_1533553222030_9344259_322-115153-0_sp1.1 from training. Duration: 0.963625 2023-10-03 23:58:23,264 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.conv_module2.balancer1.prob, batch_count=1452653.3333333333, ans=0.125 2023-10-03 23:58:25,709 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_3371_210_1794_1_1526384462_5580777_396-339812-0_sp1.1 from training. Duration: 0.77275 2023-10-03 23:58:25,729 WARNING [train.py:1204] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_731-2428-0_sp0.9 from training. Duration: 0.92225 2023-10-03 23:58:25,749 WARNING [train.py:1204] (2/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_98-741-0_sp1.1 from training. Duration: 0.72725 2023-10-03 23:58:25,751 WARNING [train.py:1204] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_238-245655-0_sp0.9 from training. Duration: 0.9 2023-10-03 23:58:26,992 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6788_210_1388_1_1527898698_4562021_91-197981-0_sp0.9 from training. Duration: 0.92225 2023-10-03 23:58:28,399 WARNING [train.py:1204] (2/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_860-96198-0 from training. Duration: 0.73 2023-10-03 23:58:31,213 WARNING [train.py:1204] (2/4) Exclude cut with ID _14268_210_9206_1_1530622771285_4714099_331-105351-0_sp0.9 from training. Duration: 0.72225 2023-10-03 23:58:31,236 WARNING [train.py:1204] (2/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_204-137133-0_sp1.1 from training. Duration: 0.92725 2023-10-03 23:58:31,264 WARNING [train.py:1204] (2/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_5-46496-0_sp1.1 from training. Duration: 0.936375 2023-10-03 23:58:31,272 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_160-218721-0_sp1.1 from training. Duration: 0.87275 2023-10-03 23:58:34,101 WARNING [train.py:1204] (2/4) Exclude cut with ID _45329_210_7005_1_1534157951211_6774339_503-287883-0 from training. Duration: 0.88 2023-10-03 23:58:35,421 WARNING [train.py:1204] (2/4) Exclude cut with ID _57572_210_5517_1_1534921219624_3809484_110-79134-0_sp1.1 from training. Duration: 0.92725 2023-10-03 23:58:35,497 WARNING [train.py:1204] (2/4) Exclude cut with ID _44585_210_18628_1_1533605350387_4518199_421-37641-0_sp1.1 from training. Duration: 0.936375 2023-10-03 23:58:36,896 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_42-142473-0_sp0.9 from training. Duration: 0.8 2023-10-03 23:58:38,319 WARNING [train.py:1204] (2/4) Exclude cut with ID _5166_210_2084_1_1527073197160_4087993_310-114452-0_sp0.9 from training. Duration: 0.6555625 2023-10-03 23:58:41,100 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9029W0120-509520 from training. Duration: 0.768 2023-10-03 23:58:41,121 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9029W0433-509820 from training. Duration: 0.896 2023-10-03 23:58:42,407 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_40222_210_6228_1_1533385298_4481939_163-334586-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 23:58:42,408 WARNING [train.py:1204] (2/4) Exclude cut with ID _48677_210_5663_1_1533902405919_3892989_408-40740-0_sp1.1 from training. Duration: 0.7545625 2023-10-03 23:58:45,687 WARNING [train.py:1204] (2/4) Exclude cut with ID _24314_210_4915_1_1532135078116_7170179_521-258868-0_sp0.9 from training. Duration: 0.87775 2023-10-03 23:58:48,788 WARNING [train.py:1204] (2/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_576-51198-0_sp1.1 from training. Duration: 0.92725 2023-10-03 23:58:50,529 WARNING [train.py:1204] (2/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_306-216871-0_sp1.1 from training. Duration: 0.97275 2023-10-03 23:58:55,961 WARNING [train.py:1204] (2/4) Exclude cut with ID _20684_210_6917_1_1531567751646_4203930_463-119982-0_sp1.1 from training. Duration: 0.97275 2023-10-03 23:58:57,315 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9084W0012-536850 from training. Duration: 0.64 2023-10-03 23:59:00,032 WARNING [train.py:1204] (2/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_847-41334-0_sp1.1 from training. Duration: 0.4 2023-10-03 23:59:02,819 WARNING [train.py:1204] (2/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_326-165900-0_sp1.1 from training. Duration: 0.62725 2023-10-03 23:59:04,151 WARNING [train.py:1204] (2/4) Exclude cut with ID _7537_210_2084_1_1528502395431_3924359_128-268029-0_sp1.1 from training. Duration: 0.8909375 2023-10-03 23:59:06,752 WARNING [train.py:1204] (2/4) Exclude cut with ID _67499_210_17282_1_1535509805220_3692430_333-155030-0_sp1.1 from training. Duration: 0.97275 2023-10-03 23:59:09,515 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_34008_210_10431_1_1533345424_8183247_588-257177-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 23:59:09,765 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.feed_forward3.hidden_balancer.prob, batch_count=1452920.0, ans=0.125 2023-10-03 23:59:13,477 WARNING [train.py:1204] (2/4) Exclude cut with ID _25914_210_6400_1_1532141317058_644740_52-96808-0_sp1.1 from training. Duration: 0.82725 2023-10-03 23:59:14,945 WARNING [train.py:1204] (2/4) Exclude cut with ID _56494_210_4220_1_1535284837625_3033169_69-138150-0_sp1.1 from training. Duration: 0.8545625 2023-10-03 23:59:16,428 WARNING [train.py:1204] (2/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_19-143553-0_sp1.1 from training. Duration: 0.97275 2023-10-03 23:59:18,287 WARNING [train.py:1204] (2/4) Exclude cut with ID _21878_210_3525_1_1531738772002_7264330_582-130735-0_sp1.1 from training. Duration: 0.936375 2023-10-03 23:59:19,681 WARNING [train.py:1204] (2/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_943-201356-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 23:59:19,691 WARNING [train.py:1204] (2/4) Exclude cut with ID _3231_210_2106_1_1526205864934_6784220_422-229276-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 23:59:19,726 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_48049_210_5632_1_1533864553_3519937_73-198717-0_sp1.1 from training. Duration: 0.97275 2023-10-03 23:59:21,100 WARNING [train.py:1204] (2/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_329-285801-0 from training. Duration: 0.91 2023-10-03 23:59:21,126 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9171W0114-579000 from training. Duration: 0.896 2023-10-03 23:59:21,742 WARNING [train.py:1204] (2/4) Exclude cut with ID _3120_210_3525_1_1526036143112_4072980_210-132675-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 23:59:22,913 WARNING [train.py:1204] (2/4) Exclude cut with ID _55857_210_2346_1_1534644007831_6898209_745-98391-0_sp0.9 from training. Duration: 0.8444375 2023-10-03 23:59:24,719 WARNING [train.py:1204] (2/4) Exclude cut with ID _5250_210_5219_1_1527249561806_8692110_615-322035-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 23:59:24,723 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_36756_210_7234_1_1533026991_966276_119-315767-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 23:59:24,738 WARNING [train.py:1204] (2/4) Exclude cut with ID _13916_210_9994_1_1530788341126_3763149_60-191556-0_sp1.1 from training. Duration: 0.5545625 2023-10-03 23:59:24,759 WARNING [train.py:1204] (2/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_177-236809-0_sp1.1 from training. Duration: 0.4454375 2023-10-03 23:59:24,945 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.conv_module1.balancer1.prob, batch_count=1452986.6666666667, ans=0.125 2023-10-03 23:59:26,529 INFO [train.py:1046] (2/4) Epoch 42, batch 150, loss[loss=0.1505, simple_loss=0.2356, pruned_loss=0.03272, over 24474.00 frames. ], tot_loss[loss=0.1581, simple_loss=0.2389, pruned_loss=0.03866, over 2513724.39 frames. ], batch size: 66, lr: 2.43e-03, grad_scale: 8.0 2023-10-03 23:59:26,593 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7002_210_1794_1_1527939683_5157286_184-229630-0_sp1.1 from training. Duration: 0.67275 2023-10-03 23:59:26,598 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_50428_210_5724_1_1534247532_4126418_464-18322-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 23:59:26,659 WARNING [train.py:1204] (2/4) Exclude cut with ID _35799_210_5854_1_1533263487887_3529071_466-131402-0_sp1.1 from training. Duration: 0.936375 2023-10-03 23:59:26,939 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.conv_module1.balancer1.prob, batch_count=1452986.6666666667, ans=0.125 2023-10-03 23:59:28,030 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_1259-132768-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 23:59:28,084 WARNING [train.py:1204] (2/4) Exclude cut with ID _6694_210_4599_1_1527852637490_3965529_144-185051-0_sp1.1 from training. Duration: 0.87275 2023-10-03 23:59:28,124 WARNING [train.py:1204] (2/4) Exclude cut with ID _64639_210_5816_1_1535367619437_3528000_59-133290-0_sp1.1 from training. Duration: 0.963625 2023-10-03 23:59:30,929 WARNING [train.py:1204] (2/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_223-49525-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 23:59:33,621 WARNING [train.py:1204] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_748-36150-0_sp1.1 from training. Duration: 0.82725 2023-10-03 23:59:33,622 WARNING [train.py:1204] (2/4) Exclude cut with ID _19896_210_1883_1_1531709022662_6649569_52-221627-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 23:59:34,922 WARNING [train.py:1204] (2/4) Exclude cut with ID _6789_210_1534_1_1527854471671_2870519_296-115766-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 23:59:35,096 WARNING [train.py:1204] (2/4) Exclude cut with ID _14624_210_1925_1_1530858690657_3642219_100-19871-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 23:59:36,537 WARNING [train.py:1204] (2/4) Exclude cut with ID _12555_210_8750_1_1530075594402_3780239_50-293675-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 23:59:37,993 WARNING [train.py:1204] (2/4) Exclude cut with ID _14880_210_8895_1_1530680426655_3795259_289-212817-0_sp1.1 from training. Duration: 0.62725 2023-10-03 23:59:39,326 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_14056_210_4163_1_1530604483_7572827_388-211626-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 23:59:39,641 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.feed_forward1.out_proj.dropout_p, batch_count=1453053.3333333333, ans=0.1 2023-10-03 23:59:43,270 WARNING [train.py:1204] (2/4) Exclude cut with ID _5289_210_2084_1_1527678202736_3779299_392-261988-0 from training. Duration: 0.65 2023-10-03 23:59:43,287 WARNING [train.py:1204] (2/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_124-345840-0 from training. Duration: 0.52 2023-10-03 23:59:43,292 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_2655_210_2087_1_1525688959_3897400_437-337690-0 from training. Duration: 0.63 2023-10-03 23:59:46,134 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_3780_210_2087_1_1526467391_239068_13-274633-0_sp1.1 from training. Duration: 0.92725 2023-10-03 23:59:46,138 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_805-71497-0_sp1.1 from training. Duration: 0.6454375 2023-10-03 23:59:46,392 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=1453053.3333333333, ans=0.1 2023-10-03 23:59:47,549 WARNING [train.py:1204] (2/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_434-83000-0_sp1.1 from training. Duration: 0.8909375 2023-10-03 23:59:48,916 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_17613_210_2151_1_1531137569_2366160_285-219531-0_sp1.1 from training. Duration: 0.97275 2023-10-03 23:59:48,930 WARNING [train.py:1204] (2/4) Exclude cut with ID _56108_210_16380_1_1534505446159_3887959_22-37858-0_sp1.1 from training. Duration: 0.936375 2023-10-03 23:59:48,963 WARNING [train.py:1204] (2/4) Exclude cut with ID _28427_210_4220_1_1532581240967_3061069_200-88444-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 23:59:49,652 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_43802_210_2290_1_1533516203_8829504_77-279042-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 23:59:50,945 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9309W0193-640320 from training. Duration: 0.896 2023-10-03 23:59:52,296 WARNING [train.py:1204] (2/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_516-324055-0_sp1.1 from training. Duration: 0.936375 2023-10-03 23:59:56,671 WARNING [train.py:1204] (2/4) Exclude cut with ID _40094_210_16380_1_1533294005773_3641159_10-261240-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 23:59:58,193 INFO [scaling.py:1118] (2/4) WithLoss: name=encoder.encoders.0.layers.0.self_attn_weights, loss-sum=0.000e+00 2023-10-03 23:59:59,448 WARNING [train.py:1204] (2/4) Exclude cut with ID _6267_210_2084_1_1527764658526_3894730_375-219887-0_sp1.1 from training. Duration: 0.6090625 2023-10-04 00:00:00,768 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6788_210_1388_1_1527898698_4562021_180-75288-0 from training. Duration: 0.99 2023-10-04 00:00:03,568 WARNING [train.py:1204] (2/4) Exclude cut with ID _8030_210_7497_1_1528444886781_3531089_262-86750-0_sp1.1 from training. Duration: 0.736375 2023-10-04 00:00:04,835 WARNING [train.py:1204] (2/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_934-325498-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 00:00:04,865 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_456-22141-0_sp0.9 from training. Duration: 0.97775 2023-10-04 00:00:06,212 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.574e+02 1.980e+02 2.184e+02 2.445e+02 3.491e+02, threshold=4.367e+02, percent-clipped=0.0 2023-10-04 00:00:06,422 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.0.nonlin_attention.balancer.prob, batch_count=1453120.0, ans=0.125 2023-10-04 00:00:07,643 WARNING [train.py:1204] (2/4) Exclude cut with ID _49643_210_7989_1_1534640244224_3939031_598-161926-0_sp1.1 from training. Duration: 0.8181875 2023-10-04 00:00:08,940 WARNING [train.py:1204] (2/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_345-49721-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 00:00:10,304 WARNING [train.py:1204] (2/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_440-141811-0_sp1.1 from training. Duration: 0.863625 2023-10-04 00:00:11,761 WARNING [train.py:1204] (2/4) Exclude cut with ID _64048_210_10626_1_1535198382802_3835179_394-212120-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 00:00:11,819 WARNING [train.py:1204] (2/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_91-277684-0 from training. Duration: 0.74 2023-10-04 00:00:16,291 WARNING [train.py:1204] (2/4) Exclude cut with ID _23038_210_9570_1_1532001584158_3652419_191-346343-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 00:00:17,614 WARNING [train.py:1204] (2/4) Exclude cut with ID _15140_210_9288_1_1530791388450_4880149_351-347974-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 00:00:19,384 WARNING [train.py:1204] (2/4) Exclude cut with ID _58605_210_12210_1_1534723195939_3904259_48-269402-0_sp1.1 from training. Duration: 0.87275 2023-10-04 00:00:19,391 WARNING [train.py:1204] (2/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_401-53423-0_sp0.9 from training. Duration: 0.611125 2023-10-04 00:00:19,927 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.5.encoder.layers.0.whiten, num_groups=1, num_channels=256, metric=2.14 vs. limit=12.0 2023-10-04 00:00:21,235 WARNING [train.py:1204] (2/4) Exclude cut with ID _42659_210_2070_1_1533607442765_3120380_158-49004-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 00:00:23,981 WARNING [train.py:1204] (2/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_81-75849-0_sp1.1 from training. Duration: 0.3545625 2023-10-04 00:00:25,951 WARNING [train.py:1204] (2/4) Exclude cut with ID _27052_210_5986_1_1532329197187_3585009_314-106499-0_sp1.1 from training. Duration: 0.836375 2023-10-04 00:00:26,134 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.0.conv_skip_rate, batch_count=1453253.3333333333, ans=0.0 2023-10-04 00:00:27,382 WARNING [train.py:1204] (2/4) Exclude cut with ID _4626_210_3532_1_1526817652717_3916859_446-206656-0_sp0.9 from training. Duration: 0.8444375 2023-10-04 00:00:28,733 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_53378_210_17282_1_1534221071_5769509_158-114960-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 00:00:30,141 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_3171_210_2087_1_1526109084_2330007_37-64334-0_sp0.9 from training. Duration: 0.911125 2023-10-04 00:00:30,157 WARNING [train.py:1204] (2/4) Exclude cut with ID _5410_210_1388_1_1527397055795_1719020_39-112221-0 from training. Duration: 0.69 2023-10-04 00:00:30,175 WARNING [train.py:1204] (2/4) Exclude cut with ID _8064_210_5399_1_1528372299291_4282973_4-91037-0_sp0.9 from training. Duration: 0.97775 2023-10-04 00:00:31,496 WARNING [train.py:1204] (2/4) Exclude cut with ID ID0079W0443-717795 from training. Duration: 0.541 2023-10-04 00:00:34,216 WARNING [train.py:1204] (2/4) Exclude cut with ID _34734_210_4525_1_1533365977542_4448720_766-297928-0_sp1.1 from training. Duration: 0.936375 2023-10-04 00:00:38,123 INFO [train.py:1046] (2/4) Epoch 42, batch 200, loss[loss=0.16, simple_loss=0.2509, pruned_loss=0.03459, over 24450.00 frames. ], tot_loss[loss=0.1591, simple_loss=0.2397, pruned_loss=0.03924, over 2992690.93 frames. ], batch size: 69, lr: 2.43e-03, grad_scale: 8.0 2023-10-04 00:00:38,241 WARNING [train.py:1204] (2/4) Exclude cut with ID _51471_210_1889_1_1534399391390_3094250_310-337793-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 00:00:38,260 WARNING [train.py:1204] (2/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_678-159307-0_sp0.9 from training. Duration: 0.9444375 2023-10-04 00:00:41,239 WARNING [train.py:1204] (2/4) Exclude cut with ID _50093_210_4220_1_1534417141207_3432299_259-322252-0 from training. Duration: 0.92 2023-10-04 00:00:41,307 WARNING [train.py:1204] (2/4) Exclude cut with ID _47790_210_16187_1_1533862814066_4207519_67-103444-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 00:00:42,629 WARNING [train.py:1204] (2/4) Exclude cut with ID _40551_210_10635_1_1533776424632_3416179_311-247578-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 00:00:44,044 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_4633_210_2040_1_1526824163_4145343_416-76117-0 from training. Duration: 0.95 2023-10-04 00:00:45,510 WARNING [train.py:1204] (2/4) Exclude cut with ID _5166_210_2084_1_1527073197160_4087993_331-117829-0_sp0.9 from training. Duration: 0.688875 2023-10-04 00:00:46,948 WARNING [train.py:1204] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_299-247059-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 00:00:48,303 WARNING [train.py:1204] (2/4) Exclude cut with ID _64002_210_9570_1_1535198605157_38979_2-240845-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 00:00:54,301 WARNING [train.py:1204] (2/4) Exclude cut with ID _4681_210_3525_1_1527250689464_120889_15-126760-0_sp0.9 from training. Duration: 0.9 2023-10-04 00:00:54,316 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_21500_210_2054_1_1532149310_2716758_256-156611-0_sp1.1 from training. Duration: 0.936375 2023-10-04 00:00:54,346 WARNING [train.py:1204] (2/4) Exclude cut with ID _16908_210_5724_1_1531119157248_3772700_624-203729-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 00:01:12,619 WARNING [train.py:1204] (2/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_605-219732-0_sp1.1 from training. Duration: 0.8545625 2023-10-04 00:01:12,655 WARNING [train.py:1204] (2/4) Exclude cut with ID _44718_210_5816_1_1534050051869_6469030_380-167520-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 00:01:14,009 WARNING [train.py:1204] (2/4) Exclude cut with ID _19896_210_1883_1_1531709022662_6649569_94-229927-0_sp1.1 from training. Duration: 0.8818125 2023-10-04 00:01:14,066 WARNING [train.py:1204] (2/4) Exclude cut with ID _19514_210_9259_1_1531530071330_5084230_519-174300-0_sp1.1 from training. Duration: 0.963625 2023-10-04 00:01:14,312 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.self_attn_weights.pos_emb_skip_rate, batch_count=1453453.3333333333, ans=0.0 2023-10-04 00:01:15,385 WARNING [train.py:1204] (2/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_340-174176-0_sp0.9 from training. Duration: 0.5444375 2023-10-04 00:01:15,389 WARNING [train.py:1204] (2/4) Exclude cut with ID _8263_210_1664_1_1528439452891_970220_68-181805-0_sp1.1 from training. Duration: 0.5818125 2023-10-04 00:01:18,134 WARNING [train.py:1204] (2/4) Exclude cut with ID _3120_210_3525_1_1526036143112_4072980_348-257723-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 00:01:19,512 WARNING [train.py:1204] (2/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_453-133983-0_sp1.1 from training. Duration: 0.7090625 2023-10-04 00:01:19,576 WARNING [train.py:1204] (2/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_17-86060-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 00:01:19,597 WARNING [train.py:1204] (2/4) Exclude cut with ID _29877_210_6228_1_1532397791106_5276149_228-184657-0_sp1.1 from training. Duration: 0.97275 2023-10-04 00:01:20,971 WARNING [train.py:1204] (2/4) Exclude cut with ID _18516_210_9206_1_1531704653206_3692406_198-334870-0 from training. Duration: 0.92 2023-10-04 00:01:21,998 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.0.layers.0.self_attn1.whiten, num_groups=1, num_channels=192, metric=10.72 vs. limit=22.5 2023-10-04 00:01:22,642 WARNING [train.py:1204] (2/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_340-174176-0_sp1.1 from training. Duration: 0.4454375 2023-10-04 00:01:22,661 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_3133_210_2087_1_1526106446_2324690_14-139186-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 00:01:26,583 WARNING [train.py:1204] (2/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_126-29104-0_sp0.9 from training. Duration: 0.9444375 2023-10-04 00:01:33,822 WARNING [train.py:1204] (2/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_84-10310-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 00:01:39,499 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.self_attn_weights.pos_emb_skip_rate, batch_count=1453586.6666666667, ans=0.0 2023-10-04 00:01:40,532 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_15517_210_6753_1_1530954008_3487336_79-273076-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 00:01:40,594 WARNING [train.py:1204] (2/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_371-18169-0_sp0.9 from training. Duration: 0.9555625 2023-10-04 00:01:47,571 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_68702_210_23689_1_1535799577_3420068_170-92788-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 00:01:47,726 WARNING [train.py:1204] (2/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_78-182912-0 from training. Duration: 0.59 2023-10-04 00:01:49,066 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_3019_210_2028_1_1526044245_3264865_297-12384-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 00:01:49,078 WARNING [train.py:1204] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_147-111137-0_sp1.1 from training. Duration: 0.863625 2023-10-04 00:01:49,083 WARNING [train.py:1204] (2/4) Exclude cut with ID _28323_210_16187_1_1532480395667_4100300_117-8859-0_sp1.1 from training. Duration: 0.97275 2023-10-04 00:01:50,401 INFO [train.py:1046] (2/4) Epoch 42, batch 250, loss[loss=0.1466, simple_loss=0.2132, pruned_loss=0.03996, over 23364.00 frames. ], tot_loss[loss=0.1582, simple_loss=0.2392, pruned_loss=0.03864, over 3382021.38 frames. ], batch size: 285, lr: 2.43e-03, grad_scale: 8.0 2023-10-04 00:01:50,468 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7762_210_1794_1_1528199508_4057408_419-125009-0_sp1.1 from training. Duration: 0.7545625 2023-10-04 00:01:52,499 WARNING [train.py:1204] (2/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_268-87909-0 from training. Duration: 0.99 2023-10-04 00:01:52,567 WARNING [train.py:1204] (2/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_118-311357-0_sp1.1 from training. Duration: 0.92725 2023-10-04 00:01:52,587 WARNING [train.py:1204] (2/4) Exclude cut with ID ID0377W0003-870225 from training. Duration: 0.852 2023-10-04 00:01:55,299 WARNING [train.py:1204] (2/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_56-5838-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 00:01:57,195 WARNING [train.py:1204] (2/4) Exclude cut with ID _14603_210_4220_1_1530615688441_3097319_90-31383-0_sp0.9 from training. Duration: 0.9666875 2023-10-04 00:02:00,328 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_31583_210_3852_1_1532595851_4024413_58-86657-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 00:02:00,371 WARNING [train.py:1204] (2/4) Exclude cut with ID _30304_210_6319_1_1532476827261_7582060_344-113875-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 00:02:03,141 WARNING [train.py:1204] (2/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_278-104676-0_sp1.1 from training. Duration: 0.8909375 2023-10-04 00:02:04,577 WARNING [train.py:1204] (2/4) Exclude cut with ID _55844_210_2346_1_1534471212569_7625707_764-2387-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 00:02:04,683 WARNING [train.py:1204] (2/4) Exclude cut with ID _27430_210_7770_1_1532604689375_1378590_157-14963-0_sp1.1 from training. Duration: 0.963625 2023-10-04 00:02:07,530 WARNING [train.py:1204] (2/4) Exclude cut with ID _7160_210_1876_1_1527940923231_3703799_282-322611-0_sp1.1 from training. Duration: 0.7909375 2023-10-04 00:02:10,849 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.nonlin_attention.balancer.min_positive, batch_count=1453720.0, ans=0.05 2023-10-04 00:02:14,660 WARNING [train.py:1204] (2/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_190-138640-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 00:02:14,864 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.conv_module2.balancer1.prob, batch_count=1453720.0, ans=0.125 2023-10-04 00:02:14,892 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.conv_skip_rate, batch_count=1453720.0, ans=0.0 2023-10-04 00:02:17,365 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_33348_210_5632_1_1533086912_3324030_63-270922-0_sp1.1 from training. Duration: 0.97275 2023-10-04 00:02:17,393 WARNING [train.py:1204] (2/4) Exclude cut with ID _54871_210_22356_1_1534665798253_3856359_135-184281-0_sp1.1 from training. Duration: 0.9 2023-10-04 00:02:23,409 WARNING [train.py:1204] (2/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_277-210798-0_sp1.1 from training. Duration: 0.5 2023-10-04 00:02:24,733 WARNING [train.py:1204] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_402-253027-0_sp0.9 from training. Duration: 0.6 2023-10-04 00:02:24,812 WARNING [train.py:1204] (2/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_540-275685-0_sp0.9 from training. Duration: 0.988875 2023-10-04 00:02:24,846 WARNING [train.py:1204] (2/4) Exclude cut with ID _59493_210_17282_1_1535078419077_3225879_118-18960-0_sp1.1 from training. Duration: 0.936375 2023-10-04 00:02:27,314 WARNING [train.py:1204] (2/4) Exclude cut with ID _6194_210_1534_1_1527511826160_3941194_321-148641-0_sp1.1 from training. Duration: 0.5818125 2023-10-04 00:02:27,327 WARNING [train.py:1204] (2/4) Exclude cut with ID _61133_210_9969_1_1535196777227_3696409_333-270325-0_sp1.1 from training. Duration: 0.8454375 2023-10-04 00:02:28,459 WARNING [train.py:1204] (2/4) Exclude cut with ID _49376_210_5986_1_1533985278732_3370740_307-145221-0_sp1.1 from training. Duration: 0.936375 2023-10-04 00:02:29,975 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.feed_forward1.out_proj.dropout_p, batch_count=1453786.6666666667, ans=0.1 2023-10-04 00:02:31,015 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6846_210_6753_1_1527850767_4295158_2-315073-0_sp0.9 from training. Duration: 0.97775 2023-10-04 00:02:32,296 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.657e+02 2.009e+02 2.241e+02 2.579e+02 4.202e+02, threshold=4.483e+02, percent-clipped=0.0 2023-10-04 00:02:32,421 WARNING [train.py:1204] (2/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_17-25045-0 from training. Duration: 0.59 2023-10-04 00:02:32,441 WARNING [train.py:1204] (2/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_503-333510-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 00:02:35,094 WARNING [train.py:1204] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_238-245655-0_sp1.1 from training. Duration: 0.736375 2023-10-04 00:02:35,115 WARNING [train.py:1204] (2/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_329-82235-0_sp0.9 from training. Duration: 0.911125 2023-10-04 00:02:35,120 WARNING [train.py:1204] (2/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_325-113798-0_sp0.9 from training. Duration: 0.9444375 2023-10-04 00:02:36,456 WARNING [train.py:1204] (2/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_395-37222-0_sp0.9 from training. Duration: 0.8444375 2023-10-04 00:02:36,543 WARNING [train.py:1204] (2/4) Exclude cut with ID _64135_210_2253_1_1535677267876_7608972_498-5039-0_sp1.1 from training. Duration: 0.7090625 2023-10-04 00:02:37,890 WARNING [train.py:1204] (2/4) Exclude cut with ID _14922_210_6228_1_1530707435273_3693310_303-228738-0_sp0.9 from training. Duration: 0.9333125 2023-10-04 00:02:40,563 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_577-220899-0_sp1.1 from training. Duration: 0.92725 2023-10-04 00:02:40,706 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.conv_module1.balancer1.prob, batch_count=1453853.3333333333, ans=0.125 2023-10-04 00:02:41,964 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_3475_210_2028_1_1526193578_3622569_377-166133-0_sp1.1 from training. Duration: 0.7818125 2023-10-04 00:02:41,997 WARNING [train.py:1204] (2/4) Exclude cut with ID _49376_210_5986_1_1533985278732_3370740_272-313512-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 00:02:46,240 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.attention_skip_rate, batch_count=1453853.3333333333, ans=0.0 2023-10-04 00:02:47,378 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7762_210_1794_1_1528199508_4057408_422-227034-0_sp0.9 from training. Duration: 0.811125 2023-10-04 00:02:50,109 WARNING [train.py:1204] (2/4) Exclude cut with ID _18895_210_6228_1_1531357274234_3784930_74-341428-0_sp1.1 from training. Duration: 0.92725 2023-10-04 00:02:53,359 WARNING [train.py:1204] (2/4) Exclude cut with ID _44902_210_16187_1_1533888006062_3792078_149-67212-0_sp1.1 from training. Duration: 0.7909375 2023-10-04 00:02:59,417 WARNING [train.py:1204] (2/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_243-126020-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 00:03:01,293 WARNING [train.py:1204] (2/4) Exclude cut with ID _54218_210_12210_1_1534413646388_4409259_259-6561-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 00:03:03,956 INFO [train.py:1046] (2/4) Epoch 42, batch 300, loss[loss=0.1374, simple_loss=0.2217, pruned_loss=0.02661, over 24344.00 frames. ], tot_loss[loss=0.156, simple_loss=0.2366, pruned_loss=0.03765, over 3673852.99 frames. ], batch size: 61, lr: 2.43e-03, grad_scale: 8.0 2023-10-04 00:03:04,215 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.balancer1.prob, batch_count=1453986.6666666667, ans=0.125 2023-10-04 00:03:05,398 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_41452_210_7291_1_1533437687_3941281_405-11559-0 from training. Duration: 0.91 2023-10-04 00:03:05,487 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_20120_210_6753_1_1531643035_5482859_759-225084-0_sp1.1 from training. Duration: 0.97275 2023-10-04 00:03:05,502 WARNING [train.py:1204] (2/4) Exclude cut with ID _14747_210_8341_1_1530685797463_3654089_147-131229-0_sp0.9 from training. Duration: 0.8444375 2023-10-04 00:03:06,959 WARNING [train.py:1204] (2/4) Exclude cut with ID _14867_210_1968_1_1530684179890_3846969_319-250868-0 from training. Duration: 0.99 2023-10-04 00:03:06,991 WARNING [train.py:1204] (2/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_580-93798-0_sp0.9 from training. Duration: 0.62225 2023-10-04 00:03:08,446 WARNING [train.py:1204] (2/4) Exclude cut with ID _9679_210_7497_1_1529129212906_4363929_512-313889-0_sp0.9 from training. Duration: 0.9 2023-10-04 00:03:09,775 WARNING [train.py:1204] (2/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_18-189198-0 from training. Duration: 0.9 2023-10-04 00:03:13,762 WARNING [train.py:1204] (2/4) Exclude cut with ID _49755_210_18053_1_1534581005846_3218709_137-64179-0_sp1.1 from training. Duration: 0.92725 2023-10-04 00:03:13,834 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_48049_210_5632_1_1533864553_3519937_306-283794-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 00:03:17,911 WARNING [train.py:1204] (2/4) Exclude cut with ID _43552_210_8913_1_1533726031059_3523429_25-120161-0_sp1.1 from training. Duration: 0.8909375 2023-10-04 00:03:17,980 WARNING [train.py:1204] (2/4) Exclude cut with ID _8833_210_5219_1_1528592665210_7553059_319-349181-0 from training. Duration: 0.99 2023-10-04 00:03:19,321 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_296-332325-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 00:03:19,410 WARNING [train.py:1204] (2/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_436-186311-0_sp1.1 from training. Duration: 0.5909375 2023-10-04 00:03:19,425 WARNING [train.py:1204] (2/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_348-127789-0 from training. Duration: 0.85 2023-10-04 00:03:19,430 WARNING [train.py:1204] (2/4) Exclude cut with ID _3992_210_1747_1_1526644816031_3731060_443-195183-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 00:03:20,979 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.balancer1.prob, batch_count=1454053.3333333333, ans=0.125 2023-10-04 00:03:24,148 WARNING [train.py:1204] (2/4) Exclude cut with ID _3465_210_3525_1_1526175667060_3574049_308-231735-0_sp1.1 from training. Duration: 0.663625 2023-10-04 00:03:28,667 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6786_210_1388_1_1527839699_4194495_378-46070-0_sp1.1 from training. Duration: 0.6545625 2023-10-04 00:03:28,721 WARNING [train.py:1204] (2/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_63-101996-0 from training. Duration: 0.88 2023-10-04 00:03:28,971 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=1454053.3333333333, ans=0.1 2023-10-04 00:03:32,205 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_2541_210_1794_1_1525590269_3084765_156-173713-0 from training. Duration: 0.81 2023-10-04 00:03:33,512 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_217-239796-0_sp1.1 from training. Duration: 0.963625 2023-10-04 00:03:34,973 WARNING [train.py:1204] (2/4) Exclude cut with ID _21878_210_3525_1_1531738772002_7264330_530-329400-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 00:03:36,438 WARNING [train.py:1204] (2/4) Exclude cut with ID _21783_210_11357_1_1531742458730_3762679_150-62920-0_sp1.1 from training. Duration: 0.963625 2023-10-04 00:03:36,439 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_5522_210_3273_1_1527249606_3909096_366-172508-0 from training. Duration: 0.66 2023-10-04 00:03:36,441 WARNING [train.py:1204] (2/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_303-340446-0_sp1.1 from training. Duration: 0.8181875 2023-10-04 00:03:39,179 WARNING [train.py:1204] (2/4) Exclude cut with ID _8699_210_2069_1_1528630346831_3960409_160-24408-0_sp1.1 from training. Duration: 0.82725 2023-10-04 00:03:40,628 WARNING [train.py:1204] (2/4) Exclude cut with ID _41314_210_12651_1_1533459476543_3766460_299-187801-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 00:03:40,659 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_34886_210_10633_1_1533203882_1755661_92-345288-0_sp1.1 from training. Duration: 0.97275 2023-10-04 00:03:43,430 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_5438_210_1794_1_1527336687_2600009_104-288274-0_sp1.1 from training. Duration: 0.52725 2023-10-04 00:03:43,434 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_154-316450-0 from training. Duration: 0.76 2023-10-04 00:03:44,789 WARNING [train.py:1204] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_23-189234-0_sp0.9 from training. Duration: 0.92225 2023-10-04 00:03:47,513 WARNING [train.py:1204] (2/4) Exclude cut with ID _4779_210_2084_1_1527318022944_3914893_346-168820-0_sp1.1 from training. Duration: 0.963625 2023-10-04 00:03:48,770 WARNING [train.py:1204] (2/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_5-55893-0 from training. Duration: 0.55 2023-10-04 00:03:48,864 WARNING [train.py:1204] (2/4) Exclude cut with ID _20455_210_5219_1_1531565996775_8098100_180-24517-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 00:03:55,300 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_3133_210_2087_1_1526106446_2324690_243-143119-0_sp0.9 from training. Duration: 0.8555625 2023-10-04 00:03:58,039 WARNING [train.py:1204] (2/4) Exclude cut with ID _65585_210_12210_1_1535450451584_3764098_293-212045-0_sp1.1 from training. Duration: 0.92725 2023-10-04 00:03:58,043 WARNING [train.py:1204] (2/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_577-332600-0 from training. Duration: 0.57 2023-10-04 00:04:04,386 WARNING [train.py:1204] (2/4) Exclude cut with ID _30039_210_13270_1_1532484612900_2725150_104-289874-0_sp1.1 from training. Duration: 0.963625 2023-10-04 00:04:04,394 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_3172_210_2087_1_1526292929_3987865_197-97762-0_sp1.1 from training. Duration: 0.6818125 2023-10-04 00:04:05,821 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_256-150908-0_sp1.1 from training. Duration: 0.963625 2023-10-04 00:04:07,200 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_296-287678-0_sp1.1 from training. Duration: 0.863625 2023-10-04 00:04:07,227 WARNING [train.py:1204] (2/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_443-30034-0 from training. Duration: 0.64 2023-10-04 00:04:07,264 WARNING [train.py:1204] (2/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_494-199538-0_sp0.9 from training. Duration: 0.8333125 2023-10-04 00:04:08,601 WARNING [train.py:1204] (2/4) Exclude cut with ID _19708_210_9206_1_1532136650733_4238364_61-162325-0_sp1.1 from training. Duration: 0.97275 2023-10-04 00:04:10,088 WARNING [train.py:1204] (2/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_475-127455-0 from training. Duration: 0.7 2023-10-04 00:04:11,462 WARNING [train.py:1204] (2/4) Exclude cut with ID _48677_210_5663_1_1533902405919_3892989_188-11581-0_sp1.1 from training. Duration: 0.963625 2023-10-04 00:04:12,814 WARNING [train.py:1204] (2/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_136-8381-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 00:04:14,270 WARNING [train.py:1204] (2/4) Exclude cut with ID _55328_210_27767_1_1534580973260_4088170_270-229555-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 00:04:14,308 WARNING [train.py:1204] (2/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_372-266951-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 00:04:15,602 WARNING [train.py:1204] (2/4) Exclude cut with ID _47790_210_16187_1_1533862814066_4207519_14-242149-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 00:04:16,927 INFO [train.py:1046] (2/4) Epoch 42, batch 350, loss[loss=0.1808, simple_loss=0.2644, pruned_loss=0.04859, over 24379.00 frames. ], tot_loss[loss=0.1543, simple_loss=0.2338, pruned_loss=0.03744, over 3896808.93 frames. ], batch size: 77, lr: 2.43e-03, grad_scale: 8.0 2023-10-04 00:04:18,472 WARNING [train.py:1204] (2/4) Exclude cut with ID _7877_210_6014_1_1528286358099_3750136_159-76512-0_sp1.1 from training. Duration: 0.936375 2023-10-04 00:04:18,474 WARNING [train.py:1204] (2/4) Exclude cut with ID _8263_210_1664_1_1528438953494_294100_40-204731-0_sp1.1 from training. Duration: 0.4545625 2023-10-04 00:04:21,245 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8278_210_2550_1_1528637099_4220413_694-238413-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 00:04:27,234 WARNING [train.py:1204] (2/4) Exclude cut with ID _47278_210_12041_1_1533902418349_3576110_421-172710-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 00:04:31,164 WARNING [train.py:1204] (2/4) Exclude cut with ID _14970_210_15709_1_1530773543493_1715730_215-282680-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 00:04:31,204 WARNING [train.py:1204] (2/4) Exclude cut with ID _64639_210_5816_1_1535367619437_3528000_306-149449-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 00:04:32,801 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_28-291982-0 from training. Duration: 0.97 2023-10-04 00:04:34,197 WARNING [train.py:1204] (2/4) Exclude cut with ID _55134_210_21982_1_1534590028377_3882170_412-15870-0_sp1.1 from training. Duration: 0.936375 2023-10-04 00:04:35,528 WARNING [train.py:1204] (2/4) Exclude cut with ID _13684_210_7497_1_1530619354863_3598359_312-235606-0 from training. Duration: 0.6 2023-10-04 00:04:36,962 WARNING [train.py:1204] (2/4) Exclude cut with ID _37112_210_2575_1_1532997007673_7268610_347-238993-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 00:04:37,012 WARNING [train.py:1204] (2/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_121-284152-0 from training. Duration: 0.59 2023-10-04 00:04:38,317 WARNING [train.py:1204] (2/4) Exclude cut with ID _2941_210_2577_1_1526199673150_2489759_228-2961-0_sp1.1 from training. Duration: 0.97275 2023-10-04 00:04:38,528 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.conv_module2.balancer1.min_positive, batch_count=1454386.6666666667, ans=0.025 2023-10-04 00:04:39,975 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.conv_skip_rate, batch_count=1454386.6666666667, ans=0.0 2023-10-04 00:04:41,126 WARNING [train.py:1204] (2/4) Exclude cut with ID _28323_210_16187_1_1532480395667_4100300_531-24157-0 from training. Duration: 0.98 2023-10-04 00:04:43,843 WARNING [train.py:1204] (2/4) Exclude cut with ID _32704_210_8954_1_1533017488840_6747370_266-329755-0_sp0.9 from training. Duration: 0.988875 2023-10-04 00:04:45,192 WARNING [train.py:1204] (2/4) Exclude cut with ID _6653_210_2629_1_1527919172441_6732679_1038-146421-0_sp1.1 from training. Duration: 0.97275 2023-10-04 00:04:46,549 WARNING [train.py:1204] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_391-22148-0_sp0.9 from training. Duration: 0.8555625 2023-10-04 00:04:46,652 WARNING [train.py:1204] (2/4) Exclude cut with ID _64521_210_14699_1_1535243427259_3815328_543-341117-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 00:04:47,918 WARNING [train.py:1204] (2/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_52-168712-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 00:04:47,965 WARNING [train.py:1204] (2/4) Exclude cut with ID _33920_210_4220_1_1533257851500_3084399_198-334063-0_sp1.1 from training. Duration: 0.8545625 2023-10-04 00:04:47,978 WARNING [train.py:1204] (2/4) Exclude cut with ID _50093_210_4220_1_1534417141207_3432299_69-111987-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 00:04:48,020 WARNING [train.py:1204] (2/4) Exclude cut with ID _14821_210_8750_1_1530680503991_6829359_189-297451-0_sp0.9 from training. Duration: 0.611125 2023-10-04 00:04:49,498 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=1454453.3333333333, ans=0.1 2023-10-04 00:04:50,722 WARNING [train.py:1204] (2/4) Exclude cut with ID _8030_210_7497_1_1528444886781_3531089_262-86750-0_sp0.9 from training. Duration: 0.9 2023-10-04 00:04:50,733 WARNING [train.py:1204] (2/4) Exclude cut with ID _24612_210_8913_1_1531998054193_3806359_294-191196-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 00:04:58,162 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.664e+02 1.875e+02 2.208e+02 2.566e+02 3.808e+02, threshold=4.416e+02, percent-clipped=0.0 2023-10-04 00:04:58,301 WARNING [train.py:1204] (2/4) Exclude cut with ID _16909_210_5724_1_1531292079912_4118919_707-342896-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 00:04:58,315 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_9460_210_4211_1_1528954587_10049667_572-37035-0_sp0.9 from training. Duration: 0.888875 2023-10-04 00:04:59,533 WARNING [train.py:1204] (2/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_762-109886-0_sp1.1 from training. Duration: 0.82725 2023-10-04 00:04:59,570 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_14953_210_2151_1_1530701710_5616165_519-326393-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 00:05:04,439 WARNING [train.py:1204] (2/4) Exclude cut with ID _25914_210_6400_1_1532141317058_644740_52-96808-0 from training. Duration: 0.91 2023-10-04 00:05:04,452 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_68095_210_20185_1_1535718226_8888855_1079-94525-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 00:05:08,626 WARNING [train.py:1204] (2/4) Exclude cut with ID _14860_210_4915_1_1530702020340_7343240_746-3647-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 00:05:08,627 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_70379_210_7925_1_1535941455_557455_49-101720-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 00:05:08,657 WARNING [train.py:1204] (2/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_8-307291-0_sp1.1 from training. Duration: 0.936375 2023-10-04 00:05:11,156 WARNING [train.py:1204] (2/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_117-290916-0 from training. Duration: 0.51 2023-10-04 00:05:12,596 WARNING [train.py:1204] (2/4) Exclude cut with ID _22020_210_2461_1_1531724164513_6326900_944-339840-0_sp1.1 from training. Duration: 0.963625 2023-10-04 00:05:13,945 WARNING [train.py:1204] (2/4) Exclude cut with ID IC0515W0043-254911 from training. Duration: 0.976 2023-10-04 00:05:15,373 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_3910_210_1794_1_1526742306_334530_28-238909-0 from training. Duration: 0.81 2023-10-04 00:05:15,395 WARNING [train.py:1204] (2/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_269-267275-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 00:05:16,016 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.4.encoder.layers.0.conv_module2.whiten, num_groups=1, num_channels=384, metric=10.71 vs. limit=15.0 2023-10-04 00:05:18,346 WARNING [train.py:1204] (2/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_72-251255-0_sp1.1 from training. Duration: 0.8545625 2023-10-04 00:05:18,358 WARNING [train.py:1204] (2/4) Exclude cut with ID _14886_210_4220_1_1530702020355_3516190_175-229073-0 from training. Duration: 0.45 2023-10-04 00:05:19,858 WARNING [train.py:1204] (2/4) Exclude cut with ID _40519_210_13794_1_1533715228655_7337390_921-209452-0_sp1.1 from training. Duration: 0.963625 2023-10-04 00:05:22,603 WARNING [train.py:1204] (2/4) Exclude cut with ID _29857_210_20034_1_1532395886753_3798160_335-163562-0_sp0.9 from training. Duration: 0.9444375 2023-10-04 00:05:22,709 WARNING [train.py:1204] (2/4) Exclude cut with ID _58583_210_5236_1_1534726835266_3574249_384-58445-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 00:05:24,040 WARNING [train.py:1204] (2/4) Exclude cut with ID _44011_210_2046_1_1533722401691_3631310_96-315656-0_sp1.1 from training. Duration: 0.963625 2023-10-04 00:05:24,045 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_68484_210_1794_1_1535849804_7678097_549-170613-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 00:05:26,060 WARNING [train.py:1204] (2/4) Exclude cut with ID _37892_210_19596_1_1533198677897_3964058_214-148058-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 00:05:28,246 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.bypass_mid.scale_min, batch_count=1454586.6666666667, ans=0.2 2023-10-04 00:05:29,242 WARNING [train.py:1204] (2/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_415-78792-0_sp0.9 from training. Duration: 0.9 2023-10-04 00:05:30,689 INFO [train.py:1046] (2/4) Epoch 42, batch 400, loss[loss=0.1645, simple_loss=0.2384, pruned_loss=0.0453, over 23639.00 frames. ], tot_loss[loss=0.1543, simple_loss=0.2335, pruned_loss=0.03754, over 4079654.73 frames. ], batch size: 149, lr: 2.43e-03, grad_scale: 16.0 2023-10-04 00:05:30,775 WARNING [train.py:1204] (2/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_383-205833-0_sp1.1 from training. Duration: 0.863625 2023-10-04 00:05:32,590 WARNING [train.py:1204] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_167-137255-0 from training. Duration: 0.64 2023-10-04 00:05:32,598 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8275_210_4198_1_1528455320_3892571_484-154786-0_sp1.1 from training. Duration: 0.963625 2023-10-04 00:05:32,660 WARNING [train.py:1204] (2/4) Exclude cut with ID _40284_210_2084_1_1533513679814_3704279_396-319412-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 00:05:34,328 WARNING [train.py:1204] (2/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_280-120942-0_sp0.9 from training. Duration: 0.8555625 2023-10-04 00:05:34,380 WARNING [train.py:1204] (2/4) Exclude cut with ID _45630_210_10556_1_1533949174498_3902049_558-240889-0_sp1.1 from training. Duration: 0.92725 2023-10-04 00:05:38,270 WARNING [train.py:1204] (2/4) Exclude cut with ID _56235_210_10640_1_1534557335235_4161099_197-307458-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 00:05:39,600 WARNING [train.py:1204] (2/4) Exclude cut with ID _16909_210_5724_1_1531292079912_4118919_163-254843-0_sp1.1 from training. Duration: 0.92725 2023-10-04 00:05:41,056 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_103-84010-0 from training. Duration: 0.69 2023-10-04 00:05:42,440 WARNING [train.py:1204] (2/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_401-158239-0 from training. Duration: 0.71 2023-10-04 00:05:42,441 WARNING [train.py:1204] (2/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_899-233658-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 00:05:42,645 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.conv_module2.balancer1.prob, batch_count=1454653.3333333333, ans=0.125 2023-10-04 00:05:45,358 WARNING [train.py:1204] (2/4) Exclude cut with ID _23825_210_13449_1_1531915313526_3497150_240-271367-0 from training. Duration: 0.97 2023-10-04 00:05:45,414 WARNING [train.py:1204] (2/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_424-134619-0_sp1.1 from training. Duration: 0.92725 2023-10-04 00:05:49,501 WARNING [train.py:1204] (2/4) Exclude cut with ID _44831_210_13427_1_1533816011549_3689035_32-188147-0_sp1.1 from training. Duration: 0.87275 2023-10-04 00:05:49,504 WARNING [train.py:1204] (2/4) Exclude cut with ID _59170_210_6721_1_1534983633960_4203335_314-63879-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 00:05:49,519 WARNING [train.py:1204] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_602-275260-0 from training. Duration: 0.82 2023-10-04 00:05:49,560 WARNING [train.py:1204] (2/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_511-306767-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 00:05:49,745 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.nonlin_attention.balancer.prob, batch_count=1454720.0, ans=0.125 2023-10-04 00:05:50,857 WARNING [train.py:1204] (2/4) Exclude cut with ID _41817_210_18835_1_1533434477160_7423049_565-201775-0_sp1.1 from training. Duration: 0.92725 2023-10-04 00:05:50,874 WARNING [train.py:1204] (2/4) Exclude cut with ID _28317_210_12742_1_1532480302381_8510325_778-173151-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 00:05:50,912 WARNING [train.py:1204] (2/4) Exclude cut with ID _14998_210_5219_1_1530705582080_7474970_680-51001-0_sp1.1 from training. Duration: 0.963625 2023-10-04 00:05:54,104 WARNING [train.py:1204] (2/4) Exclude cut with ID IC0677W0059-334891 from training. Duration: 0.896 2023-10-04 00:05:54,157 WARNING [train.py:1204] (2/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_370-331600-0 from training. Duration: 0.94 2023-10-04 00:05:59,476 WARNING [train.py:1204] (2/4) Exclude cut with ID _66840_210_5219_1_1535452168426_4196349_446-240192-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 00:05:59,559 WARNING [train.py:1204] (2/4) Exclude cut with ID _65242_210_18628_1_1535369393717_3823149_38-6732-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 00:06:00,768 WARNING [train.py:1204] (2/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_302-2550-0 from training. Duration: 0.81 2023-10-04 00:06:02,116 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_43802_210_2290_1_1533516203_8829504_100-348441-0 from training. Duration: 0.91 2023-10-04 00:06:04,909 WARNING [train.py:1204] (2/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_329-285801-0_sp1.1 from training. Duration: 0.82725 2023-10-04 00:06:08,133 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_30167_210_3528_1_1532428012_4772375_343-334712-0_sp1.1 from training. Duration: 0.936375 2023-10-04 00:06:15,019 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7543_210_6753_1_1528543318_4552_1-85072-0 from training. Duration: 0.83 2023-10-04 00:06:17,879 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7002_210_1794_1_1527935634_4034444_487-236501-0_sp1.1 from training. Duration: 0.6 2023-10-04 00:06:17,983 WARNING [train.py:1204] (2/4) Exclude cut with ID _7160_210_1876_1_1527940923231_3703799_282-322611-0 from training. Duration: 0.87 2023-10-04 00:06:20,649 WARNING [train.py:1204] (2/4) Exclude cut with ID _67335_210_5219_1_1535525975652_4275229_570-262983-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 00:06:20,756 WARNING [train.py:1204] (2/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_625-338546-0_sp1.1 from training. Duration: 0.8 2023-10-04 00:06:20,786 WARNING [train.py:1204] (2/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_176-56120-0 from training. Duration: 0.83 2023-10-04 00:06:25,535 WARNING [train.py:1204] (2/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_963-187109-0_sp1.1 from training. Duration: 0.9 2023-10-04 00:06:26,954 WARNING [train.py:1204] (2/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_121-284152-0_sp0.9 from training. Duration: 0.6555625 2023-10-04 00:06:28,194 WARNING [train.py:1204] (2/4) Exclude cut with ID _45021_210_9994_1_1533619751094_3330288_104-81711-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 00:06:29,707 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.conv_module2.balancer2.min_positive, batch_count=1454920.0, ans=0.05 2023-10-04 00:06:32,745 WARNING [train.py:1204] (2/4) Exclude cut with ID _51382_210_23862_1_1534492801838_3650104_129-343914-0_sp1.1 from training. Duration: 0.936375 2023-10-04 00:06:32,780 WARNING [train.py:1204] (2/4) Exclude cut with ID _14970_210_15709_1_1530775547361_1966965_8-169924-0 from training. Duration: 0.92 2023-10-04 00:06:34,211 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_2295_210_2087_1_1525500993_2407863_234-83104-0_sp1.1 from training. Duration: 0.563625 2023-10-04 00:06:35,554 WARNING [train.py:1204] (2/4) Exclude cut with ID _14687_210_5854_1_1530703838259_3664889_115-239046-0 from training. Duration: 0.67 2023-10-04 00:06:37,521 WARNING [train.py:1204] (2/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_690-126306-0_sp1.1 from training. Duration: 0.7090625 2023-10-04 00:06:37,546 WARNING [train.py:1204] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_625-102955-0_sp0.9 from training. Duration: 0.8555625 2023-10-04 00:06:40,279 WARNING [train.py:1204] (2/4) Exclude cut with ID _9389_210_1664_1_1529217337301_5289269_289-213417-0 from training. Duration: 0.89 2023-10-04 00:06:42,856 WARNING [train.py:1204] (2/4) Exclude cut with ID _13253_210_1925_1_1530757775819_3577873_191-144977-0_sp0.9 from training. Duration: 0.8666875 2023-10-04 00:06:42,906 WARNING [train.py:1204] (2/4) Exclude cut with ID _21783_210_11357_1_1531742458730_3762679_325-155703-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 00:06:44,244 INFO [train.py:1046] (2/4) Epoch 42, batch 450, loss[loss=0.1712, simple_loss=0.2552, pruned_loss=0.04361, over 24392.00 frames. ], tot_loss[loss=0.1544, simple_loss=0.2344, pruned_loss=0.03718, over 4234254.22 frames. ], batch size: 77, lr: 2.43e-03, grad_scale: 8.0 2023-10-04 00:06:44,304 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_2295_210_2087_1_1525500993_2407863_116-238894-0_sp0.9 from training. Duration: 0.77775 2023-10-04 00:06:45,698 WARNING [train.py:1204] (2/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_94-126128-0 from training. Duration: 0.86 2023-10-04 00:06:45,719 WARNING [train.py:1204] (2/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_395-310220-0_sp0.9 from training. Duration: 0.988875 2023-10-04 00:06:47,058 WARNING [train.py:1204] (2/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_821-110852-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 00:06:47,106 WARNING [train.py:1204] (2/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_64-248359-0_sp1.1 from training. Duration: 0.62725 2023-10-04 00:06:47,117 WARNING [train.py:1204] (2/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_138-187031-0 from training. Duration: 0.99 2023-10-04 00:06:48,475 WARNING [train.py:1204] (2/4) Exclude cut with ID _4122_210_2084_1_1526713164417_4353280_528-206771-0_sp0.9 from training. Duration: 0.97775 2023-10-04 00:06:48,566 WARNING [train.py:1204] (2/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_844-96989-0_sp1.1 from training. Duration: 0.6454375 2023-10-04 00:06:48,696 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.0.nonlin_attention.balancer.prob, batch_count=1454986.6666666667, ans=0.125 2023-10-04 00:06:51,272 WARNING [train.py:1204] (2/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_106-30782-0_sp1.1 from training. Duration: 0.72725 2023-10-04 00:06:55,481 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.ff3_skip_rate, batch_count=1454986.6666666667, ans=0.0 2023-10-04 00:06:56,925 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.bypass.scale_min, batch_count=1454986.6666666667, ans=0.2 2023-10-04 00:06:59,488 WARNING [train.py:1204] (2/4) Exclude cut with ID _14683_210_5219_1_1530698352608_3974859_281-250040-0_sp1.1 from training. Duration: 0.936375 2023-10-04 00:06:59,994 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.4.encoder.layers.0.conv_module2.whiten, num_groups=1, num_channels=384, metric=10.06 vs. limit=15.0 2023-10-04 00:07:00,689 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_46013_210_7234_1_1533776362_3744032_463-52378-0_sp0.9 from training. Duration: 0.9555625 2023-10-04 00:07:02,108 WARNING [train.py:1204] (2/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_299-34413-0 from training. Duration: 0.65 2023-10-04 00:07:02,155 WARNING [train.py:1204] (2/4) Exclude cut with ID _35799_210_5854_1_1533263487887_3529071_490-326847-0 from training. Duration: 0.99 2023-10-04 00:07:06,832 WARNING [train.py:1204] (2/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_589-9070-0_sp0.9 from training. Duration: 0.788875 2023-10-04 00:07:08,287 WARNING [train.py:1204] (2/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_884-29515-0_sp1.1 from training. Duration: 0.936375 2023-10-04 00:07:08,436 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.ff3_skip_rate, batch_count=1455053.3333333333, ans=0.0 2023-10-04 00:07:11,509 WARNING [train.py:1204] (2/4) Exclude cut with ID _15012_210_1968_1_1530856820429_5095350_474-119995-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 00:07:15,567 WARNING [train.py:1204] (2/4) Exclude cut with ID _65585_210_12210_1_1535450451584_3764098_211-97124-0_sp1.1 from training. Duration: 0.963625 2023-10-04 00:07:15,634 WARNING [train.py:1204] (2/4) Exclude cut with ID _33640_210_2655_1_1533117605479_4855469_666-224463-0_sp1.1 from training. Duration: 0.963625 2023-10-04 00:07:17,119 WARNING [train.py:1204] (2/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_35-153886-0 from training. Duration: 0.84 2023-10-04 00:07:17,285 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.conv_module2.balancer1.min_positive, batch_count=1455120.0, ans=0.025 2023-10-04 00:07:18,558 WARNING [train.py:1204] (2/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_210-106047-0 from training. Duration: 0.97 2023-10-04 00:07:21,202 WARNING [train.py:1204] (2/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_479-125611-0 from training. Duration: 0.96 2023-10-04 00:07:21,236 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_9417_210_1794_1_1529235261_4614131_427-153963-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 00:07:22,599 WARNING [train.py:1204] (2/4) Exclude cut with ID _23818_210_6228_1_1531917063884_3676920_133-170571-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 00:07:23,860 WARNING [train.py:1204] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_602-275260-0_sp1.1 from training. Duration: 0.7454375 2023-10-04 00:07:24,646 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.3.encoder.layers.2.feed_forward2.out_whiten, num_groups=1, num_channels=512, metric=7.15 vs. limit=15.0 2023-10-04 00:07:25,418 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9029W0121-509521 from training. Duration: 0.896 2023-10-04 00:07:25,426 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9029W0434-509821 from training. Duration: 0.64 2023-10-04 00:07:25,442 WARNING [train.py:1204] (2/4) Exclude cut with ID _23038_210_9570_1_1532001584158_3652419_289-307034-0_sp1.1 from training. Duration: 0.936375 2023-10-04 00:07:27,289 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.538e+02 1.923e+02 2.068e+02 2.346e+02 3.607e+02, threshold=4.137e+02, percent-clipped=0.0 2023-10-04 00:07:27,385 WARNING [train.py:1204] (2/4) Exclude cut with ID _29345_210_4381_1_1532689168263_3860169_149-257268-0_sp0.9 from training. Duration: 0.9 2023-10-04 00:07:27,468 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_405-339986-0_sp1.1 from training. Duration: 0.463625 2023-10-04 00:07:30,256 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_2655_210_2087_1_1525688959_3897400_437-337690-0_sp1.1 from training. Duration: 0.57275 2023-10-04 00:07:32,140 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7543_210_6753_1_1528541907_1399322_165-323619-0_sp1.1 from training. Duration: 0.62725 2023-10-04 00:07:32,206 WARNING [train.py:1204] (2/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_508-335369-0_sp1.1 from training. Duration: 0.436375 2023-10-04 00:07:32,479 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.self_attn_weights.pos_emb_skip_rate, batch_count=1455186.6666666667, ans=0.0 2023-10-04 00:07:33,502 WARNING [train.py:1204] (2/4) Exclude cut with ID _7852_210_5219_1_1528286471316_8139400_358-287492-0 from training. Duration: 0.96 2023-10-04 00:07:34,972 WARNING [train.py:1204] (2/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_322-217220-0_sp0.9 from training. Duration: 0.9555625 2023-10-04 00:07:36,419 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_3171_210_2087_1_1526109084_2330007_66-260583-0_sp0.9 from training. Duration: 0.8 2023-10-04 00:07:36,449 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8558_210_1410_1_1528505225_7613406_131-329524-0_sp1.1 from training. Duration: 0.7181875 2023-10-04 00:07:39,155 WARNING [train.py:1204] (2/4) Exclude cut with ID _9235_210_5219_1_1528891154253_8046120_30-326681-0 from training. Duration: 0.83 2023-10-04 00:07:42,439 WARNING [train.py:1204] (2/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_58-68956-0_sp0.9 from training. Duration: 0.92225 2023-10-04 00:07:42,499 WARNING [train.py:1204] (2/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_255-312654-0 from training. Duration: 0.91 2023-10-04 00:07:43,767 WARNING [train.py:1204] (2/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_99-167392-0 from training. Duration: 0.61 2023-10-04 00:07:45,236 WARNING [train.py:1204] (2/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_94-126128-0_sp0.9 from training. Duration: 0.9555625 2023-10-04 00:07:50,599 WARNING [train.py:1204] (2/4) Exclude cut with ID _68635_210_9317_1_1535770813410_4315870_187-192817-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 00:07:51,969 WARNING [train.py:1204] (2/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_277-204970-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 00:07:53,253 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6163_210_6400_1_1527506746_4263068_283-137811-0_sp0.9 from training. Duration: 0.8555625 2023-10-04 00:07:54,548 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9144W0121-566446 from training. Duration: 0.896 2023-10-04 00:07:57,758 INFO [train.py:1046] (2/4) Epoch 42, batch 500, loss[loss=0.1578, simple_loss=0.2443, pruned_loss=0.03559, over 23264.00 frames. ], tot_loss[loss=0.1553, simple_loss=0.2354, pruned_loss=0.03764, over 4339041.71 frames. ], batch size: 93, lr: 2.43e-03, grad_scale: 8.0 2023-10-04 00:07:59,139 WARNING [train.py:1204] (2/4) Exclude cut with ID _49371_210_12071_1_1533952761021_3812190_351-315519-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 00:07:59,208 WARNING [train.py:1204] (2/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_115-326778-0_sp1.1 from training. Duration: 0.8454375 2023-10-04 00:08:00,563 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_17292_210_6753_1_1531125388_2943734_436-198965-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 00:08:00,572 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9165W0117-576751 from training. Duration: 0.896 2023-10-04 00:08:02,300 WARNING [train.py:1204] (2/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_212-149947-0 from training. Duration: 0.73 2023-10-04 00:08:02,308 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_13757_210_3528_1_1530440682_4107239_65-167544-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 00:08:04,228 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.4.encoder.layers.2.feed_forward1.out_whiten, num_groups=1, num_channels=384, metric=9.60 vs. limit=15.0 2023-10-04 00:08:06,383 WARNING [train.py:1204] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_700-183549-0_sp0.9 from training. Duration: 0.7555625 2023-10-04 00:08:09,256 WARNING [train.py:1204] (2/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_216-343007-0_sp0.9 from training. Duration: 0.7333125 2023-10-04 00:08:10,679 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7544_210_6753_1_1528587722_3576686_295-258479-0_sp1.1 from training. Duration: 0.763625 2023-10-04 00:08:12,107 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_365-287106-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 00:08:12,125 WARNING [train.py:1204] (2/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_300-3747-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 00:08:13,396 WARNING [train.py:1204] (2/4) Exclude cut with ID _14069_210_5632_1_1530512692033_4803390_487-303201-0_sp1.1 from training. Duration: 0.92725 2023-10-04 00:08:15,309 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.conv_module1.balancer1.prob, batch_count=1455386.6666666667, ans=0.125 2023-10-04 00:08:24,938 WARNING [train.py:1204] (2/4) Exclude cut with ID _14459_210_2228_1_1531443641946_7516700_668-123026-0_sp1.1 from training. Duration: 0.936375 2023-10-04 00:08:26,257 WARNING [train.py:1204] (2/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_108-53929-0_sp1.1 from training. Duration: 0.536375 2023-10-04 00:08:26,505 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.conv_skip_rate, batch_count=1455453.3333333333, ans=0.0 2023-10-04 00:08:27,499 WARNING [train.py:1204] (2/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_266-137104-0_sp0.9 from training. Duration: 0.72225 2023-10-04 00:08:27,528 WARNING [train.py:1204] (2/4) Exclude cut with ID _12302_210_5219_1_1529924446466_8107280_902-168619-0_sp1.1 from training. Duration: 0.936375 2023-10-04 00:08:27,558 WARNING [train.py:1204] (2/4) Exclude cut with ID _59502_210_12210_1_1535107024698_1835139_146-162495-0 from training. Duration: 0.92 2023-10-04 00:08:27,580 WARNING [train.py:1204] (2/4) Exclude cut with ID _3465_210_3525_1_1526175667060_3574049_412-287526-0_sp1.1 from training. Duration: 0.6545625 2023-10-04 00:08:29,677 WARNING [train.py:1204] (2/4) Exclude cut with ID _7160_210_1876_1_1527940923231_3703799_120-150943-0_sp0.9 from training. Duration: 0.9 2023-10-04 00:08:32,416 WARNING [train.py:1204] (2/4) Exclude cut with ID _15037_210_2253_1_1530752294104_3267650_296-192597-0_sp1.1 from training. Duration: 0.736375 2023-10-04 00:08:32,436 WARNING [train.py:1204] (2/4) Exclude cut with ID _13252_210_1925_1_1530671372047_3336909_397-216497-0_sp1.1 from training. Duration: 0.87275 2023-10-04 00:08:32,450 WARNING [train.py:1204] (2/4) Exclude cut with ID _40094_210_16380_1_1533294005773_3641159_76-269320-0_sp1.1 from training. Duration: 0.936375 2023-10-04 00:08:32,514 WARNING [train.py:1204] (2/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_449-15508-0 from training. Duration: 0.82 2023-10-04 00:08:37,001 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9309W0194-640321 from training. Duration: 0.64 2023-10-04 00:08:39,688 WARNING [train.py:1204] (2/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_284-312178-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 00:08:39,849 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.nonlin_attention.balancer.prob, batch_count=1455520.0, ans=0.125 2023-10-04 00:08:41,228 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=1455520.0, ans=0.1 2023-10-04 00:08:42,334 WARNING [train.py:1204] (2/4) Exclude cut with ID _29853_210_7497_1_1532505600851_6642110_401-55937-0_sp1.1 from training. Duration: 0.92725 2023-10-04 00:08:42,413 WARNING [train.py:1204] (2/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_716-24918-0_sp1.1 from training. Duration: 0.92725 2023-10-04 00:08:42,439 WARNING [train.py:1204] (2/4) Exclude cut with ID _19637_210_4220_1_1531537167676_3205060_314-305638-0_sp1.1 from training. Duration: 0.92725 2023-10-04 00:08:42,600 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.conv_module2.balancer2.prob, batch_count=1455520.0, ans=0.125 2023-10-04 00:08:43,815 WARNING [train.py:1204] (2/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_475-127455-0_sp1.1 from training. Duration: 0.636375 2023-10-04 00:08:45,300 WARNING [train.py:1204] (2/4) Exclude cut with ID _5444_210_2151_1_1527418808464_5312210_581-315411-0 from training. Duration: 0.62 2023-10-04 00:08:45,401 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder_embed.dropout.p, batch_count=1455520.0, ans=0.1 2023-10-04 00:08:45,550 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.conv_module1.balancer2.prob, batch_count=1455520.0, ans=0.125 2023-10-04 00:08:46,940 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.bypass_mid.scale_min, batch_count=1455520.0, ans=0.2 2023-10-04 00:08:48,135 WARNING [train.py:1204] (2/4) Exclude cut with ID _44902_210_16187_1_1533888006062_3792078_149-67212-0_sp0.9 from training. Duration: 0.9666875 2023-10-04 00:08:49,945 WARNING [train.py:1204] (2/4) Exclude cut with ID _31602_210_5854_1_1532677021456_4458250_63-63253-0_sp1.1 from training. Duration: 0.963625 2023-10-04 00:08:52,774 WARNING [train.py:1204] (2/4) Exclude cut with ID _55209_210_1531_1_1534675828996_4597160_660-203992-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 00:08:55,645 WARNING [train.py:1204] (2/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_83-190624-0_sp1.1 from training. Duration: 0.92725 2023-10-04 00:09:01,750 WARNING [train.py:1204] (2/4) Exclude cut with ID _58605_210_12210_1_1534723195939_3904259_154-139635-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 00:09:04,860 WARNING [train.py:1204] (2/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_164-224052-0 from training. Duration: 0.7 2023-10-04 00:09:04,869 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_1126-189071-0_sp1.1 from training. Duration: 0.963625 2023-10-04 00:09:06,165 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_15396_210_6890_1_1531308473_1938936_310-214789-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 00:09:08,994 WARNING [train.py:1204] (2/4) Exclude cut with ID _8395_210_1531_1_1528597871523_4411579_431-132470-0 from training. Duration: 0.74 2023-10-04 00:09:10,884 INFO [train.py:1046] (2/4) Epoch 42, batch 550, loss[loss=0.166, simple_loss=0.2553, pruned_loss=0.03832, over 24430.00 frames. ], tot_loss[loss=0.1566, simple_loss=0.2372, pruned_loss=0.03802, over 4432695.60 frames. ], batch size: 69, lr: 2.43e-03, grad_scale: 8.0 2023-10-04 00:09:10,938 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_3780_210_2087_1_1526467760_2130483_170-259044-0_sp0.9 from training. Duration: 0.77775 2023-10-04 00:09:12,338 WARNING [train.py:1204] (2/4) Exclude cut with ID _12733_210_8341_1_1530167424556_3646380_537-230640-0_sp1.1 from training. Duration: 0.963625 2023-10-04 00:09:16,542 WARNING [train.py:1204] (2/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_159-281310-0 from training. Duration: 0.95 2023-10-04 00:09:17,887 WARNING [train.py:1204] (2/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_434-83000-0 from training. Duration: 0.98 2023-10-04 00:09:17,915 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_4405_210_1849_1_1526732205_3593939_493-32464-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 00:09:17,924 WARNING [train.py:1204] (2/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_342-100157-0 from training. Duration: 0.54 2023-10-04 00:09:17,987 WARNING [train.py:1204] (2/4) Exclude cut with ID _44778_210_2054_1_1533798127203_7443090_765-250133-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 00:09:17,991 WARNING [train.py:1204] (2/4) Exclude cut with ID _38670_210_15379_1_1533448810659_4391059_81-312245-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 00:09:19,317 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_50050_210_16223_1_1535435796_7290138_1273-247611-0_sp1.1 from training. Duration: 0.936375 2023-10-04 00:09:19,364 WARNING [train.py:1204] (2/4) Exclude cut with ID _44831_210_13427_1_1533816011549_3689035_399-18591-0_sp1.1 from training. Duration: 0.936375 2023-10-04 00:09:19,386 WARNING [train.py:1204] (2/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_557-324258-0_sp0.9 from training. Duration: 0.9 2023-10-04 00:09:20,646 WARNING [train.py:1204] (2/4) Exclude cut with ID _3465_210_3525_1_1526175667060_3574049_62-335552-0_sp1.1 from training. Duration: 0.7909375 2023-10-04 00:09:22,452 WARNING [train.py:1204] (2/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_673-264125-0_sp1.1 from training. Duration: 0.963625 2023-10-04 00:09:22,544 WARNING [train.py:1204] (2/4) Exclude cut with ID _8030_210_7497_1_1528444886781_3531089_262-86750-0 from training. Duration: 0.81 2023-10-04 00:09:22,565 WARNING [train.py:1204] (2/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_444-28041-0_sp1.1 from training. Duration: 0.77275 2023-10-04 00:09:29,786 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_34008_210_10431_1_1533345424_8183247_516-14858-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 00:09:29,799 WARNING [train.py:1204] (2/4) Exclude cut with ID _32704_210_8954_1_1533017488840_6747370_54-248218-0_sp1.1 from training. Duration: 0.936375 2023-10-04 00:09:32,973 WARNING [train.py:1204] (2/4) Exclude cut with ID _13253_210_1925_1_1530757775819_3577873_300-119782-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 00:09:33,039 WARNING [train.py:1204] (2/4) Exclude cut with ID _39290_210_4220_1_1533862778760_3075118_21-240703-0_sp1.1 from training. Duration: 0.936375 2023-10-04 00:09:34,665 WARNING [train.py:1204] (2/4) Exclude cut with ID 774-127930-0014-10412-0_sp1.1 from training. Duration: 0.95 2023-10-04 00:09:36,079 WARNING [train.py:1204] (2/4) Exclude cut with ID _67499_210_17282_1_1535509805220_3692430_20-179346-0 from training. Duration: 0.97 2023-10-04 00:09:36,197 WARNING [train.py:1204] (2/4) Exclude cut with ID _8365_210_2064_1_1528592311070_4677690_431-294102-0_sp0.9 from training. Duration: 0.988875 2023-10-04 00:09:42,203 WARNING [train.py:1204] (2/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_365-1492-0_sp1.1 from training. Duration: 0.82725 2023-10-04 00:09:42,247 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_105-114797-0_sp0.9 from training. Duration: 0.9333125 2023-10-04 00:09:43,611 WARNING [train.py:1204] (2/4) Exclude cut with ID _6274_210_2084_1_1527937583837_7494020_769-168302-0_sp0.9 from training. Duration: 0.788875 2023-10-04 00:09:45,107 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_40340_210_9024_1_1533433916_4132944_320-270277-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 00:09:45,112 WARNING [train.py:1204] (2/4) Exclude cut with ID ID0203W0003-781711 from training. Duration: 0.958 2023-10-04 00:09:46,960 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_30434_210_13049_1_1532519384_981746_52-12932-0_sp1.1 from training. Duration: 0.936375 2023-10-04 00:09:48,359 WARNING [train.py:1204] (2/4) Exclude cut with ID _8263_210_1664_1_1528438953494_294100_40-204731-0_sp0.9 from training. Duration: 0.5555625 2023-10-04 00:09:52,339 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1032-236863-0_sp0.9 from training. Duration: 0.9333125 2023-10-04 00:09:52,396 WARNING [train.py:1204] (2/4) Exclude cut with ID _8394_210_6702_1_1528590504316_3540189_335-274780-0_sp0.9 from training. Duration: 0.8666875 2023-10-04 00:09:52,411 WARNING [train.py:1204] (2/4) Exclude cut with ID _66873_210_5517_1_1535454136506_4293370_654-52713-0_sp1.1 from training. Duration: 0.7 2023-10-04 00:09:53,654 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.585e+02 1.909e+02 2.051e+02 2.343e+02 3.682e+02, threshold=4.101e+02, percent-clipped=0.0 2023-10-04 00:09:53,760 WARNING [train.py:1204] (2/4) Exclude cut with ID _5250_210_5219_1_1527249561806_8692110_742-267856-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 00:09:55,170 WARNING [train.py:1204] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_74-48717-0 from training. Duration: 0.58 2023-10-04 00:09:57,104 WARNING [train.py:1204] (2/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_18-46756-0 from training. Duration: 0.79 2023-10-04 00:09:57,173 WARNING [train.py:1204] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_612-56061-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 00:09:57,182 WARNING [train.py:1204] (2/4) Exclude cut with ID _56423_210_5854_1_1534896061049_3165969_412-2403-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 00:09:57,428 INFO [scaling.py:1118] (2/4) WithLoss: name=encoder.encoders.4.encoder.layers.1.self_attn_weights, loss-sum=0.000e+00 2023-10-04 00:09:58,526 WARNING [train.py:1204] (2/4) Exclude cut with ID _62574_210_2655_1_1535180407058_3933249_120-196848-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 00:09:58,526 WARNING [train.py:1204] (2/4) Exclude cut with ID _40519_210_13794_1_1533715228655_7337390_916-51000-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 00:10:01,872 WARNING [train.py:1204] (2/4) Exclude cut with ID _65263_210_22356_1_1535763598691_3921037_108-21902-0_sp1.1 from training. Duration: 0.9 2023-10-04 00:10:04,478 WARNING [train.py:1204] (2/4) Exclude cut with ID _4122_210_2084_1_1526713164417_4353280_528-206771-0_sp1.1 from training. Duration: 0.8 2023-10-04 00:10:05,949 WARNING [train.py:1204] (2/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_43-325657-0_sp1.1 from training. Duration: 0.92725 2023-10-04 00:10:07,696 WARNING [train.py:1204] (2/4) Exclude cut with ID _44659_210_2721_1_1534036120061_6614860_314-82041-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 00:10:09,081 WARNING [train.py:1204] (2/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_81-300212-0_sp0.9 from training. Duration: 0.6666875 2023-10-04 00:10:10,609 WARNING [train.py:1204] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_665-196155-0_sp0.9 from training. Duration: 0.7444375 2023-10-04 00:10:11,975 WARNING [train.py:1204] (2/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_140-336881-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 00:10:12,061 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6290_210_3273_1_1527595224_1282787_115-87095-0_sp0.9 from training. Duration: 0.611125 2023-10-04 00:10:13,380 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_53373_210_5399_1_1534229471_4715695_607-259685-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 00:10:14,823 WARNING [train.py:1204] (2/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_175-163894-0_sp1.1 from training. Duration: 0.67275 2023-10-04 00:10:14,842 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7002_210_1794_1_1527939683_5157286_1-281264-0_sp1.1 from training. Duration: 0.4 2023-10-04 00:10:21,966 WARNING [train.py:1204] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_605-257205-0 from training. Duration: 0.59 2023-10-04 00:10:22,234 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.conv_skip_rate, batch_count=1455920.0, ans=0.0 2023-10-04 00:10:24,569 INFO [train.py:1046] (2/4) Epoch 42, batch 600, loss[loss=0.1337, simple_loss=0.2011, pruned_loss=0.03318, over 23383.00 frames. ], tot_loss[loss=0.1569, simple_loss=0.2376, pruned_loss=0.03811, over 4493792.89 frames. ], batch size: 285, lr: 2.43e-03, grad_scale: 8.0 2023-10-04 00:10:24,718 WARNING [train.py:1204] (2/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_454-314901-0 from training. Duration: 0.46 2023-10-04 00:10:26,147 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_46032_210_14656_1_1533781412_4507969_287-130422-0_sp1.1 from training. Duration: 0.97275 2023-10-04 00:10:26,164 WARNING [train.py:1204] (2/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_186-309161-0_sp1.1 from training. Duration: 0.4909375 2023-10-04 00:10:26,209 WARNING [train.py:1204] (2/4) Exclude cut with ID _42659_210_2070_1_1533607442765_3120380_45-342307-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 00:10:28,055 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.conv_module1.balancer2.prob, batch_count=1455986.6666666667, ans=0.125 2023-10-04 00:10:32,352 WARNING [train.py:1204] (2/4) Exclude cut with ID _12555_210_8750_1_1530075594402_3780239_478-326818-0_sp1.1 from training. Duration: 0.963625 2023-10-04 00:10:34,217 WARNING [train.py:1204] (2/4) Exclude cut with ID _7606_210_4915_1_1528894253277_5080690_127-177761-0_sp1.1 from training. Duration: 0.5818125 2023-10-04 00:10:37,053 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6788_210_1388_1_1527898698_4562021_91-197981-0 from training. Duration: 0.83 2023-10-04 00:10:40,347 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8558_210_1410_1_1528505225_7613406_131-329524-0_sp0.9 from training. Duration: 0.87775 2023-10-04 00:10:40,472 WARNING [train.py:1204] (2/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_153-153537-0_sp1.1 from training. Duration: 0.82725 2023-10-04 00:10:41,883 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_17292_210_6753_1_1531125388_2943734_285-349105-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 00:10:44,624 WARNING [train.py:1204] (2/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_37-41919-0 from training. Duration: 0.87 2023-10-04 00:10:44,644 WARNING [train.py:1204] (2/4) Exclude cut with ID _60860_210_8254_1_1534896015875_3557169_172-294852-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 00:10:51,978 WARNING [train.py:1204] (2/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_146-276944-0 from training. Duration: 0.83 2023-10-04 00:10:54,652 WARNING [train.py:1204] (2/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_1254-44082-0_sp1.1 from training. Duration: 0.936375 2023-10-04 00:10:54,671 WARNING [train.py:1204] (2/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_1115-180350-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 00:10:54,704 WARNING [train.py:1204] (2/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_60-146189-0_sp1.1 from training. Duration: 0.8909375 2023-10-04 00:11:00,361 WARNING [train.py:1204] (2/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_517-153649-0_sp1.1 from training. Duration: 0.92725 2023-10-04 00:11:02,127 WARNING [train.py:1204] (2/4) Exclude cut with ID _39571_210_13449_1_1533801584143_3651077_37-266257-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 00:11:02,176 WARNING [train.py:1204] (2/4) Exclude cut with ID _6653_210_2629_1_1527919172441_6732679_527-276118-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 00:11:02,428 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=1456120.0, ans=0.1 2023-10-04 00:11:07,088 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.balancer2.prob, batch_count=1456120.0, ans=0.125 2023-10-04 00:11:08,230 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7696_210_2040_1_1528207167_3716099_117-125434-0_sp1.1 from training. Duration: 0.6909375 2023-10-04 00:11:13,024 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_25767_210_2161_1_1532307483_3867748_478-156360-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 00:11:13,028 WARNING [train.py:1204] (2/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_177-49092-0_sp1.1 from training. Duration: 0.82725 2023-10-04 00:11:13,042 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_11305_210_1794_1_1529750428_7178148_680-317073-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 00:11:13,552 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.4.encoder.layers.2.nonlin_attention.whiten2, num_groups=1, num_channels=384, metric=4.17 vs. limit=15.0 2023-10-04 00:11:15,999 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.nonlin_attention.balancer.prob, batch_count=1456186.6666666667, ans=0.125 2023-10-04 00:11:18,763 WARNING [train.py:1204] (2/4) Exclude cut with ID _53565_210_10635_1_1534337218335_3820259_287-18120-0 from training. Duration: 0.97 2023-10-04 00:11:23,612 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.bypass_mid.scale_min, batch_count=1456253.3333333333, ans=0.2 2023-10-04 00:11:24,729 WARNING [train.py:1204] (2/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_498-40153-0_sp1.1 from training. Duration: 0.57275 2023-10-04 00:11:24,746 WARNING [train.py:1204] (2/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_555-63066-0_sp1.1 from training. Duration: 0.963625 2023-10-04 00:11:26,278 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.conv_skip_rate, batch_count=1456253.3333333333, ans=0.0 2023-10-04 00:11:26,376 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.conv_module1.balancer1.prob, batch_count=1456253.3333333333, ans=0.125 2023-10-04 00:11:27,533 WARNING [train.py:1204] (2/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_395-310220-0 from training. Duration: 0.89 2023-10-04 00:11:28,797 WARNING [train.py:1204] (2/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_937-208281-0_sp1.1 from training. Duration: 0.77275 2023-10-04 00:11:30,761 WARNING [train.py:1204] (2/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_120-211630-0 from training. Duration: 0.92 2023-10-04 00:11:30,793 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_454-276429-0_sp1.1 from training. Duration: 0.7909375 2023-10-04 00:11:32,732 WARNING [train.py:1204] (2/4) Exclude cut with ID _22826_210_9994_1_1531908201522_3279169_404-75889-0_sp1.1 from training. Duration: 0.8090625 2023-10-04 00:11:38,345 WARNING [train.py:1204] (2/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_282-136641-0_sp0.9 from training. Duration: 0.4666875 2023-10-04 00:11:40,227 INFO [train.py:1046] (2/4) Epoch 42, batch 650, loss[loss=0.1447, simple_loss=0.228, pruned_loss=0.03073, over 24448.00 frames. ], tot_loss[loss=0.1559, simple_loss=0.2365, pruned_loss=0.03766, over 4545745.07 frames. ], batch size: 63, lr: 2.43e-03, grad_scale: 8.0 2023-10-04 00:11:40,305 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_809-17156-0_sp1.1 from training. Duration: 0.663625 2023-10-04 00:11:42,967 WARNING [train.py:1204] (2/4) Exclude cut with ID _9235_210_5219_1_1528891154253_8046120_1003-101638-0_sp0.9 from training. Duration: 0.811125 2023-10-04 00:11:44,521 WARNING [train.py:1204] (2/4) Exclude cut with ID _22826_210_9994_1_1531908201522_3279169_404-75889-0_sp0.9 from training. Duration: 0.988875 2023-10-04 00:11:46,309 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.bypass_mid.scale_min, batch_count=1456320.0, ans=0.2 2023-10-04 00:11:47,387 WARNING [train.py:1204] (2/4) Exclude cut with ID _59288_210_5854_1_1535077757774_3412246_570-295255-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 00:11:48,856 WARNING [train.py:1204] (2/4) Exclude cut with ID _3465_210_3525_1_1526175667060_3574049_412-287526-0 from training. Duration: 0.72 2023-10-04 00:11:48,921 WARNING [train.py:1204] (2/4) Exclude cut with ID _44010_210_4525_1_1533884262964_4013620_649-79655-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 00:11:56,173 WARNING [train.py:1204] (2/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_57-270037-0_sp0.9 from training. Duration: 0.8555625 2023-10-04 00:11:56,174 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_34815_210_6388_1_1533381320_3571903_302-255963-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 00:11:59,023 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_47449_210_5399_1_1534076445_4006929_573-301852-0_sp1.1 from training. Duration: 0.97275 2023-10-04 00:12:02,407 WARNING [train.py:1204] (2/4) Exclude cut with ID 6709-74022-0004-86860-0_sp1.1 from training. Duration: 0.9409375 2023-10-04 00:12:05,579 WARNING [train.py:1204] (2/4) Exclude cut with ID _40519_210_13794_1_1533715228655_7337390_111-71991-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 00:12:05,618 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_30167_210_3528_1_1532428012_4772375_169-214186-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 00:12:09,781 WARNING [train.py:1204] (2/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_20-235313-0_sp1.1 from training. Duration: 0.963625 2023-10-04 00:12:09,825 WARNING [train.py:1204] (2/4) Exclude cut with ID _4681_210_3525_1_1527251040321_1235713_123-24428-0_sp0.9 from training. Duration: 0.5333125 2023-10-04 00:12:11,935 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.balancer1.prob, batch_count=1456453.3333333333, ans=0.125 2023-10-04 00:12:13,129 WARNING [train.py:1204] (2/4) Exclude cut with ID _31602_210_5854_1_1532677021456_4458250_444-198752-0_sp1.1 from training. Duration: 0.97275 2023-10-04 00:12:14,555 WARNING [train.py:1204] (2/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_9-279442-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 00:12:14,610 WARNING [train.py:1204] (2/4) Exclude cut with ID _6275_210_2084_1_1528110062213_7669049_973-198085-0_sp0.9 from training. Duration: 0.8333125 2023-10-04 00:12:14,695 WARNING [train.py:1204] (2/4) Exclude cut with ID _29853_210_7497_1_1532505600851_6642110_567-250221-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 00:12:16,080 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6545_210_2040_1_1527774670_4245258_407-48070-0_sp1.1 from training. Duration: 0.6909375 2023-10-04 00:12:18,704 WARNING [train.py:1204] (2/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_224-343295-0_sp0.9 from training. Duration: 0.611125 2023-10-04 00:12:18,726 WARNING [train.py:1204] (2/4) Exclude cut with ID IC0125W0011-61817 from training. Duration: 0.88 2023-10-04 00:12:18,735 WARNING [train.py:1204] (2/4) Exclude cut with ID _60113_210_16271_1_1535164212481_4386079_435-154371-0_sp1.1 from training. Duration: 0.97275 2023-10-04 00:12:18,747 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_25836_210_16554_1_1532138167_53197_3-9394-0_sp1.1 from training. Duration: 0.92725 2023-10-04 00:12:20,234 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.feed_forward1.hidden_balancer.prob, batch_count=1456453.3333333333, ans=0.125 2023-10-04 00:12:22,744 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.692e+02 1.909e+02 2.145e+02 2.381e+02 3.459e+02, threshold=4.289e+02, percent-clipped=0.0 2023-10-04 00:12:22,848 WARNING [train.py:1204] (2/4) Exclude cut with ID _29343_210_4381_1_1532602757752_3674109_57-55125-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 00:12:22,942 WARNING [train.py:1204] (2/4) Exclude cut with ID _56235_210_10640_1_1534557335235_4161099_267-23560-0_sp1.1 from training. Duration: 0.936375 2023-10-04 00:12:22,967 WARNING [train.py:1204] (2/4) Exclude cut with ID _30196_210_1664_1_1533708496522_7074140_1150-278867-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 00:12:24,859 WARNING [train.py:1204] (2/4) Exclude cut with ID _6270_210_2084_1_1527850911573_3856420_26-277343-0_sp0.9 from training. Duration: 0.911125 2023-10-04 00:12:25,048 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.out_combiner.scale_min, batch_count=1456520.0, ans=0.2 2023-10-04 00:12:26,252 WARNING [train.py:1204] (2/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_325-62802-0 from training. Duration: 0.87 2023-10-04 00:12:26,334 WARNING [train.py:1204] (2/4) Exclude cut with ID _56524_210_9642_1_1534572011779_3619170_145-310842-0_sp1.1 from training. Duration: 0.9 2023-10-04 00:12:26,365 WARNING [train.py:1204] (2/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_860-96198-0_sp0.9 from training. Duration: 0.811125 2023-10-04 00:12:27,716 WARNING [train.py:1204] (2/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_17-25045-0_sp1.1 from training. Duration: 0.536375 2023-10-04 00:12:27,735 WARNING [train.py:1204] (2/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_349-69727-0_sp1.1 from training. Duration: 0.936375 2023-10-04 00:12:29,215 WARNING [train.py:1204] (2/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_580-93798-0_sp1.1 from training. Duration: 0.5090625 2023-10-04 00:12:31,204 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7762_210_1794_1_1528199508_4057408_419-125009-0 from training. Duration: 0.83 2023-10-04 00:12:32,491 WARNING [train.py:1204] (2/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_356-167816-0 from training. Duration: 0.72 2023-10-04 00:12:32,523 WARNING [train.py:1204] (2/4) Exclude cut with ID _29345_210_4381_1_1532689168263_3860169_225-97100-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 00:12:32,527 WARNING [train.py:1204] (2/4) Exclude cut with ID _14227_210_4381_1_1530701804286_3940259_374-194712-0_sp1.1 from training. Duration: 0.92725 2023-10-04 00:12:32,558 WARNING [train.py:1204] (2/4) Exclude cut with ID _40137_210_12210_1_1533290484046_3795180_249-187059-0_sp0.9 from training. Duration: 0.97775 2023-10-04 00:12:32,577 WARNING [train.py:1204] (2/4) Exclude cut with ID _22826_210_9994_1_1531908201522_3279169_389-317194-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 00:12:35,892 WARNING [train.py:1204] (2/4) Exclude cut with ID _27485_210_4916_1_1532739191677_3668269_239-122918-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 00:12:36,109 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.conv_module2.balancer2.prob, batch_count=1456520.0, ans=0.125 2023-10-04 00:12:41,586 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_2245_210_2715_1_1525511548_3540254_128-117609-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 00:12:43,384 WARNING [train.py:1204] (2/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_160-276178-0_sp1.1 from training. Duration: 0.963625 2023-10-04 00:12:43,476 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_31920_210_10633_1_1532860009_1686094_1-279735-0_sp1.1 from training. Duration: 0.97275 2023-10-04 00:12:46,248 WARNING [train.py:1204] (2/4) Exclude cut with ID _51078_210_9070_1_1534161557424_3805250_325-262383-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 00:12:46,283 WARNING [train.py:1204] (2/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_643-52007-0_sp0.9 from training. Duration: 0.6333125 2023-10-04 00:12:46,332 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_51937_210_5014_1_1534573610_4022025_518-299248-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 00:12:49,171 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.balancer1.prob, batch_count=1456586.6666666667, ans=0.125 2023-10-04 00:12:53,570 WARNING [train.py:1204] (2/4) Exclude cut with ID _13252_210_1925_1_1530671372047_3336909_166-314607-0_sp1.1 from training. Duration: 0.4909375 2023-10-04 00:12:53,574 WARNING [train.py:1204] (2/4) Exclude cut with ID _44778_210_2054_1_1533798127203_7443090_1087-282939-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 00:12:53,622 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_23850_210_4238_1_1532232597_1632433_32-185135-0_sp1.1 from training. Duration: 0.7818125 2023-10-04 00:12:54,892 INFO [train.py:1046] (2/4) Epoch 42, batch 700, loss[loss=0.1605, simple_loss=0.2464, pruned_loss=0.03728, over 24044.00 frames. ], tot_loss[loss=0.1551, simple_loss=0.2354, pruned_loss=0.03738, over 4573888.43 frames. ], batch size: 80, lr: 2.43e-03, grad_scale: 8.0 2023-10-04 00:12:54,963 WARNING [train.py:1204] (2/4) Exclude cut with ID _59500_210_8254_1_1534809610680_3736009_145-89359-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 00:12:56,642 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.conv_module2.balancer2.min_abs, batch_count=1456653.3333333333, ans=0.5 2023-10-04 00:12:58,464 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.3.encoder.layers.3.nonlin_attention.whiten1, num_groups=1, num_channels=384, metric=5.33 vs. limit=10.0 2023-10-04 00:12:59,267 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_340-336408-0 from training. Duration: 0.93 2023-10-04 00:13:00,679 WARNING [train.py:1204] (2/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_9-21737-0 from training. Duration: 0.97 2023-10-04 00:13:04,053 WARNING [train.py:1204] (2/4) Exclude cut with ID _2220_210_1819_1_1525509109898_3658680_50-99837-0 from training. Duration: 0.54 2023-10-04 00:13:05,376 WARNING [train.py:1204] (2/4) Exclude cut with ID _30304_210_6319_1_1532476827261_7582060_879-299350-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 00:13:05,501 WARNING [train.py:1204] (2/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_905-160074-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 00:13:06,918 WARNING [train.py:1204] (2/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_395-73570-0 from training. Duration: 0.99 2023-10-04 00:13:09,042 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.feed_forward1.out_proj.dropout_p, batch_count=1456720.0, ans=0.1 2023-10-04 00:13:12,920 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7264_210_4938_1_1527939911_8132095_800-91995-0_sp1.1 from training. Duration: 0.7818125 2023-10-04 00:13:15,005 WARNING [train.py:1204] (2/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_77-311404-0_sp1.1 from training. Duration: 0.8545625 2023-10-04 00:13:16,370 WARNING [train.py:1204] (2/4) Exclude cut with ID _13679_210_7497_1_1530534293662_3355098_107-80772-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 00:13:17,799 WARNING [train.py:1204] (2/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_224-343295-0_sp1.1 from training. Duration: 0.5 2023-10-04 00:13:19,034 WARNING [train.py:1204] (2/4) Exclude cut with ID _54871_210_22356_1_1534665798253_3856359_156-253482-0_sp1.1 from training. Duration: 0.963625 2023-10-04 00:13:20,503 WARNING [train.py:1204] (2/4) Exclude cut with ID _49728_210_12651_1_1534041034294_6788150_309-125532-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 00:13:23,317 WARNING [train.py:1204] (2/4) Exclude cut with ID _13913_210_9994_1_1530579615187_3383897_473-277682-0_sp0.9 from training. Duration: 0.4333125 2023-10-04 00:13:23,321 WARNING [train.py:1204] (2/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_3-276701-0_sp1.1 from training. Duration: 0.82725 2023-10-04 00:13:24,695 WARNING [train.py:1204] (2/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_282-136641-0 from training. Duration: 0.42 2023-10-04 00:13:28,036 WARNING [train.py:1204] (2/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_127-566-0 from training. Duration: 0.84 2023-10-04 00:13:32,799 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6163_210_6400_1_1527506746_4263068_283-137811-0_sp1.1 from training. Duration: 0.7 2023-10-04 00:13:32,839 WARNING [train.py:1204] (2/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_34-183685-0_sp1.1 from training. Duration: 0.8909375 2023-10-04 00:13:35,648 WARNING [train.py:1204] (2/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_154-135696-0_sp0.9 from training. Duration: 0.888875 2023-10-04 00:13:38,489 WARNING [train.py:1204] (2/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_676-51014-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 00:13:39,788 WARNING [train.py:1204] (2/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_38-169920-0 from training. Duration: 0.91 2023-10-04 00:13:43,695 WARNING [train.py:1204] (2/4) Exclude cut with ID _6653_210_2629_1_1527919172441_6732679_747-114465-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 00:13:45,014 WARNING [train.py:1204] (2/4) Exclude cut with ID _8021_210_7994_1_1528376451358_3653069_100-140231-0_sp1.1 from training. Duration: 0.6090625 2023-10-04 00:13:45,032 WARNING [train.py:1204] (2/4) Exclude cut with ID _64135_210_2253_1_1535677267876_7608972_133-188266-0 from training. Duration: 0.81 2023-10-04 00:13:49,134 WARNING [train.py:1204] (2/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_714-310538-0_sp1.1 from training. Duration: 0.7818125 2023-10-04 00:13:50,491 WARNING [train.py:1204] (2/4) Exclude cut with ID _39867_210_9826_1_1533553222030_9344259_1134-288498-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 00:13:52,011 WARNING [train.py:1204] (2/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_211-237289-0_sp1.1 from training. Duration: 0.936375 2023-10-04 00:13:52,250 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.bypass.scale_min, batch_count=1456853.3333333333, ans=0.2 2023-10-04 00:13:57,805 WARNING [train.py:1204] (2/4) Exclude cut with ID _59202_210_26984_1_1534816810612_3549174_137-318710-0_sp0.9 from training. Duration: 0.988875 2023-10-04 00:13:57,839 WARNING [train.py:1204] (2/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_86-66192-0 from training. Duration: 0.55 2023-10-04 00:14:01,841 WARNING [train.py:1204] (2/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_540-275685-0 from training. Duration: 0.89 2023-10-04 00:14:01,873 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8776_210_2040_1_1528638837_4088036_244-278763-0 from training. Duration: 0.9 2023-10-04 00:14:02,648 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.3.encoder.layers.2.self_attn_weights.whiten_keys, num_groups=8, num_channels=256, metric=3.16 vs. limit=6.0 2023-10-04 00:14:03,424 WARNING [train.py:1204] (2/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_9-219972-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 00:14:05,272 WARNING [train.py:1204] (2/4) Exclude cut with ID _44659_210_2721_1_1534036120061_6614860_656-283823-0_sp1.1 from training. Duration: 0.92725 2023-10-04 00:14:06,629 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_69630_210_7234_1_1535789601_661049_77-216385-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 00:14:06,763 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_19952_210_2161_1_1531875469_3851750_210-308919-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 00:14:06,768 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_4737_210_2040_1_1526998689_3293008_378-178650-0 from training. Duration: 0.95 2023-10-04 00:14:10,006 INFO [train.py:1046] (2/4) Epoch 42, batch 750, loss[loss=0.1511, simple_loss=0.2223, pruned_loss=0.03994, over 23471.00 frames. ], tot_loss[loss=0.155, simple_loss=0.2352, pruned_loss=0.0374, over 4613721.49 frames. ], batch size: 285, lr: 2.43e-03, grad_scale: 8.0 2023-10-04 00:14:10,433 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.ff2_skip_rate, batch_count=1456986.6666666667, ans=0.0 2023-10-04 00:14:11,492 WARNING [train.py:1204] (2/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_224-343295-0 from training. Duration: 0.55 2023-10-04 00:14:11,510 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7002_210_1794_1_1527939683_5157286_184-229630-0 from training. Duration: 0.74 2023-10-04 00:14:11,536 WARNING [train.py:1204] (2/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_6-214815-0 from training. Duration: 0.85 2023-10-04 00:14:11,622 WARNING [train.py:1204] (2/4) Exclude cut with ID _8699_210_2069_1_1528630346831_3960409_160-24408-0 from training. Duration: 0.91 2023-10-04 00:14:12,280 WARNING [train.py:1204] (2/4) Exclude cut with ID _5585_210_2667_1_1527418813324_8007340_89-332072-0 from training. Duration: 0.64 2023-10-04 00:14:13,645 WARNING [train.py:1204] (2/4) Exclude cut with ID _5250_210_5219_1_1527249561806_8692110_528-259800-0_sp1.1 from training. Duration: 0.963625 2023-10-04 00:14:13,727 WARNING [train.py:1204] (2/4) Exclude cut with ID _55383_210_10635_1_1534468426172_2231669_134-128246-0 from training. Duration: 0.89 2023-10-04 00:14:14,914 WARNING [train.py:1204] (2/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_400-338274-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 00:14:15,055 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.conv_module1.balancer1.max_abs, batch_count=1456986.6666666667, ans=10.0 2023-10-04 00:14:15,493 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.4.encoder.layers.2.self_attn_weights.whiten_keys, num_groups=4, num_channels=128, metric=1.58 vs. limit=6.0 2023-10-04 00:14:16,285 WARNING [train.py:1204] (2/4) Exclude cut with ID _14224_210_4381_1_1530529062037_4264010_364-22817-0_sp0.9 from training. Duration: 0.788875 2023-10-04 00:14:16,403 WARNING [train.py:1204] (2/4) Exclude cut with ID _42863_210_9070_1_1533902398788_3696920_331-291057-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 00:14:16,608 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.feed_forward2.hidden_balancer.prob, batch_count=1456986.6666666667, ans=0.125 2023-10-04 00:14:16,642 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.conv_module2.balancer1.prob, batch_count=1456986.6666666667, ans=0.125 2023-10-04 00:14:18,417 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.2.encoder.layers.1.feed_forward2.out_whiten, num_groups=1, num_channels=384, metric=3.85 vs. limit=15.0 2023-10-04 00:14:19,004 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_5436_210_1794_1_1527331317_2541494_229-211016-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 00:14:19,050 WARNING [train.py:1204] (2/4) Exclude cut with ID _6456_210_1534_1_1527681634212_3181049_345-252877-0_sp0.9 from training. Duration: 0.711125 2023-10-04 00:14:20,271 WARNING [train.py:1204] (2/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_96-81499-0_sp1.1 from training. Duration: 0.92725 2023-10-04 00:14:23,639 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_5575_210_1410_1_1527301305_2570170_342-265572-0_sp1.1 from training. Duration: 0.8545625 2023-10-04 00:14:23,702 WARNING [train.py:1204] (2/4) Exclude cut with ID _5444_210_2151_1_1527418808464_5312210_726-27165-0_sp0.9 from training. Duration: 0.7444375 2023-10-04 00:14:25,117 WARNING [train.py:1204] (2/4) Exclude cut with ID _22083_210_8254_1_1531702862439_3678970_304-327750-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 00:14:26,644 WARNING [train.py:1204] (2/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_245-226199-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 00:14:27,990 WARNING [train.py:1204] (2/4) Exclude cut with ID _42875_210_14856_1_1533704397487_4012433_102-199499-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 00:14:28,039 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_51225_210_3601_1_1534400634_2327383_55-301844-0 from training. Duration: 0.93 2023-10-04 00:14:31,214 WARNING [train.py:1204] (2/4) Exclude cut with ID _28317_210_12742_1_1532480302381_8510325_916-195848-0_sp1.1 from training. Duration: 0.836375 2023-10-04 00:14:31,286 WARNING [train.py:1204] (2/4) Exclude cut with ID _44718_210_5816_1_1534050051869_6469030_497-311999-0_sp1.1 from training. Duration: 0.936375 2023-10-04 00:14:31,456 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=1457053.3333333333, ans=0.1 2023-10-04 00:14:32,654 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_34886_210_10633_1_1533203882_1755661_91-194765-0_sp1.1 from training. Duration: 0.936375 2023-10-04 00:14:34,132 WARNING [train.py:1204] (2/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_310-26186-0_sp0.9 from training. Duration: 0.77775 2023-10-04 00:14:34,217 WARNING [train.py:1204] (2/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_416-257730-0 from training. Duration: 0.65 2023-10-04 00:14:34,226 WARNING [train.py:1204] (2/4) Exclude cut with ID _9389_210_1664_1_1529217337301_5289269_209-154760-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 00:14:34,461 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.self_attn_weights.pos_emb_skip_rate, batch_count=1457053.3333333333, ans=0.0 2023-10-04 00:14:36,922 WARNING [train.py:1204] (2/4) Exclude cut with ID _4092_210_2151_1_1526643044449_3135759_227-213972-0 from training. Duration: 0.54 2023-10-04 00:14:36,950 WARNING [train.py:1204] (2/4) Exclude cut with ID IC0679W0044-335852 from training. Duration: 0.768 2023-10-04 00:14:38,768 WARNING [train.py:1204] (2/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_42-186162-0 from training. Duration: 0.92 2023-10-04 00:14:38,775 WARNING [train.py:1204] (2/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_302-2550-0_sp0.9 from training. Duration: 0.9 2023-10-04 00:14:38,790 WARNING [train.py:1204] (2/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_356-31216-0_sp0.9 from training. Duration: 0.8333125 2023-10-04 00:14:40,250 WARNING [train.py:1204] (2/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_125-322172-0_sp1.1 from training. Duration: 0.8454375 2023-10-04 00:14:46,150 WARNING [train.py:1204] (2/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_141-89727-0_sp0.9 from training. Duration: 0.788875 2023-10-04 00:14:46,172 WARNING [train.py:1204] (2/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_647-337784-0_sp1.1 from training. Duration: 0.97275 2023-10-04 00:14:47,356 WARNING [train.py:1204] (2/4) Exclude cut with ID _6493_210_1747_1_1528459210424_4012470_513-140967-0_sp1.1 from training. Duration: 0.7181875 2023-10-04 00:14:47,706 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.nonlin_attention.balancer.prob, batch_count=1457120.0, ans=0.125 2023-10-04 00:14:48,865 WARNING [train.py:1204] (2/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_87-266334-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 00:14:50,818 WARNING [train.py:1204] (2/4) Exclude cut with ID _26528_210_9070_1_1532570410842_3851310_526-261338-0_sp1.1 from training. Duration: 0.92725 2023-10-04 00:14:50,863 WARNING [train.py:1204] (2/4) Exclude cut with ID _14224_210_4381_1_1530529062037_4264010_380-343670-0 from training. Duration: 0.88 2023-10-04 00:14:51,087 INFO [scaling.py:1118] (2/4) WithLoss: name=encoder.encoders.2.encoder.layers.0.self_attn_weights, loss-sum=0.000e+00 2023-10-04 00:14:52,190 WARNING [train.py:1204] (2/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_449-15508-0_sp1.1 from training. Duration: 0.7454375 2023-10-04 00:14:52,281 WARNING [train.py:1204] (2/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_372-6869-0_sp0.9 from training. Duration: 0.488875 2023-10-04 00:14:53,509 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.696e+02 1.922e+02 2.060e+02 2.309e+02 3.255e+02, threshold=4.120e+02, percent-clipped=0.0 2023-10-04 00:14:53,598 WARNING [train.py:1204] (2/4) Exclude cut with ID _26907_210_7234_1_1532221740694_3205263_337-2494-0_sp0.9 from training. Duration: 0.97775 2023-10-04 00:14:55,089 WARNING [train.py:1204] (2/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_10-100367-0_sp1.1 from training. Duration: 0.8909375 2023-10-04 00:14:55,123 WARNING [train.py:1204] (2/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_47-167373-0 from training. Duration: 0.92 2023-10-04 00:14:56,901 WARNING [train.py:1204] (2/4) Exclude cut with ID _28323_210_16187_1_1532480395667_4100300_628-282179-0_sp1.1 from training. Duration: 0.97275 2023-10-04 00:14:58,819 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.5.encoder.layers.1.self_attn1.whiten, num_groups=1, num_channels=256, metric=9.94 vs. limit=22.5 2023-10-04 00:15:02,220 WARNING [train.py:1204] (2/4) Exclude cut with ID _20455_210_5219_1_1531565996775_8098100_213-280898-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 00:15:02,319 WARNING [train.py:1204] (2/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_383-316000-0_sp1.1 from training. Duration: 0.8181875 2023-10-04 00:15:02,493 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.feed_forward2.hidden_balancer.prob, batch_count=1457186.6666666667, ans=0.125 2023-10-04 00:15:03,659 WARNING [train.py:1204] (2/4) Exclude cut with ID _18513_210_9206_1_1531618215223_3862620_16-78180-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 00:15:06,935 WARNING [train.py:1204] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_330-327861-0_sp1.1 from training. Duration: 0.6090625 2023-10-04 00:15:09,816 WARNING [train.py:1204] (2/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_356-31216-0 from training. Duration: 0.75 2023-10-04 00:15:09,843 WARNING [train.py:1204] (2/4) Exclude cut with ID _25248_210_8750_1_1532062825941_5554180_255-148539-0_sp1.1 from training. Duration: 0.963625 2023-10-04 00:15:11,199 WARNING [train.py:1204] (2/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_269-154726-0_sp1.1 from training. Duration: 0.7818125 2023-10-04 00:15:13,911 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7002_210_1794_1_1527939683_5157286_163-87552-0_sp1.1 from training. Duration: 0.7818125 2023-10-04 00:15:13,948 WARNING [train.py:1204] (2/4) Exclude cut with ID _54871_210_22356_1_1534665798253_3856359_97-151896-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 00:15:15,681 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.bypass.skip_rate, batch_count=1457253.3333333333, ans=0.09899494936611666 2023-10-04 00:15:15,732 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.out_combiner.scale_min, batch_count=1457253.3333333333, ans=0.2 2023-10-04 00:15:17,443 WARNING [train.py:1204] (2/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_207-7026-0_sp1.1 from training. Duration: 0.97275 2023-10-04 00:15:17,471 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_92-46009-0_sp0.9 from training. Duration: 0.6 2023-10-04 00:15:24,719 INFO [train.py:1046] (2/4) Epoch 42, batch 800, loss[loss=0.1666, simple_loss=0.2551, pruned_loss=0.03911, over 24425.00 frames. ], tot_loss[loss=0.1546, simple_loss=0.2354, pruned_loss=0.03694, over 4656535.01 frames. ], batch size: 77, lr: 2.43e-03, grad_scale: 16.0 2023-10-04 00:15:26,166 WARNING [train.py:1204] (2/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_122-178847-0_sp1.1 from training. Duration: 0.97275 2023-10-04 00:15:26,170 WARNING [train.py:1204] (2/4) Exclude cut with ID _44778_210_2054_1_1533798127203_7443090_94-348956-0_sp1.1 from training. Duration: 0.92725 2023-10-04 00:15:29,470 WARNING [train.py:1204] (2/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_1018-244089-0_sp1.1 from training. Duration: 0.963625 2023-10-04 00:15:29,484 WARNING [train.py:1204] (2/4) Exclude cut with ID _56235_210_10640_1_1534557335235_4161099_87-118423-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 00:15:30,988 WARNING [train.py:1204] (2/4) Exclude cut with ID _53713_210_24116_1_1534314487762_7416751_504-212292-0_sp1.1 from training. Duration: 0.92725 2023-10-04 00:15:31,015 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_446-150327-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 00:15:33,726 WARNING [train.py:1204] (2/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_682-245989-0_sp1.1 from training. Duration: 0.92725 2023-10-04 00:15:35,391 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.conv_skip_rate, batch_count=1457320.0, ans=0.0 2023-10-04 00:15:36,540 WARNING [train.py:1204] (2/4) Exclude cut with ID _35723_210_1794_1_1533088341043_628250_8-234524-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 00:15:36,607 WARNING [train.py:1204] (2/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_64-248359-0_sp0.9 from training. Duration: 0.7666875 2023-10-04 00:15:40,040 WARNING [train.py:1204] (2/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_49-97158-0 from training. Duration: 0.76 2023-10-04 00:15:40,082 WARNING [train.py:1204] (2/4) Exclude cut with ID _24314_210_4915_1_1532135078116_7170179_743-915-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 00:15:42,826 WARNING [train.py:1204] (2/4) Exclude cut with ID _61194_210_18628_1_1535007555628_3750909_353-323468-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 00:15:42,857 WARNING [train.py:1204] (2/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_387-131538-0_sp1.1 from training. Duration: 0.736375 2023-10-04 00:15:42,877 WARNING [train.py:1204] (2/4) Exclude cut with ID _44424_210_18163_1_1533623394848_1213110_79-187603-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 00:15:44,163 WARNING [train.py:1204] (2/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_86-19601-0 from training. Duration: 0.85 2023-10-04 00:15:44,197 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_50050_210_16223_1_1535435796_7290138_103-283529-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 00:15:44,234 WARNING [train.py:1204] (2/4) Exclude cut with ID _13913_210_9994_1_1530579615187_3383897_473-277682-0 from training. Duration: 0.39 2023-10-04 00:15:47,058 WARNING [train.py:1204] (2/4) Exclude cut with ID _42326_210_7770_1_1533814282531_3707269_368-68029-0_sp1.1 from training. Duration: 0.92725 2023-10-04 00:15:50,299 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_41428_210_2161_1_1533774184_3992532_446-339645-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 00:15:51,548 WARNING [train.py:1204] (2/4) Exclude cut with ID _69584_210_5219_1_1535785133775_4044450_325-7656-0_sp1.1 from training. Duration: 0.7818125 2023-10-04 00:15:52,856 WARNING [train.py:1204] (2/4) Exclude cut with ID _21632_210_2253_1_1531994001765_3744140_134-184387-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 00:15:56,024 WARNING [train.py:1204] (2/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_550-184707-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 00:15:56,047 WARNING [train.py:1204] (2/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_106-341780-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 00:15:56,279 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.nonlin_attention.balancer.prob, batch_count=1457453.3333333333, ans=0.125 2023-10-04 00:16:00,132 WARNING [train.py:1204] (2/4) Exclude cut with ID _45871_210_5665_1_1533864614245_3296060_253-132552-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 00:16:00,346 INFO [scaling.py:1118] (2/4) WithLoss: name=encoder.encoders.2.encoder.layers.1.self_attn_weights, loss-sum=0.000e+00 2023-10-04 00:16:01,486 WARNING [train.py:1204] (2/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_395-310220-0_sp1.1 from training. Duration: 0.8090625 2023-10-04 00:16:01,501 WARNING [train.py:1204] (2/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_795-33252-0_sp0.9 from training. Duration: 0.411125 2023-10-04 00:16:03,682 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9002W0003-495962 from training. Duration: 0.946 2023-10-04 00:16:04,901 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_43-71989-0 from training. Duration: 0.57 2023-10-04 00:16:04,914 WARNING [train.py:1204] (2/4) Exclude cut with ID _8699_210_2069_1_1528630346831_3960409_155-39766-0_sp1.1 from training. Duration: 0.7090625 2023-10-04 00:16:04,926 WARNING [train.py:1204] (2/4) Exclude cut with ID _27894_210_12392_1_1532415496078_6902439_788-232684-0_sp1.1 from training. Duration: 0.97275 2023-10-04 00:16:06,189 WARNING [train.py:1204] (2/4) Exclude cut with ID _56494_210_4220_1_1535284837625_3033169_10-39296-0_sp1.1 from training. Duration: 0.92725 2023-10-04 00:16:06,225 WARNING [train.py:1204] (2/4) Exclude cut with ID _40901_210_5665_1_1533363789958_3856130_341-20819-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 00:16:11,932 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9029W0435-509822 from training. Duration: 0.896 2023-10-04 00:16:13,886 WARNING [train.py:1204] (2/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_51-117576-0 from training. Duration: 0.4 2023-10-04 00:16:15,381 WARNING [train.py:1204] (2/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_197-28164-0_sp1.1 from training. Duration: 0.72725 2023-10-04 00:16:16,805 WARNING [train.py:1204] (2/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_145-137790-0_sp1.1 from training. Duration: 0.7545625 2023-10-04 00:16:19,882 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.feed_forward1.hidden_balancer.prob, batch_count=1457520.0, ans=0.125 2023-10-04 00:16:21,018 WARNING [train.py:1204] (2/4) Exclude cut with ID _47790_210_16187_1_1533862814066_4207519_2-39653-0_sp1.1 from training. Duration: 0.8909375 2023-10-04 00:16:21,246 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.feed_forward1.out_proj.dropout_p, batch_count=1457520.0, ans=0.1 2023-10-04 00:16:23,930 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_3780_210_2087_1_1526467760_2130483_186-208240-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 00:16:25,318 WARNING [train.py:1204] (2/4) Exclude cut with ID _24055_210_12651_1_1532310907143_976979_92-59356-0 from training. Duration: 0.91 2023-10-04 00:16:27,113 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_3910_210_1794_1_1526742306_334530_28-238909-0_sp1.1 from training. Duration: 0.736375 2023-10-04 00:16:29,203 WARNING [train.py:1204] (2/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_269-154726-0 from training. Duration: 0.86 2023-10-04 00:16:35,225 WARNING [train.py:1204] (2/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_326-165900-0_sp0.9 from training. Duration: 0.7666875 2023-10-04 00:16:36,701 WARNING [train.py:1204] (2/4) Exclude cut with ID _5294_210_1747_1_1527249643928_3696179_470-276077-0_sp1.1 from training. Duration: 0.963625 2023-10-04 00:16:37,991 WARNING [train.py:1204] (2/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_372-131163-0 from training. Duration: 0.94 2023-10-04 00:16:38,050 WARNING [train.py:1204] (2/4) Exclude cut with ID _45297_210_2721_1_1533715351381_3264949_351-147136-0_sp1.1 from training. Duration: 0.7909375 2023-10-04 00:16:38,119 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_1072-12671-0_sp1.1 from training. Duration: 0.92725 2023-10-04 00:16:39,406 INFO [train.py:1046] (2/4) Epoch 42, batch 850, loss[loss=0.17, simple_loss=0.2591, pruned_loss=0.04042, over 23993.00 frames. ], tot_loss[loss=0.1551, simple_loss=0.2359, pruned_loss=0.03713, over 4666309.32 frames. ], batch size: 80, lr: 2.43e-03, grad_scale: 16.0 2023-10-04 00:16:39,530 WARNING [train.py:1204] (2/4) Exclude cut with ID _15133_210_6228_1_1530794512074_3080379_54-198193-0 from training. Duration: 0.94 2023-10-04 00:16:39,555 WARNING [train.py:1204] (2/4) Exclude cut with ID _30196_210_1664_1_1533708496522_7074140_1194-270988-0_sp1.1 from training. Duration: 0.97275 2023-10-04 00:16:40,900 WARNING [train.py:1204] (2/4) Exclude cut with ID _50310_210_15488_1_1534068043345_3641080_289-193637-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 00:16:40,995 WARNING [train.py:1204] (2/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_313-263495-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 00:16:43,643 WARNING [train.py:1204] (2/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_18-130336-0_sp1.1 from training. Duration: 0.7181875 2023-10-04 00:16:45,412 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_47896_210_20508_1_1534492518_5958868_646-287209-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 00:16:45,519 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_294-163793-0 from training. Duration: 0.82 2023-10-04 00:16:46,821 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_8-319957-0 from training. Duration: 0.96 2023-10-04 00:16:46,824 WARNING [train.py:1204] (2/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_1211-226066-0 from training. Duration: 0.76 2023-10-04 00:16:46,974 WARNING [train.py:1204] (2/4) Exclude cut with ID _4779_210_2084_1_1527318022944_3914893_500-285013-0_sp0.9 from training. Duration: 0.7666875 2023-10-04 00:16:48,202 WARNING [train.py:1204] (2/4) Exclude cut with ID _22826_210_9994_1_1531908201522_3279169_347-232268-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 00:16:49,683 WARNING [train.py:1204] (2/4) Exclude cut with ID _29877_210_6228_1_1532397791106_5276149_111-55056-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 00:16:50,978 WARNING [train.py:1204] (2/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_812-271646-0_sp1.1 from training. Duration: 0.92725 2023-10-04 00:16:51,008 WARNING [train.py:1204] (2/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_235-262124-0_sp1.1 from training. Duration: 0.6090625 2023-10-04 00:16:54,395 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.0.bypass.scale_min, batch_count=1457720.0, ans=0.2 2023-10-04 00:16:54,525 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=1457720.0, ans=0.1 2023-10-04 00:16:55,641 WARNING [train.py:1204] (2/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_14-16530-0_sp1.1 from training. Duration: 0.97275 2023-10-04 00:16:56,992 WARNING [train.py:1204] (2/4) Exclude cut with ID _44718_210_5816_1_1534050051869_6469030_568-304512-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 00:16:57,011 WARNING [train.py:1204] (2/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_1033-274376-0 from training. Duration: 0.87 2023-10-04 00:17:00,305 WARNING [train.py:1204] (2/4) Exclude cut with ID _4681_210_3525_1_1527251040321_1235713_123-24428-0 from training. Duration: 0.48 2023-10-04 00:17:01,787 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_37818_210_4238_1_1533620910_4720083_251-288764-0_sp1.1 from training. Duration: 0.97275 2023-10-04 00:17:03,160 WARNING [train.py:1204] (2/4) Exclude cut with ID _65585_210_12210_1_1535450451584_3764098_112-294301-0 from training. Duration: 0.88 2023-10-04 00:17:05,896 WARNING [train.py:1204] (2/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_329-113173-0 from training. Duration: 0.62 2023-10-04 00:17:07,231 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6767_210_3273_1_1527855783_1718932_173-256601-0 from training. Duration: 0.8 2023-10-04 00:17:08,278 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.0.layers.0.feed_forward2.out_whiten, num_groups=1, num_channels=192, metric=5.70 vs. limit=15.0 2023-10-04 00:17:10,023 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9266W0001-627677 from training. Duration: 0.896 2023-10-04 00:17:10,033 WARNING [train.py:1204] (2/4) Exclude cut with ID _49643_210_7989_1_1534640244224_3939031_611-185515-0_sp1.1 from training. Duration: 0.8545625 2023-10-04 00:17:10,037 WARNING [train.py:1204] (2/4) Exclude cut with ID _31649_210_6228_1_1532584608063_4091078_687-237771-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 00:17:10,053 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_12868_210_6753_1_1530702479_3610564_148-172861-0_sp0.9 from training. Duration: 0.5555625 2023-10-04 00:17:12,910 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_5129_210_6400_1_1527161005_3579054_107-69738-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 00:17:14,303 WARNING [train.py:1204] (2/4) Exclude cut with ID _40137_210_12210_1_1533290484046_3795180_36-301164-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 00:17:14,334 WARNING [train.py:1204] (2/4) Exclude cut with ID _7537_210_2084_1_1528502395431_3924359_113-199933-0 from training. Duration: 0.59 2023-10-04 00:17:17,866 WARNING [train.py:1204] (2/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_547-5424-0_sp1.1 from training. Duration: 0.8545625 2023-10-04 00:17:19,197 WARNING [train.py:1204] (2/4) Exclude cut with ID _43552_210_8913_1_1533726031059_3523429_147-284024-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 00:17:20,547 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_294-163793-0_sp1.1 from training. Duration: 0.7454375 2023-10-04 00:17:21,766 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.616e+02 1.961e+02 2.256e+02 2.503e+02 3.661e+02, threshold=4.513e+02, percent-clipped=0.0 2023-10-04 00:17:21,863 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_3780_210_2087_1_1526467760_2130483_170-259044-0_sp1.1 from training. Duration: 0.636375 2023-10-04 00:17:23,295 WARNING [train.py:1204] (2/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_726-270780-0_sp1.1 from training. Duration: 0.7818125 2023-10-04 00:17:24,689 WARNING [train.py:1204] (2/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_214-291369-0_sp1.1 from training. Duration: 0.57275 2023-10-04 00:17:24,741 WARNING [train.py:1204] (2/4) Exclude cut with ID _21881_210_3525_1_1531825242757_3798380_247-102591-0 from training. Duration: 0.98 2023-10-04 00:17:28,280 WARNING [train.py:1204] (2/4) Exclude cut with ID _8064_210_5399_1_1528372299291_4282973_18-174647-0_sp1.1 from training. Duration: 0.82725 2023-10-04 00:17:28,283 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_415-52611-0_sp1.1 from training. Duration: 0.936375 2023-10-04 00:17:28,428 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.conv_skip_rate, batch_count=1457853.3333333333, ans=0.0 2023-10-04 00:17:29,689 WARNING [train.py:1204] (2/4) Exclude cut with ID _8394_210_6702_1_1528590504316_3540189_379-49457-0_sp0.9 from training. Duration: 0.9666875 2023-10-04 00:17:29,703 WARNING [train.py:1204] (2/4) Exclude cut with ID _64521_210_14699_1_1535243427259_3815328_368-195106-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 00:17:29,775 WARNING [train.py:1204] (2/4) Exclude cut with ID _28394_210_8913_1_1532430049518_3650118_51-334124-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 00:17:29,922 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.1.bypass.skip_rate, batch_count=1457853.3333333333, ans=0.035 2023-10-04 00:17:31,196 WARNING [train.py:1204] (2/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_317-161769-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 00:17:34,538 WARNING [train.py:1204] (2/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_40-129756-0_sp0.9 from training. Duration: 0.9 2023-10-04 00:17:35,923 WARNING [train.py:1204] (2/4) Exclude cut with ID _3067_210_2667_1_1526212974640_4503168_119-175776-0_sp0.9 from training. Duration: 0.82225 2023-10-04 00:17:35,986 WARNING [train.py:1204] (2/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_1004-304915-0_sp1.1 from training. Duration: 0.92725 2023-10-04 00:17:37,226 WARNING [train.py:1204] (2/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_359-118926-0_sp0.9 from training. Duration: 0.8 2023-10-04 00:17:37,362 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder_embed.convnext.hidden_balancer.prob, batch_count=1457920.0, ans=0.125 2023-10-04 00:17:37,432 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.0.bypass_mid.scale_min, batch_count=1457920.0, ans=0.2 2023-10-04 00:17:38,133 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.0.layers.1.self_attn2.whiten, num_groups=1, num_channels=192, metric=8.67 vs. limit=22.5 2023-10-04 00:17:45,545 WARNING [train.py:1204] (2/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_51-200220-0_sp0.9 from training. Duration: 0.62225 2023-10-04 00:17:47,451 WARNING [train.py:1204] (2/4) Exclude cut with ID _46683_210_9642_1_1533730913492_4591116_530-326202-0_sp1.1 from training. Duration: 0.936375 2023-10-04 00:17:47,503 WARNING [train.py:1204] (2/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_567-135114-0 from training. Duration: 0.87 2023-10-04 00:17:47,528 WARNING [train.py:1204] (2/4) Exclude cut with ID _66863_210_18053_1_1535698758334_8590319_1092-271784-0_sp1.1 from training. Duration: 0.87275 2023-10-04 00:17:47,538 WARNING [train.py:1204] (2/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_762-282614-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 00:17:47,779 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.self_attn_weights.pos_emb_skip_rate, batch_count=1457920.0, ans=0.0 2023-10-04 00:17:50,266 WARNING [train.py:1204] (2/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_280-120942-0 from training. Duration: 0.77 2023-10-04 00:17:52,885 INFO [train.py:1046] (2/4) Epoch 42, batch 900, loss[loss=0.1543, simple_loss=0.2405, pruned_loss=0.03402, over 23222.00 frames. ], tot_loss[loss=0.1554, simple_loss=0.2362, pruned_loss=0.03731, over 4677879.58 frames. ], batch size: 105, lr: 2.43e-03, grad_scale: 16.0 2023-10-04 00:17:54,349 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder_embed.dropout.p, batch_count=1457986.6666666667, ans=0.1 2023-10-04 00:17:57,024 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_31583_210_3852_1_1532595851_4024413_27-170931-0_sp1.1 from training. Duration: 0.963625 2023-10-04 00:18:00,720 WARNING [train.py:1204] (2/4) Exclude cut with ID _12369_210_4381_1_1530356078581_4388520_266-271628-0_sp1.1 from training. Duration: 0.92725 2023-10-04 00:18:00,764 WARNING [train.py:1204] (2/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_41-31157-0 from training. Duration: 0.57 2023-10-04 00:18:00,996 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.feed_forward1.hidden_balancer.prob, batch_count=1457986.6666666667, ans=0.125 2023-10-04 00:18:03,571 WARNING [train.py:1204] (2/4) Exclude cut with ID _21878_210_3525_1_1531738772002_7264330_113-18163-0_sp1.1 from training. Duration: 0.8090625 2023-10-04 00:18:05,463 WARNING [train.py:1204] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_147-111137-0 from training. Duration: 0.95 2023-10-04 00:18:05,537 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1083-220122-0_sp1.1 from training. Duration: 0.463625 2023-10-04 00:18:06,945 WARNING [train.py:1204] (2/4) Exclude cut with ID _14884_210_2252_1_1530707459689_5561250_591-173406-0_sp1.1 from training. Duration: 0.87275 2023-10-04 00:18:06,958 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_24677_210_1794_1_1532002693_4341296_149-303793-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 00:18:08,275 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_12868_210_6753_1_1530702479_3610564_133-564-0_sp1.1 from training. Duration: 0.7181875 2023-10-04 00:18:08,307 WARNING [train.py:1204] (2/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_625-338546-0_sp0.9 from training. Duration: 0.97775 2023-10-04 00:18:11,939 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.3.encoder.layers.1.feed_forward3.out_whiten, num_groups=1, num_channels=512, metric=12.41 vs. limit=15.0 2023-10-04 00:18:19,339 WARNING [train.py:1204] (2/4) Exclude cut with ID _25882_210_3036_1_1532775665815_6479510_624-320169-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 00:18:19,345 WARNING [train.py:1204] (2/4) Exclude cut with ID _20622_210_3852_1_1531622427080_1093839_127-325326-0_sp1.1 from training. Duration: 0.92725 2023-10-04 00:18:19,368 WARNING [train.py:1204] (2/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_464-33931-0_sp1.1 from training. Duration: 0.4909375 2023-10-04 00:18:20,980 WARNING [train.py:1204] (2/4) Exclude cut with ID _66112_210_10635_1_1535590236305_3801484_120-304588-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 00:18:25,662 WARNING [train.py:1204] (2/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_110-130961-0 from training. Duration: 0.47 2023-10-04 00:18:28,272 WARNING [train.py:1204] (2/4) Exclude cut with ID _16908_210_5724_1_1531119157248_3772700_64-54020-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 00:18:31,877 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.balancer1.prob, batch_count=1458120.0, ans=0.125 2023-10-04 00:18:32,890 WARNING [train.py:1204] (2/4) Exclude cut with ID _4068_210_2629_1_1526697350395_1546259_224-288693-0_sp1.1 from training. Duration: 0.536375 2023-10-04 00:18:34,179 WARNING [train.py:1204] (2/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_191-41023-0_sp0.9 from training. Duration: 0.788875 2023-10-04 00:18:36,032 WARNING [train.py:1204] (2/4) Exclude cut with ID ID0205W0255-783002 from training. Duration: 0.958 2023-10-04 00:18:36,108 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_162-262717-0 from training. Duration: 0.83 2023-10-04 00:18:42,983 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_224-322394-0_sp1.1 from training. Duration: 0.6 2023-10-04 00:18:42,990 WARNING [train.py:1204] (2/4) Exclude cut with ID _4866_210_2577_1_1527404412284_4364659_133-1696-0_sp1.1 from training. Duration: 0.9 2023-10-04 00:18:44,276 WARNING [train.py:1204] (2/4) Exclude cut with ID _6275_210_2084_1_1528110062213_7669049_973-198085-0_sp1.1 from training. Duration: 0.6818125 2023-10-04 00:18:49,794 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_47449_210_5399_1_1534076445_4006929_410-237806-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 00:18:49,812 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7543_210_6753_1_1528543318_4552_1-85072-0_sp0.9 from training. Duration: 0.92225 2023-10-04 00:18:50,092 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.self_attn_weights.pos_emb_skip_rate, batch_count=1458253.3333333333, ans=0.0 2023-10-04 00:18:51,266 WARNING [train.py:1204] (2/4) Exclude cut with ID _65263_210_22356_1_1535763598691_3921037_108-21902-0 from training. Duration: 0.99 2023-10-04 00:18:51,271 WARNING [train.py:1204] (2/4) Exclude cut with ID _65585_210_12210_1_1535450451584_3764098_309-174818-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 00:18:54,489 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_309-65177-0 from training. Duration: 0.88 2023-10-04 00:18:55,845 WARNING [train.py:1204] (2/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_363-185703-0_sp1.1 from training. Duration: 0.863625 2023-10-04 00:18:55,881 WARNING [train.py:1204] (2/4) Exclude cut with ID _42326_210_7770_1_1533814282531_3707269_442-238692-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 00:18:57,272 WARNING [train.py:1204] (2/4) Exclude cut with ID _69884_210_9771_1_1535846392818_4166169_226-254446-0_sp1.1 from training. Duration: 0.936375 2023-10-04 00:18:57,292 WARNING [train.py:1204] (2/4) Exclude cut with ID _29853_210_7497_1_1532505600851_6642110_287-299903-0_sp1.1 from training. Duration: 0.963625 2023-10-04 00:19:00,152 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1015-343884-0 from training. Duration: 0.62 2023-10-04 00:19:00,180 WARNING [train.py:1204] (2/4) Exclude cut with ID ID0305W0008-833402 from training. Duration: 0.91 2023-10-04 00:19:03,303 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6673_210_3273_1_1527767790_3173240_351-167954-0_sp1.1 from training. Duration: 0.436375 2023-10-04 00:19:03,321 WARNING [train.py:1204] (2/4) Exclude cut with ID _6456_210_1534_1_1527681634212_3181049_345-252877-0 from training. Duration: 0.64 2023-10-04 00:19:06,347 INFO [train.py:1046] (2/4) Epoch 42, batch 950, loss[loss=0.1636, simple_loss=0.2371, pruned_loss=0.04506, over 23813.00 frames. ], tot_loss[loss=0.155, simple_loss=0.2357, pruned_loss=0.03714, over 4695604.06 frames. ], batch size: 195, lr: 2.43e-03, grad_scale: 4.0 2023-10-04 00:19:06,501 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_13-205182-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 00:19:06,738 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.feed_forward1.hidden_balancer.prob, batch_count=1458320.0, ans=0.125 2023-10-04 00:19:09,826 WARNING [train.py:1204] (2/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_427-208901-0 from training. Duration: 0.68 2023-10-04 00:19:14,035 WARNING [train.py:1204] (2/4) Exclude cut with ID _49039_210_6315_1_1533967537541_3846118_9-148921-0_sp1.1 from training. Duration: 0.97275 2023-10-04 00:19:14,218 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.bypass.skip_rate, batch_count=1458320.0, ans=0.04949747468305833 2023-10-04 00:19:16,828 WARNING [train.py:1204] (2/4) Exclude cut with ID _59493_210_17282_1_1535078419077_3225879_143-70317-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 00:19:16,862 WARNING [train.py:1204] (2/4) Exclude cut with ID _27967_210_12140_1_1532588391859_8419130_1056-236369-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 00:19:16,900 WARNING [train.py:1204] (2/4) Exclude cut with ID _6537_210_1883_1_1527987559710_3597129_382-133062-0_sp0.9 from training. Duration: 0.7555625 2023-10-04 00:19:19,650 WARNING [train.py:1204] (2/4) Exclude cut with ID ID0376W0255-869957 from training. Duration: 0.9510625 2023-10-04 00:19:23,613 WARNING [train.py:1204] (2/4) Exclude cut with ID _49371_210_12071_1_1533952761021_3812190_483-240691-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 00:19:24,826 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_12868_210_6753_1_1530701932_530275_18-76322-0_sp1.1 from training. Duration: 0.7909375 2023-10-04 00:19:24,891 WARNING [train.py:1204] (2/4) Exclude cut with ID _53950_210_5816_1_1534417264919_3862951_367-26344-0_sp1.1 from training. Duration: 0.97275 2023-10-04 00:19:24,910 WARNING [train.py:1204] (2/4) Exclude cut with ID _5444_210_2151_1_1527418808464_5312210_634-194174-0_sp1.1 from training. Duration: 0.8818125 2023-10-04 00:19:24,927 WARNING [train.py:1204] (2/4) Exclude cut with ID _56494_210_4220_1_1535284837625_3033169_69-138150-0 from training. Duration: 0.94 2023-10-04 00:19:26,866 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7543_210_6753_1_1528544010_1110402_25-183860-0_sp0.9 from training. Duration: 0.52225 2023-10-04 00:19:28,224 WARNING [train.py:1204] (2/4) Exclude cut with ID _40284_210_2084_1_1533513679814_3704279_287-191918-0_sp1.1 from training. Duration: 0.963625 2023-10-04 00:19:29,717 WARNING [train.py:1204] (2/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_291-270881-0 from training. Duration: 0.97 2023-10-04 00:19:31,008 WARNING [train.py:1204] (2/4) Exclude cut with ID _12555_210_8750_1_1530075594402_3780239_449-267986-0_sp0.9 from training. Duration: 0.92225 2023-10-04 00:19:36,073 WARNING [train.py:1204] (2/4) Exclude cut with ID _3992_210_1747_1_1526644816031_3731060_268-64520-0_sp1.1 from training. Duration: 0.963625 2023-10-04 00:19:36,086 WARNING [train.py:1204] (2/4) Exclude cut with ID _39411_210_5986_1_1533538825030_3343649_173-238248-0_sp0.9 from training. Duration: 0.92225 2023-10-04 00:19:37,391 WARNING [train.py:1204] (2/4) Exclude cut with ID _32704_210_8954_1_1533017488840_6747370_828-313097-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 00:19:37,476 WARNING [train.py:1204] (2/4) Exclude cut with ID _26907_210_7234_1_1532221740694_3205263_337-2494-0 from training. Duration: 0.88 2023-10-04 00:19:40,251 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_229-45300-0_sp1.1 from training. Duration: 0.5181875 2023-10-04 00:19:41,620 WARNING [train.py:1204] (2/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_300-27235-0_sp1.1 from training. Duration: 0.7909375 2023-10-04 00:19:43,016 WARNING [train.py:1204] (2/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_48-338976-0_sp1.1 from training. Duration: 0.8090625 2023-10-04 00:19:47,305 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_13091_210_7925_1_1530609447_8987354_792-195353-0_sp1.1 from training. Duration: 0.936375 2023-10-04 00:19:47,311 WARNING [train.py:1204] (2/4) Exclude cut with ID _56524_210_9642_1_1534572011779_3619170_311-54500-0_sp1.1 from training. Duration: 0.97275 2023-10-04 00:19:50,151 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7543_210_6753_1_1528544010_1110402_25-183860-0 from training. Duration: 0.47 2023-10-04 00:19:51,440 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.607e+02 1.895e+02 2.094e+02 2.430e+02 3.415e+02, threshold=4.187e+02, percent-clipped=0.0 2023-10-04 00:19:52,904 WARNING [train.py:1204] (2/4) Exclude cut with ID 2411-132532-0017-82279-0_sp1.1 from training. Duration: 0.9681875 2023-10-04 00:19:52,905 WARNING [train.py:1204] (2/4) Exclude cut with ID _14170_210_5219_1_1530525949868_4469650_272-249985-0_sp0.9 from training. Duration: 0.7444375 2023-10-04 00:19:52,963 WARNING [train.py:1204] (2/4) Exclude cut with ID _55644_210_11856_1_1534593599582_4103719_320-229006-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 00:19:54,267 WARNING [train.py:1204] (2/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_177-100554-0_sp1.1 from training. Duration: 0.963625 2023-10-04 00:19:54,269 WARNING [train.py:1204] (2/4) Exclude cut with ID _14998_210_5219_1_1530705582080_7474970_787-294416-0_sp1.1 from training. Duration: 0.7454375 2023-10-04 00:19:57,776 WARNING [train.py:1204] (2/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_318-59097-0 from training. Duration: 0.76 2023-10-04 00:19:59,112 WARNING [train.py:1204] (2/4) Exclude cut with ID _12369_210_4381_1_1530356078581_4388520_480-13146-0_sp1.1 from training. Duration: 0.77275 2023-10-04 00:20:02,019 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_469-338558-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 00:20:02,074 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_40896_210_15710_1_1533433956_7100802_367-126897-0_sp1.1 from training. Duration: 0.963625 2023-10-04 00:20:02,090 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6545_210_2040_1_1527774670_4245258_379-14182-0 from training. Duration: 0.8 2023-10-04 00:20:02,108 WARNING [train.py:1204] (2/4) Exclude cut with ID _21631_210_2253_1_1531907880778_3160339_197-79498-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 00:20:02,112 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_12868_210_6753_1_1530702479_3610564_416-293746-0_sp1.1 from training. Duration: 0.8181875 2023-10-04 00:20:03,974 WARNING [train.py:1204] (2/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_52-211683-0 from training. Duration: 0.9 2023-10-04 00:20:08,172 WARNING [train.py:1204] (2/4) Exclude cut with ID _48678_210_5663_1_1533988830947_3769199_306-255886-0_sp0.9 from training. Duration: 0.8555625 2023-10-04 00:20:10,809 WARNING [train.py:1204] (2/4) Exclude cut with ID _65134_210_14763_1_1535335393985_3505368_296-24733-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 00:20:12,379 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.feed_forward3.hidden_balancer.prob, batch_count=1458586.6666666667, ans=0.125 2023-10-04 00:20:13,639 WARNING [train.py:1204] (2/4) Exclude cut with ID _14898_210_9206_1_1530766812377_4177190_335-99364-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 00:20:13,889 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.feed_forward1.hidden_balancer.prob, batch_count=1458586.6666666667, ans=0.125 2023-10-04 00:20:15,002 WARNING [train.py:1204] (2/4) Exclude cut with ID _54871_210_22356_1_1534665798253_3856359_19-342632-0 from training. Duration: 0.91 2023-10-04 00:20:15,034 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_53071_210_16554_1_1534490405_2954066_265-170472-0 from training. Duration: 0.83 2023-10-04 00:20:17,849 WARNING [train.py:1204] (2/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_189-298172-0_sp1.1 from training. Duration: 0.963625 2023-10-04 00:20:19,151 INFO [train.py:1046] (2/4) Epoch 42, batch 1000, loss[loss=0.1581, simple_loss=0.2484, pruned_loss=0.03388, over 24659.00 frames. ], tot_loss[loss=0.1549, simple_loss=0.2353, pruned_loss=0.03723, over 4692759.85 frames. ], batch size: 73, lr: 2.43e-03, grad_scale: 8.0 2023-10-04 00:20:22,087 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7615_210_4211_1_1528110965_4580160_72-235211-0 from training. Duration: 0.76 2023-10-04 00:20:23,527 WARNING [train.py:1204] (2/4) Exclude cut with ID _47790_210_16187_1_1533862814066_4207519_155-90874-0_sp1.1 from training. Duration: 0.936375 2023-10-04 00:20:28,155 WARNING [train.py:1204] (2/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_164-216310-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 00:20:29,546 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_46013_210_7234_1_1533776362_3744032_463-52378-0 from training. Duration: 0.86 2023-10-04 00:20:29,549 WARNING [train.py:1204] (2/4) Exclude cut with ID _61133_210_9969_1_1535196777227_3696409_333-270325-0 from training. Duration: 0.93 2023-10-04 00:20:34,322 WARNING [train.py:1204] (2/4) Exclude cut with ID _5172_210_1883_1_1527246062567_3577219_18-101380-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 00:20:34,330 WARNING [train.py:1204] (2/4) Exclude cut with ID _30800_210_22382_1_1532499347904_4515959_577-236858-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 00:20:35,744 WARNING [train.py:1204] (2/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_1006-177345-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 00:20:38,444 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6291_210_3273_1_1527683649_1078498_4-266348-0 from training. Duration: 0.78 2023-10-04 00:20:39,974 WARNING [train.py:1204] (2/4) Exclude cut with ID _55857_210_2346_1_1534644007831_6898209_745-98391-0 from training. Duration: 0.76 2023-10-04 00:20:42,681 WARNING [train.py:1204] (2/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_375-244976-0 from training. Duration: 0.75 2023-10-04 00:20:42,739 WARNING [train.py:1204] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_4-248404-0_sp1.1 from training. Duration: 0.97275 2023-10-04 00:20:44,158 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8453_210_4211_1_1528502940_8562730_596-306820-0 from training. Duration: 0.97 2023-10-04 00:20:45,597 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_211-339622-0_sp0.9 from training. Duration: 0.5 2023-10-04 00:20:45,633 WARNING [train.py:1204] (2/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_393-57888-0 from training. Duration: 0.64 2023-10-04 00:20:45,847 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.bypass.scale_min, batch_count=1458720.0, ans=0.2 2023-10-04 00:20:46,967 WARNING [train.py:1204] (2/4) Exclude cut with ID _23825_210_13449_1_1531915313526_3497150_143-343575-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 00:20:48,447 WARNING [train.py:1204] (2/4) Exclude cut with ID _58605_210_12210_1_1534723195939_3904259_36-12256-0_sp1.1 from training. Duration: 0.936375 2023-10-04 00:20:49,917 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.balancer1.prob, batch_count=1458786.6666666667, ans=0.125 2023-10-04 00:20:55,495 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.conv_module1.balancer2.prob, batch_count=1458786.6666666667, ans=0.125 2023-10-04 00:20:56,727 WARNING [train.py:1204] (2/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_38-67795-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 00:20:58,647 WARNING [train.py:1204] (2/4) Exclude cut with ID _64631_210_27767_1_1535349580068_4165160_308-327832-0_sp1.1 from training. Duration: 0.92725 2023-10-04 00:20:58,877 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.conv_module1.balancer2.prob, batch_count=1458786.6666666667, ans=0.125 2023-10-04 00:21:00,011 WARNING [train.py:1204] (2/4) Exclude cut with ID _37086_210_9038_1_1532999540891_7021250_726-81176-0_sp1.1 from training. Duration: 0.936375 2023-10-04 00:21:00,069 WARNING [train.py:1204] (2/4) Exclude cut with ID _47790_210_16187_1_1533862814066_4207519_47-141521-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 00:21:00,073 WARNING [train.py:1204] (2/4) Exclude cut with ID _3067_210_2667_1_1526212974640_4503168_250-285937-0 from training. Duration: 0.8 2023-10-04 00:21:00,104 WARNING [train.py:1204] (2/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_239-24000-0_sp1.1 from training. Duration: 0.97275 2023-10-04 00:21:02,632 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_3371_210_1794_1_1526384462_5580777_396-339812-0_sp0.9 from training. Duration: 0.9444375 2023-10-04 00:21:02,678 WARNING [train.py:1204] (2/4) Exclude cut with ID _16909_210_5724_1_1531292079912_4118919_580-173454-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 00:21:02,730 WARNING [train.py:1204] (2/4) Exclude cut with ID IC0124W0002-61308 from training. Duration: 0.982 2023-10-04 00:21:06,108 WARNING [train.py:1204] (2/4) Exclude cut with ID _31602_210_5854_1_1532677021456_4458250_249-96604-0 from training. Duration: 0.89 2023-10-04 00:21:08,880 WARNING [train.py:1204] (2/4) Exclude cut with ID _69584_210_5219_1_1535785133775_4044450_325-7656-0 from training. Duration: 0.86 2023-10-04 00:21:10,291 WARNING [train.py:1204] (2/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_478-162247-0 from training. Duration: 0.82 2023-10-04 00:21:11,709 WARNING [train.py:1204] (2/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_686-54383-0_sp1.1 from training. Duration: 0.7909375 2023-10-04 00:21:17,290 WARNING [train.py:1204] (2/4) Exclude cut with ID _29303_210_2575_1_1532392237303_7410649_85-270092-0_sp1.1 from training. Duration: 0.936375 2023-10-04 00:21:17,301 WARNING [train.py:1204] (2/4) Exclude cut with ID _2322_210_2667_1_1525604548971_3656299_440-80494-0_sp1.1 from training. Duration: 0.8 2023-10-04 00:21:17,340 WARNING [train.py:1204] (2/4) Exclude cut with ID _47346_210_9476_1_1533815971553_3536160_201-40644-0_sp1.1 from training. Duration: 0.936375 2023-10-04 00:21:19,815 WARNING [train.py:1204] (2/4) Exclude cut with ID _56235_210_10640_1_1534557335235_4161099_343-139898-0_sp1.1 from training. Duration: 0.87275 2023-10-04 00:21:21,235 WARNING [train.py:1204] (2/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_1012-279473-0 from training. Duration: 0.67 2023-10-04 00:21:22,612 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7931_210_1794_1_1528594728_5564059_280-279560-0_sp1.1 from training. Duration: 0.8909375 2023-10-04 00:21:22,649 WARNING [train.py:1204] (2/4) Exclude cut with ID _8263_210_1664_1_1528439452891_970220_68-181805-0 from training. Duration: 0.64 2023-10-04 00:21:23,963 WARNING [train.py:1204] (2/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_605-133395-0 from training. Duration: 0.66 2023-10-04 00:21:24,073 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_43-171109-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 00:21:24,076 WARNING [train.py:1204] (2/4) Exclude cut with ID _40094_210_16380_1_1533294005773_3641159_107-280739-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 00:21:24,668 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.3.encoder.layers.2.conv_module2.whiten, num_groups=1, num_channels=512, metric=4.05 vs. limit=15.0 2023-10-04 00:21:26,782 WARNING [train.py:1204] (2/4) Exclude cut with ID _14821_210_8750_1_1530680503991_6829359_2-227151-0_sp0.9 from training. Duration: 0.92225 2023-10-04 00:21:28,784 WARNING [train.py:1204] (2/4) Exclude cut with ID _14268_210_9206_1_1530622771285_4714099_331-105351-0_sp1.1 from training. Duration: 0.5909375 2023-10-04 00:21:30,259 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_26326_210_7925_1_1533002493_3592406_451-82741-0_sp1.1 from training. Duration: 0.97275 2023-10-04 00:21:32,062 INFO [train.py:1046] (2/4) Epoch 42, batch 1050, loss[loss=0.1388, simple_loss=0.2085, pruned_loss=0.03458, over 23588.00 frames. ], tot_loss[loss=0.154, simple_loss=0.2341, pruned_loss=0.0369, over 4695475.10 frames. ], batch size: 256, lr: 2.43e-03, grad_scale: 8.0 2023-10-04 00:21:34,279 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.self_attn_weights.pos_emb_skip_rate, batch_count=1458986.6666666667, ans=0.0 2023-10-04 00:21:35,353 WARNING [train.py:1204] (2/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_191-59784-0_sp0.9 from training. Duration: 0.97775 2023-10-04 00:21:35,456 WARNING [train.py:1204] (2/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_445-117398-0_sp1.1 from training. Duration: 0.8090625 2023-10-04 00:21:38,551 WARNING [train.py:1204] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_605-257205-0_sp0.9 from training. Duration: 0.6555625 2023-10-04 00:21:39,851 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_50050_210_16223_1_1535435796_7290138_592-258841-0_sp1.1 from training. Duration: 0.936375 2023-10-04 00:21:41,336 WARNING [train.py:1204] (2/4) Exclude cut with ID _14348_210_8341_1_1530599434996_3778640_240-232370-0_sp0.9 from training. Duration: 0.9333125 2023-10-04 00:21:42,764 WARNING [train.py:1204] (2/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_239-316802-0_sp0.9 from training. Duration: 0.7666875 2023-10-04 00:21:44,202 WARNING [train.py:1204] (2/4) Exclude cut with ID _45560_210_21982_1_1533812603951_3584999_111-6376-0_sp1.1 from training. Duration: 0.763625 2023-10-04 00:21:46,911 WARNING [train.py:1204] (2/4) Exclude cut with ID _54189_210_16380_1_1534332670039_3911710_606-311481-0_sp1.1 from training. Duration: 0.963625 2023-10-04 00:21:47,049 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.1.balancer2.prob, batch_count=1459053.3333333333, ans=0.125 2023-10-04 00:21:47,144 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.conv_module1.balancer1.min_positive, batch_count=1459053.3333333333, ans=0.025 2023-10-04 00:21:48,240 WARNING [train.py:1204] (2/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_237-332027-0_sp1.1 from training. Duration: 0.863625 2023-10-04 00:21:48,286 WARNING [train.py:1204] (2/4) Exclude cut with ID _66070_210_7640_1_1535441393096_7805310_489-292631-0_sp0.9 from training. Duration: 0.87775 2023-10-04 00:21:49,716 WARNING [train.py:1204] (2/4) Exclude cut with ID _49728_210_12651_1_1534041034294_6788150_624-195301-0_sp1.1 from training. Duration: 0.77275 2023-10-04 00:21:49,767 WARNING [train.py:1204] (2/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_44-79427-0 from training. Duration: 0.98 2023-10-04 00:21:51,085 WARNING [train.py:1204] (2/4) Exclude cut with ID _35350_210_16040_1_1533040191183_4075299_490-198354-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 00:21:51,127 WARNING [train.py:1204] (2/4) Exclude cut with ID _5166_210_2084_1_1527073197160_4087993_309-222545-0 from training. Duration: 0.84 2023-10-04 00:21:52,605 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_13091_210_7925_1_1530609447_8987354_790-199469-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 00:21:53,851 WARNING [train.py:1204] (2/4) Exclude cut with ID _21632_210_2253_1_1531994001765_3744140_110-274654-0 from training. Duration: 0.82 2023-10-04 00:21:53,857 WARNING [train.py:1204] (2/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_498-40153-0_sp0.9 from training. Duration: 0.7 2023-10-04 00:21:54,617 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.2.encoder.layers.1.whiten, num_groups=1, num_channels=384, metric=4.67 vs. limit=12.0 2023-10-04 00:21:58,505 WARNING [train.py:1204] (2/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_363-282909-0_sp1.1 from training. Duration: 0.936375 2023-10-04 00:21:58,581 WARNING [train.py:1204] (2/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_178-177582-0_sp0.9 from training. Duration: 0.82225 2023-10-04 00:21:59,858 WARNING [train.py:1204] (2/4) Exclude cut with ID _24612_210_8913_1_1531998054193_3806359_357-35487-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 00:22:02,502 WARNING [train.py:1204] (2/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_138-332773-0 from training. Duration: 0.81 2023-10-04 00:22:02,530 WARNING [train.py:1204] (2/4) Exclude cut with ID _7624_210_1531_1_1528109802198_4933101_418-349834-0 from training. Duration: 0.6 2023-10-04 00:22:02,550 WARNING [train.py:1204] (2/4) Exclude cut with ID _7537_210_2084_1_1528502395431_3924359_14-312906-0_sp0.9 from training. Duration: 0.9333125 2023-10-04 00:22:05,744 WARNING [train.py:1204] (2/4) Exclude cut with ID _60057_210_10640_1_1534903065793_3333150_362-230377-0 from training. Duration: 0.87 2023-10-04 00:22:05,876 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.0.conv_module2.balancer2.prob, batch_count=1459120.0, ans=0.125 2023-10-04 00:22:07,446 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.conv_skip_rate, batch_count=1459120.0, ans=0.0 2023-10-04 00:22:09,929 WARNING [train.py:1204] (2/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_138-64210-0 from training. Duration: 0.73 2023-10-04 00:22:09,990 WARNING [train.py:1204] (2/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_454-339504-0_sp1.1 from training. Duration: 0.97275 2023-10-04 00:22:12,794 WARNING [train.py:1204] (2/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_508-335369-0_sp0.9 from training. Duration: 0.5333125 2023-10-04 00:22:15,467 WARNING [train.py:1204] (2/4) Exclude cut with ID _5608_210_2106_1_1527415439089_6819768_909-82767-0_sp0.9 from training. Duration: 0.67775 2023-10-04 00:22:15,502 WARNING [train.py:1204] (2/4) Exclude cut with ID _6694_210_4599_1_1527852637490_3965529_99-11611-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 00:22:15,558 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_2655_210_2087_1_1525688959_3897400_52-279899-0_sp1.1 from training. Duration: 0.62725 2023-10-04 00:22:18,167 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.618e+02 1.961e+02 2.126e+02 2.339e+02 3.298e+02, threshold=4.252e+02, percent-clipped=0.0 2023-10-04 00:22:19,672 WARNING [train.py:1204] (2/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_375-245677-0_sp1.1 from training. Duration: 0.736375 2023-10-04 00:22:22,600 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_23850_210_4238_1_1532232597_1632433_32-185135-0 from training. Duration: 0.86 2023-10-04 00:22:23,988 WARNING [train.py:1204] (2/4) Exclude cut with ID _15113_210_8254_1_1530842438579_3716279_209-54533-0 from training. Duration: 0.96 2023-10-04 00:22:24,027 WARNING [train.py:1204] (2/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_401-53423-0 from training. Duration: 0.55 2023-10-04 00:22:25,900 WARNING [train.py:1204] (2/4) Exclude cut with ID _14867_210_1968_1_1530684179890_3846969_319-250868-0_sp1.1 from training. Duration: 0.9 2023-10-04 00:22:25,914 WARNING [train.py:1204] (2/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_106-30782-0_sp0.9 from training. Duration: 0.888875 2023-10-04 00:22:26,042 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_5960_210_1794_1_1527403865_3990999_515-2613-0 from training. Duration: 0.88 2023-10-04 00:22:28,830 WARNING [train.py:1204] (2/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_798-106179-0_sp0.9 from training. Duration: 0.92225 2023-10-04 00:22:28,917 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder_embed.conv.2.prob, batch_count=1459186.6666666667, ans=0.125 2023-10-04 00:22:29,597 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.2.encoder.layers.2.conv_module2.whiten, num_groups=1, num_channels=384, metric=5.73 vs. limit=15.0 2023-10-04 00:22:30,959 WARNING [train.py:1204] (2/4) Exclude cut with ID _14227_210_4381_1_1530701804286_3940259_125-275642-0_sp1.1 from training. Duration: 0.9 2023-10-04 00:22:30,961 WARNING [train.py:1204] (2/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_79-200392-0_sp1.1 from training. Duration: 0.8818125 2023-10-04 00:22:31,005 WARNING [train.py:1204] (2/4) Exclude cut with ID _48836_210_12210_1_1534075848293_3904150_429-35475-0_sp0.9 from training. Duration: 0.988875 2023-10-04 00:22:32,303 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.2.encoder.layers.2.whiten, num_groups=1, num_channels=384, metric=3.65 vs. limit=12.0 2023-10-04 00:22:32,824 WARNING [train.py:1204] (2/4) Exclude cut with ID _69884_210_9771_1_1535846392818_4166169_224-172517-0_sp1.1 from training. Duration: 0.97275 2023-10-04 00:22:36,203 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.feed_forward2.hidden_balancer.prob, batch_count=1459253.3333333333, ans=0.125 2023-10-04 00:22:37,306 WARNING [train.py:1204] (2/4) Exclude cut with ID _15113_210_8254_1_1530842438579_3716279_76-232507-0_sp1.1 from training. Duration: 0.97275 2023-10-04 00:22:37,309 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_135-5501-0 from training. Duration: 0.98 2023-10-04 00:22:38,731 WARNING [train.py:1204] (2/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_563-347832-0_sp0.9 from training. Duration: 0.988875 2023-10-04 00:22:38,742 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6786_210_1388_1_1527839699_4194495_273-149841-0 from training. Duration: 0.92 2023-10-04 00:22:40,047 WARNING [train.py:1204] (2/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_172-215932-0 from training. Duration: 0.66 2023-10-04 00:22:40,119 WARNING [train.py:1204] (2/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_939-132910-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 00:22:42,153 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.2.encoder.layers.2.feed_forward2.out_whiten, num_groups=1, num_channels=384, metric=3.70 vs. limit=15.0 2023-10-04 00:22:44,280 WARNING [train.py:1204] (2/4) Exclude cut with ID _49371_210_12071_1_1533952761021_3812190_16-246001-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 00:22:46,930 INFO [train.py:1046] (2/4) Epoch 42, batch 1100, loss[loss=0.1506, simple_loss=0.2225, pruned_loss=0.03931, over 23797.00 frames. ], tot_loss[loss=0.154, simple_loss=0.234, pruned_loss=0.03695, over 4691592.88 frames. ], batch size: 212, lr: 2.43e-03, grad_scale: 8.0 2023-10-04 00:22:48,460 WARNING [train.py:1204] (2/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_680-189165-0_sp1.1 from training. Duration: 0.936375 2023-10-04 00:22:50,027 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.feed_forward1.hidden_balancer.prob, batch_count=1459320.0, ans=0.125 2023-10-04 00:22:52,605 WARNING [train.py:1204] (2/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_53-300544-0_sp1.1 from training. Duration: 0.6090625 2023-10-04 00:22:52,724 WARNING [train.py:1204] (2/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_540-275685-0_sp1.1 from training. Duration: 0.8090625 2023-10-04 00:22:52,743 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_38965_210_4852_1_1533174906_3484735_453-106579-0_sp1.1 from training. Duration: 0.92725 2023-10-04 00:22:54,485 WARNING [train.py:1204] (2/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_105-328174-0 from training. Duration: 0.76 2023-10-04 00:22:55,258 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.1.encoder.layers.0.nonlin_attention.whiten2, num_groups=1, num_channels=256, metric=7.29 vs. limit=15.0 2023-10-04 00:22:55,878 WARNING [train.py:1204] (2/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_279-126205-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 00:22:59,024 WARNING [train.py:1204] (2/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_5-55893-0_sp0.9 from training. Duration: 0.611125 2023-10-04 00:23:00,888 WARNING [train.py:1204] (2/4) Exclude cut with ID _13678_210_7497_1_1530530069226_3463170_141-10835-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 00:23:02,479 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.ff2_skip_rate, batch_count=1459386.6666666667, ans=0.0 2023-10-04 00:23:03,628 WARNING [train.py:1204] (2/4) Exclude cut with ID _22083_210_8254_1_1531702862439_3678970_201-85738-0_sp1.1 from training. Duration: 0.8181875 2023-10-04 00:23:03,657 WARNING [train.py:1204] (2/4) Exclude cut with ID _4697_210_2069_1_1526815801953_3823098_51-1049-0 from training. Duration: 0.75 2023-10-04 00:23:03,734 WARNING [train.py:1204] (2/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_206-10097-0_sp0.9 from training. Duration: 0.5444375 2023-10-04 00:23:05,008 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_4528_210_3273_1_1527076654_3276777_360-63661-0_sp1.1 from training. Duration: 0.92725 2023-10-04 00:23:05,016 WARNING [train.py:1204] (2/4) Exclude cut with ID _64086_210_7994_1_1535270417019_7435949_449-31567-0_sp1.1 from training. Duration: 0.8818125 2023-10-04 00:23:08,190 WARNING [train.py:1204] (2/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_245-174553-0_sp0.9 from training. Duration: 0.97775 2023-10-04 00:23:11,080 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6545_210_2040_1_1527774670_4245258_379-14182-0_sp1.1 from training. Duration: 0.72725 2023-10-04 00:23:13,088 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.4.encoder.layers.0.feed_forward2.out_whiten, num_groups=1, num_channels=384, metric=6.42 vs. limit=15.0 2023-10-04 00:23:15,355 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_3133_210_2087_1_1526106446_2324690_256-206070-0_sp1.1 from training. Duration: 0.963625 2023-10-04 00:23:18,061 WARNING [train.py:1204] (2/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_278-104676-0 from training. Duration: 0.98 2023-10-04 00:23:18,137 WARNING [train.py:1204] (2/4) Exclude cut with ID IC0671W0065-331908 from training. Duration: 0.896 2023-10-04 00:23:19,568 WARNING [train.py:1204] (2/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_244-129917-0_sp1.1 from training. Duration: 0.97275 2023-10-04 00:23:19,734 WARNING [train.py:1204] (2/4) Exclude cut with ID _7852_210_5219_1_1528286471316_8139400_1129-170857-0_sp1.1 from training. Duration: 0.97275 2023-10-04 00:23:21,061 WARNING [train.py:1204] (2/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_443-30034-0_sp0.9 from training. Duration: 0.711125 2023-10-04 00:23:21,102 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_31583_210_3852_1_1532595851_4024413_250-332999-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 00:23:22,853 WARNING [train.py:1204] (2/4) Exclude cut with ID _8772_210_1850_1_1528635595972_3664011_247-336448-0 from training. Duration: 0.74 2023-10-04 00:23:24,129 WARNING [train.py:1204] (2/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_567-135114-0_sp0.9 from training. Duration: 0.9666875 2023-10-04 00:23:24,133 WARNING [train.py:1204] (2/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_557-349534-0_sp1.1 from training. Duration: 0.82725 2023-10-04 00:23:24,146 WARNING [train.py:1204] (2/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_1341-26728-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 00:23:24,202 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_40222_210_6228_1_1533385298_4481939_375-328051-0_sp1.1 from training. Duration: 0.97275 2023-10-04 00:23:24,222 WARNING [train.py:1204] (2/4) Exclude cut with ID _4697_210_2069_1_1526815801953_3823098_276-96593-0 from training. Duration: 0.77 2023-10-04 00:23:31,403 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_53071_210_16554_1_1534490405_2954066_265-170472-0_sp0.9 from training. Duration: 0.92225 2023-10-04 00:23:31,523 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.1.conv_module2.balancer1.max_abs, batch_count=1459520.0, ans=10.0 2023-10-04 00:23:33,355 WARNING [train.py:1204] (2/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_365-1492-0 from training. Duration: 0.91 2023-10-04 00:23:34,748 WARNING [train.py:1204] (2/4) Exclude cut with ID _66873_210_5517_1_1535454136506_4293370_654-52713-0_sp0.9 from training. Duration: 0.8555625 2023-10-04 00:23:34,904 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.feed_forward1.hidden_balancer.prob, batch_count=1459520.0, ans=0.125 2023-10-04 00:23:40,879 WARNING [train.py:1204] (2/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_113-92608-0_sp1.1 from training. Duration: 0.4909375 2023-10-04 00:23:43,671 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_46013_210_7234_1_1533776362_3744032_50-141535-0 from training. Duration: 0.99 2023-10-04 00:23:43,681 WARNING [train.py:1204] (2/4) Exclude cut with ID _13942_210_9988_1_1530604906036_3733360_499-55676-0_sp1.1 from training. Duration: 0.563625 2023-10-04 00:23:45,068 WARNING [train.py:1204] (2/4) Exclude cut with ID _30800_210_22382_1_1532499347904_4515959_375-65143-0_sp1.1 from training. Duration: 0.97275 2023-10-04 00:23:47,926 WARNING [train.py:1204] (2/4) Exclude cut with ID _16756_210_4915_1_1531047803763_7104259_199-325608-0_sp1.1 from training. Duration: 0.92725 2023-10-04 00:23:47,964 WARNING [train.py:1204] (2/4) Exclude cut with ID _31649_210_6228_1_1532584608063_4091078_443-124391-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 00:23:49,309 WARNING [train.py:1204] (2/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_146-218763-0 from training. Duration: 0.86 2023-10-04 00:23:50,639 WARNING [train.py:1204] (2/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_567-135114-0_sp1.1 from training. Duration: 0.7909375 2023-10-04 00:23:50,684 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_26326_210_7925_1_1533002493_3592406_462-288299-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 00:23:52,208 WARNING [train.py:1204] (2/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_322-217220-0 from training. Duration: 0.86 2023-10-04 00:23:52,241 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_238-256668-0_sp1.1 from training. Duration: 0.62725 2023-10-04 00:23:52,292 WARNING [train.py:1204] (2/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_371-18169-0 from training. Duration: 0.86 2023-10-04 00:23:53,666 WARNING [train.py:1204] (2/4) Exclude cut with ID _58480_210_12392_1_1534811121111_3646160_23-74512-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 00:23:53,671 WARNING [train.py:1204] (2/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_784-213099-0_sp1.1 from training. Duration: 0.8454375 2023-10-04 00:23:55,534 WARNING [train.py:1204] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_625-102955-0_sp1.1 from training. Duration: 0.7 2023-10-04 00:23:56,188 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.3.encoder.layers.2.feed_forward2.out_whiten, num_groups=1, num_channels=512, metric=7.92 vs. limit=15.0 2023-10-04 00:24:00,114 WARNING [train.py:1204] (2/4) Exclude cut with ID _49748_210_24116_1_1533986971737_3158910_176-174327-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 00:24:00,701 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.5.encoder.layers.0.feed_forward1.out_whiten, num_groups=1, num_channels=256, metric=6.13 vs. limit=15.0 2023-10-04 00:24:01,475 INFO [train.py:1046] (2/4) Epoch 42, batch 1150, loss[loss=0.1501, simple_loss=0.2408, pruned_loss=0.0297, over 24644.00 frames. ], tot_loss[loss=0.1546, simple_loss=0.2349, pruned_loss=0.03715, over 4705244.50 frames. ], batch size: 68, lr: 2.43e-03, grad_scale: 4.0 2023-10-04 00:24:01,626 WARNING [train.py:1204] (2/4) Exclude cut with ID _50310_210_15488_1_1534068043345_3641080_432-96140-0_sp1.1 from training. Duration: 0.936375 2023-10-04 00:24:03,088 WARNING [train.py:1204] (2/4) Exclude cut with ID _40402_210_18219_1_1533522324115_2953150_76-14910-0_sp1.1 from training. Duration: 0.92725 2023-10-04 00:24:03,119 WARNING [train.py:1204] (2/4) Exclude cut with ID _21783_210_11357_1_1531742458730_3762679_40-19458-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 00:24:04,407 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_14056_210_4163_1_1530604483_7572827_46-93547-0 from training. Duration: 0.95 2023-10-04 00:24:04,452 WARNING [train.py:1204] (2/4) Exclude cut with ID _21631_210_2253_1_1531907880778_3160339_321-35325-0_sp1.1 from training. Duration: 0.963625 2023-10-04 00:24:07,799 WARNING [train.py:1204] (2/4) Exclude cut with ID _4779_210_2084_1_1527318022944_3914893_500-285013-0 from training. Duration: 0.69 2023-10-04 00:24:09,518 WARNING [train.py:1204] (2/4) Exclude cut with ID _43552_210_8913_1_1533726031059_3523429_301-93324-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 00:24:09,532 WARNING [train.py:1204] (2/4) Exclude cut with ID _6473_210_1741_1_1527901307396_3815380_108-17335-0_sp1.1 from training. Duration: 0.5909375 2023-10-04 00:24:15,069 WARNING [train.py:1204] (2/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_206-10097-0 from training. Duration: 0.49 2023-10-04 00:24:16,487 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_36756_210_7234_1_1533026991_966276_103-77513-0_sp1.1 from training. Duration: 0.97275 2023-10-04 00:24:20,710 WARNING [train.py:1204] (2/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_662-18193-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 00:24:20,778 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_26433_210_12108_1_1532157158_4056480_439-52438-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 00:24:22,094 WARNING [train.py:1204] (2/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_464-33931-0 from training. Duration: 0.54 2023-10-04 00:24:22,100 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_3474_210_2028_1_1526188513_4825246_145-309131-0_sp0.9 from training. Duration: 0.82225 2023-10-04 00:24:22,118 WARNING [train.py:1204] (2/4) Exclude cut with ID _9679_210_7497_1_1529129212906_4363929_446-308321-0_sp1.1 from training. Duration: 0.963625 2023-10-04 00:24:25,387 WARNING [train.py:1204] (2/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_662-275621-0 from training. Duration: 0.42 2023-10-04 00:24:25,460 WARNING [train.py:1204] (2/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_344-337148-0_sp1.1 from training. Duration: 0.97275 2023-10-04 00:24:27,482 WARNING [train.py:1204] (2/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_759-179071-0_sp1.1 from training. Duration: 0.92725 2023-10-04 00:24:36,057 WARNING [train.py:1204] (2/4) Exclude cut with ID _28783_210_7943_1_1532671164808_7155479_293-12188-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 00:24:42,070 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_17613_210_2151_1_1531137569_2366160_235-320304-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 00:24:42,093 WARNING [train.py:1204] (2/4) Exclude cut with ID _14349_210_8341_1_1530615710818_3442470_170-336541-0 from training. Duration: 0.86 2023-10-04 00:24:42,141 WARNING [train.py:1204] (2/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_375-218677-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 00:24:43,381 WARNING [train.py:1204] (2/4) Exclude cut with ID _28317_210_12742_1_1532480302381_8510325_829-202956-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 00:24:46,332 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9029W0436-509823 from training. Duration: 0.768 2023-10-04 00:24:48,976 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.715e+02 2.006e+02 2.296e+02 2.643e+02 4.791e+02, threshold=4.591e+02, percent-clipped=2.0 2023-10-04 00:24:49,063 WARNING [train.py:1204] (2/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_231-112214-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 00:24:50,712 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.balancer2.prob, batch_count=1459853.3333333333, ans=0.125 2023-10-04 00:24:55,670 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9059W0470-524883 from training. Duration: 0.3415625 2023-10-04 00:24:59,748 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_43266_210_1794_1_1533521789_4497218_206-79512-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 00:25:01,169 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_294-163793-0_sp0.9 from training. Duration: 0.911125 2023-10-04 00:25:01,197 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6290_210_3273_1_1527595224_1282787_115-87095-0_sp1.1 from training. Duration: 0.5 2023-10-04 00:25:02,454 WARNING [train.py:1204] (2/4) Exclude cut with ID _14100_210_8341_1_1530529320613_3678069_279-55760-0_sp1.1 from training. Duration: 0.8454375 2023-10-04 00:25:05,533 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.conv_module2.balancer1.prob, batch_count=1459920.0, ans=0.125 2023-10-04 00:25:06,609 WARNING [train.py:1204] (2/4) Exclude cut with ID _14621_210_1925_1_1530686901185_3675420_419-234200-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 00:25:11,830 WARNING [train.py:1204] (2/4) Exclude cut with ID _60229_210_27767_1_1534838406158_3655350_353-213656-0_sp1.1 from training. Duration: 0.863625 2023-10-04 00:25:13,082 WARNING [train.py:1204] (2/4) Exclude cut with ID _48678_210_5663_1_1533988830947_3769199_306-255886-0_sp1.1 from training. Duration: 0.7 2023-10-04 00:25:14,621 WARNING [train.py:1204] (2/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_466-3492-0_sp1.1 from training. Duration: 0.97275 2023-10-04 00:25:14,622 WARNING [train.py:1204] (2/4) Exclude cut with ID _8755_210_1876_1_1528628559357_3258209_199-242167-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 00:25:14,676 WARNING [train.py:1204] (2/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_237-89684-0_sp1.1 from training. Duration: 0.87275 2023-10-04 00:25:15,930 INFO [train.py:1046] (2/4) Epoch 42, batch 1200, loss[loss=0.1659, simple_loss=0.2388, pruned_loss=0.04646, over 23428.00 frames. ], tot_loss[loss=0.1548, simple_loss=0.235, pruned_loss=0.03729, over 4713724.46 frames. ], batch size: 285, lr: 2.43e-03, grad_scale: 8.0 2023-10-04 00:25:17,404 WARNING [train.py:1204] (2/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_70-84121-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 00:25:19,050 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7544_210_6753_1_1528587722_3576686_172-171750-0_sp1.1 from training. Duration: 0.5909375 2023-10-04 00:25:20,484 WARNING [train.py:1204] (2/4) Exclude cut with ID _24271_210_12658_1_1531987270022_7190519_40-321822-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 00:25:20,501 WARNING [train.py:1204] (2/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_227-83612-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 00:25:23,326 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9156W0015-572508 from training. Duration: 0.896 2023-10-04 00:25:23,681 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.feed_forward1.out_proj.dropout_p, batch_count=1459986.6666666667, ans=0.1 2023-10-04 00:25:24,724 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_292-265928-0 from training. Duration: 0.51 2023-10-04 00:25:26,801 WARNING [train.py:1204] (2/4) Exclude cut with ID _14747_210_8341_1_1530685797463_3654089_147-131229-0_sp1.1 from training. Duration: 0.6909375 2023-10-04 00:25:30,132 WARNING [train.py:1204] (2/4) Exclude cut with ID _45297_210_2721_1_1533715351381_3264949_351-147136-0_sp0.9 from training. Duration: 0.9666875 2023-10-04 00:25:32,844 WARNING [train.py:1204] (2/4) Exclude cut with ID _5250_210_5219_1_1527249561806_8692110_1102-324568-0_sp1.1 from training. Duration: 0.97275 2023-10-04 00:25:35,518 WARNING [train.py:1204] (2/4) Exclude cut with ID _18516_210_9206_1_1531704653206_3692406_465-75446-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 00:25:35,519 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9203W0126-595623 from training. Duration: 0.896 2023-10-04 00:25:36,847 WARNING [train.py:1204] (2/4) Exclude cut with ID _55383_210_10635_1_1534468426172_2231669_139-328410-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 00:25:43,651 WARNING [train.py:1204] (2/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_166-246557-0_sp0.9 from training. Duration: 0.711125 2023-10-04 00:25:43,655 WARNING [train.py:1204] (2/4) Exclude cut with ID _30039_210_13270_1_1532484612900_2725150_239-304872-0_sp1.1 from training. Duration: 0.963625 2023-10-04 00:25:44,925 WARNING [train.py:1204] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_6-226750-0 from training. Duration: 0.94 2023-10-04 00:25:45,004 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_135-5501-0_sp1.1 from training. Duration: 0.8909375 2023-10-04 00:25:47,774 WARNING [train.py:1204] (2/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_359-118926-0 from training. Duration: 0.72 2023-10-04 00:25:50,759 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.conv_module2.balancer2.min_abs, batch_count=1460120.0, ans=0.5 2023-10-04 00:25:51,861 WARNING [train.py:1204] (2/4) Exclude cut with ID _8064_210_5399_1_1528372299291_4282973_4-91037-0 from training. Duration: 0.88 2023-10-04 00:25:51,887 WARNING [train.py:1204] (2/4) Exclude cut with ID _6653_210_2629_1_1527919172441_6732679_415-288492-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 00:25:52,088 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.out_combiner.scale_min, batch_count=1460120.0, ans=0.2 2023-10-04 00:25:53,278 WARNING [train.py:1204] (2/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_544-287179-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 00:25:54,671 WARNING [train.py:1204] (2/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_122-32606-0_sp1.1 from training. Duration: 0.92725 2023-10-04 00:25:56,144 WARNING [train.py:1204] (2/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_121-284152-0_sp1.1 from training. Duration: 0.536375 2023-10-04 00:25:58,003 WARNING [train.py:1204] (2/4) Exclude cut with ID _44718_210_5816_1_1534050051869_6469030_350-299477-0_sp1.1 from training. Duration: 0.97275 2023-10-04 00:25:58,022 WARNING [train.py:1204] (2/4) Exclude cut with ID _6653_210_2629_1_1527919172441_6732679_1103-56445-0_sp0.9 from training. Duration: 0.811125 2023-10-04 00:25:58,084 WARNING [train.py:1204] (2/4) Exclude cut with ID _12369_210_4381_1_1530356078581_4388520_64-298917-0_sp0.9 from training. Duration: 0.97775 2023-10-04 00:25:58,125 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_2245_210_2715_1_1525511548_3540254_310-52483-0 from training. Duration: 0.88 2023-10-04 00:26:01,263 WARNING [train.py:1204] (2/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_401-158239-0_sp1.1 from training. Duration: 0.6454375 2023-10-04 00:26:01,323 WARNING [train.py:1204] (2/4) Exclude cut with ID _59502_210_12210_1_1535107024698_1835139_146-162495-0_sp1.1 from training. Duration: 0.836375 2023-10-04 00:26:01,332 WARNING [train.py:1204] (2/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_41-31157-0_sp0.9 from training. Duration: 0.6333125 2023-10-04 00:26:02,755 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_68717_210_3528_1_1535628683_1556759_95-36722-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 00:26:02,762 WARNING [train.py:1204] (2/4) Exclude cut with ID _30196_210_1664_1_1533708496522_7074140_654-162611-0_sp1.1 from training. Duration: 0.92725 2023-10-04 00:26:02,900 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.self_attn_weights.pos_emb_skip_rate, batch_count=1460186.6666666667, ans=0.0 2023-10-04 00:26:08,142 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_9302_210_2550_1_1528889962_4655416_638-301620-0_sp0.9 from training. Duration: 0.52225 2023-10-04 00:26:10,176 WARNING [train.py:1204] (2/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_478-162247-0_sp1.1 from training. Duration: 0.7454375 2023-10-04 00:26:11,576 WARNING [train.py:1204] (2/4) Exclude cut with ID _45297_210_2721_1_1533715351381_3264949_353-9541-0 from training. Duration: 0.97 2023-10-04 00:26:15,720 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9653W0190-675468 from training. Duration: 0.9935 2023-10-04 00:26:17,403 INFO [scaling.py:1118] (2/4) WithLoss: name=encoder.encoders.5.encoder.layers.1.self_attn_weights, loss-sum=0.000e+00 2023-10-04 00:26:18,426 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_49730_210_13049_1_1534075294_1830676_214-84436-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 00:26:21,177 WARNING [train.py:1204] (2/4) Exclude cut with ID _50093_210_4220_1_1534417141207_3432299_259-322252-0_sp1.1 from training. Duration: 0.836375 2023-10-04 00:26:22,604 WARNING [train.py:1204] (2/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_599-50220-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 00:26:24,067 WARNING [train.py:1204] (2/4) Exclude cut with ID _21878_210_3525_1_1531738772002_7264330_174-109016-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 00:26:26,992 WARNING [train.py:1204] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_665-196155-0 from training. Duration: 0.67 2023-10-04 00:26:29,720 INFO [train.py:1046] (2/4) Epoch 42, batch 1250, loss[loss=0.1718, simple_loss=0.2437, pruned_loss=0.04998, over 23619.00 frames. ], tot_loss[loss=0.1568, simple_loss=0.237, pruned_loss=0.0383, over 4704855.85 frames. ], batch size: 256, lr: 2.43e-03, grad_scale: 8.0 2023-10-04 00:26:31,320 WARNING [train.py:1204] (2/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_656-188361-0_sp1.1 from training. Duration: 0.7909375 2023-10-04 00:26:33,237 WARNING [train.py:1204] (2/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_60-18123-0_sp1.1 from training. Duration: 0.8 2023-10-04 00:26:35,226 WARNING [train.py:1204] (2/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_535-14410-0 from training. Duration: 0.47 2023-10-04 00:26:36,601 WARNING [train.py:1204] (2/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_94-126128-0_sp1.1 from training. Duration: 0.7818125 2023-10-04 00:26:36,678 WARNING [train.py:1204] (2/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_356-31216-0_sp1.1 from training. Duration: 0.6818125 2023-10-04 00:26:39,499 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.ff3_skip_rate, batch_count=1460320.0, ans=0.0 2023-10-04 00:26:39,941 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.4.encoder.layers.1.self_attn_weights.whiten_keys, num_groups=4, num_channels=128, metric=1.80 vs. limit=6.0 2023-10-04 00:26:41,436 WARNING [train.py:1204] (2/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_443-30034-0_sp1.1 from training. Duration: 0.5818125 2023-10-04 00:26:42,759 WARNING [train.py:1204] (2/4) Exclude cut with ID _26907_210_7234_1_1532221740694_3205263_337-2494-0_sp1.1 from training. Duration: 0.8 2023-10-04 00:26:44,030 WARNING [train.py:1204] (2/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_49-97158-0_sp0.9 from training. Duration: 0.8444375 2023-10-04 00:26:44,041 WARNING [train.py:1204] (2/4) Exclude cut with ID _7877_210_6014_1_1528286358099_3750136_553-170128-0_sp1.1 from training. Duration: 0.963625 2023-10-04 00:26:46,087 WARNING [train.py:1204] (2/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_202-293523-0_sp0.9 from training. Duration: 0.8 2023-10-04 00:26:48,773 WARNING [train.py:1204] (2/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_188-171673-0_sp0.9 from training. Duration: 0.6444375 2023-10-04 00:26:48,789 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_120-41668-0_sp1.1 from training. Duration: 0.636375 2023-10-04 00:26:48,793 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_40223_210_6228_1_1533298404_4812267_613-78769-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 00:26:51,532 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_53861_210_5399_1_1534422156_4272077_150-336899-0_sp1.1 from training. Duration: 0.963625 2023-10-04 00:26:51,604 WARNING [train.py:1204] (2/4) Exclude cut with ID _14459_210_2228_1_1531443641946_7516700_1146-56843-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 00:26:55,645 WARNING [train.py:1204] (2/4) Exclude cut with ID _39867_210_9826_1_1533553222030_9344259_1194-299928-0_sp1.1 from training. Duration: 0.92725 2023-10-04 00:26:55,721 WARNING [train.py:1204] (2/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_214-291369-0_sp0.9 from training. Duration: 0.7 2023-10-04 00:27:01,271 WARNING [train.py:1204] (2/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_358-215558-0 from training. Duration: 0.93 2023-10-04 00:27:01,335 WARNING [train.py:1204] (2/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_577-287150-0_sp0.9 from training. Duration: 0.911125 2023-10-04 00:27:04,627 WARNING [train.py:1204] (2/4) Exclude cut with ID _60860_210_8254_1_1534896015875_3557169_229-187367-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 00:27:06,537 WARNING [train.py:1204] (2/4) Exclude cut with ID _6275_210_2084_1_1528110062213_7669049_857-298942-0 from training. Duration: 0.85 2023-10-04 00:27:06,588 WARNING [train.py:1204] (2/4) Exclude cut with ID _40137_210_12210_1_1533290484046_3795180_249-187059-0_sp1.1 from training. Duration: 0.8 2023-10-04 00:27:06,595 WARNING [train.py:1204] (2/4) Exclude cut with ID ID0171W0508-765603 from training. Duration: 0.541 2023-10-04 00:27:06,609 WARNING [train.py:1204] (2/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_521-219439-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 00:27:06,614 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_40223_210_6228_1_1533298404_4812267_741-302616-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 00:27:09,405 WARNING [train.py:1204] (2/4) Exclude cut with ID _65293_210_9070_1_1535545549765_4004210_438-154735-0_sp1.1 from training. Duration: 0.92725 2023-10-04 00:27:12,836 WARNING [train.py:1204] (2/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_564-132950-0_sp1.1 from training. Duration: 0.92725 2023-10-04 00:27:14,296 WARNING [train.py:1204] (2/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_291-270881-0_sp1.1 from training. Duration: 0.8818125 2023-10-04 00:27:15,759 WARNING [train.py:1204] (2/4) Exclude cut with ID _60229_210_27767_1_1534838406158_3655350_353-213656-0 from training. Duration: 0.95 2023-10-04 00:27:17,450 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_3133_210_2087_1_1526106446_2324690_243-143119-0 from training. Duration: 0.77 2023-10-04 00:27:17,490 WARNING [train.py:1204] (2/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_690-126306-0 from training. Duration: 0.78 2023-10-04 00:27:18,734 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.606e+02 1.951e+02 2.115e+02 2.289e+02 3.132e+02, threshold=4.229e+02, percent-clipped=0.0 2023-10-04 00:27:20,427 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.self_attn_weights.pos_emb_skip_rate, batch_count=1460520.0, ans=0.0 2023-10-04 00:27:21,694 WARNING [train.py:1204] (2/4) Exclude cut with ID _30304_210_6319_1_1532476827261_7582060_347-95003-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 00:27:23,101 WARNING [train.py:1204] (2/4) Exclude cut with ID _2322_210_2667_1_1525604548971_3656299_440-80494-0 from training. Duration: 0.88 2023-10-04 00:27:23,121 WARNING [train.py:1204] (2/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_370-229434-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 00:27:25,938 WARNING [train.py:1204] (2/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_124-345840-0_sp1.1 from training. Duration: 0.47275 2023-10-04 00:27:25,953 WARNING [train.py:1204] (2/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_165-123581-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 00:27:27,372 WARNING [train.py:1204] (2/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_453-282588-0 from training. Duration: 0.98 2023-10-04 00:27:27,390 WARNING [train.py:1204] (2/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_299-34413-0_sp0.9 from training. Duration: 0.72225 2023-10-04 00:27:28,784 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_40036_210_13405_1_1533648647_1315388_48-146016-0_sp1.1 from training. Duration: 0.8454375 2023-10-04 00:27:28,807 WARNING [train.py:1204] (2/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_99-167392-0_sp0.9 from training. Duration: 0.67775 2023-10-04 00:27:28,869 WARNING [train.py:1204] (2/4) Exclude cut with ID _48677_210_5663_1_1533902405919_3892989_37-207114-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 00:27:30,265 WARNING [train.py:1204] (2/4) Exclude cut with ID _29857_210_20034_1_1532395886753_3798160_335-163562-0 from training. Duration: 0.85 2023-10-04 00:27:31,927 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=1460586.6666666667, ans=0.1 2023-10-04 00:27:33,006 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_36492_210_7927_1_1533261987_4790834_703-343351-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 00:27:33,090 WARNING [train.py:1204] (2/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_80-147301-0_sp0.9 from training. Duration: 0.9333125 2023-10-04 00:27:34,567 WARNING [train.py:1204] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_218-295097-0_sp0.9 from training. Duration: 0.8555625 2023-10-04 00:27:37,912 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_2295_210_2087_1_1525500993_2407863_116-238894-0_sp1.1 from training. Duration: 0.636375 2023-10-04 00:27:40,643 WARNING [train.py:1204] (2/4) Exclude cut with ID _3231_210_2106_1_1526205864934_6784220_697-171068-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 00:27:42,354 WARNING [train.py:1204] (2/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_102-77064-0 from training. Duration: 0.93 2023-10-04 00:27:45,168 INFO [train.py:1046] (2/4) Epoch 42, batch 1300, loss[loss=0.141, simple_loss=0.2229, pruned_loss=0.02953, over 24300.00 frames. ], tot_loss[loss=0.1572, simple_loss=0.2374, pruned_loss=0.03853, over 4708184.14 frames. ], batch size: 61, lr: 2.43e-03, grad_scale: 8.0 2023-10-04 00:27:46,587 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_13091_210_7925_1_1530609447_8987354_1186-261957-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 00:27:47,957 WARNING [train.py:1204] (2/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_91-277684-0_sp1.1 from training. Duration: 0.67275 2023-10-04 00:27:48,037 WARNING [train.py:1204] (2/4) Exclude cut with ID _31088_210_16446_1_1532520012284_3440919_159-313751-0_sp1.1 from training. Duration: 0.936375 2023-10-04 00:27:51,441 WARNING [train.py:1204] (2/4) Exclude cut with ID _42326_210_7770_1_1533814282531_3707269_227-203721-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 00:27:52,829 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_3531_210_3528_1_1526294203_5216456_309-227302-0_sp1.1 from training. Duration: 0.62725 2023-10-04 00:27:54,185 WARNING [train.py:1204] (2/4) Exclude cut with ID _14087_210_2253_1_1530529224512_3014029_226-242140-0 from training. Duration: 0.81 2023-10-04 00:27:57,070 WARNING [train.py:1204] (2/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_122-109023-0_sp1.1 from training. Duration: 0.6909375 2023-10-04 00:27:58,440 WARNING [train.py:1204] (2/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_660-211088-0_sp0.9 from training. Duration: 0.688875 2023-10-04 00:27:59,895 WARNING [train.py:1204] (2/4) Exclude cut with ID _39290_210_4220_1_1533862778760_3075118_92-311600-0 from training. Duration: 0.88 2023-10-04 00:28:01,384 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.0.feed_forward1.out_proj.dropout_p, batch_count=1460720.0, ans=0.1 2023-10-04 00:28:04,003 WARNING [train.py:1204] (2/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_390-339137-0_sp1.1 from training. Duration: 0.5090625 2023-10-04 00:28:08,613 WARNING [train.py:1204] (2/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_449-341946-0_sp1.1 from training. Duration: 0.963625 2023-10-04 00:28:10,039 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_68095_210_20185_1_1535718226_8888855_884-327007-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 00:28:11,318 WARNING [train.py:1204] (2/4) Exclude cut with ID _42326_210_7770_1_1533814282531_3707269_308-222998-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 00:28:11,405 WARNING [train.py:1204] (2/4) Exclude cut with ID _40399_210_4353_1_1533305658194_3279406_344-238708-0_sp1.1 from training. Duration: 0.963625 2023-10-04 00:28:11,558 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.nonlin_attention.balancer.prob, batch_count=1460720.0, ans=0.125 2023-10-04 00:28:11,629 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.ff3_skip_rate, batch_count=1460720.0, ans=0.0 2023-10-04 00:28:12,685 WARNING [train.py:1204] (2/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_87-131427-0_sp1.1 from training. Duration: 0.7090625 2023-10-04 00:28:12,732 WARNING [train.py:1204] (2/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_110-130961-0_sp0.9 from training. Duration: 0.52225 2023-10-04 00:28:14,449 WARNING [train.py:1204] (2/4) Exclude cut with ID _55153_210_2253_1_1534384786677_3162096_148-79022-0 from training. Duration: 0.96 2023-10-04 00:28:17,449 INFO [scaling.py:1118] (2/4) WithLoss: name=encoder.encoders.1.encoder.layers.1.self_attn_weights, loss-sum=0.000e+00 2023-10-04 00:28:20,071 WARNING [train.py:1204] (2/4) Exclude cut with ID _9679_210_7497_1_1529129212906_4363929_512-313889-0_sp1.1 from training. Duration: 0.736375 2023-10-04 00:28:21,434 WARNING [train.py:1204] (2/4) Exclude cut with ID _7970_210_1883_1_1528455616296_3472230_25-178407-0_sp1.1 from training. Duration: 0.6454375 2023-10-04 00:28:22,859 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_3133_210_2087_1_1526106446_2324690_246-125000-0 from training. Duration: 0.62 2023-10-04 00:28:24,186 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_15126_210_6753_1_1530778677_2591185_113-157651-0_sp0.9 from training. Duration: 0.7555625 2023-10-04 00:28:24,298 WARNING [train.py:1204] (2/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_105-63309-0_sp1.1 from training. Duration: 0.8909375 2023-10-04 00:28:26,479 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.ff3_skip_rate, batch_count=1460786.6666666667, ans=0.0 2023-10-04 00:28:27,629 WARNING [train.py:1204] (2/4) Exclude cut with ID _29877_210_6228_1_1532397791106_5276149_578-177271-0_sp1.1 from training. Duration: 0.936375 2023-10-04 00:28:28,924 WARNING [train.py:1204] (2/4) Exclude cut with ID _44831_210_13427_1_1533816011549_3689035_32-188147-0 from training. Duration: 0.96 2023-10-04 00:28:28,989 WARNING [train.py:1204] (2/4) Exclude cut with ID _12369_210_4381_1_1530356078581_4388520_300-254337-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 00:28:29,181 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.out_combiner.scale_min, batch_count=1460853.3333333333, ans=0.2 2023-10-04 00:28:30,341 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_128-241008-0 from training. Duration: 0.94 2023-10-04 00:28:31,688 WARNING [train.py:1204] (2/4) Exclude cut with ID _21878_210_3525_1_1531738772002_7264330_53-279178-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 00:28:34,603 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.conv_module1.balancer1.prob, batch_count=1460853.3333333333, ans=0.125 2023-10-04 00:28:35,795 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_41452_210_7291_1_1533437687_3941281_273-334088-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 00:28:35,797 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_128-241008-0_sp1.1 from training. Duration: 0.8545625 2023-10-04 00:28:37,311 WARNING [train.py:1204] (2/4) Exclude cut with ID _8394_210_6702_1_1528590504316_3540189_379-49457-0 from training. Duration: 0.87 2023-10-04 00:28:39,363 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_105-114797-0 from training. Duration: 0.84 2023-10-04 00:28:42,405 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_14928_210_5632_1_1530773461_4714000_454-112198-0 from training. Duration: 0.66 2023-10-04 00:28:45,898 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_456-22141-0_sp1.1 from training. Duration: 0.8 2023-10-04 00:28:48,647 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_115-233929-0 from training. Duration: 0.72 2023-10-04 00:28:51,285 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_616-16795-0_sp1.1 from training. Duration: 0.963625 2023-10-04 00:28:56,919 WARNING [train.py:1204] (2/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_448-212135-0 from training. Duration: 0.91 2023-10-04 00:28:58,614 INFO [train.py:1046] (2/4) Epoch 42, batch 1350, loss[loss=0.1621, simple_loss=0.2295, pruned_loss=0.04732, over 23828.00 frames. ], tot_loss[loss=0.1564, simple_loss=0.2362, pruned_loss=0.03832, over 4712432.35 frames. ], batch size: 164, lr: 2.43e-03, grad_scale: 8.0 2023-10-04 00:29:00,144 WARNING [train.py:1204] (2/4) Exclude cut with ID _46683_210_9642_1_1533730913492_4591116_201-205677-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 00:29:02,858 WARNING [train.py:1204] (2/4) Exclude cut with ID _64086_210_7994_1_1535270417019_7435949_43-80483-0_sp1.1 from training. Duration: 0.97275 2023-10-04 00:29:05,557 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_240-134124-0_sp1.1 from training. Duration: 0.963625 2023-10-04 00:29:05,580 WARNING [train.py:1204] (2/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_907-120940-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 00:29:08,188 WARNING [train.py:1204] (2/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_848-280957-0_sp1.1 from training. Duration: 0.7909375 2023-10-04 00:29:09,649 WARNING [train.py:1204] (2/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_243-42116-0_sp0.9 from training. Duration: 0.811125 2023-10-04 00:29:16,002 WARNING [train.py:1204] (2/4) Exclude cut with ID _6694_210_4599_1_1527852637490_3965529_116-109398-0_sp0.9 from training. Duration: 0.811125 2023-10-04 00:29:17,726 WARNING [train.py:1204] (2/4) Exclude cut with ID _24314_210_4915_1_1532135078116_7170179_521-258868-0 from training. Duration: 0.79 2023-10-04 00:29:17,835 WARNING [train.py:1204] (2/4) Exclude cut with ID _2220_210_1819_1_1525509109898_3658680_50-99837-0_sp0.9 from training. Duration: 0.6 2023-10-04 00:29:19,118 WARNING [train.py:1204] (2/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_313-134930-0_sp1.1 from training. Duration: 0.8818125 2023-10-04 00:29:20,642 WARNING [train.py:1204] (2/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_125-144174-0 from training. Duration: 0.39 2023-10-04 00:29:20,688 WARNING [train.py:1204] (2/4) Exclude cut with ID _59288_210_5854_1_1535077757774_3412246_275-293897-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 00:29:22,031 WARNING [train.py:1204] (2/4) Exclude cut with ID _40519_210_13794_1_1533715228655_7337390_722-52880-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 00:29:22,046 WARNING [train.py:1204] (2/4) Exclude cut with ID _7537_210_2084_1_1528502395431_3924359_128-268029-0 from training. Duration: 0.98 2023-10-04 00:29:23,532 WARNING [train.py:1204] (2/4) Exclude cut with ID _8021_210_7994_1_1528376451358_3653069_179-45206-0 from training. Duration: 0.86 2023-10-04 00:29:26,266 WARNING [train.py:1204] (2/4) Exclude cut with ID _13252_210_1925_1_1530671372047_3336909_88-276668-0 from training. Duration: 0.99 2023-10-04 00:29:27,661 WARNING [train.py:1204] (2/4) Exclude cut with ID _22613_210_5219_1_1532253599686_4090170_290-212862-0_sp1.1 from training. Duration: 0.97275 2023-10-04 00:29:27,665 WARNING [train.py:1204] (2/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_465-214726-0 from training. Duration: 0.86 2023-10-04 00:29:39,065 WARNING [train.py:1204] (2/4) Exclude cut with ID _56480_210_4220_1_1534679942079_3075409_138-36031-0_sp1.1 from training. Duration: 0.97275 2023-10-04 00:29:46,267 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.588e+02 1.991e+02 2.237e+02 2.537e+02 4.042e+02, threshold=4.474e+02, percent-clipped=0.0 2023-10-04 00:29:47,699 WARNING [train.py:1204] (2/4) Exclude cut with ID _58605_210_12210_1_1534723195939_3904259_300-285878-0_sp1.1 from training. Duration: 0.97275 2023-10-04 00:29:49,030 WARNING [train.py:1204] (2/4) Exclude cut with ID _40901_210_5665_1_1533363789958_3856130_114-340229-0_sp1.1 from training. Duration: 0.936375 2023-10-04 00:29:49,072 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_489-230703-0 from training. Duration: 0.85 2023-10-04 00:29:51,781 WARNING [train.py:1204] (2/4) Exclude cut with ID _66873_210_5517_1_1535454136506_4293370_322-81243-0_sp1.1 from training. Duration: 0.936375 2023-10-04 00:29:51,861 WARNING [train.py:1204] (2/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_271-269084-0 from training. Duration: 0.83 2023-10-04 00:29:51,872 WARNING [train.py:1204] (2/4) Exclude cut with ID _13252_210_1925_1_1530671372047_3336909_166-314607-0_sp0.9 from training. Duration: 0.6 2023-10-04 00:29:53,253 WARNING [train.py:1204] (2/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_340-343302-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 00:29:54,733 WARNING [train.py:1204] (2/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_286-112015-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 00:29:56,165 WARNING [train.py:1204] (2/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_328-264776-0 from training. Duration: 0.67 2023-10-04 00:29:57,609 WARNING [train.py:1204] (2/4) Exclude cut with ID _4866_210_2577_1_1527404412284_4364659_256-306830-0_sp0.9 from training. Duration: 0.9666875 2023-10-04 00:30:03,534 WARNING [train.py:1204] (2/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_239-316802-0 from training. Duration: 0.69 2023-10-04 00:30:04,834 WARNING [train.py:1204] (2/4) Exclude cut with ID _29857_210_20034_1_1532395886753_3798160_279-185801-0 from training. Duration: 0.91 2023-10-04 00:30:09,657 WARNING [train.py:1204] (2/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_656-188361-0 from training. Duration: 0.87 2023-10-04 00:30:09,897 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.attention_skip_rate, batch_count=1461253.3333333333, ans=0.0 2023-10-04 00:30:09,958 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.feed_forward3.hidden_balancer.prob, batch_count=1461253.3333333333, ans=0.125 2023-10-04 00:30:11,085 WARNING [train.py:1204] (2/4) Exclude cut with ID _44718_210_5816_1_1534050051869_6469030_100-230287-0_sp1.1 from training. Duration: 0.936375 2023-10-04 00:30:11,237 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.feed_forward1.hidden_balancer.prob, batch_count=1461320.0, ans=0.125 2023-10-04 00:30:12,366 INFO [train.py:1046] (2/4) Epoch 42, batch 1400, loss[loss=0.1702, simple_loss=0.236, pruned_loss=0.05224, over 23737.00 frames. ], tot_loss[loss=0.1555, simple_loss=0.2344, pruned_loss=0.03825, over 4695536.52 frames. ], batch size: 179, lr: 2.43e-03, grad_scale: 8.0 2023-10-04 00:30:16,273 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8275_210_4198_1_1528455320_3892571_276-215622-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 00:30:16,354 WARNING [train.py:1204] (2/4) Exclude cut with ID _30304_210_6319_1_1532476827261_7582060_698-309430-0_sp1.1 from training. Duration: 0.963625 2023-10-04 00:30:19,254 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_365-217119-0 from training. Duration: 0.63 2023-10-04 00:30:20,651 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7264_210_4938_1_1527939911_8132095_800-91995-0 from training. Duration: 0.86 2023-10-04 00:30:24,936 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.attention_skip_rate, batch_count=1461320.0, ans=0.0 2023-10-04 00:30:29,376 WARNING [train.py:1204] (2/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_49-97158-0_sp1.1 from training. Duration: 0.6909375 2023-10-04 00:30:29,661 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.self_attn_weights.pos_emb_skip_rate, batch_count=1461386.6666666667, ans=0.0 2023-10-04 00:30:30,793 WARNING [train.py:1204] (2/4) Exclude cut with ID _61132_210_9969_1_1535021792964_3755419_61-57720-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 00:30:33,539 WARNING [train.py:1204] (2/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_345-306832-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 00:30:33,571 WARNING [train.py:1204] (2/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_164-224052-0_sp0.9 from training. Duration: 0.77775 2023-10-04 00:30:35,076 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.bypass_mid.scale_min, batch_count=1461386.6666666667, ans=0.2 2023-10-04 00:30:38,005 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_34886_210_10633_1_1533203882_1755661_76-156096-0_sp1.1 from training. Duration: 0.7909375 2023-10-04 00:30:38,087 WARNING [train.py:1204] (2/4) Exclude cut with ID 4133-6541-0027-40495-0_sp1.1 from training. Duration: 0.9681875 2023-10-04 00:30:44,852 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.ff2_skip_rate, batch_count=1461453.3333333333, ans=0.0 2023-10-04 00:30:47,309 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_514-211165-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 00:30:47,365 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_41211_210_2544_1_1533348859_3702910_97-212695-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 00:30:51,614 WARNING [train.py:1204] (2/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_406-45086-0 from training. Duration: 0.8 2023-10-04 00:30:52,853 WARNING [train.py:1204] (2/4) Exclude cut with ID _65263_210_22356_1_1535763598691_3921037_49-222917-0_sp1.1 from training. Duration: 0.836375 2023-10-04 00:30:52,923 WARNING [train.py:1204] (2/4) Exclude cut with ID _4866_210_2577_1_1527404412284_4364659_352-157930-0_sp1.1 from training. Duration: 0.736375 2023-10-04 00:30:54,235 WARNING [train.py:1204] (2/4) Exclude cut with ID _4366_210_1388_1_1526792053633_3613819_231-211402-0_sp1.1 from training. Duration: 0.8545625 2023-10-04 00:30:54,280 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_98-21576-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 00:30:55,649 WARNING [train.py:1204] (2/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_348-127789-0_sp0.9 from training. Duration: 0.9444375 2023-10-04 00:30:55,671 WARNING [train.py:1204] (2/4) Exclude cut with ID _45871_210_5665_1_1533864614245_3296060_98-329557-0_sp1.1 from training. Duration: 0.97275 2023-10-04 00:30:57,432 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6846_210_6753_1_1527850767_4295158_2-315073-0_sp1.1 from training. Duration: 0.8 2023-10-04 00:30:57,542 WARNING [train.py:1204] (2/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_145-137790-0 from training. Duration: 0.83 2023-10-04 00:30:58,790 WARNING [train.py:1204] (2/4) Exclude cut with ID _14603_210_4220_1_1530615688441_3097319_133-158308-0_sp0.9 from training. Duration: 0.9333125 2023-10-04 00:31:00,438 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.conv_module2.balancer2.min_positive, batch_count=1461520.0, ans=0.05 2023-10-04 00:31:03,084 WARNING [train.py:1204] (2/4) Exclude cut with ID _14898_210_9206_1_1530766812377_4177190_320-11758-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 00:31:07,052 WARNING [train.py:1204] (2/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_406-45086-0_sp0.9 from training. Duration: 0.888875 2023-10-04 00:31:10,626 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.ff2_skip_rate, batch_count=1461586.6666666667, ans=0.0 2023-10-04 00:31:14,971 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_5438_210_1794_1_1527336687_2600009_104-288274-0 from training. Duration: 0.58 2023-10-04 00:31:15,054 WARNING [train.py:1204] (2/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_108-53929-0_sp0.9 from training. Duration: 0.6555625 2023-10-04 00:31:15,892 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=1461586.6666666667, ans=0.1 2023-10-04 00:31:17,050 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=1461586.6666666667, ans=0.1 2023-10-04 00:31:18,260 WARNING [train.py:1204] (2/4) Exclude cut with ID _30800_210_22382_1_1532499347904_4515959_444-286390-0_sp1.1 from training. Duration: 0.963625 2023-10-04 00:31:19,724 WARNING [train.py:1204] (2/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_125-144174-0_sp0.9 from training. Duration: 0.4333125 2023-10-04 00:31:20,410 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.2.encoder.layers.1.self_attn2.whiten, num_groups=1, num_channels=384, metric=15.90 vs. limit=22.5 2023-10-04 00:31:21,027 WARNING [train.py:1204] (2/4) Exclude cut with ID _55209_210_1531_1_1534675828996_4597160_118-345621-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 00:31:23,246 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.2.encoder.layers.2.nonlin_attention.whiten2, num_groups=1, num_channels=384, metric=6.13 vs. limit=15.0 2023-10-04 00:31:23,840 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_45886_210_17024_1_1533709215_471063_7-79165-0_sp1.1 from training. Duration: 0.8909375 2023-10-04 00:31:26,680 INFO [train.py:1046] (2/4) Epoch 42, batch 1450, loss[loss=0.1511, simple_loss=0.2402, pruned_loss=0.03104, over 24595.00 frames. ], tot_loss[loss=0.1545, simple_loss=0.2337, pruned_loss=0.03763, over 4710882.32 frames. ], batch size: 71, lr: 2.43e-03, grad_scale: 8.0 2023-10-04 00:31:26,742 WARNING [train.py:1204] (2/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_175-163894-0_sp0.9 from training. Duration: 0.82225 2023-10-04 00:31:28,202 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_50188_210_4198_1_1534503621_3646041_353-81682-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 00:31:28,209 WARNING [train.py:1204] (2/4) Exclude cut with ID _17875_210_2087_1_1531224012907_1821339_175-80565-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 00:31:28,218 WARNING [train.py:1204] (2/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_159-328157-0_sp1.1 from training. Duration: 0.47275 2023-10-04 00:31:29,974 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=1461653.3333333333, ans=0.1 2023-10-04 00:31:33,135 WARNING [train.py:1204] (2/4) Exclude cut with ID _60057_210_10640_1_1534903065793_3333150_137-267579-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 00:31:34,422 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_92-46009-0_sp1.1 from training. Duration: 0.4909375 2023-10-04 00:31:35,728 WARNING [train.py:1204] (2/4) Exclude cut with ID _53203_210_2087_1_1534246218309_475590_48-68507-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 00:31:35,755 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_28211_210_6388_1_1532861506_4523224_128-328162-0 from training. Duration: 0.9 2023-10-04 00:31:37,109 WARNING [train.py:1204] (2/4) Exclude cut with ID _4697_210_2069_1_1526815801953_3823098_51-1049-0_sp1.1 from training. Duration: 0.6818125 2023-10-04 00:31:37,202 WARNING [train.py:1204] (2/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_383-316000-0 from training. Duration: 0.9 2023-10-04 00:31:38,565 WARNING [train.py:1204] (2/4) Exclude cut with ID _4779_210_2084_1_1527318022944_3914893_185-295153-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 00:31:40,408 WARNING [train.py:1204] (2/4) Exclude cut with ID _3231_210_2106_1_1526205864934_6784220_171-256004-0_sp0.9 from training. Duration: 0.9555625 2023-10-04 00:31:40,409 WARNING [train.py:1204] (2/4) Exclude cut with ID _41314_210_12651_1_1533459476543_3766460_87-272068-0 from training. Duration: 0.83 2023-10-04 00:31:41,913 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_164-152777-0_sp1.1 from training. Duration: 0.92725 2023-10-04 00:31:41,958 WARNING [train.py:1204] (2/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_18-130336-0_sp0.9 from training. Duration: 0.87775 2023-10-04 00:31:43,780 WARNING [train.py:1204] (2/4) Exclude cut with ID _14886_210_4220_1_1530702020355_3516190_175-229073-0_sp1.1 from training. Duration: 0.4090625 2023-10-04 00:31:43,801 WARNING [train.py:1204] (2/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_791-45149-0_sp0.9 from training. Duration: 0.9555625 2023-10-04 00:31:45,146 WARNING [train.py:1204] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_468-189058-0_sp1.1 from training. Duration: 0.87275 2023-10-04 00:31:46,555 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8278_210_2550_1_1528637099_4220413_32-142780-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 00:31:48,005 WARNING [train.py:1204] (2/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_146-218763-0_sp0.9 from training. Duration: 0.9555625 2023-10-04 00:31:53,908 WARNING [train.py:1204] (2/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_348-127789-0_sp1.1 from training. Duration: 0.77275 2023-10-04 00:31:53,915 WARNING [train.py:1204] (2/4) Exclude cut with ID _47790_210_16187_1_1533862814066_4207519_288-285445-0_sp1.1 from training. Duration: 0.936375 2023-10-04 00:31:55,430 WARNING [train.py:1204] (2/4) Exclude cut with ID _14603_210_4220_1_1530615688441_3097319_308-181732-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 00:31:55,477 WARNING [train.py:1204] (2/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_895-117428-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 00:31:58,128 WARNING [train.py:1204] (2/4) Exclude cut with ID _9235_210_5219_1_1528891154253_8046120_387-159008-0_sp0.9 from training. Duration: 0.9555625 2023-10-04 00:31:58,147 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_37351_210_7925_1_1533012931_3812113_265-85780-0_sp0.9 from training. Duration: 0.97775 2023-10-04 00:31:58,168 WARNING [train.py:1204] (2/4) Exclude cut with ID _40551_210_10635_1_1533776424632_3416179_361-3964-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 00:31:58,205 WARNING [train.py:1204] (2/4) Exclude cut with ID _25882_210_3036_1_1532775665815_6479510_485-122742-0_sp1.1 from training. Duration: 0.97275 2023-10-04 00:32:02,377 WARNING [train.py:1204] (2/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_98-51676-0 from training. Duration: 0.73 2023-10-04 00:32:03,836 WARNING [train.py:1204] (2/4) Exclude cut with ID _30548_210_16040_1_1532608097130_3146969_371-280610-0_sp1.1 from training. Duration: 0.92725 2023-10-04 00:32:06,153 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.4.encoder.layers.1.nonlin_attention.whiten1, num_groups=1, num_channels=288, metric=4.10 vs. limit=10.0 2023-10-04 00:32:09,617 WARNING [train.py:1204] (2/4) Exclude cut with ID IC0671W0081-331924 from training. Duration: 0.896 2023-10-04 00:32:11,644 WARNING [train.py:1204] (2/4) Exclude cut with ID _40284_210_2084_1_1533513679814_3704279_380-160775-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 00:32:13,027 WARNING [train.py:1204] (2/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_57-270037-0_sp1.1 from training. Duration: 0.7 2023-10-04 00:32:14,444 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.707e+02 2.045e+02 2.411e+02 2.943e+02 4.436e+02, threshold=4.821e+02, percent-clipped=0.0 2023-10-04 00:32:14,542 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_853-150237-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 00:32:14,623 WARNING [train.py:1204] (2/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_491-261923-0 from training. Duration: 0.98 2023-10-04 00:32:16,638 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.feed_forward1.out_proj.dropout_p, batch_count=1461853.3333333333, ans=0.1 2023-10-04 00:32:19,166 WARNING [train.py:1204] (2/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_836-244045-0_sp1.1 from training. Duration: 0.97275 2023-10-04 00:32:21,069 WARNING [train.py:1204] (2/4) Exclude cut with ID _14880_210_8895_1_1530680426655_3795259_289-212817-0 from training. Duration: 0.69 2023-10-04 00:32:22,157 WARNING [train.py:1204] (2/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_303-340446-0 from training. Duration: 0.9 2023-10-04 00:32:23,504 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_37152_210_9697_1_1533106573_3844188_418-41736-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 00:32:26,303 WARNING [train.py:1204] (2/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_371-18169-0_sp1.1 from training. Duration: 0.7818125 2023-10-04 00:32:26,339 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_187-219984-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 00:32:27,689 WARNING [train.py:1204] (2/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_63-267585-0 from training. Duration: 0.9 2023-10-04 00:32:30,404 WARNING [train.py:1204] (2/4) Exclude cut with ID _27013_210_1389_1_1532437173240_3252649_222-125573-0 from training. Duration: 0.98 2023-10-04 00:32:30,447 WARNING [train.py:1204] (2/4) Exclude cut with ID _14821_210_8750_1_1530680503991_6829359_189-297451-0 from training. Duration: 0.55 2023-10-04 00:32:30,542 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_557-163391-0_sp1.1 from training. Duration: 0.97275 2023-10-04 00:32:31,848 WARNING [train.py:1204] (2/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_65-88242-0_sp1.1 from training. Duration: 0.5090625 2023-10-04 00:32:40,518 INFO [train.py:1046] (2/4) Epoch 42, batch 1500, loss[loss=0.1591, simple_loss=0.2543, pruned_loss=0.03198, over 24435.00 frames. ], tot_loss[loss=0.1551, simple_loss=0.2345, pruned_loss=0.0379, over 4707367.57 frames. ], batch size: 69, lr: 2.43e-03, grad_scale: 8.0 2023-10-04 00:32:43,520 WARNING [train.py:1204] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_218-295097-0 from training. Duration: 0.77 2023-10-04 00:32:44,792 WARNING [train.py:1204] (2/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_686-247600-0_sp1.1 from training. Duration: 0.836375 2023-10-04 00:32:44,796 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_397-147196-0_sp0.9 from training. Duration: 0.888875 2023-10-04 00:32:46,167 WARNING [train.py:1204] (2/4) Exclude cut with ID _53950_210_5816_1_1534417264919_3862951_237-136449-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 00:32:46,246 WARNING [train.py:1204] (2/4) Exclude cut with ID _3120_210_3525_1_1526036143112_4072980_74-178400-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 00:32:48,182 WARNING [train.py:1204] (2/4) Exclude cut with ID _5166_210_2084_1_1527073197160_4087993_309-222545-0_sp0.9 from training. Duration: 0.9333125 2023-10-04 00:32:48,274 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7544_210_6753_1_1528587722_3576686_172-171750-0 from training. Duration: 0.65 2023-10-04 00:32:49,636 WARNING [train.py:1204] (2/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_3-237434-0_sp0.9 from training. Duration: 0.8444375 2023-10-04 00:32:50,296 WARNING [train.py:1204] (2/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_101-293367-0_sp0.9 from training. Duration: 0.7 2023-10-04 00:32:50,311 WARNING [train.py:1204] (2/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_818-213552-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 00:32:50,922 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.3.encoder.layers.1.nonlin_attention.whiten2, num_groups=1, num_channels=512, metric=5.58 vs. limit=15.0 2023-10-04 00:32:51,487 WARNING [train.py:1204] (2/4) Exclude cut with ID _14349_210_8341_1_1530615710818_3442470_115-285975-0_sp1.1 from training. Duration: 0.7818125 2023-10-04 00:32:54,233 WARNING [train.py:1204] (2/4) Exclude cut with ID _30800_210_22382_1_1532499347904_4515959_291-309749-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 00:32:55,671 WARNING [train.py:1204] (2/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_715-3462-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 00:32:57,673 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.3.encoder.layers.3.feed_forward3.out_whiten, num_groups=1, num_channels=512, metric=16.02 vs. limit=15.0 2023-10-04 00:33:02,502 WARNING [train.py:1204] (2/4) Exclude cut with ID _59502_210_12210_1_1535107024698_1835139_13-331827-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 00:33:02,512 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_4528_210_3273_1_1527076654_3276777_342-168068-0 from training. Duration: 0.87 2023-10-04 00:33:02,569 WARNING [train.py:1204] (2/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_215-141059-0_sp0.9 from training. Duration: 0.688875 2023-10-04 00:33:02,590 WARNING [train.py:1204] (2/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_106-286130-0_sp1.1 from training. Duration: 0.8909375 2023-10-04 00:33:03,951 WARNING [train.py:1204] (2/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_480-77655-0_sp1.1 from training. Duration: 0.97275 2023-10-04 00:33:06,810 WARNING [train.py:1204] (2/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_848-280957-0 from training. Duration: 0.87 2023-10-04 00:33:12,651 WARNING [train.py:1204] (2/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_122-109023-0 from training. Duration: 0.76 2023-10-04 00:33:14,631 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_11305_210_1794_1_1529750428_7178148_645-233480-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 00:33:14,685 WARNING [train.py:1204] (2/4) Exclude cut with ID _7562_210_2084_1_1528887767916_3946229_345-169156-0 from training. Duration: 0.58 2023-10-04 00:33:16,084 WARNING [train.py:1204] (2/4) Exclude cut with ID _7562_210_2084_1_1528887767916_3946229_345-169156-0_sp1.1 from training. Duration: 0.52725 2023-10-04 00:33:18,898 WARNING [train.py:1204] (2/4) Exclude cut with ID _13253_210_1925_1_1530757775819_3577873_253-246386-0_sp1.1 from training. Duration: 0.8181875 2023-10-04 00:33:18,972 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_136-264444-0_sp1.1 from training. Duration: 0.97275 2023-10-04 00:33:18,985 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_2168_210_2290_1_1525606920_1849400_136-222991-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 00:33:22,112 WARNING [train.py:1204] (2/4) Exclude cut with ID _7852_210_5219_1_1528286471316_8139400_242-105336-0 from training. Duration: 0.72 2023-10-04 00:33:22,159 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_5960_210_1794_1_1527403865_3990999_515-2613-0_sp1.1 from training. Duration: 0.8 2023-10-04 00:33:22,171 WARNING [train.py:1204] (2/4) Exclude cut with ID _39688_210_19596_1_1533371626725_4154159_551-167834-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 00:33:23,448 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_43802_210_2290_1_1533516203_8829504_85-195240-0 from training. Duration: 0.87 2023-10-04 00:33:23,515 WARNING [train.py:1204] (2/4) Exclude cut with ID _63171_210_15488_1_1535104792794_3295110_113-113534-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 00:33:29,336 WARNING [train.py:1204] (2/4) Exclude cut with ID _8833_210_5219_1_1528592665210_7553059_319-349181-0_sp1.1 from training. Duration: 0.9 2023-10-04 00:33:29,337 WARNING [train.py:1204] (2/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_225-43381-0 from training. Duration: 0.94 2023-10-04 00:33:29,564 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.attention_skip_rate, batch_count=1462186.6666666667, ans=0.0 2023-10-04 00:33:34,762 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_5959_210_1794_1_1527400400_3445726_412-44052-0_sp0.9 from training. Duration: 0.8555625 2023-10-04 00:33:36,094 WARNING [train.py:1204] (2/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_356-167816-0_sp1.1 from training. Duration: 0.6545625 2023-10-04 00:33:40,390 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9012W0112-501244 from training. Duration: 0.768 2023-10-04 00:33:40,445 WARNING [train.py:1204] (2/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_177-326952-0_sp1.1 from training. Duration: 0.92725 2023-10-04 00:33:40,449 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9016W0116-502789 from training. Duration: 0.896 2023-10-04 00:33:41,831 WARNING [train.py:1204] (2/4) Exclude cut with ID _56235_210_10640_1_1534557335235_4161099_557-204815-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 00:33:43,626 WARNING [train.py:1204] (2/4) Exclude cut with ID _55857_210_2346_1_1534644007831_6898209_279-229469-0_sp1.1 from training. Duration: 0.936375 2023-10-04 00:33:43,688 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9029W0375-509764 from training. Duration: 0.896 2023-10-04 00:33:45,561 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_2295_210_2087_1_1525500993_2407863_234-83104-0_sp0.9 from training. Duration: 0.688875 2023-10-04 00:33:47,068 WARNING [train.py:1204] (2/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_266-137104-0 from training. Duration: 0.65 2023-10-04 00:33:48,412 WARNING [train.py:1204] (2/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_289-163927-0_sp1.1 from training. Duration: 0.92725 2023-10-04 00:33:52,917 WARNING [train.py:1204] (2/4) Exclude cut with ID _12733_210_8341_1_1530167424556_3646380_86-225314-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 00:33:52,929 WARNING [train.py:1204] (2/4) Exclude cut with ID _7877_210_6014_1_1528286358099_3750136_618-284064-0_sp1.1 from training. Duration: 0.92725 2023-10-04 00:33:52,976 WARNING [train.py:1204] (2/4) Exclude cut with ID _55844_210_2346_1_1534471212569_7625707_1017-31469-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 00:33:53,006 WARNING [train.py:1204] (2/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_617-241446-0_sp1.1 from training. Duration: 0.92725 2023-10-04 00:33:54,245 INFO [train.py:1046] (2/4) Epoch 42, batch 1550, loss[loss=0.1759, simple_loss=0.2546, pruned_loss=0.04858, over 23822.00 frames. ], tot_loss[loss=0.1554, simple_loss=0.2353, pruned_loss=0.0378, over 4732628.45 frames. ], batch size: 179, lr: 2.43e-03, grad_scale: 8.0 2023-10-04 00:33:54,323 WARNING [train.py:1204] (2/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_866-173440-0_sp1.1 from training. Duration: 0.8181875 2023-10-04 00:33:55,682 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_9460_210_4211_1_1528954587_10049667_480-260269-0 from training. Duration: 0.56 2023-10-04 00:33:55,719 WARNING [train.py:1204] (2/4) Exclude cut with ID _44659_210_2721_1_1534036120061_6614860_626-298884-0 from training. Duration: 0.97 2023-10-04 00:33:57,603 WARNING [train.py:1204] (2/4) Exclude cut with ID _53565_210_10635_1_1534337218335_3820259_287-18120-0_sp1.1 from training. Duration: 0.8818125 2023-10-04 00:33:57,664 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_791-197891-0 from training. Duration: 0.84 2023-10-04 00:33:59,009 WARNING [train.py:1204] (2/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_527-238990-0 from training. Duration: 0.9 2023-10-04 00:34:00,394 WARNING [train.py:1204] (2/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_449-65969-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 00:34:02,806 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_553-341452-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 00:34:02,842 WARNING [train.py:1204] (2/4) Exclude cut with ID _16909_210_5724_1_1531292079912_4118919_300-47574-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 00:34:02,863 WARNING [train.py:1204] (2/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_70-27732-0_sp1.1 from training. Duration: 0.8545625 2023-10-04 00:34:02,960 WARNING [train.py:1204] (2/4) Exclude cut with ID _44718_210_5816_1_1534050051869_6469030_727-28074-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 00:34:04,322 WARNING [train.py:1204] (2/4) Exclude cut with ID _55857_210_2346_1_1534644007831_6898209_357-123664-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 00:34:07,103 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9123W0004-555964 from training. Duration: 0.896 2023-10-04 00:34:07,131 WARNING [train.py:1204] (2/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_445-145208-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 00:34:07,154 WARNING [train.py:1204] (2/4) Exclude cut with ID _3675_210_4557_1_1526639934408_4182089_456-18305-0_sp1.1 from training. Duration: 0.6909375 2023-10-04 00:34:08,498 WARNING [train.py:1204] (2/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_1-99680-0_sp1.1 from training. Duration: 0.6090625 2023-10-04 00:34:09,906 WARNING [train.py:1204] (2/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_146-282400-0_sp0.9 from training. Duration: 0.788875 2023-10-04 00:34:09,921 WARNING [train.py:1204] (2/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_844-96989-0 from training. Duration: 0.71 2023-10-04 00:34:11,724 WARNING [train.py:1204] (2/4) Exclude cut with ID _29853_210_7497_1_1532505600851_6642110_128-146364-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 00:34:11,752 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_224-322394-0 from training. Duration: 0.66 2023-10-04 00:34:13,220 WARNING [train.py:1204] (2/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_191-59784-0 from training. Duration: 0.88 2023-10-04 00:34:13,232 WARNING [train.py:1204] (2/4) Exclude cut with ID _12556_210_8341_1_1530081024100_7327690_725-193477-0 from training. Duration: 0.98 2023-10-04 00:34:14,483 WARNING [train.py:1204] (2/4) Exclude cut with ID _5166_210_2084_1_1527073197160_4087993_523-79838-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 00:34:14,571 WARNING [train.py:1204] (2/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_398-334558-0_sp1.1 from training. Duration: 0.87275 2023-10-04 00:34:16,579 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.1.conv_module2.balancer1.prob, batch_count=1462386.6666666667, ans=0.125 2023-10-04 00:34:20,348 WARNING [train.py:1204] (2/4) Exclude cut with ID _6456_210_1534_1_1527681634212_3181049_171-36697-0_sp1.1 from training. Duration: 0.936375 2023-10-04 00:34:21,924 WARNING [train.py:1204] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_700-183549-0 from training. Duration: 0.68 2023-10-04 00:34:21,933 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8853_210_3273_1_1528633899_818968_51-187685-0 from training. Duration: 0.69 2023-10-04 00:34:30,096 WARNING [train.py:1204] (2/4) Exclude cut with ID _7719_210_3525_1_1528456153163_4296960_333-110399-0_sp1.1 from training. Duration: 0.87275 2023-10-04 00:34:33,260 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.feed_forward2.hidden_balancer.prob, batch_count=1462453.3333333333, ans=0.125 2023-10-04 00:34:34,387 WARNING [train.py:1204] (2/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_1022-83740-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 00:34:34,411 WARNING [train.py:1204] (2/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_61-327062-0_sp0.9 from training. Duration: 0.9 2023-10-04 00:34:34,425 WARNING [train.py:1204] (2/4) Exclude cut with ID _47251_210_18628_1_1533814063128_4137250_214-88076-0_sp1.1 from training. Duration: 0.8 2023-10-04 00:34:35,612 WARNING [train.py:1204] (2/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_839-173017-0 from training. Duration: 0.41 2023-10-04 00:34:40,999 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.589e+02 1.990e+02 2.197e+02 2.410e+02 4.079e+02, threshold=4.394e+02, percent-clipped=0.0 2023-10-04 00:34:42,395 WARNING [train.py:1204] (2/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_299-34413-0_sp1.1 from training. Duration: 0.5909375 2023-10-04 00:34:42,513 WARNING [train.py:1204] (2/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_664-159457-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 00:34:46,042 WARNING [train.py:1204] (2/4) Exclude cut with ID _66873_210_5517_1_1535454136506_4293370_483-291760-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 00:34:47,389 WARNING [train.py:1204] (2/4) Exclude cut with ID _30800_210_22382_1_1532499347904_4515959_287-255474-0_sp1.1 from training. Duration: 0.92725 2023-10-04 00:34:47,544 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=1462520.0, ans=0.1 2023-10-04 00:34:48,767 WARNING [train.py:1204] (2/4) Exclude cut with ID _48677_210_5663_1_1533902405919_3892989_111-11439-0_sp1.1 from training. Duration: 0.87275 2023-10-04 00:34:48,780 WARNING [train.py:1204] (2/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_399-156791-0 from training. Duration: 0.83 2023-10-04 00:34:50,676 WARNING [train.py:1204] (2/4) Exclude cut with ID _3992_210_1747_1_1526644816031_3731060_36-147855-0_sp0.9 from training. Duration: 0.8666875 2023-10-04 00:34:52,102 WARNING [train.py:1204] (2/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_442-320317-0_sp1.1 from training. Duration: 0.7454375 2023-10-04 00:34:52,121 WARNING [train.py:1204] (2/4) Exclude cut with ID _55429_210_10626_1_1534467588527_7174143_953-80868-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 00:34:53,505 WARNING [train.py:1204] (2/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_159-328157-0_sp0.9 from training. Duration: 0.57775 2023-10-04 00:34:53,518 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9309W0197-640324 from training. Duration: 0.896 2023-10-04 00:34:56,825 WARNING [train.py:1204] (2/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_361-30822-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 00:35:01,562 WARNING [train.py:1204] (2/4) Exclude cut with ID _5250_210_5219_1_1527249561806_8692110_636-251213-0 from training. Duration: 0.94 2023-10-04 00:35:03,114 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=1462586.6666666667, ans=0.1 2023-10-04 00:35:03,215 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.conv_skip_rate, batch_count=1462586.6666666667, ans=0.0 2023-10-04 00:35:05,584 WARNING [train.py:1204] (2/4) Exclude cut with ID _49728_210_12651_1_1534041034294_6788150_468-333827-0_sp1.1 from training. Duration: 0.963625 2023-10-04 00:35:05,854 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.feed_forward1.hidden_balancer.prob, batch_count=1462586.6666666667, ans=0.125 2023-10-04 00:35:06,917 WARNING [train.py:1204] (2/4) Exclude cut with ID _25882_210_3036_1_1532775665815_6479510_623-191759-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 00:35:06,982 WARNING [train.py:1204] (2/4) Exclude cut with ID _5189_210_4557_1_1527248319428_3769229_199-53526-0 from training. Duration: 0.42 2023-10-04 00:35:08,284 INFO [train.py:1046] (2/4) Epoch 42, batch 1600, loss[loss=0.1562, simple_loss=0.2485, pruned_loss=0.03198, over 24628.00 frames. ], tot_loss[loss=0.1562, simple_loss=0.2362, pruned_loss=0.03808, over 4723007.90 frames. ], batch size: 73, lr: 2.43e-03, grad_scale: 16.0 2023-10-04 00:35:08,387 WARNING [train.py:1204] (2/4) Exclude cut with ID _6274_210_2084_1_1527937583837_7494020_32-205288-0_sp0.9 from training. Duration: 0.8666875 2023-10-04 00:35:09,758 WARNING [train.py:1204] (2/4) Exclude cut with ID _65263_210_22356_1_1535763598691_3921037_401-346646-0_sp1.1 from training. Duration: 0.963625 2023-10-04 00:35:09,763 WARNING [train.py:1204] (2/4) Exclude cut with ID _12555_210_8750_1_1530075594402_3780239_449-267986-0_sp1.1 from training. Duration: 0.7545625 2023-10-04 00:35:09,775 WARNING [train.py:1204] (2/4) Exclude cut with ID _69884_210_9771_1_1535846392818_4166169_430-201020-0_sp1.1 from training. Duration: 0.8818125 2023-10-04 00:35:10,014 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.conv_module1.balancer1.prob, batch_count=1462653.3333333333, ans=0.125 2023-10-04 00:35:11,474 WARNING [train.py:1204] (2/4) Exclude cut with ID _8394_210_6702_1_1528590504316_3540189_379-49457-0_sp1.1 from training. Duration: 0.7909375 2023-10-04 00:35:15,577 WARNING [train.py:1204] (2/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_200-105218-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 00:35:15,627 WARNING [train.py:1204] (2/4) Exclude cut with ID _3867_210_1883_1_1526641200575_4475830_319-349850-0 from training. Duration: 0.67 2023-10-04 00:35:16,989 WARNING [train.py:1204] (2/4) Exclude cut with ID _28317_210_12742_1_1532480302381_8510325_916-195848-0 from training. Duration: 0.92 2023-10-04 00:35:19,090 WARNING [train.py:1204] (2/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_436-186311-0 from training. Duration: 0.65 2023-10-04 00:35:21,093 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.4.encoder.layers.1.nonlin_attention.whiten2, num_groups=1, num_channels=384, metric=3.20 vs. limit=15.0 2023-10-04 00:35:22,421 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_3910_210_1794_1_1526777667_3747260_302-180617-0_sp1.1 from training. Duration: 0.97275 2023-10-04 00:35:23,842 WARNING [train.py:1204] (2/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_102-92751-0 from training. Duration: 0.99 2023-10-04 00:35:23,912 WARNING [train.py:1204] (2/4) Exclude cut with ID _51899_210_20034_1_1534129254466_3669220_265-215428-0_sp1.1 from training. Duration: 0.8909375 2023-10-04 00:35:27,010 WARNING [train.py:1204] (2/4) Exclude cut with ID _58583_210_5236_1_1534726835266_3574249_438-198993-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 00:35:31,617 WARNING [train.py:1204] (2/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_166-166990-0_sp1.1 from training. Duration: 0.87275 2023-10-04 00:35:35,719 WARNING [train.py:1204] (2/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_907-151398-0 from training. Duration: 0.37 2023-10-04 00:35:35,967 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.ff2_skip_rate, batch_count=1462720.0, ans=0.0 2023-10-04 00:35:36,000 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.balancer1.prob, batch_count=1462720.0, ans=0.125 2023-10-04 00:35:38,548 WARNING [train.py:1204] (2/4) Exclude cut with ID _38670_210_15379_1_1533448810659_4391059_452-256669-0_sp1.1 from training. Duration: 0.936375 2023-10-04 00:35:38,583 WARNING [train.py:1204] (2/4) Exclude cut with ID _48677_210_5663_1_1533902405919_3892989_408-40740-0 from training. Duration: 0.83 2023-10-04 00:35:39,849 WARNING [train.py:1204] (2/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_903-158756-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 00:35:39,899 WARNING [train.py:1204] (2/4) Exclude cut with ID _13252_210_1925_1_1530671372047_3336909_397-216497-0 from training. Duration: 0.96 2023-10-04 00:35:44,518 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.conv_module1.balancer1.prob, batch_count=1462786.6666666667, ans=0.125 2023-10-04 00:35:45,537 WARNING [train.py:1204] (2/4) Exclude cut with ID _55429_210_10626_1_1534467588527_7174143_97-335084-0 from training. Duration: 0.97 2023-10-04 00:35:45,881 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.out_combiner.scale_min, batch_count=1462786.6666666667, ans=0.2 2023-10-04 00:35:51,793 WARNING [train.py:1204] (2/4) Exclude cut with ID _45329_210_7005_1_1534157951211_6774339_453-267730-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 00:35:51,870 WARNING [train.py:1204] (2/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_153-97441-0 from training. Duration: 0.56 2023-10-04 00:35:53,690 WARNING [train.py:1204] (2/4) Exclude cut with ID _40901_210_5665_1_1533363789958_3856130_312-329122-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 00:35:53,737 WARNING [train.py:1204] (2/4) Exclude cut with ID _30800_210_22382_1_1532499347904_4515959_97-126471-0_sp1.1 from training. Duration: 0.97275 2023-10-04 00:35:53,738 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_11305_210_1794_1_1529750428_7178148_257-312870-0_sp1.1 from training. Duration: 0.8 2023-10-04 00:35:55,141 WARNING [train.py:1204] (2/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_454-314901-0_sp0.9 from training. Duration: 0.511125 2023-10-04 00:36:01,530 WARNING [train.py:1204] (2/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_282-136641-0_sp1.1 from training. Duration: 0.3818125 2023-10-04 00:36:02,947 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_46013_210_7234_1_1533776362_3744032_493-160724-0_sp1.1 from training. Duration: 0.963625 2023-10-04 00:36:02,986 WARNING [train.py:1204] (2/4) Exclude cut with ID _67831_210_12392_1_1535612464202_3092612_259-33442-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 00:36:03,855 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.0.layers.1.whiten, num_groups=1, num_channels=192, metric=3.68 vs. limit=12.0 2023-10-04 00:36:04,329 WARNING [train.py:1204] (2/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_289-62384-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 00:36:04,399 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6545_210_2040_1_1527774670_4245258_379-14182-0_sp0.9 from training. Duration: 0.888875 2023-10-04 00:36:05,126 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.2.encoder.layers.1.self_attn_weights.whiten_keys, num_groups=4, num_channels=128, metric=2.61 vs. limit=6.0 2023-10-04 00:36:07,102 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_548-294316-0_sp1.1 from training. Duration: 0.736375 2023-10-04 00:36:07,188 WARNING [train.py:1204] (2/4) Exclude cut with ID _6275_210_2084_1_1528110062213_7669049_857-298942-0_sp1.1 from training. Duration: 0.77275 2023-10-04 00:36:08,576 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6673_210_3273_1_1527767790_3173240_327-306185-0_sp1.1 from training. Duration: 0.7545625 2023-10-04 00:36:14,207 WARNING [train.py:1204] (2/4) Exclude cut with ID _61008_210_10633_1_1534852769198_6974370_814-107647-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 00:36:15,539 WARNING [train.py:1204] (2/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_116-332008-0_sp1.1 from training. Duration: 0.8909375 2023-10-04 00:36:17,014 WARNING [train.py:1204] (2/4) Exclude cut with ID _32704_210_8954_1_1533017488840_6747370_266-329755-0 from training. Duration: 0.89 2023-10-04 00:36:17,018 WARNING [train.py:1204] (2/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_271-269084-0_sp0.9 from training. Duration: 0.92225 2023-10-04 00:36:18,957 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_3171_210_2087_1_1526109084_2330007_66-260583-0 from training. Duration: 0.72 2023-10-04 00:36:20,454 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.0.bypass_mid.scale_min, batch_count=1462920.0, ans=0.2 2023-10-04 00:36:22,921 INFO [train.py:1046] (2/4) Epoch 42, batch 1650, loss[loss=0.1441, simple_loss=0.2288, pruned_loss=0.0297, over 24535.00 frames. ], tot_loss[loss=0.1566, simple_loss=0.2369, pruned_loss=0.03818, over 4717514.80 frames. ], batch size: 63, lr: 2.43e-03, grad_scale: 16.0 2023-10-04 00:36:23,125 WARNING [train.py:1204] (2/4) Exclude cut with ID _5250_210_5219_1_1527249561806_8692110_1326-276386-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 00:36:23,779 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.2.encoder.layers.2.feed_forward1.out_whiten, num_groups=1, num_channels=384, metric=5.42 vs. limit=15.0 2023-10-04 00:36:24,974 WARNING [train.py:1204] (2/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_372-30302-0_sp1.1 from training. Duration: 0.7818125 2023-10-04 00:36:26,275 WARNING [train.py:1204] (2/4) Exclude cut with ID _25882_210_3036_1_1532775665815_6479510_631-257029-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 00:36:26,286 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_3691_210_3528_1_1526468169_4062053_355-334432-0 from training. Duration: 0.87 2023-10-04 00:36:26,300 WARNING [train.py:1204] (2/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_839-173017-0_sp1.1 from training. Duration: 0.37275 2023-10-04 00:36:26,308 WARNING [train.py:1204] (2/4) Exclude cut with ID _14998_210_5219_1_1530705582080_7474970_441-91466-0 from training. Duration: 0.97 2023-10-04 00:36:26,345 WARNING [train.py:1204] (2/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_108-53929-0 from training. Duration: 0.59 2023-10-04 00:36:31,529 WARNING [train.py:1204] (2/4) Exclude cut with ID _9389_210_1664_1_1529217337301_5289269_260-1942-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 00:36:32,730 WARNING [train.py:1204] (2/4) Exclude cut with ID _56423_210_5854_1_1534896061049_3165969_198-61547-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 00:36:34,009 WARNING [train.py:1204] (2/4) Exclude cut with ID _44778_210_2054_1_1533798127203_7443090_412-142122-0_sp1.1 from training. Duration: 0.92725 2023-10-04 00:36:34,034 WARNING [train.py:1204] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_88-341718-0_sp1.1 from training. Duration: 0.6 2023-10-04 00:36:34,204 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.bypass.scale_min, batch_count=1462986.6666666667, ans=0.2 2023-10-04 00:36:35,435 WARNING [train.py:1204] (2/4) Exclude cut with ID _61079_210_4525_1_1534921008198_4011419_334-298697-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 00:36:36,870 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_5465_210_4211_1_1527227253_55357_8-2499-0_sp1.1 from training. Duration: 0.996375 2023-10-04 00:36:39,544 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_447-201565-0_sp1.1 from training. Duration: 0.97275 2023-10-04 00:36:39,549 WARNING [train.py:1204] (2/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_253-161789-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 00:36:39,556 WARNING [train.py:1204] (2/4) Exclude cut with ID _44304_210_15374_1_1533639352765_4613129_302-22084-0_sp0.9 from training. Duration: 0.9555625 2023-10-04 00:36:39,571 WARNING [train.py:1204] (2/4) Exclude cut with ID _26664_210_7770_1_1532258943280_3418270_187-80815-0_sp1.1 from training. Duration: 0.8454375 2023-10-04 00:36:40,982 WARNING [train.py:1204] (2/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_68-212801-0 from training. Duration: 0.73 2023-10-04 00:36:41,014 WARNING [train.py:1204] (2/4) Exclude cut with ID _29345_210_4381_1_1532689168263_3860169_149-257268-0 from training. Duration: 0.81 2023-10-04 00:36:46,633 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_213-103282-0_sp1.1 from training. Duration: 0.7090625 2023-10-04 00:36:49,915 WARNING [train.py:1204] (2/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_156-302675-0_sp1.1 from training. Duration: 0.5 2023-10-04 00:36:57,409 WARNING [train.py:1204] (2/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_125-322172-0 from training. Duration: 0.93 2023-10-04 00:36:57,640 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.feed_forward1.hidden_balancer.prob, batch_count=1463120.0, ans=0.125 2023-10-04 00:36:58,744 WARNING [train.py:1204] (2/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_493-60474-0_sp1.1 from training. Duration: 0.963625 2023-10-04 00:37:00,765 WARNING [train.py:1204] (2/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_156-302675-0 from training. Duration: 0.55 2023-10-04 00:37:01,490 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.2.encoder.layers.0.nonlin_attention.whiten1, num_groups=1, num_channels=288, metric=7.24 vs. limit=10.0 2023-10-04 00:37:04,123 WARNING [train.py:1204] (2/4) Exclude cut with ID _41314_210_12651_1_1533459476543_3766460_123-267862-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 00:37:05,748 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.conv_module2.balancer2.prob, batch_count=1463120.0, ans=0.125 2023-10-04 00:37:06,813 WARNING [train.py:1204] (2/4) Exclude cut with ID _28427_210_4220_1_1532581240967_3061069_201-143747-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 00:37:06,855 WARNING [train.py:1204] (2/4) Exclude cut with ID _8772_210_1850_1_1528635595972_3664011_96-12756-0_sp1.1 from training. Duration: 0.8909375 2023-10-04 00:37:06,889 WARNING [train.py:1204] (2/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_1347-145675-0_sp1.1 from training. Duration: 0.936375 2023-10-04 00:37:08,302 WARNING [train.py:1204] (2/4) Exclude cut with ID _9235_210_5219_1_1528891154253_8046120_387-159008-0_sp1.1 from training. Duration: 0.7818125 2023-10-04 00:37:08,320 WARNING [train.py:1204] (2/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_400-201896-0_sp1.1 from training. Duration: 0.963625 2023-10-04 00:37:11,099 WARNING [train.py:1204] (2/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_326-174612-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 00:37:11,143 WARNING [train.py:1204] (2/4) Exclude cut with ID _55844_210_2346_1_1534471212569_7625707_390-116233-0_sp1.1 from training. Duration: 0.963625 2023-10-04 00:37:12,337 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.674e+02 1.974e+02 2.211e+02 2.480e+02 3.454e+02, threshold=4.423e+02, percent-clipped=0.0 2023-10-04 00:37:12,438 WARNING [train.py:1204] (2/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_419-317062-0_sp1.1 from training. Duration: 0.82725 2023-10-04 00:37:12,496 WARNING [train.py:1204] (2/4) Exclude cut with ID _29877_210_6228_1_1532397791106_5276149_266-62291-0_sp0.9 from training. Duration: 0.9666875 2023-10-04 00:37:13,792 WARNING [train.py:1204] (2/4) Exclude cut with ID _7857_210_1534_1_1528286515745_6831470_431-338528-0_sp1.1 from training. Duration: 0.92725 2023-10-04 00:37:14,096 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.feed_forward1.out_proj.dropout_p, batch_count=1463186.6666666667, ans=0.1 2023-10-04 00:37:15,172 WARNING [train.py:1204] (2/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_87-131427-0_sp0.9 from training. Duration: 0.8666875 2023-10-04 00:37:16,662 WARNING [train.py:1204] (2/4) Exclude cut with ID _24055_210_12651_1_1532310907143_976979_92-59356-0_sp1.1 from training. Duration: 0.82725 2023-10-04 00:37:16,744 WARNING [train.py:1204] (2/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_96-290039-0 from training. Duration: 0.56 2023-10-04 00:37:18,663 WARNING [train.py:1204] (2/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_48-182155-0_sp0.9 from training. Duration: 0.9666875 2023-10-04 00:37:20,486 WARNING [train.py:1204] (2/4) Exclude cut with ID _4626_210_3532_1_1526817652717_3916859_446-206656-0 from training. Duration: 0.76 2023-10-04 00:37:21,813 WARNING [train.py:1204] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_168-32906-0 from training. Duration: 0.69 2023-10-04 00:37:21,838 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_3780_210_2087_1_1526467760_2130483_170-259044-0 from training. Duration: 0.7 2023-10-04 00:37:21,849 WARNING [train.py:1204] (2/4) Exclude cut with ID _65134_210_14763_1_1535335393985_3505368_153-216718-0_sp1.1 from training. Duration: 0.92725 2023-10-04 00:37:21,912 WARNING [train.py:1204] (2/4) Exclude cut with ID _64639_210_5816_1_1535367619437_3528000_461-329156-0_sp1.1 from training. Duration: 0.87275 2023-10-04 00:37:23,239 WARNING [train.py:1204] (2/4) Exclude cut with ID _21989_210_5854_1_1531899214931_3271360_126-347531-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 00:37:24,586 WARNING [train.py:1204] (2/4) Exclude cut with ID _7614_210_4915_1_1529326764092_4537359_96-162197-0_sp1.1 from training. Duration: 0.963625 2023-10-04 00:37:24,587 WARNING [train.py:1204] (2/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_1083-147536-0 from training. Duration: 0.97 2023-10-04 00:37:24,770 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.1.feed_forward1.out_proj.dropout_p, batch_count=1463253.3333333333, ans=0.1 2023-10-04 00:37:28,808 WARNING [train.py:1204] (2/4) Exclude cut with ID _64029_210_4381_1_1535178502772_3756459_275-16710-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 00:37:30,174 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7264_210_4938_1_1527939911_8132095_800-91995-0_sp0.9 from training. Duration: 0.9555625 2023-10-04 00:37:30,209 WARNING [train.py:1204] (2/4) Exclude cut with ID _3844_210_1390_1_1526727803582_2871209_50-193879-0_sp1.1 from training. Duration: 0.936375 2023-10-04 00:37:32,833 WARNING [train.py:1204] (2/4) Exclude cut with ID _46952_210_5986_1_1534230010357_3107379_232-344503-0 from training. Duration: 0.96 2023-10-04 00:37:38,120 INFO [train.py:1046] (2/4) Epoch 42, batch 1700, loss[loss=0.1574, simple_loss=0.2315, pruned_loss=0.04164, over 23567.00 frames. ], tot_loss[loss=0.1566, simple_loss=0.2368, pruned_loss=0.03822, over 4714881.64 frames. ], batch size: 120, lr: 2.43e-03, grad_scale: 8.0 2023-10-04 00:37:38,237 WARNING [train.py:1204] (2/4) Exclude cut with ID _25808_210_13292_1_1532343514562_7366109_206-14076-0_sp1.1 from training. Duration: 0.936375 2023-10-04 00:37:38,238 WARNING [train.py:1204] (2/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_55-31904-0_sp0.9 from training. Duration: 0.97775 2023-10-04 00:37:38,261 WARNING [train.py:1204] (2/4) Exclude cut with ID _14632_210_9988_1_1530691279392_3923200_161-129530-0 from training. Duration: 0.96 2023-10-04 00:37:39,546 WARNING [train.py:1204] (2/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_770-200430-0_sp1.1 from training. Duration: 0.8090625 2023-10-04 00:37:39,556 WARNING [train.py:1204] (2/4) Exclude cut with ID _6270_210_2084_1_1527850911573_3856420_26-277343-0_sp1.1 from training. Duration: 0.7454375 2023-10-04 00:37:39,557 WARNING [train.py:1204] (2/4) Exclude cut with ID _69884_210_9771_1_1535846392818_4166169_88-23776-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 00:37:40,062 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.5.encoder.layers.0.self_attn2.whiten, num_groups=1, num_channels=256, metric=12.22 vs. limit=22.5 2023-10-04 00:37:43,490 WARNING [train.py:1204] (2/4) Exclude cut with ID _60860_210_8254_1_1534896015875_3557169_245-256666-0_sp1.1 from training. Duration: 0.8818125 2023-10-04 00:37:43,518 WARNING [train.py:1204] (2/4) Exclude cut with ID _5250_210_5219_1_1527249561806_8692110_636-251213-0_sp1.1 from training. Duration: 0.8545625 2023-10-04 00:37:43,540 WARNING [train.py:1204] (2/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_540-131164-0 from training. Duration: 0.92 2023-10-04 00:37:45,036 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_5931_210_2040_1_1527428481_4547999_321-193614-0_sp0.9 from training. Duration: 0.7444375 2023-10-04 00:37:51,383 WARNING [train.py:1204] (2/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_373-225403-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 00:37:54,110 WARNING [train.py:1204] (2/4) Exclude cut with ID _20622_210_3852_1_1531622427080_1093839_57-326027-0_sp1.1 from training. Duration: 0.97275 2023-10-04 00:38:00,243 WARNING [train.py:1204] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_605-257205-0_sp1.1 from training. Duration: 0.536375 2023-10-04 00:38:01,491 WARNING [train.py:1204] (2/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_442-320317-0_sp0.9 from training. Duration: 0.911125 2023-10-04 00:38:01,525 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_35361_210_4238_1_1533262810_7793476_675-87012-0_sp1.1 from training. Duration: 0.8090625 2023-10-04 00:38:01,565 WARNING [train.py:1204] (2/4) Exclude cut with ID _4184_210_2550_1_1526730192013_2219380_306-44736-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 00:38:04,861 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_124-113562-0 from training. Duration: 0.94 2023-10-04 00:38:06,367 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_2821_210_2715_1_1526108385_3763560_73-296673-0_sp0.9 from training. Duration: 0.92225 2023-10-04 00:38:06,382 WARNING [train.py:1204] (2/4) Exclude cut with ID _10301_210_5886_1_1529222275332_3910620_216-46959-0_sp1.1 from training. Duration: 0.92725 2023-10-04 00:38:07,784 WARNING [train.py:1204] (2/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_91-277684-0_sp0.9 from training. Duration: 0.82225 2023-10-04 00:38:07,871 WARNING [train.py:1204] (2/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_351-294650-0_sp0.9 from training. Duration: 0.7 2023-10-04 00:38:10,598 WARNING [train.py:1204] (2/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_209-205365-0 from training. Duration: 0.79 2023-10-04 00:38:10,667 WARNING [train.py:1204] (2/4) Exclude cut with ID _14603_210_4220_1_1530615688441_3097319_97-325053-0 from training. Duration: 0.92 2023-10-04 00:38:12,007 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_49348_210_7927_1_1533954500_3691300_109-276430-0_sp1.1 from training. Duration: 0.92725 2023-10-04 00:38:13,521 WARNING [train.py:1204] (2/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_912-47473-0 from training. Duration: 0.68 2023-10-04 00:38:14,738 WARNING [train.py:1204] (2/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_96-320736-0_sp0.9 from training. Duration: 0.9555625 2023-10-04 00:38:18,428 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.0.bypass.skip_rate, batch_count=1463453.3333333333, ans=0.035 2023-10-04 00:38:21,079 WARNING [train.py:1204] (2/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_1252-235481-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 00:38:21,162 WARNING [train.py:1204] (2/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_813-261554-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 00:38:22,535 WARNING [train.py:1204] (2/4) Exclude cut with ID _21632_210_2253_1_1531994001765_3744140_110-274654-0_sp0.9 from training. Duration: 0.911125 2023-10-04 00:38:25,200 WARNING [train.py:1204] (2/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_381-317791-0_sp1.1 from training. Duration: 0.663625 2023-10-04 00:38:25,219 WARNING [train.py:1204] (2/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_51-117576-0_sp1.1 from training. Duration: 0.363625 2023-10-04 00:38:25,241 WARNING [train.py:1204] (2/4) Exclude cut with ID _42321_210_7770_1_1533468631352_4366895_104-215915-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 00:38:28,510 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_40896_210_15710_1_1533433956_7100802_216-22824-0_sp1.1 from training. Duration: 0.92725 2023-10-04 00:38:28,511 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_14831_210_7927_1_1530700200_5583610_181-27467-0 from training. Duration: 0.91 2023-10-04 00:38:29,863 WARNING [train.py:1204] (2/4) Exclude cut with ID _30304_210_6319_1_1532476827261_7582060_1024-265645-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 00:38:29,866 WARNING [train.py:1204] (2/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_412-233904-0_sp1.1 from training. Duration: 0.936375 2023-10-04 00:38:29,884 WARNING [train.py:1204] (2/4) Exclude cut with ID _30191_210_1664_1_1533189915047_7196670_911-223461-0_sp1.1 from training. Duration: 0.92725 2023-10-04 00:38:29,902 WARNING [train.py:1204] (2/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_558-148266-0_sp1.1 from training. Duration: 0.963625 2023-10-04 00:38:30,732 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.feed_forward3.hidden_balancer.prob, batch_count=1463520.0, ans=0.125 2023-10-04 00:38:34,533 WARNING [train.py:1204] (2/4) Exclude cut with ID _58480_210_12392_1_1534811121111_3646160_212-164495-0_sp1.1 from training. Duration: 0.936375 2023-10-04 00:38:34,535 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_454-276429-0_sp0.9 from training. Duration: 0.9666875 2023-10-04 00:38:35,869 WARNING [train.py:1204] (2/4) Exclude cut with ID _24736_210_3279_1_1532332894218_2838700_73-197086-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 00:38:37,308 WARNING [train.py:1204] (2/4) Exclude cut with ID _5294_210_1747_1_1527249643928_3696179_529-242298-0_sp1.1 from training. Duration: 0.863625 2023-10-04 00:38:37,345 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_53555_210_5632_1_1534595220_7232297_345-171438-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 00:38:41,472 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_28211_210_6388_1_1532861506_4523224_22-25766-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 00:38:42,903 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7002_210_1794_1_1527939683_5157286_163-87552-0 from training. Duration: 0.86 2023-10-04 00:38:44,284 WARNING [train.py:1204] (2/4) Exclude cut with ID _56927_210_9969_1_1534848970840_3695229_409-225129-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 00:38:45,736 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_26192_210_7927_1_1532137269_1040115_21-219331-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 00:38:47,266 WARNING [train.py:1204] (2/4) Exclude cut with ID _15012_210_1968_1_1530856820429_5095350_662-210469-0 from training. Duration: 0.57 2023-10-04 00:38:50,735 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.conv_module2.balancer2.prob, batch_count=1463653.3333333333, ans=0.125 2023-10-04 00:38:52,288 INFO [train.py:1046] (2/4) Epoch 42, batch 1750, loss[loss=0.167, simple_loss=0.2432, pruned_loss=0.04545, over 23268.00 frames. ], tot_loss[loss=0.1553, simple_loss=0.2349, pruned_loss=0.03783, over 4704460.04 frames. ], batch size: 105, lr: 2.42e-03, grad_scale: 8.0 2023-10-04 00:38:55,194 WARNING [train.py:1204] (2/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_582-191016-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 00:38:56,612 WARNING [train.py:1204] (2/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_270-210032-0_sp1.1 from training. Duration: 0.963625 2023-10-04 00:38:56,647 WARNING [train.py:1204] (2/4) Exclude cut with ID _6789_210_1534_1_1527857937218_3118950_262-48853-0_sp0.9 from training. Duration: 0.62225 2023-10-04 00:38:58,046 WARNING [train.py:1204] (2/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_847-41334-0 from training. Duration: 0.44 2023-10-04 00:38:59,325 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_40896_210_15710_1_1533433956_7100802_573-46532-0_sp1.1 from training. Duration: 0.92725 2023-10-04 00:39:02,117 WARNING [train.py:1204] (2/4) Exclude cut with ID _21881_210_3525_1_1531825242757_3798380_247-102591-0_sp1.1 from training. Duration: 0.8909375 2023-10-04 00:39:02,139 WARNING [train.py:1204] (2/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_200-21395-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 00:39:04,201 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.nonlin_attention.balancer.prob, batch_count=1463653.3333333333, ans=0.125 2023-10-04 00:39:06,759 WARNING [train.py:1204] (2/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_711-15688-0 from training. Duration: 0.43 2023-10-04 00:39:08,210 WARNING [train.py:1204] (2/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_53-75025-0_sp1.1 from training. Duration: 0.963625 2023-10-04 00:39:09,697 WARNING [train.py:1204] (2/4) Exclude cut with ID _6194_210_1534_1_1527511826160_3941194_321-148641-0 from training. Duration: 0.64 2023-10-04 00:39:09,726 WARNING [train.py:1204] (2/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_337-294431-0_sp1.1 from training. Duration: 0.936375 2023-10-04 00:39:09,931 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.conv_module1.balancer1.min_positive, batch_count=1463720.0, ans=0.025 2023-10-04 00:39:11,049 WARNING [train.py:1204] (2/4) Exclude cut with ID _21632_210_2253_1_1531994001765_3744140_110-274654-0_sp1.1 from training. Duration: 0.7454375 2023-10-04 00:39:13,832 WARNING [train.py:1204] (2/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_135-174080-0_sp1.1 from training. Duration: 0.4818125 2023-10-04 00:39:15,475 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.balancer2.prob, batch_count=1463720.0, ans=0.125 2023-10-04 00:39:16,509 WARNING [train.py:1204] (2/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_21-174514-0 from training. Duration: 0.43 2023-10-04 00:39:17,982 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_38965_210_4852_1_1533174906_3484735_215-294589-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 00:39:18,005 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_548-294316-0 from training. Duration: 0.81 2023-10-04 00:39:18,743 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.3.encoder.layers.0.self_attn2.whiten, num_groups=1, num_channels=512, metric=14.61 vs. limit=22.5 2023-10-04 00:39:24,842 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.1.nonlin_attention.balancer.prob, batch_count=1463786.6666666667, ans=0.125 2023-10-04 00:39:27,583 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_103-84010-0_sp1.1 from training. Duration: 0.62725 2023-10-04 00:39:29,082 WARNING [train.py:1204] (2/4) Exclude cut with ID _18199_210_6109_1_1531475789864_4288790_295-198247-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 00:39:30,514 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_13757_210_3528_1_1530440682_4107239_103-313224-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 00:39:32,026 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_34886_210_10633_1_1533203882_1755661_67-294673-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 00:39:33,919 WARNING [train.py:1204] (2/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_350-101734-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 00:39:35,882 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_37357_210_17024_1_1533089504_2316121_204-153986-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 00:39:37,295 WARNING [train.py:1204] (2/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_673-115569-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 00:39:39,957 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_489-230703-0_sp0.9 from training. Duration: 0.9444375 2023-10-04 00:39:41,277 WARNING [train.py:1204] (2/4) Exclude cut with ID _5250_210_5219_1_1527249561806_8692110_575-102496-0_sp1.1 from training. Duration: 0.936375 2023-10-04 00:39:42,564 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.617e+02 1.890e+02 2.169e+02 2.400e+02 4.108e+02, threshold=4.338e+02, percent-clipped=0.0 2023-10-04 00:39:42,647 WARNING [train.py:1204] (2/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_669-93163-0 from training. Duration: 0.98 2023-10-04 00:39:43,947 WARNING [train.py:1204] (2/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_249-281349-0_sp0.9 from training. Duration: 0.9444375 2023-10-04 00:39:45,507 WARNING [train.py:1204] (2/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_106-30782-0 from training. Duration: 0.8 2023-10-04 00:39:46,811 WARNING [train.py:1204] (2/4) Exclude cut with ID _8772_210_1850_1_1528635595972_3664011_398-320049-0_sp1.1 from training. Duration: 0.8 2023-10-04 00:39:48,242 WARNING [train.py:1204] (2/4) Exclude cut with ID _44778_210_2054_1_1533798127203_7443090_516-151727-0_sp1.1 from training. Duration: 0.963625 2023-10-04 00:39:48,419 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.conv_module2.balancer2.min_positive, batch_count=1463853.3333333333, ans=0.05 2023-10-04 00:39:49,547 WARNING [train.py:1204] (2/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_325-62802-0_sp1.1 from training. Duration: 0.7909375 2023-10-04 00:39:53,015 WARNING [train.py:1204] (2/4) Exclude cut with ID _14368_210_4220_1_1530532714870_3595529_93-58002-0_sp0.9 from training. Duration: 0.6333125 2023-10-04 00:39:53,068 WARNING [train.py:1204] (2/4) Exclude cut with ID _4681_210_3525_1_1527251040321_1235713_123-24428-0_sp1.1 from training. Duration: 0.436375 2023-10-04 00:39:53,126 WARNING [train.py:1204] (2/4) Exclude cut with ID _5250_210_5219_1_1527249561806_8692110_1185-56473-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 00:39:53,260 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.0.conv_skip_rate, batch_count=1463920.0, ans=0.0 2023-10-04 00:39:54,525 WARNING [train.py:1204] (2/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_96-244299-0_sp1.1 from training. Duration: 0.8 2023-10-04 00:39:56,572 WARNING [train.py:1204] (2/4) Exclude cut with ID _59500_210_8254_1_1534809610680_3736009_104-72815-0_sp1.1 from training. Duration: 0.963625 2023-10-04 00:39:59,289 WARNING [train.py:1204] (2/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_468-283113-0_sp1.1 from training. Duration: 0.87275 2023-10-04 00:39:59,399 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6239_210_4211_1_1527765584_4306698_528-102186-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 00:40:00,632 WARNING [train.py:1204] (2/4) Exclude cut with ID _45297_210_2721_1_1533715351381_3264949_351-147136-0 from training. Duration: 0.87 2023-10-04 00:40:00,637 WARNING [train.py:1204] (2/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_520-51744-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 00:40:02,552 WARNING [train.py:1204] (2/4) Exclude cut with ID _5166_210_2084_1_1527073197160_4087993_310-114452-0_sp1.1 from training. Duration: 0.536375 2023-10-04 00:40:04,234 WARNING [train.py:1204] (2/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_450-165710-0_sp1.1 from training. Duration: 0.92725 2023-10-04 00:40:04,252 WARNING [train.py:1204] (2/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_280-120942-0_sp1.1 from training. Duration: 0.7 2023-10-04 00:40:04,275 WARNING [train.py:1204] (2/4) Exclude cut with ID _65585_210_12210_1_1535450451584_3764098_112-294301-0_sp0.9 from training. Duration: 0.97775 2023-10-04 00:40:04,323 WARNING [train.py:1204] (2/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_98-51676-0_sp0.9 from training. Duration: 0.811125 2023-10-04 00:40:05,895 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.attention_skip_rate, batch_count=1463986.6666666667, ans=0.0 2023-10-04 00:40:06,912 INFO [train.py:1046] (2/4) Epoch 42, batch 1800, loss[loss=0.1314, simple_loss=0.2146, pruned_loss=0.02411, over 24592.00 frames. ], tot_loss[loss=0.1544, simple_loss=0.2338, pruned_loss=0.03752, over 4693349.17 frames. ], batch size: 60, lr: 2.42e-03, grad_scale: 8.0 2023-10-04 00:40:07,018 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_33348_210_5632_1_1533086912_3324030_85-28583-0_sp1.1 from training. Duration: 0.7454375 2023-10-04 00:40:08,373 WARNING [train.py:1204] (2/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_336-208232-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 00:40:10,116 WARNING [train.py:1204] (2/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_339-104791-0_sp0.9 from training. Duration: 0.5444375 2023-10-04 00:40:12,854 WARNING [train.py:1204] (2/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_559-29840-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 00:40:14,346 WARNING [train.py:1204] (2/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_117-290916-0_sp0.9 from training. Duration: 0.5666875 2023-10-04 00:40:15,701 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_43802_210_2290_1_1533516203_8829504_185-112289-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 00:40:17,629 WARNING [train.py:1204] (2/4) Exclude cut with ID _59256_210_5986_1_1535076085997_3589120_155-298797-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 00:40:20,385 WARNING [train.py:1204] (2/4) Exclude cut with ID _44659_210_2721_1_1534036120061_6614860_246-206531-0_sp1.1 from training. Duration: 0.92725 2023-10-04 00:40:20,432 WARNING [train.py:1204] (2/4) Exclude cut with ID _23818_210_6228_1_1531917063884_3676920_469-151944-0_sp1.1 from training. Duration: 0.92725 2023-10-04 00:40:21,916 WARNING [train.py:1204] (2/4) Exclude cut with ID _13252_210_1925_1_1530671372047_3336909_88-276668-0_sp1.1 from training. Duration: 0.9 2023-10-04 00:40:25,154 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_15101_210_7927_1_1530786415_5033274_320-327527-0_sp1.1 from training. Duration: 0.87275 2023-10-04 00:40:25,165 WARNING [train.py:1204] (2/4) Exclude cut with ID _14998_210_5219_1_1530705582080_7474970_446-112117-0 from training. Duration: 0.9 2023-10-04 00:40:25,243 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_30280_210_1881_1_1532872779_3660139_400-254159-0_sp1.1 from training. Duration: 0.936375 2023-10-04 00:40:27,991 WARNING [train.py:1204] (2/4) Exclude cut with ID _40284_210_2084_1_1533513679814_3704279_158-345341-0_sp1.1 from training. Duration: 0.936375 2023-10-04 00:40:31,397 WARNING [train.py:1204] (2/4) Exclude cut with ID 3033-130750-0096-55598-0 from training. Duration: 0.83 2023-10-04 00:40:33,377 WARNING [train.py:1204] (2/4) Exclude cut with ID _9389_210_1664_1_1529217337301_5289269_379-128794-0 from training. Duration: 0.85 2023-10-04 00:40:33,393 WARNING [train.py:1204] (2/4) Exclude cut with ID _3675_210_4557_1_1526639934408_4182089_456-18305-0 from training. Duration: 0.76 2023-10-04 00:40:33,435 WARNING [train.py:1204] (2/4) Exclude cut with ID _29853_210_7497_1_1532505600851_6642110_571-249460-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 00:40:36,108 WARNING [train.py:1204] (2/4) Exclude cut with ID _4068_210_2629_1_1526697350395_1546259_230-325568-0_sp1.1 from training. Duration: 0.92725 2023-10-04 00:40:36,109 WARNING [train.py:1204] (2/4) Exclude cut with ID _34415_210_8750_1_1533085213375_3451116_213-267570-0_sp1.1 from training. Duration: 0.963625 2023-10-04 00:40:36,189 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6767_210_3273_1_1527855783_1718932_142-127132-0_sp1.1 from training. Duration: 0.863625 2023-10-04 00:40:36,360 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.bypass_mid.scale_min, batch_count=1464120.0, ans=0.2 2023-10-04 00:40:42,882 WARNING [train.py:1204] (2/4) Exclude cut with ID IC0653W0001-322925 from training. Duration: 0.896 2023-10-04 00:40:42,982 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_4528_210_3273_1_1527076654_3276777_209-219804-0_sp1.1 from training. Duration: 0.62725 2023-10-04 00:40:44,467 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=1464120.0, ans=0.1 2023-10-04 00:40:44,490 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.conv_skip_rate, batch_count=1464120.0, ans=0.0 2023-10-04 00:40:45,626 WARNING [train.py:1204] (2/4) Exclude cut with ID _55844_210_2346_1_1534471212569_7625707_187-204971-0_sp1.1 from training. Duration: 0.936375 2023-10-04 00:40:48,739 WARNING [train.py:1204] (2/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_369-325184-0 from training. Duration: 0.99 2023-10-04 00:40:48,970 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.conv_module1.balancer2.prob, batch_count=1464120.0, ans=0.125 2023-10-04 00:40:50,046 WARNING [train.py:1204] (2/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_791-45149-0 from training. Duration: 0.86 2023-10-04 00:40:50,107 WARNING [train.py:1204] (2/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_317-211107-0_sp1.1 from training. Duration: 0.836375 2023-10-04 00:40:51,530 WARNING [train.py:1204] (2/4) Exclude cut with ID _30800_210_22382_1_1532499347904_4515959_488-330913-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 00:40:53,082 WARNING [train.py:1204] (2/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_506-42614-0_sp1.1 from training. Duration: 0.7181875 2023-10-04 00:40:55,973 WARNING [train.py:1204] (2/4) Exclude cut with ID _8696_210_1876_1_1528545632440_3345058_209-54069-0 from training. Duration: 0.67 2023-10-04 00:41:04,188 WARNING [train.py:1204] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_215-202973-0_sp1.1 from training. Duration: 0.8 2023-10-04 00:41:05,497 WARNING [train.py:1204] (2/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_321-215742-0 from training. Duration: 0.5 2023-10-04 00:41:06,874 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_40896_210_15710_1_1533433956_7100802_428-32114-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 00:41:06,884 WARNING [train.py:1204] (2/4) Exclude cut with ID _27013_210_1389_1_1532437173240_3252649_467-263151-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 00:41:08,180 WARNING [train.py:1204] (2/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_734-129761-0_sp0.9 from training. Duration: 0.911125 2023-10-04 00:41:08,212 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_521-214276-0 from training. Duration: 0.44 2023-10-04 00:41:08,476 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=1464253.3333333333, ans=0.1 2023-10-04 00:41:08,479 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.ff2_skip_rate, batch_count=1464253.3333333333, ans=0.0 2023-10-04 00:41:11,024 WARNING [train.py:1204] (2/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_80-147301-0_sp1.1 from training. Duration: 0.763625 2023-10-04 00:41:11,026 WARNING [train.py:1204] (2/4) Exclude cut with ID _9679_210_7497_1_1529129212906_4363929_56-307796-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 00:41:12,551 WARNING [train.py:1204] (2/4) Exclude cut with ID _14832_210_8750_1_1530691351539_3171348_1-54315-0 from training. Duration: 0.87 2023-10-04 00:41:12,551 WARNING [train.py:1204] (2/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_558-155750-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 00:41:15,240 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_142-309370-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 00:41:15,291 WARNING [train.py:1204] (2/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_18-46756-0_sp0.9 from training. Duration: 0.87775 2023-10-04 00:41:15,300 WARNING [train.py:1204] (2/4) Exclude cut with ID _61148_210_6897_1_1535094022726_4217610_279-212159-0_sp1.1 from training. Duration: 0.936375 2023-10-04 00:41:17,982 WARNING [train.py:1204] (2/4) Exclude cut with ID _21987_210_5854_1_1531821500482_3637038_164-2641-0_sp1.1 from training. Duration: 0.936375 2023-10-04 00:41:19,715 WARNING [train.py:1204] (2/4) Exclude cut with ID _8064_210_5399_1_1528372299291_4282973_395-340852-0_sp1.1 from training. Duration: 0.8454375 2023-10-04 00:41:20,054 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.bypass.scale_min, batch_count=1464320.0, ans=0.2 2023-10-04 00:41:21,125 INFO [train.py:1046] (2/4) Epoch 42, batch 1850, loss[loss=0.1496, simple_loss=0.2278, pruned_loss=0.03565, over 23765.00 frames. ], tot_loss[loss=0.155, simple_loss=0.2347, pruned_loss=0.03762, over 4706368.17 frames. ], batch size: 179, lr: 2.42e-03, grad_scale: 8.0 2023-10-04 00:41:21,169 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1077-184261-0_sp1.1 from training. Duration: 0.963625 2023-10-04 00:41:21,185 WARNING [train.py:1204] (2/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_1176-204992-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 00:41:23,955 WARNING [train.py:1204] (2/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_563-347832-0_sp1.1 from training. Duration: 0.8090625 2023-10-04 00:41:24,007 WARNING [train.py:1204] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_380-204756-0_sp0.9 from training. Duration: 0.9555625 2023-10-04 00:41:25,614 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.nonlin_attention.balancer.max_positive, batch_count=1464320.0, ans=0.95 2023-10-04 00:41:26,914 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.1.conv_module2.balancer1.prob, batch_count=1464320.0, ans=0.125 2023-10-04 00:41:30,460 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.2.encoder.layers.1.feed_forward2.out_whiten, num_groups=1, num_channels=384, metric=3.90 vs. limit=15.0 2023-10-04 00:41:31,026 WARNING [train.py:1204] (2/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_212-10501-0_sp1.1 from training. Duration: 0.8545625 2023-10-04 00:41:31,059 WARNING [train.py:1204] (2/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_445-117398-0 from training. Duration: 0.89 2023-10-04 00:41:38,055 WARNING [train.py:1204] (2/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_29-280622-0 from training. Duration: 0.82 2023-10-04 00:41:40,879 WARNING [train.py:1204] (2/4) Exclude cut with ID _26664_210_7770_1_1532258943280_3418270_187-80815-0 from training. Duration: 0.93 2023-10-04 00:41:43,818 WARNING [train.py:1204] (2/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_414-349640-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 00:41:43,836 WARNING [train.py:1204] (2/4) Exclude cut with ID _45021_210_9994_1_1533619751094_3330288_96-239010-0 from training. Duration: 0.91 2023-10-04 00:41:43,838 WARNING [train.py:1204] (2/4) Exclude cut with ID _5189_210_4557_1_1527248319428_3769229_199-53526-0_sp0.9 from training. Duration: 0.4666875 2023-10-04 00:41:53,489 WARNING [train.py:1204] (2/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_63-101996-0_sp1.1 from training. Duration: 0.8 2023-10-04 00:41:55,646 WARNING [train.py:1204] (2/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_199-86432-0 from training. Duration: 0.98 2023-10-04 00:41:58,392 WARNING [train.py:1204] (2/4) Exclude cut with ID _36399_210_16049_1_1532998771377_3698333_207-259013-0_sp1.1 from training. Duration: 0.92725 2023-10-04 00:41:58,428 WARNING [train.py:1204] (2/4) Exclude cut with ID _62574_210_2655_1_1535180407058_3933249_567-111756-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 00:42:02,377 WARNING [train.py:1204] (2/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_436-318113-0 from training. Duration: 0.75 2023-10-04 00:42:03,840 WARNING [train.py:1204] (2/4) Exclude cut with ID _53379_210_20034_1_1534222027363_3854654_43-305214-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 00:42:03,854 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_15126_210_6753_1_1530778677_2591185_113-157651-0_sp1.1 from training. Duration: 0.6181875 2023-10-04 00:42:05,127 WARNING [train.py:1204] (2/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_146-218763-0_sp1.1 from training. Duration: 0.7818125 2023-10-04 00:42:07,269 WARNING [train.py:1204] (2/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_714-310538-0_sp0.9 from training. Duration: 0.9555625 2023-10-04 00:42:07,406 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.conv_skip_rate, batch_count=1464520.0, ans=0.0 2023-10-04 00:42:11,152 WARNING [train.py:1204] (2/4) Exclude cut with ID _24271_210_12658_1_1531987270022_7190519_709-16721-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 00:42:12,389 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.458e+02 1.933e+02 2.158e+02 2.386e+02 3.653e+02, threshold=4.316e+02, percent-clipped=0.0 2023-10-04 00:42:13,846 WARNING [train.py:1204] (2/4) Exclude cut with ID _5190_210_4557_1_1527244368799_373439_28-197030-0_sp0.9 from training. Duration: 0.8 2023-10-04 00:42:13,883 WARNING [train.py:1204] (2/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_267-197536-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 00:42:15,132 WARNING [train.py:1204] (2/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_658-282610-0_sp1.1 from training. Duration: 0.4818125 2023-10-04 00:42:15,148 WARNING [train.py:1204] (2/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_257-67050-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 00:42:16,560 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_14906_210_7927_1_1530754190_9408633_961-53402-0_sp1.1 from training. Duration: 0.936375 2023-10-04 00:42:17,997 WARNING [train.py:1204] (2/4) Exclude cut with ID _60113_210_16271_1_1535164212481_4386079_490-280290-0_sp1.1 from training. Duration: 0.8909375 2023-10-04 00:42:20,758 WARNING [train.py:1204] (2/4) Exclude cut with ID _14100_210_8341_1_1530529320613_3678069_258-224599-0 from training. Duration: 0.77 2023-10-04 00:42:22,111 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6961_210_2087_1_1527937168_6815011_565-326898-0_sp1.1 from training. Duration: 0.936375 2023-10-04 00:42:25,383 WARNING [train.py:1204] (2/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_239-316802-0_sp1.1 from training. Duration: 0.62725 2023-10-04 00:42:26,780 WARNING [train.py:1204] (2/4) Exclude cut with ID _55857_210_2346_1_1534644007831_6898209_745-98391-0_sp1.1 from training. Duration: 0.6909375 2023-10-04 00:42:26,783 WARNING [train.py:1204] (2/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_154-135696-0 from training. Duration: 0.8 2023-10-04 00:42:26,785 WARNING [train.py:1204] (2/4) Exclude cut with ID _14368_210_4220_1_1530532714870_3595529_93-58002-0 from training. Duration: 0.57 2023-10-04 00:42:28,233 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9025W0005-507335 from training. Duration: 0.896 2023-10-04 00:42:29,600 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9029W0034-509435 from training. Duration: 0.64 2023-10-04 00:42:30,988 WARNING [train.py:1204] (2/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_76-203867-0_sp1.1 from training. Duration: 0.6545625 2023-10-04 00:42:30,997 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_46639_210_8341_1_1533726088_4495500_66-346325-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 00:42:31,010 WARNING [train.py:1204] (2/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_145-137790-0_sp0.9 from training. Duration: 0.92225 2023-10-04 00:42:32,359 WARNING [train.py:1204] (2/4) Exclude cut with ID _36522_210_8887_1_1533177030611_1772478_7-327299-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 00:42:32,433 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9042W0009-515615 from training. Duration: 0.896 2023-10-04 00:42:32,442 WARNING [train.py:1204] (2/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_39-248870-0_sp0.9 from training. Duration: 0.7666875 2023-10-04 00:42:32,471 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_35441_210_16554_1_1533347572_4092680_362-242912-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 00:42:33,883 WARNING [train.py:1204] (2/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_216-343007-0_sp1.1 from training. Duration: 0.6 2023-10-04 00:42:35,115 INFO [train.py:1046] (2/4) Epoch 42, batch 1900, loss[loss=0.1762, simple_loss=0.2588, pruned_loss=0.04683, over 24002.00 frames. ], tot_loss[loss=0.1559, simple_loss=0.2362, pruned_loss=0.03775, over 4712175.39 frames. ], batch size: 86, lr: 2.42e-03, grad_scale: 8.0 2023-10-04 00:42:36,443 WARNING [train.py:1204] (2/4) Exclude cut with ID _3867_210_1883_1_1526641200575_4475830_272-215563-0_sp0.9 from training. Duration: 0.6333125 2023-10-04 00:42:36,528 WARNING [train.py:1204] (2/4) Exclude cut with ID _21783_210_11357_1_1531742458730_3762679_553-175603-0_sp1.1 from training. Duration: 0.92725 2023-10-04 00:42:36,537 WARNING [train.py:1204] (2/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_963-187109-0 from training. Duration: 0.99 2023-10-04 00:42:36,740 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.bypass_mid.scale_min, batch_count=1464653.3333333333, ans=0.2 2023-10-04 00:42:37,980 WARNING [train.py:1204] (2/4) Exclude cut with ID _44973_210_1511_1_1533794381163_3634770_495-211466-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 00:42:37,990 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9068W0002-529085 from training. Duration: 0.896 2023-10-04 00:42:39,796 WARNING [train.py:1204] (2/4) Exclude cut with ID _5294_210_1747_1_1527249643928_3696179_180-100713-0_sp1.1 from training. Duration: 0.736375 2023-10-04 00:42:39,878 WARNING [train.py:1204] (2/4) Exclude cut with ID _35799_210_5854_1_1533263487887_3529071_149-49689-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 00:42:45,966 WARNING [train.py:1204] (2/4) Exclude cut with ID _67725_210_27767_1_1535608756671_5105359_36-220794-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 00:42:48,726 WARNING [train.py:1204] (2/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_338-96520-0_sp1.1 from training. Duration: 0.8545625 2023-10-04 00:42:48,804 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9104W0004-547160 from training. Duration: 0.896 2023-10-04 00:42:48,883 WARNING [train.py:1204] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_330-327861-0 from training. Duration: 0.67 2023-10-04 00:42:50,174 WARNING [train.py:1204] (2/4) Exclude cut with ID _14087_210_2253_1_1530529224512_3014029_156-257607-0_sp0.9 from training. Duration: 0.92225 2023-10-04 00:42:51,549 WARNING [train.py:1204] (2/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_476-322050-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 00:42:51,558 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9118W0017-553385 from training. Duration: 0.896 2023-10-04 00:42:51,599 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9120W0028-554435 from training. Duration: 0.896 2023-10-04 00:42:56,113 WARNING [train.py:1204] (2/4) Exclude cut with ID _56524_210_9642_1_1534572011779_3619170_145-310842-0 from training. Duration: 0.99 2023-10-04 00:42:56,226 WARNING [train.py:1204] (2/4) Exclude cut with ID _51631_210_8254_1_1534129228420_4084794_270-222508-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 00:42:58,976 WARNING [train.py:1204] (2/4) Exclude cut with ID _14268_210_9206_1_1530622771285_4714099_331-105351-0 from training. Duration: 0.65 2023-10-04 00:43:00,536 WARNING [train.py:1204] (2/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_40-129756-0 from training. Duration: 0.81 2023-10-04 00:43:00,785 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.ff2_skip_rate, batch_count=1464720.0, ans=0.0 2023-10-04 00:43:04,825 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.nonlin_attention.balancer.prob, batch_count=1464786.6666666667, ans=0.125 2023-10-04 00:43:13,801 WARNING [train.py:1204] (2/4) Exclude cut with ID _3541_210_2477_1_1526475408709_3263973_231-104736-0 from training. Duration: 0.78 2023-10-04 00:43:17,145 WARNING [train.py:1204] (2/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_214-291369-0 from training. Duration: 0.63 2023-10-04 00:43:17,151 WARNING [train.py:1204] (2/4) Exclude cut with ID _62574_210_2655_1_1535180407058_3933249_369-84347-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 00:43:17,205 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9210W0111-599240 from training. Duration: 0.896 2023-10-04 00:43:17,209 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_92-246270-0 from training. Duration: 0.85 2023-10-04 00:43:17,242 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_37152_210_9697_1_1533106573_3844188_29-239070-0 from training. Duration: 0.98 2023-10-04 00:43:18,671 WARNING [train.py:1204] (2/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_239-22076-0 from training. Duration: 0.85 2023-10-04 00:43:18,673 WARNING [train.py:1204] (2/4) Exclude cut with ID _65962_210_29927_1_1535436026519_4174930_368-44639-0_sp1.1 from training. Duration: 0.97275 2023-10-04 00:43:21,621 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.conv_module2.balancer1.prob, batch_count=1464853.3333333333, ans=0.125 2023-10-04 00:43:22,853 WARNING [train.py:1204] (2/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_152-212533-0 from training. Duration: 0.97 2023-10-04 00:43:27,486 WARNING [train.py:1204] (2/4) Exclude cut with ID _14603_210_4220_1_1530615688441_3097319_90-31383-0_sp1.1 from training. Duration: 0.7909375 2023-10-04 00:43:27,785 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.attention_skip_rate, batch_count=1464853.3333333333, ans=0.0 2023-10-04 00:43:28,947 WARNING [train.py:1204] (2/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_196-152868-0_sp1.1 from training. Duration: 0.92725 2023-10-04 00:43:28,949 WARNING [train.py:1204] (2/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_140-181176-0 from training. Duration: 0.92 2023-10-04 00:43:30,371 WARNING [train.py:1204] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_168-32906-0_sp0.9 from training. Duration: 0.7666875 2023-10-04 00:43:34,414 WARNING [train.py:1204] (2/4) Exclude cut with ID _14922_210_6228_1_1530707435273_3693310_13-165512-0 from training. Duration: 0.82 2023-10-04 00:43:34,687 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.bypass_mid.scale_min, batch_count=1464920.0, ans=0.2 2023-10-04 00:43:35,640 WARNING [train.py:1204] (2/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_381-317791-0_sp0.9 from training. Duration: 0.811125 2023-10-04 00:43:37,457 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.conv_skip_rate, batch_count=1464920.0, ans=0.0 2023-10-04 00:43:40,083 WARNING [train.py:1204] (2/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_63-267585-0_sp1.1 from training. Duration: 0.8181875 2023-10-04 00:43:40,085 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_326-303945-0_sp1.1 from training. Duration: 0.9 2023-10-04 00:43:40,098 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_194-143381-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 00:43:41,545 WARNING [train.py:1204] (2/4) Exclude cut with ID _39290_210_4220_1_1533862778760_3075118_194-16667-0_sp1.1 from training. Duration: 0.963625 2023-10-04 00:43:43,287 WARNING [train.py:1204] (2/4) Exclude cut with ID _15012_210_1968_1_1530856820429_5095350_662-210469-0_sp0.9 from training. Duration: 0.6333125 2023-10-04 00:43:43,319 WARNING [train.py:1204] (2/4) Exclude cut with ID _4697_210_2069_1_1526815801953_3823098_68-219807-0_sp1.1 from training. Duration: 0.42725 2023-10-04 00:43:44,253 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.0.layers.1.nonlin_attention.whiten1, num_groups=1, num_channels=144, metric=5.35 vs. limit=10.0 2023-10-04 00:43:45,087 WARNING [train.py:1204] (2/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_321-22821-0_sp0.9 from training. Duration: 0.988875 2023-10-04 00:43:47,156 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_14056_210_4163_1_1530604483_7572827_1037-45317-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 00:43:47,158 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_403-39619-0_sp1.1 from training. Duration: 0.763625 2023-10-04 00:43:49,768 INFO [train.py:1046] (2/4) Epoch 42, batch 1950, loss[loss=0.1749, simple_loss=0.2486, pruned_loss=0.05063, over 22740.00 frames. ], tot_loss[loss=0.1564, simple_loss=0.2369, pruned_loss=0.03797, over 4714881.08 frames. ], batch size: 322, lr: 2.42e-03, grad_scale: 8.0 2023-10-04 00:43:49,887 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_20120_210_6753_1_1531643035_5482859_510-96996-0_sp0.9 from training. Duration: 0.9666875 2023-10-04 00:43:49,890 WARNING [train.py:1204] (2/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_225-180155-0_sp1.1 from training. Duration: 0.92725 2023-10-04 00:43:49,931 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_278-272612-0_sp0.9 from training. Duration: 0.811125 2023-10-04 00:43:52,532 WARNING [train.py:1204] (2/4) Exclude cut with ID _53713_210_24116_1_1534314487762_7416751_691-70244-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 00:43:55,861 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6788_210_1388_1_1527898698_4562021_91-197981-0_sp1.1 from training. Duration: 0.7545625 2023-10-04 00:43:56,632 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.3.encoder.layers.3.conv_module2.whiten, num_groups=1, num_channels=512, metric=4.36 vs. limit=15.0 2023-10-04 00:43:57,336 WARNING [train.py:1204] (2/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_60-18123-0_sp0.9 from training. Duration: 0.97775 2023-10-04 00:43:57,382 WARNING [train.py:1204] (2/4) Exclude cut with ID _35799_210_5854_1_1533263487887_3529071_5-214617-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 00:43:57,387 WARNING [train.py:1204] (2/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_890-5117-0_sp1.1 from training. Duration: 0.7090625 2023-10-04 00:43:58,796 WARNING [train.py:1204] (2/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_660-211088-0 from training. Duration: 0.62 2023-10-04 00:44:00,051 WARNING [train.py:1204] (2/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_314-126819-0_sp0.9 from training. Duration: 0.5555625 2023-10-04 00:44:00,094 WARNING [train.py:1204] (2/4) Exclude cut with ID _21632_210_2253_1_1531994001765_3744140_362-66225-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 00:44:02,642 WARNING [train.py:1204] (2/4) Exclude cut with ID _61118_210_9527_1_1534897668884_3752630_173-258555-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 00:44:05,505 WARNING [train.py:1204] (2/4) Exclude cut with ID _7719_210_3525_1_1528456153163_4296960_88-250259-0_sp0.9 from training. Duration: 0.9333125 2023-10-04 00:44:05,550 WARNING [train.py:1204] (2/4) Exclude cut with ID _15140_210_9288_1_1530791388450_4880149_539-153708-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 00:44:06,808 WARNING [train.py:1204] (2/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_253-42544-0_sp1.1 from training. Duration: 0.97275 2023-10-04 00:44:06,978 WARNING [train.py:1204] (2/4) Exclude cut with ID _33640_210_2655_1_1533117605479_4855469_479-296877-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 00:44:09,790 WARNING [train.py:1204] (2/4) Exclude cut with ID 3033-130750-0096-55598-0_sp1.1 from training. Duration: 0.7545625 2023-10-04 00:44:09,801 WARNING [train.py:1204] (2/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_191-41023-0_sp1.1 from training. Duration: 0.6454375 2023-10-04 00:44:09,816 WARNING [train.py:1204] (2/4) Exclude cut with ID _8365_210_2064_1_1528592311070_4677690_431-294102-0_sp1.1 from training. Duration: 0.8090625 2023-10-04 00:44:09,847 WARNING [train.py:1204] (2/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_893-296030-0_sp1.1 from training. Duration: 0.97275 2023-10-04 00:44:10,085 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.feed_forward1.out_proj.dropout_p, batch_count=1465053.3333333333, ans=0.1 2023-10-04 00:44:10,102 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.conv_skip_rate, batch_count=1465053.3333333333, ans=0.0 2023-10-04 00:44:13,065 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.5.encoder.layers.0.feed_forward3.out_whiten, num_groups=1, num_channels=256, metric=8.41 vs. limit=15.0 2023-10-04 00:44:14,604 WARNING [train.py:1204] (2/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_401-343820-0_sp1.1 from training. Duration: 0.97275 2023-10-04 00:44:18,097 WARNING [train.py:1204] (2/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_127-566-0_sp1.1 from training. Duration: 0.763625 2023-10-04 00:44:18,099 WARNING [train.py:1204] (2/4) Exclude cut with ID _14100_210_8341_1_1530529320613_3678069_126-272847-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 00:44:18,112 WARNING [train.py:1204] (2/4) Exclude cut with ID _15133_210_6228_1_1530794512074_3080379_29-264458-0_sp1.1 from training. Duration: 0.52725 2023-10-04 00:44:18,113 WARNING [train.py:1204] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_127-119109-0 from training. Duration: 0.72 2023-10-04 00:44:19,578 WARNING [train.py:1204] (2/4) Exclude cut with ID _6537_210_1883_1_1527987559710_3597129_382-133062-0_sp1.1 from training. Duration: 0.6181875 2023-10-04 00:44:19,590 WARNING [train.py:1204] (2/4) Exclude cut with ID _29529_210_7234_1_1532393961455_3977460_391-219682-0_sp1.1 from training. Duration: 0.936375 2023-10-04 00:44:19,638 WARNING [train.py:1204] (2/4) Exclude cut with ID _37537_210_2253_1_1533171758568_3421949_96-137189-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 00:44:22,584 WARNING [train.py:1204] (2/4) Exclude cut with ID _58414_210_8254_1_1535421609162_7151182_15-276301-0_sp1.1 from training. Duration: 0.97275 2023-10-04 00:44:25,771 WARNING [train.py:1204] (2/4) Exclude cut with ID _59502_210_12210_1_1535107024698_1835139_110-289074-0_sp1.1 from training. Duration: 0.92725 2023-10-04 00:44:28,526 WARNING [train.py:1204] (2/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_367-61330-0_sp1.1 from training. Duration: 0.6818125 2023-10-04 00:44:31,211 WARNING [train.py:1204] (2/4) Exclude cut with ID _45297_210_2721_1_1533715351381_3264949_353-9541-0_sp1.1 from training. Duration: 0.8818125 2023-10-04 00:44:31,254 WARNING [train.py:1204] (2/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_85-138257-0_sp0.9 from training. Duration: 0.788875 2023-10-04 00:44:31,280 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_3787_210_2040_1_1526477544_5478586_603-120324-0 from training. Duration: 0.81 2023-10-04 00:44:32,532 WARNING [train.py:1204] (2/4) Exclude cut with ID _42863_210_9070_1_1533902398788_3696920_411-204131-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 00:44:36,720 WARNING [train.py:1204] (2/4) Exclude cut with ID _30196_210_1664_1_1533708496522_7074140_822-289909-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 00:44:37,320 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.3.encoder.layers.2.whiten, num_groups=1, num_channels=512, metric=7.74 vs. limit=12.0 2023-10-04 00:44:38,065 WARNING [train.py:1204] (2/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_32-335103-0_sp0.9 from training. Duration: 0.911125 2023-10-04 00:44:39,322 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.656e+02 1.957e+02 2.205e+02 2.513e+02 3.289e+02, threshold=4.410e+02, percent-clipped=0.0 2023-10-04 00:44:39,390 WARNING [train.py:1204] (2/4) Exclude cut with ID _6493_210_1747_1_1528459210424_4012470_513-140967-0_sp0.9 from training. Duration: 0.87775 2023-10-04 00:44:47,364 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_47896_210_20508_1_1534492518_5958868_439-257766-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 00:44:49,151 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_1294-63152-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 00:44:50,728 WARNING [train.py:1204] (2/4) Exclude cut with ID _59170_210_6721_1_1534983633960_4203335_319-166512-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 00:44:52,218 WARNING [train.py:1204] (2/4) Exclude cut with ID _3541_210_2477_1_1526475408709_3263973_146-3470-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 00:44:53,589 WARNING [train.py:1204] (2/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_678-159307-0_sp1.1 from training. Duration: 0.77275 2023-10-04 00:44:55,537 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_343-48822-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 00:44:56,882 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_4692_210_3528_1_1526899755_4323254_107-44401-0 from training. Duration: 0.86 2023-10-04 00:44:56,894 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6291_210_3273_1_1527683649_1078498_74-292608-0_sp1.1 from training. Duration: 0.6545625 2023-10-04 00:44:57,646 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.2.encoder.layers.2.self_attn_weights.whiten_keys, num_groups=4, num_channels=128, metric=3.11 vs. limit=6.0 2023-10-04 00:44:58,288 WARNING [train.py:1204] (2/4) Exclude cut with ID _55644_210_11856_1_1534593599582_4103719_293-248694-0_sp1.1 from training. Duration: 0.97275 2023-10-04 00:44:59,623 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_3172_210_2087_1_1526292929_3987865_197-97762-0 from training. Duration: 0.75 2023-10-04 00:45:01,086 WARNING [train.py:1204] (2/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_264-348896-0_sp0.9 from training. Duration: 0.9444375 2023-10-04 00:45:03,152 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.4.encoder.layers.0.self_attn2.whiten, num_groups=1, num_channels=384, metric=15.05 vs. limit=22.5 2023-10-04 00:45:03,800 INFO [train.py:1046] (2/4) Epoch 42, batch 2000, loss[loss=0.1495, simple_loss=0.2371, pruned_loss=0.03096, over 24639.00 frames. ], tot_loss[loss=0.1576, simple_loss=0.2382, pruned_loss=0.03854, over 4712884.12 frames. ], batch size: 68, lr: 2.42e-03, grad_scale: 16.0 2023-10-04 00:45:03,953 WARNING [train.py:1204] (2/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_209-205365-0_sp0.9 from training. Duration: 0.87775 2023-10-04 00:45:05,315 WARNING [train.py:1204] (2/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_222-198127-0_sp0.9 from training. Duration: 0.9333125 2023-10-04 00:45:05,364 WARNING [train.py:1204] (2/4) Exclude cut with ID _57572_210_5517_1_1534921219624_3809484_52-43530-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 00:45:06,895 WARNING [train.py:1204] (2/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_96-320736-0_sp1.1 from training. Duration: 0.7818125 2023-10-04 00:45:09,645 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_13091_210_7925_1_1530609447_8987354_1159-225508-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 00:45:12,798 WARNING [train.py:1204] (2/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_237-44332-0 from training. Duration: 0.94 2023-10-04 00:45:12,860 WARNING [train.py:1204] (2/4) Exclude cut with ID _7970_210_1883_1_1528455616296_3472230_25-178407-0_sp0.9 from training. Duration: 0.788875 2023-10-04 00:45:17,461 WARNING [train.py:1204] (2/4) Exclude cut with ID _29003_210_19596_1_1532507391720_3894098_357-257062-0_sp1.1 from training. Duration: 0.936375 2023-10-04 00:45:20,289 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_809-17156-0 from training. Duration: 0.73 2023-10-04 00:45:20,395 WARNING [train.py:1204] (2/4) Exclude cut with ID _7160_210_1876_1_1527940923231_3703799_302-150728-0_sp1.1 from training. Duration: 0.7090625 2023-10-04 00:45:20,406 WARNING [train.py:1204] (2/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_444-28041-0_sp0.9 from training. Duration: 0.9444375 2023-10-04 00:45:24,572 WARNING [train.py:1204] (2/4) Exclude cut with ID _52892_210_6721_1_1534378941259_4124129_278-251666-0_sp1.1 from training. Duration: 0.92725 2023-10-04 00:45:25,874 WARNING [train.py:1204] (2/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_115-326778-0 from training. Duration: 0.93 2023-10-04 00:45:25,986 WARNING [train.py:1204] (2/4) Exclude cut with ID _41391_210_7994_1_1533603771872_3638201_31-188961-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 00:45:27,272 WARNING [train.py:1204] (2/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_1073-6264-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 00:45:27,293 WARNING [train.py:1204] (2/4) Exclude cut with ID _66868_210_9527_1_1535509287954_4561140_435-259515-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 00:45:28,743 WARNING [train.py:1204] (2/4) Exclude cut with ID _14886_210_4220_1_1530702020355_3516190_159-73748-0 from training. Duration: 0.83 2023-10-04 00:45:28,777 WARNING [train.py:1204] (2/4) Exclude cut with ID _15150_210_8341_1_1530752411803_7256384_469-239509-0_sp1.1 from training. Duration: 0.6181875 2023-10-04 00:45:32,125 WARNING [train.py:1204] (2/4) Exclude cut with ID _8064_210_5399_1_1528372299291_4282973_395-340852-0 from training. Duration: 0.93 2023-10-04 00:45:32,128 WARNING [train.py:1204] (2/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_138-187031-0_sp1.1 from training. Duration: 0.9 2023-10-04 00:45:33,788 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.ff3_skip_rate, batch_count=1465453.3333333333, ans=0.0 2023-10-04 00:45:34,823 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_13757_210_3528_1_1530440682_4107239_48-106347-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 00:45:35,036 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.conv_module2.balancer2.prob, batch_count=1465453.3333333333, ans=0.125 2023-10-04 00:45:36,114 WARNING [train.py:1204] (2/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_164-224052-0_sp1.1 from training. Duration: 0.636375 2023-10-04 00:45:36,118 WARNING [train.py:1204] (2/4) Exclude cut with ID _59288_210_5854_1_1535077757774_3412246_408-322648-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 00:45:37,467 WARNING [train.py:1204] (2/4) Exclude cut with ID _30191_210_1664_1_1533189915047_7196670_827-36359-0_sp1.1 from training. Duration: 0.963625 2023-10-04 00:45:37,579 WARNING [train.py:1204] (2/4) Exclude cut with ID _4680_210_3525_1_1527249757953_893386_75-78979-0_sp1.1 from training. Duration: 0.82725 2023-10-04 00:45:37,639 WARNING [train.py:1204] (2/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_383-165234-0 from training. Duration: 0.94 2023-10-04 00:45:40,499 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.0.conv_module2.balancer1.max_abs, batch_count=1465453.3333333333, ans=10.0 2023-10-04 00:45:41,697 WARNING [train.py:1204] (2/4) Exclude cut with ID _19896_210_1883_1_1531709022662_6649569_94-229927-0 from training. Duration: 0.97 2023-10-04 00:45:41,703 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6788_210_1388_1_1527898698_4562021_180-75288-0_sp1.1 from training. Duration: 0.9 2023-10-04 00:45:41,715 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_26433_210_12108_1_1532157158_4056480_54-321349-0_sp1.1 from training. Duration: 0.97275 2023-10-04 00:45:48,110 WARNING [train.py:1204] (2/4) Exclude cut with ID _56108_210_16380_1_1534505446159_3887959_265-161493-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 00:45:48,867 WARNING [train.py:1204] (2/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_395-111602-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 00:45:48,873 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_154-316450-0_sp0.9 from training. Duration: 0.8444375 2023-10-04 00:45:50,095 WARNING [train.py:1204] (2/4) Exclude cut with ID _43552_210_8913_1_1533726031059_3523429_381-116041-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 00:45:51,538 WARNING [train.py:1204] (2/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_65-104124-0_sp1.1 from training. Duration: 0.963625 2023-10-04 00:45:51,592 WARNING [train.py:1204] (2/4) Exclude cut with ID _6536_210_1883_1_1527850783339_3319069_169-23811-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 00:45:52,917 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_289-180904-0_sp0.9 from training. Duration: 0.8444375 2023-10-04 00:45:52,928 WARNING [train.py:1204] (2/4) Exclude cut with ID _56406_210_9288_1_1534899618664_3496149_226-242400-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 00:45:54,355 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_24899_210_16554_1_1532046422_3827462_276-216330-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 00:45:57,012 WARNING [train.py:1204] (2/4) Exclude cut with ID _29345_210_4381_1_1532689168263_3860169_198-174328-0_sp1.1 from training. Duration: 0.82725 2023-10-04 00:45:57,081 WARNING [train.py:1204] (2/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_338-96520-0 from training. Duration: 0.94 2023-10-04 00:46:01,887 WARNING [train.py:1204] (2/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_7-111954-0_sp1.1 from training. Duration: 0.5090625 2023-10-04 00:46:04,466 WARNING [train.py:1204] (2/4) Exclude cut with ID _36692_210_5665_1_1533608967420_398809_8-16261-0_sp1.1 from training. Duration: 0.97275 2023-10-04 00:46:07,342 WARNING [train.py:1204] (2/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_86-212657-0_sp1.1 from training. Duration: 0.97275 2023-10-04 00:46:07,360 WARNING [train.py:1204] (2/4) Exclude cut with ID _44659_210_2721_1_1534036120061_6614860_626-298884-0_sp1.1 from training. Duration: 0.8818125 2023-10-04 00:46:10,224 WARNING [train.py:1204] (2/4) Exclude cut with ID _49755_210_18053_1_1534581005846_3218709_120-257918-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 00:46:10,347 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.conv_module1.balancer2.min_abs, batch_count=1465586.6666666667, ans=0.5 2023-10-04 00:46:11,641 WARNING [train.py:1204] (2/4) Exclude cut with ID _21881_210_3525_1_1531825242757_3798380_400-113821-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 00:46:11,644 WARNING [train.py:1204] (2/4) Exclude cut with ID _21783_210_11357_1_1531742458730_3762679_387-236773-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 00:46:13,530 WARNING [train.py:1204] (2/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_613-232856-0_sp1.1 from training. Duration: 0.4909375 2023-10-04 00:46:13,574 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder_embed.conv.5.prob, batch_count=1465586.6666666667, ans=0.125 2023-10-04 00:46:14,787 WARNING [train.py:1204] (2/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_181-78632-0_sp0.9 from training. Duration: 0.8666875 2023-10-04 00:46:16,205 WARNING [train.py:1204] (2/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_175-17485-0_sp1.1 from training. Duration: 0.97275 2023-10-04 00:46:17,968 INFO [train.py:1046] (2/4) Epoch 42, batch 2050, loss[loss=0.1644, simple_loss=0.2427, pruned_loss=0.0431, over 23416.00 frames. ], tot_loss[loss=0.1564, simple_loss=0.2367, pruned_loss=0.03803, over 4712201.27 frames. ], batch size: 93, lr: 2.42e-03, grad_scale: 16.0 2023-10-04 00:46:18,089 WARNING [train.py:1204] (2/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_1004-183422-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 00:46:21,505 WARNING [train.py:1204] (2/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_32-65962-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 00:46:21,595 WARNING [train.py:1204] (2/4) Exclude cut with ID _15458_210_8341_1_1530858614263_3834410_40-298664-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 00:46:27,007 WARNING [train.py:1204] (2/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_380-261863-0_sp1.1 from training. Duration: 0.963625 2023-10-04 00:46:28,405 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_372-173012-0_sp1.1 from training. Duration: 0.863625 2023-10-04 00:46:29,757 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_57-82205-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 00:46:29,821 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_3691_210_3528_1_1526468169_4062053_355-334432-0_sp0.9 from training. Duration: 0.9666875 2023-10-04 00:46:31,980 WARNING [train.py:1204] (2/4) Exclude cut with ID _19023_210_4353_1_1531378892552_5820780_28-36615-0 from training. Duration: 0.97 2023-10-04 00:46:31,985 WARNING [train.py:1204] (2/4) Exclude cut with ID _29853_210_7497_1_1532505600851_6642110_286-251333-0_sp1.1 from training. Duration: 0.92725 2023-10-04 00:46:34,701 WARNING [train.py:1204] (2/4) Exclude cut with ID _31602_210_5854_1_1532677021456_4458250_518-80679-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 00:46:34,742 WARNING [train.py:1204] (2/4) Exclude cut with ID _14922_210_6228_1_1530707435273_3693310_38-187851-0_sp0.9 from training. Duration: 0.688875 2023-10-04 00:46:44,498 WARNING [train.py:1204] (2/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_39-248870-0_sp1.1 from training. Duration: 0.62725 2023-10-04 00:46:44,515 WARNING [train.py:1204] (2/4) Exclude cut with ID _16908_210_5724_1_1531119157248_3772700_154-31455-0_sp1.1 from training. Duration: 0.97275 2023-10-04 00:46:48,272 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8453_210_4211_1_1528502940_8562730_347-330673-0 from training. Duration: 0.72 2023-10-04 00:46:49,715 WARNING [train.py:1204] (2/4) Exclude cut with ID _30191_210_1664_1_1533189915047_7196670_674-204247-0_sp1.1 from training. Duration: 0.97275 2023-10-04 00:46:51,025 WARNING [train.py:1204] (2/4) Exclude cut with ID _8833_210_5219_1_1528592665210_7553059_329-125156-0 from training. Duration: 0.82 2023-10-04 00:46:51,066 WARNING [train.py:1204] (2/4) Exclude cut with ID _4779_210_2084_1_1527318022944_3914893_500-285013-0_sp1.1 from training. Duration: 0.62725 2023-10-04 00:46:52,578 WARNING [train.py:1204] (2/4) Exclude cut with ID _14421_210_8750_1_1530604795825_3542290_190-313578-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 00:46:52,859 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.bypass.scale_min, batch_count=1465786.6666666667, ans=0.2 2023-10-04 00:46:55,924 WARNING [train.py:1204] (2/4) Exclude cut with ID _35799_210_5854_1_1533263487887_3529071_538-273574-0_sp1.1 from training. Duration: 0.936375 2023-10-04 00:46:57,320 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_14928_210_5632_1_1530773461_4714000_454-112198-0_sp1.1 from training. Duration: 0.6 2023-10-04 00:46:57,365 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_468-143126-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 00:46:58,715 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_4528_210_3273_1_1527076654_3276777_258-121681-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 00:47:00,126 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_397-132565-0_sp1.1 from training. Duration: 0.9 2023-10-04 00:47:00,144 WARNING [train.py:1204] (2/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_454-123002-0_sp0.9 from training. Duration: 0.7666875 2023-10-04 00:47:03,612 WARNING [train.py:1204] (2/4) Exclude cut with ID _27052_210_5986_1_1532329197187_3585009_297-86890-0_sp1.1 from training. Duration: 0.936375 2023-10-04 00:47:06,227 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_51225_210_3601_1_1534400634_2327383_55-301844-0_sp1.1 from training. Duration: 0.8454375 2023-10-04 00:47:08,811 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.533e+02 1.968e+02 2.128e+02 2.407e+02 4.254e+02, threshold=4.257e+02, percent-clipped=0.0 2023-10-04 00:47:08,921 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6786_210_1388_1_1527839699_4194495_378-46070-0_sp0.9 from training. Duration: 0.8 2023-10-04 00:47:10,314 WARNING [train.py:1204] (2/4) Exclude cut with ID _42334_210_7770_1_1534206610396_3605880_55-19758-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 00:47:13,283 WARNING [train.py:1204] (2/4) Exclude cut with ID _14884_210_2252_1_1530707459689_5561250_114-201399-0_sp0.9 from training. Duration: 0.8444375 2023-10-04 00:47:18,575 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_99-251314-0_sp0.9 from training. Duration: 0.9666875 2023-10-04 00:47:18,646 WARNING [train.py:1204] (2/4) Exclude cut with ID _26528_210_9070_1_1532570410842_3851310_497-97218-0 from training. Duration: 0.86 2023-10-04 00:47:24,286 WARNING [train.py:1204] (2/4) Exclude cut with ID _55328_210_27767_1_1534580973260_4088170_230-317241-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 00:47:24,351 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_125-203105-0_sp0.9 from training. Duration: 0.888875 2023-10-04 00:47:26,434 INFO [scaling.py:1118] (2/4) WithLoss: name=encoder.encoders.2.encoder.layers.2.self_attn_weights, loss-sum=0.000e+00 2023-10-04 00:47:27,542 WARNING [train.py:1204] (2/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_55-31904-0_sp1.1 from training. Duration: 0.8 2023-10-04 00:47:28,912 WARNING [train.py:1204] (2/4) Exclude cut with ID _8064_210_5399_1_1528372299291_4282973_18-174647-0 from training. Duration: 0.91 2023-10-04 00:47:33,504 INFO [train.py:1046] (2/4) Epoch 42, batch 2100, loss[loss=0.1653, simple_loss=0.2513, pruned_loss=0.03962, over 24385.00 frames. ], tot_loss[loss=0.1553, simple_loss=0.2352, pruned_loss=0.03772, over 4707761.72 frames. ], batch size: 77, lr: 2.42e-03, grad_scale: 16.0 2023-10-04 00:47:33,554 WARNING [train.py:1204] (2/4) Exclude cut with ID IC0165W0005-81726 from training. Duration: 0.952 2023-10-04 00:47:33,555 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_31583_210_3852_1_1532595851_4024413_148-171847-0_sp1.1 from training. Duration: 0.963625 2023-10-04 00:47:33,607 WARNING [train.py:1204] (2/4) Exclude cut with ID _21989_210_5854_1_1531899214931_3271360_116-138769-0_sp1.1 from training. Duration: 0.936375 2023-10-04 00:47:34,949 WARNING [train.py:1204] (2/4) Exclude cut with ID _66070_210_7640_1_1535441393096_7805310_489-292631-0_sp1.1 from training. Duration: 0.7181875 2023-10-04 00:47:36,275 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_20120_210_6753_1_1531643035_5482859_202-27960-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 00:47:36,290 WARNING [train.py:1204] (2/4) Exclude cut with ID _5294_210_1747_1_1527249643928_3696179_342-270278-0 from training. Duration: 0.55 2023-10-04 00:47:36,326 WARNING [train.py:1204] (2/4) Exclude cut with ID _38323_210_12108_1_1533121233369_3303049_416-11919-0 from training. Duration: 0.96 2023-10-04 00:47:37,748 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6545_210_2040_1_1527774670_4245258_407-48070-0_sp0.9 from training. Duration: 0.8444375 2023-10-04 00:47:40,552 WARNING [train.py:1204] (2/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_503-30719-0_sp1.1 from training. Duration: 0.8909375 2023-10-04 00:47:41,898 WARNING [train.py:1204] (2/4) Exclude cut with ID _49643_210_7989_1_1534640244224_3939031_234-326497-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 00:47:44,672 WARNING [train.py:1204] (2/4) Exclude cut with ID _14170_210_5219_1_1530525949868_4469650_301-77873-0_sp1.1 from training. Duration: 0.963625 2023-10-04 00:47:46,491 WARNING [train.py:1204] (2/4) Exclude cut with ID _45297_210_2721_1_1533715351381_3264949_152-154697-0_sp1.1 from training. Duration: 0.97275 2023-10-04 00:47:46,500 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_3371_210_1794_1_1526384462_5580777_396-339812-0 from training. Duration: 0.85 2023-10-04 00:47:46,598 WARNING [train.py:1204] (2/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_399-156791-0_sp1.1 from training. Duration: 0.7545625 2023-10-04 00:47:46,634 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7696_210_2040_1_1528207167_3716099_117-125434-0 from training. Duration: 0.76 2023-10-04 00:47:46,638 WARNING [train.py:1204] (2/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_96-244299-0 from training. Duration: 0.88 2023-10-04 00:47:48,087 WARNING [train.py:1204] (2/4) Exclude cut with ID _37892_210_19596_1_1533198677897_3964058_519-343462-0_sp1.1 from training. Duration: 0.92725 2023-10-04 00:47:48,111 WARNING [train.py:1204] (2/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_202-183922-0_sp1.1 from training. Duration: 0.77275 2023-10-04 00:47:48,117 WARNING [train.py:1204] (2/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_453-133983-0 from training. Duration: 0.78 2023-10-04 00:47:48,159 WARNING [train.py:1204] (2/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_178-39722-0_sp1.1 from training. Duration: 0.4454375 2023-10-04 00:47:53,987 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_15126_210_6753_1_1530778677_2591185_113-157651-0 from training. Duration: 0.68 2023-10-04 00:47:53,988 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_271-234962-0_sp1.1 from training. Duration: 0.7181875 2023-10-04 00:47:58,485 WARNING [train.py:1204] (2/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_195-64967-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 00:47:58,525 WARNING [train.py:1204] (2/4) Exclude cut with ID _43552_210_8913_1_1533726031059_3523429_365-215065-0_sp1.1 from training. Duration: 0.936375 2023-10-04 00:48:01,713 WARNING [train.py:1204] (2/4) Exclude cut with ID _31602_210_5854_1_1532677021456_4458250_249-96604-0_sp0.9 from training. Duration: 0.988875 2023-10-04 00:48:03,053 WARNING [train.py:1204] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_307-158462-0 from training. Duration: 0.69 2023-10-04 00:48:03,106 WARNING [train.py:1204] (2/4) Exclude cut with ID _30918_210_6400_1_1532754078600_3125920_407-223394-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 00:48:03,109 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6673_210_3273_1_1527767790_3173240_351-167954-0_sp0.9 from training. Duration: 0.5333125 2023-10-04 00:48:04,531 WARNING [train.py:1204] (2/4) Exclude cut with ID _4866_210_2577_1_1527404412284_4364659_256-306830-0 from training. Duration: 0.87 2023-10-04 00:48:04,582 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_17613_210_2151_1_1531137569_2366160_222-131118-0_sp1.1 from training. Duration: 0.963625 2023-10-04 00:48:05,868 WARNING [train.py:1204] (2/4) Exclude cut with ID _45560_210_21982_1_1533812603951_3584999_111-6376-0 from training. Duration: 0.84 2023-10-04 00:48:05,905 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_216-73841-0 from training. Duration: 0.96 2023-10-04 00:48:05,950 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_14058_210_4163_1_1530690953_6327180_49-210687-0 from training. Duration: 0.94 2023-10-04 00:48:08,691 WARNING [train.py:1204] (2/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_506-42614-0_sp0.9 from training. Duration: 0.87775 2023-10-04 00:48:10,143 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_3133_210_2087_1_1526106446_2324690_25-332138-0_sp1.1 from training. Duration: 0.82725 2023-10-04 00:48:11,559 WARNING [train.py:1204] (2/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_589-9070-0_sp1.1 from training. Duration: 0.6454375 2023-10-04 00:48:13,043 WARNING [train.py:1204] (2/4) Exclude cut with ID _6194_210_1534_1_1527511826160_3941194_14-282876-0_sp1.1 from training. Duration: 0.6454375 2023-10-04 00:48:14,407 WARNING [train.py:1204] (2/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_1166-90654-0_sp1.1 from training. Duration: 0.92725 2023-10-04 00:48:17,667 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_14056_210_4163_1_1530604483_7572827_218-255858-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 00:48:17,670 WARNING [train.py:1204] (2/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_1285-256622-0 from training. Duration: 0.84 2023-10-04 00:48:17,696 WARNING [train.py:1204] (2/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_205-286261-0_sp1.1 from training. Duration: 0.963625 2023-10-04 00:48:18,843 WARNING [train.py:1204] (2/4) Exclude cut with ID _68777_210_4353_1_1535810390731_3117904_6-154119-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 00:48:20,706 WARNING [train.py:1204] (2/4) Exclude cut with ID _22982_210_1883_1_1531882790447_5449409_565-199393-0_sp1.1 from training. Duration: 0.92725 2023-10-04 00:48:20,716 WARNING [train.py:1204] (2/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_278-154492-0 from training. Duration: 0.76 2023-10-04 00:48:22,159 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_426-313557-0 from training. Duration: 0.53 2023-10-04 00:48:22,217 WARNING [train.py:1204] (2/4) Exclude cut with ID _41817_210_18835_1_1533434477160_7423049_555-293448-0 from training. Duration: 0.8 2023-10-04 00:48:26,818 WARNING [train.py:1204] (2/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_32-335103-0_sp1.1 from training. Duration: 0.7454375 2023-10-04 00:48:28,317 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_2655_210_2087_1_1525688959_3897400_202-147557-0_sp1.1 from training. Duration: 0.77275 2023-10-04 00:48:28,342 WARNING [train.py:1204] (2/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_158-250116-0 from training. Duration: 0.62 2023-10-04 00:48:29,954 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.self_attn_weights.pos_emb_skip_rate, batch_count=1466186.6666666667, ans=0.0 2023-10-04 00:48:34,342 WARNING [train.py:1204] (2/4) Exclude cut with ID _55429_210_10626_1_1534467588527_7174143_528-88083-0_sp1.1 from training. Duration: 0.963625 2023-10-04 00:48:37,054 WARNING [train.py:1204] (2/4) Exclude cut with ID _8030_210_7497_1_1528444886781_3531089_32-31837-0_sp1.1 from training. Duration: 0.8818125 2023-10-04 00:48:37,077 WARNING [train.py:1204] (2/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_744-86168-0_sp1.1 from training. Duration: 0.97275 2023-10-04 00:48:37,094 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1113-7369-0_sp0.9 from training. Duration: 0.9444375 2023-10-04 00:48:37,117 WARNING [train.py:1204] (2/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_124-345840-0_sp0.9 from training. Duration: 0.57775 2023-10-04 00:48:38,310 WARNING [train.py:1204] (2/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_328-264776-0_sp0.9 from training. Duration: 0.7444375 2023-10-04 00:48:39,746 WARNING [train.py:1204] (2/4) Exclude cut with ID _18895_210_6228_1_1531357274234_3784930_469-166615-0_sp1.1 from training. Duration: 0.963625 2023-10-04 00:48:39,752 WARNING [train.py:1204] (2/4) Exclude cut with ID _8395_210_1531_1_1528597871523_4411579_431-132470-0_sp0.9 from training. Duration: 0.82225 2023-10-04 00:48:39,828 WARNING [train.py:1204] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_554-45617-0_sp1.1 from training. Duration: 0.9 2023-10-04 00:48:41,141 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_35361_210_4238_1_1533262810_7793476_524-188242-0_sp1.1 from training. Duration: 0.92725 2023-10-04 00:48:42,671 WARNING [train.py:1204] (2/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_938-205135-0 from training. Duration: 0.87 2023-10-04 00:48:44,073 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_5931_210_2040_1_1527428481_4547999_321-193614-0 from training. Duration: 0.67 2023-10-04 00:48:44,092 WARNING [train.py:1204] (2/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_445-269521-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 00:48:46,748 INFO [train.py:1046] (2/4) Epoch 42, batch 2150, loss[loss=0.1579, simple_loss=0.2399, pruned_loss=0.03797, over 23464.00 frames. ], tot_loss[loss=0.1548, simple_loss=0.2346, pruned_loss=0.03748, over 4707787.08 frames. ], batch size: 93, lr: 2.42e-03, grad_scale: 16.0 2023-10-04 00:48:46,822 WARNING [train.py:1204] (2/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_3-33116-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 00:48:46,824 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_4633_210_2040_1_1526824163_4145343_416-76117-0_sp1.1 from training. Duration: 0.863625 2023-10-04 00:48:46,873 WARNING [train.py:1204] (2/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_1033-274376-0_sp1.1 from training. Duration: 0.7909375 2023-10-04 00:48:46,934 WARNING [train.py:1204] (2/4) Exclude cut with ID _8772_210_1850_1_1528635595972_3664011_398-320049-0_sp0.9 from training. Duration: 0.97775 2023-10-04 00:48:50,319 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.0.bypass.scale_min, batch_count=1466320.0, ans=0.2 2023-10-04 00:48:54,174 WARNING [train.py:1204] (2/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_321-215742-0_sp0.9 from training. Duration: 0.5555625 2023-10-04 00:48:54,317 WARNING [train.py:1204] (2/4) Exclude cut with ID _28394_210_8913_1_1532430049518_3650118_272-190950-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 00:48:55,691 WARNING [train.py:1204] (2/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_31-299371-0_sp1.1 from training. Duration: 0.92725 2023-10-04 00:48:55,929 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.conv_skip_rate, batch_count=1466320.0, ans=0.0 2023-10-04 00:48:57,720 WARNING [train.py:1204] (2/4) Exclude cut with ID _55383_210_10635_1_1534468426172_2231669_134-128246-0_sp0.9 from training. Duration: 0.988875 2023-10-04 00:48:57,722 WARNING [train.py:1204] (2/4) Exclude cut with ID _40519_210_13794_1_1533715228655_7337390_887-9547-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 00:48:58,998 WARNING [train.py:1204] (2/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_310-109461-0_sp0.9 from training. Duration: 0.888875 2023-10-04 00:49:03,409 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_2009_210_2071_1_1525481134_4588937_258-250196-0_sp1.1 from training. Duration: 0.92725 2023-10-04 00:49:03,452 WARNING [train.py:1204] (2/4) Exclude cut with ID _9235_210_5219_1_1528891154253_8046120_1186-72702-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 00:49:03,454 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_14831_210_7927_1_1530700200_5583610_181-27467-0_sp1.1 from training. Duration: 0.82725 2023-10-04 00:49:04,267 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.1.encoder.layers.0.nonlin_attention.whiten1, num_groups=1, num_channels=192, metric=4.62 vs. limit=10.0 2023-10-04 00:49:07,533 WARNING [train.py:1204] (2/4) Exclude cut with ID _40402_210_18219_1_1533522324115_2953150_278-132109-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 00:49:07,550 WARNING [train.py:1204] (2/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_261-297503-0 from training. Duration: 0.99 2023-10-04 00:49:11,699 WARNING [train.py:1204] (2/4) Exclude cut with ID _19896_210_1883_1_1531709022662_6649569_313-265903-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 00:49:13,089 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_105-114797-0_sp1.1 from training. Duration: 0.763625 2023-10-04 00:49:14,482 WARNING [train.py:1204] (2/4) Exclude cut with ID _53755_210_8254_1_1534577312941_4707360_9-65843-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 00:49:14,514 WARNING [train.py:1204] (2/4) Exclude cut with ID _18516_210_9206_1_1531704653206_3692406_361-224549-0_sp1.1 from training. Duration: 0.97275 2023-10-04 00:49:15,745 WARNING [train.py:1204] (2/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_403-310975-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 00:49:15,797 WARNING [train.py:1204] (2/4) Exclude cut with ID _5444_210_2151_1_1527418808464_5312210_581-315411-0_sp0.9 from training. Duration: 0.688875 2023-10-04 00:49:15,846 WARNING [train.py:1204] (2/4) Exclude cut with ID _23825_210_13449_1_1531915313526_3497150_464-154904-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 00:49:15,855 WARNING [train.py:1204] (2/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_937-208281-0_sp0.9 from training. Duration: 0.9444375 2023-10-04 00:49:17,183 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_37839_210_7234_1_1533542462_3689303_158-181212-0_sp1.1 from training. Duration: 0.963625 2023-10-04 00:49:17,261 WARNING [train.py:1204] (2/4) Exclude cut with ID _8772_210_1850_1_1528635595972_3664011_398-320049-0 from training. Duration: 0.88 2023-10-04 00:49:20,406 WARNING [train.py:1204] (2/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_718-250907-0_sp1.1 from training. Duration: 0.62725 2023-10-04 00:49:22,205 WARNING [train.py:1204] (2/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_641-126395-0_sp1.1 from training. Duration: 0.92725 2023-10-04 00:49:22,231 WARNING [train.py:1204] (2/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_166-241493-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 00:49:23,650 WARNING [train.py:1204] (2/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_526-62079-0_sp0.9 from training. Duration: 0.7444375 2023-10-04 00:49:24,935 WARNING [train.py:1204] (2/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_439-275083-0_sp1.1 from training. Duration: 0.936375 2023-10-04 00:49:26,477 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_66907_210_21581_1_1535615661_3590177_493-229032-0_sp1.1 from training. Duration: 0.92725 2023-10-04 00:49:27,830 WARNING [train.py:1204] (2/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_89-321955-0_sp1.1 from training. Duration: 0.72725 2023-10-04 00:49:29,615 WARNING [train.py:1204] (2/4) Exclude cut with ID _29857_210_20034_1_1532395886753_3798160_273-226195-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 00:49:29,629 WARNING [train.py:1204] (2/4) Exclude cut with ID _22826_210_9994_1_1531908201522_3279169_404-75889-0 from training. Duration: 0.89 2023-10-04 00:49:29,662 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_271-234962-0_sp0.9 from training. Duration: 0.87775 2023-10-04 00:49:30,443 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.2.encoder.layers.1.conv_module1.whiten, num_groups=1, num_channels=384, metric=7.58 vs. limit=15.0 2023-10-04 00:49:32,955 WARNING [train.py:1204] (2/4) Exclude cut with ID _22961_210_8254_1_1531976366672_4278180_156-339685-0_sp1.1 from training. Duration: 0.97275 2023-10-04 00:49:34,271 WARNING [train.py:1204] (2/4) Exclude cut with ID _37892_210_19596_1_1533198677897_3964058_425-231927-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 00:49:34,365 WARNING [train.py:1204] (2/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_102-288670-0_sp1.1 from training. Duration: 0.97275 2023-10-04 00:49:34,525 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.1.balancer1.prob, batch_count=1466520.0, ans=0.125 2023-10-04 00:49:35,708 WARNING [train.py:1204] (2/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_283-220372-0_sp1.1 from training. Duration: 0.5090625 2023-10-04 00:49:35,777 WARNING [train.py:1204] (2/4) Exclude cut with ID _44304_210_15374_1_1533639352765_4613129_86-1699-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 00:49:37,167 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.733e+02 1.971e+02 2.124e+02 2.467e+02 3.717e+02, threshold=4.249e+02, percent-clipped=0.0 2023-10-04 00:49:37,298 WARNING [train.py:1204] (2/4) Exclude cut with ID _2891_210_2550_1_1525874398521_4427710_327-121590-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 00:49:37,305 WARNING [train.py:1204] (2/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_369-222060-0 from training. Duration: 0.71 2023-10-04 00:49:40,108 WARNING [train.py:1204] (2/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_762-109886-0 from training. Duration: 0.91 2023-10-04 00:49:40,147 WARNING [train.py:1204] (2/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_477-255782-0_sp0.9 from training. Duration: 0.911125 2023-10-04 00:49:40,173 WARNING [train.py:1204] (2/4) Exclude cut with ID IC0669W0061-330906 from training. Duration: 0.768 2023-10-04 00:49:40,218 WARNING [train.py:1204] (2/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_347-213071-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 00:49:41,456 WARNING [train.py:1204] (2/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_507-303370-0_sp1.1 from training. Duration: 0.87275 2023-10-04 00:49:42,829 WARNING [train.py:1204] (2/4) Exclude cut with ID _15012_210_1968_1_1530856820429_5095350_688-178849-0 from training. Duration: 0.64 2023-10-04 00:49:42,843 WARNING [train.py:1204] (2/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_28-70987-0_sp0.9 from training. Duration: 0.9666875 2023-10-04 00:49:42,845 WARNING [train.py:1204] (2/4) Exclude cut with ID _5910_210_1389_1_1527337972454_5050100_555-159815-0 from training. Duration: 0.98 2023-10-04 00:49:42,875 WARNING [train.py:1204] (2/4) Exclude cut with ID IC0679W0064-335871 from training. Duration: 0.768 2023-10-04 00:49:42,875 WARNING [train.py:1204] (2/4) Exclude cut with ID IC0679W0079-335886 from training. Duration: 0.896 2023-10-04 00:49:44,155 WARNING [train.py:1204] (2/4) Exclude cut with ID _5608_210_2106_1_1527415439089_6819768_909-82767-0 from training. Duration: 0.61 2023-10-04 00:49:45,497 WARNING [train.py:1204] (2/4) Exclude cut with ID _23038_210_9570_1_1532001584158_3652419_411-323967-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 00:49:45,560 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_40896_210_15710_1_1533433956_7100802_350-231748-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 00:49:45,570 WARNING [train.py:1204] (2/4) Exclude cut with ID _5289_210_2084_1_1527678202736_3779299_142-103317-0_sp0.9 from training. Duration: 0.9333125 2023-10-04 00:49:45,642 WARNING [train.py:1204] (2/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_314-144055-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 00:49:46,505 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.1.encoder.layers.0.self_attn1.whiten, num_groups=1, num_channels=256, metric=11.16 vs. limit=22.5 2023-10-04 00:49:47,055 WARNING [train.py:1204] (2/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_177-236809-0_sp0.9 from training. Duration: 0.5444375 2023-10-04 00:49:48,552 WARNING [train.py:1204] (2/4) Exclude cut with ID _23038_210_9570_1_1532001584158_3652419_82-334733-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 00:49:48,560 WARNING [train.py:1204] (2/4) Exclude cut with ID _43552_210_8913_1_1533726031059_3523429_225-22526-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 00:49:48,781 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.ff2_skip_rate, batch_count=1466586.6666666667, ans=0.0 2023-10-04 00:49:53,709 INFO [scaling.py:1118] (2/4) WithLoss: name=encoder.encoders.0.layers.1.self_attn_weights, loss-sum=0.000e+00 2023-10-04 00:49:57,700 WARNING [train.py:1204] (2/4) Exclude cut with ID _30327_210_16049_1_1532484161295_3924169_401-70819-0_sp1.1 from training. Duration: 0.7818125 2023-10-04 00:49:57,736 WARNING [train.py:1204] (2/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_538-32291-0 from training. Duration: 0.83 2023-10-04 00:50:00,805 INFO [train.py:1046] (2/4) Epoch 42, batch 2200, loss[loss=0.1454, simple_loss=0.2334, pruned_loss=0.0287, over 24482.00 frames. ], tot_loss[loss=0.1547, simple_loss=0.2344, pruned_loss=0.0375, over 4709488.13 frames. ], batch size: 66, lr: 2.42e-03, grad_scale: 16.0 2023-10-04 00:50:02,314 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_37357_210_17024_1_1533089504_2316121_258-211605-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 00:50:07,850 WARNING [train.py:1204] (2/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_550-335198-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 00:50:07,911 WARNING [train.py:1204] (2/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_98-741-0_sp0.9 from training. Duration: 0.888875 2023-10-04 00:50:09,208 WARNING [train.py:1204] (2/4) Exclude cut with ID _38323_210_12108_1_1533121233369_3303049_251-238988-0_sp1.1 from training. Duration: 0.92725 2023-10-04 00:50:09,292 WARNING [train.py:1204] (2/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_259-34038-0_sp0.9 from training. Duration: 0.82225 2023-10-04 00:50:10,770 WARNING [train.py:1204] (2/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_1060-180633-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 00:50:10,803 WARNING [train.py:1204] (2/4) Exclude cut with ID _8755_210_1876_1_1528628559357_3258209_211-37473-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 00:50:10,809 WARNING [train.py:1204] (2/4) Exclude cut with ID _4122_210_2084_1_1526713164417_4353280_528-206771-0 from training. Duration: 0.88 2023-10-04 00:50:17,452 WARNING [train.py:1204] (2/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_64-248359-0 from training. Duration: 0.69 2023-10-04 00:50:20,150 WARNING [train.py:1204] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_402-253027-0_sp1.1 from training. Duration: 0.4909375 2023-10-04 00:50:23,798 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.self_attn_weights.pos_emb_skip_rate, batch_count=1466720.0, ans=0.0 2023-10-04 00:50:24,987 WARNING [train.py:1204] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_625-102955-0 from training. Duration: 0.77 2023-10-04 00:50:27,037 WARNING [train.py:1204] (2/4) Exclude cut with ID _14998_210_5219_1_1530705582080_7474970_611-219324-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 00:50:28,397 WARNING [train.py:1204] (2/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_718-68505-0_sp0.9 from training. Duration: 0.988875 2023-10-04 00:50:28,442 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_353-182855-0_sp1.1 from training. Duration: 0.87275 2023-10-04 00:50:32,977 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_5960_210_1794_1_1527403865_3990999_515-2613-0_sp0.9 from training. Duration: 0.97775 2023-10-04 00:50:33,003 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_5575_210_1410_1_1527301305_2570170_342-265572-0 from training. Duration: 0.94 2023-10-04 00:50:37,612 WARNING [train.py:1204] (2/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_329-113173-0_sp0.9 from training. Duration: 0.688875 2023-10-04 00:50:38,982 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_15396_210_6890_1_1531308473_1938936_213-312506-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 00:50:39,133 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.0.bypass.skip_rate, batch_count=1466786.6666666667, ans=0.035 2023-10-04 00:50:40,285 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_521-214276-0_sp1.1 from training. Duration: 0.4 2023-10-04 00:50:40,575 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.conv_module1.balancer2.prob, batch_count=1466786.6666666667, ans=0.125 2023-10-04 00:50:41,787 WARNING [train.py:1204] (2/4) Exclude cut with ID _15026_210_8750_1_1530777602316_3291850_211-195043-0_sp1.1 from training. Duration: 0.72725 2023-10-04 00:50:44,454 WARNING [train.py:1204] (2/4) Exclude cut with ID _29343_210_4381_1_1532602757752_3674109_108-257494-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 00:50:45,852 WARNING [train.py:1204] (2/4) Exclude cut with ID _64414_210_17301_1_1535243252048_3757759_403-58151-0_sp1.1 from training. Duration: 0.963625 2023-10-04 00:50:47,236 WARNING [train.py:1204] (2/4) Exclude cut with ID _36512_210_6987_1_1533033143998_3126069_109-177787-0_sp1.1 from training. Duration: 0.92725 2023-10-04 00:50:47,531 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.feed_forward2.hidden_balancer.prob, batch_count=1466853.3333333333, ans=0.125 2023-10-04 00:50:48,703 WARNING [train.py:1204] (2/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_498-40153-0 from training. Duration: 0.63 2023-10-04 00:50:50,077 WARNING [train.py:1204] (2/4) Exclude cut with ID _56447_210_1389_1_1535074245910_3697189_161-334523-0_sp1.1 from training. Duration: 0.936375 2023-10-04 00:50:50,180 WARNING [train.py:1204] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_238-245655-0 from training. Duration: 0.81 2023-10-04 00:50:50,466 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.feed_forward1.hidden_balancer.prob, batch_count=1466853.3333333333, ans=0.125 2023-10-04 00:50:52,795 WARNING [train.py:1204] (2/4) Exclude cut with ID _64631_210_27767_1_1535349580068_4165160_108-43865-0_sp1.1 from training. Duration: 0.92725 2023-10-04 00:50:52,801 WARNING [train.py:1204] (2/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_243-42116-0_sp1.1 from training. Duration: 0.663625 2023-10-04 00:50:52,834 WARNING [train.py:1204] (2/4) Exclude cut with ID _66868_210_9527_1_1535509287954_4561140_594-132160-0_sp1.1 from training. Duration: 0.92725 2023-10-04 00:50:56,042 WARNING [train.py:1204] (2/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_48-338976-0_sp0.9 from training. Duration: 0.988875 2023-10-04 00:50:56,093 WARNING [train.py:1204] (2/4) Exclude cut with ID _5250_210_5219_1_1527249561806_8692110_137-100595-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 00:50:56,099 WARNING [train.py:1204] (2/4) Exclude cut with ID _49748_210_24116_1_1533986971737_3158910_12-271669-0_sp1.1 from training. Duration: 0.936375 2023-10-04 00:50:56,118 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_37357_210_17024_1_1533089504_2316121_206-80421-0_sp1.1 from training. Duration: 0.936375 2023-10-04 00:50:59,270 WARNING [train.py:1204] (2/4) Exclude cut with ID _7537_210_2084_1_1528502395431_3924359_113-199933-0_sp1.1 from training. Duration: 0.536375 2023-10-04 00:50:59,529 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.conv_skip_rate, batch_count=1466853.3333333333, ans=0.0 2023-10-04 00:51:01,012 WARNING [train.py:1204] (2/4) Exclude cut with ID _55440_210_12651_1_1534496713364_2980520_31-59289-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 00:51:02,588 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1083-220122-0_sp0.9 from training. Duration: 0.5666875 2023-10-04 00:51:07,128 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_426-313557-0_sp1.1 from training. Duration: 0.4818125 2023-10-04 00:51:07,189 WARNING [train.py:1204] (2/4) Exclude cut with ID _35799_210_5854_1_1533263487887_3529071_324-94843-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 00:51:07,366 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.conv_module2.balancer2.prob, batch_count=1466920.0, ans=0.125 2023-10-04 00:51:09,860 WARNING [train.py:1204] (2/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_264-108685-0_sp1.1 from training. Duration: 0.5 2023-10-04 00:51:11,248 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9012W0006-501141 from training. Duration: 0.896 2023-10-04 00:51:11,387 WARNING [train.py:1204] (2/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_890-5117-0_sp0.9 from training. Duration: 0.8666875 2023-10-04 00:51:12,701 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9023W0008-506301 from training. Duration: 0.896 2023-10-04 00:51:12,782 WARNING [train.py:1204] (2/4) Exclude cut with ID _13916_210_9994_1_1530788341126_3763149_60-191556-0_sp0.9 from training. Duration: 0.67775 2023-10-04 00:51:12,833 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9029W0331-509721 from training. Duration: 0.896 2023-10-04 00:51:15,579 WARNING [train.py:1204] (2/4) Exclude cut with ID _35823_210_6917_1_1533187881914_7297830_567-108596-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 00:51:15,636 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7543_210_6753_1_1528544010_1110402_25-183860-0_sp1.1 from training. Duration: 0.42725 2023-10-04 00:51:16,804 INFO [train.py:1046] (2/4) Epoch 42, batch 2250, loss[loss=0.1576, simple_loss=0.2396, pruned_loss=0.0378, over 23129.00 frames. ], tot_loss[loss=0.1554, simple_loss=0.2353, pruned_loss=0.03769, over 4714999.83 frames. ], batch size: 105, lr: 2.42e-03, grad_scale: 16.0 2023-10-04 00:51:18,247 WARNING [train.py:1204] (2/4) Exclude cut with ID _47278_210_12041_1_1533902418349_3576110_495-260035-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 00:51:19,597 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9051W0015-520281 from training. Duration: 0.9045625 2023-10-04 00:51:20,857 WARNING [train.py:1204] (2/4) Exclude cut with ID _14459_210_2228_1_1531443641946_7516700_707-66193-0_sp1.1 from training. Duration: 0.97275 2023-10-04 00:51:21,532 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.feed_forward2.out_whiten.whitening_limit, batch_count=1466986.6666666667, ans=15.0 2023-10-04 00:51:23,716 WARNING [train.py:1204] (2/4) Exclude cut with ID _14087_210_2253_1_1530529224512_3014029_219-224445-0_sp1.1 from training. Duration: 0.763625 2023-10-04 00:51:29,665 WARNING [train.py:1204] (2/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_345-167986-0_sp0.9 from training. Duration: 0.9333125 2023-10-04 00:51:31,106 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_34815_210_6388_1_1533381320_3571903_279-305565-0_sp1.1 from training. Duration: 0.836375 2023-10-04 00:51:34,569 WARNING [train.py:1204] (2/4) Exclude cut with ID _21987_210_5854_1_1531821500482_3637038_317-179212-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 00:51:34,643 WARNING [train.py:1204] (2/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_395-37222-0_sp1.1 from training. Duration: 0.6909375 2023-10-04 00:51:35,987 WARNING [train.py:1204] (2/4) Exclude cut with ID _8064_210_5399_1_1528372299291_4282973_168-50337-0_sp1.1 from training. Duration: 0.763625 2023-10-04 00:51:36,218 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.conv_module2.balancer2.prob, batch_count=1467053.3333333333, ans=0.125 2023-10-04 00:51:38,721 WARNING [train.py:1204] (2/4) Exclude cut with ID _60113_210_16271_1_1535164212481_4386079_559-167191-0 from training. Duration: 0.93 2023-10-04 00:51:38,730 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_47228_210_4983_1_1533780021_3672027_344-179642-0_sp1.1 from training. Duration: 0.92725 2023-10-04 00:51:38,770 WARNING [train.py:1204] (2/4) Exclude cut with ID _22961_210_8254_1_1531976366672_4278180_317-199546-0_sp0.9 from training. Duration: 0.9555625 2023-10-04 00:51:41,932 WARNING [train.py:1204] (2/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_141-89727-0 from training. Duration: 0.71 2023-10-04 00:51:42,002 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_78-116075-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 00:51:42,019 WARNING [train.py:1204] (2/4) Exclude cut with ID _64086_210_7994_1_1535270417019_7435949_52-235756-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 00:51:43,359 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7615_210_4211_1_1528110965_4580160_72-235211-0_sp1.1 from training. Duration: 0.6909375 2023-10-04 00:51:48,799 WARNING [train.py:1204] (2/4) Exclude cut with ID _14069_210_5632_1_1530512692033_4803390_287-27950-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 00:51:48,893 WARNING [train.py:1204] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_74-48717-0_sp0.9 from training. Duration: 0.6444375 2023-10-04 00:51:48,922 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7544_210_6753_1_1528591327_1471426_172-348777-0_sp1.1 from training. Duration: 0.6 2023-10-04 00:51:50,342 WARNING [train.py:1204] (2/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_220-191080-0 from training. Duration: 0.98 2023-10-04 00:51:51,722 WARNING [train.py:1204] (2/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_1081-137641-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 00:51:54,489 WARNING [train.py:1204] (2/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_278-235459-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 00:51:58,109 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.ff3_skip_rate, batch_count=1467120.0, ans=0.0 2023-10-04 00:52:00,485 WARNING [train.py:1204] (2/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_1181-77470-0_sp1.1 from training. Duration: 0.936375 2023-10-04 00:52:00,672 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.bypass.scale_min, batch_count=1467186.6666666667, ans=0.2 2023-10-04 00:52:02,465 WARNING [train.py:1204] (2/4) Exclude cut with ID _60860_210_8254_1_1534896015875_3557169_198-5531-0_sp1.1 from training. Duration: 0.936375 2023-10-04 00:52:03,826 WARNING [train.py:1204] (2/4) Exclude cut with ID _12369_210_4381_1_1530356078581_4388520_17-289781-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 00:52:03,831 WARNING [train.py:1204] (2/4) Exclude cut with ID _50310_210_15488_1_1534068043345_3641080_146-199752-0_sp1.1 from training. Duration: 0.92725 2023-10-04 00:52:05,916 WARNING [train.py:1204] (2/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_969-213354-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 00:52:06,036 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.conv_module2.balancer2.prob, batch_count=1467186.6666666667, ans=0.125 2023-10-04 00:52:07,086 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.565e+02 1.956e+02 2.105e+02 2.368e+02 2.905e+02, threshold=4.210e+02, percent-clipped=0.0 2023-10-04 00:52:08,573 WARNING [train.py:1204] (2/4) Exclude cut with ID _70519_210_4163_1_1535961595994_7458230_66-344156-0_sp1.1 from training. Duration: 0.963625 2023-10-04 00:52:11,593 WARNING [train.py:1204] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_380-204756-0_sp1.1 from training. Duration: 0.7818125 2023-10-04 00:52:13,627 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_128-202104-0_sp0.9 from training. Duration: 0.7 2023-10-04 00:52:17,710 WARNING [train.py:1204] (2/4) Exclude cut with ID _4697_210_2069_1_1526815801953_3823098_51-1049-0_sp0.9 from training. Duration: 0.8333125 2023-10-04 00:52:17,736 WARNING [train.py:1204] (2/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_86-66192-0_sp1.1 from training. Duration: 0.5 2023-10-04 00:52:17,788 WARNING [train.py:1204] (2/4) Exclude cut with ID _14069_210_5632_1_1530512692033_4803390_95-1896-0_sp1.1 from training. Duration: 0.8909375 2023-10-04 00:52:23,206 WARNING [train.py:1204] (2/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_29-293896-0_sp1.1 from training. Duration: 0.5545625 2023-10-04 00:52:25,860 WARNING [train.py:1204] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_167-137255-0_sp0.9 from training. Duration: 0.711125 2023-10-04 00:52:25,861 WARNING [train.py:1204] (2/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_217-95056-0 from training. Duration: 0.98 2023-10-04 00:52:25,883 WARNING [train.py:1204] (2/4) Exclude cut with ID _47278_210_12041_1_1533902418349_3576110_97-266746-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 00:52:25,932 WARNING [train.py:1204] (2/4) Exclude cut with ID _14224_210_4381_1_1530529062037_4264010_380-343670-0_sp1.1 from training. Duration: 0.8 2023-10-04 00:52:28,649 WARNING [train.py:1204] (2/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_107-332398-0 from training. Duration: 0.78 2023-10-04 00:52:30,571 INFO [train.py:1046] (2/4) Epoch 42, batch 2300, loss[loss=0.1295, simple_loss=0.2118, pruned_loss=0.02353, over 24374.00 frames. ], tot_loss[loss=0.1563, simple_loss=0.2364, pruned_loss=0.03806, over 4726870.91 frames. ], batch size: 58, lr: 2.42e-03, grad_scale: 16.0 2023-10-04 00:52:32,153 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.conv_module1.balancer2.prob, batch_count=1467320.0, ans=0.125 2023-10-04 00:52:33,834 WARNING [train.py:1204] (2/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_91-267696-0_sp1.1 from training. Duration: 0.7909375 2023-10-04 00:52:35,118 WARNING [train.py:1204] (2/4) Exclude cut with ID _41298_210_9019_1_1533430371450_4822143_498-186993-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 00:52:39,995 WARNING [train.py:1204] (2/4) Exclude cut with ID _17956_210_4353_1_1531209568125_6979970_54-77610-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 00:52:41,379 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_40047_210_2290_1_1533290519_2374311_257-335162-0_sp1.1 from training. Duration: 0.863625 2023-10-04 00:52:43,181 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9653W0193-675471 from training. Duration: 0.9656875 2023-10-04 00:52:43,501 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.nonlin_attention.balancer.prob, batch_count=1467320.0, ans=0.125 2023-10-04 00:52:44,564 WARNING [train.py:1204] (2/4) Exclude cut with ID _25248_210_8750_1_1532062825941_5554180_243-240536-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 00:52:50,106 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_32360_210_4198_1_1532689160_3648436_60-51908-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 00:52:51,285 WARNING [train.py:1204] (2/4) Exclude cut with ID _4092_210_2151_1_1526643044449_3135759_227-213972-0_sp0.9 from training. Duration: 0.6 2023-10-04 00:52:51,328 WARNING [train.py:1204] (2/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_392-173500-0_sp1.1 from training. Duration: 0.97275 2023-10-04 00:52:51,357 WARNING [train.py:1204] (2/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_71-329150-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 00:52:51,365 WARNING [train.py:1204] (2/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_866-173440-0 from training. Duration: 0.9 2023-10-04 00:52:52,789 WARNING [train.py:1204] (2/4) Exclude cut with ID _14832_210_8750_1_1530691351539_3171348_1-54315-0_sp0.9 from training. Duration: 0.9666875 2023-10-04 00:52:53,585 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.3.encoder.layers.1.self_attn_weights.whiten_keys, num_groups=8, num_channels=256, metric=3.75 vs. limit=6.0 2023-10-04 00:52:54,225 WARNING [train.py:1204] (2/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_430-116139-0_sp1.1 from training. Duration: 0.77275 2023-10-04 00:52:54,268 WARNING [train.py:1204] (2/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_356-45054-0_sp0.9 from training. Duration: 0.9555625 2023-10-04 00:52:57,083 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_3531_210_3528_1_1526294203_5216456_309-227302-0_sp0.9 from training. Duration: 0.7666875 2023-10-04 00:52:59,880 WARNING [train.py:1204] (2/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_301-330455-0_sp1.1 from training. Duration: 0.72725 2023-10-04 00:53:03,070 WARNING [train.py:1204] (2/4) Exclude cut with ID _23557_210_8750_1_1531890106370_6846255_667-145945-0_sp1.1 from training. Duration: 0.92725 2023-10-04 00:53:07,761 WARNING [train.py:1204] (2/4) Exclude cut with ID _14860_210_4915_1_1530702020340_7343240_836-328489-0_sp1.1 from training. Duration: 0.8181875 2023-10-04 00:53:07,806 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_37152_210_9697_1_1533106573_3844188_416-333249-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 00:53:11,070 WARNING [train.py:1204] (2/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_538-32291-0_sp0.9 from training. Duration: 0.92225 2023-10-04 00:53:15,155 WARNING [train.py:1204] (2/4) Exclude cut with ID _32704_210_8954_1_1533017488840_6747370_965-176824-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 00:53:17,932 WARNING [train.py:1204] (2/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_86-19601-0_sp1.1 from training. Duration: 0.77275 2023-10-04 00:53:19,216 WARNING [train.py:1204] (2/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_436-318113-0_sp1.1 from training. Duration: 0.6818125 2023-10-04 00:53:20,554 WARNING [train.py:1204] (2/4) Exclude cut with ID _41817_210_18835_1_1533434477160_7423049_555-293448-0_sp0.9 from training. Duration: 0.888875 2023-10-04 00:53:20,573 WARNING [train.py:1204] (2/4) Exclude cut with ID _4697_210_2069_1_1526815801953_3823098_68-219807-0 from training. Duration: 0.47 2023-10-04 00:53:23,536 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7544_210_6753_1_1528591327_1471426_33-346031-0_sp1.1 from training. Duration: 0.5545625 2023-10-04 00:53:23,543 WARNING [train.py:1204] (2/4) Exclude cut with ID _49748_210_24116_1_1533986971737_3158910_263-52113-0_sp1.1 from training. Duration: 0.97275 2023-10-04 00:53:24,861 WARNING [train.py:1204] (2/4) Exclude cut with ID _64639_210_5816_1_1535367619437_3528000_340-24580-0_sp1.1 from training. Duration: 0.92725 2023-10-04 00:53:24,893 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_47228_210_4983_1_1533780021_3672027_479-114941-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 00:53:26,253 WARNING [train.py:1204] (2/4) Exclude cut with ID _44585_210_18628_1_1533605350387_4518199_506-119659-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 00:53:27,642 WARNING [train.py:1204] (2/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_314-126819-0_sp1.1 from training. Duration: 0.4545625 2023-10-04 00:53:27,645 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_2541_210_1794_1_1525590269_3084765_156-173713-0_sp0.9 from training. Duration: 0.9 2023-10-04 00:53:27,691 WARNING [train.py:1204] (2/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_108-255430-0 from training. Duration: 0.97 2023-10-04 00:53:27,699 WARNING [train.py:1204] (2/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_791-45149-0_sp1.1 from training. Duration: 0.7818125 2023-10-04 00:53:27,710 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_53378_210_17282_1_1534227054_1261425_10-118295-0_sp1.1 from training. Duration: 0.97275 2023-10-04 00:53:28,996 WARNING [train.py:1204] (2/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_148-158509-0 from training. Duration: 0.41 2023-10-04 00:53:35,296 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_34726_210_4525_1_1533019474_5030395_687-291305-0_sp1.1 from training. Duration: 0.936375 2023-10-04 00:53:38,471 WARNING [train.py:1204] (2/4) Exclude cut with ID _14860_210_4915_1_1530702020340_7343240_264-324537-0_sp1.1 from training. Duration: 0.963625 2023-10-04 00:53:42,619 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_41251_210_15368_1_1533429767_4820695_591-46386-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 00:53:42,652 WARNING [train.py:1204] (2/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_86-19601-0_sp0.9 from training. Duration: 0.9444375 2023-10-04 00:53:42,671 WARNING [train.py:1204] (2/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_401-255491-0_sp0.9 from training. Duration: 0.77775 2023-10-04 00:53:44,429 INFO [train.py:1046] (2/4) Epoch 42, batch 2350, loss[loss=0.1389, simple_loss=0.2172, pruned_loss=0.0303, over 24381.00 frames. ], tot_loss[loss=0.1565, simple_loss=0.2365, pruned_loss=0.0382, over 4724986.81 frames. ], batch size: 56, lr: 2.42e-03, grad_scale: 16.0 2023-10-04 00:53:44,492 WARNING [train.py:1204] (2/4) Exclude cut with ID _8696_210_1876_1_1528545632440_3345058_209-54069-0_sp1.1 from training. Duration: 0.6090625 2023-10-04 00:53:44,501 WARNING [train.py:1204] (2/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_369-325184-0_sp1.1 from training. Duration: 0.9 2023-10-04 00:53:44,569 WARNING [train.py:1204] (2/4) Exclude cut with ID _14687_210_5854_1_1530703838259_3664889_115-239046-0_sp0.9 from training. Duration: 0.7444375 2023-10-04 00:53:44,815 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.conv_skip_rate, batch_count=1467653.3333333333, ans=0.0 2023-10-04 00:53:45,888 WARNING [train.py:1204] (2/4) Exclude cut with ID _8064_210_5399_1_1528372299291_4282973_168-50337-0 from training. Duration: 0.84 2023-10-04 00:53:50,306 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.bypass.skip_rate, batch_count=1467653.3333333333, ans=0.07 2023-10-04 00:53:52,746 WARNING [train.py:1204] (2/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_909-64151-0_sp1.1 from training. Duration: 0.8545625 2023-10-04 00:53:52,763 WARNING [train.py:1204] (2/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_597-154640-0 from training. Duration: 0.9 2023-10-04 00:53:53,063 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.nonlin_attention.balancer.prob, batch_count=1467653.3333333333, ans=0.125 2023-10-04 00:53:57,291 WARNING [train.py:1204] (2/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_126-274694-0 from training. Duration: 0.98 2023-10-04 00:53:59,998 WARNING [train.py:1204] (2/4) Exclude cut with ID _21783_210_11357_1_1531742458730_3762679_344-269786-0_sp1.1 from training. Duration: 0.97275 2023-10-04 00:54:03,315 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_37818_210_4238_1_1533620910_4720083_322-253154-0_sp1.1 from training. Duration: 0.92725 2023-10-04 00:54:03,318 WARNING [train.py:1204] (2/4) Exclude cut with ID _58895_210_8750_1_1535184081320_3649089_221-15823-0_sp1.1 from training. Duration: 0.92725 2023-10-04 00:54:03,327 WARNING [train.py:1204] (2/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_9-21737-0_sp1.1 from training. Duration: 0.8818125 2023-10-04 00:54:03,360 WARNING [train.py:1204] (2/4) Exclude cut with ID _14069_210_5632_1_1530512692033_4803390_522-341588-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 00:54:04,777 WARNING [train.py:1204] (2/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_363-185703-0 from training. Duration: 0.95 2023-10-04 00:54:08,879 WARNING [train.py:1204] (2/4) Exclude cut with ID _41817_210_18835_1_1533434477160_7423049_265-76216-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 00:54:16,010 WARNING [train.py:1204] (2/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_841-341679-0 from training. Duration: 0.58 2023-10-04 00:54:16,125 WARNING [train.py:1204] (2/4) Exclude cut with ID _55429_210_10626_1_1534467588527_7174143_97-335084-0_sp1.1 from training. Duration: 0.8818125 2023-10-04 00:54:16,352 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.conv_module1.balancer1.min_positive, batch_count=1467786.6666666667, ans=0.025 2023-10-04 00:54:18,168 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.2.encoder.layers.1.feed_forward3.out_whiten, num_groups=1, num_channels=384, metric=9.54 vs. limit=15.0 2023-10-04 00:54:19,006 WARNING [train.py:1204] (2/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_690-126306-0_sp0.9 from training. Duration: 0.8666875 2023-10-04 00:54:19,018 WARNING [train.py:1204] (2/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_102-92751-0_sp1.1 from training. Duration: 0.9 2023-10-04 00:54:20,475 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_3787_210_2040_1_1526477544_5478586_603-120324-0_sp1.1 from training. Duration: 0.736375 2023-10-04 00:54:21,887 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_33348_210_5632_1_1533086912_3324030_85-28583-0 from training. Duration: 0.82 2023-10-04 00:54:22,119 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.attention_skip_rate, batch_count=1467786.6666666667, ans=0.0 2023-10-04 00:54:23,157 WARNING [train.py:1204] (2/4) Exclude cut with ID _60057_210_10640_1_1534903065793_3333150_362-230377-0_sp1.1 from training. Duration: 0.7909375 2023-10-04 00:54:23,320 WARNING [train.py:1204] (2/4) Exclude cut with ID _60113_210_16271_1_1535164212481_4386079_194-146486-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 00:54:23,517 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.feed_forward1.out_proj.dropout_p, batch_count=1467786.6666666667, ans=0.1 2023-10-04 00:54:24,566 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_53753_210_7234_1_1534320003_3594148_171-277668-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 00:54:24,592 WARNING [train.py:1204] (2/4) Exclude cut with ID _34423_210_8750_1_1533430798456_3330199_251-332788-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 00:54:27,305 WARNING [train.py:1204] (2/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_663-166973-0_sp0.9 from training. Duration: 0.888875 2023-10-04 00:54:28,912 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.self_attn_weights.pos_emb_skip_rate, batch_count=1467853.3333333333, ans=0.0 2023-10-04 00:54:31,272 WARNING [train.py:1204] (2/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_927-43827-0 from training. Duration: 0.74 2023-10-04 00:54:31,316 WARNING [train.py:1204] (2/4) Exclude cut with ID _28323_210_16187_1_1532480395667_4100300_306-74561-0_sp1.1 from training. Duration: 0.8545625 2023-10-04 00:54:33,407 WARNING [train.py:1204] (2/4) Exclude cut with ID _15037_210_2253_1_1530752294104_3267650_139-337989-0_sp1.1 from training. Duration: 0.92725 2023-10-04 00:54:33,426 WARNING [train.py:1204] (2/4) Exclude cut with ID _49039_210_6315_1_1533967537541_3846118_460-263670-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 00:54:34,604 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.645e+02 2.014e+02 2.206e+02 2.508e+02 4.663e+02, threshold=4.412e+02, percent-clipped=1.0 2023-10-04 00:54:36,400 WARNING [train.py:1204] (2/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_186-309161-0 from training. Duration: 0.54 2023-10-04 00:54:36,704 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.feed_forward1.hidden_balancer.prob, batch_count=1467853.3333333333, ans=0.125 2023-10-04 00:54:37,745 WARNING [train.py:1204] (2/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_87-278686-0_sp1.1 from training. Duration: 0.7 2023-10-04 00:54:40,954 WARNING [train.py:1204] (2/4) Exclude cut with ID _27052_210_5986_1_1532329197187_3585009_314-106499-0 from training. Duration: 0.92 2023-10-04 00:54:40,970 WARNING [train.py:1204] (2/4) Exclude cut with ID _3465_210_3525_1_1526175667060_3574049_412-287526-0_sp0.9 from training. Duration: 0.8 2023-10-04 00:54:42,646 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.0.bypass.scale_min, batch_count=1467920.0, ans=0.2 2023-10-04 00:54:42,655 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.1.balancer1.prob, batch_count=1467920.0, ans=0.125 2023-10-04 00:54:43,887 WARNING [train.py:1204] (2/4) Exclude cut with ID _22961_210_8254_1_1531976366672_4278180_317-199546-0 from training. Duration: 0.86 2023-10-04 00:54:47,338 WARNING [train.py:1204] (2/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_381-317791-0 from training. Duration: 0.73 2023-10-04 00:54:48,680 WARNING [train.py:1204] (2/4) Exclude cut with ID _44010_210_4525_1_1533884262964_4013620_560-310411-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 00:54:48,694 WARNING [train.py:1204] (2/4) Exclude cut with ID _5294_210_1747_1_1527249643928_3696179_342-270278-0_sp0.9 from training. Duration: 0.611125 2023-10-04 00:54:48,714 WARNING [train.py:1204] (2/4) Exclude cut with ID ID0473W0002-918951 from training. Duration: 0.924 2023-10-04 00:54:48,731 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1083-220122-0 from training. Duration: 0.51 2023-10-04 00:54:51,540 WARNING [train.py:1204] (2/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_138-154812-0 from training. Duration: 0.78 2023-10-04 00:54:54,156 WARNING [train.py:1204] (2/4) Exclude cut with ID _44982_210_1511_1_1534312820108_3726029_331-19420-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 00:54:55,808 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.1.feed_forward1.out_proj.dropout_p, batch_count=1467920.0, ans=0.1 2023-10-04 00:54:57,120 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_2655_210_2087_1_1525688959_3897400_202-147557-0_sp0.9 from training. Duration: 0.9444375 2023-10-04 00:54:58,384 INFO [train.py:1046] (2/4) Epoch 42, batch 2400, loss[loss=0.1532, simple_loss=0.2248, pruned_loss=0.04077, over 23719.00 frames. ], tot_loss[loss=0.1562, simple_loss=0.236, pruned_loss=0.03823, over 4727217.82 frames. ], batch size: 212, lr: 2.42e-03, grad_scale: 32.0 2023-10-04 00:55:02,092 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_23850_210_4238_1_1532232597_1632433_32-185135-0_sp0.9 from training. Duration: 0.9555625 2023-10-04 00:55:03,448 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_14056_210_4163_1_1530604483_7572827_46-93547-0_sp1.1 from training. Duration: 0.863625 2023-10-04 00:55:03,495 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7931_210_1794_1_1528594728_5564059_280-279560-0 from training. Duration: 0.98 2023-10-04 00:55:03,534 WARNING [train.py:1204] (2/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_937-208281-0 from training. Duration: 0.85 2023-10-04 00:55:11,274 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_14928_210_5632_1_1530773461_4714000_454-112198-0_sp0.9 from training. Duration: 0.7333125 2023-10-04 00:55:11,275 WARNING [train.py:1204] (2/4) Exclude cut with ID _32704_210_8954_1_1533017488840_6747370_944-153333-0_sp1.1 from training. Duration: 0.963625 2023-10-04 00:55:12,759 WARNING [train.py:1204] (2/4) Exclude cut with ID _14821_210_8750_1_1530680503991_6829359_2-227151-0 from training. Duration: 0.83 2023-10-04 00:55:12,783 WARNING [train.py:1204] (2/4) Exclude cut with ID _28461_210_2253_1_1532771775189_3041299_96-152158-0_sp1.1 from training. Duration: 0.77275 2023-10-04 00:55:14,612 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7837_210_1792_1_1528453445_5841305_654-54195-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 00:55:14,645 WARNING [train.py:1204] (2/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_344-152526-0 from training. Duration: 0.53 2023-10-04 00:55:18,718 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_30167_210_3528_1_1532428012_4772375_480-142230-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 00:55:20,174 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_160-218721-0 from training. Duration: 0.96 2023-10-04 00:55:20,405 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=1468053.3333333333, ans=0.1 2023-10-04 00:55:24,294 WARNING [train.py:1204] (2/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_65-88242-0_sp0.9 from training. Duration: 0.62225 2023-10-04 00:55:30,147 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_34886_210_10633_1_1533203882_1755661_76-156096-0 from training. Duration: 0.87 2023-10-04 00:55:32,861 WARNING [train.py:1204] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_188-38388-0_sp1.1 from training. Duration: 0.92725 2023-10-04 00:55:34,877 WARNING [train.py:1204] (2/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_81-289335-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 00:55:39,722 WARNING [train.py:1204] (2/4) Exclude cut with ID _59493_210_17282_1_1535078419077_3225879_110-8778-0_sp1.1 from training. Duration: 0.936375 2023-10-04 00:55:39,761 WARNING [train.py:1204] (2/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_188-260011-0 from training. Duration: 0.73 2023-10-04 00:55:39,797 WARNING [train.py:1204] (2/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_453-133983-0_sp0.9 from training. Duration: 0.8666875 2023-10-04 00:55:41,235 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.bypass_mid.scale_min, batch_count=1468120.0, ans=0.2 2023-10-04 00:55:49,838 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8054_210_7927_1_1528627721_4984094_553-17379-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 00:55:50,661 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.self_attn1.whiten.whitening_limit, batch_count=1468186.6666666667, ans=22.5 2023-10-04 00:55:51,489 WARNING [train.py:1204] (2/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_341-171072-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 00:55:54,267 WARNING [train.py:1204] (2/4) Exclude cut with ID _30800_210_22382_1_1532499347904_4515959_265-257286-0_sp1.1 from training. Duration: 0.963625 2023-10-04 00:55:55,584 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_403-39619-0_sp0.9 from training. Duration: 0.9333125 2023-10-04 00:55:55,589 WARNING [train.py:1204] (2/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_464-33931-0_sp0.9 from training. Duration: 0.6 2023-10-04 00:55:55,617 WARNING [train.py:1204] (2/4) Exclude cut with ID _60804_210_9642_1_1534831199859_7268299_632-143863-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 00:55:55,624 WARNING [train.py:1204] (2/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_431-310997-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 00:55:55,671 WARNING [train.py:1204] (2/4) Exclude cut with ID _31649_210_6228_1_1532584608063_4091078_114-263860-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 00:55:55,691 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6961_210_2087_1_1527937168_6815011_562-165518-0_sp0.9 from training. Duration: 0.8555625 2023-10-04 00:56:01,113 WARNING [train.py:1204] (2/4) Exclude cut with ID _27052_210_5986_1_1532329197187_3585009_338-325870-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 00:56:01,160 WARNING [train.py:1204] (2/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_469-94673-0_sp1.1 from training. Duration: 0.7181875 2023-10-04 00:56:02,665 WARNING [train.py:1204] (2/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_259-34038-0 from training. Duration: 0.74 2023-10-04 00:56:04,516 WARNING [train.py:1204] (2/4) Exclude cut with ID _14922_210_6228_1_1530707435273_3693310_388-129552-0 from training. Duration: 0.96 2023-10-04 00:56:06,001 WARNING [train.py:1204] (2/4) Exclude cut with ID _28394_210_8913_1_1532430049518_3650118_342-251455-0_sp1.1 from training. Duration: 0.936375 2023-10-04 00:56:06,027 WARNING [train.py:1204] (2/4) Exclude cut with ID _53731_210_7994_1_1534553979244_3757214_8-191390-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 00:56:06,066 WARNING [train.py:1204] (2/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_613-232856-0 from training. Duration: 0.54 2023-10-04 00:56:07,694 WARNING [train.py:1204] (2/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_557-349534-0 from training. Duration: 0.91 2023-10-04 00:56:07,702 WARNING [train.py:1204] (2/4) Exclude cut with ID _14224_210_4381_1_1530529062037_4264010_379-184223-0 from training. Duration: 0.85 2023-10-04 00:56:07,706 WARNING [train.py:1204] (2/4) Exclude cut with ID IC0125W0001-61807 from training. Duration: 0.966 2023-10-04 00:56:07,791 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_229-45300-0 from training. Duration: 0.57 2023-10-04 00:56:09,626 WARNING [train.py:1204] (2/4) Exclude cut with ID _30304_210_6319_1_1532476827261_7582060_1069-24743-0_sp1.1 from training. Duration: 0.92725 2023-10-04 00:56:10,957 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_41452_210_7291_1_1533437687_3941281_323-172873-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 00:56:10,967 WARNING [train.py:1204] (2/4) Exclude cut with ID _37086_210_9038_1_1532999540891_7021250_802-226709-0_sp1.1 from training. Duration: 0.97275 2023-10-04 00:56:12,399 WARNING [train.py:1204] (2/4) Exclude cut with ID IC0144W0500-71767 from training. Duration: 0.931875 2023-10-04 00:56:12,456 WARNING [train.py:1204] (2/4) Exclude cut with ID _39846_210_12651_1_1533605310265_2115212_233-129699-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 00:56:13,718 INFO [train.py:1046] (2/4) Epoch 42, batch 2450, loss[loss=0.1335, simple_loss=0.1861, pruned_loss=0.04044, over 19225.00 frames. ], tot_loss[loss=0.1554, simple_loss=0.2347, pruned_loss=0.03807, over 4719264.62 frames. ], batch size: 388, lr: 2.42e-03, grad_scale: 32.0 2023-10-04 00:56:13,737 WARNING [train.py:1204] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_574-45541-0_sp0.9 from training. Duration: 0.72225 2023-10-04 00:56:16,648 WARNING [train.py:1204] (2/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_372-298093-0_sp1.1 from training. Duration: 0.72725 2023-10-04 00:56:16,666 WARNING [train.py:1204] (2/4) Exclude cut with ID _62574_210_2655_1_1535180407058_3933249_34-26343-0_sp1.1 from training. Duration: 0.97275 2023-10-04 00:56:21,112 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_512-256238-0_sp1.1 from training. Duration: 0.963625 2023-10-04 00:56:21,117 WARNING [train.py:1204] (2/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_196-55502-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 00:56:21,418 INFO [scaling.py:1118] (2/4) WithLoss: name=encoder.encoders.3.encoder.layers.1.self_attn_weights, loss-sum=0.000e+00 2023-10-04 00:56:22,504 WARNING [train.py:1204] (2/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_195-167011-0 from training. Duration: 0.75 2023-10-04 00:56:24,199 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.conv_skip_rate, batch_count=1468320.0, ans=0.0 2023-10-04 00:56:25,567 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.conv_module2.balancer2.min_positive, batch_count=1468320.0, ans=0.05 2023-10-04 00:56:28,123 WARNING [train.py:1204] (2/4) Exclude cut with ID _13678_210_7497_1_1530530069226_3463170_345-230416-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 00:56:28,141 WARNING [train.py:1204] (2/4) Exclude cut with ID _51078_210_9070_1_1534161557424_3805250_225-327809-0_sp1.1 from training. Duration: 0.963625 2023-10-04 00:56:30,919 WARNING [train.py:1204] (2/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_18-189198-0_sp1.1 from training. Duration: 0.8181875 2023-10-04 00:56:30,936 WARNING [train.py:1204] (2/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_329-82235-0_sp1.1 from training. Duration: 0.7454375 2023-10-04 00:56:32,214 WARNING [train.py:1204] (2/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_726-270780-0_sp0.9 from training. Duration: 0.9555625 2023-10-04 00:56:32,242 WARNING [train.py:1204] (2/4) Exclude cut with ID _66873_210_5517_1_1535454136506_4293370_654-52713-0 from training. Duration: 0.77 2023-10-04 00:56:34,551 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.balancer2.prob, batch_count=1468386.6666666667, ans=0.125 2023-10-04 00:56:35,680 WARNING [train.py:1204] (2/4) Exclude cut with ID _20455_210_5219_1_1531565996775_8098100_191-16006-0_sp1.1 from training. Duration: 0.963625 2023-10-04 00:56:38,772 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_3171_210_2087_1_1526109084_2330007_66-260583-0_sp1.1 from training. Duration: 0.6545625 2023-10-04 00:56:38,835 WARNING [train.py:1204] (2/4) Exclude cut with ID _38670_210_15379_1_1533448810659_4391059_63-129006-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 00:56:41,662 WARNING [train.py:1204] (2/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_393-57888-0_sp0.9 from training. Duration: 0.711125 2023-10-04 00:56:42,927 WARNING [train.py:1204] (2/4) Exclude cut with ID _6275_210_2084_1_1528110062213_7669049_857-298942-0_sp0.9 from training. Duration: 0.9444375 2023-10-04 00:56:44,361 WARNING [train.py:1204] (2/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_239-22076-0_sp0.9 from training. Duration: 0.9444375 2023-10-04 00:56:44,385 WARNING [train.py:1204] (2/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_424-227947-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 00:56:45,911 WARNING [train.py:1204] (2/4) Exclude cut with ID _14069_210_5632_1_1530512692033_4803390_95-1896-0 from training. Duration: 0.98 2023-10-04 00:56:47,265 WARNING [train.py:1204] (2/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_96-244299-0_sp0.9 from training. Duration: 0.97775 2023-10-04 00:56:54,598 WARNING [train.py:1204] (2/4) Exclude cut with ID _46683_210_9642_1_1533730913492_4591116_562-319825-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 00:56:55,940 WARNING [train.py:1204] (2/4) Exclude cut with ID _64639_210_5816_1_1535367619437_3528000_284-173474-0_sp1.1 from training. Duration: 0.963625 2023-10-04 00:56:55,979 WARNING [train.py:1204] (2/4) Exclude cut with ID _29529_210_7234_1_1532393961455_3977460_497-100133-0_sp1.1 from training. Duration: 0.936375 2023-10-04 00:56:57,283 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_12868_210_6753_1_1530701932_530275_18-76322-0_sp0.9 from training. Duration: 0.9666875 2023-10-04 00:56:57,314 WARNING [train.py:1204] (2/4) Exclude cut with ID _50093_210_4220_1_1534417141207_3432299_116-303447-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 00:56:57,403 WARNING [train.py:1204] (2/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_44-79427-0_sp1.1 from training. Duration: 0.8909375 2023-10-04 00:56:58,805 WARNING [train.py:1204] (2/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_887-249555-0 from training. Duration: 0.77 2023-10-04 00:57:00,382 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.1.feed_forward1.out_proj.dropout_p, batch_count=1468520.0, ans=0.1 2023-10-04 00:57:02,039 WARNING [train.py:1204] (2/4) Exclude cut with ID _28461_210_2253_1_1532771775189_3041299_96-152158-0_sp0.9 from training. Duration: 0.9444375 2023-10-04 00:57:03,207 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.609e+02 2.034e+02 2.290e+02 2.639e+02 4.932e+02, threshold=4.579e+02, percent-clipped=1.0 2023-10-04 00:57:03,309 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_14058_210_4163_1_1530690953_6327180_49-210687-0_sp1.1 from training. Duration: 0.8545625 2023-10-04 00:57:06,227 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.0.conv_module1.balancer1.prob, batch_count=1468520.0, ans=0.125 2023-10-04 00:57:07,383 WARNING [train.py:1204] (2/4) Exclude cut with ID _40399_210_4353_1_1533305658194_3279406_351-127903-0_sp1.1 from training. Duration: 0.97275 2023-10-04 00:57:07,388 WARNING [train.py:1204] (2/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_184-206939-0_sp1.1 from training. Duration: 0.936375 2023-10-04 00:57:08,077 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.3.encoder.layers.2.conv_module1.whiten, num_groups=1, num_channels=512, metric=6.65 vs. limit=15.0 2023-10-04 00:57:12,605 WARNING [train.py:1204] (2/4) Exclude cut with ID _14100_210_8341_1_1530529320613_3678069_258-224599-0_sp1.1 from training. Duration: 0.7 2023-10-04 00:57:12,635 WARNING [train.py:1204] (2/4) Exclude cut with ID _6493_210_1747_1_1528459210424_4012470_324-323854-0 from training. Duration: 0.92 2023-10-04 00:57:14,049 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_12868_210_6753_1_1530702479_3610564_151-325626-0_sp1.1 from training. Duration: 0.8818125 2023-10-04 00:57:15,384 WARNING [train.py:1204] (2/4) Exclude cut with ID _14528_210_8254_1_1530669700514_4306939_106-43145-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 00:57:15,398 WARNING [train.py:1204] (2/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_686-54383-0 from training. Duration: 0.87 2023-10-04 00:57:15,447 WARNING [train.py:1204] (2/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_801-102362-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 00:57:15,508 WARNING [train.py:1204] (2/4) Exclude cut with ID _48677_210_5663_1_1533902405919_3892989_408-40740-0_sp0.9 from training. Duration: 0.92225 2023-10-04 00:57:19,509 WARNING [train.py:1204] (2/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_448-212135-0_sp1.1 from training. Duration: 0.82725 2023-10-04 00:57:20,991 WARNING [train.py:1204] (2/4) Exclude cut with ID _59170_210_6721_1_1534983633960_4203335_320-124047-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 00:57:21,025 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_4692_210_3528_1_1526899755_4323254_107-44401-0_sp1.1 from training. Duration: 0.7818125 2023-10-04 00:57:25,568 WARNING [train.py:1204] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_731-2428-0 from training. Duration: 0.83 2023-10-04 00:57:25,841 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.conv_skip_rate, batch_count=1468653.3333333333, ans=0.0 2023-10-04 00:57:26,967 INFO [train.py:1046] (2/4) Epoch 42, batch 2500, loss[loss=0.1588, simple_loss=0.249, pruned_loss=0.03431, over 24655.00 frames. ], tot_loss[loss=0.1549, simple_loss=0.234, pruned_loss=0.0379, over 4715229.93 frames. ], batch size: 73, lr: 2.42e-03, grad_scale: 32.0 2023-10-04 00:57:27,066 WARNING [train.py:1204] (2/4) Exclude cut with ID _29857_210_20034_1_1532395886753_3798160_335-163562-0_sp1.1 from training. Duration: 0.77275 2023-10-04 00:57:33,241 WARNING [train.py:1204] (2/4) Exclude cut with ID _14832_210_8750_1_1530691351539_3171348_171-275004-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 00:57:36,167 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.nonlin_attention.balancer.prob, batch_count=1468653.3333333333, ans=0.125 2023-10-04 00:57:42,620 WARNING [train.py:1204] (2/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_105-328174-0_sp0.9 from training. Duration: 0.8444375 2023-10-04 00:57:43,932 WARNING [train.py:1204] (2/4) Exclude cut with ID _68965_210_1883_1_1535698962272_3497490_164-118421-0_sp1.1 from training. Duration: 0.936375 2023-10-04 00:57:45,288 WARNING [train.py:1204] (2/4) Exclude cut with ID _19896_210_1883_1_1531709022662_6649569_380-24170-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 00:57:45,293 WARNING [train.py:1204] (2/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_505-205270-0_sp1.1 from training. Duration: 0.3454375 2023-10-04 00:57:45,610 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.bypass.skip_rate, batch_count=1468720.0, ans=0.07 2023-10-04 00:57:51,008 WARNING [train.py:1204] (2/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_427-208901-0_sp1.1 from training. Duration: 0.6181875 2023-10-04 00:57:51,059 WARNING [train.py:1204] (2/4) Exclude cut with ID _23818_210_6228_1_1531917063884_3676920_85-326916-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 00:57:51,913 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.0.layers.0.whiten, num_groups=1, num_channels=192, metric=4.68 vs. limit=12.0 2023-10-04 00:57:52,349 WARNING [train.py:1204] (2/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_101-293367-0_sp1.1 from training. Duration: 0.57275 2023-10-04 00:57:52,357 WARNING [train.py:1204] (2/4) Exclude cut with ID _5409_210_1388_1_1527292850189_5865191_178-127970-0_sp1.1 from training. Duration: 0.5181875 2023-10-04 00:57:53,707 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_403-39619-0 from training. Duration: 0.84 2023-10-04 00:57:53,799 WARNING [train.py:1204] (2/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_427-87841-0_sp1.1 from training. Duration: 0.963625 2023-10-04 00:57:55,457 WARNING [train.py:1204] (2/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_286-68181-0_sp1.1 from training. Duration: 0.97275 2023-10-04 00:57:55,498 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6673_210_3273_1_1527767790_3173240_327-306185-0 from training. Duration: 0.83 2023-10-04 00:57:56,756 WARNING [train.py:1204] (2/4) Exclude cut with ID _3675_210_4557_1_1526639934408_4182089_106-203713-0_sp1.1 from training. Duration: 0.963625 2023-10-04 00:57:56,817 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_53071_210_16554_1_1534493434_1276796_130-36076-0 from training. Duration: 0.95 2023-10-04 00:57:56,854 WARNING [train.py:1204] (2/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_586-93054-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 00:58:01,106 WARNING [train.py:1204] (2/4) Exclude cut with ID _23825_210_13449_1_1531915313526_3497150_262-10571-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 00:58:02,974 WARNING [train.py:1204] (2/4) Exclude cut with ID _28783_210_7943_1_1532671164808_7155479_432-67885-0_sp1.1 from training. Duration: 0.97275 2023-10-04 00:58:04,462 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6287_210_3273_1_1527509506_2653716_182-199887-0_sp0.9 from training. Duration: 0.5666875 2023-10-04 00:58:04,514 WARNING [train.py:1204] (2/4) Exclude cut with ID _3231_210_2106_1_1526205864934_6784220_171-256004-0 from training. Duration: 0.86 2023-10-04 00:58:05,816 WARNING [train.py:1204] (2/4) Exclude cut with ID _69051_210_1531_1_1535885566901_3829538_332-92090-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 00:58:07,188 WARNING [train.py:1204] (2/4) Exclude cut with ID _3675_210_4557_1_1526639934408_4182089_104-324125-0_sp1.1 from training. Duration: 0.963625 2023-10-04 00:58:10,472 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_41764_210_2521_1_1533686110_4089313_501-2119-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 00:58:15,908 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_14056_210_4163_1_1530604483_7572827_56-100988-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 00:58:18,804 WARNING [train.py:1204] (2/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_398-239819-0_sp1.1 from training. Duration: 0.9 2023-10-04 00:58:24,278 WARNING [train.py:1204] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_74-48717-0_sp1.1 from training. Duration: 0.52725 2023-10-04 00:58:25,775 WARNING [train.py:1204] (2/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_177-49092-0 from training. Duration: 0.91 2023-10-04 00:58:27,532 WARNING [train.py:1204] (2/4) Exclude cut with ID _62337_210_12392_1_1535009273105_3412229_44-176353-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 00:58:27,541 WARNING [train.py:1204] (2/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_110-130961-0_sp1.1 from training. Duration: 0.42725 2023-10-04 00:58:27,863 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.nonlin_attention.balancer.prob, batch_count=1468920.0, ans=0.125 2023-10-04 00:58:28,888 WARNING [train.py:1204] (2/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_37-189262-0_sp1.1 from training. Duration: 0.8909375 2023-10-04 00:58:28,889 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_3172_210_2087_1_1526292929_3987865_197-97762-0_sp0.9 from training. Duration: 0.8333125 2023-10-04 00:58:30,281 WARNING [train.py:1204] (2/4) Exclude cut with ID IC0677W0065-334897 from training. Duration: 0.896 2023-10-04 00:58:30,281 WARNING [train.py:1204] (2/4) Exclude cut with ID IC0677W0080-334912 from training. Duration: 0.768 2023-10-04 00:58:30,286 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_120-41668-0 from training. Duration: 0.7 2023-10-04 00:58:34,832 WARNING [train.py:1204] (2/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_460-205287-0_sp1.1 from training. Duration: 0.963625 2023-10-04 00:58:36,222 WARNING [train.py:1204] (2/4) Exclude cut with ID _14632_210_9988_1_1530691279392_3923200_102-204477-0 from training. Duration: 0.97 2023-10-04 00:58:36,227 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_34815_210_6388_1_1533381320_3571903_279-305565-0 from training. Duration: 0.92 2023-10-04 00:58:36,291 WARNING [train.py:1204] (2/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_91-148831-0_sp1.1 from training. Duration: 0.9 2023-10-04 00:58:37,615 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6291_210_3273_1_1527683649_1078498_82-221196-0 from training. Duration: 0.76 2023-10-04 00:58:40,300 INFO [train.py:1046] (2/4) Epoch 42, batch 2550, loss[loss=0.1495, simple_loss=0.2361, pruned_loss=0.03147, over 24625.00 frames. ], tot_loss[loss=0.1554, simple_loss=0.2347, pruned_loss=0.03807, over 4725370.49 frames. ], batch size: 65, lr: 2.42e-03, grad_scale: 16.0 2023-10-04 00:58:40,433 WARNING [train.py:1204] (2/4) Exclude cut with ID _38020_210_5986_1_1533112252259_7006089_493-102745-0 from training. Duration: 0.98 2023-10-04 00:58:43,642 WARNING [train.py:1204] (2/4) Exclude cut with ID _61133_210_9969_1_1535196777227_3696409_40-93442-0_sp1.1 from training. Duration: 0.92725 2023-10-04 00:58:46,935 WARNING [train.py:1204] (2/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_45-258852-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 00:58:46,968 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_53071_210_16554_1_1534493434_1276796_130-36076-0_sp1.1 from training. Duration: 0.863625 2023-10-04 00:58:47,080 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder_embed.conv.2.prob, batch_count=1468986.6666666667, ans=0.125 2023-10-04 00:58:49,693 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8694_210_6400_1_1529065754_3260756_332-13553-0_sp1.1 from training. Duration: 0.92725 2023-10-04 00:58:49,775 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_405-339986-0 from training. Duration: 0.51 2023-10-04 00:58:51,099 WARNING [train.py:1204] (2/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_927-43827-0_sp1.1 from training. Duration: 0.67275 2023-10-04 00:58:53,996 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_397-147196-0 from training. Duration: 0.8 2023-10-04 00:58:55,358 WARNING [train.py:1204] (2/4) Exclude cut with ID 3557-8342-0013-54691-0_sp1.1 from training. Duration: 0.836375 2023-10-04 00:58:56,742 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_47438_210_5399_1_1533797471_4513356_626-127722-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 00:58:58,703 WARNING [train.py:1204] (2/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_256-323883-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 00:58:58,708 WARNING [train.py:1204] (2/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_839-173017-0_sp0.9 from training. Duration: 0.4555625 2023-10-04 00:58:58,733 WARNING [train.py:1204] (2/4) Exclude cut with ID _5289_210_2084_1_1527678202736_3779299_392-261988-0_sp1.1 from training. Duration: 0.5909375 2023-10-04 00:59:00,006 WARNING [train.py:1204] (2/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_119-224384-0_sp1.1 from training. Duration: 0.936375 2023-10-04 00:59:00,051 WARNING [train.py:1204] (2/4) Exclude cut with ID _57523_210_1856_1_1534766294545_3732989_262-267933-0_sp1.1 from training. Duration: 0.963625 2023-10-04 00:59:03,313 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_35361_210_4238_1_1533262810_7793476_675-87012-0_sp0.9 from training. Duration: 0.988875 2023-10-04 00:59:03,330 WARNING [train.py:1204] (2/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_17-90541-0 from training. Duration: 0.94 2023-10-04 00:59:03,364 WARNING [train.py:1204] (2/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_535-14410-0_sp1.1 from training. Duration: 0.42725 2023-10-04 00:59:04,645 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_24673_210_6400_1_1531968633_778659_94-154511-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 00:59:04,648 WARNING [train.py:1204] (2/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_212-10501-0 from training. Duration: 0.94 2023-10-04 00:59:16,574 WARNING [train.py:1204] (2/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_383-285196-0_sp1.1 from training. Duration: 0.8818125 2023-10-04 00:59:19,473 WARNING [train.py:1204] (2/4) Exclude cut with ID _45560_210_21982_1_1533812603951_3584999_238-47020-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 00:59:19,481 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_21500_210_2054_1_1532149310_2716758_278-18053-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 00:59:19,488 WARNING [train.py:1204] (2/4) Exclude cut with ID _56235_210_10640_1_1534557335235_4161099_453-9626-0_sp1.1 from training. Duration: 0.936375 2023-10-04 00:59:20,872 WARNING [train.py:1204] (2/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_153-97441-0_sp1.1 from training. Duration: 0.5090625 2023-10-04 00:59:27,929 WARNING [train.py:1204] (2/4) Exclude cut with ID _67725_210_27767_1_1535608756671_5105359_431-120895-0_sp1.1 from training. Duration: 0.963625 2023-10-04 00:59:31,776 WARNING [train.py:1204] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_574-45541-0_sp1.1 from training. Duration: 0.5909375 2023-10-04 00:59:31,799 WARNING [train.py:1204] (2/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_162-103620-0_sp1.1 from training. Duration: 0.8454375 2023-10-04 00:59:33,036 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.635e+02 1.986e+02 2.220e+02 2.495e+02 3.870e+02, threshold=4.440e+02, percent-clipped=0.0 2023-10-04 00:59:33,118 WARNING [train.py:1204] (2/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_734-129761-0_sp1.1 from training. Duration: 0.7454375 2023-10-04 00:59:33,160 WARNING [train.py:1204] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_620-276793-0_sp1.1 from training. Duration: 0.536375 2023-10-04 00:59:33,200 WARNING [train.py:1204] (2/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_153-97441-0_sp0.9 from training. Duration: 0.62225 2023-10-04 00:59:36,036 WARNING [train.py:1204] (2/4) Exclude cut with ID _12733_210_8341_1_1530167424556_3646380_523-85621-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 00:59:36,056 WARNING [train.py:1204] (2/4) Exclude cut with ID _64048_210_10626_1_1535198382802_3835179_352-179834-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 00:59:41,652 WARNING [train.py:1204] (2/4) Exclude cut with ID _55162_210_17071_1_1534408248583_3753480_43-242364-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 00:59:41,672 WARNING [train.py:1204] (2/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_661-264857-0 from training. Duration: 0.93 2023-10-04 00:59:41,678 WARNING [train.py:1204] (2/4) Exclude cut with ID _6274_210_2084_1_1527937583837_7494020_325-237800-0_sp1.1 from training. Duration: 0.97275 2023-10-04 00:59:41,717 WARNING [train.py:1204] (2/4) Exclude cut with ID _55328_210_27767_1_1534580973260_4088170_385-171459-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 00:59:43,024 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_3780_210_2087_1_1526467389_2145572_176-107140-0_sp1.1 from training. Duration: 0.62725 2023-10-04 00:59:46,626 WARNING [train.py:1204] (2/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_375-244976-0_sp1.1 from training. Duration: 0.6818125 2023-10-04 00:59:48,066 WARNING [train.py:1204] (2/4) Exclude cut with ID _35941_210_15379_1_1532995294515_4023089_309-33162-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 00:59:53,630 WARNING [train.py:1204] (2/4) Exclude cut with ID _14459_210_2228_1_1531443641946_7516700_1185-22038-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 00:59:54,135 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.5.encoder.layers.0.conv_module1.whiten, num_groups=1, num_channels=256, metric=7.40 vs. limit=15.0 2023-10-04 00:59:54,958 INFO [train.py:1046] (2/4) Epoch 42, batch 2600, loss[loss=0.1649, simple_loss=0.2328, pruned_loss=0.04851, over 23694.00 frames. ], tot_loss[loss=0.1566, simple_loss=0.2358, pruned_loss=0.03873, over 4713669.88 frames. ], batch size: 164, lr: 2.42e-03, grad_scale: 16.0 2023-10-04 00:59:56,359 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_1197-69460-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 00:59:58,122 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9012W0022-501157 from training. Duration: 0.896 2023-10-04 01:00:00,878 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9023W0009-506302 from training. Duration: 0.896 2023-10-04 01:00:00,902 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_2821_210_2715_1_1526108385_3763560_73-296673-0_sp1.1 from training. Duration: 0.7545625 2023-10-04 01:00:00,934 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9025W0113-507442 from training. Duration: 0.896 2023-10-04 01:00:01,119 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.1.conv_skip_rate, batch_count=1469320.0, ans=0.0 2023-10-04 01:00:02,784 WARNING [train.py:1204] (2/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_739-2098-0 from training. Duration: 0.92 2023-10-04 01:00:02,804 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9029W0332-509722 from training. Duration: 0.768 2023-10-04 01:00:05,948 WARNING [train.py:1204] (2/4) Exclude cut with ID _16908_210_5724_1_1531119157248_3772700_68-101953-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 01:00:05,976 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9042W0011-515617 from training. Duration: 0.768 2023-10-04 01:00:07,381 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_3019_210_2028_1_1526044245_3264865_191-72235-0 from training. Duration: 0.97 2023-10-04 01:00:08,764 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9052W0006-520792 from training. Duration: 0.896 2023-10-04 01:00:10,227 WARNING [train.py:1204] (2/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_245-174553-0_sp1.1 from training. Duration: 0.8 2023-10-04 01:00:11,517 WARNING [train.py:1204] (2/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_202-67017-0 from training. Duration: 0.82 2023-10-04 01:00:12,922 WARNING [train.py:1204] (2/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_304-59227-0 from training. Duration: 0.77 2023-10-04 01:00:14,678 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_3133_210_2087_1_1526106446_2324690_246-125000-0_sp0.9 from training. Duration: 0.688875 2023-10-04 01:00:15,952 WARNING [train.py:1204] (2/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_243-42116-0 from training. Duration: 0.73 2023-10-04 01:00:17,819 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9087W0121-538492 from training. Duration: 0.896 2023-10-04 01:00:19,109 WARNING [train.py:1204] (2/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_120-76843-0 from training. Duration: 0.55 2023-10-04 01:00:25,999 WARNING [train.py:1204] (2/4) Exclude cut with ID _7852_210_5219_1_1528286471316_8139400_850-304621-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 01:00:26,018 WARNING [train.py:1204] (2/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_154-285415-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 01:00:26,035 WARNING [train.py:1204] (2/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_538-278194-0_sp1.1 from training. Duration: 0.92725 2023-10-04 01:00:26,036 WARNING [train.py:1204] (2/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_119-207162-0 from training. Duration: 0.71 2023-10-04 01:00:28,821 WARNING [train.py:1204] (2/4) Exclude cut with ID _2334_210_2667_1_1525608238880_4705129_459-104997-0_sp1.1 from training. Duration: 0.863625 2023-10-04 01:00:34,397 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9147W0019-567847 from training. Duration: 0.768 2023-10-04 01:00:37,763 WARNING [train.py:1204] (2/4) Exclude cut with ID _69884_210_9771_1_1535846392818_4166169_228-189439-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 01:00:39,610 WARNING [train.py:1204] (2/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_196-77172-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 01:00:39,684 WARNING [train.py:1204] (2/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_625-338546-0 from training. Duration: 0.88 2023-10-04 01:00:41,022 WARNING [train.py:1204] (2/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_193-327630-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 01:00:41,023 WARNING [train.py:1204] (2/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_391-24006-0_sp1.1 from training. Duration: 0.92725 2023-10-04 01:00:41,099 WARNING [train.py:1204] (2/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_197-28164-0 from training. Duration: 0.8 2023-10-04 01:00:44,299 WARNING [train.py:1204] (2/4) Exclude cut with ID _8395_210_1531_1_1528597871523_4411579_431-132470-0_sp1.1 from training. Duration: 0.67275 2023-10-04 01:00:44,331 WARNING [train.py:1204] (2/4) Exclude cut with ID _35350_210_16040_1_1533040191183_4075299_465-248326-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 01:00:46,155 WARNING [train.py:1204] (2/4) Exclude cut with ID _16756_210_4915_1_1531047803763_7104259_101-141922-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 01:00:48,766 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9212W0005-600172 from training. Duration: 0.896 2023-10-04 01:00:48,804 WARNING [train.py:1204] (2/4) Exclude cut with ID _63186_210_2084_1_1535414479705_3908649_378-166820-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 01:00:48,825 WARNING [train.py:1204] (2/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_52-211683-0_sp1.1 from training. Duration: 0.8181875 2023-10-04 01:00:53,528 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.feed_forward3.out_whiten.whitening_limit, batch_count=1469586.6666666667, ans=15.0 2023-10-04 01:00:54,342 WARNING [train.py:1204] (2/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_1147-237729-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 01:00:54,410 WARNING [train.py:1204] (2/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_234-14174-0_sp0.9 from training. Duration: 0.911125 2023-10-04 01:00:54,422 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_454-276429-0 from training. Duration: 0.87 2023-10-04 01:00:57,158 WARNING [train.py:1204] (2/4) Exclude cut with ID _53713_210_24116_1_1534314487762_7416751_755-165313-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 01:00:58,469 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8278_210_2550_1_1528637099_4220413_436-317072-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 01:00:59,694 WARNING [train.py:1204] (2/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_28-324729-0_sp1.1 from training. Duration: 0.8545625 2023-10-04 01:01:02,675 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.feed_forward3.hidden_balancer.prob, batch_count=1469586.6666666667, ans=0.125 2023-10-04 01:01:07,130 WARNING [train.py:1204] (2/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_326-165900-0 from training. Duration: 0.69 2023-10-04 01:01:07,205 WARNING [train.py:1204] (2/4) Exclude cut with ID _53489_210_6228_1_1534298357169_3835970_96-30246-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 01:01:08,417 INFO [train.py:1046] (2/4) Epoch 42, batch 2650, loss[loss=0.1396, simple_loss=0.2222, pruned_loss=0.02845, over 24614.00 frames. ], tot_loss[loss=0.1565, simple_loss=0.2363, pruned_loss=0.0383, over 4726047.61 frames. ], batch size: 60, lr: 2.42e-03, grad_scale: 16.0 2023-10-04 01:01:09,850 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8275_210_4198_1_1528455320_3892571_491-17707-0_sp1.1 from training. Duration: 0.5818125 2023-10-04 01:01:13,238 WARNING [train.py:1204] (2/4) Exclude cut with ID _14348_210_8341_1_1530599434996_3778640_240-232370-0 from training. Duration: 0.84 2023-10-04 01:01:13,255 WARNING [train.py:1204] (2/4) Exclude cut with ID _45012_210_12651_1_1533608435136_5371380_54-112623-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 01:01:15,177 WARNING [train.py:1204] (2/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_328-264776-0_sp1.1 from training. Duration: 0.6090625 2023-10-04 01:01:17,081 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9325W0001-648427 from training. Duration: 0.768 2023-10-04 01:01:17,097 WARNING [train.py:1204] (2/4) Exclude cut with ID _56480_210_4220_1_1534679942079_3075409_49-74717-0_sp1.1 from training. Duration: 0.936375 2023-10-04 01:01:18,439 WARNING [train.py:1204] (2/4) Exclude cut with ID _59500_210_8254_1_1534809610680_3736009_6-241869-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 01:01:19,912 WARNING [train.py:1204] (2/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_172-215932-0_sp0.9 from training. Duration: 0.7333125 2023-10-04 01:01:21,432 WARNING [train.py:1204] (2/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_17-90541-0_sp1.1 from training. Duration: 0.8545625 2023-10-04 01:01:22,886 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_43831_210_2158_1_1533725502_5872791_525-89447-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 01:01:25,461 WARNING [train.py:1204] (2/4) Exclude cut with ID _12302_210_5219_1_1529924446466_8107280_907-182949-0 from training. Duration: 0.92 2023-10-04 01:01:25,481 WARNING [train.py:1204] (2/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_402-102311-0_sp0.9 from training. Duration: 0.7444375 2023-10-04 01:01:25,524 WARNING [train.py:1204] (2/4) Exclude cut with ID _8021_210_7994_1_1528376451358_3653069_179-45206-0_sp0.9 from training. Duration: 0.9555625 2023-10-04 01:01:29,448 WARNING [train.py:1204] (2/4) Exclude cut with ID _40137_210_12210_1_1533290484046_3795180_249-187059-0 from training. Duration: 0.88 2023-10-04 01:01:30,838 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9653W0194-675472 from training. Duration: 0.9658125 2023-10-04 01:01:33,579 WARNING [train.py:1204] (2/4) Exclude cut with ID _51332_210_9988_1_1534244371836_3801270_336-255604-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 01:01:35,048 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_20120_210_6753_1_1531643035_5482859_510-96996-0 from training. Duration: 0.87 2023-10-04 01:01:36,308 WARNING [train.py:1204] (2/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_1167-20437-0_sp1.1 from training. Duration: 0.92725 2023-10-04 01:01:36,363 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_3475_210_2028_1_1526193578_3622569_265-276625-0 from training. Duration: 0.76 2023-10-04 01:01:38,696 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.bypass.skip_rate, batch_count=1469786.6666666667, ans=0.09899494936611666 2023-10-04 01:01:42,490 WARNING [train.py:1204] (2/4) Exclude cut with ID _14860_210_4915_1_1530702020340_7343240_470-172418-0_sp1.1 from training. Duration: 0.97275 2023-10-04 01:01:42,498 WARNING [train.py:1204] (2/4) Exclude cut with ID _5444_210_2151_1_1527418808464_5312210_581-315411-0_sp1.1 from training. Duration: 0.563625 2023-10-04 01:01:42,523 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_34450_210_9547_1_1533099443_4109050_519-342439-0_sp1.1 from training. Duration: 0.97275 2023-10-04 01:01:42,548 WARNING [train.py:1204] (2/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_50-175961-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 01:01:46,195 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.conv_module2.balancer1.prob, batch_count=1469786.6666666667, ans=0.125 2023-10-04 01:01:47,799 WARNING [train.py:1204] (2/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_372-6869-0 from training. Duration: 0.44 2023-10-04 01:01:47,811 WARNING [train.py:1204] (2/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_3-237434-0 from training. Duration: 0.76 2023-10-04 01:01:49,274 WARNING [train.py:1204] (2/4) Exclude cut with ID _8772_210_1850_1_1528635595972_3664011_247-336448-0_sp1.1 from training. Duration: 0.67275 2023-10-04 01:01:53,454 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_427-78097-0 from training. Duration: 0.8 2023-10-04 01:01:53,571 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.1.conv_module2.balancer1.prob, batch_count=1469853.3333333333, ans=0.125 2023-10-04 01:01:54,693 WARNING [train.py:1204] (2/4) Exclude cut with ID _69884_210_9771_1_1535846392818_4166169_466-252727-0_sp1.1 from training. Duration: 0.97275 2023-10-04 01:01:54,759 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_15972_210_6400_1_1530877934_3405692_316-35478-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 01:01:54,794 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_809-17156-0_sp0.9 from training. Duration: 0.811125 2023-10-04 01:01:56,038 WARNING [train.py:1204] (2/4) Exclude cut with ID _57572_210_5517_1_1534921219624_3809484_594-263474-0_sp1.1 from training. Duration: 0.936375 2023-10-04 01:01:56,075 WARNING [train.py:1204] (2/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_190-228204-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 01:01:57,448 WARNING [train.py:1204] (2/4) Exclude cut with ID _19514_210_9259_1_1531530071330_5084230_117-21193-0_sp1.1 from training. Duration: 0.936375 2023-10-04 01:01:58,824 WARNING [train.py:1204] (2/4) Exclude cut with ID _69584_210_5219_1_1535785133775_4044450_190-21315-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 01:01:58,915 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_53071_210_16554_1_1534493434_1276796_184-302050-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 01:02:00,143 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.561e+02 1.901e+02 2.089e+02 2.276e+02 3.340e+02, threshold=4.178e+02, percent-clipped=0.0 2023-10-04 01:02:00,239 WARNING [train.py:1204] (2/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_120-76843-0_sp1.1 from training. Duration: 0.5 2023-10-04 01:02:00,321 WARNING [train.py:1204] (2/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_318-162017-0_sp1.1 from training. Duration: 0.82725 2023-10-04 01:02:01,697 WARNING [train.py:1204] (2/4) Exclude cut with ID _46067_210_4220_1_1534125571066_3307450_79-46864-0_sp1.1 from training. Duration: 0.92725 2023-10-04 01:02:03,117 WARNING [train.py:1204] (2/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_358-215558-0_sp1.1 from training. Duration: 0.8454375 2023-10-04 01:02:04,456 WARNING [train.py:1204] (2/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_621-66764-0_sp1.1 from training. Duration: 0.92725 2023-10-04 01:02:05,876 WARNING [train.py:1204] (2/4) Exclude cut with ID _42863_210_9070_1_1533902398788_3696920_338-156851-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 01:02:07,146 WARNING [train.py:1204] (2/4) Exclude cut with ID _5289_210_2084_1_1527678202736_3779299_392-261988-0_sp0.9 from training. Duration: 0.72225 2023-10-04 01:02:10,466 WARNING [train.py:1204] (2/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_409-84682-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 01:02:11,794 WARNING [train.py:1204] (2/4) Exclude cut with ID _9235_210_5219_1_1528891154253_8046120_30-326681-0_sp0.9 from training. Duration: 0.92225 2023-10-04 01:02:11,798 WARNING [train.py:1204] (2/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_389-184695-0_sp1.1 from training. Duration: 0.92725 2023-10-04 01:02:11,843 WARNING [train.py:1204] (2/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_442-320317-0 from training. Duration: 0.82 2023-10-04 01:02:15,293 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.balancer2.prob, batch_count=1469920.0, ans=0.125 2023-10-04 01:02:16,372 WARNING [train.py:1204] (2/4) Exclude cut with ID _44778_210_2054_1_1533798127203_7443090_756-29911-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 01:02:18,171 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_401-112036-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 01:02:18,354 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=1469920.0, ans=0.1 2023-10-04 01:02:20,109 WARNING [train.py:1204] (2/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_181-308827-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 01:02:21,514 WARNING [train.py:1204] (2/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_332-258725-0_sp1.1 from training. Duration: 0.963625 2023-10-04 01:02:21,593 WARNING [train.py:1204] (2/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_188-260011-0_sp0.9 from training. Duration: 0.811125 2023-10-04 01:02:22,894 INFO [train.py:1046] (2/4) Epoch 42, batch 2700, loss[loss=0.148, simple_loss=0.2263, pruned_loss=0.03485, over 23631.00 frames. ], tot_loss[loss=0.1568, simple_loss=0.2373, pruned_loss=0.03813, over 4725903.74 frames. ], batch size: 134, lr: 2.42e-03, grad_scale: 16.0 2023-10-04 01:02:22,965 WARNING [train.py:1204] (2/4) Exclude cut with ID _49039_210_6315_1_1533967537541_3846118_99-302239-0_sp1.1 from training. Duration: 0.963625 2023-10-04 01:02:24,405 WARNING [train.py:1204] (2/4) Exclude cut with ID _4122_210_2084_1_1526713164417_4353280_312-294828-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 01:02:24,411 WARNING [train.py:1204] (2/4) Exclude cut with ID _14922_210_6228_1_1530707435273_3693310_303-228738-0 from training. Duration: 0.84 2023-10-04 01:02:27,128 WARNING [train.py:1204] (2/4) Exclude cut with ID _14349_210_8341_1_1530615710818_3442470_115-285975-0_sp0.9 from training. Duration: 0.9555625 2023-10-04 01:02:28,589 WARNING [train.py:1204] (2/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_125-144174-0_sp1.1 from training. Duration: 0.3545625 2023-10-04 01:02:29,980 WARNING [train.py:1204] (2/4) Exclude cut with ID _53565_210_10635_1_1534337218335_3820259_351-54570-0_sp1.1 from training. Duration: 0.97275 2023-10-04 01:02:30,008 WARNING [train.py:1204] (2/4) Exclude cut with ID _28317_210_12742_1_1532480302381_8510325_410-40065-0_sp1.1 from training. Duration: 0.963625 2023-10-04 01:02:30,035 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_38965_210_4852_1_1533174906_3484735_167-339442-0_sp1.1 from training. Duration: 0.963625 2023-10-04 01:02:31,427 WARNING [train.py:1204] (2/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_265-164486-0_sp1.1 from training. Duration: 0.87275 2023-10-04 01:02:31,444 WARNING [train.py:1204] (2/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_725-182484-0_sp1.1 from training. Duration: 0.92725 2023-10-04 01:02:31,474 WARNING [train.py:1204] (2/4) Exclude cut with ID _28317_210_12742_1_1532480302381_8510325_607-213951-0_sp0.9 from training. Duration: 0.9333125 2023-10-04 01:02:32,730 WARNING [train.py:1204] (2/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_605-133395-0_sp1.1 from training. Duration: 0.6 2023-10-04 01:02:32,751 WARNING [train.py:1204] (2/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_351-294650-0 from training. Duration: 0.63 2023-10-04 01:02:34,075 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_14953_210_2151_1_1530701710_5616165_710-165400-0_sp1.1 from training. Duration: 0.7909375 2023-10-04 01:02:35,451 WARNING [train.py:1204] (2/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_237-58433-0_sp1.1 from training. Duration: 0.67275 2023-10-04 01:02:35,555 WARNING [train.py:1204] (2/4) Exclude cut with ID _5190_210_4557_1_1527244368799_373439_28-197030-0_sp1.1 from training. Duration: 0.6545625 2023-10-04 01:02:36,858 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_743-327651-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 01:02:40,136 WARNING [train.py:1204] (2/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_76-203867-0_sp0.9 from training. Duration: 0.8 2023-10-04 01:02:40,400 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.attention_skip_rate, batch_count=1470053.3333333333, ans=0.0 2023-10-04 01:02:41,997 WARNING [train.py:1204] (2/4) Exclude cut with ID _14603_210_4220_1_1530615688441_3097319_274-274798-0 from training. Duration: 0.98 2023-10-04 01:02:42,057 WARNING [train.py:1204] (2/4) Exclude cut with ID _14087_210_2253_1_1530529224512_3014029_226-242140-0_sp1.1 from training. Duration: 0.736375 2023-10-04 01:02:46,272 WARNING [train.py:1204] (2/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_352-59958-0_sp1.1 from training. Duration: 0.7818125 2023-10-04 01:02:46,277 WARNING [train.py:1204] (2/4) Exclude cut with ID _39411_210_5986_1_1533538825030_3343649_191-251819-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 01:02:51,458 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=1470120.0, ans=0.1 2023-10-04 01:02:53,946 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7046_210_6753_1_1527895337_6920917_642-106799-0_sp1.1 from training. Duration: 0.836375 2023-10-04 01:02:53,953 WARNING [train.py:1204] (2/4) Exclude cut with ID _22826_210_9994_1_1531908201522_3279169_412-3442-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 01:02:53,979 WARNING [train.py:1204] (2/4) Exclude cut with ID _60057_210_10640_1_1534903065793_3333150_171-181133-0_sp0.9 from training. Duration: 0.97775 2023-10-04 01:02:53,988 WARNING [train.py:1204] (2/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_154-135696-0_sp1.1 from training. Duration: 0.72725 2023-10-04 01:02:56,801 WARNING [train.py:1204] (2/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_148-186805-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 01:02:57,126 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.conv_module1.balancer1.prob, batch_count=1470120.0, ans=0.125 2023-10-04 01:02:59,623 WARNING [train.py:1204] (2/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_605-165641-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 01:02:59,629 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_174-277364-0_sp0.9 from training. Duration: 0.788875 2023-10-04 01:02:59,649 WARNING [train.py:1204] (2/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_89-321955-0_sp0.9 from training. Duration: 0.888875 2023-10-04 01:03:02,513 WARNING [train.py:1204] (2/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_438-105429-0_sp1.1 from training. Duration: 0.963625 2023-10-04 01:03:02,517 WARNING [train.py:1204] (2/4) Exclude cut with ID _14970_210_15709_1_1530773543493_1715730_187-20259-0_sp0.9 from training. Duration: 0.988875 2023-10-04 01:03:09,925 WARNING [train.py:1204] (2/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_385-193052-0_sp0.9 from training. Duration: 0.9555625 2023-10-04 01:03:11,270 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7113_210_1849_1_1527942685_3448768_194-311626-0_sp1.1 from training. Duration: 0.92725 2023-10-04 01:03:13,242 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.0.feed_forward1.out_proj.dropout_p, batch_count=1470186.6666666667, ans=0.1 2023-10-04 01:03:13,269 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.conv_module2.balancer1.prob, batch_count=1470186.6666666667, ans=0.125 2023-10-04 01:03:14,438 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_3780_210_2087_1_1526467389_2145572_176-107140-0_sp0.9 from training. Duration: 0.7666875 2023-10-04 01:03:14,440 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_36756_210_7234_1_1533026991_966276_123-213457-0_sp1.1 from training. Duration: 0.936375 2023-10-04 01:03:17,817 WARNING [train.py:1204] (2/4) Exclude cut with ID _23557_210_8750_1_1531890106370_6846255_456-244861-0_sp1.1 from training. Duration: 0.963625 2023-10-04 01:03:19,168 WARNING [train.py:1204] (2/4) Exclude cut with ID _37086_210_9038_1_1532999540891_7021250_242-349170-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 01:03:19,243 WARNING [train.py:1204] (2/4) Exclude cut with ID _41298_210_9019_1_1533430371450_4822143_471-281351-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 01:03:19,501 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.conv_module2.balancer2.prob, batch_count=1470186.6666666667, ans=0.125 2023-10-04 01:03:20,632 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_53378_210_17282_1_1534221071_5769509_572-126176-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 01:03:20,719 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_1507-263032-0_sp1.1 from training. Duration: 0.963625 2023-10-04 01:03:22,211 WARNING [train.py:1204] (2/4) Exclude cut with ID _13679_210_7497_1_1530534293662_3355098_411-186088-0_sp1.1 from training. Duration: 0.8545625 2023-10-04 01:03:24,946 WARNING [train.py:1204] (2/4) Exclude cut with ID _29345_210_4381_1_1532689168263_3860169_149-257268-0_sp1.1 from training. Duration: 0.736375 2023-10-04 01:03:26,301 WARNING [train.py:1204] (2/4) Exclude cut with ID _47790_210_16187_1_1533862814066_4207519_381-252014-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 01:03:26,307 WARNING [train.py:1204] (2/4) Exclude cut with ID _35799_210_5854_1_1533263487887_3529071_512-2717-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 01:03:27,743 WARNING [train.py:1204] (2/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_505-205270-0_sp0.9 from training. Duration: 0.42225 2023-10-04 01:03:29,076 WARNING [train.py:1204] (2/4) Exclude cut with ID _14998_210_5219_1_1530705582080_7474970_731-20117-0_sp1.1 from training. Duration: 0.936375 2023-10-04 01:03:31,839 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_37351_210_7925_1_1533012931_3812113_265-85780-0_sp1.1 from training. Duration: 0.8 2023-10-04 01:03:31,846 WARNING [train.py:1204] (2/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_313-134930-0 from training. Duration: 0.97 2023-10-04 01:03:33,320 WARNING [train.py:1204] (2/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_374-138971-0 from training. Duration: 0.94 2023-10-04 01:03:34,975 WARNING [train.py:1204] (2/4) Exclude cut with ID _30304_210_6319_1_1532476827261_7582060_1227-287886-0_sp1.1 from training. Duration: 0.936375 2023-10-04 01:03:36,327 INFO [train.py:1046] (2/4) Epoch 42, batch 2750, loss[loss=0.1552, simple_loss=0.215, pruned_loss=0.04775, over 19319.00 frames. ], tot_loss[loss=0.1565, simple_loss=0.2364, pruned_loss=0.0383, over 4697837.94 frames. ], batch size: 389, lr: 2.42e-03, grad_scale: 16.0 2023-10-04 01:03:37,640 WARNING [train.py:1204] (2/4) Exclude cut with ID _59493_210_17282_1_1535078419077_3225879_332-217544-0_sp1.1 from training. Duration: 0.97275 2023-10-04 01:03:37,673 WARNING [train.py:1204] (2/4) Exclude cut with ID _23514_210_1389_1_1531832139785_3031439_361-119640-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 01:03:40,460 WARNING [train.py:1204] (2/4) Exclude cut with ID _69584_210_5219_1_1535785133775_4044450_142-118058-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 01:03:40,481 WARNING [train.py:1204] (2/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_178-177582-0_sp1.1 from training. Duration: 0.67275 2023-10-04 01:03:40,526 WARNING [train.py:1204] (2/4) Exclude cut with ID _25303_210_17519_1_1532050207576_3635970_41-195572-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 01:03:43,875 WARNING [train.py:1204] (2/4) Exclude cut with ID _55328_210_27767_1_1534580973260_4088170_329-341322-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 01:03:43,909 WARNING [train.py:1204] (2/4) Exclude cut with ID _5608_210_2106_1_1527415439089_6819768_909-82767-0_sp1.1 from training. Duration: 0.5545625 2023-10-04 01:03:43,944 WARNING [train.py:1204] (2/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_261-297503-0_sp1.1 from training. Duration: 0.9 2023-10-04 01:03:43,951 WARNING [train.py:1204] (2/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_459-291706-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 01:03:43,951 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7762_210_1794_1_1528199508_4057408_422-227034-0 from training. Duration: 0.73 2023-10-04 01:03:45,243 WARNING [train.py:1204] (2/4) Exclude cut with ID _3067_210_2667_1_1526212974640_4503168_250-285937-0_sp0.9 from training. Duration: 0.888875 2023-10-04 01:03:45,259 WARNING [train.py:1204] (2/4) Exclude cut with ID _66873_210_5517_1_1535454136506_4293370_332-253745-0_sp1.1 from training. Duration: 0.936375 2023-10-04 01:03:50,210 WARNING [train.py:1204] (2/4) Exclude cut with ID _13938_210_9988_1_1530518718978_3693960_304-112784-0 from training. Duration: 0.46 2023-10-04 01:03:52,923 WARNING [train.py:1204] (2/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_226-267957-0_sp1.1 from training. Duration: 0.92725 2023-10-04 01:03:52,962 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_41822_210_13270_1_1533955976_4379284_608-254567-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 01:03:54,342 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_124-113562-0_sp1.1 from training. Duration: 0.8545625 2023-10-04 01:03:54,370 WARNING [train.py:1204] (2/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_117-290916-0_sp1.1 from training. Duration: 0.463625 2023-10-04 01:03:55,702 WARNING [train.py:1204] (2/4) Exclude cut with ID _28427_210_4220_1_1532581240967_3061069_102-66533-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 01:03:57,082 WARNING [train.py:1204] (2/4) Exclude cut with ID _55383_210_10635_1_1534468426172_2231669_134-128246-0_sp1.1 from training. Duration: 0.8090625 2023-10-04 01:03:57,126 WARNING [train.py:1204] (2/4) Exclude cut with ID _45871_210_5665_1_1533864614245_3296060_49-308316-0_sp1.1 from training. Duration: 0.97275 2023-10-04 01:03:58,475 WARNING [train.py:1204] (2/4) Exclude cut with ID _57572_210_5517_1_1534921219624_3809484_176-199205-0_sp1.1 from training. Duration: 0.97275 2023-10-04 01:04:01,338 WARNING [train.py:1204] (2/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_45-265684-0_sp1.1 from training. Duration: 0.6545625 2023-10-04 01:04:01,983 WARNING [train.py:1204] (2/4) Exclude cut with ID _15150_210_8341_1_1530752411803_7256384_469-239509-0_sp0.9 from training. Duration: 0.7555625 2023-10-04 01:04:03,020 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.0.layers.0.self_attn_weights.whiten_keys, num_groups=4, num_channels=128, metric=2.58 vs. limit=6.0 2023-10-04 01:04:03,257 WARNING [train.py:1204] (2/4) Exclude cut with ID _28461_210_2253_1_1532771775189_3041299_136-270644-0_sp1.1 from training. Duration: 0.8454375 2023-10-04 01:04:03,323 WARNING [train.py:1204] (2/4) Exclude cut with ID _69584_210_5219_1_1535785133775_4044450_302-47302-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 01:04:03,410 WARNING [train.py:1204] (2/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_166-128941-0_sp1.1 from training. Duration: 0.4909375 2023-10-04 01:04:10,208 WARNING [train.py:1204] (2/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_559-120504-0_sp1.1 from training. Duration: 0.97275 2023-10-04 01:04:13,473 WARNING [train.py:1204] (2/4) Exclude cut with ID _6456_210_1534_1_1527681634212_3181049_345-252877-0_sp1.1 from training. Duration: 0.5818125 2023-10-04 01:04:13,496 WARNING [train.py:1204] (2/4) Exclude cut with ID _28317_210_12742_1_1532480302381_8510325_902-233003-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 01:04:16,965 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.bypass_mid.scale_min, batch_count=1470453.3333333333, ans=0.2 2023-10-04 01:04:17,032 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.conv_skip_rate, batch_count=1470453.3333333333, ans=0.0 2023-10-04 01:04:20,024 WARNING [train.py:1204] (2/4) Exclude cut with ID _36538_210_9994_1_1533204020363_3795620_204-261385-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 01:04:20,035 WARNING [train.py:1204] (2/4) Exclude cut with ID _14821_210_8750_1_1530680503991_6829359_189-297451-0_sp1.1 from training. Duration: 0.5 2023-10-04 01:04:21,332 WARNING [train.py:1204] (2/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_1211-226066-0_sp0.9 from training. Duration: 0.8444375 2023-10-04 01:04:27,053 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6786_210_1388_1_1527839699_4194495_273-149841-0_sp1.1 from training. Duration: 0.836375 2023-10-04 01:04:27,081 WARNING [train.py:1204] (2/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_380-162648-0_sp1.1 from training. Duration: 0.8545625 2023-10-04 01:04:27,081 WARNING [train.py:1204] (2/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_494-199538-0 from training. Duration: 0.75 2023-10-04 01:04:28,471 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.706e+02 1.995e+02 2.181e+02 2.407e+02 3.708e+02, threshold=4.362e+02, percent-clipped=0.0 2023-10-04 01:04:32,617 WARNING [train.py:1204] (2/4) Exclude cut with ID _49097_210_16049_1_1533952780207_4360979_276-87053-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 01:04:34,033 WARNING [train.py:1204] (2/4) Exclude cut with ID _5444_210_2151_1_1527418808464_5312210_726-27165-0 from training. Duration: 0.67 2023-10-04 01:04:38,800 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_128-202104-0_sp1.1 from training. Duration: 0.57275 2023-10-04 01:04:39,056 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.conv_module2.balancer1.prob, batch_count=1470586.6666666667, ans=0.125 2023-10-04 01:04:41,523 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_11349_210_6753_1_1529800861_1101207_26-65602-0_sp1.1 from training. Duration: 0.87275 2023-10-04 01:04:41,545 WARNING [train.py:1204] (2/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_159-328157-0 from training. Duration: 0.52 2023-10-04 01:04:42,867 WARNING [train.py:1204] (2/4) Exclude cut with ID _14069_210_5632_1_1530512692033_4803390_98-21934-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 01:04:44,244 WARNING [train.py:1204] (2/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_352-59958-0_sp0.9 from training. Duration: 0.9555625 2023-10-04 01:04:44,297 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6163_210_6400_1_1527506746_4263068_283-137811-0 from training. Duration: 0.77 2023-10-04 01:04:44,325 WARNING [train.py:1204] (2/4) Exclude cut with ID _21989_210_5854_1_1531899214931_3271360_133-44759-0_sp1.1 from training. Duration: 0.7 2023-10-04 01:04:49,282 WARNING [train.py:1204] (2/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_21-174514-0_sp0.9 from training. Duration: 0.47775 2023-10-04 01:04:50,531 INFO [train.py:1046] (2/4) Epoch 42, batch 2800, loss[loss=0.15, simple_loss=0.2261, pruned_loss=0.03695, over 23420.00 frames. ], tot_loss[loss=0.1552, simple_loss=0.2347, pruned_loss=0.03784, over 4693514.70 frames. ], batch size: 134, lr: 2.42e-03, grad_scale: 32.0 2023-10-04 01:04:50,606 WARNING [train.py:1204] (2/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_201-78811-0_sp1.1 from training. Duration: 0.963625 2023-10-04 01:04:50,641 WARNING [train.py:1204] (2/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_537-164630-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 01:04:50,860 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.conv_module2.balancer2.prob, batch_count=1470653.3333333333, ans=0.125 2023-10-04 01:04:50,870 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.balancer2.prob, batch_count=1470653.3333333333, ans=0.125 2023-10-04 01:04:51,908 WARNING [train.py:1204] (2/4) Exclude cut with ID _14087_210_2253_1_1530529224512_3014029_156-257607-0 from training. Duration: 0.83 2023-10-04 01:04:51,919 WARNING [train.py:1204] (2/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_308-154450-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 01:04:51,953 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_45399_210_4852_1_1533624725_7579737_214-240574-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 01:04:55,420 WARNING [train.py:1204] (2/4) Exclude cut with ID _29303_210_2575_1_1532392237303_7410649_615-305517-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 01:04:55,475 WARNING [train.py:1204] (2/4) Exclude cut with ID IC0125W0002-61808 from training. Duration: 0.708 2023-10-04 01:04:55,475 WARNING [train.py:1204] (2/4) Exclude cut with ID IC0125W0017-61823 from training. Duration: 0.759 2023-10-04 01:04:58,364 WARNING [train.py:1204] (2/4) Exclude cut with ID _53219_210_4525_1_1534834836008_4064825_4-145291-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 01:04:59,822 WARNING [train.py:1204] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_185-174805-0_sp1.1 from training. Duration: 0.6181875 2023-10-04 01:04:59,841 WARNING [train.py:1204] (2/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_635-193046-0_sp1.1 from training. Duration: 0.92725 2023-10-04 01:05:03,874 WARNING [train.py:1204] (2/4) Exclude cut with ID _46067_210_4220_1_1534125571066_3307450_321-258866-0_sp1.1 from training. Duration: 0.97275 2023-10-04 01:05:04,728 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.1.encoder.layers.1.self_attn1.whiten, num_groups=1, num_channels=256, metric=13.06 vs. limit=22.5 2023-10-04 01:05:05,911 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8558_210_1410_1_1528505225_7613406_131-329524-0 from training. Duration: 0.79 2023-10-04 01:05:07,316 WARNING [train.py:1204] (2/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_847-41334-0_sp0.9 from training. Duration: 0.488875 2023-10-04 01:05:07,522 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.bypass_mid.scale_min, batch_count=1470720.0, ans=0.2 2023-10-04 01:05:08,647 WARNING [train.py:1204] (2/4) Exclude cut with ID _14227_210_4381_1_1530701804286_3940259_125-275642-0 from training. Duration: 0.99 2023-10-04 01:05:08,738 WARNING [train.py:1204] (2/4) Exclude cut with ID _25248_210_8750_1_1532062825941_5554180_248-177255-0_sp1.1 from training. Duration: 0.963625 2023-10-04 01:05:10,074 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_40047_210_2290_1_1533290519_2374311_195-165693-0_sp1.1 from training. Duration: 0.8090625 2023-10-04 01:05:10,084 WARNING [train.py:1204] (2/4) Exclude cut with ID _23109_210_4381_1_1532150828777_901949_29-258672-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 01:05:15,476 WARNING [train.py:1204] (2/4) Exclude cut with ID _14821_210_8750_1_1530680503991_6829359_2-227151-0_sp1.1 from training. Duration: 0.7545625 2023-10-04 01:05:15,504 WARNING [train.py:1204] (2/4) Exclude cut with ID _49643_210_7989_1_1534640244224_3939031_565-53982-0_sp1.1 from training. Duration: 0.963625 2023-10-04 01:05:15,507 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7544_210_6753_1_1528591327_1471426_33-346031-0_sp0.9 from training. Duration: 0.67775 2023-10-04 01:05:16,828 WARNING [train.py:1204] (2/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_95-61782-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 01:05:23,101 WARNING [train.py:1204] (2/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_686-54383-0_sp0.9 from training. Duration: 0.9666875 2023-10-04 01:05:24,926 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_14056_210_4163_1_1530604483_7572827_505-276823-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 01:05:27,644 WARNING [train.py:1204] (2/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_745-128443-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 01:05:29,059 WARNING [train.py:1204] (2/4) Exclude cut with ID _3844_210_1390_1_1526727803582_2871209_20-251664-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 01:05:30,468 WARNING [train.py:1204] (2/4) Exclude cut with ID _36538_210_9994_1_1533204020363_3795620_487-164628-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 01:05:33,288 WARNING [train.py:1204] (2/4) Exclude cut with ID _5166_210_2084_1_1527073197160_4087993_309-222545-0_sp1.1 from training. Duration: 0.763625 2023-10-04 01:05:34,583 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_3531_210_3528_1_1526294203_5216456_309-227302-0 from training. Duration: 0.69 2023-10-04 01:05:34,640 WARNING [train.py:1204] (2/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_42-192460-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 01:05:35,344 WARNING [train.py:1204] (2/4) Exclude cut with ID _22083_210_8254_1_1531702862439_3678970_49-234546-0_sp1.1 from training. Duration: 0.7545625 2023-10-04 01:05:35,348 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_427-78097-0_sp0.9 from training. Duration: 0.888875 2023-10-04 01:05:39,338 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_43802_210_2290_1_1533516203_8829504_378-205254-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 01:05:40,614 WARNING [train.py:1204] (2/4) Exclude cut with ID _45329_210_7005_1_1534157951211_6774339_672-314830-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 01:05:43,489 WARNING [train.py:1204] (2/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_90-30673-0_sp1.1 from training. Duration: 0.763625 2023-10-04 01:05:46,160 WARNING [train.py:1204] (2/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_563-329943-0_sp1.1 from training. Duration: 0.9 2023-10-04 01:05:46,174 WARNING [train.py:1204] (2/4) Exclude cut with ID _21632_210_2253_1_1531994001765_3744140_154-84090-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 01:05:46,175 WARNING [train.py:1204] (2/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_103-336064-0_sp0.9 from training. Duration: 0.5666875 2023-10-04 01:05:46,236 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_43-71989-0_sp0.9 from training. Duration: 0.6333125 2023-10-04 01:05:48,102 WARNING [train.py:1204] (2/4) Exclude cut with ID _3867_210_1883_1_1526641200575_4475830_319-349850-0_sp0.9 from training. Duration: 0.7444375 2023-10-04 01:05:48,168 WARNING [train.py:1204] (2/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_629-179454-0_sp1.1 from training. Duration: 0.963625 2023-10-04 01:05:48,175 WARNING [train.py:1204] (2/4) Exclude cut with ID _65263_210_22356_1_1535763598691_3921037_49-222917-0 from training. Duration: 0.92 2023-10-04 01:05:49,502 WARNING [train.py:1204] (2/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_602-331772-0_sp1.1 from training. Duration: 0.936375 2023-10-04 01:05:49,595 WARNING [train.py:1204] (2/4) Exclude cut with ID _58414_210_8254_1_1535421609162_7151182_292-134978-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 01:05:50,906 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_38793_210_2544_1_1533176114_3849320_200-299292-0_sp1.1 from training. Duration: 0.936375 2023-10-04 01:05:52,281 WARNING [train.py:1204] (2/4) Exclude cut with ID _14970_210_15709_1_1530775547361_1966965_6-23328-0 from training. Duration: 0.86 2023-10-04 01:05:53,682 WARNING [train.py:1204] (2/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_386-225511-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 01:05:53,709 WARNING [train.py:1204] (2/4) Exclude cut with ID _38323_210_12108_1_1533121233369_3303049_416-11919-0_sp1.1 from training. Duration: 0.87275 2023-10-04 01:05:53,760 WARNING [train.py:1204] (2/4) Exclude cut with ID _4866_210_2577_1_1527404412284_4364659_256-306830-0_sp1.1 from training. Duration: 0.7909375 2023-10-04 01:05:55,231 WARNING [train.py:1204] (2/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_53-300544-0 from training. Duration: 0.67 2023-10-04 01:06:02,334 WARNING [train.py:1204] (2/4) Exclude cut with ID _41314_210_12651_1_1533459476543_3766460_80-74184-0_sp1.1 from training. Duration: 0.7545625 2023-10-04 01:06:02,348 WARNING [train.py:1204] (2/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_332-256168-0_sp1.1 from training. Duration: 0.6818125 2023-10-04 01:06:02,395 WARNING [train.py:1204] (2/4) Exclude cut with ID _48836_210_12210_1_1534075848293_3904150_429-35475-0_sp1.1 from training. Duration: 0.8090625 2023-10-04 01:06:04,304 INFO [train.py:1046] (2/4) Epoch 42, batch 2850, loss[loss=0.1608, simple_loss=0.2367, pruned_loss=0.04241, over 23827.00 frames. ], tot_loss[loss=0.1544, simple_loss=0.2335, pruned_loss=0.03764, over 4691174.28 frames. ], batch size: 179, lr: 2.42e-03, grad_scale: 32.0 2023-10-04 01:06:04,374 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_144-135833-0_sp1.1 from training. Duration: 0.97275 2023-10-04 01:06:07,223 WARNING [train.py:1204] (2/4) Exclude cut with ID _14224_210_4381_1_1530529062037_4264010_380-343670-0_sp0.9 from training. Duration: 0.97775 2023-10-04 01:06:07,244 WARNING [train.py:1204] (2/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_213-265174-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 01:06:07,274 WARNING [train.py:1204] (2/4) Exclude cut with ID _20455_210_5219_1_1531565996775_8098100_86-314283-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 01:06:09,928 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_41750_210_1960_1_1533520678_3948042_68-223813-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 01:06:11,271 WARNING [train.py:1204] (2/4) Exclude cut with ID _55153_210_2253_1_1534384786677_3162096_24-198429-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 01:06:12,625 WARNING [train.py:1204] (2/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_119-207162-0_sp0.9 from training. Duration: 0.788875 2023-10-04 01:06:12,666 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_12868_210_6753_1_1530702479_3610564_148-172861-0 from training. Duration: 0.5 2023-10-04 01:06:14,793 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.bypass.skip_rate, batch_count=1470986.6666666667, ans=0.09899494936611666 2023-10-04 01:06:19,463 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_5522_210_3273_1_1527249606_3909096_318-203153-0 from training. Duration: 0.43 2023-10-04 01:06:19,470 WARNING [train.py:1204] (2/4) Exclude cut with ID _14884_210_2252_1_1530707459689_5561250_288-189949-0_sp1.1 from training. Duration: 0.97275 2023-10-04 01:06:20,908 WARNING [train.py:1204] (2/4) Exclude cut with ID _5294_210_1747_1_1527249643928_3696179_529-242298-0 from training. Duration: 0.95 2023-10-04 01:06:22,238 WARNING [train.py:1204] (2/4) Exclude cut with ID _36538_210_9994_1_1533204020363_3795620_503-110289-0_sp1.1 from training. Duration: 0.936375 2023-10-04 01:06:23,626 WARNING [train.py:1204] (2/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_385-193052-0 from training. Duration: 0.86 2023-10-04 01:06:24,997 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_456-22141-0 from training. Duration: 0.88 2023-10-04 01:06:25,114 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_38965_210_4852_1_1533174906_3484735_284-117840-0_sp1.1 from training. Duration: 0.936375 2023-10-04 01:06:30,225 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.nonlin_attention.balancer.prob, batch_count=1471053.3333333333, ans=0.125 2023-10-04 01:06:33,577 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.balancer2.prob, batch_count=1471120.0, ans=0.125 2023-10-04 01:06:36,162 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.conv_module1.balancer2.min_positive, batch_count=1471120.0, ans=0.05 2023-10-04 01:06:37,316 WARNING [train.py:1204] (2/4) Exclude cut with ID _32704_210_8954_1_1533017488840_6747370_75-201625-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 01:06:38,633 WARNING [train.py:1204] (2/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_255-312654-0_sp1.1 from training. Duration: 0.82725 2023-10-04 01:06:38,642 WARNING [train.py:1204] (2/4) Exclude cut with ID _47251_210_18628_1_1533814063128_4137250_214-88076-0_sp0.9 from training. Duration: 0.97775 2023-10-04 01:06:38,711 WARNING [train.py:1204] (2/4) Exclude cut with ID _4092_210_2151_1_1526643044449_3135759_227-213972-0_sp1.1 from training. Duration: 0.4909375 2023-10-04 01:06:38,726 WARNING [train.py:1204] (2/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_912-47473-0_sp1.1 from training. Duration: 0.6181875 2023-10-04 01:06:38,751 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6767_210_3273_1_1527855783_1718932_173-256601-0_sp1.1 from training. Duration: 0.72725 2023-10-04 01:06:40,162 WARNING [train.py:1204] (2/4) Exclude cut with ID _6274_210_2084_1_1527937583837_7494020_32-205288-0_sp1.1 from training. Duration: 0.7090625 2023-10-04 01:06:41,988 WARNING [train.py:1204] (2/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_39-248870-0 from training. Duration: 0.69 2023-10-04 01:06:43,566 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.nonlin_attention.balancer.prob, batch_count=1471120.0, ans=0.125 2023-10-04 01:06:44,728 WARNING [train.py:1204] (2/4) Exclude cut with ID _3541_210_2477_1_1526475408709_3263973_377-296839-0_sp1.1 from training. Duration: 0.5 2023-10-04 01:06:44,751 WARNING [train.py:1204] (2/4) Exclude cut with ID _13679_210_7497_1_1530534293662_3355098_413-56154-0_sp1.1 from training. Duration: 0.963625 2023-10-04 01:06:44,811 WARNING [train.py:1204] (2/4) Exclude cut with ID _14970_210_15709_1_1530775547361_1966965_201-84544-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 01:06:46,485 WARNING [train.py:1204] (2/4) Exclude cut with ID _21783_210_11357_1_1531742458730_3762679_105-95775-0_sp1.1 from training. Duration: 0.936375 2023-10-04 01:06:48,065 WARNING [train.py:1204] (2/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_131-191669-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 01:06:48,106 WARNING [train.py:1204] (2/4) Exclude cut with ID _14860_210_4915_1_1530702020340_7343240_695-4323-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 01:06:50,792 WARNING [train.py:1204] (2/4) Exclude cut with ID _39867_210_9826_1_1533553222030_9344259_991-130565-0_sp1.1 from training. Duration: 0.97275 2023-10-04 01:06:52,210 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_50-3289-0_sp1.1 from training. Duration: 0.82725 2023-10-04 01:06:53,590 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_54-347670-0_sp1.1 from training. Duration: 0.92725 2023-10-04 01:06:53,640 WARNING [train.py:1204] (2/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_200-153445-0_sp1.1 from training. Duration: 0.936375 2023-10-04 01:06:54,990 WARNING [train.py:1204] (2/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_279-302911-0_sp1.1 from training. Duration: 0.97275 2023-10-04 01:06:56,318 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.585e+02 1.878e+02 2.070e+02 2.200e+02 2.793e+02, threshold=4.141e+02, percent-clipped=0.0 2023-10-04 01:06:56,422 WARNING [train.py:1204] (2/4) Exclude cut with ID _14886_210_4220_1_1530702020355_3516190_159-73748-0_sp0.9 from training. Duration: 0.92225 2023-10-04 01:07:01,227 WARNING [train.py:1204] (2/4) Exclude cut with ID _53379_210_20034_1_1534222027363_3854654_23-222715-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 01:07:03,057 WARNING [train.py:1204] (2/4) Exclude cut with ID _12555_210_8750_1_1530075594402_3780239_449-267986-0 from training. Duration: 0.83 2023-10-04 01:07:03,068 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7544_210_6753_1_1528587722_3576686_295-258479-0 from training. Duration: 0.84 2023-10-04 01:07:05,668 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7002_210_1794_1_1527935634_4034444_487-236501-0_sp0.9 from training. Duration: 0.7333125 2023-10-04 01:07:05,729 WARNING [train.py:1204] (2/4) Exclude cut with ID _21783_210_11357_1_1531742458730_3762679_244-19107-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 01:07:05,742 WARNING [train.py:1204] (2/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_390-339137-0 from training. Duration: 0.56 2023-10-04 01:07:07,047 WARNING [train.py:1204] (2/4) Exclude cut with ID _7562_210_2084_1_1528887767916_3946229_186-326422-0_sp1.1 from training. Duration: 0.763625 2023-10-04 01:07:07,116 WARNING [train.py:1204] (2/4) Exclude cut with ID _33640_210_2655_1_1533117605479_4855469_645-145235-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 01:07:08,419 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_25767_210_2161_1_1532307483_3867748_506-171777-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 01:07:08,448 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_5438_210_1794_1_1527336687_2600009_233-58072-0_sp1.1 from training. Duration: 0.7 2023-10-04 01:07:08,448 WARNING [train.py:1204] (2/4) Exclude cut with ID IC0669W0063-330908 from training. Duration: 0.896 2023-10-04 01:07:08,481 WARNING [train.py:1204] (2/4) Exclude cut with ID IC0671W0070-331913 from training. Duration: 0.896 2023-10-04 01:07:08,484 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_258-310570-0_sp1.1 from training. Duration: 0.7454375 2023-10-04 01:07:08,551 WARNING [train.py:1204] (2/4) Exclude cut with ID _9461_210_4557_1_1529060514726_3677451_162-177507-0_sp1.1 from training. Duration: 0.97275 2023-10-04 01:07:15,858 WARNING [train.py:1204] (2/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_26-175553-0_sp0.9 from training. Duration: 0.67775 2023-10-04 01:07:15,877 WARNING [train.py:1204] (2/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_7-150481-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 01:07:17,266 WARNING [train.py:1204] (2/4) Exclude cut with ID _23557_210_8750_1_1531890106370_6846255_72-273262-0_sp1.1 from training. Duration: 0.963625 2023-10-04 01:07:17,450 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.1.conv_module1.balancer1.prob, batch_count=1471320.0, ans=0.125 2023-10-04 01:07:19,095 INFO [train.py:1046] (2/4) Epoch 42, batch 2900, loss[loss=0.1406, simple_loss=0.2185, pruned_loss=0.0313, over 24359.00 frames. ], tot_loss[loss=0.1551, simple_loss=0.2344, pruned_loss=0.03785, over 4700827.21 frames. ], batch size: 56, lr: 2.42e-03, grad_scale: 32.0 2023-10-04 01:07:19,193 WARNING [train.py:1204] (2/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_367-61330-0 from training. Duration: 0.75 2023-10-04 01:07:23,336 WARNING [train.py:1204] (2/4) Exclude cut with ID _39867_210_9826_1_1533553222030_9344259_1337-11141-0_sp1.1 from training. Duration: 0.936375 2023-10-04 01:07:23,361 WARNING [train.py:1204] (2/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_101-293367-0 from training. Duration: 0.63 2023-10-04 01:07:24,733 WARNING [train.py:1204] (2/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_10-100367-0 from training. Duration: 0.98 2023-10-04 01:07:26,134 WARNING [train.py:1204] (2/4) Exclude cut with ID _60057_210_10640_1_1534903065793_3333150_171-181133-0_sp1.1 from training. Duration: 0.8 2023-10-04 01:07:26,138 WARNING [train.py:1204] (2/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_844-96989-0_sp0.9 from training. Duration: 0.788875 2023-10-04 01:07:28,879 WARNING [train.py:1204] (2/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_301-144595-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 01:07:30,232 WARNING [train.py:1204] (2/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_115-147343-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 01:07:32,982 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_3171_210_2087_1_1526109084_2330007_37-64334-0_sp1.1 from training. Duration: 0.7454375 2023-10-04 01:07:34,302 WARNING [train.py:1204] (2/4) Exclude cut with ID _19514_210_9259_1_1531530071330_5084230_769-272914-0_sp1.1 from training. Duration: 0.936375 2023-10-04 01:07:37,004 WARNING [train.py:1204] (2/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_76-311963-0_sp0.9 from training. Duration: 0.77775 2023-10-04 01:07:38,049 WARNING [train.py:1204] (2/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_526-62079-0 from training. Duration: 0.67 2023-10-04 01:07:39,402 WARNING [train.py:1204] (2/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_7-111954-0_sp0.9 from training. Duration: 0.62225 2023-10-04 01:07:39,529 WARNING [train.py:1204] (2/4) Exclude cut with ID _57997_210_4163_1_1534766425889_3961049_360-279140-0_sp1.1 from training. Duration: 0.97275 2023-10-04 01:07:42,254 WARNING [train.py:1204] (2/4) Exclude cut with ID _4866_210_2577_1_1527404412284_4364659_133-1696-0 from training. Duration: 0.99 2023-10-04 01:07:42,327 WARNING [train.py:1204] (2/4) Exclude cut with ID _5190_210_4557_1_1527244368799_373439_28-197030-0 from training. Duration: 0.72 2023-10-04 01:07:45,611 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_53-39003-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 01:07:45,614 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6673_210_3273_1_1527767790_3173240_351-167954-0 from training. Duration: 0.48 2023-10-04 01:07:45,629 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_2822_210_2715_1_1526112586_1999587_165-36656-0_sp1.1 from training. Duration: 0.8090625 2023-10-04 01:07:47,114 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_328-264892-0_sp1.1 from training. Duration: 0.92725 2023-10-04 01:07:47,119 WARNING [train.py:1204] (2/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_29-293896-0_sp0.9 from training. Duration: 0.67775 2023-10-04 01:07:50,301 WARNING [train.py:1204] (2/4) Exclude cut with ID _65263_210_22356_1_1535763598691_3921037_465-55448-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 01:07:50,363 WARNING [train.py:1204] (2/4) Exclude cut with ID _21989_210_5854_1_1531899214931_3271360_311-53106-0_sp1.1 from training. Duration: 0.97275 2023-10-04 01:07:53,153 WARNING [train.py:1204] (2/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_389-277915-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 01:07:54,596 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_69630_210_7234_1_1535791094_677837_63-277139-0_sp1.1 from training. Duration: 0.963625 2023-10-04 01:07:57,248 WARNING [train.py:1204] (2/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_85-138257-0 from training. Duration: 0.71 2023-10-04 01:07:57,265 WARNING [train.py:1204] (2/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_686-247600-0 from training. Duration: 0.92 2023-10-04 01:07:57,280 WARNING [train.py:1204] (2/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_108-255430-0_sp1.1 from training. Duration: 0.8818125 2023-10-04 01:08:00,183 WARNING [train.py:1204] (2/4) Exclude cut with ID _14884_210_2252_1_1530707459689_5561250_114-201399-0_sp1.1 from training. Duration: 0.6909375 2023-10-04 01:08:03,229 WARNING [train.py:1204] (2/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_506-42614-0 from training. Duration: 0.79 2023-10-04 01:08:03,326 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_43802_210_2290_1_1533516203_8829504_85-195240-0_sp1.1 from training. Duration: 0.7909375 2023-10-04 01:08:06,949 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.1.feed_forward2.hidden_balancer.prob, batch_count=1471520.0, ans=0.125 2023-10-04 01:08:08,156 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_141-36243-0_sp1.1 from training. Duration: 0.97275 2023-10-04 01:08:14,901 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.3.encoder.layers.0.conv_module2.whiten, num_groups=1, num_channels=512, metric=3.57 vs. limit=15.0 2023-10-04 01:08:17,025 WARNING [train.py:1204] (2/4) Exclude cut with ID _7852_210_5219_1_1528286471316_8139400_358-287492-0_sp1.1 from training. Duration: 0.87275 2023-10-04 01:08:17,041 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_805-71497-0_sp0.9 from training. Duration: 0.788875 2023-10-04 01:08:18,426 WARNING [train.py:1204] (2/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_178-39722-0 from training. Duration: 0.49 2023-10-04 01:08:22,953 WARNING [train.py:1204] (2/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_428-181925-0_sp1.1 from training. Duration: 0.963625 2023-10-04 01:08:22,957 WARNING [train.py:1204] (2/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_770-200430-0 from training. Duration: 0.89 2023-10-04 01:08:23,000 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_1104-271009-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 01:08:23,044 WARNING [train.py:1204] (2/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_128-296581-0_sp0.9 from training. Duration: 0.82225 2023-10-04 01:08:31,132 WARNING [train.py:1204] (2/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_19-62511-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 01:08:32,941 INFO [train.py:1046] (2/4) Epoch 42, batch 2950, loss[loss=0.1664, simple_loss=0.2501, pruned_loss=0.04135, over 24642.00 frames. ], tot_loss[loss=0.1571, simple_loss=0.2366, pruned_loss=0.0388, over 4691658.65 frames. ], batch size: 65, lr: 2.42e-03, grad_scale: 16.0 2023-10-04 01:08:33,083 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_805-71497-0 from training. Duration: 0.71 2023-10-04 01:08:34,462 WARNING [train.py:1204] (2/4) Exclude cut with ID _44778_210_2054_1_1533798127203_7443090_573-152077-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 01:08:34,475 WARNING [train.py:1204] (2/4) Exclude cut with ID _42875_210_14856_1_1533704397487_4012433_393-177816-0_sp1.1 from training. Duration: 0.963625 2023-10-04 01:08:35,849 WARNING [train.py:1204] (2/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_278-35159-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 01:08:37,379 WARNING [train.py:1204] (2/4) Exclude cut with ID _15037_210_2253_1_1530752294104_3267650_13-130361-0_sp1.1 from training. Duration: 0.9 2023-10-04 01:08:37,465 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_3019_210_2028_1_1526044245_3264865_127-135615-0 from training. Duration: 0.94 2023-10-04 01:08:39,405 WARNING [train.py:1204] (2/4) Exclude cut with ID _28317_210_12742_1_1532480302381_8510325_607-213951-0 from training. Duration: 0.84 2023-10-04 01:08:39,475 WARNING [train.py:1204] (2/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_436-318113-0_sp0.9 from training. Duration: 0.8333125 2023-10-04 01:08:39,476 WARNING [train.py:1204] (2/4) Exclude cut with ID _13938_210_9988_1_1530518718978_3693960_148-152225-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 01:08:43,649 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_2541_210_1794_1_1525590269_3084765_156-173713-0_sp1.1 from training. Duration: 0.736375 2023-10-04 01:08:45,515 WARNING [train.py:1204] (2/4) Exclude cut with ID _28323_210_16187_1_1532480395667_4100300_531-24157-0_sp1.1 from training. Duration: 0.8909375 2023-10-04 01:08:46,895 WARNING [train.py:1204] (2/4) Exclude cut with ID _29303_210_2575_1_1532392237303_7410649_382-217492-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 01:08:48,218 WARNING [train.py:1204] (2/4) Exclude cut with ID _58895_210_8750_1_1535184081320_3649089_270-158013-0_sp1.1 from training. Duration: 0.92725 2023-10-04 01:08:51,478 WARNING [train.py:1204] (2/4) Exclude cut with ID _29277_210_3279_1_1532581109293_3173069_311-206227-0_sp1.1 from training. Duration: 0.936375 2023-10-04 01:08:51,496 WARNING [train.py:1204] (2/4) Exclude cut with ID _7160_210_1876_1_1527940923231_3703799_282-322611-0_sp0.9 from training. Duration: 0.9666875 2023-10-04 01:08:54,165 WARNING [train.py:1204] (2/4) Exclude cut with ID _53219_210_4525_1_1534834836008_4064825_562-32295-0_sp1.1 from training. Duration: 0.963625 2023-10-04 01:08:54,244 WARNING [train.py:1204] (2/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_408-87330-0_sp1.1 from training. Duration: 0.963625 2023-10-04 01:08:54,258 WARNING [train.py:1204] (2/4) Exclude cut with ID _59202_210_26984_1_1534816810612_3549174_137-318710-0_sp1.1 from training. Duration: 0.8090625 2023-10-04 01:08:55,831 WARNING [train.py:1204] (2/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_372-30302-0 from training. Duration: 0.86 2023-10-04 01:08:56,035 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.bypass.skip_rate, batch_count=1471720.0, ans=0.09899494936611666 2023-10-04 01:09:00,183 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7543_210_6753_1_1528541907_1399322_178-56269-0_sp1.1 from training. Duration: 0.95275 2023-10-04 01:09:00,201 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9109W0012-549758 from training. Duration: 0.512 2023-10-04 01:09:01,600 WARNING [train.py:1204] (2/4) Exclude cut with ID _5410_210_1388_1_1527397055795_1719020_39-112221-0_sp0.9 from training. Duration: 0.7666875 2023-10-04 01:09:02,529 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.bypass_mid.scale_min, batch_count=1471786.6666666667, ans=0.2 2023-10-04 01:09:04,697 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9121W0012-554933 from training. Duration: 0.896 2023-10-04 01:09:04,793 WARNING [train.py:1204] (2/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_476-272757-0 from training. Duration: 0.93 2023-10-04 01:09:04,812 WARNING [train.py:1204] (2/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_1085-171718-0_sp1.1 from training. Duration: 0.92725 2023-10-04 01:09:06,072 WARNING [train.py:1204] (2/4) Exclude cut with ID _8480_210_1874_1_1528589391205_6728330_309-139896-0_sp1.1 from training. Duration: 0.736375 2023-10-04 01:09:06,073 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9130W0113-559703 from training. Duration: 0.896 2023-10-04 01:09:06,084 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_292-265928-0_sp1.1 from training. Duration: 0.463625 2023-10-04 01:09:10,822 WARNING [train.py:1204] (2/4) Exclude cut with ID _8772_210_1850_1_1528635595972_3664011_96-12756-0 from training. Duration: 0.98 2023-10-04 01:09:10,894 WARNING [train.py:1204] (2/4) Exclude cut with ID _12556_210_8341_1_1530081024100_7327690_725-193477-0_sp1.1 from training. Duration: 0.8909375 2023-10-04 01:09:11,608 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.2.encoder.layers.1.conv_module1.whiten, num_groups=1, num_channels=384, metric=8.33 vs. limit=15.0 2023-10-04 01:09:12,190 WARNING [train.py:1204] (2/4) Exclude cut with ID _5289_210_2084_1_1527678202736_3779299_142-103317-0_sp1.1 from training. Duration: 0.763625 2023-10-04 01:09:15,589 WARNING [train.py:1204] (2/4) Exclude cut with ID _45012_210_12651_1_1533608435136_5371380_206-51075-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 01:09:18,384 WARNING [train.py:1204] (2/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_176-56120-0_sp1.1 from training. Duration: 0.7545625 2023-10-04 01:09:18,420 WARNING [train.py:1204] (2/4) Exclude cut with ID _40284_210_2084_1_1533513679814_3704279_220-201864-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 01:09:19,698 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9165W0004-576638 from training. Duration: 0.896 2023-10-04 01:09:19,728 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_5438_210_1794_1_1527336687_2600009_218-343336-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 01:09:19,758 WARNING [train.py:1204] (2/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_318-162017-0 from training. Duration: 0.91 2023-10-04 01:09:24,630 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_49544_210_1794_1_1534379770_5262083_483-349298-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 01:09:24,852 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.feed_forward3.hidden_balancer.prob, batch_count=1471853.3333333333, ans=0.125 2023-10-04 01:09:26,020 WARNING [train.py:1204] (2/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_197-28164-0_sp0.9 from training. Duration: 0.888875 2023-10-04 01:09:26,091 WARNING [train.py:1204] (2/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_643-52007-0 from training. Duration: 0.57 2023-10-04 01:09:26,111 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_38965_210_4852_1_1533174906_3484735_403-332963-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 01:09:27,396 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.652e+02 1.891e+02 2.131e+02 2.333e+02 3.581e+02, threshold=4.262e+02, percent-clipped=0.0 2023-10-04 01:09:28,903 WARNING [train.py:1204] (2/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_340-174176-0 from training. Duration: 0.49 2023-10-04 01:09:29,059 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.bypass.skip_rate, batch_count=1471853.3333333333, ans=0.04949747468305833 2023-10-04 01:09:31,609 WARNING [train.py:1204] (2/4) Exclude cut with ID _56708_210_13177_1_1535252351282_3422899_363-917-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 01:09:33,925 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.0.layers.1.nonlin_attention.whiten2, num_groups=1, num_channels=192, metric=5.49 vs. limit=15.0 2023-10-04 01:09:34,342 WARNING [train.py:1204] (2/4) Exclude cut with ID _19896_210_1883_1_1531709022662_6649569_525-159063-0_sp1.1 from training. Duration: 0.936375 2023-10-04 01:09:34,374 WARNING [train.py:1204] (2/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_430-116139-0_sp0.9 from training. Duration: 0.9444375 2023-10-04 01:09:35,785 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_53753_210_7234_1_1534320003_3594148_86-329250-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 01:09:35,795 WARNING [train.py:1204] (2/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_406-275649-0_sp1.1 from training. Duration: 0.3818125 2023-10-04 01:09:37,742 WARNING [train.py:1204] (2/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_1083-147536-0_sp1.1 from training. Duration: 0.8818125 2023-10-04 01:09:37,888 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.conv_module2.balancer2.prob, batch_count=1471920.0, ans=0.125 2023-10-04 01:09:39,131 WARNING [train.py:1204] (2/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_54-311741-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 01:09:39,136 WARNING [train.py:1204] (2/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_237-58433-0_sp0.9 from training. Duration: 0.82225 2023-10-04 01:09:39,182 WARNING [train.py:1204] (2/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_98-51676-0_sp1.1 from training. Duration: 0.663625 2023-10-04 01:09:39,237 WARNING [train.py:1204] (2/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_386-270472-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 01:09:39,540 INFO [scaling.py:1118] (2/4) WithLoss: name=encoder.encoders.5.encoder.layers.0.self_attn_weights, loss-sum=0.000e+00 2023-10-04 01:09:40,593 WARNING [train.py:1204] (2/4) Exclude cut with ID _19023_210_4353_1_1531378892552_5820780_83-254794-0_sp1.1 from training. Duration: 0.7818125 2023-10-04 01:09:41,944 WARNING [train.py:1204] (2/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_314-93656-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 01:09:41,962 WARNING [train.py:1204] (2/4) Exclude cut with ID _14170_210_5219_1_1530525949868_4469650_336-213530-0 from training. Duration: 0.9 2023-10-04 01:09:43,788 WARNING [train.py:1204] (2/4) Exclude cut with ID _36686_210_5665_1_1533778225913_3646410_65-221120-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 01:09:45,251 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_26433_210_12108_1_1532157158_4056480_271-68178-0_sp1.1 from training. Duration: 0.92725 2023-10-04 01:09:46,573 WARNING [train.py:1204] (2/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_46-207599-0_sp0.9 from training. Duration: 0.711125 2023-10-04 01:09:47,906 INFO [train.py:1046] (2/4) Epoch 42, batch 3000, loss[loss=0.1397, simple_loss=0.2238, pruned_loss=0.02782, over 24396.00 frames. ], tot_loss[loss=0.1572, simple_loss=0.2371, pruned_loss=0.03871, over 4696133.22 frames. ], batch size: 61, lr: 2.42e-03, grad_scale: 16.0 2023-10-04 01:09:47,907 INFO [train.py:1069] (2/4) Computing validation loss 2023-10-04 01:09:55,334 INFO [zipformer.py:1853] (2/4) name=encoder.encoders.4.encoder.layers.2.self_attn_weights, attn_weights_entropy = tensor([1.3937, 2.6727, 3.5268, 2.3655], device='cuda:2') 2023-10-04 01:09:59,530 INFO [train.py:1078] (2/4) Epoch 42, validation: loss=0.3457, simple_loss=0.2797, pruned_loss=0.2058, over 1125622.00 frames. 2023-10-04 01:09:59,530 INFO [train.py:1079] (2/4) Maximum memory allocated so far is 20967MB 2023-10-04 01:09:59,687 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9303W0114-637133 from training. Duration: 0.896 2023-10-04 01:09:59,870 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.conv_module1.balancer1.prob, batch_count=1471986.6666666667, ans=0.125 2023-10-04 01:10:01,038 WARNING [train.py:1204] (2/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_490-207861-0 from training. Duration: 0.98 2023-10-04 01:10:01,368 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.bypass.skip_rate, batch_count=1471986.6666666667, ans=0.04949747468305833 2023-10-04 01:10:04,532 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6767_210_3273_1_1527855783_1718932_173-256601-0_sp0.9 from training. Duration: 0.888875 2023-10-04 01:10:05,841 WARNING [train.py:1204] (2/4) Exclude cut with ID _13921_210_9994_1_1530841782272_3503520_327-69677-0_sp1.1 from training. Duration: 0.8181875 2023-10-04 01:10:05,895 WARNING [train.py:1204] (2/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_508-335369-0 from training. Duration: 0.48 2023-10-04 01:10:07,176 WARNING [train.py:1204] (2/4) Exclude cut with ID _64631_210_27767_1_1535349580068_4165160_24-46567-0_sp1.1 from training. Duration: 0.963625 2023-10-04 01:10:13,115 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_89-35748-0_sp1.1 from training. Duration: 0.7181875 2023-10-04 01:10:17,605 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.conv_module1.balancer1.prob, batch_count=1472053.3333333333, ans=0.125 2023-10-04 01:10:22,831 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_26433_210_12108_1_1532157158_4056480_578-262107-0_sp1.1 from training. Duration: 0.9 2023-10-04 01:10:27,009 WARNING [train.py:1204] (2/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_129-337802-0 from training. Duration: 0.72 2023-10-04 01:10:28,380 WARNING [train.py:1204] (2/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_356-167816-0_sp0.9 from training. Duration: 0.8 2023-10-04 01:10:31,159 WARNING [train.py:1204] (2/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_137-116442-0_sp1.1 from training. Duration: 0.7090625 2023-10-04 01:10:32,507 WARNING [train.py:1204] (2/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_358-172940-0_sp1.1 from training. Duration: 0.963625 2023-10-04 01:10:32,551 WARNING [train.py:1204] (2/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_668-344147-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 01:10:35,862 WARNING [train.py:1204] (2/4) Exclude cut with ID _62337_210_12392_1_1535009273105_3412229_255-157667-0_sp1.1 from training. Duration: 0.936375 2023-10-04 01:10:35,865 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_486-151131-0 from training. Duration: 0.85 2023-10-04 01:10:38,620 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_53638_210_21581_1_1534297319_5808449_885-80172-0 from training. Duration: 0.96 2023-10-04 01:10:40,009 WARNING [train.py:1204] (2/4) Exclude cut with ID _50093_210_4220_1_1534417141207_3432299_105-183067-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 01:10:40,072 WARNING [train.py:1204] (2/4) Exclude cut with ID _7562_210_2084_1_1528887767916_3946229_345-169156-0_sp0.9 from training. Duration: 0.6444375 2023-10-04 01:10:42,010 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_4528_210_3273_1_1527076654_3276777_209-219804-0_sp0.9 from training. Duration: 0.7666875 2023-10-04 01:10:43,361 WARNING [train.py:1204] (2/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_35-153886-0_sp0.9 from training. Duration: 0.9333125 2023-10-04 01:10:43,426 WARNING [train.py:1204] (2/4) Exclude cut with ID _30800_210_22382_1_1532499347904_4515959_332-121925-0_sp1.1 from training. Duration: 0.92725 2023-10-04 01:10:43,428 WARNING [train.py:1204] (2/4) Exclude cut with ID _59202_210_26984_1_1534816810612_3549174_495-65515-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 01:10:47,616 WARNING [train.py:1204] (2/4) Exclude cut with ID _6274_210_2084_1_1527937583837_7494020_769-168302-0_sp1.1 from training. Duration: 0.6454375 2023-10-04 01:10:48,993 WARNING [train.py:1204] (2/4) Exclude cut with ID _24271_210_12658_1_1531987270022_7190519_708-71345-0_sp1.1 from training. Duration: 0.936375 2023-10-04 01:10:48,994 WARNING [train.py:1204] (2/4) Exclude cut with ID 3033-130750-0096-55598-0_sp0.9 from training. Duration: 0.92225 2023-10-04 01:10:50,369 WARNING [train.py:1204] (2/4) Exclude cut with ID _14087_210_2253_1_1530529224512_3014029_219-224445-0_sp0.9 from training. Duration: 0.9333125 2023-10-04 01:10:52,461 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_174-277364-0 from training. Duration: 0.71 2023-10-04 01:10:53,807 WARNING [train.py:1204] (2/4) Exclude cut with ID _45021_210_9994_1_1533619751094_3330288_96-239010-0_sp1.1 from training. Duration: 0.82725 2023-10-04 01:10:53,847 WARNING [train.py:1204] (2/4) Exclude cut with ID _31602_210_5854_1_1532677021456_4458250_489-305116-0_sp1.1 from training. Duration: 0.97275 2023-10-04 01:10:53,866 WARNING [train.py:1204] (2/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_269-154726-0_sp0.9 from training. Duration: 0.9555625 2023-10-04 01:10:56,727 WARNING [train.py:1204] (2/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_126-54418-0_sp1.1 from training. Duration: 0.92725 2023-10-04 01:10:56,755 WARNING [train.py:1204] (2/4) Exclude cut with ID _42326_210_7770_1_1533814282531_3707269_182-224159-0_sp1.1 from training. Duration: 0.92725 2023-10-04 01:10:58,108 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7002_210_1794_1_1527939683_5157286_1-281264-0_sp0.9 from training. Duration: 0.488875 2023-10-04 01:10:59,389 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_86-16881-0 from training. Duration: 0.58 2023-10-04 01:10:59,414 WARNING [train.py:1204] (2/4) Exclude cut with ID _19514_210_9259_1_1531530071330_5084230_637-279094-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 01:10:59,468 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_26433_210_12108_1_1532157158_4056480_578-262107-0 from training. Duration: 0.99 2023-10-04 01:11:00,804 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6291_210_3273_1_1527683649_1078498_82-221196-0_sp1.1 from training. Duration: 0.6909375 2023-10-04 01:11:02,176 WARNING [train.py:1204] (2/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_402-102311-0 from training. Duration: 0.67 2023-10-04 01:11:05,566 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1015-343884-0_sp1.1 from training. Duration: 0.563625 2023-10-04 01:11:05,649 WARNING [train.py:1204] (2/4) Exclude cut with ID _13684_210_7497_1_1530619354863_3598359_312-235606-0_sp1.1 from training. Duration: 0.5454375 2023-10-04 01:11:05,673 WARNING [train.py:1204] (2/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_55-31904-0 from training. Duration: 0.88 2023-10-04 01:11:07,062 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_3133_210_2087_1_1526106446_2324690_25-332138-0 from training. Duration: 0.91 2023-10-04 01:11:07,063 WARNING [train.py:1204] (2/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_912-282412-0_sp0.9 from training. Duration: 0.5444375 2023-10-04 01:11:08,430 WARNING [train.py:1204] (2/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_458-299435-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 01:11:09,140 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.2.encoder.layers.1.self_attn2.whiten, num_groups=1, num_channels=384, metric=15.41 vs. limit=22.5 2023-10-04 01:11:09,702 WARNING [train.py:1204] (2/4) Exclude cut with ID _69584_210_5219_1_1535785133775_4044450_275-88403-0_sp1.1 from training. Duration: 0.92725 2023-10-04 01:11:09,718 WARNING [train.py:1204] (2/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_841-341679-0_sp1.1 from training. Duration: 0.52725 2023-10-04 01:11:09,725 WARNING [train.py:1204] (2/4) Exclude cut with ID _41391_210_7994_1_1533603771872_3638201_1-54521-0_sp1.1 from training. Duration: 0.97275 2023-10-04 01:11:11,045 WARNING [train.py:1204] (2/4) Exclude cut with ID _56235_210_10640_1_1534557335235_4161099_495-186609-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 01:11:12,484 WARNING [train.py:1204] (2/4) Exclude cut with ID _4681_210_3525_1_1527250689464_120889_15-126760-0 from training. Duration: 0.81 2023-10-04 01:11:13,404 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.1.encoder.layers.0.self_attn_weights.whiten_keys, num_groups=4, num_channels=128, metric=3.60 vs. limit=6.0 2023-10-04 01:11:14,111 INFO [train.py:1046] (2/4) Epoch 42, batch 3050, loss[loss=0.1577, simple_loss=0.243, pruned_loss=0.03622, over 24375.00 frames. ], tot_loss[loss=0.1573, simple_loss=0.2375, pruned_loss=0.0386, over 4700211.08 frames. ], batch size: 77, lr: 2.42e-03, grad_scale: 16.0 2023-10-04 01:11:15,527 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_3019_210_2028_1_1526044245_3264865_127-135615-0_sp1.1 from training. Duration: 0.8545625 2023-10-04 01:11:16,380 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.1.encoder.layers.0.conv_module1.whiten, num_groups=1, num_channels=256, metric=10.30 vs. limit=15.0 2023-10-04 01:11:18,288 WARNING [train.py:1204] (2/4) Exclude cut with ID _30191_210_1664_1_1533189915047_7196670_915-21561-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 01:11:18,326 WARNING [train.py:1204] (2/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_6-214815-0_sp0.9 from training. Duration: 0.9444375 2023-10-04 01:11:18,437 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.0.ff2_skip_rate, batch_count=1472320.0, ans=0.0 2023-10-04 01:11:23,392 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_45399_210_4852_1_1533624725_7579737_190-137494-0_sp1.1 from training. Duration: 0.97275 2023-10-04 01:11:26,234 WARNING [train.py:1204] (2/4) Exclude cut with ID _41314_210_12651_1_1533459476543_3766460_207-132038-0 from training. Duration: 0.94 2023-10-04 01:11:31,732 WARNING [train.py:1204] (2/4) Exclude cut with ID _14603_210_4220_1_1530615688441_3097319_90-31383-0 from training. Duration: 0.87 2023-10-04 01:11:31,766 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_15517_210_6753_1_1530952293_1664795_119-31001-0 from training. Duration: 0.94 2023-10-04 01:11:31,790 WARNING [train.py:1204] (2/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_684-85342-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 01:11:33,275 WARNING [train.py:1204] (2/4) Exclude cut with ID _14603_210_4220_1_1530615688441_3097319_97-325053-0_sp1.1 from training. Duration: 0.836375 2023-10-04 01:11:37,685 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_68702_210_23689_1_1535799577_3420068_168-25150-0_sp1.1 from training. Duration: 0.97275 2023-10-04 01:11:37,700 WARNING [train.py:1204] (2/4) Exclude cut with ID _65134_210_14763_1_1535335393985_3505368_111-27984-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 01:11:38,891 WARNING [train.py:1204] (2/4) Exclude cut with ID _48678_210_5663_1_1533988830947_3769199_276-226230-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 01:11:39,080 WARNING [train.py:1204] (2/4) Exclude cut with ID _28427_210_4220_1_1532581240967_3061069_43-84928-0_sp1.1 from training. Duration: 0.9 2023-10-04 01:11:40,417 WARNING [train.py:1204] (2/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_158-250116-0_sp1.1 from training. Duration: 0.563625 2023-10-04 01:11:40,444 WARNING [train.py:1204] (2/4) Exclude cut with ID _66873_210_5517_1_1535454136506_4293370_279-332556-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 01:11:41,768 WARNING [train.py:1204] (2/4) Exclude cut with ID _42863_210_9070_1_1533902398788_3696920_488-230146-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 01:11:41,769 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_387-247159-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 01:11:43,152 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6729_210_2550_1_1528032481_1788725_261-42619-0_sp1.1 from training. Duration: 0.97275 2023-10-04 01:11:45,050 WARNING [train.py:1204] (2/4) Exclude cut with ID _19896_210_1883_1_1531709022662_6649569_831-44724-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 01:11:45,977 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.0.layers.0.feed_forward1.out_whiten, num_groups=1, num_channels=192, metric=10.66 vs. limit=15.0 2023-10-04 01:11:46,588 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.1.bypass.skip_rate, batch_count=1472453.3333333333, ans=0.035 2023-10-04 01:11:49,150 WARNING [train.py:1204] (2/4) Exclude cut with ID _55153_210_2253_1_1534384786677_3162096_280-285944-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 01:11:49,171 WARNING [train.py:1204] (2/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_448-4048-0 from training. Duration: 0.96 2023-10-04 01:11:49,221 WARNING [train.py:1204] (2/4) Exclude cut with ID _29678_210_7893_1_1532421042787_3906118_456-202548-0_sp1.1 from training. Duration: 0.97275 2023-10-04 01:11:51,107 WARNING [train.py:1204] (2/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_107-332398-0_sp1.1 from training. Duration: 0.7090625 2023-10-04 01:11:54,051 WARNING [train.py:1204] (2/4) Exclude cut with ID _13921_210_9994_1_1530838702800_3003429_46-149444-0_sp1.1 from training. Duration: 0.8545625 2023-10-04 01:11:55,335 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_257-20733-0_sp0.9 from training. Duration: 0.8444375 2023-10-04 01:11:55,402 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_53753_210_7234_1_1534320003_3594148_385-234809-0_sp1.1 from training. Duration: 0.963625 2023-10-04 01:11:56,838 WARNING [train.py:1204] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_292-215346-0_sp1.1 from training. Duration: 0.8818125 2023-10-04 01:12:02,587 WARNING [train.py:1204] (2/4) Exclude cut with ID _68965_210_1883_1_1535698962272_3497490_196-124268-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 01:12:02,627 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_23801_210_16223_1_1532066392_3294657_570-5439-0_sp1.1 from training. Duration: 0.8818125 2023-10-04 01:12:04,628 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.4.encoder.layers.2.conv_module1.whiten, num_groups=1, num_channels=384, metric=2.64 vs. limit=15.0 2023-10-04 01:12:06,774 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.619e+02 1.976e+02 2.144e+02 2.378e+02 3.256e+02, threshold=4.288e+02, percent-clipped=0.0 2023-10-04 01:12:06,949 WARNING [train.py:1204] (2/4) Exclude cut with ID _5166_210_2084_1_1527073197160_4087993_349-6928-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 01:12:07,196 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.feed_forward2.hidden_balancer.prob, batch_count=1472520.0, ans=0.125 2023-10-04 01:12:07,643 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.3.encoder.layers.1.self_attn2.whiten, num_groups=1, num_channels=512, metric=12.70 vs. limit=22.5 2023-10-04 01:12:08,217 WARNING [train.py:1204] (2/4) Exclude cut with ID _60621_210_17071_1_1535013053530_3671739_226-255202-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 01:12:08,221 WARNING [train.py:1204] (2/4) Exclude cut with ID _41817_210_18835_1_1533434477160_7423049_448-130302-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 01:12:10,010 WARNING [train.py:1204] (2/4) Exclude cut with ID _6653_210_2629_1_1527919172441_6732679_758-143677-0_sp1.1 from training. Duration: 0.8909375 2023-10-04 01:12:10,043 WARNING [train.py:1204] (2/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_912-47473-0_sp0.9 from training. Duration: 0.7555625 2023-10-04 01:12:11,408 WARNING [train.py:1204] (2/4) Exclude cut with ID _6694_210_4599_1_1527852637490_3965529_356-169233-0_sp1.1 from training. Duration: 0.9 2023-10-04 01:12:12,778 WARNING [train.py:1204] (2/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_1-99680-0 from training. Duration: 0.67 2023-10-04 01:12:12,886 WARNING [train.py:1204] (2/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_199-86432-0_sp1.1 from training. Duration: 0.8909375 2023-10-04 01:12:14,121 WARNING [train.py:1204] (2/4) Exclude cut with ID _61148_210_6897_1_1535094022726_4217610_362-182430-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 01:12:14,191 WARNING [train.py:1204] (2/4) Exclude cut with ID _49376_210_5986_1_1533985278732_3370740_301-154763-0 from training. Duration: 0.98 2023-10-04 01:12:16,905 WARNING [train.py:1204] (2/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_572-185150-0_sp1.1 from training. Duration: 0.8818125 2023-10-04 01:12:23,685 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_47896_210_20508_1_1534492518_5958868_558-314572-0_sp1.1 from training. Duration: 0.8818125 2023-10-04 01:12:25,643 WARNING [train.py:1204] (2/4) Exclude cut with ID _45560_210_21982_1_1533812603951_3584999_111-6376-0_sp0.9 from training. Duration: 0.9333125 2023-10-04 01:12:27,061 WARNING [train.py:1204] (2/4) Exclude cut with ID _14170_210_5219_1_1530525949868_4469650_412-317077-0_sp1.1 from training. Duration: 0.6818125 2023-10-04 01:12:28,407 INFO [train.py:1046] (2/4) Epoch 42, batch 3100, loss[loss=0.1469, simple_loss=0.2276, pruned_loss=0.03312, over 24501.00 frames. ], tot_loss[loss=0.1566, simple_loss=0.2368, pruned_loss=0.03819, over 4702677.91 frames. ], batch size: 58, lr: 2.42e-03, grad_scale: 16.0 2023-10-04 01:12:28,520 WARNING [train.py:1204] (2/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_718-68505-0 from training. Duration: 0.89 2023-10-04 01:12:31,406 WARNING [train.py:1204] (2/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_77-311404-0 from training. Duration: 0.94 2023-10-04 01:12:32,795 WARNING [train.py:1204] (2/4) Exclude cut with ID _60229_210_27767_1_1534838406158_3655350_264-279573-0 from training. Duration: 0.83 2023-10-04 01:12:34,151 WARNING [train.py:1204] (2/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_106-300985-0_sp1.1 from training. Duration: 0.8181875 2023-10-04 01:12:34,408 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.conv_skip_rate, batch_count=1472653.3333333333, ans=0.0 2023-10-04 01:12:38,200 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_4183_210_2087_1_1526729100_1869838_79-277189-0_sp1.1 from training. Duration: 0.963625 2023-10-04 01:12:39,481 WARNING [train.py:1204] (2/4) Exclude cut with ID _28427_210_4220_1_1532581240967_3061069_195-295268-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 01:12:42,797 WARNING [train.py:1204] (2/4) Exclude cut with ID _4092_210_2151_1_1526643044449_3135759_356-278694-0_sp0.9 from training. Duration: 0.52225 2023-10-04 01:12:45,564 WARNING [train.py:1204] (2/4) Exclude cut with ID _14459_210_2228_1_1531443641946_7516700_777-71446-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 01:12:47,229 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.conv_skip_rate, batch_count=1472720.0, ans=0.0 2023-10-04 01:12:51,505 WARNING [train.py:1204] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_520-126729-0 from training. Duration: 0.7 2023-10-04 01:12:54,346 WARNING [train.py:1204] (2/4) Exclude cut with ID _13684_210_7497_1_1530619354863_3598359_312-235606-0_sp0.9 from training. Duration: 0.6666875 2023-10-04 01:12:55,022 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.2.encoder.layers.2.feed_forward1.out_whiten, num_groups=1, num_channels=384, metric=6.04 vs. limit=15.0 2023-10-04 01:12:56,239 WARNING [train.py:1204] (2/4) Exclude cut with ID _37537_210_2253_1_1533171758568_3421949_283-349402-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 01:12:57,514 WARNING [train.py:1204] (2/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_29-117932-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 01:12:57,549 WARNING [train.py:1204] (2/4) Exclude cut with ID _29303_210_2575_1_1532392237303_7410649_721-283351-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 01:12:57,613 WARNING [train.py:1204] (2/4) Exclude cut with ID _14886_210_4220_1_1530702020355_3516190_175-229073-0_sp0.9 from training. Duration: 0.5 2023-10-04 01:13:00,302 WARNING [train.py:1204] (2/4) Exclude cut with ID _15458_210_8341_1_1530858614263_3834410_70-268830-0_sp1.1 from training. Duration: 0.97275 2023-10-04 01:13:00,325 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_2295_210_2087_1_1525500993_2407863_234-83104-0 from training. Duration: 0.62 2023-10-04 01:13:00,326 WARNING [train.py:1204] (2/4) Exclude cut with ID _22961_210_8254_1_1531976366672_4278180_317-199546-0_sp1.1 from training. Duration: 0.7818125 2023-10-04 01:13:01,826 WARNING [train.py:1204] (2/4) Exclude cut with ID _27894_210_12392_1_1532415496078_6902439_547-196022-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 01:13:01,927 WARNING [train.py:1204] (2/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_46-207599-0 from training. Duration: 0.64 2023-10-04 01:13:04,550 WARNING [train.py:1204] (2/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_426-326246-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 01:13:08,539 WARNING [train.py:1204] (2/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_445-117398-0_sp0.9 from training. Duration: 0.988875 2023-10-04 01:13:08,599 WARNING [train.py:1204] (2/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_563-347832-0 from training. Duration: 0.89 2023-10-04 01:13:09,920 WARNING [train.py:1204] (2/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_252-216674-0 from training. Duration: 0.54 2023-10-04 01:13:10,004 WARNING [train.py:1204] (2/4) Exclude cut with ID _59611_210_8895_1_1534741198240_3650361_187-212779-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 01:13:11,345 WARNING [train.py:1204] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_282-159367-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 01:13:14,561 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_68702_210_23689_1_1535799577_3420068_33-136883-0_sp1.1 from training. Duration: 0.936375 2023-10-04 01:13:14,579 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_47228_210_4983_1_1533780021_3672027_184-293706-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 01:13:14,600 WARNING [train.py:1204] (2/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_656-188361-0_sp0.9 from training. Duration: 0.9666875 2023-10-04 01:13:16,040 WARNING [train.py:1204] (2/4) Exclude cut with ID _4866_210_2577_1_1527404412284_4364659_352-157930-0_sp0.9 from training. Duration: 0.9 2023-10-04 01:13:16,040 WARNING [train.py:1204] (2/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_210-106047-0_sp1.1 from training. Duration: 0.8818125 2023-10-04 01:13:17,593 WARNING [train.py:1204] (2/4) Exclude cut with ID _9389_210_1664_1_1529217337301_5289269_289-213417-0_sp1.1 from training. Duration: 0.8090625 2023-10-04 01:13:18,847 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_3910_210_1794_1_1526777667_3747260_178-41186-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 01:13:18,851 WARNING [train.py:1204] (2/4) Exclude cut with ID _27052_210_5986_1_1532329197187_3585009_24-172263-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 01:13:18,855 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_5522_210_3273_1_1527249606_3909096_318-203153-0_sp1.1 from training. Duration: 0.3909375 2023-10-04 01:13:21,714 WARNING [train.py:1204] (2/4) Exclude cut with ID _27894_210_12392_1_1532415496078_6902439_222-51982-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 01:13:24,752 WARNING [train.py:1204] (2/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_87-131427-0 from training. Duration: 0.78 2023-10-04 01:13:26,272 WARNING [train.py:1204] (2/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_372-298093-0_sp0.9 from training. Duration: 0.888875 2023-10-04 01:13:26,340 WARNING [train.py:1204] (2/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_663-166973-0 from training. Duration: 0.8 2023-10-04 01:13:28,251 WARNING [train.py:1204] (2/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_429-177469-0_sp1.1 from training. Duration: 0.936375 2023-10-04 01:13:28,285 WARNING [train.py:1204] (2/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_724-172651-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 01:13:28,319 WARNING [train.py:1204] (2/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_103-336064-0 from training. Duration: 0.51 2023-10-04 01:13:39,303 WARNING [train.py:1204] (2/4) Exclude cut with ID _48677_210_5663_1_1533902405919_3892989_111-11439-0 from training. Duration: 0.96 2023-10-04 01:13:40,606 INFO [train.py:1046] (2/4) Epoch 42, batch 3150, loss[loss=0.1565, simple_loss=0.2391, pruned_loss=0.03694, over 23266.00 frames. ], tot_loss[loss=0.1556, simple_loss=0.2359, pruned_loss=0.03764, over 4709463.34 frames. ], batch size: 105, lr: 2.42e-03, grad_scale: 16.0 2023-10-04 01:13:40,759 WARNING [train.py:1204] (2/4) Exclude cut with ID _8085_210_4557_1_1528455633493_3716100_95-103398-0_sp1.1 from training. Duration: 0.963625 2023-10-04 01:13:42,040 WARNING [train.py:1204] (2/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_1041-110837-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 01:13:43,903 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_50050_210_16223_1_1535435796_7290138_1155-121757-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 01:13:43,905 WARNING [train.py:1204] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_602-275260-0_sp0.9 from training. Duration: 0.911125 2023-10-04 01:13:45,426 WARNING [train.py:1204] (2/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_265-164486-0 from training. Duration: 0.96 2023-10-04 01:13:46,811 WARNING [train.py:1204] (2/4) Exclude cut with ID _46067_210_4220_1_1534125571066_3307450_142-229375-0_sp1.1 from training. Duration: 0.963625 2023-10-04 01:13:46,844 WARNING [train.py:1204] (2/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_310-26186-0_sp1.1 from training. Duration: 0.636375 2023-10-04 01:13:48,185 WARNING [train.py:1204] (2/4) Exclude cut with ID _28461_210_2253_1_1532771775189_3041299_136-270644-0 from training. Duration: 0.93 2023-10-04 01:13:49,505 WARNING [train.py:1204] (2/4) Exclude cut with ID _14860_210_4915_1_1530702020340_7343240_438-286372-0_sp1.1 from training. Duration: 0.936375 2023-10-04 01:13:50,981 WARNING [train.py:1204] (2/4) Exclude cut with ID IC0128W0004-63309 from training. Duration: 0.831 2023-10-04 01:13:53,774 WARNING [train.py:1204] (2/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_135-174080-0 from training. Duration: 0.53 2023-10-04 01:13:53,815 WARNING [train.py:1204] (2/4) Exclude cut with ID _41326_210_5854_1_1533778076509_3492080_277-59511-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 01:13:53,894 WARNING [train.py:1204] (2/4) Exclude cut with ID IC0144W0112-71379 from training. Duration: 0.992 2023-10-04 01:13:55,798 WARNING [train.py:1204] (2/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_711-15688-0_sp0.9 from training. Duration: 0.47775 2023-10-04 01:13:57,169 WARNING [train.py:1204] (2/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_237-332027-0 from training. Duration: 0.95 2023-10-04 01:13:58,471 WARNING [train.py:1204] (2/4) Exclude cut with ID _7970_210_1883_1_1528455616296_3472230_25-178407-0 from training. Duration: 0.71 2023-10-04 01:13:58,473 WARNING [train.py:1204] (2/4) Exclude cut with ID _8480_210_1874_1_1528589391205_6728330_309-139896-0 from training. Duration: 0.81 2023-10-04 01:13:58,498 WARNING [train.py:1204] (2/4) Exclude cut with ID _39571_210_13449_1_1533801584143_3651077_319-268877-0_sp1.1 from training. Duration: 0.936375 2023-10-04 01:13:58,508 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_43802_210_2290_1_1533516203_8829504_155-246880-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 01:13:59,840 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_566-122599-0_sp1.1 from training. Duration: 0.936375 2023-10-04 01:14:01,086 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_12868_210_6753_1_1530701932_530275_18-76322-0 from training. Duration: 0.87 2023-10-04 01:14:02,568 WARNING [train.py:1204] (2/4) Exclude cut with ID _56494_210_4220_1_1535284837625_3033169_159-106035-0_sp1.1 from training. Duration: 0.963625 2023-10-04 01:14:03,968 WARNING [train.py:1204] (2/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_98-276013-0_sp1.1 from training. Duration: 0.963625 2023-10-04 01:14:04,016 WARNING [train.py:1204] (2/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_330-216124-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 01:14:06,755 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_177-58005-0_sp0.9 from training. Duration: 0.688875 2023-10-04 01:14:10,081 WARNING [train.py:1204] (2/4) Exclude cut with ID _56235_210_10640_1_1534557335235_4161099_343-139898-0 from training. Duration: 0.96 2023-10-04 01:14:11,356 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6961_210_2087_1_1527937168_6815011_395-185060-0_sp1.1 from training. Duration: 0.863625 2023-10-04 01:14:12,882 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7002_210_1794_1_1527939683_5157286_184-229630-0_sp0.9 from training. Duration: 0.82225 2023-10-04 01:14:14,211 WARNING [train.py:1204] (2/4) Exclude cut with ID _59170_210_6721_1_1534983633960_4203335_153-18895-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 01:14:14,260 WARNING [train.py:1204] (2/4) Exclude cut with ID _49643_210_7989_1_1534640244224_3939031_598-161926-0 from training. Duration: 0.9 2023-10-04 01:14:17,276 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.feed_forward1.out_proj.dropout_p, batch_count=1473120.0, ans=0.1 2023-10-04 01:14:18,478 WARNING [train.py:1204] (2/4) Exclude cut with ID _4068_210_2629_1_1526697350395_1546259_224-288693-0 from training. Duration: 0.59 2023-10-04 01:14:18,560 WARNING [train.py:1204] (2/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_718-68505-0_sp1.1 from training. Duration: 0.8090625 2023-10-04 01:14:19,804 WARNING [train.py:1204] (2/4) Exclude cut with ID _4068_210_2629_1_1526697350395_1546259_224-288693-0_sp0.9 from training. Duration: 0.6555625 2023-10-04 01:14:19,825 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_2296_210_2087_1_1525503448_3397335_57-185430-0_sp0.9 from training. Duration: 0.6666875 2023-10-04 01:14:19,877 WARNING [train.py:1204] (2/4) Exclude cut with ID _15458_210_8341_1_1530858614263_3834410_49-235472-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 01:14:19,879 WARNING [train.py:1204] (2/4) Exclude cut with ID _64135_210_2253_1_1535677267876_7608972_498-5039-0_sp0.9 from training. Duration: 0.8666875 2023-10-04 01:14:21,304 WARNING [train.py:1204] (2/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_283-220372-0_sp0.9 from training. Duration: 0.62225 2023-10-04 01:14:21,314 WARNING [train.py:1204] (2/4) Exclude cut with ID _6473_210_1741_1_1527901307396_3815380_108-17335-0_sp0.9 from training. Duration: 0.72225 2023-10-04 01:14:22,688 WARNING [train.py:1204] (2/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_113-92608-0 from training. Duration: 0.54 2023-10-04 01:14:22,737 WARNING [train.py:1204] (2/4) Exclude cut with ID _14170_210_5219_1_1530525949868_4469650_272-249985-0_sp1.1 from training. Duration: 0.6090625 2023-10-04 01:14:22,746 WARNING [train.py:1204] (2/4) Exclude cut with ID _50376_210_13401_1_1534031502101_4217740_518-187390-0_sp1.1 from training. Duration: 0.97275 2023-10-04 01:14:23,001 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.feed_forward3.hidden_balancer.prob, batch_count=1473186.6666666667, ans=0.125 2023-10-04 01:14:24,144 WARNING [train.py:1204] (2/4) Exclude cut with ID _59738_210_15379_1_1534849512562_3941470_165-248260-0_sp1.1 from training. Duration: 0.92725 2023-10-04 01:14:24,153 WARNING [train.py:1204] (2/4) Exclude cut with ID _23818_210_6228_1_1531917063884_3676920_400-343925-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 01:14:24,215 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6961_210_2087_1_1527937168_6815011_562-165518-0 from training. Duration: 0.77 2023-10-04 01:14:26,134 WARNING [train.py:1204] (2/4) Exclude cut with ID _24271_210_12658_1_1531987270022_7190519_289-207931-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 01:14:28,150 WARNING [train.py:1204] (2/4) Exclude cut with ID _14632_210_9988_1_1530691279392_3923200_332-105904-0 from training. Duration: 0.9 2023-10-04 01:14:28,170 WARNING [train.py:1204] (2/4) Exclude cut with ID _40402_210_18219_1_1533522324115_2953150_203-232438-0_sp1.1 from training. Duration: 0.97275 2023-10-04 01:14:29,538 WARNING [train.py:1204] (2/4) Exclude cut with ID _13252_210_1925_1_1530671372047_3336909_263-168388-0 from training. Duration: 0.86 2023-10-04 01:14:30,802 WARNING [train.py:1204] (2/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_174-205058-0 from training. Duration: 0.95 2023-10-04 01:14:32,240 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_94-195801-0_sp1.1 from training. Duration: 0.8545625 2023-10-04 01:14:32,269 WARNING [train.py:1204] (2/4) Exclude cut with ID _55153_210_2253_1_1534384786677_3162096_269-103204-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 01:14:33,722 WARNING [train.py:1204] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_748-36150-0 from training. Duration: 0.91 2023-10-04 01:14:34,959 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.646e+02 1.982e+02 2.210e+02 2.426e+02 4.214e+02, threshold=4.421e+02, percent-clipped=0.0 2023-10-04 01:14:36,423 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_12868_210_6753_1_1530702479_3610564_148-172861-0_sp1.1 from training. Duration: 0.4545625 2023-10-04 01:14:36,475 WARNING [train.py:1204] (2/4) Exclude cut with ID _67499_210_17282_1_1535509805220_3692430_460-77730-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 01:14:39,314 WARNING [train.py:1204] (2/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_374-20753-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 01:14:41,238 WARNING [train.py:1204] (2/4) Exclude cut with ID _45329_210_7005_1_1534157951211_6774339_374-289983-0_sp1.1 from training. Duration: 0.97275 2023-10-04 01:14:41,264 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_34886_210_10633_1_1533203882_1755661_76-156096-0_sp0.9 from training. Duration: 0.9666875 2023-10-04 01:14:45,685 INFO [scaling.py:1118] (2/4) WithLoss: name=encoder.encoders.3.encoder.layers.3.self_attn_weights, loss-sum=0.000e+00 2023-10-04 01:14:48,086 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_238-256668-0_sp0.9 from training. Duration: 0.7666875 2023-10-04 01:14:48,135 WARNING [train.py:1204] (2/4) Exclude cut with ID _46683_210_9642_1_1533730913492_4591116_489-146727-0_sp1.1 from training. Duration: 0.97275 2023-10-04 01:14:49,627 WARNING [train.py:1204] (2/4) Exclude cut with ID _13938_210_9988_1_1530518718978_3693960_304-112784-0_sp0.9 from training. Duration: 0.511125 2023-10-04 01:14:51,166 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.0.self_attn_weights.pos_emb_skip_rate, batch_count=1473253.3333333333, ans=0.0 2023-10-04 01:14:53,758 INFO [train.py:1046] (2/4) Epoch 42, batch 3200, loss[loss=0.1497, simple_loss=0.2418, pruned_loss=0.02876, over 24308.00 frames. ], tot_loss[loss=0.1552, simple_loss=0.2347, pruned_loss=0.03782, over 4699801.72 frames. ], batch size: 74, lr: 2.42e-03, grad_scale: 16.0 2023-10-04 01:14:55,195 WARNING [train.py:1204] (2/4) Exclude cut with ID _24271_210_12658_1_1531987270022_7190519_243-46549-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 01:14:55,207 WARNING [train.py:1204] (2/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_78-182912-0_sp1.1 from training. Duration: 0.536375 2023-10-04 01:15:00,225 WARNING [train.py:1204] (2/4) Exclude cut with ID _57572_210_5517_1_1534921219624_3809484_121-321334-0_sp1.1 from training. Duration: 0.97275 2023-10-04 01:15:02,004 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_40047_210_2290_1_1533290519_2374311_254-102149-0_sp1.1 from training. Duration: 0.963625 2023-10-04 01:15:02,005 WARNING [train.py:1204] (2/4) Exclude cut with ID _28461_210_2253_1_1532771775189_3041299_96-152158-0 from training. Duration: 0.85 2023-10-04 01:15:04,859 WARNING [train.py:1204] (2/4) Exclude cut with ID _29877_210_6228_1_1532397791106_5276149_161-102979-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 01:15:07,745 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.feed_forward1.out_proj.dropout_p, batch_count=1473386.6666666667, ans=0.1 2023-10-04 01:15:08,988 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_3787_210_2040_1_1526477544_5478586_603-120324-0_sp0.9 from training. Duration: 0.9 2023-10-04 01:15:13,471 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_41750_210_1960_1_1533520678_3948042_247-213456-0_sp1.1 from training. Duration: 0.97275 2023-10-04 01:15:20,504 WARNING [train.py:1204] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_127-119109-0_sp0.9 from training. Duration: 0.8 2023-10-04 01:15:20,802 INFO [scaling.py:1118] (2/4) WithLoss: name=encoder.encoders.3.encoder.layers.2.self_attn_weights, loss-sum=0.000e+00 2023-10-04 01:15:30,823 WARNING [train.py:1204] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_215-202973-0 from training. Duration: 0.88 2023-10-04 01:15:30,896 WARNING [train.py:1204] (2/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_17-320491-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 01:15:34,209 WARNING [train.py:1204] (2/4) Exclude cut with ID _5166_210_2084_1_1527073197160_4087993_310-114452-0 from training. Duration: 0.59 2023-10-04 01:15:34,290 WARNING [train.py:1204] (2/4) Exclude cut with ID _15133_210_6228_1_1530794512074_3080379_29-264458-0_sp0.9 from training. Duration: 0.6444375 2023-10-04 01:15:37,173 WARNING [train.py:1204] (2/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_159-281310-0_sp1.1 from training. Duration: 0.863625 2023-10-04 01:15:37,183 WARNING [train.py:1204] (2/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_195-167011-0_sp0.9 from training. Duration: 0.8333125 2023-10-04 01:15:38,442 WARNING [train.py:1204] (2/4) Exclude cut with ID _14087_210_2253_1_1530529224512_3014029_156-257607-0_sp1.1 from training. Duration: 0.7545625 2023-10-04 01:15:41,843 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_2295_210_2087_1_1525500993_2407863_116-238894-0 from training. Duration: 0.7 2023-10-04 01:15:43,271 WARNING [train.py:1204] (2/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_321-335475-0_sp0.9 from training. Duration: 0.5 2023-10-04 01:15:45,829 WARNING [train.py:1204] (2/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_188-114464-0 from training. Duration: 0.82 2023-10-04 01:15:48,676 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_2592_210_2290_1_1525697025_4882925_157-70512-0 from training. Duration: 0.91 2023-10-04 01:15:48,960 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.bypass_mid.scale_min, batch_count=1473520.0, ans=0.2 2023-10-04 01:15:51,332 WARNING [train.py:1204] (2/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_138-64210-0_sp1.1 from training. Duration: 0.663625 2023-10-04 01:15:57,120 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_11305_210_1794_1_1529750428_7178148_651-95702-0_sp1.1 from training. Duration: 0.963625 2023-10-04 01:15:57,139 WARNING [train.py:1204] (2/4) Exclude cut with ID _14224_210_4381_1_1530529062037_4264010_379-184223-0_sp0.9 from training. Duration: 0.9444375 2023-10-04 01:15:57,156 WARNING [train.py:1204] (2/4) Exclude cut with ID _8394_210_6702_1_1528590504316_3540189_381-125325-0_sp1.1 from training. Duration: 0.963625 2023-10-04 01:15:58,596 WARNING [train.py:1204] (2/4) Exclude cut with ID IC0622W0021-307659 from training. Duration: 0.896 2023-10-04 01:15:58,604 WARNING [train.py:1204] (2/4) Exclude cut with ID _7624_210_1531_1_1528109802198_4933101_418-349834-0_sp1.1 from training. Duration: 0.5454375 2023-10-04 01:16:01,905 WARNING [train.py:1204] (2/4) Exclude cut with ID _49728_210_12651_1_1534041034294_6788150_607-3416-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 01:16:05,108 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8275_210_4198_1_1528455320_3892571_491-17707-0 from training. Duration: 0.64 2023-10-04 01:16:05,151 WARNING [train.py:1204] (2/4) Exclude cut with ID _5287_210_2084_1_1527418883040_3878670_333-48508-0 from training. Duration: 0.6 2023-10-04 01:16:06,446 WARNING [train.py:1204] (2/4) Exclude cut with ID _8365_210_2064_1_1528592311070_4677690_371-122295-0 from training. Duration: 0.55 2023-10-04 01:16:06,550 WARNING [train.py:1204] (2/4) Exclude cut with ID _14860_210_4915_1_1530702020340_7343240_836-328489-0 from training. Duration: 0.9 2023-10-04 01:16:07,910 INFO [train.py:1046] (2/4) Epoch 42, batch 3250, loss[loss=0.1577, simple_loss=0.2323, pruned_loss=0.04153, over 23509.00 frames. ], tot_loss[loss=0.1555, simple_loss=0.2348, pruned_loss=0.03813, over 4703012.17 frames. ], batch size: 285, lr: 2.42e-03, grad_scale: 16.0 2023-10-04 01:16:08,033 WARNING [train.py:1204] (2/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_1033-274376-0_sp0.9 from training. Duration: 0.9666875 2023-10-04 01:16:08,881 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.attention_skip_rate, batch_count=1473653.3333333333, ans=0.0 2023-10-04 01:16:10,075 WARNING [train.py:1204] (2/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_475-127455-0_sp0.9 from training. Duration: 0.77775 2023-10-04 01:16:10,082 WARNING [train.py:1204] (2/4) Exclude cut with ID IC0671W0071-331914 from training. Duration: 0.896 2023-10-04 01:16:10,108 WARNING [train.py:1204] (2/4) Exclude cut with ID _30327_210_16049_1_1532484161295_3924169_2-1523-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 01:16:11,381 WARNING [train.py:1204] (2/4) Exclude cut with ID _24314_210_4915_1_1532135078116_7170179_460-208279-0_sp1.1 from training. Duration: 0.97275 2023-10-04 01:16:12,770 WARNING [train.py:1204] (2/4) Exclude cut with ID IC0679W0052-335859 from training. Duration: 0.64 2023-10-04 01:16:16,756 WARNING [train.py:1204] (2/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_3-237434-0_sp1.1 from training. Duration: 0.6909375 2023-10-04 01:16:18,211 WARNING [train.py:1204] (2/4) Exclude cut with ID _69884_210_9771_1_1535846392818_4166169_405-185283-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 01:16:23,798 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.feed_forward1.hidden_balancer.prob, batch_count=1473720.0, ans=0.125 2023-10-04 01:16:26,436 WARNING [train.py:1204] (2/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_927-83684-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 01:16:26,445 WARNING [train.py:1204] (2/4) Exclude cut with ID _6694_210_4599_1_1527852637490_3965529_356-169233-0 from training. Duration: 0.99 2023-10-04 01:16:26,533 WARNING [train.py:1204] (2/4) Exclude cut with ID _19896_210_1883_1_1531709022662_6649569_562-233001-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 01:16:27,748 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_20120_210_6753_1_1531643035_5482859_354-78629-0_sp1.1 from training. Duration: 0.963625 2023-10-04 01:16:27,749 WARNING [train.py:1204] (2/4) Exclude cut with ID _60135_210_13427_1_1534935557922_3651319_96-168198-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 01:16:29,154 WARNING [train.py:1204] (2/4) Exclude cut with ID _31602_210_5854_1_1532677021456_4458250_249-96604-0_sp1.1 from training. Duration: 0.8090625 2023-10-04 01:16:29,186 WARNING [train.py:1204] (2/4) Exclude cut with ID _14100_210_8341_1_1530529320613_3678069_258-224599-0_sp0.9 from training. Duration: 0.8555625 2023-10-04 01:16:34,411 WARNING [train.py:1204] (2/4) Exclude cut with ID _19708_210_9206_1_1532136650733_4238364_215-160135-0_sp1.1 from training. Duration: 0.97275 2023-10-04 01:16:34,433 WARNING [train.py:1204] (2/4) Exclude cut with ID _21878_210_3525_1_1531738772002_7264330_113-18163-0_sp0.9 from training. Duration: 0.988875 2023-10-04 01:16:34,477 WARNING [train.py:1204] (2/4) Exclude cut with ID _64135_210_2253_1_1535677267876_7608972_552-337340-0_sp1.1 from training. Duration: 0.92725 2023-10-04 01:16:34,509 WARNING [train.py:1204] (2/4) Exclude cut with ID _67725_210_27767_1_1535608756671_5105359_10-12887-0_sp1.1 from training. Duration: 0.97275 2023-10-04 01:16:34,522 WARNING [train.py:1204] (2/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_401-284840-0_sp1.1 from training. Duration: 0.97275 2023-10-04 01:16:36,252 WARNING [train.py:1204] (2/4) Exclude cut with ID _44659_210_2721_1_1534036120061_6614860_483-141285-0_sp1.1 from training. Duration: 0.936375 2023-10-04 01:16:37,814 WARNING [train.py:1204] (2/4) Exclude cut with ID _9461_210_4557_1_1529060514726_3677451_358-332734-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 01:16:37,913 WARNING [train.py:1204] (2/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_788-298907-0_sp1.1 from training. Duration: 0.8090625 2023-10-04 01:16:41,099 WARNING [train.py:1204] (2/4) Exclude cut with ID _39411_210_5986_1_1533538825030_3343649_354-267196-0_sp1.1 from training. Duration: 0.92725 2023-10-04 01:16:41,127 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_14953_210_2151_1_1530701710_5616165_192-309792-0_sp1.1 from training. Duration: 0.97275 2023-10-04 01:16:42,498 WARNING [train.py:1204] (2/4) Exclude cut with ID _59170_210_6721_1_1534983633960_4203335_290-151255-0_sp1.1 from training. Duration: 0.92725 2023-10-04 01:16:42,522 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_5575_210_1410_1_1527301305_2570170_249-93769-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 01:16:42,532 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_46013_210_7234_1_1533776362_3744032_463-52378-0_sp1.1 from training. Duration: 0.7818125 2023-10-04 01:16:42,674 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.conv_module2.balancer2.prob, batch_count=1473786.6666666667, ans=0.125 2023-10-04 01:16:46,644 WARNING [train.py:1204] (2/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_580-93798-0 from training. Duration: 0.56 2023-10-04 01:16:47,963 WARNING [train.py:1204] (2/4) Exclude cut with ID _19514_210_9259_1_1531530071330_5084230_385-13762-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 01:16:47,968 WARNING [train.py:1204] (2/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_458-67321-0_sp1.1 from training. Duration: 0.8818125 2023-10-04 01:16:49,497 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.conv_module2.balancer2.prob, batch_count=1473786.6666666667, ans=0.125 2023-10-04 01:16:50,644 WARNING [train.py:1204] (2/4) Exclude cut with ID _41298_210_9019_1_1533430371450_4822143_188-154791-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 01:16:50,717 WARNING [train.py:1204] (2/4) Exclude cut with ID _15037_210_2253_1_1530752294104_3267650_296-192597-0_sp0.9 from training. Duration: 0.9 2023-10-04 01:16:55,017 WARNING [train.py:1204] (2/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_661-264857-0_sp1.1 from training. Duration: 0.8454375 2023-10-04 01:16:59,259 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.1.feed_forward3.hidden_balancer.prob, batch_count=1473853.3333333333, ans=0.125 2023-10-04 01:17:02,399 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.658e+02 1.958e+02 2.109e+02 2.405e+02 3.560e+02, threshold=4.218e+02, percent-clipped=0.0 2023-10-04 01:17:02,496 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_34008_210_10431_1_1533345424_8183247_662-317331-0_sp1.1 from training. Duration: 0.8545625 2023-10-04 01:17:02,527 WARNING [train.py:1204] (2/4) Exclude cut with ID _46502_210_13794_1_1533724452198_3657159_241-278329-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 01:17:02,528 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_11349_210_6753_1_1529800861_1101207_26-65602-0 from training. Duration: 0.96 2023-10-04 01:17:02,532 WARNING [train.py:1204] (2/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_448-4048-0_sp1.1 from training. Duration: 0.87275 2023-10-04 01:17:02,539 WARNING [train.py:1204] (2/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_662-275621-0_sp1.1 from training. Duration: 0.3818125 2023-10-04 01:17:03,835 WARNING [train.py:1204] (2/4) Exclude cut with ID _13938_210_9988_1_1530518718978_3693960_256-54454-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 01:17:04,009 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.conv_module2.balancer2.min_positive, batch_count=1473853.3333333333, ans=0.05 2023-10-04 01:17:06,927 WARNING [train.py:1204] (2/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_256-32662-0 from training. Duration: 0.9 2023-10-04 01:17:06,971 WARNING [train.py:1204] (2/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_326-17061-0 from training. Duration: 0.94 2023-10-04 01:17:07,019 WARNING [train.py:1204] (2/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_505-97987-0_sp1.1 from training. Duration: 0.7818125 2023-10-04 01:17:07,846 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.ff3_skip_rate, batch_count=1473920.0, ans=0.0 2023-10-04 01:17:08,914 WARNING [train.py:1204] (2/4) Exclude cut with ID _66873_210_5517_1_1535454136506_4293370_316-294364-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 01:17:08,966 WARNING [train.py:1204] (2/4) Exclude cut with ID _19514_210_9259_1_1531530071330_5084230_762-231063-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 01:17:09,015 WARNING [train.py:1204] (2/4) Exclude cut with ID _4697_210_2069_1_1526815801953_3823098_68-219807-0_sp0.9 from training. Duration: 0.52225 2023-10-04 01:17:09,265 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.conv_module2.balancer2.prob, batch_count=1473920.0, ans=0.125 2023-10-04 01:17:10,359 WARNING [train.py:1204] (2/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_479-158053-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 01:17:12,039 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.feed_forward3.hidden_balancer.prob, batch_count=1473920.0, ans=0.125 2023-10-04 01:17:13,263 WARNING [train.py:1204] (2/4) Exclude cut with ID _66112_210_10635_1_1535590236305_3801484_83-38181-0_sp1.1 from training. Duration: 0.936375 2023-10-04 01:17:13,287 WARNING [train.py:1204] (2/4) Exclude cut with ID _55857_210_2346_1_1534644007831_6898209_255-146346-0_sp1.1 from training. Duration: 0.963625 2023-10-04 01:17:15,945 WARNING [train.py:1204] (2/4) Exclude cut with ID _48836_210_12210_1_1534075848293_3904150_429-35475-0 from training. Duration: 0.89 2023-10-04 01:17:15,965 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_50154_210_13731_1_1534750862_1080961_119-187163-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 01:17:18,638 WARNING [train.py:1204] (2/4) Exclude cut with ID _8793_210_6310_1_1528542985139_2310009_121-179782-0_sp0.9 from training. Duration: 0.888875 2023-10-04 01:17:18,641 WARNING [train.py:1204] (2/4) Exclude cut with ID _14884_210_2252_1_1530707459689_5561250_591-173406-0 from training. Duration: 0.96 2023-10-04 01:17:20,224 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.bypass.skip_rate, batch_count=1473986.6666666667, ans=0.09899494936611666 2023-10-04 01:17:21,328 INFO [train.py:1046] (2/4) Epoch 42, batch 3300, loss[loss=0.1423, simple_loss=0.225, pruned_loss=0.02979, over 24468.00 frames. ], tot_loss[loss=0.1562, simple_loss=0.2357, pruned_loss=0.03836, over 4707037.30 frames. ], batch size: 58, lr: 2.42e-03, grad_scale: 16.0 2023-10-04 01:17:22,744 WARNING [train.py:1204] (2/4) Exclude cut with ID _15133_210_6228_1_1530794512074_3080379_54-198193-0_sp1.1 from training. Duration: 0.8545625 2023-10-04 01:17:22,753 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6545_210_2040_1_1527774670_4245258_407-48070-0 from training. Duration: 0.76 2023-10-04 01:17:22,891 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1032-236863-0 from training. Duration: 0.84 2023-10-04 01:17:24,278 WARNING [train.py:1204] (2/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_507-303370-0 from training. Duration: 0.96 2023-10-04 01:17:24,306 WARNING [train.py:1204] (2/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_164-282366-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 01:17:27,131 WARNING [train.py:1204] (2/4) Exclude cut with ID _39867_210_9826_1_1533553222030_9344259_312-158727-0_sp1.1 from training. Duration: 0.963625 2023-10-04 01:17:28,559 WARNING [train.py:1204] (2/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_174-205058-0_sp1.1 from training. Duration: 0.863625 2023-10-04 01:17:30,325 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_11305_210_1794_1_1529750428_7178148_166-237358-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 01:17:32,187 WARNING [train.py:1204] (2/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_26-175553-0_sp1.1 from training. Duration: 0.5545625 2023-10-04 01:17:32,212 WARNING [train.py:1204] (2/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_96-290039-0_sp1.1 from training. Duration: 0.5090625 2023-10-04 01:17:36,372 WARNING [train.py:1204] (2/4) Exclude cut with ID _53219_210_4525_1_1534834836008_4064825_261-101691-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 01:17:37,770 WARNING [train.py:1204] (2/4) Exclude cut with ID _24271_210_12658_1_1531987270022_7190519_96-293390-0_sp1.1 from training. Duration: 0.936375 2023-10-04 01:17:42,400 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_271-234962-0 from training. Duration: 0.79 2023-10-04 01:17:43,787 WARNING [train.py:1204] (2/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_136-30660-0_sp1.1 from training. Duration: 0.97275 2023-10-04 01:17:43,802 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_20120_210_6753_1_1531643035_5482859_625-281046-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 01:17:45,134 WARNING [train.py:1204] (2/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_333-168193-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 01:17:45,188 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9029W0269-509664 from training. Duration: 0.896 2023-10-04 01:17:46,581 WARNING [train.py:1204] (2/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_450-147393-0_sp1.1 from training. Duration: 0.92725 2023-10-04 01:17:46,623 WARNING [train.py:1204] (2/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_643-52007-0_sp1.1 from training. Duration: 0.5181875 2023-10-04 01:17:48,019 WARNING [train.py:1204] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_330-327861-0_sp0.9 from training. Duration: 0.7444375 2023-10-04 01:17:48,027 WARNING [train.py:1204] (2/4) Exclude cut with ID _47790_210_16187_1_1533862814066_4207519_260-317651-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 01:17:48,053 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9044W0010-516654 from training. Duration: 0.896 2023-10-04 01:17:50,944 WARNING [train.py:1204] (2/4) Exclude cut with ID _14087_210_2253_1_1530529224512_3014029_265-118378-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 01:17:50,950 WARNING [train.py:1204] (2/4) Exclude cut with ID _6194_210_1534_1_1527511826160_3941194_321-148641-0_sp0.9 from training. Duration: 0.711125 2023-10-04 01:17:53,876 WARNING [train.py:1204] (2/4) Exclude cut with ID _54871_210_22356_1_1534665798253_3856359_101-169176-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 01:17:53,879 WARNING [train.py:1204] (2/4) Exclude cut with ID 497-129325-0061-62254-0_sp1.1 from training. Duration: 0.97725 2023-10-04 01:17:55,216 WARNING [train.py:1204] (2/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_795-33252-0_sp1.1 from training. Duration: 0.336375 2023-10-04 01:17:55,244 WARNING [train.py:1204] (2/4) Exclude cut with ID _28317_210_12742_1_1532480302381_8510325_168-246176-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 01:17:56,564 WARNING [train.py:1204] (2/4) Exclude cut with ID _7160_210_1876_1_1527940923231_3703799_120-150943-0_sp1.1 from training. Duration: 0.736375 2023-10-04 01:17:57,940 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9084W0005-536844 from training. Duration: 0.768 2023-10-04 01:17:59,387 WARNING [train.py:1204] (2/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_234-260682-0 from training. Duration: 0.84 2023-10-04 01:18:00,690 WARNING [train.py:1204] (2/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_188-114464-0_sp0.9 from training. Duration: 0.911125 2023-10-04 01:18:02,802 WARNING [train.py:1204] (2/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_81-75849-0 from training. Duration: 0.39 2023-10-04 01:18:05,863 WARNING [train.py:1204] (2/4) Exclude cut with ID _14224_210_4381_1_1530529062037_4264010_379-184223-0_sp1.1 from training. Duration: 0.77275 2023-10-04 01:18:08,615 WARNING [train.py:1204] (2/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_454-123002-0_sp1.1 from training. Duration: 0.62725 2023-10-04 01:18:08,654 WARNING [train.py:1204] (2/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_887-249555-0_sp1.1 from training. Duration: 0.7 2023-10-04 01:18:13,297 WARNING [train.py:1204] (2/4) Exclude cut with ID _31088_210_16446_1_1532520012284_3440919_20-260902-0_sp1.1 from training. Duration: 0.97275 2023-10-04 01:18:13,349 WARNING [train.py:1204] (2/4) Exclude cut with ID _45329_210_7005_1_1534157951211_6774339_856-344574-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 01:18:13,354 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_36756_210_7234_1_1533026991_966276_4-164118-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 01:18:13,977 WARNING [train.py:1204] (2/4) Exclude cut with ID _8365_210_2064_1_1528592311070_4677690_371-122295-0_sp1.1 from training. Duration: 0.5 2023-10-04 01:18:16,482 WARNING [train.py:1204] (2/4) Exclude cut with ID _5910_210_1389_1_1527337972454_5050100_555-159815-0_sp1.1 from training. Duration: 0.8909375 2023-10-04 01:18:16,492 WARNING [train.py:1204] (2/4) Exclude cut with ID _37086_210_9038_1_1532999540891_7021250_469-147941-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 01:18:17,809 WARNING [train.py:1204] (2/4) Exclude cut with ID _3067_210_2667_1_1526212974640_4503168_459-173867-0_sp0.9 from training. Duration: 0.97775 2023-10-04 01:18:19,191 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9156W0006-572499 from training. Duration: 0.896 2023-10-04 01:18:20,605 WARNING [train.py:1204] (2/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_277-210798-0 from training. Duration: 0.55 2023-10-04 01:18:23,272 WARNING [train.py:1204] (2/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_103-336064-0_sp1.1 from training. Duration: 0.463625 2023-10-04 01:18:23,313 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_3414_210_1988_1_1526212718_4813195_331-290945-0_sp1.1 from training. Duration: 0.936375 2023-10-04 01:18:23,314 WARNING [train.py:1204] (2/4) Exclude cut with ID _67499_210_17282_1_1535509805220_3692430_365-29276-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 01:18:24,719 WARNING [train.py:1204] (2/4) Exclude cut with ID _14821_210_8750_1_1530680503991_6829359_69-250847-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 01:18:24,720 WARNING [train.py:1204] (2/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_1102-61301-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 01:18:26,084 WARNING [train.py:1204] (2/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_166-246557-0_sp1.1 from training. Duration: 0.5818125 2023-10-04 01:18:26,141 WARNING [train.py:1204] (2/4) Exclude cut with ID _15351_210_10626_1_1530856635653_3576210_214-129918-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 01:18:26,152 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_120-41668-0_sp0.9 from training. Duration: 0.77775 2023-10-04 01:18:27,573 WARNING [train.py:1204] (2/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_27-72278-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 01:18:29,050 WARNING [train.py:1204] (2/4) Exclude cut with ID _24314_210_4915_1_1532135078116_7170179_521-258868-0_sp1.1 from training. Duration: 0.7181875 2023-10-04 01:18:30,618 WARNING [train.py:1204] (2/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_143-86344-0 from training. Duration: 0.77 2023-10-04 01:18:31,944 WARNING [train.py:1204] (2/4) Exclude cut with ID _53489_210_6228_1_1534298357169_3835970_465-157433-0_sp1.1 from training. Duration: 0.92725 2023-10-04 01:18:32,030 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_248-319353-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 01:18:35,184 INFO [train.py:1046] (2/4) Epoch 42, batch 3350, loss[loss=0.1546, simple_loss=0.2381, pruned_loss=0.03558, over 23205.00 frames. ], tot_loss[loss=0.1564, simple_loss=0.2362, pruned_loss=0.03833, over 4718648.15 frames. ], batch size: 105, lr: 2.42e-03, grad_scale: 16.0 2023-10-04 01:18:36,547 WARNING [train.py:1204] (2/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_102-77064-0_sp1.1 from training. Duration: 0.8454375 2023-10-04 01:18:36,585 WARNING [train.py:1204] (2/4) Exclude cut with ID _9389_210_1664_1_1529217337301_5289269_379-128794-0_sp1.1 from training. Duration: 0.77275 2023-10-04 01:18:38,474 WARNING [train.py:1204] (2/4) Exclude cut with ID _3231_210_2106_1_1526205864934_6784220_236-260835-0_sp1.1 from training. Duration: 0.97275 2023-10-04 01:18:39,761 WARNING [train.py:1204] (2/4) Exclude cut with ID _24271_210_12658_1_1531987270022_7190519_184-342363-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 01:18:39,762 WARNING [train.py:1204] (2/4) Exclude cut with ID _27894_210_12392_1_1532415496078_6902439_67-221663-0_sp1.1 from training. Duration: 0.92725 2023-10-04 01:18:41,274 WARNING [train.py:1204] (2/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_304-59227-0_sp1.1 from training. Duration: 0.7 2023-10-04 01:18:42,745 WARNING [train.py:1204] (2/4) Exclude cut with ID _58605_210_12210_1_1534723195939_3904259_458-42232-0_sp1.1 from training. Duration: 0.92725 2023-10-04 01:18:44,136 WARNING [train.py:1204] (2/4) Exclude cut with ID _14922_210_6228_1_1530707435273_3693310_13-165512-0_sp0.9 from training. Duration: 0.911125 2023-10-04 01:18:47,495 WARNING [train.py:1204] (2/4) Exclude cut with ID _58602_210_2346_1_1534903210085_6908830_792-102241-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 01:18:47,623 WARNING [train.py:1204] (2/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_588-125165-0_sp0.9 from training. Duration: 0.92225 2023-10-04 01:18:50,207 WARNING [train.py:1204] (2/4) Exclude cut with ID _54218_210_12210_1_1534413646388_4409259_239-98429-0_sp1.1 from training. Duration: 0.97275 2023-10-04 01:18:50,260 WARNING [train.py:1204] (2/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_538-32291-0_sp1.1 from training. Duration: 0.7545625 2023-10-04 01:18:51,607 WARNING [train.py:1204] (2/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_419-317062-0 from training. Duration: 0.91 2023-10-04 01:18:52,975 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9309W0202-640329 from training. Duration: 0.896 2023-10-04 01:18:54,217 WARNING [train.py:1204] (2/4) Exclude cut with ID _3231_210_2106_1_1526205864934_6784220_419-313291-0_sp1.1 from training. Duration: 0.97275 2023-10-04 01:18:56,995 WARNING [train.py:1204] (2/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_619-344499-0 from training. Duration: 0.63 2023-10-04 01:18:57,012 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_278-272612-0 from training. Duration: 0.73 2023-10-04 01:18:58,430 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_103-84010-0_sp0.9 from training. Duration: 0.7666875 2023-10-04 01:18:58,620 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.conv_module1.balancer2.prob, batch_count=1474386.6666666667, ans=0.125 2023-10-04 01:18:59,705 WARNING [train.py:1204] (2/4) Exclude cut with ID _26528_210_9070_1_1532570410842_3851310_497-97218-0_sp1.1 from training. Duration: 0.7818125 2023-10-04 01:19:01,024 WARNING [train.py:1204] (2/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_262-84613-0_sp1.1 from training. Duration: 0.936375 2023-10-04 01:19:01,064 WARNING [train.py:1204] (2/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_387-131538-0 from training. Duration: 0.81 2023-10-04 01:19:01,094 WARNING [train.py:1204] (2/4) Exclude cut with ID _8030_210_7497_1_1528444886781_3531089_224-300489-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 01:19:01,117 WARNING [train.py:1204] (2/4) Exclude cut with ID _14349_210_8341_1_1530615710818_3442470_170-336541-0_sp0.9 from training. Duration: 0.9555625 2023-10-04 01:19:03,133 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_47069_210_15368_1_1533781267_4431320_180-31746-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 01:19:04,597 WARNING [train.py:1204] (2/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_447-7041-0_sp1.1 from training. Duration: 0.92725 2023-10-04 01:19:04,623 WARNING [train.py:1204] (2/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_2-55283-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 01:19:05,952 WARNING [train.py:1204] (2/4) Exclude cut with ID _12555_210_8750_1_1530075594402_3780239_70-181911-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 01:19:09,239 WARNING [train.py:1204] (2/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_311-328022-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 01:19:09,476 INFO [scaling.py:1118] (2/4) WithLoss: name=encoder.encoders.2.encoder.layers.2.self_attn_weights, loss-sum=0.000e+00 2023-10-04 01:19:10,666 WARNING [train.py:1204] (2/4) Exclude cut with ID _14528_210_8254_1_1530669700514_4306939_330-2987-0_sp1.1 from training. Duration: 0.92725 2023-10-04 01:19:10,715 WARNING [train.py:1204] (2/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_385-184858-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 01:19:14,741 WARNING [train.py:1204] (2/4) Exclude cut with ID _64414_210_17301_1_1535243252048_3757759_348-333360-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 01:19:16,855 WARNING [train.py:1204] (2/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_753-214751-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 01:19:19,497 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_17613_210_2151_1_1531137569_2366160_202-208013-0_sp1.1 from training. Duration: 0.92725 2023-10-04 01:19:19,513 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_43831_210_2158_1_1533725502_5872791_610-129300-0_sp1.1 from training. Duration: 0.936375 2023-10-04 01:19:21,003 WARNING [train.py:1204] (2/4) Exclude cut with ID _54218_210_12210_1_1534413646388_4409259_321-23852-0_sp1.1 from training. Duration: 0.936375 2023-10-04 01:19:23,776 WARNING [train.py:1204] (2/4) Exclude cut with ID _39411_210_5986_1_1533538825030_3343649_173-238248-0 from training. Duration: 0.83 2023-10-04 01:19:23,784 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_405-339986-0_sp0.9 from training. Duration: 0.5666875 2023-10-04 01:19:23,811 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_2009_210_2071_1_1525481134_4588937_385-302100-0 from training. Duration: 0.87 2023-10-04 01:19:25,097 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_161-276996-0_sp1.1 from training. Duration: 0.863625 2023-10-04 01:19:25,197 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_177-58005-0 from training. Duration: 0.62 2023-10-04 01:19:26,605 WARNING [train.py:1204] (2/4) Exclude cut with ID _64639_210_5816_1_1535367619437_3528000_146-343734-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 01:19:28,055 WARNING [train.py:1204] (2/4) Exclude cut with ID _3845_210_1390_1_1526731326141_3523610_72-61441-0_sp1.1 from training. Duration: 0.92725 2023-10-04 01:19:29,256 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.600e+02 1.949e+02 2.124e+02 2.464e+02 3.729e+02, threshold=4.249e+02, percent-clipped=0.0 2023-10-04 01:19:35,301 WARNING [train.py:1204] (2/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_991-125028-0_sp1.1 from training. Duration: 0.936375 2023-10-04 01:19:35,368 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7544_210_6753_1_1528591327_1471426_172-348777-0 from training. Duration: 0.66 2023-10-04 01:19:36,747 WARNING [train.py:1204] (2/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_416-257730-0_sp1.1 from training. Duration: 0.5909375 2023-10-04 01:19:38,695 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1113-7369-0_sp1.1 from training. Duration: 0.77275 2023-10-04 01:19:38,777 WARNING [train.py:1204] (2/4) Exclude cut with ID _40402_210_18219_1_1533522324115_2953150_242-76943-0_sp1.1 from training. Duration: 0.8545625 2023-10-04 01:19:41,682 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.conv_module1.balancer1.prob, batch_count=1474586.6666666667, ans=0.125 2023-10-04 01:19:44,369 WARNING [train.py:1204] (2/4) Exclude cut with ID _44718_210_5816_1_1534050051869_6469030_190-49648-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 01:19:45,806 WARNING [train.py:1204] (2/4) Exclude cut with ID _47251_210_18628_1_1533814063128_4137250_214-88076-0 from training. Duration: 0.88 2023-10-04 01:19:45,845 WARNING [train.py:1204] (2/4) Exclude cut with ID _5585_210_2667_1_1527418813324_8007340_89-332072-0_sp1.1 from training. Duration: 0.5818125 2023-10-04 01:19:47,197 WARNING [train.py:1204] (2/4) Exclude cut with ID _41817_210_18835_1_1533434477160_7423049_555-293448-0_sp1.1 from training. Duration: 0.72725 2023-10-04 01:19:49,100 INFO [train.py:1046] (2/4) Epoch 42, batch 3400, loss[loss=0.1488, simple_loss=0.2254, pruned_loss=0.03604, over 23203.00 frames. ], tot_loss[loss=0.1566, simple_loss=0.2366, pruned_loss=0.0383, over 4719164.33 frames. ], batch size: 105, lr: 2.42e-03, grad_scale: 16.0 2023-10-04 01:19:49,202 WARNING [train.py:1204] (2/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_286-331314-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 01:19:50,916 WARNING [train.py:1204] (2/4) Exclude cut with ID _13921_210_9994_1_1530838702800_3003429_46-149444-0 from training. Duration: 0.94 2023-10-04 01:19:50,977 WARNING [train.py:1204] (2/4) Exclude cut with ID _44778_210_2054_1_1533798127203_7443090_768-43595-0_sp1.1 from training. Duration: 0.936375 2023-10-04 01:19:50,986 WARNING [train.py:1204] (2/4) Exclude cut with ID _35355_210_16040_1_1533212988977_4016229_74-190076-0 from training. Duration: 0.8 2023-10-04 01:19:52,607 WARNING [train.py:1204] (2/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_1007-316380-0_sp1.1 from training. Duration: 0.963625 2023-10-04 01:19:53,877 WARNING [train.py:1204] (2/4) Exclude cut with ID _23557_210_8750_1_1531890106370_6846255_150-283437-0_sp1.1 from training. Duration: 0.963625 2023-10-04 01:19:55,179 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_3474_210_2028_1_1526188513_4825246_145-309131-0_sp1.1 from training. Duration: 0.67275 2023-10-04 01:19:56,533 WARNING [train.py:1204] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_391-22148-0_sp1.1 from training. Duration: 0.7 2023-10-04 01:19:56,559 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_238-256668-0 from training. Duration: 0.69 2023-10-04 01:19:57,918 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder_embed.conv.8.prob, batch_count=1474653.3333333333, ans=0.125 2023-10-04 01:19:59,487 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.1.balancer2.prob, batch_count=1474653.3333333333, ans=0.125 2023-10-04 01:20:00,735 WARNING [train.py:1204] (2/4) Exclude cut with ID _8394_210_6702_1_1528590504316_3540189_372-198878-0 from training. Duration: 0.6 2023-10-04 01:20:00,748 WARNING [train.py:1204] (2/4) Exclude cut with ID ID0174W0006-766659 from training. Duration: 0.953 2023-10-04 01:20:02,091 WARNING [train.py:1204] (2/4) Exclude cut with ID _6274_210_2084_1_1527937583837_7494020_668-23019-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 01:20:06,824 WARNING [train.py:1204] (2/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_322-74328-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 01:20:06,830 WARNING [train.py:1204] (2/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_872-264525-0_sp1.1 from training. Duration: 0.5909375 2023-10-04 01:20:06,883 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_53378_210_17282_1_1534221071_5769509_641-286201-0_sp1.1 from training. Duration: 0.97275 2023-10-04 01:20:06,969 WARNING [train.py:1204] (2/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_199-295611-0_sp1.1 from training. Duration: 0.5 2023-10-04 01:20:11,538 WARNING [train.py:1204] (2/4) Exclude cut with ID _5250_210_5219_1_1527249561806_8692110_1270-317971-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 01:20:11,749 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.bypass_mid.scale_min, batch_count=1474720.0, ans=0.2 2023-10-04 01:20:14,087 WARNING [train.py:1204] (2/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_784-213099-0 from training. Duration: 0.93 2023-10-04 01:20:16,261 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.0.layers.1.conv_module2.whiten, num_groups=1, num_channels=192, metric=11.23 vs. limit=15.0 2023-10-04 01:20:17,054 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_115-233929-0_sp0.9 from training. Duration: 0.8 2023-10-04 01:20:17,231 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.1.conv_module2.balancer1.prob, batch_count=1474786.6666666667, ans=0.125 2023-10-04 01:20:19,057 WARNING [train.py:1204] (2/4) Exclude cut with ID _54871_210_22356_1_1534665798253_3856359_256-223809-0_sp1.1 from training. Duration: 0.97275 2023-10-04 01:20:20,343 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_20120_210_6753_1_1531643035_5482859_285-87781-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 01:20:20,601 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.feed_forward2.hidden_balancer.prob, batch_count=1474786.6666666667, ans=0.125 2023-10-04 01:20:22,204 WARNING [train.py:1204] (2/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_619-344499-0_sp1.1 from training. Duration: 0.57275 2023-10-04 01:20:26,492 WARNING [train.py:1204] (2/4) Exclude cut with ID _60229_210_27767_1_1534838406158_3655350_264-279573-0_sp0.9 from training. Duration: 0.92225 2023-10-04 01:20:29,379 WARNING [train.py:1204] (2/4) Exclude cut with ID _7719_210_3525_1_1528456153163_4296960_333-110399-0 from training. Duration: 0.96 2023-10-04 01:20:36,150 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.1.encoder.layers.1.conv_module1.whiten, num_groups=1, num_channels=256, metric=9.83 vs. limit=15.0 2023-10-04 01:20:36,713 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_50423_210_13270_1_1534128598_5021734_759-90833-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 01:20:38,063 WARNING [train.py:1204] (2/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_151-254798-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 01:20:38,107 WARNING [train.py:1204] (2/4) Exclude cut with ID _3465_210_3525_1_1526175667060_3574049_450-203637-0 from training. Duration: 0.92 2023-10-04 01:20:39,403 WARNING [train.py:1204] (2/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_673-36456-0_sp1.1 from training. Duration: 0.936375 2023-10-04 01:20:39,440 WARNING [train.py:1204] (2/4) Exclude cut with ID _36538_210_9994_1_1533204020363_3795620_131-8685-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 01:20:39,489 WARNING [train.py:1204] (2/4) Exclude cut with ID _12556_210_8341_1_1530081024100_7327690_713-55626-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 01:20:40,829 WARNING [train.py:1204] (2/4) Exclude cut with ID _2322_210_2667_1_1525604548971_3656299_440-80494-0_sp0.9 from training. Duration: 0.97775 2023-10-04 01:20:44,175 WARNING [train.py:1204] (2/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_210-325081-0_sp1.1 from training. Duration: 0.97275 2023-10-04 01:20:47,011 WARNING [train.py:1204] (2/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_375-244976-0_sp0.9 from training. Duration: 0.8333125 2023-10-04 01:20:47,017 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_15517_210_6753_1_1530952293_1664795_119-31001-0_sp1.1 from training. Duration: 0.8545625 2023-10-04 01:20:50,506 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_41452_210_7291_1_1533437687_3941281_319-42326-0_sp1.1 from training. Duration: 0.92725 2023-10-04 01:20:53,695 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_372-173012-0 from training. Duration: 0.95 2023-10-04 01:21:00,501 WARNING [train.py:1204] (2/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_41-31157-0_sp1.1 from training. Duration: 0.5181875 2023-10-04 01:21:01,539 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.0.layers.0.whiten, num_groups=1, num_channels=192, metric=4.30 vs. limit=12.0 2023-10-04 01:21:03,411 INFO [train.py:1046] (2/4) Epoch 42, batch 3450, loss[loss=0.1493, simple_loss=0.2161, pruned_loss=0.04126, over 23601.00 frames. ], tot_loss[loss=0.1568, simple_loss=0.2369, pruned_loss=0.03837, over 4712602.02 frames. ], batch size: 256, lr: 2.42e-03, grad_scale: 16.0 2023-10-04 01:21:03,561 WARNING [train.py:1204] (2/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_91-212486-0 from training. Duration: 0.68 2023-10-04 01:21:05,957 INFO [scaling.py:1022] (2/4) Whitening: name=encoder_embed.out_whiten, num_groups=1, num_channels=192, metric=7.13 vs. limit=8.0 2023-10-04 01:21:08,210 WARNING [train.py:1204] (2/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_1267-272693-0 from training. Duration: 0.73 2023-10-04 01:21:08,243 WARNING [train.py:1204] (2/4) Exclude cut with ID _13938_210_9988_1_1530518718978_3693960_152-67046-0_sp1.1 from training. Duration: 0.936375 2023-10-04 01:21:09,639 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_4528_210_3273_1_1527076654_3276777_342-168068-0_sp1.1 from training. Duration: 0.7909375 2023-10-04 01:21:09,653 WARNING [train.py:1204] (2/4) Exclude cut with ID _7606_210_4915_1_1528894253277_5080690_127-177761-0 from training. Duration: 0.64 2023-10-04 01:21:10,907 WARNING [train.py:1204] (2/4) Exclude cut with ID _45329_210_7005_1_1534157951211_6774339_335-137972-0_sp1.1 from training. Duration: 0.92725 2023-10-04 01:21:13,669 WARNING [train.py:1204] (2/4) Exclude cut with ID _5166_210_2084_1_1527073197160_4087993_331-117829-0_sp1.1 from training. Duration: 0.563625 2023-10-04 01:21:18,679 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.0.ff3_skip_rate, batch_count=1475053.3333333333, ans=0.0 2023-10-04 01:21:20,241 WARNING [train.py:1204] (2/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_125-125818-0_sp1.1 from training. Duration: 0.87275 2023-10-04 01:21:20,300 WARNING [train.py:1204] (2/4) Exclude cut with ID _6789_210_1534_1_1527854471671_2870519_48-294897-0_sp1.1 from training. Duration: 0.963625 2023-10-04 01:21:22,188 WARNING [train.py:1204] (2/4) Exclude cut with ID _6694_210_4599_1_1527852637490_3965529_116-109398-0_sp1.1 from training. Duration: 0.663625 2023-10-04 01:21:22,198 WARNING [train.py:1204] (2/4) Exclude cut with ID _52892_210_6721_1_1534378941259_4124129_222-172685-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 01:21:23,599 WARNING [train.py:1204] (2/4) Exclude cut with ID _50655_210_18628_1_1534229942193_3772154_320-132275-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 01:21:26,457 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.0.balancer2.prob, batch_count=1475053.3333333333, ans=0.125 2023-10-04 01:21:29,120 WARNING [train.py:1204] (2/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_76-311963-0 from training. Duration: 0.7 2023-10-04 01:21:33,380 WARNING [train.py:1204] (2/4) Exclude cut with ID _6274_210_2084_1_1527937583837_7494020_32-205288-0 from training. Duration: 0.78 2023-10-04 01:21:33,401 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_86-16881-0_sp0.9 from training. Duration: 0.6444375 2023-10-04 01:21:33,584 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.conv_skip_rate, batch_count=1475120.0, ans=0.0 2023-10-04 01:21:35,097 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_271-297505-0_sp1.1 from training. Duration: 0.9 2023-10-04 01:21:36,490 WARNING [train.py:1204] (2/4) Exclude cut with ID _57572_210_5517_1_1534921219624_3809484_187-295293-0_sp1.1 from training. Duration: 0.963625 2023-10-04 01:21:40,899 WARNING [train.py:1204] (2/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_563-329943-0 from training. Duration: 0.99 2023-10-04 01:21:42,292 WARNING [train.py:1204] (2/4) Exclude cut with ID _7160_210_1876_1_1527940923231_3703799_302-150728-0_sp0.9 from training. Duration: 0.8666875 2023-10-04 01:21:46,909 WARNING [train.py:1204] (2/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_926-7526-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 01:21:46,937 WARNING [train.py:1204] (2/4) Exclude cut with ID _62574_210_2655_1_1535180407058_3933249_467-22348-0_sp1.1 from training. Duration: 0.936375 2023-10-04 01:21:49,568 WARNING [train.py:1204] (2/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_280-161633-0_sp0.9 from training. Duration: 0.811125 2023-10-04 01:21:51,597 WARNING [train.py:1204] (2/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_385-193052-0_sp1.1 from training. Duration: 0.7818125 2023-10-04 01:21:51,727 WARNING [train.py:1204] (2/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_444-28041-0 from training. Duration: 0.85 2023-10-04 01:21:51,732 WARNING [train.py:1204] (2/4) Exclude cut with ID _66070_210_7640_1_1535441393096_7805310_341-249237-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 01:21:53,141 WARNING [train.py:1204] (2/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_425-269826-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 01:21:56,015 WARNING [train.py:1204] (2/4) Exclude cut with ID _53950_210_5816_1_1534417264919_3862951_124-185597-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 01:21:58,676 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.601e+02 1.992e+02 2.174e+02 2.512e+02 3.685e+02, threshold=4.348e+02, percent-clipped=0.0 2023-10-04 01:21:58,748 WARNING [train.py:1204] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_380-204756-0 from training. Duration: 0.86 2023-10-04 01:22:00,331 WARNING [train.py:1204] (2/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_282-257791-0_sp0.9 from training. Duration: 0.9555625 2023-10-04 01:22:04,618 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.conv_skip_rate, batch_count=1475253.3333333333, ans=0.0 2023-10-04 01:22:06,200 WARNING [train.py:1204] (2/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_372-131163-0_sp1.1 from training. Duration: 0.8545625 2023-10-04 01:22:07,532 WARNING [train.py:1204] (2/4) Exclude cut with ID _28317_210_12742_1_1532480302381_8510325_447-233513-0_sp1.1 from training. Duration: 0.963625 2023-10-04 01:22:09,031 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_54-118333-0_sp1.1 from training. Duration: 0.97275 2023-10-04 01:22:13,720 WARNING [train.py:1204] (2/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_388-230239-0_sp1.1 from training. Duration: 0.963625 2023-10-04 01:22:13,737 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_40047_210_2290_1_1533290519_2374311_141-54639-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 01:22:15,141 WARNING [train.py:1204] (2/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_91-267696-0_sp0.9 from training. Duration: 0.9666875 2023-10-04 01:22:16,475 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_14056_210_4163_1_1530604483_7572827_532-137847-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 01:22:17,646 INFO [train.py:1046] (2/4) Epoch 42, batch 3500, loss[loss=0.1525, simple_loss=0.2435, pruned_loss=0.03078, over 24302.00 frames. ], tot_loss[loss=0.1553, simple_loss=0.235, pruned_loss=0.03779, over 4718244.72 frames. ], batch size: 74, lr: 2.42e-03, grad_scale: 16.0 2023-10-04 01:22:21,306 WARNING [train.py:1204] (2/4) Exclude cut with ID _8064_210_5399_1_1528372299291_4282973_155-283231-0_sp1.1 from training. Duration: 0.97275 2023-10-04 01:22:22,808 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_2245_210_2715_1_1525511548_3540254_310-52483-0_sp1.1 from training. Duration: 0.8 2023-10-04 01:22:23,036 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.conv_module1.balancer1.prob, batch_count=1475320.0, ans=0.125 2023-10-04 01:22:24,089 WARNING [train.py:1204] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_185-174805-0 from training. Duration: 0.68 2023-10-04 01:22:26,822 WARNING [train.py:1204] (2/4) Exclude cut with ID _15012_210_1968_1_1530856820429_5095350_662-210469-0_sp1.1 from training. Duration: 0.5181875 2023-10-04 01:22:28,322 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_426-313557-0_sp0.9 from training. Duration: 0.588875 2023-10-04 01:22:29,708 WARNING [train.py:1204] (2/4) Exclude cut with ID _42326_210_7770_1_1533814282531_3707269_407-204343-0_sp1.1 from training. Duration: 0.97275 2023-10-04 01:22:29,727 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_258-310570-0 from training. Duration: 0.82 2023-10-04 01:22:33,944 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_4737_210_2040_1_1526998689_3293008_378-178650-0_sp1.1 from training. Duration: 0.863625 2023-10-04 01:22:35,744 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_47896_210_20508_1_1534492518_5958868_432-327451-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 01:22:37,046 WARNING [train.py:1204] (2/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_188-114464-0_sp1.1 from training. Duration: 0.7454375 2023-10-04 01:22:37,053 WARNING [train.py:1204] (2/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_798-106179-0_sp1.1 from training. Duration: 0.7545625 2023-10-04 01:22:37,087 WARNING [train.py:1204] (2/4) Exclude cut with ID _14368_210_4220_1_1530532714870_3595529_143-117421-0_sp1.1 from training. Duration: 0.52725 2023-10-04 01:22:38,444 WARNING [train.py:1204] (2/4) Exclude cut with ID _67499_210_17282_1_1535509805220_3692430_361-78786-0_sp1.1 from training. Duration: 0.963625 2023-10-04 01:22:38,475 WARNING [train.py:1204] (2/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_712-342563-0_sp1.1 from training. Duration: 0.8818125 2023-10-04 01:22:38,532 WARNING [train.py:1204] (2/4) Exclude cut with ID _9235_210_5219_1_1528891154253_8046120_387-159008-0 from training. Duration: 0.86 2023-10-04 01:22:39,980 WARNING [train.py:1204] (2/4) Exclude cut with ID _33640_210_2655_1_1533117605479_4855469_959-18820-0_sp1.1 from training. Duration: 0.963625 2023-10-04 01:22:41,280 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_12868_210_6753_1_1530702479_3610564_133-564-0_sp0.9 from training. Duration: 0.87775 2023-10-04 01:22:44,420 WARNING [train.py:1204] (2/4) Exclude cut with ID _23514_210_1389_1_1531832139785_3031439_12-29287-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 01:22:44,744 INFO [scaling.py:1118] (2/4) WithLoss: name=encoder.encoders.5.encoder.layers.0.self_attn_weights, loss-sum=5.278e-03 2023-10-04 01:22:46,078 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.ff2_skip_rate, batch_count=1475453.3333333333, ans=0.0 2023-10-04 01:22:48,993 WARNING [train.py:1204] (2/4) Exclude cut with ID _23514_210_1389_1_1531832139785_3031439_489-185052-0_sp1.1 from training. Duration: 0.963625 2023-10-04 01:22:49,063 WARNING [train.py:1204] (2/4) Exclude cut with ID _5444_210_2151_1_1527418808464_5312210_634-194174-0 from training. Duration: 0.97 2023-10-04 01:22:49,080 WARNING [train.py:1204] (2/4) Exclude cut with ID _6275_210_2084_1_1528110062213_7669049_91-26919-0_sp1.1 from training. Duration: 0.8818125 2023-10-04 01:22:52,449 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_37351_210_7925_1_1533012931_3812113_516-82303-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 01:22:53,841 WARNING [train.py:1204] (2/4) Exclude cut with ID _41314_210_12651_1_1533459476543_3766460_80-74184-0_sp0.9 from training. Duration: 0.92225 2023-10-04 01:22:55,215 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_9460_210_4211_1_1528954587_10049667_495-202963-0_sp1.1 from training. Duration: 0.963625 2023-10-04 01:22:56,600 WARNING [train.py:1204] (2/4) Exclude cut with ID _14632_210_9988_1_1530691279392_3923200_332-105904-0_sp1.1 from training. Duration: 0.8181875 2023-10-04 01:22:56,623 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_40764_210_1925_1_1533882359_3752345_493-167886-0_sp1.1 from training. Duration: 0.92725 2023-10-04 01:22:59,331 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_196-289637-0 from training. Duration: 0.75 2023-10-04 01:23:00,596 WARNING [train.py:1204] (2/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_26-175553-0 from training. Duration: 0.61 2023-10-04 01:23:00,634 WARNING [train.py:1204] (2/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_890-5117-0 from training. Duration: 0.78 2023-10-04 01:23:00,693 WARNING [train.py:1204] (2/4) Exclude cut with ID _41314_210_12651_1_1533459476543_3766460_206-306860-0_sp1.1 from training. Duration: 0.92725 2023-10-04 01:23:02,282 WARNING [train.py:1204] (2/4) Exclude cut with ID _25882_210_3036_1_1532775665815_6479510_570-79824-0_sp1.1 from training. Duration: 0.963625 2023-10-04 01:23:02,345 WARNING [train.py:1204] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_731-2428-0_sp1.1 from training. Duration: 0.7545625 2023-10-04 01:23:02,517 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.self_attn_weights.pos_emb_skip_rate, batch_count=1475520.0, ans=0.0 2023-10-04 01:23:03,670 WARNING [train.py:1204] (2/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_138-154812-0_sp0.9 from training. Duration: 0.8666875 2023-10-04 01:23:05,616 WARNING [train.py:1204] (2/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_178-39722-0_sp0.9 from training. Duration: 0.5444375 2023-10-04 01:23:06,965 WARNING [train.py:1204] (2/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_1285-256622-0_sp0.9 from training. Duration: 0.9333125 2023-10-04 01:23:11,046 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_41428_210_2161_1_1533774184_3992532_65-271323-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 01:23:12,399 WARNING [train.py:1204] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_308-161067-0 from training. Duration: 0.66 2023-10-04 01:23:12,412 WARNING [train.py:1204] (2/4) Exclude cut with ID _3067_210_2667_1_1526212974640_4503168_459-173867-0 from training. Duration: 0.88 2023-10-04 01:23:12,413 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_2221_210_1988_1_1525597051_4935541_415-296882-0_sp1.1 from training. Duration: 0.7 2023-10-04 01:23:14,467 WARNING [train.py:1204] (2/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_354-192266-0_sp1.1 from training. Duration: 0.936375 2023-10-04 01:23:15,759 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_258-310570-0_sp0.9 from training. Duration: 0.911125 2023-10-04 01:23:16,052 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.conv_module1.balancer1.prob, batch_count=1475586.6666666667, ans=0.125 2023-10-04 01:23:17,604 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_548-141039-0_sp1.1 from training. Duration: 0.963625 2023-10-04 01:23:19,688 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_15101_210_7927_1_1530786415_5033274_320-327527-0 from training. Duration: 0.96 2023-10-04 01:23:21,014 WARNING [train.py:1204] (2/4) Exclude cut with ID _2895_210_2477_1_1525956785794_4063750_289-11475-0_sp0.9 from training. Duration: 0.911125 2023-10-04 01:23:22,568 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.self_attn_weights.pos_emb_skip_rate, batch_count=1475586.6666666667, ans=0.0 2023-10-04 01:23:23,665 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7543_210_6753_1_1528543318_4552_1-85072-0_sp1.1 from training. Duration: 0.7545625 2023-10-04 01:23:24,993 WARNING [train.py:1204] (2/4) Exclude cut with ID _64639_210_5816_1_1535367619437_3528000_461-329156-0 from training. Duration: 0.96 2023-10-04 01:23:26,400 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_211-339622-0 from training. Duration: 0.45 2023-10-04 01:23:27,918 WARNING [train.py:1204] (2/4) Exclude cut with ID _20455_210_5219_1_1531565996775_8098100_402-112739-0_sp1.1 from training. Duration: 0.963625 2023-10-04 01:23:29,283 WARNING [train.py:1204] (2/4) Exclude cut with ID _53489_210_6228_1_1534298357169_3835970_256-115565-0_sp1.1 from training. Duration: 0.936375 2023-10-04 01:23:29,302 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_26326_210_7925_1_1533002493_3592406_360-305260-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 01:23:29,326 WARNING [train.py:1204] (2/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_18-204383-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 01:23:31,991 INFO [train.py:1046] (2/4) Epoch 42, batch 3550, loss[loss=0.1523, simple_loss=0.2271, pruned_loss=0.03878, over 23756.00 frames. ], tot_loss[loss=0.1543, simple_loss=0.2342, pruned_loss=0.03724, over 4716709.61 frames. ], batch size: 179, lr: 2.42e-03, grad_scale: 16.0 2023-10-04 01:23:34,027 WARNING [train.py:1204] (2/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_306-232480-0_sp1.1 from training. Duration: 0.87275 2023-10-04 01:23:38,700 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.out_combiner.scale_min, batch_count=1475653.3333333333, ans=0.2 2023-10-04 01:23:39,797 WARNING [train.py:1204] (2/4) Exclude cut with ID _41161_210_20034_1_1533431249657_3555314_116-299186-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 01:23:41,185 WARNING [train.py:1204] (2/4) Exclude cut with ID _13913_210_9994_1_1530579615187_3383897_473-277682-0_sp1.1 from training. Duration: 0.3545625 2023-10-04 01:23:44,127 WARNING [train.py:1204] (2/4) Exclude cut with ID _58895_210_8750_1_1535184081320_3649089_49-225702-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 01:23:45,809 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_3080_210_2071_1_1526086102_4365166_312-8726-0_sp1.1 from training. Duration: 0.82725 2023-10-04 01:23:45,927 WARNING [train.py:1204] (2/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_489-190175-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 01:23:48,391 WARNING [train.py:1204] (2/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_345-95717-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 01:23:48,405 WARNING [train.py:1204] (2/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_85-138257-0_sp1.1 from training. Duration: 0.6454375 2023-10-04 01:23:52,413 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7046_210_6753_1_1527895337_6920917_783-278109-0_sp1.1 from training. Duration: 0.7 2023-10-04 01:23:52,442 WARNING [train.py:1204] (2/4) Exclude cut with ID _3465_210_3525_1_1526175667060_3574049_450-203637-0_sp1.1 from training. Duration: 0.836375 2023-10-04 01:23:53,712 WARNING [train.py:1204] (2/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_416-132893-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 01:23:53,728 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_2655_210_2087_1_1525688959_3897400_437-337690-0_sp0.9 from training. Duration: 0.7 2023-10-04 01:23:53,952 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.conv_module2.balancer1.min_positive, batch_count=1475720.0, ans=0.025 2023-10-04 01:23:55,123 WARNING [train.py:1204] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_700-183549-0_sp1.1 from training. Duration: 0.6181875 2023-10-04 01:24:01,282 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.3.encoder.layers.3.conv_module2.whiten, num_groups=1, num_channels=512, metric=5.50 vs. limit=15.0 2023-10-04 01:24:02,041 WARNING [train.py:1204] (2/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_191-59784-0_sp1.1 from training. Duration: 0.8 2023-10-04 01:24:02,061 WARNING [train.py:1204] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_218-295097-0_sp1.1 from training. Duration: 0.7 2023-10-04 01:24:04,761 WARNING [train.py:1204] (2/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_310-109461-0_sp1.1 from training. Duration: 0.72725 2023-10-04 01:24:04,766 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_33348_210_5632_1_1533086912_3324030_131-321149-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 01:24:04,810 WARNING [train.py:1204] (2/4) Exclude cut with ID _64135_210_2253_1_1535677267876_7608972_133-188266-0_sp1.1 from training. Duration: 0.736375 2023-10-04 01:24:04,847 WARNING [train.py:1204] (2/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_505-205270-0 from training. Duration: 0.38 2023-10-04 01:24:04,859 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_67223_210_4238_1_1535612120_2537431_102-88248-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 01:24:05,131 INFO [scaling.py:1118] (2/4) WithLoss: name=encoder.encoders.3.encoder.layers.2.self_attn_weights, loss-sum=0.000e+00 2023-10-04 01:24:08,033 WARNING [train.py:1204] (2/4) Exclude cut with ID _54871_210_22356_1_1534665798253_3856359_60-235433-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 01:24:08,126 WARNING [train.py:1204] (2/4) Exclude cut with ID _5189_210_4557_1_1527248319428_3769229_199-53526-0_sp1.1 from training. Duration: 0.3818125 2023-10-04 01:24:13,774 WARNING [train.py:1204] (2/4) Exclude cut with ID _45329_210_7005_1_1534157951211_6774339_439-90979-0_sp1.1 from training. Duration: 0.963625 2023-10-04 01:24:15,083 WARNING [train.py:1204] (2/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_1162-132880-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 01:24:15,149 WARNING [train.py:1204] (2/4) Exclude cut with ID _56423_210_5854_1_1534896061049_3165969_220-22317-0_sp1.1 from training. Duration: 0.963625 2023-10-04 01:24:16,487 WARNING [train.py:1204] (2/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_909-64151-0 from training. Duration: 0.94 2023-10-04 01:24:17,783 WARNING [train.py:1204] (2/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_465-156500-0_sp0.9 from training. Duration: 0.8 2023-10-04 01:24:19,702 WARNING [train.py:1204] (2/4) Exclude cut with ID _44304_210_15374_1_1533639352765_4613129_302-22084-0 from training. Duration: 0.86 2023-10-04 01:24:21,020 WARNING [train.py:1204] (2/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_663-166973-0_sp1.1 from training. Duration: 0.72725 2023-10-04 01:24:23,090 WARNING [train.py:1204] (2/4) Exclude cut with ID _45329_210_7005_1_1534157951211_6774339_503-287883-0_sp0.9 from training. Duration: 0.97775 2023-10-04 01:24:23,112 WARNING [train.py:1204] (2/4) Exclude cut with ID _60057_210_10640_1_1534903065793_3333150_362-230377-0_sp0.9 from training. Duration: 0.9666875 2023-10-04 01:24:25,902 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_4183_210_2087_1_1526729100_1869838_143-280962-0 from training. Duration: 0.56 2023-10-04 01:24:27,020 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.688e+02 1.930e+02 2.101e+02 2.469e+02 3.261e+02, threshold=4.203e+02, percent-clipped=0.0 2023-10-04 01:24:27,149 WARNING [train.py:1204] (2/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_281-1313-0_sp1.1 from training. Duration: 0.936375 2023-10-04 01:24:27,382 INFO [scaling.py:1118] (2/4) WithLoss: name=encoder.encoders.2.encoder.layers.1.self_attn_weights, loss-sum=0.000e+00 2023-10-04 01:24:32,626 WARNING [train.py:1204] (2/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_64-215118-0_sp1.1 from training. Duration: 0.936375 2023-10-04 01:24:32,691 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_2822_210_2715_1_1526112586_1999587_165-36656-0 from training. Duration: 0.89 2023-10-04 01:24:32,746 WARNING [train.py:1204] (2/4) Exclude cut with ID _22961_210_8254_1_1531976366672_4278180_402-208830-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 01:24:38,941 WARNING [train.py:1204] (2/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_98-193494-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 01:24:40,298 WARNING [train.py:1204] (2/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_383-285196-0 from training. Duration: 0.97 2023-10-04 01:24:45,547 INFO [train.py:1046] (2/4) Epoch 42, batch 3600, loss[loss=0.169, simple_loss=0.2397, pruned_loss=0.04917, over 23882.00 frames. ], tot_loss[loss=0.1545, simple_loss=0.2344, pruned_loss=0.0373, over 4722732.38 frames. ], batch size: 195, lr: 2.41e-03, grad_scale: 32.0 2023-10-04 01:24:47,109 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6767_210_3273_1_1527855783_1718932_142-127132-0 from training. Duration: 0.95 2023-10-04 01:24:47,149 WARNING [train.py:1204] (2/4) Exclude cut with ID _39867_210_9826_1_1533553222030_9344259_1018-339635-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 01:24:48,783 WARNING [train.py:1204] (2/4) Exclude cut with ID _14603_210_4220_1_1530615688441_3097319_274-274798-0_sp1.1 from training. Duration: 0.8909375 2023-10-04 01:24:50,204 WARNING [train.py:1204] (2/4) Exclude cut with ID _2334_210_2667_1_1525608238880_4705129_376-263393-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 01:24:50,250 WARNING [train.py:1204] (2/4) Exclude cut with ID _29303_210_2575_1_1532392237303_7410649_363-344675-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 01:24:52,176 WARNING [train.py:1204] (2/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_58-68956-0_sp1.1 from training. Duration: 0.7545625 2023-10-04 01:24:55,563 WARNING [train.py:1204] (2/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_709-311571-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 01:24:58,107 WARNING [train.py:1204] (2/4) Exclude cut with ID _46067_210_4220_1_1534125571066_3307450_136-257407-0_sp1.1 from training. Duration: 0.963625 2023-10-04 01:24:59,451 WARNING [train.py:1204] (2/4) Exclude cut with ID _6493_210_1747_1_1528459210424_4012470_324-323854-0_sp1.1 from training. Duration: 0.836375 2023-10-04 01:25:00,794 WARNING [train.py:1204] (2/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_234-260682-0_sp1.1 from training. Duration: 0.763625 2023-10-04 01:25:00,869 WARNING [train.py:1204] (2/4) Exclude cut with ID _36634_210_1794_1_1533296352450_1399884_55-119451-0_sp1.1 from training. Duration: 0.963625 2023-10-04 01:25:00,872 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_40047_210_2290_1_1533290519_2374311_195-165693-0 from training. Duration: 0.89 2023-10-04 01:25:03,705 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_5522_210_3273_1_1527249606_3909096_366-172508-0_sp0.9 from training. Duration: 0.7333125 2023-10-04 01:25:05,055 WARNING [train.py:1204] (2/4) Exclude cut with ID _21878_210_3525_1_1531738772002_7264330_187-10504-0_sp1.1 from training. Duration: 0.963625 2023-10-04 01:25:09,760 WARNING [train.py:1204] (2/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_282-257791-0_sp1.1 from training. Duration: 0.7818125 2023-10-04 01:25:12,540 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_43831_210_2158_1_1533725502_5872791_467-288776-0_sp1.1 from training. Duration: 0.92725 2023-10-04 01:25:12,771 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.nonlin_attention.balancer.min_positive, batch_count=1476053.3333333333, ans=0.05 2023-10-04 01:25:13,835 WARNING [train.py:1204] (2/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_138-154812-0_sp1.1 from training. Duration: 0.7090625 2023-10-04 01:25:13,873 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_1154-185930-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 01:25:13,899 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_2655_210_2087_1_1525688959_3897400_52-279899-0 from training. Duration: 0.69 2023-10-04 01:25:13,967 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_141-89113-0_sp1.1 from training. Duration: 0.7818125 2023-10-04 01:25:18,119 WARNING [train.py:1204] (2/4) Exclude cut with ID _55134_210_21982_1_1534590028377_3882170_297-47310-0_sp1.1 from training. Duration: 0.963625 2023-10-04 01:25:19,376 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_486-151131-0_sp1.1 from training. Duration: 0.77275 2023-10-04 01:25:19,503 WARNING [train.py:1204] (2/4) Exclude cut with ID _44778_210_2054_1_1533798127203_7443090_21-232980-0_sp1.1 from training. Duration: 0.936375 2023-10-04 01:25:20,942 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6165_210_3528_1_1527590982_4379841_338-106380-0_sp1.1 from training. Duration: 0.92725 2023-10-04 01:25:22,249 WARNING [train.py:1204] (2/4) Exclude cut with ID _55162_210_17071_1_1534408248583_3753480_355-257218-0_sp1.1 from training. Duration: 0.97275 2023-10-04 01:25:22,325 WARNING [train.py:1204] (2/4) Exclude cut with ID _2895_210_2477_1_1525956785794_4063750_289-11475-0 from training. Duration: 0.82 2023-10-04 01:25:29,630 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.feed_forward3.hidden_balancer.prob, batch_count=1476186.6666666667, ans=0.125 2023-10-04 01:25:30,673 WARNING [train.py:1204] (2/4) Exclude cut with ID _16756_210_4915_1_1531047803763_7104259_839-183431-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 01:25:30,756 WARNING [train.py:1204] (2/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_427-208901-0_sp0.9 from training. Duration: 0.7555625 2023-10-04 01:25:32,145 WARNING [train.py:1204] (2/4) Exclude cut with ID _36512_210_6987_1_1533033143998_3126069_181-306500-0 from training. Duration: 0.95 2023-10-04 01:25:36,304 WARNING [train.py:1204] (2/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_28-70987-0_sp1.1 from training. Duration: 0.7909375 2023-10-04 01:25:42,017 WARNING [train.py:1204] (2/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_590-58996-0_sp1.1 from training. Duration: 0.936375 2023-10-04 01:25:44,845 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_48730_210_5399_1_1533885886_2212769_108-47561-0_sp1.1 from training. Duration: 0.936375 2023-10-04 01:25:47,649 WARNING [train.py:1204] (2/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_401-158239-0_sp0.9 from training. Duration: 0.788875 2023-10-04 01:25:47,662 WARNING [train.py:1204] (2/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_278-154492-0_sp1.1 from training. Duration: 0.6909375 2023-10-04 01:25:47,668 WARNING [train.py:1204] (2/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_137-116442-0 from training. Duration: 0.78 2023-10-04 01:25:49,127 WARNING [train.py:1204] (2/4) Exclude cut with ID _3541_210_2477_1_1526475408709_3263973_189-317158-0 from training. Duration: 0.9 2023-10-04 01:25:49,213 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_47896_210_20508_1_1534492518_5958868_558-314572-0 from training. Duration: 0.97 2023-10-04 01:25:50,938 WARNING [train.py:1204] (2/4) Exclude cut with ID _66070_210_7640_1_1535441393096_7805310_290-40284-0_sp1.1 from training. Duration: 0.97275 2023-10-04 01:25:50,978 WARNING [train.py:1204] (2/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_110-224688-0_sp1.1 from training. Duration: 0.8818125 2023-10-04 01:25:54,043 WARNING [train.py:1204] (2/4) Exclude cut with ID _7719_210_3525_1_1528456153163_4296960_88-250259-0 from training. Duration: 0.84 2023-10-04 01:25:54,100 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8453_210_4211_1_1528502940_8562730_1157-114102-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 01:25:54,132 WARNING [train.py:1204] (2/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_317-175062-0_sp1.1 from training. Duration: 0.7454375 2023-10-04 01:25:54,138 WARNING [train.py:1204] (2/4) Exclude cut with ID _29857_210_20034_1_1532395886753_3798160_254-347254-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 01:25:55,558 WARNING [train.py:1204] (2/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_28-70987-0 from training. Duration: 0.87 2023-10-04 01:25:57,381 WARNING [train.py:1204] (2/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_321-335475-0 from training. Duration: 0.45 2023-10-04 01:26:00,054 INFO [train.py:1046] (2/4) Epoch 42, batch 3650, loss[loss=0.1679, simple_loss=0.2547, pruned_loss=0.04059, over 24018.00 frames. ], tot_loss[loss=0.1551, simple_loss=0.2351, pruned_loss=0.03755, over 4721717.20 frames. ], batch size: 86, lr: 2.41e-03, grad_scale: 32.0 2023-10-04 01:26:00,136 WARNING [train.py:1204] (2/4) Exclude cut with ID _40901_210_5665_1_1533363789958_3856130_356-107330-0_sp1.1 from training. Duration: 0.936375 2023-10-04 01:26:01,472 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_3780_210_2087_1_1526467389_2145572_176-107140-0 from training. Duration: 0.69 2023-10-04 01:26:04,462 WARNING [train.py:1204] (2/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_80-135200-0 from training. Duration: 0.79 2023-10-04 01:26:05,822 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_33348_210_5632_1_1533086912_3324030_85-28583-0_sp0.9 from training. Duration: 0.911125 2023-10-04 01:26:08,685 WARNING [train.py:1204] (2/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_29-293896-0 from training. Duration: 0.61 2023-10-04 01:26:10,548 WARNING [train.py:1204] (2/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_194-252800-0 from training. Duration: 0.97 2023-10-04 01:26:13,371 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_360-52446-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 01:26:13,372 WARNING [train.py:1204] (2/4) Exclude cut with ID _9389_210_1664_1_1529217337301_5289269_289-213417-0_sp0.9 from training. Duration: 0.988875 2023-10-04 01:26:13,430 WARNING [train.py:1204] (2/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_143-86344-0_sp0.9 from training. Duration: 0.8555625 2023-10-04 01:26:16,107 WARNING [train.py:1204] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_520-126729-0_sp0.9 from training. Duration: 0.77775 2023-10-04 01:26:17,829 WARNING [train.py:1204] (2/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_76-308663-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 01:26:17,889 WARNING [train.py:1204] (2/4) Exclude cut with ID _8021_210_7994_1_1528376451358_3653069_100-140231-0 from training. Duration: 0.67 2023-10-04 01:26:17,949 WARNING [train.py:1204] (2/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_387-131538-0_sp0.9 from training. Duration: 0.9 2023-10-04 01:26:19,281 WARNING [train.py:1204] (2/4) Exclude cut with ID _6275_210_2084_1_1528110062213_7669049_365-182054-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 01:26:19,314 WARNING [train.py:1204] (2/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_345-340690-0 from training. Duration: 0.93 2023-10-04 01:26:19,410 WARNING [train.py:1204] (2/4) Exclude cut with ID _5409_210_1388_1_1527292850189_5865191_178-127970-0_sp0.9 from training. Duration: 0.6333125 2023-10-04 01:26:21,375 WARNING [train.py:1204] (2/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_352-12734-0_sp1.1 from training. Duration: 0.92725 2023-10-04 01:26:21,378 WARNING [train.py:1204] (2/4) Exclude cut with ID _65134_210_14763_1_1535335393985_3505368_121-129373-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 01:26:22,635 WARNING [train.py:1204] (2/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_557-324258-0_sp1.1 from training. Duration: 0.736375 2023-10-04 01:26:25,781 WARNING [train.py:1204] (2/4) Exclude cut with ID _48678_210_5663_1_1533988830947_3769199_306-255886-0 from training. Duration: 0.77 2023-10-04 01:26:27,172 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7544_210_6753_1_1528587000_691468_87-178599-0 from training. Duration: 0.87 2023-10-04 01:26:28,638 WARNING [train.py:1204] (2/4) Exclude cut with ID _2941_210_2577_1_1526199673150_2489759_208-236402-0_sp1.1 from training. Duration: 0.963625 2023-10-04 01:26:30,119 WARNING [train.py:1204] (2/4) Exclude cut with ID _38670_210_15379_1_1533448810659_4391059_65-169397-0 from training. Duration: 0.97 2023-10-04 01:26:31,497 WARNING [train.py:1204] (2/4) Exclude cut with ID _30304_210_6319_1_1532476827261_7582060_306-328175-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 01:26:31,514 WARNING [train.py:1204] (2/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_212-149947-0_sp1.1 from training. Duration: 0.663625 2023-10-04 01:26:33,183 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.feed_forward3.hidden_balancer.prob, batch_count=1476453.3333333333, ans=0.125 2023-10-04 01:26:35,721 WARNING [train.py:1204] (2/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_321-22821-0_sp1.1 from training. Duration: 0.8090625 2023-10-04 01:26:38,906 WARNING [train.py:1204] (2/4) Exclude cut with ID _44010_210_4525_1_1533884262964_4013620_483-69110-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 01:26:38,914 WARNING [train.py:1204] (2/4) Exclude cut with ID _14922_210_6228_1_1530707435273_3693310_38-187851-0_sp1.1 from training. Duration: 0.563625 2023-10-04 01:26:39,016 WARNING [train.py:1204] (2/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_129-337802-0_sp0.9 from training. Duration: 0.8 2023-10-04 01:26:40,414 WARNING [train.py:1204] (2/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_268-87909-0_sp1.1 from training. Duration: 0.9 2023-10-04 01:26:44,671 WARNING [train.py:1204] (2/4) Exclude cut with ID _21878_210_3525_1_1531738772002_7264330_559-184862-0_sp1.1 from training. Duration: 0.8545625 2023-10-04 01:26:45,036 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.ff3_skip_rate, batch_count=1476520.0, ans=0.0 2023-10-04 01:26:46,746 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_46032_210_14656_1_1533781412_4507969_237-120766-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 01:26:48,160 WARNING [train.py:1204] (2/4) Exclude cut with ID _28427_210_4220_1_1532581240967_3061069_330-170906-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 01:26:48,166 WARNING [train.py:1204] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_401-51578-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 01:26:48,727 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.4.encoder.layers.2.conv_module2.whiten, num_groups=1, num_channels=384, metric=3.22 vs. limit=15.0 2023-10-04 01:26:49,545 WARNING [train.py:1204] (2/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_209-205365-0_sp1.1 from training. Duration: 0.7181875 2023-10-04 01:26:50,883 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_23801_210_16223_1_1532066392_3294657_57-242170-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 01:26:51,621 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.2.encoder.layers.0.feed_forward1.out_whiten, num_groups=1, num_channels=384, metric=5.40 vs. limit=15.0 2023-10-04 01:26:52,211 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_127-37371-0_sp1.1 from training. Duration: 0.936375 2023-10-04 01:26:55,665 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.607e+02 1.914e+02 2.089e+02 2.349e+02 3.091e+02, threshold=4.178e+02, percent-clipped=0.0 2023-10-04 01:26:59,011 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9171W0019-578905 from training. Duration: 0.896 2023-10-04 01:27:01,756 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_24605_210_1794_1_1532047863_4149755_349-49056-0_sp1.1 from training. Duration: 0.92725 2023-10-04 01:27:01,769 WARNING [train.py:1204] (2/4) Exclude cut with ID _12302_210_5219_1_1529924446466_8107280_897-21237-0_sp1.1 from training. Duration: 0.936375 2023-10-04 01:27:01,860 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_9460_210_4211_1_1528954587_10049667_480-260269-0_sp0.9 from training. Duration: 0.62225 2023-10-04 01:27:03,214 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_348-152548-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 01:27:04,674 WARNING [train.py:1204] (2/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_301-330455-0_sp0.9 from training. Duration: 0.888875 2023-10-04 01:27:06,099 WARNING [train.py:1204] (2/4) Exclude cut with ID _61008_210_10633_1_1534852769198_6974370_345-91705-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 01:27:06,343 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.nonlin_attention.balancer.max_positive, batch_count=1476586.6666666667, ans=0.95 2023-10-04 01:27:08,690 WARNING [train.py:1204] (2/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_304-273433-0 from training. Duration: 0.99 2023-10-04 01:27:08,693 WARNING [train.py:1204] (2/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_359-345972-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 01:27:10,691 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_2296_210_2087_1_1525503448_3397335_57-185430-0_sp1.1 from training. Duration: 0.5454375 2023-10-04 01:27:12,173 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_1256-120974-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 01:27:13,481 WARNING [train.py:1204] (2/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_372-30302-0_sp0.9 from training. Duration: 0.9555625 2023-10-04 01:27:14,721 INFO [train.py:1046] (2/4) Epoch 42, batch 3700, loss[loss=0.1623, simple_loss=0.2532, pruned_loss=0.03568, over 24289.00 frames. ], tot_loss[loss=0.1551, simple_loss=0.2353, pruned_loss=0.03743, over 4720511.43 frames. ], batch size: 74, lr: 2.41e-03, grad_scale: 32.0 2023-10-04 01:27:16,331 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.1.conv_skip_rate, batch_count=1476653.3333333333, ans=0.0 2023-10-04 01:27:17,427 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_42012_210_7927_1_1533439936_3871556_451-344262-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 01:27:17,427 WARNING [train.py:1204] (2/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_28-324729-0 from training. Duration: 0.94 2023-10-04 01:27:17,436 WARNING [train.py:1204] (2/4) Exclude cut with ID _64135_210_2253_1_1535677267876_7608972_313-18394-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 01:27:17,527 WARNING [train.py:1204] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_620-276793-0_sp0.9 from training. Duration: 0.6555625 2023-10-04 01:27:18,821 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7544_210_6753_1_1528591327_1471426_172-348777-0_sp0.9 from training. Duration: 0.7333125 2023-10-04 01:27:22,274 WARNING [train.py:1204] (2/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_332-221057-0_sp0.9 from training. Duration: 0.7555625 2023-10-04 01:27:25,028 WARNING [train.py:1204] (2/4) Exclude cut with ID _19023_210_4353_1_1531378892552_5820780_5-300035-0_sp1.1 from training. Duration: 0.92725 2023-10-04 01:27:25,074 WARNING [train.py:1204] (2/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_343-309023-0_sp1.1 from training. Duration: 0.963625 2023-10-04 01:27:26,504 WARNING [train.py:1204] (2/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_37-41919-0_sp1.1 from training. Duration: 0.7909375 2023-10-04 01:27:26,539 WARNING [train.py:1204] (2/4) Exclude cut with ID _3541_210_2477_1_1526475408709_3263973_41-182910-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 01:27:27,216 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_229-45300-0_sp0.9 from training. Duration: 0.6333125 2023-10-04 01:27:30,297 WARNING [train.py:1204] (2/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_40-214716-0_sp1.1 from training. Duration: 0.963625 2023-10-04 01:27:31,747 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9309W0203-640330 from training. Duration: 0.896 2023-10-04 01:27:37,494 WARNING [train.py:1204] (2/4) Exclude cut with ID _60229_210_27767_1_1534838406158_3655350_264-279573-0_sp1.1 from training. Duration: 0.7545625 2023-10-04 01:27:37,529 WARNING [train.py:1204] (2/4) Exclude cut with ID _8394_210_6702_1_1528590504316_3540189_372-198878-0_sp0.9 from training. Duration: 0.6666875 2023-10-04 01:27:40,230 WARNING [train.py:1204] (2/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_359-118926-0_sp1.1 from training. Duration: 0.6545625 2023-10-04 01:27:40,241 WARNING [train.py:1204] (2/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_85-65056-0 from training. Duration: 0.96 2023-10-04 01:27:40,276 WARNING [train.py:1204] (2/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_89-242335-0_sp1.1 from training. Duration: 0.863625 2023-10-04 01:27:43,495 WARNING [train.py:1204] (2/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_128-49444-0_sp1.1 from training. Duration: 0.936375 2023-10-04 01:27:43,567 WARNING [train.py:1204] (2/4) Exclude cut with ID _30327_210_16049_1_1532484161295_3924169_401-70819-0 from training. Duration: 0.86 2023-10-04 01:27:44,885 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_41822_210_13270_1_1533955976_4379284_260-256391-0_sp1.1 from training. Duration: 0.936375 2023-10-04 01:27:48,013 WARNING [train.py:1204] (2/4) Exclude cut with ID _67335_210_5219_1_1535525975652_4275229_599-247317-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 01:27:49,460 WARNING [train.py:1204] (2/4) Exclude cut with ID _29857_210_20034_1_1532395886753_3798160_209-239267-0_sp1.1 from training. Duration: 0.936375 2023-10-04 01:27:49,486 WARNING [train.py:1204] (2/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_46-207599-0_sp1.1 from training. Duration: 0.5818125 2023-10-04 01:27:52,324 WARNING [train.py:1204] (2/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_912-282412-0_sp1.1 from training. Duration: 0.4454375 2023-10-04 01:27:53,002 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.4.encoder.layers.1.self_attn_weights.whiten_keys, num_groups=4, num_channels=128, metric=1.87 vs. limit=6.0 2023-10-04 01:27:53,137 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.2.encoder.layers.2.self_attn_weights.whiten_keys, num_groups=4, num_channels=128, metric=2.92 vs. limit=6.0 2023-10-04 01:27:57,083 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_4183_210_2087_1_1526729100_1869838_141-166600-0_sp1.1 from training. Duration: 0.863625 2023-10-04 01:27:57,087 WARNING [train.py:1204] (2/4) Exclude cut with ID _15037_210_2253_1_1530752294104_3267650_296-192597-0 from training. Duration: 0.81 2023-10-04 01:27:57,742 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.3.encoder.layers.1.self_attn_weights.whiten_keys, num_groups=8, num_channels=256, metric=4.15 vs. limit=6.0 2023-10-04 01:27:58,294 WARNING [train.py:1204] (2/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_571-220562-0_sp1.1 from training. Duration: 0.963625 2023-10-04 01:27:58,319 WARNING [train.py:1204] (2/4) Exclude cut with ID _5287_210_2084_1_1527418883040_3878670_12-221186-0 from training. Duration: 0.98 2023-10-04 01:28:01,293 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.attention_skip_rate, batch_count=1476853.3333333333, ans=0.0 2023-10-04 01:28:05,713 WARNING [train.py:1204] (2/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_149-85602-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 01:28:05,734 WARNING [train.py:1204] (2/4) Exclude cut with ID _9235_210_5219_1_1528891154253_8046120_1003-101638-0_sp1.1 from training. Duration: 0.663625 2023-10-04 01:28:08,529 WARNING [train.py:1204] (2/4) Exclude cut with ID _55844_210_2346_1_1534471212569_7625707_1099-33689-0_sp1.1 from training. Duration: 0.92725 2023-10-04 01:28:08,561 WARNING [train.py:1204] (2/4) Exclude cut with ID _7606_210_4915_1_1528894253277_5080690_301-10732-0 from training. Duration: 0.81 2023-10-04 01:28:09,982 WARNING [train.py:1204] (2/4) Exclude cut with ID _22826_210_9994_1_1531908201522_3279169_403-34698-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 01:28:09,988 WARNING [train.py:1204] (2/4) Exclude cut with ID _8480_210_1874_1_1528589391205_6728330_755-112625-0_sp0.9 from training. Duration: 0.82225 2023-10-04 01:28:10,000 WARNING [train.py:1204] (2/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_527-238990-0_sp1.1 from training. Duration: 0.8181875 2023-10-04 01:28:11,344 WARNING [train.py:1204] (2/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_426-7328-0_sp1.1 from training. Duration: 0.92725 2023-10-04 01:28:15,799 WARNING [train.py:1204] (2/4) Exclude cut with ID _3541_210_2477_1_1526475408709_3263973_189-317158-0_sp1.1 from training. Duration: 0.8181875 2023-10-04 01:28:15,867 WARNING [train.py:1204] (2/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_162-103620-0 from training. Duration: 0.93 2023-10-04 01:28:17,232 WARNING [train.py:1204] (2/4) Exclude cut with ID _22083_210_8254_1_1531702862439_3678970_49-234546-0 from training. Duration: 0.83 2023-10-04 01:28:18,555 WARNING [train.py:1204] (2/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_370-331600-0_sp1.1 from training. Duration: 0.8545625 2023-10-04 01:28:18,576 WARNING [train.py:1204] (2/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_166-59042-0_sp1.1 from training. Duration: 0.97275 2023-10-04 01:28:20,279 WARNING [train.py:1204] (2/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_138-332773-0_sp1.1 from training. Duration: 0.736375 2023-10-04 01:28:20,348 WARNING [train.py:1204] (2/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_90-30673-0_sp0.9 from training. Duration: 0.9333125 2023-10-04 01:28:20,539 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.conv_skip_rate, batch_count=1476920.0, ans=0.0 2023-10-04 01:28:22,239 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.5.encoder.layers.1.self_attn_weights.whiten_keys, num_groups=4, num_channels=128, metric=1.45 vs. limit=6.0 2023-10-04 01:28:24,520 WARNING [train.py:1204] (2/4) Exclude cut with ID _27967_210_12140_1_1532588391859_8419130_737-41898-0_sp1.1 from training. Duration: 0.936375 2023-10-04 01:28:24,622 WARNING [train.py:1204] (2/4) Exclude cut with ID _8833_210_5219_1_1528592665210_7553059_329-125156-0_sp1.1 from training. Duration: 0.7454375 2023-10-04 01:28:26,009 WARNING [train.py:1204] (2/4) Exclude cut with ID _41817_210_18835_1_1533434477160_7423049_352-42537-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 01:28:27,381 WARNING [train.py:1204] (2/4) Exclude cut with ID _8365_210_2064_1_1528592311070_4677690_431-294102-0 from training. Duration: 0.89 2023-10-04 01:28:28,706 INFO [train.py:1046] (2/4) Epoch 42, batch 3750, loss[loss=0.1305, simple_loss=0.2162, pruned_loss=0.02239, over 20988.00 frames. ], tot_loss[loss=0.1554, simple_loss=0.2358, pruned_loss=0.03744, over 4716900.72 frames. ], batch size: 46, lr: 2.41e-03, grad_scale: 32.0 2023-10-04 01:28:29,515 WARNING [train.py:1204] (2/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_454-314901-0_sp1.1 from training. Duration: 0.4181875 2023-10-04 01:28:33,821 WARNING [train.py:1204] (2/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_80-135200-0_sp0.9 from training. Duration: 0.87775 2023-10-04 01:28:33,877 WARNING [train.py:1204] (2/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_181-78632-0 from training. Duration: 0.78 2023-10-04 01:28:35,242 WARNING [train.py:1204] (2/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_300-27235-0_sp0.9 from training. Duration: 0.9666875 2023-10-04 01:28:36,484 WARNING [train.py:1204] (2/4) Exclude cut with ID _47790_210_16187_1_1533862814066_4207519_96-177176-0_sp1.1 from training. Duration: 0.97275 2023-10-04 01:28:37,901 WARNING [train.py:1204] (2/4) Exclude cut with ID _35941_210_15379_1_1532995294515_4023089_246-348742-0_sp1.1 from training. Duration: 0.97275 2023-10-04 01:28:39,240 WARNING [train.py:1204] (2/4) Exclude cut with ID _38670_210_15379_1_1533448810659_4391059_65-169397-0_sp1.1 from training. Duration: 0.8818125 2023-10-04 01:28:40,734 WARNING [train.py:1204] (2/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_1003-99642-0_sp1.1 from training. Duration: 0.963625 2023-10-04 01:28:43,655 WARNING [train.py:1204] (2/4) Exclude cut with ID _40402_210_18219_1_1533522324115_2953150_320-14078-0_sp1.1 from training. Duration: 0.863625 2023-10-04 01:28:46,272 WARNING [train.py:1204] (2/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_367-61330-0_sp0.9 from training. Duration: 0.8333125 2023-10-04 01:28:49,521 WARNING [train.py:1204] (2/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_515-298514-0_sp1.1 from training. Duration: 0.92725 2023-10-04 01:28:52,367 WARNING [train.py:1204] (2/4) Exclude cut with ID _53731_210_7994_1_1534553979244_3757214_471-320815-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 01:28:52,419 WARNING [train.py:1204] (2/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_306-232480-0 from training. Duration: 0.96 2023-10-04 01:28:54,360 WARNING [train.py:1204] (2/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_478-162247-0_sp0.9 from training. Duration: 0.911125 2023-10-04 01:28:55,767 WARNING [train.py:1204] (2/4) Exclude cut with ID _56423_210_5854_1_1534896061049_3165969_345-284696-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 01:28:55,794 WARNING [train.py:1204] (2/4) Exclude cut with ID _44778_210_2054_1_1533798127203_7443090_600-122253-0_sp1.1 from training. Duration: 0.963625 2023-10-04 01:28:59,963 WARNING [train.py:1204] (2/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_199-295611-0 from training. Duration: 0.55 2023-10-04 01:29:04,527 WARNING [train.py:1204] (2/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_734-129761-0 from training. Duration: 0.82 2023-10-04 01:29:04,634 WARNING [train.py:1204] (2/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_349-169257-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 01:29:06,509 WARNING [train.py:1204] (2/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_202-67017-0_sp0.9 from training. Duration: 0.911125 2023-10-04 01:29:07,928 WARNING [train.py:1204] (2/4) Exclude cut with ID _16908_210_5724_1_1531119157248_3772700_61-258978-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 01:29:10,821 WARNING [train.py:1204] (2/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_172-156931-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 01:29:12,187 WARNING [train.py:1204] (2/4) Exclude cut with ID _4092_210_2151_1_1526643044449_3135759_356-278694-0_sp1.1 from training. Duration: 0.42725 2023-10-04 01:29:16,302 WARNING [train.py:1204] (2/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_116-332008-0 from training. Duration: 0.98 2023-10-04 01:29:17,826 WARNING [train.py:1204] (2/4) Exclude cut with ID _29003_210_19596_1_1532507391720_3894098_358-114789-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 01:29:17,950 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.1.ff3_skip_rate, batch_count=1477186.6666666667, ans=0.0 2023-10-04 01:29:19,867 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.3.encoder.layers.3.self_attn_weights.whiten_keys, num_groups=8, num_channels=256, metric=2.46 vs. limit=6.0 2023-10-04 01:29:22,402 WARNING [train.py:1204] (2/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_152-212533-0_sp1.1 from training. Duration: 0.8818125 2023-10-04 01:29:24,257 WARNING [train.py:1204] (2/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_267-214116-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 01:29:25,563 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.770e+02 2.079e+02 2.348e+02 2.758e+02 4.520e+02, threshold=4.696e+02, percent-clipped=4.0 2023-10-04 01:29:26,961 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7543_210_6753_1_1528541907_1399322_165-323619-0_sp0.9 from training. Duration: 0.7666875 2023-10-04 01:29:31,793 WARNING [train.py:1204] (2/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_577-332600-0_sp0.9 from training. Duration: 0.6333125 2023-10-04 01:29:33,225 WARNING [train.py:1204] (2/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_342-100157-0_sp0.9 from training. Duration: 0.6 2023-10-04 01:29:34,646 WARNING [train.py:1204] (2/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_141-89727-0_sp1.1 from training. Duration: 0.6454375 2023-10-04 01:29:36,029 WARNING [train.py:1204] (2/4) Exclude cut with ID _3992_210_1747_1_1526644816031_3731060_107-251434-0_sp1.1 from training. Duration: 0.9 2023-10-04 01:29:37,893 WARNING [train.py:1204] (2/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_401-53423-0_sp1.1 from training. Duration: 0.5 2023-10-04 01:29:41,344 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.3.encoder.layers.0.feed_forward1.out_whiten, num_groups=1, num_channels=512, metric=7.28 vs. limit=15.0 2023-10-04 01:29:43,149 INFO [train.py:1046] (2/4) Epoch 42, batch 3800, loss[loss=0.147, simple_loss=0.2095, pruned_loss=0.04221, over 23497.00 frames. ], tot_loss[loss=0.1558, simple_loss=0.2363, pruned_loss=0.03766, over 4700962.18 frames. ], batch size: 256, lr: 2.41e-03, grad_scale: 16.0 2023-10-04 01:29:46,229 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7762_210_1794_1_1528199508_4057408_422-227034-0_sp1.1 from training. Duration: 0.663625 2023-10-04 01:29:50,487 WARNING [train.py:1204] (2/4) Exclude cut with ID _4184_210_2550_1_1526730192013_2219380_102-250299-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 01:29:51,874 WARNING [train.py:1204] (2/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_253-308377-0_sp0.9 from training. Duration: 0.5444375 2023-10-04 01:29:51,943 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_271-297505-0 from training. Duration: 0.99 2023-10-04 01:29:52,252 INFO [scaling.py:1118] (2/4) WithLoss: name=encoder.encoders.5.encoder.layers.0.self_attn_weights, loss-sum=0.000e+00 2023-10-04 01:29:53,822 WARNING [train.py:1204] (2/4) Exclude cut with ID _35941_210_15379_1_1532995294515_4023089_213-142973-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 01:29:56,502 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7544_210_6753_1_1528587722_3576686_162-63018-0_sp1.1 from training. Duration: 0.92725 2023-10-04 01:29:56,577 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_365-217119-0_sp0.9 from training. Duration: 0.7 2023-10-04 01:29:59,738 WARNING [train.py:1204] (2/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_662-275621-0_sp0.9 from training. Duration: 0.4666875 2023-10-04 01:29:59,739 WARNING [train.py:1204] (2/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_374-187555-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 01:30:01,087 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_289-180904-0_sp1.1 from training. Duration: 0.6909375 2023-10-04 01:30:03,135 WARNING [train.py:1204] (2/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_189-47206-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 01:30:04,552 WARNING [train.py:1204] (2/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_234-14174-0_sp1.1 from training. Duration: 0.7454375 2023-10-04 01:30:04,587 WARNING [train.py:1204] (2/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_576-76621-0_sp1.1 from training. Duration: 0.963625 2023-10-04 01:30:06,000 WARNING [train.py:1204] (2/4) Exclude cut with ID _13253_210_1925_1_1530757775819_3577873_253-246386-0 from training. Duration: 0.9 2023-10-04 01:30:06,211 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.feed_forward1.out_proj.dropout_p, batch_count=1477386.6666666667, ans=0.1 2023-10-04 01:30:09,123 WARNING [train.py:1204] (2/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_372-6869-0_sp1.1 from training. Duration: 0.4 2023-10-04 01:30:09,178 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_21434_210_4238_1_1531916339_1222252_22-54954-0_sp1.1 from training. Duration: 0.936375 2023-10-04 01:30:10,676 WARNING [train.py:1204] (2/4) Exclude cut with ID _29853_210_7497_1_1532505600851_6642110_492-170361-0_sp1.1 from training. Duration: 0.92725 2023-10-04 01:30:13,543 WARNING [train.py:1204] (2/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_505-97987-0_sp0.9 from training. Duration: 0.9555625 2023-10-04 01:30:13,732 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.bypass_mid.scale_min, batch_count=1477453.3333333333, ans=0.2 2023-10-04 01:30:14,842 WARNING [train.py:1204] (2/4) Exclude cut with ID _8365_210_2064_1_1528592311070_4677690_371-122295-0_sp0.9 from training. Duration: 0.611125 2023-10-04 01:30:16,196 WARNING [train.py:1204] (2/4) Exclude cut with ID _14268_210_9206_1_1530622771285_4714099_637-86630-0_sp1.1 from training. Duration: 0.463625 2023-10-04 01:30:16,213 WARNING [train.py:1204] (2/4) Exclude cut with ID _29857_210_20034_1_1532395886753_3798160_372-5678-0_sp1.1 from training. Duration: 0.963625 2023-10-04 01:30:20,279 WARNING [train.py:1204] (2/4) Exclude cut with ID _52892_210_6721_1_1534378941259_4124129_90-165108-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 01:30:20,355 WARNING [train.py:1204] (2/4) Exclude cut with ID _27052_210_5986_1_1532329197187_3585009_143-339198-0_sp1.1 from training. Duration: 0.963625 2023-10-04 01:30:24,688 WARNING [train.py:1204] (2/4) Exclude cut with ID _14268_210_9206_1_1530622771285_4714099_637-86630-0_sp0.9 from training. Duration: 0.5666875 2023-10-04 01:30:24,691 WARNING [train.py:1204] (2/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_401-255491-0 from training. Duration: 0.7 2023-10-04 01:30:26,095 WARNING [train.py:1204] (2/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_178-6946-0_sp1.1 from training. Duration: 0.97275 2023-10-04 01:30:26,394 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.out_combiner.scale_min, batch_count=1477520.0, ans=0.2 2023-10-04 01:30:26,424 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=1477520.0, ans=0.1 2023-10-04 01:30:28,681 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.3.encoder.layers.3.whiten, num_groups=1, num_channels=512, metric=4.23 vs. limit=12.0 2023-10-04 01:30:32,275 WARNING [train.py:1204] (2/4) Exclude cut with ID _27485_210_4916_1_1532739191677_3668269_460-163885-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 01:30:33,101 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.conv_module1.balancer1.prob, batch_count=1477520.0, ans=0.125 2023-10-04 01:30:36,928 WARNING [train.py:1204] (2/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_270-26053-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 01:30:39,618 WARNING [train.py:1204] (2/4) Exclude cut with ID _14170_210_5219_1_1530525949868_4469650_412-317077-0 from training. Duration: 0.75 2023-10-04 01:30:39,919 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.nonlin_attention.balancer.prob, batch_count=1477520.0, ans=0.125 2023-10-04 01:30:41,565 WARNING [train.py:1204] (2/4) Exclude cut with ID _5289_210_2084_1_1527678202736_3779299_142-103317-0 from training. Duration: 0.84 2023-10-04 01:30:42,879 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_150-143438-0_sp1.1 from training. Duration: 0.92725 2023-10-04 01:30:44,331 WARNING [train.py:1204] (2/4) Exclude cut with ID _27894_210_12392_1_1532415496078_6902439_668-269779-0_sp1.1 from training. Duration: 0.97275 2023-10-04 01:30:45,796 WARNING [train.py:1204] (2/4) Exclude cut with ID _18513_210_9206_1_1531618215223_3862620_56-222075-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 01:30:47,175 WARNING [train.py:1204] (2/4) Exclude cut with ID _4092_210_2151_1_1526643044449_3135759_356-278694-0 from training. Duration: 0.47 2023-10-04 01:30:51,339 WARNING [train.py:1204] (2/4) Exclude cut with ID _25808_210_13292_1_1532343514562_7366109_633-47524-0 from training. Duration: 0.99 2023-10-04 01:30:51,349 WARNING [train.py:1204] (2/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_872-264525-0 from training. Duration: 0.65 2023-10-04 01:30:51,390 WARNING [train.py:1204] (2/4) Exclude cut with ID _14632_210_9988_1_1530691279392_3923200_239-232545-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 01:30:51,462 WARNING [train.py:1204] (2/4) Exclude cut with ID _14349_210_8341_1_1530615710818_3442470_153-27432-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 01:30:57,786 INFO [train.py:1046] (2/4) Epoch 42, batch 3850, loss[loss=0.1604, simple_loss=0.2374, pruned_loss=0.04174, over 24485.00 frames. ], tot_loss[loss=0.1555, simple_loss=0.2357, pruned_loss=0.03765, over 4704730.96 frames. ], batch size: 66, lr: 2.41e-03, grad_scale: 16.0 2023-10-04 01:30:57,941 WARNING [train.py:1204] (2/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_37-41919-0_sp0.9 from training. Duration: 0.9666875 2023-10-04 01:30:59,249 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_196-289637-0_sp0.9 from training. Duration: 0.8333125 2023-10-04 01:31:02,278 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.conv_module1.balancer1.prob, batch_count=1477653.3333333333, ans=0.125 2023-10-04 01:31:03,820 WARNING [train.py:1204] (2/4) Exclude cut with ID _13678_210_7497_1_1530530069226_3463170_371-3221-0_sp1.1 from training. Duration: 0.8181875 2023-10-04 01:31:03,901 WARNING [train.py:1204] (2/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_557-324258-0 from training. Duration: 0.81 2023-10-04 01:31:05,326 WARNING [train.py:1204] (2/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_1012-279473-0_sp1.1 from training. Duration: 0.6090625 2023-10-04 01:31:05,404 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_66916_210_14656_1_1535725525_326566_7-92649-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 01:31:09,442 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_9460_210_4211_1_1528954587_10049667_480-260269-0_sp1.1 from training. Duration: 0.5090625 2023-10-04 01:31:10,893 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_40896_210_15710_1_1533433956_7100802_121-155001-0_sp1.1 from training. Duration: 0.92725 2023-10-04 01:31:14,266 WARNING [train.py:1204] (2/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_188-171673-0_sp1.1 from training. Duration: 0.52725 2023-10-04 01:31:14,341 WARNING [train.py:1204] (2/4) Exclude cut with ID _47790_210_16187_1_1533862814066_4207519_2-39653-0 from training. Duration: 0.98 2023-10-04 01:31:20,993 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_23850_210_4238_1_1532232597_1632433_112-229012-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 01:31:22,433 WARNING [train.py:1204] (2/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_626-79528-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 01:31:24,502 WARNING [train.py:1204] (2/4) Exclude cut with ID _30304_210_6319_1_1532476827261_7582060_856-90052-0_sp1.1 from training. Duration: 0.963625 2023-10-04 01:31:24,561 WARNING [train.py:1204] (2/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_122-109023-0_sp0.9 from training. Duration: 0.8444375 2023-10-04 01:31:26,018 WARNING [train.py:1204] (2/4) Exclude cut with ID _64414_210_17301_1_1535243252048_3757759_37-70328-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 01:31:27,979 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_20120_210_6753_1_1531643035_5482859_622-344907-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 01:31:29,343 WARNING [train.py:1204] (2/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_410-321956-0_sp1.1 from training. Duration: 0.92725 2023-10-04 01:31:29,355 WARNING [train.py:1204] (2/4) Exclude cut with ID _7562_210_2084_1_1528887767916_3946229_186-326422-0_sp0.9 from training. Duration: 0.9333125 2023-10-04 01:31:29,453 WARNING [train.py:1204] (2/4) Exclude cut with ID _37537_210_2253_1_1533171758568_3421949_127-285192-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 01:31:32,503 WARNING [train.py:1204] (2/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_723-228168-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 01:31:33,868 WARNING [train.py:1204] (2/4) Exclude cut with ID _13678_210_7497_1_1530530069226_3463170_426-246984-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 01:31:33,882 WARNING [train.py:1204] (2/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_399-156791-0_sp0.9 from training. Duration: 0.92225 2023-10-04 01:31:35,146 WARNING [train.py:1204] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_391-22148-0 from training. Duration: 0.77 2023-10-04 01:31:35,172 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_3475_210_2028_1_1526193578_3622569_64-297072-0 from training. Duration: 0.91 2023-10-04 01:31:36,609 WARNING [train.py:1204] (2/4) Exclude cut with ID _17875_210_2087_1_1531224012907_1821339_225-280015-0_sp1.1 from training. Duration: 0.963625 2023-10-04 01:31:36,639 WARNING [train.py:1204] (2/4) Exclude cut with ID _50655_210_18628_1_1534229942193_3772154_325-246169-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 01:31:39,365 WARNING [train.py:1204] (2/4) Exclude cut with ID _29529_210_7234_1_1532393961455_3977460_597-257348-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 01:31:39,419 WARNING [train.py:1204] (2/4) Exclude cut with ID _58895_210_8750_1_1535184081320_3649089_77-173485-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 01:31:40,681 WARNING [train.py:1204] (2/4) Exclude cut with ID _6270_210_2084_1_1527850911573_3856420_26-277343-0 from training. Duration: 0.82 2023-10-04 01:31:43,889 WARNING [train.py:1204] (2/4) Exclude cut with ID _14368_210_4220_1_1530532714870_3595529_144-3287-0 from training. Duration: 0.67 2023-10-04 01:31:44,006 WARNING [train.py:1204] (2/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_293-145278-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 01:31:46,735 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_40047_210_2290_1_1533290519_2374311_257-335162-0 from training. Duration: 0.95 2023-10-04 01:31:46,874 WARNING [train.py:1204] (2/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_619-344499-0_sp0.9 from training. Duration: 0.7 2023-10-04 01:31:52,229 WARNING [train.py:1204] (2/4) Exclude cut with ID _19504_210_9994_1_1531648863242_3325220_456-315379-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 01:31:52,307 WARNING [train.py:1204] (2/4) Exclude cut with ID _55857_210_2346_1_1534644007831_6898209_631-221692-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 01:31:53,981 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.564e+02 1.866e+02 2.015e+02 2.383e+02 4.192e+02, threshold=4.030e+02, percent-clipped=0.0 2023-10-04 01:31:54,361 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.feed_forward3.hidden_balancer.prob, batch_count=1477853.3333333333, ans=0.125 2023-10-04 01:31:54,651 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.4.encoder.layers.2.feed_forward1.out_whiten, num_groups=1, num_channels=384, metric=12.18 vs. limit=15.0 2023-10-04 01:31:56,890 WARNING [train.py:1204] (2/4) Exclude cut with ID _64631_210_27767_1_1535349580068_4165160_324-3682-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 01:31:56,918 WARNING [train.py:1204] (2/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_317-211107-0 from training. Duration: 0.92 2023-10-04 01:32:00,093 WARNING [train.py:1204] (2/4) Exclude cut with ID _6789_210_1534_1_1527857937218_3118950_311-61262-0 from training. Duration: 0.65 2023-10-04 01:32:00,311 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.balancer2.prob, batch_count=1477920.0, ans=0.125 2023-10-04 01:32:03,306 WARNING [train.py:1204] (2/4) Exclude cut with ID _28323_210_16187_1_1532480395667_4100300_365-68882-0_sp1.1 from training. Duration: 0.92725 2023-10-04 01:32:03,344 WARNING [train.py:1204] (2/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_816-181825-0_sp1.1 from training. Duration: 0.92725 2023-10-04 01:32:06,082 WARNING [train.py:1204] (2/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_132-269633-0_sp1.1 from training. Duration: 0.6181875 2023-10-04 01:32:06,100 WARNING [train.py:1204] (2/4) Exclude cut with ID _44585_210_18628_1_1533605350387_4518199_280-73534-0_sp1.1 from training. Duration: 0.8090625 2023-10-04 01:32:06,162 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_68095_210_20185_1_1535718226_8888855_1009-164755-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 01:32:07,469 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_40223_210_6228_1_1533298404_4812267_400-103813-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 01:32:07,470 WARNING [train.py:1204] (2/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_491-261923-0_sp1.1 from training. Duration: 0.8909375 2023-10-04 01:32:07,481 WARNING [train.py:1204] (2/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_712-342563-0 from training. Duration: 0.97 2023-10-04 01:32:08,890 WARNING [train.py:1204] (2/4) Exclude cut with ID _41161_210_20034_1_1533431249657_3555314_175-190032-0_sp1.1 from training. Duration: 0.963625 2023-10-04 01:32:10,276 WARNING [train.py:1204] (2/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_788-298907-0 from training. Duration: 0.89 2023-10-04 01:32:10,293 WARNING [train.py:1204] (2/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_225-70961-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 01:32:10,299 WARNING [train.py:1204] (2/4) Exclude cut with ID _58583_210_5236_1_1534726835266_3574249_235-175994-0_sp1.1 from training. Duration: 0.92725 2023-10-04 01:32:12,082 INFO [train.py:1046] (2/4) Epoch 42, batch 3900, loss[loss=0.1421, simple_loss=0.2308, pruned_loss=0.02672, over 24439.00 frames. ], tot_loss[loss=0.1547, simple_loss=0.2346, pruned_loss=0.03741, over 4695304.51 frames. ], batch size: 69, lr: 2.41e-03, grad_scale: 16.0 2023-10-04 01:32:12,218 WARNING [train.py:1204] (2/4) Exclude cut with ID _12302_210_5219_1_1529924446466_8107280_907-182949-0_sp1.1 from training. Duration: 0.836375 2023-10-04 01:32:12,448 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.conv_module1.balancer1.prob, batch_count=1477986.6666666667, ans=0.125 2023-10-04 01:32:13,490 WARNING [train.py:1204] (2/4) Exclude cut with ID _54871_210_22356_1_1534665798253_3856359_94-193692-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 01:32:13,603 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_4528_210_3273_1_1527076654_3276777_342-168068-0_sp0.9 from training. Duration: 0.9666875 2023-10-04 01:32:14,882 WARNING [train.py:1204] (2/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_368-74239-0_sp1.1 from training. Duration: 0.92725 2023-10-04 01:32:14,883 WARNING [train.py:1204] (2/4) Exclude cut with ID _44831_210_13427_1_1533816011549_3689035_291-295928-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 01:32:14,957 WARNING [train.py:1204] (2/4) Exclude cut with ID _56406_210_9288_1_1534899618664_3496149_455-131997-0_sp1.1 from training. Duration: 0.97275 2023-10-04 01:32:14,970 WARNING [train.py:1204] (2/4) Exclude cut with ID _6473_210_1741_1_1527901307396_3815380_108-17335-0 from training. Duration: 0.65 2023-10-04 01:32:16,303 WARNING [train.py:1204] (2/4) Exclude cut with ID _14224_210_4381_1_1530529062037_4264010_286-135600-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 01:32:19,223 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_14056_210_4163_1_1530604483_7572827_928-321675-0_sp1.1 from training. Duration: 0.936375 2023-10-04 01:32:20,477 WARNING [train.py:1204] (2/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_81-300212-0_sp1.1 from training. Duration: 0.5454375 2023-10-04 01:32:20,516 WARNING [train.py:1204] (2/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_29-280622-0_sp0.9 from training. Duration: 0.911125 2023-10-04 01:32:21,979 WARNING [train.py:1204] (2/4) Exclude cut with ID _61079_210_4525_1_1534921008198_4011419_228-176447-0_sp1.1 from training. Duration: 0.936375 2023-10-04 01:32:24,007 WARNING [train.py:1204] (2/4) Exclude cut with ID _8394_210_6702_1_1528590504316_3540189_372-198878-0_sp1.1 from training. Duration: 0.5454375 2023-10-04 01:32:24,017 WARNING [train.py:1204] (2/4) Exclude cut with ID _64521_210_14699_1_1535243427259_3815328_244-208725-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 01:32:26,703 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8275_210_4198_1_1528455320_3892571_491-17707-0_sp0.9 from training. Duration: 0.711125 2023-10-04 01:32:29,943 WARNING [train.py:1204] (2/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_375-245677-0 from training. Duration: 0.81 2023-10-04 01:32:29,958 WARNING [train.py:1204] (2/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_690-286055-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 01:32:31,326 WARNING [train.py:1204] (2/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_89-321955-0 from training. Duration: 0.8 2023-10-04 01:32:33,184 WARNING [train.py:1204] (2/4) Exclude cut with ID _18895_210_6228_1_1531357274234_3784930_213-98721-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 01:32:34,684 WARNING [train.py:1204] (2/4) Exclude cut with ID _19708_210_9206_1_1532136650733_4238364_347-65240-0 from training. Duration: 0.97 2023-10-04 01:32:34,806 WARNING [train.py:1204] (2/4) Exclude cut with ID _64135_210_2253_1_1535677267876_7608972_498-5039-0 from training. Duration: 0.78 2023-10-04 01:32:38,842 WARNING [train.py:1204] (2/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_146-276944-0_sp1.1 from training. Duration: 0.7545625 2023-10-04 01:32:40,201 WARNING [train.py:1204] (2/4) Exclude cut with ID _63171_210_15488_1_1535104792794_3295110_155-34488-0_sp1.1 from training. Duration: 0.97275 2023-10-04 01:32:40,229 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_5438_210_1794_1_1527336687_2600009_233-58072-0_sp0.9 from training. Duration: 0.8555625 2023-10-04 01:32:41,491 WARNING [train.py:1204] (2/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_63-101996-0_sp0.9 from training. Duration: 0.97775 2023-10-04 01:32:44,832 WARNING [train.py:1204] (2/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_271-269084-0_sp1.1 from training. Duration: 0.7545625 2023-10-04 01:32:45,350 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.feed_forward1.out_whiten.whitening_limit, batch_count=1478120.0, ans=15.0 2023-10-04 01:32:46,295 WARNING [train.py:1204] (2/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_345-167986-0_sp1.1 from training. Duration: 0.763625 2023-10-04 01:32:49,005 WARNING [train.py:1204] (2/4) Exclude cut with ID _39290_210_4220_1_1533862778760_3075118_92-311600-0_sp1.1 from training. Duration: 0.8 2023-10-04 01:32:49,012 WARNING [train.py:1204] (2/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_209-65306-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 01:32:49,094 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_26433_210_12108_1_1532157158_4056480_317-335807-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 01:32:52,535 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.1.self_attn_weights.pos_emb_skip_rate, batch_count=1478120.0, ans=0.0 2023-10-04 01:32:56,556 WARNING [train.py:1204] (2/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_238-94127-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 01:32:56,606 WARNING [train.py:1204] (2/4) Exclude cut with ID _41314_210_12651_1_1533459476543_3766460_207-132038-0_sp1.1 from training. Duration: 0.8545625 2023-10-04 01:33:02,912 WARNING [train.py:1204] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_167-137255-0_sp1.1 from training. Duration: 0.5818125 2023-10-04 01:33:03,013 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_2009_210_2071_1_1525481134_4588937_385-302100-0_sp0.9 from training. Duration: 0.9666875 2023-10-04 01:33:13,271 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_15517_210_6753_1_1530954008_3487336_253-110617-0_sp1.1 from training. Duration: 0.936375 2023-10-04 01:33:16,032 WARNING [train.py:1204] (2/4) Exclude cut with ID _39290_210_4220_1_1533862778760_3075118_92-311600-0_sp0.9 from training. Duration: 0.97775 2023-10-04 01:33:16,068 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_42-142473-0 from training. Duration: 0.72 2023-10-04 01:33:16,106 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_2296_210_2087_1_1525503448_3397335_57-185430-0 from training. Duration: 0.6 2023-10-04 01:33:16,125 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_11305_210_1794_1_1529750428_7178148_257-312870-0_sp0.9 from training. Duration: 0.97775 2023-10-04 01:33:16,366 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.conv_skip_rate, batch_count=1478253.3333333333, ans=0.0 2023-10-04 01:33:18,089 WARNING [train.py:1204] (2/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_222-198127-0 from training. Duration: 0.84 2023-10-04 01:33:20,737 WARNING [train.py:1204] (2/4) Exclude cut with ID _65263_210_22356_1_1535763598691_3921037_423-202699-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 01:33:20,803 WARNING [train.py:1204] (2/4) Exclude cut with ID _15037_210_2253_1_1530752294104_3267650_13-130361-0 from training. Duration: 0.99 2023-10-04 01:33:25,955 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.self_attn_weights.pos_emb_skip_rate, batch_count=1478320.0, ans=0.0 2023-10-04 01:33:27,073 INFO [train.py:1046] (2/4) Epoch 42, batch 3950, loss[loss=0.1501, simple_loss=0.2291, pruned_loss=0.03552, over 23601.00 frames. ], tot_loss[loss=0.1541, simple_loss=0.2341, pruned_loss=0.03705, over 4701883.08 frames. ], batch size: 232, lr: 2.41e-03, grad_scale: 16.0 2023-10-04 01:33:27,171 WARNING [train.py:1204] (2/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_628-198080-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 01:33:29,820 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_3171_210_2087_1_1526109084_2330007_37-64334-0 from training. Duration: 0.82 2023-10-04 01:33:29,889 WARNING [train.py:1204] (2/4) Exclude cut with ID _5287_210_2084_1_1527418883040_3878670_12-221186-0_sp1.1 from training. Duration: 0.8909375 2023-10-04 01:33:32,590 WARNING [train.py:1204] (2/4) Exclude cut with ID _6653_210_2629_1_1527919172441_6732679_1103-56445-0_sp1.1 from training. Duration: 0.663625 2023-10-04 01:33:34,029 WARNING [train.py:1204] (2/4) Exclude cut with ID _65263_210_22356_1_1535763598691_3921037_169-140068-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 01:33:40,015 WARNING [train.py:1204] (2/4) Exclude cut with ID IC0671W0118-331961 from training. Duration: 0.896 2023-10-04 01:33:41,350 WARNING [train.py:1204] (2/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_402-102311-0_sp1.1 from training. Duration: 0.6090625 2023-10-04 01:33:41,369 WARNING [train.py:1204] (2/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_202-293523-0 from training. Duration: 0.72 2023-10-04 01:33:42,711 WARNING [train.py:1204] (2/4) Exclude cut with ID IC0679W0038-335846 from training. Duration: 0.768 2023-10-04 01:33:42,742 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_37357_210_17024_1_1533089504_2316121_79-198866-0_sp1.1 from training. Duration: 0.936375 2023-10-04 01:33:45,560 WARNING [train.py:1204] (2/4) Exclude cut with ID _7606_210_4915_1_1528894253277_5080690_292-271285-0_sp1.1 from training. Duration: 0.963625 2023-10-04 01:33:45,579 WARNING [train.py:1204] (2/4) Exclude cut with ID _3541_210_2477_1_1526475408709_3263973_377-296839-0_sp0.9 from training. Duration: 0.611125 2023-10-04 01:33:45,582 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_45886_210_17024_1_1533704917_2435333_58-131069-0_sp1.1 from training. Duration: 0.963625 2023-10-04 01:33:48,717 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6961_210_2087_1_1527937168_6815011_395-185060-0 from training. Duration: 0.95 2023-10-04 01:33:51,441 WARNING [train.py:1204] (2/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_304-266479-0_sp1.1 from training. Duration: 0.97275 2023-10-04 01:33:51,496 WARNING [train.py:1204] (2/4) Exclude cut with ID _3867_210_1883_1_1526641200575_4475830_319-349850-0_sp1.1 from training. Duration: 0.6090625 2023-10-04 01:33:52,803 WARNING [train.py:1204] (2/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_304-59227-0_sp0.9 from training. Duration: 0.8555625 2023-10-04 01:33:52,871 WARNING [train.py:1204] (2/4) Exclude cut with ID _8021_210_7994_1_1528376451358_3653069_100-140231-0_sp0.9 from training. Duration: 0.7444375 2023-10-04 01:33:54,200 WARNING [train.py:1204] (2/4) Exclude cut with ID _8833_210_5219_1_1528592665210_7553059_329-125156-0_sp0.9 from training. Duration: 0.911125 2023-10-04 01:33:57,278 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.conv_skip_rate, batch_count=1478453.3333333333, ans=0.0 2023-10-04 01:33:59,689 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.3.encoder.layers.2.feed_forward1.out_whiten, num_groups=1, num_channels=512, metric=6.94 vs. limit=15.0 2023-10-04 01:34:04,396 WARNING [train.py:1204] (2/4) Exclude cut with ID _14459_210_2228_1_1531443641946_7516700_792-287234-0_sp1.1 from training. Duration: 0.92725 2023-10-04 01:34:04,412 WARNING [train.py:1204] (2/4) Exclude cut with ID _19708_210_9206_1_1532136650733_4238364_347-65240-0_sp1.1 from training. Duration: 0.8818125 2023-10-04 01:34:10,735 WARNING [train.py:1204] (2/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_70-27732-0 from training. Duration: 0.94 2023-10-04 01:34:14,882 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_397-132565-0 from training. Duration: 0.99 2023-10-04 01:34:14,885 WARNING [train.py:1204] (2/4) Exclude cut with ID _49728_210_12651_1_1534041034294_6788150_624-195301-0 from training. Duration: 0.85 2023-10-04 01:34:16,135 WARNING [train.py:1204] (2/4) Exclude cut with ID _55429_210_10626_1_1534467588527_7174143_377-169391-0_sp1.1 from training. Duration: 0.87275 2023-10-04 01:34:17,560 WARNING [train.py:1204] (2/4) Exclude cut with ID _49376_210_5986_1_1533985278732_3370740_354-344695-0_sp1.1 from training. Duration: 0.9 2023-10-04 01:34:21,629 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.660e+02 1.959e+02 2.268e+02 2.678e+02 3.550e+02, threshold=4.535e+02, percent-clipped=0.0 2023-10-04 01:34:24,926 WARNING [train.py:1204] (2/4) Exclude cut with ID _4697_210_2069_1_1526815801953_3823098_276-96593-0_sp1.1 from training. Duration: 0.7 2023-10-04 01:34:24,934 WARNING [train.py:1204] (2/4) Exclude cut with ID _15012_210_1968_1_1530856820429_5095350_688-178849-0_sp0.9 from training. Duration: 0.711125 2023-10-04 01:34:24,987 WARNING [train.py:1204] (2/4) Exclude cut with ID _25565_210_12392_1_1532423009382_3952098_372-69023-0_sp1.1 from training. Duration: 0.936375 2023-10-04 01:34:26,374 WARNING [train.py:1204] (2/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_61-327062-0_sp1.1 from training. Duration: 0.736375 2023-10-04 01:34:26,417 WARNING [train.py:1204] (2/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_215-141059-0 from training. Duration: 0.62 2023-10-04 01:34:26,702 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=1478586.6666666667, ans=0.1 2023-10-04 01:34:28,735 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.conv_module2.balancer2.prob, batch_count=1478586.6666666667, ans=0.125 2023-10-04 01:34:29,997 WARNING [train.py:1204] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_6-226750-0_sp1.1 from training. Duration: 0.8545625 2023-10-04 01:34:33,218 WARNING [train.py:1204] (2/4) Exclude cut with ID _14348_210_8341_1_1530599434996_3778640_240-232370-0_sp1.1 from training. Duration: 0.763625 2023-10-04 01:34:35,971 WARNING [train.py:1204] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_468-189058-0 from training. Duration: 0.96 2023-10-04 01:34:39,678 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.attention_skip_rate, batch_count=1478653.3333333333, ans=0.0 2023-10-04 01:34:40,806 INFO [train.py:1046] (2/4) Epoch 42, batch 4000, loss[loss=0.121, simple_loss=0.2038, pruned_loss=0.01905, over 20235.00 frames. ], tot_loss[loss=0.1542, simple_loss=0.2346, pruned_loss=0.03697, over 4700500.37 frames. ], batch size: 44, lr: 2.41e-03, grad_scale: 32.0 2023-10-04 01:34:41,048 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.1.self_attn_weights.pos_emb_skip_rate, batch_count=1478653.3333333333, ans=0.0 2023-10-04 01:34:45,100 WARNING [train.py:1204] (2/4) Exclude cut with ID _21987_210_5854_1_1531821500482_3637038_247-93340-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 01:34:50,660 WARNING [train.py:1204] (2/4) Exclude cut with ID _66868_210_9527_1_1535509287954_4561140_436-191602-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 01:34:51,310 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.3.encoder.layers.0.feed_forward1.out_whiten, num_groups=1, num_channels=512, metric=10.73 vs. limit=15.0 2023-10-04 01:34:57,786 WARNING [train.py:1204] (2/4) Exclude cut with ID _18895_210_6228_1_1531357274234_3784930_394-58203-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 01:34:57,841 WARNING [train.py:1204] (2/4) Exclude cut with ID _58605_210_12210_1_1534723195939_3904259_258-281498-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 01:34:57,869 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_33348_210_5632_1_1533086912_3324030_245-292752-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 01:34:57,899 WARNING [train.py:1204] (2/4) Exclude cut with ID _67831_210_12392_1_1535612464202_3092612_346-2148-0 from training. Duration: 0.93 2023-10-04 01:34:59,241 WARNING [train.py:1204] (2/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_369-222060-0_sp0.9 from training. Duration: 0.788875 2023-10-04 01:34:59,309 WARNING [train.py:1204] (2/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_245-174553-0 from training. Duration: 0.88 2023-10-04 01:34:59,315 WARNING [train.py:1204] (2/4) Exclude cut with ID _3541_210_2477_1_1526475408709_3263973_231-104736-0_sp1.1 from training. Duration: 0.7090625 2023-10-04 01:34:59,320 WARNING [train.py:1204] (2/4) Exclude cut with ID _44585_210_18628_1_1533605350387_4518199_280-73534-0 from training. Duration: 0.89 2023-10-04 01:35:01,317 WARNING [train.py:1204] (2/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_22-201476-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 01:35:04,761 WARNING [train.py:1204] (2/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_325-62802-0_sp0.9 from training. Duration: 0.9666875 2023-10-04 01:35:04,773 WARNING [train.py:1204] (2/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_199-274062-0_sp1.1 from training. Duration: 0.97275 2023-10-04 01:35:04,776 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_8-319957-0_sp1.1 from training. Duration: 0.87275 2023-10-04 01:35:06,048 WARNING [train.py:1204] (2/4) Exclude cut with ID _29343_210_4381_1_1532602757752_3674109_224-349726-0_sp1.1 from training. Duration: 0.936375 2023-10-04 01:35:06,059 WARNING [train.py:1204] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_520-126729-0_sp1.1 from training. Duration: 0.636375 2023-10-04 01:35:06,255 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.feed_forward1.hidden_balancer.prob, batch_count=1478720.0, ans=0.125 2023-10-04 01:35:08,764 WARNING [train.py:1204] (2/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_239-22076-0_sp1.1 from training. Duration: 0.77275 2023-10-04 01:35:09,515 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9012W0011-501146 from training. Duration: 0.896 2023-10-04 01:35:10,825 WARNING [train.py:1204] (2/4) Exclude cut with ID _29857_210_20034_1_1532395886753_3798160_279-185801-0_sp1.1 from training. Duration: 0.82725 2023-10-04 01:35:10,876 WARNING [train.py:1204] (2/4) Exclude cut with ID _43552_210_8913_1_1533726031059_3523429_261-223933-0_sp1.1 from training. Duration: 0.92725 2023-10-04 01:35:13,630 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9029W0025-509426 from training. Duration: 0.896 2023-10-04 01:35:13,701 WARNING [train.py:1204] (2/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_337-181502-0_sp1.1 from training. Duration: 0.5909375 2023-10-04 01:35:14,967 WARNING [train.py:1204] (2/4) Exclude cut with ID _68635_210_9317_1_1535770813410_4315870_159-332599-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 01:35:17,996 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.conv_module1.balancer2.min_positive, batch_count=1478786.6666666667, ans=0.05 2023-10-04 01:35:20,519 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_161-276996-0 from training. Duration: 0.95 2023-10-04 01:35:20,561 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7088_210_1874_1_1527939776_1101113_127-68870-0_sp1.1 from training. Duration: 0.936375 2023-10-04 01:35:21,335 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.2.encoder.layers.1.self_attn2.whiten, num_groups=1, num_channels=384, metric=15.83 vs. limit=22.5 2023-10-04 01:35:23,364 WARNING [train.py:1204] (2/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_322-217220-0_sp1.1 from training. Duration: 0.7818125 2023-10-04 01:35:25,222 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9072W0002-530636 from training. Duration: 0.768 2023-10-04 01:35:26,551 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_340-336408-0_sp1.1 from training. Duration: 0.8454375 2023-10-04 01:35:26,614 WARNING [train.py:1204] (2/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_395-37222-0 from training. Duration: 0.76 2023-10-04 01:35:27,675 WARNING [train.py:1204] (2/4) Exclude cut with ID _64099_210_17301_1_1535526977656_1578188_159-209156-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 01:35:27,758 WARNING [train.py:1204] (2/4) Exclude cut with ID _53950_210_5816_1_1534417264919_3862951_28-29681-0_sp1.1 from training. Duration: 0.92725 2023-10-04 01:35:29,140 WARNING [train.py:1204] (2/4) Exclude cut with ID _4681_210_3525_1_1527250689464_120889_15-126760-0_sp1.1 from training. Duration: 0.736375 2023-10-04 01:35:32,537 WARNING [train.py:1204] (2/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_242-3472-0_sp1.1 from training. Duration: 0.663625 2023-10-04 01:35:32,580 WARNING [train.py:1204] (2/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_436-186311-0_sp0.9 from training. Duration: 0.72225 2023-10-04 01:35:34,300 WARNING [train.py:1204] (2/4) Exclude cut with ID _3675_210_4557_1_1526639934408_4182089_452-172147-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 01:35:35,694 WARNING [train.py:1204] (2/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_80-147301-0 from training. Duration: 0.84 2023-10-04 01:35:35,725 WARNING [train.py:1204] (2/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_351-107588-0_sp1.1 from training. Duration: 0.92725 2023-10-04 01:35:38,437 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9111W0002-550781 from training. Duration: 0.896 2023-10-04 01:35:43,328 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.self_attn_weights.pos_emb_skip_rate, batch_count=1478920.0, ans=0.0 2023-10-04 01:35:44,515 WARNING [train.py:1204] (2/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_465-156500-0_sp1.1 from training. Duration: 0.6545625 2023-10-04 01:35:45,992 WARNING [train.py:1204] (2/4) Exclude cut with ID _13938_210_9988_1_1530518718978_3693960_304-112784-0_sp1.1 from training. Duration: 0.4181875 2023-10-04 01:35:48,677 WARNING [train.py:1204] (2/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_202-67017-0_sp1.1 from training. Duration: 0.7454375 2023-10-04 01:35:48,725 WARNING [train.py:1204] (2/4) Exclude cut with ID _58414_210_8254_1_1535421609162_7151182_448-4358-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 01:35:48,964 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.conv_module1.balancer2.prob, batch_count=1478920.0, ans=0.125 2023-10-04 01:35:50,040 WARNING [train.py:1204] (2/4) Exclude cut with ID _66840_210_5219_1_1535452168426_4196349_203-243234-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 01:35:51,392 WARNING [train.py:1204] (2/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_201-213513-0_sp1.1 from training. Duration: 0.97275 2023-10-04 01:35:54,033 INFO [train.py:1046] (2/4) Epoch 42, batch 4050, loss[loss=0.1587, simple_loss=0.2331, pruned_loss=0.04216, over 23331.00 frames. ], tot_loss[loss=0.155, simple_loss=0.2354, pruned_loss=0.03733, over 4703736.29 frames. ], batch size: 105, lr: 2.41e-03, grad_scale: 32.0 2023-10-04 01:35:55,524 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6291_210_3273_1_1527683649_1078498_117-187560-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 01:35:55,764 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.conv_module2.balancer1.prob, batch_count=1478986.6666666667, ans=0.125 2023-10-04 01:35:57,711 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.conv_module1.balancer2.prob, batch_count=1478986.6666666667, ans=0.125 2023-10-04 01:35:58,590 WARNING [train.py:1204] (2/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_135-174080-0_sp0.9 from training. Duration: 0.588875 2023-10-04 01:35:58,632 WARNING [train.py:1204] (2/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_45-265684-0 from training. Duration: 0.72 2023-10-04 01:36:01,319 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_435-313305-0_sp1.1 from training. Duration: 0.6454375 2023-10-04 01:36:01,338 WARNING [train.py:1204] (2/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_235-307337-0_sp1.1 from training. Duration: 0.8545625 2023-10-04 01:36:02,620 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7762_210_1794_1_1528199508_4057408_419-125009-0_sp0.9 from training. Duration: 0.92225 2023-10-04 01:36:04,396 WARNING [train.py:1204] (2/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_215-141059-0_sp1.1 from training. Duration: 0.563625 2023-10-04 01:36:05,763 WARNING [train.py:1204] (2/4) Exclude cut with ID _27013_210_1389_1_1532437173240_3252649_222-125573-0_sp1.1 from training. Duration: 0.8909375 2023-10-04 01:36:08,982 WARNING [train.py:1204] (2/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_126-274694-0_sp1.1 from training. Duration: 0.8909375 2023-10-04 01:36:10,449 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_2592_210_2290_1_1525697025_4882925_157-70512-0_sp1.1 from training. Duration: 0.82725 2023-10-04 01:36:10,501 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_292-265928-0_sp0.9 from training. Duration: 0.5666875 2023-10-04 01:36:11,985 WARNING [train.py:1204] (2/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_1211-226066-0_sp1.1 from training. Duration: 0.6909375 2023-10-04 01:36:13,394 WARNING [train.py:1204] (2/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_394-265747-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 01:36:16,781 WARNING [train.py:1204] (2/4) Exclude cut with ID _48782_210_12658_1_1534467701544_2300422_239-67139-0_sp1.1 from training. Duration: 0.97275 2023-10-04 01:36:19,478 WARNING [train.py:1204] (2/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_329-113173-0_sp1.1 from training. Duration: 0.563625 2023-10-04 01:36:19,710 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.bypass.skip_rate, batch_count=1479053.3333333333, ans=0.04949747468305833 2023-10-04 01:36:22,367 WARNING [train.py:1204] (2/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_81-75849-0_sp0.9 from training. Duration: 0.4333125 2023-10-04 01:36:23,790 WARNING [train.py:1204] (2/4) Exclude cut with ID _9235_210_5219_1_1528891154253_8046120_1003-101638-0 from training. Duration: 0.73 2023-10-04 01:36:23,821 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9309W0189-640316 from training. Duration: 0.64 2023-10-04 01:36:25,502 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.conv_module1.balancer2.prob, batch_count=1479120.0, ans=0.125 2023-10-04 01:36:26,584 WARNING [train.py:1204] (2/4) Exclude cut with ID _45329_210_7005_1_1534157951211_6774339_503-287883-0_sp1.1 from training. Duration: 0.8 2023-10-04 01:36:31,215 WARNING [train.py:1204] (2/4) Exclude cut with ID _8833_210_5219_1_1528592665210_7553059_611-80313-0 from training. Duration: 0.64 2023-10-04 01:36:32,585 WARNING [train.py:1204] (2/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_335-106626-0_sp1.1 from training. Duration: 0.963625 2023-10-04 01:36:36,360 WARNING [train.py:1204] (2/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_237-44332-0_sp1.1 from training. Duration: 0.8545625 2023-10-04 01:36:39,170 WARNING [train.py:1204] (2/4) Exclude cut with ID _46952_210_5986_1_1534230010357_3107379_178-147341-0_sp1.1 from training. Duration: 0.97275 2023-10-04 01:36:39,221 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_13091_210_7925_1_1530609447_8987354_946-154426-0_sp1.1 from training. Duration: 0.936375 2023-10-04 01:36:39,245 WARNING [train.py:1204] (2/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_326-17061-0_sp1.1 from training. Duration: 0.8545625 2023-10-04 01:36:41,837 WARNING [train.py:1204] (2/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_38-169920-0_sp1.1 from training. Duration: 0.82725 2023-10-04 01:36:46,570 WARNING [train.py:1204] (2/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_283-220372-0 from training. Duration: 0.56 2023-10-04 01:36:46,578 WARNING [train.py:1204] (2/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_252-216674-0_sp1.1 from training. Duration: 0.4909375 2023-10-04 01:36:46,863 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.conv_skip_rate, batch_count=1479186.6666666667, ans=0.0 2023-10-04 01:36:48,000 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_202-91290-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 01:36:48,213 INFO [scaling.py:1118] (2/4) WithLoss: name=encoder.encoders.0.layers.1.self_attn_weights, loss-sum=0.000e+00 2023-10-04 01:36:49,397 WARNING [train.py:1204] (2/4) Exclude cut with ID _41314_210_12651_1_1533459476543_3766460_80-74184-0 from training. Duration: 0.83 2023-10-04 01:36:50,639 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.616e+02 1.894e+02 2.127e+02 2.287e+02 3.562e+02, threshold=4.254e+02, percent-clipped=0.0 2023-10-04 01:36:53,620 WARNING [train.py:1204] (2/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_411-271358-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 01:37:00,967 WARNING [train.py:1204] (2/4) Exclude cut with ID _68777_210_4353_1_1535810390731_3117904_129-166012-0 from training. Duration: 0.85 2023-10-04 01:37:01,261 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.balancer1.prob, batch_count=1479253.3333333333, ans=0.125 2023-10-04 01:37:02,268 WARNING [train.py:1204] (2/4) Exclude cut with ID _44982_210_1511_1_1534312820108_3726029_410-109759-0_sp1.1 from training. Duration: 0.963625 2023-10-04 01:37:02,280 WARNING [train.py:1204] (2/4) Exclude cut with ID _14224_210_4381_1_1530529062037_4264010_364-22817-0_sp1.1 from training. Duration: 0.6454375 2023-10-04 01:37:03,829 WARNING [train.py:1204] (2/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_89-242335-0 from training. Duration: 0.95 2023-10-04 01:37:03,838 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_12868_210_6753_1_1530702479_3610564_151-325626-0 from training. Duration: 0.97 2023-10-04 01:37:03,847 WARNING [train.py:1204] (2/4) Exclude cut with ID _6275_210_2084_1_1528110062213_7669049_786-264332-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 01:37:05,698 WARNING [train.py:1204] (2/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_35-153886-0_sp1.1 from training. Duration: 0.763625 2023-10-04 01:37:07,630 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_24007_210_1794_1_1531918925_1663875_116-100916-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 01:37:07,650 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_162-262717-0_sp1.1 from training. Duration: 0.7545625 2023-10-04 01:37:08,897 INFO [train.py:1046] (2/4) Epoch 42, batch 4100, loss[loss=0.159, simple_loss=0.2289, pruned_loss=0.04457, over 22893.00 frames. ], tot_loss[loss=0.1563, simple_loss=0.2368, pruned_loss=0.03796, over 4698162.17 frames. ], batch size: 323, lr: 2.41e-03, grad_scale: 32.0 2023-10-04 01:37:16,198 WARNING [train.py:1204] (2/4) Exclude cut with ID _8263_210_1664_1_1528438953494_294100_40-204731-0 from training. Duration: 0.5 2023-10-04 01:37:16,306 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_12868_210_6753_1_1530702479_3610564_416-293746-0 from training. Duration: 0.9 2023-10-04 01:37:17,694 WARNING [train.py:1204] (2/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_605-219732-0 from training. Duration: 0.94 2023-10-04 01:37:19,002 WARNING [train.py:1204] (2/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_454-123002-0 from training. Duration: 0.69 2023-10-04 01:37:19,008 WARNING [train.py:1204] (2/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_25-304273-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 01:37:19,061 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6860_210_4852_1_1527917457_3833327_451-39702-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 01:37:20,377 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_304-16505-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 01:37:20,403 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6291_210_3273_1_1527683649_1078498_4-266348-0_sp1.1 from training. Duration: 0.7090625 2023-10-04 01:37:21,744 WARNING [train.py:1204] (2/4) Exclude cut with ID ID0149W0001-753701 from training. Duration: 0.989 2023-10-04 01:37:24,501 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_41251_210_15368_1_1533429767_4820695_394-55393-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 01:37:27,044 WARNING [train.py:1204] (2/4) Exclude cut with ID _8793_210_6310_1_1528542985139_2310009_121-179782-0_sp1.1 from training. Duration: 0.72725 2023-10-04 01:37:27,059 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_2729_210_3528_1_1525863505_3882214_421-73047-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 01:37:27,121 WARNING [train.py:1204] (2/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_278-154492-0_sp0.9 from training. Duration: 0.8444375 2023-10-04 01:37:30,676 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.ff2_skip_rate, batch_count=1479386.6666666667, ans=0.0 2023-10-04 01:37:31,913 WARNING [train.py:1204] (2/4) Exclude cut with ID _14170_210_5219_1_1530525949868_4469650_412-317077-0_sp0.9 from training. Duration: 0.8333125 2023-10-04 01:37:33,239 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_727-138317-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 01:37:33,293 WARNING [train.py:1204] (2/4) Exclude cut with ID _25808_210_13292_1_1532343514562_7366109_345-335048-0_sp1.1 from training. Duration: 0.92725 2023-10-04 01:37:34,668 WARNING [train.py:1204] (2/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_506-23200-0 from training. Duration: 0.97 2023-10-04 01:37:34,755 WARNING [train.py:1204] (2/4) Exclude cut with ID _44718_210_5816_1_1534050051869_6469030_643-245187-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 01:37:34,759 WARNING [train.py:1204] (2/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_42-186162-0_sp1.1 from training. Duration: 0.836375 2023-10-04 01:37:35,976 WARNING [train.py:1204] (2/4) Exclude cut with ID _3231_210_2106_1_1526205864934_6784220_171-256004-0_sp1.1 from training. Duration: 0.7818125 2023-10-04 01:37:35,994 WARNING [train.py:1204] (2/4) Exclude cut with ID _25914_210_6400_1_1532141317058_644740_43-248478-0_sp1.1 from training. Duration: 0.936375 2023-10-04 01:37:37,380 WARNING [train.py:1204] (2/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_577-287150-0 from training. Duration: 0.82 2023-10-04 01:37:39,568 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_46015_210_7234_1_1533863319_3087837_376-276068-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 01:37:40,968 WARNING [train.py:1204] (2/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_51-200220-0 from training. Duration: 0.56 2023-10-04 01:37:41,068 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_452-96101-0_sp1.1 from training. Duration: 0.963625 2023-10-04 01:37:45,575 WARNING [train.py:1204] (2/4) Exclude cut with ID _13252_210_1925_1_1530671372047_3336909_263-168388-0_sp1.1 from training. Duration: 0.7818125 2023-10-04 01:37:45,576 WARNING [train.py:1204] (2/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_469-94673-0 from training. Duration: 0.79 2023-10-04 01:37:46,944 WARNING [train.py:1204] (2/4) Exclude cut with ID _14922_210_6228_1_1530707435273_3693310_303-228738-0_sp1.1 from training. Duration: 0.763625 2023-10-04 01:37:47,225 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.self_attn_weights.pos_emb_skip_rate, batch_count=1479453.3333333333, ans=0.0 2023-10-04 01:37:48,347 WARNING [train.py:1204] (2/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_262-250826-0_sp1.1 from training. Duration: 0.9 2023-10-04 01:37:48,376 WARNING [train.py:1204] (2/4) Exclude cut with ID _68777_210_4353_1_1535810390731_3117904_129-166012-0_sp1.1 from training. Duration: 0.77275 2023-10-04 01:37:49,877 WARNING [train.py:1204] (2/4) Exclude cut with ID _58895_210_8750_1_1535184081320_3649089_265-323810-0 from training. Duration: 0.96 2023-10-04 01:37:51,225 WARNING [train.py:1204] (2/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_120-76843-0_sp0.9 from training. Duration: 0.611125 2023-10-04 01:37:51,278 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7544_210_6753_1_1528587722_3576686_295-258479-0_sp0.9 from training. Duration: 0.9333125 2023-10-04 01:37:52,775 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.attention_skip_rate, batch_count=1479520.0, ans=0.0 2023-10-04 01:37:53,951 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7046_210_6753_1_1527895337_6920917_783-278109-0 from training. Duration: 0.77 2023-10-04 01:37:54,004 WARNING [train.py:1204] (2/4) Exclude cut with ID _42326_210_7770_1_1533814282531_3707269_113-308323-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 01:37:54,039 WARNING [train.py:1204] (2/4) Exclude cut with ID _7852_210_5219_1_1528286471316_8139400_242-105336-0_sp0.9 from training. Duration: 0.8 2023-10-04 01:37:56,938 WARNING [train.py:1204] (2/4) Exclude cut with ID _56235_210_10640_1_1534557335235_4161099_114-277905-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 01:37:58,634 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.conv_skip_rate, batch_count=1479520.0, ans=0.0 2023-10-04 01:38:01,724 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_13118_210_7925_1_1530755612_7608297_42-66683-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 01:38:04,593 WARNING [train.py:1204] (2/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_901-79438-0_sp1.1 from training. Duration: 0.8545625 2023-10-04 01:38:06,385 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_30280_210_1881_1_1532872779_3660139_211-182194-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 01:38:08,864 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.4.encoder.layers.1.conv_module2.whiten, num_groups=1, num_channels=384, metric=3.55 vs. limit=15.0 2023-10-04 01:38:13,865 WARNING [train.py:1204] (2/4) Exclude cut with ID _14898_210_9206_1_1530766812377_4177190_621-45732-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 01:38:13,873 WARNING [train.py:1204] (2/4) Exclude cut with ID _35350_210_16040_1_1533040191183_4075299_224-133775-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 01:38:18,429 WARNING [train.py:1204] (2/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_147-293311-0_sp1.1 from training. Duration: 0.8545625 2023-10-04 01:38:19,807 WARNING [train.py:1204] (2/4) Exclude cut with ID _12302_210_5219_1_1529924446466_8107280_835-279754-0_sp1.1 from training. Duration: 0.8181875 2023-10-04 01:38:22,315 INFO [train.py:1046] (2/4) Epoch 42, batch 4150, loss[loss=0.1662, simple_loss=0.2481, pruned_loss=0.04209, over 24641.00 frames. ], tot_loss[loss=0.1562, simple_loss=0.2367, pruned_loss=0.03785, over 4702239.97 frames. ], batch size: 65, lr: 2.41e-03, grad_scale: 16.0 2023-10-04 01:38:22,620 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.conv_skip_rate, batch_count=1479653.3333333333, ans=0.0 2023-10-04 01:38:23,811 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8453_210_4211_1_1528502940_8562730_347-330673-0_sp0.9 from training. Duration: 0.8 2023-10-04 01:38:25,211 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_213-103282-0_sp0.9 from training. Duration: 0.8666875 2023-10-04 01:38:25,285 WARNING [train.py:1204] (2/4) Exclude cut with ID _49728_210_12651_1_1534041034294_6788150_66-43496-0_sp1.1 from training. Duration: 0.97275 2023-10-04 01:38:25,287 WARNING [train.py:1204] (2/4) Exclude cut with ID _19023_210_4353_1_1531378892552_5820780_28-36615-0_sp1.1 from training. Duration: 0.8818125 2023-10-04 01:38:27,560 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.0.layers.0.feed_forward3.out_whiten, num_groups=1, num_channels=192, metric=12.65 vs. limit=15.0 2023-10-04 01:38:29,438 WARNING [train.py:1204] (2/4) Exclude cut with ID _14349_210_8341_1_1530615710818_3442470_115-285975-0 from training. Duration: 0.86 2023-10-04 01:38:29,470 WARNING [train.py:1204] (2/4) Exclude cut with ID _12734_210_8341_1_1530183614296_3891929_236-47464-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 01:38:30,809 WARNING [train.py:1204] (2/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_177-236809-0 from training. Duration: 0.49 2023-10-04 01:38:30,873 WARNING [train.py:1204] (2/4) Exclude cut with ID _13678_210_7497_1_1530530069226_3463170_371-3221-0 from training. Duration: 0.9 2023-10-04 01:38:30,887 WARNING [train.py:1204] (2/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_48-338976-0 from training. Duration: 0.89 2023-10-04 01:38:33,523 WARNING [train.py:1204] (2/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_1167-309744-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 01:38:36,863 WARNING [train.py:1204] (2/4) Exclude cut with ID _30327_210_16049_1_1532484161295_3924169_401-70819-0_sp0.9 from training. Duration: 0.9555625 2023-10-04 01:38:36,872 WARNING [train.py:1204] (2/4) Exclude cut with ID _55134_210_21982_1_1534590028377_3882170_346-91022-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 01:38:36,993 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.feed_forward2.hidden_balancer.prob, batch_count=1479720.0, ans=0.125 2023-10-04 01:38:41,428 WARNING [train.py:1204] (2/4) Exclude cut with ID _14459_210_2228_1_1531443641946_7516700_174-118801-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 01:38:42,899 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_269-278006-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 01:38:42,951 WARNING [train.py:1204] (2/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_390-339137-0_sp0.9 from training. Duration: 0.62225 2023-10-04 01:38:45,779 WARNING [train.py:1204] (2/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_253-308377-0_sp1.1 from training. Duration: 0.4454375 2023-10-04 01:38:45,796 WARNING [train.py:1204] (2/4) Exclude cut with ID _23825_210_13449_1_1531915313526_3497150_240-271367-0_sp1.1 from training. Duration: 0.8818125 2023-10-04 01:38:47,187 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1015-343884-0_sp0.9 from training. Duration: 0.688875 2023-10-04 01:38:50,441 WARNING [train.py:1204] (2/4) Exclude cut with ID _23514_210_1389_1_1531832139785_3031439_335-48210-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 01:38:53,272 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_309-65177-0_sp1.1 from training. Duration: 0.8 2023-10-04 01:38:53,839 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.4.encoder.layers.0.nonlin_attention.whiten1, num_groups=1, num_channels=288, metric=7.52 vs. limit=10.0 2023-10-04 01:38:54,643 WARNING [train.py:1204] (2/4) Exclude cut with ID _14368_210_4220_1_1530532714870_3595529_231-137892-0 from training. Duration: 0.99 2023-10-04 01:38:56,094 WARNING [train.py:1204] (2/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_57-270037-0 from training. Duration: 0.77 2023-10-04 01:38:56,106 WARNING [train.py:1204] (2/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_465-214726-0_sp1.1 from training. Duration: 0.7818125 2023-10-04 01:38:57,374 WARNING [train.py:1204] (2/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_106-300985-0 from training. Duration: 0.9 2023-10-04 01:38:57,381 WARNING [train.py:1204] (2/4) Exclude cut with ID _14632_210_9988_1_1530691279392_3923200_161-129530-0_sp1.1 from training. Duration: 0.87275 2023-10-04 01:38:57,397 WARNING [train.py:1204] (2/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_514-88370-0_sp1.1 from training. Duration: 0.963625 2023-10-04 01:39:01,911 WARNING [train.py:1204] (2/4) Exclude cut with ID _56495_210_4234_1_1534748080334_4136219_305-334505-0_sp1.1 from training. Duration: 0.92725 2023-10-04 01:39:01,984 WARNING [train.py:1204] (2/4) Exclude cut with ID _42875_210_14856_1_1533704397487_4012433_61-264914-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 01:39:04,819 WARNING [train.py:1204] (2/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_91-148831-0 from training. Duration: 0.99 2023-10-04 01:39:05,036 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.conv_module1.balancer1.prob, batch_count=1479853.3333333333, ans=0.125 2023-10-04 01:39:08,628 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_435-313305-0_sp0.9 from training. Duration: 0.788875 2023-10-04 01:39:10,059 WARNING [train.py:1204] (2/4) Exclude cut with ID _16744_210_5986_1_1531724405384_3907567_242-111316-0_sp1.1 from training. Duration: 0.8454375 2023-10-04 01:39:10,116 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_257-20733-0 from training. Duration: 0.76 2023-10-04 01:39:11,416 WARNING [train.py:1204] (2/4) Exclude cut with ID _8064_210_5399_1_1528372299291_4282973_4-91037-0_sp1.1 from training. Duration: 0.8 2023-10-04 01:39:12,917 WARNING [train.py:1204] (2/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_3-276701-0 from training. Duration: 0.91 2023-10-04 01:39:15,613 WARNING [train.py:1204] (2/4) Exclude cut with ID _3992_210_1747_1_1526644816031_3731060_36-147855-0_sp1.1 from training. Duration: 0.7090625 2023-10-04 01:39:16,814 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.3.encoder.layers.3.conv_module1.whiten, num_groups=1, num_channels=512, metric=11.25 vs. limit=15.0 2023-10-04 01:39:17,387 WARNING [train.py:1204] (2/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_1296-298671-0_sp1.1 from training. Duration: 0.963625 2023-10-04 01:39:18,780 WARNING [train.py:1204] (2/4) Exclude cut with ID _68965_210_1883_1_1535698962272_3497490_309-334593-0_sp1.1 from training. Duration: 0.92725 2023-10-04 01:39:18,978 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.conv_module2.balancer2.prob, batch_count=1479853.3333333333, ans=0.125 2023-10-04 01:39:19,991 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.549e+02 1.917e+02 2.149e+02 2.587e+02 4.183e+02, threshold=4.298e+02, percent-clipped=0.0 2023-10-04 01:39:20,105 WARNING [train.py:1204] (2/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_128-296581-0 from training. Duration: 0.74 2023-10-04 01:39:20,105 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_17613_210_2151_1_1531137569_2366160_170-299079-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 01:39:20,107 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6287_210_3273_1_1527509506_2653716_182-199887-0_sp1.1 from training. Duration: 0.463625 2023-10-04 01:39:22,832 WARNING [train.py:1204] (2/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_80-135200-0_sp1.1 from training. Duration: 0.7181875 2023-10-04 01:39:24,332 WARNING [train.py:1204] (2/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_34-183685-0 from training. Duration: 0.98 2023-10-04 01:39:25,655 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8278_210_2550_1_1528637099_4220413_418-210155-0_sp1.1 from training. Duration: 0.92725 2023-10-04 01:39:25,660 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8853_210_3273_1_1528633899_818968_51-187685-0_sp0.9 from training. Duration: 0.7666875 2023-10-04 01:39:25,683 WARNING [train.py:1204] (2/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_332-221057-0_sp1.1 from training. Duration: 0.6181875 2023-10-04 01:39:26,991 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_2221_210_1988_1_1525597051_4935541_415-296882-0 from training. Duration: 0.77 2023-10-04 01:39:27,020 WARNING [train.py:1204] (2/4) Exclude cut with ID _33640_210_2655_1_1533117605479_4855469_770-278185-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 01:39:27,060 WARNING [train.py:1204] (2/4) Exclude cut with ID _5287_210_2084_1_1527418883040_3878670_333-48508-0_sp1.1 from training. Duration: 0.5454375 2023-10-04 01:39:28,461 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_43802_210_2290_1_1533516203_8829504_142-276839-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 01:39:29,911 WARNING [train.py:1204] (2/4) Exclude cut with ID _28394_210_8913_1_1532430049518_3650118_126-139371-0_sp1.1 from training. Duration: 0.92725 2023-10-04 01:39:29,925 WARNING [train.py:1204] (2/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_76-203867-0 from training. Duration: 0.72 2023-10-04 01:39:31,272 WARNING [train.py:1204] (2/4) Exclude cut with ID _6194_210_1534_1_1527511826160_3941194_14-282876-0_sp0.9 from training. Duration: 0.788875 2023-10-04 01:39:35,908 INFO [train.py:1046] (2/4) Epoch 42, batch 4200, loss[loss=0.1482, simple_loss=0.2423, pruned_loss=0.02705, over 24292.00 frames. ], tot_loss[loss=0.1557, simple_loss=0.236, pruned_loss=0.03773, over 4688067.02 frames. ], batch size: 74, lr: 2.41e-03, grad_scale: 16.0 2023-10-04 01:39:37,276 WARNING [train.py:1204] (2/4) Exclude cut with ID _35355_210_16040_1_1533212988977_4016229_74-190076-0_sp1.1 from training. Duration: 0.72725 2023-10-04 01:39:37,425 WARNING [train.py:1204] (2/4) Exclude cut with ID _33920_210_4220_1_1533257851500_3084399_198-334063-0 from training. Duration: 0.94 2023-10-04 01:39:39,419 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6291_210_3273_1_1527683649_1078498_4-266348-0_sp0.9 from training. Duration: 0.8666875 2023-10-04 01:39:39,655 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.attention_skip_rate, batch_count=1479986.6666666667, ans=0.0 2023-10-04 01:39:42,734 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_23856_210_4238_1_1531994785_3087934_122-38075-0_sp1.1 from training. Duration: 0.936375 2023-10-04 01:39:42,817 WARNING [train.py:1204] (2/4) Exclude cut with ID _4697_210_2069_1_1526815801953_3823098_276-96593-0_sp0.9 from training. Duration: 0.8555625 2023-10-04 01:39:44,170 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_308-278734-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 01:39:44,172 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_5190_210_4557_1_1527245078_2965646_353-250603-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 01:39:45,626 WARNING [train.py:1204] (2/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_472-216663-0 from training. Duration: 0.92 2023-10-04 01:39:49,017 WARNING [train.py:1204] (2/4) Exclude cut with ID _7537_210_2084_1_1528502395431_3924359_14-312906-0 from training. Duration: 0.84 2023-10-04 01:39:50,107 WARNING [train.py:1204] (2/4) Exclude cut with ID _29003_210_19596_1_1532507391720_3894098_559-236660-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 01:39:51,495 WARNING [train.py:1204] (2/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_312-204798-0_sp1.1 from training. Duration: 0.8454375 2023-10-04 01:39:53,022 WARNING [train.py:1204] (2/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_17-309070-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 01:39:55,712 WARNING [train.py:1204] (2/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_344-152526-0_sp0.9 from training. Duration: 0.588875 2023-10-04 01:39:58,408 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_41452_210_7291_1_1533437687_3941281_405-11559-0_sp1.1 from training. Duration: 0.82725 2023-10-04 01:39:58,439 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_48730_210_5399_1_1533885886_2212769_157-124052-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 01:39:59,852 WARNING [train.py:1204] (2/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_415-78792-0 from training. Duration: 0.81 2023-10-04 01:39:59,864 WARNING [train.py:1204] (2/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_345-340690-0_sp1.1 from training. Duration: 0.8454375 2023-10-04 01:40:01,299 WARNING [train.py:1204] (2/4) Exclude cut with ID _23825_210_13449_1_1531915313526_3497150_124-8138-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 01:40:01,341 WARNING [train.py:1204] (2/4) Exclude cut with ID _49258_210_7994_1_1534122017863_3738861_194-24864-0_sp1.1 from training. Duration: 0.936375 2023-10-04 01:40:01,358 WARNING [train.py:1204] (2/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_369-222060-0_sp1.1 from training. Duration: 0.6454375 2023-10-04 01:40:04,099 WARNING [train.py:1204] (2/4) Exclude cut with ID _12369_210_4381_1_1530356078581_4388520_480-13146-0_sp0.9 from training. Duration: 0.9444375 2023-10-04 01:40:06,183 WARNING [train.py:1204] (2/4) Exclude cut with ID _13253_210_1925_1_1530757775819_3577873_191-144977-0 from training. Duration: 0.78 2023-10-04 01:40:06,216 WARNING [train.py:1204] (2/4) Exclude cut with ID _30304_210_6319_1_1532476827261_7582060_1128-140266-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 01:40:09,709 INFO [scaling.py:1118] (2/4) WithLoss: name=encoder.encoders.0.layers.0.self_attn_weights, loss-sum=0.000e+00 2023-10-04 01:40:09,735 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=1480120.0, ans=0.1 2023-10-04 01:40:11,491 WARNING [train.py:1204] (2/4) Exclude cut with ID _8772_210_1850_1_1528635595972_3664011_247-336448-0_sp0.9 from training. Duration: 0.82225 2023-10-04 01:40:11,568 WARNING [train.py:1204] (2/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_318-59097-0_sp1.1 from training. Duration: 0.6909375 2023-10-04 01:40:14,158 WARNING [train.py:1204] (2/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_540-131164-0_sp1.1 from training. Duration: 0.836375 2023-10-04 01:40:14,470 INFO [scaling.py:1118] (2/4) WithLoss: name=encoder.encoders.4.encoder.layers.1.self_attn_weights, loss-sum=0.000e+00 2023-10-04 01:40:15,486 WARNING [train.py:1204] (2/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_39-110836-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 01:40:17,469 WARNING [train.py:1204] (2/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_860-96198-0_sp1.1 from training. Duration: 0.663625 2023-10-04 01:40:17,477 WARNING [train.py:1204] (2/4) Exclude cut with ID _15133_210_6228_1_1530794512074_3080379_29-264458-0 from training. Duration: 0.58 2023-10-04 01:40:17,503 WARNING [train.py:1204] (2/4) Exclude cut with ID _55209_210_1531_1_1534675828996_4597160_797-348343-0_sp1.1 from training. Duration: 0.963625 2023-10-04 01:40:18,815 WARNING [train.py:1204] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_23-189234-0_sp1.1 from training. Duration: 0.7545625 2023-10-04 01:40:24,251 WARNING [train.py:1204] (2/4) Exclude cut with ID _5410_210_1388_1_1527397055795_1719020_39-112221-0_sp1.1 from training. Duration: 0.62725 2023-10-04 01:40:25,634 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_43802_210_2290_1_1533516203_8829504_100-348441-0_sp1.1 from training. Duration: 0.82725 2023-10-04 01:40:32,275 WARNING [train.py:1204] (2/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_143-86344-0_sp1.1 from training. Duration: 0.7 2023-10-04 01:40:33,841 WARNING [train.py:1204] (2/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_126-29104-0 from training. Duration: 0.85 2023-10-04 01:40:37,130 WARNING [train.py:1204] (2/4) Exclude cut with ID _39846_210_12651_1_1533603651992_1318030_138-31771-0_sp1.1 from training. Duration: 0.963625 2023-10-04 01:40:41,814 WARNING [train.py:1204] (2/4) Exclude cut with ID _2220_210_1819_1_1525509109898_3658680_50-99837-0_sp1.1 from training. Duration: 0.4909375 2023-10-04 01:40:43,603 WARNING [train.py:1204] (2/4) Exclude cut with ID _6274_210_2084_1_1527937583837_7494020_836-210550-0_sp1.1 from training. Duration: 0.97275 2023-10-04 01:40:44,992 WARNING [train.py:1204] (2/4) Exclude cut with ID _28323_210_16187_1_1532480395667_4100300_306-74561-0 from training. Duration: 0.94 2023-10-04 01:40:46,804 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.conv_skip_rate, batch_count=1480253.3333333333, ans=0.0 2023-10-04 01:40:49,362 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_177-58005-0_sp1.1 from training. Duration: 0.563625 2023-10-04 01:40:50,741 INFO [train.py:1046] (2/4) Epoch 42, batch 4250, loss[loss=0.1636, simple_loss=0.253, pruned_loss=0.03707, over 24587.00 frames. ], tot_loss[loss=0.155, simple_loss=0.2352, pruned_loss=0.03743, over 4693764.44 frames. ], batch size: 71, lr: 2.41e-03, grad_scale: 16.0 2023-10-04 01:40:52,686 WARNING [train.py:1204] (2/4) Exclude cut with ID _54871_210_22356_1_1534665798253_3856359_19-342632-0_sp1.1 from training. Duration: 0.82725 2023-10-04 01:40:52,706 WARNING [train.py:1204] (2/4) Exclude cut with ID _35355_210_16040_1_1533212988977_4016229_74-190076-0_sp0.9 from training. Duration: 0.888875 2023-10-04 01:40:53,151 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.5.encoder.layers.1.conv_module2.whiten, num_groups=1, num_channels=256, metric=5.41 vs. limit=15.0 2023-10-04 01:40:56,956 WARNING [train.py:1204] (2/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_285-97786-0_sp1.1 from training. Duration: 0.97275 2023-10-04 01:40:59,851 WARNING [train.py:1204] (2/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_337-181502-0_sp0.9 from training. Duration: 0.72225 2023-10-04 01:40:59,890 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_353-182855-0 from training. Duration: 0.96 2023-10-04 01:40:59,927 WARNING [train.py:1204] (2/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_409-327730-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 01:41:02,813 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.conv_module2.balancer1.prob, batch_count=1480320.0, ans=0.125 2023-10-04 01:41:03,988 WARNING [train.py:1204] (2/4) Exclude cut with ID _8030_210_7497_1_1528444886781_3531089_373-19403-0_sp1.1 from training. Duration: 0.97275 2023-10-04 01:41:05,473 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.1.attention_skip_rate, batch_count=1480386.6666666667, ans=0.0 2023-10-04 01:41:08,992 WARNING [train.py:1204] (2/4) Exclude cut with ID _6789_210_1534_1_1527857937218_3118950_231-340237-0_sp1.1 from training. Duration: 0.936375 2023-10-04 01:41:13,533 WARNING [train.py:1204] (2/4) Exclude cut with ID _6493_210_1747_1_1528459210424_4012470_536-57832-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 01:41:13,546 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7046_210_6753_1_1527895337_6920917_756-116002-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 01:41:14,909 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_37152_210_9697_1_1533106573_3844188_29-239070-0_sp1.1 from training. Duration: 0.8909375 2023-10-04 01:41:14,912 WARNING [train.py:1204] (2/4) Exclude cut with ID _19023_210_4353_1_1531378892552_5820780_61-334539-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 01:41:17,802 WARNING [train.py:1204] (2/4) Exclude cut with ID _14170_210_5219_1_1530525949868_4469650_159-28488-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 01:41:17,896 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_40896_210_15710_1_1533433956_7100802_764-71272-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 01:41:18,074 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.self_attn_weights.pos_emb_skip_rate, batch_count=1480386.6666666667, ans=0.0 2023-10-04 01:41:19,214 WARNING [train.py:1204] (2/4) Exclude cut with ID _44902_210_16187_1_1533888006062_3792078_258-178274-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 01:41:22,421 WARNING [train.py:1204] (2/4) Exclude cut with ID _7537_210_2084_1_1528502395431_3924359_14-312906-0_sp1.1 from training. Duration: 0.763625 2023-10-04 01:41:22,695 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.bypass.skip_rate, batch_count=1480453.3333333333, ans=0.07 2023-10-04 01:41:23,828 WARNING [train.py:1204] (2/4) Exclude cut with ID _49643_210_7989_1_1534640244224_3939031_674-116639-0_sp1.1 from training. Duration: 0.963625 2023-10-04 01:41:25,257 WARNING [train.py:1204] (2/4) Exclude cut with ID _22826_210_9994_1_1531908201522_3279169_415-161-0 from training. Duration: 0.98 2023-10-04 01:41:29,271 WARNING [train.py:1204] (2/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_264-348896-0 from training. Duration: 0.85 2023-10-04 01:41:29,279 WARNING [train.py:1204] (2/4) Exclude cut with ID _51899_210_20034_1_1534129254466_3669220_233-161492-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 01:41:29,351 WARNING [train.py:1204] (2/4) Exclude cut with ID _31255_210_13427_1_1532865619495_3815143_233-200023-0_sp1.1 from training. Duration: 0.936375 2023-10-04 01:41:29,371 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_23850_210_4238_1_1532232597_1632433_100-229879-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 01:41:30,650 WARNING [train.py:1204] (2/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_375-245677-0_sp0.9 from training. Duration: 0.9 2023-10-04 01:41:30,652 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_40223_210_6228_1_1533298404_4812267_355-317415-0_sp1.1 from training. Duration: 0.97275 2023-10-04 01:41:31,922 WARNING [train.py:1204] (2/4) Exclude cut with ID _9679_210_7497_1_1529129212906_4363929_235-81488-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 01:41:34,816 WARNING [train.py:1204] (2/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_172-215932-0_sp1.1 from training. Duration: 0.6 2023-10-04 01:41:36,143 WARNING [train.py:1204] (2/4) Exclude cut with ID _12369_210_4381_1_1530356078581_4388520_64-298917-0_sp1.1 from training. Duration: 0.8 2023-10-04 01:41:39,904 WARNING [train.py:1204] (2/4) Exclude cut with ID _38020_210_5986_1_1533112252259_7006089_131-53533-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 01:41:41,369 WARNING [train.py:1204] (2/4) Exclude cut with ID _42334_210_7770_1_1534206610396_3605880_392-48154-0_sp1.1 from training. Duration: 0.963625 2023-10-04 01:41:43,243 WARNING [train.py:1204] (2/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_337-181502-0 from training. Duration: 0.65 2023-10-04 01:41:43,250 WARNING [train.py:1204] (2/4) Exclude cut with ID _6267_210_2084_1_1527764658526_3894730_375-219887-0_sp0.9 from training. Duration: 0.7444375 2023-10-04 01:41:43,337 WARNING [train.py:1204] (2/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_60-146189-0 from training. Duration: 0.98 2023-10-04 01:41:44,593 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_3133_210_2087_1_1526106446_2324690_243-143119-0_sp1.1 from training. Duration: 0.7 2023-10-04 01:41:45,994 WARNING [train.py:1204] (2/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_1267-272693-0_sp0.9 from training. Duration: 0.811125 2023-10-04 01:41:48,628 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.567e+02 2.009e+02 2.213e+02 2.557e+02 3.155e+02, threshold=4.427e+02, percent-clipped=0.0 2023-10-04 01:41:48,695 WARNING [train.py:1204] (2/4) Exclude cut with ID _69884_210_9771_1_1535846392818_4166169_83-182645-0_sp1.1 from training. Duration: 0.97275 2023-10-04 01:41:48,721 WARNING [train.py:1204] (2/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_403-292260-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 01:41:51,445 WARNING [train.py:1204] (2/4) Exclude cut with ID _55429_210_10626_1_1534467588527_7174143_377-169391-0 from training. Duration: 0.96 2023-10-04 01:41:52,870 WARNING [train.py:1204] (2/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_494-199538-0_sp1.1 from training. Duration: 0.6818125 2023-10-04 01:41:52,921 WARNING [train.py:1204] (2/4) Exclude cut with ID _3067_210_2667_1_1526212974640_4503168_119-175776-0_sp1.1 from training. Duration: 0.67275 2023-10-04 01:41:57,529 WARNING [train.py:1204] (2/4) Exclude cut with ID _3231_210_2106_1_1526205864934_6784220_183-303839-0_sp1.1 from training. Duration: 0.97275 2023-10-04 01:41:58,969 WARNING [train.py:1204] (2/4) Exclude cut with ID _18513_210_9206_1_1531618215223_3862620_165-38958-0_sp1.1 from training. Duration: 0.963625 2023-10-04 01:42:00,374 WARNING [train.py:1204] (2/4) Exclude cut with ID _41314_210_12651_1_1533459476543_3766460_87-272068-0_sp1.1 from training. Duration: 0.7545625 2023-10-04 01:42:01,802 WARNING [train.py:1204] (2/4) Exclude cut with ID _3541_210_2477_1_1526475408709_3263973_353-95-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 01:42:03,236 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_2009_210_2071_1_1525481134_4588937_73-302813-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 01:42:04,510 INFO [train.py:1046] (2/4) Epoch 42, batch 4300, loss[loss=0.171, simple_loss=0.2546, pruned_loss=0.0437, over 23927.00 frames. ], tot_loss[loss=0.1548, simple_loss=0.2348, pruned_loss=0.03735, over 4712678.74 frames. ], batch size: 86, lr: 2.41e-03, grad_scale: 16.0 2023-10-04 01:42:04,632 WARNING [train.py:1204] (2/4) Exclude cut with ID _64220_210_9527_1_1535250151772_4875259_88-95011-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 01:42:05,894 WARNING [train.py:1204] (2/4) Exclude cut with ID _44902_210_16187_1_1533888006062_3792078_160-12107-0_sp1.1 from training. Duration: 0.92725 2023-10-04 01:42:05,900 WARNING [train.py:1204] (2/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_547-5424-0 from training. Duration: 0.94 2023-10-04 01:42:07,785 WARNING [train.py:1204] (2/4) Exclude cut with ID _21783_210_11357_1_1531742458730_3762679_268-85656-0_sp1.1 from training. Duration: 0.936375 2023-10-04 01:42:07,860 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder_embed.convnext.out_balancer.prob, batch_count=1480653.3333333333, ans=0.125 2023-10-04 01:42:12,651 WARNING [train.py:1204] (2/4) Exclude cut with ID _66210_210_12651_1_1535446812922_3365850_42-284985-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 01:42:12,676 WARNING [train.py:1204] (2/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_465-214726-0_sp0.9 from training. Duration: 0.9555625 2023-10-04 01:42:16,312 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.balancer1.prob, batch_count=1480653.3333333333, ans=0.125 2023-10-04 01:42:17,455 WARNING [train.py:1204] (2/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_50-95770-0_sp1.1 from training. Duration: 0.936375 2023-10-04 01:42:24,748 WARNING [train.py:1204] (2/4) Exclude cut with ID _30800_210_22382_1_1532499347904_4515959_269-77567-0_sp1.1 from training. Duration: 0.963625 2023-10-04 01:42:24,750 WARNING [train.py:1204] (2/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_7-111954-0 from training. Duration: 0.56 2023-10-04 01:42:24,839 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_43802_210_2290_1_1533516203_8829504_85-195240-0_sp0.9 from training. Duration: 0.9666875 2023-10-04 01:42:26,262 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_2822_210_2715_1_1526112586_1999587_165-36656-0_sp0.9 from training. Duration: 0.988875 2023-10-04 01:42:26,285 WARNING [train.py:1204] (2/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_181-78632-0_sp1.1 from training. Duration: 0.7090625 2023-10-04 01:42:26,296 WARNING [train.py:1204] (2/4) Exclude cut with ID IC0671W0044-331887 from training. Duration: 0.896 2023-10-04 01:42:29,010 WARNING [train.py:1204] (2/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_266-137104-0_sp1.1 from training. Duration: 0.5909375 2023-10-04 01:42:31,618 WARNING [train.py:1204] (2/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_137-116442-0_sp0.9 from training. Duration: 0.8666875 2023-10-04 01:42:34,344 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_23801_210_16223_1_1532066392_3294657_570-5439-0 from training. Duration: 0.97 2023-10-04 01:42:34,356 WARNING [train.py:1204] (2/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_29-280622-0_sp1.1 from training. Duration: 0.7454375 2023-10-04 01:42:34,377 WARNING [train.py:1204] (2/4) Exclude cut with ID 3557-8342-0013-54691-0 from training. Duration: 0.92 2023-10-04 01:42:37,196 WARNING [train.py:1204] (2/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_711-15688-0_sp1.1 from training. Duration: 0.3909375 2023-10-04 01:42:39,250 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_15126_210_6753_1_1530778677_2591185_120-277707-0_sp1.1 from training. Duration: 0.72725 2023-10-04 01:42:42,578 WARNING [train.py:1204] (2/4) Exclude cut with ID _14970_210_15709_1_1530775547361_1966965_8-169924-0_sp1.1 from training. Duration: 0.836375 2023-10-04 01:42:42,579 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_4692_210_3528_1_1526899755_4323254_107-44401-0_sp0.9 from training. Duration: 0.9555625 2023-10-04 01:42:43,944 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_3475_210_2028_1_1526193578_3622569_265-276625-0_sp0.9 from training. Duration: 0.8444375 2023-10-04 01:42:44,053 WARNING [train.py:1204] (2/4) Exclude cut with ID _55153_210_2253_1_1534384786677_3162096_148-79022-0_sp1.1 from training. Duration: 0.87275 2023-10-04 01:42:46,779 WARNING [train.py:1204] (2/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_292-36336-0_sp1.1 from training. Duration: 0.92725 2023-10-04 01:42:46,796 WARNING [train.py:1204] (2/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_329-82235-0 from training. Duration: 0.82 2023-10-04 01:42:46,878 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_11305_210_1794_1_1529750428_7178148_257-312870-0 from training. Duration: 0.88 2023-10-04 01:42:49,553 WARNING [train.py:1204] (2/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_436-4493-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 01:42:52,302 WARNING [train.py:1204] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_321-54129-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 01:42:52,958 WARNING [train.py:1204] (2/4) Exclude cut with ID _5287_210_2084_1_1527418883040_3878670_333-48508-0_sp0.9 from training. Duration: 0.6666875 2023-10-04 01:42:52,973 WARNING [train.py:1204] (2/4) Exclude cut with ID _26528_210_9070_1_1532570410842_3851310_244-278158-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 01:42:54,150 WARNING [train.py:1204] (2/4) Exclude cut with ID _46952_210_5986_1_1534230010357_3107379_232-344503-0_sp1.1 from training. Duration: 0.87275 2023-10-04 01:42:54,160 WARNING [train.py:1204] (2/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_43-206757-0 from training. Duration: 0.75 2023-10-04 01:42:54,162 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_4183_210_2087_1_1526729100_1869838_141-166600-0 from training. Duration: 0.95 2023-10-04 01:42:54,223 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_296-287678-0 from training. Duration: 0.95 2023-10-04 01:42:54,283 WARNING [train.py:1204] (2/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_265-60343-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 01:42:54,309 WARNING [train.py:1204] (2/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_332-256168-0 from training. Duration: 0.75 2023-10-04 01:42:55,657 WARNING [train.py:1204] (2/4) Exclude cut with ID _13679_210_7497_1_1530534293662_3355098_411-186088-0 from training. Duration: 0.94 2023-10-04 01:42:57,248 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.ff2_skip_rate, batch_count=1480853.3333333333, ans=0.0 2023-10-04 01:42:59,637 WARNING [train.py:1204] (2/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_458-126779-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 01:43:00,960 WARNING [train.py:1204] (2/4) Exclude cut with ID IC0803W0380-397572 from training. Duration: 0.032 2023-10-04 01:43:02,253 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_278-272612-0_sp1.1 from training. Duration: 0.663625 2023-10-04 01:43:03,679 WARNING [train.py:1204] (2/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_491-263101-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 01:43:03,684 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_53555_210_5632_1_1534595220_7232297_334-336778-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 01:43:06,428 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_50-3289-0 from training. Duration: 0.91 2023-10-04 01:43:06,490 WARNING [train.py:1204] (2/4) Exclude cut with ID _8699_210_2069_1_1528630346831_3960409_155-39766-0_sp0.9 from training. Duration: 0.8666875 2023-10-04 01:43:06,493 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_502-82145-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 01:43:07,812 WARNING [train.py:1204] (2/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_550-43132-0_sp1.1 from training. Duration: 0.97275 2023-10-04 01:43:07,850 WARNING [train.py:1204] (2/4) Exclude cut with ID _14998_210_5219_1_1530705582080_7474970_446-112117-0_sp1.1 from training. Duration: 0.8181875 2023-10-04 01:43:07,908 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_3691_210_3528_1_1526468169_4062053_355-334432-0_sp1.1 from training. Duration: 0.7909375 2023-10-04 01:43:11,222 WARNING [train.py:1204] (2/4) Exclude cut with ID _58605_210_12210_1_1534723195939_3904259_239-94720-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 01:43:11,456 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.feed_forward3.hidden_balancer.prob, batch_count=1480920.0, ans=0.125 2023-10-04 01:43:14,419 WARNING [train.py:1204] (2/4) Exclude cut with ID _49376_210_5986_1_1533985278732_3370740_391-344072-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 01:43:16,341 WARNING [train.py:1204] (2/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_558-322014-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 01:43:16,393 WARNING [train.py:1204] (2/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_256-32662-0_sp1.1 from training. Duration: 0.8181875 2023-10-04 01:43:19,050 INFO [train.py:1046] (2/4) Epoch 42, batch 4350, loss[loss=0.164, simple_loss=0.2355, pruned_loss=0.04623, over 23513.00 frames. ], tot_loss[loss=0.1553, simple_loss=0.2352, pruned_loss=0.03765, over 4700921.83 frames. ], batch size: 285, lr: 2.41e-03, grad_scale: 8.0 2023-10-04 01:43:21,927 WARNING [train.py:1204] (2/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_572-185150-0 from training. Duration: 0.97 2023-10-04 01:43:21,968 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_89-35748-0_sp0.9 from training. Duration: 0.87775 2023-10-04 01:43:22,267 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.feed_forward1.hidden_balancer.prob, batch_count=1480986.6666666667, ans=0.125 2023-10-04 01:43:26,495 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6291_210_3273_1_1527683649_1078498_88-66747-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 01:43:27,035 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.4.encoder.layers.2.self_attn2.whiten, num_groups=1, num_channels=384, metric=11.57 vs. limit=22.5 2023-10-04 01:43:27,956 WARNING [train.py:1204] (2/4) Exclude cut with ID _19504_210_9994_1_1531648863242_3325220_357-237029-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 01:43:29,450 WARNING [train.py:1204] (2/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_302-2550-0_sp1.1 from training. Duration: 0.736375 2023-10-04 01:43:29,451 WARNING [train.py:1204] (2/4) Exclude cut with ID _9461_210_4557_1_1529060514726_3677451_59-194017-0_sp1.1 from training. Duration: 0.963625 2023-10-04 01:43:29,585 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.1.conv_module2.balancer1.prob, batch_count=1480986.6666666667, ans=0.125 2023-10-04 01:43:32,471 WARNING [train.py:1204] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_665-196155-0_sp1.1 from training. Duration: 0.6090625 2023-10-04 01:43:36,655 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_326-315007-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 01:43:41,235 WARNING [train.py:1204] (2/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_105-328174-0_sp1.1 from training. Duration: 0.6909375 2023-10-04 01:43:41,250 WARNING [train.py:1204] (2/4) Exclude cut with ID _56494_210_4220_1_1535284837625_3033169_106-263722-0_sp1.1 from training. Duration: 0.97275 2023-10-04 01:43:45,914 WARNING [train.py:1204] (2/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_770-200430-0_sp0.9 from training. Duration: 0.988875 2023-10-04 01:43:47,894 WARNING [train.py:1204] (2/4) Exclude cut with ID _65640_210_6310_1_1535368201445_3173250_209-13008-0_sp1.1 from training. Duration: 0.9 2023-10-04 01:43:47,992 WARNING [train.py:1204] (2/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_212-149947-0_sp0.9 from training. Duration: 0.811125 2023-10-04 01:43:50,997 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.1.bypass.skip_rate, batch_count=1481120.0, ans=0.035 2023-10-04 01:43:51,048 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.ff2_skip_rate, batch_count=1481120.0, ans=0.0 2023-10-04 01:43:51,096 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.ff3_skip_rate, batch_count=1481120.0, ans=0.0 2023-10-04 01:43:52,229 WARNING [train.py:1204] (2/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_188-171673-0 from training. Duration: 0.58 2023-10-04 01:43:52,293 WARNING [train.py:1204] (2/4) Exclude cut with ID _58895_210_8750_1_1535184081320_3649089_286-281724-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 01:43:53,639 WARNING [train.py:1204] (2/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_16-254909-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 01:43:54,434 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.feed_forward2.hidden_balancer.prob, batch_count=1481120.0, ans=0.125 2023-10-04 01:43:59,437 WARNING [train.py:1204] (2/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_52-95655-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 01:44:02,134 WARNING [train.py:1204] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_554-45617-0 from training. Duration: 0.99 2023-10-04 01:44:04,918 WARNING [train.py:1204] (2/4) Exclude cut with ID _14683_210_5219_1_1530698352608_3974859_473-344611-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 01:44:05,011 WARNING [train.py:1204] (2/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_17-25045-0_sp0.9 from training. Duration: 0.6555625 2023-10-04 01:44:09,750 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9072W0003-530637 from training. Duration: 0.768 2023-10-04 01:44:11,054 WARNING [train.py:1204] (2/4) Exclude cut with ID _38670_210_15379_1_1533448810659_4391059_76-74025-0_sp1.1 from training. Duration: 0.936375 2023-10-04 01:44:12,847 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6291_210_3273_1_1527683649_1078498_74-292608-0_sp0.9 from training. Duration: 0.8 2023-10-04 01:44:14,186 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9084W0025-536862 from training. Duration: 0.896 2023-10-04 01:44:15,567 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9087W0021-538392 from training. Duration: 0.768 2023-10-04 01:44:15,572 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7543_210_6753_1_1528543323_645515_56-163234-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 01:44:15,592 WARNING [train.py:1204] (2/4) Exclude cut with ID _45560_210_21982_1_1533812603951_3584999_44-332643-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 01:44:16,946 WARNING [train.py:1204] (2/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_1285-256622-0_sp1.1 from training. Duration: 0.763625 2023-10-04 01:44:17,013 WARNING [train.py:1204] (2/4) Exclude cut with ID _52781_210_16187_1_1534297729099_1118149_141-325632-0_sp1.1 from training. Duration: 0.936375 2023-10-04 01:44:18,231 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.647e+02 1.906e+02 2.107e+02 2.374e+02 3.775e+02, threshold=4.214e+02, percent-clipped=0.0 2023-10-04 01:44:18,335 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_40896_210_15710_1_1533433956_7100802_1051-137631-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 01:44:18,386 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_13091_210_7925_1_1530609447_8987354_233-18333-0_sp1.1 from training. Duration: 0.97275 2023-10-04 01:44:19,143 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.2.encoder.layers.0.conv_module2.whiten, num_groups=1, num_channels=384, metric=5.00 vs. limit=15.0 2023-10-04 01:44:21,675 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_34008_210_10431_1_1533345424_8183247_662-317331-0 from training. Duration: 0.94 2023-10-04 01:44:21,681 WARNING [train.py:1204] (2/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_291-248920-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 01:44:21,683 WARNING [train.py:1204] (2/4) Exclude cut with ID _65263_210_22356_1_1535763598691_3921037_274-288481-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 01:44:22,977 WARNING [train.py:1204] (2/4) Exclude cut with ID _5189_210_4557_1_1527248319428_3769229_233-168080-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 01:44:24,296 WARNING [train.py:1204] (2/4) Exclude cut with ID _54871_210_22356_1_1534665798253_3856359_135-184281-0 from training. Duration: 0.99 2023-10-04 01:44:24,402 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9123W0012-555972 from training. Duration: 0.896 2023-10-04 01:44:24,406 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9123W0118-556077 from training. Duration: 0.896 2023-10-04 01:44:24,415 WARNING [train.py:1204] (2/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_406-275649-0 from training. Duration: 0.42 2023-10-04 01:44:26,623 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.ff2_skip_rate, batch_count=1481253.3333333333, ans=0.0 2023-10-04 01:44:27,665 WARNING [train.py:1204] (2/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_309-13684-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 01:44:28,998 WARNING [train.py:1204] (2/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_129-337802-0_sp1.1 from training. Duration: 0.6545625 2023-10-04 01:44:29,023 WARNING [train.py:1204] (2/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_649-266825-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 01:44:29,094 WARNING [train.py:1204] (2/4) Exclude cut with ID _40519_210_13794_1_1533715228655_7337390_895-224742-0_sp1.1 from training. Duration: 0.92725 2023-10-04 01:44:30,558 WARNING [train.py:1204] (2/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_234-14174-0 from training. Duration: 0.82 2023-10-04 01:44:31,929 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9156W0009-572502 from training. Duration: 0.768 2023-10-04 01:44:31,943 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_424-245529-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 01:44:33,256 INFO [train.py:1046] (2/4) Epoch 42, batch 4400, loss[loss=0.1591, simple_loss=0.2494, pruned_loss=0.03444, over 24438.00 frames. ], tot_loss[loss=0.1554, simple_loss=0.2358, pruned_loss=0.03755, over 4702149.29 frames. ], batch size: 69, lr: 2.41e-03, grad_scale: 16.0 2023-10-04 01:44:36,309 WARNING [train.py:1204] (2/4) Exclude cut with ID _26528_210_9070_1_1532570410842_3851310_232-10149-0_sp1.1 from training. Duration: 0.963625 2023-10-04 01:44:36,324 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_17292_210_6753_1_1531125388_2943734_152-30410-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 01:44:37,894 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.self_attn_weights.pos_emb_skip_rate, batch_count=1481320.0, ans=0.0 2023-10-04 01:44:38,966 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_35441_210_16554_1_1533347572_4092680_291-82075-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 01:44:40,977 WARNING [train.py:1204] (2/4) Exclude cut with ID _12369_210_4381_1_1530356078581_4388520_64-298917-0 from training. Duration: 0.88 2023-10-04 01:44:40,996 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_5618_210_1874_1_1527311008_5851_1-214501-0_sp0.9 from training. Duration: 0.7655625 2023-10-04 01:44:41,037 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_9460_210_4211_1_1528954587_10049667_572-37035-0 from training. Duration: 0.8 2023-10-04 01:44:41,058 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9193W0023-590322 from training. Duration: 0.896 2023-10-04 01:44:42,432 WARNING [train.py:1204] (2/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_206-10097-0_sp1.1 from training. Duration: 0.4454375 2023-10-04 01:44:42,452 WARNING [train.py:1204] (2/4) Exclude cut with ID _55134_210_21982_1_1534590028377_3882170_17-51731-0_sp1.1 from training. Duration: 0.963625 2023-10-04 01:44:44,477 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_14953_210_2151_1_1530701710_5616165_710-165400-0 from training. Duration: 0.87 2023-10-04 01:44:47,109 WARNING [train.py:1204] (2/4) Exclude cut with ID _53203_210_2087_1_1534246218309_475590_61-85777-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 01:44:48,403 WARNING [train.py:1204] (2/4) Exclude cut with ID _55844_210_2346_1_1534471212569_7625707_1050-10453-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 01:44:48,412 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9218W0013-603297 from training. Duration: 0.896 2023-10-04 01:44:51,635 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_20120_210_6753_1_1531643035_5482859_267-102818-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 01:44:51,635 WARNING [train.py:1204] (2/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_191-41023-0 from training. Duration: 0.71 2023-10-04 01:44:51,686 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9231W0202-610227 from training. Duration: 0.896 2023-10-04 01:44:54,558 WARNING [train.py:1204] (2/4) Exclude cut with ID _6537_210_1883_1_1527987559710_3597129_382-133062-0 from training. Duration: 0.68 2023-10-04 01:44:54,603 WARNING [train.py:1204] (2/4) Exclude cut with ID _28427_210_4220_1_1532581240967_3061069_43-84928-0 from training. Duration: 0.99 2023-10-04 01:44:55,894 WARNING [train.py:1204] (2/4) Exclude cut with ID _59256_210_5986_1_1535076085997_3589120_121-237386-0 from training. Duration: 0.96 2023-10-04 01:44:55,915 WARNING [train.py:1204] (2/4) Exclude cut with ID _8480_210_1874_1_1528589391205_6728330_137-256986-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 01:44:56,618 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_9417_210_1794_1_1529235261_4614131_479-346963-0_sp1.1 from training. Duration: 0.936375 2023-10-04 01:44:57,639 WARNING [train.py:1204] (2/4) Exclude cut with ID _26474_210_9570_1_1532520022699_3623380_267-7362-0_sp1.1 from training. Duration: 0.936375 2023-10-04 01:44:59,060 WARNING [train.py:1204] (2/4) Exclude cut with ID _55328_210_27767_1_1534580973260_4088170_287-61272-0_sp1.1 from training. Duration: 0.97275 2023-10-04 01:45:00,411 WARNING [train.py:1204] (2/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_589-9070-0 from training. Duration: 0.71 2023-10-04 01:45:00,418 WARNING [train.py:1204] (2/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_148-158509-0_sp1.1 from training. Duration: 0.37275 2023-10-04 01:45:01,782 WARNING [train.py:1204] (2/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_164-184548-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 01:45:04,467 WARNING [train.py:1204] (2/4) Exclude cut with ID _14970_210_15709_1_1530775547361_1966965_6-23328-0_sp1.1 from training. Duration: 0.7818125 2023-10-04 01:45:05,733 WARNING [train.py:1204] (2/4) Exclude cut with ID _29303_210_2575_1_1532392237303_7410649_612-165069-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 01:45:05,958 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.balancer2.prob, batch_count=1481453.3333333333, ans=0.125 2023-10-04 01:45:07,094 WARNING [train.py:1204] (2/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_383-321224-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 01:45:07,123 WARNING [train.py:1204] (2/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_283-160525-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 01:45:07,134 WARNING [train.py:1204] (2/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_505-97987-0 from training. Duration: 0.86 2023-10-04 01:45:07,220 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9309W0190-640317 from training. Duration: 0.768 2023-10-04 01:45:10,100 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.nonlin_attention.balancer.prob, batch_count=1481453.3333333333, ans=0.125 2023-10-04 01:45:13,082 WARNING [train.py:1204] (2/4) Exclude cut with ID _50093_210_4220_1_1534417141207_3432299_280-297390-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 01:45:19,246 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_17292_210_6753_1_1531125388_2943734_320-71385-0_sp1.1 from training. Duration: 0.97275 2023-10-04 01:45:19,495 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.balancer1.prob, batch_count=1481520.0, ans=0.125 2023-10-04 01:45:20,696 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_213-103282-0 from training. Duration: 0.78 2023-10-04 01:45:23,600 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7615_210_4211_1_1528110965_4580160_72-235211-0_sp0.9 from training. Duration: 0.8444375 2023-10-04 01:45:26,620 WARNING [train.py:1204] (2/4) Exclude cut with ID _14867_210_1968_1_1530684179890_3846969_173-320977-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 01:45:28,474 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7046_210_6753_1_1527895337_6920917_783-278109-0_sp0.9 from training. Duration: 0.8555625 2023-10-04 01:45:29,807 WARNING [train.py:1204] (2/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_90-30673-0 from training. Duration: 0.84 2023-10-04 01:45:29,830 WARNING [train.py:1204] (2/4) Exclude cut with ID _22826_210_9994_1_1531908201522_3279169_415-161-0_sp1.1 from training. Duration: 0.8909375 2023-10-04 01:45:29,847 WARNING [train.py:1204] (2/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_86-66192-0_sp0.9 from training. Duration: 0.611125 2023-10-04 01:45:29,850 WARNING [train.py:1204] (2/4) Exclude cut with ID _4626_210_3532_1_1526817652717_3916859_446-206656-0_sp1.1 from training. Duration: 0.6909375 2023-10-04 01:45:30,062 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.attention_skip_rate, batch_count=1481520.0, ans=0.0 2023-10-04 01:45:31,111 WARNING [train.py:1204] (2/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_166-128941-0_sp0.9 from training. Duration: 0.6 2023-10-04 01:45:34,171 WARNING [train.py:1204] (2/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_345-167986-0 from training. Duration: 0.84 2023-10-04 01:45:36,882 WARNING [train.py:1204] (2/4) Exclude cut with ID _6275_210_2084_1_1528110062213_7669049_973-198085-0 from training. Duration: 0.75 2023-10-04 01:45:38,252 WARNING [train.py:1204] (2/4) Exclude cut with ID _60057_210_10640_1_1534903065793_3333150_171-181133-0 from training. Duration: 0.88 2023-10-04 01:45:38,267 WARNING [train.py:1204] (2/4) Exclude cut with ID _19504_210_9994_1_1531648863242_3325220_336-196513-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 01:45:38,274 WARNING [train.py:1204] (2/4) Exclude cut with ID _6493_210_1747_1_1528459210424_4012470_513-140967-0 from training. Duration: 0.79 2023-10-04 01:45:38,358 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6673_210_3273_1_1527767790_3173240_327-306185-0_sp0.9 from training. Duration: 0.92225 2023-10-04 01:45:41,700 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1032-236863-0_sp1.1 from training. Duration: 0.763625 2023-10-04 01:45:44,448 WARNING [train.py:1204] (2/4) Exclude cut with ID _14867_210_1968_1_1530684179890_3846969_413-29292-0 from training. Duration: 0.9 2023-10-04 01:45:47,212 INFO [train.py:1046] (2/4) Epoch 42, batch 4450, loss[loss=0.178, simple_loss=0.2505, pruned_loss=0.05272, over 22775.00 frames. ], tot_loss[loss=0.1571, simple_loss=0.2371, pruned_loss=0.03852, over 4687950.81 frames. ], batch size: 322, lr: 2.41e-03, grad_scale: 16.0 2023-10-04 01:45:47,348 WARNING [train.py:1204] (2/4) Exclude cut with ID _47251_210_18628_1_1533814063128_4137250_186-343322-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 01:45:50,530 WARNING [train.py:1204] (2/4) Exclude cut with ID _56235_210_10640_1_1534557335235_4161099_624-46091-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 01:45:50,568 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_791-197891-0_sp0.9 from training. Duration: 0.9333125 2023-10-04 01:45:54,617 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.2.encoder.layers.2.feed_forward1.out_whiten, num_groups=1, num_channels=384, metric=9.11 vs. limit=15.0 2023-10-04 01:45:57,061 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_50050_210_16223_1_1535435796_7290138_786-288457-0_sp1.1 from training. Duration: 0.92725 2023-10-04 01:45:57,087 WARNING [train.py:1204] (2/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_213-227283-0_sp1.1 from training. Duration: 0.963625 2023-10-04 01:45:58,607 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.feed_forward3.hidden_balancer.prob, batch_count=1481653.3333333333, ans=0.125 2023-10-04 01:45:59,858 WARNING [train.py:1204] (2/4) Exclude cut with ID _42659_210_2070_1_1533607442765_3120380_58-341121-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 01:46:01,408 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.conv_module1.balancer1.prob, batch_count=1481720.0, ans=0.125 2023-10-04 01:46:02,573 WARNING [train.py:1204] (2/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_506-23200-0_sp1.1 from training. Duration: 0.8818125 2023-10-04 01:46:05,342 WARNING [train.py:1204] (2/4) Exclude cut with ID _14170_210_5219_1_1530525949868_4469650_336-213530-0_sp1.1 from training. Duration: 0.8181875 2023-10-04 01:46:06,641 WARNING [train.py:1204] (2/4) Exclude cut with ID _64135_210_2253_1_1535677267876_7608972_180-5312-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 01:46:06,723 WARNING [train.py:1204] (2/4) Exclude cut with ID _12302_210_5219_1_1529924446466_8107280_835-279754-0 from training. Duration: 0.9 2023-10-04 01:46:06,725 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_20120_210_6753_1_1531643035_5482859_510-96996-0_sp1.1 from training. Duration: 0.7909375 2023-10-04 01:46:08,062 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_15517_210_6753_1_1530954008_3487336_216-187834-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 01:46:08,100 WARNING [train.py:1204] (2/4) Exclude cut with ID _19504_210_9994_1_1531648863242_3325220_414-113427-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 01:46:08,101 WARNING [train.py:1204] (2/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_264-108685-0_sp0.9 from training. Duration: 0.611125 2023-10-04 01:46:11,496 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_5438_210_1794_1_1527336687_2600009_104-288274-0_sp0.9 from training. Duration: 0.6444375 2023-10-04 01:46:13,156 INFO [scaling.py:1118] (2/4) WithLoss: name=encoder.encoders.5.encoder.layers.0.self_attn_weights, loss-sum=0.000e+00 2023-10-04 01:46:15,681 WARNING [train.py:1204] (2/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_520-39496-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 01:46:15,730 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_37818_210_4238_1_1533620910_4720083_315-308565-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 01:46:17,215 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_2009_210_2071_1_1525481134_4588937_385-302100-0_sp1.1 from training. Duration: 0.7909375 2023-10-04 01:46:17,265 WARNING [train.py:1204] (2/4) Exclude cut with ID _58895_210_8750_1_1535184081320_3649089_277-53219-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 01:46:18,682 WARNING [train.py:1204] (2/4) Exclude cut with ID _26664_210_7770_1_1532258943280_3418270_121-111258-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 01:46:20,591 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.balancer2.prob, batch_count=1481786.6666666667, ans=0.125 2023-10-04 01:46:23,012 WARNING [train.py:1204] (2/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_99-167392-0_sp1.1 from training. Duration: 0.5545625 2023-10-04 01:46:23,729 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1113-7369-0 from training. Duration: 0.85 2023-10-04 01:46:24,997 WARNING [train.py:1204] (2/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_352-59958-0 from training. Duration: 0.86 2023-10-04 01:46:25,002 WARNING [train.py:1204] (2/4) Exclude cut with ID _14886_210_4220_1_1530702020355_3516190_159-73748-0_sp1.1 from training. Duration: 0.7545625 2023-10-04 01:46:29,619 WARNING [train.py:1204] (2/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_131-119569-0_sp1.1 from training. Duration: 0.92725 2023-10-04 01:46:29,690 WARNING [train.py:1204] (2/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_216-343007-0 from training. Duration: 0.66 2023-10-04 01:46:31,247 WARNING [train.py:1204] (2/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_113-92608-0_sp0.9 from training. Duration: 0.6 2023-10-04 01:46:36,617 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_40047_210_2290_1_1533290519_2374311_132-123271-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 01:46:36,673 WARNING [train.py:1204] (2/4) Exclude cut with ID _8793_210_6310_1_1528542985139_2310009_121-179782-0 from training. Duration: 0.8 2023-10-04 01:46:36,695 WARNING [train.py:1204] (2/4) Exclude cut with ID _43997_210_4525_1_1533538632555_5189329_283-218639-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 01:46:36,698 WARNING [train.py:1204] (2/4) Exclude cut with ID _49376_210_5986_1_1533985278732_3370740_297-78397-0_sp1.1 from training. Duration: 0.936375 2023-10-04 01:46:36,713 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_3475_210_2028_1_1526193578_3622569_377-166133-0_sp0.9 from training. Duration: 0.9555625 2023-10-04 01:46:36,719 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_520-213149-0_sp1.1 from training. Duration: 0.92725 2023-10-04 01:46:38,082 WARNING [train.py:1204] (2/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_567-144483-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 01:46:41,398 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_4183_210_2087_1_1526729100_1869838_143-280962-0_sp0.9 from training. Duration: 0.62225 2023-10-04 01:46:42,851 WARNING [train.py:1204] (2/4) Exclude cut with ID _49643_210_7989_1_1534640244224_3939031_611-185515-0 from training. Duration: 0.94 2023-10-04 01:46:43,415 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.4.encoder.layers.1.self_attn2.whiten, num_groups=1, num_channels=384, metric=11.88 vs. limit=22.5 2023-10-04 01:46:44,119 WARNING [train.py:1204] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_88-341718-0_sp0.9 from training. Duration: 0.7333125 2023-10-04 01:46:45,459 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_51225_210_3601_1_1534400634_2327383_49-235441-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 01:46:45,550 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder_embed.convnext.layerdrop_rate, batch_count=1481920.0, ans=0.015 2023-10-04 01:46:46,649 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.549e+02 2.039e+02 2.233e+02 2.622e+02 3.908e+02, threshold=4.466e+02, percent-clipped=0.0 2023-10-04 01:46:46,757 WARNING [train.py:1204] (2/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_240-294400-0_sp1.1 from training. Duration: 0.936375 2023-10-04 01:46:49,506 WARNING [train.py:1204] (2/4) Exclude cut with ID _55429_210_10626_1_1534467588527_7174143_640-304825-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 01:46:49,540 WARNING [train.py:1204] (2/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_187-221566-0_sp1.1 from training. Duration: 0.3909375 2023-10-04 01:46:54,159 WARNING [train.py:1204] (2/4) Exclude cut with ID _7606_210_4915_1_1528894253277_5080690_127-177761-0_sp0.9 from training. Duration: 0.711125 2023-10-04 01:46:57,004 WARNING [train.py:1204] (2/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_430-116139-0 from training. Duration: 0.85 2023-10-04 01:46:58,923 WARNING [train.py:1204] (2/4) Exclude cut with ID _3675_210_4557_1_1526639934408_4182089_456-18305-0_sp0.9 from training. Duration: 0.8444375 2023-10-04 01:47:02,285 INFO [train.py:1046] (2/4) Epoch 42, batch 4500, loss[loss=0.1489, simple_loss=0.2374, pruned_loss=0.03023, over 24643.00 frames. ], tot_loss[loss=0.1564, simple_loss=0.2366, pruned_loss=0.03812, over 4696652.29 frames. ], batch size: 65, lr: 2.41e-03, grad_scale: 16.0 2023-10-04 01:47:02,414 WARNING [train.py:1204] (2/4) Exclude cut with ID _18513_210_9206_1_1531618215223_3862620_402-5957-0_sp1.1 from training. Duration: 0.9 2023-10-04 01:47:03,839 WARNING [train.py:1204] (2/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_465-156500-0 from training. Duration: 0.72 2023-10-04 01:47:03,841 WARNING [train.py:1204] (2/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_714-310538-0 from training. Duration: 0.86 2023-10-04 01:47:05,211 WARNING [train.py:1204] (2/4) Exclude cut with ID _39411_210_5986_1_1533538825030_3343649_335-301187-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 01:47:09,416 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_41750_210_1960_1_1533520678_3948042_241-158616-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 01:47:11,098 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_46013_210_7234_1_1533776362_3744032_50-141535-0_sp1.1 from training. Duration: 0.9 2023-10-04 01:47:11,330 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.bypass.skip_rate, batch_count=1481986.6666666667, ans=0.04949747468305833 2023-10-04 01:47:12,548 WARNING [train.py:1204] (2/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_43-206757-0_sp0.9 from training. Duration: 0.8333125 2023-10-04 01:47:12,609 WARNING [train.py:1204] (2/4) Exclude cut with ID _22961_210_8254_1_1531976366672_4278180_172-276296-0_sp1.1 from training. Duration: 0.97275 2023-10-04 01:47:12,629 WARNING [train.py:1204] (2/4) Exclude cut with ID _17956_210_4353_1_1531209568125_6979970_1305-301903-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 01:47:12,862 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.bypass_mid.scale_min, batch_count=1481986.6666666667, ans=0.2 2023-10-04 01:47:13,940 WARNING [train.py:1204] (2/4) Exclude cut with ID _28323_210_16187_1_1532480395667_4100300_604-44507-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 01:47:25,554 WARNING [train.py:1204] (2/4) Exclude cut with ID _23514_210_1389_1_1531832139785_3031439_88-337544-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 01:47:26,952 WARNING [train.py:1204] (2/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_932-3672-0_sp1.1 from training. Duration: 0.963625 2023-10-04 01:47:29,689 WARNING [train.py:1204] (2/4) Exclude cut with ID _46502_210_13794_1_1533724452198_3657159_207-109991-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 01:47:31,586 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_216-73841-0_sp1.1 from training. Duration: 0.87275 2023-10-04 01:47:31,671 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8453_210_4211_1_1528502940_8562730_347-330673-0_sp1.1 from training. Duration: 0.6545625 2023-10-04 01:47:31,832 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.conv_module1.balancer1.prob, batch_count=1482120.0, ans=0.125 2023-10-04 01:47:38,919 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_4183_210_2087_1_1526729100_1869838_143-280962-0_sp1.1 from training. Duration: 0.5090625 2023-10-04 01:47:42,200 WARNING [train.py:1204] (2/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_325-113798-0_sp1.1 from training. Duration: 0.77275 2023-10-04 01:47:42,413 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.conv_module2.balancer1.prob, batch_count=1482120.0, ans=0.125 2023-10-04 01:47:46,376 WARNING [train.py:1204] (2/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_91-212486-0_sp0.9 from training. Duration: 0.7555625 2023-10-04 01:47:46,651 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.nonlin_attention.balancer.prob, batch_count=1482186.6666666667, ans=0.125 2023-10-04 01:47:49,148 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8776_210_2040_1_1528638837_4088036_244-278763-0_sp1.1 from training. Duration: 0.8181875 2023-10-04 01:47:49,179 WARNING [train.py:1204] (2/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_106-286130-0 from training. Duration: 0.98 2023-10-04 01:47:50,563 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_2746_210_1988_1_1525872816_3795627_372-205861-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 01:47:50,609 WARNING [train.py:1204] (2/4) Exclude cut with ID _30800_210_22382_1_1532499347904_4515959_240-54646-0_sp1.1 from training. Duration: 0.936375 2023-10-04 01:47:53,246 WARNING [train.py:1204] (2/4) Exclude cut with ID _59500_210_8254_1_1534809610680_3736009_162-180695-0_sp1.1 from training. Duration: 0.936375 2023-10-04 01:47:53,269 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7544_210_6753_1_1528591327_1471426_146-93357-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 01:47:53,511 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.self_attn_weights.pos_emb_skip_rate, batch_count=1482186.6666666667, ans=0.0 2023-10-04 01:47:55,310 WARNING [train.py:1204] (2/4) Exclude cut with ID _42657_210_2070_1_1533520654016_3327008_405-87432-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 01:47:56,615 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_4528_210_3273_1_1527076654_3276777_209-219804-0 from training. Duration: 0.69 2023-10-04 01:47:56,616 WARNING [train.py:1204] (2/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_78-182912-0_sp0.9 from training. Duration: 0.6555625 2023-10-04 01:47:56,622 WARNING [train.py:1204] (2/4) Exclude cut with ID _31255_210_13427_1_1532865619495_3815143_322-114998-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 01:48:00,005 WARNING [train.py:1204] (2/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_848-280957-0_sp0.9 from training. Duration: 0.9666875 2023-10-04 01:48:00,033 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7696_210_2040_1_1528207167_3716099_117-125434-0_sp0.9 from training. Duration: 0.8444375 2023-10-04 01:48:01,426 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_1002-109192-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 01:48:03,676 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.bypass.scale_min, batch_count=1482253.3333333333, ans=0.2 2023-10-04 01:48:04,586 WARNING [train.py:1204] (2/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_138-64210-0_sp0.9 from training. Duration: 0.811125 2023-10-04 01:48:04,601 WARNING [train.py:1204] (2/4) Exclude cut with ID _25808_210_13292_1_1532343514562_7366109_633-47524-0_sp1.1 from training. Duration: 0.9 2023-10-04 01:48:07,278 WARNING [train.py:1204] (2/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_297-5233-0 from training. Duration: 0.95 2023-10-04 01:48:08,707 WARNING [train.py:1204] (2/4) Exclude cut with ID _8394_210_6702_1_1528590504316_3540189_335-274780-0 from training. Duration: 0.78 2023-10-04 01:48:08,712 WARNING [train.py:1204] (2/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_301-330455-0 from training. Duration: 0.8 2023-10-04 01:48:10,427 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.balancer2.prob, batch_count=1482253.3333333333, ans=0.125 2023-10-04 01:48:12,919 WARNING [train.py:1204] (2/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_383-205833-0 from training. Duration: 0.95 2023-10-04 01:48:15,238 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.conv_skip_rate, batch_count=1482320.0, ans=0.0 2023-10-04 01:48:16,355 INFO [train.py:1046] (2/4) Epoch 42, batch 4550, loss[loss=0.1474, simple_loss=0.2267, pruned_loss=0.03404, over 23294.00 frames. ], tot_loss[loss=0.1558, simple_loss=0.236, pruned_loss=0.03781, over 4705054.00 frames. ], batch size: 119, lr: 2.41e-03, grad_scale: 16.0 2023-10-04 01:48:16,405 WARNING [train.py:1204] (2/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_48-182155-0 from training. Duration: 0.87 2023-10-04 01:48:16,499 WARNING [train.py:1204] (2/4) Exclude cut with ID _14832_210_8750_1_1530691351539_3171348_1-54315-0_sp1.1 from training. Duration: 0.7909375 2023-10-04 01:48:20,647 WARNING [train.py:1204] (2/4) Exclude cut with ID _44982_210_1511_1_1534312820108_3726029_413-100814-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 01:48:20,687 WARNING [train.py:1204] (2/4) Exclude cut with ID _3231_210_2106_1_1526205864934_6784220_104-261392-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 01:48:22,104 WARNING [train.py:1204] (2/4) Exclude cut with ID _24314_210_4915_1_1532135078116_7170179_1-13588-0_sp1.1 from training. Duration: 0.97275 2023-10-04 01:48:27,801 WARNING [train.py:1204] (2/4) Exclude cut with ID _44304_210_15374_1_1533639352765_4613129_141-8356-0_sp1.1 from training. Duration: 0.963625 2023-10-04 01:48:30,107 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.bypass_mid.scale_min, batch_count=1482386.6666666667, ans=0.2 2023-10-04 01:48:31,148 WARNING [train.py:1204] (2/4) Exclude cut with ID _37892_210_19596_1_1533198677897_3964058_374-335034-0_sp1.1 from training. Duration: 0.936375 2023-10-04 01:48:31,294 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_195-257512-0_sp0.9 from training. Duration: 0.7444375 2023-10-04 01:48:31,296 WARNING [train.py:1204] (2/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_1267-272693-0_sp1.1 from training. Duration: 0.663625 2023-10-04 01:48:31,296 WARNING [train.py:1204] (2/4) Exclude cut with ID _30196_210_1664_1_1533708496522_7074140_690-326155-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 01:48:34,542 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_68847_210_14656_1_1536070214_2586843_38-20717-0_sp1.1 from training. Duration: 0.97275 2023-10-04 01:48:34,590 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_99-251314-0_sp1.1 from training. Duration: 0.7909375 2023-10-04 01:48:38,616 WARNING [train.py:1204] (2/4) Exclude cut with ID _49376_210_5986_1_1533985278732_3370740_225-95630-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 01:48:41,371 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_2655_210_2087_1_1525688959_3897400_202-147557-0 from training. Duration: 0.85 2023-10-04 01:48:41,424 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_5618_210_1874_1_1527311008_5851_1-214501-0_sp1.1 from training. Duration: 0.626375 2023-10-04 01:48:42,766 WARNING [train.py:1204] (2/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_225-43381-0_sp1.1 from training. Duration: 0.8545625 2023-10-04 01:48:44,772 WARNING [train.py:1204] (2/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_242-3472-0 from training. Duration: 0.73 2023-10-04 01:48:47,533 WARNING [train.py:1204] (2/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_339-104791-0 from training. Duration: 0.49 2023-10-04 01:48:47,608 WARNING [train.py:1204] (2/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_488-92261-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 01:48:47,793 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.conv_skip_rate, batch_count=1482453.3333333333, ans=0.0 2023-10-04 01:48:49,171 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.conv_skip_rate, batch_count=1482453.3333333333, ans=0.0 2023-10-04 01:48:50,547 WARNING [train.py:1204] (2/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_175-163894-0 from training. Duration: 0.74 2023-10-04 01:48:52,013 WARNING [train.py:1204] (2/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_41-234353-0_sp0.9 from training. Duration: 0.7555625 2023-10-04 01:48:55,245 WARNING [train.py:1204] (2/4) Exclude cut with ID _47251_210_18628_1_1533814063128_4137250_116-275927-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 01:48:55,280 WARNING [train.py:1204] (2/4) Exclude cut with ID _4697_210_2069_1_1526815801953_3823098_61-223324-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 01:48:56,575 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6961_210_2087_1_1527937168_6815011_562-165518-0_sp1.1 from training. Duration: 0.7 2023-10-04 01:48:57,253 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.3.encoder.layers.1.whiten, num_groups=1, num_channels=512, metric=11.32 vs. limit=12.0 2023-10-04 01:48:57,945 WARNING [train.py:1204] (2/4) Exclude cut with ID _21989_210_5854_1_1531899214931_3271360_133-44759-0 from training. Duration: 0.77 2023-10-04 01:49:00,998 WARNING [train.py:1204] (2/4) Exclude cut with ID _23557_210_8750_1_1531890106370_6846255_77-284923-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 01:49:01,177 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.ff3_skip_rate, batch_count=1482520.0, ans=0.0 2023-10-04 01:49:04,199 WARNING [train.py:1204] (2/4) Exclude cut with ID _13678_210_7497_1_1530530069226_3463170_464-296127-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 01:49:05,497 WARNING [train.py:1204] (2/4) Exclude cut with ID _22613_210_5219_1_1532253599686_4090170_393-36663-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 01:49:06,887 WARNING [train.py:1204] (2/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_235-262124-0_sp0.9 from training. Duration: 0.7444375 2023-10-04 01:49:06,988 WARNING [train.py:1204] (2/4) Exclude cut with ID _8480_210_1874_1_1528589391205_6728330_755-112625-0 from training. Duration: 0.74 2023-10-04 01:49:08,389 WARNING [train.py:1204] (2/4) Exclude cut with ID _6456_210_1534_1_1527681634212_3181049_215-309359-0 from training. Duration: 0.88 2023-10-04 01:49:08,413 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_3019_210_2028_1_1526044245_3264865_191-72235-0_sp1.1 from training. Duration: 0.8818125 2023-10-04 01:49:09,864 WARNING [train.py:1204] (2/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_380-162648-0 from training. Duration: 0.94 2023-10-04 01:49:10,164 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=1482520.0, ans=0.1 2023-10-04 01:49:12,579 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_92-46009-0 from training. Duration: 0.54 2023-10-04 01:49:12,603 WARNING [train.py:1204] (2/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_53-300544-0_sp0.9 from training. Duration: 0.7444375 2023-10-04 01:49:14,061 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_68235_210_1925_1_1535863247_4484656_101-250943-0_sp1.1 from training. Duration: 0.97275 2023-10-04 01:49:14,070 WARNING [train.py:1204] (2/4) Exclude cut with ID _14970_210_15709_1_1530775547361_1966965_204-22182-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 01:49:15,348 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.705e+02 1.972e+02 2.159e+02 2.425e+02 3.623e+02, threshold=4.318e+02, percent-clipped=0.0 2023-10-04 01:49:15,491 WARNING [train.py:1204] (2/4) Exclude cut with ID _55153_210_2253_1_1534384786677_3162096_310-130092-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 01:49:15,502 WARNING [train.py:1204] (2/4) Exclude cut with ID _60113_210_16271_1_1535164212481_4386079_559-167191-0_sp1.1 from training. Duration: 0.8454375 2023-10-04 01:49:17,374 WARNING [train.py:1204] (2/4) Exclude cut with ID _14368_210_4220_1_1530532714870_3595529_93-58002-0_sp1.1 from training. Duration: 0.5181875 2023-10-04 01:49:18,680 WARNING [train.py:1204] (2/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_110-224688-0 from training. Duration: 0.97 2023-10-04 01:49:20,179 WARNING [train.py:1204] (2/4) Exclude cut with ID _40402_210_18219_1_1533522324115_2953150_178-178004-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 01:49:20,191 WARNING [train.py:1204] (2/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_321-335475-0_sp1.1 from training. Duration: 0.4090625 2023-10-04 01:49:20,249 WARNING [train.py:1204] (2/4) Exclude cut with ID _66863_210_18053_1_1535698758334_8590319_1092-271784-0 from training. Duration: 0.96 2023-10-04 01:49:20,256 WARNING [train.py:1204] (2/4) Exclude cut with ID _28317_210_12742_1_1532480302381_8510325_607-213951-0_sp1.1 from training. Duration: 0.763625 2023-10-04 01:49:20,271 WARNING [train.py:1204] (2/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_468-283113-0 from training. Duration: 0.96 2023-10-04 01:49:23,204 WARNING [train.py:1204] (2/4) Exclude cut with ID _14922_210_6228_1_1530707435273_3693310_13-165512-0_sp1.1 from training. Duration: 0.7454375 2023-10-04 01:49:23,221 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_69630_210_7234_1_1535788670_910989_56-169762-0_sp1.1 from training. Duration: 0.92725 2023-10-04 01:49:25,856 WARNING [train.py:1204] (2/4) Exclude cut with ID _67335_210_5219_1_1535525975652_4275229_121-80357-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 01:49:25,901 WARNING [train.py:1204] (2/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_126-12079-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 01:49:25,931 WARNING [train.py:1204] (2/4) Exclude cut with ID _5294_210_1747_1_1527249643928_3696179_342-270278-0_sp1.1 from training. Duration: 0.5 2023-10-04 01:49:27,857 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_30322_210_1925_1_1532843478_826817_35-328264-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 01:49:28,020 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.conv_module2.balancer1.prob, batch_count=1482586.6666666667, ans=0.125 2023-10-04 01:49:30,390 INFO [train.py:1046] (2/4) Epoch 42, batch 4600, loss[loss=0.1689, simple_loss=0.2502, pruned_loss=0.04385, over 24681.00 frames. ], tot_loss[loss=0.1551, simple_loss=0.235, pruned_loss=0.03761, over 4706384.65 frames. ], batch size: 73, lr: 2.41e-03, grad_scale: 16.0 2023-10-04 01:49:30,483 WARNING [train.py:1204] (2/4) Exclude cut with ID _8480_210_1874_1_1528589391205_6728330_755-112625-0_sp1.1 from training. Duration: 0.67275 2023-10-04 01:49:33,938 WARNING [train.py:1204] (2/4) Exclude cut with ID _38670_210_15379_1_1533448810659_4391059_132-146412-0_sp1.1 from training. Duration: 0.97275 2023-10-04 01:49:34,010 WARNING [train.py:1204] (2/4) Exclude cut with ID _22020_210_2461_1_1531724164513_6326900_563-39691-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 01:49:36,089 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_9460_210_4211_1_1528954587_10049667_572-37035-0_sp1.1 from training. Duration: 0.72725 2023-10-04 01:49:36,109 WARNING [train.py:1204] (2/4) Exclude cut with ID _68777_210_4353_1_1535810390731_3117904_129-166012-0_sp0.9 from training. Duration: 0.9444375 2023-10-04 01:49:37,482 WARNING [train.py:1204] (2/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_487-91921-0_sp1.1 from training. Duration: 0.963625 2023-10-04 01:49:37,725 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.balancer2.prob, batch_count=1482653.3333333333, ans=0.125 2023-10-04 01:49:38,875 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_195-257512-0 from training. Duration: 0.67 2023-10-04 01:49:40,288 WARNING [train.py:1204] (2/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_188-260011-0_sp1.1 from training. Duration: 0.663625 2023-10-04 01:49:42,985 WARNING [train.py:1204] (2/4) Exclude cut with ID _8480_210_1874_1_1528589391205_6728330_309-139896-0_sp0.9 from training. Duration: 0.9 2023-10-04 01:49:44,480 WARNING [train.py:1204] (2/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_866-321248-0_sp1.1 from training. Duration: 0.963625 2023-10-04 01:49:47,750 WARNING [train.py:1204] (2/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_510-216513-0_sp1.1 from training. Duration: 0.97275 2023-10-04 01:49:52,188 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.0.balancer1.prob, batch_count=1482720.0, ans=0.125 2023-10-04 01:49:53,096 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.0.layers.0.self_attn_weights.whiten_keys, num_groups=4, num_channels=128, metric=2.72 vs. limit=6.0 2023-10-04 01:49:53,399 WARNING [train.py:1204] (2/4) Exclude cut with ID _6653_210_2629_1_1527919172441_6732679_758-143677-0 from training. Duration: 0.98 2023-10-04 01:49:54,713 WARNING [train.py:1204] (2/4) Exclude cut with ID _55134_210_21982_1_1534590028377_3882170_50-214427-0_sp1.1 from training. Duration: 0.97275 2023-10-04 01:49:56,088 WARNING [train.py:1204] (2/4) Exclude cut with ID _31649_210_6228_1_1532584608063_4091078_482-301330-0_sp1.1 from training. Duration: 0.97275 2023-10-04 01:50:00,716 WARNING [train.py:1204] (2/4) Exclude cut with ID _35799_210_5854_1_1533263487887_3529071_490-326847-0_sp1.1 from training. Duration: 0.9 2023-10-04 01:50:00,724 WARNING [train.py:1204] (2/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_984-90716-0_sp1.1 from training. Duration: 0.963625 2023-10-04 01:50:05,791 WARNING [train.py:1204] (2/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_166-246557-0 from training. Duration: 0.64 2023-10-04 01:50:05,792 WARNING [train.py:1204] (2/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_51-200220-0_sp1.1 from training. Duration: 0.5090625 2023-10-04 01:50:05,868 WARNING [train.py:1204] (2/4) Exclude cut with ID _51382_210_23862_1_1534492801838_3650104_275-92796-0_sp1.1 from training. Duration: 0.936375 2023-10-04 01:50:07,547 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.feed_forward2.hidden_balancer.prob, batch_count=1482786.6666666667, ans=0.125 2023-10-04 01:50:12,735 WARNING [train.py:1204] (2/4) Exclude cut with ID _29853_210_7497_1_1532505600851_6642110_461-142829-0_sp1.1 from training. Duration: 0.97275 2023-10-04 01:50:12,775 WARNING [train.py:1204] (2/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_788-298907-0_sp0.9 from training. Duration: 0.988875 2023-10-04 01:50:14,179 WARNING [train.py:1204] (2/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_739-2098-0_sp1.1 from training. Duration: 0.836375 2023-10-04 01:50:15,704 INFO [scaling.py:1118] (2/4) WithLoss: name=encoder.encoders.2.encoder.layers.1.self_attn_weights, loss-sum=0.000e+00 2023-10-04 01:50:18,759 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7543_210_6753_1_1528541907_1399322_165-323619-0 from training. Duration: 0.69 2023-10-04 01:50:20,133 WARNING [train.py:1204] (2/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_401-255491-0_sp1.1 from training. Duration: 0.636375 2023-10-04 01:50:24,387 WARNING [train.py:1204] (2/4) Exclude cut with ID _66873_210_5517_1_1535454136506_4293370_24-143283-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 01:50:25,754 WARNING [train.py:1204] (2/4) Exclude cut with ID _14459_210_2228_1_1531443641946_7516700_1221-280439-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 01:50:27,800 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.4.encoder.layers.0.feed_forward2.out_whiten, num_groups=1, num_channels=384, metric=5.83 vs. limit=15.0 2023-10-04 01:50:29,081 WARNING [train.py:1204] (2/4) Exclude cut with ID _59202_210_26984_1_1534816810612_3549174_469-213482-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 01:50:29,082 WARNING [train.py:1204] (2/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_148-158509-0_sp0.9 from training. Duration: 0.4555625 2023-10-04 01:50:30,338 WARNING [train.py:1204] (2/4) Exclude cut with ID _24314_210_4915_1_1532135078116_7170179_863-259095-0_sp1.1 from training. Duration: 0.97275 2023-10-04 01:50:30,405 WARNING [train.py:1204] (2/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_321-22821-0 from training. Duration: 0.89 2023-10-04 01:50:30,425 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_43831_210_2158_1_1533725502_5872791_55-163137-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 01:50:30,469 WARNING [train.py:1204] (2/4) Exclude cut with ID _27430_210_7770_1_1532604689375_1378590_26-25207-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 01:50:33,160 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_17613_210_2151_1_1531137569_2366160_223-316461-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 01:50:34,537 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_53679_210_4852_1_1534556726_4186141_541-213370-0_sp1.1 from training. Duration: 0.963625 2023-10-04 01:50:35,916 WARNING [train.py:1204] (2/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_369-295086-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 01:50:35,974 WARNING [train.py:1204] (2/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_91-267696-0 from training. Duration: 0.87 2023-10-04 01:50:36,021 WARNING [train.py:1204] (2/4) Exclude cut with ID _5294_210_1747_1_1527249643928_3696179_180-100713-0 from training. Duration: 0.81 2023-10-04 01:50:37,696 WARNING [train.py:1204] (2/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_32-335103-0 from training. Duration: 0.82 2023-10-04 01:50:37,701 WARNING [train.py:1204] (2/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_133-312727-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 01:50:37,795 WARNING [train.py:1204] (2/4) Exclude cut with ID _59288_210_5854_1_1535077757774_3412246_391-165934-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 01:50:39,106 WARNING [train.py:1204] (2/4) Exclude cut with ID _28323_210_16187_1_1532480395667_4100300_488-1281-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 01:50:40,998 WARNING [train.py:1204] (2/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_124-193207-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 01:50:45,194 INFO [train.py:1046] (2/4) Epoch 42, batch 4650, loss[loss=0.1677, simple_loss=0.2376, pruned_loss=0.04893, over 23765.00 frames. ], tot_loss[loss=0.1548, simple_loss=0.235, pruned_loss=0.03726, over 4717759.93 frames. ], batch size: 179, lr: 2.41e-03, grad_scale: 16.0 2023-10-04 01:50:47,546 INFO [scaling.py:1022] (2/4) Whitening: name=encoder_embed.out_whiten, num_groups=1, num_channels=192, metric=6.95 vs. limit=8.0 2023-10-04 01:50:51,226 WARNING [train.py:1204] (2/4) Exclude cut with ID _14087_210_2253_1_1530529224512_3014029_226-242140-0_sp0.9 from training. Duration: 0.9 2023-10-04 01:50:52,590 WARNING [train.py:1204] (2/4) Exclude cut with ID _29277_210_3279_1_1532581109293_3173069_392-3694-0_sp1.1 from training. Duration: 0.936375 2023-10-04 01:50:52,625 WARNING [train.py:1204] (2/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_563-329842-0_sp1.1 from training. Duration: 0.97275 2023-10-04 01:50:52,674 WARNING [train.py:1204] (2/4) Exclude cut with ID _58583_210_5236_1_1534726835266_3574249_391-87755-0_sp1.1 from training. Duration: 0.92725 2023-10-04 01:50:52,692 WARNING [train.py:1204] (2/4) Exclude cut with ID _28427_210_4220_1_1532581240967_3061069_281-237827-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 01:50:52,705 WARNING [train.py:1204] (2/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_44-147898-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 01:50:54,051 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_24007_210_1794_1_1531918925_1663875_125-320772-0_sp1.1 from training. Duration: 0.97275 2023-10-04 01:50:56,768 WARNING [train.py:1204] (2/4) Exclude cut with ID _14368_210_4220_1_1530532714870_3595529_143-117421-0 from training. Duration: 0.58 2023-10-04 01:51:01,167 WARNING [train.py:1204] (2/4) Exclude cut with ID _69584_210_5219_1_1535785133775_4044450_325-7656-0_sp0.9 from training. Duration: 0.9555625 2023-10-04 01:51:02,549 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_37351_210_7925_1_1533012931_3812113_265-85780-0 from training. Duration: 0.88 2023-10-04 01:51:02,577 WARNING [train.py:1204] (2/4) Exclude cut with ID _18516_210_9206_1_1531704653206_3692406_331-237896-0_sp1.1 from training. Duration: 0.936375 2023-10-04 01:51:03,868 WARNING [train.py:1204] (2/4) Exclude cut with ID _14629_210_15709_1_1530604298182_956899_6-232695-0 from training. Duration: 0.94 2023-10-04 01:51:03,890 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7544_210_6753_1_1528587000_691468_87-178599-0_sp1.1 from training. Duration: 0.7909375 2023-10-04 01:51:03,944 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_326-303945-0 from training. Duration: 0.99 2023-10-04 01:51:03,959 WARNING [train.py:1204] (2/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_503-30719-0 from training. Duration: 0.98 2023-10-04 01:51:03,968 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_11305_210_1794_1_1529750428_7178148_299-344640-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 01:51:04,034 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_53071_210_16554_1_1534490405_2954066_265-170472-0_sp1.1 from training. Duration: 0.7545625 2023-10-04 01:51:09,895 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_174-277364-0_sp1.1 from training. Duration: 0.6454375 2023-10-04 01:51:11,277 WARNING [train.py:1204] (2/4) Exclude cut with ID _58414_210_8254_1_1535421609162_7151182_193-151884-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 01:51:11,296 WARNING [train.py:1204] (2/4) Exclude cut with ID IC0651W0104-322063 from training. Duration: 0.768 2023-10-04 01:51:12,875 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.conv_skip_rate, batch_count=1483120.0, ans=0.0 2023-10-04 01:51:14,077 WARNING [train.py:1204] (2/4) Exclude cut with ID _21881_210_3525_1_1531825242757_3798380_139-305982-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 01:51:15,424 WARNING [train.py:1204] (2/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_79-200392-0 from training. Duration: 0.97 2023-10-04 01:51:17,323 WARNING [train.py:1204] (2/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_451-280894-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 01:51:17,330 WARNING [train.py:1204] (2/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_304-273433-0_sp1.1 from training. Duration: 0.9 2023-10-04 01:51:18,672 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_5959_210_1794_1_1527400400_3445726_412-44052-0 from training. Duration: 0.77 2023-10-04 01:51:20,097 WARNING [train.py:1204] (2/4) Exclude cut with ID _4681_210_3525_1_1527251040321_1235713_148-26159-0_sp1.1 from training. Duration: 0.963625 2023-10-04 01:51:22,712 WARNING [train.py:1204] (2/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_887-249555-0_sp0.9 from training. Duration: 0.8555625 2023-10-04 01:51:26,801 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_25236_210_2158_1_1532051968_280567_37-6860-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 01:51:30,226 WARNING [train.py:1204] (2/4) Exclude cut with ID _29277_210_3279_1_1532581109293_3173069_417-115527-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 01:51:32,909 WARNING [train.py:1204] (2/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_552-134748-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 01:51:32,967 WARNING [train.py:1204] (2/4) Exclude cut with ID _43176_210_8254_1_1533524417469_4174339_188-174143-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 01:51:33,007 WARNING [train.py:1204] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_127-119109-0_sp1.1 from training. Duration: 0.6545625 2023-10-04 01:51:37,192 WARNING [train.py:1204] (2/4) Exclude cut with ID _14922_210_6228_1_1530707435273_3693310_38-187851-0 from training. Duration: 0.62 2023-10-04 01:51:37,233 WARNING [train.py:1204] (2/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_588-125165-0 from training. Duration: 0.83 2023-10-04 01:51:37,954 WARNING [train.py:1204] (2/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_344-152526-0_sp1.1 from training. Duration: 0.4818125 2023-10-04 01:51:37,955 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7002_210_1794_1_1527935634_4034444_487-236501-0 from training. Duration: 0.66 2023-10-04 01:51:40,913 WARNING [train.py:1204] (2/4) Exclude cut with ID _23825_210_13449_1_1531915313526_3497150_379-16981-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 01:51:45,016 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.625e+02 1.878e+02 2.086e+02 2.466e+02 3.529e+02, threshold=4.172e+02, percent-clipped=0.0 2023-10-04 01:51:47,886 WARNING [train.py:1204] (2/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_297-5233-0_sp1.1 from training. Duration: 0.863625 2023-10-04 01:51:47,890 WARNING [train.py:1204] (2/4) Exclude cut with ID _41298_210_9019_1_1533430371450_4822143_178-283538-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 01:51:48,297 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.5.encoder.layers.0.whiten, num_groups=1, num_channels=256, metric=2.30 vs. limit=12.0 2023-10-04 01:51:49,200 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7046_210_6753_1_1527895337_6920917_642-106799-0 from training. Duration: 0.92 2023-10-04 01:51:49,216 WARNING [train.py:1204] (2/4) Exclude cut with ID _6274_210_2084_1_1527937583837_7494020_532-291952-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 01:51:51,039 WARNING [train.py:1204] (2/4) Exclude cut with ID _47278_210_12041_1_1533902418349_3576110_260-270468-0_sp1.1 from training. Duration: 0.92725 2023-10-04 01:51:51,045 WARNING [train.py:1204] (2/4) Exclude cut with ID _14368_210_4220_1_1530532714870_3595529_144-3287-0_sp0.9 from training. Duration: 0.7444375 2023-10-04 01:51:52,542 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_162-262717-0_sp0.9 from training. Duration: 0.92225 2023-10-04 01:51:55,161 WARNING [train.py:1204] (2/4) Exclude cut with ID _3465_210_3525_1_1526175667060_3574049_62-335552-0_sp0.9 from training. Duration: 0.9666875 2023-10-04 01:51:55,173 WARNING [train.py:1204] (2/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_726-272852-0_sp1.1 from training. Duration: 0.92725 2023-10-04 01:51:56,528 WARNING [train.py:1204] (2/4) Exclude cut with ID _14898_210_9206_1_1530766812377_4177190_26-135492-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 01:51:57,868 INFO [train.py:1046] (2/4) Epoch 42, batch 4700, loss[loss=0.1552, simple_loss=0.2415, pruned_loss=0.03444, over 24007.00 frames. ], tot_loss[loss=0.1553, simple_loss=0.2355, pruned_loss=0.03751, over 4719918.71 frames. ], batch size: 86, lr: 2.41e-03, grad_scale: 8.0 2023-10-04 01:51:59,970 WARNING [train.py:1204] (2/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_200-95304-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 01:52:00,641 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.2.encoder.layers.2.self_attn_weights.whiten_keys, num_groups=4, num_channels=128, metric=2.78 vs. limit=6.0 2023-10-04 01:52:01,230 WARNING [train.py:1204] (2/4) Exclude cut with ID _45012_210_12651_1_1533608435136_5371380_240-37101-0_sp1.1 from training. Duration: 0.8454375 2023-10-04 01:52:01,238 WARNING [train.py:1204] (2/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_132-269633-0_sp0.9 from training. Duration: 0.7555625 2023-10-04 01:52:01,311 WARNING [train.py:1204] (2/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_907-151398-0_sp0.9 from training. Duration: 0.411125 2023-10-04 01:52:02,811 WARNING [train.py:1204] (2/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_45-265684-0_sp0.9 from training. Duration: 0.8 2023-10-04 01:52:04,227 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6291_210_3273_1_1527683649_1078498_74-292608-0 from training. Duration: 0.72 2023-10-04 01:52:05,087 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.2.encoder.layers.2.nonlin_attention.whiten1, num_groups=1, num_channels=288, metric=4.41 vs. limit=10.0 2023-10-04 01:52:11,087 WARNING [train.py:1204] (2/4) Exclude cut with ID _59202_210_26984_1_1534816810612_3549174_221-84819-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 01:52:13,169 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_33696_210_3852_1_1533082780_2663375_181-325595-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 01:52:13,408 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.conv_module1.balancer2.prob, batch_count=1483386.6666666667, ans=0.125 2023-10-04 01:52:13,985 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.2.encoder.layers.1.feed_forward2.out_whiten, num_groups=1, num_channels=384, metric=5.28 vs. limit=15.0 2023-10-04 01:52:14,523 WARNING [train.py:1204] (2/4) Exclude cut with ID _47251_210_18628_1_1533814063128_4137250_351-154105-0_sp1.1 from training. Duration: 0.963625 2023-10-04 01:52:14,605 WARNING [train.py:1204] (2/4) Exclude cut with ID _35501_210_9288_1_1532998507986_3192189_351-334740-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 01:52:14,713 WARNING [train.py:1204] (2/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_342-100157-0_sp1.1 from training. Duration: 0.4909375 2023-10-04 01:52:18,936 WARNING [train.py:1204] (2/4) Exclude cut with ID _6789_210_1534_1_1527857937218_3118950_262-48853-0 from training. Duration: 0.56 2023-10-04 01:52:19,193 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.conv_module1.balancer1.max_abs, batch_count=1483386.6666666667, ans=10.0 2023-10-04 01:52:20,203 WARNING [train.py:1204] (2/4) Exclude cut with ID _69884_210_9771_1_1535846392818_4166169_430-201020-0 from training. Duration: 0.97 2023-10-04 01:52:22,143 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_71-136518-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 01:52:23,503 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_28-291982-0_sp1.1 from training. Duration: 0.8818125 2023-10-04 01:52:23,538 WARNING [train.py:1204] (2/4) Exclude cut with ID _54218_210_12210_1_1534413646388_4409259_437-275326-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 01:52:24,994 WARNING [train.py:1204] (2/4) Exclude cut with ID _55134_210_21982_1_1534590028377_3882170_40-309139-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 01:52:28,471 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.ff2_skip_rate, batch_count=1483453.3333333333, ans=0.0 2023-10-04 01:52:30,923 WARNING [train.py:1204] (2/4) Exclude cut with ID _32704_210_8954_1_1533017488840_6747370_266-329755-0_sp1.1 from training. Duration: 0.8090625 2023-10-04 01:52:32,272 WARNING [train.py:1204] (2/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_21-174514-0_sp1.1 from training. Duration: 0.3909375 2023-10-04 01:52:32,441 WARNING [train.py:1204] (2/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_409-207378-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 01:52:39,137 WARNING [train.py:1204] (2/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_132-269633-0 from training. Duration: 0.68 2023-10-04 01:52:39,847 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_2245_210_2715_1_1525511548_3540254_310-52483-0_sp0.9 from training. Duration: 0.97775 2023-10-04 01:52:40,422 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.4.encoder.layers.2.self_attn1.whiten, num_groups=1, num_channels=384, metric=12.03 vs. limit=22.5 2023-10-04 01:52:41,477 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.nonlin_attention.balancer.prob, batch_count=1483520.0, ans=0.125 2023-10-04 01:52:41,802 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.4.encoder.layers.1.feed_forward1.out_whiten, num_groups=1, num_channels=384, metric=8.62 vs. limit=15.0 2023-10-04 01:52:42,587 WARNING [train.py:1204] (2/4) Exclude cut with ID _27430_210_7770_1_1532604689375_1378590_47-77403-0_sp1.1 from training. Duration: 0.92725 2023-10-04 01:52:42,909 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.conv_module1.balancer1.prob, batch_count=1483520.0, ans=0.125 2023-10-04 01:52:46,733 WARNING [train.py:1204] (2/4) Exclude cut with ID _7562_210_2084_1_1528887767916_3946229_186-326422-0 from training. Duration: 0.84 2023-10-04 01:52:48,120 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_14056_210_4163_1_1530604483_7572827_230-35993-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 01:52:52,736 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_23854_210_1794_1_1531969696_1329157_57-123491-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 01:52:54,108 WARNING [train.py:1204] (2/4) Exclude cut with ID _14747_210_8341_1_1530685797463_3654089_147-131229-0 from training. Duration: 0.76 2023-10-04 01:52:56,000 WARNING [train.py:1204] (2/4) Exclude cut with ID _30191_210_1664_1_1533189915047_7196670_430-19539-0_sp1.1 from training. Duration: 0.92725 2023-10-04 01:52:56,021 WARNING [train.py:1204] (2/4) Exclude cut with ID _67725_210_27767_1_1535608756671_5105359_44-50762-0_sp1.1 from training. Duration: 0.963625 2023-10-04 01:52:58,788 WARNING [train.py:1204] (2/4) Exclude cut with ID _27485_210_4916_1_1532739191677_3668269_217-161755-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 01:52:58,827 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_5931_210_2040_1_1527428481_4547999_321-193614-0_sp1.1 from training. Duration: 0.6090625 2023-10-04 01:52:58,850 WARNING [train.py:1204] (2/4) Exclude cut with ID _6267_210_2084_1_1527764658526_3894730_375-219887-0 from training. Duration: 0.67 2023-10-04 01:53:00,263 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9087W0022-538393 from training. Duration: 0.768 2023-10-04 01:53:01,618 WARNING [train.py:1204] (2/4) Exclude cut with ID _68777_210_4353_1_1535810390731_3117904_83-196459-0_sp1.1 from training. Duration: 0.963625 2023-10-04 01:53:04,299 WARNING [train.py:1204] (2/4) Exclude cut with ID _69884_210_9771_1_1535846392818_4166169_455-343812-0_sp1.1 from training. Duration: 0.92725 2023-10-04 01:53:04,300 WARNING [train.py:1204] (2/4) Exclude cut with ID _13916_210_9994_1_1530788341126_3763149_414-320174-0_sp1.1 from training. Duration: 0.92725 2023-10-04 01:53:04,305 WARNING [train.py:1204] (2/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_398-239819-0 from training. Duration: 0.99 2023-10-04 01:53:05,664 WARNING [train.py:1204] (2/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_199-232314-0_sp1.1 from training. Duration: 0.92725 2023-10-04 01:53:08,503 WARNING [train.py:1204] (2/4) Exclude cut with ID _2334_210_2667_1_1525608238880_4705129_459-104997-0 from training. Duration: 0.95 2023-10-04 01:53:11,534 INFO [train.py:1046] (2/4) Epoch 42, batch 4750, loss[loss=0.1654, simple_loss=0.2351, pruned_loss=0.04789, over 23586.00 frames. ], tot_loss[loss=0.1562, simple_loss=0.2365, pruned_loss=0.03799, over 4714729.08 frames. ], batch size: 256, lr: 2.41e-03, grad_scale: 8.0 2023-10-04 01:53:11,611 WARNING [train.py:1204] (2/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_441-209395-0_sp1.1 from training. Duration: 0.936375 2023-10-04 01:53:12,863 WARNING [train.py:1204] (2/4) Exclude cut with ID _27052_210_5986_1_1532329197187_3585009_320-80393-0_sp1.1 from training. Duration: 0.97275 2023-10-04 01:53:17,446 WARNING [train.py:1204] (2/4) Exclude cut with ID _21783_210_11357_1_1531742458730_3762679_89-217253-0_sp1.1 from training. Duration: 0.97275 2023-10-04 01:53:17,459 WARNING [train.py:1204] (2/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_453-282588-0_sp1.1 from training. Duration: 0.8909375 2023-10-04 01:53:20,127 WARNING [train.py:1204] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_88-341718-0 from training. Duration: 0.66 2023-10-04 01:53:20,177 WARNING [train.py:1204] (2/4) Exclude cut with ID _43552_210_8913_1_1533726031059_3523429_230-3521-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 01:53:23,025 WARNING [train.py:1204] (2/4) Exclude cut with ID _9679_210_7497_1_1529129212906_4363929_512-313889-0 from training. Duration: 0.81 2023-10-04 01:53:26,041 WARNING [train.py:1204] (2/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_194-252800-0_sp1.1 from training. Duration: 0.8818125 2023-10-04 01:53:26,077 WARNING [train.py:1204] (2/4) Exclude cut with ID _55209_210_1531_1_1534675828996_4597160_239-293541-0_sp1.1 from training. Duration: 0.963625 2023-10-04 01:53:26,242 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.bypass_mid.scale_min, batch_count=1483720.0, ans=0.2 2023-10-04 01:53:28,032 WARNING [train.py:1204] (2/4) Exclude cut with ID _35799_210_5854_1_1533263487887_3529071_15-325131-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 01:53:33,661 WARNING [train.py:1204] (2/4) Exclude cut with ID _14100_210_8341_1_1530529320613_3678069_279-55760-0 from training. Duration: 0.93 2023-10-04 01:53:37,729 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_489-230703-0_sp1.1 from training. Duration: 0.77275 2023-10-04 01:53:39,200 WARNING [train.py:1204] (2/4) Exclude cut with ID _6694_210_4599_1_1527852637490_3965529_144-185051-0 from training. Duration: 0.96 2023-10-04 01:53:40,579 WARNING [train.py:1204] (2/4) Exclude cut with ID _28461_210_2253_1_1532771775189_3041299_1-228155-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 01:53:43,238 WARNING [train.py:1204] (2/4) Exclude cut with ID _32704_210_8954_1_1533017488840_6747370_686-252323-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 01:53:43,241 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_40896_210_15710_1_1533433956_7100802_1075-294633-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 01:53:43,258 WARNING [train.py:1204] (2/4) Exclude cut with ID _22083_210_8254_1_1531702862439_3678970_30-92879-0_sp1.1 from training. Duration: 0.97275 2023-10-04 01:53:44,575 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9245W0121-616888 from training. Duration: 0.896 2023-10-04 01:53:44,577 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6786_210_1388_1_1527839699_4194495_378-46070-0 from training. Duration: 0.72 2023-10-04 01:53:50,706 WARNING [train.py:1204] (2/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_317-175062-0 from training. Duration: 0.82 2023-10-04 01:53:52,200 WARNING [train.py:1204] (2/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_238-89390-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 01:53:54,834 WARNING [train.py:1204] (2/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_524-310811-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 01:53:56,812 WARNING [train.py:1204] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_307-158462-0_sp0.9 from training. Duration: 0.7666875 2023-10-04 01:53:56,812 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9309W0191-640318 from training. Duration: 0.512 2023-10-04 01:53:56,823 WARNING [train.py:1204] (2/4) Exclude cut with ID _55153_210_2253_1_1534384786677_3162096_100-204617-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 01:54:01,428 WARNING [train.py:1204] (2/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_112-149213-0_sp1.1 from training. Duration: 0.863625 2023-10-04 01:54:01,673 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=1483853.3333333333, ans=0.1 2023-10-04 01:54:04,136 WARNING [train.py:1204] (2/4) Exclude cut with ID _67831_210_12392_1_1535612464202_3092612_346-2148-0_sp1.1 from training. Duration: 0.8454375 2023-10-04 01:54:05,549 WARNING [train.py:1204] (2/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_51-117576-0_sp0.9 from training. Duration: 0.4444375 2023-10-04 01:54:06,908 WARNING [train.py:1204] (2/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_300-27235-0 from training. Duration: 0.87 2023-10-04 01:54:06,969 WARNING [train.py:1204] (2/4) Exclude cut with ID _40399_210_4353_1_1533305658194_3279406_195-116825-0_sp1.1 from training. Duration: 0.97275 2023-10-04 01:54:06,991 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7002_210_1794_1_1527939683_5157286_163-87552-0_sp0.9 from training. Duration: 0.9555625 2023-10-04 01:54:07,021 WARNING [train.py:1204] (2/4) Exclude cut with ID _19514_210_9259_1_1531530071330_5084230_560-237597-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 01:54:08,363 WARNING [train.py:1204] (2/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_41-234353-0_sp1.1 from training. Duration: 0.6181875 2023-10-04 01:54:08,382 WARNING [train.py:1204] (2/4) Exclude cut with ID _6274_210_2084_1_1527937583837_7494020_769-168302-0 from training. Duration: 0.71 2023-10-04 01:54:09,899 WARNING [train.py:1204] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_574-45541-0 from training. Duration: 0.65 2023-10-04 01:54:11,195 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.638e+02 1.906e+02 2.067e+02 2.489e+02 4.239e+02, threshold=4.133e+02, percent-clipped=1.0 2023-10-04 01:54:11,320 WARNING [train.py:1204] (2/4) Exclude cut with ID _50310_210_15488_1_1534068043345_3641080_262-67120-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 01:54:11,532 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.0.feed_forward3.hidden_balancer.prob, batch_count=1483920.0, ans=0.125 2023-10-04 01:54:12,792 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_53071_210_16554_1_1534493434_1276796_35-11927-0_sp1.1 from training. Duration: 0.963625 2023-10-04 01:54:12,794 WARNING [train.py:1204] (2/4) Exclude cut with ID _3992_210_1747_1_1526644816031_3731060_107-251434-0 from training. Duration: 0.99 2023-10-04 01:54:14,140 WARNING [train.py:1204] (2/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_412-54488-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 01:54:15,477 WARNING [train.py:1204] (2/4) Exclude cut with ID _18516_210_9206_1_1531704653206_3692406_77-258490-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 01:54:16,200 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_40047_210_2290_1_1533290519_2374311_195-165693-0_sp0.9 from training. Duration: 0.988875 2023-10-04 01:54:16,249 WARNING [train.py:1204] (2/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_296-36971-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 01:54:17,533 WARNING [train.py:1204] (2/4) Exclude cut with ID _14970_210_15709_1_1530773543493_1715730_187-20259-0_sp1.1 from training. Duration: 0.8090625 2023-10-04 01:54:21,711 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_53373_210_5399_1_1534229471_4715695_135-305927-0_sp1.1 from training. Duration: 0.936375 2023-10-04 01:54:21,739 WARNING [train.py:1204] (2/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_60-18123-0 from training. Duration: 0.88 2023-10-04 01:54:23,141 WARNING [train.py:1204] (2/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_166-166990-0 from training. Duration: 0.96 2023-10-04 01:54:23,218 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_5438_210_1794_1_1527336687_2600009_233-58072-0 from training. Duration: 0.77 2023-10-04 01:54:24,965 INFO [train.py:1046] (2/4) Epoch 42, batch 4800, loss[loss=0.1471, simple_loss=0.2338, pruned_loss=0.03017, over 24334.00 frames. ], tot_loss[loss=0.1561, simple_loss=0.2367, pruned_loss=0.03773, over 4719602.67 frames. ], batch size: 61, lr: 2.41e-03, grad_scale: 16.0 2023-10-04 01:54:28,497 WARNING [train.py:1204] (2/4) Exclude cut with ID _8833_210_5219_1_1528592665210_7553059_611-80313-0_sp0.9 from training. Duration: 0.711125 2023-10-04 01:54:28,724 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.nonlin_attention.balancer.min_positive, batch_count=1483986.6666666667, ans=0.05 2023-10-04 01:54:29,713 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_53555_210_5632_1_1534595220_7232297_118-43239-0_sp1.1 from training. Duration: 0.936375 2023-10-04 01:54:31,022 WARNING [train.py:1204] (2/4) Exclude cut with ID _12369_210_4381_1_1530356078581_4388520_480-13146-0 from training. Duration: 0.85 2023-10-04 01:54:35,253 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_892-28814-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 01:54:35,307 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_24899_210_16554_1_1532046422_3827462_160-21255-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 01:54:40,678 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_154-316450-0_sp1.1 from training. Duration: 0.6909375 2023-10-04 01:54:42,125 WARNING [train.py:1204] (2/4) Exclude cut with ID _30196_210_1664_1_1533708496522_7074140_1225-301232-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 01:54:42,162 WARNING [train.py:1204] (2/4) Exclude cut with ID _28783_210_7943_1_1532671164808_7155479_414-208100-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 01:54:42,203 WARNING [train.py:1204] (2/4) Exclude cut with ID _19023_210_4353_1_1531378892552_5820780_83-254794-0 from training. Duration: 0.86 2023-10-04 01:54:43,541 WARNING [train.py:1204] (2/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_591-83752-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 01:54:43,588 WARNING [train.py:1204] (2/4) Exclude cut with ID _29877_210_6228_1_1532397791106_5276149_266-62291-0_sp1.1 from training. Duration: 0.7909375 2023-10-04 01:54:45,046 WARNING [train.py:1204] (2/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_280-161633-0_sp1.1 from training. Duration: 0.663625 2023-10-04 01:54:48,482 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7543_210_6753_1_1528539468_333764_39-156634-0_sp1.1 from training. Duration: 0.97275 2023-10-04 01:54:48,683 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.conv_module1.balancer2.min_abs, batch_count=1484053.3333333333, ans=0.5 2023-10-04 01:54:50,259 WARNING [train.py:1204] (2/4) Exclude cut with ID _67499_210_17282_1_1535509805220_3692430_505-312248-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 01:54:50,282 WARNING [train.py:1204] (2/4) Exclude cut with ID _5294_210_1747_1_1527249643928_3696179_180-100713-0_sp0.9 from training. Duration: 0.9 2023-10-04 01:54:51,664 WARNING [train.py:1204] (2/4) Exclude cut with ID _26528_210_9070_1_1532570410842_3851310_508-148132-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 01:54:51,675 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_211-339622-0_sp1.1 from training. Duration: 0.4090625 2023-10-04 01:54:51,687 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_40896_210_15710_1_1533433956_7100802_339-138356-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 01:54:53,091 WARNING [train.py:1204] (2/4) Exclude cut with ID _27060_210_5986_1_1532674838922_3522549_211-19933-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 01:54:56,302 WARNING [train.py:1204] (2/4) Exclude cut with ID _60229_210_27767_1_1534838406158_3655350_457-117923-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 01:54:58,844 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.3.encoder.layers.0.feed_forward3.out_whiten, num_groups=1, num_channels=512, metric=11.77 vs. limit=15.0 2023-10-04 01:54:59,489 WARNING [train.py:1204] (2/4) Exclude cut with ID _55429_210_10626_1_1534467588527_7174143_460-141439-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 01:55:00,811 WARNING [train.py:1204] (2/4) Exclude cut with ID _59170_210_6721_1_1534983633960_4203335_263-10780-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 01:55:00,823 WARNING [train.py:1204] (2/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_872-264525-0_sp0.9 from training. Duration: 0.72225 2023-10-04 01:55:00,911 WARNING [train.py:1204] (2/4) Exclude cut with ID _7624_210_1531_1_1528109802198_4933101_418-349834-0_sp0.9 from training. Duration: 0.6666875 2023-10-04 01:55:03,617 WARNING [train.py:1204] (2/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_27-53998-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 01:55:03,743 WARNING [train.py:1204] (2/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_202-183922-0 from training. Duration: 0.85 2023-10-04 01:55:03,755 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7002_210_1794_1_1527939683_5157286_1-281264-0 from training. Duration: 0.44 2023-10-04 01:55:06,190 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_53753_210_7234_1_1534320003_3594148_493-9314-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 01:55:06,210 WARNING [train.py:1204] (2/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_217-95056-0_sp1.1 from training. Duration: 0.8909375 2023-10-04 01:55:06,253 WARNING [train.py:1204] (2/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_406-45086-0_sp1.1 from training. Duration: 0.72725 2023-10-04 01:55:06,259 WARNING [train.py:1204] (2/4) Exclude cut with ID _31649_210_6228_1_1532584608063_4091078_505-213821-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 01:55:06,269 WARNING [train.py:1204] (2/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_317-175062-0_sp0.9 from training. Duration: 0.911125 2023-10-04 01:55:10,288 WARNING [train.py:1204] (2/4) Exclude cut with ID _5444_210_2151_1_1527418808464_5312210_726-27165-0_sp1.1 from training. Duration: 0.6090625 2023-10-04 01:55:10,337 WARNING [train.py:1204] (2/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_494-170437-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 01:55:10,899 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.5.encoder.layers.0.nonlin_attention.whiten2, num_groups=1, num_channels=256, metric=3.47 vs. limit=15.0 2023-10-04 01:55:13,409 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_474-64054-0_sp1.1 from training. Duration: 0.936375 2023-10-04 01:55:16,290 WARNING [train.py:1204] (2/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_521-7556-0_sp1.1 from training. Duration: 0.97275 2023-10-04 01:55:17,500 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_269-348505-0_sp1.1 from training. Duration: 0.92725 2023-10-04 01:55:18,444 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.attention_skip_rate, batch_count=1484186.6666666667, ans=0.0 2023-10-04 01:55:20,917 WARNING [train.py:1204] (2/4) Exclude cut with ID _21878_210_3525_1_1531738772002_7264330_559-184862-0 from training. Duration: 0.94 2023-10-04 01:55:20,946 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_68702_210_23689_1_1535799577_3420068_92-76040-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 01:55:20,986 WARNING [train.py:1204] (2/4) Exclude cut with ID _22826_210_9994_1_1531908201522_3279169_227-102052-0_sp1.1 from training. Duration: 0.97275 2023-10-04 01:55:21,026 WARNING [train.py:1204] (2/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_718-250907-0_sp0.9 from training. Duration: 0.7666875 2023-10-04 01:55:21,083 WARNING [train.py:1204] (2/4) Exclude cut with ID _14069_210_5632_1_1530512692033_4803390_352-143891-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 01:55:25,805 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.1.feed_forward1.out_proj.dropout_p, batch_count=1484253.3333333333, ans=0.1 2023-10-04 01:55:26,988 WARNING [train.py:1204] (2/4) Exclude cut with ID _6267_210_2084_1_1527764658526_3894730_65-106602-0_sp1.1 from training. Duration: 0.936375 2023-10-04 01:55:27,054 WARNING [train.py:1204] (2/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_202-183922-0_sp0.9 from training. Duration: 0.9444375 2023-10-04 01:55:27,061 WARNING [train.py:1204] (2/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_465-268827-0_sp1.1 from training. Duration: 0.97275 2023-10-04 01:55:28,827 WARNING [train.py:1204] (2/4) Exclude cut with ID _12369_210_4381_1_1530356078581_4388520_58-29064-0_sp1.1 from training. Duration: 0.9 2023-10-04 01:55:28,877 WARNING [train.py:1204] (2/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_1-99680-0_sp0.9 from training. Duration: 0.7444375 2023-10-04 01:55:30,151 WARNING [train.py:1204] (2/4) Exclude cut with ID _13253_210_1925_1_1530757775819_3577873_191-144977-0_sp1.1 from training. Duration: 0.7090625 2023-10-04 01:55:31,142 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.0.layers.1.feed_forward1.out_whiten, num_groups=1, num_channels=192, metric=4.21 vs. limit=15.0 2023-10-04 01:55:33,153 WARNING [train.py:1204] (2/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_276-112684-0_sp1.1 from training. Duration: 0.92725 2023-10-04 01:55:33,168 WARNING [train.py:1204] (2/4) Exclude cut with ID _64639_210_5816_1_1535367619437_3528000_358-411-0_sp1.1 from training. Duration: 0.97275 2023-10-04 01:55:34,584 WARNING [train.py:1204] (2/4) Exclude cut with ID _31649_210_6228_1_1532584608063_4091078_106-21199-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 01:55:34,696 WARNING [train.py:1204] (2/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_901-79438-0 from training. Duration: 0.94 2023-10-04 01:55:36,263 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.conv_module2.balancer1.prob, batch_count=1484253.3333333333, ans=0.125 2023-10-04 01:55:37,436 WARNING [train.py:1204] (2/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_718-250907-0 from training. Duration: 0.69 2023-10-04 01:55:37,442 WARNING [train.py:1204] (2/4) Exclude cut with ID _5287_210_2084_1_1527418883040_3878670_363-194161-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 01:55:37,447 WARNING [train.py:1204] (2/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_586-321321-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 01:55:38,756 INFO [train.py:1046] (2/4) Epoch 42, batch 4850, loss[loss=0.1699, simple_loss=0.2575, pruned_loss=0.04111, over 24068.00 frames. ], tot_loss[loss=0.1564, simple_loss=0.2369, pruned_loss=0.03794, over 4705737.77 frames. ], batch size: 86, lr: 2.41e-03, grad_scale: 16.0 2023-10-04 01:55:38,837 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_68900_210_2087_1_1535713157_3708529_164-292661-0_sp1.1 from training. Duration: 0.963625 2023-10-04 01:55:38,838 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_26433_210_12108_1_1532157158_4056480_114-144878-0_sp1.1 from training. Duration: 0.97275 2023-10-04 01:55:40,376 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_15517_210_6753_1_1530954008_3487336_96-341874-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 01:55:40,627 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.feed_forward2.hidden_balancer.prob, batch_count=1484320.0, ans=0.125 2023-10-04 01:55:48,888 WARNING [train.py:1204] (2/4) Exclude cut with ID _14268_210_9206_1_1530622771285_4714099_637-86630-0 from training. Duration: 0.51 2023-10-04 01:55:50,268 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_28211_210_6388_1_1532861506_4523224_121-320092-0_sp1.1 from training. Duration: 0.92725 2023-10-04 01:55:53,187 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_141-89113-0_sp0.9 from training. Duration: 0.9555625 2023-10-04 01:55:53,275 WARNING [train.py:1204] (2/4) Exclude cut with ID _6789_210_1534_1_1527857937218_3118950_311-61262-0_sp1.1 from training. Duration: 0.5909375 2023-10-04 01:55:53,295 WARNING [train.py:1204] (2/4) Exclude cut with ID _58480_210_12392_1_1534811121111_3646160_389-200461-0_sp1.1 from training. Duration: 0.97275 2023-10-04 01:55:56,548 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.feed_forward3.hidden_balancer.prob, batch_count=1484386.6666666667, ans=0.125 2023-10-04 01:55:56,571 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.feed_forward2.hidden_balancer.prob, batch_count=1484386.6666666667, ans=0.125 2023-10-04 01:55:57,779 WARNING [train.py:1204] (2/4) Exclude cut with ID _41326_210_5854_1_1533778076509_3492080_28-156553-0_sp1.1 from training. Duration: 0.92725 2023-10-04 01:55:59,729 WARNING [train.py:1204] (2/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_577-287150-0_sp1.1 from training. Duration: 0.7454375 2023-10-04 01:56:01,142 WARNING [train.py:1204] (2/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_40-129756-0_sp1.1 from training. Duration: 0.736375 2023-10-04 01:56:01,159 WARNING [train.py:1204] (2/4) Exclude cut with ID _60113_210_16271_1_1535164212481_4386079_490-280290-0 from training. Duration: 0.98 2023-10-04 01:56:06,615 WARNING [train.py:1204] (2/4) Exclude cut with ID _58059_210_5854_1_1534901471149_3164808_34-39905-0_sp1.1 from training. Duration: 0.963625 2023-10-04 01:56:09,355 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_791-197891-0_sp1.1 from training. Duration: 0.763625 2023-10-04 01:56:09,371 WARNING [train.py:1204] (2/4) Exclude cut with ID _6789_210_1534_1_1527857937218_3118950_262-48853-0_sp1.1 from training. Duration: 0.5090625 2023-10-04 01:56:09,439 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_2655_210_2087_1_1525688959_3897400_52-279899-0_sp0.9 from training. Duration: 0.7666875 2023-10-04 01:56:09,443 WARNING [train.py:1204] (2/4) Exclude cut with ID _14884_210_2252_1_1530707459689_5561250_114-201399-0 from training. Duration: 0.76 2023-10-04 01:56:12,246 WARNING [train.py:1204] (2/4) Exclude cut with ID _19023_210_4353_1_1531378892552_5820780_83-254794-0_sp0.9 from training. Duration: 0.9555625 2023-10-04 01:56:13,540 WARNING [train.py:1204] (2/4) Exclude cut with ID _53489_210_6228_1_1534298357169_3835970_10-297919-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 01:56:16,336 WARNING [train.py:1204] (2/4) Exclude cut with ID _44585_210_18628_1_1533605350387_4518199_418-3055-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 01:56:16,348 WARNING [train.py:1204] (2/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_310-109461-0 from training. Duration: 0.8 2023-10-04 01:56:16,392 WARNING [train.py:1204] (2/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_235-262124-0 from training. Duration: 0.67 2023-10-04 01:56:18,146 WARNING [train.py:1204] (2/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_195-167011-0_sp1.1 from training. Duration: 0.6818125 2023-10-04 01:56:25,545 WARNING [train.py:1204] (2/4) Exclude cut with ID _7852_210_5219_1_1528286471316_8139400_188-55162-0_sp1.1 from training. Duration: 0.936375 2023-10-04 01:56:27,496 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_35361_210_4238_1_1533262810_7793476_675-87012-0 from training. Duration: 0.89 2023-10-04 01:56:27,587 WARNING [train.py:1204] (2/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_465-81427-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 01:56:28,930 WARNING [train.py:1204] (2/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_597-154640-0_sp1.1 from training. Duration: 0.8181875 2023-10-04 01:56:30,288 WARNING [train.py:1204] (2/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_186-309161-0_sp0.9 from training. Duration: 0.6 2023-10-04 01:56:31,724 WARNING [train.py:1204] (2/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_546-119312-0 from training. Duration: 0.99 2023-10-04 01:56:31,728 WARNING [train.py:1204] (2/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_330-240060-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 01:56:33,584 WARNING [train.py:1204] (2/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_477-255782-0 from training. Duration: 0.82 2023-10-04 01:56:33,609 WARNING [train.py:1204] (2/4) Exclude cut with ID _14970_210_15709_1_1530773543493_1715730_150-248939-0_sp1.1 from training. Duration: 0.97275 2023-10-04 01:56:33,690 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_556-206615-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 01:56:33,840 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.attention_skip_rate, batch_count=1484520.0, ans=0.0 2023-10-04 01:56:35,024 WARNING [train.py:1204] (2/4) Exclude cut with ID _3465_210_3525_1_1526175667060_3574049_308-231735-0 from training. Duration: 0.73 2023-10-04 01:56:36,535 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=1484586.6666666667, ans=0.1 2023-10-04 01:56:40,322 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.545e+02 1.976e+02 2.279e+02 2.618e+02 4.353e+02, threshold=4.559e+02, percent-clipped=2.0 2023-10-04 01:56:42,011 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.1.ff3_skip_rate, batch_count=1484586.6666666667, ans=0.0 2023-10-04 01:56:43,380 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.conv_skip_rate, batch_count=1484586.6666666667, ans=0.0 2023-10-04 01:56:44,647 WARNING [train.py:1204] (2/4) Exclude cut with ID _27485_210_4916_1_1532739191677_3668269_66-236571-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 01:56:47,649 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.attention_skip_rate, batch_count=1484586.6666666667, ans=0.0 2023-10-04 01:56:48,763 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_2221_210_1988_1_1525597051_4935541_415-296882-0_sp0.9 from training. Duration: 0.8555625 2023-10-04 01:56:48,776 WARNING [train.py:1204] (2/4) Exclude cut with ID _64521_210_14699_1_1535243427259_3815328_231-78630-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 01:56:51,961 INFO [train.py:1046] (2/4) Epoch 42, batch 4900, loss[loss=0.1516, simple_loss=0.2284, pruned_loss=0.03742, over 23388.00 frames. ], tot_loss[loss=0.1553, simple_loss=0.2356, pruned_loss=0.03751, over 4714866.63 frames. ], batch size: 119, lr: 2.41e-03, grad_scale: 8.0 2023-10-04 01:56:54,833 WARNING [train.py:1204] (2/4) Exclude cut with ID _13942_210_9988_1_1530604906036_3733360_499-55676-0 from training. Duration: 0.62 2023-10-04 01:56:54,835 WARNING [train.py:1204] (2/4) Exclude cut with ID _49376_210_5986_1_1533985278732_3370740_301-154763-0_sp1.1 from training. Duration: 0.8909375 2023-10-04 01:56:59,780 WARNING [train.py:1204] (2/4) Exclude cut with ID _27485_210_4916_1_1532739191677_3668269_255-216698-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 01:56:59,862 WARNING [train.py:1204] (2/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_137-334143-0_sp1.1 from training. Duration: 0.97275 2023-10-04 01:57:01,052 WARNING [train.py:1204] (2/4) Exclude cut with ID _6456_210_1534_1_1527681634212_3181049_215-309359-0_sp1.1 from training. Duration: 0.8 2023-10-04 01:57:04,237 WARNING [train.py:1204] (2/4) Exclude cut with ID _65640_210_6310_1_1535368201445_3173250_209-13008-0 from training. Duration: 0.99 2023-10-04 01:57:05,836 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.1.conv_skip_rate, batch_count=1484720.0, ans=0.0 2023-10-04 01:57:08,450 WARNING [train.py:1204] (2/4) Exclude cut with ID _12369_210_4381_1_1530356078581_4388520_58-29064-0 from training. Duration: 0.99 2023-10-04 01:57:11,477 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.feed_forward1.out_proj.dropout_p, batch_count=1484720.0, ans=0.1 2023-10-04 01:57:12,581 WARNING [train.py:1204] (2/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_410-287857-0 from training. Duration: 0.93 2023-10-04 01:57:14,001 WARNING [train.py:1204] (2/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_153-153537-0 from training. Duration: 0.91 2023-10-04 01:57:14,033 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7544_210_6753_1_1528587722_3576686_172-171750-0_sp0.9 from training. Duration: 0.72225 2023-10-04 01:57:15,355 WARNING [train.py:1204] (2/4) Exclude cut with ID _64521_210_14699_1_1535243427259_3815328_464-183853-0_sp1.1 from training. Duration: 0.97275 2023-10-04 01:57:15,371 WARNING [train.py:1204] (2/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_385-210308-0_sp1.1 from training. Duration: 0.963625 2023-10-04 01:57:15,385 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_46013_210_7234_1_1533776362_3744032_354-261865-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 01:57:15,397 WARNING [train.py:1204] (2/4) Exclude cut with ID _3067_210_2667_1_1526212974640_4503168_250-285937-0_sp1.1 from training. Duration: 0.72725 2023-10-04 01:57:16,716 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_40036_210_13405_1_1533648647_1315388_48-146016-0 from training. Duration: 0.93 2023-10-04 01:57:18,260 WARNING [train.py:1204] (2/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_280-161633-0 from training. Duration: 0.73 2023-10-04 01:57:19,615 WARNING [train.py:1204] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_308-161067-0_sp0.9 from training. Duration: 0.7333125 2023-10-04 01:57:21,624 WARNING [train.py:1204] (2/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_68-212801-0_sp0.9 from training. Duration: 0.811125 2023-10-04 01:57:21,686 WARNING [train.py:1204] (2/4) Exclude cut with ID _6789_210_1534_1_1527857937218_3118950_311-61262-0_sp0.9 from training. Duration: 0.72225 2023-10-04 01:57:24,942 WARNING [train.py:1204] (2/4) Exclude cut with ID _14632_210_9988_1_1530691279392_3923200_102-204477-0_sp1.1 from training. Duration: 0.8818125 2023-10-04 01:57:24,996 WARNING [train.py:1204] (2/4) Exclude cut with ID _65242_210_18628_1_1535369393717_3823149_290-117517-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 01:57:26,397 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_69630_210_7234_1_1535788670_910989_35-337140-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 01:57:26,413 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_5618_210_1874_1_1527311008_5851_1-214501-0 from training. Duration: 0.689 2023-10-04 01:57:28,434 WARNING [train.py:1204] (2/4) Exclude cut with ID _8064_210_5399_1_1528372299291_4282973_168-50337-0_sp0.9 from training. Duration: 0.9333125 2023-10-04 01:57:28,532 WARNING [train.py:1204] (2/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_185-14438-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 01:57:28,549 WARNING [train.py:1204] (2/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_61-327062-0 from training. Duration: 0.81 2023-10-04 01:57:28,554 WARNING [train.py:1204] (2/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_98-741-0 from training. Duration: 0.8 2023-10-04 01:57:28,815 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=1484786.6666666667, ans=0.1 2023-10-04 01:57:33,346 WARNING [train.py:1204] (2/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_312-204798-0 from training. Duration: 0.93 2023-10-04 01:57:34,696 WARNING [train.py:1204] (2/4) Exclude cut with ID _14998_210_5219_1_1530705582080_7474970_787-294416-0_sp0.9 from training. Duration: 0.911125 2023-10-04 01:57:36,052 WARNING [train.py:1204] (2/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_140-181176-0_sp1.1 from training. Duration: 0.836375 2023-10-04 01:57:36,094 WARNING [train.py:1204] (2/4) Exclude cut with ID _14687_210_5854_1_1530703838259_3664889_115-239046-0_sp1.1 from training. Duration: 0.6090625 2023-10-04 01:57:37,418 WARNING [train.py:1204] (2/4) Exclude cut with ID _56423_210_5854_1_1534896061049_3165969_370-230614-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 01:57:37,464 WARNING [train.py:1204] (2/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_406-275649-0_sp0.9 from training. Duration: 0.4666875 2023-10-04 01:57:37,476 WARNING [train.py:1204] (2/4) Exclude cut with ID _15150_210_8341_1_1530752411803_7256384_22-138274-0_sp1.1 from training. Duration: 0.87275 2023-10-04 01:57:37,512 WARNING [train.py:1204] (2/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_125-125818-0 from training. Duration: 0.96 2023-10-04 01:57:40,332 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_4399_210_1874_1_1526705485_2400757_220-47974-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 01:57:41,692 WARNING [train.py:1204] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_308-161067-0_sp1.1 from training. Duration: 0.6 2023-10-04 01:57:43,079 WARNING [train.py:1204] (2/4) Exclude cut with ID _53489_210_6228_1_1534298357169_3835970_102-57645-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 01:57:45,849 WARNING [train.py:1204] (2/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_178-177582-0 from training. Duration: 0.74 2023-10-04 01:57:47,031 WARNING [train.py:1204] (2/4) Exclude cut with ID _14349_210_8341_1_1530615710818_3442470_170-336541-0_sp1.1 from training. Duration: 0.7818125 2023-10-04 01:57:49,179 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_521-214276-0_sp0.9 from training. Duration: 0.488875 2023-10-04 01:57:49,224 WARNING [train.py:1204] (2/4) Exclude cut with ID _66070_210_7640_1_1535441393096_7805310_489-292631-0 from training. Duration: 0.79 2023-10-04 01:57:51,393 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.3.encoder.layers.0.self_attn1.whiten, num_groups=1, num_channels=512, metric=16.29 vs. limit=22.5 2023-10-04 01:57:54,896 WARNING [train.py:1204] (2/4) Exclude cut with ID _14069_210_5632_1_1530512692033_4803390_438-340023-0_sp1.1 from training. Duration: 0.936375 2023-10-04 01:57:56,919 WARNING [train.py:1204] (2/4) Exclude cut with ID _7852_210_5219_1_1528286471316_8139400_242-105336-0_sp1.1 from training. Duration: 0.6545625 2023-10-04 01:57:58,429 WARNING [train.py:1204] (2/4) Exclude cut with ID _3867_210_1883_1_1526641200575_4475830_272-215563-0 from training. Duration: 0.57 2023-10-04 01:57:59,583 WARNING [train.py:1204] (2/4) Exclude cut with ID _14368_210_4220_1_1530532714870_3595529_143-117421-0_sp0.9 from training. Duration: 0.6444375 2023-10-04 01:57:59,592 WARNING [train.py:1204] (2/4) Exclude cut with ID _9235_210_5219_1_1528891154253_8046120_30-326681-0_sp1.1 from training. Duration: 0.7545625 2023-10-04 01:58:01,080 WARNING [train.py:1204] (2/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_538-218448-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 01:58:05,686 WARNING [train.py:1204] (2/4) Exclude cut with ID _33640_210_2655_1_1533117605479_4855469_60-192970-0_sp1.1 from training. Duration: 0.963625 2023-10-04 01:58:07,045 INFO [train.py:1046] (2/4) Epoch 42, batch 4950, loss[loss=0.1676, simple_loss=0.2553, pruned_loss=0.03993, over 24522.00 frames. ], tot_loss[loss=0.1546, simple_loss=0.2348, pruned_loss=0.0372, over 4708063.96 frames. ], batch size: 71, lr: 2.41e-03, grad_scale: 8.0 2023-10-04 01:58:07,131 WARNING [train.py:1204] (2/4) Exclude cut with ID _65585_210_12210_1_1535450451584_3764098_112-294301-0_sp1.1 from training. Duration: 0.8 2023-10-04 01:58:07,162 WARNING [train.py:1204] (2/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_643-37412-0_sp1.1 from training. Duration: 0.936375 2023-10-04 01:58:07,186 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_9302_210_2550_1_1528889962_4655416_638-301620-0_sp1.1 from training. Duration: 0.42725 2023-10-04 01:58:08,490 WARNING [train.py:1204] (2/4) Exclude cut with ID _3867_210_1883_1_1526641200575_4475830_272-215563-0_sp1.1 from training. Duration: 0.5181875 2023-10-04 01:58:12,622 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_53679_210_4852_1_1534556726_4186141_358-154143-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 01:58:12,633 WARNING [train.py:1204] (2/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_841-341679-0_sp0.9 from training. Duration: 0.6444375 2023-10-04 01:58:12,999 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.nonlin_attention.balancer.min_positive, batch_count=1484986.6666666667, ans=0.05 2023-10-04 01:58:14,014 WARNING [train.py:1204] (2/4) Exclude cut with ID _7160_210_1876_1_1527940923231_3703799_302-150728-0 from training. Duration: 0.78 2023-10-04 01:58:14,043 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_125-203105-0 from training. Duration: 0.8 2023-10-04 01:58:14,060 WARNING [train.py:1204] (2/4) Exclude cut with ID _5585_210_2667_1_1527418813324_8007340_89-332072-0_sp0.9 from training. Duration: 0.711125 2023-10-04 01:58:15,492 WARNING [train.py:1204] (2/4) Exclude cut with ID _3992_210_1747_1_1526644816031_3731060_36-147855-0 from training. Duration: 0.78 2023-10-04 01:58:16,780 WARNING [train.py:1204] (2/4) Exclude cut with ID _59256_210_5986_1_1535076085997_3589120_172-239045-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 01:58:16,791 WARNING [train.py:1204] (2/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_472-216663-0_sp1.1 from training. Duration: 0.836375 2023-10-04 01:58:16,850 WARNING [train.py:1204] (2/4) Exclude cut with ID _36512_210_6987_1_1533033143998_3126069_181-306500-0_sp1.1 from training. Duration: 0.863625 2023-10-04 01:58:16,866 WARNING [train.py:1204] (2/4) Exclude cut with ID _56524_210_9642_1_1534572011779_3619170_492-95008-0_sp1.1 from training. Duration: 0.97275 2023-10-04 01:58:19,930 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_33348_210_5632_1_1533086912_3324030_119-90326-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 01:58:19,993 WARNING [train.py:1204] (2/4) Exclude cut with ID _14368_210_4220_1_1530532714870_3595529_231-137892-0_sp1.1 from training. Duration: 0.9 2023-10-04 01:58:20,206 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.balancer2.prob, batch_count=1485053.3333333333, ans=0.125 2023-10-04 01:58:22,522 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8453_210_4211_1_1528502940_8562730_596-306820-0_sp1.1 from training. Duration: 0.8818125 2023-10-04 01:58:22,592 WARNING [train.py:1204] (2/4) Exclude cut with ID _27052_210_5986_1_1532329197187_3585009_50-242750-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 01:58:24,550 WARNING [train.py:1204] (2/4) Exclude cut with ID _13913_210_9994_1_1530579615187_3383897_358-98435-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 01:58:24,563 WARNING [train.py:1204] (2/4) Exclude cut with ID _27013_210_1389_1_1532437173240_3252649_399-229778-0_sp1.1 from training. Duration: 0.963625 2023-10-04 01:58:28,626 WARNING [train.py:1204] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_185-174805-0_sp0.9 from training. Duration: 0.7555625 2023-10-04 01:58:33,879 WARNING [train.py:1204] (2/4) Exclude cut with ID _65585_210_12210_1_1535450451584_3764098_196-221014-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 01:58:33,998 WARNING [train.py:1204] (2/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_202-293523-0_sp1.1 from training. Duration: 0.6545625 2023-10-04 01:58:35,338 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8275_210_4198_1_1528455320_3892571_213-127634-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 01:58:36,782 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_40896_210_15710_1_1533433956_7100802_921-231678-0_sp1.1 from training. Duration: 0.97275 2023-10-04 01:58:36,900 WARNING [train.py:1204] (2/4) Exclude cut with ID _38020_210_5986_1_1533112252259_7006089_493-102745-0_sp1.1 from training. Duration: 0.8909375 2023-10-04 01:58:39,653 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6846_210_6753_1_1527850767_4295158_2-315073-0 from training. Duration: 0.88 2023-10-04 01:58:39,727 WARNING [train.py:1204] (2/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_249-281349-0 from training. Duration: 0.85 2023-10-04 01:58:42,373 WARNING [train.py:1204] (2/4) Exclude cut with ID _18513_210_9206_1_1531618215223_3862620_314-83429-0_sp1.1 from training. Duration: 0.97275 2023-10-04 01:58:43,746 WARNING [train.py:1204] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_215-202973-0_sp0.9 from training. Duration: 0.97775 2023-10-04 01:58:43,756 WARNING [train.py:1204] (2/4) Exclude cut with ID _7719_210_3525_1_1528456153163_4296960_88-250259-0_sp1.1 from training. Duration: 0.763625 2023-10-04 01:58:43,877 WARNING [train.py:1204] (2/4) Exclude cut with ID _7606_210_4915_1_1528894253277_5080690_301-10732-0_sp1.1 from training. Duration: 0.736375 2023-10-04 01:58:45,252 WARNING [train.py:1204] (2/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_818-11907-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 01:58:47,189 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_5959_210_1794_1_1527400400_3445726_412-44052-0_sp1.1 from training. Duration: 0.7 2023-10-04 01:58:47,323 WARNING [train.py:1204] (2/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_6-279195-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 01:58:48,708 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.ff3_skip_rate, batch_count=1485120.0, ans=0.0 2023-10-04 01:58:49,790 WARNING [train.py:1204] (2/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_252-216674-0_sp0.9 from training. Duration: 0.6 2023-10-04 01:58:51,225 WARNING [train.py:1204] (2/4) Exclude cut with ID _14368_210_4220_1_1530532714870_3595529_144-3287-0_sp1.1 from training. Duration: 0.6090625 2023-10-04 01:58:54,523 WARNING [train.py:1204] (2/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_342-3879-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 01:58:54,561 WARNING [train.py:1204] (2/4) Exclude cut with ID _33640_210_2655_1_1533117605479_4855469_584-65630-0_sp1.1 from training. Duration: 0.97275 2023-10-04 01:58:55,886 WARNING [train.py:1204] (2/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_37-189262-0 from training. Duration: 0.98 2023-10-04 01:58:55,921 WARNING [train.py:1204] (2/4) Exclude cut with ID _9389_210_1664_1_1529217337301_5289269_379-128794-0_sp0.9 from training. Duration: 0.9444375 2023-10-04 01:58:57,347 WARNING [train.py:1204] (2/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_119-207162-0_sp1.1 from training. Duration: 0.6454375 2023-10-04 01:59:00,613 WARNING [train.py:1204] (2/4) Exclude cut with ID _2941_210_2577_1_1526199673150_2489759_200-333643-0_sp1.1 from training. Duration: 0.936375 2023-10-04 01:59:00,777 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.1.feed_forward2.hidden_balancer.prob, batch_count=1485186.6666666667, ans=0.125 2023-10-04 01:59:03,747 WARNING [train.py:1204] (2/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_479-125611-0_sp1.1 from training. Duration: 0.87275 2023-10-04 01:59:03,766 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_3475_210_2028_1_1526193578_3622569_64-297072-0_sp1.1 from training. Duration: 0.82725 2023-10-04 01:59:03,802 WARNING [train.py:1204] (2/4) Exclude cut with ID _53565_210_10635_1_1534337218335_3820259_139-346291-0_sp1.1 from training. Duration: 0.97275 2023-10-04 01:59:03,842 WARNING [train.py:1204] (2/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_588-125165-0_sp1.1 from training. Duration: 0.7545625 2023-10-04 01:59:04,084 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.conv_module2.balancer2.prob, batch_count=1485186.6666666667, ans=0.125 2023-10-04 01:59:05,122 WARNING [train.py:1204] (2/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_938-205135-0_sp1.1 from training. Duration: 0.7909375 2023-10-04 01:59:06,555 WARNING [train.py:1204] (2/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_216-48707-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 01:59:07,800 WARNING [train.py:1204] (2/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_477-255782-0_sp1.1 from training. Duration: 0.7454375 2023-10-04 01:59:07,843 WARNING [train.py:1204] (2/4) Exclude cut with ID _16909_210_5724_1_1531292079912_4118919_583-92842-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 01:59:09,058 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.596e+02 1.943e+02 2.204e+02 2.524e+02 4.078e+02, threshold=4.408e+02, percent-clipped=0.0 2023-10-04 01:59:09,192 WARNING [train.py:1204] (2/4) Exclude cut with ID _60860_210_8254_1_1534896015875_3557169_245-256666-0 from training. Duration: 0.97 2023-10-04 01:59:11,381 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.2.encoder.layers.2.feed_forward1.out_whiten, num_groups=1, num_channels=384, metric=6.57 vs. limit=15.0 2023-10-04 01:59:12,022 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_45399_210_4852_1_1533624725_7579737_349-116173-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 01:59:16,723 WARNING [train.py:1204] (2/4) Exclude cut with ID _8699_210_2069_1_1528630346831_3960409_155-39766-0 from training. Duration: 0.78 2023-10-04 01:59:16,731 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_86-16881-0_sp1.1 from training. Duration: 0.52725 2023-10-04 01:59:20,006 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.4.encoder.layers.1.conv_module1.whiten, num_groups=1, num_channels=384, metric=4.93 vs. limit=15.0 2023-10-04 01:59:20,893 INFO [train.py:1046] (2/4) Epoch 42, batch 5000, loss[loss=0.1519, simple_loss=0.2309, pruned_loss=0.03649, over 23632.00 frames. ], tot_loss[loss=0.1544, simple_loss=0.2341, pruned_loss=0.03737, over 4706601.67 frames. ], batch size: 149, lr: 2.41e-03, grad_scale: 8.0 2023-10-04 01:59:24,337 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_14056_210_4163_1_1530604483_7572827_191-159384-0_sp1.1 from training. Duration: 0.97275 2023-10-04 01:59:24,344 WARNING [train.py:1204] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_307-158462-0_sp1.1 from training. Duration: 0.62725 2023-10-04 01:59:25,762 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7544_210_6753_1_1528591327_1471426_33-346031-0 from training. Duration: 0.61 2023-10-04 01:59:26,542 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.2.encoder.layers.1.whiten, num_groups=1, num_channels=384, metric=4.16 vs. limit=12.0 2023-10-04 01:59:27,073 WARNING [train.py:1204] (2/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_112-149213-0 from training. Duration: 0.95 2023-10-04 01:59:28,488 WARNING [train.py:1204] (2/4) Exclude cut with ID _55844_210_2346_1_1534471212569_7625707_925-191342-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 01:59:30,385 WARNING [train.py:1204] (2/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_87-278686-0 from training. Duration: 0.77 2023-10-04 01:59:31,691 WARNING [train.py:1204] (2/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_415-78792-0_sp1.1 from training. Duration: 0.736375 2023-10-04 01:59:31,701 WARNING [train.py:1204] (2/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_107-332398-0_sp0.9 from training. Duration: 0.8666875 2023-10-04 01:59:31,790 WARNING [train.py:1204] (2/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_72-251255-0 from training. Duration: 0.94 2023-10-04 01:59:33,070 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_431-227273-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 01:59:33,134 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7544_210_6753_1_1528587000_691468_87-178599-0_sp0.9 from training. Duration: 0.9666875 2023-10-04 01:59:36,264 WARNING [train.py:1204] (2/4) Exclude cut with ID _13916_210_9994_1_1530788341126_3763149_60-191556-0 from training. Duration: 0.61 2023-10-04 01:59:36,275 WARNING [train.py:1204] (2/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_746-164782-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 01:59:36,314 WARNING [train.py:1204] (2/4) Exclude cut with ID _33640_210_2655_1_1533117605479_4855469_596-21066-0_sp1.1 from training. Duration: 0.92725 2023-10-04 01:59:37,768 WARNING [train.py:1204] (2/4) Exclude cut with ID _40402_210_18219_1_1533522324115_2953150_320-14078-0 from training. Duration: 0.95 2023-10-04 01:59:37,805 WARNING [train.py:1204] (2/4) Exclude cut with ID _29877_210_6228_1_1532397791106_5276149_266-62291-0 from training. Duration: 0.87 2023-10-04 01:59:39,100 WARNING [train.py:1204] (2/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_176-56120-0_sp0.9 from training. Duration: 0.92225 2023-10-04 01:59:39,156 WARNING [train.py:1204] (2/4) Exclude cut with ID _6275_210_2084_1_1528110062213_7669049_91-26919-0 from training. Duration: 0.97 2023-10-04 01:59:40,509 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_224-322394-0_sp0.9 from training. Duration: 0.7333125 2023-10-04 01:59:40,542 WARNING [train.py:1204] (2/4) Exclude cut with ID _44982_210_1511_1_1534312820108_3726029_586-297059-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 01:59:40,586 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_196-289637-0_sp1.1 from training. Duration: 0.6818125 2023-10-04 01:59:40,588 WARNING [train.py:1204] (2/4) Exclude cut with ID _18513_210_9206_1_1531618215223_3862620_402-5957-0 from training. Duration: 0.99 2023-10-04 01:59:40,594 WARNING [train.py:1204] (2/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_81-300212-0 from training. Duration: 0.6 2023-10-04 01:59:43,325 WARNING [train.py:1204] (2/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_166-128941-0 from training. Duration: 0.54 2023-10-04 01:59:43,339 WARNING [train.py:1204] (2/4) Exclude cut with ID _58059_210_5854_1_1534901471149_3164808_57-309660-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 01:59:44,623 WARNING [train.py:1204] (2/4) Exclude cut with ID _60621_210_17071_1_1535013053530_3671739_73-155814-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 01:59:44,743 WARNING [train.py:1204] (2/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_726-270780-0 from training. Duration: 0.86 2023-10-04 01:59:46,033 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8853_210_3273_1_1528633899_818968_51-187685-0_sp1.1 from training. Duration: 0.62725 2023-10-04 01:59:49,064 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_315-212794-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 01:59:49,138 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_53373_210_5399_1_1534229471_4715695_314-122795-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 01:59:49,243 WARNING [train.py:1204] (2/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_187-221566-0_sp0.9 from training. Duration: 0.47775 2023-10-04 01:59:49,993 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.2.encoder.layers.1.nonlin_attention.whiten1, num_groups=1, num_channels=288, metric=5.61 vs. limit=10.0 2023-10-04 01:59:52,350 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_12868_210_6753_1_1530702479_3610564_133-564-0 from training. Duration: 0.79 2023-10-04 01:59:52,395 WARNING [train.py:1204] (2/4) Exclude cut with ID _55153_210_2253_1_1534384786677_3162096_255-228344-0_sp1.1 from training. Duration: 0.936375 2023-10-04 01:59:53,794 WARNING [train.py:1204] (2/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_383-165234-0_sp1.1 from training. Duration: 0.8545625 2023-10-04 01:59:57,892 WARNING [train.py:1204] (2/4) Exclude cut with ID IC0677W0056-334889 from training. Duration: 0.768 2023-10-04 02:00:02,559 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_14953_210_2151_1_1530701710_5616165_710-165400-0_sp0.9 from training. Duration: 0.9666875 2023-10-04 02:00:02,653 WARNING [train.py:1204] (2/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_355-236205-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 02:00:02,655 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_34450_210_9547_1_1533099443_4109050_503-246893-0_sp1.1 from training. Duration: 0.963625 2023-10-04 02:00:08,183 WARNING [train.py:1204] (2/4) Exclude cut with ID _43552_210_8913_1_1533726031059_3523429_25-120161-0 from training. Duration: 0.98 2023-10-04 02:00:08,215 WARNING [train.py:1204] (2/4) Exclude cut with ID _66112_210_10635_1_1535590236305_3801484_205-311930-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 02:00:08,237 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_48049_210_5632_1_1533864553_3519937_185-281218-0_sp1.1 from training. Duration: 0.92725 2023-10-04 02:00:08,284 WARNING [train.py:1204] (2/4) Exclude cut with ID _48782_210_12658_1_1534467701544_2300422_191-332415-0_sp1.1 from training. Duration: 0.97275 2023-10-04 02:00:11,361 WARNING [train.py:1204] (2/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_907-151398-0_sp1.1 from training. Duration: 0.336375 2023-10-04 02:00:11,418 WARNING [train.py:1204] (2/4) Exclude cut with ID _67499_210_17282_1_1535509805220_3692430_20-179346-0_sp1.1 from training. Duration: 0.8818125 2023-10-04 02:00:14,207 WARNING [train.py:1204] (2/4) Exclude cut with ID _14998_210_5219_1_1530705582080_7474970_441-91466-0_sp1.1 from training. Duration: 0.8818125 2023-10-04 02:00:15,579 WARNING [train.py:1204] (2/4) Exclude cut with ID _46681_210_9642_1_1533893005571_7603359_786-164007-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 02:00:19,820 WARNING [train.py:1204] (2/4) Exclude cut with ID _14970_210_15709_1_1530773543493_1715730_187-20259-0 from training. Duration: 0.89 2023-10-04 02:00:21,162 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.2.encoder.layers.1.feed_forward3.out_whiten, num_groups=1, num_channels=384, metric=8.40 vs. limit=15.0 2023-10-04 02:00:23,052 WARNING [train.py:1204] (2/4) Exclude cut with ID _12734_210_8341_1_1530183614296_3891929_383-339075-0_sp1.1 from training. Duration: 0.963625 2023-10-04 02:00:31,667 WARNING [train.py:1204] (2/4) Exclude cut with ID _37086_210_9038_1_1532999540891_7021250_834-322225-0_sp1.1 from training. Duration: 0.92725 2023-10-04 02:00:34,918 INFO [train.py:1046] (2/4) Epoch 42, batch 5050, loss[loss=0.136, simple_loss=0.2227, pruned_loss=0.02461, over 24326.00 frames. ], tot_loss[loss=0.1552, simple_loss=0.2354, pruned_loss=0.03748, over 4718672.89 frames. ], batch size: 61, lr: 2.41e-03, grad_scale: 8.0 2023-10-04 02:00:34,961 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_10987_210_4163_1_1529760555_4123902_56-290559-0_sp1.1 from training. Duration: 0.963625 2023-10-04 02:00:34,969 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_92-246270-0_sp0.9 from training. Duration: 0.9444375 2023-10-04 02:00:34,986 WARNING [train.py:1204] (2/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_56-13561-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 02:00:35,020 WARNING [train.py:1204] (2/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_393-57888-0_sp1.1 from training. Duration: 0.5818125 2023-10-04 02:00:35,059 WARNING [train.py:1204] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_168-32906-0_sp1.1 from training. Duration: 0.62725 2023-10-04 02:00:36,355 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_51225_210_3601_1_1534400634_2327383_127-207553-0_sp1.1 from training. Duration: 0.963625 2023-10-04 02:00:41,009 WARNING [train.py:1204] (2/4) Exclude cut with ID _58583_210_5236_1_1534726835266_3574249_494-203521-0_sp1.1 from training. Duration: 0.963625 2023-10-04 02:00:41,021 WARNING [train.py:1204] (2/4) Exclude cut with ID _49376_210_5986_1_1533985278732_3370740_354-344695-0 from training. Duration: 0.99 2023-10-04 02:00:42,471 WARNING [train.py:1204] (2/4) Exclude cut with ID _8021_210_7994_1_1528376451358_3653069_179-45206-0_sp1.1 from training. Duration: 0.7818125 2023-10-04 02:00:42,728 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.conv_module1.balancer2.prob, batch_count=1485653.3333333333, ans=0.125 2023-10-04 02:00:43,899 WARNING [train.py:1204] (2/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_742-154017-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 02:00:45,320 WARNING [train.py:1204] (2/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_222-198127-0_sp1.1 from training. Duration: 0.763625 2023-10-04 02:00:45,569 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=1485653.3333333333, ans=0.1 2023-10-04 02:00:46,632 WARNING [train.py:1204] (2/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_235-307337-0 from training. Duration: 0.94 2023-10-04 02:00:46,702 WARNING [train.py:1204] (2/4) Exclude cut with ID _8393_210_6702_1_1528505860249_3457328_453-176500-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 02:00:47,993 WARNING [train.py:1204] (2/4) Exclude cut with ID _53565_210_10635_1_1534337218335_3820259_102-179528-0_sp1.1 from training. Duration: 0.97275 2023-10-04 02:00:50,665 WARNING [train.py:1204] (2/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_43-206757-0_sp1.1 from training. Duration: 0.6818125 2023-10-04 02:00:52,695 WARNING [train.py:1204] (2/4) Exclude cut with ID _8394_210_6702_1_1528590504316_3540189_335-274780-0_sp1.1 from training. Duration: 0.7090625 2023-10-04 02:00:53,924 WARNING [train.py:1204] (2/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_128-296581-0_sp1.1 from training. Duration: 0.67275 2023-10-04 02:01:00,384 WARNING [train.py:1204] (2/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_111-112362-0 from training. Duration: 0.91 2023-10-04 02:01:01,710 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_5522_210_3273_1_1527249606_3909096_366-172508-0_sp1.1 from training. Duration: 0.6 2023-10-04 02:01:02,940 WARNING [train.py:1204] (2/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_47-167373-0_sp1.1 from training. Duration: 0.836375 2023-10-04 02:01:04,697 WARNING [train.py:1204] (2/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_41-234353-0 from training. Duration: 0.68 2023-10-04 02:01:04,736 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6291_210_3273_1_1527683649_1078498_82-221196-0_sp0.9 from training. Duration: 0.8444375 2023-10-04 02:01:06,270 WARNING [train.py:1204] (2/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_47-4814-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 02:01:06,298 WARNING [train.py:1204] (2/4) Exclude cut with ID _43541_210_10626_1_1533796233418_3635440_40-247993-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 02:01:06,347 WARNING [train.py:1204] (2/4) Exclude cut with ID _21989_210_5854_1_1531899214931_3271360_133-44759-0_sp0.9 from training. Duration: 0.8555625 2023-10-04 02:01:06,350 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_94-195801-0 from training. Duration: 0.94 2023-10-04 02:01:07,625 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_141-89113-0 from training. Duration: 0.86 2023-10-04 02:01:09,027 WARNING [train.py:1204] (2/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_683-284578-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 02:01:10,513 WARNING [train.py:1204] (2/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_6-214815-0_sp1.1 from training. Duration: 0.77275 2023-10-04 02:01:13,903 WARNING [train.py:1204] (2/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_556-212082-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 02:01:13,956 WARNING [train.py:1204] (2/4) Exclude cut with ID _44902_210_16187_1_1533888006062_3792078_149-67212-0 from training. Duration: 0.87 2023-10-04 02:01:16,618 WARNING [train.py:1204] (2/4) Exclude cut with ID _61148_210_6897_1_1535094022726_4217610_333-159440-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 02:01:18,085 WARNING [train.py:1204] (2/4) Exclude cut with ID _3120_210_3525_1_1526036143112_4072980_404-249994-0 from training. Duration: 0.99 2023-10-04 02:01:18,179 WARNING [train.py:1204] (2/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_234-260682-0_sp0.9 from training. Duration: 0.9333125 2023-10-04 02:01:19,460 WARNING [train.py:1204] (2/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_374-138971-0_sp1.1 from training. Duration: 0.8545625 2023-10-04 02:01:19,531 WARNING [train.py:1204] (2/4) Exclude cut with ID _53713_210_24116_1_1534314487762_7416751_743-302085-0_sp1.1 from training. Duration: 0.92725 2023-10-04 02:01:19,591 WARNING [train.py:1204] (2/4) Exclude cut with ID _18516_210_9206_1_1531704653206_3692406_198-334870-0_sp1.1 from training. Duration: 0.836375 2023-10-04 02:01:19,863 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.bypass.scale_min, batch_count=1485853.3333333333, ans=0.2 2023-10-04 02:01:22,970 WARNING [train.py:1204] (2/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_395-73570-0_sp1.1 from training. Duration: 0.9 2023-10-04 02:01:24,376 WARNING [train.py:1204] (2/4) Exclude cut with ID _46683_210_9642_1_1533730913492_4591116_421-19547-0_sp1.1 from training. Duration: 0.936375 2023-10-04 02:01:25,652 WARNING [train.py:1204] (2/4) Exclude cut with ID _19896_210_1883_1_1531709022662_6649569_714-8668-0_sp1.1 from training. Duration: 0.963625 2023-10-04 02:01:26,822 WARNING [train.py:1204] (2/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_49-101072-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 02:01:26,829 WARNING [train.py:1204] (2/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_220-191080-0_sp1.1 from training. Duration: 0.8909375 2023-10-04 02:01:26,866 WARNING [train.py:1204] (2/4) Exclude cut with ID _3465_210_3525_1_1526175667060_3574049_62-335552-0 from training. Duration: 0.87 2023-10-04 02:01:28,361 WARNING [train.py:1204] (2/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_111-112362-0_sp1.1 from training. Duration: 0.82725 2023-10-04 02:01:30,373 WARNING [train.py:1204] (2/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_318-59097-0_sp0.9 from training. Duration: 0.8444375 2023-10-04 02:01:30,884 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.4.encoder.layers.2.self_attn1.whiten, num_groups=1, num_channels=384, metric=13.36 vs. limit=22.5 2023-10-04 02:01:34,992 WARNING [train.py:1204] (2/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_98-255710-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 02:01:36,267 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9042W0003-515609 from training. Duration: 0.896 2023-10-04 02:01:36,276 WARNING [train.py:1204] (2/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_158-250116-0_sp0.9 from training. Duration: 0.688875 2023-10-04 02:01:37,599 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.637e+02 1.948e+02 2.169e+02 2.465e+02 3.458e+02, threshold=4.337e+02, percent-clipped=0.0 2023-10-04 02:01:37,697 WARNING [train.py:1204] (2/4) Exclude cut with ID _26750_210_5665_1_1532226678442_4063230_7-346885-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 02:01:37,751 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_37839_210_7234_1_1533542462_3689303_357-227078-0_sp1.1 from training. Duration: 0.963625 2023-10-04 02:01:37,771 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9052W0119-520904 from training. Duration: 0.896 2023-10-04 02:01:40,546 WARNING [train.py:1204] (2/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_249-281349-0_sp1.1 from training. Duration: 0.77275 2023-10-04 02:01:40,557 WARNING [train.py:1204] (2/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_147-293311-0 from training. Duration: 0.94 2023-10-04 02:01:40,558 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_158-21166-0_sp1.1 from training. Duration: 0.963625 2023-10-04 02:01:43,939 WARNING [train.py:1204] (2/4) Exclude cut with ID _14100_210_8341_1_1530529320613_3678069_207-332194-0_sp1.1 from training. Duration: 0.92725 2023-10-04 02:01:45,261 WARNING [train.py:1204] (2/4) Exclude cut with ID _9389_210_1664_1_1529217337301_5289269_324-211031-0_sp1.1 from training. Duration: 0.963625 2023-10-04 02:01:45,273 WARNING [train.py:1204] (2/4) Exclude cut with ID _14603_210_4220_1_1530615688441_3097319_133-158308-0 from training. Duration: 0.84 2023-10-04 02:01:46,630 WARNING [train.py:1204] (2/4) Exclude cut with ID _13252_210_1925_1_1530671372047_3336909_166-314607-0 from training. Duration: 0.54 2023-10-04 02:01:48,127 WARNING [train.py:1204] (2/4) Exclude cut with ID _32704_210_8954_1_1533017488840_6747370_649-79538-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 02:01:48,142 WARNING [train.py:1204] (2/4) Exclude cut with ID _28394_210_8913_1_1532430049518_3650118_343-115667-0_sp1.1 from training. Duration: 0.97275 2023-10-04 02:01:49,447 INFO [train.py:1046] (2/4) Epoch 42, batch 5100, loss[loss=0.1683, simple_loss=0.2414, pruned_loss=0.0476, over 23799.00 frames. ], tot_loss[loss=0.1554, simple_loss=0.2359, pruned_loss=0.03752, over 4726134.66 frames. ], batch size: 212, lr: 2.41e-03, grad_scale: 8.0 2023-10-04 02:01:49,522 WARNING [train.py:1204] (2/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_87-278686-0_sp0.9 from training. Duration: 0.8555625 2023-10-04 02:01:52,241 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9109W0003-549749 from training. Duration: 0.768 2023-10-04 02:01:53,554 WARNING [train.py:1204] (2/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_126-29104-0_sp1.1 from training. Duration: 0.77275 2023-10-04 02:01:56,934 WARNING [train.py:1204] (2/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_325-113798-0 from training. Duration: 0.85 2023-10-04 02:01:56,983 WARNING [train.py:1204] (2/4) Exclude cut with ID _6694_210_4599_1_1527852637490_3965529_116-109398-0 from training. Duration: 0.73 2023-10-04 02:01:58,288 WARNING [train.py:1204] (2/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_46-57119-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 02:01:59,736 WARNING [train.py:1204] (2/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_546-119312-0_sp1.1 from training. Duration: 0.9 2023-10-04 02:02:02,880 WARNING [train.py:1204] (2/4) Exclude cut with ID _60057_210_10640_1_1534903065793_3333150_468-306939-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 02:02:02,939 WARNING [train.py:1204] (2/4) Exclude cut with ID _15150_210_8341_1_1530752411803_7256384_22-138274-0 from training. Duration: 0.96 2023-10-04 02:02:04,777 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_289-180904-0 from training. Duration: 0.76 2023-10-04 02:02:06,449 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.1.bypass_mid.scale_min, batch_count=1486053.3333333333, ans=0.2 2023-10-04 02:02:08,953 WARNING [train.py:1204] (2/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_64-133721-0_sp1.1 from training. Duration: 0.92725 2023-10-04 02:02:09,012 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_195-257512-0_sp1.1 from training. Duration: 0.6090625 2023-10-04 02:02:09,892 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.1.encoder.layers.0.feed_forward2.out_whiten, num_groups=1, num_channels=256, metric=3.82 vs. limit=15.0 2023-10-04 02:02:11,855 WARNING [train.py:1204] (2/4) Exclude cut with ID _66873_210_5517_1_1535454136506_4293370_704-101963-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 02:02:16,529 WARNING [train.py:1204] (2/4) Exclude cut with ID _4366_210_1388_1_1526792053633_3613819_231-211402-0 from training. Duration: 0.94 2023-10-04 02:02:16,575 WARNING [train.py:1204] (2/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_679-190385-0_sp1.1 from training. Duration: 0.97275 2023-10-04 02:02:17,981 WARNING [train.py:1204] (2/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_298-81459-0_sp1.1 from training. Duration: 0.963625 2023-10-04 02:02:17,990 WARNING [train.py:1204] (2/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_658-282610-0_sp0.9 from training. Duration: 0.588875 2023-10-04 02:02:19,453 WARNING [train.py:1204] (2/4) Exclude cut with ID _57572_210_5517_1_1534921219624_3809484_196-283734-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 02:02:20,852 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_53071_210_16554_1_1534490405_2954066_294-198554-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 02:02:20,865 WARNING [train.py:1204] (2/4) Exclude cut with ID _4866_210_2577_1_1527404412284_4364659_352-157930-0 from training. Duration: 0.81 2023-10-04 02:02:22,256 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9231W0113-610139 from training. Duration: 0.896 2023-10-04 02:02:22,300 WARNING [train.py:1204] (2/4) Exclude cut with ID _24271_210_12658_1_1531987270022_7190519_638-171906-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 02:02:24,052 WARNING [train.py:1204] (2/4) Exclude cut with ID _14170_210_5219_1_1530525949868_4469650_272-249985-0 from training. Duration: 0.67 2023-10-04 02:02:24,061 WARNING [train.py:1204] (2/4) Exclude cut with ID _64086_210_7994_1_1535270417019_7435949_449-31567-0 from training. Duration: 0.97 2023-10-04 02:02:28,110 WARNING [train.py:1204] (2/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_109-282568-0_sp1.1 from training. Duration: 0.97275 2023-10-04 02:02:31,700 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.attention_skip_rate, batch_count=1486120.0, ans=0.0 2023-10-04 02:02:34,695 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_121-45934-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 02:02:36,147 WARNING [train.py:1204] (2/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_314-126819-0 from training. Duration: 0.5 2023-10-04 02:02:37,483 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9309W0192-640319 from training. Duration: 0.512 2023-10-04 02:02:37,490 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9310W0003-640649 from training. Duration: 0.896 2023-10-04 02:02:38,943 WARNING [train.py:1204] (2/4) Exclude cut with ID _7160_210_1876_1_1527940923231_3703799_120-150943-0 from training. Duration: 0.81 2023-10-04 02:02:38,945 WARNING [train.py:1204] (2/4) Exclude cut with ID _40380_210_9059_1_1533380594759_4187380_6-33508-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 02:02:40,899 WARNING [train.py:1204] (2/4) Exclude cut with ID _59202_210_26984_1_1534816810612_3549174_137-318710-0 from training. Duration: 0.89 2023-10-04 02:02:42,845 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.5.encoder.layers.1.self_attn2.whiten, num_groups=1, num_channels=256, metric=12.66 vs. limit=22.5 2023-10-04 02:02:43,710 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_435-313305-0 from training. Duration: 0.71 2023-10-04 02:02:46,378 WARNING [train.py:1204] (2/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_91-212486-0_sp1.1 from training. Duration: 0.6181875 2023-10-04 02:02:47,741 WARNING [train.py:1204] (2/4) Exclude cut with ID _14603_210_4220_1_1530615688441_3097319_133-158308-0_sp1.1 from training. Duration: 0.763625 2023-10-04 02:02:49,218 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_99-251314-0 from training. Duration: 0.87 2023-10-04 02:02:51,281 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_365-217119-0_sp1.1 from training. Duration: 0.57275 2023-10-04 02:02:52,384 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_3475_210_2028_1_1526193578_3622569_377-166133-0 from training. Duration: 0.86 2023-10-04 02:02:53,331 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.0.layers.1.self_attn1.whiten, num_groups=1, num_channels=192, metric=12.10 vs. limit=22.5 2023-10-04 02:02:56,781 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.feed_forward1.out_proj.dropout_p, batch_count=1486253.3333333333, ans=0.1 2023-10-04 02:02:57,933 WARNING [train.py:1204] (2/4) Exclude cut with ID _13916_210_9994_1_1530788341126_3763149_116-21034-0_sp1.1 from training. Duration: 0.87275 2023-10-04 02:02:57,951 WARNING [train.py:1204] (2/4) Exclude cut with ID _64220_210_9527_1_1535250151772_4875259_142-216831-0_sp1.1 from training. Duration: 0.936375 2023-10-04 02:02:57,952 WARNING [train.py:1204] (2/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_490-207861-0_sp1.1 from training. Duration: 0.8909375 2023-10-04 02:02:59,307 WARNING [train.py:1204] (2/4) Exclude cut with ID _41314_210_12651_1_1533459476543_3766460_87-272068-0_sp0.9 from training. Duration: 0.92225 2023-10-04 02:02:59,336 WARNING [train.py:1204] (2/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_18-46756-0_sp1.1 from training. Duration: 0.7181875 2023-10-04 02:02:59,396 WARNING [train.py:1204] (2/4) Exclude cut with ID _39411_210_5986_1_1533538825030_3343649_98-311335-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 02:03:01,331 WARNING [train.py:1204] (2/4) Exclude cut with ID _21878_210_3525_1_1531738772002_7264330_113-18163-0 from training. Duration: 0.89 2023-10-04 02:03:01,333 WARNING [train.py:1204] (2/4) Exclude cut with ID _6653_210_2629_1_1527919172441_6732679_1103-56445-0 from training. Duration: 0.73 2023-10-04 02:03:03,118 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_3474_210_2028_1_1526188513_4825246_145-309131-0 from training. Duration: 0.74 2023-10-04 02:03:03,144 WARNING [train.py:1204] (2/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_120-211630-0_sp1.1 from training. Duration: 0.836375 2023-10-04 02:03:03,163 WARNING [train.py:1204] (2/4) Exclude cut with ID _8030_210_7497_1_1528444886781_3531089_32-31837-0 from training. Duration: 0.97 2023-10-04 02:03:04,529 INFO [train.py:1046] (2/4) Epoch 42, batch 5150, loss[loss=0.1436, simple_loss=0.2204, pruned_loss=0.0334, over 23361.00 frames. ], tot_loss[loss=0.1568, simple_loss=0.2373, pruned_loss=0.03814, over 4716339.77 frames. ], batch size: 119, lr: 2.41e-03, grad_scale: 8.0 2023-10-04 02:03:05,843 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_19949_210_2161_1_1531706337_928079_8-160956-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 02:03:05,904 WARNING [train.py:1204] (2/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_321-215742-0_sp1.1 from training. Duration: 0.4545625 2023-10-04 02:03:08,657 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_583-124650-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 02:03:10,035 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_30434_210_13049_1_1532519384_981746_164-182755-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 02:03:10,783 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.1.encoder.layers.0.self_attn_weights.whiten_keys, num_groups=4, num_channels=128, metric=3.94 vs. limit=6.0 2023-10-04 02:03:14,890 WARNING [train.py:1204] (2/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_339-104791-0_sp1.1 from training. Duration: 0.4454375 2023-10-04 02:03:14,908 WARNING [train.py:1204] (2/4) Exclude cut with ID _15026_210_8750_1_1530777602316_3291850_211-195043-0 from training. Duration: 0.8 2023-10-04 02:03:17,663 WARNING [train.py:1204] (2/4) Exclude cut with ID _29877_210_6228_1_1532397791106_5276149_487-182647-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 02:03:17,704 WARNING [train.py:1204] (2/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_127-566-0_sp0.9 from training. Duration: 0.9333125 2023-10-04 02:03:20,387 WARNING [train.py:1204] (2/4) Exclude cut with ID _8263_210_1664_1_1528439452891_970220_68-181805-0_sp0.9 from training. Duration: 0.711125 2023-10-04 02:03:20,388 WARNING [train.py:1204] (2/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_504-72513-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 02:03:20,396 WARNING [train.py:1204] (2/4) Exclude cut with ID _64639_210_5816_1_1535367619437_3528000_324-265233-0_sp1.1 from training. Duration: 0.963625 2023-10-04 02:03:20,604 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.conv_module2.balancer1.prob, batch_count=1486386.6666666667, ans=0.125 2023-10-04 02:03:21,797 WARNING [train.py:1204] (2/4) Exclude cut with ID _6456_210_1534_1_1527681634212_3181049_215-309359-0_sp0.9 from training. Duration: 0.97775 2023-10-04 02:03:21,802 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_115-233929-0_sp1.1 from training. Duration: 0.6545625 2023-10-04 02:03:21,846 WARNING [train.py:1204] (2/4) Exclude cut with ID _14087_210_2253_1_1530529224512_3014029_219-224445-0 from training. Duration: 0.84 2023-10-04 02:03:23,288 WARNING [train.py:1204] (2/4) Exclude cut with ID _14867_210_1968_1_1530684179890_3846969_413-29292-0_sp1.1 from training. Duration: 0.8181875 2023-10-04 02:03:23,329 WARNING [train.py:1204] (2/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_1012-279473-0_sp0.9 from training. Duration: 0.7444375 2023-10-04 02:03:24,078 WARNING [train.py:1204] (2/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_605-133395-0_sp0.9 from training. Duration: 0.7333125 2023-10-04 02:03:25,212 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_89-35748-0 from training. Duration: 0.79 2023-10-04 02:03:27,863 WARNING [train.py:1204] (2/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_476-272757-0_sp1.1 from training. Duration: 0.8454375 2023-10-04 02:03:33,277 WARNING [train.py:1204] (2/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_449-15508-0_sp0.9 from training. Duration: 0.911125 2023-10-04 02:03:34,652 WARNING [train.py:1204] (2/4) Exclude cut with ID _29345_210_4381_1_1532689168263_3860169_198-174328-0 from training. Duration: 0.91 2023-10-04 02:03:39,348 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_15517_210_6753_1_1530952293_1664795_125-158157-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 02:03:44,238 WARNING [train.py:1204] (2/4) Exclude cut with ID _25565_210_12392_1_1532423009382_3952098_420-113180-0_sp1.1 from training. Duration: 0.963625 2023-10-04 02:03:44,324 WARNING [train.py:1204] (2/4) Exclude cut with ID _51471_210_1889_1_1534399391390_3094250_236-146773-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 02:03:48,342 WARNING [train.py:1204] (2/4) Exclude cut with ID _30039_210_13270_1_1532484612900_2725150_175-35012-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 02:03:48,371 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_23850_210_4238_1_1532232597_1632433_99-275843-0_sp1.1 from training. Duration: 0.936375 2023-10-04 02:03:49,763 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.1.balancer2.prob, batch_count=1486520.0, ans=0.125 2023-10-04 02:03:50,982 WARNING [train.py:1204] (2/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_658-282610-0 from training. Duration: 0.53 2023-10-04 02:03:55,424 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_37818_210_4238_1_1533620910_4720083_234-65934-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 02:03:55,674 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.feed_forward3.hidden_balancer.prob, batch_count=1486520.0, ans=0.125 2023-10-04 02:03:56,804 WARNING [train.py:1204] (2/4) Exclude cut with ID _3067_210_2667_1_1526212974640_4503168_459-173867-0_sp1.1 from training. Duration: 0.8 2023-10-04 02:03:56,823 WARNING [train.py:1204] (2/4) Exclude cut with ID _8696_210_1876_1_1528545632440_3345058_209-54069-0_sp0.9 from training. Duration: 0.7444375 2023-10-04 02:03:59,747 WARNING [train.py:1204] (2/4) Exclude cut with ID _62574_210_2655_1_1535180407058_3933249_420-236465-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 02:04:01,097 WARNING [train.py:1204] (2/4) Exclude cut with ID _46681_210_9642_1_1533893005571_7603359_333-4042-0_sp1.1 from training. Duration: 0.936375 2023-10-04 02:04:02,437 WARNING [train.py:1204] (2/4) Exclude cut with ID _3541_210_2477_1_1526475408709_3263973_377-296839-0 from training. Duration: 0.55 2023-10-04 02:04:04,050 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.nonlin_attention.balancer.prob, batch_count=1486586.6666666667, ans=0.125 2023-10-04 02:04:05,033 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.590e+02 1.977e+02 2.120e+02 2.430e+02 3.829e+02, threshold=4.240e+02, percent-clipped=0.0 2023-10-04 02:04:08,271 WARNING [train.py:1204] (2/4) Exclude cut with ID _43997_210_4525_1_1533538632555_5189329_426-92599-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 02:04:09,874 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_43-71989-0_sp1.1 from training. Duration: 0.5181875 2023-10-04 02:04:10,760 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.2.encoder.layers.1.conv_module1.whiten, num_groups=1, num_channels=384, metric=9.30 vs. limit=15.0 2023-10-04 02:04:11,770 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_2009_210_2071_1_1525481134_4588937_140-345530-0_sp1.1 from training. Duration: 0.963625 2023-10-04 02:04:11,777 WARNING [train.py:1204] (2/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_138-332773-0_sp0.9 from training. Duration: 0.9 2023-10-04 02:04:13,202 WARNING [train.py:1204] (2/4) Exclude cut with ID _13942_210_9988_1_1530604906036_3733360_499-55676-0_sp0.9 from training. Duration: 0.688875 2023-10-04 02:04:13,226 WARNING [train.py:1204] (2/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_259-34038-0_sp1.1 from training. Duration: 0.67275 2023-10-04 02:04:13,235 WARNING [train.py:1204] (2/4) Exclude cut with ID _3120_210_3525_1_1526036143112_4072980_404-249994-0_sp1.1 from training. Duration: 0.9 2023-10-04 02:04:13,268 WARNING [train.py:1204] (2/4) Exclude cut with ID _40402_210_18219_1_1533522324115_2953150_222-108810-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 02:04:16,658 WARNING [train.py:1204] (2/4) Exclude cut with ID _22083_210_8254_1_1531702862439_3678970_49-234546-0_sp0.9 from training. Duration: 0.92225 2023-10-04 02:04:16,755 WARNING [train.py:1204] (2/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_199-295611-0_sp0.9 from training. Duration: 0.611125 2023-10-04 02:04:17,994 INFO [train.py:1046] (2/4) Epoch 42, batch 5200, loss[loss=0.1628, simple_loss=0.2486, pruned_loss=0.03844, over 24092.00 frames. ], tot_loss[loss=0.1569, simple_loss=0.2376, pruned_loss=0.03807, over 4730068.56 frames. ], batch size: 80, lr: 2.41e-03, grad_scale: 16.0 2023-10-04 02:04:18,647 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.4.encoder.layers.0.whiten, num_groups=1, num_channels=384, metric=6.42 vs. limit=12.0 2023-10-04 02:04:19,446 WARNING [train.py:1204] (2/4) Exclude cut with ID _43552_210_8913_1_1533726031059_3523429_43-192945-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 02:04:24,268 WARNING [train.py:1204] (2/4) Exclude cut with ID _14224_210_4381_1_1530529062037_4264010_364-22817-0 from training. Duration: 0.71 2023-10-04 02:04:24,354 WARNING [train.py:1204] (2/4) Exclude cut with ID _14922_210_6228_1_1530707435273_3693310_388-129552-0_sp1.1 from training. Duration: 0.87275 2023-10-04 02:04:25,465 WARNING [train.py:1204] (2/4) Exclude cut with ID _6537_210_1883_1_1527987559710_3597129_190-280227-0_sp1.1 from training. Duration: 0.97275 2023-10-04 02:04:28,221 WARNING [train.py:1204] (2/4) Exclude cut with ID _55844_210_2346_1_1534471212569_7625707_398-56897-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 02:04:28,313 WARNING [train.py:1204] (2/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_48-182155-0_sp1.1 from training. Duration: 0.7909375 2023-10-04 02:04:29,639 WARNING [train.py:1204] (2/4) Exclude cut with ID _29529_210_7234_1_1532393961455_3977460_582-18084-0_sp1.1 from training. Duration: 0.97275 2023-10-04 02:04:29,765 WARNING [train.py:1204] (2/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_912-282412-0 from training. Duration: 0.49 2023-10-04 02:04:32,565 WARNING [train.py:1204] (2/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_332-256168-0_sp0.9 from training. Duration: 0.8333125 2023-10-04 02:04:33,910 WARNING [train.py:1204] (2/4) Exclude cut with ID _64220_210_9527_1_1535250151772_4875259_55-250624-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 02:04:35,762 WARNING [train.py:1204] (2/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_264-108685-0 from training. Duration: 0.55 2023-10-04 02:04:36,471 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.3.encoder.layers.0.conv_module1.whiten, num_groups=1, num_channels=512, metric=4.13 vs. limit=15.0 2023-10-04 02:04:37,142 WARNING [train.py:1204] (2/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_613-232856-0_sp0.9 from training. Duration: 0.6 2023-10-04 02:04:38,979 WARNING [train.py:1204] (2/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_68-212801-0_sp1.1 from training. Duration: 0.663625 2023-10-04 02:04:39,029 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6290_210_3273_1_1527595224_1282787_115-87095-0 from training. Duration: 0.55 2023-10-04 02:04:40,387 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_3080_210_2071_1_1526086102_4365166_312-8726-0 from training. Duration: 0.91 2023-10-04 02:04:42,479 WARNING [train.py:1204] (2/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_105-63309-0 from training. Duration: 0.98 2023-10-04 02:04:43,799 WARNING [train.py:1204] (2/4) Exclude cut with ID _2941_210_2577_1_1526199673150_2489759_201-160779-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 02:04:43,802 WARNING [train.py:1204] (2/4) Exclude cut with ID ID0406W0226-885434 from training. Duration: 0.997 2023-10-04 02:04:43,808 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_17292_210_6753_1_1531125388_2943734_353-328201-0_sp1.1 from training. Duration: 0.97275 2023-10-04 02:04:45,194 WARNING [train.py:1204] (2/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_572-96029-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 02:04:45,228 WARNING [train.py:1204] (2/4) Exclude cut with ID _14970_210_15709_1_1530775547361_1966965_6-23328-0_sp0.9 from training. Duration: 0.9555625 2023-10-04 02:04:46,588 WARNING [train.py:1204] (2/4) Exclude cut with ID _45012_210_12651_1_1533608435136_5371380_240-37101-0 from training. Duration: 0.93 2023-10-04 02:04:46,650 WARNING [train.py:1204] (2/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_685-90183-0_sp1.1 from training. Duration: 0.936375 2023-10-04 02:04:49,392 WARNING [train.py:1204] (2/4) Exclude cut with ID _13252_210_1925_1_1530671372047_3336909_41-22245-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 02:04:52,131 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_15126_210_6753_1_1530778677_2591185_120-277707-0 from training. Duration: 0.8 2023-10-04 02:04:52,180 WARNING [train.py:1204] (2/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_310-26186-0 from training. Duration: 0.7 2023-10-04 02:04:52,196 WARNING [train.py:1204] (2/4) Exclude cut with ID _14998_210_5219_1_1530705582080_7474970_787-294416-0 from training. Duration: 0.82 2023-10-04 02:04:54,295 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=1486786.6666666667, ans=0.1 2023-10-04 02:04:58,064 WARNING [train.py:1204] (2/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_795-33252-0 from training. Duration: 0.37 2023-10-04 02:04:59,386 WARNING [train.py:1204] (2/4) Exclude cut with ID _7537_210_2084_1_1528502395431_3924359_113-199933-0_sp0.9 from training. Duration: 0.6555625 2023-10-04 02:05:06,352 WARNING [train.py:1204] (2/4) Exclude cut with ID _3465_210_3525_1_1526175667060_3574049_308-231735-0_sp0.9 from training. Duration: 0.811125 2023-10-04 02:05:06,381 WARNING [train.py:1204] (2/4) Exclude cut with ID _19504_210_9994_1_1531648863242_3325220_354-296765-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 02:05:09,680 WARNING [train.py:1204] (2/4) Exclude cut with ID _5409_210_1388_1_1527292850189_5865191_178-127970-0 from training. Duration: 0.57 2023-10-04 02:05:09,723 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_1466-248056-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 02:05:09,753 WARNING [train.py:1204] (2/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_535-14410-0_sp0.9 from training. Duration: 0.52225 2023-10-04 02:05:09,756 WARNING [train.py:1204] (2/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_747-129941-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 02:05:11,115 WARNING [train.py:1204] (2/4) Exclude cut with ID _15012_210_1968_1_1530856820429_5095350_688-178849-0_sp1.1 from training. Duration: 0.5818125 2023-10-04 02:05:14,280 WARNING [train.py:1204] (2/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_356-45054-0_sp1.1 from training. Duration: 0.7818125 2023-10-04 02:05:15,673 WARNING [train.py:1204] (2/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_146-276944-0_sp0.9 from training. Duration: 0.92225 2023-10-04 02:05:18,947 WARNING [train.py:1204] (2/4) Exclude cut with ID _39204_210_4220_1_1533880547368_3191120_346-180306-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 02:05:19,034 WARNING [train.py:1204] (2/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_641-158017-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 02:05:19,034 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_19952_210_2161_1_1531875469_3851750_274-42137-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 02:05:20,879 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.5.encoder.layers.1.nonlin_attention.whiten2, num_groups=1, num_channels=256, metric=4.79 vs. limit=15.0 2023-10-04 02:05:23,186 WARNING [train.py:1204] (2/4) Exclude cut with ID _60804_210_9642_1_1534831199859_7268299_87-327819-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 02:05:24,510 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_9302_210_2550_1_1528889962_4655416_638-301620-0 from training. Duration: 0.47 2023-10-04 02:05:24,578 WARNING [train.py:1204] (2/4) Exclude cut with ID _44304_210_15374_1_1533639352765_4613129_302-22084-0_sp1.1 from training. Duration: 0.7818125 2023-10-04 02:05:24,588 WARNING [train.py:1204] (2/4) Exclude cut with ID _18513_210_9206_1_1531618215223_3862620_177-227063-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 02:05:27,603 WARNING [train.py:1204] (2/4) Exclude cut with ID _41326_210_5854_1_1533778076509_3492080_510-45444-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 02:05:29,023 WARNING [train.py:1204] (2/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_76-311963-0_sp1.1 from training. Duration: 0.636375 2023-10-04 02:05:30,317 WARNING [train.py:1204] (2/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_156-302675-0_sp0.9 from training. Duration: 0.611125 2023-10-04 02:05:31,689 INFO [train.py:1046] (2/4) Epoch 42, batch 5250, loss[loss=0.1415, simple_loss=0.2086, pruned_loss=0.0372, over 23541.00 frames. ], tot_loss[loss=0.1557, simple_loss=0.2356, pruned_loss=0.03792, over 4719081.79 frames. ], batch size: 256, lr: 2.41e-03, grad_scale: 8.0 2023-10-04 02:05:33,158 WARNING [train.py:1204] (2/4) Exclude cut with ID _53489_210_6228_1_1534298357169_3835970_367-148411-0_sp1.1 from training. Duration: 0.936375 2023-10-04 02:05:37,202 WARNING [train.py:1204] (2/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_389-144525-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 02:05:37,252 WARNING [train.py:1204] (2/4) Exclude cut with ID _14069_210_5632_1_1530512692033_4803390_158-238427-0_sp1.1 from training. Duration: 0.963625 2023-10-04 02:05:38,605 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_486-151131-0_sp0.9 from training. Duration: 0.9444375 2023-10-04 02:05:42,233 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.bypass_mid.scale_min, batch_count=1486986.6666666667, ans=0.2 2023-10-04 02:05:43,779 WARNING [train.py:1204] (2/4) Exclude cut with ID _39846_210_12651_1_1533603651992_1318030_188-71831-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 02:05:45,186 WARNING [train.py:1204] (2/4) Exclude cut with ID _39411_210_5986_1_1533538825030_3343649_173-238248-0_sp1.1 from training. Duration: 0.7545625 2023-10-04 02:05:45,526 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.bypass.scale_min, batch_count=1487053.3333333333, ans=0.2 2023-10-04 02:05:46,635 WARNING [train.py:1204] (2/4) Exclude cut with ID _20684_210_6917_1_1531567751646_4203930_469-49081-0_sp1.1 from training. Duration: 0.92725 2023-10-04 02:05:46,891 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.feed_forward1.hidden_balancer.prob, batch_count=1487053.3333333333, ans=0.125 2023-10-04 02:05:47,999 WARNING [train.py:1204] (2/4) Exclude cut with ID _8833_210_5219_1_1528592665210_7553059_611-80313-0_sp1.1 from training. Duration: 0.5818125 2023-10-04 02:05:51,200 WARNING [train.py:1204] (2/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_440-141811-0 from training. Duration: 0.95 2023-10-04 02:05:51,221 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_41211_210_2544_1_1533348859_3702910_236-223652-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 02:05:51,303 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_40222_210_6228_1_1533385298_4481939_220-163998-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 02:06:30,439 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.720e+02 1.997e+02 2.174e+02 2.692e+02 4.160e+02, threshold=4.348e+02, percent-clipped=0.0 2023-10-04 02:06:33,669 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.bypass.skip_rate, batch_count=1487253.3333333333, ans=0.09899494936611666 2023-10-04 02:06:39,928 INFO [train.py:1046] (2/4) Epoch 42, batch 5300, loss[loss=0.1568, simple_loss=0.2421, pruned_loss=0.0358, over 24667.00 frames. ], tot_loss[loss=0.1547, simple_loss=0.2339, pruned_loss=0.03771, over 4701586.07 frames. ], batch size: 65, lr: 2.41e-03, grad_scale: 8.0 2023-10-04 02:06:48,522 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.conv_skip_rate, batch_count=1487320.0, ans=0.0 2023-10-04 02:06:54,339 WARNING [train.py:1204] (2/4) Exclude cut with ID _14629_210_15709_1_1530604298182_956899_6-232695-0_sp1.1 from training. Duration: 0.8545625 2023-10-04 02:06:54,364 WARNING [train.py:1204] (2/4) Exclude cut with ID _13921_210_9994_1_1530841782272_3503520_327-69677-0 from training. Duration: 0.9 2023-10-04 02:06:54,390 WARNING [train.py:1204] (2/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_372-298093-0 from training. Duration: 0.8 2023-10-04 02:06:54,404 WARNING [train.py:1204] (2/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_515-139111-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 02:06:54,988 WARNING [train.py:1204] (2/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_85-65056-0_sp1.1 from training. Duration: 0.87275 2023-10-04 02:06:55,033 WARNING [train.py:1204] (2/4) Exclude cut with ID _15113_210_8254_1_1530842438579_3716279_209-54533-0_sp1.1 from training. Duration: 0.87275 2023-10-04 02:06:55,082 WARNING [train.py:1204] (2/4) Exclude cut with ID _70447_210_12392_1_1535972499300_3160420_123-312838-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 02:06:55,110 WARNING [train.py:1204] (2/4) Exclude cut with ID _60113_210_16271_1_1535164212481_4386079_70-63596-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 02:06:55,137 WARNING [train.py:1204] (2/4) Exclude cut with ID _14632_210_9988_1_1530691279392_3923200_225-188509-0_sp1.1 from training. Duration: 0.963625 2023-10-04 02:06:55,187 WARNING [train.py:1204] (2/4) Exclude cut with ID _34734_210_4525_1_1533365977542_4448720_584-180532-0_sp1.1 from training. Duration: 0.97275 2023-10-04 02:06:55,200 WARNING [train.py:1204] (2/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_351-294650-0_sp1.1 from training. Duration: 0.57275 2023-10-04 02:06:55,535 WARNING [train.py:1204] (2/4) Exclude cut with ID _13252_210_1925_1_1530671372047_3336909_263-168388-0_sp0.9 from training. Duration: 0.9555625 2023-10-04 02:06:55,567 WARNING [train.py:1204] (2/4) Exclude cut with ID _22083_210_8254_1_1531702862439_3678970_201-85738-0 from training. Duration: 0.9 2023-10-04 02:06:55,659 WARNING [train.py:1204] (2/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_237-58433-0 from training. Duration: 0.74 2023-10-04 02:06:55,687 WARNING [train.py:1204] (2/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_262-250826-0 from training. Duration: 0.99 2023-10-04 02:06:55,803 WARNING [train.py:1204] (2/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_927-43827-0_sp0.9 from training. Duration: 0.82225 2023-10-04 02:06:55,847 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_2821_210_2715_1_1526108385_3763560_73-296673-0 from training. Duration: 0.83 2023-10-04 02:06:55,957 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6287_210_3273_1_1527509506_2653716_182-199887-0 from training. Duration: 0.51 2023-10-04 02:06:56,089 WARNING [train.py:1204] (2/4) Exclude cut with ID _70519_210_4163_1_1535961595994_7458230_899-80223-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 02:06:56,355 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_210-234949-0_sp1.1 from training. Duration: 0.97275 2023-10-04 02:06:56,394 WARNING [train.py:1204] (2/4) Exclude cut with ID _30196_210_1664_1_1533708496522_7074140_768-278176-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 02:06:56,476 WARNING [train.py:1204] (2/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_315-128283-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 02:06:56,957 WARNING [train.py:1204] (2/4) Exclude cut with ID _14886_210_4220_1_1530702020355_3516190_330-323941-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 02:06:57,221 WARNING [train.py:1204] (2/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_264-348896-0_sp1.1 from training. Duration: 0.77275 2023-10-04 02:06:57,254 WARNING [train.py:1204] (2/4) Exclude cut with ID _37086_210_9038_1_1532999540891_7021250_433-10563-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 02:06:57,294 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_53638_210_21581_1_1534297319_5808449_885-80172-0_sp1.1 from training. Duration: 0.87275 2023-10-04 02:06:57,389 WARNING [train.py:1204] (2/4) Exclude cut with ID _14623_210_1925_1_1530773354962_3632609_157-218771-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 02:06:57,399 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_1368-78165-0_sp1.1 from training. Duration: 0.97275 2023-10-04 02:06:57,404 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_309-65177-0_sp0.9 from training. Duration: 0.97775 2023-10-04 02:06:57,410 WARNING [train.py:1204] (2/4) Exclude cut with ID _21632_210_2253_1_1531994001765_3744140_291-138812-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 02:06:57,422 WARNING [train.py:1204] (2/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_669-93163-0_sp1.1 from training. Duration: 0.8909375 2023-10-04 02:06:58,007 WARNING [train.py:1204] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_292-215346-0 from training. Duration: 0.97 2023-10-04 02:06:58,033 WARNING [train.py:1204] (2/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_286-222073-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 02:06:58,348 WARNING [train.py:1204] (2/4) Exclude cut with ID _59256_210_5986_1_1535076085997_3589120_121-237386-0_sp1.1 from training. Duration: 0.87275 2023-10-04 02:06:58,364 WARNING [train.py:1204] (2/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_96-320736-0 from training. Duration: 0.86 2023-10-04 02:06:58,373 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_128-202104-0 from training. Duration: 0.63 2023-10-04 02:06:58,463 WARNING [train.py:1204] (2/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_242-3472-0_sp0.9 from training. Duration: 0.811125 2023-10-04 02:06:58,511 WARNING [train.py:1204] (2/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_154-74182-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 02:06:58,515 WARNING [train.py:1204] (2/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_187-221566-0 from training. Duration: 0.43 2023-10-04 02:06:58,680 WARNING [train.py:1204] (2/4) Exclude cut with ID _6194_210_1534_1_1527511826160_3941194_14-282876-0 from training. Duration: 0.71 2023-10-04 02:06:58,719 WARNING [train.py:1204] (2/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_660-211088-0_sp1.1 from training. Duration: 0.563625 2023-10-04 02:06:59,060 WARNING [train.py:1204] (2/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_526-62079-0_sp1.1 from training. Duration: 0.6090625 2023-10-04 02:06:59,718 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_92-246270-0_sp1.1 from training. Duration: 0.77275 2023-10-04 02:06:59,809 WARNING [train.py:1204] (2/4) Exclude cut with ID IC0287W0023-142350 from training. Duration: 0.956 2023-10-04 02:06:59,886 WARNING [train.py:1204] (2/4) Exclude cut with ID _58605_210_12210_1_1534723195939_3904259_48-269402-0 from training. Duration: 0.96 2023-10-04 02:06:59,901 WARNING [train.py:1204] (2/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_416-257730-0_sp0.9 from training. Duration: 0.72225 2023-10-04 02:06:59,976 WARNING [train.py:1204] (2/4) Exclude cut with ID _7877_210_6014_1_1528286358099_3750136_185-17516-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 02:07:00,081 WARNING [train.py:1204] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_23-189234-0 from training. Duration: 0.83 2023-10-04 02:07:00,098 WARNING [train.py:1204] (2/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_237-89684-0 from training. Duration: 0.96 2023-10-04 02:07:00,169 WARNING [train.py:1204] (2/4) Exclude cut with ID _13916_210_9994_1_1530788341126_3763149_116-21034-0 from training. Duration: 0.96 2023-10-04 02:07:00,319 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_3133_210_2087_1_1526106446_2324690_246-125000-0_sp1.1 from training. Duration: 0.563625 2023-10-04 02:07:06,274 INFO [train.py:1046] (2/4) Epoch 43, batch 0, loss[loss=0.1968, simple_loss=0.27, pruned_loss=0.06181, over 19306.00 frames. ], tot_loss[loss=0.1968, simple_loss=0.27, pruned_loss=0.06181, over 19306.00 frames. ], batch size: 388, lr: 2.38e-03, grad_scale: 16.0 2023-10-04 02:07:06,275 INFO [train.py:1069] (2/4) Computing validation loss 2023-10-04 02:07:14,296 INFO [zipformer.py:1853] (2/4) name=encoder.encoders.4.encoder.layers.0.self_attn_weights, attn_weights_entropy = tensor([3.6331, 2.4214, 3.9502, 3.1065], device='cuda:2') 2023-10-04 02:07:17,993 INFO [train.py:1078] (2/4) Epoch 43, validation: loss=0.318, simple_loss=0.2688, pruned_loss=0.1836, over 1125622.00 frames. 2023-10-04 02:07:17,993 INFO [train.py:1079] (2/4) Maximum memory allocated so far is 20967MB 2023-10-04 02:07:18,041 WARNING [train.py:1204] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_402-253027-0 from training. Duration: 0.54 2023-10-04 02:07:18,121 WARNING [train.py:1204] (2/4) Exclude cut with ID _59288_210_5854_1_1535077757774_3412246_158-289345-0_sp1.1 from training. Duration: 0.92725 2023-10-04 02:07:18,367 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.conv_skip_rate, batch_count=1487400.0, ans=0.0 2023-10-04 02:07:19,446 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_28211_210_6388_1_1532861506_4523224_128-328162-0_sp1.1 from training. Duration: 0.8181875 2023-10-04 02:07:24,998 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_788-278281-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 02:07:26,263 WARNING [train.py:1204] (2/4) Exclude cut with ID _49728_210_12651_1_1534041034294_6788150_624-195301-0_sp0.9 from training. Duration: 0.9444375 2023-10-04 02:07:26,310 WARNING [train.py:1204] (2/4) Exclude cut with ID _14860_210_4915_1_1530702020340_7343240_267-139775-0_sp1.1 from training. Duration: 0.963625 2023-10-04 02:07:26,375 WARNING [train.py:1204] (2/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_332-221057-0 from training. Duration: 0.68 2023-10-04 02:07:27,732 WARNING [train.py:1204] (2/4) Exclude cut with ID _40402_210_18219_1_1533522324115_2953150_242-76943-0 from training. Duration: 0.94 2023-10-04 02:07:30,649 WARNING [train.py:1204] (2/4) Exclude cut with ID _16747_210_5986_1_1531811544733_2749109_87-349587-0_sp1.1 from training. Duration: 0.963625 2023-10-04 02:07:31,970 WARNING [train.py:1204] (2/4) Exclude cut with ID _22982_210_1883_1_1531882790447_5449409_505-346452-0_sp1.1 from training. Duration: 0.963625 2023-10-04 02:07:33,539 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.nonlin_attention.balancer.max_positive, batch_count=1487466.6666666667, ans=0.95 2023-10-04 02:07:34,673 WARNING [train.py:1204] (2/4) Exclude cut with ID _17956_210_4353_1_1531209568125_6979970_13-192107-0_sp1.1 from training. Duration: 0.963625 2023-10-04 02:07:34,718 WARNING [train.py:1204] (2/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_430-307666-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 02:07:36,025 WARNING [train.py:1204] (2/4) Exclude cut with ID _2895_210_2477_1_1525956785794_4063750_289-11475-0_sp1.1 from training. Duration: 0.7454375 2023-10-04 02:07:36,044 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_3910_210_1794_1_1526742306_334530_28-238909-0_sp0.9 from training. Duration: 0.9 2023-10-04 02:07:37,412 WARNING [train.py:1204] (2/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_18-130336-0 from training. Duration: 0.79 2023-10-04 02:07:38,791 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_548-294316-0_sp0.9 from training. Duration: 0.9 2023-10-04 02:07:46,363 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_42-142473-0_sp1.1 from training. Duration: 0.6545625 2023-10-04 02:07:48,021 WARNING [train.py:1204] (2/4) Exclude cut with ID _59170_210_6721_1_1534983633960_4203335_326-48066-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 02:07:50,102 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_45886_210_17024_1_1533709215_471063_7-79165-0 from training. Duration: 0.98 2023-10-04 02:07:54,208 WARNING [train.py:1204] (2/4) Exclude cut with ID _44585_210_18628_1_1533605350387_4518199_280-73534-0_sp0.9 from training. Duration: 0.988875 2023-10-04 02:07:54,215 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_3475_210_2028_1_1526193578_3622569_265-276625-0_sp1.1 from training. Duration: 0.6909375 2023-10-04 02:07:55,694 WARNING [train.py:1204] (2/4) Exclude cut with ID _64631_210_27767_1_1535349580068_4165160_88-225232-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 02:08:01,103 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_205-265518-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 02:08:05,196 WARNING [train.py:1204] (2/4) Exclude cut with ID _40402_210_18219_1_1533522324115_2953150_271-253734-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 02:08:09,471 WARNING [train.py:1204] (2/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_282-257791-0 from training. Duration: 0.86 2023-10-04 02:08:13,453 WARNING [train.py:1204] (2/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_356-45054-0 from training. Duration: 0.86 2023-10-04 02:08:14,786 WARNING [train.py:1204] (2/4) Exclude cut with ID _69584_210_5219_1_1535785133775_4044450_38-348116-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 02:08:14,788 WARNING [train.py:1204] (2/4) Exclude cut with ID _66763_210_17301_1_1535788667617_241750_21-328480-0_sp1.1 from training. Duration: 0.936375 2023-10-04 02:08:14,870 WARNING [train.py:1204] (2/4) Exclude cut with ID _26528_210_9070_1_1532570410842_3851310_497-97218-0_sp0.9 from training. Duration: 0.9555625 2023-10-04 02:08:16,785 WARNING [train.py:1204] (2/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_592-101075-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 02:08:17,013 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.out_combiner.scale_min, batch_count=1487666.6666666667, ans=0.2 2023-10-04 02:08:18,156 WARNING [train.py:1204] (2/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_678-159307-0 from training. Duration: 0.85 2023-10-04 02:08:21,483 WARNING [train.py:1204] (2/4) Exclude cut with ID _28427_210_4220_1_1532581240967_3061069_166-319773-0_sp1.1 from training. Duration: 0.936375 2023-10-04 02:08:23,411 WARNING [train.py:1204] (2/4) Exclude cut with ID _29877_210_6228_1_1532397791106_5276149_606-224984-0_sp1.1 from training. Duration: 0.936375 2023-10-04 02:08:25,501 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_125-203105-0_sp1.1 from training. Duration: 0.72725 2023-10-04 02:08:28,302 WARNING [train.py:1204] (2/4) Exclude cut with ID IC0620W0113-306765 from training. Duration: 0.768 2023-10-04 02:08:29,742 WARNING [train.py:1204] (2/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_410-287857-0_sp1.1 from training. Duration: 0.8454375 2023-10-04 02:08:31,047 INFO [train.py:1046] (2/4) Epoch 43, batch 50, loss[loss=0.1702, simple_loss=0.2574, pruned_loss=0.0415, over 24622.00 frames. ], tot_loss[loss=0.1566, simple_loss=0.2384, pruned_loss=0.0374, over 1064739.68 frames. ], batch size: 68, lr: 2.38e-03, grad_scale: 16.0 2023-10-04 02:08:32,536 WARNING [train.py:1204] (2/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_78-95260-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 02:08:35,342 WARNING [train.py:1204] (2/4) Exclude cut with ID _58059_210_5854_1_1534901471149_3164808_6-73080-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 02:08:35,362 WARNING [train.py:1204] (2/4) Exclude cut with ID _5166_210_2084_1_1527073197160_4087993_331-117829-0 from training. Duration: 0.62 2023-10-04 02:08:36,650 WARNING [train.py:1204] (2/4) Exclude cut with ID _14880_210_8895_1_1530680426655_3795259_289-212817-0_sp0.9 from training. Duration: 0.7666875 2023-10-04 02:08:37,983 WARNING [train.py:1204] (2/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_620-144779-0_sp1.1 from training. Duration: 0.92725 2023-10-04 02:08:39,335 WARNING [train.py:1204] (2/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_561-288575-0_sp1.1 from training. Duration: 0.963625 2023-10-04 02:08:39,438 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_22657_210_6890_1_1532086002_1637216_122-188128-0_sp1.1 from training. Duration: 0.963625 2023-10-04 02:08:42,171 WARNING [train.py:1204] (2/4) Exclude cut with ID _16756_210_4915_1_1531047803763_7104259_604-86607-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 02:08:43,744 WARNING [train.py:1204] (2/4) Exclude cut with ID _15150_210_8341_1_1530752411803_7256384_469-239509-0 from training. Duration: 0.68 2023-10-04 02:08:43,751 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_10987_210_4163_1_1529760555_4123902_302-87926-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 02:08:51,505 WARNING [train.py:1204] (2/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_469-94673-0_sp0.9 from training. Duration: 0.87775 2023-10-04 02:08:52,991 WARNING [train.py:1204] (2/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_798-106179-0 from training. Duration: 0.83 2023-10-04 02:08:54,300 WARNING [train.py:1204] (2/4) Exclude cut with ID _16744_210_5986_1_1531724405384_3907567_242-111316-0 from training. Duration: 0.93 2023-10-04 02:08:56,403 WARNING [train.py:1204] (2/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_938-205135-0_sp0.9 from training. Duration: 0.9666875 2023-10-04 02:08:57,775 WARNING [train.py:1204] (2/4) Exclude cut with ID _64135_210_2253_1_1535677267876_7608972_133-188266-0_sp0.9 from training. Duration: 0.9 2023-10-04 02:08:57,779 WARNING [train.py:1204] (2/4) Exclude cut with ID _44585_210_18628_1_1533605350387_4518199_623-346820-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 02:08:59,047 WARNING [train.py:1204] (2/4) Exclude cut with ID _42326_210_7770_1_1533814282531_3707269_454-266903-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 02:09:00,393 WARNING [train.py:1204] (2/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_5-55893-0_sp1.1 from training. Duration: 0.5 2023-10-04 02:09:00,431 WARNING [train.py:1204] (2/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_577-332600-0_sp1.1 from training. Duration: 0.5181875 2023-10-04 02:09:00,431 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_957-138882-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 02:09:06,069 WARNING [train.py:1204] (2/4) Exclude cut with ID _12556_210_8341_1_1530081024100_7327690_219-38004-0_sp1.1 from training. Duration: 0.97275 2023-10-04 02:09:07,507 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_397-147196-0_sp1.1 from training. Duration: 0.72725 2023-10-04 02:09:07,517 WARNING [train.py:1204] (2/4) Exclude cut with ID _3541_210_2477_1_1526475408709_3263973_231-104736-0_sp0.9 from training. Duration: 0.8666875 2023-10-04 02:09:08,887 WARNING [train.py:1204] (2/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_398-334558-0 from training. Duration: 0.96 2023-10-04 02:09:11,613 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_257-20733-0_sp1.1 from training. Duration: 0.6909375 2023-10-04 02:09:12,303 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.3.encoder.layers.1.whiten, num_groups=1, num_channels=512, metric=5.28 vs. limit=12.0 2023-10-04 02:09:13,064 WARNING [train.py:1204] (2/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_146-282400-0_sp1.1 from training. Duration: 0.6454375 2023-10-04 02:09:13,067 WARNING [train.py:1204] (2/4) Exclude cut with ID _51899_210_20034_1_1534129254466_3669220_265-215428-0 from training. Duration: 0.98 2023-10-04 02:09:13,132 WARNING [train.py:1204] (2/4) Exclude cut with ID _34423_210_8750_1_1533430798456_3330199_219-106541-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 02:09:14,444 WARNING [train.py:1204] (2/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_458-67321-0 from training. Duration: 0.97 2023-10-04 02:09:15,780 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.725e+02 2.079e+02 2.255e+02 2.464e+02 4.467e+02, threshold=4.509e+02, percent-clipped=1.0 2023-10-04 02:09:17,584 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.conv_module1.balancer2.prob, batch_count=1487933.3333333333, ans=0.125 2023-10-04 02:09:21,514 WARNING [train.py:1204] (2/4) Exclude cut with ID _6653_210_2629_1_1527919172441_6732679_1051-182606-0_sp1.1 from training. Duration: 0.936375 2023-10-04 02:09:21,693 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.bypass.skip_rate, batch_count=1487933.3333333333, ans=0.09899494936611666 2023-10-04 02:09:23,348 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_53555_210_5632_1_1534595220_7232297_603-4396-0_sp1.1 from training. Duration: 0.97275 2023-10-04 02:09:24,692 WARNING [train.py:1204] (2/4) Exclude cut with ID _54218_210_12210_1_1534413646388_4409259_159-144367-0_sp1.1 from training. Duration: 0.963625 2023-10-04 02:09:26,121 WARNING [train.py:1204] (2/4) Exclude cut with ID _7606_210_4915_1_1528894253277_5080690_301-10732-0_sp0.9 from training. Duration: 0.9 2023-10-04 02:09:26,124 WARNING [train.py:1204] (2/4) Exclude cut with ID _15026_210_8750_1_1530777602316_3291850_211-195043-0_sp0.9 from training. Duration: 0.888875 2023-10-04 02:09:29,376 WARNING [train.py:1204] (2/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_146-282400-0 from training. Duration: 0.71 2023-10-04 02:09:29,420 WARNING [train.py:1204] (2/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_65-88242-0 from training. Duration: 0.56 2023-10-04 02:09:32,266 WARNING [train.py:1204] (2/4) Exclude cut with ID _55857_210_2346_1_1534644007831_6898209_80-107757-0_sp1.1 from training. Duration: 0.963625 2023-10-04 02:09:32,307 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_15126_210_6753_1_1530778677_2591185_120-277707-0_sp0.9 from training. Duration: 0.888875 2023-10-04 02:09:33,673 WARNING [train.py:1204] (2/4) Exclude cut with ID _44778_210_2054_1_1533798127203_7443090_1033-168593-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 02:09:33,726 WARNING [train.py:1204] (2/4) Exclude cut with ID _46683_210_9642_1_1533730913492_4591116_475-30711-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 02:09:33,756 WARNING [train.py:1204] (2/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_253-308377-0 from training. Duration: 0.49 2023-10-04 02:09:35,248 WARNING [train.py:1204] (2/4) Exclude cut with ID _4680_210_3525_1_1527249757953_893386_75-78979-0 from training. Duration: 0.91 2023-10-04 02:09:35,333 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_5522_210_3273_1_1527249606_3909096_318-203153-0_sp0.9 from training. Duration: 0.47775 2023-10-04 02:09:36,671 WARNING [train.py:1204] (2/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_550-170116-0_sp1.1 from training. Duration: 0.92725 2023-10-04 02:09:36,704 WARNING [train.py:1204] (2/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_277-210798-0_sp0.9 from training. Duration: 0.611125 2023-10-04 02:09:38,173 WARNING [train.py:1204] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_620-276793-0 from training. Duration: 0.59 2023-10-04 02:09:38,196 WARNING [train.py:1204] (2/4) Exclude cut with ID _3067_210_2667_1_1526212974640_4503168_119-175776-0 from training. Duration: 0.74 2023-10-04 02:09:40,772 WARNING [train.py:1204] (2/4) Exclude cut with ID _55209_210_1531_1_1534675828996_4597160_681-195827-0_sp1.1 from training. Duration: 0.92725 2023-10-04 02:09:40,836 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_427-78097-0_sp1.1 from training. Duration: 0.72725 2023-10-04 02:09:42,155 WARNING [train.py:1204] (2/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_96-290039-0_sp0.9 from training. Duration: 0.62225 2023-10-04 02:09:42,164 WARNING [train.py:1204] (2/4) Exclude cut with ID _45329_210_7005_1_1534157951211_6774339_287-292174-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 02:09:44,783 INFO [train.py:1046] (2/4) Epoch 43, batch 100, loss[loss=0.1656, simple_loss=0.2535, pruned_loss=0.03886, over 24586.00 frames. ], tot_loss[loss=0.1552, simple_loss=0.2371, pruned_loss=0.03659, over 1890567.14 frames. ], batch size: 71, lr: 2.38e-03, grad_scale: 16.0 2023-10-04 02:09:44,832 WARNING [train.py:1204] (2/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_200-292228-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 02:09:48,785 WARNING [train.py:1204] (2/4) Exclude cut with ID _69884_210_9771_1_1535846392818_4166169_291-194999-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 02:09:50,244 WARNING [train.py:1204] (2/4) Exclude cut with ID _58895_210_8750_1_1535184081320_3649089_265-323810-0_sp1.1 from training. Duration: 0.87275 2023-10-04 02:09:52,355 WARNING [train.py:1204] (2/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_58-68956-0 from training. Duration: 0.83 2023-10-04 02:09:52,360 WARNING [train.py:1204] (2/4) Exclude cut with ID _39867_210_9826_1_1533553222030_9344259_322-115153-0_sp1.1 from training. Duration: 0.963625 2023-10-04 02:09:55,856 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_3371_210_1794_1_1526384462_5580777_396-339812-0_sp1.1 from training. Duration: 0.77275 2023-10-04 02:09:55,867 WARNING [train.py:1204] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_731-2428-0_sp0.9 from training. Duration: 0.92225 2023-10-04 02:09:55,888 WARNING [train.py:1204] (2/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_98-741-0_sp1.1 from training. Duration: 0.72725 2023-10-04 02:09:55,890 WARNING [train.py:1204] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_238-245655-0_sp0.9 from training. Duration: 0.9 2023-10-04 02:09:57,111 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6788_210_1388_1_1527898698_4562021_91-197981-0_sp0.9 from training. Duration: 0.92225 2023-10-04 02:09:57,227 WARNING [train.py:1204] (2/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_860-96198-0 from training. Duration: 0.73 2023-10-04 02:10:00,422 WARNING [train.py:1204] (2/4) Exclude cut with ID _14268_210_9206_1_1530622771285_4714099_331-105351-0_sp0.9 from training. Duration: 0.72225 2023-10-04 02:10:00,457 WARNING [train.py:1204] (2/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_204-137133-0_sp1.1 from training. Duration: 0.92725 2023-10-04 02:10:01,874 WARNING [train.py:1204] (2/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_5-46496-0_sp1.1 from training. Duration: 0.936375 2023-10-04 02:10:01,882 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_160-218721-0_sp1.1 from training. Duration: 0.87275 2023-10-04 02:10:04,698 WARNING [train.py:1204] (2/4) Exclude cut with ID _45329_210_7005_1_1534157951211_6774339_503-287883-0 from training. Duration: 0.88 2023-10-04 02:10:05,924 WARNING [train.py:1204] (2/4) Exclude cut with ID _57572_210_5517_1_1534921219624_3809484_110-79134-0_sp1.1 from training. Duration: 0.92725 2023-10-04 02:10:07,393 WARNING [train.py:1204] (2/4) Exclude cut with ID _44585_210_18628_1_1533605350387_4518199_421-37641-0_sp1.1 from training. Duration: 0.936375 2023-10-04 02:10:07,478 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_42-142473-0_sp0.9 from training. Duration: 0.8 2023-10-04 02:10:10,297 WARNING [train.py:1204] (2/4) Exclude cut with ID _5166_210_2084_1_1527073197160_4087993_310-114452-0_sp0.9 from training. Duration: 0.6555625 2023-10-04 02:10:12,940 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9029W0120-509520 from training. Duration: 0.768 2023-10-04 02:10:12,955 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9029W0433-509820 from training. Duration: 0.896 2023-10-04 02:10:14,269 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_40222_210_6228_1_1533385298_4481939_163-334586-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 02:10:14,269 WARNING [train.py:1204] (2/4) Exclude cut with ID _48677_210_5663_1_1533902405919_3892989_408-40740-0_sp1.1 from training. Duration: 0.7545625 2023-10-04 02:10:17,272 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.conv_skip_rate, batch_count=1488200.0, ans=0.0 2023-10-04 02:10:18,422 WARNING [train.py:1204] (2/4) Exclude cut with ID _24314_210_4915_1_1532135078116_7170179_521-258868-0_sp0.9 from training. Duration: 0.87775 2023-10-04 02:10:19,744 WARNING [train.py:1204] (2/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_576-51198-0_sp1.1 from training. Duration: 0.92725 2023-10-04 02:10:21,679 WARNING [train.py:1204] (2/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_306-216871-0_sp1.1 from training. Duration: 0.97275 2023-10-04 02:10:28,112 WARNING [train.py:1204] (2/4) Exclude cut with ID _20684_210_6917_1_1531567751646_4203930_463-119982-0_sp1.1 from training. Duration: 0.97275 2023-10-04 02:10:28,175 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9084W0012-536850 from training. Duration: 0.64 2023-10-04 02:10:30,836 WARNING [train.py:1204] (2/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_847-41334-0_sp1.1 from training. Duration: 0.4 2023-10-04 02:10:35,475 WARNING [train.py:1204] (2/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_326-165900-0_sp1.1 from training. Duration: 0.62725 2023-10-04 02:10:37,015 WARNING [train.py:1204] (2/4) Exclude cut with ID _7537_210_2084_1_1528502395431_3924359_128-268029-0_sp1.1 from training. Duration: 0.8909375 2023-10-04 02:10:39,735 WARNING [train.py:1204] (2/4) Exclude cut with ID _67499_210_17282_1_1535509805220_3692430_333-155030-0_sp1.1 from training. Duration: 0.97275 2023-10-04 02:10:42,680 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_34008_210_10431_1_1533345424_8183247_588-257177-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 02:10:46,751 WARNING [train.py:1204] (2/4) Exclude cut with ID _25914_210_6400_1_1532141317058_644740_52-96808-0_sp1.1 from training. Duration: 0.82725 2023-10-04 02:10:48,223 WARNING [train.py:1204] (2/4) Exclude cut with ID _56494_210_4220_1_1535284837625_3033169_69-138150-0_sp1.1 from training. Duration: 0.8545625 2023-10-04 02:10:49,599 WARNING [train.py:1204] (2/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_19-143553-0_sp1.1 from training. Duration: 0.97275 2023-10-04 02:10:50,958 WARNING [train.py:1204] (2/4) Exclude cut with ID _21878_210_3525_1_1531738772002_7264330_582-130735-0_sp1.1 from training. Duration: 0.936375 2023-10-04 02:10:52,289 WARNING [train.py:1204] (2/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_943-201356-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 02:10:52,299 WARNING [train.py:1204] (2/4) Exclude cut with ID _3231_210_2106_1_1526205864934_6784220_422-229276-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 02:10:52,326 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_48049_210_5632_1_1533864553_3519937_73-198717-0_sp1.1 from training. Duration: 0.97275 2023-10-04 02:10:52,375 WARNING [train.py:1204] (2/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_329-285801-0 from training. Duration: 0.91 2023-10-04 02:10:52,397 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9171W0114-579000 from training. Duration: 0.896 2023-10-04 02:10:52,415 WARNING [train.py:1204] (2/4) Exclude cut with ID _3120_210_3525_1_1526036143112_4072980_210-132675-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 02:10:53,836 WARNING [train.py:1204] (2/4) Exclude cut with ID _55857_210_2346_1_1534644007831_6898209_745-98391-0_sp0.9 from training. Duration: 0.8444375 2023-10-04 02:10:54,021 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.attention_skip_rate, batch_count=1488333.3333333333, ans=0.0 2023-10-04 02:10:55,165 WARNING [train.py:1204] (2/4) Exclude cut with ID _5250_210_5219_1_1527249561806_8692110_615-322035-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 02:10:55,170 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_36756_210_7234_1_1533026991_966276_119-315767-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 02:10:55,193 WARNING [train.py:1204] (2/4) Exclude cut with ID _13916_210_9994_1_1530788341126_3763149_60-191556-0_sp1.1 from training. Duration: 0.5545625 2023-10-04 02:10:57,111 WARNING [train.py:1204] (2/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_177-236809-0_sp1.1 from training. Duration: 0.4454375 2023-10-04 02:10:57,168 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7002_210_1794_1_1527939683_5157286_184-229630-0_sp1.1 from training. Duration: 0.67275 2023-10-04 02:10:57,174 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_50428_210_5724_1_1534247532_4126418_464-18322-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 02:10:58,913 INFO [train.py:1046] (2/4) Epoch 43, batch 150, loss[loss=0.153, simple_loss=0.2262, pruned_loss=0.03991, over 23660.00 frames. ], tot_loss[loss=0.1548, simple_loss=0.2368, pruned_loss=0.0364, over 2530216.99 frames. ], batch size: 232, lr: 2.38e-03, grad_scale: 16.0 2023-10-04 02:10:58,977 WARNING [train.py:1204] (2/4) Exclude cut with ID _35799_210_5854_1_1533263487887_3529071_466-131402-0_sp1.1 from training. Duration: 0.936375 2023-10-04 02:11:00,794 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_1259-132768-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 02:11:00,848 WARNING [train.py:1204] (2/4) Exclude cut with ID _6694_210_4599_1_1527852637490_3965529_144-185051-0_sp1.1 from training. Duration: 0.87275 2023-10-04 02:11:00,887 WARNING [train.py:1204] (2/4) Exclude cut with ID _64639_210_5816_1_1535367619437_3528000_59-133290-0_sp1.1 from training. Duration: 0.963625 2023-10-04 02:11:03,604 WARNING [train.py:1204] (2/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_223-49525-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 02:11:06,316 WARNING [train.py:1204] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_748-36150-0_sp1.1 from training. Duration: 0.82725 2023-10-04 02:11:06,317 WARNING [train.py:1204] (2/4) Exclude cut with ID _19896_210_1883_1_1531709022662_6649569_52-221627-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 02:11:07,587 WARNING [train.py:1204] (2/4) Exclude cut with ID _6789_210_1534_1_1527854471671_2870519_296-115766-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 02:11:12,080 WARNING [train.py:1204] (2/4) Exclude cut with ID _14624_210_1925_1_1530858690657_3642219_100-19871-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 02:11:13,368 WARNING [train.py:1204] (2/4) Exclude cut with ID _12555_210_8750_1_1530075594402_3780239_50-293675-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 02:11:14,971 WARNING [train.py:1204] (2/4) Exclude cut with ID _14880_210_8895_1_1530680426655_3795259_289-212817-0_sp1.1 from training. Duration: 0.62725 2023-10-04 02:11:16,230 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_14056_210_4163_1_1530604483_7572827_388-211626-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 02:11:19,126 WARNING [train.py:1204] (2/4) Exclude cut with ID _5289_210_2084_1_1527678202736_3779299_392-261988-0 from training. Duration: 0.65 2023-10-04 02:11:19,328 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.bypass.scale_min, batch_count=1488466.6666666667, ans=0.2 2023-10-04 02:11:20,409 WARNING [train.py:1204] (2/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_124-345840-0 from training. Duration: 0.52 2023-10-04 02:11:20,413 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_2655_210_2087_1_1525688959_3897400_437-337690-0 from training. Duration: 0.63 2023-10-04 02:11:21,793 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_3780_210_2087_1_1526467391_239068_13-274633-0_sp1.1 from training. Duration: 0.92725 2023-10-04 02:11:21,798 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_805-71497-0_sp1.1 from training. Duration: 0.6454375 2023-10-04 02:11:23,166 WARNING [train.py:1204] (2/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_434-83000-0_sp1.1 from training. Duration: 0.8909375 2023-10-04 02:11:24,518 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_17613_210_2151_1_1531137569_2366160_285-219531-0_sp1.1 from training. Duration: 0.97275 2023-10-04 02:11:25,906 WARNING [train.py:1204] (2/4) Exclude cut with ID _56108_210_16380_1_1534505446159_3887959_22-37858-0_sp1.1 from training. Duration: 0.936375 2023-10-04 02:11:25,951 WARNING [train.py:1204] (2/4) Exclude cut with ID _28427_210_4220_1_1532581240967_3061069_200-88444-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 02:11:27,251 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_43802_210_2290_1_1533516203_8829504_77-279042-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 02:11:28,716 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9309W0193-640320 from training. Duration: 0.896 2023-10-04 02:11:32,397 WARNING [train.py:1204] (2/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_516-324055-0_sp1.1 from training. Duration: 0.936375 2023-10-04 02:11:35,984 WARNING [train.py:1204] (2/4) Exclude cut with ID _40094_210_16380_1_1533294005773_3641159_10-261240-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 02:11:39,898 WARNING [train.py:1204] (2/4) Exclude cut with ID _6267_210_2084_1_1527764658526_3894730_375-219887-0_sp1.1 from training. Duration: 0.6090625 2023-10-04 02:11:39,979 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6788_210_1388_1_1527898698_4562021_180-75288-0 from training. Duration: 0.99 2023-10-04 02:11:42,523 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.4.encoder.layers.2.feed_forward1.out_whiten, num_groups=1, num_channels=384, metric=8.43 vs. limit=15.0 2023-10-04 02:11:43,293 WARNING [train.py:1204] (2/4) Exclude cut with ID _8030_210_7497_1_1528444886781_3531089_262-86750-0_sp1.1 from training. Duration: 0.736375 2023-10-04 02:11:43,315 WARNING [train.py:1204] (2/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_934-325498-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 02:11:43,339 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_456-22141-0_sp0.9 from training. Duration: 0.97775 2023-10-04 02:11:44,686 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.663e+02 1.926e+02 2.079e+02 2.360e+02 3.858e+02, threshold=4.157e+02, percent-clipped=0.0 2023-10-04 02:11:44,814 WARNING [train.py:1204] (2/4) Exclude cut with ID _49643_210_7989_1_1534640244224_3939031_598-161926-0_sp1.1 from training. Duration: 0.8181875 2023-10-04 02:11:46,206 WARNING [train.py:1204] (2/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_345-49721-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 02:11:46,289 WARNING [train.py:1204] (2/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_440-141811-0_sp1.1 from training. Duration: 0.863625 2023-10-04 02:11:46,451 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.conv_module1.balancer2.prob, batch_count=1488600.0, ans=0.125 2023-10-04 02:11:47,707 WARNING [train.py:1204] (2/4) Exclude cut with ID _64048_210_10626_1_1535198382802_3835179_394-212120-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 02:11:49,035 WARNING [train.py:1204] (2/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_91-277684-0 from training. Duration: 0.74 2023-10-04 02:11:53,056 WARNING [train.py:1204] (2/4) Exclude cut with ID _23038_210_9570_1_1532001584158_3652419_191-346343-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 02:11:54,389 WARNING [train.py:1204] (2/4) Exclude cut with ID _15140_210_9288_1_1530791388450_4880149_351-347974-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 02:11:54,419 WARNING [train.py:1204] (2/4) Exclude cut with ID _58605_210_12210_1_1534723195939_3904259_48-269402-0_sp1.1 from training. Duration: 0.87275 2023-10-04 02:11:54,425 WARNING [train.py:1204] (2/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_401-53423-0_sp0.9 from training. Duration: 0.611125 2023-10-04 02:11:57,153 WARNING [train.py:1204] (2/4) Exclude cut with ID _42659_210_2070_1_1533607442765_3120380_158-49004-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 02:11:58,546 WARNING [train.py:1204] (2/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_81-75849-0_sp1.1 from training. Duration: 0.3545625 2023-10-04 02:12:02,562 WARNING [train.py:1204] (2/4) Exclude cut with ID _27052_210_5986_1_1532329197187_3585009_314-106499-0_sp1.1 from training. Duration: 0.836375 2023-10-04 02:12:03,876 WARNING [train.py:1204] (2/4) Exclude cut with ID _4626_210_3532_1_1526817652717_3916859_446-206656-0_sp0.9 from training. Duration: 0.8444375 2023-10-04 02:12:06,490 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_53378_210_17282_1_1534221071_5769509_158-114960-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 02:12:08,737 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.1.encoder.layers.1.self_attn_weights.whiten_keys, num_groups=4, num_channels=128, metric=2.71 vs. limit=6.0 2023-10-04 02:12:09,183 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_3171_210_2087_1_1526109084_2330007_37-64334-0_sp0.9 from training. Duration: 0.911125 2023-10-04 02:12:09,205 WARNING [train.py:1204] (2/4) Exclude cut with ID _5410_210_1388_1_1527397055795_1719020_39-112221-0 from training. Duration: 0.69 2023-10-04 02:12:09,228 WARNING [train.py:1204] (2/4) Exclude cut with ID _8064_210_5399_1_1528372299291_4282973_4-91037-0_sp0.9 from training. Duration: 0.97775 2023-10-04 02:12:10,547 WARNING [train.py:1204] (2/4) Exclude cut with ID ID0079W0443-717795 from training. Duration: 0.541 2023-10-04 02:12:12,461 INFO [train.py:1046] (2/4) Epoch 43, batch 200, loss[loss=0.1465, simple_loss=0.235, pruned_loss=0.02897, over 24568.00 frames. ], tot_loss[loss=0.1563, simple_loss=0.2383, pruned_loss=0.03719, over 3026934.91 frames. ], batch size: 71, lr: 2.38e-03, grad_scale: 16.0 2023-10-04 02:12:13,982 WARNING [train.py:1204] (2/4) Exclude cut with ID _34734_210_4525_1_1533365977542_4448720_766-297928-0_sp1.1 from training. Duration: 0.936375 2023-10-04 02:12:15,710 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.feed_forward3.hidden_balancer.prob, batch_count=1488733.3333333333, ans=0.125 2023-10-04 02:12:18,107 WARNING [train.py:1204] (2/4) Exclude cut with ID _51471_210_1889_1_1534399391390_3094250_310-337793-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 02:12:18,118 WARNING [train.py:1204] (2/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_678-159307-0_sp0.9 from training. Duration: 0.9444375 2023-10-04 02:12:20,808 WARNING [train.py:1204] (2/4) Exclude cut with ID _50093_210_4220_1_1534417141207_3432299_259-322252-0 from training. Duration: 0.92 2023-10-04 02:12:22,151 WARNING [train.py:1204] (2/4) Exclude cut with ID _47790_210_16187_1_1533862814066_4207519_67-103444-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 02:12:22,169 WARNING [train.py:1204] (2/4) Exclude cut with ID _40551_210_10635_1_1533776424632_3416179_311-247578-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 02:12:23,648 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_4633_210_2040_1_1526824163_4145343_416-76117-0 from training. Duration: 0.95 2023-10-04 02:12:23,863 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.nonlin_attention.balancer.prob, batch_count=1488733.3333333333, ans=0.125 2023-10-04 02:12:25,130 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.balancer2.prob, batch_count=1488800.0, ans=0.125 2023-10-04 02:12:26,230 WARNING [train.py:1204] (2/4) Exclude cut with ID _5166_210_2084_1_1527073197160_4087993_331-117829-0_sp0.9 from training. Duration: 0.688875 2023-10-04 02:12:26,467 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.balancer1.prob, batch_count=1488800.0, ans=0.125 2023-10-04 02:12:27,644 WARNING [train.py:1204] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_299-247059-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 02:12:27,717 WARNING [train.py:1204] (2/4) Exclude cut with ID _64002_210_9570_1_1535198605157_38979_2-240845-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 02:12:30,513 WARNING [train.py:1204] (2/4) Exclude cut with ID _4681_210_3525_1_1527250689464_120889_15-126760-0_sp0.9 from training. Duration: 0.9 2023-10-04 02:12:31,795 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_21500_210_2054_1_1532149310_2716758_256-156611-0_sp1.1 from training. Duration: 0.936375 2023-10-04 02:12:31,811 WARNING [train.py:1204] (2/4) Exclude cut with ID _16908_210_5724_1_1531119157248_3772700_624-203729-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 02:12:31,988 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.ff3_skip_rate, batch_count=1488800.0, ans=0.0 2023-10-04 02:12:47,266 WARNING [train.py:1204] (2/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_605-219732-0_sp1.1 from training. Duration: 0.8545625 2023-10-04 02:12:48,536 WARNING [train.py:1204] (2/4) Exclude cut with ID _44718_210_5816_1_1534050051869_6469030_380-167520-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 02:12:48,608 WARNING [train.py:1204] (2/4) Exclude cut with ID _19896_210_1883_1_1531709022662_6649569_94-229927-0_sp1.1 from training. Duration: 0.8818125 2023-10-04 02:12:49,905 WARNING [train.py:1204] (2/4) Exclude cut with ID _19514_210_9259_1_1531530071330_5084230_519-174300-0_sp1.1 from training. Duration: 0.963625 2023-10-04 02:12:51,226 WARNING [train.py:1204] (2/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_340-174176-0_sp0.9 from training. Duration: 0.5444375 2023-10-04 02:12:51,241 WARNING [train.py:1204] (2/4) Exclude cut with ID _8263_210_1664_1_1528439452891_970220_68-181805-0_sp1.1 from training. Duration: 0.5818125 2023-10-04 02:12:51,971 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.3.encoder.layers.0.nonlin_attention.whiten1, num_groups=1, num_channels=384, metric=6.78 vs. limit=10.0 2023-10-04 02:12:53,924 WARNING [train.py:1204] (2/4) Exclude cut with ID _3120_210_3525_1_1526036143112_4072980_348-257723-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 02:12:55,350 WARNING [train.py:1204] (2/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_453-133983-0_sp1.1 from training. Duration: 0.7090625 2023-10-04 02:12:56,740 WARNING [train.py:1204] (2/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_17-86060-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 02:12:56,768 WARNING [train.py:1204] (2/4) Exclude cut with ID _29877_210_6228_1_1532397791106_5276149_228-184657-0_sp1.1 from training. Duration: 0.97275 2023-10-04 02:12:58,203 WARNING [train.py:1204] (2/4) Exclude cut with ID _18516_210_9206_1_1531704653206_3692406_198-334870-0 from training. Duration: 0.92 2023-10-04 02:12:58,256 WARNING [train.py:1204] (2/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_340-174176-0_sp1.1 from training. Duration: 0.4454375 2023-10-04 02:12:58,268 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_3133_210_2087_1_1526106446_2324690_14-139186-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 02:13:01,074 WARNING [train.py:1204] (2/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_126-29104-0_sp0.9 from training. Duration: 0.9444375 2023-10-04 02:13:01,212 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.0.nonlin_attention.balancer.prob, batch_count=1488933.3333333333, ans=0.125 2023-10-04 02:13:03,485 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.3.encoder.layers.2.self_attn2.whiten, num_groups=1, num_channels=512, metric=11.20 vs. limit=22.5 2023-10-04 02:13:05,018 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.ff3_skip_rate, batch_count=1488933.3333333333, ans=0.0 2023-10-04 02:13:07,873 WARNING [train.py:1204] (2/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_84-10310-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 02:13:15,313 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_15517_210_6753_1_1530954008_3487336_79-273076-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 02:13:15,362 WARNING [train.py:1204] (2/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_371-18169-0_sp0.9 from training. Duration: 0.9555625 2023-10-04 02:13:15,642 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.attention_skip_rate, batch_count=1489000.0, ans=0.0 2023-10-04 02:13:22,134 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_68702_210_23689_1_1535799577_3420068_170-92788-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 02:13:24,816 INFO [train.py:1046] (2/4) Epoch 43, batch 250, loss[loss=0.1496, simple_loss=0.2328, pruned_loss=0.03315, over 23254.00 frames. ], tot_loss[loss=0.1556, simple_loss=0.2372, pruned_loss=0.03706, over 3409822.96 frames. ], batch size: 119, lr: 2.38e-03, grad_scale: 16.0 2023-10-04 02:13:24,875 WARNING [train.py:1204] (2/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_78-182912-0 from training. Duration: 0.59 2023-10-04 02:13:24,934 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_3019_210_2028_1_1526044245_3264865_297-12384-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 02:13:24,939 WARNING [train.py:1204] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_147-111137-0_sp1.1 from training. Duration: 0.863625 2023-10-04 02:13:24,944 WARNING [train.py:1204] (2/4) Exclude cut with ID _28323_210_16187_1_1532480395667_4100300_117-8859-0_sp1.1 from training. Duration: 0.97275 2023-10-04 02:13:25,037 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7762_210_1794_1_1528199508_4057408_419-125009-0_sp1.1 from training. Duration: 0.7545625 2023-10-04 02:13:26,409 WARNING [train.py:1204] (2/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_268-87909-0 from training. Duration: 0.99 2023-10-04 02:13:27,775 WARNING [train.py:1204] (2/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_118-311357-0_sp1.1 from training. Duration: 0.92725 2023-10-04 02:13:27,790 WARNING [train.py:1204] (2/4) Exclude cut with ID ID0377W0003-870225 from training. Duration: 0.852 2023-10-04 02:13:30,498 WARNING [train.py:1204] (2/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_56-5838-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 02:13:31,874 WARNING [train.py:1204] (2/4) Exclude cut with ID _14603_210_4220_1_1530615688441_3097319_90-31383-0_sp0.9 from training. Duration: 0.9666875 2023-10-04 02:13:32,625 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.1.encoder.layers.1.feed_forward1.out_whiten, num_groups=1, num_channels=256, metric=8.86 vs. limit=15.0 2023-10-04 02:13:33,280 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_31583_210_3852_1_1532595851_4024413_58-86657-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 02:13:35,285 WARNING [train.py:1204] (2/4) Exclude cut with ID _30304_210_6319_1_1532476827261_7582060_344-113875-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 02:13:38,501 WARNING [train.py:1204] (2/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_278-104676-0_sp1.1 from training. Duration: 0.8909375 2023-10-04 02:13:38,535 WARNING [train.py:1204] (2/4) Exclude cut with ID _55844_210_2346_1_1534471212569_7625707_764-2387-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 02:13:39,924 WARNING [train.py:1204] (2/4) Exclude cut with ID _27430_210_7770_1_1532604689375_1378590_157-14963-0_sp1.1 from training. Duration: 0.963625 2023-10-04 02:13:42,738 WARNING [train.py:1204] (2/4) Exclude cut with ID _7160_210_1876_1_1527940923231_3703799_282-322611-0_sp1.1 from training. Duration: 0.7909375 2023-10-04 02:13:51,685 WARNING [train.py:1204] (2/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_190-138640-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 02:13:54,521 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_33348_210_5632_1_1533086912_3324030_63-270922-0_sp1.1 from training. Duration: 0.97275 2023-10-04 02:13:54,545 WARNING [train.py:1204] (2/4) Exclude cut with ID _54871_210_22356_1_1534665798253_3856359_135-184281-0_sp1.1 from training. Duration: 0.9 2023-10-04 02:13:54,738 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.nonlin_attention.balancer.prob, batch_count=1489200.0, ans=0.125 2023-10-04 02:14:01,335 WARNING [train.py:1204] (2/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_277-210798-0_sp1.1 from training. Duration: 0.5 2023-10-04 02:14:01,391 WARNING [train.py:1204] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_402-253027-0_sp0.9 from training. Duration: 0.6 2023-10-04 02:14:02,834 WARNING [train.py:1204] (2/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_540-275685-0_sp0.9 from training. Duration: 0.988875 2023-10-04 02:14:04,065 WARNING [train.py:1204] (2/4) Exclude cut with ID _59493_210_17282_1_1535078419077_3225879_118-18960-0_sp1.1 from training. Duration: 0.936375 2023-10-04 02:14:04,145 WARNING [train.py:1204] (2/4) Exclude cut with ID _6194_210_1534_1_1527511826160_3941194_321-148641-0_sp1.1 from training. Duration: 0.5818125 2023-10-04 02:14:04,151 WARNING [train.py:1204] (2/4) Exclude cut with ID _61133_210_9969_1_1535196777227_3696409_333-270325-0_sp1.1 from training. Duration: 0.8454375 2023-10-04 02:14:06,086 WARNING [train.py:1204] (2/4) Exclude cut with ID _49376_210_5986_1_1533985278732_3370740_307-145221-0_sp1.1 from training. Duration: 0.936375 2023-10-04 02:14:07,958 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6846_210_6753_1_1527850767_4295158_2-315073-0_sp0.9 from training. Duration: 0.97775 2023-10-04 02:14:10,524 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.593e+02 2.128e+02 2.316e+02 2.574e+02 3.711e+02, threshold=4.632e+02, percent-clipped=0.0 2023-10-04 02:14:11,980 WARNING [train.py:1204] (2/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_17-25045-0 from training. Duration: 0.59 2023-10-04 02:14:11,993 WARNING [train.py:1204] (2/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_503-333510-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 02:14:14,027 WARNING [train.py:1204] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_238-245655-0_sp1.1 from training. Duration: 0.736375 2023-10-04 02:14:14,057 WARNING [train.py:1204] (2/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_329-82235-0_sp0.9 from training. Duration: 0.911125 2023-10-04 02:14:15,167 WARNING [train.py:1204] (2/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_325-113798-0_sp0.9 from training. Duration: 0.9444375 2023-10-04 02:14:15,250 WARNING [train.py:1204] (2/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_395-37222-0_sp0.9 from training. Duration: 0.8444375 2023-10-04 02:14:16,630 WARNING [train.py:1204] (2/4) Exclude cut with ID _64135_210_2253_1_1535677267876_7608972_498-5039-0_sp1.1 from training. Duration: 0.7090625 2023-10-04 02:14:16,645 WARNING [train.py:1204] (2/4) Exclude cut with ID _14922_210_6228_1_1530707435273_3693310_303-228738-0_sp0.9 from training. Duration: 0.9333125 2023-10-04 02:14:18,120 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_577-220899-0_sp1.1 from training. Duration: 0.92725 2023-10-04 02:14:19,495 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_3475_210_2028_1_1526193578_3622569_377-166133-0_sp1.1 from training. Duration: 0.7818125 2023-10-04 02:14:20,827 WARNING [train.py:1204] (2/4) Exclude cut with ID _49376_210_5986_1_1533985278732_3370740_272-313512-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 02:14:23,916 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7762_210_1794_1_1528199508_4057408_422-227034-0_sp0.9 from training. Duration: 0.811125 2023-10-04 02:14:25,490 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=1489333.3333333333, ans=0.1 2023-10-04 02:14:26,742 WARNING [train.py:1204] (2/4) Exclude cut with ID _18895_210_6228_1_1531357274234_3784930_74-341428-0_sp1.1 from training. Duration: 0.92725 2023-10-04 02:14:26,876 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.attention_skip_rate, batch_count=1489333.3333333333, ans=0.0 2023-10-04 02:14:29,507 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.0.ff2_skip_rate, batch_count=1489333.3333333333, ans=0.0 2023-10-04 02:14:30,788 WARNING [train.py:1204] (2/4) Exclude cut with ID _44902_210_16187_1_1533888006062_3792078_149-67212-0_sp1.1 from training. Duration: 0.7909375 2023-10-04 02:14:33,804 WARNING [train.py:1204] (2/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_243-126020-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 02:14:35,653 WARNING [train.py:1204] (2/4) Exclude cut with ID _54218_210_12210_1_1534413646388_4409259_259-6561-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 02:14:38,827 INFO [train.py:1046] (2/4) Epoch 43, batch 300, loss[loss=0.1631, simple_loss=0.2495, pruned_loss=0.03836, over 24476.00 frames. ], tot_loss[loss=0.1554, simple_loss=0.2368, pruned_loss=0.03705, over 3703089.44 frames. ], batch size: 69, lr: 2.38e-03, grad_scale: 16.0 2023-10-04 02:14:38,937 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_41452_210_7291_1_1533437687_3941281_405-11559-0 from training. Duration: 0.91 2023-10-04 02:14:40,433 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_20120_210_6753_1_1531643035_5482859_759-225084-0_sp1.1 from training. Duration: 0.97275 2023-10-04 02:14:40,455 WARNING [train.py:1204] (2/4) Exclude cut with ID _14747_210_8341_1_1530685797463_3654089_147-131229-0_sp0.9 from training. Duration: 0.8444375 2023-10-04 02:14:41,862 WARNING [train.py:1204] (2/4) Exclude cut with ID _14867_210_1968_1_1530684179890_3846969_319-250868-0 from training. Duration: 0.99 2023-10-04 02:14:41,884 WARNING [train.py:1204] (2/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_580-93798-0_sp0.9 from training. Duration: 0.62225 2023-10-04 02:14:43,272 WARNING [train.py:1204] (2/4) Exclude cut with ID _9679_210_7497_1_1529129212906_4363929_512-313889-0_sp0.9 from training. Duration: 0.9 2023-10-04 02:14:43,282 WARNING [train.py:1204] (2/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_18-189198-0 from training. Duration: 0.9 2023-10-04 02:14:45,575 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.1.bypass.skip_rate, batch_count=1489400.0, ans=0.035 2023-10-04 02:14:47,849 WARNING [train.py:1204] (2/4) Exclude cut with ID _49755_210_18053_1_1534581005846_3218709_137-64179-0_sp1.1 from training. Duration: 0.92725 2023-10-04 02:14:48,030 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.conv_module1.balancer2.min_positive, batch_count=1489400.0, ans=0.05 2023-10-04 02:14:49,203 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_48049_210_5632_1_1533864553_3519937_306-283794-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 02:14:53,292 WARNING [train.py:1204] (2/4) Exclude cut with ID _43552_210_8913_1_1533726031059_3523429_25-120161-0_sp1.1 from training. Duration: 0.8909375 2023-10-04 02:14:53,359 WARNING [train.py:1204] (2/4) Exclude cut with ID _8833_210_5219_1_1528592665210_7553059_319-349181-0 from training. Duration: 0.99 2023-10-04 02:14:54,655 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_296-332325-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 02:14:56,049 WARNING [train.py:1204] (2/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_436-186311-0_sp1.1 from training. Duration: 0.5909375 2023-10-04 02:14:56,060 WARNING [train.py:1204] (2/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_348-127789-0 from training. Duration: 0.85 2023-10-04 02:14:57,414 WARNING [train.py:1204] (2/4) Exclude cut with ID _3992_210_1747_1_1526644816031_3731060_443-195183-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 02:15:00,289 WARNING [train.py:1204] (2/4) Exclude cut with ID _3465_210_3525_1_1526175667060_3574049_308-231735-0_sp1.1 from training. Duration: 0.663625 2023-10-04 02:15:04,779 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6786_210_1388_1_1527839699_4194495_378-46070-0_sp1.1 from training. Duration: 0.6545625 2023-10-04 02:15:04,823 WARNING [train.py:1204] (2/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_63-101996-0 from training. Duration: 0.88 2023-10-04 02:15:08,152 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_2541_210_1794_1_1525590269_3084765_156-173713-0 from training. Duration: 0.81 2023-10-04 02:15:08,188 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_217-239796-0_sp1.1 from training. Duration: 0.963625 2023-10-04 02:15:10,066 WARNING [train.py:1204] (2/4) Exclude cut with ID _21878_210_3525_1_1531738772002_7264330_530-329400-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 02:15:11,493 WARNING [train.py:1204] (2/4) Exclude cut with ID _21783_210_11357_1_1531742458730_3762679_150-62920-0_sp1.1 from training. Duration: 0.963625 2023-10-04 02:15:11,493 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_5522_210_3273_1_1527249606_3909096_366-172508-0 from training. Duration: 0.66 2023-10-04 02:15:11,496 WARNING [train.py:1204] (2/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_303-340446-0_sp1.1 from training. Duration: 0.8181875 2023-10-04 02:15:14,192 WARNING [train.py:1204] (2/4) Exclude cut with ID _8699_210_2069_1_1528630346831_3960409_160-24408-0_sp1.1 from training. Duration: 0.82725 2023-10-04 02:15:16,787 WARNING [train.py:1204] (2/4) Exclude cut with ID _41314_210_12651_1_1533459476543_3766460_299-187801-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 02:15:16,827 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_34886_210_10633_1_1533203882_1755661_92-345288-0_sp1.1 from training. Duration: 0.97275 2023-10-04 02:15:18,859 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.feed_forward1.hidden_balancer.prob, batch_count=1489533.3333333333, ans=0.125 2023-10-04 02:15:21,449 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_5438_210_1794_1_1527336687_2600009_104-288274-0_sp1.1 from training. Duration: 0.52725 2023-10-04 02:15:21,454 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_154-316450-0 from training. Duration: 0.76 2023-10-04 02:15:22,764 WARNING [train.py:1204] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_23-189234-0_sp0.9 from training. Duration: 0.92225 2023-10-04 02:15:24,405 WARNING [train.py:1204] (2/4) Exclude cut with ID _4779_210_2084_1_1527318022944_3914893_346-168820-0_sp1.1 from training. Duration: 0.963625 2023-10-04 02:15:27,054 WARNING [train.py:1204] (2/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_5-55893-0 from training. Duration: 0.55 2023-10-04 02:15:28,498 WARNING [train.py:1204] (2/4) Exclude cut with ID _20455_210_5219_1_1531565996775_8098100_180-24517-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 02:15:31,719 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_3133_210_2087_1_1526106446_2324690_243-143119-0_sp0.9 from training. Duration: 0.8555625 2023-10-04 02:15:34,447 WARNING [train.py:1204] (2/4) Exclude cut with ID _65585_210_12210_1_1535450451584_3764098_293-212045-0_sp1.1 from training. Duration: 0.92725 2023-10-04 02:15:34,451 WARNING [train.py:1204] (2/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_577-332600-0 from training. Duration: 0.57 2023-10-04 02:15:37,290 WARNING [train.py:1204] (2/4) Exclude cut with ID _30039_210_13270_1_1532484612900_2725150_104-289874-0_sp1.1 from training. Duration: 0.963625 2023-10-04 02:15:37,297 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_3172_210_2087_1_1526292929_3987865_197-97762-0_sp1.1 from training. Duration: 0.6818125 2023-10-04 02:15:40,715 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_256-150908-0_sp1.1 from training. Duration: 0.963625 2023-10-04 02:15:42,059 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_296-287678-0_sp1.1 from training. Duration: 0.863625 2023-10-04 02:15:43,356 WARNING [train.py:1204] (2/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_443-30034-0 from training. Duration: 0.64 2023-10-04 02:15:43,394 WARNING [train.py:1204] (2/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_494-199538-0_sp0.9 from training. Duration: 0.8333125 2023-10-04 02:15:43,446 WARNING [train.py:1204] (2/4) Exclude cut with ID _19708_210_9206_1_1532136650733_4238364_61-162325-0_sp1.1 from training. Duration: 0.97275 2023-10-04 02:15:44,781 WARNING [train.py:1204] (2/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_475-127455-0 from training. Duration: 0.7 2023-10-04 02:15:46,179 WARNING [train.py:1204] (2/4) Exclude cut with ID _48677_210_5663_1_1533902405919_3892989_188-11581-0_sp1.1 from training. Duration: 0.963625 2023-10-04 02:15:46,202 WARNING [train.py:1204] (2/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_136-8381-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 02:15:47,277 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.5.encoder.layers.0.feed_forward2.out_whiten, num_groups=1, num_channels=256, metric=6.97 vs. limit=15.0 2023-10-04 02:15:48,239 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=1489666.6666666667, ans=0.1 2023-10-04 02:15:49,363 WARNING [train.py:1204] (2/4) Exclude cut with ID _55328_210_27767_1_1534580973260_4088170_270-229555-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 02:15:49,393 WARNING [train.py:1204] (2/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_372-266951-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 02:15:50,729 WARNING [train.py:1204] (2/4) Exclude cut with ID _47790_210_16187_1_1533862814066_4207519_14-242149-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 02:15:52,147 INFO [train.py:1046] (2/4) Epoch 43, batch 350, loss[loss=0.1708, simple_loss=0.2587, pruned_loss=0.04145, over 24309.00 frames. ], tot_loss[loss=0.1544, simple_loss=0.2354, pruned_loss=0.03672, over 3930761.49 frames. ], batch size: 77, lr: 2.38e-03, grad_scale: 16.0 2023-10-04 02:15:53,614 WARNING [train.py:1204] (2/4) Exclude cut with ID _7877_210_6014_1_1528286358099_3750136_159-76512-0_sp1.1 from training. Duration: 0.936375 2023-10-04 02:15:53,616 WARNING [train.py:1204] (2/4) Exclude cut with ID _8263_210_1664_1_1528438953494_294100_40-204731-0_sp1.1 from training. Duration: 0.4545625 2023-10-04 02:15:56,414 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8278_210_2550_1_1528637099_4220413_694-238413-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 02:16:02,362 WARNING [train.py:1204] (2/4) Exclude cut with ID _47278_210_12041_1_1533902418349_3576110_421-172710-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 02:16:02,556 WARNING [train.py:1204] (2/4) Exclude cut with ID _14970_210_15709_1_1530773543493_1715730_215-282680-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 02:16:03,998 WARNING [train.py:1204] (2/4) Exclude cut with ID _64639_210_5816_1_1535367619437_3528000_306-149449-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 02:16:05,483 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_28-291982-0 from training. Duration: 0.97 2023-10-04 02:16:06,851 WARNING [train.py:1204] (2/4) Exclude cut with ID _55134_210_21982_1_1534590028377_3882170_412-15870-0_sp1.1 from training. Duration: 0.936375 2023-10-04 02:16:06,886 WARNING [train.py:1204] (2/4) Exclude cut with ID _13684_210_7497_1_1530619354863_3598359_312-235606-0 from training. Duration: 0.6 2023-10-04 02:16:10,167 WARNING [train.py:1204] (2/4) Exclude cut with ID _37112_210_2575_1_1532997007673_7268610_347-238993-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 02:16:11,509 WARNING [train.py:1204] (2/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_121-284152-0 from training. Duration: 0.59 2023-10-04 02:16:11,567 WARNING [train.py:1204] (2/4) Exclude cut with ID _2941_210_2577_1_1526199673150_2489759_228-2961-0_sp1.1 from training. Duration: 0.97275 2023-10-04 02:16:14,657 WARNING [train.py:1204] (2/4) Exclude cut with ID _28323_210_16187_1_1532480395667_4100300_531-24157-0 from training. Duration: 0.98 2023-10-04 02:16:16,119 WARNING [train.py:1204] (2/4) Exclude cut with ID _32704_210_8954_1_1533017488840_6747370_266-329755-0_sp0.9 from training. Duration: 0.988875 2023-10-04 02:16:17,567 WARNING [train.py:1204] (2/4) Exclude cut with ID _6653_210_2629_1_1527919172441_6732679_1038-146421-0_sp1.1 from training. Duration: 0.97275 2023-10-04 02:16:18,907 WARNING [train.py:1204] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_391-22148-0_sp0.9 from training. Duration: 0.8555625 2023-10-04 02:16:21,525 WARNING [train.py:1204] (2/4) Exclude cut with ID _64521_210_14699_1_1535243427259_3815328_543-341117-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 02:16:21,543 WARNING [train.py:1204] (2/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_52-168712-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 02:16:21,581 WARNING [train.py:1204] (2/4) Exclude cut with ID _33920_210_4220_1_1533257851500_3084399_198-334063-0_sp1.1 from training. Duration: 0.8545625 2023-10-04 02:16:21,593 WARNING [train.py:1204] (2/4) Exclude cut with ID _50093_210_4220_1_1534417141207_3432299_69-111987-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 02:16:22,898 WARNING [train.py:1204] (2/4) Exclude cut with ID _14821_210_8750_1_1530680503991_6829359_189-297451-0_sp0.9 from training. Duration: 0.611125 2023-10-04 02:16:25,557 WARNING [train.py:1204] (2/4) Exclude cut with ID _8030_210_7497_1_1528444886781_3531089_262-86750-0_sp0.9 from training. Duration: 0.9 2023-10-04 02:16:25,563 WARNING [train.py:1204] (2/4) Exclude cut with ID _24612_210_8913_1_1531998054193_3806359_294-191196-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 02:16:31,923 WARNING [train.py:1204] (2/4) Exclude cut with ID _16909_210_5724_1_1531292079912_4118919_707-342896-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 02:16:31,938 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_9460_210_4211_1_1528954587_10049667_572-37035-0_sp0.9 from training. Duration: 0.888875 2023-10-04 02:16:33,289 WARNING [train.py:1204] (2/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_762-109886-0_sp1.1 from training. Duration: 0.82725 2023-10-04 02:16:33,323 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_14953_210_2151_1_1530701710_5616165_519-326393-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 02:16:34,928 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.conv_module1.balancer2.prob, batch_count=1489933.3333333333, ans=0.125 2023-10-04 02:16:37,240 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.619e+02 1.923e+02 2.044e+02 2.262e+02 2.758e+02, threshold=4.089e+02, percent-clipped=0.0 2023-10-04 02:16:39,280 WARNING [train.py:1204] (2/4) Exclude cut with ID _25914_210_6400_1_1532141317058_644740_52-96808-0 from training. Duration: 0.91 2023-10-04 02:16:39,284 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_68095_210_20185_1_1535718226_8888855_1079-94525-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 02:16:39,442 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.feed_forward3.hidden_balancer.prob, batch_count=1489933.3333333333, ans=0.125 2023-10-04 02:16:43,996 WARNING [train.py:1204] (2/4) Exclude cut with ID _14860_210_4915_1_1530702020340_7343240_746-3647-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 02:16:43,997 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_70379_210_7925_1_1535941455_557455_49-101720-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 02:16:45,305 WARNING [train.py:1204] (2/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_8-307291-0_sp1.1 from training. Duration: 0.936375 2023-10-04 02:16:47,074 WARNING [train.py:1204] (2/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_117-290916-0 from training. Duration: 0.51 2023-10-04 02:16:49,821 WARNING [train.py:1204] (2/4) Exclude cut with ID _22020_210_2461_1_1531724164513_6326900_944-339840-0_sp1.1 from training. Duration: 0.963625 2023-10-04 02:16:51,144 WARNING [train.py:1204] (2/4) Exclude cut with ID IC0515W0043-254911 from training. Duration: 0.976 2023-10-04 02:16:51,234 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_3910_210_1794_1_1526742306_334530_28-238909-0 from training. Duration: 0.81 2023-10-04 02:16:52,476 WARNING [train.py:1204] (2/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_269-267275-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 02:16:53,967 WARNING [train.py:1204] (2/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_72-251255-0_sp1.1 from training. Duration: 0.8545625 2023-10-04 02:16:53,978 WARNING [train.py:1204] (2/4) Exclude cut with ID _14886_210_4220_1_1530702020355_3516190_175-229073-0 from training. Duration: 0.45 2023-10-04 02:16:55,393 WARNING [train.py:1204] (2/4) Exclude cut with ID _40519_210_13794_1_1533715228655_7337390_921-209452-0_sp1.1 from training. Duration: 0.963625 2023-10-04 02:16:56,833 WARNING [train.py:1204] (2/4) Exclude cut with ID _29857_210_20034_1_1532395886753_3798160_335-163562-0_sp0.9 from training. Duration: 0.9444375 2023-10-04 02:16:58,309 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.conv_skip_rate, batch_count=1490000.0, ans=0.0 2023-10-04 02:16:59,985 WARNING [train.py:1204] (2/4) Exclude cut with ID _58583_210_5236_1_1534726835266_3574249_384-58445-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 02:17:01,233 WARNING [train.py:1204] (2/4) Exclude cut with ID _44011_210_2046_1_1533722401691_3631310_96-315656-0_sp1.1 from training. Duration: 0.963625 2023-10-04 02:17:01,237 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_68484_210_1794_1_1535849804_7678097_549-170613-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 02:17:02,637 WARNING [train.py:1204] (2/4) Exclude cut with ID _37892_210_19596_1_1533198677897_3964058_214-148058-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 02:17:05,350 INFO [train.py:1046] (2/4) Epoch 43, batch 400, loss[loss=0.154, simple_loss=0.2395, pruned_loss=0.03427, over 24633.00 frames. ], tot_loss[loss=0.1542, simple_loss=0.2346, pruned_loss=0.03689, over 4096688.48 frames. ], batch size: 65, lr: 2.37e-03, grad_scale: 32.0 2023-10-04 02:17:07,164 WARNING [train.py:1204] (2/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_415-78792-0_sp0.9 from training. Duration: 0.9 2023-10-04 02:17:08,570 WARNING [train.py:1204] (2/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_383-205833-0_sp1.1 from training. Duration: 0.863625 2023-10-04 02:17:09,898 WARNING [train.py:1204] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_167-137255-0 from training. Duration: 0.64 2023-10-04 02:17:09,913 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8275_210_4198_1_1528455320_3892571_484-154786-0_sp1.1 from training. Duration: 0.963625 2023-10-04 02:17:11,294 WARNING [train.py:1204] (2/4) Exclude cut with ID _40284_210_2084_1_1533513679814_3704279_396-319412-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 02:17:12,579 WARNING [train.py:1204] (2/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_280-120942-0_sp0.9 from training. Duration: 0.8555625 2023-10-04 02:17:12,623 WARNING [train.py:1204] (2/4) Exclude cut with ID _45630_210_10556_1_1533949174498_3902049_558-240889-0_sp1.1 from training. Duration: 0.92725 2023-10-04 02:17:16,043 WARNING [train.py:1204] (2/4) Exclude cut with ID _56235_210_10640_1_1534557335235_4161099_197-307458-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 02:17:17,928 WARNING [train.py:1204] (2/4) Exclude cut with ID _16909_210_5724_1_1531292079912_4118919_163-254843-0_sp1.1 from training. Duration: 0.92725 2023-10-04 02:17:19,343 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_103-84010-0 from training. Duration: 0.69 2023-10-04 02:17:22,003 WARNING [train.py:1204] (2/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_401-158239-0 from training. Duration: 0.71 2023-10-04 02:17:22,004 WARNING [train.py:1204] (2/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_899-233658-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 02:17:23,377 WARNING [train.py:1204] (2/4) Exclude cut with ID _23825_210_13449_1_1531915313526_3497150_240-271367-0 from training. Duration: 0.97 2023-10-04 02:17:23,418 WARNING [train.py:1204] (2/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_424-134619-0_sp1.1 from training. Duration: 0.92725 2023-10-04 02:17:23,667 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.feed_forward1.out_proj.dropout_p, batch_count=1490133.3333333333, ans=0.1 2023-10-04 02:17:26,191 WARNING [train.py:1204] (2/4) Exclude cut with ID _44831_210_13427_1_1533816011549_3689035_32-188147-0_sp1.1 from training. Duration: 0.87275 2023-10-04 02:17:26,194 WARNING [train.py:1204] (2/4) Exclude cut with ID _59170_210_6721_1_1534983633960_4203335_314-63879-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 02:17:26,208 WARNING [train.py:1204] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_602-275260-0 from training. Duration: 0.82 2023-10-04 02:17:26,235 WARNING [train.py:1204] (2/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_511-306767-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 02:17:27,544 WARNING [train.py:1204] (2/4) Exclude cut with ID _41817_210_18835_1_1533434477160_7423049_565-201775-0_sp1.1 from training. Duration: 0.92725 2023-10-04 02:17:27,553 WARNING [train.py:1204] (2/4) Exclude cut with ID _28317_210_12742_1_1532480302381_8510325_778-173151-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 02:17:27,596 WARNING [train.py:1204] (2/4) Exclude cut with ID _14998_210_5219_1_1530705582080_7474970_680-51001-0_sp1.1 from training. Duration: 0.963625 2023-10-04 02:17:31,006 WARNING [train.py:1204] (2/4) Exclude cut with ID IC0677W0059-334891 from training. Duration: 0.896 2023-10-04 02:17:31,052 WARNING [train.py:1204] (2/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_370-331600-0 from training. Duration: 0.94 2023-10-04 02:17:36,504 WARNING [train.py:1204] (2/4) Exclude cut with ID _66840_210_5219_1_1535452168426_4196349_446-240192-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 02:17:37,882 WARNING [train.py:1204] (2/4) Exclude cut with ID _65242_210_18628_1_1535369393717_3823149_38-6732-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 02:17:37,952 WARNING [train.py:1204] (2/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_302-2550-0 from training. Duration: 0.81 2023-10-04 02:17:39,825 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_43802_210_2290_1_1533516203_8829504_100-348441-0 from training. Duration: 0.91 2023-10-04 02:17:43,841 WARNING [train.py:1204] (2/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_329-285801-0_sp1.1 from training. Duration: 0.82725 2023-10-04 02:17:47,073 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_30167_210_3528_1_1532428012_4772375_343-334712-0_sp1.1 from training. Duration: 0.936375 2023-10-04 02:17:53,319 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7543_210_6753_1_1528543318_4552_1-85072-0 from training. Duration: 0.83 2023-10-04 02:17:56,031 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7002_210_1794_1_1527935634_4034444_487-236501-0_sp1.1 from training. Duration: 0.6 2023-10-04 02:17:57,442 WARNING [train.py:1204] (2/4) Exclude cut with ID _7160_210_1876_1_1527940923231_3703799_282-322611-0 from training. Duration: 0.87 2023-10-04 02:17:58,949 WARNING [train.py:1204] (2/4) Exclude cut with ID _67335_210_5219_1_1535525975652_4275229_570-262983-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 02:18:00,919 WARNING [train.py:1204] (2/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_625-338546-0_sp1.1 from training. Duration: 0.8 2023-10-04 02:18:02,194 WARNING [train.py:1204] (2/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_176-56120-0 from training. Duration: 0.83 2023-10-04 02:18:02,474 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.bypass.scale_min, batch_count=1490266.6666666667, ans=0.2 2023-10-04 02:18:05,069 WARNING [train.py:1204] (2/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_963-187109-0_sp1.1 from training. Duration: 0.9 2023-10-04 02:18:07,805 WARNING [train.py:1204] (2/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_121-284152-0_sp0.9 from training. Duration: 0.6555625 2023-10-04 02:18:09,216 WARNING [train.py:1204] (2/4) Exclude cut with ID _45021_210_9994_1_1533619751094_3330288_104-81711-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 02:18:11,977 WARNING [train.py:1204] (2/4) Exclude cut with ID _51382_210_23862_1_1534492801838_3650104_129-343914-0_sp1.1 from training. Duration: 0.936375 2023-10-04 02:18:12,001 WARNING [train.py:1204] (2/4) Exclude cut with ID _14970_210_15709_1_1530775547361_1966965_8-169924-0 from training. Duration: 0.92 2023-10-04 02:18:16,369 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_2295_210_2087_1_1525500993_2407863_234-83104-0_sp1.1 from training. Duration: 0.563625 2023-10-04 02:18:17,843 WARNING [train.py:1204] (2/4) Exclude cut with ID _14687_210_5854_1_1530703838259_3664889_115-239046-0 from training. Duration: 0.67 2023-10-04 02:18:19,704 INFO [train.py:1046] (2/4) Epoch 43, batch 450, loss[loss=0.1536, simple_loss=0.2307, pruned_loss=0.03827, over 23455.00 frames. ], tot_loss[loss=0.1543, simple_loss=0.2348, pruned_loss=0.03695, over 4241046.63 frames. ], batch size: 285, lr: 2.37e-03, grad_scale: 32.0 2023-10-04 02:18:19,754 WARNING [train.py:1204] (2/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_690-126306-0_sp1.1 from training. Duration: 0.7090625 2023-10-04 02:18:19,762 WARNING [train.py:1204] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_625-102955-0_sp0.9 from training. Duration: 0.8555625 2023-10-04 02:18:21,323 WARNING [train.py:1204] (2/4) Exclude cut with ID _9389_210_1664_1_1529217337301_5289269_289-213417-0 from training. Duration: 0.89 2023-10-04 02:18:22,810 WARNING [train.py:1204] (2/4) Exclude cut with ID _13253_210_1925_1_1530757775819_3577873_191-144977-0_sp0.9 from training. Duration: 0.8666875 2023-10-04 02:18:22,859 WARNING [train.py:1204] (2/4) Exclude cut with ID _21783_210_11357_1_1531742458730_3762679_325-155703-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 02:18:24,813 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_2295_210_2087_1_1525500993_2407863_116-238894-0_sp0.9 from training. Duration: 0.77775 2023-10-04 02:18:25,067 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.balancer1.prob, batch_count=1490400.0, ans=0.125 2023-10-04 02:18:26,194 WARNING [train.py:1204] (2/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_94-126128-0 from training. Duration: 0.86 2023-10-04 02:18:27,432 WARNING [train.py:1204] (2/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_395-310220-0_sp0.9 from training. Duration: 0.988875 2023-10-04 02:18:27,512 WARNING [train.py:1204] (2/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_821-110852-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 02:18:28,833 WARNING [train.py:1204] (2/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_64-248359-0_sp1.1 from training. Duration: 0.62725 2023-10-04 02:18:28,854 WARNING [train.py:1204] (2/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_138-187031-0 from training. Duration: 0.99 2023-10-04 02:18:28,892 WARNING [train.py:1204] (2/4) Exclude cut with ID _4122_210_2084_1_1526713164417_4353280_528-206771-0_sp0.9 from training. Duration: 0.97775 2023-10-04 02:18:30,232 WARNING [train.py:1204] (2/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_844-96989-0_sp1.1 from training. Duration: 0.6454375 2023-10-04 02:18:30,813 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.3.encoder.layers.3.self_attn_weights.whiten_keys, num_groups=8, num_channels=256, metric=2.60 vs. limit=6.0 2023-10-04 02:18:33,078 WARNING [train.py:1204] (2/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_106-30782-0_sp1.1 from training. Duration: 0.72725 2023-10-04 02:18:43,152 WARNING [train.py:1204] (2/4) Exclude cut with ID _14683_210_5219_1_1530698352608_3974859_281-250040-0_sp1.1 from training. Duration: 0.936375 2023-10-04 02:18:43,180 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_46013_210_7234_1_1533776362_3744032_463-52378-0_sp0.9 from training. Duration: 0.9555625 2023-10-04 02:18:44,584 WARNING [train.py:1204] (2/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_299-34413-0 from training. Duration: 0.65 2023-10-04 02:18:46,445 WARNING [train.py:1204] (2/4) Exclude cut with ID _35799_210_5854_1_1533263487887_3529071_490-326847-0 from training. Duration: 0.99 2023-10-04 02:18:49,763 WARNING [train.py:1204] (2/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_589-9070-0_sp0.9 from training. Duration: 0.788875 2023-10-04 02:18:51,073 WARNING [train.py:1204] (2/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_884-29515-0_sp1.1 from training. Duration: 0.936375 2023-10-04 02:18:53,788 WARNING [train.py:1204] (2/4) Exclude cut with ID _15012_210_1968_1_1530856820429_5095350_474-119995-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 02:18:56,605 WARNING [train.py:1204] (2/4) Exclude cut with ID _65585_210_12210_1_1535450451584_3764098_211-97124-0_sp1.1 from training. Duration: 0.963625 2023-10-04 02:18:56,668 WARNING [train.py:1204] (2/4) Exclude cut with ID _33640_210_2655_1_1533117605479_4855469_666-224463-0_sp1.1 from training. Duration: 0.963625 2023-10-04 02:18:58,530 WARNING [train.py:1204] (2/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_35-153886-0 from training. Duration: 0.84 2023-10-04 02:18:59,857 WARNING [train.py:1204] (2/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_210-106047-0 from training. Duration: 0.97 2023-10-04 02:18:59,981 WARNING [train.py:1204] (2/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_479-125611-0 from training. Duration: 0.96 2023-10-04 02:19:01,444 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_9417_210_1794_1_1529235261_4614131_427-153963-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 02:19:01,533 WARNING [train.py:1204] (2/4) Exclude cut with ID _23818_210_6228_1_1531917063884_3676920_133-170571-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 02:19:02,843 WARNING [train.py:1204] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_602-275260-0_sp1.1 from training. Duration: 0.7454375 2023-10-04 02:19:02,977 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9029W0121-509521 from training. Duration: 0.896 2023-10-04 02:19:02,985 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9029W0434-509821 from training. Duration: 0.64 2023-10-04 02:19:04,817 WARNING [train.py:1204] (2/4) Exclude cut with ID _23038_210_9570_1_1532001584158_3652419_289-307034-0_sp1.1 from training. Duration: 0.936375 2023-10-04 02:19:06,076 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.607e+02 1.944e+02 2.181e+02 2.558e+02 3.848e+02, threshold=4.361e+02, percent-clipped=0.0 2023-10-04 02:19:06,188 WARNING [train.py:1204] (2/4) Exclude cut with ID _29345_210_4381_1_1532689168263_3860169_149-257268-0_sp0.9 from training. Duration: 0.9 2023-10-04 02:19:07,500 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_405-339986-0_sp1.1 from training. Duration: 0.463625 2023-10-04 02:19:09,155 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.1.conv_module1.balancer1.prob, batch_count=1490600.0, ans=0.125 2023-10-04 02:19:10,306 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_2655_210_2087_1_1525688959_3897400_437-337690-0_sp1.1 from training. Duration: 0.57275 2023-10-04 02:19:10,333 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7543_210_6753_1_1528541907_1399322_165-323619-0_sp1.1 from training. Duration: 0.62725 2023-10-04 02:19:11,685 WARNING [train.py:1204] (2/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_508-335369-0_sp1.1 from training. Duration: 0.436375 2023-10-04 02:19:11,754 WARNING [train.py:1204] (2/4) Exclude cut with ID _7852_210_5219_1_1528286471316_8139400_358-287492-0 from training. Duration: 0.96 2023-10-04 02:19:14,939 WARNING [train.py:1204] (2/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_322-217220-0_sp0.9 from training. Duration: 0.9555625 2023-10-04 02:19:18,310 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_3171_210_2087_1_1526109084_2330007_66-260583-0_sp0.9 from training. Duration: 0.8 2023-10-04 02:19:18,337 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8558_210_1410_1_1528505225_7613406_131-329524-0_sp1.1 from training. Duration: 0.7181875 2023-10-04 02:19:20,131 WARNING [train.py:1204] (2/4) Exclude cut with ID _9235_210_5219_1_1528891154253_8046120_30-326681-0 from training. Duration: 0.83 2023-10-04 02:19:22,980 WARNING [train.py:1204] (2/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_58-68956-0_sp0.9 from training. Duration: 0.92225 2023-10-04 02:19:23,158 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.ff3_skip_rate, batch_count=1490666.6666666667, ans=0.0 2023-10-04 02:19:24,272 WARNING [train.py:1204] (2/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_255-312654-0 from training. Duration: 0.91 2023-10-04 02:19:25,727 WARNING [train.py:1204] (2/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_99-167392-0 from training. Duration: 0.61 2023-10-04 02:19:25,816 WARNING [train.py:1204] (2/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_94-126128-0_sp0.9 from training. Duration: 0.9555625 2023-10-04 02:19:26,057 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.feed_forward3.hidden_balancer.prob, batch_count=1490666.6666666667, ans=0.125 2023-10-04 02:19:30,360 WARNING [train.py:1204] (2/4) Exclude cut with ID _68635_210_9317_1_1535770813410_4315870_187-192817-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 02:19:31,787 WARNING [train.py:1204] (2/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_277-204970-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 02:19:33,603 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6163_210_6400_1_1527506746_4263068_283-137811-0_sp0.9 from training. Duration: 0.8555625 2023-10-04 02:19:34,856 INFO [train.py:1046] (2/4) Epoch 43, batch 500, loss[loss=0.1577, simple_loss=0.2457, pruned_loss=0.03482, over 23264.00 frames. ], tot_loss[loss=0.1549, simple_loss=0.2353, pruned_loss=0.03729, over 4338377.77 frames. ], batch size: 105, lr: 2.37e-03, grad_scale: 32.0 2023-10-04 02:19:34,903 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9144W0121-566446 from training. Duration: 0.896 2023-10-04 02:19:37,642 WARNING [train.py:1204] (2/4) Exclude cut with ID _49371_210_12071_1_1533952761021_3812190_351-315519-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 02:19:40,180 WARNING [train.py:1204] (2/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_115-326778-0_sp1.1 from training. Duration: 0.8454375 2023-10-04 02:19:40,247 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_17292_210_6753_1_1531125388_2943734_436-198965-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 02:19:41,455 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9165W0117-576751 from training. Duration: 0.896 2023-10-04 02:19:42,979 WARNING [train.py:1204] (2/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_212-149947-0 from training. Duration: 0.73 2023-10-04 02:19:42,988 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_13757_210_3528_1_1530440682_4107239_65-167544-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 02:19:45,726 WARNING [train.py:1204] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_700-183549-0_sp0.9 from training. Duration: 0.7555625 2023-10-04 02:19:52,164 WARNING [train.py:1204] (2/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_216-343007-0_sp0.9 from training. Duration: 0.7333125 2023-10-04 02:19:52,268 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7544_210_6753_1_1528587722_3576686_295-258479-0_sp1.1 from training. Duration: 0.763625 2023-10-04 02:19:54,993 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_365-287106-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 02:19:55,005 WARNING [train.py:1204] (2/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_300-3747-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 02:19:55,065 WARNING [train.py:1204] (2/4) Exclude cut with ID _14069_210_5632_1_1530512692033_4803390_487-303201-0_sp1.1 from training. Duration: 0.92725 2023-10-04 02:20:04,056 WARNING [train.py:1204] (2/4) Exclude cut with ID _14459_210_2228_1_1531443641946_7516700_668-123026-0_sp1.1 from training. Duration: 0.936375 2023-10-04 02:20:04,085 WARNING [train.py:1204] (2/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_108-53929-0_sp1.1 from training. Duration: 0.536375 2023-10-04 02:20:04,117 WARNING [train.py:1204] (2/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_266-137104-0_sp0.9 from training. Duration: 0.72225 2023-10-04 02:20:05,912 WARNING [train.py:1204] (2/4) Exclude cut with ID _12302_210_5219_1_1529924446466_8107280_902-168619-0_sp1.1 from training. Duration: 0.936375 2023-10-04 02:20:05,931 WARNING [train.py:1204] (2/4) Exclude cut with ID _59502_210_12210_1_1535107024698_1835139_146-162495-0 from training. Duration: 0.92 2023-10-04 02:20:05,949 WARNING [train.py:1204] (2/4) Exclude cut with ID _3465_210_3525_1_1526175667060_3574049_412-287526-0_sp1.1 from training. Duration: 0.6545625 2023-10-04 02:20:07,417 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.conv_skip_rate, batch_count=1490866.6666666667, ans=0.0 2023-10-04 02:20:08,223 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.0.layers.0.feed_forward1.out_whiten, num_groups=1, num_channels=192, metric=12.00 vs. limit=15.0 2023-10-04 02:20:08,693 WARNING [train.py:1204] (2/4) Exclude cut with ID _7160_210_1876_1_1527940923231_3703799_120-150943-0_sp0.9 from training. Duration: 0.9 2023-10-04 02:20:10,033 WARNING [train.py:1204] (2/4) Exclude cut with ID _15037_210_2253_1_1530752294104_3267650_296-192597-0_sp1.1 from training. Duration: 0.736375 2023-10-04 02:20:10,211 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.feed_forward3.hidden_balancer.prob, batch_count=1490866.6666666667, ans=0.125 2023-10-04 02:20:11,244 WARNING [train.py:1204] (2/4) Exclude cut with ID _13252_210_1925_1_1530671372047_3336909_397-216497-0_sp1.1 from training. Duration: 0.87275 2023-10-04 02:20:11,266 WARNING [train.py:1204] (2/4) Exclude cut with ID _40094_210_16380_1_1533294005773_3641159_76-269320-0_sp1.1 from training. Duration: 0.936375 2023-10-04 02:20:11,315 WARNING [train.py:1204] (2/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_449-15508-0 from training. Duration: 0.82 2023-10-04 02:20:14,216 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.balancer1.prob, batch_count=1490866.6666666667, ans=0.125 2023-10-04 02:20:15,276 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9309W0194-640321 from training. Duration: 0.64 2023-10-04 02:20:16,892 WARNING [train.py:1204] (2/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_284-312178-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 02:20:20,030 WARNING [train.py:1204] (2/4) Exclude cut with ID _29853_210_7497_1_1532505600851_6642110_401-55937-0_sp1.1 from training. Duration: 0.92725 2023-10-04 02:20:20,113 WARNING [train.py:1204] (2/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_716-24918-0_sp1.1 from training. Duration: 0.92725 2023-10-04 02:20:20,932 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.balancer2.prob, batch_count=1490933.3333333333, ans=0.125 2023-10-04 02:20:21,947 WARNING [train.py:1204] (2/4) Exclude cut with ID _19637_210_4220_1_1531537167676_3205060_314-305638-0_sp1.1 from training. Duration: 0.92725 2023-10-04 02:20:21,980 WARNING [train.py:1204] (2/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_475-127455-0_sp1.1 from training. Duration: 0.636375 2023-10-04 02:20:24,806 WARNING [train.py:1204] (2/4) Exclude cut with ID _5444_210_2151_1_1527418808464_5312210_581-315411-0 from training. Duration: 0.62 2023-10-04 02:20:27,558 WARNING [train.py:1204] (2/4) Exclude cut with ID _44902_210_16187_1_1533888006062_3792078_149-67212-0_sp0.9 from training. Duration: 0.9666875 2023-10-04 02:20:28,910 WARNING [train.py:1204] (2/4) Exclude cut with ID _31602_210_5854_1_1532677021456_4458250_63-63253-0_sp1.1 from training. Duration: 0.963625 2023-10-04 02:20:33,075 WARNING [train.py:1204] (2/4) Exclude cut with ID _55209_210_1531_1_1534675828996_4597160_660-203992-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 02:20:36,187 WARNING [train.py:1204] (2/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_83-190624-0_sp1.1 from training. Duration: 0.92725 2023-10-04 02:20:36,491 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.balancer2.prob, batch_count=1491000.0, ans=0.125 2023-10-04 02:20:41,026 WARNING [train.py:1204] (2/4) Exclude cut with ID _58605_210_12210_1_1534723195939_3904259_154-139635-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 02:20:44,979 WARNING [train.py:1204] (2/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_164-224052-0 from training. Duration: 0.7 2023-10-04 02:20:44,988 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_1126-189071-0_sp1.1 from training. Duration: 0.963625 2023-10-04 02:20:44,999 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_15396_210_6890_1_1531308473_1938936_310-214789-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 02:20:46,654 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.nonlin_attention.balancer.prob, batch_count=1491066.6666666667, ans=0.125 2023-10-04 02:20:47,641 INFO [train.py:1046] (2/4) Epoch 43, batch 550, loss[loss=0.1584, simple_loss=0.2389, pruned_loss=0.03898, over 23450.00 frames. ], tot_loss[loss=0.1557, simple_loss=0.236, pruned_loss=0.03768, over 4425266.06 frames. ], batch size: 119, lr: 2.37e-03, grad_scale: 16.0 2023-10-04 02:20:47,715 WARNING [train.py:1204] (2/4) Exclude cut with ID _8395_210_1531_1_1528597871523_4411579_431-132470-0 from training. Duration: 0.74 2023-10-04 02:20:47,770 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_3780_210_2087_1_1526467760_2130483_170-259044-0_sp0.9 from training. Duration: 0.77775 2023-10-04 02:20:47,929 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.1.conv_module1.balancer1.prob, batch_count=1491066.6666666667, ans=0.125 2023-10-04 02:20:49,173 WARNING [train.py:1204] (2/4) Exclude cut with ID _12733_210_8341_1_1530167424556_3646380_537-230640-0_sp1.1 from training. Duration: 0.963625 2023-10-04 02:20:50,729 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.ff3_skip_rate, batch_count=1491066.6666666667, ans=0.0 2023-10-04 02:20:53,274 WARNING [train.py:1204] (2/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_159-281310-0 from training. Duration: 0.95 2023-10-04 02:20:53,523 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=1491066.6666666667, ans=0.1 2023-10-04 02:20:55,937 WARNING [train.py:1204] (2/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_434-83000-0 from training. Duration: 0.98 2023-10-04 02:20:55,967 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_4405_210_1849_1_1526732205_3593939_493-32464-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 02:20:57,207 WARNING [train.py:1204] (2/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_342-100157-0 from training. Duration: 0.54 2023-10-04 02:20:57,257 WARNING [train.py:1204] (2/4) Exclude cut with ID _44778_210_2054_1_1533798127203_7443090_765-250133-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 02:20:57,262 WARNING [train.py:1204] (2/4) Exclude cut with ID _38670_210_15379_1_1533448810659_4391059_81-312245-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 02:20:58,603 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_50050_210_16223_1_1535435796_7290138_1273-247611-0_sp1.1 from training. Duration: 0.936375 2023-10-04 02:20:58,663 WARNING [train.py:1204] (2/4) Exclude cut with ID _44831_210_13427_1_1533816011549_3689035_399-18591-0_sp1.1 from training. Duration: 0.936375 2023-10-04 02:20:58,683 WARNING [train.py:1204] (2/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_557-324258-0_sp0.9 from training. Duration: 0.9 2023-10-04 02:21:00,045 WARNING [train.py:1204] (2/4) Exclude cut with ID _3465_210_3525_1_1526175667060_3574049_62-335552-0_sp1.1 from training. Duration: 0.7909375 2023-10-04 02:21:01,529 WARNING [train.py:1204] (2/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_673-264125-0_sp1.1 from training. Duration: 0.963625 2023-10-04 02:21:02,931 WARNING [train.py:1204] (2/4) Exclude cut with ID _8030_210_7497_1_1528444886781_3531089_262-86750-0 from training. Duration: 0.81 2023-10-04 02:21:02,954 WARNING [train.py:1204] (2/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_444-28041-0_sp1.1 from training. Duration: 0.77275 2023-10-04 02:21:07,830 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.feed_forward1.hidden_balancer.prob, batch_count=1491133.3333333333, ans=0.125 2023-10-04 02:21:08,897 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_34008_210_10431_1_1533345424_8183247_516-14858-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 02:21:08,910 WARNING [train.py:1204] (2/4) Exclude cut with ID _32704_210_8954_1_1533017488840_6747370_54-248218-0_sp1.1 from training. Duration: 0.936375 2023-10-04 02:21:10,364 WARNING [train.py:1204] (2/4) Exclude cut with ID _13253_210_1925_1_1530757775819_3577873_300-119782-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 02:21:11,749 WARNING [train.py:1204] (2/4) Exclude cut with ID _39290_210_4220_1_1533862778760_3075118_21-240703-0_sp1.1 from training. Duration: 0.936375 2023-10-04 02:21:13,776 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.4.encoder.layers.1.feed_forward3.out_whiten, num_groups=1, num_channels=384, metric=12.59 vs. limit=15.0 2023-10-04 02:21:15,981 WARNING [train.py:1204] (2/4) Exclude cut with ID 774-127930-0014-10412-0_sp1.1 from training. Duration: 0.95 2023-10-04 02:21:17,299 WARNING [train.py:1204] (2/4) Exclude cut with ID _67499_210_17282_1_1535509805220_3692430_20-179346-0 from training. Duration: 0.97 2023-10-04 02:21:18,648 WARNING [train.py:1204] (2/4) Exclude cut with ID _8365_210_2064_1_1528592311070_4677690_431-294102-0_sp0.9 from training. Duration: 0.988875 2023-10-04 02:21:23,397 WARNING [train.py:1204] (2/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_365-1492-0_sp1.1 from training. Duration: 0.82725 2023-10-04 02:21:23,433 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_105-114797-0_sp0.9 from training. Duration: 0.9333125 2023-10-04 02:21:25,314 WARNING [train.py:1204] (2/4) Exclude cut with ID _6274_210_2084_1_1527937583837_7494020_769-168302-0_sp0.9 from training. Duration: 0.788875 2023-10-04 02:21:27,983 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_40340_210_9024_1_1533433916_4132944_320-270277-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 02:21:27,989 WARNING [train.py:1204] (2/4) Exclude cut with ID ID0203W0003-781711 from training. Duration: 0.958 2023-10-04 02:21:28,074 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_30434_210_13049_1_1532519384_981746_52-12932-0_sp1.1 from training. Duration: 0.936375 2023-10-04 02:21:30,658 WARNING [train.py:1204] (2/4) Exclude cut with ID _8263_210_1664_1_1528438953494_294100_40-204731-0_sp0.9 from training. Duration: 0.5555625 2023-10-04 02:21:32,177 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1032-236863-0_sp0.9 from training. Duration: 0.9333125 2023-10-04 02:21:32,251 WARNING [train.py:1204] (2/4) Exclude cut with ID _8394_210_6702_1_1528590504316_3540189_335-274780-0_sp0.9 from training. Duration: 0.8666875 2023-10-04 02:21:32,259 WARNING [train.py:1204] (2/4) Exclude cut with ID _66873_210_5517_1_1535454136506_4293370_654-52713-0_sp1.1 from training. Duration: 0.7 2023-10-04 02:21:33,580 WARNING [train.py:1204] (2/4) Exclude cut with ID _5250_210_5219_1_1527249561806_8692110_742-267856-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 02:21:35,580 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.568e+02 1.973e+02 2.172e+02 2.445e+02 3.955e+02, threshold=4.345e+02, percent-clipped=0.0 2023-10-04 02:21:35,695 WARNING [train.py:1204] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_74-48717-0 from training. Duration: 0.58 2023-10-04 02:21:38,272 WARNING [train.py:1204] (2/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_18-46756-0 from training. Duration: 0.79 2023-10-04 02:21:38,333 WARNING [train.py:1204] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_612-56061-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 02:21:38,348 WARNING [train.py:1204] (2/4) Exclude cut with ID _56423_210_5854_1_1534896061049_3165969_412-2403-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 02:21:38,402 WARNING [train.py:1204] (2/4) Exclude cut with ID _62574_210_2655_1_1535180407058_3933249_120-196848-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 02:21:38,402 WARNING [train.py:1204] (2/4) Exclude cut with ID _40519_210_13794_1_1533715228655_7337390_916-51000-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 02:21:42,349 WARNING [train.py:1204] (2/4) Exclude cut with ID _65263_210_22356_1_1535763598691_3921037_108-21902-0_sp1.1 from training. Duration: 0.9 2023-10-04 02:21:42,428 WARNING [train.py:1204] (2/4) Exclude cut with ID _4122_210_2084_1_1526713164417_4353280_528-206771-0_sp1.1 from training. Duration: 0.8 2023-10-04 02:21:45,247 WARNING [train.py:1204] (2/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_43-325657-0_sp1.1 from training. Duration: 0.92725 2023-10-04 02:21:45,295 WARNING [train.py:1204] (2/4) Exclude cut with ID _44659_210_2721_1_1534036120061_6614860_314-82041-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 02:21:45,567 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.conv_skip_rate, batch_count=1491333.3333333333, ans=0.0 2023-10-04 02:21:46,617 WARNING [train.py:1204] (2/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_81-300212-0_sp0.9 from training. Duration: 0.6666875 2023-10-04 02:21:48,066 WARNING [train.py:1204] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_665-196155-0_sp0.9 from training. Duration: 0.7444375 2023-10-04 02:21:48,369 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.conv_skip_rate, batch_count=1491333.3333333333, ans=0.0 2023-10-04 02:21:49,403 WARNING [train.py:1204] (2/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_140-336881-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 02:21:51,381 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6290_210_3273_1_1527595224_1282787_115-87095-0_sp0.9 from training. Duration: 0.611125 2023-10-04 02:21:52,745 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_53373_210_5399_1_1534229471_4715695_607-259685-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 02:21:53,685 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.ff2_skip_rate, batch_count=1491333.3333333333, ans=0.0 2023-10-04 02:21:54,708 WARNING [train.py:1204] (2/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_175-163894-0_sp1.1 from training. Duration: 0.67275 2023-10-04 02:21:54,726 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7002_210_1794_1_1527939683_5157286_1-281264-0_sp1.1 from training. Duration: 0.4 2023-10-04 02:22:00,270 WARNING [train.py:1204] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_605-257205-0 from training. Duration: 0.59 2023-10-04 02:22:01,637 INFO [train.py:1046] (2/4) Epoch 43, batch 600, loss[loss=0.1617, simple_loss=0.2468, pruned_loss=0.03833, over 23294.00 frames. ], tot_loss[loss=0.1564, simple_loss=0.2367, pruned_loss=0.03798, over 4479003.82 frames. ], batch size: 105, lr: 2.37e-03, grad_scale: 16.0 2023-10-04 02:22:03,110 WARNING [train.py:1204] (2/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_454-314901-0 from training. Duration: 0.46 2023-10-04 02:22:03,202 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_46032_210_14656_1_1533781412_4507969_287-130422-0_sp1.1 from training. Duration: 0.97275 2023-10-04 02:22:03,223 WARNING [train.py:1204] (2/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_186-309161-0_sp1.1 from training. Duration: 0.4909375 2023-10-04 02:22:05,199 WARNING [train.py:1204] (2/4) Exclude cut with ID _42659_210_2070_1_1533607442765_3120380_45-342307-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 02:22:10,449 WARNING [train.py:1204] (2/4) Exclude cut with ID _12555_210_8750_1_1530075594402_3780239_478-326818-0_sp1.1 from training. Duration: 0.963625 2023-10-04 02:22:13,145 WARNING [train.py:1204] (2/4) Exclude cut with ID _7606_210_4915_1_1528894253277_5080690_127-177761-0_sp1.1 from training. Duration: 0.5818125 2023-10-04 02:22:15,838 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6788_210_1388_1_1527898698_4562021_91-197981-0 from training. Duration: 0.83 2023-10-04 02:22:16,085 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.conv_module2.balancer1.prob, batch_count=1491466.6666666667, ans=0.125 2023-10-04 02:22:16,624 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.2.encoder.layers.2.self_attn1.whiten, num_groups=1, num_channels=384, metric=18.23 vs. limit=22.5 2023-10-04 02:22:17,245 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8558_210_1410_1_1528505225_7613406_131-329524-0_sp0.9 from training. Duration: 0.87775 2023-10-04 02:22:19,209 WARNING [train.py:1204] (2/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_153-153537-0_sp1.1 from training. Duration: 0.82725 2023-10-04 02:22:19,457 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.conv_module2.balancer2.prob, batch_count=1491466.6666666667, ans=0.125 2023-10-04 02:22:21,945 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_17292_210_6753_1_1531125388_2943734_285-349105-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 02:22:23,333 WARNING [train.py:1204] (2/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_37-41919-0 from training. Duration: 0.87 2023-10-04 02:22:23,359 WARNING [train.py:1204] (2/4) Exclude cut with ID _60860_210_8254_1_1534896015875_3557169_172-294852-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 02:22:29,731 WARNING [train.py:1204] (2/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_146-276944-0 from training. Duration: 0.83 2023-10-04 02:22:32,546 WARNING [train.py:1204] (2/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_1254-44082-0_sp1.1 from training. Duration: 0.936375 2023-10-04 02:22:32,565 WARNING [train.py:1204] (2/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_1115-180350-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 02:22:32,596 WARNING [train.py:1204] (2/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_60-146189-0_sp1.1 from training. Duration: 0.8909375 2023-10-04 02:22:34,151 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.feed_forward1.hidden_balancer.prob, batch_count=1491533.3333333333, ans=0.125 2023-10-04 02:22:37,429 WARNING [train.py:1204] (2/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_517-153649-0_sp1.1 from training. Duration: 0.92725 2023-10-04 02:22:38,750 WARNING [train.py:1204] (2/4) Exclude cut with ID _39571_210_13449_1_1533801584143_3651077_37-266257-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 02:22:38,790 WARNING [train.py:1204] (2/4) Exclude cut with ID _6653_210_2629_1_1527919172441_6732679_527-276118-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 02:22:41,618 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.conv_module1.balancer1.prob, batch_count=1491533.3333333333, ans=0.125 2023-10-04 02:22:44,171 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7696_210_2040_1_1528207167_3716099_117-125434-0_sp1.1 from training. Duration: 0.6909375 2023-10-04 02:22:46,491 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.1.encoder.layers.0.conv_module1.whiten, num_groups=1, num_channels=256, metric=10.71 vs. limit=15.0 2023-10-04 02:22:48,936 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_25767_210_2161_1_1532307483_3867748_478-156360-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 02:22:48,940 WARNING [train.py:1204] (2/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_177-49092-0_sp1.1 from training. Duration: 0.82725 2023-10-04 02:22:48,947 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_11305_210_1794_1_1529750428_7178148_680-317073-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 02:22:52,255 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.bypass_mid.scale_min, batch_count=1491600.0, ans=0.2 2023-10-04 02:22:56,535 WARNING [train.py:1204] (2/4) Exclude cut with ID _53565_210_10635_1_1534337218335_3820259_287-18120-0 from training. Duration: 0.97 2023-10-04 02:23:00,603 WARNING [train.py:1204] (2/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_498-40153-0_sp1.1 from training. Duration: 0.57275 2023-10-04 02:23:00,627 WARNING [train.py:1204] (2/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_555-63066-0_sp1.1 from training. Duration: 0.963625 2023-10-04 02:23:02,750 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.feed_forward1.hidden_balancer.prob, batch_count=1491666.6666666667, ans=0.125 2023-10-04 02:23:05,266 WARNING [train.py:1204] (2/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_395-310220-0 from training. Duration: 0.89 2023-10-04 02:23:05,818 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.4.encoder.layers.0.self_attn1.whiten, num_groups=1, num_channels=384, metric=14.54 vs. limit=22.5 2023-10-04 02:23:06,576 WARNING [train.py:1204] (2/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_937-208281-0_sp1.1 from training. Duration: 0.77275 2023-10-04 02:23:08,012 WARNING [train.py:1204] (2/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_120-211630-0 from training. Duration: 0.92 2023-10-04 02:23:08,037 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_454-276429-0_sp1.1 from training. Duration: 0.7909375 2023-10-04 02:23:09,381 WARNING [train.py:1204] (2/4) Exclude cut with ID _22826_210_9994_1_1531908201522_3279169_404-75889-0_sp1.1 from training. Duration: 0.8090625 2023-10-04 02:23:14,842 INFO [train.py:1046] (2/4) Epoch 43, batch 650, loss[loss=0.1617, simple_loss=0.248, pruned_loss=0.03772, over 24439.00 frames. ], tot_loss[loss=0.1558, simple_loss=0.2361, pruned_loss=0.03774, over 4547868.24 frames. ], batch size: 69, lr: 2.37e-03, grad_scale: 16.0 2023-10-04 02:23:14,900 WARNING [train.py:1204] (2/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_282-136641-0_sp0.9 from training. Duration: 0.4666875 2023-10-04 02:23:16,413 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_809-17156-0_sp1.1 from training. Duration: 0.663625 2023-10-04 02:23:19,076 WARNING [train.py:1204] (2/4) Exclude cut with ID _9235_210_5219_1_1528891154253_8046120_1003-101638-0_sp0.9 from training. Duration: 0.811125 2023-10-04 02:23:20,972 WARNING [train.py:1204] (2/4) Exclude cut with ID _22826_210_9994_1_1531908201522_3279169_404-75889-0_sp0.9 from training. Duration: 0.988875 2023-10-04 02:23:23,657 WARNING [train.py:1204] (2/4) Exclude cut with ID _59288_210_5854_1_1535077757774_3412246_570-295255-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 02:23:25,132 WARNING [train.py:1204] (2/4) Exclude cut with ID _3465_210_3525_1_1526175667060_3574049_412-287526-0 from training. Duration: 0.72 2023-10-04 02:23:26,383 WARNING [train.py:1204] (2/4) Exclude cut with ID _44010_210_4525_1_1533884262964_4013620_649-79655-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 02:23:31,089 WARNING [train.py:1204] (2/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_57-270037-0_sp0.9 from training. Duration: 0.8555625 2023-10-04 02:23:31,090 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_34815_210_6388_1_1533381320_3571903_302-255963-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 02:23:33,304 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.feed_forward2.hidden_balancer.prob, batch_count=1491800.0, ans=0.125 2023-10-04 02:23:34,496 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_47449_210_5399_1_1534076445_4006929_573-301852-0_sp1.1 from training. Duration: 0.97275 2023-10-04 02:23:37,398 WARNING [train.py:1204] (2/4) Exclude cut with ID 6709-74022-0004-86860-0_sp1.1 from training. Duration: 0.9409375 2023-10-04 02:23:37,544 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.conv_skip_rate, batch_count=1491800.0, ans=0.0 2023-10-04 02:23:38,752 WARNING [train.py:1204] (2/4) Exclude cut with ID _40519_210_13794_1_1533715228655_7337390_111-71991-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 02:23:40,058 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_30167_210_3528_1_1532428012_4772375_169-214186-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 02:23:41,559 WARNING [train.py:1204] (2/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_20-235313-0_sp1.1 from training. Duration: 0.963625 2023-10-04 02:23:41,604 WARNING [train.py:1204] (2/4) Exclude cut with ID _4681_210_3525_1_1527251040321_1235713_123-24428-0_sp0.9 from training. Duration: 0.5333125 2023-10-04 02:23:44,354 WARNING [train.py:1204] (2/4) Exclude cut with ID _31602_210_5854_1_1532677021456_4458250_444-198752-0_sp1.1 from training. Duration: 0.97275 2023-10-04 02:23:44,404 WARNING [train.py:1204] (2/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_9-279442-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 02:23:45,654 WARNING [train.py:1204] (2/4) Exclude cut with ID _6275_210_2084_1_1528110062213_7669049_973-198085-0_sp0.9 from training. Duration: 0.8333125 2023-10-04 02:23:45,762 WARNING [train.py:1204] (2/4) Exclude cut with ID _29853_210_7497_1_1532505600851_6642110_567-250221-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 02:23:47,593 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=1491866.6666666667, ans=0.1 2023-10-04 02:23:48,731 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6545_210_2040_1_1527774670_4245258_407-48070-0_sp1.1 from training. Duration: 0.6909375 2023-10-04 02:23:51,305 WARNING [train.py:1204] (2/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_224-343295-0_sp0.9 from training. Duration: 0.611125 2023-10-04 02:23:51,320 WARNING [train.py:1204] (2/4) Exclude cut with ID IC0125W0011-61817 from training. Duration: 0.88 2023-10-04 02:23:51,329 WARNING [train.py:1204] (2/4) Exclude cut with ID _60113_210_16271_1_1535164212481_4386079_435-154371-0_sp1.1 from training. Duration: 0.97275 2023-10-04 02:23:51,349 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_25836_210_16554_1_1532138167_53197_3-9394-0_sp1.1 from training. Duration: 0.92725 2023-10-04 02:23:54,180 WARNING [train.py:1204] (2/4) Exclude cut with ID _29343_210_4381_1_1532602757752_3674109_57-55125-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 02:23:55,546 WARNING [train.py:1204] (2/4) Exclude cut with ID _56235_210_10640_1_1534557335235_4161099_267-23560-0_sp1.1 from training. Duration: 0.936375 2023-10-04 02:23:55,584 WARNING [train.py:1204] (2/4) Exclude cut with ID _30196_210_1664_1_1533708496522_7074140_1150-278867-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 02:23:57,428 WARNING [train.py:1204] (2/4) Exclude cut with ID _6270_210_2084_1_1527850911573_3856420_26-277343-0_sp0.9 from training. Duration: 0.911125 2023-10-04 02:23:57,493 WARNING [train.py:1204] (2/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_325-62802-0 from training. Duration: 0.87 2023-10-04 02:23:59,458 WARNING [train.py:1204] (2/4) Exclude cut with ID _56524_210_9642_1_1534572011779_3619170_145-310842-0_sp1.1 from training. Duration: 0.9 2023-10-04 02:23:59,502 WARNING [train.py:1204] (2/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_860-96198-0_sp0.9 from training. Duration: 0.811125 2023-10-04 02:24:01,384 WARNING [train.py:1204] (2/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_17-25045-0_sp1.1 from training. Duration: 0.536375 2023-10-04 02:24:01,405 WARNING [train.py:1204] (2/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_349-69727-0_sp1.1 from training. Duration: 0.936375 2023-10-04 02:24:02,693 WARNING [train.py:1204] (2/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_580-93798-0_sp1.1 from training. Duration: 0.5090625 2023-10-04 02:24:02,922 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.feed_forward2.hidden_balancer.prob, batch_count=1491933.3333333333, ans=0.125 2023-10-04 02:24:03,913 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.548e+02 1.962e+02 2.234e+02 2.555e+02 3.806e+02, threshold=4.468e+02, percent-clipped=0.0 2023-10-04 02:24:04,031 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7762_210_1794_1_1528199508_4057408_419-125009-0 from training. Duration: 0.83 2023-10-04 02:24:04,132 WARNING [train.py:1204] (2/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_356-167816-0 from training. Duration: 0.72 2023-10-04 02:24:05,454 WARNING [train.py:1204] (2/4) Exclude cut with ID _29345_210_4381_1_1532689168263_3860169_225-97100-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 02:24:05,466 WARNING [train.py:1204] (2/4) Exclude cut with ID _14227_210_4381_1_1530701804286_3940259_374-194712-0_sp1.1 from training. Duration: 0.92725 2023-10-04 02:24:05,491 WARNING [train.py:1204] (2/4) Exclude cut with ID _40137_210_12210_1_1533290484046_3795180_249-187059-0_sp0.9 from training. Duration: 0.97775 2023-10-04 02:24:05,516 WARNING [train.py:1204] (2/4) Exclude cut with ID _22826_210_9994_1_1531908201522_3279169_389-317194-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 02:24:05,688 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.1.conv_module2.balancer1.prob, batch_count=1491933.3333333333, ans=0.125 2023-10-04 02:24:08,172 WARNING [train.py:1204] (2/4) Exclude cut with ID _27485_210_4916_1_1532739191677_3668269_239-122918-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 02:24:15,374 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_2245_210_2715_1_1525511548_3540254_128-117609-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 02:24:15,403 WARNING [train.py:1204] (2/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_160-276178-0_sp1.1 from training. Duration: 0.963625 2023-10-04 02:24:16,816 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_31920_210_10633_1_1532860009_1686094_1-279735-0_sp1.1 from training. Duration: 0.97275 2023-10-04 02:24:17,012 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.attention_skip_rate, batch_count=1492000.0, ans=0.0 2023-10-04 02:24:19,643 WARNING [train.py:1204] (2/4) Exclude cut with ID _51078_210_9070_1_1534161557424_3805250_325-262383-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 02:24:19,678 WARNING [train.py:1204] (2/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_643-52007-0_sp0.9 from training. Duration: 0.6333125 2023-10-04 02:24:21,523 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_51937_210_5014_1_1534573610_4022025_518-299248-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 02:24:27,132 WARNING [train.py:1204] (2/4) Exclude cut with ID _13252_210_1925_1_1530671372047_3336909_166-314607-0_sp1.1 from training. Duration: 0.4909375 2023-10-04 02:24:27,136 WARNING [train.py:1204] (2/4) Exclude cut with ID _44778_210_2054_1_1533798127203_7443090_1087-282939-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 02:24:28,419 INFO [train.py:1046] (2/4) Epoch 43, batch 700, loss[loss=0.1643, simple_loss=0.2394, pruned_loss=0.04456, over 23813.00 frames. ], tot_loss[loss=0.1553, simple_loss=0.2357, pruned_loss=0.03747, over 4591997.83 frames. ], batch size: 150, lr: 2.37e-03, grad_scale: 8.0 2023-10-04 02:24:28,463 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_23850_210_4238_1_1532232597_1632433_32-185135-0_sp1.1 from training. Duration: 0.7818125 2023-10-04 02:24:28,504 WARNING [train.py:1204] (2/4) Exclude cut with ID _59500_210_8254_1_1534809610680_3736009_145-89359-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 02:24:33,254 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_340-336408-0 from training. Duration: 0.93 2023-10-04 02:24:33,320 WARNING [train.py:1204] (2/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_9-21737-0 from training. Duration: 0.97 2023-10-04 02:24:36,001 WARNING [train.py:1204] (2/4) Exclude cut with ID _2220_210_1819_1_1525509109898_3658680_50-99837-0 from training. Duration: 0.54 2023-10-04 02:24:37,357 WARNING [train.py:1204] (2/4) Exclude cut with ID _30304_210_6319_1_1532476827261_7582060_879-299350-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 02:24:38,709 WARNING [train.py:1204] (2/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_905-160074-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 02:24:40,195 WARNING [train.py:1204] (2/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_395-73570-0 from training. Duration: 0.99 2023-10-04 02:24:45,730 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7264_210_4938_1_1527939911_8132095_800-91995-0_sp1.1 from training. Duration: 0.7818125 2023-10-04 02:24:48,444 WARNING [train.py:1204] (2/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_77-311404-0_sp1.1 from training. Duration: 0.8545625 2023-10-04 02:24:49,820 WARNING [train.py:1204] (2/4) Exclude cut with ID _13679_210_7497_1_1530534293662_3355098_107-80772-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 02:24:51,161 WARNING [train.py:1204] (2/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_224-343295-0_sp1.1 from training. Duration: 0.5 2023-10-04 02:24:51,199 WARNING [train.py:1204] (2/4) Exclude cut with ID _54871_210_22356_1_1534665798253_3856359_156-253482-0_sp1.1 from training. Duration: 0.963625 2023-10-04 02:24:53,306 INFO [scaling.py:1118] (2/4) WithLoss: name=encoder.encoders.3.encoder.layers.2.self_attn_weights, loss-sum=0.000e+00 2023-10-04 02:24:54,385 WARNING [train.py:1204] (2/4) Exclude cut with ID _49728_210_12651_1_1534041034294_6788150_309-125532-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 02:24:55,862 WARNING [train.py:1204] (2/4) Exclude cut with ID _13913_210_9994_1_1530579615187_3383897_473-277682-0_sp0.9 from training. Duration: 0.4333125 2023-10-04 02:24:55,867 WARNING [train.py:1204] (2/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_3-276701-0_sp1.1 from training. Duration: 0.82725 2023-10-04 02:24:59,073 WARNING [train.py:1204] (2/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_282-136641-0 from training. Duration: 0.42 2023-10-04 02:25:04,002 WARNING [train.py:1204] (2/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_127-566-0 from training. Duration: 0.84 2023-10-04 02:25:06,837 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6163_210_6400_1_1527506746_4263068_283-137811-0_sp1.1 from training. Duration: 0.7 2023-10-04 02:25:06,867 WARNING [train.py:1204] (2/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_34-183685-0_sp1.1 from training. Duration: 0.8909375 2023-10-04 02:25:08,295 WARNING [train.py:1204] (2/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_154-135696-0_sp0.9 from training. Duration: 0.888875 2023-10-04 02:25:11,177 WARNING [train.py:1204] (2/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_676-51014-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 02:25:11,241 WARNING [train.py:1204] (2/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_38-169920-0 from training. Duration: 0.91 2023-10-04 02:25:15,359 WARNING [train.py:1204] (2/4) Exclude cut with ID _6653_210_2629_1_1527919172441_6732679_747-114465-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 02:25:15,409 WARNING [train.py:1204] (2/4) Exclude cut with ID _8021_210_7994_1_1528376451358_3653069_100-140231-0_sp1.1 from training. Duration: 0.6090625 2023-10-04 02:25:15,433 WARNING [train.py:1204] (2/4) Exclude cut with ID _64135_210_2253_1_1535677267876_7608972_133-188266-0 from training. Duration: 0.81 2023-10-04 02:25:18,683 WARNING [train.py:1204] (2/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_714-310538-0_sp1.1 from training. Duration: 0.7818125 2023-10-04 02:25:20,106 WARNING [train.py:1204] (2/4) Exclude cut with ID _39867_210_9826_1_1533553222030_9344259_1134-288498-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 02:25:20,304 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.0.conv_module1.balancer1.prob, batch_count=1492266.6666666667, ans=0.125 2023-10-04 02:25:21,632 WARNING [train.py:1204] (2/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_211-237289-0_sp1.1 from training. Duration: 0.936375 2023-10-04 02:25:22,362 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.2.encoder.layers.2.nonlin_attention.whiten2, num_groups=1, num_channels=384, metric=4.85 vs. limit=15.0 2023-10-04 02:25:28,844 WARNING [train.py:1204] (2/4) Exclude cut with ID _59202_210_26984_1_1534816810612_3549174_137-318710-0_sp0.9 from training. Duration: 0.988875 2023-10-04 02:25:28,886 WARNING [train.py:1204] (2/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_86-66192-0 from training. Duration: 0.55 2023-10-04 02:25:32,908 WARNING [train.py:1204] (2/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_540-275685-0 from training. Duration: 0.89 2023-10-04 02:25:33,133 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.conv_module1.balancer1.prob, batch_count=1492333.3333333333, ans=0.125 2023-10-04 02:25:34,261 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8776_210_2040_1_1528638837_4088036_244-278763-0 from training. Duration: 0.9 2023-10-04 02:25:35,818 WARNING [train.py:1204] (2/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_9-219972-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 02:25:37,292 WARNING [train.py:1204] (2/4) Exclude cut with ID _44659_210_2721_1_1534036120061_6614860_656-283823-0_sp1.1 from training. Duration: 0.92725 2023-10-04 02:25:37,361 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_69630_210_7234_1_1535789601_661049_77-216385-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 02:25:39,952 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_19952_210_2161_1_1531875469_3851750_210-308919-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 02:25:39,966 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_4737_210_2040_1_1526998689_3293008_378-178650-0 from training. Duration: 0.95 2023-10-04 02:25:42,712 INFO [train.py:1046] (2/4) Epoch 43, batch 750, loss[loss=0.1605, simple_loss=0.2484, pruned_loss=0.03626, over 24559.00 frames. ], tot_loss[loss=0.155, simple_loss=0.2353, pruned_loss=0.03732, over 4618351.40 frames. ], batch size: 71, lr: 2.37e-03, grad_scale: 8.0 2023-10-04 02:25:42,897 WARNING [train.py:1204] (2/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_224-343295-0 from training. Duration: 0.55 2023-10-04 02:25:44,108 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7002_210_1794_1_1527939683_5157286_184-229630-0 from training. Duration: 0.74 2023-10-04 02:25:44,133 WARNING [train.py:1204] (2/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_6-214815-0 from training. Duration: 0.85 2023-10-04 02:25:44,224 WARNING [train.py:1204] (2/4) Exclude cut with ID _8699_210_2069_1_1528630346831_3960409_160-24408-0 from training. Duration: 0.91 2023-10-04 02:25:45,521 WARNING [train.py:1204] (2/4) Exclude cut with ID _5585_210_2667_1_1527418813324_8007340_89-332072-0 from training. Duration: 0.64 2023-10-04 02:25:45,574 WARNING [train.py:1204] (2/4) Exclude cut with ID _5250_210_5219_1_1527249561806_8692110_528-259800-0_sp1.1 from training. Duration: 0.963625 2023-10-04 02:25:48,248 WARNING [train.py:1204] (2/4) Exclude cut with ID _55383_210_10635_1_1534468426172_2231669_134-128246-0 from training. Duration: 0.89 2023-10-04 02:25:48,328 WARNING [train.py:1204] (2/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_400-338274-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 02:25:50,117 WARNING [train.py:1204] (2/4) Exclude cut with ID _14224_210_4381_1_1530529062037_4264010_364-22817-0_sp0.9 from training. Duration: 0.788875 2023-10-04 02:25:51,499 WARNING [train.py:1204] (2/4) Exclude cut with ID _42863_210_9070_1_1533902398788_3696920_331-291057-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 02:25:54,185 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_5436_210_1794_1_1527331317_2541494_229-211016-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 02:25:54,215 WARNING [train.py:1204] (2/4) Exclude cut with ID _6456_210_1534_1_1527681634212_3181049_345-252877-0_sp0.9 from training. Duration: 0.711125 2023-10-04 02:25:54,245 WARNING [train.py:1204] (2/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_96-81499-0_sp1.1 from training. Duration: 0.92725 2023-10-04 02:25:56,959 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_5575_210_1410_1_1527301305_2570170_342-265572-0_sp1.1 from training. Duration: 0.8545625 2023-10-04 02:25:58,270 WARNING [train.py:1204] (2/4) Exclude cut with ID _5444_210_2151_1_1527418808464_5312210_726-27165-0_sp0.9 from training. Duration: 0.7444375 2023-10-04 02:26:00,404 WARNING [train.py:1204] (2/4) Exclude cut with ID _22083_210_8254_1_1531702862439_3678970_304-327750-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 02:26:03,837 WARNING [train.py:1204] (2/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_245-226199-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 02:26:03,868 WARNING [train.py:1204] (2/4) Exclude cut with ID _42875_210_14856_1_1533704397487_4012433_102-199499-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 02:26:03,936 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_51225_210_3601_1_1534400634_2327383_55-301844-0 from training. Duration: 0.93 2023-10-04 02:26:05,675 WARNING [train.py:1204] (2/4) Exclude cut with ID _28317_210_12742_1_1532480302381_8510325_916-195848-0_sp1.1 from training. Duration: 0.836375 2023-10-04 02:26:05,748 WARNING [train.py:1204] (2/4) Exclude cut with ID _44718_210_5816_1_1534050051869_6469030_497-311999-0_sp1.1 from training. Duration: 0.936375 2023-10-04 02:26:07,135 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_34886_210_10633_1_1533203882_1755661_91-194765-0_sp1.1 from training. Duration: 0.936375 2023-10-04 02:26:09,802 WARNING [train.py:1204] (2/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_310-26186-0_sp0.9 from training. Duration: 0.77775 2023-10-04 02:26:09,888 WARNING [train.py:1204] (2/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_416-257730-0 from training. Duration: 0.65 2023-10-04 02:26:09,894 WARNING [train.py:1204] (2/4) Exclude cut with ID _9389_210_1664_1_1529217337301_5289269_209-154760-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 02:26:12,545 WARNING [train.py:1204] (2/4) Exclude cut with ID _4092_210_2151_1_1526643044449_3135759_227-213972-0 from training. Duration: 0.54 2023-10-04 02:26:12,565 WARNING [train.py:1204] (2/4) Exclude cut with ID IC0679W0044-335852 from training. Duration: 0.768 2023-10-04 02:26:12,628 WARNING [train.py:1204] (2/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_42-186162-0 from training. Duration: 0.92 2023-10-04 02:26:12,642 WARNING [train.py:1204] (2/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_302-2550-0_sp0.9 from training. Duration: 0.9 2023-10-04 02:26:13,929 WARNING [train.py:1204] (2/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_356-31216-0_sp0.9 from training. Duration: 0.8333125 2023-10-04 02:26:15,382 WARNING [train.py:1204] (2/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_125-322172-0_sp1.1 from training. Duration: 0.8454375 2023-10-04 02:26:22,630 WARNING [train.py:1204] (2/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_141-89727-0_sp0.9 from training. Duration: 0.788875 2023-10-04 02:26:23,874 WARNING [train.py:1204] (2/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_647-337784-0_sp1.1 from training. Duration: 0.97275 2023-10-04 02:26:23,891 WARNING [train.py:1204] (2/4) Exclude cut with ID _6493_210_1747_1_1528459210424_4012470_513-140967-0_sp1.1 from training. Duration: 0.7181875 2023-10-04 02:26:25,270 WARNING [train.py:1204] (2/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_87-266334-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 02:26:26,653 WARNING [train.py:1204] (2/4) Exclude cut with ID _26528_210_9070_1_1532570410842_3851310_526-261338-0_sp1.1 from training. Duration: 0.92725 2023-10-04 02:26:28,000 WARNING [train.py:1204] (2/4) Exclude cut with ID _14224_210_4381_1_1530529062037_4264010_380-343670-0 from training. Duration: 0.88 2023-10-04 02:26:28,069 WARNING [train.py:1204] (2/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_449-15508-0_sp1.1 from training. Duration: 0.7454375 2023-10-04 02:26:31,178 WARNING [train.py:1204] (2/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_372-6869-0_sp0.9 from training. Duration: 0.488875 2023-10-04 02:26:31,244 WARNING [train.py:1204] (2/4) Exclude cut with ID _26907_210_7234_1_1532221740694_3205263_337-2494-0_sp0.9 from training. Duration: 0.97775 2023-10-04 02:26:32,460 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.607e+02 1.914e+02 2.119e+02 2.380e+02 3.754e+02, threshold=4.238e+02, percent-clipped=0.0 2023-10-04 02:26:35,853 WARNING [train.py:1204] (2/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_10-100367-0_sp1.1 from training. Duration: 0.8909375 2023-10-04 02:26:35,888 WARNING [train.py:1204] (2/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_47-167373-0 from training. Duration: 0.92 2023-10-04 02:26:35,949 WARNING [train.py:1204] (2/4) Exclude cut with ID _28323_210_16187_1_1532480395667_4100300_628-282179-0_sp1.1 from training. Duration: 0.97275 2023-10-04 02:26:41,788 WARNING [train.py:1204] (2/4) Exclude cut with ID _20455_210_5219_1_1531565996775_8098100_213-280898-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 02:26:41,896 WARNING [train.py:1204] (2/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_383-316000-0_sp1.1 from training. Duration: 0.8181875 2023-10-04 02:26:43,278 WARNING [train.py:1204] (2/4) Exclude cut with ID _18513_210_9206_1_1531618215223_3862620_16-78180-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 02:26:44,704 WARNING [train.py:1204] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_330-327861-0_sp1.1 from training. Duration: 0.6090625 2023-10-04 02:26:48,760 WARNING [train.py:1204] (2/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_356-31216-0 from training. Duration: 0.75 2023-10-04 02:26:48,786 WARNING [train.py:1204] (2/4) Exclude cut with ID _25248_210_8750_1_1532062825941_5554180_255-148539-0_sp1.1 from training. Duration: 0.963625 2023-10-04 02:26:50,599 WARNING [train.py:1204] (2/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_269-154726-0_sp1.1 from training. Duration: 0.7818125 2023-10-04 02:26:52,113 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7002_210_1794_1_1527939683_5157286_163-87552-0_sp1.1 from training. Duration: 0.7818125 2023-10-04 02:26:52,145 WARNING [train.py:1204] (2/4) Exclude cut with ID _54871_210_22356_1_1534665798253_3856359_97-151896-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 02:26:56,301 INFO [train.py:1046] (2/4) Epoch 43, batch 800, loss[loss=0.1688, simple_loss=0.261, pruned_loss=0.03826, over 24423.00 frames. ], tot_loss[loss=0.1553, simple_loss=0.2362, pruned_loss=0.03718, over 4640896.05 frames. ], batch size: 69, lr: 2.37e-03, grad_scale: 8.0 2023-10-04 02:26:56,348 WARNING [train.py:1204] (2/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_207-7026-0_sp1.1 from training. Duration: 0.97275 2023-10-04 02:26:56,381 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_92-46009-0_sp0.9 from training. Duration: 0.6 2023-10-04 02:26:56,612 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.feed_forward1.hidden_balancer.prob, batch_count=1492733.3333333333, ans=0.125 2023-10-04 02:27:00,695 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.balancer2.prob, batch_count=1492733.3333333333, ans=0.125 2023-10-04 02:27:05,526 WARNING [train.py:1204] (2/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_122-178847-0_sp1.1 from training. Duration: 0.97275 2023-10-04 02:27:05,530 WARNING [train.py:1204] (2/4) Exclude cut with ID _44778_210_2054_1_1533798127203_7443090_94-348956-0_sp1.1 from training. Duration: 0.92725 2023-10-04 02:27:06,997 WARNING [train.py:1204] (2/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_1018-244089-0_sp1.1 from training. Duration: 0.963625 2023-10-04 02:27:07,008 WARNING [train.py:1204] (2/4) Exclude cut with ID _56235_210_10640_1_1534557335235_4161099_87-118423-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 02:27:08,329 WARNING [train.py:1204] (2/4) Exclude cut with ID _53713_210_24116_1_1534314487762_7416751_504-212292-0_sp1.1 from training. Duration: 0.92725 2023-10-04 02:27:08,350 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_446-150327-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 02:27:08,471 WARNING [train.py:1204] (2/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_682-245989-0_sp1.1 from training. Duration: 0.92725 2023-10-04 02:27:11,839 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.1.conv_module1.balancer2.prob, batch_count=1492800.0, ans=0.125 2023-10-04 02:27:11,844 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=1492800.0, ans=0.1 2023-10-04 02:27:13,085 WARNING [train.py:1204] (2/4) Exclude cut with ID _35723_210_1794_1_1533088341043_628250_8-234524-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 02:27:14,247 WARNING [train.py:1204] (2/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_64-248359-0_sp0.9 from training. Duration: 0.7666875 2023-10-04 02:27:17,067 WARNING [train.py:1204] (2/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_49-97158-0 from training. Duration: 0.76 2023-10-04 02:27:18,347 WARNING [train.py:1204] (2/4) Exclude cut with ID _24314_210_4915_1_1532135078116_7170179_743-915-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 02:27:18,450 WARNING [train.py:1204] (2/4) Exclude cut with ID _61194_210_18628_1_1535007555628_3750909_353-323468-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 02:27:19,759 WARNING [train.py:1204] (2/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_387-131538-0_sp1.1 from training. Duration: 0.736375 2023-10-04 02:27:19,779 WARNING [train.py:1204] (2/4) Exclude cut with ID _44424_210_18163_1_1533623394848_1213110_79-187603-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 02:27:19,826 WARNING [train.py:1204] (2/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_86-19601-0 from training. Duration: 0.85 2023-10-04 02:27:19,849 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_50050_210_16223_1_1535435796_7290138_103-283529-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 02:27:20,003 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.balancer2.prob, batch_count=1492800.0, ans=0.125 2023-10-04 02:27:21,775 WARNING [train.py:1204] (2/4) Exclude cut with ID _13913_210_9994_1_1530579615187_3383897_473-277682-0 from training. Duration: 0.39 2023-10-04 02:27:24,598 WARNING [train.py:1204] (2/4) Exclude cut with ID _42326_210_7770_1_1533814282531_3707269_368-68029-0_sp1.1 from training. Duration: 0.92725 2023-10-04 02:27:27,240 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_41428_210_2161_1_1533774184_3992532_446-339645-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 02:27:28,639 WARNING [train.py:1204] (2/4) Exclude cut with ID _69584_210_5219_1_1535785133775_4044450_325-7656-0_sp1.1 from training. Duration: 0.7818125 2023-10-04 02:27:28,647 WARNING [train.py:1204] (2/4) Exclude cut with ID _21632_210_2253_1_1531994001765_3744140_134-184387-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 02:27:33,125 WARNING [train.py:1204] (2/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_550-184707-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 02:27:33,149 WARNING [train.py:1204] (2/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_106-341780-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 02:27:35,189 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.2.encoder.layers.0.feed_forward3.out_whiten, num_groups=1, num_channels=384, metric=11.59 vs. limit=15.0 2023-10-04 02:27:37,612 WARNING [train.py:1204] (2/4) Exclude cut with ID _45871_210_5665_1_1533864614245_3296060_253-132552-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 02:27:39,039 WARNING [train.py:1204] (2/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_395-310220-0_sp1.1 from training. Duration: 0.8090625 2023-10-04 02:27:39,061 WARNING [train.py:1204] (2/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_795-33252-0_sp0.9 from training. Duration: 0.411125 2023-10-04 02:27:39,197 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9002W0003-495962 from training. Duration: 0.946 2023-10-04 02:27:40,506 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_43-71989-0 from training. Duration: 0.57 2023-10-04 02:27:40,520 WARNING [train.py:1204] (2/4) Exclude cut with ID _8699_210_2069_1_1528630346831_3960409_155-39766-0_sp1.1 from training. Duration: 0.7090625 2023-10-04 02:27:40,534 WARNING [train.py:1204] (2/4) Exclude cut with ID _27894_210_12392_1_1532415496078_6902439_788-232684-0_sp1.1 from training. Duration: 0.97275 2023-10-04 02:27:43,755 WARNING [train.py:1204] (2/4) Exclude cut with ID _56494_210_4220_1_1535284837625_3033169_10-39296-0_sp1.1 from training. Duration: 0.92725 2023-10-04 02:27:43,790 WARNING [train.py:1204] (2/4) Exclude cut with ID _40901_210_5665_1_1533363789958_3856130_341-20819-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 02:27:46,740 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9029W0435-509822 from training. Duration: 0.896 2023-10-04 02:27:48,012 WARNING [train.py:1204] (2/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_51-117576-0 from training. Duration: 0.4 2023-10-04 02:27:49,447 WARNING [train.py:1204] (2/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_197-28164-0_sp1.1 from training. Duration: 0.72725 2023-10-04 02:27:50,872 WARNING [train.py:1204] (2/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_145-137790-0_sp1.1 from training. Duration: 0.7545625 2023-10-04 02:27:54,813 WARNING [train.py:1204] (2/4) Exclude cut with ID _47790_210_16187_1_1533862814066_4207519_2-39653-0_sp1.1 from training. Duration: 0.8909375 2023-10-04 02:27:58,138 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_3780_210_2087_1_1526467760_2130483_186-208240-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 02:27:59,491 WARNING [train.py:1204] (2/4) Exclude cut with ID _24055_210_12651_1_1532310907143_976979_92-59356-0 from training. Duration: 0.91 2023-10-04 02:27:59,553 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_3910_210_1794_1_1526742306_334530_28-238909-0_sp1.1 from training. Duration: 0.736375 2023-10-04 02:28:02,276 WARNING [train.py:1204] (2/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_269-154726-0 from training. Duration: 0.86 2023-10-04 02:28:08,918 WARNING [train.py:1204] (2/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_326-165900-0_sp0.9 from training. Duration: 0.7666875 2023-10-04 02:28:09,744 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.1.encoder.layers.1.self_attn1.whiten, num_groups=1, num_channels=256, metric=12.10 vs. limit=22.5 2023-10-04 02:28:10,117 INFO [train.py:1046] (2/4) Epoch 43, batch 850, loss[loss=0.1588, simple_loss=0.2543, pruned_loss=0.03166, over 24698.00 frames. ], tot_loss[loss=0.1557, simple_loss=0.2366, pruned_loss=0.03739, over 4660331.79 frames. ], batch size: 73, lr: 2.37e-03, grad_scale: 8.0 2023-10-04 02:28:11,638 WARNING [train.py:1204] (2/4) Exclude cut with ID _5294_210_1747_1_1527249643928_3696179_470-276077-0_sp1.1 from training. Duration: 0.963625 2023-10-04 02:28:11,818 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.conv_skip_rate, batch_count=1493066.6666666667, ans=0.0 2023-10-04 02:28:13,535 WARNING [train.py:1204] (2/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_372-131163-0 from training. Duration: 0.94 2023-10-04 02:28:13,591 WARNING [train.py:1204] (2/4) Exclude cut with ID _45297_210_2721_1_1533715351381_3264949_351-147136-0_sp1.1 from training. Duration: 0.7909375 2023-10-04 02:28:16,157 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_1072-12671-0_sp1.1 from training. Duration: 0.92725 2023-10-04 02:28:16,241 WARNING [train.py:1204] (2/4) Exclude cut with ID _15133_210_6228_1_1530794512074_3080379_54-198193-0 from training. Duration: 0.94 2023-10-04 02:28:16,271 WARNING [train.py:1204] (2/4) Exclude cut with ID _30196_210_1664_1_1533708496522_7074140_1194-270988-0_sp1.1 from training. Duration: 0.97275 2023-10-04 02:28:18,978 WARNING [train.py:1204] (2/4) Exclude cut with ID _50310_210_15488_1_1534068043345_3641080_289-193637-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 02:28:20,347 WARNING [train.py:1204] (2/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_313-263495-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 02:28:20,609 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.bypass.scale_min, batch_count=1493066.6666666667, ans=0.2 2023-10-04 02:28:21,816 WARNING [train.py:1204] (2/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_18-130336-0_sp1.1 from training. Duration: 0.7181875 2023-10-04 02:28:23,209 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_47896_210_20508_1_1534492518_5958868_646-287209-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 02:28:23,317 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_294-163793-0 from training. Duration: 0.82 2023-10-04 02:28:24,600 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_8-319957-0 from training. Duration: 0.96 2023-10-04 02:28:24,603 WARNING [train.py:1204] (2/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_1211-226066-0 from training. Duration: 0.76 2023-10-04 02:28:25,498 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.1.encoder.layers.0.conv_module2.whiten, num_groups=1, num_channels=256, metric=5.59 vs. limit=15.0 2023-10-04 02:28:25,995 WARNING [train.py:1204] (2/4) Exclude cut with ID _4779_210_2084_1_1527318022944_3914893_500-285013-0_sp0.9 from training. Duration: 0.7666875 2023-10-04 02:28:26,018 WARNING [train.py:1204] (2/4) Exclude cut with ID _22826_210_9994_1_1531908201522_3279169_347-232268-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 02:28:27,846 WARNING [train.py:1204] (2/4) Exclude cut with ID _29877_210_6228_1_1532397791106_5276149_111-55056-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 02:28:27,876 WARNING [train.py:1204] (2/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_812-271646-0_sp1.1 from training. Duration: 0.92725 2023-10-04 02:28:29,288 WARNING [train.py:1204] (2/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_235-262124-0_sp1.1 from training. Duration: 0.6090625 2023-10-04 02:28:32,138 WARNING [train.py:1204] (2/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_14-16530-0_sp1.1 from training. Duration: 0.97275 2023-10-04 02:28:33,475 WARNING [train.py:1204] (2/4) Exclude cut with ID _44718_210_5816_1_1534050051869_6469030_568-304512-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 02:28:33,502 WARNING [train.py:1204] (2/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_1033-274376-0 from training. Duration: 0.87 2023-10-04 02:28:35,544 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.attention_skip_rate, batch_count=1493133.3333333333, ans=0.0 2023-10-04 02:28:36,640 WARNING [train.py:1204] (2/4) Exclude cut with ID _4681_210_3525_1_1527251040321_1235713_123-24428-0 from training. Duration: 0.48 2023-10-04 02:28:38,117 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_37818_210_4238_1_1533620910_4720083_251-288764-0_sp1.1 from training. Duration: 0.97275 2023-10-04 02:28:39,485 WARNING [train.py:1204] (2/4) Exclude cut with ID _65585_210_12210_1_1535450451584_3764098_112-294301-0 from training. Duration: 0.88 2023-10-04 02:28:42,772 WARNING [train.py:1204] (2/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_329-113173-0 from training. Duration: 0.62 2023-10-04 02:28:44,858 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6767_210_3273_1_1527855783_1718932_173-256601-0 from training. Duration: 0.8 2023-10-04 02:28:46,306 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9266W0001-627677 from training. Duration: 0.896 2023-10-04 02:28:47,662 WARNING [train.py:1204] (2/4) Exclude cut with ID _49643_210_7989_1_1534640244224_3939031_611-185515-0_sp1.1 from training. Duration: 0.8545625 2023-10-04 02:28:47,666 WARNING [train.py:1204] (2/4) Exclude cut with ID _31649_210_6228_1_1532584608063_4091078_687-237771-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 02:28:47,677 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_12868_210_6753_1_1530702479_3610564_148-172861-0_sp0.9 from training. Duration: 0.5555625 2023-10-04 02:28:50,407 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_5129_210_6400_1_1527161005_3579054_107-69738-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 02:28:50,511 WARNING [train.py:1204] (2/4) Exclude cut with ID _40137_210_12210_1_1533290484046_3795180_36-301164-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 02:28:51,735 WARNING [train.py:1204] (2/4) Exclude cut with ID _7537_210_2084_1_1528502395431_3924359_113-199933-0 from training. Duration: 0.59 2023-10-04 02:28:54,369 WARNING [train.py:1204] (2/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_547-5424-0_sp1.1 from training. Duration: 0.8545625 2023-10-04 02:28:56,091 WARNING [train.py:1204] (2/4) Exclude cut with ID _43552_210_8913_1_1533726031059_3523429_147-284024-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 02:28:56,159 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_294-163793-0_sp1.1 from training. Duration: 0.7454375 2023-10-04 02:28:56,180 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_3780_210_2087_1_1526467760_2130483_170-259044-0_sp1.1 from training. Duration: 0.636375 2023-10-04 02:28:57,654 WARNING [train.py:1204] (2/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_726-270780-0_sp1.1 from training. Duration: 0.7818125 2023-10-04 02:28:59,066 WARNING [train.py:1204] (2/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_214-291369-0_sp1.1 from training. Duration: 0.57275 2023-10-04 02:29:00,301 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.595e+02 1.958e+02 2.154e+02 2.504e+02 4.006e+02, threshold=4.308e+02, percent-clipped=0.0 2023-10-04 02:29:00,406 WARNING [train.py:1204] (2/4) Exclude cut with ID _21881_210_3525_1_1531825242757_3798380_247-102591-0 from training. Duration: 0.98 2023-10-04 02:29:00,644 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.conv_module1.balancer2.min_positive, batch_count=1493266.6666666667, ans=0.05 2023-10-04 02:29:05,084 WARNING [train.py:1204] (2/4) Exclude cut with ID _8064_210_5399_1_1528372299291_4282973_18-174647-0_sp1.1 from training. Duration: 0.82725 2023-10-04 02:29:05,087 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_415-52611-0_sp1.1 from training. Duration: 0.936375 2023-10-04 02:29:06,391 WARNING [train.py:1204] (2/4) Exclude cut with ID _8394_210_6702_1_1528590504316_3540189_379-49457-0_sp0.9 from training. Duration: 0.9666875 2023-10-04 02:29:06,413 WARNING [train.py:1204] (2/4) Exclude cut with ID _64521_210_14699_1_1535243427259_3815328_368-195106-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 02:29:07,807 WARNING [train.py:1204] (2/4) Exclude cut with ID _28394_210_8913_1_1532430049518_3650118_51-334124-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 02:29:13,181 WARNING [train.py:1204] (2/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_317-161769-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 02:29:15,222 WARNING [train.py:1204] (2/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_40-129756-0_sp0.9 from training. Duration: 0.9 2023-10-04 02:29:16,672 WARNING [train.py:1204] (2/4) Exclude cut with ID _3067_210_2667_1_1526212974640_4503168_119-175776-0_sp0.9 from training. Duration: 0.82225 2023-10-04 02:29:16,716 WARNING [train.py:1204] (2/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_1004-304915-0_sp1.1 from training. Duration: 0.92725 2023-10-04 02:29:16,786 WARNING [train.py:1204] (2/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_359-118926-0_sp0.9 from training. Duration: 0.8 2023-10-04 02:29:22,924 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.2.encoder.layers.0.feed_forward2.out_whiten, num_groups=1, num_channels=384, metric=5.37 vs. limit=15.0 2023-10-04 02:29:23,729 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.conv_module2.balancer2.min_positive, batch_count=1493333.3333333333, ans=0.05 2023-10-04 02:29:24,927 WARNING [train.py:1204] (2/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_51-200220-0_sp0.9 from training. Duration: 0.62225 2023-10-04 02:29:25,018 WARNING [train.py:1204] (2/4) Exclude cut with ID _46683_210_9642_1_1533730913492_4591116_530-326202-0_sp1.1 from training. Duration: 0.936375 2023-10-04 02:29:26,255 INFO [train.py:1046] (2/4) Epoch 43, batch 900, loss[loss=0.1528, simple_loss=0.2391, pruned_loss=0.03327, over 24430.00 frames. ], tot_loss[loss=0.1561, simple_loss=0.2371, pruned_loss=0.03754, over 4672957.28 frames. ], batch size: 69, lr: 2.37e-03, grad_scale: 8.0 2023-10-04 02:29:26,336 WARNING [train.py:1204] (2/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_567-135114-0 from training. Duration: 0.87 2023-10-04 02:29:27,671 WARNING [train.py:1204] (2/4) Exclude cut with ID _66863_210_18053_1_1535698758334_8590319_1092-271784-0_sp1.1 from training. Duration: 0.87275 2023-10-04 02:29:27,681 WARNING [train.py:1204] (2/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_762-282614-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 02:29:27,832 WARNING [train.py:1204] (2/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_280-120942-0 from training. Duration: 0.77 2023-10-04 02:29:32,724 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_31583_210_3852_1_1532595851_4024413_27-170931-0_sp1.1 from training. Duration: 0.963625 2023-10-04 02:29:35,557 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.balancer_ff2.min_abs, batch_count=1493400.0, ans=0.1 2023-10-04 02:29:37,300 WARNING [train.py:1204] (2/4) Exclude cut with ID _12369_210_4381_1_1530356078581_4388520_266-271628-0_sp1.1 from training. Duration: 0.92725 2023-10-04 02:29:37,345 WARNING [train.py:1204] (2/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_41-31157-0 from training. Duration: 0.57 2023-10-04 02:29:39,976 WARNING [train.py:1204] (2/4) Exclude cut with ID _21878_210_3525_1_1531738772002_7264330_113-18163-0_sp1.1 from training. Duration: 0.8090625 2023-10-04 02:29:40,019 WARNING [train.py:1204] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_147-111137-0 from training. Duration: 0.95 2023-10-04 02:29:41,446 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1083-220122-0_sp1.1 from training. Duration: 0.463625 2023-10-04 02:29:43,216 WARNING [train.py:1204] (2/4) Exclude cut with ID _14884_210_2252_1_1530707459689_5561250_591-173406-0_sp1.1 from training. Duration: 0.87275 2023-10-04 02:29:43,230 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_24677_210_1794_1_1532002693_4341296_149-303793-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 02:29:43,281 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_12868_210_6753_1_1530702479_3610564_133-564-0_sp1.1 from training. Duration: 0.7181875 2023-10-04 02:29:43,302 WARNING [train.py:1204] (2/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_625-338546-0_sp0.9 from training. Duration: 0.97775 2023-10-04 02:29:44,164 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.0.layers.1.self_attn2.whiten, num_groups=1, num_channels=192, metric=10.31 vs. limit=22.5 2023-10-04 02:29:46,666 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.0.feed_forward1.out_proj.dropout_p, batch_count=1493466.6666666667, ans=0.1 2023-10-04 02:29:49,682 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.conv_skip_rate, batch_count=1493466.6666666667, ans=0.0 2023-10-04 02:29:52,134 WARNING [train.py:1204] (2/4) Exclude cut with ID _25882_210_3036_1_1532775665815_6479510_624-320169-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 02:29:52,141 WARNING [train.py:1204] (2/4) Exclude cut with ID _20622_210_3852_1_1531622427080_1093839_127-325326-0_sp1.1 from training. Duration: 0.92725 2023-10-04 02:29:53,489 WARNING [train.py:1204] (2/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_464-33931-0_sp1.1 from training. Duration: 0.4909375 2023-10-04 02:29:56,170 WARNING [train.py:1204] (2/4) Exclude cut with ID _66112_210_10635_1_1535590236305_3801484_120-304588-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 02:29:57,859 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.balancer2.prob, batch_count=1493533.3333333333, ans=0.125 2023-10-04 02:30:02,165 WARNING [train.py:1204] (2/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_110-130961-0 from training. Duration: 0.47 2023-10-04 02:30:03,572 WARNING [train.py:1204] (2/4) Exclude cut with ID _16908_210_5724_1_1531119157248_3772700_64-54020-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 02:30:06,953 WARNING [train.py:1204] (2/4) Exclude cut with ID _4068_210_2629_1_1526697350395_1546259_224-288693-0_sp1.1 from training. Duration: 0.536375 2023-10-04 02:30:07,015 WARNING [train.py:1204] (2/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_191-41023-0_sp0.9 from training. Duration: 0.788875 2023-10-04 02:30:07,162 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.ff3_skip_rate, batch_count=1493533.3333333333, ans=0.0 2023-10-04 02:30:08,354 WARNING [train.py:1204] (2/4) Exclude cut with ID ID0205W0255-783002 from training. Duration: 0.958 2023-10-04 02:30:09,633 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_162-262717-0 from training. Duration: 0.83 2023-10-04 02:30:09,934 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.balancer2.prob, batch_count=1493600.0, ans=0.125 2023-10-04 02:30:14,966 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_224-322394-0_sp1.1 from training. Duration: 0.6 2023-10-04 02:30:14,975 WARNING [train.py:1204] (2/4) Exclude cut with ID _4866_210_2577_1_1527404412284_4364659_133-1696-0_sp1.1 from training. Duration: 0.9 2023-10-04 02:30:16,365 WARNING [train.py:1204] (2/4) Exclude cut with ID _6275_210_2084_1_1528110062213_7669049_973-198085-0_sp1.1 from training. Duration: 0.6818125 2023-10-04 02:30:23,295 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_47449_210_5399_1_1534076445_4006929_410-237806-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 02:30:23,310 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7543_210_6753_1_1528543318_4552_1-85072-0_sp0.9 from training. Duration: 0.92225 2023-10-04 02:30:24,708 WARNING [train.py:1204] (2/4) Exclude cut with ID _65263_210_22356_1_1535763598691_3921037_108-21902-0 from training. Duration: 0.99 2023-10-04 02:30:24,713 WARNING [train.py:1204] (2/4) Exclude cut with ID _65585_210_12210_1_1535450451584_3764098_309-174818-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 02:30:27,556 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_309-65177-0 from training. Duration: 0.88 2023-10-04 02:30:30,744 WARNING [train.py:1204] (2/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_363-185703-0_sp1.1 from training. Duration: 0.863625 2023-10-04 02:30:30,775 WARNING [train.py:1204] (2/4) Exclude cut with ID _42326_210_7770_1_1533814282531_3707269_442-238692-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 02:30:31,017 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.balancer1.prob, batch_count=1493666.6666666667, ans=0.125 2023-10-04 02:30:33,397 WARNING [train.py:1204] (2/4) Exclude cut with ID _69884_210_9771_1_1535846392818_4166169_226-254446-0_sp1.1 from training. Duration: 0.936375 2023-10-04 02:30:33,415 WARNING [train.py:1204] (2/4) Exclude cut with ID _29853_210_7497_1_1532505600851_6642110_287-299903-0_sp1.1 from training. Duration: 0.963625 2023-10-04 02:30:33,693 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.bypass.scale_min, batch_count=1493666.6666666667, ans=0.2 2023-10-04 02:30:36,268 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1015-343884-0 from training. Duration: 0.62 2023-10-04 02:30:36,299 WARNING [train.py:1204] (2/4) Exclude cut with ID ID0305W0008-833402 from training. Duration: 0.91 2023-10-04 02:30:37,644 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6673_210_3273_1_1527767790_3173240_351-167954-0_sp1.1 from training. Duration: 0.436375 2023-10-04 02:30:37,657 WARNING [train.py:1204] (2/4) Exclude cut with ID _6456_210_1534_1_1527681634212_3181049_345-252877-0 from training. Duration: 0.64 2023-10-04 02:30:40,916 INFO [train.py:1046] (2/4) Epoch 43, batch 950, loss[loss=0.1527, simple_loss=0.2214, pruned_loss=0.04201, over 23749.00 frames. ], tot_loss[loss=0.1561, simple_loss=0.237, pruned_loss=0.03764, over 4675460.93 frames. ], batch size: 232, lr: 2.37e-03, grad_scale: 8.0 2023-10-04 02:30:42,472 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_13-205182-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 02:30:43,955 WARNING [train.py:1204] (2/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_427-208901-0 from training. Duration: 0.68 2023-10-04 02:30:50,382 WARNING [train.py:1204] (2/4) Exclude cut with ID _49039_210_6315_1_1533967537541_3846118_9-148921-0_sp1.1 from training. Duration: 0.97275 2023-10-04 02:30:50,803 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.balancer2.prob, batch_count=1493733.3333333333, ans=0.125 2023-10-04 02:30:51,852 WARNING [train.py:1204] (2/4) Exclude cut with ID _59493_210_17282_1_1535078419077_3225879_143-70317-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 02:30:51,894 WARNING [train.py:1204] (2/4) Exclude cut with ID _27967_210_12140_1_1532588391859_8419130_1056-236369-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 02:30:53,325 WARNING [train.py:1204] (2/4) Exclude cut with ID _6537_210_1883_1_1527987559710_3597129_382-133062-0_sp0.9 from training. Duration: 0.7555625 2023-10-04 02:30:54,775 WARNING [train.py:1204] (2/4) Exclude cut with ID ID0376W0255-869957 from training. Duration: 0.9510625 2023-10-04 02:30:57,441 WARNING [train.py:1204] (2/4) Exclude cut with ID _49371_210_12071_1_1533952761021_3812190_483-240691-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 02:30:57,504 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_12868_210_6753_1_1530701932_530275_18-76322-0_sp1.1 from training. Duration: 0.7909375 2023-10-04 02:30:58,827 WARNING [train.py:1204] (2/4) Exclude cut with ID _53950_210_5816_1_1534417264919_3862951_367-26344-0_sp1.1 from training. Duration: 0.97275 2023-10-04 02:30:58,854 WARNING [train.py:1204] (2/4) Exclude cut with ID _5444_210_2151_1_1527418808464_5312210_634-194174-0_sp1.1 from training. Duration: 0.8818125 2023-10-04 02:30:58,871 WARNING [train.py:1204] (2/4) Exclude cut with ID _56494_210_4220_1_1535284837625_3033169_69-138150-0 from training. Duration: 0.94 2023-10-04 02:31:00,973 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7543_210_6753_1_1528544010_1110402_25-183860-0_sp0.9 from training. Duration: 0.52225 2023-10-04 02:31:02,261 WARNING [train.py:1204] (2/4) Exclude cut with ID _40284_210_2084_1_1533513679814_3704279_287-191918-0_sp1.1 from training. Duration: 0.963625 2023-10-04 02:31:03,721 WARNING [train.py:1204] (2/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_291-270881-0 from training. Duration: 0.97 2023-10-04 02:31:04,927 WARNING [train.py:1204] (2/4) Exclude cut with ID _12555_210_8750_1_1530075594402_3780239_449-267986-0_sp0.9 from training. Duration: 0.92225 2023-10-04 02:31:09,611 WARNING [train.py:1204] (2/4) Exclude cut with ID _3992_210_1747_1_1526644816031_3731060_268-64520-0_sp1.1 from training. Duration: 0.963625 2023-10-04 02:31:09,618 WARNING [train.py:1204] (2/4) Exclude cut with ID _39411_210_5986_1_1533538825030_3343649_173-238248-0_sp0.9 from training. Duration: 0.92225 2023-10-04 02:31:09,646 WARNING [train.py:1204] (2/4) Exclude cut with ID _32704_210_8954_1_1533017488840_6747370_828-313097-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 02:31:11,030 WARNING [train.py:1204] (2/4) Exclude cut with ID _26907_210_7234_1_1532221740694_3205263_337-2494-0 from training. Duration: 0.88 2023-10-04 02:31:13,701 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_229-45300-0_sp1.1 from training. Duration: 0.5181875 2023-10-04 02:31:14,039 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.conv_module1.balancer1.prob, batch_count=1493866.6666666667, ans=0.125 2023-10-04 02:31:15,654 WARNING [train.py:1204] (2/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_300-27235-0_sp1.1 from training. Duration: 0.7909375 2023-10-04 02:31:17,108 WARNING [train.py:1204] (2/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_48-338976-0_sp1.1 from training. Duration: 0.8090625 2023-10-04 02:31:21,894 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_13091_210_7925_1_1530609447_8987354_792-195353-0_sp1.1 from training. Duration: 0.936375 2023-10-04 02:31:21,908 WARNING [train.py:1204] (2/4) Exclude cut with ID _56524_210_9642_1_1534572011779_3619170_311-54500-0_sp1.1 from training. Duration: 0.97275 2023-10-04 02:31:26,162 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7543_210_6753_1_1528544010_1110402_25-183860-0 from training. Duration: 0.47 2023-10-04 02:31:26,757 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.5.encoder.layers.0.self_attn_weights.whiten_keys, num_groups=4, num_channels=128, metric=1.69 vs. limit=6.0 2023-10-04 02:31:27,587 WARNING [train.py:1204] (2/4) Exclude cut with ID 2411-132532-0017-82279-0_sp1.1 from training. Duration: 0.9681875 2023-10-04 02:31:27,588 WARNING [train.py:1204] (2/4) Exclude cut with ID _14170_210_5219_1_1530525949868_4469650_272-249985-0_sp0.9 from training. Duration: 0.7444375 2023-10-04 02:31:28,983 WARNING [train.py:1204] (2/4) Exclude cut with ID _55644_210_11856_1_1534593599582_4103719_320-229006-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 02:31:30,306 WARNING [train.py:1204] (2/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_177-100554-0_sp1.1 from training. Duration: 0.963625 2023-10-04 02:31:30,308 WARNING [train.py:1204] (2/4) Exclude cut with ID _14998_210_5219_1_1530705582080_7474970_787-294416-0_sp1.1 from training. Duration: 0.7454375 2023-10-04 02:31:32,266 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.702e+02 1.990e+02 2.144e+02 2.470e+02 4.825e+02, threshold=4.288e+02, percent-clipped=1.0 2023-10-04 02:31:33,843 WARNING [train.py:1204] (2/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_318-59097-0 from training. Duration: 0.76 2023-10-04 02:31:33,929 WARNING [train.py:1204] (2/4) Exclude cut with ID _12369_210_4381_1_1530356078581_4388520_480-13146-0_sp1.1 from training. Duration: 0.77275 2023-10-04 02:31:36,818 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_469-338558-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 02:31:36,883 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_40896_210_15710_1_1533433956_7100802_367-126897-0_sp1.1 from training. Duration: 0.963625 2023-10-04 02:31:36,911 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6545_210_2040_1_1527774670_4245258_379-14182-0 from training. Duration: 0.8 2023-10-04 02:31:36,923 WARNING [train.py:1204] (2/4) Exclude cut with ID _21631_210_2253_1_1531907880778_3160339_197-79498-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 02:31:36,927 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_12868_210_6753_1_1530702479_3610564_416-293746-0_sp1.1 from training. Duration: 0.8181875 2023-10-04 02:31:36,962 WARNING [train.py:1204] (2/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_52-211683-0 from training. Duration: 0.9 2023-10-04 02:31:42,975 WARNING [train.py:1204] (2/4) Exclude cut with ID _48678_210_5663_1_1533988830947_3769199_306-255886-0_sp0.9 from training. Duration: 0.8555625 2023-10-04 02:31:46,183 WARNING [train.py:1204] (2/4) Exclude cut with ID _65134_210_14763_1_1535335393985_3505368_296-24733-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 02:31:46,411 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.bypass_mid.scale_min, batch_count=1494000.0, ans=0.2 2023-10-04 02:31:50,913 WARNING [train.py:1204] (2/4) Exclude cut with ID _14898_210_9206_1_1530766812377_4177190_335-99364-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 02:31:52,402 WARNING [train.py:1204] (2/4) Exclude cut with ID _54871_210_22356_1_1534665798253_3856359_19-342632-0 from training. Duration: 0.91 2023-10-04 02:31:52,419 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_53071_210_16554_1_1534490405_2954066_265-170472-0 from training. Duration: 0.83 2023-10-04 02:31:55,287 WARNING [train.py:1204] (2/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_189-298172-0_sp1.1 from training. Duration: 0.963625 2023-10-04 02:31:56,472 INFO [train.py:1046] (2/4) Epoch 43, batch 1000, loss[loss=0.1372, simple_loss=0.1985, pruned_loss=0.03795, over 19339.00 frames. ], tot_loss[loss=0.1561, simple_loss=0.2367, pruned_loss=0.03773, over 4676768.49 frames. ], batch size: 390, lr: 2.37e-03, grad_scale: 8.0 2023-10-04 02:31:59,360 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7615_210_4211_1_1528110965_4580160_72-235211-0 from training. Duration: 0.76 2023-10-04 02:31:59,917 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.4.encoder.layers.0.nonlin_attention.whiten2, num_groups=1, num_channels=384, metric=9.54 vs. limit=15.0 2023-10-04 02:32:00,724 WARNING [train.py:1204] (2/4) Exclude cut with ID _47790_210_16187_1_1533862814066_4207519_155-90874-0_sp1.1 from training. Duration: 0.936375 2023-10-04 02:32:04,242 WARNING [train.py:1204] (2/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_164-216310-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 02:32:05,641 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_46013_210_7234_1_1533776362_3744032_463-52378-0 from training. Duration: 0.86 2023-10-04 02:32:05,644 WARNING [train.py:1204] (2/4) Exclude cut with ID _61133_210_9969_1_1535196777227_3696409_333-270325-0 from training. Duration: 0.93 2023-10-04 02:32:07,192 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.conv_module2.balancer1.prob, batch_count=1494066.6666666667, ans=0.125 2023-10-04 02:32:11,213 WARNING [train.py:1204] (2/4) Exclude cut with ID _5172_210_1883_1_1527246062567_3577219_18-101380-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 02:32:11,221 WARNING [train.py:1204] (2/4) Exclude cut with ID _30800_210_22382_1_1532499347904_4515959_577-236858-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 02:32:12,626 WARNING [train.py:1204] (2/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_1006-177345-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 02:32:14,589 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6291_210_3273_1_1527683649_1078498_4-266348-0 from training. Duration: 0.78 2023-10-04 02:32:14,741 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.ff2_skip_rate, batch_count=1494133.3333333333, ans=0.0 2023-10-04 02:32:19,505 WARNING [train.py:1204] (2/4) Exclude cut with ID _55857_210_2346_1_1534644007831_6898209_745-98391-0 from training. Duration: 0.76 2023-10-04 02:32:20,795 WARNING [train.py:1204] (2/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_375-244976-0 from training. Duration: 0.75 2023-10-04 02:32:20,847 WARNING [train.py:1204] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_4-248404-0_sp1.1 from training. Duration: 0.97275 2023-10-04 02:32:22,273 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8453_210_4211_1_1528502940_8562730_596-306820-0 from training. Duration: 0.97 2023-10-04 02:32:22,413 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.0.feed_forward3.hidden_balancer.prob, batch_count=1494133.3333333333, ans=0.125 2023-10-04 02:32:24,849 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_211-339622-0_sp0.9 from training. Duration: 0.5 2023-10-04 02:32:24,896 WARNING [train.py:1204] (2/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_393-57888-0 from training. Duration: 0.64 2023-10-04 02:32:26,232 WARNING [train.py:1204] (2/4) Exclude cut with ID _23825_210_13449_1_1531915313526_3497150_143-343575-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 02:32:27,575 WARNING [train.py:1204] (2/4) Exclude cut with ID _58605_210_12210_1_1534723195939_3904259_36-12256-0_sp1.1 from training. Duration: 0.936375 2023-10-04 02:32:33,275 WARNING [train.py:1204] (2/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_38-67795-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 02:32:35,268 WARNING [train.py:1204] (2/4) Exclude cut with ID _64631_210_27767_1_1535349580068_4165160_308-327832-0_sp1.1 from training. Duration: 0.92725 2023-10-04 02:32:35,324 WARNING [train.py:1204] (2/4) Exclude cut with ID _37086_210_9038_1_1532999540891_7021250_726-81176-0_sp1.1 from training. Duration: 0.936375 2023-10-04 02:32:35,385 WARNING [train.py:1204] (2/4) Exclude cut with ID _47790_210_16187_1_1533862814066_4207519_47-141521-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 02:32:36,675 WARNING [train.py:1204] (2/4) Exclude cut with ID _3067_210_2667_1_1526212974640_4503168_250-285937-0 from training. Duration: 0.8 2023-10-04 02:32:36,713 WARNING [train.py:1204] (2/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_239-24000-0_sp1.1 from training. Duration: 0.97275 2023-10-04 02:32:38,051 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_3371_210_1794_1_1526384462_5580777_396-339812-0_sp0.9 from training. Duration: 0.9444375 2023-10-04 02:32:38,103 WARNING [train.py:1204] (2/4) Exclude cut with ID _16909_210_5724_1_1531292079912_4118919_580-173454-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 02:32:39,443 WARNING [train.py:1204] (2/4) Exclude cut with ID IC0124W0002-61308 from training. Duration: 0.982 2023-10-04 02:32:42,293 WARNING [train.py:1204] (2/4) Exclude cut with ID _31602_210_5854_1_1532677021456_4458250_249-96604-0 from training. Duration: 0.89 2023-10-04 02:32:44,187 WARNING [train.py:1204] (2/4) Exclude cut with ID _69584_210_5219_1_1535785133775_4044450_325-7656-0 from training. Duration: 0.86 2023-10-04 02:32:46,115 WARNING [train.py:1204] (2/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_478-162247-0 from training. Duration: 0.82 2023-10-04 02:32:48,052 WARNING [train.py:1204] (2/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_686-54383-0_sp1.1 from training. Duration: 0.7909375 2023-10-04 02:32:54,932 WARNING [train.py:1204] (2/4) Exclude cut with ID _29303_210_2575_1_1532392237303_7410649_85-270092-0_sp1.1 from training. Duration: 0.936375 2023-10-04 02:32:54,943 WARNING [train.py:1204] (2/4) Exclude cut with ID _2322_210_2667_1_1525604548971_3656299_440-80494-0_sp1.1 from training. Duration: 0.8 2023-10-04 02:32:56,296 WARNING [train.py:1204] (2/4) Exclude cut with ID _47346_210_9476_1_1533815971553_3536160_201-40644-0_sp1.1 from training. Duration: 0.936375 2023-10-04 02:32:57,682 WARNING [train.py:1204] (2/4) Exclude cut with ID _56235_210_10640_1_1534557335235_4161099_343-139898-0_sp1.1 from training. Duration: 0.87275 2023-10-04 02:32:59,104 WARNING [train.py:1204] (2/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_1012-279473-0 from training. Duration: 0.67 2023-10-04 02:32:59,197 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7931_210_1794_1_1528594728_5564059_280-279560-0_sp1.1 from training. Duration: 0.8909375 2023-10-04 02:32:59,244 WARNING [train.py:1204] (2/4) Exclude cut with ID _8263_210_1664_1_1528439452891_970220_68-181805-0 from training. Duration: 0.64 2023-10-04 02:33:00,744 WARNING [train.py:1204] (2/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_605-133395-0 from training. Duration: 0.66 2023-10-04 02:33:02,140 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_43-171109-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 02:33:02,143 WARNING [train.py:1204] (2/4) Exclude cut with ID _40094_210_16380_1_1533294005773_3641159_107-280739-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 02:33:02,381 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.feed_forward2.hidden_balancer.prob, batch_count=1494333.3333333333, ans=0.125 2023-10-04 02:33:04,829 WARNING [train.py:1204] (2/4) Exclude cut with ID _14821_210_8750_1_1530680503991_6829359_2-227151-0_sp0.9 from training. Duration: 0.92225 2023-10-04 02:33:08,101 WARNING [train.py:1204] (2/4) Exclude cut with ID _14268_210_9206_1_1530622771285_4714099_331-105351-0_sp1.1 from training. Duration: 0.5909375 2023-10-04 02:33:09,499 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_26326_210_7925_1_1533002493_3592406_451-82741-0_sp1.1 from training. Duration: 0.97275 2023-10-04 02:33:10,793 INFO [train.py:1046] (2/4) Epoch 43, batch 1050, loss[loss=0.1592, simple_loss=0.2363, pruned_loss=0.041, over 23127.00 frames. ], tot_loss[loss=0.1547, simple_loss=0.2343, pruned_loss=0.03759, over 4678637.43 frames. ], batch size: 105, lr: 2.37e-03, grad_scale: 8.0 2023-10-04 02:33:13,687 WARNING [train.py:1204] (2/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_191-59784-0_sp0.9 from training. Duration: 0.97775 2023-10-04 02:33:15,118 WARNING [train.py:1204] (2/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_445-117398-0_sp1.1 from training. Duration: 0.8090625 2023-10-04 02:33:16,470 WARNING [train.py:1204] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_605-257205-0_sp0.9 from training. Duration: 0.6555625 2023-10-04 02:33:18,419 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_50050_210_16223_1_1535435796_7290138_592-258841-0_sp1.1 from training. Duration: 0.936375 2023-10-04 02:33:19,994 WARNING [train.py:1204] (2/4) Exclude cut with ID _14348_210_8341_1_1530599434996_3778640_240-232370-0_sp0.9 from training. Duration: 0.9333125 2023-10-04 02:33:22,657 WARNING [train.py:1204] (2/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_239-316802-0_sp0.9 from training. Duration: 0.7666875 2023-10-04 02:33:22,777 WARNING [train.py:1204] (2/4) Exclude cut with ID _45560_210_21982_1_1533812603951_3584999_111-6376-0_sp1.1 from training. Duration: 0.763625 2023-10-04 02:33:25,553 WARNING [train.py:1204] (2/4) Exclude cut with ID _54189_210_16380_1_1534332670039_3911710_606-311481-0_sp1.1 from training. Duration: 0.963625 2023-10-04 02:33:26,868 WARNING [train.py:1204] (2/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_237-332027-0_sp1.1 from training. Duration: 0.863625 2023-10-04 02:33:26,892 WARNING [train.py:1204] (2/4) Exclude cut with ID _66070_210_7640_1_1535441393096_7805310_489-292631-0_sp0.9 from training. Duration: 0.87775 2023-10-04 02:33:28,249 WARNING [train.py:1204] (2/4) Exclude cut with ID _49728_210_12651_1_1534041034294_6788150_624-195301-0_sp1.1 from training. Duration: 0.77275 2023-10-04 02:33:28,290 WARNING [train.py:1204] (2/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_44-79427-0 from training. Duration: 0.98 2023-10-04 02:33:29,661 WARNING [train.py:1204] (2/4) Exclude cut with ID _35350_210_16040_1_1533040191183_4075299_490-198354-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 02:33:29,713 WARNING [train.py:1204] (2/4) Exclude cut with ID _5166_210_2084_1_1527073197160_4087993_309-222545-0 from training. Duration: 0.84 2023-10-04 02:33:32,485 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_13091_210_7925_1_1530609447_8987354_790-199469-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 02:33:32,493 WARNING [train.py:1204] (2/4) Exclude cut with ID _21632_210_2253_1_1531994001765_3744140_110-274654-0 from training. Duration: 0.82 2023-10-04 02:33:32,499 WARNING [train.py:1204] (2/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_498-40153-0_sp0.9 from training. Duration: 0.7 2023-10-04 02:33:39,921 WARNING [train.py:1204] (2/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_363-282909-0_sp1.1 from training. Duration: 0.936375 2023-10-04 02:33:39,976 WARNING [train.py:1204] (2/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_178-177582-0_sp0.9 from training. Duration: 0.82225 2023-10-04 02:33:39,994 WARNING [train.py:1204] (2/4) Exclude cut with ID _24612_210_8913_1_1531998054193_3806359_357-35487-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 02:33:40,171 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.bypass_mid.scale_min, batch_count=1494533.3333333333, ans=0.2 2023-10-04 02:33:42,726 WARNING [train.py:1204] (2/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_138-332773-0 from training. Duration: 0.81 2023-10-04 02:33:42,748 WARNING [train.py:1204] (2/4) Exclude cut with ID _7624_210_1531_1_1528109802198_4933101_418-349834-0 from training. Duration: 0.6 2023-10-04 02:33:42,782 WARNING [train.py:1204] (2/4) Exclude cut with ID _7537_210_2084_1_1528502395431_3924359_14-312906-0_sp0.9 from training. Duration: 0.9333125 2023-10-04 02:33:44,284 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.attention_skip_rate, batch_count=1494533.3333333333, ans=0.0 2023-10-04 02:33:45,570 WARNING [train.py:1204] (2/4) Exclude cut with ID _60057_210_10640_1_1534903065793_3333150_362-230377-0 from training. Duration: 0.87 2023-10-04 02:33:50,650 WARNING [train.py:1204] (2/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_138-64210-0 from training. Duration: 0.73 2023-10-04 02:33:50,716 WARNING [train.py:1204] (2/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_454-339504-0_sp1.1 from training. Duration: 0.97275 2023-10-04 02:33:50,937 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=1494533.3333333333, ans=0.1 2023-10-04 02:33:52,408 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.bypass_mid.scale_min, batch_count=1494533.3333333333, ans=0.2 2023-10-04 02:33:53,704 WARNING [train.py:1204] (2/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_508-335369-0_sp0.9 from training. Duration: 0.5333125 2023-10-04 02:33:55,370 WARNING [train.py:1204] (2/4) Exclude cut with ID _5608_210_2106_1_1527415439089_6819768_909-82767-0_sp0.9 from training. Duration: 0.67775 2023-10-04 02:33:55,403 WARNING [train.py:1204] (2/4) Exclude cut with ID _6694_210_4599_1_1527852637490_3965529_99-11611-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 02:33:56,065 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.2.encoder.layers.1.conv_module2.whiten, num_groups=1, num_channels=384, metric=6.40 vs. limit=15.0 2023-10-04 02:33:56,733 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_2655_210_2087_1_1525688959_3897400_52-279899-0_sp1.1 from training. Duration: 0.62725 2023-10-04 02:33:59,496 WARNING [train.py:1204] (2/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_375-245677-0_sp1.1 from training. Duration: 0.736375 2023-10-04 02:34:01,099 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.ff2_skip_rate, batch_count=1494600.0, ans=0.0 2023-10-04 02:34:02,038 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.636e+02 1.937e+02 2.146e+02 2.350e+02 6.827e+02, threshold=4.291e+02, percent-clipped=1.0 2023-10-04 02:34:03,626 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_23850_210_4238_1_1532232597_1632433_32-185135-0 from training. Duration: 0.86 2023-10-04 02:34:04,978 WARNING [train.py:1204] (2/4) Exclude cut with ID _15113_210_8254_1_1530842438579_3716279_209-54533-0 from training. Duration: 0.96 2023-10-04 02:34:06,454 WARNING [train.py:1204] (2/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_401-53423-0 from training. Duration: 0.55 2023-10-04 02:34:06,492 WARNING [train.py:1204] (2/4) Exclude cut with ID _14867_210_1968_1_1530684179890_3846969_319-250868-0_sp1.1 from training. Duration: 0.9 2023-10-04 02:34:06,508 WARNING [train.py:1204] (2/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_106-30782-0_sp0.9 from training. Duration: 0.888875 2023-10-04 02:34:08,544 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_5960_210_1794_1_1527403865_3990999_515-2613-0 from training. Duration: 0.88 2023-10-04 02:34:12,778 WARNING [train.py:1204] (2/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_798-106179-0_sp0.9 from training. Duration: 0.92225 2023-10-04 02:34:12,930 WARNING [train.py:1204] (2/4) Exclude cut with ID _14227_210_4381_1_1530701804286_3940259_125-275642-0_sp1.1 from training. Duration: 0.9 2023-10-04 02:34:14,162 WARNING [train.py:1204] (2/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_79-200392-0_sp1.1 from training. Duration: 0.8818125 2023-10-04 02:34:14,215 WARNING [train.py:1204] (2/4) Exclude cut with ID _48836_210_12210_1_1534075848293_3904150_429-35475-0_sp0.9 from training. Duration: 0.988875 2023-10-04 02:34:14,235 WARNING [train.py:1204] (2/4) Exclude cut with ID _69884_210_9771_1_1535846392818_4166169_224-172517-0_sp1.1 from training. Duration: 0.97275 2023-10-04 02:34:19,643 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.feed_forward1.out_proj.dropout_p, batch_count=1494666.6666666667, ans=0.1 2023-10-04 02:34:20,031 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.4.encoder.layers.0.self_attn2.whiten, num_groups=1, num_channels=384, metric=12.44 vs. limit=22.5 2023-10-04 02:34:20,635 WARNING [train.py:1204] (2/4) Exclude cut with ID _15113_210_8254_1_1530842438579_3716279_76-232507-0_sp1.1 from training. Duration: 0.97275 2023-10-04 02:34:20,638 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_135-5501-0 from training. Duration: 0.98 2023-10-04 02:34:22,610 WARNING [train.py:1204] (2/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_563-347832-0_sp0.9 from training. Duration: 0.988875 2023-10-04 02:34:22,622 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6786_210_1388_1_1527839699_4194495_273-149841-0 from training. Duration: 0.92 2023-10-04 02:34:22,647 WARNING [train.py:1204] (2/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_172-215932-0 from training. Duration: 0.66 2023-10-04 02:34:22,710 WARNING [train.py:1204] (2/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_939-132910-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 02:34:25,352 INFO [train.py:1046] (2/4) Epoch 43, batch 1100, loss[loss=0.1474, simple_loss=0.2265, pruned_loss=0.03418, over 24444.00 frames. ], tot_loss[loss=0.1544, simple_loss=0.2341, pruned_loss=0.03738, over 4680288.70 frames. ], batch size: 58, lr: 2.37e-03, grad_scale: 8.0 2023-10-04 02:34:26,803 WARNING [train.py:1204] (2/4) Exclude cut with ID _49371_210_12071_1_1533952761021_3812190_16-246001-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 02:34:28,341 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=1494733.3333333333, ans=0.1 2023-10-04 02:34:32,210 WARNING [train.py:1204] (2/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_680-189165-0_sp1.1 from training. Duration: 0.936375 2023-10-04 02:34:36,379 WARNING [train.py:1204] (2/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_53-300544-0_sp1.1 from training. Duration: 0.6090625 2023-10-04 02:34:37,735 WARNING [train.py:1204] (2/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_540-275685-0_sp1.1 from training. Duration: 0.8090625 2023-10-04 02:34:37,759 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_38965_210_4852_1_1533174906_3484735_453-106579-0_sp1.1 from training. Duration: 0.92725 2023-10-04 02:34:37,800 WARNING [train.py:1204] (2/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_105-328174-0 from training. Duration: 0.76 2023-10-04 02:34:39,101 WARNING [train.py:1204] (2/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_279-126205-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 02:34:40,576 WARNING [train.py:1204] (2/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_5-55893-0_sp0.9 from training. Duration: 0.611125 2023-10-04 02:34:43,875 WARNING [train.py:1204] (2/4) Exclude cut with ID _13678_210_7497_1_1530530069226_3463170_141-10835-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 02:34:45,356 WARNING [train.py:1204] (2/4) Exclude cut with ID _22083_210_8254_1_1531702862439_3678970_201-85738-0_sp1.1 from training. Duration: 0.8181875 2023-10-04 02:34:47,144 WARNING [train.py:1204] (2/4) Exclude cut with ID _4697_210_2069_1_1526815801953_3823098_51-1049-0 from training. Duration: 0.75 2023-10-04 02:34:47,817 WARNING [train.py:1204] (2/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_206-10097-0_sp0.9 from training. Duration: 0.5444375 2023-10-04 02:34:48,902 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_4528_210_3273_1_1527076654_3276777_360-63661-0_sp1.1 from training. Duration: 0.92725 2023-10-04 02:34:48,910 WARNING [train.py:1204] (2/4) Exclude cut with ID _64086_210_7994_1_1535270417019_7435949_449-31567-0_sp1.1 from training. Duration: 0.8818125 2023-10-04 02:34:50,432 WARNING [train.py:1204] (2/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_245-174553-0_sp0.9 from training. Duration: 0.97775 2023-10-04 02:34:53,758 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6545_210_2040_1_1527774670_4245258_379-14182-0_sp1.1 from training. Duration: 0.72725 2023-10-04 02:34:59,517 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_3133_210_2087_1_1526106446_2324690_256-206070-0_sp1.1 from training. Duration: 0.963625 2023-10-04 02:35:01,057 WARNING [train.py:1204] (2/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_278-104676-0 from training. Duration: 0.98 2023-10-04 02:35:02,427 WARNING [train.py:1204] (2/4) Exclude cut with ID IC0671W0065-331908 from training. Duration: 0.896 2023-10-04 02:35:02,469 WARNING [train.py:1204] (2/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_244-129917-0_sp1.1 from training. Duration: 0.97275 2023-10-04 02:35:02,694 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.conv_skip_rate, batch_count=1494866.6666666667, ans=0.0 2023-10-04 02:35:05,089 WARNING [train.py:1204] (2/4) Exclude cut with ID _7852_210_5219_1_1528286471316_8139400_1129-170857-0_sp1.1 from training. Duration: 0.97275 2023-10-04 02:35:05,159 WARNING [train.py:1204] (2/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_443-30034-0_sp0.9 from training. Duration: 0.711125 2023-10-04 02:35:05,186 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_31583_210_3852_1_1532595851_4024413_250-332999-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 02:35:06,539 WARNING [train.py:1204] (2/4) Exclude cut with ID _8772_210_1850_1_1528635595972_3664011_247-336448-0 from training. Duration: 0.74 2023-10-04 02:35:06,599 WARNING [train.py:1204] (2/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_567-135114-0_sp0.9 from training. Duration: 0.9666875 2023-10-04 02:35:06,603 WARNING [train.py:1204] (2/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_557-349534-0_sp1.1 from training. Duration: 0.82725 2023-10-04 02:35:06,622 WARNING [train.py:1204] (2/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_1341-26728-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 02:35:08,030 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_40222_210_6228_1_1533385298_4481939_375-328051-0_sp1.1 from training. Duration: 0.97275 2023-10-04 02:35:08,066 WARNING [train.py:1204] (2/4) Exclude cut with ID _4697_210_2069_1_1526815801953_3823098_276-96593-0 from training. Duration: 0.77 2023-10-04 02:35:08,254 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.conv_module1.balancer2.prob, batch_count=1494933.3333333333, ans=0.125 2023-10-04 02:35:08,959 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.1.encoder.layers.1.self_attn_weights.whiten_keys, num_groups=4, num_channels=128, metric=2.65 vs. limit=6.0 2023-10-04 02:35:14,215 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_53071_210_16554_1_1534490405_2954066_265-170472-0_sp0.9 from training. Duration: 0.92225 2023-10-04 02:35:15,555 WARNING [train.py:1204] (2/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_365-1492-0 from training. Duration: 0.91 2023-10-04 02:35:15,701 WARNING [train.py:1204] (2/4) Exclude cut with ID _66873_210_5517_1_1535454136506_4293370_654-52713-0_sp0.9 from training. Duration: 0.8555625 2023-10-04 02:35:22,555 WARNING [train.py:1204] (2/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_113-92608-0_sp1.1 from training. Duration: 0.4909375 2023-10-04 02:35:24,292 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.conv_module2.balancer2.prob, batch_count=1495000.0, ans=0.125 2023-10-04 02:35:25,363 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_46013_210_7234_1_1533776362_3744032_50-141535-0 from training. Duration: 0.99 2023-10-04 02:35:25,371 WARNING [train.py:1204] (2/4) Exclude cut with ID _13942_210_9988_1_1530604906036_3733360_499-55676-0_sp1.1 from training. Duration: 0.563625 2023-10-04 02:35:25,492 WARNING [train.py:1204] (2/4) Exclude cut with ID _30800_210_22382_1_1532499347904_4515959_375-65143-0_sp1.1 from training. Duration: 0.97275 2023-10-04 02:35:27,016 WARNING [train.py:1204] (2/4) Exclude cut with ID _16756_210_4915_1_1531047803763_7104259_199-325608-0_sp1.1 from training. Duration: 0.92725 2023-10-04 02:35:28,296 WARNING [train.py:1204] (2/4) Exclude cut with ID _31649_210_6228_1_1532584608063_4091078_443-124391-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 02:35:29,701 WARNING [train.py:1204] (2/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_146-218763-0 from training. Duration: 0.86 2023-10-04 02:35:29,758 WARNING [train.py:1204] (2/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_567-135114-0_sp1.1 from training. Duration: 0.7909375 2023-10-04 02:35:31,049 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_26326_210_7925_1_1533002493_3592406_462-288299-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 02:35:31,132 WARNING [train.py:1204] (2/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_322-217220-0 from training. Duration: 0.86 2023-10-04 02:35:32,501 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_238-256668-0_sp1.1 from training. Duration: 0.62725 2023-10-04 02:35:32,546 WARNING [train.py:1204] (2/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_371-18169-0 from training. Duration: 0.86 2023-10-04 02:35:32,758 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.ff2_skip_rate, batch_count=1495000.0, ans=0.0 2023-10-04 02:35:33,913 WARNING [train.py:1204] (2/4) Exclude cut with ID _58480_210_12392_1_1534811121111_3646160_23-74512-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 02:35:33,918 WARNING [train.py:1204] (2/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_784-213099-0_sp1.1 from training. Duration: 0.8454375 2023-10-04 02:35:35,238 WARNING [train.py:1204] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_625-102955-0_sp1.1 from training. Duration: 0.7 2023-10-04 02:35:39,334 INFO [train.py:1046] (2/4) Epoch 43, batch 1150, loss[loss=0.1461, simple_loss=0.2386, pruned_loss=0.02677, over 24314.00 frames. ], tot_loss[loss=0.1553, simple_loss=0.2346, pruned_loss=0.038, over 4678905.89 frames. ], batch size: 74, lr: 2.37e-03, grad_scale: 8.0 2023-10-04 02:35:41,141 WARNING [train.py:1204] (2/4) Exclude cut with ID _49748_210_24116_1_1533986971737_3158910_176-174327-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 02:35:42,614 WARNING [train.py:1204] (2/4) Exclude cut with ID _50310_210_15488_1_1534068043345_3641080_432-96140-0_sp1.1 from training. Duration: 0.936375 2023-10-04 02:35:44,030 WARNING [train.py:1204] (2/4) Exclude cut with ID _40402_210_18219_1_1533522324115_2953150_76-14910-0_sp1.1 from training. Duration: 0.92725 2023-10-04 02:35:45,296 WARNING [train.py:1204] (2/4) Exclude cut with ID _21783_210_11357_1_1531742458730_3762679_40-19458-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 02:35:45,969 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_14056_210_4163_1_1530604483_7572827_46-93547-0 from training. Duration: 0.95 2023-10-04 02:35:47,288 WARNING [train.py:1204] (2/4) Exclude cut with ID _21631_210_2253_1_1531907880778_3160339_321-35325-0_sp1.1 from training. Duration: 0.963625 2023-10-04 02:35:49,889 WARNING [train.py:1204] (2/4) Exclude cut with ID _4779_210_2084_1_1527318022944_3914893_500-285013-0 from training. Duration: 0.69 2023-10-04 02:35:50,098 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.0.ff2_skip_rate, batch_count=1495066.6666666667, ans=0.0 2023-10-04 02:35:51,659 WARNING [train.py:1204] (2/4) Exclude cut with ID _43552_210_8913_1_1533726031059_3523429_301-93324-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 02:35:51,665 WARNING [train.py:1204] (2/4) Exclude cut with ID _6473_210_1741_1_1527901307396_3815380_108-17335-0_sp1.1 from training. Duration: 0.5909375 2023-10-04 02:35:54,229 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.4.encoder.layers.2.nonlin_attention.whiten1, num_groups=1, num_channels=288, metric=4.33 vs. limit=10.0 2023-10-04 02:35:59,072 WARNING [train.py:1204] (2/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_206-10097-0 from training. Duration: 0.49 2023-10-04 02:36:01,824 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_36756_210_7234_1_1533026991_966276_103-77513-0_sp1.1 from training. Duration: 0.97275 2023-10-04 02:36:03,403 WARNING [train.py:1204] (2/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_662-18193-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 02:36:04,745 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_26433_210_12108_1_1532157158_4056480_439-52438-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 02:36:04,789 WARNING [train.py:1204] (2/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_464-33931-0 from training. Duration: 0.54 2023-10-04 02:36:04,795 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_3474_210_2028_1_1526188513_4825246_145-309131-0_sp0.9 from training. Duration: 0.82225 2023-10-04 02:36:06,134 WARNING [train.py:1204] (2/4) Exclude cut with ID _9679_210_7497_1_1529129212906_4363929_446-308321-0_sp1.1 from training. Duration: 0.963625 2023-10-04 02:36:10,161 WARNING [train.py:1204] (2/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_662-275621-0 from training. Duration: 0.42 2023-10-04 02:36:11,499 WARNING [train.py:1204] (2/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_344-337148-0_sp1.1 from training. Duration: 0.97275 2023-10-04 02:36:12,858 WARNING [train.py:1204] (2/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_759-179071-0_sp1.1 from training. Duration: 0.92725 2023-10-04 02:36:18,008 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.conv_module2.balancer1.prob, batch_count=1495200.0, ans=0.125 2023-10-04 02:36:22,862 WARNING [train.py:1204] (2/4) Exclude cut with ID _28783_210_7943_1_1532671164808_7155479_293-12188-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 02:36:25,874 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.0.feed_forward1.out_proj.dropout_p, batch_count=1495266.6666666667, ans=0.1 2023-10-04 02:36:28,937 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_17613_210_2151_1_1531137569_2366160_235-320304-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 02:36:28,952 WARNING [train.py:1204] (2/4) Exclude cut with ID _14349_210_8341_1_1530615710818_3442470_170-336541-0 from training. Duration: 0.86 2023-10-04 02:36:30,277 WARNING [train.py:1204] (2/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_375-218677-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 02:36:30,323 WARNING [train.py:1204] (2/4) Exclude cut with ID _28317_210_12742_1_1532480302381_8510325_829-202956-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 02:36:31,604 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.642e+02 1.978e+02 2.214e+02 2.534e+02 4.016e+02, threshold=4.429e+02, percent-clipped=0.0 2023-10-04 02:36:36,097 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9029W0436-509823 from training. Duration: 0.768 2023-10-04 02:36:38,827 WARNING [train.py:1204] (2/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_231-112214-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 02:36:44,549 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9059W0470-524883 from training. Duration: 0.3415625 2023-10-04 02:36:49,419 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_43266_210_1794_1_1533521789_4497218_206-79512-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 02:36:52,046 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_294-163793-0_sp0.9 from training. Duration: 0.911125 2023-10-04 02:36:52,076 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6290_210_3273_1_1527595224_1282787_115-87095-0_sp1.1 from training. Duration: 0.5 2023-10-04 02:36:52,119 WARNING [train.py:1204] (2/4) Exclude cut with ID _14100_210_8341_1_1530529320613_3678069_279-55760-0_sp1.1 from training. Duration: 0.8454375 2023-10-04 02:36:53,396 INFO [train.py:1046] (2/4) Epoch 43, batch 1200, loss[loss=0.166, simple_loss=0.2403, pruned_loss=0.04586, over 23875.00 frames. ], tot_loss[loss=0.1554, simple_loss=0.2354, pruned_loss=0.03769, over 4706329.38 frames. ], batch size: 179, lr: 2.37e-03, grad_scale: 16.0 2023-10-04 02:36:55,438 WARNING [train.py:1204] (2/4) Exclude cut with ID _14621_210_1925_1_1530686901185_3675420_419-234200-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 02:36:58,450 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.conv_module1.balancer2.min_abs, batch_count=1495400.0, ans=0.5 2023-10-04 02:37:02,021 WARNING [train.py:1204] (2/4) Exclude cut with ID _60229_210_27767_1_1534838406158_3655350_353-213656-0_sp1.1 from training. Duration: 0.863625 2023-10-04 02:37:02,030 WARNING [train.py:1204] (2/4) Exclude cut with ID _48678_210_5663_1_1533988830947_3769199_306-255886-0_sp1.1 from training. Duration: 0.7 2023-10-04 02:37:03,378 WARNING [train.py:1204] (2/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_466-3492-0_sp1.1 from training. Duration: 0.97275 2023-10-04 02:37:03,378 WARNING [train.py:1204] (2/4) Exclude cut with ID _8755_210_1876_1_1528628559357_3258209_199-242167-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 02:37:03,436 WARNING [train.py:1204] (2/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_237-89684-0_sp1.1 from training. Duration: 0.87275 2023-10-04 02:37:04,853 WARNING [train.py:1204] (2/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_70-84121-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 02:37:06,286 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7544_210_6753_1_1528587722_3576686_172-171750-0_sp1.1 from training. Duration: 0.5909375 2023-10-04 02:37:07,630 WARNING [train.py:1204] (2/4) Exclude cut with ID _24271_210_12658_1_1531987270022_7190519_40-321822-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 02:37:07,646 WARNING [train.py:1204] (2/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_227-83612-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 02:37:11,483 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9156W0015-572508 from training. Duration: 0.896 2023-10-04 02:37:12,900 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_292-265928-0 from training. Duration: 0.51 2023-10-04 02:37:17,100 WARNING [train.py:1204] (2/4) Exclude cut with ID _14747_210_8341_1_1530685797463_3654089_147-131229-0_sp1.1 from training. Duration: 0.6909375 2023-10-04 02:37:18,587 WARNING [train.py:1204] (2/4) Exclude cut with ID _45297_210_2721_1_1533715351381_3264949_351-147136-0_sp0.9 from training. Duration: 0.9666875 2023-10-04 02:37:20,625 WARNING [train.py:1204] (2/4) Exclude cut with ID _5250_210_5219_1_1527249561806_8692110_1102-324568-0_sp1.1 from training. Duration: 0.97275 2023-10-04 02:37:23,384 WARNING [train.py:1204] (2/4) Exclude cut with ID _18516_210_9206_1_1531704653206_3692406_465-75446-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 02:37:23,385 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9203W0126-595623 from training. Duration: 0.896 2023-10-04 02:37:25,113 WARNING [train.py:1204] (2/4) Exclude cut with ID _55383_210_10635_1_1534468426172_2231669_139-328410-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 02:37:25,917 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.3.encoder.layers.2.self_attn_weights.whiten_keys, num_groups=8, num_channels=256, metric=3.06 vs. limit=6.0 2023-10-04 02:37:32,510 WARNING [train.py:1204] (2/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_166-246557-0_sp0.9 from training. Duration: 0.711125 2023-10-04 02:37:32,516 WARNING [train.py:1204] (2/4) Exclude cut with ID _30039_210_13270_1_1532484612900_2725150_239-304872-0_sp1.1 from training. Duration: 0.963625 2023-10-04 02:37:32,549 WARNING [train.py:1204] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_6-226750-0 from training. Duration: 0.94 2023-10-04 02:37:33,979 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_135-5501-0_sp1.1 from training. Duration: 0.8909375 2023-10-04 02:37:36,792 WARNING [train.py:1204] (2/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_359-118926-0 from training. Duration: 0.72 2023-10-04 02:37:37,254 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.4.encoder.layers.2.whiten, num_groups=1, num_channels=384, metric=3.48 vs. limit=12.0 2023-10-04 02:37:40,950 WARNING [train.py:1204] (2/4) Exclude cut with ID _8064_210_5399_1_1528372299291_4282973_4-91037-0 from training. Duration: 0.88 2023-10-04 02:37:40,974 WARNING [train.py:1204] (2/4) Exclude cut with ID _6653_210_2629_1_1527919172441_6732679_415-288492-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 02:37:42,307 WARNING [train.py:1204] (2/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_544-287179-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 02:37:43,705 WARNING [train.py:1204] (2/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_122-32606-0_sp1.1 from training. Duration: 0.92725 2023-10-04 02:37:45,052 WARNING [train.py:1204] (2/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_121-284152-0_sp1.1 from training. Duration: 0.536375 2023-10-04 02:37:45,160 WARNING [train.py:1204] (2/4) Exclude cut with ID _44718_210_5816_1_1534050051869_6469030_350-299477-0_sp1.1 from training. Duration: 0.97275 2023-10-04 02:37:46,483 WARNING [train.py:1204] (2/4) Exclude cut with ID _6653_210_2629_1_1527919172441_6732679_1103-56445-0_sp0.9 from training. Duration: 0.811125 2023-10-04 02:37:46,540 WARNING [train.py:1204] (2/4) Exclude cut with ID _12369_210_4381_1_1530356078581_4388520_64-298917-0_sp0.9 from training. Duration: 0.97775 2023-10-04 02:37:46,595 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_2245_210_2715_1_1525511548_3540254_310-52483-0 from training. Duration: 0.88 2023-10-04 02:37:47,944 WARNING [train.py:1204] (2/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_401-158239-0_sp1.1 from training. Duration: 0.6454375 2023-10-04 02:37:47,980 WARNING [train.py:1204] (2/4) Exclude cut with ID _59502_210_12210_1_1535107024698_1835139_146-162495-0_sp1.1 from training. Duration: 0.836375 2023-10-04 02:37:47,998 WARNING [train.py:1204] (2/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_41-31157-0_sp0.9 from training. Duration: 0.6333125 2023-10-04 02:37:51,232 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_68717_210_3528_1_1535628683_1556759_95-36722-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 02:37:51,239 WARNING [train.py:1204] (2/4) Exclude cut with ID _30196_210_1664_1_1533708496522_7074140_654-162611-0_sp1.1 from training. Duration: 0.92725 2023-10-04 02:37:53,287 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_9302_210_2550_1_1528889962_4655416_638-301620-0_sp0.9 from training. Duration: 0.52225 2023-10-04 02:37:55,707 WARNING [train.py:1204] (2/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_478-162247-0_sp1.1 from training. Duration: 0.7454375 2023-10-04 02:37:57,834 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.3.encoder.layers.0.conv_module2.whiten, num_groups=1, num_channels=512, metric=3.01 vs. limit=15.0 2023-10-04 02:37:58,504 WARNING [train.py:1204] (2/4) Exclude cut with ID _45297_210_2721_1_1533715351381_3264949_353-9541-0 from training. Duration: 0.97 2023-10-04 02:38:01,782 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9653W0190-675468 from training. Duration: 0.9935 2023-10-04 02:38:03,729 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_49730_210_13049_1_1534075294_1830676_214-84436-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 02:38:06,474 WARNING [train.py:1204] (2/4) Exclude cut with ID _50093_210_4220_1_1534417141207_3432299_259-322252-0_sp1.1 from training. Duration: 0.836375 2023-10-04 02:38:07,762 INFO [train.py:1046] (2/4) Epoch 43, batch 1250, loss[loss=0.1296, simple_loss=0.2125, pruned_loss=0.02336, over 19290.00 frames. ], tot_loss[loss=0.1561, simple_loss=0.2361, pruned_loss=0.03802, over 4710638.46 frames. ], batch size: 42, lr: 2.37e-03, grad_scale: 16.0 2023-10-04 02:38:07,857 WARNING [train.py:1204] (2/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_599-50220-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 02:38:10,563 WARNING [train.py:1204] (2/4) Exclude cut with ID _21878_210_3525_1_1531738772002_7264330_174-109016-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 02:38:10,747 WARNING [train.py:1204] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_665-196155-0 from training. Duration: 0.67 2023-10-04 02:38:14,860 WARNING [train.py:1204] (2/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_656-188361-0_sp1.1 from training. Duration: 0.7909375 2023-10-04 02:38:16,184 WARNING [train.py:1204] (2/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_60-18123-0_sp1.1 from training. Duration: 0.8 2023-10-04 02:38:17,315 WARNING [train.py:1204] (2/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_535-14410-0 from training. Duration: 0.47 2023-10-04 02:38:17,467 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.0.conv_module2.balancer1.prob, batch_count=1495733.3333333333, ans=0.125 2023-10-04 02:38:18,774 WARNING [train.py:1204] (2/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_94-126128-0_sp1.1 from training. Duration: 0.7818125 2023-10-04 02:38:20,530 WARNING [train.py:1204] (2/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_356-31216-0_sp1.1 from training. Duration: 0.6818125 2023-10-04 02:38:24,742 WARNING [train.py:1204] (2/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_443-30034-0_sp1.1 from training. Duration: 0.5818125 2023-10-04 02:38:26,074 WARNING [train.py:1204] (2/4) Exclude cut with ID _26907_210_7234_1_1532221740694_3205263_337-2494-0_sp1.1 from training. Duration: 0.8 2023-10-04 02:38:28,021 WARNING [train.py:1204] (2/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_49-97158-0_sp0.9 from training. Duration: 0.8444375 2023-10-04 02:38:28,040 WARNING [train.py:1204] (2/4) Exclude cut with ID _7877_210_6014_1_1528286358099_3750136_553-170128-0_sp1.1 from training. Duration: 0.963625 2023-10-04 02:38:31,035 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.nonlin_attention.balancer.prob, batch_count=1495800.0, ans=0.125 2023-10-04 02:38:32,086 WARNING [train.py:1204] (2/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_202-293523-0_sp0.9 from training. Duration: 0.8 2023-10-04 02:38:34,945 WARNING [train.py:1204] (2/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_188-171673-0_sp0.9 from training. Duration: 0.6444375 2023-10-04 02:38:34,966 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_120-41668-0_sp1.1 from training. Duration: 0.636375 2023-10-04 02:38:34,971 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_40223_210_6228_1_1533298404_4812267_613-78769-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 02:38:38,202 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_53861_210_5399_1_1534422156_4272077_150-336899-0_sp1.1 from training. Duration: 0.963625 2023-10-04 02:38:38,270 WARNING [train.py:1204] (2/4) Exclude cut with ID _14459_210_2228_1_1531443641946_7516700_1146-56843-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 02:38:41,055 WARNING [train.py:1204] (2/4) Exclude cut with ID _39867_210_9826_1_1533553222030_9344259_1194-299928-0_sp1.1 from training. Duration: 0.92725 2023-10-04 02:38:42,456 WARNING [train.py:1204] (2/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_214-291369-0_sp0.9 from training. Duration: 0.7 2023-10-04 02:38:46,593 WARNING [train.py:1204] (2/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_358-215558-0 from training. Duration: 0.93 2023-10-04 02:38:46,664 WARNING [train.py:1204] (2/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_577-287150-0_sp0.9 from training. Duration: 0.911125 2023-10-04 02:38:49,524 WARNING [train.py:1204] (2/4) Exclude cut with ID _60860_210_8254_1_1534896015875_3557169_229-187367-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 02:38:50,895 WARNING [train.py:1204] (2/4) Exclude cut with ID _6275_210_2084_1_1528110062213_7669049_857-298942-0 from training. Duration: 0.85 2023-10-04 02:38:52,715 WARNING [train.py:1204] (2/4) Exclude cut with ID _40137_210_12210_1_1533290484046_3795180_249-187059-0_sp1.1 from training. Duration: 0.8 2023-10-04 02:38:52,722 WARNING [train.py:1204] (2/4) Exclude cut with ID ID0171W0508-765603 from training. Duration: 0.541 2023-10-04 02:38:52,739 WARNING [train.py:1204] (2/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_521-219439-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 02:38:52,751 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_40223_210_6228_1_1533298404_4812267_741-302616-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 02:38:55,573 WARNING [train.py:1204] (2/4) Exclude cut with ID _65293_210_9070_1_1535545549765_4004210_438-154735-0_sp1.1 from training. Duration: 0.92725 2023-10-04 02:38:58,746 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.566e+02 1.956e+02 2.157e+02 2.335e+02 3.543e+02, threshold=4.314e+02, percent-clipped=0.0 2023-10-04 02:38:58,834 WARNING [train.py:1204] (2/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_564-132950-0_sp1.1 from training. Duration: 0.92725 2023-10-04 02:38:58,871 WARNING [train.py:1204] (2/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_291-270881-0_sp1.1 from training. Duration: 0.8818125 2023-10-04 02:39:00,677 WARNING [train.py:1204] (2/4) Exclude cut with ID _60229_210_27767_1_1534838406158_3655350_353-213656-0 from training. Duration: 0.95 2023-10-04 02:39:00,695 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_3133_210_2087_1_1526106446_2324690_243-143119-0 from training. Duration: 0.77 2023-10-04 02:39:01,982 WARNING [train.py:1204] (2/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_690-126306-0 from training. Duration: 0.78 2023-10-04 02:39:02,254 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.conv_module2.balancer1.prob, batch_count=1495933.3333333333, ans=0.125 2023-10-04 02:39:03,463 WARNING [train.py:1204] (2/4) Exclude cut with ID _30304_210_6319_1_1532476827261_7582060_347-95003-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 02:39:04,875 WARNING [train.py:1204] (2/4) Exclude cut with ID _2322_210_2667_1_1525604548971_3656299_440-80494-0 from training. Duration: 0.88 2023-10-04 02:39:04,903 WARNING [train.py:1204] (2/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_370-229434-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 02:39:08,366 WARNING [train.py:1204] (2/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_124-345840-0_sp1.1 from training. Duration: 0.47275 2023-10-04 02:39:08,374 WARNING [train.py:1204] (2/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_165-123581-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 02:39:11,107 WARNING [train.py:1204] (2/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_453-282588-0 from training. Duration: 0.98 2023-10-04 02:39:11,130 WARNING [train.py:1204] (2/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_299-34413-0_sp0.9 from training. Duration: 0.72225 2023-10-04 02:39:12,454 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_40036_210_13405_1_1533648647_1315388_48-146016-0_sp1.1 from training. Duration: 0.8454375 2023-10-04 02:39:12,478 WARNING [train.py:1204] (2/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_99-167392-0_sp0.9 from training. Duration: 0.67775 2023-10-04 02:39:12,540 WARNING [train.py:1204] (2/4) Exclude cut with ID _48677_210_5663_1_1533902405919_3892989_37-207114-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 02:39:15,219 WARNING [train.py:1204] (2/4) Exclude cut with ID _29857_210_20034_1_1532395886753_3798160_335-163562-0 from training. Duration: 0.85 2023-10-04 02:39:16,693 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_36492_210_7927_1_1533261987_4790834_703-343351-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 02:39:18,043 WARNING [train.py:1204] (2/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_80-147301-0_sp0.9 from training. Duration: 0.9333125 2023-10-04 02:39:18,120 WARNING [train.py:1204] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_218-295097-0_sp0.9 from training. Duration: 0.8555625 2023-10-04 02:39:20,696 INFO [train.py:1046] (2/4) Epoch 43, batch 1300, loss[loss=0.1648, simple_loss=0.251, pruned_loss=0.03934, over 23280.00 frames. ], tot_loss[loss=0.1562, simple_loss=0.2368, pruned_loss=0.03778, over 4716866.30 frames. ], batch size: 93, lr: 2.37e-03, grad_scale: 16.0 2023-10-04 02:39:20,834 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_2295_210_2087_1_1525500993_2407863_116-238894-0_sp1.1 from training. Duration: 0.636375 2023-10-04 02:39:24,335 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.conv_module2.balancer1.prob, batch_count=1496066.6666666667, ans=0.125 2023-10-04 02:39:25,521 WARNING [train.py:1204] (2/4) Exclude cut with ID _3231_210_2106_1_1526205864934_6784220_697-171068-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 02:39:26,751 WARNING [train.py:1204] (2/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_102-77064-0 from training. Duration: 0.93 2023-10-04 02:39:30,054 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_13091_210_7925_1_1530609447_8987354_1186-261957-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 02:39:32,105 WARNING [train.py:1204] (2/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_91-277684-0_sp1.1 from training. Duration: 0.67275 2023-10-04 02:39:32,190 WARNING [train.py:1204] (2/4) Exclude cut with ID _31088_210_16446_1_1532520012284_3440919_159-313751-0_sp1.1 from training. Duration: 0.936375 2023-10-04 02:39:33,697 WARNING [train.py:1204] (2/4) Exclude cut with ID _42326_210_7770_1_1533814282531_3707269_227-203721-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 02:39:35,024 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_3531_210_3528_1_1526294203_5216456_309-227302-0_sp1.1 from training. Duration: 0.62725 2023-10-04 02:39:36,352 WARNING [train.py:1204] (2/4) Exclude cut with ID _14087_210_2253_1_1530529224512_3014029_226-242140-0 from training. Duration: 0.81 2023-10-04 02:39:38,449 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.balancer1.prob, batch_count=1496133.3333333333, ans=0.125 2023-10-04 02:39:41,065 WARNING [train.py:1204] (2/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_122-109023-0_sp1.1 from training. Duration: 0.6909375 2023-10-04 02:39:42,427 WARNING [train.py:1204] (2/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_660-211088-0_sp0.9 from training. Duration: 0.688875 2023-10-04 02:39:42,593 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.conv_module1.balancer1.min_positive, batch_count=1496133.3333333333, ans=0.025 2023-10-04 02:39:43,800 WARNING [train.py:1204] (2/4) Exclude cut with ID _39290_210_4220_1_1533862778760_3075118_92-311600-0 from training. Duration: 0.88 2023-10-04 02:39:47,935 WARNING [train.py:1204] (2/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_390-339137-0_sp1.1 from training. Duration: 0.5090625 2023-10-04 02:39:49,502 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder_embed.convnext.out_balancer.prob, batch_count=1496200.0, ans=0.125 2023-10-04 02:39:50,747 WARNING [train.py:1204] (2/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_449-341946-0_sp1.1 from training. Duration: 0.963625 2023-10-04 02:39:50,821 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_68095_210_20185_1_1535718226_8888855_884-327007-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 02:39:52,693 WARNING [train.py:1204] (2/4) Exclude cut with ID _42326_210_7770_1_1533814282531_3707269_308-222998-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 02:39:54,055 WARNING [train.py:1204] (2/4) Exclude cut with ID _40399_210_4353_1_1533305658194_3279406_344-238708-0_sp1.1 from training. Duration: 0.963625 2023-10-04 02:39:55,442 WARNING [train.py:1204] (2/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_87-131427-0_sp1.1 from training. Duration: 0.7090625 2023-10-04 02:39:55,497 WARNING [train.py:1204] (2/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_110-130961-0_sp0.9 from training. Duration: 0.52225 2023-10-04 02:39:56,902 WARNING [train.py:1204] (2/4) Exclude cut with ID _55153_210_2253_1_1534384786677_3162096_148-79022-0 from training. Duration: 0.96 2023-10-04 02:40:03,476 WARNING [train.py:1204] (2/4) Exclude cut with ID _9679_210_7497_1_1529129212906_4363929_512-313889-0_sp1.1 from training. Duration: 0.736375 2023-10-04 02:40:03,495 WARNING [train.py:1204] (2/4) Exclude cut with ID _7970_210_1883_1_1528455616296_3472230_25-178407-0_sp1.1 from training. Duration: 0.6454375 2023-10-04 02:40:04,846 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_3133_210_2087_1_1526106446_2324690_246-125000-0 from training. Duration: 0.62 2023-10-04 02:40:06,244 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_15126_210_6753_1_1530778677_2591185_113-157651-0_sp0.9 from training. Duration: 0.7555625 2023-10-04 02:40:07,743 WARNING [train.py:1204] (2/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_105-63309-0_sp1.1 from training. Duration: 0.8909375 2023-10-04 02:40:09,206 WARNING [train.py:1204] (2/4) Exclude cut with ID _29877_210_6228_1_1532397791106_5276149_578-177271-0_sp1.1 from training. Duration: 0.936375 2023-10-04 02:40:09,256 WARNING [train.py:1204] (2/4) Exclude cut with ID _44831_210_13427_1_1533816011549_3689035_32-188147-0 from training. Duration: 0.96 2023-10-04 02:40:11,126 WARNING [train.py:1204] (2/4) Exclude cut with ID _12369_210_4381_1_1530356078581_4388520_300-254337-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 02:40:11,139 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_128-241008-0 from training. Duration: 0.94 2023-10-04 02:40:12,575 WARNING [train.py:1204] (2/4) Exclude cut with ID _21878_210_3525_1_1531738772002_7264330_53-279178-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 02:40:16,054 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.1.encoder.layers.0.feed_forward3.out_whiten, num_groups=1, num_channels=256, metric=11.40 vs. limit=15.0 2023-10-04 02:40:17,892 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_41452_210_7291_1_1533437687_3941281_273-334088-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 02:40:17,894 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_128-241008-0_sp1.1 from training. Duration: 0.8545625 2023-10-04 02:40:20,824 WARNING [train.py:1204] (2/4) Exclude cut with ID _8394_210_6702_1_1528590504316_3540189_379-49457-0 from training. Duration: 0.87 2023-10-04 02:40:22,146 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_105-114797-0 from training. Duration: 0.84 2023-10-04 02:40:24,001 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_14928_210_5632_1_1530773461_4714000_454-112198-0 from training. Duration: 0.66 2023-10-04 02:40:26,957 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.self_attn_weights.pos_emb_skip_rate, batch_count=1496333.3333333333, ans=0.0 2023-10-04 02:40:28,757 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_456-22141-0_sp1.1 from training. Duration: 0.8 2023-10-04 02:40:30,337 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_115-233929-0 from training. Duration: 0.72 2023-10-04 02:40:30,452 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_616-16795-0_sp1.1 from training. Duration: 0.963625 2023-10-04 02:40:36,453 INFO [train.py:1046] (2/4) Epoch 43, batch 1350, loss[loss=0.1676, simple_loss=0.2556, pruned_loss=0.03979, over 24466.00 frames. ], tot_loss[loss=0.1557, simple_loss=0.236, pruned_loss=0.03769, over 4718798.90 frames. ], batch size: 69, lr: 2.37e-03, grad_scale: 8.0 2023-10-04 02:40:37,829 WARNING [train.py:1204] (2/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_448-212135-0 from training. Duration: 0.91 2023-10-04 02:40:40,582 WARNING [train.py:1204] (2/4) Exclude cut with ID _46683_210_9642_1_1533730913492_4591116_201-205677-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 02:40:42,429 WARNING [train.py:1204] (2/4) Exclude cut with ID _64086_210_7994_1_1535270417019_7435949_43-80483-0_sp1.1 from training. Duration: 0.97275 2023-10-04 02:40:45,232 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_240-134124-0_sp1.1 from training. Duration: 0.963625 2023-10-04 02:40:45,264 WARNING [train.py:1204] (2/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_907-120940-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 02:40:46,701 WARNING [train.py:1204] (2/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_848-280957-0_sp1.1 from training. Duration: 0.7909375 2023-10-04 02:40:46,749 WARNING [train.py:1204] (2/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_243-42116-0_sp0.9 from training. Duration: 0.811125 2023-10-04 02:40:51,277 WARNING [train.py:1204] (2/4) Exclude cut with ID _6694_210_4599_1_1527852637490_3965529_116-109398-0_sp0.9 from training. Duration: 0.811125 2023-10-04 02:40:52,637 WARNING [train.py:1204] (2/4) Exclude cut with ID _24314_210_4915_1_1532135078116_7170179_521-258868-0 from training. Duration: 0.79 2023-10-04 02:40:53,987 WARNING [train.py:1204] (2/4) Exclude cut with ID _2220_210_1819_1_1525509109898_3658680_50-99837-0_sp0.9 from training. Duration: 0.6 2023-10-04 02:40:55,791 WARNING [train.py:1204] (2/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_313-134930-0_sp1.1 from training. Duration: 0.8818125 2023-10-04 02:40:57,314 WARNING [train.py:1204] (2/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_125-144174-0 from training. Duration: 0.39 2023-10-04 02:40:58,623 WARNING [train.py:1204] (2/4) Exclude cut with ID _59288_210_5854_1_1535077757774_3412246_275-293897-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 02:41:01,246 WARNING [train.py:1204] (2/4) Exclude cut with ID _40519_210_13794_1_1533715228655_7337390_722-52880-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 02:41:01,267 WARNING [train.py:1204] (2/4) Exclude cut with ID _7537_210_2084_1_1528502395431_3924359_128-268029-0 from training. Duration: 0.98 2023-10-04 02:41:03,973 WARNING [train.py:1204] (2/4) Exclude cut with ID _8021_210_7994_1_1528376451358_3653069_179-45206-0 from training. Duration: 0.86 2023-10-04 02:41:06,127 WARNING [train.py:1204] (2/4) Exclude cut with ID _13252_210_1925_1_1530671372047_3336909_88-276668-0 from training. Duration: 0.99 2023-10-04 02:41:06,237 WARNING [train.py:1204] (2/4) Exclude cut with ID _22613_210_5219_1_1532253599686_4090170_290-212862-0_sp1.1 from training. Duration: 0.97275 2023-10-04 02:41:07,487 WARNING [train.py:1204] (2/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_465-214726-0 from training. Duration: 0.86 2023-10-04 02:41:11,789 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.nonlin_attention.balancer.prob, batch_count=1496533.3333333333, ans=0.125 2023-10-04 02:41:12,470 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.1.encoder.layers.0.conv_module1.whiten, num_groups=1, num_channels=256, metric=9.66 vs. limit=15.0 2023-10-04 02:41:18,917 WARNING [train.py:1204] (2/4) Exclude cut with ID _56480_210_4220_1_1534679942079_3075409_138-36031-0_sp1.1 from training. Duration: 0.97275 2023-10-04 02:41:21,115 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.2.encoder.layers.0.feed_forward2.out_whiten, num_groups=1, num_channels=384, metric=5.07 vs. limit=15.0 2023-10-04 02:41:22,671 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.balancer2.prob, batch_count=1496600.0, ans=0.125 2023-10-04 02:41:27,985 WARNING [train.py:1204] (2/4) Exclude cut with ID _58605_210_12210_1_1534723195939_3904259_300-285878-0_sp1.1 from training. Duration: 0.97275 2023-10-04 02:41:28,023 WARNING [train.py:1204] (2/4) Exclude cut with ID _40901_210_5665_1_1533363789958_3856130_114-340229-0_sp1.1 from training. Duration: 0.936375 2023-10-04 02:41:29,145 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.613e+02 1.909e+02 2.129e+02 2.419e+02 3.786e+02, threshold=4.258e+02, percent-clipped=0.0 2023-10-04 02:41:29,267 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_489-230703-0 from training. Duration: 0.85 2023-10-04 02:41:32,511 WARNING [train.py:1204] (2/4) Exclude cut with ID _66873_210_5517_1_1535454136506_4293370_322-81243-0_sp1.1 from training. Duration: 0.936375 2023-10-04 02:41:32,602 WARNING [train.py:1204] (2/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_271-269084-0 from training. Duration: 0.83 2023-10-04 02:41:32,624 WARNING [train.py:1204] (2/4) Exclude cut with ID _13252_210_1925_1_1530671372047_3336909_166-314607-0_sp0.9 from training. Duration: 0.6 2023-10-04 02:41:32,684 WARNING [train.py:1204] (2/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_340-343302-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 02:41:32,919 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.conv_skip_rate, batch_count=1496600.0, ans=0.0 2023-10-04 02:41:36,118 WARNING [train.py:1204] (2/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_286-112015-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 02:41:36,308 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.nonlin_attention.balancer.max_positive, batch_count=1496666.6666666667, ans=0.95 2023-10-04 02:41:37,345 WARNING [train.py:1204] (2/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_328-264776-0 from training. Duration: 0.67 2023-10-04 02:41:38,763 WARNING [train.py:1204] (2/4) Exclude cut with ID _4866_210_2577_1_1527404412284_4364659_256-306830-0_sp0.9 from training. Duration: 0.9666875 2023-10-04 02:41:44,328 WARNING [train.py:1204] (2/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_239-316802-0 from training. Duration: 0.69 2023-10-04 02:41:47,540 WARNING [train.py:1204] (2/4) Exclude cut with ID _29857_210_20034_1_1532395886753_3798160_279-185801-0 from training. Duration: 0.91 2023-10-04 02:41:50,247 INFO [train.py:1046] (2/4) Epoch 43, batch 1400, loss[loss=0.1473, simple_loss=0.234, pruned_loss=0.0303, over 24507.00 frames. ], tot_loss[loss=0.1549, simple_loss=0.2349, pruned_loss=0.03748, over 4702526.12 frames. ], batch size: 66, lr: 2.37e-03, grad_scale: 8.0 2023-10-04 02:41:53,576 WARNING [train.py:1204] (2/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_656-188361-0 from training. Duration: 0.87 2023-10-04 02:41:53,830 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.bypass.skip_rate, batch_count=1496733.3333333333, ans=0.07 2023-10-04 02:41:54,932 WARNING [train.py:1204] (2/4) Exclude cut with ID _44718_210_5816_1_1534050051869_6469030_100-230287-0_sp1.1 from training. Duration: 0.936375 2023-10-04 02:41:57,684 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8275_210_4198_1_1528455320_3892571_276-215622-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 02:41:57,749 WARNING [train.py:1204] (2/4) Exclude cut with ID _30304_210_6319_1_1532476827261_7582060_698-309430-0_sp1.1 from training. Duration: 0.963625 2023-10-04 02:42:02,372 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_365-217119-0 from training. Duration: 0.63 2023-10-04 02:42:03,922 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7264_210_4938_1_1527939911_8132095_800-91995-0 from training. Duration: 0.86 2023-10-04 02:42:05,527 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.bypass_mid.scale_min, batch_count=1496800.0, ans=0.2 2023-10-04 02:42:08,685 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.attention_skip_rate, batch_count=1496800.0, ans=0.0 2023-10-04 02:42:14,073 WARNING [train.py:1204] (2/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_49-97158-0_sp1.1 from training. Duration: 0.6909375 2023-10-04 02:42:15,431 WARNING [train.py:1204] (2/4) Exclude cut with ID _61132_210_9969_1_1535021792964_3755419_61-57720-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 02:42:18,187 WARNING [train.py:1204] (2/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_345-306832-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 02:42:18,217 WARNING [train.py:1204] (2/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_164-224052-0_sp0.9 from training. Duration: 0.77775 2023-10-04 02:42:24,761 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_34886_210_10633_1_1533203882_1755661_76-156096-0_sp1.1 from training. Duration: 0.7909375 2023-10-04 02:42:24,847 WARNING [train.py:1204] (2/4) Exclude cut with ID 4133-6541-0027-40495-0_sp1.1 from training. Duration: 0.9681875 2023-10-04 02:42:32,133 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_514-211165-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 02:42:32,184 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_41211_210_2544_1_1533348859_3702910_97-212695-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 02:42:36,357 WARNING [train.py:1204] (2/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_406-45086-0 from training. Duration: 0.8 2023-10-04 02:42:36,618 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.conv_module2.balancer2.prob, batch_count=1496933.3333333333, ans=0.125 2023-10-04 02:42:37,586 WARNING [train.py:1204] (2/4) Exclude cut with ID _65263_210_22356_1_1535763598691_3921037_49-222917-0_sp1.1 from training. Duration: 0.836375 2023-10-04 02:42:37,631 WARNING [train.py:1204] (2/4) Exclude cut with ID _4866_210_2577_1_1527404412284_4364659_352-157930-0_sp1.1 from training. Duration: 0.736375 2023-10-04 02:42:37,688 WARNING [train.py:1204] (2/4) Exclude cut with ID _4366_210_1388_1_1526792053633_3613819_231-211402-0_sp1.1 from training. Duration: 0.8545625 2023-10-04 02:42:39,470 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_98-21576-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 02:42:40,874 WARNING [train.py:1204] (2/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_348-127789-0_sp0.9 from training. Duration: 0.9444375 2023-10-04 02:42:40,897 WARNING [train.py:1204] (2/4) Exclude cut with ID _45871_210_5665_1_1533864614245_3296060_98-329557-0_sp1.1 from training. Duration: 0.97275 2023-10-04 02:42:42,282 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6846_210_6753_1_1527850767_4295158_2-315073-0_sp1.1 from training. Duration: 0.8 2023-10-04 02:42:43,738 WARNING [train.py:1204] (2/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_145-137790-0 from training. Duration: 0.83 2023-10-04 02:42:43,758 WARNING [train.py:1204] (2/4) Exclude cut with ID _14603_210_4220_1_1530615688441_3097319_133-158308-0_sp0.9 from training. Duration: 0.9333125 2023-10-04 02:42:49,509 WARNING [train.py:1204] (2/4) Exclude cut with ID _14898_210_9206_1_1530766812377_4177190_320-11758-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 02:42:52,947 WARNING [train.py:1204] (2/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_406-45086-0_sp0.9 from training. Duration: 0.888875 2023-10-04 02:42:55,981 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.1.bypass_mid.scale_min, batch_count=1497000.0, ans=0.2 2023-10-04 02:42:58,469 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_5438_210_1794_1_1527336687_2600009_104-288274-0 from training. Duration: 0.58 2023-10-04 02:42:58,550 WARNING [train.py:1204] (2/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_108-53929-0_sp0.9 from training. Duration: 0.6555625 2023-10-04 02:43:00,297 WARNING [train.py:1204] (2/4) Exclude cut with ID _30800_210_22382_1_1532499347904_4515959_444-286390-0_sp1.1 from training. Duration: 0.963625 2023-10-04 02:43:03,019 WARNING [train.py:1204] (2/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_125-144174-0_sp0.9 from training. Duration: 0.4333125 2023-10-04 02:43:03,092 WARNING [train.py:1204] (2/4) Exclude cut with ID _55209_210_1531_1_1534675828996_4597160_118-345621-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 02:43:04,346 INFO [train.py:1046] (2/4) Epoch 43, batch 1450, loss[loss=0.149, simple_loss=0.2331, pruned_loss=0.03241, over 24665.00 frames. ], tot_loss[loss=0.155, simple_loss=0.2351, pruned_loss=0.03748, over 4697952.39 frames. ], batch size: 65, lr: 2.37e-03, grad_scale: 4.0 2023-10-04 02:43:05,698 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_45886_210_17024_1_1533709215_471063_7-79165-0_sp1.1 from training. Duration: 0.8909375 2023-10-04 02:43:08,579 WARNING [train.py:1204] (2/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_175-163894-0_sp0.9 from training. Duration: 0.82225 2023-10-04 02:43:10,549 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_50188_210_4198_1_1534503621_3646041_353-81682-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 02:43:10,560 WARNING [train.py:1204] (2/4) Exclude cut with ID _17875_210_2087_1_1531224012907_1821339_175-80565-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 02:43:10,569 WARNING [train.py:1204] (2/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_159-328157-0_sp1.1 from training. Duration: 0.47275 2023-10-04 02:43:14,750 WARNING [train.py:1204] (2/4) Exclude cut with ID _60057_210_10640_1_1534903065793_3333150_137-267579-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 02:43:14,797 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_92-46009-0_sp1.1 from training. Duration: 0.4909375 2023-10-04 02:43:16,244 WARNING [train.py:1204] (2/4) Exclude cut with ID _53203_210_2087_1_1534246218309_475590_48-68507-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 02:43:16,268 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_28211_210_6388_1_1532861506_4523224_128-328162-0 from training. Duration: 0.9 2023-10-04 02:43:18,228 WARNING [train.py:1204] (2/4) Exclude cut with ID _4697_210_2069_1_1526815801953_3823098_51-1049-0_sp1.1 from training. Duration: 0.6818125 2023-10-04 02:43:18,304 WARNING [train.py:1204] (2/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_383-316000-0 from training. Duration: 0.9 2023-10-04 02:43:18,351 WARNING [train.py:1204] (2/4) Exclude cut with ID _4779_210_2084_1_1527318022944_3914893_185-295153-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 02:43:19,715 WARNING [train.py:1204] (2/4) Exclude cut with ID _3231_210_2106_1_1526205864934_6784220_171-256004-0_sp0.9 from training. Duration: 0.9555625 2023-10-04 02:43:19,715 WARNING [train.py:1204] (2/4) Exclude cut with ID _41314_210_12651_1_1533459476543_3766460_87-272068-0 from training. Duration: 0.83 2023-10-04 02:43:21,525 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_164-152777-0_sp1.1 from training. Duration: 0.92725 2023-10-04 02:43:21,686 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=1497133.3333333333, ans=0.1 2023-10-04 02:43:22,695 WARNING [train.py:1204] (2/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_18-130336-0_sp0.9 from training. Duration: 0.87775 2023-10-04 02:43:22,739 WARNING [train.py:1204] (2/4) Exclude cut with ID _14886_210_4220_1_1530702020355_3516190_175-229073-0_sp1.1 from training. Duration: 0.4090625 2023-10-04 02:43:22,757 WARNING [train.py:1204] (2/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_791-45149-0_sp0.9 from training. Duration: 0.9555625 2023-10-04 02:43:24,121 WARNING [train.py:1204] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_468-189058-0_sp1.1 from training. Duration: 0.87275 2023-10-04 02:43:25,343 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8278_210_2550_1_1528637099_4220413_32-142780-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 02:43:28,061 WARNING [train.py:1204] (2/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_146-218763-0_sp0.9 from training. Duration: 0.9555625 2023-10-04 02:43:30,531 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.5.encoder.layers.0.conv_module1.whiten, num_groups=1, num_channels=256, metric=6.79 vs. limit=15.0 2023-10-04 02:43:31,382 WARNING [train.py:1204] (2/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_348-127789-0_sp1.1 from training. Duration: 0.77275 2023-10-04 02:43:31,390 WARNING [train.py:1204] (2/4) Exclude cut with ID _47790_210_16187_1_1533862814066_4207519_288-285445-0_sp1.1 from training. Duration: 0.936375 2023-10-04 02:43:34,104 WARNING [train.py:1204] (2/4) Exclude cut with ID _14603_210_4220_1_1530615688441_3097319_308-181732-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 02:43:34,136 WARNING [train.py:1204] (2/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_895-117428-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 02:43:36,772 WARNING [train.py:1204] (2/4) Exclude cut with ID _9235_210_5219_1_1528891154253_8046120_387-159008-0_sp0.9 from training. Duration: 0.9555625 2023-10-04 02:43:36,784 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_37351_210_7925_1_1533012931_3812113_265-85780-0_sp0.9 from training. Duration: 0.97775 2023-10-04 02:43:36,806 WARNING [train.py:1204] (2/4) Exclude cut with ID _40551_210_10635_1_1533776424632_3416179_361-3964-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 02:43:36,844 WARNING [train.py:1204] (2/4) Exclude cut with ID _25882_210_3036_1_1532775665815_6479510_485-122742-0_sp1.1 from training. Duration: 0.97275 2023-10-04 02:43:41,650 WARNING [train.py:1204] (2/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_98-51676-0 from training. Duration: 0.73 2023-10-04 02:43:41,807 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.0.conv_module2.balancer1.prob, batch_count=1497200.0, ans=0.125 2023-10-04 02:43:43,127 WARNING [train.py:1204] (2/4) Exclude cut with ID _30548_210_16040_1_1532608097130_3146969_371-280610-0_sp1.1 from training. Duration: 0.92725 2023-10-04 02:43:46,016 WARNING [train.py:1204] (2/4) Exclude cut with ID IC0671W0081-331924 from training. Duration: 0.896 2023-10-04 02:43:47,863 WARNING [train.py:1204] (2/4) Exclude cut with ID _40284_210_2084_1_1533513679814_3704279_380-160775-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 02:43:49,596 WARNING [train.py:1204] (2/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_57-270037-0_sp1.1 from training. Duration: 0.7 2023-10-04 02:43:51,096 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_853-150237-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 02:43:52,425 WARNING [train.py:1204] (2/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_491-261923-0 from training. Duration: 0.98 2023-10-04 02:43:56,519 WARNING [train.py:1204] (2/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_836-244045-0_sp1.1 from training. Duration: 0.97275 2023-10-04 02:43:57,891 WARNING [train.py:1204] (2/4) Exclude cut with ID _14880_210_8895_1_1530680426655_3795259_289-212817-0 from training. Duration: 0.69 2023-10-04 02:43:59,138 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.634e+02 2.025e+02 2.199e+02 2.536e+02 7.667e+02, threshold=4.399e+02, percent-clipped=1.0 2023-10-04 02:44:00,559 WARNING [train.py:1204] (2/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_303-340446-0 from training. Duration: 0.9 2023-10-04 02:44:00,703 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.1.feed_forward1.out_proj.dropout_p, batch_count=1497266.6666666667, ans=0.1 2023-10-04 02:44:02,377 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_37152_210_9697_1_1533106573_3844188_418-41736-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 02:44:05,425 WARNING [train.py:1204] (2/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_371-18169-0_sp1.1 from training. Duration: 0.7818125 2023-10-04 02:44:06,788 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_187-219984-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 02:44:08,166 WARNING [train.py:1204] (2/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_63-267585-0 from training. Duration: 0.9 2023-10-04 02:44:09,643 WARNING [train.py:1204] (2/4) Exclude cut with ID _27013_210_1389_1_1532437173240_3252649_222-125573-0 from training. Duration: 0.98 2023-10-04 02:44:10,963 WARNING [train.py:1204] (2/4) Exclude cut with ID _14821_210_8750_1_1530680503991_6829359_189-297451-0 from training. Duration: 0.55 2023-10-04 02:44:12,882 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_557-163391-0_sp1.1 from training. Duration: 0.97275 2023-10-04 02:44:12,966 WARNING [train.py:1204] (2/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_65-88242-0_sp1.1 from training. Duration: 0.5090625 2023-10-04 02:44:18,497 INFO [train.py:1046] (2/4) Epoch 43, batch 1500, loss[loss=0.1649, simple_loss=0.2408, pruned_loss=0.04445, over 23778.00 frames. ], tot_loss[loss=0.155, simple_loss=0.2353, pruned_loss=0.03738, over 4710743.48 frames. ], batch size: 164, lr: 2.37e-03, grad_scale: 8.0 2023-10-04 02:44:23,597 WARNING [train.py:1204] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_218-295097-0 from training. Duration: 0.77 2023-10-04 02:44:23,617 WARNING [train.py:1204] (2/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_686-247600-0_sp1.1 from training. Duration: 0.836375 2023-10-04 02:44:23,620 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_397-147196-0_sp0.9 from training. Duration: 0.888875 2023-10-04 02:44:24,904 WARNING [train.py:1204] (2/4) Exclude cut with ID _53950_210_5816_1_1534417264919_3862951_237-136449-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 02:44:26,248 WARNING [train.py:1204] (2/4) Exclude cut with ID _3120_210_3525_1_1526036143112_4072980_74-178400-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 02:44:26,325 WARNING [train.py:1204] (2/4) Exclude cut with ID _5166_210_2084_1_1527073197160_4087993_309-222545-0_sp0.9 from training. Duration: 0.9333125 2023-10-04 02:44:27,693 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7544_210_6753_1_1528587722_3576686_172-171750-0 from training. Duration: 0.65 2023-10-04 02:44:29,036 WARNING [train.py:1204] (2/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_3-237434-0_sp0.9 from training. Duration: 0.8444375 2023-10-04 02:44:29,078 WARNING [train.py:1204] (2/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_101-293367-0_sp0.9 from training. Duration: 0.7 2023-10-04 02:44:29,085 WARNING [train.py:1204] (2/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_818-213552-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 02:44:30,460 WARNING [train.py:1204] (2/4) Exclude cut with ID _14349_210_8341_1_1530615710818_3442470_115-285975-0_sp1.1 from training. Duration: 0.7818125 2023-10-04 02:44:32,448 WARNING [train.py:1204] (2/4) Exclude cut with ID _30800_210_22382_1_1532499347904_4515959_291-309749-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 02:44:33,866 WARNING [train.py:1204] (2/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_715-3462-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 02:44:40,488 WARNING [train.py:1204] (2/4) Exclude cut with ID _59502_210_12210_1_1535107024698_1835139_13-331827-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 02:44:40,498 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_4528_210_3273_1_1527076654_3276777_342-168068-0 from training. Duration: 0.87 2023-10-04 02:44:41,803 WARNING [train.py:1204] (2/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_215-141059-0_sp0.9 from training. Duration: 0.688875 2023-10-04 02:44:41,823 WARNING [train.py:1204] (2/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_106-286130-0_sp1.1 from training. Duration: 0.8909375 2023-10-04 02:44:43,214 WARNING [train.py:1204] (2/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_480-77655-0_sp1.1 from training. Duration: 0.97275 2023-10-04 02:44:46,479 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.nonlin_attention.balancer.prob, batch_count=1497533.3333333333, ans=0.125 2023-10-04 02:44:47,596 WARNING [train.py:1204] (2/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_848-280957-0 from training. Duration: 0.87 2023-10-04 02:44:50,337 WARNING [train.py:1204] (2/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_122-109023-0 from training. Duration: 0.76 2023-10-04 02:44:51,731 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_11305_210_1794_1_1529750428_7178148_645-233480-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 02:44:51,775 WARNING [train.py:1204] (2/4) Exclude cut with ID _7562_210_2084_1_1528887767916_3946229_345-169156-0 from training. Duration: 0.58 2023-10-04 02:44:53,799 WARNING [train.py:1204] (2/4) Exclude cut with ID _7562_210_2084_1_1528887767916_3946229_345-169156-0_sp1.1 from training. Duration: 0.52725 2023-10-04 02:44:56,597 WARNING [train.py:1204] (2/4) Exclude cut with ID _13253_210_1925_1_1530757775819_3577873_253-246386-0_sp1.1 from training. Duration: 0.8181875 2023-10-04 02:44:57,935 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_136-264444-0_sp1.1 from training. Duration: 0.97275 2023-10-04 02:44:57,948 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_2168_210_2290_1_1525606920_1849400_136-222991-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 02:44:59,257 WARNING [train.py:1204] (2/4) Exclude cut with ID _7852_210_5219_1_1528286471316_8139400_242-105336-0 from training. Duration: 0.72 2023-10-04 02:44:59,300 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_5960_210_1794_1_1527403865_3990999_515-2613-0_sp1.1 from training. Duration: 0.8 2023-10-04 02:44:59,320 WARNING [train.py:1204] (2/4) Exclude cut with ID _39688_210_19596_1_1533371626725_4154159_551-167834-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 02:45:00,735 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_43802_210_2290_1_1533516203_8829504_85-195240-0 from training. Duration: 0.87 2023-10-04 02:45:02,196 WARNING [train.py:1204] (2/4) Exclude cut with ID _63171_210_15488_1_1535104792794_3295110_113-113534-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 02:45:06,824 WARNING [train.py:1204] (2/4) Exclude cut with ID _8833_210_5219_1_1528592665210_7553059_319-349181-0_sp1.1 from training. Duration: 0.9 2023-10-04 02:45:06,825 WARNING [train.py:1204] (2/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_225-43381-0 from training. Duration: 0.94 2023-10-04 02:45:11,042 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_5959_210_1794_1_1527400400_3445726_412-44052-0_sp0.9 from training. Duration: 0.8555625 2023-10-04 02:45:12,337 WARNING [train.py:1204] (2/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_356-167816-0_sp1.1 from training. Duration: 0.6545625 2023-10-04 02:45:16,920 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9012W0112-501244 from training. Duration: 0.768 2023-10-04 02:45:18,273 WARNING [train.py:1204] (2/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_177-326952-0_sp1.1 from training. Duration: 0.92725 2023-10-04 02:45:18,277 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9016W0116-502789 from training. Duration: 0.896 2023-10-04 02:45:19,629 WARNING [train.py:1204] (2/4) Exclude cut with ID _56235_210_10640_1_1534557335235_4161099_557-204815-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 02:45:21,068 WARNING [train.py:1204] (2/4) Exclude cut with ID _55857_210_2346_1_1534644007831_6898209_279-229469-0_sp1.1 from training. Duration: 0.936375 2023-10-04 02:45:21,152 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9029W0375-509764 from training. Duration: 0.896 2023-10-04 02:45:22,497 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_2295_210_2087_1_1525500993_2407863_234-83104-0_sp0.9 from training. Duration: 0.688875 2023-10-04 02:45:27,531 WARNING [train.py:1204] (2/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_266-137104-0 from training. Duration: 0.65 2023-10-04 02:45:27,705 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.0.feed_forward1.out_proj.dropout_p, batch_count=1497666.6666666667, ans=0.1 2023-10-04 02:45:28,951 WARNING [train.py:1204] (2/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_289-163927-0_sp1.1 from training. Duration: 0.92725 2023-10-04 02:45:31,670 INFO [train.py:1046] (2/4) Epoch 43, batch 1550, loss[loss=0.1464, simple_loss=0.2326, pruned_loss=0.03014, over 24595.00 frames. ], tot_loss[loss=0.1554, simple_loss=0.2359, pruned_loss=0.03744, over 4708801.92 frames. ], batch size: 60, lr: 2.37e-03, grad_scale: 8.0 2023-10-04 02:45:31,777 WARNING [train.py:1204] (2/4) Exclude cut with ID _12733_210_8341_1_1530167424556_3646380_86-225314-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 02:45:31,798 WARNING [train.py:1204] (2/4) Exclude cut with ID _7877_210_6014_1_1528286358099_3750136_618-284064-0_sp1.1 from training. Duration: 0.92725 2023-10-04 02:45:33,121 WARNING [train.py:1204] (2/4) Exclude cut with ID _55844_210_2346_1_1534471212569_7625707_1017-31469-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 02:45:33,160 WARNING [train.py:1204] (2/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_617-241446-0_sp1.1 from training. Duration: 0.92725 2023-10-04 02:45:34,626 WARNING [train.py:1204] (2/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_866-173440-0_sp1.1 from training. Duration: 0.8181875 2023-10-04 02:45:36,154 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_9460_210_4211_1_1528954587_10049667_480-260269-0 from training. Duration: 0.56 2023-10-04 02:45:36,186 WARNING [train.py:1204] (2/4) Exclude cut with ID _44659_210_2721_1_1534036120061_6614860_626-298884-0 from training. Duration: 0.97 2023-10-04 02:45:36,226 WARNING [train.py:1204] (2/4) Exclude cut with ID _53565_210_10635_1_1534337218335_3820259_287-18120-0_sp1.1 from training. Duration: 0.8818125 2023-10-04 02:45:38,062 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_791-197891-0 from training. Duration: 0.84 2023-10-04 02:45:39,322 WARNING [train.py:1204] (2/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_527-238990-0 from training. Duration: 0.9 2023-10-04 02:45:40,758 WARNING [train.py:1204] (2/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_449-65969-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 02:45:42,117 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_553-341452-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 02:45:42,158 WARNING [train.py:1204] (2/4) Exclude cut with ID _16909_210_5724_1_1531292079912_4118919_300-47574-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 02:45:42,177 WARNING [train.py:1204] (2/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_70-27732-0_sp1.1 from training. Duration: 0.8545625 2023-10-04 02:45:43,557 WARNING [train.py:1204] (2/4) Exclude cut with ID _44718_210_5816_1_1534050051869_6469030_727-28074-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 02:45:43,601 WARNING [train.py:1204] (2/4) Exclude cut with ID _55857_210_2346_1_1534644007831_6898209_357-123664-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 02:45:46,963 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9123W0004-555964 from training. Duration: 0.896 2023-10-04 02:45:48,103 WARNING [train.py:1204] (2/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_445-145208-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 02:45:48,136 WARNING [train.py:1204] (2/4) Exclude cut with ID _3675_210_4557_1_1526639934408_4182089_456-18305-0_sp1.1 from training. Duration: 0.6909375 2023-10-04 02:45:49,579 WARNING [train.py:1204] (2/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_1-99680-0_sp1.1 from training. Duration: 0.6090625 2023-10-04 02:45:51,044 WARNING [train.py:1204] (2/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_146-282400-0_sp0.9 from training. Duration: 0.788875 2023-10-04 02:45:52,345 WARNING [train.py:1204] (2/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_844-96989-0 from training. Duration: 0.71 2023-10-04 02:45:53,764 WARNING [train.py:1204] (2/4) Exclude cut with ID _29853_210_7497_1_1532505600851_6642110_128-146364-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 02:45:53,795 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_224-322394-0 from training. Duration: 0.66 2023-10-04 02:45:53,892 WARNING [train.py:1204] (2/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_191-59784-0 from training. Duration: 0.88 2023-10-04 02:45:53,897 WARNING [train.py:1204] (2/4) Exclude cut with ID _12556_210_8341_1_1530081024100_7327690_725-193477-0 from training. Duration: 0.98 2023-10-04 02:45:55,879 WARNING [train.py:1204] (2/4) Exclude cut with ID _5166_210_2084_1_1527073197160_4087993_523-79838-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 02:45:55,971 WARNING [train.py:1204] (2/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_398-334558-0_sp1.1 from training. Duration: 0.87275 2023-10-04 02:45:59,440 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.self_attn_weights.pos_emb_skip_rate, batch_count=1497800.0, ans=0.0 2023-10-04 02:46:00,609 WARNING [train.py:1204] (2/4) Exclude cut with ID _6456_210_1534_1_1527681634212_3181049_171-36697-0_sp1.1 from training. Duration: 0.936375 2023-10-04 02:46:02,034 WARNING [train.py:1204] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_700-183549-0 from training. Duration: 0.68 2023-10-04 02:46:02,037 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8853_210_3273_1_1528633899_818968_51-187685-0 from training. Duration: 0.69 2023-10-04 02:46:10,622 WARNING [train.py:1204] (2/4) Exclude cut with ID _7719_210_3525_1_1528456153163_4296960_333-110399-0_sp1.1 from training. Duration: 0.87275 2023-10-04 02:46:13,446 WARNING [train.py:1204] (2/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_1022-83740-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 02:46:13,463 WARNING [train.py:1204] (2/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_61-327062-0_sp0.9 from training. Duration: 0.9 2023-10-04 02:46:13,468 WARNING [train.py:1204] (2/4) Exclude cut with ID _47251_210_18628_1_1533814063128_4137250_214-88076-0_sp1.1 from training. Duration: 0.8 2023-10-04 02:46:14,708 WARNING [train.py:1204] (2/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_839-173017-0 from training. Duration: 0.41 2023-10-04 02:46:19,341 WARNING [train.py:1204] (2/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_299-34413-0_sp1.1 from training. Duration: 0.5909375 2023-10-04 02:46:19,537 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.feed_forward2.hidden_balancer.prob, batch_count=1497933.3333333333, ans=0.125 2023-10-04 02:46:22,004 WARNING [train.py:1204] (2/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_664-159457-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 02:46:23,494 WARNING [train.py:1204] (2/4) Exclude cut with ID _66873_210_5517_1_1535454136506_4293370_483-291760-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 02:46:26,649 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.685e+02 1.915e+02 2.072e+02 2.347e+02 3.023e+02, threshold=4.143e+02, percent-clipped=0.0 2023-10-04 02:46:26,797 WARNING [train.py:1204] (2/4) Exclude cut with ID _30800_210_22382_1_1532499347904_4515959_287-255474-0_sp1.1 from training. Duration: 0.92725 2023-10-04 02:46:26,834 WARNING [train.py:1204] (2/4) Exclude cut with ID _48677_210_5663_1_1533902405919_3892989_111-11439-0_sp1.1 from training. Duration: 0.87275 2023-10-04 02:46:28,226 WARNING [train.py:1204] (2/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_399-156791-0 from training. Duration: 0.83 2023-10-04 02:46:28,259 WARNING [train.py:1204] (2/4) Exclude cut with ID _3992_210_1747_1_1526644816031_3731060_36-147855-0_sp0.9 from training. Duration: 0.8666875 2023-10-04 02:46:29,684 WARNING [train.py:1204] (2/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_442-320317-0_sp1.1 from training. Duration: 0.7454375 2023-10-04 02:46:29,702 WARNING [train.py:1204] (2/4) Exclude cut with ID _55429_210_10626_1_1534467588527_7174143_953-80868-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 02:46:29,988 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.balancer1.prob, batch_count=1498000.0, ans=0.125 2023-10-04 02:46:31,582 WARNING [train.py:1204] (2/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_159-328157-0_sp0.9 from training. Duration: 0.57775 2023-10-04 02:46:31,596 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9309W0197-640324 from training. Duration: 0.896 2023-10-04 02:46:34,282 WARNING [train.py:1204] (2/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_361-30822-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 02:46:39,162 WARNING [train.py:1204] (2/4) Exclude cut with ID _5250_210_5219_1_1527249561806_8692110_636-251213-0 from training. Duration: 0.94 2023-10-04 02:46:43,303 WARNING [train.py:1204] (2/4) Exclude cut with ID _49728_210_12651_1_1534041034294_6788150_468-333827-0_sp1.1 from training. Duration: 0.963625 2023-10-04 02:46:44,638 WARNING [train.py:1204] (2/4) Exclude cut with ID _25882_210_3036_1_1532775665815_6479510_623-191759-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 02:46:44,694 WARNING [train.py:1204] (2/4) Exclude cut with ID _5189_210_4557_1_1527248319428_3769229_199-53526-0 from training. Duration: 0.42 2023-10-04 02:46:45,973 INFO [train.py:1046] (2/4) Epoch 43, batch 1600, loss[loss=0.1488, simple_loss=0.2449, pruned_loss=0.02637, over 24425.00 frames. ], tot_loss[loss=0.1562, simple_loss=0.2368, pruned_loss=0.03783, over 4714974.32 frames. ], batch size: 69, lr: 2.37e-03, grad_scale: 16.0 2023-10-04 02:46:47,428 WARNING [train.py:1204] (2/4) Exclude cut with ID _6274_210_2084_1_1527937583837_7494020_32-205288-0_sp0.9 from training. Duration: 0.8666875 2023-10-04 02:46:48,342 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.ff3_skip_rate, batch_count=1498066.6666666667, ans=0.0 2023-10-04 02:46:49,397 WARNING [train.py:1204] (2/4) Exclude cut with ID _65263_210_22356_1_1535763598691_3921037_401-346646-0_sp1.1 from training. Duration: 0.963625 2023-10-04 02:46:49,401 WARNING [train.py:1204] (2/4) Exclude cut with ID _12555_210_8750_1_1530075594402_3780239_449-267986-0_sp1.1 from training. Duration: 0.7545625 2023-10-04 02:46:49,413 WARNING [train.py:1204] (2/4) Exclude cut with ID _69884_210_9771_1_1535846392818_4166169_430-201020-0_sp1.1 from training. Duration: 0.8818125 2023-10-04 02:46:50,834 WARNING [train.py:1204] (2/4) Exclude cut with ID _8394_210_6702_1_1528590504316_3540189_379-49457-0_sp1.1 from training. Duration: 0.7909375 2023-10-04 02:46:51,067 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.nonlin_attention.balancer.min_positive, batch_count=1498066.6666666667, ans=0.05 2023-10-04 02:46:54,926 WARNING [train.py:1204] (2/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_200-105218-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 02:46:54,991 WARNING [train.py:1204] (2/4) Exclude cut with ID _3867_210_1883_1_1526641200575_4475830_319-349850-0 from training. Duration: 0.67 2023-10-04 02:46:56,942 WARNING [train.py:1204] (2/4) Exclude cut with ID _28317_210_12742_1_1532480302381_8510325_916-195848-0 from training. Duration: 0.92 2023-10-04 02:46:57,065 WARNING [train.py:1204] (2/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_436-186311-0 from training. Duration: 0.65 2023-10-04 02:46:59,779 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_3910_210_1794_1_1526777667_3747260_302-180617-0_sp1.1 from training. Duration: 0.97275 2023-10-04 02:47:01,234 WARNING [train.py:1204] (2/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_102-92751-0 from training. Duration: 0.99 2023-10-04 02:47:02,425 WARNING [train.py:1204] (2/4) Exclude cut with ID _51899_210_20034_1_1534129254466_3669220_265-215428-0_sp1.1 from training. Duration: 0.8909375 2023-10-04 02:47:04,378 WARNING [train.py:1204] (2/4) Exclude cut with ID _58583_210_5236_1_1534726835266_3574249_438-198993-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 02:47:09,176 WARNING [train.py:1204] (2/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_166-166990-0_sp1.1 from training. Duration: 0.87275 2023-10-04 02:47:11,905 WARNING [train.py:1204] (2/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_907-151398-0 from training. Duration: 0.37 2023-10-04 02:47:14,750 WARNING [train.py:1204] (2/4) Exclude cut with ID _38670_210_15379_1_1533448810659_4391059_452-256669-0_sp1.1 from training. Duration: 0.936375 2023-10-04 02:47:16,106 WARNING [train.py:1204] (2/4) Exclude cut with ID _48677_210_5663_1_1533902405919_3892989_408-40740-0 from training. Duration: 0.83 2023-10-04 02:47:16,154 WARNING [train.py:1204] (2/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_903-158756-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 02:47:16,200 WARNING [train.py:1204] (2/4) Exclude cut with ID _13252_210_1925_1_1530671372047_3336909_397-216497-0 from training. Duration: 0.96 2023-10-04 02:47:22,162 WARNING [train.py:1204] (2/4) Exclude cut with ID _55429_210_10626_1_1534467588527_7174143_97-335084-0 from training. Duration: 0.97 2023-10-04 02:47:29,687 WARNING [train.py:1204] (2/4) Exclude cut with ID _45329_210_7005_1_1534157951211_6774339_453-267730-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 02:47:29,755 WARNING [train.py:1204] (2/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_153-97441-0 from training. Duration: 0.56 2023-10-04 02:47:31,088 WARNING [train.py:1204] (2/4) Exclude cut with ID _40901_210_5665_1_1533363789958_3856130_312-329122-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 02:47:31,133 WARNING [train.py:1204] (2/4) Exclude cut with ID _30800_210_22382_1_1532499347904_4515959_97-126471-0_sp1.1 from training. Duration: 0.97275 2023-10-04 02:47:31,133 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_11305_210_1794_1_1529750428_7178148_257-312870-0_sp1.1 from training. Duration: 0.8 2023-10-04 02:47:32,550 WARNING [train.py:1204] (2/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_454-314901-0_sp0.9 from training. Duration: 0.511125 2023-10-04 02:47:35,928 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.attention_skip_rate, batch_count=1498266.6666666667, ans=0.0 2023-10-04 02:47:37,131 WARNING [train.py:1204] (2/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_282-136641-0_sp1.1 from training. Duration: 0.3818125 2023-10-04 02:47:37,367 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.ff3_skip_rate, batch_count=1498266.6666666667, ans=0.0 2023-10-04 02:47:39,135 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_46013_210_7234_1_1533776362_3744032_493-160724-0_sp1.1 from training. Duration: 0.963625 2023-10-04 02:47:39,177 WARNING [train.py:1204] (2/4) Exclude cut with ID _67831_210_12392_1_1535612464202_3092612_259-33442-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 02:47:40,630 WARNING [train.py:1204] (2/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_289-62384-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 02:47:40,698 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6545_210_2040_1_1527774670_4245258_379-14182-0_sp0.9 from training. Duration: 0.888875 2023-10-04 02:47:42,067 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_548-294316-0_sp1.1 from training. Duration: 0.736375 2023-10-04 02:47:43,515 WARNING [train.py:1204] (2/4) Exclude cut with ID _6275_210_2084_1_1528110062213_7669049_857-298942-0_sp1.1 from training. Duration: 0.77275 2023-10-04 02:47:44,773 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6673_210_3273_1_1527767790_3173240_327-306185-0_sp1.1 from training. Duration: 0.7545625 2023-10-04 02:47:52,187 WARNING [train.py:1204] (2/4) Exclude cut with ID _61008_210_10633_1_1534852769198_6974370_814-107647-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 02:47:52,262 WARNING [train.py:1204] (2/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_116-332008-0_sp1.1 from training. Duration: 0.8909375 2023-10-04 02:47:52,414 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.conv_module2.balancer1.prob, batch_count=1498333.3333333333, ans=0.125 2023-10-04 02:47:54,995 WARNING [train.py:1204] (2/4) Exclude cut with ID _32704_210_8954_1_1533017488840_6747370_266-329755-0 from training. Duration: 0.89 2023-10-04 02:47:54,998 WARNING [train.py:1204] (2/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_271-269084-0_sp0.9 from training. Duration: 0.92225 2023-10-04 02:47:56,388 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_3171_210_2087_1_1526109084_2330007_66-260583-0 from training. Duration: 0.72 2023-10-04 02:48:00,935 INFO [train.py:1046] (2/4) Epoch 43, batch 1650, loss[loss=0.1508, simple_loss=0.2363, pruned_loss=0.03259, over 24450.00 frames. ], tot_loss[loss=0.1567, simple_loss=0.2373, pruned_loss=0.03808, over 4717321.75 frames. ], batch size: 63, lr: 2.37e-03, grad_scale: 16.0 2023-10-04 02:48:03,637 WARNING [train.py:1204] (2/4) Exclude cut with ID _5250_210_5219_1_1527249561806_8692110_1326-276386-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 02:48:05,566 WARNING [train.py:1204] (2/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_372-30302-0_sp1.1 from training. Duration: 0.7818125 2023-10-04 02:48:05,628 WARNING [train.py:1204] (2/4) Exclude cut with ID _25882_210_3036_1_1532775665815_6479510_631-257029-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 02:48:05,636 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_3691_210_3528_1_1526468169_4062053_355-334432-0 from training. Duration: 0.87 2023-10-04 02:48:05,644 WARNING [train.py:1204] (2/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_839-173017-0_sp1.1 from training. Duration: 0.37275 2023-10-04 02:48:05,652 WARNING [train.py:1204] (2/4) Exclude cut with ID _14998_210_5219_1_1530705582080_7474970_441-91466-0 from training. Duration: 0.97 2023-10-04 02:48:05,691 WARNING [train.py:1204] (2/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_108-53929-0 from training. Duration: 0.59 2023-10-04 02:48:09,098 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.4.encoder.layers.2.self_attn_weights.whiten_keys, num_groups=4, num_channels=128, metric=2.09 vs. limit=6.0 2023-10-04 02:48:09,869 WARNING [train.py:1204] (2/4) Exclude cut with ID _9389_210_1664_1_1529217337301_5289269_260-1942-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 02:48:09,940 WARNING [train.py:1204] (2/4) Exclude cut with ID _56423_210_5854_1_1534896061049_3165969_198-61547-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 02:48:09,964 WARNING [train.py:1204] (2/4) Exclude cut with ID _44778_210_2054_1_1533798127203_7443090_412-142122-0_sp1.1 from training. Duration: 0.92725 2023-10-04 02:48:09,980 WARNING [train.py:1204] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_88-341718-0_sp1.1 from training. Duration: 0.6 2023-10-04 02:48:14,642 WARNING [train.py:1204] (2/4) Exclude cut with ID _61079_210_4525_1_1534921008198_4011419_334-298697-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 02:48:16,054 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_5465_210_4211_1_1527227253_55357_8-2499-0_sp1.1 from training. Duration: 0.996375 2023-10-04 02:48:17,458 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_447-201565-0_sp1.1 from training. Duration: 0.97275 2023-10-04 02:48:19,145 WARNING [train.py:1204] (2/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_253-161789-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 02:48:19,149 WARNING [train.py:1204] (2/4) Exclude cut with ID _44304_210_15374_1_1533639352765_4613129_302-22084-0_sp0.9 from training. Duration: 0.9555625 2023-10-04 02:48:19,164 WARNING [train.py:1204] (2/4) Exclude cut with ID _26664_210_7770_1_1532258943280_3418270_187-80815-0_sp1.1 from training. Duration: 0.8454375 2023-10-04 02:48:19,239 WARNING [train.py:1204] (2/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_68-212801-0 from training. Duration: 0.73 2023-10-04 02:48:19,405 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.conv_module2.balancer2.prob, batch_count=1498466.6666666667, ans=0.125 2023-10-04 02:48:20,528 WARNING [train.py:1204] (2/4) Exclude cut with ID _29345_210_4381_1_1532689168263_3860169_149-257268-0 from training. Duration: 0.81 2023-10-04 02:48:26,227 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_213-103282-0_sp1.1 from training. Duration: 0.7090625 2023-10-04 02:48:27,592 WARNING [train.py:1204] (2/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_156-302675-0_sp1.1 from training. Duration: 0.5 2023-10-04 02:48:27,862 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.conv_skip_rate, batch_count=1498466.6666666667, ans=0.0 2023-10-04 02:48:35,539 WARNING [train.py:1204] (2/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_125-322172-0 from training. Duration: 0.93 2023-10-04 02:48:35,604 WARNING [train.py:1204] (2/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_493-60474-0_sp1.1 from training. Duration: 0.963625 2023-10-04 02:48:38,328 WARNING [train.py:1204] (2/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_156-302675-0 from training. Duration: 0.55 2023-10-04 02:48:41,028 WARNING [train.py:1204] (2/4) Exclude cut with ID _41314_210_12651_1_1533459476543_3766460_123-267862-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 02:48:43,245 WARNING [train.py:1204] (2/4) Exclude cut with ID _28427_210_4220_1_1532581240967_3061069_201-143747-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 02:48:43,274 WARNING [train.py:1204] (2/4) Exclude cut with ID _8772_210_1850_1_1528635595972_3664011_96-12756-0_sp1.1 from training. Duration: 0.8909375 2023-10-04 02:48:43,300 WARNING [train.py:1204] (2/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_1347-145675-0_sp1.1 from training. Duration: 0.936375 2023-10-04 02:48:45,890 WARNING [train.py:1204] (2/4) Exclude cut with ID _9235_210_5219_1_1528891154253_8046120_387-159008-0_sp1.1 from training. Duration: 0.7818125 2023-10-04 02:48:45,921 WARNING [train.py:1204] (2/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_400-201896-0_sp1.1 from training. Duration: 0.963625 2023-10-04 02:48:49,211 WARNING [train.py:1204] (2/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_326-174612-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 02:48:49,240 WARNING [train.py:1204] (2/4) Exclude cut with ID _55844_210_2346_1_1534471212569_7625707_390-116233-0_sp1.1 from training. Duration: 0.963625 2023-10-04 02:48:50,496 WARNING [train.py:1204] (2/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_419-317062-0_sp1.1 from training. Duration: 0.82725 2023-10-04 02:48:50,550 WARNING [train.py:1204] (2/4) Exclude cut with ID _29877_210_6228_1_1532397791106_5276149_266-62291-0_sp0.9 from training. Duration: 0.9666875 2023-10-04 02:48:50,619 WARNING [train.py:1204] (2/4) Exclude cut with ID _7857_210_1534_1_1528286515745_6831470_431-338528-0_sp1.1 from training. Duration: 0.92725 2023-10-04 02:48:51,942 WARNING [train.py:1204] (2/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_87-131427-0_sp0.9 from training. Duration: 0.8666875 2023-10-04 02:48:54,754 WARNING [train.py:1204] (2/4) Exclude cut with ID _24055_210_12651_1_1532310907143_976979_92-59356-0_sp1.1 from training. Duration: 0.82725 2023-10-04 02:48:55,938 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.599e+02 1.993e+02 2.180e+02 2.514e+02 3.925e+02, threshold=4.360e+02, percent-clipped=0.0 2023-10-04 02:48:56,056 WARNING [train.py:1204] (2/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_96-290039-0 from training. Duration: 0.56 2023-10-04 02:48:57,475 WARNING [train.py:1204] (2/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_48-182155-0_sp0.9 from training. Duration: 0.9666875 2023-10-04 02:48:57,512 WARNING [train.py:1204] (2/4) Exclude cut with ID _4626_210_3532_1_1526817652717_3916859_446-206656-0 from training. Duration: 0.76 2023-10-04 02:49:00,818 WARNING [train.py:1204] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_168-32906-0 from training. Duration: 0.69 2023-10-04 02:49:00,842 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_3780_210_2087_1_1526467760_2130483_170-259044-0 from training. Duration: 0.7 2023-10-04 02:49:00,860 WARNING [train.py:1204] (2/4) Exclude cut with ID _65134_210_14763_1_1535335393985_3505368_153-216718-0_sp1.1 from training. Duration: 0.92725 2023-10-04 02:49:02,158 WARNING [train.py:1204] (2/4) Exclude cut with ID _64639_210_5816_1_1535367619437_3528000_461-329156-0_sp1.1 from training. Duration: 0.87275 2023-10-04 02:49:02,199 WARNING [train.py:1204] (2/4) Exclude cut with ID _21989_210_5854_1_1531899214931_3271360_126-347531-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 02:49:03,564 WARNING [train.py:1204] (2/4) Exclude cut with ID _7614_210_4915_1_1529326764092_4537359_96-162197-0_sp1.1 from training. Duration: 0.963625 2023-10-04 02:49:03,565 WARNING [train.py:1204] (2/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_1083-147536-0 from training. Duration: 0.97 2023-10-04 02:49:06,990 WARNING [train.py:1204] (2/4) Exclude cut with ID _64029_210_4381_1_1535178502772_3756459_275-16710-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 02:49:08,373 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7264_210_4938_1_1527939911_8132095_800-91995-0_sp0.9 from training. Duration: 0.9555625 2023-10-04 02:49:08,407 WARNING [train.py:1204] (2/4) Exclude cut with ID _3844_210_1390_1_1526727803582_2871209_50-193879-0_sp1.1 from training. Duration: 0.936375 2023-10-04 02:49:11,005 WARNING [train.py:1204] (2/4) Exclude cut with ID _46952_210_5986_1_1534230010357_3107379_232-344503-0 from training. Duration: 0.96 2023-10-04 02:49:14,462 WARNING [train.py:1204] (2/4) Exclude cut with ID _25808_210_13292_1_1532343514562_7366109_206-14076-0_sp1.1 from training. Duration: 0.936375 2023-10-04 02:49:14,463 WARNING [train.py:1204] (2/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_55-31904-0_sp0.9 from training. Duration: 0.97775 2023-10-04 02:49:14,484 WARNING [train.py:1204] (2/4) Exclude cut with ID _14632_210_9988_1_1530691279392_3923200_161-129530-0 from training. Duration: 0.96 2023-10-04 02:49:15,716 INFO [train.py:1046] (2/4) Epoch 43, batch 1700, loss[loss=0.145, simple_loss=0.213, pruned_loss=0.03849, over 23610.00 frames. ], tot_loss[loss=0.1567, simple_loss=0.2369, pruned_loss=0.03828, over 4702750.43 frames. ], batch size: 256, lr: 2.37e-03, grad_scale: 16.0 2023-10-04 02:49:15,775 WARNING [train.py:1204] (2/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_770-200430-0_sp1.1 from training. Duration: 0.8090625 2023-10-04 02:49:15,791 WARNING [train.py:1204] (2/4) Exclude cut with ID _6270_210_2084_1_1527850911573_3856420_26-277343-0_sp1.1 from training. Duration: 0.7454375 2023-10-04 02:49:15,792 WARNING [train.py:1204] (2/4) Exclude cut with ID _69884_210_9771_1_1535846392818_4166169_88-23776-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 02:49:20,194 WARNING [train.py:1204] (2/4) Exclude cut with ID _60860_210_8254_1_1534896015875_3557169_245-256666-0_sp1.1 from training. Duration: 0.8818125 2023-10-04 02:49:20,225 WARNING [train.py:1204] (2/4) Exclude cut with ID _5250_210_5219_1_1527249561806_8692110_636-251213-0_sp1.1 from training. Duration: 0.8545625 2023-10-04 02:49:21,518 WARNING [train.py:1204] (2/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_540-131164-0 from training. Duration: 0.92 2023-10-04 02:49:23,086 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_5931_210_2040_1_1527428481_4547999_321-193614-0_sp0.9 from training. Duration: 0.7444375 2023-10-04 02:49:31,799 WARNING [train.py:1204] (2/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_373-225403-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 02:49:34,522 WARNING [train.py:1204] (2/4) Exclude cut with ID _20622_210_3852_1_1531622427080_1093839_57-326027-0_sp1.1 from training. Duration: 0.97275 2023-10-04 02:49:36,501 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.4.encoder.layers.2.feed_forward1.out_whiten, num_groups=1, num_channels=384, metric=8.33 vs. limit=15.0 2023-10-04 02:49:41,726 WARNING [train.py:1204] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_605-257205-0_sp1.1 from training. Duration: 0.536375 2023-10-04 02:49:41,747 WARNING [train.py:1204] (2/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_442-320317-0_sp0.9 from training. Duration: 0.911125 2023-10-04 02:49:41,789 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_35361_210_4238_1_1533262810_7793476_675-87012-0_sp1.1 from training. Duration: 0.8090625 2023-10-04 02:49:41,826 WARNING [train.py:1204] (2/4) Exclude cut with ID _4184_210_2550_1_1526730192013_2219380_306-44736-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 02:49:44,592 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_124-113562-0 from training. Duration: 0.94 2023-10-04 02:49:46,649 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_2821_210_2715_1_1526108385_3763560_73-296673-0_sp0.9 from training. Duration: 0.92225 2023-10-04 02:49:46,666 WARNING [train.py:1204] (2/4) Exclude cut with ID _10301_210_5886_1_1529222275332_3910620_216-46959-0_sp1.1 from training. Duration: 0.92725 2023-10-04 02:49:48,072 WARNING [train.py:1204] (2/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_91-277684-0_sp0.9 from training. Duration: 0.82225 2023-10-04 02:49:50,760 WARNING [train.py:1204] (2/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_351-294650-0_sp0.9 from training. Duration: 0.7 2023-10-04 02:49:52,561 WARNING [train.py:1204] (2/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_209-205365-0 from training. Duration: 0.79 2023-10-04 02:49:52,618 WARNING [train.py:1204] (2/4) Exclude cut with ID _14603_210_4220_1_1530615688441_3097319_97-325053-0 from training. Duration: 0.92 2023-10-04 02:49:52,767 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.conv_module1.balancer1.prob, batch_count=1498866.6666666667, ans=0.125 2023-10-04 02:49:54,023 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_49348_210_7927_1_1533954500_3691300_109-276430-0_sp1.1 from training. Duration: 0.92725 2023-10-04 02:49:55,422 WARNING [train.py:1204] (2/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_912-47473-0 from training. Duration: 0.68 2023-10-04 02:49:56,751 WARNING [train.py:1204] (2/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_96-320736-0_sp0.9 from training. Duration: 0.9555625 2023-10-04 02:50:03,262 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.ff2_skip_rate, batch_count=1498933.3333333333, ans=0.0 2023-10-04 02:50:04,417 WARNING [train.py:1204] (2/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_1252-235481-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 02:50:04,513 WARNING [train.py:1204] (2/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_813-261554-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 02:50:05,924 WARNING [train.py:1204] (2/4) Exclude cut with ID _21632_210_2253_1_1531994001765_3744140_110-274654-0_sp0.9 from training. Duration: 0.911125 2023-10-04 02:50:06,062 WARNING [train.py:1204] (2/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_381-317791-0_sp1.1 from training. Duration: 0.663625 2023-10-04 02:50:06,078 WARNING [train.py:1204] (2/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_51-117576-0_sp1.1 from training. Duration: 0.363625 2023-10-04 02:50:07,383 WARNING [train.py:1204] (2/4) Exclude cut with ID _42321_210_7770_1_1533468631352_4366895_104-215915-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 02:50:10,573 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_40896_210_15710_1_1533433956_7100802_216-22824-0_sp1.1 from training. Duration: 0.92725 2023-10-04 02:50:10,574 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_14831_210_7927_1_1530700200_5583610_181-27467-0 from training. Duration: 0.91 2023-10-04 02:50:11,864 WARNING [train.py:1204] (2/4) Exclude cut with ID _30304_210_6319_1_1532476827261_7582060_1024-265645-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 02:50:11,867 WARNING [train.py:1204] (2/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_412-233904-0_sp1.1 from training. Duration: 0.936375 2023-10-04 02:50:11,885 WARNING [train.py:1204] (2/4) Exclude cut with ID _30191_210_1664_1_1533189915047_7196670_911-223461-0_sp1.1 from training. Duration: 0.92725 2023-10-04 02:50:11,893 WARNING [train.py:1204] (2/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_558-148266-0_sp1.1 from training. Duration: 0.963625 2023-10-04 02:50:13,355 WARNING [train.py:1204] (2/4) Exclude cut with ID _58480_210_12392_1_1534811121111_3646160_212-164495-0_sp1.1 from training. Duration: 0.936375 2023-10-04 02:50:13,357 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_454-276429-0_sp0.9 from training. Duration: 0.9666875 2023-10-04 02:50:14,715 WARNING [train.py:1204] (2/4) Exclude cut with ID _24736_210_3279_1_1532332894218_2838700_73-197086-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 02:50:16,662 WARNING [train.py:1204] (2/4) Exclude cut with ID _5294_210_1747_1_1527249643928_3696179_529-242298-0_sp1.1 from training. Duration: 0.863625 2023-10-04 02:50:16,700 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_53555_210_5632_1_1534595220_7232297_345-171438-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 02:50:18,983 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.bypass.scale_min, batch_count=1499000.0, ans=0.2 2023-10-04 02:50:20,018 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_28211_210_6388_1_1532861506_4523224_22-25766-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 02:50:21,414 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7002_210_1794_1_1527939683_5157286_163-87552-0 from training. Duration: 0.86 2023-10-04 02:50:24,075 WARNING [train.py:1204] (2/4) Exclude cut with ID _56927_210_9969_1_1534848970840_3695229_409-225129-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 02:50:24,174 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_26192_210_7927_1_1532137269_1040115_21-219331-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 02:50:26,907 WARNING [train.py:1204] (2/4) Exclude cut with ID _15012_210_1968_1_1530856820429_5095350_662-210469-0 from training. Duration: 0.57 2023-10-04 02:50:29,703 INFO [train.py:1046] (2/4) Epoch 43, batch 1750, loss[loss=0.1689, simple_loss=0.2559, pruned_loss=0.04089, over 24065.00 frames. ], tot_loss[loss=0.1549, simple_loss=0.2344, pruned_loss=0.03769, over 4689155.55 frames. ], batch size: 80, lr: 2.37e-03, grad_scale: 8.0 2023-10-04 02:50:31,832 WARNING [train.py:1204] (2/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_582-191016-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 02:50:34,629 WARNING [train.py:1204] (2/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_270-210032-0_sp1.1 from training. Duration: 0.963625 2023-10-04 02:50:34,658 WARNING [train.py:1204] (2/4) Exclude cut with ID _6789_210_1534_1_1527857937218_3118950_262-48853-0_sp0.9 from training. Duration: 0.62225 2023-10-04 02:50:35,497 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.1.encoder.layers.1.conv_module1.whiten, num_groups=1, num_channels=256, metric=12.06 vs. limit=15.0 2023-10-04 02:50:36,008 WARNING [train.py:1204] (2/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_847-41334-0 from training. Duration: 0.44 2023-10-04 02:50:36,039 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_40896_210_15710_1_1533433956_7100802_573-46532-0_sp1.1 from training. Duration: 0.92725 2023-10-04 02:50:39,423 WARNING [train.py:1204] (2/4) Exclude cut with ID _21881_210_3525_1_1531825242757_3798380_247-102591-0_sp1.1 from training. Duration: 0.8909375 2023-10-04 02:50:39,454 WARNING [train.py:1204] (2/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_200-21395-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 02:50:40,983 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.ff3_skip_rate, batch_count=1499066.6666666667, ans=0.0 2023-10-04 02:50:43,637 WARNING [train.py:1204] (2/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_711-15688-0 from training. Duration: 0.43 2023-10-04 02:50:45,757 WARNING [train.py:1204] (2/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_53-75025-0_sp1.1 from training. Duration: 0.963625 2023-10-04 02:50:48,468 WARNING [train.py:1204] (2/4) Exclude cut with ID _6194_210_1534_1_1527511826160_3941194_321-148641-0 from training. Duration: 0.64 2023-10-04 02:50:48,509 WARNING [train.py:1204] (2/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_337-294431-0_sp1.1 from training. Duration: 0.936375 2023-10-04 02:50:50,438 WARNING [train.py:1204] (2/4) Exclude cut with ID _21632_210_2253_1_1531994001765_3744140_110-274654-0_sp1.1 from training. Duration: 0.7454375 2023-10-04 02:50:53,169 WARNING [train.py:1204] (2/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_135-174080-0_sp1.1 from training. Duration: 0.4818125 2023-10-04 02:50:53,262 WARNING [train.py:1204] (2/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_21-174514-0 from training. Duration: 0.43 2023-10-04 02:50:55,298 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.2.encoder.layers.0.feed_forward1.out_whiten, num_groups=1, num_channels=384, metric=5.74 vs. limit=15.0 2023-10-04 02:50:56,013 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_38965_210_4852_1_1533174906_3484735_215-294589-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 02:50:56,048 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_548-294316-0 from training. Duration: 0.81 2023-10-04 02:50:56,345 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.out_combiner.scale_min, batch_count=1499133.3333333333, ans=0.2 2023-10-04 02:51:04,551 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_103-84010-0_sp1.1 from training. Duration: 0.62725 2023-10-04 02:51:07,358 WARNING [train.py:1204] (2/4) Exclude cut with ID _18199_210_6109_1_1531475789864_4288790_295-198247-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 02:51:07,386 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_13757_210_3528_1_1530440682_4107239_103-313224-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 02:51:12,156 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_34886_210_10633_1_1533203882_1755661_67-294673-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 02:51:12,166 WARNING [train.py:1204] (2/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_350-101734-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 02:51:14,942 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_37357_210_17024_1_1533089504_2316121_204-153986-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 02:51:15,173 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.conv_module2.balancer2.prob, batch_count=1499266.6666666667, ans=0.125 2023-10-04 02:51:16,856 WARNING [train.py:1204] (2/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_673-115569-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 02:51:19,530 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_489-230703-0_sp0.9 from training. Duration: 0.9444375 2023-10-04 02:51:20,862 WARNING [train.py:1204] (2/4) Exclude cut with ID _5250_210_5219_1_1527249561806_8692110_575-102496-0_sp1.1 from training. Duration: 0.936375 2023-10-04 02:51:20,953 WARNING [train.py:1204] (2/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_669-93163-0 from training. Duration: 0.98 2023-10-04 02:51:22,846 WARNING [train.py:1204] (2/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_249-281349-0_sp0.9 from training. Duration: 0.9444375 2023-10-04 02:51:24,313 WARNING [train.py:1204] (2/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_106-30782-0 from training. Duration: 0.8 2023-10-04 02:51:24,379 WARNING [train.py:1204] (2/4) Exclude cut with ID _8772_210_1850_1_1528635595972_3664011_398-320049-0_sp1.1 from training. Duration: 0.8 2023-10-04 02:51:26,923 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.538e+02 2.005e+02 2.231e+02 2.661e+02 3.753e+02, threshold=4.462e+02, percent-clipped=0.0 2023-10-04 02:51:27,007 WARNING [train.py:1204] (2/4) Exclude cut with ID _44778_210_2054_1_1533798127203_7443090_516-151727-0_sp1.1 from training. Duration: 0.963625 2023-10-04 02:51:27,073 WARNING [train.py:1204] (2/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_325-62802-0_sp1.1 from training. Duration: 0.7909375 2023-10-04 02:51:29,856 WARNING [train.py:1204] (2/4) Exclude cut with ID _14368_210_4220_1_1530532714870_3595529_93-58002-0_sp0.9 from training. Duration: 0.6333125 2023-10-04 02:51:29,911 WARNING [train.py:1204] (2/4) Exclude cut with ID _4681_210_3525_1_1527251040321_1235713_123-24428-0_sp1.1 from training. Duration: 0.436375 2023-10-04 02:51:31,296 WARNING [train.py:1204] (2/4) Exclude cut with ID _5250_210_5219_1_1527249561806_8692110_1185-56473-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 02:51:33,279 WARNING [train.py:1204] (2/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_96-244299-0_sp1.1 from training. Duration: 0.8 2023-10-04 02:51:36,043 WARNING [train.py:1204] (2/4) Exclude cut with ID _59500_210_8254_1_1534809610680_3736009_104-72815-0_sp1.1 from training. Duration: 0.963625 2023-10-04 02:51:38,876 WARNING [train.py:1204] (2/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_468-283113-0_sp1.1 from training. Duration: 0.87275 2023-10-04 02:51:40,265 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6239_210_4211_1_1527765584_4306698_528-102186-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 02:51:41,988 WARNING [train.py:1204] (2/4) Exclude cut with ID _45297_210_2721_1_1533715351381_3264949_351-147136-0 from training. Duration: 0.87 2023-10-04 02:51:41,992 WARNING [train.py:1204] (2/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_520-51744-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 02:51:43,350 WARNING [train.py:1204] (2/4) Exclude cut with ID _5166_210_2084_1_1527073197160_4087993_310-114452-0_sp1.1 from training. Duration: 0.536375 2023-10-04 02:51:43,354 WARNING [train.py:1204] (2/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_450-165710-0_sp1.1 from training. Duration: 0.92725 2023-10-04 02:51:43,371 WARNING [train.py:1204] (2/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_280-120942-0_sp1.1 from training. Duration: 0.7 2023-10-04 02:51:43,398 WARNING [train.py:1204] (2/4) Exclude cut with ID _65585_210_12210_1_1535450451584_3764098_112-294301-0_sp0.9 from training. Duration: 0.97775 2023-10-04 02:51:44,685 INFO [train.py:1046] (2/4) Epoch 43, batch 1800, loss[loss=0.1549, simple_loss=0.2481, pruned_loss=0.03092, over 24329.00 frames. ], tot_loss[loss=0.1543, simple_loss=0.2337, pruned_loss=0.03746, over 4690546.22 frames. ], batch size: 74, lr: 2.37e-03, grad_scale: 8.0 2023-10-04 02:51:44,785 WARNING [train.py:1204] (2/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_98-51676-0_sp0.9 from training. Duration: 0.811125 2023-10-04 02:51:48,019 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.ff3_skip_rate, batch_count=1499400.0, ans=0.0 2023-10-04 02:51:49,143 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_33348_210_5632_1_1533086912_3324030_85-28583-0_sp1.1 from training. Duration: 0.7454375 2023-10-04 02:51:49,223 WARNING [train.py:1204] (2/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_336-208232-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 02:51:51,233 WARNING [train.py:1204] (2/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_339-104791-0_sp0.9 from training. Duration: 0.5444375 2023-10-04 02:51:54,075 WARNING [train.py:1204] (2/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_559-29840-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 02:51:55,603 WARNING [train.py:1204] (2/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_117-290916-0_sp0.9 from training. Duration: 0.5666875 2023-10-04 02:51:56,879 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_43802_210_2290_1_1533516203_8829504_185-112289-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 02:51:59,747 WARNING [train.py:1204] (2/4) Exclude cut with ID _59256_210_5986_1_1535076085997_3589120_155-298797-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 02:52:02,576 WARNING [train.py:1204] (2/4) Exclude cut with ID _44659_210_2721_1_1534036120061_6614860_246-206531-0_sp1.1 from training. Duration: 0.92725 2023-10-04 02:52:03,874 WARNING [train.py:1204] (2/4) Exclude cut with ID _23818_210_6228_1_1531917063884_3676920_469-151944-0_sp1.1 from training. Duration: 0.92725 2023-10-04 02:52:03,967 WARNING [train.py:1204] (2/4) Exclude cut with ID _13252_210_1925_1_1530671372047_3336909_88-276668-0_sp1.1 from training. Duration: 0.9 2023-10-04 02:52:07,471 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_15101_210_7927_1_1530786415_5033274_320-327527-0_sp1.1 from training. Duration: 0.87275 2023-10-04 02:52:07,486 WARNING [train.py:1204] (2/4) Exclude cut with ID _14998_210_5219_1_1530705582080_7474970_446-112117-0 from training. Duration: 0.9 2023-10-04 02:52:08,805 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_30280_210_1881_1_1532872779_3660139_400-254159-0_sp1.1 from training. Duration: 0.936375 2023-10-04 02:52:08,977 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.0.conv_module2.balancer1.prob, batch_count=1499466.6666666667, ans=0.125 2023-10-04 02:52:11,616 WARNING [train.py:1204] (2/4) Exclude cut with ID _40284_210_2084_1_1533513679814_3704279_158-345341-0_sp1.1 from training. Duration: 0.936375 2023-10-04 02:52:11,893 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.conv_module2.balancer2.prob, batch_count=1499466.6666666667, ans=0.125 2023-10-04 02:52:15,053 WARNING [train.py:1204] (2/4) Exclude cut with ID 3033-130750-0096-55598-0 from training. Duration: 0.83 2023-10-04 02:52:18,208 WARNING [train.py:1204] (2/4) Exclude cut with ID _9389_210_1664_1_1529217337301_5289269_379-128794-0 from training. Duration: 0.85 2023-10-04 02:52:18,233 WARNING [train.py:1204] (2/4) Exclude cut with ID _3675_210_4557_1_1526639934408_4182089_456-18305-0 from training. Duration: 0.76 2023-10-04 02:52:18,261 WARNING [train.py:1204] (2/4) Exclude cut with ID _29853_210_7497_1_1532505600851_6642110_571-249460-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 02:52:19,223 INFO [scaling.py:1022] (2/4) Whitening: name=encoder_embed.convnext.out_whiten, num_groups=1, num_channels=128, metric=4.26 vs. limit=5.0 2023-10-04 02:52:19,564 WARNING [train.py:1204] (2/4) Exclude cut with ID _4068_210_2629_1_1526697350395_1546259_230-325568-0_sp1.1 from training. Duration: 0.92725 2023-10-04 02:52:19,565 WARNING [train.py:1204] (2/4) Exclude cut with ID _34415_210_8750_1_1533085213375_3451116_213-267570-0_sp1.1 from training. Duration: 0.963625 2023-10-04 02:52:19,641 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6767_210_3273_1_1527855783_1718932_142-127132-0_sp1.1 from training. Duration: 0.863625 2023-10-04 02:52:25,527 WARNING [train.py:1204] (2/4) Exclude cut with ID IC0653W0001-322925 from training. Duration: 0.896 2023-10-04 02:52:26,875 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_4528_210_3273_1_1527076654_3276777_209-219804-0_sp1.1 from training. Duration: 0.62725 2023-10-04 02:52:28,345 WARNING [train.py:1204] (2/4) Exclude cut with ID _55844_210_2346_1_1534471212569_7625707_187-204971-0_sp1.1 from training. Duration: 0.936375 2023-10-04 02:52:29,763 WARNING [train.py:1204] (2/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_369-325184-0 from training. Duration: 0.99 2023-10-04 02:52:29,789 WARNING [train.py:1204] (2/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_791-45149-0 from training. Duration: 0.86 2023-10-04 02:52:31,130 WARNING [train.py:1204] (2/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_317-211107-0_sp1.1 from training. Duration: 0.836375 2023-10-04 02:52:32,572 WARNING [train.py:1204] (2/4) Exclude cut with ID _30800_210_22382_1_1532499347904_4515959_488-330913-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 02:52:33,967 WARNING [train.py:1204] (2/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_506-42614-0_sp1.1 from training. Duration: 0.7181875 2023-10-04 02:52:38,671 WARNING [train.py:1204] (2/4) Exclude cut with ID _8696_210_1876_1_1528545632440_3345058_209-54069-0 from training. Duration: 0.67 2023-10-04 02:52:44,205 WARNING [train.py:1204] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_215-202973-0_sp1.1 from training. Duration: 0.8 2023-10-04 02:52:45,568 WARNING [train.py:1204] (2/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_321-215742-0 from training. Duration: 0.5 2023-10-04 02:52:45,617 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_40896_210_15710_1_1533433956_7100802_428-32114-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 02:52:45,625 WARNING [train.py:1204] (2/4) Exclude cut with ID _27013_210_1389_1_1532437173240_3252649_467-263151-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 02:52:45,677 WARNING [train.py:1204] (2/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_734-129761-0_sp0.9 from training. Duration: 0.911125 2023-10-04 02:52:45,709 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_521-214276-0 from training. Duration: 0.44 2023-10-04 02:52:49,255 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.self_attn_weights.pos_emb_skip_rate, batch_count=1499666.6666666667, ans=0.0 2023-10-04 02:52:51,078 WARNING [train.py:1204] (2/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_80-147301-0_sp1.1 from training. Duration: 0.763625 2023-10-04 02:52:51,080 WARNING [train.py:1204] (2/4) Exclude cut with ID _9679_210_7497_1_1529129212906_4363929_56-307796-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 02:52:53,526 WARNING [train.py:1204] (2/4) Exclude cut with ID _14832_210_8750_1_1530691351539_3171348_1-54315-0 from training. Duration: 0.87 2023-10-04 02:52:53,526 WARNING [train.py:1204] (2/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_558-155750-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 02:52:55,179 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.ff3_skip_rate, batch_count=1499666.6666666667, ans=0.0 2023-10-04 02:52:56,271 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_142-309370-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 02:52:56,305 WARNING [train.py:1204] (2/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_18-46756-0_sp0.9 from training. Duration: 0.87775 2023-10-04 02:52:56,319 WARNING [train.py:1204] (2/4) Exclude cut with ID _61148_210_6897_1_1535094022726_4217610_279-212159-0_sp1.1 from training. Duration: 0.936375 2023-10-04 02:52:57,658 WARNING [train.py:1204] (2/4) Exclude cut with ID _21987_210_5854_1_1531821500482_3637038_164-2641-0_sp1.1 from training. Duration: 0.936375 2023-10-04 02:52:57,712 WARNING [train.py:1204] (2/4) Exclude cut with ID _8064_210_5399_1_1528372299291_4282973_395-340852-0_sp1.1 from training. Duration: 0.8454375 2023-10-04 02:52:58,934 INFO [train.py:1046] (2/4) Epoch 43, batch 1850, loss[loss=0.1606, simple_loss=0.2547, pruned_loss=0.03328, over 24648.00 frames. ], tot_loss[loss=0.1551, simple_loss=0.2348, pruned_loss=0.03765, over 4684447.11 frames. ], batch size: 73, lr: 2.37e-03, grad_scale: 8.0 2023-10-04 02:53:00,438 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1077-184261-0_sp1.1 from training. Duration: 0.963625 2023-10-04 02:53:00,446 WARNING [train.py:1204] (2/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_1176-204992-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 02:53:03,196 WARNING [train.py:1204] (2/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_563-347832-0_sp1.1 from training. Duration: 0.8090625 2023-10-04 02:53:04,470 WARNING [train.py:1204] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_380-204756-0_sp0.9 from training. Duration: 0.9555625 2023-10-04 02:53:10,420 WARNING [train.py:1204] (2/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_212-10501-0_sp1.1 from training. Duration: 0.8545625 2023-10-04 02:53:10,449 WARNING [train.py:1204] (2/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_445-117398-0 from training. Duration: 0.89 2023-10-04 02:53:13,223 WARNING [train.py:1204] (2/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_29-280622-0 from training. Duration: 0.82 2023-10-04 02:53:17,851 WARNING [train.py:1204] (2/4) Exclude cut with ID _26664_210_7770_1_1532258943280_3418270_187-80815-0 from training. Duration: 0.93 2023-10-04 02:53:21,271 WARNING [train.py:1204] (2/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_414-349640-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 02:53:21,293 WARNING [train.py:1204] (2/4) Exclude cut with ID _45021_210_9994_1_1533619751094_3330288_96-239010-0 from training. Duration: 0.91 2023-10-04 02:53:21,296 WARNING [train.py:1204] (2/4) Exclude cut with ID _5189_210_4557_1_1527248319428_3769229_199-53526-0_sp0.9 from training. Duration: 0.4666875 2023-10-04 02:53:32,166 WARNING [train.py:1204] (2/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_63-101996-0_sp1.1 from training. Duration: 0.8 2023-10-04 02:53:33,530 WARNING [train.py:1204] (2/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_199-86432-0 from training. Duration: 0.98 2023-10-04 02:53:33,794 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.feed_forward3.hidden_balancer.prob, batch_count=1499866.6666666667, ans=0.125 2023-10-04 02:53:36,261 WARNING [train.py:1204] (2/4) Exclude cut with ID _36399_210_16049_1_1532998771377_3698333_207-259013-0_sp1.1 from training. Duration: 0.92725 2023-10-04 02:53:37,713 WARNING [train.py:1204] (2/4) Exclude cut with ID _62574_210_2655_1_1535180407058_3933249_567-111756-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 02:53:41,157 WARNING [train.py:1204] (2/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_436-318113-0 from training. Duration: 0.75 2023-10-04 02:53:41,208 WARNING [train.py:1204] (2/4) Exclude cut with ID _53379_210_20034_1_1534222027363_3854654_43-305214-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 02:53:41,225 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_15126_210_6753_1_1530778677_2591185_113-157651-0_sp1.1 from training. Duration: 0.6181875 2023-10-04 02:53:42,570 WARNING [train.py:1204] (2/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_146-218763-0_sp1.1 from training. Duration: 0.7818125 2023-10-04 02:53:44,213 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.nonlin_attention.balancer.prob, batch_count=1499933.3333333333, ans=0.125 2023-10-04 02:53:45,302 WARNING [train.py:1204] (2/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_714-310538-0_sp0.9 from training. Duration: 0.9555625 2023-10-04 02:53:46,817 WARNING [train.py:1204] (2/4) Exclude cut with ID _24271_210_12658_1_1531987270022_7190519_709-16721-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 02:53:50,068 WARNING [train.py:1204] (2/4) Exclude cut with ID _5190_210_4557_1_1527244368799_373439_28-197030-0_sp0.9 from training. Duration: 0.8 2023-10-04 02:53:51,832 WARNING [train.py:1204] (2/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_267-197536-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 02:53:51,856 WARNING [train.py:1204] (2/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_658-282610-0_sp1.1 from training. Duration: 0.4818125 2023-10-04 02:53:51,865 WARNING [train.py:1204] (2/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_257-67050-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 02:53:52,159 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.conv_module2.balancer1.prob, batch_count=1499933.3333333333, ans=0.125 2023-10-04 02:53:53,912 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_14906_210_7927_1_1530754190_9408633_961-53402-0_sp1.1 from training. Duration: 0.936375 2023-10-04 02:53:55,102 WARNING [train.py:1204] (2/4) Exclude cut with ID _60113_210_16271_1_1535164212481_4386079_490-280290-0_sp1.1 from training. Duration: 0.8909375 2023-10-04 02:53:56,295 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.697e+02 1.939e+02 2.109e+02 2.438e+02 4.084e+02, threshold=4.217e+02, percent-clipped=0.0 2023-10-04 02:53:58,013 WARNING [train.py:1204] (2/4) Exclude cut with ID _14100_210_8341_1_1530529320613_3678069_258-224599-0 from training. Duration: 0.77 2023-10-04 02:53:59,353 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6961_210_2087_1_1527937168_6815011_565-326898-0_sp1.1 from training. Duration: 0.936375 2023-10-04 02:54:02,794 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.3.encoder.layers.1.nonlin_attention.whiten2, num_groups=1, num_channels=512, metric=5.21 vs. limit=15.0 2023-10-04 02:54:03,495 WARNING [train.py:1204] (2/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_239-316802-0_sp1.1 from training. Duration: 0.62725 2023-10-04 02:54:03,562 WARNING [train.py:1204] (2/4) Exclude cut with ID _55857_210_2346_1_1534644007831_6898209_745-98391-0_sp1.1 from training. Duration: 0.6909375 2023-10-04 02:54:03,565 WARNING [train.py:1204] (2/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_154-135696-0 from training. Duration: 0.8 2023-10-04 02:54:03,567 WARNING [train.py:1204] (2/4) Exclude cut with ID _14368_210_4220_1_1530532714870_3595529_93-58002-0 from training. Duration: 0.57 2023-10-04 02:54:04,954 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9025W0005-507335 from training. Duration: 0.896 2023-10-04 02:54:06,362 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9029W0034-509435 from training. Duration: 0.64 2023-10-04 02:54:09,000 WARNING [train.py:1204] (2/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_76-203867-0_sp1.1 from training. Duration: 0.6545625 2023-10-04 02:54:09,009 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_46639_210_8341_1_1533726088_4495500_66-346325-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 02:54:09,015 WARNING [train.py:1204] (2/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_145-137790-0_sp0.9 from training. Duration: 0.92225 2023-10-04 02:54:09,038 WARNING [train.py:1204] (2/4) Exclude cut with ID _36522_210_8887_1_1533177030611_1772478_7-327299-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 02:54:09,327 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.conv_module1.balancer1.prob, batch_count=1500000.0, ans=0.125 2023-10-04 02:54:10,363 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9042W0009-515615 from training. Duration: 0.896 2023-10-04 02:54:10,379 WARNING [train.py:1204] (2/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_39-248870-0_sp0.9 from training. Duration: 0.7666875 2023-10-04 02:54:10,425 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_35441_210_16554_1_1533347572_4092680_362-242912-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 02:54:11,667 INFO [train.py:1046] (2/4) Epoch 43, batch 1900, loss[loss=0.1517, simple_loss=0.2404, pruned_loss=0.03147, over 24574.00 frames. ], tot_loss[loss=0.1561, simple_loss=0.2361, pruned_loss=0.03805, over 4683302.56 frames. ], batch size: 71, lr: 2.37e-03, grad_scale: 8.0 2023-10-04 02:54:11,759 WARNING [train.py:1204] (2/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_216-343007-0_sp1.1 from training. Duration: 0.6 2023-10-04 02:54:13,509 WARNING [train.py:1204] (2/4) Exclude cut with ID _3867_210_1883_1_1526641200575_4475830_272-215563-0_sp0.9 from training. Duration: 0.6333125 2023-10-04 02:54:15,005 WARNING [train.py:1204] (2/4) Exclude cut with ID _21783_210_11357_1_1531742458730_3762679_553-175603-0_sp1.1 from training. Duration: 0.92725 2023-10-04 02:54:15,024 WARNING [train.py:1204] (2/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_963-187109-0 from training. Duration: 0.99 2023-10-04 02:54:17,857 WARNING [train.py:1204] (2/4) Exclude cut with ID _44973_210_1511_1_1533794381163_3634770_495-211466-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 02:54:17,874 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9068W0002-529085 from training. Duration: 0.896 2023-10-04 02:54:17,893 WARNING [train.py:1204] (2/4) Exclude cut with ID _5294_210_1747_1_1527249643928_3696179_180-100713-0_sp1.1 from training. Duration: 0.736375 2023-10-04 02:54:19,240 WARNING [train.py:1204] (2/4) Exclude cut with ID _35799_210_5854_1_1533263487887_3529071_149-49689-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 02:54:24,469 WARNING [train.py:1204] (2/4) Exclude cut with ID _67725_210_27767_1_1535608756671_5105359_36-220794-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 02:54:27,705 WARNING [train.py:1204] (2/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_338-96520-0_sp1.1 from training. Duration: 0.8545625 2023-10-04 02:54:27,780 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9104W0004-547160 from training. Duration: 0.896 2023-10-04 02:54:29,057 WARNING [train.py:1204] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_330-327861-0 from training. Duration: 0.67 2023-10-04 02:54:30,448 WARNING [train.py:1204] (2/4) Exclude cut with ID _14087_210_2253_1_1530529224512_3014029_156-257607-0_sp0.9 from training. Duration: 0.92225 2023-10-04 02:54:31,756 WARNING [train.py:1204] (2/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_476-322050-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 02:54:31,765 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9118W0017-553385 from training. Duration: 0.896 2023-10-04 02:54:31,799 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9120W0028-554435 from training. Duration: 0.896 2023-10-04 02:54:36,034 WARNING [train.py:1204] (2/4) Exclude cut with ID _56524_210_9642_1_1534572011779_3619170_145-310842-0 from training. Duration: 0.99 2023-10-04 02:54:37,377 WARNING [train.py:1204] (2/4) Exclude cut with ID _51631_210_8254_1_1534129228420_4084794_270-222508-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 02:54:39,306 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.3.encoder.layers.3.feed_forward3.out_whiten, num_groups=1, num_channels=512, metric=12.24 vs. limit=15.0 2023-10-04 02:54:40,043 WARNING [train.py:1204] (2/4) Exclude cut with ID _14268_210_9206_1_1530622771285_4714099_331-105351-0 from training. Duration: 0.65 2023-10-04 02:54:41,504 WARNING [train.py:1204] (2/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_40-129756-0 from training. Duration: 0.81 2023-10-04 02:54:48,325 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.1.encoder.layers.1.conv_module2.whiten, num_groups=1, num_channels=256, metric=10.00 vs. limit=15.0 2023-10-04 02:54:51,602 WARNING [train.py:1204] (2/4) Exclude cut with ID _3541_210_2477_1_1526475408709_3263973_231-104736-0 from training. Duration: 0.78 2023-10-04 02:54:54,381 WARNING [train.py:1204] (2/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_214-291369-0 from training. Duration: 0.63 2023-10-04 02:54:54,388 WARNING [train.py:1204] (2/4) Exclude cut with ID _62574_210_2655_1_1535180407058_3933249_369-84347-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 02:54:55,528 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9210W0111-599240 from training. Duration: 0.896 2023-10-04 02:54:55,532 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_92-246270-0 from training. Duration: 0.85 2023-10-04 02:54:55,564 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_37152_210_9697_1_1533106573_3844188_29-239070-0 from training. Duration: 0.98 2023-10-04 02:54:55,604 WARNING [train.py:1204] (2/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_239-22076-0 from training. Duration: 0.85 2023-10-04 02:54:55,608 WARNING [train.py:1204] (2/4) Exclude cut with ID _65962_210_29927_1_1535436026519_4174930_368-44639-0_sp1.1 from training. Duration: 0.97275 2023-10-04 02:55:00,356 INFO [scaling.py:1118] (2/4) WithLoss: name=encoder.encoders.4.encoder.layers.0.self_attn_weights, loss-sum=0.000e+00 2023-10-04 02:55:01,422 WARNING [train.py:1204] (2/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_152-212533-0 from training. Duration: 0.97 2023-10-04 02:55:02,934 WARNING [train.py:1204] (2/4) Exclude cut with ID _14603_210_4220_1_1530615688441_3097319_90-31383-0_sp1.1 from training. Duration: 0.7909375 2023-10-04 02:55:05,743 WARNING [train.py:1204] (2/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_196-152868-0_sp1.1 from training. Duration: 0.92725 2023-10-04 02:55:05,745 WARNING [train.py:1204] (2/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_140-181176-0 from training. Duration: 0.92 2023-10-04 02:55:07,075 WARNING [train.py:1204] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_168-32906-0_sp0.9 from training. Duration: 0.7666875 2023-10-04 02:55:11,123 WARNING [train.py:1204] (2/4) Exclude cut with ID _14922_210_6228_1_1530707435273_3693310_13-165512-0 from training. Duration: 0.82 2023-10-04 02:55:11,160 WARNING [train.py:1204] (2/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_381-317791-0_sp0.9 from training. Duration: 0.811125 2023-10-04 02:55:18,779 WARNING [train.py:1204] (2/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_63-267585-0_sp1.1 from training. Duration: 0.8181875 2023-10-04 02:55:18,782 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_326-303945-0_sp1.1 from training. Duration: 0.9 2023-10-04 02:55:20,089 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_194-143381-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 02:55:20,165 WARNING [train.py:1204] (2/4) Exclude cut with ID _39290_210_4220_1_1533862778760_3075118_194-16667-0_sp1.1 from training. Duration: 0.963625 2023-10-04 02:55:21,519 WARNING [train.py:1204] (2/4) Exclude cut with ID _15012_210_1968_1_1530856820429_5095350_662-210469-0_sp0.9 from training. Duration: 0.6333125 2023-10-04 02:55:21,558 WARNING [train.py:1204] (2/4) Exclude cut with ID _4697_210_2069_1_1526815801953_3823098_68-219807-0_sp1.1 from training. Duration: 0.42725 2023-10-04 02:55:24,015 WARNING [train.py:1204] (2/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_321-22821-0_sp0.9 from training. Duration: 0.988875 2023-10-04 02:55:25,455 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_14056_210_4163_1_1530604483_7572827_1037-45317-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 02:55:25,456 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_403-39619-0_sp1.1 from training. Duration: 0.763625 2023-10-04 02:55:26,732 INFO [train.py:1046] (2/4) Epoch 43, batch 1950, loss[loss=0.1524, simple_loss=0.2251, pruned_loss=0.03983, over 23840.00 frames. ], tot_loss[loss=0.1572, simple_loss=0.2373, pruned_loss=0.03857, over 4685659.07 frames. ], batch size: 150, lr: 2.37e-03, grad_scale: 8.0 2023-10-04 02:55:28,113 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_20120_210_6753_1_1531643035_5482859_510-96996-0_sp0.9 from training. Duration: 0.9666875 2023-10-04 02:55:28,383 INFO [scaling.py:1118] (2/4) WithLoss: name=encoder.encoders.3.encoder.layers.3.self_attn_weights, loss-sum=0.000e+00 2023-10-04 02:55:29,477 WARNING [train.py:1204] (2/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_225-180155-0_sp1.1 from training. Duration: 0.92725 2023-10-04 02:55:29,527 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_278-272612-0_sp0.9 from training. Duration: 0.811125 2023-10-04 02:55:31,323 WARNING [train.py:1204] (2/4) Exclude cut with ID _53713_210_24116_1_1534314487762_7416751_691-70244-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 02:55:32,812 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6788_210_1388_1_1527898698_4562021_91-197981-0_sp1.1 from training. Duration: 0.7545625 2023-10-04 02:55:35,505 WARNING [train.py:1204] (2/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_60-18123-0_sp0.9 from training. Duration: 0.97775 2023-10-04 02:55:35,546 WARNING [train.py:1204] (2/4) Exclude cut with ID _35799_210_5854_1_1533263487887_3529071_5-214617-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 02:55:35,557 WARNING [train.py:1204] (2/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_890-5117-0_sp1.1 from training. Duration: 0.7090625 2023-10-04 02:55:38,178 WARNING [train.py:1204] (2/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_660-211088-0 from training. Duration: 0.62 2023-10-04 02:55:38,240 WARNING [train.py:1204] (2/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_314-126819-0_sp0.9 from training. Duration: 0.5555625 2023-10-04 02:55:39,567 WARNING [train.py:1204] (2/4) Exclude cut with ID _21632_210_2253_1_1531994001765_3744140_362-66225-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 02:55:40,941 WARNING [train.py:1204] (2/4) Exclude cut with ID _61118_210_9527_1_1534897668884_3752630_173-258555-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 02:55:42,490 WARNING [train.py:1204] (2/4) Exclude cut with ID _7719_210_3525_1_1528456153163_4296960_88-250259-0_sp0.9 from training. Duration: 0.9333125 2023-10-04 02:55:42,521 WARNING [train.py:1204] (2/4) Exclude cut with ID _15140_210_9288_1_1530791388450_4880149_539-153708-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 02:55:42,542 WARNING [train.py:1204] (2/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_253-42544-0_sp1.1 from training. Duration: 0.97275 2023-10-04 02:55:45,234 WARNING [train.py:1204] (2/4) Exclude cut with ID _33640_210_2655_1_1533117605479_4855469_479-296877-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 02:55:48,505 WARNING [train.py:1204] (2/4) Exclude cut with ID 3033-130750-0096-55598-0_sp1.1 from training. Duration: 0.7545625 2023-10-04 02:55:48,517 WARNING [train.py:1204] (2/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_191-41023-0_sp1.1 from training. Duration: 0.6454375 2023-10-04 02:55:48,533 WARNING [train.py:1204] (2/4) Exclude cut with ID _8365_210_2064_1_1528592311070_4677690_431-294102-0_sp1.1 from training. Duration: 0.8090625 2023-10-04 02:55:48,565 WARNING [train.py:1204] (2/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_893-296030-0_sp1.1 from training. Duration: 0.97275 2023-10-04 02:55:53,194 WARNING [train.py:1204] (2/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_401-343820-0_sp1.1 from training. Duration: 0.97275 2023-10-04 02:55:56,601 WARNING [train.py:1204] (2/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_127-566-0_sp1.1 from training. Duration: 0.763625 2023-10-04 02:55:56,602 WARNING [train.py:1204] (2/4) Exclude cut with ID _14100_210_8341_1_1530529320613_3678069_126-272847-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 02:55:56,637 WARNING [train.py:1204] (2/4) Exclude cut with ID _15133_210_6228_1_1530794512074_3080379_29-264458-0_sp1.1 from training. Duration: 0.52725 2023-10-04 02:55:56,638 WARNING [train.py:1204] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_127-119109-0 from training. Duration: 0.72 2023-10-04 02:55:57,932 WARNING [train.py:1204] (2/4) Exclude cut with ID _6537_210_1883_1_1527987559710_3597129_382-133062-0_sp1.1 from training. Duration: 0.6181875 2023-10-04 02:55:57,943 WARNING [train.py:1204] (2/4) Exclude cut with ID _29529_210_7234_1_1532393961455_3977460_391-219682-0_sp1.1 from training. Duration: 0.936375 2023-10-04 02:55:57,994 WARNING [train.py:1204] (2/4) Exclude cut with ID _37537_210_2253_1_1533171758568_3421949_96-137189-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 02:56:02,547 WARNING [train.py:1204] (2/4) Exclude cut with ID _58414_210_8254_1_1535421609162_7151182_15-276301-0_sp1.1 from training. Duration: 0.97275 2023-10-04 02:56:05,289 WARNING [train.py:1204] (2/4) Exclude cut with ID _59502_210_12210_1_1535107024698_1835139_110-289074-0_sp1.1 from training. Duration: 0.92725 2023-10-04 02:56:09,620 WARNING [train.py:1204] (2/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_367-61330-0_sp1.1 from training. Duration: 0.6818125 2023-10-04 02:56:12,267 WARNING [train.py:1204] (2/4) Exclude cut with ID _45297_210_2721_1_1533715351381_3264949_353-9541-0_sp1.1 from training. Duration: 0.8818125 2023-10-04 02:56:12,296 WARNING [train.py:1204] (2/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_85-138257-0_sp0.9 from training. Duration: 0.788875 2023-10-04 02:56:12,331 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_3787_210_2040_1_1526477544_5478586_603-120324-0 from training. Duration: 0.81 2023-10-04 02:56:12,374 WARNING [train.py:1204] (2/4) Exclude cut with ID _42863_210_9070_1_1533902398788_3696920_411-204131-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 02:56:16,537 WARNING [train.py:1204] (2/4) Exclude cut with ID _30196_210_1664_1_1533708496522_7074140_822-289909-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 02:56:17,878 WARNING [train.py:1204] (2/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_32-335103-0_sp0.9 from training. Duration: 0.911125 2023-10-04 02:56:17,920 WARNING [train.py:1204] (2/4) Exclude cut with ID _6493_210_1747_1_1528459210424_4012470_513-140967-0_sp0.9 from training. Duration: 0.87775 2023-10-04 02:56:25,115 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.601e+02 2.031e+02 2.275e+02 2.589e+02 3.753e+02, threshold=4.549e+02, percent-clipped=0.0 2023-10-04 02:56:25,193 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_47896_210_20508_1_1534492518_5958868_439-257766-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 02:56:25,268 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_1294-63152-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 02:56:27,928 WARNING [train.py:1204] (2/4) Exclude cut with ID _59170_210_6721_1_1534983633960_4203335_319-166512-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 02:56:30,577 WARNING [train.py:1204] (2/4) Exclude cut with ID _3541_210_2477_1_1526475408709_3263973_146-3470-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 02:56:32,587 WARNING [train.py:1204] (2/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_678-159307-0_sp1.1 from training. Duration: 0.77275 2023-10-04 02:56:33,901 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_343-48822-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 02:56:35,219 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_4692_210_3528_1_1526899755_4323254_107-44401-0 from training. Duration: 0.86 2023-10-04 02:56:35,231 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6291_210_3273_1_1527683649_1078498_74-292608-0_sp1.1 from training. Duration: 0.6545625 2023-10-04 02:56:35,300 WARNING [train.py:1204] (2/4) Exclude cut with ID _55644_210_11856_1_1534593599582_4103719_293-248694-0_sp1.1 from training. Duration: 0.97275 2023-10-04 02:56:36,592 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_3172_210_2087_1_1526292929_3987865_197-97762-0 from training. Duration: 0.75 2023-10-04 02:56:36,841 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.feed_forward1.hidden_balancer.prob, batch_count=1500666.6666666667, ans=0.125 2023-10-04 02:56:39,295 WARNING [train.py:1204] (2/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_264-348896-0_sp0.9 from training. Duration: 0.9444375 2023-10-04 02:56:40,757 INFO [train.py:1046] (2/4) Epoch 43, batch 2000, loss[loss=0.1506, simple_loss=0.24, pruned_loss=0.03058, over 24475.00 frames. ], tot_loss[loss=0.1577, simple_loss=0.2379, pruned_loss=0.03874, over 4684441.57 frames. ], batch size: 66, lr: 2.37e-03, grad_scale: 16.0 2023-10-04 02:56:42,217 WARNING [train.py:1204] (2/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_209-205365-0_sp0.9 from training. Duration: 0.87775 2023-10-04 02:56:43,679 WARNING [train.py:1204] (2/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_222-198127-0_sp0.9 from training. Duration: 0.9333125 2023-10-04 02:56:43,731 WARNING [train.py:1204] (2/4) Exclude cut with ID _57572_210_5517_1_1534921219624_3809484_52-43530-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 02:56:46,367 WARNING [train.py:1204] (2/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_96-320736-0_sp1.1 from training. Duration: 0.7818125 2023-10-04 02:56:49,069 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_13091_210_7925_1_1530609447_8987354_1159-225508-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 02:56:52,324 WARNING [train.py:1204] (2/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_237-44332-0 from training. Duration: 0.94 2023-10-04 02:56:52,367 WARNING [train.py:1204] (2/4) Exclude cut with ID _7970_210_1883_1_1528455616296_3472230_25-178407-0_sp0.9 from training. Duration: 0.788875 2023-10-04 02:56:57,499 WARNING [train.py:1204] (2/4) Exclude cut with ID _29003_210_19596_1_1532507391720_3894098_357-257062-0_sp1.1 from training. Duration: 0.936375 2023-10-04 02:56:57,743 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.balancer1.prob, batch_count=1500800.0, ans=0.125 2023-10-04 02:56:58,924 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_809-17156-0 from training. Duration: 0.73 2023-10-04 02:57:00,283 WARNING [train.py:1204] (2/4) Exclude cut with ID _7160_210_1876_1_1527940923231_3703799_302-150728-0_sp1.1 from training. Duration: 0.7090625 2023-10-04 02:57:00,292 WARNING [train.py:1204] (2/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_444-28041-0_sp0.9 from training. Duration: 0.9444375 2023-10-04 02:57:04,529 WARNING [train.py:1204] (2/4) Exclude cut with ID _52892_210_6721_1_1534378941259_4124129_278-251666-0_sp1.1 from training. Duration: 0.92725 2023-10-04 02:57:05,925 WARNING [train.py:1204] (2/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_115-326778-0 from training. Duration: 0.93 2023-10-04 02:57:06,026 WARNING [train.py:1204] (2/4) Exclude cut with ID _41391_210_7994_1_1533603771872_3638201_31-188961-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 02:57:07,415 WARNING [train.py:1204] (2/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_1073-6264-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 02:57:07,433 WARNING [train.py:1204] (2/4) Exclude cut with ID _66868_210_9527_1_1535509287954_4561140_435-259515-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 02:57:08,235 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.2.encoder.layers.0.self_attn2.whiten, num_groups=1, num_channels=384, metric=18.71 vs. limit=22.5 2023-10-04 02:57:09,354 WARNING [train.py:1204] (2/4) Exclude cut with ID _14886_210_4220_1_1530702020355_3516190_159-73748-0 from training. Duration: 0.83 2023-10-04 02:57:09,384 WARNING [train.py:1204] (2/4) Exclude cut with ID _15150_210_8341_1_1530752411803_7256384_469-239509-0_sp1.1 from training. Duration: 0.6181875 2023-10-04 02:57:09,666 INFO [scaling.py:1118] (2/4) WithLoss: name=encoder.encoders.2.encoder.layers.1.self_attn_weights, loss-sum=0.000e+00 2023-10-04 02:57:10,807 WARNING [train.py:1204] (2/4) Exclude cut with ID _8064_210_5399_1_1528372299291_4282973_395-340852-0 from training. Duration: 0.93 2023-10-04 02:57:10,810 WARNING [train.py:1204] (2/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_138-187031-0_sp1.1 from training. Duration: 0.9 2023-10-04 02:57:13,593 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_13757_210_3528_1_1530440682_4107239_48-106347-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 02:57:13,658 WARNING [train.py:1204] (2/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_164-224052-0_sp1.1 from training. Duration: 0.636375 2023-10-04 02:57:13,661 WARNING [train.py:1204] (2/4) Exclude cut with ID _59288_210_5854_1_1535077757774_3412246_408-322648-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 02:57:14,980 WARNING [train.py:1204] (2/4) Exclude cut with ID _30191_210_1664_1_1533189915047_7196670_827-36359-0_sp1.1 from training. Duration: 0.963625 2023-10-04 02:57:16,307 WARNING [train.py:1204] (2/4) Exclude cut with ID _4680_210_3525_1_1527249757953_893386_75-78979-0_sp1.1 from training. Duration: 0.82725 2023-10-04 02:57:17,583 WARNING [train.py:1204] (2/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_383-165234-0 from training. Duration: 0.94 2023-10-04 02:57:18,073 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.nonlin_attention.whiten2.whitening_limit, batch_count=1500866.6666666667, ans=15.0 2023-10-04 02:57:20,595 WARNING [train.py:1204] (2/4) Exclude cut with ID _19896_210_1883_1_1531709022662_6649569_94-229927-0 from training. Duration: 0.97 2023-10-04 02:57:20,602 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6788_210_1388_1_1527898698_4562021_180-75288-0_sp1.1 from training. Duration: 0.9 2023-10-04 02:57:20,617 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_26433_210_12108_1_1532157158_4056480_54-321349-0_sp1.1 from training. Duration: 0.97275 2023-10-04 02:57:22,253 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.feed_forward1.hidden_balancer.prob, batch_count=1500866.6666666667, ans=0.125 2023-10-04 02:57:25,475 WARNING [train.py:1204] (2/4) Exclude cut with ID _56108_210_16380_1_1534505446159_3887959_265-161493-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 02:57:26,875 WARNING [train.py:1204] (2/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_395-111602-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 02:57:26,881 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_154-316450-0_sp0.9 from training. Duration: 0.8444375 2023-10-04 02:57:28,211 WARNING [train.py:1204] (2/4) Exclude cut with ID _43552_210_8913_1_1533726031059_3523429_381-116041-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 02:57:30,879 WARNING [train.py:1204] (2/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_65-104124-0_sp1.1 from training. Duration: 0.963625 2023-10-04 02:57:30,928 WARNING [train.py:1204] (2/4) Exclude cut with ID _6536_210_1883_1_1527850783339_3319069_169-23811-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 02:57:30,959 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_289-180904-0_sp0.9 from training. Duration: 0.8444375 2023-10-04 02:57:30,969 WARNING [train.py:1204] (2/4) Exclude cut with ID _56406_210_9288_1_1534899618664_3496149_226-242400-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 02:57:32,370 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_24899_210_16554_1_1532046422_3827462_276-216330-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 02:57:35,098 WARNING [train.py:1204] (2/4) Exclude cut with ID _29345_210_4381_1_1532689168263_3860169_198-174328-0_sp1.1 from training. Duration: 0.82725 2023-10-04 02:57:35,163 WARNING [train.py:1204] (2/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_338-96520-0 from training. Duration: 0.94 2023-10-04 02:57:39,665 WARNING [train.py:1204] (2/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_7-111954-0_sp1.1 from training. Duration: 0.5090625 2023-10-04 02:57:40,989 WARNING [train.py:1204] (2/4) Exclude cut with ID _36692_210_5665_1_1533608967420_398809_8-16261-0_sp1.1 from training. Duration: 0.97275 2023-10-04 02:57:43,713 WARNING [train.py:1204] (2/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_86-212657-0_sp1.1 from training. Duration: 0.97275 2023-10-04 02:57:43,720 WARNING [train.py:1204] (2/4) Exclude cut with ID _44659_210_2721_1_1534036120061_6614860_626-298884-0_sp1.1 from training. Duration: 0.8818125 2023-10-04 02:57:46,539 WARNING [train.py:1204] (2/4) Exclude cut with ID _49755_210_18053_1_1534581005846_3218709_120-257918-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 02:57:47,979 WARNING [train.py:1204] (2/4) Exclude cut with ID _21881_210_3525_1_1531825242757_3798380_400-113821-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 02:57:47,981 WARNING [train.py:1204] (2/4) Exclude cut with ID _21783_210_11357_1_1531742458730_3762679_387-236773-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 02:57:49,660 WARNING [train.py:1204] (2/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_613-232856-0_sp1.1 from training. Duration: 0.4909375 2023-10-04 02:57:51,008 WARNING [train.py:1204] (2/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_181-78632-0_sp0.9 from training. Duration: 0.8666875 2023-10-04 02:57:53,101 WARNING [train.py:1204] (2/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_175-17485-0_sp1.1 from training. Duration: 0.97275 2023-10-04 02:57:54,717 INFO [train.py:1046] (2/4) Epoch 43, batch 2050, loss[loss=0.1637, simple_loss=0.2531, pruned_loss=0.03718, over 24025.00 frames. ], tot_loss[loss=0.157, simple_loss=0.2376, pruned_loss=0.03822, over 4692944.78 frames. ], batch size: 80, lr: 2.37e-03, grad_scale: 8.0 2023-10-04 02:57:54,826 WARNING [train.py:1204] (2/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_1004-183422-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 02:57:58,880 WARNING [train.py:1204] (2/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_32-65962-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 02:57:58,950 WARNING [train.py:1204] (2/4) Exclude cut with ID _15458_210_8341_1_1530858614263_3834410_40-298664-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 02:58:03,218 WARNING [train.py:1204] (2/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_380-261863-0_sp1.1 from training. Duration: 0.963625 2023-10-04 02:58:04,603 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_372-173012-0_sp1.1 from training. Duration: 0.863625 2023-10-04 02:58:05,768 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_57-82205-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 02:58:07,238 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_3691_210_3528_1_1526468169_4062053_355-334432-0_sp0.9 from training. Duration: 0.9666875 2023-10-04 02:58:10,554 WARNING [train.py:1204] (2/4) Exclude cut with ID _19023_210_4353_1_1531378892552_5820780_28-36615-0 from training. Duration: 0.97 2023-10-04 02:58:10,566 WARNING [train.py:1204] (2/4) Exclude cut with ID _29853_210_7497_1_1532505600851_6642110_286-251333-0_sp1.1 from training. Duration: 0.92725 2023-10-04 02:58:11,865 WARNING [train.py:1204] (2/4) Exclude cut with ID _31602_210_5854_1_1532677021456_4458250_518-80679-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 02:58:13,151 WARNING [train.py:1204] (2/4) Exclude cut with ID _14922_210_6228_1_1530707435273_3693310_38-187851-0_sp0.9 from training. Duration: 0.688875 2023-10-04 02:58:21,987 WARNING [train.py:1204] (2/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_39-248870-0_sp1.1 from training. Duration: 0.62725 2023-10-04 02:58:22,005 WARNING [train.py:1204] (2/4) Exclude cut with ID _16908_210_5724_1_1531119157248_3772700_154-31455-0_sp1.1 from training. Duration: 0.97275 2023-10-04 02:58:25,335 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8453_210_4211_1_1528502940_8562730_347-330673-0 from training. Duration: 0.72 2023-10-04 02:58:26,708 WARNING [train.py:1204] (2/4) Exclude cut with ID _30191_210_1664_1_1533189915047_7196670_674-204247-0_sp1.1 from training. Duration: 0.97275 2023-10-04 02:58:27,388 WARNING [train.py:1204] (2/4) Exclude cut with ID _8833_210_5219_1_1528592665210_7553059_329-125156-0 from training. Duration: 0.82 2023-10-04 02:58:27,423 WARNING [train.py:1204] (2/4) Exclude cut with ID _4779_210_2084_1_1527318022944_3914893_500-285013-0_sp1.1 from training. Duration: 0.62725 2023-10-04 02:58:31,534 WARNING [train.py:1204] (2/4) Exclude cut with ID _14421_210_8750_1_1530604795825_3542290_190-313578-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 02:58:33,004 WARNING [train.py:1204] (2/4) Exclude cut with ID _35799_210_5854_1_1533263487887_3529071_538-273574-0_sp1.1 from training. Duration: 0.936375 2023-10-04 02:58:33,070 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_14928_210_5632_1_1530773461_4714000_454-112198-0_sp1.1 from training. Duration: 0.6 2023-10-04 02:58:34,291 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_468-143126-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 02:58:35,714 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_4528_210_3273_1_1527076654_3276777_258-121681-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 02:58:35,795 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_397-132565-0_sp1.1 from training. Duration: 0.9 2023-10-04 02:58:35,813 WARNING [train.py:1204] (2/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_454-123002-0_sp0.9 from training. Duration: 0.7666875 2023-10-04 02:58:38,488 WARNING [train.py:1204] (2/4) Exclude cut with ID _27052_210_5986_1_1532329197187_3585009_297-86890-0_sp1.1 from training. Duration: 0.936375 2023-10-04 02:58:41,732 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_51225_210_3601_1_1534400634_2327383_55-301844-0_sp1.1 from training. Duration: 0.8454375 2023-10-04 02:58:43,182 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6786_210_1388_1_1527839699_4194495_378-46070-0_sp0.9 from training. Duration: 0.8 2023-10-04 02:58:44,505 WARNING [train.py:1204] (2/4) Exclude cut with ID _42334_210_7770_1_1534206610396_3605880_55-19758-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 02:58:45,971 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.conv_module2.balancer1.prob, batch_count=1501266.6666666667, ans=0.125 2023-10-04 02:58:48,643 WARNING [train.py:1204] (2/4) Exclude cut with ID _14884_210_2252_1_1530707459689_5561250_114-201399-0_sp0.9 from training. Duration: 0.8444375 2023-10-04 02:58:55,254 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.607e+02 2.035e+02 2.186e+02 2.516e+02 3.792e+02, threshold=4.373e+02, percent-clipped=0.0 2023-10-04 02:58:55,413 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_99-251314-0_sp0.9 from training. Duration: 0.9666875 2023-10-04 02:58:56,721 WARNING [train.py:1204] (2/4) Exclude cut with ID _26528_210_9070_1_1532570410842_3851310_497-97218-0 from training. Duration: 0.86 2023-10-04 02:59:01,415 WARNING [train.py:1204] (2/4) Exclude cut with ID _55328_210_27767_1_1534580973260_4088170_230-317241-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 02:59:02,790 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_125-203105-0_sp0.9 from training. Duration: 0.888875 2023-10-04 02:59:05,512 WARNING [train.py:1204] (2/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_55-31904-0_sp1.1 from training. Duration: 0.8 2023-10-04 02:59:06,900 WARNING [train.py:1204] (2/4) Exclude cut with ID _8064_210_5399_1_1528372299291_4282973_18-174647-0 from training. Duration: 0.91 2023-10-04 02:59:08,168 INFO [train.py:1046] (2/4) Epoch 43, batch 2100, loss[loss=0.1287, simple_loss=0.1827, pruned_loss=0.03737, over 19077.00 frames. ], tot_loss[loss=0.1564, simple_loss=0.2369, pruned_loss=0.03793, over 4690352.19 frames. ], batch size: 389, lr: 2.37e-03, grad_scale: 8.0 2023-10-04 02:59:09,611 WARNING [train.py:1204] (2/4) Exclude cut with ID IC0165W0005-81726 from training. Duration: 0.952 2023-10-04 02:59:09,612 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_31583_210_3852_1_1532595851_4024413_148-171847-0_sp1.1 from training. Duration: 0.963625 2023-10-04 02:59:09,665 WARNING [train.py:1204] (2/4) Exclude cut with ID _21989_210_5854_1_1531899214931_3271360_116-138769-0_sp1.1 from training. Duration: 0.936375 2023-10-04 02:59:10,953 WARNING [train.py:1204] (2/4) Exclude cut with ID _66070_210_7640_1_1535441393096_7805310_489-292631-0_sp1.1 from training. Duration: 0.7181875 2023-10-04 02:59:12,916 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_20120_210_6753_1_1531643035_5482859_202-27960-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 02:59:12,924 WARNING [train.py:1204] (2/4) Exclude cut with ID _5294_210_1747_1_1527249643928_3696179_342-270278-0 from training. Duration: 0.55 2023-10-04 02:59:12,961 WARNING [train.py:1204] (2/4) Exclude cut with ID _38323_210_12108_1_1533121233369_3303049_416-11919-0 from training. Duration: 0.96 2023-10-04 02:59:14,406 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6545_210_2040_1_1527774670_4245258_407-48070-0_sp0.9 from training. Duration: 0.8444375 2023-10-04 02:59:16,567 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.2.encoder.layers.2.self_attn1.whiten, num_groups=1, num_channels=384, metric=17.79 vs. limit=22.5 2023-10-04 02:59:18,394 WARNING [train.py:1204] (2/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_503-30719-0_sp1.1 from training. Duration: 0.8909375 2023-10-04 02:59:18,451 WARNING [train.py:1204] (2/4) Exclude cut with ID _49643_210_7989_1_1534640244224_3939031_234-326497-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 02:59:20,554 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.conv_module1.balancer1.max_abs, batch_count=1501400.0, ans=10.0 2023-10-04 02:59:21,751 WARNING [train.py:1204] (2/4) Exclude cut with ID _14170_210_5219_1_1530525949868_4469650_301-77873-0_sp1.1 from training. Duration: 0.963625 2023-10-04 02:59:23,095 WARNING [train.py:1204] (2/4) Exclude cut with ID _45297_210_2721_1_1533715351381_3264949_152-154697-0_sp1.1 from training. Duration: 0.97275 2023-10-04 02:59:23,104 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_3371_210_1794_1_1526384462_5580777_396-339812-0 from training. Duration: 0.85 2023-10-04 02:59:23,187 WARNING [train.py:1204] (2/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_399-156791-0_sp1.1 from training. Duration: 0.7545625 2023-10-04 02:59:24,976 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7696_210_2040_1_1528207167_3716099_117-125434-0 from training. Duration: 0.76 2023-10-04 02:59:24,981 WARNING [train.py:1204] (2/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_96-244299-0 from training. Duration: 0.88 2023-10-04 02:59:26,401 WARNING [train.py:1204] (2/4) Exclude cut with ID _37892_210_19596_1_1533198677897_3964058_519-343462-0_sp1.1 from training. Duration: 0.92725 2023-10-04 02:59:26,424 WARNING [train.py:1204] (2/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_202-183922-0_sp1.1 from training. Duration: 0.77275 2023-10-04 02:59:26,431 WARNING [train.py:1204] (2/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_453-133983-0 from training. Duration: 0.78 2023-10-04 02:59:26,479 WARNING [train.py:1204] (2/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_178-39722-0_sp1.1 from training. Duration: 0.4454375 2023-10-04 02:59:32,664 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_15126_210_6753_1_1530778677_2591185_113-157651-0 from training. Duration: 0.68 2023-10-04 02:59:32,665 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_271-234962-0_sp1.1 from training. Duration: 0.7181875 2023-10-04 02:59:35,365 WARNING [train.py:1204] (2/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_195-64967-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 02:59:35,404 WARNING [train.py:1204] (2/4) Exclude cut with ID _43552_210_8913_1_1533726031059_3523429_365-215065-0_sp1.1 from training. Duration: 0.936375 2023-10-04 02:59:38,789 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.4.encoder.layers.1.self_attn2.whiten, num_groups=1, num_channels=384, metric=12.97 vs. limit=22.5 2023-10-04 02:59:39,594 WARNING [train.py:1204] (2/4) Exclude cut with ID _31602_210_5854_1_1532677021456_4458250_249-96604-0_sp0.9 from training. Duration: 0.988875 2023-10-04 02:59:39,640 WARNING [train.py:1204] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_307-158462-0 from training. Duration: 0.69 2023-10-04 02:59:39,696 WARNING [train.py:1204] (2/4) Exclude cut with ID _30918_210_6400_1_1532754078600_3125920_407-223394-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 02:59:39,699 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6673_210_3273_1_1527767790_3173240_351-167954-0_sp0.9 from training. Duration: 0.5333125 2023-10-04 02:59:40,321 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.3.encoder.layers.2.conv_module2.whiten, num_groups=1, num_channels=512, metric=4.17 vs. limit=15.0 2023-10-04 02:59:42,729 WARNING [train.py:1204] (2/4) Exclude cut with ID _4866_210_2577_1_1527404412284_4364659_256-306830-0 from training. Duration: 0.87 2023-10-04 02:59:42,782 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_17613_210_2151_1_1531137569_2366160_222-131118-0_sp1.1 from training. Duration: 0.963625 2023-10-04 02:59:42,802 WARNING [train.py:1204] (2/4) Exclude cut with ID _45560_210_21982_1_1533812603951_3584999_111-6376-0 from training. Duration: 0.84 2023-10-04 02:59:42,827 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_216-73841-0 from training. Duration: 0.96 2023-10-04 02:59:44,123 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_14058_210_4163_1_1530690953_6327180_49-210687-0 from training. Duration: 0.94 2023-10-04 02:59:45,484 WARNING [train.py:1204] (2/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_506-42614-0_sp0.9 from training. Duration: 0.87775 2023-10-04 02:59:46,894 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_3133_210_2087_1_1526106446_2324690_25-332138-0_sp1.1 from training. Duration: 0.82725 2023-10-04 02:59:50,170 WARNING [train.py:1204] (2/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_589-9070-0_sp1.1 from training. Duration: 0.6454375 2023-10-04 02:59:51,564 WARNING [train.py:1204] (2/4) Exclude cut with ID _6194_210_1534_1_1527511826160_3941194_14-282876-0_sp1.1 from training. Duration: 0.6454375 2023-10-04 02:59:52,913 WARNING [train.py:1204] (2/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_1166-90654-0_sp1.1 from training. Duration: 0.92725 2023-10-04 02:59:56,172 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_14056_210_4163_1_1530604483_7572827_218-255858-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 02:59:56,173 WARNING [train.py:1204] (2/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_1285-256622-0 from training. Duration: 0.84 2023-10-04 02:59:56,194 WARNING [train.py:1204] (2/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_205-286261-0_sp1.1 from training. Duration: 0.963625 2023-10-04 02:59:56,214 WARNING [train.py:1204] (2/4) Exclude cut with ID _68777_210_4353_1_1535810390731_3117904_6-154119-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 02:59:56,267 WARNING [train.py:1204] (2/4) Exclude cut with ID _22982_210_1883_1_1531882790447_5449409_565-199393-0_sp1.1 from training. Duration: 0.92725 2023-10-04 02:59:57,552 WARNING [train.py:1204] (2/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_278-154492-0 from training. Duration: 0.76 2023-10-04 02:59:57,687 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_426-313557-0 from training. Duration: 0.53 2023-10-04 02:59:59,623 WARNING [train.py:1204] (2/4) Exclude cut with ID _41817_210_18835_1_1533434477160_7423049_555-293448-0 from training. Duration: 0.8 2023-10-04 03:00:03,645 WARNING [train.py:1204] (2/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_32-335103-0_sp1.1 from training. Duration: 0.7454375 2023-10-04 03:00:06,316 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_2655_210_2087_1_1525688959_3897400_202-147557-0_sp1.1 from training. Duration: 0.77275 2023-10-04 03:00:06,363 WARNING [train.py:1204] (2/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_158-250116-0 from training. Duration: 0.62 2023-10-04 03:00:10,732 WARNING [train.py:1204] (2/4) Exclude cut with ID _55429_210_10626_1_1534467588527_7174143_528-88083-0_sp1.1 from training. Duration: 0.963625 2023-10-04 03:00:13,386 WARNING [train.py:1204] (2/4) Exclude cut with ID _8030_210_7497_1_1528444886781_3531089_32-31837-0_sp1.1 from training. Duration: 0.8818125 2023-10-04 03:00:13,418 WARNING [train.py:1204] (2/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_744-86168-0_sp1.1 from training. Duration: 0.97275 2023-10-04 03:00:13,427 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1113-7369-0_sp0.9 from training. Duration: 0.9444375 2023-10-04 03:00:15,198 WARNING [train.py:1204] (2/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_124-345840-0_sp0.9 from training. Duration: 0.57775 2023-10-04 03:00:16,396 WARNING [train.py:1204] (2/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_328-264776-0_sp0.9 from training. Duration: 0.7444375 2023-10-04 03:00:16,533 WARNING [train.py:1204] (2/4) Exclude cut with ID _18895_210_6228_1_1531357274234_3784930_469-166615-0_sp1.1 from training. Duration: 0.963625 2023-10-04 03:00:17,780 WARNING [train.py:1204] (2/4) Exclude cut with ID _8395_210_1531_1_1528597871523_4411579_431-132470-0_sp0.9 from training. Duration: 0.82225 2023-10-04 03:00:17,863 WARNING [train.py:1204] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_554-45617-0_sp1.1 from training. Duration: 0.9 2023-10-04 03:00:17,882 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_35361_210_4238_1_1533262810_7793476_524-188242-0_sp1.1 from training. Duration: 0.92725 2023-10-04 03:00:18,642 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.2.encoder.layers.1.nonlin_attention.whiten2, num_groups=1, num_channels=384, metric=5.50 vs. limit=15.0 2023-10-04 03:00:21,251 WARNING [train.py:1204] (2/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_938-205135-0 from training. Duration: 0.87 2023-10-04 03:00:22,571 INFO [train.py:1046] (2/4) Epoch 43, batch 2150, loss[loss=0.1467, simple_loss=0.2302, pruned_loss=0.03166, over 24340.00 frames. ], tot_loss[loss=0.1556, simple_loss=0.2363, pruned_loss=0.03746, over 4695155.19 frames. ], batch size: 61, lr: 2.37e-03, grad_scale: 8.0 2023-10-04 03:00:23,938 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_5931_210_2040_1_1527428481_4547999_321-193614-0 from training. Duration: 0.67 2023-10-04 03:00:23,951 WARNING [train.py:1204] (2/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_445-269521-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 03:00:27,171 WARNING [train.py:1204] (2/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_3-33116-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 03:00:27,173 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_4633_210_2040_1_1526824163_4145343_416-76117-0_sp1.1 from training. Duration: 0.863625 2023-10-04 03:00:27,211 WARNING [train.py:1204] (2/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_1033-274376-0_sp1.1 from training. Duration: 0.7909375 2023-10-04 03:00:27,247 WARNING [train.py:1204] (2/4) Exclude cut with ID _8772_210_1850_1_1528635595972_3664011_398-320049-0_sp0.9 from training. Duration: 0.97775 2023-10-04 03:00:27,483 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=1501733.3333333333, ans=0.1 2023-10-04 03:00:32,167 WARNING [train.py:1204] (2/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_321-215742-0_sp0.9 from training. Duration: 0.5555625 2023-10-04 03:00:34,668 WARNING [train.py:1204] (2/4) Exclude cut with ID _28394_210_8913_1_1532430049518_3650118_272-190950-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 03:00:34,772 WARNING [train.py:1204] (2/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_31-299371-0_sp1.1 from training. Duration: 0.92725 2023-10-04 03:00:36,172 WARNING [train.py:1204] (2/4) Exclude cut with ID _55383_210_10635_1_1534468426172_2231669_134-128246-0_sp0.9 from training. Duration: 0.988875 2023-10-04 03:00:36,175 WARNING [train.py:1204] (2/4) Exclude cut with ID _40519_210_13794_1_1533715228655_7337390_887-9547-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 03:00:37,464 WARNING [train.py:1204] (2/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_310-109461-0_sp0.9 from training. Duration: 0.888875 2023-10-04 03:00:40,287 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_2009_210_2071_1_1525481134_4588937_258-250196-0_sp1.1 from training. Duration: 0.92725 2023-10-04 03:00:41,547 WARNING [train.py:1204] (2/4) Exclude cut with ID _9235_210_5219_1_1528891154253_8046120_1186-72702-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 03:00:41,549 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_14831_210_7927_1_1530700200_5583610_181-27467-0_sp1.1 from training. Duration: 0.82725 2023-10-04 03:00:44,210 WARNING [train.py:1204] (2/4) Exclude cut with ID _40402_210_18219_1_1533522324115_2953150_278-132109-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 03:00:44,234 WARNING [train.py:1204] (2/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_261-297503-0 from training. Duration: 0.99 2023-10-04 03:00:48,984 WARNING [train.py:1204] (2/4) Exclude cut with ID _19896_210_1883_1_1531709022662_6649569_313-265903-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 03:00:49,091 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_105-114797-0_sp1.1 from training. Duration: 0.763625 2023-10-04 03:00:51,740 WARNING [train.py:1204] (2/4) Exclude cut with ID _53755_210_8254_1_1534577312941_4707360_9-65843-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 03:00:51,776 WARNING [train.py:1204] (2/4) Exclude cut with ID _18516_210_9206_1_1531704653206_3692406_361-224549-0_sp1.1 from training. Duration: 0.97275 2023-10-04 03:00:51,798 WARNING [train.py:1204] (2/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_403-310975-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 03:00:51,847 WARNING [train.py:1204] (2/4) Exclude cut with ID _5444_210_2151_1_1527418808464_5312210_581-315411-0_sp0.9 from training. Duration: 0.688875 2023-10-04 03:00:53,724 WARNING [train.py:1204] (2/4) Exclude cut with ID _23825_210_13449_1_1531915313526_3497150_464-154904-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 03:00:53,734 WARNING [train.py:1204] (2/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_937-208281-0_sp0.9 from training. Duration: 0.9444375 2023-10-04 03:00:53,810 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_37839_210_7234_1_1533542462_3689303_158-181212-0_sp1.1 from training. Duration: 0.963625 2023-10-04 03:00:55,835 WARNING [train.py:1204] (2/4) Exclude cut with ID _8772_210_1850_1_1528635595972_3664011_398-320049-0 from training. Duration: 0.88 2023-10-04 03:00:57,258 WARNING [train.py:1204] (2/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_718-250907-0_sp1.1 from training. Duration: 0.62725 2023-10-04 03:00:58,622 WARNING [train.py:1204] (2/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_641-126395-0_sp1.1 from training. Duration: 0.92725 2023-10-04 03:00:58,645 WARNING [train.py:1204] (2/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_166-241493-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 03:01:00,001 WARNING [train.py:1204] (2/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_526-62079-0_sp0.9 from training. Duration: 0.7444375 2023-10-04 03:01:02,098 WARNING [train.py:1204] (2/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_439-275083-0_sp1.1 from training. Duration: 0.936375 2023-10-04 03:01:04,586 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_66907_210_21581_1_1535615661_3590177_493-229032-0_sp1.1 from training. Duration: 0.92725 2023-10-04 03:01:04,618 WARNING [train.py:1204] (2/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_89-321955-0_sp1.1 from training. Duration: 0.72725 2023-10-04 03:01:07,262 WARNING [train.py:1204] (2/4) Exclude cut with ID _29857_210_20034_1_1532395886753_3798160_273-226195-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 03:01:07,269 WARNING [train.py:1204] (2/4) Exclude cut with ID _22826_210_9994_1_1531908201522_3279169_404-75889-0 from training. Duration: 0.89 2023-10-04 03:01:07,297 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_271-234962-0_sp0.9 from training. Duration: 0.87775 2023-10-04 03:01:10,025 WARNING [train.py:1204] (2/4) Exclude cut with ID _22961_210_8254_1_1531976366672_4278180_156-339685-0_sp1.1 from training. Duration: 0.97275 2023-10-04 03:01:10,069 WARNING [train.py:1204] (2/4) Exclude cut with ID _37892_210_19596_1_1533198677897_3964058_425-231927-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 03:01:10,311 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.balancer2.prob, batch_count=1501933.3333333333, ans=0.125 2023-10-04 03:01:11,403 WARNING [train.py:1204] (2/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_102-288670-0_sp1.1 from training. Duration: 0.97275 2023-10-04 03:01:12,830 WARNING [train.py:1204] (2/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_283-220372-0_sp1.1 from training. Duration: 0.5090625 2023-10-04 03:01:14,203 WARNING [train.py:1204] (2/4) Exclude cut with ID _44304_210_15374_1_1533639352765_4613129_86-1699-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 03:01:15,502 WARNING [train.py:1204] (2/4) Exclude cut with ID _2891_210_2550_1_1525874398521_4427710_327-121590-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 03:01:15,509 WARNING [train.py:1204] (2/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_369-222060-0 from training. Duration: 0.71 2023-10-04 03:01:16,896 WARNING [train.py:1204] (2/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_762-109886-0 from training. Duration: 0.91 2023-10-04 03:01:18,555 WARNING [train.py:1204] (2/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_477-255782-0_sp0.9 from training. Duration: 0.911125 2023-10-04 03:01:18,595 WARNING [train.py:1204] (2/4) Exclude cut with ID IC0669W0061-330906 from training. Duration: 0.768 2023-10-04 03:01:18,633 WARNING [train.py:1204] (2/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_347-213071-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 03:01:18,667 WARNING [train.py:1204] (2/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_507-303370-0_sp1.1 from training. Duration: 0.87275 2023-10-04 03:01:19,964 WARNING [train.py:1204] (2/4) Exclude cut with ID _15012_210_1968_1_1530856820429_5095350_688-178849-0 from training. Duration: 0.64 2023-10-04 03:01:19,978 WARNING [train.py:1204] (2/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_28-70987-0_sp0.9 from training. Duration: 0.9666875 2023-10-04 03:01:19,981 WARNING [train.py:1204] (2/4) Exclude cut with ID _5910_210_1389_1_1527337972454_5050100_555-159815-0 from training. Duration: 0.98 2023-10-04 03:01:20,002 WARNING [train.py:1204] (2/4) Exclude cut with ID IC0679W0064-335871 from training. Duration: 0.768 2023-10-04 03:01:20,002 WARNING [train.py:1204] (2/4) Exclude cut with ID IC0679W0079-335886 from training. Duration: 0.896 2023-10-04 03:01:20,039 WARNING [train.py:1204] (2/4) Exclude cut with ID _5608_210_2106_1_1527415439089_6819768_909-82767-0 from training. Duration: 0.61 2023-10-04 03:01:20,180 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.1.ff2_skip_rate, batch_count=1502000.0, ans=0.0 2023-10-04 03:01:22,555 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.598e+02 1.924e+02 2.146e+02 2.514e+02 4.521e+02, threshold=4.293e+02, percent-clipped=1.0 2023-10-04 03:01:22,640 WARNING [train.py:1204] (2/4) Exclude cut with ID _23038_210_9570_1_1532001584158_3652419_411-323967-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 03:01:22,801 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.0.conv_module1.balancer1.prob, batch_count=1502000.0, ans=0.125 2023-10-04 03:01:23,943 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_40896_210_15710_1_1533433956_7100802_350-231748-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 03:01:23,962 WARNING [train.py:1204] (2/4) Exclude cut with ID _5289_210_2084_1_1527678202736_3779299_142-103317-0_sp0.9 from training. Duration: 0.9333125 2023-10-04 03:01:24,050 WARNING [train.py:1204] (2/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_314-144055-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 03:01:25,881 WARNING [train.py:1204] (2/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_177-236809-0_sp0.9 from training. Duration: 0.5444375 2023-10-04 03:01:26,016 WARNING [train.py:1204] (2/4) Exclude cut with ID _23038_210_9570_1_1532001584158_3652419_82-334733-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 03:01:27,440 WARNING [train.py:1204] (2/4) Exclude cut with ID _43552_210_8913_1_1533726031059_3523429_225-22526-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 03:01:34,854 WARNING [train.py:1204] (2/4) Exclude cut with ID _30327_210_16049_1_1532484161295_3924169_401-70819-0_sp1.1 from training. Duration: 0.7818125 2023-10-04 03:01:34,899 WARNING [train.py:1204] (2/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_538-32291-0 from training. Duration: 0.83 2023-10-04 03:01:36,193 INFO [train.py:1046] (2/4) Epoch 43, batch 2200, loss[loss=0.1337, simple_loss=0.2152, pruned_loss=0.02609, over 24306.00 frames. ], tot_loss[loss=0.1553, simple_loss=0.2358, pruned_loss=0.0374, over 4705558.70 frames. ], batch size: 56, lr: 2.37e-03, grad_scale: 8.0 2023-10-04 03:01:39,168 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_37357_210_17024_1_1533089504_2316121_258-211605-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 03:01:41,998 WARNING [train.py:1204] (2/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_550-335198-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 03:01:43,437 WARNING [train.py:1204] (2/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_98-741-0_sp0.9 from training. Duration: 0.888875 2023-10-04 03:01:43,482 WARNING [train.py:1204] (2/4) Exclude cut with ID _38323_210_12108_1_1533121233369_3303049_251-238988-0_sp1.1 from training. Duration: 0.92725 2023-10-04 03:01:43,569 WARNING [train.py:1204] (2/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_259-34038-0_sp0.9 from training. Duration: 0.82225 2023-10-04 03:01:43,725 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.1.feed_forward1.hidden_balancer.prob, batch_count=1502066.6666666667, ans=0.125 2023-10-04 03:01:46,344 WARNING [train.py:1204] (2/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_1060-180633-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 03:01:47,547 WARNING [train.py:1204] (2/4) Exclude cut with ID _8755_210_1876_1_1528628559357_3258209_211-37473-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 03:01:47,561 WARNING [train.py:1204] (2/4) Exclude cut with ID _4122_210_2084_1_1526713164417_4353280_528-206771-0 from training. Duration: 0.88 2023-10-04 03:01:50,925 WARNING [train.py:1204] (2/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_64-248359-0 from training. Duration: 0.69 2023-10-04 03:01:54,300 WARNING [train.py:1204] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_402-253027-0_sp1.1 from training. Duration: 0.4909375 2023-10-04 03:01:54,601 INFO [scaling.py:1118] (2/4) WithLoss: name=encoder.encoders.4.encoder.layers.0.self_attn_weights, loss-sum=0.000e+00 2023-10-04 03:02:01,056 WARNING [train.py:1204] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_625-102955-0 from training. Duration: 0.77 2023-10-04 03:02:03,832 WARNING [train.py:1204] (2/4) Exclude cut with ID _14998_210_5219_1_1530705582080_7474970_611-219324-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 03:02:05,219 WARNING [train.py:1204] (2/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_718-68505-0_sp0.9 from training. Duration: 0.988875 2023-10-04 03:02:05,259 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_353-182855-0_sp1.1 from training. Duration: 0.87275 2023-10-04 03:02:05,393 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.0.bypass.scale_min, batch_count=1502200.0, ans=0.2 2023-10-04 03:02:08,141 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_5960_210_1794_1_1527403865_3990999_515-2613-0_sp0.9 from training. Duration: 0.97775 2023-10-04 03:02:08,379 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.conv_module2.balancer1.prob, batch_count=1502200.0, ans=0.125 2023-10-04 03:02:09,429 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_5575_210_1410_1_1527301305_2570170_342-265572-0 from training. Duration: 0.94 2023-10-04 03:02:12,270 WARNING [train.py:1204] (2/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_329-113173-0_sp0.9 from training. Duration: 0.688875 2023-10-04 03:02:13,824 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_15396_210_6890_1_1531308473_1938936_213-312506-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 03:02:15,173 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_521-214276-0_sp1.1 from training. Duration: 0.4 2023-10-04 03:02:19,234 WARNING [train.py:1204] (2/4) Exclude cut with ID _15026_210_8750_1_1530777602316_3291850_211-195043-0_sp1.1 from training. Duration: 0.72725 2023-10-04 03:02:21,139 WARNING [train.py:1204] (2/4) Exclude cut with ID _29343_210_4381_1_1532602757752_3674109_108-257494-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 03:02:21,269 WARNING [train.py:1204] (2/4) Exclude cut with ID _64414_210_17301_1_1535243252048_3757759_403-58151-0_sp1.1 from training. Duration: 0.963625 2023-10-04 03:02:21,455 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.bypass_mid.scale_min, batch_count=1502266.6666666667, ans=0.2 2023-10-04 03:02:24,404 WARNING [train.py:1204] (2/4) Exclude cut with ID _36512_210_6987_1_1533033143998_3126069_109-177787-0_sp1.1 from training. Duration: 0.92725 2023-10-04 03:02:25,920 WARNING [train.py:1204] (2/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_498-40153-0 from training. Duration: 0.63 2023-10-04 03:02:27,259 WARNING [train.py:1204] (2/4) Exclude cut with ID _56447_210_1389_1_1535074245910_3697189_161-334523-0_sp1.1 from training. Duration: 0.936375 2023-10-04 03:02:27,359 WARNING [train.py:1204] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_238-245655-0 from training. Duration: 0.81 2023-10-04 03:02:30,502 WARNING [train.py:1204] (2/4) Exclude cut with ID _64631_210_27767_1_1535349580068_4165160_108-43865-0_sp1.1 from training. Duration: 0.92725 2023-10-04 03:02:30,507 WARNING [train.py:1204] (2/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_243-42116-0_sp1.1 from training. Duration: 0.663625 2023-10-04 03:02:30,540 WARNING [train.py:1204] (2/4) Exclude cut with ID _66868_210_9527_1_1535509287954_4561140_594-132160-0_sp1.1 from training. Duration: 0.92725 2023-10-04 03:02:32,614 WARNING [train.py:1204] (2/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_48-338976-0_sp0.9 from training. Duration: 0.988875 2023-10-04 03:02:32,657 WARNING [train.py:1204] (2/4) Exclude cut with ID _5250_210_5219_1_1527249561806_8692110_137-100595-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 03:02:32,662 WARNING [train.py:1204] (2/4) Exclude cut with ID _49748_210_24116_1_1533986971737_3158910_12-271669-0_sp1.1 from training. Duration: 0.936375 2023-10-04 03:02:32,680 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_37357_210_17024_1_1533089504_2316121_206-80421-0_sp1.1 from training. Duration: 0.936375 2023-10-04 03:02:35,245 WARNING [train.py:1204] (2/4) Exclude cut with ID _7537_210_2084_1_1528502395431_3924359_113-199933-0_sp1.1 from training. Duration: 0.536375 2023-10-04 03:02:35,287 WARNING [train.py:1204] (2/4) Exclude cut with ID _55440_210_12651_1_1534496713364_2980520_31-59289-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 03:02:38,013 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1083-220122-0_sp0.9 from training. Duration: 0.5666875 2023-10-04 03:02:40,806 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_426-313557-0_sp1.1 from training. Duration: 0.4818125 2023-10-04 03:02:40,869 WARNING [train.py:1204] (2/4) Exclude cut with ID _35799_210_5854_1_1533263487887_3529071_324-94843-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 03:02:43,512 WARNING [train.py:1204] (2/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_264-108685-0_sp1.1 from training. Duration: 0.5 2023-10-04 03:02:43,609 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9012W0006-501141 from training. Duration: 0.896 2023-10-04 03:02:46,410 WARNING [train.py:1204] (2/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_890-5117-0_sp0.9 from training. Duration: 0.8666875 2023-10-04 03:02:46,500 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder_embed.convnext.hidden_balancer.prob, batch_count=1502333.3333333333, ans=0.125 2023-10-04 03:02:47,756 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9023W0008-506301 from training. Duration: 0.896 2023-10-04 03:02:49,073 WARNING [train.py:1204] (2/4) Exclude cut with ID _13916_210_9994_1_1530788341126_3763149_60-191556-0_sp0.9 from training. Duration: 0.67775 2023-10-04 03:02:49,120 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9029W0331-509721 from training. Duration: 0.896 2023-10-04 03:02:50,432 INFO [train.py:1046] (2/4) Epoch 43, batch 2250, loss[loss=0.1558, simple_loss=0.2329, pruned_loss=0.03933, over 23564.00 frames. ], tot_loss[loss=0.1558, simple_loss=0.2363, pruned_loss=0.03762, over 4700502.26 frames. ], batch size: 134, lr: 2.37e-03, grad_scale: 8.0 2023-10-04 03:02:50,555 WARNING [train.py:1204] (2/4) Exclude cut with ID _35823_210_6917_1_1533187881914_7297830_567-108596-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 03:02:51,903 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7543_210_6753_1_1528544010_1110402_25-183860-0_sp1.1 from training. Duration: 0.42725 2023-10-04 03:02:53,839 WARNING [train.py:1204] (2/4) Exclude cut with ID _47278_210_12041_1_1533902418349_3576110_495-260035-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 03:02:55,219 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9051W0015-520281 from training. Duration: 0.9045625 2023-10-04 03:02:58,494 WARNING [train.py:1204] (2/4) Exclude cut with ID _14459_210_2228_1_1531443641946_7516700_707-66193-0_sp1.1 from training. Duration: 0.97275 2023-10-04 03:02:59,917 WARNING [train.py:1204] (2/4) Exclude cut with ID _14087_210_2253_1_1530529224512_3014029_219-224445-0_sp1.1 from training. Duration: 0.763625 2023-10-04 03:03:04,623 WARNING [train.py:1204] (2/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_345-167986-0_sp0.9 from training. Duration: 0.9333125 2023-10-04 03:03:06,025 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_34815_210_6388_1_1533381320_3571903_279-305565-0_sp1.1 from training. Duration: 0.836375 2023-10-04 03:03:08,761 WARNING [train.py:1204] (2/4) Exclude cut with ID _21987_210_5854_1_1531821500482_3637038_317-179212-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 03:03:08,809 WARNING [train.py:1204] (2/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_395-37222-0_sp1.1 from training. Duration: 0.6909375 2023-10-04 03:03:10,136 WARNING [train.py:1204] (2/4) Exclude cut with ID _8064_210_5399_1_1528372299291_4282973_168-50337-0_sp1.1 from training. Duration: 0.763625 2023-10-04 03:03:12,920 WARNING [train.py:1204] (2/4) Exclude cut with ID _60113_210_16271_1_1535164212481_4386079_559-167191-0 from training. Duration: 0.93 2023-10-04 03:03:12,930 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_47228_210_4983_1_1533780021_3672027_344-179642-0_sp1.1 from training. Duration: 0.92725 2023-10-04 03:03:14,247 WARNING [train.py:1204] (2/4) Exclude cut with ID _22961_210_8254_1_1531976366672_4278180_317-199546-0_sp0.9 from training. Duration: 0.9555625 2023-10-04 03:03:15,654 WARNING [train.py:1204] (2/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_141-89727-0 from training. Duration: 0.71 2023-10-04 03:03:17,041 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_78-116075-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 03:03:17,052 WARNING [train.py:1204] (2/4) Exclude cut with ID _64086_210_7994_1_1535270417019_7435949_52-235756-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 03:03:18,406 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7615_210_4211_1_1528110965_4580160_72-235211-0_sp1.1 from training. Duration: 0.6909375 2023-10-04 03:03:20,267 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.5.encoder.layers.1.conv_module2.whiten, num_groups=1, num_channels=256, metric=5.13 vs. limit=15.0 2023-10-04 03:03:22,432 WARNING [train.py:1204] (2/4) Exclude cut with ID _14069_210_5632_1_1530512692033_4803390_287-27950-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 03:03:23,818 WARNING [train.py:1204] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_74-48717-0_sp0.9 from training. Duration: 0.6444375 2023-10-04 03:03:23,855 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7544_210_6753_1_1528591327_1471426_172-348777-0_sp1.1 from training. Duration: 0.6 2023-10-04 03:03:25,692 WARNING [train.py:1204] (2/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_220-191080-0 from training. Duration: 0.98 2023-10-04 03:03:25,885 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.self_attn_weights.pos_emb_skip_rate, batch_count=1502533.3333333333, ans=0.0 2023-10-04 03:03:26,498 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.2.encoder.layers.0.self_attn_weights.whiten_keys, num_groups=4, num_channels=128, metric=4.47 vs. limit=6.0 2023-10-04 03:03:27,575 WARNING [train.py:1204] (2/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_1081-137641-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 03:03:30,250 WARNING [train.py:1204] (2/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_278-235459-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 03:03:35,006 WARNING [train.py:1204] (2/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_1181-77470-0_sp1.1 from training. Duration: 0.936375 2023-10-04 03:03:36,400 WARNING [train.py:1204] (2/4) Exclude cut with ID _60860_210_8254_1_1534896015875_3557169_198-5531-0_sp1.1 from training. Duration: 0.936375 2023-10-04 03:03:37,794 WARNING [train.py:1204] (2/4) Exclude cut with ID _12369_210_4381_1_1530356078581_4388520_17-289781-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 03:03:37,799 WARNING [train.py:1204] (2/4) Exclude cut with ID _50310_210_15488_1_1534068043345_3641080_146-199752-0_sp1.1 from training. Duration: 0.92725 2023-10-04 03:03:39,322 WARNING [train.py:1204] (2/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_969-213354-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 03:03:41,945 WARNING [train.py:1204] (2/4) Exclude cut with ID _70519_210_4163_1_1535961595994_7458230_66-344156-0_sp1.1 from training. Duration: 0.963625 2023-10-04 03:03:43,536 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.1.ff3_skip_rate, batch_count=1502600.0, ans=0.0 2023-10-04 03:03:43,573 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.conv_module1.balancer1.prob, batch_count=1502600.0, ans=0.125 2023-10-04 03:03:46,112 WARNING [train.py:1204] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_380-204756-0_sp1.1 from training. Duration: 0.7818125 2023-10-04 03:03:47,092 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.0.layers.1.self_attn_weights.whiten_keys, num_groups=4, num_channels=128, metric=2.31 vs. limit=6.0 2023-10-04 03:03:47,532 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_128-202104-0_sp0.9 from training. Duration: 0.7 2023-10-04 03:03:50,212 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.661e+02 2.039e+02 2.231e+02 2.498e+02 4.606e+02, threshold=4.463e+02, percent-clipped=1.0 2023-10-04 03:03:50,608 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.conv_module2.balancer2.prob, batch_count=1502666.6666666667, ans=0.125 2023-10-04 03:03:53,169 WARNING [train.py:1204] (2/4) Exclude cut with ID _4697_210_2069_1_1526815801953_3823098_51-1049-0_sp0.9 from training. Duration: 0.8333125 2023-10-04 03:03:53,186 WARNING [train.py:1204] (2/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_86-66192-0_sp1.1 from training. Duration: 0.5 2023-10-04 03:03:54,539 WARNING [train.py:1204] (2/4) Exclude cut with ID _14069_210_5632_1_1530512692033_4803390_95-1896-0_sp1.1 from training. Duration: 0.8909375 2023-10-04 03:03:57,989 WARNING [train.py:1204] (2/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_29-293896-0_sp1.1 from training. Duration: 0.5545625 2023-10-04 03:04:01,277 WARNING [train.py:1204] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_167-137255-0_sp0.9 from training. Duration: 0.711125 2023-10-04 03:04:01,278 WARNING [train.py:1204] (2/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_217-95056-0 from training. Duration: 0.98 2023-10-04 03:04:01,294 WARNING [train.py:1204] (2/4) Exclude cut with ID _47278_210_12041_1_1533902418349_3576110_97-266746-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 03:04:01,349 WARNING [train.py:1204] (2/4) Exclude cut with ID _14224_210_4381_1_1530529062037_4264010_380-343670-0_sp1.1 from training. Duration: 0.8 2023-10-04 03:04:01,617 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.conv_module2.balancer1.prob, batch_count=1502666.6666666667, ans=0.125 2023-10-04 03:04:03,911 INFO [train.py:1046] (2/4) Epoch 43, batch 2300, loss[loss=0.1667, simple_loss=0.2441, pruned_loss=0.04464, over 22774.00 frames. ], tot_loss[loss=0.1563, simple_loss=0.2372, pruned_loss=0.03767, over 4710532.34 frames. ], batch size: 322, lr: 2.36e-03, grad_scale: 8.0 2023-10-04 03:04:04,656 WARNING [train.py:1204] (2/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_107-332398-0 from training. Duration: 0.78 2023-10-04 03:04:08,550 WARNING [train.py:1204] (2/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_91-267696-0_sp1.1 from training. Duration: 0.7909375 2023-10-04 03:04:08,584 WARNING [train.py:1204] (2/4) Exclude cut with ID _41298_210_9019_1_1533430371450_4822143_498-186993-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 03:04:14,291 WARNING [train.py:1204] (2/4) Exclude cut with ID _17956_210_4353_1_1531209568125_6979970_54-77610-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 03:04:14,332 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_40047_210_2290_1_1533290519_2374311_257-335162-0_sp1.1 from training. Duration: 0.863625 2023-10-04 03:04:15,771 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9653W0193-675471 from training. Duration: 0.9656875 2023-10-04 03:04:17,168 WARNING [train.py:1204] (2/4) Exclude cut with ID _25248_210_8750_1_1532062825941_5554180_243-240536-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 03:04:20,533 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.4.encoder.layers.2.self_attn_weights.whiten_keys, num_groups=4, num_channels=128, metric=1.99 vs. limit=6.0 2023-10-04 03:04:22,800 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_32360_210_4198_1_1532689160_3648436_60-51908-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 03:04:22,811 WARNING [train.py:1204] (2/4) Exclude cut with ID _4092_210_2151_1_1526643044449_3135759_227-213972-0_sp0.9 from training. Duration: 0.6 2023-10-04 03:04:24,109 WARNING [train.py:1204] (2/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_392-173500-0_sp1.1 from training. Duration: 0.97275 2023-10-04 03:04:24,149 WARNING [train.py:1204] (2/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_71-329150-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 03:04:24,158 WARNING [train.py:1204] (2/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_866-173440-0 from training. Duration: 0.9 2023-10-04 03:04:25,516 WARNING [train.py:1204] (2/4) Exclude cut with ID _14832_210_8750_1_1530691351539_3171348_1-54315-0_sp0.9 from training. Duration: 0.9666875 2023-10-04 03:04:26,922 WARNING [train.py:1204] (2/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_430-116139-0_sp1.1 from training. Duration: 0.77275 2023-10-04 03:04:28,806 WARNING [train.py:1204] (2/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_356-45054-0_sp0.9 from training. Duration: 0.9555625 2023-10-04 03:04:32,697 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_3531_210_3528_1_1526294203_5216456_309-227302-0_sp0.9 from training. Duration: 0.7666875 2023-10-04 03:04:34,015 WARNING [train.py:1204] (2/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_301-330455-0_sp1.1 from training. Duration: 0.72725 2023-10-04 03:04:38,214 WARNING [train.py:1204] (2/4) Exclude cut with ID _23557_210_8750_1_1531890106370_6846255_667-145945-0_sp1.1 from training. Duration: 0.92725 2023-10-04 03:04:43,660 WARNING [train.py:1204] (2/4) Exclude cut with ID _14860_210_4915_1_1530702020340_7343240_836-328489-0_sp1.1 from training. Duration: 0.8181875 2023-10-04 03:04:43,716 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_37152_210_9697_1_1533106573_3844188_416-333249-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 03:04:43,893 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.conv_module2.balancer1.prob, batch_count=1502866.6666666667, ans=0.125 2023-10-04 03:04:45,206 WARNING [train.py:1204] (2/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_538-32291-0_sp0.9 from training. Duration: 0.92225 2023-10-04 03:04:48,021 WARNING [train.py:1204] (2/4) Exclude cut with ID _32704_210_8954_1_1533017488840_6747370_965-176824-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 03:04:52,135 WARNING [train.py:1204] (2/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_86-19601-0_sp1.1 from training. Duration: 0.77275 2023-10-04 03:04:52,197 WARNING [train.py:1204] (2/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_436-318113-0_sp1.1 from training. Duration: 0.6818125 2023-10-04 03:04:52,232 WARNING [train.py:1204] (2/4) Exclude cut with ID _41817_210_18835_1_1533434477160_7423049_555-293448-0_sp0.9 from training. Duration: 0.888875 2023-10-04 03:04:52,242 WARNING [train.py:1204] (2/4) Exclude cut with ID _4697_210_2069_1_1526815801953_3823098_68-219807-0 from training. Duration: 0.47 2023-10-04 03:04:57,066 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7544_210_6753_1_1528591327_1471426_33-346031-0_sp1.1 from training. Duration: 0.5545625 2023-10-04 03:04:57,067 WARNING [train.py:1204] (2/4) Exclude cut with ID _49748_210_24116_1_1533986971737_3158910_263-52113-0_sp1.1 from training. Duration: 0.97275 2023-10-04 03:04:57,119 WARNING [train.py:1204] (2/4) Exclude cut with ID _64639_210_5816_1_1535367619437_3528000_340-24580-0_sp1.1 from training. Duration: 0.92725 2023-10-04 03:04:57,138 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_47228_210_4983_1_1533780021_3672027_479-114941-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 03:04:59,070 WARNING [train.py:1204] (2/4) Exclude cut with ID _44585_210_18628_1_1533605350387_4518199_506-119659-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 03:05:00,395 WARNING [train.py:1204] (2/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_314-126819-0_sp1.1 from training. Duration: 0.4545625 2023-10-04 03:05:00,399 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_2541_210_1794_1_1525590269_3084765_156-173713-0_sp0.9 from training. Duration: 0.9 2023-10-04 03:05:00,447 WARNING [train.py:1204] (2/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_108-255430-0 from training. Duration: 0.97 2023-10-04 03:05:00,455 WARNING [train.py:1204] (2/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_791-45149-0_sp1.1 from training. Duration: 0.7818125 2023-10-04 03:05:00,461 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_53378_210_17282_1_1534227054_1261425_10-118295-0_sp1.1 from training. Duration: 0.97275 2023-10-04 03:05:01,305 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.1.encoder.layers.1.feed_forward2.out_whiten, num_groups=1, num_channels=256, metric=3.60 vs. limit=15.0 2023-10-04 03:05:01,829 WARNING [train.py:1204] (2/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_148-158509-0 from training. Duration: 0.41 2023-10-04 03:05:05,387 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.bypass.scale_min, batch_count=1503000.0, ans=0.2 2023-10-04 03:05:09,349 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_34726_210_4525_1_1533019474_5030395_687-291305-0_sp1.1 from training. Duration: 0.936375 2023-10-04 03:05:12,291 WARNING [train.py:1204] (2/4) Exclude cut with ID _14860_210_4915_1_1530702020340_7343240_264-324537-0_sp1.1 from training. Duration: 0.963625 2023-10-04 03:05:15,133 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_41251_210_15368_1_1533429767_4820695_591-46386-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 03:05:15,158 WARNING [train.py:1204] (2/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_86-19601-0_sp0.9 from training. Duration: 0.9444375 2023-10-04 03:05:15,185 WARNING [train.py:1204] (2/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_401-255491-0_sp0.9 from training. Duration: 0.77775 2023-10-04 03:05:17,330 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.2.encoder.layers.0.conv_module2.whiten, num_groups=1, num_channels=384, metric=5.24 vs. limit=15.0 2023-10-04 03:05:17,889 INFO [train.py:1046] (2/4) Epoch 43, batch 2350, loss[loss=0.1368, simple_loss=0.2171, pruned_loss=0.02822, over 24589.00 frames. ], tot_loss[loss=0.157, simple_loss=0.238, pruned_loss=0.03797, over 4695559.10 frames. ], batch size: 60, lr: 2.36e-03, grad_scale: 8.0 2023-10-04 03:05:17,994 WARNING [train.py:1204] (2/4) Exclude cut with ID _8696_210_1876_1_1528545632440_3345058_209-54069-0_sp1.1 from training. Duration: 0.6090625 2023-10-04 03:05:18,003 WARNING [train.py:1204] (2/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_369-325184-0_sp1.1 from training. Duration: 0.9 2023-10-04 03:05:18,075 WARNING [train.py:1204] (2/4) Exclude cut with ID _14687_210_5854_1_1530703838259_3664889_115-239046-0_sp0.9 from training. Duration: 0.7444375 2023-10-04 03:05:18,110 WARNING [train.py:1204] (2/4) Exclude cut with ID _8064_210_5399_1_1528372299291_4282973_168-50337-0 from training. Duration: 0.84 2023-10-04 03:05:25,001 WARNING [train.py:1204] (2/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_909-64151-0_sp1.1 from training. Duration: 0.8545625 2023-10-04 03:05:25,010 WARNING [train.py:1204] (2/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_597-154640-0 from training. Duration: 0.9 2023-10-04 03:05:30,568 WARNING [train.py:1204] (2/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_126-274694-0 from training. Duration: 0.98 2023-10-04 03:05:33,267 WARNING [train.py:1204] (2/4) Exclude cut with ID _21783_210_11357_1_1531742458730_3762679_344-269786-0_sp1.1 from training. Duration: 0.97275 2023-10-04 03:05:36,741 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_37818_210_4238_1_1533620910_4720083_322-253154-0_sp1.1 from training. Duration: 0.92725 2023-10-04 03:05:36,744 WARNING [train.py:1204] (2/4) Exclude cut with ID _58895_210_8750_1_1535184081320_3649089_221-15823-0_sp1.1 from training. Duration: 0.92725 2023-10-04 03:05:36,754 WARNING [train.py:1204] (2/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_9-21737-0_sp1.1 from training. Duration: 0.8818125 2023-10-04 03:05:36,795 WARNING [train.py:1204] (2/4) Exclude cut with ID _14069_210_5632_1_1530512692033_4803390_522-341588-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 03:05:38,171 WARNING [train.py:1204] (2/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_363-185703-0 from training. Duration: 0.95 2023-10-04 03:05:39,796 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.balancer1.prob, batch_count=1503133.3333333333, ans=0.125 2023-10-04 03:05:40,890 WARNING [train.py:1204] (2/4) Exclude cut with ID _41817_210_18835_1_1533434477160_7423049_265-76216-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 03:05:46,201 WARNING [train.py:1204] (2/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_841-341679-0 from training. Duration: 0.58 2023-10-04 03:05:46,319 WARNING [train.py:1204] (2/4) Exclude cut with ID _55429_210_10626_1_1534467588527_7174143_97-335084-0_sp1.1 from training. Duration: 0.8818125 2023-10-04 03:05:49,105 WARNING [train.py:1204] (2/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_690-126306-0_sp0.9 from training. Duration: 0.8666875 2023-10-04 03:05:49,114 WARNING [train.py:1204] (2/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_102-92751-0_sp1.1 from training. Duration: 0.9 2023-10-04 03:05:51,877 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_3787_210_2040_1_1526477544_5478586_603-120324-0_sp1.1 from training. Duration: 0.736375 2023-10-04 03:05:53,741 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_33348_210_5632_1_1533086912_3324030_85-28583-0 from training. Duration: 0.82 2023-10-04 03:05:53,809 WARNING [train.py:1204] (2/4) Exclude cut with ID _60057_210_10640_1_1534903065793_3333150_362-230377-0_sp1.1 from training. Duration: 0.7909375 2023-10-04 03:05:55,198 WARNING [train.py:1204] (2/4) Exclude cut with ID _60113_210_16271_1_1535164212481_4386079_194-146486-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 03:05:55,217 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_53753_210_7234_1_1534320003_3594148_171-277668-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 03:05:56,520 WARNING [train.py:1204] (2/4) Exclude cut with ID _34423_210_8750_1_1533430798456_3330199_251-332788-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 03:05:59,880 WARNING [train.py:1204] (2/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_663-166973-0_sp0.9 from training. Duration: 0.888875 2023-10-04 03:06:01,286 WARNING [train.py:1204] (2/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_927-43827-0 from training. Duration: 0.74 2023-10-04 03:06:02,635 WARNING [train.py:1204] (2/4) Exclude cut with ID _28323_210_16187_1_1532480395667_4100300_306-74561-0_sp1.1 from training. Duration: 0.8545625 2023-10-04 03:06:05,230 WARNING [train.py:1204] (2/4) Exclude cut with ID _15037_210_2253_1_1530752294104_3267650_139-337989-0_sp1.1 from training. Duration: 0.92725 2023-10-04 03:06:05,243 WARNING [train.py:1204] (2/4) Exclude cut with ID _49039_210_6315_1_1533967537541_3846118_460-263670-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 03:06:06,634 WARNING [train.py:1204] (2/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_186-309161-0 from training. Duration: 0.54 2023-10-04 03:06:07,936 WARNING [train.py:1204] (2/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_87-278686-0_sp1.1 from training. Duration: 0.7 2023-10-04 03:06:10,751 WARNING [train.py:1204] (2/4) Exclude cut with ID _27052_210_5986_1_1532329197187_3585009_314-106499-0 from training. Duration: 0.92 2023-10-04 03:06:10,776 WARNING [train.py:1204] (2/4) Exclude cut with ID _3465_210_3525_1_1526175667060_3574049_412-287526-0_sp0.9 from training. Duration: 0.8 2023-10-04 03:06:13,799 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.feed_forward2.hidden_balancer.prob, batch_count=1503266.6666666667, ans=0.125 2023-10-04 03:06:14,915 WARNING [train.py:1204] (2/4) Exclude cut with ID _22961_210_8254_1_1531976366672_4278180_317-199546-0 from training. Duration: 0.86 2023-10-04 03:06:19,065 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.641e+02 1.962e+02 2.138e+02 2.370e+02 3.005e+02, threshold=4.276e+02, percent-clipped=0.0 2023-10-04 03:06:19,138 WARNING [train.py:1204] (2/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_381-317791-0 from training. Duration: 0.73 2023-10-04 03:06:19,189 WARNING [train.py:1204] (2/4) Exclude cut with ID _44010_210_4525_1_1533884262964_4013620_560-310411-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 03:06:19,195 WARNING [train.py:1204] (2/4) Exclude cut with ID _5294_210_1747_1_1527249643928_3696179_342-270278-0_sp0.9 from training. Duration: 0.611125 2023-10-04 03:06:19,214 WARNING [train.py:1204] (2/4) Exclude cut with ID ID0473W0002-918951 from training. Duration: 0.924 2023-10-04 03:06:19,244 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1083-220122-0 from training. Duration: 0.51 2023-10-04 03:06:23,851 WARNING [train.py:1204] (2/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_138-154812-0 from training. Duration: 0.78 2023-10-04 03:06:25,310 WARNING [train.py:1204] (2/4) Exclude cut with ID _44982_210_1511_1_1534312820108_3726029_331-19420-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 03:06:31,104 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_2655_210_2087_1_1525688959_3897400_202-147557-0_sp0.9 from training. Duration: 0.9444375 2023-10-04 03:06:32,274 INFO [train.py:1046] (2/4) Epoch 43, batch 2400, loss[loss=0.1384, simple_loss=0.2058, pruned_loss=0.03555, over 23354.00 frames. ], tot_loss[loss=0.1563, simple_loss=0.2368, pruned_loss=0.03794, over 4703717.59 frames. ], batch size: 285, lr: 2.36e-03, grad_scale: 16.0 2023-10-04 03:06:33,831 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_23850_210_4238_1_1532232597_1632433_32-185135-0_sp0.9 from training. Duration: 0.9555625 2023-10-04 03:06:37,079 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_14056_210_4163_1_1530604483_7572827_46-93547-0_sp1.1 from training. Duration: 0.863625 2023-10-04 03:06:37,137 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7931_210_1794_1_1528594728_5564059_280-279560-0 from training. Duration: 0.98 2023-10-04 03:06:38,597 WARNING [train.py:1204] (2/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_937-208281-0 from training. Duration: 0.85 2023-10-04 03:06:40,234 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.attention_skip_rate, batch_count=1503400.0, ans=0.0 2023-10-04 03:06:45,496 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_14928_210_5632_1_1530773461_4714000_454-112198-0_sp0.9 from training. Duration: 0.7333125 2023-10-04 03:06:45,497 WARNING [train.py:1204] (2/4) Exclude cut with ID _32704_210_8954_1_1533017488840_6747370_944-153333-0_sp1.1 from training. Duration: 0.963625 2023-10-04 03:06:48,323 WARNING [train.py:1204] (2/4) Exclude cut with ID _14821_210_8750_1_1530680503991_6829359_2-227151-0 from training. Duration: 0.83 2023-10-04 03:06:48,340 WARNING [train.py:1204] (2/4) Exclude cut with ID _28461_210_2253_1_1532771775189_3041299_96-152158-0_sp1.1 from training. Duration: 0.77275 2023-10-04 03:06:48,417 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7837_210_1792_1_1528453445_5841305_654-54195-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 03:06:48,449 WARNING [train.py:1204] (2/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_344-152526-0 from training. Duration: 0.53 2023-10-04 03:06:54,397 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_30167_210_3528_1_1532428012_4772375_480-142230-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 03:06:55,811 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_160-218721-0 from training. Duration: 0.96 2023-10-04 03:07:00,016 WARNING [train.py:1204] (2/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_65-88242-0_sp0.9 from training. Duration: 0.62225 2023-10-04 03:07:05,149 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_34886_210_10633_1_1533203882_1755661_76-156096-0 from training. Duration: 0.87 2023-10-04 03:07:08,187 WARNING [train.py:1204] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_188-38388-0_sp1.1 from training. Duration: 0.92725 2023-10-04 03:07:09,585 WARNING [train.py:1204] (2/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_81-289335-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 03:07:13,715 WARNING [train.py:1204] (2/4) Exclude cut with ID _59493_210_17282_1_1535078419077_3225879_110-8778-0_sp1.1 from training. Duration: 0.936375 2023-10-04 03:07:15,032 WARNING [train.py:1204] (2/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_188-260011-0 from training. Duration: 0.73 2023-10-04 03:07:15,066 WARNING [train.py:1204] (2/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_453-133983-0_sp0.9 from training. Duration: 0.8666875 2023-10-04 03:07:24,716 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8054_210_7927_1_1528627721_4984094_553-17379-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 03:07:26,257 WARNING [train.py:1204] (2/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_341-171072-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 03:07:29,675 WARNING [train.py:1204] (2/4) Exclude cut with ID _30800_210_22382_1_1532499347904_4515959_265-257286-0_sp1.1 from training. Duration: 0.963625 2023-10-04 03:07:29,768 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_403-39619-0_sp0.9 from training. Duration: 0.9333125 2023-10-04 03:07:29,773 WARNING [train.py:1204] (2/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_464-33931-0_sp0.9 from training. Duration: 0.6 2023-10-04 03:07:31,063 WARNING [train.py:1204] (2/4) Exclude cut with ID _60804_210_9642_1_1534831199859_7268299_632-143863-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 03:07:31,070 WARNING [train.py:1204] (2/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_431-310997-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 03:07:31,117 WARNING [train.py:1204] (2/4) Exclude cut with ID _31649_210_6228_1_1532584608063_4091078_114-263860-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 03:07:31,137 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6961_210_2087_1_1527937168_6815011_562-165518-0_sp0.9 from training. Duration: 0.8555625 2023-10-04 03:07:36,877 WARNING [train.py:1204] (2/4) Exclude cut with ID _27052_210_5986_1_1532329197187_3585009_338-325870-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 03:07:37,365 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.nonlin_attention.whiten1.whitening_limit, batch_count=1503666.6666666667, ans=10.0 2023-10-04 03:07:38,235 WARNING [train.py:1204] (2/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_469-94673-0_sp1.1 from training. Duration: 0.7181875 2023-10-04 03:07:38,263 WARNING [train.py:1204] (2/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_259-34038-0 from training. Duration: 0.74 2023-10-04 03:07:38,376 WARNING [train.py:1204] (2/4) Exclude cut with ID _14922_210_6228_1_1530707435273_3693310_388-129552-0 from training. Duration: 0.96 2023-10-04 03:07:41,810 WARNING [train.py:1204] (2/4) Exclude cut with ID _28394_210_8913_1_1532430049518_3650118_342-251455-0_sp1.1 from training. Duration: 0.936375 2023-10-04 03:07:41,828 WARNING [train.py:1204] (2/4) Exclude cut with ID _53731_210_7994_1_1534553979244_3757214_8-191390-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 03:07:41,868 WARNING [train.py:1204] (2/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_613-232856-0 from training. Duration: 0.54 2023-10-04 03:07:41,930 WARNING [train.py:1204] (2/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_557-349534-0 from training. Duration: 0.91 2023-10-04 03:07:43,201 WARNING [train.py:1204] (2/4) Exclude cut with ID _14224_210_4381_1_1530529062037_4264010_379-184223-0 from training. Duration: 0.85 2023-10-04 03:07:43,205 WARNING [train.py:1204] (2/4) Exclude cut with ID IC0125W0001-61807 from training. Duration: 0.966 2023-10-04 03:07:44,542 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_229-45300-0 from training. Duration: 0.57 2023-10-04 03:07:44,614 WARNING [train.py:1204] (2/4) Exclude cut with ID _30304_210_6319_1_1532476827261_7582060_1069-24743-0_sp1.1 from training. Duration: 0.92725 2023-10-04 03:07:45,908 INFO [train.py:1046] (2/4) Epoch 43, batch 2450, loss[loss=0.1304, simple_loss=0.1857, pruned_loss=0.03757, over 19100.00 frames. ], tot_loss[loss=0.1557, simple_loss=0.2358, pruned_loss=0.03777, over 4701995.86 frames. ], batch size: 388, lr: 2.36e-03, grad_scale: 16.0 2023-10-04 03:07:46,001 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_41452_210_7291_1_1533437687_3941281_323-172873-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 03:07:46,012 WARNING [train.py:1204] (2/4) Exclude cut with ID _37086_210_9038_1_1532999540891_7021250_802-226709-0_sp1.1 from training. Duration: 0.97275 2023-10-04 03:07:47,269 WARNING [train.py:1204] (2/4) Exclude cut with ID IC0144W0500-71767 from training. Duration: 0.931875 2023-10-04 03:07:47,341 WARNING [train.py:1204] (2/4) Exclude cut with ID _39846_210_12651_1_1533605310265_2115212_233-129699-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 03:07:47,567 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.balancer1.prob, batch_count=1503733.3333333333, ans=0.125 2023-10-04 03:07:48,675 WARNING [train.py:1204] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_574-45541-0_sp0.9 from training. Duration: 0.72225 2023-10-04 03:07:52,746 WARNING [train.py:1204] (2/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_372-298093-0_sp1.1 from training. Duration: 0.72725 2023-10-04 03:07:52,765 WARNING [train.py:1204] (2/4) Exclude cut with ID _62574_210_2655_1_1535180407058_3933249_34-26343-0_sp1.1 from training. Duration: 0.97275 2023-10-04 03:07:55,667 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_512-256238-0_sp1.1 from training. Duration: 0.963625 2023-10-04 03:07:55,672 WARNING [train.py:1204] (2/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_196-55502-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 03:07:57,052 WARNING [train.py:1204] (2/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_195-167011-0 from training. Duration: 0.75 2023-10-04 03:07:57,263 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.conv_module1.balancer2.min_positive, batch_count=1503733.3333333333, ans=0.05 2023-10-04 03:08:03,686 WARNING [train.py:1204] (2/4) Exclude cut with ID _13678_210_7497_1_1530530069226_3463170_345-230416-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 03:08:03,696 WARNING [train.py:1204] (2/4) Exclude cut with ID _51078_210_9070_1_1534161557424_3805250_225-327809-0_sp1.1 from training. Duration: 0.963625 2023-10-04 03:08:06,486 WARNING [train.py:1204] (2/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_18-189198-0_sp1.1 from training. Duration: 0.8181875 2023-10-04 03:08:06,509 WARNING [train.py:1204] (2/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_329-82235-0_sp1.1 from training. Duration: 0.7454375 2023-10-04 03:08:06,517 WARNING [train.py:1204] (2/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_726-270780-0_sp0.9 from training. Duration: 0.9555625 2023-10-04 03:08:07,795 WARNING [train.py:1204] (2/4) Exclude cut with ID _66873_210_5517_1_1535454136506_4293370_654-52713-0 from training. Duration: 0.77 2023-10-04 03:08:11,097 WARNING [train.py:1204] (2/4) Exclude cut with ID _20455_210_5219_1_1531565996775_8098100_191-16006-0_sp1.1 from training. Duration: 0.963625 2023-10-04 03:08:14,188 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_3171_210_2087_1_1526109084_2330007_66-260583-0_sp1.1 from training. Duration: 0.6545625 2023-10-04 03:08:15,540 WARNING [train.py:1204] (2/4) Exclude cut with ID _38670_210_15379_1_1533448810659_4391059_63-129006-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 03:08:18,271 WARNING [train.py:1204] (2/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_393-57888-0_sp0.9 from training. Duration: 0.711125 2023-10-04 03:08:18,305 WARNING [train.py:1204] (2/4) Exclude cut with ID _6275_210_2084_1_1528110062213_7669049_857-298942-0_sp0.9 from training. Duration: 0.9444375 2023-10-04 03:08:18,546 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=1503866.6666666667, ans=0.1 2023-10-04 03:08:19,710 WARNING [train.py:1204] (2/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_239-22076-0_sp0.9 from training. Duration: 0.9444375 2023-10-04 03:08:19,735 WARNING [train.py:1204] (2/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_424-227947-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 03:08:21,158 WARNING [train.py:1204] (2/4) Exclude cut with ID _14069_210_5632_1_1530512692033_4803390_95-1896-0 from training. Duration: 0.98 2023-10-04 03:08:21,246 WARNING [train.py:1204] (2/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_96-244299-0_sp0.9 from training. Duration: 0.97775 2023-10-04 03:08:28,582 WARNING [train.py:1204] (2/4) Exclude cut with ID _46683_210_9642_1_1533730913492_4591116_562-319825-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 03:08:29,951 WARNING [train.py:1204] (2/4) Exclude cut with ID _64639_210_5816_1_1535367619437_3528000_284-173474-0_sp1.1 from training. Duration: 0.963625 2023-10-04 03:08:29,981 WARNING [train.py:1204] (2/4) Exclude cut with ID _29529_210_7234_1_1532393961455_3977460_497-100133-0_sp1.1 from training. Duration: 0.936375 2023-10-04 03:08:30,008 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_12868_210_6753_1_1530701932_530275_18-76322-0_sp0.9 from training. Duration: 0.9666875 2023-10-04 03:08:31,327 WARNING [train.py:1204] (2/4) Exclude cut with ID _50093_210_4220_1_1534417141207_3432299_116-303447-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 03:08:33,254 WARNING [train.py:1204] (2/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_44-79427-0_sp1.1 from training. Duration: 0.8909375 2023-10-04 03:08:33,324 WARNING [train.py:1204] (2/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_887-249555-0 from training. Duration: 0.77 2023-10-04 03:08:37,960 WARNING [train.py:1204] (2/4) Exclude cut with ID _28461_210_2253_1_1532771775189_3041299_96-152158-0_sp0.9 from training. Duration: 0.9444375 2023-10-04 03:08:38,007 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_14058_210_4163_1_1530690953_6327180_49-210687-0_sp1.1 from training. Duration: 0.8545625 2023-10-04 03:08:42,877 WARNING [train.py:1204] (2/4) Exclude cut with ID _40399_210_4353_1_1533305658194_3279406_351-127903-0_sp1.1 from training. Duration: 0.97275 2023-10-04 03:08:42,883 WARNING [train.py:1204] (2/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_184-206939-0_sp1.1 from training. Duration: 0.936375 2023-10-04 03:08:47,283 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.673e+02 1.943e+02 2.149e+02 2.494e+02 4.938e+02, threshold=4.298e+02, percent-clipped=1.0 2023-10-04 03:08:48,687 WARNING [train.py:1204] (2/4) Exclude cut with ID _14100_210_8341_1_1530529320613_3678069_258-224599-0_sp1.1 from training. Duration: 0.7 2023-10-04 03:08:48,703 WARNING [train.py:1204] (2/4) Exclude cut with ID _6493_210_1747_1_1528459210424_4012470_324-323854-0 from training. Duration: 0.92 2023-10-04 03:08:48,799 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_12868_210_6753_1_1530702479_3610564_151-325626-0_sp1.1 from training. Duration: 0.8818125 2023-10-04 03:08:50,151 WARNING [train.py:1204] (2/4) Exclude cut with ID _14528_210_8254_1_1530669700514_4306939_106-43145-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 03:08:50,172 WARNING [train.py:1204] (2/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_686-54383-0 from training. Duration: 0.87 2023-10-04 03:08:51,497 WARNING [train.py:1204] (2/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_801-102362-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 03:08:51,563 WARNING [train.py:1204] (2/4) Exclude cut with ID _48677_210_5663_1_1533902405919_3892989_408-40740-0_sp0.9 from training. Duration: 0.92225 2023-10-04 03:08:54,531 WARNING [train.py:1204] (2/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_448-212135-0_sp1.1 from training. Duration: 0.82725 2023-10-04 03:08:57,247 WARNING [train.py:1204] (2/4) Exclude cut with ID _59170_210_6721_1_1534983633960_4203335_320-124047-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 03:08:57,301 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_4692_210_3528_1_1526899755_4323254_107-44401-0_sp1.1 from training. Duration: 0.7818125 2023-10-04 03:09:00,292 INFO [train.py:1046] (2/4) Epoch 43, batch 2500, loss[loss=0.1578, simple_loss=0.2405, pruned_loss=0.03751, over 19897.00 frames. ], tot_loss[loss=0.1547, simple_loss=0.2346, pruned_loss=0.03734, over 4703806.54 frames. ], batch size: 43, lr: 2.36e-03, grad_scale: 16.0 2023-10-04 03:09:01,692 WARNING [train.py:1204] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_731-2428-0 from training. Duration: 0.83 2023-10-04 03:09:01,774 WARNING [train.py:1204] (2/4) Exclude cut with ID _29857_210_20034_1_1532395886753_3798160_335-163562-0_sp1.1 from training. Duration: 0.77275 2023-10-04 03:09:05,983 WARNING [train.py:1204] (2/4) Exclude cut with ID _14832_210_8750_1_1530691351539_3171348_171-275004-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 03:09:06,576 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.3.encoder.layers.2.feed_forward2.out_whiten, num_groups=1, num_channels=512, metric=12.56 vs. limit=15.0 2023-10-04 03:09:15,662 WARNING [train.py:1204] (2/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_105-328174-0_sp0.9 from training. Duration: 0.8444375 2023-10-04 03:09:16,930 WARNING [train.py:1204] (2/4) Exclude cut with ID _68965_210_1883_1_1535698962272_3497490_164-118421-0_sp1.1 from training. Duration: 0.936375 2023-10-04 03:09:17,021 WARNING [train.py:1204] (2/4) Exclude cut with ID _19896_210_1883_1_1531709022662_6649569_380-24170-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 03:09:17,025 WARNING [train.py:1204] (2/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_505-205270-0_sp1.1 from training. Duration: 0.3454375 2023-10-04 03:09:22,445 WARNING [train.py:1204] (2/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_427-208901-0_sp1.1 from training. Duration: 0.6181875 2023-10-04 03:09:23,820 WARNING [train.py:1204] (2/4) Exclude cut with ID _23818_210_6228_1_1531917063884_3676920_85-326916-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 03:09:23,904 WARNING [train.py:1204] (2/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_101-293367-0_sp1.1 from training. Duration: 0.57275 2023-10-04 03:09:23,912 WARNING [train.py:1204] (2/4) Exclude cut with ID _5409_210_1388_1_1527292850189_5865191_178-127970-0_sp1.1 from training. Duration: 0.5181875 2023-10-04 03:09:25,233 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_403-39619-0 from training. Duration: 0.84 2023-10-04 03:09:26,591 WARNING [train.py:1204] (2/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_427-87841-0_sp1.1 from training. Duration: 0.963625 2023-10-04 03:09:26,656 WARNING [train.py:1204] (2/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_286-68181-0_sp1.1 from training. Duration: 0.97275 2023-10-04 03:09:27,372 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.3.encoder.layers.1.self_attn_weights.whiten_keys, num_groups=8, num_channels=256, metric=3.67 vs. limit=6.0 2023-10-04 03:09:27,912 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6673_210_3273_1_1527767790_3173240_327-306185-0 from training. Duration: 0.83 2023-10-04 03:09:27,928 WARNING [train.py:1204] (2/4) Exclude cut with ID _3675_210_4557_1_1526639934408_4182089_106-203713-0_sp1.1 from training. Duration: 0.963625 2023-10-04 03:09:27,992 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_53071_210_16554_1_1534493434_1276796_130-36076-0 from training. Duration: 0.95 2023-10-04 03:09:28,023 WARNING [train.py:1204] (2/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_586-93054-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 03:09:30,142 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.conv_skip_rate, batch_count=1504200.0, ans=0.0 2023-10-04 03:09:31,179 WARNING [train.py:1204] (2/4) Exclude cut with ID _23825_210_13449_1_1531915313526_3497150_262-10571-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 03:09:32,506 WARNING [train.py:1204] (2/4) Exclude cut with ID _28783_210_7943_1_1532671164808_7155479_432-67885-0_sp1.1 from training. Duration: 0.97275 2023-10-04 03:09:35,015 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6287_210_3273_1_1527509506_2653716_182-199887-0_sp0.9 from training. Duration: 0.5666875 2023-10-04 03:09:36,175 WARNING [train.py:1204] (2/4) Exclude cut with ID _3231_210_2106_1_1526205864934_6784220_171-256004-0 from training. Duration: 0.86 2023-10-04 03:09:37,578 WARNING [train.py:1204] (2/4) Exclude cut with ID _69051_210_1531_1_1535885566901_3829538_332-92090-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 03:09:38,896 WARNING [train.py:1204] (2/4) Exclude cut with ID _3675_210_4557_1_1526639934408_4182089_104-324125-0_sp1.1 from training. Duration: 0.963625 2023-10-04 03:09:42,093 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_41764_210_2521_1_1533686110_4089313_501-2119-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 03:09:43,777 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.conv_module1.balancer2.prob, batch_count=1504266.6666666667, ans=0.125 2023-10-04 03:09:46,259 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_14056_210_4163_1_1530604483_7572827_56-100988-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 03:09:48,952 WARNING [train.py:1204] (2/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_398-239819-0_sp1.1 from training. Duration: 0.9 2023-10-04 03:09:55,109 WARNING [train.py:1204] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_74-48717-0_sp1.1 from training. Duration: 0.52725 2023-10-04 03:09:57,970 WARNING [train.py:1204] (2/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_177-49092-0 from training. Duration: 0.91 2023-10-04 03:09:58,002 WARNING [train.py:1204] (2/4) Exclude cut with ID _62337_210_12392_1_1535009273105_3412229_44-176353-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 03:09:58,011 WARNING [train.py:1204] (2/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_110-130961-0_sp1.1 from training. Duration: 0.42725 2023-10-04 03:09:59,544 WARNING [train.py:1204] (2/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_37-189262-0_sp1.1 from training. Duration: 0.8909375 2023-10-04 03:09:59,544 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_3172_210_2087_1_1526292929_3987865_197-97762-0_sp0.9 from training. Duration: 0.8333125 2023-10-04 03:10:00,792 WARNING [train.py:1204] (2/4) Exclude cut with ID IC0677W0065-334897 from training. Duration: 0.896 2023-10-04 03:10:00,793 WARNING [train.py:1204] (2/4) Exclude cut with ID IC0677W0080-334912 from training. Duration: 0.768 2023-10-04 03:10:00,798 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_120-41668-0 from training. Duration: 0.7 2023-10-04 03:10:03,632 WARNING [train.py:1204] (2/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_460-205287-0_sp1.1 from training. Duration: 0.963625 2023-10-04 03:10:05,026 WARNING [train.py:1204] (2/4) Exclude cut with ID _14632_210_9988_1_1530691279392_3923200_102-204477-0 from training. Duration: 0.97 2023-10-04 03:10:05,031 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_34815_210_6388_1_1533381320_3571903_279-305565-0 from training. Duration: 0.92 2023-10-04 03:10:06,896 WARNING [train.py:1204] (2/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_91-148831-0_sp1.1 from training. Duration: 0.9 2023-10-04 03:10:06,957 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6291_210_3273_1_1527683649_1078498_82-221196-0 from training. Duration: 0.76 2023-10-04 03:10:11,555 WARNING [train.py:1204] (2/4) Exclude cut with ID _38020_210_5986_1_1533112252259_7006089_493-102745-0 from training. Duration: 0.98 2023-10-04 03:10:14,242 INFO [train.py:1046] (2/4) Epoch 43, batch 2550, loss[loss=0.1386, simple_loss=0.2142, pruned_loss=0.03152, over 24323.00 frames. ], tot_loss[loss=0.1544, simple_loss=0.2345, pruned_loss=0.03714, over 4703172.29 frames. ], batch size: 56, lr: 2.36e-03, grad_scale: 16.0 2023-10-04 03:10:14,293 WARNING [train.py:1204] (2/4) Exclude cut with ID _61133_210_9969_1_1535196777227_3696409_40-93442-0_sp1.1 from training. Duration: 0.92725 2023-10-04 03:10:15,691 WARNING [train.py:1204] (2/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_45-258852-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 03:10:15,735 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_53071_210_16554_1_1534493434_1276796_130-36076-0_sp1.1 from training. Duration: 0.863625 2023-10-04 03:10:17,208 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8694_210_6400_1_1529065754_3260756_332-13553-0_sp1.1 from training. Duration: 0.92725 2023-10-04 03:10:18,457 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_405-339986-0 from training. Duration: 0.51 2023-10-04 03:10:18,514 WARNING [train.py:1204] (2/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_927-43827-0_sp1.1 from training. Duration: 0.67275 2023-10-04 03:10:21,304 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_397-147196-0 from training. Duration: 0.8 2023-10-04 03:10:24,468 WARNING [train.py:1204] (2/4) Exclude cut with ID 3557-8342-0013-54691-0_sp1.1 from training. Duration: 0.836375 2023-10-04 03:10:27,031 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_47438_210_5399_1_1533797471_4513356_626-127722-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 03:10:29,804 WARNING [train.py:1204] (2/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_256-323883-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 03:10:29,809 WARNING [train.py:1204] (2/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_839-173017-0_sp0.9 from training. Duration: 0.4555625 2023-10-04 03:10:29,844 WARNING [train.py:1204] (2/4) Exclude cut with ID _5289_210_2084_1_1527678202736_3779299_392-261988-0_sp1.1 from training. Duration: 0.5909375 2023-10-04 03:10:31,147 WARNING [train.py:1204] (2/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_119-224384-0_sp1.1 from training. Duration: 0.936375 2023-10-04 03:10:32,419 WARNING [train.py:1204] (2/4) Exclude cut with ID _57523_210_1856_1_1534766294545_3732989_262-267933-0_sp1.1 from training. Duration: 0.963625 2023-10-04 03:10:33,935 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_35361_210_4238_1_1533262810_7793476_675-87012-0_sp0.9 from training. Duration: 0.988875 2023-10-04 03:10:33,954 WARNING [train.py:1204] (2/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_17-90541-0 from training. Duration: 0.94 2023-10-04 03:10:35,282 WARNING [train.py:1204] (2/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_535-14410-0_sp1.1 from training. Duration: 0.42725 2023-10-04 03:10:35,295 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_24673_210_6400_1_1531968633_778659_94-154511-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 03:10:35,298 WARNING [train.py:1204] (2/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_212-10501-0 from training. Duration: 0.94 2023-10-04 03:10:49,288 WARNING [train.py:1204] (2/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_383-285196-0_sp1.1 from training. Duration: 0.8818125 2023-10-04 03:10:54,883 WARNING [train.py:1204] (2/4) Exclude cut with ID _45560_210_21982_1_1533812603951_3584999_238-47020-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 03:10:54,899 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_21500_210_2054_1_1532149310_2716758_278-18053-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 03:10:54,906 WARNING [train.py:1204] (2/4) Exclude cut with ID _56235_210_10640_1_1534557335235_4161099_453-9626-0_sp1.1 from training. Duration: 0.936375 2023-10-04 03:10:56,287 WARNING [train.py:1204] (2/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_153-97441-0_sp1.1 from training. Duration: 0.5090625 2023-10-04 03:11:01,136 WARNING [train.py:1204] (2/4) Exclude cut with ID _67725_210_27767_1_1535608756671_5105359_431-120895-0_sp1.1 from training. Duration: 0.963625 2023-10-04 03:11:03,206 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.2.encoder.layers.2.self_attn1.whiten, num_groups=1, num_channels=384, metric=17.84 vs. limit=22.5 2023-10-04 03:11:03,841 WARNING [train.py:1204] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_574-45541-0_sp1.1 from training. Duration: 0.5909375 2023-10-04 03:11:03,864 WARNING [train.py:1204] (2/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_162-103620-0_sp1.1 from training. Duration: 0.8454375 2023-10-04 03:11:03,875 WARNING [train.py:1204] (2/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_734-129761-0_sp1.1 from training. Duration: 0.7454375 2023-10-04 03:11:05,211 WARNING [train.py:1204] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_620-276793-0_sp1.1 from training. Duration: 0.536375 2023-10-04 03:11:06,523 WARNING [train.py:1204] (2/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_153-97441-0_sp0.9 from training. Duration: 0.62225 2023-10-04 03:11:09,694 WARNING [train.py:1204] (2/4) Exclude cut with ID _12733_210_8341_1_1530167424556_3646380_523-85621-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 03:11:09,715 WARNING [train.py:1204] (2/4) Exclude cut with ID _64048_210_10626_1_1535198382802_3835179_352-179834-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 03:11:13,250 WARNING [train.py:1204] (2/4) Exclude cut with ID _55162_210_17071_1_1534408248583_3753480_43-242364-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 03:11:13,274 WARNING [train.py:1204] (2/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_661-264857-0 from training. Duration: 0.93 2023-10-04 03:11:14,423 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.602e+02 1.895e+02 2.103e+02 2.314e+02 4.132e+02, threshold=4.207e+02, percent-clipped=0.0 2023-10-04 03:11:14,490 WARNING [train.py:1204] (2/4) Exclude cut with ID _6274_210_2084_1_1527937583837_7494020_325-237800-0_sp1.1 from training. Duration: 0.97275 2023-10-04 03:11:14,542 WARNING [train.py:1204] (2/4) Exclude cut with ID _55328_210_27767_1_1534580973260_4088170_385-171459-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 03:11:15,780 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_3780_210_2087_1_1526467389_2145572_176-107140-0_sp1.1 from training. Duration: 0.62725 2023-10-04 03:11:15,869 WARNING [train.py:1204] (2/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_375-244976-0_sp1.1 from training. Duration: 0.6818125 2023-10-04 03:11:17,357 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.0.feed_forward3.hidden_balancer.prob, batch_count=1504666.6666666667, ans=0.125 2023-10-04 03:11:19,116 WARNING [train.py:1204] (2/4) Exclude cut with ID _35941_210_15379_1_1532995294515_4023089_309-33162-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 03:11:24,677 WARNING [train.py:1204] (2/4) Exclude cut with ID _14459_210_2228_1_1531443641946_7516700_1185-22038-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 03:11:26,253 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_1197-69460-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 03:11:27,946 INFO [train.py:1046] (2/4) Epoch 43, batch 2600, loss[loss=0.1719, simple_loss=0.2497, pruned_loss=0.04701, over 23368.00 frames. ], tot_loss[loss=0.1552, simple_loss=0.2351, pruned_loss=0.03762, over 4709761.29 frames. ], batch size: 106, lr: 2.36e-03, grad_scale: 16.0 2023-10-04 03:11:29,523 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9012W0022-501157 from training. Duration: 0.896 2023-10-04 03:11:30,930 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9023W0009-506302 from training. Duration: 0.896 2023-10-04 03:11:30,946 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_2821_210_2715_1_1526108385_3763560_73-296673-0_sp1.1 from training. Duration: 0.7545625 2023-10-04 03:11:30,986 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9025W0113-507442 from training. Duration: 0.896 2023-10-04 03:11:31,065 WARNING [train.py:1204] (2/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_739-2098-0 from training. Duration: 0.92 2023-10-04 03:11:32,266 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9029W0332-509722 from training. Duration: 0.768 2023-10-04 03:11:33,804 WARNING [train.py:1204] (2/4) Exclude cut with ID _16908_210_5724_1_1531119157248_3772700_68-101953-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 03:11:35,153 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9042W0011-515617 from training. Duration: 0.768 2023-10-04 03:11:36,508 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_3019_210_2028_1_1526044245_3264865_191-72235-0 from training. Duration: 0.97 2023-10-04 03:11:36,664 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.conv_module1.balancer2.prob, batch_count=1504733.3333333333, ans=0.125 2023-10-04 03:11:37,106 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.3.encoder.layers.2.feed_forward2.out_whiten, num_groups=1, num_channels=512, metric=10.99 vs. limit=15.0 2023-10-04 03:11:38,502 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9052W0006-520792 from training. Duration: 0.896 2023-10-04 03:11:39,056 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.4.encoder.layers.0.nonlin_attention.whiten2, num_groups=1, num_channels=384, metric=8.07 vs. limit=15.0 2023-10-04 03:11:41,182 WARNING [train.py:1204] (2/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_245-174553-0_sp1.1 from training. Duration: 0.8 2023-10-04 03:11:42,581 WARNING [train.py:1204] (2/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_202-67017-0 from training. Duration: 0.82 2023-10-04 03:11:42,680 WARNING [train.py:1204] (2/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_304-59227-0 from training. Duration: 0.77 2023-10-04 03:11:44,087 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_3133_210_2087_1_1526106446_2324690_246-125000-0_sp0.9 from training. Duration: 0.688875 2023-10-04 03:11:44,343 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=1504800.0, ans=0.1 2023-10-04 03:11:45,443 WARNING [train.py:1204] (2/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_243-42116-0 from training. Duration: 0.73 2023-10-04 03:11:46,958 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9087W0121-538492 from training. Duration: 0.896 2023-10-04 03:11:48,343 WARNING [train.py:1204] (2/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_120-76843-0 from training. Duration: 0.55 2023-10-04 03:11:56,297 WARNING [train.py:1204] (2/4) Exclude cut with ID _7852_210_5219_1_1528286471316_8139400_850-304621-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 03:11:56,309 WARNING [train.py:1204] (2/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_154-285415-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 03:11:56,320 WARNING [train.py:1204] (2/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_538-278194-0_sp1.1 from training. Duration: 0.92725 2023-10-04 03:11:56,320 WARNING [train.py:1204] (2/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_119-207162-0 from training. Duration: 0.71 2023-10-04 03:11:59,077 WARNING [train.py:1204] (2/4) Exclude cut with ID _2334_210_2667_1_1525608238880_4705129_459-104997-0_sp1.1 from training. Duration: 0.863625 2023-10-04 03:12:03,487 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9147W0019-567847 from training. Duration: 0.768 2023-10-04 03:12:07,699 WARNING [train.py:1204] (2/4) Exclude cut with ID _69884_210_9771_1_1535846392818_4166169_228-189439-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 03:12:09,008 WARNING [train.py:1204] (2/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_196-77172-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 03:12:10,329 WARNING [train.py:1204] (2/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_625-338546-0 from training. Duration: 0.88 2023-10-04 03:12:10,394 WARNING [train.py:1204] (2/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_193-327630-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 03:12:10,396 WARNING [train.py:1204] (2/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_391-24006-0_sp1.1 from training. Duration: 0.92725 2023-10-04 03:12:12,890 WARNING [train.py:1204] (2/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_197-28164-0 from training. Duration: 0.8 2023-10-04 03:12:14,335 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.0.feed_forward1.hidden_balancer.prob, batch_count=1504933.3333333333, ans=0.125 2023-10-04 03:12:14,423 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.feed_forward3.hidden_balancer.prob, batch_count=1504933.3333333333, ans=0.125 2023-10-04 03:12:15,508 WARNING [train.py:1204] (2/4) Exclude cut with ID _8395_210_1531_1_1528597871523_4411579_431-132470-0_sp1.1 from training. Duration: 0.67275 2023-10-04 03:12:16,772 WARNING [train.py:1204] (2/4) Exclude cut with ID _35350_210_16040_1_1533040191183_4075299_465-248326-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 03:12:19,507 WARNING [train.py:1204] (2/4) Exclude cut with ID _16756_210_4915_1_1531047803763_7104259_101-141922-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 03:12:22,676 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9212W0005-600172 from training. Duration: 0.896 2023-10-04 03:12:22,706 WARNING [train.py:1204] (2/4) Exclude cut with ID _63186_210_2084_1_1535414479705_3908649_378-166820-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 03:12:22,725 WARNING [train.py:1204] (2/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_52-211683-0_sp1.1 from training. Duration: 0.8181875 2023-10-04 03:12:30,103 WARNING [train.py:1204] (2/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_1147-237729-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 03:12:30,188 WARNING [train.py:1204] (2/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_234-14174-0_sp0.9 from training. Duration: 0.911125 2023-10-04 03:12:30,206 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_454-276429-0 from training. Duration: 0.87 2023-10-04 03:12:31,544 WARNING [train.py:1204] (2/4) Exclude cut with ID _53713_210_24116_1_1534314487762_7416751_755-165313-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 03:12:33,002 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8278_210_2550_1_1528637099_4220413_436-317072-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 03:12:34,313 WARNING [train.py:1204] (2/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_28-324729-0_sp1.1 from training. Duration: 0.8545625 2023-10-04 03:12:39,904 WARNING [train.py:1204] (2/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_326-165900-0 from training. Duration: 0.69 2023-10-04 03:12:41,237 INFO [train.py:1046] (2/4) Epoch 43, batch 2650, loss[loss=0.151, simple_loss=0.2386, pruned_loss=0.03173, over 23512.00 frames. ], tot_loss[loss=0.1554, simple_loss=0.2357, pruned_loss=0.03755, over 4707483.25 frames. ], batch size: 94, lr: 2.36e-03, grad_scale: 8.0 2023-10-04 03:12:41,281 WARNING [train.py:1204] (2/4) Exclude cut with ID _53489_210_6228_1_1534298357169_3835970_96-30246-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 03:12:41,420 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8275_210_4198_1_1528455320_3892571_491-17707-0_sp1.1 from training. Duration: 0.5818125 2023-10-04 03:12:46,617 WARNING [train.py:1204] (2/4) Exclude cut with ID _14348_210_8341_1_1530599434996_3778640_240-232370-0 from training. Duration: 0.84 2023-10-04 03:12:47,856 WARNING [train.py:1204] (2/4) Exclude cut with ID _45012_210_12651_1_1533608435136_5371380_54-112623-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 03:12:49,113 WARNING [train.py:1204] (2/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_328-264776-0_sp1.1 from training. Duration: 0.6090625 2023-10-04 03:12:49,177 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9325W0001-648427 from training. Duration: 0.768 2023-10-04 03:12:49,192 WARNING [train.py:1204] (2/4) Exclude cut with ID _56480_210_4220_1_1534679942079_3075409_49-74717-0_sp1.1 from training. Duration: 0.936375 2023-10-04 03:12:50,639 WARNING [train.py:1204] (2/4) Exclude cut with ID _59500_210_8254_1_1534809610680_3736009_6-241869-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 03:12:55,171 WARNING [train.py:1204] (2/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_172-215932-0_sp0.9 from training. Duration: 0.7333125 2023-10-04 03:12:56,561 WARNING [train.py:1204] (2/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_17-90541-0_sp1.1 from training. Duration: 0.8545625 2023-10-04 03:12:58,005 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_43831_210_2158_1_1533725502_5872791_525-89447-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 03:12:58,083 WARNING [train.py:1204] (2/4) Exclude cut with ID _12302_210_5219_1_1529924446466_8107280_907-182949-0 from training. Duration: 0.92 2023-10-04 03:12:59,496 WARNING [train.py:1204] (2/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_402-102311-0_sp0.9 from training. Duration: 0.7444375 2023-10-04 03:12:59,527 WARNING [train.py:1204] (2/4) Exclude cut with ID _8021_210_7994_1_1528376451358_3653069_179-45206-0_sp0.9 from training. Duration: 0.9555625 2023-10-04 03:13:01,543 WARNING [train.py:1204] (2/4) Exclude cut with ID _40137_210_12210_1_1533290484046_3795180_249-187059-0 from training. Duration: 0.88 2023-10-04 03:13:04,237 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9653W0194-675472 from training. Duration: 0.9658125 2023-10-04 03:13:04,545 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.bypass_mid.scale_min, batch_count=1505133.3333333333, ans=0.2 2023-10-04 03:13:05,716 WARNING [train.py:1204] (2/4) Exclude cut with ID _51332_210_9988_1_1534244371836_3801270_336-255604-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 03:13:08,453 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_20120_210_6753_1_1531643035_5482859_510-96996-0 from training. Duration: 0.87 2023-10-04 03:13:08,469 WARNING [train.py:1204] (2/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_1167-20437-0_sp1.1 from training. Duration: 0.92725 2023-10-04 03:13:09,785 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_3475_210_2028_1_1526193578_3622569_265-276625-0 from training. Duration: 0.76 2023-10-04 03:13:13,301 WARNING [train.py:1204] (2/4) Exclude cut with ID _14860_210_4915_1_1530702020340_7343240_470-172418-0_sp1.1 from training. Duration: 0.97275 2023-10-04 03:13:13,317 WARNING [train.py:1204] (2/4) Exclude cut with ID _5444_210_2151_1_1527418808464_5312210_581-315411-0_sp1.1 from training. Duration: 0.563625 2023-10-04 03:13:13,334 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_34450_210_9547_1_1533099443_4109050_519-342439-0_sp1.1 from training. Duration: 0.97275 2023-10-04 03:13:14,649 WARNING [train.py:1204] (2/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_50-175961-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 03:13:18,035 WARNING [train.py:1204] (2/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_372-6869-0 from training. Duration: 0.44 2023-10-04 03:13:18,040 WARNING [train.py:1204] (2/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_3-237434-0 from training. Duration: 0.76 2023-10-04 03:13:22,148 WARNING [train.py:1204] (2/4) Exclude cut with ID _8772_210_1850_1_1528635595972_3664011_247-336448-0_sp1.1 from training. Duration: 0.67275 2023-10-04 03:13:24,888 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_427-78097-0 from training. Duration: 0.8 2023-10-04 03:13:24,904 WARNING [train.py:1204] (2/4) Exclude cut with ID _69884_210_9771_1_1535846392818_4166169_466-252727-0_sp1.1 from training. Duration: 0.97275 2023-10-04 03:13:26,840 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_15972_210_6400_1_1530877934_3405692_316-35478-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 03:13:26,876 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_809-17156-0_sp0.9 from training. Duration: 0.811125 2023-10-04 03:13:26,916 WARNING [train.py:1204] (2/4) Exclude cut with ID _57572_210_5517_1_1534921219624_3809484_594-263474-0_sp1.1 from training. Duration: 0.936375 2023-10-04 03:13:27,172 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=1505266.6666666667, ans=0.1 2023-10-04 03:13:28,134 WARNING [train.py:1204] (2/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_190-228204-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 03:13:28,277 WARNING [train.py:1204] (2/4) Exclude cut with ID _19514_210_9259_1_1531530071330_5084230_117-21193-0_sp1.1 from training. Duration: 0.936375 2023-10-04 03:13:31,079 WARNING [train.py:1204] (2/4) Exclude cut with ID _69584_210_5219_1_1535785133775_4044450_190-21315-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 03:13:32,392 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_53071_210_16554_1_1534493434_1276796_184-302050-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 03:13:32,444 WARNING [train.py:1204] (2/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_120-76843-0_sp1.1 from training. Duration: 0.5 2023-10-04 03:13:34,407 WARNING [train.py:1204] (2/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_318-162017-0_sp1.1 from training. Duration: 0.82725 2023-10-04 03:13:37,145 WARNING [train.py:1204] (2/4) Exclude cut with ID _46067_210_4220_1_1534125571066_3307450_79-46864-0_sp1.1 from training. Duration: 0.92725 2023-10-04 03:13:37,216 WARNING [train.py:1204] (2/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_358-215558-0_sp1.1 from training. Duration: 0.8454375 2023-10-04 03:13:38,426 WARNING [train.py:1204] (2/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_621-66764-0_sp1.1 from training. Duration: 0.92725 2023-10-04 03:13:38,538 WARNING [train.py:1204] (2/4) Exclude cut with ID _42863_210_9070_1_1533902398788_3696920_338-156851-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 03:13:38,673 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=1505266.6666666667, ans=0.1 2023-10-04 03:13:39,838 WARNING [train.py:1204] (2/4) Exclude cut with ID _5289_210_2084_1_1527678202736_3779299_392-261988-0_sp0.9 from training. Duration: 0.72225 2023-10-04 03:13:40,112 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.self_attn_weights.pos_emb_skip_rate, batch_count=1505333.3333333333, ans=0.0 2023-10-04 03:13:43,044 WARNING [train.py:1204] (2/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_409-84682-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 03:13:44,754 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.610e+02 1.943e+02 2.072e+02 2.278e+02 3.072e+02, threshold=4.144e+02, percent-clipped=0.0 2023-10-04 03:13:44,901 WARNING [train.py:1204] (2/4) Exclude cut with ID _9235_210_5219_1_1528891154253_8046120_30-326681-0_sp0.9 from training. Duration: 0.92225 2023-10-04 03:13:44,905 WARNING [train.py:1204] (2/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_389-184695-0_sp1.1 from training. Duration: 0.92725 2023-10-04 03:13:46,205 WARNING [train.py:1204] (2/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_442-320317-0 from training. Duration: 0.82 2023-10-04 03:13:50,889 WARNING [train.py:1204] (2/4) Exclude cut with ID _44778_210_2054_1_1533798127203_7443090_756-29911-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 03:13:52,284 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_401-112036-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 03:13:53,605 WARNING [train.py:1204] (2/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_181-308827-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 03:13:53,815 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=1505333.3333333333, ans=0.1 2023-10-04 03:13:54,967 WARNING [train.py:1204] (2/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_332-258725-0_sp1.1 from training. Duration: 0.963625 2023-10-04 03:13:56,281 INFO [train.py:1046] (2/4) Epoch 43, batch 2700, loss[loss=0.1466, simple_loss=0.2346, pruned_loss=0.02932, over 24458.00 frames. ], tot_loss[loss=0.1565, simple_loss=0.2372, pruned_loss=0.03791, over 4705641.85 frames. ], batch size: 63, lr: 2.36e-03, grad_scale: 8.0 2023-10-04 03:13:56,417 WARNING [train.py:1204] (2/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_188-260011-0_sp0.9 from training. Duration: 0.811125 2023-10-04 03:13:56,468 WARNING [train.py:1204] (2/4) Exclude cut with ID _49039_210_6315_1_1533967537541_3846118_99-302239-0_sp1.1 from training. Duration: 0.963625 2023-10-04 03:13:58,273 WARNING [train.py:1204] (2/4) Exclude cut with ID _4122_210_2084_1_1526713164417_4353280_312-294828-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 03:13:58,279 WARNING [train.py:1204] (2/4) Exclude cut with ID _14922_210_6228_1_1530707435273_3693310_303-228738-0 from training. Duration: 0.84 2023-10-04 03:14:01,027 WARNING [train.py:1204] (2/4) Exclude cut with ID _14349_210_8341_1_1530615710818_3442470_115-285975-0_sp0.9 from training. Duration: 0.9555625 2023-10-04 03:14:03,729 WARNING [train.py:1204] (2/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_125-144174-0_sp1.1 from training. Duration: 0.3545625 2023-10-04 03:14:05,070 WARNING [train.py:1204] (2/4) Exclude cut with ID _53565_210_10635_1_1534337218335_3820259_351-54570-0_sp1.1 from training. Duration: 0.97275 2023-10-04 03:14:05,096 WARNING [train.py:1204] (2/4) Exclude cut with ID _28317_210_12742_1_1532480302381_8510325_410-40065-0_sp1.1 from training. Duration: 0.963625 2023-10-04 03:14:05,122 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_38965_210_4852_1_1533174906_3484735_167-339442-0_sp1.1 from training. Duration: 0.963625 2023-10-04 03:14:07,077 WARNING [train.py:1204] (2/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_265-164486-0_sp1.1 from training. Duration: 0.87275 2023-10-04 03:14:07,086 WARNING [train.py:1204] (2/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_725-182484-0_sp1.1 from training. Duration: 0.92725 2023-10-04 03:14:08,441 WARNING [train.py:1204] (2/4) Exclude cut with ID _28317_210_12742_1_1532480302381_8510325_607-213951-0_sp0.9 from training. Duration: 0.9333125 2023-10-04 03:14:08,460 WARNING [train.py:1204] (2/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_605-133395-0_sp1.1 from training. Duration: 0.6 2023-10-04 03:14:08,480 WARNING [train.py:1204] (2/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_351-294650-0 from training. Duration: 0.63 2023-10-04 03:14:09,783 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_14953_210_2151_1_1530701710_5616165_710-165400-0_sp1.1 from training. Duration: 0.7909375 2023-10-04 03:14:11,152 WARNING [train.py:1204] (2/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_237-58433-0_sp1.1 from training. Duration: 0.67275 2023-10-04 03:14:12,554 WARNING [train.py:1204] (2/4) Exclude cut with ID _5190_210_4557_1_1527244368799_373439_28-197030-0_sp1.1 from training. Duration: 0.6545625 2023-10-04 03:14:12,612 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_743-327651-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 03:14:16,800 WARNING [train.py:1204] (2/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_76-203867-0_sp0.9 from training. Duration: 0.8 2023-10-04 03:14:18,590 WARNING [train.py:1204] (2/4) Exclude cut with ID _14603_210_4220_1_1530615688441_3097319_274-274798-0 from training. Duration: 0.98 2023-10-04 03:14:18,631 WARNING [train.py:1204] (2/4) Exclude cut with ID _14087_210_2253_1_1530529224512_3014029_226-242140-0_sp1.1 from training. Duration: 0.736375 2023-10-04 03:14:23,550 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.ff3_skip_rate, batch_count=1505466.6666666667, ans=0.0 2023-10-04 03:14:24,624 WARNING [train.py:1204] (2/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_352-59958-0_sp1.1 from training. Duration: 0.7818125 2023-10-04 03:14:24,629 WARNING [train.py:1204] (2/4) Exclude cut with ID _39411_210_5986_1_1533538825030_3343649_191-251819-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 03:14:29,420 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7046_210_6753_1_1527895337_6920917_642-106799-0_sp1.1 from training. Duration: 0.836375 2023-10-04 03:14:29,427 WARNING [train.py:1204] (2/4) Exclude cut with ID _22826_210_9994_1_1531908201522_3279169_412-3442-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 03:14:29,452 WARNING [train.py:1204] (2/4) Exclude cut with ID _60057_210_10640_1_1534903065793_3333150_171-181133-0_sp0.9 from training. Duration: 0.97775 2023-10-04 03:14:29,468 WARNING [train.py:1204] (2/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_154-135696-0_sp1.1 from training. Duration: 0.72725 2023-10-04 03:14:32,506 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.ff3_skip_rate, batch_count=1505533.3333333333, ans=0.0 2023-10-04 03:14:33,521 WARNING [train.py:1204] (2/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_148-186805-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 03:14:35,146 WARNING [train.py:1204] (2/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_605-165641-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 03:14:35,152 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_174-277364-0_sp0.9 from training. Duration: 0.788875 2023-10-04 03:14:35,165 WARNING [train.py:1204] (2/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_89-321955-0_sp0.9 from training. Duration: 0.888875 2023-10-04 03:14:39,839 WARNING [train.py:1204] (2/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_438-105429-0_sp1.1 from training. Duration: 0.963625 2023-10-04 03:14:39,842 WARNING [train.py:1204] (2/4) Exclude cut with ID _14970_210_15709_1_1530773543493_1715730_187-20259-0_sp0.9 from training. Duration: 0.988875 2023-10-04 03:14:48,220 WARNING [train.py:1204] (2/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_385-193052-0_sp0.9 from training. Duration: 0.9555625 2023-10-04 03:14:48,277 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7113_210_1849_1_1527942685_3448768_194-311626-0_sp1.1 from training. Duration: 0.92725 2023-10-04 03:14:51,423 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_3780_210_2087_1_1526467389_2145572_176-107140-0_sp0.9 from training. Duration: 0.7666875 2023-10-04 03:14:51,424 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_36756_210_7234_1_1533026991_966276_123-213457-0_sp1.1 from training. Duration: 0.936375 2023-10-04 03:14:54,610 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.nonlin_attention.balancer.prob, batch_count=1505666.6666666667, ans=0.125 2023-10-04 03:14:55,641 WARNING [train.py:1204] (2/4) Exclude cut with ID _23557_210_8750_1_1531890106370_6846255_456-244861-0_sp1.1 from training. Duration: 0.963625 2023-10-04 03:14:56,973 WARNING [train.py:1204] (2/4) Exclude cut with ID _37086_210_9038_1_1532999540891_7021250_242-349170-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 03:14:57,025 WARNING [train.py:1204] (2/4) Exclude cut with ID _41298_210_9019_1_1533430371450_4822143_471-281351-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 03:14:58,886 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_53378_210_17282_1_1534221071_5769509_572-126176-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 03:15:00,321 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_1507-263032-0_sp1.1 from training. Duration: 0.963625 2023-10-04 03:15:00,360 WARNING [train.py:1204] (2/4) Exclude cut with ID _13679_210_7497_1_1530534293662_3355098_411-186088-0_sp1.1 from training. Duration: 0.8545625 2023-10-04 03:15:01,807 WARNING [train.py:1204] (2/4) Exclude cut with ID _29345_210_4381_1_1532689168263_3860169_149-257268-0_sp1.1 from training. Duration: 0.736375 2023-10-04 03:15:03,190 WARNING [train.py:1204] (2/4) Exclude cut with ID _47790_210_16187_1_1533862814066_4207519_381-252014-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 03:15:03,196 WARNING [train.py:1204] (2/4) Exclude cut with ID _35799_210_5854_1_1533263487887_3529071_512-2717-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 03:15:06,611 WARNING [train.py:1204] (2/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_505-205270-0_sp0.9 from training. Duration: 0.42225 2023-10-04 03:15:06,673 WARNING [train.py:1204] (2/4) Exclude cut with ID _14998_210_5219_1_1530705582080_7474970_731-20117-0_sp1.1 from training. Duration: 0.936375 2023-10-04 03:15:08,896 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.2.encoder.layers.0.feed_forward1.out_whiten, num_groups=1, num_channels=384, metric=4.86 vs. limit=15.0 2023-10-04 03:15:09,426 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_37351_210_7925_1_1533012931_3812113_265-85780-0_sp1.1 from training. Duration: 0.8 2023-10-04 03:15:09,439 WARNING [train.py:1204] (2/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_313-134930-0 from training. Duration: 0.97 2023-10-04 03:15:10,686 INFO [train.py:1046] (2/4) Epoch 43, batch 2750, loss[loss=0.1517, simple_loss=0.2091, pruned_loss=0.0471, over 19784.00 frames. ], tot_loss[loss=0.1566, simple_loss=0.2369, pruned_loss=0.03811, over 4691337.19 frames. ], batch size: 388, lr: 2.36e-03, grad_scale: 8.0 2023-10-04 03:15:10,798 WARNING [train.py:1204] (2/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_374-138971-0 from training. Duration: 0.94 2023-10-04 03:15:10,851 WARNING [train.py:1204] (2/4) Exclude cut with ID _30304_210_6319_1_1532476827261_7582060_1227-287886-0_sp1.1 from training. Duration: 0.936375 2023-10-04 03:15:14,874 WARNING [train.py:1204] (2/4) Exclude cut with ID _59493_210_17282_1_1535078419077_3225879_332-217544-0_sp1.1 from training. Duration: 0.97275 2023-10-04 03:15:14,924 WARNING [train.py:1204] (2/4) Exclude cut with ID _23514_210_1389_1_1531832139785_3031439_361-119640-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 03:15:18,255 WARNING [train.py:1204] (2/4) Exclude cut with ID _69584_210_5219_1_1535785133775_4044450_142-118058-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 03:15:18,292 WARNING [train.py:1204] (2/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_178-177582-0_sp1.1 from training. Duration: 0.67275 2023-10-04 03:15:18,334 WARNING [train.py:1204] (2/4) Exclude cut with ID _25303_210_17519_1_1532050207576_3635970_41-195572-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 03:15:18,545 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.conv_skip_rate, batch_count=1505733.3333333333, ans=0.0 2023-10-04 03:15:21,696 WARNING [train.py:1204] (2/4) Exclude cut with ID _55328_210_27767_1_1534580973260_4088170_329-341322-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 03:15:23,055 WARNING [train.py:1204] (2/4) Exclude cut with ID _5608_210_2106_1_1527415439089_6819768_909-82767-0_sp1.1 from training. Duration: 0.5545625 2023-10-04 03:15:23,110 WARNING [train.py:1204] (2/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_261-297503-0_sp1.1 from training. Duration: 0.9 2023-10-04 03:15:23,116 WARNING [train.py:1204] (2/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_459-291706-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 03:15:23,117 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7762_210_1794_1_1528199508_4057408_422-227034-0 from training. Duration: 0.73 2023-10-04 03:15:24,393 WARNING [train.py:1204] (2/4) Exclude cut with ID _3067_210_2667_1_1526212974640_4503168_250-285937-0_sp0.9 from training. Duration: 0.888875 2023-10-04 03:15:24,408 WARNING [train.py:1204] (2/4) Exclude cut with ID _66873_210_5517_1_1535454136506_4293370_332-253745-0_sp1.1 from training. Duration: 0.936375 2023-10-04 03:15:29,909 WARNING [train.py:1204] (2/4) Exclude cut with ID _13938_210_9988_1_1530518718978_3693960_304-112784-0 from training. Duration: 0.46 2023-10-04 03:15:30,057 WARNING [train.py:1204] (2/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_226-267957-0_sp1.1 from training. Duration: 0.92725 2023-10-04 03:15:30,096 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_41822_210_13270_1_1533955976_4379284_608-254567-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 03:15:31,768 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_124-113562-0_sp1.1 from training. Duration: 0.8545625 2023-10-04 03:15:31,797 WARNING [train.py:1204] (2/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_117-290916-0_sp1.1 from training. Duration: 0.463625 2023-10-04 03:15:33,141 WARNING [train.py:1204] (2/4) Exclude cut with ID _28427_210_4220_1_1532581240967_3061069_102-66533-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 03:15:33,220 WARNING [train.py:1204] (2/4) Exclude cut with ID _55383_210_10635_1_1534468426172_2231669_134-128246-0_sp1.1 from training. Duration: 0.8090625 2023-10-04 03:15:34,956 WARNING [train.py:1204] (2/4) Exclude cut with ID _45871_210_5665_1_1533864614245_3296060_49-308316-0_sp1.1 from training. Duration: 0.97275 2023-10-04 03:15:35,001 WARNING [train.py:1204] (2/4) Exclude cut with ID _57572_210_5517_1_1534921219624_3809484_176-199205-0_sp1.1 from training. Duration: 0.97275 2023-10-04 03:15:40,248 WARNING [train.py:1204] (2/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_45-265684-0_sp1.1 from training. Duration: 0.6545625 2023-10-04 03:15:40,270 WARNING [train.py:1204] (2/4) Exclude cut with ID _15150_210_8341_1_1530752411803_7256384_469-239509-0_sp0.9 from training. Duration: 0.7555625 2023-10-04 03:15:40,326 WARNING [train.py:1204] (2/4) Exclude cut with ID _28461_210_2253_1_1532771775189_3041299_136-270644-0_sp1.1 from training. Duration: 0.8454375 2023-10-04 03:15:41,676 WARNING [train.py:1204] (2/4) Exclude cut with ID _69584_210_5219_1_1535785133775_4044450_302-47302-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 03:15:43,023 WARNING [train.py:1204] (2/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_166-128941-0_sp1.1 from training. Duration: 0.4909375 2023-10-04 03:15:50,447 WARNING [train.py:1204] (2/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_559-120504-0_sp1.1 from training. Duration: 0.97275 2023-10-04 03:15:52,244 WARNING [train.py:1204] (2/4) Exclude cut with ID _6456_210_1534_1_1527681634212_3181049_345-252877-0_sp1.1 from training. Duration: 0.5818125 2023-10-04 03:15:52,269 WARNING [train.py:1204] (2/4) Exclude cut with ID _28317_210_12742_1_1532480302381_8510325_902-233003-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 03:15:57,623 WARNING [train.py:1204] (2/4) Exclude cut with ID _36538_210_9994_1_1533204020363_3795620_204-261385-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 03:15:57,627 WARNING [train.py:1204] (2/4) Exclude cut with ID _14821_210_8750_1_1530680503991_6829359_189-297451-0_sp1.1 from training. Duration: 0.5 2023-10-04 03:15:57,659 WARNING [train.py:1204] (2/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_1211-226066-0_sp0.9 from training. Duration: 0.8444375 2023-10-04 03:16:02,630 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6786_210_1388_1_1527839699_4194495_273-149841-0_sp1.1 from training. Duration: 0.836375 2023-10-04 03:16:02,666 WARNING [train.py:1204] (2/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_380-162648-0_sp1.1 from training. Duration: 0.8545625 2023-10-04 03:16:02,666 WARNING [train.py:1204] (2/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_494-199538-0 from training. Duration: 0.75 2023-10-04 03:16:07,229 WARNING [train.py:1204] (2/4) Exclude cut with ID _49097_210_16049_1_1533952780207_4360979_276-87053-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 03:16:08,642 WARNING [train.py:1204] (2/4) Exclude cut with ID _5444_210_2151_1_1527418808464_5312210_726-27165-0 from training. Duration: 0.67 2023-10-04 03:16:08,816 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.conv_module2.balancer1.prob, batch_count=1506000.0, ans=0.125 2023-10-04 03:16:12,647 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.576e+02 1.953e+02 2.174e+02 2.380e+02 3.470e+02, threshold=4.348e+02, percent-clipped=0.0 2023-10-04 03:16:14,018 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_128-202104-0_sp1.1 from training. Duration: 0.57275 2023-10-04 03:16:17,070 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_11349_210_6753_1_1529800861_1101207_26-65602-0_sp1.1 from training. Duration: 0.87275 2023-10-04 03:16:17,095 WARNING [train.py:1204] (2/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_159-328157-0 from training. Duration: 0.52 2023-10-04 03:16:17,186 WARNING [train.py:1204] (2/4) Exclude cut with ID _14069_210_5632_1_1530512692033_4803390_98-21934-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 03:16:18,638 WARNING [train.py:1204] (2/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_352-59958-0_sp0.9 from training. Duration: 0.9555625 2023-10-04 03:16:20,003 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6163_210_6400_1_1527506746_4263068_283-137811-0 from training. Duration: 0.77 2023-10-04 03:16:21,743 WARNING [train.py:1204] (2/4) Exclude cut with ID _21989_210_5854_1_1531899214931_3271360_133-44759-0_sp1.1 from training. Duration: 0.7 2023-10-04 03:16:24,611 INFO [train.py:1046] (2/4) Epoch 43, batch 2800, loss[loss=0.1584, simple_loss=0.2371, pruned_loss=0.03984, over 24331.00 frames. ], tot_loss[loss=0.1555, simple_loss=0.2354, pruned_loss=0.0378, over 4698950.92 frames. ], batch size: 61, lr: 2.36e-03, grad_scale: 16.0 2023-10-04 03:16:24,663 WARNING [train.py:1204] (2/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_21-174514-0_sp0.9 from training. Duration: 0.47775 2023-10-04 03:16:24,697 WARNING [train.py:1204] (2/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_201-78811-0_sp1.1 from training. Duration: 0.963625 2023-10-04 03:16:25,982 WARNING [train.py:1204] (2/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_537-164630-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 03:16:26,061 WARNING [train.py:1204] (2/4) Exclude cut with ID _14087_210_2253_1_1530529224512_3014029_156-257607-0 from training. Duration: 0.83 2023-10-04 03:16:26,071 WARNING [train.py:1204] (2/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_308-154450-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 03:16:26,106 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_45399_210_4852_1_1533624725_7579737_214-240574-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 03:16:30,067 WARNING [train.py:1204] (2/4) Exclude cut with ID _29303_210_2575_1_1532392237303_7410649_615-305517-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 03:16:30,117 WARNING [train.py:1204] (2/4) Exclude cut with ID IC0125W0002-61808 from training. Duration: 0.708 2023-10-04 03:16:30,118 WARNING [train.py:1204] (2/4) Exclude cut with ID IC0125W0017-61823 from training. Duration: 0.759 2023-10-04 03:16:32,913 WARNING [train.py:1204] (2/4) Exclude cut with ID _53219_210_4525_1_1534834836008_4064825_4-145291-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 03:16:34,755 WARNING [train.py:1204] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_185-174805-0_sp1.1 from training. Duration: 0.6181875 2023-10-04 03:16:34,769 WARNING [train.py:1204] (2/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_635-193046-0_sp1.1 from training. Duration: 0.92725 2023-10-04 03:16:36,985 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.conv_module2.balancer1.prob, batch_count=1506066.6666666667, ans=0.125 2023-10-04 03:16:39,526 WARNING [train.py:1204] (2/4) Exclude cut with ID _46067_210_4220_1_1534125571066_3307450_321-258866-0_sp1.1 from training. Duration: 0.97275 2023-10-04 03:16:42,171 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8558_210_1410_1_1528505225_7613406_131-329524-0 from training. Duration: 0.79 2023-10-04 03:16:42,318 WARNING [train.py:1204] (2/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_847-41334-0_sp0.9 from training. Duration: 0.488875 2023-10-04 03:16:43,835 WARNING [train.py:1204] (2/4) Exclude cut with ID _14227_210_4381_1_1530701804286_3940259_125-275642-0 from training. Duration: 0.99 2023-10-04 03:16:45,169 WARNING [train.py:1204] (2/4) Exclude cut with ID _25248_210_8750_1_1532062825941_5554180_248-177255-0_sp1.1 from training. Duration: 0.963625 2023-10-04 03:16:46,565 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_40047_210_2290_1_1533290519_2374311_195-165693-0_sp1.1 from training. Duration: 0.8090625 2023-10-04 03:16:46,569 WARNING [train.py:1204] (2/4) Exclude cut with ID _23109_210_4381_1_1532150828777_901949_29-258672-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 03:16:49,976 WARNING [train.py:1204] (2/4) Exclude cut with ID _14821_210_8750_1_1530680503991_6829359_2-227151-0_sp1.1 from training. Duration: 0.7545625 2023-10-04 03:16:50,008 WARNING [train.py:1204] (2/4) Exclude cut with ID _49643_210_7989_1_1534640244224_3939031_565-53982-0_sp1.1 from training. Duration: 0.963625 2023-10-04 03:16:50,012 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7544_210_6753_1_1528591327_1471426_33-346031-0_sp0.9 from training. Duration: 0.67775 2023-10-04 03:16:51,344 WARNING [train.py:1204] (2/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_95-61782-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 03:16:58,800 WARNING [train.py:1204] (2/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_686-54383-0_sp0.9 from training. Duration: 0.9666875 2023-10-04 03:17:00,269 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_14056_210_4163_1_1530604483_7572827_505-276823-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 03:17:01,716 WARNING [train.py:1204] (2/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_745-128443-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 03:17:01,879 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.1.conv_skip_rate, batch_count=1506200.0, ans=0.0 2023-10-04 03:17:03,531 WARNING [train.py:1204] (2/4) Exclude cut with ID _3844_210_1390_1_1526727803582_2871209_20-251664-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 03:17:03,591 WARNING [train.py:1204] (2/4) Exclude cut with ID _36538_210_9994_1_1533204020363_3795620_487-164628-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 03:17:10,909 WARNING [train.py:1204] (2/4) Exclude cut with ID _5166_210_2084_1_1527073197160_4087993_309-222545-0_sp1.1 from training. Duration: 0.763625 2023-10-04 03:17:10,923 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_3531_210_3528_1_1526294203_5216456_309-227302-0 from training. Duration: 0.69 2023-10-04 03:17:10,987 WARNING [train.py:1204] (2/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_42-192460-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 03:17:12,270 WARNING [train.py:1204] (2/4) Exclude cut with ID _22083_210_8254_1_1531702862439_3678970_49-234546-0_sp1.1 from training. Duration: 0.7545625 2023-10-04 03:17:12,275 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_427-78097-0_sp0.9 from training. Duration: 0.888875 2023-10-04 03:17:12,515 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.0.feed_forward1.out_proj.dropout_p, batch_count=1506266.6666666667, ans=0.1 2023-10-04 03:17:15,325 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_43802_210_2290_1_1533516203_8829504_378-205254-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 03:17:15,367 WARNING [train.py:1204] (2/4) Exclude cut with ID _45329_210_7005_1_1534157951211_6774339_672-314830-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 03:17:18,630 WARNING [train.py:1204] (2/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_90-30673-0_sp1.1 from training. Duration: 0.763625 2023-10-04 03:17:19,949 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.bypass.scale_min, batch_count=1506266.6666666667, ans=0.2 2023-10-04 03:17:21,679 WARNING [train.py:1204] (2/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_563-329943-0_sp1.1 from training. Duration: 0.9 2023-10-04 03:17:21,701 WARNING [train.py:1204] (2/4) Exclude cut with ID _21632_210_2253_1_1531994001765_3744140_154-84090-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 03:17:21,702 WARNING [train.py:1204] (2/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_103-336064-0_sp0.9 from training. Duration: 0.5666875 2023-10-04 03:17:21,740 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_43-71989-0_sp0.9 from training. Duration: 0.6333125 2023-10-04 03:17:23,073 WARNING [train.py:1204] (2/4) Exclude cut with ID _3867_210_1883_1_1526641200575_4475830_319-349850-0_sp0.9 from training. Duration: 0.7444375 2023-10-04 03:17:23,161 WARNING [train.py:1204] (2/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_629-179454-0_sp1.1 from training. Duration: 0.963625 2023-10-04 03:17:23,169 WARNING [train.py:1204] (2/4) Exclude cut with ID _65263_210_22356_1_1535763598691_3921037_49-222917-0 from training. Duration: 0.92 2023-10-04 03:17:24,447 WARNING [train.py:1204] (2/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_602-331772-0_sp1.1 from training. Duration: 0.936375 2023-10-04 03:17:24,540 WARNING [train.py:1204] (2/4) Exclude cut with ID _58414_210_8254_1_1535421609162_7151182_292-134978-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 03:17:24,574 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_38793_210_2544_1_1533176114_3849320_200-299292-0_sp1.1 from training. Duration: 0.936375 2023-10-04 03:17:25,928 WARNING [train.py:1204] (2/4) Exclude cut with ID _14970_210_15709_1_1530775547361_1966965_6-23328-0 from training. Duration: 0.86 2023-10-04 03:17:26,005 WARNING [train.py:1204] (2/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_386-225511-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 03:17:27,286 WARNING [train.py:1204] (2/4) Exclude cut with ID _38323_210_12108_1_1533121233369_3303049_416-11919-0_sp1.1 from training. Duration: 0.87275 2023-10-04 03:17:27,342 WARNING [train.py:1204] (2/4) Exclude cut with ID _4866_210_2577_1_1527404412284_4364659_256-306830-0_sp1.1 from training. Duration: 0.7909375 2023-10-04 03:17:27,455 WARNING [train.py:1204] (2/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_53-300544-0 from training. Duration: 0.67 2023-10-04 03:17:34,782 WARNING [train.py:1204] (2/4) Exclude cut with ID _41314_210_12651_1_1533459476543_3766460_80-74184-0_sp1.1 from training. Duration: 0.7545625 2023-10-04 03:17:34,796 WARNING [train.py:1204] (2/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_332-256168-0_sp1.1 from training. Duration: 0.6818125 2023-10-04 03:17:36,129 WARNING [train.py:1204] (2/4) Exclude cut with ID _48836_210_12210_1_1534075848293_3904150_429-35475-0_sp1.1 from training. Duration: 0.8090625 2023-10-04 03:17:37,528 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_144-135833-0_sp1.1 from training. Duration: 0.97275 2023-10-04 03:17:38,859 INFO [train.py:1046] (2/4) Epoch 43, batch 2850, loss[loss=0.1633, simple_loss=0.253, pruned_loss=0.03682, over 24623.00 frames. ], tot_loss[loss=0.1549, simple_loss=0.2344, pruned_loss=0.03769, over 4693073.13 frames. ], batch size: 68, lr: 2.36e-03, grad_scale: 16.0 2023-10-04 03:17:40,393 WARNING [train.py:1204] (2/4) Exclude cut with ID _14224_210_4381_1_1530529062037_4264010_380-343670-0_sp0.9 from training. Duration: 0.97775 2023-10-04 03:17:40,420 WARNING [train.py:1204] (2/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_213-265174-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 03:17:40,448 WARNING [train.py:1204] (2/4) Exclude cut with ID _20455_210_5219_1_1531565996775_8098100_86-314283-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 03:17:42,076 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.conv_skip_rate, batch_count=1506400.0, ans=0.0 2023-10-04 03:17:43,271 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_41750_210_1960_1_1533520678_3948042_68-223813-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 03:17:44,561 WARNING [train.py:1204] (2/4) Exclude cut with ID _55153_210_2253_1_1534384786677_3162096_24-198429-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 03:17:45,330 WARNING [train.py:1204] (2/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_119-207162-0_sp0.9 from training. Duration: 0.788875 2023-10-04 03:17:46,583 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_12868_210_6753_1_1530702479_3610564_148-172861-0 from training. Duration: 0.5 2023-10-04 03:17:52,743 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_5522_210_3273_1_1527249606_3909096_318-203153-0 from training. Duration: 0.43 2023-10-04 03:17:52,750 WARNING [train.py:1204] (2/4) Exclude cut with ID _14884_210_2252_1_1530707459689_5561250_288-189949-0_sp1.1 from training. Duration: 0.97275 2023-10-04 03:17:54,161 WARNING [train.py:1204] (2/4) Exclude cut with ID _5294_210_1747_1_1527249643928_3696179_529-242298-0 from training. Duration: 0.95 2023-10-04 03:17:55,535 WARNING [train.py:1204] (2/4) Exclude cut with ID _36538_210_9994_1_1533204020363_3795620_503-110289-0_sp1.1 from training. Duration: 0.936375 2023-10-04 03:17:56,957 WARNING [train.py:1204] (2/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_385-193052-0 from training. Duration: 0.86 2023-10-04 03:17:58,417 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_456-22141-0 from training. Duration: 0.88 2023-10-04 03:17:59,803 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_38965_210_4852_1_1533174906_3484735_284-117840-0_sp1.1 from training. Duration: 0.936375 2023-10-04 03:18:14,499 WARNING [train.py:1204] (2/4) Exclude cut with ID _32704_210_8954_1_1533017488840_6747370_75-201625-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 03:18:14,603 WARNING [train.py:1204] (2/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_255-312654-0_sp1.1 from training. Duration: 0.82725 2023-10-04 03:18:14,612 WARNING [train.py:1204] (2/4) Exclude cut with ID _47251_210_18628_1_1533814063128_4137250_214-88076-0_sp0.9 from training. Duration: 0.97775 2023-10-04 03:18:16,087 WARNING [train.py:1204] (2/4) Exclude cut with ID _4092_210_2151_1_1526643044449_3135759_227-213972-0_sp1.1 from training. Duration: 0.4909375 2023-10-04 03:18:16,108 WARNING [train.py:1204] (2/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_912-47473-0_sp1.1 from training. Duration: 0.6181875 2023-10-04 03:18:16,340 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.conv_module2.balancer1.prob, batch_count=1506533.3333333333, ans=0.125 2023-10-04 03:18:17,994 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6767_210_3273_1_1527855783_1718932_173-256601-0_sp1.1 from training. Duration: 0.72725 2023-10-04 03:18:19,209 WARNING [train.py:1204] (2/4) Exclude cut with ID _6274_210_2084_1_1527937583837_7494020_32-205288-0_sp1.1 from training. Duration: 0.7090625 2023-10-04 03:18:19,234 WARNING [train.py:1204] (2/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_39-248870-0 from training. Duration: 0.69 2023-10-04 03:18:20,657 WARNING [train.py:1204] (2/4) Exclude cut with ID _3541_210_2477_1_1526475408709_3263973_377-296839-0_sp1.1 from training. Duration: 0.5 2023-10-04 03:18:22,530 WARNING [train.py:1204] (2/4) Exclude cut with ID _13679_210_7497_1_1530534293662_3355098_413-56154-0_sp1.1 from training. Duration: 0.963625 2023-10-04 03:18:22,598 WARNING [train.py:1204] (2/4) Exclude cut with ID _14970_210_15709_1_1530775547361_1966965_201-84544-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 03:18:22,678 WARNING [train.py:1204] (2/4) Exclude cut with ID _21783_210_11357_1_1531742458730_3762679_105-95775-0_sp1.1 from training. Duration: 0.936375 2023-10-04 03:18:25,319 WARNING [train.py:1204] (2/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_131-191669-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 03:18:25,352 WARNING [train.py:1204] (2/4) Exclude cut with ID _14860_210_4915_1_1530702020340_7343240_695-4323-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 03:18:26,660 WARNING [train.py:1204] (2/4) Exclude cut with ID _39867_210_9826_1_1533553222030_9344259_991-130565-0_sp1.1 from training. Duration: 0.97275 2023-10-04 03:18:28,007 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_50-3289-0_sp1.1 from training. Duration: 0.82725 2023-10-04 03:18:28,138 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_54-347670-0_sp1.1 from training. Duration: 0.92725 2023-10-04 03:18:29,443 WARNING [train.py:1204] (2/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_200-153445-0_sp1.1 from training. Duration: 0.936375 2023-10-04 03:18:30,844 WARNING [train.py:1204] (2/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_279-302911-0_sp1.1 from training. Duration: 0.97275 2023-10-04 03:18:32,433 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=1506600.0, ans=0.1 2023-10-04 03:18:33,566 WARNING [train.py:1204] (2/4) Exclude cut with ID _14886_210_4220_1_1530702020355_3516190_159-73748-0_sp0.9 from training. Duration: 0.92225 2023-10-04 03:18:37,021 WARNING [train.py:1204] (2/4) Exclude cut with ID _53379_210_20034_1_1534222027363_3854654_23-222715-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 03:18:38,587 WARNING [train.py:1204] (2/4) Exclude cut with ID _12555_210_8750_1_1530075594402_3780239_449-267986-0 from training. Duration: 0.83 2023-10-04 03:18:38,598 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7544_210_6753_1_1528587722_3576686_295-258479-0 from training. Duration: 0.84 2023-10-04 03:18:41,187 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.618e+02 1.979e+02 2.245e+02 2.523e+02 4.092e+02, threshold=4.490e+02, percent-clipped=0.0 2023-10-04 03:18:41,297 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7002_210_1794_1_1527935634_4034444_487-236501-0_sp0.9 from training. Duration: 0.7333125 2023-10-04 03:18:42,692 WARNING [train.py:1204] (2/4) Exclude cut with ID _21783_210_11357_1_1531742458730_3762679_244-19107-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 03:18:42,706 WARNING [train.py:1204] (2/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_390-339137-0 from training. Duration: 0.56 2023-10-04 03:18:42,767 WARNING [train.py:1204] (2/4) Exclude cut with ID _7562_210_2084_1_1528887767916_3946229_186-326422-0_sp1.1 from training. Duration: 0.763625 2023-10-04 03:18:44,099 WARNING [train.py:1204] (2/4) Exclude cut with ID _33640_210_2655_1_1533117605479_4855469_645-145235-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 03:18:44,115 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_25767_210_2161_1_1532307483_3867748_506-171777-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 03:18:44,142 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_5438_210_1794_1_1527336687_2600009_233-58072-0_sp1.1 from training. Duration: 0.7 2023-10-04 03:18:44,142 WARNING [train.py:1204] (2/4) Exclude cut with ID IC0669W0063-330908 from training. Duration: 0.896 2023-10-04 03:18:44,191 WARNING [train.py:1204] (2/4) Exclude cut with ID IC0671W0070-331913 from training. Duration: 0.896 2023-10-04 03:18:44,194 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_258-310570-0_sp1.1 from training. Duration: 0.7454375 2023-10-04 03:18:46,156 WARNING [train.py:1204] (2/4) Exclude cut with ID _9461_210_4557_1_1529060514726_3677451_162-177507-0_sp1.1 from training. Duration: 0.97275 2023-10-04 03:18:52,032 WARNING [train.py:1204] (2/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_26-175553-0_sp0.9 from training. Duration: 0.67775 2023-10-04 03:18:52,056 WARNING [train.py:1204] (2/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_7-150481-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 03:18:52,099 WARNING [train.py:1204] (2/4) Exclude cut with ID _23557_210_8750_1_1531890106370_6846255_72-273262-0_sp1.1 from training. Duration: 0.963625 2023-10-04 03:18:53,437 INFO [train.py:1046] (2/4) Epoch 43, batch 2900, loss[loss=0.1723, simple_loss=0.2462, pruned_loss=0.04922, over 22801.00 frames. ], tot_loss[loss=0.1547, simple_loss=0.2343, pruned_loss=0.03754, over 4695573.03 frames. ], batch size: 322, lr: 2.36e-03, grad_scale: 16.0 2023-10-04 03:18:53,562 WARNING [train.py:1204] (2/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_367-61330-0 from training. Duration: 0.75 2023-10-04 03:18:56,357 WARNING [train.py:1204] (2/4) Exclude cut with ID _39867_210_9826_1_1533553222030_9344259_1337-11141-0_sp1.1 from training. Duration: 0.936375 2023-10-04 03:18:56,477 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.1.balancer2.prob, batch_count=1506733.3333333333, ans=0.125 2023-10-04 03:18:57,789 WARNING [train.py:1204] (2/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_101-293367-0 from training. Duration: 0.63 2023-10-04 03:18:57,867 WARNING [train.py:1204] (2/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_10-100367-0 from training. Duration: 0.98 2023-10-04 03:18:59,194 WARNING [train.py:1204] (2/4) Exclude cut with ID _60057_210_10640_1_1534903065793_3333150_171-181133-0_sp1.1 from training. Duration: 0.8 2023-10-04 03:18:59,197 WARNING [train.py:1204] (2/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_844-96989-0_sp0.9 from training. Duration: 0.788875 2023-10-04 03:19:00,601 WARNING [train.py:1204] (2/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_301-144595-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 03:19:02,009 WARNING [train.py:1204] (2/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_115-147343-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 03:19:04,905 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_3171_210_2087_1_1526109084_2330007_37-64334-0_sp1.1 from training. Duration: 0.7454375 2023-10-04 03:19:06,621 WARNING [train.py:1204] (2/4) Exclude cut with ID _19514_210_9259_1_1531530071330_5084230_769-272914-0_sp1.1 from training. Duration: 0.936375 2023-10-04 03:19:10,045 WARNING [train.py:1204] (2/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_76-311963-0_sp0.9 from training. Duration: 0.77775 2023-10-04 03:19:10,082 WARNING [train.py:1204] (2/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_526-62079-0 from training. Duration: 0.67 2023-10-04 03:19:11,387 WARNING [train.py:1204] (2/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_7-111954-0_sp0.9 from training. Duration: 0.62225 2023-10-04 03:19:12,722 WARNING [train.py:1204] (2/4) Exclude cut with ID _57997_210_4163_1_1534766425889_3961049_360-279140-0_sp1.1 from training. Duration: 0.97275 2023-10-04 03:19:14,158 WARNING [train.py:1204] (2/4) Exclude cut with ID _4866_210_2577_1_1527404412284_4364659_133-1696-0 from training. Duration: 0.99 2023-10-04 03:19:15,526 WARNING [train.py:1204] (2/4) Exclude cut with ID _5190_210_4557_1_1527244368799_373439_28-197030-0 from training. Duration: 0.72 2023-10-04 03:19:18,809 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_53-39003-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 03:19:18,812 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6673_210_3273_1_1527767790_3173240_351-167954-0 from training. Duration: 0.48 2023-10-04 03:19:18,836 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_2822_210_2715_1_1526112586_1999587_165-36656-0_sp1.1 from training. Duration: 0.8090625 2023-10-04 03:19:22,805 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_328-264892-0_sp1.1 from training. Duration: 0.92725 2023-10-04 03:19:22,809 WARNING [train.py:1204] (2/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_29-293896-0_sp0.9 from training. Duration: 0.67775 2023-10-04 03:19:26,298 WARNING [train.py:1204] (2/4) Exclude cut with ID _65263_210_22356_1_1535763598691_3921037_465-55448-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 03:19:27,637 WARNING [train.py:1204] (2/4) Exclude cut with ID _21989_210_5854_1_1531899214931_3271360_311-53106-0_sp1.1 from training. Duration: 0.97275 2023-10-04 03:19:30,498 WARNING [train.py:1204] (2/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_389-277915-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 03:19:33,211 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_69630_210_7234_1_1535791094_677837_63-277139-0_sp1.1 from training. Duration: 0.963625 2023-10-04 03:19:34,559 WARNING [train.py:1204] (2/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_85-138257-0 from training. Duration: 0.71 2023-10-04 03:19:34,582 WARNING [train.py:1204] (2/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_686-247600-0 from training. Duration: 0.92 2023-10-04 03:19:34,587 WARNING [train.py:1204] (2/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_108-255430-0_sp1.1 from training. Duration: 0.8818125 2023-10-04 03:19:38,234 INFO [scaling.py:1118] (2/4) WithLoss: name=encoder.encoders.5.encoder.layers.1.self_attn_weights, loss-sum=0.000e+00 2023-10-04 03:19:38,823 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.0.layers.1.feed_forward1.out_whiten, num_groups=1, num_channels=192, metric=4.30 vs. limit=15.0 2023-10-04 03:19:39,836 WARNING [train.py:1204] (2/4) Exclude cut with ID _14884_210_2252_1_1530707459689_5561250_114-201399-0_sp1.1 from training. Duration: 0.6909375 2023-10-04 03:19:42,557 WARNING [train.py:1204] (2/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_506-42614-0 from training. Duration: 0.79 2023-10-04 03:19:43,919 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_43802_210_2290_1_1533516203_8829504_85-195240-0_sp1.1 from training. Duration: 0.7909375 2023-10-04 03:19:48,080 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_141-36243-0_sp1.1 from training. Duration: 0.97275 2023-10-04 03:19:57,155 WARNING [train.py:1204] (2/4) Exclude cut with ID _7852_210_5219_1_1528286471316_8139400_358-287492-0_sp1.1 from training. Duration: 0.87275 2023-10-04 03:19:57,181 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_805-71497-0_sp0.9 from training. Duration: 0.788875 2023-10-04 03:19:59,891 WARNING [train.py:1204] (2/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_178-39722-0 from training. Duration: 0.49 2023-10-04 03:20:00,376 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.5.encoder.layers.0.self_attn2.whiten, num_groups=1, num_channels=256, metric=10.37 vs. limit=22.5 2023-10-04 03:20:02,703 WARNING [train.py:1204] (2/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_428-181925-0_sp1.1 from training. Duration: 0.963625 2023-10-04 03:20:02,707 WARNING [train.py:1204] (2/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_770-200430-0 from training. Duration: 0.89 2023-10-04 03:20:04,016 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_1104-271009-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 03:20:04,053 WARNING [train.py:1204] (2/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_128-296581-0_sp0.9 from training. Duration: 0.82225 2023-10-04 03:20:04,381 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.balancer_na.min_abs, batch_count=1507000.0, ans=0.02 2023-10-04 03:20:06,828 INFO [train.py:1046] (2/4) Epoch 43, batch 2950, loss[loss=0.1532, simple_loss=0.2365, pruned_loss=0.03491, over 23590.00 frames. ], tot_loss[loss=0.155, simple_loss=0.2353, pruned_loss=0.03734, over 4720785.19 frames. ], batch size: 149, lr: 2.36e-03, grad_scale: 16.0 2023-10-04 03:20:08,480 WARNING [train.py:1204] (2/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_19-62511-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 03:20:10,391 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_805-71497-0 from training. Duration: 0.71 2023-10-04 03:20:11,706 WARNING [train.py:1204] (2/4) Exclude cut with ID _44778_210_2054_1_1533798127203_7443090_573-152077-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 03:20:11,711 WARNING [train.py:1204] (2/4) Exclude cut with ID _42875_210_14856_1_1533704397487_4012433_393-177816-0_sp1.1 from training. Duration: 0.963625 2023-10-04 03:20:13,725 WARNING [train.py:1204] (2/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_278-35159-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 03:20:15,053 WARNING [train.py:1204] (2/4) Exclude cut with ID _15037_210_2253_1_1530752294104_3267650_13-130361-0_sp1.1 from training. Duration: 0.9 2023-10-04 03:20:16,388 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_3019_210_2028_1_1526044245_3264865_127-135615-0 from training. Duration: 0.94 2023-10-04 03:20:17,721 WARNING [train.py:1204] (2/4) Exclude cut with ID _28317_210_12742_1_1532480302381_8510325_607-213951-0 from training. Duration: 0.84 2023-10-04 03:20:19,148 WARNING [train.py:1204] (2/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_436-318113-0_sp0.9 from training. Duration: 0.8333125 2023-10-04 03:20:19,149 WARNING [train.py:1204] (2/4) Exclude cut with ID _13938_210_9988_1_1530518718978_3693960_148-152225-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 03:20:23,832 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_2541_210_1794_1_1525590269_3084765_156-173713-0_sp1.1 from training. Duration: 0.736375 2023-10-04 03:20:25,254 WARNING [train.py:1204] (2/4) Exclude cut with ID _28323_210_16187_1_1532480395667_4100300_531-24157-0_sp1.1 from training. Duration: 0.8909375 2023-10-04 03:20:28,350 WARNING [train.py:1204] (2/4) Exclude cut with ID _29303_210_2575_1_1532392237303_7410649_382-217492-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 03:20:28,391 WARNING [train.py:1204] (2/4) Exclude cut with ID _58895_210_8750_1_1535184081320_3649089_270-158013-0_sp1.1 from training. Duration: 0.92725 2023-10-04 03:20:31,155 WARNING [train.py:1204] (2/4) Exclude cut with ID _29277_210_3279_1_1532581109293_3173069_311-206227-0_sp1.1 from training. Duration: 0.936375 2023-10-04 03:20:31,172 WARNING [train.py:1204] (2/4) Exclude cut with ID _7160_210_1876_1_1527940923231_3703799_282-322611-0_sp0.9 from training. Duration: 0.9666875 2023-10-04 03:20:32,573 WARNING [train.py:1204] (2/4) Exclude cut with ID _53219_210_4525_1_1534834836008_4064825_562-32295-0_sp1.1 from training. Duration: 0.963625 2023-10-04 03:20:33,905 WARNING [train.py:1204] (2/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_408-87330-0_sp1.1 from training. Duration: 0.963625 2023-10-04 03:20:33,912 WARNING [train.py:1204] (2/4) Exclude cut with ID _59202_210_26984_1_1534816810612_3549174_137-318710-0_sp1.1 from training. Duration: 0.8090625 2023-10-04 03:20:36,816 WARNING [train.py:1204] (2/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_372-30302-0 from training. Duration: 0.86 2023-10-04 03:20:42,656 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7543_210_6753_1_1528541907_1399322_178-56269-0_sp1.1 from training. Duration: 0.95275 2023-10-04 03:20:42,680 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9109W0012-549758 from training. Duration: 0.512 2023-10-04 03:20:44,128 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.0.balancer2.prob, batch_count=1507200.0, ans=0.125 2023-10-04 03:20:45,379 WARNING [train.py:1204] (2/4) Exclude cut with ID _5410_210_1388_1_1527397055795_1719020_39-112221-0_sp0.9 from training. Duration: 0.7666875 2023-10-04 03:20:45,493 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9121W0012-554933 from training. Duration: 0.896 2023-10-04 03:20:47,261 WARNING [train.py:1204] (2/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_476-272757-0 from training. Duration: 0.93 2023-10-04 03:20:47,283 WARNING [train.py:1204] (2/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_1085-171718-0_sp1.1 from training. Duration: 0.92725 2023-10-04 03:20:47,333 WARNING [train.py:1204] (2/4) Exclude cut with ID _8480_210_1874_1_1528589391205_6728330_309-139896-0_sp1.1 from training. Duration: 0.736375 2023-10-04 03:20:47,333 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9130W0113-559703 from training. Duration: 0.896 2023-10-04 03:20:47,337 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_292-265928-0_sp1.1 from training. Duration: 0.463625 2023-10-04 03:20:48,894 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.ff3_skip_rate, batch_count=1507200.0, ans=0.0 2023-10-04 03:20:50,079 WARNING [train.py:1204] (2/4) Exclude cut with ID _8772_210_1850_1_1528635595972_3664011_96-12756-0 from training. Duration: 0.98 2023-10-04 03:20:51,423 WARNING [train.py:1204] (2/4) Exclude cut with ID _12556_210_8341_1_1530081024100_7327690_725-193477-0_sp1.1 from training. Duration: 0.8909375 2023-10-04 03:20:51,478 WARNING [train.py:1204] (2/4) Exclude cut with ID _5289_210_2084_1_1527678202736_3779299_142-103317-0_sp1.1 from training. Duration: 0.763625 2023-10-04 03:20:54,266 WARNING [train.py:1204] (2/4) Exclude cut with ID _45012_210_12651_1_1533608435136_5371380_206-51075-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 03:20:55,591 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.2.encoder.layers.1.feed_forward3.out_whiten, num_groups=1, num_channels=384, metric=8.09 vs. limit=15.0 2023-10-04 03:20:56,169 WARNING [train.py:1204] (2/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_176-56120-0_sp1.1 from training. Duration: 0.7545625 2023-10-04 03:20:56,193 WARNING [train.py:1204] (2/4) Exclude cut with ID _40284_210_2084_1_1533513679814_3704279_220-201864-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 03:20:56,213 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9165W0004-576638 from training. Duration: 0.896 2023-10-04 03:20:57,524 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_5438_210_1794_1_1527336687_2600009_218-343336-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 03:20:57,552 WARNING [train.py:1204] (2/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_318-162017-0 from training. Duration: 0.91 2023-10-04 03:21:02,355 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_49544_210_1794_1_1534379770_5262083_483-349298-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 03:21:03,679 WARNING [train.py:1204] (2/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_197-28164-0_sp0.9 from training. Duration: 0.888875 2023-10-04 03:21:03,753 WARNING [train.py:1204] (2/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_643-52007-0 from training. Duration: 0.57 2023-10-04 03:21:05,056 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_38965_210_4852_1_1533174906_3484735_403-332963-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 03:21:06,641 WARNING [train.py:1204] (2/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_340-174176-0 from training. Duration: 0.49 2023-10-04 03:21:09,082 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.602e+02 1.953e+02 2.204e+02 2.575e+02 5.460e+02, threshold=4.408e+02, percent-clipped=1.0 2023-10-04 03:21:09,252 WARNING [train.py:1204] (2/4) Exclude cut with ID _56708_210_13177_1_1535252351282_3422899_363-917-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 03:21:12,431 WARNING [train.py:1204] (2/4) Exclude cut with ID _19896_210_1883_1_1531709022662_6649569_525-159063-0_sp1.1 from training. Duration: 0.936375 2023-10-04 03:21:12,459 WARNING [train.py:1204] (2/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_430-116139-0_sp0.9 from training. Duration: 0.9444375 2023-10-04 03:21:13,937 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_53753_210_7234_1_1534320003_3594148_86-329250-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 03:21:13,956 WARNING [train.py:1204] (2/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_406-275649-0_sp1.1 from training. Duration: 0.3818125 2023-10-04 03:21:15,505 WARNING [train.py:1204] (2/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_1083-147536-0_sp1.1 from training. Duration: 0.8818125 2023-10-04 03:21:16,099 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.3.encoder.layers.3.feed_forward3.out_whiten, num_groups=1, num_channels=512, metric=13.47 vs. limit=15.0 2023-10-04 03:21:16,781 WARNING [train.py:1204] (2/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_54-311741-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 03:21:16,792 WARNING [train.py:1204] (2/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_237-58433-0_sp0.9 from training. Duration: 0.82225 2023-10-04 03:21:16,828 WARNING [train.py:1204] (2/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_98-51676-0_sp1.1 from training. Duration: 0.663625 2023-10-04 03:21:16,879 WARNING [train.py:1204] (2/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_386-270472-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 03:21:18,789 WARNING [train.py:1204] (2/4) Exclude cut with ID _19023_210_4353_1_1531378892552_5820780_83-254794-0_sp1.1 from training. Duration: 0.7818125 2023-10-04 03:21:18,893 WARNING [train.py:1204] (2/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_314-93656-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 03:21:20,202 WARNING [train.py:1204] (2/4) Exclude cut with ID _14170_210_5219_1_1530525949868_4469650_336-213530-0 from training. Duration: 0.9 2023-10-04 03:21:20,290 WARNING [train.py:1204] (2/4) Exclude cut with ID _36686_210_5665_1_1533778225913_3646410_65-221120-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 03:21:21,634 INFO [train.py:1046] (2/4) Epoch 43, batch 3000, loss[loss=0.1307, simple_loss=0.2091, pruned_loss=0.02616, over 24454.00 frames. ], tot_loss[loss=0.1557, simple_loss=0.2359, pruned_loss=0.03769, over 4703507.34 frames. ], batch size: 58, lr: 2.36e-03, grad_scale: 16.0 2023-10-04 03:21:21,634 INFO [train.py:1069] (2/4) Computing validation loss 2023-10-04 03:21:33,110 INFO [train.py:1078] (2/4) Epoch 43, validation: loss=0.3299, simple_loss=0.2679, pruned_loss=0.196, over 1125622.00 frames. 2023-10-04 03:21:33,110 INFO [train.py:1079] (2/4) Maximum memory allocated so far is 20967MB 2023-10-04 03:21:34,633 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_26433_210_12108_1_1532157158_4056480_271-68178-0_sp1.1 from training. Duration: 0.92725 2023-10-04 03:21:34,676 WARNING [train.py:1204] (2/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_46-207599-0_sp0.9 from training. Duration: 0.711125 2023-10-04 03:21:39,073 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9303W0114-637133 from training. Duration: 0.896 2023-10-04 03:21:39,116 WARNING [train.py:1204] (2/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_490-207861-0 from training. Duration: 0.98 2023-10-04 03:21:39,303 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.conv_skip_rate, batch_count=1507400.0, ans=0.0 2023-10-04 03:21:40,618 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6767_210_3273_1_1527855783_1718932_173-256601-0_sp0.9 from training. Duration: 0.888875 2023-10-04 03:21:41,998 WARNING [train.py:1204] (2/4) Exclude cut with ID _13921_210_9994_1_1530841782272_3503520_327-69677-0_sp1.1 from training. Duration: 0.8181875 2023-10-04 03:21:42,054 WARNING [train.py:1204] (2/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_508-335369-0 from training. Duration: 0.48 2023-10-04 03:21:42,078 WARNING [train.py:1204] (2/4) Exclude cut with ID _64631_210_27767_1_1535349580068_4165160_24-46567-0_sp1.1 from training. Duration: 0.963625 2023-10-04 03:21:45,715 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.bypass.skip_rate, batch_count=1507400.0, ans=0.09899494936611666 2023-10-04 03:21:49,622 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_89-35748-0_sp1.1 from training. Duration: 0.7181875 2023-10-04 03:21:58,929 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_26433_210_12108_1_1532157158_4056480_578-262107-0_sp1.1 from training. Duration: 0.9 2023-10-04 03:22:03,318 WARNING [train.py:1204] (2/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_129-337802-0 from training. Duration: 0.72 2023-10-04 03:22:03,412 WARNING [train.py:1204] (2/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_356-167816-0_sp0.9 from training. Duration: 0.8 2023-10-04 03:22:06,166 WARNING [train.py:1204] (2/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_137-116442-0_sp1.1 from training. Duration: 0.7090625 2023-10-04 03:22:08,079 WARNING [train.py:1204] (2/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_358-172940-0_sp1.1 from training. Duration: 0.963625 2023-10-04 03:22:08,100 WARNING [train.py:1204] (2/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_668-344147-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 03:22:08,402 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.conv_module1.balancer1.min_positive, batch_count=1507533.3333333333, ans=0.025 2023-10-04 03:22:09,606 WARNING [train.py:1204] (2/4) Exclude cut with ID _62337_210_12392_1_1535009273105_3412229_255-157667-0_sp1.1 from training. Duration: 0.936375 2023-10-04 03:22:09,609 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_486-151131-0 from training. Duration: 0.85 2023-10-04 03:22:13,471 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_53638_210_21581_1_1534297319_5808449_885-80172-0 from training. Duration: 0.96 2023-10-04 03:22:14,854 WARNING [train.py:1204] (2/4) Exclude cut with ID _50093_210_4220_1_1534417141207_3432299_105-183067-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 03:22:14,910 WARNING [train.py:1204] (2/4) Exclude cut with ID _7562_210_2084_1_1528887767916_3946229_345-169156-0_sp0.9 from training. Duration: 0.6444375 2023-10-04 03:22:19,533 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_4528_210_3273_1_1527076654_3276777_209-219804-0_sp0.9 from training. Duration: 0.7666875 2023-10-04 03:22:19,560 WARNING [train.py:1204] (2/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_35-153886-0_sp0.9 from training. Duration: 0.9333125 2023-10-04 03:22:20,875 WARNING [train.py:1204] (2/4) Exclude cut with ID _30800_210_22382_1_1532499347904_4515959_332-121925-0_sp1.1 from training. Duration: 0.92725 2023-10-04 03:22:20,877 WARNING [train.py:1204] (2/4) Exclude cut with ID _59202_210_26984_1_1534816810612_3549174_495-65515-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 03:22:23,518 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.3.encoder.layers.2.self_attn1.whiten, num_groups=1, num_channels=512, metric=11.49 vs. limit=22.5 2023-10-04 03:22:24,231 WARNING [train.py:1204] (2/4) Exclude cut with ID _6274_210_2084_1_1527937583837_7494020_769-168302-0_sp1.1 from training. Duration: 0.6454375 2023-10-04 03:22:24,261 WARNING [train.py:1204] (2/4) Exclude cut with ID _24271_210_12658_1_1531987270022_7190519_708-71345-0_sp1.1 from training. Duration: 0.936375 2023-10-04 03:22:24,262 WARNING [train.py:1204] (2/4) Exclude cut with ID 3033-130750-0096-55598-0_sp0.9 from training. Duration: 0.92225 2023-10-04 03:22:26,896 WARNING [train.py:1204] (2/4) Exclude cut with ID _14087_210_2253_1_1530529224512_3014029_219-224445-0_sp0.9 from training. Duration: 0.9333125 2023-10-04 03:22:30,107 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_174-277364-0 from training. Duration: 0.71 2023-10-04 03:22:30,183 WARNING [train.py:1204] (2/4) Exclude cut with ID _45021_210_9994_1_1533619751094_3330288_96-239010-0_sp1.1 from training. Duration: 0.82725 2023-10-04 03:22:30,211 WARNING [train.py:1204] (2/4) Exclude cut with ID _31602_210_5854_1_1532677021456_4458250_489-305116-0_sp1.1 from training. Duration: 0.97275 2023-10-04 03:22:31,463 WARNING [train.py:1204] (2/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_269-154726-0_sp0.9 from training. Duration: 0.9555625 2023-10-04 03:22:34,360 WARNING [train.py:1204] (2/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_126-54418-0_sp1.1 from training. Duration: 0.92725 2023-10-04 03:22:35,475 WARNING [train.py:1204] (2/4) Exclude cut with ID _42326_210_7770_1_1533814282531_3707269_182-224159-0_sp1.1 from training. Duration: 0.92725 2023-10-04 03:22:35,578 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7002_210_1794_1_1527939683_5157286_1-281264-0_sp0.9 from training. Duration: 0.488875 2023-10-04 03:22:35,605 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_86-16881-0 from training. Duration: 0.58 2023-10-04 03:22:35,628 WARNING [train.py:1204] (2/4) Exclude cut with ID _19514_210_9259_1_1531530071330_5084230_637-279094-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 03:22:36,998 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_26433_210_12108_1_1532157158_4056480_578-262107-0 from training. Duration: 0.99 2023-10-04 03:22:38,323 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6291_210_3273_1_1527683649_1078498_82-221196-0_sp1.1 from training. Duration: 0.6909375 2023-10-04 03:22:39,753 WARNING [train.py:1204] (2/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_402-102311-0 from training. Duration: 0.67 2023-10-04 03:22:43,166 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1015-343884-0_sp1.1 from training. Duration: 0.563625 2023-10-04 03:22:44,578 WARNING [train.py:1204] (2/4) Exclude cut with ID _13684_210_7497_1_1530619354863_3598359_312-235606-0_sp1.1 from training. Duration: 0.5454375 2023-10-04 03:22:45,826 WARNING [train.py:1204] (2/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_55-31904-0 from training. Duration: 0.88 2023-10-04 03:22:45,915 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_3133_210_2087_1_1526106446_2324690_25-332138-0 from training. Duration: 0.91 2023-10-04 03:22:45,917 WARNING [train.py:1204] (2/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_912-282412-0_sp0.9 from training. Duration: 0.5444375 2023-10-04 03:22:47,135 INFO [train.py:1046] (2/4) Epoch 43, batch 3050, loss[loss=0.1823, simple_loss=0.2496, pruned_loss=0.05749, over 23936.00 frames. ], tot_loss[loss=0.1567, simple_loss=0.2369, pruned_loss=0.03824, over 4702354.13 frames. ], batch size: 196, lr: 2.36e-03, grad_scale: 16.0 2023-10-04 03:22:47,216 WARNING [train.py:1204] (2/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_458-299435-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 03:22:47,318 WARNING [train.py:1204] (2/4) Exclude cut with ID _69584_210_5219_1_1535785133775_4044450_275-88403-0_sp1.1 from training. Duration: 0.92725 2023-10-04 03:22:47,327 WARNING [train.py:1204] (2/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_841-341679-0_sp1.1 from training. Duration: 0.52725 2023-10-04 03:22:48,631 WARNING [train.py:1204] (2/4) Exclude cut with ID _41391_210_7994_1_1533603771872_3638201_1-54521-0_sp1.1 from training. Duration: 0.97275 2023-10-04 03:22:48,689 WARNING [train.py:1204] (2/4) Exclude cut with ID _56235_210_10640_1_1534557335235_4161099_495-186609-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 03:22:51,927 WARNING [train.py:1204] (2/4) Exclude cut with ID _4681_210_3525_1_1527250689464_120889_15-126760-0 from training. Duration: 0.81 2023-10-04 03:22:53,827 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_3019_210_2028_1_1526044245_3264865_127-135615-0_sp1.1 from training. Duration: 0.8545625 2023-10-04 03:22:55,245 WARNING [train.py:1204] (2/4) Exclude cut with ID _30191_210_1664_1_1533189915047_7196670_915-21561-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 03:22:55,277 WARNING [train.py:1204] (2/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_6-214815-0_sp0.9 from training. Duration: 0.9444375 2023-10-04 03:22:58,052 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_45399_210_4852_1_1533624725_7579737_190-137494-0_sp1.1 from training. Duration: 0.97275 2023-10-04 03:22:58,239 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=1507733.3333333333, ans=0.1 2023-10-04 03:23:01,391 WARNING [train.py:1204] (2/4) Exclude cut with ID _41314_210_12651_1_1533459476543_3766460_207-132038-0 from training. Duration: 0.94 2023-10-04 03:23:05,496 WARNING [train.py:1204] (2/4) Exclude cut with ID _14603_210_4220_1_1530615688441_3097319_90-31383-0 from training. Duration: 0.87 2023-10-04 03:23:06,829 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_15517_210_6753_1_1530952293_1664795_119-31001-0 from training. Duration: 0.94 2023-10-04 03:23:06,854 WARNING [train.py:1204] (2/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_684-85342-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 03:23:09,671 WARNING [train.py:1204] (2/4) Exclude cut with ID _14603_210_4220_1_1530615688441_3097319_97-325053-0_sp1.1 from training. Duration: 0.836375 2023-10-04 03:23:11,157 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_68702_210_23689_1_1535799577_3420068_168-25150-0_sp1.1 from training. Duration: 0.97275 2023-10-04 03:23:11,164 WARNING [train.py:1204] (2/4) Exclude cut with ID _65134_210_14763_1_1535335393985_3505368_111-27984-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 03:23:12,964 WARNING [train.py:1204] (2/4) Exclude cut with ID _48678_210_5663_1_1533988830947_3769199_276-226230-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 03:23:13,184 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.bypass_mid.scale_min, batch_count=1507800.0, ans=0.2 2023-10-04 03:23:15,248 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.0.layers.1.self_attn1.whiten, num_groups=1, num_channels=192, metric=10.23 vs. limit=22.5 2023-10-04 03:23:15,689 WARNING [train.py:1204] (2/4) Exclude cut with ID _28427_210_4220_1_1532581240967_3061069_43-84928-0_sp1.1 from training. Duration: 0.9 2023-10-04 03:23:17,079 WARNING [train.py:1204] (2/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_158-250116-0_sp1.1 from training. Duration: 0.563625 2023-10-04 03:23:17,103 WARNING [train.py:1204] (2/4) Exclude cut with ID _66873_210_5517_1_1535454136506_4293370_279-332556-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 03:23:17,146 WARNING [train.py:1204] (2/4) Exclude cut with ID _42863_210_9070_1_1533902398788_3696920_488-230146-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 03:23:17,146 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_387-247159-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 03:23:20,256 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6729_210_2550_1_1528032481_1788725_261-42619-0_sp1.1 from training. Duration: 0.97275 2023-10-04 03:23:22,878 WARNING [train.py:1204] (2/4) Exclude cut with ID _19896_210_1883_1_1531709022662_6649569_831-44724-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 03:23:25,609 WARNING [train.py:1204] (2/4) Exclude cut with ID _55153_210_2253_1_1534384786677_3162096_280-285944-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 03:23:25,641 WARNING [train.py:1204] (2/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_448-4048-0 from training. Duration: 0.96 2023-10-04 03:23:27,011 WARNING [train.py:1204] (2/4) Exclude cut with ID _29678_210_7893_1_1532421042787_3906118_456-202548-0_sp1.1 from training. Duration: 0.97275 2023-10-04 03:23:27,024 WARNING [train.py:1204] (2/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_107-332398-0_sp1.1 from training. Duration: 0.7090625 2023-10-04 03:23:28,517 WARNING [train.py:1204] (2/4) Exclude cut with ID _13921_210_9994_1_1530838702800_3003429_46-149444-0_sp1.1 from training. Duration: 0.8545625 2023-10-04 03:23:29,792 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_257-20733-0_sp0.9 from training. Duration: 0.8444375 2023-10-04 03:23:31,477 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_53753_210_7234_1_1534320003_3594148_385-234809-0_sp1.1 from training. Duration: 0.963625 2023-10-04 03:23:31,537 WARNING [train.py:1204] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_292-215346-0_sp1.1 from training. Duration: 0.8818125 2023-10-04 03:23:31,772 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.conv_skip_rate, batch_count=1507933.3333333333, ans=0.0 2023-10-04 03:23:36,924 WARNING [train.py:1204] (2/4) Exclude cut with ID _68965_210_1883_1_1535698962272_3497490_196-124268-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 03:23:38,296 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_23801_210_16223_1_1532066392_3294657_570-5439-0_sp1.1 from training. Duration: 0.8818125 2023-10-04 03:23:43,773 WARNING [train.py:1204] (2/4) Exclude cut with ID _5166_210_2084_1_1527073197160_4087993_349-6928-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 03:23:45,779 WARNING [train.py:1204] (2/4) Exclude cut with ID _60621_210_17071_1_1535013053530_3671739_226-255202-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 03:23:45,782 WARNING [train.py:1204] (2/4) Exclude cut with ID _41817_210_18835_1_1533434477160_7423049_448-130302-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 03:23:47,227 WARNING [train.py:1204] (2/4) Exclude cut with ID _6653_210_2629_1_1527919172441_6732679_758-143677-0_sp1.1 from training. Duration: 0.8909375 2023-10-04 03:23:47,266 WARNING [train.py:1204] (2/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_912-47473-0_sp0.9 from training. Duration: 0.7555625 2023-10-04 03:23:47,294 WARNING [train.py:1204] (2/4) Exclude cut with ID _6694_210_4599_1_1527852637490_3965529_356-169233-0_sp1.1 from training. Duration: 0.9 2023-10-04 03:23:48,518 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.568e+02 1.989e+02 2.180e+02 2.446e+02 3.954e+02, threshold=4.360e+02, percent-clipped=0.0 2023-10-04 03:23:48,659 WARNING [train.py:1204] (2/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_1-99680-0 from training. Duration: 0.67 2023-10-04 03:23:50,083 WARNING [train.py:1204] (2/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_199-86432-0_sp1.1 from training. Duration: 0.8909375 2023-10-04 03:23:50,106 WARNING [train.py:1204] (2/4) Exclude cut with ID _61148_210_6897_1_1535094022726_4217610_362-182430-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 03:23:50,325 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.feed_forward1.hidden_balancer.prob, batch_count=1508000.0, ans=0.125 2023-10-04 03:23:50,776 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.3.encoder.layers.0.self_attn1.whiten, num_groups=1, num_channels=512, metric=13.97 vs. limit=22.5 2023-10-04 03:23:53,736 WARNING [train.py:1204] (2/4) Exclude cut with ID _49376_210_5986_1_1533985278732_3370740_301-154763-0 from training. Duration: 0.98 2023-10-04 03:23:55,086 WARNING [train.py:1204] (2/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_572-185150-0_sp1.1 from training. Duration: 0.8818125 2023-10-04 03:23:59,431 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_47896_210_20508_1_1534492518_5958868_558-314572-0_sp1.1 from training. Duration: 0.8818125 2023-10-04 03:23:59,692 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.self_attn_weights.pos_emb_skip_rate, batch_count=1508066.6666666667, ans=0.0 2023-10-04 03:24:00,710 INFO [train.py:1046] (2/4) Epoch 43, batch 3100, loss[loss=0.1531, simple_loss=0.2201, pruned_loss=0.04302, over 19583.00 frames. ], tot_loss[loss=0.1567, simple_loss=0.237, pruned_loss=0.03821, over 4710715.07 frames. ], batch size: 388, lr: 2.36e-03, grad_scale: 16.0 2023-10-04 03:24:00,845 WARNING [train.py:1204] (2/4) Exclude cut with ID _45560_210_21982_1_1533812603951_3584999_111-6376-0_sp0.9 from training. Duration: 0.9333125 2023-10-04 03:24:03,876 WARNING [train.py:1204] (2/4) Exclude cut with ID _14170_210_5219_1_1530525949868_4469650_412-317077-0_sp1.1 from training. Duration: 0.6818125 2023-10-04 03:24:04,010 WARNING [train.py:1204] (2/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_718-68505-0 from training. Duration: 0.89 2023-10-04 03:24:06,730 WARNING [train.py:1204] (2/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_77-311404-0 from training. Duration: 0.94 2023-10-04 03:24:06,814 WARNING [train.py:1204] (2/4) Exclude cut with ID _60229_210_27767_1_1534838406158_3655350_264-279573-0 from training. Duration: 0.83 2023-10-04 03:24:08,337 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.feed_forward1.out_proj.dropout_p, batch_count=1508066.6666666667, ans=0.1 2023-10-04 03:24:08,340 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=1508066.6666666667, ans=0.1 2023-10-04 03:24:09,472 WARNING [train.py:1204] (2/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_106-300985-0_sp1.1 from training. Duration: 0.8181875 2023-10-04 03:24:13,690 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_4183_210_2087_1_1526729100_1869838_79-277189-0_sp1.1 from training. Duration: 0.963625 2023-10-04 03:24:13,699 WARNING [train.py:1204] (2/4) Exclude cut with ID _28427_210_4220_1_1532581240967_3061069_195-295268-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 03:24:15,199 WARNING [train.py:1204] (2/4) Exclude cut with ID _4092_210_2151_1_1526643044449_3135759_356-278694-0_sp0.9 from training. Duration: 0.52225 2023-10-04 03:24:19,964 WARNING [train.py:1204] (2/4) Exclude cut with ID _14459_210_2228_1_1531443641946_7516700_777-71446-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 03:24:25,120 WARNING [train.py:1204] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_520-126729-0 from training. Duration: 0.7 2023-10-04 03:24:27,976 WARNING [train.py:1204] (2/4) Exclude cut with ID _13684_210_7497_1_1530619354863_3598359_312-235606-0_sp0.9 from training. Duration: 0.6666875 2023-10-04 03:24:29,336 WARNING [train.py:1204] (2/4) Exclude cut with ID _37537_210_2253_1_1533171758568_3421949_283-349402-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 03:24:29,389 WARNING [train.py:1204] (2/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_29-117932-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 03:24:29,408 WARNING [train.py:1204] (2/4) Exclude cut with ID _29303_210_2575_1_1532392237303_7410649_721-283351-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 03:24:30,713 WARNING [train.py:1204] (2/4) Exclude cut with ID _14886_210_4220_1_1530702020355_3516190_175-229073-0_sp0.9 from training. Duration: 0.5 2023-10-04 03:24:32,072 WARNING [train.py:1204] (2/4) Exclude cut with ID _15458_210_8341_1_1530858614263_3834410_70-268830-0_sp1.1 from training. Duration: 0.97275 2023-10-04 03:24:32,097 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_2295_210_2087_1_1525500993_2407863_234-83104-0 from training. Duration: 0.62 2023-10-04 03:24:32,098 WARNING [train.py:1204] (2/4) Exclude cut with ID _22961_210_8254_1_1531976366672_4278180_317-199546-0_sp1.1 from training. Duration: 0.7818125 2023-10-04 03:24:34,122 WARNING [train.py:1204] (2/4) Exclude cut with ID _27894_210_12392_1_1532415496078_6902439_547-196022-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 03:24:35,545 WARNING [train.py:1204] (2/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_46-207599-0 from training. Duration: 0.64 2023-10-04 03:24:38,291 WARNING [train.py:1204] (2/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_426-326246-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 03:24:42,391 WARNING [train.py:1204] (2/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_445-117398-0_sp0.9 from training. Duration: 0.988875 2023-10-04 03:24:42,657 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.bypass_mid.scale_min, batch_count=1508200.0, ans=0.2 2023-10-04 03:24:43,712 WARNING [train.py:1204] (2/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_563-347832-0 from training. Duration: 0.89 2023-10-04 03:24:43,791 WARNING [train.py:1204] (2/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_252-216674-0 from training. Duration: 0.54 2023-10-04 03:24:45,174 WARNING [train.py:1204] (2/4) Exclude cut with ID _59611_210_8895_1_1534741198240_3650361_187-212779-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 03:24:45,228 WARNING [train.py:1204] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_282-159367-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 03:24:48,158 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_68702_210_23689_1_1535799577_3420068_33-136883-0_sp1.1 from training. Duration: 0.936375 2023-10-04 03:24:48,169 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_47228_210_4983_1_1533780021_3672027_184-293706-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 03:24:48,197 WARNING [train.py:1204] (2/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_656-188361-0_sp0.9 from training. Duration: 0.9666875 2023-10-04 03:24:49,510 WARNING [train.py:1204] (2/4) Exclude cut with ID _4866_210_2577_1_1527404412284_4364659_352-157930-0_sp0.9 from training. Duration: 0.9 2023-10-04 03:24:49,511 WARNING [train.py:1204] (2/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_210-106047-0_sp1.1 from training. Duration: 0.8818125 2023-10-04 03:24:49,675 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.1.ff3_skip_rate, batch_count=1508266.6666666667, ans=0.0 2023-10-04 03:24:52,919 WARNING [train.py:1204] (2/4) Exclude cut with ID _9389_210_1664_1_1529217337301_5289269_289-213417-0_sp1.1 from training. Duration: 0.8090625 2023-10-04 03:24:52,960 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_3910_210_1794_1_1526777667_3747260_178-41186-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 03:24:52,964 WARNING [train.py:1204] (2/4) Exclude cut with ID _27052_210_5986_1_1532329197187_3585009_24-172263-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 03:24:52,967 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_5522_210_3273_1_1527249606_3909096_318-203153-0_sp1.1 from training. Duration: 0.3909375 2023-10-04 03:24:55,087 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.bypass.scale_min, batch_count=1508266.6666666667, ans=0.2 2023-10-04 03:24:57,534 WARNING [train.py:1204] (2/4) Exclude cut with ID _27894_210_12392_1_1532415496078_6902439_222-51982-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 03:25:00,352 WARNING [train.py:1204] (2/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_87-131427-0 from training. Duration: 0.78 2023-10-04 03:25:01,650 WARNING [train.py:1204] (2/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_372-298093-0_sp0.9 from training. Duration: 0.888875 2023-10-04 03:25:01,703 WARNING [train.py:1204] (2/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_663-166973-0 from training. Duration: 0.8 2023-10-04 03:25:03,186 WARNING [train.py:1204] (2/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_429-177469-0_sp1.1 from training. Duration: 0.936375 2023-10-04 03:25:03,210 WARNING [train.py:1204] (2/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_724-172651-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 03:25:03,268 WARNING [train.py:1204] (2/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_103-336064-0 from training. Duration: 0.51 2023-10-04 03:25:03,977 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.2.encoder.layers.1.self_attn1.whiten, num_groups=1, num_channels=384, metric=14.51 vs. limit=22.5 2023-10-04 03:25:13,522 WARNING [train.py:1204] (2/4) Exclude cut with ID _48677_210_5663_1_1533902405919_3892989_111-11439-0 from training. Duration: 0.96 2023-10-04 03:25:14,894 INFO [train.py:1046] (2/4) Epoch 43, batch 3150, loss[loss=0.1422, simple_loss=0.1977, pruned_loss=0.04331, over 19059.00 frames. ], tot_loss[loss=0.156, simple_loss=0.2359, pruned_loss=0.03798, over 4704889.18 frames. ], batch size: 388, lr: 2.36e-03, grad_scale: 16.0 2023-10-04 03:25:14,972 WARNING [train.py:1204] (2/4) Exclude cut with ID _8085_210_4557_1_1528455633493_3716100_95-103398-0_sp1.1 from training. Duration: 0.963625 2023-10-04 03:25:16,296 WARNING [train.py:1204] (2/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_1041-110837-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 03:25:17,679 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_50050_210_16223_1_1535435796_7290138_1155-121757-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 03:25:17,682 WARNING [train.py:1204] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_602-275260-0_sp0.9 from training. Duration: 0.911125 2023-10-04 03:25:17,759 WARNING [train.py:1204] (2/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_265-164486-0 from training. Duration: 0.96 2023-10-04 03:25:19,781 WARNING [train.py:1204] (2/4) Exclude cut with ID _46067_210_4220_1_1534125571066_3307450_142-229375-0_sp1.1 from training. Duration: 0.963625 2023-10-04 03:25:21,054 WARNING [train.py:1204] (2/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_310-26186-0_sp1.1 from training. Duration: 0.636375 2023-10-04 03:25:22,978 WARNING [train.py:1204] (2/4) Exclude cut with ID _28461_210_2253_1_1532771775189_3041299_136-270644-0 from training. Duration: 0.93 2023-10-04 03:25:26,132 WARNING [train.py:1204] (2/4) Exclude cut with ID _14860_210_4915_1_1530702020340_7343240_438-286372-0_sp1.1 from training. Duration: 0.936375 2023-10-04 03:25:26,456 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.conv_skip_rate, batch_count=1508400.0, ans=0.0 2023-10-04 03:25:27,570 WARNING [train.py:1204] (2/4) Exclude cut with ID IC0128W0004-63309 from training. Duration: 0.831 2023-10-04 03:25:30,331 WARNING [train.py:1204] (2/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_135-174080-0 from training. Duration: 0.53 2023-10-04 03:25:30,385 WARNING [train.py:1204] (2/4) Exclude cut with ID _41326_210_5854_1_1533778076509_3492080_277-59511-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 03:25:31,699 WARNING [train.py:1204] (2/4) Exclude cut with ID IC0144W0112-71379 from training. Duration: 0.992 2023-10-04 03:25:33,090 WARNING [train.py:1204] (2/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_711-15688-0_sp0.9 from training. Duration: 0.47775 2023-10-04 03:25:35,736 WARNING [train.py:1204] (2/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_237-332027-0 from training. Duration: 0.95 2023-10-04 03:25:37,129 WARNING [train.py:1204] (2/4) Exclude cut with ID _7970_210_1883_1_1528455616296_3472230_25-178407-0 from training. Duration: 0.71 2023-10-04 03:25:37,131 WARNING [train.py:1204] (2/4) Exclude cut with ID _8480_210_1874_1_1528589391205_6728330_309-139896-0 from training. Duration: 0.81 2023-10-04 03:25:37,154 WARNING [train.py:1204] (2/4) Exclude cut with ID _39571_210_13449_1_1533801584143_3651077_319-268877-0_sp1.1 from training. Duration: 0.936375 2023-10-04 03:25:37,163 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_43802_210_2290_1_1533516203_8829504_155-246880-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 03:25:37,249 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_566-122599-0_sp1.1 from training. Duration: 0.936375 2023-10-04 03:25:39,231 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_12868_210_6753_1_1530701932_530275_18-76322-0 from training. Duration: 0.87 2023-10-04 03:25:40,632 WARNING [train.py:1204] (2/4) Exclude cut with ID _56494_210_4220_1_1535284837625_3033169_159-106035-0_sp1.1 from training. Duration: 0.963625 2023-10-04 03:25:40,675 WARNING [train.py:1204] (2/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_98-276013-0_sp1.1 from training. Duration: 0.963625 2023-10-04 03:25:42,111 WARNING [train.py:1204] (2/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_330-216124-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 03:25:43,515 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_177-58005-0_sp0.9 from training. Duration: 0.688875 2023-10-04 03:25:43,789 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.balancer1.prob, batch_count=1508533.3333333333, ans=0.125 2023-10-04 03:25:47,594 WARNING [train.py:1204] (2/4) Exclude cut with ID _56235_210_10640_1_1534557335235_4161099_343-139898-0 from training. Duration: 0.96 2023-10-04 03:25:47,649 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6961_210_2087_1_1527937168_6815011_395-185060-0_sp1.1 from training. Duration: 0.863625 2023-10-04 03:25:49,475 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7002_210_1794_1_1527939683_5157286_184-229630-0_sp0.9 from training. Duration: 0.82225 2023-10-04 03:25:50,810 WARNING [train.py:1204] (2/4) Exclude cut with ID _59170_210_6721_1_1534983633960_4203335_153-18895-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 03:25:50,866 WARNING [train.py:1204] (2/4) Exclude cut with ID _49643_210_7989_1_1534640244224_3939031_598-161926-0 from training. Duration: 0.9 2023-10-04 03:25:53,604 WARNING [train.py:1204] (2/4) Exclude cut with ID _4068_210_2629_1_1526697350395_1546259_224-288693-0 from training. Duration: 0.59 2023-10-04 03:25:53,662 WARNING [train.py:1204] (2/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_718-68505-0_sp1.1 from training. Duration: 0.8090625 2023-10-04 03:25:53,709 WARNING [train.py:1204] (2/4) Exclude cut with ID _4068_210_2629_1_1526697350395_1546259_224-288693-0_sp0.9 from training. Duration: 0.6555625 2023-10-04 03:25:55,525 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_2296_210_2087_1_1525503448_3397335_57-185430-0_sp0.9 from training. Duration: 0.6666875 2023-10-04 03:25:55,581 WARNING [train.py:1204] (2/4) Exclude cut with ID _15458_210_8341_1_1530858614263_3834410_49-235472-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 03:25:55,583 WARNING [train.py:1204] (2/4) Exclude cut with ID _64135_210_2253_1_1535677267876_7608972_498-5039-0_sp0.9 from training. Duration: 0.8666875 2023-10-04 03:25:57,662 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=1508533.3333333333, ans=0.1 2023-10-04 03:25:58,812 WARNING [train.py:1204] (2/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_283-220372-0_sp0.9 from training. Duration: 0.62225 2023-10-04 03:25:58,821 WARNING [train.py:1204] (2/4) Exclude cut with ID _6473_210_1741_1_1527901307396_3815380_108-17335-0_sp0.9 from training. Duration: 0.72225 2023-10-04 03:25:58,915 WARNING [train.py:1204] (2/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_113-92608-0 from training. Duration: 0.54 2023-10-04 03:26:00,252 WARNING [train.py:1204] (2/4) Exclude cut with ID _14170_210_5219_1_1530525949868_4469650_272-249985-0_sp1.1 from training. Duration: 0.6090625 2023-10-04 03:26:00,264 WARNING [train.py:1204] (2/4) Exclude cut with ID _50376_210_13401_1_1534031502101_4217740_518-187390-0_sp1.1 from training. Duration: 0.97275 2023-10-04 03:26:00,389 WARNING [train.py:1204] (2/4) Exclude cut with ID _59738_210_15379_1_1534849512562_3941470_165-248260-0_sp1.1 from training. Duration: 0.92725 2023-10-04 03:26:00,406 WARNING [train.py:1204] (2/4) Exclude cut with ID _23818_210_6228_1_1531917063884_3676920_400-343925-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 03:26:01,749 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6961_210_2087_1_1527937168_6815011_562-165518-0 from training. Duration: 0.77 2023-10-04 03:26:03,129 WARNING [train.py:1204] (2/4) Exclude cut with ID _24271_210_12658_1_1531987270022_7190519_289-207931-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 03:26:03,409 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.conv_module1.balancer1.prob, batch_count=1508600.0, ans=0.125 2023-10-04 03:26:04,964 WARNING [train.py:1204] (2/4) Exclude cut with ID _14632_210_9988_1_1530691279392_3923200_332-105904-0 from training. Duration: 0.9 2023-10-04 03:26:04,985 WARNING [train.py:1204] (2/4) Exclude cut with ID _40402_210_18219_1_1533522324115_2953150_203-232438-0_sp1.1 from training. Duration: 0.97275 2023-10-04 03:26:05,085 WARNING [train.py:1204] (2/4) Exclude cut with ID _13252_210_1925_1_1530671372047_3336909_263-168388-0 from training. Duration: 0.86 2023-10-04 03:26:06,394 WARNING [train.py:1204] (2/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_174-205058-0 from training. Duration: 0.95 2023-10-04 03:26:09,106 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_94-195801-0_sp1.1 from training. Duration: 0.8545625 2023-10-04 03:26:09,130 WARNING [train.py:1204] (2/4) Exclude cut with ID _55153_210_2253_1_1534384786677_3162096_269-103204-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 03:26:09,207 WARNING [train.py:1204] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_748-36150-0 from training. Duration: 0.91 2023-10-04 03:26:10,538 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_12868_210_6753_1_1530702479_3610564_148-172861-0_sp1.1 from training. Duration: 0.4545625 2023-10-04 03:26:11,845 WARNING [train.py:1204] (2/4) Exclude cut with ID _67499_210_17282_1_1535509805220_3692430_460-77730-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 03:26:15,888 WARNING [train.py:1204] (2/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_374-20753-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 03:26:17,322 WARNING [train.py:1204] (2/4) Exclude cut with ID _45329_210_7005_1_1534157951211_6774339_374-289983-0_sp1.1 from training. Duration: 0.97275 2023-10-04 03:26:17,358 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_34886_210_10633_1_1533203882_1755661_76-156096-0_sp0.9 from training. Duration: 0.9666875 2023-10-04 03:26:19,128 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.631e+02 2.065e+02 2.333e+02 2.653e+02 4.086e+02, threshold=4.666e+02, percent-clipped=0.0 2023-10-04 03:26:20,785 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_238-256668-0_sp0.9 from training. Duration: 0.7666875 2023-10-04 03:26:22,102 WARNING [train.py:1204] (2/4) Exclude cut with ID _46683_210_9642_1_1533730913492_4591116_489-146727-0_sp1.1 from training. Duration: 0.97275 2023-10-04 03:26:23,566 WARNING [train.py:1204] (2/4) Exclude cut with ID _13938_210_9988_1_1530518718978_3693960_304-112784-0_sp0.9 from training. Duration: 0.511125 2023-10-04 03:26:29,801 INFO [train.py:1046] (2/4) Epoch 43, batch 3200, loss[loss=0.1495, simple_loss=0.2389, pruned_loss=0.03, over 24489.00 frames. ], tot_loss[loss=0.1547, simple_loss=0.2346, pruned_loss=0.03738, over 4706142.95 frames. ], batch size: 66, lr: 2.36e-03, grad_scale: 16.0 2023-10-04 03:26:29,899 WARNING [train.py:1204] (2/4) Exclude cut with ID _24271_210_12658_1_1531987270022_7190519_243-46549-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 03:26:29,904 WARNING [train.py:1204] (2/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_78-182912-0_sp1.1 from training. Duration: 0.536375 2023-10-04 03:26:35,309 WARNING [train.py:1204] (2/4) Exclude cut with ID _57572_210_5517_1_1534921219624_3809484_121-321334-0_sp1.1 from training. Duration: 0.97275 2023-10-04 03:26:37,097 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_40047_210_2290_1_1533290519_2374311_254-102149-0_sp1.1 from training. Duration: 0.963625 2023-10-04 03:26:37,098 WARNING [train.py:1204] (2/4) Exclude cut with ID _28461_210_2253_1_1532771775189_3041299_96-152158-0 from training. Duration: 0.85 2023-10-04 03:26:39,804 WARNING [train.py:1204] (2/4) Exclude cut with ID _29877_210_6228_1_1532397791106_5276149_161-102979-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 03:26:41,475 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_3787_210_2040_1_1526477544_5478586_603-120324-0_sp0.9 from training. Duration: 0.9 2023-10-04 03:26:42,107 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.3.encoder.layers.1.self_attn2.whiten, num_groups=1, num_channels=512, metric=13.60 vs. limit=22.5 2023-10-04 03:26:45,628 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_41750_210_1960_1_1533520678_3948042_247-213456-0_sp1.1 from training. Duration: 0.97275 2023-10-04 03:26:54,279 WARNING [train.py:1204] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_127-119109-0_sp0.9 from training. Duration: 0.8 2023-10-04 03:27:02,109 WARNING [train.py:1204] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_215-202973-0 from training. Duration: 0.88 2023-10-04 03:27:03,493 WARNING [train.py:1204] (2/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_17-320491-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 03:27:06,722 WARNING [train.py:1204] (2/4) Exclude cut with ID _5166_210_2084_1_1527073197160_4087993_310-114452-0 from training. Duration: 0.59 2023-10-04 03:27:08,060 WARNING [train.py:1204] (2/4) Exclude cut with ID _15133_210_6228_1_1530794512074_3080379_29-264458-0_sp0.9 from training. Duration: 0.6444375 2023-10-04 03:27:10,916 WARNING [train.py:1204] (2/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_159-281310-0_sp1.1 from training. Duration: 0.863625 2023-10-04 03:27:10,931 WARNING [train.py:1204] (2/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_195-167011-0_sp0.9 from training. Duration: 0.8333125 2023-10-04 03:27:12,257 WARNING [train.py:1204] (2/4) Exclude cut with ID _14087_210_2253_1_1530529224512_3014029_156-257607-0_sp1.1 from training. Duration: 0.7545625 2023-10-04 03:27:12,771 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.5.encoder.layers.1.conv_module2.whiten, num_groups=1, num_channels=256, metric=5.33 vs. limit=15.0 2023-10-04 03:27:15,226 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.conv_skip_rate, batch_count=1508933.3333333333, ans=0.0 2023-10-04 03:27:16,324 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_2295_210_2087_1_1525500993_2407863_116-238894-0 from training. Duration: 0.7 2023-10-04 03:27:17,031 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.3.encoder.layers.1.conv_module2.whiten, num_groups=1, num_channels=512, metric=2.93 vs. limit=15.0 2023-10-04 03:27:17,752 WARNING [train.py:1204] (2/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_321-335475-0_sp0.9 from training. Duration: 0.5 2023-10-04 03:27:19,139 WARNING [train.py:1204] (2/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_188-114464-0 from training. Duration: 0.82 2023-10-04 03:27:22,247 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_2592_210_2290_1_1525697025_4882925_157-70512-0 from training. Duration: 0.91 2023-10-04 03:27:23,678 WARNING [train.py:1204] (2/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_138-64210-0_sp1.1 from training. Duration: 0.663625 2023-10-04 03:27:29,765 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_11305_210_1794_1_1529750428_7178148_651-95702-0_sp1.1 from training. Duration: 0.963625 2023-10-04 03:27:29,778 WARNING [train.py:1204] (2/4) Exclude cut with ID _14224_210_4381_1_1530529062037_4264010_379-184223-0_sp0.9 from training. Duration: 0.9444375 2023-10-04 03:27:31,160 WARNING [train.py:1204] (2/4) Exclude cut with ID _8394_210_6702_1_1528590504316_3540189_381-125325-0_sp1.1 from training. Duration: 0.963625 2023-10-04 03:27:31,215 WARNING [train.py:1204] (2/4) Exclude cut with ID IC0622W0021-307659 from training. Duration: 0.896 2023-10-04 03:27:31,217 WARNING [train.py:1204] (2/4) Exclude cut with ID _7624_210_1531_1_1528109802198_4933101_418-349834-0_sp1.1 from training. Duration: 0.5454375 2023-10-04 03:27:31,439 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.nonlin_attention.balancer.prob, batch_count=1509000.0, ans=0.125 2023-10-04 03:27:34,490 WARNING [train.py:1204] (2/4) Exclude cut with ID _49728_210_12651_1_1534041034294_6788150_607-3416-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 03:27:35,948 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8275_210_4198_1_1528455320_3892571_491-17707-0 from training. Duration: 0.64 2023-10-04 03:27:36,111 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.conv_module1.balancer2.prob, batch_count=1509000.0, ans=0.125 2023-10-04 03:27:37,206 WARNING [train.py:1204] (2/4) Exclude cut with ID _5287_210_2084_1_1527418883040_3878670_333-48508-0 from training. Duration: 0.6 2023-10-04 03:27:37,284 WARNING [train.py:1204] (2/4) Exclude cut with ID _8365_210_2064_1_1528592311070_4677690_371-122295-0 from training. Duration: 0.55 2023-10-04 03:27:38,713 WARNING [train.py:1204] (2/4) Exclude cut with ID _14860_210_4915_1_1530702020340_7343240_836-328489-0 from training. Duration: 0.9 2023-10-04 03:27:40,157 WARNING [train.py:1204] (2/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_1033-274376-0_sp0.9 from training. Duration: 0.9666875 2023-10-04 03:27:42,748 INFO [train.py:1046] (2/4) Epoch 43, batch 3250, loss[loss=0.1548, simple_loss=0.2446, pruned_loss=0.03252, over 24454.00 frames. ], tot_loss[loss=0.155, simple_loss=0.2349, pruned_loss=0.03752, over 4721760.07 frames. ], batch size: 69, lr: 2.36e-03, grad_scale: 16.0 2023-10-04 03:27:44,155 WARNING [train.py:1204] (2/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_475-127455-0_sp0.9 from training. Duration: 0.77775 2023-10-04 03:27:44,161 WARNING [train.py:1204] (2/4) Exclude cut with ID IC0671W0071-331914 from training. Duration: 0.896 2023-10-04 03:27:44,181 WARNING [train.py:1204] (2/4) Exclude cut with ID _30327_210_16049_1_1532484161295_3924169_2-1523-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 03:27:44,183 WARNING [train.py:1204] (2/4) Exclude cut with ID _24314_210_4915_1_1532135078116_7170179_460-208279-0_sp1.1 from training. Duration: 0.97275 2023-10-04 03:27:45,647 WARNING [train.py:1204] (2/4) Exclude cut with ID IC0679W0052-335859 from training. Duration: 0.64 2023-10-04 03:27:48,512 WARNING [train.py:1204] (2/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_3-237434-0_sp1.1 from training. Duration: 0.6909375 2023-10-04 03:27:50,650 WARNING [train.py:1204] (2/4) Exclude cut with ID _69884_210_9771_1_1535846392818_4166169_405-185283-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 03:27:52,258 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.conv_module1.balancer1.prob, batch_count=1509066.6666666667, ans=0.125 2023-10-04 03:27:54,757 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.ff3_skip_rate, batch_count=1509066.6666666667, ans=0.0 2023-10-04 03:28:00,947 WARNING [train.py:1204] (2/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_927-83684-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 03:28:00,967 WARNING [train.py:1204] (2/4) Exclude cut with ID _6694_210_4599_1_1527852637490_3965529_356-169233-0 from training. Duration: 0.99 2023-10-04 03:28:01,258 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.ff3_skip_rate, batch_count=1509133.3333333333, ans=0.0 2023-10-04 03:28:02,271 WARNING [train.py:1204] (2/4) Exclude cut with ID _19896_210_1883_1_1531709022662_6649569_562-233001-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 03:28:02,326 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_20120_210_6753_1_1531643035_5482859_354-78629-0_sp1.1 from training. Duration: 0.963625 2023-10-04 03:28:02,327 WARNING [train.py:1204] (2/4) Exclude cut with ID _60135_210_13427_1_1534935557922_3651319_96-168198-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 03:28:04,314 WARNING [train.py:1204] (2/4) Exclude cut with ID _31602_210_5854_1_1532677021456_4458250_249-96604-0_sp1.1 from training. Duration: 0.8090625 2023-10-04 03:28:05,587 WARNING [train.py:1204] (2/4) Exclude cut with ID _14100_210_8341_1_1530529320613_3678069_258-224599-0_sp0.9 from training. Duration: 0.8555625 2023-10-04 03:28:07,049 WARNING [train.py:1204] (2/4) Exclude cut with ID _19708_210_9206_1_1532136650733_4238364_215-160135-0_sp1.1 from training. Duration: 0.97275 2023-10-04 03:28:07,080 WARNING [train.py:1204] (2/4) Exclude cut with ID _21878_210_3525_1_1531738772002_7264330_113-18163-0_sp0.9 from training. Duration: 0.988875 2023-10-04 03:28:08,321 WARNING [train.py:1204] (2/4) Exclude cut with ID _64135_210_2253_1_1535677267876_7608972_552-337340-0_sp1.1 from training. Duration: 0.92725 2023-10-04 03:28:08,344 WARNING [train.py:1204] (2/4) Exclude cut with ID _67725_210_27767_1_1535608756671_5105359_10-12887-0_sp1.1 from training. Duration: 0.97275 2023-10-04 03:28:08,359 WARNING [train.py:1204] (2/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_401-284840-0_sp1.1 from training. Duration: 0.97275 2023-10-04 03:28:09,675 WARNING [train.py:1204] (2/4) Exclude cut with ID _44659_210_2721_1_1534036120061_6614860_483-141285-0_sp1.1 from training. Duration: 0.936375 2023-10-04 03:28:11,196 WARNING [train.py:1204] (2/4) Exclude cut with ID _9461_210_4557_1_1529060514726_3677451_358-332734-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 03:28:12,496 WARNING [train.py:1204] (2/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_788-298907-0_sp1.1 from training. Duration: 0.8090625 2023-10-04 03:28:14,060 WARNING [train.py:1204] (2/4) Exclude cut with ID _39411_210_5986_1_1533538825030_3343649_354-267196-0_sp1.1 from training. Duration: 0.92725 2023-10-04 03:28:14,088 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_14953_210_2151_1_1530701710_5616165_192-309792-0_sp1.1 from training. Duration: 0.97275 2023-10-04 03:28:14,249 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.conv_module1.balancer1.prob, batch_count=1509200.0, ans=0.125 2023-10-04 03:28:15,457 WARNING [train.py:1204] (2/4) Exclude cut with ID _59170_210_6721_1_1534983633960_4203335_290-151255-0_sp1.1 from training. Duration: 0.92725 2023-10-04 03:28:15,480 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_5575_210_1410_1_1527301305_2570170_249-93769-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 03:28:16,792 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_46013_210_7234_1_1533776362_3744032_463-52378-0_sp1.1 from training. Duration: 0.7818125 2023-10-04 03:28:21,372 WARNING [train.py:1204] (2/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_580-93798-0 from training. Duration: 0.56 2023-10-04 03:28:21,416 WARNING [train.py:1204] (2/4) Exclude cut with ID _19514_210_9259_1_1531530071330_5084230_385-13762-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 03:28:21,420 WARNING [train.py:1204] (2/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_458-67321-0_sp1.1 from training. Duration: 0.8818125 2023-10-04 03:28:22,893 WARNING [train.py:1204] (2/4) Exclude cut with ID _41298_210_9019_1_1533430371450_4822143_188-154791-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 03:28:22,997 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.0.feed_forward2.hidden_balancer.prob, batch_count=1509200.0, ans=0.125 2023-10-04 03:28:24,184 WARNING [train.py:1204] (2/4) Exclude cut with ID _15037_210_2253_1_1530752294104_3267650_296-192597-0_sp0.9 from training. Duration: 0.9 2023-10-04 03:28:31,496 WARNING [train.py:1204] (2/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_661-264857-0_sp1.1 from training. Duration: 0.8454375 2023-10-04 03:28:33,532 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.out_combiner.scale_min, batch_count=1509266.6666666667, ans=0.2 2023-10-04 03:28:37,952 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_34008_210_10431_1_1533345424_8183247_662-317331-0_sp1.1 from training. Duration: 0.8545625 2023-10-04 03:28:37,992 WARNING [train.py:1204] (2/4) Exclude cut with ID _46502_210_13794_1_1533724452198_3657159_241-278329-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 03:28:37,993 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_11349_210_6753_1_1529800861_1101207_26-65602-0 from training. Duration: 0.96 2023-10-04 03:28:37,997 WARNING [train.py:1204] (2/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_448-4048-0_sp1.1 from training. Duration: 0.87275 2023-10-04 03:28:38,010 WARNING [train.py:1204] (2/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_662-275621-0_sp1.1 from training. Duration: 0.3818125 2023-10-04 03:28:39,224 WARNING [train.py:1204] (2/4) Exclude cut with ID _13938_210_9988_1_1530518718978_3693960_256-54454-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 03:28:39,462 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.conv_module1.balancer2.prob, batch_count=1509266.6666666667, ans=0.125 2023-10-04 03:28:40,803 WARNING [train.py:1204] (2/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_256-32662-0 from training. Duration: 0.9 2023-10-04 03:28:40,845 WARNING [train.py:1204] (2/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_326-17061-0 from training. Duration: 0.94 2023-10-04 03:28:42,470 WARNING [train.py:1204] (2/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_505-97987-0_sp1.1 from training. Duration: 0.7818125 2023-10-04 03:28:43,690 WARNING [train.py:1204] (2/4) Exclude cut with ID _66873_210_5517_1_1535454136506_4293370_316-294364-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 03:28:43,756 WARNING [train.py:1204] (2/4) Exclude cut with ID _19514_210_9259_1_1531530071330_5084230_762-231063-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 03:28:45,151 WARNING [train.py:1204] (2/4) Exclude cut with ID _4697_210_2069_1_1526815801953_3823098_68-219807-0_sp0.9 from training. Duration: 0.52225 2023-10-04 03:28:45,195 WARNING [train.py:1204] (2/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_479-158053-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 03:28:46,414 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.658e+02 1.986e+02 2.142e+02 2.380e+02 3.154e+02, threshold=4.284e+02, percent-clipped=0.0 2023-10-04 03:28:47,113 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.4.encoder.layers.2.nonlin_attention.whiten1, num_groups=1, num_channels=288, metric=4.22 vs. limit=10.0 2023-10-04 03:28:49,247 WARNING [train.py:1204] (2/4) Exclude cut with ID _66112_210_10635_1_1535590236305_3801484_83-38181-0_sp1.1 from training. Duration: 0.936375 2023-10-04 03:28:49,277 WARNING [train.py:1204] (2/4) Exclude cut with ID _55857_210_2346_1_1534644007831_6898209_255-146346-0_sp1.1 from training. Duration: 0.963625 2023-10-04 03:28:50,699 WARNING [train.py:1204] (2/4) Exclude cut with ID _48836_210_12210_1_1534075848293_3904150_429-35475-0 from training. Duration: 0.89 2023-10-04 03:28:50,713 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_50154_210_13731_1_1534750862_1080961_119-187163-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 03:28:51,015 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.conv_skip_rate, batch_count=1509333.3333333333, ans=0.0 2023-10-04 03:28:51,049 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.nonlin_attention.balancer.prob, batch_count=1509333.3333333333, ans=0.125 2023-10-04 03:28:52,837 WARNING [train.py:1204] (2/4) Exclude cut with ID _8793_210_6310_1_1528542985139_2310009_121-179782-0_sp0.9 from training. Duration: 0.888875 2023-10-04 03:28:52,840 WARNING [train.py:1204] (2/4) Exclude cut with ID _14884_210_2252_1_1530707459689_5561250_591-173406-0 from training. Duration: 0.96 2023-10-04 03:28:55,759 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.conv_module1.balancer2.prob, batch_count=1509400.0, ans=0.125 2023-10-04 03:28:56,935 INFO [train.py:1046] (2/4) Epoch 43, batch 3300, loss[loss=0.1585, simple_loss=0.2341, pruned_loss=0.0414, over 23898.00 frames. ], tot_loss[loss=0.1553, simple_loss=0.2355, pruned_loss=0.03754, over 4729484.88 frames. ], batch size: 195, lr: 2.36e-03, grad_scale: 16.0 2023-10-04 03:28:57,000 WARNING [train.py:1204] (2/4) Exclude cut with ID _15133_210_6228_1_1530794512074_3080379_54-198193-0_sp1.1 from training. Duration: 0.8545625 2023-10-04 03:28:57,010 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6545_210_2040_1_1527774670_4245258_407-48070-0 from training. Duration: 0.76 2023-10-04 03:28:59,004 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1032-236863-0 from training. Duration: 0.84 2023-10-04 03:29:00,348 WARNING [train.py:1204] (2/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_507-303370-0 from training. Duration: 0.96 2023-10-04 03:29:00,369 WARNING [train.py:1204] (2/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_164-282366-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 03:29:04,915 WARNING [train.py:1204] (2/4) Exclude cut with ID _39867_210_9826_1_1533553222030_9344259_312-158727-0_sp1.1 from training. Duration: 0.963625 2023-10-04 03:29:05,183 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.nonlin_attention.balancer.prob, batch_count=1509400.0, ans=0.125 2023-10-04 03:29:06,290 WARNING [train.py:1204] (2/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_174-205058-0_sp1.1 from training. Duration: 0.863625 2023-10-04 03:29:06,320 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_11305_210_1794_1_1529750428_7178148_166-237358-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 03:29:07,766 WARNING [train.py:1204] (2/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_26-175553-0_sp1.1 from training. Duration: 0.5545625 2023-10-04 03:29:09,139 WARNING [train.py:1204] (2/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_96-290039-0_sp1.1 from training. Duration: 0.5090625 2023-10-04 03:29:10,585 WARNING [train.py:1204] (2/4) Exclude cut with ID _53219_210_4525_1_1534834836008_4064825_261-101691-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 03:29:12,069 WARNING [train.py:1204] (2/4) Exclude cut with ID _24271_210_12658_1_1531987270022_7190519_96-293390-0_sp1.1 from training. Duration: 0.936375 2023-10-04 03:29:14,912 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_271-234962-0 from training. Duration: 0.79 2023-10-04 03:29:16,290 WARNING [train.py:1204] (2/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_136-30660-0_sp1.1 from training. Duration: 0.97275 2023-10-04 03:29:16,304 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_20120_210_6753_1_1531643035_5482859_625-281046-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 03:29:17,654 WARNING [train.py:1204] (2/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_333-168193-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 03:29:17,717 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9029W0269-509664 from training. Duration: 0.896 2023-10-04 03:29:17,889 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.bypass.scale_min, batch_count=1509466.6666666667, ans=0.2 2023-10-04 03:29:20,401 WARNING [train.py:1204] (2/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_450-147393-0_sp1.1 from training. Duration: 0.92725 2023-10-04 03:29:21,659 WARNING [train.py:1204] (2/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_643-52007-0_sp1.1 from training. Duration: 0.5181875 2023-10-04 03:29:21,724 WARNING [train.py:1204] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_330-327861-0_sp0.9 from training. Duration: 0.7444375 2023-10-04 03:29:21,738 WARNING [train.py:1204] (2/4) Exclude cut with ID _47790_210_16187_1_1533862814066_4207519_260-317651-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 03:29:23,467 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9044W0010-516654 from training. Duration: 0.896 2023-10-04 03:29:27,765 WARNING [train.py:1204] (2/4) Exclude cut with ID _14087_210_2253_1_1530529224512_3014029_265-118378-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 03:29:27,778 WARNING [train.py:1204] (2/4) Exclude cut with ID _6194_210_1534_1_1527511826160_3941194_321-148641-0_sp0.9 from training. Duration: 0.711125 2023-10-04 03:29:30,580 WARNING [train.py:1204] (2/4) Exclude cut with ID _54871_210_22356_1_1534665798253_3856359_101-169176-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 03:29:30,583 WARNING [train.py:1204] (2/4) Exclude cut with ID 497-129325-0061-62254-0_sp1.1 from training. Duration: 0.97725 2023-10-04 03:29:32,018 WARNING [train.py:1204] (2/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_795-33252-0_sp1.1 from training. Duration: 0.336375 2023-10-04 03:29:33,742 WARNING [train.py:1204] (2/4) Exclude cut with ID _28317_210_12742_1_1532480302381_8510325_168-246176-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 03:29:33,836 WARNING [train.py:1204] (2/4) Exclude cut with ID _7160_210_1876_1_1527940923231_3703799_120-150943-0_sp1.1 from training. Duration: 0.736375 2023-10-04 03:29:36,468 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9084W0005-536844 from training. Duration: 0.768 2023-10-04 03:29:37,949 WARNING [train.py:1204] (2/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_234-260682-0 from training. Duration: 0.84 2023-10-04 03:29:39,794 WARNING [train.py:1204] (2/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_188-114464-0_sp0.9 from training. Duration: 0.911125 2023-10-04 03:29:43,038 WARNING [train.py:1204] (2/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_81-75849-0 from training. Duration: 0.39 2023-10-04 03:29:44,472 WARNING [train.py:1204] (2/4) Exclude cut with ID _14224_210_4381_1_1530529062037_4264010_379-184223-0_sp1.1 from training. Duration: 0.77275 2023-10-04 03:29:47,172 WARNING [train.py:1204] (2/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_454-123002-0_sp1.1 from training. Duration: 0.62725 2023-10-04 03:29:47,208 WARNING [train.py:1204] (2/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_887-249555-0_sp1.1 from training. Duration: 0.7 2023-10-04 03:29:47,551 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.feed_forward1.out_proj.dropout_p, batch_count=1509600.0, ans=0.1 2023-10-04 03:29:48,723 WARNING [train.py:1204] (2/4) Exclude cut with ID _31088_210_16446_1_1532520012284_3440919_20-260902-0_sp1.1 from training. Duration: 0.97275 2023-10-04 03:29:48,766 WARNING [train.py:1204] (2/4) Exclude cut with ID _45329_210_7005_1_1534157951211_6774339_856-344574-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 03:29:48,769 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_36756_210_7234_1_1533026991_966276_4-164118-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 03:29:48,794 WARNING [train.py:1204] (2/4) Exclude cut with ID _8365_210_2064_1_1528592311070_4677690_371-122295-0_sp1.1 from training. Duration: 0.5 2023-10-04 03:29:52,887 WARNING [train.py:1204] (2/4) Exclude cut with ID _5910_210_1389_1_1527337972454_5050100_555-159815-0_sp1.1 from training. Duration: 0.8909375 2023-10-04 03:29:52,906 WARNING [train.py:1204] (2/4) Exclude cut with ID _37086_210_9038_1_1532999540891_7021250_469-147941-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 03:29:52,978 WARNING [train.py:1204] (2/4) Exclude cut with ID _3067_210_2667_1_1526212974640_4503168_459-173867-0_sp0.9 from training. Duration: 0.97775 2023-10-04 03:29:54,360 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9156W0006-572499 from training. Duration: 0.896 2023-10-04 03:29:56,174 WARNING [train.py:1204] (2/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_277-210798-0 from training. Duration: 0.55 2023-10-04 03:29:58,821 WARNING [train.py:1204] (2/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_103-336064-0_sp1.1 from training. Duration: 0.463625 2023-10-04 03:29:58,890 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_3414_210_1988_1_1526212718_4813195_331-290945-0_sp1.1 from training. Duration: 0.936375 2023-10-04 03:29:58,892 WARNING [train.py:1204] (2/4) Exclude cut with ID _67499_210_17282_1_1535509805220_3692430_365-29276-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 03:30:01,523 WARNING [train.py:1204] (2/4) Exclude cut with ID _14821_210_8750_1_1530680503991_6829359_69-250847-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 03:30:01,524 WARNING [train.py:1204] (2/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_1102-61301-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 03:30:01,620 WARNING [train.py:1204] (2/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_166-246557-0_sp1.1 from training. Duration: 0.5818125 2023-10-04 03:30:02,947 WARNING [train.py:1204] (2/4) Exclude cut with ID _15351_210_10626_1_1530856635653_3576210_214-129918-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 03:30:02,962 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_120-41668-0_sp0.9 from training. Duration: 0.77775 2023-10-04 03:30:04,726 WARNING [train.py:1204] (2/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_27-72278-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 03:30:06,094 WARNING [train.py:1204] (2/4) Exclude cut with ID _24314_210_4915_1_1532135078116_7170179_521-258868-0_sp1.1 from training. Duration: 0.7181875 2023-10-04 03:30:08,782 WARNING [train.py:1204] (2/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_143-86344-0 from training. Duration: 0.77 2023-10-04 03:30:08,816 WARNING [train.py:1204] (2/4) Exclude cut with ID _53489_210_6228_1_1534298357169_3835970_465-157433-0_sp1.1 from training. Duration: 0.92725 2023-10-04 03:30:10,110 INFO [train.py:1046] (2/4) Epoch 43, batch 3350, loss[loss=0.1526, simple_loss=0.2488, pruned_loss=0.02823, over 24329.00 frames. ], tot_loss[loss=0.1556, simple_loss=0.2361, pruned_loss=0.03757, over 4732738.71 frames. ], batch size: 74, lr: 2.36e-03, grad_scale: 16.0 2023-10-04 03:30:10,183 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_248-319353-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 03:30:12,208 WARNING [train.py:1204] (2/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_102-77064-0_sp1.1 from training. Duration: 0.8454375 2023-10-04 03:30:13,486 WARNING [train.py:1204] (2/4) Exclude cut with ID _9389_210_1664_1_1529217337301_5289269_379-128794-0_sp1.1 from training. Duration: 0.77275 2023-10-04 03:30:13,578 WARNING [train.py:1204] (2/4) Exclude cut with ID _3231_210_2106_1_1526205864934_6784220_236-260835-0_sp1.1 from training. Duration: 0.97275 2023-10-04 03:30:16,259 WARNING [train.py:1204] (2/4) Exclude cut with ID _24271_210_12658_1_1531987270022_7190519_184-342363-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 03:30:16,260 WARNING [train.py:1204] (2/4) Exclude cut with ID _27894_210_12392_1_1532415496078_6902439_67-221663-0_sp1.1 from training. Duration: 0.92725 2023-10-04 03:30:17,753 WARNING [train.py:1204] (2/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_304-59227-0_sp1.1 from training. Duration: 0.7 2023-10-04 03:30:19,131 WARNING [train.py:1204] (2/4) Exclude cut with ID _58605_210_12210_1_1534723195939_3904259_458-42232-0_sp1.1 from training. Duration: 0.92725 2023-10-04 03:30:20,473 WARNING [train.py:1204] (2/4) Exclude cut with ID _14922_210_6228_1_1530707435273_3693310_13-165512-0_sp0.9 from training. Duration: 0.911125 2023-10-04 03:30:23,092 WARNING [train.py:1204] (2/4) Exclude cut with ID _58602_210_2346_1_1534903210085_6908830_792-102241-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 03:30:24,455 WARNING [train.py:1204] (2/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_588-125165-0_sp0.9 from training. Duration: 0.92225 2023-10-04 03:30:25,957 WARNING [train.py:1204] (2/4) Exclude cut with ID _54218_210_12210_1_1534413646388_4409259_239-98429-0_sp1.1 from training. Duration: 0.97275 2023-10-04 03:30:26,010 WARNING [train.py:1204] (2/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_538-32291-0_sp1.1 from training. Duration: 0.7545625 2023-10-04 03:30:27,869 WARNING [train.py:1204] (2/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_419-317062-0 from training. Duration: 0.91 2023-10-04 03:30:29,309 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9309W0202-640329 from training. Duration: 0.896 2023-10-04 03:30:30,641 WARNING [train.py:1204] (2/4) Exclude cut with ID _3231_210_2106_1_1526205864934_6784220_419-313291-0_sp1.1 from training. Duration: 0.97275 2023-10-04 03:30:33,332 WARNING [train.py:1204] (2/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_619-344499-0 from training. Duration: 0.63 2023-10-04 03:30:33,345 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_278-272612-0 from training. Duration: 0.73 2023-10-04 03:30:34,778 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_103-84010-0_sp0.9 from training. Duration: 0.7666875 2023-10-04 03:30:34,800 WARNING [train.py:1204] (2/4) Exclude cut with ID _26528_210_9070_1_1532570410842_3851310_497-97218-0_sp1.1 from training. Duration: 0.7818125 2023-10-04 03:30:35,530 WARNING [train.py:1204] (2/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_262-84613-0_sp1.1 from training. Duration: 0.936375 2023-10-04 03:30:36,770 WARNING [train.py:1204] (2/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_387-131538-0 from training. Duration: 0.81 2023-10-04 03:30:36,794 WARNING [train.py:1204] (2/4) Exclude cut with ID _8030_210_7497_1_1528444886781_3531089_224-300489-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 03:30:36,812 WARNING [train.py:1204] (2/4) Exclude cut with ID _14349_210_8341_1_1530615710818_3442470_170-336541-0_sp0.9 from training. Duration: 0.9555625 2023-10-04 03:30:40,062 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_47069_210_15368_1_1533781267_4431320_180-31746-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 03:30:41,386 WARNING [train.py:1204] (2/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_447-7041-0_sp1.1 from training. Duration: 0.92725 2023-10-04 03:30:41,614 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.ff3_skip_rate, batch_count=1509866.6666666667, ans=0.0 2023-10-04 03:30:42,677 WARNING [train.py:1204] (2/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_2-55283-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 03:30:42,734 WARNING [train.py:1204] (2/4) Exclude cut with ID _12555_210_8750_1_1530075594402_3780239_70-181911-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 03:30:47,353 WARNING [train.py:1204] (2/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_311-328022-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 03:30:47,514 WARNING [train.py:1204] (2/4) Exclude cut with ID _14528_210_8254_1_1530669700514_4306939_330-2987-0_sp1.1 from training. Duration: 0.92725 2023-10-04 03:30:48,920 WARNING [train.py:1204] (2/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_385-184858-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 03:30:53,025 WARNING [train.py:1204] (2/4) Exclude cut with ID _64414_210_17301_1_1535243252048_3757759_348-333360-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 03:30:53,094 WARNING [train.py:1204] (2/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_753-214751-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 03:30:54,506 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_17613_210_2151_1_1531137569_2366160_202-208013-0_sp1.1 from training. Duration: 0.92725 2023-10-04 03:30:54,514 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_43831_210_2158_1_1533725502_5872791_610-129300-0_sp1.1 from training. Duration: 0.936375 2023-10-04 03:30:57,897 WARNING [train.py:1204] (2/4) Exclude cut with ID _54218_210_12210_1_1534413646388_4409259_321-23852-0_sp1.1 from training. Duration: 0.936375 2023-10-04 03:31:00,716 WARNING [train.py:1204] (2/4) Exclude cut with ID _39411_210_5986_1_1533538825030_3343649_173-238248-0 from training. Duration: 0.83 2023-10-04 03:31:00,723 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_405-339986-0_sp0.9 from training. Duration: 0.5666875 2023-10-04 03:31:00,759 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_2009_210_2071_1_1525481134_4588937_385-302100-0 from training. Duration: 0.87 2023-10-04 03:31:02,025 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_161-276996-0_sp1.1 from training. Duration: 0.863625 2023-10-04 03:31:03,390 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_177-58005-0 from training. Duration: 0.62 2023-10-04 03:31:04,770 WARNING [train.py:1204] (2/4) Exclude cut with ID _64639_210_5816_1_1535367619437_3528000_146-343734-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 03:31:07,494 WARNING [train.py:1204] (2/4) Exclude cut with ID _3845_210_1390_1_1526731326141_3523610_72-61441-0_sp1.1 from training. Duration: 0.92725 2023-10-04 03:31:12,925 WARNING [train.py:1204] (2/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_991-125028-0_sp1.1 from training. Duration: 0.936375 2023-10-04 03:31:14,109 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.541e+02 2.008e+02 2.223e+02 2.622e+02 3.286e+02, threshold=4.446e+02, percent-clipped=0.0 2023-10-04 03:31:14,237 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7544_210_6753_1_1528591327_1471426_172-348777-0 from training. Duration: 0.66 2023-10-04 03:31:15,595 WARNING [train.py:1204] (2/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_416-257730-0_sp1.1 from training. Duration: 0.5909375 2023-10-04 03:31:16,910 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1113-7369-0_sp1.1 from training. Duration: 0.77275 2023-10-04 03:31:18,902 WARNING [train.py:1204] (2/4) Exclude cut with ID _40402_210_18219_1_1533522324115_2953150_242-76943-0_sp1.1 from training. Duration: 0.8545625 2023-10-04 03:31:20,447 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.0.balancer2.prob, batch_count=1510000.0, ans=0.125 2023-10-04 03:31:23,072 WARNING [train.py:1204] (2/4) Exclude cut with ID _44718_210_5816_1_1534050051869_6469030_190-49648-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 03:31:24,371 INFO [train.py:1046] (2/4) Epoch 43, batch 3400, loss[loss=0.1446, simple_loss=0.2297, pruned_loss=0.02975, over 24656.00 frames. ], tot_loss[loss=0.1557, simple_loss=0.2364, pruned_loss=0.03754, over 4745768.86 frames. ], batch size: 65, lr: 2.36e-03, grad_scale: 16.0 2023-10-04 03:31:25,774 WARNING [train.py:1204] (2/4) Exclude cut with ID _47251_210_18628_1_1533814063128_4137250_214-88076-0 from training. Duration: 0.88 2023-10-04 03:31:25,804 WARNING [train.py:1204] (2/4) Exclude cut with ID _5585_210_2667_1_1527418813324_8007340_89-332072-0_sp1.1 from training. Duration: 0.5818125 2023-10-04 03:31:25,830 WARNING [train.py:1204] (2/4) Exclude cut with ID _41817_210_18835_1_1533434477160_7423049_555-293448-0_sp1.1 from training. Duration: 0.72725 2023-10-04 03:31:27,321 WARNING [train.py:1204] (2/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_286-331314-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 03:31:27,372 WARNING [train.py:1204] (2/4) Exclude cut with ID _13921_210_9994_1_1530838702800_3003429_46-149444-0 from training. Duration: 0.94 2023-10-04 03:31:29,228 WARNING [train.py:1204] (2/4) Exclude cut with ID _44778_210_2054_1_1533798127203_7443090_768-43595-0_sp1.1 from training. Duration: 0.936375 2023-10-04 03:31:29,236 WARNING [train.py:1204] (2/4) Exclude cut with ID _35355_210_16040_1_1533212988977_4016229_74-190076-0 from training. Duration: 0.8 2023-10-04 03:31:30,492 WARNING [train.py:1204] (2/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_1007-316380-0_sp1.1 from training. Duration: 0.963625 2023-10-04 03:31:30,622 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=1510066.6666666667, ans=0.1 2023-10-04 03:31:31,787 WARNING [train.py:1204] (2/4) Exclude cut with ID _23557_210_8750_1_1531890106370_6846255_150-283437-0_sp1.1 from training. Duration: 0.963625 2023-10-04 03:31:31,850 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_3474_210_2028_1_1526188513_4825246_145-309131-0_sp1.1 from training. Duration: 0.67275 2023-10-04 03:31:31,940 WARNING [train.py:1204] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_391-22148-0_sp1.1 from training. Duration: 0.7 2023-10-04 03:31:32,136 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.feed_forward2.hidden_balancer.prob, batch_count=1510066.6666666667, ans=0.125 2023-10-04 03:31:33,298 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_238-256668-0 from training. Duration: 0.69 2023-10-04 03:31:36,206 WARNING [train.py:1204] (2/4) Exclude cut with ID _8394_210_6702_1_1528590504316_3540189_372-198878-0 from training. Duration: 0.6 2023-10-04 03:31:36,222 WARNING [train.py:1204] (2/4) Exclude cut with ID ID0174W0006-766659 from training. Duration: 0.953 2023-10-04 03:31:37,519 WARNING [train.py:1204] (2/4) Exclude cut with ID _6274_210_2084_1_1527937583837_7494020_668-23019-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 03:31:42,817 WARNING [train.py:1204] (2/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_322-74328-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 03:31:42,823 WARNING [train.py:1204] (2/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_872-264525-0_sp1.1 from training. Duration: 0.5909375 2023-10-04 03:31:42,871 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_53378_210_17282_1_1534221071_5769509_641-286201-0_sp1.1 from training. Duration: 0.97275 2023-10-04 03:31:42,948 WARNING [train.py:1204] (2/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_199-295611-0_sp1.1 from training. Duration: 0.5 2023-10-04 03:31:47,133 WARNING [train.py:1204] (2/4) Exclude cut with ID _5250_210_5219_1_1527249561806_8692110_1270-317971-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 03:31:47,856 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.3.encoder.layers.0.feed_forward1.out_whiten, num_groups=1, num_channels=512, metric=7.12 vs. limit=15.0 2023-10-04 03:31:48,839 WARNING [train.py:1204] (2/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_784-213099-0 from training. Duration: 0.93 2023-10-04 03:31:53,026 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_115-233929-0_sp0.9 from training. Duration: 0.8 2023-10-04 03:31:54,393 WARNING [train.py:1204] (2/4) Exclude cut with ID _54871_210_22356_1_1534665798253_3856359_256-223809-0_sp1.1 from training. Duration: 0.97275 2023-10-04 03:31:54,435 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_20120_210_6753_1_1531643035_5482859_285-87781-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 03:31:55,766 WARNING [train.py:1204] (2/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_619-344499-0_sp1.1 from training. Duration: 0.57275 2023-10-04 03:31:57,947 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=1510200.0, ans=0.1 2023-10-04 03:32:03,372 WARNING [train.py:1204] (2/4) Exclude cut with ID _60229_210_27767_1_1534838406158_3655350_264-279573-0_sp0.9 from training. Duration: 0.92225 2023-10-04 03:32:06,304 WARNING [train.py:1204] (2/4) Exclude cut with ID _7719_210_3525_1_1528456153163_4296960_333-110399-0 from training. Duration: 0.96 2023-10-04 03:32:11,517 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.balancer2.prob, batch_count=1510266.6666666667, ans=0.125 2023-10-04 03:32:13,998 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_50423_210_13270_1_1534128598_5021734_759-90833-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 03:32:14,052 WARNING [train.py:1204] (2/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_151-254798-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 03:32:14,085 WARNING [train.py:1204] (2/4) Exclude cut with ID _3465_210_3525_1_1526175667060_3574049_450-203637-0 from training. Duration: 0.92 2023-10-04 03:32:15,511 WARNING [train.py:1204] (2/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_673-36456-0_sp1.1 from training. Duration: 0.936375 2023-10-04 03:32:15,561 WARNING [train.py:1204] (2/4) Exclude cut with ID _36538_210_9994_1_1533204020363_3795620_131-8685-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 03:32:16,868 WARNING [train.py:1204] (2/4) Exclude cut with ID _12556_210_8341_1_1530081024100_7327690_713-55626-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 03:32:16,903 WARNING [train.py:1204] (2/4) Exclude cut with ID _2322_210_2667_1_1525604548971_3656299_440-80494-0_sp0.9 from training. Duration: 0.97775 2023-10-04 03:32:17,186 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.ff2_skip_rate, batch_count=1510266.6666666667, ans=0.0 2023-10-04 03:32:20,161 WARNING [train.py:1204] (2/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_210-325081-0_sp1.1 from training. Duration: 0.97275 2023-10-04 03:32:21,719 WARNING [train.py:1204] (2/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_375-244976-0_sp0.9 from training. Duration: 0.8333125 2023-10-04 03:32:21,725 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_15517_210_6753_1_1530952293_1664795_119-31001-0_sp1.1 from training. Duration: 0.8545625 2023-10-04 03:32:27,077 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_41452_210_7291_1_1533437687_3941281_319-42326-0_sp1.1 from training. Duration: 0.92725 2023-10-04 03:32:30,442 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_372-173012-0 from training. Duration: 0.95 2023-10-04 03:32:36,051 WARNING [train.py:1204] (2/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_41-31157-0_sp1.1 from training. Duration: 0.5181875 2023-10-04 03:32:37,634 WARNING [train.py:1204] (2/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_91-212486-0 from training. Duration: 0.68 2023-10-04 03:32:39,353 INFO [train.py:1046] (2/4) Epoch 43, batch 3450, loss[loss=0.1403, simple_loss=0.2224, pruned_loss=0.02906, over 24332.00 frames. ], tot_loss[loss=0.156, simple_loss=0.2364, pruned_loss=0.03781, over 4742494.26 frames. ], batch size: 56, lr: 2.36e-03, grad_scale: 16.0 2023-10-04 03:32:42,855 WARNING [train.py:1204] (2/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_1267-272693-0 from training. Duration: 0.73 2023-10-04 03:32:42,893 WARNING [train.py:1204] (2/4) Exclude cut with ID _13938_210_9988_1_1530518718978_3693960_152-67046-0_sp1.1 from training. Duration: 0.936375 2023-10-04 03:32:45,507 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_4528_210_3273_1_1527076654_3276777_342-168068-0_sp1.1 from training. Duration: 0.7909375 2023-10-04 03:32:45,522 WARNING [train.py:1204] (2/4) Exclude cut with ID _7606_210_4915_1_1528894253277_5080690_127-177761-0 from training. Duration: 0.64 2023-10-04 03:32:46,846 WARNING [train.py:1204] (2/4) Exclude cut with ID _45329_210_7005_1_1534157951211_6774339_335-137972-0_sp1.1 from training. Duration: 0.92725 2023-10-04 03:32:49,679 WARNING [train.py:1204] (2/4) Exclude cut with ID _5166_210_2084_1_1527073197160_4087993_331-117829-0_sp1.1 from training. Duration: 0.563625 2023-10-04 03:32:53,186 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.feed_forward1.out_proj.dropout_p, batch_count=1510466.6666666667, ans=0.1 2023-10-04 03:32:55,655 WARNING [train.py:1204] (2/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_125-125818-0_sp1.1 from training. Duration: 0.87275 2023-10-04 03:32:55,705 WARNING [train.py:1204] (2/4) Exclude cut with ID _6789_210_1534_1_1527854471671_2870519_48-294897-0_sp1.1 from training. Duration: 0.963625 2023-10-04 03:32:57,081 WARNING [train.py:1204] (2/4) Exclude cut with ID _6694_210_4599_1_1527852637490_3965529_116-109398-0_sp1.1 from training. Duration: 0.663625 2023-10-04 03:32:57,091 WARNING [train.py:1204] (2/4) Exclude cut with ID _52892_210_6721_1_1534378941259_4124129_222-172685-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 03:33:00,331 WARNING [train.py:1204] (2/4) Exclude cut with ID _50655_210_18628_1_1534229942193_3772154_320-132275-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 03:33:04,615 WARNING [train.py:1204] (2/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_76-311963-0 from training. Duration: 0.7 2023-10-04 03:33:06,363 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.balancer2.prob, batch_count=1510466.6666666667, ans=0.125 2023-10-04 03:33:09,398 WARNING [train.py:1204] (2/4) Exclude cut with ID _6274_210_2084_1_1527937583837_7494020_32-205288-0 from training. Duration: 0.78 2023-10-04 03:33:09,423 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_86-16881-0_sp0.9 from training. Duration: 0.6444375 2023-10-04 03:33:09,470 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_271-297505-0_sp1.1 from training. Duration: 0.9 2023-10-04 03:33:12,655 WARNING [train.py:1204] (2/4) Exclude cut with ID _57572_210_5517_1_1534921219624_3809484_187-295293-0_sp1.1 from training. Duration: 0.963625 2023-10-04 03:33:16,922 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.1.bypass_mid.scale_min, batch_count=1510533.3333333333, ans=0.2 2023-10-04 03:33:18,146 WARNING [train.py:1204] (2/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_563-329943-0 from training. Duration: 0.99 2023-10-04 03:33:19,574 WARNING [train.py:1204] (2/4) Exclude cut with ID _7160_210_1876_1_1527940923231_3703799_302-150728-0_sp0.9 from training. Duration: 0.8666875 2023-10-04 03:33:22,288 WARNING [train.py:1204] (2/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_926-7526-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 03:33:23,658 WARNING [train.py:1204] (2/4) Exclude cut with ID _62574_210_2655_1_1535180407058_3933249_467-22348-0_sp1.1 from training. Duration: 0.936375 2023-10-04 03:33:25,610 WARNING [train.py:1204] (2/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_280-161633-0_sp0.9 from training. Duration: 0.811125 2023-10-04 03:33:25,720 WARNING [train.py:1204] (2/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_385-193052-0_sp1.1 from training. Duration: 0.7818125 2023-10-04 03:33:26,384 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.3.encoder.layers.2.self_attn2.whiten, num_groups=1, num_channels=512, metric=10.48 vs. limit=22.5 2023-10-04 03:33:27,144 WARNING [train.py:1204] (2/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_444-28041-0 from training. Duration: 0.85 2023-10-04 03:33:27,149 WARNING [train.py:1204] (2/4) Exclude cut with ID _66070_210_7640_1_1535441393096_7805310_341-249237-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 03:33:30,292 WARNING [train.py:1204] (2/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_425-269826-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 03:33:32,466 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.1.encoder.layers.1.whiten, num_groups=1, num_channels=256, metric=3.97 vs. limit=12.0 2023-10-04 03:33:32,954 WARNING [train.py:1204] (2/4) Exclude cut with ID _53950_210_5816_1_1534417264919_3862951_124-185597-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 03:33:35,678 WARNING [train.py:1204] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_380-204756-0 from training. Duration: 0.86 2023-10-04 03:33:38,863 WARNING [train.py:1204] (2/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_282-257791-0_sp0.9 from training. Duration: 0.9555625 2023-10-04 03:33:43,840 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.654e+02 2.002e+02 2.243e+02 2.534e+02 3.921e+02, threshold=4.486e+02, percent-clipped=0.0 2023-10-04 03:33:43,996 WARNING [train.py:1204] (2/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_372-131163-0_sp1.1 from training. Duration: 0.8545625 2023-10-04 03:33:45,387 WARNING [train.py:1204] (2/4) Exclude cut with ID _28317_210_12742_1_1532480302381_8510325_447-233513-0_sp1.1 from training. Duration: 0.963625 2023-10-04 03:33:48,149 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_54-118333-0_sp1.1 from training. Duration: 0.97275 2023-10-04 03:33:49,080 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.0.layers.1.feed_forward1.out_whiten, num_groups=1, num_channels=192, metric=4.68 vs. limit=15.0 2023-10-04 03:33:50,968 WARNING [train.py:1204] (2/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_388-230239-0_sp1.1 from training. Duration: 0.963625 2023-10-04 03:33:50,983 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_40047_210_2290_1_1533290519_2374311_141-54639-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 03:33:52,470 WARNING [train.py:1204] (2/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_91-267696-0_sp0.9 from training. Duration: 0.9666875 2023-10-04 03:33:53,702 INFO [train.py:1046] (2/4) Epoch 43, batch 3500, loss[loss=0.162, simple_loss=0.2456, pruned_loss=0.03923, over 23387.00 frames. ], tot_loss[loss=0.1553, simple_loss=0.235, pruned_loss=0.03778, over 4732197.96 frames. ], batch size: 119, lr: 2.36e-03, grad_scale: 16.0 2023-10-04 03:33:53,757 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_14056_210_4163_1_1530604483_7572827_532-137847-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 03:33:57,118 WARNING [train.py:1204] (2/4) Exclude cut with ID _8064_210_5399_1_1528372299291_4282973_155-283231-0_sp1.1 from training. Duration: 0.97275 2023-10-04 03:33:59,987 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_2245_210_2715_1_1525511548_3540254_310-52483-0_sp1.1 from training. Duration: 0.8 2023-10-04 03:34:00,051 WARNING [train.py:1204] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_185-174805-0 from training. Duration: 0.68 2023-10-04 03:34:01,897 WARNING [train.py:1204] (2/4) Exclude cut with ID _15012_210_1968_1_1530856820429_5095350_662-210469-0_sp1.1 from training. Duration: 0.5181875 2023-10-04 03:34:04,755 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_426-313557-0_sp0.9 from training. Duration: 0.588875 2023-10-04 03:34:08,048 WARNING [train.py:1204] (2/4) Exclude cut with ID _42326_210_7770_1_1533814282531_3707269_407-204343-0_sp1.1 from training. Duration: 0.97275 2023-10-04 03:34:08,056 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_258-310570-0 from training. Duration: 0.82 2023-10-04 03:34:12,128 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_4737_210_2040_1_1526998689_3293008_378-178650-0_sp1.1 from training. Duration: 0.863625 2023-10-04 03:34:12,211 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_47896_210_20508_1_1534492518_5958868_432-327451-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 03:34:13,573 WARNING [train.py:1204] (2/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_188-114464-0_sp1.1 from training. Duration: 0.7454375 2023-10-04 03:34:13,580 WARNING [train.py:1204] (2/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_798-106179-0_sp1.1 from training. Duration: 0.7545625 2023-10-04 03:34:14,849 WARNING [train.py:1204] (2/4) Exclude cut with ID _14368_210_4220_1_1530532714870_3595529_143-117421-0_sp1.1 from training. Duration: 0.52725 2023-10-04 03:34:14,896 WARNING [train.py:1204] (2/4) Exclude cut with ID _67499_210_17282_1_1535509805220_3692430_361-78786-0_sp1.1 from training. Duration: 0.963625 2023-10-04 03:34:14,926 WARNING [train.py:1204] (2/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_712-342563-0_sp1.1 from training. Duration: 0.8818125 2023-10-04 03:34:16,371 WARNING [train.py:1204] (2/4) Exclude cut with ID _9235_210_5219_1_1528891154253_8046120_387-159008-0 from training. Duration: 0.86 2023-10-04 03:34:19,048 WARNING [train.py:1204] (2/4) Exclude cut with ID _33640_210_2655_1_1533117605479_4855469_959-18820-0_sp1.1 from training. Duration: 0.963625 2023-10-04 03:34:19,076 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_12868_210_6753_1_1530702479_3610564_133-564-0_sp0.9 from training. Duration: 0.87775 2023-10-04 03:34:20,498 WARNING [train.py:1204] (2/4) Exclude cut with ID _23514_210_1389_1_1531832139785_3031439_12-29287-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 03:34:23,793 WARNING [train.py:1204] (2/4) Exclude cut with ID _23514_210_1389_1_1531832139785_3031439_489-185052-0_sp1.1 from training. Duration: 0.963625 2023-10-04 03:34:23,867 WARNING [train.py:1204] (2/4) Exclude cut with ID _5444_210_2151_1_1527418808464_5312210_634-194174-0 from training. Duration: 0.97 2023-10-04 03:34:25,181 WARNING [train.py:1204] (2/4) Exclude cut with ID _6275_210_2084_1_1528110062213_7669049_91-26919-0_sp1.1 from training. Duration: 0.8818125 2023-10-04 03:34:26,686 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_37351_210_7925_1_1533012931_3812113_516-82303-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 03:34:28,027 WARNING [train.py:1204] (2/4) Exclude cut with ID _41314_210_12651_1_1533459476543_3766460_80-74184-0_sp0.9 from training. Duration: 0.92225 2023-10-04 03:34:28,256 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.bypass_mid.scale_min, batch_count=1510866.6666666667, ans=0.2 2023-10-04 03:34:29,346 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_9460_210_4211_1_1528954587_10049667_495-202963-0_sp1.1 from training. Duration: 0.963625 2023-10-04 03:34:31,197 WARNING [train.py:1204] (2/4) Exclude cut with ID _14632_210_9988_1_1530691279392_3923200_332-105904-0_sp1.1 from training. Duration: 0.8181875 2023-10-04 03:34:32,573 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_40764_210_1925_1_1533882359_3752345_493-167886-0_sp1.1 from training. Duration: 0.92725 2023-10-04 03:34:32,871 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=1510866.6666666667, ans=0.1 2023-10-04 03:34:35,254 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_196-289637-0 from training. Duration: 0.75 2023-10-04 03:34:35,951 WARNING [train.py:1204] (2/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_26-175553-0 from training. Duration: 0.61 2023-10-04 03:34:36,144 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.feed_forward2.hidden_balancer.prob, batch_count=1510866.6666666667, ans=0.125 2023-10-04 03:34:37,085 WARNING [train.py:1204] (2/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_890-5117-0 from training. Duration: 0.78 2023-10-04 03:34:37,132 WARNING [train.py:1204] (2/4) Exclude cut with ID _41314_210_12651_1_1533459476543_3766460_206-306860-0_sp1.1 from training. Duration: 0.92725 2023-10-04 03:34:39,789 WARNING [train.py:1204] (2/4) Exclude cut with ID _25882_210_3036_1_1532775665815_6479510_570-79824-0_sp1.1 from training. Duration: 0.963625 2023-10-04 03:34:39,848 WARNING [train.py:1204] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_731-2428-0_sp1.1 from training. Duration: 0.7545625 2023-10-04 03:34:39,879 WARNING [train.py:1204] (2/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_138-154812-0_sp0.9 from training. Duration: 0.8666875 2023-10-04 03:34:44,275 WARNING [train.py:1204] (2/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_178-39722-0_sp0.9 from training. Duration: 0.5444375 2023-10-04 03:34:44,332 WARNING [train.py:1204] (2/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_1285-256622-0_sp0.9 from training. Duration: 0.9333125 2023-10-04 03:34:48,407 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_41428_210_2161_1_1533774184_3992532_65-271323-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 03:34:49,876 WARNING [train.py:1204] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_308-161067-0 from training. Duration: 0.66 2023-10-04 03:34:49,881 WARNING [train.py:1204] (2/4) Exclude cut with ID _3067_210_2667_1_1526212974640_4503168_459-173867-0 from training. Duration: 0.88 2023-10-04 03:34:49,883 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_2221_210_1988_1_1525597051_4935541_415-296882-0_sp1.1 from training. Duration: 0.7 2023-10-04 03:34:54,322 WARNING [train.py:1204] (2/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_354-192266-0_sp1.1 from training. Duration: 0.936375 2023-10-04 03:34:54,396 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_258-310570-0_sp0.9 from training. Duration: 0.911125 2023-10-04 03:34:57,131 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_548-141039-0_sp1.1 from training. Duration: 0.963625 2023-10-04 03:34:58,640 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_15101_210_7927_1_1530786415_5033274_320-327527-0 from training. Duration: 0.96 2023-10-04 03:34:59,978 WARNING [train.py:1204] (2/4) Exclude cut with ID _2895_210_2477_1_1525956785794_4063750_289-11475-0_sp0.9 from training. Duration: 0.911125 2023-10-04 03:35:01,357 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7543_210_6753_1_1528543318_4552_1-85072-0_sp1.1 from training. Duration: 0.7545625 2023-10-04 03:35:02,932 WARNING [train.py:1204] (2/4) Exclude cut with ID _64639_210_5816_1_1535367619437_3528000_461-329156-0 from training. Duration: 0.96 2023-10-04 03:35:04,265 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_211-339622-0 from training. Duration: 0.45 2023-10-04 03:35:06,874 WARNING [train.py:1204] (2/4) Exclude cut with ID _20455_210_5219_1_1531565996775_8098100_402-112739-0_sp1.1 from training. Duration: 0.963625 2023-10-04 03:35:07,119 INFO [scaling.py:1118] (2/4) WithLoss: name=encoder.encoders.4.encoder.layers.2.self_attn_weights, loss-sum=0.000e+00 2023-10-04 03:35:07,903 INFO [train.py:1046] (2/4) Epoch 43, batch 3550, loss[loss=0.1287, simple_loss=0.1838, pruned_loss=0.03678, over 19045.00 frames. ], tot_loss[loss=0.154, simple_loss=0.2332, pruned_loss=0.0374, over 4716641.31 frames. ], batch size: 389, lr: 2.36e-03, grad_scale: 8.0 2023-10-04 03:35:08,032 WARNING [train.py:1204] (2/4) Exclude cut with ID _53489_210_6228_1_1534298357169_3835970_256-115565-0_sp1.1 from training. Duration: 0.936375 2023-10-04 03:35:08,047 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_26326_210_7925_1_1533002493_3592406_360-305260-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 03:35:08,069 WARNING [train.py:1204] (2/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_18-204383-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 03:35:12,753 WARNING [train.py:1204] (2/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_306-232480-0_sp1.1 from training. Duration: 0.87275 2023-10-04 03:35:19,427 WARNING [train.py:1204] (2/4) Exclude cut with ID _41161_210_20034_1_1533431249657_3555314_116-299186-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 03:35:20,669 WARNING [train.py:1204] (2/4) Exclude cut with ID _13913_210_9994_1_1530579615187_3383897_473-277682-0_sp1.1 from training. Duration: 0.3545625 2023-10-04 03:35:23,477 WARNING [train.py:1204] (2/4) Exclude cut with ID _58895_210_8750_1_1535184081320_3649089_49-225702-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 03:35:23,553 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_3080_210_2071_1_1526086102_4365166_312-8726-0_sp1.1 from training. Duration: 0.82725 2023-10-04 03:35:25,384 WARNING [train.py:1204] (2/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_489-190175-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 03:35:25,508 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=1511133.3333333333, ans=0.1 2023-10-04 03:35:26,650 WARNING [train.py:1204] (2/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_345-95717-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 03:35:27,959 WARNING [train.py:1204] (2/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_85-138257-0_sp1.1 from training. Duration: 0.6454375 2023-10-04 03:35:30,759 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7046_210_6753_1_1527895337_6920917_783-278109-0_sp1.1 from training. Duration: 0.7 2023-10-04 03:35:32,048 WARNING [train.py:1204] (2/4) Exclude cut with ID _3465_210_3525_1_1526175667060_3574049_450-203637-0_sp1.1 from training. Duration: 0.836375 2023-10-04 03:35:32,081 WARNING [train.py:1204] (2/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_416-132893-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 03:35:32,110 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_2655_210_2087_1_1525688959_3897400_437-337690-0_sp0.9 from training. Duration: 0.7 2023-10-04 03:35:33,440 WARNING [train.py:1204] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_700-183549-0_sp1.1 from training. Duration: 0.6181875 2023-10-04 03:35:39,517 WARNING [train.py:1204] (2/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_191-59784-0_sp1.1 from training. Duration: 0.8 2023-10-04 03:35:39,529 WARNING [train.py:1204] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_218-295097-0_sp1.1 from training. Duration: 0.7 2023-10-04 03:35:41,484 WARNING [train.py:1204] (2/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_310-109461-0_sp1.1 from training. Duration: 0.72725 2023-10-04 03:35:41,488 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_33348_210_5632_1_1533086912_3324030_131-321149-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 03:35:41,539 WARNING [train.py:1204] (2/4) Exclude cut with ID _64135_210_2253_1_1535677267876_7608972_133-188266-0_sp1.1 from training. Duration: 0.736375 2023-10-04 03:35:41,557 WARNING [train.py:1204] (2/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_505-205270-0 from training. Duration: 0.38 2023-10-04 03:35:41,574 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_67223_210_4238_1_1535612120_2537431_102-88248-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 03:35:43,131 WARNING [train.py:1204] (2/4) Exclude cut with ID _54871_210_22356_1_1534665798253_3856359_60-235433-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 03:35:44,537 WARNING [train.py:1204] (2/4) Exclude cut with ID _5189_210_4557_1_1527248319428_3769229_199-53526-0_sp1.1 from training. Duration: 0.3818125 2023-10-04 03:35:45,250 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.2.encoder.layers.1.self_attn1.whiten, num_groups=1, num_channels=384, metric=15.91 vs. limit=22.5 2023-10-04 03:35:48,758 WARNING [train.py:1204] (2/4) Exclude cut with ID _45329_210_7005_1_1534157951211_6774339_439-90979-0_sp1.1 from training. Duration: 0.963625 2023-10-04 03:35:50,114 WARNING [train.py:1204] (2/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_1162-132880-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 03:35:50,185 WARNING [train.py:1204] (2/4) Exclude cut with ID _56423_210_5854_1_1534896061049_3165969_220-22317-0_sp1.1 from training. Duration: 0.963625 2023-10-04 03:35:52,865 WARNING [train.py:1204] (2/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_909-64151-0 from training. Duration: 0.94 2023-10-04 03:35:52,929 WARNING [train.py:1204] (2/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_465-156500-0_sp0.9 from training. Duration: 0.8 2023-10-04 03:35:54,363 WARNING [train.py:1204] (2/4) Exclude cut with ID _44304_210_15374_1_1533639352765_4613129_302-22084-0 from training. Duration: 0.86 2023-10-04 03:35:54,401 WARNING [train.py:1204] (2/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_663-166973-0_sp1.1 from training. Duration: 0.72725 2023-10-04 03:35:57,668 WARNING [train.py:1204] (2/4) Exclude cut with ID _45329_210_7005_1_1534157951211_6774339_503-287883-0_sp0.9 from training. Duration: 0.97775 2023-10-04 03:35:58,947 WARNING [train.py:1204] (2/4) Exclude cut with ID _60057_210_10640_1_1534903065793_3333150_362-230377-0_sp0.9 from training. Duration: 0.9666875 2023-10-04 03:36:01,711 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_4183_210_2087_1_1526729100_1869838_143-280962-0 from training. Duration: 0.56 2023-10-04 03:36:01,791 WARNING [train.py:1204] (2/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_281-1313-0_sp1.1 from training. Duration: 0.936375 2023-10-04 03:36:08,518 WARNING [train.py:1204] (2/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_64-215118-0_sp1.1 from training. Duration: 0.936375 2023-10-04 03:36:09,891 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_2822_210_2715_1_1526112586_1999587_165-36656-0 from training. Duration: 0.89 2023-10-04 03:36:09,954 WARNING [train.py:1204] (2/4) Exclude cut with ID _22961_210_8254_1_1531976366672_4278180_402-208830-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 03:36:14,573 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.748e+02 1.996e+02 2.248e+02 2.673e+02 4.535e+02, threshold=4.496e+02, percent-clipped=1.0 2023-10-04 03:36:14,670 WARNING [train.py:1204] (2/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_98-193494-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 03:36:14,769 WARNING [train.py:1204] (2/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_383-285196-0 from training. Duration: 0.97 2023-10-04 03:36:16,247 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder_embed.conv.5.prob, batch_count=1511333.3333333333, ans=0.125 2023-10-04 03:36:20,253 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6767_210_3273_1_1527855783_1718932_142-127132-0 from training. Duration: 0.95 2023-10-04 03:36:21,459 INFO [train.py:1046] (2/4) Epoch 43, batch 3600, loss[loss=0.1443, simple_loss=0.2228, pruned_loss=0.03284, over 23359.00 frames. ], tot_loss[loss=0.1536, simple_loss=0.2333, pruned_loss=0.03696, over 4727603.12 frames. ], batch size: 119, lr: 2.36e-03, grad_scale: 8.0 2023-10-04 03:36:21,530 WARNING [train.py:1204] (2/4) Exclude cut with ID _39867_210_9826_1_1533553222030_9344259_1018-339635-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 03:36:22,922 WARNING [train.py:1204] (2/4) Exclude cut with ID _14603_210_4220_1_1530615688441_3097319_274-274798-0_sp1.1 from training. Duration: 0.8909375 2023-10-04 03:36:24,314 WARNING [train.py:1204] (2/4) Exclude cut with ID _2334_210_2667_1_1525608238880_4705129_376-263393-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 03:36:24,362 WARNING [train.py:1204] (2/4) Exclude cut with ID _29303_210_2575_1_1532392237303_7410649_363-344675-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 03:36:24,446 WARNING [train.py:1204] (2/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_58-68956-0_sp1.1 from training. Duration: 0.7545625 2023-10-04 03:36:29,257 WARNING [train.py:1204] (2/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_709-311571-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 03:36:30,605 WARNING [train.py:1204] (2/4) Exclude cut with ID _46067_210_4220_1_1534125571066_3307450_136-257407-0_sp1.1 from training. Duration: 0.963625 2023-10-04 03:36:32,090 WARNING [train.py:1204] (2/4) Exclude cut with ID _6493_210_1747_1_1528459210424_4012470_324-323854-0_sp1.1 from training. Duration: 0.836375 2023-10-04 03:36:33,417 WARNING [train.py:1204] (2/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_234-260682-0_sp1.1 from training. Duration: 0.763625 2023-10-04 03:36:33,476 WARNING [train.py:1204] (2/4) Exclude cut with ID _36634_210_1794_1_1533296352450_1399884_55-119451-0_sp1.1 from training. Duration: 0.963625 2023-10-04 03:36:33,479 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_40047_210_2290_1_1533290519_2374311_195-165693-0 from training. Duration: 0.89 2023-10-04 03:36:34,012 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.4.encoder.layers.1.self_attn2.whiten, num_groups=1, num_channels=384, metric=12.47 vs. limit=22.5 2023-10-04 03:36:38,600 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_5522_210_3273_1_1527249606_3909096_366-172508-0_sp0.9 from training. Duration: 0.7333125 2023-10-04 03:36:39,994 WARNING [train.py:1204] (2/4) Exclude cut with ID _21878_210_3525_1_1531738772002_7264330_187-10504-0_sp1.1 from training. Duration: 0.963625 2023-10-04 03:36:42,042 WARNING [train.py:1204] (2/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_282-257791-0_sp1.1 from training. Duration: 0.7818125 2023-10-04 03:36:44,694 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_43831_210_2158_1_1533725502_5872791_467-288776-0_sp1.1 from training. Duration: 0.92725 2023-10-04 03:36:46,053 WARNING [train.py:1204] (2/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_138-154812-0_sp1.1 from training. Duration: 0.7090625 2023-10-04 03:36:46,088 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_1154-185930-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 03:36:47,339 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_2655_210_2087_1_1525688959_3897400_52-279899-0 from training. Duration: 0.69 2023-10-04 03:36:47,396 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_141-89113-0_sp1.1 from training. Duration: 0.7818125 2023-10-04 03:36:51,501 WARNING [train.py:1204] (2/4) Exclude cut with ID _55134_210_21982_1_1534590028377_3882170_297-47310-0_sp1.1 from training. Duration: 0.963625 2023-10-04 03:36:51,563 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_486-151131-0_sp1.1 from training. Duration: 0.77275 2023-10-04 03:36:54,289 WARNING [train.py:1204] (2/4) Exclude cut with ID _44778_210_2054_1_1533798127203_7443090_21-232980-0_sp1.1 from training. Duration: 0.936375 2023-10-04 03:36:55,828 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6165_210_3528_1_1527590982_4379841_338-106380-0_sp1.1 from training. Duration: 0.92725 2023-10-04 03:36:57,151 WARNING [train.py:1204] (2/4) Exclude cut with ID _55162_210_17071_1_1534408248583_3753480_355-257218-0_sp1.1 from training. Duration: 0.97275 2023-10-04 03:36:58,483 WARNING [train.py:1204] (2/4) Exclude cut with ID _2895_210_2477_1_1525956785794_4063750_289-11475-0 from training. Duration: 0.82 2023-10-04 03:36:58,756 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.attention_skip_rate, batch_count=1511533.3333333333, ans=0.0 2023-10-04 03:37:03,201 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.0.attention_skip_rate, batch_count=1511533.3333333333, ans=0.0 2023-10-04 03:37:05,809 WARNING [train.py:1204] (2/4) Exclude cut with ID _16756_210_4915_1_1531047803763_7104259_839-183431-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 03:37:05,904 WARNING [train.py:1204] (2/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_427-208901-0_sp0.9 from training. Duration: 0.7555625 2023-10-04 03:37:07,264 WARNING [train.py:1204] (2/4) Exclude cut with ID _36512_210_6987_1_1533033143998_3126069_181-306500-0 from training. Duration: 0.95 2023-10-04 03:37:11,212 WARNING [train.py:1204] (2/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_28-70987-0_sp1.1 from training. Duration: 0.7909375 2023-10-04 03:37:18,551 WARNING [train.py:1204] (2/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_590-58996-0_sp1.1 from training. Duration: 0.936375 2023-10-04 03:37:18,737 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.nonlin_attention.balancer.prob, batch_count=1511600.0, ans=0.125 2023-10-04 03:37:21,411 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_48730_210_5399_1_1533885886_2212769_108-47561-0_sp1.1 from training. Duration: 0.936375 2023-10-04 03:37:27,024 WARNING [train.py:1204] (2/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_401-158239-0_sp0.9 from training. Duration: 0.788875 2023-10-04 03:37:27,037 WARNING [train.py:1204] (2/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_278-154492-0_sp1.1 from training. Duration: 0.6909375 2023-10-04 03:37:27,043 WARNING [train.py:1204] (2/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_137-116442-0 from training. Duration: 0.78 2023-10-04 03:37:28,509 WARNING [train.py:1204] (2/4) Exclude cut with ID _3541_210_2477_1_1526475408709_3263973_189-317158-0 from training. Duration: 0.9 2023-10-04 03:37:28,595 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_47896_210_20508_1_1534492518_5958868_558-314572-0 from training. Duration: 0.97 2023-10-04 03:37:30,025 WARNING [train.py:1204] (2/4) Exclude cut with ID _66070_210_7640_1_1535441393096_7805310_290-40284-0_sp1.1 from training. Duration: 0.97275 2023-10-04 03:37:31,350 WARNING [train.py:1204] (2/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_110-224688-0_sp1.1 from training. Duration: 0.8818125 2023-10-04 03:37:32,738 WARNING [train.py:1204] (2/4) Exclude cut with ID _7719_210_3525_1_1528456153163_4296960_88-250259-0 from training. Duration: 0.84 2023-10-04 03:37:32,781 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8453_210_4211_1_1528502940_8562730_1157-114102-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 03:37:32,821 WARNING [train.py:1204] (2/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_317-175062-0_sp1.1 from training. Duration: 0.7454375 2023-10-04 03:37:32,827 WARNING [train.py:1204] (2/4) Exclude cut with ID _29857_210_20034_1_1532395886753_3798160_254-347254-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 03:37:34,680 WARNING [train.py:1204] (2/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_28-70987-0 from training. Duration: 0.87 2023-10-04 03:37:35,997 INFO [train.py:1046] (2/4) Epoch 43, batch 3650, loss[loss=0.1455, simple_loss=0.2256, pruned_loss=0.03275, over 24333.00 frames. ], tot_loss[loss=0.1541, simple_loss=0.234, pruned_loss=0.03712, over 4732363.20 frames. ], batch size: 56, lr: 2.36e-03, grad_scale: 8.0 2023-10-04 03:37:36,064 WARNING [train.py:1204] (2/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_321-335475-0 from training. Duration: 0.45 2023-10-04 03:37:39,368 WARNING [train.py:1204] (2/4) Exclude cut with ID _40901_210_5665_1_1533363789958_3856130_356-107330-0_sp1.1 from training. Duration: 0.936375 2023-10-04 03:37:40,624 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_3780_210_2087_1_1526467389_2145572_176-107140-0 from training. Duration: 0.69 2023-10-04 03:37:44,147 WARNING [train.py:1204] (2/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_80-135200-0 from training. Duration: 0.79 2023-10-04 03:37:47,232 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_33348_210_5632_1_1533086912_3324030_85-28583-0_sp0.9 from training. Duration: 0.911125 2023-10-04 03:37:48,656 WARNING [train.py:1204] (2/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_29-293896-0 from training. Duration: 0.61 2023-10-04 03:37:51,320 WARNING [train.py:1204] (2/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_194-252800-0 from training. Duration: 0.97 2023-10-04 03:37:54,244 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_360-52446-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 03:37:54,246 WARNING [train.py:1204] (2/4) Exclude cut with ID _9389_210_1664_1_1529217337301_5289269_289-213417-0_sp0.9 from training. Duration: 0.988875 2023-10-04 03:37:55,629 WARNING [train.py:1204] (2/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_143-86344-0_sp0.9 from training. Duration: 0.8555625 2023-10-04 03:37:58,505 WARNING [train.py:1204] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_520-126729-0_sp0.9 from training. Duration: 0.77775 2023-10-04 03:37:58,531 WARNING [train.py:1204] (2/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_76-308663-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 03:37:59,848 WARNING [train.py:1204] (2/4) Exclude cut with ID _8021_210_7994_1_1528376451358_3653069_100-140231-0 from training. Duration: 0.67 2023-10-04 03:37:59,933 WARNING [train.py:1204] (2/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_387-131538-0_sp0.9 from training. Duration: 0.9 2023-10-04 03:37:59,958 WARNING [train.py:1204] (2/4) Exclude cut with ID _6275_210_2084_1_1528110062213_7669049_365-182054-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 03:38:01,254 WARNING [train.py:1204] (2/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_345-340690-0 from training. Duration: 0.93 2023-10-04 03:38:02,692 WARNING [train.py:1204] (2/4) Exclude cut with ID _5409_210_1388_1_1527292850189_5865191_178-127970-0_sp0.9 from training. Duration: 0.6333125 2023-10-04 03:38:02,755 WARNING [train.py:1204] (2/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_352-12734-0_sp1.1 from training. Duration: 0.92725 2023-10-04 03:38:02,758 WARNING [train.py:1204] (2/4) Exclude cut with ID _65134_210_14763_1_1535335393985_3505368_121-129373-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 03:38:04,215 WARNING [train.py:1204] (2/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_557-324258-0_sp1.1 from training. Duration: 0.736375 2023-10-04 03:38:04,452 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.feed_forward3.hidden_balancer.prob, batch_count=1511866.6666666667, ans=0.125 2023-10-04 03:38:05,693 INFO [scaling.py:1118] (2/4) WithLoss: name=encoder.encoders.1.encoder.layers.1.self_attn_weights, loss-sum=0.000e+00 2023-10-04 03:38:05,926 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.5.encoder.layers.1.conv_module2.whiten, num_groups=1, num_channels=256, metric=6.01 vs. limit=15.0 2023-10-04 03:38:08,815 WARNING [train.py:1204] (2/4) Exclude cut with ID _48678_210_5663_1_1533988830947_3769199_306-255886-0 from training. Duration: 0.77 2023-10-04 03:38:10,202 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7544_210_6753_1_1528587000_691468_87-178599-0 from training. Duration: 0.87 2023-10-04 03:38:10,262 WARNING [train.py:1204] (2/4) Exclude cut with ID _2941_210_2577_1_1526199673150_2489759_208-236402-0_sp1.1 from training. Duration: 0.963625 2023-10-04 03:38:13,410 WARNING [train.py:1204] (2/4) Exclude cut with ID _38670_210_15379_1_1533448810659_4391059_65-169397-0 from training. Duration: 0.97 2023-10-04 03:38:14,811 WARNING [train.py:1204] (2/4) Exclude cut with ID _30304_210_6319_1_1532476827261_7582060_306-328175-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 03:38:14,820 WARNING [train.py:1204] (2/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_212-149947-0_sp1.1 from training. Duration: 0.663625 2023-10-04 03:38:19,576 WARNING [train.py:1204] (2/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_321-22821-0_sp1.1 from training. Duration: 0.8090625 2023-10-04 03:38:21,079 WARNING [train.py:1204] (2/4) Exclude cut with ID _44010_210_4525_1_1533884262964_4013620_483-69110-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 03:38:21,095 WARNING [train.py:1204] (2/4) Exclude cut with ID _14922_210_6228_1_1530707435273_3693310_38-187851-0_sp1.1 from training. Duration: 0.563625 2023-10-04 03:38:23,715 WARNING [train.py:1204] (2/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_129-337802-0_sp0.9 from training. Duration: 0.8 2023-10-04 03:38:25,063 WARNING [train.py:1204] (2/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_268-87909-0_sp1.1 from training. Duration: 0.9 2023-10-04 03:38:27,765 WARNING [train.py:1204] (2/4) Exclude cut with ID _21878_210_3525_1_1531738772002_7264330_559-184862-0_sp1.1 from training. Duration: 0.8545625 2023-10-04 03:38:30,546 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_46032_210_14656_1_1533781412_4507969_237-120766-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 03:38:30,634 WARNING [train.py:1204] (2/4) Exclude cut with ID _28427_210_4220_1_1532581240967_3061069_330-170906-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 03:38:30,639 WARNING [train.py:1204] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_401-51578-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 03:38:32,028 WARNING [train.py:1204] (2/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_209-205365-0_sp1.1 from training. Duration: 0.7181875 2023-10-04 03:38:33,572 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_23801_210_16223_1_1532066392_3294657_57-242170-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 03:38:35,172 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_127-37371-0_sp1.1 from training. Duration: 0.936375 2023-10-04 03:38:41,169 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9171W0019-578905 from training. Duration: 0.896 2023-10-04 03:38:42,406 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.633e+02 2.039e+02 2.236e+02 2.475e+02 4.169e+02, threshold=4.472e+02, percent-clipped=0.0 2023-10-04 03:38:42,842 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.conv_module2.balancer2.prob, batch_count=1512000.0, ans=0.125 2023-10-04 03:38:43,966 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_24605_210_1794_1_1532047863_4149755_349-49056-0_sp1.1 from training. Duration: 0.92725 2023-10-04 03:38:43,977 WARNING [train.py:1204] (2/4) Exclude cut with ID _12302_210_5219_1_1529924446466_8107280_897-21237-0_sp1.1 from training. Duration: 0.936375 2023-10-04 03:38:45,916 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_9460_210_4211_1_1528954587_10049667_480-260269-0_sp0.9 from training. Duration: 0.62225 2023-10-04 03:38:46,086 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.bypass_mid.scale_min, batch_count=1512000.0, ans=0.2 2023-10-04 03:38:47,229 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_348-152548-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 03:38:47,301 WARNING [train.py:1204] (2/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_301-330455-0_sp0.9 from training. Duration: 0.888875 2023-10-04 03:38:49,402 WARNING [train.py:1204] (2/4) Exclude cut with ID _61008_210_10633_1_1534852769198_6974370_345-91705-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 03:38:50,644 INFO [train.py:1046] (2/4) Epoch 43, batch 3700, loss[loss=0.1466, simple_loss=0.2294, pruned_loss=0.03192, over 24320.00 frames. ], tot_loss[loss=0.1546, simple_loss=0.2347, pruned_loss=0.03727, over 4725071.73 frames. ], batch size: 61, lr: 2.36e-03, grad_scale: 8.0 2023-10-04 03:38:52,108 WARNING [train.py:1204] (2/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_304-273433-0 from training. Duration: 0.99 2023-10-04 03:38:52,113 WARNING [train.py:1204] (2/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_359-345972-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 03:38:53,494 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_2296_210_2087_1_1525503448_3397335_57-185430-0_sp1.1 from training. Duration: 0.5454375 2023-10-04 03:38:54,983 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_1256-120974-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 03:38:55,045 WARNING [train.py:1204] (2/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_372-30302-0_sp0.9 from training. Duration: 0.9555625 2023-10-04 03:38:57,837 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_42012_210_7927_1_1533439936_3871556_451-344262-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 03:38:57,838 WARNING [train.py:1204] (2/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_28-324729-0 from training. Duration: 0.94 2023-10-04 03:38:57,846 WARNING [train.py:1204] (2/4) Exclude cut with ID _64135_210_2253_1_1535677267876_7608972_313-18394-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 03:38:59,139 WARNING [train.py:1204] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_620-276793-0_sp0.9 from training. Duration: 0.6555625 2023-10-04 03:38:59,172 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7544_210_6753_1_1528591327_1471426_172-348777-0_sp0.9 from training. Duration: 0.7333125 2023-10-04 03:39:01,736 WARNING [train.py:1204] (2/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_332-221057-0_sp0.9 from training. Duration: 0.7555625 2023-10-04 03:39:04,557 WARNING [train.py:1204] (2/4) Exclude cut with ID _19023_210_4353_1_1531378892552_5820780_5-300035-0_sp1.1 from training. Duration: 0.92725 2023-10-04 03:39:05,963 WARNING [train.py:1204] (2/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_343-309023-0_sp1.1 from training. Duration: 0.963625 2023-10-04 03:39:07,281 WARNING [train.py:1204] (2/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_37-41919-0_sp1.1 from training. Duration: 0.7909375 2023-10-04 03:39:08,707 WARNING [train.py:1204] (2/4) Exclude cut with ID _3541_210_2477_1_1526475408709_3263973_41-182910-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 03:39:08,753 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_229-45300-0_sp0.9 from training. Duration: 0.6333125 2023-10-04 03:39:11,980 WARNING [train.py:1204] (2/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_40-214716-0_sp1.1 from training. Duration: 0.963625 2023-10-04 03:39:13,295 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9309W0203-640330 from training. Duration: 0.896 2023-10-04 03:39:21,687 WARNING [train.py:1204] (2/4) Exclude cut with ID _60229_210_27767_1_1534838406158_3655350_264-279573-0_sp1.1 from training. Duration: 0.7545625 2023-10-04 03:39:21,735 WARNING [train.py:1204] (2/4) Exclude cut with ID _8394_210_6702_1_1528590504316_3540189_372-198878-0_sp0.9 from training. Duration: 0.6666875 2023-10-04 03:39:23,076 WARNING [train.py:1204] (2/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_359-118926-0_sp1.1 from training. Duration: 0.6545625 2023-10-04 03:39:23,097 WARNING [train.py:1204] (2/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_85-65056-0 from training. Duration: 0.96 2023-10-04 03:39:23,115 WARNING [train.py:1204] (2/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_89-242335-0_sp1.1 from training. Duration: 0.863625 2023-10-04 03:39:27,267 WARNING [train.py:1204] (2/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_128-49444-0_sp1.1 from training. Duration: 0.936375 2023-10-04 03:39:27,341 WARNING [train.py:1204] (2/4) Exclude cut with ID _30327_210_16049_1_1532484161295_3924169_401-70819-0 from training. Duration: 0.86 2023-10-04 03:39:28,601 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_41822_210_13270_1_1533955976_4379284_260-256391-0_sp1.1 from training. Duration: 0.936375 2023-10-04 03:39:28,807 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.conv_module2.balancer1.prob, batch_count=1512200.0, ans=0.125 2023-10-04 03:39:31,167 WARNING [train.py:1204] (2/4) Exclude cut with ID _67335_210_5219_1_1535525975652_4275229_599-247317-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 03:39:32,687 WARNING [train.py:1204] (2/4) Exclude cut with ID _29857_210_20034_1_1532395886753_3798160_209-239267-0_sp1.1 from training. Duration: 0.936375 2023-10-04 03:39:32,709 WARNING [train.py:1204] (2/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_46-207599-0_sp1.1 from training. Duration: 0.5818125 2023-10-04 03:39:35,467 WARNING [train.py:1204] (2/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_912-282412-0_sp1.1 from training. Duration: 0.4454375 2023-10-04 03:39:35,675 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.ff3_skip_rate, batch_count=1512266.6666666667, ans=0.0 2023-10-04 03:39:41,071 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_4183_210_2087_1_1526729100_1869838_141-166600-0_sp1.1 from training. Duration: 0.863625 2023-10-04 03:39:41,075 WARNING [train.py:1204] (2/4) Exclude cut with ID _15037_210_2253_1_1530752294104_3267650_296-192597-0 from training. Duration: 0.81 2023-10-04 03:39:41,118 WARNING [train.py:1204] (2/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_571-220562-0_sp1.1 from training. Duration: 0.963625 2023-10-04 03:39:42,374 WARNING [train.py:1204] (2/4) Exclude cut with ID _5287_210_2084_1_1527418883040_3878670_12-221186-0 from training. Duration: 0.98 2023-10-04 03:39:46,349 WARNING [train.py:1204] (2/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_149-85602-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 03:39:46,372 WARNING [train.py:1204] (2/4) Exclude cut with ID _9235_210_5219_1_1528891154253_8046120_1003-101638-0_sp1.1 from training. Duration: 0.663625 2023-10-04 03:39:50,897 WARNING [train.py:1204] (2/4) Exclude cut with ID _55844_210_2346_1_1534471212569_7625707_1099-33689-0_sp1.1 from training. Duration: 0.92725 2023-10-04 03:39:50,946 WARNING [train.py:1204] (2/4) Exclude cut with ID _7606_210_4915_1_1528894253277_5080690_301-10732-0 from training. Duration: 0.81 2023-10-04 03:39:54,101 WARNING [train.py:1204] (2/4) Exclude cut with ID _22826_210_9994_1_1531908201522_3279169_403-34698-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 03:39:54,113 WARNING [train.py:1204] (2/4) Exclude cut with ID _8480_210_1874_1_1528589391205_6728330_755-112625-0_sp0.9 from training. Duration: 0.82225 2023-10-04 03:39:54,127 WARNING [train.py:1204] (2/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_527-238990-0_sp1.1 from training. Duration: 0.8181875 2023-10-04 03:39:54,147 WARNING [train.py:1204] (2/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_426-7328-0_sp1.1 from training. Duration: 0.92725 2023-10-04 03:39:56,984 WARNING [train.py:1204] (2/4) Exclude cut with ID _3541_210_2477_1_1526475408709_3263973_189-317158-0_sp1.1 from training. Duration: 0.8181875 2023-10-04 03:39:57,054 WARNING [train.py:1204] (2/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_162-103620-0 from training. Duration: 0.93 2023-10-04 03:39:59,678 WARNING [train.py:1204] (2/4) Exclude cut with ID _22083_210_8254_1_1531702862439_3678970_49-234546-0 from training. Duration: 0.83 2023-10-04 03:39:59,716 WARNING [train.py:1204] (2/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_370-331600-0_sp1.1 from training. Duration: 0.8545625 2023-10-04 03:39:59,728 WARNING [train.py:1204] (2/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_166-59042-0_sp1.1 from training. Duration: 0.97275 2023-10-04 03:40:00,968 WARNING [train.py:1204] (2/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_138-332773-0_sp1.1 from training. Duration: 0.736375 2023-10-04 03:40:02,395 WARNING [train.py:1204] (2/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_90-30673-0_sp0.9 from training. Duration: 0.9333125 2023-10-04 03:40:03,716 INFO [train.py:1046] (2/4) Epoch 43, batch 3750, loss[loss=0.1712, simple_loss=0.2591, pruned_loss=0.04162, over 24421.00 frames. ], tot_loss[loss=0.1561, simple_loss=0.2362, pruned_loss=0.038, over 4719136.39 frames. ], batch size: 77, lr: 2.36e-03, grad_scale: 8.0 2023-10-04 03:40:03,849 WARNING [train.py:1204] (2/4) Exclude cut with ID _27967_210_12140_1_1532588391859_8419130_737-41898-0_sp1.1 from training. Duration: 0.936375 2023-10-04 03:40:04,165 INFO [scaling.py:1118] (2/4) WithLoss: name=encoder.encoders.3.encoder.layers.3.self_attn_weights, loss-sum=0.000e+00 2023-10-04 03:40:05,318 WARNING [train.py:1204] (2/4) Exclude cut with ID _8833_210_5219_1_1528592665210_7553059_329-125156-0_sp1.1 from training. Duration: 0.7454375 2023-10-04 03:40:06,636 WARNING [train.py:1204] (2/4) Exclude cut with ID _41817_210_18835_1_1533434477160_7423049_352-42537-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 03:40:08,245 WARNING [train.py:1204] (2/4) Exclude cut with ID _8365_210_2064_1_1528592311070_4677690_431-294102-0 from training. Duration: 0.89 2023-10-04 03:40:08,550 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.balancer1.prob, batch_count=1512400.0, ans=0.125 2023-10-04 03:40:09,718 WARNING [train.py:1204] (2/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_454-314901-0_sp1.1 from training. Duration: 0.4181875 2023-10-04 03:40:12,834 WARNING [train.py:1204] (2/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_80-135200-0_sp0.9 from training. Duration: 0.87775 2023-10-04 03:40:12,892 WARNING [train.py:1204] (2/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_181-78632-0 from training. Duration: 0.78 2023-10-04 03:40:14,688 WARNING [train.py:1204] (2/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_300-27235-0_sp0.9 from training. Duration: 0.9666875 2023-10-04 03:40:16,098 WARNING [train.py:1204] (2/4) Exclude cut with ID _47790_210_16187_1_1533862814066_4207519_96-177176-0_sp1.1 from training. Duration: 0.97275 2023-10-04 03:40:17,439 WARNING [train.py:1204] (2/4) Exclude cut with ID _35941_210_15379_1_1532995294515_4023089_246-348742-0_sp1.1 from training. Duration: 0.97275 2023-10-04 03:40:17,521 WARNING [train.py:1204] (2/4) Exclude cut with ID _38670_210_15379_1_1533448810659_4391059_65-169397-0_sp1.1 from training. Duration: 0.8818125 2023-10-04 03:40:17,672 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.feed_forward3.hidden_balancer.prob, batch_count=1512466.6666666667, ans=0.125 2023-10-04 03:40:21,570 WARNING [train.py:1204] (2/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_1003-99642-0_sp1.1 from training. Duration: 0.963625 2023-10-04 03:40:26,331 WARNING [train.py:1204] (2/4) Exclude cut with ID _40402_210_18219_1_1533522324115_2953150_320-14078-0_sp1.1 from training. Duration: 0.863625 2023-10-04 03:40:26,429 WARNING [train.py:1204] (2/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_367-61330-0_sp0.9 from training. Duration: 0.8333125 2023-10-04 03:40:27,918 WARNING [train.py:1204] (2/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_515-298514-0_sp1.1 from training. Duration: 0.92725 2023-10-04 03:40:30,609 WARNING [train.py:1204] (2/4) Exclude cut with ID _53731_210_7994_1_1534553979244_3757214_471-320815-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 03:40:30,665 WARNING [train.py:1204] (2/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_306-232480-0 from training. Duration: 0.96 2023-10-04 03:40:32,022 WARNING [train.py:1204] (2/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_478-162247-0_sp0.9 from training. Duration: 0.911125 2023-10-04 03:40:33,349 WARNING [train.py:1204] (2/4) Exclude cut with ID _56423_210_5854_1_1534896061049_3165969_345-284696-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 03:40:33,396 WARNING [train.py:1204] (2/4) Exclude cut with ID _44778_210_2054_1_1533798127203_7443090_600-122253-0_sp1.1 from training. Duration: 0.963625 2023-10-04 03:40:36,186 WARNING [train.py:1204] (2/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_199-295611-0 from training. Duration: 0.55 2023-10-04 03:40:36,431 INFO [scaling.py:1118] (2/4) WithLoss: name=encoder.encoders.3.encoder.layers.0.self_attn_weights, loss-sum=0.000e+00 2023-10-04 03:40:39,808 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.balancer1.prob, batch_count=1512533.3333333333, ans=0.125 2023-10-04 03:40:40,912 WARNING [train.py:1204] (2/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_734-129761-0 from training. Duration: 0.82 2023-10-04 03:40:42,319 WARNING [train.py:1204] (2/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_349-169257-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 03:40:43,573 WARNING [train.py:1204] (2/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_202-67017-0_sp0.9 from training. Duration: 0.911125 2023-10-04 03:40:45,058 WARNING [train.py:1204] (2/4) Exclude cut with ID _16908_210_5724_1_1531119157248_3772700_61-258978-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 03:40:45,807 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.1.feed_forward2.hidden_balancer.prob, batch_count=1512533.3333333333, ans=0.125 2023-10-04 03:40:48,313 WARNING [train.py:1204] (2/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_172-156931-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 03:40:50,256 WARNING [train.py:1204] (2/4) Exclude cut with ID _4092_210_2151_1_1526643044449_3135759_356-278694-0_sp1.1 from training. Duration: 0.42725 2023-10-04 03:40:54,407 WARNING [train.py:1204] (2/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_116-332008-0 from training. Duration: 0.98 2023-10-04 03:40:57,056 WARNING [train.py:1204] (2/4) Exclude cut with ID _29003_210_19596_1_1532507391720_3894098_358-114789-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 03:40:59,877 WARNING [train.py:1204] (2/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_152-212533-0_sp1.1 from training. Duration: 0.8818125 2023-10-04 03:40:59,941 WARNING [train.py:1204] (2/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_267-214116-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 03:41:03,594 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.1.encoder.layers.0.conv_module2.whiten, num_groups=1, num_channels=256, metric=8.85 vs. limit=15.0 2023-10-04 03:41:04,072 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7543_210_6753_1_1528541907_1399322_165-323619-0_sp0.9 from training. Duration: 0.7666875 2023-10-04 03:41:08,185 WARNING [train.py:1204] (2/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_577-332600-0_sp0.9 from training. Duration: 0.6333125 2023-10-04 03:41:10,051 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.601e+02 2.075e+02 2.386e+02 2.913e+02 4.377e+02, threshold=4.772e+02, percent-clipped=0.0 2023-10-04 03:41:10,139 WARNING [train.py:1204] (2/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_342-100157-0_sp0.9 from training. Duration: 0.6 2023-10-04 03:41:11,551 WARNING [train.py:1204] (2/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_141-89727-0_sp1.1 from training. Duration: 0.6454375 2023-10-04 03:41:12,943 WARNING [train.py:1204] (2/4) Exclude cut with ID _3992_210_1747_1_1526644816031_3731060_107-251434-0_sp1.1 from training. Duration: 0.9 2023-10-04 03:41:13,830 INFO [scaling.py:1118] (2/4) WithLoss: name=encoder.encoders.2.encoder.layers.0.self_attn_weights, loss-sum=0.000e+00 2023-10-04 03:41:14,933 WARNING [train.py:1204] (2/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_401-53423-0_sp1.1 from training. Duration: 0.5 2023-10-04 03:41:16,717 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.conv_module1.balancer1.min_positive, batch_count=1512733.3333333333, ans=0.025 2023-10-04 03:41:17,730 INFO [train.py:1046] (2/4) Epoch 43, batch 3800, loss[loss=0.1671, simple_loss=0.2306, pruned_loss=0.05182, over 19762.00 frames. ], tot_loss[loss=0.1569, simple_loss=0.2371, pruned_loss=0.03835, over 4719346.10 frames. ], batch size: 388, lr: 2.36e-03, grad_scale: 8.0 2023-10-04 03:41:18,049 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=1512733.3333333333, ans=0.1 2023-10-04 03:41:21,064 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.0.balancer1.prob, batch_count=1512733.3333333333, ans=0.125 2023-10-04 03:41:24,054 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7762_210_1794_1_1528199508_4057408_422-227034-0_sp1.1 from training. Duration: 0.663625 2023-10-04 03:41:26,962 WARNING [train.py:1204] (2/4) Exclude cut with ID _4184_210_2550_1_1526730192013_2219380_102-250299-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 03:41:28,301 WARNING [train.py:1204] (2/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_253-308377-0_sp0.9 from training. Duration: 0.5444375 2023-10-04 03:41:28,372 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_271-297505-0 from training. Duration: 0.99 2023-10-04 03:41:29,740 WARNING [train.py:1204] (2/4) Exclude cut with ID _35941_210_15379_1_1532995294515_4023089_213-142973-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 03:41:31,347 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.ff3_skip_rate, batch_count=1512800.0, ans=0.0 2023-10-04 03:41:32,464 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7544_210_6753_1_1528587722_3576686_162-63018-0_sp1.1 from training. Duration: 0.92725 2023-10-04 03:41:32,541 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_365-217119-0_sp0.9 from training. Duration: 0.7 2023-10-04 03:41:33,941 WARNING [train.py:1204] (2/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_662-275621-0_sp0.9 from training. Duration: 0.4666875 2023-10-04 03:41:33,942 WARNING [train.py:1204] (2/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_374-187555-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 03:41:35,229 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_289-180904-0_sp1.1 from training. Duration: 0.6909375 2023-10-04 03:41:36,723 WARNING [train.py:1204] (2/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_189-47206-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 03:41:36,758 WARNING [train.py:1204] (2/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_234-14174-0_sp1.1 from training. Duration: 0.7454375 2023-10-04 03:41:36,788 WARNING [train.py:1204] (2/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_576-76621-0_sp1.1 from training. Duration: 0.963625 2023-10-04 03:41:39,396 WARNING [train.py:1204] (2/4) Exclude cut with ID _13253_210_1925_1_1530757775819_3577873_253-246386-0 from training. Duration: 0.9 2023-10-04 03:41:43,186 WARNING [train.py:1204] (2/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_372-6869-0_sp1.1 from training. Duration: 0.4 2023-10-04 03:41:43,234 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_21434_210_4238_1_1531916339_1222252_22-54954-0_sp1.1 from training. Duration: 0.936375 2023-10-04 03:41:47,256 WARNING [train.py:1204] (2/4) Exclude cut with ID _29853_210_7497_1_1532505600851_6642110_492-170361-0_sp1.1 from training. Duration: 0.92725 2023-10-04 03:41:47,436 WARNING [train.py:1204] (2/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_505-97987-0_sp0.9 from training. Duration: 0.9555625 2023-10-04 03:41:48,843 WARNING [train.py:1204] (2/4) Exclude cut with ID _8365_210_2064_1_1528592311070_4677690_371-122295-0_sp0.9 from training. Duration: 0.611125 2023-10-04 03:41:48,964 WARNING [train.py:1204] (2/4) Exclude cut with ID _14268_210_9206_1_1530622771285_4714099_637-86630-0_sp1.1 from training. Duration: 0.463625 2023-10-04 03:41:48,977 WARNING [train.py:1204] (2/4) Exclude cut with ID _29857_210_20034_1_1532395886753_3798160_372-5678-0_sp1.1 from training. Duration: 0.963625 2023-10-04 03:41:52,707 WARNING [train.py:1204] (2/4) Exclude cut with ID _52892_210_6721_1_1534378941259_4124129_90-165108-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 03:41:54,050 WARNING [train.py:1204] (2/4) Exclude cut with ID _27052_210_5986_1_1532329197187_3585009_143-339198-0_sp1.1 from training. Duration: 0.963625 2023-10-04 03:41:58,383 WARNING [train.py:1204] (2/4) Exclude cut with ID _14268_210_9206_1_1530622771285_4714099_637-86630-0_sp0.9 from training. Duration: 0.5666875 2023-10-04 03:41:58,386 WARNING [train.py:1204] (2/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_401-255491-0 from training. Duration: 0.7 2023-10-04 03:41:59,740 WARNING [train.py:1204] (2/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_178-6946-0_sp1.1 from training. Duration: 0.97275 2023-10-04 03:42:06,616 WARNING [train.py:1204] (2/4) Exclude cut with ID _27485_210_4916_1_1532739191677_3668269_460-163885-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 03:42:14,678 WARNING [train.py:1204] (2/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_270-26053-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 03:42:16,091 WARNING [train.py:1204] (2/4) Exclude cut with ID _14170_210_5219_1_1530525949868_4469650_412-317077-0 from training. Duration: 0.75 2023-10-04 03:42:17,484 WARNING [train.py:1204] (2/4) Exclude cut with ID _5289_210_2084_1_1527678202736_3779299_142-103317-0 from training. Duration: 0.84 2023-10-04 03:42:17,535 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_150-143438-0_sp1.1 from training. Duration: 0.92725 2023-10-04 03:42:18,936 WARNING [train.py:1204] (2/4) Exclude cut with ID _27894_210_12392_1_1532415496078_6902439_668-269779-0_sp1.1 from training. Duration: 0.97275 2023-10-04 03:42:18,992 WARNING [train.py:1204] (2/4) Exclude cut with ID _18513_210_9206_1_1531618215223_3862620_56-222075-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 03:42:22,357 WARNING [train.py:1204] (2/4) Exclude cut with ID _4092_210_2151_1_1526643044449_3135759_356-278694-0 from training. Duration: 0.47 2023-10-04 03:42:25,753 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.balancer2.prob, batch_count=1513000.0, ans=0.125 2023-10-04 03:42:26,827 WARNING [train.py:1204] (2/4) Exclude cut with ID _25808_210_13292_1_1532343514562_7366109_633-47524-0 from training. Duration: 0.99 2023-10-04 03:42:26,838 WARNING [train.py:1204] (2/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_872-264525-0 from training. Duration: 0.65 2023-10-04 03:42:26,876 WARNING [train.py:1204] (2/4) Exclude cut with ID _14632_210_9988_1_1530691279392_3923200_239-232545-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 03:42:28,297 WARNING [train.py:1204] (2/4) Exclude cut with ID _14349_210_8341_1_1530615710818_3442470_153-27432-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 03:42:32,358 INFO [train.py:1046] (2/4) Epoch 43, batch 3850, loss[loss=0.1562, simple_loss=0.2433, pruned_loss=0.03452, over 24354.00 frames. ], tot_loss[loss=0.1557, simple_loss=0.2354, pruned_loss=0.03801, over 4715068.62 frames. ], batch size: 74, lr: 2.36e-03, grad_scale: 8.0 2023-10-04 03:42:33,782 WARNING [train.py:1204] (2/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_37-41919-0_sp0.9 from training. Duration: 0.9666875 2023-10-04 03:42:33,841 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_196-289637-0_sp0.9 from training. Duration: 0.8333125 2023-10-04 03:42:38,120 WARNING [train.py:1204] (2/4) Exclude cut with ID _13678_210_7497_1_1530530069226_3463170_371-3221-0_sp1.1 from training. Duration: 0.8181875 2023-10-04 03:42:38,189 WARNING [train.py:1204] (2/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_557-324258-0 from training. Duration: 0.81 2023-10-04 03:42:40,814 WARNING [train.py:1204] (2/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_1012-279473-0_sp1.1 from training. Duration: 0.6090625 2023-10-04 03:42:40,895 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_66916_210_14656_1_1535725525_326566_7-92649-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 03:42:41,163 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.bypass.skip_rate, batch_count=1513066.6666666667, ans=0.04949747468305833 2023-10-04 03:42:44,241 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_9460_210_4211_1_1528954587_10049667_480-260269-0_sp1.1 from training. Duration: 0.5090625 2023-10-04 03:42:48,284 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_40896_210_15710_1_1533433956_7100802_121-155001-0_sp1.1 from training. Duration: 0.92725 2023-10-04 03:42:49,761 WARNING [train.py:1204] (2/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_188-171673-0_sp1.1 from training. Duration: 0.52725 2023-10-04 03:42:51,029 WARNING [train.py:1204] (2/4) Exclude cut with ID _47790_210_16187_1_1533862814066_4207519_2-39653-0 from training. Duration: 0.98 2023-10-04 03:42:57,511 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_23850_210_4238_1_1532232597_1632433_112-229012-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 03:42:58,940 WARNING [train.py:1204] (2/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_626-79528-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 03:43:00,347 WARNING [train.py:1204] (2/4) Exclude cut with ID _30304_210_6319_1_1532476827261_7582060_856-90052-0_sp1.1 from training. Duration: 0.963625 2023-10-04 03:43:00,393 WARNING [train.py:1204] (2/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_122-109023-0_sp0.9 from training. Duration: 0.8444375 2023-10-04 03:43:00,642 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=1513200.0, ans=0.1 2023-10-04 03:43:03,081 WARNING [train.py:1204] (2/4) Exclude cut with ID _64414_210_17301_1_1535243252048_3757759_37-70328-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 03:43:04,492 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_20120_210_6753_1_1531643035_5482859_622-344907-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 03:43:04,540 WARNING [train.py:1204] (2/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_410-321956-0_sp1.1 from training. Duration: 0.92725 2023-10-04 03:43:04,557 WARNING [train.py:1204] (2/4) Exclude cut with ID _7562_210_2084_1_1528887767916_3946229_186-326422-0_sp0.9 from training. Duration: 0.9333125 2023-10-04 03:43:05,890 WARNING [train.py:1204] (2/4) Exclude cut with ID _37537_210_2253_1_1533171758568_3421949_127-285192-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 03:43:08,654 WARNING [train.py:1204] (2/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_723-228168-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 03:43:09,976 WARNING [train.py:1204] (2/4) Exclude cut with ID _13678_210_7497_1_1530530069226_3463170_426-246984-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 03:43:09,991 WARNING [train.py:1204] (2/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_399-156791-0_sp0.9 from training. Duration: 0.92225 2023-10-04 03:43:10,059 WARNING [train.py:1204] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_391-22148-0 from training. Duration: 0.77 2023-10-04 03:43:10,094 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_3475_210_2028_1_1526193578_3622569_64-297072-0 from training. Duration: 0.91 2023-10-04 03:43:11,438 WARNING [train.py:1204] (2/4) Exclude cut with ID _17875_210_2087_1_1531224012907_1821339_225-280015-0_sp1.1 from training. Duration: 0.963625 2023-10-04 03:43:11,461 WARNING [train.py:1204] (2/4) Exclude cut with ID _50655_210_18628_1_1534229942193_3772154_325-246169-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 03:43:14,779 WARNING [train.py:1204] (2/4) Exclude cut with ID _29529_210_7234_1_1532393961455_3977460_597-257348-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 03:43:14,815 WARNING [train.py:1204] (2/4) Exclude cut with ID _58895_210_8750_1_1535184081320_3649089_77-173485-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 03:43:16,132 WARNING [train.py:1204] (2/4) Exclude cut with ID _6270_210_2084_1_1527850911573_3856420_26-277343-0 from training. Duration: 0.82 2023-10-04 03:43:18,847 WARNING [train.py:1204] (2/4) Exclude cut with ID _14368_210_4220_1_1530532714870_3595529_144-3287-0 from training. Duration: 0.67 2023-10-04 03:43:20,226 WARNING [train.py:1204] (2/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_293-145278-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 03:43:22,125 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_40047_210_2290_1_1533290519_2374311_257-335162-0 from training. Duration: 0.95 2023-10-04 03:43:25,606 WARNING [train.py:1204] (2/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_619-344499-0_sp0.9 from training. Duration: 0.7 2023-10-04 03:43:29,817 WARNING [train.py:1204] (2/4) Exclude cut with ID _19504_210_9994_1_1531648863242_3325220_456-315379-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 03:43:31,366 WARNING [train.py:1204] (2/4) Exclude cut with ID _55857_210_2346_1_1534644007831_6898209_631-221692-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 03:43:32,961 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.0.feed_forward3.hidden_balancer.prob, batch_count=1513333.3333333333, ans=0.125 2023-10-04 03:43:35,374 WARNING [train.py:1204] (2/4) Exclude cut with ID _64631_210_27767_1_1535349580068_4165160_324-3682-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 03:43:35,407 WARNING [train.py:1204] (2/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_317-211107-0 from training. Duration: 0.92 2023-10-04 03:43:36,916 WARNING [train.py:1204] (2/4) Exclude cut with ID _6789_210_1534_1_1527857937218_3118950_311-61262-0 from training. Duration: 0.65 2023-10-04 03:43:38,217 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.597e+02 1.999e+02 2.160e+02 2.464e+02 3.825e+02, threshold=4.321e+02, percent-clipped=0.0 2023-10-04 03:43:39,682 WARNING [train.py:1204] (2/4) Exclude cut with ID _28323_210_16187_1_1532480395667_4100300_365-68882-0_sp1.1 from training. Duration: 0.92725 2023-10-04 03:43:39,723 WARNING [train.py:1204] (2/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_816-181825-0_sp1.1 from training. Duration: 0.92725 2023-10-04 03:43:39,899 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.feed_forward2.hidden_balancer.prob, batch_count=1513333.3333333333, ans=0.125 2023-10-04 03:43:42,495 WARNING [train.py:1204] (2/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_132-269633-0_sp1.1 from training. Duration: 0.6181875 2023-10-04 03:43:42,506 WARNING [train.py:1204] (2/4) Exclude cut with ID _44585_210_18628_1_1533605350387_4518199_280-73534-0_sp1.1 from training. Duration: 0.8090625 2023-10-04 03:43:44,542 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_68095_210_20185_1_1535718226_8888855_1009-164755-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 03:43:44,622 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_40223_210_6228_1_1533298404_4812267_400-103813-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 03:43:44,623 WARNING [train.py:1204] (2/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_491-261923-0_sp1.1 from training. Duration: 0.8909375 2023-10-04 03:43:44,628 WARNING [train.py:1204] (2/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_712-342563-0 from training. Duration: 0.97 2023-10-04 03:43:45,935 INFO [train.py:1046] (2/4) Epoch 43, batch 3900, loss[loss=0.1477, simple_loss=0.239, pruned_loss=0.02824, over 24433.00 frames. ], tot_loss[loss=0.1551, simple_loss=0.2351, pruned_loss=0.03756, over 4722931.32 frames. ], batch size: 69, lr: 2.36e-03, grad_scale: 8.0 2023-10-04 03:43:46,027 WARNING [train.py:1204] (2/4) Exclude cut with ID _41161_210_20034_1_1533431249657_3555314_175-190032-0_sp1.1 from training. Duration: 0.963625 2023-10-04 03:43:47,399 WARNING [train.py:1204] (2/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_788-298907-0 from training. Duration: 0.89 2023-10-04 03:43:47,552 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.bypass.scale_min, batch_count=1513400.0, ans=0.2 2023-10-04 03:43:48,720 WARNING [train.py:1204] (2/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_225-70961-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 03:43:48,725 WARNING [train.py:1204] (2/4) Exclude cut with ID _58583_210_5236_1_1534726835266_3574249_235-175994-0_sp1.1 from training. Duration: 0.92725 2023-10-04 03:43:50,092 WARNING [train.py:1204] (2/4) Exclude cut with ID _12302_210_5219_1_1529924446466_8107280_907-182949-0_sp1.1 from training. Duration: 0.836375 2023-10-04 03:43:50,136 WARNING [train.py:1204] (2/4) Exclude cut with ID _54871_210_22356_1_1534665798253_3856359_94-193692-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 03:43:51,476 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_4528_210_3273_1_1527076654_3276777_342-168068-0_sp0.9 from training. Duration: 0.9666875 2023-10-04 03:43:53,470 WARNING [train.py:1204] (2/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_368-74239-0_sp1.1 from training. Duration: 0.92725 2023-10-04 03:43:53,471 WARNING [train.py:1204] (2/4) Exclude cut with ID _44831_210_13427_1_1533816011549_3689035_291-295928-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 03:43:53,543 WARNING [train.py:1204] (2/4) Exclude cut with ID _56406_210_9288_1_1534899618664_3496149_455-131997-0_sp1.1 from training. Duration: 0.97275 2023-10-04 03:43:53,550 WARNING [train.py:1204] (2/4) Exclude cut with ID _6473_210_1741_1_1527901307396_3815380_108-17335-0 from training. Duration: 0.65 2023-10-04 03:43:53,601 WARNING [train.py:1204] (2/4) Exclude cut with ID _14224_210_4381_1_1530529062037_4264010_286-135600-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 03:43:57,020 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_14056_210_4163_1_1530604483_7572827_928-321675-0_sp1.1 from training. Duration: 0.936375 2023-10-04 03:43:58,359 WARNING [train.py:1204] (2/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_81-300212-0_sp1.1 from training. Duration: 0.5454375 2023-10-04 03:43:58,403 WARNING [train.py:1204] (2/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_29-280622-0_sp0.9 from training. Duration: 0.911125 2023-10-04 03:43:58,615 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.conv_module1.balancer1.prob, batch_count=1513400.0, ans=0.125 2023-10-04 03:43:59,781 WARNING [train.py:1204] (2/4) Exclude cut with ID _61079_210_4525_1_1534921008198_4011419_228-176447-0_sp1.1 from training. Duration: 0.936375 2023-10-04 03:44:01,980 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.0.layers.0.self_attn2.whiten, num_groups=1, num_channels=192, metric=12.33 vs. limit=22.5 2023-10-04 03:44:02,490 WARNING [train.py:1204] (2/4) Exclude cut with ID _8394_210_6702_1_1528590504316_3540189_372-198878-0_sp1.1 from training. Duration: 0.5454375 2023-10-04 03:44:02,501 WARNING [train.py:1204] (2/4) Exclude cut with ID _64521_210_14699_1_1535243427259_3815328_244-208725-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 03:44:03,883 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8275_210_4198_1_1528455320_3892571_491-17707-0_sp0.9 from training. Duration: 0.711125 2023-10-04 03:44:06,505 WARNING [train.py:1204] (2/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_375-245677-0 from training. Duration: 0.81 2023-10-04 03:44:06,512 WARNING [train.py:1204] (2/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_690-286055-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 03:44:07,969 WARNING [train.py:1204] (2/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_89-321955-0 from training. Duration: 0.8 2023-10-04 03:44:08,008 WARNING [train.py:1204] (2/4) Exclude cut with ID _18895_210_6228_1_1531357274234_3784930_213-98721-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 03:44:09,344 WARNING [train.py:1204] (2/4) Exclude cut with ID _19708_210_9206_1_1532136650733_4238364_347-65240-0 from training. Duration: 0.97 2023-10-04 03:44:10,751 WARNING [train.py:1204] (2/4) Exclude cut with ID _64135_210_2253_1_1535677267876_7608972_498-5039-0 from training. Duration: 0.78 2023-10-04 03:44:15,547 WARNING [train.py:1204] (2/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_146-276944-0_sp1.1 from training. Duration: 0.7545625 2023-10-04 03:44:15,625 WARNING [train.py:1204] (2/4) Exclude cut with ID _63171_210_15488_1_1535104792794_3295110_155-34488-0_sp1.1 from training. Duration: 0.97275 2023-10-04 03:44:15,644 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_5438_210_1794_1_1527336687_2600009_233-58072-0_sp0.9 from training. Duration: 0.8555625 2023-10-04 03:44:17,578 WARNING [train.py:1204] (2/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_63-101996-0_sp0.9 from training. Duration: 0.97775 2023-10-04 03:44:21,612 WARNING [train.py:1204] (2/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_271-269084-0_sp1.1 from training. Duration: 0.7545625 2023-10-04 03:44:24,896 WARNING [train.py:1204] (2/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_345-167986-0_sp1.1 from training. Duration: 0.763625 2023-10-04 03:44:26,843 WARNING [train.py:1204] (2/4) Exclude cut with ID _39290_210_4220_1_1533862778760_3075118_92-311600-0_sp1.1 from training. Duration: 0.8 2023-10-04 03:44:26,856 WARNING [train.py:1204] (2/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_209-65306-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 03:44:28,207 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_26433_210_12108_1_1532157158_4056480_317-335807-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 03:44:29,709 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.0.bypass_mid.scale_min, batch_count=1513600.0, ans=0.2 2023-10-04 03:44:32,446 WARNING [train.py:1204] (2/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_238-94127-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 03:44:32,490 WARNING [train.py:1204] (2/4) Exclude cut with ID _41314_210_12651_1_1533459476543_3766460_207-132038-0_sp1.1 from training. Duration: 0.8545625 2023-10-04 03:44:39,484 WARNING [train.py:1204] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_167-137255-0_sp1.1 from training. Duration: 0.5818125 2023-10-04 03:44:40,843 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_2009_210_2071_1_1525481134_4588937_385-302100-0_sp0.9 from training. Duration: 0.9666875 2023-10-04 03:44:42,355 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.feed_forward3.hidden_balancer.prob, batch_count=1513600.0, ans=0.125 2023-10-04 03:44:51,674 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_15517_210_6753_1_1530954008_3487336_253-110617-0_sp1.1 from training. Duration: 0.936375 2023-10-04 03:44:54,405 WARNING [train.py:1204] (2/4) Exclude cut with ID _39290_210_4220_1_1533862778760_3075118_92-311600-0_sp0.9 from training. Duration: 0.97775 2023-10-04 03:44:54,837 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.5.encoder.layers.1.self_attn2.whiten, num_groups=1, num_channels=256, metric=12.55 vs. limit=22.5 2023-10-04 03:44:56,155 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_42-142473-0 from training. Duration: 0.72 2023-10-04 03:44:56,196 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_2296_210_2087_1_1525503448_3397335_57-185430-0 from training. Duration: 0.6 2023-10-04 03:44:56,207 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_11305_210_1794_1_1529750428_7178148_257-312870-0_sp0.9 from training. Duration: 0.97775 2023-10-04 03:44:59,325 WARNING [train.py:1204] (2/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_222-198127-0 from training. Duration: 0.84 2023-10-04 03:44:59,429 WARNING [train.py:1204] (2/4) Exclude cut with ID _65263_210_22356_1_1535763598691_3921037_423-202699-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 03:45:00,797 INFO [train.py:1046] (2/4) Epoch 43, batch 3950, loss[loss=0.1589, simple_loss=0.2426, pruned_loss=0.03766, over 23417.00 frames. ], tot_loss[loss=0.1542, simple_loss=0.2342, pruned_loss=0.03714, over 4709416.08 frames. ], batch size: 93, lr: 2.36e-03, grad_scale: 8.0 2023-10-04 03:45:00,847 WARNING [train.py:1204] (2/4) Exclude cut with ID _15037_210_2253_1_1530752294104_3267650_13-130361-0 from training. Duration: 0.99 2023-10-04 03:45:02,837 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.4.encoder.layers.2.feed_forward2.out_whiten, num_groups=1, num_channels=384, metric=10.15 vs. limit=15.0 2023-10-04 03:45:05,162 WARNING [train.py:1204] (2/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_628-198080-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 03:45:06,447 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_3171_210_2087_1_1526109084_2330007_37-64334-0 from training. Duration: 0.82 2023-10-04 03:45:07,894 WARNING [train.py:1204] (2/4) Exclude cut with ID _5287_210_2084_1_1527418883040_3878670_12-221186-0_sp1.1 from training. Duration: 0.8909375 2023-10-04 03:45:10,608 WARNING [train.py:1204] (2/4) Exclude cut with ID _6653_210_2629_1_1527919172441_6732679_1103-56445-0_sp1.1 from training. Duration: 0.663625 2023-10-04 03:45:12,044 WARNING [train.py:1204] (2/4) Exclude cut with ID _65263_210_22356_1_1535763598691_3921037_169-140068-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 03:45:16,167 WARNING [train.py:1204] (2/4) Exclude cut with ID IC0671W0118-331961 from training. Duration: 0.896 2023-10-04 03:45:16,228 WARNING [train.py:1204] (2/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_402-102311-0_sp1.1 from training. Duration: 0.6090625 2023-10-04 03:45:16,255 WARNING [train.py:1204] (2/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_202-293523-0 from training. Duration: 0.72 2023-10-04 03:45:18,752 WARNING [train.py:1204] (2/4) Exclude cut with ID IC0679W0038-335846 from training. Duration: 0.768 2023-10-04 03:45:18,775 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_37357_210_17024_1_1533089504_2316121_79-198866-0_sp1.1 from training. Duration: 0.936375 2023-10-04 03:45:21,271 WARNING [train.py:1204] (2/4) Exclude cut with ID _7606_210_4915_1_1528894253277_5080690_292-271285-0_sp1.1 from training. Duration: 0.963625 2023-10-04 03:45:22,539 WARNING [train.py:1204] (2/4) Exclude cut with ID _3541_210_2477_1_1526475408709_3263973_377-296839-0_sp0.9 from training. Duration: 0.611125 2023-10-04 03:45:22,542 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_45886_210_17024_1_1533704917_2435333_58-131069-0_sp1.1 from training. Duration: 0.963625 2023-10-04 03:45:25,942 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6961_210_2087_1_1527937168_6815011_395-185060-0 from training. Duration: 0.95 2023-10-04 03:45:26,172 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.feed_forward2.hidden_balancer.prob, batch_count=1513800.0, ans=0.125 2023-10-04 03:45:27,408 WARNING [train.py:1204] (2/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_304-266479-0_sp1.1 from training. Duration: 0.97275 2023-10-04 03:45:29,182 WARNING [train.py:1204] (2/4) Exclude cut with ID _3867_210_1883_1_1526641200575_4475830_319-349850-0_sp1.1 from training. Duration: 0.6090625 2023-10-04 03:45:29,191 WARNING [train.py:1204] (2/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_304-59227-0_sp0.9 from training. Duration: 0.8555625 2023-10-04 03:45:30,574 WARNING [train.py:1204] (2/4) Exclude cut with ID _8021_210_7994_1_1528376451358_3653069_100-140231-0_sp0.9 from training. Duration: 0.7444375 2023-10-04 03:45:30,619 WARNING [train.py:1204] (2/4) Exclude cut with ID _8833_210_5219_1_1528592665210_7553059_329-125156-0_sp0.9 from training. Duration: 0.911125 2023-10-04 03:45:37,802 WARNING [train.py:1204] (2/4) Exclude cut with ID _14459_210_2228_1_1531443641946_7516700_792-287234-0_sp1.1 from training. Duration: 0.92725 2023-10-04 03:45:37,834 WARNING [train.py:1204] (2/4) Exclude cut with ID _19708_210_9206_1_1532136650733_4238364_347-65240-0_sp1.1 from training. Duration: 0.8818125 2023-10-04 03:45:43,223 WARNING [train.py:1204] (2/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_70-27732-0 from training. Duration: 0.94 2023-10-04 03:45:47,509 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.feed_forward3.hidden_balancer.prob, batch_count=1513933.3333333333, ans=0.125 2023-10-04 03:45:49,794 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_397-132565-0 from training. Duration: 0.99 2023-10-04 03:45:49,797 WARNING [train.py:1204] (2/4) Exclude cut with ID _49728_210_12651_1_1534041034294_6788150_624-195301-0 from training. Duration: 0.85 2023-10-04 03:45:49,833 WARNING [train.py:1204] (2/4) Exclude cut with ID _55429_210_10626_1_1534467588527_7174143_377-169391-0_sp1.1 from training. Duration: 0.87275 2023-10-04 03:45:50,671 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.0.layers.1.whiten, num_groups=1, num_channels=192, metric=3.70 vs. limit=12.0 2023-10-04 03:45:51,220 WARNING [train.py:1204] (2/4) Exclude cut with ID _49376_210_5986_1_1533985278732_3370740_354-344695-0_sp1.1 from training. Duration: 0.9 2023-10-04 03:45:58,780 WARNING [train.py:1204] (2/4) Exclude cut with ID _4697_210_2069_1_1526815801953_3823098_276-96593-0_sp1.1 from training. Duration: 0.7 2023-10-04 03:46:00,658 WARNING [train.py:1204] (2/4) Exclude cut with ID _15012_210_1968_1_1530856820429_5095350_688-178849-0_sp0.9 from training. Duration: 0.711125 2023-10-04 03:46:00,703 WARNING [train.py:1204] (2/4) Exclude cut with ID _25565_210_12392_1_1532423009382_3952098_372-69023-0_sp1.1 from training. Duration: 0.936375 2023-10-04 03:46:00,736 WARNING [train.py:1204] (2/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_61-327062-0_sp1.1 from training. Duration: 0.736375 2023-10-04 03:46:00,764 WARNING [train.py:1204] (2/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_215-141059-0 from training. Duration: 0.62 2023-10-04 03:46:07,397 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.574e+02 2.005e+02 2.254e+02 2.527e+02 3.756e+02, threshold=4.508e+02, percent-clipped=0.0 2023-10-04 03:46:07,505 WARNING [train.py:1204] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_6-226750-0_sp1.1 from training. Duration: 0.8545625 2023-10-04 03:46:08,973 WARNING [train.py:1204] (2/4) Exclude cut with ID _14348_210_8341_1_1530599434996_3778640_240-232370-0_sp1.1 from training. Duration: 0.763625 2023-10-04 03:46:11,812 WARNING [train.py:1204] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_468-189058-0 from training. Duration: 0.96 2023-10-04 03:46:14,417 INFO [train.py:1046] (2/4) Epoch 43, batch 4000, loss[loss=0.1363, simple_loss=0.2171, pruned_loss=0.0277, over 24557.00 frames. ], tot_loss[loss=0.1548, simple_loss=0.2347, pruned_loss=0.03747, over 4707109.15 frames. ], batch size: 60, lr: 2.36e-03, grad_scale: 16.0 2023-10-04 03:46:18,853 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.1.encoder.layers.0.self_attn1.whiten, num_groups=1, num_channels=256, metric=13.39 vs. limit=22.5 2023-10-04 03:46:20,807 WARNING [train.py:1204] (2/4) Exclude cut with ID _21987_210_5854_1_1531821500482_3637038_247-93340-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 03:46:28,378 WARNING [train.py:1204] (2/4) Exclude cut with ID _66868_210_9527_1_1535509287954_4561140_436-191602-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 03:46:33,054 WARNING [train.py:1204] (2/4) Exclude cut with ID _18895_210_6228_1_1531357274234_3784930_394-58203-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 03:46:33,103 WARNING [train.py:1204] (2/4) Exclude cut with ID _58605_210_12210_1_1534723195939_3904259_258-281498-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 03:46:33,138 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_33348_210_5632_1_1533086912_3324030_245-292752-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 03:46:34,433 WARNING [train.py:1204] (2/4) Exclude cut with ID _67831_210_12392_1_1535612464202_3092612_346-2148-0 from training. Duration: 0.93 2023-10-04 03:46:34,501 WARNING [train.py:1204] (2/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_369-222060-0_sp0.9 from training. Duration: 0.788875 2023-10-04 03:46:35,813 WARNING [train.py:1204] (2/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_245-174553-0 from training. Duration: 0.88 2023-10-04 03:46:35,818 WARNING [train.py:1204] (2/4) Exclude cut with ID _3541_210_2477_1_1526475408709_3263973_231-104736-0_sp1.1 from training. Duration: 0.7090625 2023-10-04 03:46:35,824 WARNING [train.py:1204] (2/4) Exclude cut with ID _44585_210_18628_1_1533605350387_4518199_280-73534-0 from training. Duration: 0.89 2023-10-04 03:46:38,562 WARNING [train.py:1204] (2/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_22-201476-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 03:46:41,422 WARNING [train.py:1204] (2/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_325-62802-0_sp0.9 from training. Duration: 0.9666875 2023-10-04 03:46:41,435 WARNING [train.py:1204] (2/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_199-274062-0_sp1.1 from training. Duration: 0.97275 2023-10-04 03:46:41,439 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_8-319957-0_sp1.1 from training. Duration: 0.87275 2023-10-04 03:46:41,463 WARNING [train.py:1204] (2/4) Exclude cut with ID _29343_210_4381_1_1532602757752_3674109_224-349726-0_sp1.1 from training. Duration: 0.936375 2023-10-04 03:46:41,471 WARNING [train.py:1204] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_520-126729-0_sp1.1 from training. Duration: 0.636375 2023-10-04 03:46:42,915 WARNING [train.py:1204] (2/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_239-22076-0_sp1.1 from training. Duration: 0.77275 2023-10-04 03:46:44,259 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9012W0011-501146 from training. Duration: 0.896 2023-10-04 03:46:45,597 WARNING [train.py:1204] (2/4) Exclude cut with ID _29857_210_20034_1_1532395886753_3798160_279-185801-0_sp1.1 from training. Duration: 0.82725 2023-10-04 03:46:45,645 WARNING [train.py:1204] (2/4) Exclude cut with ID _43552_210_8913_1_1533726031059_3523429_261-223933-0_sp1.1 from training. Duration: 0.92725 2023-10-04 03:46:50,373 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9029W0025-509426 from training. Duration: 0.896 2023-10-04 03:46:50,462 WARNING [train.py:1204] (2/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_337-181502-0_sp1.1 from training. Duration: 0.5909375 2023-10-04 03:46:52,369 WARNING [train.py:1204] (2/4) Exclude cut with ID _68635_210_9317_1_1535770813410_4315870_159-332599-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 03:46:56,582 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_161-276996-0 from training. Duration: 0.95 2023-10-04 03:46:57,922 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7088_210_1874_1_1527939776_1101113_127-68870-0_sp1.1 from training. Duration: 0.936375 2023-10-04 03:47:00,608 WARNING [train.py:1204] (2/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_322-217220-0_sp1.1 from training. Duration: 0.7818125 2023-10-04 03:47:00,669 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9072W0002-530636 from training. Duration: 0.768 2023-10-04 03:47:03,854 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_340-336408-0_sp1.1 from training. Duration: 0.8454375 2023-10-04 03:47:03,913 WARNING [train.py:1204] (2/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_395-37222-0 from training. Duration: 0.76 2023-10-04 03:47:03,916 WARNING [train.py:1204] (2/4) Exclude cut with ID _64099_210_17301_1_1535526977656_1578188_159-209156-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 03:47:03,986 WARNING [train.py:1204] (2/4) Exclude cut with ID _53950_210_5816_1_1534417264919_3862951_28-29681-0_sp1.1 from training. Duration: 0.92725 2023-10-04 03:47:05,917 WARNING [train.py:1204] (2/4) Exclude cut with ID _4681_210_3525_1_1527250689464_120889_15-126760-0_sp1.1 from training. Duration: 0.736375 2023-10-04 03:47:07,340 WARNING [train.py:1204] (2/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_242-3472-0_sp1.1 from training. Duration: 0.663625 2023-10-04 03:47:08,684 WARNING [train.py:1204] (2/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_436-186311-0_sp0.9 from training. Duration: 0.72225 2023-10-04 03:47:08,713 WARNING [train.py:1204] (2/4) Exclude cut with ID _3675_210_4557_1_1526639934408_4182089_452-172147-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 03:47:11,346 WARNING [train.py:1204] (2/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_80-147301-0 from training. Duration: 0.84 2023-10-04 03:47:11,385 WARNING [train.py:1204] (2/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_351-107588-0_sp1.1 from training. Duration: 0.92725 2023-10-04 03:47:12,834 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9111W0002-550781 from training. Duration: 0.896 2023-10-04 03:47:16,947 WARNING [train.py:1204] (2/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_465-156500-0_sp1.1 from training. Duration: 0.6545625 2023-10-04 03:47:20,140 WARNING [train.py:1204] (2/4) Exclude cut with ID _13938_210_9988_1_1530518718978_3693960_304-112784-0_sp1.1 from training. Duration: 0.4181875 2023-10-04 03:47:21,524 WARNING [train.py:1204] (2/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_202-67017-0_sp1.1 from training. Duration: 0.7454375 2023-10-04 03:47:21,578 WARNING [train.py:1204] (2/4) Exclude cut with ID _58414_210_8254_1_1535421609162_7151182_448-4358-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 03:47:23,561 WARNING [train.py:1204] (2/4) Exclude cut with ID _66840_210_5219_1_1535452168426_4196349_203-243234-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 03:47:23,643 WARNING [train.py:1204] (2/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_201-213513-0_sp1.1 from training. Duration: 0.97275 2023-10-04 03:47:29,560 INFO [train.py:1046] (2/4) Epoch 43, batch 4050, loss[loss=0.1406, simple_loss=0.2192, pruned_loss=0.03101, over 21092.00 frames. ], tot_loss[loss=0.1551, simple_loss=0.2352, pruned_loss=0.03752, over 4707682.06 frames. ], batch size: 46, lr: 2.36e-03, grad_scale: 16.0 2023-10-04 03:47:29,613 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6291_210_3273_1_1527683649_1078498_117-187560-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 03:47:32,359 WARNING [train.py:1204] (2/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_135-174080-0_sp0.9 from training. Duration: 0.588875 2023-10-04 03:47:32,622 INFO [scaling.py:1118] (2/4) WithLoss: name=encoder.encoders.2.encoder.layers.1.self_attn_weights, loss-sum=0.000e+00 2023-10-04 03:47:33,777 WARNING [train.py:1204] (2/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_45-265684-0 from training. Duration: 0.72 2023-10-04 03:47:35,240 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_435-313305-0_sp1.1 from training. Duration: 0.6454375 2023-10-04 03:47:35,269 WARNING [train.py:1204] (2/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_235-307337-0_sp1.1 from training. Duration: 0.8545625 2023-10-04 03:47:36,617 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7762_210_1794_1_1528199508_4057408_419-125009-0_sp0.9 from training. Duration: 0.92225 2023-10-04 03:47:37,978 WARNING [train.py:1204] (2/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_215-141059-0_sp1.1 from training. Duration: 0.563625 2023-10-04 03:47:39,908 WARNING [train.py:1204] (2/4) Exclude cut with ID _27013_210_1389_1_1532437173240_3252649_222-125573-0_sp1.1 from training. Duration: 0.8909375 2023-10-04 03:47:42,686 WARNING [train.py:1204] (2/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_126-274694-0_sp1.1 from training. Duration: 0.8909375 2023-10-04 03:47:46,675 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_2592_210_2290_1_1525697025_4882925_157-70512-0_sp1.1 from training. Duration: 0.82725 2023-10-04 03:47:47,964 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_292-265928-0_sp0.9 from training. Duration: 0.5666875 2023-10-04 03:47:49,383 WARNING [train.py:1204] (2/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_1211-226066-0_sp1.1 from training. Duration: 0.6909375 2023-10-04 03:47:49,435 WARNING [train.py:1204] (2/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_394-265747-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 03:47:51,003 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.ff2_skip_rate, batch_count=1514466.6666666667, ans=0.0 2023-10-04 03:47:54,188 WARNING [train.py:1204] (2/4) Exclude cut with ID _48782_210_12658_1_1534467701544_2300422_239-67139-0_sp1.1 from training. Duration: 0.97275 2023-10-04 03:47:55,808 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=1514466.6666666667, ans=0.1 2023-10-04 03:47:57,019 WARNING [train.py:1204] (2/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_329-113173-0_sp1.1 from training. Duration: 0.563625 2023-10-04 03:48:00,455 WARNING [train.py:1204] (2/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_81-75849-0_sp0.9 from training. Duration: 0.4333125 2023-10-04 03:48:01,935 WARNING [train.py:1204] (2/4) Exclude cut with ID _9235_210_5219_1_1528891154253_8046120_1003-101638-0 from training. Duration: 0.73 2023-10-04 03:48:01,957 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9309W0189-640316 from training. Duration: 0.64 2023-10-04 03:48:02,223 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.nonlin_attention.balancer.prob, batch_count=1514533.3333333333, ans=0.125 2023-10-04 03:48:05,227 WARNING [train.py:1204] (2/4) Exclude cut with ID _45329_210_7005_1_1534157951211_6774339_503-287883-0_sp1.1 from training. Duration: 0.8 2023-10-04 03:48:08,133 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.conv_skip_rate, batch_count=1514533.3333333333, ans=0.0 2023-10-04 03:48:12,090 WARNING [train.py:1204] (2/4) Exclude cut with ID _8833_210_5219_1_1528592665210_7553059_611-80313-0 from training. Duration: 0.64 2023-10-04 03:48:12,160 WARNING [train.py:1204] (2/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_335-106626-0_sp1.1 from training. Duration: 0.963625 2023-10-04 03:48:12,341 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.nonlin_attention.balancer.prob, batch_count=1514600.0, ans=0.125 2023-10-04 03:48:14,225 WARNING [train.py:1204] (2/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_237-44332-0_sp1.1 from training. Duration: 0.8545625 2023-10-04 03:48:15,699 WARNING [train.py:1204] (2/4) Exclude cut with ID _46952_210_5986_1_1534230010357_3107379_178-147341-0_sp1.1 from training. Duration: 0.97275 2023-10-04 03:48:17,047 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_13091_210_7925_1_1530609447_8987354_946-154426-0_sp1.1 from training. Duration: 0.936375 2023-10-04 03:48:17,070 WARNING [train.py:1204] (2/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_326-17061-0_sp1.1 from training. Duration: 0.8545625 2023-10-04 03:48:21,107 WARNING [train.py:1204] (2/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_38-169920-0_sp1.1 from training. Duration: 0.82725 2023-10-04 03:48:24,340 WARNING [train.py:1204] (2/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_283-220372-0 from training. Duration: 0.56 2023-10-04 03:48:24,349 WARNING [train.py:1204] (2/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_252-216674-0_sp1.1 from training. Duration: 0.4909375 2023-10-04 03:48:25,787 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_202-91290-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 03:48:28,582 WARNING [train.py:1204] (2/4) Exclude cut with ID _41314_210_12651_1_1533459476543_3766460_80-74184-0 from training. Duration: 0.83 2023-10-04 03:48:31,965 WARNING [train.py:1204] (2/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_411-271358-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 03:48:36,608 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.589e+02 1.949e+02 2.233e+02 2.507e+02 3.530e+02, threshold=4.466e+02, percent-clipped=0.0 2023-10-04 03:48:39,481 WARNING [train.py:1204] (2/4) Exclude cut with ID _68777_210_4353_1_1535810390731_3117904_129-166012-0 from training. Duration: 0.85 2023-10-04 03:48:39,570 WARNING [train.py:1204] (2/4) Exclude cut with ID _44982_210_1511_1_1534312820108_3726029_410-109759-0_sp1.1 from training. Duration: 0.963625 2023-10-04 03:48:39,574 WARNING [train.py:1204] (2/4) Exclude cut with ID _14224_210_4381_1_1530529062037_4264010_364-22817-0_sp1.1 from training. Duration: 0.6454375 2023-10-04 03:48:39,843 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.bypass.scale_min, batch_count=1514666.6666666667, ans=0.2 2023-10-04 03:48:42,435 WARNING [train.py:1204] (2/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_89-242335-0 from training. Duration: 0.95 2023-10-04 03:48:42,449 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_12868_210_6753_1_1530702479_3610564_151-325626-0 from training. Duration: 0.97 2023-10-04 03:48:42,450 WARNING [train.py:1204] (2/4) Exclude cut with ID _6275_210_2084_1_1528110062213_7669049_786-264332-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 03:48:43,718 INFO [train.py:1046] (2/4) Epoch 43, batch 4100, loss[loss=0.1555, simple_loss=0.2492, pruned_loss=0.03093, over 24323.00 frames. ], tot_loss[loss=0.1556, simple_loss=0.2361, pruned_loss=0.03756, over 4709523.76 frames. ], batch size: 74, lr: 2.36e-03, grad_scale: 16.0 2023-10-04 03:48:45,639 WARNING [train.py:1204] (2/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_35-153886-0_sp1.1 from training. Duration: 0.763625 2023-10-04 03:48:47,028 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_24007_210_1794_1_1531918925_1663875_116-100916-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 03:48:47,042 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_162-262717-0_sp1.1 from training. Duration: 0.7545625 2023-10-04 03:48:53,990 WARNING [train.py:1204] (2/4) Exclude cut with ID _8263_210_1664_1_1528438953494_294100_40-204731-0 from training. Duration: 0.5 2023-10-04 03:48:55,898 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_12868_210_6753_1_1530702479_3610564_416-293746-0 from training. Duration: 0.9 2023-10-04 03:48:57,347 WARNING [train.py:1204] (2/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_605-219732-0 from training. Duration: 0.94 2023-10-04 03:48:58,670 WARNING [train.py:1204] (2/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_454-123002-0 from training. Duration: 0.69 2023-10-04 03:48:58,676 WARNING [train.py:1204] (2/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_25-304273-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 03:48:58,734 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6860_210_4852_1_1527917457_3833327_451-39702-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 03:48:58,758 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_304-16505-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 03:48:58,777 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6291_210_3273_1_1527683649_1078498_4-266348-0_sp1.1 from training. Duration: 0.7090625 2023-10-04 03:48:58,846 WARNING [train.py:1204] (2/4) Exclude cut with ID ID0149W0001-753701 from training. Duration: 0.989 2023-10-04 03:49:02,354 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_41251_210_15368_1_1533429767_4820695_394-55393-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 03:49:02,574 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.balancer1.prob, batch_count=1514800.0, ans=0.125 2023-10-04 03:49:04,367 WARNING [train.py:1204] (2/4) Exclude cut with ID _8793_210_6310_1_1528542985139_2310009_121-179782-0_sp1.1 from training. Duration: 0.72725 2023-10-04 03:49:05,710 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_2729_210_3528_1_1525863505_3882214_421-73047-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 03:49:05,772 WARNING [train.py:1204] (2/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_278-154492-0_sp0.9 from training. Duration: 0.8444375 2023-10-04 03:49:09,645 WARNING [train.py:1204] (2/4) Exclude cut with ID _14170_210_5219_1_1530525949868_4469650_412-317077-0_sp0.9 from training. Duration: 0.8333125 2023-10-04 03:49:09,730 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_727-138317-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 03:49:11,064 WARNING [train.py:1204] (2/4) Exclude cut with ID _25808_210_13292_1_1532343514562_7366109_345-335048-0_sp1.1 from training. Duration: 0.92725 2023-10-04 03:49:11,079 WARNING [train.py:1204] (2/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_506-23200-0 from training. Duration: 0.97 2023-10-04 03:49:12,363 WARNING [train.py:1204] (2/4) Exclude cut with ID _44718_210_5816_1_1534050051869_6469030_643-245187-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 03:49:12,374 WARNING [train.py:1204] (2/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_42-186162-0_sp1.1 from training. Duration: 0.836375 2023-10-04 03:49:12,385 WARNING [train.py:1204] (2/4) Exclude cut with ID _3231_210_2106_1_1526205864934_6784220_171-256004-0_sp1.1 from training. Duration: 0.7818125 2023-10-04 03:49:12,403 WARNING [train.py:1204] (2/4) Exclude cut with ID _25914_210_6400_1_1532141317058_644740_43-248478-0_sp1.1 from training. Duration: 0.936375 2023-10-04 03:49:12,451 WARNING [train.py:1204] (2/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_577-287150-0 from training. Duration: 0.82 2023-10-04 03:49:15,671 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_46015_210_7234_1_1533863319_3087837_376-276068-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 03:49:15,771 WARNING [train.py:1204] (2/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_51-200220-0 from training. Duration: 0.56 2023-10-04 03:49:18,530 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_452-96101-0_sp1.1 from training. Duration: 0.963625 2023-10-04 03:49:19,916 WARNING [train.py:1204] (2/4) Exclude cut with ID _13252_210_1925_1_1530671372047_3336909_263-168388-0_sp1.1 from training. Duration: 0.7818125 2023-10-04 03:49:19,918 WARNING [train.py:1204] (2/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_469-94673-0 from training. Duration: 0.79 2023-10-04 03:49:21,247 WARNING [train.py:1204] (2/4) Exclude cut with ID _14922_210_6228_1_1530707435273_3693310_303-228738-0_sp1.1 from training. Duration: 0.763625 2023-10-04 03:49:21,310 WARNING [train.py:1204] (2/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_262-250826-0_sp1.1 from training. Duration: 0.9 2023-10-04 03:49:22,545 WARNING [train.py:1204] (2/4) Exclude cut with ID _68777_210_4353_1_1535810390731_3117904_129-166012-0_sp1.1 from training. Duration: 0.77275 2023-10-04 03:49:24,558 WARNING [train.py:1204] (2/4) Exclude cut with ID _58895_210_8750_1_1535184081320_3649089_265-323810-0 from training. Duration: 0.96 2023-10-04 03:49:25,928 WARNING [train.py:1204] (2/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_120-76843-0_sp0.9 from training. Duration: 0.611125 2023-10-04 03:49:25,985 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7544_210_6753_1_1528587722_3576686_295-258479-0_sp0.9 from training. Duration: 0.9333125 2023-10-04 03:49:28,769 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7046_210_6753_1_1527895337_6920917_783-278109-0 from training. Duration: 0.77 2023-10-04 03:49:28,833 WARNING [train.py:1204] (2/4) Exclude cut with ID _42326_210_7770_1_1533814282531_3707269_113-308323-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 03:49:30,695 WARNING [train.py:1204] (2/4) Exclude cut with ID _7852_210_5219_1_1528286471316_8139400_242-105336-0_sp0.9 from training. Duration: 0.8 2023-10-04 03:49:32,332 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=1514933.3333333333, ans=0.1 2023-10-04 03:49:33,542 WARNING [train.py:1204] (2/4) Exclude cut with ID _56235_210_10640_1_1534557335235_4161099_114-277905-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 03:49:33,687 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.self_attn_weights.pos_emb_skip_rate, batch_count=1514933.3333333333, ans=0.0 2023-10-04 03:49:38,141 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_13118_210_7925_1_1530755612_7608297_42-66683-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 03:49:42,275 WARNING [train.py:1204] (2/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_901-79438-0_sp1.1 from training. Duration: 0.8545625 2023-10-04 03:49:42,320 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_30280_210_1881_1_1532872779_3660139_211-182194-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 03:49:42,621 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.feed_forward2.hidden_balancer.prob, batch_count=1515000.0, ans=0.125 2023-10-04 03:49:49,081 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.0.layers.1.nonlin_attention.whiten1, num_groups=1, num_channels=144, metric=5.15 vs. limit=10.0 2023-10-04 03:49:51,043 WARNING [train.py:1204] (2/4) Exclude cut with ID _14898_210_9206_1_1530766812377_4177190_621-45732-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 03:49:52,339 WARNING [train.py:1204] (2/4) Exclude cut with ID _35350_210_16040_1_1533040191183_4075299_224-133775-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 03:49:53,968 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.0.conv_skip_rate, batch_count=1515000.0, ans=0.0 2023-10-04 03:49:55,751 WARNING [train.py:1204] (2/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_147-293311-0_sp1.1 from training. Duration: 0.8545625 2023-10-04 03:49:57,150 WARNING [train.py:1204] (2/4) Exclude cut with ID _12302_210_5219_1_1529924446466_8107280_835-279754-0_sp1.1 from training. Duration: 0.8181875 2023-10-04 03:49:58,563 INFO [train.py:1046] (2/4) Epoch 43, batch 4150, loss[loss=0.1487, simple_loss=0.238, pruned_loss=0.02976, over 24398.00 frames. ], tot_loss[loss=0.1559, simple_loss=0.2367, pruned_loss=0.03755, over 4705214.45 frames. ], batch size: 77, lr: 2.36e-03, grad_scale: 16.0 2023-10-04 03:50:01,325 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8453_210_4211_1_1528502940_8562730_347-330673-0_sp0.9 from training. Duration: 0.8 2023-10-04 03:50:03,280 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_213-103282-0_sp0.9 from training. Duration: 0.8666875 2023-10-04 03:50:04,442 WARNING [train.py:1204] (2/4) Exclude cut with ID _49728_210_12651_1_1534041034294_6788150_66-43496-0_sp1.1 from training. Duration: 0.97275 2023-10-04 03:50:04,444 WARNING [train.py:1204] (2/4) Exclude cut with ID _19023_210_4353_1_1531378892552_5820780_28-36615-0_sp1.1 from training. Duration: 0.8818125 2023-10-04 03:50:04,784 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.conv_module2.balancer2.prob, batch_count=1515066.6666666667, ans=0.125 2023-10-04 03:50:06,411 WARNING [train.py:1204] (2/4) Exclude cut with ID _14349_210_8341_1_1530615710818_3442470_115-285975-0 from training. Duration: 0.86 2023-10-04 03:50:07,868 WARNING [train.py:1204] (2/4) Exclude cut with ID _12734_210_8341_1_1530183614296_3891929_236-47464-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 03:50:09,161 WARNING [train.py:1204] (2/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_177-236809-0 from training. Duration: 0.49 2023-10-04 03:50:09,226 WARNING [train.py:1204] (2/4) Exclude cut with ID _13678_210_7497_1_1530530069226_3463170_371-3221-0 from training. Duration: 0.9 2023-10-04 03:50:09,240 WARNING [train.py:1204] (2/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_48-338976-0 from training. Duration: 0.89 2023-10-04 03:50:10,769 WARNING [train.py:1204] (2/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_1167-309744-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 03:50:14,942 WARNING [train.py:1204] (2/4) Exclude cut with ID _30327_210_16049_1_1532484161295_3924169_401-70819-0_sp0.9 from training. Duration: 0.9555625 2023-10-04 03:50:14,956 WARNING [train.py:1204] (2/4) Exclude cut with ID _55134_210_21982_1_1534590028377_3882170_346-91022-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 03:50:18,146 WARNING [train.py:1204] (2/4) Exclude cut with ID _14459_210_2228_1_1531443641946_7516700_174-118801-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 03:50:18,204 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_269-278006-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 03:50:19,472 WARNING [train.py:1204] (2/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_390-339137-0_sp0.9 from training. Duration: 0.62225 2023-10-04 03:50:21,958 WARNING [train.py:1204] (2/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_253-308377-0_sp1.1 from training. Duration: 0.4454375 2023-10-04 03:50:21,974 WARNING [train.py:1204] (2/4) Exclude cut with ID _23825_210_13449_1_1531915313526_3497150_240-271367-0_sp1.1 from training. Duration: 0.8818125 2023-10-04 03:50:23,302 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1015-343884-0_sp0.9 from training. Duration: 0.688875 2023-10-04 03:50:28,032 WARNING [train.py:1204] (2/4) Exclude cut with ID _23514_210_1389_1_1531832139785_3031439_335-48210-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 03:50:30,981 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_309-65177-0_sp1.1 from training. Duration: 0.8 2023-10-04 03:50:32,896 WARNING [train.py:1204] (2/4) Exclude cut with ID _14368_210_4220_1_1530532714870_3595529_231-137892-0 from training. Duration: 0.99 2023-10-04 03:50:35,544 WARNING [train.py:1204] (2/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_57-270037-0 from training. Duration: 0.77 2023-10-04 03:50:35,549 WARNING [train.py:1204] (2/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_465-214726-0_sp1.1 from training. Duration: 0.7818125 2023-10-04 03:50:37,354 WARNING [train.py:1204] (2/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_106-300985-0 from training. Duration: 0.9 2023-10-04 03:50:37,365 WARNING [train.py:1204] (2/4) Exclude cut with ID _14632_210_9988_1_1530691279392_3923200_161-129530-0_sp1.1 from training. Duration: 0.87275 2023-10-04 03:50:37,377 WARNING [train.py:1204] (2/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_514-88370-0_sp1.1 from training. Duration: 0.963625 2023-10-04 03:50:40,082 WARNING [train.py:1204] (2/4) Exclude cut with ID _56495_210_4234_1_1534748080334_4136219_305-334505-0_sp1.1 from training. Duration: 0.92725 2023-10-04 03:50:41,503 WARNING [train.py:1204] (2/4) Exclude cut with ID _42875_210_14856_1_1533704397487_4012433_61-264914-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 03:50:45,750 WARNING [train.py:1204] (2/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_91-148831-0 from training. Duration: 0.99 2023-10-04 03:50:47,251 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_435-313305-0_sp0.9 from training. Duration: 0.788875 2023-10-04 03:50:48,799 WARNING [train.py:1204] (2/4) Exclude cut with ID _16744_210_5986_1_1531724405384_3907567_242-111316-0_sp1.1 from training. Duration: 0.8454375 2023-10-04 03:50:50,155 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_257-20733-0 from training. Duration: 0.76 2023-10-04 03:50:50,209 WARNING [train.py:1204] (2/4) Exclude cut with ID _8064_210_5399_1_1528372299291_4282973_4-91037-0_sp1.1 from training. Duration: 0.8 2023-10-04 03:50:52,206 WARNING [train.py:1204] (2/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_3-276701-0 from training. Duration: 0.91 2023-10-04 03:50:55,375 WARNING [train.py:1204] (2/4) Exclude cut with ID _3992_210_1747_1_1526644816031_3731060_36-147855-0_sp1.1 from training. Duration: 0.7090625 2023-10-04 03:50:56,696 WARNING [train.py:1204] (2/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_1296-298671-0_sp1.1 from training. Duration: 0.963625 2023-10-04 03:50:56,893 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.conv_module1.balancer2.prob, batch_count=1515333.3333333333, ans=0.125 2023-10-04 03:50:58,155 WARNING [train.py:1204] (2/4) Exclude cut with ID _68965_210_1883_1_1535698962272_3497490_309-334593-0_sp1.1 from training. Duration: 0.92725 2023-10-04 03:50:58,245 WARNING [train.py:1204] (2/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_128-296581-0 from training. Duration: 0.74 2023-10-04 03:50:58,246 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_17613_210_2151_1_1531137569_2366160_170-299079-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 03:50:58,248 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6287_210_3273_1_1527509506_2653716_182-199887-0_sp1.1 from training. Duration: 0.463625 2023-10-04 03:51:00,884 WARNING [train.py:1204] (2/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_80-135200-0_sp1.1 from training. Duration: 0.7181875 2023-10-04 03:51:02,334 WARNING [train.py:1204] (2/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_34-183685-0 from training. Duration: 0.98 2023-10-04 03:51:02,350 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8278_210_2550_1_1528637099_4220413_418-210155-0_sp1.1 from training. Duration: 0.92725 2023-10-04 03:51:04,046 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8853_210_3273_1_1528633899_818968_51-187685-0_sp0.9 from training. Duration: 0.7666875 2023-10-04 03:51:04,061 WARNING [train.py:1204] (2/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_332-221057-0_sp1.1 from training. Duration: 0.6181875 2023-10-04 03:51:04,137 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_2221_210_1988_1_1525597051_4935541_415-296882-0 from training. Duration: 0.77 2023-10-04 03:51:05,317 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.606e+02 2.071e+02 2.291e+02 2.754e+02 5.163e+02, threshold=4.583e+02, percent-clipped=2.0 2023-10-04 03:51:05,383 WARNING [train.py:1204] (2/4) Exclude cut with ID _33640_210_2655_1_1533117605479_4855469_770-278185-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 03:51:05,411 WARNING [train.py:1204] (2/4) Exclude cut with ID _5287_210_2084_1_1527418883040_3878670_333-48508-0_sp1.1 from training. Duration: 0.5454375 2023-10-04 03:51:05,476 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_43802_210_2290_1_1533516203_8829504_142-276839-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 03:51:08,181 WARNING [train.py:1204] (2/4) Exclude cut with ID _28394_210_8913_1_1532430049518_3650118_126-139371-0_sp1.1 from training. Duration: 0.92725 2023-10-04 03:51:08,194 WARNING [train.py:1204] (2/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_76-203867-0 from training. Duration: 0.72 2023-10-04 03:51:08,232 WARNING [train.py:1204] (2/4) Exclude cut with ID _6194_210_1534_1_1527511826160_3941194_14-282876-0_sp0.9 from training. Duration: 0.788875 2023-10-04 03:51:12,976 INFO [train.py:1046] (2/4) Epoch 43, batch 4200, loss[loss=0.1539, simple_loss=0.247, pruned_loss=0.03038, over 24626.00 frames. ], tot_loss[loss=0.1552, simple_loss=0.2355, pruned_loss=0.0374, over 4687039.41 frames. ], batch size: 68, lr: 2.35e-03, grad_scale: 16.0 2023-10-04 03:51:13,068 WARNING [train.py:1204] (2/4) Exclude cut with ID _35355_210_16040_1_1533212988977_4016229_74-190076-0_sp1.1 from training. Duration: 0.72725 2023-10-04 03:51:13,924 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.self_attn1.whiten.whitening_limit, batch_count=1515400.0, ans=22.5 2023-10-04 03:51:14,482 WARNING [train.py:1204] (2/4) Exclude cut with ID _33920_210_4220_1_1533257851500_3084399_198-334063-0 from training. Duration: 0.94 2023-10-04 03:51:17,130 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6291_210_3273_1_1527683649_1078498_4-266348-0_sp0.9 from training. Duration: 0.8666875 2023-10-04 03:51:17,392 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.ff3_skip_rate, batch_count=1515400.0, ans=0.0 2023-10-04 03:51:18,577 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_23856_210_4238_1_1531994785_3087934_122-38075-0_sp1.1 from training. Duration: 0.936375 2023-10-04 03:51:19,929 WARNING [train.py:1204] (2/4) Exclude cut with ID _4697_210_2069_1_1526815801953_3823098_276-96593-0_sp0.9 from training. Duration: 0.8555625 2023-10-04 03:51:19,985 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_308-278734-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 03:51:19,987 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_5190_210_4557_1_1527245078_2965646_353-250603-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 03:51:22,058 WARNING [train.py:1204] (2/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_472-216663-0 from training. Duration: 0.92 2023-10-04 03:51:26,668 WARNING [train.py:1204] (2/4) Exclude cut with ID _7537_210_2084_1_1528502395431_3924359_14-312906-0 from training. Duration: 0.84 2023-10-04 03:51:26,726 WARNING [train.py:1204] (2/4) Exclude cut with ID _29003_210_19596_1_1532507391720_3894098_559-236660-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 03:51:28,199 WARNING [train.py:1204] (2/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_312-204798-0_sp1.1 from training. Duration: 0.8454375 2023-10-04 03:51:32,209 WARNING [train.py:1204] (2/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_17-309070-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 03:51:33,147 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.bypass.skip_rate, batch_count=1515466.6666666667, ans=0.07 2023-10-04 03:51:34,266 WARNING [train.py:1204] (2/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_344-152526-0_sp0.9 from training. Duration: 0.588875 2023-10-04 03:51:35,670 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_41452_210_7291_1_1533437687_3941281_405-11559-0_sp1.1 from training. Duration: 0.82725 2023-10-04 03:51:35,694 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_48730_210_5399_1_1533885886_2212769_157-124052-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 03:51:37,504 WARNING [train.py:1204] (2/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_415-78792-0 from training. Duration: 0.81 2023-10-04 03:51:37,508 WARNING [train.py:1204] (2/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_345-340690-0_sp1.1 from training. Duration: 0.8454375 2023-10-04 03:51:37,729 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.nonlin_attention.balancer.max_positive, batch_count=1515466.6666666667, ans=0.95 2023-10-04 03:51:38,373 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.1.encoder.layers.1.whiten, num_groups=1, num_channels=256, metric=4.25 vs. limit=12.0 2023-10-04 03:51:38,915 WARNING [train.py:1204] (2/4) Exclude cut with ID _23825_210_13449_1_1531915313526_3497150_124-8138-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 03:51:38,960 WARNING [train.py:1204] (2/4) Exclude cut with ID _49258_210_7994_1_1534122017863_3738861_194-24864-0_sp1.1 from training. Duration: 0.936375 2023-10-04 03:51:38,984 WARNING [train.py:1204] (2/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_369-222060-0_sp1.1 from training. Duration: 0.6454375 2023-10-04 03:51:41,652 WARNING [train.py:1204] (2/4) Exclude cut with ID _12369_210_4381_1_1530356078581_4388520_480-13146-0_sp0.9 from training. Duration: 0.9444375 2023-10-04 03:51:43,105 WARNING [train.py:1204] (2/4) Exclude cut with ID _13253_210_1925_1_1530757775819_3577873_191-144977-0 from training. Duration: 0.78 2023-10-04 03:51:44,423 WARNING [train.py:1204] (2/4) Exclude cut with ID _30304_210_6319_1_1532476827261_7582060_1128-140266-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 03:51:47,307 WARNING [train.py:1204] (2/4) Exclude cut with ID _8772_210_1850_1_1528635595972_3664011_247-336448-0_sp0.9 from training. Duration: 0.82225 2023-10-04 03:51:48,635 WARNING [train.py:1204] (2/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_318-59097-0_sp1.1 from training. Duration: 0.6909375 2023-10-04 03:51:50,124 WARNING [train.py:1204] (2/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_540-131164-0_sp1.1 from training. Duration: 0.836375 2023-10-04 03:51:52,027 WARNING [train.py:1204] (2/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_39-110836-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 03:51:54,647 WARNING [train.py:1204] (2/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_860-96198-0_sp1.1 from training. Duration: 0.663625 2023-10-04 03:51:54,661 WARNING [train.py:1204] (2/4) Exclude cut with ID _15133_210_6228_1_1530794512074_3080379_29-264458-0 from training. Duration: 0.58 2023-10-04 03:51:54,691 WARNING [train.py:1204] (2/4) Exclude cut with ID _55209_210_1531_1_1534675828996_4597160_797-348343-0_sp1.1 from training. Duration: 0.963625 2023-10-04 03:51:56,077 WARNING [train.py:1204] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_23-189234-0_sp1.1 from training. Duration: 0.7545625 2023-10-04 03:52:00,706 WARNING [train.py:1204] (2/4) Exclude cut with ID _5410_210_1388_1_1527397055795_1719020_39-112221-0_sp1.1 from training. Duration: 0.62725 2023-10-04 03:52:02,133 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_43802_210_2290_1_1533516203_8829504_100-348441-0_sp1.1 from training. Duration: 0.82725 2023-10-04 03:52:02,841 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder_embed.convnext.hidden_balancer.prob, batch_count=1515600.0, ans=0.125 2023-10-04 03:52:08,587 WARNING [train.py:1204] (2/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_143-86344-0_sp1.1 from training. Duration: 0.7 2023-10-04 03:52:11,344 WARNING [train.py:1204] (2/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_126-29104-0 from training. Duration: 0.85 2023-10-04 03:52:12,793 WARNING [train.py:1204] (2/4) Exclude cut with ID _39846_210_12651_1_1533603651992_1318030_138-31771-0_sp1.1 from training. Duration: 0.963625 2023-10-04 03:52:17,041 WARNING [train.py:1204] (2/4) Exclude cut with ID _2220_210_1819_1_1525509109898_3658680_50-99837-0_sp1.1 from training. Duration: 0.4909375 2023-10-04 03:52:18,369 WARNING [train.py:1204] (2/4) Exclude cut with ID _6274_210_2084_1_1527937583837_7494020_836-210550-0_sp1.1 from training. Duration: 0.97275 2023-10-04 03:52:18,507 WARNING [train.py:1204] (2/4) Exclude cut with ID _28323_210_16187_1_1532480395667_4100300_306-74561-0 from training. Duration: 0.94 2023-10-04 03:52:26,333 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_177-58005-0_sp1.1 from training. Duration: 0.563625 2023-10-04 03:52:27,721 INFO [train.py:1046] (2/4) Epoch 43, batch 4250, loss[loss=0.1578, simple_loss=0.2415, pruned_loss=0.03706, over 24057.00 frames. ], tot_loss[loss=0.1547, simple_loss=0.2348, pruned_loss=0.03727, over 4692048.02 frames. ], batch size: 86, lr: 2.35e-03, grad_scale: 16.0 2023-10-04 03:52:29,974 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.1.encoder.layers.1.conv_module2.whiten, num_groups=1, num_channels=256, metric=10.38 vs. limit=15.0 2023-10-04 03:52:30,442 WARNING [train.py:1204] (2/4) Exclude cut with ID _54871_210_22356_1_1534665798253_3856359_19-342632-0_sp1.1 from training. Duration: 0.82725 2023-10-04 03:52:30,456 WARNING [train.py:1204] (2/4) Exclude cut with ID _35355_210_16040_1_1533212988977_4016229_74-190076-0_sp0.9 from training. Duration: 0.888875 2023-10-04 03:52:31,932 WARNING [train.py:1204] (2/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_285-97786-0_sp1.1 from training. Duration: 0.97275 2023-10-04 03:52:35,342 WARNING [train.py:1204] (2/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_337-181502-0_sp0.9 from training. Duration: 0.72225 2023-10-04 03:52:35,375 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_353-182855-0 from training. Duration: 0.96 2023-10-04 03:52:35,412 WARNING [train.py:1204] (2/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_409-327730-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 03:52:38,171 WARNING [train.py:1204] (2/4) Exclude cut with ID _8030_210_7497_1_1528444886781_3531089_373-19403-0_sp1.1 from training. Duration: 0.97275 2023-10-04 03:52:42,207 WARNING [train.py:1204] (2/4) Exclude cut with ID _6789_210_1534_1_1527857937218_3118950_231-340237-0_sp1.1 from training. Duration: 0.936375 2023-10-04 03:52:45,057 WARNING [train.py:1204] (2/4) Exclude cut with ID _6493_210_1747_1_1528459210424_4012470_536-57832-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 03:52:45,079 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7046_210_6753_1_1527895337_6920917_756-116002-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 03:52:46,566 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_37152_210_9697_1_1533106573_3844188_29-239070-0_sp1.1 from training. Duration: 0.8909375 2023-10-04 03:52:46,569 WARNING [train.py:1204] (2/4) Exclude cut with ID _19023_210_4353_1_1531378892552_5820780_61-334539-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 03:52:47,948 WARNING [train.py:1204] (2/4) Exclude cut with ID _14170_210_5219_1_1530525949868_4469650_159-28488-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 03:52:48,027 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_40896_210_15710_1_1533433956_7100802_764-71272-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 03:52:49,844 WARNING [train.py:1204] (2/4) Exclude cut with ID _44902_210_16187_1_1533888006062_3792078_258-178274-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 03:52:52,917 WARNING [train.py:1204] (2/4) Exclude cut with ID _7537_210_2084_1_1528502395431_3924359_14-312906-0_sp1.1 from training. Duration: 0.763625 2023-10-04 03:52:54,340 WARNING [train.py:1204] (2/4) Exclude cut with ID _49643_210_7989_1_1534640244224_3939031_674-116639-0_sp1.1 from training. Duration: 0.963625 2023-10-04 03:52:55,896 WARNING [train.py:1204] (2/4) Exclude cut with ID _22826_210_9994_1_1531908201522_3279169_415-161-0 from training. Duration: 0.98 2023-10-04 03:52:57,691 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.bypass.skip_rate, batch_count=1515866.6666666667, ans=0.04949747468305833 2023-10-04 03:52:58,709 WARNING [train.py:1204] (2/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_264-348896-0 from training. Duration: 0.85 2023-10-04 03:52:58,723 WARNING [train.py:1204] (2/4) Exclude cut with ID _51899_210_20034_1_1534129254466_3669220_233-161492-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 03:53:00,585 WARNING [train.py:1204] (2/4) Exclude cut with ID _31255_210_13427_1_1532865619495_3815143_233-200023-0_sp1.1 from training. Duration: 0.936375 2023-10-04 03:53:00,601 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_23850_210_4238_1_1532232597_1632433_100-229879-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 03:53:00,687 WARNING [train.py:1204] (2/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_375-245677-0_sp0.9 from training. Duration: 0.9 2023-10-04 03:53:00,689 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_40223_210_6228_1_1533298404_4812267_355-317415-0_sp1.1 from training. Duration: 0.97275 2023-10-04 03:53:00,727 WARNING [train.py:1204] (2/4) Exclude cut with ID _9679_210_7497_1_1529129212906_4363929_235-81488-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 03:53:05,363 WARNING [train.py:1204] (2/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_172-215932-0_sp1.1 from training. Duration: 0.6 2023-10-04 03:53:06,723 WARNING [train.py:1204] (2/4) Exclude cut with ID _12369_210_4381_1_1530356078581_4388520_64-298917-0_sp1.1 from training. Duration: 0.8 2023-10-04 03:53:09,546 WARNING [train.py:1204] (2/4) Exclude cut with ID _38020_210_5986_1_1533112252259_7006089_131-53533-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 03:53:10,954 WARNING [train.py:1204] (2/4) Exclude cut with ID _42334_210_7770_1_1534206610396_3605880_392-48154-0_sp1.1 from training. Duration: 0.963625 2023-10-04 03:53:12,312 WARNING [train.py:1204] (2/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_337-181502-0 from training. Duration: 0.65 2023-10-04 03:53:12,319 WARNING [train.py:1204] (2/4) Exclude cut with ID _6267_210_2084_1_1527764658526_3894730_375-219887-0_sp0.9 from training. Duration: 0.7444375 2023-10-04 03:53:13,728 WARNING [train.py:1204] (2/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_60-146189-0 from training. Duration: 0.98 2023-10-04 03:53:16,356 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_3133_210_2087_1_1526106446_2324690_243-143119-0_sp1.1 from training. Duration: 0.7 2023-10-04 03:53:17,832 WARNING [train.py:1204] (2/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_1267-272693-0_sp0.9 from training. Duration: 0.811125 2023-10-04 03:53:20,488 WARNING [train.py:1204] (2/4) Exclude cut with ID _69884_210_9771_1_1535846392818_4166169_83-182645-0_sp1.1 from training. Duration: 0.97275 2023-10-04 03:53:20,506 WARNING [train.py:1204] (2/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_403-292260-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 03:53:24,354 WARNING [train.py:1204] (2/4) Exclude cut with ID _55429_210_10626_1_1534467588527_7174143_377-169391-0 from training. Duration: 0.96 2023-10-04 03:53:25,837 WARNING [train.py:1204] (2/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_494-199538-0_sp1.1 from training. Duration: 0.6818125 2023-10-04 03:53:27,537 WARNING [train.py:1204] (2/4) Exclude cut with ID _3067_210_2667_1_1526212974640_4503168_119-175776-0_sp1.1 from training. Duration: 0.67275 2023-10-04 03:53:29,114 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.0.bypass.scale_min, batch_count=1516000.0, ans=0.2 2023-10-04 03:53:30,345 WARNING [train.py:1204] (2/4) Exclude cut with ID _3231_210_2106_1_1526205864934_6784220_183-303839-0_sp1.1 from training. Duration: 0.97275 2023-10-04 03:53:33,167 WARNING [train.py:1204] (2/4) Exclude cut with ID _18513_210_9206_1_1531618215223_3862620_165-38958-0_sp1.1 from training. Duration: 0.963625 2023-10-04 03:53:33,382 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.feed_forward3.hidden_balancer.prob, batch_count=1516000.0, ans=0.125 2023-10-04 03:53:35,074 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.593e+02 1.993e+02 2.241e+02 2.615e+02 3.958e+02, threshold=4.482e+02, percent-clipped=0.0 2023-10-04 03:53:35,143 WARNING [train.py:1204] (2/4) Exclude cut with ID _41314_210_12651_1_1533459476543_3766460_87-272068-0_sp1.1 from training. Duration: 0.7545625 2023-10-04 03:53:35,232 WARNING [train.py:1204] (2/4) Exclude cut with ID _3541_210_2477_1_1526475408709_3263973_353-95-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 03:53:36,578 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_2009_210_2071_1_1525481134_4588937_73-302813-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 03:53:37,865 WARNING [train.py:1204] (2/4) Exclude cut with ID _64220_210_9527_1_1535250151772_4875259_88-95011-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 03:53:39,186 WARNING [train.py:1204] (2/4) Exclude cut with ID _44902_210_16187_1_1533888006062_3792078_160-12107-0_sp1.1 from training. Duration: 0.92725 2023-10-04 03:53:39,199 WARNING [train.py:1204] (2/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_547-5424-0 from training. Duration: 0.94 2023-10-04 03:53:40,750 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.attention_skip_rate, batch_count=1516066.6666666667, ans=0.0 2023-10-04 03:53:41,882 INFO [train.py:1046] (2/4) Epoch 43, batch 4300, loss[loss=0.1404, simple_loss=0.2196, pruned_loss=0.03059, over 23556.00 frames. ], tot_loss[loss=0.1544, simple_loss=0.2346, pruned_loss=0.03713, over 4712452.67 frames. ], batch size: 134, lr: 2.35e-03, grad_scale: 16.0 2023-10-04 03:53:41,971 WARNING [train.py:1204] (2/4) Exclude cut with ID _21783_210_11357_1_1531742458730_3762679_268-85656-0_sp1.1 from training. Duration: 0.936375 2023-10-04 03:53:42,142 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.1.self_attn_weights.pos_emb_skip_rate, batch_count=1516066.6666666667, ans=0.0 2023-10-04 03:53:44,902 WARNING [train.py:1204] (2/4) Exclude cut with ID _66210_210_12651_1_1535446812922_3365850_42-284985-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 03:53:46,165 WARNING [train.py:1204] (2/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_465-214726-0_sp0.9 from training. Duration: 0.9555625 2023-10-04 03:53:50,337 WARNING [train.py:1204] (2/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_50-95770-0_sp1.1 from training. Duration: 0.936375 2023-10-04 03:53:53,785 WARNING [train.py:1204] (2/4) Exclude cut with ID _30800_210_22382_1_1532499347904_4515959_269-77567-0_sp1.1 from training. Duration: 0.963625 2023-10-04 03:53:53,787 WARNING [train.py:1204] (2/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_7-111954-0 from training. Duration: 0.56 2023-10-04 03:53:55,799 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_43802_210_2290_1_1533516203_8829504_85-195240-0_sp0.9 from training. Duration: 0.9666875 2023-10-04 03:53:58,388 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_2822_210_2715_1_1526112586_1999587_165-36656-0_sp0.9 from training. Duration: 0.988875 2023-10-04 03:53:58,411 WARNING [train.py:1204] (2/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_181-78632-0_sp1.1 from training. Duration: 0.7090625 2023-10-04 03:53:58,431 WARNING [train.py:1204] (2/4) Exclude cut with ID IC0671W0044-331887 from training. Duration: 0.896 2023-10-04 03:53:59,203 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.1.encoder.layers.1.conv_module2.whiten, num_groups=1, num_channels=256, metric=11.62 vs. limit=15.0 2023-10-04 03:54:02,996 WARNING [train.py:1204] (2/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_266-137104-0_sp1.1 from training. Duration: 0.5909375 2023-10-04 03:54:04,419 WARNING [train.py:1204] (2/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_137-116442-0_sp0.9 from training. Duration: 0.8666875 2023-10-04 03:54:07,793 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_23801_210_16223_1_1532066392_3294657_570-5439-0 from training. Duration: 0.97 2023-10-04 03:54:09,121 WARNING [train.py:1204] (2/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_29-280622-0_sp1.1 from training. Duration: 0.7454375 2023-10-04 03:54:09,151 WARNING [train.py:1204] (2/4) Exclude cut with ID 3557-8342-0013-54691-0 from training. Duration: 0.92 2023-10-04 03:54:11,860 WARNING [train.py:1204] (2/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_711-15688-0_sp1.1 from training. Duration: 0.3909375 2023-10-04 03:54:13,269 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_15126_210_6753_1_1530778677_2591185_120-277707-0_sp1.1 from training. Duration: 0.72725 2023-10-04 03:54:14,794 WARNING [train.py:1204] (2/4) Exclude cut with ID _14970_210_15709_1_1530775547361_1966965_8-169924-0_sp1.1 from training. Duration: 0.836375 2023-10-04 03:54:14,795 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_4692_210_3528_1_1526899755_4323254_107-44401-0_sp0.9 from training. Duration: 0.9555625 2023-10-04 03:54:16,126 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_3475_210_2028_1_1526193578_3622569_265-276625-0_sp0.9 from training. Duration: 0.8444375 2023-10-04 03:54:16,312 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.bypass_mid.scale_min, batch_count=1516200.0, ans=0.2 2023-10-04 03:54:17,502 WARNING [train.py:1204] (2/4) Exclude cut with ID _55153_210_2253_1_1534384786677_3162096_148-79022-0_sp1.1 from training. Duration: 0.87275 2023-10-04 03:54:20,117 WARNING [train.py:1204] (2/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_292-36336-0_sp1.1 from training. Duration: 0.92725 2023-10-04 03:54:20,128 WARNING [train.py:1204] (2/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_329-82235-0 from training. Duration: 0.82 2023-10-04 03:54:20,208 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_11305_210_1794_1_1529750428_7178148_257-312870-0 from training. Duration: 0.88 2023-10-04 03:54:22,942 WARNING [train.py:1204] (2/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_436-4493-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 03:54:26,432 WARNING [train.py:1204] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_321-54129-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 03:54:26,439 WARNING [train.py:1204] (2/4) Exclude cut with ID _5287_210_2084_1_1527418883040_3878670_333-48508-0_sp0.9 from training. Duration: 0.6666875 2023-10-04 03:54:26,453 WARNING [train.py:1204] (2/4) Exclude cut with ID _26528_210_9070_1_1532570410842_3851310_244-278158-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 03:54:26,500 WARNING [train.py:1204] (2/4) Exclude cut with ID _46952_210_5986_1_1534230010357_3107379_232-344503-0_sp1.1 from training. Duration: 0.87275 2023-10-04 03:54:26,517 WARNING [train.py:1204] (2/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_43-206757-0 from training. Duration: 0.75 2023-10-04 03:54:26,519 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_4183_210_2087_1_1526729100_1869838_141-166600-0 from training. Duration: 0.95 2023-10-04 03:54:27,769 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_296-287678-0 from training. Duration: 0.95 2023-10-04 03:54:27,845 WARNING [train.py:1204] (2/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_265-60343-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 03:54:29,148 WARNING [train.py:1204] (2/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_332-256168-0 from training. Duration: 0.75 2023-10-04 03:54:29,187 WARNING [train.py:1204] (2/4) Exclude cut with ID _13679_210_7497_1_1530534293662_3355098_411-186088-0 from training. Duration: 0.94 2023-10-04 03:54:33,697 WARNING [train.py:1204] (2/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_458-126779-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 03:54:35,077 WARNING [train.py:1204] (2/4) Exclude cut with ID IC0803W0380-397572 from training. Duration: 0.032 2023-10-04 03:54:36,385 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_278-272612-0_sp1.1 from training. Duration: 0.663625 2023-10-04 03:54:37,905 WARNING [train.py:1204] (2/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_491-263101-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 03:54:37,910 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_53555_210_5632_1_1534595220_7232297_334-336778-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 03:54:39,382 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_50-3289-0 from training. Duration: 0.91 2023-10-04 03:54:41,163 WARNING [train.py:1204] (2/4) Exclude cut with ID _8699_210_2069_1_1528630346831_3960409_155-39766-0_sp0.9 from training. Duration: 0.8666875 2023-10-04 03:54:41,176 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_502-82145-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 03:54:41,224 WARNING [train.py:1204] (2/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_550-43132-0_sp1.1 from training. Duration: 0.97275 2023-10-04 03:54:42,540 WARNING [train.py:1204] (2/4) Exclude cut with ID _14998_210_5219_1_1530705582080_7474970_446-112117-0_sp1.1 from training. Duration: 0.8181875 2023-10-04 03:54:42,599 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_3691_210_3528_1_1526468169_4062053_355-334432-0_sp1.1 from training. Duration: 0.7909375 2023-10-04 03:54:45,280 WARNING [train.py:1204] (2/4) Exclude cut with ID _58605_210_12210_1_1534723195939_3904259_239-94720-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 03:54:46,793 WARNING [train.py:1204] (2/4) Exclude cut with ID _49376_210_5986_1_1533985278732_3370740_391-344072-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 03:54:48,149 WARNING [train.py:1204] (2/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_558-322014-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 03:54:48,186 WARNING [train.py:1204] (2/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_256-32662-0_sp1.1 from training. Duration: 0.8181875 2023-10-04 03:54:48,301 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.1.conv_skip_rate, batch_count=1516333.3333333333, ans=0.0 2023-10-04 03:54:51,047 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.0.feed_forward2.hidden_balancer.prob, batch_count=1516333.3333333333, ans=0.125 2023-10-04 03:54:53,669 WARNING [train.py:1204] (2/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_572-185150-0 from training. Duration: 0.97 2023-10-04 03:54:55,432 INFO [train.py:1046] (2/4) Epoch 43, batch 4350, loss[loss=0.1449, simple_loss=0.2287, pruned_loss=0.03051, over 24464.00 frames. ], tot_loss[loss=0.1549, simple_loss=0.2349, pruned_loss=0.03739, over 4711062.35 frames. ], batch size: 63, lr: 2.35e-03, grad_scale: 16.0 2023-10-04 03:54:55,503 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_89-35748-0_sp0.9 from training. Duration: 0.87775 2023-10-04 03:54:59,752 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6291_210_3273_1_1527683649_1078498_88-66747-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 03:55:01,209 WARNING [train.py:1204] (2/4) Exclude cut with ID _19504_210_9994_1_1531648863242_3325220_357-237029-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 03:55:04,333 WARNING [train.py:1204] (2/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_302-2550-0_sp1.1 from training. Duration: 0.736375 2023-10-04 03:55:04,333 WARNING [train.py:1204] (2/4) Exclude cut with ID _9461_210_4557_1_1529060514726_3677451_59-194017-0_sp1.1 from training. Duration: 0.963625 2023-10-04 03:55:05,994 INFO [scaling.py:1118] (2/4) WithLoss: name=encoder.encoders.4.encoder.layers.0.self_attn_weights, loss-sum=0.000e+00 2023-10-04 03:55:09,685 WARNING [train.py:1204] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_665-196155-0_sp1.1 from training. Duration: 0.6090625 2023-10-04 03:55:09,946 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.conv_module1.balancer2.prob, batch_count=1516466.6666666667, ans=0.125 2023-10-04 03:55:11,383 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.conv_module1.balancer2.prob, batch_count=1516466.6666666667, ans=0.125 2023-10-04 03:55:13,231 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_326-315007-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 03:55:14,723 WARNING [train.py:1204] (2/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_105-328174-0_sp1.1 from training. Duration: 0.6909375 2023-10-04 03:55:14,735 WARNING [train.py:1204] (2/4) Exclude cut with ID _56494_210_4220_1_1535284837625_3033169_106-263722-0_sp1.1 from training. Duration: 0.97275 2023-10-04 03:55:16,692 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.3.encoder.layers.3.whiten, num_groups=1, num_channels=512, metric=3.95 vs. limit=12.0 2023-10-04 03:55:17,406 WARNING [train.py:1204] (2/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_770-200430-0_sp0.9 from training. Duration: 0.988875 2023-10-04 03:55:18,993 WARNING [train.py:1204] (2/4) Exclude cut with ID _65640_210_6310_1_1535368201445_3173250_209-13008-0_sp1.1 from training. Duration: 0.9 2023-10-04 03:55:21,747 WARNING [train.py:1204] (2/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_212-149947-0_sp0.9 from training. Duration: 0.811125 2023-10-04 03:55:27,013 WARNING [train.py:1204] (2/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_188-171673-0 from training. Duration: 0.58 2023-10-04 03:55:27,080 WARNING [train.py:1204] (2/4) Exclude cut with ID _58895_210_8750_1_1535184081320_3649089_286-281724-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 03:55:28,449 WARNING [train.py:1204] (2/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_16-254909-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 03:55:34,346 WARNING [train.py:1204] (2/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_52-95655-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 03:55:35,780 WARNING [train.py:1204] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_554-45617-0 from training. Duration: 0.99 2023-10-04 03:55:39,101 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.4.encoder.layers.2.nonlin_attention.whiten2, num_groups=1, num_channels=384, metric=3.47 vs. limit=15.0 2023-10-04 03:55:39,920 WARNING [train.py:1204] (2/4) Exclude cut with ID _14683_210_5219_1_1530698352608_3974859_473-344611-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 03:55:41,313 WARNING [train.py:1204] (2/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_17-25045-0_sp0.9 from training. Duration: 0.6555625 2023-10-04 03:55:46,220 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9072W0003-530637 from training. Duration: 0.768 2023-10-04 03:55:46,321 WARNING [train.py:1204] (2/4) Exclude cut with ID _38670_210_15379_1_1533448810659_4391059_76-74025-0_sp1.1 from training. Duration: 0.936375 2023-10-04 03:55:47,582 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6291_210_3273_1_1527683649_1078498_74-292608-0_sp0.9 from training. Duration: 0.8 2023-10-04 03:55:48,955 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9084W0025-536862 from training. Duration: 0.896 2023-10-04 03:55:50,419 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9087W0021-538392 from training. Duration: 0.768 2023-10-04 03:55:50,432 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7543_210_6753_1_1528543323_645515_56-163234-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 03:55:50,463 WARNING [train.py:1204] (2/4) Exclude cut with ID _45560_210_21982_1_1533812603951_3584999_44-332643-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 03:55:51,781 WARNING [train.py:1204] (2/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_1285-256622-0_sp1.1 from training. Duration: 0.763625 2023-10-04 03:55:52,393 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.4.encoder.layers.0.self_attn1.whiten, num_groups=1, num_channels=384, metric=14.04 vs. limit=22.5 2023-10-04 03:55:53,139 WARNING [train.py:1204] (2/4) Exclude cut with ID _52781_210_16187_1_1534297729099_1118149_141-325632-0_sp1.1 from training. Duration: 0.936375 2023-10-04 03:55:53,209 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_40896_210_15710_1_1533433956_7100802_1051-137631-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 03:55:54,541 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_13091_210_7925_1_1530609447_8987354_233-18333-0_sp1.1 from training. Duration: 0.97275 2023-10-04 03:55:57,822 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_34008_210_10431_1_1533345424_8183247_662-317331-0 from training. Duration: 0.94 2023-10-04 03:55:57,827 WARNING [train.py:1204] (2/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_291-248920-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 03:55:57,829 WARNING [train.py:1204] (2/4) Exclude cut with ID _65263_210_22356_1_1535763598691_3921037_274-288481-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 03:55:57,856 WARNING [train.py:1204] (2/4) Exclude cut with ID _5189_210_4557_1_1527248319428_3769229_233-168080-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 03:55:59,195 WARNING [train.py:1204] (2/4) Exclude cut with ID _54871_210_22356_1_1534665798253_3856359_135-184281-0 from training. Duration: 0.99 2023-10-04 03:55:59,357 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.self_attn_weights.pos_emb_skip_rate, batch_count=1516666.6666666667, ans=0.0 2023-10-04 03:56:00,983 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9123W0012-555972 from training. Duration: 0.896 2023-10-04 03:56:00,987 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9123W0118-556077 from training. Duration: 0.896 2023-10-04 03:56:00,996 WARNING [train.py:1204] (2/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_406-275649-0 from training. Duration: 0.42 2023-10-04 03:56:02,234 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.666e+02 1.917e+02 2.050e+02 2.278e+02 3.303e+02, threshold=4.100e+02, percent-clipped=0.0 2023-10-04 03:56:05,048 WARNING [train.py:1204] (2/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_309-13684-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 03:56:05,069 WARNING [train.py:1204] (2/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_129-337802-0_sp1.1 from training. Duration: 0.6545625 2023-10-04 03:56:05,103 WARNING [train.py:1204] (2/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_649-266825-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 03:56:06,402 WARNING [train.py:1204] (2/4) Exclude cut with ID _40519_210_13794_1_1533715228655_7337390_895-224742-0_sp1.1 from training. Duration: 0.92725 2023-10-04 03:56:07,301 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.feed_forward2.hidden_balancer.prob, batch_count=1516666.6666666667, ans=0.125 2023-10-04 03:56:08,369 WARNING [train.py:1204] (2/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_234-14174-0 from training. Duration: 0.82 2023-10-04 03:56:09,577 INFO [train.py:1046] (2/4) Epoch 43, batch 4400, loss[loss=0.1446, simple_loss=0.2269, pruned_loss=0.03119, over 23359.00 frames. ], tot_loss[loss=0.1558, simple_loss=0.2362, pruned_loss=0.03771, over 4715751.62 frames. ], batch size: 93, lr: 2.35e-03, grad_scale: 32.0 2023-10-04 03:56:10,978 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9156W0009-572502 from training. Duration: 0.768 2023-10-04 03:56:10,993 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_424-245529-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 03:56:12,784 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.4.encoder.layers.2.self_attn_weights.whiten_keys, num_groups=4, num_channels=128, metric=1.95 vs. limit=6.0 2023-10-04 03:56:14,998 WARNING [train.py:1204] (2/4) Exclude cut with ID _26528_210_9070_1_1532570410842_3851310_232-10149-0_sp1.1 from training. Duration: 0.963625 2023-10-04 03:56:15,006 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_17292_210_6753_1_1531125388_2943734_152-30410-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 03:56:15,184 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.1.nonlin_attention.balancer.prob, batch_count=1516733.3333333333, ans=0.125 2023-10-04 03:56:16,465 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_35441_210_16554_1_1533347572_4092680_291-82075-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 03:56:19,157 WARNING [train.py:1204] (2/4) Exclude cut with ID _12369_210_4381_1_1530356078581_4388520_64-298917-0 from training. Duration: 0.88 2023-10-04 03:56:19,191 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_5618_210_1874_1_1527311008_5851_1-214501-0_sp0.9 from training. Duration: 0.7655625 2023-10-04 03:56:21,063 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_9460_210_4211_1_1528954587_10049667_572-37035-0 from training. Duration: 0.8 2023-10-04 03:56:21,081 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9193W0023-590322 from training. Duration: 0.896 2023-10-04 03:56:21,179 WARNING [train.py:1204] (2/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_206-10097-0_sp1.1 from training. Duration: 0.4454375 2023-10-04 03:56:21,191 WARNING [train.py:1204] (2/4) Exclude cut with ID _55134_210_21982_1_1534590028377_3882170_17-51731-0_sp1.1 from training. Duration: 0.963625 2023-10-04 03:56:23,834 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_14953_210_2151_1_1530701710_5616165_710-165400-0 from training. Duration: 0.87 2023-10-04 03:56:25,309 WARNING [train.py:1204] (2/4) Exclude cut with ID _53203_210_2087_1_1534246218309_475590_61-85777-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 03:56:25,389 WARNING [train.py:1204] (2/4) Exclude cut with ID _55844_210_2346_1_1534471212569_7625707_1050-10453-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 03:56:25,405 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9218W0013-603297 from training. Duration: 0.896 2023-10-04 03:56:28,235 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_20120_210_6753_1_1531643035_5482859_267-102818-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 03:56:29,485 WARNING [train.py:1204] (2/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_191-41023-0 from training. Duration: 0.71 2023-10-04 03:56:29,531 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9231W0202-610227 from training. Duration: 0.896 2023-10-04 03:56:31,456 WARNING [train.py:1204] (2/4) Exclude cut with ID _6537_210_1883_1_1527987559710_3597129_382-133062-0 from training. Duration: 0.68 2023-10-04 03:56:31,500 WARNING [train.py:1204] (2/4) Exclude cut with ID _28427_210_4220_1_1532581240967_3061069_43-84928-0 from training. Duration: 0.99 2023-10-04 03:56:32,734 WARNING [train.py:1204] (2/4) Exclude cut with ID _59256_210_5986_1_1535076085997_3589120_121-237386-0 from training. Duration: 0.96 2023-10-04 03:56:32,770 WARNING [train.py:1204] (2/4) Exclude cut with ID _8480_210_1874_1_1528589391205_6728330_137-256986-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 03:56:35,488 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_9417_210_1794_1_1529235261_4614131_479-346963-0_sp1.1 from training. Duration: 0.936375 2023-10-04 03:56:35,528 WARNING [train.py:1204] (2/4) Exclude cut with ID _26474_210_9570_1_1532520022699_3623380_267-7362-0_sp1.1 from training. Duration: 0.936375 2023-10-04 03:56:36,678 WARNING [train.py:1204] (2/4) Exclude cut with ID _55328_210_27767_1_1534580973260_4088170_287-61272-0_sp1.1 from training. Duration: 0.97275 2023-10-04 03:56:38,682 WARNING [train.py:1204] (2/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_589-9070-0 from training. Duration: 0.71 2023-10-04 03:56:38,690 WARNING [train.py:1204] (2/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_148-158509-0_sp1.1 from training. Duration: 0.37275 2023-10-04 03:56:40,057 WARNING [train.py:1204] (2/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_164-184548-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 03:56:40,202 WARNING [train.py:1204] (2/4) Exclude cut with ID _14970_210_15709_1_1530775547361_1966965_6-23328-0_sp1.1 from training. Duration: 0.7818125 2023-10-04 03:56:40,207 WARNING [train.py:1204] (2/4) Exclude cut with ID _29303_210_2575_1_1532392237303_7410649_612-165069-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 03:56:41,636 WARNING [train.py:1204] (2/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_383-321224-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 03:56:42,917 WARNING [train.py:1204] (2/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_283-160525-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 03:56:42,935 WARNING [train.py:1204] (2/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_505-97987-0 from training. Duration: 0.86 2023-10-04 03:56:44,324 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9309W0190-640317 from training. Duration: 0.768 2023-10-04 03:56:48,260 WARNING [train.py:1204] (2/4) Exclude cut with ID _50093_210_4220_1_1534417141207_3432299_280-297390-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 03:56:48,542 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=1516866.6666666667, ans=0.1 2023-10-04 03:56:54,634 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_17292_210_6753_1_1531125388_2943734_320-71385-0_sp1.1 from training. Duration: 0.97275 2023-10-04 03:56:57,176 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_213-103282-0 from training. Duration: 0.78 2023-10-04 03:57:01,814 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7615_210_4211_1_1528110965_4580160_72-235211-0_sp0.9 from training. Duration: 0.8444375 2023-10-04 03:57:03,822 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.conv_module1.balancer2.prob, batch_count=1516933.3333333333, ans=0.125 2023-10-04 03:57:04,307 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.3.encoder.layers.1.feed_forward2.out_whiten, num_groups=1, num_channels=512, metric=5.62 vs. limit=15.0 2023-10-04 03:57:04,984 WARNING [train.py:1204] (2/4) Exclude cut with ID _14867_210_1968_1_1530684179890_3846969_173-320977-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 03:57:06,396 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7046_210_6753_1_1527895337_6920917_783-278109-0_sp0.9 from training. Duration: 0.8555625 2023-10-04 03:57:07,613 WARNING [train.py:1204] (2/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_90-30673-0 from training. Duration: 0.84 2023-10-04 03:57:07,629 WARNING [train.py:1204] (2/4) Exclude cut with ID _22826_210_9994_1_1531908201522_3279169_415-161-0_sp1.1 from training. Duration: 0.8909375 2023-10-04 03:57:07,640 WARNING [train.py:1204] (2/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_86-66192-0_sp0.9 from training. Duration: 0.611125 2023-10-04 03:57:07,642 WARNING [train.py:1204] (2/4) Exclude cut with ID _4626_210_3532_1_1526817652717_3916859_446-206656-0_sp1.1 from training. Duration: 0.6909375 2023-10-04 03:57:07,715 WARNING [train.py:1204] (2/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_166-128941-0_sp0.9 from training. Duration: 0.6 2023-10-04 03:57:12,347 WARNING [train.py:1204] (2/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_345-167986-0 from training. Duration: 0.84 2023-10-04 03:57:14,639 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.0.layers.0.nonlin_attention.whiten1, num_groups=1, num_channels=144, metric=8.90 vs. limit=10.0 2023-10-04 03:57:16,350 WARNING [train.py:1204] (2/4) Exclude cut with ID _6275_210_2084_1_1528110062213_7669049_973-198085-0 from training. Duration: 0.75 2023-10-04 03:57:17,721 WARNING [train.py:1204] (2/4) Exclude cut with ID _60057_210_10640_1_1534903065793_3333150_171-181133-0 from training. Duration: 0.88 2023-10-04 03:57:17,736 WARNING [train.py:1204] (2/4) Exclude cut with ID _19504_210_9994_1_1531648863242_3325220_336-196513-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 03:57:17,743 WARNING [train.py:1204] (2/4) Exclude cut with ID _6493_210_1747_1_1528459210424_4012470_513-140967-0 from training. Duration: 0.79 2023-10-04 03:57:17,826 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6673_210_3273_1_1527767790_3173240_327-306185-0_sp0.9 from training. Duration: 0.92225 2023-10-04 03:57:20,547 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1032-236863-0_sp1.1 from training. Duration: 0.763625 2023-10-04 03:57:22,687 INFO [train.py:1046] (2/4) Epoch 43, batch 4450, loss[loss=0.1334, simple_loss=0.2144, pruned_loss=0.02618, over 24412.00 frames. ], tot_loss[loss=0.156, simple_loss=0.2365, pruned_loss=0.03774, over 4720252.28 frames. ], batch size: 58, lr: 2.35e-03, grad_scale: 32.0 2023-10-04 03:57:22,764 WARNING [train.py:1204] (2/4) Exclude cut with ID _14867_210_1968_1_1530684179890_3846969_413-29292-0 from training. Duration: 0.9 2023-10-04 03:57:22,949 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.bypass_mid.scale_min, batch_count=1517066.6666666667, ans=0.2 2023-10-04 03:57:26,740 WARNING [train.py:1204] (2/4) Exclude cut with ID _47251_210_18628_1_1533814063128_4137250_186-343322-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 03:57:28,239 WARNING [train.py:1204] (2/4) Exclude cut with ID _56235_210_10640_1_1534557335235_4161099_624-46091-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 03:57:29,529 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_791-197891-0_sp0.9 from training. Duration: 0.9333125 2023-10-04 03:57:34,394 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_50050_210_16223_1_1535435796_7290138_786-288457-0_sp1.1 from training. Duration: 0.92725 2023-10-04 03:57:34,414 WARNING [train.py:1204] (2/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_213-227283-0_sp1.1 from training. Duration: 0.963625 2023-10-04 03:57:38,500 WARNING [train.py:1204] (2/4) Exclude cut with ID _42659_210_2070_1_1533607442765_3120380_58-341121-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 03:57:39,860 WARNING [train.py:1204] (2/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_506-23200-0_sp1.1 from training. Duration: 0.8818125 2023-10-04 03:57:41,882 WARNING [train.py:1204] (2/4) Exclude cut with ID _14170_210_5219_1_1530525949868_4469650_336-213530-0_sp1.1 from training. Duration: 0.8181875 2023-10-04 03:57:43,219 WARNING [train.py:1204] (2/4) Exclude cut with ID _64135_210_2253_1_1535677267876_7608972_180-5312-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 03:57:43,327 WARNING [train.py:1204] (2/4) Exclude cut with ID _12302_210_5219_1_1529924446466_8107280_835-279754-0 from training. Duration: 0.9 2023-10-04 03:57:43,329 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_20120_210_6753_1_1531643035_5482859_510-96996-0_sp1.1 from training. Duration: 0.7909375 2023-10-04 03:57:44,815 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_15517_210_6753_1_1530954008_3487336_216-187834-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 03:57:44,857 WARNING [train.py:1204] (2/4) Exclude cut with ID _19504_210_9994_1_1531648863242_3325220_414-113427-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 03:57:44,858 WARNING [train.py:1204] (2/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_264-108685-0_sp0.9 from training. Duration: 0.611125 2023-10-04 03:57:48,493 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.1.encoder.layers.0.conv_module2.whiten, num_groups=1, num_channels=256, metric=5.50 vs. limit=15.0 2023-10-04 03:57:48,933 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_5438_210_1794_1_1527336687_2600009_104-288274-0_sp0.9 from training. Duration: 0.6444375 2023-10-04 03:57:52,311 WARNING [train.py:1204] (2/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_520-39496-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 03:57:53,760 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_37818_210_4238_1_1533620910_4720083_315-308565-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 03:57:53,872 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_2009_210_2071_1_1525481134_4588937_385-302100-0_sp1.1 from training. Duration: 0.7909375 2023-10-04 03:57:55,179 WARNING [train.py:1204] (2/4) Exclude cut with ID _58895_210_8750_1_1535184081320_3649089_277-53219-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 03:57:57,896 WARNING [train.py:1204] (2/4) Exclude cut with ID _26664_210_7770_1_1532258943280_3418270_121-111258-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 03:58:02,601 WARNING [train.py:1204] (2/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_99-167392-0_sp1.1 from training. Duration: 0.5545625 2023-10-04 03:58:04,551 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1113-7369-0 from training. Duration: 0.85 2023-10-04 03:58:04,566 WARNING [train.py:1204] (2/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_352-59958-0 from training. Duration: 0.86 2023-10-04 03:58:04,570 WARNING [train.py:1204] (2/4) Exclude cut with ID _14886_210_4220_1_1530702020355_3516190_159-73748-0_sp1.1 from training. Duration: 0.7545625 2023-10-04 03:58:06,018 WARNING [train.py:1204] (2/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_131-119569-0_sp1.1 from training. Duration: 0.92725 2023-10-04 03:58:07,498 WARNING [train.py:1204] (2/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_216-343007-0 from training. Duration: 0.66 2023-10-04 03:58:10,333 WARNING [train.py:1204] (2/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_113-92608-0_sp0.9 from training. Duration: 0.6 2023-10-04 03:58:14,787 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_40047_210_2290_1_1533290519_2374311_132-123271-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 03:58:14,844 WARNING [train.py:1204] (2/4) Exclude cut with ID _8793_210_6310_1_1528542985139_2310009_121-179782-0 from training. Duration: 0.8 2023-10-04 03:58:14,867 WARNING [train.py:1204] (2/4) Exclude cut with ID _43997_210_4525_1_1533538632555_5189329_283-218639-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 03:58:14,870 WARNING [train.py:1204] (2/4) Exclude cut with ID _49376_210_5986_1_1533985278732_3370740_297-78397-0_sp1.1 from training. Duration: 0.936375 2023-10-04 03:58:16,296 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_3475_210_2028_1_1526193578_3622569_377-166133-0_sp0.9 from training. Duration: 0.9555625 2023-10-04 03:58:16,305 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_520-213149-0_sp1.1 from training. Duration: 0.92725 2023-10-04 03:58:17,659 WARNING [train.py:1204] (2/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_567-144483-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 03:58:19,647 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.3.encoder.layers.3.self_attn2.whiten, num_groups=1, num_channels=512, metric=9.81 vs. limit=22.5 2023-10-04 03:58:20,381 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_4183_210_2087_1_1526729100_1869838_143-280962-0_sp0.9 from training. Duration: 0.62225 2023-10-04 03:58:20,424 WARNING [train.py:1204] (2/4) Exclude cut with ID _49643_210_7989_1_1534640244224_3939031_611-185515-0 from training. Duration: 0.94 2023-10-04 03:58:23,924 WARNING [train.py:1204] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_88-341718-0_sp0.9 from training. Duration: 0.7333125 2023-10-04 03:58:24,213 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.feed_forward1.hidden_balancer.prob, batch_count=1517333.3333333333, ans=0.125 2023-10-04 03:58:26,652 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_51225_210_3601_1_1534400634_2327383_49-235441-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 03:58:26,740 WARNING [train.py:1204] (2/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_240-294400-0_sp1.1 from training. Duration: 0.936375 2023-10-04 03:58:28,176 WARNING [train.py:1204] (2/4) Exclude cut with ID _55429_210_10626_1_1534467588527_7174143_640-304825-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 03:58:28,197 WARNING [train.py:1204] (2/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_187-221566-0_sp1.1 from training. Duration: 0.3909375 2023-10-04 03:58:29,420 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.667e+02 2.027e+02 2.201e+02 2.513e+02 3.494e+02, threshold=4.403e+02, percent-clipped=0.0 2023-10-04 03:58:29,627 WARNING [train.py:1204] (2/4) Exclude cut with ID _7606_210_4915_1_1528894253277_5080690_127-177761-0_sp0.9 from training. Duration: 0.711125 2023-10-04 03:58:29,827 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.ff2_skip_rate, batch_count=1517333.3333333333, ans=0.0 2023-10-04 03:58:33,125 WARNING [train.py:1204] (2/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_430-116139-0 from training. Duration: 0.85 2023-10-04 03:58:33,403 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.ff2_skip_rate, batch_count=1517333.3333333333, ans=0.0 2023-10-04 03:58:34,527 WARNING [train.py:1204] (2/4) Exclude cut with ID _3675_210_4557_1_1526639934408_4182089_456-18305-0_sp0.9 from training. Duration: 0.8444375 2023-10-04 03:58:37,760 INFO [train.py:1046] (2/4) Epoch 43, batch 4500, loss[loss=0.1537, simple_loss=0.2342, pruned_loss=0.03664, over 23238.00 frames. ], tot_loss[loss=0.156, simple_loss=0.2367, pruned_loss=0.03767, over 4729039.08 frames. ], batch size: 105, lr: 2.35e-03, grad_scale: 32.0 2023-10-04 03:58:39,296 WARNING [train.py:1204] (2/4) Exclude cut with ID _18513_210_9206_1_1531618215223_3862620_402-5957-0_sp1.1 from training. Duration: 0.9 2023-10-04 03:58:40,685 WARNING [train.py:1204] (2/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_465-156500-0 from training. Duration: 0.72 2023-10-04 03:58:40,686 WARNING [train.py:1204] (2/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_714-310538-0 from training. Duration: 0.86 2023-10-04 03:58:43,320 WARNING [train.py:1204] (2/4) Exclude cut with ID _39411_210_5986_1_1533538825030_3343649_335-301187-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 03:58:45,612 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.feed_forward1.hidden_balancer.prob, batch_count=1517400.0, ans=0.125 2023-10-04 03:58:46,884 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_41750_210_1960_1_1533520678_3948042_241-158616-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 03:58:48,178 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_46013_210_7234_1_1533776362_3744032_50-141535-0_sp1.1 from training. Duration: 0.9 2023-10-04 03:58:49,493 WARNING [train.py:1204] (2/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_43-206757-0_sp0.9 from training. Duration: 0.8333125 2023-10-04 03:58:50,816 WARNING [train.py:1204] (2/4) Exclude cut with ID _22961_210_8254_1_1531976366672_4278180_172-276296-0_sp1.1 from training. Duration: 0.97275 2023-10-04 03:58:50,837 WARNING [train.py:1204] (2/4) Exclude cut with ID _17956_210_4353_1_1531209568125_6979970_1305-301903-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 03:58:50,875 WARNING [train.py:1204] (2/4) Exclude cut with ID _28323_210_16187_1_1532480395667_4100300_604-44507-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 03:59:01,204 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.conv_module1.balancer2.min_abs, batch_count=1517466.6666666667, ans=0.5 2023-10-04 03:59:04,251 WARNING [train.py:1204] (2/4) Exclude cut with ID _23514_210_1389_1_1531832139785_3031439_88-337544-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 03:59:05,691 WARNING [train.py:1204] (2/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_932-3672-0_sp1.1 from training. Duration: 0.963625 2023-10-04 03:59:07,584 WARNING [train.py:1204] (2/4) Exclude cut with ID _46502_210_13794_1_1533724452198_3657159_207-109991-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 03:59:07,816 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.balancer2.prob, batch_count=1517533.3333333333, ans=0.125 2023-10-04 03:59:08,903 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_216-73841-0_sp1.1 from training. Duration: 0.87275 2023-10-04 03:59:09,005 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8453_210_4211_1_1528502940_8562730_347-330673-0_sp1.1 from training. Duration: 0.6545625 2023-10-04 03:59:09,251 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.out_combiner.scale_min, batch_count=1517533.3333333333, ans=0.2 2023-10-04 03:59:15,869 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_4183_210_2087_1_1526729100_1869838_143-280962-0_sp1.1 from training. Duration: 0.5090625 2023-10-04 03:59:19,234 WARNING [train.py:1204] (2/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_325-113798-0_sp1.1 from training. Duration: 0.77275 2023-10-04 03:59:23,368 WARNING [train.py:1204] (2/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_91-212486-0_sp0.9 from training. Duration: 0.7555625 2023-10-04 03:59:26,058 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8776_210_2040_1_1528638837_4088036_244-278763-0_sp1.1 from training. Duration: 0.8181875 2023-10-04 03:59:27,803 WARNING [train.py:1204] (2/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_106-286130-0 from training. Duration: 0.98 2023-10-04 03:59:28,257 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.5.encoder.layers.1.conv_module1.whiten, num_groups=1, num_channels=256, metric=3.93 vs. limit=15.0 2023-10-04 03:59:29,110 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_2746_210_1988_1_1525872816_3795627_372-205861-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 03:59:29,145 WARNING [train.py:1204] (2/4) Exclude cut with ID _30800_210_22382_1_1532499347904_4515959_240-54646-0_sp1.1 from training. Duration: 0.936375 2023-10-04 03:59:29,280 WARNING [train.py:1204] (2/4) Exclude cut with ID _59500_210_8254_1_1534809610680_3736009_162-180695-0_sp1.1 from training. Duration: 0.936375 2023-10-04 03:59:30,631 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7544_210_6753_1_1528591327_1471426_146-93357-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 03:59:30,813 WARNING [train.py:1204] (2/4) Exclude cut with ID _42657_210_2070_1_1533520654016_3327008_405-87432-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 03:59:32,108 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_4528_210_3273_1_1527076654_3276777_209-219804-0 from training. Duration: 0.69 2023-10-04 03:59:32,109 WARNING [train.py:1204] (2/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_78-182912-0_sp0.9 from training. Duration: 0.6555625 2023-10-04 03:59:32,121 WARNING [train.py:1204] (2/4) Exclude cut with ID _31255_210_13427_1_1532865619495_3815143_322-114998-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 03:59:34,351 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.out_combiner.scale_min, batch_count=1517600.0, ans=0.2 2023-10-04 03:59:35,523 WARNING [train.py:1204] (2/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_848-280957-0_sp0.9 from training. Duration: 0.9666875 2023-10-04 03:59:37,229 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7696_210_2040_1_1528207167_3716099_117-125434-0_sp0.9 from training. Duration: 0.8444375 2023-10-04 03:59:39,947 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_1002-109192-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 03:59:41,406 WARNING [train.py:1204] (2/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_138-64210-0_sp0.9 from training. Duration: 0.811125 2023-10-04 03:59:41,424 WARNING [train.py:1204] (2/4) Exclude cut with ID _25808_210_13292_1_1532343514562_7366109_633-47524-0_sp1.1 from training. Duration: 0.9 2023-10-04 03:59:41,555 WARNING [train.py:1204] (2/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_297-5233-0 from training. Duration: 0.95 2023-10-04 03:59:44,220 WARNING [train.py:1204] (2/4) Exclude cut with ID _8394_210_6702_1_1528590504316_3540189_335-274780-0 from training. Duration: 0.78 2023-10-04 03:59:44,225 WARNING [train.py:1204] (2/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_301-330455-0 from training. Duration: 0.8 2023-10-04 03:59:48,862 WARNING [train.py:1204] (2/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_383-205833-0 from training. Duration: 0.95 2023-10-04 03:59:51,410 INFO [train.py:1046] (2/4) Epoch 43, batch 4550, loss[loss=0.1376, simple_loss=0.2167, pruned_loss=0.02929, over 24269.00 frames. ], tot_loss[loss=0.1555, simple_loss=0.2358, pruned_loss=0.0376, over 4731586.16 frames. ], batch size: 56, lr: 2.35e-03, grad_scale: 32.0 2023-10-04 03:59:52,751 WARNING [train.py:1204] (2/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_48-182155-0 from training. Duration: 0.87 2023-10-04 03:59:54,162 WARNING [train.py:1204] (2/4) Exclude cut with ID _14832_210_8750_1_1530691351539_3171348_1-54315-0_sp1.1 from training. Duration: 0.7909375 2023-10-04 03:59:57,500 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.conv_module1.balancer2.prob, batch_count=1517733.3333333333, ans=0.125 2023-10-04 03:59:58,713 WARNING [train.py:1204] (2/4) Exclude cut with ID _44982_210_1511_1_1534312820108_3726029_413-100814-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 03:59:58,770 WARNING [train.py:1204] (2/4) Exclude cut with ID _3231_210_2106_1_1526205864934_6784220_104-261392-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 04:00:00,276 WARNING [train.py:1204] (2/4) Exclude cut with ID _24314_210_4915_1_1532135078116_7170179_1-13588-0_sp1.1 from training. Duration: 0.97275 2023-10-04 04:00:03,764 WARNING [train.py:1204] (2/4) Exclude cut with ID _44304_210_15374_1_1533639352765_4613129_141-8356-0_sp1.1 from training. Duration: 0.963625 2023-10-04 04:00:06,454 WARNING [train.py:1204] (2/4) Exclude cut with ID _37892_210_19596_1_1533198677897_3964058_374-335034-0_sp1.1 from training. Duration: 0.936375 2023-10-04 04:00:09,476 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_195-257512-0_sp0.9 from training. Duration: 0.7444375 2023-10-04 04:00:09,478 WARNING [train.py:1204] (2/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_1267-272693-0_sp1.1 from training. Duration: 0.663625 2023-10-04 04:00:09,479 WARNING [train.py:1204] (2/4) Exclude cut with ID _30196_210_1664_1_1533708496522_7074140_690-326155-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 04:00:12,200 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_68847_210_14656_1_1536070214_2586843_38-20717-0_sp1.1 from training. Duration: 0.97275 2023-10-04 04:00:12,241 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_99-251314-0_sp1.1 from training. Duration: 0.7909375 2023-10-04 04:00:13,748 WARNING [train.py:1204] (2/4) Exclude cut with ID _49376_210_5986_1_1533985278732_3370740_225-95630-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 04:00:18,375 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_2655_210_2087_1_1525688959_3897400_202-147557-0 from training. Duration: 0.85 2023-10-04 04:00:18,433 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_5618_210_1874_1_1527311008_5851_1-214501-0_sp1.1 from training. Duration: 0.626375 2023-10-04 04:00:20,537 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.1.encoder.layers.1.self_attn1.whiten, num_groups=1, num_channels=256, metric=10.81 vs. limit=22.5 2023-10-04 04:00:21,131 WARNING [train.py:1204] (2/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_225-43381-0_sp1.1 from training. Duration: 0.8545625 2023-10-04 04:00:21,235 WARNING [train.py:1204] (2/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_242-3472-0 from training. Duration: 0.73 2023-10-04 04:00:24,136 WARNING [train.py:1204] (2/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_339-104791-0 from training. Duration: 0.49 2023-10-04 04:00:24,194 WARNING [train.py:1204] (2/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_488-92261-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 04:00:28,648 WARNING [train.py:1204] (2/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_175-163894-0 from training. Duration: 0.74 2023-10-04 04:00:28,773 WARNING [train.py:1204] (2/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_41-234353-0_sp0.9 from training. Duration: 0.7555625 2023-10-04 04:00:29,042 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.bypass_mid.scale_min, batch_count=1517866.6666666667, ans=0.2 2023-10-04 04:00:31,577 WARNING [train.py:1204] (2/4) Exclude cut with ID _47251_210_18628_1_1533814063128_4137250_116-275927-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 04:00:33,517 WARNING [train.py:1204] (2/4) Exclude cut with ID _4697_210_2069_1_1526815801953_3823098_61-223324-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 04:00:33,529 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6961_210_2087_1_1527937168_6815011_562-165518-0_sp1.1 from training. Duration: 0.7 2023-10-04 04:00:34,980 WARNING [train.py:1204] (2/4) Exclude cut with ID _21989_210_5854_1_1531899214931_3271360_133-44759-0 from training. Duration: 0.77 2023-10-04 04:00:37,784 WARNING [train.py:1204] (2/4) Exclude cut with ID _23557_210_8750_1_1531890106370_6846255_77-284923-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 04:00:40,975 WARNING [train.py:1204] (2/4) Exclude cut with ID _13678_210_7497_1_1530530069226_3463170_464-296127-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 04:00:42,141 WARNING [train.py:1204] (2/4) Exclude cut with ID _22613_210_5219_1_1532253599686_4090170_393-36663-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 04:00:42,237 WARNING [train.py:1204] (2/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_235-262124-0_sp0.9 from training. Duration: 0.7444375 2023-10-04 04:00:44,263 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.1.encoder.layers.1.self_attn2.whiten, num_groups=1, num_channels=256, metric=9.57 vs. limit=22.5 2023-10-04 04:00:44,855 WARNING [train.py:1204] (2/4) Exclude cut with ID _8480_210_1874_1_1528589391205_6728330_755-112625-0 from training. Duration: 0.74 2023-10-04 04:00:44,899 WARNING [train.py:1204] (2/4) Exclude cut with ID _6456_210_1534_1_1527681634212_3181049_215-309359-0 from training. Duration: 0.88 2023-10-04 04:00:44,925 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_3019_210_2028_1_1526044245_3264865_191-72235-0_sp1.1 from training. Duration: 0.8818125 2023-10-04 04:00:45,001 WARNING [train.py:1204] (2/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_380-162648-0 from training. Duration: 0.94 2023-10-04 04:00:48,226 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_92-46009-0 from training. Duration: 0.54 2023-10-04 04:00:48,251 WARNING [train.py:1204] (2/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_53-300544-0_sp0.9 from training. Duration: 0.7444375 2023-10-04 04:00:50,928 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_68235_210_1925_1_1535863247_4484656_101-250943-0_sp1.1 from training. Duration: 0.97275 2023-10-04 04:00:50,945 WARNING [train.py:1204] (2/4) Exclude cut with ID _14970_210_15709_1_1530775547361_1966965_204-22182-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 04:00:52,281 WARNING [train.py:1204] (2/4) Exclude cut with ID _55153_210_2253_1_1534384786677_3162096_310-130092-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 04:00:52,292 WARNING [train.py:1204] (2/4) Exclude cut with ID _60113_210_16271_1_1535164212481_4386079_559-167191-0_sp1.1 from training. Duration: 0.8454375 2023-10-04 04:00:53,738 WARNING [train.py:1204] (2/4) Exclude cut with ID _14368_210_4220_1_1530532714870_3595529_93-58002-0_sp1.1 from training. Duration: 0.5181875 2023-10-04 04:00:53,790 WARNING [train.py:1204] (2/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_110-224688-0 from training. Duration: 0.97 2023-10-04 04:00:55,203 WARNING [train.py:1204] (2/4) Exclude cut with ID _40402_210_18219_1_1533522324115_2953150_178-178004-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 04:00:55,212 WARNING [train.py:1204] (2/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_321-335475-0_sp1.1 from training. Duration: 0.4090625 2023-10-04 04:00:56,565 WARNING [train.py:1204] (2/4) Exclude cut with ID _66863_210_18053_1_1535698758334_8590319_1092-271784-0 from training. Duration: 0.96 2023-10-04 04:00:56,580 WARNING [train.py:1204] (2/4) Exclude cut with ID _28317_210_12742_1_1532480302381_8510325_607-213951-0_sp1.1 from training. Duration: 0.763625 2023-10-04 04:00:56,588 WARNING [train.py:1204] (2/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_468-283113-0 from training. Duration: 0.96 2023-10-04 04:00:58,352 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.671e+02 2.020e+02 2.212e+02 2.607e+02 3.843e+02, threshold=4.425e+02, percent-clipped=0.0 2023-10-04 04:00:58,519 WARNING [train.py:1204] (2/4) Exclude cut with ID _14922_210_6228_1_1530707435273_3693310_13-165512-0_sp1.1 from training. Duration: 0.7454375 2023-10-04 04:00:58,530 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_69630_210_7234_1_1535788670_910989_56-169762-0_sp1.1 from training. Duration: 0.92725 2023-10-04 04:00:58,692 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.feed_forward1.hidden_balancer.prob, batch_count=1518000.0, ans=0.125 2023-10-04 04:00:59,455 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.0.layers.1.nonlin_attention.whiten1, num_groups=1, num_channels=144, metric=5.34 vs. limit=10.0 2023-10-04 04:01:00,027 WARNING [train.py:1204] (2/4) Exclude cut with ID _67335_210_5219_1_1535525975652_4275229_121-80357-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 04:01:01,334 WARNING [train.py:1204] (2/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_126-12079-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 04:01:01,379 WARNING [train.py:1204] (2/4) Exclude cut with ID _5294_210_1747_1_1527249643928_3696179_342-270278-0_sp1.1 from training. Duration: 0.5 2023-10-04 04:01:02,771 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_30322_210_1925_1_1532843478_826817_35-328264-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 04:01:04,170 WARNING [train.py:1204] (2/4) Exclude cut with ID _8480_210_1874_1_1528589391205_6728330_755-112625-0_sp1.1 from training. Duration: 0.67275 2023-10-04 04:01:06,027 INFO [train.py:1046] (2/4) Epoch 43, batch 4600, loss[loss=0.1331, simple_loss=0.1879, pruned_loss=0.03913, over 19380.00 frames. ], tot_loss[loss=0.1541, simple_loss=0.2337, pruned_loss=0.03727, over 4708825.42 frames. ], batch size: 389, lr: 2.35e-03, grad_scale: 32.0 2023-10-04 04:01:07,558 WARNING [train.py:1204] (2/4) Exclude cut with ID _38670_210_15379_1_1533448810659_4391059_132-146412-0_sp1.1 from training. Duration: 0.97275 2023-10-04 04:01:08,897 WARNING [train.py:1204] (2/4) Exclude cut with ID _22020_210_2461_1_1531724164513_6326900_563-39691-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 04:01:12,094 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_9460_210_4211_1_1528954587_10049667_572-37035-0_sp1.1 from training. Duration: 0.72725 2023-10-04 04:01:12,109 WARNING [train.py:1204] (2/4) Exclude cut with ID _68777_210_4353_1_1535810390731_3117904_129-166012-0_sp0.9 from training. Duration: 0.9444375 2023-10-04 04:01:12,176 WARNING [train.py:1204] (2/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_487-91921-0_sp1.1 from training. Duration: 0.963625 2023-10-04 04:01:13,550 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_195-257512-0 from training. Duration: 0.67 2023-10-04 04:01:15,446 WARNING [train.py:1204] (2/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_188-260011-0_sp1.1 from training. Duration: 0.663625 2023-10-04 04:01:19,361 WARNING [train.py:1204] (2/4) Exclude cut with ID _8480_210_1874_1_1528589391205_6728330_309-139896-0_sp0.9 from training. Duration: 0.9 2023-10-04 04:01:20,793 WARNING [train.py:1204] (2/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_866-321248-0_sp1.1 from training. Duration: 0.963625 2023-10-04 04:01:22,229 WARNING [train.py:1204] (2/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_510-216513-0_sp1.1 from training. Duration: 0.97275 2023-10-04 04:01:22,560 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.conv_module1.balancer1.prob, batch_count=1518133.3333333333, ans=0.125 2023-10-04 04:01:28,211 WARNING [train.py:1204] (2/4) Exclude cut with ID _6653_210_2629_1_1527919172441_6732679_758-143677-0 from training. Duration: 0.98 2023-10-04 04:01:28,295 WARNING [train.py:1204] (2/4) Exclude cut with ID _55134_210_21982_1_1534590028377_3882170_50-214427-0_sp1.1 from training. Duration: 0.97275 2023-10-04 04:01:30,978 WARNING [train.py:1204] (2/4) Exclude cut with ID _31649_210_6228_1_1532584608063_4091078_482-301330-0_sp1.1 from training. Duration: 0.97275 2023-10-04 04:01:31,587 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.4.encoder.layers.0.nonlin_attention.whiten2, num_groups=1, num_channels=384, metric=10.30 vs. limit=15.0 2023-10-04 04:01:34,169 WARNING [train.py:1204] (2/4) Exclude cut with ID _35799_210_5854_1_1533263487887_3529071_490-326847-0_sp1.1 from training. Duration: 0.9 2023-10-04 04:01:34,183 WARNING [train.py:1204] (2/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_984-90716-0_sp1.1 from training. Duration: 0.963625 2023-10-04 04:01:41,378 WARNING [train.py:1204] (2/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_166-246557-0 from training. Duration: 0.64 2023-10-04 04:01:41,378 WARNING [train.py:1204] (2/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_51-200220-0_sp1.1 from training. Duration: 0.5090625 2023-10-04 04:01:42,740 WARNING [train.py:1204] (2/4) Exclude cut with ID _51382_210_23862_1_1534492801838_3650104_275-92796-0_sp1.1 from training. Duration: 0.936375 2023-10-04 04:01:46,116 WARNING [train.py:1204] (2/4) Exclude cut with ID _29853_210_7497_1_1532505600851_6642110_461-142829-0_sp1.1 from training. Duration: 0.97275 2023-10-04 04:01:46,170 WARNING [train.py:1204] (2/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_788-298907-0_sp0.9 from training. Duration: 0.988875 2023-10-04 04:01:47,602 WARNING [train.py:1204] (2/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_739-2098-0_sp1.1 from training. Duration: 0.836375 2023-10-04 04:01:51,523 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7543_210_6753_1_1528541907_1399322_165-323619-0 from training. Duration: 0.69 2023-10-04 04:01:52,982 WARNING [train.py:1204] (2/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_401-255491-0_sp1.1 from training. Duration: 0.636375 2023-10-04 04:01:57,118 WARNING [train.py:1204] (2/4) Exclude cut with ID _66873_210_5517_1_1535454136506_4293370_24-143283-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 04:01:58,534 WARNING [train.py:1204] (2/4) Exclude cut with ID _14459_210_2228_1_1531443641946_7516700_1221-280439-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 04:02:00,345 WARNING [train.py:1204] (2/4) Exclude cut with ID _59202_210_26984_1_1534816810612_3549174_469-213482-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 04:02:00,345 WARNING [train.py:1204] (2/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_148-158509-0_sp0.9 from training. Duration: 0.4555625 2023-10-04 04:02:01,731 WARNING [train.py:1204] (2/4) Exclude cut with ID _24314_210_4915_1_1532135078116_7170179_863-259095-0_sp1.1 from training. Duration: 0.97275 2023-10-04 04:02:01,812 WARNING [train.py:1204] (2/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_321-22821-0 from training. Duration: 0.89 2023-10-04 04:02:03,083 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_43831_210_2158_1_1533725502_5872791_55-163137-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 04:02:03,132 WARNING [train.py:1204] (2/4) Exclude cut with ID _27430_210_7770_1_1532604689375_1378590_26-25207-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 04:02:04,997 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_17613_210_2151_1_1531137569_2366160_223-316461-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 04:02:05,059 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_53679_210_4852_1_1534556726_4186141_541-213370-0_sp1.1 from training. Duration: 0.963625 2023-10-04 04:02:06,428 WARNING [train.py:1204] (2/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_369-295086-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 04:02:06,480 WARNING [train.py:1204] (2/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_91-267696-0 from training. Duration: 0.87 2023-10-04 04:02:07,769 WARNING [train.py:1204] (2/4) Exclude cut with ID _5294_210_1747_1_1527249643928_3696179_180-100713-0 from training. Duration: 0.81 2023-10-04 04:02:07,804 WARNING [train.py:1204] (2/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_32-335103-0 from training. Duration: 0.82 2023-10-04 04:02:07,810 WARNING [train.py:1204] (2/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_133-312727-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 04:02:09,209 WARNING [train.py:1204] (2/4) Exclude cut with ID _59288_210_5854_1_1535077757774_3412246_391-165934-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 04:02:09,261 WARNING [train.py:1204] (2/4) Exclude cut with ID _28323_210_16187_1_1532480395667_4100300_488-1281-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 04:02:10,591 WARNING [train.py:1204] (2/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_124-193207-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 04:02:12,615 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.conv_module2.balancer2.prob, batch_count=1518333.3333333333, ans=0.125 2023-10-04 04:02:19,788 INFO [train.py:1046] (2/4) Epoch 43, batch 4650, loss[loss=0.1527, simple_loss=0.2275, pruned_loss=0.03893, over 23503.00 frames. ], tot_loss[loss=0.1538, simple_loss=0.233, pruned_loss=0.03735, over 4699910.99 frames. ], batch size: 285, lr: 2.35e-03, grad_scale: 32.0 2023-10-04 04:02:21,331 WARNING [train.py:1204] (2/4) Exclude cut with ID _14087_210_2253_1_1530529224512_3014029_226-242140-0_sp0.9 from training. Duration: 0.9 2023-10-04 04:02:22,874 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.bypass.skip_rate, batch_count=1518400.0, ans=0.04949747468305833 2023-10-04 04:02:24,177 WARNING [train.py:1204] (2/4) Exclude cut with ID _29277_210_3279_1_1532581109293_3173069_392-3694-0_sp1.1 from training. Duration: 0.936375 2023-10-04 04:02:24,214 WARNING [train.py:1204] (2/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_563-329842-0_sp1.1 from training. Duration: 0.97275 2023-10-04 04:02:24,268 WARNING [train.py:1204] (2/4) Exclude cut with ID _58583_210_5236_1_1534726835266_3574249_391-87755-0_sp1.1 from training. Duration: 0.92725 2023-10-04 04:02:24,286 WARNING [train.py:1204] (2/4) Exclude cut with ID _28427_210_4220_1_1532581240967_3061069_281-237827-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 04:02:24,299 WARNING [train.py:1204] (2/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_44-147898-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 04:02:27,030 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_24007_210_1794_1_1531918925_1663875_125-320772-0_sp1.1 from training. Duration: 0.97275 2023-10-04 04:02:30,192 WARNING [train.py:1204] (2/4) Exclude cut with ID _14368_210_4220_1_1530532714870_3595529_143-117421-0 from training. Duration: 0.58 2023-10-04 04:02:32,990 WARNING [train.py:1204] (2/4) Exclude cut with ID _69584_210_5219_1_1535785133775_4044450_325-7656-0_sp0.9 from training. Duration: 0.9555625 2023-10-04 04:02:34,909 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_37351_210_7925_1_1533012931_3812113_265-85780-0 from training. Duration: 0.88 2023-10-04 04:02:36,156 WARNING [train.py:1204] (2/4) Exclude cut with ID _18516_210_9206_1_1531704653206_3692406_331-237896-0_sp1.1 from training. Duration: 0.936375 2023-10-04 04:02:37,633 WARNING [train.py:1204] (2/4) Exclude cut with ID _14629_210_15709_1_1530604298182_956899_6-232695-0 from training. Duration: 0.94 2023-10-04 04:02:37,653 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7544_210_6753_1_1528587000_691468_87-178599-0_sp1.1 from training. Duration: 0.7909375 2023-10-04 04:02:37,708 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_326-303945-0 from training. Duration: 0.99 2023-10-04 04:02:37,725 WARNING [train.py:1204] (2/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_503-30719-0 from training. Duration: 0.98 2023-10-04 04:02:37,733 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_11305_210_1794_1_1529750428_7178148_299-344640-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 04:02:37,946 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.ff2_skip_rate, batch_count=1518466.6666666667, ans=0.0 2023-10-04 04:02:38,995 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_53071_210_16554_1_1534490405_2954066_265-170472-0_sp1.1 from training. Duration: 0.7545625 2023-10-04 04:02:43,084 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_174-277364-0_sp1.1 from training. Duration: 0.6454375 2023-10-04 04:02:44,944 WARNING [train.py:1204] (2/4) Exclude cut with ID _58414_210_8254_1_1535421609162_7151182_193-151884-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 04:02:44,965 WARNING [train.py:1204] (2/4) Exclude cut with ID IC0651W0104-322063 from training. Duration: 0.768 2023-10-04 04:02:48,372 WARNING [train.py:1204] (2/4) Exclude cut with ID _21881_210_3525_1_1531825242757_3798380_139-305982-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 04:02:49,762 WARNING [train.py:1204] (2/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_79-200392-0 from training. Duration: 0.97 2023-10-04 04:02:51,182 WARNING [train.py:1204] (2/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_451-280894-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 04:02:51,190 WARNING [train.py:1204] (2/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_304-273433-0_sp1.1 from training. Duration: 0.9 2023-10-04 04:02:52,541 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_5959_210_1794_1_1527400400_3445726_412-44052-0 from training. Duration: 0.77 2023-10-04 04:02:52,657 WARNING [train.py:1204] (2/4) Exclude cut with ID _4681_210_3525_1_1527251040321_1235713_148-26159-0_sp1.1 from training. Duration: 0.963625 2023-10-04 04:02:55,299 WARNING [train.py:1204] (2/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_887-249555-0_sp0.9 from training. Duration: 0.8555625 2023-10-04 04:02:55,476 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=1518533.3333333333, ans=0.1 2023-10-04 04:02:59,411 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_25236_210_2158_1_1532051968_280567_37-6860-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 04:03:05,593 WARNING [train.py:1204] (2/4) Exclude cut with ID _29277_210_3279_1_1532581109293_3173069_417-115527-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 04:03:07,639 WARNING [train.py:1204] (2/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_552-134748-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 04:03:08,955 WARNING [train.py:1204] (2/4) Exclude cut with ID _43176_210_8254_1_1533524417469_4174339_188-174143-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 04:03:09,009 WARNING [train.py:1204] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_127-119109-0_sp1.1 from training. Duration: 0.6545625 2023-10-04 04:03:11,758 WARNING [train.py:1204] (2/4) Exclude cut with ID _14922_210_6228_1_1530707435273_3693310_38-187851-0 from training. Duration: 0.62 2023-10-04 04:03:11,787 WARNING [train.py:1204] (2/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_588-125165-0 from training. Duration: 0.83 2023-10-04 04:03:13,153 WARNING [train.py:1204] (2/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_344-152526-0_sp1.1 from training. Duration: 0.4818125 2023-10-04 04:03:13,155 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7002_210_1794_1_1527935634_4034444_487-236501-0 from training. Duration: 0.66 2023-10-04 04:03:16,310 WARNING [train.py:1204] (2/4) Exclude cut with ID _23825_210_13449_1_1531915313526_3497150_379-16981-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 04:03:17,860 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.ff3_skip_rate, batch_count=1518666.6666666667, ans=0.0 2023-10-04 04:03:23,603 WARNING [train.py:1204] (2/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_297-5233-0_sp1.1 from training. Duration: 0.863625 2023-10-04 04:03:23,615 WARNING [train.py:1204] (2/4) Exclude cut with ID _41298_210_9019_1_1533430371450_4822143_178-283538-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 04:03:23,653 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7046_210_6753_1_1527895337_6920917_642-106799-0 from training. Duration: 0.92 2023-10-04 04:03:24,962 WARNING [train.py:1204] (2/4) Exclude cut with ID _6274_210_2084_1_1527937583837_7494020_532-291952-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 04:03:26,333 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.656e+02 2.022e+02 2.193e+02 2.608e+02 3.826e+02, threshold=4.386e+02, percent-clipped=0.0 2023-10-04 04:03:26,419 WARNING [train.py:1204] (2/4) Exclude cut with ID _47278_210_12041_1_1533902418349_3576110_260-270468-0_sp1.1 from training. Duration: 0.92725 2023-10-04 04:03:26,424 WARNING [train.py:1204] (2/4) Exclude cut with ID _14368_210_4220_1_1530532714870_3595529_144-3287-0_sp0.9 from training. Duration: 0.7444375 2023-10-04 04:03:26,526 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_162-262717-0_sp0.9 from training. Duration: 0.92225 2023-10-04 04:03:29,283 WARNING [train.py:1204] (2/4) Exclude cut with ID _3465_210_3525_1_1526175667060_3574049_62-335552-0_sp0.9 from training. Duration: 0.9666875 2023-10-04 04:03:29,288 WARNING [train.py:1204] (2/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_726-272852-0_sp1.1 from training. Duration: 0.92725 2023-10-04 04:03:30,643 WARNING [train.py:1204] (2/4) Exclude cut with ID _14898_210_9206_1_1530766812377_4177190_26-135492-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 04:03:33,368 INFO [train.py:1046] (2/4) Epoch 43, batch 4700, loss[loss=0.1559, simple_loss=0.2332, pruned_loss=0.03927, over 23424.00 frames. ], tot_loss[loss=0.154, simple_loss=0.2334, pruned_loss=0.0373, over 4703955.56 frames. ], batch size: 285, lr: 2.35e-03, grad_scale: 32.0 2023-10-04 04:03:33,437 WARNING [train.py:1204] (2/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_200-95304-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 04:03:33,482 WARNING [train.py:1204] (2/4) Exclude cut with ID _45012_210_12651_1_1533608435136_5371380_240-37101-0_sp1.1 from training. Duration: 0.8454375 2023-10-04 04:03:33,491 WARNING [train.py:1204] (2/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_132-269633-0_sp0.9 from training. Duration: 0.7555625 2023-10-04 04:03:33,657 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.feed_forward1.hidden_balancer.prob, batch_count=1518733.3333333333, ans=0.125 2023-10-04 04:03:35,376 WARNING [train.py:1204] (2/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_907-151398-0_sp0.9 from training. Duration: 0.411125 2023-10-04 04:03:35,466 WARNING [train.py:1204] (2/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_45-265684-0_sp0.9 from training. Duration: 0.8 2023-10-04 04:03:36,939 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6291_210_3273_1_1527683649_1078498_74-292608-0 from training. Duration: 0.72 2023-10-04 04:03:47,514 WARNING [train.py:1204] (2/4) Exclude cut with ID _59202_210_26984_1_1534816810612_3549174_221-84819-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 04:03:48,880 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_33696_210_3852_1_1533082780_2663375_181-325595-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 04:03:48,923 WARNING [train.py:1204] (2/4) Exclude cut with ID _47251_210_18628_1_1533814063128_4137250_351-154105-0_sp1.1 from training. Duration: 0.963625 2023-10-04 04:03:49,012 WARNING [train.py:1204] (2/4) Exclude cut with ID _35501_210_9288_1_1532998507986_3192189_351-334740-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 04:03:50,853 WARNING [train.py:1204] (2/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_342-100157-0_sp1.1 from training. Duration: 0.4909375 2023-10-04 04:03:51,000 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.balancer1.prob, batch_count=1518800.0, ans=0.125 2023-10-04 04:03:55,194 WARNING [train.py:1204] (2/4) Exclude cut with ID _6789_210_1534_1_1527857937218_3118950_262-48853-0 from training. Duration: 0.56 2023-10-04 04:03:56,429 WARNING [train.py:1204] (2/4) Exclude cut with ID _69884_210_9771_1_1535846392818_4166169_430-201020-0 from training. Duration: 0.97 2023-10-04 04:03:56,601 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_71-136518-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 04:03:57,932 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_28-291982-0_sp1.1 from training. Duration: 0.8818125 2023-10-04 04:03:57,963 WARNING [train.py:1204] (2/4) Exclude cut with ID _54218_210_12210_1_1534413646388_4409259_437-275326-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 04:04:02,067 WARNING [train.py:1204] (2/4) Exclude cut with ID _55134_210_21982_1_1534590028377_3882170_40-309139-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 04:04:04,371 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.1.encoder.layers.0.feed_forward1.out_whiten, num_groups=1, num_channels=256, metric=4.81 vs. limit=15.0 2023-10-04 04:04:08,041 WARNING [train.py:1204] (2/4) Exclude cut with ID _32704_210_8954_1_1533017488840_6747370_266-329755-0_sp1.1 from training. Duration: 0.8090625 2023-10-04 04:04:08,141 WARNING [train.py:1204] (2/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_21-174514-0_sp1.1 from training. Duration: 0.3909375 2023-10-04 04:04:11,314 WARNING [train.py:1204] (2/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_409-207378-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 04:04:16,231 WARNING [train.py:1204] (2/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_132-269633-0 from training. Duration: 0.68 2023-10-04 04:04:17,453 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_2245_210_2715_1_1525511548_3540254_310-52483-0_sp0.9 from training. Duration: 0.97775 2023-10-04 04:04:18,843 WARNING [train.py:1204] (2/4) Exclude cut with ID _27430_210_7770_1_1532604689375_1378590_47-77403-0_sp1.1 from training. Duration: 0.92725 2023-10-04 04:04:19,032 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.balancer1.prob, batch_count=1518933.3333333333, ans=0.125 2023-10-04 04:04:22,375 WARNING [train.py:1204] (2/4) Exclude cut with ID _7562_210_2084_1_1528887767916_3946229_186-326422-0 from training. Duration: 0.84 2023-10-04 04:04:23,718 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_14056_210_4163_1_1530604483_7572827_230-35993-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 04:04:27,617 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_23854_210_1794_1_1531969696_1329157_57-123491-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 04:04:28,889 WARNING [train.py:1204] (2/4) Exclude cut with ID _14747_210_8341_1_1530685797463_3654089_147-131229-0 from training. Duration: 0.76 2023-10-04 04:04:29,000 WARNING [train.py:1204] (2/4) Exclude cut with ID _30191_210_1664_1_1533189915047_7196670_430-19539-0_sp1.1 from training. Duration: 0.92725 2023-10-04 04:04:29,014 WARNING [train.py:1204] (2/4) Exclude cut with ID _67725_210_27767_1_1535608756671_5105359_44-50762-0_sp1.1 from training. Duration: 0.963625 2023-10-04 04:04:31,840 WARNING [train.py:1204] (2/4) Exclude cut with ID _27485_210_4916_1_1532739191677_3668269_217-161755-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 04:04:31,874 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_5931_210_2040_1_1527428481_4547999_321-193614-0_sp1.1 from training. Duration: 0.6090625 2023-10-04 04:04:33,225 WARNING [train.py:1204] (2/4) Exclude cut with ID _6267_210_2084_1_1527764658526_3894730_375-219887-0 from training. Duration: 0.67 2023-10-04 04:04:33,306 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9087W0022-538393 from training. Duration: 0.768 2023-10-04 04:04:33,504 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.nonlin_attention.balancer.prob, batch_count=1519000.0, ans=0.125 2023-10-04 04:04:33,510 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.bypass.skip_rate, batch_count=1519000.0, ans=0.09899494936611666 2023-10-04 04:04:34,107 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.1.encoder.layers.0.feed_forward2.out_whiten, num_groups=1, num_channels=256, metric=4.32 vs. limit=15.0 2023-10-04 04:04:34,804 WARNING [train.py:1204] (2/4) Exclude cut with ID _68777_210_4353_1_1535810390731_3117904_83-196459-0_sp1.1 from training. Duration: 0.963625 2023-10-04 04:04:38,809 WARNING [train.py:1204] (2/4) Exclude cut with ID _69884_210_9771_1_1535846392818_4166169_455-343812-0_sp1.1 from training. Duration: 0.92725 2023-10-04 04:04:38,810 WARNING [train.py:1204] (2/4) Exclude cut with ID _13916_210_9994_1_1530788341126_3763149_414-320174-0_sp1.1 from training. Duration: 0.92725 2023-10-04 04:04:38,814 WARNING [train.py:1204] (2/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_398-239819-0 from training. Duration: 0.99 2023-10-04 04:04:40,189 WARNING [train.py:1204] (2/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_199-232314-0_sp1.1 from training. Duration: 0.92725 2023-10-04 04:04:45,638 WARNING [train.py:1204] (2/4) Exclude cut with ID _2334_210_2667_1_1525608238880_4705129_459-104997-0 from training. Duration: 0.95 2023-10-04 04:04:48,891 INFO [train.py:1046] (2/4) Epoch 43, batch 4750, loss[loss=0.1578, simple_loss=0.2453, pruned_loss=0.03519, over 24488.00 frames. ], tot_loss[loss=0.1551, simple_loss=0.2346, pruned_loss=0.03778, over 4705931.12 frames. ], batch size: 66, lr: 2.35e-03, grad_scale: 16.0 2023-10-04 04:04:48,993 WARNING [train.py:1204] (2/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_441-209395-0_sp1.1 from training. Duration: 0.936375 2023-10-04 04:04:49,972 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.0.layers.0.feed_forward2.out_whiten, num_groups=1, num_channels=192, metric=6.37 vs. limit=15.0 2023-10-04 04:04:50,357 WARNING [train.py:1204] (2/4) Exclude cut with ID _27052_210_5986_1_1532329197187_3585009_320-80393-0_sp1.1 from training. Duration: 0.97275 2023-10-04 04:04:50,650 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.conv_module2.balancer1.prob, batch_count=1519066.6666666667, ans=0.125 2023-10-04 04:04:55,079 WARNING [train.py:1204] (2/4) Exclude cut with ID _21783_210_11357_1_1531742458730_3762679_89-217253-0_sp1.1 from training. Duration: 0.97275 2023-10-04 04:04:55,093 WARNING [train.py:1204] (2/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_453-282588-0_sp1.1 from training. Duration: 0.8909375 2023-10-04 04:04:56,590 WARNING [train.py:1204] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_88-341718-0 from training. Duration: 0.66 2023-10-04 04:04:56,629 WARNING [train.py:1204] (2/4) Exclude cut with ID _43552_210_8913_1_1533726031059_3523429_230-3521-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 04:04:59,346 WARNING [train.py:1204] (2/4) Exclude cut with ID _9679_210_7497_1_1529129212906_4363929_512-313889-0 from training. Duration: 0.81 2023-10-04 04:05:01,997 WARNING [train.py:1204] (2/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_194-252800-0_sp1.1 from training. Duration: 0.8818125 2023-10-04 04:05:02,033 WARNING [train.py:1204] (2/4) Exclude cut with ID _55209_210_1531_1_1534675828996_4597160_239-293541-0_sp1.1 from training. Duration: 0.963625 2023-10-04 04:05:03,316 WARNING [train.py:1204] (2/4) Exclude cut with ID _35799_210_5854_1_1533263487887_3529071_15-325131-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 04:05:07,533 WARNING [train.py:1204] (2/4) Exclude cut with ID _14100_210_8341_1_1530529320613_3678069_279-55760-0 from training. Duration: 0.93 2023-10-04 04:05:09,554 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_489-230703-0_sp1.1 from training. Duration: 0.77275 2023-10-04 04:05:11,401 WARNING [train.py:1204] (2/4) Exclude cut with ID _6694_210_4599_1_1527852637490_3965529_144-185051-0 from training. Duration: 0.96 2023-10-04 04:05:12,254 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.1.encoder.layers.1.conv_module2.whiten, num_groups=1, num_channels=256, metric=9.97 vs. limit=15.0 2023-10-04 04:05:12,737 WARNING [train.py:1204] (2/4) Exclude cut with ID _28461_210_2253_1_1532771775189_3041299_1-228155-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 04:05:14,447 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.nonlin_attention.balancer.prob, batch_count=1519133.3333333333, ans=0.125 2023-10-04 04:05:15,364 WARNING [train.py:1204] (2/4) Exclude cut with ID _32704_210_8954_1_1533017488840_6747370_686-252323-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 04:05:15,366 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_40896_210_15710_1_1533433956_7100802_1075-294633-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 04:05:15,385 WARNING [train.py:1204] (2/4) Exclude cut with ID _22083_210_8254_1_1531702862439_3678970_30-92879-0_sp1.1 from training. Duration: 0.97275 2023-10-04 04:05:17,354 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9245W0121-616888 from training. Duration: 0.896 2023-10-04 04:05:17,357 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6786_210_1388_1_1527839699_4194495_378-46070-0 from training. Duration: 0.72 2023-10-04 04:05:19,527 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.3.encoder.layers.2.self_attn1.whiten, num_groups=1, num_channels=512, metric=11.76 vs. limit=22.5 2023-10-04 04:05:23,470 WARNING [train.py:1204] (2/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_317-175062-0 from training. Duration: 0.82 2023-10-04 04:05:26,178 WARNING [train.py:1204] (2/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_238-89390-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 04:05:27,547 WARNING [train.py:1204] (2/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_524-310811-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 04:05:30,473 WARNING [train.py:1204] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_307-158462-0_sp0.9 from training. Duration: 0.7666875 2023-10-04 04:05:30,473 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9309W0191-640318 from training. Duration: 0.512 2023-10-04 04:05:30,480 WARNING [train.py:1204] (2/4) Exclude cut with ID _55153_210_2253_1_1534384786677_3162096_100-204617-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 04:05:30,793 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.ff3_skip_rate, batch_count=1519200.0, ans=0.0 2023-10-04 04:05:31,986 WARNING [train.py:1204] (2/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_112-149213-0_sp1.1 from training. Duration: 0.863625 2023-10-04 04:05:33,706 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.balancer2.prob, batch_count=1519266.6666666667, ans=0.125 2023-10-04 04:05:34,838 WARNING [train.py:1204] (2/4) Exclude cut with ID _67831_210_12392_1_1535612464202_3092612_346-2148-0_sp1.1 from training. Duration: 0.8454375 2023-10-04 04:05:37,564 WARNING [train.py:1204] (2/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_51-117576-0_sp0.9 from training. Duration: 0.4444375 2023-10-04 04:05:37,585 WARNING [train.py:1204] (2/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_300-27235-0 from training. Duration: 0.87 2023-10-04 04:05:39,494 WARNING [train.py:1204] (2/4) Exclude cut with ID _40399_210_4353_1_1533305658194_3279406_195-116825-0_sp1.1 from training. Duration: 0.97275 2023-10-04 04:05:39,516 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7002_210_1794_1_1527939683_5157286_163-87552-0_sp0.9 from training. Duration: 0.9555625 2023-10-04 04:05:40,876 WARNING [train.py:1204] (2/4) Exclude cut with ID _19514_210_9259_1_1531530071330_5084230_560-237597-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 04:05:42,617 WARNING [train.py:1204] (2/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_41-234353-0_sp1.1 from training. Duration: 0.6181875 2023-10-04 04:05:42,634 WARNING [train.py:1204] (2/4) Exclude cut with ID _6274_210_2084_1_1527937583837_7494020_769-168302-0 from training. Duration: 0.71 2023-10-04 04:05:45,386 WARNING [train.py:1204] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_574-45541-0 from training. Duration: 0.65 2023-10-04 04:05:48,142 WARNING [train.py:1204] (2/4) Exclude cut with ID _50310_210_15488_1_1534068043345_3641080_262-67120-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 04:05:50,926 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_53071_210_16554_1_1534493434_1276796_35-11927-0_sp1.1 from training. Duration: 0.963625 2023-10-04 04:05:50,928 WARNING [train.py:1204] (2/4) Exclude cut with ID _3992_210_1747_1_1526644816031_3731060_107-251434-0 from training. Duration: 0.99 2023-10-04 04:05:51,582 WARNING [train.py:1204] (2/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_412-54488-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 04:05:52,885 WARNING [train.py:1204] (2/4) Exclude cut with ID _18516_210_9206_1_1531704653206_3692406_77-258490-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 04:05:53,243 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.balancer1.prob, batch_count=1519333.3333333333, ans=0.125 2023-10-04 04:05:54,336 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_40047_210_2290_1_1533290519_2374311_195-165693-0_sp0.9 from training. Duration: 0.988875 2023-10-04 04:05:54,388 WARNING [train.py:1204] (2/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_296-36971-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 04:05:56,265 WARNING [train.py:1204] (2/4) Exclude cut with ID _14970_210_15709_1_1530773543493_1715730_187-20259-0_sp1.1 from training. Duration: 0.8090625 2023-10-04 04:05:57,482 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.570e+02 2.030e+02 2.225e+02 2.486e+02 3.690e+02, threshold=4.449e+02, percent-clipped=0.0 2023-10-04 04:06:00,137 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_53373_210_5399_1_1534229471_4715695_135-305927-0_sp1.1 from training. Duration: 0.936375 2023-10-04 04:06:00,177 WARNING [train.py:1204] (2/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_60-18123-0 from training. Duration: 0.88 2023-10-04 04:06:01,551 WARNING [train.py:1204] (2/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_166-166990-0 from training. Duration: 0.96 2023-10-04 04:06:02,913 INFO [train.py:1046] (2/4) Epoch 43, batch 4800, loss[loss=0.1334, simple_loss=0.2121, pruned_loss=0.02737, over 24419.00 frames. ], tot_loss[loss=0.1557, simple_loss=0.2356, pruned_loss=0.0379, over 4715064.57 frames. ], batch size: 58, lr: 2.35e-03, grad_scale: 32.0 2023-10-04 04:06:02,980 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_5438_210_1794_1_1527336687_2600009_233-58072-0 from training. Duration: 0.77 2023-10-04 04:06:05,746 WARNING [train.py:1204] (2/4) Exclude cut with ID _8833_210_5219_1_1528592665210_7553059_611-80313-0_sp0.9 from training. Duration: 0.711125 2023-10-04 04:06:05,772 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_53555_210_5632_1_1534595220_7232297_118-43239-0_sp1.1 from training. Duration: 0.936375 2023-10-04 04:06:07,143 WARNING [train.py:1204] (2/4) Exclude cut with ID _12369_210_4381_1_1530356078581_4388520_480-13146-0 from training. Duration: 0.85 2023-10-04 04:06:10,718 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.2.encoder.layers.2.whiten, num_groups=1, num_channels=384, metric=3.95 vs. limit=12.0 2023-10-04 04:06:11,926 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_892-28814-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 04:06:13,224 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_24899_210_16554_1_1532046422_3827462_160-21255-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 04:06:19,175 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_154-316450-0_sp1.1 from training. Duration: 0.6909375 2023-10-04 04:06:20,026 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.1.encoder.layers.1.self_attn2.whiten, num_groups=1, num_channels=256, metric=10.43 vs. limit=22.5 2023-10-04 04:06:20,480 WARNING [train.py:1204] (2/4) Exclude cut with ID _30196_210_1664_1_1533708496522_7074140_1225-301232-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 04:06:20,516 WARNING [train.py:1204] (2/4) Exclude cut with ID _28783_210_7943_1_1532671164808_7155479_414-208100-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 04:06:20,554 WARNING [train.py:1204] (2/4) Exclude cut with ID _19023_210_4353_1_1531378892552_5820780_83-254794-0 from training. Duration: 0.86 2023-10-04 04:06:20,830 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.feed_forward2.hidden_balancer.prob, batch_count=1519466.6666666667, ans=0.125 2023-10-04 04:06:21,962 WARNING [train.py:1204] (2/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_591-83752-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 04:06:23,950 WARNING [train.py:1204] (2/4) Exclude cut with ID _29877_210_6228_1_1532397791106_5276149_266-62291-0_sp1.1 from training. Duration: 0.7909375 2023-10-04 04:06:24,063 WARNING [train.py:1204] (2/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_280-161633-0_sp1.1 from training. Duration: 0.663625 2023-10-04 04:06:26,651 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7543_210_6753_1_1528539468_333764_39-156634-0_sp1.1 from training. Duration: 0.97275 2023-10-04 04:06:29,319 WARNING [train.py:1204] (2/4) Exclude cut with ID _67499_210_17282_1_1535509805220_3692430_505-312248-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 04:06:29,348 WARNING [train.py:1204] (2/4) Exclude cut with ID _5294_210_1747_1_1527249643928_3696179_180-100713-0_sp0.9 from training. Duration: 0.9 2023-10-04 04:06:31,207 WARNING [train.py:1204] (2/4) Exclude cut with ID _26528_210_9070_1_1532570410842_3851310_508-148132-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 04:06:31,217 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_211-339622-0_sp1.1 from training. Duration: 0.4090625 2023-10-04 04:06:31,239 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_40896_210_15710_1_1533433956_7100802_339-138356-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 04:06:32,566 WARNING [train.py:1204] (2/4) Exclude cut with ID _27060_210_5986_1_1532674838922_3522549_211-19933-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 04:06:35,307 WARNING [train.py:1204] (2/4) Exclude cut with ID _60229_210_27767_1_1534838406158_3655350_457-117923-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 04:06:38,076 WARNING [train.py:1204] (2/4) Exclude cut with ID _55429_210_10626_1_1534467588527_7174143_460-141439-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 04:06:39,415 WARNING [train.py:1204] (2/4) Exclude cut with ID _59170_210_6721_1_1534983633960_4203335_263-10780-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 04:06:39,427 WARNING [train.py:1204] (2/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_872-264525-0_sp0.9 from training. Duration: 0.72225 2023-10-04 04:06:40,034 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.3.encoder.layers.3.nonlin_attention.whiten1, num_groups=1, num_channels=384, metric=5.16 vs. limit=10.0 2023-10-04 04:06:40,854 WARNING [train.py:1204] (2/4) Exclude cut with ID _7624_210_1531_1_1528109802198_4933101_418-349834-0_sp0.9 from training. Duration: 0.6666875 2023-10-04 04:06:40,962 WARNING [train.py:1204] (2/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_27-53998-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 04:06:44,251 WARNING [train.py:1204] (2/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_202-183922-0 from training. Duration: 0.85 2023-10-04 04:06:44,263 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7002_210_1794_1_1527939683_5157286_1-281264-0 from training. Duration: 0.44 2023-10-04 04:06:44,331 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_53753_210_7234_1_1534320003_3594148_493-9314-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 04:06:45,727 WARNING [train.py:1204] (2/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_217-95056-0_sp1.1 from training. Duration: 0.8909375 2023-10-04 04:06:45,772 WARNING [train.py:1204] (2/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_406-45086-0_sp1.1 from training. Duration: 0.72725 2023-10-04 04:06:45,778 WARNING [train.py:1204] (2/4) Exclude cut with ID _31649_210_6228_1_1532584608063_4091078_505-213821-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 04:06:45,787 WARNING [train.py:1204] (2/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_317-175062-0_sp0.9 from training. Duration: 0.911125 2023-10-04 04:06:47,845 WARNING [train.py:1204] (2/4) Exclude cut with ID _5444_210_2151_1_1527418808464_5312210_726-27165-0_sp1.1 from training. Duration: 0.6090625 2023-10-04 04:06:49,171 WARNING [train.py:1204] (2/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_494-170437-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 04:06:49,359 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.1.feed_forward2.hidden_balancer.prob, batch_count=1519600.0, ans=0.125 2023-10-04 04:06:52,364 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_474-64054-0_sp1.1 from training. Duration: 0.936375 2023-10-04 04:06:52,532 WARNING [train.py:1204] (2/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_521-7556-0_sp1.1 from training. Duration: 0.97275 2023-10-04 04:06:55,356 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_269-348505-0_sp1.1 from training. Duration: 0.92725 2023-10-04 04:06:57,097 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.feed_forward2.hidden_balancer.prob, batch_count=1519600.0, ans=0.125 2023-10-04 04:07:00,020 WARNING [train.py:1204] (2/4) Exclude cut with ID _21878_210_3525_1_1531738772002_7264330_559-184862-0 from training. Duration: 0.94 2023-10-04 04:07:00,057 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_68702_210_23689_1_1535799577_3420068_92-76040-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 04:07:00,090 WARNING [train.py:1204] (2/4) Exclude cut with ID _22826_210_9994_1_1531908201522_3279169_227-102052-0_sp1.1 from training. Duration: 0.97275 2023-10-04 04:07:01,427 WARNING [train.py:1204] (2/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_718-250907-0_sp0.9 from training. Duration: 0.7666875 2023-10-04 04:07:02,823 WARNING [train.py:1204] (2/4) Exclude cut with ID _14069_210_5632_1_1530512692033_4803390_352-143891-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 04:07:04,351 WARNING [train.py:1204] (2/4) Exclude cut with ID _6267_210_2084_1_1527764658526_3894730_65-106602-0_sp1.1 from training. Duration: 0.936375 2023-10-04 04:07:05,652 WARNING [train.py:1204] (2/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_202-183922-0_sp0.9 from training. Duration: 0.9444375 2023-10-04 04:07:05,659 WARNING [train.py:1204] (2/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_465-268827-0_sp1.1 from training. Duration: 0.97275 2023-10-04 04:07:07,079 WARNING [train.py:1204] (2/4) Exclude cut with ID _12369_210_4381_1_1530356078581_4388520_58-29064-0_sp1.1 from training. Duration: 0.9 2023-10-04 04:07:08,324 WARNING [train.py:1204] (2/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_1-99680-0_sp0.9 from training. Duration: 0.7444375 2023-10-04 04:07:09,595 WARNING [train.py:1204] (2/4) Exclude cut with ID _13253_210_1925_1_1530757775819_3577873_191-144977-0_sp1.1 from training. Duration: 0.7090625 2023-10-04 04:07:11,047 WARNING [train.py:1204] (2/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_276-112684-0_sp1.1 from training. Duration: 0.92725 2023-10-04 04:07:12,859 WARNING [train.py:1204] (2/4) Exclude cut with ID _64639_210_5816_1_1535367619437_3528000_358-411-0_sp1.1 from training. Duration: 0.97275 2023-10-04 04:07:12,873 WARNING [train.py:1204] (2/4) Exclude cut with ID _31649_210_6228_1_1532584608063_4091078_106-21199-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 04:07:14,311 WARNING [train.py:1204] (2/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_901-79438-0 from training. Duration: 0.94 2023-10-04 04:07:17,520 INFO [train.py:1046] (2/4) Epoch 43, batch 4850, loss[loss=0.1524, simple_loss=0.2202, pruned_loss=0.04232, over 23596.00 frames. ], tot_loss[loss=0.1556, simple_loss=0.2352, pruned_loss=0.038, over 4713701.45 frames. ], batch size: 256, lr: 2.35e-03, grad_scale: 32.0 2023-10-04 04:07:17,597 WARNING [train.py:1204] (2/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_718-250907-0 from training. Duration: 0.69 2023-10-04 04:07:17,602 WARNING [train.py:1204] (2/4) Exclude cut with ID _5287_210_2084_1_1527418883040_3878670_363-194161-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 04:07:17,606 WARNING [train.py:1204] (2/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_586-321321-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 04:07:17,678 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_68900_210_2087_1_1535713157_3708529_164-292661-0_sp1.1 from training. Duration: 0.963625 2023-10-04 04:07:17,679 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_26433_210_12108_1_1532157158_4056480_114-144878-0_sp1.1 from training. Duration: 0.97275 2023-10-04 04:07:18,086 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.5.encoder.layers.1.whiten, num_groups=1, num_channels=256, metric=2.18 vs. limit=12.0 2023-10-04 04:07:21,665 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_15517_210_6753_1_1530954008_3487336_96-341874-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 04:07:27,834 WARNING [train.py:1204] (2/4) Exclude cut with ID _14268_210_9206_1_1530622771285_4714099_637-86630-0 from training. Duration: 0.51 2023-10-04 04:07:29,206 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_28211_210_6388_1_1532861506_4523224_121-320092-0_sp1.1 from training. Duration: 0.92725 2023-10-04 04:07:32,681 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_141-89113-0_sp0.9 from training. Duration: 0.9555625 2023-10-04 04:07:33,988 WARNING [train.py:1204] (2/4) Exclude cut with ID _6789_210_1534_1_1527857937218_3118950_311-61262-0_sp1.1 from training. Duration: 0.5909375 2023-10-04 04:07:34,022 WARNING [train.py:1204] (2/4) Exclude cut with ID _58480_210_12392_1_1534811121111_3646160_389-200461-0_sp1.1 from training. Duration: 0.97275 2023-10-04 04:07:38,116 WARNING [train.py:1204] (2/4) Exclude cut with ID _41326_210_5854_1_1533778076509_3492080_28-156553-0_sp1.1 from training. Duration: 0.92725 2023-10-04 04:07:39,440 WARNING [train.py:1204] (2/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_577-287150-0_sp1.1 from training. Duration: 0.7454375 2023-10-04 04:07:40,721 WARNING [train.py:1204] (2/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_40-129756-0_sp1.1 from training. Duration: 0.736375 2023-10-04 04:07:40,746 WARNING [train.py:1204] (2/4) Exclude cut with ID _60113_210_16271_1_1535164212481_4386079_490-280290-0 from training. Duration: 0.98 2023-10-04 04:07:45,435 WARNING [train.py:1204] (2/4) Exclude cut with ID _58059_210_5854_1_1534901471149_3164808_34-39905-0_sp1.1 from training. Duration: 0.963625 2023-10-04 04:07:46,865 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_791-197891-0_sp1.1 from training. Duration: 0.763625 2023-10-04 04:07:48,237 WARNING [train.py:1204] (2/4) Exclude cut with ID _6789_210_1534_1_1527857937218_3118950_262-48853-0_sp1.1 from training. Duration: 0.5090625 2023-10-04 04:07:49,503 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_2655_210_2087_1_1525688959_3897400_52-279899-0_sp0.9 from training. Duration: 0.7666875 2023-10-04 04:07:49,507 WARNING [train.py:1204] (2/4) Exclude cut with ID _14884_210_2252_1_1530707459689_5561250_114-201399-0 from training. Duration: 0.76 2023-10-04 04:07:50,943 WARNING [train.py:1204] (2/4) Exclude cut with ID _19023_210_4353_1_1531378892552_5820780_83-254794-0_sp0.9 from training. Duration: 0.9555625 2023-10-04 04:07:52,237 WARNING [train.py:1204] (2/4) Exclude cut with ID _53489_210_6228_1_1534298357169_3835970_10-297919-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 04:07:55,587 WARNING [train.py:1204] (2/4) Exclude cut with ID _44585_210_18628_1_1533605350387_4518199_418-3055-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 04:07:55,600 WARNING [train.py:1204] (2/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_310-109461-0 from training. Duration: 0.8 2023-10-04 04:07:55,635 WARNING [train.py:1204] (2/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_235-262124-0 from training. Duration: 0.67 2023-10-04 04:07:57,015 WARNING [train.py:1204] (2/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_195-167011-0_sp1.1 from training. Duration: 0.6818125 2023-10-04 04:08:03,085 INFO [scaling.py:1118] (2/4) WithLoss: name=encoder.encoders.4.encoder.layers.0.self_attn_weights, loss-sum=0.000e+00 2023-10-04 04:08:05,723 WARNING [train.py:1204] (2/4) Exclude cut with ID _7852_210_5219_1_1528286471316_8139400_188-55162-0_sp1.1 from training. Duration: 0.936375 2023-10-04 04:08:07,097 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_35361_210_4238_1_1533262810_7793476_675-87012-0 from training. Duration: 0.89 2023-10-04 04:08:07,180 WARNING [train.py:1204] (2/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_465-81427-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 04:08:08,465 WARNING [train.py:1204] (2/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_597-154640-0_sp1.1 from training. Duration: 0.8181875 2023-10-04 04:08:10,074 WARNING [train.py:1204] (2/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_186-309161-0_sp0.9 from training. Duration: 0.6 2023-10-04 04:08:11,440 WARNING [train.py:1204] (2/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_546-119312-0 from training. Duration: 0.99 2023-10-04 04:08:11,445 WARNING [train.py:1204] (2/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_330-240060-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 04:08:11,529 WARNING [train.py:1204] (2/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_477-255782-0 from training. Duration: 0.82 2023-10-04 04:08:11,546 WARNING [train.py:1204] (2/4) Exclude cut with ID _14970_210_15709_1_1530773543493_1715730_150-248939-0_sp1.1 from training. Duration: 0.97275 2023-10-04 04:08:12,906 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_556-206615-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 04:08:14,238 WARNING [train.py:1204] (2/4) Exclude cut with ID _3465_210_3525_1_1526175667060_3574049_308-231735-0 from training. Duration: 0.73 2023-10-04 04:08:24,388 WARNING [train.py:1204] (2/4) Exclude cut with ID _27485_210_4916_1_1532739191677_3668269_66-236571-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 04:08:26,940 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.675e+02 2.023e+02 2.216e+02 2.527e+02 3.488e+02, threshold=4.432e+02, percent-clipped=0.0 2023-10-04 04:08:30,482 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_2221_210_1988_1_1525597051_4935541_415-296882-0_sp0.9 from training. Duration: 0.8555625 2023-10-04 04:08:31,789 WARNING [train.py:1204] (2/4) Exclude cut with ID _64521_210_14699_1_1535243427259_3815328_231-78630-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 04:08:32,139 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=1520066.6666666667, ans=0.1 2023-10-04 04:08:33,726 INFO [train.py:1046] (2/4) Epoch 43, batch 4900, loss[loss=0.1483, simple_loss=0.2234, pruned_loss=0.03665, over 24608.00 frames. ], tot_loss[loss=0.1553, simple_loss=0.2352, pruned_loss=0.0377, over 4719238.20 frames. ], batch size: 60, lr: 2.35e-03, grad_scale: 32.0 2023-10-04 04:08:35,276 WARNING [train.py:1204] (2/4) Exclude cut with ID _13942_210_9988_1_1530604906036_3733360_499-55676-0 from training. Duration: 0.62 2023-10-04 04:08:35,285 WARNING [train.py:1204] (2/4) Exclude cut with ID _49376_210_5986_1_1533985278732_3370740_301-154763-0_sp1.1 from training. Duration: 0.8909375 2023-10-04 04:08:40,750 WARNING [train.py:1204] (2/4) Exclude cut with ID _27485_210_4916_1_1532739191677_3668269_255-216698-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 04:08:42,081 WARNING [train.py:1204] (2/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_137-334143-0_sp1.1 from training. Duration: 0.97275 2023-10-04 04:08:42,106 WARNING [train.py:1204] (2/4) Exclude cut with ID _6456_210_1534_1_1527681634212_3181049_215-309359-0_sp1.1 from training. Duration: 0.8 2023-10-04 04:08:44,960 WARNING [train.py:1204] (2/4) Exclude cut with ID _65640_210_6310_1_1535368201445_3173250_209-13008-0 from training. Duration: 0.99 2023-10-04 04:08:51,418 WARNING [train.py:1204] (2/4) Exclude cut with ID _12369_210_4381_1_1530356078581_4388520_58-29064-0 from training. Duration: 0.99 2023-10-04 04:08:55,545 WARNING [train.py:1204] (2/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_410-287857-0 from training. Duration: 0.93 2023-10-04 04:08:55,624 WARNING [train.py:1204] (2/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_153-153537-0 from training. Duration: 0.91 2023-10-04 04:08:55,661 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7544_210_6753_1_1528587722_3576686_172-171750-0_sp0.9 from training. Duration: 0.72225 2023-10-04 04:08:55,699 WARNING [train.py:1204] (2/4) Exclude cut with ID _64521_210_14699_1_1535243427259_3815328_464-183853-0_sp1.1 from training. Duration: 0.97275 2023-10-04 04:08:56,998 WARNING [train.py:1204] (2/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_385-210308-0_sp1.1 from training. Duration: 0.963625 2023-10-04 04:08:57,006 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_46013_210_7234_1_1533776362_3744032_354-261865-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 04:08:57,012 WARNING [train.py:1204] (2/4) Exclude cut with ID _3067_210_2667_1_1526212974640_4503168_250-285937-0_sp1.1 from training. Duration: 0.72725 2023-10-04 04:08:57,081 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_40036_210_13405_1_1533648647_1315388_48-146016-0 from training. Duration: 0.93 2023-10-04 04:08:57,333 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=1520133.3333333333, ans=0.1 2023-10-04 04:09:00,371 WARNING [train.py:1204] (2/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_280-161633-0 from training. Duration: 0.73 2023-10-04 04:09:00,416 WARNING [train.py:1204] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_308-161067-0_sp0.9 from training. Duration: 0.7333125 2023-10-04 04:09:01,867 WARNING [train.py:1204] (2/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_68-212801-0_sp0.9 from training. Duration: 0.811125 2023-10-04 04:09:03,535 WARNING [train.py:1204] (2/4) Exclude cut with ID _6789_210_1534_1_1527857937218_3118950_311-61262-0_sp0.9 from training. Duration: 0.72225 2023-10-04 04:09:05,094 WARNING [train.py:1204] (2/4) Exclude cut with ID _14632_210_9988_1_1530691279392_3923200_102-204477-0_sp1.1 from training. Duration: 0.8818125 2023-10-04 04:09:05,146 WARNING [train.py:1204] (2/4) Exclude cut with ID _65242_210_18628_1_1535369393717_3823149_290-117517-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 04:09:06,442 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_69630_210_7234_1_1535788670_910989_35-337140-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 04:09:06,457 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_5618_210_1874_1_1527311008_5851_1-214501-0 from training. Duration: 0.689 2023-10-04 04:09:07,862 WARNING [train.py:1204] (2/4) Exclude cut with ID _8064_210_5399_1_1528372299291_4282973_168-50337-0_sp0.9 from training. Duration: 0.9333125 2023-10-04 04:09:09,206 WARNING [train.py:1204] (2/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_185-14438-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 04:09:09,221 WARNING [train.py:1204] (2/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_61-327062-0 from training. Duration: 0.81 2023-10-04 04:09:09,228 WARNING [train.py:1204] (2/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_98-741-0 from training. Duration: 0.8 2023-10-04 04:09:14,713 WARNING [train.py:1204] (2/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_312-204798-0 from training. Duration: 0.93 2023-10-04 04:09:16,125 WARNING [train.py:1204] (2/4) Exclude cut with ID _14998_210_5219_1_1530705582080_7474970_787-294416-0_sp0.9 from training. Duration: 0.911125 2023-10-04 04:09:17,455 WARNING [train.py:1204] (2/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_140-181176-0_sp1.1 from training. Duration: 0.836375 2023-10-04 04:09:17,496 WARNING [train.py:1204] (2/4) Exclude cut with ID _14687_210_5854_1_1530703838259_3664889_115-239046-0_sp1.1 from training. Duration: 0.6090625 2023-10-04 04:09:19,399 WARNING [train.py:1204] (2/4) Exclude cut with ID _56423_210_5854_1_1534896061049_3165969_370-230614-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 04:09:19,444 WARNING [train.py:1204] (2/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_406-275649-0_sp0.9 from training. Duration: 0.4666875 2023-10-04 04:09:19,462 WARNING [train.py:1204] (2/4) Exclude cut with ID _15150_210_8341_1_1530752411803_7256384_22-138274-0_sp1.1 from training. Duration: 0.87275 2023-10-04 04:09:19,495 WARNING [train.py:1204] (2/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_125-125818-0 from training. Duration: 0.96 2023-10-04 04:09:22,119 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_4399_210_1874_1_1526705485_2400757_220-47974-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 04:09:23,523 WARNING [train.py:1204] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_308-161067-0_sp1.1 from training. Duration: 0.6 2023-10-04 04:09:24,898 WARNING [train.py:1204] (2/4) Exclude cut with ID _53489_210_6228_1_1534298357169_3835970_102-57645-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 04:09:26,403 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.1.bypass.skip_rate, batch_count=1520266.6666666667, ans=0.035 2023-10-04 04:09:28,256 WARNING [train.py:1204] (2/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_178-177582-0 from training. Duration: 0.74 2023-10-04 04:09:29,687 WARNING [train.py:1204] (2/4) Exclude cut with ID _14349_210_8341_1_1530615710818_3442470_170-336541-0_sp1.1 from training. Duration: 0.7818125 2023-10-04 04:09:29,755 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_521-214276-0_sp0.9 from training. Duration: 0.488875 2023-10-04 04:09:29,796 WARNING [train.py:1204] (2/4) Exclude cut with ID _66070_210_7640_1_1535441393096_7805310_489-292631-0 from training. Duration: 0.79 2023-10-04 04:09:37,125 WARNING [train.py:1204] (2/4) Exclude cut with ID _14069_210_5632_1_1530512692033_4803390_438-340023-0_sp1.1 from training. Duration: 0.936375 2023-10-04 04:09:38,532 WARNING [train.py:1204] (2/4) Exclude cut with ID _7852_210_5219_1_1528286471316_8139400_242-105336-0_sp1.1 from training. Duration: 0.6545625 2023-10-04 04:09:38,634 WARNING [train.py:1204] (2/4) Exclude cut with ID _3867_210_1883_1_1526641200575_4475830_272-215563-0 from training. Duration: 0.57 2023-10-04 04:09:38,641 WARNING [train.py:1204] (2/4) Exclude cut with ID _14368_210_4220_1_1530532714870_3595529_143-117421-0_sp0.9 from training. Duration: 0.6444375 2023-10-04 04:09:39,896 WARNING [train.py:1204] (2/4) Exclude cut with ID _9235_210_5219_1_1528891154253_8046120_30-326681-0_sp1.1 from training. Duration: 0.7545625 2023-10-04 04:09:41,367 WARNING [train.py:1204] (2/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_538-218448-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 04:09:44,269 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.conv_module1.balancer1.prob, batch_count=1520333.3333333333, ans=0.125 2023-10-04 04:09:45,394 WARNING [train.py:1204] (2/4) Exclude cut with ID _33640_210_2655_1_1533117605479_4855469_60-192970-0_sp1.1 from training. Duration: 0.963625 2023-10-04 04:09:45,408 WARNING [train.py:1204] (2/4) Exclude cut with ID _65585_210_12210_1_1535450451584_3764098_112-294301-0_sp1.1 from training. Duration: 0.8 2023-10-04 04:09:45,430 WARNING [train.py:1204] (2/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_643-37412-0_sp1.1 from training. Duration: 0.936375 2023-10-04 04:09:45,446 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_9302_210_2550_1_1528889962_4655416_638-301620-0_sp1.1 from training. Duration: 0.42725 2023-10-04 04:09:46,725 INFO [train.py:1046] (2/4) Epoch 43, batch 4950, loss[loss=0.1531, simple_loss=0.2371, pruned_loss=0.03454, over 24627.00 frames. ], tot_loss[loss=0.1546, simple_loss=0.234, pruned_loss=0.03759, over 4717302.05 frames. ], batch size: 65, lr: 2.35e-03, grad_scale: 32.0 2023-10-04 04:09:48,207 WARNING [train.py:1204] (2/4) Exclude cut with ID _3867_210_1883_1_1526641200575_4475830_272-215563-0_sp1.1 from training. Duration: 0.5181875 2023-10-04 04:09:51,511 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_53679_210_4852_1_1534556726_4186141_358-154143-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 04:09:51,523 WARNING [train.py:1204] (2/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_841-341679-0_sp0.9 from training. Duration: 0.6444375 2023-10-04 04:09:53,032 WARNING [train.py:1204] (2/4) Exclude cut with ID _7160_210_1876_1_1527940923231_3703799_302-150728-0 from training. Duration: 0.78 2023-10-04 04:09:54,391 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_125-203105-0 from training. Duration: 0.8 2023-10-04 04:09:54,405 WARNING [train.py:1204] (2/4) Exclude cut with ID _5585_210_2667_1_1527418813324_8007340_89-332072-0_sp0.9 from training. Duration: 0.711125 2023-10-04 04:09:55,696 WARNING [train.py:1204] (2/4) Exclude cut with ID _3992_210_1747_1_1526644816031_3731060_36-147855-0 from training. Duration: 0.78 2023-10-04 04:09:55,714 WARNING [train.py:1204] (2/4) Exclude cut with ID _59256_210_5986_1_1535076085997_3589120_172-239045-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 04:09:55,722 WARNING [train.py:1204] (2/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_472-216663-0_sp1.1 from training. Duration: 0.836375 2023-10-04 04:09:57,128 WARNING [train.py:1204] (2/4) Exclude cut with ID _36512_210_6987_1_1533033143998_3126069_181-306500-0_sp1.1 from training. Duration: 0.863625 2023-10-04 04:09:57,147 WARNING [train.py:1204] (2/4) Exclude cut with ID _56524_210_9642_1_1534572011779_3619170_492-95008-0_sp1.1 from training. Duration: 0.97275 2023-10-04 04:09:59,161 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_33348_210_5632_1_1533086912_3324030_119-90326-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 04:09:59,198 WARNING [train.py:1204] (2/4) Exclude cut with ID _14368_210_4220_1_1530532714870_3595529_231-137892-0_sp1.1 from training. Duration: 0.9 2023-10-04 04:10:00,584 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8453_210_4211_1_1528502940_8562730_596-306820-0_sp1.1 from training. Duration: 0.8818125 2023-10-04 04:10:02,578 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.conv_module1.balancer2.prob, batch_count=1520466.6666666667, ans=0.125 2023-10-04 04:10:03,639 WARNING [train.py:1204] (2/4) Exclude cut with ID _27052_210_5986_1_1532329197187_3585009_50-242750-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 04:10:06,477 WARNING [train.py:1204] (2/4) Exclude cut with ID _13913_210_9994_1_1530579615187_3383897_358-98435-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 04:10:06,492 WARNING [train.py:1204] (2/4) Exclude cut with ID _27013_210_1389_1_1532437173240_3252649_399-229778-0_sp1.1 from training. Duration: 0.963625 2023-10-04 04:10:09,379 WARNING [train.py:1204] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_185-174805-0_sp0.9 from training. Duration: 0.7555625 2023-10-04 04:10:10,997 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.ff2_skip_rate, batch_count=1520466.6666666667, ans=0.0 2023-10-04 04:10:12,450 INFO [scaling.py:1118] (2/4) WithLoss: name=encoder.encoders.2.encoder.layers.1.self_attn_weights, loss-sum=0.000e+00 2023-10-04 04:10:13,632 WARNING [train.py:1204] (2/4) Exclude cut with ID _65585_210_12210_1_1535450451584_3764098_196-221014-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 04:10:14,985 WARNING [train.py:1204] (2/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_202-293523-0_sp1.1 from training. Duration: 0.6545625 2023-10-04 04:10:16,402 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8275_210_4198_1_1528455320_3892571_213-127634-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 04:10:17,719 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_40896_210_15710_1_1533433956_7100802_921-231678-0_sp1.1 from training. Duration: 0.97275 2023-10-04 04:10:19,755 WARNING [train.py:1204] (2/4) Exclude cut with ID _38020_210_5986_1_1533112252259_7006089_493-102745-0_sp1.1 from training. Duration: 0.8909375 2023-10-04 04:10:19,851 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6846_210_6753_1_1527850767_4295158_2-315073-0 from training. Duration: 0.88 2023-10-04 04:10:21,114 WARNING [train.py:1204] (2/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_249-281349-0 from training. Duration: 0.85 2023-10-04 04:10:22,590 WARNING [train.py:1204] (2/4) Exclude cut with ID _18513_210_9206_1_1531618215223_3862620_314-83429-0_sp1.1 from training. Duration: 0.97275 2023-10-04 04:10:24,075 WARNING [train.py:1204] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_215-202973-0_sp0.9 from training. Duration: 0.97775 2023-10-04 04:10:25,379 WARNING [train.py:1204] (2/4) Exclude cut with ID _7719_210_3525_1_1528456153163_4296960_88-250259-0_sp1.1 from training. Duration: 0.763625 2023-10-04 04:10:26,664 WARNING [train.py:1204] (2/4) Exclude cut with ID _7606_210_4915_1_1528894253277_5080690_301-10732-0_sp1.1 from training. Duration: 0.736375 2023-10-04 04:10:26,673 WARNING [train.py:1204] (2/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_818-11907-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 04:10:26,753 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_5959_210_1794_1_1527400400_3445726_412-44052-0_sp1.1 from training. Duration: 0.7 2023-10-04 04:10:29,804 WARNING [train.py:1204] (2/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_6-279195-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 04:10:31,259 WARNING [train.py:1204] (2/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_252-216674-0_sp0.9 from training. Duration: 0.6 2023-10-04 04:10:33,221 WARNING [train.py:1204] (2/4) Exclude cut with ID _14368_210_4220_1_1530532714870_3595529_144-3287-0_sp1.1 from training. Duration: 0.6090625 2023-10-04 04:10:34,546 WARNING [train.py:1204] (2/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_342-3879-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 04:10:35,901 WARNING [train.py:1204] (2/4) Exclude cut with ID _33640_210_2655_1_1533117605479_4855469_584-65630-0_sp1.1 from training. Duration: 0.97275 2023-10-04 04:10:37,222 WARNING [train.py:1204] (2/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_37-189262-0 from training. Duration: 0.98 2023-10-04 04:10:37,262 WARNING [train.py:1204] (2/4) Exclude cut with ID _9389_210_1664_1_1529217337301_5289269_379-128794-0_sp0.9 from training. Duration: 0.9444375 2023-10-04 04:10:38,627 WARNING [train.py:1204] (2/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_119-207162-0_sp1.1 from training. Duration: 0.6454375 2023-10-04 04:10:44,091 WARNING [train.py:1204] (2/4) Exclude cut with ID _2941_210_2577_1_1526199673150_2489759_200-333643-0_sp1.1 from training. Duration: 0.936375 2023-10-04 04:10:45,513 WARNING [train.py:1204] (2/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_479-125611-0_sp1.1 from training. Duration: 0.87275 2023-10-04 04:10:45,522 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_3475_210_2028_1_1526193578_3622569_64-297072-0_sp1.1 from training. Duration: 0.82725 2023-10-04 04:10:45,552 WARNING [train.py:1204] (2/4) Exclude cut with ID _53565_210_10635_1_1534337218335_3820259_139-346291-0_sp1.1 from training. Duration: 0.97275 2023-10-04 04:10:45,601 WARNING [train.py:1204] (2/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_588-125165-0_sp1.1 from training. Duration: 0.7545625 2023-10-04 04:10:46,937 WARNING [train.py:1204] (2/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_938-205135-0_sp1.1 from training. Duration: 0.7909375 2023-10-04 04:10:50,137 WARNING [train.py:1204] (2/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_216-48707-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 04:10:50,202 WARNING [train.py:1204] (2/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_477-255782-0_sp1.1 from training. Duration: 0.7454375 2023-10-04 04:10:51,445 WARNING [train.py:1204] (2/4) Exclude cut with ID _16909_210_5724_1_1531292079912_4118919_583-92842-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 04:10:53,492 WARNING [train.py:1204] (2/4) Exclude cut with ID _60860_210_8254_1_1534896015875_3557169_245-256666-0 from training. Duration: 0.97 2023-10-04 04:10:54,731 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.609e+02 1.929e+02 2.246e+02 2.665e+02 5.151e+02, threshold=4.493e+02, percent-clipped=1.0 2023-10-04 04:10:57,661 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_45399_210_4852_1_1533624725_7579737_349-116173-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 04:11:00,475 INFO [train.py:1046] (2/4) Epoch 43, batch 5000, loss[loss=0.137, simple_loss=0.2182, pruned_loss=0.02784, over 24454.00 frames. ], tot_loss[loss=0.1539, simple_loss=0.2335, pruned_loss=0.03714, over 4715262.93 frames. ], batch size: 58, lr: 2.35e-03, grad_scale: 32.0 2023-10-04 04:11:00,632 WARNING [train.py:1204] (2/4) Exclude cut with ID _8699_210_2069_1_1528630346831_3960409_155-39766-0 from training. Duration: 0.78 2023-10-04 04:11:01,250 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_86-16881-0_sp1.1 from training. Duration: 0.52725 2023-10-04 04:11:04,122 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.ff3_skip_rate, batch_count=1520733.3333333333, ans=0.0 2023-10-04 04:11:07,352 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_14056_210_4163_1_1530604483_7572827_191-159384-0_sp1.1 from training. Duration: 0.97275 2023-10-04 04:11:07,367 WARNING [train.py:1204] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_307-158462-0_sp1.1 from training. Duration: 0.62725 2023-10-04 04:11:07,530 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.conv_module2.balancer1.min_positive, batch_count=1520733.3333333333, ans=0.025 2023-10-04 04:11:09,913 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7544_210_6753_1_1528591327_1471426_33-346031-0 from training. Duration: 0.61 2023-10-04 04:11:11,157 WARNING [train.py:1204] (2/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_112-149213-0 from training. Duration: 0.95 2023-10-04 04:11:12,574 WARNING [train.py:1204] (2/4) Exclude cut with ID _55844_210_2346_1_1534471212569_7625707_925-191342-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 04:11:15,288 WARNING [train.py:1204] (2/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_87-278686-0 from training. Duration: 0.77 2023-10-04 04:11:15,315 WARNING [train.py:1204] (2/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_415-78792-0_sp1.1 from training. Duration: 0.736375 2023-10-04 04:11:15,324 WARNING [train.py:1204] (2/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_107-332398-0_sp0.9 from training. Duration: 0.8666875 2023-10-04 04:11:15,408 WARNING [train.py:1204] (2/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_72-251255-0 from training. Duration: 0.94 2023-10-04 04:11:15,430 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_431-227273-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 04:11:16,739 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7544_210_6753_1_1528587000_691468_87-178599-0_sp0.9 from training. Duration: 0.9666875 2023-10-04 04:11:18,089 WARNING [train.py:1204] (2/4) Exclude cut with ID _13916_210_9994_1_1530788341126_3763149_60-191556-0 from training. Duration: 0.61 2023-10-04 04:11:18,091 WARNING [train.py:1204] (2/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_746-164782-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 04:11:19,460 WARNING [train.py:1204] (2/4) Exclude cut with ID _33640_210_2655_1_1533117605479_4855469_596-21066-0_sp1.1 from training. Duration: 0.92725 2023-10-04 04:11:20,910 WARNING [train.py:1204] (2/4) Exclude cut with ID _40402_210_18219_1_1533522324115_2953150_320-14078-0 from training. Duration: 0.95 2023-10-04 04:11:21,099 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.self_attn_weights.pos_emb_skip_rate, batch_count=1520800.0, ans=0.0 2023-10-04 04:11:22,744 WARNING [train.py:1204] (2/4) Exclude cut with ID _29877_210_6228_1_1532397791106_5276149_266-62291-0 from training. Duration: 0.87 2023-10-04 04:11:22,807 WARNING [train.py:1204] (2/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_176-56120-0_sp0.9 from training. Duration: 0.92225 2023-10-04 04:11:22,851 WARNING [train.py:1204] (2/4) Exclude cut with ID _6275_210_2084_1_1528110062213_7669049_91-26919-0 from training. Duration: 0.97 2023-10-04 04:11:24,684 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_224-322394-0_sp0.9 from training. Duration: 0.7333125 2023-10-04 04:11:24,730 WARNING [train.py:1204] (2/4) Exclude cut with ID _44982_210_1511_1_1534312820108_3726029_586-297059-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 04:11:24,772 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_196-289637-0_sp1.1 from training. Duration: 0.6818125 2023-10-04 04:11:24,773 WARNING [train.py:1204] (2/4) Exclude cut with ID _18513_210_9206_1_1531618215223_3862620_402-5957-0 from training. Duration: 0.99 2023-10-04 04:11:24,779 WARNING [train.py:1204] (2/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_81-300212-0 from training. Duration: 0.6 2023-10-04 04:11:27,471 WARNING [train.py:1204] (2/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_166-128941-0 from training. Duration: 0.54 2023-10-04 04:11:27,486 WARNING [train.py:1204] (2/4) Exclude cut with ID _58059_210_5854_1_1534901471149_3164808_57-309660-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 04:11:27,553 WARNING [train.py:1204] (2/4) Exclude cut with ID _60621_210_17071_1_1535013053530_3671739_73-155814-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 04:11:30,193 WARNING [train.py:1204] (2/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_726-270780-0 from training. Duration: 0.86 2023-10-04 04:11:30,213 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8853_210_3273_1_1528633899_818968_51-187685-0_sp1.1 from training. Duration: 0.62725 2023-10-04 04:11:30,929 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_315-212794-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 04:11:32,069 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_53373_210_5399_1_1534229471_4715695_314-122795-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 04:11:33,507 WARNING [train.py:1204] (2/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_187-221566-0_sp0.9 from training. Duration: 0.47775 2023-10-04 04:11:34,951 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_12868_210_6753_1_1530702479_3610564_133-564-0 from training. Duration: 0.79 2023-10-04 04:11:36,323 WARNING [train.py:1204] (2/4) Exclude cut with ID _55153_210_2253_1_1534384786677_3162096_255-228344-0_sp1.1 from training. Duration: 0.936375 2023-10-04 04:11:38,126 WARNING [train.py:1204] (2/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_383-165234-0_sp1.1 from training. Duration: 0.8545625 2023-10-04 04:11:42,410 WARNING [train.py:1204] (2/4) Exclude cut with ID IC0677W0056-334889 from training. Duration: 0.768 2023-10-04 04:11:45,129 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_14953_210_2151_1_1530701710_5616165_710-165400-0_sp0.9 from training. Duration: 0.9666875 2023-10-04 04:11:46,614 WARNING [train.py:1204] (2/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_355-236205-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 04:11:46,616 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_34450_210_9547_1_1533099443_4109050_503-246893-0_sp1.1 from training. Duration: 0.963625 2023-10-04 04:11:48,143 WARNING [train.py:1204] (2/4) Exclude cut with ID _43552_210_8913_1_1533726031059_3523429_25-120161-0 from training. Duration: 0.98 2023-10-04 04:11:48,305 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.attention_skip_rate, batch_count=1520933.3333333333, ans=0.0 2023-10-04 04:11:49,413 WARNING [train.py:1204] (2/4) Exclude cut with ID _66112_210_10635_1_1535590236305_3801484_205-311930-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 04:11:49,435 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_48049_210_5632_1_1533864553_3519937_185-281218-0_sp1.1 from training. Duration: 0.92725 2023-10-04 04:11:49,476 WARNING [train.py:1204] (2/4) Exclude cut with ID _48782_210_12658_1_1534467701544_2300422_191-332415-0_sp1.1 from training. Duration: 0.97275 2023-10-04 04:11:50,979 WARNING [train.py:1204] (2/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_907-151398-0_sp1.1 from training. Duration: 0.336375 2023-10-04 04:11:52,349 WARNING [train.py:1204] (2/4) Exclude cut with ID _67499_210_17282_1_1535509805220_3692430_20-179346-0_sp1.1 from training. Duration: 0.8818125 2023-10-04 04:11:55,660 WARNING [train.py:1204] (2/4) Exclude cut with ID _14998_210_5219_1_1530705582080_7474970_441-91466-0_sp1.1 from training. Duration: 0.8818125 2023-10-04 04:11:55,725 WARNING [train.py:1204] (2/4) Exclude cut with ID _46681_210_9642_1_1533893005571_7603359_786-164007-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 04:11:56,442 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.2.encoder.layers.1.self_attn2.whiten, num_groups=1, num_channels=384, metric=16.30 vs. limit=22.5 2023-10-04 04:12:03,665 WARNING [train.py:1204] (2/4) Exclude cut with ID _14970_210_15709_1_1530773543493_1715730_187-20259-0 from training. Duration: 0.89 2023-10-04 04:12:07,667 WARNING [train.py:1204] (2/4) Exclude cut with ID _12734_210_8341_1_1530183614296_3891929_383-339075-0_sp1.1 from training. Duration: 0.963625 2023-10-04 04:12:11,214 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.feed_forward1.out_proj.dropout_p, batch_count=1521000.0, ans=0.1 2023-10-04 04:12:15,124 INFO [train.py:1046] (2/4) Epoch 43, batch 5050, loss[loss=0.1717, simple_loss=0.2627, pruned_loss=0.04037, over 23978.00 frames. ], tot_loss[loss=0.1541, simple_loss=0.234, pruned_loss=0.03705, over 4713892.81 frames. ], batch size: 80, lr: 2.35e-03, grad_scale: 16.0 2023-10-04 04:12:15,306 WARNING [train.py:1204] (2/4) Exclude cut with ID _37086_210_9038_1_1532999540891_7021250_834-322225-0_sp1.1 from training. Duration: 0.92725 2023-10-04 04:12:15,534 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.bypass.scale_min, batch_count=1521066.6666666667, ans=0.2 2023-10-04 04:12:16,630 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_10987_210_4163_1_1529760555_4123902_56-290559-0_sp1.1 from training. Duration: 0.963625 2023-10-04 04:12:16,639 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_92-246270-0_sp0.9 from training. Duration: 0.9444375 2023-10-04 04:12:16,662 WARNING [train.py:1204] (2/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_56-13561-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 04:12:16,708 WARNING [train.py:1204] (2/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_393-57888-0_sp1.1 from training. Duration: 0.5818125 2023-10-04 04:12:16,737 WARNING [train.py:1204] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_168-32906-0_sp1.1 from training. Duration: 0.62725 2023-10-04 04:12:16,781 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_51225_210_3601_1_1534400634_2327383_127-207553-0_sp1.1 from training. Duration: 0.963625 2023-10-04 04:12:20,883 WARNING [train.py:1204] (2/4) Exclude cut with ID _58583_210_5236_1_1534726835266_3574249_494-203521-0_sp1.1 from training. Duration: 0.963625 2023-10-04 04:12:20,904 WARNING [train.py:1204] (2/4) Exclude cut with ID _49376_210_5986_1_1533985278732_3370740_354-344695-0 from training. Duration: 0.99 2023-10-04 04:12:20,984 WARNING [train.py:1204] (2/4) Exclude cut with ID _8021_210_7994_1_1528376451358_3653069_179-45206-0_sp1.1 from training. Duration: 0.7818125 2023-10-04 04:12:24,222 WARNING [train.py:1204] (2/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_742-154017-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 04:12:24,851 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.2.encoder.layers.1.whiten, num_groups=1, num_channels=384, metric=4.78 vs. limit=12.0 2023-10-04 04:12:25,541 WARNING [train.py:1204] (2/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_222-198127-0_sp1.1 from training. Duration: 0.763625 2023-10-04 04:12:27,320 WARNING [train.py:1204] (2/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_235-307337-0 from training. Duration: 0.94 2023-10-04 04:12:28,568 WARNING [train.py:1204] (2/4) Exclude cut with ID _8393_210_6702_1_1528505860249_3457328_453-176500-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 04:12:28,843 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.feed_forward2.hidden_balancer.prob, batch_count=1521133.3333333333, ans=0.125 2023-10-04 04:12:29,905 WARNING [train.py:1204] (2/4) Exclude cut with ID _53565_210_10635_1_1534337218335_3820259_102-179528-0_sp1.1 from training. Duration: 0.97275 2023-10-04 04:12:31,309 WARNING [train.py:1204] (2/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_43-206757-0_sp1.1 from training. Duration: 0.6818125 2023-10-04 04:12:33,100 WARNING [train.py:1204] (2/4) Exclude cut with ID _8394_210_6702_1_1528590504316_3540189_335-274780-0_sp1.1 from training. Duration: 0.7090625 2023-10-04 04:12:34,480 WARNING [train.py:1204] (2/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_128-296581-0_sp1.1 from training. Duration: 0.67275 2023-10-04 04:12:42,210 WARNING [train.py:1204] (2/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_111-112362-0 from training. Duration: 0.91 2023-10-04 04:12:42,259 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_5522_210_3273_1_1527249606_3909096_366-172508-0_sp1.1 from training. Duration: 0.6 2023-10-04 04:12:43,115 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.1.encoder.layers.1.feed_forward3.out_whiten, num_groups=1, num_channels=256, metric=7.57 vs. limit=15.0 2023-10-04 04:12:43,594 WARNING [train.py:1204] (2/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_47-167373-0_sp1.1 from training. Duration: 0.836375 2023-10-04 04:12:43,638 WARNING [train.py:1204] (2/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_41-234353-0 from training. Duration: 0.68 2023-10-04 04:12:43,669 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6291_210_3273_1_1527683649_1078498_82-221196-0_sp0.9 from training. Duration: 0.8444375 2023-10-04 04:12:44,997 WARNING [train.py:1204] (2/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_47-4814-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 04:12:45,020 WARNING [train.py:1204] (2/4) Exclude cut with ID _43541_210_10626_1_1533796233418_3635440_40-247993-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 04:12:45,087 WARNING [train.py:1204] (2/4) Exclude cut with ID _21989_210_5854_1_1531899214931_3271360_133-44759-0_sp0.9 from training. Duration: 0.8555625 2023-10-04 04:12:45,090 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_94-195801-0 from training. Duration: 0.94 2023-10-04 04:12:45,222 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.1.conv_module2.balancer1.max_abs, batch_count=1521200.0, ans=10.0 2023-10-04 04:12:46,319 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_141-89113-0 from training. Duration: 0.86 2023-10-04 04:12:48,949 WARNING [train.py:1204] (2/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_683-284578-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 04:12:50,377 WARNING [train.py:1204] (2/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_6-214815-0_sp1.1 from training. Duration: 0.77275 2023-10-04 04:12:54,943 WARNING [train.py:1204] (2/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_556-212082-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 04:12:54,985 WARNING [train.py:1204] (2/4) Exclude cut with ID _44902_210_16187_1_1533888006062_3792078_149-67212-0 from training. Duration: 0.87 2023-10-04 04:12:56,477 WARNING [train.py:1204] (2/4) Exclude cut with ID _61148_210_6897_1_1535094022726_4217610_333-159440-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 04:12:59,769 WARNING [train.py:1204] (2/4) Exclude cut with ID _3120_210_3525_1_1526036143112_4072980_404-249994-0 from training. Duration: 0.99 2023-10-04 04:13:01,120 WARNING [train.py:1204] (2/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_234-260682-0_sp0.9 from training. Duration: 0.9333125 2023-10-04 04:13:02,385 WARNING [train.py:1204] (2/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_374-138971-0_sp1.1 from training. Duration: 0.8545625 2023-10-04 04:13:02,462 WARNING [train.py:1204] (2/4) Exclude cut with ID _53713_210_24116_1_1534314487762_7416751_743-302085-0_sp1.1 from training. Duration: 0.92725 2023-10-04 04:13:03,821 WARNING [train.py:1204] (2/4) Exclude cut with ID _18516_210_9206_1_1531704653206_3692406_198-334870-0_sp1.1 from training. Duration: 0.836375 2023-10-04 04:13:05,827 WARNING [train.py:1204] (2/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_395-73570-0_sp1.1 from training. Duration: 0.9 2023-10-04 04:13:08,334 WARNING [train.py:1204] (2/4) Exclude cut with ID _46683_210_9642_1_1533730913492_4591116_421-19547-0_sp1.1 from training. Duration: 0.936375 2023-10-04 04:13:08,376 WARNING [train.py:1204] (2/4) Exclude cut with ID _19896_210_1883_1_1531709022662_6649569_714-8668-0_sp1.1 from training. Duration: 0.963625 2023-10-04 04:13:10,092 WARNING [train.py:1204] (2/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_49-101072-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 04:13:10,100 WARNING [train.py:1204] (2/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_220-191080-0_sp1.1 from training. Duration: 0.8909375 2023-10-04 04:13:10,140 WARNING [train.py:1204] (2/4) Exclude cut with ID _3465_210_3525_1_1526175667060_3574049_62-335552-0 from training. Duration: 0.87 2023-10-04 04:13:11,524 WARNING [train.py:1204] (2/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_111-112362-0_sp1.1 from training. Duration: 0.82725 2023-10-04 04:13:13,016 WARNING [train.py:1204] (2/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_318-59097-0_sp0.9 from training. Duration: 0.8444375 2023-10-04 04:13:15,280 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.3.encoder.layers.0.whiten, num_groups=1, num_channels=512, metric=5.19 vs. limit=12.0 2023-10-04 04:13:17,223 WARNING [train.py:1204] (2/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_98-255710-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 04:13:17,230 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9042W0003-515609 from training. Duration: 0.896 2023-10-04 04:13:17,238 WARNING [train.py:1204] (2/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_158-250116-0_sp0.9 from training. Duration: 0.688875 2023-10-04 04:13:19,848 WARNING [train.py:1204] (2/4) Exclude cut with ID _26750_210_5665_1_1532226678442_4063230_7-346885-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 04:13:19,895 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_37839_210_7234_1_1533542462_3689303_357-227078-0_sp1.1 from training. Duration: 0.963625 2023-10-04 04:13:21,183 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9052W0119-520904 from training. Duration: 0.896 2023-10-04 04:13:22,688 WARNING [train.py:1204] (2/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_249-281349-0_sp1.1 from training. Duration: 0.77275 2023-10-04 04:13:22,693 WARNING [train.py:1204] (2/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_147-293311-0 from training. Duration: 0.94 2023-10-04 04:13:22,693 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_158-21166-0_sp1.1 from training. Duration: 0.963625 2023-10-04 04:13:25,265 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.696e+02 1.920e+02 2.111e+02 2.330e+02 3.749e+02, threshold=4.222e+02, percent-clipped=0.0 2023-10-04 04:13:26,326 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.0.layers.0.feed_forward3.out_whiten, num_groups=1, num_channels=192, metric=14.58 vs. limit=15.0 2023-10-04 04:13:28,067 INFO [train.py:1046] (2/4) Epoch 43, batch 5100, loss[loss=0.1454, simple_loss=0.2327, pruned_loss=0.02901, over 24687.00 frames. ], tot_loss[loss=0.1547, simple_loss=0.235, pruned_loss=0.03715, over 4710224.18 frames. ], batch size: 65, lr: 2.35e-03, grad_scale: 8.0 2023-10-04 04:13:28,157 WARNING [train.py:1204] (2/4) Exclude cut with ID _14100_210_8341_1_1530529320613_3678069_207-332194-0_sp1.1 from training. Duration: 0.92725 2023-10-04 04:13:28,190 WARNING [train.py:1204] (2/4) Exclude cut with ID _9389_210_1664_1_1529217337301_5289269_324-211031-0_sp1.1 from training. Duration: 0.963625 2023-10-04 04:13:28,202 WARNING [train.py:1204] (2/4) Exclude cut with ID _14603_210_4220_1_1530615688441_3097319_133-158308-0 from training. Duration: 0.84 2023-10-04 04:13:30,190 WARNING [train.py:1204] (2/4) Exclude cut with ID _13252_210_1925_1_1530671372047_3336909_166-314607-0 from training. Duration: 0.54 2023-10-04 04:13:31,704 WARNING [train.py:1204] (2/4) Exclude cut with ID _32704_210_8954_1_1533017488840_6747370_649-79538-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 04:13:31,711 WARNING [train.py:1204] (2/4) Exclude cut with ID _28394_210_8913_1_1532430049518_3650118_343-115667-0_sp1.1 from training. Duration: 0.97275 2023-10-04 04:13:31,766 WARNING [train.py:1204] (2/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_87-278686-0_sp0.9 from training. Duration: 0.8555625 2023-10-04 04:13:36,270 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9109W0003-549749 from training. Duration: 0.768 2023-10-04 04:13:36,532 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=1521400.0, ans=0.1 2023-10-04 04:13:37,604 WARNING [train.py:1204] (2/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_126-29104-0_sp1.1 from training. Duration: 0.77275 2023-10-04 04:13:39,692 WARNING [train.py:1204] (2/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_325-113798-0 from training. Duration: 0.85 2023-10-04 04:13:41,139 WARNING [train.py:1204] (2/4) Exclude cut with ID _6694_210_4599_1_1527852637490_3965529_116-109398-0 from training. Duration: 0.73 2023-10-04 04:13:41,190 WARNING [train.py:1204] (2/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_46-57119-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 04:13:43,286 WARNING [train.py:1204] (2/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_546-119312-0_sp1.1 from training. Duration: 0.9 2023-10-04 04:13:43,491 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=1521466.6666666667, ans=0.1 2023-10-04 04:13:47,222 WARNING [train.py:1204] (2/4) Exclude cut with ID _60057_210_10640_1_1534903065793_3333150_468-306939-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 04:13:47,273 WARNING [train.py:1204] (2/4) Exclude cut with ID _15150_210_8341_1_1530752411803_7256384_22-138274-0 from training. Duration: 0.96 2023-10-04 04:13:47,290 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_289-180904-0 from training. Duration: 0.76 2023-10-04 04:13:51,553 WARNING [train.py:1204] (2/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_64-133721-0_sp1.1 from training. Duration: 0.92725 2023-10-04 04:13:51,593 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_195-257512-0_sp1.1 from training. Duration: 0.6090625 2023-10-04 04:13:54,307 WARNING [train.py:1204] (2/4) Exclude cut with ID _66873_210_5517_1_1535454136506_4293370_704-101963-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 04:13:55,861 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.conv_module2.balancer2.prob, batch_count=1521466.6666666667, ans=0.125 2023-10-04 04:13:58,333 WARNING [train.py:1204] (2/4) Exclude cut with ID _4366_210_1388_1_1526792053633_3613819_231-211402-0 from training. Duration: 0.94 2023-10-04 04:13:58,384 WARNING [train.py:1204] (2/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_679-190385-0_sp1.1 from training. Duration: 0.97275 2023-10-04 04:13:58,515 WARNING [train.py:1204] (2/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_298-81459-0_sp1.1 from training. Duration: 0.963625 2023-10-04 04:14:00,503 WARNING [train.py:1204] (2/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_658-282610-0_sp0.9 from training. Duration: 0.588875 2023-10-04 04:14:02,060 WARNING [train.py:1204] (2/4) Exclude cut with ID _57572_210_5517_1_1534921219624_3809484_196-283734-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 04:14:03,327 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_53071_210_16554_1_1534490405_2954066_294-198554-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 04:14:03,333 WARNING [train.py:1204] (2/4) Exclude cut with ID _4866_210_2577_1_1527404412284_4364659_352-157930-0 from training. Duration: 0.81 2023-10-04 04:14:04,832 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9231W0113-610139 from training. Duration: 0.896 2023-10-04 04:14:04,880 WARNING [train.py:1204] (2/4) Exclude cut with ID _24271_210_12658_1_1531987270022_7190519_638-171906-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 04:14:05,014 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.conv_skip_rate, batch_count=1521533.3333333333, ans=0.0 2023-10-04 04:14:06,177 WARNING [train.py:1204] (2/4) Exclude cut with ID _14170_210_5219_1_1530525949868_4469650_272-249985-0 from training. Duration: 0.67 2023-10-04 04:14:06,187 WARNING [train.py:1204] (2/4) Exclude cut with ID _64086_210_7994_1_1535270417019_7435949_449-31567-0 from training. Duration: 0.97 2023-10-04 04:14:06,388 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.attention_skip_rate, batch_count=1521533.3333333333, ans=0.0 2023-10-04 04:14:06,438 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.conv_module1.balancer1.prob, batch_count=1521533.3333333333, ans=0.125 2023-10-04 04:14:08,574 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=1521533.3333333333, ans=0.1 2023-10-04 04:14:11,297 WARNING [train.py:1204] (2/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_109-282568-0_sp1.1 from training. Duration: 0.97275 2023-10-04 04:14:18,380 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.conv_skip_rate, batch_count=1521600.0, ans=0.0 2023-10-04 04:14:19,521 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_121-45934-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 04:14:19,653 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.1.balancer1.prob, batch_count=1521600.0, ans=0.125 2023-10-04 04:14:22,250 WARNING [train.py:1204] (2/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_314-126819-0 from training. Duration: 0.5 2023-10-04 04:14:22,281 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9309W0192-640319 from training. Duration: 0.512 2023-10-04 04:14:22,288 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9310W0003-640649 from training. Duration: 0.896 2023-10-04 04:14:23,792 WARNING [train.py:1204] (2/4) Exclude cut with ID _7160_210_1876_1_1527940923231_3703799_120-150943-0 from training. Duration: 0.81 2023-10-04 04:14:23,794 WARNING [train.py:1204] (2/4) Exclude cut with ID _40380_210_9059_1_1533380594759_4187380_6-33508-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 04:14:25,978 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.1.encoder.layers.0.feed_forward1.out_whiten, num_groups=1, num_channels=256, metric=4.96 vs. limit=15.0 2023-10-04 04:14:26,577 WARNING [train.py:1204] (2/4) Exclude cut with ID _59202_210_26984_1_1534816810612_3549174_137-318710-0 from training. Duration: 0.89 2023-10-04 04:14:29,367 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_435-313305-0 from training. Duration: 0.71 2023-10-04 04:14:32,723 WARNING [train.py:1204] (2/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_91-212486-0_sp1.1 from training. Duration: 0.6181875 2023-10-04 04:14:32,825 WARNING [train.py:1204] (2/4) Exclude cut with ID _14603_210_4220_1_1530615688441_3097319_133-158308-0_sp1.1 from training. Duration: 0.763625 2023-10-04 04:14:34,974 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.bypass.scale_min, batch_count=1521666.6666666667, ans=0.2 2023-10-04 04:14:36,196 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_99-251314-0 from training. Duration: 0.87 2023-10-04 04:14:37,636 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_365-217119-0_sp1.1 from training. Duration: 0.57275 2023-10-04 04:14:37,677 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_3475_210_2028_1_1526193578_3622569_377-166133-0 from training. Duration: 0.86 2023-10-04 04:14:42,748 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.bypass.scale_min, batch_count=1521733.3333333333, ans=0.2 2023-10-04 04:14:43,777 INFO [train.py:1046] (2/4) Epoch 43, batch 5150, loss[loss=0.1553, simple_loss=0.2454, pruned_loss=0.03258, over 24653.00 frames. ], tot_loss[loss=0.155, simple_loss=0.2355, pruned_loss=0.03729, over 4716117.07 frames. ], batch size: 68, lr: 2.35e-03, grad_scale: 8.0 2023-10-04 04:14:43,892 WARNING [train.py:1204] (2/4) Exclude cut with ID _13916_210_9994_1_1530788341126_3763149_116-21034-0_sp1.1 from training. Duration: 0.87275 2023-10-04 04:14:43,909 WARNING [train.py:1204] (2/4) Exclude cut with ID _64220_210_9527_1_1535250151772_4875259_142-216831-0_sp1.1 from training. Duration: 0.936375 2023-10-04 04:14:43,910 WARNING [train.py:1204] (2/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_490-207861-0_sp1.1 from training. Duration: 0.8909375 2023-10-04 04:14:43,963 WARNING [train.py:1204] (2/4) Exclude cut with ID _41314_210_12651_1_1533459476543_3766460_87-272068-0_sp0.9 from training. Duration: 0.92225 2023-10-04 04:14:43,987 WARNING [train.py:1204] (2/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_18-46756-0_sp1.1 from training. Duration: 0.7181875 2023-10-04 04:14:45,324 WARNING [train.py:1204] (2/4) Exclude cut with ID _39411_210_5986_1_1533538825030_3343649_98-311335-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 04:14:46,666 WARNING [train.py:1204] (2/4) Exclude cut with ID _21878_210_3525_1_1531738772002_7264330_113-18163-0 from training. Duration: 0.89 2023-10-04 04:14:46,668 WARNING [train.py:1204] (2/4) Exclude cut with ID _6653_210_2629_1_1527919172441_6732679_1103-56445-0 from training. Duration: 0.73 2023-10-04 04:14:46,722 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_3474_210_2028_1_1526188513_4825246_145-309131-0 from training. Duration: 0.74 2023-10-04 04:14:48,037 WARNING [train.py:1204] (2/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_120-211630-0_sp1.1 from training. Duration: 0.836375 2023-10-04 04:14:48,054 WARNING [train.py:1204] (2/4) Exclude cut with ID _8030_210_7497_1_1528444886781_3531089_32-31837-0 from training. Duration: 0.97 2023-10-04 04:14:49,387 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_19949_210_2161_1_1531706337_928079_8-160956-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 04:14:49,434 WARNING [train.py:1204] (2/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_321-215742-0_sp1.1 from training. Duration: 0.4545625 2023-10-04 04:14:50,773 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_583-124650-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 04:14:52,154 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_30434_210_13049_1_1532519384_981746_164-182755-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 04:14:53,716 INFO [scaling.py:1118] (2/4) WithLoss: name=encoder.encoders.1.encoder.layers.1.self_attn_weights, loss-sum=0.000e+00 2023-10-04 04:14:56,316 WARNING [train.py:1204] (2/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_339-104791-0_sp1.1 from training. Duration: 0.4454375 2023-10-04 04:14:56,341 WARNING [train.py:1204] (2/4) Exclude cut with ID _15026_210_8750_1_1530777602316_3291850_211-195043-0 from training. Duration: 0.8 2023-10-04 04:14:57,670 WARNING [train.py:1204] (2/4) Exclude cut with ID _29877_210_6228_1_1532397791106_5276149_487-182647-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 04:14:57,721 WARNING [train.py:1204] (2/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_127-566-0_sp0.9 from training. Duration: 0.9333125 2023-10-04 04:14:59,836 WARNING [train.py:1204] (2/4) Exclude cut with ID _8263_210_1664_1_1528439452891_970220_68-181805-0_sp0.9 from training. Duration: 0.711125 2023-10-04 04:14:59,838 WARNING [train.py:1204] (2/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_504-72513-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 04:14:59,854 WARNING [train.py:1204] (2/4) Exclude cut with ID _64639_210_5816_1_1535367619437_3528000_324-265233-0_sp1.1 from training. Duration: 0.963625 2023-10-04 04:15:01,193 WARNING [train.py:1204] (2/4) Exclude cut with ID _6456_210_1534_1_1527681634212_3181049_215-309359-0_sp0.9 from training. Duration: 0.97775 2023-10-04 04:15:01,197 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_115-233929-0_sp1.1 from training. Duration: 0.6545625 2023-10-04 04:15:03,052 WARNING [train.py:1204] (2/4) Exclude cut with ID _14087_210_2253_1_1530529224512_3014029_219-224445-0 from training. Duration: 0.84 2023-10-04 04:15:04,444 WARNING [train.py:1204] (2/4) Exclude cut with ID _14867_210_1968_1_1530684179890_3846969_413-29292-0_sp1.1 from training. Duration: 0.8181875 2023-10-04 04:15:05,848 WARNING [train.py:1204] (2/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_1012-279473-0_sp0.9 from training. Duration: 0.7444375 2023-10-04 04:15:07,316 WARNING [train.py:1204] (2/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_605-133395-0_sp0.9 from training. Duration: 0.7333125 2023-10-04 04:15:09,372 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_89-35748-0 from training. Duration: 0.79 2023-10-04 04:15:10,662 WARNING [train.py:1204] (2/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_476-272757-0_sp1.1 from training. Duration: 0.8454375 2023-10-04 04:15:14,863 WARNING [train.py:1204] (2/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_449-15508-0_sp0.9 from training. Duration: 0.911125 2023-10-04 04:15:16,239 WARNING [train.py:1204] (2/4) Exclude cut with ID _29345_210_4381_1_1532689168263_3860169_198-174328-0 from training. Duration: 0.91 2023-10-04 04:15:20,265 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_15517_210_6753_1_1530952293_1664795_125-158157-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 04:15:24,581 WARNING [train.py:1204] (2/4) Exclude cut with ID _25565_210_12392_1_1532423009382_3952098_420-113180-0_sp1.1 from training. Duration: 0.963625 2023-10-04 04:15:25,062 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.5.encoder.layers.1.feed_forward1.out_whiten, num_groups=1, num_channels=256, metric=5.65 vs. limit=15.0 2023-10-04 04:15:25,974 WARNING [train.py:1204] (2/4) Exclude cut with ID _51471_210_1889_1_1534399391390_3094250_236-146773-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 04:15:26,216 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.conv_module1.balancer1.prob, batch_count=1521933.3333333333, ans=0.125 2023-10-04 04:15:27,034 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.0.layers.0.self_attn2.whiten, num_groups=1, num_channels=192, metric=11.68 vs. limit=22.5 2023-10-04 04:15:29,290 WARNING [train.py:1204] (2/4) Exclude cut with ID _30039_210_13270_1_1532484612900_2725150_175-35012-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 04:15:30,630 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_23850_210_4238_1_1532232597_1632433_99-275843-0_sp1.1 from training. Duration: 0.936375 2023-10-04 04:15:33,928 WARNING [train.py:1204] (2/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_658-282610-0 from training. Duration: 0.53 2023-10-04 04:15:37,978 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_37818_210_4238_1_1533620910_4720083_234-65934-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 04:15:38,657 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.nonlin_attention.whiten2.whitening_limit, batch_count=1521933.3333333333, ans=15.0 2023-10-04 04:15:39,266 WARNING [train.py:1204] (2/4) Exclude cut with ID _3067_210_2667_1_1526212974640_4503168_459-173867-0_sp1.1 from training. Duration: 0.8 2023-10-04 04:15:39,286 WARNING [train.py:1204] (2/4) Exclude cut with ID _8696_210_1876_1_1528545632440_3345058_209-54069-0_sp0.9 from training. Duration: 0.7444375 2023-10-04 04:15:40,640 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.4.encoder.layers.1.whiten, num_groups=1, num_channels=384, metric=3.72 vs. limit=12.0 2023-10-04 04:15:41,386 WARNING [train.py:1204] (2/4) Exclude cut with ID _62574_210_2655_1_1535180407058_3933249_420-236465-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 04:15:42,877 WARNING [train.py:1204] (2/4) Exclude cut with ID _46681_210_9642_1_1533893005571_7603359_333-4042-0_sp1.1 from training. Duration: 0.936375 2023-10-04 04:15:44,229 WARNING [train.py:1204] (2/4) Exclude cut with ID _3541_210_2477_1_1526475408709_3263973_377-296839-0 from training. Duration: 0.55 2023-10-04 04:15:48,187 WARNING [train.py:1204] (2/4) Exclude cut with ID _43997_210_4525_1_1533538632555_5189329_426-92599-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 04:15:48,292 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_43-71989-0_sp1.1 from training. Duration: 0.5181875 2023-10-04 04:15:49,274 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.0.layers.0.feed_forward2.out_whiten, num_groups=1, num_channels=192, metric=5.34 vs. limit=15.0 2023-10-04 04:15:50,376 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.1.encoder.layers.1.feed_forward3.out_whiten, num_groups=1, num_channels=256, metric=9.77 vs. limit=15.0 2023-10-04 04:15:51,048 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_2009_210_2071_1_1525481134_4588937_140-345530-0_sp1.1 from training. Duration: 0.963625 2023-10-04 04:15:51,056 WARNING [train.py:1204] (2/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_138-332773-0_sp0.9 from training. Duration: 0.9 2023-10-04 04:15:51,125 WARNING [train.py:1204] (2/4) Exclude cut with ID _13942_210_9988_1_1530604906036_3733360_499-55676-0_sp0.9 from training. Duration: 0.688875 2023-10-04 04:15:52,389 WARNING [train.py:1204] (2/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_259-34038-0_sp1.1 from training. Duration: 0.67275 2023-10-04 04:15:52,398 WARNING [train.py:1204] (2/4) Exclude cut with ID _3120_210_3525_1_1526036143112_4072980_404-249994-0_sp1.1 from training. Duration: 0.9 2023-10-04 04:15:52,435 WARNING [train.py:1204] (2/4) Exclude cut with ID _40402_210_18219_1_1533522324115_2953150_222-108810-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 04:15:53,835 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.598e+02 2.002e+02 2.172e+02 2.451e+02 4.119e+02, threshold=4.344e+02, percent-clipped=0.0 2023-10-04 04:15:55,397 WARNING [train.py:1204] (2/4) Exclude cut with ID _22083_210_8254_1_1531702862439_3678970_49-234546-0_sp0.9 from training. Duration: 0.92225 2023-10-04 04:15:56,707 INFO [train.py:1046] (2/4) Epoch 43, batch 5200, loss[loss=0.1625, simple_loss=0.2465, pruned_loss=0.03925, over 23178.00 frames. ], tot_loss[loss=0.1561, simple_loss=0.2365, pruned_loss=0.03779, over 4717067.66 frames. ], batch size: 93, lr: 2.35e-03, grad_scale: 16.0 2023-10-04 04:15:58,603 WARNING [train.py:1204] (2/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_199-295611-0_sp0.9 from training. Duration: 0.611125 2023-10-04 04:16:00,151 WARNING [train.py:1204] (2/4) Exclude cut with ID _43552_210_8913_1_1533726031059_3523429_43-192945-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 04:16:04,815 WARNING [train.py:1204] (2/4) Exclude cut with ID _14224_210_4381_1_1530529062037_4264010_364-22817-0 from training. Duration: 0.71 2023-10-04 04:16:06,164 WARNING [train.py:1204] (2/4) Exclude cut with ID _14922_210_6228_1_1530707435273_3693310_388-129552-0_sp1.1 from training. Duration: 0.87275 2023-10-04 04:16:06,220 WARNING [train.py:1204] (2/4) Exclude cut with ID _6537_210_1883_1_1527987559710_3597129_190-280227-0_sp1.1 from training. Duration: 0.97275 2023-10-04 04:16:10,842 WARNING [train.py:1204] (2/4) Exclude cut with ID _55844_210_2346_1_1534471212569_7625707_398-56897-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 04:16:10,922 WARNING [train.py:1204] (2/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_48-182155-0_sp1.1 from training. Duration: 0.7909375 2023-10-04 04:16:12,259 WARNING [train.py:1204] (2/4) Exclude cut with ID _29529_210_7234_1_1532393961455_3977460_582-18084-0_sp1.1 from training. Duration: 0.97275 2023-10-04 04:16:14,008 WARNING [train.py:1204] (2/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_912-282412-0 from training. Duration: 0.49 2023-10-04 04:16:14,733 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.2.encoder.layers.1.self_attn_weights.whiten_keys, num_groups=4, num_channels=128, metric=2.67 vs. limit=6.0 2023-10-04 04:16:16,730 WARNING [train.py:1204] (2/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_332-256168-0_sp0.9 from training. Duration: 0.8333125 2023-10-04 04:16:18,050 WARNING [train.py:1204] (2/4) Exclude cut with ID _64220_210_9527_1_1535250151772_4875259_55-250624-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 04:16:20,806 WARNING [train.py:1204] (2/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_264-108685-0 from training. Duration: 0.55 2023-10-04 04:16:22,165 WARNING [train.py:1204] (2/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_613-232856-0_sp0.9 from training. Duration: 0.6 2023-10-04 04:16:23,551 WARNING [train.py:1204] (2/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_68-212801-0_sp1.1 from training. Duration: 0.663625 2023-10-04 04:16:24,873 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6290_210_3273_1_1527595224_1282787_115-87095-0 from training. Duration: 0.55 2023-10-04 04:16:24,918 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_3080_210_2071_1_1526086102_4365166_312-8726-0 from training. Duration: 0.91 2023-10-04 04:16:26,435 WARNING [train.py:1204] (2/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_105-63309-0 from training. Duration: 0.98 2023-10-04 04:16:27,817 WARNING [train.py:1204] (2/4) Exclude cut with ID _2941_210_2577_1_1526199673150_2489759_201-160779-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 04:16:27,820 WARNING [train.py:1204] (2/4) Exclude cut with ID ID0406W0226-885434 from training. Duration: 0.997 2023-10-04 04:16:27,835 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_17292_210_6753_1_1531125388_2943734_353-328201-0_sp1.1 from training. Duration: 0.97275 2023-10-04 04:16:27,946 WARNING [train.py:1204] (2/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_572-96029-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 04:16:29,281 WARNING [train.py:1204] (2/4) Exclude cut with ID _14970_210_15709_1_1530775547361_1966965_6-23328-0_sp0.9 from training. Duration: 0.9555625 2023-10-04 04:16:29,335 WARNING [train.py:1204] (2/4) Exclude cut with ID _45012_210_12651_1_1533608435136_5371380_240-37101-0 from training. Duration: 0.93 2023-10-04 04:16:29,409 WARNING [train.py:1204] (2/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_685-90183-0_sp1.1 from training. Duration: 0.936375 2023-10-04 04:16:31,495 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.balancer2.prob, batch_count=1522200.0, ans=0.125 2023-10-04 04:16:33,089 WARNING [train.py:1204] (2/4) Exclude cut with ID _13252_210_1925_1_1530671372047_3336909_41-22245-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 04:16:34,551 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_15126_210_6753_1_1530778677_2591185_120-277707-0 from training. Duration: 0.8 2023-10-04 04:16:35,750 WARNING [train.py:1204] (2/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_310-26186-0 from training. Duration: 0.7 2023-10-04 04:16:35,775 WARNING [train.py:1204] (2/4) Exclude cut with ID _14998_210_5219_1_1530705582080_7474970_787-294416-0 from training. Duration: 0.82 2023-10-04 04:16:41,879 WARNING [train.py:1204] (2/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_795-33252-0 from training. Duration: 0.37 2023-10-04 04:16:41,920 WARNING [train.py:1204] (2/4) Exclude cut with ID _7537_210_2084_1_1528502395431_3924359_113-199933-0_sp0.9 from training. Duration: 0.6555625 2023-10-04 04:16:44,781 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.0.layers.1.feed_forward2.out_whiten, num_groups=1, num_channels=192, metric=4.05 vs. limit=15.0 2023-10-04 04:16:45,302 WARNING [train.py:1204] (2/4) Exclude cut with ID _3465_210_3525_1_1526175667060_3574049_308-231735-0_sp0.9 from training. Duration: 0.811125 2023-10-04 04:16:46,586 WARNING [train.py:1204] (2/4) Exclude cut with ID _19504_210_9994_1_1531648863242_3325220_354-296765-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 04:16:47,978 WARNING [train.py:1204] (2/4) Exclude cut with ID _5409_210_1388_1_1527292850189_5865191_178-127970-0 from training. Duration: 0.57 2023-10-04 04:16:48,026 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_1466-248056-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 04:16:48,044 WARNING [train.py:1204] (2/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_535-14410-0_sp0.9 from training. Duration: 0.52225 2023-10-04 04:16:49,287 WARNING [train.py:1204] (2/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_747-129941-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 04:16:49,335 WARNING [train.py:1204] (2/4) Exclude cut with ID _15012_210_1968_1_1530856820429_5095350_688-178849-0_sp1.1 from training. Duration: 0.5818125 2023-10-04 04:16:50,986 INFO [scaling.py:1118] (2/4) WithLoss: name=encoder.encoders.4.encoder.layers.1.self_attn_weights, loss-sum=0.000e+00 2023-10-04 04:16:53,361 WARNING [train.py:1204] (2/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_356-45054-0_sp1.1 from training. Duration: 0.7818125 2023-10-04 04:16:54,664 WARNING [train.py:1204] (2/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_146-276944-0_sp0.9 from training. Duration: 0.92225 2023-10-04 04:16:58,905 WARNING [train.py:1204] (2/4) Exclude cut with ID _39204_210_4220_1_1533880547368_3191120_346-180306-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 04:16:58,986 WARNING [train.py:1204] (2/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_641-158017-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 04:16:58,987 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_19952_210_2161_1_1531875469_3851750_274-42137-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 04:16:59,183 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.1.ff2_skip_rate, batch_count=1522333.3333333333, ans=0.0 2023-10-04 04:17:06,758 WARNING [train.py:1204] (2/4) Exclude cut with ID _60804_210_9642_1_1534831199859_7268299_87-327819-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 04:17:06,843 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_9302_210_2550_1_1528889962_4655416_638-301620-0 from training. Duration: 0.47 2023-10-04 04:17:06,975 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.ff2_skip_rate, batch_count=1522333.3333333333, ans=0.0 2023-10-04 04:17:08,216 WARNING [train.py:1204] (2/4) Exclude cut with ID _44304_210_15374_1_1533639352765_4613129_302-22084-0_sp1.1 from training. Duration: 0.7818125 2023-10-04 04:17:08,233 WARNING [train.py:1204] (2/4) Exclude cut with ID _18513_210_9206_1_1531618215223_3862620_177-227063-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 04:17:10,058 WARNING [train.py:1204] (2/4) Exclude cut with ID _41326_210_5854_1_1533778076509_3492080_510-45444-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 04:17:11,339 INFO [train.py:1046] (2/4) Epoch 43, batch 5250, loss[loss=0.155, simple_loss=0.2436, pruned_loss=0.03314, over 24416.00 frames. ], tot_loss[loss=0.155, simple_loss=0.2348, pruned_loss=0.03756, over 4699706.44 frames. ], batch size: 69, lr: 2.35e-03, grad_scale: 16.0 2023-10-04 04:17:11,400 WARNING [train.py:1204] (2/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_76-311963-0_sp1.1 from training. Duration: 0.636375 2023-10-04 04:17:12,783 WARNING [train.py:1204] (2/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_156-302675-0_sp0.9 from training. Duration: 0.611125 2023-10-04 04:17:15,897 WARNING [train.py:1204] (2/4) Exclude cut with ID _53489_210_6228_1_1534298357169_3835970_367-148411-0_sp1.1 from training. Duration: 0.936375 2023-10-04 04:17:17,394 WARNING [train.py:1204] (2/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_389-144525-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 04:17:17,435 WARNING [train.py:1204] (2/4) Exclude cut with ID _14069_210_5632_1_1530512692033_4803390_158-238427-0_sp1.1 from training. Duration: 0.963625 2023-10-04 04:17:18,852 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_486-151131-0_sp0.9 from training. Duration: 0.9444375 2023-10-04 04:17:24,417 WARNING [train.py:1204] (2/4) Exclude cut with ID _39846_210_12651_1_1533603651992_1318030_188-71831-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 04:17:24,600 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.conv_module2.balancer1.prob, batch_count=1522466.6666666667, ans=0.125 2023-10-04 04:17:27,125 WARNING [train.py:1204] (2/4) Exclude cut with ID _39411_210_5986_1_1533538825030_3343649_173-238248-0_sp1.1 from training. Duration: 0.7545625 2023-10-04 04:17:28,517 WARNING [train.py:1204] (2/4) Exclude cut with ID _20684_210_6917_1_1531567751646_4203930_469-49081-0_sp1.1 from training. Duration: 0.92725 2023-10-04 04:17:29,927 WARNING [train.py:1204] (2/4) Exclude cut with ID _8833_210_5219_1_1528592665210_7553059_611-80313-0_sp1.1 from training. Duration: 0.5818125 2023-10-04 04:17:31,366 WARNING [train.py:1204] (2/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_440-141811-0 from training. Duration: 0.95 2023-10-04 04:17:31,380 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_41211_210_2544_1_1533348859_3702910_236-223652-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 04:17:32,843 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_40222_210_6228_1_1533385298_4481939_220-163998-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 04:17:48,779 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.4.encoder.layers.2.nonlin_attention.whiten2, num_groups=1, num_channels=384, metric=3.36 vs. limit=15.0 2023-10-04 04:17:54,841 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.conv_module2.balancer1.prob, batch_count=1522600.0, ans=0.125 2023-10-04 04:18:04,514 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.5.encoder.layers.0.self_attn2.whiten, num_groups=1, num_channels=256, metric=10.72 vs. limit=22.5 2023-10-04 04:18:18,266 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.541e+02 1.988e+02 2.166e+02 2.434e+02 3.421e+02, threshold=4.331e+02, percent-clipped=0.0 2023-10-04 04:18:19,585 INFO [train.py:1046] (2/4) Epoch 43, batch 5300, loss[loss=0.1376, simple_loss=0.2182, pruned_loss=0.02847, over 24569.00 frames. ], tot_loss[loss=0.154, simple_loss=0.2336, pruned_loss=0.03723, over 4687508.16 frames. ], batch size: 60, lr: 2.35e-03, grad_scale: 8.0 2023-10-04 04:18:32,465 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.bypass_mid.scale_min, batch_count=1522800.0, ans=0.2 2023-10-04 04:18:33,459 WARNING [train.py:1204] (2/4) Exclude cut with ID _14629_210_15709_1_1530604298182_956899_6-232695-0_sp1.1 from training. Duration: 0.8545625 2023-10-04 04:18:33,483 WARNING [train.py:1204] (2/4) Exclude cut with ID _13921_210_9994_1_1530841782272_3503520_327-69677-0 from training. Duration: 0.9 2023-10-04 04:18:33,512 WARNING [train.py:1204] (2/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_372-298093-0 from training. Duration: 0.8 2023-10-04 04:18:33,529 WARNING [train.py:1204] (2/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_515-139111-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 04:18:34,147 WARNING [train.py:1204] (2/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_85-65056-0_sp1.1 from training. Duration: 0.87275 2023-10-04 04:18:34,194 WARNING [train.py:1204] (2/4) Exclude cut with ID _15113_210_8254_1_1530842438579_3716279_209-54533-0_sp1.1 from training. Duration: 0.87275 2023-10-04 04:18:34,241 WARNING [train.py:1204] (2/4) Exclude cut with ID _70447_210_12392_1_1535972499300_3160420_123-312838-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 04:18:34,269 WARNING [train.py:1204] (2/4) Exclude cut with ID _60113_210_16271_1_1535164212481_4386079_70-63596-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 04:18:34,296 WARNING [train.py:1204] (2/4) Exclude cut with ID _14632_210_9988_1_1530691279392_3923200_225-188509-0_sp1.1 from training. Duration: 0.963625 2023-10-04 04:18:34,346 WARNING [train.py:1204] (2/4) Exclude cut with ID _34734_210_4525_1_1533365977542_4448720_584-180532-0_sp1.1 from training. Duration: 0.97275 2023-10-04 04:18:34,359 WARNING [train.py:1204] (2/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_351-294650-0_sp1.1 from training. Duration: 0.57275 2023-10-04 04:18:34,695 WARNING [train.py:1204] (2/4) Exclude cut with ID _13252_210_1925_1_1530671372047_3336909_263-168388-0_sp0.9 from training. Duration: 0.9555625 2023-10-04 04:18:34,722 WARNING [train.py:1204] (2/4) Exclude cut with ID _22083_210_8254_1_1531702862439_3678970_201-85738-0 from training. Duration: 0.9 2023-10-04 04:18:34,813 WARNING [train.py:1204] (2/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_237-58433-0 from training. Duration: 0.74 2023-10-04 04:18:34,841 WARNING [train.py:1204] (2/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_262-250826-0 from training. Duration: 0.99 2023-10-04 04:18:34,952 WARNING [train.py:1204] (2/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_927-43827-0_sp0.9 from training. Duration: 0.82225 2023-10-04 04:18:34,984 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_2821_210_2715_1_1526108385_3763560_73-296673-0 from training. Duration: 0.83 2023-10-04 04:18:35,062 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6287_210_3273_1_1527509506_2653716_182-199887-0 from training. Duration: 0.51 2023-10-04 04:18:35,219 WARNING [train.py:1204] (2/4) Exclude cut with ID _70519_210_4163_1_1535961595994_7458230_899-80223-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 04:18:35,500 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_210-234949-0_sp1.1 from training. Duration: 0.97275 2023-10-04 04:18:35,539 WARNING [train.py:1204] (2/4) Exclude cut with ID _30196_210_1664_1_1533708496522_7074140_768-278176-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 04:18:35,620 WARNING [train.py:1204] (2/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_315-128283-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 04:18:35,716 WARNING [train.py:1204] (2/4) Exclude cut with ID _14886_210_4220_1_1530702020355_3516190_330-323941-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 04:18:36,363 WARNING [train.py:1204] (2/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_264-348896-0_sp1.1 from training. Duration: 0.77275 2023-10-04 04:18:36,384 WARNING [train.py:1204] (2/4) Exclude cut with ID _37086_210_9038_1_1532999540891_7021250_433-10563-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 04:18:36,418 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_53638_210_21581_1_1534297319_5808449_885-80172-0_sp1.1 from training. Duration: 0.87275 2023-10-04 04:18:36,509 WARNING [train.py:1204] (2/4) Exclude cut with ID _14623_210_1925_1_1530773354962_3632609_157-218771-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 04:18:36,519 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_1368-78165-0_sp1.1 from training. Duration: 0.97275 2023-10-04 04:18:36,523 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_309-65177-0_sp0.9 from training. Duration: 0.97775 2023-10-04 04:18:36,529 WARNING [train.py:1204] (2/4) Exclude cut with ID _21632_210_2253_1_1531994001765_3744140_291-138812-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 04:18:36,541 WARNING [train.py:1204] (2/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_669-93163-0_sp1.1 from training. Duration: 0.8909375 2023-10-04 04:18:37,096 WARNING [train.py:1204] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_292-215346-0 from training. Duration: 0.97 2023-10-04 04:18:37,121 WARNING [train.py:1204] (2/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_286-222073-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 04:18:37,413 WARNING [train.py:1204] (2/4) Exclude cut with ID _59256_210_5986_1_1535076085997_3589120_121-237386-0_sp1.1 from training. Duration: 0.87275 2023-10-04 04:18:37,426 WARNING [train.py:1204] (2/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_96-320736-0 from training. Duration: 0.86 2023-10-04 04:18:37,436 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_128-202104-0 from training. Duration: 0.63 2023-10-04 04:18:37,523 WARNING [train.py:1204] (2/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_242-3472-0_sp0.9 from training. Duration: 0.811125 2023-10-04 04:18:37,567 WARNING [train.py:1204] (2/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_154-74182-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 04:18:37,570 WARNING [train.py:1204] (2/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_187-221566-0 from training. Duration: 0.43 2023-10-04 04:18:37,727 WARNING [train.py:1204] (2/4) Exclude cut with ID _6194_210_1534_1_1527511826160_3941194_14-282876-0 from training. Duration: 0.71 2023-10-04 04:18:37,766 WARNING [train.py:1204] (2/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_660-211088-0_sp1.1 from training. Duration: 0.563625 2023-10-04 04:18:38,098 WARNING [train.py:1204] (2/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_526-62079-0_sp1.1 from training. Duration: 0.6090625 2023-10-04 04:18:38,699 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_92-246270-0_sp1.1 from training. Duration: 0.77275 2023-10-04 04:18:38,792 WARNING [train.py:1204] (2/4) Exclude cut with ID IC0287W0023-142350 from training. Duration: 0.956 2023-10-04 04:18:38,873 WARNING [train.py:1204] (2/4) Exclude cut with ID _58605_210_12210_1_1534723195939_3904259_48-269402-0 from training. Duration: 0.96 2023-10-04 04:18:38,889 WARNING [train.py:1204] (2/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_416-257730-0_sp0.9 from training. Duration: 0.72225 2023-10-04 04:18:38,968 WARNING [train.py:1204] (2/4) Exclude cut with ID _7877_210_6014_1_1528286358099_3750136_185-17516-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 04:18:39,079 WARNING [train.py:1204] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_23-189234-0 from training. Duration: 0.83 2023-10-04 04:18:39,097 WARNING [train.py:1204] (2/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_237-89684-0 from training. Duration: 0.96 2023-10-04 04:18:39,170 WARNING [train.py:1204] (2/4) Exclude cut with ID _13916_210_9994_1_1530788341126_3763149_116-21034-0 from training. Duration: 0.96 2023-10-04 04:18:39,313 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_3133_210_2087_1_1526106446_2324690_246-125000-0_sp1.1 from training. Duration: 0.563625 2023-10-04 04:18:43,400 INFO [train.py:1046] (2/4) Epoch 44, batch 0, loss[loss=0.1446, simple_loss=0.2319, pruned_loss=0.02864, over 24470.00 frames. ], tot_loss[loss=0.1446, simple_loss=0.2319, pruned_loss=0.02864, over 24470.00 frames. ], batch size: 66, lr: 2.32e-03, grad_scale: 16.0 2023-10-04 04:18:43,401 INFO [train.py:1069] (2/4) Computing validation loss 2023-10-04 04:18:55,636 INFO [train.py:1078] (2/4) Epoch 44, validation: loss=0.3443, simple_loss=0.2733, pruned_loss=0.2076, over 1125622.00 frames. 2023-10-04 04:18:55,637 INFO [train.py:1079] (2/4) Maximum memory allocated so far is 20967MB 2023-10-04 04:18:59,092 WARNING [train.py:1204] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_402-253027-0 from training. Duration: 0.54 2023-10-04 04:18:59,157 WARNING [train.py:1204] (2/4) Exclude cut with ID _59288_210_5854_1_1535077757774_3412246_158-289345-0_sp1.1 from training. Duration: 0.92725 2023-10-04 04:19:00,544 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_28211_210_6388_1_1532861506_4523224_128-328162-0_sp1.1 from training. Duration: 0.8181875 2023-10-04 04:19:05,721 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.out_combiner.scale_min, batch_count=1522813.3333333333, ans=0.2 2023-10-04 04:19:06,860 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_788-278281-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 04:19:06,876 WARNING [train.py:1204] (2/4) Exclude cut with ID _49728_210_12651_1_1534041034294_6788150_624-195301-0_sp0.9 from training. Duration: 0.9444375 2023-10-04 04:19:06,908 WARNING [train.py:1204] (2/4) Exclude cut with ID _14860_210_4915_1_1530702020340_7343240_267-139775-0_sp1.1 from training. Duration: 0.963625 2023-10-04 04:19:08,296 WARNING [train.py:1204] (2/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_332-221057-0 from training. Duration: 0.68 2023-10-04 04:19:08,409 WARNING [train.py:1204] (2/4) Exclude cut with ID _40402_210_18219_1_1533522324115_2953150_242-76943-0 from training. Duration: 0.94 2023-10-04 04:19:09,822 WARNING [train.py:1204] (2/4) Exclude cut with ID _16747_210_5986_1_1531811544733_2749109_87-349587-0_sp1.1 from training. Duration: 0.963625 2023-10-04 04:19:11,189 WARNING [train.py:1204] (2/4) Exclude cut with ID _22982_210_1883_1_1531882790447_5449409_505-346452-0_sp1.1 from training. Duration: 0.963625 2023-10-04 04:19:15,110 WARNING [train.py:1204] (2/4) Exclude cut with ID _17956_210_4353_1_1531209568125_6979970_13-192107-0_sp1.1 from training. Duration: 0.963625 2023-10-04 04:19:15,144 WARNING [train.py:1204] (2/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_430-307666-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 04:19:16,519 WARNING [train.py:1204] (2/4) Exclude cut with ID _2895_210_2477_1_1525956785794_4063750_289-11475-0_sp1.1 from training. Duration: 0.7454375 2023-10-04 04:19:18,326 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_3910_210_1794_1_1526742306_334530_28-238909-0_sp0.9 from training. Duration: 0.9 2023-10-04 04:19:18,445 WARNING [train.py:1204] (2/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_18-130336-0 from training. Duration: 0.79 2023-10-04 04:19:19,933 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_548-294316-0_sp0.9 from training. Duration: 0.9 2023-10-04 04:19:21,396 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.0.bypass_mid.scale_min, batch_count=1522880.0, ans=0.2 2023-10-04 04:19:28,101 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_42-142473-0_sp1.1 from training. Duration: 0.6545625 2023-10-04 04:19:28,109 WARNING [train.py:1204] (2/4) Exclude cut with ID _59170_210_6721_1_1534983633960_4203335_326-48066-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 04:19:31,447 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_45886_210_17024_1_1533709215_471063_7-79165-0 from training. Duration: 0.98 2023-10-04 04:19:34,099 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.4.encoder.layers.2.conv_module2.whiten, num_groups=1, num_channels=384, metric=4.16 vs. limit=15.0 2023-10-04 04:19:34,959 WARNING [train.py:1204] (2/4) Exclude cut with ID _44585_210_18628_1_1533605350387_4518199_280-73534-0_sp0.9 from training. Duration: 0.988875 2023-10-04 04:19:34,966 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_3475_210_2028_1_1526193578_3622569_265-276625-0_sp1.1 from training. Duration: 0.6909375 2023-10-04 04:19:36,514 WARNING [train.py:1204] (2/4) Exclude cut with ID _64631_210_27767_1_1535349580068_4165160_88-225232-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 04:19:41,027 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_205-265518-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 04:19:45,347 WARNING [train.py:1204] (2/4) Exclude cut with ID _40402_210_18219_1_1533522324115_2953150_271-253734-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 04:19:47,113 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.balancer1.prob, batch_count=1523013.3333333333, ans=0.125 2023-10-04 04:19:51,539 WARNING [train.py:1204] (2/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_282-257791-0 from training. Duration: 0.86 2023-10-04 04:19:54,313 WARNING [train.py:1204] (2/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_356-45054-0 from training. Duration: 0.86 2023-10-04 04:19:54,347 WARNING [train.py:1204] (2/4) Exclude cut with ID _69584_210_5219_1_1535785133775_4044450_38-348116-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 04:19:54,348 WARNING [train.py:1204] (2/4) Exclude cut with ID _66763_210_17301_1_1535788667617_241750_21-328480-0_sp1.1 from training. Duration: 0.936375 2023-10-04 04:19:54,416 WARNING [train.py:1204] (2/4) Exclude cut with ID _26528_210_9070_1_1532570410842_3851310_497-97218-0_sp0.9 from training. Duration: 0.9555625 2023-10-04 04:19:55,759 WARNING [train.py:1204] (2/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_592-101075-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 04:19:57,179 WARNING [train.py:1204] (2/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_678-159307-0 from training. Duration: 0.85 2023-10-04 04:19:59,735 WARNING [train.py:1204] (2/4) Exclude cut with ID _28427_210_4220_1_1532581240967_3061069_166-319773-0_sp1.1 from training. Duration: 0.936375 2023-10-04 04:19:59,812 WARNING [train.py:1204] (2/4) Exclude cut with ID _29877_210_6228_1_1532397791106_5276149_606-224984-0_sp1.1 from training. Duration: 0.936375 2023-10-04 04:20:00,804 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.0.layers.0.self_attn1.whiten, num_groups=1, num_channels=192, metric=11.86 vs. limit=22.5 2023-10-04 04:20:04,467 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_125-203105-0_sp1.1 from training. Duration: 0.72725 2023-10-04 04:20:06,367 WARNING [train.py:1204] (2/4) Exclude cut with ID IC0620W0113-306765 from training. Duration: 0.768 2023-10-04 04:20:07,813 WARNING [train.py:1204] (2/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_410-287857-0_sp1.1 from training. Duration: 0.8454375 2023-10-04 04:20:08,040 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.conv_module2.balancer2.min_positive, batch_count=1523080.0, ans=0.05 2023-10-04 04:20:10,409 INFO [train.py:1046] (2/4) Epoch 44, batch 50, loss[loss=0.1572, simple_loss=0.2441, pruned_loss=0.03517, over 23969.00 frames. ], tot_loss[loss=0.1549, simple_loss=0.2362, pruned_loss=0.0368, over 1082031.12 frames. ], batch size: 80, lr: 2.32e-03, grad_scale: 16.0 2023-10-04 04:20:10,549 WARNING [train.py:1204] (2/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_78-95260-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 04:20:10,810 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.feed_forward2.hidden_balancer.prob, batch_count=1523146.6666666667, ans=0.125 2023-10-04 04:20:13,211 WARNING [train.py:1204] (2/4) Exclude cut with ID _58059_210_5854_1_1534901471149_3164808_6-73080-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 04:20:13,225 WARNING [train.py:1204] (2/4) Exclude cut with ID _5166_210_2084_1_1527073197160_4087993_331-117829-0 from training. Duration: 0.62 2023-10-04 04:20:13,543 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.bypass.scale_min, batch_count=1523146.6666666667, ans=0.2 2023-10-04 04:20:14,578 WARNING [train.py:1204] (2/4) Exclude cut with ID _14880_210_8895_1_1530680426655_3795259_289-212817-0_sp0.9 from training. Duration: 0.7666875 2023-10-04 04:20:14,605 WARNING [train.py:1204] (2/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_620-144779-0_sp1.1 from training. Duration: 0.92725 2023-10-04 04:20:15,996 WARNING [train.py:1204] (2/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_561-288575-0_sp1.1 from training. Duration: 0.963625 2023-10-04 04:20:17,373 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_22657_210_6890_1_1532086002_1637216_122-188128-0_sp1.1 from training. Duration: 0.963625 2023-10-04 04:20:19,529 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.attention_skip_rate, batch_count=1523146.6666666667, ans=0.0 2023-10-04 04:20:20,968 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=1523146.6666666667, ans=0.1 2023-10-04 04:20:22,009 WARNING [train.py:1204] (2/4) Exclude cut with ID _16756_210_4915_1_1531047803763_7104259_604-86607-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 04:20:26,206 WARNING [train.py:1204] (2/4) Exclude cut with ID _15150_210_8341_1_1530752411803_7256384_469-239509-0 from training. Duration: 0.68 2023-10-04 04:20:26,215 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_10987_210_4163_1_1529760555_4123902_302-87926-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 04:20:28,597 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.0.layers.0.nonlin_attention.whiten1, num_groups=1, num_channels=144, metric=9.47 vs. limit=10.0 2023-10-04 04:20:31,812 WARNING [train.py:1204] (2/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_469-94673-0_sp0.9 from training. Duration: 0.87775 2023-10-04 04:20:33,319 WARNING [train.py:1204] (2/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_798-106179-0 from training. Duration: 0.83 2023-10-04 04:20:35,159 WARNING [train.py:1204] (2/4) Exclude cut with ID _16744_210_5986_1_1531724405384_3907567_242-111316-0 from training. Duration: 0.93 2023-10-04 04:20:37,232 WARNING [train.py:1204] (2/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_938-205135-0_sp0.9 from training. Duration: 0.9666875 2023-10-04 04:20:38,416 WARNING [train.py:1204] (2/4) Exclude cut with ID _64135_210_2253_1_1535677267876_7608972_133-188266-0_sp0.9 from training. Duration: 0.9 2023-10-04 04:20:38,420 WARNING [train.py:1204] (2/4) Exclude cut with ID _44585_210_18628_1_1533605350387_4518199_623-346820-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 04:20:41,672 WARNING [train.py:1204] (2/4) Exclude cut with ID _42326_210_7770_1_1533814282531_3707269_454-266903-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 04:20:43,179 WARNING [train.py:1204] (2/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_5-55893-0_sp1.1 from training. Duration: 0.5 2023-10-04 04:20:43,222 WARNING [train.py:1204] (2/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_577-332600-0_sp1.1 from training. Duration: 0.5181875 2023-10-04 04:20:43,222 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_957-138882-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 04:20:46,110 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.0.conv_module2.balancer2.prob, batch_count=1523280.0, ans=0.125 2023-10-04 04:20:51,022 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.1.encoder.layers.0.nonlin_attention.whiten2, num_groups=1, num_channels=256, metric=5.65 vs. limit=15.0 2023-10-04 04:20:51,525 WARNING [train.py:1204] (2/4) Exclude cut with ID _12556_210_8341_1_1530081024100_7327690_219-38004-0_sp1.1 from training. Duration: 0.97275 2023-10-04 04:20:52,850 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_397-147196-0_sp1.1 from training. Duration: 0.72725 2023-10-04 04:20:52,859 WARNING [train.py:1204] (2/4) Exclude cut with ID _3541_210_2477_1_1526475408709_3263973_231-104736-0_sp0.9 from training. Duration: 0.8666875 2023-10-04 04:20:54,289 WARNING [train.py:1204] (2/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_398-334558-0 from training. Duration: 0.96 2023-10-04 04:20:56,139 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_257-20733-0_sp1.1 from training. Duration: 0.6909375 2023-10-04 04:20:57,561 WARNING [train.py:1204] (2/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_146-282400-0_sp1.1 from training. Duration: 0.6454375 2023-10-04 04:20:57,565 WARNING [train.py:1204] (2/4) Exclude cut with ID _51899_210_20034_1_1534129254466_3669220_265-215428-0 from training. Duration: 0.98 2023-10-04 04:20:57,635 WARNING [train.py:1204] (2/4) Exclude cut with ID _34423_210_8750_1_1533430798456_3330199_219-106541-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 04:20:57,797 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.conv_skip_rate, batch_count=1523346.6666666667, ans=0.0 2023-10-04 04:20:59,044 WARNING [train.py:1204] (2/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_458-67321-0 from training. Duration: 0.97 2023-10-04 04:21:04,387 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.628e+02 1.994e+02 2.209e+02 2.588e+02 5.609e+02, threshold=4.418e+02, percent-clipped=8.0 2023-10-04 04:21:04,796 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.conv_module2.balancer2.min_abs, batch_count=1523346.6666666667, ans=0.5 2023-10-04 04:21:07,294 WARNING [train.py:1204] (2/4) Exclude cut with ID _6653_210_2629_1_1527919172441_6732679_1051-182606-0_sp1.1 from training. Duration: 0.936375 2023-10-04 04:21:07,308 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_53555_210_5632_1_1534595220_7232297_603-4396-0_sp1.1 from training. Duration: 0.97275 2023-10-04 04:21:09,082 WARNING [train.py:1204] (2/4) Exclude cut with ID _54218_210_12210_1_1534413646388_4409259_159-144367-0_sp1.1 from training. Duration: 0.963625 2023-10-04 04:21:09,182 WARNING [train.py:1204] (2/4) Exclude cut with ID _7606_210_4915_1_1528894253277_5080690_301-10732-0_sp0.9 from training. Duration: 0.9 2023-10-04 04:21:09,185 WARNING [train.py:1204] (2/4) Exclude cut with ID _15026_210_8750_1_1530777602316_3291850_211-195043-0_sp0.9 from training. Duration: 0.888875 2023-10-04 04:21:12,438 WARNING [train.py:1204] (2/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_146-282400-0 from training. Duration: 0.71 2023-10-04 04:21:12,477 WARNING [train.py:1204] (2/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_65-88242-0 from training. Duration: 0.56 2023-10-04 04:21:13,799 WARNING [train.py:1204] (2/4) Exclude cut with ID _55857_210_2346_1_1534644007831_6898209_80-107757-0_sp1.1 from training. Duration: 0.963625 2023-10-04 04:21:13,830 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_15126_210_6753_1_1530778677_2591185_120-277707-0_sp0.9 from training. Duration: 0.888875 2023-10-04 04:21:15,205 WARNING [train.py:1204] (2/4) Exclude cut with ID _44778_210_2054_1_1533798127203_7443090_1033-168593-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 04:21:16,505 WARNING [train.py:1204] (2/4) Exclude cut with ID _46683_210_9642_1_1533730913492_4591116_475-30711-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 04:21:16,530 WARNING [train.py:1204] (2/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_253-308377-0 from training. Duration: 0.49 2023-10-04 04:21:16,587 WARNING [train.py:1204] (2/4) Exclude cut with ID _4680_210_3525_1_1527249757953_893386_75-78979-0 from training. Duration: 0.91 2023-10-04 04:21:17,964 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_5522_210_3273_1_1527249606_3909096_318-203153-0_sp0.9 from training. Duration: 0.47775 2023-10-04 04:21:18,071 WARNING [train.py:1204] (2/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_550-170116-0_sp1.1 from training. Duration: 0.92725 2023-10-04 04:21:19,330 WARNING [train.py:1204] (2/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_277-210798-0_sp0.9 from training. Duration: 0.611125 2023-10-04 04:21:19,393 WARNING [train.py:1204] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_620-276793-0 from training. Duration: 0.59 2023-10-04 04:21:19,405 WARNING [train.py:1204] (2/4) Exclude cut with ID _3067_210_2667_1_1526212974640_4503168_119-175776-0 from training. Duration: 0.74 2023-10-04 04:21:20,801 WARNING [train.py:1204] (2/4) Exclude cut with ID _55209_210_1531_1_1534675828996_4597160_681-195827-0_sp1.1 from training. Duration: 0.92725 2023-10-04 04:21:20,872 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_427-78097-0_sp1.1 from training. Duration: 0.72725 2023-10-04 04:21:22,268 WARNING [train.py:1204] (2/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_96-290039-0_sp0.9 from training. Duration: 0.62225 2023-10-04 04:21:22,276 WARNING [train.py:1204] (2/4) Exclude cut with ID _45329_210_7005_1_1534157951211_6774339_287-292174-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 04:21:23,643 INFO [train.py:1046] (2/4) Epoch 44, batch 100, loss[loss=0.1315, simple_loss=0.2122, pruned_loss=0.02543, over 24402.00 frames. ], tot_loss[loss=0.1554, simple_loss=0.2373, pruned_loss=0.03677, over 1898022.56 frames. ], batch size: 58, lr: 2.32e-03, grad_scale: 16.0 2023-10-04 04:21:26,801 WARNING [train.py:1204] (2/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_200-292228-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 04:21:29,505 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.0.conv_module1.balancer2.prob, batch_count=1523480.0, ans=0.125 2023-10-04 04:21:30,775 WARNING [train.py:1204] (2/4) Exclude cut with ID _69884_210_9771_1_1535846392818_4166169_291-194999-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 04:21:33,618 WARNING [train.py:1204] (2/4) Exclude cut with ID _58895_210_8750_1_1535184081320_3649089_265-323810-0_sp1.1 from training. Duration: 0.87275 2023-10-04 04:21:34,943 WARNING [train.py:1204] (2/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_58-68956-0 from training. Duration: 0.83 2023-10-04 04:21:34,957 WARNING [train.py:1204] (2/4) Exclude cut with ID _39867_210_9826_1_1533553222030_9344259_322-115153-0_sp1.1 from training. Duration: 0.963625 2023-10-04 04:21:38,463 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_3371_210_1794_1_1526384462_5580777_396-339812-0_sp1.1 from training. Duration: 0.77275 2023-10-04 04:21:38,474 WARNING [train.py:1204] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_731-2428-0_sp0.9 from training. Duration: 0.92225 2023-10-04 04:21:38,505 WARNING [train.py:1204] (2/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_98-741-0_sp1.1 from training. Duration: 0.72725 2023-10-04 04:21:38,507 WARNING [train.py:1204] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_238-245655-0_sp0.9 from training. Duration: 0.9 2023-10-04 04:21:38,528 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6788_210_1388_1_1527898698_4562021_91-197981-0_sp0.9 from training. Duration: 0.92225 2023-10-04 04:21:41,778 WARNING [train.py:1204] (2/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_860-96198-0 from training. Duration: 0.73 2023-10-04 04:21:43,106 WARNING [train.py:1204] (2/4) Exclude cut with ID _14268_210_9206_1_1530622771285_4714099_331-105351-0_sp0.9 from training. Duration: 0.72225 2023-10-04 04:21:43,137 WARNING [train.py:1204] (2/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_204-137133-0_sp1.1 from training. Duration: 0.92725 2023-10-04 04:21:43,164 WARNING [train.py:1204] (2/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_5-46496-0_sp1.1 from training. Duration: 0.936375 2023-10-04 04:21:43,180 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_160-218721-0_sp1.1 from training. Duration: 0.87275 2023-10-04 04:21:45,995 WARNING [train.py:1204] (2/4) Exclude cut with ID _45329_210_7005_1_1534157951211_6774339_503-287883-0 from training. Duration: 0.88 2023-10-04 04:21:47,340 WARNING [train.py:1204] (2/4) Exclude cut with ID _57572_210_5517_1_1534921219624_3809484_110-79134-0_sp1.1 from training. Duration: 0.92725 2023-10-04 04:21:47,422 WARNING [train.py:1204] (2/4) Exclude cut with ID _44585_210_18628_1_1533605350387_4518199_421-37641-0_sp1.1 from training. Duration: 0.936375 2023-10-04 04:21:47,624 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.ff2_skip_rate, batch_count=1523546.6666666667, ans=0.0 2023-10-04 04:21:48,780 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_42-142473-0_sp0.9 from training. Duration: 0.8 2023-10-04 04:21:49,016 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.self_attn_weights.pos_emb_skip_rate, batch_count=1523546.6666666667, ans=0.0 2023-10-04 04:21:51,528 WARNING [train.py:1204] (2/4) Exclude cut with ID _5166_210_2084_1_1527073197160_4087993_310-114452-0_sp0.9 from training. Duration: 0.6555625 2023-10-04 04:21:54,296 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9029W0120-509520 from training. Duration: 0.768 2023-10-04 04:21:55,684 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9029W0433-509820 from training. Duration: 0.896 2023-10-04 04:21:55,801 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_40222_210_6228_1_1533385298_4481939_163-334586-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 04:21:55,802 WARNING [train.py:1204] (2/4) Exclude cut with ID _48677_210_5663_1_1533902405919_3892989_408-40740-0_sp1.1 from training. Duration: 0.7545625 2023-10-04 04:21:59,256 WARNING [train.py:1204] (2/4) Exclude cut with ID _24314_210_4915_1_1532135078116_7170179_521-258868-0_sp0.9 from training. Duration: 0.87775 2023-10-04 04:22:00,658 WARNING [train.py:1204] (2/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_576-51198-0_sp1.1 from training. Duration: 0.92725 2023-10-04 04:22:02,006 WARNING [train.py:1204] (2/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_306-216871-0_sp1.1 from training. Duration: 0.97275 2023-10-04 04:22:06,579 WARNING [train.py:1204] (2/4) Exclude cut with ID _20684_210_6917_1_1531567751646_4203930_463-119982-0_sp1.1 from training. Duration: 0.97275 2023-10-04 04:22:06,648 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9084W0012-536850 from training. Duration: 0.64 2023-10-04 04:22:08,636 WARNING [train.py:1204] (2/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_847-41334-0_sp1.1 from training. Duration: 0.4 2023-10-04 04:22:11,447 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.1.feed_forward3.hidden_balancer.prob, batch_count=1523680.0, ans=0.125 2023-10-04 04:22:11,571 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.nonlin_attention.balancer.prob, batch_count=1523680.0, ans=0.125 2023-10-04 04:22:12,649 WARNING [train.py:1204] (2/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_326-165900-0_sp1.1 from training. Duration: 0.62725 2023-10-04 04:22:14,103 WARNING [train.py:1204] (2/4) Exclude cut with ID _7537_210_2084_1_1528502395431_3924359_128-268029-0_sp1.1 from training. Duration: 0.8909375 2023-10-04 04:22:16,828 WARNING [train.py:1204] (2/4) Exclude cut with ID _67499_210_17282_1_1535509805220_3692430_333-155030-0_sp1.1 from training. Duration: 0.97275 2023-10-04 04:22:20,904 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_34008_210_10431_1_1533345424_8183247_588-257177-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 04:22:22,490 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.balancer1.prob, batch_count=1523746.6666666667, ans=0.125 2023-10-04 04:22:23,661 WARNING [train.py:1204] (2/4) Exclude cut with ID _25914_210_6400_1_1532141317058_644740_52-96808-0_sp1.1 from training. Duration: 0.82725 2023-10-04 04:22:25,003 WARNING [train.py:1204] (2/4) Exclude cut with ID _56494_210_4220_1_1535284837625_3033169_69-138150-0_sp1.1 from training. Duration: 0.8545625 2023-10-04 04:22:26,464 WARNING [train.py:1204] (2/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_19-143553-0_sp1.1 from training. Duration: 0.97275 2023-10-04 04:22:27,790 WARNING [train.py:1204] (2/4) Exclude cut with ID _21878_210_3525_1_1531738772002_7264330_582-130735-0_sp1.1 from training. Duration: 0.936375 2023-10-04 04:22:29,694 WARNING [train.py:1204] (2/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_943-201356-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 04:22:29,712 WARNING [train.py:1204] (2/4) Exclude cut with ID _3231_210_2106_1_1526205864934_6784220_422-229276-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 04:22:30,972 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_48049_210_5632_1_1533864553_3519937_73-198717-0_sp1.1 from training. Duration: 0.97275 2023-10-04 04:22:31,036 WARNING [train.py:1204] (2/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_329-285801-0 from training. Duration: 0.91 2023-10-04 04:22:31,051 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9171W0114-579000 from training. Duration: 0.896 2023-10-04 04:22:31,061 WARNING [train.py:1204] (2/4) Exclude cut with ID _3120_210_3525_1_1526036143112_4072980_210-132675-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 04:22:32,329 WARNING [train.py:1204] (2/4) Exclude cut with ID _55857_210_2346_1_1534644007831_6898209_745-98391-0_sp0.9 from training. Duration: 0.8444375 2023-10-04 04:22:33,714 WARNING [train.py:1204] (2/4) Exclude cut with ID _5250_210_5219_1_1527249561806_8692110_615-322035-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 04:22:33,718 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_36756_210_7234_1_1533026991_966276_119-315767-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 04:22:33,750 WARNING [train.py:1204] (2/4) Exclude cut with ID _13916_210_9994_1_1530788341126_3763149_60-191556-0_sp1.1 from training. Duration: 0.5545625 2023-10-04 04:22:34,980 WARNING [train.py:1204] (2/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_177-236809-0_sp1.1 from training. Duration: 0.4454375 2023-10-04 04:22:35,014 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7002_210_1794_1_1527939683_5157286_184-229630-0_sp1.1 from training. Duration: 0.67275 2023-10-04 04:22:35,020 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_50428_210_5724_1_1534247532_4126418_464-18322-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 04:22:35,102 WARNING [train.py:1204] (2/4) Exclude cut with ID _35799_210_5854_1_1533263487887_3529071_466-131402-0_sp1.1 from training. Duration: 0.936375 2023-10-04 04:22:35,239 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.balancer2.prob, batch_count=1523813.3333333333, ans=0.125 2023-10-04 04:22:36,998 INFO [train.py:1046] (2/4) Epoch 44, batch 150, loss[loss=0.1532, simple_loss=0.2434, pruned_loss=0.0315, over 24341.00 frames. ], tot_loss[loss=0.156, simple_loss=0.2374, pruned_loss=0.03727, over 2518611.07 frames. ], batch size: 74, lr: 2.32e-03, grad_scale: 16.0 2023-10-04 04:22:37,103 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_1259-132768-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 04:22:37,152 WARNING [train.py:1204] (2/4) Exclude cut with ID _6694_210_4599_1_1527852637490_3965529_144-185051-0_sp1.1 from training. Duration: 0.87275 2023-10-04 04:22:37,190 WARNING [train.py:1204] (2/4) Exclude cut with ID _64639_210_5816_1_1535367619437_3528000_59-133290-0_sp1.1 from training. Duration: 0.963625 2023-10-04 04:22:39,844 WARNING [train.py:1204] (2/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_223-49525-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 04:22:40,028 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.0.ff3_skip_rate, batch_count=1523813.3333333333, ans=0.0 2023-10-04 04:22:42,012 WARNING [train.py:1204] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_748-36150-0_sp1.1 from training. Duration: 0.82725 2023-10-04 04:22:42,013 WARNING [train.py:1204] (2/4) Exclude cut with ID _19896_210_1883_1_1531709022662_6649569_52-221627-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 04:22:43,336 WARNING [train.py:1204] (2/4) Exclude cut with ID _6789_210_1534_1_1527854471671_2870519_296-115766-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 04:22:44,775 WARNING [train.py:1204] (2/4) Exclude cut with ID _14624_210_1925_1_1530858690657_3642219_100-19871-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 04:22:44,819 WARNING [train.py:1204] (2/4) Exclude cut with ID _12555_210_8750_1_1530075594402_3780239_50-293675-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 04:22:48,807 WARNING [train.py:1204] (2/4) Exclude cut with ID _14880_210_8895_1_1530680426655_3795259_289-212817-0_sp1.1 from training. Duration: 0.62725 2023-10-04 04:22:50,116 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_14056_210_4163_1_1530604483_7572827_388-211626-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 04:22:52,937 WARNING [train.py:1204] (2/4) Exclude cut with ID _5289_210_2084_1_1527678202736_3779299_392-261988-0 from training. Duration: 0.65 2023-10-04 04:22:52,945 WARNING [train.py:1204] (2/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_124-345840-0 from training. Duration: 0.52 2023-10-04 04:22:52,956 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_2655_210_2087_1_1525688959_3897400_437-337690-0 from training. Duration: 0.63 2023-10-04 04:22:54,526 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.balancer2.prob, batch_count=1523880.0, ans=0.125 2023-10-04 04:22:55,835 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_3780_210_2087_1_1526467391_239068_13-274633-0_sp1.1 from training. Duration: 0.92725 2023-10-04 04:22:55,840 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_805-71497-0_sp1.1 from training. Duration: 0.6454375 2023-10-04 04:22:56,579 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.self_attn2.whiten.whitening_limit, batch_count=1523880.0, ans=22.5 2023-10-04 04:22:58,420 WARNING [train.py:1204] (2/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_434-83000-0_sp1.1 from training. Duration: 0.8909375 2023-10-04 04:22:59,768 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_17613_210_2151_1_1531137569_2366160_285-219531-0_sp1.1 from training. Duration: 0.97275 2023-10-04 04:22:59,773 WARNING [train.py:1204] (2/4) Exclude cut with ID _56108_210_16380_1_1534505446159_3887959_22-37858-0_sp1.1 from training. Duration: 0.936375 2023-10-04 04:22:59,810 WARNING [train.py:1204] (2/4) Exclude cut with ID _28427_210_4220_1_1532581240967_3061069_200-88444-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 04:22:59,879 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_43802_210_2290_1_1533516203_8829504_77-279042-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 04:23:03,027 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9309W0193-640320 from training. Duration: 0.896 2023-10-04 04:23:04,424 WARNING [train.py:1204] (2/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_516-324055-0_sp1.1 from training. Duration: 0.936375 2023-10-04 04:23:10,246 WARNING [train.py:1204] (2/4) Exclude cut with ID _40094_210_16380_1_1533294005773_3641159_10-261240-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 04:23:14,119 WARNING [train.py:1204] (2/4) Exclude cut with ID _6267_210_2084_1_1527764658526_3894730_375-219887-0_sp1.1 from training. Duration: 0.6090625 2023-10-04 04:23:14,373 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.conv_module2.balancer1.prob, batch_count=1523946.6666666667, ans=0.125 2023-10-04 04:23:15,483 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6788_210_1388_1_1527898698_4562021_180-75288-0 from training. Duration: 0.99 2023-10-04 04:23:18,643 WARNING [train.py:1204] (2/4) Exclude cut with ID _8030_210_7497_1_1528444886781_3531089_262-86750-0_sp1.1 from training. Duration: 0.736375 2023-10-04 04:23:18,664 WARNING [train.py:1204] (2/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_934-325498-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 04:23:18,691 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_456-22141-0_sp0.9 from training. Duration: 0.97775 2023-10-04 04:23:20,109 WARNING [train.py:1204] (2/4) Exclude cut with ID _49643_210_7989_1_1534640244224_3939031_598-161926-0_sp1.1 from training. Duration: 0.8181875 2023-10-04 04:23:21,474 WARNING [train.py:1204] (2/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_345-49721-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 04:23:21,560 WARNING [train.py:1204] (2/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_440-141811-0_sp1.1 from training. Duration: 0.863625 2023-10-04 04:23:22,845 WARNING [train.py:1204] (2/4) Exclude cut with ID _64048_210_10626_1_1535198382802_3835179_394-212120-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 04:23:24,198 WARNING [train.py:1204] (2/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_91-277684-0 from training. Duration: 0.74 2023-10-04 04:23:28,470 WARNING [train.py:1204] (2/4) Exclude cut with ID _23038_210_9570_1_1532001584158_3652419_191-346343-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 04:23:29,861 WARNING [train.py:1204] (2/4) Exclude cut with ID _15140_210_9288_1_1530791388450_4880149_351-347974-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 04:23:31,223 WARNING [train.py:1204] (2/4) Exclude cut with ID _58605_210_12210_1_1534723195939_3904259_48-269402-0_sp1.1 from training. Duration: 0.87275 2023-10-04 04:23:31,237 WARNING [train.py:1204] (2/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_401-53423-0_sp0.9 from training. Duration: 0.611125 2023-10-04 04:23:32,515 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.582e+02 2.030e+02 2.255e+02 2.508e+02 4.097e+02, threshold=4.511e+02, percent-clipped=0.0 2023-10-04 04:23:32,667 WARNING [train.py:1204] (2/4) Exclude cut with ID _42659_210_2070_1_1533607442765_3120380_158-49004-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 04:23:32,923 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.conv_skip_rate, batch_count=1524013.3333333333, ans=0.0 2023-10-04 04:23:34,184 WARNING [train.py:1204] (2/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_81-75849-0_sp1.1 from training. Duration: 0.3545625 2023-10-04 04:23:38,076 WARNING [train.py:1204] (2/4) Exclude cut with ID _27052_210_5986_1_1532329197187_3585009_314-106499-0_sp1.1 from training. Duration: 0.836375 2023-10-04 04:23:39,507 WARNING [train.py:1204] (2/4) Exclude cut with ID _4626_210_3532_1_1526817652717_3916859_446-206656-0_sp0.9 from training. Duration: 0.8444375 2023-10-04 04:23:41,249 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_53378_210_17282_1_1534221071_5769509_158-114960-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 04:23:44,539 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_3171_210_2087_1_1526109084_2330007_37-64334-0_sp0.9 from training. Duration: 0.911125 2023-10-04 04:23:44,555 WARNING [train.py:1204] (2/4) Exclude cut with ID _5410_210_1388_1_1527397055795_1719020_39-112221-0 from training. Duration: 0.69 2023-10-04 04:23:44,574 WARNING [train.py:1204] (2/4) Exclude cut with ID _8064_210_5399_1_1528372299291_4282973_4-91037-0_sp0.9 from training. Duration: 0.97775 2023-10-04 04:23:44,596 WARNING [train.py:1204] (2/4) Exclude cut with ID ID0079W0443-717795 from training. Duration: 0.541 2023-10-04 04:23:47,272 WARNING [train.py:1204] (2/4) Exclude cut with ID _34734_210_4525_1_1533365977542_4448720_766-297928-0_sp1.1 from training. Duration: 0.936375 2023-10-04 04:23:51,243 INFO [train.py:1046] (2/4) Epoch 44, batch 200, loss[loss=0.1425, simple_loss=0.2276, pruned_loss=0.02874, over 24486.00 frames. ], tot_loss[loss=0.1569, simple_loss=0.2383, pruned_loss=0.03769, over 3012536.28 frames. ], batch size: 66, lr: 2.32e-03, grad_scale: 8.0 2023-10-04 04:23:52,638 WARNING [train.py:1204] (2/4) Exclude cut with ID _51471_210_1889_1_1534399391390_3094250_310-337793-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 04:23:52,652 WARNING [train.py:1204] (2/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_678-159307-0_sp0.9 from training. Duration: 0.9444375 2023-10-04 04:23:56,766 WARNING [train.py:1204] (2/4) Exclude cut with ID _50093_210_4220_1_1534417141207_3432299_259-322252-0 from training. Duration: 0.92 2023-10-04 04:23:56,819 WARNING [train.py:1204] (2/4) Exclude cut with ID _47790_210_16187_1_1533862814066_4207519_67-103444-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 04:23:56,854 WARNING [train.py:1204] (2/4) Exclude cut with ID _40551_210_10635_1_1533776424632_3416179_311-247578-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 04:23:57,134 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.attention_skip_rate, batch_count=1524146.6666666667, ans=0.0 2023-10-04 04:23:58,404 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_4633_210_2040_1_1526824163_4145343_416-76117-0 from training. Duration: 0.95 2023-10-04 04:23:58,495 WARNING [train.py:1204] (2/4) Exclude cut with ID _5166_210_2084_1_1527073197160_4087993_331-117829-0_sp0.9 from training. Duration: 0.688875 2023-10-04 04:24:01,376 WARNING [train.py:1204] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_299-247059-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 04:24:01,458 WARNING [train.py:1204] (2/4) Exclude cut with ID _64002_210_9570_1_1535198605157_38979_2-240845-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 04:24:05,508 WARNING [train.py:1204] (2/4) Exclude cut with ID _4681_210_3525_1_1527250689464_120889_15-126760-0_sp0.9 from training. Duration: 0.9 2023-10-04 04:24:05,527 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_21500_210_2054_1_1532149310_2716758_256-156611-0_sp1.1 from training. Duration: 0.936375 2023-10-04 04:24:05,545 WARNING [train.py:1204] (2/4) Exclude cut with ID _16908_210_5724_1_1531119157248_3772700_624-203729-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 04:24:22,257 WARNING [train.py:1204] (2/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_605-219732-0_sp1.1 from training. Duration: 0.8545625 2023-10-04 04:24:22,293 WARNING [train.py:1204] (2/4) Exclude cut with ID _44718_210_5816_1_1534050051869_6469030_380-167520-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 04:24:22,352 WARNING [train.py:1204] (2/4) Exclude cut with ID _19896_210_1883_1_1531709022662_6649569_94-229927-0_sp1.1 from training. Duration: 0.8818125 2023-10-04 04:24:23,735 WARNING [train.py:1204] (2/4) Exclude cut with ID _19514_210_9259_1_1531530071330_5084230_519-174300-0_sp1.1 from training. Duration: 0.963625 2023-10-04 04:24:23,801 WARNING [train.py:1204] (2/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_340-174176-0_sp0.9 from training. Duration: 0.5444375 2023-10-04 04:24:25,001 WARNING [train.py:1204] (2/4) Exclude cut with ID _8263_210_1664_1_1528439452891_970220_68-181805-0_sp1.1 from training. Duration: 0.5818125 2023-10-04 04:24:26,418 WARNING [train.py:1204] (2/4) Exclude cut with ID _3120_210_3525_1_1526036143112_4072980_348-257723-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 04:24:27,763 WARNING [train.py:1204] (2/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_453-133983-0_sp1.1 from training. Duration: 0.7090625 2023-10-04 04:24:29,098 WARNING [train.py:1204] (2/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_17-86060-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 04:24:29,133 WARNING [train.py:1204] (2/4) Exclude cut with ID _29877_210_6228_1_1532397791106_5276149_228-184657-0_sp1.1 from training. Duration: 0.97275 2023-10-04 04:24:30,579 WARNING [train.py:1204] (2/4) Exclude cut with ID _18516_210_9206_1_1531704653206_3692406_198-334870-0 from training. Duration: 0.92 2023-10-04 04:24:30,613 WARNING [train.py:1204] (2/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_340-174176-0_sp1.1 from training. Duration: 0.4454375 2023-10-04 04:24:30,631 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_3133_210_2087_1_1526106446_2324690_14-139186-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 04:24:36,188 WARNING [train.py:1204] (2/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_126-29104-0_sp0.9 from training. Duration: 0.9444375 2023-10-04 04:24:36,470 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.feed_forward1.hidden_balancer.prob, batch_count=1524346.6666666667, ans=0.125 2023-10-04 04:24:42,603 WARNING [train.py:1204] (2/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_84-10310-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 04:24:47,964 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_15517_210_6753_1_1530954008_3487336_79-273076-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 04:24:48,005 WARNING [train.py:1204] (2/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_371-18169-0_sp0.9 from training. Duration: 0.9555625 2023-10-04 04:24:52,286 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.bypass.scale_min, batch_count=1524413.3333333333, ans=0.2 2023-10-04 04:24:54,768 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_68702_210_23689_1_1535799577_3420068_170-92788-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 04:24:57,604 WARNING [train.py:1204] (2/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_78-182912-0 from training. Duration: 0.59 2023-10-04 04:24:58,959 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_3019_210_2028_1_1526044245_3264865_297-12384-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 04:24:58,965 WARNING [train.py:1204] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_147-111137-0_sp1.1 from training. Duration: 0.863625 2023-10-04 04:24:58,970 WARNING [train.py:1204] (2/4) Exclude cut with ID _28323_210_16187_1_1532480395667_4100300_117-8859-0_sp1.1 from training. Duration: 0.97275 2023-10-04 04:25:00,393 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7762_210_1794_1_1528199508_4057408_419-125009-0_sp1.1 from training. Duration: 0.7545625 2023-10-04 04:25:00,544 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.1.balancer2.prob, batch_count=1524413.3333333333, ans=0.125 2023-10-04 04:25:01,917 WARNING [train.py:1204] (2/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_268-87909-0 from training. Duration: 0.99 2023-10-04 04:25:01,991 WARNING [train.py:1204] (2/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_118-311357-0_sp1.1 from training. Duration: 0.92725 2023-10-04 04:25:02,003 WARNING [train.py:1204] (2/4) Exclude cut with ID ID0377W0003-870225 from training. Duration: 0.852 2023-10-04 04:25:04,607 INFO [train.py:1046] (2/4) Epoch 44, batch 250, loss[loss=0.1547, simple_loss=0.2482, pruned_loss=0.03058, over 24366.00 frames. ], tot_loss[loss=0.1565, simple_loss=0.2376, pruned_loss=0.03773, over 3385133.03 frames. ], batch size: 77, lr: 2.32e-03, grad_scale: 8.0 2023-10-04 04:25:04,673 WARNING [train.py:1204] (2/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_56-5838-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 04:25:06,069 WARNING [train.py:1204] (2/4) Exclude cut with ID _14603_210_4220_1_1530615688441_3097319_90-31383-0_sp0.9 from training. Duration: 0.9666875 2023-10-04 04:25:07,441 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_31583_210_3852_1_1532595851_4024413_58-86657-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 04:25:07,479 WARNING [train.py:1204] (2/4) Exclude cut with ID _30304_210_6319_1_1532476827261_7582060_344-113875-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 04:25:07,750 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=1524480.0, ans=0.1 2023-10-04 04:25:08,085 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.3.encoder.layers.2.feed_forward3.out_whiten, num_groups=1, num_channels=512, metric=7.81 vs. limit=15.0 2023-10-04 04:25:09,401 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.1.feed_forward2.hidden_balancer.prob, batch_count=1524480.0, ans=0.125 2023-10-04 04:25:10,575 WARNING [train.py:1204] (2/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_278-104676-0_sp1.1 from training. Duration: 0.8909375 2023-10-04 04:25:10,598 WARNING [train.py:1204] (2/4) Exclude cut with ID _55844_210_2346_1_1534471212569_7625707_764-2387-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 04:25:12,657 WARNING [train.py:1204] (2/4) Exclude cut with ID _27430_210_7770_1_1532604689375_1378590_157-14963-0_sp1.1 from training. Duration: 0.963625 2023-10-04 04:25:16,002 WARNING [train.py:1204] (2/4) Exclude cut with ID _7160_210_1876_1_1527940923231_3703799_282-322611-0_sp1.1 from training. Duration: 0.7909375 2023-10-04 04:25:22,791 WARNING [train.py:1204] (2/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_190-138640-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 04:25:26,750 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_33348_210_5632_1_1533086912_3324030_63-270922-0_sp1.1 from training. Duration: 0.97275 2023-10-04 04:25:26,775 WARNING [train.py:1204] (2/4) Exclude cut with ID _54871_210_22356_1_1534665798253_3856359_135-184281-0_sp1.1 from training. Duration: 0.9 2023-10-04 04:25:32,523 WARNING [train.py:1204] (2/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_277-210798-0_sp1.1 from training. Duration: 0.5 2023-10-04 04:25:32,580 WARNING [train.py:1204] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_402-253027-0_sp0.9 from training. Duration: 0.6 2023-10-04 04:25:32,649 WARNING [train.py:1204] (2/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_540-275685-0_sp0.9 from training. Duration: 0.988875 2023-10-04 04:25:34,005 WARNING [train.py:1204] (2/4) Exclude cut with ID _59493_210_17282_1_1535078419077_3225879_118-18960-0_sp1.1 from training. Duration: 0.936375 2023-10-04 04:25:34,091 WARNING [train.py:1204] (2/4) Exclude cut with ID _6194_210_1534_1_1527511826160_3941194_321-148641-0_sp1.1 from training. Duration: 0.5818125 2023-10-04 04:25:34,098 WARNING [train.py:1204] (2/4) Exclude cut with ID _61133_210_9969_1_1535196777227_3696409_333-270325-0_sp1.1 from training. Duration: 0.8454375 2023-10-04 04:25:35,407 WARNING [train.py:1204] (2/4) Exclude cut with ID _49376_210_5986_1_1533985278732_3370740_307-145221-0_sp1.1 from training. Duration: 0.936375 2023-10-04 04:25:37,302 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6846_210_6753_1_1527850767_4295158_2-315073-0_sp0.9 from training. Duration: 0.97775 2023-10-04 04:25:38,810 WARNING [train.py:1204] (2/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_17-25045-0 from training. Duration: 0.59 2023-10-04 04:25:38,822 WARNING [train.py:1204] (2/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_503-333510-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 04:25:40,782 WARNING [train.py:1204] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_238-245655-0_sp1.1 from training. Duration: 0.736375 2023-10-04 04:25:42,113 WARNING [train.py:1204] (2/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_329-82235-0_sp0.9 from training. Duration: 0.911125 2023-10-04 04:25:42,118 WARNING [train.py:1204] (2/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_325-113798-0_sp0.9 from training. Duration: 0.9444375 2023-10-04 04:25:43,474 WARNING [train.py:1204] (2/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_395-37222-0_sp0.9 from training. Duration: 0.8444375 2023-10-04 04:25:44,831 WARNING [train.py:1204] (2/4) Exclude cut with ID _64135_210_2253_1_1535677267876_7608972_498-5039-0_sp1.1 from training. Duration: 0.7090625 2023-10-04 04:25:44,846 WARNING [train.py:1204] (2/4) Exclude cut with ID _14922_210_6228_1_1530707435273_3693310_303-228738-0_sp0.9 from training. Duration: 0.9333125 2023-10-04 04:25:48,898 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_577-220899-0_sp1.1 from training. Duration: 0.92725 2023-10-04 04:25:50,273 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_3475_210_2028_1_1526193578_3622569_377-166133-0_sp1.1 from training. Duration: 0.7818125 2023-10-04 04:25:50,305 WARNING [train.py:1204] (2/4) Exclude cut with ID _49376_210_5986_1_1533985278732_3370740_272-313512-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 04:25:54,560 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7762_210_1794_1_1528199508_4057408_422-227034-0_sp0.9 from training. Duration: 0.811125 2023-10-04 04:25:58,678 WARNING [train.py:1204] (2/4) Exclude cut with ID _18895_210_6228_1_1531357274234_3784930_74-341428-0_sp1.1 from training. Duration: 0.92725 2023-10-04 04:25:59,830 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.687e+02 2.036e+02 2.263e+02 2.637e+02 3.971e+02, threshold=4.526e+02, percent-clipped=0.0 2023-10-04 04:26:02,677 WARNING [train.py:1204] (2/4) Exclude cut with ID _44902_210_16187_1_1533888006062_3792078_149-67212-0_sp1.1 from training. Duration: 0.7909375 2023-10-04 04:26:04,492 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.conv_module2.balancer1.prob, batch_count=1524746.6666666667, ans=0.125 2023-10-04 04:26:05,659 WARNING [train.py:1204] (2/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_243-126020-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 04:26:07,071 WARNING [train.py:1204] (2/4) Exclude cut with ID _54218_210_12210_1_1534413646388_4409259_259-6561-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 04:26:09,083 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_41452_210_7291_1_1533437687_3941281_405-11559-0 from training. Duration: 0.91 2023-10-04 04:26:10,436 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_20120_210_6753_1_1531643035_5482859_759-225084-0_sp1.1 from training. Duration: 0.97275 2023-10-04 04:26:10,449 WARNING [train.py:1204] (2/4) Exclude cut with ID _14747_210_8341_1_1530685797463_3654089_147-131229-0_sp0.9 from training. Duration: 0.8444375 2023-10-04 04:26:12,497 WARNING [train.py:1204] (2/4) Exclude cut with ID _14867_210_1968_1_1530684179890_3846969_319-250868-0 from training. Duration: 0.99 2023-10-04 04:26:12,520 WARNING [train.py:1204] (2/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_580-93798-0_sp0.9 from training. Duration: 0.62225 2023-10-04 04:26:13,896 WARNING [train.py:1204] (2/4) Exclude cut with ID _9679_210_7497_1_1529129212906_4363929_512-313889-0_sp0.9 from training. Duration: 0.9 2023-10-04 04:26:13,907 WARNING [train.py:1204] (2/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_18-189198-0 from training. Duration: 0.9 2023-10-04 04:26:17,852 INFO [train.py:1046] (2/4) Epoch 44, batch 300, loss[loss=0.142, simple_loss=0.2189, pruned_loss=0.03258, over 23462.00 frames. ], tot_loss[loss=0.1552, simple_loss=0.2358, pruned_loss=0.0373, over 3691754.99 frames. ], batch size: 134, lr: 2.32e-03, grad_scale: 8.0 2023-10-04 04:26:17,938 WARNING [train.py:1204] (2/4) Exclude cut with ID _49755_210_18053_1_1534581005846_3218709_137-64179-0_sp1.1 from training. Duration: 0.92725 2023-10-04 04:26:19,290 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_48049_210_5632_1_1533864553_3519937_306-283794-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 04:26:23,376 WARNING [train.py:1204] (2/4) Exclude cut with ID _43552_210_8913_1_1533726031059_3523429_25-120161-0_sp1.1 from training. Duration: 0.8909375 2023-10-04 04:26:23,451 WARNING [train.py:1204] (2/4) Exclude cut with ID _8833_210_5219_1_1528592665210_7553059_319-349181-0 from training. Duration: 0.99 2023-10-04 04:26:24,828 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_296-332325-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 04:26:26,194 WARNING [train.py:1204] (2/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_436-186311-0_sp1.1 from training. Duration: 0.5909375 2023-10-04 04:26:26,205 WARNING [train.py:1204] (2/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_348-127789-0 from training. Duration: 0.85 2023-10-04 04:26:27,428 WARNING [train.py:1204] (2/4) Exclude cut with ID _3992_210_1747_1_1526644816031_3731060_443-195183-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 04:26:31,506 WARNING [train.py:1204] (2/4) Exclude cut with ID _3465_210_3525_1_1526175667060_3574049_308-231735-0_sp1.1 from training. Duration: 0.663625 2023-10-04 04:26:35,299 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.1.encoder.layers.0.self_attn2.whiten, num_groups=1, num_channels=256, metric=13.72 vs. limit=22.5 2023-10-04 04:26:35,758 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6786_210_1388_1_1527839699_4194495_378-46070-0_sp1.1 from training. Duration: 0.6545625 2023-10-04 04:26:35,804 WARNING [train.py:1204] (2/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_63-101996-0 from training. Duration: 0.88 2023-10-04 04:26:42,278 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_2541_210_1794_1_1525590269_3084765_156-173713-0 from training. Duration: 0.81 2023-10-04 04:26:42,315 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_217-239796-0_sp1.1 from training. Duration: 0.963625 2023-10-04 04:26:45,414 WARNING [train.py:1204] (2/4) Exclude cut with ID _21878_210_3525_1_1531738772002_7264330_530-329400-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 04:26:46,173 WARNING [train.py:1204] (2/4) Exclude cut with ID _21783_210_11357_1_1531742458730_3762679_150-62920-0_sp1.1 from training. Duration: 0.963625 2023-10-04 04:26:46,174 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_5522_210_3273_1_1527249606_3909096_366-172508-0 from training. Duration: 0.66 2023-10-04 04:26:46,176 WARNING [train.py:1204] (2/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_303-340446-0_sp1.1 from training. Duration: 0.8181875 2023-10-04 04:26:47,589 WARNING [train.py:1204] (2/4) Exclude cut with ID _8699_210_2069_1_1528630346831_3960409_160-24408-0_sp1.1 from training. Duration: 0.82725 2023-10-04 04:26:49,195 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=1524946.6666666667, ans=0.1 2023-10-04 04:26:50,347 WARNING [train.py:1204] (2/4) Exclude cut with ID _41314_210_12651_1_1533459476543_3766460_299-187801-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 04:26:50,374 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_34886_210_10633_1_1533203882_1755661_92-345288-0_sp1.1 from training. Duration: 0.97275 2023-10-04 04:26:55,796 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_5438_210_1794_1_1527336687_2600009_104-288274-0_sp1.1 from training. Duration: 0.52725 2023-10-04 04:26:55,801 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_154-316450-0 from training. Duration: 0.76 2023-10-04 04:26:57,309 WARNING [train.py:1204] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_23-189234-0_sp0.9 from training. Duration: 0.92225 2023-10-04 04:26:58,820 WARNING [train.py:1204] (2/4) Exclude cut with ID _4779_210_2084_1_1527318022944_3914893_346-168820-0_sp1.1 from training. Duration: 0.963625 2023-10-04 04:27:00,163 WARNING [train.py:1204] (2/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_5-55893-0 from training. Duration: 0.55 2023-10-04 04:27:00,255 WARNING [train.py:1204] (2/4) Exclude cut with ID _20455_210_5219_1_1531565996775_8098100_180-24517-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 04:27:05,787 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_3133_210_2087_1_1526106446_2324690_243-143119-0_sp0.9 from training. Duration: 0.8555625 2023-10-04 04:27:08,600 WARNING [train.py:1204] (2/4) Exclude cut with ID _65585_210_12210_1_1535450451584_3764098_293-212045-0_sp1.1 from training. Duration: 0.92725 2023-10-04 04:27:08,604 WARNING [train.py:1204] (2/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_577-332600-0 from training. Duration: 0.57 2023-10-04 04:27:12,034 WARNING [train.py:1204] (2/4) Exclude cut with ID _30039_210_13270_1_1532484612900_2725150_104-289874-0_sp1.1 from training. Duration: 0.963625 2023-10-04 04:27:12,041 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_3172_210_2087_1_1526292929_3987865_197-97762-0_sp1.1 from training. Duration: 0.6818125 2023-10-04 04:27:16,637 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_256-150908-0_sp1.1 from training. Duration: 0.963625 2023-10-04 04:27:18,590 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_296-287678-0_sp1.1 from training. Duration: 0.863625 2023-10-04 04:27:18,623 WARNING [train.py:1204] (2/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_443-30034-0 from training. Duration: 0.64 2023-10-04 04:27:19,776 WARNING [train.py:1204] (2/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_494-199538-0_sp0.9 from training. Duration: 0.8333125 2023-10-04 04:27:19,846 WARNING [train.py:1204] (2/4) Exclude cut with ID _19708_210_9206_1_1532136650733_4238364_61-162325-0_sp1.1 from training. Duration: 0.97275 2023-10-04 04:27:21,217 WARNING [train.py:1204] (2/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_475-127455-0 from training. Duration: 0.7 2023-10-04 04:27:23,953 WARNING [train.py:1204] (2/4) Exclude cut with ID _48677_210_5663_1_1533902405919_3892989_188-11581-0_sp1.1 from training. Duration: 0.963625 2023-10-04 04:27:23,983 WARNING [train.py:1204] (2/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_136-8381-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 04:27:25,374 WARNING [train.py:1204] (2/4) Exclude cut with ID _55328_210_27767_1_1534580973260_4088170_270-229555-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 04:27:25,413 WARNING [train.py:1204] (2/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_372-266951-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 04:27:26,826 WARNING [train.py:1204] (2/4) Exclude cut with ID _47790_210_16187_1_1533862814066_4207519_14-242149-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 04:27:29,781 WARNING [train.py:1204] (2/4) Exclude cut with ID _7877_210_6014_1_1528286358099_3750136_159-76512-0_sp1.1 from training. Duration: 0.936375 2023-10-04 04:27:29,784 WARNING [train.py:1204] (2/4) Exclude cut with ID _8263_210_1664_1_1528438953494_294100_40-204731-0_sp1.1 from training. Duration: 0.4545625 2023-10-04 04:27:31,116 INFO [train.py:1046] (2/4) Epoch 44, batch 350, loss[loss=0.1612, simple_loss=0.2504, pruned_loss=0.03599, over 24394.00 frames. ], tot_loss[loss=0.1544, simple_loss=0.2354, pruned_loss=0.03675, over 3923264.40 frames. ], batch size: 77, lr: 2.32e-03, grad_scale: 8.0 2023-10-04 04:27:33,717 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8278_210_2550_1_1528637099_4220413_694-238413-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 04:27:37,903 WARNING [train.py:1204] (2/4) Exclude cut with ID _47278_210_12041_1_1533902418349_3576110_421-172710-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 04:27:39,343 WARNING [train.py:1204] (2/4) Exclude cut with ID _14970_210_15709_1_1530773543493_1715730_215-282680-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 04:27:39,391 WARNING [train.py:1204] (2/4) Exclude cut with ID _64639_210_5816_1_1535367619437_3528000_306-149449-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 04:27:42,564 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_28-291982-0 from training. Duration: 0.97 2023-10-04 04:27:45,299 WARNING [train.py:1204] (2/4) Exclude cut with ID _55134_210_21982_1_1534590028377_3882170_412-15870-0_sp1.1 from training. Duration: 0.936375 2023-10-04 04:27:45,341 WARNING [train.py:1204] (2/4) Exclude cut with ID _13684_210_7497_1_1530619354863_3598359_312-235606-0 from training. Duration: 0.6 2023-10-04 04:27:49,224 WARNING [train.py:1204] (2/4) Exclude cut with ID _37112_210_2575_1_1532997007673_7268610_347-238993-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 04:27:49,269 WARNING [train.py:1204] (2/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_121-284152-0 from training. Duration: 0.59 2023-10-04 04:27:50,614 WARNING [train.py:1204] (2/4) Exclude cut with ID _2941_210_2577_1_1526199673150_2489759_228-2961-0_sp1.1 from training. Duration: 0.97275 2023-10-04 04:27:53,511 WARNING [train.py:1204] (2/4) Exclude cut with ID _28323_210_16187_1_1532480395667_4100300_531-24157-0 from training. Duration: 0.98 2023-10-04 04:27:54,894 WARNING [train.py:1204] (2/4) Exclude cut with ID _32704_210_8954_1_1533017488840_6747370_266-329755-0_sp0.9 from training. Duration: 0.988875 2023-10-04 04:27:55,096 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.feed_forward3.hidden_balancer.prob, batch_count=1525213.3333333333, ans=0.125 2023-10-04 04:27:57,696 WARNING [train.py:1204] (2/4) Exclude cut with ID _6653_210_2629_1_1527919172441_6732679_1038-146421-0_sp1.1 from training. Duration: 0.97275 2023-10-04 04:27:57,767 WARNING [train.py:1204] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_391-22148-0_sp0.9 from training. Duration: 0.8555625 2023-10-04 04:27:58,051 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.conv_skip_rate, batch_count=1525213.3333333333, ans=0.0 2023-10-04 04:27:59,141 WARNING [train.py:1204] (2/4) Exclude cut with ID _64521_210_14699_1_1535243427259_3815328_543-341117-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 04:27:59,152 WARNING [train.py:1204] (2/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_52-168712-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 04:28:00,527 WARNING [train.py:1204] (2/4) Exclude cut with ID _33920_210_4220_1_1533257851500_3084399_198-334063-0_sp1.1 from training. Duration: 0.8545625 2023-10-04 04:28:00,540 WARNING [train.py:1204] (2/4) Exclude cut with ID _50093_210_4220_1_1534417141207_3432299_69-111987-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 04:28:00,587 WARNING [train.py:1204] (2/4) Exclude cut with ID _14821_210_8750_1_1530680503991_6829359_189-297451-0_sp0.9 from training. Duration: 0.611125 2023-10-04 04:28:03,390 WARNING [train.py:1204] (2/4) Exclude cut with ID _8030_210_7497_1_1528444886781_3531089_262-86750-0_sp0.9 from training. Duration: 0.9 2023-10-04 04:28:03,395 WARNING [train.py:1204] (2/4) Exclude cut with ID _24612_210_8913_1_1531998054193_3806359_294-191196-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 04:28:07,734 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.1.conv_module2.balancer1.prob, batch_count=1525280.0, ans=0.125 2023-10-04 04:28:10,264 WARNING [train.py:1204] (2/4) Exclude cut with ID _16909_210_5724_1_1531292079912_4118919_707-342896-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 04:28:10,288 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_9460_210_4211_1_1528954587_10049667_572-37035-0_sp0.9 from training. Duration: 0.888875 2023-10-04 04:28:11,891 WARNING [train.py:1204] (2/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_762-109886-0_sp1.1 from training. Duration: 0.82725 2023-10-04 04:28:11,923 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_14953_210_2151_1_1530701710_5616165_519-326393-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 04:28:16,671 WARNING [train.py:1204] (2/4) Exclude cut with ID _25914_210_6400_1_1532141317058_644740_52-96808-0 from training. Duration: 0.91 2023-10-04 04:28:16,676 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_68095_210_20185_1_1535718226_8888855_1079-94525-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 04:28:21,288 WARNING [train.py:1204] (2/4) Exclude cut with ID _14860_210_4915_1_1530702020340_7343240_746-3647-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 04:28:21,289 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_70379_210_7925_1_1535941455_557455_49-101720-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 04:28:21,301 WARNING [train.py:1204] (2/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_8-307291-0_sp1.1 from training. Duration: 0.936375 2023-10-04 04:28:23,268 WARNING [train.py:1204] (2/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_117-290916-0 from training. Duration: 0.51 2023-10-04 04:28:25,733 WARNING [train.py:1204] (2/4) Exclude cut with ID _22020_210_2461_1_1531724164513_6326900_944-339840-0_sp1.1 from training. Duration: 0.963625 2023-10-04 04:28:27,109 WARNING [train.py:1204] (2/4) Exclude cut with ID IC0515W0043-254911 from training. Duration: 0.976 2023-10-04 04:28:28,274 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.545e+02 1.956e+02 2.278e+02 2.648e+02 3.683e+02, threshold=4.557e+02, percent-clipped=0.0 2023-10-04 04:28:28,430 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_3910_210_1794_1_1526742306_334530_28-238909-0 from training. Duration: 0.81 2023-10-04 04:28:28,444 WARNING [train.py:1204] (2/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_269-267275-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 04:28:32,475 WARNING [train.py:1204] (2/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_72-251255-0_sp1.1 from training. Duration: 0.8545625 2023-10-04 04:28:32,494 WARNING [train.py:1204] (2/4) Exclude cut with ID _14886_210_4220_1_1530702020355_3516190_175-229073-0 from training. Duration: 0.45 2023-10-04 04:28:35,292 WARNING [train.py:1204] (2/4) Exclude cut with ID _40519_210_13794_1_1533715228655_7337390_921-209452-0_sp1.1 from training. Duration: 0.963625 2023-10-04 04:28:36,798 WARNING [train.py:1204] (2/4) Exclude cut with ID _29857_210_20034_1_1532395886753_3798160_335-163562-0_sp0.9 from training. Duration: 0.9444375 2023-10-04 04:28:38,193 WARNING [train.py:1204] (2/4) Exclude cut with ID _58583_210_5236_1_1534726835266_3574249_384-58445-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 04:28:39,611 WARNING [train.py:1204] (2/4) Exclude cut with ID _44011_210_2046_1_1533722401691_3631310_96-315656-0_sp1.1 from training. Duration: 0.963625 2023-10-04 04:28:39,616 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_68484_210_1794_1_1535849804_7678097_549-170613-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 04:28:42,456 WARNING [train.py:1204] (2/4) Exclude cut with ID _37892_210_19596_1_1533198677897_3964058_214-148058-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 04:28:43,766 INFO [train.py:1046] (2/4) Epoch 44, batch 400, loss[loss=0.1481, simple_loss=0.2216, pruned_loss=0.03728, over 24502.00 frames. ], tot_loss[loss=0.154, simple_loss=0.2347, pruned_loss=0.03671, over 4080719.59 frames. ], batch size: 58, lr: 2.32e-03, grad_scale: 8.0 2023-10-04 04:28:45,333 WARNING [train.py:1204] (2/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_415-78792-0_sp0.9 from training. Duration: 0.9 2023-10-04 04:28:48,582 WARNING [train.py:1204] (2/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_383-205833-0_sp1.1 from training. Duration: 0.863625 2023-10-04 04:28:50,512 WARNING [train.py:1204] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_167-137255-0 from training. Duration: 0.64 2023-10-04 04:28:50,520 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8275_210_4198_1_1528455320_3892571_484-154786-0_sp1.1 from training. Duration: 0.963625 2023-10-04 04:28:50,572 WARNING [train.py:1204] (2/4) Exclude cut with ID _40284_210_2084_1_1533513679814_3704279_396-319412-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 04:28:52,397 WARNING [train.py:1204] (2/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_280-120942-0_sp0.9 from training. Duration: 0.8555625 2023-10-04 04:28:52,451 WARNING [train.py:1204] (2/4) Exclude cut with ID _45630_210_10556_1_1533949174498_3902049_558-240889-0_sp1.1 from training. Duration: 0.92725 2023-10-04 04:28:55,143 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.0.layers.1.feed_forward1.out_whiten, num_groups=1, num_channels=192, metric=4.67 vs. limit=15.0 2023-10-04 04:28:55,569 WARNING [train.py:1204] (2/4) Exclude cut with ID _56235_210_10640_1_1534557335235_4161099_197-307458-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 04:28:56,137 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.4.encoder.layers.1.conv_module1.whiten, num_groups=1, num_channels=384, metric=3.04 vs. limit=15.0 2023-10-04 04:28:56,936 WARNING [train.py:1204] (2/4) Exclude cut with ID _16909_210_5724_1_1531292079912_4118919_163-254843-0_sp1.1 from training. Duration: 0.92725 2023-10-04 04:28:58,302 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.feed_forward3.hidden_balancer.prob, batch_count=1525546.6666666667, ans=0.125 2023-10-04 04:28:59,447 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_103-84010-0 from training. Duration: 0.69 2023-10-04 04:29:00,875 WARNING [train.py:1204] (2/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_401-158239-0 from training. Duration: 0.71 2023-10-04 04:29:00,875 WARNING [train.py:1204] (2/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_899-233658-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 04:29:01,100 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.conv_module1.balancer2.prob, batch_count=1525546.6666666667, ans=0.125 2023-10-04 04:29:03,558 WARNING [train.py:1204] (2/4) Exclude cut with ID _23825_210_13449_1_1531915313526_3497150_240-271367-0 from training. Duration: 0.97 2023-10-04 04:29:03,611 WARNING [train.py:1204] (2/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_424-134619-0_sp1.1 from training. Duration: 0.92725 2023-10-04 04:29:07,860 WARNING [train.py:1204] (2/4) Exclude cut with ID _44831_210_13427_1_1533816011549_3689035_32-188147-0_sp1.1 from training. Duration: 0.87275 2023-10-04 04:29:07,863 WARNING [train.py:1204] (2/4) Exclude cut with ID _59170_210_6721_1_1534983633960_4203335_314-63879-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 04:29:07,878 WARNING [train.py:1204] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_602-275260-0 from training. Duration: 0.82 2023-10-04 04:29:07,909 WARNING [train.py:1204] (2/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_511-306767-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 04:29:09,241 WARNING [train.py:1204] (2/4) Exclude cut with ID _41817_210_18835_1_1533434477160_7423049_565-201775-0_sp1.1 from training. Duration: 0.92725 2023-10-04 04:29:09,251 WARNING [train.py:1204] (2/4) Exclude cut with ID _28317_210_12742_1_1532480302381_8510325_778-173151-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 04:29:10,613 WARNING [train.py:1204] (2/4) Exclude cut with ID _14998_210_5219_1_1530705582080_7474970_680-51001-0_sp1.1 from training. Duration: 0.963625 2023-10-04 04:29:11,951 WARNING [train.py:1204] (2/4) Exclude cut with ID IC0677W0059-334891 from training. Duration: 0.896 2023-10-04 04:29:13,359 WARNING [train.py:1204] (2/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_370-331600-0 from training. Duration: 0.94 2023-10-04 04:29:17,564 WARNING [train.py:1204] (2/4) Exclude cut with ID _66840_210_5219_1_1535452168426_4196349_446-240192-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 04:29:19,527 WARNING [train.py:1204] (2/4) Exclude cut with ID _65242_210_18628_1_1535369393717_3823149_38-6732-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 04:29:19,596 WARNING [train.py:1204] (2/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_302-2550-0 from training. Duration: 0.81 2023-10-04 04:29:22,783 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_43802_210_2290_1_1533516203_8829504_100-348441-0 from training. Duration: 0.91 2023-10-04 04:29:24,886 WARNING [train.py:1204] (2/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_329-285801-0_sp1.1 from training. Duration: 0.82725 2023-10-04 04:29:27,575 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_30167_210_3528_1_1532428012_4772375_343-334712-0_sp1.1 from training. Duration: 0.936375 2023-10-04 04:29:32,551 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7543_210_6753_1_1528543318_4552_1-85072-0 from training. Duration: 0.83 2023-10-04 04:29:34,092 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.balancer2.prob, batch_count=1525680.0, ans=0.125 2023-10-04 04:29:35,282 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7002_210_1794_1_1527935634_4034444_487-236501-0_sp1.1 from training. Duration: 0.6 2023-10-04 04:29:36,642 WARNING [train.py:1204] (2/4) Exclude cut with ID _7160_210_1876_1_1527940923231_3703799_282-322611-0 from training. Duration: 0.87 2023-10-04 04:29:38,149 WARNING [train.py:1204] (2/4) Exclude cut with ID _67335_210_5219_1_1535525975652_4275229_570-262983-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 04:29:40,704 WARNING [train.py:1204] (2/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_625-338546-0_sp1.1 from training. Duration: 0.8 2023-10-04 04:29:40,736 WARNING [train.py:1204] (2/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_176-56120-0 from training. Duration: 0.83 2023-10-04 04:29:43,484 WARNING [train.py:1204] (2/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_963-187109-0_sp1.1 from training. Duration: 0.9 2023-10-04 04:29:44,933 WARNING [train.py:1204] (2/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_121-284152-0_sp0.9 from training. Duration: 0.6555625 2023-10-04 04:29:47,636 WARNING [train.py:1204] (2/4) Exclude cut with ID _45021_210_9994_1_1533619751094_3330288_104-81711-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 04:29:50,963 WARNING [train.py:1204] (2/4) Exclude cut with ID _51382_210_23862_1_1534492801838_3650104_129-343914-0_sp1.1 from training. Duration: 0.936375 2023-10-04 04:29:50,992 WARNING [train.py:1204] (2/4) Exclude cut with ID _14970_210_15709_1_1530775547361_1966965_8-169924-0 from training. Duration: 0.92 2023-10-04 04:29:53,135 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_2295_210_2087_1_1525500993_2407863_234-83104-0_sp1.1 from training. Duration: 0.563625 2023-10-04 04:29:54,454 WARNING [train.py:1204] (2/4) Exclude cut with ID _14687_210_5854_1_1530703838259_3664889_115-239046-0 from training. Duration: 0.67 2023-10-04 04:29:55,905 WARNING [train.py:1204] (2/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_690-126306-0_sp1.1 from training. Duration: 0.7090625 2023-10-04 04:29:55,919 WARNING [train.py:1204] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_625-102955-0_sp0.9 from training. Duration: 0.8555625 2023-10-04 04:29:59,134 INFO [train.py:1046] (2/4) Epoch 44, batch 450, loss[loss=0.1686, simple_loss=0.249, pruned_loss=0.04411, over 23622.00 frames. ], tot_loss[loss=0.1543, simple_loss=0.235, pruned_loss=0.0368, over 4226705.12 frames. ], batch size: 85, lr: 2.32e-03, grad_scale: 8.0 2023-10-04 04:29:59,204 WARNING [train.py:1204] (2/4) Exclude cut with ID _9389_210_1664_1_1529217337301_5289269_289-213417-0 from training. Duration: 0.89 2023-10-04 04:30:00,597 WARNING [train.py:1204] (2/4) Exclude cut with ID _13253_210_1925_1_1530757775819_3577873_191-144977-0_sp0.9 from training. Duration: 0.8666875 2023-10-04 04:30:02,065 WARNING [train.py:1204] (2/4) Exclude cut with ID _21783_210_11357_1_1531742458730_3762679_325-155703-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 04:30:02,094 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_2295_210_2087_1_1525500993_2407863_116-238894-0_sp0.9 from training. Duration: 0.77775 2023-10-04 04:30:03,531 WARNING [train.py:1204] (2/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_94-126128-0 from training. Duration: 0.86 2023-10-04 04:30:03,548 WARNING [train.py:1204] (2/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_395-310220-0_sp0.9 from training. Duration: 0.988875 2023-10-04 04:30:04,936 WARNING [train.py:1204] (2/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_821-110852-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 04:30:05,000 WARNING [train.py:1204] (2/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_64-248359-0_sp1.1 from training. Duration: 0.62725 2023-10-04 04:30:06,342 WARNING [train.py:1204] (2/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_138-187031-0 from training. Duration: 0.99 2023-10-04 04:30:06,385 WARNING [train.py:1204] (2/4) Exclude cut with ID _4122_210_2084_1_1526713164417_4353280_528-206771-0_sp0.9 from training. Duration: 0.97775 2023-10-04 04:30:07,637 WARNING [train.py:1204] (2/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_844-96989-0_sp1.1 from training. Duration: 0.6454375 2023-10-04 04:30:09,114 WARNING [train.py:1204] (2/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_106-30782-0_sp1.1 from training. Duration: 0.72725 2023-10-04 04:30:15,119 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=1525880.0, ans=0.1 2023-10-04 04:30:16,167 WARNING [train.py:1204] (2/4) Exclude cut with ID _14683_210_5219_1_1530698352608_3974859_281-250040-0_sp1.1 from training. Duration: 0.936375 2023-10-04 04:30:17,520 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_46013_210_7234_1_1533776362_3744032_463-52378-0_sp0.9 from training. Duration: 0.9555625 2023-10-04 04:30:17,667 WARNING [train.py:1204] (2/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_299-34413-0 from training. Duration: 0.65 2023-10-04 04:30:17,865 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.conv_module2.balancer1.prob, batch_count=1525880.0, ans=0.125 2023-10-04 04:30:18,960 WARNING [train.py:1204] (2/4) Exclude cut with ID _35799_210_5854_1_1533263487887_3529071_490-326847-0 from training. Duration: 0.99 2023-10-04 04:30:22,821 WARNING [train.py:1204] (2/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_589-9070-0_sp0.9 from training. Duration: 0.788875 2023-10-04 04:30:25,550 WARNING [train.py:1204] (2/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_884-29515-0_sp1.1 from training. Duration: 0.936375 2023-10-04 04:30:27,334 WARNING [train.py:1204] (2/4) Exclude cut with ID _15012_210_1968_1_1530856820429_5095350_474-119995-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 04:30:30,155 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.self_attn_weights.pos_emb_skip_rate, batch_count=1525946.6666666667, ans=0.0 2023-10-04 04:30:33,880 WARNING [train.py:1204] (2/4) Exclude cut with ID _65585_210_12210_1_1535450451584_3764098_211-97124-0_sp1.1 from training. Duration: 0.963625 2023-10-04 04:30:33,942 WARNING [train.py:1204] (2/4) Exclude cut with ID _33640_210_2655_1_1533117605479_4855469_666-224463-0_sp1.1 from training. Duration: 0.963625 2023-10-04 04:30:35,712 WARNING [train.py:1204] (2/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_35-153886-0 from training. Duration: 0.84 2023-10-04 04:30:37,005 WARNING [train.py:1204] (2/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_210-106047-0 from training. Duration: 0.97 2023-10-04 04:30:38,495 WARNING [train.py:1204] (2/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_479-125611-0 from training. Duration: 0.96 2023-10-04 04:30:39,723 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_9417_210_1794_1_1529235261_4614131_427-153963-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 04:30:39,804 WARNING [train.py:1204] (2/4) Exclude cut with ID _23818_210_6228_1_1531917063884_3676920_133-170571-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 04:30:41,283 WARNING [train.py:1204] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_602-275260-0_sp1.1 from training. Duration: 0.7454375 2023-10-04 04:30:42,705 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9029W0121-509521 from training. Duration: 0.896 2023-10-04 04:30:42,713 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9029W0434-509821 from training. Duration: 0.64 2023-10-04 04:30:42,728 WARNING [train.py:1204] (2/4) Exclude cut with ID _23038_210_9570_1_1532001584158_3652419_289-307034-0_sp1.1 from training. Duration: 0.936375 2023-10-04 04:30:45,390 WARNING [train.py:1204] (2/4) Exclude cut with ID _29345_210_4381_1_1532689168263_3860169_149-257268-0_sp0.9 from training. Duration: 0.9 2023-10-04 04:30:45,470 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_405-339986-0_sp1.1 from training. Duration: 0.463625 2023-10-04 04:30:48,271 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_2655_210_2087_1_1525688959_3897400_437-337690-0_sp1.1 from training. Duration: 0.57275 2023-10-04 04:30:49,596 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7543_210_6753_1_1528541907_1399322_165-323619-0_sp1.1 from training. Duration: 0.62725 2023-10-04 04:30:49,645 WARNING [train.py:1204] (2/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_508-335369-0_sp1.1 from training. Duration: 0.436375 2023-10-04 04:30:51,522 WARNING [train.py:1204] (2/4) Exclude cut with ID _7852_210_5219_1_1528286471316_8139400_358-287492-0 from training. Duration: 0.96 2023-10-04 04:30:53,479 WARNING [train.py:1204] (2/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_322-217220-0_sp0.9 from training. Duration: 0.9555625 2023-10-04 04:30:56,704 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.766e+02 2.172e+02 2.484e+02 2.956e+02 4.900e+02, threshold=4.968e+02, percent-clipped=2.0 2023-10-04 04:30:56,783 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_3171_210_2087_1_1526109084_2330007_66-260583-0_sp0.9 from training. Duration: 0.8 2023-10-04 04:30:56,820 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8558_210_1410_1_1528505225_7613406_131-329524-0_sp1.1 from training. Duration: 0.7181875 2023-10-04 04:30:58,244 WARNING [train.py:1204] (2/4) Exclude cut with ID _9235_210_5219_1_1528891154253_8046120_30-326681-0 from training. Duration: 0.83 2023-10-04 04:31:01,696 WARNING [train.py:1204] (2/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_58-68956-0_sp0.9 from training. Duration: 0.92225 2023-10-04 04:31:01,767 WARNING [train.py:1204] (2/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_255-312654-0 from training. Duration: 0.91 2023-10-04 04:31:04,520 WARNING [train.py:1204] (2/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_99-167392-0 from training. Duration: 0.61 2023-10-04 04:31:04,612 WARNING [train.py:1204] (2/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_94-126128-0_sp0.9 from training. Duration: 0.9555625 2023-10-04 04:31:10,090 WARNING [train.py:1204] (2/4) Exclude cut with ID _68635_210_9317_1_1535770813410_4315870_187-192817-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 04:31:12,721 INFO [train.py:1046] (2/4) Epoch 44, batch 500, loss[loss=0.1493, simple_loss=0.2241, pruned_loss=0.03728, over 23298.00 frames. ], tot_loss[loss=0.1542, simple_loss=0.2347, pruned_loss=0.03688, over 4326136.94 frames. ], batch size: 119, lr: 2.32e-03, grad_scale: 8.0 2023-10-04 04:31:12,781 WARNING [train.py:1204] (2/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_277-204970-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 04:31:12,901 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6163_210_6400_1_1527506746_4263068_283-137811-0_sp0.9 from training. Duration: 0.8555625 2023-10-04 04:31:14,220 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9144W0121-566446 from training. Duration: 0.896 2023-10-04 04:31:17,090 WARNING [train.py:1204] (2/4) Exclude cut with ID _49371_210_12071_1_1533952761021_3812190_351-315519-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 04:31:18,460 WARNING [train.py:1204] (2/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_115-326778-0_sp1.1 from training. Duration: 0.8454375 2023-10-04 04:31:18,522 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_17292_210_6753_1_1531125388_2943734_436-198965-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 04:31:18,531 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9165W0117-576751 from training. Duration: 0.896 2023-10-04 04:31:19,914 WARNING [train.py:1204] (2/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_212-149947-0 from training. Duration: 0.73 2023-10-04 04:31:19,922 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_13757_210_3528_1_1530440682_4107239_65-167544-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 04:31:24,573 WARNING [train.py:1204] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_700-183549-0_sp0.9 from training. Duration: 0.7555625 2023-10-04 04:31:28,318 WARNING [train.py:1204] (2/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_216-343007-0_sp0.9 from training. Duration: 0.7333125 2023-10-04 04:31:30,144 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7544_210_6753_1_1528587722_3576686_295-258479-0_sp1.1 from training. Duration: 0.763625 2023-10-04 04:31:32,956 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_365-287106-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 04:31:32,976 WARNING [train.py:1204] (2/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_300-3747-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 04:31:33,027 WARNING [train.py:1204] (2/4) Exclude cut with ID _14069_210_5632_1_1530512692033_4803390_487-303201-0_sp1.1 from training. Duration: 0.92725 2023-10-04 04:31:41,380 WARNING [train.py:1204] (2/4) Exclude cut with ID _14459_210_2228_1_1531443641946_7516700_668-123026-0_sp1.1 from training. Duration: 0.936375 2023-10-04 04:31:42,635 WARNING [train.py:1204] (2/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_108-53929-0_sp1.1 from training. Duration: 0.536375 2023-10-04 04:31:42,662 WARNING [train.py:1204] (2/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_266-137104-0_sp0.9 from training. Duration: 0.72225 2023-10-04 04:31:42,697 WARNING [train.py:1204] (2/4) Exclude cut with ID _12302_210_5219_1_1529924446466_8107280_902-168619-0_sp1.1 from training. Duration: 0.936375 2023-10-04 04:31:42,723 WARNING [train.py:1204] (2/4) Exclude cut with ID _59502_210_12210_1_1535107024698_1835139_146-162495-0 from training. Duration: 0.92 2023-10-04 04:31:42,746 WARNING [train.py:1204] (2/4) Exclude cut with ID _3465_210_3525_1_1526175667060_3574049_412-287526-0_sp1.1 from training. Duration: 0.6545625 2023-10-04 04:31:45,570 WARNING [train.py:1204] (2/4) Exclude cut with ID _7160_210_1876_1_1527940923231_3703799_120-150943-0_sp0.9 from training. Duration: 0.9 2023-10-04 04:31:46,767 WARNING [train.py:1204] (2/4) Exclude cut with ID _15037_210_2253_1_1530752294104_3267650_296-192597-0_sp1.1 from training. Duration: 0.736375 2023-10-04 04:31:46,805 WARNING [train.py:1204] (2/4) Exclude cut with ID _13252_210_1925_1_1530671372047_3336909_397-216497-0_sp1.1 from training. Duration: 0.87275 2023-10-04 04:31:47,345 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.4.encoder.layers.1.nonlin_attention.whiten1, num_groups=1, num_channels=288, metric=3.82 vs. limit=10.0 2023-10-04 04:31:48,048 WARNING [train.py:1204] (2/4) Exclude cut with ID _40094_210_16380_1_1533294005773_3641159_76-269320-0_sp1.1 from training. Duration: 0.936375 2023-10-04 04:31:48,101 WARNING [train.py:1204] (2/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_449-15508-0 from training. Duration: 0.82 2023-10-04 04:31:50,911 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9309W0194-640321 from training. Duration: 0.64 2023-10-04 04:31:53,001 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.1.conv_module2.balancer1.prob, batch_count=1526280.0, ans=0.125 2023-10-04 04:31:56,094 WARNING [train.py:1204] (2/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_284-312178-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 04:31:57,449 WARNING [train.py:1204] (2/4) Exclude cut with ID _29853_210_7497_1_1532505600851_6642110_401-55937-0_sp1.1 from training. Duration: 0.92725 2023-10-04 04:31:58,892 WARNING [train.py:1204] (2/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_716-24918-0_sp1.1 from training. Duration: 0.92725 2023-10-04 04:31:58,925 WARNING [train.py:1204] (2/4) Exclude cut with ID _19637_210_4220_1_1531537167676_3205060_314-305638-0_sp1.1 from training. Duration: 0.92725 2023-10-04 04:32:00,297 WARNING [train.py:1204] (2/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_475-127455-0_sp1.1 from training. Duration: 0.636375 2023-10-04 04:32:02,134 WARNING [train.py:1204] (2/4) Exclude cut with ID _5444_210_2151_1_1527418808464_5312210_581-315411-0 from training. Duration: 0.62 2023-10-04 04:32:05,448 WARNING [train.py:1204] (2/4) Exclude cut with ID _44902_210_16187_1_1533888006062_3792078_149-67212-0_sp0.9 from training. Duration: 0.9666875 2023-10-04 04:32:06,752 WARNING [train.py:1204] (2/4) Exclude cut with ID _31602_210_5854_1_1532677021456_4458250_63-63253-0_sp1.1 from training. Duration: 0.963625 2023-10-04 04:32:07,461 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.5.encoder.layers.1.conv_module2.whiten, num_groups=1, num_channels=256, metric=6.94 vs. limit=15.0 2023-10-04 04:32:08,309 WARNING [train.py:1204] (2/4) Exclude cut with ID _55209_210_1531_1_1534675828996_4597160_660-203992-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 04:32:09,724 WARNING [train.py:1204] (2/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_83-190624-0_sp1.1 from training. Duration: 0.92725 2023-10-04 04:32:11,233 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder_embed.dropout.p, batch_count=1526413.3333333333, ans=0.1 2023-10-04 04:32:15,179 WARNING [train.py:1204] (2/4) Exclude cut with ID _58605_210_12210_1_1534723195939_3904259_154-139635-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 04:32:17,974 WARNING [train.py:1204] (2/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_164-224052-0 from training. Duration: 0.7 2023-10-04 04:32:17,982 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_1126-189071-0_sp1.1 from training. Duration: 0.963625 2023-10-04 04:32:17,999 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_15396_210_6890_1_1531308473_1938936_310-214789-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 04:32:21,228 WARNING [train.py:1204] (2/4) Exclude cut with ID _8395_210_1531_1_1528597871523_4411579_431-132470-0 from training. Duration: 0.74 2023-10-04 04:32:22,642 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_3780_210_2087_1_1526467760_2130483_170-259044-0_sp0.9 from training. Duration: 0.77775 2023-10-04 04:32:24,119 WARNING [train.py:1204] (2/4) Exclude cut with ID _12733_210_8341_1_1530167424556_3646380_537-230640-0_sp1.1 from training. Duration: 0.963625 2023-10-04 04:32:27,412 INFO [train.py:1046] (2/4) Epoch 44, batch 550, loss[loss=0.1407, simple_loss=0.2214, pruned_loss=0.02997, over 20605.00 frames. ], tot_loss[loss=0.155, simple_loss=0.2356, pruned_loss=0.03719, over 4417112.35 frames. ], batch size: 45, lr: 2.32e-03, grad_scale: 8.0 2023-10-04 04:32:30,644 WARNING [train.py:1204] (2/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_159-281310-0 from training. Duration: 0.95 2023-10-04 04:32:33,325 WARNING [train.py:1204] (2/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_434-83000-0 from training. Duration: 0.98 2023-10-04 04:32:33,354 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_4405_210_1849_1_1526732205_3593939_493-32464-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 04:32:33,363 WARNING [train.py:1204] (2/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_342-100157-0 from training. Duration: 0.54 2023-10-04 04:32:35,052 WARNING [train.py:1204] (2/4) Exclude cut with ID _44778_210_2054_1_1533798127203_7443090_765-250133-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 04:32:35,057 WARNING [train.py:1204] (2/4) Exclude cut with ID _38670_210_15379_1_1533448810659_4391059_81-312245-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 04:32:35,142 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_50050_210_16223_1_1535435796_7290138_1273-247611-0_sp1.1 from training. Duration: 0.936375 2023-10-04 04:32:35,353 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.balancer1.prob, batch_count=1526480.0, ans=0.125 2023-10-04 04:32:36,485 WARNING [train.py:1204] (2/4) Exclude cut with ID _44831_210_13427_1_1533816011549_3689035_399-18591-0_sp1.1 from training. Duration: 0.936375 2023-10-04 04:32:36,508 WARNING [train.py:1204] (2/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_557-324258-0_sp0.9 from training. Duration: 0.9 2023-10-04 04:32:37,887 WARNING [train.py:1204] (2/4) Exclude cut with ID _3465_210_3525_1_1526175667060_3574049_62-335552-0_sp1.1 from training. Duration: 0.7909375 2023-10-04 04:32:39,260 WARNING [train.py:1204] (2/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_673-264125-0_sp1.1 from training. Duration: 0.963625 2023-10-04 04:32:40,648 WARNING [train.py:1204] (2/4) Exclude cut with ID _8030_210_7497_1_1528444886781_3531089_262-86750-0 from training. Duration: 0.81 2023-10-04 04:32:40,661 WARNING [train.py:1204] (2/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_444-28041-0_sp1.1 from training. Duration: 0.77275 2023-10-04 04:32:46,068 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_34008_210_10431_1_1533345424_8183247_516-14858-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 04:32:46,082 WARNING [train.py:1204] (2/4) Exclude cut with ID _32704_210_8954_1_1533017488840_6747370_54-248218-0_sp1.1 from training. Duration: 0.936375 2023-10-04 04:32:48,792 WARNING [train.py:1204] (2/4) Exclude cut with ID _13253_210_1925_1_1530757775819_3577873_300-119782-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 04:32:50,262 WARNING [train.py:1204] (2/4) Exclude cut with ID _39290_210_4220_1_1533862778760_3075118_21-240703-0_sp1.1 from training. Duration: 0.936375 2023-10-04 04:32:53,540 WARNING [train.py:1204] (2/4) Exclude cut with ID 774-127930-0014-10412-0_sp1.1 from training. Duration: 0.95 2023-10-04 04:32:54,898 WARNING [train.py:1204] (2/4) Exclude cut with ID _67499_210_17282_1_1535509805220_3692430_20-179346-0 from training. Duration: 0.97 2023-10-04 04:32:56,403 WARNING [train.py:1204] (2/4) Exclude cut with ID _8365_210_2064_1_1528592311070_4677690_431-294102-0_sp0.9 from training. Duration: 0.988875 2023-10-04 04:32:56,741 INFO [scaling.py:1118] (2/4) WithLoss: name=encoder.encoders.4.encoder.layers.2.self_attn_weights, loss-sum=0.000e+00 2023-10-04 04:33:02,408 WARNING [train.py:1204] (2/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_365-1492-0_sp1.1 from training. Duration: 0.82725 2023-10-04 04:33:02,434 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_105-114797-0_sp0.9 from training. Duration: 0.9333125 2023-10-04 04:33:04,293 WARNING [train.py:1204] (2/4) Exclude cut with ID _6274_210_2084_1_1527937583837_7494020_769-168302-0_sp0.9 from training. Duration: 0.788875 2023-10-04 04:33:07,680 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_40340_210_9024_1_1533433916_4132944_320-270277-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 04:33:07,685 WARNING [train.py:1204] (2/4) Exclude cut with ID ID0203W0003-781711 from training. Duration: 0.958 2023-10-04 04:33:07,773 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_30434_210_13049_1_1532519384_981746_52-12932-0_sp1.1 from training. Duration: 0.936375 2023-10-04 04:33:09,440 WARNING [train.py:1204] (2/4) Exclude cut with ID _8263_210_1664_1_1528438953494_294100_40-204731-0_sp0.9 from training. Duration: 0.5555625 2023-10-04 04:33:12,198 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1032-236863-0_sp0.9 from training. Duration: 0.9333125 2023-10-04 04:33:13,529 WARNING [train.py:1204] (2/4) Exclude cut with ID _8394_210_6702_1_1528590504316_3540189_335-274780-0_sp0.9 from training. Duration: 0.8666875 2023-10-04 04:33:13,538 WARNING [train.py:1204] (2/4) Exclude cut with ID _66873_210_5517_1_1535454136506_4293370_654-52713-0_sp1.1 from training. Duration: 0.7 2023-10-04 04:33:13,618 WARNING [train.py:1204] (2/4) Exclude cut with ID _5250_210_5219_1_1527249561806_8692110_742-267856-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 04:33:15,009 WARNING [train.py:1204] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_74-48717-0 from training. Duration: 0.58 2023-10-04 04:33:17,591 WARNING [train.py:1204] (2/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_18-46756-0 from training. Duration: 0.79 2023-10-04 04:33:17,643 WARNING [train.py:1204] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_612-56061-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 04:33:17,651 WARNING [train.py:1204] (2/4) Exclude cut with ID _56423_210_5854_1_1534896061049_3165969_412-2403-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 04:33:18,979 WARNING [train.py:1204] (2/4) Exclude cut with ID _62574_210_2655_1_1535180407058_3933249_120-196848-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 04:33:18,980 WARNING [train.py:1204] (2/4) Exclude cut with ID _40519_210_13794_1_1533715228655_7337390_916-51000-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 04:33:23,130 WARNING [train.py:1204] (2/4) Exclude cut with ID _65263_210_22356_1_1535763598691_3921037_108-21902-0_sp1.1 from training. Duration: 0.9 2023-10-04 04:33:23,218 WARNING [train.py:1204] (2/4) Exclude cut with ID _4122_210_2084_1_1526713164417_4353280_528-206771-0_sp1.1 from training. Duration: 0.8 2023-10-04 04:33:25,075 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.655e+02 1.940e+02 2.267e+02 2.484e+02 4.509e+02, threshold=4.533e+02, percent-clipped=0.0 2023-10-04 04:33:26,598 WARNING [train.py:1204] (2/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_43-325657-0_sp1.1 from training. Duration: 0.92725 2023-10-04 04:33:27,906 WARNING [train.py:1204] (2/4) Exclude cut with ID _44659_210_2721_1_1534036120061_6614860_314-82041-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 04:33:27,977 WARNING [train.py:1204] (2/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_81-300212-0_sp0.9 from training. Duration: 0.6666875 2023-10-04 04:33:28,049 WARNING [train.py:1204] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_665-196155-0_sp0.9 from training. Duration: 0.7444375 2023-10-04 04:33:30,207 WARNING [train.py:1204] (2/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_140-336881-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 04:33:31,400 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6290_210_3273_1_1527595224_1282787_115-87095-0_sp0.9 from training. Duration: 0.611125 2023-10-04 04:33:31,455 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_53373_210_5399_1_1534229471_4715695_607-259685-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 04:33:32,852 WARNING [train.py:1204] (2/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_175-163894-0_sp1.1 from training. Duration: 0.67275 2023-10-04 04:33:32,869 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7002_210_1794_1_1527939683_5157286_1-281264-0_sp1.1 from training. Duration: 0.4 2023-10-04 04:33:39,366 WARNING [train.py:1204] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_605-257205-0 from training. Duration: 0.59 2023-10-04 04:33:42,093 INFO [train.py:1046] (2/4) Epoch 44, batch 600, loss[loss=0.1489, simple_loss=0.2363, pruned_loss=0.03075, over 24512.00 frames. ], tot_loss[loss=0.156, simple_loss=0.2363, pruned_loss=0.03783, over 4477614.52 frames. ], batch size: 66, lr: 2.32e-03, grad_scale: 8.0 2023-10-04 04:33:43,491 WARNING [train.py:1204] (2/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_454-314901-0 from training. Duration: 0.46 2023-10-04 04:33:43,591 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_46032_210_14656_1_1533781412_4507969_287-130422-0_sp1.1 from training. Duration: 0.97275 2023-10-04 04:33:43,606 WARNING [train.py:1204] (2/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_186-309161-0_sp1.1 from training. Duration: 0.4909375 2023-10-04 04:33:44,877 WARNING [train.py:1204] (2/4) Exclude cut with ID _42659_210_2070_1_1533607442765_3120380_45-342307-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 04:33:50,603 WARNING [train.py:1204] (2/4) Exclude cut with ID _12555_210_8750_1_1530075594402_3780239_478-326818-0_sp1.1 from training. Duration: 0.963625 2023-10-04 04:33:52,016 WARNING [train.py:1204] (2/4) Exclude cut with ID _7606_210_4915_1_1528894253277_5080690_127-177761-0_sp1.1 from training. Duration: 0.5818125 2023-10-04 04:33:56,525 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6788_210_1388_1_1527898698_4562021_91-197981-0 from training. Duration: 0.83 2023-10-04 04:33:57,906 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8558_210_1410_1_1528505225_7613406_131-329524-0_sp0.9 from training. Duration: 0.87775 2023-10-04 04:33:58,025 WARNING [train.py:1204] (2/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_153-153537-0_sp1.1 from training. Duration: 0.82725 2023-10-04 04:34:01,455 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_17292_210_6753_1_1531125388_2943734_285-349105-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 04:34:04,049 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.3.encoder.layers.3.whiten, num_groups=1, num_channels=512, metric=3.37 vs. limit=12.0 2023-10-04 04:34:04,726 WARNING [train.py:1204] (2/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_37-41919-0 from training. Duration: 0.87 2023-10-04 04:34:04,746 WARNING [train.py:1204] (2/4) Exclude cut with ID _60860_210_8254_1_1534896015875_3557169_172-294852-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 04:34:04,928 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.1.feed_forward1.out_proj.dropout_p, batch_count=1526880.0, ans=0.1 2023-10-04 04:34:05,314 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.4.encoder.layers.2.self_attn_weights.whiten_keys, num_groups=4, num_channels=128, metric=1.99 vs. limit=6.0 2023-10-04 04:34:10,699 WARNING [train.py:1204] (2/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_146-276944-0 from training. Duration: 0.83 2023-10-04 04:34:12,183 WARNING [train.py:1204] (2/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_1254-44082-0_sp1.1 from training. Duration: 0.936375 2023-10-04 04:34:12,195 WARNING [train.py:1204] (2/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_1115-180350-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 04:34:12,217 WARNING [train.py:1204] (2/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_60-146189-0_sp1.1 from training. Duration: 0.8909375 2023-10-04 04:34:17,846 WARNING [train.py:1204] (2/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_517-153649-0_sp1.1 from training. Duration: 0.92725 2023-10-04 04:34:19,137 WARNING [train.py:1204] (2/4) Exclude cut with ID _39571_210_13449_1_1533801584143_3651077_37-266257-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 04:34:19,207 WARNING [train.py:1204] (2/4) Exclude cut with ID _6653_210_2629_1_1527919172441_6732679_527-276118-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 04:34:26,593 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7696_210_2040_1_1528207167_3716099_117-125434-0_sp1.1 from training. Duration: 0.6909375 2023-10-04 04:34:28,600 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.conv_module1.balancer2.prob, batch_count=1527013.3333333333, ans=0.125 2023-10-04 04:34:31,212 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_25767_210_2161_1_1532307483_3867748_478-156360-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 04:34:31,217 WARNING [train.py:1204] (2/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_177-49092-0_sp1.1 from training. Duration: 0.82725 2023-10-04 04:34:31,235 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_11305_210_1794_1_1529750428_7178148_680-317073-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 04:34:38,508 WARNING [train.py:1204] (2/4) Exclude cut with ID _53565_210_10635_1_1534337218335_3820259_287-18120-0 from training. Duration: 0.97 2023-10-04 04:34:40,655 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.bypass.skip_rate, batch_count=1527080.0, ans=0.04949747468305833 2023-10-04 04:34:42,101 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.bypass_mid.scale_min, batch_count=1527080.0, ans=0.2 2023-10-04 04:34:43,275 WARNING [train.py:1204] (2/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_498-40153-0_sp1.1 from training. Duration: 0.57275 2023-10-04 04:34:44,498 WARNING [train.py:1204] (2/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_555-63066-0_sp1.1 from training. Duration: 0.963625 2023-10-04 04:34:46,177 WARNING [train.py:1204] (2/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_395-310220-0 from training. Duration: 0.89 2023-10-04 04:34:47,539 WARNING [train.py:1204] (2/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_937-208281-0_sp1.1 from training. Duration: 0.77275 2023-10-04 04:34:50,284 WARNING [train.py:1204] (2/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_120-211630-0 from training. Duration: 0.92 2023-10-04 04:34:50,323 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_454-276429-0_sp1.1 from training. Duration: 0.7909375 2023-10-04 04:34:51,734 WARNING [train.py:1204] (2/4) Exclude cut with ID _22826_210_9994_1_1531908201522_3279169_404-75889-0_sp1.1 from training. Duration: 0.8090625 2023-10-04 04:34:55,830 INFO [train.py:1046] (2/4) Epoch 44, batch 650, loss[loss=0.1561, simple_loss=0.2466, pruned_loss=0.03282, over 24315.00 frames. ], tot_loss[loss=0.1549, simple_loss=0.235, pruned_loss=0.0374, over 4535716.73 frames. ], batch size: 74, lr: 2.32e-03, grad_scale: 8.0 2023-10-04 04:34:57,791 WARNING [train.py:1204] (2/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_282-136641-0_sp0.9 from training. Duration: 0.4666875 2023-10-04 04:34:57,889 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_809-17156-0_sp1.1 from training. Duration: 0.663625 2023-10-04 04:35:01,226 WARNING [train.py:1204] (2/4) Exclude cut with ID _9235_210_5219_1_1528891154253_8046120_1003-101638-0_sp0.9 from training. Duration: 0.811125 2023-10-04 04:35:01,327 WARNING [train.py:1204] (2/4) Exclude cut with ID _22826_210_9994_1_1531908201522_3279169_404-75889-0_sp0.9 from training. Duration: 0.988875 2023-10-04 04:35:04,398 WARNING [train.py:1204] (2/4) Exclude cut with ID _59288_210_5854_1_1535077757774_3412246_570-295255-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 04:35:05,929 WARNING [train.py:1204] (2/4) Exclude cut with ID _3465_210_3525_1_1526175667060_3574049_412-287526-0 from training. Duration: 0.72 2023-10-04 04:35:07,196 WARNING [train.py:1204] (2/4) Exclude cut with ID _44010_210_4525_1_1533884262964_4013620_649-79655-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 04:35:07,477 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=1527146.6666666667, ans=0.1 2023-10-04 04:35:13,374 WARNING [train.py:1204] (2/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_57-270037-0_sp0.9 from training. Duration: 0.8555625 2023-10-04 04:35:13,374 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_34815_210_6388_1_1533381320_3571903_302-255963-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 04:35:16,182 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_47449_210_5399_1_1534076445_4006929_573-301852-0_sp1.1 from training. Duration: 0.97275 2023-10-04 04:35:20,342 WARNING [train.py:1204] (2/4) Exclude cut with ID 6709-74022-0004-86860-0_sp1.1 from training. Duration: 0.9409375 2023-10-04 04:35:21,724 WARNING [train.py:1204] (2/4) Exclude cut with ID _40519_210_13794_1_1533715228655_7337390_111-71991-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 04:35:23,021 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_30167_210_3528_1_1532428012_4772375_169-214186-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 04:35:25,821 WARNING [train.py:1204] (2/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_20-235313-0_sp1.1 from training. Duration: 0.963625 2023-10-04 04:35:25,868 WARNING [train.py:1204] (2/4) Exclude cut with ID _4681_210_3525_1_1527251040321_1235713_123-24428-0_sp0.9 from training. Duration: 0.5333125 2023-10-04 04:35:26,185 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.conv_module1.balancer1.prob, batch_count=1527280.0, ans=0.125 2023-10-04 04:35:29,740 WARNING [train.py:1204] (2/4) Exclude cut with ID _31602_210_5854_1_1532677021456_4458250_444-198752-0_sp1.1 from training. Duration: 0.97275 2023-10-04 04:35:29,805 WARNING [train.py:1204] (2/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_9-279442-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 04:35:31,075 WARNING [train.py:1204] (2/4) Exclude cut with ID _6275_210_2084_1_1528110062213_7669049_973-198085-0_sp0.9 from training. Duration: 0.8333125 2023-10-04 04:35:31,162 WARNING [train.py:1204] (2/4) Exclude cut with ID _29853_210_7497_1_1532505600851_6642110_567-250221-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 04:35:33,118 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6545_210_2040_1_1527774670_4245258_407-48070-0_sp1.1 from training. Duration: 0.6909375 2023-10-04 04:35:33,383 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.conv_module1.balancer2.min_positive, batch_count=1527280.0, ans=0.05 2023-10-04 04:35:34,424 WARNING [train.py:1204] (2/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_224-343295-0_sp0.9 from training. Duration: 0.611125 2023-10-04 04:35:34,439 WARNING [train.py:1204] (2/4) Exclude cut with ID IC0125W0011-61817 from training. Duration: 0.88 2023-10-04 04:35:35,767 WARNING [train.py:1204] (2/4) Exclude cut with ID _60113_210_16271_1_1535164212481_4386079_435-154371-0_sp1.1 from training. Duration: 0.97275 2023-10-04 04:35:35,789 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_25836_210_16554_1_1532138167_53197_3-9394-0_sp1.1 from training. Duration: 0.92725 2023-10-04 04:35:38,551 WARNING [train.py:1204] (2/4) Exclude cut with ID _29343_210_4381_1_1532602757752_3674109_57-55125-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 04:35:39,878 WARNING [train.py:1204] (2/4) Exclude cut with ID _56235_210_10640_1_1534557335235_4161099_267-23560-0_sp1.1 from training. Duration: 0.936375 2023-10-04 04:35:39,916 WARNING [train.py:1204] (2/4) Exclude cut with ID _30196_210_1664_1_1533708496522_7074140_1150-278867-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 04:35:39,972 WARNING [train.py:1204] (2/4) Exclude cut with ID _6270_210_2084_1_1527850911573_3856420_26-277343-0_sp0.9 from training. Duration: 0.911125 2023-10-04 04:35:43,086 WARNING [train.py:1204] (2/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_325-62802-0 from training. Duration: 0.87 2023-10-04 04:35:43,162 WARNING [train.py:1204] (2/4) Exclude cut with ID _56524_210_9642_1_1534572011779_3619170_145-310842-0_sp1.1 from training. Duration: 0.9 2023-10-04 04:35:43,369 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.conv_module2.balancer1.prob, batch_count=1527346.6666666667, ans=0.125 2023-10-04 04:35:44,437 WARNING [train.py:1204] (2/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_860-96198-0_sp0.9 from training. Duration: 0.811125 2023-10-04 04:35:44,525 WARNING [train.py:1204] (2/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_17-25045-0_sp1.1 from training. Duration: 0.536375 2023-10-04 04:35:44,549 WARNING [train.py:1204] (2/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_349-69727-0_sp1.1 from training. Duration: 0.936375 2023-10-04 04:35:45,931 WARNING [train.py:1204] (2/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_580-93798-0_sp1.1 from training. Duration: 0.5090625 2023-10-04 04:35:47,333 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7762_210_1794_1_1528199508_4057408_419-125009-0 from training. Duration: 0.83 2023-10-04 04:35:48,682 WARNING [train.py:1204] (2/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_356-167816-0 from training. Duration: 0.72 2023-10-04 04:35:48,712 WARNING [train.py:1204] (2/4) Exclude cut with ID _29345_210_4381_1_1532689168263_3860169_225-97100-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 04:35:48,716 WARNING [train.py:1204] (2/4) Exclude cut with ID _14227_210_4381_1_1530701804286_3940259_374-194712-0_sp1.1 from training. Duration: 0.92725 2023-10-04 04:35:48,740 WARNING [train.py:1204] (2/4) Exclude cut with ID _40137_210_12210_1_1533290484046_3795180_249-187059-0_sp0.9 from training. Duration: 0.97775 2023-10-04 04:35:48,952 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.feed_forward2.hidden_balancer.prob, batch_count=1527346.6666666667, ans=0.125 2023-10-04 04:35:50,097 WARNING [train.py:1204] (2/4) Exclude cut with ID _22826_210_9994_1_1531908201522_3279169_389-317194-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 04:35:51,534 WARNING [train.py:1204] (2/4) Exclude cut with ID _27485_210_4916_1_1532739191677_3668269_239-122918-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 04:35:54,090 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.585e+02 1.998e+02 2.208e+02 2.441e+02 4.854e+02, threshold=4.416e+02, percent-clipped=2.0 2023-10-04 04:35:55,770 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.conv_skip_rate, batch_count=1527413.3333333333, ans=0.0 2023-10-04 04:35:57,051 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_2245_210_2715_1_1525511548_3540254_128-117609-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 04:35:57,087 WARNING [train.py:1204] (2/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_160-276178-0_sp1.1 from training. Duration: 0.963625 2023-10-04 04:35:59,035 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_31920_210_10633_1_1532860009_1686094_1-279735-0_sp1.1 from training. Duration: 0.97275 2023-10-04 04:36:01,011 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.conv_module2.balancer1.min_positive, batch_count=1527413.3333333333, ans=0.025 2023-10-04 04:36:02,236 WARNING [train.py:1204] (2/4) Exclude cut with ID _51078_210_9070_1_1534161557424_3805250_325-262383-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 04:36:02,258 WARNING [train.py:1204] (2/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_643-52007-0_sp0.9 from training. Duration: 0.6333125 2023-10-04 04:36:04,043 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_51937_210_5014_1_1534573610_4022025_518-299248-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 04:36:09,514 WARNING [train.py:1204] (2/4) Exclude cut with ID _13252_210_1925_1_1530671372047_3336909_166-314607-0_sp1.1 from training. Duration: 0.4909375 2023-10-04 04:36:09,518 WARNING [train.py:1204] (2/4) Exclude cut with ID _44778_210_2054_1_1533798127203_7443090_1087-282939-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 04:36:09,546 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_23850_210_4238_1_1532232597_1632433_32-185135-0_sp1.1 from training. Duration: 0.7818125 2023-10-04 04:36:09,574 WARNING [train.py:1204] (2/4) Exclude cut with ID _59500_210_8254_1_1534809610680_3736009_145-89359-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 04:36:10,810 INFO [train.py:1046] (2/4) Epoch 44, batch 700, loss[loss=0.1438, simple_loss=0.2073, pruned_loss=0.04017, over 22790.00 frames. ], tot_loss[loss=0.154, simple_loss=0.2337, pruned_loss=0.03715, over 4566218.72 frames. ], batch size: 322, lr: 2.32e-03, grad_scale: 8.0 2023-10-04 04:36:14,174 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_340-336408-0 from training. Duration: 0.93 2023-10-04 04:36:14,437 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.bypass_mid.scale_min, batch_count=1527480.0, ans=0.2 2023-10-04 04:36:15,579 WARNING [train.py:1204] (2/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_9-21737-0 from training. Duration: 0.97 2023-10-04 04:36:18,318 WARNING [train.py:1204] (2/4) Exclude cut with ID _2220_210_1819_1_1525509109898_3658680_50-99837-0 from training. Duration: 0.54 2023-10-04 04:36:18,397 WARNING [train.py:1204] (2/4) Exclude cut with ID _30304_210_6319_1_1532476827261_7582060_879-299350-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 04:36:21,082 WARNING [train.py:1204] (2/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_905-160074-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 04:36:23,957 WARNING [train.py:1204] (2/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_395-73570-0 from training. Duration: 0.99 2023-10-04 04:36:28,127 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7264_210_4938_1_1527939911_8132095_800-91995-0_sp1.1 from training. Duration: 0.7818125 2023-10-04 04:36:28,343 INFO [scaling.py:1118] (2/4) WithLoss: name=encoder.encoders.1.encoder.layers.0.self_attn_weights, loss-sum=0.000e+00 2023-10-04 04:36:28,346 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.conv_skip_rate, batch_count=1527546.6666666667, ans=0.0 2023-10-04 04:36:31,454 WARNING [train.py:1204] (2/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_77-311404-0_sp1.1 from training. Duration: 0.8545625 2023-10-04 04:36:32,198 WARNING [train.py:1204] (2/4) Exclude cut with ID _13679_210_7497_1_1530534293662_3355098_107-80772-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 04:36:33,547 WARNING [train.py:1204] (2/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_224-343295-0_sp1.1 from training. Duration: 0.5 2023-10-04 04:36:33,588 WARNING [train.py:1204] (2/4) Exclude cut with ID _54871_210_22356_1_1534665798253_3856359_156-253482-0_sp1.1 from training. Duration: 0.963625 2023-10-04 04:36:36,333 WARNING [train.py:1204] (2/4) Exclude cut with ID _49728_210_12651_1_1534041034294_6788150_309-125532-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 04:36:37,832 WARNING [train.py:1204] (2/4) Exclude cut with ID _13913_210_9994_1_1530579615187_3383897_473-277682-0_sp0.9 from training. Duration: 0.4333125 2023-10-04 04:36:37,844 WARNING [train.py:1204] (2/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_3-276701-0_sp1.1 from training. Duration: 0.82725 2023-10-04 04:36:39,184 WARNING [train.py:1204] (2/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_282-136641-0 from training. Duration: 0.42 2023-10-04 04:36:40,669 WARNING [train.py:1204] (2/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_127-566-0 from training. Duration: 0.84 2023-10-04 04:36:46,511 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6163_210_6400_1_1527506746_4263068_283-137811-0_sp1.1 from training. Duration: 0.7 2023-10-04 04:36:46,556 WARNING [train.py:1204] (2/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_34-183685-0_sp1.1 from training. Duration: 0.8909375 2023-10-04 04:36:49,268 WARNING [train.py:1204] (2/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_154-135696-0_sp0.9 from training. Duration: 0.888875 2023-10-04 04:36:52,133 WARNING [train.py:1204] (2/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_676-51014-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 04:36:52,203 WARNING [train.py:1204] (2/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_38-169920-0 from training. Duration: 0.91 2023-10-04 04:36:56,538 WARNING [train.py:1204] (2/4) Exclude cut with ID _6653_210_2629_1_1527919172441_6732679_747-114465-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 04:36:58,482 WARNING [train.py:1204] (2/4) Exclude cut with ID _8021_210_7994_1_1528376451358_3653069_100-140231-0_sp1.1 from training. Duration: 0.6090625 2023-10-04 04:36:58,502 WARNING [train.py:1204] (2/4) Exclude cut with ID _64135_210_2253_1_1535677267876_7608972_133-188266-0 from training. Duration: 0.81 2023-10-04 04:37:03,907 WARNING [train.py:1204] (2/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_714-310538-0_sp1.1 from training. Duration: 0.7818125 2023-10-04 04:37:05,229 WARNING [train.py:1204] (2/4) Exclude cut with ID _39867_210_9826_1_1533553222030_9344259_1134-288498-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 04:37:07,978 WARNING [train.py:1204] (2/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_211-237289-0_sp1.1 from training. Duration: 0.936375 2023-10-04 04:37:13,522 WARNING [train.py:1204] (2/4) Exclude cut with ID _59202_210_26984_1_1534816810612_3549174_137-318710-0_sp0.9 from training. Duration: 0.988875 2023-10-04 04:37:13,555 WARNING [train.py:1204] (2/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_86-66192-0 from training. Duration: 0.55 2023-10-04 04:37:16,828 WARNING [train.py:1204] (2/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_540-275685-0 from training. Duration: 0.89 2023-10-04 04:37:16,872 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8776_210_2040_1_1528638837_4088036_244-278763-0 from training. Duration: 0.9 2023-10-04 04:37:19,736 WARNING [train.py:1204] (2/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_9-219972-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 04:37:21,122 WARNING [train.py:1204] (2/4) Exclude cut with ID _44659_210_2721_1_1534036120061_6614860_656-283823-0_sp1.1 from training. Duration: 0.92725 2023-10-04 04:37:21,202 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_69630_210_7234_1_1535789601_661049_77-216385-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 04:37:23,902 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_19952_210_2161_1_1531875469_3851750_210-308919-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 04:37:23,914 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_4737_210_2040_1_1526998689_3293008_378-178650-0 from training. Duration: 0.95 2023-10-04 04:37:25,159 INFO [train.py:1046] (2/4) Epoch 44, batch 750, loss[loss=0.1679, simple_loss=0.2401, pruned_loss=0.04787, over 23800.00 frames. ], tot_loss[loss=0.1539, simple_loss=0.2337, pruned_loss=0.03702, over 4593875.19 frames. ], batch size: 179, lr: 2.32e-03, grad_scale: 8.0 2023-10-04 04:37:27,954 WARNING [train.py:1204] (2/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_224-343295-0 from training. Duration: 0.55 2023-10-04 04:37:27,981 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7002_210_1794_1_1527939683_5157286_184-229630-0 from training. Duration: 0.74 2023-10-04 04:37:28,020 WARNING [train.py:1204] (2/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_6-214815-0 from training. Duration: 0.85 2023-10-04 04:37:29,797 WARNING [train.py:1204] (2/4) Exclude cut with ID _8699_210_2069_1_1528630346831_3960409_160-24408-0 from training. Duration: 0.91 2023-10-04 04:37:29,834 WARNING [train.py:1204] (2/4) Exclude cut with ID _5585_210_2667_1_1527418813324_8007340_89-332072-0 from training. Duration: 0.64 2023-10-04 04:37:31,734 WARNING [train.py:1204] (2/4) Exclude cut with ID _5250_210_5219_1_1527249561806_8692110_528-259800-0_sp1.1 from training. Duration: 0.963625 2023-10-04 04:37:33,108 WARNING [train.py:1204] (2/4) Exclude cut with ID _55383_210_10635_1_1534468426172_2231669_134-128246-0 from training. Duration: 0.89 2023-10-04 04:37:34,515 WARNING [train.py:1204] (2/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_400-338274-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 04:37:35,175 WARNING [train.py:1204] (2/4) Exclude cut with ID _14224_210_4381_1_1530529062037_4264010_364-22817-0_sp0.9 from training. Duration: 0.788875 2023-10-04 04:37:36,365 WARNING [train.py:1204] (2/4) Exclude cut with ID _42863_210_9070_1_1533902398788_3696920_331-291057-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 04:37:37,734 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_5436_210_1794_1_1527331317_2541494_229-211016-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 04:37:37,771 WARNING [train.py:1204] (2/4) Exclude cut with ID _6456_210_1534_1_1527681634212_3181049_345-252877-0_sp0.9 from training. Duration: 0.711125 2023-10-04 04:37:37,804 WARNING [train.py:1204] (2/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_96-81499-0_sp1.1 from training. Duration: 0.92725 2023-10-04 04:37:40,700 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_5575_210_1410_1_1527301305_2570170_342-265572-0_sp1.1 from training. Duration: 0.8545625 2023-10-04 04:37:40,753 WARNING [train.py:1204] (2/4) Exclude cut with ID _5444_210_2151_1_1527418808464_5312210_726-27165-0_sp0.9 from training. Duration: 0.7444375 2023-10-04 04:37:42,187 WARNING [train.py:1204] (2/4) Exclude cut with ID _22083_210_8254_1_1531702862439_3678970_304-327750-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 04:37:45,324 WARNING [train.py:1204] (2/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_245-226199-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 04:37:45,355 WARNING [train.py:1204] (2/4) Exclude cut with ID _42875_210_14856_1_1533704397487_4012433_102-199499-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 04:37:46,821 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_51225_210_3601_1_1534400634_2327383_55-301844-0 from training. Duration: 0.93 2023-10-04 04:37:48,208 WARNING [train.py:1204] (2/4) Exclude cut with ID _28317_210_12742_1_1532480302381_8510325_916-195848-0_sp1.1 from training. Duration: 0.836375 2023-10-04 04:37:49,654 WARNING [train.py:1204] (2/4) Exclude cut with ID _44718_210_5816_1_1534050051869_6469030_497-311999-0_sp1.1 from training. Duration: 0.936375 2023-10-04 04:37:52,230 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_34886_210_10633_1_1533203882_1755661_91-194765-0_sp1.1 from training. Duration: 0.936375 2023-10-04 04:37:54,990 WARNING [train.py:1204] (2/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_310-26186-0_sp0.9 from training. Duration: 0.77775 2023-10-04 04:37:55,073 WARNING [train.py:1204] (2/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_416-257730-0 from training. Duration: 0.65 2023-10-04 04:37:55,078 WARNING [train.py:1204] (2/4) Exclude cut with ID _9389_210_1664_1_1529217337301_5289269_209-154760-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 04:37:55,394 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.bypass.skip_rate, batch_count=1527946.6666666667, ans=0.09899494936611666 2023-10-04 04:37:56,513 WARNING [train.py:1204] (2/4) Exclude cut with ID _4092_210_2151_1_1526643044449_3135759_227-213972-0 from training. Duration: 0.54 2023-10-04 04:37:56,542 WARNING [train.py:1204] (2/4) Exclude cut with ID IC0679W0044-335852 from training. Duration: 0.768 2023-10-04 04:37:57,946 WARNING [train.py:1204] (2/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_42-186162-0 from training. Duration: 0.92 2023-10-04 04:37:57,952 WARNING [train.py:1204] (2/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_302-2550-0_sp0.9 from training. Duration: 0.9 2023-10-04 04:37:57,972 WARNING [train.py:1204] (2/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_356-31216-0_sp0.9 from training. Duration: 0.8333125 2023-10-04 04:37:59,421 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.1.balancer2.prob, batch_count=1527946.6666666667, ans=0.125 2023-10-04 04:38:00,677 WARNING [train.py:1204] (2/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_125-322172-0_sp1.1 from training. Duration: 0.8454375 2023-10-04 04:38:07,585 WARNING [train.py:1204] (2/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_141-89727-0_sp0.9 from training. Duration: 0.788875 2023-10-04 04:38:08,836 WARNING [train.py:1204] (2/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_647-337784-0_sp1.1 from training. Duration: 0.97275 2023-10-04 04:38:08,846 WARNING [train.py:1204] (2/4) Exclude cut with ID _6493_210_1747_1_1528459210424_4012470_513-140967-0_sp1.1 from training. Duration: 0.7181875 2023-10-04 04:38:10,383 WARNING [train.py:1204] (2/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_87-266334-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 04:38:11,807 WARNING [train.py:1204] (2/4) Exclude cut with ID _26528_210_9070_1_1532570410842_3851310_526-261338-0_sp1.1 from training. Duration: 0.92725 2023-10-04 04:38:11,850 WARNING [train.py:1204] (2/4) Exclude cut with ID _14224_210_4381_1_1530529062037_4264010_380-343670-0 from training. Duration: 0.88 2023-10-04 04:38:13,134 WARNING [train.py:1204] (2/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_449-15508-0_sp1.1 from training. Duration: 0.7454375 2023-10-04 04:38:13,239 WARNING [train.py:1204] (2/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_372-6869-0_sp0.9 from training. Duration: 0.488875 2023-10-04 04:38:14,594 WARNING [train.py:1204] (2/4) Exclude cut with ID _26907_210_7234_1_1532221740694_3205263_337-2494-0_sp0.9 from training. Duration: 0.97775 2023-10-04 04:38:17,303 WARNING [train.py:1204] (2/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_10-100367-0_sp1.1 from training. Duration: 0.8909375 2023-10-04 04:38:17,343 WARNING [train.py:1204] (2/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_47-167373-0 from training. Duration: 0.92 2023-10-04 04:38:19,186 WARNING [train.py:1204] (2/4) Exclude cut with ID _28323_210_16187_1_1532480395667_4100300_628-282179-0_sp1.1 from training. Duration: 0.97275 2023-10-04 04:38:23,158 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.632e+02 1.912e+02 2.188e+02 2.458e+02 4.051e+02, threshold=4.377e+02, percent-clipped=0.0 2023-10-04 04:38:24,689 WARNING [train.py:1204] (2/4) Exclude cut with ID _20455_210_5219_1_1531565996775_8098100_213-280898-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 04:38:26,210 WARNING [train.py:1204] (2/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_383-316000-0_sp1.1 from training. Duration: 0.8181875 2023-10-04 04:38:26,252 WARNING [train.py:1204] (2/4) Exclude cut with ID _18513_210_9206_1_1531618215223_3862620_16-78180-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 04:38:28,998 WARNING [train.py:1204] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_330-327861-0_sp1.1 from training. Duration: 0.6090625 2023-10-04 04:38:31,887 WARNING [train.py:1204] (2/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_356-31216-0 from training. Duration: 0.75 2023-10-04 04:38:32,086 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.feed_forward1.hidden_balancer.prob, batch_count=1528080.0, ans=0.125 2023-10-04 04:38:33,739 WARNING [train.py:1204] (2/4) Exclude cut with ID _25248_210_8750_1_1532062825941_5554180_255-148539-0_sp1.1 from training. Duration: 0.963625 2023-10-04 04:38:33,791 WARNING [train.py:1204] (2/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_269-154726-0_sp1.1 from training. Duration: 0.7818125 2023-10-04 04:38:34,389 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.4.encoder.layers.1.nonlin_attention.whiten2, num_groups=1, num_channels=384, metric=3.18 vs. limit=15.0 2023-10-04 04:38:37,085 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7002_210_1794_1_1527939683_5157286_163-87552-0_sp1.1 from training. Duration: 0.7818125 2023-10-04 04:38:37,123 WARNING [train.py:1204] (2/4) Exclude cut with ID _54871_210_22356_1_1534665798253_3856359_97-151896-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 04:38:40,333 INFO [train.py:1046] (2/4) Epoch 44, batch 800, loss[loss=0.1727, simple_loss=0.2592, pruned_loss=0.04304, over 24350.00 frames. ], tot_loss[loss=0.1541, simple_loss=0.2342, pruned_loss=0.03701, over 4618354.69 frames. ], batch size: 77, lr: 2.32e-03, grad_scale: 16.0 2023-10-04 04:38:40,437 WARNING [train.py:1204] (2/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_207-7026-0_sp1.1 from training. Duration: 0.97275 2023-10-04 04:38:40,459 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_92-46009-0_sp0.9 from training. Duration: 0.6 2023-10-04 04:38:44,879 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.feed_forward1.out_proj.dropout_p, batch_count=1528146.6666666667, ans=0.1 2023-10-04 04:38:47,435 WARNING [train.py:1204] (2/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_122-178847-0_sp1.1 from training. Duration: 0.97275 2023-10-04 04:38:47,440 WARNING [train.py:1204] (2/4) Exclude cut with ID _44778_210_2054_1_1533798127203_7443090_94-348956-0_sp1.1 from training. Duration: 0.92725 2023-10-04 04:38:50,578 WARNING [train.py:1204] (2/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_1018-244089-0_sp1.1 from training. Duration: 0.963625 2023-10-04 04:38:50,597 WARNING [train.py:1204] (2/4) Exclude cut with ID _56235_210_10640_1_1534557335235_4161099_87-118423-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 04:38:51,854 WARNING [train.py:1204] (2/4) Exclude cut with ID _53713_210_24116_1_1534314487762_7416751_504-212292-0_sp1.1 from training. Duration: 0.92725 2023-10-04 04:38:51,875 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_446-150327-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 04:38:53,229 WARNING [train.py:1204] (2/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_682-245989-0_sp1.1 from training. Duration: 0.92725 2023-10-04 04:38:56,096 WARNING [train.py:1204] (2/4) Exclude cut with ID _35723_210_1794_1_1533088341043_628250_8-234524-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 04:38:57,415 WARNING [train.py:1204] (2/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_64-248359-0_sp0.9 from training. Duration: 0.7666875 2023-10-04 04:38:58,779 WARNING [train.py:1204] (2/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_49-97158-0 from training. Duration: 0.76 2023-10-04 04:39:00,160 WARNING [train.py:1204] (2/4) Exclude cut with ID _24314_210_4915_1_1532135078116_7170179_743-915-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 04:39:01,533 WARNING [train.py:1204] (2/4) Exclude cut with ID _61194_210_18628_1_1535007555628_3750909_353-323468-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 04:39:01,561 WARNING [train.py:1204] (2/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_387-131538-0_sp1.1 from training. Duration: 0.736375 2023-10-04 04:39:01,582 WARNING [train.py:1204] (2/4) Exclude cut with ID _44424_210_18163_1_1533623394848_1213110_79-187603-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 04:39:03,375 WARNING [train.py:1204] (2/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_86-19601-0 from training. Duration: 0.85 2023-10-04 04:39:03,401 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_50050_210_16223_1_1535435796_7290138_103-283529-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 04:39:04,584 WARNING [train.py:1204] (2/4) Exclude cut with ID _13913_210_9994_1_1530579615187_3383897_473-277682-0 from training. Duration: 0.39 2023-10-04 04:39:08,077 WARNING [train.py:1204] (2/4) Exclude cut with ID _42326_210_7770_1_1533814282531_3707269_368-68029-0_sp1.1 from training. Duration: 0.92725 2023-10-04 04:39:09,542 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_41428_210_2161_1_1533774184_3992532_446-339645-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 04:39:12,969 WARNING [train.py:1204] (2/4) Exclude cut with ID _69584_210_5219_1_1535785133775_4044450_325-7656-0_sp1.1 from training. Duration: 0.7818125 2023-10-04 04:39:12,980 WARNING [train.py:1204] (2/4) Exclude cut with ID _21632_210_2253_1_1531994001765_3744140_134-184387-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 04:39:14,481 WARNING [train.py:1204] (2/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_550-184707-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 04:39:14,503 WARNING [train.py:1204] (2/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_106-341780-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 04:39:18,708 WARNING [train.py:1204] (2/4) Exclude cut with ID _45871_210_5665_1_1533864614245_3296060_253-132552-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 04:39:18,751 WARNING [train.py:1204] (2/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_395-310220-0_sp1.1 from training. Duration: 0.8090625 2023-10-04 04:39:18,764 WARNING [train.py:1204] (2/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_795-33252-0_sp0.9 from training. Duration: 0.411125 2023-10-04 04:39:22,018 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9002W0003-495962 from training. Duration: 0.946 2023-10-04 04:39:22,046 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_43-71989-0 from training. Duration: 0.57 2023-10-04 04:39:22,059 WARNING [train.py:1204] (2/4) Exclude cut with ID _8699_210_2069_1_1528630346831_3960409_155-39766-0_sp1.1 from training. Duration: 0.7090625 2023-10-04 04:39:22,077 WARNING [train.py:1204] (2/4) Exclude cut with ID _27894_210_12392_1_1532415496078_6902439_788-232684-0_sp1.1 from training. Duration: 0.97275 2023-10-04 04:39:24,635 WARNING [train.py:1204] (2/4) Exclude cut with ID _56494_210_4220_1_1535284837625_3033169_10-39296-0_sp1.1 from training. Duration: 0.92725 2023-10-04 04:39:24,662 WARNING [train.py:1204] (2/4) Exclude cut with ID _40901_210_5665_1_1533363789958_3856130_341-20819-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 04:39:26,137 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.feed_forward1.hidden_balancer.prob, batch_count=1528346.6666666667, ans=0.125 2023-10-04 04:39:30,125 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9029W0435-509822 from training. Duration: 0.896 2023-10-04 04:39:30,177 WARNING [train.py:1204] (2/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_51-117576-0 from training. Duration: 0.4 2023-10-04 04:39:31,596 WARNING [train.py:1204] (2/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_197-28164-0_sp1.1 from training. Duration: 0.72725 2023-10-04 04:39:34,794 WARNING [train.py:1204] (2/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_145-137790-0_sp1.1 from training. Duration: 0.7545625 2023-10-04 04:39:35,066 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.conv_skip_rate, batch_count=1528346.6666666667, ans=0.0 2023-10-04 04:39:38,041 WARNING [train.py:1204] (2/4) Exclude cut with ID _47790_210_16187_1_1533862814066_4207519_2-39653-0_sp1.1 from training. Duration: 0.8909375 2023-10-04 04:39:41,433 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_3780_210_2087_1_1526467760_2130483_186-208240-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 04:39:42,830 WARNING [train.py:1204] (2/4) Exclude cut with ID _24055_210_12651_1_1532310907143_976979_92-59356-0 from training. Duration: 0.91 2023-10-04 04:39:43,096 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.feed_forward3.hidden_balancer.prob, batch_count=1528413.3333333333, ans=0.125 2023-10-04 04:39:44,105 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_3910_210_1794_1_1526742306_334530_28-238909-0_sp1.1 from training. Duration: 0.736375 2023-10-04 04:39:44,937 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.1.encoder.layers.1.conv_module2.whiten, num_groups=1, num_channels=256, metric=12.16 vs. limit=15.0 2023-10-04 04:39:46,922 WARNING [train.py:1204] (2/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_269-154726-0 from training. Duration: 0.86 2023-10-04 04:39:52,368 WARNING [train.py:1204] (2/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_326-165900-0_sp0.9 from training. Duration: 0.7666875 2023-10-04 04:39:53,714 INFO [train.py:1046] (2/4) Epoch 44, batch 850, loss[loss=0.1464, simple_loss=0.2249, pruned_loss=0.03397, over 23242.00 frames. ], tot_loss[loss=0.1545, simple_loss=0.2349, pruned_loss=0.03712, over 4639245.78 frames. ], batch size: 119, lr: 2.32e-03, grad_scale: 16.0 2023-10-04 04:39:55,253 WARNING [train.py:1204] (2/4) Exclude cut with ID _5294_210_1747_1_1527249643928_3696179_470-276077-0_sp1.1 from training. Duration: 0.963625 2023-10-04 04:39:55,302 WARNING [train.py:1204] (2/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_372-131163-0 from training. Duration: 0.94 2023-10-04 04:39:57,242 WARNING [train.py:1204] (2/4) Exclude cut with ID _45297_210_2721_1_1533715351381_3264949_351-147136-0_sp1.1 from training. Duration: 0.7909375 2023-10-04 04:39:58,571 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_1072-12671-0_sp1.1 from training. Duration: 0.92725 2023-10-04 04:39:59,986 WARNING [train.py:1204] (2/4) Exclude cut with ID _15133_210_6228_1_1530794512074_3080379_54-198193-0 from training. Duration: 0.94 2023-10-04 04:40:00,003 WARNING [train.py:1204] (2/4) Exclude cut with ID _30196_210_1664_1_1533708496522_7074140_1194-270988-0_sp1.1 from training. Duration: 0.97275 2023-10-04 04:40:01,296 WARNING [train.py:1204] (2/4) Exclude cut with ID _50310_210_15488_1_1534068043345_3641080_289-193637-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 04:40:02,679 WARNING [train.py:1204] (2/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_313-263495-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 04:40:04,116 WARNING [train.py:1204] (2/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_18-130336-0_sp1.1 from training. Duration: 0.7181875 2023-10-04 04:40:05,498 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_47896_210_20508_1_1534492518_5958868_646-287209-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 04:40:05,601 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_294-163793-0 from training. Duration: 0.82 2023-10-04 04:40:07,381 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_8-319957-0 from training. Duration: 0.96 2023-10-04 04:40:07,385 WARNING [train.py:1204] (2/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_1211-226066-0 from training. Duration: 0.76 2023-10-04 04:40:08,836 WARNING [train.py:1204] (2/4) Exclude cut with ID _4779_210_2084_1_1527318022944_3914893_500-285013-0_sp0.9 from training. Duration: 0.7666875 2023-10-04 04:40:10,655 WARNING [train.py:1204] (2/4) Exclude cut with ID _22826_210_9994_1_1531908201522_3279169_347-232268-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 04:40:12,146 WARNING [train.py:1204] (2/4) Exclude cut with ID _29877_210_6228_1_1532397791106_5276149_111-55056-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 04:40:12,172 WARNING [train.py:1204] (2/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_812-271646-0_sp1.1 from training. Duration: 0.92725 2023-10-04 04:40:12,199 WARNING [train.py:1204] (2/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_235-262124-0_sp1.1 from training. Duration: 0.6090625 2023-10-04 04:40:16,934 WARNING [train.py:1204] (2/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_14-16530-0_sp1.1 from training. Duration: 0.97275 2023-10-04 04:40:16,974 WARNING [train.py:1204] (2/4) Exclude cut with ID _44718_210_5816_1_1534050051869_6469030_568-304512-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 04:40:16,992 WARNING [train.py:1204] (2/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_1033-274376-0 from training. Duration: 0.87 2023-10-04 04:40:20,915 WARNING [train.py:1204] (2/4) Exclude cut with ID _4681_210_3525_1_1527251040321_1235713_123-24428-0 from training. Duration: 0.48 2023-10-04 04:40:23,747 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_37818_210_4238_1_1533620910_4720083_251-288764-0_sp1.1 from training. Duration: 0.97275 2023-10-04 04:40:25,096 WARNING [train.py:1204] (2/4) Exclude cut with ID _65585_210_12210_1_1535450451584_3764098_112-294301-0 from training. Duration: 0.88 2023-10-04 04:40:27,993 WARNING [train.py:1204] (2/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_329-113173-0 from training. Duration: 0.62 2023-10-04 04:40:28,668 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.3.encoder.layers.3.conv_module1.whiten, num_groups=1, num_channels=512, metric=4.64 vs. limit=15.0 2023-10-04 04:40:29,934 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6767_210_3273_1_1527855783_1718932_173-256601-0 from training. Duration: 0.8 2023-10-04 04:40:32,607 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9266W0001-627677 from training. Duration: 0.896 2023-10-04 04:40:32,617 WARNING [train.py:1204] (2/4) Exclude cut with ID _49643_210_7989_1_1534640244224_3939031_611-185515-0_sp1.1 from training. Duration: 0.8545625 2023-10-04 04:40:32,621 WARNING [train.py:1204] (2/4) Exclude cut with ID _31649_210_6228_1_1532584608063_4091078_687-237771-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 04:40:32,637 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_12868_210_6753_1_1530702479_3610564_148-172861-0_sp0.9 from training. Duration: 0.5555625 2023-10-04 04:40:34,176 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_5129_210_6400_1_1527161005_3579054_107-69738-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 04:40:35,774 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=1528613.3333333333, ans=0.1 2023-10-04 04:40:37,261 WARNING [train.py:1204] (2/4) Exclude cut with ID _40137_210_12210_1_1533290484046_3795180_36-301164-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 04:40:37,285 WARNING [train.py:1204] (2/4) Exclude cut with ID _7537_210_2084_1_1528502395431_3924359_113-199933-0 from training. Duration: 0.59 2023-10-04 04:40:37,482 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.1.conv_module1.balancer2.prob, batch_count=1528680.0, ans=0.125 2023-10-04 04:40:39,940 WARNING [train.py:1204] (2/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_547-5424-0_sp1.1 from training. Duration: 0.8545625 2023-10-04 04:40:40,773 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.2.encoder.layers.0.feed_forward2.out_whiten, num_groups=1, num_channels=384, metric=4.33 vs. limit=15.0 2023-10-04 04:40:41,872 WARNING [train.py:1204] (2/4) Exclude cut with ID _43552_210_8913_1_1533726031059_3523429_147-284024-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 04:40:41,948 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_294-163793-0_sp1.1 from training. Duration: 0.7454375 2023-10-04 04:40:42,590 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_3780_210_2087_1_1526467760_2130483_170-259044-0_sp1.1 from training. Duration: 0.636375 2023-10-04 04:40:42,792 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.bypass_mid.scale_min, batch_count=1528680.0, ans=0.2 2023-10-04 04:40:42,910 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.conv_skip_rate, batch_count=1528680.0, ans=0.0 2023-10-04 04:40:44,009 WARNING [train.py:1204] (2/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_726-270780-0_sp1.1 from training. Duration: 0.7818125 2023-10-04 04:40:45,392 WARNING [train.py:1204] (2/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_214-291369-0_sp1.1 from training. Duration: 0.57275 2023-10-04 04:40:45,439 WARNING [train.py:1204] (2/4) Exclude cut with ID _21881_210_3525_1_1531825242757_3798380_247-102591-0 from training. Duration: 0.98 2023-10-04 04:40:49,555 WARNING [train.py:1204] (2/4) Exclude cut with ID _8064_210_5399_1_1528372299291_4282973_18-174647-0_sp1.1 from training. Duration: 0.82725 2023-10-04 04:40:49,558 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_415-52611-0_sp1.1 from training. Duration: 0.936375 2023-10-04 04:40:50,927 WARNING [train.py:1204] (2/4) Exclude cut with ID _8394_210_6702_1_1528590504316_3540189_379-49457-0_sp0.9 from training. Duration: 0.9666875 2023-10-04 04:40:50,940 WARNING [train.py:1204] (2/4) Exclude cut with ID _64521_210_14699_1_1535243427259_3815328_368-195106-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 04:40:52,177 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.761e+02 1.992e+02 2.228e+02 2.493e+02 3.552e+02, threshold=4.457e+02, percent-clipped=0.0 2023-10-04 04:40:52,261 WARNING [train.py:1204] (2/4) Exclude cut with ID _28394_210_8913_1_1532430049518_3650118_51-334124-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 04:40:53,737 WARNING [train.py:1204] (2/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_317-161769-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 04:40:56,443 WARNING [train.py:1204] (2/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_40-129756-0_sp0.9 from training. Duration: 0.9 2023-10-04 04:40:57,805 WARNING [train.py:1204] (2/4) Exclude cut with ID _3067_210_2667_1_1526212974640_4503168_119-175776-0_sp0.9 from training. Duration: 0.82225 2023-10-04 04:40:58,082 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.conv_skip_rate, batch_count=1528746.6666666667, ans=0.0 2023-10-04 04:40:59,113 WARNING [train.py:1204] (2/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_1004-304915-0_sp1.1 from training. Duration: 0.92725 2023-10-04 04:40:59,184 WARNING [train.py:1204] (2/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_359-118926-0_sp0.9 from training. Duration: 0.8 2023-10-04 04:40:59,641 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.5.encoder.layers.0.conv_module1.whiten, num_groups=1, num_channels=256, metric=7.03 vs. limit=15.0 2023-10-04 04:41:05,402 WARNING [train.py:1204] (2/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_51-200220-0_sp0.9 from training. Duration: 0.62225 2023-10-04 04:41:08,422 INFO [train.py:1046] (2/4) Epoch 44, batch 900, loss[loss=0.1511, simple_loss=0.2301, pruned_loss=0.03604, over 23630.00 frames. ], tot_loss[loss=0.1555, simple_loss=0.2358, pruned_loss=0.03759, over 4663665.02 frames. ], batch size: 149, lr: 2.32e-03, grad_scale: 16.0 2023-10-04 04:41:08,483 WARNING [train.py:1204] (2/4) Exclude cut with ID _46683_210_9642_1_1533730913492_4591116_530-326202-0_sp1.1 from training. Duration: 0.936375 2023-10-04 04:41:08,545 WARNING [train.py:1204] (2/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_567-135114-0 from training. Duration: 0.87 2023-10-04 04:41:08,572 WARNING [train.py:1204] (2/4) Exclude cut with ID _66863_210_18053_1_1535698758334_8590319_1092-271784-0_sp1.1 from training. Duration: 0.87275 2023-10-04 04:41:09,855 WARNING [train.py:1204] (2/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_762-282614-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 04:41:11,772 WARNING [train.py:1204] (2/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_280-120942-0 from training. Duration: 0.77 2023-10-04 04:41:17,738 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_31583_210_3852_1_1532595851_4024413_27-170931-0_sp1.1 from training. Duration: 0.963625 2023-10-04 04:41:20,486 WARNING [train.py:1204] (2/4) Exclude cut with ID _12369_210_4381_1_1530356078581_4388520_266-271628-0_sp1.1 from training. Duration: 0.92725 2023-10-04 04:41:20,526 WARNING [train.py:1204] (2/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_41-31157-0 from training. Duration: 0.57 2023-10-04 04:41:23,261 WARNING [train.py:1204] (2/4) Exclude cut with ID _21878_210_3525_1_1531738772002_7264330_113-18163-0_sp1.1 from training. Duration: 0.8090625 2023-10-04 04:41:23,425 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.bypass.scale_min, batch_count=1528880.0, ans=0.2 2023-10-04 04:41:24,537 WARNING [train.py:1204] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_147-111137-0 from training. Duration: 0.95 2023-10-04 04:41:24,615 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1083-220122-0_sp1.1 from training. Duration: 0.463625 2023-10-04 04:41:26,030 WARNING [train.py:1204] (2/4) Exclude cut with ID _14884_210_2252_1_1530707459689_5561250_591-173406-0_sp1.1 from training. Duration: 0.87275 2023-10-04 04:41:26,043 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_24677_210_1794_1_1532002693_4341296_149-303793-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 04:41:27,418 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_12868_210_6753_1_1530702479_3610564_133-564-0_sp1.1 from training. Duration: 0.7181875 2023-10-04 04:41:27,441 WARNING [train.py:1204] (2/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_625-338546-0_sp0.9 from training. Duration: 0.97775 2023-10-04 04:41:34,995 WARNING [train.py:1204] (2/4) Exclude cut with ID _25882_210_3036_1_1532775665815_6479510_624-320169-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 04:41:35,002 WARNING [train.py:1204] (2/4) Exclude cut with ID _20622_210_3852_1_1531622427080_1093839_127-325326-0_sp1.1 from training. Duration: 0.92725 2023-10-04 04:41:36,337 WARNING [train.py:1204] (2/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_464-33931-0_sp1.1 from training. Duration: 0.4909375 2023-10-04 04:41:39,622 WARNING [train.py:1204] (2/4) Exclude cut with ID _66112_210_10635_1_1535590236305_3801484_120-304588-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 04:41:43,401 WARNING [train.py:1204] (2/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_110-130961-0 from training. Duration: 0.47 2023-10-04 04:41:46,205 WARNING [train.py:1204] (2/4) Exclude cut with ID _16908_210_5724_1_1531119157248_3772700_64-54020-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 04:41:49,016 WARNING [train.py:1204] (2/4) Exclude cut with ID _4068_210_2629_1_1526697350395_1546259_224-288693-0_sp1.1 from training. Duration: 0.536375 2023-10-04 04:41:50,319 WARNING [train.py:1204] (2/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_191-41023-0_sp0.9 from training. Duration: 0.788875 2023-10-04 04:41:51,684 WARNING [train.py:1204] (2/4) Exclude cut with ID ID0205W0255-783002 from training. Duration: 0.958 2023-10-04 04:41:53,089 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_162-262717-0 from training. Duration: 0.83 2023-10-04 04:41:58,653 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_224-322394-0_sp1.1 from training. Duration: 0.6 2023-10-04 04:41:58,660 WARNING [train.py:1204] (2/4) Exclude cut with ID _4866_210_2577_1_1527404412284_4364659_133-1696-0_sp1.1 from training. Duration: 0.9 2023-10-04 04:42:00,077 WARNING [train.py:1204] (2/4) Exclude cut with ID _6275_210_2084_1_1528110062213_7669049_973-198085-0_sp1.1 from training. Duration: 0.6818125 2023-10-04 04:42:07,420 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_47449_210_5399_1_1534076445_4006929_410-237806-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 04:42:07,430 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7543_210_6753_1_1528543318_4552_1-85072-0_sp0.9 from training. Duration: 0.92225 2023-10-04 04:42:08,902 WARNING [train.py:1204] (2/4) Exclude cut with ID _65263_210_22356_1_1535763598691_3921037_108-21902-0 from training. Duration: 0.99 2023-10-04 04:42:08,906 WARNING [train.py:1204] (2/4) Exclude cut with ID _65585_210_12210_1_1535450451584_3764098_309-174818-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 04:42:12,120 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_309-65177-0 from training. Duration: 0.88 2023-10-04 04:42:14,177 WARNING [train.py:1204] (2/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_363-185703-0_sp1.1 from training. Duration: 0.863625 2023-10-04 04:42:15,399 WARNING [train.py:1204] (2/4) Exclude cut with ID _42326_210_7770_1_1533814282531_3707269_442-238692-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 04:42:17,438 WARNING [train.py:1204] (2/4) Exclude cut with ID _69884_210_9771_1_1535846392818_4166169_226-254446-0_sp1.1 from training. Duration: 0.936375 2023-10-04 04:42:17,462 WARNING [train.py:1204] (2/4) Exclude cut with ID _29853_210_7497_1_1532505600851_6642110_287-299903-0_sp1.1 from training. Duration: 0.963625 2023-10-04 04:42:21,717 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1015-343884-0 from training. Duration: 0.62 2023-10-04 04:42:21,755 WARNING [train.py:1204] (2/4) Exclude cut with ID ID0305W0008-833402 from training. Duration: 0.91 2023-10-04 04:42:23,054 INFO [train.py:1046] (2/4) Epoch 44, batch 950, loss[loss=0.1517, simple_loss=0.231, pruned_loss=0.03626, over 23451.00 frames. ], tot_loss[loss=0.1557, simple_loss=0.2359, pruned_loss=0.03779, over 4662258.93 frames. ], batch size: 134, lr: 2.32e-03, grad_scale: 8.0 2023-10-04 04:42:23,148 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6673_210_3273_1_1527767790_3173240_351-167954-0_sp1.1 from training. Duration: 0.436375 2023-10-04 04:42:23,168 WARNING [train.py:1204] (2/4) Exclude cut with ID _6456_210_1534_1_1527681634212_3181049_345-252877-0 from training. Duration: 0.64 2023-10-04 04:42:24,650 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_13-205182-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 04:42:26,106 WARNING [train.py:1204] (2/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_427-208901-0 from training. Duration: 0.68 2023-10-04 04:42:33,004 WARNING [train.py:1204] (2/4) Exclude cut with ID _49039_210_6315_1_1533967537541_3846118_9-148921-0_sp1.1 from training. Duration: 0.97275 2023-10-04 04:42:36,149 WARNING [train.py:1204] (2/4) Exclude cut with ID _59493_210_17282_1_1535078419077_3225879_143-70317-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 04:42:37,550 WARNING [train.py:1204] (2/4) Exclude cut with ID _27967_210_12140_1_1532588391859_8419130_1056-236369-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 04:42:37,596 WARNING [train.py:1204] (2/4) Exclude cut with ID _6537_210_1883_1_1527987559710_3597129_382-133062-0_sp0.9 from training. Duration: 0.7555625 2023-10-04 04:42:39,028 WARNING [train.py:1204] (2/4) Exclude cut with ID ID0376W0255-869957 from training. Duration: 0.9510625 2023-10-04 04:42:42,350 WARNING [train.py:1204] (2/4) Exclude cut with ID _49371_210_12071_1_1533952761021_3812190_483-240691-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 04:42:44,197 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_12868_210_6753_1_1530701932_530275_18-76322-0_sp1.1 from training. Duration: 0.7909375 2023-10-04 04:42:45,607 WARNING [train.py:1204] (2/4) Exclude cut with ID _53950_210_5816_1_1534417264919_3862951_367-26344-0_sp1.1 from training. Duration: 0.97275 2023-10-04 04:42:45,627 WARNING [train.py:1204] (2/4) Exclude cut with ID _5444_210_2151_1_1527418808464_5312210_634-194174-0_sp1.1 from training. Duration: 0.8818125 2023-10-04 04:42:45,649 WARNING [train.py:1204] (2/4) Exclude cut with ID _56494_210_4220_1_1535284837625_3033169_69-138150-0 from training. Duration: 0.94 2023-10-04 04:42:48,881 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7543_210_6753_1_1528544010_1110402_25-183860-0_sp0.9 from training. Duration: 0.52225 2023-10-04 04:42:48,977 WARNING [train.py:1204] (2/4) Exclude cut with ID _40284_210_2084_1_1533513679814_3704279_287-191918-0_sp1.1 from training. Duration: 0.963625 2023-10-04 04:42:50,374 WARNING [train.py:1204] (2/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_291-270881-0 from training. Duration: 0.97 2023-10-04 04:42:50,421 WARNING [train.py:1204] (2/4) Exclude cut with ID _12555_210_8750_1_1530075594402_3780239_449-267986-0_sp0.9 from training. Duration: 0.92225 2023-10-04 04:42:54,585 WARNING [train.py:1204] (2/4) Exclude cut with ID _3992_210_1747_1_1526644816031_3731060_268-64520-0_sp1.1 from training. Duration: 0.963625 2023-10-04 04:42:54,598 WARNING [train.py:1204] (2/4) Exclude cut with ID _39411_210_5986_1_1533538825030_3343649_173-238248-0_sp0.9 from training. Duration: 0.92225 2023-10-04 04:42:54,629 WARNING [train.py:1204] (2/4) Exclude cut with ID _32704_210_8954_1_1533017488840_6747370_828-313097-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 04:42:55,173 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.3.encoder.layers.3.conv_module2.whiten, num_groups=1, num_channels=512, metric=8.78 vs. limit=15.0 2023-10-04 04:42:55,968 WARNING [train.py:1204] (2/4) Exclude cut with ID _26907_210_7234_1_1532221740694_3205263_337-2494-0 from training. Duration: 0.88 2023-10-04 04:42:56,176 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.feed_forward1.hidden_balancer.prob, batch_count=1529280.0, ans=0.125 2023-10-04 04:42:57,487 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.conv_skip_rate, batch_count=1529280.0, ans=0.0 2023-10-04 04:42:58,734 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_229-45300-0_sp1.1 from training. Duration: 0.5181875 2023-10-04 04:43:00,115 WARNING [train.py:1204] (2/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_300-27235-0_sp1.1 from training. Duration: 0.7909375 2023-10-04 04:43:01,465 WARNING [train.py:1204] (2/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_48-338976-0_sp1.1 from training. Duration: 0.8090625 2023-10-04 04:43:06,046 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_13091_210_7925_1_1530609447_8987354_792-195353-0_sp1.1 from training. Duration: 0.936375 2023-10-04 04:43:06,052 WARNING [train.py:1204] (2/4) Exclude cut with ID _56524_210_9642_1_1534572011779_3619170_311-54500-0_sp1.1 from training. Duration: 0.97275 2023-10-04 04:43:09,417 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7543_210_6753_1_1528544010_1110402_25-183860-0 from training. Duration: 0.47 2023-10-04 04:43:12,281 WARNING [train.py:1204] (2/4) Exclude cut with ID 2411-132532-0017-82279-0_sp1.1 from training. Duration: 0.9681875 2023-10-04 04:43:12,282 WARNING [train.py:1204] (2/4) Exclude cut with ID _14170_210_5219_1_1530525949868_4469650_272-249985-0_sp0.9 from training. Duration: 0.7444375 2023-10-04 04:43:12,952 WARNING [train.py:1204] (2/4) Exclude cut with ID _55644_210_11856_1_1534593599582_4103719_320-229006-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 04:43:14,347 WARNING [train.py:1204] (2/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_177-100554-0_sp1.1 from training. Duration: 0.963625 2023-10-04 04:43:14,349 WARNING [train.py:1204] (2/4) Exclude cut with ID _14998_210_5219_1_1530705582080_7474970_787-294416-0_sp1.1 from training. Duration: 0.7454375 2023-10-04 04:43:18,042 WARNING [train.py:1204] (2/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_318-59097-0 from training. Duration: 0.76 2023-10-04 04:43:19,353 WARNING [train.py:1204] (2/4) Exclude cut with ID _12369_210_4381_1_1530356078581_4388520_480-13146-0_sp1.1 from training. Duration: 0.77275 2023-10-04 04:43:20,840 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_469-338558-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 04:43:22,164 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_40896_210_15710_1_1533433956_7100802_367-126897-0_sp1.1 from training. Duration: 0.963625 2023-10-04 04:43:22,186 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6545_210_2040_1_1527774670_4245258_379-14182-0 from training. Duration: 0.8 2023-10-04 04:43:23,472 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.575e+02 2.090e+02 2.361e+02 2.661e+02 3.886e+02, threshold=4.722e+02, percent-clipped=0.0 2023-10-04 04:43:23,556 WARNING [train.py:1204] (2/4) Exclude cut with ID _21631_210_2253_1_1531907880778_3160339_197-79498-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 04:43:23,567 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_12868_210_6753_1_1530702479_3610564_416-293746-0_sp1.1 from training. Duration: 0.8181875 2023-10-04 04:43:24,794 WARNING [train.py:1204] (2/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_52-211683-0 from training. Duration: 0.9 2023-10-04 04:43:25,083 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.nonlin_attention.balancer.prob, batch_count=1529413.3333333333, ans=0.125 2023-10-04 04:43:26,539 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.conv_skip_rate, batch_count=1529413.3333333333, ans=0.0 2023-10-04 04:43:27,672 WARNING [train.py:1204] (2/4) Exclude cut with ID _48678_210_5663_1_1533988830947_3769199_306-255886-0_sp0.9 from training. Duration: 0.8555625 2023-10-04 04:43:30,465 WARNING [train.py:1204] (2/4) Exclude cut with ID _65134_210_14763_1_1535335393985_3505368_296-24733-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 04:43:33,258 WARNING [train.py:1204] (2/4) Exclude cut with ID _14898_210_9206_1_1530766812377_4177190_335-99364-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 04:43:34,601 WARNING [train.py:1204] (2/4) Exclude cut with ID _54871_210_22356_1_1534665798253_3856359_19-342632-0 from training. Duration: 0.91 2023-10-04 04:43:34,624 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_53071_210_16554_1_1534490405_2954066_265-170472-0 from training. Duration: 0.83 2023-10-04 04:43:34,842 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.conv_skip_rate, batch_count=1529413.3333333333, ans=0.0 2023-10-04 04:43:37,331 INFO [train.py:1046] (2/4) Epoch 44, batch 1000, loss[loss=0.1361, simple_loss=0.2013, pruned_loss=0.0355, over 23580.00 frames. ], tot_loss[loss=0.1547, simple_loss=0.2343, pruned_loss=0.03751, over 4660928.28 frames. ], batch size: 256, lr: 2.32e-03, grad_scale: 8.0 2023-10-04 04:43:39,450 WARNING [train.py:1204] (2/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_189-298172-0_sp1.1 from training. Duration: 0.963625 2023-10-04 04:43:42,227 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7615_210_4211_1_1528110965_4580160_72-235211-0 from training. Duration: 0.76 2023-10-04 04:43:43,655 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.0.layers.1.whiten, num_groups=1, num_channels=192, metric=4.09 vs. limit=12.0 2023-10-04 04:43:44,041 WARNING [train.py:1204] (2/4) Exclude cut with ID _47790_210_16187_1_1533862814066_4207519_155-90874-0_sp1.1 from training. Duration: 0.936375 2023-10-04 04:43:47,426 WARNING [train.py:1204] (2/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_164-216310-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 04:43:49,497 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_46013_210_7234_1_1533776362_3744032_463-52378-0 from training. Duration: 0.86 2023-10-04 04:43:49,501 WARNING [train.py:1204] (2/4) Exclude cut with ID _61133_210_9969_1_1535196777227_3696409_333-270325-0 from training. Duration: 0.93 2023-10-04 04:43:54,862 WARNING [train.py:1204] (2/4) Exclude cut with ID _5172_210_1883_1_1527246062567_3577219_18-101380-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 04:43:54,870 WARNING [train.py:1204] (2/4) Exclude cut with ID _30800_210_22382_1_1532499347904_4515959_577-236858-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 04:43:54,995 WARNING [train.py:1204] (2/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_1006-177345-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 04:43:57,831 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6291_210_3273_1_1527683649_1078498_4-266348-0 from training. Duration: 0.78 2023-10-04 04:44:01,766 WARNING [train.py:1204] (2/4) Exclude cut with ID _55857_210_2346_1_1534644007831_6898209_745-98391-0 from training. Duration: 0.76 2023-10-04 04:44:03,189 WARNING [train.py:1204] (2/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_375-244976-0 from training. Duration: 0.75 2023-10-04 04:44:04,446 WARNING [train.py:1204] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_4-248404-0_sp1.1 from training. Duration: 0.97275 2023-10-04 04:44:05,893 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8453_210_4211_1_1528502940_8562730_596-306820-0 from training. Duration: 0.97 2023-10-04 04:44:06,166 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.bypass.skip_rate, batch_count=1529613.3333333333, ans=0.07 2023-10-04 04:44:06,253 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.feed_forward2.hidden_balancer.prob, batch_count=1529613.3333333333, ans=0.125 2023-10-04 04:44:07,365 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_211-339622-0_sp0.9 from training. Duration: 0.5 2023-10-04 04:44:07,392 WARNING [train.py:1204] (2/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_393-57888-0 from training. Duration: 0.64 2023-10-04 04:44:07,627 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.balancer2.prob, batch_count=1529613.3333333333, ans=0.125 2023-10-04 04:44:08,805 WARNING [train.py:1204] (2/4) Exclude cut with ID _23825_210_13449_1_1531915313526_3497150_143-343575-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 04:44:08,872 WARNING [train.py:1204] (2/4) Exclude cut with ID _58605_210_12210_1_1534723195939_3904259_36-12256-0_sp1.1 from training. Duration: 0.936375 2023-10-04 04:44:17,497 WARNING [train.py:1204] (2/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_38-67795-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 04:44:17,549 WARNING [train.py:1204] (2/4) Exclude cut with ID _64631_210_27767_1_1535349580068_4165160_308-327832-0_sp1.1 from training. Duration: 0.92725 2023-10-04 04:44:18,846 WARNING [train.py:1204] (2/4) Exclude cut with ID _37086_210_9038_1_1532999540891_7021250_726-81176-0_sp1.1 from training. Duration: 0.936375 2023-10-04 04:44:19,000 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.feed_forward2.hidden_balancer.prob, batch_count=1529613.3333333333, ans=0.125 2023-10-04 04:44:20,197 WARNING [train.py:1204] (2/4) Exclude cut with ID _47790_210_16187_1_1533862814066_4207519_47-141521-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 04:44:20,204 WARNING [train.py:1204] (2/4) Exclude cut with ID _3067_210_2667_1_1526212974640_4503168_250-285937-0 from training. Duration: 0.8 2023-10-04 04:44:20,237 WARNING [train.py:1204] (2/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_239-24000-0_sp1.1 from training. Duration: 0.97275 2023-10-04 04:44:21,692 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_3371_210_1794_1_1526384462_5580777_396-339812-0_sp0.9 from training. Duration: 0.9444375 2023-10-04 04:44:21,746 WARNING [train.py:1204] (2/4) Exclude cut with ID _16909_210_5724_1_1531292079912_4118919_580-173454-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 04:44:21,790 WARNING [train.py:1204] (2/4) Exclude cut with ID IC0124W0002-61308 from training. Duration: 0.982 2023-10-04 04:44:24,449 WARNING [train.py:1204] (2/4) Exclude cut with ID _31602_210_5854_1_1532677021456_4458250_249-96604-0 from training. Duration: 0.89 2023-10-04 04:44:24,613 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.feed_forward3.hidden_balancer.prob, batch_count=1529680.0, ans=0.125 2023-10-04 04:44:25,868 WARNING [train.py:1204] (2/4) Exclude cut with ID _69584_210_5219_1_1535785133775_4044450_325-7656-0 from training. Duration: 0.86 2023-10-04 04:44:28,481 WARNING [train.py:1204] (2/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_478-162247-0 from training. Duration: 0.82 2023-10-04 04:44:29,921 WARNING [train.py:1204] (2/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_686-54383-0_sp1.1 from training. Duration: 0.7909375 2023-10-04 04:44:30,106 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.conv_skip_rate, batch_count=1529680.0, ans=0.0 2023-10-04 04:44:35,800 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.bypass.skip_rate, batch_count=1529746.6666666667, ans=0.04949747468305833 2023-10-04 04:44:36,867 WARNING [train.py:1204] (2/4) Exclude cut with ID _29303_210_2575_1_1532392237303_7410649_85-270092-0_sp1.1 from training. Duration: 0.936375 2023-10-04 04:44:36,878 WARNING [train.py:1204] (2/4) Exclude cut with ID _2322_210_2667_1_1525604548971_3656299_440-80494-0_sp1.1 from training. Duration: 0.8 2023-10-04 04:44:36,922 WARNING [train.py:1204] (2/4) Exclude cut with ID _47346_210_9476_1_1533815971553_3536160_201-40644-0_sp1.1 from training. Duration: 0.936375 2023-10-04 04:44:38,260 WARNING [train.py:1204] (2/4) Exclude cut with ID _56235_210_10640_1_1534557335235_4161099_343-139898-0_sp1.1 from training. Duration: 0.87275 2023-10-04 04:44:41,431 WARNING [train.py:1204] (2/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_1012-279473-0 from training. Duration: 0.67 2023-10-04 04:44:41,530 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7931_210_1794_1_1528594728_5564059_280-279560-0_sp1.1 from training. Duration: 0.8909375 2023-10-04 04:44:42,905 WARNING [train.py:1204] (2/4) Exclude cut with ID _8263_210_1664_1_1528439452891_970220_68-181805-0 from training. Duration: 0.64 2023-10-04 04:44:42,944 WARNING [train.py:1204] (2/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_605-133395-0 from training. Duration: 0.66 2023-10-04 04:44:44,827 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_43-171109-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 04:44:44,831 WARNING [train.py:1204] (2/4) Exclude cut with ID _40094_210_16380_1_1533294005773_3641159_107-280739-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 04:44:44,993 WARNING [train.py:1204] (2/4) Exclude cut with ID _14821_210_8750_1_1530680503991_6829359_2-227151-0_sp0.9 from training. Duration: 0.92225 2023-10-04 04:44:48,174 WARNING [train.py:1204] (2/4) Exclude cut with ID _14268_210_9206_1_1530622771285_4714099_331-105351-0_sp1.1 from training. Duration: 0.5909375 2023-10-04 04:44:48,873 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.3.encoder.layers.2.self_attn_weights.whiten_keys, num_groups=8, num_channels=256, metric=3.17 vs. limit=6.0 2023-10-04 04:44:50,803 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_26326_210_7925_1_1533002493_3592406_451-82741-0_sp1.1 from training. Duration: 0.97275 2023-10-04 04:44:52,199 INFO [train.py:1046] (2/4) Epoch 44, batch 1050, loss[loss=0.1372, simple_loss=0.189, pruned_loss=0.04274, over 19187.00 frames. ], tot_loss[loss=0.1539, simple_loss=0.2331, pruned_loss=0.0373, over 4665996.08 frames. ], batch size: 389, lr: 2.32e-03, grad_scale: 8.0 2023-10-04 04:44:52,359 WARNING [train.py:1204] (2/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_191-59784-0_sp0.9 from training. Duration: 0.97775 2023-10-04 04:44:53,717 WARNING [train.py:1204] (2/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_445-117398-0_sp1.1 from training. Duration: 0.8090625 2023-10-04 04:44:55,143 WARNING [train.py:1204] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_605-257205-0_sp0.9 from training. Duration: 0.6555625 2023-10-04 04:44:56,544 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_50050_210_16223_1_1535435796_7290138_592-258841-0_sp1.1 from training. Duration: 0.936375 2023-10-04 04:44:59,238 WARNING [train.py:1204] (2/4) Exclude cut with ID _14348_210_8341_1_1530599434996_3778640_240-232370-0_sp0.9 from training. Duration: 0.9333125 2023-10-04 04:45:01,878 WARNING [train.py:1204] (2/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_239-316802-0_sp0.9 from training. Duration: 0.7666875 2023-10-04 04:45:03,241 WARNING [train.py:1204] (2/4) Exclude cut with ID _45560_210_21982_1_1533812603951_3584999_111-6376-0_sp1.1 from training. Duration: 0.763625 2023-10-04 04:45:03,565 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=1529813.3333333333, ans=0.1 2023-10-04 04:45:04,836 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.balancer1.prob, batch_count=1529880.0, ans=0.125 2023-10-04 04:45:06,044 WARNING [train.py:1204] (2/4) Exclude cut with ID _54189_210_16380_1_1534332670039_3911710_606-311481-0_sp1.1 from training. Duration: 0.963625 2023-10-04 04:45:07,449 WARNING [train.py:1204] (2/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_237-332027-0_sp1.1 from training. Duration: 0.863625 2023-10-04 04:45:07,481 WARNING [train.py:1204] (2/4) Exclude cut with ID _66070_210_7640_1_1535441393096_7805310_489-292631-0_sp0.9 from training. Duration: 0.87775 2023-10-04 04:45:08,763 WARNING [train.py:1204] (2/4) Exclude cut with ID _49728_210_12651_1_1534041034294_6788150_624-195301-0_sp1.1 from training. Duration: 0.77275 2023-10-04 04:45:08,819 WARNING [train.py:1204] (2/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_44-79427-0 from training. Duration: 0.98 2023-10-04 04:45:10,152 WARNING [train.py:1204] (2/4) Exclude cut with ID _35350_210_16040_1_1533040191183_4075299_490-198354-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 04:45:11,496 WARNING [train.py:1204] (2/4) Exclude cut with ID _5166_210_2084_1_1527073197160_4087993_309-222545-0 from training. Duration: 0.84 2023-10-04 04:45:13,615 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_13091_210_7925_1_1530609447_8987354_790-199469-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 04:45:13,621 WARNING [train.py:1204] (2/4) Exclude cut with ID _21632_210_2253_1_1531994001765_3744140_110-274654-0 from training. Duration: 0.82 2023-10-04 04:45:13,627 WARNING [train.py:1204] (2/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_498-40153-0_sp0.9 from training. Duration: 0.7 2023-10-04 04:45:14,200 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.5.encoder.layers.0.conv_module2.whiten, num_groups=1, num_channels=256, metric=5.56 vs. limit=15.0 2023-10-04 04:45:21,062 WARNING [train.py:1204] (2/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_363-282909-0_sp1.1 from training. Duration: 0.936375 2023-10-04 04:45:21,123 WARNING [train.py:1204] (2/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_178-177582-0_sp0.9 from training. Duration: 0.82225 2023-10-04 04:45:21,134 WARNING [train.py:1204] (2/4) Exclude cut with ID _24612_210_8913_1_1531998054193_3806359_357-35487-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 04:45:25,302 WARNING [train.py:1204] (2/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_138-332773-0 from training. Duration: 0.81 2023-10-04 04:45:25,330 WARNING [train.py:1204] (2/4) Exclude cut with ID _7624_210_1531_1_1528109802198_4933101_418-349834-0 from training. Duration: 0.6 2023-10-04 04:45:25,364 WARNING [train.py:1204] (2/4) Exclude cut with ID _7537_210_2084_1_1528502395431_3924359_14-312906-0_sp0.9 from training. Duration: 0.9333125 2023-10-04 04:45:29,452 WARNING [train.py:1204] (2/4) Exclude cut with ID _60057_210_10640_1_1534903065793_3333150_362-230377-0 from training. Duration: 0.87 2023-10-04 04:45:32,292 WARNING [train.py:1204] (2/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_138-64210-0 from training. Duration: 0.73 2023-10-04 04:45:32,359 WARNING [train.py:1204] (2/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_454-339504-0_sp1.1 from training. Duration: 0.97275 2023-10-04 04:45:32,592 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.balancer2.prob, batch_count=1529946.6666666667, ans=0.125 2023-10-04 04:45:36,498 WARNING [train.py:1204] (2/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_508-335369-0_sp0.9 from training. Duration: 0.5333125 2023-10-04 04:45:37,797 WARNING [train.py:1204] (2/4) Exclude cut with ID _5608_210_2106_1_1527415439089_6819768_909-82767-0_sp0.9 from training. Duration: 0.67775 2023-10-04 04:45:37,829 WARNING [train.py:1204] (2/4) Exclude cut with ID _6694_210_4599_1_1527852637490_3965529_99-11611-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 04:45:39,153 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_2655_210_2087_1_1525688959_3897400_52-279899-0_sp1.1 from training. Duration: 0.62725 2023-10-04 04:45:40,759 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.bypass.skip_rate, batch_count=1530013.3333333333, ans=0.09899494936611666 2023-10-04 04:45:42,054 WARNING [train.py:1204] (2/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_375-245677-0_sp1.1 from training. Duration: 0.736375 2023-10-04 04:45:45,139 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_23850_210_4238_1_1532232597_1632433_32-185135-0 from training. Duration: 0.86 2023-10-04 04:45:47,217 WARNING [train.py:1204] (2/4) Exclude cut with ID _15113_210_8254_1_1530842438579_3716279_209-54533-0 from training. Duration: 0.96 2023-10-04 04:45:49,086 WARNING [train.py:1204] (2/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_401-53423-0 from training. Duration: 0.55 2023-10-04 04:45:49,126 WARNING [train.py:1204] (2/4) Exclude cut with ID _14867_210_1968_1_1530684179890_3846969_319-250868-0_sp1.1 from training. Duration: 0.9 2023-10-04 04:45:49,142 WARNING [train.py:1204] (2/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_106-30782-0_sp0.9 from training. Duration: 0.888875 2023-10-04 04:45:50,480 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.650e+02 2.023e+02 2.342e+02 2.796e+02 4.637e+02, threshold=4.684e+02, percent-clipped=0.0 2023-10-04 04:45:50,642 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_5960_210_1794_1_1527403865_3990999_515-2613-0 from training. Duration: 0.88 2023-10-04 04:45:54,779 WARNING [train.py:1204] (2/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_798-106179-0_sp0.9 from training. Duration: 0.92225 2023-10-04 04:45:56,218 WARNING [train.py:1204] (2/4) Exclude cut with ID _14227_210_4381_1_1530701804286_3940259_125-275642-0_sp1.1 from training. Duration: 0.9 2023-10-04 04:45:56,220 WARNING [train.py:1204] (2/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_79-200392-0_sp1.1 from training. Duration: 0.8818125 2023-10-04 04:45:56,278 WARNING [train.py:1204] (2/4) Exclude cut with ID _48836_210_12210_1_1534075848293_3904150_429-35475-0_sp0.9 from training. Duration: 0.988875 2023-10-04 04:45:56,301 WARNING [train.py:1204] (2/4) Exclude cut with ID _69884_210_9771_1_1535846392818_4166169_224-172517-0_sp1.1 from training. Duration: 0.97275 2023-10-04 04:45:57,783 INFO [scaling.py:1118] (2/4) WithLoss: name=encoder.encoders.2.encoder.layers.1.self_attn_weights, loss-sum=0.000e+00 2023-10-04 04:46:01,772 WARNING [train.py:1204] (2/4) Exclude cut with ID _15113_210_8254_1_1530842438579_3716279_76-232507-0_sp1.1 from training. Duration: 0.97275 2023-10-04 04:46:01,775 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_135-5501-0 from training. Duration: 0.98 2023-10-04 04:46:01,922 WARNING [train.py:1204] (2/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_563-347832-0_sp0.9 from training. Duration: 0.988875 2023-10-04 04:46:03,214 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6786_210_1388_1_1527839699_4194495_273-149841-0 from training. Duration: 0.92 2023-10-04 04:46:03,240 WARNING [train.py:1204] (2/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_172-215932-0 from training. Duration: 0.66 2023-10-04 04:46:04,595 INFO [train.py:1046] (2/4) Epoch 44, batch 1100, loss[loss=0.1455, simple_loss=0.2319, pruned_loss=0.02952, over 24474.00 frames. ], tot_loss[loss=0.1536, simple_loss=0.2334, pruned_loss=0.03687, over 4685579.47 frames. ], batch size: 63, lr: 2.32e-03, grad_scale: 8.0 2023-10-04 04:46:04,638 WARNING [train.py:1204] (2/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_939-132910-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 04:46:07,374 WARNING [train.py:1204] (2/4) Exclude cut with ID _49371_210_12071_1_1533952761021_3812190_16-246001-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 04:46:11,600 WARNING [train.py:1204] (2/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_680-189165-0_sp1.1 from training. Duration: 0.936375 2023-10-04 04:46:11,901 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.feed_forward2.hidden_balancer.prob, batch_count=1530146.6666666667, ans=0.125 2023-10-04 04:46:18,375 WARNING [train.py:1204] (2/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_53-300544-0_sp1.1 from training. Duration: 0.6090625 2023-10-04 04:46:19,694 WARNING [train.py:1204] (2/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_540-275685-0_sp1.1 from training. Duration: 0.8090625 2023-10-04 04:46:19,720 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_38965_210_4852_1_1533174906_3484735_453-106579-0_sp1.1 from training. Duration: 0.92725 2023-10-04 04:46:19,759 WARNING [train.py:1204] (2/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_105-328174-0 from training. Duration: 0.76 2023-10-04 04:46:21,798 WARNING [train.py:1204] (2/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_279-126205-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 04:46:23,240 WARNING [train.py:1204] (2/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_5-55893-0_sp0.9 from training. Duration: 0.611125 2023-10-04 04:46:27,238 WARNING [train.py:1204] (2/4) Exclude cut with ID _13678_210_7497_1_1530530069226_3463170_141-10835-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 04:46:29,964 WARNING [train.py:1204] (2/4) Exclude cut with ID _22083_210_8254_1_1531702862439_3678970_201-85738-0_sp1.1 from training. Duration: 0.8181875 2023-10-04 04:46:29,993 WARNING [train.py:1204] (2/4) Exclude cut with ID _4697_210_2069_1_1526815801953_3823098_51-1049-0 from training. Duration: 0.75 2023-10-04 04:46:31,293 WARNING [train.py:1204] (2/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_206-10097-0_sp0.9 from training. Duration: 0.5444375 2023-10-04 04:46:32,689 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_4528_210_3273_1_1527076654_3276777_360-63661-0_sp1.1 from training. Duration: 0.92725 2023-10-04 04:46:32,699 WARNING [train.py:1204] (2/4) Exclude cut with ID _64086_210_7994_1_1535270417019_7435949_449-31567-0_sp1.1 from training. Duration: 0.8818125 2023-10-04 04:46:35,676 WARNING [train.py:1204] (2/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_245-174553-0_sp0.9 from training. Duration: 0.97775 2023-10-04 04:46:36,924 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6545_210_2040_1_1527774670_4245258_379-14182-0_sp1.1 from training. Duration: 0.72725 2023-10-04 04:46:41,060 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_3133_210_2087_1_1526106446_2324690_256-206070-0_sp1.1 from training. Duration: 0.963625 2023-10-04 04:46:44,988 WARNING [train.py:1204] (2/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_278-104676-0 from training. Duration: 0.98 2023-10-04 04:46:45,064 WARNING [train.py:1204] (2/4) Exclude cut with ID IC0671W0065-331908 from training. Duration: 0.896 2023-10-04 04:46:47,565 WARNING [train.py:1204] (2/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_244-129917-0_sp1.1 from training. Duration: 0.97275 2023-10-04 04:46:48,931 WARNING [train.py:1204] (2/4) Exclude cut with ID _7852_210_5219_1_1528286471316_8139400_1129-170857-0_sp1.1 from training. Duration: 0.97275 2023-10-04 04:46:49,007 WARNING [train.py:1204] (2/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_443-30034-0_sp0.9 from training. Duration: 0.711125 2023-10-04 04:46:49,150 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.1.bypass.skip_rate, batch_count=1530346.6666666667, ans=0.035 2023-10-04 04:46:50,291 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_31583_210_3852_1_1532595851_4024413_250-332999-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 04:46:52,245 WARNING [train.py:1204] (2/4) Exclude cut with ID _8772_210_1850_1_1528635595972_3664011_247-336448-0 from training. Duration: 0.74 2023-10-04 04:46:53,564 WARNING [train.py:1204] (2/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_567-135114-0_sp0.9 from training. Duration: 0.9666875 2023-10-04 04:46:53,568 WARNING [train.py:1204] (2/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_557-349534-0_sp1.1 from training. Duration: 0.82725 2023-10-04 04:46:53,589 WARNING [train.py:1204] (2/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_1341-26728-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 04:46:53,637 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_40222_210_6228_1_1533385298_4481939_375-328051-0_sp1.1 from training. Duration: 0.97275 2023-10-04 04:46:53,666 WARNING [train.py:1204] (2/4) Exclude cut with ID _4697_210_2069_1_1526815801953_3823098_276-96593-0 from training. Duration: 0.77 2023-10-04 04:46:57,349 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.0.layers.1.self_attn2.whiten, num_groups=1, num_channels=192, metric=11.64 vs. limit=22.5 2023-10-04 04:47:00,485 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_53071_210_16554_1_1534490405_2954066_265-170472-0_sp0.9 from training. Duration: 0.92225 2023-10-04 04:47:00,510 WARNING [train.py:1204] (2/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_365-1492-0 from training. Duration: 0.91 2023-10-04 04:47:01,913 WARNING [train.py:1204] (2/4) Exclude cut with ID _66873_210_5517_1_1535454136506_4293370_654-52713-0_sp0.9 from training. Duration: 0.8555625 2023-10-04 04:47:04,855 WARNING [train.py:1204] (2/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_113-92608-0_sp1.1 from training. Duration: 0.4909375 2023-10-04 04:47:07,676 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_46013_210_7234_1_1533776362_3744032_50-141535-0 from training. Duration: 0.99 2023-10-04 04:47:07,684 WARNING [train.py:1204] (2/4) Exclude cut with ID _13942_210_9988_1_1530604906036_3733360_499-55676-0_sp1.1 from training. Duration: 0.563625 2023-10-04 04:47:09,117 WARNING [train.py:1204] (2/4) Exclude cut with ID _30800_210_22382_1_1532499347904_4515959_375-65143-0_sp1.1 from training. Duration: 0.97275 2023-10-04 04:47:10,584 WARNING [train.py:1204] (2/4) Exclude cut with ID _16756_210_4915_1_1531047803763_7104259_199-325608-0_sp1.1 from training. Duration: 0.92725 2023-10-04 04:47:11,895 WARNING [train.py:1204] (2/4) Exclude cut with ID _31649_210_6228_1_1532584608063_4091078_443-124391-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 04:47:13,239 WARNING [train.py:1204] (2/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_146-218763-0 from training. Duration: 0.86 2023-10-04 04:47:13,283 WARNING [train.py:1204] (2/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_567-135114-0_sp1.1 from training. Duration: 0.7909375 2023-10-04 04:47:14,983 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_26326_210_7925_1_1533002493_3592406_462-288299-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 04:47:15,076 WARNING [train.py:1204] (2/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_322-217220-0 from training. Duration: 0.86 2023-10-04 04:47:16,771 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_238-256668-0_sp1.1 from training. Duration: 0.62725 2023-10-04 04:47:16,823 WARNING [train.py:1204] (2/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_371-18169-0 from training. Duration: 0.86 2023-10-04 04:47:18,678 INFO [train.py:1046] (2/4) Epoch 44, batch 1150, loss[loss=0.1583, simple_loss=0.2456, pruned_loss=0.0355, over 24074.00 frames. ], tot_loss[loss=0.154, simple_loss=0.2341, pruned_loss=0.03692, over 4692267.86 frames. ], batch size: 80, lr: 2.32e-03, grad_scale: 8.0 2023-10-04 04:47:18,772 WARNING [train.py:1204] (2/4) Exclude cut with ID _58480_210_12392_1_1534811121111_3646160_23-74512-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 04:47:18,777 WARNING [train.py:1204] (2/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_784-213099-0_sp1.1 from training. Duration: 0.8454375 2023-10-04 04:47:18,959 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.conv_skip_rate, batch_count=1530480.0, ans=0.0 2023-10-04 04:47:20,179 WARNING [train.py:1204] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_625-102955-0_sp1.1 from training. Duration: 0.7 2023-10-04 04:47:24,833 WARNING [train.py:1204] (2/4) Exclude cut with ID _49748_210_24116_1_1533986971737_3158910_176-174327-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 04:47:26,282 WARNING [train.py:1204] (2/4) Exclude cut with ID _50310_210_15488_1_1534068043345_3641080_432-96140-0_sp1.1 from training. Duration: 0.936375 2023-10-04 04:47:26,394 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.0.feed_forward1.out_proj.dropout_p, batch_count=1530480.0, ans=0.1 2023-10-04 04:47:27,652 WARNING [train.py:1204] (2/4) Exclude cut with ID _40402_210_18219_1_1533522324115_2953150_76-14910-0_sp1.1 from training. Duration: 0.92725 2023-10-04 04:47:28,998 WARNING [train.py:1204] (2/4) Exclude cut with ID _21783_210_11357_1_1531742458730_3762679_40-19458-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 04:47:29,036 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_14056_210_4163_1_1530604483_7572827_46-93547-0 from training. Duration: 0.95 2023-10-04 04:47:30,398 WARNING [train.py:1204] (2/4) Exclude cut with ID _21631_210_2253_1_1531907880778_3160339_321-35325-0_sp1.1 from training. Duration: 0.963625 2023-10-04 04:47:30,708 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.ff3_skip_rate, batch_count=1530480.0, ans=0.0 2023-10-04 04:47:33,111 WARNING [train.py:1204] (2/4) Exclude cut with ID _4779_210_2084_1_1527318022944_3914893_500-285013-0 from training. Duration: 0.69 2023-10-04 04:47:34,583 WARNING [train.py:1204] (2/4) Exclude cut with ID _43552_210_8913_1_1533726031059_3523429_301-93324-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 04:47:34,589 WARNING [train.py:1204] (2/4) Exclude cut with ID _6473_210_1741_1_1527901307396_3815380_108-17335-0_sp1.1 from training. Duration: 0.5909375 2023-10-04 04:47:38,890 WARNING [train.py:1204] (2/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_206-10097-0 from training. Duration: 0.49 2023-10-04 04:47:40,250 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_36756_210_7234_1_1533026991_966276_103-77513-0_sp1.1 from training. Duration: 0.97275 2023-10-04 04:47:41,795 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.1.ff2_skip_rate, batch_count=1530546.6666666667, ans=0.0 2023-10-04 04:47:44,359 WARNING [train.py:1204] (2/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_662-18193-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 04:47:44,423 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_26433_210_12108_1_1532157158_4056480_439-52438-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 04:47:46,291 WARNING [train.py:1204] (2/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_464-33931-0 from training. Duration: 0.54 2023-10-04 04:47:46,297 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_3474_210_2028_1_1526188513_4825246_145-309131-0_sp0.9 from training. Duration: 0.82225 2023-10-04 04:47:46,308 WARNING [train.py:1204] (2/4) Exclude cut with ID _9679_210_7497_1_1529129212906_4363929_446-308321-0_sp1.1 from training. Duration: 0.963625 2023-10-04 04:47:47,158 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.conv_module2.balancer2.prob, batch_count=1530613.3333333333, ans=0.125 2023-10-04 04:47:47,183 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.balancer2.prob, batch_count=1530613.3333333333, ans=0.125 2023-10-04 04:47:49,874 WARNING [train.py:1204] (2/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_662-275621-0 from training. Duration: 0.42 2023-10-04 04:47:51,188 WARNING [train.py:1204] (2/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_344-337148-0_sp1.1 from training. Duration: 0.97275 2023-10-04 04:47:53,091 WARNING [train.py:1204] (2/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_759-179071-0_sp1.1 from training. Duration: 0.92725 2023-10-04 04:48:01,642 WARNING [train.py:1204] (2/4) Exclude cut with ID _28783_210_7943_1_1532671164808_7155479_293-12188-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 04:48:08,546 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_17613_210_2151_1_1531137569_2366160_235-320304-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 04:48:08,561 WARNING [train.py:1204] (2/4) Exclude cut with ID _14349_210_8341_1_1530615710818_3442470_170-336541-0 from training. Duration: 0.86 2023-10-04 04:48:08,628 WARNING [train.py:1204] (2/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_375-218677-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 04:48:09,947 WARNING [train.py:1204] (2/4) Exclude cut with ID _28317_210_12742_1_1532480302381_8510325_829-202956-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 04:48:14,051 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9029W0436-509823 from training. Duration: 0.768 2023-10-04 04:48:15,453 WARNING [train.py:1204] (2/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_231-112214-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 04:48:17,277 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.694e+02 1.987e+02 2.114e+02 2.395e+02 3.524e+02, threshold=4.229e+02, percent-clipped=0.0 2023-10-04 04:48:22,988 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9059W0470-524883 from training. Duration: 0.3415625 2023-10-04 04:48:27,833 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_43266_210_1794_1_1533521789_4497218_206-79512-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 04:48:29,202 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_294-163793-0_sp0.9 from training. Duration: 0.911125 2023-10-04 04:48:29,231 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6290_210_3273_1_1527595224_1282787_115-87095-0_sp1.1 from training. Duration: 0.5 2023-10-04 04:48:29,275 WARNING [train.py:1204] (2/4) Exclude cut with ID _14100_210_8341_1_1530529320613_3678069_279-55760-0_sp1.1 from training. Duration: 0.8454375 2023-10-04 04:48:31,942 INFO [train.py:1046] (2/4) Epoch 44, batch 1200, loss[loss=0.1417, simple_loss=0.2253, pruned_loss=0.02908, over 24317.00 frames. ], tot_loss[loss=0.1543, simple_loss=0.2346, pruned_loss=0.03704, over 4691447.65 frames. ], batch size: 61, lr: 2.32e-03, grad_scale: 16.0 2023-10-04 04:48:32,085 WARNING [train.py:1204] (2/4) Exclude cut with ID _14621_210_1925_1_1530686901185_3675420_419-234200-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 04:48:35,152 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.out_combiner.scale_min, batch_count=1530813.3333333333, ans=0.2 2023-10-04 04:48:37,544 WARNING [train.py:1204] (2/4) Exclude cut with ID _60229_210_27767_1_1534838406158_3655350_353-213656-0_sp1.1 from training. Duration: 0.863625 2023-10-04 04:48:37,552 WARNING [train.py:1204] (2/4) Exclude cut with ID _48678_210_5663_1_1533988830947_3769199_306-255886-0_sp1.1 from training. Duration: 0.7 2023-10-04 04:48:38,995 WARNING [train.py:1204] (2/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_466-3492-0_sp1.1 from training. Duration: 0.97275 2023-10-04 04:48:38,995 WARNING [train.py:1204] (2/4) Exclude cut with ID _8755_210_1876_1_1528628559357_3258209_199-242167-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 04:48:39,060 WARNING [train.py:1204] (2/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_237-89684-0_sp1.1 from training. Duration: 0.87275 2023-10-04 04:48:41,794 WARNING [train.py:1204] (2/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_70-84121-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 04:48:44,544 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7544_210_6753_1_1528587722_3576686_172-171750-0_sp1.1 from training. Duration: 0.5909375 2023-10-04 04:48:45,914 WARNING [train.py:1204] (2/4) Exclude cut with ID _24271_210_12658_1_1531987270022_7190519_40-321822-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 04:48:45,938 WARNING [train.py:1204] (2/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_227-83612-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 04:48:49,914 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9156W0015-572508 from training. Duration: 0.896 2023-10-04 04:48:51,348 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_292-265928-0 from training. Duration: 0.51 2023-10-04 04:48:55,421 WARNING [train.py:1204] (2/4) Exclude cut with ID _14747_210_8341_1_1530685797463_3654089_147-131229-0_sp1.1 from training. Duration: 0.6909375 2023-10-04 04:48:55,526 INFO [scaling.py:1118] (2/4) WithLoss: name=encoder.encoders.0.layers.0.self_attn_weights, loss-sum=0.000e+00 2023-10-04 04:48:55,573 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.feed_forward1.hidden_balancer.prob, batch_count=1530880.0, ans=0.125 2023-10-04 04:48:58,723 WARNING [train.py:1204] (2/4) Exclude cut with ID _45297_210_2721_1_1533715351381_3264949_351-147136-0_sp0.9 from training. Duration: 0.9666875 2023-10-04 04:49:00,146 WARNING [train.py:1204] (2/4) Exclude cut with ID _5250_210_5219_1_1527249561806_8692110_1102-324568-0_sp1.1 from training. Duration: 0.97275 2023-10-04 04:49:01,550 WARNING [train.py:1204] (2/4) Exclude cut with ID _18516_210_9206_1_1531704653206_3692406_465-75446-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 04:49:01,552 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9203W0126-595623 from training. Duration: 0.896 2023-10-04 04:49:02,873 WARNING [train.py:1204] (2/4) Exclude cut with ID _55383_210_10635_1_1534468426172_2231669_139-328410-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 04:49:11,206 WARNING [train.py:1204] (2/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_166-246557-0_sp0.9 from training. Duration: 0.711125 2023-10-04 04:49:11,211 WARNING [train.py:1204] (2/4) Exclude cut with ID _30039_210_13270_1_1532484612900_2725150_239-304872-0_sp1.1 from training. Duration: 0.963625 2023-10-04 04:49:11,243 WARNING [train.py:1204] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_6-226750-0 from training. Duration: 0.94 2023-10-04 04:49:11,459 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.feed_forward1.out_proj.dropout_p, batch_count=1530946.6666666667, ans=0.1 2023-10-04 04:49:12,542 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_135-5501-0_sp1.1 from training. Duration: 0.8909375 2023-10-04 04:49:15,343 WARNING [train.py:1204] (2/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_359-118926-0 from training. Duration: 0.72 2023-10-04 04:49:16,957 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.conv_module2.balancer1.prob, batch_count=1531013.3333333333, ans=0.125 2023-10-04 04:49:17,401 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.4.encoder.layers.1.nonlin_attention.whiten1, num_groups=1, num_channels=288, metric=4.13 vs. limit=10.0 2023-10-04 04:49:18,770 WARNING [train.py:1204] (2/4) Exclude cut with ID _8064_210_5399_1_1528372299291_4282973_4-91037-0 from training. Duration: 0.88 2023-10-04 04:49:18,787 WARNING [train.py:1204] (2/4) Exclude cut with ID _6653_210_2629_1_1527919172441_6732679_415-288492-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 04:49:18,860 WARNING [train.py:1204] (2/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_544-287179-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 04:49:20,263 WARNING [train.py:1204] (2/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_122-32606-0_sp1.1 from training. Duration: 0.92725 2023-10-04 04:49:22,167 WARNING [train.py:1204] (2/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_121-284152-0_sp1.1 from training. Duration: 0.536375 2023-10-04 04:49:22,276 WARNING [train.py:1204] (2/4) Exclude cut with ID _44718_210_5816_1_1534050051869_6469030_350-299477-0_sp1.1 from training. Duration: 0.97275 2023-10-04 04:49:23,503 WARNING [train.py:1204] (2/4) Exclude cut with ID _6653_210_2629_1_1527919172441_6732679_1103-56445-0_sp0.9 from training. Duration: 0.811125 2023-10-04 04:49:23,572 WARNING [train.py:1204] (2/4) Exclude cut with ID _12369_210_4381_1_1530356078581_4388520_64-298917-0_sp0.9 from training. Duration: 0.97775 2023-10-04 04:49:24,907 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_2245_210_2715_1_1525511548_3540254_310-52483-0 from training. Duration: 0.88 2023-10-04 04:49:24,951 WARNING [train.py:1204] (2/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_401-158239-0_sp1.1 from training. Duration: 0.6454375 2023-10-04 04:49:25,001 WARNING [train.py:1204] (2/4) Exclude cut with ID _59502_210_12210_1_1535107024698_1835139_146-162495-0_sp1.1 from training. Duration: 0.836375 2023-10-04 04:49:26,160 WARNING [train.py:1204] (2/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_41-31157-0_sp0.9 from training. Duration: 0.6333125 2023-10-04 04:49:30,785 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_68717_210_3528_1_1535628683_1556759_95-36722-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 04:49:30,792 WARNING [train.py:1204] (2/4) Exclude cut with ID _30196_210_1664_1_1533708496522_7074140_654-162611-0_sp1.1 from training. Duration: 0.92725 2023-10-04 04:49:32,413 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=1531080.0, ans=0.1 2023-10-04 04:49:33,665 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_9302_210_2550_1_1528889962_4655416_638-301620-0_sp0.9 from training. Duration: 0.52225 2023-10-04 04:49:36,505 WARNING [train.py:1204] (2/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_478-162247-0_sp1.1 from training. Duration: 0.7454375 2023-10-04 04:49:36,677 WARNING [train.py:1204] (2/4) Exclude cut with ID _45297_210_2721_1_1533715351381_3264949_353-9541-0 from training. Duration: 0.97 2023-10-04 04:49:40,807 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9653W0190-675468 from training. Duration: 0.9935 2023-10-04 04:49:43,533 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_49730_210_13049_1_1534075294_1830676_214-84436-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 04:49:44,797 INFO [train.py:1046] (2/4) Epoch 44, batch 1250, loss[loss=0.1662, simple_loss=0.2417, pruned_loss=0.0454, over 23785.00 frames. ], tot_loss[loss=0.1549, simple_loss=0.2354, pruned_loss=0.03718, over 4703207.37 frames. ], batch size: 212, lr: 2.32e-03, grad_scale: 16.0 2023-10-04 04:49:44,969 WARNING [train.py:1204] (2/4) Exclude cut with ID _50093_210_4220_1_1534417141207_3432299_259-322252-0_sp1.1 from training. Duration: 0.836375 2023-10-04 04:49:46,331 WARNING [train.py:1204] (2/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_599-50220-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 04:49:48,092 WARNING [train.py:1204] (2/4) Exclude cut with ID _21878_210_3525_1_1531738772002_7264330_174-109016-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 04:49:51,346 WARNING [train.py:1204] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_665-196155-0 from training. Duration: 0.67 2023-10-04 04:49:55,959 WARNING [train.py:1204] (2/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_656-188361-0_sp1.1 from training. Duration: 0.7909375 2023-10-04 04:49:56,020 WARNING [train.py:1204] (2/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_60-18123-0_sp1.1 from training. Duration: 0.8 2023-10-04 04:49:57,326 WARNING [train.py:1204] (2/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_535-14410-0 from training. Duration: 0.47 2023-10-04 04:49:58,833 WARNING [train.py:1204] (2/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_94-126128-0_sp1.1 from training. Duration: 0.7818125 2023-10-04 04:50:00,262 WARNING [train.py:1204] (2/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_356-31216-0_sp1.1 from training. Duration: 0.6818125 2023-10-04 04:50:03,580 WARNING [train.py:1204] (2/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_443-30034-0_sp1.1 from training. Duration: 0.5818125 2023-10-04 04:50:04,012 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.5.encoder.layers.1.whiten, num_groups=1, num_channels=256, metric=2.06 vs. limit=12.0 2023-10-04 04:50:04,888 WARNING [train.py:1204] (2/4) Exclude cut with ID _26907_210_7234_1_1532221740694_3205263_337-2494-0_sp1.1 from training. Duration: 0.8 2023-10-04 04:50:06,249 WARNING [train.py:1204] (2/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_49-97158-0_sp0.9 from training. Duration: 0.8444375 2023-10-04 04:50:06,260 WARNING [train.py:1204] (2/4) Exclude cut with ID _7877_210_6014_1_1528286358099_3750136_553-170128-0_sp1.1 from training. Duration: 0.963625 2023-10-04 04:50:07,724 WARNING [train.py:1204] (2/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_202-293523-0_sp0.9 from training. Duration: 0.8 2023-10-04 04:50:10,588 WARNING [train.py:1204] (2/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_188-171673-0_sp0.9 from training. Duration: 0.6444375 2023-10-04 04:50:10,809 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.feed_forward3.hidden_balancer.prob, batch_count=1531213.3333333333, ans=0.125 2023-10-04 04:50:11,864 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_120-41668-0_sp1.1 from training. Duration: 0.636375 2023-10-04 04:50:11,869 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_40223_210_6228_1_1533298404_4812267_613-78769-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 04:50:11,983 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_53861_210_5399_1_1534422156_4272077_150-336899-0_sp1.1 from training. Duration: 0.963625 2023-10-04 04:50:13,353 WARNING [train.py:1204] (2/4) Exclude cut with ID _14459_210_2228_1_1531443641946_7516700_1146-56843-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 04:50:16,266 WARNING [train.py:1204] (2/4) Exclude cut with ID _39867_210_9826_1_1533553222030_9344259_1194-299928-0_sp1.1 from training. Duration: 0.92725 2023-10-04 04:50:17,631 WARNING [train.py:1204] (2/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_214-291369-0_sp0.9 from training. Duration: 0.7 2023-10-04 04:50:21,803 WARNING [train.py:1204] (2/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_358-215558-0 from training. Duration: 0.93 2023-10-04 04:50:22,976 WARNING [train.py:1204] (2/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_577-287150-0_sp0.9 from training. Duration: 0.911125 2023-10-04 04:50:25,691 WARNING [train.py:1204] (2/4) Exclude cut with ID _60860_210_8254_1_1534896015875_3557169_229-187367-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 04:50:27,120 WARNING [train.py:1204] (2/4) Exclude cut with ID _6275_210_2084_1_1528110062213_7669049_857-298942-0 from training. Duration: 0.85 2023-10-04 04:50:27,177 WARNING [train.py:1204] (2/4) Exclude cut with ID _40137_210_12210_1_1533290484046_3795180_249-187059-0_sp1.1 from training. Duration: 0.8 2023-10-04 04:50:27,184 WARNING [train.py:1204] (2/4) Exclude cut with ID ID0171W0508-765603 from training. Duration: 0.541 2023-10-04 04:50:28,899 WARNING [train.py:1204] (2/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_521-219439-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 04:50:28,904 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_40223_210_6228_1_1533298404_4812267_741-302616-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 04:50:32,993 WARNING [train.py:1204] (2/4) Exclude cut with ID _65293_210_9070_1_1535545549765_4004210_438-154735-0_sp1.1 from training. Duration: 0.92725 2023-10-04 04:50:33,214 WARNING [train.py:1204] (2/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_564-132950-0_sp1.1 from training. Duration: 0.92725 2023-10-04 04:50:33,244 WARNING [train.py:1204] (2/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_291-270881-0_sp1.1 from training. Duration: 0.8818125 2023-10-04 04:50:35,870 WARNING [train.py:1204] (2/4) Exclude cut with ID _60229_210_27767_1_1534838406158_3655350_353-213656-0 from training. Duration: 0.95 2023-10-04 04:50:35,889 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_3133_210_2087_1_1526106446_2324690_243-143119-0 from training. Duration: 0.77 2023-10-04 04:50:37,301 WARNING [train.py:1204] (2/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_690-126306-0 from training. Duration: 0.78 2023-10-04 04:50:40,067 WARNING [train.py:1204] (2/4) Exclude cut with ID _30304_210_6319_1_1532476827261_7582060_347-95003-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 04:50:41,466 WARNING [train.py:1204] (2/4) Exclude cut with ID _2322_210_2667_1_1525604548971_3656299_440-80494-0 from training. Duration: 0.88 2023-10-04 04:50:41,495 WARNING [train.py:1204] (2/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_370-229434-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 04:50:45,655 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.610e+02 2.030e+02 2.263e+02 2.727e+02 4.243e+02, threshold=4.525e+02, percent-clipped=1.0 2023-10-04 04:50:45,770 WARNING [train.py:1204] (2/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_124-345840-0_sp1.1 from training. Duration: 0.47275 2023-10-04 04:50:45,778 WARNING [train.py:1204] (2/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_165-123581-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 04:50:49,125 WARNING [train.py:1204] (2/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_453-282588-0 from training. Duration: 0.98 2023-10-04 04:50:49,157 WARNING [train.py:1204] (2/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_299-34413-0_sp0.9 from training. Duration: 0.72225 2023-10-04 04:50:50,528 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_40036_210_13405_1_1533648647_1315388_48-146016-0_sp1.1 from training. Duration: 0.8454375 2023-10-04 04:50:50,559 WARNING [train.py:1204] (2/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_99-167392-0_sp0.9 from training. Duration: 0.67775 2023-10-04 04:50:50,608 WARNING [train.py:1204] (2/4) Exclude cut with ID _48677_210_5663_1_1533902405919_3892989_37-207114-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 04:50:52,044 WARNING [train.py:1204] (2/4) Exclude cut with ID _29857_210_20034_1_1532395886753_3798160_335-163562-0 from training. Duration: 0.85 2023-10-04 04:50:54,099 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_36492_210_7927_1_1533261987_4790834_703-343351-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 04:50:56,051 WARNING [train.py:1204] (2/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_80-147301-0_sp0.9 from training. Duration: 0.9333125 2023-10-04 04:50:56,312 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=1531413.3333333333, ans=0.1 2023-10-04 04:50:57,411 WARNING [train.py:1204] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_218-295097-0_sp0.9 from training. Duration: 0.8555625 2023-10-04 04:51:00,135 INFO [train.py:1046] (2/4) Epoch 44, batch 1300, loss[loss=0.2098, simple_loss=0.2794, pruned_loss=0.07013, over 19592.00 frames. ], tot_loss[loss=0.155, simple_loss=0.2356, pruned_loss=0.03721, over 4705973.25 frames. ], batch size: 388, lr: 2.32e-03, grad_scale: 8.0 2023-10-04 04:51:00,250 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_2295_210_2087_1_1525500993_2407863_116-238894-0_sp1.1 from training. Duration: 0.636375 2023-10-04 04:51:03,680 WARNING [train.py:1204] (2/4) Exclude cut with ID _3231_210_2106_1_1526205864934_6784220_697-171068-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 04:51:04,982 WARNING [train.py:1204] (2/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_102-77064-0 from training. Duration: 0.93 2023-10-04 04:51:07,938 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_13091_210_7925_1_1530609447_8987354_1186-261957-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 04:51:09,500 INFO [scaling.py:1118] (2/4) WithLoss: name=encoder.encoders.2.encoder.layers.1.self_attn_weights, loss-sum=0.000e+00 2023-10-04 04:51:10,678 WARNING [train.py:1204] (2/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_91-277684-0_sp1.1 from training. Duration: 0.67275 2023-10-04 04:51:12,037 WARNING [train.py:1204] (2/4) Exclude cut with ID _31088_210_16446_1_1532520012284_3440919_159-313751-0_sp1.1 from training. Duration: 0.936375 2023-10-04 04:51:14,764 WARNING [train.py:1204] (2/4) Exclude cut with ID _42326_210_7770_1_1533814282531_3707269_227-203721-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 04:51:14,852 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_3531_210_3528_1_1526294203_5216456_309-227302-0_sp1.1 from training. Duration: 0.62725 2023-10-04 04:51:16,244 WARNING [train.py:1204] (2/4) Exclude cut with ID _14087_210_2253_1_1530529224512_3014029_226-242140-0 from training. Duration: 0.81 2023-10-04 04:51:18,342 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.3.encoder.layers.3.nonlin_attention.whiten2, num_groups=1, num_channels=512, metric=5.94 vs. limit=15.0 2023-10-04 04:51:19,556 WARNING [train.py:1204] (2/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_122-109023-0_sp1.1 from training. Duration: 0.6909375 2023-10-04 04:51:20,930 WARNING [train.py:1204] (2/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_660-211088-0_sp0.9 from training. Duration: 0.688875 2023-10-04 04:51:22,369 WARNING [train.py:1204] (2/4) Exclude cut with ID _39290_210_4220_1_1533862778760_3075118_92-311600-0 from training. Duration: 0.88 2023-10-04 04:51:24,444 WARNING [train.py:1204] (2/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_390-339137-0_sp1.1 from training. Duration: 0.5090625 2023-10-04 04:51:28,840 WARNING [train.py:1204] (2/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_449-341946-0_sp1.1 from training. Duration: 0.963625 2023-10-04 04:51:28,922 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_68095_210_20185_1_1535718226_8888855_884-327007-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 04:51:29,008 WARNING [train.py:1204] (2/4) Exclude cut with ID _42326_210_7770_1_1533814282531_3707269_308-222998-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 04:51:29,188 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.conv_module2.balancer1.prob, batch_count=1531613.3333333333, ans=0.125 2023-10-04 04:51:32,142 WARNING [train.py:1204] (2/4) Exclude cut with ID _40399_210_4353_1_1533305658194_3279406_344-238708-0_sp1.1 from training. Duration: 0.963625 2023-10-04 04:51:32,204 WARNING [train.py:1204] (2/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_87-131427-0_sp1.1 from training. Duration: 0.7090625 2023-10-04 04:51:33,506 WARNING [train.py:1204] (2/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_110-130961-0_sp0.9 from training. Duration: 0.52225 2023-10-04 04:51:33,547 WARNING [train.py:1204] (2/4) Exclude cut with ID _55153_210_2253_1_1534384786677_3162096_148-79022-0 from training. Duration: 0.96 2023-10-04 04:51:39,009 WARNING [train.py:1204] (2/4) Exclude cut with ID _9679_210_7497_1_1529129212906_4363929_512-313889-0_sp1.1 from training. Duration: 0.736375 2023-10-04 04:51:39,021 WARNING [train.py:1204] (2/4) Exclude cut with ID _7970_210_1883_1_1528455616296_3472230_25-178407-0_sp1.1 from training. Duration: 0.6454375 2023-10-04 04:51:41,431 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_3133_210_2087_1_1526106446_2324690_246-125000-0 from training. Duration: 0.62 2023-10-04 04:51:41,488 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_15126_210_6753_1_1530778677_2591185_113-157651-0_sp0.9 from training. Duration: 0.7555625 2023-10-04 04:51:42,881 WARNING [train.py:1204] (2/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_105-63309-0_sp1.1 from training. Duration: 0.8909375 2023-10-04 04:51:44,367 WARNING [train.py:1204] (2/4) Exclude cut with ID _29877_210_6228_1_1532397791106_5276149_578-177271-0_sp1.1 from training. Duration: 0.936375 2023-10-04 04:51:46,212 WARNING [train.py:1204] (2/4) Exclude cut with ID _44831_210_13427_1_1533816011549_3689035_32-188147-0 from training. Duration: 0.96 2023-10-04 04:51:46,265 WARNING [train.py:1204] (2/4) Exclude cut with ID _12369_210_4381_1_1530356078581_4388520_300-254337-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 04:51:46,284 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_128-241008-0 from training. Duration: 0.94 2023-10-04 04:51:49,035 WARNING [train.py:1204] (2/4) Exclude cut with ID _21878_210_3525_1_1531738772002_7264330_53-279178-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 04:51:50,622 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.conv_module2.balancer2.min_positive, batch_count=1531680.0, ans=0.05 2023-10-04 04:51:51,965 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.balancer1.prob, batch_count=1531680.0, ans=0.125 2023-10-04 04:51:53,733 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_41452_210_7291_1_1533437687_3941281_273-334088-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 04:51:53,735 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_128-241008-0_sp1.1 from training. Duration: 0.8545625 2023-10-04 04:51:57,194 WARNING [train.py:1204] (2/4) Exclude cut with ID _8394_210_6702_1_1528590504316_3540189_379-49457-0 from training. Duration: 0.87 2023-10-04 04:51:58,620 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_105-114797-0 from training. Duration: 0.84 2023-10-04 04:51:58,706 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_14928_210_5632_1_1530773461_4714000_454-112198-0 from training. Duration: 0.66 2023-10-04 04:52:02,929 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_456-22141-0_sp1.1 from training. Duration: 0.8 2023-10-04 04:52:05,919 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_115-233929-0 from training. Duration: 0.72 2023-10-04 04:52:07,240 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_616-16795-0_sp1.1 from training. Duration: 0.963625 2023-10-04 04:52:14,025 INFO [train.py:1046] (2/4) Epoch 44, batch 1350, loss[loss=0.1437, simple_loss=0.2068, pruned_loss=0.04035, over 22741.00 frames. ], tot_loss[loss=0.1547, simple_loss=0.2347, pruned_loss=0.03736, over 4704812.69 frames. ], batch size: 322, lr: 2.32e-03, grad_scale: 4.0 2023-10-04 04:52:14,079 WARNING [train.py:1204] (2/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_448-212135-0 from training. Duration: 0.91 2023-10-04 04:52:16,002 WARNING [train.py:1204] (2/4) Exclude cut with ID _46683_210_9642_1_1533730913492_4591116_201-205677-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 04:52:18,670 WARNING [train.py:1204] (2/4) Exclude cut with ID _64086_210_7994_1_1535270417019_7435949_43-80483-0_sp1.1 from training. Duration: 0.97275 2023-10-04 04:52:20,631 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.4.encoder.layers.1.self_attn1.whiten, num_groups=1, num_channels=384, metric=11.30 vs. limit=22.5 2023-10-04 04:52:21,403 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_240-134124-0_sp1.1 from training. Duration: 0.963625 2023-10-04 04:52:21,586 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.feed_forward3.hidden_balancer.prob, batch_count=1531813.3333333333, ans=0.125 2023-10-04 04:52:22,652 WARNING [train.py:1204] (2/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_907-120940-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 04:52:24,638 WARNING [train.py:1204] (2/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_848-280957-0_sp1.1 from training. Duration: 0.7909375 2023-10-04 04:52:24,684 WARNING [train.py:1204] (2/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_243-42116-0_sp0.9 from training. Duration: 0.811125 2023-10-04 04:52:24,944 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.conv_module2.balancer2.prob, batch_count=1531813.3333333333, ans=0.125 2023-10-04 04:52:27,552 WARNING [train.py:1204] (2/4) Exclude cut with ID _6694_210_4599_1_1527852637490_3965529_116-109398-0_sp0.9 from training. Duration: 0.811125 2023-10-04 04:52:27,784 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.conv_module2.balancer2.prob, batch_count=1531880.0, ans=0.125 2023-10-04 04:52:28,936 WARNING [train.py:1204] (2/4) Exclude cut with ID _24314_210_4915_1_1532135078116_7170179_521-258868-0 from training. Duration: 0.79 2023-10-04 04:52:30,327 WARNING [train.py:1204] (2/4) Exclude cut with ID _2220_210_1819_1_1525509109898_3658680_50-99837-0_sp0.9 from training. Duration: 0.6 2023-10-04 04:52:30,379 WARNING [train.py:1204] (2/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_313-134930-0_sp1.1 from training. Duration: 0.8818125 2023-10-04 04:52:30,594 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.feed_forward2.hidden_balancer.prob, batch_count=1531880.0, ans=0.125 2023-10-04 04:52:32,516 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=1531880.0, ans=0.1 2023-10-04 04:52:33,711 WARNING [train.py:1204] (2/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_125-144174-0 from training. Duration: 0.39 2023-10-04 04:52:35,030 WARNING [train.py:1204] (2/4) Exclude cut with ID _59288_210_5854_1_1535077757774_3412246_275-293897-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 04:52:36,339 WARNING [train.py:1204] (2/4) Exclude cut with ID _40519_210_13794_1_1533715228655_7337390_722-52880-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 04:52:36,355 WARNING [train.py:1204] (2/4) Exclude cut with ID _7537_210_2084_1_1528502395431_3924359_128-268029-0 from training. Duration: 0.98 2023-10-04 04:52:39,076 WARNING [train.py:1204] (2/4) Exclude cut with ID _8021_210_7994_1_1528376451358_3653069_179-45206-0 from training. Duration: 0.86 2023-10-04 04:52:41,790 WARNING [train.py:1204] (2/4) Exclude cut with ID _13252_210_1925_1_1530671372047_3336909_88-276668-0 from training. Duration: 0.99 2023-10-04 04:52:43,253 WARNING [train.py:1204] (2/4) Exclude cut with ID _22613_210_5219_1_1532253599686_4090170_290-212862-0_sp1.1 from training. Duration: 0.97275 2023-10-04 04:52:43,258 WARNING [train.py:1204] (2/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_465-214726-0 from training. Duration: 0.86 2023-10-04 04:52:55,808 WARNING [train.py:1204] (2/4) Exclude cut with ID _56480_210_4220_1_1534679942079_3075409_138-36031-0_sp1.1 from training. Duration: 0.97275 2023-10-04 04:52:57,310 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.ff3_skip_rate, batch_count=1532013.3333333333, ans=0.0 2023-10-04 04:53:04,044 WARNING [train.py:1204] (2/4) Exclude cut with ID _58605_210_12210_1_1534723195939_3904259_300-285878-0_sp1.1 from training. Duration: 0.97275 2023-10-04 04:53:04,081 WARNING [train.py:1204] (2/4) Exclude cut with ID _40901_210_5665_1_1533363789958_3856130_114-340229-0_sp1.1 from training. Duration: 0.936375 2023-10-04 04:53:04,130 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_489-230703-0 from training. Duration: 0.85 2023-10-04 04:53:08,675 WARNING [train.py:1204] (2/4) Exclude cut with ID _66873_210_5517_1_1535454136506_4293370_322-81243-0_sp1.1 from training. Duration: 0.936375 2023-10-04 04:53:10,083 WARNING [train.py:1204] (2/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_271-269084-0 from training. Duration: 0.83 2023-10-04 04:53:10,104 WARNING [train.py:1204] (2/4) Exclude cut with ID _13252_210_1925_1_1530671372047_3336909_166-314607-0_sp0.9 from training. Duration: 0.6 2023-10-04 04:53:10,171 WARNING [train.py:1204] (2/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_340-343302-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 04:53:11,620 WARNING [train.py:1204] (2/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_286-112015-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 04:53:13,104 WARNING [train.py:1204] (2/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_328-264776-0 from training. Duration: 0.67 2023-10-04 04:53:14,534 WARNING [train.py:1204] (2/4) Exclude cut with ID _4866_210_2577_1_1527404412284_4364659_256-306830-0_sp0.9 from training. Duration: 0.9666875 2023-10-04 04:53:16,249 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.678e+02 1.909e+02 2.111e+02 2.476e+02 3.345e+02, threshold=4.221e+02, percent-clipped=0.0 2023-10-04 04:53:19,218 WARNING [train.py:1204] (2/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_239-316802-0 from training. Duration: 0.69 2023-10-04 04:53:20,654 WARNING [train.py:1204] (2/4) Exclude cut with ID _29857_210_20034_1_1532395886753_3798160_279-185801-0 from training. Duration: 0.91 2023-10-04 04:53:27,142 WARNING [train.py:1204] (2/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_656-188361-0 from training. Duration: 0.87 2023-10-04 04:53:28,407 INFO [train.py:1046] (2/4) Epoch 44, batch 1400, loss[loss=0.1586, simple_loss=0.2507, pruned_loss=0.03321, over 24539.00 frames. ], tot_loss[loss=0.1536, simple_loss=0.233, pruned_loss=0.03708, over 4709344.73 frames. ], batch size: 71, lr: 2.31e-03, grad_scale: 8.0 2023-10-04 04:53:28,500 WARNING [train.py:1204] (2/4) Exclude cut with ID _44718_210_5816_1_1534050051869_6469030_100-230287-0_sp1.1 from training. Duration: 0.936375 2023-10-04 04:53:32,410 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8275_210_4198_1_1528455320_3892571_276-215622-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 04:53:32,464 WARNING [train.py:1204] (2/4) Exclude cut with ID _30304_210_6319_1_1532476827261_7582060_698-309430-0_sp1.1 from training. Duration: 0.963625 2023-10-04 04:53:33,303 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.2.encoder.layers.0.feed_forward1.out_whiten, num_groups=1, num_channels=384, metric=5.72 vs. limit=15.0 2023-10-04 04:53:37,424 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_365-217119-0 from training. Duration: 0.63 2023-10-04 04:53:38,853 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7264_210_4938_1_1527939911_8132095_800-91995-0 from training. Duration: 0.86 2023-10-04 04:53:48,989 WARNING [train.py:1204] (2/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_49-97158-0_sp1.1 from training. Duration: 0.6909375 2023-10-04 04:53:51,711 WARNING [train.py:1204] (2/4) Exclude cut with ID _61132_210_9969_1_1535021792964_3755419_61-57720-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 04:53:51,962 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.conv_module2.balancer2.min_abs, batch_count=1532213.3333333333, ans=0.5 2023-10-04 04:53:54,360 WARNING [train.py:1204] (2/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_345-306832-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 04:53:54,391 WARNING [train.py:1204] (2/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_164-224052-0_sp0.9 from training. Duration: 0.77775 2023-10-04 04:53:58,016 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_34886_210_10633_1_1533203882_1755661_76-156096-0_sp1.1 from training. Duration: 0.7909375 2023-10-04 04:53:59,312 WARNING [train.py:1204] (2/4) Exclude cut with ID 4133-6541-0027-40495-0_sp1.1 from training. Duration: 0.9681875 2023-10-04 04:54:06,324 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_514-211165-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 04:54:06,390 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_41211_210_2544_1_1533348859_3702910_97-212695-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 04:54:07,990 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.bypass_mid.scale_min, batch_count=1532280.0, ans=0.2 2023-10-04 04:54:11,094 WARNING [train.py:1204] (2/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_406-45086-0 from training. Duration: 0.8 2023-10-04 04:54:12,361 WARNING [train.py:1204] (2/4) Exclude cut with ID _65263_210_22356_1_1535763598691_3921037_49-222917-0_sp1.1 from training. Duration: 0.836375 2023-10-04 04:54:12,421 WARNING [train.py:1204] (2/4) Exclude cut with ID _4866_210_2577_1_1527404412284_4364659_352-157930-0_sp1.1 from training. Duration: 0.736375 2023-10-04 04:54:13,251 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.0.layers.1.feed_forward2.out_whiten, num_groups=1, num_channels=192, metric=4.02 vs. limit=15.0 2023-10-04 04:54:14,442 WARNING [train.py:1204] (2/4) Exclude cut with ID _4366_210_1388_1_1526792053633_3613819_231-211402-0_sp1.1 from training. Duration: 0.8545625 2023-10-04 04:54:14,486 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_98-21576-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 04:54:17,152 WARNING [train.py:1204] (2/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_348-127789-0_sp0.9 from training. Duration: 0.9444375 2023-10-04 04:54:17,167 WARNING [train.py:1204] (2/4) Exclude cut with ID _45871_210_5665_1_1533864614245_3296060_98-329557-0_sp1.1 from training. Duration: 0.97275 2023-10-04 04:54:17,219 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6846_210_6753_1_1527850767_4295158_2-315073-0_sp1.1 from training. Duration: 0.8 2023-10-04 04:54:18,578 WARNING [train.py:1204] (2/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_145-137790-0 from training. Duration: 0.83 2023-10-04 04:54:19,315 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.2.encoder.layers.0.conv_module2.whiten, num_groups=1, num_channels=384, metric=11.28 vs. limit=15.0 2023-10-04 04:54:19,907 WARNING [train.py:1204] (2/4) Exclude cut with ID _14603_210_4220_1_1530615688441_3097319_133-158308-0_sp0.9 from training. Duration: 0.9333125 2023-10-04 04:54:24,075 WARNING [train.py:1204] (2/4) Exclude cut with ID _14898_210_9206_1_1530766812377_4177190_320-11758-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 04:54:29,090 WARNING [train.py:1204] (2/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_406-45086-0_sp0.9 from training. Duration: 0.888875 2023-10-04 04:54:34,740 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_5438_210_1794_1_1527336687_2600009_104-288274-0 from training. Duration: 0.58 2023-10-04 04:54:36,058 WARNING [train.py:1204] (2/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_108-53929-0_sp0.9 from training. Duration: 0.6555625 2023-10-04 04:54:36,130 WARNING [train.py:1204] (2/4) Exclude cut with ID _30800_210_22382_1_1532499347904_4515959_444-286390-0_sp1.1 from training. Duration: 0.963625 2023-10-04 04:54:38,764 WARNING [train.py:1204] (2/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_125-144174-0_sp0.9 from training. Duration: 0.4333125 2023-10-04 04:54:40,190 WARNING [train.py:1204] (2/4) Exclude cut with ID _55209_210_1531_1_1534675828996_4597160_118-345621-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 04:54:41,375 INFO [train.py:1046] (2/4) Epoch 44, batch 1450, loss[loss=0.1743, simple_loss=0.2469, pruned_loss=0.05085, over 23868.00 frames. ], tot_loss[loss=0.1533, simple_loss=0.2329, pruned_loss=0.03685, over 4719283.25 frames. ], batch size: 179, lr: 2.31e-03, grad_scale: 8.0 2023-10-04 04:54:41,528 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_45886_210_17024_1_1533709215_471063_7-79165-0_sp1.1 from training. Duration: 0.8909375 2023-10-04 04:54:45,386 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.bypass.scale_min, batch_count=1532480.0, ans=0.2 2023-10-04 04:54:46,407 WARNING [train.py:1204] (2/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_175-163894-0_sp0.9 from training. Duration: 0.82225 2023-10-04 04:54:48,031 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=1532480.0, ans=0.1 2023-10-04 04:54:49,289 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_50188_210_4198_1_1534503621_3646041_353-81682-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 04:54:49,295 WARNING [train.py:1204] (2/4) Exclude cut with ID _17875_210_2087_1_1531224012907_1821339_175-80565-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 04:54:49,303 WARNING [train.py:1204] (2/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_159-328157-0_sp1.1 from training. Duration: 0.47275 2023-10-04 04:54:49,624 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.attention_skip_rate, batch_count=1532480.0, ans=0.0 2023-10-04 04:54:53,554 WARNING [train.py:1204] (2/4) Exclude cut with ID _60057_210_10640_1_1534903065793_3333150_137-267579-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 04:54:53,603 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_92-46009-0_sp1.1 from training. Duration: 0.4909375 2023-10-04 04:54:56,330 WARNING [train.py:1204] (2/4) Exclude cut with ID _53203_210_2087_1_1534246218309_475590_48-68507-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 04:54:56,346 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_28211_210_6388_1_1532861506_4523224_128-328162-0 from training. Duration: 0.9 2023-10-04 04:54:57,763 WARNING [train.py:1204] (2/4) Exclude cut with ID _4697_210_2069_1_1526815801953_3823098_51-1049-0_sp1.1 from training. Duration: 0.6818125 2023-10-04 04:54:57,856 WARNING [train.py:1204] (2/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_383-316000-0 from training. Duration: 0.9 2023-10-04 04:54:59,666 WARNING [train.py:1204] (2/4) Exclude cut with ID _4779_210_2084_1_1527318022944_3914893_185-295153-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 04:54:59,747 WARNING [train.py:1204] (2/4) Exclude cut with ID _3231_210_2106_1_1526205864934_6784220_171-256004-0_sp0.9 from training. Duration: 0.9555625 2023-10-04 04:54:59,747 WARNING [train.py:1204] (2/4) Exclude cut with ID _41314_210_12651_1_1533459476543_3766460_87-272068-0 from training. Duration: 0.83 2023-10-04 04:55:03,029 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_164-152777-0_sp1.1 from training. Duration: 0.92725 2023-10-04 04:55:03,079 WARNING [train.py:1204] (2/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_18-130336-0_sp0.9 from training. Duration: 0.87775 2023-10-04 04:55:04,438 WARNING [train.py:1204] (2/4) Exclude cut with ID _14886_210_4220_1_1530702020355_3516190_175-229073-0_sp1.1 from training. Duration: 0.4090625 2023-10-04 04:55:04,450 WARNING [train.py:1204] (2/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_791-45149-0_sp0.9 from training. Duration: 0.9555625 2023-10-04 04:55:04,535 WARNING [train.py:1204] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_468-189058-0_sp1.1 from training. Duration: 0.87275 2023-10-04 04:55:05,837 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8278_210_2550_1_1528637099_4220413_32-142780-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 04:55:07,265 WARNING [train.py:1204] (2/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_146-218763-0_sp0.9 from training. Duration: 0.9555625 2023-10-04 04:55:07,380 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.0.ff3_skip_rate, batch_count=1532546.6666666667, ans=0.0 2023-10-04 04:55:10,186 WARNING [train.py:1204] (2/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_348-127789-0_sp1.1 from training. Duration: 0.77275 2023-10-04 04:55:10,198 WARNING [train.py:1204] (2/4) Exclude cut with ID _47790_210_16187_1_1533862814066_4207519_288-285445-0_sp1.1 from training. Duration: 0.936375 2023-10-04 04:55:13,183 WARNING [train.py:1204] (2/4) Exclude cut with ID _14603_210_4220_1_1530615688441_3097319_308-181732-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 04:55:13,208 WARNING [train.py:1204] (2/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_895-117428-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 04:55:13,990 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.3.encoder.layers.0.conv_module2.whiten, num_groups=1, num_channels=512, metric=2.86 vs. limit=15.0 2023-10-04 04:55:14,656 WARNING [train.py:1204] (2/4) Exclude cut with ID _9235_210_5219_1_1528891154253_8046120_387-159008-0_sp0.9 from training. Duration: 0.9555625 2023-10-04 04:55:14,669 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_37351_210_7925_1_1533012931_3812113_265-85780-0_sp0.9 from training. Duration: 0.97775 2023-10-04 04:55:14,683 WARNING [train.py:1204] (2/4) Exclude cut with ID _40551_210_10635_1_1533776424632_3416179_361-3964-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 04:55:16,054 WARNING [train.py:1204] (2/4) Exclude cut with ID _25882_210_3036_1_1532775665815_6479510_485-122742-0_sp1.1 from training. Duration: 0.97275 2023-10-04 04:55:19,479 WARNING [train.py:1204] (2/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_98-51676-0 from training. Duration: 0.73 2023-10-04 04:55:22,173 WARNING [train.py:1204] (2/4) Exclude cut with ID _30548_210_16040_1_1532608097130_3146969_371-280610-0_sp1.1 from training. Duration: 0.92725 2023-10-04 04:55:25,122 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.feed_forward1.out_proj.dropout_p, batch_count=1532680.0, ans=0.1 2023-10-04 04:55:26,212 WARNING [train.py:1204] (2/4) Exclude cut with ID IC0671W0081-331924 from training. Duration: 0.896 2023-10-04 04:55:26,325 WARNING [train.py:1204] (2/4) Exclude cut with ID _40284_210_2084_1_1533513679814_3704279_380-160775-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 04:55:29,510 WARNING [train.py:1204] (2/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_57-270037-0_sp1.1 from training. Duration: 0.7 2023-10-04 04:55:30,926 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_853-150237-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 04:55:31,632 WARNING [train.py:1204] (2/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_491-261923-0 from training. Duration: 0.98 2023-10-04 04:55:36,932 WARNING [train.py:1204] (2/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_836-244045-0_sp1.1 from training. Duration: 0.97275 2023-10-04 04:55:38,377 WARNING [train.py:1204] (2/4) Exclude cut with ID _14880_210_8895_1_1530680426655_3795259_289-212817-0 from training. Duration: 0.69 2023-10-04 04:55:39,731 WARNING [train.py:1204] (2/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_303-340446-0 from training. Duration: 0.9 2023-10-04 04:55:41,247 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_37152_210_9697_1_1533106573_3844188_418-41736-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 04:55:41,464 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.ff2_skip_rate, batch_count=1532746.6666666667, ans=0.0 2023-10-04 04:55:43,945 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.555e+02 1.952e+02 2.155e+02 2.479e+02 3.753e+02, threshold=4.311e+02, percent-clipped=0.0 2023-10-04 04:55:44,056 WARNING [train.py:1204] (2/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_371-18169-0_sp1.1 from training. Duration: 0.7818125 2023-10-04 04:55:44,084 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_187-219984-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 04:55:46,003 WARNING [train.py:1204] (2/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_63-267585-0 from training. Duration: 0.9 2023-10-04 04:55:46,628 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.5.encoder.layers.0.self_attn_weights.whiten_keys, num_groups=4, num_channels=128, metric=1.87 vs. limit=6.0 2023-10-04 04:55:49,425 WARNING [train.py:1204] (2/4) Exclude cut with ID _27013_210_1389_1_1532437173240_3252649_222-125573-0 from training. Duration: 0.98 2023-10-04 04:55:49,474 WARNING [train.py:1204] (2/4) Exclude cut with ID _14821_210_8750_1_1530680503991_6829359_189-297451-0 from training. Duration: 0.55 2023-10-04 04:55:50,929 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_557-163391-0_sp1.1 from training. Duration: 0.97275 2023-10-04 04:55:52,272 WARNING [train.py:1204] (2/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_65-88242-0_sp1.1 from training. Duration: 0.5090625 2023-10-04 04:55:55,219 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.conv_module1.balancer2.prob, batch_count=1532813.3333333333, ans=0.125 2023-10-04 04:55:56,244 INFO [train.py:1046] (2/4) Epoch 44, batch 1500, loss[loss=0.1637, simple_loss=0.2405, pruned_loss=0.04348, over 23716.00 frames. ], tot_loss[loss=0.1531, simple_loss=0.2333, pruned_loss=0.03649, over 4720348.41 frames. ], batch size: 232, lr: 2.31e-03, grad_scale: 8.0 2023-10-04 04:56:03,049 WARNING [train.py:1204] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_218-295097-0 from training. Duration: 0.77 2023-10-04 04:56:03,086 WARNING [train.py:1204] (2/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_686-247600-0_sp1.1 from training. Duration: 0.836375 2023-10-04 04:56:03,089 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_397-147196-0_sp0.9 from training. Duration: 0.888875 2023-10-04 04:56:03,163 WARNING [train.py:1204] (2/4) Exclude cut with ID _53950_210_5816_1_1534417264919_3862951_237-136449-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 04:56:04,587 WARNING [train.py:1204] (2/4) Exclude cut with ID _3120_210_3525_1_1526036143112_4072980_74-178400-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 04:56:05,994 WARNING [train.py:1204] (2/4) Exclude cut with ID _5166_210_2084_1_1527073197160_4087993_309-222545-0_sp0.9 from training. Duration: 0.9333125 2023-10-04 04:56:07,348 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7544_210_6753_1_1528587722_3576686_172-171750-0 from training. Duration: 0.65 2023-10-04 04:56:08,757 WARNING [train.py:1204] (2/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_3-237434-0_sp0.9 from training. Duration: 0.8444375 2023-10-04 04:56:08,802 WARNING [train.py:1204] (2/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_101-293367-0_sp0.9 from training. Duration: 0.7 2023-10-04 04:56:08,809 WARNING [train.py:1204] (2/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_818-213552-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 04:56:10,363 WARNING [train.py:1204] (2/4) Exclude cut with ID _14349_210_8341_1_1530615710818_3442470_115-285975-0_sp1.1 from training. Duration: 0.7818125 2023-10-04 04:56:13,071 WARNING [train.py:1204] (2/4) Exclude cut with ID _30800_210_22382_1_1532499347904_4515959_291-309749-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 04:56:13,160 WARNING [train.py:1204] (2/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_715-3462-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 04:56:17,773 WARNING [train.py:1204] (2/4) Exclude cut with ID _59502_210_12210_1_1535107024698_1835139_13-331827-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 04:56:17,792 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_4528_210_3273_1_1527076654_3276777_342-168068-0 from training. Duration: 0.87 2023-10-04 04:56:19,187 WARNING [train.py:1204] (2/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_215-141059-0_sp0.9 from training. Duration: 0.688875 2023-10-04 04:56:19,207 WARNING [train.py:1204] (2/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_106-286130-0_sp1.1 from training. Duration: 0.8909375 2023-10-04 04:56:20,053 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.0.layers.1.self_attn1.whiten, num_groups=1, num_channels=192, metric=10.21 vs. limit=22.5 2023-10-04 04:56:21,186 WARNING [train.py:1204] (2/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_480-77655-0_sp1.1 from training. Duration: 0.97275 2023-10-04 04:56:22,830 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.balancer2.prob, batch_count=1532880.0, ans=0.125 2023-10-04 04:56:23,939 WARNING [train.py:1204] (2/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_848-280957-0 from training. Duration: 0.87 2023-10-04 04:56:26,862 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder_embed.conv.8.prob, batch_count=1532946.6666666667, ans=0.125 2023-10-04 04:56:28,099 WARNING [train.py:1204] (2/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_122-109023-0 from training. Duration: 0.76 2023-10-04 04:56:29,507 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_11305_210_1794_1_1529750428_7178148_645-233480-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 04:56:29,575 WARNING [train.py:1204] (2/4) Exclude cut with ID _7562_210_2084_1_1528887767916_3946229_345-169156-0 from training. Duration: 0.58 2023-10-04 04:56:34,167 WARNING [train.py:1204] (2/4) Exclude cut with ID _7562_210_2084_1_1528887767916_3946229_345-169156-0_sp1.1 from training. Duration: 0.52725 2023-10-04 04:56:36,167 WARNING [train.py:1204] (2/4) Exclude cut with ID _13253_210_1925_1_1530757775819_3577873_253-246386-0_sp1.1 from training. Duration: 0.8181875 2023-10-04 04:56:37,546 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_136-264444-0_sp1.1 from training. Duration: 0.97275 2023-10-04 04:56:37,569 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_2168_210_2290_1_1525606920_1849400_136-222991-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 04:56:37,824 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.bypass.scale_min, batch_count=1532946.6666666667, ans=0.2 2023-10-04 04:56:37,834 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.bypass.skip_rate, batch_count=1532946.6666666667, ans=0.07 2023-10-04 04:56:38,958 WARNING [train.py:1204] (2/4) Exclude cut with ID _7852_210_5219_1_1528286471316_8139400_242-105336-0 from training. Duration: 0.72 2023-10-04 04:56:40,249 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_5960_210_1794_1_1527403865_3990999_515-2613-0_sp1.1 from training. Duration: 0.8 2023-10-04 04:56:40,263 WARNING [train.py:1204] (2/4) Exclude cut with ID _39688_210_19596_1_1533371626725_4154159_551-167834-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 04:56:40,328 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_43802_210_2290_1_1533516203_8829504_85-195240-0 from training. Duration: 0.87 2023-10-04 04:56:41,648 WARNING [train.py:1204] (2/4) Exclude cut with ID _63171_210_15488_1_1535104792794_3295110_113-113534-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 04:56:47,084 WARNING [train.py:1204] (2/4) Exclude cut with ID _8833_210_5219_1_1528592665210_7553059_319-349181-0_sp1.1 from training. Duration: 0.9 2023-10-04 04:56:47,085 WARNING [train.py:1204] (2/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_225-43381-0 from training. Duration: 0.94 2023-10-04 04:56:50,647 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_5959_210_1794_1_1527400400_3445726_412-44052-0_sp0.9 from training. Duration: 0.8555625 2023-10-04 04:56:52,622 WARNING [train.py:1204] (2/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_356-167816-0_sp1.1 from training. Duration: 0.6545625 2023-10-04 04:56:55,519 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9012W0112-501244 from training. Duration: 0.768 2023-10-04 04:56:55,568 WARNING [train.py:1204] (2/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_177-326952-0_sp1.1 from training. Duration: 0.92725 2023-10-04 04:56:56,844 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9016W0116-502789 from training. Duration: 0.896 2023-10-04 04:56:58,176 WARNING [train.py:1204] (2/4) Exclude cut with ID _56235_210_10640_1_1534557335235_4161099_557-204815-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 04:56:58,274 WARNING [train.py:1204] (2/4) Exclude cut with ID _55857_210_2346_1_1534644007831_6898209_279-229469-0_sp1.1 from training. Duration: 0.936375 2023-10-04 04:56:59,647 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9029W0375-509764 from training. Duration: 0.896 2023-10-04 04:56:59,721 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_2295_210_2087_1_1525500993_2407863_234-83104-0_sp0.9 from training. Duration: 0.688875 2023-10-04 04:57:02,394 WARNING [train.py:1204] (2/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_266-137104-0 from training. Duration: 0.65 2023-10-04 04:57:04,366 WARNING [train.py:1204] (2/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_289-163927-0_sp1.1 from training. Duration: 0.92725 2023-10-04 04:57:06,618 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.1.encoder.layers.1.self_attn2.whiten, num_groups=1, num_channels=256, metric=10.85 vs. limit=22.5 2023-10-04 04:57:07,122 WARNING [train.py:1204] (2/4) Exclude cut with ID _12733_210_8341_1_1530167424556_3646380_86-225314-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 04:57:07,142 WARNING [train.py:1204] (2/4) Exclude cut with ID _7877_210_6014_1_1528286358099_3750136_618-284064-0_sp1.1 from training. Duration: 0.92725 2023-10-04 04:57:08,437 WARNING [train.py:1204] (2/4) Exclude cut with ID _55844_210_2346_1_1534471212569_7625707_1017-31469-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 04:57:08,471 WARNING [train.py:1204] (2/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_617-241446-0_sp1.1 from training. Duration: 0.92725 2023-10-04 04:57:08,528 WARNING [train.py:1204] (2/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_866-173440-0_sp1.1 from training. Duration: 0.8181875 2023-10-04 04:57:11,061 INFO [train.py:1046] (2/4) Epoch 44, batch 1550, loss[loss=0.1591, simple_loss=0.2508, pruned_loss=0.03376, over 24444.00 frames. ], tot_loss[loss=0.1542, simple_loss=0.2344, pruned_loss=0.037, over 4724126.68 frames. ], batch size: 69, lr: 2.31e-03, grad_scale: 8.0 2023-10-04 04:57:11,156 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_9460_210_4211_1_1528954587_10049667_480-260269-0 from training. Duration: 0.56 2023-10-04 04:57:11,194 WARNING [train.py:1204] (2/4) Exclude cut with ID _44659_210_2721_1_1534036120061_6614860_626-298884-0 from training. Duration: 0.97 2023-10-04 04:57:12,457 WARNING [train.py:1204] (2/4) Exclude cut with ID _53565_210_10635_1_1534337218335_3820259_287-18120-0_sp1.1 from training. Duration: 0.8818125 2023-10-04 04:57:12,511 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_791-197891-0 from training. Duration: 0.84 2023-10-04 04:57:12,573 WARNING [train.py:1204] (2/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_527-238990-0 from training. Duration: 0.9 2023-10-04 04:57:15,324 WARNING [train.py:1204] (2/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_449-65969-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 04:57:16,847 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_553-341452-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 04:57:18,071 WARNING [train.py:1204] (2/4) Exclude cut with ID _16909_210_5724_1_1531292079912_4118919_300-47574-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 04:57:18,085 WARNING [train.py:1204] (2/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_70-27732-0_sp1.1 from training. Duration: 0.8545625 2023-10-04 04:57:20,072 WARNING [train.py:1204] (2/4) Exclude cut with ID _44718_210_5816_1_1534050051869_6469030_727-28074-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 04:57:21,329 WARNING [train.py:1204] (2/4) Exclude cut with ID _55857_210_2346_1_1534644007831_6898209_357-123664-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 04:57:24,669 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9123W0004-555964 from training. Duration: 0.896 2023-10-04 04:57:24,697 WARNING [train.py:1204] (2/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_445-145208-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 04:57:24,730 WARNING [train.py:1204] (2/4) Exclude cut with ID _3675_210_4557_1_1526639934408_4182089_456-18305-0_sp1.1 from training. Duration: 0.6909375 2023-10-04 04:57:25,595 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.1.encoder.layers.0.feed_forward1.out_whiten, num_groups=1, num_channels=256, metric=4.86 vs. limit=15.0 2023-10-04 04:57:26,121 WARNING [train.py:1204] (2/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_1-99680-0_sp1.1 from training. Duration: 0.6090625 2023-10-04 04:57:28,863 WARNING [train.py:1204] (2/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_146-282400-0_sp0.9 from training. Duration: 0.788875 2023-10-04 04:57:28,883 WARNING [train.py:1204] (2/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_844-96989-0 from training. Duration: 0.71 2023-10-04 04:57:31,551 WARNING [train.py:1204] (2/4) Exclude cut with ID _29853_210_7497_1_1532505600851_6642110_128-146364-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 04:57:31,573 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_224-322394-0 from training. Duration: 0.66 2023-10-04 04:57:32,925 WARNING [train.py:1204] (2/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_191-59784-0 from training. Duration: 0.88 2023-10-04 04:57:32,933 WARNING [train.py:1204] (2/4) Exclude cut with ID _12556_210_8341_1_1530081024100_7327690_725-193477-0 from training. Duration: 0.98 2023-10-04 04:57:32,978 WARNING [train.py:1204] (2/4) Exclude cut with ID _5166_210_2084_1_1527073197160_4087993_523-79838-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 04:57:34,697 WARNING [train.py:1204] (2/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_398-334558-0_sp1.1 from training. Duration: 0.87275 2023-10-04 04:57:38,071 WARNING [train.py:1204] (2/4) Exclude cut with ID _6456_210_1534_1_1527681634212_3181049_171-36697-0_sp1.1 from training. Duration: 0.936375 2023-10-04 04:57:40,742 WARNING [train.py:1204] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_700-183549-0 from training. Duration: 0.68 2023-10-04 04:57:40,744 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8853_210_3273_1_1528633899_818968_51-187685-0 from training. Duration: 0.69 2023-10-04 04:57:49,050 WARNING [train.py:1204] (2/4) Exclude cut with ID _7719_210_3525_1_1528456153163_4296960_333-110399-0_sp1.1 from training. Duration: 0.87275 2023-10-04 04:57:52,164 WARNING [train.py:1204] (2/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_1022-83740-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 04:57:52,181 WARNING [train.py:1204] (2/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_61-327062-0_sp0.9 from training. Duration: 0.9 2023-10-04 04:57:52,187 WARNING [train.py:1204] (2/4) Exclude cut with ID _47251_210_18628_1_1533814063128_4137250_214-88076-0_sp1.1 from training. Duration: 0.8 2023-10-04 04:57:52,255 WARNING [train.py:1204] (2/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_839-173017-0 from training. Duration: 0.41 2023-10-04 04:57:58,867 WARNING [train.py:1204] (2/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_299-34413-0_sp1.1 from training. Duration: 0.5909375 2023-10-04 04:58:00,236 WARNING [train.py:1204] (2/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_664-159457-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 04:58:03,041 WARNING [train.py:1204] (2/4) Exclude cut with ID _66873_210_5517_1_1535454136506_4293370_483-291760-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 04:58:06,240 WARNING [train.py:1204] (2/4) Exclude cut with ID _30800_210_22382_1_1532499347904_4515959_287-255474-0_sp1.1 from training. Duration: 0.92725 2023-10-04 04:58:07,542 WARNING [train.py:1204] (2/4) Exclude cut with ID _48677_210_5663_1_1533902405919_3892989_111-11439-0_sp1.1 from training. Duration: 0.87275 2023-10-04 04:58:07,549 WARNING [train.py:1204] (2/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_399-156791-0 from training. Duration: 0.83 2023-10-04 04:58:07,590 WARNING [train.py:1204] (2/4) Exclude cut with ID _3992_210_1747_1_1526644816031_3731060_36-147855-0_sp0.9 from training. Duration: 0.8666875 2023-10-04 04:58:09,563 WARNING [train.py:1204] (2/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_442-320317-0_sp1.1 from training. Duration: 0.7454375 2023-10-04 04:58:09,589 WARNING [train.py:1204] (2/4) Exclude cut with ID _55429_210_10626_1_1534467588527_7174143_953-80868-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 04:58:10,910 WARNING [train.py:1204] (2/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_159-328157-0_sp0.9 from training. Duration: 0.57775 2023-10-04 04:58:10,918 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9309W0197-640324 from training. Duration: 0.896 2023-10-04 04:58:13,600 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.589e+02 1.992e+02 2.204e+02 2.459e+02 3.860e+02, threshold=4.408e+02, percent-clipped=0.0 2023-10-04 04:58:13,709 WARNING [train.py:1204] (2/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_361-30822-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 04:58:13,857 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=1533413.3333333333, ans=0.1 2023-10-04 04:58:19,111 WARNING [train.py:1204] (2/4) Exclude cut with ID _5250_210_5219_1_1527249561806_8692110_636-251213-0 from training. Duration: 0.94 2023-10-04 04:58:23,415 WARNING [train.py:1204] (2/4) Exclude cut with ID _49728_210_12651_1_1534041034294_6788150_468-333827-0_sp1.1 from training. Duration: 0.963625 2023-10-04 04:58:24,658 INFO [train.py:1046] (2/4) Epoch 44, batch 1600, loss[loss=0.1663, simple_loss=0.2424, pruned_loss=0.04509, over 23398.00 frames. ], tot_loss[loss=0.1541, simple_loss=0.2346, pruned_loss=0.03679, over 4738376.12 frames. ], batch size: 285, lr: 2.31e-03, grad_scale: 16.0 2023-10-04 04:58:24,745 WARNING [train.py:1204] (2/4) Exclude cut with ID _25882_210_3036_1_1532775665815_6479510_623-191759-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 04:58:24,793 WARNING [train.py:1204] (2/4) Exclude cut with ID _5189_210_4557_1_1527248319428_3769229_199-53526-0 from training. Duration: 0.42 2023-10-04 04:58:26,974 WARNING [train.py:1204] (2/4) Exclude cut with ID _6274_210_2084_1_1527937583837_7494020_32-205288-0_sp0.9 from training. Duration: 0.8666875 2023-10-04 04:58:28,280 WARNING [train.py:1204] (2/4) Exclude cut with ID _65263_210_22356_1_1535763598691_3921037_401-346646-0_sp1.1 from training. Duration: 0.963625 2023-10-04 04:58:28,284 WARNING [train.py:1204] (2/4) Exclude cut with ID _12555_210_8750_1_1530075594402_3780239_449-267986-0_sp1.1 from training. Duration: 0.7545625 2023-10-04 04:58:28,299 WARNING [train.py:1204] (2/4) Exclude cut with ID _69884_210_9771_1_1535846392818_4166169_430-201020-0_sp1.1 from training. Duration: 0.8818125 2023-10-04 04:58:28,388 WARNING [train.py:1204] (2/4) Exclude cut with ID _8394_210_6702_1_1528590504316_3540189_379-49457-0_sp1.1 from training. Duration: 0.7909375 2023-10-04 04:58:28,528 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.conv_module1.balancer1.prob, batch_count=1533480.0, ans=0.125 2023-10-04 04:58:33,128 WARNING [train.py:1204] (2/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_200-105218-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 04:58:33,186 WARNING [train.py:1204] (2/4) Exclude cut with ID _3867_210_1883_1_1526641200575_4475830_319-349850-0 from training. Duration: 0.67 2023-10-04 04:58:33,261 WARNING [train.py:1204] (2/4) Exclude cut with ID _28317_210_12742_1_1532480302381_8510325_916-195848-0 from training. Duration: 0.92 2023-10-04 04:58:35,914 WARNING [train.py:1204] (2/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_436-186311-0 from training. Duration: 0.65 2023-10-04 04:58:37,863 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_3910_210_1794_1_1526777667_3747260_302-180617-0_sp1.1 from training. Duration: 0.97275 2023-10-04 04:58:39,897 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder_embed.convnext.out_balancer.prob, batch_count=1533546.6666666667, ans=0.125 2023-10-04 04:58:39,961 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.0.feed_forward1.out_proj.dropout_p, batch_count=1533546.6666666667, ans=0.1 2023-10-04 04:58:40,990 WARNING [train.py:1204] (2/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_102-92751-0 from training. Duration: 0.99 2023-10-04 04:58:42,402 WARNING [train.py:1204] (2/4) Exclude cut with ID _51899_210_20034_1_1534129254466_3669220_265-215428-0_sp1.1 from training. Duration: 0.8909375 2023-10-04 04:58:43,787 WARNING [train.py:1204] (2/4) Exclude cut with ID _58583_210_5236_1_1534726835266_3574249_438-198993-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 04:58:48,009 WARNING [train.py:1204] (2/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_166-166990-0_sp1.1 from training. Duration: 0.87275 2023-10-04 04:58:49,512 WARNING [train.py:1204] (2/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_907-151398-0 from training. Duration: 0.37 2023-10-04 04:58:52,184 WARNING [train.py:1204] (2/4) Exclude cut with ID _38670_210_15379_1_1533448810659_4391059_452-256669-0_sp1.1 from training. Duration: 0.936375 2023-10-04 04:58:52,230 WARNING [train.py:1204] (2/4) Exclude cut with ID _48677_210_5663_1_1533902405919_3892989_408-40740-0 from training. Duration: 0.83 2023-10-04 04:58:53,599 WARNING [train.py:1204] (2/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_903-158756-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 04:58:53,643 WARNING [train.py:1204] (2/4) Exclude cut with ID _13252_210_1925_1_1530671372047_3336909_397-216497-0 from training. Duration: 0.96 2023-10-04 04:58:59,100 WARNING [train.py:1204] (2/4) Exclude cut with ID _55429_210_10626_1_1534467588527_7174143_97-335084-0 from training. Duration: 0.97 2023-10-04 04:59:06,158 WARNING [train.py:1204] (2/4) Exclude cut with ID _45329_210_7005_1_1534157951211_6774339_453-267730-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 04:59:08,051 WARNING [train.py:1204] (2/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_153-97441-0 from training. Duration: 0.56 2023-10-04 04:59:09,373 WARNING [train.py:1204] (2/4) Exclude cut with ID _40901_210_5665_1_1533363789958_3856130_312-329122-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 04:59:09,419 WARNING [train.py:1204] (2/4) Exclude cut with ID _30800_210_22382_1_1532499347904_4515959_97-126471-0_sp1.1 from training. Duration: 0.97275 2023-10-04 04:59:09,419 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_11305_210_1794_1_1529750428_7178148_257-312870-0_sp1.1 from training. Duration: 0.8 2023-10-04 04:59:10,338 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.1.encoder.layers.0.conv_module2.whiten, num_groups=1, num_channels=256, metric=6.85 vs. limit=15.0 2023-10-04 04:59:11,007 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.self_attn_weights.pos_emb_skip_rate, batch_count=1533680.0, ans=0.0 2023-10-04 04:59:12,837 WARNING [train.py:1204] (2/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_454-314901-0_sp0.9 from training. Duration: 0.511125 2023-10-04 04:59:16,815 WARNING [train.py:1204] (2/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_282-136641-0_sp1.1 from training. Duration: 0.3818125 2023-10-04 04:59:19,483 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_46013_210_7234_1_1533776362_3744032_493-160724-0_sp1.1 from training. Duration: 0.963625 2023-10-04 04:59:19,516 WARNING [train.py:1204] (2/4) Exclude cut with ID _67831_210_12392_1_1535612464202_3092612_259-33442-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 04:59:19,572 WARNING [train.py:1204] (2/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_289-62384-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 04:59:19,636 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6545_210_2040_1_1527774670_4245258_379-14182-0_sp0.9 from training. Duration: 0.888875 2023-10-04 04:59:20,993 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_548-294316-0_sp1.1 from training. Duration: 0.736375 2023-10-04 04:59:22,356 WARNING [train.py:1204] (2/4) Exclude cut with ID _6275_210_2084_1_1528110062213_7669049_857-298942-0_sp1.1 from training. Duration: 0.77275 2023-10-04 04:59:23,658 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6673_210_3273_1_1527767790_3173240_327-306185-0_sp1.1 from training. Duration: 0.7545625 2023-10-04 04:59:28,674 WARNING [train.py:1204] (2/4) Exclude cut with ID _61008_210_10633_1_1534852769198_6974370_814-107647-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 04:59:30,006 WARNING [train.py:1204] (2/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_116-332008-0_sp1.1 from training. Duration: 0.8909375 2023-10-04 04:59:32,713 WARNING [train.py:1204] (2/4) Exclude cut with ID _32704_210_8954_1_1533017488840_6747370_266-329755-0 from training. Duration: 0.89 2023-10-04 04:59:32,716 WARNING [train.py:1204] (2/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_271-269084-0_sp0.9 from training. Duration: 0.92225 2023-10-04 04:59:34,035 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_3171_210_2087_1_1526109084_2330007_66-260583-0 from training. Duration: 0.72 2023-10-04 04:59:38,660 WARNING [train.py:1204] (2/4) Exclude cut with ID _5250_210_5219_1_1527249561806_8692110_1326-276386-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 04:59:40,524 INFO [train.py:1046] (2/4) Epoch 44, batch 1650, loss[loss=0.1597, simple_loss=0.2482, pruned_loss=0.03554, over 23931.00 frames. ], tot_loss[loss=0.1555, simple_loss=0.2356, pruned_loss=0.03774, over 4708346.14 frames. ], batch size: 80, lr: 2.31e-03, grad_scale: 16.0 2023-10-04 04:59:41,984 WARNING [train.py:1204] (2/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_372-30302-0_sp1.1 from training. Duration: 0.7818125 2023-10-04 04:59:42,235 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.ff3_skip_rate, batch_count=1533813.3333333333, ans=0.0 2023-10-04 04:59:43,357 WARNING [train.py:1204] (2/4) Exclude cut with ID _25882_210_3036_1_1532775665815_6479510_631-257029-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 04:59:44,536 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_3691_210_3528_1_1526468169_4062053_355-334432-0 from training. Duration: 0.87 2023-10-04 04:59:44,546 WARNING [train.py:1204] (2/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_839-173017-0_sp1.1 from training. Duration: 0.37275 2023-10-04 04:59:44,557 WARNING [train.py:1204] (2/4) Exclude cut with ID _14998_210_5219_1_1530705582080_7474970_441-91466-0 from training. Duration: 0.97 2023-10-04 04:59:44,600 WARNING [train.py:1204] (2/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_108-53929-0 from training. Duration: 0.59 2023-10-04 04:59:48,876 WARNING [train.py:1204] (2/4) Exclude cut with ID _9389_210_1664_1_1529217337301_5289269_260-1942-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 04:59:48,939 WARNING [train.py:1204] (2/4) Exclude cut with ID _56423_210_5854_1_1534896061049_3165969_198-61547-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 04:59:50,147 WARNING [train.py:1204] (2/4) Exclude cut with ID _44778_210_2054_1_1533798127203_7443090_412-142122-0_sp1.1 from training. Duration: 0.92725 2023-10-04 04:59:50,164 WARNING [train.py:1204] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_88-341718-0_sp1.1 from training. Duration: 0.6 2023-10-04 04:59:52,883 WARNING [train.py:1204] (2/4) Exclude cut with ID _61079_210_4525_1_1534921008198_4011419_334-298697-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 04:59:54,337 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_5465_210_4211_1_1527227253_55357_8-2499-0_sp1.1 from training. Duration: 0.996375 2023-10-04 04:59:57,071 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_447-201565-0_sp1.1 from training. Duration: 0.97275 2023-10-04 04:59:57,077 WARNING [train.py:1204] (2/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_253-161789-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 04:59:57,080 WARNING [train.py:1204] (2/4) Exclude cut with ID _44304_210_15374_1_1533639352765_4613129_302-22084-0_sp0.9 from training. Duration: 0.9555625 2023-10-04 04:59:57,097 WARNING [train.py:1204] (2/4) Exclude cut with ID _26664_210_7770_1_1532258943280_3418270_187-80815-0_sp1.1 from training. Duration: 0.8454375 2023-10-04 04:59:58,464 WARNING [train.py:1204] (2/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_68-212801-0 from training. Duration: 0.73 2023-10-04 04:59:58,509 WARNING [train.py:1204] (2/4) Exclude cut with ID _29345_210_4381_1_1532689168263_3860169_149-257268-0 from training. Duration: 0.81 2023-10-04 04:59:58,640 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.1.feed_forward3.hidden_balancer.prob, batch_count=1533880.0, ans=0.125 2023-10-04 05:00:04,960 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_213-103282-0_sp1.1 from training. Duration: 0.7090625 2023-10-04 05:00:06,439 WARNING [train.py:1204] (2/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_156-302675-0_sp1.1 from training. Duration: 0.5 2023-10-04 05:00:14,895 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.5.encoder.layers.0.conv_module1.whiten, num_groups=1, num_channels=256, metric=8.33 vs. limit=15.0 2023-10-04 05:00:15,851 WARNING [train.py:1204] (2/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_125-322172-0 from training. Duration: 0.93 2023-10-04 05:00:15,949 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.0.conv_module2.balancer1.prob, batch_count=1533946.6666666667, ans=0.125 2023-10-04 05:00:16,266 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.5.encoder.layers.1.feed_forward2.out_whiten, num_groups=1, num_channels=256, metric=8.39 vs. limit=15.0 2023-10-04 05:00:18,435 WARNING [train.py:1204] (2/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_493-60474-0_sp1.1 from training. Duration: 0.963625 2023-10-04 05:00:19,820 WARNING [train.py:1204] (2/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_156-302675-0 from training. Duration: 0.55 2023-10-04 05:00:21,333 WARNING [train.py:1204] (2/4) Exclude cut with ID _41314_210_12651_1_1533459476543_3766460_123-267862-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 05:00:24,055 WARNING [train.py:1204] (2/4) Exclude cut with ID _28427_210_4220_1_1532581240967_3061069_201-143747-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 05:00:24,090 WARNING [train.py:1204] (2/4) Exclude cut with ID _8772_210_1850_1_1528635595972_3664011_96-12756-0_sp1.1 from training. Duration: 0.8909375 2023-10-04 05:00:24,115 WARNING [train.py:1204] (2/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_1347-145675-0_sp1.1 from training. Duration: 0.936375 2023-10-04 05:00:25,515 WARNING [train.py:1204] (2/4) Exclude cut with ID _9235_210_5219_1_1528891154253_8046120_387-159008-0_sp1.1 from training. Duration: 0.7818125 2023-10-04 05:00:26,795 WARNING [train.py:1204] (2/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_400-201896-0_sp1.1 from training. Duration: 0.963625 2023-10-04 05:00:29,616 WARNING [train.py:1204] (2/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_326-174612-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 05:00:30,812 WARNING [train.py:1204] (2/4) Exclude cut with ID _55844_210_2346_1_1534471212569_7625707_390-116233-0_sp1.1 from training. Duration: 0.963625 2023-10-04 05:00:30,848 WARNING [train.py:1204] (2/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_419-317062-0_sp1.1 from training. Duration: 0.82725 2023-10-04 05:00:30,897 WARNING [train.py:1204] (2/4) Exclude cut with ID _29877_210_6228_1_1532397791106_5276149_266-62291-0_sp0.9 from training. Duration: 0.9666875 2023-10-04 05:00:32,853 WARNING [train.py:1204] (2/4) Exclude cut with ID _7857_210_1534_1_1528286515745_6831470_431-338528-0_sp1.1 from training. Duration: 0.92725 2023-10-04 05:00:32,932 WARNING [train.py:1204] (2/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_87-131427-0_sp0.9 from training. Duration: 0.8666875 2023-10-04 05:00:36,391 WARNING [train.py:1204] (2/4) Exclude cut with ID _24055_210_12651_1_1532310907143_976979_92-59356-0_sp1.1 from training. Duration: 0.82725 2023-10-04 05:00:37,759 WARNING [train.py:1204] (2/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_96-290039-0 from training. Duration: 0.56 2023-10-04 05:00:39,154 WARNING [train.py:1204] (2/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_48-182155-0_sp0.9 from training. Duration: 0.9666875 2023-10-04 05:00:39,195 WARNING [train.py:1204] (2/4) Exclude cut with ID _4626_210_3532_1_1526817652717_3916859_446-206656-0 from training. Duration: 0.76 2023-10-04 05:00:40,972 WARNING [train.py:1204] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_168-32906-0 from training. Duration: 0.69 2023-10-04 05:00:42,345 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.628e+02 1.929e+02 2.067e+02 2.263e+02 2.881e+02, threshold=4.133e+02, percent-clipped=0.0 2023-10-04 05:00:42,439 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_3780_210_2087_1_1526467760_2130483_170-259044-0 from training. Duration: 0.7 2023-10-04 05:00:42,450 WARNING [train.py:1204] (2/4) Exclude cut with ID _65134_210_14763_1_1535335393985_3505368_153-216718-0_sp1.1 from training. Duration: 0.92725 2023-10-04 05:00:42,516 WARNING [train.py:1204] (2/4) Exclude cut with ID _64639_210_5816_1_1535367619437_3528000_461-329156-0_sp1.1 from training. Duration: 0.87275 2023-10-04 05:00:42,557 WARNING [train.py:1204] (2/4) Exclude cut with ID _21989_210_5854_1_1531899214931_3271360_126-347531-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 05:00:42,594 WARNING [train.py:1204] (2/4) Exclude cut with ID _7614_210_4915_1_1529326764092_4537359_96-162197-0_sp1.1 from training. Duration: 0.963625 2023-10-04 05:00:42,595 WARNING [train.py:1204] (2/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_1083-147536-0 from training. Duration: 0.97 2023-10-04 05:00:47,337 WARNING [train.py:1204] (2/4) Exclude cut with ID _64029_210_4381_1_1535178502772_3756459_275-16710-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 05:00:48,947 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.conv_skip_rate, batch_count=1534080.0, ans=0.0 2023-10-04 05:00:50,053 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7264_210_4938_1_1527939911_8132095_800-91995-0_sp0.9 from training. Duration: 0.9555625 2023-10-04 05:00:50,094 WARNING [train.py:1204] (2/4) Exclude cut with ID _3844_210_1390_1_1526727803582_2871209_50-193879-0_sp1.1 from training. Duration: 0.936375 2023-10-04 05:00:51,527 WARNING [train.py:1204] (2/4) Exclude cut with ID _46952_210_5986_1_1534230010357_3107379_232-344503-0 from training. Duration: 0.96 2023-10-04 05:00:54,165 INFO [train.py:1046] (2/4) Epoch 44, batch 1700, loss[loss=0.1594, simple_loss=0.2368, pruned_loss=0.04095, over 23814.00 frames. ], tot_loss[loss=0.1548, simple_loss=0.2351, pruned_loss=0.03729, over 4713525.81 frames. ], batch size: 179, lr: 2.31e-03, grad_scale: 16.0 2023-10-04 05:00:57,091 WARNING [train.py:1204] (2/4) Exclude cut with ID _25808_210_13292_1_1532343514562_7366109_206-14076-0_sp1.1 from training. Duration: 0.936375 2023-10-04 05:00:57,093 WARNING [train.py:1204] (2/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_55-31904-0_sp0.9 from training. Duration: 0.97775 2023-10-04 05:00:57,111 WARNING [train.py:1204] (2/4) Exclude cut with ID _14632_210_9988_1_1530691279392_3923200_161-129530-0 from training. Duration: 0.96 2023-10-04 05:00:58,451 WARNING [train.py:1204] (2/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_770-200430-0_sp1.1 from training. Duration: 0.8090625 2023-10-04 05:00:58,461 WARNING [train.py:1204] (2/4) Exclude cut with ID _6270_210_2084_1_1527850911573_3856420_26-277343-0_sp1.1 from training. Duration: 0.7454375 2023-10-04 05:00:58,462 WARNING [train.py:1204] (2/4) Exclude cut with ID _69884_210_9771_1_1535846392818_4166169_88-23776-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 05:00:59,891 WARNING [train.py:1204] (2/4) Exclude cut with ID _60860_210_8254_1_1534896015875_3557169_245-256666-0_sp1.1 from training. Duration: 0.8818125 2023-10-04 05:00:59,926 WARNING [train.py:1204] (2/4) Exclude cut with ID _5250_210_5219_1_1527249561806_8692110_636-251213-0_sp1.1 from training. Duration: 0.8545625 2023-10-04 05:00:59,945 WARNING [train.py:1204] (2/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_540-131164-0 from training. Duration: 0.92 2023-10-04 05:01:04,601 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_5931_210_2040_1_1527428481_4547999_321-193614-0_sp0.9 from training. Duration: 0.7444375 2023-10-04 05:01:11,279 WARNING [train.py:1204] (2/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_373-225403-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 05:01:14,016 WARNING [train.py:1204] (2/4) Exclude cut with ID _20622_210_3852_1_1531622427080_1093839_57-326027-0_sp1.1 from training. Duration: 0.97275 2023-10-04 05:01:18,837 WARNING [train.py:1204] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_605-257205-0_sp1.1 from training. Duration: 0.536375 2023-10-04 05:01:18,859 WARNING [train.py:1204] (2/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_442-320317-0_sp0.9 from training. Duration: 0.911125 2023-10-04 05:01:20,217 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_35361_210_4238_1_1533262810_7793476_675-87012-0_sp1.1 from training. Duration: 0.8090625 2023-10-04 05:01:20,257 WARNING [train.py:1204] (2/4) Exclude cut with ID _4184_210_2550_1_1526730192013_2219380_306-44736-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 05:01:22,981 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_124-113562-0 from training. Duration: 0.94 2023-10-04 05:01:25,008 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.3.encoder.layers.3.conv_module1.whiten, num_groups=1, num_channels=512, metric=5.60 vs. limit=15.0 2023-10-04 05:01:25,637 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_2821_210_2715_1_1526108385_3763560_73-296673-0_sp0.9 from training. Duration: 0.92225 2023-10-04 05:01:25,652 WARNING [train.py:1204] (2/4) Exclude cut with ID _10301_210_5886_1_1529222275332_3910620_216-46959-0_sp1.1 from training. Duration: 0.92725 2023-10-04 05:01:27,039 WARNING [train.py:1204] (2/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_91-277684-0_sp0.9 from training. Duration: 0.82225 2023-10-04 05:01:27,137 WARNING [train.py:1204] (2/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_351-294650-0_sp0.9 from training. Duration: 0.7 2023-10-04 05:01:30,045 WARNING [train.py:1204] (2/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_209-205365-0 from training. Duration: 0.79 2023-10-04 05:01:32,001 WARNING [train.py:1204] (2/4) Exclude cut with ID _14603_210_4220_1_1530615688441_3097319_97-325053-0 from training. Duration: 0.92 2023-10-04 05:01:33,349 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_49348_210_7927_1_1533954500_3691300_109-276430-0_sp1.1 from training. Duration: 0.92725 2023-10-04 05:01:34,719 WARNING [train.py:1204] (2/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_912-47473-0 from training. Duration: 0.68 2023-10-04 05:01:34,798 WARNING [train.py:1204] (2/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_96-320736-0_sp0.9 from training. Duration: 0.9555625 2023-10-04 05:01:42,863 WARNING [train.py:1204] (2/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_1252-235481-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 05:01:42,963 WARNING [train.py:1204] (2/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_813-261554-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 05:01:44,695 WARNING [train.py:1204] (2/4) Exclude cut with ID _21632_210_2253_1_1531994001765_3744140_110-274654-0_sp0.9 from training. Duration: 0.911125 2023-10-04 05:01:47,336 WARNING [train.py:1204] (2/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_381-317791-0_sp1.1 from training. Duration: 0.663625 2023-10-04 05:01:47,351 WARNING [train.py:1204] (2/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_51-117576-0_sp1.1 from training. Duration: 0.363625 2023-10-04 05:01:47,371 WARNING [train.py:1204] (2/4) Exclude cut with ID _42321_210_7770_1_1533468631352_4366895_104-215915-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 05:01:48,794 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_40896_210_15710_1_1533433956_7100802_216-22824-0_sp1.1 from training. Duration: 0.92725 2023-10-04 05:01:48,795 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_14831_210_7927_1_1530700200_5583610_181-27467-0 from training. Duration: 0.91 2023-10-04 05:01:50,084 WARNING [train.py:1204] (2/4) Exclude cut with ID _30304_210_6319_1_1532476827261_7582060_1024-265645-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 05:01:50,086 WARNING [train.py:1204] (2/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_412-233904-0_sp1.1 from training. Duration: 0.936375 2023-10-04 05:01:50,102 WARNING [train.py:1204] (2/4) Exclude cut with ID _30191_210_1664_1_1533189915047_7196670_911-223461-0_sp1.1 from training. Duration: 0.92725 2023-10-04 05:01:50,118 WARNING [train.py:1204] (2/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_558-148266-0_sp1.1 from training. Duration: 0.963625 2023-10-04 05:01:52,914 WARNING [train.py:1204] (2/4) Exclude cut with ID _58480_210_12392_1_1534811121111_3646160_212-164495-0_sp1.1 from training. Duration: 0.936375 2023-10-04 05:01:52,915 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_454-276429-0_sp0.9 from training. Duration: 0.9666875 2023-10-04 05:01:53,130 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.feed_forward2.hidden_balancer.prob, batch_count=1534413.3333333333, ans=0.125 2023-10-04 05:01:54,304 WARNING [train.py:1204] (2/4) Exclude cut with ID _24736_210_3279_1_1532332894218_2838700_73-197086-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 05:01:54,349 WARNING [train.py:1204] (2/4) Exclude cut with ID _5294_210_1747_1_1527249643928_3696179_529-242298-0_sp1.1 from training. Duration: 0.863625 2023-10-04 05:01:54,396 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_53555_210_5632_1_1534595220_7232297_345-171438-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 05:01:58,578 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_28211_210_6388_1_1532861506_4523224_22-25766-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 05:01:58,760 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.conv_module2.balancer2.prob, batch_count=1534413.3333333333, ans=0.125 2023-10-04 05:01:59,923 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7002_210_1794_1_1527939683_5157286_163-87552-0 from training. Duration: 0.86 2023-10-04 05:02:03,113 WARNING [train.py:1204] (2/4) Exclude cut with ID _56927_210_9969_1_1534848970840_3695229_409-225129-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 05:02:04,592 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_26192_210_7927_1_1532137269_1040115_21-219331-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 05:02:04,716 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.0.feed_forward1.out_proj.dropout_p, batch_count=1534413.3333333333, ans=0.1 2023-10-04 05:02:06,063 WARNING [train.py:1204] (2/4) Exclude cut with ID _15012_210_1968_1_1530856820429_5095350_662-210469-0 from training. Duration: 0.57 2023-10-04 05:02:09,455 INFO [train.py:1046] (2/4) Epoch 44, batch 1750, loss[loss=0.1698, simple_loss=0.2411, pruned_loss=0.04927, over 23816.00 frames. ], tot_loss[loss=0.1546, simple_loss=0.2341, pruned_loss=0.03753, over 4703708.46 frames. ], batch size: 164, lr: 2.31e-03, grad_scale: 16.0 2023-10-04 05:02:12,354 WARNING [train.py:1204] (2/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_582-191016-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 05:02:16,998 WARNING [train.py:1204] (2/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_270-210032-0_sp1.1 from training. Duration: 0.963625 2023-10-04 05:02:17,026 WARNING [train.py:1204] (2/4) Exclude cut with ID _6789_210_1534_1_1527857937218_3118950_262-48853-0_sp0.9 from training. Duration: 0.62225 2023-10-04 05:02:17,091 WARNING [train.py:1204] (2/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_847-41334-0 from training. Duration: 0.44 2023-10-04 05:02:18,281 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_40896_210_15710_1_1533433956_7100802_573-46532-0_sp1.1 from training. Duration: 0.92725 2023-10-04 05:02:19,817 WARNING [train.py:1204] (2/4) Exclude cut with ID _21881_210_3525_1_1531825242757_3798380_247-102591-0_sp1.1 from training. Duration: 0.8909375 2023-10-04 05:02:19,843 WARNING [train.py:1204] (2/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_200-21395-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 05:02:22,686 WARNING [train.py:1204] (2/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_711-15688-0 from training. Duration: 0.43 2023-10-04 05:02:25,375 WARNING [train.py:1204] (2/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_53-75025-0_sp1.1 from training. Duration: 0.963625 2023-10-04 05:02:26,846 WARNING [train.py:1204] (2/4) Exclude cut with ID _6194_210_1534_1_1527511826160_3941194_321-148641-0 from training. Duration: 0.64 2023-10-04 05:02:26,878 WARNING [train.py:1204] (2/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_337-294431-0_sp1.1 from training. Duration: 0.936375 2023-10-04 05:02:28,284 WARNING [train.py:1204] (2/4) Exclude cut with ID _21632_210_2253_1_1531994001765_3744140_110-274654-0_sp1.1 from training. Duration: 0.7454375 2023-10-04 05:02:28,474 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.bypass.skip_rate, batch_count=1534546.6666666667, ans=0.09899494936611666 2023-10-04 05:02:30,966 WARNING [train.py:1204] (2/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_135-174080-0_sp1.1 from training. Duration: 0.4818125 2023-10-04 05:02:32,799 WARNING [train.py:1204] (2/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_21-174514-0 from training. Duration: 0.43 2023-10-04 05:02:34,191 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_38965_210_4852_1_1533174906_3484735_215-294589-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 05:02:35,457 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_548-294316-0 from training. Duration: 0.81 2023-10-04 05:02:43,403 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_103-84010-0_sp1.1 from training. Duration: 0.62725 2023-10-04 05:02:45,430 WARNING [train.py:1204] (2/4) Exclude cut with ID _18199_210_6109_1_1531475789864_4288790_295-198247-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 05:02:45,451 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_13757_210_3528_1_1530440682_4107239_103-313224-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 05:02:48,453 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_34886_210_10633_1_1533203882_1755661_67-294673-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 05:02:48,463 WARNING [train.py:1204] (2/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_350-101734-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 05:02:51,415 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_37357_210_17024_1_1533089504_2316121_204-153986-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 05:02:51,543 WARNING [train.py:1204] (2/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_673-115569-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 05:02:54,283 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_489-230703-0_sp0.9 from training. Duration: 0.9444375 2023-10-04 05:02:54,344 WARNING [train.py:1204] (2/4) Exclude cut with ID _5250_210_5219_1_1527249561806_8692110_575-102496-0_sp1.1 from training. Duration: 0.936375 2023-10-04 05:02:55,659 WARNING [train.py:1204] (2/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_669-93163-0 from training. Duration: 0.98 2023-10-04 05:02:58,338 WARNING [train.py:1204] (2/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_249-281349-0_sp0.9 from training. Duration: 0.9444375 2023-10-04 05:03:01,083 WARNING [train.py:1204] (2/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_106-30782-0 from training. Duration: 0.8 2023-10-04 05:03:02,474 WARNING [train.py:1204] (2/4) Exclude cut with ID _8772_210_1850_1_1528635595972_3664011_398-320049-0_sp1.1 from training. Duration: 0.8 2023-10-04 05:03:04,520 WARNING [train.py:1204] (2/4) Exclude cut with ID _44778_210_2054_1_1533798127203_7443090_516-151727-0_sp1.1 from training. Duration: 0.963625 2023-10-04 05:03:05,788 WARNING [train.py:1204] (2/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_325-62802-0_sp1.1 from training. Duration: 0.7909375 2023-10-04 05:03:09,857 WARNING [train.py:1204] (2/4) Exclude cut with ID _14368_210_4220_1_1530532714870_3595529_93-58002-0_sp0.9 from training. Duration: 0.6333125 2023-10-04 05:03:09,904 WARNING [train.py:1204] (2/4) Exclude cut with ID _4681_210_3525_1_1527251040321_1235713_123-24428-0_sp1.1 from training. Duration: 0.436375 2023-10-04 05:03:11,086 WARNING [train.py:1204] (2/4) Exclude cut with ID _5250_210_5219_1_1527249561806_8692110_1185-56473-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 05:03:12,440 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.628e+02 2.076e+02 2.414e+02 2.954e+02 4.467e+02, threshold=4.828e+02, percent-clipped=1.0 2023-10-04 05:03:13,808 WARNING [train.py:1204] (2/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_96-244299-0_sp1.1 from training. Duration: 0.8 2023-10-04 05:03:17,052 WARNING [train.py:1204] (2/4) Exclude cut with ID _59500_210_8254_1_1534809610680_3736009_104-72815-0_sp1.1 from training. Duration: 0.963625 2023-10-04 05:03:18,610 WARNING [train.py:1204] (2/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_468-283113-0_sp1.1 from training. Duration: 0.87275 2023-10-04 05:03:19,969 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6239_210_4211_1_1527765584_4306698_528-102186-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 05:03:21,282 WARNING [train.py:1204] (2/4) Exclude cut with ID _45297_210_2721_1_1533715351381_3264949_351-147136-0 from training. Duration: 0.87 2023-10-04 05:03:21,286 WARNING [train.py:1204] (2/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_520-51744-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 05:03:22,640 WARNING [train.py:1204] (2/4) Exclude cut with ID _5166_210_2084_1_1527073197160_4087993_310-114452-0_sp1.1 from training. Duration: 0.536375 2023-10-04 05:03:22,644 WARNING [train.py:1204] (2/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_450-165710-0_sp1.1 from training. Duration: 0.92725 2023-10-04 05:03:22,661 WARNING [train.py:1204] (2/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_280-120942-0_sp1.1 from training. Duration: 0.7 2023-10-04 05:03:22,693 WARNING [train.py:1204] (2/4) Exclude cut with ID _65585_210_12210_1_1535450451584_3764098_112-294301-0_sp0.9 from training. Duration: 0.97775 2023-10-04 05:03:24,025 INFO [train.py:1046] (2/4) Epoch 44, batch 1800, loss[loss=0.145, simple_loss=0.2191, pruned_loss=0.03545, over 23403.00 frames. ], tot_loss[loss=0.154, simple_loss=0.2337, pruned_loss=0.03714, over 4706315.37 frames. ], batch size: 285, lr: 2.31e-03, grad_scale: 16.0 2023-10-04 05:03:24,110 WARNING [train.py:1204] (2/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_98-51676-0_sp0.9 from training. Duration: 0.811125 2023-10-04 05:03:26,837 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_33348_210_5632_1_1533086912_3324030_85-28583-0_sp1.1 from training. Duration: 0.7454375 2023-10-04 05:03:28,249 WARNING [train.py:1204] (2/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_336-208232-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 05:03:29,665 WARNING [train.py:1204] (2/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_339-104791-0_sp0.9 from training. Duration: 0.5444375 2023-10-04 05:03:31,126 WARNING [train.py:1204] (2/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_559-29840-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 05:03:35,172 WARNING [train.py:1204] (2/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_117-290916-0_sp0.9 from training. Duration: 0.5666875 2023-10-04 05:03:35,260 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_43802_210_2290_1_1533516203_8829504_185-112289-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 05:03:38,513 WARNING [train.py:1204] (2/4) Exclude cut with ID _59256_210_5986_1_1535076085997_3589120_155-298797-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 05:03:42,336 WARNING [train.py:1204] (2/4) Exclude cut with ID _44659_210_2721_1_1534036120061_6614860_246-206531-0_sp1.1 from training. Duration: 0.92725 2023-10-04 05:03:42,379 WARNING [train.py:1204] (2/4) Exclude cut with ID _23818_210_6228_1_1531917063884_3676920_469-151944-0_sp1.1 from training. Duration: 0.92725 2023-10-04 05:03:42,457 WARNING [train.py:1204] (2/4) Exclude cut with ID _13252_210_1925_1_1530671372047_3336909_88-276668-0_sp1.1 from training. Duration: 0.9 2023-10-04 05:03:43,915 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.conv_module2.balancer1.prob, batch_count=1534880.0, ans=0.125 2023-10-04 05:03:45,017 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_15101_210_7927_1_1530786415_5033274_320-327527-0_sp1.1 from training. Duration: 0.87275 2023-10-04 05:03:45,026 WARNING [train.py:1204] (2/4) Exclude cut with ID _14998_210_5219_1_1530705582080_7474970_446-112117-0 from training. Duration: 0.9 2023-10-04 05:03:45,100 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_30280_210_1881_1_1532872779_3660139_400-254159-0_sp1.1 from training. Duration: 0.936375 2023-10-04 05:03:47,077 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.3.encoder.layers.2.nonlin_attention.whiten2, num_groups=1, num_channels=512, metric=5.42 vs. limit=15.0 2023-10-04 05:03:48,953 WARNING [train.py:1204] (2/4) Exclude cut with ID _40284_210_2084_1_1533513679814_3704279_158-345341-0_sp1.1 from training. Duration: 0.936375 2023-10-04 05:03:50,596 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.feed_forward1.out_proj.dropout_p, batch_count=1534880.0, ans=0.1 2023-10-04 05:03:51,782 WARNING [train.py:1204] (2/4) Exclude cut with ID 3033-130750-0096-55598-0 from training. Duration: 0.83 2023-10-04 05:03:53,259 WARNING [train.py:1204] (2/4) Exclude cut with ID _9389_210_1664_1_1529217337301_5289269_379-128794-0 from training. Duration: 0.85 2023-10-04 05:03:54,559 WARNING [train.py:1204] (2/4) Exclude cut with ID _3675_210_4557_1_1526639934408_4182089_456-18305-0 from training. Duration: 0.76 2023-10-04 05:03:54,588 WARNING [train.py:1204] (2/4) Exclude cut with ID _29853_210_7497_1_1532505600851_6642110_571-249460-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 05:03:54,686 WARNING [train.py:1204] (2/4) Exclude cut with ID _4068_210_2629_1_1526697350395_1546259_230-325568-0_sp1.1 from training. Duration: 0.92725 2023-10-04 05:03:54,687 WARNING [train.py:1204] (2/4) Exclude cut with ID _34415_210_8750_1_1533085213375_3451116_213-267570-0_sp1.1 from training. Duration: 0.963625 2023-10-04 05:03:55,993 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6767_210_3273_1_1527855783_1718932_142-127132-0_sp1.1 from training. Duration: 0.863625 2023-10-04 05:04:00,153 WARNING [train.py:1204] (2/4) Exclude cut with ID IC0653W0001-322925 from training. Duration: 0.896 2023-10-04 05:04:02,788 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_4528_210_3273_1_1527076654_3276777_209-219804-0_sp1.1 from training. Duration: 0.62725 2023-10-04 05:04:03,502 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.3.encoder.layers.0.nonlin_attention.whiten2, num_groups=1, num_channels=512, metric=7.76 vs. limit=15.0 2023-10-04 05:04:04,570 WARNING [train.py:1204] (2/4) Exclude cut with ID _55844_210_2346_1_1534471212569_7625707_187-204971-0_sp1.1 from training. Duration: 0.936375 2023-10-04 05:04:06,508 WARNING [train.py:1204] (2/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_369-325184-0 from training. Duration: 0.99 2023-10-04 05:04:07,897 WARNING [train.py:1204] (2/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_791-45149-0 from training. Duration: 0.86 2023-10-04 05:04:07,947 WARNING [train.py:1204] (2/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_317-211107-0_sp1.1 from training. Duration: 0.836375 2023-10-04 05:04:09,479 WARNING [train.py:1204] (2/4) Exclude cut with ID _30800_210_22382_1_1532499347904_4515959_488-330913-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 05:04:10,597 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.5.encoder.layers.0.nonlin_attention.whiten2, num_groups=1, num_channels=256, metric=3.44 vs. limit=15.0 2023-10-04 05:04:11,318 WARNING [train.py:1204] (2/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_506-42614-0_sp1.1 from training. Duration: 0.7181875 2023-10-04 05:04:15,085 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.5.encoder.layers.1.conv_module2.whiten, num_groups=1, num_channels=256, metric=5.61 vs. limit=15.0 2023-10-04 05:04:15,910 WARNING [train.py:1204] (2/4) Exclude cut with ID _8696_210_1876_1_1528545632440_3345058_209-54069-0 from training. Duration: 0.67 2023-10-04 05:04:21,489 WARNING [train.py:1204] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_215-202973-0_sp1.1 from training. Duration: 0.8 2023-10-04 05:04:22,855 WARNING [train.py:1204] (2/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_321-215742-0 from training. Duration: 0.5 2023-10-04 05:04:22,914 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_40896_210_15710_1_1533433956_7100802_428-32114-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 05:04:22,922 WARNING [train.py:1204] (2/4) Exclude cut with ID _27013_210_1389_1_1532437173240_3252649_467-263151-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 05:04:24,227 WARNING [train.py:1204] (2/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_734-129761-0_sp0.9 from training. Duration: 0.911125 2023-10-04 05:04:24,278 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_521-214276-0 from training. Duration: 0.44 2023-10-04 05:04:28,365 WARNING [train.py:1204] (2/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_80-147301-0_sp1.1 from training. Duration: 0.763625 2023-10-04 05:04:28,368 WARNING [train.py:1204] (2/4) Exclude cut with ID _9679_210_7497_1_1529129212906_4363929_56-307796-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 05:04:29,849 WARNING [train.py:1204] (2/4) Exclude cut with ID _14832_210_8750_1_1530691351539_3171348_1-54315-0 from training. Duration: 0.87 2023-10-04 05:04:29,849 WARNING [train.py:1204] (2/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_558-155750-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 05:04:30,075 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.attention_skip_rate, batch_count=1535080.0, ans=0.0 2023-10-04 05:04:31,328 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_142-309370-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 05:04:31,353 WARNING [train.py:1204] (2/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_18-46756-0_sp0.9 from training. Duration: 0.87775 2023-10-04 05:04:31,359 WARNING [train.py:1204] (2/4) Exclude cut with ID _61148_210_6897_1_1535094022726_4217610_279-212159-0_sp1.1 from training. Duration: 0.936375 2023-10-04 05:04:32,757 WARNING [train.py:1204] (2/4) Exclude cut with ID _21987_210_5854_1_1531821500482_3637038_164-2641-0_sp1.1 from training. Duration: 0.936375 2023-10-04 05:04:32,795 WARNING [train.py:1204] (2/4) Exclude cut with ID _8064_210_5399_1_1528372299291_4282973_395-340852-0_sp1.1 from training. Duration: 0.8454375 2023-10-04 05:04:35,553 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1077-184261-0_sp1.1 from training. Duration: 0.963625 2023-10-04 05:04:35,561 WARNING [train.py:1204] (2/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_1176-204992-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 05:04:36,680 INFO [train.py:1046] (2/4) Epoch 44, batch 1850, loss[loss=0.1666, simple_loss=0.24, pruned_loss=0.04663, over 23575.00 frames. ], tot_loss[loss=0.1539, simple_loss=0.2337, pruned_loss=0.03708, over 4717979.23 frames. ], batch size: 256, lr: 2.31e-03, grad_scale: 8.0 2023-10-04 05:04:38,682 WARNING [train.py:1204] (2/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_563-347832-0_sp1.1 from training. Duration: 0.8090625 2023-10-04 05:04:40,445 WARNING [train.py:1204] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_380-204756-0_sp0.9 from training. Duration: 0.9555625 2023-10-04 05:04:46,814 WARNING [train.py:1204] (2/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_212-10501-0_sp1.1 from training. Duration: 0.8545625 2023-10-04 05:04:46,847 WARNING [train.py:1204] (2/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_445-117398-0 from training. Duration: 0.89 2023-10-04 05:04:49,825 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.conv_module1.balancer2.prob, batch_count=1535146.6666666667, ans=0.125 2023-10-04 05:04:50,976 WARNING [train.py:1204] (2/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_29-280622-0 from training. Duration: 0.82 2023-10-04 05:04:53,073 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.3.encoder.layers.1.nonlin_attention.whiten2, num_groups=1, num_channels=512, metric=6.83 vs. limit=15.0 2023-10-04 05:04:53,823 WARNING [train.py:1204] (2/4) Exclude cut with ID _26664_210_7770_1_1532258943280_3418270_187-80815-0 from training. Duration: 0.93 2023-10-04 05:04:58,122 WARNING [train.py:1204] (2/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_414-349640-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 05:04:59,380 WARNING [train.py:1204] (2/4) Exclude cut with ID _45021_210_9994_1_1533619751094_3330288_96-239010-0 from training. Duration: 0.91 2023-10-04 05:04:59,383 WARNING [train.py:1204] (2/4) Exclude cut with ID _5189_210_4557_1_1527248319428_3769229_199-53526-0_sp0.9 from training. Duration: 0.4666875 2023-10-04 05:05:08,152 WARNING [train.py:1204] (2/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_63-101996-0_sp1.1 from training. Duration: 0.8 2023-10-04 05:05:09,582 WARNING [train.py:1204] (2/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_199-86432-0 from training. Duration: 0.98 2023-10-04 05:05:12,163 WARNING [train.py:1204] (2/4) Exclude cut with ID _36399_210_16049_1_1532998771377_3698333_207-259013-0_sp1.1 from training. Duration: 0.92725 2023-10-04 05:05:13,406 WARNING [train.py:1204] (2/4) Exclude cut with ID _62574_210_2655_1_1535180407058_3933249_567-111756-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 05:05:17,956 WARNING [train.py:1204] (2/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_436-318113-0 from training. Duration: 0.75 2023-10-04 05:05:18,003 WARNING [train.py:1204] (2/4) Exclude cut with ID _53379_210_20034_1_1534222027363_3854654_43-305214-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 05:05:18,017 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_15126_210_6753_1_1530778677_2591185_113-157651-0_sp1.1 from training. Duration: 0.6181875 2023-10-04 05:05:19,423 WARNING [train.py:1204] (2/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_146-218763-0_sp1.1 from training. Duration: 0.7818125 2023-10-04 05:05:20,997 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.balancer1.prob, batch_count=1535346.6666666667, ans=0.125 2023-10-04 05:05:22,177 WARNING [train.py:1204] (2/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_714-310538-0_sp0.9 from training. Duration: 0.9555625 2023-10-04 05:05:23,486 WARNING [train.py:1204] (2/4) Exclude cut with ID _24271_210_12658_1_1531987270022_7190519_709-16721-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 05:05:26,307 WARNING [train.py:1204] (2/4) Exclude cut with ID _5190_210_4557_1_1527244368799_373439_28-197030-0_sp0.9 from training. Duration: 0.8 2023-10-04 05:05:26,343 WARNING [train.py:1204] (2/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_267-197536-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 05:05:26,375 WARNING [train.py:1204] (2/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_658-282610-0_sp1.1 from training. Duration: 0.4818125 2023-10-04 05:05:26,383 WARNING [train.py:1204] (2/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_257-67050-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 05:05:26,600 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.attention_skip_rate, batch_count=1535346.6666666667, ans=0.0 2023-10-04 05:05:29,095 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_14906_210_7927_1_1530754190_9408633_961-53402-0_sp1.1 from training. Duration: 0.936375 2023-10-04 05:05:31,586 WARNING [train.py:1204] (2/4) Exclude cut with ID _60113_210_16271_1_1535164212481_4386079_490-280290-0_sp1.1 from training. Duration: 0.8909375 2023-10-04 05:05:31,933 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.conv_module1.balancer2.prob, batch_count=1535346.6666666667, ans=0.125 2023-10-04 05:05:33,157 WARNING [train.py:1204] (2/4) Exclude cut with ID _14100_210_8341_1_1530529320613_3678069_258-224599-0 from training. Duration: 0.77 2023-10-04 05:05:34,548 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6961_210_2087_1_1527937168_6815011_565-326898-0_sp1.1 from training. Duration: 0.936375 2023-10-04 05:05:37,433 WARNING [train.py:1204] (2/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_239-316802-0_sp1.1 from training. Duration: 0.62725 2023-10-04 05:05:38,720 WARNING [train.py:1204] (2/4) Exclude cut with ID _55857_210_2346_1_1534644007831_6898209_745-98391-0_sp1.1 from training. Duration: 0.6909375 2023-10-04 05:05:38,730 WARNING [train.py:1204] (2/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_154-135696-0 from training. Duration: 0.8 2023-10-04 05:05:38,732 WARNING [train.py:1204] (2/4) Exclude cut with ID _14368_210_4220_1_1530532714870_3595529_93-58002-0 from training. Duration: 0.57 2023-10-04 05:05:39,920 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.666e+02 2.019e+02 2.202e+02 2.668e+02 3.594e+02, threshold=4.403e+02, percent-clipped=0.0 2023-10-04 05:05:41,951 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9025W0005-507335 from training. Duration: 0.896 2023-10-04 05:05:43,854 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9029W0034-509435 from training. Duration: 0.64 2023-10-04 05:05:45,252 WARNING [train.py:1204] (2/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_76-203867-0_sp1.1 from training. Duration: 0.6545625 2023-10-04 05:05:45,262 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_46639_210_8341_1_1533726088_4495500_66-346325-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 05:05:45,276 WARNING [train.py:1204] (2/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_145-137790-0_sp0.9 from training. Duration: 0.92225 2023-10-04 05:05:46,515 WARNING [train.py:1204] (2/4) Exclude cut with ID _36522_210_8887_1_1533177030611_1772478_7-327299-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 05:05:46,584 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9042W0009-515615 from training. Duration: 0.896 2023-10-04 05:05:46,599 WARNING [train.py:1204] (2/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_39-248870-0_sp0.9 from training. Duration: 0.7666875 2023-10-04 05:05:46,637 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_35441_210_16554_1_1533347572_4092680_362-242912-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 05:05:48,403 WARNING [train.py:1204] (2/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_216-343007-0_sp1.1 from training. Duration: 0.6 2023-10-04 05:05:49,716 WARNING [train.py:1204] (2/4) Exclude cut with ID _3867_210_1883_1_1526641200575_4475830_272-215563-0_sp0.9 from training. Duration: 0.6333125 2023-10-04 05:05:51,013 INFO [train.py:1046] (2/4) Epoch 44, batch 1900, loss[loss=0.1474, simple_loss=0.225, pruned_loss=0.03496, over 23691.00 frames. ], tot_loss[loss=0.1542, simple_loss=0.2342, pruned_loss=0.03712, over 4718937.64 frames. ], batch size: 149, lr: 2.31e-03, grad_scale: 8.0 2023-10-04 05:05:51,120 WARNING [train.py:1204] (2/4) Exclude cut with ID _21783_210_11357_1_1531742458730_3762679_553-175603-0_sp1.1 from training. Duration: 0.92725 2023-10-04 05:05:51,129 WARNING [train.py:1204] (2/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_963-187109-0 from training. Duration: 0.99 2023-10-04 05:05:51,688 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.5.encoder.layers.1.conv_module2.whiten, num_groups=1, num_channels=256, metric=5.60 vs. limit=15.0 2023-10-04 05:05:52,612 WARNING [train.py:1204] (2/4) Exclude cut with ID _44973_210_1511_1_1533794381163_3634770_495-211466-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 05:05:52,629 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9068W0002-529085 from training. Duration: 0.896 2023-10-04 05:05:53,899 WARNING [train.py:1204] (2/4) Exclude cut with ID _5294_210_1747_1_1527249643928_3696179_180-100713-0_sp1.1 from training. Duration: 0.736375 2023-10-04 05:05:53,986 WARNING [train.py:1204] (2/4) Exclude cut with ID _35799_210_5854_1_1533263487887_3529071_149-49689-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 05:05:59,574 WARNING [train.py:1204] (2/4) Exclude cut with ID _67725_210_27767_1_1535608756671_5105359_36-220794-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 05:06:01,002 WARNING [train.py:1204] (2/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_338-96520-0_sp1.1 from training. Duration: 0.8545625 2023-10-04 05:06:02,341 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9104W0004-547160 from training. Duration: 0.896 2023-10-04 05:06:03,642 WARNING [train.py:1204] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_330-327861-0 from training. Duration: 0.67 2023-10-04 05:06:04,974 WARNING [train.py:1204] (2/4) Exclude cut with ID _14087_210_2253_1_1530529224512_3014029_156-257607-0_sp0.9 from training. Duration: 0.92225 2023-10-04 05:06:05,031 WARNING [train.py:1204] (2/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_476-322050-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 05:06:05,040 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9118W0017-553385 from training. Duration: 0.896 2023-10-04 05:06:06,316 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9120W0028-554435 from training. Duration: 0.896 2023-10-04 05:06:09,169 WARNING [train.py:1204] (2/4) Exclude cut with ID _56524_210_9642_1_1534572011779_3619170_145-310842-0 from training. Duration: 0.99 2023-10-04 05:06:10,620 WARNING [train.py:1204] (2/4) Exclude cut with ID _51631_210_8254_1_1534129228420_4084794_270-222508-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 05:06:15,249 WARNING [train.py:1204] (2/4) Exclude cut with ID _14268_210_9206_1_1530622771285_4714099_331-105351-0 from training. Duration: 0.65 2023-10-04 05:06:17,234 WARNING [train.py:1204] (2/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_40-129756-0 from training. Duration: 0.81 2023-10-04 05:06:26,853 WARNING [train.py:1204] (2/4) Exclude cut with ID _3541_210_2477_1_1526475408709_3263973_231-104736-0 from training. Duration: 0.78 2023-10-04 05:06:28,478 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.attention_skip_rate, batch_count=1535613.3333333333, ans=0.0 2023-10-04 05:06:28,507 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.conv_module2.balancer1.prob, batch_count=1535613.3333333333, ans=0.125 2023-10-04 05:06:29,641 WARNING [train.py:1204] (2/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_214-291369-0 from training. Duration: 0.63 2023-10-04 05:06:29,654 WARNING [train.py:1204] (2/4) Exclude cut with ID _62574_210_2655_1_1535180407058_3933249_369-84347-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 05:06:31,035 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9210W0111-599240 from training. Duration: 0.896 2023-10-04 05:06:31,039 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_92-246270-0 from training. Duration: 0.85 2023-10-04 05:06:31,076 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_37152_210_9697_1_1533106573_3844188_29-239070-0 from training. Duration: 0.98 2023-10-04 05:06:31,123 WARNING [train.py:1204] (2/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_239-22076-0 from training. Duration: 0.85 2023-10-04 05:06:31,126 WARNING [train.py:1204] (2/4) Exclude cut with ID _65962_210_29927_1_1535436026519_4174930_368-44639-0_sp1.1 from training. Duration: 0.97275 2023-10-04 05:06:36,506 WARNING [train.py:1204] (2/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_152-212533-0 from training. Duration: 0.97 2023-10-04 05:06:39,346 WARNING [train.py:1204] (2/4) Exclude cut with ID _14603_210_4220_1_1530615688441_3097319_90-31383-0_sp1.1 from training. Duration: 0.7909375 2023-10-04 05:06:43,964 WARNING [train.py:1204] (2/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_196-152868-0_sp1.1 from training. Duration: 0.92725 2023-10-04 05:06:43,966 WARNING [train.py:1204] (2/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_140-181176-0 from training. Duration: 0.92 2023-10-04 05:06:45,891 WARNING [train.py:1204] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_168-32906-0_sp0.9 from training. Duration: 0.7666875 2023-10-04 05:06:49,961 WARNING [train.py:1204] (2/4) Exclude cut with ID _14922_210_6228_1_1530707435273_3693310_13-165512-0 from training. Duration: 0.82 2023-10-04 05:06:50,000 WARNING [train.py:1204] (2/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_381-317791-0_sp0.9 from training. Duration: 0.811125 2023-10-04 05:06:57,581 WARNING [train.py:1204] (2/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_63-267585-0_sp1.1 from training. Duration: 0.8181875 2023-10-04 05:06:57,583 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_326-303945-0_sp1.1 from training. Duration: 0.9 2023-10-04 05:06:57,602 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_194-143381-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 05:06:58,901 WARNING [train.py:1204] (2/4) Exclude cut with ID _39290_210_4220_1_1533862778760_3075118_194-16667-0_sp1.1 from training. Duration: 0.963625 2023-10-04 05:07:00,276 WARNING [train.py:1204] (2/4) Exclude cut with ID _15012_210_1968_1_1530856820429_5095350_662-210469-0_sp0.9 from training. Duration: 0.6333125 2023-10-04 05:07:00,306 WARNING [train.py:1204] (2/4) Exclude cut with ID _4697_210_2069_1_1526815801953_3823098_68-219807-0_sp1.1 from training. Duration: 0.42725 2023-10-04 05:07:00,470 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.nonlin_attention.balancer.prob, batch_count=1535746.6666666667, ans=0.125 2023-10-04 05:07:01,665 WARNING [train.py:1204] (2/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_321-22821-0_sp0.9 from training. Duration: 0.988875 2023-10-04 05:07:02,882 INFO [train.py:1046] (2/4) Epoch 44, batch 1950, loss[loss=0.1547, simple_loss=0.2477, pruned_loss=0.03085, over 24292.00 frames. ], tot_loss[loss=0.1553, simple_loss=0.2356, pruned_loss=0.03747, over 4726834.73 frames. ], batch size: 74, lr: 2.31e-03, grad_scale: 8.0 2023-10-04 05:07:03,162 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.1.bypass.skip_rate, batch_count=1535813.3333333333, ans=0.035 2023-10-04 05:07:03,326 INFO [scaling.py:1118] (2/4) WithLoss: name=encoder.encoders.3.encoder.layers.2.self_attn_weights, loss-sum=0.000e+00 2023-10-04 05:07:04,467 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_14056_210_4163_1_1530604483_7572827_1037-45317-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 05:07:04,469 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_403-39619-0_sp1.1 from training. Duration: 0.763625 2023-10-04 05:07:04,819 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.conv_module1.balancer1.prob, batch_count=1535813.3333333333, ans=0.125 2023-10-04 05:07:05,955 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_20120_210_6753_1_1531643035_5482859_510-96996-0_sp0.9 from training. Duration: 0.9666875 2023-10-04 05:07:05,958 WARNING [train.py:1204] (2/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_225-180155-0_sp1.1 from training. Duration: 0.92725 2023-10-04 05:07:06,001 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_278-272612-0_sp0.9 from training. Duration: 0.811125 2023-10-04 05:07:07,368 WARNING [train.py:1204] (2/4) Exclude cut with ID _53713_210_24116_1_1534314487762_7416751_691-70244-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 05:07:08,985 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.conv_module2.balancer2.prob, batch_count=1535813.3333333333, ans=0.125 2023-10-04 05:07:10,168 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6788_210_1388_1_1527898698_4562021_91-197981-0_sp1.1 from training. Duration: 0.7545625 2023-10-04 05:07:11,569 WARNING [train.py:1204] (2/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_60-18123-0_sp0.9 from training. Duration: 0.97775 2023-10-04 05:07:11,610 WARNING [train.py:1204] (2/4) Exclude cut with ID _35799_210_5854_1_1533263487887_3529071_5-214617-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 05:07:11,614 WARNING [train.py:1204] (2/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_890-5117-0_sp1.1 from training. Duration: 0.7090625 2023-10-04 05:07:16,200 WARNING [train.py:1204] (2/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_660-211088-0 from training. Duration: 0.62 2023-10-04 05:07:16,271 WARNING [train.py:1204] (2/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_314-126819-0_sp0.9 from training. Duration: 0.5555625 2023-10-04 05:07:17,510 WARNING [train.py:1204] (2/4) Exclude cut with ID _21632_210_2253_1_1531994001765_3744140_362-66225-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 05:07:17,596 WARNING [train.py:1204] (2/4) Exclude cut with ID _61118_210_9527_1_1534897668884_3752630_173-258555-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 05:07:17,867 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=1535880.0, ans=0.1 2023-10-04 05:07:19,479 WARNING [train.py:1204] (2/4) Exclude cut with ID _7719_210_3525_1_1528456153163_4296960_88-250259-0_sp0.9 from training. Duration: 0.9333125 2023-10-04 05:07:20,726 WARNING [train.py:1204] (2/4) Exclude cut with ID _15140_210_9288_1_1530791388450_4880149_539-153708-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 05:07:20,748 WARNING [train.py:1204] (2/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_253-42544-0_sp1.1 from training. Duration: 0.97275 2023-10-04 05:07:22,263 WARNING [train.py:1204] (2/4) Exclude cut with ID _33640_210_2655_1_1533117605479_4855469_479-296877-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 05:07:25,536 WARNING [train.py:1204] (2/4) Exclude cut with ID 3033-130750-0096-55598-0_sp1.1 from training. Duration: 0.7545625 2023-10-04 05:07:25,546 WARNING [train.py:1204] (2/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_191-41023-0_sp1.1 from training. Duration: 0.6454375 2023-10-04 05:07:25,561 WARNING [train.py:1204] (2/4) Exclude cut with ID _8365_210_2064_1_1528592311070_4677690_431-294102-0_sp1.1 from training. Duration: 0.8090625 2023-10-04 05:07:25,594 WARNING [train.py:1204] (2/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_893-296030-0_sp1.1 from training. Duration: 0.97275 2023-10-04 05:07:28,248 WARNING [train.py:1204] (2/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_401-343820-0_sp1.1 from training. Duration: 0.97275 2023-10-04 05:07:32,331 WARNING [train.py:1204] (2/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_127-566-0_sp1.1 from training. Duration: 0.763625 2023-10-04 05:07:32,333 WARNING [train.py:1204] (2/4) Exclude cut with ID _14100_210_8341_1_1530529320613_3678069_126-272847-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 05:07:32,348 WARNING [train.py:1204] (2/4) Exclude cut with ID _15133_210_6228_1_1530794512074_3080379_29-264458-0_sp1.1 from training. Duration: 0.52725 2023-10-04 05:07:32,349 WARNING [train.py:1204] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_127-119109-0 from training. Duration: 0.72 2023-10-04 05:07:32,419 WARNING [train.py:1204] (2/4) Exclude cut with ID _6537_210_1883_1_1527987559710_3597129_382-133062-0_sp1.1 from training. Duration: 0.6181875 2023-10-04 05:07:32,431 WARNING [train.py:1204] (2/4) Exclude cut with ID _29529_210_7234_1_1532393961455_3977460_391-219682-0_sp1.1 from training. Duration: 0.936375 2023-10-04 05:07:33,732 WARNING [train.py:1204] (2/4) Exclude cut with ID _37537_210_2253_1_1533171758568_3421949_96-137189-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 05:07:36,623 WARNING [train.py:1204] (2/4) Exclude cut with ID _58414_210_8254_1_1535421609162_7151182_15-276301-0_sp1.1 from training. Duration: 0.97275 2023-10-04 05:07:37,203 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.4.encoder.layers.2.whiten, num_groups=1, num_channels=384, metric=2.66 vs. limit=12.0 2023-10-04 05:07:39,286 WARNING [train.py:1204] (2/4) Exclude cut with ID _59502_210_12210_1_1535107024698_1835139_110-289074-0_sp1.1 from training. Duration: 0.92725 2023-10-04 05:07:43,910 WARNING [train.py:1204] (2/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_367-61330-0_sp1.1 from training. Duration: 0.6818125 2023-10-04 05:07:47,840 WARNING [train.py:1204] (2/4) Exclude cut with ID _45297_210_2721_1_1533715351381_3264949_353-9541-0_sp1.1 from training. Duration: 0.8818125 2023-10-04 05:07:47,876 WARNING [train.py:1204] (2/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_85-138257-0_sp0.9 from training. Duration: 0.788875 2023-10-04 05:07:47,905 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_3787_210_2040_1_1526477544_5478586_603-120324-0 from training. Duration: 0.81 2023-10-04 05:07:49,185 WARNING [train.py:1204] (2/4) Exclude cut with ID _42863_210_9070_1_1533902398788_3696920_411-204131-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 05:07:49,372 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.ff3_skip_rate, batch_count=1536013.3333333333, ans=0.0 2023-10-04 05:07:53,313 WARNING [train.py:1204] (2/4) Exclude cut with ID _30196_210_1664_1_1533708496522_7074140_822-289909-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 05:07:53,395 WARNING [train.py:1204] (2/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_32-335103-0_sp0.9 from training. Duration: 0.911125 2023-10-04 05:07:54,728 WARNING [train.py:1204] (2/4) Exclude cut with ID _6493_210_1747_1_1528459210424_4012470_513-140967-0_sp0.9 from training. Duration: 0.87775 2023-10-04 05:07:58,241 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.feed_forward2.hidden_balancer.prob, batch_count=1536013.3333333333, ans=0.125 2023-10-04 05:08:00,798 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_47896_210_20508_1_1534492518_5958868_439-257766-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 05:08:03,458 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_1294-63152-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 05:08:06,094 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.695e+02 2.020e+02 2.252e+02 2.597e+02 3.985e+02, threshold=4.504e+02, percent-clipped=0.0 2023-10-04 05:08:06,213 WARNING [train.py:1204] (2/4) Exclude cut with ID _59170_210_6721_1_1534983633960_4203335_319-166512-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 05:08:06,504 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.self_attn_weights.pos_emb_skip_rate, batch_count=1536080.0, ans=0.0 2023-10-04 05:08:07,610 WARNING [train.py:1204] (2/4) Exclude cut with ID _3541_210_2477_1_1526475408709_3263973_146-3470-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 05:08:10,385 WARNING [train.py:1204] (2/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_678-159307-0_sp1.1 from training. Duration: 0.77275 2023-10-04 05:08:10,415 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_343-48822-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 05:08:10,472 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_4692_210_3528_1_1526899755_4323254_107-44401-0 from training. Duration: 0.86 2023-10-04 05:08:10,477 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6291_210_3273_1_1527683649_1078498_74-292608-0_sp1.1 from training. Duration: 0.6545625 2023-10-04 05:08:11,849 WARNING [train.py:1204] (2/4) Exclude cut with ID _55644_210_11856_1_1534593599582_4103719_293-248694-0_sp1.1 from training. Duration: 0.97275 2023-10-04 05:08:11,935 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_3172_210_2087_1_1526292929_3987865_197-97762-0 from training. Duration: 0.75 2023-10-04 05:08:14,952 WARNING [train.py:1204] (2/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_264-348896-0_sp0.9 from training. Duration: 0.9444375 2023-10-04 05:08:16,703 INFO [train.py:1046] (2/4) Epoch 44, batch 2000, loss[loss=0.1491, simple_loss=0.2404, pruned_loss=0.02892, over 24299.00 frames. ], tot_loss[loss=0.1555, simple_loss=0.2361, pruned_loss=0.03745, over 4717689.97 frames. ], batch size: 74, lr: 2.31e-03, grad_scale: 16.0 2023-10-04 05:08:18,246 WARNING [train.py:1204] (2/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_209-205365-0_sp0.9 from training. Duration: 0.87775 2023-10-04 05:08:19,623 WARNING [train.py:1204] (2/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_222-198127-0_sp0.9 from training. Duration: 0.9333125 2023-10-04 05:08:19,657 WARNING [train.py:1204] (2/4) Exclude cut with ID _57572_210_5517_1_1534921219624_3809484_52-43530-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 05:08:20,989 WARNING [train.py:1204] (2/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_96-320736-0_sp1.1 from training. Duration: 0.7818125 2023-10-04 05:08:22,611 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.bypass.scale_min, batch_count=1536146.6666666667, ans=0.2 2023-10-04 05:08:23,677 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_13091_210_7925_1_1530609447_8987354_1159-225508-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 05:08:26,987 WARNING [train.py:1204] (2/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_237-44332-0 from training. Duration: 0.94 2023-10-04 05:08:27,038 WARNING [train.py:1204] (2/4) Exclude cut with ID _7970_210_1883_1_1528455616296_3472230_25-178407-0_sp0.9 from training. Duration: 0.788875 2023-10-04 05:08:29,870 WARNING [train.py:1204] (2/4) Exclude cut with ID _29003_210_19596_1_1532507391720_3894098_357-257062-0_sp1.1 from training. Duration: 0.936375 2023-10-04 05:08:32,555 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_809-17156-0 from training. Duration: 0.73 2023-10-04 05:08:32,652 WARNING [train.py:1204] (2/4) Exclude cut with ID _7160_210_1876_1_1527940923231_3703799_302-150728-0_sp1.1 from training. Duration: 0.7090625 2023-10-04 05:08:32,662 WARNING [train.py:1204] (2/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_444-28041-0_sp0.9 from training. Duration: 0.9444375 2023-10-04 05:08:36,698 WARNING [train.py:1204] (2/4) Exclude cut with ID _52892_210_6721_1_1534378941259_4124129_278-251666-0_sp1.1 from training. Duration: 0.92725 2023-10-04 05:08:38,037 WARNING [train.py:1204] (2/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_115-326778-0 from training. Duration: 0.93 2023-10-04 05:08:39,378 WARNING [train.py:1204] (2/4) Exclude cut with ID _41391_210_7994_1_1533603771872_3638201_31-188961-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 05:08:40,786 WARNING [train.py:1204] (2/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_1073-6264-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 05:08:41,043 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.bypass_mid.scale_min, batch_count=1536213.3333333333, ans=0.2 2023-10-04 05:08:42,054 WARNING [train.py:1204] (2/4) Exclude cut with ID _66868_210_9527_1_1535509287954_4561140_435-259515-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 05:08:43,401 WARNING [train.py:1204] (2/4) Exclude cut with ID _14886_210_4220_1_1530702020355_3516190_159-73748-0 from training. Duration: 0.83 2023-10-04 05:08:43,439 WARNING [train.py:1204] (2/4) Exclude cut with ID _15150_210_8341_1_1530752411803_7256384_469-239509-0_sp1.1 from training. Duration: 0.6181875 2023-10-04 05:08:46,185 WARNING [train.py:1204] (2/4) Exclude cut with ID _8064_210_5399_1_1528372299291_4282973_395-340852-0 from training. Duration: 0.93 2023-10-04 05:08:46,188 WARNING [train.py:1204] (2/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_138-187031-0_sp1.1 from training. Duration: 0.9 2023-10-04 05:08:48,257 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_13757_210_3528_1_1530440682_4107239_48-106347-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 05:08:48,435 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.conv_skip_rate, batch_count=1536280.0, ans=0.0 2023-10-04 05:08:49,608 WARNING [train.py:1204] (2/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_164-224052-0_sp1.1 from training. Duration: 0.636375 2023-10-04 05:08:49,617 WARNING [train.py:1204] (2/4) Exclude cut with ID _59288_210_5854_1_1535077757774_3412246_408-322648-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 05:08:50,906 WARNING [train.py:1204] (2/4) Exclude cut with ID _30191_210_1664_1_1533189915047_7196670_827-36359-0_sp1.1 from training. Duration: 0.963625 2023-10-04 05:08:51,000 WARNING [train.py:1204] (2/4) Exclude cut with ID _4680_210_3525_1_1527249757953_893386_75-78979-0_sp1.1 from training. Duration: 0.82725 2023-10-04 05:08:52,303 WARNING [train.py:1204] (2/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_383-165234-0 from training. Duration: 0.94 2023-10-04 05:08:54,289 WARNING [train.py:1204] (2/4) Exclude cut with ID _19896_210_1883_1_1531709022662_6649569_94-229927-0 from training. Duration: 0.97 2023-10-04 05:08:54,296 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6788_210_1388_1_1527898698_4562021_180-75288-0_sp1.1 from training. Duration: 0.9 2023-10-04 05:08:55,541 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_26433_210_12108_1_1532157158_4056480_54-321349-0_sp1.1 from training. Duration: 0.97275 2023-10-04 05:08:57,930 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.0.feed_forward1.out_whiten.whitening_limit, batch_count=1536280.0, ans=15.0 2023-10-04 05:08:59,790 WARNING [train.py:1204] (2/4) Exclude cut with ID _56108_210_16380_1_1534505446159_3887959_265-161493-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 05:09:01,126 WARNING [train.py:1204] (2/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_395-111602-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 05:09:01,132 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_154-316450-0_sp0.9 from training. Duration: 0.8444375 2023-10-04 05:09:01,213 WARNING [train.py:1204] (2/4) Exclude cut with ID _43552_210_8913_1_1533726031059_3523429_381-116041-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 05:09:03,852 WARNING [train.py:1204] (2/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_65-104124-0_sp1.1 from training. Duration: 0.963625 2023-10-04 05:09:03,917 WARNING [train.py:1204] (2/4) Exclude cut with ID _6536_210_1883_1_1527850783339_3319069_169-23811-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 05:09:03,945 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_289-180904-0_sp0.9 from training. Duration: 0.8444375 2023-10-04 05:09:03,955 WARNING [train.py:1204] (2/4) Exclude cut with ID _56406_210_9288_1_1534899618664_3496149_226-242400-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 05:09:05,382 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_24899_210_16554_1_1532046422_3827462_276-216330-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 05:09:08,143 WARNING [train.py:1204] (2/4) Exclude cut with ID _29345_210_4381_1_1532689168263_3860169_198-174328-0_sp1.1 from training. Duration: 0.82725 2023-10-04 05:09:09,523 WARNING [train.py:1204] (2/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_338-96520-0 from training. Duration: 0.94 2023-10-04 05:09:15,545 WARNING [train.py:1204] (2/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_7-111954-0_sp1.1 from training. Duration: 0.5090625 2023-10-04 05:09:16,826 WARNING [train.py:1204] (2/4) Exclude cut with ID _36692_210_5665_1_1533608967420_398809_8-16261-0_sp1.1 from training. Duration: 0.97275 2023-10-04 05:09:20,206 WARNING [train.py:1204] (2/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_86-212657-0_sp1.1 from training. Duration: 0.97275 2023-10-04 05:09:20,220 WARNING [train.py:1204] (2/4) Exclude cut with ID _44659_210_2721_1_1534036120061_6614860_626-298884-0_sp1.1 from training. Duration: 0.8818125 2023-10-04 05:09:23,003 WARNING [train.py:1204] (2/4) Exclude cut with ID _49755_210_18053_1_1534581005846_3218709_120-257918-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 05:09:24,866 WARNING [train.py:1204] (2/4) Exclude cut with ID _21881_210_3525_1_1531825242757_3798380_400-113821-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 05:09:24,869 WARNING [train.py:1204] (2/4) Exclude cut with ID _21783_210_11357_1_1531742458730_3762679_387-236773-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 05:09:26,194 WARNING [train.py:1204] (2/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_613-232856-0_sp1.1 from training. Duration: 0.4909375 2023-10-04 05:09:27,564 WARNING [train.py:1204] (2/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_181-78632-0_sp0.9 from training. Duration: 0.8666875 2023-10-04 05:09:28,918 INFO [train.py:1046] (2/4) Epoch 44, batch 2050, loss[loss=0.1381, simple_loss=0.2042, pruned_loss=0.03599, over 23422.00 frames. ], tot_loss[loss=0.1552, simple_loss=0.2357, pruned_loss=0.03738, over 4719781.97 frames. ], batch size: 285, lr: 2.31e-03, grad_scale: 16.0 2023-10-04 05:09:28,990 WARNING [train.py:1204] (2/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_175-17485-0_sp1.1 from training. Duration: 0.97275 2023-10-04 05:09:30,275 WARNING [train.py:1204] (2/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_1004-183422-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 05:09:31,709 WARNING [train.py:1204] (2/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_32-65962-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 05:09:31,790 WARNING [train.py:1204] (2/4) Exclude cut with ID _15458_210_8341_1_1530858614263_3834410_40-298664-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 05:09:32,066 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.conv_skip_rate, batch_count=1536480.0, ans=0.0 2023-10-04 05:09:33,566 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.ff3_skip_rate, batch_count=1536480.0, ans=0.0 2023-10-04 05:09:37,361 WARNING [train.py:1204] (2/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_380-261863-0_sp1.1 from training. Duration: 0.963625 2023-10-04 05:09:38,881 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_372-173012-0_sp1.1 from training. Duration: 0.863625 2023-10-04 05:09:40,225 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_57-82205-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 05:09:41,561 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_3691_210_3528_1_1526468169_4062053_355-334432-0_sp0.9 from training. Duration: 0.9666875 2023-10-04 05:09:43,127 WARNING [train.py:1204] (2/4) Exclude cut with ID _19023_210_4353_1_1531378892552_5820780_28-36615-0 from training. Duration: 0.97 2023-10-04 05:09:43,732 WARNING [train.py:1204] (2/4) Exclude cut with ID _29853_210_7497_1_1532505600851_6642110_286-251333-0_sp1.1 from training. Duration: 0.92725 2023-10-04 05:09:45,126 WARNING [train.py:1204] (2/4) Exclude cut with ID _31602_210_5854_1_1532677021456_4458250_518-80679-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 05:09:46,426 WARNING [train.py:1204] (2/4) Exclude cut with ID _14922_210_6228_1_1530707435273_3693310_38-187851-0_sp0.9 from training. Duration: 0.688875 2023-10-04 05:09:46,725 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.balancer2.prob, batch_count=1536546.6666666667, ans=0.125 2023-10-04 05:09:55,198 WARNING [train.py:1204] (2/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_39-248870-0_sp1.1 from training. Duration: 0.62725 2023-10-04 05:09:55,222 WARNING [train.py:1204] (2/4) Exclude cut with ID _16908_210_5724_1_1531119157248_3772700_154-31455-0_sp1.1 from training. Duration: 0.97275 2023-10-04 05:09:57,923 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8453_210_4211_1_1528502940_8562730_347-330673-0 from training. Duration: 0.72 2023-10-04 05:10:00,646 WARNING [train.py:1204] (2/4) Exclude cut with ID _30191_210_1664_1_1533189915047_7196670_674-204247-0_sp1.1 from training. Duration: 0.97275 2023-10-04 05:10:02,156 WARNING [train.py:1204] (2/4) Exclude cut with ID _8833_210_5219_1_1528592665210_7553059_329-125156-0 from training. Duration: 0.82 2023-10-04 05:10:02,201 WARNING [train.py:1204] (2/4) Exclude cut with ID _4779_210_2084_1_1527318022944_3914893_500-285013-0_sp1.1 from training. Duration: 0.62725 2023-10-04 05:10:04,935 WARNING [train.py:1204] (2/4) Exclude cut with ID _14421_210_8750_1_1530604795825_3542290_190-313578-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 05:10:07,635 WARNING [train.py:1204] (2/4) Exclude cut with ID _35799_210_5854_1_1533263487887_3529071_538-273574-0_sp1.1 from training. Duration: 0.936375 2023-10-04 05:10:08,972 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_14928_210_5632_1_1530773461_4714000_454-112198-0_sp1.1 from training. Duration: 0.6 2023-10-04 05:10:09,008 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_468-143126-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 05:10:10,374 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_4528_210_3273_1_1527076654_3276777_258-121681-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 05:10:10,457 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_397-132565-0_sp1.1 from training. Duration: 0.9 2023-10-04 05:10:10,485 WARNING [train.py:1204] (2/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_454-123002-0_sp0.9 from training. Duration: 0.7666875 2023-10-04 05:10:11,979 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.conv_module1.balancer1.min_positive, batch_count=1536680.0, ans=0.025 2023-10-04 05:10:15,067 WARNING [train.py:1204] (2/4) Exclude cut with ID _27052_210_5986_1_1532329197187_3585009_297-86890-0_sp1.1 from training. Duration: 0.936375 2023-10-04 05:10:15,272 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.conv_skip_rate, batch_count=1536680.0, ans=0.0 2023-10-04 05:10:16,902 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_51225_210_3601_1_1534400634_2327383_55-301844-0_sp1.1 from training. Duration: 0.8454375 2023-10-04 05:10:18,291 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6786_210_1388_1_1527839699_4194495_378-46070-0_sp0.9 from training. Duration: 0.8 2023-10-04 05:10:18,385 WARNING [train.py:1204] (2/4) Exclude cut with ID _42334_210_7770_1_1534206610396_3605880_55-19758-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 05:10:23,036 WARNING [train.py:1204] (2/4) Exclude cut with ID _14884_210_2252_1_1530707459689_5561250_114-201399-0_sp0.9 from training. Duration: 0.8444375 2023-10-04 05:10:28,921 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_99-251314-0_sp0.9 from training. Duration: 0.9666875 2023-10-04 05:10:30,258 WARNING [train.py:1204] (2/4) Exclude cut with ID _26528_210_9070_1_1532570410842_3851310_497-97218-0 from training. Duration: 0.86 2023-10-04 05:10:33,073 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.634e+02 1.975e+02 2.184e+02 2.529e+02 3.912e+02, threshold=4.369e+02, percent-clipped=0.0 2023-10-04 05:10:34,619 WARNING [train.py:1204] (2/4) Exclude cut with ID _55328_210_27767_1_1534580973260_4088170_230-317241-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 05:10:35,934 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_125-203105-0_sp0.9 from training. Duration: 0.888875 2023-10-04 05:10:37,299 WARNING [train.py:1204] (2/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_55-31904-0_sp1.1 from training. Duration: 0.8 2023-10-04 05:10:38,786 WARNING [train.py:1204] (2/4) Exclude cut with ID _8064_210_5399_1_1528372299291_4282973_18-174647-0 from training. Duration: 0.91 2023-10-04 05:10:40,279 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.conv_module1.balancer1.prob, batch_count=1536746.6666666667, ans=0.125 2023-10-04 05:10:41,472 WARNING [train.py:1204] (2/4) Exclude cut with ID IC0165W0005-81726 from training. Duration: 0.952 2023-10-04 05:10:41,472 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_31583_210_3852_1_1532595851_4024413_148-171847-0_sp1.1 from training. Duration: 0.963625 2023-10-04 05:10:43,304 INFO [train.py:1046] (2/4) Epoch 44, batch 2100, loss[loss=0.142, simple_loss=0.2215, pruned_loss=0.03124, over 23383.00 frames. ], tot_loss[loss=0.1541, simple_loss=0.2342, pruned_loss=0.03697, over 4713864.51 frames. ], batch size: 119, lr: 2.31e-03, grad_scale: 16.0 2023-10-04 05:10:43,381 WARNING [train.py:1204] (2/4) Exclude cut with ID _21989_210_5854_1_1531899214931_3271360_116-138769-0_sp1.1 from training. Duration: 0.936375 2023-10-04 05:10:43,428 WARNING [train.py:1204] (2/4) Exclude cut with ID _66070_210_7640_1_1535441393096_7805310_489-292631-0_sp1.1 from training. Duration: 0.7181875 2023-10-04 05:10:45,121 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_20120_210_6753_1_1531643035_5482859_202-27960-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 05:10:45,136 WARNING [train.py:1204] (2/4) Exclude cut with ID _5294_210_1747_1_1527249643928_3696179_342-270278-0 from training. Duration: 0.55 2023-10-04 05:10:45,165 WARNING [train.py:1204] (2/4) Exclude cut with ID _38323_210_12108_1_1533121233369_3303049_416-11919-0 from training. Duration: 0.96 2023-10-04 05:10:46,627 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6545_210_2040_1_1527774670_4245258_407-48070-0_sp0.9 from training. Duration: 0.8444375 2023-10-04 05:10:49,283 WARNING [train.py:1204] (2/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_503-30719-0_sp1.1 from training. Duration: 0.8909375 2023-10-04 05:10:50,614 WARNING [train.py:1204] (2/4) Exclude cut with ID _49643_210_7989_1_1534640244224_3939031_234-326497-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 05:10:53,884 WARNING [train.py:1204] (2/4) Exclude cut with ID _14170_210_5219_1_1530525949868_4469650_301-77873-0_sp1.1 from training. Duration: 0.963625 2023-10-04 05:10:53,939 WARNING [train.py:1204] (2/4) Exclude cut with ID _45297_210_2721_1_1533715351381_3264949_152-154697-0_sp1.1 from training. Duration: 0.97275 2023-10-04 05:10:53,946 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_3371_210_1794_1_1526384462_5580777_396-339812-0 from training. Duration: 0.85 2023-10-04 05:10:55,850 WARNING [train.py:1204] (2/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_399-156791-0_sp1.1 from training. Duration: 0.7545625 2023-10-04 05:10:55,895 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7696_210_2040_1_1528207167_3716099_117-125434-0 from training. Duration: 0.76 2023-10-04 05:10:55,901 WARNING [train.py:1204] (2/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_96-244299-0 from training. Duration: 0.88 2023-10-04 05:10:56,108 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.bypass_mid.scale_min, batch_count=1536813.3333333333, ans=0.2 2023-10-04 05:10:56,121 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.conv_skip_rate, batch_count=1536813.3333333333, ans=0.0 2023-10-04 05:10:57,509 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.self_attn_weights.pos_emb_skip_rate, batch_count=1536880.0, ans=0.0 2023-10-04 05:10:58,589 WARNING [train.py:1204] (2/4) Exclude cut with ID _37892_210_19596_1_1533198677897_3964058_519-343462-0_sp1.1 from training. Duration: 0.92725 2023-10-04 05:10:58,631 WARNING [train.py:1204] (2/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_202-183922-0_sp1.1 from training. Duration: 0.77275 2023-10-04 05:10:58,637 WARNING [train.py:1204] (2/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_453-133983-0 from training. Duration: 0.78 2023-10-04 05:11:00,033 WARNING [train.py:1204] (2/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_178-39722-0_sp1.1 from training. Duration: 0.4454375 2023-10-04 05:11:04,389 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_15126_210_6753_1_1530778677_2591185_113-157651-0 from training. Duration: 0.68 2023-10-04 05:11:04,390 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_271-234962-0_sp1.1 from training. Duration: 0.7181875 2023-10-04 05:11:07,169 WARNING [train.py:1204] (2/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_195-64967-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 05:11:08,505 WARNING [train.py:1204] (2/4) Exclude cut with ID _43552_210_8913_1_1533726031059_3523429_365-215065-0_sp1.1 from training. Duration: 0.936375 2023-10-04 05:11:12,655 WARNING [train.py:1204] (2/4) Exclude cut with ID _31602_210_5854_1_1532677021456_4458250_249-96604-0_sp0.9 from training. Duration: 0.988875 2023-10-04 05:11:12,712 WARNING [train.py:1204] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_307-158462-0 from training. Duration: 0.69 2023-10-04 05:11:14,428 WARNING [train.py:1204] (2/4) Exclude cut with ID _30918_210_6400_1_1532754078600_3125920_407-223394-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 05:11:14,431 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6673_210_3273_1_1527767790_3173240_351-167954-0_sp0.9 from training. Duration: 0.5333125 2023-10-04 05:11:16,418 WARNING [train.py:1204] (2/4) Exclude cut with ID _4866_210_2577_1_1527404412284_4364659_256-306830-0 from training. Duration: 0.87 2023-10-04 05:11:17,749 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_17613_210_2151_1_1531137569_2366160_222-131118-0_sp1.1 from training. Duration: 0.963625 2023-10-04 05:11:17,761 WARNING [train.py:1204] (2/4) Exclude cut with ID _45560_210_21982_1_1533812603951_3584999_111-6376-0 from training. Duration: 0.84 2023-10-04 05:11:17,788 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_216-73841-0 from training. Duration: 0.96 2023-10-04 05:11:19,137 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_14058_210_4163_1_1530690953_6327180_49-210687-0 from training. Duration: 0.94 2023-10-04 05:11:19,282 WARNING [train.py:1204] (2/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_506-42614-0_sp0.9 from training. Duration: 0.87775 2023-10-04 05:11:19,974 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.2.encoder.layers.1.conv_module2.whiten, num_groups=1, num_channels=384, metric=6.78 vs. limit=15.0 2023-10-04 05:11:20,679 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_3133_210_2087_1_1526106446_2324690_25-332138-0_sp1.1 from training. Duration: 0.82725 2023-10-04 05:11:24,066 WARNING [train.py:1204] (2/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_589-9070-0_sp1.1 from training. Duration: 0.6454375 2023-10-04 05:11:25,488 WARNING [train.py:1204] (2/4) Exclude cut with ID _6194_210_1534_1_1527511826160_3941194_14-282876-0_sp1.1 from training. Duration: 0.6454375 2023-10-04 05:11:27,371 WARNING [train.py:1204] (2/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_1166-90654-0_sp1.1 from training. Duration: 0.92725 2023-10-04 05:11:28,808 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_14056_210_4163_1_1530604483_7572827_218-255858-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 05:11:28,810 WARNING [train.py:1204] (2/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_1285-256622-0 from training. Duration: 0.84 2023-10-04 05:11:28,832 WARNING [train.py:1204] (2/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_205-286261-0_sp1.1 from training. Duration: 0.963625 2023-10-04 05:11:28,855 WARNING [train.py:1204] (2/4) Exclude cut with ID _68777_210_4353_1_1535810390731_3117904_6-154119-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 05:11:29,135 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=1537013.3333333333, ans=0.1 2023-10-04 05:11:30,170 WARNING [train.py:1204] (2/4) Exclude cut with ID _22982_210_1883_1_1531882790447_5449409_565-199393-0_sp1.1 from training. Duration: 0.92725 2023-10-04 05:11:30,188 WARNING [train.py:1204] (2/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_278-154492-0 from training. Duration: 0.76 2023-10-04 05:11:31,605 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_426-313557-0 from training. Duration: 0.53 2023-10-04 05:11:32,904 WARNING [train.py:1204] (2/4) Exclude cut with ID _41817_210_18835_1_1533434477160_7423049_555-293448-0 from training. Duration: 0.8 2023-10-04 05:11:37,000 WARNING [train.py:1204] (2/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_32-335103-0_sp1.1 from training. Duration: 0.7454375 2023-10-04 05:11:39,756 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_2655_210_2087_1_1525688959_3897400_202-147557-0_sp1.1 from training. Duration: 0.77275 2023-10-04 05:11:39,791 WARNING [train.py:1204] (2/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_158-250116-0 from training. Duration: 0.62 2023-10-04 05:11:45,940 WARNING [train.py:1204] (2/4) Exclude cut with ID _55429_210_10626_1_1534467588527_7174143_528-88083-0_sp1.1 from training. Duration: 0.963625 2023-10-04 05:11:47,408 WARNING [train.py:1204] (2/4) Exclude cut with ID _8030_210_7497_1_1528444886781_3531089_32-31837-0_sp1.1 from training. Duration: 0.8818125 2023-10-04 05:11:49,164 WARNING [train.py:1204] (2/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_744-86168-0_sp1.1 from training. Duration: 0.97275 2023-10-04 05:11:49,180 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1113-7369-0_sp0.9 from training. Duration: 0.9444375 2023-10-04 05:11:49,204 WARNING [train.py:1204] (2/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_124-345840-0_sp0.9 from training. Duration: 0.57775 2023-10-04 05:11:50,473 WARNING [train.py:1204] (2/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_328-264776-0_sp0.9 from training. Duration: 0.7444375 2023-10-04 05:11:51,860 WARNING [train.py:1204] (2/4) Exclude cut with ID _18895_210_6228_1_1531357274234_3784930_469-166615-0_sp1.1 from training. Duration: 0.963625 2023-10-04 05:11:51,869 WARNING [train.py:1204] (2/4) Exclude cut with ID _8395_210_1531_1_1528597871523_4411579_431-132470-0_sp0.9 from training. Duration: 0.82225 2023-10-04 05:11:51,962 WARNING [train.py:1204] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_554-45617-0_sp1.1 from training. Duration: 0.9 2023-10-04 05:11:51,993 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_35361_210_4238_1_1533262810_7793476_524-188242-0_sp1.1 from training. Duration: 0.92725 2023-10-04 05:11:52,149 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=1537080.0, ans=0.1 2023-10-04 05:11:54,599 WARNING [train.py:1204] (2/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_938-205135-0 from training. Duration: 0.87 2023-10-04 05:11:54,712 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_5931_210_2040_1_1527428481_4547999_321-193614-0 from training. Duration: 0.67 2023-10-04 05:11:54,723 WARNING [train.py:1204] (2/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_445-269521-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 05:11:56,981 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=1537146.6666666667, ans=0.1 2023-10-04 05:11:57,860 INFO [train.py:1046] (2/4) Epoch 44, batch 2150, loss[loss=0.154, simple_loss=0.2303, pruned_loss=0.03886, over 23785.00 frames. ], tot_loss[loss=0.1534, simple_loss=0.2333, pruned_loss=0.03672, over 4712046.91 frames. ], batch size: 212, lr: 2.31e-03, grad_scale: 16.0 2023-10-04 05:11:57,964 WARNING [train.py:1204] (2/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_3-33116-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 05:11:57,965 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_4633_210_2040_1_1526824163_4145343_416-76117-0_sp1.1 from training. Duration: 0.863625 2023-10-04 05:11:58,001 WARNING [train.py:1204] (2/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_1033-274376-0_sp1.1 from training. Duration: 0.7909375 2023-10-04 05:11:59,243 WARNING [train.py:1204] (2/4) Exclude cut with ID _8772_210_1850_1_1528635595972_3664011_398-320049-0_sp0.9 from training. Duration: 0.97775 2023-10-04 05:12:02,174 WARNING [train.py:1204] (2/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_321-215742-0_sp0.9 from training. Duration: 0.5555625 2023-10-04 05:12:04,826 WARNING [train.py:1204] (2/4) Exclude cut with ID _28394_210_8913_1_1532430049518_3650118_272-190950-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 05:12:06,277 WARNING [train.py:1204] (2/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_31-299371-0_sp1.1 from training. Duration: 0.92725 2023-10-04 05:12:08,898 WARNING [train.py:1204] (2/4) Exclude cut with ID _55383_210_10635_1_1534468426172_2231669_134-128246-0_sp0.9 from training. Duration: 0.988875 2023-10-04 05:12:08,901 WARNING [train.py:1204] (2/4) Exclude cut with ID _40519_210_13794_1_1533715228655_7337390_887-9547-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 05:12:08,928 WARNING [train.py:1204] (2/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_310-109461-0_sp0.9 from training. Duration: 0.888875 2023-10-04 05:12:09,162 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.nonlin_attention.balancer.prob, batch_count=1537146.6666666667, ans=0.125 2023-10-04 05:12:11,684 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_2009_210_2071_1_1525481134_4588937_258-250196-0_sp1.1 from training. Duration: 0.92725 2023-10-04 05:12:11,742 WARNING [train.py:1204] (2/4) Exclude cut with ID _9235_210_5219_1_1528891154253_8046120_1186-72702-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 05:12:11,745 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_14831_210_7927_1_1530700200_5583610_181-27467-0_sp1.1 from training. Duration: 0.82725 2023-10-04 05:12:16,741 WARNING [train.py:1204] (2/4) Exclude cut with ID _40402_210_18219_1_1533522324115_2953150_278-132109-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 05:12:16,758 WARNING [train.py:1204] (2/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_261-297503-0 from training. Duration: 0.99 2023-10-04 05:12:20,879 WARNING [train.py:1204] (2/4) Exclude cut with ID _19896_210_1883_1_1531709022662_6649569_313-265903-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 05:12:22,302 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_105-114797-0_sp1.1 from training. Duration: 0.763625 2023-10-04 05:12:24,191 WARNING [train.py:1204] (2/4) Exclude cut with ID _53755_210_8254_1_1534577312941_4707360_9-65843-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 05:12:24,214 WARNING [train.py:1204] (2/4) Exclude cut with ID _18516_210_9206_1_1531704653206_3692406_361-224549-0_sp1.1 from training. Duration: 0.97275 2023-10-04 05:12:24,240 WARNING [train.py:1204] (2/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_403-310975-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 05:12:24,280 WARNING [train.py:1204] (2/4) Exclude cut with ID _5444_210_2151_1_1527418808464_5312210_581-315411-0_sp0.9 from training. Duration: 0.688875 2023-10-04 05:12:25,591 WARNING [train.py:1204] (2/4) Exclude cut with ID _23825_210_13449_1_1531915313526_3497150_464-154904-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 05:12:25,608 WARNING [train.py:1204] (2/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_937-208281-0_sp0.9 from training. Duration: 0.9444375 2023-10-04 05:12:27,427 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_37839_210_7234_1_1533542462_3689303_158-181212-0_sp1.1 from training. Duration: 0.963625 2023-10-04 05:12:28,794 WARNING [train.py:1204] (2/4) Exclude cut with ID _8772_210_1850_1_1528635595972_3664011_398-320049-0 from training. Duration: 0.88 2023-10-04 05:12:30,213 WARNING [train.py:1204] (2/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_718-250907-0_sp1.1 from training. Duration: 0.62725 2023-10-04 05:12:31,555 WARNING [train.py:1204] (2/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_641-126395-0_sp1.1 from training. Duration: 0.92725 2023-10-04 05:12:33,020 WARNING [train.py:1204] (2/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_166-241493-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 05:12:33,121 WARNING [train.py:1204] (2/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_526-62079-0_sp0.9 from training. Duration: 0.7444375 2023-10-04 05:12:34,396 WARNING [train.py:1204] (2/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_439-275083-0_sp1.1 from training. Duration: 0.936375 2023-10-04 05:12:35,852 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_66907_210_21581_1_1535615661_3590177_493-229032-0_sp1.1 from training. Duration: 0.92725 2023-10-04 05:12:35,888 WARNING [train.py:1204] (2/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_89-321955-0_sp1.1 from training. Duration: 0.72725 2023-10-04 05:12:37,280 WARNING [train.py:1204] (2/4) Exclude cut with ID _29857_210_20034_1_1532395886753_3798160_273-226195-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 05:12:37,293 WARNING [train.py:1204] (2/4) Exclude cut with ID _22826_210_9994_1_1531908201522_3279169_404-75889-0 from training. Duration: 0.89 2023-10-04 05:12:38,936 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_271-234962-0_sp0.9 from training. Duration: 0.87775 2023-10-04 05:12:41,652 WARNING [train.py:1204] (2/4) Exclude cut with ID _22961_210_8254_1_1531976366672_4278180_156-339685-0_sp1.1 from training. Duration: 0.97275 2023-10-04 05:12:42,915 WARNING [train.py:1204] (2/4) Exclude cut with ID _37892_210_19596_1_1533198677897_3964058_425-231927-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 05:12:44,313 WARNING [train.py:1204] (2/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_102-288670-0_sp1.1 from training. Duration: 0.97275 2023-10-04 05:12:46,217 WARNING [train.py:1204] (2/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_283-220372-0_sp1.1 from training. Duration: 0.5090625 2023-10-04 05:12:46,865 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.4.encoder.layers.1.nonlin_attention.whiten1, num_groups=1, num_channels=288, metric=4.18 vs. limit=10.0 2023-10-04 05:12:47,711 WARNING [train.py:1204] (2/4) Exclude cut with ID _44304_210_15374_1_1533639352765_4613129_86-1699-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 05:12:49,063 WARNING [train.py:1204] (2/4) Exclude cut with ID _2891_210_2550_1_1525874398521_4427710_327-121590-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 05:12:49,070 WARNING [train.py:1204] (2/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_369-222060-0 from training. Duration: 0.71 2023-10-04 05:12:51,036 WARNING [train.py:1204] (2/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_762-109886-0 from training. Duration: 0.91 2023-10-04 05:12:52,343 WARNING [train.py:1204] (2/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_477-255782-0_sp0.9 from training. Duration: 0.911125 2023-10-04 05:12:52,378 WARNING [train.py:1204] (2/4) Exclude cut with ID IC0669W0061-330906 from training. Duration: 0.768 2023-10-04 05:12:52,430 WARNING [train.py:1204] (2/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_347-213071-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 05:12:53,765 WARNING [train.py:1204] (2/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_507-303370-0_sp1.1 from training. Duration: 0.87275 2023-10-04 05:12:55,085 WARNING [train.py:1204] (2/4) Exclude cut with ID _15012_210_1968_1_1530856820429_5095350_688-178849-0 from training. Duration: 0.64 2023-10-04 05:12:55,090 WARNING [train.py:1204] (2/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_28-70987-0_sp0.9 from training. Duration: 0.9666875 2023-10-04 05:12:55,100 WARNING [train.py:1204] (2/4) Exclude cut with ID _5910_210_1389_1_1527337972454_5050100_555-159815-0 from training. Duration: 0.98 2023-10-04 05:12:55,122 WARNING [train.py:1204] (2/4) Exclude cut with ID IC0679W0064-335871 from training. Duration: 0.768 2023-10-04 05:12:55,122 WARNING [train.py:1204] (2/4) Exclude cut with ID IC0679W0079-335886 from training. Duration: 0.896 2023-10-04 05:12:56,482 WARNING [train.py:1204] (2/4) Exclude cut with ID _5608_210_2106_1_1527415439089_6819768_909-82767-0 from training. Duration: 0.61 2023-10-04 05:12:58,444 WARNING [train.py:1204] (2/4) Exclude cut with ID _23038_210_9570_1_1532001584158_3652419_411-323967-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 05:12:59,856 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_40896_210_15710_1_1533433956_7100802_350-231748-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 05:12:59,866 WARNING [train.py:1204] (2/4) Exclude cut with ID _5289_210_2084_1_1527678202736_3779299_142-103317-0_sp0.9 from training. Duration: 0.9333125 2023-10-04 05:13:01,600 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.710e+02 1.996e+02 2.244e+02 2.703e+02 4.049e+02, threshold=4.487e+02, percent-clipped=0.0 2023-10-04 05:13:01,685 WARNING [train.py:1204] (2/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_314-144055-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 05:13:01,747 WARNING [train.py:1204] (2/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_177-236809-0_sp0.9 from training. Duration: 0.5444375 2023-10-04 05:13:03,096 WARNING [train.py:1204] (2/4) Exclude cut with ID _23038_210_9570_1_1532001584158_3652419_82-334733-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 05:13:03,110 WARNING [train.py:1204] (2/4) Exclude cut with ID _43552_210_8913_1_1533726031059_3523429_225-22526-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 05:13:11,479 INFO [train.py:1046] (2/4) Epoch 44, batch 2200, loss[loss=0.1414, simple_loss=0.2269, pruned_loss=0.02788, over 24478.00 frames. ], tot_loss[loss=0.1531, simple_loss=0.2331, pruned_loss=0.03655, over 4712496.62 frames. ], batch size: 63, lr: 2.31e-03, grad_scale: 16.0 2023-10-04 05:13:11,570 WARNING [train.py:1204] (2/4) Exclude cut with ID _30327_210_16049_1_1532484161295_3924169_401-70819-0_sp1.1 from training. Duration: 0.7818125 2023-10-04 05:13:12,929 WARNING [train.py:1204] (2/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_538-32291-0 from training. Duration: 0.83 2023-10-04 05:13:15,985 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_37357_210_17024_1_1533089504_2316121_258-211605-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 05:13:20,518 WARNING [train.py:1204] (2/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_550-335198-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 05:13:20,587 WARNING [train.py:1204] (2/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_98-741-0_sp0.9 from training. Duration: 0.888875 2023-10-04 05:13:21,925 WARNING [train.py:1204] (2/4) Exclude cut with ID _38323_210_12108_1_1533121233369_3303049_251-238988-0_sp1.1 from training. Duration: 0.92725 2023-10-04 05:13:23,451 WARNING [train.py:1204] (2/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_259-34038-0_sp0.9 from training. Duration: 0.82225 2023-10-04 05:13:26,640 WARNING [train.py:1204] (2/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_1060-180633-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 05:13:26,681 WARNING [train.py:1204] (2/4) Exclude cut with ID _8755_210_1876_1_1528628559357_3258209_211-37473-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 05:13:26,687 WARNING [train.py:1204] (2/4) Exclude cut with ID _4122_210_2084_1_1526713164417_4353280_528-206771-0 from training. Duration: 0.88 2023-10-04 05:13:30,430 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.self_attn_weights.pos_emb_skip_rate, batch_count=1537546.6666666667, ans=0.0 2023-10-04 05:13:32,821 WARNING [train.py:1204] (2/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_64-248359-0 from training. Duration: 0.69 2023-10-04 05:13:32,974 WARNING [train.py:1204] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_402-253027-0_sp1.1 from training. Duration: 0.4909375 2023-10-04 05:13:33,161 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.bypass_mid.scale_min, batch_count=1537546.6666666667, ans=0.2 2023-10-04 05:13:36,334 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.1.conv_skip_rate, batch_count=1537546.6666666667, ans=0.0 2023-10-04 05:13:39,068 WARNING [train.py:1204] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_625-102955-0 from training. Duration: 0.77 2023-10-04 05:13:41,735 WARNING [train.py:1204] (2/4) Exclude cut with ID _14998_210_5219_1_1530705582080_7474970_611-219324-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 05:13:41,805 WARNING [train.py:1204] (2/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_718-68505-0_sp0.9 from training. Duration: 0.988875 2023-10-04 05:13:43,125 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_353-182855-0_sp1.1 from training. Duration: 0.87275 2023-10-04 05:13:45,984 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_5960_210_1794_1_1527403865_3990999_515-2613-0_sp0.9 from training. Duration: 0.97775 2023-10-04 05:13:47,450 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_5575_210_1410_1_1527301305_2570170_342-265572-0 from training. Duration: 0.94 2023-10-04 05:13:49,054 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.bypass.skip_rate, batch_count=1537613.3333333333, ans=0.04949747468305833 2023-10-04 05:13:51,906 WARNING [train.py:1204] (2/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_329-113173-0_sp0.9 from training. Duration: 0.688875 2023-10-04 05:13:51,997 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_15396_210_6890_1_1531308473_1938936_213-312506-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 05:13:52,173 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.balancer1.prob, batch_count=1537613.3333333333, ans=0.125 2023-10-04 05:13:53,373 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_521-214276-0_sp1.1 from training. Duration: 0.4 2023-10-04 05:13:56,656 WARNING [train.py:1204] (2/4) Exclude cut with ID _15026_210_8750_1_1530777602316_3291850_211-195043-0_sp1.1 from training. Duration: 0.72725 2023-10-04 05:13:58,006 WARNING [train.py:1204] (2/4) Exclude cut with ID _29343_210_4381_1_1532602757752_3674109_108-257494-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 05:13:58,122 WARNING [train.py:1204] (2/4) Exclude cut with ID _64414_210_17301_1_1535243252048_3757759_403-58151-0_sp1.1 from training. Duration: 0.963625 2023-10-04 05:14:00,047 WARNING [train.py:1204] (2/4) Exclude cut with ID _36512_210_6987_1_1533033143998_3126069_109-177787-0_sp1.1 from training. Duration: 0.92725 2023-10-04 05:14:02,731 WARNING [train.py:1204] (2/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_498-40153-0 from training. Duration: 0.63 2023-10-04 05:14:02,808 WARNING [train.py:1204] (2/4) Exclude cut with ID _56447_210_1389_1_1535074245910_3697189_161-334523-0_sp1.1 from training. Duration: 0.936375 2023-10-04 05:14:04,772 WARNING [train.py:1204] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_238-245655-0 from training. Duration: 0.81 2023-10-04 05:14:04,963 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.ff3_skip_rate, batch_count=1537680.0, ans=0.0 2023-10-04 05:14:07,523 WARNING [train.py:1204] (2/4) Exclude cut with ID _64631_210_27767_1_1535349580068_4165160_108-43865-0_sp1.1 from training. Duration: 0.92725 2023-10-04 05:14:07,528 WARNING [train.py:1204] (2/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_243-42116-0_sp1.1 from training. Duration: 0.663625 2023-10-04 05:14:07,567 WARNING [train.py:1204] (2/4) Exclude cut with ID _66868_210_9527_1_1535509287954_4561140_594-132160-0_sp1.1 from training. Duration: 0.92725 2023-10-04 05:14:08,891 WARNING [train.py:1204] (2/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_48-338976-0_sp0.9 from training. Duration: 0.988875 2023-10-04 05:14:09,118 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.conv_module2.balancer2.prob, batch_count=1537680.0, ans=0.125 2023-10-04 05:14:10,304 WARNING [train.py:1204] (2/4) Exclude cut with ID _5250_210_5219_1_1527249561806_8692110_137-100595-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 05:14:10,319 WARNING [train.py:1204] (2/4) Exclude cut with ID _49748_210_24116_1_1533986971737_3158910_12-271669-0_sp1.1 from training. Duration: 0.936375 2023-10-04 05:14:10,344 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_37357_210_17024_1_1533089504_2316121_206-80421-0_sp1.1 from training. Duration: 0.936375 2023-10-04 05:14:12,909 WARNING [train.py:1204] (2/4) Exclude cut with ID _7537_210_2084_1_1528502395431_3924359_113-199933-0_sp1.1 from training. Duration: 0.536375 2023-10-04 05:14:12,961 WARNING [train.py:1204] (2/4) Exclude cut with ID _55440_210_12651_1_1534496713364_2980520_31-59289-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 05:14:14,343 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1083-220122-0_sp0.9 from training. Duration: 0.5666875 2023-10-04 05:14:14,906 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.4.encoder.layers.0.nonlin_attention.whiten2, num_groups=1, num_channels=384, metric=9.51 vs. limit=15.0 2023-10-04 05:14:17,102 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_426-313557-0_sp1.1 from training. Duration: 0.4818125 2023-10-04 05:14:17,156 WARNING [train.py:1204] (2/4) Exclude cut with ID _35799_210_5854_1_1533263487887_3529071_324-94843-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 05:14:17,359 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.conv_module2.balancer2.prob, batch_count=1537746.6666666667, ans=0.125 2023-10-04 05:14:19,935 WARNING [train.py:1204] (2/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_264-108685-0_sp1.1 from training. Duration: 0.5 2023-10-04 05:14:20,672 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9012W0006-501141 from training. Duration: 0.896 2023-10-04 05:14:22,079 WARNING [train.py:1204] (2/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_890-5117-0_sp0.9 from training. Duration: 0.8666875 2023-10-04 05:14:24,140 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9023W0008-506301 from training. Duration: 0.896 2023-10-04 05:14:25,500 WARNING [train.py:1204] (2/4) Exclude cut with ID _13916_210_9994_1_1530788341126_3763149_60-191556-0_sp0.9 from training. Duration: 0.67775 2023-10-04 05:14:25,554 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9029W0331-509721 from training. Duration: 0.896 2023-10-04 05:14:26,810 INFO [train.py:1046] (2/4) Epoch 44, batch 2250, loss[loss=0.1361, simple_loss=0.2202, pruned_loss=0.02606, over 24489.00 frames. ], tot_loss[loss=0.1537, simple_loss=0.234, pruned_loss=0.03668, over 4719780.92 frames. ], batch size: 63, lr: 2.31e-03, grad_scale: 16.0 2023-10-04 05:14:26,932 WARNING [train.py:1204] (2/4) Exclude cut with ID _35823_210_6917_1_1533187881914_7297830_567-108596-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 05:14:28,373 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7543_210_6753_1_1528544010_1110402_25-183860-0_sp1.1 from training. Duration: 0.42725 2023-10-04 05:14:28,475 WARNING [train.py:1204] (2/4) Exclude cut with ID _47278_210_12041_1_1533902418349_3576110_495-260035-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 05:14:30,373 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9051W0015-520281 from training. Duration: 0.9045625 2023-10-04 05:14:33,003 WARNING [train.py:1204] (2/4) Exclude cut with ID _14459_210_2228_1_1531443641946_7516700_707-66193-0_sp1.1 from training. Duration: 0.97275 2023-10-04 05:14:34,463 WARNING [train.py:1204] (2/4) Exclude cut with ID _14087_210_2253_1_1530529224512_3014029_219-224445-0_sp1.1 from training. Duration: 0.763625 2023-10-04 05:14:38,238 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.ff3_skip_rate, batch_count=1537813.3333333333, ans=0.0 2023-10-04 05:14:39,434 WARNING [train.py:1204] (2/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_345-167986-0_sp0.9 from training. Duration: 0.9333125 2023-10-04 05:14:39,548 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_34815_210_6388_1_1533381320_3571903_279-305565-0_sp1.1 from training. Duration: 0.836375 2023-10-04 05:14:42,286 WARNING [train.py:1204] (2/4) Exclude cut with ID _21987_210_5854_1_1531821500482_3637038_317-179212-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 05:14:42,360 WARNING [train.py:1204] (2/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_395-37222-0_sp1.1 from training. Duration: 0.6909375 2023-10-04 05:14:43,629 WARNING [train.py:1204] (2/4) Exclude cut with ID _8064_210_5399_1_1528372299291_4282973_168-50337-0_sp1.1 from training. Duration: 0.763625 2023-10-04 05:14:45,091 WARNING [train.py:1204] (2/4) Exclude cut with ID _60113_210_16271_1_1535164212481_4386079_559-167191-0 from training. Duration: 0.93 2023-10-04 05:14:46,423 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_47228_210_4983_1_1533780021_3672027_344-179642-0_sp1.1 from training. Duration: 0.92725 2023-10-04 05:14:46,457 WARNING [train.py:1204] (2/4) Exclude cut with ID _22961_210_8254_1_1531976366672_4278180_317-199546-0_sp0.9 from training. Duration: 0.9555625 2023-10-04 05:14:48,548 WARNING [train.py:1204] (2/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_141-89727-0 from training. Duration: 0.71 2023-10-04 05:14:49,678 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_78-116075-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 05:14:49,704 WARNING [train.py:1204] (2/4) Exclude cut with ID _64086_210_7994_1_1535270417019_7435949_52-235756-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 05:14:51,112 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7615_210_4211_1_1528110965_4580160_72-235211-0_sp1.1 from training. Duration: 0.6909375 2023-10-04 05:14:54,585 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.conv_skip_rate, batch_count=1537880.0, ans=0.0 2023-10-04 05:14:57,090 WARNING [train.py:1204] (2/4) Exclude cut with ID _14069_210_5632_1_1530512692033_4803390_287-27950-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 05:14:57,182 WARNING [train.py:1204] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_74-48717-0_sp0.9 from training. Duration: 0.6444375 2023-10-04 05:14:57,205 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7544_210_6753_1_1528591327_1471426_172-348777-0_sp1.1 from training. Duration: 0.6 2023-10-04 05:14:57,411 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.conv_module2.balancer1.prob, batch_count=1537946.6666666667, ans=0.125 2023-10-04 05:15:00,037 WARNING [train.py:1204] (2/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_220-191080-0 from training. Duration: 0.98 2023-10-04 05:15:03,301 WARNING [train.py:1204] (2/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_1081-137641-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 05:15:04,626 WARNING [train.py:1204] (2/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_278-235459-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 05:15:06,796 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=1537946.6666666667, ans=0.1 2023-10-04 05:15:08,009 WARNING [train.py:1204] (2/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_1181-77470-0_sp1.1 from training. Duration: 0.936375 2023-10-04 05:15:09,407 WARNING [train.py:1204] (2/4) Exclude cut with ID _60860_210_8254_1_1534896015875_3557169_198-5531-0_sp1.1 from training. Duration: 0.936375 2023-10-04 05:15:10,818 WARNING [train.py:1204] (2/4) Exclude cut with ID _12369_210_4381_1_1530356078581_4388520_17-289781-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 05:15:10,823 WARNING [train.py:1204] (2/4) Exclude cut with ID _50310_210_15488_1_1534068043345_3641080_146-199752-0_sp1.1 from training. Duration: 0.92725 2023-10-04 05:15:13,604 WARNING [train.py:1204] (2/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_969-213354-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 05:15:14,970 WARNING [train.py:1204] (2/4) Exclude cut with ID _70519_210_4163_1_1535961595994_7458230_66-344156-0_sp1.1 from training. Duration: 0.963625 2023-10-04 05:15:17,851 WARNING [train.py:1204] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_380-204756-0_sp1.1 from training. Duration: 0.7818125 2023-10-04 05:15:20,032 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=1538013.3333333333, ans=0.1 2023-10-04 05:15:20,998 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_128-202104-0_sp0.9 from training. Duration: 0.7 2023-10-04 05:15:24,480 WARNING [train.py:1204] (2/4) Exclude cut with ID _4697_210_2069_1_1526815801953_3823098_51-1049-0_sp0.9 from training. Duration: 0.8333125 2023-10-04 05:15:24,501 WARNING [train.py:1204] (2/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_86-66192-0_sp1.1 from training. Duration: 0.5 2023-10-04 05:15:24,688 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.conv_skip_rate, batch_count=1538013.3333333333, ans=0.0 2023-10-04 05:15:25,837 WARNING [train.py:1204] (2/4) Exclude cut with ID _14069_210_5632_1_1530512692033_4803390_95-1896-0_sp1.1 from training. Duration: 0.8909375 2023-10-04 05:15:31,712 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.711e+02 1.999e+02 2.130e+02 2.376e+02 3.367e+02, threshold=4.261e+02, percent-clipped=0.0 2023-10-04 05:15:33,057 WARNING [train.py:1204] (2/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_29-293896-0_sp1.1 from training. Duration: 0.5545625 2023-10-04 05:15:36,227 WARNING [train.py:1204] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_167-137255-0_sp0.9 from training. Duration: 0.711125 2023-10-04 05:15:36,228 WARNING [train.py:1204] (2/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_217-95056-0 from training. Duration: 0.98 2023-10-04 05:15:36,242 WARNING [train.py:1204] (2/4) Exclude cut with ID _47278_210_12041_1_1533902418349_3576110_97-266746-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 05:15:36,286 WARNING [train.py:1204] (2/4) Exclude cut with ID _14224_210_4381_1_1530529062037_4264010_380-343670-0_sp1.1 from training. Duration: 0.8 2023-10-04 05:15:39,141 WARNING [train.py:1204] (2/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_107-332398-0 from training. Duration: 0.78 2023-10-04 05:15:39,815 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.4.encoder.layers.0.self_attn1.whiten, num_groups=1, num_channels=384, metric=14.18 vs. limit=22.5 2023-10-04 05:15:40,712 WARNING [train.py:1204] (2/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_91-267696-0_sp1.1 from training. Duration: 0.7909375 2023-10-04 05:15:40,733 WARNING [train.py:1204] (2/4) Exclude cut with ID _41298_210_9019_1_1533430371450_4822143_498-186993-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 05:15:41,995 INFO [train.py:1046] (2/4) Epoch 44, batch 2300, loss[loss=0.1435, simple_loss=0.2279, pruned_loss=0.02953, over 24487.00 frames. ], tot_loss[loss=0.1548, simple_loss=0.2352, pruned_loss=0.03717, over 4710505.39 frames. ], batch size: 63, lr: 2.31e-03, grad_scale: 16.0 2023-10-04 05:15:46,177 WARNING [train.py:1204] (2/4) Exclude cut with ID _17956_210_4353_1_1531209568125_6979970_54-77610-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 05:15:46,218 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_40047_210_2290_1_1533290519_2374311_257-335162-0_sp1.1 from training. Duration: 0.863625 2023-10-04 05:15:48,836 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9653W0193-675471 from training. Duration: 0.9656875 2023-10-04 05:15:50,317 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.feed_forward3.hidden_balancer.prob, batch_count=1538146.6666666667, ans=0.125 2023-10-04 05:15:51,473 WARNING [train.py:1204] (2/4) Exclude cut with ID _25248_210_8750_1_1532062825941_5554180_243-240536-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 05:15:58,309 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.bypass.scale_min, batch_count=1538213.3333333333, ans=0.2 2023-10-04 05:16:00,760 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_32360_210_4198_1_1532689160_3648436_60-51908-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 05:16:00,768 WARNING [train.py:1204] (2/4) Exclude cut with ID _4092_210_2151_1_1526643044449_3135759_227-213972-0_sp0.9 from training. Duration: 0.6 2023-10-04 05:16:00,805 WARNING [train.py:1204] (2/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_392-173500-0_sp1.1 from training. Duration: 0.97275 2023-10-04 05:16:00,835 WARNING [train.py:1204] (2/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_71-329150-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 05:16:00,837 WARNING [train.py:1204] (2/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_866-173440-0 from training. Duration: 0.9 2023-10-04 05:16:02,745 WARNING [train.py:1204] (2/4) Exclude cut with ID _14832_210_8750_1_1530691351539_3171348_1-54315-0_sp0.9 from training. Duration: 0.9666875 2023-10-04 05:16:04,245 WARNING [train.py:1204] (2/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_430-116139-0_sp1.1 from training. Duration: 0.77275 2023-10-04 05:16:05,572 WARNING [train.py:1204] (2/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_356-45054-0_sp0.9 from training. Duration: 0.9555625 2023-10-04 05:16:10,362 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_3531_210_3528_1_1526294203_5216456_309-227302-0_sp0.9 from training. Duration: 0.7666875 2023-10-04 05:16:10,612 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=1538280.0, ans=0.1 2023-10-04 05:16:11,854 WARNING [train.py:1204] (2/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_301-330455-0_sp1.1 from training. Duration: 0.72725 2023-10-04 05:16:13,847 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.3.encoder.layers.2.feed_forward1.out_whiten, num_groups=1, num_channels=512, metric=8.03 vs. limit=15.0 2023-10-04 05:16:14,746 WARNING [train.py:1204] (2/4) Exclude cut with ID _23557_210_8750_1_1531890106370_6846255_667-145945-0_sp1.1 from training. Duration: 0.92725 2023-10-04 05:16:18,847 WARNING [train.py:1204] (2/4) Exclude cut with ID _14860_210_4915_1_1530702020340_7343240_836-328489-0_sp1.1 from training. Duration: 0.8181875 2023-10-04 05:16:18,907 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_37152_210_9697_1_1533106573_3844188_416-333249-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 05:16:21,438 WARNING [train.py:1204] (2/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_538-32291-0_sp0.9 from training. Duration: 0.92225 2023-10-04 05:16:22,915 WARNING [train.py:1204] (2/4) Exclude cut with ID _32704_210_8954_1_1533017488840_6747370_965-176824-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 05:16:26,427 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.nonlin_attention.balancer.max_positive, batch_count=1538346.6666666667, ans=0.95 2023-10-04 05:16:27,683 WARNING [train.py:1204] (2/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_86-19601-0_sp1.1 from training. Duration: 0.77275 2023-10-04 05:16:28,983 WARNING [train.py:1204] (2/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_436-318113-0_sp1.1 from training. Duration: 0.6818125 2023-10-04 05:16:29,012 WARNING [train.py:1204] (2/4) Exclude cut with ID _41817_210_18835_1_1533434477160_7423049_555-293448-0_sp0.9 from training. Duration: 0.888875 2023-10-04 05:16:29,034 WARNING [train.py:1204] (2/4) Exclude cut with ID _4697_210_2069_1_1526815801953_3823098_68-219807-0 from training. Duration: 0.47 2023-10-04 05:16:33,638 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7544_210_6753_1_1528591327_1471426_33-346031-0_sp1.1 from training. Duration: 0.5545625 2023-10-04 05:16:33,639 WARNING [train.py:1204] (2/4) Exclude cut with ID _49748_210_24116_1_1533986971737_3158910_263-52113-0_sp1.1 from training. Duration: 0.97275 2023-10-04 05:16:33,686 WARNING [train.py:1204] (2/4) Exclude cut with ID _64639_210_5816_1_1535367619437_3528000_340-24580-0_sp1.1 from training. Duration: 0.92725 2023-10-04 05:16:33,700 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_47228_210_4983_1_1533780021_3672027_479-114941-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 05:16:33,735 WARNING [train.py:1204] (2/4) Exclude cut with ID _44585_210_18628_1_1533605350387_4518199_506-119659-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 05:16:35,159 WARNING [train.py:1204] (2/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_314-126819-0_sp1.1 from training. Duration: 0.4545625 2023-10-04 05:16:35,164 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_2541_210_1794_1_1525590269_3084765_156-173713-0_sp0.9 from training. Duration: 0.9 2023-10-04 05:16:35,204 WARNING [train.py:1204] (2/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_108-255430-0 from training. Duration: 0.97 2023-10-04 05:16:35,221 WARNING [train.py:1204] (2/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_791-45149-0_sp1.1 from training. Duration: 0.7818125 2023-10-04 05:16:35,227 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_53378_210_17282_1_1534227054_1261425_10-118295-0_sp1.1 from training. Duration: 0.97275 2023-10-04 05:16:36,598 WARNING [train.py:1204] (2/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_148-158509-0 from training. Duration: 0.41 2023-10-04 05:16:42,695 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_34726_210_4525_1_1533019474_5030395_687-291305-0_sp1.1 from training. Duration: 0.936375 2023-10-04 05:16:43,012 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.attention_skip_rate, batch_count=1538413.3333333333, ans=0.0 2023-10-04 05:16:45,536 WARNING [train.py:1204] (2/4) Exclude cut with ID _14860_210_4915_1_1530702020340_7343240_264-324537-0_sp1.1 from training. Duration: 0.963625 2023-10-04 05:16:49,833 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.ff3_skip_rate, batch_count=1538413.3333333333, ans=0.0 2023-10-04 05:16:50,857 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_41251_210_15368_1_1533429767_4820695_591-46386-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 05:16:50,880 WARNING [train.py:1204] (2/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_86-19601-0_sp0.9 from training. Duration: 0.9444375 2023-10-04 05:16:50,908 WARNING [train.py:1204] (2/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_401-255491-0_sp0.9 from training. Duration: 0.77775 2023-10-04 05:16:52,255 WARNING [train.py:1204] (2/4) Exclude cut with ID _8696_210_1876_1_1528545632440_3345058_209-54069-0_sp1.1 from training. Duration: 0.6090625 2023-10-04 05:16:52,271 WARNING [train.py:1204] (2/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_369-325184-0_sp1.1 from training. Duration: 0.9 2023-10-04 05:16:54,176 WARNING [train.py:1204] (2/4) Exclude cut with ID _14687_210_5854_1_1530703838259_3664889_115-239046-0_sp0.9 from training. Duration: 0.7444375 2023-10-04 05:16:54,212 WARNING [train.py:1204] (2/4) Exclude cut with ID _8064_210_5399_1_1528372299291_4282973_168-50337-0 from training. Duration: 0.84 2023-10-04 05:16:55,466 INFO [train.py:1046] (2/4) Epoch 44, batch 2350, loss[loss=0.1622, simple_loss=0.2407, pruned_loss=0.04183, over 23536.00 frames. ], tot_loss[loss=0.1554, simple_loss=0.2361, pruned_loss=0.03741, over 4709406.70 frames. ], batch size: 106, lr: 2.31e-03, grad_scale: 16.0 2023-10-04 05:16:56,362 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.feed_forward1.hidden_balancer.prob, batch_count=1538480.0, ans=0.125 2023-10-04 05:17:01,568 WARNING [train.py:1204] (2/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_909-64151-0_sp1.1 from training. Duration: 0.8545625 2023-10-04 05:17:01,577 WARNING [train.py:1204] (2/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_597-154640-0 from training. Duration: 0.9 2023-10-04 05:17:07,810 WARNING [train.py:1204] (2/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_126-274694-0 from training. Duration: 0.98 2023-10-04 05:17:09,219 WARNING [train.py:1204] (2/4) Exclude cut with ID _21783_210_11357_1_1531742458730_3762679_344-269786-0_sp1.1 from training. Duration: 0.97275 2023-10-04 05:17:13,604 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_37818_210_4238_1_1533620910_4720083_322-253154-0_sp1.1 from training. Duration: 0.92725 2023-10-04 05:17:13,607 WARNING [train.py:1204] (2/4) Exclude cut with ID _58895_210_8750_1_1535184081320_3649089_221-15823-0_sp1.1 from training. Duration: 0.92725 2023-10-04 05:17:13,619 WARNING [train.py:1204] (2/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_9-21737-0_sp1.1 from training. Duration: 0.8818125 2023-10-04 05:17:13,658 WARNING [train.py:1204] (2/4) Exclude cut with ID _14069_210_5632_1_1530512692033_4803390_522-341588-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 05:17:13,733 WARNING [train.py:1204] (2/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_363-185703-0 from training. Duration: 0.95 2023-10-04 05:17:17,900 WARNING [train.py:1204] (2/4) Exclude cut with ID _41817_210_18835_1_1533434477160_7423049_265-76216-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 05:17:19,590 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=1538546.6666666667, ans=0.1 2023-10-04 05:17:22,103 WARNING [train.py:1204] (2/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_841-341679-0 from training. Duration: 0.58 2023-10-04 05:17:25,099 WARNING [train.py:1204] (2/4) Exclude cut with ID _55429_210_10626_1_1534467588527_7174143_97-335084-0_sp1.1 from training. Duration: 0.8818125 2023-10-04 05:17:28,505 WARNING [train.py:1204] (2/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_690-126306-0_sp0.9 from training. Duration: 0.8666875 2023-10-04 05:17:28,514 WARNING [train.py:1204] (2/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_102-92751-0_sp1.1 from training. Duration: 0.9 2023-10-04 05:17:30,021 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_3787_210_2040_1_1526477544_5478586_603-120324-0_sp1.1 from training. Duration: 0.736375 2023-10-04 05:17:31,364 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_33348_210_5632_1_1533086912_3324030_85-28583-0 from training. Duration: 0.82 2023-10-04 05:17:32,692 WARNING [train.py:1204] (2/4) Exclude cut with ID _60057_210_10640_1_1534903065793_3333150_362-230377-0_sp1.1 from training. Duration: 0.7909375 2023-10-04 05:17:34,687 WARNING [train.py:1204] (2/4) Exclude cut with ID _60113_210_16271_1_1535164212481_4386079_194-146486-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 05:17:34,699 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_53753_210_7234_1_1534320003_3594148_171-277668-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 05:17:34,734 WARNING [train.py:1204] (2/4) Exclude cut with ID _34423_210_8750_1_1533430798456_3330199_251-332788-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 05:17:37,745 WARNING [train.py:1204] (2/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_663-166973-0_sp0.9 from training. Duration: 0.888875 2023-10-04 05:17:40,414 WARNING [train.py:1204] (2/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_927-43827-0 from training. Duration: 0.74 2023-10-04 05:17:41,777 WARNING [train.py:1204] (2/4) Exclude cut with ID _28323_210_16187_1_1532480395667_4100300_306-74561-0_sp1.1 from training. Duration: 0.8545625 2023-10-04 05:17:44,617 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.0.layers.0.feed_forward2.out_whiten, num_groups=1, num_channels=192, metric=5.42 vs. limit=15.0 2023-10-04 05:17:44,923 WARNING [train.py:1204] (2/4) Exclude cut with ID _15037_210_2253_1_1530752294104_3267650_139-337989-0_sp1.1 from training. Duration: 0.92725 2023-10-04 05:17:44,936 WARNING [train.py:1204] (2/4) Exclude cut with ID _49039_210_6315_1_1533967537541_3846118_460-263670-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 05:17:46,345 WARNING [train.py:1204] (2/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_186-309161-0 from training. Duration: 0.54 2023-10-04 05:17:47,808 WARNING [train.py:1204] (2/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_87-278686-0_sp1.1 from training. Duration: 0.7 2023-10-04 05:17:49,306 WARNING [train.py:1204] (2/4) Exclude cut with ID _27052_210_5986_1_1532329197187_3585009_314-106499-0 from training. Duration: 0.92 2023-10-04 05:17:49,322 WARNING [train.py:1204] (2/4) Exclude cut with ID _3465_210_3525_1_1526175667060_3574049_412-287526-0_sp0.9 from training. Duration: 0.8 2023-10-04 05:17:53,377 WARNING [train.py:1204] (2/4) Exclude cut with ID _22961_210_8254_1_1531976366672_4278180_317-199546-0 from training. Duration: 0.86 2023-10-04 05:17:56,574 WARNING [train.py:1204] (2/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_381-317791-0 from training. Duration: 0.73 2023-10-04 05:17:56,629 WARNING [train.py:1204] (2/4) Exclude cut with ID _44010_210_4525_1_1533884262964_4013620_560-310411-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 05:17:56,634 WARNING [train.py:1204] (2/4) Exclude cut with ID _5294_210_1747_1_1527249643928_3696179_342-270278-0_sp0.9 from training. Duration: 0.611125 2023-10-04 05:17:56,652 WARNING [train.py:1204] (2/4) Exclude cut with ID ID0473W0002-918951 from training. Duration: 0.924 2023-10-04 05:17:58,365 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1083-220122-0 from training. Duration: 0.51 2023-10-04 05:17:59,860 WARNING [train.py:1204] (2/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_138-154812-0 from training. Duration: 0.78 2023-10-04 05:18:01,031 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.679e+02 2.046e+02 2.383e+02 2.695e+02 3.864e+02, threshold=4.767e+02, percent-clipped=0.0 2023-10-04 05:18:03,055 WARNING [train.py:1204] (2/4) Exclude cut with ID _44982_210_1511_1_1534312820108_3726029_331-19420-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 05:18:03,310 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.balancer2.prob, batch_count=1538746.6666666667, ans=0.125 2023-10-04 05:18:05,830 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_2655_210_2087_1_1525688959_3897400_202-147557-0_sp0.9 from training. Duration: 0.9444375 2023-10-04 05:18:09,845 INFO [train.py:1046] (2/4) Epoch 44, batch 2400, loss[loss=0.16, simple_loss=0.2442, pruned_loss=0.03794, over 24309.00 frames. ], tot_loss[loss=0.1551, simple_loss=0.2354, pruned_loss=0.03739, over 4705356.76 frames. ], batch size: 61, lr: 2.31e-03, grad_scale: 16.0 2023-10-04 05:18:11,291 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_23850_210_4238_1_1532232597_1632433_32-185135-0_sp0.9 from training. Duration: 0.9555625 2023-10-04 05:18:11,464 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.feed_forward3.hidden_balancer.prob, batch_count=1538813.3333333333, ans=0.125 2023-10-04 05:18:14,553 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_14056_210_4163_1_1530604483_7572827_46-93547-0_sp1.1 from training. Duration: 0.863625 2023-10-04 05:18:14,594 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7931_210_1794_1_1528594728_5564059_280-279560-0 from training. Duration: 0.98 2023-10-04 05:18:14,636 WARNING [train.py:1204] (2/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_937-208281-0 from training. Duration: 0.85 2023-10-04 05:18:21,650 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_14928_210_5632_1_1530773461_4714000_454-112198-0_sp0.9 from training. Duration: 0.7333125 2023-10-04 05:18:21,651 WARNING [train.py:1204] (2/4) Exclude cut with ID _32704_210_8954_1_1533017488840_6747370_944-153333-0_sp1.1 from training. Duration: 0.963625 2023-10-04 05:18:24,468 WARNING [train.py:1204] (2/4) Exclude cut with ID _14821_210_8750_1_1530680503991_6829359_2-227151-0 from training. Duration: 0.83 2023-10-04 05:18:24,492 WARNING [train.py:1204] (2/4) Exclude cut with ID _28461_210_2253_1_1532771775189_3041299_96-152158-0_sp1.1 from training. Duration: 0.77275 2023-10-04 05:18:25,832 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7837_210_1792_1_1528453445_5841305_654-54195-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 05:18:25,874 WARNING [train.py:1204] (2/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_344-152526-0 from training. Duration: 0.53 2023-10-04 05:18:26,820 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.balancer2.prob, batch_count=1538880.0, ans=0.125 2023-10-04 05:18:29,246 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_30167_210_3528_1_1532428012_4772375_480-142230-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 05:18:33,745 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_160-218721-0 from training. Duration: 0.96 2023-10-04 05:18:37,832 WARNING [train.py:1204] (2/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_65-88242-0_sp0.9 from training. Duration: 0.62225 2023-10-04 05:18:40,810 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.feed_forward1.hidden_balancer.prob, batch_count=1538946.6666666667, ans=0.125 2023-10-04 05:18:43,392 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_34886_210_10633_1_1533203882_1755661_76-156096-0 from training. Duration: 0.87 2023-10-04 05:18:44,813 WARNING [train.py:1204] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_188-38388-0_sp1.1 from training. Duration: 0.92725 2023-10-04 05:18:46,716 WARNING [train.py:1204] (2/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_81-289335-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 05:18:48,888 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.3.encoder.layers.3.feed_forward3.out_whiten, num_groups=1, num_channels=512, metric=11.68 vs. limit=15.0 2023-10-04 05:18:50,159 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.3.encoder.layers.3.self_attn2.whiten, num_groups=1, num_channels=512, metric=12.98 vs. limit=22.5 2023-10-04 05:18:50,823 WARNING [train.py:1204] (2/4) Exclude cut with ID _59493_210_17282_1_1535078419077_3225879_110-8778-0_sp1.1 from training. Duration: 0.936375 2023-10-04 05:18:52,236 WARNING [train.py:1204] (2/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_188-260011-0 from training. Duration: 0.73 2023-10-04 05:18:52,270 WARNING [train.py:1204] (2/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_453-133983-0_sp0.9 from training. Duration: 0.8666875 2023-10-04 05:18:55,303 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.ff2_skip_rate, batch_count=1539013.3333333333, ans=0.0 2023-10-04 05:19:00,462 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8054_210_7927_1_1528627721_4984094_553-17379-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 05:19:01,795 WARNING [train.py:1204] (2/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_341-171072-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 05:19:05,235 WARNING [train.py:1204] (2/4) Exclude cut with ID _30800_210_22382_1_1532499347904_4515959_265-257286-0_sp1.1 from training. Duration: 0.963625 2023-10-04 05:19:06,590 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_403-39619-0_sp0.9 from training. Duration: 0.9333125 2023-10-04 05:19:06,594 WARNING [train.py:1204] (2/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_464-33931-0_sp0.9 from training. Duration: 0.6 2023-10-04 05:19:06,614 WARNING [train.py:1204] (2/4) Exclude cut with ID _60804_210_9642_1_1534831199859_7268299_632-143863-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 05:19:06,620 WARNING [train.py:1204] (2/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_431-310997-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 05:19:07,927 WARNING [train.py:1204] (2/4) Exclude cut with ID _31649_210_6228_1_1532584608063_4091078_114-263860-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 05:19:07,938 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6961_210_2087_1_1527937168_6815011_562-165518-0_sp0.9 from training. Duration: 0.8555625 2023-10-04 05:19:08,138 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.balancer2.prob, batch_count=1539080.0, ans=0.125 2023-10-04 05:19:10,866 WARNING [train.py:1204] (2/4) Exclude cut with ID _27052_210_5986_1_1532329197187_3585009_338-325870-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 05:19:10,907 WARNING [train.py:1204] (2/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_469-94673-0_sp1.1 from training. Duration: 0.7181875 2023-10-04 05:19:10,943 WARNING [train.py:1204] (2/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_259-34038-0 from training. Duration: 0.74 2023-10-04 05:19:12,375 WARNING [train.py:1204] (2/4) Exclude cut with ID _14922_210_6228_1_1530707435273_3693310_388-129552-0 from training. Duration: 0.96 2023-10-04 05:19:15,142 WARNING [train.py:1204] (2/4) Exclude cut with ID _28394_210_8913_1_1532430049518_3650118_342-251455-0_sp1.1 from training. Duration: 0.936375 2023-10-04 05:19:15,161 WARNING [train.py:1204] (2/4) Exclude cut with ID _53731_210_7994_1_1534553979244_3757214_8-191390-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 05:19:15,218 WARNING [train.py:1204] (2/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_613-232856-0 from training. Duration: 0.54 2023-10-04 05:19:16,481 WARNING [train.py:1204] (2/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_557-349534-0 from training. Duration: 0.91 2023-10-04 05:19:16,496 WARNING [train.py:1204] (2/4) Exclude cut with ID _14224_210_4381_1_1530529062037_4264010_379-184223-0 from training. Duration: 0.85 2023-10-04 05:19:16,505 WARNING [train.py:1204] (2/4) Exclude cut with ID IC0125W0001-61807 from training. Duration: 0.966 2023-10-04 05:19:18,339 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_229-45300-0 from training. Duration: 0.57 2023-10-04 05:19:19,650 WARNING [train.py:1204] (2/4) Exclude cut with ID _30304_210_6319_1_1532476827261_7582060_1069-24743-0_sp1.1 from training. Duration: 0.92725 2023-10-04 05:19:21,031 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_41452_210_7291_1_1533437687_3941281_323-172873-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 05:19:21,049 WARNING [train.py:1204] (2/4) Exclude cut with ID _37086_210_9038_1_1532999540891_7021250_802-226709-0_sp1.1 from training. Duration: 0.97275 2023-10-04 05:19:21,129 WARNING [train.py:1204] (2/4) Exclude cut with ID IC0144W0500-71767 from training. Duration: 0.931875 2023-10-04 05:19:22,576 WARNING [train.py:1204] (2/4) Exclude cut with ID _39846_210_12651_1_1533605310265_2115212_233-129699-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 05:19:23,833 INFO [train.py:1046] (2/4) Epoch 44, batch 2450, loss[loss=0.1678, simple_loss=0.2602, pruned_loss=0.03777, over 24567.00 frames. ], tot_loss[loss=0.1543, simple_loss=0.2347, pruned_loss=0.03693, over 4710657.92 frames. ], batch size: 71, lr: 2.31e-03, grad_scale: 16.0 2023-10-04 05:19:23,896 WARNING [train.py:1204] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_574-45541-0_sp0.9 from training. Duration: 0.72225 2023-10-04 05:19:24,130 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.ff3_skip_rate, batch_count=1539146.6666666667, ans=0.0 2023-10-04 05:19:27,193 WARNING [train.py:1204] (2/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_372-298093-0_sp1.1 from training. Duration: 0.72725 2023-10-04 05:19:27,205 WARNING [train.py:1204] (2/4) Exclude cut with ID _62574_210_2655_1_1535180407058_3933249_34-26343-0_sp1.1 from training. Duration: 0.97275 2023-10-04 05:19:29,996 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_512-256238-0_sp1.1 from training. Duration: 0.963625 2023-10-04 05:19:30,001 WARNING [train.py:1204] (2/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_196-55502-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 05:19:31,980 WARNING [train.py:1204] (2/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_195-167011-0 from training. Duration: 0.75 2023-10-04 05:19:37,352 WARNING [train.py:1204] (2/4) Exclude cut with ID _13678_210_7497_1_1530530069226_3463170_345-230416-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 05:19:37,363 WARNING [train.py:1204] (2/4) Exclude cut with ID _51078_210_9070_1_1534161557424_3805250_225-327809-0_sp1.1 from training. Duration: 0.963625 2023-10-04 05:19:38,844 WARNING [train.py:1204] (2/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_18-189198-0_sp1.1 from training. Duration: 0.8181875 2023-10-04 05:19:38,862 WARNING [train.py:1204] (2/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_329-82235-0_sp1.1 from training. Duration: 0.7454375 2023-10-04 05:19:40,175 WARNING [train.py:1204] (2/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_726-270780-0_sp0.9 from training. Duration: 0.9555625 2023-10-04 05:19:40,210 WARNING [train.py:1204] (2/4) Exclude cut with ID _66873_210_5517_1_1535454136506_4293370_654-52713-0 from training. Duration: 0.77 2023-10-04 05:19:44,354 WARNING [train.py:1204] (2/4) Exclude cut with ID _20455_210_5219_1_1531565996775_8098100_191-16006-0_sp1.1 from training. Duration: 0.963625 2023-10-04 05:19:48,903 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_3171_210_2087_1_1526109084_2330007_66-260583-0_sp1.1 from training. Duration: 0.6545625 2023-10-04 05:19:48,945 WARNING [train.py:1204] (2/4) Exclude cut with ID _38670_210_15379_1_1533448810659_4391059_63-129006-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 05:19:51,814 WARNING [train.py:1204] (2/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_393-57888-0_sp0.9 from training. Duration: 0.711125 2023-10-04 05:19:51,841 WARNING [train.py:1204] (2/4) Exclude cut with ID _6275_210_2084_1_1528110062213_7669049_857-298942-0_sp0.9 from training. Duration: 0.9444375 2023-10-04 05:19:53,252 WARNING [train.py:1204] (2/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_239-22076-0_sp0.9 from training. Duration: 0.9444375 2023-10-04 05:19:53,291 WARNING [train.py:1204] (2/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_424-227947-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 05:19:56,569 WARNING [train.py:1204] (2/4) Exclude cut with ID _14069_210_5632_1_1530512692033_4803390_95-1896-0 from training. Duration: 0.98 2023-10-04 05:19:57,931 WARNING [train.py:1204] (2/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_96-244299-0_sp0.9 from training. Duration: 0.97775 2023-10-04 05:20:04,140 WARNING [train.py:1204] (2/4) Exclude cut with ID _46683_210_9642_1_1533730913492_4591116_562-319825-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 05:20:04,323 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=1539280.0, ans=0.1 2023-10-04 05:20:05,539 WARNING [train.py:1204] (2/4) Exclude cut with ID _64639_210_5816_1_1535367619437_3528000_284-173474-0_sp1.1 from training. Duration: 0.963625 2023-10-04 05:20:05,710 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.conv_module2.balancer2.prob, batch_count=1539280.0, ans=0.125 2023-10-04 05:20:06,760 WARNING [train.py:1204] (2/4) Exclude cut with ID _29529_210_7234_1_1532393961455_3977460_497-100133-0_sp1.1 from training. Duration: 0.936375 2023-10-04 05:20:06,796 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_12868_210_6753_1_1530701932_530275_18-76322-0_sp0.9 from training. Duration: 0.9666875 2023-10-04 05:20:06,843 WARNING [train.py:1204] (2/4) Exclude cut with ID _50093_210_4220_1_1534417141207_3432299_116-303447-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 05:20:08,153 WARNING [train.py:1204] (2/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_44-79427-0_sp1.1 from training. Duration: 0.8909375 2023-10-04 05:20:08,227 WARNING [train.py:1204] (2/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_887-249555-0 from training. Duration: 0.77 2023-10-04 05:20:12,217 WARNING [train.py:1204] (2/4) Exclude cut with ID _28461_210_2253_1_1532771775189_3041299_96-152158-0_sp0.9 from training. Duration: 0.9444375 2023-10-04 05:20:12,283 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_14058_210_4163_1_1530690953_6327180_49-210687-0_sp1.1 from training. Duration: 0.8545625 2023-10-04 05:20:16,386 WARNING [train.py:1204] (2/4) Exclude cut with ID _40399_210_4353_1_1533305658194_3279406_351-127903-0_sp1.1 from training. Duration: 0.97275 2023-10-04 05:20:16,393 WARNING [train.py:1204] (2/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_184-206939-0_sp1.1 from training. Duration: 0.936375 2023-10-04 05:20:22,164 WARNING [train.py:1204] (2/4) Exclude cut with ID _14100_210_8341_1_1530529320613_3678069_258-224599-0_sp1.1 from training. Duration: 0.7 2023-10-04 05:20:22,189 WARNING [train.py:1204] (2/4) Exclude cut with ID _6493_210_1747_1_1528459210424_4012470_324-323854-0 from training. Duration: 0.92 2023-10-04 05:20:23,433 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_12868_210_6753_1_1530702479_3610564_151-325626-0_sp1.1 from training. Duration: 0.8818125 2023-10-04 05:20:23,692 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.bypass_mid.scale_min, batch_count=1539413.3333333333, ans=0.2 2023-10-04 05:20:24,765 WARNING [train.py:1204] (2/4) Exclude cut with ID _14528_210_8254_1_1530669700514_4306939_106-43145-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 05:20:24,788 WARNING [train.py:1204] (2/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_686-54383-0 from training. Duration: 0.87 2023-10-04 05:20:24,836 WARNING [train.py:1204] (2/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_801-102362-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 05:20:26,757 WARNING [train.py:1204] (2/4) Exclude cut with ID _48677_210_5663_1_1533902405919_3892989_408-40740-0_sp0.9 from training. Duration: 0.92225 2023-10-04 05:20:27,987 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.633e+02 1.926e+02 2.194e+02 2.495e+02 3.736e+02, threshold=4.388e+02, percent-clipped=0.0 2023-10-04 05:20:29,703 WARNING [train.py:1204] (2/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_448-212135-0_sp1.1 from training. Duration: 0.82725 2023-10-04 05:20:32,515 WARNING [train.py:1204] (2/4) Exclude cut with ID _59170_210_6721_1_1534983633960_4203335_320-124047-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 05:20:32,573 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_4692_210_3528_1_1526899755_4323254_107-44401-0_sp1.1 from training. Duration: 0.7818125 2023-10-04 05:20:36,033 WARNING [train.py:1204] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_731-2428-0 from training. Duration: 0.83 2023-10-04 05:20:37,279 INFO [train.py:1046] (2/4) Epoch 44, batch 2500, loss[loss=0.1561, simple_loss=0.2315, pruned_loss=0.0404, over 23753.00 frames. ], tot_loss[loss=0.1539, simple_loss=0.234, pruned_loss=0.03691, over 4699550.31 frames. ], batch size: 212, lr: 2.31e-03, grad_scale: 16.0 2023-10-04 05:20:37,367 WARNING [train.py:1204] (2/4) Exclude cut with ID _29857_210_20034_1_1532395886753_3798160_335-163562-0_sp1.1 from training. Duration: 0.77275 2023-10-04 05:20:42,968 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.feed_forward1.hidden_balancer.prob, batch_count=1539480.0, ans=0.125 2023-10-04 05:20:44,104 WARNING [train.py:1204] (2/4) Exclude cut with ID _14832_210_8750_1_1530691351539_3171348_171-275004-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 05:20:52,391 WARNING [train.py:1204] (2/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_105-328174-0_sp0.9 from training. Duration: 0.8444375 2023-10-04 05:20:52,416 WARNING [train.py:1204] (2/4) Exclude cut with ID _68965_210_1883_1_1535698962272_3497490_164-118421-0_sp1.1 from training. Duration: 0.936375 2023-10-04 05:20:52,621 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=1539546.6666666667, ans=0.1 2023-10-04 05:20:54,272 WARNING [train.py:1204] (2/4) Exclude cut with ID _19896_210_1883_1_1531709022662_6649569_380-24170-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 05:20:54,277 WARNING [train.py:1204] (2/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_505-205270-0_sp1.1 from training. Duration: 0.3454375 2023-10-04 05:21:00,602 WARNING [train.py:1204] (2/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_427-208901-0_sp1.1 from training. Duration: 0.6181875 2023-10-04 05:21:01,924 WARNING [train.py:1204] (2/4) Exclude cut with ID _23818_210_6228_1_1531917063884_3676920_85-326916-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 05:21:02,620 WARNING [train.py:1204] (2/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_101-293367-0_sp1.1 from training. Duration: 0.57275 2023-10-04 05:21:02,628 WARNING [train.py:1204] (2/4) Exclude cut with ID _5409_210_1388_1_1527292850189_5865191_178-127970-0_sp1.1 from training. Duration: 0.5181875 2023-10-04 05:21:02,680 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_403-39619-0 from training. Duration: 0.84 2023-10-04 05:21:03,896 WARNING [train.py:1204] (2/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_427-87841-0_sp1.1 from training. Duration: 0.963625 2023-10-04 05:21:05,831 WARNING [train.py:1204] (2/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_286-68181-0_sp1.1 from training. Duration: 0.97275 2023-10-04 05:21:07,130 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6673_210_3273_1_1527767790_3173240_327-306185-0 from training. Duration: 0.83 2023-10-04 05:21:07,155 WARNING [train.py:1204] (2/4) Exclude cut with ID _3675_210_4557_1_1526639934408_4182089_106-203713-0_sp1.1 from training. Duration: 0.963625 2023-10-04 05:21:07,221 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_53071_210_16554_1_1534493434_1276796_130-36076-0 from training. Duration: 0.95 2023-10-04 05:21:08,542 WARNING [train.py:1204] (2/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_586-93054-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 05:21:12,656 WARNING [train.py:1204] (2/4) Exclude cut with ID _23825_210_13449_1_1531915313526_3497150_262-10571-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 05:21:13,991 WARNING [train.py:1204] (2/4) Exclude cut with ID _28783_210_7943_1_1532671164808_7155479_432-67885-0_sp1.1 from training. Duration: 0.97275 2023-10-04 05:21:16,684 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6287_210_3273_1_1527509506_2653716_182-199887-0_sp0.9 from training. Duration: 0.5666875 2023-10-04 05:21:17,924 WARNING [train.py:1204] (2/4) Exclude cut with ID _3231_210_2106_1_1526205864934_6784220_171-256004-0 from training. Duration: 0.86 2023-10-04 05:21:17,976 WARNING [train.py:1204] (2/4) Exclude cut with ID _69051_210_1531_1_1535885566901_3829538_332-92090-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 05:21:19,425 WARNING [train.py:1204] (2/4) Exclude cut with ID _3675_210_4557_1_1526639934408_4182089_104-324125-0_sp1.1 from training. Duration: 0.963625 2023-10-04 05:21:23,474 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_41764_210_2521_1_1533686110_4089313_501-2119-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 05:21:26,709 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_14056_210_4163_1_1530604483_7572827_56-100988-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 05:21:28,217 WARNING [train.py:1204] (2/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_398-239819-0_sp1.1 from training. Duration: 0.9 2023-10-04 05:21:28,857 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.2.encoder.layers.2.feed_forward2.out_whiten, num_groups=1, num_channels=384, metric=4.25 vs. limit=15.0 2023-10-04 05:21:36,145 WARNING [train.py:1204] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_74-48717-0_sp1.1 from training. Duration: 0.52725 2023-10-04 05:21:39,645 WARNING [train.py:1204] (2/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_177-49092-0 from training. Duration: 0.91 2023-10-04 05:21:39,663 WARNING [train.py:1204] (2/4) Exclude cut with ID _62337_210_12392_1_1535009273105_3412229_44-176353-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 05:21:39,671 WARNING [train.py:1204] (2/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_110-130961-0_sp1.1 from training. Duration: 0.42725 2023-10-04 05:21:41,132 WARNING [train.py:1204] (2/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_37-189262-0_sp1.1 from training. Duration: 0.8909375 2023-10-04 05:21:41,132 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_3172_210_2087_1_1526292929_3987865_197-97762-0_sp0.9 from training. Duration: 0.8333125 2023-10-04 05:21:43,696 WARNING [train.py:1204] (2/4) Exclude cut with ID IC0677W0065-334897 from training. Duration: 0.896 2023-10-04 05:21:43,696 WARNING [train.py:1204] (2/4) Exclude cut with ID IC0677W0080-334912 from training. Duration: 0.768 2023-10-04 05:21:43,701 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_120-41668-0 from training. Duration: 0.7 2023-10-04 05:21:45,296 WARNING [train.py:1204] (2/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_460-205287-0_sp1.1 from training. Duration: 0.963625 2023-10-04 05:21:45,551 INFO [scaling.py:1118] (2/4) WithLoss: name=encoder.encoders.4.encoder.layers.1.self_attn_weights, loss-sum=0.000e+00 2023-10-04 05:21:47,957 WARNING [train.py:1204] (2/4) Exclude cut with ID _14632_210_9988_1_1530691279392_3923200_102-204477-0 from training. Duration: 0.97 2023-10-04 05:21:47,970 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_34815_210_6388_1_1533381320_3571903_279-305565-0 from training. Duration: 0.92 2023-10-04 05:21:49,365 WARNING [train.py:1204] (2/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_91-148831-0_sp1.1 from training. Duration: 0.9 2023-10-04 05:21:49,426 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6291_210_3273_1_1527683649_1078498_82-221196-0 from training. Duration: 0.76 2023-10-04 05:21:50,651 INFO [train.py:1046] (2/4) Epoch 44, batch 2550, loss[loss=0.1628, simple_loss=0.2371, pruned_loss=0.04426, over 23733.00 frames. ], tot_loss[loss=0.1538, simple_loss=0.234, pruned_loss=0.0368, over 4705703.14 frames. ], batch size: 212, lr: 2.31e-03, grad_scale: 16.0 2023-10-04 05:21:50,840 WARNING [train.py:1204] (2/4) Exclude cut with ID _38020_210_5986_1_1533112252259_7006089_493-102745-0 from training. Duration: 0.98 2023-10-04 05:21:53,520 WARNING [train.py:1204] (2/4) Exclude cut with ID _61133_210_9969_1_1535196777227_3696409_40-93442-0_sp1.1 from training. Duration: 0.92725 2023-10-04 05:21:56,639 WARNING [train.py:1204] (2/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_45-258852-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 05:21:58,008 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_53071_210_16554_1_1534493434_1276796_130-36076-0_sp1.1 from training. Duration: 0.863625 2023-10-04 05:21:59,414 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8694_210_6400_1_1529065754_3260756_332-13553-0_sp1.1 from training. Duration: 0.92725 2023-10-04 05:22:00,815 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_405-339986-0 from training. Duration: 0.51 2023-10-04 05:22:00,879 WARNING [train.py:1204] (2/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_927-43827-0_sp1.1 from training. Duration: 0.67275 2023-10-04 05:22:04,298 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_397-147196-0 from training. Duration: 0.8 2023-10-04 05:22:06,288 WARNING [train.py:1204] (2/4) Exclude cut with ID 3557-8342-0013-54691-0_sp1.1 from training. Duration: 0.836375 2023-10-04 05:22:07,750 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_47438_210_5399_1_1533797471_4513356_626-127722-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 05:22:09,936 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.conv_skip_rate, batch_count=1539880.0, ans=0.0 2023-10-04 05:22:11,001 WARNING [train.py:1204] (2/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_256-323883-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 05:22:11,005 WARNING [train.py:1204] (2/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_839-173017-0_sp0.9 from training. Duration: 0.4555625 2023-10-04 05:22:12,334 WARNING [train.py:1204] (2/4) Exclude cut with ID _5289_210_2084_1_1527678202736_3779299_392-261988-0_sp1.1 from training. Duration: 0.5909375 2023-10-04 05:22:12,381 WARNING [train.py:1204] (2/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_119-224384-0_sp1.1 from training. Duration: 0.936375 2023-10-04 05:22:12,417 WARNING [train.py:1204] (2/4) Exclude cut with ID _57523_210_1856_1_1534766294545_3732989_262-267933-0_sp1.1 from training. Duration: 0.963625 2023-10-04 05:22:15,155 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_35361_210_4238_1_1533262810_7793476_675-87012-0_sp0.9 from training. Duration: 0.988875 2023-10-04 05:22:15,181 WARNING [train.py:1204] (2/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_17-90541-0 from training. Duration: 0.94 2023-10-04 05:22:15,207 WARNING [train.py:1204] (2/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_535-14410-0_sp1.1 from training. Duration: 0.42725 2023-10-04 05:22:15,220 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_24673_210_6400_1_1531968633_778659_94-154511-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 05:22:15,223 WARNING [train.py:1204] (2/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_212-10501-0 from training. Duration: 0.94 2023-10-04 05:22:28,928 WARNING [train.py:1204] (2/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_383-285196-0_sp1.1 from training. Duration: 0.8818125 2023-10-04 05:22:34,008 WARNING [train.py:1204] (2/4) Exclude cut with ID _45560_210_21982_1_1533812603951_3584999_238-47020-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 05:22:34,024 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_21500_210_2054_1_1532149310_2716758_278-18053-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 05:22:34,031 WARNING [train.py:1204] (2/4) Exclude cut with ID _56235_210_10640_1_1534557335235_4161099_453-9626-0_sp1.1 from training. Duration: 0.936375 2023-10-04 05:22:35,925 WARNING [train.py:1204] (2/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_153-97441-0_sp1.1 from training. Duration: 0.5090625 2023-10-04 05:22:40,803 WARNING [train.py:1204] (2/4) Exclude cut with ID _67725_210_27767_1_1535608756671_5105359_431-120895-0_sp1.1 from training. Duration: 0.963625 2023-10-04 05:22:43,582 WARNING [train.py:1204] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_574-45541-0_sp1.1 from training. Duration: 0.5909375 2023-10-04 05:22:43,597 WARNING [train.py:1204] (2/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_162-103620-0_sp1.1 from training. Duration: 0.8454375 2023-10-04 05:22:43,608 WARNING [train.py:1204] (2/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_734-129761-0_sp1.1 from training. Duration: 0.7454375 2023-10-04 05:22:43,643 WARNING [train.py:1204] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_620-276793-0_sp1.1 from training. Duration: 0.536375 2023-10-04 05:22:44,875 WARNING [train.py:1204] (2/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_153-97441-0_sp0.9 from training. Duration: 0.62225 2023-10-04 05:22:45,130 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.conv_module1.balancer2.prob, batch_count=1540013.3333333333, ans=0.125 2023-10-04 05:22:47,598 WARNING [train.py:1204] (2/4) Exclude cut with ID _12733_210_8341_1_1530167424556_3646380_523-85621-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 05:22:48,860 WARNING [train.py:1204] (2/4) Exclude cut with ID _64048_210_10626_1_1535198382802_3835179_352-179834-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 05:22:52,949 WARNING [train.py:1204] (2/4) Exclude cut with ID _55162_210_17071_1_1534408248583_3753480_43-242364-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 05:22:52,971 WARNING [train.py:1204] (2/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_661-264857-0 from training. Duration: 0.93 2023-10-04 05:22:52,978 WARNING [train.py:1204] (2/4) Exclude cut with ID _6274_210_2084_1_1527937583837_7494020_325-237800-0_sp1.1 from training. Duration: 0.97275 2023-10-04 05:22:54,158 WARNING [train.py:1204] (2/4) Exclude cut with ID _55328_210_27767_1_1534580973260_4088170_385-171459-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 05:22:55,560 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_3780_210_2087_1_1526467389_2145572_176-107140-0_sp1.1 from training. Duration: 0.62725 2023-10-04 05:22:56,822 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.632e+02 1.968e+02 2.128e+02 2.363e+02 3.523e+02, threshold=4.255e+02, percent-clipped=0.0 2023-10-04 05:22:56,950 WARNING [train.py:1204] (2/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_375-244976-0_sp1.1 from training. Duration: 0.6818125 2023-10-04 05:22:58,389 WARNING [train.py:1204] (2/4) Exclude cut with ID _35941_210_15379_1_1532995294515_4023089_309-33162-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 05:23:04,897 INFO [train.py:1046] (2/4) Epoch 44, batch 2600, loss[loss=0.1457, simple_loss=0.2298, pruned_loss=0.03077, over 21016.00 frames. ], tot_loss[loss=0.1545, simple_loss=0.2351, pruned_loss=0.03691, over 4717604.53 frames. ], batch size: 46, lr: 2.31e-03, grad_scale: 8.0 2023-10-04 05:23:05,049 WARNING [train.py:1204] (2/4) Exclude cut with ID _14459_210_2228_1_1531443641946_7516700_1185-22038-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 05:23:07,795 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_1197-69460-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 05:23:09,681 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9012W0022-501157 from training. Duration: 0.896 2023-10-04 05:23:12,975 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9023W0009-506302 from training. Duration: 0.896 2023-10-04 05:23:12,991 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_2821_210_2715_1_1526108385_3763560_73-296673-0_sp1.1 from training. Duration: 0.7545625 2023-10-04 05:23:13,029 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9025W0113-507442 from training. Duration: 0.896 2023-10-04 05:23:14,416 WARNING [train.py:1204] (2/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_739-2098-0 from training. Duration: 0.92 2023-10-04 05:23:14,427 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9029W0332-509722 from training. Duration: 0.768 2023-10-04 05:23:17,076 WARNING [train.py:1204] (2/4) Exclude cut with ID _16908_210_5724_1_1531119157248_3772700_68-101953-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 05:23:17,112 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9042W0011-515617 from training. Duration: 0.768 2023-10-04 05:23:18,444 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_3019_210_2028_1_1526044245_3264865_191-72235-0 from training. Duration: 0.97 2023-10-04 05:23:19,339 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.1.encoder.layers.0.self_attn_weights.whiten_keys, num_groups=4, num_channels=128, metric=3.75 vs. limit=6.0 2023-10-04 05:23:19,856 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9052W0006-520792 from training. Duration: 0.896 2023-10-04 05:23:21,237 WARNING [train.py:1204] (2/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_245-174553-0_sp1.1 from training. Duration: 0.8 2023-10-04 05:23:23,790 WARNING [train.py:1204] (2/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_202-67017-0 from training. Duration: 0.82 2023-10-04 05:23:23,891 WARNING [train.py:1204] (2/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_304-59227-0 from training. Duration: 0.77 2023-10-04 05:23:25,279 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_3133_210_2087_1_1526106446_2324690_246-125000-0_sp0.9 from training. Duration: 0.688875 2023-10-04 05:23:26,687 WARNING [train.py:1204] (2/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_243-42116-0 from training. Duration: 0.73 2023-10-04 05:23:29,409 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9087W0121-538492 from training. Duration: 0.896 2023-10-04 05:23:29,427 WARNING [train.py:1204] (2/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_120-76843-0 from training. Duration: 0.55 2023-10-04 05:23:36,947 WARNING [train.py:1204] (2/4) Exclude cut with ID _7852_210_5219_1_1528286471316_8139400_850-304621-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 05:23:38,259 WARNING [train.py:1204] (2/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_154-285415-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 05:23:38,271 WARNING [train.py:1204] (2/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_538-278194-0_sp1.1 from training. Duration: 0.92725 2023-10-04 05:23:38,271 WARNING [train.py:1204] (2/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_119-207162-0 from training. Duration: 0.71 2023-10-04 05:23:40,182 WARNING [train.py:1204] (2/4) Exclude cut with ID _2334_210_2667_1_1525608238880_4705129_459-104997-0_sp1.1 from training. Duration: 0.863625 2023-10-04 05:23:44,875 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9147W0019-567847 from training. Duration: 0.768 2023-10-04 05:23:50,387 WARNING [train.py:1204] (2/4) Exclude cut with ID _69884_210_9771_1_1535846392818_4166169_228-189439-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 05:23:50,422 WARNING [train.py:1204] (2/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_196-77172-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 05:23:50,499 WARNING [train.py:1204] (2/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_625-338546-0 from training. Duration: 0.88 2023-10-04 05:23:51,942 WARNING [train.py:1204] (2/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_193-327630-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 05:23:51,949 WARNING [train.py:1204] (2/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_391-24006-0_sp1.1 from training. Duration: 0.92725 2023-10-04 05:23:53,359 WARNING [train.py:1204] (2/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_197-28164-0 from training. Duration: 0.8 2023-10-04 05:23:53,627 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.attention_skip_rate, batch_count=1540346.6666666667, ans=0.0 2023-10-04 05:23:56,189 WARNING [train.py:1204] (2/4) Exclude cut with ID _8395_210_1531_1_1528597871523_4411579_431-132470-0_sp1.1 from training. Duration: 0.67275 2023-10-04 05:23:56,207 WARNING [train.py:1204] (2/4) Exclude cut with ID _35350_210_16040_1_1533040191183_4075299_465-248326-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 05:23:56,354 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.conv_skip_rate, batch_count=1540346.6666666667, ans=0.0 2023-10-04 05:23:57,595 WARNING [train.py:1204] (2/4) Exclude cut with ID _16756_210_4915_1_1531047803763_7104259_101-141922-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 05:23:59,542 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.5.encoder.layers.0.self_attn_weights.whiten_keys, num_groups=4, num_channels=128, metric=1.63 vs. limit=6.0 2023-10-04 05:24:01,716 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9212W0005-600172 from training. Duration: 0.896 2023-10-04 05:24:01,737 WARNING [train.py:1204] (2/4) Exclude cut with ID _63186_210_2084_1_1535414479705_3908649_378-166820-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 05:24:01,757 WARNING [train.py:1204] (2/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_52-211683-0_sp1.1 from training. Duration: 0.8181875 2023-10-04 05:24:06,869 WARNING [train.py:1204] (2/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_1147-237729-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 05:24:06,937 WARNING [train.py:1204] (2/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_234-14174-0_sp0.9 from training. Duration: 0.911125 2023-10-04 05:24:08,259 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_454-276429-0 from training. Duration: 0.87 2023-10-04 05:24:08,343 WARNING [train.py:1204] (2/4) Exclude cut with ID _53713_210_24116_1_1534314487762_7416751_755-165313-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 05:24:11,500 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8278_210_2550_1_1528637099_4220413_436-317072-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 05:24:13,404 WARNING [train.py:1204] (2/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_28-324729-0_sp1.1 from training. Duration: 0.8545625 2023-10-04 05:24:15,197 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.bypass.skip_rate, batch_count=1540413.3333333333, ans=0.09899494936611666 2023-10-04 05:24:18,948 INFO [train.py:1046] (2/4) Epoch 44, batch 2650, loss[loss=0.1699, simple_loss=0.2573, pruned_loss=0.04125, over 24346.00 frames. ], tot_loss[loss=0.1559, simple_loss=0.2364, pruned_loss=0.03773, over 4708309.57 frames. ], batch size: 77, lr: 2.31e-03, grad_scale: 8.0 2023-10-04 05:24:19,015 WARNING [train.py:1204] (2/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_326-165900-0 from training. Duration: 0.69 2023-10-04 05:24:20,365 WARNING [train.py:1204] (2/4) Exclude cut with ID _53489_210_6228_1_1534298357169_3835970_96-30246-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 05:24:21,970 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8275_210_4198_1_1528455320_3892571_491-17707-0_sp1.1 from training. Duration: 0.5818125 2023-10-04 05:24:27,283 WARNING [train.py:1204] (2/4) Exclude cut with ID _14348_210_8341_1_1530599434996_3778640_240-232370-0 from training. Duration: 0.84 2023-10-04 05:24:27,303 WARNING [train.py:1204] (2/4) Exclude cut with ID _45012_210_12651_1_1533608435136_5371380_54-112623-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 05:24:27,387 WARNING [train.py:1204] (2/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_328-264776-0_sp1.1 from training. Duration: 0.6090625 2023-10-04 05:24:28,800 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9325W0001-648427 from training. Duration: 0.768 2023-10-04 05:24:28,811 WARNING [train.py:1204] (2/4) Exclude cut with ID _56480_210_4220_1_1534679942079_3075409_49-74717-0_sp1.1 from training. Duration: 0.936375 2023-10-04 05:24:30,302 WARNING [train.py:1204] (2/4) Exclude cut with ID _59500_210_8254_1_1534809610680_3736009_6-241869-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 05:24:33,055 WARNING [train.py:1204] (2/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_172-215932-0_sp0.9 from training. Duration: 0.7333125 2023-10-04 05:24:34,324 WARNING [train.py:1204] (2/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_17-90541-0_sp1.1 from training. Duration: 0.8545625 2023-10-04 05:24:34,611 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.conv_module2.balancer2.prob, batch_count=1540546.6666666667, ans=0.125 2023-10-04 05:24:36,189 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_43831_210_2158_1_1533725502_5872791_525-89447-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 05:24:37,502 WARNING [train.py:1204] (2/4) Exclude cut with ID _12302_210_5219_1_1529924446466_8107280_907-182949-0 from training. Duration: 0.92 2023-10-04 05:24:37,517 WARNING [train.py:1204] (2/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_402-102311-0_sp0.9 from training. Duration: 0.7444375 2023-10-04 05:24:37,564 WARNING [train.py:1204] (2/4) Exclude cut with ID _8021_210_7994_1_1528376451358_3653069_179-45206-0_sp0.9 from training. Duration: 0.9555625 2023-10-04 05:24:40,899 WARNING [train.py:1204] (2/4) Exclude cut with ID _40137_210_12210_1_1533290484046_3795180_249-187059-0 from training. Duration: 0.88 2023-10-04 05:24:42,705 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9653W0194-675472 from training. Duration: 0.9658125 2023-10-04 05:24:44,483 WARNING [train.py:1204] (2/4) Exclude cut with ID _51332_210_9988_1_1534244371836_3801270_336-255604-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 05:24:47,119 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_20120_210_6753_1_1531643035_5482859_510-96996-0 from training. Duration: 0.87 2023-10-04 05:24:47,138 WARNING [train.py:1204] (2/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_1167-20437-0_sp1.1 from training. Duration: 0.92725 2023-10-04 05:24:47,188 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_3475_210_2028_1_1526193578_3622569_265-276625-0 from training. Duration: 0.76 2023-10-04 05:24:51,225 WARNING [train.py:1204] (2/4) Exclude cut with ID _14860_210_4915_1_1530702020340_7343240_470-172418-0_sp1.1 from training. Duration: 0.97275 2023-10-04 05:24:51,242 WARNING [train.py:1204] (2/4) Exclude cut with ID _5444_210_2151_1_1527418808464_5312210_581-315411-0_sp1.1 from training. Duration: 0.563625 2023-10-04 05:24:51,273 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_34450_210_9547_1_1533099443_4109050_519-342439-0_sp1.1 from training. Duration: 0.97275 2023-10-04 05:24:51,297 WARNING [train.py:1204] (2/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_50-175961-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 05:24:52,871 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.attention_skip_rate, batch_count=1540613.3333333333, ans=0.0 2023-10-04 05:24:56,729 WARNING [train.py:1204] (2/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_372-6869-0 from training. Duration: 0.44 2023-10-04 05:24:56,736 WARNING [train.py:1204] (2/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_3-237434-0 from training. Duration: 0.76 2023-10-04 05:24:59,492 WARNING [train.py:1204] (2/4) Exclude cut with ID _8772_210_1850_1_1528635595972_3664011_247-336448-0_sp1.1 from training. Duration: 0.67275 2023-10-04 05:25:04,296 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_427-78097-0 from training. Duration: 0.8 2023-10-04 05:25:04,312 WARNING [train.py:1204] (2/4) Exclude cut with ID _69884_210_9771_1_1535846392818_4166169_466-252727-0_sp1.1 from training. Duration: 0.97275 2023-10-04 05:25:04,388 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_15972_210_6400_1_1530877934_3405692_316-35478-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 05:25:04,420 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_809-17156-0_sp0.9 from training. Duration: 0.811125 2023-10-04 05:25:04,450 WARNING [train.py:1204] (2/4) Exclude cut with ID _57572_210_5517_1_1534921219624_3809484_594-263474-0_sp1.1 from training. Duration: 0.936375 2023-10-04 05:25:05,222 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.1.encoder.layers.1.nonlin_attention.whiten2, num_groups=1, num_channels=256, metric=7.59 vs. limit=15.0 2023-10-04 05:25:05,706 WARNING [train.py:1204] (2/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_190-228204-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 05:25:06,007 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.feed_forward1.out_proj.dropout_p, batch_count=1540680.0, ans=0.1 2023-10-04 05:25:08,432 WARNING [train.py:1204] (2/4) Exclude cut with ID _19514_210_9259_1_1531530071330_5084230_117-21193-0_sp1.1 from training. Duration: 0.936375 2023-10-04 05:25:08,556 WARNING [train.py:1204] (2/4) Exclude cut with ID _69584_210_5219_1_1535785133775_4044450_190-21315-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 05:25:10,342 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_53071_210_16554_1_1534493434_1276796_184-302050-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 05:25:11,771 WARNING [train.py:1204] (2/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_120-76843-0_sp1.1 from training. Duration: 0.5 2023-10-04 05:25:12,458 WARNING [train.py:1204] (2/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_318-162017-0_sp1.1 from training. Duration: 0.82725 2023-10-04 05:25:12,549 WARNING [train.py:1204] (2/4) Exclude cut with ID _46067_210_4220_1_1534125571066_3307450_79-46864-0_sp1.1 from training. Duration: 0.92725 2023-10-04 05:25:12,707 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.conv_module1.balancer1.max_abs, batch_count=1540680.0, ans=10.0 2023-10-04 05:25:13,766 WARNING [train.py:1204] (2/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_358-215558-0_sp1.1 from training. Duration: 0.8454375 2023-10-04 05:25:15,120 WARNING [train.py:1204] (2/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_621-66764-0_sp1.1 from training. Duration: 0.92725 2023-10-04 05:25:16,465 WARNING [train.py:1204] (2/4) Exclude cut with ID _42863_210_9070_1_1533902398788_3696920_338-156851-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 05:25:17,785 WARNING [train.py:1204] (2/4) Exclude cut with ID _5289_210_2084_1_1527678202736_3779299_392-261988-0_sp0.9 from training. Duration: 0.72225 2023-10-04 05:25:20,416 WARNING [train.py:1204] (2/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_409-84682-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 05:25:23,034 WARNING [train.py:1204] (2/4) Exclude cut with ID _9235_210_5219_1_1528891154253_8046120_30-326681-0_sp0.9 from training. Duration: 0.92225 2023-10-04 05:25:23,039 WARNING [train.py:1204] (2/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_389-184695-0_sp1.1 from training. Duration: 0.92725 2023-10-04 05:25:23,084 WARNING [train.py:1204] (2/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_442-320317-0 from training. Duration: 0.82 2023-10-04 05:25:24,460 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.709e+02 2.007e+02 2.168e+02 2.484e+02 3.520e+02, threshold=4.337e+02, percent-clipped=0.0 2023-10-04 05:25:25,951 WARNING [train.py:1204] (2/4) Exclude cut with ID _44778_210_2054_1_1533798127203_7443090_756-29911-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 05:25:27,433 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_401-112036-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 05:25:30,119 WARNING [train.py:1204] (2/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_181-308827-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 05:25:30,188 WARNING [train.py:1204] (2/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_332-258725-0_sp1.1 from training. Duration: 0.963625 2023-10-04 05:25:31,526 INFO [train.py:1046] (2/4) Epoch 44, batch 2700, loss[loss=0.1663, simple_loss=0.2566, pruned_loss=0.03802, over 24379.00 frames. ], tot_loss[loss=0.1559, simple_loss=0.2368, pruned_loss=0.0375, over 4719228.56 frames. ], batch size: 77, lr: 2.31e-03, grad_scale: 8.0 2023-10-04 05:25:31,586 WARNING [train.py:1204] (2/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_188-260011-0_sp0.9 from training. Duration: 0.811125 2023-10-04 05:25:31,631 WARNING [train.py:1204] (2/4) Exclude cut with ID _49039_210_6315_1_1533967537541_3846118_99-302239-0_sp1.1 from training. Duration: 0.963625 2023-10-04 05:25:34,371 WARNING [train.py:1204] (2/4) Exclude cut with ID _4122_210_2084_1_1526713164417_4353280_312-294828-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 05:25:34,383 WARNING [train.py:1204] (2/4) Exclude cut with ID _14922_210_6228_1_1530707435273_3693310_303-228738-0 from training. Duration: 0.84 2023-10-04 05:25:37,578 WARNING [train.py:1204] (2/4) Exclude cut with ID _14349_210_8341_1_1530615710818_3442470_115-285975-0_sp0.9 from training. Duration: 0.9555625 2023-10-04 05:25:39,038 WARNING [train.py:1204] (2/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_125-144174-0_sp1.1 from training. Duration: 0.3545625 2023-10-04 05:25:40,559 WARNING [train.py:1204] (2/4) Exclude cut with ID _53565_210_10635_1_1534337218335_3820259_351-54570-0_sp1.1 from training. Duration: 0.97275 2023-10-04 05:25:41,848 WARNING [train.py:1204] (2/4) Exclude cut with ID _28317_210_12742_1_1532480302381_8510325_410-40065-0_sp1.1 from training. Duration: 0.963625 2023-10-04 05:25:41,876 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_38965_210_4852_1_1533174906_3484735_167-339442-0_sp1.1 from training. Duration: 0.963625 2023-10-04 05:25:44,366 WARNING [train.py:1204] (2/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_265-164486-0_sp1.1 from training. Duration: 0.87275 2023-10-04 05:25:44,376 WARNING [train.py:1204] (2/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_725-182484-0_sp1.1 from training. Duration: 0.92725 2023-10-04 05:25:44,400 WARNING [train.py:1204] (2/4) Exclude cut with ID _28317_210_12742_1_1532480302381_8510325_607-213951-0_sp0.9 from training. Duration: 0.9333125 2023-10-04 05:25:45,587 WARNING [train.py:1204] (2/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_605-133395-0_sp1.1 from training. Duration: 0.6 2023-10-04 05:25:45,614 WARNING [train.py:1204] (2/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_351-294650-0 from training. Duration: 0.63 2023-10-04 05:25:46,879 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_14953_210_2151_1_1530701710_5616165_710-165400-0_sp1.1 from training. Duration: 0.7909375 2023-10-04 05:25:48,292 WARNING [train.py:1204] (2/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_237-58433-0_sp1.1 from training. Duration: 0.67275 2023-10-04 05:25:49,701 WARNING [train.py:1204] (2/4) Exclude cut with ID _5190_210_4557_1_1527244368799_373439_28-197030-0_sp1.1 from training. Duration: 0.6545625 2023-10-04 05:25:49,754 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_743-327651-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 05:25:52,497 WARNING [train.py:1204] (2/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_76-203867-0_sp0.9 from training. Duration: 0.8 2023-10-04 05:25:53,916 WARNING [train.py:1204] (2/4) Exclude cut with ID _14603_210_4220_1_1530615688441_3097319_274-274798-0 from training. Duration: 0.98 2023-10-04 05:25:53,972 WARNING [train.py:1204] (2/4) Exclude cut with ID _14087_210_2253_1_1530529224512_3014029_226-242140-0_sp1.1 from training. Duration: 0.736375 2023-10-04 05:25:56,857 WARNING [train.py:1204] (2/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_352-59958-0_sp1.1 from training. Duration: 0.7818125 2023-10-04 05:25:56,861 WARNING [train.py:1204] (2/4) Exclude cut with ID _39411_210_5986_1_1533538825030_3343649_191-251819-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 05:25:59,834 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.conv_module1.balancer2.prob, batch_count=1540946.6666666667, ans=0.125 2023-10-04 05:26:02,350 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7046_210_6753_1_1527895337_6920917_642-106799-0_sp1.1 from training. Duration: 0.836375 2023-10-04 05:26:03,528 WARNING [train.py:1204] (2/4) Exclude cut with ID _22826_210_9994_1_1531908201522_3279169_412-3442-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 05:26:03,554 WARNING [train.py:1204] (2/4) Exclude cut with ID _60057_210_10640_1_1534903065793_3333150_171-181133-0_sp0.9 from training. Duration: 0.97775 2023-10-04 05:26:03,563 WARNING [train.py:1204] (2/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_154-135696-0_sp1.1 from training. Duration: 0.72725 2023-10-04 05:26:06,946 WARNING [train.py:1204] (2/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_148-186805-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 05:26:09,715 WARNING [train.py:1204] (2/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_605-165641-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 05:26:09,721 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_174-277364-0_sp0.9 from training. Duration: 0.788875 2023-10-04 05:26:09,744 WARNING [train.py:1204] (2/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_89-321955-0_sp0.9 from training. Duration: 0.888875 2023-10-04 05:26:13,692 WARNING [train.py:1204] (2/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_438-105429-0_sp1.1 from training. Duration: 0.963625 2023-10-04 05:26:13,697 WARNING [train.py:1204] (2/4) Exclude cut with ID _14970_210_15709_1_1530773543493_1715730_187-20259-0_sp0.9 from training. Duration: 0.988875 2023-10-04 05:26:14,253 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.5.encoder.layers.0.self_attn_weights.whiten_keys, num_groups=4, num_channels=128, metric=1.61 vs. limit=6.0 2023-10-04 05:26:15,166 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.0.conv_module2.balancer2.prob, batch_count=1541013.3333333333, ans=0.125 2023-10-04 05:26:23,297 WARNING [train.py:1204] (2/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_385-193052-0_sp0.9 from training. Duration: 0.9555625 2023-10-04 05:26:23,347 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7113_210_1849_1_1527942685_3448768_194-311626-0_sp1.1 from training. Duration: 0.92725 2023-10-04 05:26:26,087 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_3780_210_2087_1_1526467389_2145572_176-107140-0_sp0.9 from training. Duration: 0.7666875 2023-10-04 05:26:26,089 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_36756_210_7234_1_1533026991_966276_123-213457-0_sp1.1 from training. Duration: 0.936375 2023-10-04 05:26:28,837 WARNING [train.py:1204] (2/4) Exclude cut with ID _23557_210_8750_1_1531890106370_6846255_456-244861-0_sp1.1 from training. Duration: 0.963625 2023-10-04 05:26:30,149 WARNING [train.py:1204] (2/4) Exclude cut with ID _37086_210_9038_1_1532999540891_7021250_242-349170-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 05:26:31,541 WARNING [train.py:1204] (2/4) Exclude cut with ID _41298_210_9019_1_1533430371450_4822143_471-281351-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 05:26:31,648 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_53378_210_17282_1_1534221071_5769509_572-126176-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 05:26:33,007 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_1507-263032-0_sp1.1 from training. Duration: 0.963625 2023-10-04 05:26:34,405 WARNING [train.py:1204] (2/4) Exclude cut with ID _13679_210_7497_1_1530534293662_3355098_411-186088-0_sp1.1 from training. Duration: 0.8545625 2023-10-04 05:26:35,939 WARNING [train.py:1204] (2/4) Exclude cut with ID _29345_210_4381_1_1532689168263_3860169_149-257268-0_sp1.1 from training. Duration: 0.736375 2023-10-04 05:26:36,524 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.4.encoder.layers.2.self_attn2.whiten, num_groups=1, num_channels=384, metric=11.48 vs. limit=22.5 2023-10-04 05:26:37,733 WARNING [train.py:1204] (2/4) Exclude cut with ID _47790_210_16187_1_1533862814066_4207519_381-252014-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 05:26:37,739 WARNING [train.py:1204] (2/4) Exclude cut with ID _35799_210_5854_1_1533263487887_3529071_512-2717-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 05:26:40,523 WARNING [train.py:1204] (2/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_505-205270-0_sp0.9 from training. Duration: 0.42225 2023-10-04 05:26:41,189 WARNING [train.py:1204] (2/4) Exclude cut with ID _14998_210_5219_1_1530705582080_7474970_731-20117-0_sp1.1 from training. Duration: 0.936375 2023-10-04 05:26:44,199 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_37351_210_7925_1_1533012931_3812113_265-85780-0_sp1.1 from training. Duration: 0.8 2023-10-04 05:26:44,205 WARNING [train.py:1204] (2/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_313-134930-0 from training. Duration: 0.97 2023-10-04 05:26:45,399 INFO [train.py:1046] (2/4) Epoch 44, batch 2750, loss[loss=0.1824, simple_loss=0.2567, pruned_loss=0.0541, over 23923.00 frames. ], tot_loss[loss=0.1557, simple_loss=0.2364, pruned_loss=0.03751, over 4717505.55 frames. ], batch size: 196, lr: 2.31e-03, grad_scale: 8.0 2023-10-04 05:26:46,921 WARNING [train.py:1204] (2/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_374-138971-0 from training. Duration: 0.94 2023-10-04 05:26:46,964 WARNING [train.py:1204] (2/4) Exclude cut with ID _30304_210_6319_1_1532476827261_7582060_1227-287886-0_sp1.1 from training. Duration: 0.936375 2023-10-04 05:26:49,732 WARNING [train.py:1204] (2/4) Exclude cut with ID _59493_210_17282_1_1535078419077_3225879_332-217544-0_sp1.1 from training. Duration: 0.97275 2023-10-04 05:26:49,778 WARNING [train.py:1204] (2/4) Exclude cut with ID _23514_210_1389_1_1531832139785_3031439_361-119640-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 05:26:50,060 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.nonlin_attention.balancer.prob, batch_count=1541146.6666666667, ans=0.125 2023-10-04 05:26:52,549 WARNING [train.py:1204] (2/4) Exclude cut with ID _69584_210_5219_1_1535785133775_4044450_142-118058-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 05:26:52,580 WARNING [train.py:1204] (2/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_178-177582-0_sp1.1 from training. Duration: 0.67275 2023-10-04 05:26:53,837 WARNING [train.py:1204] (2/4) Exclude cut with ID _25303_210_17519_1_1532050207576_3635970_41-195572-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 05:26:55,486 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.conv_module2.balancer2.min_abs, batch_count=1541146.6666666667, ans=0.5 2023-10-04 05:26:56,702 WARNING [train.py:1204] (2/4) Exclude cut with ID _55328_210_27767_1_1534580973260_4088170_329-341322-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 05:26:56,735 WARNING [train.py:1204] (2/4) Exclude cut with ID _5608_210_2106_1_1527415439089_6819768_909-82767-0_sp1.1 from training. Duration: 0.5545625 2023-10-04 05:26:56,776 WARNING [train.py:1204] (2/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_261-297503-0_sp1.1 from training. Duration: 0.9 2023-10-04 05:26:58,045 WARNING [train.py:1204] (2/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_459-291706-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 05:26:58,046 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7762_210_1794_1_1528199508_4057408_422-227034-0 from training. Duration: 0.73 2023-10-04 05:26:58,062 WARNING [train.py:1204] (2/4) Exclude cut with ID _3067_210_2667_1_1526212974640_4503168_250-285937-0_sp0.9 from training. Duration: 0.888875 2023-10-04 05:26:58,077 WARNING [train.py:1204] (2/4) Exclude cut with ID _66873_210_5517_1_1535454136506_4293370_332-253745-0_sp1.1 from training. Duration: 0.936375 2023-10-04 05:26:58,310 INFO [scaling.py:1118] (2/4) WithLoss: name=encoder.encoders.0.layers.0.self_attn_weights, loss-sum=0.000e+00 2023-10-04 05:27:02,422 WARNING [train.py:1204] (2/4) Exclude cut with ID _13938_210_9988_1_1530518718978_3693960_304-112784-0 from training. Duration: 0.46 2023-10-04 05:27:03,858 WARNING [train.py:1204] (2/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_226-267957-0_sp1.1 from training. Duration: 0.92725 2023-10-04 05:27:05,105 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_41822_210_13270_1_1533955976_4379284_608-254567-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 05:27:05,169 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_124-113562-0_sp1.1 from training. Duration: 0.8545625 2023-10-04 05:27:06,585 WARNING [train.py:1204] (2/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_117-290916-0_sp1.1 from training. Duration: 0.463625 2023-10-04 05:27:06,655 WARNING [train.py:1204] (2/4) Exclude cut with ID _28427_210_4220_1_1532581240967_3061069_102-66533-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 05:27:08,412 WARNING [train.py:1204] (2/4) Exclude cut with ID _55383_210_10635_1_1534468426172_2231669_134-128246-0_sp1.1 from training. Duration: 0.8090625 2023-10-04 05:27:08,468 WARNING [train.py:1204] (2/4) Exclude cut with ID _45871_210_5665_1_1533864614245_3296060_49-308316-0_sp1.1 from training. Duration: 0.97275 2023-10-04 05:27:08,512 WARNING [train.py:1204] (2/4) Exclude cut with ID _57572_210_5517_1_1534921219624_3809484_176-199205-0_sp1.1 from training. Duration: 0.97275 2023-10-04 05:27:14,856 WARNING [train.py:1204] (2/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_45-265684-0_sp1.1 from training. Duration: 0.6545625 2023-10-04 05:27:14,884 WARNING [train.py:1204] (2/4) Exclude cut with ID _15150_210_8341_1_1530752411803_7256384_469-239509-0_sp0.9 from training. Duration: 0.7555625 2023-10-04 05:27:16,127 WARNING [train.py:1204] (2/4) Exclude cut with ID _28461_210_2253_1_1532771775189_3041299_136-270644-0_sp1.1 from training. Duration: 0.8454375 2023-10-04 05:27:16,207 WARNING [train.py:1204] (2/4) Exclude cut with ID _69584_210_5219_1_1535785133775_4044450_302-47302-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 05:27:17,536 WARNING [train.py:1204] (2/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_166-128941-0_sp1.1 from training. Duration: 0.4909375 2023-10-04 05:27:24,387 WARNING [train.py:1204] (2/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_559-120504-0_sp1.1 from training. Duration: 0.97275 2023-10-04 05:27:24,538 WARNING [train.py:1204] (2/4) Exclude cut with ID _6456_210_1534_1_1527681634212_3181049_345-252877-0_sp1.1 from training. Duration: 0.5818125 2023-10-04 05:27:26,112 WARNING [train.py:1204] (2/4) Exclude cut with ID _28317_210_12742_1_1532480302381_8510325_902-233003-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 05:27:30,148 WARNING [train.py:1204] (2/4) Exclude cut with ID _36538_210_9994_1_1533204020363_3795620_204-261385-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 05:27:30,152 WARNING [train.py:1204] (2/4) Exclude cut with ID _14821_210_8750_1_1530680503991_6829359_189-297451-0_sp1.1 from training. Duration: 0.5 2023-10-04 05:27:31,522 WARNING [train.py:1204] (2/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_1211-226066-0_sp0.9 from training. Duration: 0.8444375 2023-10-04 05:27:31,793 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.nonlin_attention.balancer.prob, batch_count=1541346.6666666667, ans=0.125 2023-10-04 05:27:34,642 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=1541346.6666666667, ans=0.1 2023-10-04 05:27:35,829 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6786_210_1388_1_1527839699_4194495_273-149841-0_sp1.1 from training. Duration: 0.836375 2023-10-04 05:27:37,189 WARNING [train.py:1204] (2/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_380-162648-0_sp1.1 from training. Duration: 0.8545625 2023-10-04 05:27:37,190 WARNING [train.py:1204] (2/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_494-199538-0 from training. Duration: 0.75 2023-10-04 05:27:42,436 WARNING [train.py:1204] (2/4) Exclude cut with ID _49097_210_16049_1_1533952780207_4360979_276-87053-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 05:27:45,253 WARNING [train.py:1204] (2/4) Exclude cut with ID _5444_210_2151_1_1527418808464_5312210_726-27165-0 from training. Duration: 0.67 2023-10-04 05:27:45,478 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.conv_module2.balancer2.min_abs, batch_count=1541413.3333333333, ans=0.5 2023-10-04 05:27:48,742 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_128-202104-0_sp1.1 from training. Duration: 0.57275 2023-10-04 05:27:50,129 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_11349_210_6753_1_1529800861_1101207_26-65602-0_sp1.1 from training. Duration: 0.87275 2023-10-04 05:27:50,161 WARNING [train.py:1204] (2/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_159-328157-0 from training. Duration: 0.52 2023-10-04 05:27:51,519 WARNING [train.py:1204] (2/4) Exclude cut with ID _14069_210_5632_1_1530512692033_4803390_98-21934-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 05:27:52,770 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.648e+02 2.112e+02 2.441e+02 2.787e+02 5.885e+02, threshold=4.882e+02, percent-clipped=3.0 2023-10-04 05:27:54,211 WARNING [train.py:1204] (2/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_352-59958-0_sp0.9 from training. Duration: 0.9555625 2023-10-04 05:27:54,253 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6163_210_6400_1_1527506746_4263068_283-137811-0 from training. Duration: 0.77 2023-10-04 05:27:54,286 WARNING [train.py:1204] (2/4) Exclude cut with ID _21989_210_5854_1_1531899214931_3271360_133-44759-0_sp1.1 from training. Duration: 0.7 2023-10-04 05:27:58,455 INFO [train.py:1046] (2/4) Epoch 44, batch 2800, loss[loss=0.1479, simple_loss=0.2132, pruned_loss=0.04132, over 23671.00 frames. ], tot_loss[loss=0.1552, simple_loss=0.2356, pruned_loss=0.03739, over 4716104.52 frames. ], batch size: 232, lr: 2.31e-03, grad_scale: 8.0 2023-10-04 05:27:58,536 WARNING [train.py:1204] (2/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_21-174514-0_sp0.9 from training. Duration: 0.47775 2023-10-04 05:27:58,570 WARNING [train.py:1204] (2/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_201-78811-0_sp1.1 from training. Duration: 0.963625 2023-10-04 05:27:58,601 WARNING [train.py:1204] (2/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_537-164630-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 05:27:59,936 WARNING [train.py:1204] (2/4) Exclude cut with ID _14087_210_2253_1_1530529224512_3014029_156-257607-0 from training. Duration: 0.83 2023-10-04 05:27:59,946 WARNING [train.py:1204] (2/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_308-154450-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 05:28:00,001 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_45399_210_4852_1_1533624725_7579737_214-240574-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 05:28:00,245 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=1541480.0, ans=0.1 2023-10-04 05:28:01,376 WARNING [train.py:1204] (2/4) Exclude cut with ID _29303_210_2575_1_1532392237303_7410649_615-305517-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 05:28:01,434 WARNING [train.py:1204] (2/4) Exclude cut with ID IC0125W0002-61808 from training. Duration: 0.708 2023-10-04 05:28:01,434 WARNING [train.py:1204] (2/4) Exclude cut with ID IC0125W0017-61823 from training. Duration: 0.759 2023-10-04 05:28:05,586 WARNING [train.py:1204] (2/4) Exclude cut with ID _53219_210_4525_1_1534834836008_4064825_4-145291-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 05:28:07,182 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.conv_module1.balancer1.prob, batch_count=1541480.0, ans=0.125 2023-10-04 05:28:08,825 WARNING [train.py:1204] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_185-174805-0_sp1.1 from training. Duration: 0.6181875 2023-10-04 05:28:08,840 WARNING [train.py:1204] (2/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_635-193046-0_sp1.1 from training. Duration: 0.92725 2023-10-04 05:28:10,858 WARNING [train.py:1204] (2/4) Exclude cut with ID _46067_210_4220_1_1534125571066_3307450_321-258866-0_sp1.1 from training. Duration: 0.97275 2023-10-04 05:28:13,433 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8558_210_1410_1_1528505225_7613406_131-329524-0 from training. Duration: 0.79 2023-10-04 05:28:16,804 WARNING [train.py:1204] (2/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_847-41334-0_sp0.9 from training. Duration: 0.488875 2023-10-04 05:28:16,894 WARNING [train.py:1204] (2/4) Exclude cut with ID _14227_210_4381_1_1530701804286_3940259_125-275642-0 from training. Duration: 0.99 2023-10-04 05:28:19,487 WARNING [train.py:1204] (2/4) Exclude cut with ID _25248_210_8750_1_1532062825941_5554180_248-177255-0_sp1.1 from training. Duration: 0.963625 2023-10-04 05:28:19,538 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_40047_210_2290_1_1533290519_2374311_195-165693-0_sp1.1 from training. Duration: 0.8090625 2023-10-04 05:28:19,542 WARNING [train.py:1204] (2/4) Exclude cut with ID _23109_210_4381_1_1532150828777_901949_29-258672-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 05:28:23,903 WARNING [train.py:1204] (2/4) Exclude cut with ID _14821_210_8750_1_1530680503991_6829359_2-227151-0_sp1.1 from training. Duration: 0.7545625 2023-10-04 05:28:23,939 WARNING [train.py:1204] (2/4) Exclude cut with ID _49643_210_7989_1_1534640244224_3939031_565-53982-0_sp1.1 from training. Duration: 0.963625 2023-10-04 05:28:23,942 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7544_210_6753_1_1528591327_1471426_33-346031-0_sp0.9 from training. Duration: 0.67775 2023-10-04 05:28:25,339 WARNING [train.py:1204] (2/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_95-61782-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 05:28:27,037 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.feed_forward1.hidden_balancer.prob, batch_count=1541613.3333333333, ans=0.125 2023-10-04 05:28:28,347 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.balancer1.prob, batch_count=1541613.3333333333, ans=0.125 2023-10-04 05:28:30,857 WARNING [train.py:1204] (2/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_686-54383-0_sp0.9 from training. Duration: 0.9666875 2023-10-04 05:28:31,510 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.4.encoder.layers.0.feed_forward1.out_whiten, num_groups=1, num_channels=384, metric=5.57 vs. limit=15.0 2023-10-04 05:28:32,359 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_14056_210_4163_1_1530604483_7572827_505-276823-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 05:28:32,573 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.conv_module2.balancer2.prob, batch_count=1541613.3333333333, ans=0.125 2023-10-04 05:28:34,802 WARNING [train.py:1204] (2/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_745-128443-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 05:28:36,222 WARNING [train.py:1204] (2/4) Exclude cut with ID _3844_210_1390_1_1526727803582_2871209_20-251664-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 05:28:36,278 WARNING [train.py:1204] (2/4) Exclude cut with ID _36538_210_9994_1_1533204020363_3795620_487-164628-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 05:28:41,640 WARNING [train.py:1204] (2/4) Exclude cut with ID _5166_210_2084_1_1527073197160_4087993_309-222545-0_sp1.1 from training. Duration: 0.763625 2023-10-04 05:28:41,660 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_3531_210_3528_1_1526294203_5216456_309-227302-0 from training. Duration: 0.69 2023-10-04 05:28:41,800 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.1.conv_module2.balancer1.prob, batch_count=1541680.0, ans=0.125 2023-10-04 05:28:43,053 WARNING [train.py:1204] (2/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_42-192460-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 05:28:43,136 WARNING [train.py:1204] (2/4) Exclude cut with ID _22083_210_8254_1_1531702862439_3678970_49-234546-0_sp1.1 from training. Duration: 0.7545625 2023-10-04 05:28:43,140 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_427-78097-0_sp0.9 from training. Duration: 0.888875 2023-10-04 05:28:48,934 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_43802_210_2290_1_1533516203_8829504_378-205254-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 05:28:48,975 WARNING [train.py:1204] (2/4) Exclude cut with ID _45329_210_7005_1_1534157951211_6774339_672-314830-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 05:28:51,866 WARNING [train.py:1204] (2/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_90-30673-0_sp1.1 from training. Duration: 0.763625 2023-10-04 05:28:52,016 WARNING [train.py:1204] (2/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_563-329943-0_sp1.1 from training. Duration: 0.9 2023-10-04 05:28:53,327 WARNING [train.py:1204] (2/4) Exclude cut with ID _21632_210_2253_1_1531994001765_3744140_154-84090-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 05:28:53,328 WARNING [train.py:1204] (2/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_103-336064-0_sp0.9 from training. Duration: 0.5666875 2023-10-04 05:28:54,667 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_43-71989-0_sp0.9 from training. Duration: 0.6333125 2023-10-04 05:28:54,730 WARNING [train.py:1204] (2/4) Exclude cut with ID _3867_210_1883_1_1526641200575_4475830_319-349850-0_sp0.9 from training. Duration: 0.7444375 2023-10-04 05:28:56,072 WARNING [train.py:1204] (2/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_629-179454-0_sp1.1 from training. Duration: 0.963625 2023-10-04 05:28:56,088 WARNING [train.py:1204] (2/4) Exclude cut with ID _65263_210_22356_1_1535763598691_3921037_49-222917-0 from training. Duration: 0.92 2023-10-04 05:28:56,124 WARNING [train.py:1204] (2/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_602-331772-0_sp1.1 from training. Duration: 0.936375 2023-10-04 05:28:58,780 WARNING [train.py:1204] (2/4) Exclude cut with ID _58414_210_8254_1_1535421609162_7151182_292-134978-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 05:28:58,826 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_38793_210_2544_1_1533176114_3849320_200-299292-0_sp1.1 from training. Duration: 0.936375 2023-10-04 05:29:00,254 WARNING [train.py:1204] (2/4) Exclude cut with ID _14970_210_15709_1_1530775547361_1966965_6-23328-0 from training. Duration: 0.86 2023-10-04 05:29:01,559 WARNING [train.py:1204] (2/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_386-225511-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 05:29:01,577 WARNING [train.py:1204] (2/4) Exclude cut with ID _38323_210_12108_1_1533121233369_3303049_416-11919-0_sp1.1 from training. Duration: 0.87275 2023-10-04 05:29:02,890 WARNING [train.py:1204] (2/4) Exclude cut with ID _4866_210_2577_1_1527404412284_4364659_256-306830-0_sp1.1 from training. Duration: 0.7909375 2023-10-04 05:29:04,333 WARNING [train.py:1204] (2/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_53-300544-0 from training. Duration: 0.67 2023-10-04 05:29:12,643 INFO [train.py:1046] (2/4) Epoch 44, batch 2850, loss[loss=0.1538, simple_loss=0.2223, pruned_loss=0.04267, over 23765.00 frames. ], tot_loss[loss=0.1545, simple_loss=0.2345, pruned_loss=0.03732, over 4720870.76 frames. ], batch size: 164, lr: 2.31e-03, grad_scale: 8.0 2023-10-04 05:29:12,686 WARNING [train.py:1204] (2/4) Exclude cut with ID _41314_210_12651_1_1533459476543_3766460_80-74184-0_sp1.1 from training. Duration: 0.7545625 2023-10-04 05:29:12,710 WARNING [train.py:1204] (2/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_332-256168-0_sp1.1 from training. Duration: 0.6818125 2023-10-04 05:29:12,757 WARNING [train.py:1204] (2/4) Exclude cut with ID _48836_210_12210_1_1534075848293_3904150_429-35475-0_sp1.1 from training. Duration: 0.8090625 2023-10-04 05:29:14,206 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_144-135833-0_sp1.1 from training. Duration: 0.97275 2023-10-04 05:29:14,400 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.conv_module2.balancer1.prob, batch_count=1541813.3333333333, ans=0.125 2023-10-04 05:29:17,049 WARNING [train.py:1204] (2/4) Exclude cut with ID _14224_210_4381_1_1530529062037_4264010_380-343670-0_sp0.9 from training. Duration: 0.97775 2023-10-04 05:29:17,077 WARNING [train.py:1204] (2/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_213-265174-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 05:29:18,323 WARNING [train.py:1204] (2/4) Exclude cut with ID _20455_210_5219_1_1531565996775_8098100_86-314283-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 05:29:20,187 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_41750_210_1960_1_1533520678_3948042_68-223813-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 05:29:20,252 WARNING [train.py:1204] (2/4) Exclude cut with ID _55153_210_2253_1_1534384786677_3162096_24-198429-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 05:29:22,856 WARNING [train.py:1204] (2/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_119-207162-0_sp0.9 from training. Duration: 0.788875 2023-10-04 05:29:22,916 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_12868_210_6753_1_1530702479_3610564_148-172861-0 from training. Duration: 0.5 2023-10-04 05:29:27,237 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_5522_210_3273_1_1527249606_3909096_318-203153-0 from training. Duration: 0.43 2023-10-04 05:29:27,243 WARNING [train.py:1204] (2/4) Exclude cut with ID _14884_210_2252_1_1530707459689_5561250_288-189949-0_sp1.1 from training. Duration: 0.97275 2023-10-04 05:29:30,084 WARNING [train.py:1204] (2/4) Exclude cut with ID _5294_210_1747_1_1527249643928_3696179_529-242298-0 from training. Duration: 0.95 2023-10-04 05:29:31,379 WARNING [train.py:1204] (2/4) Exclude cut with ID _36538_210_9994_1_1533204020363_3795620_503-110289-0_sp1.1 from training. Duration: 0.936375 2023-10-04 05:29:34,034 WARNING [train.py:1204] (2/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_385-193052-0 from training. Duration: 0.86 2023-10-04 05:29:34,102 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_456-22141-0 from training. Duration: 0.88 2023-10-04 05:29:36,857 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_38965_210_4852_1_1533174906_3484735_284-117840-0_sp1.1 from training. Duration: 0.936375 2023-10-04 05:29:48,197 WARNING [train.py:1204] (2/4) Exclude cut with ID _32704_210_8954_1_1533017488840_6747370_75-201625-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 05:29:50,156 WARNING [train.py:1204] (2/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_255-312654-0_sp1.1 from training. Duration: 0.82725 2023-10-04 05:29:50,174 WARNING [train.py:1204] (2/4) Exclude cut with ID _47251_210_18628_1_1533814063128_4137250_214-88076-0_sp0.9 from training. Duration: 0.97775 2023-10-04 05:29:52,842 WARNING [train.py:1204] (2/4) Exclude cut with ID _4092_210_2151_1_1526643044449_3135759_227-213972-0_sp1.1 from training. Duration: 0.4909375 2023-10-04 05:29:52,867 WARNING [train.py:1204] (2/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_912-47473-0_sp1.1 from training. Duration: 0.6181875 2023-10-04 05:29:52,888 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6767_210_3273_1_1527855783_1718932_173-256601-0_sp1.1 from training. Duration: 0.72725 2023-10-04 05:29:54,326 WARNING [train.py:1204] (2/4) Exclude cut with ID _6274_210_2084_1_1527937583837_7494020_32-205288-0_sp1.1 from training. Duration: 0.7090625 2023-10-04 05:29:54,361 WARNING [train.py:1204] (2/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_39-248870-0 from training. Duration: 0.69 2023-10-04 05:29:55,961 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.ff2_skip_rate, batch_count=1542013.3333333333, ans=0.0 2023-10-04 05:29:57,338 WARNING [train.py:1204] (2/4) Exclude cut with ID _3541_210_2477_1_1526475408709_3263973_377-296839-0_sp1.1 from training. Duration: 0.5 2023-10-04 05:29:57,353 WARNING [train.py:1204] (2/4) Exclude cut with ID _13679_210_7497_1_1530534293662_3355098_413-56154-0_sp1.1 from training. Duration: 0.963625 2023-10-04 05:29:57,405 WARNING [train.py:1204] (2/4) Exclude cut with ID _14970_210_15709_1_1530775547361_1966965_201-84544-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 05:29:58,723 WARNING [train.py:1204] (2/4) Exclude cut with ID _21783_210_11357_1_1531742458730_3762679_105-95775-0_sp1.1 from training. Duration: 0.936375 2023-10-04 05:30:00,025 WARNING [train.py:1204] (2/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_131-191669-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 05:30:00,657 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.2.encoder.layers.2.conv_module2.whiten, num_groups=1, num_channels=384, metric=6.57 vs. limit=15.0 2023-10-04 05:30:01,322 WARNING [train.py:1204] (2/4) Exclude cut with ID _14860_210_4915_1_1530702020340_7343240_695-4323-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 05:30:02,679 WARNING [train.py:1204] (2/4) Exclude cut with ID _39867_210_9826_1_1533553222030_9344259_991-130565-0_sp1.1 from training. Duration: 0.97275 2023-10-04 05:30:04,131 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_50-3289-0_sp1.1 from training. Duration: 0.82725 2023-10-04 05:30:05,591 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_54-347670-0_sp1.1 from training. Duration: 0.92725 2023-10-04 05:30:05,655 WARNING [train.py:1204] (2/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_200-153445-0_sp1.1 from training. Duration: 0.936375 2023-10-04 05:30:07,039 WARNING [train.py:1204] (2/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_279-302911-0_sp1.1 from training. Duration: 0.97275 2023-10-04 05:30:10,287 WARNING [train.py:1204] (2/4) Exclude cut with ID _14886_210_4220_1_1530702020355_3516190_159-73748-0_sp0.9 from training. Duration: 0.92225 2023-10-04 05:30:12,297 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.conv_module2.balancer2.min_abs, batch_count=1542080.0, ans=0.5 2023-10-04 05:30:16,643 WARNING [train.py:1204] (2/4) Exclude cut with ID _53379_210_20034_1_1534222027363_3854654_23-222715-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 05:30:18,032 WARNING [train.py:1204] (2/4) Exclude cut with ID _12555_210_8750_1_1530075594402_3780239_449-267986-0 from training. Duration: 0.83 2023-10-04 05:30:18,043 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7544_210_6753_1_1528587722_3576686_295-258479-0 from training. Duration: 0.84 2023-10-04 05:30:19,476 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7002_210_1794_1_1527935634_4034444_487-236501-0_sp0.9 from training. Duration: 0.7333125 2023-10-04 05:30:20,626 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.593e+02 1.995e+02 2.371e+02 2.715e+02 4.777e+02, threshold=4.741e+02, percent-clipped=0.0 2023-10-04 05:30:20,714 WARNING [train.py:1204] (2/4) Exclude cut with ID _21783_210_11357_1_1531742458730_3762679_244-19107-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 05:30:20,735 WARNING [train.py:1204] (2/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_390-339137-0 from training. Duration: 0.56 2023-10-04 05:30:20,793 WARNING [train.py:1204] (2/4) Exclude cut with ID _7562_210_2084_1_1528887767916_3946229_186-326422-0_sp1.1 from training. Duration: 0.763625 2023-10-04 05:30:22,702 WARNING [train.py:1204] (2/4) Exclude cut with ID _33640_210_2655_1_1533117605479_4855469_645-145235-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 05:30:22,721 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_25767_210_2161_1_1532307483_3867748_506-171777-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 05:30:22,741 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_5438_210_1794_1_1527336687_2600009_233-58072-0_sp1.1 from training. Duration: 0.7 2023-10-04 05:30:22,742 WARNING [train.py:1204] (2/4) Exclude cut with ID IC0669W0063-330908 from training. Duration: 0.896 2023-10-04 05:30:22,809 WARNING [train.py:1204] (2/4) Exclude cut with ID IC0671W0070-331913 from training. Duration: 0.896 2023-10-04 05:30:22,813 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_258-310570-0_sp1.1 from training. Duration: 0.7454375 2023-10-04 05:30:24,114 WARNING [train.py:1204] (2/4) Exclude cut with ID _9461_210_4557_1_1529060514726_3677451_162-177507-0_sp1.1 from training. Duration: 0.97275 2023-10-04 05:30:26,680 INFO [train.py:1046] (2/4) Epoch 44, batch 2900, loss[loss=0.176, simple_loss=0.2634, pruned_loss=0.04427, over 23755.00 frames. ], tot_loss[loss=0.1548, simple_loss=0.235, pruned_loss=0.03733, over 4732195.12 frames. ], batch size: 85, lr: 2.31e-03, grad_scale: 8.0 2023-10-04 05:30:27,591 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.1.encoder.layers.0.self_attn_weights.whiten_keys, num_groups=4, num_channels=128, metric=3.75 vs. limit=6.0 2023-10-04 05:30:29,506 WARNING [train.py:1204] (2/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_26-175553-0_sp0.9 from training. Duration: 0.67775 2023-10-04 05:30:29,528 WARNING [train.py:1204] (2/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_7-150481-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 05:30:30,817 WARNING [train.py:1204] (2/4) Exclude cut with ID _23557_210_8750_1_1531890106370_6846255_72-273262-0_sp1.1 from training. Duration: 0.963625 2023-10-04 05:30:30,915 WARNING [train.py:1204] (2/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_367-61330-0 from training. Duration: 0.75 2023-10-04 05:30:32,843 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.4.encoder.layers.0.whiten, num_groups=1, num_channels=384, metric=5.25 vs. limit=12.0 2023-10-04 05:30:35,020 WARNING [train.py:1204] (2/4) Exclude cut with ID _39867_210_9826_1_1533553222030_9344259_1337-11141-0_sp1.1 from training. Duration: 0.936375 2023-10-04 05:30:35,048 WARNING [train.py:1204] (2/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_101-293367-0 from training. Duration: 0.63 2023-10-04 05:30:36,420 WARNING [train.py:1204] (2/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_10-100367-0 from training. Duration: 0.98 2023-10-04 05:30:37,732 WARNING [train.py:1204] (2/4) Exclude cut with ID _60057_210_10640_1_1534903065793_3333150_171-181133-0_sp1.1 from training. Duration: 0.8 2023-10-04 05:30:37,735 WARNING [train.py:1204] (2/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_844-96989-0_sp0.9 from training. Duration: 0.788875 2023-10-04 05:30:40,518 WARNING [train.py:1204] (2/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_301-144595-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 05:30:40,608 WARNING [train.py:1204] (2/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_115-147343-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 05:30:45,216 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_3171_210_2087_1_1526109084_2330007_37-64334-0_sp1.1 from training. Duration: 0.7454375 2023-10-04 05:30:46,529 WARNING [train.py:1204] (2/4) Exclude cut with ID _19514_210_9259_1_1531530071330_5084230_769-272914-0_sp1.1 from training. Duration: 0.936375 2023-10-04 05:30:48,003 WARNING [train.py:1204] (2/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_76-311963-0_sp0.9 from training. Duration: 0.77775 2023-10-04 05:30:48,032 WARNING [train.py:1204] (2/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_526-62079-0 from training. Duration: 0.67 2023-10-04 05:30:49,415 WARNING [train.py:1204] (2/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_7-111954-0_sp0.9 from training. Duration: 0.62225 2023-10-04 05:30:49,543 WARNING [train.py:1204] (2/4) Exclude cut with ID _57997_210_4163_1_1534766425889_3961049_360-279140-0_sp1.1 from training. Duration: 0.97275 2023-10-04 05:30:52,837 WARNING [train.py:1204] (2/4) Exclude cut with ID _4866_210_2577_1_1527404412284_4364659_133-1696-0 from training. Duration: 0.99 2023-10-04 05:30:54,086 WARNING [train.py:1204] (2/4) Exclude cut with ID _5190_210_4557_1_1527244368799_373439_28-197030-0 from training. Duration: 0.72 2023-10-04 05:30:56,878 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_53-39003-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 05:30:56,881 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6673_210_3273_1_1527767790_3173240_351-167954-0 from training. Duration: 0.48 2023-10-04 05:30:56,905 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_2822_210_2715_1_1526112586_1999587_165-36656-0_sp1.1 from training. Duration: 0.8090625 2023-10-04 05:30:57,131 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.bypass.skip_rate, batch_count=1542280.0, ans=0.04949747468305833 2023-10-04 05:30:58,438 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_328-264892-0_sp1.1 from training. Duration: 0.92725 2023-10-04 05:30:58,442 WARNING [train.py:1204] (2/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_29-293896-0_sp0.9 from training. Duration: 0.67775 2023-10-04 05:30:58,563 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.ff2_skip_rate, batch_count=1542280.0, ans=0.0 2023-10-04 05:30:59,951 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.conv_skip_rate, batch_count=1542280.0, ans=0.0 2023-10-04 05:31:02,325 WARNING [train.py:1204] (2/4) Exclude cut with ID _65263_210_22356_1_1535763598691_3921037_465-55448-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 05:31:03,633 WARNING [train.py:1204] (2/4) Exclude cut with ID _21989_210_5854_1_1531899214931_3271360_311-53106-0_sp1.1 from training. Duration: 0.97275 2023-10-04 05:31:06,524 WARNING [train.py:1204] (2/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_389-277915-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 05:31:09,269 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_69630_210_7234_1_1535791094_677837_63-277139-0_sp1.1 from training. Duration: 0.963625 2023-10-04 05:31:10,858 WARNING [train.py:1204] (2/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_85-138257-0 from training. Duration: 0.71 2023-10-04 05:31:12,109 WARNING [train.py:1204] (2/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_686-247600-0 from training. Duration: 0.92 2023-10-04 05:31:12,114 WARNING [train.py:1204] (2/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_108-255430-0_sp1.1 from training. Duration: 0.8818125 2023-10-04 05:31:16,131 WARNING [train.py:1204] (2/4) Exclude cut with ID _14884_210_2252_1_1530707459689_5561250_114-201399-0_sp1.1 from training. Duration: 0.6909375 2023-10-04 05:31:17,553 WARNING [train.py:1204] (2/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_506-42614-0 from training. Duration: 0.79 2023-10-04 05:31:20,235 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_43802_210_2290_1_1533516203_8829504_85-195240-0_sp1.1 from training. Duration: 0.7909375 2023-10-04 05:31:21,872 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.conv_module2.balancer1.prob, batch_count=1542346.6666666667, ans=0.125 2023-10-04 05:31:26,066 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_141-36243-0_sp1.1 from training. Duration: 0.97275 2023-10-04 05:31:33,157 WARNING [train.py:1204] (2/4) Exclude cut with ID _7852_210_5219_1_1528286471316_8139400_358-287492-0_sp1.1 from training. Duration: 0.87275 2023-10-04 05:31:33,174 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_805-71497-0_sp0.9 from training. Duration: 0.788875 2023-10-04 05:31:34,522 WARNING [train.py:1204] (2/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_178-39722-0 from training. Duration: 0.49 2023-10-04 05:31:37,349 WARNING [train.py:1204] (2/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_428-181925-0_sp1.1 from training. Duration: 0.963625 2023-10-04 05:31:37,353 WARNING [train.py:1204] (2/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_770-200430-0 from training. Duration: 0.89 2023-10-04 05:31:37,397 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_1104-271009-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 05:31:37,445 WARNING [train.py:1204] (2/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_128-296581-0_sp0.9 from training. Duration: 0.82225 2023-10-04 05:31:38,679 INFO [train.py:1046] (2/4) Epoch 44, batch 2950, loss[loss=0.1613, simple_loss=0.237, pruned_loss=0.04283, over 23938.00 frames. ], tot_loss[loss=0.1551, simple_loss=0.2356, pruned_loss=0.0373, over 4723020.00 frames. ], batch size: 196, lr: 2.31e-03, grad_scale: 8.0 2023-10-04 05:31:42,951 WARNING [train.py:1204] (2/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_19-62511-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 05:31:44,734 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_805-71497-0 from training. Duration: 0.71 2023-10-04 05:31:46,628 WARNING [train.py:1204] (2/4) Exclude cut with ID _44778_210_2054_1_1533798127203_7443090_573-152077-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 05:31:46,633 WARNING [train.py:1204] (2/4) Exclude cut with ID _42875_210_14856_1_1533704397487_4012433_393-177816-0_sp1.1 from training. Duration: 0.963625 2023-10-04 05:31:48,626 WARNING [train.py:1204] (2/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_278-35159-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 05:31:50,057 WARNING [train.py:1204] (2/4) Exclude cut with ID _15037_210_2253_1_1530752294104_3267650_13-130361-0_sp1.1 from training. Duration: 0.9 2023-10-04 05:31:51,448 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_3019_210_2028_1_1526044245_3264865_127-135615-0 from training. Duration: 0.94 2023-10-04 05:31:51,587 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=1542480.0, ans=0.1 2023-10-04 05:31:52,813 WARNING [train.py:1204] (2/4) Exclude cut with ID _28317_210_12742_1_1532480302381_8510325_607-213951-0 from training. Duration: 0.84 2023-10-04 05:31:52,881 WARNING [train.py:1204] (2/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_436-318113-0_sp0.9 from training. Duration: 0.8333125 2023-10-04 05:31:52,882 WARNING [train.py:1204] (2/4) Exclude cut with ID _13938_210_9988_1_1530518718978_3693960_148-152225-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 05:32:00,298 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_2541_210_1794_1_1525590269_3084765_156-173713-0_sp1.1 from training. Duration: 0.736375 2023-10-04 05:32:01,752 WARNING [train.py:1204] (2/4) Exclude cut with ID _28323_210_16187_1_1532480395667_4100300_531-24157-0_sp1.1 from training. Duration: 0.8909375 2023-10-04 05:32:02,026 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.conv_module1.balancer1.prob, batch_count=1542546.6666666667, ans=0.125 2023-10-04 05:32:03,123 WARNING [train.py:1204] (2/4) Exclude cut with ID _29303_210_2575_1_1532392237303_7410649_382-217492-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 05:32:04,453 WARNING [train.py:1204] (2/4) Exclude cut with ID _58895_210_8750_1_1535184081320_3649089_270-158013-0_sp1.1 from training. Duration: 0.92725 2023-10-04 05:32:05,955 WARNING [train.py:1204] (2/4) Exclude cut with ID _29277_210_3279_1_1532581109293_3173069_311-206227-0_sp1.1 from training. Duration: 0.936375 2023-10-04 05:32:05,966 WARNING [train.py:1204] (2/4) Exclude cut with ID _7160_210_1876_1_1527940923231_3703799_282-322611-0_sp0.9 from training. Duration: 0.9666875 2023-10-04 05:32:07,377 WARNING [train.py:1204] (2/4) Exclude cut with ID _53219_210_4525_1_1534834836008_4064825_562-32295-0_sp1.1 from training. Duration: 0.963625 2023-10-04 05:32:08,684 WARNING [train.py:1204] (2/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_408-87330-0_sp1.1 from training. Duration: 0.963625 2023-10-04 05:32:08,693 WARNING [train.py:1204] (2/4) Exclude cut with ID _59202_210_26984_1_1534816810612_3549174_137-318710-0_sp1.1 from training. Duration: 0.8090625 2023-10-04 05:32:10,250 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.feed_forward3.hidden_balancer.prob, batch_count=1542613.3333333333, ans=0.125 2023-10-04 05:32:11,190 WARNING [train.py:1204] (2/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_372-30302-0 from training. Duration: 0.86 2023-10-04 05:32:15,738 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7543_210_6753_1_1528541907_1399322_178-56269-0_sp1.1 from training. Duration: 0.95275 2023-10-04 05:32:15,759 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9109W0012-549758 from training. Duration: 0.512 2023-10-04 05:32:17,692 WARNING [train.py:1204] (2/4) Exclude cut with ID _5410_210_1388_1_1527397055795_1719020_39-112221-0_sp0.9 from training. Duration: 0.7666875 2023-10-04 05:32:19,586 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9121W0012-554933 from training. Duration: 0.896 2023-10-04 05:32:22,613 WARNING [train.py:1204] (2/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_476-272757-0 from training. Duration: 0.93 2023-10-04 05:32:22,646 WARNING [train.py:1204] (2/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_1085-171718-0_sp1.1 from training. Duration: 0.92725 2023-10-04 05:32:23,952 WARNING [train.py:1204] (2/4) Exclude cut with ID _8480_210_1874_1_1528589391205_6728330_309-139896-0_sp1.1 from training. Duration: 0.736375 2023-10-04 05:32:23,952 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9130W0113-559703 from training. Duration: 0.896 2023-10-04 05:32:23,956 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_292-265928-0_sp1.1 from training. Duration: 0.463625 2023-10-04 05:32:26,694 WARNING [train.py:1204] (2/4) Exclude cut with ID _8772_210_1850_1_1528635595972_3664011_96-12756-0 from training. Duration: 0.98 2023-10-04 05:32:26,761 WARNING [train.py:1204] (2/4) Exclude cut with ID _12556_210_8341_1_1530081024100_7327690_725-193477-0_sp1.1 from training. Duration: 0.8909375 2023-10-04 05:32:26,800 WARNING [train.py:1204] (2/4) Exclude cut with ID _5289_210_2084_1_1527678202736_3779299_142-103317-0_sp1.1 from training. Duration: 0.763625 2023-10-04 05:32:31,411 WARNING [train.py:1204] (2/4) Exclude cut with ID _45012_210_12651_1_1533608435136_5371380_206-51075-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 05:32:31,519 WARNING [train.py:1204] (2/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_176-56120-0_sp1.1 from training. Duration: 0.7545625 2023-10-04 05:32:31,543 WARNING [train.py:1204] (2/4) Exclude cut with ID _40284_210_2084_1_1533513679814_3704279_220-201864-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 05:32:32,888 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9165W0004-576638 from training. Duration: 0.896 2023-10-04 05:32:34,218 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_5438_210_1794_1_1527336687_2600009_218-343336-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 05:32:34,240 WARNING [train.py:1204] (2/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_318-162017-0 from training. Duration: 0.91 2023-10-04 05:32:38,568 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_49544_210_1794_1_1534379770_5262083_483-349298-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 05:32:39,895 WARNING [train.py:1204] (2/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_197-28164-0_sp0.9 from training. Duration: 0.888875 2023-10-04 05:32:39,965 WARNING [train.py:1204] (2/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_643-52007-0 from training. Duration: 0.57 2023-10-04 05:32:41,283 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_38965_210_4852_1_1533174906_3484735_403-332963-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 05:32:42,639 WARNING [train.py:1204] (2/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_340-174176-0 from training. Duration: 0.49 2023-10-04 05:32:42,896 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=1542746.6666666667, ans=0.1 2023-10-04 05:32:45,974 WARNING [train.py:1204] (2/4) Exclude cut with ID _56708_210_13177_1_1535252351282_3422899_363-917-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 05:32:47,166 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.583e+02 1.991e+02 2.260e+02 2.568e+02 3.516e+02, threshold=4.519e+02, percent-clipped=0.0 2023-10-04 05:32:47,289 WARNING [train.py:1204] (2/4) Exclude cut with ID _19896_210_1883_1_1531709022662_6649569_525-159063-0_sp1.1 from training. Duration: 0.936375 2023-10-04 05:32:48,619 WARNING [train.py:1204] (2/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_430-116139-0_sp0.9 from training. Duration: 0.9444375 2023-10-04 05:32:48,741 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_53753_210_7234_1_1534320003_3594148_86-329250-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 05:32:48,758 WARNING [train.py:1204] (2/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_406-275649-0_sp1.1 from training. Duration: 0.3818125 2023-10-04 05:32:51,994 WARNING [train.py:1204] (2/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_1083-147536-0_sp1.1 from training. Duration: 0.8818125 2023-10-04 05:32:52,052 WARNING [train.py:1204] (2/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_54-311741-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 05:32:52,059 WARNING [train.py:1204] (2/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_237-58433-0_sp0.9 from training. Duration: 0.82225 2023-10-04 05:32:52,103 WARNING [train.py:1204] (2/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_98-51676-0_sp1.1 from training. Duration: 0.663625 2023-10-04 05:32:53,908 INFO [train.py:1046] (2/4) Epoch 44, batch 3000, loss[loss=0.1632, simple_loss=0.232, pruned_loss=0.04716, over 22674.00 frames. ], tot_loss[loss=0.157, simple_loss=0.2374, pruned_loss=0.03834, over 4707492.29 frames. ], batch size: 322, lr: 2.31e-03, grad_scale: 8.0 2023-10-04 05:32:53,909 INFO [train.py:1069] (2/4) Computing validation loss 2023-10-04 05:33:05,942 INFO [train.py:1078] (2/4) Epoch 44, validation: loss=0.3969, simple_loss=0.2803, pruned_loss=0.2567, over 1125622.00 frames. 2023-10-04 05:33:05,943 INFO [train.py:1079] (2/4) Maximum memory allocated so far is 20967MB 2023-10-04 05:33:06,068 WARNING [train.py:1204] (2/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_386-270472-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 05:33:07,442 WARNING [train.py:1204] (2/4) Exclude cut with ID _19023_210_4353_1_1531378892552_5820780_83-254794-0_sp1.1 from training. Duration: 0.7818125 2023-10-04 05:33:08,949 WARNING [train.py:1204] (2/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_314-93656-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 05:33:10,308 WARNING [train.py:1204] (2/4) Exclude cut with ID _14170_210_5219_1_1530525949868_4469650_336-213530-0 from training. Duration: 0.9 2023-10-04 05:33:11,676 WARNING [train.py:1204] (2/4) Exclude cut with ID _36686_210_5665_1_1533778225913_3646410_65-221120-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 05:33:13,172 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_26433_210_12108_1_1532157158_4056480_271-68178-0_sp1.1 from training. Duration: 0.92725 2023-10-04 05:33:13,853 WARNING [train.py:1204] (2/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_46-207599-0_sp0.9 from training. Duration: 0.711125 2023-10-04 05:33:16,716 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9303W0114-637133 from training. Duration: 0.896 2023-10-04 05:33:18,139 WARNING [train.py:1204] (2/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_490-207861-0 from training. Duration: 0.98 2023-10-04 05:33:19,505 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6767_210_3273_1_1527855783_1718932_173-256601-0_sp0.9 from training. Duration: 0.888875 2023-10-04 05:33:19,533 WARNING [train.py:1204] (2/4) Exclude cut with ID _13921_210_9994_1_1530841782272_3503520_327-69677-0_sp1.1 from training. Duration: 0.8181875 2023-10-04 05:33:21,445 WARNING [train.py:1204] (2/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_508-335369-0 from training. Duration: 0.48 2023-10-04 05:33:21,659 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.feed_forward1.hidden_balancer.prob, batch_count=1542880.0, ans=0.125 2023-10-04 05:33:23,160 WARNING [train.py:1204] (2/4) Exclude cut with ID _64631_210_27767_1_1535349580068_4165160_24-46567-0_sp1.1 from training. Duration: 0.963625 2023-10-04 05:33:27,458 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_89-35748-0_sp1.1 from training. Duration: 0.7181875 2023-10-04 05:33:32,211 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.conv_skip_rate, batch_count=1542880.0, ans=0.0 2023-10-04 05:33:32,213 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.balancer1.prob, batch_count=1542880.0, ans=0.125 2023-10-04 05:33:36,209 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_26433_210_12108_1_1532157158_4056480_578-262107-0_sp1.1 from training. Duration: 0.9 2023-10-04 05:33:40,459 WARNING [train.py:1204] (2/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_129-337802-0 from training. Duration: 0.72 2023-10-04 05:33:42,289 WARNING [train.py:1204] (2/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_356-167816-0_sp0.9 from training. Duration: 0.8 2023-10-04 05:33:46,337 WARNING [train.py:1204] (2/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_137-116442-0_sp1.1 from training. Duration: 0.7090625 2023-10-04 05:33:46,375 WARNING [train.py:1204] (2/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_358-172940-0_sp1.1 from training. Duration: 0.963625 2023-10-04 05:33:46,408 WARNING [train.py:1204] (2/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_668-344147-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 05:33:49,107 WARNING [train.py:1204] (2/4) Exclude cut with ID _62337_210_12392_1_1535009273105_3412229_255-157667-0_sp1.1 from training. Duration: 0.936375 2023-10-04 05:33:49,111 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_486-151131-0 from training. Duration: 0.85 2023-10-04 05:33:52,446 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_53638_210_21581_1_1534297319_5808449_885-80172-0 from training. Duration: 0.96 2023-10-04 05:33:54,402 WARNING [train.py:1204] (2/4) Exclude cut with ID _50093_210_4220_1_1534417141207_3432299_105-183067-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 05:33:54,459 WARNING [train.py:1204] (2/4) Exclude cut with ID _7562_210_2084_1_1528887767916_3946229_345-169156-0_sp0.9 from training. Duration: 0.6444375 2023-10-04 05:33:55,926 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_4528_210_3273_1_1527076654_3276777_209-219804-0_sp0.9 from training. Duration: 0.7666875 2023-10-04 05:33:57,262 WARNING [train.py:1204] (2/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_35-153886-0_sp0.9 from training. Duration: 0.9333125 2023-10-04 05:33:57,329 WARNING [train.py:1204] (2/4) Exclude cut with ID _30800_210_22382_1_1532499347904_4515959_332-121925-0_sp1.1 from training. Duration: 0.92725 2023-10-04 05:33:57,331 WARNING [train.py:1204] (2/4) Exclude cut with ID _59202_210_26984_1_1534816810612_3549174_495-65515-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 05:34:01,454 WARNING [train.py:1204] (2/4) Exclude cut with ID _6274_210_2084_1_1527937583837_7494020_769-168302-0_sp1.1 from training. Duration: 0.6454375 2023-10-04 05:34:01,632 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.conv_module2.balancer1.prob, batch_count=1543013.3333333333, ans=0.125 2023-10-04 05:34:03,247 WARNING [train.py:1204] (2/4) Exclude cut with ID _24271_210_12658_1_1531987270022_7190519_708-71345-0_sp1.1 from training. Duration: 0.936375 2023-10-04 05:34:03,248 WARNING [train.py:1204] (2/4) Exclude cut with ID 3033-130750-0096-55598-0_sp0.9 from training. Duration: 0.92225 2023-10-04 05:34:04,694 WARNING [train.py:1204] (2/4) Exclude cut with ID _14087_210_2253_1_1530529224512_3014029_219-224445-0_sp0.9 from training. Duration: 0.9333125 2023-10-04 05:34:07,344 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_174-277364-0 from training. Duration: 0.71 2023-10-04 05:34:08,644 WARNING [train.py:1204] (2/4) Exclude cut with ID _45021_210_9994_1_1533619751094_3330288_96-239010-0_sp1.1 from training. Duration: 0.82725 2023-10-04 05:34:08,666 WARNING [train.py:1204] (2/4) Exclude cut with ID _31602_210_5854_1_1532677021456_4458250_489-305116-0_sp1.1 from training. Duration: 0.97275 2023-10-04 05:34:08,693 WARNING [train.py:1204] (2/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_269-154726-0_sp0.9 from training. Duration: 0.9555625 2023-10-04 05:34:11,690 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.conv_module1.balancer2.prob, batch_count=1543080.0, ans=0.125 2023-10-04 05:34:12,859 WARNING [train.py:1204] (2/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_126-54418-0_sp1.1 from training. Duration: 0.92725 2023-10-04 05:34:12,892 WARNING [train.py:1204] (2/4) Exclude cut with ID _42326_210_7770_1_1533814282531_3707269_182-224159-0_sp1.1 from training. Duration: 0.92725 2023-10-04 05:34:14,880 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7002_210_1794_1_1527939683_5157286_1-281264-0_sp0.9 from training. Duration: 0.488875 2023-10-04 05:34:16,007 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_86-16881-0 from training. Duration: 0.58 2023-10-04 05:34:16,038 WARNING [train.py:1204] (2/4) Exclude cut with ID _19514_210_9259_1_1531530071330_5084230_637-279094-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 05:34:16,069 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_26433_210_12108_1_1532157158_4056480_578-262107-0 from training. Duration: 0.99 2023-10-04 05:34:16,103 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6291_210_3273_1_1527683649_1078498_82-221196-0_sp1.1 from training. Duration: 0.6909375 2023-10-04 05:34:17,510 WARNING [train.py:1204] (2/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_402-102311-0 from training. Duration: 0.67 2023-10-04 05:34:20,211 INFO [train.py:1046] (2/4) Epoch 44, batch 3050, loss[loss=0.1516, simple_loss=0.2274, pruned_loss=0.03793, over 23394.00 frames. ], tot_loss[loss=0.1577, simple_loss=0.2381, pruned_loss=0.03861, over 4700177.69 frames. ], batch size: 105, lr: 2.31e-03, grad_scale: 8.0 2023-10-04 05:34:20,310 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1015-343884-0_sp1.1 from training. Duration: 0.563625 2023-10-04 05:34:21,624 WARNING [train.py:1204] (2/4) Exclude cut with ID _13684_210_7497_1_1530619354863_3598359_312-235606-0_sp1.1 from training. Duration: 0.5454375 2023-10-04 05:34:23,385 WARNING [train.py:1204] (2/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_55-31904-0 from training. Duration: 0.88 2023-10-04 05:34:23,467 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_3133_210_2087_1_1526106446_2324690_25-332138-0 from training. Duration: 0.91 2023-10-04 05:34:23,468 WARNING [train.py:1204] (2/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_912-282412-0_sp0.9 from training. Duration: 0.5444375 2023-10-04 05:34:24,907 WARNING [train.py:1204] (2/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_458-299435-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 05:34:25,000 WARNING [train.py:1204] (2/4) Exclude cut with ID _69584_210_5219_1_1535785133775_4044450_275-88403-0_sp1.1 from training. Duration: 0.92725 2023-10-04 05:34:26,275 WARNING [train.py:1204] (2/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_841-341679-0_sp1.1 from training. Duration: 0.52725 2023-10-04 05:34:26,282 WARNING [train.py:1204] (2/4) Exclude cut with ID _41391_210_7994_1_1533603771872_3638201_1-54521-0_sp1.1 from training. Duration: 0.97275 2023-10-04 05:34:26,319 WARNING [train.py:1204] (2/4) Exclude cut with ID _56235_210_10640_1_1534557335235_4161099_495-186609-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 05:34:27,831 WARNING [train.py:1204] (2/4) Exclude cut with ID _4681_210_3525_1_1527250689464_120889_15-126760-0 from training. Duration: 0.81 2023-10-04 05:34:30,476 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_3019_210_2028_1_1526044245_3264865_127-135615-0_sp1.1 from training. Duration: 0.8545625 2023-10-04 05:34:31,913 WARNING [train.py:1204] (2/4) Exclude cut with ID _30191_210_1664_1_1533189915047_7196670_915-21561-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 05:34:31,950 WARNING [train.py:1204] (2/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_6-214815-0_sp0.9 from training. Duration: 0.9444375 2023-10-04 05:34:34,761 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_45399_210_4852_1_1533624725_7579737_190-137494-0_sp1.1 from training. Duration: 0.97275 2023-10-04 05:34:36,810 WARNING [train.py:1204] (2/4) Exclude cut with ID _41314_210_12651_1_1533459476543_3766460_207-132038-0 from training. Duration: 0.94 2023-10-04 05:34:39,614 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.attention_skip_rate, batch_count=1543213.3333333333, ans=0.0 2023-10-04 05:34:40,997 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.0.bypass_mid.scale_min, batch_count=1543213.3333333333, ans=0.2 2023-10-04 05:34:42,269 WARNING [train.py:1204] (2/4) Exclude cut with ID _14603_210_4220_1_1530615688441_3097319_90-31383-0 from training. Duration: 0.87 2023-10-04 05:34:43,562 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_15517_210_6753_1_1530952293_1664795_119-31001-0 from training. Duration: 0.94 2023-10-04 05:34:43,588 WARNING [train.py:1204] (2/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_684-85342-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 05:34:45,641 WARNING [train.py:1204] (2/4) Exclude cut with ID _14603_210_4220_1_1530615688441_3097319_97-325053-0_sp1.1 from training. Duration: 0.836375 2023-10-04 05:34:49,057 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_68702_210_23689_1_1535799577_3420068_168-25150-0_sp1.1 from training. Duration: 0.97275 2023-10-04 05:34:50,345 WARNING [train.py:1204] (2/4) Exclude cut with ID _65134_210_14763_1_1535335393985_3505368_111-27984-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 05:34:50,407 WARNING [train.py:1204] (2/4) Exclude cut with ID _48678_210_5663_1_1533988830947_3769199_276-226230-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 05:34:54,869 WARNING [train.py:1204] (2/4) Exclude cut with ID _28427_210_4220_1_1532581240967_3061069_43-84928-0_sp1.1 from training. Duration: 0.9 2023-10-04 05:34:54,899 WARNING [train.py:1204] (2/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_158-250116-0_sp1.1 from training. Duration: 0.563625 2023-10-04 05:34:54,924 WARNING [train.py:1204] (2/4) Exclude cut with ID _66873_210_5517_1_1535454136506_4293370_279-332556-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 05:34:54,962 WARNING [train.py:1204] (2/4) Exclude cut with ID _42863_210_9070_1_1533902398788_3696920_488-230146-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 05:34:54,962 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_387-247159-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 05:34:56,346 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6729_210_2550_1_1528032481_1788725_261-42619-0_sp1.1 from training. Duration: 0.97275 2023-10-04 05:34:57,767 WARNING [train.py:1204] (2/4) Exclude cut with ID _19896_210_1883_1_1531709022662_6649569_831-44724-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 05:35:01,897 WARNING [train.py:1204] (2/4) Exclude cut with ID _55153_210_2253_1_1534384786677_3162096_280-285944-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 05:35:01,934 WARNING [train.py:1204] (2/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_448-4048-0 from training. Duration: 0.96 2023-10-04 05:35:03,152 WARNING [train.py:1204] (2/4) Exclude cut with ID _29678_210_7893_1_1532421042787_3906118_456-202548-0_sp1.1 from training. Duration: 0.97275 2023-10-04 05:35:03,164 WARNING [train.py:1204] (2/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_107-332398-0_sp1.1 from training. Duration: 0.7090625 2023-10-04 05:35:05,122 WARNING [train.py:1204] (2/4) Exclude cut with ID _13921_210_9994_1_1530838702800_3003429_46-149444-0_sp1.1 from training. Duration: 0.8545625 2023-10-04 05:35:06,487 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_257-20733-0_sp0.9 from training. Duration: 0.8444375 2023-10-04 05:35:06,534 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_53753_210_7234_1_1534320003_3594148_385-234809-0_sp1.1 from training. Duration: 0.963625 2023-10-04 05:35:07,975 WARNING [train.py:1204] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_292-215346-0_sp1.1 from training. Duration: 0.8818125 2023-10-04 05:35:12,199 WARNING [train.py:1204] (2/4) Exclude cut with ID _68965_210_1883_1_1535698962272_3497490_196-124268-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 05:35:13,909 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_23801_210_16223_1_1532066392_3294657_570-5439-0_sp1.1 from training. Duration: 0.8818125 2023-10-04 05:35:18,232 WARNING [train.py:1204] (2/4) Exclude cut with ID _5166_210_2084_1_1527073197160_4087993_349-6928-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 05:35:20,093 WARNING [train.py:1204] (2/4) Exclude cut with ID _60621_210_17071_1_1535013053530_3671739_226-255202-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 05:35:20,097 WARNING [train.py:1204] (2/4) Exclude cut with ID _41817_210_18835_1_1533434477160_7423049_448-130302-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 05:35:20,229 WARNING [train.py:1204] (2/4) Exclude cut with ID _6653_210_2629_1_1527919172441_6732679_758-143677-0_sp1.1 from training. Duration: 0.8909375 2023-10-04 05:35:21,529 WARNING [train.py:1204] (2/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_912-47473-0_sp0.9 from training. Duration: 0.7555625 2023-10-04 05:35:21,565 WARNING [train.py:1204] (2/4) Exclude cut with ID _6694_210_4599_1_1527852637490_3965529_356-169233-0_sp1.1 from training. Duration: 0.9 2023-10-04 05:35:22,933 WARNING [train.py:1204] (2/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_1-99680-0 from training. Duration: 0.67 2023-10-04 05:35:25,968 WARNING [train.py:1204] (2/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_199-86432-0_sp1.1 from training. Duration: 0.8909375 2023-10-04 05:35:25,987 WARNING [train.py:1204] (2/4) Exclude cut with ID _61148_210_6897_1_1535094022726_4217610_362-182430-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 05:35:27,384 WARNING [train.py:1204] (2/4) Exclude cut with ID _49376_210_5986_1_1533985278732_3370740_301-154763-0 from training. Duration: 0.98 2023-10-04 05:35:28,602 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.614e+02 2.065e+02 2.224e+02 2.528e+02 3.449e+02, threshold=4.449e+02, percent-clipped=0.0 2023-10-04 05:35:28,723 WARNING [train.py:1204] (2/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_572-185150-0_sp1.1 from training. Duration: 0.8818125 2023-10-04 05:35:32,982 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_47896_210_20508_1_1534492518_5958868_558-314572-0_sp1.1 from training. Duration: 0.8818125 2023-10-04 05:35:33,096 WARNING [train.py:1204] (2/4) Exclude cut with ID _45560_210_21982_1_1533812603951_3584999_111-6376-0_sp0.9 from training. Duration: 0.9333125 2023-10-04 05:35:35,020 INFO [train.py:1046] (2/4) Epoch 44, batch 3100, loss[loss=0.1506, simple_loss=0.2459, pruned_loss=0.0277, over 24642.00 frames. ], tot_loss[loss=0.1567, simple_loss=0.2373, pruned_loss=0.03811, over 4699114.43 frames. ], batch size: 68, lr: 2.31e-03, grad_scale: 8.0 2023-10-04 05:35:36,569 WARNING [train.py:1204] (2/4) Exclude cut with ID _14170_210_5219_1_1530525949868_4469650_412-317077-0_sp1.1 from training. Duration: 0.6818125 2023-10-04 05:35:39,256 WARNING [train.py:1204] (2/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_718-68505-0 from training. Duration: 0.89 2023-10-04 05:35:40,657 WARNING [train.py:1204] (2/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_77-311404-0 from training. Duration: 0.94 2023-10-04 05:35:43,308 WARNING [train.py:1204] (2/4) Exclude cut with ID _60229_210_27767_1_1534838406158_3655350_264-279573-0 from training. Duration: 0.83 2023-10-04 05:35:45,218 WARNING [train.py:1204] (2/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_106-300985-0_sp1.1 from training. Duration: 0.8181875 2023-10-04 05:35:47,997 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_4183_210_2087_1_1526729100_1869838_79-277189-0_sp1.1 from training. Duration: 0.963625 2023-10-04 05:35:48,013 WARNING [train.py:1204] (2/4) Exclude cut with ID _28427_210_4220_1_1532581240967_3061069_195-295268-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 05:35:51,284 WARNING [train.py:1204] (2/4) Exclude cut with ID _4092_210_2151_1_1526643044449_3135759_356-278694-0_sp0.9 from training. Duration: 0.52225 2023-10-04 05:35:56,036 WARNING [train.py:1204] (2/4) Exclude cut with ID _14459_210_2228_1_1531443641946_7516700_777-71446-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 05:36:00,388 WARNING [train.py:1204] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_520-126729-0 from training. Duration: 0.7 2023-10-04 05:36:02,138 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=1543546.6666666667, ans=0.1 2023-10-04 05:36:02,532 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.3.encoder.layers.3.feed_forward1.out_whiten, num_groups=1, num_channels=512, metric=4.46 vs. limit=15.0 2023-10-04 05:36:04,571 WARNING [train.py:1204] (2/4) Exclude cut with ID _13684_210_7497_1_1530619354863_3598359_312-235606-0_sp0.9 from training. Duration: 0.6666875 2023-10-04 05:36:04,604 WARNING [train.py:1204] (2/4) Exclude cut with ID _37537_210_2253_1_1533171758568_3421949_283-349402-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 05:36:04,654 WARNING [train.py:1204] (2/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_29-117932-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 05:36:06,556 WARNING [train.py:1204] (2/4) Exclude cut with ID _29303_210_2575_1_1532392237303_7410649_721-283351-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 05:36:06,621 WARNING [train.py:1204] (2/4) Exclude cut with ID _14886_210_4220_1_1530702020355_3516190_175-229073-0_sp0.9 from training. Duration: 0.5 2023-10-04 05:36:09,269 WARNING [train.py:1204] (2/4) Exclude cut with ID _15458_210_8341_1_1530858614263_3834410_70-268830-0_sp1.1 from training. Duration: 0.97275 2023-10-04 05:36:09,286 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_2295_210_2087_1_1525500993_2407863_234-83104-0 from training. Duration: 0.62 2023-10-04 05:36:09,287 WARNING [train.py:1204] (2/4) Exclude cut with ID _22961_210_8254_1_1531976366672_4278180_317-199546-0_sp1.1 from training. Duration: 0.7818125 2023-10-04 05:36:10,629 WARNING [train.py:1204] (2/4) Exclude cut with ID _27894_210_12392_1_1532415496078_6902439_547-196022-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 05:36:13,280 WARNING [train.py:1204] (2/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_46-207599-0 from training. Duration: 0.64 2023-10-04 05:36:14,727 WARNING [train.py:1204] (2/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_426-326246-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 05:36:18,123 WARNING [train.py:1204] (2/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_445-117398-0_sp0.9 from training. Duration: 0.988875 2023-10-04 05:36:18,168 WARNING [train.py:1204] (2/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_563-347832-0 from training. Duration: 0.89 2023-10-04 05:36:18,256 WARNING [train.py:1204] (2/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_252-216674-0 from training. Duration: 0.54 2023-10-04 05:36:20,845 WARNING [train.py:1204] (2/4) Exclude cut with ID _59611_210_8895_1_1534741198240_3650361_187-212779-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 05:36:20,916 WARNING [train.py:1204] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_282-159367-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 05:36:22,376 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_68702_210_23689_1_1535799577_3420068_33-136883-0_sp1.1 from training. Duration: 0.936375 2023-10-04 05:36:22,386 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_47228_210_4983_1_1533780021_3672027_184-293706-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 05:36:22,405 WARNING [train.py:1204] (2/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_656-188361-0_sp0.9 from training. Duration: 0.9666875 2023-10-04 05:36:24,316 WARNING [train.py:1204] (2/4) Exclude cut with ID _4866_210_2577_1_1527404412284_4364659_352-157930-0_sp0.9 from training. Duration: 0.9 2023-10-04 05:36:24,317 WARNING [train.py:1204] (2/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_210-106047-0_sp1.1 from training. Duration: 0.8818125 2023-10-04 05:36:25,702 WARNING [train.py:1204] (2/4) Exclude cut with ID _9389_210_1664_1_1529217337301_5289269_289-213417-0_sp1.1 from training. Duration: 0.8090625 2023-10-04 05:36:27,041 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_3910_210_1794_1_1526777667_3747260_178-41186-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 05:36:27,048 WARNING [train.py:1204] (2/4) Exclude cut with ID _27052_210_5986_1_1532329197187_3585009_24-172263-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 05:36:27,052 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_5522_210_3273_1_1527249606_3909096_318-203153-0_sp1.1 from training. Duration: 0.3909375 2023-10-04 05:36:30,046 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.balancer2.prob, batch_count=1543680.0, ans=0.125 2023-10-04 05:36:31,248 WARNING [train.py:1204] (2/4) Exclude cut with ID _27894_210_12392_1_1532415496078_6902439_222-51982-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 05:36:32,606 WARNING [train.py:1204] (2/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_87-131427-0 from training. Duration: 0.78 2023-10-04 05:36:35,816 WARNING [train.py:1204] (2/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_372-298093-0_sp0.9 from training. Duration: 0.888875 2023-10-04 05:36:35,874 WARNING [train.py:1204] (2/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_663-166973-0 from training. Duration: 0.8 2023-10-04 05:36:37,218 WARNING [train.py:1204] (2/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_429-177469-0_sp1.1 from training. Duration: 0.936375 2023-10-04 05:36:37,254 WARNING [train.py:1204] (2/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_724-172651-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 05:36:37,291 WARNING [train.py:1204] (2/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_103-336064-0 from training. Duration: 0.51 2023-10-04 05:36:46,546 WARNING [train.py:1204] (2/4) Exclude cut with ID _48677_210_5663_1_1533902405919_3892989_111-11439-0 from training. Duration: 0.96 2023-10-04 05:36:47,862 WARNING [train.py:1204] (2/4) Exclude cut with ID _8085_210_4557_1_1528455633493_3716100_95-103398-0_sp1.1 from training. Duration: 0.963625 2023-10-04 05:36:47,925 WARNING [train.py:1204] (2/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_1041-110837-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 05:36:49,294 INFO [train.py:1046] (2/4) Epoch 44, batch 3150, loss[loss=0.1625, simple_loss=0.2484, pruned_loss=0.03833, over 23293.00 frames. ], tot_loss[loss=0.1556, simple_loss=0.2359, pruned_loss=0.03759, over 4708544.87 frames. ], batch size: 93, lr: 2.31e-03, grad_scale: 8.0 2023-10-04 05:36:50,760 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_50050_210_16223_1_1535435796_7290138_1155-121757-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 05:36:50,763 WARNING [train.py:1204] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_602-275260-0_sp0.9 from training. Duration: 0.911125 2023-10-04 05:36:52,595 WARNING [train.py:1204] (2/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_265-164486-0 from training. Duration: 0.96 2023-10-04 05:36:54,420 WARNING [train.py:1204] (2/4) Exclude cut with ID _46067_210_4220_1_1534125571066_3307450_142-229375-0_sp1.1 from training. Duration: 0.963625 2023-10-04 05:36:54,450 WARNING [train.py:1204] (2/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_310-26186-0_sp1.1 from training. Duration: 0.636375 2023-10-04 05:36:55,785 WARNING [train.py:1204] (2/4) Exclude cut with ID _28461_210_2253_1_1532771775189_3041299_136-270644-0 from training. Duration: 0.93 2023-10-04 05:36:58,519 WARNING [train.py:1204] (2/4) Exclude cut with ID _14860_210_4915_1_1530702020340_7343240_438-286372-0_sp1.1 from training. Duration: 0.936375 2023-10-04 05:37:00,049 WARNING [train.py:1204] (2/4) Exclude cut with ID IC0128W0004-63309 from training. Duration: 0.831 2023-10-04 05:37:03,299 WARNING [train.py:1204] (2/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_135-174080-0 from training. Duration: 0.53 2023-10-04 05:37:03,346 WARNING [train.py:1204] (2/4) Exclude cut with ID _41326_210_5854_1_1533778076509_3492080_277-59511-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 05:37:04,751 WARNING [train.py:1204] (2/4) Exclude cut with ID IC0144W0112-71379 from training. Duration: 0.992 2023-10-04 05:37:04,828 WARNING [train.py:1204] (2/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_711-15688-0_sp0.9 from training. Duration: 0.47775 2023-10-04 05:37:05,071 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.attention_skip_rate, batch_count=1543880.0, ans=0.0 2023-10-04 05:37:06,117 WARNING [train.py:1204] (2/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_237-332027-0 from training. Duration: 0.95 2023-10-04 05:37:06,173 WARNING [train.py:1204] (2/4) Exclude cut with ID _7970_210_1883_1_1528455616296_3472230_25-178407-0 from training. Duration: 0.71 2023-10-04 05:37:06,174 WARNING [train.py:1204] (2/4) Exclude cut with ID _8480_210_1874_1_1528589391205_6728330_309-139896-0 from training. Duration: 0.81 2023-10-04 05:37:06,199 WARNING [train.py:1204] (2/4) Exclude cut with ID _39571_210_13449_1_1533801584143_3651077_319-268877-0_sp1.1 from training. Duration: 0.936375 2023-10-04 05:37:06,202 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_43802_210_2290_1_1533516203_8829504_155-246880-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 05:37:07,589 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_566-122599-0_sp1.1 from training. Duration: 0.936375 2023-10-04 05:37:10,292 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_12868_210_6753_1_1530701932_530275_18-76322-0 from training. Duration: 0.87 2023-10-04 05:37:11,570 WARNING [train.py:1204] (2/4) Exclude cut with ID _56494_210_4220_1_1535284837625_3033169_159-106035-0_sp1.1 from training. Duration: 0.963625 2023-10-04 05:37:11,610 WARNING [train.py:1204] (2/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_98-276013-0_sp1.1 from training. Duration: 0.963625 2023-10-04 05:37:11,652 WARNING [train.py:1204] (2/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_330-216124-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 05:37:14,402 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_177-58005-0_sp0.9 from training. Duration: 0.688875 2023-10-04 05:37:19,262 WARNING [train.py:1204] (2/4) Exclude cut with ID _56235_210_10640_1_1534557335235_4161099_343-139898-0 from training. Duration: 0.96 2023-10-04 05:37:20,625 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6961_210_2087_1_1527937168_6815011_395-185060-0_sp1.1 from training. Duration: 0.863625 2023-10-04 05:37:22,028 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7002_210_1794_1_1527939683_5157286_184-229630-0_sp0.9 from training. Duration: 0.82225 2023-10-04 05:37:23,459 WARNING [train.py:1204] (2/4) Exclude cut with ID _59170_210_6721_1_1534983633960_4203335_153-18895-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 05:37:23,507 WARNING [train.py:1204] (2/4) Exclude cut with ID _49643_210_7989_1_1534640244224_3939031_598-161926-0 from training. Duration: 0.9 2023-10-04 05:37:27,402 WARNING [train.py:1204] (2/4) Exclude cut with ID _4068_210_2629_1_1526697350395_1546259_224-288693-0 from training. Duration: 0.59 2023-10-04 05:37:28,659 WARNING [train.py:1204] (2/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_718-68505-0_sp1.1 from training. Duration: 0.8090625 2023-10-04 05:37:28,709 WARNING [train.py:1204] (2/4) Exclude cut with ID _4068_210_2629_1_1526697350395_1546259_224-288693-0_sp0.9 from training. Duration: 0.6555625 2023-10-04 05:37:28,724 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_2296_210_2087_1_1525503448_3397335_57-185430-0_sp0.9 from training. Duration: 0.6666875 2023-10-04 05:37:30,064 WARNING [train.py:1204] (2/4) Exclude cut with ID _15458_210_8341_1_1530858614263_3834410_49-235472-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 05:37:30,067 WARNING [train.py:1204] (2/4) Exclude cut with ID _64135_210_2253_1_1535677267876_7608972_498-5039-0_sp0.9 from training. Duration: 0.8666875 2023-10-04 05:37:31,679 WARNING [train.py:1204] (2/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_283-220372-0_sp0.9 from training. Duration: 0.62225 2023-10-04 05:37:31,688 WARNING [train.py:1204] (2/4) Exclude cut with ID _6473_210_1741_1_1527901307396_3815380_108-17335-0_sp0.9 from training. Duration: 0.72225 2023-10-04 05:37:34,459 WARNING [train.py:1204] (2/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_113-92608-0 from training. Duration: 0.54 2023-10-04 05:37:34,518 WARNING [train.py:1204] (2/4) Exclude cut with ID _14170_210_5219_1_1530525949868_4469650_272-249985-0_sp1.1 from training. Duration: 0.6090625 2023-10-04 05:37:34,530 WARNING [train.py:1204] (2/4) Exclude cut with ID _50376_210_13401_1_1534031502101_4217740_518-187390-0_sp1.1 from training. Duration: 0.97275 2023-10-04 05:37:36,335 WARNING [train.py:1204] (2/4) Exclude cut with ID _59738_210_15379_1_1534849512562_3941470_165-248260-0_sp1.1 from training. Duration: 0.92725 2023-10-04 05:37:36,353 WARNING [train.py:1204] (2/4) Exclude cut with ID _23818_210_6228_1_1531917063884_3676920_400-343925-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 05:37:37,698 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6961_210_2087_1_1527937168_6815011_562-165518-0 from training. Duration: 0.77 2023-10-04 05:37:37,757 WARNING [train.py:1204] (2/4) Exclude cut with ID _24271_210_12658_1_1531987270022_7190519_289-207931-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 05:37:37,910 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.1.feed_forward1.hidden_balancer.prob, batch_count=1544013.3333333333, ans=0.125 2023-10-04 05:37:39,186 WARNING [train.py:1204] (2/4) Exclude cut with ID _14632_210_9988_1_1530691279392_3923200_332-105904-0 from training. Duration: 0.9 2023-10-04 05:37:39,198 WARNING [train.py:1204] (2/4) Exclude cut with ID _40402_210_18219_1_1533522324115_2953150_203-232438-0_sp1.1 from training. Duration: 0.97275 2023-10-04 05:37:40,639 WARNING [train.py:1204] (2/4) Exclude cut with ID _13252_210_1925_1_1530671372047_3336909_263-168388-0 from training. Duration: 0.86 2023-10-04 05:37:41,908 WARNING [train.py:1204] (2/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_174-205058-0 from training. Duration: 0.95 2023-10-04 05:37:43,286 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_94-195801-0_sp1.1 from training. Duration: 0.8545625 2023-10-04 05:37:43,308 WARNING [train.py:1204] (2/4) Exclude cut with ID _55153_210_2253_1_1534384786677_3162096_269-103204-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 05:37:44,549 WARNING [train.py:1204] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_748-36150-0 from training. Duration: 0.91 2023-10-04 05:37:44,637 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_12868_210_6753_1_1530702479_3610564_148-172861-0_sp1.1 from training. Duration: 0.4545625 2023-10-04 05:37:46,510 WARNING [train.py:1204] (2/4) Exclude cut with ID _67499_210_17282_1_1535509805220_3692430_460-77730-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 05:37:47,956 WARNING [train.py:1204] (2/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_374-20753-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 05:37:49,424 WARNING [train.py:1204] (2/4) Exclude cut with ID _45329_210_7005_1_1534157951211_6774339_374-289983-0_sp1.1 from training. Duration: 0.97275 2023-10-04 05:37:49,464 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_34886_210_10633_1_1533203882_1755661_76-156096-0_sp0.9 from training. Duration: 0.9666875 2023-10-04 05:37:55,273 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_238-256668-0_sp0.9 from training. Duration: 0.7666875 2023-10-04 05:37:55,331 WARNING [train.py:1204] (2/4) Exclude cut with ID _46683_210_9642_1_1533730913492_4591116_489-146727-0_sp1.1 from training. Duration: 0.97275 2023-10-04 05:37:57,707 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.3.encoder.layers.3.feed_forward2.out_whiten, num_groups=1, num_channels=512, metric=6.79 vs. limit=15.0 2023-10-04 05:37:58,397 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.700e+02 2.021e+02 2.313e+02 2.490e+02 3.781e+02, threshold=4.625e+02, percent-clipped=0.0 2023-10-04 05:37:58,495 WARNING [train.py:1204] (2/4) Exclude cut with ID _13938_210_9988_1_1530518718978_3693960_304-112784-0_sp0.9 from training. Duration: 0.511125 2023-10-04 05:38:02,744 WARNING [train.py:1204] (2/4) Exclude cut with ID _24271_210_12658_1_1531987270022_7190519_243-46549-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 05:38:02,748 WARNING [train.py:1204] (2/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_78-182912-0_sp1.1 from training. Duration: 0.536375 2023-10-04 05:38:04,439 INFO [train.py:1046] (2/4) Epoch 44, batch 3200, loss[loss=0.1671, simple_loss=0.238, pruned_loss=0.04812, over 23840.00 frames. ], tot_loss[loss=0.1545, simple_loss=0.2344, pruned_loss=0.03728, over 4702826.30 frames. ], batch size: 179, lr: 2.31e-03, grad_scale: 16.0 2023-10-04 05:38:05,896 WARNING [train.py:1204] (2/4) Exclude cut with ID _57572_210_5517_1_1534921219624_3809484_121-321334-0_sp1.1 from training. Duration: 0.97275 2023-10-04 05:38:06,003 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_40047_210_2290_1_1533290519_2374311_254-102149-0_sp1.1 from training. Duration: 0.963625 2023-10-04 05:38:06,004 WARNING [train.py:1204] (2/4) Exclude cut with ID _28461_210_2253_1_1532771775189_3041299_96-152158-0 from training. Duration: 0.85 2023-10-04 05:38:06,141 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.0.conv_skip_rate, batch_count=1544146.6666666667, ans=0.0 2023-10-04 05:38:09,988 WARNING [train.py:1204] (2/4) Exclude cut with ID _29877_210_6228_1_1532397791106_5276149_161-102979-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 05:38:13,451 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.3.encoder.layers.0.whiten, num_groups=1, num_channels=512, metric=5.58 vs. limit=12.0 2023-10-04 05:38:14,203 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_3787_210_2040_1_1526477544_5478586_603-120324-0_sp0.9 from training. Duration: 0.9 2023-10-04 05:38:15,018 INFO [scaling.py:1118] (2/4) WithLoss: name=encoder.encoders.4.encoder.layers.2.self_attn_weights, loss-sum=0.000e+00 2023-10-04 05:38:18,817 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_41750_210_1960_1_1533520678_3948042_247-213456-0_sp1.1 from training. Duration: 0.97275 2023-10-04 05:38:28,101 WARNING [train.py:1204] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_127-119109-0_sp0.9 from training. Duration: 0.8 2023-10-04 05:38:28,425 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.balancer2.prob, batch_count=1544213.3333333333, ans=0.125 2023-10-04 05:38:29,670 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.1.self_attn_weights.pos_emb_skip_rate, batch_count=1544213.3333333333, ans=0.0 2023-10-04 05:38:30,115 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.4.encoder.layers.1.feed_forward2.out_whiten, num_groups=1, num_channels=384, metric=9.00 vs. limit=15.0 2023-10-04 05:38:36,983 WARNING [train.py:1204] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_215-202973-0 from training. Duration: 0.88 2023-10-04 05:38:38,277 WARNING [train.py:1204] (2/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_17-320491-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 05:38:39,885 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.ff2_skip_rate, batch_count=1544280.0, ans=0.0 2023-10-04 05:38:40,940 WARNING [train.py:1204] (2/4) Exclude cut with ID _5166_210_2084_1_1527073197160_4087993_310-114452-0 from training. Duration: 0.59 2023-10-04 05:38:42,257 WARNING [train.py:1204] (2/4) Exclude cut with ID _15133_210_6228_1_1530794512074_3080379_29-264458-0_sp0.9 from training. Duration: 0.6444375 2023-10-04 05:38:46,389 WARNING [train.py:1204] (2/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_159-281310-0_sp1.1 from training. Duration: 0.863625 2023-10-04 05:38:47,043 WARNING [train.py:1204] (2/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_195-167011-0_sp0.9 from training. Duration: 0.8333125 2023-10-04 05:38:47,106 WARNING [train.py:1204] (2/4) Exclude cut with ID _14087_210_2253_1_1530529224512_3014029_156-257607-0_sp1.1 from training. Duration: 0.7545625 2023-10-04 05:38:49,883 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_2295_210_2087_1_1525500993_2407863_116-238894-0 from training. Duration: 0.7 2023-10-04 05:38:51,225 WARNING [train.py:1204] (2/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_321-335475-0_sp0.9 from training. Duration: 0.5 2023-10-04 05:38:52,656 WARNING [train.py:1204] (2/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_188-114464-0 from training. Duration: 0.82 2023-10-04 05:38:55,287 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_2592_210_2290_1_1525697025_4882925_157-70512-0 from training. Duration: 0.91 2023-10-04 05:38:55,499 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.feed_forward1.hidden_balancer.prob, batch_count=1544346.6666666667, ans=0.125 2023-10-04 05:39:00,382 WARNING [train.py:1204] (2/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_138-64210-0_sp1.1 from training. Duration: 0.663625 2023-10-04 05:39:04,627 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_11305_210_1794_1_1529750428_7178148_651-95702-0_sp1.1 from training. Duration: 0.963625 2023-10-04 05:39:04,641 WARNING [train.py:1204] (2/4) Exclude cut with ID _14224_210_4381_1_1530529062037_4264010_379-184223-0_sp0.9 from training. Duration: 0.9444375 2023-10-04 05:39:04,659 WARNING [train.py:1204] (2/4) Exclude cut with ID _8394_210_6702_1_1528590504316_3540189_381-125325-0_sp1.1 from training. Duration: 0.963625 2023-10-04 05:39:04,721 WARNING [train.py:1204] (2/4) Exclude cut with ID IC0622W0021-307659 from training. Duration: 0.896 2023-10-04 05:39:04,731 WARNING [train.py:1204] (2/4) Exclude cut with ID _7624_210_1531_1_1528109802198_4933101_418-349834-0_sp1.1 from training. Duration: 0.5454375 2023-10-04 05:39:09,337 WARNING [train.py:1204] (2/4) Exclude cut with ID _49728_210_12651_1_1534041034294_6788150_607-3416-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 05:39:10,689 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8275_210_4198_1_1528455320_3892571_491-17707-0 from training. Duration: 0.64 2023-10-04 05:39:10,733 WARNING [train.py:1204] (2/4) Exclude cut with ID _5287_210_2084_1_1527418883040_3878670_333-48508-0 from training. Duration: 0.6 2023-10-04 05:39:12,001 WARNING [train.py:1204] (2/4) Exclude cut with ID _8365_210_2064_1_1528592311070_4677690_371-122295-0 from training. Duration: 0.55 2023-10-04 05:39:14,693 WARNING [train.py:1204] (2/4) Exclude cut with ID _14860_210_4915_1_1530702020340_7343240_836-328489-0 from training. Duration: 0.9 2023-10-04 05:39:16,760 WARNING [train.py:1204] (2/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_1033-274376-0_sp0.9 from training. Duration: 0.9666875 2023-10-04 05:39:17,949 INFO [train.py:1046] (2/4) Epoch 44, batch 3250, loss[loss=0.1484, simple_loss=0.2255, pruned_loss=0.03564, over 23575.00 frames. ], tot_loss[loss=0.1542, simple_loss=0.2343, pruned_loss=0.03705, over 4706827.30 frames. ], batch size: 285, lr: 2.31e-03, grad_scale: 16.0 2023-10-04 05:39:20,822 WARNING [train.py:1204] (2/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_475-127455-0_sp0.9 from training. Duration: 0.77775 2023-10-04 05:39:20,829 WARNING [train.py:1204] (2/4) Exclude cut with ID IC0671W0071-331914 from training. Duration: 0.896 2023-10-04 05:39:20,850 WARNING [train.py:1204] (2/4) Exclude cut with ID _30327_210_16049_1_1532484161295_3924169_2-1523-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 05:39:20,852 WARNING [train.py:1204] (2/4) Exclude cut with ID _24314_210_4915_1_1532135078116_7170179_460-208279-0_sp1.1 from training. Duration: 0.97275 2023-10-04 05:39:21,037 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.1.conv_module1.balancer2.prob, batch_count=1544480.0, ans=0.125 2023-10-04 05:39:22,372 WARNING [train.py:1204] (2/4) Exclude cut with ID IC0679W0052-335859 from training. Duration: 0.64 2023-10-04 05:39:26,637 WARNING [train.py:1204] (2/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_3-237434-0_sp1.1 from training. Duration: 0.6909375 2023-10-04 05:39:29,406 WARNING [train.py:1204] (2/4) Exclude cut with ID _69884_210_9771_1_1535846392818_4166169_405-185283-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 05:39:36,893 WARNING [train.py:1204] (2/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_927-83684-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 05:39:36,902 WARNING [train.py:1204] (2/4) Exclude cut with ID _6694_210_4599_1_1527852637490_3965529_356-169233-0 from training. Duration: 0.99 2023-10-04 05:39:38,300 WARNING [train.py:1204] (2/4) Exclude cut with ID _19896_210_1883_1_1531709022662_6649569_562-233001-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 05:39:38,349 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_20120_210_6753_1_1531643035_5482859_354-78629-0_sp1.1 from training. Duration: 0.963625 2023-10-04 05:39:38,350 WARNING [train.py:1204] (2/4) Exclude cut with ID _60135_210_13427_1_1534935557922_3651319_96-168198-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 05:39:39,674 WARNING [train.py:1204] (2/4) Exclude cut with ID _31602_210_5854_1_1532677021456_4458250_249-96604-0_sp1.1 from training. Duration: 0.8090625 2023-10-04 05:39:39,708 WARNING [train.py:1204] (2/4) Exclude cut with ID _14100_210_8341_1_1530529320613_3678069_258-224599-0_sp0.9 from training. Duration: 0.8555625 2023-10-04 05:39:42,877 WARNING [train.py:1204] (2/4) Exclude cut with ID _19708_210_9206_1_1532136650733_4238364_215-160135-0_sp1.1 from training. Duration: 0.97275 2023-10-04 05:39:44,134 WARNING [train.py:1204] (2/4) Exclude cut with ID _21878_210_3525_1_1531738772002_7264330_113-18163-0_sp0.9 from training. Duration: 0.988875 2023-10-04 05:39:44,185 WARNING [train.py:1204] (2/4) Exclude cut with ID _64135_210_2253_1_1535677267876_7608972_552-337340-0_sp1.1 from training. Duration: 0.92725 2023-10-04 05:39:45,406 WARNING [train.py:1204] (2/4) Exclude cut with ID _67725_210_27767_1_1535608756671_5105359_10-12887-0_sp1.1 from training. Duration: 0.97275 2023-10-04 05:39:45,412 WARNING [train.py:1204] (2/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_401-284840-0_sp1.1 from training. Duration: 0.97275 2023-10-04 05:39:45,437 WARNING [train.py:1204] (2/4) Exclude cut with ID _44659_210_2721_1_1534036120061_6614860_483-141285-0_sp1.1 from training. Duration: 0.936375 2023-10-04 05:39:48,281 WARNING [train.py:1204] (2/4) Exclude cut with ID _9461_210_4557_1_1529060514726_3677451_358-332734-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 05:39:50,177 WARNING [train.py:1204] (2/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_788-298907-0_sp1.1 from training. Duration: 0.8090625 2023-10-04 05:39:51,591 WARNING [train.py:1204] (2/4) Exclude cut with ID _39411_210_5986_1_1533538825030_3343649_354-267196-0_sp1.1 from training. Duration: 0.92725 2023-10-04 05:39:51,619 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_14953_210_2151_1_1530701710_5616165_192-309792-0_sp1.1 from training. Duration: 0.97275 2023-10-04 05:39:52,965 WARNING [train.py:1204] (2/4) Exclude cut with ID _59170_210_6721_1_1534983633960_4203335_290-151255-0_sp1.1 from training. Duration: 0.92725 2023-10-04 05:39:53,005 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_5575_210_1410_1_1527301305_2570170_249-93769-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 05:39:53,015 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_46013_210_7234_1_1533776362_3744032_463-52378-0_sp1.1 from training. Duration: 0.7818125 2023-10-04 05:39:53,127 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.1.conv_module1.balancer2.prob, batch_count=1544613.3333333333, ans=0.125 2023-10-04 05:39:59,830 WARNING [train.py:1204] (2/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_580-93798-0 from training. Duration: 0.56 2023-10-04 05:40:01,103 WARNING [train.py:1204] (2/4) Exclude cut with ID _19514_210_9259_1_1531530071330_5084230_385-13762-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 05:40:01,109 WARNING [train.py:1204] (2/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_458-67321-0_sp1.1 from training. Duration: 0.8818125 2023-10-04 05:40:02,968 WARNING [train.py:1204] (2/4) Exclude cut with ID _41298_210_9019_1_1533430371450_4822143_188-154791-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 05:40:04,410 WARNING [train.py:1204] (2/4) Exclude cut with ID _15037_210_2253_1_1530752294104_3267650_296-192597-0_sp0.9 from training. Duration: 0.9 2023-10-04 05:40:10,605 WARNING [train.py:1204] (2/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_661-264857-0_sp1.1 from training. Duration: 0.8454375 2023-10-04 05:40:12,361 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.self_attn_weights.pos_emb_skip_rate, batch_count=1544680.0, ans=0.0 2023-10-04 05:40:16,901 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_34008_210_10431_1_1533345424_8183247_662-317331-0_sp1.1 from training. Duration: 0.8545625 2023-10-04 05:40:16,929 WARNING [train.py:1204] (2/4) Exclude cut with ID _46502_210_13794_1_1533724452198_3657159_241-278329-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 05:40:16,930 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_11349_210_6753_1_1529800861_1101207_26-65602-0 from training. Duration: 0.96 2023-10-04 05:40:16,934 WARNING [train.py:1204] (2/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_448-4048-0_sp1.1 from training. Duration: 0.87275 2023-10-04 05:40:16,942 WARNING [train.py:1204] (2/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_662-275621-0_sp1.1 from training. Duration: 0.3818125 2023-10-04 05:40:18,338 WARNING [train.py:1204] (2/4) Exclude cut with ID _13938_210_9988_1_1530518718978_3693960_256-54454-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 05:40:19,752 WARNING [train.py:1204] (2/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_256-32662-0 from training. Duration: 0.9 2023-10-04 05:40:21,067 WARNING [train.py:1204] (2/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_326-17061-0 from training. Duration: 0.94 2023-10-04 05:40:21,107 WARNING [train.py:1204] (2/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_505-97987-0_sp1.1 from training. Duration: 0.7818125 2023-10-04 05:40:23,079 WARNING [train.py:1204] (2/4) Exclude cut with ID _66873_210_5517_1_1535454136506_4293370_316-294364-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 05:40:23,134 WARNING [train.py:1204] (2/4) Exclude cut with ID _19514_210_9259_1_1531530071330_5084230_762-231063-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 05:40:24,462 WARNING [train.py:1204] (2/4) Exclude cut with ID _4697_210_2069_1_1526815801953_3823098_68-219807-0_sp0.9 from training. Duration: 0.52225 2023-10-04 05:40:24,511 WARNING [train.py:1204] (2/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_479-158053-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 05:40:25,883 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.575e+02 2.026e+02 2.220e+02 2.491e+02 3.845e+02, threshold=4.441e+02, percent-clipped=0.0 2023-10-04 05:40:27,434 WARNING [train.py:1204] (2/4) Exclude cut with ID _66112_210_10635_1_1535590236305_3801484_83-38181-0_sp1.1 from training. Duration: 0.936375 2023-10-04 05:40:27,456 WARNING [train.py:1204] (2/4) Exclude cut with ID _55857_210_2346_1_1534644007831_6898209_255-146346-0_sp1.1 from training. Duration: 0.963625 2023-10-04 05:40:28,951 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=1544746.6666666667, ans=0.1 2023-10-04 05:40:30,065 WARNING [train.py:1204] (2/4) Exclude cut with ID _48836_210_12210_1_1534075848293_3904150_429-35475-0 from training. Duration: 0.89 2023-10-04 05:40:30,083 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_50154_210_13731_1_1534750862_1080961_119-187163-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 05:40:31,332 INFO [train.py:1046] (2/4) Epoch 44, batch 3300, loss[loss=0.1564, simple_loss=0.2333, pruned_loss=0.03969, over 23636.00 frames. ], tot_loss[loss=0.1541, simple_loss=0.2342, pruned_loss=0.03699, over 4711419.94 frames. ], batch size: 134, lr: 2.31e-03, grad_scale: 16.0 2023-10-04 05:40:32,827 WARNING [train.py:1204] (2/4) Exclude cut with ID _8793_210_6310_1_1528542985139_2310009_121-179782-0_sp0.9 from training. Duration: 0.888875 2023-10-04 05:40:32,837 WARNING [train.py:1204] (2/4) Exclude cut with ID _14884_210_2252_1_1530707459689_5561250_591-173406-0 from training. Duration: 0.96 2023-10-04 05:40:34,907 WARNING [train.py:1204] (2/4) Exclude cut with ID _15133_210_6228_1_1530794512074_3080379_54-198193-0_sp1.1 from training. Duration: 0.8545625 2023-10-04 05:40:34,917 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6545_210_2040_1_1527774670_4245258_407-48070-0 from training. Duration: 0.76 2023-10-04 05:40:36,459 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1032-236863-0 from training. Duration: 0.84 2023-10-04 05:40:37,803 WARNING [train.py:1204] (2/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_507-303370-0 from training. Duration: 0.96 2023-10-04 05:40:37,831 WARNING [train.py:1204] (2/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_164-282366-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 05:40:41,991 WARNING [train.py:1204] (2/4) Exclude cut with ID _39867_210_9826_1_1533553222030_9344259_312-158727-0_sp1.1 from training. Duration: 0.963625 2023-10-04 05:40:42,068 WARNING [train.py:1204] (2/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_174-205058-0_sp1.1 from training. Duration: 0.863625 2023-10-04 05:40:42,105 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_11305_210_1794_1_1529750428_7178148_166-237358-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 05:40:43,457 WARNING [train.py:1204] (2/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_26-175553-0_sp1.1 from training. Duration: 0.5545625 2023-10-04 05:40:44,739 WARNING [train.py:1204] (2/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_96-290039-0_sp1.1 from training. Duration: 0.5090625 2023-10-04 05:40:47,988 WARNING [train.py:1204] (2/4) Exclude cut with ID _53219_210_4525_1_1534834836008_4064825_261-101691-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 05:40:49,370 WARNING [train.py:1204] (2/4) Exclude cut with ID _24271_210_12658_1_1531987270022_7190519_96-293390-0_sp1.1 from training. Duration: 0.936375 2023-10-04 05:40:54,030 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_271-234962-0 from training. Duration: 0.79 2023-10-04 05:40:55,392 WARNING [train.py:1204] (2/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_136-30660-0_sp1.1 from training. Duration: 0.97275 2023-10-04 05:40:55,419 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_20120_210_6753_1_1531643035_5482859_625-281046-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 05:40:57,978 WARNING [train.py:1204] (2/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_333-168193-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 05:40:58,040 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9029W0269-509664 from training. Duration: 0.896 2023-10-04 05:41:00,657 WARNING [train.py:1204] (2/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_450-147393-0_sp1.1 from training. Duration: 0.92725 2023-10-04 05:41:00,709 WARNING [train.py:1204] (2/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_643-52007-0_sp1.1 from training. Duration: 0.5181875 2023-10-04 05:41:00,779 WARNING [train.py:1204] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_330-327861-0_sp0.9 from training. Duration: 0.7444375 2023-10-04 05:41:00,787 WARNING [train.py:1204] (2/4) Exclude cut with ID _47790_210_16187_1_1533862814066_4207519_260-317651-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 05:41:02,151 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9044W0010-516654 from training. Duration: 0.896 2023-10-04 05:41:05,508 WARNING [train.py:1204] (2/4) Exclude cut with ID _14087_210_2253_1_1530529224512_3014029_265-118378-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 05:41:05,513 WARNING [train.py:1204] (2/4) Exclude cut with ID _6194_210_1534_1_1527511826160_3941194_321-148641-0_sp0.9 from training. Duration: 0.711125 2023-10-04 05:41:07,599 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.nonlin_attention.balancer.prob, batch_count=1544946.6666666667, ans=0.125 2023-10-04 05:41:08,824 WARNING [train.py:1204] (2/4) Exclude cut with ID _54871_210_22356_1_1534665798253_3856359_101-169176-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 05:41:08,827 WARNING [train.py:1204] (2/4) Exclude cut with ID 497-129325-0061-62254-0_sp1.1 from training. Duration: 0.97725 2023-10-04 05:41:08,929 WARNING [train.py:1204] (2/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_795-33252-0_sp1.1 from training. Duration: 0.336375 2023-10-04 05:41:10,197 WARNING [train.py:1204] (2/4) Exclude cut with ID _28317_210_12742_1_1532480302381_8510325_168-246176-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 05:41:10,300 WARNING [train.py:1204] (2/4) Exclude cut with ID _7160_210_1876_1_1527940923231_3703799_120-150943-0_sp1.1 from training. Duration: 0.736375 2023-10-04 05:41:13,038 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9084W0005-536844 from training. Duration: 0.768 2023-10-04 05:41:13,310 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.conv_module2.balancer1.prob, batch_count=1544946.6666666667, ans=0.125 2023-10-04 05:41:14,304 WARNING [train.py:1204] (2/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_234-260682-0 from training. Duration: 0.84 2023-10-04 05:41:14,348 WARNING [train.py:1204] (2/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_188-114464-0_sp0.9 from training. Duration: 0.911125 2023-10-04 05:41:16,417 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.2.encoder.layers.1.conv_module2.whiten, num_groups=1, num_channels=384, metric=6.47 vs. limit=15.0 2023-10-04 05:41:17,010 WARNING [train.py:1204] (2/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_81-75849-0 from training. Duration: 0.39 2023-10-04 05:41:19,770 WARNING [train.py:1204] (2/4) Exclude cut with ID _14224_210_4381_1_1530529062037_4264010_379-184223-0_sp1.1 from training. Duration: 0.77275 2023-10-04 05:41:23,015 WARNING [train.py:1204] (2/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_454-123002-0_sp1.1 from training. Duration: 0.62725 2023-10-04 05:41:23,068 WARNING [train.py:1204] (2/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_887-249555-0_sp1.1 from training. Duration: 0.7 2023-10-04 05:41:26,181 WARNING [train.py:1204] (2/4) Exclude cut with ID _31088_210_16446_1_1532520012284_3440919_20-260902-0_sp1.1 from training. Duration: 0.97275 2023-10-04 05:41:26,216 WARNING [train.py:1204] (2/4) Exclude cut with ID _45329_210_7005_1_1534157951211_6774339_856-344574-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 05:41:26,218 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_36756_210_7234_1_1533026991_966276_4-164118-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 05:41:26,239 WARNING [train.py:1204] (2/4) Exclude cut with ID _8365_210_2064_1_1528592311070_4677690_371-122295-0_sp1.1 from training. Duration: 0.5 2023-10-04 05:41:27,671 WARNING [train.py:1204] (2/4) Exclude cut with ID _5910_210_1389_1_1527337972454_5050100_555-159815-0_sp1.1 from training. Duration: 0.8909375 2023-10-04 05:41:27,680 WARNING [train.py:1204] (2/4) Exclude cut with ID _37086_210_9038_1_1532999540891_7021250_469-147941-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 05:41:29,031 WARNING [train.py:1204] (2/4) Exclude cut with ID _3067_210_2667_1_1526212974640_4503168_459-173867-0_sp0.9 from training. Duration: 0.97775 2023-10-04 05:41:30,359 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9156W0006-572499 from training. Duration: 0.896 2023-10-04 05:41:31,737 WARNING [train.py:1204] (2/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_277-210798-0 from training. Duration: 0.55 2023-10-04 05:41:33,172 WARNING [train.py:1204] (2/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_103-336064-0_sp1.1 from training. Duration: 0.463625 2023-10-04 05:41:34,924 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_3414_210_1988_1_1526212718_4813195_331-290945-0_sp1.1 from training. Duration: 0.936375 2023-10-04 05:41:34,925 WARNING [train.py:1204] (2/4) Exclude cut with ID _67499_210_17282_1_1535509805220_3692430_365-29276-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 05:41:36,914 WARNING [train.py:1204] (2/4) Exclude cut with ID _14821_210_8750_1_1530680503991_6829359_69-250847-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 05:41:36,916 WARNING [train.py:1204] (2/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_1102-61301-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 05:41:38,413 WARNING [train.py:1204] (2/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_166-246557-0_sp1.1 from training. Duration: 0.5818125 2023-10-04 05:41:38,455 WARNING [train.py:1204] (2/4) Exclude cut with ID _15351_210_10626_1_1530856635653_3576210_214-129918-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 05:41:38,467 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_120-41668-0_sp0.9 from training. Duration: 0.77775 2023-10-04 05:41:39,807 WARNING [train.py:1204] (2/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_27-72278-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 05:41:42,474 WARNING [train.py:1204] (2/4) Exclude cut with ID _24314_210_4915_1_1532135078116_7170179_521-258868-0_sp1.1 from training. Duration: 0.7181875 2023-10-04 05:41:43,964 WARNING [train.py:1204] (2/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_143-86344-0 from training. Duration: 0.77 2023-10-04 05:41:43,990 WARNING [train.py:1204] (2/4) Exclude cut with ID _53489_210_6228_1_1534298357169_3835970_465-157433-0_sp1.1 from training. Duration: 0.92725 2023-10-04 05:41:44,068 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_248-319353-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 05:41:45,380 INFO [train.py:1046] (2/4) Epoch 44, batch 3350, loss[loss=0.188, simple_loss=0.2564, pruned_loss=0.05977, over 19342.00 frames. ], tot_loss[loss=0.1549, simple_loss=0.2351, pruned_loss=0.03736, over 4712431.59 frames. ], batch size: 388, lr: 2.31e-03, grad_scale: 16.0 2023-10-04 05:41:46,826 WARNING [train.py:1204] (2/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_102-77064-0_sp1.1 from training. Duration: 0.8454375 2023-10-04 05:41:46,861 WARNING [train.py:1204] (2/4) Exclude cut with ID _9389_210_1664_1_1529217337301_5289269_379-128794-0_sp1.1 from training. Duration: 0.77275 2023-10-04 05:41:48,359 WARNING [train.py:1204] (2/4) Exclude cut with ID _3231_210_2106_1_1526205864934_6784220_236-260835-0_sp1.1 from training. Duration: 0.97275 2023-10-04 05:41:48,648 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.ff2_skip_rate, batch_count=1545146.6666666667, ans=0.0 2023-10-04 05:41:49,723 WARNING [train.py:1204] (2/4) Exclude cut with ID _24271_210_12658_1_1531987270022_7190519_184-342363-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 05:41:49,724 WARNING [train.py:1204] (2/4) Exclude cut with ID _27894_210_12392_1_1532415496078_6902439_67-221663-0_sp1.1 from training. Duration: 0.92725 2023-10-04 05:41:53,017 WARNING [train.py:1204] (2/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_304-59227-0_sp1.1 from training. Duration: 0.7 2023-10-04 05:41:54,855 WARNING [train.py:1204] (2/4) Exclude cut with ID _58605_210_12210_1_1534723195939_3904259_458-42232-0_sp1.1 from training. Duration: 0.92725 2023-10-04 05:41:56,222 WARNING [train.py:1204] (2/4) Exclude cut with ID _14922_210_6228_1_1530707435273_3693310_13-165512-0_sp0.9 from training. Duration: 0.911125 2023-10-04 05:41:58,943 WARNING [train.py:1204] (2/4) Exclude cut with ID _58602_210_2346_1_1534903210085_6908830_792-102241-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 05:42:01,608 WARNING [train.py:1204] (2/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_588-125165-0_sp0.9 from training. Duration: 0.92225 2023-10-04 05:42:01,718 WARNING [train.py:1204] (2/4) Exclude cut with ID _54218_210_12210_1_1534413646388_4409259_239-98429-0_sp1.1 from training. Duration: 0.97275 2023-10-04 05:42:01,766 WARNING [train.py:1204] (2/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_538-32291-0_sp1.1 from training. Duration: 0.7545625 2023-10-04 05:42:03,212 WARNING [train.py:1204] (2/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_419-317062-0 from training. Duration: 0.91 2023-10-04 05:42:05,419 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.bypass.skip_rate, batch_count=1545213.3333333333, ans=0.07 2023-10-04 05:42:06,885 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9309W0202-640329 from training. Duration: 0.896 2023-10-04 05:42:06,924 WARNING [train.py:1204] (2/4) Exclude cut with ID _3231_210_2106_1_1526205864934_6784220_419-313291-0_sp1.1 from training. Duration: 0.97275 2023-10-04 05:42:09,621 WARNING [train.py:1204] (2/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_619-344499-0 from training. Duration: 0.63 2023-10-04 05:42:09,632 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_278-272612-0 from training. Duration: 0.73 2023-10-04 05:42:11,015 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_103-84010-0_sp0.9 from training. Duration: 0.7666875 2023-10-04 05:42:11,035 WARNING [train.py:1204] (2/4) Exclude cut with ID _26528_210_9070_1_1532570410842_3851310_497-97218-0_sp1.1 from training. Duration: 0.7818125 2023-10-04 05:42:12,376 WARNING [train.py:1204] (2/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_262-84613-0_sp1.1 from training. Duration: 0.936375 2023-10-04 05:42:12,407 WARNING [train.py:1204] (2/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_387-131538-0 from training. Duration: 0.81 2023-10-04 05:42:13,727 WARNING [train.py:1204] (2/4) Exclude cut with ID _8030_210_7497_1_1528444886781_3531089_224-300489-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 05:42:13,750 WARNING [train.py:1204] (2/4) Exclude cut with ID _14349_210_8341_1_1530615710818_3442470_170-336541-0_sp0.9 from training. Duration: 0.9555625 2023-10-04 05:42:15,069 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_47069_210_15368_1_1533781267_4431320_180-31746-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 05:42:16,575 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.0.feed_forward2.hidden_balancer.prob, batch_count=1545280.0, ans=0.125 2023-10-04 05:42:17,834 WARNING [train.py:1204] (2/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_447-7041-0_sp1.1 from training. Duration: 0.92725 2023-10-04 05:42:17,860 WARNING [train.py:1204] (2/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_2-55283-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 05:42:19,181 WARNING [train.py:1204] (2/4) Exclude cut with ID _12555_210_8750_1_1530075594402_3780239_70-181911-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 05:42:21,948 WARNING [train.py:1204] (2/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_311-328022-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 05:42:23,408 WARNING [train.py:1204] (2/4) Exclude cut with ID _14528_210_8254_1_1530669700514_4306939_330-2987-0_sp1.1 from training. Duration: 0.92725 2023-10-04 05:42:25,285 WARNING [train.py:1204] (2/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_385-184858-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 05:42:29,890 WARNING [train.py:1204] (2/4) Exclude cut with ID _64414_210_17301_1_1535243252048_3757759_348-333360-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 05:42:29,960 WARNING [train.py:1204] (2/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_753-214751-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 05:42:31,373 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_17613_210_2151_1_1531137569_2366160_202-208013-0_sp1.1 from training. Duration: 0.92725 2023-10-04 05:42:32,627 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_43831_210_2158_1_1533725502_5872791_610-129300-0_sp1.1 from training. Duration: 0.936375 2023-10-04 05:42:35,366 WARNING [train.py:1204] (2/4) Exclude cut with ID _54218_210_12210_1_1534413646388_4409259_321-23852-0_sp1.1 from training. Duration: 0.936375 2023-10-04 05:42:37,255 WARNING [train.py:1204] (2/4) Exclude cut with ID _39411_210_5986_1_1533538825030_3343649_173-238248-0 from training. Duration: 0.83 2023-10-04 05:42:37,262 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_405-339986-0_sp0.9 from training. Duration: 0.5666875 2023-10-04 05:42:39,014 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_2009_210_2071_1_1525481134_4588937_385-302100-0 from training. Duration: 0.87 2023-10-04 05:42:39,062 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_161-276996-0_sp1.1 from training. Duration: 0.863625 2023-10-04 05:42:39,145 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_177-58005-0 from training. Duration: 0.62 2023-10-04 05:42:40,559 WARNING [train.py:1204] (2/4) Exclude cut with ID _64639_210_5816_1_1535367619437_3528000_146-343734-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 05:42:42,016 WARNING [train.py:1204] (2/4) Exclude cut with ID _3845_210_1390_1_1526731326141_3523610_72-61441-0_sp1.1 from training. Duration: 0.92725 2023-10-04 05:42:47,630 WARNING [train.py:1204] (2/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_991-125028-0_sp1.1 from training. Duration: 0.936375 2023-10-04 05:42:47,693 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7544_210_6753_1_1528591327_1471426_172-348777-0 from training. Duration: 0.66 2023-10-04 05:42:48,967 WARNING [train.py:1204] (2/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_416-257730-0_sp1.1 from training. Duration: 0.5909375 2023-10-04 05:42:50,314 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1113-7369-0_sp1.1 from training. Duration: 0.77275 2023-10-04 05:42:51,735 WARNING [train.py:1204] (2/4) Exclude cut with ID _40402_210_18219_1_1533522324115_2953150_242-76943-0_sp1.1 from training. Duration: 0.8545625 2023-10-04 05:42:52,933 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.583e+02 2.103e+02 2.289e+02 2.774e+02 3.662e+02, threshold=4.578e+02, percent-clipped=0.0 2023-10-04 05:42:56,409 WARNING [train.py:1204] (2/4) Exclude cut with ID _44718_210_5816_1_1534050051869_6469030_190-49648-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 05:42:56,663 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.conv_module2.balancer1.prob, batch_count=1545413.3333333333, ans=0.125 2023-10-04 05:42:59,551 INFO [train.py:1046] (2/4) Epoch 44, batch 3400, loss[loss=0.1679, simple_loss=0.2444, pruned_loss=0.04567, over 23698.00 frames. ], tot_loss[loss=0.1559, simple_loss=0.2362, pruned_loss=0.03779, over 4705324.50 frames. ], batch size: 232, lr: 2.30e-03, grad_scale: 16.0 2023-10-04 05:42:59,621 WARNING [train.py:1204] (2/4) Exclude cut with ID _47251_210_18628_1_1533814063128_4137250_214-88076-0 from training. Duration: 0.88 2023-10-04 05:42:59,659 WARNING [train.py:1204] (2/4) Exclude cut with ID _5585_210_2667_1_1527418813324_8007340_89-332072-0_sp1.1 from training. Duration: 0.5818125 2023-10-04 05:43:00,896 WARNING [train.py:1204] (2/4) Exclude cut with ID _41817_210_18835_1_1533434477160_7423049_555-293448-0_sp1.1 from training. Duration: 0.72725 2023-10-04 05:43:01,013 WARNING [train.py:1204] (2/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_286-331314-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 05:43:01,158 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.1.conv_module1.balancer2.prob, batch_count=1545480.0, ans=0.125 2023-10-04 05:43:02,329 WARNING [train.py:1204] (2/4) Exclude cut with ID _13921_210_9994_1_1530838702800_3003429_46-149444-0 from training. Duration: 0.94 2023-10-04 05:43:02,402 WARNING [train.py:1204] (2/4) Exclude cut with ID _44778_210_2054_1_1533798127203_7443090_768-43595-0_sp1.1 from training. Duration: 0.936375 2023-10-04 05:43:02,411 WARNING [train.py:1204] (2/4) Exclude cut with ID _35355_210_16040_1_1533212988977_4016229_74-190076-0 from training. Duration: 0.8 2023-10-04 05:43:03,833 WARNING [train.py:1204] (2/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_1007-316380-0_sp1.1 from training. Duration: 0.963625 2023-10-04 05:43:03,959 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.1.conv_module1.balancer1.prob, batch_count=1545480.0, ans=0.125 2023-10-04 05:43:05,179 WARNING [train.py:1204] (2/4) Exclude cut with ID _23557_210_8750_1_1531890106370_6846255_150-283437-0_sp1.1 from training. Duration: 0.963625 2023-10-04 05:43:07,105 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_3474_210_2028_1_1526188513_4825246_145-309131-0_sp1.1 from training. Duration: 0.67275 2023-10-04 05:43:07,208 WARNING [train.py:1204] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_391-22148-0_sp1.1 from training. Duration: 0.7 2023-10-04 05:43:07,227 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_238-256668-0 from training. Duration: 0.69 2023-10-04 05:43:10,492 WARNING [train.py:1204] (2/4) Exclude cut with ID _8394_210_6702_1_1528590504316_3540189_372-198878-0 from training. Duration: 0.6 2023-10-04 05:43:10,503 WARNING [train.py:1204] (2/4) Exclude cut with ID ID0174W0006-766659 from training. Duration: 0.953 2023-10-04 05:43:10,519 WARNING [train.py:1204] (2/4) Exclude cut with ID _6274_210_2084_1_1527937583837_7494020_668-23019-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 05:43:13,552 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.conv_skip_rate, batch_count=1545546.6666666667, ans=0.0 2023-10-04 05:43:14,580 WARNING [train.py:1204] (2/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_322-74328-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 05:43:14,586 WARNING [train.py:1204] (2/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_872-264525-0_sp1.1 from training. Duration: 0.5909375 2023-10-04 05:43:14,626 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_53378_210_17282_1_1534221071_5769509_641-286201-0_sp1.1 from training. Duration: 0.97275 2023-10-04 05:43:17,309 WARNING [train.py:1204] (2/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_199-295611-0_sp1.1 from training. Duration: 0.5 2023-10-04 05:43:21,420 WARNING [train.py:1204] (2/4) Exclude cut with ID _5250_210_5219_1_1527249561806_8692110_1270-317971-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 05:43:22,881 WARNING [train.py:1204] (2/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_784-213099-0 from training. Duration: 0.93 2023-10-04 05:43:23,079 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.attention_skip_rate, batch_count=1545546.6666666667, ans=0.0 2023-10-04 05:43:27,587 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_115-233929-0_sp0.9 from training. Duration: 0.8 2023-10-04 05:43:29,659 WARNING [train.py:1204] (2/4) Exclude cut with ID _54871_210_22356_1_1534665798253_3856359_256-223809-0_sp1.1 from training. Duration: 0.97275 2023-10-04 05:43:31,034 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_20120_210_6753_1_1531643035_5482859_285-87781-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 05:43:31,125 WARNING [train.py:1204] (2/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_619-344499-0_sp1.1 from training. Duration: 0.57275 2023-10-04 05:43:37,535 WARNING [train.py:1204] (2/4) Exclude cut with ID _60229_210_27767_1_1534838406158_3655350_264-279573-0_sp0.9 from training. Duration: 0.92225 2023-10-04 05:43:40,437 WARNING [train.py:1204] (2/4) Exclude cut with ID _7719_210_3525_1_1528456153163_4296960_333-110399-0 from training. Duration: 0.96 2023-10-04 05:43:41,971 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=1545613.3333333333, ans=0.1 2023-10-04 05:43:47,304 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_50423_210_13270_1_1534128598_5021734_759-90833-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 05:43:47,365 WARNING [train.py:1204] (2/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_151-254798-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 05:43:47,927 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.4.encoder.layers.0.nonlin_attention.whiten2, num_groups=1, num_channels=384, metric=9.56 vs. limit=15.0 2023-10-04 05:43:48,610 WARNING [train.py:1204] (2/4) Exclude cut with ID _3465_210_3525_1_1526175667060_3574049_450-203637-0 from training. Duration: 0.92 2023-10-04 05:43:48,645 WARNING [train.py:1204] (2/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_673-36456-0_sp1.1 from training. Duration: 0.936375 2023-10-04 05:43:48,695 WARNING [train.py:1204] (2/4) Exclude cut with ID _36538_210_9994_1_1533204020363_3795620_131-8685-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 05:43:49,909 WARNING [train.py:1204] (2/4) Exclude cut with ID _12556_210_8341_1_1530081024100_7327690_713-55626-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 05:43:49,956 WARNING [train.py:1204] (2/4) Exclude cut with ID _2322_210_2667_1_1525604548971_3656299_440-80494-0_sp0.9 from training. Duration: 0.97775 2023-10-04 05:43:52,752 WARNING [train.py:1204] (2/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_210-325081-0_sp1.1 from training. Duration: 0.97275 2023-10-04 05:43:56,091 WARNING [train.py:1204] (2/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_375-244976-0_sp0.9 from training. Duration: 0.8333125 2023-10-04 05:43:56,097 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_15517_210_6753_1_1530952293_1664795_119-31001-0_sp1.1 from training. Duration: 0.8545625 2023-10-04 05:43:59,127 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.conv_skip_rate, batch_count=1545746.6666666667, ans=0.0 2023-10-04 05:44:02,217 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_41452_210_7291_1_1533437687_3941281_319-42326-0_sp1.1 from training. Duration: 0.92725 2023-10-04 05:44:03,552 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_372-173012-0 from training. Duration: 0.95 2023-10-04 05:44:09,971 WARNING [train.py:1204] (2/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_41-31157-0_sp1.1 from training. Duration: 0.5181875 2023-10-04 05:44:11,549 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.bypass_mid.scale_min, batch_count=1545746.6666666667, ans=0.2 2023-10-04 05:44:13,861 INFO [train.py:1046] (2/4) Epoch 44, batch 3450, loss[loss=0.1517, simple_loss=0.2412, pruned_loss=0.03114, over 24650.00 frames. ], tot_loss[loss=0.1558, simple_loss=0.2365, pruned_loss=0.03754, over 4707528.34 frames. ], batch size: 73, lr: 2.30e-03, grad_scale: 16.0 2023-10-04 05:44:15,313 WARNING [train.py:1204] (2/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_91-212486-0 from training. Duration: 0.68 2023-10-04 05:44:18,278 WARNING [train.py:1204] (2/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_1267-272693-0 from training. Duration: 0.73 2023-10-04 05:44:18,313 WARNING [train.py:1204] (2/4) Exclude cut with ID _13938_210_9988_1_1530518718978_3693960_152-67046-0_sp1.1 from training. Duration: 0.936375 2023-10-04 05:44:18,474 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.conv_module2.balancer1.prob, batch_count=1545813.3333333333, ans=0.125 2023-10-04 05:44:20,979 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_4528_210_3273_1_1527076654_3276777_342-168068-0_sp1.1 from training. Duration: 0.7909375 2023-10-04 05:44:20,993 WARNING [train.py:1204] (2/4) Exclude cut with ID _7606_210_4915_1_1528894253277_5080690_127-177761-0 from training. Duration: 0.64 2023-10-04 05:44:22,335 WARNING [train.py:1204] (2/4) Exclude cut with ID _45329_210_7005_1_1534157951211_6774339_335-137972-0_sp1.1 from training. Duration: 0.92725 2023-10-04 05:44:25,177 WARNING [train.py:1204] (2/4) Exclude cut with ID _5166_210_2084_1_1527073197160_4087993_331-117829-0_sp1.1 from training. Duration: 0.563625 2023-10-04 05:44:25,438 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.balancer1.prob, batch_count=1545813.3333333333, ans=0.125 2023-10-04 05:44:29,883 WARNING [train.py:1204] (2/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_125-125818-0_sp1.1 from training. Duration: 0.87275 2023-10-04 05:44:29,940 WARNING [train.py:1204] (2/4) Exclude cut with ID _6789_210_1534_1_1527854471671_2870519_48-294897-0_sp1.1 from training. Duration: 0.963625 2023-10-04 05:44:31,335 WARNING [train.py:1204] (2/4) Exclude cut with ID _6694_210_4599_1_1527852637490_3965529_116-109398-0_sp1.1 from training. Duration: 0.663625 2023-10-04 05:44:33,076 WARNING [train.py:1204] (2/4) Exclude cut with ID _52892_210_6721_1_1534378941259_4124129_222-172685-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 05:44:34,563 WARNING [train.py:1204] (2/4) Exclude cut with ID _50655_210_18628_1_1534229942193_3772154_320-132275-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 05:44:34,926 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=1545880.0, ans=0.1 2023-10-04 05:44:40,035 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.feed_forward1.hidden_balancer.prob, batch_count=1545880.0, ans=0.125 2023-10-04 05:44:41,071 WARNING [train.py:1204] (2/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_76-311963-0 from training. Duration: 0.7 2023-10-04 05:44:45,402 WARNING [train.py:1204] (2/4) Exclude cut with ID _6274_210_2084_1_1527937583837_7494020_32-205288-0 from training. Duration: 0.78 2023-10-04 05:44:45,425 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_86-16881-0_sp0.9 from training. Duration: 0.6444375 2023-10-04 05:44:46,739 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_271-297505-0_sp1.1 from training. Duration: 0.9 2023-10-04 05:44:47,472 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.3.encoder.layers.1.conv_module1.whiten, num_groups=1, num_channels=512, metric=3.30 vs. limit=15.0 2023-10-04 05:44:48,068 WARNING [train.py:1204] (2/4) Exclude cut with ID _57572_210_5517_1_1534921219624_3809484_187-295293-0_sp1.1 from training. Duration: 0.963625 2023-10-04 05:44:53,576 WARNING [train.py:1204] (2/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_563-329943-0 from training. Duration: 0.99 2023-10-04 05:44:53,648 WARNING [train.py:1204] (2/4) Exclude cut with ID _7160_210_1876_1_1527940923231_3703799_302-150728-0_sp0.9 from training. Duration: 0.8666875 2023-10-04 05:44:57,247 WARNING [train.py:1204] (2/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_926-7526-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 05:44:57,447 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.conv_module2.balancer2.min_positive, batch_count=1546013.3333333333, ans=0.05 2023-10-04 05:44:58,384 WARNING [train.py:1204] (2/4) Exclude cut with ID _62574_210_2655_1_1535180407058_3933249_467-22348-0_sp1.1 from training. Duration: 0.936375 2023-10-04 05:44:58,480 WARNING [train.py:1204] (2/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_280-161633-0_sp0.9 from training. Duration: 0.811125 2023-10-04 05:44:59,834 WARNING [train.py:1204] (2/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_385-193052-0_sp1.1 from training. Duration: 0.7818125 2023-10-04 05:45:01,342 WARNING [train.py:1204] (2/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_444-28041-0 from training. Duration: 0.85 2023-10-04 05:45:01,354 WARNING [train.py:1204] (2/4) Exclude cut with ID _66070_210_7640_1_1535441393096_7805310_341-249237-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 05:45:01,444 WARNING [train.py:1204] (2/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_425-269826-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 05:45:04,671 WARNING [train.py:1204] (2/4) Exclude cut with ID _53950_210_5816_1_1534417264919_3862951_124-185597-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 05:45:06,577 WARNING [train.py:1204] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_380-204756-0 from training. Duration: 0.86 2023-10-04 05:45:11,449 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.balancer2.prob, batch_count=1546013.3333333333, ans=0.125 2023-10-04 05:45:12,507 WARNING [train.py:1204] (2/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_282-257791-0_sp0.9 from training. Duration: 0.9555625 2023-10-04 05:45:16,653 WARNING [train.py:1204] (2/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_372-131163-0_sp1.1 from training. Duration: 0.8545625 2023-10-04 05:45:18,039 WARNING [train.py:1204] (2/4) Exclude cut with ID _28317_210_12742_1_1532480302381_8510325_447-233513-0_sp1.1 from training. Duration: 0.963625 2023-10-04 05:45:20,680 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_54-118333-0_sp1.1 from training. Duration: 0.97275 2023-10-04 05:45:22,080 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.658e+02 2.019e+02 2.264e+02 2.676e+02 4.035e+02, threshold=4.528e+02, percent-clipped=0.0 2023-10-04 05:45:24,317 WARNING [train.py:1204] (2/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_388-230239-0_sp1.1 from training. Duration: 0.963625 2023-10-04 05:45:24,329 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_40047_210_2290_1_1533290519_2374311_141-54639-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 05:45:24,391 WARNING [train.py:1204] (2/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_91-267696-0_sp0.9 from training. Duration: 0.9666875 2023-10-04 05:45:25,511 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_14056_210_4163_1_1530604483_7572827_532-137847-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 05:45:28,337 INFO [train.py:1046] (2/4) Epoch 44, batch 3500, loss[loss=0.1629, simple_loss=0.2493, pruned_loss=0.03819, over 24002.00 frames. ], tot_loss[loss=0.1546, simple_loss=0.2346, pruned_loss=0.03729, over 4707685.98 frames. ], batch size: 86, lr: 2.30e-03, grad_scale: 16.0 2023-10-04 05:45:29,816 WARNING [train.py:1204] (2/4) Exclude cut with ID _8064_210_5399_1_1528372299291_4282973_155-283231-0_sp1.1 from training. Duration: 0.97275 2023-10-04 05:45:33,053 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_2245_210_2715_1_1525511548_3540254_310-52483-0_sp1.1 from training. Duration: 0.8 2023-10-04 05:45:34,391 WARNING [train.py:1204] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_185-174805-0 from training. Duration: 0.68 2023-10-04 05:45:36,295 WARNING [train.py:1204] (2/4) Exclude cut with ID _15012_210_1968_1_1530856820429_5095350_662-210469-0_sp1.1 from training. Duration: 0.5181875 2023-10-04 05:45:39,591 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_426-313557-0_sp0.9 from training. Duration: 0.588875 2023-10-04 05:45:41,072 WARNING [train.py:1204] (2/4) Exclude cut with ID _42326_210_7770_1_1533814282531_3707269_407-204343-0_sp1.1 from training. Duration: 0.97275 2023-10-04 05:45:41,080 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_258-310570-0 from training. Duration: 0.82 2023-10-04 05:45:46,454 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_4737_210_2040_1_1526998689_3293008_378-178650-0_sp1.1 from training. Duration: 0.863625 2023-10-04 05:45:46,543 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_47896_210_20508_1_1534492518_5958868_432-327451-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 05:45:47,807 WARNING [train.py:1204] (2/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_188-114464-0_sp1.1 from training. Duration: 0.7454375 2023-10-04 05:45:47,817 WARNING [train.py:1204] (2/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_798-106179-0_sp1.1 from training. Duration: 0.7545625 2023-10-04 05:45:47,848 WARNING [train.py:1204] (2/4) Exclude cut with ID _14368_210_4220_1_1530532714870_3595529_143-117421-0_sp1.1 from training. Duration: 0.52725 2023-10-04 05:45:47,888 WARNING [train.py:1204] (2/4) Exclude cut with ID _67499_210_17282_1_1535509805220_3692430_361-78786-0_sp1.1 from training. Duration: 0.963625 2023-10-04 05:45:47,922 WARNING [train.py:1204] (2/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_712-342563-0_sp1.1 from training. Duration: 0.8818125 2023-10-04 05:45:49,227 WARNING [train.py:1204] (2/4) Exclude cut with ID _9235_210_5219_1_1528891154253_8046120_387-159008-0 from training. Duration: 0.86 2023-10-04 05:45:52,059 WARNING [train.py:1204] (2/4) Exclude cut with ID _33640_210_2655_1_1533117605479_4855469_959-18820-0_sp1.1 from training. Duration: 0.963625 2023-10-04 05:45:52,103 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_12868_210_6753_1_1530702479_3610564_133-564-0_sp0.9 from training. Duration: 0.87775 2023-10-04 05:45:54,318 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.feed_forward3.hidden_balancer.prob, batch_count=1546213.3333333333, ans=0.125 2023-10-04 05:45:55,258 WARNING [train.py:1204] (2/4) Exclude cut with ID _23514_210_1389_1_1531832139785_3031439_12-29287-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 05:45:59,392 WARNING [train.py:1204] (2/4) Exclude cut with ID _23514_210_1389_1_1531832139785_3031439_489-185052-0_sp1.1 from training. Duration: 0.963625 2023-10-04 05:46:00,741 WARNING [train.py:1204] (2/4) Exclude cut with ID _5444_210_2151_1_1527418808464_5312210_634-194174-0 from training. Duration: 0.97 2023-10-04 05:46:00,768 WARNING [train.py:1204] (2/4) Exclude cut with ID _6275_210_2084_1_1528110062213_7669049_91-26919-0_sp1.1 from training. Duration: 0.8818125 2023-10-04 05:46:04,144 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_37351_210_7925_1_1533012931_3812113_516-82303-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 05:46:05,492 WARNING [train.py:1204] (2/4) Exclude cut with ID _41314_210_12651_1_1533459476543_3766460_80-74184-0_sp0.9 from training. Duration: 0.92225 2023-10-04 05:46:06,819 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_9460_210_4211_1_1528954587_10049667_495-202963-0_sp1.1 from training. Duration: 0.963625 2023-10-04 05:46:08,860 WARNING [train.py:1204] (2/4) Exclude cut with ID _14632_210_9988_1_1530691279392_3923200_332-105904-0_sp1.1 from training. Duration: 0.8181875 2023-10-04 05:46:08,882 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_40764_210_1925_1_1533882359_3752345_493-167886-0_sp1.1 from training. Duration: 0.92725 2023-10-04 05:46:10,166 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_196-289637-0 from training. Duration: 0.75 2023-10-04 05:46:13,364 WARNING [train.py:1204] (2/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_26-175553-0 from training. Duration: 0.61 2023-10-04 05:46:13,421 WARNING [train.py:1204] (2/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_890-5117-0 from training. Duration: 0.78 2023-10-04 05:46:14,739 WARNING [train.py:1204] (2/4) Exclude cut with ID _41314_210_12651_1_1533459476543_3766460_206-306860-0_sp1.1 from training. Duration: 0.92725 2023-10-04 05:46:15,560 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.2.encoder.layers.0.self_attn1.whiten, num_groups=1, num_channels=384, metric=19.05 vs. limit=22.5 2023-10-04 05:46:16,143 WARNING [train.py:1204] (2/4) Exclude cut with ID _25882_210_3036_1_1532775665815_6479510_570-79824-0_sp1.1 from training. Duration: 0.963625 2023-10-04 05:46:17,506 WARNING [train.py:1204] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_731-2428-0_sp1.1 from training. Duration: 0.7545625 2023-10-04 05:46:17,536 WARNING [train.py:1204] (2/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_138-154812-0_sp0.9 from training. Duration: 0.8666875 2023-10-04 05:46:17,822 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.conv_skip_rate, batch_count=1546346.6666666667, ans=0.0 2023-10-04 05:46:18,596 INFO [scaling.py:1022] (2/4) Whitening: name=encoder_embed.convnext.out_whiten, num_groups=1, num_channels=128, metric=4.66 vs. limit=5.0 2023-10-04 05:46:19,062 WARNING [train.py:1204] (2/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_178-39722-0_sp0.9 from training. Duration: 0.5444375 2023-10-04 05:46:20,442 WARNING [train.py:1204] (2/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_1285-256622-0_sp0.9 from training. Duration: 0.9333125 2023-10-04 05:46:24,589 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_41428_210_2161_1_1533774184_3992532_65-271323-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 05:46:26,579 WARNING [train.py:1204] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_308-161067-0 from training. Duration: 0.66 2023-10-04 05:46:26,584 WARNING [train.py:1204] (2/4) Exclude cut with ID _3067_210_2667_1_1526212974640_4503168_459-173867-0 from training. Duration: 0.88 2023-10-04 05:46:26,585 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_2221_210_1988_1_1525597051_4935541_415-296882-0_sp1.1 from training. Duration: 0.7 2023-10-04 05:46:27,865 WARNING [train.py:1204] (2/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_354-192266-0_sp1.1 from training. Duration: 0.936375 2023-10-04 05:46:29,265 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_258-310570-0_sp0.9 from training. Duration: 0.911125 2023-10-04 05:46:31,858 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_548-141039-0_sp1.1 from training. Duration: 0.963625 2023-10-04 05:46:33,412 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_15101_210_7927_1_1530786415_5033274_320-327527-0 from training. Duration: 0.96 2023-10-04 05:46:34,753 WARNING [train.py:1204] (2/4) Exclude cut with ID _2895_210_2477_1_1525956785794_4063750_289-11475-0_sp0.9 from training. Duration: 0.911125 2023-10-04 05:46:35,014 INFO [scaling.py:1118] (2/4) WithLoss: name=encoder.encoders.2.encoder.layers.2.self_attn_weights, loss-sum=0.000e+00 2023-10-04 05:46:36,695 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7543_210_6753_1_1528543318_4552_1-85072-0_sp1.1 from training. Duration: 0.7545625 2023-10-04 05:46:38,022 WARNING [train.py:1204] (2/4) Exclude cut with ID _64639_210_5816_1_1535367619437_3528000_461-329156-0 from training. Duration: 0.96 2023-10-04 05:46:41,383 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_211-339622-0 from training. Duration: 0.45 2023-10-04 05:46:43,390 INFO [train.py:1046] (2/4) Epoch 44, batch 3550, loss[loss=0.1546, simple_loss=0.2329, pruned_loss=0.03816, over 24432.00 frames. ], tot_loss[loss=0.1531, simple_loss=0.2329, pruned_loss=0.03665, over 4707696.48 frames. ], batch size: 58, lr: 2.30e-03, grad_scale: 8.0 2023-10-04 05:46:43,468 WARNING [train.py:1204] (2/4) Exclude cut with ID _20455_210_5219_1_1531565996775_8098100_402-112739-0_sp1.1 from training. Duration: 0.963625 2023-10-04 05:46:43,561 WARNING [train.py:1204] (2/4) Exclude cut with ID _53489_210_6228_1_1534298357169_3835970_256-115565-0_sp1.1 from training. Duration: 0.936375 2023-10-04 05:46:43,583 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_26326_210_7925_1_1533002493_3592406_360-305260-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 05:46:44,209 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.3.encoder.layers.0.self_attn_weights.whiten_keys, num_groups=8, num_channels=256, metric=4.90 vs. limit=6.0 2023-10-04 05:46:44,859 WARNING [train.py:1204] (2/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_18-204383-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 05:46:47,015 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.2.encoder.layers.0.self_attn2.whiten, num_groups=1, num_channels=384, metric=18.82 vs. limit=22.5 2023-10-04 05:46:47,670 WARNING [train.py:1204] (2/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_306-232480-0_sp1.1 from training. Duration: 0.87275 2023-10-04 05:46:56,938 WARNING [train.py:1204] (2/4) Exclude cut with ID _41161_210_20034_1_1533431249657_3555314_116-299186-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 05:46:57,060 WARNING [train.py:1204] (2/4) Exclude cut with ID _13913_210_9994_1_1530579615187_3383897_473-277682-0_sp1.1 from training. Duration: 0.3545625 2023-10-04 05:46:59,701 WARNING [train.py:1204] (2/4) Exclude cut with ID _58895_210_8750_1_1535184081320_3649089_49-225702-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 05:46:59,801 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.0.bypass_mid.scale_min, batch_count=1546546.6666666667, ans=0.2 2023-10-04 05:47:01,041 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_3080_210_2071_1_1526086102_4365166_312-8726-0_sp1.1 from training. Duration: 0.82725 2023-10-04 05:47:03,738 WARNING [train.py:1204] (2/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_489-190175-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 05:47:03,809 WARNING [train.py:1204] (2/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_345-95717-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 05:47:03,821 WARNING [train.py:1204] (2/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_85-138257-0_sp1.1 from training. Duration: 0.6454375 2023-10-04 05:47:06,658 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7046_210_6753_1_1527895337_6920917_783-278109-0_sp1.1 from training. Duration: 0.7 2023-10-04 05:47:06,691 WARNING [train.py:1204] (2/4) Exclude cut with ID _3465_210_3525_1_1526175667060_3574049_450-203637-0_sp1.1 from training. Duration: 0.836375 2023-10-04 05:47:08,483 WARNING [train.py:1204] (2/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_416-132893-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 05:47:08,500 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_2655_210_2087_1_1525688959_3897400_437-337690-0_sp0.9 from training. Duration: 0.7 2023-10-04 05:47:08,594 WARNING [train.py:1204] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_700-183549-0_sp1.1 from training. Duration: 0.6181875 2023-10-04 05:47:16,280 WARNING [train.py:1204] (2/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_191-59784-0_sp1.1 from training. Duration: 0.8 2023-10-04 05:47:16,292 WARNING [train.py:1204] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_218-295097-0_sp1.1 from training. Duration: 0.7 2023-10-04 05:47:17,733 WARNING [train.py:1204] (2/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_310-109461-0_sp1.1 from training. Duration: 0.72725 2023-10-04 05:47:17,743 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_33348_210_5632_1_1533086912_3324030_131-321149-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 05:47:19,284 WARNING [train.py:1204] (2/4) Exclude cut with ID _64135_210_2253_1_1535677267876_7608972_133-188266-0_sp1.1 from training. Duration: 0.736375 2023-10-04 05:47:19,308 WARNING [train.py:1204] (2/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_505-205270-0 from training. Duration: 0.38 2023-10-04 05:47:19,318 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_67223_210_4238_1_1535612120_2537431_102-88248-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 05:47:21,893 WARNING [train.py:1204] (2/4) Exclude cut with ID _54871_210_22356_1_1534665798253_3856359_60-235433-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 05:47:21,971 WARNING [train.py:1204] (2/4) Exclude cut with ID _5189_210_4557_1_1527248319428_3769229_199-53526-0_sp1.1 from training. Duration: 0.3818125 2023-10-04 05:47:30,489 WARNING [train.py:1204] (2/4) Exclude cut with ID _45329_210_7005_1_1534157951211_6774339_439-90979-0_sp1.1 from training. Duration: 0.963625 2023-10-04 05:47:30,557 WARNING [train.py:1204] (2/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_1162-132880-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 05:47:31,774 WARNING [train.py:1204] (2/4) Exclude cut with ID _56423_210_5854_1_1534896061049_3165969_220-22317-0_sp1.1 from training. Duration: 0.963625 2023-10-04 05:47:32,052 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.bypass_mid.scale_min, batch_count=1546680.0, ans=0.2 2023-10-04 05:47:32,682 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.2.encoder.layers.1.self_attn_weights.whiten_keys, num_groups=4, num_channels=128, metric=2.97 vs. limit=6.0 2023-10-04 05:47:33,405 WARNING [train.py:1204] (2/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_909-64151-0 from training. Duration: 0.94 2023-10-04 05:47:34,802 WARNING [train.py:1204] (2/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_465-156500-0_sp0.9 from training. Duration: 0.8 2023-10-04 05:47:34,903 WARNING [train.py:1204] (2/4) Exclude cut with ID _44304_210_15374_1_1533639352765_4613129_302-22084-0 from training. Duration: 0.86 2023-10-04 05:47:36,180 WARNING [train.py:1204] (2/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_663-166973-0_sp1.1 from training. Duration: 0.72725 2023-10-04 05:47:36,488 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.bypass_mid.scale_min, batch_count=1546680.0, ans=0.2 2023-10-04 05:47:38,868 WARNING [train.py:1204] (2/4) Exclude cut with ID _45329_210_7005_1_1534157951211_6774339_503-287883-0_sp0.9 from training. Duration: 0.97775 2023-10-04 05:47:38,885 WARNING [train.py:1204] (2/4) Exclude cut with ID _60057_210_10640_1_1534903065793_3333150_362-230377-0_sp0.9 from training. Duration: 0.9666875 2023-10-04 05:47:40,397 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_4183_210_2087_1_1526729100_1869838_143-280962-0 from training. Duration: 0.56 2023-10-04 05:47:42,355 WARNING [train.py:1204] (2/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_281-1313-0_sp1.1 from training. Duration: 0.936375 2023-10-04 05:47:47,798 WARNING [train.py:1204] (2/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_64-215118-0_sp1.1 from training. Duration: 0.936375 2023-10-04 05:47:49,030 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_2822_210_2715_1_1526112586_1999587_165-36656-0 from training. Duration: 0.89 2023-10-04 05:47:50,358 WARNING [train.py:1204] (2/4) Exclude cut with ID _22961_210_8254_1_1531976366672_4278180_402-208830-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 05:47:54,356 WARNING [train.py:1204] (2/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_98-193494-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 05:47:55,575 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.606e+02 2.057e+02 2.396e+02 2.886e+02 4.470e+02, threshold=4.792e+02, percent-clipped=0.0 2023-10-04 05:47:55,722 WARNING [train.py:1204] (2/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_383-285196-0 from training. Duration: 0.97 2023-10-04 05:47:59,997 INFO [train.py:1046] (2/4) Epoch 44, batch 3600, loss[loss=0.1592, simple_loss=0.2359, pruned_loss=0.04121, over 23544.00 frames. ], tot_loss[loss=0.1529, simple_loss=0.2325, pruned_loss=0.03664, over 4690643.25 frames. ], batch size: 285, lr: 2.30e-03, grad_scale: 16.0 2023-10-04 05:48:02,097 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6767_210_3273_1_1527855783_1718932_142-127132-0 from training. Duration: 0.95 2023-10-04 05:48:02,128 WARNING [train.py:1204] (2/4) Exclude cut with ID _39867_210_9826_1_1533553222030_9344259_1018-339635-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 05:48:02,204 WARNING [train.py:1204] (2/4) Exclude cut with ID _14603_210_4220_1_1530615688441_3097319_274-274798-0_sp1.1 from training. Duration: 0.8909375 2023-10-04 05:48:04,809 WARNING [train.py:1204] (2/4) Exclude cut with ID _2334_210_2667_1_1525608238880_4705129_376-263393-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 05:48:04,854 WARNING [train.py:1204] (2/4) Exclude cut with ID _29303_210_2575_1_1532392237303_7410649_363-344675-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 05:48:06,195 WARNING [train.py:1204] (2/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_58-68956-0_sp1.1 from training. Duration: 0.7545625 2023-10-04 05:48:09,148 WARNING [train.py:1204] (2/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_709-311571-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 05:48:11,673 WARNING [train.py:1204] (2/4) Exclude cut with ID _46067_210_4220_1_1534125571066_3307450_136-257407-0_sp1.1 from training. Duration: 0.963625 2023-10-04 05:48:11,761 WARNING [train.py:1204] (2/4) Exclude cut with ID _6493_210_1747_1_1528459210424_4012470_324-323854-0_sp1.1 from training. Duration: 0.836375 2023-10-04 05:48:13,052 WARNING [train.py:1204] (2/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_234-260682-0_sp1.1 from training. Duration: 0.763625 2023-10-04 05:48:15,002 WARNING [train.py:1204] (2/4) Exclude cut with ID _36634_210_1794_1_1533296352450_1399884_55-119451-0_sp1.1 from training. Duration: 0.963625 2023-10-04 05:48:15,014 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_40047_210_2290_1_1533290519_2374311_195-165693-0 from training. Duration: 0.89 2023-10-04 05:48:17,071 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_5522_210_3273_1_1527249606_3909096_366-172508-0_sp0.9 from training. Duration: 0.7333125 2023-10-04 05:48:20,353 WARNING [train.py:1204] (2/4) Exclude cut with ID _21878_210_3525_1_1531738772002_7264330_187-10504-0_sp1.1 from training. Duration: 0.963625 2023-10-04 05:48:22,059 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.conv_skip_rate, batch_count=1546880.0, ans=0.0 2023-10-04 05:48:23,184 WARNING [train.py:1204] (2/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_282-257791-0_sp1.1 from training. Duration: 0.7818125 2023-10-04 05:48:25,931 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_43831_210_2158_1_1533725502_5872791_467-288776-0_sp1.1 from training. Duration: 0.92725 2023-10-04 05:48:27,241 WARNING [train.py:1204] (2/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_138-154812-0_sp1.1 from training. Duration: 0.7090625 2023-10-04 05:48:27,276 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_1154-185930-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 05:48:27,302 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_2655_210_2087_1_1525688959_3897400_52-279899-0 from training. Duration: 0.69 2023-10-04 05:48:28,676 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_141-89113-0_sp1.1 from training. Duration: 0.7818125 2023-10-04 05:48:31,447 WARNING [train.py:1204] (2/4) Exclude cut with ID _55134_210_21982_1_1534590028377_3882170_297-47310-0_sp1.1 from training. Duration: 0.963625 2023-10-04 05:48:31,520 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_486-151131-0_sp1.1 from training. Duration: 0.77275 2023-10-04 05:48:34,632 WARNING [train.py:1204] (2/4) Exclude cut with ID _44778_210_2054_1_1533798127203_7443090_21-232980-0_sp1.1 from training. Duration: 0.936375 2023-10-04 05:48:34,764 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6165_210_3528_1_1527590982_4379841_338-106380-0_sp1.1 from training. Duration: 0.92725 2023-10-04 05:48:36,128 WARNING [train.py:1204] (2/4) Exclude cut with ID _55162_210_17071_1_1534408248583_3753480_355-257218-0_sp1.1 from training. Duration: 0.97275 2023-10-04 05:48:36,213 WARNING [train.py:1204] (2/4) Exclude cut with ID _2895_210_2477_1_1525956785794_4063750_289-11475-0 from training. Duration: 0.82 2023-10-04 05:48:40,352 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.0.conv_skip_rate, batch_count=1546946.6666666667, ans=0.0 2023-10-04 05:48:44,374 WARNING [train.py:1204] (2/4) Exclude cut with ID _16756_210_4915_1_1531047803763_7104259_839-183431-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 05:48:46,316 WARNING [train.py:1204] (2/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_427-208901-0_sp0.9 from training. Duration: 0.7555625 2023-10-04 05:48:48,145 WARNING [train.py:1204] (2/4) Exclude cut with ID _36512_210_6987_1_1533033143998_3126069_181-306500-0 from training. Duration: 0.95 2023-10-04 05:48:51,575 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.1.ff3_skip_rate, batch_count=1547013.3333333333, ans=0.0 2023-10-04 05:48:52,780 WARNING [train.py:1204] (2/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_28-70987-0_sp1.1 from training. Duration: 0.7909375 2023-10-04 05:48:55,797 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.feed_forward1.hidden_balancer.prob, batch_count=1547013.3333333333, ans=0.125 2023-10-04 05:48:58,231 WARNING [train.py:1204] (2/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_590-58996-0_sp1.1 from training. Duration: 0.936375 2023-10-04 05:49:01,155 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_48730_210_5399_1_1533885886_2212769_108-47561-0_sp1.1 from training. Duration: 0.936375 2023-10-04 05:49:04,217 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.bypass.skip_rate, batch_count=1547080.0, ans=0.04949747468305833 2023-10-04 05:49:05,327 WARNING [train.py:1204] (2/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_401-158239-0_sp0.9 from training. Duration: 0.788875 2023-10-04 05:49:05,348 WARNING [train.py:1204] (2/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_278-154492-0_sp1.1 from training. Duration: 0.6909375 2023-10-04 05:49:05,354 WARNING [train.py:1204] (2/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_137-116442-0 from training. Duration: 0.78 2023-10-04 05:49:07,264 WARNING [train.py:1204] (2/4) Exclude cut with ID _3541_210_2477_1_1526475408709_3263973_189-317158-0 from training. Duration: 0.9 2023-10-04 05:49:08,642 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_47896_210_20508_1_1534492518_5958868_558-314572-0 from training. Duration: 0.97 2023-10-04 05:49:09,918 WARNING [train.py:1204] (2/4) Exclude cut with ID _66070_210_7640_1_1535441393096_7805310_290-40284-0_sp1.1 from training. Duration: 0.97275 2023-10-04 05:49:09,957 WARNING [train.py:1204] (2/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_110-224688-0_sp1.1 from training. Duration: 0.8818125 2023-10-04 05:49:11,314 WARNING [train.py:1204] (2/4) Exclude cut with ID _7719_210_3525_1_1528456153163_4296960_88-250259-0 from training. Duration: 0.84 2023-10-04 05:49:12,636 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8453_210_4211_1_1528502940_8562730_1157-114102-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 05:49:12,667 WARNING [train.py:1204] (2/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_317-175062-0_sp1.1 from training. Duration: 0.7454375 2023-10-04 05:49:12,679 WARNING [train.py:1204] (2/4) Exclude cut with ID _29857_210_20034_1_1532395886753_3798160_254-347254-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 05:49:13,944 INFO [train.py:1046] (2/4) Epoch 44, batch 3650, loss[loss=0.1762, simple_loss=0.2618, pruned_loss=0.04532, over 24357.00 frames. ], tot_loss[loss=0.1534, simple_loss=0.2336, pruned_loss=0.03659, over 4703042.97 frames. ], batch size: 77, lr: 2.30e-03, grad_scale: 8.0 2023-10-04 05:49:14,005 WARNING [train.py:1204] (2/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_28-70987-0 from training. Duration: 0.87 2023-10-04 05:49:14,075 WARNING [train.py:1204] (2/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_321-335475-0 from training. Duration: 0.45 2023-10-04 05:49:17,278 WARNING [train.py:1204] (2/4) Exclude cut with ID _40901_210_5665_1_1533363789958_3856130_356-107330-0_sp1.1 from training. Duration: 0.936375 2023-10-04 05:49:19,095 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_3780_210_2087_1_1526467389_2145572_176-107140-0 from training. Duration: 0.69 2023-10-04 05:49:20,031 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.0.layers.1.whiten, num_groups=1, num_channels=192, metric=3.70 vs. limit=12.0 2023-10-04 05:49:21,313 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.conv_skip_rate, batch_count=1547146.6666666667, ans=0.0 2023-10-04 05:49:23,827 WARNING [train.py:1204] (2/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_80-135200-0 from training. Duration: 0.79 2023-10-04 05:49:23,943 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_33348_210_5632_1_1533086912_3324030_85-28583-0_sp0.9 from training. Duration: 0.911125 2023-10-04 05:49:24,458 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.4.encoder.layers.0.self_attn_weights.whiten_keys, num_groups=4, num_channels=128, metric=1.70 vs. limit=6.0 2023-10-04 05:49:26,744 WARNING [train.py:1204] (2/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_29-293896-0 from training. Duration: 0.61 2023-10-04 05:49:28,188 WARNING [train.py:1204] (2/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_194-252800-0 from training. Duration: 0.97 2023-10-04 05:49:30,085 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.5.encoder.layers.1.nonlin_attention.whiten2, num_groups=1, num_channels=256, metric=4.40 vs. limit=15.0 2023-10-04 05:49:31,004 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_360-52446-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 05:49:31,006 WARNING [train.py:1204] (2/4) Exclude cut with ID _9389_210_1664_1_1529217337301_5289269_289-213417-0_sp0.9 from training. Duration: 0.988875 2023-10-04 05:49:31,059 WARNING [train.py:1204] (2/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_143-86344-0_sp0.9 from training. Duration: 0.8555625 2023-10-04 05:49:34,426 WARNING [train.py:1204] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_520-126729-0_sp0.9 from training. Duration: 0.77775 2023-10-04 05:49:35,759 WARNING [train.py:1204] (2/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_76-308663-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 05:49:35,827 WARNING [train.py:1204] (2/4) Exclude cut with ID _8021_210_7994_1_1528376451358_3653069_100-140231-0 from training. Duration: 0.67 2023-10-04 05:49:37,179 WARNING [train.py:1204] (2/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_387-131538-0_sp0.9 from training. Duration: 0.9 2023-10-04 05:49:37,216 WARNING [train.py:1204] (2/4) Exclude cut with ID _6275_210_2084_1_1528110062213_7669049_365-182054-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 05:49:37,248 WARNING [train.py:1204] (2/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_345-340690-0 from training. Duration: 0.93 2023-10-04 05:49:38,603 WARNING [train.py:1204] (2/4) Exclude cut with ID _5409_210_1388_1_1527292850189_5865191_178-127970-0_sp0.9 from training. Duration: 0.6333125 2023-10-04 05:49:38,641 WARNING [train.py:1204] (2/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_352-12734-0_sp1.1 from training. Duration: 0.92725 2023-10-04 05:49:38,644 WARNING [train.py:1204] (2/4) Exclude cut with ID _65134_210_14763_1_1535335393985_3505368_121-129373-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 05:49:41,482 WARNING [train.py:1204] (2/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_557-324258-0_sp1.1 from training. Duration: 0.736375 2023-10-04 05:49:44,270 WARNING [train.py:1204] (2/4) Exclude cut with ID _48678_210_5663_1_1533988830947_3769199_306-255886-0 from training. Duration: 0.77 2023-10-04 05:49:44,363 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7544_210_6753_1_1528587000_691468_87-178599-0 from training. Duration: 0.87 2023-10-04 05:49:46,301 WARNING [train.py:1204] (2/4) Exclude cut with ID _2941_210_2577_1_1526199673150_2489759_208-236402-0_sp1.1 from training. Duration: 0.963625 2023-10-04 05:49:49,523 WARNING [train.py:1204] (2/4) Exclude cut with ID _38670_210_15379_1_1533448810659_4391059_65-169397-0 from training. Duration: 0.97 2023-10-04 05:49:50,036 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.5.encoder.layers.0.conv_module1.whiten, num_groups=1, num_channels=256, metric=6.50 vs. limit=15.0 2023-10-04 05:49:50,887 WARNING [train.py:1204] (2/4) Exclude cut with ID _30304_210_6319_1_1532476827261_7582060_306-328175-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 05:49:50,896 WARNING [train.py:1204] (2/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_212-149947-0_sp1.1 from training. Duration: 0.663625 2023-10-04 05:49:57,755 WARNING [train.py:1204] (2/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_321-22821-0_sp1.1 from training. Duration: 0.8090625 2023-10-04 05:49:59,163 WARNING [train.py:1204] (2/4) Exclude cut with ID _44010_210_4525_1_1533884262964_4013620_483-69110-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 05:50:00,471 WARNING [train.py:1204] (2/4) Exclude cut with ID _14922_210_6228_1_1530707435273_3693310_38-187851-0_sp1.1 from training. Duration: 0.563625 2023-10-04 05:50:00,763 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.feed_forward3.hidden_balancer.prob, batch_count=1547346.6666666667, ans=0.125 2023-10-04 05:50:00,770 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.feed_forward1.out_proj.dropout_p, batch_count=1547346.6666666667, ans=0.1 2023-10-04 05:50:01,933 WARNING [train.py:1204] (2/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_129-337802-0_sp0.9 from training. Duration: 0.8 2023-10-04 05:50:02,011 WARNING [train.py:1204] (2/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_268-87909-0_sp1.1 from training. Duration: 0.9 2023-10-04 05:50:04,883 WARNING [train.py:1204] (2/4) Exclude cut with ID _21878_210_3525_1_1531738772002_7264330_559-184862-0_sp1.1 from training. Duration: 0.8545625 2023-10-04 05:50:07,043 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder_embed.convnext.out_balancer.prob, batch_count=1547346.6666666667, ans=0.125 2023-10-04 05:50:08,204 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_46032_210_14656_1_1533781412_4507969_237-120766-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 05:50:09,654 WARNING [train.py:1204] (2/4) Exclude cut with ID _28427_210_4220_1_1532581240967_3061069_330-170906-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 05:50:09,659 WARNING [train.py:1204] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_401-51578-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 05:50:11,051 WARNING [train.py:1204] (2/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_209-205365-0_sp1.1 from training. Duration: 0.7181875 2023-10-04 05:50:12,481 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_23801_210_16223_1_1532066392_3294657_57-242170-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 05:50:12,541 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_127-37371-0_sp1.1 from training. Duration: 0.936375 2023-10-04 05:50:16,820 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9171W0019-578905 from training. Duration: 0.896 2023-10-04 05:50:20,160 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_24605_210_1794_1_1532047863_4149755_349-49056-0_sp1.1 from training. Duration: 0.92725 2023-10-04 05:50:20,171 WARNING [train.py:1204] (2/4) Exclude cut with ID _12302_210_5219_1_1529924446466_8107280_897-21237-0_sp1.1 from training. Duration: 0.936375 2023-10-04 05:50:22,149 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_9460_210_4211_1_1528954587_10049667_480-260269-0_sp0.9 from training. Duration: 0.62225 2023-10-04 05:50:22,198 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_348-152548-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 05:50:23,453 WARNING [train.py:1204] (2/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_301-330455-0_sp0.9 from training. Duration: 0.888875 2023-10-04 05:50:24,779 WARNING [train.py:1204] (2/4) Exclude cut with ID _61008_210_10633_1_1534852769198_6974370_345-91705-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 05:50:26,098 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.699e+02 1.956e+02 2.107e+02 2.308e+02 3.469e+02, threshold=4.214e+02, percent-clipped=0.0 2023-10-04 05:50:26,200 WARNING [train.py:1204] (2/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_304-273433-0 from training. Duration: 0.99 2023-10-04 05:50:26,203 WARNING [train.py:1204] (2/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_359-345972-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 05:50:27,679 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_2296_210_2087_1_1525503448_3397335_57-185430-0_sp1.1 from training. Duration: 0.5454375 2023-10-04 05:50:28,998 INFO [train.py:1046] (2/4) Epoch 44, batch 3700, loss[loss=0.1637, simple_loss=0.2496, pruned_loss=0.03892, over 24037.00 frames. ], tot_loss[loss=0.1543, simple_loss=0.2344, pruned_loss=0.03711, over 4693369.37 frames. ], batch size: 80, lr: 2.30e-03, grad_scale: 8.0 2023-10-04 05:50:30,456 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_1256-120974-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 05:50:31,867 WARNING [train.py:1204] (2/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_372-30302-0_sp0.9 from training. Duration: 0.9555625 2023-10-04 05:50:33,413 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.conv_skip_rate, batch_count=1547480.0, ans=0.0 2023-10-04 05:50:34,629 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_42012_210_7927_1_1533439936_3871556_451-344262-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 05:50:34,629 WARNING [train.py:1204] (2/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_28-324729-0 from training. Duration: 0.94 2023-10-04 05:50:36,393 WARNING [train.py:1204] (2/4) Exclude cut with ID _64135_210_2253_1_1535677267876_7608972_313-18394-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 05:50:36,481 WARNING [train.py:1204] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_620-276793-0_sp0.9 from training. Duration: 0.6555625 2023-10-04 05:50:37,888 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7544_210_6753_1_1528591327_1471426_172-348777-0_sp0.9 from training. Duration: 0.7333125 2023-10-04 05:50:39,380 WARNING [train.py:1204] (2/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_332-221057-0_sp0.9 from training. Duration: 0.7555625 2023-10-04 05:50:42,187 WARNING [train.py:1204] (2/4) Exclude cut with ID _19023_210_4353_1_1531378892552_5820780_5-300035-0_sp1.1 from training. Duration: 0.92725 2023-10-04 05:50:43,527 WARNING [train.py:1204] (2/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_343-309023-0_sp1.1 from training. Duration: 0.963625 2023-10-04 05:50:43,606 WARNING [train.py:1204] (2/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_37-41919-0_sp1.1 from training. Duration: 0.7909375 2023-10-04 05:50:44,898 WARNING [train.py:1204] (2/4) Exclude cut with ID _3541_210_2477_1_1526475408709_3263973_41-182910-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 05:50:46,281 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_229-45300-0_sp0.9 from training. Duration: 0.6333125 2023-10-04 05:50:48,148 WARNING [train.py:1204] (2/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_40-214716-0_sp1.1 from training. Duration: 0.963625 2023-10-04 05:50:48,251 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9309W0203-640330 from training. Duration: 0.896 2023-10-04 05:50:53,883 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.0.layers.1.self_attn1.whiten, num_groups=1, num_channels=192, metric=9.13 vs. limit=22.5 2023-10-04 05:50:56,300 WARNING [train.py:1204] (2/4) Exclude cut with ID _60229_210_27767_1_1534838406158_3655350_264-279573-0_sp1.1 from training. Duration: 0.7545625 2023-10-04 05:50:57,545 WARNING [train.py:1204] (2/4) Exclude cut with ID _8394_210_6702_1_1528590504316_3540189_372-198878-0_sp0.9 from training. Duration: 0.6666875 2023-10-04 05:50:58,922 WARNING [train.py:1204] (2/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_359-118926-0_sp1.1 from training. Duration: 0.6545625 2023-10-04 05:51:00,215 WARNING [train.py:1204] (2/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_85-65056-0 from training. Duration: 0.96 2023-10-04 05:51:00,241 WARNING [train.py:1204] (2/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_89-242335-0_sp1.1 from training. Duration: 0.863625 2023-10-04 05:51:01,048 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.1.encoder.layers.0.conv_module2.whiten, num_groups=1, num_channels=256, metric=10.89 vs. limit=15.0 2023-10-04 05:51:03,033 WARNING [train.py:1204] (2/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_128-49444-0_sp1.1 from training. Duration: 0.936375 2023-10-04 05:51:04,342 WARNING [train.py:1204] (2/4) Exclude cut with ID _30327_210_16049_1_1532484161295_3924169_401-70819-0 from training. Duration: 0.86 2023-10-04 05:51:06,267 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_41822_210_13270_1_1533955976_4379284_260-256391-0_sp1.1 from training. Duration: 0.936375 2023-10-04 05:51:07,636 WARNING [train.py:1204] (2/4) Exclude cut with ID _67335_210_5219_1_1535525975652_4275229_599-247317-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 05:51:10,423 WARNING [train.py:1204] (2/4) Exclude cut with ID _29857_210_20034_1_1532395886753_3798160_209-239267-0_sp1.1 from training. Duration: 0.936375 2023-10-04 05:51:11,700 WARNING [train.py:1204] (2/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_46-207599-0_sp1.1 from training. Duration: 0.5818125 2023-10-04 05:51:13,221 WARNING [train.py:1204] (2/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_912-282412-0_sp1.1 from training. Duration: 0.4454375 2023-10-04 05:51:14,943 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.feed_forward1.hidden_balancer.prob, batch_count=1547680.0, ans=0.125 2023-10-04 05:51:16,025 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_4183_210_2087_1_1526729100_1869838_141-166600-0_sp1.1 from training. Duration: 0.863625 2023-10-04 05:51:16,030 WARNING [train.py:1204] (2/4) Exclude cut with ID _15037_210_2253_1_1530752294104_3267650_296-192597-0 from training. Duration: 0.81 2023-10-04 05:51:17,407 WARNING [train.py:1204] (2/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_571-220562-0_sp1.1 from training. Duration: 0.963625 2023-10-04 05:51:17,439 WARNING [train.py:1204] (2/4) Exclude cut with ID _5287_210_2084_1_1527418883040_3878670_12-221186-0 from training. Duration: 0.98 2023-10-04 05:51:22,052 WARNING [train.py:1204] (2/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_149-85602-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 05:51:22,235 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.feed_forward3.hidden_balancer.prob, batch_count=1547680.0, ans=0.125 2023-10-04 05:51:23,378 WARNING [train.py:1204] (2/4) Exclude cut with ID _9235_210_5219_1_1528891154253_8046120_1003-101638-0_sp1.1 from training. Duration: 0.663625 2023-10-04 05:51:25,516 WARNING [train.py:1204] (2/4) Exclude cut with ID _55844_210_2346_1_1534471212569_7625707_1099-33689-0_sp1.1 from training. Duration: 0.92725 2023-10-04 05:51:26,808 WARNING [train.py:1204] (2/4) Exclude cut with ID _7606_210_4915_1_1528894253277_5080690_301-10732-0 from training. Duration: 0.81 2023-10-04 05:51:28,725 WARNING [train.py:1204] (2/4) Exclude cut with ID _22826_210_9994_1_1531908201522_3279169_403-34698-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 05:51:28,740 WARNING [train.py:1204] (2/4) Exclude cut with ID _8480_210_1874_1_1528589391205_6728330_755-112625-0_sp0.9 from training. Duration: 0.82225 2023-10-04 05:51:28,761 WARNING [train.py:1204] (2/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_527-238990-0_sp1.1 from training. Duration: 0.8181875 2023-10-04 05:51:29,988 WARNING [train.py:1204] (2/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_426-7328-0_sp1.1 from training. Duration: 0.92725 2023-10-04 05:51:31,639 WARNING [train.py:1204] (2/4) Exclude cut with ID _3541_210_2477_1_1526475408709_3263973_189-317158-0_sp1.1 from training. Duration: 0.8181875 2023-10-04 05:51:33,059 WARNING [train.py:1204] (2/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_162-103620-0 from training. Duration: 0.93 2023-10-04 05:51:33,167 WARNING [train.py:1204] (2/4) Exclude cut with ID _22083_210_8254_1_1531702862439_3678970_49-234546-0 from training. Duration: 0.83 2023-10-04 05:51:34,472 WARNING [train.py:1204] (2/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_370-331600-0_sp1.1 from training. Duration: 0.8545625 2023-10-04 05:51:34,492 WARNING [train.py:1204] (2/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_166-59042-0_sp1.1 from training. Duration: 0.97275 2023-10-04 05:51:36,491 WARNING [train.py:1204] (2/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_138-332773-0_sp1.1 from training. Duration: 0.736375 2023-10-04 05:51:37,652 WARNING [train.py:1204] (2/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_90-30673-0_sp0.9 from training. Duration: 0.9333125 2023-10-04 05:51:39,090 WARNING [train.py:1204] (2/4) Exclude cut with ID _27967_210_12140_1_1532588391859_8419130_737-41898-0_sp1.1 from training. Duration: 0.936375 2023-10-04 05:51:40,441 WARNING [train.py:1204] (2/4) Exclude cut with ID _8833_210_5219_1_1528592665210_7553059_329-125156-0_sp1.1 from training. Duration: 0.7454375 2023-10-04 05:51:41,862 WARNING [train.py:1204] (2/4) Exclude cut with ID _41817_210_18835_1_1533434477160_7423049_352-42537-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 05:51:43,116 INFO [train.py:1046] (2/4) Epoch 44, batch 3750, loss[loss=0.1604, simple_loss=0.2463, pruned_loss=0.03724, over 24065.00 frames. ], tot_loss[loss=0.1553, simple_loss=0.2355, pruned_loss=0.03755, over 4684354.91 frames. ], batch size: 86, lr: 2.30e-03, grad_scale: 8.0 2023-10-04 05:51:44,510 WARNING [train.py:1204] (2/4) Exclude cut with ID _8365_210_2064_1_1528592311070_4677690_431-294102-0 from training. Duration: 0.89 2023-10-04 05:51:45,097 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.4.encoder.layers.1.self_attn2.whiten, num_groups=1, num_channels=384, metric=11.80 vs. limit=22.5 2023-10-04 05:51:45,816 WARNING [train.py:1204] (2/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_454-314901-0_sp1.1 from training. Duration: 0.4181875 2023-10-04 05:51:50,337 WARNING [train.py:1204] (2/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_80-135200-0_sp0.9 from training. Duration: 0.87775 2023-10-04 05:51:50,400 WARNING [train.py:1204] (2/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_181-78632-0 from training. Duration: 0.78 2023-10-04 05:51:51,733 WARNING [train.py:1204] (2/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_300-27235-0_sp0.9 from training. Duration: 0.9666875 2023-10-04 05:51:53,127 WARNING [train.py:1204] (2/4) Exclude cut with ID _47790_210_16187_1_1533862814066_4207519_96-177176-0_sp1.1 from training. Duration: 0.97275 2023-10-04 05:51:53,224 WARNING [train.py:1204] (2/4) Exclude cut with ID _35941_210_15379_1_1532995294515_4023089_246-348742-0_sp1.1 from training. Duration: 0.97275 2023-10-04 05:51:56,297 WARNING [train.py:1204] (2/4) Exclude cut with ID _38670_210_15379_1_1533448810659_4391059_65-169397-0_sp1.1 from training. Duration: 0.8818125 2023-10-04 05:51:59,619 WARNING [train.py:1204] (2/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_1003-99642-0_sp1.1 from training. Duration: 0.963625 2023-10-04 05:52:02,579 WARNING [train.py:1204] (2/4) Exclude cut with ID _40402_210_18219_1_1533522324115_2953150_320-14078-0_sp1.1 from training. Duration: 0.863625 2023-10-04 05:52:03,910 WARNING [train.py:1204] (2/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_367-61330-0_sp0.9 from training. Duration: 0.8333125 2023-10-04 05:52:06,730 WARNING [train.py:1204] (2/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_515-298514-0_sp1.1 from training. Duration: 0.92725 2023-10-04 05:52:11,301 WARNING [train.py:1204] (2/4) Exclude cut with ID _53731_210_7994_1_1534553979244_3757214_471-320815-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 05:52:11,355 WARNING [train.py:1204] (2/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_306-232480-0 from training. Duration: 0.96 2023-10-04 05:52:11,407 WARNING [train.py:1204] (2/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_478-162247-0_sp0.9 from training. Duration: 0.911125 2023-10-04 05:52:12,787 WARNING [train.py:1204] (2/4) Exclude cut with ID _56423_210_5854_1_1534896061049_3165969_345-284696-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 05:52:12,820 WARNING [train.py:1204] (2/4) Exclude cut with ID _44778_210_2054_1_1533798127203_7443090_600-122253-0_sp1.1 from training. Duration: 0.963625 2023-10-04 05:52:15,554 WARNING [train.py:1204] (2/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_199-295611-0 from training. Duration: 0.55 2023-10-04 05:52:19,705 WARNING [train.py:1204] (2/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_734-129761-0 from training. Duration: 0.82 2023-10-04 05:52:22,910 WARNING [train.py:1204] (2/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_349-169257-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 05:52:22,949 WARNING [train.py:1204] (2/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_202-67017-0_sp0.9 from training. Duration: 0.911125 2023-10-04 05:52:25,925 WARNING [train.py:1204] (2/4) Exclude cut with ID _16908_210_5724_1_1531119157248_3772700_61-258978-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 05:52:29,574 WARNING [train.py:1204] (2/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_172-156931-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 05:52:30,905 WARNING [train.py:1204] (2/4) Exclude cut with ID _4092_210_2151_1_1526643044449_3135759_356-278694-0_sp1.1 from training. Duration: 0.42725 2023-10-04 05:52:34,288 WARNING [train.py:1204] (2/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_116-332008-0 from training. Duration: 0.98 2023-10-04 05:52:37,138 WARNING [train.py:1204] (2/4) Exclude cut with ID _29003_210_19596_1_1532507391720_3894098_358-114789-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 05:52:38,662 WARNING [train.py:1204] (2/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_152-212533-0_sp1.1 from training. Duration: 0.8818125 2023-10-04 05:52:40,022 WARNING [train.py:1204] (2/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_267-214116-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 05:52:43,234 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7543_210_6753_1_1528541907_1399322_165-323619-0_sp0.9 from training. Duration: 0.7666875 2023-10-04 05:52:45,977 WARNING [train.py:1204] (2/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_577-332600-0_sp0.9 from training. Duration: 0.6333125 2023-10-04 05:52:48,589 WARNING [train.py:1204] (2/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_342-100157-0_sp0.9 from training. Duration: 0.6 2023-10-04 05:52:50,040 WARNING [train.py:1204] (2/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_141-89727-0_sp1.1 from training. Duration: 0.6454375 2023-10-04 05:52:51,889 WARNING [train.py:1204] (2/4) Exclude cut with ID _3992_210_1747_1_1526644816031_3731060_107-251434-0_sp1.1 from training. Duration: 0.9 2023-10-04 05:52:54,519 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.695e+02 2.028e+02 2.230e+02 2.527e+02 3.718e+02, threshold=4.460e+02, percent-clipped=0.0 2023-10-04 05:52:54,669 WARNING [train.py:1204] (2/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_401-53423-0_sp1.1 from training. Duration: 0.5 2023-10-04 05:52:57,957 INFO [train.py:1046] (2/4) Epoch 44, batch 3800, loss[loss=0.1696, simple_loss=0.2396, pruned_loss=0.04983, over 23754.00 frames. ], tot_loss[loss=0.1552, simple_loss=0.2354, pruned_loss=0.03755, over 4695313.29 frames. ], batch size: 164, lr: 2.30e-03, grad_scale: 8.0 2023-10-04 05:53:02,786 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7762_210_1794_1_1528199508_4057408_422-227034-0_sp1.1 from training. Duration: 0.663625 2023-10-04 05:53:06,726 WARNING [train.py:1204] (2/4) Exclude cut with ID _4184_210_2550_1_1526730192013_2219380_102-250299-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 05:53:06,792 WARNING [train.py:1204] (2/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_253-308377-0_sp0.9 from training. Duration: 0.5444375 2023-10-04 05:53:07,030 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.nonlin_attention.balancer.prob, batch_count=1548146.6666666667, ans=0.125 2023-10-04 05:53:08,110 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_271-297505-0 from training. Duration: 0.99 2023-10-04 05:53:09,453 WARNING [train.py:1204] (2/4) Exclude cut with ID _35941_210_15379_1_1532995294515_4023089_213-142973-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 05:53:10,897 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7544_210_6753_1_1528587722_3576686_162-63018-0_sp1.1 from training. Duration: 0.92725 2023-10-04 05:53:12,663 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_365-217119-0_sp0.9 from training. Duration: 0.7 2023-10-04 05:53:14,144 WARNING [train.py:1204] (2/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_662-275621-0_sp0.9 from training. Duration: 0.4666875 2023-10-04 05:53:14,151 WARNING [train.py:1204] (2/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_374-187555-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 05:53:15,644 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_289-180904-0_sp1.1 from training. Duration: 0.6909375 2023-10-04 05:53:17,097 WARNING [train.py:1204] (2/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_189-47206-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 05:53:17,136 WARNING [train.py:1204] (2/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_234-14174-0_sp1.1 from training. Duration: 0.7454375 2023-10-04 05:53:17,397 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=1548213.3333333333, ans=0.1 2023-10-04 05:53:18,433 WARNING [train.py:1204] (2/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_576-76621-0_sp1.1 from training. Duration: 0.963625 2023-10-04 05:53:18,516 WARNING [train.py:1204] (2/4) Exclude cut with ID _13253_210_1925_1_1530757775819_3577873_253-246386-0 from training. Duration: 0.9 2023-10-04 05:53:21,210 WARNING [train.py:1204] (2/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_372-6869-0_sp1.1 from training. Duration: 0.4 2023-10-04 05:53:21,276 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_21434_210_4238_1_1531916339_1222252_22-54954-0_sp1.1 from training. Duration: 0.936375 2023-10-04 05:53:21,459 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=1548213.3333333333, ans=0.1 2023-10-04 05:53:24,608 WARNING [train.py:1204] (2/4) Exclude cut with ID _29853_210_7497_1_1532505600851_6642110_492-170361-0_sp1.1 from training. Duration: 0.92725 2023-10-04 05:53:27,224 WARNING [train.py:1204] (2/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_505-97987-0_sp0.9 from training. Duration: 0.9555625 2023-10-04 05:53:28,511 WARNING [train.py:1204] (2/4) Exclude cut with ID _8365_210_2064_1_1528592311070_4677690_371-122295-0_sp0.9 from training. Duration: 0.611125 2023-10-04 05:53:30,437 WARNING [train.py:1204] (2/4) Exclude cut with ID _14268_210_9206_1_1530622771285_4714099_637-86630-0_sp1.1 from training. Duration: 0.463625 2023-10-04 05:53:30,451 WARNING [train.py:1204] (2/4) Exclude cut with ID _29857_210_20034_1_1532395886753_3798160_372-5678-0_sp1.1 from training. Duration: 0.963625 2023-10-04 05:53:31,800 WARNING [train.py:1204] (2/4) Exclude cut with ID _52892_210_6721_1_1534378941259_4124129_90-165108-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 05:53:31,875 WARNING [train.py:1204] (2/4) Exclude cut with ID _27052_210_5986_1_1532329197187_3585009_143-339198-0_sp1.1 from training. Duration: 0.963625 2023-10-04 05:53:34,772 WARNING [train.py:1204] (2/4) Exclude cut with ID _14268_210_9206_1_1530622771285_4714099_637-86630-0_sp0.9 from training. Duration: 0.5666875 2023-10-04 05:53:34,774 WARNING [train.py:1204] (2/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_401-255491-0 from training. Duration: 0.7 2023-10-04 05:53:35,035 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.bypass.scale_min, batch_count=1548280.0, ans=0.2 2023-10-04 05:53:37,538 WARNING [train.py:1204] (2/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_178-6946-0_sp1.1 from training. Duration: 0.97275 2023-10-04 05:53:44,904 WARNING [train.py:1204] (2/4) Exclude cut with ID _27485_210_4916_1_1532739191677_3668269_460-163885-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 05:53:47,990 WARNING [train.py:1204] (2/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_270-26053-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 05:53:50,851 WARNING [train.py:1204] (2/4) Exclude cut with ID _14170_210_5219_1_1530525949868_4469650_412-317077-0 from training. Duration: 0.75 2023-10-04 05:53:54,076 WARNING [train.py:1204] (2/4) Exclude cut with ID _5289_210_2084_1_1527678202736_3779299_142-103317-0 from training. Duration: 0.84 2023-10-04 05:53:55,359 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_150-143438-0_sp1.1 from training. Duration: 0.92725 2023-10-04 05:53:55,484 WARNING [train.py:1204] (2/4) Exclude cut with ID _27894_210_12392_1_1532415496078_6902439_668-269779-0_sp1.1 from training. Duration: 0.97275 2023-10-04 05:53:56,820 WARNING [train.py:1204] (2/4) Exclude cut with ID _18513_210_9206_1_1531618215223_3862620_56-222075-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 05:53:58,271 WARNING [train.py:1204] (2/4) Exclude cut with ID _4092_210_2151_1_1526643044449_3135759_356-278694-0 from training. Duration: 0.47 2023-10-04 05:54:00,753 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.2.encoder.layers.2.feed_forward2.out_whiten, num_groups=1, num_channels=384, metric=4.39 vs. limit=15.0 2023-10-04 05:54:02,932 WARNING [train.py:1204] (2/4) Exclude cut with ID _25808_210_13292_1_1532343514562_7366109_633-47524-0 from training. Duration: 0.99 2023-10-04 05:54:02,942 WARNING [train.py:1204] (2/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_872-264525-0 from training. Duration: 0.65 2023-10-04 05:54:02,977 WARNING [train.py:1204] (2/4) Exclude cut with ID _14632_210_9988_1_1530691279392_3923200_239-232545-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 05:54:04,370 WARNING [train.py:1204] (2/4) Exclude cut with ID _14349_210_8341_1_1530615710818_3442470_153-27432-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 05:54:10,503 WARNING [train.py:1204] (2/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_37-41919-0_sp0.9 from training. Duration: 0.9666875 2023-10-04 05:54:11,774 INFO [train.py:1046] (2/4) Epoch 44, batch 3850, loss[loss=0.1496, simple_loss=0.2308, pruned_loss=0.03422, over 24659.00 frames. ], tot_loss[loss=0.1543, simple_loss=0.2341, pruned_loss=0.03725, over 4687402.89 frames. ], batch size: 65, lr: 2.30e-03, grad_scale: 8.0 2023-10-04 05:54:11,838 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_196-289637-0_sp0.9 from training. Duration: 0.8333125 2023-10-04 05:54:17,261 WARNING [train.py:1204] (2/4) Exclude cut with ID _13678_210_7497_1_1530530069226_3463170_371-3221-0_sp1.1 from training. Duration: 0.8181875 2023-10-04 05:54:17,327 WARNING [train.py:1204] (2/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_557-324258-0 from training. Duration: 0.81 2023-10-04 05:54:18,704 WARNING [train.py:1204] (2/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_1012-279473-0_sp1.1 from training. Duration: 0.6090625 2023-10-04 05:54:20,047 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_66916_210_14656_1_1535725525_326566_7-92649-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 05:54:23,030 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_9460_210_4211_1_1528954587_10049667_480-260269-0_sp1.1 from training. Duration: 0.5090625 2023-10-04 05:54:26,325 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_40896_210_15710_1_1533433956_7100802_121-155001-0_sp1.1 from training. Duration: 0.92725 2023-10-04 05:54:28,196 WARNING [train.py:1204] (2/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_188-171673-0_sp1.1 from training. Duration: 0.52725 2023-10-04 05:54:29,509 WARNING [train.py:1204] (2/4) Exclude cut with ID _47790_210_16187_1_1533862814066_4207519_2-39653-0 from training. Duration: 0.98 2023-10-04 05:54:35,565 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_23850_210_4238_1_1532232597_1632433_112-229012-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 05:54:36,962 WARNING [train.py:1204] (2/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_626-79528-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 05:54:39,666 WARNING [train.py:1204] (2/4) Exclude cut with ID _30304_210_6319_1_1532476827261_7582060_856-90052-0_sp1.1 from training. Duration: 0.963625 2023-10-04 05:54:40,313 WARNING [train.py:1204] (2/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_122-109023-0_sp0.9 from training. Duration: 0.8444375 2023-10-04 05:54:42,980 WARNING [train.py:1204] (2/4) Exclude cut with ID _64414_210_17301_1_1535243252048_3757759_37-70328-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 05:54:43,042 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_20120_210_6753_1_1531643035_5482859_622-344907-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 05:54:43,099 WARNING [train.py:1204] (2/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_410-321956-0_sp1.1 from training. Duration: 0.92725 2023-10-04 05:54:43,111 WARNING [train.py:1204] (2/4) Exclude cut with ID _7562_210_2084_1_1528887767916_3946229_186-326422-0_sp0.9 from training. Duration: 0.9333125 2023-10-04 05:54:44,524 WARNING [train.py:1204] (2/4) Exclude cut with ID _37537_210_2253_1_1533171758568_3421949_127-285192-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 05:54:45,964 WARNING [train.py:1204] (2/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_723-228168-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 05:54:47,270 WARNING [train.py:1204] (2/4) Exclude cut with ID _13678_210_7497_1_1530530069226_3463170_426-246984-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 05:54:47,285 WARNING [train.py:1204] (2/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_399-156791-0_sp0.9 from training. Duration: 0.92225 2023-10-04 05:54:47,349 WARNING [train.py:1204] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_391-22148-0 from training. Duration: 0.77 2023-10-04 05:54:48,605 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_3475_210_2028_1_1526193578_3622569_64-297072-0 from training. Duration: 0.91 2023-10-04 05:54:48,673 WARNING [train.py:1204] (2/4) Exclude cut with ID _17875_210_2087_1_1531224012907_1821339_225-280015-0_sp1.1 from training. Duration: 0.963625 2023-10-04 05:54:48,690 WARNING [train.py:1204] (2/4) Exclude cut with ID _50655_210_18628_1_1534229942193_3772154_325-246169-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 05:54:51,522 WARNING [train.py:1204] (2/4) Exclude cut with ID _29529_210_7234_1_1532393961455_3977460_597-257348-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 05:54:53,275 WARNING [train.py:1204] (2/4) Exclude cut with ID _58895_210_8750_1_1535184081320_3649089_77-173485-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 05:54:54,534 WARNING [train.py:1204] (2/4) Exclude cut with ID _6270_210_2084_1_1527850911573_3856420_26-277343-0 from training. Duration: 0.82 2023-10-04 05:54:55,919 WARNING [train.py:1204] (2/4) Exclude cut with ID _14368_210_4220_1_1530532714870_3595529_144-3287-0 from training. Duration: 0.67 2023-10-04 05:54:57,377 WARNING [train.py:1204] (2/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_293-145278-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 05:54:59,433 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_40047_210_2290_1_1533290519_2374311_257-335162-0 from training. Duration: 0.95 2023-10-04 05:55:00,819 WARNING [train.py:1204] (2/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_619-344499-0_sp0.9 from training. Duration: 0.7 2023-10-04 05:55:05,499 WARNING [train.py:1204] (2/4) Exclude cut with ID _19504_210_9994_1_1531648863242_3325220_456-315379-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 05:55:05,701 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.feed_forward1.hidden_balancer.prob, batch_count=1548680.0, ans=0.125 2023-10-04 05:55:06,808 WARNING [train.py:1204] (2/4) Exclude cut with ID _55857_210_2346_1_1534644007831_6898209_631-221692-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 05:55:12,710 WARNING [train.py:1204] (2/4) Exclude cut with ID _64631_210_27767_1_1535349580068_4165160_324-3682-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 05:55:12,738 WARNING [train.py:1204] (2/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_317-211107-0 from training. Duration: 0.92 2023-10-04 05:55:15,448 WARNING [train.py:1204] (2/4) Exclude cut with ID _6789_210_1534_1_1527857937218_3118950_311-61262-0 from training. Duration: 0.65 2023-10-04 05:55:16,160 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.4.encoder.layers.0.conv_module1.whiten, num_groups=1, num_channels=384, metric=4.56 vs. limit=15.0 2023-10-04 05:55:16,849 WARNING [train.py:1204] (2/4) Exclude cut with ID _28323_210_16187_1_1532480395667_4100300_365-68882-0_sp1.1 from training. Duration: 0.92725 2023-10-04 05:55:16,889 WARNING [train.py:1204] (2/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_816-181825-0_sp1.1 from training. Duration: 0.92725 2023-10-04 05:55:19,623 WARNING [train.py:1204] (2/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_132-269633-0_sp1.1 from training. Duration: 0.6181875 2023-10-04 05:55:19,634 WARNING [train.py:1204] (2/4) Exclude cut with ID _44585_210_18628_1_1533605350387_4518199_280-73534-0_sp1.1 from training. Duration: 0.8090625 2023-10-04 05:55:19,683 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_68095_210_20185_1_1535718226_8888855_1009-164755-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 05:55:21,061 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_40223_210_6228_1_1533298404_4812267_400-103813-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 05:55:21,062 WARNING [train.py:1204] (2/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_491-261923-0_sp1.1 from training. Duration: 0.8909375 2023-10-04 05:55:21,068 WARNING [train.py:1204] (2/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_712-342563-0 from training. Duration: 0.97 2023-10-04 05:55:21,153 WARNING [train.py:1204] (2/4) Exclude cut with ID _41161_210_20034_1_1533431249657_3555314_175-190032-0_sp1.1 from training. Duration: 0.963625 2023-10-04 05:55:22,992 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.515e+02 1.962e+02 2.115e+02 2.398e+02 3.315e+02, threshold=4.231e+02, percent-clipped=0.0 2023-10-04 05:55:23,171 WARNING [train.py:1204] (2/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_788-298907-0 from training. Duration: 0.89 2023-10-04 05:55:24,386 WARNING [train.py:1204] (2/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_225-70961-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 05:55:24,393 WARNING [train.py:1204] (2/4) Exclude cut with ID _58583_210_5236_1_1534726835266_3574249_235-175994-0_sp1.1 from training. Duration: 0.92725 2023-10-04 05:55:25,732 INFO [train.py:1046] (2/4) Epoch 44, batch 3900, loss[loss=0.1498, simple_loss=0.2322, pruned_loss=0.03367, over 23310.00 frames. ], tot_loss[loss=0.1537, simple_loss=0.2333, pruned_loss=0.03706, over 4685952.65 frames. ], batch size: 105, lr: 2.30e-03, grad_scale: 8.0 2023-10-04 05:55:25,774 WARNING [train.py:1204] (2/4) Exclude cut with ID _12302_210_5219_1_1529924446466_8107280_907-182949-0_sp1.1 from training. Duration: 0.836375 2023-10-04 05:55:25,807 WARNING [train.py:1204] (2/4) Exclude cut with ID _54871_210_22356_1_1534665798253_3856359_94-193692-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 05:55:27,633 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_4528_210_3273_1_1527076654_3276777_342-168068-0_sp0.9 from training. Duration: 0.9666875 2023-10-04 05:55:28,964 WARNING [train.py:1204] (2/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_368-74239-0_sp1.1 from training. Duration: 0.92725 2023-10-04 05:55:28,965 WARNING [train.py:1204] (2/4) Exclude cut with ID _44831_210_13427_1_1533816011549_3689035_291-295928-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 05:55:30,312 WARNING [train.py:1204] (2/4) Exclude cut with ID _56406_210_9288_1_1534899618664_3496149_455-131997-0_sp1.1 from training. Duration: 0.97275 2023-10-04 05:55:30,326 WARNING [train.py:1204] (2/4) Exclude cut with ID _6473_210_1741_1_1527901307396_3815380_108-17335-0 from training. Duration: 0.65 2023-10-04 05:55:31,455 WARNING [train.py:1204] (2/4) Exclude cut with ID _14224_210_4381_1_1530529062037_4264010_286-135600-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 05:55:36,160 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_14056_210_4163_1_1530604483_7572827_928-321675-0_sp1.1 from training. Duration: 0.936375 2023-10-04 05:55:36,232 WARNING [train.py:1204] (2/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_81-300212-0_sp1.1 from training. Duration: 0.5454375 2023-10-04 05:55:36,268 WARNING [train.py:1204] (2/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_29-280622-0_sp0.9 from training. Duration: 0.911125 2023-10-04 05:55:37,664 WARNING [train.py:1204] (2/4) Exclude cut with ID _61079_210_4525_1_1534921008198_4011419_228-176447-0_sp1.1 from training. Duration: 0.936375 2023-10-04 05:55:39,037 WARNING [train.py:1204] (2/4) Exclude cut with ID _8394_210_6702_1_1528590504316_3540189_372-198878-0_sp1.1 from training. Duration: 0.5454375 2023-10-04 05:55:39,249 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.nonlin_attention.balancer.min_positive, batch_count=1548880.0, ans=0.05 2023-10-04 05:55:40,404 WARNING [train.py:1204] (2/4) Exclude cut with ID _64521_210_14699_1_1535243427259_3815328_244-208725-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 05:55:40,535 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8275_210_4198_1_1528455320_3892571_491-17707-0_sp0.9 from training. Duration: 0.711125 2023-10-04 05:55:42,462 WARNING [train.py:1204] (2/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_375-245677-0 from training. Duration: 0.81 2023-10-04 05:55:42,469 WARNING [train.py:1204] (2/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_690-286055-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 05:55:43,869 WARNING [train.py:1204] (2/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_89-321955-0 from training. Duration: 0.8 2023-10-04 05:55:45,159 WARNING [train.py:1204] (2/4) Exclude cut with ID _18895_210_6228_1_1531357274234_3784930_213-98721-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 05:55:45,240 WARNING [train.py:1204] (2/4) Exclude cut with ID _19708_210_9206_1_1532136650733_4238364_347-65240-0 from training. Duration: 0.97 2023-10-04 05:55:47,919 WARNING [train.py:1204] (2/4) Exclude cut with ID _64135_210_2253_1_1535677267876_7608972_498-5039-0 from training. Duration: 0.78 2023-10-04 05:55:50,797 WARNING [train.py:1204] (2/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_146-276944-0_sp1.1 from training. Duration: 0.7545625 2023-10-04 05:55:52,093 WARNING [train.py:1204] (2/4) Exclude cut with ID _63171_210_15488_1_1535104792794_3295110_155-34488-0_sp1.1 from training. Duration: 0.97275 2023-10-04 05:55:52,113 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_5438_210_1794_1_1527336687_2600009_233-58072-0_sp0.9 from training. Duration: 0.8555625 2023-10-04 05:55:53,979 WARNING [train.py:1204] (2/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_63-101996-0_sp0.9 from training. Duration: 0.97775 2023-10-04 05:55:57,396 WARNING [train.py:1204] (2/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_271-269084-0_sp1.1 from training. Duration: 0.7545625 2023-10-04 05:55:58,864 WARNING [train.py:1204] (2/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_345-167986-0_sp1.1 from training. Duration: 0.763625 2023-10-04 05:56:00,445 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.bypass.scale_min, batch_count=1548946.6666666667, ans=0.2 2023-10-04 05:56:01,710 WARNING [train.py:1204] (2/4) Exclude cut with ID _39290_210_4220_1_1533862778760_3075118_92-311600-0_sp1.1 from training. Duration: 0.8 2023-10-04 05:56:01,717 WARNING [train.py:1204] (2/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_209-65306-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 05:56:03,018 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_26433_210_12108_1_1532157158_4056480_317-335807-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 05:56:09,096 WARNING [train.py:1204] (2/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_238-94127-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 05:56:09,142 WARNING [train.py:1204] (2/4) Exclude cut with ID _41314_210_12651_1_1533459476543_3766460_207-132038-0_sp1.1 from training. Duration: 0.8545625 2023-10-04 05:56:15,409 WARNING [train.py:1204] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_167-137255-0_sp1.1 from training. Duration: 0.5818125 2023-10-04 05:56:16,842 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_2009_210_2071_1_1525481134_4588937_385-302100-0_sp0.9 from training. Duration: 0.9666875 2023-10-04 05:56:24,657 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=1549080.0, ans=0.1 2023-10-04 05:56:25,851 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_15517_210_6753_1_1530954008_3487336_253-110617-0_sp1.1 from training. Duration: 0.936375 2023-10-04 05:56:27,441 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.bypass.skip_rate, batch_count=1549080.0, ans=0.09899494936611666 2023-10-04 05:56:30,441 WARNING [train.py:1204] (2/4) Exclude cut with ID _39290_210_4220_1_1533862778760_3075118_92-311600-0_sp0.9 from training. Duration: 0.97775 2023-10-04 05:56:30,490 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_42-142473-0 from training. Duration: 0.72 2023-10-04 05:56:30,525 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_2296_210_2087_1_1525503448_3397335_57-185430-0 from training. Duration: 0.6 2023-10-04 05:56:31,963 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_11305_210_1794_1_1529750428_7178148_257-312870-0_sp0.9 from training. Duration: 0.97775 2023-10-04 05:56:33,404 WARNING [train.py:1204] (2/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_222-198127-0 from training. Duration: 0.84 2023-10-04 05:56:34,800 WARNING [train.py:1204] (2/4) Exclude cut with ID _65263_210_22356_1_1535763598691_3921037_423-202699-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 05:56:36,178 WARNING [train.py:1204] (2/4) Exclude cut with ID _15037_210_2253_1_1530752294104_3267650_13-130361-0 from training. Duration: 0.99 2023-10-04 05:56:38,432 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.ff3_skip_rate, batch_count=1549080.0, ans=0.0 2023-10-04 05:56:41,069 INFO [train.py:1046] (2/4) Epoch 44, batch 3950, loss[loss=0.1509, simple_loss=0.239, pruned_loss=0.03145, over 24566.00 frames. ], tot_loss[loss=0.1536, simple_loss=0.233, pruned_loss=0.03704, over 4688318.36 frames. ], batch size: 71, lr: 2.30e-03, grad_scale: 8.0 2023-10-04 05:56:41,850 WARNING [train.py:1204] (2/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_628-198080-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 05:56:43,160 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_3171_210_2087_1_1526109084_2330007_37-64334-0 from training. Duration: 0.82 2023-10-04 05:56:43,219 WARNING [train.py:1204] (2/4) Exclude cut with ID _5287_210_2084_1_1527418883040_3878670_12-221186-0_sp1.1 from training. Duration: 0.8909375 2023-10-04 05:56:45,816 WARNING [train.py:1204] (2/4) Exclude cut with ID _6653_210_2629_1_1527919172441_6732679_1103-56445-0_sp1.1 from training. Duration: 0.663625 2023-10-04 05:56:48,477 WARNING [train.py:1204] (2/4) Exclude cut with ID _65263_210_22356_1_1535763598691_3921037_169-140068-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 05:56:50,727 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.2.encoder.layers.0.conv_module1.whiten, num_groups=1, num_channels=384, metric=6.52 vs. limit=15.0 2023-10-04 05:56:52,687 WARNING [train.py:1204] (2/4) Exclude cut with ID IC0671W0118-331961 from training. Duration: 0.896 2023-10-04 05:56:53,986 WARNING [train.py:1204] (2/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_402-102311-0_sp1.1 from training. Duration: 0.6090625 2023-10-04 05:56:54,014 WARNING [train.py:1204] (2/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_202-293523-0 from training. Duration: 0.72 2023-10-04 05:56:54,072 WARNING [train.py:1204] (2/4) Exclude cut with ID IC0679W0038-335846 from training. Duration: 0.768 2023-10-04 05:56:55,415 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_37357_210_17024_1_1533089504_2316121_79-198866-0_sp1.1 from training. Duration: 0.936375 2023-10-04 05:56:58,857 WARNING [train.py:1204] (2/4) Exclude cut with ID _7606_210_4915_1_1528894253277_5080690_292-271285-0_sp1.1 from training. Duration: 0.963625 2023-10-04 05:56:58,884 WARNING [train.py:1204] (2/4) Exclude cut with ID _3541_210_2477_1_1526475408709_3263973_377-296839-0_sp0.9 from training. Duration: 0.611125 2023-10-04 05:56:58,887 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_45886_210_17024_1_1533704917_2435333_58-131069-0_sp1.1 from training. Duration: 0.963625 2023-10-04 05:57:03,563 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6961_210_2087_1_1527937168_6815011_395-185060-0 from training. Duration: 0.95 2023-10-04 05:57:06,428 WARNING [train.py:1204] (2/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_304-266479-0_sp1.1 from training. Duration: 0.97275 2023-10-04 05:57:06,488 WARNING [train.py:1204] (2/4) Exclude cut with ID _3867_210_1883_1_1526641200575_4475830_319-349850-0_sp1.1 from training. Duration: 0.6090625 2023-10-04 05:57:06,496 WARNING [train.py:1204] (2/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_304-59227-0_sp0.9 from training. Duration: 0.8555625 2023-10-04 05:57:07,845 WARNING [train.py:1204] (2/4) Exclude cut with ID _8021_210_7994_1_1528376451358_3653069_100-140231-0_sp0.9 from training. Duration: 0.7444375 2023-10-04 05:57:07,906 WARNING [train.py:1204] (2/4) Exclude cut with ID _8833_210_5219_1_1528592665210_7553059_329-125156-0_sp0.9 from training. Duration: 0.911125 2023-10-04 05:57:17,533 WARNING [train.py:1204] (2/4) Exclude cut with ID _14459_210_2228_1_1531443641946_7516700_792-287234-0_sp1.1 from training. Duration: 0.92725 2023-10-04 05:57:17,550 WARNING [train.py:1204] (2/4) Exclude cut with ID _19708_210_9206_1_1532136650733_4238364_347-65240-0_sp1.1 from training. Duration: 0.8818125 2023-10-04 05:57:17,815 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.conv_module2.balancer2.min_abs, batch_count=1549280.0, ans=0.5 2023-10-04 05:57:21,208 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.1.encoder.layers.1.feed_forward1.out_whiten, num_groups=1, num_channels=256, metric=7.22 vs. limit=15.0 2023-10-04 05:57:21,917 WARNING [train.py:1204] (2/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_70-27732-0 from training. Duration: 0.94 2023-10-04 05:57:28,105 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_397-132565-0 from training. Duration: 0.99 2023-10-04 05:57:28,108 WARNING [train.py:1204] (2/4) Exclude cut with ID _49728_210_12651_1_1534041034294_6788150_624-195301-0 from training. Duration: 0.85 2023-10-04 05:57:29,272 WARNING [train.py:1204] (2/4) Exclude cut with ID _55429_210_10626_1_1534467588527_7174143_377-169391-0_sp1.1 from training. Duration: 0.87275 2023-10-04 05:57:30,652 WARNING [train.py:1204] (2/4) Exclude cut with ID _49376_210_5986_1_1533985278732_3370740_354-344695-0_sp1.1 from training. Duration: 0.9 2023-10-04 05:57:35,672 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.conv_module1.balancer1.prob, batch_count=1549346.6666666667, ans=0.125 2023-10-04 05:57:36,877 WARNING [train.py:1204] (2/4) Exclude cut with ID _4697_210_2069_1_1526815801953_3823098_276-96593-0_sp1.1 from training. Duration: 0.7 2023-10-04 05:57:36,885 WARNING [train.py:1204] (2/4) Exclude cut with ID _15012_210_1968_1_1530856820429_5095350_688-178849-0_sp0.9 from training. Duration: 0.711125 2023-10-04 05:57:36,925 WARNING [train.py:1204] (2/4) Exclude cut with ID _25565_210_12392_1_1532423009382_3952098_372-69023-0_sp1.1 from training. Duration: 0.936375 2023-10-04 05:57:36,957 WARNING [train.py:1204] (2/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_61-327062-0_sp1.1 from training. Duration: 0.736375 2023-10-04 05:57:38,798 WARNING [train.py:1204] (2/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_215-141059-0 from training. Duration: 0.62 2023-10-04 05:57:43,104 WARNING [train.py:1204] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_6-226750-0_sp1.1 from training. Duration: 0.8545625 2023-10-04 05:57:45,136 WARNING [train.py:1204] (2/4) Exclude cut with ID _14348_210_8341_1_1530599434996_3778640_240-232370-0_sp1.1 from training. Duration: 0.763625 2023-10-04 05:57:49,147 WARNING [train.py:1204] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_468-189058-0 from training. Duration: 0.96 2023-10-04 05:57:53,309 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.581e+02 1.935e+02 2.171e+02 2.462e+02 3.454e+02, threshold=4.341e+02, percent-clipped=0.0 2023-10-04 05:57:56,210 INFO [train.py:1046] (2/4) Epoch 44, batch 4000, loss[loss=0.1498, simple_loss=0.2335, pruned_loss=0.03307, over 24442.00 frames. ], tot_loss[loss=0.1536, simple_loss=0.2335, pruned_loss=0.03689, over 4695000.07 frames. ], batch size: 63, lr: 2.30e-03, grad_scale: 16.0 2023-10-04 05:57:58,790 WARNING [train.py:1204] (2/4) Exclude cut with ID _21987_210_5854_1_1531821500482_3637038_247-93340-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 05:58:06,865 WARNING [train.py:1204] (2/4) Exclude cut with ID _66868_210_9527_1_1535509287954_4561140_436-191602-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 05:58:08,681 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.conv_module1.balancer1.min_positive, batch_count=1549480.0, ans=0.025 2023-10-04 05:58:11,713 WARNING [train.py:1204] (2/4) Exclude cut with ID _18895_210_6228_1_1531357274234_3784930_394-58203-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 05:58:11,759 WARNING [train.py:1204] (2/4) Exclude cut with ID _58605_210_12210_1_1534723195939_3904259_258-281498-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 05:58:12,432 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.3.encoder.layers.0.self_attn2.whiten, num_groups=1, num_channels=512, metric=13.05 vs. limit=22.5 2023-10-04 05:58:13,066 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_33348_210_5632_1_1533086912_3324030_245-292752-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 05:58:13,085 WARNING [train.py:1204] (2/4) Exclude cut with ID _67831_210_12392_1_1535612464202_3092612_346-2148-0 from training. Duration: 0.93 2023-10-04 05:58:14,405 WARNING [train.py:1204] (2/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_369-222060-0_sp0.9 from training. Duration: 0.788875 2023-10-04 05:58:15,762 WARNING [train.py:1204] (2/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_245-174553-0 from training. Duration: 0.88 2023-10-04 05:58:15,768 WARNING [train.py:1204] (2/4) Exclude cut with ID _3541_210_2477_1_1526475408709_3263973_231-104736-0_sp1.1 from training. Duration: 0.7090625 2023-10-04 05:58:15,774 WARNING [train.py:1204] (2/4) Exclude cut with ID _44585_210_18628_1_1533605350387_4518199_280-73534-0 from training. Duration: 0.89 2023-10-04 05:58:17,187 WARNING [train.py:1204] (2/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_22-201476-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 05:58:19,245 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=1549546.6666666667, ans=0.1 2023-10-04 05:58:20,368 WARNING [train.py:1204] (2/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_325-62802-0_sp0.9 from training. Duration: 0.9666875 2023-10-04 05:58:20,381 WARNING [train.py:1204] (2/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_199-274062-0_sp1.1 from training. Duration: 0.97275 2023-10-04 05:58:20,391 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_8-319957-0_sp1.1 from training. Duration: 0.87275 2023-10-04 05:58:20,421 WARNING [train.py:1204] (2/4) Exclude cut with ID _29343_210_4381_1_1532602757752_3674109_224-349726-0_sp1.1 from training. Duration: 0.936375 2023-10-04 05:58:20,425 WARNING [train.py:1204] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_520-126729-0_sp1.1 from training. Duration: 0.636375 2023-10-04 05:58:23,238 WARNING [train.py:1204] (2/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_239-22076-0_sp1.1 from training. Duration: 0.77275 2023-10-04 05:58:23,337 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9012W0011-501146 from training. Duration: 0.896 2023-10-04 05:58:24,714 WARNING [train.py:1204] (2/4) Exclude cut with ID _29857_210_20034_1_1532395886753_3798160_279-185801-0_sp1.1 from training. Duration: 0.82725 2023-10-04 05:58:26,010 WARNING [train.py:1204] (2/4) Exclude cut with ID _43552_210_8913_1_1533726031059_3523429_261-223933-0_sp1.1 from training. Duration: 0.92725 2023-10-04 05:58:27,437 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9029W0025-509426 from training. Duration: 0.896 2023-10-04 05:58:28,787 WARNING [train.py:1204] (2/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_337-181502-0_sp1.1 from training. Duration: 0.5909375 2023-10-04 05:58:28,802 WARNING [train.py:1204] (2/4) Exclude cut with ID _68635_210_9317_1_1535770813410_4315870_159-332599-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 05:58:35,940 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_161-276996-0 from training. Duration: 0.95 2023-10-04 05:58:35,977 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7088_210_1874_1_1527939776_1101113_127-68870-0_sp1.1 from training. Duration: 0.936375 2023-10-04 05:58:39,301 WARNING [train.py:1204] (2/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_322-217220-0_sp1.1 from training. Duration: 0.7818125 2023-10-04 05:58:39,352 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9072W0002-530636 from training. Duration: 0.768 2023-10-04 05:58:42,408 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_340-336408-0_sp1.1 from training. Duration: 0.8454375 2023-10-04 05:58:42,467 WARNING [train.py:1204] (2/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_395-37222-0 from training. Duration: 0.76 2023-10-04 05:58:42,474 WARNING [train.py:1204] (2/4) Exclude cut with ID _64099_210_17301_1_1535526977656_1578188_159-209156-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 05:58:43,834 WARNING [train.py:1204] (2/4) Exclude cut with ID _53950_210_5816_1_1534417264919_3862951_28-29681-0_sp1.1 from training. Duration: 0.92725 2023-10-04 05:58:43,911 WARNING [train.py:1204] (2/4) Exclude cut with ID _4681_210_3525_1_1527250689464_120889_15-126760-0_sp1.1 from training. Duration: 0.736375 2023-10-04 05:58:45,340 WARNING [train.py:1204] (2/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_242-3472-0_sp1.1 from training. Duration: 0.663625 2023-10-04 05:58:45,371 WARNING [train.py:1204] (2/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_436-186311-0_sp0.9 from training. Duration: 0.72225 2023-10-04 05:58:45,395 WARNING [train.py:1204] (2/4) Exclude cut with ID _3675_210_4557_1_1526639934408_4182089_452-172147-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 05:58:48,187 WARNING [train.py:1204] (2/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_80-147301-0 from training. Duration: 0.84 2023-10-04 05:58:48,217 WARNING [train.py:1204] (2/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_351-107588-0_sp1.1 from training. Duration: 0.92725 2023-10-04 05:58:51,308 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9111W0002-550781 from training. Duration: 0.896 2023-10-04 05:58:55,491 WARNING [train.py:1204] (2/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_465-156500-0_sp1.1 from training. Duration: 0.6545625 2023-10-04 05:58:56,943 WARNING [train.py:1204] (2/4) Exclude cut with ID _13938_210_9988_1_1530518718978_3693960_304-112784-0_sp1.1 from training. Duration: 0.4181875 2023-10-04 05:58:58,526 WARNING [train.py:1204] (2/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_202-67017-0_sp1.1 from training. Duration: 0.7454375 2023-10-04 05:58:59,823 WARNING [train.py:1204] (2/4) Exclude cut with ID _58414_210_8254_1_1535421609162_7151182_448-4358-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 05:58:59,887 WARNING [train.py:1204] (2/4) Exclude cut with ID _66840_210_5219_1_1535452168426_4196349_203-243234-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 05:59:01,252 WARNING [train.py:1204] (2/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_201-213513-0_sp1.1 from training. Duration: 0.97275 2023-10-04 05:59:07,246 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6291_210_3273_1_1527683649_1078498_117-187560-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 05:59:09,271 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.conv_module2.balancer1.prob, batch_count=1549813.3333333333, ans=0.125 2023-10-04 05:59:09,320 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=1549813.3333333333, ans=0.1 2023-10-04 05:59:10,860 INFO [train.py:1046] (2/4) Epoch 44, batch 4050, loss[loss=0.149, simple_loss=0.2312, pruned_loss=0.03344, over 23240.00 frames. ], tot_loss[loss=0.1541, simple_loss=0.234, pruned_loss=0.03707, over 4685314.04 frames. ], batch size: 93, lr: 2.30e-03, grad_scale: 16.0 2023-10-04 05:59:10,931 WARNING [train.py:1204] (2/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_135-174080-0_sp0.9 from training. Duration: 0.588875 2023-10-04 05:59:12,290 WARNING [train.py:1204] (2/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_45-265684-0 from training. Duration: 0.72 2023-10-04 05:59:15,011 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_435-313305-0_sp1.1 from training. Duration: 0.6454375 2023-10-04 05:59:15,039 WARNING [train.py:1204] (2/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_235-307337-0_sp1.1 from training. Duration: 0.8545625 2023-10-04 05:59:15,235 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=1549813.3333333333, ans=0.1 2023-10-04 05:59:16,484 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7762_210_1794_1_1528199508_4057408_419-125009-0_sp0.9 from training. Duration: 0.92225 2023-10-04 05:59:17,818 WARNING [train.py:1204] (2/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_215-141059-0_sp1.1 from training. Duration: 0.563625 2023-10-04 05:59:19,220 WARNING [train.py:1204] (2/4) Exclude cut with ID _27013_210_1389_1_1532437173240_3252649_222-125573-0_sp1.1 from training. Duration: 0.8909375 2023-10-04 05:59:22,558 WARNING [train.py:1204] (2/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_126-274694-0_sp1.1 from training. Duration: 0.8909375 2023-10-04 05:59:25,346 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_2592_210_2290_1_1525697025_4882925_157-70512-0_sp1.1 from training. Duration: 0.82725 2023-10-04 05:59:25,401 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_292-265928-0_sp0.9 from training. Duration: 0.5666875 2023-10-04 05:59:28,122 WARNING [train.py:1204] (2/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_1211-226066-0_sp1.1 from training. Duration: 0.6909375 2023-10-04 05:59:29,424 WARNING [train.py:1204] (2/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_394-265747-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 05:59:32,210 WARNING [train.py:1204] (2/4) Exclude cut with ID _48782_210_12658_1_1534467701544_2300422_239-67139-0_sp1.1 from training. Duration: 0.97275 2023-10-04 05:59:33,567 WARNING [train.py:1204] (2/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_329-113173-0_sp1.1 from training. Duration: 0.563625 2023-10-04 05:59:35,137 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.conv_module2.balancer1.prob, batch_count=1549880.0, ans=0.125 2023-10-04 05:59:37,634 WARNING [train.py:1204] (2/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_81-75849-0_sp0.9 from training. Duration: 0.4333125 2023-10-04 05:59:39,097 WARNING [train.py:1204] (2/4) Exclude cut with ID _9235_210_5219_1_1528891154253_8046120_1003-101638-0 from training. Duration: 0.73 2023-10-04 05:59:39,121 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9309W0189-640316 from training. Duration: 0.64 2023-10-04 05:59:42,753 WARNING [train.py:1204] (2/4) Exclude cut with ID _45329_210_7005_1_1534157951211_6774339_503-287883-0_sp1.1 from training. Duration: 0.8 2023-10-04 05:59:47,408 WARNING [train.py:1204] (2/4) Exclude cut with ID _8833_210_5219_1_1528592665210_7553059_611-80313-0 from training. Duration: 0.64 2023-10-04 05:59:47,472 WARNING [train.py:1204] (2/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_335-106626-0_sp1.1 from training. Duration: 0.963625 2023-10-04 05:59:51,493 WARNING [train.py:1204] (2/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_237-44332-0_sp1.1 from training. Duration: 0.8545625 2023-10-04 05:59:54,848 WARNING [train.py:1204] (2/4) Exclude cut with ID _46952_210_5986_1_1534230010357_3107379_178-147341-0_sp1.1 from training. Duration: 0.97275 2023-10-04 05:59:55,360 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.4.encoder.layers.1.feed_forward3.out_whiten, num_groups=1, num_channels=384, metric=13.24 vs. limit=15.0 2023-10-04 05:59:56,227 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_13091_210_7925_1_1530609447_8987354_946-154426-0_sp1.1 from training. Duration: 0.936375 2023-10-04 05:59:56,249 WARNING [train.py:1204] (2/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_326-17061-0_sp1.1 from training. Duration: 0.8545625 2023-10-04 05:59:59,142 WARNING [train.py:1204] (2/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_38-169920-0_sp1.1 from training. Duration: 0.82725 2023-10-04 06:00:03,340 WARNING [train.py:1204] (2/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_283-220372-0 from training. Duration: 0.56 2023-10-04 06:00:04,524 WARNING [train.py:1204] (2/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_252-216674-0_sp1.1 from training. Duration: 0.4909375 2023-10-04 06:00:04,740 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.feed_forward1.hidden_balancer.prob, batch_count=1550013.3333333333, ans=0.125 2023-10-04 06:00:05,876 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_202-91290-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 06:00:07,319 WARNING [train.py:1204] (2/4) Exclude cut with ID _41314_210_12651_1_1533459476543_3766460_80-74184-0 from training. Duration: 0.83 2023-10-04 06:00:10,693 WARNING [train.py:1204] (2/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_411-271358-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 06:00:14,792 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.2.encoder.layers.1.self_attn_weights.whiten_keys, num_groups=4, num_channels=128, metric=2.93 vs. limit=6.0 2023-10-04 06:00:16,716 WARNING [train.py:1204] (2/4) Exclude cut with ID _68777_210_4353_1_1535810390731_3117904_129-166012-0 from training. Duration: 0.85 2023-10-04 06:00:16,788 WARNING [train.py:1204] (2/4) Exclude cut with ID _44982_210_1511_1_1534312820108_3726029_410-109759-0_sp1.1 from training. Duration: 0.963625 2023-10-04 06:00:16,792 WARNING [train.py:1204] (2/4) Exclude cut with ID _14224_210_4381_1_1530529062037_4264010_364-22817-0_sp1.1 from training. Duration: 0.6454375 2023-10-04 06:00:20,007 WARNING [train.py:1204] (2/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_89-242335-0 from training. Duration: 0.95 2023-10-04 06:00:21,022 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.591e+02 1.954e+02 2.228e+02 2.538e+02 4.386e+02, threshold=4.457e+02, percent-clipped=1.0 2023-10-04 06:00:21,086 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_12868_210_6753_1_1530702479_3610564_151-325626-0 from training. Duration: 0.97 2023-10-04 06:00:21,087 WARNING [train.py:1204] (2/4) Exclude cut with ID _6275_210_2084_1_1528110062213_7669049_786-264332-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 06:00:23,760 INFO [train.py:1046] (2/4) Epoch 44, batch 4100, loss[loss=0.163, simple_loss=0.2367, pruned_loss=0.0447, over 23818.00 frames. ], tot_loss[loss=0.155, simple_loss=0.2355, pruned_loss=0.03725, over 4694318.98 frames. ], batch size: 212, lr: 2.30e-03, grad_scale: 16.0 2023-10-04 06:00:23,844 WARNING [train.py:1204] (2/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_35-153886-0_sp1.1 from training. Duration: 0.763625 2023-10-04 06:00:23,930 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_24007_210_1794_1_1531918925_1663875_116-100916-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 06:00:25,794 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_162-262717-0_sp1.1 from training. Duration: 0.7545625 2023-10-04 06:00:26,766 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.2.encoder.layers.0.conv_module2.whiten, num_groups=1, num_channels=384, metric=4.52 vs. limit=15.0 2023-10-04 06:00:31,984 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.3.encoder.layers.0.whiten, num_groups=1, num_channels=512, metric=5.65 vs. limit=12.0 2023-10-04 06:00:32,580 WARNING [train.py:1204] (2/4) Exclude cut with ID _8263_210_1664_1_1528438953494_294100_40-204731-0 from training. Duration: 0.5 2023-10-04 06:00:32,683 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_12868_210_6753_1_1530702479_3610564_416-293746-0 from training. Duration: 0.9 2023-10-04 06:00:34,129 WARNING [train.py:1204] (2/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_605-219732-0 from training. Duration: 0.94 2023-10-04 06:00:35,484 WARNING [train.py:1204] (2/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_454-123002-0 from training. Duration: 0.69 2023-10-04 06:00:35,489 WARNING [train.py:1204] (2/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_25-304273-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 06:00:35,542 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6860_210_4852_1_1527917457_3833327_451-39702-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 06:00:35,573 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_304-16505-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 06:00:36,934 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6291_210_3273_1_1527683649_1078498_4-266348-0_sp1.1 from training. Duration: 0.7090625 2023-10-04 06:00:37,010 WARNING [train.py:1204] (2/4) Exclude cut with ID ID0149W0001-753701 from training. Duration: 0.989 2023-10-04 06:00:37,236 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.conv_module1.balancer1.prob, batch_count=1550213.3333333333, ans=0.125 2023-10-04 06:00:40,225 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_41251_210_15368_1_1533429767_4820695_394-55393-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 06:00:41,631 WARNING [train.py:1204] (2/4) Exclude cut with ID _8793_210_6310_1_1528542985139_2310009_121-179782-0_sp1.1 from training. Duration: 0.72725 2023-10-04 06:00:42,924 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_2729_210_3528_1_1525863505_3882214_421-73047-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 06:00:43,004 WARNING [train.py:1204] (2/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_278-154492-0_sp0.9 from training. Duration: 0.8444375 2023-10-04 06:00:48,094 WARNING [train.py:1204] (2/4) Exclude cut with ID _14170_210_5219_1_1530525949868_4469650_412-317077-0_sp0.9 from training. Duration: 0.8333125 2023-10-04 06:00:49,553 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_727-138317-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 06:00:49,588 WARNING [train.py:1204] (2/4) Exclude cut with ID _25808_210_13292_1_1532343514562_7366109_345-335048-0_sp1.1 from training. Duration: 0.92725 2023-10-04 06:00:49,609 WARNING [train.py:1204] (2/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_506-23200-0 from training. Duration: 0.97 2023-10-04 06:00:50,903 WARNING [train.py:1204] (2/4) Exclude cut with ID _44718_210_5816_1_1534050051869_6469030_643-245187-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 06:00:50,907 WARNING [train.py:1204] (2/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_42-186162-0_sp1.1 from training. Duration: 0.836375 2023-10-04 06:00:52,211 WARNING [train.py:1204] (2/4) Exclude cut with ID _3231_210_2106_1_1526205864934_6784220_171-256004-0_sp1.1 from training. Duration: 0.7818125 2023-10-04 06:00:52,242 WARNING [train.py:1204] (2/4) Exclude cut with ID _25914_210_6400_1_1532141317058_644740_43-248478-0_sp1.1 from training. Duration: 0.936375 2023-10-04 06:00:53,591 WARNING [train.py:1204] (2/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_577-287150-0 from training. Duration: 0.82 2023-10-04 06:00:55,178 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_46015_210_7234_1_1533863319_3087837_376-276068-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 06:00:57,106 WARNING [train.py:1204] (2/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_51-200220-0 from training. Duration: 0.56 2023-10-04 06:00:58,450 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_452-96101-0_sp1.1 from training. Duration: 0.963625 2023-10-04 06:00:59,901 WARNING [train.py:1204] (2/4) Exclude cut with ID _13252_210_1925_1_1530671372047_3336909_263-168388-0_sp1.1 from training. Duration: 0.7818125 2023-10-04 06:00:59,902 WARNING [train.py:1204] (2/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_469-94673-0 from training. Duration: 0.79 2023-10-04 06:01:02,621 WARNING [train.py:1204] (2/4) Exclude cut with ID _14922_210_6228_1_1530707435273_3693310_303-228738-0_sp1.1 from training. Duration: 0.763625 2023-10-04 06:01:03,862 WARNING [train.py:1204] (2/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_262-250826-0_sp1.1 from training. Duration: 0.9 2023-10-04 06:01:03,891 WARNING [train.py:1204] (2/4) Exclude cut with ID _68777_210_4353_1_1535810390731_3117904_129-166012-0_sp1.1 from training. Duration: 0.77275 2023-10-04 06:01:05,369 WARNING [train.py:1204] (2/4) Exclude cut with ID _58895_210_8750_1_1535184081320_3649089_265-323810-0 from training. Duration: 0.96 2023-10-04 06:01:06,717 WARNING [train.py:1204] (2/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_120-76843-0_sp0.9 from training. Duration: 0.611125 2023-10-04 06:01:06,782 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7544_210_6753_1_1528587722_3576686_295-258479-0_sp0.9 from training. Duration: 0.9333125 2023-10-04 06:01:08,379 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7046_210_6753_1_1527895337_6920917_783-278109-0 from training. Duration: 0.77 2023-10-04 06:01:09,676 WARNING [train.py:1204] (2/4) Exclude cut with ID _42326_210_7770_1_1533814282531_3707269_113-308323-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 06:01:09,727 WARNING [train.py:1204] (2/4) Exclude cut with ID _7852_210_5219_1_1528286471316_8139400_242-105336-0_sp0.9 from training. Duration: 0.8 2023-10-04 06:01:12,889 WARNING [train.py:1204] (2/4) Exclude cut with ID _56235_210_10640_1_1534557335235_4161099_114-277905-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 06:01:16,769 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.3.encoder.layers.3.conv_module2.whiten, num_groups=1, num_channels=512, metric=4.41 vs. limit=15.0 2023-10-04 06:01:17,572 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_13118_210_7925_1_1530755612_7608297_42-66683-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 06:01:19,528 WARNING [train.py:1204] (2/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_901-79438-0_sp1.1 from training. Duration: 0.8545625 2023-10-04 06:01:20,819 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_30280_210_1881_1_1532872779_3660139_211-182194-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 06:01:28,176 WARNING [train.py:1204] (2/4) Exclude cut with ID _14898_210_9206_1_1530766812377_4177190_621-45732-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 06:01:28,185 WARNING [train.py:1204] (2/4) Exclude cut with ID _35350_210_16040_1_1533040191183_4075299_224-133775-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 06:01:30,976 WARNING [train.py:1204] (2/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_147-293311-0_sp1.1 from training. Duration: 0.8545625 2023-10-04 06:01:33,778 WARNING [train.py:1204] (2/4) Exclude cut with ID _12302_210_5219_1_1529924446466_8107280_835-279754-0_sp1.1 from training. Duration: 0.8181875 2023-10-04 06:01:37,270 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.2.encoder.layers.2.whiten, num_groups=1, num_channels=384, metric=3.74 vs. limit=12.0 2023-10-04 06:01:37,884 INFO [train.py:1046] (2/4) Epoch 44, batch 4150, loss[loss=0.1526, simple_loss=0.2398, pruned_loss=0.03271, over 24636.00 frames. ], tot_loss[loss=0.1551, simple_loss=0.2358, pruned_loss=0.0372, over 4705377.93 frames. ], batch size: 65, lr: 2.30e-03, grad_scale: 16.0 2023-10-04 06:01:38,006 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8453_210_4211_1_1528502940_8562730_347-330673-0_sp0.9 from training. Duration: 0.8 2023-10-04 06:01:39,401 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_213-103282-0_sp0.9 from training. Duration: 0.8666875 2023-10-04 06:01:40,787 WARNING [train.py:1204] (2/4) Exclude cut with ID _49728_210_12651_1_1534041034294_6788150_66-43496-0_sp1.1 from training. Duration: 0.97275 2023-10-04 06:01:40,788 WARNING [train.py:1204] (2/4) Exclude cut with ID _19023_210_4353_1_1531378892552_5820780_28-36615-0_sp1.1 from training. Duration: 0.8818125 2023-10-04 06:01:44,473 WARNING [train.py:1204] (2/4) Exclude cut with ID _14349_210_8341_1_1530615710818_3442470_115-285975-0 from training. Duration: 0.86 2023-10-04 06:01:45,746 WARNING [train.py:1204] (2/4) Exclude cut with ID _12734_210_8341_1_1530183614296_3891929_236-47464-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 06:01:45,788 WARNING [train.py:1204] (2/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_177-236809-0 from training. Duration: 0.49 2023-10-04 06:01:45,864 WARNING [train.py:1204] (2/4) Exclude cut with ID _13678_210_7497_1_1530530069226_3463170_371-3221-0 from training. Duration: 0.9 2023-10-04 06:01:45,879 WARNING [train.py:1204] (2/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_48-338976-0 from training. Duration: 0.89 2023-10-04 06:01:47,230 WARNING [train.py:1204] (2/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_1167-309744-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 06:01:53,228 WARNING [train.py:1204] (2/4) Exclude cut with ID _30327_210_16049_1_1532484161295_3924169_401-70819-0_sp0.9 from training. Duration: 0.9555625 2023-10-04 06:01:53,236 WARNING [train.py:1204] (2/4) Exclude cut with ID _55134_210_21982_1_1534590028377_3882170_346-91022-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 06:01:53,445 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.feed_forward3.hidden_balancer.prob, batch_count=1550546.6666666667, ans=0.125 2023-10-04 06:01:57,411 WARNING [train.py:1204] (2/4) Exclude cut with ID _14459_210_2228_1_1531443641946_7516700_174-118801-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 06:01:59,266 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_269-278006-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 06:02:00,634 WARNING [train.py:1204] (2/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_390-339137-0_sp0.9 from training. Duration: 0.62225 2023-10-04 06:02:02,105 WARNING [train.py:1204] (2/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_253-308377-0_sp1.1 from training. Duration: 0.4454375 2023-10-04 06:02:02,116 WARNING [train.py:1204] (2/4) Exclude cut with ID _23825_210_13449_1_1531915313526_3497150_240-271367-0_sp1.1 from training. Duration: 0.8818125 2023-10-04 06:02:02,199 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1015-343884-0_sp0.9 from training. Duration: 0.688875 2023-10-04 06:02:06,355 WARNING [train.py:1204] (2/4) Exclude cut with ID _23514_210_1389_1_1531832139785_3031439_335-48210-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 06:02:09,142 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_309-65177-0_sp1.1 from training. Duration: 0.8 2023-10-04 06:02:10,559 WARNING [train.py:1204] (2/4) Exclude cut with ID _14368_210_4220_1_1530532714870_3595529_231-137892-0 from training. Duration: 0.99 2023-10-04 06:02:13,631 WARNING [train.py:1204] (2/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_57-270037-0 from training. Duration: 0.77 2023-10-04 06:02:13,636 WARNING [train.py:1204] (2/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_465-214726-0_sp1.1 from training. Duration: 0.7818125 2023-10-04 06:02:15,522 WARNING [train.py:1204] (2/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_106-300985-0 from training. Duration: 0.9 2023-10-04 06:02:15,527 WARNING [train.py:1204] (2/4) Exclude cut with ID _14632_210_9988_1_1530691279392_3923200_161-129530-0_sp1.1 from training. Duration: 0.87275 2023-10-04 06:02:15,549 WARNING [train.py:1204] (2/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_514-88370-0_sp1.1 from training. Duration: 0.963625 2023-10-04 06:02:18,400 WARNING [train.py:1204] (2/4) Exclude cut with ID _56495_210_4234_1_1534748080334_4136219_305-334505-0_sp1.1 from training. Duration: 0.92725 2023-10-04 06:02:18,481 WARNING [train.py:1204] (2/4) Exclude cut with ID _42875_210_14856_1_1533704397487_4012433_61-264914-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 06:02:22,737 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.2.encoder.layers.0.self_attn2.whiten, num_groups=1, num_channels=384, metric=20.94 vs. limit=22.5 2023-10-04 06:02:23,368 WARNING [train.py:1204] (2/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_91-148831-0 from training. Duration: 0.99 2023-10-04 06:02:27,386 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_435-313305-0_sp0.9 from training. Duration: 0.788875 2023-10-04 06:02:27,501 WARNING [train.py:1204] (2/4) Exclude cut with ID _16744_210_5986_1_1531724405384_3907567_242-111316-0_sp1.1 from training. Duration: 0.8454375 2023-10-04 06:02:28,827 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_257-20733-0 from training. Duration: 0.76 2023-10-04 06:02:28,883 WARNING [train.py:1204] (2/4) Exclude cut with ID _8064_210_5399_1_1528372299291_4282973_4-91037-0_sp1.1 from training. Duration: 0.8 2023-10-04 06:02:30,857 WARNING [train.py:1204] (2/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_3-276701-0 from training. Duration: 0.91 2023-10-04 06:02:30,990 WARNING [train.py:1204] (2/4) Exclude cut with ID _3992_210_1747_1_1526644816031_3731060_36-147855-0_sp1.1 from training. Duration: 0.7090625 2023-10-04 06:02:32,386 WARNING [train.py:1204] (2/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_1296-298671-0_sp1.1 from training. Duration: 0.963625 2023-10-04 06:02:32,561 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.ff3_skip_rate, batch_count=1550680.0, ans=0.0 2023-10-04 06:02:34,884 WARNING [train.py:1204] (2/4) Exclude cut with ID _68965_210_1883_1_1535698962272_3497490_309-334593-0_sp1.1 from training. Duration: 0.92725 2023-10-04 06:02:36,161 WARNING [train.py:1204] (2/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_128-296581-0 from training. Duration: 0.74 2023-10-04 06:02:36,161 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_17613_210_2151_1_1531137569_2366160_170-299079-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 06:02:36,163 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6287_210_3273_1_1527509506_2653716_182-199887-0_sp1.1 from training. Duration: 0.463625 2023-10-04 06:02:37,731 WARNING [train.py:1204] (2/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_80-135200-0_sp1.1 from training. Duration: 0.7181875 2023-10-04 06:02:40,404 WARNING [train.py:1204] (2/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_34-183685-0 from training. Duration: 0.98 2023-10-04 06:02:40,423 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8278_210_2550_1_1528637099_4220413_418-210155-0_sp1.1 from training. Duration: 0.92725 2023-10-04 06:02:40,427 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8853_210_3273_1_1528633899_818968_51-187685-0_sp0.9 from training. Duration: 0.7666875 2023-10-04 06:02:40,448 WARNING [train.py:1204] (2/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_332-221057-0_sp1.1 from training. Duration: 0.6181875 2023-10-04 06:02:41,334 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.1.self_attn1.whiten.whitening_limit, batch_count=1550746.6666666667, ans=22.5 2023-10-04 06:02:42,365 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_2221_210_1988_1_1525597051_4935541_415-296882-0 from training. Duration: 0.77 2023-10-04 06:02:42,392 WARNING [train.py:1204] (2/4) Exclude cut with ID _33640_210_2655_1_1533117605479_4855469_770-278185-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 06:02:43,795 WARNING [train.py:1204] (2/4) Exclude cut with ID _5287_210_2084_1_1527418883040_3878670_333-48508-0_sp1.1 from training. Duration: 0.5454375 2023-10-04 06:02:45,174 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_43802_210_2290_1_1533516203_8829504_142-276839-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 06:02:46,988 WARNING [train.py:1204] (2/4) Exclude cut with ID _28394_210_8913_1_1532430049518_3650118_126-139371-0_sp1.1 from training. Duration: 0.92725 2023-10-04 06:02:48,229 WARNING [train.py:1204] (2/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_76-203867-0 from training. Duration: 0.72 2023-10-04 06:02:48,274 WARNING [train.py:1204] (2/4) Exclude cut with ID _6194_210_1534_1_1527511826160_3941194_14-282876-0_sp0.9 from training. Duration: 0.788875 2023-10-04 06:02:49,482 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.733e+02 2.108e+02 2.384e+02 3.215e+02 5.967e+02, threshold=4.767e+02, percent-clipped=2.0 2023-10-04 06:02:52,736 INFO [train.py:1046] (2/4) Epoch 44, batch 4200, loss[loss=0.157, simple_loss=0.2271, pruned_loss=0.04342, over 23889.00 frames. ], tot_loss[loss=0.1547, simple_loss=0.2352, pruned_loss=0.03707, over 4713055.76 frames. ], batch size: 195, lr: 2.30e-03, grad_scale: 16.0 2023-10-04 06:02:52,955 WARNING [train.py:1204] (2/4) Exclude cut with ID _35355_210_16040_1_1533212988977_4016229_74-190076-0_sp1.1 from training. Duration: 0.72725 2023-10-04 06:02:54,461 WARNING [train.py:1204] (2/4) Exclude cut with ID _33920_210_4220_1_1533257851500_3084399_198-334063-0 from training. Duration: 0.94 2023-10-04 06:02:54,716 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=1550813.3333333333, ans=0.1 2023-10-04 06:02:55,807 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6291_210_3273_1_1527683649_1078498_4-266348-0_sp0.9 from training. Duration: 0.8666875 2023-10-04 06:02:57,220 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_23856_210_4238_1_1531994785_3087934_122-38075-0_sp1.1 from training. Duration: 0.936375 2023-10-04 06:02:58,613 WARNING [train.py:1204] (2/4) Exclude cut with ID _4697_210_2069_1_1526815801953_3823098_276-96593-0_sp0.9 from training. Duration: 0.8555625 2023-10-04 06:02:59,989 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_308-278734-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 06:02:59,991 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_5190_210_4557_1_1527245078_2965646_353-250603-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 06:03:03,239 WARNING [train.py:1204] (2/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_472-216663-0 from training. Duration: 0.92 2023-10-04 06:03:05,975 WARNING [train.py:1204] (2/4) Exclude cut with ID _7537_210_2084_1_1528502395431_3924359_14-312906-0 from training. Duration: 0.84 2023-10-04 06:03:06,029 WARNING [train.py:1204] (2/4) Exclude cut with ID _29003_210_19596_1_1532507391720_3894098_559-236660-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 06:03:08,763 WARNING [train.py:1204] (2/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_312-204798-0_sp1.1 from training. Duration: 0.8454375 2023-10-04 06:03:10,239 WARNING [train.py:1204] (2/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_17-309070-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 06:03:13,560 WARNING [train.py:1204] (2/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_344-152526-0_sp0.9 from training. Duration: 0.588875 2023-10-04 06:03:14,945 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_41452_210_7291_1_1533437687_3941281_405-11559-0_sp1.1 from training. Duration: 0.82725 2023-10-04 06:03:16,788 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_48730_210_5399_1_1533885886_2212769_157-124052-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 06:03:16,845 WARNING [train.py:1204] (2/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_415-78792-0 from training. Duration: 0.81 2023-10-04 06:03:16,849 WARNING [train.py:1204] (2/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_345-340690-0_sp1.1 from training. Duration: 0.8454375 2023-10-04 06:03:18,302 WARNING [train.py:1204] (2/4) Exclude cut with ID _23825_210_13449_1_1531915313526_3497150_124-8138-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 06:03:18,344 WARNING [train.py:1204] (2/4) Exclude cut with ID _49258_210_7994_1_1534122017863_3738861_194-24864-0_sp1.1 from training. Duration: 0.936375 2023-10-04 06:03:18,363 WARNING [train.py:1204] (2/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_369-222060-0_sp1.1 from training. Duration: 0.6454375 2023-10-04 06:03:20,319 WARNING [train.py:1204] (2/4) Exclude cut with ID _12369_210_4381_1_1530356078581_4388520_480-13146-0_sp0.9 from training. Duration: 0.9444375 2023-10-04 06:03:23,156 WARNING [train.py:1204] (2/4) Exclude cut with ID _13253_210_1925_1_1530757775819_3577873_191-144977-0 from training. Duration: 0.78 2023-10-04 06:03:23,176 WARNING [train.py:1204] (2/4) Exclude cut with ID _30304_210_6319_1_1532476827261_7582060_1128-140266-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 06:03:26,009 WARNING [train.py:1204] (2/4) Exclude cut with ID _8772_210_1850_1_1528635595972_3664011_247-336448-0_sp0.9 from training. Duration: 0.82225 2023-10-04 06:03:26,092 WARNING [train.py:1204] (2/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_318-59097-0_sp1.1 from training. Duration: 0.6909375 2023-10-04 06:03:28,844 WARNING [train.py:1204] (2/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_540-131164-0_sp1.1 from training. Duration: 0.836375 2023-10-04 06:03:29,608 WARNING [train.py:1204] (2/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_39-110836-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 06:03:31,458 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.3.encoder.layers.2.whiten, num_groups=1, num_channels=512, metric=3.85 vs. limit=12.0 2023-10-04 06:03:32,137 WARNING [train.py:1204] (2/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_860-96198-0_sp1.1 from training. Duration: 0.663625 2023-10-04 06:03:32,145 WARNING [train.py:1204] (2/4) Exclude cut with ID _15133_210_6228_1_1530794512074_3080379_29-264458-0 from training. Duration: 0.58 2023-10-04 06:03:32,164 WARNING [train.py:1204] (2/4) Exclude cut with ID _55209_210_1531_1_1534675828996_4597160_797-348343-0_sp1.1 from training. Duration: 0.963625 2023-10-04 06:03:33,604 WARNING [train.py:1204] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_23-189234-0_sp1.1 from training. Duration: 0.7545625 2023-10-04 06:03:34,212 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.4.encoder.layers.0.feed_forward1.out_whiten, num_groups=1, num_channels=384, metric=5.07 vs. limit=15.0 2023-10-04 06:03:38,999 WARNING [train.py:1204] (2/4) Exclude cut with ID _5410_210_1388_1_1527397055795_1719020_39-112221-0_sp1.1 from training. Duration: 0.62725 2023-10-04 06:03:39,126 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_43802_210_2290_1_1533516203_8829504_100-348441-0_sp1.1 from training. Duration: 0.82725 2023-10-04 06:03:46,814 WARNING [train.py:1204] (2/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_143-86344-0_sp1.1 from training. Duration: 0.7 2023-10-04 06:03:48,405 WARNING [train.py:1204] (2/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_126-29104-0 from training. Duration: 0.85 2023-10-04 06:03:48,746 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=1551013.3333333333, ans=0.1 2023-10-04 06:03:49,210 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.3.encoder.layers.0.self_attn1.whiten, num_groups=1, num_channels=512, metric=14.11 vs. limit=22.5 2023-10-04 06:03:50,270 WARNING [train.py:1204] (2/4) Exclude cut with ID _39846_210_12651_1_1533603651992_1318030_138-31771-0_sp1.1 from training. Duration: 0.963625 2023-10-04 06:03:54,545 WARNING [train.py:1204] (2/4) Exclude cut with ID _2220_210_1819_1_1525509109898_3658680_50-99837-0_sp1.1 from training. Duration: 0.4909375 2023-10-04 06:03:54,664 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.0.conv_module2.balancer2.prob, batch_count=1551080.0, ans=0.125 2023-10-04 06:03:55,778 WARNING [train.py:1204] (2/4) Exclude cut with ID _6274_210_2084_1_1527937583837_7494020_836-210550-0_sp1.1 from training. Duration: 0.97275 2023-10-04 06:03:58,453 WARNING [train.py:1204] (2/4) Exclude cut with ID _28323_210_16187_1_1532480395667_4100300_306-74561-0 from training. Duration: 0.94 2023-10-04 06:04:03,390 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_177-58005-0_sp1.1 from training. Duration: 0.563625 2023-10-04 06:04:05,114 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.attention_skip_rate, batch_count=1551080.0, ans=0.0 2023-10-04 06:04:06,173 WARNING [train.py:1204] (2/4) Exclude cut with ID _54871_210_22356_1_1534665798253_3856359_19-342632-0_sp1.1 from training. Duration: 0.82725 2023-10-04 06:04:06,196 WARNING [train.py:1204] (2/4) Exclude cut with ID _35355_210_16040_1_1533212988977_4016229_74-190076-0_sp0.9 from training. Duration: 0.888875 2023-10-04 06:04:06,910 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.3.encoder.layers.1.conv_module1.whiten, num_groups=1, num_channels=512, metric=3.29 vs. limit=15.0 2023-10-04 06:04:07,444 INFO [train.py:1046] (2/4) Epoch 44, batch 4250, loss[loss=0.1463, simple_loss=0.2364, pruned_loss=0.02814, over 24476.00 frames. ], tot_loss[loss=0.1535, simple_loss=0.2337, pruned_loss=0.03664, over 4701846.95 frames. ], batch size: 63, lr: 2.30e-03, grad_scale: 16.0 2023-10-04 06:04:08,885 WARNING [train.py:1204] (2/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_285-97786-0_sp1.1 from training. Duration: 0.97275 2023-10-04 06:04:12,388 WARNING [train.py:1204] (2/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_337-181502-0_sp0.9 from training. Duration: 0.72225 2023-10-04 06:04:14,269 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_353-182855-0 from training. Duration: 0.96 2023-10-04 06:04:14,299 WARNING [train.py:1204] (2/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_409-327730-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 06:04:17,050 WARNING [train.py:1204] (2/4) Exclude cut with ID _8030_210_7497_1_1528444886781_3531089_373-19403-0_sp1.1 from training. Duration: 0.97275 2023-10-04 06:04:20,235 WARNING [train.py:1204] (2/4) Exclude cut with ID _6789_210_1534_1_1527857937218_3118950_231-340237-0_sp1.1 from training. Duration: 0.936375 2023-10-04 06:04:24,249 WARNING [train.py:1204] (2/4) Exclude cut with ID _6493_210_1747_1_1528459210424_4012470_536-57832-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 06:04:24,272 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7046_210_6753_1_1527895337_6920917_756-116002-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 06:04:25,675 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_37152_210_9697_1_1533106573_3844188_29-239070-0_sp1.1 from training. Duration: 0.8909375 2023-10-04 06:04:25,678 WARNING [train.py:1204] (2/4) Exclude cut with ID _19023_210_4353_1_1531378892552_5820780_61-334539-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 06:04:27,076 WARNING [train.py:1204] (2/4) Exclude cut with ID _14170_210_5219_1_1530525949868_4469650_159-28488-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 06:04:28,464 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_40896_210_15710_1_1533433956_7100802_764-71272-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 06:04:28,537 WARNING [train.py:1204] (2/4) Exclude cut with ID _44902_210_16187_1_1533888006062_3792078_258-178274-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 06:04:29,329 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.ff2_skip_rate, batch_count=1551213.3333333333, ans=0.0 2023-10-04 06:04:29,447 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.attention_skip_rate, batch_count=1551213.3333333333, ans=0.0 2023-10-04 06:04:31,797 WARNING [train.py:1204] (2/4) Exclude cut with ID _7537_210_2084_1_1528502395431_3924359_14-312906-0_sp1.1 from training. Duration: 0.763625 2023-10-04 06:04:32,069 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.self_attn_weights.pos_emb_skip_rate, batch_count=1551213.3333333333, ans=0.0 2023-10-04 06:04:33,133 WARNING [train.py:1204] (2/4) Exclude cut with ID _49643_210_7989_1_1534640244224_3939031_674-116639-0_sp1.1 from training. Duration: 0.963625 2023-10-04 06:04:34,440 WARNING [train.py:1204] (2/4) Exclude cut with ID _22826_210_9994_1_1531908201522_3279169_415-161-0 from training. Duration: 0.98 2023-10-04 06:04:38,153 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.ff2_skip_rate, batch_count=1551280.0, ans=0.0 2023-10-04 06:04:39,375 WARNING [train.py:1204] (2/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_264-348896-0 from training. Duration: 0.85 2023-10-04 06:04:39,389 WARNING [train.py:1204] (2/4) Exclude cut with ID _51899_210_20034_1_1534129254466_3669220_233-161492-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 06:04:40,752 WARNING [train.py:1204] (2/4) Exclude cut with ID _31255_210_13427_1_1532865619495_3815143_233-200023-0_sp1.1 from training. Duration: 0.936375 2023-10-04 06:04:42,048 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_23850_210_4238_1_1532232597_1632433_100-229879-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 06:04:43,413 WARNING [train.py:1204] (2/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_375-245677-0_sp0.9 from training. Duration: 0.9 2023-10-04 06:04:43,416 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_40223_210_6228_1_1533298404_4812267_355-317415-0_sp1.1 from training. Duration: 0.97275 2023-10-04 06:04:43,466 WARNING [train.py:1204] (2/4) Exclude cut with ID _9679_210_7497_1_1529129212906_4363929_235-81488-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 06:04:48,537 WARNING [train.py:1204] (2/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_172-215932-0_sp1.1 from training. Duration: 0.6 2023-10-04 06:04:49,825 WARNING [train.py:1204] (2/4) Exclude cut with ID _12369_210_4381_1_1530356078581_4388520_64-298917-0_sp1.1 from training. Duration: 0.8 2023-10-04 06:04:51,349 WARNING [train.py:1204] (2/4) Exclude cut with ID _38020_210_5986_1_1533112252259_7006089_131-53533-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 06:04:53,553 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.1.encoder.layers.1.self_attn2.whiten, num_groups=1, num_channels=256, metric=11.00 vs. limit=22.5 2023-10-04 06:04:54,127 WARNING [train.py:1204] (2/4) Exclude cut with ID _42334_210_7770_1_1534206610396_3605880_392-48154-0_sp1.1 from training. Duration: 0.963625 2023-10-04 06:04:54,194 WARNING [train.py:1204] (2/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_337-181502-0 from training. Duration: 0.65 2023-10-04 06:04:55,477 WARNING [train.py:1204] (2/4) Exclude cut with ID _6267_210_2084_1_1527764658526_3894730_375-219887-0_sp0.9 from training. Duration: 0.7444375 2023-10-04 06:04:55,562 WARNING [train.py:1204] (2/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_60-146189-0 from training. Duration: 0.98 2023-10-04 06:04:56,981 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_3133_210_2087_1_1526106446_2324690_243-143119-0_sp1.1 from training. Duration: 0.7 2023-10-04 06:04:58,430 WARNING [train.py:1204] (2/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_1267-272693-0_sp0.9 from training. Duration: 0.811125 2023-10-04 06:04:58,647 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.feed_forward1.hidden_balancer.prob, batch_count=1551346.6666666667, ans=0.125 2023-10-04 06:04:59,874 WARNING [train.py:1204] (2/4) Exclude cut with ID _69884_210_9771_1_1535846392818_4166169_83-182645-0_sp1.1 from training. Duration: 0.97275 2023-10-04 06:04:59,891 WARNING [train.py:1204] (2/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_403-292260-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 06:05:03,214 WARNING [train.py:1204] (2/4) Exclude cut with ID _55429_210_10626_1_1534467588527_7174143_377-169391-0 from training. Duration: 0.96 2023-10-04 06:05:04,489 WARNING [train.py:1204] (2/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_494-199538-0_sp1.1 from training. Duration: 0.6818125 2023-10-04 06:05:05,749 WARNING [train.py:1204] (2/4) Exclude cut with ID _3067_210_2667_1_1526212974640_4503168_119-175776-0_sp1.1 from training. Duration: 0.67275 2023-10-04 06:05:07,312 WARNING [train.py:1204] (2/4) Exclude cut with ID _3231_210_2106_1_1526205864934_6784220_183-303839-0_sp1.1 from training. Duration: 0.97275 2023-10-04 06:05:09,411 WARNING [train.py:1204] (2/4) Exclude cut with ID _18513_210_9206_1_1531618215223_3862620_165-38958-0_sp1.1 from training. Duration: 0.963625 2023-10-04 06:05:10,865 WARNING [train.py:1204] (2/4) Exclude cut with ID _41314_210_12651_1_1533459476543_3766460_87-272068-0_sp1.1 from training. Duration: 0.7545625 2023-10-04 06:05:12,204 WARNING [train.py:1204] (2/4) Exclude cut with ID _3541_210_2477_1_1526475408709_3263973_353-95-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 06:05:13,469 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_2009_210_2071_1_1525481134_4588937_73-302813-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 06:05:14,731 WARNING [train.py:1204] (2/4) Exclude cut with ID _64220_210_9527_1_1535250151772_4875259_88-95011-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 06:05:16,687 WARNING [train.py:1204] (2/4) Exclude cut with ID _44902_210_16187_1_1533888006062_3792078_160-12107-0_sp1.1 from training. Duration: 0.92725 2023-10-04 06:05:16,695 WARNING [train.py:1204] (2/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_547-5424-0 from training. Duration: 0.94 2023-10-04 06:05:18,157 WARNING [train.py:1204] (2/4) Exclude cut with ID _21783_210_11357_1_1531742458730_3762679_268-85656-0_sp1.1 from training. Duration: 0.936375 2023-10-04 06:05:19,446 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.694e+02 1.987e+02 2.246e+02 2.555e+02 4.795e+02, threshold=4.492e+02, percent-clipped=1.0 2023-10-04 06:05:22,549 INFO [train.py:1046] (2/4) Epoch 44, batch 4300, loss[loss=0.1594, simple_loss=0.2505, pruned_loss=0.03409, over 24682.00 frames. ], tot_loss[loss=0.1537, simple_loss=0.234, pruned_loss=0.03675, over 4708131.29 frames. ], batch size: 73, lr: 2.30e-03, grad_scale: 16.0 2023-10-04 06:05:22,636 WARNING [train.py:1204] (2/4) Exclude cut with ID _66210_210_12651_1_1535446812922_3365850_42-284985-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 06:05:23,877 WARNING [train.py:1204] (2/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_465-214726-0_sp0.9 from training. Duration: 0.9555625 2023-10-04 06:05:26,724 WARNING [train.py:1204] (2/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_50-95770-0_sp1.1 from training. Duration: 0.936375 2023-10-04 06:05:31,141 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.bypass.skip_rate, batch_count=1551480.0, ans=0.04949747468305833 2023-10-04 06:05:34,159 WARNING [train.py:1204] (2/4) Exclude cut with ID _30800_210_22382_1_1532499347904_4515959_269-77567-0_sp1.1 from training. Duration: 0.963625 2023-10-04 06:05:34,161 WARNING [train.py:1204] (2/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_7-111954-0 from training. Duration: 0.56 2023-10-04 06:05:35,557 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_43802_210_2290_1_1533516203_8829504_85-195240-0_sp0.9 from training. Duration: 0.9666875 2023-10-04 06:05:35,856 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.feed_forward1.hidden_balancer.prob, batch_count=1551546.6666666667, ans=0.125 2023-10-04 06:05:37,114 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_2822_210_2715_1_1526112586_1999587_165-36656-0_sp0.9 from training. Duration: 0.988875 2023-10-04 06:05:37,132 WARNING [train.py:1204] (2/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_181-78632-0_sp1.1 from training. Duration: 0.7090625 2023-10-04 06:05:38,886 WARNING [train.py:1204] (2/4) Exclude cut with ID IC0671W0044-331887 from training. Duration: 0.896 2023-10-04 06:05:40,411 WARNING [train.py:1204] (2/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_266-137104-0_sp1.1 from training. Duration: 0.5909375 2023-10-04 06:05:41,804 WARNING [train.py:1204] (2/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_137-116442-0_sp0.9 from training. Duration: 0.8666875 2023-10-04 06:05:44,579 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_23801_210_16223_1_1532066392_3294657_570-5439-0 from training. Duration: 0.97 2023-10-04 06:05:44,598 WARNING [train.py:1204] (2/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_29-280622-0_sp1.1 from training. Duration: 0.7454375 2023-10-04 06:05:44,618 WARNING [train.py:1204] (2/4) Exclude cut with ID 3557-8342-0013-54691-0 from training. Duration: 0.92 2023-10-04 06:05:49,793 WARNING [train.py:1204] (2/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_711-15688-0_sp1.1 from training. Duration: 0.3909375 2023-10-04 06:05:51,137 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_15126_210_6753_1_1530778677_2591185_120-277707-0_sp1.1 from training. Duration: 0.72725 2023-10-04 06:05:53,983 WARNING [train.py:1204] (2/4) Exclude cut with ID _14970_210_15709_1_1530775547361_1966965_8-169924-0_sp1.1 from training. Duration: 0.836375 2023-10-04 06:05:53,985 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_4692_210_3528_1_1526899755_4323254_107-44401-0_sp0.9 from training. Duration: 0.9555625 2023-10-04 06:05:55,325 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_3475_210_2028_1_1526193578_3622569_265-276625-0_sp0.9 from training. Duration: 0.8444375 2023-10-04 06:05:56,696 WARNING [train.py:1204] (2/4) Exclude cut with ID _55153_210_2253_1_1534384786677_3162096_148-79022-0_sp1.1 from training. Duration: 0.87275 2023-10-04 06:05:58,048 WARNING [train.py:1204] (2/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_292-36336-0_sp1.1 from training. Duration: 0.92725 2023-10-04 06:05:58,057 WARNING [train.py:1204] (2/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_329-82235-0 from training. Duration: 0.82 2023-10-04 06:05:59,424 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_11305_210_1794_1_1529750428_7178148_257-312870-0 from training. Duration: 0.88 2023-10-04 06:05:59,800 INFO [scaling.py:1118] (2/4) WithLoss: name=encoder.encoders.4.encoder.layers.2.self_attn_weights, loss-sum=0.000e+00 2023-10-04 06:06:00,894 WARNING [train.py:1204] (2/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_436-4493-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 06:06:02,350 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.feed_forward1.hidden_balancer.prob, batch_count=1551613.3333333333, ans=0.125 2023-10-04 06:06:03,482 WARNING [train.py:1204] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_321-54129-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 06:06:03,490 WARNING [train.py:1204] (2/4) Exclude cut with ID _5287_210_2084_1_1527418883040_3878670_333-48508-0_sp0.9 from training. Duration: 0.6666875 2023-10-04 06:06:03,504 WARNING [train.py:1204] (2/4) Exclude cut with ID _26528_210_9070_1_1532570410842_3851310_244-278158-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 06:06:05,453 WARNING [train.py:1204] (2/4) Exclude cut with ID _46952_210_5986_1_1534230010357_3107379_232-344503-0_sp1.1 from training. Duration: 0.87275 2023-10-04 06:06:05,471 WARNING [train.py:1204] (2/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_43-206757-0 from training. Duration: 0.75 2023-10-04 06:06:05,473 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_4183_210_2087_1_1526729100_1869838_141-166600-0 from training. Duration: 0.95 2023-10-04 06:06:06,665 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_296-287678-0 from training. Duration: 0.95 2023-10-04 06:06:06,816 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.1.attention_skip_rate, batch_count=1551680.0, ans=0.0 2023-10-04 06:06:08,090 WARNING [train.py:1204] (2/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_265-60343-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 06:06:08,123 WARNING [train.py:1204] (2/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_332-256168-0 from training. Duration: 0.75 2023-10-04 06:06:09,794 WARNING [train.py:1204] (2/4) Exclude cut with ID _13679_210_7497_1_1530534293662_3355098_411-186088-0 from training. Duration: 0.94 2023-10-04 06:06:15,196 WARNING [train.py:1204] (2/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_458-126779-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 06:06:16,239 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.0.layers.0.self_attn1.whiten, num_groups=1, num_channels=192, metric=11.34 vs. limit=22.5 2023-10-04 06:06:16,714 WARNING [train.py:1204] (2/4) Exclude cut with ID IC0803W0380-397572 from training. Duration: 0.032 2023-10-04 06:06:16,774 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_278-272612-0_sp1.1 from training. Duration: 0.663625 2023-10-04 06:06:18,197 WARNING [train.py:1204] (2/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_491-263101-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 06:06:18,202 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_53555_210_5632_1_1534595220_7232297_334-336778-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 06:06:19,660 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_50-3289-0 from training. Duration: 0.91 2023-10-04 06:06:21,440 WARNING [train.py:1204] (2/4) Exclude cut with ID _8699_210_2069_1_1528630346831_3960409_155-39766-0_sp0.9 from training. Duration: 0.8666875 2023-10-04 06:06:21,446 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_502-82145-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 06:06:21,498 WARNING [train.py:1204] (2/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_550-43132-0_sp1.1 from training. Duration: 0.97275 2023-10-04 06:06:21,538 WARNING [train.py:1204] (2/4) Exclude cut with ID _14998_210_5219_1_1530705582080_7474970_446-112117-0_sp1.1 from training. Duration: 0.8181875 2023-10-04 06:06:22,022 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.5.encoder.layers.0.nonlin_attention.whiten2, num_groups=1, num_channels=256, metric=3.28 vs. limit=15.0 2023-10-04 06:06:23,455 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_3691_210_3528_1_1526468169_4062053_355-334432-0_sp1.1 from training. Duration: 0.7909375 2023-10-04 06:06:24,833 WARNING [train.py:1204] (2/4) Exclude cut with ID _58605_210_12210_1_1534723195939_3904259_239-94720-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 06:06:25,106 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.conv_skip_rate, batch_count=1551746.6666666667, ans=0.0 2023-10-04 06:06:26,225 WARNING [train.py:1204] (2/4) Exclude cut with ID _49376_210_5986_1_1533985278732_3370740_391-344072-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 06:06:27,581 WARNING [train.py:1204] (2/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_558-322014-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 06:06:27,636 WARNING [train.py:1204] (2/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_256-32662-0_sp1.1 from training. Duration: 0.8181875 2023-10-04 06:06:34,507 WARNING [train.py:1204] (2/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_572-185150-0 from training. Duration: 0.97 2023-10-04 06:06:34,542 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_89-35748-0_sp0.9 from training. Duration: 0.87775 2023-10-04 06:06:35,915 INFO [train.py:1046] (2/4) Epoch 44, batch 4350, loss[loss=0.1691, simple_loss=0.2521, pruned_loss=0.04309, over 24066.00 frames. ], tot_loss[loss=0.1541, simple_loss=0.2346, pruned_loss=0.03682, over 4718206.09 frames. ], batch size: 80, lr: 2.30e-03, grad_scale: 16.0 2023-10-04 06:06:38,057 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6291_210_3273_1_1527683649_1078498_88-66747-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 06:06:40,865 WARNING [train.py:1204] (2/4) Exclude cut with ID _19504_210_9994_1_1531648863242_3325220_357-237029-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 06:06:40,992 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=1551813.3333333333, ans=0.1 2023-10-04 06:06:44,208 WARNING [train.py:1204] (2/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_302-2550-0_sp1.1 from training. Duration: 0.736375 2023-10-04 06:06:44,209 WARNING [train.py:1204] (2/4) Exclude cut with ID _9461_210_4557_1_1529060514726_3677451_59-194017-0_sp1.1 from training. Duration: 0.963625 2023-10-04 06:06:50,321 WARNING [train.py:1204] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_665-196155-0_sp1.1 from training. Duration: 0.6090625 2023-10-04 06:06:53,320 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_326-315007-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 06:06:57,711 WARNING [train.py:1204] (2/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_105-328174-0_sp1.1 from training. Duration: 0.6909375 2023-10-04 06:06:57,718 WARNING [train.py:1204] (2/4) Exclude cut with ID _56494_210_4220_1_1535284837625_3033169_106-263722-0_sp1.1 from training. Duration: 0.97275 2023-10-04 06:06:59,163 WARNING [train.py:1204] (2/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_770-200430-0_sp0.9 from training. Duration: 0.988875 2023-10-04 06:07:00,569 WARNING [train.py:1204] (2/4) Exclude cut with ID _65640_210_6310_1_1535368201445_3173250_209-13008-0_sp1.1 from training. Duration: 0.9 2023-10-04 06:07:00,689 WARNING [train.py:1204] (2/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_212-149947-0_sp0.9 from training. Duration: 0.811125 2023-10-04 06:07:04,914 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.0.self_attn_weights.pos_emb_skip_rate, batch_count=1551946.6666666667, ans=0.0 2023-10-04 06:07:06,303 WARNING [train.py:1204] (2/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_188-171673-0 from training. Duration: 0.58 2023-10-04 06:07:08,038 WARNING [train.py:1204] (2/4) Exclude cut with ID _58895_210_8750_1_1535184081320_3649089_286-281724-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 06:07:08,108 WARNING [train.py:1204] (2/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_16-254909-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 06:07:13,606 WARNING [train.py:1204] (2/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_52-95655-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 06:07:16,825 WARNING [train.py:1204] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_554-45617-0 from training. Duration: 0.99 2023-10-04 06:07:18,515 WARNING [train.py:1204] (2/4) Exclude cut with ID _14683_210_5219_1_1530698352608_3974859_473-344611-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 06:07:20,238 WARNING [train.py:1204] (2/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_17-25045-0_sp0.9 from training. Duration: 0.6555625 2023-10-04 06:07:24,419 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9072W0003-530637 from training. Duration: 0.768 2023-10-04 06:07:26,264 WARNING [train.py:1204] (2/4) Exclude cut with ID _38670_210_15379_1_1533448810659_4391059_76-74025-0_sp1.1 from training. Duration: 0.936375 2023-10-04 06:07:26,297 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6291_210_3273_1_1527683649_1078498_74-292608-0_sp0.9 from training. Duration: 0.8 2023-10-04 06:07:27,694 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9084W0025-536862 from training. Duration: 0.896 2023-10-04 06:07:27,764 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9087W0021-538392 from training. Duration: 0.768 2023-10-04 06:07:27,769 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7543_210_6753_1_1528543323_645515_56-163234-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 06:07:29,153 WARNING [train.py:1204] (2/4) Exclude cut with ID _45560_210_21982_1_1533812603951_3584999_44-332643-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 06:07:30,524 WARNING [train.py:1204] (2/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_1285-256622-0_sp1.1 from training. Duration: 0.763625 2023-10-04 06:07:30,587 WARNING [train.py:1204] (2/4) Exclude cut with ID _52781_210_16187_1_1534297729099_1118149_141-325632-0_sp1.1 from training. Duration: 0.936375 2023-10-04 06:07:31,901 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_40896_210_15710_1_1533433956_7100802_1051-137631-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 06:07:31,934 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_13091_210_7925_1_1530609447_8987354_233-18333-0_sp1.1 from training. Duration: 0.97275 2023-10-04 06:07:34,720 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_34008_210_10431_1_1533345424_8183247_662-317331-0 from training. Duration: 0.94 2023-10-04 06:07:34,725 WARNING [train.py:1204] (2/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_291-248920-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 06:07:34,727 WARNING [train.py:1204] (2/4) Exclude cut with ID _65263_210_22356_1_1535763598691_3921037_274-288481-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 06:07:34,755 WARNING [train.py:1204] (2/4) Exclude cut with ID _5189_210_4557_1_1527248319428_3769229_233-168080-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 06:07:36,087 WARNING [train.py:1204] (2/4) Exclude cut with ID _54871_210_22356_1_1534665798253_3856359_135-184281-0 from training. Duration: 0.99 2023-10-04 06:07:37,510 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9123W0012-555972 from training. Duration: 0.896 2023-10-04 06:07:37,514 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9123W0118-556077 from training. Duration: 0.896 2023-10-04 06:07:37,523 WARNING [train.py:1204] (2/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_406-275649-0 from training. Duration: 0.42 2023-10-04 06:07:41,936 WARNING [train.py:1204] (2/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_309-13684-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 06:07:41,971 WARNING [train.py:1204] (2/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_129-337802-0_sp1.1 from training. Duration: 0.6545625 2023-10-04 06:07:41,999 WARNING [train.py:1204] (2/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_649-266825-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 06:07:43,394 WARNING [train.py:1204] (2/4) Exclude cut with ID _40519_210_13794_1_1533715228655_7337390_895-224742-0_sp1.1 from training. Duration: 0.92725 2023-10-04 06:07:44,321 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.1.encoder.layers.0.nonlin_attention.whiten2, num_groups=1, num_channels=256, metric=5.74 vs. limit=15.0 2023-10-04 06:07:44,786 WARNING [train.py:1204] (2/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_234-14174-0 from training. Duration: 0.82 2023-10-04 06:07:46,189 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9156W0009-572502 from training. Duration: 0.768 2023-10-04 06:07:46,204 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_424-245529-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 06:07:46,794 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.4.encoder.layers.0.feed_forward2.out_whiten, num_groups=1, num_channels=384, metric=5.74 vs. limit=15.0 2023-10-04 06:07:47,973 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.607e+02 1.987e+02 2.204e+02 2.500e+02 5.176e+02, threshold=4.408e+02, percent-clipped=1.0 2023-10-04 06:07:50,783 INFO [train.py:1046] (2/4) Epoch 44, batch 4400, loss[loss=0.1543, simple_loss=0.2421, pruned_loss=0.0333, over 23359.00 frames. ], tot_loss[loss=0.1545, simple_loss=0.235, pruned_loss=0.03705, over 4718465.75 frames. ], batch size: 93, lr: 2.30e-03, grad_scale: 32.0 2023-10-04 06:07:52,668 WARNING [train.py:1204] (2/4) Exclude cut with ID _26528_210_9070_1_1532570410842_3851310_232-10149-0_sp1.1 from training. Duration: 0.963625 2023-10-04 06:07:52,686 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_17292_210_6753_1_1531125388_2943734_152-30410-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 06:07:54,144 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_35441_210_16554_1_1533347572_4092680_291-82075-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 06:07:55,569 WARNING [train.py:1204] (2/4) Exclude cut with ID _12369_210_4381_1_1530356078581_4388520_64-298917-0 from training. Duration: 0.88 2023-10-04 06:07:56,875 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_5618_210_1874_1_1527311008_5851_1-214501-0_sp0.9 from training. Duration: 0.7655625 2023-10-04 06:07:56,918 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_9460_210_4211_1_1528954587_10049667_572-37035-0 from training. Duration: 0.8 2023-10-04 06:07:58,157 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9193W0023-590322 from training. Duration: 0.896 2023-10-04 06:07:59,983 WARNING [train.py:1204] (2/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_206-10097-0_sp1.1 from training. Duration: 0.4454375 2023-10-04 06:07:59,995 WARNING [train.py:1204] (2/4) Exclude cut with ID _55134_210_21982_1_1534590028377_3882170_17-51731-0_sp1.1 from training. Duration: 0.963625 2023-10-04 06:08:02,660 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_14953_210_2151_1_1530701710_5616165_710-165400-0 from training. Duration: 0.87 2023-10-04 06:08:05,262 WARNING [train.py:1204] (2/4) Exclude cut with ID _53203_210_2087_1_1534246218309_475590_61-85777-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 06:08:05,356 WARNING [train.py:1204] (2/4) Exclude cut with ID _55844_210_2346_1_1534471212569_7625707_1050-10453-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 06:08:05,376 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9218W0013-603297 from training. Duration: 0.896 2023-10-04 06:08:09,383 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_20120_210_6753_1_1531643035_5482859_267-102818-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 06:08:09,384 WARNING [train.py:1204] (2/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_191-41023-0 from training. Duration: 0.71 2023-10-04 06:08:09,436 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9231W0202-610227 from training. Duration: 0.896 2023-10-04 06:08:14,019 WARNING [train.py:1204] (2/4) Exclude cut with ID _6537_210_1883_1_1527987559710_3597129_382-133062-0 from training. Duration: 0.68 2023-10-04 06:08:14,055 WARNING [train.py:1204] (2/4) Exclude cut with ID _28427_210_4220_1_1532581240967_3061069_43-84928-0 from training. Duration: 0.99 2023-10-04 06:08:15,440 WARNING [train.py:1204] (2/4) Exclude cut with ID _59256_210_5986_1_1535076085997_3589120_121-237386-0 from training. Duration: 0.96 2023-10-04 06:08:15,461 WARNING [train.py:1204] (2/4) Exclude cut with ID _8480_210_1874_1_1528589391205_6728330_137-256986-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 06:08:15,555 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_9417_210_1794_1_1529235261_4614131_479-346963-0_sp1.1 from training. Duration: 0.936375 2023-10-04 06:08:16,763 WARNING [train.py:1204] (2/4) Exclude cut with ID _26474_210_9570_1_1532520022699_3623380_267-7362-0_sp1.1 from training. Duration: 0.936375 2023-10-04 06:08:16,851 WARNING [train.py:1204] (2/4) Exclude cut with ID _55328_210_27767_1_1534580973260_4088170_287-61272-0_sp1.1 from training. Duration: 0.97275 2023-10-04 06:08:18,245 WARNING [train.py:1204] (2/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_589-9070-0 from training. Duration: 0.71 2023-10-04 06:08:18,252 WARNING [train.py:1204] (2/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_148-158509-0_sp1.1 from training. Duration: 0.37275 2023-10-04 06:08:19,618 WARNING [train.py:1204] (2/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_164-184548-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 06:08:21,578 WARNING [train.py:1204] (2/4) Exclude cut with ID _14970_210_15709_1_1530775547361_1966965_6-23328-0_sp1.1 from training. Duration: 0.7818125 2023-10-04 06:08:21,583 WARNING [train.py:1204] (2/4) Exclude cut with ID _29303_210_2575_1_1532392237303_7410649_612-165069-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 06:08:22,928 WARNING [train.py:1204] (2/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_383-321224-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 06:08:24,205 WARNING [train.py:1204] (2/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_283-160525-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 06:08:24,227 WARNING [train.py:1204] (2/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_505-97987-0 from training. Duration: 0.86 2023-10-04 06:08:26,248 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9309W0190-640317 from training. Duration: 0.768 2023-10-04 06:08:29,083 WARNING [train.py:1204] (2/4) Exclude cut with ID _50093_210_4220_1_1534417141207_3432299_280-297390-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 06:08:35,150 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_17292_210_6753_1_1531125388_2943734_320-71385-0_sp1.1 from training. Duration: 0.97275 2023-10-04 06:08:38,002 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_213-103282-0 from training. Duration: 0.78 2023-10-04 06:08:41,459 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7615_210_4211_1_1528110965_4580160_72-235211-0_sp0.9 from training. Duration: 0.8444375 2023-10-04 06:08:43,966 WARNING [train.py:1204] (2/4) Exclude cut with ID _14867_210_1968_1_1530684179890_3846969_173-320977-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 06:08:45,460 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7046_210_6753_1_1527895337_6920917_783-278109-0_sp0.9 from training. Duration: 0.8555625 2023-10-04 06:08:46,765 WARNING [train.py:1204] (2/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_90-30673-0 from training. Duration: 0.84 2023-10-04 06:08:46,780 WARNING [train.py:1204] (2/4) Exclude cut with ID _22826_210_9994_1_1531908201522_3279169_415-161-0_sp1.1 from training. Duration: 0.8909375 2023-10-04 06:08:46,796 WARNING [train.py:1204] (2/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_86-66192-0_sp0.9 from training. Duration: 0.611125 2023-10-04 06:08:46,804 WARNING [train.py:1204] (2/4) Exclude cut with ID _4626_210_3532_1_1526817652717_3916859_446-206656-0_sp1.1 from training. Duration: 0.6909375 2023-10-04 06:08:48,278 WARNING [train.py:1204] (2/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_166-128941-0_sp0.9 from training. Duration: 0.6 2023-10-04 06:08:51,258 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.conv_module1.balancer1.prob, batch_count=1552413.3333333333, ans=0.125 2023-10-04 06:08:52,461 WARNING [train.py:1204] (2/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_345-167986-0 from training. Duration: 0.84 2023-10-04 06:08:55,792 WARNING [train.py:1204] (2/4) Exclude cut with ID _6275_210_2084_1_1528110062213_7669049_973-198085-0 from training. Duration: 0.75 2023-10-04 06:08:57,269 WARNING [train.py:1204] (2/4) Exclude cut with ID _60057_210_10640_1_1534903065793_3333150_171-181133-0 from training. Duration: 0.88 2023-10-04 06:08:57,283 WARNING [train.py:1204] (2/4) Exclude cut with ID _19504_210_9994_1_1531648863242_3325220_336-196513-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 06:08:57,290 WARNING [train.py:1204] (2/4) Exclude cut with ID _6493_210_1747_1_1528459210424_4012470_513-140967-0 from training. Duration: 0.79 2023-10-04 06:08:59,197 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6673_210_3273_1_1527767790_3173240_327-306185-0_sp0.9 from training. Duration: 0.92225 2023-10-04 06:09:01,921 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1032-236863-0_sp1.1 from training. Duration: 0.763625 2023-10-04 06:09:03,378 WARNING [train.py:1204] (2/4) Exclude cut with ID _14867_210_1968_1_1530684179890_3846969_413-29292-0 from training. Duration: 0.9 2023-10-04 06:09:05,167 INFO [train.py:1046] (2/4) Epoch 44, batch 4450, loss[loss=0.1456, simple_loss=0.2283, pruned_loss=0.03149, over 24323.00 frames. ], tot_loss[loss=0.1558, simple_loss=0.2364, pruned_loss=0.03762, over 4723324.65 frames. ], batch size: 61, lr: 2.30e-03, grad_scale: 32.0 2023-10-04 06:09:06,750 WARNING [train.py:1204] (2/4) Exclude cut with ID _47251_210_18628_1_1533814063128_4137250_186-343322-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 06:09:09,425 WARNING [train.py:1204] (2/4) Exclude cut with ID _56235_210_10640_1_1534557335235_4161099_624-46091-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 06:09:09,472 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_791-197891-0_sp0.9 from training. Duration: 0.9333125 2023-10-04 06:09:15,287 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_50050_210_16223_1_1535435796_7290138_786-288457-0_sp1.1 from training. Duration: 0.92725 2023-10-04 06:09:16,591 WARNING [train.py:1204] (2/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_213-227283-0_sp1.1 from training. Duration: 0.963625 2023-10-04 06:09:18,053 WARNING [train.py:1204] (2/4) Exclude cut with ID _42659_210_2070_1_1533607442765_3120380_58-341121-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 06:09:20,736 WARNING [train.py:1204] (2/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_506-23200-0_sp1.1 from training. Duration: 0.8818125 2023-10-04 06:09:23,378 WARNING [train.py:1204] (2/4) Exclude cut with ID _14170_210_5219_1_1530525949868_4469650_336-213530-0_sp1.1 from training. Duration: 0.8181875 2023-10-04 06:09:25,047 WARNING [train.py:1204] (2/4) Exclude cut with ID _64135_210_2253_1_1535677267876_7608972_180-5312-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 06:09:25,133 WARNING [train.py:1204] (2/4) Exclude cut with ID _12302_210_5219_1_1529924446466_8107280_835-279754-0 from training. Duration: 0.9 2023-10-04 06:09:25,134 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_20120_210_6753_1_1531643035_5482859_510-96996-0_sp1.1 from training. Duration: 0.7909375 2023-10-04 06:09:25,198 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_15517_210_6753_1_1530954008_3487336_216-187834-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 06:09:27,033 WARNING [train.py:1204] (2/4) Exclude cut with ID _19504_210_9994_1_1531648863242_3325220_414-113427-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 06:09:27,035 WARNING [train.py:1204] (2/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_264-108685-0_sp0.9 from training. Duration: 0.611125 2023-10-04 06:09:29,800 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_5438_210_1794_1_1527336687_2600009_104-288274-0_sp0.9 from training. Duration: 0.6444375 2023-10-04 06:09:35,216 WARNING [train.py:1204] (2/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_520-39496-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 06:09:36,925 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_37818_210_4238_1_1533620910_4720083_315-308565-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 06:09:38,355 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_2009_210_2071_1_1525481134_4588937_385-302100-0_sp1.1 from training. Duration: 0.7909375 2023-10-04 06:09:38,954 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.4.encoder.layers.0.conv_module1.whiten, num_groups=1, num_channels=384, metric=4.47 vs. limit=15.0 2023-10-04 06:09:39,705 WARNING [train.py:1204] (2/4) Exclude cut with ID _58895_210_8750_1_1535184081320_3649089_277-53219-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 06:09:41,056 WARNING [train.py:1204] (2/4) Exclude cut with ID _26664_210_7770_1_1532258943280_3418270_121-111258-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 06:09:45,059 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.5.encoder.layers.1.feed_forward3.out_whiten, num_groups=1, num_channels=256, metric=12.46 vs. limit=15.0 2023-10-04 06:09:45,953 WARNING [train.py:1204] (2/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_99-167392-0_sp1.1 from training. Duration: 0.5545625 2023-10-04 06:09:47,285 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1113-7369-0 from training. Duration: 0.85 2023-10-04 06:09:47,301 WARNING [train.py:1204] (2/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_352-59958-0 from training. Duration: 0.86 2023-10-04 06:09:47,305 WARNING [train.py:1204] (2/4) Exclude cut with ID _14886_210_4220_1_1530702020355_3516190_159-73748-0_sp1.1 from training. Duration: 0.7545625 2023-10-04 06:09:50,042 WARNING [train.py:1204] (2/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_131-119569-0_sp1.1 from training. Duration: 0.92725 2023-10-04 06:09:50,125 WARNING [train.py:1204] (2/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_216-343007-0 from training. Duration: 0.66 2023-10-04 06:09:53,007 WARNING [train.py:1204] (2/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_113-92608-0_sp0.9 from training. Duration: 0.6 2023-10-04 06:09:56,998 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_40047_210_2290_1_1533290519_2374311_132-123271-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 06:09:57,055 WARNING [train.py:1204] (2/4) Exclude cut with ID _8793_210_6310_1_1528542985139_2310009_121-179782-0 from training. Duration: 0.8 2023-10-04 06:09:57,071 WARNING [train.py:1204] (2/4) Exclude cut with ID _43997_210_4525_1_1533538632555_5189329_283-218639-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 06:09:57,074 WARNING [train.py:1204] (2/4) Exclude cut with ID _49376_210_5986_1_1533985278732_3370740_297-78397-0_sp1.1 from training. Duration: 0.936375 2023-10-04 06:09:57,094 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_3475_210_2028_1_1526193578_3622569_377-166133-0_sp0.9 from training. Duration: 0.9555625 2023-10-04 06:09:57,100 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_520-213149-0_sp1.1 from training. Duration: 0.92725 2023-10-04 06:09:59,216 WARNING [train.py:1204] (2/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_567-144483-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 06:10:01,917 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_4183_210_2087_1_1526729100_1869838_143-280962-0_sp0.9 from training. Duration: 0.62225 2023-10-04 06:10:01,963 WARNING [train.py:1204] (2/4) Exclude cut with ID _49643_210_7989_1_1534640244224_3939031_611-185515-0 from training. Duration: 0.94 2023-10-04 06:10:03,441 WARNING [train.py:1204] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_88-341718-0_sp0.9 from training. Duration: 0.7333125 2023-10-04 06:10:06,259 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_51225_210_3601_1_1534400634_2327383_49-235441-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 06:10:08,090 WARNING [train.py:1204] (2/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_240-294400-0_sp1.1 from training. Duration: 0.936375 2023-10-04 06:10:09,485 WARNING [train.py:1204] (2/4) Exclude cut with ID _55429_210_10626_1_1534467588527_7174143_640-304825-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 06:10:09,597 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.0.conv_module1.balancer2.prob, batch_count=1552746.6666666667, ans=0.125 2023-10-04 06:10:10,740 WARNING [train.py:1204] (2/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_187-221566-0_sp1.1 from training. Duration: 0.3909375 2023-10-04 06:10:12,191 WARNING [train.py:1204] (2/4) Exclude cut with ID _7606_210_4915_1_1528894253277_5080690_127-177761-0_sp0.9 from training. Duration: 0.711125 2023-10-04 06:10:14,825 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.606e+02 1.948e+02 2.234e+02 2.482e+02 3.571e+02, threshold=4.467e+02, percent-clipped=0.0 2023-10-04 06:10:14,922 WARNING [train.py:1204] (2/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_430-116139-0 from training. Duration: 0.85 2023-10-04 06:10:16,938 WARNING [train.py:1204] (2/4) Exclude cut with ID _3675_210_4557_1_1526639934408_4182089_456-18305-0_sp0.9 from training. Duration: 0.8444375 2023-10-04 06:10:18,168 INFO [train.py:1046] (2/4) Epoch 44, batch 4500, loss[loss=0.1616, simple_loss=0.2509, pruned_loss=0.0362, over 24445.00 frames. ], tot_loss[loss=0.1562, simple_loss=0.237, pruned_loss=0.0377, over 4730837.82 frames. ], batch size: 69, lr: 2.30e-03, grad_scale: 32.0 2023-10-04 06:10:22,335 WARNING [train.py:1204] (2/4) Exclude cut with ID _18513_210_9206_1_1531618215223_3862620_402-5957-0_sp1.1 from training. Duration: 0.9 2023-10-04 06:10:22,436 WARNING [train.py:1204] (2/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_465-156500-0 from training. Duration: 0.72 2023-10-04 06:10:22,437 WARNING [train.py:1204] (2/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_714-310538-0 from training. Duration: 0.86 2023-10-04 06:10:25,126 WARNING [train.py:1204] (2/4) Exclude cut with ID _39411_210_5986_1_1533538825030_3343649_335-301187-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 06:10:30,248 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_41750_210_1960_1_1533520678_3948042_241-158616-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 06:10:30,303 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_46013_210_7234_1_1533776362_3744032_50-141535-0_sp1.1 from training. Duration: 0.9 2023-10-04 06:10:31,615 WARNING [train.py:1204] (2/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_43-206757-0_sp0.9 from training. Duration: 0.8333125 2023-10-04 06:10:31,682 WARNING [train.py:1204] (2/4) Exclude cut with ID _22961_210_8254_1_1531976366672_4278180_172-276296-0_sp1.1 from training. Duration: 0.97275 2023-10-04 06:10:32,977 WARNING [train.py:1204] (2/4) Exclude cut with ID _17956_210_4353_1_1531209568125_6979970_1305-301903-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 06:10:33,023 WARNING [train.py:1204] (2/4) Exclude cut with ID _28323_210_16187_1_1532480395667_4100300_604-44507-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 06:10:33,222 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.conv_module1.balancer2.min_positive, batch_count=1552880.0, ans=0.05 2023-10-04 06:10:38,029 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.2.encoder.layers.1.whiten, num_groups=1, num_channels=384, metric=4.58 vs. limit=12.0 2023-10-04 06:10:38,901 INFO [scaling.py:1118] (2/4) WithLoss: name=encoder.encoders.3.encoder.layers.0.self_attn_weights, loss-sum=0.000e+00 2023-10-04 06:10:46,237 WARNING [train.py:1204] (2/4) Exclude cut with ID _23514_210_1389_1_1531832139785_3031439_88-337544-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 06:10:46,302 WARNING [train.py:1204] (2/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_932-3672-0_sp1.1 from training. Duration: 0.963625 2023-10-04 06:10:46,470 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.conv_module2.balancer1.prob, batch_count=1552946.6666666667, ans=0.125 2023-10-04 06:10:47,723 WARNING [train.py:1204] (2/4) Exclude cut with ID _46502_210_13794_1_1533724452198_3657159_207-109991-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 06:10:49,627 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_216-73841-0_sp1.1 from training. Duration: 0.87275 2023-10-04 06:10:49,716 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8453_210_4211_1_1528502940_8562730_347-330673-0_sp1.1 from training. Duration: 0.6545625 2023-10-04 06:10:55,296 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_4183_210_2087_1_1526729100_1869838_143-280962-0_sp1.1 from training. Duration: 0.5090625 2023-10-04 06:11:00,221 WARNING [train.py:1204] (2/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_325-113798-0_sp1.1 from training. Duration: 0.77275 2023-10-04 06:11:04,462 WARNING [train.py:1204] (2/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_91-212486-0_sp0.9 from training. Duration: 0.7555625 2023-10-04 06:11:05,944 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8776_210_2040_1_1528638837_4088036_244-278763-0_sp1.1 from training. Duration: 0.8181875 2023-10-04 06:11:05,974 WARNING [train.py:1204] (2/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_106-286130-0 from training. Duration: 0.98 2023-10-04 06:11:07,264 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_2746_210_1988_1_1525872816_3795627_372-205861-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 06:11:07,301 WARNING [train.py:1204] (2/4) Exclude cut with ID _30800_210_22382_1_1532499347904_4515959_240-54646-0_sp1.1 from training. Duration: 0.936375 2023-10-04 06:11:07,580 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.feed_forward1.hidden_balancer.prob, batch_count=1553013.3333333333, ans=0.125 2023-10-04 06:11:09,954 WARNING [train.py:1204] (2/4) Exclude cut with ID _59500_210_8254_1_1534809610680_3736009_162-180695-0_sp1.1 from training. Duration: 0.936375 2023-10-04 06:11:09,978 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7544_210_6753_1_1528591327_1471426_146-93357-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 06:11:13,234 WARNING [train.py:1204] (2/4) Exclude cut with ID _42657_210_2070_1_1533520654016_3327008_405-87432-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 06:11:13,266 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_4528_210_3273_1_1527076654_3276777_209-219804-0 from training. Duration: 0.69 2023-10-04 06:11:13,266 WARNING [train.py:1204] (2/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_78-182912-0_sp0.9 from training. Duration: 0.6555625 2023-10-04 06:11:13,273 WARNING [train.py:1204] (2/4) Exclude cut with ID _31255_210_13427_1_1532865619495_3815143_322-114998-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 06:11:15,399 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.2.encoder.layers.2.feed_forward1.out_whiten, num_groups=1, num_channels=384, metric=5.23 vs. limit=15.0 2023-10-04 06:11:17,451 WARNING [train.py:1204] (2/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_848-280957-0_sp0.9 from training. Duration: 0.9666875 2023-10-04 06:11:18,702 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7696_210_2040_1_1528207167_3716099_117-125434-0_sp0.9 from training. Duration: 0.8444375 2023-10-04 06:11:20,554 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_1002-109192-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 06:11:23,334 WARNING [train.py:1204] (2/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_138-64210-0_sp0.9 from training. Duration: 0.811125 2023-10-04 06:11:23,357 WARNING [train.py:1204] (2/4) Exclude cut with ID _25808_210_13292_1_1532343514562_7366109_633-47524-0_sp1.1 from training. Duration: 0.9 2023-10-04 06:11:24,761 WARNING [train.py:1204] (2/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_297-5233-0 from training. Duration: 0.95 2023-10-04 06:11:26,681 WARNING [train.py:1204] (2/4) Exclude cut with ID _8394_210_6702_1_1528590504316_3540189_335-274780-0 from training. Duration: 0.78 2023-10-04 06:11:26,686 WARNING [train.py:1204] (2/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_301-330455-0 from training. Duration: 0.8 2023-10-04 06:11:28,225 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.0.ff2_skip_rate, batch_count=1553080.0, ans=0.0 2023-10-04 06:11:30,838 WARNING [train.py:1204] (2/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_383-205833-0 from training. Duration: 0.95 2023-10-04 06:11:32,271 INFO [train.py:1046] (2/4) Epoch 44, batch 4550, loss[loss=0.1477, simple_loss=0.2275, pruned_loss=0.03388, over 21084.00 frames. ], tot_loss[loss=0.1554, simple_loss=0.2364, pruned_loss=0.03722, over 4736429.91 frames. ], batch size: 45, lr: 2.30e-03, grad_scale: 32.0 2023-10-04 06:11:32,473 WARNING [train.py:1204] (2/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_48-182155-0 from training. Duration: 0.87 2023-10-04 06:11:33,760 WARNING [train.py:1204] (2/4) Exclude cut with ID _14832_210_8750_1_1530691351539_3171348_1-54315-0_sp1.1 from training. Duration: 0.7909375 2023-10-04 06:11:36,546 WARNING [train.py:1204] (2/4) Exclude cut with ID _44982_210_1511_1_1534312820108_3726029_413-100814-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 06:11:36,591 WARNING [train.py:1204] (2/4) Exclude cut with ID _3231_210_2106_1_1526205864934_6784220_104-261392-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 06:11:40,508 WARNING [train.py:1204] (2/4) Exclude cut with ID _24314_210_4915_1_1532135078116_7170179_1-13588-0_sp1.1 from training. Duration: 0.97275 2023-10-04 06:11:45,267 WARNING [train.py:1204] (2/4) Exclude cut with ID _44304_210_15374_1_1533639352765_4613129_141-8356-0_sp1.1 from training. Duration: 0.963625 2023-10-04 06:11:45,521 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.feed_forward1.out_proj.dropout_p, batch_count=1553213.3333333333, ans=0.1 2023-10-04 06:11:46,632 WARNING [train.py:1204] (2/4) Exclude cut with ID _37892_210_19596_1_1533198677897_3964058_374-335034-0_sp1.1 from training. Duration: 0.936375 2023-10-04 06:11:48,029 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_195-257512-0_sp0.9 from training. Duration: 0.7444375 2023-10-04 06:11:48,038 WARNING [train.py:1204] (2/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_1267-272693-0_sp1.1 from training. Duration: 0.663625 2023-10-04 06:11:48,039 WARNING [train.py:1204] (2/4) Exclude cut with ID _30196_210_1664_1_1533708496522_7074140_690-326155-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 06:11:49,999 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_68847_210_14656_1_1536070214_2586843_38-20717-0_sp1.1 from training. Duration: 0.97275 2023-10-04 06:11:51,324 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_99-251314-0_sp1.1 from training. Duration: 0.7909375 2023-10-04 06:11:54,376 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.ff3_skip_rate, batch_count=1553213.3333333333, ans=0.0 2023-10-04 06:11:55,970 WARNING [train.py:1204] (2/4) Exclude cut with ID _49376_210_5986_1_1533985278732_3370740_225-95630-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 06:11:56,190 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.1.nonlin_attention.balancer.prob, batch_count=1553213.3333333333, ans=0.125 2023-10-04 06:11:58,050 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_2655_210_2087_1_1525688959_3897400_202-147557-0 from training. Duration: 0.85 2023-10-04 06:11:58,112 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_5618_210_1874_1_1527311008_5851_1-214501-0_sp1.1 from training. Duration: 0.626375 2023-10-04 06:11:59,395 WARNING [train.py:1204] (2/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_225-43381-0_sp1.1 from training. Duration: 0.8545625 2023-10-04 06:12:00,776 WARNING [train.py:1204] (2/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_242-3472-0 from training. Duration: 0.73 2023-10-04 06:12:03,684 WARNING [train.py:1204] (2/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_339-104791-0 from training. Duration: 0.49 2023-10-04 06:12:03,863 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.1.feed_forward1.out_proj.dropout_p, batch_count=1553280.0, ans=0.1 2023-10-04 06:12:04,994 WARNING [train.py:1204] (2/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_488-92261-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 06:12:07,841 WARNING [train.py:1204] (2/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_175-163894-0 from training. Duration: 0.74 2023-10-04 06:12:09,204 WARNING [train.py:1204] (2/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_41-234353-0_sp0.9 from training. Duration: 0.7555625 2023-10-04 06:12:12,117 WARNING [train.py:1204] (2/4) Exclude cut with ID _47251_210_18628_1_1533814063128_4137250_116-275927-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 06:12:12,160 WARNING [train.py:1204] (2/4) Exclude cut with ID _4697_210_2069_1_1526815801953_3823098_61-223324-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 06:12:13,678 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6961_210_2087_1_1527937168_6815011_562-165518-0_sp1.1 from training. Duration: 0.7 2023-10-04 06:12:15,660 WARNING [train.py:1204] (2/4) Exclude cut with ID _21989_210_5854_1_1531899214931_3271360_133-44759-0 from training. Duration: 0.77 2023-10-04 06:12:17,138 WARNING [train.py:1204] (2/4) Exclude cut with ID _23557_210_8750_1_1531890106370_6846255_77-284923-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 06:12:19,893 WARNING [train.py:1204] (2/4) Exclude cut with ID _13678_210_7497_1_1530530069226_3463170_464-296127-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 06:12:21,227 WARNING [train.py:1204] (2/4) Exclude cut with ID _22613_210_5219_1_1532253599686_4090170_393-36663-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 06:12:21,442 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=1553346.6666666667, ans=0.1 2023-10-04 06:12:23,287 WARNING [train.py:1204] (2/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_235-262124-0_sp0.9 from training. Duration: 0.7444375 2023-10-04 06:12:23,379 WARNING [train.py:1204] (2/4) Exclude cut with ID _8480_210_1874_1_1528589391205_6728330_755-112625-0 from training. Duration: 0.74 2023-10-04 06:12:24,666 WARNING [train.py:1204] (2/4) Exclude cut with ID _6456_210_1534_1_1527681634212_3181049_215-309359-0 from training. Duration: 0.88 2023-10-04 06:12:24,682 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_3019_210_2028_1_1526044245_3264865_191-72235-0_sp1.1 from training. Duration: 0.8818125 2023-10-04 06:12:25,999 WARNING [train.py:1204] (2/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_380-162648-0 from training. Duration: 0.94 2023-10-04 06:12:26,158 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_92-46009-0 from training. Duration: 0.54 2023-10-04 06:12:27,854 WARNING [train.py:1204] (2/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_53-300544-0_sp0.9 from training. Duration: 0.7444375 2023-10-04 06:12:27,974 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_68235_210_1925_1_1535863247_4484656_101-250943-0_sp1.1 from training. Duration: 0.97275 2023-10-04 06:12:27,984 WARNING [train.py:1204] (2/4) Exclude cut with ID _14970_210_15709_1_1530775547361_1966965_204-22182-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 06:12:30,749 WARNING [train.py:1204] (2/4) Exclude cut with ID _55153_210_2253_1_1534384786677_3162096_310-130092-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 06:12:30,769 WARNING [train.py:1204] (2/4) Exclude cut with ID _60113_210_16271_1_1535164212481_4386079_559-167191-0_sp1.1 from training. Duration: 0.8454375 2023-10-04 06:12:30,960 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=1553413.3333333333, ans=0.1 2023-10-04 06:12:32,244 WARNING [train.py:1204] (2/4) Exclude cut with ID _14368_210_4220_1_1530532714870_3595529_93-58002-0_sp1.1 from training. Duration: 0.5181875 2023-10-04 06:12:33,481 WARNING [train.py:1204] (2/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_110-224688-0 from training. Duration: 0.97 2023-10-04 06:12:34,901 WARNING [train.py:1204] (2/4) Exclude cut with ID _40402_210_18219_1_1533522324115_2953150_178-178004-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 06:12:34,912 WARNING [train.py:1204] (2/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_321-335475-0_sp1.1 from training. Duration: 0.4090625 2023-10-04 06:12:34,977 WARNING [train.py:1204] (2/4) Exclude cut with ID _66863_210_18053_1_1535698758334_8590319_1092-271784-0 from training. Duration: 0.96 2023-10-04 06:12:34,984 WARNING [train.py:1204] (2/4) Exclude cut with ID _28317_210_12742_1_1532480302381_8510325_607-213951-0_sp1.1 from training. Duration: 0.763625 2023-10-04 06:12:36,332 WARNING [train.py:1204] (2/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_468-283113-0 from training. Duration: 0.96 2023-10-04 06:12:37,906 WARNING [train.py:1204] (2/4) Exclude cut with ID _14922_210_6228_1_1530707435273_3693310_13-165512-0_sp1.1 from training. Duration: 0.7454375 2023-10-04 06:12:37,917 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_69630_210_7234_1_1535788670_910989_56-169762-0_sp1.1 from training. Duration: 0.92725 2023-10-04 06:12:40,699 WARNING [train.py:1204] (2/4) Exclude cut with ID _67335_210_5219_1_1535525975652_4275229_121-80357-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 06:12:40,744 WARNING [train.py:1204] (2/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_126-12079-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 06:12:40,779 WARNING [train.py:1204] (2/4) Exclude cut with ID _5294_210_1747_1_1527249643928_3696179_342-270278-0_sp1.1 from training. Duration: 0.5 2023-10-04 06:12:42,140 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_30322_210_1925_1_1532843478_826817_35-328264-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 06:12:43,831 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.688e+02 1.968e+02 2.214e+02 2.646e+02 3.245e+02, threshold=4.428e+02, percent-clipped=0.0 2023-10-04 06:12:43,952 WARNING [train.py:1204] (2/4) Exclude cut with ID _8480_210_1874_1_1528589391205_6728330_755-112625-0_sp1.1 from training. Duration: 0.67275 2023-10-04 06:12:45,370 WARNING [train.py:1204] (2/4) Exclude cut with ID _38670_210_15379_1_1533448810659_4391059_132-146412-0_sp1.1 from training. Duration: 0.97275 2023-10-04 06:12:46,538 INFO [train.py:1046] (2/4) Epoch 44, batch 4600, loss[loss=0.1493, simple_loss=0.2204, pruned_loss=0.03909, over 22702.00 frames. ], tot_loss[loss=0.1552, simple_loss=0.2361, pruned_loss=0.03714, over 4741463.08 frames. ], batch size: 322, lr: 2.30e-03, grad_scale: 32.0 2023-10-04 06:12:46,655 WARNING [train.py:1204] (2/4) Exclude cut with ID _22020_210_2461_1_1531724164513_6326900_563-39691-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 06:12:46,820 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.conv_skip_rate, batch_count=1553480.0, ans=0.0 2023-10-04 06:12:49,315 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_9460_210_4211_1_1528954587_10049667_572-37035-0_sp1.1 from training. Duration: 0.72725 2023-10-04 06:12:50,582 WARNING [train.py:1204] (2/4) Exclude cut with ID _68777_210_4353_1_1535810390731_3117904_129-166012-0_sp0.9 from training. Duration: 0.9444375 2023-10-04 06:12:50,650 WARNING [train.py:1204] (2/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_487-91921-0_sp1.1 from training. Duration: 0.963625 2023-10-04 06:12:52,581 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_195-257512-0 from training. Duration: 0.67 2023-10-04 06:12:53,960 WARNING [train.py:1204] (2/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_188-260011-0_sp1.1 from training. Duration: 0.663625 2023-10-04 06:12:57,361 WARNING [train.py:1204] (2/4) Exclude cut with ID _8480_210_1874_1_1528589391205_6728330_309-139896-0_sp0.9 from training. Duration: 0.9 2023-10-04 06:12:58,657 WARNING [train.py:1204] (2/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_866-321248-0_sp1.1 from training. Duration: 0.963625 2023-10-04 06:13:01,467 WARNING [train.py:1204] (2/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_510-216513-0_sp1.1 from training. Duration: 0.97275 2023-10-04 06:13:07,072 WARNING [train.py:1204] (2/4) Exclude cut with ID _6653_210_2629_1_1527919172441_6732679_758-143677-0 from training. Duration: 0.98 2023-10-04 06:13:08,529 WARNING [train.py:1204] (2/4) Exclude cut with ID _55134_210_21982_1_1534590028377_3882170_50-214427-0_sp1.1 from training. Duration: 0.97275 2023-10-04 06:13:09,319 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.1.encoder.layers.1.feed_forward3.out_whiten, num_groups=1, num_channels=256, metric=9.27 vs. limit=15.0 2023-10-04 06:13:12,437 WARNING [train.py:1204] (2/4) Exclude cut with ID _31649_210_6228_1_1532584608063_4091078_482-301330-0_sp1.1 from training. Duration: 0.97275 2023-10-04 06:13:12,730 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.bypass.scale_min, batch_count=1553546.6666666667, ans=0.2 2023-10-04 06:13:14,663 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.ff2_skip_rate, batch_count=1553613.3333333333, ans=0.0 2023-10-04 06:13:15,741 WARNING [train.py:1204] (2/4) Exclude cut with ID _35799_210_5854_1_1533263487887_3529071_490-326847-0_sp1.1 from training. Duration: 0.9 2023-10-04 06:13:15,752 WARNING [train.py:1204] (2/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_984-90716-0_sp1.1 from training. Duration: 0.963625 2023-10-04 06:13:18,674 WARNING [train.py:1204] (2/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_166-246557-0 from training. Duration: 0.64 2023-10-04 06:13:18,675 WARNING [train.py:1204] (2/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_51-200220-0_sp1.1 from training. Duration: 0.5090625 2023-10-04 06:13:19,453 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.2.encoder.layers.1.feed_forward3.out_whiten, num_groups=1, num_channels=384, metric=7.47 vs. limit=15.0 2023-10-04 06:13:20,058 WARNING [train.py:1204] (2/4) Exclude cut with ID _51382_210_23862_1_1534492801838_3650104_275-92796-0_sp1.1 from training. Duration: 0.936375 2023-10-04 06:13:26,460 WARNING [train.py:1204] (2/4) Exclude cut with ID _29853_210_7497_1_1532505600851_6642110_461-142829-0_sp1.1 from training. Duration: 0.97275 2023-10-04 06:13:26,507 WARNING [train.py:1204] (2/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_788-298907-0_sp0.9 from training. Duration: 0.988875 2023-10-04 06:13:26,723 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=1553613.3333333333, ans=0.1 2023-10-04 06:13:28,305 WARNING [train.py:1204] (2/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_739-2098-0_sp1.1 from training. Duration: 0.836375 2023-10-04 06:13:32,366 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7543_210_6753_1_1528541907_1399322_165-323619-0 from training. Duration: 0.69 2023-10-04 06:13:32,455 WARNING [train.py:1204] (2/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_401-255491-0_sp1.1 from training. Duration: 0.636375 2023-10-04 06:13:32,581 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.1.conv_module2.balancer1.prob, batch_count=1553680.0, ans=0.125 2023-10-04 06:13:36,609 WARNING [train.py:1204] (2/4) Exclude cut with ID _66873_210_5517_1_1535454136506_4293370_24-143283-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 06:13:37,988 WARNING [train.py:1204] (2/4) Exclude cut with ID _14459_210_2228_1_1531443641946_7516700_1221-280439-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 06:13:40,662 WARNING [train.py:1204] (2/4) Exclude cut with ID _59202_210_26984_1_1534816810612_3549174_469-213482-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 06:13:40,663 WARNING [train.py:1204] (2/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_148-158509-0_sp0.9 from training. Duration: 0.4555625 2023-10-04 06:13:40,691 WARNING [train.py:1204] (2/4) Exclude cut with ID _24314_210_4915_1_1532135078116_7170179_863-259095-0_sp1.1 from training. Duration: 0.97275 2023-10-04 06:13:42,019 WARNING [train.py:1204] (2/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_321-22821-0 from training. Duration: 0.89 2023-10-04 06:13:42,054 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_43831_210_2158_1_1533725502_5872791_55-163137-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 06:13:42,090 WARNING [train.py:1204] (2/4) Exclude cut with ID _27430_210_7770_1_1532604689375_1378590_26-25207-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 06:13:43,979 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_17613_210_2151_1_1531137569_2366160_223-316461-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 06:13:45,313 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_53679_210_4852_1_1534556726_4186141_541-213370-0_sp1.1 from training. Duration: 0.963625 2023-10-04 06:13:45,384 WARNING [train.py:1204] (2/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_369-295086-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 06:13:46,947 WARNING [train.py:1204] (2/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_91-267696-0 from training. Duration: 0.87 2023-10-04 06:13:48,294 WARNING [train.py:1204] (2/4) Exclude cut with ID _5294_210_1747_1_1527249643928_3696179_180-100713-0 from training. Duration: 0.81 2023-10-04 06:13:48,324 WARNING [train.py:1204] (2/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_32-335103-0 from training. Duration: 0.82 2023-10-04 06:13:48,337 WARNING [train.py:1204] (2/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_133-312727-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 06:13:49,743 WARNING [train.py:1204] (2/4) Exclude cut with ID _59288_210_5854_1_1535077757774_3412246_391-165934-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 06:13:49,795 WARNING [train.py:1204] (2/4) Exclude cut with ID _28323_210_16187_1_1532480395667_4100300_488-1281-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 06:13:51,173 WARNING [train.py:1204] (2/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_124-193207-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 06:13:57,579 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.balancer2.prob, batch_count=1553746.6666666667, ans=0.125 2023-10-04 06:14:00,584 INFO [train.py:1046] (2/4) Epoch 44, batch 4650, loss[loss=0.1512, simple_loss=0.2334, pruned_loss=0.0345, over 23412.00 frames. ], tot_loss[loss=0.1548, simple_loss=0.2355, pruned_loss=0.03707, over 4736081.23 frames. ], batch size: 119, lr: 2.30e-03, grad_scale: 32.0 2023-10-04 06:14:00,674 WARNING [train.py:1204] (2/4) Exclude cut with ID _14087_210_2253_1_1530529224512_3014029_226-242140-0_sp0.9 from training. Duration: 0.9 2023-10-04 06:14:02,149 WARNING [train.py:1204] (2/4) Exclude cut with ID _29277_210_3279_1_1532581109293_3173069_392-3694-0_sp1.1 from training. Duration: 0.936375 2023-10-04 06:14:02,185 WARNING [train.py:1204] (2/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_563-329842-0_sp1.1 from training. Duration: 0.97275 2023-10-04 06:14:03,493 WARNING [train.py:1204] (2/4) Exclude cut with ID _58583_210_5236_1_1534726835266_3574249_391-87755-0_sp1.1 from training. Duration: 0.92725 2023-10-04 06:14:03,512 WARNING [train.py:1204] (2/4) Exclude cut with ID _28427_210_4220_1_1532581240967_3061069_281-237827-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 06:14:03,527 WARNING [train.py:1204] (2/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_44-147898-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 06:14:04,888 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_24007_210_1794_1_1531918925_1663875_125-320772-0_sp1.1 from training. Duration: 0.97275 2023-10-04 06:14:07,677 WARNING [train.py:1204] (2/4) Exclude cut with ID _14368_210_4220_1_1530532714870_3595529_143-117421-0 from training. Duration: 0.58 2023-10-04 06:14:10,459 WARNING [train.py:1204] (2/4) Exclude cut with ID _69584_210_5219_1_1535785133775_4044450_325-7656-0_sp0.9 from training. Duration: 0.9555625 2023-10-04 06:14:12,274 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_37351_210_7925_1_1533012931_3812113_265-85780-0 from training. Duration: 0.88 2023-10-04 06:14:12,296 WARNING [train.py:1204] (2/4) Exclude cut with ID _18516_210_9206_1_1531704653206_3692406_331-237896-0_sp1.1 from training. Duration: 0.936375 2023-10-04 06:14:13,641 WARNING [train.py:1204] (2/4) Exclude cut with ID _14629_210_15709_1_1530604298182_956899_6-232695-0 from training. Duration: 0.94 2023-10-04 06:14:13,669 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7544_210_6753_1_1528587000_691468_87-178599-0_sp1.1 from training. Duration: 0.7909375 2023-10-04 06:14:13,718 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_326-303945-0 from training. Duration: 0.99 2023-10-04 06:14:13,830 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.0.conv_skip_rate, batch_count=1553880.0, ans=0.0 2023-10-04 06:14:13,850 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.bypass.skip_rate, batch_count=1553880.0, ans=0.04949747468305833 2023-10-04 06:14:14,985 WARNING [train.py:1204] (2/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_503-30719-0 from training. Duration: 0.98 2023-10-04 06:14:14,995 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_11305_210_1794_1_1529750428_7178148_299-344640-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 06:14:16,216 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_53071_210_16554_1_1534490405_2954066_265-170472-0_sp1.1 from training. Duration: 0.7545625 2023-10-04 06:14:19,040 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_174-277364-0_sp1.1 from training. Duration: 0.6454375 2023-10-04 06:14:21,031 WARNING [train.py:1204] (2/4) Exclude cut with ID _58414_210_8254_1_1535421609162_7151182_193-151884-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 06:14:21,065 WARNING [train.py:1204] (2/4) Exclude cut with ID IC0651W0104-322063 from training. Duration: 0.768 2023-10-04 06:14:22,750 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.bypass.skip_rate, batch_count=1553880.0, ans=0.07 2023-10-04 06:14:24,403 WARNING [train.py:1204] (2/4) Exclude cut with ID _21881_210_3525_1_1531825242757_3798380_139-305982-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 06:14:24,665 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.conv_module1.balancer1.prob, batch_count=1553880.0, ans=0.125 2023-10-04 06:14:25,780 WARNING [train.py:1204] (2/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_79-200392-0 from training. Duration: 0.97 2023-10-04 06:14:27,306 WARNING [train.py:1204] (2/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_451-280894-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 06:14:27,314 WARNING [train.py:1204] (2/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_304-273433-0_sp1.1 from training. Duration: 0.9 2023-10-04 06:14:28,668 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_5959_210_1794_1_1527400400_3445726_412-44052-0 from training. Duration: 0.77 2023-10-04 06:14:30,617 WARNING [train.py:1204] (2/4) Exclude cut with ID _4681_210_3525_1_1527251040321_1235713_148-26159-0_sp1.1 from training. Duration: 0.963625 2023-10-04 06:14:33,332 WARNING [train.py:1204] (2/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_887-249555-0_sp0.9 from training. Duration: 0.8555625 2023-10-04 06:14:34,803 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.conv_skip_rate, batch_count=1553946.6666666667, ans=0.0 2023-10-04 06:14:37,394 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_25236_210_2158_1_1532051968_280567_37-6860-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 06:14:43,532 WARNING [train.py:1204] (2/4) Exclude cut with ID _29277_210_3279_1_1532581109293_3173069_417-115527-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 06:14:45,045 WARNING [train.py:1204] (2/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_552-134748-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 06:14:45,118 WARNING [train.py:1204] (2/4) Exclude cut with ID _43176_210_8254_1_1533524417469_4174339_188-174143-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 06:14:46,443 WARNING [train.py:1204] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_127-119109-0_sp1.1 from training. Duration: 0.6545625 2023-10-04 06:14:47,909 WARNING [train.py:1204] (2/4) Exclude cut with ID _14922_210_6228_1_1530707435273_3693310_38-187851-0 from training. Duration: 0.62 2023-10-04 06:14:49,174 WARNING [train.py:1204] (2/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_588-125165-0 from training. Duration: 0.83 2023-10-04 06:14:49,249 WARNING [train.py:1204] (2/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_344-152526-0_sp1.1 from training. Duration: 0.4818125 2023-10-04 06:14:49,250 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7002_210_1794_1_1527935634_4034444_487-236501-0 from training. Duration: 0.66 2023-10-04 06:14:51,996 WARNING [train.py:1204] (2/4) Exclude cut with ID _23825_210_13449_1_1531915313526_3497150_379-16981-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 06:15:00,004 WARNING [train.py:1204] (2/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_297-5233-0_sp1.1 from training. Duration: 0.863625 2023-10-04 06:15:00,008 WARNING [train.py:1204] (2/4) Exclude cut with ID _41298_210_9019_1_1533430371450_4822143_178-283538-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 06:15:00,047 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7046_210_6753_1_1527895337_6920917_642-106799-0 from training. Duration: 0.92 2023-10-04 06:15:01,763 WARNING [train.py:1204] (2/4) Exclude cut with ID _6274_210_2084_1_1527937583837_7494020_532-291952-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 06:15:02,032 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.conv_skip_rate, batch_count=1554080.0, ans=0.0 2023-10-04 06:15:03,206 WARNING [train.py:1204] (2/4) Exclude cut with ID _47278_210_12041_1_1533902418349_3576110_260-270468-0_sp1.1 from training. Duration: 0.92725 2023-10-04 06:15:03,213 WARNING [train.py:1204] (2/4) Exclude cut with ID _14368_210_4220_1_1530532714870_3595529_144-3287-0_sp0.9 from training. Duration: 0.7444375 2023-10-04 06:15:04,470 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_162-262717-0_sp0.9 from training. Duration: 0.92225 2023-10-04 06:15:07,066 WARNING [train.py:1204] (2/4) Exclude cut with ID _3465_210_3525_1_1526175667060_3574049_62-335552-0_sp0.9 from training. Duration: 0.9666875 2023-10-04 06:15:07,079 WARNING [train.py:1204] (2/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_726-272852-0_sp1.1 from training. Duration: 0.92725 2023-10-04 06:15:07,146 WARNING [train.py:1204] (2/4) Exclude cut with ID _14898_210_9206_1_1530766812377_4177190_26-135492-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 06:15:11,246 WARNING [train.py:1204] (2/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_200-95304-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 06:15:11,291 WARNING [train.py:1204] (2/4) Exclude cut with ID _45012_210_12651_1_1533608435136_5371380_240-37101-0_sp1.1 from training. Duration: 0.8454375 2023-10-04 06:15:12,919 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.770e+02 2.072e+02 2.400e+02 3.094e+02 4.124e+02, threshold=4.800e+02, percent-clipped=0.0 2023-10-04 06:15:12,986 WARNING [train.py:1204] (2/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_132-269633-0_sp0.9 from training. Duration: 0.7555625 2023-10-04 06:15:13,065 WARNING [train.py:1204] (2/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_907-151398-0_sp0.9 from training. Duration: 0.411125 2023-10-04 06:15:14,287 INFO [train.py:1046] (2/4) Epoch 44, batch 4700, loss[loss=0.1597, simple_loss=0.2487, pruned_loss=0.03533, over 24636.00 frames. ], tot_loss[loss=0.1547, simple_loss=0.2358, pruned_loss=0.03679, over 4736740.54 frames. ], batch size: 73, lr: 2.30e-03, grad_scale: 16.0 2023-10-04 06:15:14,357 WARNING [train.py:1204] (2/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_45-265684-0_sp0.9 from training. Duration: 0.8 2023-10-04 06:15:14,452 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6291_210_3273_1_1527683649_1078498_74-292608-0 from training. Duration: 0.72 2023-10-04 06:15:16,137 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.feed_forward2.hidden_balancer.prob, batch_count=1554146.6666666667, ans=0.125 2023-10-04 06:15:22,574 WARNING [train.py:1204] (2/4) Exclude cut with ID _59202_210_26984_1_1534816810612_3549174_221-84819-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 06:15:22,647 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_33696_210_3852_1_1533082780_2663375_181-325595-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 06:15:22,688 WARNING [train.py:1204] (2/4) Exclude cut with ID _47251_210_18628_1_1533814063128_4137250_351-154105-0_sp1.1 from training. Duration: 0.963625 2023-10-04 06:15:24,023 WARNING [train.py:1204] (2/4) Exclude cut with ID _35501_210_9288_1_1532998507986_3192189_351-334740-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 06:15:26,100 WARNING [train.py:1204] (2/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_342-100157-0_sp1.1 from training. Duration: 0.4909375 2023-10-04 06:15:30,294 WARNING [train.py:1204] (2/4) Exclude cut with ID _6789_210_1534_1_1527857937218_3118950_262-48853-0 from training. Duration: 0.56 2023-10-04 06:15:31,945 WARNING [train.py:1204] (2/4) Exclude cut with ID _69884_210_9771_1_1535846392818_4166169_430-201020-0 from training. Duration: 0.97 2023-10-04 06:15:33,409 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_71-136518-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 06:15:35,969 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_28-291982-0_sp1.1 from training. Duration: 0.8818125 2023-10-04 06:15:36,009 WARNING [train.py:1204] (2/4) Exclude cut with ID _54218_210_12210_1_1534413646388_4409259_437-275326-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 06:15:37,548 WARNING [train.py:1204] (2/4) Exclude cut with ID _55134_210_21982_1_1534590028377_3882170_40-309139-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 06:15:39,570 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.3.encoder.layers.3.nonlin_attention.whiten2, num_groups=1, num_channels=512, metric=5.77 vs. limit=15.0 2023-10-04 06:15:43,554 WARNING [train.py:1204] (2/4) Exclude cut with ID _32704_210_8954_1_1533017488840_6747370_266-329755-0_sp1.1 from training. Duration: 0.8090625 2023-10-04 06:15:44,890 WARNING [train.py:1204] (2/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_21-174514-0_sp1.1 from training. Duration: 0.3909375 2023-10-04 06:15:46,326 WARNING [train.py:1204] (2/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_409-207378-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 06:15:46,475 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=1554280.0, ans=0.1 2023-10-04 06:15:53,089 WARNING [train.py:1204] (2/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_132-269633-0 from training. Duration: 0.68 2023-10-04 06:15:54,429 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_2245_210_2715_1_1525511548_3540254_310-52483-0_sp0.9 from training. Duration: 0.97775 2023-10-04 06:15:58,130 WARNING [train.py:1204] (2/4) Exclude cut with ID _27430_210_7770_1_1532604689375_1378590_47-77403-0_sp1.1 from training. Duration: 0.92725 2023-10-04 06:16:00,967 WARNING [train.py:1204] (2/4) Exclude cut with ID _7562_210_2084_1_1528887767916_3946229_186-326422-0 from training. Duration: 0.84 2023-10-04 06:16:03,695 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_14056_210_4163_1_1530604483_7572827_230-35993-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 06:16:08,291 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_23854_210_1794_1_1531969696_1329157_57-123491-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 06:16:08,337 WARNING [train.py:1204] (2/4) Exclude cut with ID _14747_210_8341_1_1530685797463_3654089_147-131229-0 from training. Duration: 0.76 2023-10-04 06:16:09,797 WARNING [train.py:1204] (2/4) Exclude cut with ID _30191_210_1664_1_1533189915047_7196670_430-19539-0_sp1.1 from training. Duration: 0.92725 2023-10-04 06:16:09,812 WARNING [train.py:1204] (2/4) Exclude cut with ID _67725_210_27767_1_1535608756671_5105359_44-50762-0_sp1.1 from training. Duration: 0.963625 2023-10-04 06:16:11,168 WARNING [train.py:1204] (2/4) Exclude cut with ID _27485_210_4916_1_1532739191677_3668269_217-161755-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 06:16:11,212 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_5931_210_2040_1_1527428481_4547999_321-193614-0_sp1.1 from training. Duration: 0.6090625 2023-10-04 06:16:11,229 WARNING [train.py:1204] (2/4) Exclude cut with ID _6267_210_2084_1_1527764658526_3894730_375-219887-0 from training. Duration: 0.67 2023-10-04 06:16:13,851 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9087W0022-538393 from training. Duration: 0.768 2023-10-04 06:16:15,236 WARNING [train.py:1204] (2/4) Exclude cut with ID _68777_210_4353_1_1535810390731_3117904_83-196459-0_sp1.1 from training. Duration: 0.963625 2023-10-04 06:16:15,380 WARNING [train.py:1204] (2/4) Exclude cut with ID _69884_210_9771_1_1535846392818_4166169_455-343812-0_sp1.1 from training. Duration: 0.92725 2023-10-04 06:16:15,380 WARNING [train.py:1204] (2/4) Exclude cut with ID _13916_210_9994_1_1530788341126_3763149_414-320174-0_sp1.1 from training. Duration: 0.92725 2023-10-04 06:16:15,384 WARNING [train.py:1204] (2/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_398-239819-0 from training. Duration: 0.99 2023-10-04 06:16:17,140 WARNING [train.py:1204] (2/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_199-232314-0_sp1.1 from training. Duration: 0.92725 2023-10-04 06:16:17,374 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.attention_skip_rate, batch_count=1554413.3333333333, ans=0.0 2023-10-04 06:16:20,027 WARNING [train.py:1204] (2/4) Exclude cut with ID _2334_210_2667_1_1525608238880_4705129_459-104997-0 from training. Duration: 0.95 2023-10-04 06:16:20,729 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.3.encoder.layers.0.feed_forward1.out_whiten, num_groups=1, num_channels=512, metric=7.20 vs. limit=15.0 2023-10-04 06:16:22,679 WARNING [train.py:1204] (2/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_441-209395-0_sp1.1 from training. Duration: 0.936375 2023-10-04 06:16:25,718 WARNING [train.py:1204] (2/4) Exclude cut with ID _27052_210_5986_1_1532329197187_3585009_320-80393-0_sp1.1 from training. Duration: 0.97275 2023-10-04 06:16:27,069 INFO [train.py:1046] (2/4) Epoch 44, batch 4750, loss[loss=0.1379, simple_loss=0.2214, pruned_loss=0.02718, over 24462.00 frames. ], tot_loss[loss=0.1554, simple_loss=0.2362, pruned_loss=0.03725, over 4731504.14 frames. ], batch size: 66, lr: 2.30e-03, grad_scale: 16.0 2023-10-04 06:16:28,595 WARNING [train.py:1204] (2/4) Exclude cut with ID _21783_210_11357_1_1531742458730_3762679_89-217253-0_sp1.1 from training. Duration: 0.97275 2023-10-04 06:16:29,896 WARNING [train.py:1204] (2/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_453-282588-0_sp1.1 from training. Duration: 0.8909375 2023-10-04 06:16:30,028 WARNING [train.py:1204] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_88-341718-0 from training. Duration: 0.66 2023-10-04 06:16:30,060 WARNING [train.py:1204] (2/4) Exclude cut with ID _43552_210_8913_1_1533726031059_3523429_230-3521-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 06:16:30,259 INFO [scaling.py:1118] (2/4) WithLoss: name=encoder.encoders.1.encoder.layers.1.self_attn_weights, loss-sum=0.000e+00 2023-10-04 06:16:34,269 WARNING [train.py:1204] (2/4) Exclude cut with ID _9679_210_7497_1_1529129212906_4363929_512-313889-0 from training. Duration: 0.81 2023-10-04 06:16:37,326 WARNING [train.py:1204] (2/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_194-252800-0_sp1.1 from training. Duration: 0.8818125 2023-10-04 06:16:37,362 WARNING [train.py:1204] (2/4) Exclude cut with ID _55209_210_1531_1_1534675828996_4597160_239-293541-0_sp1.1 from training. Duration: 0.963625 2023-10-04 06:16:38,685 WARNING [train.py:1204] (2/4) Exclude cut with ID _35799_210_5854_1_1533263487887_3529071_15-325131-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 06:16:41,694 WARNING [train.py:1204] (2/4) Exclude cut with ID _14100_210_8341_1_1530529320613_3678069_279-55760-0 from training. Duration: 0.93 2023-10-04 06:16:46,469 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_489-230703-0_sp1.1 from training. Duration: 0.77275 2023-10-04 06:16:47,998 WARNING [train.py:1204] (2/4) Exclude cut with ID _6694_210_4599_1_1527852637490_3965529_144-185051-0 from training. Duration: 0.96 2023-10-04 06:16:48,058 WARNING [train.py:1204] (2/4) Exclude cut with ID _28461_210_2253_1_1532771775189_3041299_1-228155-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 06:16:50,816 WARNING [train.py:1204] (2/4) Exclude cut with ID _32704_210_8954_1_1533017488840_6747370_686-252323-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 06:16:50,819 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_40896_210_15710_1_1533433956_7100802_1075-294633-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 06:16:50,830 WARNING [train.py:1204] (2/4) Exclude cut with ID _22083_210_8254_1_1531702862439_3678970_30-92879-0_sp1.1 from training. Duration: 0.97275 2023-10-04 06:16:51,060 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.out_combiner.scale_min, batch_count=1554546.6666666667, ans=0.2 2023-10-04 06:16:52,151 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9245W0121-616888 from training. Duration: 0.896 2023-10-04 06:16:52,153 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6786_210_1388_1_1527839699_4194495_378-46070-0 from training. Duration: 0.72 2023-10-04 06:16:57,970 WARNING [train.py:1204] (2/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_317-175062-0 from training. Duration: 0.82 2023-10-04 06:16:59,420 WARNING [train.py:1204] (2/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_238-89390-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 06:17:02,249 WARNING [train.py:1204] (2/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_524-310811-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 06:17:04,926 WARNING [train.py:1204] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_307-158462-0_sp0.9 from training. Duration: 0.7666875 2023-10-04 06:17:04,935 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9309W0191-640318 from training. Duration: 0.512 2023-10-04 06:17:04,939 WARNING [train.py:1204] (2/4) Exclude cut with ID _55153_210_2253_1_1534384786677_3162096_100-204617-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 06:17:08,626 WARNING [train.py:1204] (2/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_112-149213-0_sp1.1 from training. Duration: 0.863625 2023-10-04 06:17:10,451 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.bypass.skip_rate, batch_count=1554680.0, ans=0.07 2023-10-04 06:17:11,369 WARNING [train.py:1204] (2/4) Exclude cut with ID _67831_210_12392_1_1535612464202_3092612_346-2148-0_sp1.1 from training. Duration: 0.8454375 2023-10-04 06:17:12,777 WARNING [train.py:1204] (2/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_51-117576-0_sp0.9 from training. Duration: 0.4444375 2023-10-04 06:17:12,810 WARNING [train.py:1204] (2/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_300-27235-0 from training. Duration: 0.87 2023-10-04 06:17:12,848 WARNING [train.py:1204] (2/4) Exclude cut with ID _40399_210_4353_1_1533305658194_3279406_195-116825-0_sp1.1 from training. Duration: 0.97275 2023-10-04 06:17:12,884 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7002_210_1794_1_1527939683_5157286_163-87552-0_sp0.9 from training. Duration: 0.9555625 2023-10-04 06:17:12,913 WARNING [train.py:1204] (2/4) Exclude cut with ID _19514_210_9259_1_1531530071330_5084230_560-237597-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 06:17:14,293 WARNING [train.py:1204] (2/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_41-234353-0_sp1.1 from training. Duration: 0.6181875 2023-10-04 06:17:14,310 WARNING [train.py:1204] (2/4) Exclude cut with ID _6274_210_2084_1_1527937583837_7494020_769-168302-0 from training. Duration: 0.71 2023-10-04 06:17:15,847 WARNING [train.py:1204] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_574-45541-0 from training. Duration: 0.65 2023-10-04 06:17:19,235 WARNING [train.py:1204] (2/4) Exclude cut with ID _50310_210_15488_1_1534068043345_3641080_262-67120-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 06:17:22,686 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_53071_210_16554_1_1534493434_1276796_35-11927-0_sp1.1 from training. Duration: 0.963625 2023-10-04 06:17:22,688 WARNING [train.py:1204] (2/4) Exclude cut with ID _3992_210_1747_1_1526644816031_3731060_107-251434-0 from training. Duration: 0.99 2023-10-04 06:17:23,793 WARNING [train.py:1204] (2/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_412-54488-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 06:17:25,184 WARNING [train.py:1204] (2/4) Exclude cut with ID _18516_210_9206_1_1531704653206_3692406_77-258490-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 06:17:27,119 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_40047_210_2290_1_1533290519_2374311_195-165693-0_sp0.9 from training. Duration: 0.988875 2023-10-04 06:17:27,183 WARNING [train.py:1204] (2/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_296-36971-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 06:17:28,530 WARNING [train.py:1204] (2/4) Exclude cut with ID _14970_210_15709_1_1530773543493_1715730_187-20259-0_sp1.1 from training. Duration: 0.8090625 2023-10-04 06:17:31,335 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_53373_210_5399_1_1534229471_4715695_135-305927-0_sp1.1 from training. Duration: 0.936375 2023-10-04 06:17:32,599 WARNING [train.py:1204] (2/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_60-18123-0 from training. Duration: 0.88 2023-10-04 06:17:33,952 WARNING [train.py:1204] (2/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_166-166990-0 from training. Duration: 0.96 2023-10-04 06:17:34,033 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_5438_210_1794_1_1527336687_2600009_233-58072-0 from training. Duration: 0.77 2023-10-04 06:17:35,604 WARNING [train.py:1204] (2/4) Exclude cut with ID _8833_210_5219_1_1528592665210_7553059_611-80313-0_sp0.9 from training. Duration: 0.711125 2023-10-04 06:17:36,890 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_53555_210_5632_1_1534595220_7232297_118-43239-0_sp1.1 from training. Duration: 0.936375 2023-10-04 06:17:38,769 WARNING [train.py:1204] (2/4) Exclude cut with ID _12369_210_4381_1_1530356078581_4388520_480-13146-0 from training. Duration: 0.85 2023-10-04 06:17:40,115 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.529e+02 2.005e+02 2.194e+02 2.471e+02 3.662e+02, threshold=4.389e+02, percent-clipped=0.0 2023-10-04 06:17:41,387 INFO [train.py:1046] (2/4) Epoch 44, batch 4800, loss[loss=0.1721, simple_loss=0.2489, pruned_loss=0.04761, over 23748.00 frames. ], tot_loss[loss=0.1563, simple_loss=0.2371, pruned_loss=0.03778, over 4718925.28 frames. ], batch size: 232, lr: 2.30e-03, grad_scale: 32.0 2023-10-04 06:17:42,770 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_892-28814-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 06:17:42,825 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_24899_210_16554_1_1532046422_3827462_160-21255-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 06:17:42,972 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.1.conv_module2.balancer2.min_abs, batch_count=1554813.3333333333, ans=0.5 2023-10-04 06:17:48,785 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_154-316450-0_sp1.1 from training. Duration: 0.6909375 2023-10-04 06:17:48,884 WARNING [train.py:1204] (2/4) Exclude cut with ID _30196_210_1664_1_1533708496522_7074140_1225-301232-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 06:17:48,921 WARNING [train.py:1204] (2/4) Exclude cut with ID _28783_210_7943_1_1532671164808_7155479_414-208100-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 06:17:50,368 WARNING [train.py:1204] (2/4) Exclude cut with ID _19023_210_4353_1_1531378892552_5820780_83-254794-0 from training. Duration: 0.86 2023-10-04 06:17:50,448 WARNING [train.py:1204] (2/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_591-83752-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 06:17:51,613 WARNING [train.py:1204] (2/4) Exclude cut with ID _29877_210_6228_1_1532397791106_5276149_266-62291-0_sp1.1 from training. Duration: 0.7909375 2023-10-04 06:17:51,761 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=1554813.3333333333, ans=0.1 2023-10-04 06:17:54,279 WARNING [train.py:1204] (2/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_280-161633-0_sp1.1 from training. Duration: 0.663625 2023-10-04 06:17:59,430 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7543_210_6753_1_1528539468_333764_39-156634-0_sp1.1 from training. Duration: 0.97275 2023-10-04 06:18:00,741 WARNING [train.py:1204] (2/4) Exclude cut with ID _67499_210_17282_1_1535509805220_3692430_505-312248-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 06:18:00,780 WARNING [train.py:1204] (2/4) Exclude cut with ID _5294_210_1747_1_1527249643928_3696179_180-100713-0_sp0.9 from training. Duration: 0.9 2023-10-04 06:18:02,294 WARNING [train.py:1204] (2/4) Exclude cut with ID _26528_210_9070_1_1532570410842_3851310_508-148132-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 06:18:02,307 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_211-339622-0_sp1.1 from training. Duration: 0.4090625 2023-10-04 06:18:02,319 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_40896_210_15710_1_1533433956_7100802_339-138356-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 06:18:03,518 WARNING [train.py:1204] (2/4) Exclude cut with ID _27060_210_5986_1_1532674838922_3522549_211-19933-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 06:18:06,230 WARNING [train.py:1204] (2/4) Exclude cut with ID _60229_210_27767_1_1534838406158_3655350_457-117923-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 06:18:08,942 WARNING [train.py:1204] (2/4) Exclude cut with ID _55429_210_10626_1_1534467588527_7174143_460-141439-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 06:18:10,966 WARNING [train.py:1204] (2/4) Exclude cut with ID _59170_210_6721_1_1534983633960_4203335_263-10780-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 06:18:10,977 WARNING [train.py:1204] (2/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_872-264525-0_sp0.9 from training. Duration: 0.72225 2023-10-04 06:18:11,055 WARNING [train.py:1204] (2/4) Exclude cut with ID _7624_210_1531_1_1528109802198_4933101_418-349834-0_sp0.9 from training. Duration: 0.6666875 2023-10-04 06:18:13,709 WARNING [train.py:1204] (2/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_27-53998-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 06:18:15,038 WARNING [train.py:1204] (2/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_202-183922-0 from training. Duration: 0.85 2023-10-04 06:18:15,051 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7002_210_1794_1_1527939683_5157286_1-281264-0 from training. Duration: 0.44 2023-10-04 06:18:15,121 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_53753_210_7234_1_1534320003_3594148_493-9314-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 06:18:15,141 WARNING [train.py:1204] (2/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_217-95056-0_sp1.1 from training. Duration: 0.8909375 2023-10-04 06:18:16,642 WARNING [train.py:1204] (2/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_406-45086-0_sp1.1 from training. Duration: 0.72725 2023-10-04 06:18:16,648 WARNING [train.py:1204] (2/4) Exclude cut with ID _31649_210_6228_1_1532584608063_4091078_505-213821-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 06:18:16,662 WARNING [train.py:1204] (2/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_317-175062-0_sp0.9 from training. Duration: 0.911125 2023-10-04 06:18:20,603 WARNING [train.py:1204] (2/4) Exclude cut with ID _5444_210_2151_1_1527418808464_5312210_726-27165-0_sp1.1 from training. Duration: 0.6090625 2023-10-04 06:18:20,652 WARNING [train.py:1204] (2/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_494-170437-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 06:18:25,389 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_474-64054-0_sp1.1 from training. Duration: 0.936375 2023-10-04 06:18:27,418 WARNING [train.py:1204] (2/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_521-7556-0_sp1.1 from training. Duration: 0.97275 2023-10-04 06:18:30,462 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_269-348505-0_sp1.1 from training. Duration: 0.92725 2023-10-04 06:18:32,198 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.ff2_skip_rate, batch_count=1555013.3333333333, ans=0.0 2023-10-04 06:18:32,269 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.conv_module2.balancer1.prob, batch_count=1555013.3333333333, ans=0.125 2023-10-04 06:18:34,455 WARNING [train.py:1204] (2/4) Exclude cut with ID _21878_210_3525_1_1531738772002_7264330_559-184862-0 from training. Duration: 0.94 2023-10-04 06:18:34,484 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_68702_210_23689_1_1535799577_3420068_92-76040-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 06:18:35,743 WARNING [train.py:1204] (2/4) Exclude cut with ID _22826_210_9994_1_1531908201522_3279169_227-102052-0_sp1.1 from training. Duration: 0.97275 2023-10-04 06:18:35,782 WARNING [train.py:1204] (2/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_718-250907-0_sp0.9 from training. Duration: 0.7666875 2023-10-04 06:18:37,219 WARNING [train.py:1204] (2/4) Exclude cut with ID _14069_210_5632_1_1530512692033_4803390_352-143891-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 06:18:40,630 WARNING [train.py:1204] (2/4) Exclude cut with ID _6267_210_2084_1_1527764658526_3894730_65-106602-0_sp1.1 from training. Duration: 0.936375 2023-10-04 06:18:41,952 WARNING [train.py:1204] (2/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_202-183922-0_sp0.9 from training. Duration: 0.9444375 2023-10-04 06:18:41,961 WARNING [train.py:1204] (2/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_465-268827-0_sp1.1 from training. Duration: 0.97275 2023-10-04 06:18:43,272 WARNING [train.py:1204] (2/4) Exclude cut with ID _12369_210_4381_1_1530356078581_4388520_58-29064-0_sp1.1 from training. Duration: 0.9 2023-10-04 06:18:44,556 WARNING [train.py:1204] (2/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_1-99680-0_sp0.9 from training. Duration: 0.7444375 2023-10-04 06:18:45,874 WARNING [train.py:1204] (2/4) Exclude cut with ID _13253_210_1925_1_1530757775819_3577873_191-144977-0_sp1.1 from training. Duration: 0.7090625 2023-10-04 06:18:50,056 WARNING [train.py:1204] (2/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_276-112684-0_sp1.1 from training. Duration: 0.92725 2023-10-04 06:18:50,067 WARNING [train.py:1204] (2/4) Exclude cut with ID _64639_210_5816_1_1535367619437_3528000_358-411-0_sp1.1 from training. Duration: 0.97275 2023-10-04 06:18:50,074 WARNING [train.py:1204] (2/4) Exclude cut with ID _31649_210_6228_1_1532584608063_4091078_106-21199-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 06:18:50,186 WARNING [train.py:1204] (2/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_901-79438-0 from training. Duration: 0.94 2023-10-04 06:18:52,955 WARNING [train.py:1204] (2/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_718-250907-0 from training. Duration: 0.69 2023-10-04 06:18:54,253 INFO [train.py:1046] (2/4) Epoch 44, batch 4850, loss[loss=0.1585, simple_loss=0.2406, pruned_loss=0.03823, over 24682.00 frames. ], tot_loss[loss=0.1561, simple_loss=0.237, pruned_loss=0.03757, over 4725824.86 frames. ], batch size: 73, lr: 2.30e-03, grad_scale: 16.0 2023-10-04 06:18:54,347 WARNING [train.py:1204] (2/4) Exclude cut with ID _5287_210_2084_1_1527418883040_3878670_363-194161-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 06:18:54,350 WARNING [train.py:1204] (2/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_586-321321-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 06:18:54,421 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_68900_210_2087_1_1535713157_3708529_164-292661-0_sp1.1 from training. Duration: 0.963625 2023-10-04 06:18:54,422 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_26433_210_12108_1_1532157158_4056480_114-144878-0_sp1.1 from training. Duration: 0.97275 2023-10-04 06:18:57,742 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_15517_210_6753_1_1530954008_3487336_96-341874-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 06:19:04,139 WARNING [train.py:1204] (2/4) Exclude cut with ID _14268_210_9206_1_1530622771285_4714099_637-86630-0 from training. Duration: 0.51 2023-10-04 06:19:05,448 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_28211_210_6388_1_1532861506_4523224_121-320092-0_sp1.1 from training. Duration: 0.92725 2023-10-04 06:19:07,107 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.bypass_mid.scale_min, batch_count=1555146.6666666667, ans=0.2 2023-10-04 06:19:09,698 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_141-89113-0_sp0.9 from training. Duration: 0.9555625 2023-10-04 06:19:11,031 WARNING [train.py:1204] (2/4) Exclude cut with ID _6789_210_1534_1_1527857937218_3118950_311-61262-0_sp1.1 from training. Duration: 0.5909375 2023-10-04 06:19:11,059 WARNING [train.py:1204] (2/4) Exclude cut with ID _58480_210_12392_1_1534811121111_3646160_389-200461-0_sp1.1 from training. Duration: 0.97275 2023-10-04 06:19:15,675 WARNING [train.py:1204] (2/4) Exclude cut with ID _41326_210_5854_1_1533778076509_3492080_28-156553-0_sp1.1 from training. Duration: 0.92725 2023-10-04 06:19:15,756 WARNING [train.py:1204] (2/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_577-287150-0_sp1.1 from training. Duration: 0.7454375 2023-10-04 06:19:17,148 WARNING [train.py:1204] (2/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_40-129756-0_sp1.1 from training. Duration: 0.736375 2023-10-04 06:19:17,167 WARNING [train.py:1204] (2/4) Exclude cut with ID _60113_210_16271_1_1535164212481_4386079_490-280290-0 from training. Duration: 0.98 2023-10-04 06:19:17,421 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.conv_module2.balancer2.prob, batch_count=1555213.3333333333, ans=0.125 2023-10-04 06:19:21,227 WARNING [train.py:1204] (2/4) Exclude cut with ID _58059_210_5854_1_1534901471149_3164808_34-39905-0_sp1.1 from training. Duration: 0.963625 2023-10-04 06:19:21,983 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.2.encoder.layers.1.self_attn_weights.whiten_keys, num_groups=4, num_channels=128, metric=2.68 vs. limit=6.0 2023-10-04 06:19:22,506 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_791-197891-0_sp1.1 from training. Duration: 0.763625 2023-10-04 06:19:23,828 WARNING [train.py:1204] (2/4) Exclude cut with ID _6789_210_1534_1_1527857937218_3118950_262-48853-0_sp1.1 from training. Duration: 0.5090625 2023-10-04 06:19:25,268 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_2655_210_2087_1_1525688959_3897400_52-279899-0_sp0.9 from training. Duration: 0.7666875 2023-10-04 06:19:25,272 WARNING [train.py:1204] (2/4) Exclude cut with ID _14884_210_2252_1_1530707459689_5561250_114-201399-0 from training. Duration: 0.76 2023-10-04 06:19:27,241 WARNING [train.py:1204] (2/4) Exclude cut with ID _19023_210_4353_1_1531378892552_5820780_83-254794-0_sp0.9 from training. Duration: 0.9555625 2023-10-04 06:19:27,256 WARNING [train.py:1204] (2/4) Exclude cut with ID _53489_210_6228_1_1534298357169_3835970_10-297919-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 06:19:28,826 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=1555280.0, ans=0.1 2023-10-04 06:19:33,122 WARNING [train.py:1204] (2/4) Exclude cut with ID _44585_210_18628_1_1533605350387_4518199_418-3055-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 06:19:33,134 WARNING [train.py:1204] (2/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_310-109461-0 from training. Duration: 0.8 2023-10-04 06:19:34,527 WARNING [train.py:1204] (2/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_235-262124-0 from training. Duration: 0.67 2023-10-04 06:19:35,917 WARNING [train.py:1204] (2/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_195-167011-0_sp1.1 from training. Duration: 0.6818125 2023-10-04 06:19:42,460 WARNING [train.py:1204] (2/4) Exclude cut with ID _7852_210_5219_1_1528286471316_8139400_188-55162-0_sp1.1 from training. Duration: 0.936375 2023-10-04 06:19:43,712 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_35361_210_4238_1_1533262810_7793476_675-87012-0 from training. Duration: 0.89 2023-10-04 06:19:43,784 WARNING [train.py:1204] (2/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_465-81427-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 06:19:43,797 WARNING [train.py:1204] (2/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_597-154640-0_sp1.1 from training. Duration: 0.8181875 2023-10-04 06:19:46,847 WARNING [train.py:1204] (2/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_186-309161-0_sp0.9 from training. Duration: 0.6 2023-10-04 06:19:47,021 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.self_attn_weights.pos_emb_skip_rate, batch_count=1555346.6666666667, ans=0.0 2023-10-04 06:19:48,319 WARNING [train.py:1204] (2/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_546-119312-0 from training. Duration: 0.99 2023-10-04 06:19:48,323 WARNING [train.py:1204] (2/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_330-240060-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 06:19:49,809 WARNING [train.py:1204] (2/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_477-255782-0 from training. Duration: 0.82 2023-10-04 06:19:51,173 WARNING [train.py:1204] (2/4) Exclude cut with ID _14970_210_15709_1_1530773543493_1715730_150-248939-0_sp1.1 from training. Duration: 0.97275 2023-10-04 06:19:52,532 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_556-206615-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 06:19:52,614 WARNING [train.py:1204] (2/4) Exclude cut with ID _3465_210_3525_1_1526175667060_3574049_308-231735-0 from training. Duration: 0.73 2023-10-04 06:19:54,318 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.bypass.skip_rate, batch_count=1555413.3333333333, ans=0.07 2023-10-04 06:20:00,918 WARNING [train.py:1204] (2/4) Exclude cut with ID _27485_210_4916_1_1532739191677_3668269_66-236571-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 06:20:05,809 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_2221_210_1988_1_1525597051_4935541_415-296882-0_sp0.9 from training. Duration: 0.8555625 2023-10-04 06:20:05,828 WARNING [train.py:1204] (2/4) Exclude cut with ID _64521_210_14699_1_1535243427259_3815328_231-78630-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 06:20:08,908 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.692e+02 2.027e+02 2.287e+02 2.709e+02 3.937e+02, threshold=4.574e+02, percent-clipped=0.0 2023-10-04 06:20:08,934 INFO [train.py:1046] (2/4) Epoch 44, batch 4900, loss[loss=0.1454, simple_loss=0.2341, pruned_loss=0.02839, over 24682.00 frames. ], tot_loss[loss=0.1554, simple_loss=0.2364, pruned_loss=0.03724, over 4693885.54 frames. ], batch size: 65, lr: 2.30e-03, grad_scale: 16.0 2023-10-04 06:20:11,746 WARNING [train.py:1204] (2/4) Exclude cut with ID _13942_210_9988_1_1530604906036_3733360_499-55676-0 from training. Duration: 0.62 2023-10-04 06:20:11,748 WARNING [train.py:1204] (2/4) Exclude cut with ID _49376_210_5986_1_1533985278732_3370740_301-154763-0_sp1.1 from training. Duration: 0.8909375 2023-10-04 06:20:11,932 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.conv_module1.balancer2.prob, batch_count=1555480.0, ans=0.125 2023-10-04 06:20:17,749 WARNING [train.py:1204] (2/4) Exclude cut with ID _27485_210_4916_1_1532739191677_3668269_255-216698-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 06:20:19,115 WARNING [train.py:1204] (2/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_137-334143-0_sp1.1 from training. Duration: 0.97275 2023-10-04 06:20:19,134 WARNING [train.py:1204] (2/4) Exclude cut with ID _6456_210_1534_1_1527681634212_3181049_215-309359-0_sp1.1 from training. Duration: 0.8 2023-10-04 06:20:21,886 WARNING [train.py:1204] (2/4) Exclude cut with ID _65640_210_6310_1_1535368201445_3173250_209-13008-0 from training. Duration: 0.99 2023-10-04 06:20:26,117 WARNING [train.py:1204] (2/4) Exclude cut with ID _12369_210_4381_1_1530356078581_4388520_58-29064-0 from training. Duration: 0.99 2023-10-04 06:20:27,676 WARNING [train.py:1204] (2/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_410-287857-0 from training. Duration: 0.93 2023-10-04 06:20:29,065 WARNING [train.py:1204] (2/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_153-153537-0 from training. Duration: 0.91 2023-10-04 06:20:29,103 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7544_210_6753_1_1528587722_3576686_172-171750-0_sp0.9 from training. Duration: 0.72225 2023-10-04 06:20:30,420 WARNING [train.py:1204] (2/4) Exclude cut with ID _64521_210_14699_1_1535243427259_3815328_464-183853-0_sp1.1 from training. Duration: 0.97275 2023-10-04 06:20:30,441 WARNING [train.py:1204] (2/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_385-210308-0_sp1.1 from training. Duration: 0.963625 2023-10-04 06:20:30,456 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_46013_210_7234_1_1533776362_3744032_354-261865-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 06:20:30,463 WARNING [train.py:1204] (2/4) Exclude cut with ID _3067_210_2667_1_1526212974640_4503168_250-285937-0_sp1.1 from training. Duration: 0.72725 2023-10-04 06:20:32,257 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_40036_210_13405_1_1533648647_1315388_48-146016-0 from training. Duration: 0.93 2023-10-04 06:20:34,355 WARNING [train.py:1204] (2/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_280-161633-0 from training. Duration: 0.73 2023-10-04 06:20:35,664 WARNING [train.py:1204] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_308-161067-0_sp0.9 from training. Duration: 0.7333125 2023-10-04 06:20:35,758 WARNING [train.py:1204] (2/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_68-212801-0_sp0.9 from training. Duration: 0.811125 2023-10-04 06:20:37,581 WARNING [train.py:1204] (2/4) Exclude cut with ID _6789_210_1534_1_1527857937218_3118950_311-61262-0_sp0.9 from training. Duration: 0.72225 2023-10-04 06:20:39,033 WARNING [train.py:1204] (2/4) Exclude cut with ID _14632_210_9988_1_1530691279392_3923200_102-204477-0_sp1.1 from training. Duration: 0.8818125 2023-10-04 06:20:40,372 WARNING [train.py:1204] (2/4) Exclude cut with ID _65242_210_18628_1_1535369393717_3823149_290-117517-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 06:20:41,769 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_69630_210_7234_1_1535788670_910989_35-337140-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 06:20:41,784 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_5618_210_1874_1_1527311008_5851_1-214501-0 from training. Duration: 0.689 2023-10-04 06:20:41,953 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.1.bypass.skip_rate, batch_count=1555613.3333333333, ans=0.035 2023-10-04 06:20:42,045 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.conv_module1.balancer1.prob, batch_count=1555613.3333333333, ans=0.125 2023-10-04 06:20:43,268 WARNING [train.py:1204] (2/4) Exclude cut with ID _8064_210_5399_1_1528372299291_4282973_168-50337-0_sp0.9 from training. Duration: 0.9333125 2023-10-04 06:20:45,087 WARNING [train.py:1204] (2/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_185-14438-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 06:20:45,102 WARNING [train.py:1204] (2/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_61-327062-0 from training. Duration: 0.81 2023-10-04 06:20:45,108 WARNING [train.py:1204] (2/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_98-741-0 from training. Duration: 0.8 2023-10-04 06:20:49,287 WARNING [train.py:1204] (2/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_312-204798-0 from training. Duration: 0.93 2023-10-04 06:20:50,692 WARNING [train.py:1204] (2/4) Exclude cut with ID _14998_210_5219_1_1530705582080_7474970_787-294416-0_sp0.9 from training. Duration: 0.911125 2023-10-04 06:20:52,090 WARNING [train.py:1204] (2/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_140-181176-0_sp1.1 from training. Duration: 0.836375 2023-10-04 06:20:52,125 WARNING [train.py:1204] (2/4) Exclude cut with ID _14687_210_5854_1_1530703838259_3664889_115-239046-0_sp1.1 from training. Duration: 0.6090625 2023-10-04 06:20:53,467 WARNING [train.py:1204] (2/4) Exclude cut with ID _56423_210_5854_1_1534896061049_3165969_370-230614-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 06:20:53,534 WARNING [train.py:1204] (2/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_406-275649-0_sp0.9 from training. Duration: 0.4666875 2023-10-04 06:20:54,817 WARNING [train.py:1204] (2/4) Exclude cut with ID _15150_210_8341_1_1530752411803_7256384_22-138274-0_sp1.1 from training. Duration: 0.87275 2023-10-04 06:20:54,864 WARNING [train.py:1204] (2/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_125-125818-0 from training. Duration: 0.96 2023-10-04 06:20:57,506 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_4399_210_1874_1_1526705485_2400757_220-47974-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 06:20:58,870 WARNING [train.py:1204] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_308-161067-0_sp1.1 from training. Duration: 0.6 2023-10-04 06:21:00,357 WARNING [train.py:1204] (2/4) Exclude cut with ID _53489_210_6228_1_1534298357169_3835970_102-57645-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 06:21:04,053 WARNING [train.py:1204] (2/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_178-177582-0 from training. Duration: 0.74 2023-10-04 06:21:05,414 WARNING [train.py:1204] (2/4) Exclude cut with ID _14349_210_8341_1_1530615710818_3442470_170-336541-0_sp1.1 from training. Duration: 0.7818125 2023-10-04 06:21:05,492 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_521-214276-0_sp0.9 from training. Duration: 0.488875 2023-10-04 06:21:05,529 WARNING [train.py:1204] (2/4) Exclude cut with ID _66070_210_7640_1_1535441393096_7805310_489-292631-0 from training. Duration: 0.79 2023-10-04 06:21:10,274 INFO [scaling.py:1118] (2/4) WithLoss: name=encoder.encoders.2.encoder.layers.0.self_attn_weights, loss-sum=0.000e+00 2023-10-04 06:21:12,781 WARNING [train.py:1204] (2/4) Exclude cut with ID _14069_210_5632_1_1530512692033_4803390_438-340023-0_sp1.1 from training. Duration: 0.936375 2023-10-04 06:21:14,179 WARNING [train.py:1204] (2/4) Exclude cut with ID _7852_210_5219_1_1528286471316_8139400_242-105336-0_sp1.1 from training. Duration: 0.6545625 2023-10-04 06:21:16,710 WARNING [train.py:1204] (2/4) Exclude cut with ID _3867_210_1883_1_1526641200575_4475830_272-215563-0 from training. Duration: 0.57 2023-10-04 06:21:16,717 WARNING [train.py:1204] (2/4) Exclude cut with ID _14368_210_4220_1_1530532714870_3595529_143-117421-0_sp0.9 from training. Duration: 0.6444375 2023-10-04 06:21:16,730 WARNING [train.py:1204] (2/4) Exclude cut with ID _9235_210_5219_1_1528891154253_8046120_30-326681-0_sp1.1 from training. Duration: 0.7545625 2023-10-04 06:21:18,651 WARNING [train.py:1204] (2/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_538-218448-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 06:21:21,518 WARNING [train.py:1204] (2/4) Exclude cut with ID _33640_210_2655_1_1533117605479_4855469_60-192970-0_sp1.1 from training. Duration: 0.963625 2023-10-04 06:21:21,526 WARNING [train.py:1204] (2/4) Exclude cut with ID _65585_210_12210_1_1535450451584_3764098_112-294301-0_sp1.1 from training. Duration: 0.8 2023-10-04 06:21:22,803 INFO [train.py:1046] (2/4) Epoch 44, batch 4950, loss[loss=0.1472, simple_loss=0.2239, pruned_loss=0.03529, over 23576.00 frames. ], tot_loss[loss=0.1548, simple_loss=0.2352, pruned_loss=0.0372, over 4697669.22 frames. ], batch size: 93, lr: 2.30e-03, grad_scale: 16.0 2023-10-04 06:21:22,846 WARNING [train.py:1204] (2/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_643-37412-0_sp1.1 from training. Duration: 0.936375 2023-10-04 06:21:22,863 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_9302_210_2550_1_1528889962_4655416_638-301620-0_sp1.1 from training. Duration: 0.42725 2023-10-04 06:21:23,009 WARNING [train.py:1204] (2/4) Exclude cut with ID _3867_210_1883_1_1526641200575_4475830_272-215563-0_sp1.1 from training. Duration: 0.5181875 2023-10-04 06:21:25,730 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_53679_210_4852_1_1534556726_4186141_358-154143-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 06:21:25,741 WARNING [train.py:1204] (2/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_841-341679-0_sp0.9 from training. Duration: 0.6444375 2023-10-04 06:21:28,472 WARNING [train.py:1204] (2/4) Exclude cut with ID _7160_210_1876_1_1527940923231_3703799_302-150728-0 from training. Duration: 0.78 2023-10-04 06:21:28,507 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_125-203105-0 from training. Duration: 0.8 2023-10-04 06:21:28,529 WARNING [train.py:1204] (2/4) Exclude cut with ID _5585_210_2667_1_1527418813324_8007340_89-332072-0_sp0.9 from training. Duration: 0.711125 2023-10-04 06:21:28,595 WARNING [train.py:1204] (2/4) Exclude cut with ID _3992_210_1747_1_1526644816031_3731060_36-147855-0 from training. Duration: 0.78 2023-10-04 06:21:29,892 WARNING [train.py:1204] (2/4) Exclude cut with ID _59256_210_5986_1_1535076085997_3589120_172-239045-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 06:21:29,909 WARNING [train.py:1204] (2/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_472-216663-0_sp1.1 from training. Duration: 0.836375 2023-10-04 06:21:29,971 WARNING [train.py:1204] (2/4) Exclude cut with ID _36512_210_6987_1_1533033143998_3126069_181-306500-0_sp1.1 from training. Duration: 0.863625 2023-10-04 06:21:30,105 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.1.feed_forward1.hidden_balancer.prob, batch_count=1555813.3333333333, ans=0.125 2023-10-04 06:21:31,302 WARNING [train.py:1204] (2/4) Exclude cut with ID _56524_210_9642_1_1534572011779_3619170_492-95008-0_sp1.1 from training. Duration: 0.97275 2023-10-04 06:21:33,238 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_33348_210_5632_1_1533086912_3324030_119-90326-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 06:21:34,015 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.conv_module2.balancer1.prob, batch_count=1555813.3333333333, ans=0.125 2023-10-04 06:21:35,234 WARNING [train.py:1204] (2/4) Exclude cut with ID _14368_210_4220_1_1530532714870_3595529_231-137892-0_sp1.1 from training. Duration: 0.9 2023-10-04 06:21:36,590 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8453_210_4211_1_1528502940_8562730_596-306820-0_sp1.1 from training. Duration: 0.8818125 2023-10-04 06:21:36,777 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.feed_forward2.hidden_balancer.prob, batch_count=1555880.0, ans=0.125 2023-10-04 06:21:37,921 WARNING [train.py:1204] (2/4) Exclude cut with ID _27052_210_5986_1_1532329197187_3585009_50-242750-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 06:21:40,712 WARNING [train.py:1204] (2/4) Exclude cut with ID _13913_210_9994_1_1530579615187_3383897_358-98435-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 06:21:40,725 WARNING [train.py:1204] (2/4) Exclude cut with ID _27013_210_1389_1_1532437173240_3252649_399-229778-0_sp1.1 from training. Duration: 0.963625 2023-10-04 06:21:44,823 WARNING [train.py:1204] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_185-174805-0_sp0.9 from training. Duration: 0.7555625 2023-10-04 06:21:50,851 WARNING [train.py:1204] (2/4) Exclude cut with ID _65585_210_12210_1_1535450451584_3764098_196-221014-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 06:21:50,959 WARNING [train.py:1204] (2/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_202-293523-0_sp1.1 from training. Duration: 0.6545625 2023-10-04 06:21:52,526 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8275_210_4198_1_1528455320_3892571_213-127634-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 06:21:52,576 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_40896_210_15710_1_1533433956_7100802_921-231678-0_sp1.1 from training. Duration: 0.97275 2023-10-04 06:21:53,936 WARNING [train.py:1204] (2/4) Exclude cut with ID _38020_210_5986_1_1533112252259_7006089_493-102745-0_sp1.1 from training. Duration: 0.8909375 2023-10-04 06:21:55,380 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6846_210_6753_1_1527850767_4295158_2-315073-0 from training. Duration: 0.88 2023-10-04 06:21:56,675 WARNING [train.py:1204] (2/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_249-281349-0 from training. Duration: 0.85 2023-10-04 06:21:58,215 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.feed_forward1.hidden_balancer.prob, batch_count=1555946.6666666667, ans=0.125 2023-10-04 06:21:59,440 WARNING [train.py:1204] (2/4) Exclude cut with ID _18513_210_9206_1_1531618215223_3862620_314-83429-0_sp1.1 from training. Duration: 0.97275 2023-10-04 06:22:02,324 WARNING [train.py:1204] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_215-202973-0_sp0.9 from training. Duration: 0.97775 2023-10-04 06:22:02,341 WARNING [train.py:1204] (2/4) Exclude cut with ID _7719_210_3525_1_1528456153163_4296960_88-250259-0_sp1.1 from training. Duration: 0.763625 2023-10-04 06:22:03,850 WARNING [train.py:1204] (2/4) Exclude cut with ID _7606_210_4915_1_1528894253277_5080690_301-10732-0_sp1.1 from training. Duration: 0.736375 2023-10-04 06:22:03,859 WARNING [train.py:1204] (2/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_818-11907-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 06:22:05,274 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_5959_210_1794_1_1527400400_3445726_412-44052-0_sp1.1 from training. Duration: 0.7 2023-10-04 06:22:07,230 WARNING [train.py:1204] (2/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_6-279195-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 06:22:10,507 WARNING [train.py:1204] (2/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_252-216674-0_sp0.9 from training. Duration: 0.6 2023-10-04 06:22:13,175 WARNING [train.py:1204] (2/4) Exclude cut with ID _14368_210_4220_1_1530532714870_3595529_144-3287-0_sp1.1 from training. Duration: 0.6090625 2023-10-04 06:22:13,294 WARNING [train.py:1204] (2/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_342-3879-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 06:22:13,325 WARNING [train.py:1204] (2/4) Exclude cut with ID _33640_210_2655_1_1533117605479_4855469_584-65630-0_sp1.1 from training. Duration: 0.97275 2023-10-04 06:22:14,594 WARNING [train.py:1204] (2/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_37-189262-0 from training. Duration: 0.98 2023-10-04 06:22:14,621 WARNING [train.py:1204] (2/4) Exclude cut with ID _9389_210_1664_1_1529217337301_5289269_379-128794-0_sp0.9 from training. Duration: 0.9444375 2023-10-04 06:22:16,046 WARNING [train.py:1204] (2/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_119-207162-0_sp1.1 from training. Duration: 0.6454375 2023-10-04 06:22:20,661 WARNING [train.py:1204] (2/4) Exclude cut with ID _2941_210_2577_1_1526199673150_2489759_200-333643-0_sp1.1 from training. Duration: 0.936375 2023-10-04 06:22:23,441 WARNING [train.py:1204] (2/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_479-125611-0_sp1.1 from training. Duration: 0.87275 2023-10-04 06:22:23,460 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_3475_210_2028_1_1526193578_3622569_64-297072-0_sp1.1 from training. Duration: 0.82725 2023-10-04 06:22:23,492 WARNING [train.py:1204] (2/4) Exclude cut with ID _53565_210_10635_1_1534337218335_3820259_139-346291-0_sp1.1 from training. Duration: 0.97275 2023-10-04 06:22:23,533 WARNING [train.py:1204] (2/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_588-125165-0_sp1.1 from training. Duration: 0.7545625 2023-10-04 06:22:23,591 WARNING [train.py:1204] (2/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_938-205135-0_sp1.1 from training. Duration: 0.7909375 2023-10-04 06:22:26,348 WARNING [train.py:1204] (2/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_216-48707-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 06:22:26,394 WARNING [train.py:1204] (2/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_477-255782-0_sp1.1 from training. Duration: 0.7454375 2023-10-04 06:22:26,440 WARNING [train.py:1204] (2/4) Exclude cut with ID _16909_210_5724_1_1531292079912_4118919_583-92842-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 06:22:27,786 WARNING [train.py:1204] (2/4) Exclude cut with ID _60860_210_8254_1_1534896015875_3557169_245-256666-0 from training. Duration: 0.97 2023-10-04 06:22:32,036 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_45399_210_4852_1_1533624725_7579737_349-116173-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 06:22:36,126 WARNING [train.py:1204] (2/4) Exclude cut with ID _8699_210_2069_1_1528630346831_3960409_155-39766-0 from training. Duration: 0.78 2023-10-04 06:22:36,135 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_86-16881-0_sp1.1 from training. Duration: 0.52725 2023-10-04 06:22:37,368 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.667e+02 2.058e+02 2.273e+02 2.535e+02 3.965e+02, threshold=4.546e+02, percent-clipped=0.0 2023-10-04 06:22:37,394 INFO [train.py:1046] (2/4) Epoch 44, batch 5000, loss[loss=0.1606, simple_loss=0.2497, pruned_loss=0.03575, over 24558.00 frames. ], tot_loss[loss=0.1537, simple_loss=0.2338, pruned_loss=0.03677, over 4692630.95 frames. ], batch size: 71, lr: 2.30e-03, grad_scale: 16.0 2023-10-04 06:22:42,919 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_14056_210_4163_1_1530604483_7572827_191-159384-0_sp1.1 from training. Duration: 0.97275 2023-10-04 06:22:42,938 WARNING [train.py:1204] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_307-158462-0_sp1.1 from training. Duration: 0.62725 2023-10-04 06:22:45,600 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7544_210_6753_1_1528591327_1471426_33-346031-0 from training. Duration: 0.61 2023-10-04 06:22:45,678 WARNING [train.py:1204] (2/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_112-149213-0 from training. Duration: 0.95 2023-10-04 06:22:47,213 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.bypass.scale_min, batch_count=1556146.6666666667, ans=0.2 2023-10-04 06:22:48,344 WARNING [train.py:1204] (2/4) Exclude cut with ID _55844_210_2346_1_1534471212569_7625707_925-191342-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 06:22:48,480 WARNING [train.py:1204] (2/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_87-278686-0 from training. Duration: 0.77 2023-10-04 06:22:49,839 WARNING [train.py:1204] (2/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_415-78792-0_sp1.1 from training. Duration: 0.736375 2023-10-04 06:22:49,858 WARNING [train.py:1204] (2/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_107-332398-0_sp0.9 from training. Duration: 0.8666875 2023-10-04 06:22:51,655 WARNING [train.py:1204] (2/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_72-251255-0 from training. Duration: 0.94 2023-10-04 06:22:51,684 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_431-227273-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 06:22:53,054 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7544_210_6753_1_1528587000_691468_87-178599-0_sp0.9 from training. Duration: 0.9666875 2023-10-04 06:22:54,459 WARNING [train.py:1204] (2/4) Exclude cut with ID _13916_210_9994_1_1530788341126_3763149_60-191556-0 from training. Duration: 0.61 2023-10-04 06:22:54,462 WARNING [train.py:1204] (2/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_746-164782-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 06:22:54,508 WARNING [train.py:1204] (2/4) Exclude cut with ID _33640_210_2655_1_1533117605479_4855469_596-21066-0_sp1.1 from training. Duration: 0.92725 2023-10-04 06:22:57,129 WARNING [train.py:1204] (2/4) Exclude cut with ID _40402_210_18219_1_1533522324115_2953150_320-14078-0 from training. Duration: 0.95 2023-10-04 06:22:57,163 WARNING [train.py:1204] (2/4) Exclude cut with ID _29877_210_6228_1_1532397791106_5276149_266-62291-0 from training. Duration: 0.87 2023-10-04 06:22:57,239 WARNING [train.py:1204] (2/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_176-56120-0_sp0.9 from training. Duration: 0.92225 2023-10-04 06:22:58,586 WARNING [train.py:1204] (2/4) Exclude cut with ID _6275_210_2084_1_1528110062213_7669049_91-26919-0 from training. Duration: 0.97 2023-10-04 06:22:58,600 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_224-322394-0_sp0.9 from training. Duration: 0.7333125 2023-10-04 06:22:58,639 WARNING [train.py:1204] (2/4) Exclude cut with ID _44982_210_1511_1_1534312820108_3726029_586-297059-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 06:22:59,960 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_196-289637-0_sp1.1 from training. Duration: 0.6818125 2023-10-04 06:22:59,961 WARNING [train.py:1204] (2/4) Exclude cut with ID _18513_210_9206_1_1531618215223_3862620_402-5957-0 from training. Duration: 0.99 2023-10-04 06:22:59,968 WARNING [train.py:1204] (2/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_81-300212-0 from training. Duration: 0.6 2023-10-04 06:23:01,379 WARNING [train.py:1204] (2/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_166-128941-0 from training. Duration: 0.54 2023-10-04 06:23:01,404 WARNING [train.py:1204] (2/4) Exclude cut with ID _58059_210_5854_1_1534901471149_3164808_57-309660-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 06:23:02,857 WARNING [train.py:1204] (2/4) Exclude cut with ID _60621_210_17071_1_1535013053530_3671739_73-155814-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 06:23:03,047 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.attention_skip_rate, batch_count=1556213.3333333333, ans=0.0 2023-10-04 06:23:04,186 WARNING [train.py:1204] (2/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_726-270780-0 from training. Duration: 0.86 2023-10-04 06:23:04,198 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8853_210_3273_1_1528633899_818968_51-187685-0_sp1.1 from training. Duration: 0.62725 2023-10-04 06:23:06,110 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_315-212794-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 06:23:08,064 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_53373_210_5399_1_1534229471_4715695_314-122795-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 06:23:08,167 WARNING [train.py:1204] (2/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_187-221566-0_sp0.9 from training. Duration: 0.47775 2023-10-04 06:23:09,607 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_12868_210_6753_1_1530702479_3610564_133-564-0 from training. Duration: 0.79 2023-10-04 06:23:10,934 WARNING [train.py:1204] (2/4) Exclude cut with ID _55153_210_2253_1_1534384786677_3162096_255-228344-0_sp1.1 from training. Duration: 0.936375 2023-10-04 06:23:11,054 WARNING [train.py:1204] (2/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_383-165234-0_sp1.1 from training. Duration: 0.8545625 2023-10-04 06:23:15,169 WARNING [train.py:1204] (2/4) Exclude cut with ID IC0677W0056-334889 from training. Duration: 0.768 2023-10-04 06:23:15,265 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder_embed.conv.8.prob, batch_count=1556280.0, ans=0.125 2023-10-04 06:23:17,924 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_14953_210_2151_1_1530701710_5616165_710-165400-0_sp0.9 from training. Duration: 0.9666875 2023-10-04 06:23:19,238 WARNING [train.py:1204] (2/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_355-236205-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 06:23:19,240 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_34450_210_9547_1_1533099443_4109050_503-246893-0_sp1.1 from training. Duration: 0.963625 2023-10-04 06:23:22,622 WARNING [train.py:1204] (2/4) Exclude cut with ID _43552_210_8913_1_1533726031059_3523429_25-120161-0 from training. Duration: 0.98 2023-10-04 06:23:22,652 WARNING [train.py:1204] (2/4) Exclude cut with ID _66112_210_10635_1_1535590236305_3801484_205-311930-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 06:23:22,682 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_48049_210_5632_1_1533864553_3519937_185-281218-0_sp1.1 from training. Duration: 0.92725 2023-10-04 06:23:22,710 WARNING [train.py:1204] (2/4) Exclude cut with ID _48782_210_12658_1_1534467701544_2300422_191-332415-0_sp1.1 from training. Duration: 0.97275 2023-10-04 06:23:25,282 WARNING [train.py:1204] (2/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_907-151398-0_sp1.1 from training. Duration: 0.336375 2023-10-04 06:23:25,323 WARNING [train.py:1204] (2/4) Exclude cut with ID _67499_210_17282_1_1535509805220_3692430_20-179346-0_sp1.1 from training. Duration: 0.8818125 2023-10-04 06:23:28,088 WARNING [train.py:1204] (2/4) Exclude cut with ID _14998_210_5219_1_1530705582080_7474970_441-91466-0_sp1.1 from training. Duration: 0.8818125 2023-10-04 06:23:29,530 WARNING [train.py:1204] (2/4) Exclude cut with ID _46681_210_9642_1_1533893005571_7603359_786-164007-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 06:23:34,214 WARNING [train.py:1204] (2/4) Exclude cut with ID _14970_210_15709_1_1530773543493_1715730_187-20259-0 from training. Duration: 0.89 2023-10-04 06:23:38,903 WARNING [train.py:1204] (2/4) Exclude cut with ID _12734_210_8341_1_1530183614296_3891929_383-339075-0_sp1.1 from training. Duration: 0.963625 2023-10-04 06:23:42,257 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.balancer2.prob, batch_count=1556413.3333333333, ans=0.125 2023-10-04 06:23:48,855 WARNING [train.py:1204] (2/4) Exclude cut with ID _37086_210_9038_1_1532999540891_7021250_834-322225-0_sp1.1 from training. Duration: 0.92725 2023-10-04 06:23:50,156 INFO [train.py:1046] (2/4) Epoch 44, batch 5050, loss[loss=0.1607, simple_loss=0.2492, pruned_loss=0.03613, over 24548.00 frames. ], tot_loss[loss=0.154, simple_loss=0.2343, pruned_loss=0.03683, over 4706926.08 frames. ], batch size: 71, lr: 2.30e-03, grad_scale: 16.0 2023-10-04 06:23:50,249 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_10987_210_4163_1_1529760555_4123902_56-290559-0_sp1.1 from training. Duration: 0.963625 2023-10-04 06:23:50,256 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_92-246270-0_sp0.9 from training. Duration: 0.9444375 2023-10-04 06:23:50,273 WARNING [train.py:1204] (2/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_56-13561-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 06:23:50,307 WARNING [train.py:1204] (2/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_393-57888-0_sp1.1 from training. Duration: 0.5818125 2023-10-04 06:23:51,613 WARNING [train.py:1204] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_168-32906-0_sp1.1 from training. Duration: 0.62725 2023-10-04 06:23:51,663 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_51225_210_3601_1_1534400634_2327383_127-207553-0_sp1.1 from training. Duration: 0.963625 2023-10-04 06:23:57,673 WARNING [train.py:1204] (2/4) Exclude cut with ID _58583_210_5236_1_1534726835266_3574249_494-203521-0_sp1.1 from training. Duration: 0.963625 2023-10-04 06:23:57,693 WARNING [train.py:1204] (2/4) Exclude cut with ID _49376_210_5986_1_1533985278732_3370740_354-344695-0 from training. Duration: 0.99 2023-10-04 06:23:57,769 WARNING [train.py:1204] (2/4) Exclude cut with ID _8021_210_7994_1_1528376451358_3653069_179-45206-0_sp1.1 from training. Duration: 0.7818125 2023-10-04 06:24:00,481 WARNING [train.py:1204] (2/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_742-154017-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 06:24:01,856 WARNING [train.py:1204] (2/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_222-198127-0_sp1.1 from training. Duration: 0.763625 2023-10-04 06:24:01,904 WARNING [train.py:1204] (2/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_235-307337-0 from training. Duration: 0.94 2023-10-04 06:24:03,221 WARNING [train.py:1204] (2/4) Exclude cut with ID _8393_210_6702_1_1528505860249_3457328_453-176500-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 06:24:03,269 WARNING [train.py:1204] (2/4) Exclude cut with ID _53565_210_10635_1_1534337218335_3820259_102-179528-0_sp1.1 from training. Duration: 0.97275 2023-10-04 06:24:06,602 WARNING [train.py:1204] (2/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_43-206757-0_sp1.1 from training. Duration: 0.6818125 2023-10-04 06:24:08,106 WARNING [train.py:1204] (2/4) Exclude cut with ID _8394_210_6702_1_1528590504316_3540189_335-274780-0_sp1.1 from training. Duration: 0.7090625 2023-10-04 06:24:08,148 WARNING [train.py:1204] (2/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_128-296581-0_sp1.1 from training. Duration: 0.67275 2023-10-04 06:24:12,097 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.3.encoder.layers.3.feed_forward3.out_whiten, num_groups=1, num_channels=512, metric=14.27 vs. limit=15.0 2023-10-04 06:24:19,058 WARNING [train.py:1204] (2/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_111-112362-0 from training. Duration: 0.91 2023-10-04 06:24:19,107 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_5522_210_3273_1_1527249606_3909096_366-172508-0_sp1.1 from training. Duration: 0.6 2023-10-04 06:24:20,470 WARNING [train.py:1204] (2/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_47-167373-0_sp1.1 from training. Duration: 0.836375 2023-10-04 06:24:20,512 WARNING [train.py:1204] (2/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_41-234353-0 from training. Duration: 0.68 2023-10-04 06:24:21,788 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6291_210_3273_1_1527683649_1078498_82-221196-0_sp0.9 from training. Duration: 0.8444375 2023-10-04 06:24:23,249 WARNING [train.py:1204] (2/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_47-4814-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 06:24:23,272 WARNING [train.py:1204] (2/4) Exclude cut with ID _43541_210_10626_1_1533796233418_3635440_40-247993-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 06:24:24,595 WARNING [train.py:1204] (2/4) Exclude cut with ID _21989_210_5854_1_1531899214931_3271360_133-44759-0_sp0.9 from training. Duration: 0.8555625 2023-10-04 06:24:24,597 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_94-195801-0 from training. Duration: 0.94 2023-10-04 06:24:24,670 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_141-89113-0 from training. Duration: 0.86 2023-10-04 06:24:26,107 WARNING [train.py:1204] (2/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_683-284578-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 06:24:27,508 WARNING [train.py:1204] (2/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_6-214815-0_sp1.1 from training. Duration: 0.77275 2023-10-04 06:24:31,029 WARNING [train.py:1204] (2/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_556-212082-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 06:24:32,353 WARNING [train.py:1204] (2/4) Exclude cut with ID _44902_210_16187_1_1533888006062_3792078_149-67212-0 from training. Duration: 0.87 2023-10-04 06:24:33,756 WARNING [train.py:1204] (2/4) Exclude cut with ID _61148_210_6897_1_1535094022726_4217610_333-159440-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 06:24:36,575 WARNING [train.py:1204] (2/4) Exclude cut with ID _3120_210_3525_1_1526036143112_4072980_404-249994-0 from training. Duration: 0.99 2023-10-04 06:24:38,279 WARNING [train.py:1204] (2/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_234-260682-0_sp0.9 from training. Duration: 0.9333125 2023-10-04 06:24:38,297 WARNING [train.py:1204] (2/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_374-138971-0_sp1.1 from training. Duration: 0.8545625 2023-10-04 06:24:39,612 WARNING [train.py:1204] (2/4) Exclude cut with ID _53713_210_24116_1_1534314487762_7416751_743-302085-0_sp1.1 from training. Duration: 0.92725 2023-10-04 06:24:40,916 WARNING [train.py:1204] (2/4) Exclude cut with ID _18516_210_9206_1_1531704653206_3692406_198-334870-0_sp1.1 from training. Duration: 0.836375 2023-10-04 06:24:42,897 WARNING [train.py:1204] (2/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_395-73570-0_sp1.1 from training. Duration: 0.9 2023-10-04 06:24:44,208 WARNING [train.py:1204] (2/4) Exclude cut with ID _46683_210_9642_1_1533730913492_4591116_421-19547-0_sp1.1 from training. Duration: 0.936375 2023-10-04 06:24:46,050 WARNING [train.py:1204] (2/4) Exclude cut with ID _19896_210_1883_1_1531709022662_6649569_714-8668-0_sp1.1 from training. Duration: 0.963625 2023-10-04 06:24:46,077 WARNING [train.py:1204] (2/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_49-101072-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 06:24:46,083 WARNING [train.py:1204] (2/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_220-191080-0_sp1.1 from training. Duration: 0.8909375 2023-10-04 06:24:47,485 WARNING [train.py:1204] (2/4) Exclude cut with ID _3465_210_3525_1_1526175667060_3574049_62-335552-0 from training. Duration: 0.87 2023-10-04 06:24:48,811 WARNING [train.py:1204] (2/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_111-112362-0_sp1.1 from training. Duration: 0.82725 2023-10-04 06:24:48,922 WARNING [train.py:1204] (2/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_318-59097-0_sp0.9 from training. Duration: 0.8444375 2023-10-04 06:24:51,850 WARNING [train.py:1204] (2/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_98-255710-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 06:24:51,857 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9042W0003-515609 from training. Duration: 0.896 2023-10-04 06:24:51,865 WARNING [train.py:1204] (2/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_158-250116-0_sp0.9 from training. Duration: 0.688875 2023-10-04 06:24:53,327 WARNING [train.py:1204] (2/4) Exclude cut with ID _26750_210_5665_1_1532226678442_4063230_7-346885-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 06:24:53,363 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_37839_210_7234_1_1533542462_3689303_357-227078-0_sp1.1 from training. Duration: 0.963625 2023-10-04 06:24:53,396 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9052W0119-520904 from training. Duration: 0.896 2023-10-04 06:24:56,021 WARNING [train.py:1204] (2/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_249-281349-0_sp1.1 from training. Duration: 0.77275 2023-10-04 06:24:56,027 WARNING [train.py:1204] (2/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_147-293311-0 from training. Duration: 0.94 2023-10-04 06:24:56,028 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_158-21166-0_sp1.1 from training. Duration: 0.963625 2023-10-04 06:25:00,103 WARNING [train.py:1204] (2/4) Exclude cut with ID _14100_210_8341_1_1530529320613_3678069_207-332194-0_sp1.1 from training. Duration: 0.92725 2023-10-04 06:25:00,139 WARNING [train.py:1204] (2/4) Exclude cut with ID _9389_210_1664_1_1529217337301_5289269_324-211031-0_sp1.1 from training. Duration: 0.963625 2023-10-04 06:25:00,156 WARNING [train.py:1204] (2/4) Exclude cut with ID _14603_210_4220_1_1530615688441_3097319_133-158308-0 from training. Duration: 0.84 2023-10-04 06:25:00,355 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.conv_module2.balancer1.prob, batch_count=1556746.6666666667, ans=0.125 2023-10-04 06:25:01,929 WARNING [train.py:1204] (2/4) Exclude cut with ID _13252_210_1925_1_1530671372047_3336909_166-314607-0 from training. Duration: 0.54 2023-10-04 06:25:04,318 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.0.layers.0.self_attn2.whiten, num_groups=1, num_channels=192, metric=12.11 vs. limit=22.5 2023-10-04 06:25:05,120 INFO [train.py:1046] (2/4) Epoch 44, batch 5100, loss[loss=0.15, simple_loss=0.2382, pruned_loss=0.03094, over 24284.00 frames. ], tot_loss[loss=0.1541, simple_loss=0.2348, pruned_loss=0.03668, over 4725895.54 frames. ], batch size: 61, lr: 2.30e-03, grad_scale: 8.0 2023-10-04 06:25:05,221 WARNING [train.py:1204] (2/4) Exclude cut with ID _32704_210_8954_1_1533017488840_6747370_649-79538-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 06:25:05,229 WARNING [train.py:1204] (2/4) Exclude cut with ID _28394_210_8913_1_1532430049518_3650118_343-115667-0_sp1.1 from training. Duration: 0.97275 2023-10-04 06:25:06,394 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.588e+02 2.012e+02 2.210e+02 2.512e+02 3.231e+02, threshold=4.420e+02, percent-clipped=0.0 2023-10-04 06:25:06,499 WARNING [train.py:1204] (2/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_87-278686-0_sp0.9 from training. Duration: 0.8555625 2023-10-04 06:25:09,276 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9109W0003-549749 from training. Duration: 0.768 2023-10-04 06:25:11,988 WARNING [train.py:1204] (2/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_126-29104-0_sp1.1 from training. Duration: 0.77275 2023-10-04 06:25:14,037 WARNING [train.py:1204] (2/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_325-113798-0 from training. Duration: 0.85 2023-10-04 06:25:16,098 WARNING [train.py:1204] (2/4) Exclude cut with ID _6694_210_4599_1_1527852637490_3965529_116-109398-0 from training. Duration: 0.73 2023-10-04 06:25:17,450 WARNING [train.py:1204] (2/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_46-57119-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 06:25:17,714 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.conv_module1.balancer2.prob, batch_count=1556813.3333333333, ans=0.125 2023-10-04 06:25:18,885 WARNING [train.py:1204] (2/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_546-119312-0_sp1.1 from training. Duration: 0.9 2023-10-04 06:25:21,479 WARNING [train.py:1204] (2/4) Exclude cut with ID _60057_210_10640_1_1534903065793_3333150_468-306939-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 06:25:21,515 WARNING [train.py:1204] (2/4) Exclude cut with ID _15150_210_8341_1_1530752411803_7256384_22-138274-0 from training. Duration: 0.96 2023-10-04 06:25:21,539 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_289-180904-0 from training. Duration: 0.76 2023-10-04 06:25:24,498 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.balancer2.prob, batch_count=1556880.0, ans=0.125 2023-10-04 06:25:26,999 WARNING [train.py:1204] (2/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_64-133721-0_sp1.1 from training. Duration: 0.92725 2023-10-04 06:25:27,050 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_195-257512-0_sp1.1 from training. Duration: 0.6090625 2023-10-04 06:25:29,881 WARNING [train.py:1204] (2/4) Exclude cut with ID _66873_210_5517_1_1535454136506_4293370_704-101963-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 06:25:33,063 WARNING [train.py:1204] (2/4) Exclude cut with ID _4366_210_1388_1_1526792053633_3613819_231-211402-0 from training. Duration: 0.94 2023-10-04 06:25:33,119 WARNING [train.py:1204] (2/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_679-190385-0_sp1.1 from training. Duration: 0.97275 2023-10-04 06:25:36,436 WARNING [train.py:1204] (2/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_298-81459-0_sp1.1 from training. Duration: 0.963625 2023-10-04 06:25:36,445 WARNING [train.py:1204] (2/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_658-282610-0_sp0.9 from training. Duration: 0.588875 2023-10-04 06:25:37,896 WARNING [train.py:1204] (2/4) Exclude cut with ID _57572_210_5517_1_1534921219624_3809484_196-283734-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 06:25:39,287 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_53071_210_16554_1_1534490405_2954066_294-198554-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 06:25:39,293 WARNING [train.py:1204] (2/4) Exclude cut with ID _4866_210_2577_1_1527404412284_4364659_352-157930-0 from training. Duration: 0.81 2023-10-04 06:25:41,944 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9231W0113-610139 from training. Duration: 0.896 2023-10-04 06:25:43,292 WARNING [train.py:1204] (2/4) Exclude cut with ID _24271_210_12658_1_1531987270022_7190519_638-171906-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 06:25:43,324 WARNING [train.py:1204] (2/4) Exclude cut with ID _14170_210_5219_1_1530525949868_4469650_272-249985-0 from training. Duration: 0.67 2023-10-04 06:25:43,334 WARNING [train.py:1204] (2/4) Exclude cut with ID _64086_210_7994_1_1535270417019_7435949_449-31567-0 from training. Duration: 0.97 2023-10-04 06:25:47,908 WARNING [train.py:1204] (2/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_109-282568-0_sp1.1 from training. Duration: 0.97275 2023-10-04 06:25:55,664 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_121-45934-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 06:25:59,510 WARNING [train.py:1204] (2/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_314-126819-0 from training. Duration: 0.5 2023-10-04 06:25:59,540 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9309W0192-640319 from training. Duration: 0.512 2023-10-04 06:25:59,547 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9310W0003-640649 from training. Duration: 0.896 2023-10-04 06:26:02,256 WARNING [train.py:1204] (2/4) Exclude cut with ID _7160_210_1876_1_1527940923231_3703799_120-150943-0 from training. Duration: 0.81 2023-10-04 06:26:02,257 WARNING [train.py:1204] (2/4) Exclude cut with ID _40380_210_9059_1_1533380594759_4187380_6-33508-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 06:26:03,718 WARNING [train.py:1204] (2/4) Exclude cut with ID _59202_210_26984_1_1534816810612_3549174_137-318710-0 from training. Duration: 0.89 2023-10-04 06:26:07,127 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_435-313305-0 from training. Duration: 0.71 2023-10-04 06:26:08,538 WARNING [train.py:1204] (2/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_91-212486-0_sp1.1 from training. Duration: 0.6181875 2023-10-04 06:26:10,093 WARNING [train.py:1204] (2/4) Exclude cut with ID _14603_210_4220_1_1530615688441_3097319_133-158308-0_sp1.1 from training. Duration: 0.763625 2023-10-04 06:26:11,406 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_99-251314-0 from training. Duration: 0.87 2023-10-04 06:26:12,858 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_365-217119-0_sp1.1 from training. Duration: 0.57275 2023-10-04 06:26:14,145 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_3475_210_2028_1_1526193578_3622569_377-166133-0 from training. Duration: 0.86 2023-10-04 06:26:17,750 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.bypass.skip_rate, batch_count=1557146.6666666667, ans=0.07 2023-10-04 06:26:19,210 INFO [train.py:1046] (2/4) Epoch 44, batch 5150, loss[loss=0.1826, simple_loss=0.2564, pruned_loss=0.05444, over 19281.00 frames. ], tot_loss[loss=0.1553, simple_loss=0.236, pruned_loss=0.03726, over 4720326.94 frames. ], batch size: 388, lr: 2.30e-03, grad_scale: 8.0 2023-10-04 06:26:19,288 WARNING [train.py:1204] (2/4) Exclude cut with ID _13916_210_9994_1_1530788341126_3763149_116-21034-0_sp1.1 from training. Duration: 0.87275 2023-10-04 06:26:19,298 WARNING [train.py:1204] (2/4) Exclude cut with ID _64220_210_9527_1_1535250151772_4875259_142-216831-0_sp1.1 from training. Duration: 0.936375 2023-10-04 06:26:19,299 WARNING [train.py:1204] (2/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_490-207861-0_sp1.1 from training. Duration: 0.8909375 2023-10-04 06:26:19,354 WARNING [train.py:1204] (2/4) Exclude cut with ID _41314_210_12651_1_1533459476543_3766460_87-272068-0_sp0.9 from training. Duration: 0.92225 2023-10-04 06:26:19,383 WARNING [train.py:1204] (2/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_18-46756-0_sp1.1 from training. Duration: 0.7181875 2023-10-04 06:26:20,705 WARNING [train.py:1204] (2/4) Exclude cut with ID _39411_210_5986_1_1533538825030_3343649_98-311335-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 06:26:20,775 WARNING [train.py:1204] (2/4) Exclude cut with ID _21878_210_3525_1_1531738772002_7264330_113-18163-0 from training. Duration: 0.89 2023-10-04 06:26:20,777 WARNING [train.py:1204] (2/4) Exclude cut with ID _6653_210_2629_1_1527919172441_6732679_1103-56445-0 from training. Duration: 0.73 2023-10-04 06:26:22,097 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_3474_210_2028_1_1526188513_4825246_145-309131-0 from training. Duration: 0.74 2023-10-04 06:26:22,122 WARNING [train.py:1204] (2/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_120-211630-0_sp1.1 from training. Duration: 0.836375 2023-10-04 06:26:22,141 WARNING [train.py:1204] (2/4) Exclude cut with ID _8030_210_7497_1_1528444886781_3531089_32-31837-0 from training. Duration: 0.97 2023-10-04 06:26:23,455 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_19949_210_2161_1_1531706337_928079_8-160956-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 06:26:24,694 WARNING [train.py:1204] (2/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_321-215742-0_sp1.1 from training. Duration: 0.4545625 2023-10-04 06:26:26,104 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_583-124650-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 06:26:27,507 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_30434_210_13049_1_1532519384_981746_164-182755-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 06:26:32,399 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.2.encoder.layers.0.conv_module2.whiten, num_groups=1, num_channels=384, metric=5.23 vs. limit=15.0 2023-10-04 06:26:33,457 WARNING [train.py:1204] (2/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_339-104791-0_sp1.1 from training. Duration: 0.4454375 2023-10-04 06:26:33,480 WARNING [train.py:1204] (2/4) Exclude cut with ID _15026_210_8750_1_1530777602316_3291850_211-195043-0 from training. Duration: 0.8 2023-10-04 06:26:34,801 WARNING [train.py:1204] (2/4) Exclude cut with ID _29877_210_6228_1_1532397791106_5276149_487-182647-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 06:26:34,840 WARNING [train.py:1204] (2/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_127-566-0_sp0.9 from training. Duration: 0.9333125 2023-10-04 06:26:37,550 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.5.encoder.layers.0.conv_module1.whiten, num_groups=1, num_channels=256, metric=9.22 vs. limit=15.0 2023-10-04 06:26:38,447 WARNING [train.py:1204] (2/4) Exclude cut with ID _8263_210_1664_1_1528439452891_970220_68-181805-0_sp0.9 from training. Duration: 0.711125 2023-10-04 06:26:38,449 WARNING [train.py:1204] (2/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_504-72513-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 06:26:38,457 WARNING [train.py:1204] (2/4) Exclude cut with ID _64639_210_5816_1_1535367619437_3528000_324-265233-0_sp1.1 from training. Duration: 0.963625 2023-10-04 06:26:38,513 WARNING [train.py:1204] (2/4) Exclude cut with ID _6456_210_1534_1_1527681634212_3181049_215-309359-0_sp0.9 from training. Duration: 0.97775 2023-10-04 06:26:38,518 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_115-233929-0_sp1.1 from training. Duration: 0.6545625 2023-10-04 06:26:38,560 WARNING [train.py:1204] (2/4) Exclude cut with ID _14087_210_2253_1_1530529224512_3014029_219-224445-0 from training. Duration: 0.84 2023-10-04 06:26:41,300 WARNING [train.py:1204] (2/4) Exclude cut with ID _14867_210_1968_1_1530684179890_3846969_413-29292-0_sp1.1 from training. Duration: 0.8181875 2023-10-04 06:26:41,344 WARNING [train.py:1204] (2/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_1012-279473-0_sp0.9 from training. Duration: 0.7444375 2023-10-04 06:26:44,023 WARNING [train.py:1204] (2/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_605-133395-0_sp0.9 from training. Duration: 0.7333125 2023-10-04 06:26:44,156 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_89-35748-0 from training. Duration: 0.79 2023-10-04 06:26:46,097 WARNING [train.py:1204] (2/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_476-272757-0_sp1.1 from training. Duration: 0.8454375 2023-10-04 06:26:53,453 WARNING [train.py:1204] (2/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_449-15508-0_sp0.9 from training. Duration: 0.911125 2023-10-04 06:26:54,805 WARNING [train.py:1204] (2/4) Exclude cut with ID _29345_210_4381_1_1532689168263_3860169_198-174328-0 from training. Duration: 0.91 2023-10-04 06:26:57,618 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_15517_210_6753_1_1530952293_1664795_125-158157-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 06:26:59,300 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.conv_skip_rate, batch_count=1557280.0, ans=0.0 2023-10-04 06:27:03,283 WARNING [train.py:1204] (2/4) Exclude cut with ID _25565_210_12392_1_1532423009382_3952098_420-113180-0_sp1.1 from training. Duration: 0.963625 2023-10-04 06:27:05,069 WARNING [train.py:1204] (2/4) Exclude cut with ID _51471_210_1889_1_1534399391390_3094250_236-146773-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 06:27:08,473 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.nonlin_attention.balancer.prob, batch_count=1557346.6666666667, ans=0.125 2023-10-04 06:27:09,724 WARNING [train.py:1204] (2/4) Exclude cut with ID _30039_210_13270_1_1532484612900_2725150_175-35012-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 06:27:11,088 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_23850_210_4238_1_1532232597_1632433_99-275843-0_sp1.1 from training. Duration: 0.936375 2023-10-04 06:27:12,523 WARNING [train.py:1204] (2/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_658-282610-0 from training. Duration: 0.53 2023-10-04 06:27:15,328 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_37818_210_4238_1_1533620910_4720083_234-65934-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 06:27:15,551 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.conv_module2.balancer2.min_abs, batch_count=1557346.6666666667, ans=0.5 2023-10-04 06:27:17,286 WARNING [train.py:1204] (2/4) Exclude cut with ID _3067_210_2667_1_1526212974640_4503168_459-173867-0_sp1.1 from training. Duration: 0.8 2023-10-04 06:27:17,307 WARNING [train.py:1204] (2/4) Exclude cut with ID _8696_210_1876_1_1528545632440_3345058_209-54069-0_sp0.9 from training. Duration: 0.7444375 2023-10-04 06:27:21,065 WARNING [train.py:1204] (2/4) Exclude cut with ID _62574_210_2655_1_1535180407058_3933249_420-236465-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 06:27:21,150 WARNING [train.py:1204] (2/4) Exclude cut with ID _46681_210_9642_1_1533893005571_7603359_333-4042-0_sp1.1 from training. Duration: 0.936375 2023-10-04 06:27:22,485 WARNING [train.py:1204] (2/4) Exclude cut with ID _3541_210_2477_1_1526475408709_3263973_377-296839-0 from training. Duration: 0.55 2023-10-04 06:27:22,639 INFO [scaling.py:1118] (2/4) WithLoss: name=encoder.encoders.0.layers.0.self_attn_weights, loss-sum=0.000e+00 2023-10-04 06:27:27,246 WARNING [train.py:1204] (2/4) Exclude cut with ID _43997_210_4525_1_1533538632555_5189329_426-92599-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 06:27:27,341 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_43-71989-0_sp1.1 from training. Duration: 0.5181875 2023-10-04 06:27:30,121 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_2009_210_2071_1_1525481134_4588937_140-345530-0_sp1.1 from training. Duration: 0.963625 2023-10-04 06:27:30,131 WARNING [train.py:1204] (2/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_138-332773-0_sp0.9 from training. Duration: 0.9 2023-10-04 06:27:31,393 WARNING [train.py:1204] (2/4) Exclude cut with ID _13942_210_9988_1_1530604906036_3733360_499-55676-0_sp0.9 from training. Duration: 0.688875 2023-10-04 06:27:31,410 WARNING [train.py:1204] (2/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_259-34038-0_sp1.1 from training. Duration: 0.67275 2023-10-04 06:27:31,420 WARNING [train.py:1204] (2/4) Exclude cut with ID _3120_210_3525_1_1526036143112_4072980_404-249994-0_sp1.1 from training. Duration: 0.9 2023-10-04 06:27:31,445 WARNING [train.py:1204] (2/4) Exclude cut with ID _40402_210_18219_1_1533522324115_2953150_222-108810-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 06:27:32,831 INFO [train.py:1046] (2/4) Epoch 44, batch 5200, loss[loss=0.1567, simple_loss=0.2396, pruned_loss=0.03694, over 24525.00 frames. ], tot_loss[loss=0.1556, simple_loss=0.2361, pruned_loss=0.03758, over 4714159.73 frames. ], batch size: 66, lr: 2.30e-03, grad_scale: 16.0 2023-10-04 06:27:33,010 WARNING [train.py:1204] (2/4) Exclude cut with ID _22083_210_8254_1_1531702862439_3678970_49-234546-0_sp0.9 from training. Duration: 0.92225 2023-10-04 06:27:34,667 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.629e+02 2.022e+02 2.221e+02 2.671e+02 4.836e+02, threshold=4.441e+02, percent-clipped=1.0 2023-10-04 06:27:36,765 WARNING [train.py:1204] (2/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_199-295611-0_sp0.9 from training. Duration: 0.611125 2023-10-04 06:27:38,195 WARNING [train.py:1204] (2/4) Exclude cut with ID _43552_210_8913_1_1533726031059_3523429_43-192945-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 06:27:42,285 WARNING [train.py:1204] (2/4) Exclude cut with ID _14224_210_4381_1_1530529062037_4264010_364-22817-0 from training. Duration: 0.71 2023-10-04 06:27:43,600 WARNING [train.py:1204] (2/4) Exclude cut with ID _14922_210_6228_1_1530707435273_3693310_388-129552-0_sp1.1 from training. Duration: 0.87275 2023-10-04 06:27:43,648 WARNING [train.py:1204] (2/4) Exclude cut with ID _6537_210_1883_1_1527987559710_3597129_190-280227-0_sp1.1 from training. Duration: 0.97275 2023-10-04 06:27:46,382 WARNING [train.py:1204] (2/4) Exclude cut with ID _55844_210_2346_1_1534471212569_7625707_398-56897-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 06:27:49,420 WARNING [train.py:1204] (2/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_48-182155-0_sp1.1 from training. Duration: 0.7909375 2023-10-04 06:27:49,447 WARNING [train.py:1204] (2/4) Exclude cut with ID _29529_210_7234_1_1532393961455_3977460_582-18084-0_sp1.1 from training. Duration: 0.97275 2023-10-04 06:27:50,879 WARNING [train.py:1204] (2/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_912-282412-0 from training. Duration: 0.49 2023-10-04 06:27:51,144 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.self_attn_weights.pos_emb_skip_rate, batch_count=1557546.6666666667, ans=0.0 2023-10-04 06:27:52,408 WARNING [train.py:1204] (2/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_332-256168-0_sp0.9 from training. Duration: 0.8333125 2023-10-04 06:27:52,459 WARNING [train.py:1204] (2/4) Exclude cut with ID _64220_210_9527_1_1535250151772_4875259_55-250624-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 06:27:55,755 WARNING [train.py:1204] (2/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_264-108685-0 from training. Duration: 0.55 2023-10-04 06:27:57,201 WARNING [train.py:1204] (2/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_613-232856-0_sp0.9 from training. Duration: 0.6 2023-10-04 06:27:59,835 WARNING [train.py:1204] (2/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_68-212801-0_sp1.1 from training. Duration: 0.663625 2023-10-04 06:27:59,994 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.1.bypass_mid.scale_min, batch_count=1557546.6666666667, ans=0.2 2023-10-04 06:28:01,113 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6290_210_3273_1_1527595224_1282787_115-87095-0 from training. Duration: 0.55 2023-10-04 06:28:01,156 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_3080_210_2071_1_1526086102_4365166_312-8726-0 from training. Duration: 0.91 2023-10-04 06:28:04,359 WARNING [train.py:1204] (2/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_105-63309-0 from training. Duration: 0.98 2023-10-04 06:28:04,424 WARNING [train.py:1204] (2/4) Exclude cut with ID _2941_210_2577_1_1526199673150_2489759_201-160779-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 06:28:04,427 WARNING [train.py:1204] (2/4) Exclude cut with ID ID0406W0226-885434 from training. Duration: 0.997 2023-10-04 06:28:04,433 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_17292_210_6753_1_1531125388_2943734_353-328201-0_sp1.1 from training. Duration: 0.97275 2023-10-04 06:28:06,348 WARNING [train.py:1204] (2/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_572-96029-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 06:28:07,675 WARNING [train.py:1204] (2/4) Exclude cut with ID _14970_210_15709_1_1530775547361_1966965_6-23328-0_sp0.9 from training. Duration: 0.9555625 2023-10-04 06:28:07,718 WARNING [train.py:1204] (2/4) Exclude cut with ID _45012_210_12651_1_1533608435136_5371380_240-37101-0 from training. Duration: 0.93 2023-10-04 06:28:07,777 WARNING [train.py:1204] (2/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_685-90183-0_sp1.1 from training. Duration: 0.936375 2023-10-04 06:28:11,835 WARNING [train.py:1204] (2/4) Exclude cut with ID _13252_210_1925_1_1530671372047_3336909_41-22245-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 06:28:13,358 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_15126_210_6753_1_1530778677_2591185_120-277707-0 from training. Duration: 0.8 2023-10-04 06:28:14,664 WARNING [train.py:1204] (2/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_310-26186-0 from training. Duration: 0.7 2023-10-04 06:28:14,689 WARNING [train.py:1204] (2/4) Exclude cut with ID _14998_210_5219_1_1530705582080_7474970_787-294416-0 from training. Duration: 0.82 2023-10-04 06:28:18,913 WARNING [train.py:1204] (2/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_795-33252-0 from training. Duration: 0.37 2023-10-04 06:28:20,227 WARNING [train.py:1204] (2/4) Exclude cut with ID _7537_210_2084_1_1528502395431_3924359_113-199933-0_sp0.9 from training. Duration: 0.6555625 2023-10-04 06:28:25,091 WARNING [train.py:1204] (2/4) Exclude cut with ID _3465_210_3525_1_1526175667060_3574049_308-231735-0_sp0.9 from training. Duration: 0.811125 2023-10-04 06:28:25,117 WARNING [train.py:1204] (2/4) Exclude cut with ID _19504_210_9994_1_1531648863242_3325220_354-296765-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 06:28:27,138 WARNING [train.py:1204] (2/4) Exclude cut with ID _5409_210_1388_1_1527292850189_5865191_178-127970-0 from training. Duration: 0.57 2023-10-04 06:28:27,178 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_1466-248056-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 06:28:28,499 WARNING [train.py:1204] (2/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_535-14410-0_sp0.9 from training. Duration: 0.52225 2023-10-04 06:28:28,502 WARNING [train.py:1204] (2/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_747-129941-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 06:28:28,541 WARNING [train.py:1204] (2/4) Exclude cut with ID _15012_210_1968_1_1530856820429_5095350_688-178849-0_sp1.1 from training. Duration: 0.5818125 2023-10-04 06:28:30,126 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.conv_module2.balancer1.prob, batch_count=1557680.0, ans=0.125 2023-10-04 06:28:31,334 WARNING [train.py:1204] (2/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_356-45054-0_sp1.1 from training. Duration: 0.7818125 2023-10-04 06:28:32,659 WARNING [train.py:1204] (2/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_146-276944-0_sp0.9 from training. Duration: 0.92225 2023-10-04 06:28:36,725 WARNING [train.py:1204] (2/4) Exclude cut with ID _39204_210_4220_1_1533880547368_3191120_346-180306-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 06:28:38,117 WARNING [train.py:1204] (2/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_641-158017-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 06:28:38,117 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_19952_210_2161_1_1531875469_3851750_274-42137-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 06:28:43,552 WARNING [train.py:1204] (2/4) Exclude cut with ID _60804_210_9642_1_1534831199859_7268299_87-327819-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 06:28:43,636 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_9302_210_2550_1_1528889962_4655416_638-301620-0 from training. Duration: 0.47 2023-10-04 06:28:45,032 WARNING [train.py:1204] (2/4) Exclude cut with ID _44304_210_15374_1_1533639352765_4613129_302-22084-0_sp1.1 from training. Duration: 0.7818125 2023-10-04 06:28:45,052 WARNING [train.py:1204] (2/4) Exclude cut with ID _18513_210_9206_1_1531618215223_3862620_177-227063-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 06:28:46,393 WARNING [train.py:1204] (2/4) Exclude cut with ID _41326_210_5854_1_1533778076509_3492080_510-45444-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 06:28:46,438 WARNING [train.py:1204] (2/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_76-311963-0_sp1.1 from training. Duration: 0.636375 2023-10-04 06:28:47,814 INFO [train.py:1046] (2/4) Epoch 44, batch 5250, loss[loss=0.1634, simple_loss=0.2465, pruned_loss=0.04015, over 23230.00 frames. ], tot_loss[loss=0.1551, simple_loss=0.2358, pruned_loss=0.03724, over 4718798.11 frames. ], batch size: 119, lr: 2.30e-03, grad_scale: 16.0 2023-10-04 06:28:47,909 WARNING [train.py:1204] (2/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_156-302675-0_sp0.9 from training. Duration: 0.611125 2023-10-04 06:28:51,265 WARNING [train.py:1204] (2/4) Exclude cut with ID _53489_210_6228_1_1534298357169_3835970_367-148411-0_sp1.1 from training. Duration: 0.936375 2023-10-04 06:28:52,603 INFO [scaling.py:1118] (2/4) WithLoss: name=encoder.encoders.2.encoder.layers.0.self_attn_weights, loss-sum=0.000e+00 2023-10-04 06:28:53,564 WARNING [train.py:1204] (2/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_389-144525-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 06:28:53,609 WARNING [train.py:1204] (2/4) Exclude cut with ID _14069_210_5632_1_1530512692033_4803390_158-238427-0_sp1.1 from training. Duration: 0.963625 2023-10-04 06:28:56,286 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_486-151131-0_sp0.9 from training. Duration: 0.9444375 2023-10-04 06:29:00,872 WARNING [train.py:1204] (2/4) Exclude cut with ID _39846_210_12651_1_1533603651992_1318030_188-71831-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 06:29:02,304 WARNING [train.py:1204] (2/4) Exclude cut with ID _39411_210_5986_1_1533538825030_3343649_173-238248-0_sp1.1 from training. Duration: 0.7545625 2023-10-04 06:29:02,537 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.self_attn_weights.pos_emb_skip_rate, batch_count=1557880.0, ans=0.0 2023-10-04 06:29:03,662 WARNING [train.py:1204] (2/4) Exclude cut with ID _20684_210_6917_1_1531567751646_4203930_469-49081-0_sp1.1 from training. Duration: 0.92725 2023-10-04 06:29:05,703 WARNING [train.py:1204] (2/4) Exclude cut with ID _8833_210_5219_1_1528592665210_7553059_611-80313-0_sp1.1 from training. Duration: 0.5818125 2023-10-04 06:29:08,325 WARNING [train.py:1204] (2/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_440-141811-0 from training. Duration: 0.95 2023-10-04 06:29:08,342 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_41211_210_2544_1_1533348859_3702910_236-223652-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 06:29:10,319 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_40222_210_6228_1_1533385298_4481939_220-163998-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 06:29:24,934 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.nonlin_attention.balancer.min_positive, batch_count=1557946.6666666667, ans=0.05 2023-10-04 06:29:26,109 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.0.conv_module2.balancer2.prob, batch_count=1557946.6666666667, ans=0.125 2023-10-04 06:29:45,853 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.0.feed_forward1.out_proj.dropout_p, batch_count=1558080.0, ans=0.1 2023-10-04 06:29:45,907 INFO [scaling.py:1118] (2/4) WithLoss: name=encoder.encoders.0.layers.1.self_attn_weights, loss-sum=0.000e+00 2023-10-04 06:29:56,286 INFO [train.py:1046] (2/4) Epoch 44, batch 5300, loss[loss=0.1355, simple_loss=0.2042, pruned_loss=0.03337, over 23706.00 frames. ], tot_loss[loss=0.154, simple_loss=0.2344, pruned_loss=0.03684, over 4721529.80 frames. ], batch size: 232, lr: 2.30e-03, grad_scale: 16.0 2023-10-04 06:29:57,452 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.662e+02 2.098e+02 2.407e+02 2.784e+02 3.746e+02, threshold=4.815e+02, percent-clipped=0.0 2023-10-04 06:30:10,364 WARNING [train.py:1204] (2/4) Exclude cut with ID _14629_210_15709_1_1530604298182_956899_6-232695-0_sp1.1 from training. Duration: 0.8545625 2023-10-04 06:30:10,388 WARNING [train.py:1204] (2/4) Exclude cut with ID _13921_210_9994_1_1530841782272_3503520_327-69677-0 from training. Duration: 0.9 2023-10-04 06:30:10,414 WARNING [train.py:1204] (2/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_372-298093-0 from training. Duration: 0.8 2023-10-04 06:30:10,428 WARNING [train.py:1204] (2/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_515-139111-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 06:30:10,670 WARNING [train.py:1204] (2/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_85-65056-0_sp1.1 from training. Duration: 0.87275 2023-10-04 06:30:10,712 WARNING [train.py:1204] (2/4) Exclude cut with ID _15113_210_8254_1_1530842438579_3716279_209-54533-0_sp1.1 from training. Duration: 0.87275 2023-10-04 06:30:10,759 WARNING [train.py:1204] (2/4) Exclude cut with ID _70447_210_12392_1_1535972499300_3160420_123-312838-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 06:30:10,788 WARNING [train.py:1204] (2/4) Exclude cut with ID _60113_210_16271_1_1535164212481_4386079_70-63596-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 06:30:10,816 WARNING [train.py:1204] (2/4) Exclude cut with ID _14632_210_9988_1_1530691279392_3923200_225-188509-0_sp1.1 from training. Duration: 0.963625 2023-10-04 06:30:10,867 WARNING [train.py:1204] (2/4) Exclude cut with ID _34734_210_4525_1_1533365977542_4448720_584-180532-0_sp1.1 from training. Duration: 0.97275 2023-10-04 06:30:10,881 WARNING [train.py:1204] (2/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_351-294650-0_sp1.1 from training. Duration: 0.57275 2023-10-04 06:30:11,215 WARNING [train.py:1204] (2/4) Exclude cut with ID _13252_210_1925_1_1530671372047_3336909_263-168388-0_sp0.9 from training. Duration: 0.9555625 2023-10-04 06:30:11,242 WARNING [train.py:1204] (2/4) Exclude cut with ID _22083_210_8254_1_1531702862439_3678970_201-85738-0 from training. Duration: 0.9 2023-10-04 06:30:11,679 WARNING [train.py:1204] (2/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_237-58433-0 from training. Duration: 0.74 2023-10-04 06:30:11,707 WARNING [train.py:1204] (2/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_262-250826-0 from training. Duration: 0.99 2023-10-04 06:30:11,809 WARNING [train.py:1204] (2/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_927-43827-0_sp0.9 from training. Duration: 0.82225 2023-10-04 06:30:11,839 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_2821_210_2715_1_1526108385_3763560_73-296673-0 from training. Duration: 0.83 2023-10-04 06:30:11,917 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6287_210_3273_1_1527509506_2653716_182-199887-0 from training. Duration: 0.51 2023-10-04 06:30:12,048 WARNING [train.py:1204] (2/4) Exclude cut with ID _70519_210_4163_1_1535961595994_7458230_899-80223-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 06:30:12,310 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_210-234949-0_sp1.1 from training. Duration: 0.97275 2023-10-04 06:30:12,347 WARNING [train.py:1204] (2/4) Exclude cut with ID _30196_210_1664_1_1533708496522_7074140_768-278176-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 06:30:12,427 WARNING [train.py:1204] (2/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_315-128283-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 06:30:12,507 WARNING [train.py:1204] (2/4) Exclude cut with ID _14886_210_4220_1_1530702020355_3516190_330-323941-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 06:30:12,768 WARNING [train.py:1204] (2/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_264-348896-0_sp1.1 from training. Duration: 0.77275 2023-10-04 06:30:12,790 WARNING [train.py:1204] (2/4) Exclude cut with ID _37086_210_9038_1_1532999540891_7021250_433-10563-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 06:30:12,828 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_53638_210_21581_1_1534297319_5808449_885-80172-0_sp1.1 from training. Duration: 0.87275 2023-10-04 06:30:12,924 WARNING [train.py:1204] (2/4) Exclude cut with ID _14623_210_1925_1_1530773354962_3632609_157-218771-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 06:30:12,935 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_1368-78165-0_sp1.1 from training. Duration: 0.97275 2023-10-04 06:30:12,940 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_309-65177-0_sp0.9 from training. Duration: 0.97775 2023-10-04 06:30:12,946 WARNING [train.py:1204] (2/4) Exclude cut with ID _21632_210_2253_1_1531994001765_3744140_291-138812-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 06:30:12,958 WARNING [train.py:1204] (2/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_669-93163-0_sp1.1 from training. Duration: 0.8909375 2023-10-04 06:30:13,947 WARNING [train.py:1204] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_292-215346-0 from training. Duration: 0.97 2023-10-04 06:30:13,971 WARNING [train.py:1204] (2/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_286-222073-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 06:30:14,278 WARNING [train.py:1204] (2/4) Exclude cut with ID _59256_210_5986_1_1535076085997_3589120_121-237386-0_sp1.1 from training. Duration: 0.87275 2023-10-04 06:30:14,292 WARNING [train.py:1204] (2/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_96-320736-0 from training. Duration: 0.86 2023-10-04 06:30:14,302 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_128-202104-0 from training. Duration: 0.63 2023-10-04 06:30:14,390 WARNING [train.py:1204] (2/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_242-3472-0_sp0.9 from training. Duration: 0.811125 2023-10-04 06:30:14,439 WARNING [train.py:1204] (2/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_154-74182-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 06:30:14,443 WARNING [train.py:1204] (2/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_187-221566-0 from training. Duration: 0.43 2023-10-04 06:30:14,620 WARNING [train.py:1204] (2/4) Exclude cut with ID _6194_210_1534_1_1527511826160_3941194_14-282876-0 from training. Duration: 0.71 2023-10-04 06:30:14,661 WARNING [train.py:1204] (2/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_660-211088-0_sp1.1 from training. Duration: 0.563625 2023-10-04 06:30:15,017 WARNING [train.py:1204] (2/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_526-62079-0_sp1.1 from training. Duration: 0.6090625 2023-10-04 06:30:15,179 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_92-246270-0_sp1.1 from training. Duration: 0.77275 2023-10-04 06:30:15,273 WARNING [train.py:1204] (2/4) Exclude cut with ID IC0287W0023-142350 from training. Duration: 0.956 2023-10-04 06:30:15,360 WARNING [train.py:1204] (2/4) Exclude cut with ID _58605_210_12210_1_1534723195939_3904259_48-269402-0 from training. Duration: 0.96 2023-10-04 06:30:15,377 WARNING [train.py:1204] (2/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_416-257730-0_sp0.9 from training. Duration: 0.72225 2023-10-04 06:30:15,458 WARNING [train.py:1204] (2/4) Exclude cut with ID _7877_210_6014_1_1528286358099_3750136_185-17516-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 06:30:15,564 WARNING [train.py:1204] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_23-189234-0 from training. Duration: 0.83 2023-10-04 06:30:15,582 WARNING [train.py:1204] (2/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_237-89684-0 from training. Duration: 0.96 2023-10-04 06:30:15,653 WARNING [train.py:1204] (2/4) Exclude cut with ID _13916_210_9994_1_1530788341126_3763149_116-21034-0 from training. Duration: 0.96 2023-10-04 06:30:15,816 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_3133_210_2087_1_1526106446_2324690_246-125000-0_sp1.1 from training. Duration: 0.563625 2023-10-04 06:30:22,556 INFO [train.py:1046] (2/4) Epoch 45, batch 0, loss[loss=0.1671, simple_loss=0.2565, pruned_loss=0.0388, over 24278.00 frames. ], tot_loss[loss=0.1671, simple_loss=0.2565, pruned_loss=0.0388, over 24278.00 frames. ], batch size: 74, lr: 2.27e-03, grad_scale: 32.0 2023-10-04 06:30:22,557 INFO [train.py:1069] (2/4) Computing validation loss 2023-10-04 06:30:34,475 INFO [train.py:1078] (2/4) Epoch 45, validation: loss=0.3306, simple_loss=0.275, pruned_loss=0.1931, over 1125622.00 frames. 2023-10-04 06:30:34,476 INFO [train.py:1079] (2/4) Maximum memory allocated so far is 20967MB 2023-10-04 06:30:35,880 WARNING [train.py:1204] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_402-253027-0 from training. Duration: 0.54 2023-10-04 06:30:37,140 WARNING [train.py:1204] (2/4) Exclude cut with ID _59288_210_5854_1_1535077757774_3412246_158-289345-0_sp1.1 from training. Duration: 0.92725 2023-10-04 06:30:38,540 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_28211_210_6388_1_1532861506_4523224_128-328162-0_sp1.1 from training. Duration: 0.8181875 2023-10-04 06:30:42,811 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_788-278281-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 06:30:42,819 WARNING [train.py:1204] (2/4) Exclude cut with ID _49728_210_12651_1_1534041034294_6788150_624-195301-0_sp0.9 from training. Duration: 0.9444375 2023-10-04 06:30:42,867 WARNING [train.py:1204] (2/4) Exclude cut with ID _14860_210_4915_1_1530702020340_7343240_267-139775-0_sp1.1 from training. Duration: 0.963625 2023-10-04 06:30:44,135 WARNING [train.py:1204] (2/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_332-221057-0 from training. Duration: 0.68 2023-10-04 06:30:45,497 WARNING [train.py:1204] (2/4) Exclude cut with ID _40402_210_18219_1_1533522324115_2953150_242-76943-0 from training. Duration: 0.94 2023-10-04 06:30:45,718 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=1558226.6666666667, ans=0.1 2023-10-04 06:30:47,558 WARNING [train.py:1204] (2/4) Exclude cut with ID _16747_210_5986_1_1531811544733_2749109_87-349587-0_sp1.1 from training. Duration: 0.963625 2023-10-04 06:30:48,711 WARNING [train.py:1204] (2/4) Exclude cut with ID _22982_210_1883_1_1531882790447_5449409_505-346452-0_sp1.1 from training. Duration: 0.963625 2023-10-04 06:30:51,502 WARNING [train.py:1204] (2/4) Exclude cut with ID _17956_210_4353_1_1531209568125_6979970_13-192107-0_sp1.1 from training. Duration: 0.963625 2023-10-04 06:30:51,546 WARNING [train.py:1204] (2/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_430-307666-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 06:30:52,858 WARNING [train.py:1204] (2/4) Exclude cut with ID _2895_210_2477_1_1525956785794_4063750_289-11475-0_sp1.1 from training. Duration: 0.7454375 2023-10-04 06:30:54,101 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_3910_210_1794_1_1526742306_334530_28-238909-0_sp0.9 from training. Duration: 0.9 2023-10-04 06:30:56,862 WARNING [train.py:1204] (2/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_18-130336-0 from training. Duration: 0.79 2023-10-04 06:30:58,308 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_548-294316-0_sp0.9 from training. Duration: 0.9 2023-10-04 06:31:05,936 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_42-142473-0_sp1.1 from training. Duration: 0.6545625 2023-10-04 06:31:05,942 WARNING [train.py:1204] (2/4) Exclude cut with ID _59170_210_6721_1_1534983633960_4203335_326-48066-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 06:31:07,927 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_45886_210_17024_1_1533709215_471063_7-79165-0 from training. Duration: 0.98 2023-10-04 06:31:12,024 WARNING [train.py:1204] (2/4) Exclude cut with ID _44585_210_18628_1_1533605350387_4518199_280-73534-0_sp0.9 from training. Duration: 0.988875 2023-10-04 06:31:12,032 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_3475_210_2028_1_1526193578_3622569_265-276625-0_sp1.1 from training. Duration: 0.6909375 2023-10-04 06:31:13,429 WARNING [train.py:1204] (2/4) Exclude cut with ID _64631_210_27767_1_1535349580068_4165160_88-225232-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 06:31:13,732 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.conv_module2.balancer1.prob, batch_count=1558360.0, ans=0.125 2023-10-04 06:31:16,370 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_205-265518-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 06:31:19,205 WARNING [train.py:1204] (2/4) Exclude cut with ID _40402_210_18219_1_1533522324115_2953150_271-253734-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 06:31:20,008 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=1558426.6666666667, ans=0.1 2023-10-04 06:31:25,319 WARNING [train.py:1204] (2/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_282-257791-0 from training. Duration: 0.86 2023-10-04 06:31:25,452 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.conv_module1.balancer2.prob, batch_count=1558426.6666666667, ans=0.125 2023-10-04 06:31:28,171 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.feed_forward1.hidden_balancer.prob, batch_count=1558426.6666666667, ans=0.125 2023-10-04 06:31:29,422 WARNING [train.py:1204] (2/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_356-45054-0 from training. Duration: 0.86 2023-10-04 06:31:29,473 WARNING [train.py:1204] (2/4) Exclude cut with ID _69584_210_5219_1_1535785133775_4044450_38-348116-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 06:31:29,474 WARNING [train.py:1204] (2/4) Exclude cut with ID _66763_210_17301_1_1535788667617_241750_21-328480-0_sp1.1 from training. Duration: 0.936375 2023-10-04 06:31:30,921 WARNING [train.py:1204] (2/4) Exclude cut with ID _26528_210_9070_1_1532570410842_3851310_497-97218-0_sp0.9 from training. Duration: 0.9555625 2023-10-04 06:31:30,981 WARNING [train.py:1204] (2/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_592-101075-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 06:31:34,274 WARNING [train.py:1204] (2/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_678-159307-0 from training. Duration: 0.85 2023-10-04 06:31:37,901 WARNING [train.py:1204] (2/4) Exclude cut with ID _28427_210_4220_1_1532581240967_3061069_166-319773-0_sp1.1 from training. Duration: 0.936375 2023-10-04 06:31:37,982 WARNING [train.py:1204] (2/4) Exclude cut with ID _29877_210_6228_1_1532397791106_5276149_606-224984-0_sp1.1 from training. Duration: 0.936375 2023-10-04 06:31:42,027 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_125-203105-0_sp1.1 from training. Duration: 0.72725 2023-10-04 06:31:46,111 WARNING [train.py:1204] (2/4) Exclude cut with ID IC0620W0113-306765 from training. Duration: 0.768 2023-10-04 06:31:46,211 WARNING [train.py:1204] (2/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_410-287857-0_sp1.1 from training. Duration: 0.8454375 2023-10-04 06:31:47,523 INFO [train.py:1046] (2/4) Epoch 45, batch 50, loss[loss=0.1536, simple_loss=0.2325, pruned_loss=0.03731, over 23833.00 frames. ], tot_loss[loss=0.1559, simple_loss=0.2387, pruned_loss=0.03653, over 1072963.32 frames. ], batch size: 212, lr: 2.27e-03, grad_scale: 32.0 2023-10-04 06:31:50,342 WARNING [train.py:1204] (2/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_78-95260-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 06:31:53,036 WARNING [train.py:1204] (2/4) Exclude cut with ID _58059_210_5854_1_1534901471149_3164808_6-73080-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 06:31:53,049 WARNING [train.py:1204] (2/4) Exclude cut with ID _5166_210_2084_1_1527073197160_4087993_331-117829-0 from training. Duration: 0.62 2023-10-04 06:31:53,108 WARNING [train.py:1204] (2/4) Exclude cut with ID _14880_210_8895_1_1530680426655_3795259_289-212817-0_sp0.9 from training. Duration: 0.7666875 2023-10-04 06:31:53,150 WARNING [train.py:1204] (2/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_620-144779-0_sp1.1 from training. Duration: 0.92725 2023-10-04 06:31:56,326 WARNING [train.py:1204] (2/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_561-288575-0_sp1.1 from training. Duration: 0.963625 2023-10-04 06:31:56,428 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_22657_210_6890_1_1532086002_1637216_122-188128-0_sp1.1 from training. Duration: 0.963625 2023-10-04 06:31:57,743 WARNING [train.py:1204] (2/4) Exclude cut with ID _16756_210_4915_1_1531047803763_7104259_604-86607-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 06:31:59,451 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.conv_module2.balancer1.prob, batch_count=1558560.0, ans=0.125 2023-10-04 06:32:00,554 WARNING [train.py:1204] (2/4) Exclude cut with ID _15150_210_8341_1_1530752411803_7256384_469-239509-0 from training. Duration: 0.68 2023-10-04 06:32:00,562 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_10987_210_4163_1_1529760555_4123902_302-87926-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 06:32:08,241 WARNING [train.py:1204] (2/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_469-94673-0_sp0.9 from training. Duration: 0.87775 2023-10-04 06:32:08,353 WARNING [train.py:1204] (2/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_798-106179-0 from training. Duration: 0.83 2023-10-04 06:32:10,385 WARNING [train.py:1204] (2/4) Exclude cut with ID _16744_210_5986_1_1531724405384_3907567_242-111316-0 from training. Duration: 0.93 2023-10-04 06:32:12,067 WARNING [train.py:1204] (2/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_938-205135-0_sp0.9 from training. Duration: 0.9666875 2023-10-04 06:32:13,517 WARNING [train.py:1204] (2/4) Exclude cut with ID _64135_210_2253_1_1535677267876_7608972_133-188266-0_sp0.9 from training. Duration: 0.9 2023-10-04 06:32:13,521 WARNING [train.py:1204] (2/4) Exclude cut with ID _44585_210_18628_1_1533605350387_4518199_623-346820-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 06:32:14,963 WARNING [train.py:1204] (2/4) Exclude cut with ID _42326_210_7770_1_1533814282531_3707269_454-266903-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 06:32:16,223 WARNING [train.py:1204] (2/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_5-55893-0_sp1.1 from training. Duration: 0.5 2023-10-04 06:32:16,260 WARNING [train.py:1204] (2/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_577-332600-0_sp1.1 from training. Duration: 0.5181875 2023-10-04 06:32:16,261 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_957-138882-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 06:32:16,412 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.1.attention_skip_rate, batch_count=1558693.3333333333, ans=0.0 2023-10-04 06:32:23,245 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.1.conv_module2.balancer1.prob, batch_count=1558693.3333333333, ans=0.125 2023-10-04 06:32:26,510 WARNING [train.py:1204] (2/4) Exclude cut with ID _12556_210_8341_1_1530081024100_7327690_219-38004-0_sp1.1 from training. Duration: 0.97275 2023-10-04 06:32:27,839 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_397-147196-0_sp1.1 from training. Duration: 0.72725 2023-10-04 06:32:27,848 WARNING [train.py:1204] (2/4) Exclude cut with ID _3541_210_2477_1_1526475408709_3263973_231-104736-0_sp0.9 from training. Duration: 0.8666875 2023-10-04 06:32:29,231 WARNING [train.py:1204] (2/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_398-334558-0 from training. Duration: 0.96 2023-10-04 06:32:30,692 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_257-20733-0_sp1.1 from training. Duration: 0.6909375 2023-10-04 06:32:32,095 WARNING [train.py:1204] (2/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_146-282400-0_sp1.1 from training. Duration: 0.6454375 2023-10-04 06:32:32,099 WARNING [train.py:1204] (2/4) Exclude cut with ID _51899_210_20034_1_1534129254466_3669220_265-215428-0 from training. Duration: 0.98 2023-10-04 06:32:32,169 WARNING [train.py:1204] (2/4) Exclude cut with ID _34423_210_8750_1_1533430798456_3330199_219-106541-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 06:32:33,510 WARNING [train.py:1204] (2/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_458-67321-0 from training. Duration: 0.97 2023-10-04 06:32:41,374 WARNING [train.py:1204] (2/4) Exclude cut with ID _6653_210_2629_1_1527919172441_6732679_1051-182606-0_sp1.1 from training. Duration: 0.936375 2023-10-04 06:32:41,389 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_53555_210_5632_1_1534595220_7232297_603-4396-0_sp1.1 from training. Duration: 0.97275 2023-10-04 06:32:43,231 WARNING [train.py:1204] (2/4) Exclude cut with ID _54218_210_12210_1_1534413646388_4409259_159-144367-0_sp1.1 from training. Duration: 0.963625 2023-10-04 06:32:44,648 WARNING [train.py:1204] (2/4) Exclude cut with ID _7606_210_4915_1_1528894253277_5080690_301-10732-0_sp0.9 from training. Duration: 0.9 2023-10-04 06:32:44,651 WARNING [train.py:1204] (2/4) Exclude cut with ID _15026_210_8750_1_1530777602316_3291850_211-195043-0_sp0.9 from training. Duration: 0.888875 2023-10-04 06:32:45,943 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.720e+02 2.033e+02 2.243e+02 2.613e+02 6.562e+02, threshold=4.487e+02, percent-clipped=2.0 2023-10-04 06:32:46,169 WARNING [train.py:1204] (2/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_146-282400-0 from training. Duration: 0.71 2023-10-04 06:32:47,465 WARNING [train.py:1204] (2/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_65-88242-0 from training. Duration: 0.56 2023-10-04 06:32:47,554 WARNING [train.py:1204] (2/4) Exclude cut with ID _55857_210_2346_1_1534644007831_6898209_80-107757-0_sp1.1 from training. Duration: 0.963625 2023-10-04 06:32:48,853 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_15126_210_6753_1_1530778677_2591185_120-277707-0_sp0.9 from training. Duration: 0.888875 2023-10-04 06:32:48,972 WARNING [train.py:1204] (2/4) Exclude cut with ID _44778_210_2054_1_1533798127203_7443090_1033-168593-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 06:32:50,376 WARNING [train.py:1204] (2/4) Exclude cut with ID _46683_210_9642_1_1533730913492_4591116_475-30711-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 06:32:50,408 WARNING [train.py:1204] (2/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_253-308377-0 from training. Duration: 0.49 2023-10-04 06:32:50,466 WARNING [train.py:1204] (2/4) Exclude cut with ID _4680_210_3525_1_1527249757953_893386_75-78979-0 from training. Duration: 0.91 2023-10-04 06:32:51,750 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_5522_210_3273_1_1527249606_3909096_318-203153-0_sp0.9 from training. Duration: 0.47775 2023-10-04 06:32:53,288 WARNING [train.py:1204] (2/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_550-170116-0_sp1.1 from training. Duration: 0.92725 2023-10-04 06:32:53,323 WARNING [train.py:1204] (2/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_277-210798-0_sp0.9 from training. Duration: 0.611125 2023-10-04 06:32:53,383 WARNING [train.py:1204] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_620-276793-0 from training. Duration: 0.59 2023-10-04 06:32:53,393 WARNING [train.py:1204] (2/4) Exclude cut with ID _3067_210_2667_1_1526212974640_4503168_119-175776-0 from training. Duration: 0.74 2023-10-04 06:32:53,460 WARNING [train.py:1204] (2/4) Exclude cut with ID _55209_210_1531_1_1534675828996_4597160_681-195827-0_sp1.1 from training. Duration: 0.92725 2023-10-04 06:32:55,417 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_427-78097-0_sp1.1 from training. Duration: 0.72725 2023-10-04 06:32:56,136 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.1.encoder.layers.0.feed_forward2.out_whiten, num_groups=1, num_channels=256, metric=4.49 vs. limit=15.0 2023-10-04 06:32:57,815 WARNING [train.py:1204] (2/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_96-290039-0_sp0.9 from training. Duration: 0.62225 2023-10-04 06:32:57,832 WARNING [train.py:1204] (2/4) Exclude cut with ID _45329_210_7005_1_1534157951211_6774339_287-292174-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 06:33:00,572 WARNING [train.py:1204] (2/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_200-292228-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 06:33:01,414 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.3.encoder.layers.3.self_attn_weights.whiten_keys, num_groups=8, num_channels=256, metric=2.38 vs. limit=6.0 2023-10-04 06:33:02,051 INFO [train.py:1046] (2/4) Epoch 45, batch 100, loss[loss=0.158, simple_loss=0.2467, pruned_loss=0.03466, over 24457.00 frames. ], tot_loss[loss=0.1563, simple_loss=0.238, pruned_loss=0.0373, over 1883319.73 frames. ], batch size: 69, lr: 2.27e-03, grad_scale: 32.0 2023-10-04 06:33:02,105 WARNING [train.py:1204] (2/4) Exclude cut with ID _69884_210_9771_1_1535846392818_4166169_291-194999-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 06:33:04,863 WARNING [train.py:1204] (2/4) Exclude cut with ID _58895_210_8750_1_1535184081320_3649089_265-323810-0_sp1.1 from training. Duration: 0.87275 2023-10-04 06:33:08,024 WARNING [train.py:1204] (2/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_58-68956-0 from training. Duration: 0.83 2023-10-04 06:33:08,029 WARNING [train.py:1204] (2/4) Exclude cut with ID _39867_210_9826_1_1533553222030_9344259_322-115153-0_sp1.1 from training. Duration: 0.963625 2023-10-04 06:33:13,294 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_3371_210_1794_1_1526384462_5580777_396-339812-0_sp1.1 from training. Duration: 0.77275 2023-10-04 06:33:13,313 WARNING [train.py:1204] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_731-2428-0_sp0.9 from training. Duration: 0.92225 2023-10-04 06:33:13,339 WARNING [train.py:1204] (2/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_98-741-0_sp1.1 from training. Duration: 0.72725 2023-10-04 06:33:13,341 WARNING [train.py:1204] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_238-245655-0_sp0.9 from training. Duration: 0.9 2023-10-04 06:33:13,367 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6788_210_1388_1_1527898698_4562021_91-197981-0_sp0.9 from training. Duration: 0.92225 2023-10-04 06:33:14,712 WARNING [train.py:1204] (2/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_860-96198-0 from training. Duration: 0.73 2023-10-04 06:33:16,145 WARNING [train.py:1204] (2/4) Exclude cut with ID _14268_210_9206_1_1530622771285_4714099_331-105351-0_sp0.9 from training. Duration: 0.72225 2023-10-04 06:33:16,167 WARNING [train.py:1204] (2/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_204-137133-0_sp1.1 from training. Duration: 0.92725 2023-10-04 06:33:17,508 WARNING [train.py:1204] (2/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_5-46496-0_sp1.1 from training. Duration: 0.936375 2023-10-04 06:33:17,524 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_160-218721-0_sp1.1 from training. Duration: 0.87275 2023-10-04 06:33:20,370 WARNING [train.py:1204] (2/4) Exclude cut with ID _45329_210_7005_1_1534157951211_6774339_503-287883-0 from training. Duration: 0.88 2023-10-04 06:33:21,654 WARNING [train.py:1204] (2/4) Exclude cut with ID _57572_210_5517_1_1534921219624_3809484_110-79134-0_sp1.1 from training. Duration: 0.92725 2023-10-04 06:33:21,727 WARNING [train.py:1204] (2/4) Exclude cut with ID _44585_210_18628_1_1533605350387_4518199_421-37641-0_sp1.1 from training. Duration: 0.936375 2023-10-04 06:33:23,050 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_42-142473-0_sp0.9 from training. Duration: 0.8 2023-10-04 06:33:25,175 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.ff3_skip_rate, batch_count=1558960.0, ans=0.0 2023-10-04 06:33:26,285 WARNING [train.py:1204] (2/4) Exclude cut with ID _5166_210_2084_1_1527073197160_4087993_310-114452-0_sp0.9 from training. Duration: 0.6555625 2023-10-04 06:33:30,291 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9029W0120-509520 from training. Duration: 0.768 2023-10-04 06:33:30,314 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9029W0433-509820 from training. Duration: 0.896 2023-10-04 06:33:31,731 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_40222_210_6228_1_1533385298_4481939_163-334586-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 06:33:31,731 WARNING [train.py:1204] (2/4) Exclude cut with ID _48677_210_5663_1_1533902405919_3892989_408-40740-0_sp1.1 from training. Duration: 0.7545625 2023-10-04 06:33:35,829 WARNING [train.py:1204] (2/4) Exclude cut with ID _24314_210_4915_1_1532135078116_7170179_521-258868-0_sp0.9 from training. Duration: 0.87775 2023-10-04 06:33:36,091 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.bypass.skip_rate, batch_count=1559026.6666666667, ans=0.04949747468305833 2023-10-04 06:33:37,264 WARNING [train.py:1204] (2/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_576-51198-0_sp1.1 from training. Duration: 0.92725 2023-10-04 06:33:37,371 WARNING [train.py:1204] (2/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_306-216871-0_sp1.1 from training. Duration: 0.97275 2023-10-04 06:33:42,089 WARNING [train.py:1204] (2/4) Exclude cut with ID _20684_210_6917_1_1531567751646_4203930_463-119982-0_sp1.1 from training. Duration: 0.97275 2023-10-04 06:33:44,051 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9084W0012-536850 from training. Duration: 0.64 2023-10-04 06:33:44,316 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.ff3_skip_rate, batch_count=1559026.6666666667, ans=0.0 2023-10-04 06:33:46,757 WARNING [train.py:1204] (2/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_847-41334-0_sp1.1 from training. Duration: 0.4 2023-10-04 06:33:48,303 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.bypass.skip_rate, batch_count=1559093.3333333333, ans=0.04949747468305833 2023-10-04 06:33:50,698 WARNING [train.py:1204] (2/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_326-165900-0_sp1.1 from training. Duration: 0.62725 2023-10-04 06:33:52,025 WARNING [train.py:1204] (2/4) Exclude cut with ID _7537_210_2084_1_1528502395431_3924359_128-268029-0_sp1.1 from training. Duration: 0.8909375 2023-10-04 06:33:53,537 WARNING [train.py:1204] (2/4) Exclude cut with ID _67499_210_17282_1_1535509805220_3692430_333-155030-0_sp1.1 from training. Duration: 0.97275 2023-10-04 06:33:58,226 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_34008_210_10431_1_1533345424_8183247_588-257177-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 06:33:58,420 WARNING [train.py:1204] (2/4) Exclude cut with ID _25914_210_6400_1_1532141317058_644740_52-96808-0_sp1.1 from training. Duration: 0.82725 2023-10-04 06:34:01,124 WARNING [train.py:1204] (2/4) Exclude cut with ID _56494_210_4220_1_1535284837625_3033169_69-138150-0_sp1.1 from training. Duration: 0.8545625 2023-10-04 06:34:02,607 WARNING [train.py:1204] (2/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_19-143553-0_sp1.1 from training. Duration: 0.97275 2023-10-04 06:34:03,967 WARNING [train.py:1204] (2/4) Exclude cut with ID _21878_210_3525_1_1531738772002_7264330_582-130735-0_sp1.1 from training. Duration: 0.936375 2023-10-04 06:34:04,088 WARNING [train.py:1204] (2/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_943-201356-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 06:34:05,374 WARNING [train.py:1204] (2/4) Exclude cut with ID _3231_210_2106_1_1526205864934_6784220_422-229276-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 06:34:05,391 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_48049_210_5632_1_1533864553_3519937_73-198717-0_sp1.1 from training. Duration: 0.97275 2023-10-04 06:34:05,453 WARNING [train.py:1204] (2/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_329-285801-0 from training. Duration: 0.91 2023-10-04 06:34:06,214 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.whiten.whitening_limit, batch_count=1559160.0, ans=12.0 2023-10-04 06:34:06,783 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9171W0114-579000 from training. Duration: 0.896 2023-10-04 06:34:06,802 WARNING [train.py:1204] (2/4) Exclude cut with ID _3120_210_3525_1_1526036143112_4072980_210-132675-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 06:34:08,149 WARNING [train.py:1204] (2/4) Exclude cut with ID _55857_210_2346_1_1534644007831_6898209_745-98391-0_sp0.9 from training. Duration: 0.8444375 2023-10-04 06:34:08,210 WARNING [train.py:1204] (2/4) Exclude cut with ID _5250_210_5219_1_1527249561806_8692110_615-322035-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 06:34:08,215 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_36756_210_7234_1_1533026991_966276_119-315767-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 06:34:08,236 WARNING [train.py:1204] (2/4) Exclude cut with ID _13916_210_9994_1_1530788341126_3763149_60-191556-0_sp1.1 from training. Duration: 0.5545625 2023-10-04 06:34:08,251 WARNING [train.py:1204] (2/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_177-236809-0_sp1.1 from training. Duration: 0.4454375 2023-10-04 06:34:08,292 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7002_210_1794_1_1527939683_5157286_184-229630-0_sp1.1 from training. Duration: 0.67275 2023-10-04 06:34:08,297 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_50428_210_5724_1_1534247532_4126418_464-18322-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 06:34:09,676 WARNING [train.py:1204] (2/4) Exclude cut with ID _35799_210_5854_1_1533263487887_3529071_466-131402-0_sp1.1 from training. Duration: 0.936375 2023-10-04 06:34:11,541 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_1259-132768-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 06:34:13,483 WARNING [train.py:1204] (2/4) Exclude cut with ID _6694_210_4599_1_1527852637490_3965529_144-185051-0_sp1.1 from training. Duration: 0.87275 2023-10-04 06:34:13,514 WARNING [train.py:1204] (2/4) Exclude cut with ID _64639_210_5816_1_1535367619437_3528000_59-133290-0_sp1.1 from training. Duration: 0.963625 2023-10-04 06:34:16,640 INFO [train.py:1046] (2/4) Epoch 45, batch 150, loss[loss=0.1659, simple_loss=0.2416, pruned_loss=0.04511, over 23852.00 frames. ], tot_loss[loss=0.1569, simple_loss=0.2377, pruned_loss=0.03808, over 2507796.87 frames. ], batch size: 195, lr: 2.27e-03, grad_scale: 32.0 2023-10-04 06:34:16,681 WARNING [train.py:1204] (2/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_223-49525-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 06:34:19,438 WARNING [train.py:1204] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_748-36150-0_sp1.1 from training. Duration: 0.82725 2023-10-04 06:34:19,439 WARNING [train.py:1204] (2/4) Exclude cut with ID _19896_210_1883_1_1531709022662_6649569_52-221627-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 06:34:19,491 WARNING [train.py:1204] (2/4) Exclude cut with ID _6789_210_1534_1_1527854471671_2870519_296-115766-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 06:34:23,448 WARNING [train.py:1204] (2/4) Exclude cut with ID _14624_210_1925_1_1530858690657_3642219_100-19871-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 06:34:23,503 WARNING [train.py:1204] (2/4) Exclude cut with ID _12555_210_8750_1_1530075594402_3780239_50-293675-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 06:34:26,338 WARNING [train.py:1204] (2/4) Exclude cut with ID _14880_210_8895_1_1530680426655_3795259_289-212817-0_sp1.1 from training. Duration: 0.62725 2023-10-04 06:34:28,178 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_14056_210_4163_1_1530604483_7572827_388-211626-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 06:34:31,049 WARNING [train.py:1204] (2/4) Exclude cut with ID _5289_210_2084_1_1527678202736_3779299_392-261988-0 from training. Duration: 0.65 2023-10-04 06:34:31,057 WARNING [train.py:1204] (2/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_124-345840-0 from training. Duration: 0.52 2023-10-04 06:34:31,062 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_2655_210_2087_1_1525688959_3897400_437-337690-0 from training. Duration: 0.63 2023-10-04 06:34:32,576 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_3780_210_2087_1_1526467391_239068_13-274633-0_sp1.1 from training. Duration: 0.92725 2023-10-04 06:34:32,581 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_805-71497-0_sp1.1 from training. Duration: 0.6454375 2023-10-04 06:34:32,665 WARNING [train.py:1204] (2/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_434-83000-0_sp1.1 from training. Duration: 0.8909375 2023-10-04 06:34:34,036 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_17613_210_2151_1_1531137569_2366160_285-219531-0_sp1.1 from training. Duration: 0.97275 2023-10-04 06:34:34,042 WARNING [train.py:1204] (2/4) Exclude cut with ID _56108_210_16380_1_1534505446159_3887959_22-37858-0_sp1.1 from training. Duration: 0.936375 2023-10-04 06:34:34,071 WARNING [train.py:1204] (2/4) Exclude cut with ID _28427_210_4220_1_1532581240967_3061069_200-88444-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 06:34:35,432 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_43802_210_2290_1_1533516203_8829504_77-279042-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 06:34:36,820 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9309W0193-640320 from training. Duration: 0.896 2023-10-04 06:34:38,123 WARNING [train.py:1204] (2/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_516-324055-0_sp1.1 from training. Duration: 0.936375 2023-10-04 06:34:40,172 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.conv_module1.balancer1.max_abs, batch_count=1559293.3333333333, ans=10.0 2023-10-04 06:34:45,159 WARNING [train.py:1204] (2/4) Exclude cut with ID _40094_210_16380_1_1533294005773_3641159_10-261240-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 06:34:46,645 WARNING [train.py:1204] (2/4) Exclude cut with ID _6267_210_2084_1_1527764658526_3894730_375-219887-0_sp1.1 from training. Duration: 0.6090625 2023-10-04 06:34:49,321 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6788_210_1388_1_1527898698_4562021_180-75288-0 from training. Duration: 0.99 2023-10-04 06:34:52,212 WARNING [train.py:1204] (2/4) Exclude cut with ID _8030_210_7497_1_1528444886781_3531089_262-86750-0_sp1.1 from training. Duration: 0.736375 2023-10-04 06:34:52,225 WARNING [train.py:1204] (2/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_934-325498-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 06:34:53,493 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_456-22141-0_sp0.9 from training. Duration: 0.97775 2023-10-04 06:34:54,904 WARNING [train.py:1204] (2/4) Exclude cut with ID _49643_210_7989_1_1534640244224_3939031_598-161926-0_sp1.1 from training. Duration: 0.8181875 2023-10-04 06:34:56,245 WARNING [train.py:1204] (2/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_345-49721-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 06:34:56,338 WARNING [train.py:1204] (2/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_440-141811-0_sp1.1 from training. Duration: 0.863625 2023-10-04 06:34:58,287 WARNING [train.py:1204] (2/4) Exclude cut with ID _64048_210_10626_1_1535198382802_3835179_394-212120-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 06:34:59,673 WARNING [train.py:1204] (2/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_91-277684-0 from training. Duration: 0.74 2023-10-04 06:35:03,884 WARNING [train.py:1204] (2/4) Exclude cut with ID _23038_210_9570_1_1532001584158_3652419_191-346343-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 06:35:05,239 WARNING [train.py:1204] (2/4) Exclude cut with ID _15140_210_9288_1_1530791388450_4880149_351-347974-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 06:35:05,268 WARNING [train.py:1204] (2/4) Exclude cut with ID _58605_210_12210_1_1534723195939_3904259_48-269402-0_sp1.1 from training. Duration: 0.87275 2023-10-04 06:35:05,274 WARNING [train.py:1204] (2/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_401-53423-0_sp0.9 from training. Duration: 0.611125 2023-10-04 06:35:05,423 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder_embed.conv.8.prob, batch_count=1559426.6666666667, ans=0.125 2023-10-04 06:35:06,649 WARNING [train.py:1204] (2/4) Exclude cut with ID _42659_210_2070_1_1533607442765_3120380_158-49004-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 06:35:08,673 WARNING [train.py:1204] (2/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_81-75849-0_sp1.1 from training. Duration: 0.3545625 2023-10-04 06:35:12,556 WARNING [train.py:1204] (2/4) Exclude cut with ID _27052_210_5986_1_1532329197187_3585009_314-106499-0_sp1.1 from training. Duration: 0.836375 2023-10-04 06:35:14,557 WARNING [train.py:1204] (2/4) Exclude cut with ID _4626_210_3532_1_1526817652717_3916859_446-206656-0_sp0.9 from training. Duration: 0.8444375 2023-10-04 06:35:16,218 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.635e+02 1.999e+02 2.185e+02 2.526e+02 4.113e+02, threshold=4.369e+02, percent-clipped=0.0 2023-10-04 06:35:16,312 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_53378_210_17282_1_1534221071_5769509_158-114960-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 06:35:17,663 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_3171_210_2087_1_1526109084_2330007_37-64334-0_sp0.9 from training. Duration: 0.911125 2023-10-04 06:35:17,691 WARNING [train.py:1204] (2/4) Exclude cut with ID _5410_210_1388_1_1527397055795_1719020_39-112221-0 from training. Duration: 0.69 2023-10-04 06:35:18,957 WARNING [train.py:1204] (2/4) Exclude cut with ID _8064_210_5399_1_1528372299291_4282973_4-91037-0_sp0.9 from training. Duration: 0.97775 2023-10-04 06:35:18,972 WARNING [train.py:1204] (2/4) Exclude cut with ID ID0079W0443-717795 from training. Duration: 0.541 2023-10-04 06:35:21,758 WARNING [train.py:1204] (2/4) Exclude cut with ID _34734_210_4525_1_1533365977542_4448720_766-297928-0_sp1.1 from training. Duration: 0.936375 2023-10-04 06:35:26,112 WARNING [train.py:1204] (2/4) Exclude cut with ID _51471_210_1889_1_1534399391390_3094250_310-337793-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 06:35:26,124 WARNING [train.py:1204] (2/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_678-159307-0_sp0.9 from training. Duration: 0.9444375 2023-10-04 06:35:28,906 WARNING [train.py:1204] (2/4) Exclude cut with ID _50093_210_4220_1_1534417141207_3432299_259-322252-0 from training. Duration: 0.92 2023-10-04 06:35:28,975 WARNING [train.py:1204] (2/4) Exclude cut with ID _47790_210_16187_1_1533862814066_4207519_67-103444-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 06:35:28,992 WARNING [train.py:1204] (2/4) Exclude cut with ID _40551_210_10635_1_1533776424632_3416179_311-247578-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 06:35:30,712 INFO [train.py:1046] (2/4) Epoch 45, batch 200, loss[loss=0.1528, simple_loss=0.231, pruned_loss=0.03726, over 23256.00 frames. ], tot_loss[loss=0.1568, simple_loss=0.238, pruned_loss=0.03783, over 2993266.31 frames. ], batch size: 119, lr: 2.27e-03, grad_scale: 16.0 2023-10-04 06:35:32,199 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_4633_210_2040_1_1526824163_4145343_416-76117-0 from training. Duration: 0.95 2023-10-04 06:35:33,610 WARNING [train.py:1204] (2/4) Exclude cut with ID _5166_210_2084_1_1527073197160_4087993_331-117829-0_sp0.9 from training. Duration: 0.688875 2023-10-04 06:35:36,206 WARNING [train.py:1204] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_299-247059-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 06:35:37,588 WARNING [train.py:1204] (2/4) Exclude cut with ID _64002_210_9570_1_1535198605157_38979_2-240845-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 06:35:37,843 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.balancer2.prob, batch_count=1559560.0, ans=0.125 2023-10-04 06:35:43,519 WARNING [train.py:1204] (2/4) Exclude cut with ID _4681_210_3525_1_1527250689464_120889_15-126760-0_sp0.9 from training. Duration: 0.9 2023-10-04 06:35:43,560 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_21500_210_2054_1_1532149310_2716758_256-156611-0_sp1.1 from training. Duration: 0.936375 2023-10-04 06:35:43,575 WARNING [train.py:1204] (2/4) Exclude cut with ID _16908_210_5724_1_1531119157248_3772700_624-203729-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 06:35:49,889 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.bypass.skip_rate, batch_count=1559626.6666666667, ans=0.04949747468305833 2023-10-04 06:36:00,989 WARNING [train.py:1204] (2/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_605-219732-0_sp1.1 from training. Duration: 0.8545625 2023-10-04 06:36:02,274 WARNING [train.py:1204] (2/4) Exclude cut with ID _44718_210_5816_1_1534050051869_6469030_380-167520-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 06:36:02,341 WARNING [train.py:1204] (2/4) Exclude cut with ID _19896_210_1883_1_1531709022662_6649569_94-229927-0_sp1.1 from training. Duration: 0.8818125 2023-10-04 06:36:04,346 WARNING [train.py:1204] (2/4) Exclude cut with ID _19514_210_9259_1_1531530071330_5084230_519-174300-0_sp1.1 from training. Duration: 0.963625 2023-10-04 06:36:05,681 WARNING [train.py:1204] (2/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_340-174176-0_sp0.9 from training. Duration: 0.5444375 2023-10-04 06:36:05,685 WARNING [train.py:1204] (2/4) Exclude cut with ID _8263_210_1664_1_1528439452891_970220_68-181805-0_sp1.1 from training. Duration: 0.5818125 2023-10-04 06:36:08,319 WARNING [train.py:1204] (2/4) Exclude cut with ID _3120_210_3525_1_1526036143112_4072980_348-257723-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 06:36:08,393 WARNING [train.py:1204] (2/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_453-133983-0_sp1.1 from training. Duration: 0.7090625 2023-10-04 06:36:09,749 WARNING [train.py:1204] (2/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_17-86060-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 06:36:09,770 WARNING [train.py:1204] (2/4) Exclude cut with ID _29877_210_6228_1_1532397791106_5276149_228-184657-0_sp1.1 from training. Duration: 0.97275 2023-10-04 06:36:11,201 WARNING [train.py:1204] (2/4) Exclude cut with ID _18516_210_9206_1_1531704653206_3692406_198-334870-0 from training. Duration: 0.92 2023-10-04 06:36:12,546 WARNING [train.py:1204] (2/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_340-174176-0_sp1.1 from training. Duration: 0.4454375 2023-10-04 06:36:12,559 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_3133_210_2087_1_1526106446_2324690_14-139186-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 06:36:12,825 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.balancer2.prob, batch_count=1559760.0, ans=0.125 2023-10-04 06:36:17,406 WARNING [train.py:1204] (2/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_126-29104-0_sp0.9 from training. Duration: 0.9444375 2023-10-04 06:36:20,914 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.0.feed_forward1.out_proj.dropout_p, batch_count=1559760.0, ans=0.1 2023-10-04 06:36:22,166 WARNING [train.py:1204] (2/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_84-10310-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 06:36:28,207 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_15517_210_6753_1_1530954008_3487336_79-273076-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 06:36:29,512 WARNING [train.py:1204] (2/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_371-18169-0_sp0.9 from training. Duration: 0.9555625 2023-10-04 06:36:35,048 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_68702_210_23689_1_1535799577_3420068_170-92788-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 06:36:35,180 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.conv_module1.balancer1.prob, batch_count=1559826.6666666667, ans=0.125 2023-10-04 06:36:38,362 WARNING [train.py:1204] (2/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_78-182912-0 from training. Duration: 0.59 2023-10-04 06:36:38,414 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_3019_210_2028_1_1526044245_3264865_297-12384-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 06:36:38,422 WARNING [train.py:1204] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_147-111137-0_sp1.1 from training. Duration: 0.863625 2023-10-04 06:36:38,426 WARNING [train.py:1204] (2/4) Exclude cut with ID _28323_210_16187_1_1532480395667_4100300_117-8859-0_sp1.1 from training. Duration: 0.97275 2023-10-04 06:36:39,546 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7762_210_1794_1_1528199508_4057408_419-125009-0_sp1.1 from training. Duration: 0.7545625 2023-10-04 06:36:39,643 WARNING [train.py:1204] (2/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_268-87909-0 from training. Duration: 0.99 2023-10-04 06:36:40,952 WARNING [train.py:1204] (2/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_118-311357-0_sp1.1 from training. Duration: 0.92725 2023-10-04 06:36:40,973 WARNING [train.py:1204] (2/4) Exclude cut with ID ID0377W0003-870225 from training. Duration: 0.852 2023-10-04 06:36:42,579 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.conv_module2.balancer2.prob, batch_count=1559893.3333333333, ans=0.125 2023-10-04 06:36:43,593 INFO [train.py:1046] (2/4) Epoch 45, batch 250, loss[loss=0.1536, simple_loss=0.2414, pruned_loss=0.03285, over 24622.00 frames. ], tot_loss[loss=0.1556, simple_loss=0.2371, pruned_loss=0.03711, over 3384995.44 frames. ], batch size: 73, lr: 2.27e-03, grad_scale: 16.0 2023-10-04 06:36:43,694 WARNING [train.py:1204] (2/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_56-5838-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 06:36:45,024 WARNING [train.py:1204] (2/4) Exclude cut with ID _14603_210_4220_1_1530615688441_3097319_90-31383-0_sp0.9 from training. Duration: 0.9666875 2023-10-04 06:36:45,294 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=1559893.3333333333, ans=0.1 2023-10-04 06:36:46,863 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_31583_210_3852_1_1532595851_4024413_58-86657-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 06:36:48,194 WARNING [train.py:1204] (2/4) Exclude cut with ID _30304_210_6319_1_1532476827261_7582060_344-113875-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 06:36:50,162 WARNING [train.py:1204] (2/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_278-104676-0_sp1.1 from training. Duration: 0.8909375 2023-10-04 06:36:50,199 WARNING [train.py:1204] (2/4) Exclude cut with ID _55844_210_2346_1_1534471212569_7625707_764-2387-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 06:36:52,126 WARNING [train.py:1204] (2/4) Exclude cut with ID _27430_210_7770_1_1532604689375_1378590_157-14963-0_sp1.1 from training. Duration: 0.963625 2023-10-04 06:36:56,154 WARNING [train.py:1204] (2/4) Exclude cut with ID _7160_210_1876_1_1527940923231_3703799_282-322611-0_sp1.1 from training. Duration: 0.7909375 2023-10-04 06:36:56,476 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.conv_module2.balancer2.prob, batch_count=1559893.3333333333, ans=0.125 2023-10-04 06:36:58,090 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.feed_forward1.hidden_balancer.prob, batch_count=1559960.0, ans=0.125 2023-10-04 06:37:04,678 WARNING [train.py:1204] (2/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_190-138640-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 06:37:06,272 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_33348_210_5632_1_1533086912_3324030_63-270922-0_sp1.1 from training. Duration: 0.97275 2023-10-04 06:37:07,910 WARNING [train.py:1204] (2/4) Exclude cut with ID _54871_210_22356_1_1534665798253_3856359_135-184281-0_sp1.1 from training. Duration: 0.9 2023-10-04 06:37:14,708 WARNING [train.py:1204] (2/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_277-210798-0_sp1.1 from training. Duration: 0.5 2023-10-04 06:37:14,775 WARNING [train.py:1204] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_402-253027-0_sp0.9 from training. Duration: 0.6 2023-10-04 06:37:16,118 WARNING [train.py:1204] (2/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_540-275685-0_sp0.9 from training. Duration: 0.988875 2023-10-04 06:37:16,160 WARNING [train.py:1204] (2/4) Exclude cut with ID _59493_210_17282_1_1535078419077_3225879_118-18960-0_sp1.1 from training. Duration: 0.936375 2023-10-04 06:37:18,088 WARNING [train.py:1204] (2/4) Exclude cut with ID _6194_210_1534_1_1527511826160_3941194_321-148641-0_sp1.1 from training. Duration: 0.5818125 2023-10-04 06:37:18,094 WARNING [train.py:1204] (2/4) Exclude cut with ID _61133_210_9969_1_1535196777227_3696409_333-270325-0_sp1.1 from training. Duration: 0.8454375 2023-10-04 06:37:19,524 WARNING [train.py:1204] (2/4) Exclude cut with ID _49376_210_5986_1_1533985278732_3370740_307-145221-0_sp1.1 from training. Duration: 0.936375 2023-10-04 06:37:22,318 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6846_210_6753_1_1527850767_4295158_2-315073-0_sp0.9 from training. Duration: 0.97775 2023-10-04 06:37:24,422 WARNING [train.py:1204] (2/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_17-25045-0 from training. Duration: 0.59 2023-10-04 06:37:25,685 WARNING [train.py:1204] (2/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_503-333510-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 06:37:25,810 WARNING [train.py:1204] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_238-245655-0_sp1.1 from training. Duration: 0.736375 2023-10-04 06:37:27,140 WARNING [train.py:1204] (2/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_329-82235-0_sp0.9 from training. Duration: 0.911125 2023-10-04 06:37:27,145 WARNING [train.py:1204] (2/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_325-113798-0_sp0.9 from training. Duration: 0.9444375 2023-10-04 06:37:27,225 WARNING [train.py:1204] (2/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_395-37222-0_sp0.9 from training. Duration: 0.8444375 2023-10-04 06:37:27,425 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.bypass.skip_rate, batch_count=1560093.3333333333, ans=0.09899494936611666 2023-10-04 06:37:28,604 WARNING [train.py:1204] (2/4) Exclude cut with ID _64135_210_2253_1_1535677267876_7608972_498-5039-0_sp1.1 from training. Duration: 0.7090625 2023-10-04 06:37:28,612 WARNING [train.py:1204] (2/4) Exclude cut with ID _14922_210_6228_1_1530707435273_3693310_303-228738-0_sp0.9 from training. Duration: 0.9333125 2023-10-04 06:37:29,999 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_577-220899-0_sp1.1 from training. Duration: 0.92725 2023-10-04 06:37:31,351 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_3475_210_2028_1_1526193578_3622569_377-166133-0_sp1.1 from training. Duration: 0.7818125 2023-10-04 06:37:31,385 WARNING [train.py:1204] (2/4) Exclude cut with ID _49376_210_5986_1_1533985278732_3370740_272-313512-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 06:37:35,369 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7762_210_1794_1_1528199508_4057408_422-227034-0_sp0.9 from training. Duration: 0.811125 2023-10-04 06:37:39,969 WARNING [train.py:1204] (2/4) Exclude cut with ID _18895_210_6228_1_1531357274234_3784930_74-341428-0_sp1.1 from training. Duration: 0.92725 2023-10-04 06:37:41,558 WARNING [train.py:1204] (2/4) Exclude cut with ID _44902_210_16187_1_1533888006062_3792078_149-67212-0_sp1.1 from training. Duration: 0.7909375 2023-10-04 06:37:42,788 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.681e+02 2.020e+02 2.227e+02 2.604e+02 5.544e+02, threshold=4.454e+02, percent-clipped=1.0 2023-10-04 06:37:46,932 WARNING [train.py:1204] (2/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_243-126020-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 06:37:47,141 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.balancer2.prob, batch_count=1560160.0, ans=0.125 2023-10-04 06:37:50,154 WARNING [train.py:1204] (2/4) Exclude cut with ID _54218_210_12210_1_1534413646388_4409259_259-6561-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 06:37:53,525 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_41452_210_7291_1_1533437687_3941281_405-11559-0 from training. Duration: 0.91 2023-10-04 06:37:55,353 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_20120_210_6753_1_1531643035_5482859_759-225084-0_sp1.1 from training. Duration: 0.97275 2023-10-04 06:37:55,375 WARNING [train.py:1204] (2/4) Exclude cut with ID _14747_210_8341_1_1530685797463_3654089_147-131229-0_sp0.9 from training. Duration: 0.8444375 2023-10-04 06:37:58,149 INFO [train.py:1046] (2/4) Epoch 45, batch 300, loss[loss=0.1419, simple_loss=0.2163, pruned_loss=0.03374, over 23851.00 frames. ], tot_loss[loss=0.1546, simple_loss=0.2351, pruned_loss=0.03701, over 3683065.81 frames. ], batch size: 212, lr: 2.27e-03, grad_scale: 16.0 2023-10-04 06:37:58,198 WARNING [train.py:1204] (2/4) Exclude cut with ID _14867_210_1968_1_1530684179890_3846969_319-250868-0 from training. Duration: 0.99 2023-10-04 06:37:58,229 WARNING [train.py:1204] (2/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_580-93798-0_sp0.9 from training. Duration: 0.62225 2023-10-04 06:37:58,329 WARNING [train.py:1204] (2/4) Exclude cut with ID _9679_210_7497_1_1529129212906_4363929_512-313889-0_sp0.9 from training. Duration: 0.9 2023-10-04 06:37:58,338 WARNING [train.py:1204] (2/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_18-189198-0 from training. Duration: 0.9 2023-10-04 06:38:03,635 WARNING [train.py:1204] (2/4) Exclude cut with ID _49755_210_18053_1_1534581005846_3218709_137-64179-0_sp1.1 from training. Duration: 0.92725 2023-10-04 06:38:03,697 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_48049_210_5632_1_1533864553_3519937_306-283794-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 06:38:07,848 WARNING [train.py:1204] (2/4) Exclude cut with ID _43552_210_8913_1_1533726031059_3523429_25-120161-0_sp1.1 from training. Duration: 0.8909375 2023-10-04 06:38:09,314 WARNING [train.py:1204] (2/4) Exclude cut with ID _8833_210_5219_1_1528592665210_7553059_319-349181-0 from training. Duration: 0.99 2023-10-04 06:38:10,649 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_296-332325-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 06:38:11,950 WARNING [train.py:1204] (2/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_436-186311-0_sp1.1 from training. Duration: 0.5909375 2023-10-04 06:38:11,969 WARNING [train.py:1204] (2/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_348-127789-0 from training. Duration: 0.85 2023-10-04 06:38:11,974 WARNING [train.py:1204] (2/4) Exclude cut with ID _3992_210_1747_1_1526644816031_3731060_443-195183-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 06:38:14,160 WARNING [train.py:1204] (2/4) Exclude cut with ID _3465_210_3525_1_1526175667060_3574049_308-231735-0_sp1.1 from training. Duration: 0.663625 2023-10-04 06:38:19,983 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6786_210_1388_1_1527839699_4194495_378-46070-0_sp1.1 from training. Duration: 0.6545625 2023-10-04 06:38:20,772 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.3.encoder.layers.1.feed_forward1.out_whiten, num_groups=1, num_channels=512, metric=7.83 vs. limit=15.0 2023-10-04 06:38:21,444 WARNING [train.py:1204] (2/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_63-101996-0 from training. Duration: 0.88 2023-10-04 06:38:24,951 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_2541_210_1794_1_1525590269_3084765_156-173713-0 from training. Duration: 0.81 2023-10-04 06:38:24,982 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_217-239796-0_sp1.1 from training. Duration: 0.963625 2023-10-04 06:38:28,068 WARNING [train.py:1204] (2/4) Exclude cut with ID _21878_210_3525_1_1531738772002_7264330_530-329400-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 06:38:30,977 WARNING [train.py:1204] (2/4) Exclude cut with ID _21783_210_11357_1_1531742458730_3762679_150-62920-0_sp1.1 from training. Duration: 0.963625 2023-10-04 06:38:30,977 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_5522_210_3273_1_1527249606_3909096_366-172508-0 from training. Duration: 0.66 2023-10-04 06:38:30,980 WARNING [train.py:1204] (2/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_303-340446-0_sp1.1 from training. Duration: 0.8181875 2023-10-04 06:38:32,360 WARNING [train.py:1204] (2/4) Exclude cut with ID _8699_210_2069_1_1528630346831_3960409_160-24408-0_sp1.1 from training. Duration: 0.82725 2023-10-04 06:38:35,077 WARNING [train.py:1204] (2/4) Exclude cut with ID _41314_210_12651_1_1533459476543_3766460_299-187801-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 06:38:35,116 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_34886_210_10633_1_1533203882_1755661_92-345288-0_sp1.1 from training. Duration: 0.97275 2023-10-04 06:38:35,273 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.self_attn_weights.pos_emb_skip_rate, batch_count=1560360.0, ans=0.0 2023-10-04 06:38:40,577 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_5438_210_1794_1_1527336687_2600009_104-288274-0_sp1.1 from training. Duration: 0.52725 2023-10-04 06:38:40,589 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_154-316450-0 from training. Duration: 0.76 2023-10-04 06:38:40,660 WARNING [train.py:1204] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_23-189234-0_sp0.9 from training. Duration: 0.92225 2023-10-04 06:38:43,459 WARNING [train.py:1204] (2/4) Exclude cut with ID _4779_210_2084_1_1527318022944_3914893_346-168820-0_sp1.1 from training. Duration: 0.963625 2023-10-04 06:38:44,776 WARNING [train.py:1204] (2/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_5-55893-0 from training. Duration: 0.55 2023-10-04 06:38:46,784 WARNING [train.py:1204] (2/4) Exclude cut with ID _20455_210_5219_1_1531565996775_8098100_180-24517-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 06:38:49,781 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_3133_210_2087_1_1526106446_2324690_243-143119-0_sp0.9 from training. Duration: 0.8555625 2023-10-04 06:38:52,625 WARNING [train.py:1204] (2/4) Exclude cut with ID _65585_210_12210_1_1535450451584_3764098_293-212045-0_sp1.1 from training. Duration: 0.92725 2023-10-04 06:38:52,629 WARNING [train.py:1204] (2/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_577-332600-0 from training. Duration: 0.57 2023-10-04 06:38:55,369 WARNING [train.py:1204] (2/4) Exclude cut with ID _30039_210_13270_1_1532484612900_2725150_104-289874-0_sp1.1 from training. Duration: 0.963625 2023-10-04 06:38:56,755 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_3172_210_2087_1_1526292929_3987865_197-97762-0_sp1.1 from training. Duration: 0.6818125 2023-10-04 06:38:58,800 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_256-150908-0_sp1.1 from training. Duration: 0.963625 2023-10-04 06:39:00,767 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_296-287678-0_sp1.1 from training. Duration: 0.863625 2023-10-04 06:39:02,036 WARNING [train.py:1204] (2/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_443-30034-0 from training. Duration: 0.64 2023-10-04 06:39:02,076 WARNING [train.py:1204] (2/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_494-199538-0_sp0.9 from training. Duration: 0.8333125 2023-10-04 06:39:02,131 WARNING [train.py:1204] (2/4) Exclude cut with ID _19708_210_9206_1_1532136650733_4238364_61-162325-0_sp1.1 from training. Duration: 0.97275 2023-10-04 06:39:03,590 WARNING [train.py:1204] (2/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_475-127455-0 from training. Duration: 0.7 2023-10-04 06:39:04,973 WARNING [train.py:1204] (2/4) Exclude cut with ID _48677_210_5663_1_1533902405919_3892989_188-11581-0_sp1.1 from training. Duration: 0.963625 2023-10-04 06:39:06,339 WARNING [train.py:1204] (2/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_136-8381-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 06:39:06,442 WARNING [train.py:1204] (2/4) Exclude cut with ID _55328_210_27767_1_1534580973260_4088170_270-229555-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 06:39:07,763 WARNING [train.py:1204] (2/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_372-266951-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 06:39:09,077 WARNING [train.py:1204] (2/4) Exclude cut with ID _47790_210_16187_1_1533862814066_4207519_14-242149-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 06:39:11,732 INFO [train.py:1046] (2/4) Epoch 45, batch 350, loss[loss=0.1381, simple_loss=0.2193, pruned_loss=0.02847, over 24348.00 frames. ], tot_loss[loss=0.1531, simple_loss=0.2331, pruned_loss=0.03649, over 3904111.12 frames. ], batch size: 61, lr: 2.27e-03, grad_scale: 16.0 2023-10-04 06:39:11,918 WARNING [train.py:1204] (2/4) Exclude cut with ID _7877_210_6014_1_1528286358099_3750136_159-76512-0_sp1.1 from training. Duration: 0.936375 2023-10-04 06:39:11,921 WARNING [train.py:1204] (2/4) Exclude cut with ID _8263_210_1664_1_1528438953494_294100_40-204731-0_sp1.1 from training. Duration: 0.4545625 2023-10-04 06:39:16,018 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8278_210_2550_1_1528637099_4220413_694-238413-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 06:39:20,774 WARNING [train.py:1204] (2/4) Exclude cut with ID _47278_210_12041_1_1533902418349_3576110_421-172710-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 06:39:21,471 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.2.encoder.layers.1.self_attn_weights.whiten_keys, num_groups=4, num_channels=128, metric=2.57 vs. limit=6.0 2023-10-04 06:39:23,525 WARNING [train.py:1204] (2/4) Exclude cut with ID _14970_210_15709_1_1530773543493_1715730_215-282680-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 06:39:24,801 WARNING [train.py:1204] (2/4) Exclude cut with ID _64639_210_5816_1_1535367619437_3528000_306-149449-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 06:39:27,908 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_28-291982-0 from training. Duration: 0.97 2023-10-04 06:39:28,212 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.conv_skip_rate, batch_count=1560626.6666666667, ans=0.0 2023-10-04 06:39:29,341 WARNING [train.py:1204] (2/4) Exclude cut with ID _55134_210_21982_1_1534590028377_3882170_412-15870-0_sp1.1 from training. Duration: 0.936375 2023-10-04 06:39:29,384 WARNING [train.py:1204] (2/4) Exclude cut with ID _13684_210_7497_1_1530619354863_3598359_312-235606-0 from training. Duration: 0.6 2023-10-04 06:39:32,697 WARNING [train.py:1204] (2/4) Exclude cut with ID _37112_210_2575_1_1532997007673_7268610_347-238993-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 06:39:32,743 WARNING [train.py:1204] (2/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_121-284152-0 from training. Duration: 0.59 2023-10-04 06:39:32,799 WARNING [train.py:1204] (2/4) Exclude cut with ID _2941_210_2577_1_1526199673150_2489759_228-2961-0_sp1.1 from training. Duration: 0.97275 2023-10-04 06:39:36,876 WARNING [train.py:1204] (2/4) Exclude cut with ID _28323_210_16187_1_1532480395667_4100300_531-24157-0 from training. Duration: 0.98 2023-10-04 06:39:38,252 WARNING [train.py:1204] (2/4) Exclude cut with ID _32704_210_8954_1_1533017488840_6747370_266-329755-0_sp0.9 from training. Duration: 0.988875 2023-10-04 06:39:38,390 WARNING [train.py:1204] (2/4) Exclude cut with ID _6653_210_2629_1_1527919172441_6732679_1038-146421-0_sp1.1 from training. Duration: 0.97275 2023-10-04 06:39:39,780 WARNING [train.py:1204] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_391-22148-0_sp0.9 from training. Duration: 0.8555625 2023-10-04 06:39:42,534 WARNING [train.py:1204] (2/4) Exclude cut with ID _64521_210_14699_1_1535243427259_3815328_543-341117-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 06:39:42,546 WARNING [train.py:1204] (2/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_52-168712-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 06:39:42,579 WARNING [train.py:1204] (2/4) Exclude cut with ID _33920_210_4220_1_1533257851500_3084399_198-334063-0_sp1.1 from training. Duration: 0.8545625 2023-10-04 06:39:42,591 WARNING [train.py:1204] (2/4) Exclude cut with ID _50093_210_4220_1_1534417141207_3432299_69-111987-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 06:39:43,900 WARNING [train.py:1204] (2/4) Exclude cut with ID _14821_210_8750_1_1530680503991_6829359_189-297451-0_sp0.9 from training. Duration: 0.611125 2023-10-04 06:39:44,036 WARNING [train.py:1204] (2/4) Exclude cut with ID _8030_210_7497_1_1528444886781_3531089_262-86750-0_sp0.9 from training. Duration: 0.9 2023-10-04 06:39:44,042 WARNING [train.py:1204] (2/4) Exclude cut with ID _24612_210_8913_1_1531998054193_3806359_294-191196-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 06:39:51,940 WARNING [train.py:1204] (2/4) Exclude cut with ID _16909_210_5724_1_1531292079912_4118919_707-342896-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 06:39:51,955 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_9460_210_4211_1_1528954587_10049667_572-37035-0_sp0.9 from training. Duration: 0.888875 2023-10-04 06:39:52,103 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.attention_skip_rate, batch_count=1560693.3333333333, ans=0.0 2023-10-04 06:39:53,381 WARNING [train.py:1204] (2/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_762-109886-0_sp1.1 from training. Duration: 0.82725 2023-10-04 06:39:54,671 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_14953_210_2151_1_1530701710_5616165_519-326393-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 06:39:58,316 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.1.feed_forward2.hidden_balancer.prob, batch_count=1560760.0, ans=0.125 2023-10-04 06:39:59,610 WARNING [train.py:1204] (2/4) Exclude cut with ID _25914_210_6400_1_1532141317058_644740_52-96808-0 from training. Duration: 0.91 2023-10-04 06:39:59,614 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_68095_210_20185_1_1535718226_8888855_1079-94525-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 06:40:04,300 WARNING [train.py:1204] (2/4) Exclude cut with ID _14860_210_4915_1_1530702020340_7343240_746-3647-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 06:40:04,301 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_70379_210_7925_1_1535941455_557455_49-101720-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 06:40:04,327 WARNING [train.py:1204] (2/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_8-307291-0_sp1.1 from training. Duration: 0.936375 2023-10-04 06:40:05,792 WARNING [train.py:1204] (2/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_117-290916-0 from training. Duration: 0.51 2023-10-04 06:40:07,196 WARNING [train.py:1204] (2/4) Exclude cut with ID _22020_210_2461_1_1531724164513_6326900_944-339840-0_sp1.1 from training. Duration: 0.963625 2023-10-04 06:40:09,870 WARNING [train.py:1204] (2/4) Exclude cut with ID IC0515W0043-254911 from training. Duration: 0.976 2023-10-04 06:40:09,975 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_3910_210_1794_1_1526742306_334530_28-238909-0 from training. Duration: 0.81 2023-10-04 06:40:09,989 WARNING [train.py:1204] (2/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_269-267275-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 06:40:10,147 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.feed_forward1.hidden_balancer.prob, batch_count=1560826.6666666667, ans=0.125 2023-10-04 06:40:11,196 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.654e+02 1.951e+02 2.094e+02 2.424e+02 4.061e+02, threshold=4.188e+02, percent-clipped=0.0 2023-10-04 06:40:12,690 WARNING [train.py:1204] (2/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_72-251255-0_sp1.1 from training. Duration: 0.8545625 2023-10-04 06:40:12,701 WARNING [train.py:1204] (2/4) Exclude cut with ID _14886_210_4220_1_1530702020355_3516190_175-229073-0 from training. Duration: 0.45 2023-10-04 06:40:15,361 WARNING [train.py:1204] (2/4) Exclude cut with ID _40519_210_13794_1_1533715228655_7337390_921-209452-0_sp1.1 from training. Duration: 0.963625 2023-10-04 06:40:17,160 WARNING [train.py:1204] (2/4) Exclude cut with ID _29857_210_20034_1_1532395886753_3798160_335-163562-0_sp0.9 from training. Duration: 0.9444375 2023-10-04 06:40:19,271 WARNING [train.py:1204] (2/4) Exclude cut with ID _58583_210_5236_1_1534726835266_3574249_384-58445-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 06:40:20,698 WARNING [train.py:1204] (2/4) Exclude cut with ID _44011_210_2046_1_1533722401691_3631310_96-315656-0_sp1.1 from training. Duration: 0.963625 2023-10-04 06:40:20,705 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_68484_210_1794_1_1535849804_7678097_549-170613-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 06:40:22,130 WARNING [train.py:1204] (2/4) Exclude cut with ID _37892_210_19596_1_1533198677897_3964058_214-148058-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 06:40:26,050 INFO [train.py:1046] (2/4) Epoch 45, batch 400, loss[loss=0.146, simple_loss=0.2343, pruned_loss=0.02884, over 24354.00 frames. ], tot_loss[loss=0.153, simple_loss=0.2331, pruned_loss=0.03649, over 4094374.31 frames. ], batch size: 61, lr: 2.27e-03, grad_scale: 32.0 2023-10-04 06:40:26,150 WARNING [train.py:1204] (2/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_415-78792-0_sp0.9 from training. Duration: 0.9 2023-10-04 06:40:28,833 WARNING [train.py:1204] (2/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_383-205833-0_sp1.1 from training. Duration: 0.863625 2023-10-04 06:40:30,874 WARNING [train.py:1204] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_167-137255-0 from training. Duration: 0.64 2023-10-04 06:40:30,893 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8275_210_4198_1_1528455320_3892571_484-154786-0_sp1.1 from training. Duration: 0.963625 2023-10-04 06:40:30,936 WARNING [train.py:1204] (2/4) Exclude cut with ID _40284_210_2084_1_1533513679814_3704279_396-319412-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 06:40:33,679 WARNING [train.py:1204] (2/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_280-120942-0_sp0.9 from training. Duration: 0.8555625 2023-10-04 06:40:33,720 WARNING [train.py:1204] (2/4) Exclude cut with ID _45630_210_10556_1_1533949174498_3902049_558-240889-0_sp1.1 from training. Duration: 0.92725 2023-10-04 06:40:36,478 WARNING [train.py:1204] (2/4) Exclude cut with ID _56235_210_10640_1_1534557335235_4161099_197-307458-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 06:40:38,320 WARNING [train.py:1204] (2/4) Exclude cut with ID _16909_210_5724_1_1531292079912_4118919_163-254843-0_sp1.1 from training. Duration: 0.92725 2023-10-04 06:40:39,678 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_103-84010-0 from training. Duration: 0.69 2023-10-04 06:40:41,106 WARNING [train.py:1204] (2/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_401-158239-0 from training. Duration: 0.71 2023-10-04 06:40:41,107 WARNING [train.py:1204] (2/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_899-233658-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 06:40:42,543 WARNING [train.py:1204] (2/4) Exclude cut with ID _23825_210_13449_1_1531915313526_3497150_240-271367-0 from training. Duration: 0.97 2023-10-04 06:40:42,589 WARNING [train.py:1204] (2/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_424-134619-0_sp1.1 from training. Duration: 0.92725 2023-10-04 06:40:45,511 WARNING [train.py:1204] (2/4) Exclude cut with ID _44831_210_13427_1_1533816011549_3689035_32-188147-0_sp1.1 from training. Duration: 0.87275 2023-10-04 06:40:45,514 WARNING [train.py:1204] (2/4) Exclude cut with ID _59170_210_6721_1_1534983633960_4203335_314-63879-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 06:40:45,529 WARNING [train.py:1204] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_602-275260-0 from training. Duration: 0.82 2023-10-04 06:40:45,561 WARNING [train.py:1204] (2/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_511-306767-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 06:40:47,356 WARNING [train.py:1204] (2/4) Exclude cut with ID _41817_210_18835_1_1533434477160_7423049_565-201775-0_sp1.1 from training. Duration: 0.92725 2023-10-04 06:40:47,367 WARNING [train.py:1204] (2/4) Exclude cut with ID _28317_210_12742_1_1532480302381_8510325_778-173151-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 06:40:47,417 WARNING [train.py:1204] (2/4) Exclude cut with ID _14998_210_5219_1_1530705582080_7474970_680-51001-0_sp1.1 from training. Duration: 0.963625 2023-10-04 06:40:50,627 WARNING [train.py:1204] (2/4) Exclude cut with ID IC0677W0059-334891 from training. Duration: 0.896 2023-10-04 06:40:50,678 WARNING [train.py:1204] (2/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_370-331600-0 from training. Duration: 0.94 2023-10-04 06:40:56,116 WARNING [train.py:1204] (2/4) Exclude cut with ID _66840_210_5219_1_1535452168426_4196349_446-240192-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 06:40:57,491 WARNING [train.py:1204] (2/4) Exclude cut with ID _65242_210_18628_1_1535369393717_3823149_38-6732-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 06:40:57,570 WARNING [train.py:1204] (2/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_302-2550-0 from training. Duration: 0.81 2023-10-04 06:40:59,397 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_43802_210_2290_1_1533516203_8829504_100-348441-0 from training. Duration: 0.91 2023-10-04 06:41:03,598 WARNING [train.py:1204] (2/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_329-285801-0_sp1.1 from training. Duration: 0.82725 2023-10-04 06:41:05,100 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_30167_210_3528_1_1532428012_4772375_343-334712-0_sp1.1 from training. Duration: 0.936375 2023-10-04 06:41:11,103 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7543_210_6753_1_1528543318_4552_1-85072-0 from training. Duration: 0.83 2023-10-04 06:41:13,874 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7002_210_1794_1_1527935634_4034444_487-236501-0_sp1.1 from training. Duration: 0.6 2023-10-04 06:41:13,963 WARNING [train.py:1204] (2/4) Exclude cut with ID _7160_210_1876_1_1527940923231_3703799_282-322611-0 from training. Duration: 0.87 2023-10-04 06:41:16,741 WARNING [train.py:1204] (2/4) Exclude cut with ID _67335_210_5219_1_1535525975652_4275229_570-262983-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 06:41:18,136 WARNING [train.py:1204] (2/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_625-338546-0_sp1.1 from training. Duration: 0.8 2023-10-04 06:41:19,986 WARNING [train.py:1204] (2/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_176-56120-0 from training. Duration: 0.83 2023-10-04 06:41:21,564 WARNING [train.py:1204] (2/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_963-187109-0_sp1.1 from training. Duration: 0.9 2023-10-04 06:41:24,495 WARNING [train.py:1204] (2/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_121-284152-0_sp0.9 from training. Duration: 0.6555625 2023-10-04 06:41:24,599 WARNING [train.py:1204] (2/4) Exclude cut with ID _45021_210_9994_1_1533619751094_3330288_104-81711-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 06:41:27,335 WARNING [train.py:1204] (2/4) Exclude cut with ID _51382_210_23862_1_1534492801838_3650104_129-343914-0_sp1.1 from training. Duration: 0.936375 2023-10-04 06:41:27,367 WARNING [train.py:1204] (2/4) Exclude cut with ID _14970_210_15709_1_1530775547361_1966965_8-169924-0 from training. Duration: 0.92 2023-10-04 06:41:30,586 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_2295_210_2087_1_1525500993_2407863_234-83104-0_sp1.1 from training. Duration: 0.563625 2023-10-04 06:41:30,638 WARNING [train.py:1204] (2/4) Exclude cut with ID _14687_210_5854_1_1530703838259_3664889_115-239046-0 from training. Duration: 0.67 2023-10-04 06:41:33,349 WARNING [train.py:1204] (2/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_690-126306-0_sp1.1 from training. Duration: 0.7090625 2023-10-04 06:41:33,356 WARNING [train.py:1204] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_625-102955-0_sp0.9 from training. Duration: 0.8555625 2023-10-04 06:41:34,728 WARNING [train.py:1204] (2/4) Exclude cut with ID _9389_210_1664_1_1529217337301_5289269_289-213417-0 from training. Duration: 0.89 2023-10-04 06:41:37,988 WARNING [train.py:1204] (2/4) Exclude cut with ID _13253_210_1925_1_1530757775819_3577873_191-144977-0_sp0.9 from training. Duration: 0.8666875 2023-10-04 06:41:38,043 WARNING [train.py:1204] (2/4) Exclude cut with ID _21783_210_11357_1_1531742458730_3762679_325-155703-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 06:41:39,413 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_2295_210_2087_1_1525500993_2407863_116-238894-0_sp0.9 from training. Duration: 0.77775 2023-10-04 06:41:40,722 INFO [train.py:1046] (2/4) Epoch 45, batch 450, loss[loss=0.1471, simple_loss=0.2297, pruned_loss=0.03227, over 24335.00 frames. ], tot_loss[loss=0.1538, simple_loss=0.2339, pruned_loss=0.03688, over 4227337.66 frames. ], batch size: 61, lr: 2.27e-03, grad_scale: 32.0 2023-10-04 06:41:40,802 WARNING [train.py:1204] (2/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_94-126128-0 from training. Duration: 0.86 2023-10-04 06:41:40,818 WARNING [train.py:1204] (2/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_395-310220-0_sp0.9 from training. Duration: 0.988875 2023-10-04 06:41:40,908 WARNING [train.py:1204] (2/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_821-110852-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 06:41:41,044 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.ff3_skip_rate, batch_count=1561226.6666666667, ans=0.0 2023-10-04 06:41:42,208 WARNING [train.py:1204] (2/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_64-248359-0_sp1.1 from training. Duration: 0.62725 2023-10-04 06:41:42,226 WARNING [train.py:1204] (2/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_138-187031-0 from training. Duration: 0.99 2023-10-04 06:41:42,261 WARNING [train.py:1204] (2/4) Exclude cut with ID _4122_210_2084_1_1526713164417_4353280_528-206771-0_sp0.9 from training. Duration: 0.97775 2023-10-04 06:41:43,653 WARNING [train.py:1204] (2/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_844-96989-0_sp1.1 from training. Duration: 0.6454375 2023-10-04 06:41:47,070 WARNING [train.py:1204] (2/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_106-30782-0_sp1.1 from training. Duration: 0.72725 2023-10-04 06:41:55,846 WARNING [train.py:1204] (2/4) Exclude cut with ID _14683_210_5219_1_1530698352608_3974859_281-250040-0_sp1.1 from training. Duration: 0.936375 2023-10-04 06:41:55,880 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_46013_210_7234_1_1533776362_3744032_463-52378-0_sp0.9 from training. Duration: 0.9555625 2023-10-04 06:41:57,296 WARNING [train.py:1204] (2/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_299-34413-0 from training. Duration: 0.65 2023-10-04 06:41:59,228 WARNING [train.py:1204] (2/4) Exclude cut with ID _35799_210_5854_1_1533263487887_3529071_490-326847-0 from training. Duration: 0.99 2023-10-04 06:42:03,289 WARNING [train.py:1204] (2/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_589-9070-0_sp0.9 from training. Duration: 0.788875 2023-10-04 06:42:06,104 WARNING [train.py:1204] (2/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_884-29515-0_sp1.1 from training. Duration: 0.936375 2023-10-04 06:42:06,215 WARNING [train.py:1204] (2/4) Exclude cut with ID _15012_210_1968_1_1530856820429_5095350_474-119995-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 06:42:07,843 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.ff2_skip_rate, batch_count=1561293.3333333333, ans=0.0 2023-10-04 06:42:09,697 WARNING [train.py:1204] (2/4) Exclude cut with ID _65585_210_12210_1_1535450451584_3764098_211-97124-0_sp1.1 from training. Duration: 0.963625 2023-10-04 06:42:09,863 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.feed_forward2.hidden_balancer.prob, batch_count=1561360.0, ans=0.125 2023-10-04 06:42:10,911 WARNING [train.py:1204] (2/4) Exclude cut with ID _33640_210_2655_1_1533117605479_4855469_666-224463-0_sp1.1 from training. Duration: 0.963625 2023-10-04 06:42:13,635 WARNING [train.py:1204] (2/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_35-153886-0 from training. Duration: 0.84 2023-10-04 06:42:13,689 WARNING [train.py:1204] (2/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_210-106047-0 from training. Duration: 0.97 2023-10-04 06:42:16,318 WARNING [train.py:1204] (2/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_479-125611-0 from training. Duration: 0.96 2023-10-04 06:42:16,355 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_9417_210_1794_1_1529235261_4614131_427-153963-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 06:42:18,331 WARNING [train.py:1204] (2/4) Exclude cut with ID _23818_210_6228_1_1531917063884_3676920_133-170571-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 06:42:19,423 WARNING [train.py:1204] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_602-275260-0_sp1.1 from training. Duration: 0.7454375 2023-10-04 06:42:22,287 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9029W0121-509521 from training. Duration: 0.896 2023-10-04 06:42:22,296 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9029W0434-509821 from training. Duration: 0.64 2023-10-04 06:42:22,313 WARNING [train.py:1204] (2/4) Exclude cut with ID _23038_210_9570_1_1532001584158_3652419_289-307034-0_sp1.1 from training. Duration: 0.936375 2023-10-04 06:42:24,248 WARNING [train.py:1204] (2/4) Exclude cut with ID _29345_210_4381_1_1532689168263_3860169_149-257268-0_sp0.9 from training. Duration: 0.9 2023-10-04 06:42:25,609 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_405-339986-0_sp1.1 from training. Duration: 0.463625 2023-10-04 06:42:28,547 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_2655_210_2087_1_1525688959_3897400_437-337690-0_sp1.1 from training. Duration: 0.57275 2023-10-04 06:42:28,586 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7543_210_6753_1_1528541907_1399322_165-323619-0_sp1.1 from training. Duration: 0.62725 2023-10-04 06:42:30,310 WARNING [train.py:1204] (2/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_508-335369-0_sp1.1 from training. Duration: 0.436375 2023-10-04 06:42:30,380 WARNING [train.py:1204] (2/4) Exclude cut with ID _7852_210_5219_1_1528286471316_8139400_358-287492-0 from training. Duration: 0.96 2023-10-04 06:42:33,103 WARNING [train.py:1204] (2/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_322-217220-0_sp0.9 from training. Duration: 0.9555625 2023-10-04 06:42:34,524 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_3171_210_2087_1_1526109084_2330007_66-260583-0_sp0.9 from training. Duration: 0.8 2023-10-04 06:42:35,870 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8558_210_1410_1_1528505225_7613406_131-329524-0_sp1.1 from training. Duration: 0.7181875 2023-10-04 06:42:37,230 WARNING [train.py:1204] (2/4) Exclude cut with ID _9235_210_5219_1_1528891154253_8046120_30-326681-0 from training. Duration: 0.83 2023-10-04 06:42:39,788 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.591e+02 1.888e+02 2.086e+02 2.350e+02 3.841e+02, threshold=4.172e+02, percent-clipped=0.0 2023-10-04 06:42:41,239 WARNING [train.py:1204] (2/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_58-68956-0_sp0.9 from training. Duration: 0.92225 2023-10-04 06:42:42,573 WARNING [train.py:1204] (2/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_255-312654-0 from training. Duration: 0.91 2023-10-04 06:42:42,652 WARNING [train.py:1204] (2/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_99-167392-0 from training. Duration: 0.61 2023-10-04 06:42:44,563 WARNING [train.py:1204] (2/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_94-126128-0_sp0.9 from training. Duration: 0.9555625 2023-10-04 06:42:47,465 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.1.attention_skip_rate, batch_count=1561493.3333333333, ans=0.0 2023-10-04 06:42:48,688 WARNING [train.py:1204] (2/4) Exclude cut with ID _68635_210_9317_1_1535770813410_4315870_187-192817-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 06:42:51,847 WARNING [train.py:1204] (2/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_277-204970-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 06:42:53,290 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6163_210_6400_1_1527506746_4263068_283-137811-0_sp0.9 from training. Duration: 0.8555625 2023-10-04 06:42:53,311 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9144W0121-566446 from training. Duration: 0.896 2023-10-04 06:42:54,428 INFO [train.py:1046] (2/4) Epoch 45, batch 500, loss[loss=0.1632, simple_loss=0.2517, pruned_loss=0.03731, over 24040.00 frames. ], tot_loss[loss=0.1544, simple_loss=0.2349, pruned_loss=0.03691, over 4340895.23 frames. ], batch size: 80, lr: 2.27e-03, grad_scale: 32.0 2023-10-04 06:42:56,591 WARNING [train.py:1204] (2/4) Exclude cut with ID _49371_210_12071_1_1533952761021_3812190_351-315519-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 06:42:57,920 WARNING [train.py:1204] (2/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_115-326778-0_sp1.1 from training. Duration: 0.8454375 2023-10-04 06:42:57,981 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_17292_210_6753_1_1531125388_2943734_436-198965-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 06:42:59,225 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9165W0117-576751 from training. Duration: 0.896 2023-10-04 06:43:00,662 WARNING [train.py:1204] (2/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_212-149947-0 from training. Duration: 0.73 2023-10-04 06:43:00,670 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_13757_210_3528_1_1530440682_4107239_65-167544-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 06:43:02,165 WARNING [train.py:1204] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_700-183549-0_sp0.9 from training. Duration: 0.7555625 2023-10-04 06:43:05,506 WARNING [train.py:1204] (2/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_216-343007-0_sp0.9 from training. Duration: 0.7333125 2023-10-04 06:43:06,874 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7544_210_6753_1_1528587722_3576686_295-258479-0_sp1.1 from training. Duration: 0.763625 2023-10-04 06:43:09,552 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_365-287106-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 06:43:09,573 WARNING [train.py:1204] (2/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_300-3747-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 06:43:10,998 WARNING [train.py:1204] (2/4) Exclude cut with ID _14069_210_5632_1_1530512692033_4803390_487-303201-0_sp1.1 from training. Duration: 0.92725 2023-10-04 06:43:20,493 WARNING [train.py:1204] (2/4) Exclude cut with ID _14459_210_2228_1_1531443641946_7516700_668-123026-0_sp1.1 from training. Duration: 0.936375 2023-10-04 06:43:20,517 WARNING [train.py:1204] (2/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_108-53929-0_sp1.1 from training. Duration: 0.536375 2023-10-04 06:43:21,655 WARNING [train.py:1204] (2/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_266-137104-0_sp0.9 from training. Duration: 0.72225 2023-10-04 06:43:22,930 WARNING [train.py:1204] (2/4) Exclude cut with ID _12302_210_5219_1_1529924446466_8107280_902-168619-0_sp1.1 from training. Duration: 0.936375 2023-10-04 06:43:22,956 WARNING [train.py:1204] (2/4) Exclude cut with ID _59502_210_12210_1_1535107024698_1835139_146-162495-0 from training. Duration: 0.92 2023-10-04 06:43:22,980 WARNING [train.py:1204] (2/4) Exclude cut with ID _3465_210_3525_1_1526175667060_3574049_412-287526-0_sp1.1 from training. Duration: 0.6545625 2023-10-04 06:43:26,296 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.4.encoder.layers.1.conv_module1.whiten, num_groups=1, num_channels=384, metric=7.54 vs. limit=15.0 2023-10-04 06:43:27,149 WARNING [train.py:1204] (2/4) Exclude cut with ID _7160_210_1876_1_1527940923231_3703799_120-150943-0_sp0.9 from training. Duration: 0.9 2023-10-04 06:43:28,475 WARNING [train.py:1204] (2/4) Exclude cut with ID _15037_210_2253_1_1530752294104_3267650_296-192597-0_sp1.1 from training. Duration: 0.736375 2023-10-04 06:43:28,496 WARNING [train.py:1204] (2/4) Exclude cut with ID _13252_210_1925_1_1530671372047_3336909_397-216497-0_sp1.1 from training. Duration: 0.87275 2023-10-04 06:43:28,518 WARNING [train.py:1204] (2/4) Exclude cut with ID _40094_210_16380_1_1533294005773_3641159_76-269320-0_sp1.1 from training. Duration: 0.936375 2023-10-04 06:43:30,347 WARNING [train.py:1204] (2/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_449-15508-0 from training. Duration: 0.82 2023-10-04 06:43:33,368 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9309W0194-640321 from training. Duration: 0.64 2023-10-04 06:43:34,953 WARNING [train.py:1204] (2/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_284-312178-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 06:43:35,138 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.0.self_attn_weights.pos_emb_skip_rate, batch_count=1561693.3333333333, ans=0.0 2023-10-04 06:43:38,251 WARNING [train.py:1204] (2/4) Exclude cut with ID _29853_210_7497_1_1532505600851_6642110_401-55937-0_sp1.1 from training. Duration: 0.92725 2023-10-04 06:43:38,324 WARNING [train.py:1204] (2/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_716-24918-0_sp1.1 from training. Duration: 0.92725 2023-10-04 06:43:39,598 WARNING [train.py:1204] (2/4) Exclude cut with ID _19637_210_4220_1_1531537167676_3205060_314-305638-0_sp1.1 from training. Duration: 0.92725 2023-10-04 06:43:39,636 WARNING [train.py:1204] (2/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_475-127455-0_sp1.1 from training. Duration: 0.636375 2023-10-04 06:43:42,348 WARNING [train.py:1204] (2/4) Exclude cut with ID _5444_210_2151_1_1527418808464_5312210_581-315411-0 from training. Duration: 0.62 2023-10-04 06:43:43,953 WARNING [train.py:1204] (2/4) Exclude cut with ID _44902_210_16187_1_1533888006062_3792078_149-67212-0_sp0.9 from training. Duration: 0.9666875 2023-10-04 06:43:45,753 WARNING [train.py:1204] (2/4) Exclude cut with ID _31602_210_5854_1_1532677021456_4458250_63-63253-0_sp1.1 from training. Duration: 0.963625 2023-10-04 06:43:46,025 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.conv_skip_rate, batch_count=1561760.0, ans=0.0 2023-10-04 06:43:47,387 WARNING [train.py:1204] (2/4) Exclude cut with ID _55209_210_1531_1_1534675828996_4597160_660-203992-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 06:43:51,293 WARNING [train.py:1204] (2/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_83-190624-0_sp1.1 from training. Duration: 0.92725 2023-10-04 06:43:57,387 WARNING [train.py:1204] (2/4) Exclude cut with ID _58605_210_12210_1_1534723195939_3904259_154-139635-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 06:43:59,369 WARNING [train.py:1204] (2/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_164-224052-0 from training. Duration: 0.7 2023-10-04 06:43:59,384 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_1126-189071-0_sp1.1 from training. Duration: 0.963625 2023-10-04 06:43:59,402 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_15396_210_6890_1_1531308473_1938936_310-214789-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 06:44:03,609 WARNING [train.py:1204] (2/4) Exclude cut with ID _8395_210_1531_1_1528597871523_4411579_431-132470-0 from training. Duration: 0.74 2023-10-04 06:44:03,667 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_3780_210_2087_1_1526467760_2130483_170-259044-0_sp0.9 from training. Duration: 0.77775 2023-10-04 06:44:05,118 WARNING [train.py:1204] (2/4) Exclude cut with ID _12733_210_8341_1_1530167424556_3646380_537-230640-0_sp1.1 from training. Duration: 0.963625 2023-10-04 06:44:09,782 INFO [train.py:1046] (2/4) Epoch 45, batch 550, loss[loss=0.1702, simple_loss=0.2441, pruned_loss=0.04812, over 23443.00 frames. ], tot_loss[loss=0.1553, simple_loss=0.2359, pruned_loss=0.03737, over 4425145.14 frames. ], batch size: 285, lr: 2.27e-03, grad_scale: 32.0 2023-10-04 06:44:09,862 WARNING [train.py:1204] (2/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_159-281310-0 from training. Duration: 0.95 2023-10-04 06:44:12,413 WARNING [train.py:1204] (2/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_434-83000-0 from training. Duration: 0.98 2023-10-04 06:44:12,432 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_4405_210_1849_1_1526732205_3593939_493-32464-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 06:44:12,441 WARNING [train.py:1204] (2/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_342-100157-0 from training. Duration: 0.54 2023-10-04 06:44:12,498 WARNING [train.py:1204] (2/4) Exclude cut with ID _44778_210_2054_1_1533798127203_7443090_765-250133-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 06:44:12,502 WARNING [train.py:1204] (2/4) Exclude cut with ID _38670_210_15379_1_1533448810659_4391059_81-312245-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 06:44:14,464 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_50050_210_16223_1_1535435796_7290138_1273-247611-0_sp1.1 from training. Duration: 0.936375 2023-10-04 06:44:14,519 WARNING [train.py:1204] (2/4) Exclude cut with ID _44831_210_13427_1_1533816011549_3689035_399-18591-0_sp1.1 from training. Duration: 0.936375 2023-10-04 06:44:14,536 WARNING [train.py:1204] (2/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_557-324258-0_sp0.9 from training. Duration: 0.9 2023-10-04 06:44:14,662 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=1561893.3333333333, ans=0.1 2023-10-04 06:44:15,817 WARNING [train.py:1204] (2/4) Exclude cut with ID _3465_210_3525_1_1526175667060_3574049_62-335552-0_sp1.1 from training. Duration: 0.7909375 2023-10-04 06:44:19,899 WARNING [train.py:1204] (2/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_673-264125-0_sp1.1 from training. Duration: 0.963625 2023-10-04 06:44:19,985 WARNING [train.py:1204] (2/4) Exclude cut with ID _8030_210_7497_1_1528444886781_3531089_262-86750-0 from training. Duration: 0.81 2023-10-04 06:44:20,006 WARNING [train.py:1204] (2/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_444-28041-0_sp1.1 from training. Duration: 0.77275 2023-10-04 06:44:24,881 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.conv_module2.balancer2.prob, batch_count=1561960.0, ans=0.125 2023-10-04 06:44:26,106 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_34008_210_10431_1_1533345424_8183247_516-14858-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 06:44:26,127 WARNING [train.py:1204] (2/4) Exclude cut with ID _32704_210_8954_1_1533017488840_6747370_54-248218-0_sp1.1 from training. Duration: 0.936375 2023-10-04 06:44:28,856 WARNING [train.py:1204] (2/4) Exclude cut with ID _13253_210_1925_1_1530757775819_3577873_300-119782-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 06:44:30,156 WARNING [train.py:1204] (2/4) Exclude cut with ID _39290_210_4220_1_1533862778760_3075118_21-240703-0_sp1.1 from training. Duration: 0.936375 2023-10-04 06:44:34,970 WARNING [train.py:1204] (2/4) Exclude cut with ID 774-127930-0014-10412-0_sp1.1 from training. Duration: 0.95 2023-10-04 06:44:35,042 WARNING [train.py:1204] (2/4) Exclude cut with ID _67499_210_17282_1_1535509805220_3692430_20-179346-0 from training. Duration: 0.97 2023-10-04 06:44:35,192 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.nonlin_attention.balancer.prob, batch_count=1561960.0, ans=0.125 2023-10-04 06:44:37,781 WARNING [train.py:1204] (2/4) Exclude cut with ID _8365_210_2064_1_1528592311070_4677690_431-294102-0_sp0.9 from training. Duration: 0.988875 2023-10-04 06:44:38,057 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.conv_module2.balancer2.prob, batch_count=1562026.6666666667, ans=0.125 2023-10-04 06:44:42,554 WARNING [train.py:1204] (2/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_365-1492-0_sp1.1 from training. Duration: 0.82725 2023-10-04 06:44:42,593 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_105-114797-0_sp0.9 from training. Duration: 0.9333125 2023-10-04 06:44:44,038 WARNING [train.py:1204] (2/4) Exclude cut with ID _6274_210_2084_1_1527937583837_7494020_769-168302-0_sp0.9 from training. Duration: 0.788875 2023-10-04 06:44:44,189 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.conv_module1.balancer1.prob, batch_count=1562026.6666666667, ans=0.125 2023-10-04 06:44:48,040 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_40340_210_9024_1_1533433916_4132944_320-270277-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 06:44:48,045 WARNING [train.py:1204] (2/4) Exclude cut with ID ID0203W0003-781711 from training. Duration: 0.958 2023-10-04 06:44:48,129 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_30434_210_13049_1_1532519384_981746_52-12932-0_sp1.1 from training. Duration: 0.936375 2023-10-04 06:44:49,926 WARNING [train.py:1204] (2/4) Exclude cut with ID _8263_210_1664_1_1528438953494_294100_40-204731-0_sp0.9 from training. Duration: 0.5555625 2023-10-04 06:44:52,643 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1032-236863-0_sp0.9 from training. Duration: 0.9333125 2023-10-04 06:44:52,696 WARNING [train.py:1204] (2/4) Exclude cut with ID _8394_210_6702_1_1528590504316_3540189_335-274780-0_sp0.9 from training. Duration: 0.8666875 2023-10-04 06:44:52,710 WARNING [train.py:1204] (2/4) Exclude cut with ID _66873_210_5517_1_1535454136506_4293370_654-52713-0_sp1.1 from training. Duration: 0.7 2023-10-04 06:44:54,496 WARNING [train.py:1204] (2/4) Exclude cut with ID _5250_210_5219_1_1527249561806_8692110_742-267856-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 06:44:55,909 WARNING [train.py:1204] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_74-48717-0 from training. Duration: 0.58 2023-10-04 06:44:57,404 WARNING [train.py:1204] (2/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_18-46756-0 from training. Duration: 0.79 2023-10-04 06:44:58,635 WARNING [train.py:1204] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_612-56061-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 06:44:58,643 WARNING [train.py:1204] (2/4) Exclude cut with ID _56423_210_5854_1_1534896061049_3165969_412-2403-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 06:44:58,695 WARNING [train.py:1204] (2/4) Exclude cut with ID _62574_210_2655_1_1535180407058_3933249_120-196848-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 06:44:58,696 WARNING [train.py:1204] (2/4) Exclude cut with ID _40519_210_13794_1_1533715228655_7337390_916-51000-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 06:45:01,471 WARNING [train.py:1204] (2/4) Exclude cut with ID _65263_210_22356_1_1535763598691_3921037_108-21902-0_sp1.1 from training. Duration: 0.9 2023-10-04 06:45:01,701 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.conv_module2.balancer2.prob, batch_count=1562093.3333333333, ans=0.125 2023-10-04 06:45:02,890 WARNING [train.py:1204] (2/4) Exclude cut with ID _4122_210_2084_1_1526713164417_4353280_528-206771-0_sp1.1 from training. Duration: 0.8 2023-10-04 06:45:06,261 WARNING [train.py:1204] (2/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_43-325657-0_sp1.1 from training. Duration: 0.92725 2023-10-04 06:45:07,598 WARNING [train.py:1204] (2/4) Exclude cut with ID _44659_210_2721_1_1534036120061_6614860_314-82041-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 06:45:07,668 WARNING [train.py:1204] (2/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_81-300212-0_sp0.9 from training. Duration: 0.6666875 2023-10-04 06:45:09,066 WARNING [train.py:1204] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_665-196155-0_sp0.9 from training. Duration: 0.7444375 2023-10-04 06:45:10,264 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.718e+02 2.015e+02 2.214e+02 2.569e+02 3.684e+02, threshold=4.428e+02, percent-clipped=0.0 2023-10-04 06:45:10,455 WARNING [train.py:1204] (2/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_140-336881-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 06:45:12,363 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6290_210_3273_1_1527595224_1282787_115-87095-0_sp0.9 from training. Duration: 0.611125 2023-10-04 06:45:12,416 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_53373_210_5399_1_1534229471_4715695_607-259685-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 06:45:12,568 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.conv_module1.balancer1.max_abs, batch_count=1562160.0, ans=10.0 2023-10-04 06:45:15,092 WARNING [train.py:1204] (2/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_175-163894-0_sp1.1 from training. Duration: 0.67275 2023-10-04 06:45:15,116 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7002_210_1794_1_1527939683_5157286_1-281264-0_sp1.1 from training. Duration: 0.4 2023-10-04 06:45:20,774 WARNING [train.py:1204] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_605-257205-0 from training. Duration: 0.59 2023-10-04 06:45:24,040 INFO [train.py:1046] (2/4) Epoch 45, batch 600, loss[loss=0.1502, simple_loss=0.2381, pruned_loss=0.03109, over 24654.00 frames. ], tot_loss[loss=0.1556, simple_loss=0.2363, pruned_loss=0.03747, over 4490394.44 frames. ], batch size: 65, lr: 2.27e-03, grad_scale: 16.0 2023-10-04 06:45:25,435 WARNING [train.py:1204] (2/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_454-314901-0 from training. Duration: 0.46 2023-10-04 06:45:28,634 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_46032_210_14656_1_1533781412_4507969_287-130422-0_sp1.1 from training. Duration: 0.97275 2023-10-04 06:45:28,658 WARNING [train.py:1204] (2/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_186-309161-0_sp1.1 from training. Duration: 0.4909375 2023-10-04 06:45:28,696 WARNING [train.py:1204] (2/4) Exclude cut with ID _42659_210_2070_1_1533607442765_3120380_45-342307-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 06:45:31,639 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.self_attn_weights.pos_emb_skip_rate, batch_count=1562226.6666666667, ans=0.0 2023-10-04 06:45:34,184 WARNING [train.py:1204] (2/4) Exclude cut with ID _12555_210_8750_1_1530075594402_3780239_478-326818-0_sp1.1 from training. Duration: 0.963625 2023-10-04 06:45:35,510 WARNING [train.py:1204] (2/4) Exclude cut with ID _7606_210_4915_1_1528894253277_5080690_127-177761-0_sp1.1 from training. Duration: 0.5818125 2023-10-04 06:45:37,406 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6788_210_1388_1_1527898698_4562021_91-197981-0 from training. Duration: 0.83 2023-10-04 06:45:37,641 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.conv_module1.balancer2.prob, batch_count=1562293.3333333333, ans=0.125 2023-10-04 06:45:40,204 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8558_210_1410_1_1528505225_7613406_131-329524-0_sp0.9 from training. Duration: 0.87775 2023-10-04 06:45:40,330 WARNING [train.py:1204] (2/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_153-153537-0_sp1.1 from training. Duration: 0.82725 2023-10-04 06:45:43,418 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_17292_210_6753_1_1531125388_2943734_285-349105-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 06:45:44,909 WARNING [train.py:1204] (2/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_37-41919-0 from training. Duration: 0.87 2023-10-04 06:45:44,933 WARNING [train.py:1204] (2/4) Exclude cut with ID _60860_210_8254_1_1534896015875_3557169_172-294852-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 06:45:45,118 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.bypass.scale_min, batch_count=1562293.3333333333, ans=0.2 2023-10-04 06:45:51,836 WARNING [train.py:1204] (2/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_146-276944-0 from training. Duration: 0.83 2023-10-04 06:45:55,148 WARNING [train.py:1204] (2/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_1254-44082-0_sp1.1 from training. Duration: 0.936375 2023-10-04 06:45:55,160 WARNING [train.py:1204] (2/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_1115-180350-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 06:45:55,197 WARNING [train.py:1204] (2/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_60-146189-0_sp1.1 from training. Duration: 0.8909375 2023-10-04 06:46:02,610 WARNING [train.py:1204] (2/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_517-153649-0_sp1.1 from training. Duration: 0.92725 2023-10-04 06:46:02,621 WARNING [train.py:1204] (2/4) Exclude cut with ID _39571_210_13449_1_1533801584143_3651077_37-266257-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 06:46:02,666 WARNING [train.py:1204] (2/4) Exclude cut with ID _6653_210_2629_1_1527919172441_6732679_527-276118-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 06:46:08,722 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7696_210_2040_1_1528207167_3716099_117-125434-0_sp1.1 from training. Duration: 0.6909375 2023-10-04 06:46:13,364 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_25767_210_2161_1_1532307483_3867748_478-156360-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 06:46:13,368 WARNING [train.py:1204] (2/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_177-49092-0_sp1.1 from training. Duration: 0.82725 2023-10-04 06:46:13,375 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_11305_210_1794_1_1529750428_7178148_680-317073-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 06:46:13,991 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.4.encoder.layers.2.self_attn2.whiten, num_groups=1, num_channels=384, metric=11.34 vs. limit=22.5 2023-10-04 06:46:20,377 WARNING [train.py:1204] (2/4) Exclude cut with ID _53565_210_10635_1_1534337218335_3820259_287-18120-0 from training. Duration: 0.97 2023-10-04 06:46:22,046 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.ff2_skip_rate, batch_count=1562493.3333333333, ans=0.0 2023-10-04 06:46:24,789 WARNING [train.py:1204] (2/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_498-40153-0_sp1.1 from training. Duration: 0.57275 2023-10-04 06:46:24,807 WARNING [train.py:1204] (2/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_555-63066-0_sp1.1 from training. Duration: 0.963625 2023-10-04 06:46:30,067 WARNING [train.py:1204] (2/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_395-310220-0 from training. Duration: 0.89 2023-10-04 06:46:30,149 WARNING [train.py:1204] (2/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_937-208281-0_sp1.1 from training. Duration: 0.77275 2023-10-04 06:46:30,317 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.bypass_mid.scale_min, batch_count=1562493.3333333333, ans=0.2 2023-10-04 06:46:32,997 WARNING [train.py:1204] (2/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_120-211630-0 from training. Duration: 0.92 2023-10-04 06:46:33,030 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_454-276429-0_sp1.1 from training. Duration: 0.7909375 2023-10-04 06:46:33,276 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.bypass_mid.scale_min, batch_count=1562493.3333333333, ans=0.2 2023-10-04 06:46:34,272 WARNING [train.py:1204] (2/4) Exclude cut with ID _22826_210_9994_1_1531908201522_3279169_404-75889-0_sp1.1 from training. Duration: 0.8090625 2023-10-04 06:46:35,003 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.4.encoder.layers.2.nonlin_attention.whiten2, num_groups=1, num_channels=384, metric=3.38 vs. limit=15.0 2023-10-04 06:46:39,047 INFO [train.py:1046] (2/4) Epoch 45, batch 650, loss[loss=0.1569, simple_loss=0.2367, pruned_loss=0.03858, over 23495.00 frames. ], tot_loss[loss=0.1552, simple_loss=0.2359, pruned_loss=0.03728, over 4530788.11 frames. ], batch size: 106, lr: 2.27e-03, grad_scale: 16.0 2023-10-04 06:46:39,154 WARNING [train.py:1204] (2/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_282-136641-0_sp0.9 from training. Duration: 0.4666875 2023-10-04 06:46:41,097 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_809-17156-0_sp1.1 from training. Duration: 0.663625 2023-10-04 06:46:41,421 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.self_attn_weights.pos_emb_skip_rate, batch_count=1562560.0, ans=0.0 2023-10-04 06:46:42,555 WARNING [train.py:1204] (2/4) Exclude cut with ID _9235_210_5219_1_1528891154253_8046120_1003-101638-0_sp0.9 from training. Duration: 0.811125 2023-10-04 06:46:45,203 WARNING [train.py:1204] (2/4) Exclude cut with ID _22826_210_9994_1_1531908201522_3279169_404-75889-0_sp0.9 from training. Duration: 0.988875 2023-10-04 06:46:46,604 WARNING [train.py:1204] (2/4) Exclude cut with ID _59288_210_5854_1_1535077757774_3412246_570-295255-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 06:46:48,063 WARNING [train.py:1204] (2/4) Exclude cut with ID _3465_210_3525_1_1526175667060_3574049_412-287526-0 from training. Duration: 0.72 2023-10-04 06:46:49,337 WARNING [train.py:1204] (2/4) Exclude cut with ID _44010_210_4525_1_1533884262964_4013620_649-79655-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 06:46:49,643 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.feed_forward2.hidden_balancer.prob, batch_count=1562560.0, ans=0.125 2023-10-04 06:46:52,257 WARNING [train.py:1204] (2/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_57-270037-0_sp0.9 from training. Duration: 0.8555625 2023-10-04 06:46:52,265 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_34815_210_6388_1_1533381320_3571903_302-255963-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 06:46:57,067 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_47449_210_5399_1_1534076445_4006929_573-301852-0_sp1.1 from training. Duration: 0.97275 2023-10-04 06:47:01,178 WARNING [train.py:1204] (2/4) Exclude cut with ID 6709-74022-0004-86860-0_sp1.1 from training. Duration: 0.9409375 2023-10-04 06:47:01,318 WARNING [train.py:1204] (2/4) Exclude cut with ID _40519_210_13794_1_1533715228655_7337390_111-71991-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 06:47:02,946 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_30167_210_3528_1_1532428012_4772375_169-214186-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 06:47:06,377 WARNING [train.py:1204] (2/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_20-235313-0_sp1.1 from training. Duration: 0.963625 2023-10-04 06:47:06,422 WARNING [train.py:1204] (2/4) Exclude cut with ID _4681_210_3525_1_1527251040321_1235713_123-24428-0_sp0.9 from training. Duration: 0.5333125 2023-10-04 06:47:09,631 WARNING [train.py:1204] (2/4) Exclude cut with ID _31602_210_5854_1_1532677021456_4458250_444-198752-0_sp1.1 from training. Duration: 0.97275 2023-10-04 06:47:09,680 WARNING [train.py:1204] (2/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_9-279442-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 06:47:09,736 WARNING [train.py:1204] (2/4) Exclude cut with ID _6275_210_2084_1_1528110062213_7669049_973-198085-0_sp0.9 from training. Duration: 0.8333125 2023-10-04 06:47:11,175 WARNING [train.py:1204] (2/4) Exclude cut with ID _29853_210_7497_1_1532505600851_6642110_567-250221-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 06:47:12,670 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6545_210_2040_1_1527774670_4245258_407-48070-0_sp1.1 from training. Duration: 0.6909375 2023-10-04 06:47:14,109 WARNING [train.py:1204] (2/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_224-343295-0_sp0.9 from training. Duration: 0.611125 2023-10-04 06:47:15,392 WARNING [train.py:1204] (2/4) Exclude cut with ID IC0125W0011-61817 from training. Duration: 0.88 2023-10-04 06:47:15,408 WARNING [train.py:1204] (2/4) Exclude cut with ID _60113_210_16271_1_1535164212481_4386079_435-154371-0_sp1.1 from training. Duration: 0.97275 2023-10-04 06:47:15,428 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_25836_210_16554_1_1532138167_53197_3-9394-0_sp1.1 from training. Duration: 0.92725 2023-10-04 06:47:19,586 WARNING [train.py:1204] (2/4) Exclude cut with ID _29343_210_4381_1_1532602757752_3674109_57-55125-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 06:47:19,686 WARNING [train.py:1204] (2/4) Exclude cut with ID _56235_210_10640_1_1534557335235_4161099_267-23560-0_sp1.1 from training. Duration: 0.936375 2023-10-04 06:47:19,710 WARNING [train.py:1204] (2/4) Exclude cut with ID _30196_210_1664_1_1533708496522_7074140_1150-278867-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 06:47:20,942 WARNING [train.py:1204] (2/4) Exclude cut with ID _6270_210_2084_1_1527850911573_3856420_26-277343-0_sp0.9 from training. Duration: 0.911125 2023-10-04 06:47:22,356 WARNING [train.py:1204] (2/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_325-62802-0 from training. Duration: 0.87 2023-10-04 06:47:22,422 WARNING [train.py:1204] (2/4) Exclude cut with ID _56524_210_9642_1_1534572011779_3619170_145-310842-0_sp1.1 from training. Duration: 0.9 2023-10-04 06:47:23,748 WARNING [train.py:1204] (2/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_860-96198-0_sp0.9 from training. Duration: 0.811125 2023-10-04 06:47:25,134 WARNING [train.py:1204] (2/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_17-25045-0_sp1.1 from training. Duration: 0.536375 2023-10-04 06:47:25,160 WARNING [train.py:1204] (2/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_349-69727-0_sp1.1 from training. Duration: 0.936375 2023-10-04 06:47:26,569 WARNING [train.py:1204] (2/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_580-93798-0_sp1.1 from training. Duration: 0.5090625 2023-10-04 06:47:28,516 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7762_210_1794_1_1528199508_4057408_419-125009-0 from training. Duration: 0.83 2023-10-04 06:47:28,764 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.conv_module1.balancer2.prob, batch_count=1562760.0, ans=0.125 2023-10-04 06:47:29,755 WARNING [train.py:1204] (2/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_356-167816-0 from training. Duration: 0.72 2023-10-04 06:47:29,777 WARNING [train.py:1204] (2/4) Exclude cut with ID _29345_210_4381_1_1532689168263_3860169_225-97100-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 06:47:29,788 WARNING [train.py:1204] (2/4) Exclude cut with ID _14227_210_4381_1_1530701804286_3940259_374-194712-0_sp1.1 from training. Duration: 0.92725 2023-10-04 06:47:29,819 WARNING [train.py:1204] (2/4) Exclude cut with ID _40137_210_12210_1_1533290484046_3795180_249-187059-0_sp0.9 from training. Duration: 0.97775 2023-10-04 06:47:29,840 WARNING [train.py:1204] (2/4) Exclude cut with ID _22826_210_9994_1_1531908201522_3279169_389-317194-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 06:47:32,533 WARNING [train.py:1204] (2/4) Exclude cut with ID _27485_210_4916_1_1532739191677_3668269_239-122918-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 06:47:39,056 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_2245_210_2715_1_1525511548_3540254_128-117609-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 06:47:39,089 WARNING [train.py:1204] (2/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_160-276178-0_sp1.1 from training. Duration: 0.963625 2023-10-04 06:47:40,359 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.699e+02 2.090e+02 2.308e+02 2.598e+02 4.000e+02, threshold=4.617e+02, percent-clipped=0.0 2023-10-04 06:47:40,468 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_31920_210_10633_1_1532860009_1686094_1-279735-0_sp1.1 from training. Duration: 0.97275 2023-10-04 06:47:44,423 WARNING [train.py:1204] (2/4) Exclude cut with ID _51078_210_9070_1_1534161557424_3805250_325-262383-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 06:47:44,458 WARNING [train.py:1204] (2/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_643-52007-0_sp0.9 from training. Duration: 0.6333125 2023-10-04 06:47:45,760 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_51937_210_5014_1_1534573610_4022025_518-299248-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 06:47:47,448 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.feed_forward3.hidden_balancer.prob, batch_count=1562826.6666666667, ans=0.125 2023-10-04 06:47:52,602 INFO [train.py:1046] (2/4) Epoch 45, batch 700, loss[loss=0.1644, simple_loss=0.2555, pruned_loss=0.03671, over 24557.00 frames. ], tot_loss[loss=0.1544, simple_loss=0.2349, pruned_loss=0.03697, over 4579996.12 frames. ], batch size: 71, lr: 2.27e-03, grad_scale: 16.0 2023-10-04 06:47:52,666 WARNING [train.py:1204] (2/4) Exclude cut with ID _13252_210_1925_1_1530671372047_3336909_166-314607-0_sp1.1 from training. Duration: 0.4909375 2023-10-04 06:47:52,680 WARNING [train.py:1204] (2/4) Exclude cut with ID _44778_210_2054_1_1533798127203_7443090_1087-282939-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 06:47:52,721 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_23850_210_4238_1_1532232597_1632433_32-185135-0_sp1.1 from training. Duration: 0.7818125 2023-10-04 06:47:53,902 WARNING [train.py:1204] (2/4) Exclude cut with ID _59500_210_8254_1_1534809610680_3736009_145-89359-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 06:47:58,232 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_340-336408-0 from training. Duration: 0.93 2023-10-04 06:48:00,213 WARNING [train.py:1204] (2/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_9-21737-0 from training. Duration: 0.97 2023-10-04 06:48:02,304 WARNING [train.py:1204] (2/4) Exclude cut with ID _2220_210_1819_1_1525509109898_3658680_50-99837-0 from training. Duration: 0.54 2023-10-04 06:48:02,371 WARNING [train.py:1204] (2/4) Exclude cut with ID _30304_210_6319_1_1532476827261_7582060_879-299350-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 06:48:03,520 WARNING [train.py:1204] (2/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_905-160074-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 06:48:04,939 WARNING [train.py:1204] (2/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_395-73570-0 from training. Duration: 0.99 2023-10-04 06:48:09,569 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7264_210_4938_1_1527939911_8132095_800-91995-0_sp1.1 from training. Duration: 0.7818125 2023-10-04 06:48:11,001 WARNING [train.py:1204] (2/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_77-311404-0_sp1.1 from training. Duration: 0.8545625 2023-10-04 06:48:12,325 WARNING [train.py:1204] (2/4) Exclude cut with ID _13679_210_7497_1_1530534293662_3355098_107-80772-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 06:48:13,686 WARNING [train.py:1204] (2/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_224-343295-0_sp1.1 from training. Duration: 0.5 2023-10-04 06:48:13,716 WARNING [train.py:1204] (2/4) Exclude cut with ID _54871_210_22356_1_1534665798253_3856359_156-253482-0_sp1.1 from training. Duration: 0.963625 2023-10-04 06:48:16,438 WARNING [train.py:1204] (2/4) Exclude cut with ID _49728_210_12651_1_1534041034294_6788150_309-125532-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 06:48:16,668 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.1.conv_module1.balancer2.min_positive, batch_count=1562960.0, ans=0.05 2023-10-04 06:48:17,827 WARNING [train.py:1204] (2/4) Exclude cut with ID _13913_210_9994_1_1530579615187_3383897_473-277682-0_sp0.9 from training. Duration: 0.4333125 2023-10-04 06:48:19,174 WARNING [train.py:1204] (2/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_3-276701-0_sp1.1 from training. Duration: 0.82725 2023-10-04 06:48:20,610 WARNING [train.py:1204] (2/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_282-136641-0 from training. Duration: 0.42 2023-10-04 06:48:22,053 WARNING [train.py:1204] (2/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_127-566-0 from training. Duration: 0.84 2023-10-04 06:48:25,439 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.feed_forward3.hidden_balancer.prob, batch_count=1563026.6666666667, ans=0.125 2023-10-04 06:48:25,498 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.balancer1.prob, batch_count=1563026.6666666667, ans=0.125 2023-10-04 06:48:26,643 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6163_210_6400_1_1527506746_4263068_283-137811-0_sp1.1 from training. Duration: 0.7 2023-10-04 06:48:26,671 WARNING [train.py:1204] (2/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_34-183685-0_sp1.1 from training. Duration: 0.8909375 2023-10-04 06:48:28,137 WARNING [train.py:1204] (2/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_154-135696-0_sp0.9 from training. Duration: 0.888875 2023-10-04 06:48:32,642 WARNING [train.py:1204] (2/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_676-51014-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 06:48:32,719 WARNING [train.py:1204] (2/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_38-169920-0 from training. Duration: 0.91 2023-10-04 06:48:34,709 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.4.encoder.layers.1.conv_module2.whiten, num_groups=1, num_channels=384, metric=2.72 vs. limit=15.0 2023-10-04 06:48:37,753 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.ff2_skip_rate, batch_count=1563093.3333333333, ans=0.0 2023-10-04 06:48:38,746 WARNING [train.py:1204] (2/4) Exclude cut with ID _6653_210_2629_1_1527919172441_6732679_747-114465-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 06:48:38,809 WARNING [train.py:1204] (2/4) Exclude cut with ID _8021_210_7994_1_1528376451358_3653069_100-140231-0_sp1.1 from training. Duration: 0.6090625 2023-10-04 06:48:40,168 WARNING [train.py:1204] (2/4) Exclude cut with ID _64135_210_2253_1_1535677267876_7608972_133-188266-0 from training. Duration: 0.81 2023-10-04 06:48:42,974 WARNING [train.py:1204] (2/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_714-310538-0_sp1.1 from training. Duration: 0.7818125 2023-10-04 06:48:43,074 WARNING [train.py:1204] (2/4) Exclude cut with ID _39867_210_9826_1_1533553222030_9344259_1134-288498-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 06:48:44,621 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.conv_module2.balancer2.prob, batch_count=1563093.3333333333, ans=0.125 2023-10-04 06:48:45,842 WARNING [train.py:1204] (2/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_211-237289-0_sp1.1 from training. Duration: 0.936375 2023-10-04 06:48:46,548 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.3.encoder.layers.1.feed_forward2.out_whiten, num_groups=1, num_channels=512, metric=5.88 vs. limit=15.0 2023-10-04 06:48:51,315 WARNING [train.py:1204] (2/4) Exclude cut with ID _59202_210_26984_1_1534816810612_3549174_137-318710-0_sp0.9 from training. Duration: 0.988875 2023-10-04 06:48:51,340 WARNING [train.py:1204] (2/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_86-66192-0 from training. Duration: 0.55 2023-10-04 06:48:52,923 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.feed_forward1.hidden_balancer.prob, batch_count=1563160.0, ans=0.125 2023-10-04 06:48:54,101 WARNING [train.py:1204] (2/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_540-275685-0 from training. Duration: 0.89 2023-10-04 06:48:54,129 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8776_210_2040_1_1528638837_4088036_244-278763-0 from training. Duration: 0.9 2023-10-04 06:48:57,447 WARNING [train.py:1204] (2/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_9-219972-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 06:49:00,641 WARNING [train.py:1204] (2/4) Exclude cut with ID _44659_210_2721_1_1534036120061_6614860_656-283823-0_sp1.1 from training. Duration: 0.92725 2023-10-04 06:49:00,713 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_69630_210_7234_1_1535789601_661049_77-216385-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 06:49:03,386 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_19952_210_2161_1_1531875469_3851750_210-308919-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 06:49:03,398 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_4737_210_2040_1_1526998689_3293008_378-178650-0 from training. Duration: 0.95 2023-10-04 06:49:06,720 INFO [train.py:1046] (2/4) Epoch 45, batch 750, loss[loss=0.1726, simple_loss=0.2571, pruned_loss=0.04401, over 23437.00 frames. ], tot_loss[loss=0.154, simple_loss=0.2345, pruned_loss=0.03671, over 4610949.94 frames. ], batch size: 93, lr: 2.27e-03, grad_scale: 16.0 2023-10-04 06:49:08,115 WARNING [train.py:1204] (2/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_224-343295-0 from training. Duration: 0.55 2023-10-04 06:49:08,138 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7002_210_1794_1_1527939683_5157286_184-229630-0 from training. Duration: 0.74 2023-10-04 06:49:08,167 WARNING [train.py:1204] (2/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_6-214815-0 from training. Duration: 0.85 2023-10-04 06:49:10,866 WARNING [train.py:1204] (2/4) Exclude cut with ID _8699_210_2069_1_1528630346831_3960409_160-24408-0 from training. Duration: 0.91 2023-10-04 06:49:10,909 WARNING [train.py:1204] (2/4) Exclude cut with ID _5585_210_2667_1_1527418813324_8007340_89-332072-0 from training. Duration: 0.64 2023-10-04 06:49:10,942 WARNING [train.py:1204] (2/4) Exclude cut with ID _5250_210_5219_1_1527249561806_8692110_528-259800-0_sp1.1 from training. Duration: 0.963625 2023-10-04 06:49:12,343 WARNING [train.py:1204] (2/4) Exclude cut with ID _55383_210_10635_1_1534468426172_2231669_134-128246-0 from training. Duration: 0.89 2023-10-04 06:49:13,731 WARNING [train.py:1204] (2/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_400-338274-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 06:49:13,795 WARNING [train.py:1204] (2/4) Exclude cut with ID _14224_210_4381_1_1530529062037_4264010_364-22817-0_sp0.9 from training. Duration: 0.788875 2023-10-04 06:49:16,647 WARNING [train.py:1204] (2/4) Exclude cut with ID _42863_210_9070_1_1533902398788_3696920_331-291057-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 06:49:16,957 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.bypass.skip_rate, batch_count=1563226.6666666667, ans=0.07 2023-10-04 06:49:18,148 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_5436_210_1794_1_1527331317_2541494_229-211016-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 06:49:18,176 WARNING [train.py:1204] (2/4) Exclude cut with ID _6456_210_1534_1_1527681634212_3181049_345-252877-0_sp0.9 from training. Duration: 0.711125 2023-10-04 06:49:18,214 WARNING [train.py:1204] (2/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_96-81499-0_sp1.1 from training. Duration: 0.92725 2023-10-04 06:49:22,293 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_5575_210_1410_1_1527301305_2570170_342-265572-0_sp1.1 from training. Duration: 0.8545625 2023-10-04 06:49:22,361 WARNING [train.py:1204] (2/4) Exclude cut with ID _5444_210_2151_1_1527418808464_5312210_726-27165-0_sp0.9 from training. Duration: 0.7444375 2023-10-04 06:49:23,719 WARNING [train.py:1204] (2/4) Exclude cut with ID _22083_210_8254_1_1531702862439_3678970_304-327750-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 06:49:25,173 WARNING [train.py:1204] (2/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_245-226199-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 06:49:25,203 WARNING [train.py:1204] (2/4) Exclude cut with ID _42875_210_14856_1_1533704397487_4012433_102-199499-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 06:49:26,503 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_51225_210_3601_1_1534400634_2327383_55-301844-0 from training. Duration: 0.93 2023-10-04 06:49:26,613 WARNING [train.py:1204] (2/4) Exclude cut with ID _28317_210_12742_1_1532480302381_8510325_916-195848-0_sp1.1 from training. Duration: 0.836375 2023-10-04 06:49:29,256 WARNING [train.py:1204] (2/4) Exclude cut with ID _44718_210_5816_1_1534050051869_6469030_497-311999-0_sp1.1 from training. Duration: 0.936375 2023-10-04 06:49:30,465 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_34886_210_10633_1_1533203882_1755661_91-194765-0_sp1.1 from training. Duration: 0.936375 2023-10-04 06:49:31,014 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.4.encoder.layers.1.feed_forward3.out_whiten, num_groups=1, num_channels=384, metric=11.11 vs. limit=15.0 2023-10-04 06:49:31,856 WARNING [train.py:1204] (2/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_310-26186-0_sp0.9 from training. Duration: 0.77775 2023-10-04 06:49:33,181 WARNING [train.py:1204] (2/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_416-257730-0 from training. Duration: 0.65 2023-10-04 06:49:33,187 WARNING [train.py:1204] (2/4) Exclude cut with ID _9389_210_1664_1_1529217337301_5289269_209-154760-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 06:49:36,433 WARNING [train.py:1204] (2/4) Exclude cut with ID _4092_210_2151_1_1526643044449_3135759_227-213972-0 from training. Duration: 0.54 2023-10-04 06:49:36,455 WARNING [train.py:1204] (2/4) Exclude cut with ID IC0679W0044-335852 from training. Duration: 0.768 2023-10-04 06:49:36,526 WARNING [train.py:1204] (2/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_42-186162-0 from training. Duration: 0.92 2023-10-04 06:49:36,534 WARNING [train.py:1204] (2/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_302-2550-0_sp0.9 from training. Duration: 0.9 2023-10-04 06:49:36,555 WARNING [train.py:1204] (2/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_356-31216-0_sp0.9 from training. Duration: 0.8333125 2023-10-04 06:49:38,499 WARNING [train.py:1204] (2/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_125-322172-0_sp1.1 from training. Duration: 0.8454375 2023-10-04 06:49:42,810 WARNING [train.py:1204] (2/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_141-89727-0_sp0.9 from training. Duration: 0.788875 2023-10-04 06:49:44,110 WARNING [train.py:1204] (2/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_647-337784-0_sp1.1 from training. Duration: 0.97275 2023-10-04 06:49:44,127 WARNING [train.py:1204] (2/4) Exclude cut with ID _6493_210_1747_1_1528459210424_4012470_513-140967-0_sp1.1 from training. Duration: 0.7181875 2023-10-04 06:49:45,595 WARNING [train.py:1204] (2/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_87-266334-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 06:49:46,885 WARNING [train.py:1204] (2/4) Exclude cut with ID _26528_210_9070_1_1532570410842_3851310_526-261338-0_sp1.1 from training. Duration: 0.92725 2023-10-04 06:49:46,935 WARNING [train.py:1204] (2/4) Exclude cut with ID _14224_210_4381_1_1530529062037_4264010_380-343670-0 from training. Duration: 0.88 2023-10-04 06:49:48,194 WARNING [train.py:1204] (2/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_449-15508-0_sp1.1 from training. Duration: 0.7454375 2023-10-04 06:49:49,543 WARNING [train.py:1204] (2/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_372-6869-0_sp0.9 from training. Duration: 0.488875 2023-10-04 06:49:49,589 WARNING [train.py:1204] (2/4) Exclude cut with ID _26907_210_7234_1_1532221740694_3205263_337-2494-0_sp0.9 from training. Duration: 0.97775 2023-10-04 06:49:53,687 WARNING [train.py:1204] (2/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_10-100367-0_sp1.1 from training. Duration: 0.8909375 2023-10-04 06:49:53,718 WARNING [train.py:1204] (2/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_47-167373-0 from training. Duration: 0.92 2023-10-04 06:49:53,788 WARNING [train.py:1204] (2/4) Exclude cut with ID _28323_210_16187_1_1532480395667_4100300_628-282179-0_sp1.1 from training. Duration: 0.97275 2023-10-04 06:49:59,840 WARNING [train.py:1204] (2/4) Exclude cut with ID _20455_210_5219_1_1531565996775_8098100_213-280898-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 06:49:59,942 WARNING [train.py:1204] (2/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_383-316000-0_sp1.1 from training. Duration: 0.8181875 2023-10-04 06:50:01,800 WARNING [train.py:1204] (2/4) Exclude cut with ID _18513_210_9206_1_1531618215223_3862620_16-78180-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 06:50:04,585 WARNING [train.py:1204] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_330-327861-0_sp1.1 from training. Duration: 0.6090625 2023-10-04 06:50:07,900 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.546e+02 2.004e+02 2.219e+02 2.508e+02 4.389e+02, threshold=4.439e+02, percent-clipped=0.0 2023-10-04 06:50:08,018 WARNING [train.py:1204] (2/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_356-31216-0 from training. Duration: 0.75 2023-10-04 06:50:09,334 WARNING [train.py:1204] (2/4) Exclude cut with ID _25248_210_8750_1_1532062825941_5554180_255-148539-0_sp1.1 from training. Duration: 0.963625 2023-10-04 06:50:10,735 WARNING [train.py:1204] (2/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_269-154726-0_sp1.1 from training. Duration: 0.7818125 2023-10-04 06:50:12,218 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7002_210_1794_1_1527939683_5157286_163-87552-0_sp1.1 from training. Duration: 0.7818125 2023-10-04 06:50:12,251 WARNING [train.py:1204] (2/4) Exclude cut with ID _54871_210_22356_1_1534665798253_3856359_97-151896-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 06:50:13,846 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.ff3_skip_rate, batch_count=1563493.3333333333, ans=0.0 2023-10-04 06:50:14,952 WARNING [train.py:1204] (2/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_207-7026-0_sp1.1 from training. Duration: 0.97275 2023-10-04 06:50:14,972 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_92-46009-0_sp0.9 from training. Duration: 0.6 2023-10-04 06:50:15,832 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.2.encoder.layers.0.self_attn_weights.whiten_keys, num_groups=4, num_channels=128, metric=4.36 vs. limit=6.0 2023-10-04 06:50:20,448 INFO [train.py:1046] (2/4) Epoch 45, batch 800, loss[loss=0.1841, simple_loss=0.2545, pruned_loss=0.05684, over 19225.00 frames. ], tot_loss[loss=0.1541, simple_loss=0.2348, pruned_loss=0.03674, over 4624351.72 frames. ], batch size: 388, lr: 2.27e-03, grad_scale: 32.0 2023-10-04 06:50:21,909 WARNING [train.py:1204] (2/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_122-178847-0_sp1.1 from training. Duration: 0.97275 2023-10-04 06:50:21,912 WARNING [train.py:1204] (2/4) Exclude cut with ID _44778_210_2054_1_1533798127203_7443090_94-348956-0_sp1.1 from training. Duration: 0.92725 2023-10-04 06:50:23,322 WARNING [train.py:1204] (2/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_1018-244089-0_sp1.1 from training. Duration: 0.963625 2023-10-04 06:50:23,337 WARNING [train.py:1204] (2/4) Exclude cut with ID _56235_210_10640_1_1534557335235_4161099_87-118423-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 06:50:24,664 WARNING [train.py:1204] (2/4) Exclude cut with ID _53713_210_24116_1_1534314487762_7416751_504-212292-0_sp1.1 from training. Duration: 0.92725 2023-10-04 06:50:24,684 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_446-150327-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 06:50:27,891 WARNING [train.py:1204] (2/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_682-245989-0_sp1.1 from training. Duration: 0.92725 2023-10-04 06:50:28,196 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=1563560.0, ans=0.1 2023-10-04 06:50:32,161 WARNING [train.py:1204] (2/4) Exclude cut with ID _35723_210_1794_1_1533088341043_628250_8-234524-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 06:50:32,241 WARNING [train.py:1204] (2/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_64-248359-0_sp0.9 from training. Duration: 0.7666875 2023-10-04 06:50:36,866 WARNING [train.py:1204] (2/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_49-97158-0 from training. Duration: 0.76 2023-10-04 06:50:38,573 WARNING [train.py:1204] (2/4) Exclude cut with ID _24314_210_4915_1_1532135078116_7170179_743-915-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 06:50:39,975 WARNING [train.py:1204] (2/4) Exclude cut with ID _61194_210_18628_1_1535007555628_3750909_353-323468-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 06:50:40,004 WARNING [train.py:1204] (2/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_387-131538-0_sp1.1 from training. Duration: 0.736375 2023-10-04 06:50:40,031 WARNING [train.py:1204] (2/4) Exclude cut with ID _44424_210_18163_1_1533623394848_1213110_79-187603-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 06:50:40,059 WARNING [train.py:1204] (2/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_86-19601-0 from training. Duration: 0.85 2023-10-04 06:50:41,376 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_50050_210_16223_1_1535435796_7290138_103-283529-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 06:50:41,422 WARNING [train.py:1204] (2/4) Exclude cut with ID _13913_210_9994_1_1530579615187_3383897_473-277682-0 from training. Duration: 0.39 2023-10-04 06:50:44,310 WARNING [train.py:1204] (2/4) Exclude cut with ID _42326_210_7770_1_1533814282531_3707269_368-68029-0_sp1.1 from training. Duration: 0.92725 2023-10-04 06:50:45,701 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_41428_210_2161_1_1533774184_3992532_446-339645-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 06:50:48,546 WARNING [train.py:1204] (2/4) Exclude cut with ID _69584_210_5219_1_1535785133775_4044450_325-7656-0_sp1.1 from training. Duration: 0.7818125 2023-10-04 06:50:48,564 WARNING [train.py:1204] (2/4) Exclude cut with ID _21632_210_2253_1_1531994001765_3744140_134-184387-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 06:50:49,924 WARNING [train.py:1204] (2/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_550-184707-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 06:50:51,183 WARNING [train.py:1204] (2/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_106-341780-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 06:50:52,754 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.0.conv_module1.balancer1.prob, batch_count=1563693.3333333333, ans=0.125 2023-10-04 06:50:54,020 WARNING [train.py:1204] (2/4) Exclude cut with ID _45871_210_5665_1_1533864614245_3296060_253-132552-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 06:50:55,343 WARNING [train.py:1204] (2/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_395-310220-0_sp1.1 from training. Duration: 0.8090625 2023-10-04 06:50:55,365 WARNING [train.py:1204] (2/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_795-33252-0_sp0.9 from training. Duration: 0.411125 2023-10-04 06:50:56,748 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9002W0003-495962 from training. Duration: 0.946 2023-10-04 06:50:58,672 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_43-71989-0 from training. Duration: 0.57 2023-10-04 06:50:58,684 WARNING [train.py:1204] (2/4) Exclude cut with ID _8699_210_2069_1_1528630346831_3960409_155-39766-0_sp1.1 from training. Duration: 0.7090625 2023-10-04 06:50:58,697 WARNING [train.py:1204] (2/4) Exclude cut with ID _27894_210_12392_1_1532415496078_6902439_788-232684-0_sp1.1 from training. Duration: 0.97275 2023-10-04 06:50:59,910 WARNING [train.py:1204] (2/4) Exclude cut with ID _56494_210_4220_1_1535284837625_3033169_10-39296-0_sp1.1 from training. Duration: 0.92725 2023-10-04 06:50:59,930 WARNING [train.py:1204] (2/4) Exclude cut with ID _40901_210_5665_1_1533363789958_3856130_341-20819-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 06:51:03,392 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9029W0435-509822 from training. Duration: 0.896 2023-10-04 06:51:03,591 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.ff2_skip_rate, batch_count=1563760.0, ans=0.0 2023-10-04 06:51:04,090 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.3.encoder.layers.1.whiten, num_groups=1, num_channels=512, metric=4.96 vs. limit=12.0 2023-10-04 06:51:04,784 WARNING [train.py:1204] (2/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_51-117576-0 from training. Duration: 0.4 2023-10-04 06:51:06,687 WARNING [train.py:1204] (2/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_197-28164-0_sp1.1 from training. Duration: 0.72725 2023-10-04 06:51:08,119 WARNING [train.py:1204] (2/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_145-137790-0_sp1.1 from training. Duration: 0.7545625 2023-10-04 06:51:12,640 WARNING [train.py:1204] (2/4) Exclude cut with ID _47790_210_16187_1_1533862814066_4207519_2-39653-0_sp1.1 from training. Duration: 0.8909375 2023-10-04 06:51:16,898 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_3780_210_2087_1_1526467760_2130483_186-208240-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 06:51:18,247 WARNING [train.py:1204] (2/4) Exclude cut with ID _24055_210_12651_1_1532310907143_976979_92-59356-0 from training. Duration: 0.91 2023-10-04 06:51:18,310 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_3910_210_1794_1_1526742306_334530_28-238909-0_sp1.1 from training. Duration: 0.736375 2023-10-04 06:51:22,402 WARNING [train.py:1204] (2/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_269-154726-0 from training. Duration: 0.86 2023-10-04 06:51:26,879 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=1563826.6666666667, ans=0.1 2023-10-04 06:51:28,110 WARNING [train.py:1204] (2/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_326-165900-0_sp0.9 from training. Duration: 0.7666875 2023-10-04 06:51:31,032 WARNING [train.py:1204] (2/4) Exclude cut with ID _5294_210_1747_1_1527249643928_3696179_470-276077-0_sp1.1 from training. Duration: 0.963625 2023-10-04 06:51:31,069 WARNING [train.py:1204] (2/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_372-131163-0 from training. Duration: 0.94 2023-10-04 06:51:33,025 WARNING [train.py:1204] (2/4) Exclude cut with ID _45297_210_2721_1_1533715351381_3264949_351-147136-0_sp1.1 from training. Duration: 0.7909375 2023-10-04 06:51:33,098 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_1072-12671-0_sp1.1 from training. Duration: 0.92725 2023-10-04 06:51:34,458 INFO [train.py:1046] (2/4) Epoch 45, batch 850, loss[loss=0.1436, simple_loss=0.2245, pruned_loss=0.03138, over 24459.00 frames. ], tot_loss[loss=0.1547, simple_loss=0.2357, pruned_loss=0.0369, over 4653165.64 frames. ], batch size: 58, lr: 2.27e-03, grad_scale: 32.0 2023-10-04 06:51:34,548 WARNING [train.py:1204] (2/4) Exclude cut with ID _15133_210_6228_1_1530794512074_3080379_54-198193-0 from training. Duration: 0.94 2023-10-04 06:51:34,571 WARNING [train.py:1204] (2/4) Exclude cut with ID _30196_210_1664_1_1533708496522_7074140_1194-270988-0_sp1.1 from training. Duration: 0.97275 2023-10-04 06:51:37,657 WARNING [train.py:1204] (2/4) Exclude cut with ID _50310_210_15488_1_1534068043345_3641080_289-193637-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 06:51:39,171 WARNING [train.py:1204] (2/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_313-263495-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 06:51:42,412 WARNING [train.py:1204] (2/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_18-130336-0_sp1.1 from training. Duration: 0.7181875 2023-10-04 06:51:42,483 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_47896_210_20508_1_1534492518_5958868_646-287209-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 06:51:43,841 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_294-163793-0 from training. Duration: 0.82 2023-10-04 06:51:45,166 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_8-319957-0 from training. Duration: 0.96 2023-10-04 06:51:45,170 WARNING [train.py:1204] (2/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_1211-226066-0 from training. Duration: 0.76 2023-10-04 06:51:45,308 WARNING [train.py:1204] (2/4) Exclude cut with ID _4779_210_2084_1_1527318022944_3914893_500-285013-0_sp0.9 from training. Duration: 0.7666875 2023-10-04 06:51:45,327 WARNING [train.py:1204] (2/4) Exclude cut with ID _22826_210_9994_1_1531908201522_3279169_347-232268-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 06:51:48,110 WARNING [train.py:1204] (2/4) Exclude cut with ID _29877_210_6228_1_1532397791106_5276149_111-55056-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 06:51:48,140 WARNING [train.py:1204] (2/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_812-271646-0_sp1.1 from training. Duration: 0.92725 2023-10-04 06:51:48,171 WARNING [train.py:1204] (2/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_235-262124-0_sp1.1 from training. Duration: 0.6090625 2023-10-04 06:51:52,247 WARNING [train.py:1204] (2/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_14-16530-0_sp1.1 from training. Duration: 0.97275 2023-10-04 06:51:53,576 WARNING [train.py:1204] (2/4) Exclude cut with ID _44718_210_5816_1_1534050051869_6469030_568-304512-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 06:51:53,607 WARNING [train.py:1204] (2/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_1033-274376-0 from training. Duration: 0.87 2023-10-04 06:51:56,327 WARNING [train.py:1204] (2/4) Exclude cut with ID _4681_210_3525_1_1527251040321_1235713_123-24428-0 from training. Duration: 0.48 2023-10-04 06:52:00,509 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_37818_210_4238_1_1533620910_4720083_251-288764-0_sp1.1 from training. Duration: 0.97275 2023-10-04 06:52:00,581 WARNING [train.py:1204] (2/4) Exclude cut with ID _65585_210_12210_1_1535450451584_3764098_112-294301-0 from training. Duration: 0.88 2023-10-04 06:52:03,831 WARNING [train.py:1204] (2/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_329-113173-0 from training. Duration: 0.62 2023-10-04 06:52:06,056 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=1564026.6666666667, ans=0.1 2023-10-04 06:52:07,038 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6767_210_3273_1_1527855783_1718932_173-256601-0 from training. Duration: 0.8 2023-10-04 06:52:07,332 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.feed_forward2.hidden_balancer.prob, batch_count=1564026.6666666667, ans=0.125 2023-10-04 06:52:09,022 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9266W0001-627677 from training. Duration: 0.896 2023-10-04 06:52:09,042 WARNING [train.py:1204] (2/4) Exclude cut with ID _49643_210_7989_1_1534640244224_3939031_611-185515-0_sp1.1 from training. Duration: 0.8545625 2023-10-04 06:52:09,046 WARNING [train.py:1204] (2/4) Exclude cut with ID _31649_210_6228_1_1532584608063_4091078_687-237771-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 06:52:09,062 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_12868_210_6753_1_1530702479_3610564_148-172861-0_sp0.9 from training. Duration: 0.5555625 2023-10-04 06:52:11,783 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_5129_210_6400_1_1527161005_3579054_107-69738-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 06:52:13,804 WARNING [train.py:1204] (2/4) Exclude cut with ID _40137_210_12210_1_1533290484046_3795180_36-301164-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 06:52:13,833 WARNING [train.py:1204] (2/4) Exclude cut with ID _7537_210_2084_1_1528502395431_3924359_113-199933-0 from training. Duration: 0.59 2023-10-04 06:52:14,000 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.feed_forward2.hidden_balancer.prob, batch_count=1564026.6666666667, ans=0.125 2023-10-04 06:52:16,429 WARNING [train.py:1204] (2/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_547-5424-0_sp1.1 from training. Duration: 0.8545625 2023-10-04 06:52:16,482 WARNING [train.py:1204] (2/4) Exclude cut with ID _43552_210_8913_1_1533726031059_3523429_147-284024-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 06:52:17,835 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_294-163793-0_sp1.1 from training. Duration: 0.7454375 2023-10-04 06:52:19,072 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_3780_210_2087_1_1526467760_2130483_170-259044-0_sp1.1 from training. Duration: 0.636375 2023-10-04 06:52:20,517 WARNING [train.py:1204] (2/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_726-270780-0_sp1.1 from training. Duration: 0.7818125 2023-10-04 06:52:20,610 WARNING [train.py:1204] (2/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_214-291369-0_sp1.1 from training. Duration: 0.57275 2023-10-04 06:52:21,968 WARNING [train.py:1204] (2/4) Exclude cut with ID _21881_210_3525_1_1531825242757_3798380_247-102591-0 from training. Duration: 0.98 2023-10-04 06:52:24,167 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.1.encoder.layers.1.self_attn_weights.whiten_keys, num_groups=4, num_channels=128, metric=2.79 vs. limit=6.0 2023-10-04 06:52:26,075 WARNING [train.py:1204] (2/4) Exclude cut with ID _8064_210_5399_1_1528372299291_4282973_18-174647-0_sp1.1 from training. Duration: 0.82725 2023-10-04 06:52:26,078 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_415-52611-0_sp1.1 from training. Duration: 0.936375 2023-10-04 06:52:26,142 WARNING [train.py:1204] (2/4) Exclude cut with ID _8394_210_6702_1_1528590504316_3540189_379-49457-0_sp0.9 from training. Duration: 0.9666875 2023-10-04 06:52:26,156 WARNING [train.py:1204] (2/4) Exclude cut with ID _64521_210_14699_1_1535243427259_3815328_368-195106-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 06:52:27,531 WARNING [train.py:1204] (2/4) Exclude cut with ID _28394_210_8913_1_1532430049518_3650118_51-334124-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 06:52:29,209 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.balancer2.prob, batch_count=1564093.3333333333, ans=0.125 2023-10-04 06:52:29,211 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.nonlin_attention.balancer.prob, batch_count=1564093.3333333333, ans=0.125 2023-10-04 06:52:30,288 WARNING [train.py:1204] (2/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_317-161769-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 06:52:31,733 WARNING [train.py:1204] (2/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_40-129756-0_sp0.9 from training. Duration: 0.9 2023-10-04 06:52:33,652 WARNING [train.py:1204] (2/4) Exclude cut with ID _3067_210_2667_1_1526212974640_4503168_119-175776-0_sp0.9 from training. Duration: 0.82225 2023-10-04 06:52:33,709 WARNING [train.py:1204] (2/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_1004-304915-0_sp1.1 from training. Duration: 0.92725 2023-10-04 06:52:33,758 WARNING [train.py:1204] (2/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_359-118926-0_sp0.9 from training. Duration: 0.8 2023-10-04 06:52:34,987 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.665e+02 2.097e+02 2.346e+02 2.719e+02 4.087e+02, threshold=4.692e+02, percent-clipped=0.0 2023-10-04 06:52:41,637 WARNING [train.py:1204] (2/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_51-200220-0_sp0.9 from training. Duration: 0.62225 2023-10-04 06:52:43,480 WARNING [train.py:1204] (2/4) Exclude cut with ID _46683_210_9642_1_1533730913492_4591116_530-326202-0_sp1.1 from training. Duration: 0.936375 2023-10-04 06:52:43,531 WARNING [train.py:1204] (2/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_567-135114-0 from training. Duration: 0.87 2023-10-04 06:52:43,595 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder_embed.dropout.p, batch_count=1564160.0, ans=0.1 2023-10-04 06:52:44,761 WARNING [train.py:1204] (2/4) Exclude cut with ID _66863_210_18053_1_1535698758334_8590319_1092-271784-0_sp1.1 from training. Duration: 0.87275 2023-10-04 06:52:44,778 WARNING [train.py:1204] (2/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_762-282614-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 06:52:46,190 WARNING [train.py:1204] (2/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_280-120942-0 from training. Duration: 0.77 2023-10-04 06:52:48,994 INFO [train.py:1046] (2/4) Epoch 45, batch 900, loss[loss=0.1975, simple_loss=0.2682, pruned_loss=0.06337, over 19639.00 frames. ], tot_loss[loss=0.1551, simple_loss=0.2359, pruned_loss=0.03718, over 4665956.68 frames. ], batch size: 388, lr: 2.26e-03, grad_scale: 32.0 2023-10-04 06:52:51,822 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_31583_210_3852_1_1532595851_4024413_27-170931-0_sp1.1 from training. Duration: 0.963625 2023-10-04 06:52:54,606 WARNING [train.py:1204] (2/4) Exclude cut with ID _12369_210_4381_1_1530356078581_4388520_266-271628-0_sp1.1 from training. Duration: 0.92725 2023-10-04 06:52:54,655 WARNING [train.py:1204] (2/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_41-31157-0 from training. Duration: 0.57 2023-10-04 06:52:56,345 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.attention_skip_rate, batch_count=1564226.6666666667, ans=0.0 2023-10-04 06:52:57,531 WARNING [train.py:1204] (2/4) Exclude cut with ID _21878_210_3525_1_1531738772002_7264330_113-18163-0_sp1.1 from training. Duration: 0.8090625 2023-10-04 06:52:57,578 WARNING [train.py:1204] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_147-111137-0 from training. Duration: 0.95 2023-10-04 06:52:58,942 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1083-220122-0_sp1.1 from training. Duration: 0.463625 2023-10-04 06:53:00,290 WARNING [train.py:1204] (2/4) Exclude cut with ID _14884_210_2252_1_1530707459689_5561250_591-173406-0_sp1.1 from training. Duration: 0.87275 2023-10-04 06:53:00,767 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.4.encoder.layers.0.whiten, num_groups=1, num_channels=384, metric=5.43 vs. limit=12.0 2023-10-04 06:53:01,520 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_24677_210_1794_1_1532002693_4341296_149-303793-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 06:53:01,568 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_12868_210_6753_1_1530702479_3610564_133-564-0_sp1.1 from training. Duration: 0.7181875 2023-10-04 06:53:01,605 WARNING [train.py:1204] (2/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_625-338546-0_sp0.9 from training. Duration: 0.97775 2023-10-04 06:53:12,806 WARNING [train.py:1204] (2/4) Exclude cut with ID _25882_210_3036_1_1532775665815_6479510_624-320169-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 06:53:12,812 WARNING [train.py:1204] (2/4) Exclude cut with ID _20622_210_3852_1_1531622427080_1093839_127-325326-0_sp1.1 from training. Duration: 0.92725 2023-10-04 06:53:14,107 WARNING [train.py:1204] (2/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_464-33931-0_sp1.1 from training. Duration: 0.4909375 2023-10-04 06:53:16,203 WARNING [train.py:1204] (2/4) Exclude cut with ID _66112_210_10635_1_1535590236305_3801484_120-304588-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 06:53:20,376 WARNING [train.py:1204] (2/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_110-130961-0 from training. Duration: 0.47 2023-10-04 06:53:23,157 WARNING [train.py:1204] (2/4) Exclude cut with ID _16908_210_5724_1_1531119157248_3772700_64-54020-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 06:53:25,917 WARNING [train.py:1204] (2/4) Exclude cut with ID _4068_210_2629_1_1526697350395_1546259_224-288693-0_sp1.1 from training. Duration: 0.536375 2023-10-04 06:53:27,315 WARNING [train.py:1204] (2/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_191-41023-0_sp0.9 from training. Duration: 0.788875 2023-10-04 06:53:27,382 WARNING [train.py:1204] (2/4) Exclude cut with ID ID0205W0255-783002 from training. Duration: 0.958 2023-10-04 06:53:28,730 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_162-262717-0 from training. Duration: 0.83 2023-10-04 06:53:34,965 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_224-322394-0_sp1.1 from training. Duration: 0.6 2023-10-04 06:53:34,972 WARNING [train.py:1204] (2/4) Exclude cut with ID _4866_210_2577_1_1527404412284_4364659_133-1696-0_sp1.1 from training. Duration: 0.9 2023-10-04 06:53:35,029 WARNING [train.py:1204] (2/4) Exclude cut with ID _6275_210_2084_1_1528110062213_7669049_973-198085-0_sp1.1 from training. Duration: 0.6818125 2023-10-04 06:53:39,727 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_47449_210_5399_1_1534076445_4006929_410-237806-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 06:53:39,736 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7543_210_6753_1_1528543318_4552_1-85072-0_sp0.9 from training. Duration: 0.92225 2023-10-04 06:53:42,814 WARNING [train.py:1204] (2/4) Exclude cut with ID _65263_210_22356_1_1535763598691_3921037_108-21902-0 from training. Duration: 0.99 2023-10-04 06:53:42,819 WARNING [train.py:1204] (2/4) Exclude cut with ID _65585_210_12210_1_1535450451584_3764098_309-174818-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 06:53:44,267 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_309-65177-0 from training. Duration: 0.88 2023-10-04 06:53:45,627 WARNING [train.py:1204] (2/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_363-185703-0_sp1.1 from training. Duration: 0.863625 2023-10-04 06:53:47,458 WARNING [train.py:1204] (2/4) Exclude cut with ID _42326_210_7770_1_1533814282531_3707269_442-238692-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 06:53:48,869 WARNING [train.py:1204] (2/4) Exclude cut with ID _69884_210_9771_1_1535846392818_4166169_226-254446-0_sp1.1 from training. Duration: 0.936375 2023-10-04 06:53:48,897 WARNING [train.py:1204] (2/4) Exclude cut with ID _29853_210_7497_1_1532505600851_6642110_287-299903-0_sp1.1 from training. Duration: 0.963625 2023-10-04 06:53:51,714 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1015-343884-0 from training. Duration: 0.62 2023-10-04 06:53:51,743 WARNING [train.py:1204] (2/4) Exclude cut with ID ID0305W0008-833402 from training. Duration: 0.91 2023-10-04 06:53:53,161 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6673_210_3273_1_1527767790_3173240_351-167954-0_sp1.1 from training. Duration: 0.436375 2023-10-04 06:53:54,437 WARNING [train.py:1204] (2/4) Exclude cut with ID _6456_210_1534_1_1527681634212_3181049_345-252877-0 from training. Duration: 0.64 2023-10-04 06:53:57,208 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_13-205182-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 06:53:59,915 WARNING [train.py:1204] (2/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_427-208901-0 from training. Duration: 0.68 2023-10-04 06:54:03,118 INFO [train.py:1046] (2/4) Epoch 45, batch 950, loss[loss=0.1436, simple_loss=0.221, pruned_loss=0.03312, over 16669.00 frames. ], tot_loss[loss=0.1555, simple_loss=0.2361, pruned_loss=0.03742, over 4662546.91 frames. ], batch size: 36, lr: 2.26e-03, grad_scale: 16.0 2023-10-04 06:54:07,659 WARNING [train.py:1204] (2/4) Exclude cut with ID _49039_210_6315_1_1533967537541_3846118_9-148921-0_sp1.1 from training. Duration: 0.97275 2023-10-04 06:54:09,165 WARNING [train.py:1204] (2/4) Exclude cut with ID _59493_210_17282_1_1535078419077_3225879_143-70317-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 06:54:09,202 WARNING [train.py:1204] (2/4) Exclude cut with ID _27967_210_12140_1_1532588391859_8419130_1056-236369-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 06:54:10,475 WARNING [train.py:1204] (2/4) Exclude cut with ID _6537_210_1883_1_1527987559710_3597129_382-133062-0_sp0.9 from training. Duration: 0.7555625 2023-10-04 06:54:11,571 INFO [scaling.py:1022] (2/4) Whitening: name=encoder_embed.convnext.out_whiten, num_groups=1, num_channels=128, metric=4.66 vs. limit=5.0 2023-10-04 06:54:12,444 WARNING [train.py:1204] (2/4) Exclude cut with ID ID0376W0255-869957 from training. Duration: 0.9510625 2023-10-04 06:54:16,489 WARNING [train.py:1204] (2/4) Exclude cut with ID _49371_210_12071_1_1533952761021_3812190_483-240691-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 06:54:16,571 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_12868_210_6753_1_1530701932_530275_18-76322-0_sp1.1 from training. Duration: 0.7909375 2023-10-04 06:54:17,897 WARNING [train.py:1204] (2/4) Exclude cut with ID _53950_210_5816_1_1534417264919_3862951_367-26344-0_sp1.1 from training. Duration: 0.97275 2023-10-04 06:54:17,918 WARNING [train.py:1204] (2/4) Exclude cut with ID _5444_210_2151_1_1527418808464_5312210_634-194174-0_sp1.1 from training. Duration: 0.8818125 2023-10-04 06:54:17,942 WARNING [train.py:1204] (2/4) Exclude cut with ID _56494_210_4220_1_1535284837625_3033169_69-138150-0 from training. Duration: 0.94 2023-10-04 06:54:21,010 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7543_210_6753_1_1528544010_1110402_25-183860-0_sp0.9 from training. Duration: 0.52225 2023-10-04 06:54:21,119 WARNING [train.py:1204] (2/4) Exclude cut with ID _40284_210_2084_1_1533513679814_3704279_287-191918-0_sp1.1 from training. Duration: 0.963625 2023-10-04 06:54:22,014 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.0.layers.0.feed_forward3.out_whiten, num_groups=1, num_channels=192, metric=12.52 vs. limit=15.0 2023-10-04 06:54:23,742 WARNING [train.py:1204] (2/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_291-270881-0 from training. Duration: 0.97 2023-10-04 06:54:25,035 WARNING [train.py:1204] (2/4) Exclude cut with ID _12555_210_8750_1_1530075594402_3780239_449-267986-0_sp0.9 from training. Duration: 0.92225 2023-10-04 06:54:28,082 WARNING [train.py:1204] (2/4) Exclude cut with ID _3992_210_1747_1_1526644816031_3731060_268-64520-0_sp1.1 from training. Duration: 0.963625 2023-10-04 06:54:28,089 WARNING [train.py:1204] (2/4) Exclude cut with ID _39411_210_5986_1_1533538825030_3343649_173-238248-0_sp0.9 from training. Duration: 0.92225 2023-10-04 06:54:28,120 WARNING [train.py:1204] (2/4) Exclude cut with ID _32704_210_8954_1_1533017488840_6747370_828-313097-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 06:54:29,508 WARNING [train.py:1204] (2/4) Exclude cut with ID _26907_210_7234_1_1532221740694_3205263_337-2494-0 from training. Duration: 0.88 2023-10-04 06:54:30,829 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_229-45300-0_sp1.1 from training. Duration: 0.5181875 2023-10-04 06:54:32,243 WARNING [train.py:1204] (2/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_300-27235-0_sp1.1 from training. Duration: 0.7909375 2023-10-04 06:54:33,649 WARNING [train.py:1204] (2/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_48-338976-0_sp1.1 from training. Duration: 0.8090625 2023-10-04 06:54:36,833 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.3.encoder.layers.2.conv_module1.whiten, num_groups=1, num_channels=512, metric=6.41 vs. limit=15.0 2023-10-04 06:54:37,681 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.feed_forward1.hidden_balancer.prob, batch_count=1564693.3333333333, ans=0.125 2023-10-04 06:54:40,189 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_13091_210_7925_1_1530609447_8987354_792-195353-0_sp1.1 from training. Duration: 0.936375 2023-10-04 06:54:40,195 WARNING [train.py:1204] (2/4) Exclude cut with ID _56524_210_9642_1_1534572011779_3619170_311-54500-0_sp1.1 from training. Duration: 0.97275 2023-10-04 06:54:44,571 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7543_210_6753_1_1528544010_1110402_25-183860-0 from training. Duration: 0.47 2023-10-04 06:54:44,704 WARNING [train.py:1204] (2/4) Exclude cut with ID 2411-132532-0017-82279-0_sp1.1 from training. Duration: 0.9681875 2023-10-04 06:54:44,705 WARNING [train.py:1204] (2/4) Exclude cut with ID _14170_210_5219_1_1530525949868_4469650_272-249985-0_sp0.9 from training. Duration: 0.7444375 2023-10-04 06:54:46,049 WARNING [train.py:1204] (2/4) Exclude cut with ID _55644_210_11856_1_1534593599582_4103719_320-229006-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 06:54:47,411 WARNING [train.py:1204] (2/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_177-100554-0_sp1.1 from training. Duration: 0.963625 2023-10-04 06:54:47,413 WARNING [train.py:1204] (2/4) Exclude cut with ID _14998_210_5219_1_1530705582080_7474970_787-294416-0_sp1.1 from training. Duration: 0.7454375 2023-10-04 06:54:52,100 WARNING [train.py:1204] (2/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_318-59097-0 from training. Duration: 0.76 2023-10-04 06:54:53,420 WARNING [train.py:1204] (2/4) Exclude cut with ID _12369_210_4381_1_1530356078581_4388520_480-13146-0_sp1.1 from training. Duration: 0.77275 2023-10-04 06:54:56,141 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_469-338558-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 06:54:57,486 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_40896_210_15710_1_1533433956_7100802_367-126897-0_sp1.1 from training. Duration: 0.963625 2023-10-04 06:54:57,503 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6545_210_2040_1_1527774670_4245258_379-14182-0 from training. Duration: 0.8 2023-10-04 06:54:57,515 WARNING [train.py:1204] (2/4) Exclude cut with ID _21631_210_2253_1_1531907880778_3160339_197-79498-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 06:54:57,519 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_12868_210_6753_1_1530702479_3610564_416-293746-0_sp1.1 from training. Duration: 0.8181875 2023-10-04 06:54:58,859 WARNING [train.py:1204] (2/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_52-211683-0 from training. Duration: 0.9 2023-10-04 06:55:01,881 WARNING [train.py:1204] (2/4) Exclude cut with ID _48678_210_5663_1_1533988830947_3769199_306-255886-0_sp0.9 from training. Duration: 0.8555625 2023-10-04 06:55:03,496 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.bypass.scale_min, batch_count=1564826.6666666667, ans=0.2 2023-10-04 06:55:04,546 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.543e+02 1.998e+02 2.203e+02 2.459e+02 3.275e+02, threshold=4.407e+02, percent-clipped=0.0 2023-10-04 06:55:04,702 WARNING [train.py:1204] (2/4) Exclude cut with ID _65134_210_14763_1_1535335393985_3505368_296-24733-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 06:55:07,637 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.1.conv_module2.balancer1.max_abs, batch_count=1564826.6666666667, ans=10.0 2023-10-04 06:55:11,274 WARNING [train.py:1204] (2/4) Exclude cut with ID _14898_210_9206_1_1530766812377_4177190_335-99364-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 06:55:12,577 WARNING [train.py:1204] (2/4) Exclude cut with ID _54871_210_22356_1_1534665798253_3856359_19-342632-0 from training. Duration: 0.91 2023-10-04 06:55:12,603 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_53071_210_16554_1_1534490405_2954066_265-170472-0 from training. Duration: 0.83 2023-10-04 06:55:12,886 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.conv_module1.balancer2.prob, batch_count=1564826.6666666667, ans=0.125 2023-10-04 06:55:15,478 WARNING [train.py:1204] (2/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_189-298172-0_sp1.1 from training. Duration: 0.963625 2023-10-04 06:55:17,329 INFO [train.py:1046] (2/4) Epoch 45, batch 1000, loss[loss=0.1527, simple_loss=0.2354, pruned_loss=0.03495, over 24662.00 frames. ], tot_loss[loss=0.1546, simple_loss=0.2351, pruned_loss=0.03706, over 4678914.07 frames. ], batch size: 65, lr: 2.26e-03, grad_scale: 16.0 2023-10-04 06:55:17,629 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.nonlin_attention.balancer.prob, batch_count=1564893.3333333333, ans=0.125 2023-10-04 06:55:20,269 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7615_210_4211_1_1528110965_4580160_72-235211-0 from training. Duration: 0.76 2023-10-04 06:55:20,313 WARNING [train.py:1204] (2/4) Exclude cut with ID _47790_210_16187_1_1533862814066_4207519_155-90874-0_sp1.1 from training. Duration: 0.936375 2023-10-04 06:55:25,134 WARNING [train.py:1204] (2/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_164-216310-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 06:55:26,524 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_46013_210_7234_1_1533776362_3744032_463-52378-0 from training. Duration: 0.86 2023-10-04 06:55:26,528 WARNING [train.py:1204] (2/4) Exclude cut with ID _61133_210_9969_1_1535196777227_3696409_333-270325-0 from training. Duration: 0.93 2023-10-04 06:55:30,628 WARNING [train.py:1204] (2/4) Exclude cut with ID _5172_210_1883_1_1527246062567_3577219_18-101380-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 06:55:30,636 WARNING [train.py:1204] (2/4) Exclude cut with ID _30800_210_22382_1_1532499347904_4515959_577-236858-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 06:55:32,138 WARNING [train.py:1204] (2/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_1006-177345-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 06:55:34,906 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6291_210_3273_1_1527683649_1078498_4-266348-0 from training. Duration: 0.78 2023-10-04 06:55:37,643 WARNING [train.py:1204] (2/4) Exclude cut with ID _55857_210_2346_1_1534644007831_6898209_745-98391-0 from training. Duration: 0.76 2023-10-04 06:55:39,634 WARNING [train.py:1204] (2/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_375-244976-0 from training. Duration: 0.75 2023-10-04 06:55:40,751 WARNING [train.py:1204] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_4-248404-0_sp1.1 from training. Duration: 0.97275 2023-10-04 06:55:44,127 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8453_210_4211_1_1528502940_8562730_596-306820-0 from training. Duration: 0.97 2023-10-04 06:55:45,509 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_211-339622-0_sp0.9 from training. Duration: 0.5 2023-10-04 06:55:45,536 WARNING [train.py:1204] (2/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_393-57888-0 from training. Duration: 0.64 2023-10-04 06:55:47,610 WARNING [train.py:1204] (2/4) Exclude cut with ID _23825_210_13449_1_1531915313526_3497150_143-343575-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 06:55:47,673 WARNING [train.py:1204] (2/4) Exclude cut with ID _58605_210_12210_1_1534723195939_3904259_36-12256-0_sp1.1 from training. Duration: 0.936375 2023-10-04 06:55:57,811 WARNING [train.py:1204] (2/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_38-67795-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 06:55:59,209 WARNING [train.py:1204] (2/4) Exclude cut with ID _64631_210_27767_1_1535349580068_4165160_308-327832-0_sp1.1 from training. Duration: 0.92725 2023-10-04 06:55:59,265 WARNING [train.py:1204] (2/4) Exclude cut with ID _37086_210_9038_1_1532999540891_7021250_726-81176-0_sp1.1 from training. Duration: 0.936375 2023-10-04 06:56:00,612 WARNING [train.py:1204] (2/4) Exclude cut with ID _47790_210_16187_1_1533862814066_4207519_47-141521-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 06:56:00,617 WARNING [train.py:1204] (2/4) Exclude cut with ID _3067_210_2667_1_1526212974640_4503168_250-285937-0 from training. Duration: 0.8 2023-10-04 06:56:00,642 WARNING [train.py:1204] (2/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_239-24000-0_sp1.1 from training. Duration: 0.97275 2023-10-04 06:56:01,918 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_3371_210_1794_1_1526384462_5580777_396-339812-0_sp0.9 from training. Duration: 0.9444375 2023-10-04 06:56:01,957 WARNING [train.py:1204] (2/4) Exclude cut with ID _16909_210_5724_1_1531292079912_4118919_580-173454-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 06:56:02,014 WARNING [train.py:1204] (2/4) Exclude cut with ID IC0124W0002-61308 from training. Duration: 0.982 2023-10-04 06:56:03,659 WARNING [train.py:1204] (2/4) Exclude cut with ID _31602_210_5854_1_1532677021456_4458250_249-96604-0 from training. Duration: 0.89 2023-10-04 06:56:05,033 WARNING [train.py:1204] (2/4) Exclude cut with ID _69584_210_5219_1_1535785133775_4044450_325-7656-0 from training. Duration: 0.86 2023-10-04 06:56:07,662 WARNING [train.py:1204] (2/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_478-162247-0 from training. Duration: 0.82 2023-10-04 06:56:10,782 WARNING [train.py:1204] (2/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_686-54383-0_sp1.1 from training. Duration: 0.7909375 2023-10-04 06:56:16,873 WARNING [train.py:1204] (2/4) Exclude cut with ID _29303_210_2575_1_1532392237303_7410649_85-270092-0_sp1.1 from training. Duration: 0.936375 2023-10-04 06:56:17,071 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.conv_skip_rate, batch_count=1565160.0, ans=0.0 2023-10-04 06:56:18,216 WARNING [train.py:1204] (2/4) Exclude cut with ID _2322_210_2667_1_1525604548971_3656299_440-80494-0_sp1.1 from training. Duration: 0.8 2023-10-04 06:56:18,252 WARNING [train.py:1204] (2/4) Exclude cut with ID _47346_210_9476_1_1533815971553_3536160_201-40644-0_sp1.1 from training. Duration: 0.936375 2023-10-04 06:56:18,339 WARNING [train.py:1204] (2/4) Exclude cut with ID _56235_210_10640_1_1534557335235_4161099_343-139898-0_sp1.1 from training. Duration: 0.87275 2023-10-04 06:56:20,438 WARNING [train.py:1204] (2/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_1012-279473-0 from training. Duration: 0.67 2023-10-04 06:56:21,818 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7931_210_1794_1_1528594728_5564059_280-279560-0_sp1.1 from training. Duration: 0.8909375 2023-10-04 06:56:21,871 WARNING [train.py:1204] (2/4) Exclude cut with ID _8263_210_1664_1_1528439452891_970220_68-181805-0 from training. Duration: 0.64 2023-10-04 06:56:23,248 WARNING [train.py:1204] (2/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_605-133395-0 from training. Duration: 0.66 2023-10-04 06:56:24,741 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_43-171109-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 06:56:24,751 WARNING [train.py:1204] (2/4) Exclude cut with ID _40094_210_16380_1_1533294005773_3641159_107-280739-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 06:56:25,349 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.3.encoder.layers.3.feed_forward2.out_whiten, num_groups=1, num_channels=512, metric=13.45 vs. limit=15.0 2023-10-04 06:56:28,035 WARNING [train.py:1204] (2/4) Exclude cut with ID _14821_210_8750_1_1530680503991_6829359_2-227151-0_sp0.9 from training. Duration: 0.92225 2023-10-04 06:56:29,504 WARNING [train.py:1204] (2/4) Exclude cut with ID _14268_210_9206_1_1530622771285_4714099_331-105351-0_sp1.1 from training. Duration: 0.5909375 2023-10-04 06:56:30,859 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_26326_210_7925_1_1533002493_3592406_451-82741-0_sp1.1 from training. Duration: 0.97275 2023-10-04 06:56:32,346 INFO [train.py:1046] (2/4) Epoch 45, batch 1050, loss[loss=0.1466, simple_loss=0.2184, pruned_loss=0.0374, over 23947.00 frames. ], tot_loss[loss=0.1536, simple_loss=0.2337, pruned_loss=0.03672, over 4683044.25 frames. ], batch size: 212, lr: 2.26e-03, grad_scale: 8.0 2023-10-04 06:56:33,868 WARNING [train.py:1204] (2/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_191-59784-0_sp0.9 from training. Duration: 0.97775 2023-10-04 06:56:35,147 WARNING [train.py:1204] (2/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_445-117398-0_sp1.1 from training. Duration: 0.8090625 2023-10-04 06:56:36,528 WARNING [train.py:1204] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_605-257205-0_sp0.9 from training. Duration: 0.6555625 2023-10-04 06:56:37,886 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_50050_210_16223_1_1535435796_7290138_592-258841-0_sp1.1 from training. Duration: 0.936375 2023-10-04 06:56:39,907 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.3.encoder.layers.0.self_attn1.whiten, num_groups=1, num_channels=512, metric=13.80 vs. limit=22.5 2023-10-04 06:56:40,985 WARNING [train.py:1204] (2/4) Exclude cut with ID _14348_210_8341_1_1530599434996_3778640_240-232370-0_sp0.9 from training. Duration: 0.9333125 2023-10-04 06:56:42,404 WARNING [train.py:1204] (2/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_239-316802-0_sp0.9 from training. Duration: 0.7666875 2023-10-04 06:56:44,394 WARNING [train.py:1204] (2/4) Exclude cut with ID _45560_210_21982_1_1533812603951_3584999_111-6376-0_sp1.1 from training. Duration: 0.763625 2023-10-04 06:56:45,919 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.0.balancer2.prob, batch_count=1565293.3333333333, ans=0.125 2023-10-04 06:56:47,084 WARNING [train.py:1204] (2/4) Exclude cut with ID _54189_210_16380_1_1534332670039_3911710_606-311481-0_sp1.1 from training. Duration: 0.963625 2023-10-04 06:56:47,150 WARNING [train.py:1204] (2/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_237-332027-0_sp1.1 from training. Duration: 0.863625 2023-10-04 06:56:47,183 WARNING [train.py:1204] (2/4) Exclude cut with ID _66070_210_7640_1_1535441393096_7805310_489-292631-0_sp0.9 from training. Duration: 0.87775 2023-10-04 06:56:48,495 WARNING [train.py:1204] (2/4) Exclude cut with ID _49728_210_12651_1_1534041034294_6788150_624-195301-0_sp1.1 from training. Duration: 0.77275 2023-10-04 06:56:50,290 WARNING [train.py:1204] (2/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_44-79427-0 from training. Duration: 0.98 2023-10-04 06:56:50,344 WARNING [train.py:1204] (2/4) Exclude cut with ID _35350_210_16040_1_1533040191183_4075299_490-198354-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 06:56:50,398 WARNING [train.py:1204] (2/4) Exclude cut with ID _5166_210_2084_1_1527073197160_4087993_309-222545-0 from training. Duration: 0.84 2023-10-04 06:56:51,844 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_13091_210_7925_1_1530609447_8987354_790-199469-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 06:56:51,850 WARNING [train.py:1204] (2/4) Exclude cut with ID _21632_210_2253_1_1531994001765_3744140_110-274654-0 from training. Duration: 0.82 2023-10-04 06:56:53,161 WARNING [train.py:1204] (2/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_498-40153-0_sp0.9 from training. Duration: 0.7 2023-10-04 06:56:59,208 WARNING [train.py:1204] (2/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_363-282909-0_sp1.1 from training. Duration: 0.936375 2023-10-04 06:57:00,487 WARNING [train.py:1204] (2/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_178-177582-0_sp0.9 from training. Duration: 0.82225 2023-10-04 06:57:00,498 WARNING [train.py:1204] (2/4) Exclude cut with ID _24612_210_8913_1_1531998054193_3806359_357-35487-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 06:57:01,975 WARNING [train.py:1204] (2/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_138-332773-0 from training. Duration: 0.81 2023-10-04 06:57:01,996 WARNING [train.py:1204] (2/4) Exclude cut with ID _7624_210_1531_1_1528109802198_4933101_418-349834-0 from training. Duration: 0.6 2023-10-04 06:57:02,019 WARNING [train.py:1204] (2/4) Exclude cut with ID _7537_210_2084_1_1528502395431_3924359_14-312906-0_sp0.9 from training. Duration: 0.9333125 2023-10-04 06:57:06,114 WARNING [train.py:1204] (2/4) Exclude cut with ID _60057_210_10640_1_1534903065793_3333150_362-230377-0 from training. Duration: 0.87 2023-10-04 06:57:08,752 WARNING [train.py:1204] (2/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_138-64210-0 from training. Duration: 0.73 2023-10-04 06:57:08,821 WARNING [train.py:1204] (2/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_454-339504-0_sp1.1 from training. Duration: 0.97275 2023-10-04 06:57:13,634 WARNING [train.py:1204] (2/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_508-335369-0_sp0.9 from training. Duration: 0.5333125 2023-10-04 06:57:15,093 WARNING [train.py:1204] (2/4) Exclude cut with ID _5608_210_2106_1_1527415439089_6819768_909-82767-0_sp0.9 from training. Duration: 0.67775 2023-10-04 06:57:16,854 WARNING [train.py:1204] (2/4) Exclude cut with ID _6694_210_4599_1_1527852637490_3965529_99-11611-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 06:57:16,914 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_2655_210_2087_1_1525688959_3897400_52-279899-0_sp1.1 from training. Duration: 0.62725 2023-10-04 06:57:17,785 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.conv_module1.balancer2.prob, batch_count=1565426.6666666667, ans=0.125 2023-10-04 06:57:21,548 WARNING [train.py:1204] (2/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_375-245677-0_sp1.1 from training. Duration: 0.736375 2023-10-04 06:57:24,438 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_23850_210_4238_1_1532232597_1632433_32-185135-0 from training. Duration: 0.86 2023-10-04 06:57:25,961 WARNING [train.py:1204] (2/4) Exclude cut with ID _15113_210_8254_1_1530842438579_3716279_209-54533-0 from training. Duration: 0.96 2023-10-04 06:57:27,671 WARNING [train.py:1204] (2/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_401-53423-0 from training. Duration: 0.55 2023-10-04 06:57:27,713 WARNING [train.py:1204] (2/4) Exclude cut with ID _14867_210_1968_1_1530684179890_3846969_319-250868-0_sp1.1 from training. Duration: 0.9 2023-10-04 06:57:27,728 WARNING [train.py:1204] (2/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_106-30782-0_sp0.9 from training. Duration: 0.888875 2023-10-04 06:57:29,324 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=1565426.6666666667, ans=0.1 2023-10-04 06:57:30,359 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_5960_210_1794_1_1527403865_3990999_515-2613-0 from training. Duration: 0.88 2023-10-04 06:57:33,360 WARNING [train.py:1204] (2/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_798-106179-0_sp0.9 from training. Duration: 0.92225 2023-10-04 06:57:34,843 WARNING [train.py:1204] (2/4) Exclude cut with ID _14227_210_4381_1_1530701804286_3940259_125-275642-0_sp1.1 from training. Duration: 0.9 2023-10-04 06:57:34,845 WARNING [train.py:1204] (2/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_79-200392-0_sp1.1 from training. Duration: 0.8818125 2023-10-04 06:57:34,890 WARNING [train.py:1204] (2/4) Exclude cut with ID _48836_210_12210_1_1534075848293_3904150_429-35475-0_sp0.9 from training. Duration: 0.988875 2023-10-04 06:57:36,079 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.603e+02 2.013e+02 2.292e+02 2.767e+02 4.836e+02, threshold=4.583e+02, percent-clipped=4.0 2023-10-04 06:57:36,156 WARNING [train.py:1204] (2/4) Exclude cut with ID _69884_210_9771_1_1535846392818_4166169_224-172517-0_sp1.1 from training. Duration: 0.97275 2023-10-04 06:57:38,999 WARNING [train.py:1204] (2/4) Exclude cut with ID _15113_210_8254_1_1530842438579_3716279_76-232507-0_sp1.1 from training. Duration: 0.97275 2023-10-04 06:57:39,001 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_135-5501-0 from training. Duration: 0.98 2023-10-04 06:57:40,472 WARNING [train.py:1204] (2/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_563-347832-0_sp0.9 from training. Duration: 0.988875 2023-10-04 06:57:40,483 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6786_210_1388_1_1527839699_4194495_273-149841-0 from training. Duration: 0.92 2023-10-04 06:57:40,503 WARNING [train.py:1204] (2/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_172-215932-0 from training. Duration: 0.66 2023-10-04 06:57:41,835 WARNING [train.py:1204] (2/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_939-132910-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 06:57:45,222 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.conv_module1.balancer1.prob, batch_count=1565560.0, ans=0.125 2023-10-04 06:57:46,713 INFO [train.py:1046] (2/4) Epoch 45, batch 1100, loss[loss=0.1547, simple_loss=0.21, pruned_loss=0.04968, over 19472.00 frames. ], tot_loss[loss=0.1531, simple_loss=0.2337, pruned_loss=0.0362, over 4703966.05 frames. ], batch size: 388, lr: 2.26e-03, grad_scale: 8.0 2023-10-04 06:57:46,757 WARNING [train.py:1204] (2/4) Exclude cut with ID _49371_210_12071_1_1533952761021_3812190_16-246001-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 06:57:51,516 WARNING [train.py:1204] (2/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_680-189165-0_sp1.1 from training. Duration: 0.936375 2023-10-04 06:57:54,892 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.3.encoder.layers.1.self_attn1.whiten, num_groups=1, num_channels=512, metric=14.76 vs. limit=22.5 2023-10-04 06:57:55,562 WARNING [train.py:1204] (2/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_53-300544-0_sp1.1 from training. Duration: 0.6090625 2023-10-04 06:57:57,481 WARNING [train.py:1204] (2/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_540-275685-0_sp1.1 from training. Duration: 0.8090625 2023-10-04 06:57:57,519 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_38965_210_4852_1_1533174906_3484735_453-106579-0_sp1.1 from training. Duration: 0.92725 2023-10-04 06:57:58,808 WARNING [train.py:1204] (2/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_105-328174-0 from training. Duration: 0.76 2023-10-04 06:58:00,253 WARNING [train.py:1204] (2/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_279-126205-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 06:58:02,802 WARNING [train.py:1204] (2/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_5-55893-0_sp0.9 from training. Duration: 0.611125 2023-10-04 06:58:04,263 WARNING [train.py:1204] (2/4) Exclude cut with ID _13678_210_7497_1_1530530069226_3463170_141-10835-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 06:58:06,910 WARNING [train.py:1204] (2/4) Exclude cut with ID _22083_210_8254_1_1531702862439_3678970_201-85738-0_sp1.1 from training. Duration: 0.8181875 2023-10-04 06:58:06,937 WARNING [train.py:1204] (2/4) Exclude cut with ID _4697_210_2069_1_1526815801953_3823098_51-1049-0 from training. Duration: 0.75 2023-10-04 06:58:08,281 WARNING [train.py:1204] (2/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_206-10097-0_sp0.9 from training. Duration: 0.5444375 2023-10-04 06:58:09,712 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_4528_210_3273_1_1527076654_3276777_360-63661-0_sp1.1 from training. Duration: 0.92725 2023-10-04 06:58:09,721 WARNING [train.py:1204] (2/4) Exclude cut with ID _64086_210_7994_1_1535270417019_7435949_449-31567-0_sp1.1 from training. Duration: 0.8818125 2023-10-04 06:58:11,434 WARNING [train.py:1204] (2/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_245-174553-0_sp0.9 from training. Duration: 0.97775 2023-10-04 06:58:12,855 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6545_210_2040_1_1527774670_4245258_379-14182-0_sp1.1 from training. Duration: 0.72725 2023-10-04 06:58:17,299 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_3133_210_2087_1_1526106446_2324690_256-206070-0_sp1.1 from training. Duration: 0.963625 2023-10-04 06:58:20,590 WARNING [train.py:1204] (2/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_278-104676-0 from training. Duration: 0.98 2023-10-04 06:58:22,082 WARNING [train.py:1204] (2/4) Exclude cut with ID IC0671W0065-331908 from training. Duration: 0.896 2023-10-04 06:58:22,137 WARNING [train.py:1204] (2/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_244-129917-0_sp1.1 from training. Duration: 0.97275 2023-10-04 06:58:24,801 WARNING [train.py:1204] (2/4) Exclude cut with ID _7852_210_5219_1_1528286471316_8139400_1129-170857-0_sp1.1 from training. Duration: 0.97275 2023-10-04 06:58:26,641 WARNING [train.py:1204] (2/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_443-30034-0_sp0.9 from training. Duration: 0.711125 2023-10-04 06:58:26,687 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_31583_210_3852_1_1532595851_4024413_250-332999-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 06:58:28,076 WARNING [train.py:1204] (2/4) Exclude cut with ID _8772_210_1850_1_1528635595972_3664011_247-336448-0 from training. Duration: 0.74 2023-10-04 06:58:28,130 WARNING [train.py:1204] (2/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_567-135114-0_sp0.9 from training. Duration: 0.9666875 2023-10-04 06:58:28,134 WARNING [train.py:1204] (2/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_557-349534-0_sp1.1 from training. Duration: 0.82725 2023-10-04 06:58:28,146 WARNING [train.py:1204] (2/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_1341-26728-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 06:58:29,500 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_40222_210_6228_1_1533385298_4481939_375-328051-0_sp1.1 from training. Duration: 0.97275 2023-10-04 06:58:29,521 WARNING [train.py:1204] (2/4) Exclude cut with ID _4697_210_2069_1_1526815801953_3823098_276-96593-0 from training. Duration: 0.77 2023-10-04 06:58:35,050 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.0.feed_forward3.hidden_balancer.prob, batch_count=1565760.0, ans=0.125 2023-10-04 06:58:36,271 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_53071_210_16554_1_1534490405_2954066_265-170472-0_sp0.9 from training. Duration: 0.92225 2023-10-04 06:58:36,289 WARNING [train.py:1204] (2/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_365-1492-0 from training. Duration: 0.91 2023-10-04 06:58:37,686 WARNING [train.py:1204] (2/4) Exclude cut with ID _66873_210_5517_1_1535454136506_4293370_654-52713-0_sp0.9 from training. Duration: 0.8555625 2023-10-04 06:58:39,444 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.conv_skip_rate, batch_count=1565760.0, ans=0.0 2023-10-04 06:58:40,647 WARNING [train.py:1204] (2/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_113-92608-0_sp1.1 from training. Duration: 0.4909375 2023-10-04 06:58:43,555 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_46013_210_7234_1_1533776362_3744032_50-141535-0 from training. Duration: 0.99 2023-10-04 06:58:43,563 WARNING [train.py:1204] (2/4) Exclude cut with ID _13942_210_9988_1_1530604906036_3733360_499-55676-0_sp1.1 from training. Duration: 0.563625 2023-10-04 06:58:46,727 WARNING [train.py:1204] (2/4) Exclude cut with ID _30800_210_22382_1_1532499347904_4515959_375-65143-0_sp1.1 from training. Duration: 0.97275 2023-10-04 06:58:48,644 WARNING [train.py:1204] (2/4) Exclude cut with ID _16756_210_4915_1_1531047803763_7104259_199-325608-0_sp1.1 from training. Duration: 0.92725 2023-10-04 06:58:50,450 WARNING [train.py:1204] (2/4) Exclude cut with ID _31649_210_6228_1_1532584608063_4091078_443-124391-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 06:58:50,537 WARNING [train.py:1204] (2/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_146-218763-0 from training. Duration: 0.86 2023-10-04 06:58:50,757 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.feed_forward2.hidden_balancer.prob, batch_count=1565826.6666666667, ans=0.125 2023-10-04 06:58:51,891 WARNING [train.py:1204] (2/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_567-135114-0_sp1.1 from training. Duration: 0.7909375 2023-10-04 06:58:51,923 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_26326_210_7925_1_1533002493_3592406_462-288299-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 06:58:53,392 WARNING [train.py:1204] (2/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_322-217220-0 from training. Duration: 0.86 2023-10-04 06:58:55,144 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_238-256668-0_sp1.1 from training. Duration: 0.62725 2023-10-04 06:58:55,190 WARNING [train.py:1204] (2/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_371-18169-0 from training. Duration: 0.86 2023-10-04 06:58:56,573 WARNING [train.py:1204] (2/4) Exclude cut with ID _58480_210_12392_1_1534811121111_3646160_23-74512-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 06:58:56,578 WARNING [train.py:1204] (2/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_784-213099-0_sp1.1 from training. Duration: 0.8454375 2023-10-04 06:58:56,677 WARNING [train.py:1204] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_625-102955-0_sp1.1 from training. Duration: 0.7 2023-10-04 06:58:59,775 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.bypass.skip_rate, batch_count=1565893.3333333333, ans=0.07 2023-10-04 06:59:00,775 INFO [train.py:1046] (2/4) Epoch 45, batch 1150, loss[loss=0.1575, simple_loss=0.2417, pruned_loss=0.03666, over 24672.00 frames. ], tot_loss[loss=0.1543, simple_loss=0.2348, pruned_loss=0.03684, over 4712648.47 frames. ], batch size: 65, lr: 2.26e-03, grad_scale: 8.0 2023-10-04 06:59:02,226 WARNING [train.py:1204] (2/4) Exclude cut with ID _49748_210_24116_1_1533986971737_3158910_176-174327-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 06:59:02,359 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.1.feed_forward1.out_proj.dropout_p, batch_count=1565893.3333333333, ans=0.1 2023-10-04 06:59:04,901 WARNING [train.py:1204] (2/4) Exclude cut with ID _50310_210_15488_1_1534068043345_3641080_432-96140-0_sp1.1 from training. Duration: 0.936375 2023-10-04 06:59:06,253 WARNING [train.py:1204] (2/4) Exclude cut with ID _40402_210_18219_1_1533522324115_2953150_76-14910-0_sp1.1 from training. Duration: 0.92725 2023-10-04 06:59:06,278 WARNING [train.py:1204] (2/4) Exclude cut with ID _21783_210_11357_1_1531742458730_3762679_40-19458-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 06:59:06,305 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_14056_210_4163_1_1530604483_7572827_46-93547-0 from training. Duration: 0.95 2023-10-04 06:59:06,348 WARNING [train.py:1204] (2/4) Exclude cut with ID _21631_210_2253_1_1531907880778_3160339_321-35325-0_sp1.1 from training. Duration: 0.963625 2023-10-04 06:59:07,170 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.1.encoder.layers.0.conv_module2.whiten, num_groups=1, num_channels=256, metric=6.31 vs. limit=15.0 2023-10-04 06:59:09,103 WARNING [train.py:1204] (2/4) Exclude cut with ID _4779_210_2084_1_1527318022944_3914893_500-285013-0 from training. Duration: 0.69 2023-10-04 06:59:11,723 WARNING [train.py:1204] (2/4) Exclude cut with ID _43552_210_8913_1_1533726031059_3523429_301-93324-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 06:59:11,729 WARNING [train.py:1204] (2/4) Exclude cut with ID _6473_210_1741_1_1527901307396_3815380_108-17335-0_sp1.1 from training. Duration: 0.5909375 2023-10-04 06:59:16,061 WARNING [train.py:1204] (2/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_206-10097-0 from training. Duration: 0.49 2023-10-04 06:59:18,720 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.conv_module2.balancer1.prob, batch_count=1565960.0, ans=0.125 2023-10-04 06:59:21,207 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_36756_210_7234_1_1533026991_966276_103-77513-0_sp1.1 from training. Duration: 0.97275 2023-10-04 06:59:23,405 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.feed_forward1.hidden_balancer.prob, batch_count=1565960.0, ans=0.125 2023-10-04 06:59:24,514 WARNING [train.py:1204] (2/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_662-18193-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 06:59:25,831 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_26433_210_12108_1_1532157158_4056480_439-52438-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 06:59:25,885 WARNING [train.py:1204] (2/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_464-33931-0 from training. Duration: 0.54 2023-10-04 06:59:25,892 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_3474_210_2028_1_1526188513_4825246_145-309131-0_sp0.9 from training. Duration: 0.82225 2023-10-04 06:59:26,808 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.0.layers.0.conv_module2.whiten, num_groups=1, num_channels=192, metric=10.75 vs. limit=15.0 2023-10-04 06:59:27,729 WARNING [train.py:1204] (2/4) Exclude cut with ID _9679_210_7497_1_1529129212906_4363929_446-308321-0_sp1.1 from training. Duration: 0.963625 2023-10-04 06:59:30,643 WARNING [train.py:1204] (2/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_662-275621-0 from training. Duration: 0.42 2023-10-04 06:59:33,365 WARNING [train.py:1204] (2/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_344-337148-0_sp1.1 from training. Duration: 0.97275 2023-10-04 06:59:34,789 WARNING [train.py:1204] (2/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_759-179071-0_sp1.1 from training. Duration: 0.92725 2023-10-04 06:59:39,377 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.feed_forward2.hidden_balancer.prob, batch_count=1566026.6666666667, ans=0.125 2023-10-04 06:59:43,361 WARNING [train.py:1204] (2/4) Exclude cut with ID _28783_210_7943_1_1532671164808_7155479_293-12188-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 06:59:48,907 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_17613_210_2151_1_1531137569_2366160_235-320304-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 06:59:48,923 WARNING [train.py:1204] (2/4) Exclude cut with ID _14349_210_8341_1_1530615710818_3442470_170-336541-0 from training. Duration: 0.86 2023-10-04 06:59:50,644 WARNING [train.py:1204] (2/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_375-218677-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 06:59:50,694 WARNING [train.py:1204] (2/4) Exclude cut with ID _28317_210_12742_1_1532480302381_8510325_829-202956-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 06:59:57,245 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9029W0436-509823 from training. Duration: 0.768 2023-10-04 06:59:58,641 WARNING [train.py:1204] (2/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_231-112214-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 07:00:03,652 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9059W0470-524883 from training. Duration: 0.3415625 2023-10-04 07:00:04,475 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.2.encoder.layers.0.feed_forward2.out_whiten, num_groups=1, num_channels=384, metric=3.86 vs. limit=15.0 2023-10-04 07:00:04,887 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.678e+02 2.046e+02 2.266e+02 2.583e+02 3.861e+02, threshold=4.532e+02, percent-clipped=0.0 2023-10-04 07:00:06,553 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.nonlin_attention.balancer.max_positive, batch_count=1566160.0, ans=0.95 2023-10-04 07:00:07,598 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_43266_210_1794_1_1533521789_4497218_206-79512-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 07:00:08,973 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_294-163793-0_sp0.9 from training. Duration: 0.911125 2023-10-04 07:00:10,271 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6290_210_3273_1_1527595224_1282787_115-87095-0_sp1.1 from training. Duration: 0.5 2023-10-04 07:00:10,314 WARNING [train.py:1204] (2/4) Exclude cut with ID _14100_210_8341_1_1530529320613_3678069_279-55760-0_sp1.1 from training. Duration: 0.8454375 2023-10-04 07:00:11,892 WARNING [train.py:1204] (2/4) Exclude cut with ID _14621_210_1925_1_1530686901185_3675420_419-234200-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 07:00:13,439 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.conv_module1.balancer2.prob, batch_count=1566226.6666666667, ans=0.125 2023-10-04 07:00:14,500 INFO [train.py:1046] (2/4) Epoch 45, batch 1200, loss[loss=0.1589, simple_loss=0.2516, pruned_loss=0.03304, over 24260.00 frames. ], tot_loss[loss=0.1536, simple_loss=0.2344, pruned_loss=0.03642, over 4721493.92 frames. ], batch size: 74, lr: 2.26e-03, grad_scale: 16.0 2023-10-04 07:00:16,131 WARNING [train.py:1204] (2/4) Exclude cut with ID _60229_210_27767_1_1534838406158_3655350_353-213656-0_sp1.1 from training. Duration: 0.863625 2023-10-04 07:00:16,139 WARNING [train.py:1204] (2/4) Exclude cut with ID _48678_210_5663_1_1533988830947_3769199_306-255886-0_sp1.1 from training. Duration: 0.7 2023-10-04 07:00:18,815 WARNING [train.py:1204] (2/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_466-3492-0_sp1.1 from training. Duration: 0.97275 2023-10-04 07:00:18,815 WARNING [train.py:1204] (2/4) Exclude cut with ID _8755_210_1876_1_1528628559357_3258209_199-242167-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 07:00:18,883 WARNING [train.py:1204] (2/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_237-89684-0_sp1.1 from training. Duration: 0.87275 2023-10-04 07:00:22,206 WARNING [train.py:1204] (2/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_70-84121-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 07:00:24,040 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7544_210_6753_1_1528587722_3576686_172-171750-0_sp1.1 from training. Duration: 0.5909375 2023-10-04 07:00:25,409 WARNING [train.py:1204] (2/4) Exclude cut with ID _24271_210_12658_1_1531987270022_7190519_40-321822-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 07:00:25,432 WARNING [train.py:1204] (2/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_227-83612-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 07:00:29,167 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9156W0015-572508 from training. Duration: 0.896 2023-10-04 07:00:29,392 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.conv_skip_rate, batch_count=1566293.3333333333, ans=0.0 2023-10-04 07:00:31,931 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_292-265928-0 from training. Duration: 0.51 2023-10-04 07:00:32,310 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.nonlin_attention.balancer.prob, batch_count=1566293.3333333333, ans=0.125 2023-10-04 07:00:33,469 WARNING [train.py:1204] (2/4) Exclude cut with ID _14747_210_8341_1_1530685797463_3654089_147-131229-0_sp1.1 from training. Duration: 0.6909375 2023-10-04 07:00:34,861 WARNING [train.py:1204] (2/4) Exclude cut with ID _45297_210_2721_1_1533715351381_3264949_351-147136-0_sp0.9 from training. Duration: 0.9666875 2023-10-04 07:00:37,581 WARNING [train.py:1204] (2/4) Exclude cut with ID _5250_210_5219_1_1527249561806_8692110_1102-324568-0_sp1.1 from training. Duration: 0.97275 2023-10-04 07:00:37,800 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=1566293.3333333333, ans=0.1 2023-10-04 07:00:39,042 WARNING [train.py:1204] (2/4) Exclude cut with ID _18516_210_9206_1_1531704653206_3692406_465-75446-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 07:00:39,051 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9203W0126-595623 from training. Duration: 0.896 2023-10-04 07:00:40,387 WARNING [train.py:1204] (2/4) Exclude cut with ID _55383_210_10635_1_1534468426172_2231669_139-328410-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 07:00:46,010 WARNING [train.py:1204] (2/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_166-246557-0_sp0.9 from training. Duration: 0.711125 2023-10-04 07:00:46,020 WARNING [train.py:1204] (2/4) Exclude cut with ID _30039_210_13270_1_1532484612900_2725150_239-304872-0_sp1.1 from training. Duration: 0.963625 2023-10-04 07:00:47,311 WARNING [train.py:1204] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_6-226750-0 from training. Duration: 0.94 2023-10-04 07:00:48,593 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_135-5501-0_sp1.1 from training. Duration: 0.8909375 2023-10-04 07:00:52,765 WARNING [train.py:1204] (2/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_359-118926-0 from training. Duration: 0.72 2023-10-04 07:00:57,232 WARNING [train.py:1204] (2/4) Exclude cut with ID _8064_210_5399_1_1528372299291_4282973_4-91037-0 from training. Duration: 0.88 2023-10-04 07:00:57,249 WARNING [train.py:1204] (2/4) Exclude cut with ID _6653_210_2629_1_1527919172441_6732679_415-288492-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 07:00:57,320 WARNING [train.py:1204] (2/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_544-287179-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 07:00:59,241 WARNING [train.py:1204] (2/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_122-32606-0_sp1.1 from training. Duration: 0.92725 2023-10-04 07:01:00,571 WARNING [train.py:1204] (2/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_121-284152-0_sp1.1 from training. Duration: 0.536375 2023-10-04 07:01:01,899 WARNING [train.py:1204] (2/4) Exclude cut with ID _44718_210_5816_1_1534050051869_6469030_350-299477-0_sp1.1 from training. Duration: 0.97275 2023-10-04 07:01:01,917 WARNING [train.py:1204] (2/4) Exclude cut with ID _6653_210_2629_1_1527919172441_6732679_1103-56445-0_sp0.9 from training. Duration: 0.811125 2023-10-04 07:01:03,210 WARNING [train.py:1204] (2/4) Exclude cut with ID _12369_210_4381_1_1530356078581_4388520_64-298917-0_sp0.9 from training. Duration: 0.97775 2023-10-04 07:01:03,258 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_2245_210_2715_1_1525511548_3540254_310-52483-0 from training. Duration: 0.88 2023-10-04 07:01:04,590 WARNING [train.py:1204] (2/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_401-158239-0_sp1.1 from training. Duration: 0.6454375 2023-10-04 07:01:05,860 WARNING [train.py:1204] (2/4) Exclude cut with ID _59502_210_12210_1_1535107024698_1835139_146-162495-0_sp1.1 from training. Duration: 0.836375 2023-10-04 07:01:05,881 WARNING [train.py:1204] (2/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_41-31157-0_sp0.9 from training. Duration: 0.6333125 2023-10-04 07:01:07,460 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_68717_210_3528_1_1535628683_1556759_95-36722-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 07:01:07,468 WARNING [train.py:1204] (2/4) Exclude cut with ID _30196_210_1664_1_1533708496522_7074140_654-162611-0_sp1.1 from training. Duration: 0.92725 2023-10-04 07:01:12,822 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_9302_210_2550_1_1528889962_4655416_638-301620-0_sp0.9 from training. Duration: 0.52225 2023-10-04 07:01:14,391 WARNING [train.py:1204] (2/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_478-162247-0_sp1.1 from training. Duration: 0.7454375 2023-10-04 07:01:17,241 WARNING [train.py:1204] (2/4) Exclude cut with ID _45297_210_2721_1_1533715351381_3264949_353-9541-0 from training. Duration: 0.97 2023-10-04 07:01:19,948 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9653W0190-675468 from training. Duration: 0.9935 2023-10-04 07:01:21,255 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_49730_210_13049_1_1534075294_1830676_214-84436-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 07:01:23,869 WARNING [train.py:1204] (2/4) Exclude cut with ID _50093_210_4220_1_1534417141207_3432299_259-322252-0_sp1.1 from training. Duration: 0.836375 2023-10-04 07:01:24,949 WARNING [train.py:1204] (2/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_599-50220-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 07:01:26,349 WARNING [train.py:1204] (2/4) Exclude cut with ID _21878_210_3525_1_1531738772002_7264330_174-109016-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 07:01:28,393 WARNING [train.py:1204] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_665-196155-0 from training. Duration: 0.67 2023-10-04 07:01:30,152 INFO [train.py:1046] (2/4) Epoch 45, batch 1250, loss[loss=0.1455, simple_loss=0.2313, pruned_loss=0.02987, over 24488.00 frames. ], tot_loss[loss=0.1554, simple_loss=0.2359, pruned_loss=0.03739, over 4712789.58 frames. ], batch size: 63, lr: 2.26e-03, grad_scale: 16.0 2023-10-04 07:01:33,001 WARNING [train.py:1204] (2/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_656-188361-0_sp1.1 from training. Duration: 0.7909375 2023-10-04 07:01:33,063 WARNING [train.py:1204] (2/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_60-18123-0_sp1.1 from training. Duration: 0.8 2023-10-04 07:01:34,388 WARNING [train.py:1204] (2/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_535-14410-0 from training. Duration: 0.47 2023-10-04 07:01:37,182 WARNING [train.py:1204] (2/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_94-126128-0_sp1.1 from training. Duration: 0.7818125 2023-10-04 07:01:38,598 WARNING [train.py:1204] (2/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_356-31216-0_sp1.1 from training. Duration: 0.6818125 2023-10-04 07:01:40,213 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.conv_module2.balancer2.prob, batch_count=1566560.0, ans=0.125 2023-10-04 07:01:41,403 WARNING [train.py:1204] (2/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_443-30034-0_sp1.1 from training. Duration: 0.5818125 2023-10-04 07:01:42,753 WARNING [train.py:1204] (2/4) Exclude cut with ID _26907_210_7234_1_1532221740694_3205263_337-2494-0_sp1.1 from training. Duration: 0.8 2023-10-04 07:01:44,207 WARNING [train.py:1204] (2/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_49-97158-0_sp0.9 from training. Duration: 0.8444375 2023-10-04 07:01:44,219 WARNING [train.py:1204] (2/4) Exclude cut with ID _7877_210_6014_1_1528286358099_3750136_553-170128-0_sp1.1 from training. Duration: 0.963625 2023-10-04 07:01:47,082 WARNING [train.py:1204] (2/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_202-293523-0_sp0.9 from training. Duration: 0.8 2023-10-04 07:01:49,895 WARNING [train.py:1204] (2/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_188-171673-0_sp0.9 from training. Duration: 0.6444375 2023-10-04 07:01:51,141 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_120-41668-0_sp1.1 from training. Duration: 0.636375 2023-10-04 07:01:51,146 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_40223_210_6228_1_1533298404_4812267_613-78769-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 07:01:52,982 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_53861_210_5399_1_1534422156_4272077_150-336899-0_sp1.1 from training. Duration: 0.963625 2023-10-04 07:01:54,306 WARNING [train.py:1204] (2/4) Exclude cut with ID _14459_210_2228_1_1531443641946_7516700_1146-56843-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 07:01:56,207 WARNING [train.py:1204] (2/4) Exclude cut with ID _39867_210_9826_1_1533553222030_9344259_1194-299928-0_sp1.1 from training. Duration: 0.92725 2023-10-04 07:01:57,539 WARNING [train.py:1204] (2/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_214-291369-0_sp0.9 from training. Duration: 0.7 2023-10-04 07:02:02,552 WARNING [train.py:1204] (2/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_358-215558-0 from training. Duration: 0.93 2023-10-04 07:02:03,921 WARNING [train.py:1204] (2/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_577-287150-0_sp0.9 from training. Duration: 0.911125 2023-10-04 07:02:05,401 WARNING [train.py:1204] (2/4) Exclude cut with ID _60860_210_8254_1_1534896015875_3557169_229-187367-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 07:02:06,691 WARNING [train.py:1204] (2/4) Exclude cut with ID _6275_210_2084_1_1528110062213_7669049_857-298942-0 from training. Duration: 0.85 2023-10-04 07:02:06,743 WARNING [train.py:1204] (2/4) Exclude cut with ID _40137_210_12210_1_1533290484046_3795180_249-187059-0_sp1.1 from training. Duration: 0.8 2023-10-04 07:02:06,750 WARNING [train.py:1204] (2/4) Exclude cut with ID ID0171W0508-765603 from training. Duration: 0.541 2023-10-04 07:02:06,770 WARNING [train.py:1204] (2/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_521-219439-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 07:02:08,052 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_40223_210_6228_1_1533298404_4812267_741-302616-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 07:02:10,979 WARNING [train.py:1204] (2/4) Exclude cut with ID _65293_210_9070_1_1535545549765_4004210_438-154735-0_sp1.1 from training. Duration: 0.92725 2023-10-04 07:02:13,752 WARNING [train.py:1204] (2/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_564-132950-0_sp1.1 from training. Duration: 0.92725 2023-10-04 07:02:13,783 WARNING [train.py:1204] (2/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_291-270881-0_sp1.1 from training. Duration: 0.8818125 2023-10-04 07:02:16,533 WARNING [train.py:1204] (2/4) Exclude cut with ID _60229_210_27767_1_1534838406158_3655350_353-213656-0 from training. Duration: 0.95 2023-10-04 07:02:16,552 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_3133_210_2087_1_1526106446_2324690_243-143119-0 from training. Duration: 0.77 2023-10-04 07:02:17,869 WARNING [train.py:1204] (2/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_690-126306-0 from training. Duration: 0.78 2023-10-04 07:02:20,574 WARNING [train.py:1204] (2/4) Exclude cut with ID _30304_210_6319_1_1532476827261_7582060_347-95003-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 07:02:20,681 WARNING [train.py:1204] (2/4) Exclude cut with ID _2322_210_2667_1_1525604548971_3656299_440-80494-0 from training. Duration: 0.88 2023-10-04 07:02:20,696 WARNING [train.py:1204] (2/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_370-229434-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 07:02:26,476 WARNING [train.py:1204] (2/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_124-345840-0_sp1.1 from training. Duration: 0.47275 2023-10-04 07:02:26,492 WARNING [train.py:1204] (2/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_165-123581-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 07:02:28,033 WARNING [train.py:1204] (2/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_453-282588-0 from training. Duration: 0.98 2023-10-04 07:02:28,059 WARNING [train.py:1204] (2/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_299-34413-0_sp0.9 from training. Duration: 0.72225 2023-10-04 07:02:29,542 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_40036_210_13405_1_1533648647_1315388_48-146016-0_sp1.1 from training. Duration: 0.8454375 2023-10-04 07:02:29,577 WARNING [train.py:1204] (2/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_99-167392-0_sp0.9 from training. Duration: 0.67775 2023-10-04 07:02:30,921 WARNING [train.py:1204] (2/4) Exclude cut with ID _48677_210_5663_1_1533902405919_3892989_37-207114-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 07:02:32,844 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.606e+02 2.067e+02 2.229e+02 2.579e+02 4.151e+02, threshold=4.458e+02, percent-clipped=0.0 2023-10-04 07:02:32,929 WARNING [train.py:1204] (2/4) Exclude cut with ID _29857_210_20034_1_1532395886753_3798160_335-163562-0 from training. Duration: 0.85 2023-10-04 07:02:34,749 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_36492_210_7927_1_1533261987_4790834_703-343351-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 07:02:36,122 WARNING [train.py:1204] (2/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_80-147301-0_sp0.9 from training. Duration: 0.9333125 2023-10-04 07:02:37,558 WARNING [train.py:1204] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_218-295097-0_sp0.9 from training. Duration: 0.8555625 2023-10-04 07:02:40,308 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_2295_210_2087_1_1525500993_2407863_116-238894-0_sp1.1 from training. Duration: 0.636375 2023-10-04 07:02:43,073 INFO [train.py:1046] (2/4) Epoch 45, batch 1300, loss[loss=0.1313, simple_loss=0.2121, pruned_loss=0.02525, over 19128.00 frames. ], tot_loss[loss=0.1556, simple_loss=0.236, pruned_loss=0.03757, over 4714896.75 frames. ], batch size: 42, lr: 2.26e-03, grad_scale: 16.0 2023-10-04 07:02:43,172 WARNING [train.py:1204] (2/4) Exclude cut with ID _3231_210_2106_1_1526205864934_6784220_697-171068-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 07:02:43,194 WARNING [train.py:1204] (2/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_102-77064-0 from training. Duration: 0.93 2023-10-04 07:02:44,857 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.attention_skip_rate, batch_count=1566893.3333333333, ans=0.0 2023-10-04 07:02:46,099 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_13091_210_7925_1_1530609447_8987354_1186-261957-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 07:02:47,452 WARNING [train.py:1204] (2/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_91-277684-0_sp1.1 from training. Duration: 0.67275 2023-10-04 07:02:47,530 WARNING [train.py:1204] (2/4) Exclude cut with ID _31088_210_16446_1_1532520012284_3440919_159-313751-0_sp1.1 from training. Duration: 0.936375 2023-10-04 07:02:48,926 WARNING [train.py:1204] (2/4) Exclude cut with ID _42326_210_7770_1_1533814282531_3707269_227-203721-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 07:02:51,481 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_3531_210_3528_1_1526294203_5216456_309-227302-0_sp1.1 from training. Duration: 0.62725 2023-10-04 07:02:51,549 WARNING [train.py:1204] (2/4) Exclude cut with ID _14087_210_2253_1_1530529224512_3014029_226-242140-0 from training. Duration: 0.81 2023-10-04 07:02:56,788 WARNING [train.py:1204] (2/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_122-109023-0_sp1.1 from training. Duration: 0.6909375 2023-10-04 07:02:59,385 WARNING [train.py:1204] (2/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_660-211088-0_sp0.9 from training. Duration: 0.688875 2023-10-04 07:03:00,745 WARNING [train.py:1204] (2/4) Exclude cut with ID _39290_210_4220_1_1533862778760_3075118_92-311600-0 from training. Duration: 0.88 2023-10-04 07:03:04,049 WARNING [train.py:1204] (2/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_390-339137-0_sp1.1 from training. Duration: 0.5090625 2023-10-04 07:03:08,718 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.0.feed_forward2.hidden_balancer.prob, batch_count=1566960.0, ans=0.125 2023-10-04 07:03:09,935 WARNING [train.py:1204] (2/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_449-341946-0_sp1.1 from training. Duration: 0.963625 2023-10-04 07:03:11,416 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_68095_210_20185_1_1535718226_8888855_884-327007-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 07:03:11,581 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.1.feed_forward2.hidden_balancer.prob, batch_count=1567026.6666666667, ans=0.125 2023-10-04 07:03:14,131 WARNING [train.py:1204] (2/4) Exclude cut with ID _42326_210_7770_1_1533814282531_3707269_308-222998-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 07:03:14,232 WARNING [train.py:1204] (2/4) Exclude cut with ID _40399_210_4353_1_1533305658194_3279406_344-238708-0_sp1.1 from training. Duration: 0.963625 2023-10-04 07:03:14,291 WARNING [train.py:1204] (2/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_87-131427-0_sp1.1 from training. Duration: 0.7090625 2023-10-04 07:03:15,607 WARNING [train.py:1204] (2/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_110-130961-0_sp0.9 from training. Duration: 0.52225 2023-10-04 07:03:15,647 WARNING [train.py:1204] (2/4) Exclude cut with ID _55153_210_2253_1_1534384786677_3162096_148-79022-0 from training. Duration: 0.96 2023-10-04 07:03:19,917 WARNING [train.py:1204] (2/4) Exclude cut with ID _9679_210_7497_1_1529129212906_4363929_512-313889-0_sp1.1 from training. Duration: 0.736375 2023-10-04 07:03:19,931 WARNING [train.py:1204] (2/4) Exclude cut with ID _7970_210_1883_1_1528455616296_3472230_25-178407-0_sp1.1 from training. Duration: 0.6454375 2023-10-04 07:03:21,249 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_3133_210_2087_1_1526106446_2324690_246-125000-0 from training. Duration: 0.62 2023-10-04 07:03:22,543 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_15126_210_6753_1_1530778677_2591185_113-157651-0_sp0.9 from training. Duration: 0.7555625 2023-10-04 07:03:25,974 WARNING [train.py:1204] (2/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_105-63309-0_sp1.1 from training. Duration: 0.8909375 2023-10-04 07:03:28,638 WARNING [train.py:1204] (2/4) Exclude cut with ID _29877_210_6228_1_1532397791106_5276149_578-177271-0_sp1.1 from training. Duration: 0.936375 2023-10-04 07:03:28,692 WARNING [train.py:1204] (2/4) Exclude cut with ID _44831_210_13427_1_1533816011549_3689035_32-188147-0 from training. Duration: 0.96 2023-10-04 07:03:28,748 WARNING [train.py:1204] (2/4) Exclude cut with ID _12369_210_4381_1_1530356078581_4388520_300-254337-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 07:03:28,761 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_128-241008-0 from training. Duration: 0.94 2023-10-04 07:03:31,982 WARNING [train.py:1204] (2/4) Exclude cut with ID _21878_210_3525_1_1531738772002_7264330_53-279178-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 07:03:36,580 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_41452_210_7291_1_1533437687_3941281_273-334088-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 07:03:36,582 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_128-241008-0_sp1.1 from training. Duration: 0.8545625 2023-10-04 07:03:41,037 WARNING [train.py:1204] (2/4) Exclude cut with ID _8394_210_6702_1_1528590504316_3540189_379-49457-0 from training. Duration: 0.87 2023-10-04 07:03:42,420 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_105-114797-0 from training. Duration: 0.84 2023-10-04 07:03:43,769 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_14928_210_5632_1_1530773461_4714000_454-112198-0 from training. Duration: 0.66 2023-10-04 07:03:46,670 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_456-22141-0_sp1.1 from training. Duration: 0.8 2023-10-04 07:03:49,288 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_115-233929-0 from training. Duration: 0.72 2023-10-04 07:03:50,690 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_616-16795-0_sp1.1 from training. Duration: 0.963625 2023-10-04 07:03:56,817 INFO [train.py:1046] (2/4) Epoch 45, batch 1350, loss[loss=0.1354, simple_loss=0.2, pruned_loss=0.03541, over 22724.00 frames. ], tot_loss[loss=0.1549, simple_loss=0.2353, pruned_loss=0.03727, over 4718844.87 frames. ], batch size: 322, lr: 2.26e-03, grad_scale: 16.0 2023-10-04 07:03:56,896 WARNING [train.py:1204] (2/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_448-212135-0 from training. Duration: 0.91 2023-10-04 07:04:01,443 WARNING [train.py:1204] (2/4) Exclude cut with ID _46683_210_9642_1_1533730913492_4591116_201-205677-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 07:04:02,862 WARNING [train.py:1204] (2/4) Exclude cut with ID _64086_210_7994_1_1535270417019_7435949_43-80483-0_sp1.1 from training. Duration: 0.97275 2023-10-04 07:04:04,918 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_240-134124-0_sp1.1 from training. Duration: 0.963625 2023-10-04 07:04:06,254 WARNING [train.py:1204] (2/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_907-120940-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 07:04:07,641 WARNING [train.py:1204] (2/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_848-280957-0_sp1.1 from training. Duration: 0.7909375 2023-10-04 07:04:07,683 WARNING [train.py:1204] (2/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_243-42116-0_sp0.9 from training. Duration: 0.811125 2023-10-04 07:04:12,238 WARNING [train.py:1204] (2/4) Exclude cut with ID _6694_210_4599_1_1527852637490_3965529_116-109398-0_sp0.9 from training. Duration: 0.811125 2023-10-04 07:04:12,331 WARNING [train.py:1204] (2/4) Exclude cut with ID _24314_210_4915_1_1532135078116_7170179_521-258868-0 from training. Duration: 0.79 2023-10-04 07:04:15,053 WARNING [train.py:1204] (2/4) Exclude cut with ID _2220_210_1819_1_1525509109898_3658680_50-99837-0_sp0.9 from training. Duration: 0.6 2023-10-04 07:04:15,121 WARNING [train.py:1204] (2/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_313-134930-0_sp1.1 from training. Duration: 0.8818125 2023-10-04 07:04:19,170 WARNING [train.py:1204] (2/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_125-144174-0 from training. Duration: 0.39 2023-10-04 07:04:20,481 WARNING [train.py:1204] (2/4) Exclude cut with ID _59288_210_5854_1_1535077757774_3412246_275-293897-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 07:04:21,963 WARNING [train.py:1204] (2/4) Exclude cut with ID _40519_210_13794_1_1533715228655_7337390_722-52880-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 07:04:21,978 WARNING [train.py:1204] (2/4) Exclude cut with ID _7537_210_2084_1_1528502395431_3924359_128-268029-0 from training. Duration: 0.98 2023-10-04 07:04:24,730 WARNING [train.py:1204] (2/4) Exclude cut with ID _8021_210_7994_1_1528376451358_3653069_179-45206-0 from training. Duration: 0.86 2023-10-04 07:04:26,183 WARNING [train.py:1204] (2/4) Exclude cut with ID _13252_210_1925_1_1530671372047_3336909_88-276668-0 from training. Duration: 0.99 2023-10-04 07:04:28,063 WARNING [train.py:1204] (2/4) Exclude cut with ID _22613_210_5219_1_1532253599686_4090170_290-212862-0_sp1.1 from training. Duration: 0.97275 2023-10-04 07:04:28,068 WARNING [train.py:1204] (2/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_465-214726-0 from training. Duration: 0.86 2023-10-04 07:04:34,938 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.4.encoder.layers.2.conv_module2.whiten, num_groups=1, num_channels=384, metric=9.96 vs. limit=15.0 2023-10-04 07:04:39,164 WARNING [train.py:1204] (2/4) Exclude cut with ID _56480_210_4220_1_1534679942079_3075409_138-36031-0_sp1.1 from training. Duration: 0.97275 2023-10-04 07:04:40,878 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.nonlin_attention.balancer.min_positive, batch_count=1567426.6666666667, ans=0.05 2023-10-04 07:04:47,939 WARNING [train.py:1204] (2/4) Exclude cut with ID _58605_210_12210_1_1534723195939_3904259_300-285878-0_sp1.1 from training. Duration: 0.97275 2023-10-04 07:04:47,972 WARNING [train.py:1204] (2/4) Exclude cut with ID _40901_210_5665_1_1533363789958_3856130_114-340229-0_sp1.1 from training. Duration: 0.936375 2023-10-04 07:04:48,013 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_489-230703-0 from training. Duration: 0.85 2023-10-04 07:04:50,904 WARNING [train.py:1204] (2/4) Exclude cut with ID _66873_210_5517_1_1535454136506_4293370_322-81243-0_sp1.1 from training. Duration: 0.936375 2023-10-04 07:04:52,150 WARNING [train.py:1204] (2/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_271-269084-0 from training. Duration: 0.83 2023-10-04 07:04:52,170 WARNING [train.py:1204] (2/4) Exclude cut with ID _13252_210_1925_1_1530671372047_3336909_166-314607-0_sp0.9 from training. Duration: 0.6 2023-10-04 07:04:53,517 WARNING [train.py:1204] (2/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_340-343302-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 07:04:55,002 WARNING [train.py:1204] (2/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_286-112015-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 07:04:58,153 WARNING [train.py:1204] (2/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_328-264776-0 from training. Duration: 0.67 2023-10-04 07:04:58,271 WARNING [train.py:1204] (2/4) Exclude cut with ID _4866_210_2577_1_1527404412284_4364659_256-306830-0_sp0.9 from training. Duration: 0.9666875 2023-10-04 07:05:01,349 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.751e+02 2.135e+02 2.425e+02 2.910e+02 3.812e+02, threshold=4.850e+02, percent-clipped=0.0 2023-10-04 07:05:02,858 WARNING [train.py:1204] (2/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_239-316802-0 from training. Duration: 0.69 2023-10-04 07:05:04,684 WARNING [train.py:1204] (2/4) Exclude cut with ID _29857_210_20034_1_1532395886753_3798160_279-185801-0 from training. Duration: 0.91 2023-10-04 07:05:11,834 INFO [train.py:1046] (2/4) Epoch 45, batch 1400, loss[loss=0.1417, simple_loss=0.2348, pruned_loss=0.02428, over 24631.00 frames. ], tot_loss[loss=0.1536, simple_loss=0.2334, pruned_loss=0.03689, over 4705186.15 frames. ], batch size: 68, lr: 2.26e-03, grad_scale: 8.0 2023-10-04 07:05:11,924 WARNING [train.py:1204] (2/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_656-188361-0 from training. Duration: 0.87 2023-10-04 07:05:12,607 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.2.encoder.layers.2.feed_forward2.out_whiten, num_groups=1, num_channels=384, metric=5.26 vs. limit=15.0 2023-10-04 07:05:13,421 WARNING [train.py:1204] (2/4) Exclude cut with ID _44718_210_5816_1_1534050051869_6469030_100-230287-0_sp1.1 from training. Duration: 0.936375 2023-10-04 07:05:16,348 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8275_210_4198_1_1528455320_3892571_276-215622-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 07:05:16,405 WARNING [train.py:1204] (2/4) Exclude cut with ID _30304_210_6319_1_1532476827261_7582060_698-309430-0_sp1.1 from training. Duration: 0.963625 2023-10-04 07:05:20,524 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_365-217119-0 from training. Duration: 0.63 2023-10-04 07:05:23,191 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7264_210_4938_1_1527939911_8132095_800-91995-0 from training. Duration: 0.86 2023-10-04 07:05:24,804 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.conv_skip_rate, batch_count=1567626.6666666667, ans=0.0 2023-10-04 07:05:34,000 WARNING [train.py:1204] (2/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_49-97158-0_sp1.1 from training. Duration: 0.6909375 2023-10-04 07:05:36,048 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=1567626.6666666667, ans=0.1 2023-10-04 07:05:37,198 WARNING [train.py:1204] (2/4) Exclude cut with ID _61132_210_9969_1_1535021792964_3755419_61-57720-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 07:05:38,703 WARNING [train.py:1204] (2/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_345-306832-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 07:05:38,726 WARNING [train.py:1204] (2/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_164-224052-0_sp0.9 from training. Duration: 0.77775 2023-10-04 07:05:42,791 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_34886_210_10633_1_1533203882_1755661_76-156096-0_sp1.1 from training. Duration: 0.7909375 2023-10-04 07:05:44,320 WARNING [train.py:1204] (2/4) Exclude cut with ID 4133-6541-0027-40495-0_sp1.1 from training. Duration: 0.9681875 2023-10-04 07:05:53,143 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_514-211165-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 07:05:53,199 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_41211_210_2544_1_1533348859_3702910_97-212695-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 07:05:57,404 WARNING [train.py:1204] (2/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_406-45086-0 from training. Duration: 0.8 2023-10-04 07:05:57,472 WARNING [train.py:1204] (2/4) Exclude cut with ID _65263_210_22356_1_1535763598691_3921037_49-222917-0_sp1.1 from training. Duration: 0.836375 2023-10-04 07:05:57,517 WARNING [train.py:1204] (2/4) Exclude cut with ID _4866_210_2577_1_1527404412284_4364659_352-157930-0_sp1.1 from training. Duration: 0.736375 2023-10-04 07:05:59,480 WARNING [train.py:1204] (2/4) Exclude cut with ID _4366_210_1388_1_1526792053633_3613819_231-211402-0_sp1.1 from training. Duration: 0.8545625 2023-10-04 07:06:00,779 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_98-21576-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 07:06:01,478 WARNING [train.py:1204] (2/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_348-127789-0_sp0.9 from training. Duration: 0.9444375 2023-10-04 07:06:01,502 WARNING [train.py:1204] (2/4) Exclude cut with ID _45871_210_5665_1_1533864614245_3296060_98-329557-0_sp1.1 from training. Duration: 0.97275 2023-10-04 07:06:02,590 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6846_210_6753_1_1527850767_4295158_2-315073-0_sp1.1 from training. Duration: 0.8 2023-10-04 07:06:03,975 WARNING [train.py:1204] (2/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_145-137790-0 from training. Duration: 0.83 2023-10-04 07:06:03,989 WARNING [train.py:1204] (2/4) Exclude cut with ID _14603_210_4220_1_1530615688441_3097319_133-158308-0_sp0.9 from training. Duration: 0.9333125 2023-10-04 07:06:09,823 WARNING [train.py:1204] (2/4) Exclude cut with ID _14898_210_9206_1_1530766812377_4177190_320-11758-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 07:06:12,677 WARNING [train.py:1204] (2/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_406-45086-0_sp0.9 from training. Duration: 0.888875 2023-10-04 07:06:15,061 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.2.encoder.layers.0.nonlin_attention.whiten1, num_groups=1, num_channels=288, metric=7.93 vs. limit=10.0 2023-10-04 07:06:18,143 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.3.encoder.layers.0.self_attn2.whiten, num_groups=1, num_channels=512, metric=14.11 vs. limit=22.5 2023-10-04 07:06:18,867 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_5438_210_1794_1_1527336687_2600009_104-288274-0 from training. Duration: 0.58 2023-10-04 07:06:20,056 WARNING [train.py:1204] (2/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_108-53929-0_sp0.9 from training. Duration: 0.6555625 2023-10-04 07:06:21,300 WARNING [train.py:1204] (2/4) Exclude cut with ID _30800_210_22382_1_1532499347904_4515959_444-286390-0_sp1.1 from training. Duration: 0.963625 2023-10-04 07:06:22,768 WARNING [train.py:1204] (2/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_125-144174-0_sp0.9 from training. Duration: 0.4333125 2023-10-04 07:06:24,062 WARNING [train.py:1204] (2/4) Exclude cut with ID _55209_210_1531_1_1534675828996_4597160_118-345621-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 07:06:25,395 INFO [train.py:1046] (2/4) Epoch 45, batch 1450, loss[loss=0.1549, simple_loss=0.2408, pruned_loss=0.03446, over 24496.00 frames. ], tot_loss[loss=0.1534, simple_loss=0.233, pruned_loss=0.0369, over 4702910.38 frames. ], batch size: 66, lr: 2.26e-03, grad_scale: 8.0 2023-10-04 07:06:25,488 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_45886_210_17024_1_1533709215_471063_7-79165-0_sp1.1 from training. Duration: 0.8909375 2023-10-04 07:06:27,017 WARNING [train.py:1204] (2/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_175-163894-0_sp0.9 from training. Duration: 0.82225 2023-10-04 07:06:29,668 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_50188_210_4198_1_1534503621_3646041_353-81682-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 07:06:29,681 WARNING [train.py:1204] (2/4) Exclude cut with ID _17875_210_2087_1_1531224012907_1821339_175-80565-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 07:06:29,689 WARNING [train.py:1204] (2/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_159-328157-0_sp1.1 from training. Duration: 0.47275 2023-10-04 07:06:36,163 WARNING [train.py:1204] (2/4) Exclude cut with ID _60057_210_10640_1_1534903065793_3333150_137-267579-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 07:06:36,754 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.4.encoder.layers.0.nonlin_attention.whiten2, num_groups=1, num_channels=384, metric=9.90 vs. limit=15.0 2023-10-04 07:06:37,448 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_92-46009-0_sp1.1 from training. Duration: 0.4909375 2023-10-04 07:06:38,764 WARNING [train.py:1204] (2/4) Exclude cut with ID _53203_210_2087_1_1534246218309_475590_48-68507-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 07:06:38,793 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_28211_210_6388_1_1532861506_4523224_128-328162-0 from training. Duration: 0.9 2023-10-04 07:06:40,129 WARNING [train.py:1204] (2/4) Exclude cut with ID _4697_210_2069_1_1526815801953_3823098_51-1049-0_sp1.1 from training. Duration: 0.6818125 2023-10-04 07:06:40,207 WARNING [train.py:1204] (2/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_383-316000-0 from training. Duration: 0.9 2023-10-04 07:06:41,603 WARNING [train.py:1204] (2/4) Exclude cut with ID _4779_210_2084_1_1527318022944_3914893_185-295153-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 07:06:41,686 WARNING [train.py:1204] (2/4) Exclude cut with ID _3231_210_2106_1_1526205864934_6784220_171-256004-0_sp0.9 from training. Duration: 0.9555625 2023-10-04 07:06:41,686 WARNING [train.py:1204] (2/4) Exclude cut with ID _41314_210_12651_1_1533459476543_3766460_87-272068-0 from training. Duration: 0.83 2023-10-04 07:06:43,174 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_164-152777-0_sp1.1 from training. Duration: 0.92725 2023-10-04 07:06:44,510 WARNING [train.py:1204] (2/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_18-130336-0_sp0.9 from training. Duration: 0.87775 2023-10-04 07:06:45,853 WARNING [train.py:1204] (2/4) Exclude cut with ID _14886_210_4220_1_1530702020355_3516190_175-229073-0_sp1.1 from training. Duration: 0.4090625 2023-10-04 07:06:45,866 WARNING [train.py:1204] (2/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_791-45149-0_sp0.9 from training. Duration: 0.9555625 2023-10-04 07:06:47,633 WARNING [train.py:1204] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_468-189058-0_sp1.1 from training. Duration: 0.87275 2023-10-04 07:06:47,753 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8278_210_2550_1_1528637099_4220413_32-142780-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 07:06:50,807 WARNING [train.py:1204] (2/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_146-218763-0_sp0.9 from training. Duration: 0.9555625 2023-10-04 07:06:52,331 WARNING [train.py:1204] (2/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_348-127789-0_sp1.1 from training. Duration: 0.77275 2023-10-04 07:06:52,336 WARNING [train.py:1204] (2/4) Exclude cut with ID _47790_210_16187_1_1533862814066_4207519_288-285445-0_sp1.1 from training. Duration: 0.936375 2023-10-04 07:06:56,319 WARNING [train.py:1204] (2/4) Exclude cut with ID _14603_210_4220_1_1530615688441_3097319_308-181732-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 07:06:56,351 WARNING [train.py:1204] (2/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_895-117428-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 07:06:57,721 WARNING [train.py:1204] (2/4) Exclude cut with ID _9235_210_5219_1_1528891154253_8046120_387-159008-0_sp0.9 from training. Duration: 0.9555625 2023-10-04 07:06:57,733 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_37351_210_7925_1_1533012931_3812113_265-85780-0_sp0.9 from training. Duration: 0.97775 2023-10-04 07:06:59,068 WARNING [train.py:1204] (2/4) Exclude cut with ID _40551_210_10635_1_1533776424632_3416179_361-3964-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 07:06:59,106 WARNING [train.py:1204] (2/4) Exclude cut with ID _25882_210_3036_1_1532775665815_6479510_485-122742-0_sp1.1 from training. Duration: 0.97275 2023-10-04 07:07:03,768 WARNING [train.py:1204] (2/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_98-51676-0 from training. Duration: 0.73 2023-10-04 07:07:07,019 WARNING [train.py:1204] (2/4) Exclude cut with ID _30548_210_16040_1_1532608097130_3146969_371-280610-0_sp1.1 from training. Duration: 0.92725 2023-10-04 07:07:07,304 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.feed_forward3.hidden_balancer.prob, batch_count=1568026.6666666667, ans=0.125 2023-10-04 07:07:10,061 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.out_combiner.scale_min, batch_count=1568093.3333333333, ans=0.2 2023-10-04 07:07:11,206 WARNING [train.py:1204] (2/4) Exclude cut with ID IC0671W0081-331924 from training. Duration: 0.896 2023-10-04 07:07:12,564 WARNING [train.py:1204] (2/4) Exclude cut with ID _40284_210_2084_1_1533513679814_3704279_380-160775-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 07:07:14,005 WARNING [train.py:1204] (2/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_57-270037-0_sp1.1 from training. Duration: 0.7 2023-10-04 07:07:14,101 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_853-150237-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 07:07:15,531 WARNING [train.py:1204] (2/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_491-261923-0 from training. Duration: 0.98 2023-10-04 07:07:17,264 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.conv_skip_rate, batch_count=1568093.3333333333, ans=0.0 2023-10-04 07:07:19,036 WARNING [train.py:1204] (2/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_836-244045-0_sp1.1 from training. Duration: 0.97275 2023-10-04 07:07:20,350 WARNING [train.py:1204] (2/4) Exclude cut with ID _14880_210_8895_1_1530680426655_3795259_289-212817-0 from training. Duration: 0.69 2023-10-04 07:07:20,445 WARNING [train.py:1204] (2/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_303-340446-0 from training. Duration: 0.9 2023-10-04 07:07:20,537 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_37152_210_9697_1_1533106573_3844188_418-41736-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 07:07:23,421 WARNING [train.py:1204] (2/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_371-18169-0_sp1.1 from training. Duration: 0.7818125 2023-10-04 07:07:23,464 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_187-219984-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 07:07:23,665 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.conv_skip_rate, batch_count=1568160.0, ans=0.0 2023-10-04 07:07:25,922 WARNING [train.py:1204] (2/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_63-267585-0 from training. Duration: 0.9 2023-10-04 07:07:27,439 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=1568160.0, ans=0.1 2023-10-04 07:07:28,918 WARNING [train.py:1204] (2/4) Exclude cut with ID _27013_210_1389_1_1532437173240_3252649_222-125573-0 from training. Duration: 0.98 2023-10-04 07:07:29,127 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.feed_forward2.hidden_balancer.prob, batch_count=1568160.0, ans=0.125 2023-10-04 07:07:30,135 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.783e+02 2.030e+02 2.333e+02 2.690e+02 5.278e+02, threshold=4.666e+02, percent-clipped=1.0 2023-10-04 07:07:30,234 WARNING [train.py:1204] (2/4) Exclude cut with ID _14821_210_8750_1_1530680503991_6829359_189-297451-0 from training. Duration: 0.55 2023-10-04 07:07:32,954 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_557-163391-0_sp1.1 from training. Duration: 0.97275 2023-10-04 07:07:33,038 WARNING [train.py:1204] (2/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_65-88242-0_sp1.1 from training. Duration: 0.5090625 2023-10-04 07:07:39,662 INFO [train.py:1046] (2/4) Epoch 45, batch 1500, loss[loss=0.1741, simple_loss=0.2588, pruned_loss=0.04471, over 24007.00 frames. ], tot_loss[loss=0.1537, simple_loss=0.2338, pruned_loss=0.03683, over 4712008.06 frames. ], batch size: 80, lr: 2.26e-03, grad_scale: 8.0 2023-10-04 07:07:42,538 WARNING [train.py:1204] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_218-295097-0 from training. Duration: 0.77 2023-10-04 07:07:42,564 WARNING [train.py:1204] (2/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_686-247600-0_sp1.1 from training. Duration: 0.836375 2023-10-04 07:07:42,567 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_397-147196-0_sp0.9 from training. Duration: 0.888875 2023-10-04 07:07:43,901 WARNING [train.py:1204] (2/4) Exclude cut with ID _53950_210_5816_1_1534417264919_3862951_237-136449-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 07:07:43,964 WARNING [train.py:1204] (2/4) Exclude cut with ID _3120_210_3525_1_1526036143112_4072980_74-178400-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 07:07:44,171 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.conv_skip_rate, batch_count=1568226.6666666667, ans=0.0 2023-10-04 07:07:45,335 WARNING [train.py:1204] (2/4) Exclude cut with ID _5166_210_2084_1_1527073197160_4087993_309-222545-0_sp0.9 from training. Duration: 0.9333125 2023-10-04 07:07:45,492 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.1.nonlin_attention.balancer.prob, batch_count=1568226.6666666667, ans=0.125 2023-10-04 07:07:46,726 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7544_210_6753_1_1528587722_3576686_172-171750-0 from training. Duration: 0.65 2023-10-04 07:07:48,143 WARNING [train.py:1204] (2/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_3-237434-0_sp0.9 from training. Duration: 0.8444375 2023-10-04 07:07:49,870 WARNING [train.py:1204] (2/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_101-293367-0_sp0.9 from training. Duration: 0.7 2023-10-04 07:07:49,878 WARNING [train.py:1204] (2/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_818-213552-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 07:07:49,967 WARNING [train.py:1204] (2/4) Exclude cut with ID _14349_210_8341_1_1530615710818_3442470_115-285975-0_sp1.1 from training. Duration: 0.7818125 2023-10-04 07:07:52,641 WARNING [train.py:1204] (2/4) Exclude cut with ID _30800_210_22382_1_1532499347904_4515959_291-309749-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 07:07:54,083 WARNING [train.py:1204] (2/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_715-3462-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 07:07:55,716 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.conv_module2.balancer2.min_positive, batch_count=1568293.3333333333, ans=0.05 2023-10-04 07:07:58,312 WARNING [train.py:1204] (2/4) Exclude cut with ID _59502_210_12210_1_1535107024698_1835139_13-331827-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 07:07:58,323 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_4528_210_3273_1_1527076654_3276777_342-168068-0 from training. Duration: 0.87 2023-10-04 07:07:58,385 WARNING [train.py:1204] (2/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_215-141059-0_sp0.9 from training. Duration: 0.688875 2023-10-04 07:08:00,293 WARNING [train.py:1204] (2/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_106-286130-0_sp1.1 from training. Duration: 0.8909375 2023-10-04 07:08:01,632 WARNING [train.py:1204] (2/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_480-77655-0_sp1.1 from training. Duration: 0.97275 2023-10-04 07:08:03,689 WARNING [train.py:1204] (2/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_848-280957-0 from training. Duration: 0.87 2023-10-04 07:08:08,589 WARNING [train.py:1204] (2/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_122-109023-0 from training. Duration: 0.76 2023-10-04 07:08:09,942 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_11305_210_1794_1_1529750428_7178148_645-233480-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 07:08:10,003 WARNING [train.py:1204] (2/4) Exclude cut with ID _7562_210_2084_1_1528887767916_3946229_345-169156-0 from training. Duration: 0.58 2023-10-04 07:08:12,708 WARNING [train.py:1204] (2/4) Exclude cut with ID _7562_210_2084_1_1528887767916_3946229_345-169156-0_sp1.1 from training. Duration: 0.52725 2023-10-04 07:08:15,381 WARNING [train.py:1204] (2/4) Exclude cut with ID _13253_210_1925_1_1530757775819_3577873_253-246386-0_sp1.1 from training. Duration: 0.8181875 2023-10-04 07:08:16,704 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_136-264444-0_sp1.1 from training. Duration: 0.97275 2023-10-04 07:08:16,726 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_2168_210_2290_1_1525606920_1849400_136-222991-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 07:08:16,824 WARNING [train.py:1204] (2/4) Exclude cut with ID _7852_210_5219_1_1528286471316_8139400_242-105336-0 from training. Duration: 0.72 2023-10-04 07:08:18,176 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_5960_210_1794_1_1527403865_3990999_515-2613-0_sp1.1 from training. Duration: 0.8 2023-10-04 07:08:18,199 WARNING [train.py:1204] (2/4) Exclude cut with ID _39688_210_19596_1_1533371626725_4154159_551-167834-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 07:08:18,252 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_43802_210_2290_1_1533516203_8829504_85-195240-0 from training. Duration: 0.87 2023-10-04 07:08:18,303 WARNING [train.py:1204] (2/4) Exclude cut with ID _63171_210_15488_1_1535104792794_3295110_113-113534-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 07:08:24,307 WARNING [train.py:1204] (2/4) Exclude cut with ID _8833_210_5219_1_1528592665210_7553059_319-349181-0_sp1.1 from training. Duration: 0.9 2023-10-04 07:08:24,308 WARNING [train.py:1204] (2/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_225-43381-0 from training. Duration: 0.94 2023-10-04 07:08:30,304 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_5959_210_1794_1_1527400400_3445726_412-44052-0_sp0.9 from training. Duration: 0.8555625 2023-10-04 07:08:30,429 WARNING [train.py:1204] (2/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_356-167816-0_sp1.1 from training. Duration: 0.6545625 2023-10-04 07:08:35,106 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9012W0112-501244 from training. Duration: 0.768 2023-10-04 07:08:35,169 WARNING [train.py:1204] (2/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_177-326952-0_sp1.1 from training. Duration: 0.92725 2023-10-04 07:08:36,455 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9016W0116-502789 from training. Duration: 0.896 2023-10-04 07:08:36,558 WARNING [train.py:1204] (2/4) Exclude cut with ID _56235_210_10640_1_1534557335235_4161099_557-204815-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 07:08:38,418 WARNING [train.py:1204] (2/4) Exclude cut with ID _55857_210_2346_1_1534644007831_6898209_279-229469-0_sp1.1 from training. Duration: 0.936375 2023-10-04 07:08:38,492 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9029W0375-509764 from training. Duration: 0.896 2023-10-04 07:08:39,901 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_2295_210_2087_1_1525500993_2407863_234-83104-0_sp0.9 from training. Duration: 0.688875 2023-10-04 07:08:42,627 WARNING [train.py:1204] (2/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_266-137104-0 from training. Duration: 0.65 2023-10-04 07:08:43,909 WARNING [train.py:1204] (2/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_289-163927-0_sp1.1 from training. Duration: 0.92725 2023-10-04 07:08:46,579 WARNING [train.py:1204] (2/4) Exclude cut with ID _12733_210_8341_1_1530167424556_3646380_86-225314-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 07:08:47,984 WARNING [train.py:1204] (2/4) Exclude cut with ID _7877_210_6014_1_1528286358099_3750136_618-284064-0_sp1.1 from training. Duration: 0.92725 2023-10-04 07:08:48,030 WARNING [train.py:1204] (2/4) Exclude cut with ID _55844_210_2346_1_1534471212569_7625707_1017-31469-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 07:08:49,401 WARNING [train.py:1204] (2/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_617-241446-0_sp1.1 from training. Duration: 0.92725 2023-10-04 07:08:49,469 WARNING [train.py:1204] (2/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_866-173440-0_sp1.1 from training. Duration: 0.8181875 2023-10-04 07:08:50,752 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_9460_210_4211_1_1528954587_10049667_480-260269-0 from training. Duration: 0.56 2023-10-04 07:08:52,104 WARNING [train.py:1204] (2/4) Exclude cut with ID _44659_210_2721_1_1534036120061_6614860_626-298884-0 from training. Duration: 0.97 2023-10-04 07:08:52,142 WARNING [train.py:1204] (2/4) Exclude cut with ID _53565_210_10635_1_1534337218335_3820259_287-18120-0_sp1.1 from training. Duration: 0.8818125 2023-10-04 07:08:53,554 INFO [train.py:1046] (2/4) Epoch 45, batch 1550, loss[loss=0.1406, simple_loss=0.2196, pruned_loss=0.03074, over 20622.00 frames. ], tot_loss[loss=0.1542, simple_loss=0.2341, pruned_loss=0.03714, over 4700752.94 frames. ], batch size: 45, lr: 2.26e-03, grad_scale: 4.0 2023-10-04 07:08:53,627 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_791-197891-0 from training. Duration: 0.84 2023-10-04 07:08:53,663 WARNING [train.py:1204] (2/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_527-238990-0 from training. Duration: 0.9 2023-10-04 07:08:55,597 WARNING [train.py:1204] (2/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_449-65969-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 07:08:58,813 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_553-341452-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 07:08:58,861 WARNING [train.py:1204] (2/4) Exclude cut with ID _16909_210_5724_1_1531292079912_4118919_300-47574-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 07:08:58,882 WARNING [train.py:1204] (2/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_70-27732-0_sp1.1 from training. Duration: 0.8545625 2023-10-04 07:09:00,437 WARNING [train.py:1204] (2/4) Exclude cut with ID _44718_210_5816_1_1534050051869_6469030_727-28074-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 07:09:00,654 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.bypass.skip_rate, batch_count=1568560.0, ans=0.09899494936611666 2023-10-04 07:09:01,722 WARNING [train.py:1204] (2/4) Exclude cut with ID _55857_210_2346_1_1534644007831_6898209_357-123664-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 07:09:06,245 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9123W0004-555964 from training. Duration: 0.896 2023-10-04 07:09:06,283 WARNING [train.py:1204] (2/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_445-145208-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 07:09:07,568 WARNING [train.py:1204] (2/4) Exclude cut with ID _3675_210_4557_1_1526639934408_4182089_456-18305-0_sp1.1 from training. Duration: 0.6909375 2023-10-04 07:09:07,637 WARNING [train.py:1204] (2/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_1-99680-0_sp1.1 from training. Duration: 0.6090625 2023-10-04 07:09:10,333 WARNING [train.py:1204] (2/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_146-282400-0_sp0.9 from training. Duration: 0.788875 2023-10-04 07:09:10,347 WARNING [train.py:1204] (2/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_844-96989-0 from training. Duration: 0.71 2023-10-04 07:09:12,389 WARNING [train.py:1204] (2/4) Exclude cut with ID _29853_210_7497_1_1532505600851_6642110_128-146364-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 07:09:12,412 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_224-322394-0 from training. Duration: 0.66 2023-10-04 07:09:13,807 WARNING [train.py:1204] (2/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_191-59784-0 from training. Duration: 0.88 2023-10-04 07:09:13,813 WARNING [train.py:1204] (2/4) Exclude cut with ID _12556_210_8341_1_1530081024100_7327690_725-193477-0 from training. Duration: 0.98 2023-10-04 07:09:13,865 WARNING [train.py:1204] (2/4) Exclude cut with ID _5166_210_2084_1_1527073197160_4087993_523-79838-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 07:09:15,095 WARNING [train.py:1204] (2/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_398-334558-0_sp1.1 from training. Duration: 0.87275 2023-10-04 07:09:15,329 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.self_attn_weights.pos_emb_skip_rate, batch_count=1568626.6666666667, ans=0.0 2023-10-04 07:09:16,116 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.0.layers.0.nonlin_attention.whiten2, num_groups=1, num_channels=192, metric=5.09 vs. limit=15.0 2023-10-04 07:09:20,450 WARNING [train.py:1204] (2/4) Exclude cut with ID _6456_210_1534_1_1527681634212_3181049_171-36697-0_sp1.1 from training. Duration: 0.936375 2023-10-04 07:09:23,310 WARNING [train.py:1204] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_700-183549-0 from training. Duration: 0.68 2023-10-04 07:09:23,313 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8853_210_3273_1_1528633899_818968_51-187685-0 from training. Duration: 0.69 2023-10-04 07:09:27,166 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.bypass.skip_rate, batch_count=1568693.3333333333, ans=0.04949747468305833 2023-10-04 07:09:29,620 WARNING [train.py:1204] (2/4) Exclude cut with ID _7719_210_3525_1_1528456153163_4296960_333-110399-0_sp1.1 from training. Duration: 0.87275 2023-10-04 07:09:34,274 WARNING [train.py:1204] (2/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_1022-83740-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 07:09:34,298 WARNING [train.py:1204] (2/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_61-327062-0_sp0.9 from training. Duration: 0.9 2023-10-04 07:09:34,310 WARNING [train.py:1204] (2/4) Exclude cut with ID _47251_210_18628_1_1533814063128_4137250_214-88076-0_sp1.1 from training. Duration: 0.8 2023-10-04 07:09:34,444 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.conv_skip_rate, batch_count=1568693.3333333333, ans=0.0 2023-10-04 07:09:35,704 WARNING [train.py:1204] (2/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_839-173017-0 from training. Duration: 0.41 2023-10-04 07:09:40,397 WARNING [train.py:1204] (2/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_299-34413-0_sp1.1 from training. Duration: 0.5909375 2023-10-04 07:09:41,861 WARNING [train.py:1204] (2/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_664-159457-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 07:09:44,989 WARNING [train.py:1204] (2/4) Exclude cut with ID _66873_210_5517_1_1535454136506_4293370_483-291760-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 07:09:47,801 WARNING [train.py:1204] (2/4) Exclude cut with ID _30800_210_22382_1_1532499347904_4515959_287-255474-0_sp1.1 from training. Duration: 0.92725 2023-10-04 07:09:47,842 WARNING [train.py:1204] (2/4) Exclude cut with ID _48677_210_5663_1_1533902405919_3892989_111-11439-0_sp1.1 from training. Duration: 0.87275 2023-10-04 07:09:49,154 WARNING [train.py:1204] (2/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_399-156791-0 from training. Duration: 0.83 2023-10-04 07:09:49,194 WARNING [train.py:1204] (2/4) Exclude cut with ID _3992_210_1747_1_1526644816031_3731060_36-147855-0_sp0.9 from training. Duration: 0.8666875 2023-10-04 07:09:50,588 WARNING [train.py:1204] (2/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_442-320317-0_sp1.1 from training. Duration: 0.7454375 2023-10-04 07:09:50,613 WARNING [train.py:1204] (2/4) Exclude cut with ID _55429_210_10626_1_1534467588527_7174143_953-80868-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 07:09:51,980 WARNING [train.py:1204] (2/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_159-328157-0_sp0.9 from training. Duration: 0.57775 2023-10-04 07:09:51,987 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9309W0197-640324 from training. Duration: 0.896 2023-10-04 07:09:55,933 WARNING [train.py:1204] (2/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_361-30822-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 07:09:59,431 WARNING [train.py:1204] (2/4) Exclude cut with ID _5250_210_5219_1_1527249561806_8692110_636-251213-0 from training. Duration: 0.94 2023-10-04 07:10:00,709 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.623e+02 1.950e+02 2.179e+02 2.499e+02 3.800e+02, threshold=4.358e+02, percent-clipped=0.0 2023-10-04 07:10:04,342 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.feed_forward3.hidden_balancer.prob, batch_count=1568826.6666666667, ans=0.125 2023-10-04 07:10:05,541 WARNING [train.py:1204] (2/4) Exclude cut with ID _49728_210_12651_1_1534041034294_6788150_468-333827-0_sp1.1 from training. Duration: 0.963625 2023-10-04 07:10:06,832 WARNING [train.py:1204] (2/4) Exclude cut with ID _25882_210_3036_1_1532775665815_6479510_623-191759-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 07:10:08,089 INFO [train.py:1046] (2/4) Epoch 45, batch 1600, loss[loss=0.1983, simple_loss=0.2649, pruned_loss=0.06584, over 19513.00 frames. ], tot_loss[loss=0.1546, simple_loss=0.2348, pruned_loss=0.03723, over 4721859.91 frames. ], batch size: 388, lr: 2.26e-03, grad_scale: 8.0 2023-10-04 07:10:08,143 WARNING [train.py:1204] (2/4) Exclude cut with ID _5189_210_4557_1_1527248319428_3769229_199-53526-0 from training. Duration: 0.42 2023-10-04 07:10:08,263 WARNING [train.py:1204] (2/4) Exclude cut with ID _6274_210_2084_1_1527937583837_7494020_32-205288-0_sp0.9 from training. Duration: 0.8666875 2023-10-04 07:10:09,558 WARNING [train.py:1204] (2/4) Exclude cut with ID _65263_210_22356_1_1535763598691_3921037_401-346646-0_sp1.1 from training. Duration: 0.963625 2023-10-04 07:10:09,562 WARNING [train.py:1204] (2/4) Exclude cut with ID _12555_210_8750_1_1530075594402_3780239_449-267986-0_sp1.1 from training. Duration: 0.7545625 2023-10-04 07:10:09,580 WARNING [train.py:1204] (2/4) Exclude cut with ID _69884_210_9771_1_1535846392818_4166169_430-201020-0_sp1.1 from training. Duration: 0.8818125 2023-10-04 07:10:11,410 WARNING [train.py:1204] (2/4) Exclude cut with ID _8394_210_6702_1_1528590504316_3540189_379-49457-0_sp1.1 from training. Duration: 0.7909375 2023-10-04 07:10:12,918 WARNING [train.py:1204] (2/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_200-105218-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 07:10:14,607 WARNING [train.py:1204] (2/4) Exclude cut with ID _3867_210_1883_1_1526641200575_4475830_319-349850-0 from training. Duration: 0.67 2023-10-04 07:10:15,984 WARNING [train.py:1204] (2/4) Exclude cut with ID _28317_210_12742_1_1532480302381_8510325_916-195848-0 from training. Duration: 0.92 2023-10-04 07:10:17,371 WARNING [train.py:1204] (2/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_436-186311-0 from training. Duration: 0.65 2023-10-04 07:10:18,706 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_3910_210_1794_1_1526777667_3747260_302-180617-0_sp1.1 from training. Duration: 0.97275 2023-10-04 07:10:20,165 WARNING [train.py:1204] (2/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_102-92751-0 from training. Duration: 0.99 2023-10-04 07:10:21,522 WARNING [train.py:1204] (2/4) Exclude cut with ID _51899_210_20034_1_1534129254466_3669220_265-215428-0_sp1.1 from training. Duration: 0.8909375 2023-10-04 07:10:24,238 WARNING [train.py:1204] (2/4) Exclude cut with ID _58583_210_5236_1_1534726835266_3574249_438-198993-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 07:10:28,479 WARNING [train.py:1204] (2/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_166-166990-0_sp1.1 from training. Duration: 0.87275 2023-10-04 07:10:30,250 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.conv_module2.balancer1.prob, batch_count=1568960.0, ans=0.125 2023-10-04 07:10:33,622 WARNING [train.py:1204] (2/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_907-151398-0 from training. Duration: 0.37 2023-10-04 07:10:35,127 WARNING [train.py:1204] (2/4) Exclude cut with ID _38670_210_15379_1_1533448810659_4391059_452-256669-0_sp1.1 from training. Duration: 0.936375 2023-10-04 07:10:35,182 WARNING [train.py:1204] (2/4) Exclude cut with ID _48677_210_5663_1_1533902405919_3892989_408-40740-0 from training. Duration: 0.83 2023-10-04 07:10:36,517 WARNING [train.py:1204] (2/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_903-158756-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 07:10:36,574 WARNING [train.py:1204] (2/4) Exclude cut with ID _13252_210_1925_1_1530671372047_3336909_397-216497-0 from training. Duration: 0.96 2023-10-04 07:10:37,131 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.self_attn_weights.whiten_keys.whitening_limit, batch_count=1569026.6666666667, ans=6.0 2023-10-04 07:10:42,685 WARNING [train.py:1204] (2/4) Exclude cut with ID _55429_210_10626_1_1534467588527_7174143_97-335084-0 from training. Duration: 0.97 2023-10-04 07:10:50,034 WARNING [train.py:1204] (2/4) Exclude cut with ID _45329_210_7005_1_1534157951211_6774339_453-267730-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 07:10:50,101 WARNING [train.py:1204] (2/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_153-97441-0 from training. Duration: 0.56 2023-10-04 07:10:51,399 WARNING [train.py:1204] (2/4) Exclude cut with ID _40901_210_5665_1_1533363789958_3856130_312-329122-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 07:10:52,847 WARNING [train.py:1204] (2/4) Exclude cut with ID _30800_210_22382_1_1532499347904_4515959_97-126471-0_sp1.1 from training. Duration: 0.97275 2023-10-04 07:10:52,848 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_11305_210_1794_1_1529750428_7178148_257-312870-0_sp1.1 from training. Duration: 0.8 2023-10-04 07:10:54,331 WARNING [train.py:1204] (2/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_454-314901-0_sp0.9 from training. Duration: 0.511125 2023-10-04 07:10:58,446 WARNING [train.py:1204] (2/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_282-136641-0_sp1.1 from training. Duration: 0.3818125 2023-10-04 07:11:01,631 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_46013_210_7234_1_1533776362_3744032_493-160724-0_sp1.1 from training. Duration: 0.963625 2023-10-04 07:11:01,672 WARNING [train.py:1204] (2/4) Exclude cut with ID _67831_210_12392_1_1535612464202_3092612_259-33442-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 07:11:03,030 WARNING [train.py:1204] (2/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_289-62384-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 07:11:04,468 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6545_210_2040_1_1527774670_4245258_379-14182-0_sp0.9 from training. Duration: 0.888875 2023-10-04 07:11:06,324 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_548-294316-0_sp1.1 from training. Duration: 0.736375 2023-10-04 07:11:07,732 WARNING [train.py:1204] (2/4) Exclude cut with ID _6275_210_2084_1_1528110062213_7669049_857-298942-0_sp1.1 from training. Duration: 0.77275 2023-10-04 07:11:07,937 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.0.conv_module2.balancer1.prob, batch_count=1569160.0, ans=0.125 2023-10-04 07:11:08,069 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=1569160.0, ans=0.1 2023-10-04 07:11:09,136 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6673_210_3273_1_1527767790_3173240_327-306185-0_sp1.1 from training. Duration: 0.7545625 2023-10-04 07:11:09,453 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.feed_forward3.hidden_balancer.prob, batch_count=1569160.0, ans=0.125 2023-10-04 07:11:15,201 WARNING [train.py:1204] (2/4) Exclude cut with ID _61008_210_10633_1_1534852769198_6974370_814-107647-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 07:11:16,570 WARNING [train.py:1204] (2/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_116-332008-0_sp1.1 from training. Duration: 0.8909375 2023-10-04 07:11:18,258 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.bypass.scale_min, batch_count=1569160.0, ans=0.2 2023-10-04 07:11:19,313 WARNING [train.py:1204] (2/4) Exclude cut with ID _32704_210_8954_1_1533017488840_6747370_266-329755-0 from training. Duration: 0.89 2023-10-04 07:11:19,316 WARNING [train.py:1204] (2/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_271-269084-0_sp0.9 from training. Duration: 0.92225 2023-10-04 07:11:19,387 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_3171_210_2087_1_1526109084_2330007_66-260583-0 from training. Duration: 0.72 2023-10-04 07:11:22,718 INFO [train.py:1046] (2/4) Epoch 45, batch 1650, loss[loss=0.1694, simple_loss=0.2565, pruned_loss=0.04118, over 23980.00 frames. ], tot_loss[loss=0.1553, simple_loss=0.2359, pruned_loss=0.03738, over 4718972.73 frames. ], batch size: 80, lr: 2.26e-03, grad_scale: 8.0 2023-10-04 07:11:24,142 WARNING [train.py:1204] (2/4) Exclude cut with ID _5250_210_5219_1_1527249561806_8692110_1326-276386-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 07:11:26,834 WARNING [train.py:1204] (2/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_372-30302-0_sp1.1 from training. Duration: 0.7818125 2023-10-04 07:11:28,120 WARNING [train.py:1204] (2/4) Exclude cut with ID _25882_210_3036_1_1532775665815_6479510_631-257029-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 07:11:28,140 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_3691_210_3528_1_1526468169_4062053_355-334432-0 from training. Duration: 0.87 2023-10-04 07:11:28,149 WARNING [train.py:1204] (2/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_839-173017-0_sp1.1 from training. Duration: 0.37275 2023-10-04 07:11:28,158 WARNING [train.py:1204] (2/4) Exclude cut with ID _14998_210_5219_1_1530705582080_7474970_441-91466-0 from training. Duration: 0.97 2023-10-04 07:11:28,196 WARNING [train.py:1204] (2/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_108-53929-0 from training. Duration: 0.59 2023-10-04 07:11:32,987 WARNING [train.py:1204] (2/4) Exclude cut with ID _9389_210_1664_1_1529217337301_5289269_260-1942-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 07:11:33,062 WARNING [train.py:1204] (2/4) Exclude cut with ID _56423_210_5854_1_1534896061049_3165969_198-61547-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 07:11:33,088 WARNING [train.py:1204] (2/4) Exclude cut with ID _44778_210_2054_1_1533798127203_7443090_412-142122-0_sp1.1 from training. Duration: 0.92725 2023-10-04 07:11:34,329 WARNING [train.py:1204] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_88-341718-0_sp1.1 from training. Duration: 0.6 2023-10-04 07:11:36,960 WARNING [train.py:1204] (2/4) Exclude cut with ID _61079_210_4525_1_1534921008198_4011419_334-298697-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 07:11:38,362 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_5465_210_4211_1_1527227253_55357_8-2499-0_sp1.1 from training. Duration: 0.996375 2023-10-04 07:11:38,527 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=1569293.3333333333, ans=0.1 2023-10-04 07:11:40,239 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_447-201565-0_sp1.1 from training. Duration: 0.97275 2023-10-04 07:11:40,245 WARNING [train.py:1204] (2/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_253-161789-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 07:11:40,248 WARNING [train.py:1204] (2/4) Exclude cut with ID _44304_210_15374_1_1533639352765_4613129_302-22084-0_sp0.9 from training. Duration: 0.9555625 2023-10-04 07:11:40,262 WARNING [train.py:1204] (2/4) Exclude cut with ID _26664_210_7770_1_1532258943280_3418270_187-80815-0_sp1.1 from training. Duration: 0.8454375 2023-10-04 07:11:41,583 WARNING [train.py:1204] (2/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_68-212801-0 from training. Duration: 0.73 2023-10-04 07:11:41,615 WARNING [train.py:1204] (2/4) Exclude cut with ID _29345_210_4381_1_1532689168263_3860169_149-257268-0 from training. Duration: 0.81 2023-10-04 07:11:47,644 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_213-103282-0_sp1.1 from training. Duration: 0.7090625 2023-10-04 07:11:50,395 WARNING [train.py:1204] (2/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_156-302675-0_sp1.1 from training. Duration: 0.5 2023-10-04 07:11:56,736 WARNING [train.py:1204] (2/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_125-322172-0 from training. Duration: 0.93 2023-10-04 07:11:56,989 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.self_attn_weights.pos_emb_skip_rate, batch_count=1569360.0, ans=0.0 2023-10-04 07:11:58,047 WARNING [train.py:1204] (2/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_493-60474-0_sp1.1 from training. Duration: 0.963625 2023-10-04 07:12:00,780 WARNING [train.py:1204] (2/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_156-302675-0 from training. Duration: 0.55 2023-10-04 07:12:03,679 WARNING [train.py:1204] (2/4) Exclude cut with ID _41314_210_12651_1_1533459476543_3766460_123-267862-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 07:12:06,293 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.2.encoder.layers.1.feed_forward1.out_whiten, num_groups=1, num_channels=384, metric=6.20 vs. limit=15.0 2023-10-04 07:12:06,901 WARNING [train.py:1204] (2/4) Exclude cut with ID _28427_210_4220_1_1532581240967_3061069_201-143747-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 07:12:06,928 WARNING [train.py:1204] (2/4) Exclude cut with ID _8772_210_1850_1_1528635595972_3664011_96-12756-0_sp1.1 from training. Duration: 0.8909375 2023-10-04 07:12:06,971 WARNING [train.py:1204] (2/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_1347-145675-0_sp1.1 from training. Duration: 0.936375 2023-10-04 07:12:08,852 WARNING [train.py:1204] (2/4) Exclude cut with ID _9235_210_5219_1_1528891154253_8046120_387-159008-0_sp1.1 from training. Duration: 0.7818125 2023-10-04 07:12:08,880 WARNING [train.py:1204] (2/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_400-201896-0_sp1.1 from training. Duration: 0.963625 2023-10-04 07:12:11,752 WARNING [train.py:1204] (2/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_326-174612-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 07:12:11,793 WARNING [train.py:1204] (2/4) Exclude cut with ID _55844_210_2346_1_1534471212569_7625707_390-116233-0_sp1.1 from training. Duration: 0.963625 2023-10-04 07:12:13,503 WARNING [train.py:1204] (2/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_419-317062-0_sp1.1 from training. Duration: 0.82725 2023-10-04 07:12:14,813 WARNING [train.py:1204] (2/4) Exclude cut with ID _29877_210_6228_1_1532397791106_5276149_266-62291-0_sp0.9 from training. Duration: 0.9666875 2023-10-04 07:12:14,892 WARNING [train.py:1204] (2/4) Exclude cut with ID _7857_210_1534_1_1528286515745_6831470_431-338528-0_sp1.1 from training. Duration: 0.92725 2023-10-04 07:12:16,217 WARNING [train.py:1204] (2/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_87-131427-0_sp0.9 from training. Duration: 0.8666875 2023-10-04 07:12:18,977 WARNING [train.py:1204] (2/4) Exclude cut with ID _24055_210_12651_1_1532310907143_976979_92-59356-0_sp1.1 from training. Duration: 0.82725 2023-10-04 07:12:20,277 WARNING [train.py:1204] (2/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_96-290039-0 from training. Duration: 0.56 2023-10-04 07:12:21,794 WARNING [train.py:1204] (2/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_48-182155-0_sp0.9 from training. Duration: 0.9666875 2023-10-04 07:12:21,831 WARNING [train.py:1204] (2/4) Exclude cut with ID _4626_210_3532_1_1526817652717_3916859_446-206656-0 from training. Duration: 0.76 2023-10-04 07:12:23,231 WARNING [train.py:1204] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_168-32906-0 from training. Duration: 0.69 2023-10-04 07:12:23,246 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_3780_210_2087_1_1526467760_2130483_170-259044-0 from training. Duration: 0.7 2023-10-04 07:12:23,263 WARNING [train.py:1204] (2/4) Exclude cut with ID _65134_210_14763_1_1535335393985_3505368_153-216718-0_sp1.1 from training. Duration: 0.92725 2023-10-04 07:12:23,317 WARNING [train.py:1204] (2/4) Exclude cut with ID _64639_210_5816_1_1535367619437_3528000_461-329156-0_sp1.1 from training. Duration: 0.87275 2023-10-04 07:12:23,352 WARNING [train.py:1204] (2/4) Exclude cut with ID _21989_210_5854_1_1531899214931_3271360_126-347531-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 07:12:25,086 WARNING [train.py:1204] (2/4) Exclude cut with ID _7614_210_4915_1_1529326764092_4537359_96-162197-0_sp1.1 from training. Duration: 0.963625 2023-10-04 07:12:25,087 WARNING [train.py:1204] (2/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_1083-147536-0 from training. Duration: 0.97 2023-10-04 07:12:27,907 WARNING [train.py:1204] (2/4) Exclude cut with ID _64029_210_4381_1_1535178502772_3756459_275-16710-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 07:12:29,078 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.587e+02 2.045e+02 2.430e+02 3.009e+02 4.606e+02, threshold=4.861e+02, percent-clipped=4.0 2023-10-04 07:12:30,571 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7264_210_4938_1_1527939911_8132095_800-91995-0_sp0.9 from training. Duration: 0.9555625 2023-10-04 07:12:30,615 WARNING [train.py:1204] (2/4) Exclude cut with ID _3844_210_1390_1_1526727803582_2871209_50-193879-0_sp1.1 from training. Duration: 0.936375 2023-10-04 07:12:32,084 WARNING [train.py:1204] (2/4) Exclude cut with ID _46952_210_5986_1_1534230010357_3107379_232-344503-0 from training. Duration: 0.96 2023-10-04 07:12:36,952 INFO [train.py:1046] (2/4) Epoch 45, batch 1700, loss[loss=0.1775, simple_loss=0.2632, pruned_loss=0.04587, over 24373.00 frames. ], tot_loss[loss=0.1555, simple_loss=0.2359, pruned_loss=0.03756, over 4718672.57 frames. ], batch size: 77, lr: 2.26e-03, grad_scale: 8.0 2023-10-04 07:12:37,086 WARNING [train.py:1204] (2/4) Exclude cut with ID _25808_210_13292_1_1532343514562_7366109_206-14076-0_sp1.1 from training. Duration: 0.936375 2023-10-04 07:12:37,087 WARNING [train.py:1204] (2/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_55-31904-0_sp0.9 from training. Duration: 0.97775 2023-10-04 07:12:38,306 WARNING [train.py:1204] (2/4) Exclude cut with ID _14632_210_9988_1_1530691279392_3923200_161-129530-0 from training. Duration: 0.96 2023-10-04 07:12:38,373 WARNING [train.py:1204] (2/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_770-200430-0_sp1.1 from training. Duration: 0.8090625 2023-10-04 07:12:38,391 WARNING [train.py:1204] (2/4) Exclude cut with ID _6270_210_2084_1_1527850911573_3856420_26-277343-0_sp1.1 from training. Duration: 0.7454375 2023-10-04 07:12:38,392 WARNING [train.py:1204] (2/4) Exclude cut with ID _69884_210_9771_1_1535846392818_4166169_88-23776-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 07:12:42,417 WARNING [train.py:1204] (2/4) Exclude cut with ID _60860_210_8254_1_1534896015875_3557169_245-256666-0_sp1.1 from training. Duration: 0.8818125 2023-10-04 07:12:42,454 WARNING [train.py:1204] (2/4) Exclude cut with ID _5250_210_5219_1_1527249561806_8692110_636-251213-0_sp1.1 from training. Duration: 0.8545625 2023-10-04 07:12:42,468 WARNING [train.py:1204] (2/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_540-131164-0 from training. Duration: 0.92 2023-10-04 07:12:44,525 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_5931_210_2040_1_1527428481_4547999_321-193614-0_sp0.9 from training. Duration: 0.7444375 2023-10-04 07:12:49,240 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.5.encoder.layers.1.nonlin_attention.whiten1, num_groups=1, num_channels=192, metric=3.15 vs. limit=10.0 2023-10-04 07:12:52,842 WARNING [train.py:1204] (2/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_373-225403-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 07:12:56,025 WARNING [train.py:1204] (2/4) Exclude cut with ID _20622_210_3852_1_1531622427080_1093839_57-326027-0_sp1.1 from training. Duration: 0.97275 2023-10-04 07:13:00,267 WARNING [train.py:1204] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_605-257205-0_sp1.1 from training. Duration: 0.536375 2023-10-04 07:13:01,088 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.2.encoder.layers.2.self_attn1.whiten, num_groups=1, num_channels=384, metric=21.40 vs. limit=22.5 2023-10-04 07:13:01,708 WARNING [train.py:1204] (2/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_442-320317-0_sp0.9 from training. Duration: 0.911125 2023-10-04 07:13:01,774 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_35361_210_4238_1_1533262810_7793476_675-87012-0_sp1.1 from training. Duration: 0.8090625 2023-10-04 07:13:02,987 WARNING [train.py:1204] (2/4) Exclude cut with ID _4184_210_2550_1_1526730192013_2219380_306-44736-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 07:13:04,963 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_124-113562-0 from training. Duration: 0.94 2023-10-04 07:13:08,213 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_2821_210_2715_1_1526108385_3763560_73-296673-0_sp0.9 from training. Duration: 0.92225 2023-10-04 07:13:09,499 WARNING [train.py:1204] (2/4) Exclude cut with ID _10301_210_5886_1_1529222275332_3910620_216-46959-0_sp1.1 from training. Duration: 0.92725 2023-10-04 07:13:09,621 WARNING [train.py:1204] (2/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_91-277684-0_sp0.9 from training. Duration: 0.82225 2023-10-04 07:13:09,856 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.conv_module2.balancer2.prob, batch_count=1569693.3333333333, ans=0.125 2023-10-04 07:13:12,278 WARNING [train.py:1204] (2/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_351-294650-0_sp0.9 from training. Duration: 0.7 2023-10-04 07:13:13,697 WARNING [train.py:1204] (2/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_209-205365-0 from training. Duration: 0.79 2023-10-04 07:13:13,756 WARNING [train.py:1204] (2/4) Exclude cut with ID _14603_210_4220_1_1530615688441_3097319_97-325053-0 from training. Duration: 0.92 2023-10-04 07:13:15,213 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_49348_210_7927_1_1533954500_3691300_109-276430-0_sp1.1 from training. Duration: 0.92725 2023-10-04 07:13:15,945 WARNING [train.py:1204] (2/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_912-47473-0 from training. Duration: 0.68 2023-10-04 07:13:17,075 WARNING [train.py:1204] (2/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_96-320736-0_sp0.9 from training. Duration: 0.9555625 2023-10-04 07:13:22,928 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.0.feed_forward1.out_proj.dropout_p, batch_count=1569760.0, ans=0.1 2023-10-04 07:13:22,947 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.balancer1.prob, batch_count=1569760.0, ans=0.125 2023-10-04 07:13:24,645 WARNING [train.py:1204] (2/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_1252-235481-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 07:13:25,977 WARNING [train.py:1204] (2/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_813-261554-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 07:13:27,300 WARNING [train.py:1204] (2/4) Exclude cut with ID _21632_210_2253_1_1531994001765_3744140_110-274654-0_sp0.9 from training. Duration: 0.911125 2023-10-04 07:13:30,104 WARNING [train.py:1204] (2/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_381-317791-0_sp1.1 from training. Duration: 0.663625 2023-10-04 07:13:30,118 WARNING [train.py:1204] (2/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_51-117576-0_sp1.1 from training. Duration: 0.363625 2023-10-04 07:13:30,134 WARNING [train.py:1204] (2/4) Exclude cut with ID _42321_210_7770_1_1533468631352_4366895_104-215915-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 07:13:31,552 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_40896_210_15710_1_1533433956_7100802_216-22824-0_sp1.1 from training. Duration: 0.92725 2023-10-04 07:13:31,552 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_14831_210_7927_1_1530700200_5583610_181-27467-0 from training. Duration: 0.91 2023-10-04 07:13:31,614 WARNING [train.py:1204] (2/4) Exclude cut with ID _30304_210_6319_1_1532476827261_7582060_1024-265645-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 07:13:31,616 WARNING [train.py:1204] (2/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_412-233904-0_sp1.1 from training. Duration: 0.936375 2023-10-04 07:13:32,973 WARNING [train.py:1204] (2/4) Exclude cut with ID _30191_210_1664_1_1533189915047_7196670_911-223461-0_sp1.1 from training. Duration: 0.92725 2023-10-04 07:13:32,988 WARNING [train.py:1204] (2/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_558-148266-0_sp1.1 from training. Duration: 0.963625 2023-10-04 07:13:36,250 WARNING [train.py:1204] (2/4) Exclude cut with ID _58480_210_12392_1_1534811121111_3646160_212-164495-0_sp1.1 from training. Duration: 0.936375 2023-10-04 07:13:36,251 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_454-276429-0_sp0.9 from training. Duration: 0.9666875 2023-10-04 07:13:37,476 WARNING [train.py:1204] (2/4) Exclude cut with ID _24736_210_3279_1_1532332894218_2838700_73-197086-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 07:13:37,534 WARNING [train.py:1204] (2/4) Exclude cut with ID _5294_210_1747_1_1527249643928_3696179_529-242298-0_sp1.1 from training. Duration: 0.863625 2023-10-04 07:13:37,562 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_53555_210_5632_1_1534595220_7232297_345-171438-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 07:13:40,375 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_28211_210_6388_1_1532861506_4523224_22-25766-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 07:13:41,680 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7002_210_1794_1_1527939683_5157286_163-87552-0 from training. Duration: 0.86 2023-10-04 07:13:43,166 WARNING [train.py:1204] (2/4) Exclude cut with ID _56927_210_9969_1_1534848970840_3695229_409-225129-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 07:13:44,583 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_26192_210_7927_1_1532137269_1040115_21-219331-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 07:13:47,726 WARNING [train.py:1204] (2/4) Exclude cut with ID _15012_210_1968_1_1530856820429_5095350_662-210469-0 from training. Duration: 0.57 2023-10-04 07:13:48,556 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.0.layers.0.feed_forward3.out_whiten, num_groups=1, num_channels=192, metric=13.76 vs. limit=15.0 2023-10-04 07:13:50,338 INFO [train.py:1046] (2/4) Epoch 45, batch 1750, loss[loss=0.1792, simple_loss=0.2658, pruned_loss=0.04633, over 24345.00 frames. ], tot_loss[loss=0.1546, simple_loss=0.2351, pruned_loss=0.0371, over 4704684.08 frames. ], batch size: 77, lr: 2.26e-03, grad_scale: 8.0 2023-10-04 07:13:51,791 WARNING [train.py:1204] (2/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_582-191016-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 07:13:53,293 WARNING [train.py:1204] (2/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_270-210032-0_sp1.1 from training. Duration: 0.963625 2023-10-04 07:13:53,322 WARNING [train.py:1204] (2/4) Exclude cut with ID _6789_210_1534_1_1527857937218_3118950_262-48853-0_sp0.9 from training. Duration: 0.62225 2023-10-04 07:13:55,179 WARNING [train.py:1204] (2/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_847-41334-0 from training. Duration: 0.44 2023-10-04 07:13:55,211 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_40896_210_15710_1_1533433956_7100802_573-46532-0_sp1.1 from training. Duration: 0.92725 2023-10-04 07:13:57,286 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.3.encoder.layers.3.self_attn_weights.whiten_keys, num_groups=8, num_channels=256, metric=2.48 vs. limit=6.0 2023-10-04 07:13:59,202 WARNING [train.py:1204] (2/4) Exclude cut with ID _21881_210_3525_1_1531825242757_3798380_247-102591-0_sp1.1 from training. Duration: 0.8909375 2023-10-04 07:13:59,237 WARNING [train.py:1204] (2/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_200-21395-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 07:14:03,979 WARNING [train.py:1204] (2/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_711-15688-0 from training. Duration: 0.43 2023-10-04 07:14:07,368 WARNING [train.py:1204] (2/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_53-75025-0_sp1.1 from training. Duration: 0.963625 2023-10-04 07:14:08,143 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.3.encoder.layers.1.self_attn2.whiten, num_groups=1, num_channels=512, metric=13.71 vs. limit=22.5 2023-10-04 07:14:10,220 WARNING [train.py:1204] (2/4) Exclude cut with ID _6194_210_1534_1_1527511826160_3941194_321-148641-0 from training. Duration: 0.64 2023-10-04 07:14:10,252 WARNING [train.py:1204] (2/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_337-294431-0_sp1.1 from training. Duration: 0.936375 2023-10-04 07:14:11,623 WARNING [train.py:1204] (2/4) Exclude cut with ID _21632_210_2253_1_1531994001765_3744140_110-274654-0_sp1.1 from training. Duration: 0.7454375 2023-10-04 07:14:11,836 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.0.ff3_skip_rate, batch_count=1569960.0, ans=0.0 2023-10-04 07:14:13,120 WARNING [train.py:1204] (2/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_135-174080-0_sp1.1 from training. Duration: 0.4818125 2023-10-04 07:14:14,991 WARNING [train.py:1204] (2/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_21-174514-0 from training. Duration: 0.43 2023-10-04 07:14:15,255 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.balancer1.prob, batch_count=1569960.0, ans=0.125 2023-10-04 07:14:16,357 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_38965_210_4852_1_1533174906_3484735_215-294589-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 07:14:16,378 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_548-294316-0 from training. Duration: 0.81 2023-10-04 07:14:22,212 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.conv_module1.balancer1.prob, batch_count=1570026.6666666667, ans=0.125 2023-10-04 07:14:24,689 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_103-84010-0_sp1.1 from training. Duration: 0.62725 2023-10-04 07:14:26,198 WARNING [train.py:1204] (2/4) Exclude cut with ID _18199_210_6109_1_1531475789864_4288790_295-198247-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 07:14:26,228 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_13757_210_3528_1_1530440682_4107239_103-313224-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 07:14:26,399 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.nonlin_attention.balancer.prob, batch_count=1570026.6666666667, ans=0.125 2023-10-04 07:14:30,640 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_34886_210_10633_1_1533203882_1755661_67-294673-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 07:14:30,657 WARNING [train.py:1204] (2/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_350-101734-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 07:14:31,975 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_37357_210_17024_1_1533089504_2316121_204-153986-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 07:14:35,149 WARNING [train.py:1204] (2/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_673-115569-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 07:14:38,282 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_489-230703-0_sp0.9 from training. Duration: 0.9444375 2023-10-04 07:14:38,344 WARNING [train.py:1204] (2/4) Exclude cut with ID _5250_210_5219_1_1527249561806_8692110_575-102496-0_sp1.1 from training. Duration: 0.936375 2023-10-04 07:14:39,661 WARNING [train.py:1204] (2/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_669-93163-0 from training. Duration: 0.98 2023-10-04 07:14:41,196 WARNING [train.py:1204] (2/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_249-281349-0_sp0.9 from training. Duration: 0.9444375 2023-10-04 07:14:44,518 WARNING [train.py:1204] (2/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_106-30782-0 from training. Duration: 0.8 2023-10-04 07:14:45,764 WARNING [train.py:1204] (2/4) Exclude cut with ID _8772_210_1850_1_1528635595972_3664011_398-320049-0_sp1.1 from training. Duration: 0.8 2023-10-04 07:14:47,181 WARNING [train.py:1204] (2/4) Exclude cut with ID _44778_210_2054_1_1533798127203_7443090_516-151727-0_sp1.1 from training. Duration: 0.963625 2023-10-04 07:14:48,480 WARNING [train.py:1204] (2/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_325-62802-0_sp1.1 from training. Duration: 0.7909375 2023-10-04 07:14:51,295 WARNING [train.py:1204] (2/4) Exclude cut with ID _14368_210_4220_1_1530532714870_3595529_93-58002-0_sp0.9 from training. Duration: 0.6333125 2023-10-04 07:14:51,327 WARNING [train.py:1204] (2/4) Exclude cut with ID _4681_210_3525_1_1527251040321_1235713_123-24428-0_sp1.1 from training. Duration: 0.436375 2023-10-04 07:14:52,646 WARNING [train.py:1204] (2/4) Exclude cut with ID _5250_210_5219_1_1527249561806_8692110_1185-56473-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 07:14:54,011 WARNING [train.py:1204] (2/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_96-244299-0_sp1.1 from training. Duration: 0.8 2023-10-04 07:14:54,196 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.conv_module1.balancer1.prob, batch_count=1570160.0, ans=0.125 2023-10-04 07:14:56,583 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.570e+02 1.970e+02 2.244e+02 2.815e+02 4.858e+02, threshold=4.488e+02, percent-clipped=0.0 2023-10-04 07:14:58,671 WARNING [train.py:1204] (2/4) Exclude cut with ID _59500_210_8254_1_1534809610680_3736009_104-72815-0_sp1.1 from training. Duration: 0.963625 2023-10-04 07:15:00,143 WARNING [train.py:1204] (2/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_468-283113-0_sp1.1 from training. Duration: 0.87275 2023-10-04 07:15:02,830 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6239_210_4211_1_1527765584_4306698_528-102186-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 07:15:02,902 WARNING [train.py:1204] (2/4) Exclude cut with ID _45297_210_2721_1_1533715351381_3264949_351-147136-0 from training. Duration: 0.87 2023-10-04 07:15:04,202 INFO [train.py:1046] (2/4) Epoch 45, batch 1800, loss[loss=0.1752, simple_loss=0.246, pruned_loss=0.05221, over 23709.00 frames. ], tot_loss[loss=0.1539, simple_loss=0.2341, pruned_loss=0.0369, over 4701938.33 frames. ], batch size: 179, lr: 2.26e-03, grad_scale: 8.0 2023-10-04 07:15:04,241 WARNING [train.py:1204] (2/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_520-51744-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 07:15:06,113 WARNING [train.py:1204] (2/4) Exclude cut with ID _5166_210_2084_1_1527073197160_4087993_310-114452-0_sp1.1 from training. Duration: 0.536375 2023-10-04 07:15:06,117 WARNING [train.py:1204] (2/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_450-165710-0_sp1.1 from training. Duration: 0.92725 2023-10-04 07:15:06,127 WARNING [train.py:1204] (2/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_280-120942-0_sp1.1 from training. Duration: 0.7 2023-10-04 07:15:06,148 WARNING [train.py:1204] (2/4) Exclude cut with ID _65585_210_12210_1_1535450451584_3764098_112-294301-0_sp0.9 from training. Duration: 0.97775 2023-10-04 07:15:06,408 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.conv_module2.balancer1.prob, batch_count=1570226.6666666667, ans=0.125 2023-10-04 07:15:08,014 WARNING [train.py:1204] (2/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_98-51676-0_sp0.9 from training. Duration: 0.811125 2023-10-04 07:15:09,477 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_33348_210_5632_1_1533086912_3324030_85-28583-0_sp1.1 from training. Duration: 0.7454375 2023-10-04 07:15:10,805 WARNING [train.py:1204] (2/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_336-208232-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 07:15:12,155 WARNING [train.py:1204] (2/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_339-104791-0_sp0.9 from training. Duration: 0.5444375 2023-10-04 07:15:13,686 WARNING [train.py:1204] (2/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_559-29840-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 07:15:17,074 WARNING [train.py:1204] (2/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_117-290916-0_sp0.9 from training. Duration: 0.5666875 2023-10-04 07:15:17,272 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.feed_forward2.hidden_balancer.prob, batch_count=1570226.6666666667, ans=0.125 2023-10-04 07:15:19,580 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_43802_210_2290_1_1533516203_8829504_185-112289-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 07:15:20,995 WARNING [train.py:1204] (2/4) Exclude cut with ID _59256_210_5986_1_1535076085997_3589120_155-298797-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 07:15:21,220 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.conv_module2.balancer2.prob, batch_count=1570293.3333333333, ans=0.125 2023-10-04 07:15:22,689 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.ff2_skip_rate, batch_count=1570293.3333333333, ans=0.0 2023-10-04 07:15:23,055 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.3.encoder.layers.3.feed_forward3.out_whiten, num_groups=1, num_channels=512, metric=11.96 vs. limit=15.0 2023-10-04 07:15:23,796 WARNING [train.py:1204] (2/4) Exclude cut with ID _44659_210_2721_1_1534036120061_6614860_246-206531-0_sp1.1 from training. Duration: 0.92725 2023-10-04 07:15:23,845 WARNING [train.py:1204] (2/4) Exclude cut with ID _23818_210_6228_1_1531917063884_3676920_469-151944-0_sp1.1 from training. Duration: 0.92725 2023-10-04 07:15:25,266 WARNING [train.py:1204] (2/4) Exclude cut with ID _13252_210_1925_1_1530671372047_3336909_88-276668-0_sp1.1 from training. Duration: 0.9 2023-10-04 07:15:28,040 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_15101_210_7927_1_1530786415_5033274_320-327527-0_sp1.1 from training. Duration: 0.87275 2023-10-04 07:15:28,058 WARNING [train.py:1204] (2/4) Exclude cut with ID _14998_210_5219_1_1530705582080_7474970_446-112117-0 from training. Duration: 0.9 2023-10-04 07:15:29,854 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_30280_210_1881_1_1532872779_3660139_400-254159-0_sp1.1 from training. Duration: 0.936375 2023-10-04 07:15:31,423 WARNING [train.py:1204] (2/4) Exclude cut with ID _40284_210_2084_1_1533513679814_3704279_158-345341-0_sp1.1 from training. Duration: 0.936375 2023-10-04 07:15:31,707 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.feed_forward1.out_proj.dropout_p, batch_count=1570293.3333333333, ans=0.1 2023-10-04 07:15:35,383 WARNING [train.py:1204] (2/4) Exclude cut with ID 3033-130750-0096-55598-0 from training. Duration: 0.83 2023-10-04 07:15:37,307 WARNING [train.py:1204] (2/4) Exclude cut with ID _9389_210_1664_1_1529217337301_5289269_379-128794-0 from training. Duration: 0.85 2023-10-04 07:15:38,550 WARNING [train.py:1204] (2/4) Exclude cut with ID _3675_210_4557_1_1526639934408_4182089_456-18305-0 from training. Duration: 0.76 2023-10-04 07:15:38,605 WARNING [train.py:1204] (2/4) Exclude cut with ID _29853_210_7497_1_1532505600851_6642110_571-249460-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 07:15:38,841 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.conv_module2.balancer1.prob, batch_count=1570360.0, ans=0.125 2023-10-04 07:15:40,492 WARNING [train.py:1204] (2/4) Exclude cut with ID _4068_210_2629_1_1526697350395_1546259_230-325568-0_sp1.1 from training. Duration: 0.92725 2023-10-04 07:15:40,493 WARNING [train.py:1204] (2/4) Exclude cut with ID _34415_210_8750_1_1533085213375_3451116_213-267570-0_sp1.1 from training. Duration: 0.963625 2023-10-04 07:15:41,884 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6767_210_3273_1_1527855783_1718932_142-127132-0_sp1.1 from training. Duration: 0.863625 2023-10-04 07:15:48,325 WARNING [train.py:1204] (2/4) Exclude cut with ID IC0653W0001-322925 from training. Duration: 0.896 2023-10-04 07:15:49,654 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_4528_210_3273_1_1527076654_3276777_209-219804-0_sp1.1 from training. Duration: 0.62725 2023-10-04 07:15:50,935 WARNING [train.py:1204] (2/4) Exclude cut with ID _55844_210_2346_1_1534471212569_7625707_187-204971-0_sp1.1 from training. Duration: 0.936375 2023-10-04 07:15:52,331 WARNING [train.py:1204] (2/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_369-325184-0 from training. Duration: 0.99 2023-10-04 07:15:52,357 WARNING [train.py:1204] (2/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_791-45149-0 from training. Duration: 0.86 2023-10-04 07:15:52,403 WARNING [train.py:1204] (2/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_317-211107-0_sp1.1 from training. Duration: 0.836375 2023-10-04 07:15:55,032 WARNING [train.py:1204] (2/4) Exclude cut with ID _30800_210_22382_1_1532499347904_4515959_488-330913-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 07:15:55,129 WARNING [train.py:1204] (2/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_506-42614-0_sp1.1 from training. Duration: 0.7181875 2023-10-04 07:15:57,937 WARNING [train.py:1204] (2/4) Exclude cut with ID _8696_210_1876_1_1528545632440_3345058_209-54069-0 from training. Duration: 0.67 2023-10-04 07:16:05,205 WARNING [train.py:1204] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_215-202973-0_sp1.1 from training. Duration: 0.8 2023-10-04 07:16:05,274 WARNING [train.py:1204] (2/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_321-215742-0 from training. Duration: 0.5 2023-10-04 07:16:06,606 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_40896_210_15710_1_1533433956_7100802_428-32114-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 07:16:06,616 WARNING [train.py:1204] (2/4) Exclude cut with ID _27013_210_1389_1_1532437173240_3252649_467-263151-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 07:16:06,684 WARNING [train.py:1204] (2/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_734-129761-0_sp0.9 from training. Duration: 0.911125 2023-10-04 07:16:08,541 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_521-214276-0 from training. Duration: 0.44 2023-10-04 07:16:09,987 WARNING [train.py:1204] (2/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_80-147301-0_sp1.1 from training. Duration: 0.763625 2023-10-04 07:16:11,256 WARNING [train.py:1204] (2/4) Exclude cut with ID _9679_210_7497_1_1529129212906_4363929_56-307796-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 07:16:14,097 WARNING [train.py:1204] (2/4) Exclude cut with ID _14832_210_8750_1_1530691351539_3171348_1-54315-0 from training. Duration: 0.87 2023-10-04 07:16:14,097 WARNING [train.py:1204] (2/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_558-155750-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 07:16:14,379 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.nonlin_attention.balancer.max_positive, batch_count=1570493.3333333333, ans=0.95 2023-10-04 07:16:17,499 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_142-309370-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 07:16:17,533 WARNING [train.py:1204] (2/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_18-46756-0_sp0.9 from training. Duration: 0.87775 2023-10-04 07:16:17,540 WARNING [train.py:1204] (2/4) Exclude cut with ID _61148_210_6897_1_1535094022726_4217610_279-212159-0_sp1.1 from training. Duration: 0.936375 2023-10-04 07:16:18,852 INFO [train.py:1046] (2/4) Epoch 45, batch 1850, loss[loss=0.1542, simple_loss=0.2344, pruned_loss=0.03697, over 24477.00 frames. ], tot_loss[loss=0.1544, simple_loss=0.2346, pruned_loss=0.03712, over 4704173.97 frames. ], batch size: 63, lr: 2.26e-03, grad_scale: 8.0 2023-10-04 07:16:18,953 WARNING [train.py:1204] (2/4) Exclude cut with ID _21987_210_5854_1_1531821500482_3637038_164-2641-0_sp1.1 from training. Duration: 0.936375 2023-10-04 07:16:20,328 WARNING [train.py:1204] (2/4) Exclude cut with ID _8064_210_5399_1_1528372299291_4282973_395-340852-0_sp1.1 from training. Duration: 0.8454375 2023-10-04 07:16:20,473 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1077-184261-0_sp1.1 from training. Duration: 0.963625 2023-10-04 07:16:21,773 WARNING [train.py:1204] (2/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_1176-204992-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 07:16:24,558 WARNING [train.py:1204] (2/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_563-347832-0_sp1.1 from training. Duration: 0.8090625 2023-10-04 07:16:24,633 WARNING [train.py:1204] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_380-204756-0_sp0.9 from training. Duration: 0.9555625 2023-10-04 07:16:33,009 WARNING [train.py:1204] (2/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_212-10501-0_sp1.1 from training. Duration: 0.8545625 2023-10-04 07:16:33,031 WARNING [train.py:1204] (2/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_445-117398-0 from training. Duration: 0.89 2023-10-04 07:16:35,914 WARNING [train.py:1204] (2/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_29-280622-0 from training. Duration: 0.82 2023-10-04 07:16:38,656 WARNING [train.py:1204] (2/4) Exclude cut with ID _26664_210_7770_1_1532258943280_3418270_187-80815-0 from training. Duration: 0.93 2023-10-04 07:16:44,073 WARNING [train.py:1204] (2/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_414-349640-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 07:16:44,100 WARNING [train.py:1204] (2/4) Exclude cut with ID _45021_210_9994_1_1533619751094_3330288_96-239010-0 from training. Duration: 0.91 2023-10-04 07:16:44,103 WARNING [train.py:1204] (2/4) Exclude cut with ID _5189_210_4557_1_1527248319428_3769229_199-53526-0_sp0.9 from training. Duration: 0.4666875 2023-10-04 07:16:44,307 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.0.ff2_skip_rate, batch_count=1570626.6666666667, ans=0.0 2023-10-04 07:16:51,820 WARNING [train.py:1204] (2/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_63-101996-0_sp1.1 from training. Duration: 0.8 2023-10-04 07:16:53,217 WARNING [train.py:1204] (2/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_199-86432-0 from training. Duration: 0.98 2023-10-04 07:16:55,998 WARNING [train.py:1204] (2/4) Exclude cut with ID _36399_210_16049_1_1532998771377_3698333_207-259013-0_sp1.1 from training. Duration: 0.92725 2023-10-04 07:16:56,043 WARNING [train.py:1204] (2/4) Exclude cut with ID _62574_210_2655_1_1535180407058_3933249_567-111756-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 07:17:01,441 WARNING [train.py:1204] (2/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_436-318113-0 from training. Duration: 0.75 2023-10-04 07:17:01,493 WARNING [train.py:1204] (2/4) Exclude cut with ID _53379_210_20034_1_1534222027363_3854654_43-305214-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 07:17:03,168 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_15126_210_6753_1_1530778677_2591185_113-157651-0_sp1.1 from training. Duration: 0.6181875 2023-10-04 07:17:04,562 WARNING [train.py:1204] (2/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_146-218763-0_sp1.1 from training. Duration: 0.7818125 2023-10-04 07:17:06,221 WARNING [train.py:1204] (2/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_714-310538-0_sp0.9 from training. Duration: 0.9555625 2023-10-04 07:17:09,109 WARNING [train.py:1204] (2/4) Exclude cut with ID _24271_210_12658_1_1531987270022_7190519_709-16721-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 07:17:13,682 WARNING [train.py:1204] (2/4) Exclude cut with ID _5190_210_4557_1_1527244368799_373439_28-197030-0_sp0.9 from training. Duration: 0.8 2023-10-04 07:17:13,721 WARNING [train.py:1204] (2/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_267-197536-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 07:17:13,759 WARNING [train.py:1204] (2/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_658-282610-0_sp1.1 from training. Duration: 0.4818125 2023-10-04 07:17:13,767 WARNING [train.py:1204] (2/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_257-67050-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 07:17:15,789 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_14906_210_7927_1_1530754190_9408633_961-53402-0_sp1.1 from training. Duration: 0.936375 2023-10-04 07:17:17,121 WARNING [train.py:1204] (2/4) Exclude cut with ID _60113_210_16271_1_1535164212481_4386079_490-280290-0_sp1.1 from training. Duration: 0.8909375 2023-10-04 07:17:21,689 WARNING [train.py:1204] (2/4) Exclude cut with ID _14100_210_8341_1_1530529320613_3678069_258-224599-0 from training. Duration: 0.77 2023-10-04 07:17:21,744 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6961_210_2087_1_1527937168_6815011_565-326898-0_sp1.1 from training. Duration: 0.936375 2023-10-04 07:17:24,593 WARNING [train.py:1204] (2/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_239-316802-0_sp1.1 from training. Duration: 0.62725 2023-10-04 07:17:25,987 WARNING [train.py:1204] (2/4) Exclude cut with ID _55857_210_2346_1_1534644007831_6898209_745-98391-0_sp1.1 from training. Duration: 0.6909375 2023-10-04 07:17:25,991 WARNING [train.py:1204] (2/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_154-135696-0 from training. Duration: 0.8 2023-10-04 07:17:25,995 WARNING [train.py:1204] (2/4) Exclude cut with ID _14368_210_4220_1_1530532714870_3595529_93-58002-0 from training. Duration: 0.57 2023-10-04 07:17:27,198 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.572e+02 2.009e+02 2.134e+02 2.480e+02 4.687e+02, threshold=4.268e+02, percent-clipped=1.0 2023-10-04 07:17:27,392 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9025W0005-507335 from training. Duration: 0.896 2023-10-04 07:17:28,726 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9029W0034-509435 from training. Duration: 0.64 2023-10-04 07:17:30,042 WARNING [train.py:1204] (2/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_76-203867-0_sp1.1 from training. Duration: 0.6545625 2023-10-04 07:17:30,052 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_46639_210_8341_1_1533726088_4495500_66-346325-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 07:17:30,067 WARNING [train.py:1204] (2/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_145-137790-0_sp0.9 from training. Duration: 0.92225 2023-10-04 07:17:30,093 WARNING [train.py:1204] (2/4) Exclude cut with ID _36522_210_8887_1_1533177030611_1772478_7-327299-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 07:17:31,513 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9042W0009-515615 from training. Duration: 0.896 2023-10-04 07:17:31,529 WARNING [train.py:1204] (2/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_39-248870-0_sp0.9 from training. Duration: 0.7666875 2023-10-04 07:17:32,791 INFO [train.py:1046] (2/4) Epoch 45, batch 1900, loss[loss=0.1481, simple_loss=0.2282, pruned_loss=0.034, over 24316.00 frames. ], tot_loss[loss=0.1554, simple_loss=0.2354, pruned_loss=0.03767, over 4711671.60 frames. ], batch size: 61, lr: 2.26e-03, grad_scale: 8.0 2023-10-04 07:17:32,872 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_35441_210_16554_1_1533347572_4092680_362-242912-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 07:17:34,259 WARNING [train.py:1204] (2/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_216-343007-0_sp1.1 from training. Duration: 0.6 2023-10-04 07:17:34,342 WARNING [train.py:1204] (2/4) Exclude cut with ID _3867_210_1883_1_1526641200575_4475830_272-215563-0_sp0.9 from training. Duration: 0.6333125 2023-10-04 07:17:34,611 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=1570893.3333333333, ans=0.1 2023-10-04 07:17:35,614 WARNING [train.py:1204] (2/4) Exclude cut with ID _21783_210_11357_1_1531742458730_3762679_553-175603-0_sp1.1 from training. Duration: 0.92725 2023-10-04 07:17:35,623 WARNING [train.py:1204] (2/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_963-187109-0 from training. Duration: 0.99 2023-10-04 07:17:37,657 WARNING [train.py:1204] (2/4) Exclude cut with ID _44973_210_1511_1_1533794381163_3634770_495-211466-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 07:17:37,667 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9068W0002-529085 from training. Duration: 0.896 2023-10-04 07:17:37,690 WARNING [train.py:1204] (2/4) Exclude cut with ID _5294_210_1747_1_1527249643928_3696179_180-100713-0_sp1.1 from training. Duration: 0.736375 2023-10-04 07:17:38,583 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.0.layers.0.self_attn1.whiten, num_groups=1, num_channels=192, metric=11.00 vs. limit=22.5 2023-10-04 07:17:39,146 WARNING [train.py:1204] (2/4) Exclude cut with ID _35799_210_5854_1_1533263487887_3529071_149-49689-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 07:17:44,105 WARNING [train.py:1204] (2/4) Exclude cut with ID _67725_210_27767_1_1535608756671_5105359_36-220794-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 07:17:45,948 WARNING [train.py:1204] (2/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_338-96520-0_sp1.1 from training. Duration: 0.8545625 2023-10-04 07:17:47,344 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9104W0004-547160 from training. Duration: 0.896 2023-10-04 07:17:47,418 WARNING [train.py:1204] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_330-327861-0 from training. Duration: 0.67 2023-10-04 07:17:50,540 WARNING [train.py:1204] (2/4) Exclude cut with ID _14087_210_2253_1_1530529224512_3014029_156-257607-0_sp0.9 from training. Duration: 0.92225 2023-10-04 07:17:50,601 WARNING [train.py:1204] (2/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_476-322050-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 07:17:50,610 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9118W0017-553385 from training. Duration: 0.896 2023-10-04 07:17:51,892 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9120W0028-554435 from training. Duration: 0.896 2023-10-04 07:17:54,680 WARNING [train.py:1204] (2/4) Exclude cut with ID _56524_210_9642_1_1534572011779_3619170_145-310842-0 from training. Duration: 0.99 2023-10-04 07:17:56,086 WARNING [train.py:1204] (2/4) Exclude cut with ID _51631_210_8254_1_1534129228420_4084794_270-222508-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 07:18:00,211 WARNING [train.py:1204] (2/4) Exclude cut with ID _14268_210_9206_1_1530622771285_4714099_331-105351-0 from training. Duration: 0.65 2023-10-04 07:18:00,485 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.conv_module1.balancer2.prob, batch_count=1570960.0, ans=0.125 2023-10-04 07:18:00,568 INFO [scaling.py:1118] (2/4) WithLoss: name=encoder.encoders.5.encoder.layers.0.self_attn_weights, loss-sum=0.000e+00 2023-10-04 07:18:01,666 WARNING [train.py:1204] (2/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_40-129756-0 from training. Duration: 0.81 2023-10-04 07:18:12,437 WARNING [train.py:1204] (2/4) Exclude cut with ID _3541_210_2477_1_1526475408709_3263973_231-104736-0 from training. Duration: 0.78 2023-10-04 07:18:12,729 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.ff3_skip_rate, batch_count=1571026.6666666667, ans=0.0 2023-10-04 07:18:13,917 WARNING [train.py:1204] (2/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_214-291369-0 from training. Duration: 0.63 2023-10-04 07:18:13,924 WARNING [train.py:1204] (2/4) Exclude cut with ID _62574_210_2655_1_1535180407058_3933249_369-84347-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 07:18:13,978 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9210W0111-599240 from training. Duration: 0.896 2023-10-04 07:18:13,982 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_92-246270-0 from training. Duration: 0.85 2023-10-04 07:18:15,300 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_37152_210_9697_1_1533106573_3844188_29-239070-0 from training. Duration: 0.98 2023-10-04 07:18:17,168 WARNING [train.py:1204] (2/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_239-22076-0 from training. Duration: 0.85 2023-10-04 07:18:17,171 WARNING [train.py:1204] (2/4) Exclude cut with ID _65962_210_29927_1_1535436026519_4174930_368-44639-0_sp1.1 from training. Duration: 0.97275 2023-10-04 07:18:20,046 WARNING [train.py:1204] (2/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_152-212533-0 from training. Duration: 0.97 2023-10-04 07:18:23,429 WARNING [train.py:1204] (2/4) Exclude cut with ID _14603_210_4220_1_1530615688441_3097319_90-31383-0_sp1.1 from training. Duration: 0.7909375 2023-10-04 07:18:24,945 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.bypass.scale_min, batch_count=1571093.3333333333, ans=0.2 2023-10-04 07:18:26,201 WARNING [train.py:1204] (2/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_196-152868-0_sp1.1 from training. Duration: 0.92725 2023-10-04 07:18:26,203 WARNING [train.py:1204] (2/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_140-181176-0 from training. Duration: 0.92 2023-10-04 07:18:27,609 WARNING [train.py:1204] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_168-32906-0_sp0.9 from training. Duration: 0.7666875 2023-10-04 07:18:30,382 WARNING [train.py:1204] (2/4) Exclude cut with ID _14922_210_6228_1_1530707435273_3693310_13-165512-0 from training. Duration: 0.82 2023-10-04 07:18:30,415 WARNING [train.py:1204] (2/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_381-317791-0_sp0.9 from training. Duration: 0.811125 2023-10-04 07:18:36,099 WARNING [train.py:1204] (2/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_63-267585-0_sp1.1 from training. Duration: 0.8181875 2023-10-04 07:18:36,101 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_326-303945-0_sp1.1 from training. Duration: 0.9 2023-10-04 07:18:37,397 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_194-143381-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 07:18:37,465 WARNING [train.py:1204] (2/4) Exclude cut with ID _39290_210_4220_1_1533862778760_3075118_194-16667-0_sp1.1 from training. Duration: 0.963625 2023-10-04 07:18:39,342 WARNING [train.py:1204] (2/4) Exclude cut with ID _15012_210_1968_1_1530856820429_5095350_662-210469-0_sp0.9 from training. Duration: 0.6333125 2023-10-04 07:18:39,374 WARNING [train.py:1204] (2/4) Exclude cut with ID _4697_210_2069_1_1526815801953_3823098_68-219807-0_sp1.1 from training. Duration: 0.42725 2023-10-04 07:18:41,261 WARNING [train.py:1204] (2/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_321-22821-0_sp0.9 from training. Duration: 0.988875 2023-10-04 07:18:44,170 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_14056_210_4163_1_1530604483_7572827_1037-45317-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 07:18:44,172 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_403-39619-0_sp1.1 from training. Duration: 0.763625 2023-10-04 07:18:46,134 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_20120_210_6753_1_1531643035_5482859_510-96996-0_sp0.9 from training. Duration: 0.9666875 2023-10-04 07:18:46,137 WARNING [train.py:1204] (2/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_225-180155-0_sp1.1 from training. Duration: 0.92725 2023-10-04 07:18:47,472 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_278-272612-0_sp0.9 from training. Duration: 0.811125 2023-10-04 07:18:48,854 INFO [train.py:1046] (2/4) Epoch 45, batch 1950, loss[loss=0.1479, simple_loss=0.2341, pruned_loss=0.0309, over 23872.00 frames. ], tot_loss[loss=0.1553, simple_loss=0.2356, pruned_loss=0.03754, over 4717064.19 frames. ], batch size: 86, lr: 2.26e-03, grad_scale: 8.0 2023-10-04 07:18:48,979 WARNING [train.py:1204] (2/4) Exclude cut with ID _53713_210_24116_1_1534314487762_7416751_691-70244-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 07:18:51,834 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6788_210_1388_1_1527898698_4562021_91-197981-0_sp1.1 from training. Duration: 0.7545625 2023-10-04 07:18:53,769 WARNING [train.py:1204] (2/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_60-18123-0_sp0.9 from training. Duration: 0.97775 2023-10-04 07:18:53,803 WARNING [train.py:1204] (2/4) Exclude cut with ID _35799_210_5854_1_1533263487887_3529071_5-214617-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 07:18:53,808 WARNING [train.py:1204] (2/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_890-5117-0_sp1.1 from training. Duration: 0.7090625 2023-10-04 07:18:56,591 WARNING [train.py:1204] (2/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_660-211088-0 from training. Duration: 0.62 2023-10-04 07:18:57,889 WARNING [train.py:1204] (2/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_314-126819-0_sp0.9 from training. Duration: 0.5555625 2023-10-04 07:18:59,044 WARNING [train.py:1204] (2/4) Exclude cut with ID _21632_210_2253_1_1531994001765_3744140_362-66225-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 07:19:00,398 WARNING [train.py:1204] (2/4) Exclude cut with ID _61118_210_9527_1_1534897668884_3752630_173-258555-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 07:19:01,824 WARNING [train.py:1204] (2/4) Exclude cut with ID _7719_210_3525_1_1528456153163_4296960_88-250259-0_sp0.9 from training. Duration: 0.9333125 2023-10-04 07:19:01,858 WARNING [train.py:1204] (2/4) Exclude cut with ID _15140_210_9288_1_1530791388450_4880149_539-153708-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 07:19:03,169 WARNING [train.py:1204] (2/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_253-42544-0_sp1.1 from training. Duration: 0.97275 2023-10-04 07:19:03,319 WARNING [train.py:1204] (2/4) Exclude cut with ID _33640_210_2655_1_1533117605479_4855469_479-296877-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 07:19:04,747 WARNING [train.py:1204] (2/4) Exclude cut with ID 3033-130750-0096-55598-0_sp1.1 from training. Duration: 0.7545625 2023-10-04 07:19:04,764 WARNING [train.py:1204] (2/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_191-41023-0_sp1.1 from training. Duration: 0.6454375 2023-10-04 07:19:06,103 WARNING [train.py:1204] (2/4) Exclude cut with ID _8365_210_2064_1_1528592311070_4677690_431-294102-0_sp1.1 from training. Duration: 0.8090625 2023-10-04 07:19:06,143 WARNING [train.py:1204] (2/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_893-296030-0_sp1.1 from training. Duration: 0.97275 2023-10-04 07:19:12,398 WARNING [train.py:1204] (2/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_401-343820-0_sp1.1 from training. Duration: 0.97275 2023-10-04 07:19:16,587 WARNING [train.py:1204] (2/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_127-566-0_sp1.1 from training. Duration: 0.763625 2023-10-04 07:19:16,589 WARNING [train.py:1204] (2/4) Exclude cut with ID _14100_210_8341_1_1530529320613_3678069_126-272847-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 07:19:16,603 WARNING [train.py:1204] (2/4) Exclude cut with ID _15133_210_6228_1_1530794512074_3080379_29-264458-0_sp1.1 from training. Duration: 0.52725 2023-10-04 07:19:16,604 WARNING [train.py:1204] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_127-119109-0 from training. Duration: 0.72 2023-10-04 07:19:16,681 WARNING [train.py:1204] (2/4) Exclude cut with ID _6537_210_1883_1_1527987559710_3597129_382-133062-0_sp1.1 from training. Duration: 0.6181875 2023-10-04 07:19:16,691 WARNING [train.py:1204] (2/4) Exclude cut with ID _29529_210_7234_1_1532393961455_3977460_391-219682-0_sp1.1 from training. Duration: 0.936375 2023-10-04 07:19:16,873 INFO [scaling.py:1118] (2/4) WithLoss: name=encoder.encoders.0.layers.1.self_attn_weights, loss-sum=0.000e+00 2023-10-04 07:19:18,050 WARNING [train.py:1204] (2/4) Exclude cut with ID _37537_210_2253_1_1533171758568_3421949_96-137189-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 07:19:21,384 WARNING [train.py:1204] (2/4) Exclude cut with ID _58414_210_8254_1_1535421609162_7151182_15-276301-0_sp1.1 from training. Duration: 0.97275 2023-10-04 07:19:22,930 WARNING [train.py:1204] (2/4) Exclude cut with ID _59502_210_12210_1_1535107024698_1835139_110-289074-0_sp1.1 from training. Duration: 0.92725 2023-10-04 07:19:24,912 WARNING [train.py:1204] (2/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_367-61330-0_sp1.1 from training. Duration: 0.6818125 2023-10-04 07:19:27,697 WARNING [train.py:1204] (2/4) Exclude cut with ID _45297_210_2721_1_1533715351381_3264949_353-9541-0_sp1.1 from training. Duration: 0.8818125 2023-10-04 07:19:27,727 WARNING [train.py:1204] (2/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_85-138257-0_sp0.9 from training. Duration: 0.788875 2023-10-04 07:19:29,016 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_3787_210_2040_1_1526477544_5478586_603-120324-0 from training. Duration: 0.81 2023-10-04 07:19:29,066 WARNING [train.py:1204] (2/4) Exclude cut with ID _42863_210_9070_1_1533902398788_3696920_411-204131-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 07:19:31,921 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.0.conv_module1.balancer1.prob, batch_count=1571426.6666666667, ans=0.125 2023-10-04 07:19:33,156 WARNING [train.py:1204] (2/4) Exclude cut with ID _30196_210_1664_1_1533708496522_7074140_822-289909-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 07:19:34,493 WARNING [train.py:1204] (2/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_32-335103-0_sp0.9 from training. Duration: 0.911125 2023-10-04 07:19:35,801 WARNING [train.py:1204] (2/4) Exclude cut with ID _6493_210_1747_1_1528459210424_4012470_513-140967-0_sp0.9 from training. Duration: 0.87775 2023-10-04 07:19:43,520 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_47896_210_20508_1_1534492518_5958868_439-257766-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 07:19:44,863 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_1294-63152-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 07:19:47,598 WARNING [train.py:1204] (2/4) Exclude cut with ID _59170_210_6721_1_1534983633960_4203335_319-166512-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 07:19:52,076 WARNING [train.py:1204] (2/4) Exclude cut with ID _3541_210_2477_1_1526475408709_3263973_146-3470-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 07:19:55,295 WARNING [train.py:1204] (2/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_678-159307-0_sp1.1 from training. Duration: 0.77275 2023-10-04 07:19:55,337 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_343-48822-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 07:19:55,532 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.self_attn_weights.pos_emb_skip_rate, batch_count=1571493.3333333333, ans=0.0 2023-10-04 07:19:56,526 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.633e+02 2.043e+02 2.250e+02 2.599e+02 4.071e+02, threshold=4.500e+02, percent-clipped=0.0 2023-10-04 07:19:56,662 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_4692_210_3528_1_1526899755_4323254_107-44401-0 from training. Duration: 0.86 2023-10-04 07:19:56,667 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6291_210_3273_1_1527683649_1078498_74-292608-0_sp1.1 from training. Duration: 0.6545625 2023-10-04 07:19:58,072 WARNING [train.py:1204] (2/4) Exclude cut with ID _55644_210_11856_1_1534593599582_4103719_293-248694-0_sp1.1 from training. Duration: 0.97275 2023-10-04 07:19:59,484 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_3172_210_2087_1_1526292929_3987865_197-97762-0 from training. Duration: 0.75 2023-10-04 07:20:00,950 WARNING [train.py:1204] (2/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_264-348896-0_sp0.9 from training. Duration: 0.9444375 2023-10-04 07:20:02,303 INFO [train.py:1046] (2/4) Epoch 45, batch 2000, loss[loss=0.1581, simple_loss=0.2436, pruned_loss=0.0363, over 23908.00 frames. ], tot_loss[loss=0.1561, simple_loss=0.2366, pruned_loss=0.03782, over 4723851.15 frames. ], batch size: 86, lr: 2.26e-03, grad_scale: 16.0 2023-10-04 07:20:05,067 WARNING [train.py:1204] (2/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_209-205365-0_sp0.9 from training. Duration: 0.87775 2023-10-04 07:20:06,307 WARNING [train.py:1204] (2/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_222-198127-0_sp0.9 from training. Duration: 0.9333125 2023-10-04 07:20:07,512 WARNING [train.py:1204] (2/4) Exclude cut with ID _57572_210_5517_1_1534921219624_3809484_52-43530-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 07:20:07,738 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.feed_forward1.hidden_balancer.prob, batch_count=1571560.0, ans=0.125 2023-10-04 07:20:08,902 WARNING [train.py:1204] (2/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_96-320736-0_sp1.1 from training. Duration: 0.7818125 2023-10-04 07:20:10,367 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_13091_210_7925_1_1530609447_8987354_1159-225508-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 07:20:11,918 WARNING [train.py:1204] (2/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_237-44332-0 from training. Duration: 0.94 2023-10-04 07:20:13,802 WARNING [train.py:1204] (2/4) Exclude cut with ID _7970_210_1883_1_1528455616296_3472230_25-178407-0_sp0.9 from training. Duration: 0.788875 2023-10-04 07:20:18,523 WARNING [train.py:1204] (2/4) Exclude cut with ID _29003_210_19596_1_1532507391720_3894098_357-257062-0_sp1.1 from training. Duration: 0.936375 2023-10-04 07:20:19,836 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_809-17156-0 from training. Duration: 0.73 2023-10-04 07:20:19,924 WARNING [train.py:1204] (2/4) Exclude cut with ID _7160_210_1876_1_1527940923231_3703799_302-150728-0_sp1.1 from training. Duration: 0.7090625 2023-10-04 07:20:19,932 WARNING [train.py:1204] (2/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_444-28041-0_sp0.9 from training. Duration: 0.9444375 2023-10-04 07:20:22,775 WARNING [train.py:1204] (2/4) Exclude cut with ID _52892_210_6721_1_1534378941259_4124129_278-251666-0_sp1.1 from training. Duration: 0.92725 2023-10-04 07:20:26,175 WARNING [train.py:1204] (2/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_115-326778-0 from training. Duration: 0.93 2023-10-04 07:20:27,563 WARNING [train.py:1204] (2/4) Exclude cut with ID _41391_210_7994_1_1533603771872_3638201_31-188961-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 07:20:28,939 WARNING [train.py:1204] (2/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_1073-6264-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 07:20:28,958 WARNING [train.py:1204] (2/4) Exclude cut with ID _66868_210_9527_1_1535509287954_4561140_435-259515-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 07:20:30,889 WARNING [train.py:1204] (2/4) Exclude cut with ID _14886_210_4220_1_1530702020355_3516190_159-73748-0 from training. Duration: 0.83 2023-10-04 07:20:32,241 WARNING [train.py:1204] (2/4) Exclude cut with ID _15150_210_8341_1_1530752411803_7256384_469-239509-0_sp1.1 from training. Duration: 0.6181875 2023-10-04 07:20:33,684 WARNING [train.py:1204] (2/4) Exclude cut with ID _8064_210_5399_1_1528372299291_4282973_395-340852-0 from training. Duration: 0.93 2023-10-04 07:20:33,688 WARNING [train.py:1204] (2/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_138-187031-0_sp1.1 from training. Duration: 0.9 2023-10-04 07:20:36,526 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_13757_210_3528_1_1530440682_4107239_48-106347-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 07:20:37,859 WARNING [train.py:1204] (2/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_164-224052-0_sp1.1 from training. Duration: 0.636375 2023-10-04 07:20:37,863 WARNING [train.py:1204] (2/4) Exclude cut with ID _59288_210_5854_1_1535077757774_3412246_408-322648-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 07:20:37,915 WARNING [train.py:1204] (2/4) Exclude cut with ID _30191_210_1664_1_1533189915047_7196670_827-36359-0_sp1.1 from training. Duration: 0.963625 2023-10-04 07:20:37,993 WARNING [train.py:1204] (2/4) Exclude cut with ID _4680_210_3525_1_1527249757953_893386_75-78979-0_sp1.1 from training. Duration: 0.82725 2023-10-04 07:20:39,228 WARNING [train.py:1204] (2/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_383-165234-0 from training. Duration: 0.94 2023-10-04 07:20:42,150 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.nonlin_attention.balancer.prob, batch_count=1571693.3333333333, ans=0.125 2023-10-04 07:20:43,162 WARNING [train.py:1204] (2/4) Exclude cut with ID _19896_210_1883_1_1531709022662_6649569_94-229927-0 from training. Duration: 0.97 2023-10-04 07:20:43,175 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6788_210_1388_1_1527898698_4562021_180-75288-0_sp1.1 from training. Duration: 0.9 2023-10-04 07:20:43,189 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_26433_210_12108_1_1532157158_4056480_54-321349-0_sp1.1 from training. Duration: 0.97275 2023-10-04 07:20:47,935 WARNING [train.py:1204] (2/4) Exclude cut with ID _56108_210_16380_1_1534505446159_3887959_265-161493-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 07:20:48,043 WARNING [train.py:1204] (2/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_395-111602-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 07:20:49,385 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_154-316450-0_sp0.9 from training. Duration: 0.8444375 2023-10-04 07:20:49,464 WARNING [train.py:1204] (2/4) Exclude cut with ID _43552_210_8913_1_1533726031059_3523429_381-116041-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 07:20:50,908 WARNING [train.py:1204] (2/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_65-104124-0_sp1.1 from training. Duration: 0.963625 2023-10-04 07:20:52,183 WARNING [train.py:1204] (2/4) Exclude cut with ID _6536_210_1883_1_1527850783339_3319069_169-23811-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 07:20:52,220 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_289-180904-0_sp0.9 from training. Duration: 0.8444375 2023-10-04 07:20:52,231 WARNING [train.py:1204] (2/4) Exclude cut with ID _56406_210_9288_1_1534899618664_3496149_226-242400-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 07:20:52,322 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_24899_210_16554_1_1532046422_3827462_276-216330-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 07:20:55,052 WARNING [train.py:1204] (2/4) Exclude cut with ID _29345_210_4381_1_1532689168263_3860169_198-174328-0_sp1.1 from training. Duration: 0.82725 2023-10-04 07:20:57,028 WARNING [train.py:1204] (2/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_338-96520-0 from training. Duration: 0.94 2023-10-04 07:20:59,214 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.conv_module1.balancer2.prob, batch_count=1571760.0, ans=0.125 2023-10-04 07:21:01,710 WARNING [train.py:1204] (2/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_7-111954-0_sp1.1 from training. Duration: 0.5090625 2023-10-04 07:21:03,191 WARNING [train.py:1204] (2/4) Exclude cut with ID _36692_210_5665_1_1533608967420_398809_8-16261-0_sp1.1 from training. Duration: 0.97275 2023-10-04 07:21:07,458 WARNING [train.py:1204] (2/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_86-212657-0_sp1.1 from training. Duration: 0.97275 2023-10-04 07:21:07,470 WARNING [train.py:1204] (2/4) Exclude cut with ID _44659_210_2721_1_1534036120061_6614860_626-298884-0_sp1.1 from training. Duration: 0.8818125 2023-10-04 07:21:11,501 WARNING [train.py:1204] (2/4) Exclude cut with ID _49755_210_18053_1_1534581005846_3218709_120-257918-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 07:21:12,850 WARNING [train.py:1204] (2/4) Exclude cut with ID _21881_210_3525_1_1531825242757_3798380_400-113821-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 07:21:12,852 WARNING [train.py:1204] (2/4) Exclude cut with ID _21783_210_11357_1_1531742458730_3762679_387-236773-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 07:21:15,610 INFO [train.py:1046] (2/4) Epoch 45, batch 2050, loss[loss=0.1455, simple_loss=0.207, pruned_loss=0.04198, over 19506.00 frames. ], tot_loss[loss=0.1554, simple_loss=0.236, pruned_loss=0.03737, over 4723940.14 frames. ], batch size: 388, lr: 2.26e-03, grad_scale: 16.0 2023-10-04 07:21:15,646 WARNING [train.py:1204] (2/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_613-232856-0_sp1.1 from training. Duration: 0.4909375 2023-10-04 07:21:15,673 WARNING [train.py:1204] (2/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_181-78632-0_sp0.9 from training. Duration: 0.8666875 2023-10-04 07:21:18,891 WARNING [train.py:1204] (2/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_175-17485-0_sp1.1 from training. Duration: 0.97275 2023-10-04 07:21:18,971 WARNING [train.py:1204] (2/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_1004-183422-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 07:21:20,401 WARNING [train.py:1204] (2/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_32-65962-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 07:21:21,796 WARNING [train.py:1204] (2/4) Exclude cut with ID _15458_210_8341_1_1530858614263_3834410_40-298664-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 07:21:26,354 WARNING [train.py:1204] (2/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_380-261863-0_sp1.1 from training. Duration: 0.963625 2023-10-04 07:21:28,360 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_372-173012-0_sp1.1 from training. Duration: 0.863625 2023-10-04 07:21:28,406 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_57-82205-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 07:21:29,653 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_3691_210_3528_1_1526468169_4062053_355-334432-0_sp0.9 from training. Duration: 0.9666875 2023-10-04 07:21:32,269 WARNING [train.py:1204] (2/4) Exclude cut with ID _19023_210_4353_1_1531378892552_5820780_28-36615-0 from training. Duration: 0.97 2023-10-04 07:21:32,274 WARNING [train.py:1204] (2/4) Exclude cut with ID _29853_210_7497_1_1532505600851_6642110_286-251333-0_sp1.1 from training. Duration: 0.92725 2023-10-04 07:21:32,500 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.conv_module2.balancer1.min_positive, batch_count=1571960.0, ans=0.025 2023-10-04 07:21:33,619 WARNING [train.py:1204] (2/4) Exclude cut with ID _31602_210_5854_1_1532677021456_4458250_518-80679-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 07:21:33,669 WARNING [train.py:1204] (2/4) Exclude cut with ID _14922_210_6228_1_1530707435273_3693310_38-187851-0_sp0.9 from training. Duration: 0.688875 2023-10-04 07:21:42,392 WARNING [train.py:1204] (2/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_39-248870-0_sp1.1 from training. Duration: 0.62725 2023-10-04 07:21:42,416 WARNING [train.py:1204] (2/4) Exclude cut with ID _16908_210_5724_1_1531119157248_3772700_154-31455-0_sp1.1 from training. Duration: 0.97275 2023-10-04 07:21:43,890 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8453_210_4211_1_1528502940_8562730_347-330673-0 from training. Duration: 0.72 2023-10-04 07:21:45,383 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.self_attn_weights.pos_emb_skip_rate, batch_count=1572026.6666666667, ans=0.0 2023-10-04 07:21:47,115 WARNING [train.py:1204] (2/4) Exclude cut with ID _30191_210_1664_1_1533189915047_7196670_674-204247-0_sp1.1 from training. Duration: 0.97275 2023-10-04 07:21:47,212 WARNING [train.py:1204] (2/4) Exclude cut with ID _8833_210_5219_1_1528592665210_7553059_329-125156-0 from training. Duration: 0.82 2023-10-04 07:21:47,242 WARNING [train.py:1204] (2/4) Exclude cut with ID _4779_210_2084_1_1527318022944_3914893_500-285013-0_sp1.1 from training. Duration: 0.62725 2023-10-04 07:21:50,039 WARNING [train.py:1204] (2/4) Exclude cut with ID _14421_210_8750_1_1530604795825_3542290_190-313578-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 07:21:51,975 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.4.encoder.layers.1.conv_module1.whiten, num_groups=1, num_channels=384, metric=3.19 vs. limit=15.0 2023-10-04 07:21:52,731 WARNING [train.py:1204] (2/4) Exclude cut with ID _35799_210_5854_1_1533263487887_3529071_538-273574-0_sp1.1 from training. Duration: 0.936375 2023-10-04 07:21:54,134 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_14928_210_5632_1_1530773461_4714000_454-112198-0_sp1.1 from training. Duration: 0.6 2023-10-04 07:21:56,108 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_468-143126-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 07:21:58,153 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_4528_210_3273_1_1527076654_3276777_258-121681-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 07:21:59,485 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_397-132565-0_sp1.1 from training. Duration: 0.9 2023-10-04 07:21:59,504 WARNING [train.py:1204] (2/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_454-123002-0_sp0.9 from training. Duration: 0.7666875 2023-10-04 07:22:01,233 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.conv_module1.balancer2.prob, batch_count=1572093.3333333333, ans=0.125 2023-10-04 07:22:02,346 WARNING [train.py:1204] (2/4) Exclude cut with ID _27052_210_5986_1_1532329197187_3585009_297-86890-0_sp1.1 from training. Duration: 0.936375 2023-10-04 07:22:03,887 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_51225_210_3601_1_1534400634_2327383_55-301844-0_sp1.1 from training. Duration: 0.8454375 2023-10-04 07:22:06,529 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6786_210_1388_1_1527839699_4194495_378-46070-0_sp0.9 from training. Duration: 0.8 2023-10-04 07:22:08,009 WARNING [train.py:1204] (2/4) Exclude cut with ID _42334_210_7770_1_1534206610396_3605880_55-19758-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 07:22:12,207 WARNING [train.py:1204] (2/4) Exclude cut with ID _14884_210_2252_1_1530707459689_5561250_114-201399-0_sp0.9 from training. Duration: 0.8444375 2023-10-04 07:22:16,977 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_99-251314-0_sp0.9 from training. Duration: 0.9666875 2023-10-04 07:22:17,055 WARNING [train.py:1204] (2/4) Exclude cut with ID _26528_210_9070_1_1532570410842_3851310_497-97218-0 from training. Duration: 0.86 2023-10-04 07:22:20,602 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.bypass.skip_rate, batch_count=1572160.0, ans=0.04949747468305833 2023-10-04 07:22:23,011 WARNING [train.py:1204] (2/4) Exclude cut with ID _55328_210_27767_1_1534580973260_4088170_230-317241-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 07:22:24,388 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_125-203105-0_sp0.9 from training. Duration: 0.888875 2023-10-04 07:22:25,687 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.602e+02 2.015e+02 2.281e+02 2.581e+02 4.368e+02, threshold=4.563e+02, percent-clipped=0.0 2023-10-04 07:22:27,165 WARNING [train.py:1204] (2/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_55-31904-0_sp1.1 from training. Duration: 0.8 2023-10-04 07:22:29,176 WARNING [train.py:1204] (2/4) Exclude cut with ID _8064_210_5399_1_1528372299291_4282973_18-174647-0 from training. Duration: 0.91 2023-10-04 07:22:30,489 INFO [train.py:1046] (2/4) Epoch 45, batch 2100, loss[loss=0.1681, simple_loss=0.2385, pruned_loss=0.04883, over 23739.00 frames. ], tot_loss[loss=0.1543, simple_loss=0.2344, pruned_loss=0.03712, over 4718957.40 frames. ], batch size: 164, lr: 2.26e-03, grad_scale: 8.0 2023-10-04 07:22:31,940 WARNING [train.py:1204] (2/4) Exclude cut with ID IC0165W0005-81726 from training. Duration: 0.952 2023-10-04 07:22:31,941 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_31583_210_3852_1_1532595851_4024413_148-171847-0_sp1.1 from training. Duration: 0.963625 2023-10-04 07:22:33,232 WARNING [train.py:1204] (2/4) Exclude cut with ID _21989_210_5854_1_1531899214931_3271360_116-138769-0_sp1.1 from training. Duration: 0.936375 2023-10-04 07:22:33,294 WARNING [train.py:1204] (2/4) Exclude cut with ID _66070_210_7640_1_1535441393096_7805310_489-292631-0_sp1.1 from training. Duration: 0.7181875 2023-10-04 07:22:34,640 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_20120_210_6753_1_1531643035_5482859_202-27960-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 07:22:34,654 WARNING [train.py:1204] (2/4) Exclude cut with ID _5294_210_1747_1_1527249643928_3696179_342-270278-0 from training. Duration: 0.55 2023-10-04 07:22:34,689 WARNING [train.py:1204] (2/4) Exclude cut with ID _38323_210_12108_1_1533121233369_3303049_416-11919-0 from training. Duration: 0.96 2023-10-04 07:22:37,445 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6545_210_2040_1_1527774670_4245258_407-48070-0_sp0.9 from training. Duration: 0.8444375 2023-10-04 07:22:40,214 WARNING [train.py:1204] (2/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_503-30719-0_sp1.1 from training. Duration: 0.8909375 2023-10-04 07:22:40,282 WARNING [train.py:1204] (2/4) Exclude cut with ID _49643_210_7989_1_1534640244224_3939031_234-326497-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 07:22:41,783 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.0.conv_module1.balancer2.prob, batch_count=1572226.6666666667, ans=0.125 2023-10-04 07:22:42,996 WARNING [train.py:1204] (2/4) Exclude cut with ID _14170_210_5219_1_1530525949868_4469650_301-77873-0_sp1.1 from training. Duration: 0.963625 2023-10-04 07:22:43,047 WARNING [train.py:1204] (2/4) Exclude cut with ID _45297_210_2721_1_1533715351381_3264949_152-154697-0_sp1.1 from training. Duration: 0.97275 2023-10-04 07:22:43,056 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_3371_210_1794_1_1526384462_5580777_396-339812-0 from training. Duration: 0.85 2023-10-04 07:22:44,412 WARNING [train.py:1204] (2/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_399-156791-0_sp1.1 from training. Duration: 0.7545625 2023-10-04 07:22:46,293 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7696_210_2040_1_1528207167_3716099_117-125434-0 from training. Duration: 0.76 2023-10-04 07:22:46,304 WARNING [train.py:1204] (2/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_96-244299-0 from training. Duration: 0.88 2023-10-04 07:22:47,726 WARNING [train.py:1204] (2/4) Exclude cut with ID _37892_210_19596_1_1533198677897_3964058_519-343462-0_sp1.1 from training. Duration: 0.92725 2023-10-04 07:22:47,748 WARNING [train.py:1204] (2/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_202-183922-0_sp1.1 from training. Duration: 0.77275 2023-10-04 07:22:47,754 WARNING [train.py:1204] (2/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_453-133983-0 from training. Duration: 0.78 2023-10-04 07:22:47,786 WARNING [train.py:1204] (2/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_178-39722-0_sp1.1 from training. Duration: 0.4454375 2023-10-04 07:22:52,561 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_15126_210_6753_1_1530778677_2591185_113-157651-0 from training. Duration: 0.68 2023-10-04 07:22:52,562 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_271-234962-0_sp1.1 from training. Duration: 0.7181875 2023-10-04 07:22:56,618 WARNING [train.py:1204] (2/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_195-64967-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 07:22:56,658 WARNING [train.py:1204] (2/4) Exclude cut with ID _43552_210_8913_1_1533726031059_3523429_365-215065-0_sp1.1 from training. Duration: 0.936375 2023-10-04 07:23:00,514 WARNING [train.py:1204] (2/4) Exclude cut with ID _31602_210_5854_1_1532677021456_4458250_249-96604-0_sp0.9 from training. Duration: 0.988875 2023-10-04 07:23:01,805 WARNING [train.py:1204] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_307-158462-0 from training. Duration: 0.69 2023-10-04 07:23:01,851 WARNING [train.py:1204] (2/4) Exclude cut with ID _30918_210_6400_1_1532754078600_3125920_407-223394-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 07:23:01,855 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6673_210_3273_1_1527767790_3173240_351-167954-0_sp0.9 from training. Duration: 0.5333125 2023-10-04 07:23:04,555 WARNING [train.py:1204] (2/4) Exclude cut with ID _4866_210_2577_1_1527404412284_4364659_256-306830-0 from training. Duration: 0.87 2023-10-04 07:23:04,600 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_17613_210_2151_1_1531137569_2366160_222-131118-0_sp1.1 from training. Duration: 0.963625 2023-10-04 07:23:04,615 WARNING [train.py:1204] (2/4) Exclude cut with ID _45560_210_21982_1_1533812603951_3584999_111-6376-0 from training. Duration: 0.84 2023-10-04 07:23:04,643 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_216-73841-0 from training. Duration: 0.96 2023-10-04 07:23:04,677 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_14058_210_4163_1_1530690953_6327180_49-210687-0 from training. Duration: 0.94 2023-10-04 07:23:07,503 WARNING [train.py:1204] (2/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_506-42614-0_sp0.9 from training. Duration: 0.87775 2023-10-04 07:23:08,914 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_3133_210_2087_1_1526106446_2324690_25-332138-0_sp1.1 from training. Duration: 0.82725 2023-10-04 07:23:11,800 WARNING [train.py:1204] (2/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_589-9070-0_sp1.1 from training. Duration: 0.6454375 2023-10-04 07:23:11,923 WARNING [train.py:1204] (2/4) Exclude cut with ID _6194_210_1534_1_1527511826160_3941194_14-282876-0_sp1.1 from training. Duration: 0.6454375 2023-10-04 07:23:13,799 WARNING [train.py:1204] (2/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_1166-90654-0_sp1.1 from training. Duration: 0.92725 2023-10-04 07:23:16,473 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_14056_210_4163_1_1530604483_7572827_218-255858-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 07:23:16,475 WARNING [train.py:1204] (2/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_1285-256622-0 from training. Duration: 0.84 2023-10-04 07:23:16,488 WARNING [train.py:1204] (2/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_205-286261-0_sp1.1 from training. Duration: 0.963625 2023-10-04 07:23:16,503 WARNING [train.py:1204] (2/4) Exclude cut with ID _68777_210_4353_1_1535810390731_3117904_6-154119-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 07:23:16,573 WARNING [train.py:1204] (2/4) Exclude cut with ID _22982_210_1883_1_1531882790447_5449409_565-199393-0_sp1.1 from training. Duration: 0.92725 2023-10-04 07:23:16,585 WARNING [train.py:1204] (2/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_278-154492-0 from training. Duration: 0.76 2023-10-04 07:23:18,045 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_426-313557-0 from training. Duration: 0.53 2023-10-04 07:23:18,601 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.3.encoder.layers.2.conv_module2.whiten, num_groups=1, num_channels=512, metric=4.26 vs. limit=15.0 2023-10-04 07:23:19,290 WARNING [train.py:1204] (2/4) Exclude cut with ID _41817_210_18835_1_1533434477160_7423049_555-293448-0 from training. Duration: 0.8 2023-10-04 07:23:22,190 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.0.layers.1.nonlin_attention.whiten2, num_groups=1, num_channels=192, metric=5.44 vs. limit=15.0 2023-10-04 07:23:23,949 WARNING [train.py:1204] (2/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_32-335103-0_sp1.1 from training. Duration: 0.7454375 2023-10-04 07:23:25,441 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_2655_210_2087_1_1525688959_3897400_202-147557-0_sp1.1 from training. Duration: 0.77275 2023-10-04 07:23:26,732 WARNING [train.py:1204] (2/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_158-250116-0 from training. Duration: 0.62 2023-10-04 07:23:32,007 WARNING [train.py:1204] (2/4) Exclude cut with ID _55429_210_10626_1_1534467588527_7174143_528-88083-0_sp1.1 from training. Duration: 0.963625 2023-10-04 07:23:33,416 WARNING [train.py:1204] (2/4) Exclude cut with ID _8030_210_7497_1_1528444886781_3531089_32-31837-0_sp1.1 from training. Duration: 0.8818125 2023-10-04 07:23:33,436 WARNING [train.py:1204] (2/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_744-86168-0_sp1.1 from training. Duration: 0.97275 2023-10-04 07:23:34,738 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1113-7369-0_sp0.9 from training. Duration: 0.9444375 2023-10-04 07:23:34,762 WARNING [train.py:1204] (2/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_124-345840-0_sp0.9 from training. Duration: 0.57775 2023-10-04 07:23:35,974 WARNING [train.py:1204] (2/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_328-264776-0_sp0.9 from training. Duration: 0.7444375 2023-10-04 07:23:37,373 WARNING [train.py:1204] (2/4) Exclude cut with ID _18895_210_6228_1_1531357274234_3784930_469-166615-0_sp1.1 from training. Duration: 0.963625 2023-10-04 07:23:37,379 WARNING [train.py:1204] (2/4) Exclude cut with ID _8395_210_1531_1_1528597871523_4411579_431-132470-0_sp0.9 from training. Duration: 0.82225 2023-10-04 07:23:38,804 WARNING [train.py:1204] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_554-45617-0_sp1.1 from training. Duration: 0.9 2023-10-04 07:23:38,840 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_35361_210_4238_1_1533262810_7793476_524-188242-0_sp1.1 from training. Duration: 0.92725 2023-10-04 07:23:40,253 WARNING [train.py:1204] (2/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_938-205135-0 from training. Duration: 0.87 2023-10-04 07:23:41,685 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_5931_210_2040_1_1527428481_4547999_321-193614-0 from training. Duration: 0.67 2023-10-04 07:23:43,011 WARNING [train.py:1204] (2/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_445-269521-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 07:23:43,292 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.conv_skip_rate, batch_count=1572560.0, ans=0.0 2023-10-04 07:23:44,315 INFO [train.py:1046] (2/4) Epoch 45, batch 2150, loss[loss=0.1519, simple_loss=0.21, pruned_loss=0.04688, over 19244.00 frames. ], tot_loss[loss=0.1538, simple_loss=0.2338, pruned_loss=0.03687, over 4714603.06 frames. ], batch size: 388, lr: 2.26e-03, grad_scale: 8.0 2023-10-04 07:23:46,288 WARNING [train.py:1204] (2/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_3-33116-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 07:23:46,289 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_4633_210_2040_1_1526824163_4145343_416-76117-0_sp1.1 from training. Duration: 0.863625 2023-10-04 07:23:46,323 WARNING [train.py:1204] (2/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_1033-274376-0_sp1.1 from training. Duration: 0.7909375 2023-10-04 07:23:46,506 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.feed_forward2.hidden_balancer.prob, batch_count=1572560.0, ans=0.125 2023-10-04 07:23:47,632 WARNING [train.py:1204] (2/4) Exclude cut with ID _8772_210_1850_1_1528635595972_3664011_398-320049-0_sp0.9 from training. Duration: 0.97775 2023-10-04 07:23:53,199 WARNING [train.py:1204] (2/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_321-215742-0_sp0.9 from training. Duration: 0.5555625 2023-10-04 07:23:55,150 WARNING [train.py:1204] (2/4) Exclude cut with ID _28394_210_8913_1_1532430049518_3650118_272-190950-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 07:23:56,571 WARNING [train.py:1204] (2/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_31-299371-0_sp1.1 from training. Duration: 0.92725 2023-10-04 07:23:57,981 WARNING [train.py:1204] (2/4) Exclude cut with ID _55383_210_10635_1_1534468426172_2231669_134-128246-0_sp0.9 from training. Duration: 0.988875 2023-10-04 07:23:57,984 WARNING [train.py:1204] (2/4) Exclude cut with ID _40519_210_13794_1_1533715228655_7337390_887-9547-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 07:23:58,014 WARNING [train.py:1204] (2/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_310-109461-0_sp0.9 from training. Duration: 0.888875 2023-10-04 07:24:01,446 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_2009_210_2071_1_1525481134_4588937_258-250196-0_sp1.1 from training. Duration: 0.92725 2023-10-04 07:24:01,512 WARNING [train.py:1204] (2/4) Exclude cut with ID _9235_210_5219_1_1528891154253_8046120_1186-72702-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 07:24:01,515 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_14831_210_7927_1_1530700200_5583610_181-27467-0_sp1.1 from training. Duration: 0.82725 2023-10-04 07:24:03,687 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.nonlin_attention.balancer.prob, batch_count=1572626.6666666667, ans=0.125 2023-10-04 07:24:04,838 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.1.balancer_ff2.min_abs, batch_count=1572626.6666666667, ans=0.1 2023-10-04 07:24:04,910 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.nonlin_attention.balancer.prob, batch_count=1572626.6666666667, ans=0.125 2023-10-04 07:24:07,240 WARNING [train.py:1204] (2/4) Exclude cut with ID _40402_210_18219_1_1533522324115_2953150_278-132109-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 07:24:07,274 WARNING [train.py:1204] (2/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_261-297503-0 from training. Duration: 0.99 2023-10-04 07:24:09,519 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.1.encoder.layers.0.self_attn2.whiten, num_groups=1, num_channels=256, metric=10.99 vs. limit=22.5 2023-10-04 07:24:11,587 WARNING [train.py:1204] (2/4) Exclude cut with ID _19896_210_1883_1_1531709022662_6649569_313-265903-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 07:24:12,943 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_105-114797-0_sp1.1 from training. Duration: 0.763625 2023-10-04 07:24:13,431 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.5.encoder.layers.0.feed_forward3.out_whiten, num_groups=1, num_channels=256, metric=8.53 vs. limit=15.0 2023-10-04 07:24:14,309 WARNING [train.py:1204] (2/4) Exclude cut with ID _53755_210_8254_1_1534577312941_4707360_9-65843-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 07:24:14,342 WARNING [train.py:1204] (2/4) Exclude cut with ID _18516_210_9206_1_1531704653206_3692406_361-224549-0_sp1.1 from training. Duration: 0.97275 2023-10-04 07:24:15,660 WARNING [train.py:1204] (2/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_403-310975-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 07:24:15,703 WARNING [train.py:1204] (2/4) Exclude cut with ID _5444_210_2151_1_1527418808464_5312210_581-315411-0_sp0.9 from training. Duration: 0.688875 2023-10-04 07:24:15,752 WARNING [train.py:1204] (2/4) Exclude cut with ID _23825_210_13449_1_1531915313526_3497150_464-154904-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 07:24:15,761 WARNING [train.py:1204] (2/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_937-208281-0_sp0.9 from training. Duration: 0.9444375 2023-10-04 07:24:17,552 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_37839_210_7234_1_1533542462_3689303_158-181212-0_sp1.1 from training. Duration: 0.963625 2023-10-04 07:24:18,972 WARNING [train.py:1204] (2/4) Exclude cut with ID _8772_210_1850_1_1528635595972_3664011_398-320049-0 from training. Duration: 0.88 2023-10-04 07:24:19,324 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.ff3_skip_rate, batch_count=1572693.3333333333, ans=0.0 2023-10-04 07:24:20,317 WARNING [train.py:1204] (2/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_718-250907-0_sp1.1 from training. Duration: 0.62725 2023-10-04 07:24:20,390 WARNING [train.py:1204] (2/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_641-126395-0_sp1.1 from training. Duration: 0.92725 2023-10-04 07:24:21,715 WARNING [train.py:1204] (2/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_166-241493-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 07:24:21,984 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.ff3_skip_rate, batch_count=1572693.3333333333, ans=0.0 2023-10-04 07:24:23,128 WARNING [train.py:1204] (2/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_526-62079-0_sp0.9 from training. Duration: 0.7444375 2023-10-04 07:24:26,221 WARNING [train.py:1204] (2/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_439-275083-0_sp1.1 from training. Duration: 0.936375 2023-10-04 07:24:27,715 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_66907_210_21581_1_1535615661_3590177_493-229032-0_sp1.1 from training. Duration: 0.92725 2023-10-04 07:24:27,748 WARNING [train.py:1204] (2/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_89-321955-0_sp1.1 from training. Duration: 0.72725 2023-10-04 07:24:29,148 WARNING [train.py:1204] (2/4) Exclude cut with ID _29857_210_20034_1_1532395886753_3798160_273-226195-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 07:24:29,162 WARNING [train.py:1204] (2/4) Exclude cut with ID _22826_210_9994_1_1531908201522_3279169_404-75889-0 from training. Duration: 0.89 2023-10-04 07:24:29,192 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_271-234962-0_sp0.9 from training. Duration: 0.87775 2023-10-04 07:24:31,989 WARNING [train.py:1204] (2/4) Exclude cut with ID _22961_210_8254_1_1531976366672_4278180_156-339685-0_sp1.1 from training. Duration: 0.97275 2023-10-04 07:24:33,292 WARNING [train.py:1204] (2/4) Exclude cut with ID _37892_210_19596_1_1533198677897_3964058_425-231927-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 07:24:33,386 WARNING [train.py:1204] (2/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_102-288670-0_sp1.1 from training. Duration: 0.97275 2023-10-04 07:24:35,257 WARNING [train.py:1204] (2/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_283-220372-0_sp1.1 from training. Duration: 0.5090625 2023-10-04 07:24:35,332 WARNING [train.py:1204] (2/4) Exclude cut with ID _44304_210_15374_1_1533639352765_4613129_86-1699-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 07:24:36,686 WARNING [train.py:1204] (2/4) Exclude cut with ID _2891_210_2550_1_1525874398521_4427710_327-121590-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 07:24:36,693 WARNING [train.py:1204] (2/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_369-222060-0 from training. Duration: 0.71 2023-10-04 07:24:38,181 WARNING [train.py:1204] (2/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_762-109886-0 from training. Duration: 0.91 2023-10-04 07:24:38,213 WARNING [train.py:1204] (2/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_477-255782-0_sp0.9 from training. Duration: 0.911125 2023-10-04 07:24:38,241 WARNING [train.py:1204] (2/4) Exclude cut with ID IC0669W0061-330906 from training. Duration: 0.768 2023-10-04 07:24:38,288 WARNING [train.py:1204] (2/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_347-213071-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 07:24:38,322 WARNING [train.py:1204] (2/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_507-303370-0_sp1.1 from training. Duration: 0.87275 2023-10-04 07:24:39,711 WARNING [train.py:1204] (2/4) Exclude cut with ID _15012_210_1968_1_1530856820429_5095350_688-178849-0 from training. Duration: 0.64 2023-10-04 07:24:39,716 WARNING [train.py:1204] (2/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_28-70987-0_sp0.9 from training. Duration: 0.9666875 2023-10-04 07:24:39,718 WARNING [train.py:1204] (2/4) Exclude cut with ID _5910_210_1389_1_1527337972454_5050100_555-159815-0 from training. Duration: 0.98 2023-10-04 07:24:40,904 WARNING [train.py:1204] (2/4) Exclude cut with ID IC0679W0064-335871 from training. Duration: 0.768 2023-10-04 07:24:40,905 WARNING [train.py:1204] (2/4) Exclude cut with ID IC0679W0079-335886 from training. Duration: 0.896 2023-10-04 07:24:40,953 WARNING [train.py:1204] (2/4) Exclude cut with ID _5608_210_2106_1_1527415439089_6819768_909-82767-0 from training. Duration: 0.61 2023-10-04 07:24:42,299 WARNING [train.py:1204] (2/4) Exclude cut with ID _23038_210_9570_1_1532001584158_3652419_411-323967-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 07:24:42,848 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.4.encoder.layers.1.feed_forward2.out_whiten, num_groups=1, num_channels=384, metric=10.03 vs. limit=15.0 2023-10-04 07:24:43,685 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_40896_210_15710_1_1533433956_7100802_350-231748-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 07:24:43,714 WARNING [train.py:1204] (2/4) Exclude cut with ID _5289_210_2084_1_1527678202736_3779299_142-103317-0_sp0.9 from training. Duration: 0.9333125 2023-10-04 07:24:43,783 WARNING [train.py:1204] (2/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_314-144055-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 07:24:45,128 WARNING [train.py:1204] (2/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_177-236809-0_sp0.9 from training. Duration: 0.5444375 2023-10-04 07:24:47,105 WARNING [train.py:1204] (2/4) Exclude cut with ID _23038_210_9570_1_1532001584158_3652419_82-334733-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 07:24:47,121 WARNING [train.py:1204] (2/4) Exclude cut with ID _43552_210_8913_1_1533726031059_3523429_225-22526-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 07:24:47,737 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.4.encoder.layers.0.self_attn_weights.whiten_keys, num_groups=4, num_channels=128, metric=1.63 vs. limit=6.0 2023-10-04 07:24:48,708 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.conv_module2.balancer1.prob, batch_count=1572826.6666666667, ans=0.125 2023-10-04 07:24:54,372 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.710e+02 1.986e+02 2.257e+02 2.566e+02 4.341e+02, threshold=4.515e+02, percent-clipped=0.0 2023-10-04 07:24:55,885 WARNING [train.py:1204] (2/4) Exclude cut with ID _30327_210_16049_1_1532484161295_3924169_401-70819-0_sp1.1 from training. Duration: 0.7818125 2023-10-04 07:24:55,934 WARNING [train.py:1204] (2/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_538-32291-0 from training. Duration: 0.83 2023-10-04 07:24:58,659 INFO [train.py:1046] (2/4) Epoch 45, batch 2200, loss[loss=0.1638, simple_loss=0.2394, pruned_loss=0.04409, over 23254.00 frames. ], tot_loss[loss=0.1536, simple_loss=0.2339, pruned_loss=0.03669, over 4722740.40 frames. ], batch size: 105, lr: 2.26e-03, grad_scale: 8.0 2023-10-04 07:25:00,140 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_37357_210_17024_1_1533089504_2316121_258-211605-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 07:25:06,551 WARNING [train.py:1204] (2/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_550-335198-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 07:25:06,611 WARNING [train.py:1204] (2/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_98-741-0_sp0.9 from training. Duration: 0.888875 2023-10-04 07:25:06,661 WARNING [train.py:1204] (2/4) Exclude cut with ID _38323_210_12108_1_1533121233369_3303049_251-238988-0_sp1.1 from training. Duration: 0.92725 2023-10-04 07:25:07,233 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.3.encoder.layers.3.self_attn_weights.whiten_keys, num_groups=8, num_channels=256, metric=2.80 vs. limit=6.0 2023-10-04 07:25:07,994 WARNING [train.py:1204] (2/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_259-34038-0_sp0.9 from training. Duration: 0.82225 2023-10-04 07:25:09,615 WARNING [train.py:1204] (2/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_1060-180633-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 07:25:10,961 WARNING [train.py:1204] (2/4) Exclude cut with ID _8755_210_1876_1_1528628559357_3258209_211-37473-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 07:25:10,973 WARNING [train.py:1204] (2/4) Exclude cut with ID _4122_210_2084_1_1526713164417_4353280_528-206771-0 from training. Duration: 0.88 2023-10-04 07:25:11,208 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.conv_module2.balancer1.prob, batch_count=1572893.3333333333, ans=0.125 2023-10-04 07:25:12,547 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.feed_forward1.out_proj.dropout_p, batch_count=1572960.0, ans=0.1 2023-10-04 07:25:13,788 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.ff3_skip_rate, batch_count=1572960.0, ans=0.0 2023-10-04 07:25:14,941 WARNING [train.py:1204] (2/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_64-248359-0 from training. Duration: 0.69 2023-10-04 07:25:18,334 WARNING [train.py:1204] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_402-253027-0_sp1.1 from training. Duration: 0.4909375 2023-10-04 07:25:22,655 WARNING [train.py:1204] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_625-102955-0 from training. Duration: 0.77 2023-10-04 07:25:22,864 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.conv_skip_rate, batch_count=1572960.0, ans=0.0 2023-10-04 07:25:26,102 WARNING [train.py:1204] (2/4) Exclude cut with ID _14998_210_5219_1_1530705582080_7474970_611-219324-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 07:25:26,181 WARNING [train.py:1204] (2/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_718-68505-0_sp0.9 from training. Duration: 0.988875 2023-10-04 07:25:27,490 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_353-182855-0_sp1.1 from training. Duration: 0.87275 2023-10-04 07:25:30,436 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_5960_210_1794_1_1527403865_3990999_515-2613-0_sp0.9 from training. Duration: 0.97775 2023-10-04 07:25:32,345 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_5575_210_1410_1_1527301305_2570170_342-265572-0 from training. Duration: 0.94 2023-10-04 07:25:33,747 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.bypass_mid.scale_min, batch_count=1573026.6666666667, ans=0.2 2023-10-04 07:25:36,759 WARNING [train.py:1204] (2/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_329-113173-0_sp0.9 from training. Duration: 0.688875 2023-10-04 07:25:38,116 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_15396_210_6890_1_1531308473_1938936_213-312506-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 07:25:38,181 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_521-214276-0_sp1.1 from training. Duration: 0.4 2023-10-04 07:25:40,959 WARNING [train.py:1204] (2/4) Exclude cut with ID _15026_210_8750_1_1530777602316_3291850_211-195043-0_sp1.1 from training. Duration: 0.72725 2023-10-04 07:25:42,403 WARNING [train.py:1204] (2/4) Exclude cut with ID _29343_210_4381_1_1532602757752_3674109_108-257494-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 07:25:43,767 WARNING [train.py:1204] (2/4) Exclude cut with ID _64414_210_17301_1_1535243252048_3757759_403-58151-0_sp1.1 from training. Duration: 0.963625 2023-10-04 07:25:45,182 WARNING [train.py:1204] (2/4) Exclude cut with ID _36512_210_6987_1_1533033143998_3126069_109-177787-0_sp1.1 from training. Duration: 0.92725 2023-10-04 07:25:47,892 WARNING [train.py:1204] (2/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_498-40153-0 from training. Duration: 0.63 2023-10-04 07:25:48,076 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.conv_module2.balancer2.min_positive, batch_count=1573093.3333333333, ans=0.05 2023-10-04 07:25:49,231 WARNING [train.py:1204] (2/4) Exclude cut with ID _56447_210_1389_1_1535074245910_3697189_161-334523-0_sp1.1 from training. Duration: 0.936375 2023-10-04 07:25:52,319 WARNING [train.py:1204] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_238-245655-0 from training. Duration: 0.81 2023-10-04 07:25:54,016 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.nonlin_attention.balancer.min_positive, batch_count=1573093.3333333333, ans=0.05 2023-10-04 07:25:55,649 WARNING [train.py:1204] (2/4) Exclude cut with ID _64631_210_27767_1_1535349580068_4165160_108-43865-0_sp1.1 from training. Duration: 0.92725 2023-10-04 07:25:55,654 WARNING [train.py:1204] (2/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_243-42116-0_sp1.1 from training. Duration: 0.663625 2023-10-04 07:25:55,692 WARNING [train.py:1204] (2/4) Exclude cut with ID _66868_210_9527_1_1535509287954_4561140_594-132160-0_sp1.1 from training. Duration: 0.92725 2023-10-04 07:25:57,211 WARNING [train.py:1204] (2/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_48-338976-0_sp0.9 from training. Duration: 0.988875 2023-10-04 07:25:57,251 WARNING [train.py:1204] (2/4) Exclude cut with ID _5250_210_5219_1_1527249561806_8692110_137-100595-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 07:25:58,619 WARNING [train.py:1204] (2/4) Exclude cut with ID _49748_210_24116_1_1533986971737_3158910_12-271669-0_sp1.1 from training. Duration: 0.936375 2023-10-04 07:25:58,638 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_37357_210_17024_1_1533089504_2316121_206-80421-0_sp1.1 from training. Duration: 0.936375 2023-10-04 07:25:58,734 WARNING [train.py:1204] (2/4) Exclude cut with ID _7537_210_2084_1_1528502395431_3924359_113-199933-0_sp1.1 from training. Duration: 0.536375 2023-10-04 07:25:58,769 WARNING [train.py:1204] (2/4) Exclude cut with ID _55440_210_12651_1_1534496713364_2980520_31-59289-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 07:26:00,220 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1083-220122-0_sp0.9 from training. Duration: 0.5666875 2023-10-04 07:26:04,061 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_426-313557-0_sp1.1 from training. Duration: 0.4818125 2023-10-04 07:26:04,111 WARNING [train.py:1204] (2/4) Exclude cut with ID _35799_210_5854_1_1533263487887_3529071_324-94843-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 07:26:06,117 WARNING [train.py:1204] (2/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_264-108685-0_sp1.1 from training. Duration: 0.5 2023-10-04 07:26:08,793 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9012W0006-501141 from training. Duration: 0.896 2023-10-04 07:26:08,937 WARNING [train.py:1204] (2/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_890-5117-0_sp0.9 from training. Duration: 0.8666875 2023-10-04 07:26:10,881 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9023W0008-506301 from training. Duration: 0.896 2023-10-04 07:26:12,182 WARNING [train.py:1204] (2/4) Exclude cut with ID _13916_210_9994_1_1530788341126_3763149_60-191556-0_sp0.9 from training. Duration: 0.67775 2023-10-04 07:26:12,227 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9029W0331-509721 from training. Duration: 0.896 2023-10-04 07:26:13,493 INFO [train.py:1046] (2/4) Epoch 45, batch 2250, loss[loss=0.1604, simple_loss=0.2423, pruned_loss=0.03923, over 23926.00 frames. ], tot_loss[loss=0.1538, simple_loss=0.2343, pruned_loss=0.03669, over 4727469.72 frames. ], batch size: 86, lr: 2.26e-03, grad_scale: 8.0 2023-10-04 07:26:13,632 WARNING [train.py:1204] (2/4) Exclude cut with ID _35823_210_6917_1_1533187881914_7297830_567-108596-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 07:26:14,966 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7543_210_6753_1_1528544010_1110402_25-183860-0_sp1.1 from training. Duration: 0.42725 2023-10-04 07:26:16,361 WARNING [train.py:1204] (2/4) Exclude cut with ID _47278_210_12041_1_1533902418349_3576110_495-260035-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 07:26:17,739 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9051W0015-520281 from training. Duration: 0.9045625 2023-10-04 07:26:17,850 WARNING [train.py:1204] (2/4) Exclude cut with ID _14459_210_2228_1_1531443641946_7516700_707-66193-0_sp1.1 from training. Duration: 0.97275 2023-10-04 07:26:20,476 WARNING [train.py:1204] (2/4) Exclude cut with ID _14087_210_2253_1_1530529224512_3014029_219-224445-0_sp1.1 from training. Duration: 0.763625 2023-10-04 07:26:26,575 WARNING [train.py:1204] (2/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_345-167986-0_sp0.9 from training. Duration: 0.9333125 2023-10-04 07:26:28,122 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_34815_210_6388_1_1533381320_3571903_279-305565-0_sp1.1 from training. Duration: 0.836375 2023-10-04 07:26:30,922 WARNING [train.py:1204] (2/4) Exclude cut with ID _21987_210_5854_1_1531821500482_3637038_317-179212-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 07:26:32,213 WARNING [train.py:1204] (2/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_395-37222-0_sp1.1 from training. Duration: 0.6909375 2023-10-04 07:26:32,296 WARNING [train.py:1204] (2/4) Exclude cut with ID _8064_210_5399_1_1528372299291_4282973_168-50337-0_sp1.1 from training. Duration: 0.763625 2023-10-04 07:26:39,620 WARNING [train.py:1204] (2/4) Exclude cut with ID _60113_210_16271_1_1535164212481_4386079_559-167191-0 from training. Duration: 0.93 2023-10-04 07:26:39,641 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_47228_210_4983_1_1533780021_3672027_344-179642-0_sp1.1 from training. Duration: 0.92725 2023-10-04 07:26:39,676 WARNING [train.py:1204] (2/4) Exclude cut with ID _22961_210_8254_1_1531976366672_4278180_317-199546-0_sp0.9 from training. Duration: 0.9555625 2023-10-04 07:26:42,299 WARNING [train.py:1204] (2/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_141-89727-0 from training. Duration: 0.71 2023-10-04 07:26:42,359 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_78-116075-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 07:26:42,493 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=1573293.3333333333, ans=0.1 2023-10-04 07:26:44,266 WARNING [train.py:1204] (2/4) Exclude cut with ID _64086_210_7994_1_1535270417019_7435949_52-235756-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 07:26:45,687 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7615_210_4211_1_1528110965_4580160_72-235211-0_sp1.1 from training. Duration: 0.6909375 2023-10-04 07:26:48,620 WARNING [train.py:1204] (2/4) Exclude cut with ID _14069_210_5632_1_1530512692033_4803390_287-27950-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 07:26:50,012 WARNING [train.py:1204] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_74-48717-0_sp0.9 from training. Duration: 0.6444375 2023-10-04 07:26:50,036 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7544_210_6753_1_1528591327_1471426_172-348777-0_sp1.1 from training. Duration: 0.6 2023-10-04 07:26:51,590 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.self_attn_weights.pos_emb_skip_rate, batch_count=1573360.0, ans=0.0 2023-10-04 07:26:51,668 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.bypass.skip_rate, batch_count=1573360.0, ans=0.07 2023-10-04 07:26:52,746 WARNING [train.py:1204] (2/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_220-191080-0 from training. Duration: 0.98 2023-10-04 07:26:54,058 WARNING [train.py:1204] (2/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_1081-137641-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 07:26:55,663 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.ff2_skip_rate, batch_count=1573360.0, ans=0.0 2023-10-04 07:26:57,333 WARNING [train.py:1204] (2/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_278-235459-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 07:26:58,073 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.2.encoder.layers.0.feed_forward3.out_whiten, num_groups=1, num_channels=384, metric=12.15 vs. limit=15.0 2023-10-04 07:27:00,882 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.conv_module1.balancer2.prob, batch_count=1573426.6666666667, ans=0.125 2023-10-04 07:27:01,969 WARNING [train.py:1204] (2/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_1181-77470-0_sp1.1 from training. Duration: 0.936375 2023-10-04 07:27:02,097 WARNING [train.py:1204] (2/4) Exclude cut with ID _60860_210_8254_1_1534896015875_3557169_198-5531-0_sp1.1 from training. Duration: 0.936375 2023-10-04 07:27:03,474 WARNING [train.py:1204] (2/4) Exclude cut with ID _12369_210_4381_1_1530356078581_4388520_17-289781-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 07:27:03,479 WARNING [train.py:1204] (2/4) Exclude cut with ID _50310_210_15488_1_1534068043345_3641080_146-199752-0_sp1.1 from training. Duration: 0.92725 2023-10-04 07:27:05,012 WARNING [train.py:1204] (2/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_969-213354-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 07:27:07,652 WARNING [train.py:1204] (2/4) Exclude cut with ID _70519_210_4163_1_1535961595994_7458230_66-344156-0_sp1.1 from training. Duration: 0.963625 2023-10-04 07:27:07,938 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.balancer1.prob, batch_count=1573426.6666666667, ans=0.125 2023-10-04 07:27:10,199 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.4.encoder.layers.1.nonlin_attention.whiten2, num_groups=1, num_channels=384, metric=3.13 vs. limit=15.0 2023-10-04 07:27:12,263 WARNING [train.py:1204] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_380-204756-0_sp1.1 from training. Duration: 0.7818125 2023-10-04 07:27:13,660 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_128-202104-0_sp0.9 from training. Duration: 0.7 2023-10-04 07:27:19,737 WARNING [train.py:1204] (2/4) Exclude cut with ID _4697_210_2069_1_1526815801953_3823098_51-1049-0_sp0.9 from training. Duration: 0.8333125 2023-10-04 07:27:19,754 WARNING [train.py:1204] (2/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_86-66192-0_sp1.1 from training. Duration: 0.5 2023-10-04 07:27:20,928 WARNING [train.py:1204] (2/4) Exclude cut with ID _14069_210_5632_1_1530512692033_4803390_95-1896-0_sp1.1 from training. Duration: 0.8909375 2023-10-04 07:27:23,220 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.1.encoder.layers.0.conv_module2.whiten, num_groups=1, num_channels=256, metric=6.76 vs. limit=15.0 2023-10-04 07:27:23,905 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.bypass.scale_min, batch_count=1573493.3333333333, ans=0.2 2023-10-04 07:27:25,018 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.672e+02 1.969e+02 2.340e+02 2.701e+02 4.212e+02, threshold=4.680e+02, percent-clipped=0.0 2023-10-04 07:27:25,161 WARNING [train.py:1204] (2/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_29-293896-0_sp1.1 from training. Duration: 0.5545625 2023-10-04 07:27:26,644 WARNING [train.py:1204] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_167-137255-0_sp0.9 from training. Duration: 0.711125 2023-10-04 07:27:26,645 WARNING [train.py:1204] (2/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_217-95056-0 from training. Duration: 0.98 2023-10-04 07:27:26,668 WARNING [train.py:1204] (2/4) Exclude cut with ID _47278_210_12041_1_1533902418349_3576110_97-266746-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 07:27:28,408 WARNING [train.py:1204] (2/4) Exclude cut with ID _14224_210_4381_1_1530529062037_4264010_380-343670-0_sp1.1 from training. Duration: 0.8 2023-10-04 07:27:29,707 INFO [train.py:1046] (2/4) Epoch 45, batch 2300, loss[loss=0.1577, simple_loss=0.2477, pruned_loss=0.03382, over 24538.00 frames. ], tot_loss[loss=0.1545, simple_loss=0.2352, pruned_loss=0.03694, over 4730064.34 frames. ], batch size: 71, lr: 2.26e-03, grad_scale: 8.0 2023-10-04 07:27:29,848 WARNING [train.py:1204] (2/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_107-332398-0 from training. Duration: 0.78 2023-10-04 07:27:33,155 WARNING [train.py:1204] (2/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_91-267696-0_sp1.1 from training. Duration: 0.7909375 2023-10-04 07:27:34,475 WARNING [train.py:1204] (2/4) Exclude cut with ID _41298_210_9019_1_1533430371450_4822143_498-186993-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 07:27:39,900 WARNING [train.py:1204] (2/4) Exclude cut with ID _17956_210_4353_1_1531209568125_6979970_54-77610-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 07:27:40,570 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_40047_210_2290_1_1533290519_2374311_257-335162-0_sp1.1 from training. Duration: 0.863625 2023-10-04 07:27:40,794 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.conv_module1.balancer2.prob, batch_count=1573560.0, ans=0.125 2023-10-04 07:27:43,155 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9653W0193-675471 from training. Duration: 0.9656875 2023-10-04 07:27:43,287 WARNING [train.py:1204] (2/4) Exclude cut with ID _25248_210_8750_1_1532062825941_5554180_243-240536-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 07:27:51,959 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_32360_210_4198_1_1532689160_3648436_60-51908-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 07:27:51,967 WARNING [train.py:1204] (2/4) Exclude cut with ID _4092_210_2151_1_1526643044449_3135759_227-213972-0_sp0.9 from training. Duration: 0.6 2023-10-04 07:27:52,009 WARNING [train.py:1204] (2/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_392-173500-0_sp1.1 from training. Duration: 0.97275 2023-10-04 07:27:53,442 WARNING [train.py:1204] (2/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_71-329150-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 07:27:53,445 WARNING [train.py:1204] (2/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_866-173440-0 from training. Duration: 0.9 2023-10-04 07:27:53,518 WARNING [train.py:1204] (2/4) Exclude cut with ID _14832_210_8750_1_1530691351539_3171348_1-54315-0_sp0.9 from training. Duration: 0.9666875 2023-10-04 07:27:55,001 WARNING [train.py:1204] (2/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_430-116139-0_sp1.1 from training. Duration: 0.77275 2023-10-04 07:27:56,343 WARNING [train.py:1204] (2/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_356-45054-0_sp0.9 from training. Duration: 0.9555625 2023-10-04 07:27:59,172 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_3531_210_3528_1_1526294203_5216456_309-227302-0_sp0.9 from training. Duration: 0.7666875 2023-10-04 07:28:02,525 WARNING [train.py:1204] (2/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_301-330455-0_sp1.1 from training. Duration: 0.72725 2023-10-04 07:28:06,540 WARNING [train.py:1204] (2/4) Exclude cut with ID _23557_210_8750_1_1531890106370_6846255_667-145945-0_sp1.1 from training. Duration: 0.92725 2023-10-04 07:28:09,930 WARNING [train.py:1204] (2/4) Exclude cut with ID _14860_210_4915_1_1530702020340_7343240_836-328489-0_sp1.1 from training. Duration: 0.8181875 2023-10-04 07:28:09,978 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_37152_210_9697_1_1533106573_3844188_416-333249-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 07:28:10,560 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.3.encoder.layers.3.feed_forward1.out_whiten, num_groups=1, num_channels=512, metric=6.35 vs. limit=15.0 2023-10-04 07:28:13,973 WARNING [train.py:1204] (2/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_538-32291-0_sp0.9 from training. Duration: 0.92225 2023-10-04 07:28:15,389 WARNING [train.py:1204] (2/4) Exclude cut with ID _32704_210_8954_1_1533017488840_6747370_965-176824-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 07:28:18,338 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=1573760.0, ans=0.1 2023-10-04 07:28:19,514 WARNING [train.py:1204] (2/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_86-19601-0_sp1.1 from training. Duration: 0.77275 2023-10-04 07:28:21,331 WARNING [train.py:1204] (2/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_436-318113-0_sp1.1 from training. Duration: 0.6818125 2023-10-04 07:28:21,373 WARNING [train.py:1204] (2/4) Exclude cut with ID _41817_210_18835_1_1533434477160_7423049_555-293448-0_sp0.9 from training. Duration: 0.888875 2023-10-04 07:28:21,391 WARNING [train.py:1204] (2/4) Exclude cut with ID _4697_210_2069_1_1526815801953_3823098_68-219807-0 from training. Duration: 0.47 2023-10-04 07:28:25,534 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7544_210_6753_1_1528591327_1471426_33-346031-0_sp1.1 from training. Duration: 0.5545625 2023-10-04 07:28:25,536 WARNING [train.py:1204] (2/4) Exclude cut with ID _49748_210_24116_1_1533986971737_3158910_263-52113-0_sp1.1 from training. Duration: 0.97275 2023-10-04 07:28:26,854 WARNING [train.py:1204] (2/4) Exclude cut with ID _64639_210_5816_1_1535367619437_3528000_340-24580-0_sp1.1 from training. Duration: 0.92725 2023-10-04 07:28:28,155 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_47228_210_4983_1_1533780021_3672027_479-114941-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 07:28:28,180 WARNING [train.py:1204] (2/4) Exclude cut with ID _44585_210_18628_1_1533605350387_4518199_506-119659-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 07:28:28,267 WARNING [train.py:1204] (2/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_314-126819-0_sp1.1 from training. Duration: 0.4545625 2023-10-04 07:28:28,271 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_2541_210_1794_1_1525590269_3084765_156-173713-0_sp0.9 from training. Duration: 0.9 2023-10-04 07:28:28,324 WARNING [train.py:1204] (2/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_108-255430-0 from training. Duration: 0.97 2023-10-04 07:28:29,566 WARNING [train.py:1204] (2/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_791-45149-0_sp1.1 from training. Duration: 0.7818125 2023-10-04 07:28:29,573 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_53378_210_17282_1_1534227054_1261425_10-118295-0_sp1.1 from training. Duration: 0.97275 2023-10-04 07:28:29,639 WARNING [train.py:1204] (2/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_148-158509-0 from training. Duration: 0.41 2023-10-04 07:28:36,390 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_34726_210_4525_1_1533019474_5030395_687-291305-0_sp1.1 from training. Duration: 0.936375 2023-10-04 07:28:41,008 WARNING [train.py:1204] (2/4) Exclude cut with ID _14860_210_4915_1_1530702020340_7343240_264-324537-0_sp1.1 from training. Duration: 0.963625 2023-10-04 07:28:43,540 INFO [train.py:1046] (2/4) Epoch 45, batch 2350, loss[loss=0.1407, simple_loss=0.2164, pruned_loss=0.03254, over 24274.00 frames. ], tot_loss[loss=0.1551, simple_loss=0.2353, pruned_loss=0.03743, over 4721846.82 frames. ], batch size: 56, lr: 2.26e-03, grad_scale: 8.0 2023-10-04 07:28:43,697 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_41251_210_15368_1_1533429767_4820695_591-46386-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 07:28:43,720 WARNING [train.py:1204] (2/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_86-19601-0_sp0.9 from training. Duration: 0.9444375 2023-10-04 07:28:45,084 WARNING [train.py:1204] (2/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_401-255491-0_sp0.9 from training. Duration: 0.77775 2023-10-04 07:28:46,537 WARNING [train.py:1204] (2/4) Exclude cut with ID _8696_210_1876_1_1528545632440_3345058_209-54069-0_sp1.1 from training. Duration: 0.6090625 2023-10-04 07:28:46,548 WARNING [train.py:1204] (2/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_369-325184-0_sp1.1 from training. Duration: 0.9 2023-10-04 07:28:47,801 WARNING [train.py:1204] (2/4) Exclude cut with ID _14687_210_5854_1_1530703838259_3664889_115-239046-0_sp0.9 from training. Duration: 0.7444375 2023-10-04 07:28:47,853 WARNING [train.py:1204] (2/4) Exclude cut with ID _8064_210_5399_1_1528372299291_4282973_168-50337-0 from training. Duration: 0.84 2023-10-04 07:28:53,443 WARNING [train.py:1204] (2/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_909-64151-0_sp1.1 from training. Duration: 0.8545625 2023-10-04 07:28:53,452 WARNING [train.py:1204] (2/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_597-154640-0 from training. Duration: 0.9 2023-10-04 07:28:58,397 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.feed_forward2.hidden_balancer.prob, batch_count=1573960.0, ans=0.125 2023-10-04 07:28:59,584 WARNING [train.py:1204] (2/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_126-274694-0 from training. Duration: 0.98 2023-10-04 07:29:02,902 WARNING [train.py:1204] (2/4) Exclude cut with ID _21783_210_11357_1_1531742458730_3762679_344-269786-0_sp1.1 from training. Duration: 0.97275 2023-10-04 07:29:03,136 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.conv_skip_rate, batch_count=1573960.0, ans=0.0 2023-10-04 07:29:04,931 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_37818_210_4238_1_1533620910_4720083_322-253154-0_sp1.1 from training. Duration: 0.92725 2023-10-04 07:29:04,940 WARNING [train.py:1204] (2/4) Exclude cut with ID _58895_210_8750_1_1535184081320_3649089_221-15823-0_sp1.1 from training. Duration: 0.92725 2023-10-04 07:29:04,949 WARNING [train.py:1204] (2/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_9-21737-0_sp1.1 from training. Duration: 0.8818125 2023-10-04 07:29:06,268 WARNING [train.py:1204] (2/4) Exclude cut with ID _14069_210_5632_1_1530512692033_4803390_522-341588-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 07:29:06,363 WARNING [train.py:1204] (2/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_363-185703-0 from training. Duration: 0.95 2023-10-04 07:29:09,138 WARNING [train.py:1204] (2/4) Exclude cut with ID _41817_210_18835_1_1533434477160_7423049_265-76216-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 07:29:10,236 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.nonlin_attention.whiten2.whitening_limit, batch_count=1573960.0, ans=15.0 2023-10-04 07:29:13,951 WARNING [train.py:1204] (2/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_841-341679-0 from training. Duration: 0.58 2023-10-04 07:29:15,359 WARNING [train.py:1204] (2/4) Exclude cut with ID _55429_210_10626_1_1534467588527_7174143_97-335084-0_sp1.1 from training. Duration: 0.8818125 2023-10-04 07:29:18,092 WARNING [train.py:1204] (2/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_690-126306-0_sp0.9 from training. Duration: 0.8666875 2023-10-04 07:29:18,108 WARNING [train.py:1204] (2/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_102-92751-0_sp1.1 from training. Duration: 0.9 2023-10-04 07:29:19,585 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_3787_210_2040_1_1526477544_5478586_603-120324-0_sp1.1 from training. Duration: 0.736375 2023-10-04 07:29:20,996 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_33348_210_5632_1_1533086912_3324030_85-28583-0 from training. Duration: 0.82 2023-10-04 07:29:22,337 WARNING [train.py:1204] (2/4) Exclude cut with ID _60057_210_10640_1_1534903065793_3333150_362-230377-0_sp1.1 from training. Duration: 0.7909375 2023-10-04 07:29:23,791 WARNING [train.py:1204] (2/4) Exclude cut with ID _60113_210_16271_1_1535164212481_4386079_194-146486-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 07:29:23,816 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_53753_210_7234_1_1534320003_3594148_171-277668-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 07:29:23,950 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.conv_module1.balancer1.prob, batch_count=1574026.6666666667, ans=0.125 2023-10-04 07:29:25,609 WARNING [train.py:1204] (2/4) Exclude cut with ID _34423_210_8750_1_1533430798456_3330199_251-332788-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 07:29:28,850 WARNING [train.py:1204] (2/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_663-166973-0_sp0.9 from training. Duration: 0.888875 2023-10-04 07:29:30,291 WARNING [train.py:1204] (2/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_927-43827-0 from training. Duration: 0.74 2023-10-04 07:29:31,005 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.3.encoder.layers.0.whiten, num_groups=1, num_channels=512, metric=5.90 vs. limit=12.0 2023-10-04 07:29:31,596 WARNING [train.py:1204] (2/4) Exclude cut with ID _28323_210_16187_1_1532480395667_4100300_306-74561-0_sp1.1 from training. Duration: 0.8545625 2023-10-04 07:29:33,010 WARNING [train.py:1204] (2/4) Exclude cut with ID _15037_210_2253_1_1530752294104_3267650_139-337989-0_sp1.1 from training. Duration: 0.92725 2023-10-04 07:29:33,023 WARNING [train.py:1204] (2/4) Exclude cut with ID _49039_210_6315_1_1533967537541_3846118_460-263670-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 07:29:36,407 WARNING [train.py:1204] (2/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_186-309161-0 from training. Duration: 0.54 2023-10-04 07:29:37,742 WARNING [train.py:1204] (2/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_87-278686-0_sp1.1 from training. Duration: 0.7 2023-10-04 07:29:39,741 WARNING [train.py:1204] (2/4) Exclude cut with ID _27052_210_5986_1_1532329197187_3585009_314-106499-0 from training. Duration: 0.92 2023-10-04 07:29:39,761 WARNING [train.py:1204] (2/4) Exclude cut with ID _3465_210_3525_1_1526175667060_3574049_412-287526-0_sp0.9 from training. Duration: 0.8 2023-10-04 07:29:45,170 WARNING [train.py:1204] (2/4) Exclude cut with ID _22961_210_8254_1_1531976366672_4278180_317-199546-0 from training. Duration: 0.86 2023-10-04 07:29:45,351 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.bypass_mid.scale_min, batch_count=1574160.0, ans=0.2 2023-10-04 07:29:49,304 WARNING [train.py:1204] (2/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_381-317791-0 from training. Duration: 0.73 2023-10-04 07:29:49,352 WARNING [train.py:1204] (2/4) Exclude cut with ID _44010_210_4525_1_1533884262964_4013620_560-310411-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 07:29:49,360 WARNING [train.py:1204] (2/4) Exclude cut with ID _5294_210_1747_1_1527249643928_3696179_342-270278-0_sp0.9 from training. Duration: 0.611125 2023-10-04 07:29:50,605 WARNING [train.py:1204] (2/4) Exclude cut with ID ID0473W0002-918951 from training. Duration: 0.924 2023-10-04 07:29:50,623 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1083-220122-0 from training. Duration: 0.51 2023-10-04 07:29:52,148 WARNING [train.py:1204] (2/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_138-154812-0 from training. Duration: 0.78 2023-10-04 07:29:52,408 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.bypass_mid.scale_min, batch_count=1574160.0, ans=0.2 2023-10-04 07:29:53,380 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.701e+02 2.034e+02 2.312e+02 2.597e+02 4.365e+02, threshold=4.623e+02, percent-clipped=0.0 2023-10-04 07:29:55,387 WARNING [train.py:1204] (2/4) Exclude cut with ID _44982_210_1511_1_1534312820108_3726029_331-19420-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 07:29:58,606 INFO [train.py:1046] (2/4) Epoch 45, batch 2400, loss[loss=0.1454, simple_loss=0.2324, pruned_loss=0.02918, over 23983.00 frames. ], tot_loss[loss=0.1548, simple_loss=0.2348, pruned_loss=0.03736, over 4717392.93 frames. ], batch size: 86, lr: 2.26e-03, grad_scale: 16.0 2023-10-04 07:29:58,736 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_2655_210_2087_1_1525688959_3897400_202-147557-0_sp0.9 from training. Duration: 0.9444375 2023-10-04 07:30:01,609 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_23850_210_4238_1_1532232597_1632433_32-185135-0_sp0.9 from training. Duration: 0.9555625 2023-10-04 07:30:03,009 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_14056_210_4163_1_1530604483_7572827_46-93547-0_sp1.1 from training. Duration: 0.863625 2023-10-04 07:30:04,307 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7931_210_1794_1_1528594728_5564059_280-279560-0 from training. Duration: 0.98 2023-10-04 07:30:04,346 WARNING [train.py:1204] (2/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_937-208281-0 from training. Duration: 0.85 2023-10-04 07:30:09,150 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.feed_forward3.hidden_balancer.prob, batch_count=1574226.6666666667, ans=0.125 2023-10-04 07:30:12,023 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_14928_210_5632_1_1530773461_4714000_454-112198-0_sp0.9 from training. Duration: 0.7333125 2023-10-04 07:30:12,024 WARNING [train.py:1204] (2/4) Exclude cut with ID _32704_210_8954_1_1533017488840_6747370_944-153333-0_sp1.1 from training. Duration: 0.963625 2023-10-04 07:30:13,707 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.feed_forward3.hidden_balancer.prob, batch_count=1574293.3333333333, ans=0.125 2023-10-04 07:30:15,995 WARNING [train.py:1204] (2/4) Exclude cut with ID _14821_210_8750_1_1530680503991_6829359_2-227151-0 from training. Duration: 0.83 2023-10-04 07:30:16,019 WARNING [train.py:1204] (2/4) Exclude cut with ID _28461_210_2253_1_1532771775189_3041299_96-152158-0_sp1.1 from training. Duration: 0.77275 2023-10-04 07:30:17,362 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7837_210_1792_1_1528453445_5841305_654-54195-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 07:30:17,395 WARNING [train.py:1204] (2/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_344-152526-0 from training. Duration: 0.53 2023-10-04 07:30:22,929 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_30167_210_3528_1_1532428012_4772375_480-142230-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 07:30:24,451 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_160-218721-0 from training. Duration: 0.96 2023-10-04 07:30:29,308 WARNING [train.py:1204] (2/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_65-88242-0_sp0.9 from training. Duration: 0.62225 2023-10-04 07:30:33,395 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_34886_210_10633_1_1533203882_1755661_76-156096-0 from training. Duration: 0.87 2023-10-04 07:30:34,830 WARNING [train.py:1204] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_188-38388-0_sp1.1 from training. Duration: 0.92725 2023-10-04 07:30:37,820 WARNING [train.py:1204] (2/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_81-289335-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 07:30:38,118 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.attention_skip_rate, batch_count=1574360.0, ans=0.0 2023-10-04 07:30:42,637 WARNING [train.py:1204] (2/4) Exclude cut with ID _59493_210_17282_1_1535078419077_3225879_110-8778-0_sp1.1 from training. Duration: 0.936375 2023-10-04 07:30:42,669 WARNING [train.py:1204] (2/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_188-260011-0 from training. Duration: 0.73 2023-10-04 07:30:42,726 WARNING [train.py:1204] (2/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_453-133983-0_sp0.9 from training. Duration: 0.8666875 2023-10-04 07:30:44,255 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder_embed.conv.5.prob, batch_count=1574426.6666666667, ans=0.125 2023-10-04 07:30:49,454 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8054_210_7927_1_1528627721_4984094_553-17379-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 07:30:52,157 WARNING [train.py:1204] (2/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_341-171072-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 07:30:55,666 WARNING [train.py:1204] (2/4) Exclude cut with ID _30800_210_22382_1_1532499347904_4515959_265-257286-0_sp1.1 from training. Duration: 0.963625 2023-10-04 07:30:55,751 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_403-39619-0_sp0.9 from training. Duration: 0.9333125 2023-10-04 07:30:55,756 WARNING [train.py:1204] (2/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_464-33931-0_sp0.9 from training. Duration: 0.6 2023-10-04 07:30:55,776 WARNING [train.py:1204] (2/4) Exclude cut with ID _60804_210_9642_1_1534831199859_7268299_632-143863-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 07:30:55,790 WARNING [train.py:1204] (2/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_431-310997-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 07:30:57,164 WARNING [train.py:1204] (2/4) Exclude cut with ID _31649_210_6228_1_1532584608063_4091078_114-263860-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 07:30:57,182 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6961_210_2087_1_1527937168_6815011_562-165518-0_sp0.9 from training. Duration: 0.8555625 2023-10-04 07:30:58,819 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.bypass.scale_min, batch_count=1574493.3333333333, ans=0.2 2023-10-04 07:31:01,708 WARNING [train.py:1204] (2/4) Exclude cut with ID _27052_210_5986_1_1532329197187_3585009_338-325870-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 07:31:03,029 WARNING [train.py:1204] (2/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_469-94673-0_sp1.1 from training. Duration: 0.7181875 2023-10-04 07:31:03,066 WARNING [train.py:1204] (2/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_259-34038-0 from training. Duration: 0.74 2023-10-04 07:31:03,154 WARNING [train.py:1204] (2/4) Exclude cut with ID _14922_210_6228_1_1530707435273_3693310_388-129552-0 from training. Duration: 0.96 2023-10-04 07:31:05,955 WARNING [train.py:1204] (2/4) Exclude cut with ID _28394_210_8913_1_1532430049518_3650118_342-251455-0_sp1.1 from training. Duration: 0.936375 2023-10-04 07:31:05,979 WARNING [train.py:1204] (2/4) Exclude cut with ID _53731_210_7994_1_1534553979244_3757214_8-191390-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 07:31:07,261 WARNING [train.py:1204] (2/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_613-232856-0 from training. Duration: 0.54 2023-10-04 07:31:07,333 WARNING [train.py:1204] (2/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_557-349534-0 from training. Duration: 0.91 2023-10-04 07:31:08,641 WARNING [train.py:1204] (2/4) Exclude cut with ID _14224_210_4381_1_1530529062037_4264010_379-184223-0 from training. Duration: 0.85 2023-10-04 07:31:08,644 WARNING [train.py:1204] (2/4) Exclude cut with ID IC0125W0001-61807 from training. Duration: 0.966 2023-10-04 07:31:10,438 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_229-45300-0 from training. Duration: 0.57 2023-10-04 07:31:11,740 INFO [train.py:1046] (2/4) Epoch 45, batch 2450, loss[loss=0.1402, simple_loss=0.2235, pruned_loss=0.02842, over 24583.00 frames. ], tot_loss[loss=0.1541, simple_loss=0.234, pruned_loss=0.03715, over 4723498.10 frames. ], batch size: 60, lr: 2.26e-03, grad_scale: 16.0 2023-10-04 07:31:11,794 WARNING [train.py:1204] (2/4) Exclude cut with ID _30304_210_6319_1_1532476827261_7582060_1069-24743-0_sp1.1 from training. Duration: 0.92725 2023-10-04 07:31:11,911 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_41452_210_7291_1_1533437687_3941281_323-172873-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 07:31:11,921 WARNING [train.py:1204] (2/4) Exclude cut with ID _37086_210_9038_1_1532999540891_7021250_802-226709-0_sp1.1 from training. Duration: 0.97275 2023-10-04 07:31:13,819 WARNING [train.py:1204] (2/4) Exclude cut with ID IC0144W0500-71767 from training. Duration: 0.931875 2023-10-04 07:31:15,148 WARNING [train.py:1204] (2/4) Exclude cut with ID _39846_210_12651_1_1533605310265_2115212_233-129699-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 07:31:15,214 WARNING [train.py:1204] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_574-45541-0_sp0.9 from training. Duration: 0.72225 2023-10-04 07:31:19,393 WARNING [train.py:1204] (2/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_372-298093-0_sp1.1 from training. Duration: 0.72725 2023-10-04 07:31:19,405 WARNING [train.py:1204] (2/4) Exclude cut with ID _62574_210_2655_1_1535180407058_3933249_34-26343-0_sp1.1 from training. Duration: 0.97275 2023-10-04 07:31:22,265 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_512-256238-0_sp1.1 from training. Duration: 0.963625 2023-10-04 07:31:22,271 WARNING [train.py:1204] (2/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_196-55502-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 07:31:22,374 WARNING [train.py:1204] (2/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_195-167011-0 from training. Duration: 0.75 2023-10-04 07:31:26,896 WARNING [train.py:1204] (2/4) Exclude cut with ID _13678_210_7497_1_1530530069226_3463170_345-230416-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 07:31:28,806 WARNING [train.py:1204] (2/4) Exclude cut with ID _51078_210_9070_1_1534161557424_3805250_225-327809-0_sp1.1 from training. Duration: 0.963625 2023-10-04 07:31:31,654 WARNING [train.py:1204] (2/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_18-189198-0_sp1.1 from training. Duration: 0.8181875 2023-10-04 07:31:32,989 WARNING [train.py:1204] (2/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_329-82235-0_sp1.1 from training. Duration: 0.7454375 2023-10-04 07:31:32,997 WARNING [train.py:1204] (2/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_726-270780-0_sp0.9 from training. Duration: 0.9555625 2023-10-04 07:31:33,032 WARNING [train.py:1204] (2/4) Exclude cut with ID _66873_210_5517_1_1535454136506_4293370_654-52713-0 from training. Duration: 0.77 2023-10-04 07:31:36,197 WARNING [train.py:1204] (2/4) Exclude cut with ID _20455_210_5219_1_1531565996775_8098100_191-16006-0_sp1.1 from training. Duration: 0.963625 2023-10-04 07:31:37,573 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_3171_210_2087_1_1526109084_2330007_66-260583-0_sp1.1 from training. Duration: 0.6545625 2023-10-04 07:31:39,054 WARNING [train.py:1204] (2/4) Exclude cut with ID _38670_210_15379_1_1533448810659_4391059_63-129006-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 07:31:44,080 WARNING [train.py:1204] (2/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_393-57888-0_sp0.9 from training. Duration: 0.711125 2023-10-04 07:31:44,124 WARNING [train.py:1204] (2/4) Exclude cut with ID _6275_210_2084_1_1528110062213_7669049_857-298942-0_sp0.9 from training. Duration: 0.9444375 2023-10-04 07:31:44,345 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.bypass.scale_min, batch_count=1574693.3333333333, ans=0.2 2023-10-04 07:31:45,613 WARNING [train.py:1204] (2/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_239-22076-0_sp0.9 from training. Duration: 0.9444375 2023-10-04 07:31:46,976 WARNING [train.py:1204] (2/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_424-227947-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 07:31:48,418 WARNING [train.py:1204] (2/4) Exclude cut with ID _14069_210_5632_1_1530512692033_4803390_95-1896-0 from training. Duration: 0.98 2023-10-04 07:31:49,815 WARNING [train.py:1204] (2/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_96-244299-0_sp0.9 from training. Duration: 0.97775 2023-10-04 07:31:55,565 WARNING [train.py:1204] (2/4) Exclude cut with ID _46683_210_9642_1_1533730913492_4591116_562-319825-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 07:31:57,528 WARNING [train.py:1204] (2/4) Exclude cut with ID _64639_210_5816_1_1535367619437_3528000_284-173474-0_sp1.1 from training. Duration: 0.963625 2023-10-04 07:31:57,556 WARNING [train.py:1204] (2/4) Exclude cut with ID _29529_210_7234_1_1532393961455_3977460_497-100133-0_sp1.1 from training. Duration: 0.936375 2023-10-04 07:31:58,857 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_12868_210_6753_1_1530701932_530275_18-76322-0_sp0.9 from training. Duration: 0.9666875 2023-10-04 07:31:58,888 WARNING [train.py:1204] (2/4) Exclude cut with ID _50093_210_4220_1_1534417141207_3432299_116-303447-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 07:32:00,741 WARNING [train.py:1204] (2/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_44-79427-0_sp1.1 from training. Duration: 0.8909375 2023-10-04 07:32:02,149 WARNING [train.py:1204] (2/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_887-249555-0 from training. Duration: 0.77 2023-10-04 07:32:04,878 WARNING [train.py:1204] (2/4) Exclude cut with ID _28461_210_2253_1_1532771775189_3041299_96-152158-0_sp0.9 from training. Duration: 0.9444375 2023-10-04 07:32:04,933 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_14058_210_4163_1_1530690953_6327180_49-210687-0_sp1.1 from training. Duration: 0.8545625 2023-10-04 07:32:07,658 WARNING [train.py:1204] (2/4) Exclude cut with ID _40399_210_4353_1_1533305658194_3279406_351-127903-0_sp1.1 from training. Duration: 0.97275 2023-10-04 07:32:07,663 WARNING [train.py:1204] (2/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_184-206939-0_sp1.1 from training. Duration: 0.936375 2023-10-04 07:32:12,966 INFO [scaling.py:1022] (2/4) Whitening: name=encoder_embed.out_whiten, num_groups=1, num_channels=192, metric=6.86 vs. limit=8.0 2023-10-04 07:32:14,450 WARNING [train.py:1204] (2/4) Exclude cut with ID _14100_210_8341_1_1530529320613_3678069_258-224599-0_sp1.1 from training. Duration: 0.7 2023-10-04 07:32:14,465 WARNING [train.py:1204] (2/4) Exclude cut with ID _6493_210_1747_1_1528459210424_4012470_324-323854-0 from training. Duration: 0.92 2023-10-04 07:32:15,645 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_12868_210_6753_1_1530702479_3610564_151-325626-0_sp1.1 from training. Duration: 0.8818125 2023-10-04 07:32:15,725 WARNING [train.py:1204] (2/4) Exclude cut with ID _14528_210_8254_1_1530669700514_4306939_106-43145-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 07:32:15,739 WARNING [train.py:1204] (2/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_686-54383-0 from training. Duration: 0.87 2023-10-04 07:32:17,136 WARNING [train.py:1204] (2/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_801-102362-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 07:32:18,439 WARNING [train.py:1204] (2/4) Exclude cut with ID _48677_210_5663_1_1533902405919_3892989_408-40740-0_sp0.9 from training. Duration: 0.92225 2023-10-04 07:32:21,324 WARNING [train.py:1204] (2/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_448-212135-0_sp1.1 from training. Duration: 0.82725 2023-10-04 07:32:22,523 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.621e+02 2.042e+02 2.331e+02 2.732e+02 3.935e+02, threshold=4.662e+02, percent-clipped=0.0 2023-10-04 07:32:23,992 WARNING [train.py:1204] (2/4) Exclude cut with ID _59170_210_6721_1_1534983633960_4203335_320-124047-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 07:32:24,035 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_4692_210_3528_1_1526899755_4323254_107-44401-0_sp1.1 from training. Duration: 0.7818125 2023-10-04 07:32:26,603 INFO [train.py:1046] (2/4) Epoch 45, batch 2500, loss[loss=0.1563, simple_loss=0.2251, pruned_loss=0.04376, over 23759.00 frames. ], tot_loss[loss=0.1541, simple_loss=0.2336, pruned_loss=0.03726, over 4717727.89 frames. ], batch size: 179, lr: 2.26e-03, grad_scale: 16.0 2023-10-04 07:32:28,518 WARNING [train.py:1204] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_731-2428-0 from training. Duration: 0.83 2023-10-04 07:32:28,610 WARNING [train.py:1204] (2/4) Exclude cut with ID _29857_210_20034_1_1532395886753_3798160_335-163562-0_sp1.1 from training. Duration: 0.77275 2023-10-04 07:32:35,973 WARNING [train.py:1204] (2/4) Exclude cut with ID _14832_210_8750_1_1530691351539_3171348_171-275004-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 07:32:44,462 WARNING [train.py:1204] (2/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_105-328174-0_sp0.9 from training. Duration: 0.8444375 2023-10-04 07:32:44,495 WARNING [train.py:1204] (2/4) Exclude cut with ID _68965_210_1883_1_1535698962272_3497490_164-118421-0_sp1.1 from training. Duration: 0.936375 2023-10-04 07:32:46,635 WARNING [train.py:1204] (2/4) Exclude cut with ID _19896_210_1883_1_1531709022662_6649569_380-24170-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 07:32:46,640 WARNING [train.py:1204] (2/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_505-205270-0_sp1.1 from training. Duration: 0.3454375 2023-10-04 07:32:49,735 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.conv_module2.balancer2.prob, batch_count=1574960.0, ans=0.125 2023-10-04 07:32:52,270 WARNING [train.py:1204] (2/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_427-208901-0_sp1.1 from training. Duration: 0.6181875 2023-10-04 07:32:52,309 WARNING [train.py:1204] (2/4) Exclude cut with ID _23818_210_6228_1_1531917063884_3676920_85-326916-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 07:32:53,645 WARNING [train.py:1204] (2/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_101-293367-0_sp1.1 from training. Duration: 0.57275 2023-10-04 07:32:53,662 WARNING [train.py:1204] (2/4) Exclude cut with ID _5409_210_1388_1_1527292850189_5865191_178-127970-0_sp1.1 from training. Duration: 0.5181875 2023-10-04 07:32:54,860 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_403-39619-0 from training. Duration: 0.84 2023-10-04 07:32:54,963 WARNING [train.py:1204] (2/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_427-87841-0_sp1.1 from training. Duration: 0.963625 2023-10-04 07:32:56,279 WARNING [train.py:1204] (2/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_286-68181-0_sp1.1 from training. Duration: 0.97275 2023-10-04 07:32:56,320 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6673_210_3273_1_1527767790_3173240_327-306185-0 from training. Duration: 0.83 2023-10-04 07:32:56,342 WARNING [train.py:1204] (2/4) Exclude cut with ID _3675_210_4557_1_1526639934408_4182089_106-203713-0_sp1.1 from training. Duration: 0.963625 2023-10-04 07:32:56,568 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.balancer1.prob, batch_count=1575026.6666666667, ans=0.125 2023-10-04 07:32:58,126 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_53071_210_16554_1_1534493434_1276796_130-36076-0 from training. Duration: 0.95 2023-10-04 07:32:58,155 WARNING [train.py:1204] (2/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_586-93054-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 07:32:59,825 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.conv_skip_rate, batch_count=1575026.6666666667, ans=0.0 2023-10-04 07:33:00,967 WARNING [train.py:1204] (2/4) Exclude cut with ID _23825_210_13449_1_1531915313526_3497150_262-10571-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 07:33:02,869 WARNING [train.py:1204] (2/4) Exclude cut with ID _28783_210_7943_1_1532671164808_7155479_432-67885-0_sp1.1 from training. Duration: 0.97275 2023-10-04 07:33:03,736 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.0.layers.0.self_attn2.whiten, num_groups=1, num_channels=192, metric=12.05 vs. limit=22.5 2023-10-04 07:33:05,784 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6287_210_3273_1_1527509506_2653716_182-199887-0_sp0.9 from training. Duration: 0.5666875 2023-10-04 07:33:05,840 WARNING [train.py:1204] (2/4) Exclude cut with ID _3231_210_2106_1_1526205864934_6784220_171-256004-0 from training. Duration: 0.86 2023-10-04 07:33:07,176 WARNING [train.py:1204] (2/4) Exclude cut with ID _69051_210_1531_1_1535885566901_3829538_332-92090-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 07:33:08,581 WARNING [train.py:1204] (2/4) Exclude cut with ID _3675_210_4557_1_1526639934408_4182089_104-324125-0_sp1.1 from training. Duration: 0.963625 2023-10-04 07:33:11,486 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_41764_210_2521_1_1533686110_4089313_501-2119-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 07:33:16,386 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_14056_210_4163_1_1530604483_7572827_56-100988-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 07:33:20,359 WARNING [train.py:1204] (2/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_398-239819-0_sp1.1 from training. Duration: 0.9 2023-10-04 07:33:24,551 WARNING [train.py:1204] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_74-48717-0_sp1.1 from training. Duration: 0.52725 2023-10-04 07:33:26,079 WARNING [train.py:1204] (2/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_177-49092-0 from training. Duration: 0.91 2023-10-04 07:33:26,097 WARNING [train.py:1204] (2/4) Exclude cut with ID _62337_210_12392_1_1535009273105_3412229_44-176353-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 07:33:26,112 WARNING [train.py:1204] (2/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_110-130961-0_sp1.1 from training. Duration: 0.42725 2023-10-04 07:33:28,045 WARNING [train.py:1204] (2/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_37-189262-0_sp1.1 from training. Duration: 0.8909375 2023-10-04 07:33:28,046 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_3172_210_2087_1_1526292929_3987865_197-97762-0_sp0.9 from training. Duration: 0.8333125 2023-10-04 07:33:29,478 WARNING [train.py:1204] (2/4) Exclude cut with ID IC0677W0065-334897 from training. Duration: 0.896 2023-10-04 07:33:29,479 WARNING [train.py:1204] (2/4) Exclude cut with ID IC0677W0080-334912 from training. Duration: 0.768 2023-10-04 07:33:29,484 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_120-41668-0 from training. Duration: 0.7 2023-10-04 07:33:29,680 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.feed_forward2.hidden_balancer.prob, batch_count=1575160.0, ans=0.125 2023-10-04 07:33:32,372 WARNING [train.py:1204] (2/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_460-205287-0_sp1.1 from training. Duration: 0.963625 2023-10-04 07:33:33,905 WARNING [train.py:1204] (2/4) Exclude cut with ID _14632_210_9988_1_1530691279392_3923200_102-204477-0 from training. Duration: 0.97 2023-10-04 07:33:33,910 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_34815_210_6388_1_1533381320_3571903_279-305565-0 from training. Duration: 0.92 2023-10-04 07:33:34,188 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.feed_forward2.hidden_balancer.prob, batch_count=1575160.0, ans=0.125 2023-10-04 07:33:35,785 WARNING [train.py:1204] (2/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_91-148831-0_sp1.1 from training. Duration: 0.9 2023-10-04 07:33:35,860 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6291_210_3273_1_1527683649_1078498_82-221196-0 from training. Duration: 0.76 2023-10-04 07:33:38,680 WARNING [train.py:1204] (2/4) Exclude cut with ID _38020_210_5986_1_1533112252259_7006089_493-102745-0 from training. Duration: 0.98 2023-10-04 07:33:40,049 WARNING [train.py:1204] (2/4) Exclude cut with ID _61133_210_9969_1_1535196777227_3696409_40-93442-0_sp1.1 from training. Duration: 0.92725 2023-10-04 07:33:41,299 INFO [train.py:1046] (2/4) Epoch 45, batch 2550, loss[loss=0.1621, simple_loss=0.2451, pruned_loss=0.03957, over 23253.00 frames. ], tot_loss[loss=0.1541, simple_loss=0.2337, pruned_loss=0.03731, over 4704356.06 frames. ], batch size: 93, lr: 2.26e-03, grad_scale: 16.0 2023-10-04 07:33:43,320 WARNING [train.py:1204] (2/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_45-258852-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 07:33:44,600 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_53071_210_16554_1_1534493434_1276796_130-36076-0_sp1.1 from training. Duration: 0.863625 2023-10-04 07:33:47,398 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8694_210_6400_1_1529065754_3260756_332-13553-0_sp1.1 from training. Duration: 0.92725 2023-10-04 07:33:48,871 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_405-339986-0 from training. Duration: 0.51 2023-10-04 07:33:48,923 WARNING [train.py:1204] (2/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_927-43827-0_sp1.1 from training. Duration: 0.67275 2023-10-04 07:33:51,851 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.bypass.scale_min, batch_count=1575226.6666666667, ans=0.2 2023-10-04 07:33:53,024 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_397-147196-0 from training. Duration: 0.8 2023-10-04 07:33:53,141 WARNING [train.py:1204] (2/4) Exclude cut with ID 3557-8342-0013-54691-0_sp1.1 from training. Duration: 0.836375 2023-10-04 07:33:55,996 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_47438_210_5399_1_1533797471_4513356_626-127722-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 07:33:58,066 WARNING [train.py:1204] (2/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_256-323883-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 07:33:58,070 WARNING [train.py:1204] (2/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_839-173017-0_sp0.9 from training. Duration: 0.4555625 2023-10-04 07:33:59,382 WARNING [train.py:1204] (2/4) Exclude cut with ID _5289_210_2084_1_1527678202736_3779299_392-261988-0_sp1.1 from training. Duration: 0.5909375 2023-10-04 07:33:59,415 WARNING [train.py:1204] (2/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_119-224384-0_sp1.1 from training. Duration: 0.936375 2023-10-04 07:33:59,466 WARNING [train.py:1204] (2/4) Exclude cut with ID _57523_210_1856_1_1534766294545_3732989_262-267933-0_sp1.1 from training. Duration: 0.963625 2023-10-04 07:34:00,904 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_35361_210_4238_1_1533262810_7793476_675-87012-0_sp0.9 from training. Duration: 0.988875 2023-10-04 07:34:02,228 WARNING [train.py:1204] (2/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_17-90541-0 from training. Duration: 0.94 2023-10-04 07:34:02,277 WARNING [train.py:1204] (2/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_535-14410-0_sp1.1 from training. Duration: 0.42725 2023-10-04 07:34:02,283 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_24673_210_6400_1_1531968633_778659_94-154511-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 07:34:02,287 WARNING [train.py:1204] (2/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_212-10501-0 from training. Duration: 0.94 2023-10-04 07:34:08,527 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.conv_skip_rate, batch_count=1575293.3333333333, ans=0.0 2023-10-04 07:34:08,530 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.self_attn_weights.pos_emb_skip_rate, batch_count=1575293.3333333333, ans=0.0 2023-10-04 07:34:13,073 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.conv_module2.balancer1.prob, batch_count=1575360.0, ans=0.125 2023-10-04 07:34:16,195 WARNING [train.py:1204] (2/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_383-285196-0_sp1.1 from training. Duration: 0.8818125 2023-10-04 07:34:17,781 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.feed_forward3.hidden_balancer.prob, batch_count=1575360.0, ans=0.125 2023-10-04 07:34:21,764 WARNING [train.py:1204] (2/4) Exclude cut with ID _45560_210_21982_1_1533812603951_3584999_238-47020-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 07:34:21,773 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_21500_210_2054_1_1532149310_2716758_278-18053-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 07:34:21,780 WARNING [train.py:1204] (2/4) Exclude cut with ID _56235_210_10640_1_1534557335235_4161099_453-9626-0_sp1.1 from training. Duration: 0.936375 2023-10-04 07:34:23,114 WARNING [train.py:1204] (2/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_153-97441-0_sp1.1 from training. Duration: 0.5090625 2023-10-04 07:34:26,430 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.conv_module2.balancer2.prob, batch_count=1575426.6666666667, ans=0.125 2023-10-04 07:34:29,536 WARNING [train.py:1204] (2/4) Exclude cut with ID _67725_210_27767_1_1535608756671_5105359_431-120895-0_sp1.1 from training. Duration: 0.963625 2023-10-04 07:34:30,985 WARNING [train.py:1204] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_574-45541-0_sp1.1 from training. Duration: 0.5909375 2023-10-04 07:34:31,001 WARNING [train.py:1204] (2/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_162-103620-0_sp1.1 from training. Duration: 0.8454375 2023-10-04 07:34:31,012 WARNING [train.py:1204] (2/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_734-129761-0_sp1.1 from training. Duration: 0.7454375 2023-10-04 07:34:32,243 WARNING [train.py:1204] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_620-276793-0_sp1.1 from training. Duration: 0.536375 2023-10-04 07:34:32,284 WARNING [train.py:1204] (2/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_153-97441-0_sp0.9 from training. Duration: 0.62225 2023-10-04 07:34:35,026 WARNING [train.py:1204] (2/4) Exclude cut with ID _12733_210_8341_1_1530167424556_3646380_523-85621-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 07:34:35,038 WARNING [train.py:1204] (2/4) Exclude cut with ID _64048_210_10626_1_1535198382802_3835179_352-179834-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 07:34:39,516 WARNING [train.py:1204] (2/4) Exclude cut with ID _55162_210_17071_1_1534408248583_3753480_43-242364-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 07:34:40,825 WARNING [train.py:1204] (2/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_661-264857-0 from training. Duration: 0.93 2023-10-04 07:34:40,832 WARNING [train.py:1204] (2/4) Exclude cut with ID _6274_210_2084_1_1527937583837_7494020_325-237800-0_sp1.1 from training. Duration: 0.97275 2023-10-04 07:34:40,879 WARNING [train.py:1204] (2/4) Exclude cut with ID _55328_210_27767_1_1534580973260_4088170_385-171459-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 07:34:41,065 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.feed_forward3.hidden_balancer.prob, batch_count=1575493.3333333333, ans=0.125 2023-10-04 07:34:42,240 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_3780_210_2087_1_1526467389_2145572_176-107140-0_sp1.1 from training. Duration: 0.62725 2023-10-04 07:34:42,336 WARNING [train.py:1204] (2/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_375-244976-0_sp1.1 from training. Duration: 0.6818125 2023-10-04 07:34:44,367 WARNING [train.py:1204] (2/4) Exclude cut with ID _35941_210_15379_1_1532995294515_4023089_309-33162-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 07:34:45,837 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.conv_module2.balancer1.prob, batch_count=1575493.3333333333, ans=0.125 2023-10-04 07:34:48,538 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.attention_skip_rate, batch_count=1575493.3333333333, ans=0.0 2023-10-04 07:34:50,977 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.690e+02 2.015e+02 2.178e+02 2.479e+02 3.759e+02, threshold=4.356e+02, percent-clipped=0.0 2023-10-04 07:34:51,040 WARNING [train.py:1204] (2/4) Exclude cut with ID _14459_210_2228_1_1531443641946_7516700_1185-22038-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 07:34:52,439 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_1197-69460-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 07:34:55,037 INFO [train.py:1046] (2/4) Epoch 45, batch 2600, loss[loss=0.154, simple_loss=0.2412, pruned_loss=0.03339, over 24339.00 frames. ], tot_loss[loss=0.1546, simple_loss=0.2345, pruned_loss=0.03735, over 4700401.51 frames. ], batch size: 74, lr: 2.26e-03, grad_scale: 16.0 2023-10-04 07:34:55,187 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9012W0022-501157 from training. Duration: 0.896 2023-10-04 07:34:58,684 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9023W0009-506302 from training. Duration: 0.896 2023-10-04 07:34:58,706 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_2821_210_2715_1_1526108385_3763560_73-296673-0_sp1.1 from training. Duration: 0.7545625 2023-10-04 07:35:00,053 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9025W0113-507442 from training. Duration: 0.896 2023-10-04 07:35:00,142 WARNING [train.py:1204] (2/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_739-2098-0 from training. Duration: 0.92 2023-10-04 07:35:00,154 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9029W0332-509722 from training. Duration: 0.768 2023-10-04 07:35:04,295 WARNING [train.py:1204] (2/4) Exclude cut with ID _16908_210_5724_1_1531119157248_3772700_68-101953-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 07:35:04,326 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9042W0011-515617 from training. Duration: 0.768 2023-10-04 07:35:05,674 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_3019_210_2028_1_1526044245_3264865_191-72235-0 from training. Duration: 0.97 2023-10-04 07:35:05,771 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9052W0006-520792 from training. Duration: 0.896 2023-10-04 07:35:08,478 WARNING [train.py:1204] (2/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_245-174553-0_sp1.1 from training. Duration: 0.8 2023-10-04 07:35:09,280 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.2.encoder.layers.0.self_attn1.whiten, num_groups=1, num_channels=384, metric=18.61 vs. limit=22.5 2023-10-04 07:35:11,796 WARNING [train.py:1204] (2/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_202-67017-0 from training. Duration: 0.82 2023-10-04 07:35:11,898 WARNING [train.py:1204] (2/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_304-59227-0 from training. Duration: 0.77 2023-10-04 07:35:13,319 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_3133_210_2087_1_1526106446_2324690_246-125000-0_sp0.9 from training. Duration: 0.688875 2023-10-04 07:35:14,590 WARNING [train.py:1204] (2/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_243-42116-0 from training. Duration: 0.73 2023-10-04 07:35:16,570 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9087W0121-538492 from training. Duration: 0.896 2023-10-04 07:35:16,585 WARNING [train.py:1204] (2/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_120-76843-0 from training. Duration: 0.55 2023-10-04 07:35:17,431 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.bypass.scale_min, batch_count=1575626.6666666667, ans=0.2 2023-10-04 07:35:17,659 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.5.encoder.layers.1.feed_forward2.out_whiten, num_groups=1, num_channels=256, metric=7.57 vs. limit=15.0 2023-10-04 07:35:24,369 WARNING [train.py:1204] (2/4) Exclude cut with ID _7852_210_5219_1_1528286471316_8139400_850-304621-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 07:35:24,389 WARNING [train.py:1204] (2/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_154-285415-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 07:35:25,671 WARNING [train.py:1204] (2/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_538-278194-0_sp1.1 from training. Duration: 0.92725 2023-10-04 07:35:25,672 WARNING [train.py:1204] (2/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_119-207162-0 from training. Duration: 0.71 2023-10-04 07:35:27,647 WARNING [train.py:1204] (2/4) Exclude cut with ID _2334_210_2667_1_1525608238880_4705129_459-104997-0_sp1.1 from training. Duration: 0.863625 2023-10-04 07:35:31,961 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9147W0019-567847 from training. Duration: 0.768 2023-10-04 07:35:37,469 WARNING [train.py:1204] (2/4) Exclude cut with ID _69884_210_9771_1_1535846392818_4166169_228-189439-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 07:35:38,863 WARNING [train.py:1204] (2/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_196-77172-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 07:35:38,929 WARNING [train.py:1204] (2/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_625-338546-0 from training. Duration: 0.88 2023-10-04 07:35:38,984 WARNING [train.py:1204] (2/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_193-327630-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 07:35:38,986 WARNING [train.py:1204] (2/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_391-24006-0_sp1.1 from training. Duration: 0.92725 2023-10-04 07:35:40,441 WARNING [train.py:1204] (2/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_197-28164-0 from training. Duration: 0.8 2023-10-04 07:35:45,153 WARNING [train.py:1204] (2/4) Exclude cut with ID _8395_210_1531_1_1528597871523_4411579_431-132470-0_sp1.1 from training. Duration: 0.67275 2023-10-04 07:35:45,172 WARNING [train.py:1204] (2/4) Exclude cut with ID _35350_210_16040_1_1533040191183_4075299_465-248326-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 07:35:45,484 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.feed_forward1.out_proj.dropout_p, batch_count=1575760.0, ans=0.1 2023-10-04 07:35:46,558 WARNING [train.py:1204] (2/4) Exclude cut with ID _16756_210_4915_1_1531047803763_7104259_101-141922-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 07:35:50,516 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9212W0005-600172 from training. Duration: 0.896 2023-10-04 07:35:50,549 WARNING [train.py:1204] (2/4) Exclude cut with ID _63186_210_2084_1_1535414479705_3908649_378-166820-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 07:35:51,770 WARNING [train.py:1204] (2/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_52-211683-0_sp1.1 from training. Duration: 0.8181875 2023-10-04 07:35:52,005 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.nonlin_attention.balancer.max_positive, batch_count=1575760.0, ans=0.95 2023-10-04 07:35:58,999 WARNING [train.py:1204] (2/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_1147-237729-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 07:35:59,074 WARNING [train.py:1204] (2/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_234-14174-0_sp0.9 from training. Duration: 0.911125 2023-10-04 07:35:59,094 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_454-276429-0 from training. Duration: 0.87 2023-10-04 07:36:00,427 WARNING [train.py:1204] (2/4) Exclude cut with ID _53713_210_24116_1_1534314487762_7416751_755-165313-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 07:36:00,726 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.conv_module2.balancer1.prob, batch_count=1575826.6666666667, ans=0.125 2023-10-04 07:36:01,921 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8278_210_2550_1_1528637099_4220413_436-317072-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 07:36:03,254 WARNING [train.py:1204] (2/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_28-324729-0_sp1.1 from training. Duration: 0.8545625 2023-10-04 07:36:08,708 WARNING [train.py:1204] (2/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_326-165900-0 from training. Duration: 0.69 2023-10-04 07:36:10,049 INFO [train.py:1046] (2/4) Epoch 45, batch 2650, loss[loss=0.1657, simple_loss=0.247, pruned_loss=0.04217, over 23339.00 frames. ], tot_loss[loss=0.1551, simple_loss=0.2353, pruned_loss=0.03742, over 4706312.84 frames. ], batch size: 93, lr: 2.26e-03, grad_scale: 16.0 2023-10-04 07:36:10,098 WARNING [train.py:1204] (2/4) Exclude cut with ID _53489_210_6228_1_1534298357169_3835970_96-30246-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 07:36:11,539 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8275_210_4198_1_1528455320_3892571_491-17707-0_sp1.1 from training. Duration: 0.5818125 2023-10-04 07:36:16,949 WARNING [train.py:1204] (2/4) Exclude cut with ID _14348_210_8341_1_1530599434996_3778640_240-232370-0 from training. Duration: 0.84 2023-10-04 07:36:16,962 WARNING [train.py:1204] (2/4) Exclude cut with ID _45012_210_12651_1_1533608435136_5371380_54-112623-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 07:36:18,326 WARNING [train.py:1204] (2/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_328-264776-0_sp1.1 from training. Duration: 0.6090625 2023-10-04 07:36:20,231 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9325W0001-648427 from training. Duration: 0.768 2023-10-04 07:36:20,245 WARNING [train.py:1204] (2/4) Exclude cut with ID _56480_210_4220_1_1534679942079_3075409_49-74717-0_sp1.1 from training. Duration: 0.936375 2023-10-04 07:36:23,553 WARNING [train.py:1204] (2/4) Exclude cut with ID _59500_210_8254_1_1534809610680_3736009_6-241869-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 07:36:25,644 WARNING [train.py:1204] (2/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_172-215932-0_sp0.9 from training. Duration: 0.7333125 2023-10-04 07:36:27,085 WARNING [train.py:1204] (2/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_17-90541-0_sp1.1 from training. Duration: 0.8545625 2023-10-04 07:36:27,331 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.conv_skip_rate, batch_count=1575960.0, ans=0.0 2023-10-04 07:36:29,795 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_43831_210_2158_1_1533725502_5872791_525-89447-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 07:36:31,175 WARNING [train.py:1204] (2/4) Exclude cut with ID _12302_210_5219_1_1529924446466_8107280_907-182949-0 from training. Duration: 0.92 2023-10-04 07:36:31,188 WARNING [train.py:1204] (2/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_402-102311-0_sp0.9 from training. Duration: 0.7444375 2023-10-04 07:36:31,226 WARNING [train.py:1204] (2/4) Exclude cut with ID _8021_210_7994_1_1528376451358_3653069_179-45206-0_sp0.9 from training. Duration: 0.9555625 2023-10-04 07:36:33,300 WARNING [train.py:1204] (2/4) Exclude cut with ID _40137_210_12210_1_1533290484046_3795180_249-187059-0 from training. Duration: 0.88 2023-10-04 07:36:34,767 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder_embed.conv.8.prob, batch_count=1575960.0, ans=0.125 2023-10-04 07:36:36,015 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9653W0194-675472 from training. Duration: 0.9658125 2023-10-04 07:36:37,470 WARNING [train.py:1204] (2/4) Exclude cut with ID _51332_210_9988_1_1534244371836_3801270_336-255604-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 07:36:40,298 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_20120_210_6753_1_1531643035_5482859_510-96996-0 from training. Duration: 0.87 2023-10-04 07:36:40,311 WARNING [train.py:1204] (2/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_1167-20437-0_sp1.1 from training. Duration: 0.92725 2023-10-04 07:36:40,369 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_3475_210_2028_1_1526193578_3622569_265-276625-0 from training. Duration: 0.76 2023-10-04 07:36:45,715 WARNING [train.py:1204] (2/4) Exclude cut with ID _14860_210_4915_1_1530702020340_7343240_470-172418-0_sp1.1 from training. Duration: 0.97275 2023-10-04 07:36:45,723 WARNING [train.py:1204] (2/4) Exclude cut with ID _5444_210_2151_1_1527418808464_5312210_581-315411-0_sp1.1 from training. Duration: 0.563625 2023-10-04 07:36:45,744 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_34450_210_9547_1_1533099443_4109050_519-342439-0_sp1.1 from training. Duration: 0.97275 2023-10-04 07:36:45,768 WARNING [train.py:1204] (2/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_50-175961-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 07:36:49,970 WARNING [train.py:1204] (2/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_372-6869-0 from training. Duration: 0.44 2023-10-04 07:36:49,976 WARNING [train.py:1204] (2/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_3-237434-0 from training. Duration: 0.76 2023-10-04 07:36:53,126 WARNING [train.py:1204] (2/4) Exclude cut with ID _8772_210_1850_1_1528635595972_3664011_247-336448-0_sp1.1 from training. Duration: 0.67275 2023-10-04 07:36:57,104 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_427-78097-0 from training. Duration: 0.8 2023-10-04 07:36:57,120 WARNING [train.py:1204] (2/4) Exclude cut with ID _69884_210_9771_1_1535846392818_4166169_466-252727-0_sp1.1 from training. Duration: 0.97275 2023-10-04 07:36:58,458 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_15972_210_6400_1_1530877934_3405692_316-35478-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 07:36:58,493 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_809-17156-0_sp0.9 from training. Duration: 0.811125 2023-10-04 07:36:58,671 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.conv_module2.balancer2.prob, batch_count=1576093.3333333333, ans=0.125 2023-10-04 07:36:59,843 WARNING [train.py:1204] (2/4) Exclude cut with ID _57572_210_5517_1_1534921219624_3809484_594-263474-0_sp1.1 from training. Duration: 0.936375 2023-10-04 07:36:59,885 WARNING [train.py:1204] (2/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_190-228204-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 07:37:01,305 WARNING [train.py:1204] (2/4) Exclude cut with ID _19514_210_9259_1_1531530071330_5084230_117-21193-0_sp1.1 from training. Duration: 0.936375 2023-10-04 07:37:01,896 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.4.encoder.layers.0.conv_module1.whiten, num_groups=1, num_channels=384, metric=5.57 vs. limit=15.0 2023-10-04 07:37:03,945 WARNING [train.py:1204] (2/4) Exclude cut with ID _69584_210_5219_1_1535785133775_4044450_190-21315-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 07:37:04,025 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_53071_210_16554_1_1534493434_1276796_184-302050-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 07:37:05,319 WARNING [train.py:1204] (2/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_120-76843-0_sp1.1 from training. Duration: 0.5 2023-10-04 07:37:05,413 WARNING [train.py:1204] (2/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_318-162017-0_sp1.1 from training. Duration: 0.82725 2023-10-04 07:37:07,423 WARNING [train.py:1204] (2/4) Exclude cut with ID _46067_210_4220_1_1534125571066_3307450_79-46864-0_sp1.1 from training. Duration: 0.92725 2023-10-04 07:37:07,478 WARNING [train.py:1204] (2/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_358-215558-0_sp1.1 from training. Duration: 0.8454375 2023-10-04 07:37:07,558 WARNING [train.py:1204] (2/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_621-66764-0_sp1.1 from training. Duration: 0.92725 2023-10-04 07:37:07,822 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.feed_forward3.hidden_balancer.prob, batch_count=1576093.3333333333, ans=0.125 2023-10-04 07:37:08,876 WARNING [train.py:1204] (2/4) Exclude cut with ID _42863_210_9070_1_1533902398788_3696920_338-156851-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 07:37:09,024 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.1.feed_forward1.out_proj.dropout_p, batch_count=1576160.0, ans=0.1 2023-10-04 07:37:10,179 WARNING [train.py:1204] (2/4) Exclude cut with ID _5289_210_2084_1_1527678202736_3779299_392-261988-0_sp0.9 from training. Duration: 0.72225 2023-10-04 07:37:14,207 WARNING [train.py:1204] (2/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_409-84682-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 07:37:14,300 WARNING [train.py:1204] (2/4) Exclude cut with ID _9235_210_5219_1_1528891154253_8046120_30-326681-0_sp0.9 from training. Duration: 0.92225 2023-10-04 07:37:14,404 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.0.bypass_mid.scale_min, batch_count=1576160.0, ans=0.2 2023-10-04 07:37:15,539 WARNING [train.py:1204] (2/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_389-184695-0_sp1.1 from training. Duration: 0.92725 2023-10-04 07:37:15,576 WARNING [train.py:1204] (2/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_442-320317-0 from training. Duration: 0.82 2023-10-04 07:37:15,767 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.feed_forward1.out_proj.dropout_p, batch_count=1576160.0, ans=0.1 2023-10-04 07:37:19,695 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.621e+02 2.024e+02 2.233e+02 2.497e+02 3.556e+02, threshold=4.465e+02, percent-clipped=0.0 2023-10-04 07:37:19,761 WARNING [train.py:1204] (2/4) Exclude cut with ID _44778_210_2054_1_1533798127203_7443090_756-29911-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 07:37:21,181 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_401-112036-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 07:37:22,571 WARNING [train.py:1204] (2/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_181-308827-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 07:37:24,507 INFO [train.py:1046] (2/4) Epoch 45, batch 2700, loss[loss=0.1651, simple_loss=0.2518, pruned_loss=0.03922, over 24573.00 frames. ], tot_loss[loss=0.1552, simple_loss=0.2356, pruned_loss=0.03738, over 4696407.55 frames. ], batch size: 71, lr: 2.26e-03, grad_scale: 16.0 2023-10-04 07:37:24,563 WARNING [train.py:1204] (2/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_332-258725-0_sp1.1 from training. Duration: 0.963625 2023-10-04 07:37:24,637 WARNING [train.py:1204] (2/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_188-260011-0_sp0.9 from training. Duration: 0.811125 2023-10-04 07:37:27,134 WARNING [train.py:1204] (2/4) Exclude cut with ID _49039_210_6315_1_1533967537541_3846118_99-302239-0_sp1.1 from training. Duration: 0.963625 2023-10-04 07:37:29,791 WARNING [train.py:1204] (2/4) Exclude cut with ID _4122_210_2084_1_1526713164417_4353280_312-294828-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 07:37:29,804 WARNING [train.py:1204] (2/4) Exclude cut with ID _14922_210_6228_1_1530707435273_3693310_303-228738-0 from training. Duration: 0.84 2023-10-04 07:37:31,367 WARNING [train.py:1204] (2/4) Exclude cut with ID _14349_210_8341_1_1530615710818_3442470_115-285975-0_sp0.9 from training. Duration: 0.9555625 2023-10-04 07:37:34,127 WARNING [train.py:1204] (2/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_125-144174-0_sp1.1 from training. Duration: 0.3545625 2023-10-04 07:37:35,537 WARNING [train.py:1204] (2/4) Exclude cut with ID _53565_210_10635_1_1534337218335_3820259_351-54570-0_sp1.1 from training. Duration: 0.97275 2023-10-04 07:37:35,571 WARNING [train.py:1204] (2/4) Exclude cut with ID _28317_210_12742_1_1532480302381_8510325_410-40065-0_sp1.1 from training. Duration: 0.963625 2023-10-04 07:37:35,600 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_38965_210_4852_1_1533174906_3484735_167-339442-0_sp1.1 from training. Duration: 0.963625 2023-10-04 07:37:37,836 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.ff2_skip_rate, batch_count=1576226.6666666667, ans=0.0 2023-10-04 07:37:38,844 WARNING [train.py:1204] (2/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_265-164486-0_sp1.1 from training. Duration: 0.87275 2023-10-04 07:37:38,853 WARNING [train.py:1204] (2/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_725-182484-0_sp1.1 from training. Duration: 0.92725 2023-10-04 07:37:38,876 WARNING [train.py:1204] (2/4) Exclude cut with ID _28317_210_12742_1_1532480302381_8510325_607-213951-0_sp0.9 from training. Duration: 0.9333125 2023-10-04 07:37:38,897 WARNING [train.py:1204] (2/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_605-133395-0_sp1.1 from training. Duration: 0.6 2023-10-04 07:37:38,912 WARNING [train.py:1204] (2/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_351-294650-0 from training. Duration: 0.63 2023-10-04 07:37:40,372 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_14953_210_2151_1_1530701710_5616165_710-165400-0_sp1.1 from training. Duration: 0.7909375 2023-10-04 07:37:41,662 WARNING [train.py:1204] (2/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_237-58433-0_sp1.1 from training. Duration: 0.67275 2023-10-04 07:37:43,029 WARNING [train.py:1204] (2/4) Exclude cut with ID _5190_210_4557_1_1527244368799_373439_28-197030-0_sp1.1 from training. Duration: 0.6545625 2023-10-04 07:37:43,076 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_743-327651-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 07:37:45,979 WARNING [train.py:1204] (2/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_76-203867-0_sp0.9 from training. Duration: 0.8 2023-10-04 07:37:47,374 WARNING [train.py:1204] (2/4) Exclude cut with ID _14603_210_4220_1_1530615688441_3097319_274-274798-0 from training. Duration: 0.98 2023-10-04 07:37:47,420 WARNING [train.py:1204] (2/4) Exclude cut with ID _14087_210_2253_1_1530529224512_3014029_226-242140-0_sp1.1 from training. Duration: 0.736375 2023-10-04 07:37:50,221 WARNING [train.py:1204] (2/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_352-59958-0_sp1.1 from training. Duration: 0.7818125 2023-10-04 07:37:50,226 WARNING [train.py:1204] (2/4) Exclude cut with ID _39411_210_5986_1_1533538825030_3343649_191-251819-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 07:37:58,731 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7046_210_6753_1_1527895337_6920917_642-106799-0_sp1.1 from training. Duration: 0.836375 2023-10-04 07:37:58,739 WARNING [train.py:1204] (2/4) Exclude cut with ID _22826_210_9994_1_1531908201522_3279169_412-3442-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 07:37:58,758 WARNING [train.py:1204] (2/4) Exclude cut with ID _60057_210_10640_1_1534903065793_3333150_171-181133-0_sp0.9 from training. Duration: 0.97775 2023-10-04 07:37:58,780 WARNING [train.py:1204] (2/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_154-135696-0_sp1.1 from training. Duration: 0.72725 2023-10-04 07:38:01,672 WARNING [train.py:1204] (2/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_148-186805-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 07:38:04,378 WARNING [train.py:1204] (2/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_605-165641-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 07:38:04,385 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_174-277364-0_sp0.9 from training. Duration: 0.788875 2023-10-04 07:38:04,398 WARNING [train.py:1204] (2/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_89-321955-0_sp0.9 from training. Duration: 0.888875 2023-10-04 07:38:09,287 WARNING [train.py:1204] (2/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_438-105429-0_sp1.1 from training. Duration: 0.963625 2023-10-04 07:38:09,290 WARNING [train.py:1204] (2/4) Exclude cut with ID _14970_210_15709_1_1530773543493_1715730_187-20259-0_sp0.9 from training. Duration: 0.988875 2023-10-04 07:38:12,586 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.out_combiner.scale_min, batch_count=1576426.6666666667, ans=0.2 2023-10-04 07:38:16,416 WARNING [train.py:1204] (2/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_385-193052-0_sp0.9 from training. Duration: 0.9555625 2023-10-04 07:38:16,469 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7113_210_1849_1_1527942685_3448768_194-311626-0_sp1.1 from training. Duration: 0.92725 2023-10-04 07:38:19,242 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_3780_210_2087_1_1526467389_2145572_176-107140-0_sp0.9 from training. Duration: 0.7666875 2023-10-04 07:38:19,244 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_36756_210_7234_1_1533026991_966276_123-213457-0_sp1.1 from training. Duration: 0.936375 2023-10-04 07:38:22,019 WARNING [train.py:1204] (2/4) Exclude cut with ID _23557_210_8750_1_1531890106370_6846255_456-244861-0_sp1.1 from training. Duration: 0.963625 2023-10-04 07:38:23,764 WARNING [train.py:1204] (2/4) Exclude cut with ID _37086_210_9038_1_1532999540891_7021250_242-349170-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 07:38:23,817 WARNING [train.py:1204] (2/4) Exclude cut with ID _41298_210_9019_1_1533430371450_4822143_471-281351-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 07:38:25,778 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_53378_210_17282_1_1534221071_5769509_572-126176-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 07:38:28,789 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_1507-263032-0_sp1.1 from training. Duration: 0.963625 2023-10-04 07:38:28,837 WARNING [train.py:1204] (2/4) Exclude cut with ID _13679_210_7497_1_1530534293662_3355098_411-186088-0_sp1.1 from training. Duration: 0.8545625 2023-10-04 07:38:30,330 WARNING [train.py:1204] (2/4) Exclude cut with ID _29345_210_4381_1_1532689168263_3860169_149-257268-0_sp1.1 from training. Duration: 0.736375 2023-10-04 07:38:31,825 WARNING [train.py:1204] (2/4) Exclude cut with ID _47790_210_16187_1_1533862814066_4207519_381-252014-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 07:38:31,832 WARNING [train.py:1204] (2/4) Exclude cut with ID _35799_210_5854_1_1533263487887_3529071_512-2717-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 07:38:35,070 WARNING [train.py:1204] (2/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_505-205270-0_sp0.9 from training. Duration: 0.42225 2023-10-04 07:38:35,152 WARNING [train.py:1204] (2/4) Exclude cut with ID _14998_210_5219_1_1530705582080_7474970_731-20117-0_sp1.1 from training. Duration: 0.936375 2023-10-04 07:38:39,355 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_37351_210_7925_1_1533012931_3812113_265-85780-0_sp1.1 from training. Duration: 0.8 2023-10-04 07:38:39,361 WARNING [train.py:1204] (2/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_313-134930-0 from training. Duration: 0.97 2023-10-04 07:38:40,173 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.2.encoder.layers.2.feed_forward1.out_whiten, num_groups=1, num_channels=384, metric=6.04 vs. limit=15.0 2023-10-04 07:38:40,773 INFO [train.py:1046] (2/4) Epoch 45, batch 2750, loss[loss=0.1548, simple_loss=0.2305, pruned_loss=0.03955, over 23322.00 frames. ], tot_loss[loss=0.1553, simple_loss=0.2355, pruned_loss=0.03751, over 4688662.04 frames. ], batch size: 119, lr: 2.26e-03, grad_scale: 16.0 2023-10-04 07:38:40,836 WARNING [train.py:1204] (2/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_374-138971-0 from training. Duration: 0.94 2023-10-04 07:38:40,884 WARNING [train.py:1204] (2/4) Exclude cut with ID _30304_210_6319_1_1532476827261_7582060_1227-287886-0_sp1.1 from training. Duration: 0.936375 2023-10-04 07:38:42,373 WARNING [train.py:1204] (2/4) Exclude cut with ID _59493_210_17282_1_1535078419077_3225879_332-217544-0_sp1.1 from training. Duration: 0.97275 2023-10-04 07:38:43,703 WARNING [train.py:1204] (2/4) Exclude cut with ID _23514_210_1389_1_1531832139785_3031439_361-119640-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 07:38:45,178 WARNING [train.py:1204] (2/4) Exclude cut with ID _69584_210_5219_1_1535785133775_4044450_142-118058-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 07:38:45,202 WARNING [train.py:1204] (2/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_178-177582-0_sp1.1 from training. Duration: 0.67275 2023-10-04 07:38:45,251 WARNING [train.py:1204] (2/4) Exclude cut with ID _25303_210_17519_1_1532050207576_3635970_41-195572-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 07:38:46,727 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.0.conv_skip_rate, batch_count=1576560.0, ans=0.0 2023-10-04 07:38:49,244 WARNING [train.py:1204] (2/4) Exclude cut with ID _55328_210_27767_1_1534580973260_4088170_329-341322-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 07:38:49,280 WARNING [train.py:1204] (2/4) Exclude cut with ID _5608_210_2106_1_1527415439089_6819768_909-82767-0_sp1.1 from training. Duration: 0.5545625 2023-10-04 07:38:49,313 WARNING [train.py:1204] (2/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_261-297503-0_sp1.1 from training. Duration: 0.9 2023-10-04 07:38:49,326 WARNING [train.py:1204] (2/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_459-291706-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 07:38:49,327 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7762_210_1794_1_1528199508_4057408_422-227034-0 from training. Duration: 0.73 2023-10-04 07:38:49,336 WARNING [train.py:1204] (2/4) Exclude cut with ID _3067_210_2667_1_1526212974640_4503168_250-285937-0_sp0.9 from training. Duration: 0.888875 2023-10-04 07:38:49,356 WARNING [train.py:1204] (2/4) Exclude cut with ID _66873_210_5517_1_1535454136506_4293370_332-253745-0_sp1.1 from training. Duration: 0.936375 2023-10-04 07:38:55,027 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.3.encoder.layers.2.nonlin_attention.whiten2, num_groups=1, num_channels=512, metric=5.20 vs. limit=15.0 2023-10-04 07:38:55,813 WARNING [train.py:1204] (2/4) Exclude cut with ID _13938_210_9988_1_1530518718978_3693960_304-112784-0 from training. Duration: 0.46 2023-10-04 07:38:57,224 WARNING [train.py:1204] (2/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_226-267957-0_sp1.1 from training. Duration: 0.92725 2023-10-04 07:38:58,981 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_41822_210_13270_1_1533955976_4379284_608-254567-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 07:39:00,303 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_124-113562-0_sp1.1 from training. Duration: 0.8545625 2023-10-04 07:39:01,616 WARNING [train.py:1204] (2/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_117-290916-0_sp1.1 from training. Duration: 0.463625 2023-10-04 07:39:02,898 WARNING [train.py:1204] (2/4) Exclude cut with ID _28427_210_4220_1_1532581240967_3061069_102-66533-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 07:39:04,701 WARNING [train.py:1204] (2/4) Exclude cut with ID _55383_210_10635_1_1534468426172_2231669_134-128246-0_sp1.1 from training. Duration: 0.8090625 2023-10-04 07:39:04,741 WARNING [train.py:1204] (2/4) Exclude cut with ID _45871_210_5665_1_1533864614245_3296060_49-308316-0_sp1.1 from training. Duration: 0.97275 2023-10-04 07:39:04,784 WARNING [train.py:1204] (2/4) Exclude cut with ID _57572_210_5517_1_1534921219624_3809484_176-199205-0_sp1.1 from training. Duration: 0.97275 2023-10-04 07:39:08,945 WARNING [train.py:1204] (2/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_45-265684-0_sp1.1 from training. Duration: 0.6545625 2023-10-04 07:39:08,965 WARNING [train.py:1204] (2/4) Exclude cut with ID _15150_210_8341_1_1530752411803_7256384_469-239509-0_sp0.9 from training. Duration: 0.7555625 2023-10-04 07:39:09,017 WARNING [train.py:1204] (2/4) Exclude cut with ID _28461_210_2253_1_1532771775189_3041299_136-270644-0_sp1.1 from training. Duration: 0.8454375 2023-10-04 07:39:10,367 WARNING [train.py:1204] (2/4) Exclude cut with ID _69584_210_5219_1_1535785133775_4044450_302-47302-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 07:39:10,622 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.attention_skip_rate, batch_count=1576693.3333333333, ans=0.0 2023-10-04 07:39:11,761 WARNING [train.py:1204] (2/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_166-128941-0_sp1.1 from training. Duration: 0.4909375 2023-10-04 07:39:17,692 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.balancer1.prob, batch_count=1576693.3333333333, ans=0.125 2023-10-04 07:39:18,764 WARNING [train.py:1204] (2/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_559-120504-0_sp1.1 from training. Duration: 0.97275 2023-10-04 07:39:19,088 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=1576693.3333333333, ans=0.1 2023-10-04 07:39:20,186 WARNING [train.py:1204] (2/4) Exclude cut with ID _6456_210_1534_1_1527681634212_3181049_345-252877-0_sp1.1 from training. Duration: 0.5818125 2023-10-04 07:39:20,211 WARNING [train.py:1204] (2/4) Exclude cut with ID _28317_210_12742_1_1532480302381_8510325_902-233003-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 07:39:20,478 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.bypass.scale_min, batch_count=1576693.3333333333, ans=0.2 2023-10-04 07:39:20,819 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.4.encoder.layers.1.nonlin_attention.whiten2, num_groups=1, num_channels=384, metric=3.03 vs. limit=15.0 2023-10-04 07:39:22,530 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.feed_forward1.out_proj.dropout_p, batch_count=1576693.3333333333, ans=0.1 2023-10-04 07:39:25,452 WARNING [train.py:1204] (2/4) Exclude cut with ID _36538_210_9994_1_1533204020363_3795620_204-261385-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 07:39:25,456 WARNING [train.py:1204] (2/4) Exclude cut with ID _14821_210_8750_1_1530680503991_6829359_189-297451-0_sp1.1 from training. Duration: 0.5 2023-10-04 07:39:26,779 WARNING [train.py:1204] (2/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_1211-226066-0_sp0.9 from training. Duration: 0.8444375 2023-10-04 07:39:32,901 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6786_210_1388_1_1527839699_4194495_273-149841-0_sp1.1 from training. Duration: 0.836375 2023-10-04 07:39:33,544 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.3.encoder.layers.1.feed_forward3.out_whiten, num_groups=1, num_channels=512, metric=9.17 vs. limit=15.0 2023-10-04 07:39:34,235 WARNING [train.py:1204] (2/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_380-162648-0_sp1.1 from training. Duration: 0.8545625 2023-10-04 07:39:34,235 WARNING [train.py:1204] (2/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_494-199538-0 from training. Duration: 0.75 2023-10-04 07:39:35,877 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.conv_module1.balancer2.prob, batch_count=1576760.0, ans=0.125 2023-10-04 07:39:38,415 WARNING [train.py:1204] (2/4) Exclude cut with ID _49097_210_16049_1_1533952780207_4360979_276-87053-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 07:39:38,600 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.conv_skip_rate, batch_count=1576826.6666666667, ans=0.0 2023-10-04 07:39:40,674 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.conv_module2.balancer2.prob, batch_count=1576826.6666666667, ans=0.125 2023-10-04 07:39:41,848 WARNING [train.py:1204] (2/4) Exclude cut with ID _5444_210_2151_1_1527418808464_5312210_726-27165-0 from training. Duration: 0.67 2023-10-04 07:39:46,043 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_128-202104-0_sp1.1 from training. Duration: 0.57275 2023-10-04 07:39:48,851 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_11349_210_6753_1_1529800861_1101207_26-65602-0_sp1.1 from training. Duration: 0.87275 2023-10-04 07:39:48,883 WARNING [train.py:1204] (2/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_159-328157-0 from training. Duration: 0.52 2023-10-04 07:39:50,134 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.677e+02 2.025e+02 2.213e+02 2.495e+02 4.523e+02, threshold=4.427e+02, percent-clipped=1.0 2023-10-04 07:39:50,275 WARNING [train.py:1204] (2/4) Exclude cut with ID _14069_210_5632_1_1530512692033_4803390_98-21934-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 07:39:51,721 WARNING [train.py:1204] (2/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_352-59958-0_sp0.9 from training. Duration: 0.9555625 2023-10-04 07:39:51,775 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6163_210_6400_1_1527506746_4263068_283-137811-0 from training. Duration: 0.77 2023-10-04 07:39:53,120 WARNING [train.py:1204] (2/4) Exclude cut with ID _21989_210_5854_1_1531899214931_3271360_133-44759-0_sp1.1 from training. Duration: 0.7 2023-10-04 07:39:54,447 INFO [train.py:1046] (2/4) Epoch 45, batch 2800, loss[loss=0.1621, simple_loss=0.2511, pruned_loss=0.03652, over 24547.00 frames. ], tot_loss[loss=0.1537, simple_loss=0.2335, pruned_loss=0.03692, over 4683488.35 frames. ], batch size: 71, lr: 2.26e-03, grad_scale: 32.0 2023-10-04 07:39:54,614 WARNING [train.py:1204] (2/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_21-174514-0_sp0.9 from training. Duration: 0.47775 2023-10-04 07:39:56,443 WARNING [train.py:1204] (2/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_201-78811-0_sp1.1 from training. Duration: 0.963625 2023-10-04 07:39:56,469 WARNING [train.py:1204] (2/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_537-164630-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 07:39:56,541 WARNING [train.py:1204] (2/4) Exclude cut with ID _14087_210_2253_1_1530529224512_3014029_156-257607-0 from training. Duration: 0.83 2023-10-04 07:39:56,551 WARNING [train.py:1204] (2/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_308-154450-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 07:39:57,833 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_45399_210_4852_1_1533624725_7579737_214-240574-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 07:39:59,285 WARNING [train.py:1204] (2/4) Exclude cut with ID _29303_210_2575_1_1532392237303_7410649_615-305517-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 07:39:59,328 WARNING [train.py:1204] (2/4) Exclude cut with ID IC0125W0002-61808 from training. Duration: 0.708 2023-10-04 07:39:59,328 WARNING [train.py:1204] (2/4) Exclude cut with ID IC0125W0017-61823 from training. Duration: 0.759 2023-10-04 07:40:03,896 WARNING [train.py:1204] (2/4) Exclude cut with ID _53219_210_4525_1_1534834836008_4064825_4-145291-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 07:40:05,339 WARNING [train.py:1204] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_185-174805-0_sp1.1 from training. Duration: 0.6181875 2023-10-04 07:40:05,358 WARNING [train.py:1204] (2/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_635-193046-0_sp1.1 from training. Duration: 0.92725 2023-10-04 07:40:08,176 WARNING [train.py:1204] (2/4) Exclude cut with ID _46067_210_4220_1_1534125571066_3307450_321-258866-0_sp1.1 from training. Duration: 0.97275 2023-10-04 07:40:09,940 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8558_210_1410_1_1528505225_7613406_131-329524-0 from training. Duration: 0.79 2023-10-04 07:40:12,831 WARNING [train.py:1204] (2/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_847-41334-0_sp0.9 from training. Duration: 0.488875 2023-10-04 07:40:14,126 WARNING [train.py:1204] (2/4) Exclude cut with ID _14227_210_4381_1_1530701804286_3940259_125-275642-0 from training. Duration: 0.99 2023-10-04 07:40:14,327 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.conv_skip_rate, batch_count=1576960.0, ans=0.0 2023-10-04 07:40:15,375 WARNING [train.py:1204] (2/4) Exclude cut with ID _25248_210_8750_1_1532062825941_5554180_248-177255-0_sp1.1 from training. Duration: 0.963625 2023-10-04 07:40:15,414 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_40047_210_2290_1_1533290519_2374311_195-165693-0_sp1.1 from training. Duration: 0.8090625 2023-10-04 07:40:15,417 WARNING [train.py:1204] (2/4) Exclude cut with ID _23109_210_4381_1_1532150828777_901949_29-258672-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 07:40:18,471 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.conv_module2.balancer1.prob, batch_count=1576960.0, ans=0.125 2023-10-04 07:40:19,655 WARNING [train.py:1204] (2/4) Exclude cut with ID _14821_210_8750_1_1530680503991_6829359_2-227151-0_sp1.1 from training. Duration: 0.7545625 2023-10-04 07:40:19,684 WARNING [train.py:1204] (2/4) Exclude cut with ID _49643_210_7989_1_1534640244224_3939031_565-53982-0_sp1.1 from training. Duration: 0.963625 2023-10-04 07:40:19,697 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7544_210_6753_1_1528591327_1471426_33-346031-0_sp0.9 from training. Duration: 0.67775 2023-10-04 07:40:21,053 WARNING [train.py:1204] (2/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_95-61782-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 07:40:28,893 WARNING [train.py:1204] (2/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_686-54383-0_sp0.9 from training. Duration: 0.9666875 2023-10-04 07:40:30,367 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_14056_210_4163_1_1530604483_7572827_505-276823-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 07:40:32,974 WARNING [train.py:1204] (2/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_745-128443-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 07:40:33,211 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.feed_forward2.hidden_balancer.prob, batch_count=1577026.6666666667, ans=0.125 2023-10-04 07:40:34,893 WARNING [train.py:1204] (2/4) Exclude cut with ID _3844_210_1390_1_1526727803582_2871209_20-251664-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 07:40:34,953 WARNING [train.py:1204] (2/4) Exclude cut with ID _36538_210_9994_1_1533204020363_3795620_487-164628-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 07:40:35,553 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.5.encoder.layers.0.nonlin_attention.whiten1, num_groups=1, num_channels=192, metric=5.14 vs. limit=10.0 2023-10-04 07:40:39,578 WARNING [train.py:1204] (2/4) Exclude cut with ID _5166_210_2084_1_1527073197160_4087993_309-222545-0_sp1.1 from training. Duration: 0.763625 2023-10-04 07:40:39,592 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_3531_210_3528_1_1526294203_5216456_309-227302-0 from training. Duration: 0.69 2023-10-04 07:40:39,648 WARNING [train.py:1204] (2/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_42-192460-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 07:40:41,001 WARNING [train.py:1204] (2/4) Exclude cut with ID _22083_210_8254_1_1531702862439_3678970_49-234546-0_sp1.1 from training. Duration: 0.7545625 2023-10-04 07:40:41,005 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_427-78097-0_sp0.9 from training. Duration: 0.888875 2023-10-04 07:40:45,168 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_43802_210_2290_1_1533516203_8829504_378-205254-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 07:40:46,487 WARNING [train.py:1204] (2/4) Exclude cut with ID _45329_210_7005_1_1534157951211_6774339_672-314830-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 07:40:46,680 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.feed_forward1.hidden_balancer.prob, batch_count=1577093.3333333333, ans=0.125 2023-10-04 07:40:46,710 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.conv_module2.balancer1.prob, batch_count=1577093.3333333333, ans=0.125 2023-10-04 07:40:50,681 WARNING [train.py:1204] (2/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_90-30673-0_sp1.1 from training. Duration: 0.763625 2023-10-04 07:40:52,123 WARNING [train.py:1204] (2/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_563-329943-0_sp1.1 from training. Duration: 0.9 2023-10-04 07:40:52,144 WARNING [train.py:1204] (2/4) Exclude cut with ID _21632_210_2253_1_1531994001765_3744140_154-84090-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 07:40:52,145 WARNING [train.py:1204] (2/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_103-336064-0_sp0.9 from training. Duration: 0.5666875 2023-10-04 07:40:52,192 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_43-71989-0_sp0.9 from training. Duration: 0.6333125 2023-10-04 07:40:52,238 WARNING [train.py:1204] (2/4) Exclude cut with ID _3867_210_1883_1_1526641200575_4475830_319-349850-0_sp0.9 from training. Duration: 0.7444375 2023-10-04 07:40:53,601 WARNING [train.py:1204] (2/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_629-179454-0_sp1.1 from training. Duration: 0.963625 2023-10-04 07:40:55,487 WARNING [train.py:1204] (2/4) Exclude cut with ID _65263_210_22356_1_1535763598691_3921037_49-222917-0 from training. Duration: 0.92 2023-10-04 07:40:55,510 WARNING [train.py:1204] (2/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_602-331772-0_sp1.1 from training. Duration: 0.936375 2023-10-04 07:40:56,947 WARNING [train.py:1204] (2/4) Exclude cut with ID _58414_210_8254_1_1535421609162_7151182_292-134978-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 07:40:56,974 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_38793_210_2544_1_1533176114_3849320_200-299292-0_sp1.1 from training. Duration: 0.936375 2023-10-04 07:40:59,032 WARNING [train.py:1204] (2/4) Exclude cut with ID _14970_210_15709_1_1530775547361_1966965_6-23328-0 from training. Duration: 0.86 2023-10-04 07:41:00,410 WARNING [train.py:1204] (2/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_386-225511-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 07:41:00,429 WARNING [train.py:1204] (2/4) Exclude cut with ID _38323_210_12108_1_1533121233369_3303049_416-11919-0_sp1.1 from training. Duration: 0.87275 2023-10-04 07:41:00,471 WARNING [train.py:1204] (2/4) Exclude cut with ID _4866_210_2577_1_1527404412284_4364659_256-306830-0_sp1.1 from training. Duration: 0.7909375 2023-10-04 07:41:01,883 WARNING [train.py:1204] (2/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_53-300544-0 from training. Duration: 0.67 2023-10-04 07:41:08,051 WARNING [train.py:1204] (2/4) Exclude cut with ID _41314_210_12651_1_1533459476543_3766460_80-74184-0_sp1.1 from training. Duration: 0.7545625 2023-10-04 07:41:08,065 WARNING [train.py:1204] (2/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_332-256168-0_sp1.1 from training. Duration: 0.6818125 2023-10-04 07:41:08,120 WARNING [train.py:1204] (2/4) Exclude cut with ID _48836_210_12210_1_1534075848293_3904150_429-35475-0_sp1.1 from training. Duration: 0.8090625 2023-10-04 07:41:09,400 INFO [train.py:1046] (2/4) Epoch 45, batch 2850, loss[loss=0.1482, simple_loss=0.2209, pruned_loss=0.03773, over 23705.00 frames. ], tot_loss[loss=0.1531, simple_loss=0.2326, pruned_loss=0.03681, over 4679173.79 frames. ], batch size: 232, lr: 2.26e-03, grad_scale: 32.0 2023-10-04 07:41:11,297 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_144-135833-0_sp1.1 from training. Duration: 0.97275 2023-10-04 07:41:14,175 WARNING [train.py:1204] (2/4) Exclude cut with ID _14224_210_4381_1_1530529062037_4264010_380-343670-0_sp0.9 from training. Duration: 0.97775 2023-10-04 07:41:15,463 WARNING [train.py:1204] (2/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_213-265174-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 07:41:15,489 WARNING [train.py:1204] (2/4) Exclude cut with ID _20455_210_5219_1_1531565996775_8098100_86-314283-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 07:41:16,994 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_41750_210_1960_1_1533520678_3948042_68-223813-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 07:41:18,244 WARNING [train.py:1204] (2/4) Exclude cut with ID _55153_210_2253_1_1534384786677_3162096_24-198429-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 07:41:19,706 WARNING [train.py:1204] (2/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_119-207162-0_sp0.9 from training. Duration: 0.788875 2023-10-04 07:41:19,754 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_12868_210_6753_1_1530702479_3610564_148-172861-0 from training. Duration: 0.5 2023-10-04 07:41:26,908 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_5522_210_3273_1_1527249606_3909096_318-203153-0 from training. Duration: 0.43 2023-10-04 07:41:26,914 WARNING [train.py:1204] (2/4) Exclude cut with ID _14884_210_2252_1_1530707459689_5561250_288-189949-0_sp1.1 from training. Duration: 0.97275 2023-10-04 07:41:29,576 WARNING [train.py:1204] (2/4) Exclude cut with ID _5294_210_1747_1_1527249643928_3696179_529-242298-0 from training. Duration: 0.95 2023-10-04 07:41:31,494 WARNING [train.py:1204] (2/4) Exclude cut with ID _36538_210_9994_1_1533204020363_3795620_503-110289-0_sp1.1 from training. Duration: 0.936375 2023-10-04 07:41:33,252 WARNING [train.py:1204] (2/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_385-193052-0 from training. Duration: 0.86 2023-10-04 07:41:33,310 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_456-22141-0 from training. Duration: 0.88 2023-10-04 07:41:34,659 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_38965_210_4852_1_1533174906_3484735_284-117840-0_sp1.1 from training. Duration: 0.936375 2023-10-04 07:41:41,402 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.0.conv_module1.balancer2.min_abs, batch_count=1577360.0, ans=0.5 2023-10-04 07:41:46,978 WARNING [train.py:1204] (2/4) Exclude cut with ID _32704_210_8954_1_1533017488840_6747370_75-201625-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 07:41:48,358 WARNING [train.py:1204] (2/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_255-312654-0_sp1.1 from training. Duration: 0.82725 2023-10-04 07:41:48,367 WARNING [train.py:1204] (2/4) Exclude cut with ID _47251_210_18628_1_1533814063128_4137250_214-88076-0_sp0.9 from training. Duration: 0.97775 2023-10-04 07:41:48,454 WARNING [train.py:1204] (2/4) Exclude cut with ID _4092_210_2151_1_1526643044449_3135759_227-213972-0_sp1.1 from training. Duration: 0.4909375 2023-10-04 07:41:48,629 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.bypass.skip_rate, batch_count=1577360.0, ans=0.07 2023-10-04 07:41:49,754 WARNING [train.py:1204] (2/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_912-47473-0_sp1.1 from training. Duration: 0.6181875 2023-10-04 07:41:49,774 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6767_210_3273_1_1527855783_1718932_173-256601-0_sp1.1 from training. Duration: 0.72725 2023-10-04 07:41:51,172 WARNING [train.py:1204] (2/4) Exclude cut with ID _6274_210_2084_1_1527937583837_7494020_32-205288-0_sp1.1 from training. Duration: 0.7090625 2023-10-04 07:41:52,548 WARNING [train.py:1204] (2/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_39-248870-0 from training. Duration: 0.69 2023-10-04 07:41:54,103 WARNING [train.py:1204] (2/4) Exclude cut with ID _3541_210_2477_1_1526475408709_3263973_377-296839-0_sp1.1 from training. Duration: 0.5 2023-10-04 07:41:54,130 WARNING [train.py:1204] (2/4) Exclude cut with ID _13679_210_7497_1_1530534293662_3355098_413-56154-0_sp1.1 from training. Duration: 0.963625 2023-10-04 07:41:54,269 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.nonlin_attention.balancer.prob, batch_count=1577426.6666666667, ans=0.125 2023-10-04 07:41:55,440 WARNING [train.py:1204] (2/4) Exclude cut with ID _14970_210_15709_1_1530775547361_1966965_201-84544-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 07:41:55,489 WARNING [train.py:1204] (2/4) Exclude cut with ID _21783_210_11357_1_1531742458730_3762679_105-95775-0_sp1.1 from training. Duration: 0.936375 2023-10-04 07:41:58,731 WARNING [train.py:1204] (2/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_131-191669-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 07:42:00,030 WARNING [train.py:1204] (2/4) Exclude cut with ID _14860_210_4915_1_1530702020340_7343240_695-4323-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 07:42:00,115 WARNING [train.py:1204] (2/4) Exclude cut with ID _39867_210_9826_1_1533553222030_9344259_991-130565-0_sp1.1 from training. Duration: 0.97275 2023-10-04 07:42:01,559 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_50-3289-0_sp1.1 from training. Duration: 0.82725 2023-10-04 07:42:01,686 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_54-347670-0_sp1.1 from training. Duration: 0.92725 2023-10-04 07:42:03,617 WARNING [train.py:1204] (2/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_200-153445-0_sp1.1 from training. Duration: 0.936375 2023-10-04 07:42:05,004 WARNING [train.py:1204] (2/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_279-302911-0_sp1.1 from training. Duration: 0.97275 2023-10-04 07:42:06,408 WARNING [train.py:1204] (2/4) Exclude cut with ID _14886_210_4220_1_1530702020355_3516190_159-73748-0_sp0.9 from training. Duration: 0.92225 2023-10-04 07:42:08,445 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.conv_module1.balancer1.prob, batch_count=1577493.3333333333, ans=0.125 2023-10-04 07:42:10,983 WARNING [train.py:1204] (2/4) Exclude cut with ID _53379_210_20034_1_1534222027363_3854654_23-222715-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 07:42:11,780 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.2.encoder.layers.1.feed_forward3.out_whiten, num_groups=1, num_channels=384, metric=7.47 vs. limit=15.0 2023-10-04 07:42:12,376 WARNING [train.py:1204] (2/4) Exclude cut with ID _12555_210_8750_1_1530075594402_3780239_449-267986-0 from training. Duration: 0.83 2023-10-04 07:42:12,389 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7544_210_6753_1_1528587722_3576686_295-258479-0 from training. Duration: 0.84 2023-10-04 07:42:15,081 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7002_210_1794_1_1527935634_4034444_487-236501-0_sp0.9 from training. Duration: 0.7333125 2023-10-04 07:42:15,132 WARNING [train.py:1204] (2/4) Exclude cut with ID _21783_210_11357_1_1531742458730_3762679_244-19107-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 07:42:15,152 WARNING [train.py:1204] (2/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_390-339137-0 from training. Duration: 0.56 2023-10-04 07:42:16,730 WARNING [train.py:1204] (2/4) Exclude cut with ID _7562_210_2084_1_1528887767916_3946229_186-326422-0_sp1.1 from training. Duration: 0.763625 2023-10-04 07:42:17,900 WARNING [train.py:1204] (2/4) Exclude cut with ID _33640_210_2655_1_1533117605479_4855469_645-145235-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 07:42:17,924 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_25767_210_2161_1_1532307483_3867748_506-171777-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 07:42:17,952 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_5438_210_1794_1_1527336687_2600009_233-58072-0_sp1.1 from training. Duration: 0.7 2023-10-04 07:42:17,952 WARNING [train.py:1204] (2/4) Exclude cut with ID IC0669W0063-330908 from training. Duration: 0.896 2023-10-04 07:42:19,174 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.756e+02 2.081e+02 2.377e+02 2.931e+02 5.661e+02, threshold=4.754e+02, percent-clipped=4.0 2023-10-04 07:42:19,321 WARNING [train.py:1204] (2/4) Exclude cut with ID IC0671W0070-331913 from training. Duration: 0.896 2023-10-04 07:42:19,324 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_258-310570-0_sp1.1 from training. Duration: 0.7454375 2023-10-04 07:42:19,387 WARNING [train.py:1204] (2/4) Exclude cut with ID _9461_210_4557_1_1529060514726_3677451_162-177507-0_sp1.1 from training. Duration: 0.97275 2023-10-04 07:42:20,994 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.bypass_mid.scale_min, batch_count=1577493.3333333333, ans=0.2 2023-10-04 07:42:22,339 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.1.self_attn_weights.pos_emb_skip_rate, batch_count=1577560.0, ans=0.0 2023-10-04 07:42:23,476 INFO [train.py:1046] (2/4) Epoch 45, batch 2900, loss[loss=0.151, simple_loss=0.2318, pruned_loss=0.0351, over 23274.00 frames. ], tot_loss[loss=0.1532, simple_loss=0.2331, pruned_loss=0.03666, over 4692331.82 frames. ], batch size: 105, lr: 2.26e-03, grad_scale: 32.0 2023-10-04 07:42:23,507 WARNING [train.py:1204] (2/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_26-175553-0_sp0.9 from training. Duration: 0.67775 2023-10-04 07:42:23,523 WARNING [train.py:1204] (2/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_7-150481-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 07:42:24,937 WARNING [train.py:1204] (2/4) Exclude cut with ID _23557_210_8750_1_1531890106370_6846255_72-273262-0_sp1.1 from training. Duration: 0.963625 2023-10-04 07:42:25,040 WARNING [train.py:1204] (2/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_367-61330-0 from training. Duration: 0.75 2023-10-04 07:42:29,523 WARNING [train.py:1204] (2/4) Exclude cut with ID _39867_210_9826_1_1533553222030_9344259_1337-11141-0_sp1.1 from training. Duration: 0.936375 2023-10-04 07:42:29,541 WARNING [train.py:1204] (2/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_101-293367-0 from training. Duration: 0.63 2023-10-04 07:42:29,606 WARNING [train.py:1204] (2/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_10-100367-0 from training. Duration: 0.98 2023-10-04 07:42:30,983 WARNING [train.py:1204] (2/4) Exclude cut with ID _60057_210_10640_1_1534903065793_3333150_171-181133-0_sp1.1 from training. Duration: 0.8 2023-10-04 07:42:32,845 WARNING [train.py:1204] (2/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_844-96989-0_sp0.9 from training. Duration: 0.788875 2023-10-04 07:42:32,991 WARNING [train.py:1204] (2/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_301-144595-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 07:42:34,451 WARNING [train.py:1204] (2/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_115-147343-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 07:42:39,418 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_3171_210_2087_1_1526109084_2330007_37-64334-0_sp1.1 from training. Duration: 0.7454375 2023-10-04 07:42:39,457 WARNING [train.py:1204] (2/4) Exclude cut with ID _19514_210_9259_1_1531530071330_5084230_769-272914-0_sp1.1 from training. Duration: 0.936375 2023-10-04 07:42:41,529 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.5.encoder.layers.0.whiten, num_groups=1, num_channels=256, metric=2.20 vs. limit=12.0 2023-10-04 07:42:42,305 WARNING [train.py:1204] (2/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_76-311963-0_sp0.9 from training. Duration: 0.77775 2023-10-04 07:42:42,346 WARNING [train.py:1204] (2/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_526-62079-0 from training. Duration: 0.67 2023-10-04 07:42:42,399 WARNING [train.py:1204] (2/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_7-111954-0_sp0.9 from training. Duration: 0.62225 2023-10-04 07:42:45,043 WARNING [train.py:1204] (2/4) Exclude cut with ID _57997_210_4163_1_1534766425889_3961049_360-279140-0_sp1.1 from training. Duration: 0.97275 2023-10-04 07:42:46,553 WARNING [train.py:1204] (2/4) Exclude cut with ID _4866_210_2577_1_1527404412284_4364659_133-1696-0 from training. Duration: 0.99 2023-10-04 07:42:46,614 WARNING [train.py:1204] (2/4) Exclude cut with ID _5190_210_4557_1_1527244368799_373439_28-197030-0 from training. Duration: 0.72 2023-10-04 07:42:50,690 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_53-39003-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 07:42:50,693 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6673_210_3273_1_1527767790_3173240_351-167954-0 from training. Duration: 0.48 2023-10-04 07:42:50,714 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_2822_210_2715_1_1526112586_1999587_165-36656-0_sp1.1 from training. Duration: 0.8090625 2023-10-04 07:42:53,399 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_328-264892-0_sp1.1 from training. Duration: 0.92725 2023-10-04 07:42:53,404 WARNING [train.py:1204] (2/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_29-293896-0_sp0.9 from training. Duration: 0.67775 2023-10-04 07:42:56,256 WARNING [train.py:1204] (2/4) Exclude cut with ID _65263_210_22356_1_1535763598691_3921037_465-55448-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 07:42:58,321 WARNING [train.py:1204] (2/4) Exclude cut with ID _21989_210_5854_1_1531899214931_3271360_311-53106-0_sp1.1 from training. Duration: 0.97275 2023-10-04 07:43:02,496 WARNING [train.py:1204] (2/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_389-277915-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 07:43:05,603 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_69630_210_7234_1_1535791094_677837_63-277139-0_sp1.1 from training. Duration: 0.963625 2023-10-04 07:43:08,383 WARNING [train.py:1204] (2/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_85-138257-0 from training. Duration: 0.71 2023-10-04 07:43:08,412 WARNING [train.py:1204] (2/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_686-247600-0 from training. Duration: 0.92 2023-10-04 07:43:08,417 WARNING [train.py:1204] (2/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_108-255430-0_sp1.1 from training. Duration: 0.8818125 2023-10-04 07:43:13,274 WARNING [train.py:1204] (2/4) Exclude cut with ID _14884_210_2252_1_1530707459689_5561250_114-201399-0_sp1.1 from training. Duration: 0.6909375 2023-10-04 07:43:14,629 WARNING [train.py:1204] (2/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_506-42614-0 from training. Duration: 0.79 2023-10-04 07:43:14,728 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_43802_210_2290_1_1533516203_8829504_85-195240-0_sp1.1 from training. Duration: 0.7909375 2023-10-04 07:43:17,683 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=1577760.0, ans=0.1 2023-10-04 07:43:21,693 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_141-36243-0_sp1.1 from training. Duration: 0.97275 2023-10-04 07:43:23,340 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.ff2_skip_rate, batch_count=1577826.6666666667, ans=0.0 2023-10-04 07:43:28,936 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder_embed.conv.5.prob, batch_count=1577826.6666666667, ans=0.125 2023-10-04 07:43:30,196 WARNING [train.py:1204] (2/4) Exclude cut with ID _7852_210_5219_1_1528286471316_8139400_358-287492-0_sp1.1 from training. Duration: 0.87275 2023-10-04 07:43:30,219 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_805-71497-0_sp0.9 from training. Duration: 0.788875 2023-10-04 07:43:32,014 WARNING [train.py:1204] (2/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_178-39722-0 from training. Duration: 0.49 2023-10-04 07:43:32,354 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.conv_module2.balancer1.prob, batch_count=1577826.6666666667, ans=0.125 2023-10-04 07:43:33,616 WARNING [train.py:1204] (2/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_428-181925-0_sp1.1 from training. Duration: 0.963625 2023-10-04 07:43:33,620 WARNING [train.py:1204] (2/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_770-200430-0 from training. Duration: 0.89 2023-10-04 07:43:34,819 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_1104-271009-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 07:43:34,880 WARNING [train.py:1204] (2/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_128-296581-0_sp0.9 from training. Duration: 0.82225 2023-10-04 07:43:38,094 INFO [train.py:1046] (2/4) Epoch 45, batch 2950, loss[loss=0.1582, simple_loss=0.2316, pruned_loss=0.04245, over 23880.00 frames. ], tot_loss[loss=0.1539, simple_loss=0.2339, pruned_loss=0.03695, over 4704326.72 frames. ], batch size: 195, lr: 2.26e-03, grad_scale: 16.0 2023-10-04 07:43:42,728 WARNING [train.py:1204] (2/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_19-62511-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 07:43:42,850 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_805-71497-0 from training. Duration: 0.71 2023-10-04 07:43:44,722 WARNING [train.py:1204] (2/4) Exclude cut with ID _44778_210_2054_1_1533798127203_7443090_573-152077-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 07:43:44,727 WARNING [train.py:1204] (2/4) Exclude cut with ID _42875_210_14856_1_1533704397487_4012433_393-177816-0_sp1.1 from training. Duration: 0.963625 2023-10-04 07:43:44,865 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.1.bypass_mid.scale_min, batch_count=1577893.3333333333, ans=0.2 2023-10-04 07:43:47,496 WARNING [train.py:1204] (2/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_278-35159-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 07:43:47,586 WARNING [train.py:1204] (2/4) Exclude cut with ID _15037_210_2253_1_1530752294104_3267650_13-130361-0_sp1.1 from training. Duration: 0.9 2023-10-04 07:43:49,094 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_3019_210_2028_1_1526044245_3264865_127-135615-0 from training. Duration: 0.94 2023-10-04 07:43:49,145 WARNING [train.py:1204] (2/4) Exclude cut with ID _28317_210_12742_1_1532480302381_8510325_607-213951-0 from training. Duration: 0.84 2023-10-04 07:43:50,317 WARNING [train.py:1204] (2/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_436-318113-0_sp0.9 from training. Duration: 0.8333125 2023-10-04 07:43:50,318 WARNING [train.py:1204] (2/4) Exclude cut with ID _13938_210_9988_1_1530518718978_3693960_148-152225-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 07:43:53,432 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.attention_skip_rate, batch_count=1577960.0, ans=0.0 2023-10-04 07:43:57,213 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_2541_210_1794_1_1525590269_3084765_156-173713-0_sp1.1 from training. Duration: 0.736375 2023-10-04 07:43:58,493 WARNING [train.py:1204] (2/4) Exclude cut with ID _28323_210_16187_1_1532480395667_4100300_531-24157-0_sp1.1 from training. Duration: 0.8909375 2023-10-04 07:44:00,019 WARNING [train.py:1204] (2/4) Exclude cut with ID _29303_210_2575_1_1532392237303_7410649_382-217492-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 07:44:00,615 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.4.encoder.layers.0.self_attn2.whiten, num_groups=1, num_channels=384, metric=12.72 vs. limit=22.5 2023-10-04 07:44:01,354 WARNING [train.py:1204] (2/4) Exclude cut with ID _58895_210_8750_1_1535184081320_3649089_270-158013-0_sp1.1 from training. Duration: 0.92725 2023-10-04 07:44:01,623 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.nonlin_attention.balancer.prob, batch_count=1577960.0, ans=0.125 2023-10-04 07:44:03,518 WARNING [train.py:1204] (2/4) Exclude cut with ID _29277_210_3279_1_1532581109293_3173069_311-206227-0_sp1.1 from training. Duration: 0.936375 2023-10-04 07:44:03,529 WARNING [train.py:1204] (2/4) Exclude cut with ID _7160_210_1876_1_1527940923231_3703799_282-322611-0_sp0.9 from training. Duration: 0.9666875 2023-10-04 07:44:06,201 WARNING [train.py:1204] (2/4) Exclude cut with ID _53219_210_4525_1_1534834836008_4064825_562-32295-0_sp1.1 from training. Duration: 0.963625 2023-10-04 07:44:07,533 WARNING [train.py:1204] (2/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_408-87330-0_sp1.1 from training. Duration: 0.963625 2023-10-04 07:44:07,548 WARNING [train.py:1204] (2/4) Exclude cut with ID _59202_210_26984_1_1534816810612_3549174_137-318710-0_sp1.1 from training. Duration: 0.8090625 2023-10-04 07:44:11,033 WARNING [train.py:1204] (2/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_372-30302-0 from training. Duration: 0.86 2023-10-04 07:44:11,234 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.balancer1.prob, batch_count=1578026.6666666667, ans=0.125 2023-10-04 07:44:11,714 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.3.encoder.layers.0.nonlin_attention.whiten1, num_groups=1, num_channels=384, metric=6.71 vs. limit=10.0 2023-10-04 07:44:17,071 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7543_210_6753_1_1528541907_1399322_178-56269-0_sp1.1 from training. Duration: 0.95275 2023-10-04 07:44:17,100 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9109W0012-549758 from training. Duration: 0.512 2023-10-04 07:44:17,173 WARNING [train.py:1204] (2/4) Exclude cut with ID _5410_210_1388_1_1527397055795_1719020_39-112221-0_sp0.9 from training. Duration: 0.7666875 2023-10-04 07:44:20,316 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9121W0012-554933 from training. Duration: 0.896 2023-10-04 07:44:21,709 WARNING [train.py:1204] (2/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_476-272757-0 from training. Duration: 0.93 2023-10-04 07:44:21,735 WARNING [train.py:1204] (2/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_1085-171718-0_sp1.1 from training. Duration: 0.92725 2023-10-04 07:44:23,014 WARNING [train.py:1204] (2/4) Exclude cut with ID _8480_210_1874_1_1528589391205_6728330_309-139896-0_sp1.1 from training. Duration: 0.736375 2023-10-04 07:44:23,014 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9130W0113-559703 from training. Duration: 0.896 2023-10-04 07:44:23,026 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_292-265928-0_sp1.1 from training. Duration: 0.463625 2023-10-04 07:44:24,576 WARNING [train.py:1204] (2/4) Exclude cut with ID _8772_210_1850_1_1528635595972_3664011_96-12756-0 from training. Duration: 0.98 2023-10-04 07:44:25,867 WARNING [train.py:1204] (2/4) Exclude cut with ID _12556_210_8341_1_1530081024100_7327690_725-193477-0_sp1.1 from training. Duration: 0.8909375 2023-10-04 07:44:25,920 WARNING [train.py:1204] (2/4) Exclude cut with ID _5289_210_2084_1_1527678202736_3779299_142-103317-0_sp1.1 from training. Duration: 0.763625 2023-10-04 07:44:30,148 WARNING [train.py:1204] (2/4) Exclude cut with ID _45012_210_12651_1_1533608435136_5371380_206-51075-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 07:44:31,447 WARNING [train.py:1204] (2/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_176-56120-0_sp1.1 from training. Duration: 0.7545625 2023-10-04 07:44:31,473 WARNING [train.py:1204] (2/4) Exclude cut with ID _40284_210_2084_1_1533513679814_3704279_220-201864-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 07:44:32,774 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9165W0004-576638 from training. Duration: 0.896 2023-10-04 07:44:32,817 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_5438_210_1794_1_1527336687_2600009_218-343336-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 07:44:32,841 WARNING [train.py:1204] (2/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_318-162017-0 from training. Duration: 0.91 2023-10-04 07:44:38,900 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_49544_210_1794_1_1534379770_5262083_483-349298-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 07:44:40,082 WARNING [train.py:1204] (2/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_197-28164-0_sp0.9 from training. Duration: 0.888875 2023-10-04 07:44:40,143 WARNING [train.py:1204] (2/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_643-52007-0 from training. Duration: 0.57 2023-10-04 07:44:41,459 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_38965_210_4852_1_1533174906_3484735_403-332963-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 07:44:43,453 WARNING [train.py:1204] (2/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_340-174176-0 from training. Duration: 0.49 2023-10-04 07:44:46,210 WARNING [train.py:1204] (2/4) Exclude cut with ID _56708_210_13177_1_1535252351282_3422899_363-917-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 07:44:48,123 WARNING [train.py:1204] (2/4) Exclude cut with ID _19896_210_1883_1_1531709022662_6649569_525-159063-0_sp1.1 from training. Duration: 0.936375 2023-10-04 07:44:48,155 WARNING [train.py:1204] (2/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_430-116139-0_sp0.9 from training. Duration: 0.9444375 2023-10-04 07:44:48,348 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.balancer_na.min_abs, batch_count=1578160.0, ans=0.02 2023-10-04 07:44:49,416 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.564e+02 1.943e+02 2.173e+02 2.663e+02 4.538e+02, threshold=4.346e+02, percent-clipped=0.0 2023-10-04 07:44:49,517 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_53753_210_7234_1_1534320003_3594148_86-329250-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 07:44:49,534 WARNING [train.py:1204] (2/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_406-275649-0_sp1.1 from training. Duration: 0.3818125 2023-10-04 07:44:50,910 WARNING [train.py:1204] (2/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_1083-147536-0_sp1.1 from training. Duration: 0.8818125 2023-10-04 07:44:50,973 WARNING [train.py:1204] (2/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_54-311741-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 07:44:50,978 WARNING [train.py:1204] (2/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_237-58433-0_sp0.9 from training. Duration: 0.82225 2023-10-04 07:44:52,364 INFO [train.py:1046] (2/4) Epoch 45, batch 3000, loss[loss=0.1449, simple_loss=0.2237, pruned_loss=0.0331, over 19119.00 frames. ], tot_loss[loss=0.1543, simple_loss=0.2346, pruned_loss=0.03696, over 4713607.32 frames. ], batch size: 41, lr: 2.25e-03, grad_scale: 16.0 2023-10-04 07:44:52,364 INFO [train.py:1069] (2/4) Computing validation loss 2023-10-04 07:45:04,915 INFO [train.py:1078] (2/4) Epoch 45, validation: loss=0.3664, simple_loss=0.2817, pruned_loss=0.2256, over 1125622.00 frames. 2023-10-04 07:45:04,916 INFO [train.py:1079] (2/4) Maximum memory allocated so far is 20967MB 2023-10-04 07:45:05,589 WARNING [train.py:1204] (2/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_98-51676-0_sp1.1 from training. Duration: 0.663625 2023-10-04 07:45:05,645 WARNING [train.py:1204] (2/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_386-270472-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 07:45:06,244 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.5.encoder.layers.0.self_attn_weights.whiten_keys, num_groups=4, num_channels=128, metric=1.56 vs. limit=6.0 2023-10-04 07:45:07,036 WARNING [train.py:1204] (2/4) Exclude cut with ID _19023_210_4353_1_1531378892552_5820780_83-254794-0_sp1.1 from training. Duration: 0.7818125 2023-10-04 07:45:08,470 WARNING [train.py:1204] (2/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_314-93656-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 07:45:09,704 WARNING [train.py:1204] (2/4) Exclude cut with ID _14170_210_5219_1_1530525949868_4469650_336-213530-0 from training. Duration: 0.9 2023-10-04 07:45:11,167 WARNING [train.py:1204] (2/4) Exclude cut with ID _36686_210_5665_1_1533778225913_3646410_65-221120-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 07:45:14,485 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_26433_210_12108_1_1532157158_4056480_271-68178-0_sp1.1 from training. Duration: 0.92725 2023-10-04 07:45:14,523 WARNING [train.py:1204] (2/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_46-207599-0_sp0.9 from training. Duration: 0.711125 2023-10-04 07:45:16,537 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9303W0114-637133 from training. Duration: 0.896 2023-10-04 07:45:17,838 WARNING [train.py:1204] (2/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_490-207861-0 from training. Duration: 0.98 2023-10-04 07:45:19,381 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6767_210_3273_1_1527855783_1718932_173-256601-0_sp0.9 from training. Duration: 0.888875 2023-10-04 07:45:19,420 WARNING [train.py:1204] (2/4) Exclude cut with ID _13921_210_9994_1_1530841782272_3503520_327-69677-0_sp1.1 from training. Duration: 0.8181875 2023-10-04 07:45:20,648 WARNING [train.py:1204] (2/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_508-335369-0 from training. Duration: 0.48 2023-10-04 07:45:20,680 WARNING [train.py:1204] (2/4) Exclude cut with ID _64631_210_27767_1_1535349580068_4165160_24-46567-0_sp1.1 from training. Duration: 0.963625 2023-10-04 07:45:25,512 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_89-35748-0_sp1.1 from training. Duration: 0.7181875 2023-10-04 07:45:35,390 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_26433_210_12108_1_1532157158_4056480_578-262107-0_sp1.1 from training. Duration: 0.9 2023-10-04 07:45:40,744 WARNING [train.py:1204] (2/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_129-337802-0 from training. Duration: 0.72 2023-10-04 07:45:40,843 WARNING [train.py:1204] (2/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_356-167816-0_sp0.9 from training. Duration: 0.8 2023-10-04 07:45:44,400 WARNING [train.py:1204] (2/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_137-116442-0_sp1.1 from training. Duration: 0.7090625 2023-10-04 07:45:44,437 WARNING [train.py:1204] (2/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_358-172940-0_sp1.1 from training. Duration: 0.963625 2023-10-04 07:45:44,457 WARNING [train.py:1204] (2/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_668-344147-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 07:45:47,160 WARNING [train.py:1204] (2/4) Exclude cut with ID _62337_210_12392_1_1535009273105_3412229_255-157667-0_sp1.1 from training. Duration: 0.936375 2023-10-04 07:45:47,163 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_486-151131-0 from training. Duration: 0.85 2023-10-04 07:45:48,572 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_53638_210_21581_1_1534297319_5808449_885-80172-0 from training. Duration: 0.96 2023-10-04 07:45:49,962 WARNING [train.py:1204] (2/4) Exclude cut with ID _50093_210_4220_1_1534417141207_3432299_105-183067-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 07:45:50,018 WARNING [train.py:1204] (2/4) Exclude cut with ID _7562_210_2084_1_1528887767916_3946229_345-169156-0_sp0.9 from training. Duration: 0.6444375 2023-10-04 07:45:53,238 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_4528_210_3273_1_1527076654_3276777_209-219804-0_sp0.9 from training. Duration: 0.7666875 2023-10-04 07:45:53,277 WARNING [train.py:1204] (2/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_35-153886-0_sp0.9 from training. Duration: 0.9333125 2023-10-04 07:45:54,607 WARNING [train.py:1204] (2/4) Exclude cut with ID _30800_210_22382_1_1532499347904_4515959_332-121925-0_sp1.1 from training. Duration: 0.92725 2023-10-04 07:45:54,609 WARNING [train.py:1204] (2/4) Exclude cut with ID _59202_210_26984_1_1534816810612_3549174_495-65515-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 07:45:57,374 WARNING [train.py:1204] (2/4) Exclude cut with ID _6274_210_2084_1_1527937583837_7494020_769-168302-0_sp1.1 from training. Duration: 0.6454375 2023-10-04 07:45:57,412 WARNING [train.py:1204] (2/4) Exclude cut with ID _24271_210_12658_1_1531987270022_7190519_708-71345-0_sp1.1 from training. Duration: 0.936375 2023-10-04 07:45:57,413 WARNING [train.py:1204] (2/4) Exclude cut with ID 3033-130750-0096-55598-0_sp0.9 from training. Duration: 0.92225 2023-10-04 07:45:58,801 WARNING [train.py:1204] (2/4) Exclude cut with ID _14087_210_2253_1_1530529224512_3014029_219-224445-0_sp0.9 from training. Duration: 0.9333125 2023-10-04 07:46:00,334 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_174-277364-0 from training. Duration: 0.71 2023-10-04 07:46:02,351 WARNING [train.py:1204] (2/4) Exclude cut with ID _45021_210_9994_1_1533619751094_3330288_96-239010-0_sp1.1 from training. Duration: 0.82725 2023-10-04 07:46:03,660 WARNING [train.py:1204] (2/4) Exclude cut with ID _31602_210_5854_1_1532677021456_4458250_489-305116-0_sp1.1 from training. Duration: 0.97275 2023-10-04 07:46:03,686 WARNING [train.py:1204] (2/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_269-154726-0_sp0.9 from training. Duration: 0.9555625 2023-10-04 07:46:07,759 WARNING [train.py:1204] (2/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_126-54418-0_sp1.1 from training. Duration: 0.92725 2023-10-04 07:46:07,783 WARNING [train.py:1204] (2/4) Exclude cut with ID _42326_210_7770_1_1533814282531_3707269_182-224159-0_sp1.1 from training. Duration: 0.92725 2023-10-04 07:46:11,384 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7002_210_1794_1_1527939683_5157286_1-281264-0_sp0.9 from training. Duration: 0.488875 2023-10-04 07:46:11,422 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_86-16881-0 from training. Duration: 0.58 2023-10-04 07:46:11,445 WARNING [train.py:1204] (2/4) Exclude cut with ID _19514_210_9259_1_1531530071330_5084230_637-279094-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 07:46:11,472 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_26433_210_12108_1_1532157158_4056480_578-262107-0 from training. Duration: 0.99 2023-10-04 07:46:12,921 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6291_210_3273_1_1527683649_1078498_82-221196-0_sp1.1 from training. Duration: 0.6909375 2023-10-04 07:46:15,608 WARNING [train.py:1204] (2/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_402-102311-0 from training. Duration: 0.67 2023-10-04 07:46:15,822 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.feed_forward1.hidden_balancer.prob, batch_count=1578493.3333333333, ans=0.125 2023-10-04 07:46:18,426 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1015-343884-0_sp1.1 from training. Duration: 0.563625 2023-10-04 07:46:18,498 WARNING [train.py:1204] (2/4) Exclude cut with ID _13684_210_7497_1_1530619354863_3598359_312-235606-0_sp1.1 from training. Duration: 0.5454375 2023-10-04 07:46:19,803 INFO [train.py:1046] (2/4) Epoch 45, batch 3050, loss[loss=0.1478, simple_loss=0.2401, pruned_loss=0.0277, over 24317.00 frames. ], tot_loss[loss=0.1549, simple_loss=0.2354, pruned_loss=0.03723, over 4704540.53 frames. ], batch size: 74, lr: 2.25e-03, grad_scale: 16.0 2023-10-04 07:46:19,860 WARNING [train.py:1204] (2/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_55-31904-0 from training. Duration: 0.88 2023-10-04 07:46:19,941 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_3133_210_2087_1_1526106446_2324690_25-332138-0 from training. Duration: 0.91 2023-10-04 07:46:19,942 WARNING [train.py:1204] (2/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_912-282412-0_sp0.9 from training. Duration: 0.5444375 2023-10-04 07:46:21,298 WARNING [train.py:1204] (2/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_458-299435-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 07:46:21,386 WARNING [train.py:1204] (2/4) Exclude cut with ID _69584_210_5219_1_1535785133775_4044450_275-88403-0_sp1.1 from training. Duration: 0.92725 2023-10-04 07:46:21,394 WARNING [train.py:1204] (2/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_841-341679-0_sp1.1 from training. Duration: 0.52725 2023-10-04 07:46:21,408 WARNING [train.py:1204] (2/4) Exclude cut with ID _41391_210_7994_1_1533603771872_3638201_1-54521-0_sp1.1 from training. Duration: 0.97275 2023-10-04 07:46:22,778 WARNING [train.py:1204] (2/4) Exclude cut with ID _56235_210_10640_1_1534557335235_4161099_495-186609-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 07:46:25,917 WARNING [train.py:1204] (2/4) Exclude cut with ID _4681_210_3525_1_1527250689464_120889_15-126760-0 from training. Duration: 0.81 2023-10-04 07:46:28,674 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_3019_210_2028_1_1526044245_3264865_127-135615-0_sp1.1 from training. Duration: 0.8545625 2023-10-04 07:46:30,328 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.bypass.scale_min, batch_count=1578560.0, ans=0.2 2023-10-04 07:46:31,411 WARNING [train.py:1204] (2/4) Exclude cut with ID _30191_210_1664_1_1533189915047_7196670_915-21561-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 07:46:31,471 WARNING [train.py:1204] (2/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_6-214815-0_sp0.9 from training. Duration: 0.9444375 2023-10-04 07:46:34,375 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_45399_210_4852_1_1533624725_7579737_190-137494-0_sp1.1 from training. Duration: 0.97275 2023-10-04 07:46:36,418 WARNING [train.py:1204] (2/4) Exclude cut with ID _41314_210_12651_1_1533459476543_3766460_207-132038-0 from training. Duration: 0.94 2023-10-04 07:46:42,453 WARNING [train.py:1204] (2/4) Exclude cut with ID _14603_210_4220_1_1530615688441_3097319_90-31383-0 from training. Duration: 0.87 2023-10-04 07:46:43,902 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_15517_210_6753_1_1530952293_1664795_119-31001-0 from training. Duration: 0.94 2023-10-04 07:46:43,940 WARNING [train.py:1204] (2/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_684-85342-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 07:46:47,978 WARNING [train.py:1204] (2/4) Exclude cut with ID _14603_210_4220_1_1530615688441_3097319_97-325053-0_sp1.1 from training. Duration: 0.836375 2023-10-04 07:46:52,094 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_68702_210_23689_1_1535799577_3420068_168-25150-0_sp1.1 from training. Duration: 0.97275 2023-10-04 07:46:52,101 WARNING [train.py:1204] (2/4) Exclude cut with ID _65134_210_14763_1_1535335393985_3505368_111-27984-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 07:46:52,161 WARNING [train.py:1204] (2/4) Exclude cut with ID _48678_210_5663_1_1533988830947_3769199_276-226230-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 07:46:54,953 WARNING [train.py:1204] (2/4) Exclude cut with ID _28427_210_4220_1_1532581240967_3061069_43-84928-0_sp1.1 from training. Duration: 0.9 2023-10-04 07:46:54,982 WARNING [train.py:1204] (2/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_158-250116-0_sp1.1 from training. Duration: 0.563625 2023-10-04 07:46:56,688 WARNING [train.py:1204] (2/4) Exclude cut with ID _66873_210_5517_1_1535454136506_4293370_279-332556-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 07:46:56,731 WARNING [train.py:1204] (2/4) Exclude cut with ID _42863_210_9070_1_1533902398788_3696920_488-230146-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 07:46:56,731 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_387-247159-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 07:46:58,131 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6729_210_2550_1_1528032481_1788725_261-42619-0_sp1.1 from training. Duration: 0.97275 2023-10-04 07:46:59,522 WARNING [train.py:1204] (2/4) Exclude cut with ID _19896_210_1883_1_1531709022662_6649569_831-44724-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 07:47:01,736 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.2.encoder.layers.0.nonlin_attention.whiten2, num_groups=1, num_channels=384, metric=7.88 vs. limit=15.0 2023-10-04 07:47:02,294 WARNING [train.py:1204] (2/4) Exclude cut with ID _55153_210_2253_1_1534384786677_3162096_280-285944-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 07:47:02,317 WARNING [train.py:1204] (2/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_448-4048-0 from training. Duration: 0.96 2023-10-04 07:47:02,374 WARNING [train.py:1204] (2/4) Exclude cut with ID _29678_210_7893_1_1532421042787_3906118_456-202548-0_sp1.1 from training. Duration: 0.97275 2023-10-04 07:47:02,392 WARNING [train.py:1204] (2/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_107-332398-0_sp1.1 from training. Duration: 0.7090625 2023-10-04 07:47:06,903 WARNING [train.py:1204] (2/4) Exclude cut with ID _13921_210_9994_1_1530838702800_3003429_46-149444-0_sp1.1 from training. Duration: 0.8545625 2023-10-04 07:47:06,942 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_257-20733-0_sp0.9 from training. Duration: 0.8444375 2023-10-04 07:47:08,274 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_53753_210_7234_1_1534320003_3594148_385-234809-0_sp1.1 from training. Duration: 0.963625 2023-10-04 07:47:08,325 WARNING [train.py:1204] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_292-215346-0_sp1.1 from training. Duration: 0.8818125 2023-10-04 07:47:11,348 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.conv_module2.balancer2.prob, batch_count=1578760.0, ans=0.125 2023-10-04 07:47:13,863 WARNING [train.py:1204] (2/4) Exclude cut with ID _68965_210_1883_1_1535698962272_3497490_196-124268-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 07:47:13,896 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_23801_210_16223_1_1532066392_3294657_570-5439-0_sp1.1 from training. Duration: 0.8818125 2023-10-04 07:47:20,236 WARNING [train.py:1204] (2/4) Exclude cut with ID _5166_210_2084_1_1527073197160_4087993_349-6928-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 07:47:20,285 WARNING [train.py:1204] (2/4) Exclude cut with ID _60621_210_17071_1_1535013053530_3671739_226-255202-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 07:47:20,289 WARNING [train.py:1204] (2/4) Exclude cut with ID _41817_210_18835_1_1533434477160_7423049_448-130302-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 07:47:21,717 WARNING [train.py:1204] (2/4) Exclude cut with ID _6653_210_2629_1_1527919172441_6732679_758-143677-0_sp1.1 from training. Duration: 0.8909375 2023-10-04 07:47:21,765 WARNING [train.py:1204] (2/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_912-47473-0_sp0.9 from training. Duration: 0.7555625 2023-10-04 07:47:23,059 WARNING [train.py:1204] (2/4) Exclude cut with ID _6694_210_4599_1_1527852637490_3965529_356-169233-0_sp1.1 from training. Duration: 0.9 2023-10-04 07:47:23,141 WARNING [train.py:1204] (2/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_1-99680-0 from training. Duration: 0.67 2023-10-04 07:47:24,555 WARNING [train.py:1204] (2/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_199-86432-0_sp1.1 from training. Duration: 0.8909375 2023-10-04 07:47:24,577 WARNING [train.py:1204] (2/4) Exclude cut with ID _61148_210_6897_1_1535094022726_4217610_362-182430-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 07:47:26,399 WARNING [train.py:1204] (2/4) Exclude cut with ID _49376_210_5986_1_1533985278732_3370740_301-154763-0 from training. Duration: 0.98 2023-10-04 07:47:27,794 WARNING [train.py:1204] (2/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_572-185150-0_sp1.1 from training. Duration: 0.8818125 2023-10-04 07:47:30,495 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.590e+02 2.029e+02 2.330e+02 2.767e+02 4.279e+02, threshold=4.661e+02, percent-clipped=0.0 2023-10-04 07:47:32,518 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_47896_210_20508_1_1534492518_5958868_558-314572-0_sp1.1 from training. Duration: 0.8818125 2023-10-04 07:47:33,797 INFO [train.py:1046] (2/4) Epoch 45, batch 3100, loss[loss=0.1565, simple_loss=0.2424, pruned_loss=0.03528, over 24618.00 frames. ], tot_loss[loss=0.1547, simple_loss=0.2347, pruned_loss=0.03736, over 4702550.18 frames. ], batch size: 68, lr: 2.25e-03, grad_scale: 16.0 2023-10-04 07:47:33,904 WARNING [train.py:1204] (2/4) Exclude cut with ID _45560_210_21982_1_1533812603951_3584999_111-6376-0_sp0.9 from training. Duration: 0.9333125 2023-10-04 07:47:35,547 WARNING [train.py:1204] (2/4) Exclude cut with ID _14170_210_5219_1_1530525949868_4469650_412-317077-0_sp1.1 from training. Duration: 0.6818125 2023-10-04 07:47:38,221 WARNING [train.py:1204] (2/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_718-68505-0 from training. Duration: 0.89 2023-10-04 07:47:41,004 WARNING [train.py:1204] (2/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_77-311404-0 from training. Duration: 0.94 2023-10-04 07:47:42,824 WARNING [train.py:1204] (2/4) Exclude cut with ID _60229_210_27767_1_1534838406158_3655350_264-279573-0 from training. Duration: 0.83 2023-10-04 07:47:45,959 WARNING [train.py:1204] (2/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_106-300985-0_sp1.1 from training. Duration: 0.8181875 2023-10-04 07:47:48,772 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_4183_210_2087_1_1526729100_1869838_79-277189-0_sp1.1 from training. Duration: 0.963625 2023-10-04 07:47:48,781 WARNING [train.py:1204] (2/4) Exclude cut with ID _28427_210_4220_1_1532581240967_3061069_195-295268-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 07:47:51,593 WARNING [train.py:1204] (2/4) Exclude cut with ID _4092_210_2151_1_1526643044449_3135759_356-278694-0_sp0.9 from training. Duration: 0.52225 2023-10-04 07:47:55,668 WARNING [train.py:1204] (2/4) Exclude cut with ID _14459_210_2228_1_1531443641946_7516700_777-71446-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 07:48:01,540 WARNING [train.py:1204] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_520-126729-0 from training. Duration: 0.7 2023-10-04 07:48:06,143 WARNING [train.py:1204] (2/4) Exclude cut with ID _13684_210_7497_1_1530619354863_3598359_312-235606-0_sp0.9 from training. Duration: 0.6666875 2023-10-04 07:48:06,174 WARNING [train.py:1204] (2/4) Exclude cut with ID _37537_210_2253_1_1533171758568_3421949_283-349402-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 07:48:06,210 WARNING [train.py:1204] (2/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_29-117932-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 07:48:06,235 WARNING [train.py:1204] (2/4) Exclude cut with ID _29303_210_2575_1_1532392237303_7410649_721-283351-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 07:48:06,286 WARNING [train.py:1204] (2/4) Exclude cut with ID _14886_210_4220_1_1530702020355_3516190_175-229073-0_sp0.9 from training. Duration: 0.5 2023-10-04 07:48:08,966 WARNING [train.py:1204] (2/4) Exclude cut with ID _15458_210_8341_1_1530858614263_3834410_70-268830-0_sp1.1 from training. Duration: 0.97275 2023-10-04 07:48:08,992 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_2295_210_2087_1_1525500993_2407863_234-83104-0 from training. Duration: 0.62 2023-10-04 07:48:08,993 WARNING [train.py:1204] (2/4) Exclude cut with ID _22961_210_8254_1_1531976366672_4278180_317-199546-0_sp1.1 from training. Duration: 0.7818125 2023-10-04 07:48:10,407 WARNING [train.py:1204] (2/4) Exclude cut with ID _27894_210_12392_1_1532415496078_6902439_547-196022-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 07:48:10,610 INFO [scaling.py:1118] (2/4) WithLoss: name=encoder.encoders.0.layers.0.self_attn_weights, loss-sum=0.000e+00 2023-10-04 07:48:11,892 WARNING [train.py:1204] (2/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_46-207599-0 from training. Duration: 0.64 2023-10-04 07:48:13,779 WARNING [train.py:1204] (2/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_426-326246-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 07:48:16,453 WARNING [train.py:1204] (2/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_445-117398-0_sp0.9 from training. Duration: 0.988875 2023-10-04 07:48:16,506 WARNING [train.py:1204] (2/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_563-347832-0 from training. Duration: 0.89 2023-10-04 07:48:18,373 WARNING [train.py:1204] (2/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_252-216674-0 from training. Duration: 0.54 2023-10-04 07:48:18,447 WARNING [train.py:1204] (2/4) Exclude cut with ID _59611_210_8895_1_1534741198240_3650361_187-212779-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 07:48:19,762 WARNING [train.py:1204] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_282-159367-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 07:48:21,204 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_68702_210_23689_1_1535799577_3420068_33-136883-0_sp1.1 from training. Duration: 0.936375 2023-10-04 07:48:21,216 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_47228_210_4983_1_1533780021_3672027_184-293706-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 07:48:21,242 WARNING [train.py:1204] (2/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_656-188361-0_sp0.9 from training. Duration: 0.9666875 2023-10-04 07:48:21,505 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.conv_skip_rate, batch_count=1579093.3333333333, ans=0.0 2023-10-04 07:48:22,655 WARNING [train.py:1204] (2/4) Exclude cut with ID _4866_210_2577_1_1527404412284_4364659_352-157930-0_sp0.9 from training. Duration: 0.9 2023-10-04 07:48:22,655 WARNING [train.py:1204] (2/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_210-106047-0_sp1.1 from training. Duration: 0.8818125 2023-10-04 07:48:25,333 WARNING [train.py:1204] (2/4) Exclude cut with ID _9389_210_1664_1_1529217337301_5289269_289-213417-0_sp1.1 from training. Duration: 0.8090625 2023-10-04 07:48:25,360 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_3910_210_1794_1_1526777667_3747260_178-41186-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 07:48:25,374 WARNING [train.py:1204] (2/4) Exclude cut with ID _27052_210_5986_1_1532329197187_3585009_24-172263-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 07:48:25,377 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_5522_210_3273_1_1527249606_3909096_318-203153-0_sp1.1 from training. Duration: 0.3909375 2023-10-04 07:48:29,982 WARNING [train.py:1204] (2/4) Exclude cut with ID _27894_210_12392_1_1532415496078_6902439_222-51982-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 07:48:31,883 WARNING [train.py:1204] (2/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_87-131427-0 from training. Duration: 0.78 2023-10-04 07:48:32,639 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.2.encoder.layers.2.feed_forward2.out_whiten, num_groups=1, num_channels=384, metric=4.41 vs. limit=15.0 2023-10-04 07:48:34,700 WARNING [train.py:1204] (2/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_372-298093-0_sp0.9 from training. Duration: 0.888875 2023-10-04 07:48:34,771 WARNING [train.py:1204] (2/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_663-166973-0 from training. Duration: 0.8 2023-10-04 07:48:34,801 WARNING [train.py:1204] (2/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_429-177469-0_sp1.1 from training. Duration: 0.936375 2023-10-04 07:48:36,083 WARNING [train.py:1204] (2/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_724-172651-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 07:48:36,138 WARNING [train.py:1204] (2/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_103-336064-0 from training. Duration: 0.51 2023-10-04 07:48:40,959 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.3.encoder.layers.0.nonlin_attention.whiten2, num_groups=1, num_channels=512, metric=9.26 vs. limit=15.0 2023-10-04 07:48:45,007 WARNING [train.py:1204] (2/4) Exclude cut with ID _48677_210_5663_1_1533902405919_3892989_111-11439-0 from training. Duration: 0.96 2023-10-04 07:48:48,176 INFO [train.py:1046] (2/4) Epoch 45, batch 3150, loss[loss=0.122, simple_loss=0.1791, pruned_loss=0.03249, over 19106.00 frames. ], tot_loss[loss=0.1533, simple_loss=0.2329, pruned_loss=0.03685, over 4676375.98 frames. ], batch size: 388, lr: 2.25e-03, grad_scale: 16.0 2023-10-04 07:48:48,310 WARNING [train.py:1204] (2/4) Exclude cut with ID _8085_210_4557_1_1528455633493_3716100_95-103398-0_sp1.1 from training. Duration: 0.963625 2023-10-04 07:48:48,365 WARNING [train.py:1204] (2/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_1041-110837-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 07:48:51,067 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_50050_210_16223_1_1535435796_7290138_1155-121757-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 07:48:51,069 WARNING [train.py:1204] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_602-275260-0_sp0.9 from training. Duration: 0.911125 2023-10-04 07:48:52,410 WARNING [train.py:1204] (2/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_265-164486-0 from training. Duration: 0.96 2023-10-04 07:48:52,525 WARNING [train.py:1204] (2/4) Exclude cut with ID _46067_210_4220_1_1534125571066_3307450_142-229375-0_sp1.1 from training. Duration: 0.963625 2023-10-04 07:48:52,564 WARNING [train.py:1204] (2/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_310-26186-0_sp1.1 from training. Duration: 0.636375 2023-10-04 07:48:53,931 WARNING [train.py:1204] (2/4) Exclude cut with ID _28461_210_2253_1_1532771775189_3041299_136-270644-0 from training. Duration: 0.93 2023-10-04 07:48:55,340 WARNING [train.py:1204] (2/4) Exclude cut with ID _14860_210_4915_1_1530702020340_7343240_438-286372-0_sp1.1 from training. Duration: 0.936375 2023-10-04 07:48:56,956 WARNING [train.py:1204] (2/4) Exclude cut with ID IC0128W0004-63309 from training. Duration: 0.831 2023-10-04 07:49:01,900 WARNING [train.py:1204] (2/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_135-174080-0 from training. Duration: 0.53 2023-10-04 07:49:01,947 WARNING [train.py:1204] (2/4) Exclude cut with ID _41326_210_5854_1_1533778076509_3492080_277-59511-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 07:49:03,339 WARNING [train.py:1204] (2/4) Exclude cut with ID IC0144W0112-71379 from training. Duration: 0.992 2023-10-04 07:49:03,401 WARNING [train.py:1204] (2/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_711-15688-0_sp0.9 from training. Duration: 0.47775 2023-10-04 07:49:06,104 WARNING [train.py:1204] (2/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_237-332027-0 from training. Duration: 0.95 2023-10-04 07:49:06,161 WARNING [train.py:1204] (2/4) Exclude cut with ID _7970_210_1883_1_1528455616296_3472230_25-178407-0 from training. Duration: 0.71 2023-10-04 07:49:06,163 WARNING [train.py:1204] (2/4) Exclude cut with ID _8480_210_1874_1_1528589391205_6728330_309-139896-0 from training. Duration: 0.81 2023-10-04 07:49:06,185 WARNING [train.py:1204] (2/4) Exclude cut with ID _39571_210_13449_1_1533801584143_3651077_319-268877-0_sp1.1 from training. Duration: 0.936375 2023-10-04 07:49:06,188 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_43802_210_2290_1_1533516203_8829504_155-246880-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 07:49:07,581 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_566-122599-0_sp1.1 from training. Duration: 0.936375 2023-10-04 07:49:10,223 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_12868_210_6753_1_1530701932_530275_18-76322-0 from training. Duration: 0.87 2023-10-04 07:49:11,621 WARNING [train.py:1204] (2/4) Exclude cut with ID _56494_210_4220_1_1535284837625_3033169_159-106035-0_sp1.1 from training. Duration: 0.963625 2023-10-04 07:49:11,656 WARNING [train.py:1204] (2/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_98-276013-0_sp1.1 from training. Duration: 0.963625 2023-10-04 07:49:13,376 WARNING [train.py:1204] (2/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_330-216124-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 07:49:14,776 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_177-58005-0_sp0.9 from training. Duration: 0.688875 2023-10-04 07:49:16,443 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.balancer2.prob, batch_count=1579360.0, ans=0.125 2023-10-04 07:49:18,185 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.3.encoder.layers.2.whiten, num_groups=1, num_channels=512, metric=4.31 vs. limit=12.0 2023-10-04 07:49:20,238 WARNING [train.py:1204] (2/4) Exclude cut with ID _56235_210_10640_1_1534557335235_4161099_343-139898-0 from training. Duration: 0.96 2023-10-04 07:49:20,306 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6961_210_2087_1_1527937168_6815011_395-185060-0_sp1.1 from training. Duration: 0.863625 2023-10-04 07:49:22,402 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7002_210_1794_1_1527939683_5157286_184-229630-0_sp0.9 from training. Duration: 0.82225 2023-10-04 07:49:23,715 WARNING [train.py:1204] (2/4) Exclude cut with ID _59170_210_6721_1_1534983633960_4203335_153-18895-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 07:49:23,760 WARNING [train.py:1204] (2/4) Exclude cut with ID _49643_210_7989_1_1534640244224_3939031_598-161926-0 from training. Duration: 0.9 2023-10-04 07:49:26,510 WARNING [train.py:1204] (2/4) Exclude cut with ID _4068_210_2629_1_1526697350395_1546259_224-288693-0 from training. Duration: 0.59 2023-10-04 07:49:27,845 WARNING [train.py:1204] (2/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_718-68505-0_sp1.1 from training. Duration: 0.8090625 2023-10-04 07:49:27,887 WARNING [train.py:1204] (2/4) Exclude cut with ID _4068_210_2629_1_1526697350395_1546259_224-288693-0_sp0.9 from training. Duration: 0.6555625 2023-10-04 07:49:27,906 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_2296_210_2087_1_1525503448_3397335_57-185430-0_sp0.9 from training. Duration: 0.6666875 2023-10-04 07:49:29,321 WARNING [train.py:1204] (2/4) Exclude cut with ID _15458_210_8341_1_1530858614263_3834410_49-235472-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 07:49:29,324 WARNING [train.py:1204] (2/4) Exclude cut with ID _64135_210_2253_1_1535677267876_7608972_498-5039-0_sp0.9 from training. Duration: 0.8666875 2023-10-04 07:49:31,214 WARNING [train.py:1204] (2/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_283-220372-0_sp0.9 from training. Duration: 0.62225 2023-10-04 07:49:31,223 WARNING [train.py:1204] (2/4) Exclude cut with ID _6473_210_1741_1_1527901307396_3815380_108-17335-0_sp0.9 from training. Duration: 0.72225 2023-10-04 07:49:33,010 WARNING [train.py:1204] (2/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_113-92608-0 from training. Duration: 0.54 2023-10-04 07:49:33,063 WARNING [train.py:1204] (2/4) Exclude cut with ID _14170_210_5219_1_1530525949868_4469650_272-249985-0_sp1.1 from training. Duration: 0.6090625 2023-10-04 07:49:33,079 WARNING [train.py:1204] (2/4) Exclude cut with ID _50376_210_13401_1_1534031502101_4217740_518-187390-0_sp1.1 from training. Duration: 0.97275 2023-10-04 07:49:35,744 WARNING [train.py:1204] (2/4) Exclude cut with ID _59738_210_15379_1_1534849512562_3941470_165-248260-0_sp1.1 from training. Duration: 0.92725 2023-10-04 07:49:35,756 WARNING [train.py:1204] (2/4) Exclude cut with ID _23818_210_6228_1_1531917063884_3676920_400-343925-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 07:49:35,818 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6961_210_2087_1_1527937168_6815011_562-165518-0 from training. Duration: 0.77 2023-10-04 07:49:35,885 WARNING [train.py:1204] (2/4) Exclude cut with ID _24271_210_12658_1_1531987270022_7190519_289-207931-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 07:49:38,657 WARNING [train.py:1204] (2/4) Exclude cut with ID _14632_210_9988_1_1530691279392_3923200_332-105904-0 from training. Duration: 0.9 2023-10-04 07:49:38,676 WARNING [train.py:1204] (2/4) Exclude cut with ID _40402_210_18219_1_1533522324115_2953150_203-232438-0_sp1.1 from training. Duration: 0.97275 2023-10-04 07:49:40,144 WARNING [train.py:1204] (2/4) Exclude cut with ID _13252_210_1925_1_1530671372047_3336909_263-168388-0 from training. Duration: 0.86 2023-10-04 07:49:40,216 WARNING [train.py:1204] (2/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_174-205058-0 from training. Duration: 0.95 2023-10-04 07:49:40,450 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.conv_skip_rate, batch_count=1579426.6666666667, ans=0.0 2023-10-04 07:49:41,601 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_94-195801-0_sp1.1 from training. Duration: 0.8545625 2023-10-04 07:49:41,623 WARNING [train.py:1204] (2/4) Exclude cut with ID _55153_210_2253_1_1534384786677_3162096_269-103204-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 07:49:41,696 WARNING [train.py:1204] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_748-36150-0 from training. Duration: 0.91 2023-10-04 07:49:44,330 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_12868_210_6753_1_1530702479_3610564_148-172861-0_sp1.1 from training. Duration: 0.4545625 2023-10-04 07:49:44,374 WARNING [train.py:1204] (2/4) Exclude cut with ID _67499_210_17282_1_1535509805220_3692430_460-77730-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 07:49:47,810 WARNING [train.py:1204] (2/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_374-20753-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 07:49:49,160 WARNING [train.py:1204] (2/4) Exclude cut with ID _45329_210_7005_1_1534157951211_6774339_374-289983-0_sp1.1 from training. Duration: 0.97275 2023-10-04 07:49:50,883 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_34886_210_10633_1_1533203882_1755661_76-156096-0_sp0.9 from training. Duration: 0.9666875 2023-10-04 07:49:53,868 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_238-256668-0_sp0.9 from training. Duration: 0.7666875 2023-10-04 07:49:55,270 WARNING [train.py:1204] (2/4) Exclude cut with ID _46683_210_9642_1_1533730913492_4591116_489-146727-0_sp1.1 from training. Duration: 0.97275 2023-10-04 07:49:57,969 WARNING [train.py:1204] (2/4) Exclude cut with ID _13938_210_9988_1_1530518718978_3693960_304-112784-0_sp0.9 from training. Duration: 0.511125 2023-10-04 07:49:59,211 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.632e+02 1.953e+02 2.189e+02 2.513e+02 3.532e+02, threshold=4.378e+02, percent-clipped=0.0 2023-10-04 07:50:02,608 INFO [train.py:1046] (2/4) Epoch 45, batch 3200, loss[loss=0.1792, simple_loss=0.2661, pruned_loss=0.04611, over 23938.00 frames. ], tot_loss[loss=0.1529, simple_loss=0.2321, pruned_loss=0.03688, over 4682917.09 frames. ], batch size: 86, lr: 2.25e-03, grad_scale: 32.0 2023-10-04 07:50:04,140 WARNING [train.py:1204] (2/4) Exclude cut with ID _24271_210_12658_1_1531987270022_7190519_243-46549-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 07:50:04,144 WARNING [train.py:1204] (2/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_78-182912-0_sp1.1 from training. Duration: 0.536375 2023-10-04 07:50:08,840 WARNING [train.py:1204] (2/4) Exclude cut with ID _57572_210_5517_1_1534921219624_3809484_121-321334-0_sp1.1 from training. Duration: 0.97275 2023-10-04 07:50:10,266 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_40047_210_2290_1_1533290519_2374311_254-102149-0_sp1.1 from training. Duration: 0.963625 2023-10-04 07:50:10,267 WARNING [train.py:1204] (2/4) Exclude cut with ID _28461_210_2253_1_1532771775189_3041299_96-152158-0 from training. Duration: 0.85 2023-10-04 07:50:10,445 WARNING [train.py:1204] (2/4) Exclude cut with ID _29877_210_6228_1_1532397791106_5276149_161-102979-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 07:50:14,551 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_3787_210_2040_1_1526477544_5478586_603-120324-0_sp0.9 from training. Duration: 0.9 2023-10-04 07:50:16,707 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.conv_skip_rate, batch_count=1579626.6666666667, ans=0.0 2023-10-04 07:50:17,859 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_41750_210_1960_1_1533520678_3948042_247-213456-0_sp1.1 from training. Duration: 0.97275 2023-10-04 07:50:25,364 WARNING [train.py:1204] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_127-119109-0_sp0.9 from training. Duration: 0.8 2023-10-04 07:50:26,905 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.nonlin_attention.balancer.prob, batch_count=1579626.6666666667, ans=0.125 2023-10-04 07:50:33,211 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.3.encoder.layers.1.self_attn2.whiten, num_groups=1, num_channels=512, metric=13.76 vs. limit=22.5 2023-10-04 07:50:35,783 WARNING [train.py:1204] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_215-202973-0 from training. Duration: 0.88 2023-10-04 07:50:35,853 WARNING [train.py:1204] (2/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_17-320491-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 07:50:39,918 WARNING [train.py:1204] (2/4) Exclude cut with ID _5166_210_2084_1_1527073197160_4087993_310-114452-0 from training. Duration: 0.59 2023-10-04 07:50:39,996 WARNING [train.py:1204] (2/4) Exclude cut with ID _15133_210_6228_1_1530794512074_3080379_29-264458-0_sp0.9 from training. Duration: 0.6444375 2023-10-04 07:50:44,663 WARNING [train.py:1204] (2/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_159-281310-0_sp1.1 from training. Duration: 0.863625 2023-10-04 07:50:44,682 WARNING [train.py:1204] (2/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_195-167011-0_sp0.9 from training. Duration: 0.8333125 2023-10-04 07:50:44,745 WARNING [train.py:1204] (2/4) Exclude cut with ID _14087_210_2253_1_1530529224512_3014029_156-257607-0_sp1.1 from training. Duration: 0.7545625 2023-10-04 07:50:48,932 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_2295_210_2087_1_1525500993_2407863_116-238894-0 from training. Duration: 0.7 2023-10-04 07:50:50,318 WARNING [train.py:1204] (2/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_321-335475-0_sp0.9 from training. Duration: 0.5 2023-10-04 07:50:52,323 WARNING [train.py:1204] (2/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_188-114464-0 from training. Duration: 0.82 2023-10-04 07:50:52,627 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.nonlin_attention.balancer.prob, batch_count=1579760.0, ans=0.125 2023-10-04 07:50:54,993 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_2592_210_2290_1_1525697025_4882925_157-70512-0 from training. Duration: 0.91 2023-10-04 07:50:55,125 WARNING [train.py:1204] (2/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_138-64210-0_sp1.1 from training. Duration: 0.663625 2023-10-04 07:51:01,108 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_11305_210_1794_1_1529750428_7178148_651-95702-0_sp1.1 from training. Duration: 0.963625 2023-10-04 07:51:01,130 WARNING [train.py:1204] (2/4) Exclude cut with ID _14224_210_4381_1_1530529062037_4264010_379-184223-0_sp0.9 from training. Duration: 0.9444375 2023-10-04 07:51:01,148 WARNING [train.py:1204] (2/4) Exclude cut with ID _8394_210_6702_1_1528590504316_3540189_381-125325-0_sp1.1 from training. Duration: 0.963625 2023-10-04 07:51:02,451 WARNING [train.py:1204] (2/4) Exclude cut with ID IC0622W0021-307659 from training. Duration: 0.896 2023-10-04 07:51:02,453 WARNING [train.py:1204] (2/4) Exclude cut with ID _7624_210_1531_1_1528109802198_4933101_418-349834-0_sp1.1 from training. Duration: 0.5454375 2023-10-04 07:51:07,077 WARNING [train.py:1204] (2/4) Exclude cut with ID _49728_210_12651_1_1534041034294_6788150_607-3416-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 07:51:08,430 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8275_210_4198_1_1528455320_3892571_491-17707-0 from training. Duration: 0.64 2023-10-04 07:51:09,756 WARNING [train.py:1204] (2/4) Exclude cut with ID _5287_210_2084_1_1527418883040_3878670_333-48508-0 from training. Duration: 0.6 2023-10-04 07:51:11,127 WARNING [train.py:1204] (2/4) Exclude cut with ID _8365_210_2064_1_1528592311070_4677690_371-122295-0 from training. Duration: 0.55 2023-10-04 07:51:11,384 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.conv_module2.balancer2.prob, batch_count=1579826.6666666667, ans=0.125 2023-10-04 07:51:12,658 WARNING [train.py:1204] (2/4) Exclude cut with ID _14860_210_4915_1_1530702020340_7343240_836-328489-0 from training. Duration: 0.9 2023-10-04 07:51:15,327 WARNING [train.py:1204] (2/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_1033-274376-0_sp0.9 from training. Duration: 0.9666875 2023-10-04 07:51:16,276 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.0.layers.0.self_attn_weights.whiten_keys, num_groups=4, num_channels=128, metric=2.77 vs. limit=6.0 2023-10-04 07:51:17,260 INFO [train.py:1046] (2/4) Epoch 45, batch 3250, loss[loss=0.1686, simple_loss=0.2394, pruned_loss=0.04892, over 23611.00 frames. ], tot_loss[loss=0.1531, simple_loss=0.2318, pruned_loss=0.03722, over 4669547.35 frames. ], batch size: 256, lr: 2.25e-03, grad_scale: 32.0 2023-10-04 07:51:17,298 WARNING [train.py:1204] (2/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_475-127455-0_sp0.9 from training. Duration: 0.77775 2023-10-04 07:51:17,304 WARNING [train.py:1204] (2/4) Exclude cut with ID IC0671W0071-331914 from training. Duration: 0.896 2023-10-04 07:51:17,331 WARNING [train.py:1204] (2/4) Exclude cut with ID _30327_210_16049_1_1532484161295_3924169_2-1523-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 07:51:17,333 WARNING [train.py:1204] (2/4) Exclude cut with ID _24314_210_4915_1_1532135078116_7170179_460-208279-0_sp1.1 from training. Duration: 0.97275 2023-10-04 07:51:17,571 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=1579893.3333333333, ans=0.1 2023-10-04 07:51:18,705 WARNING [train.py:1204] (2/4) Exclude cut with ID IC0679W0052-335859 from training. Duration: 0.64 2023-10-04 07:51:22,876 WARNING [train.py:1204] (2/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_3-237434-0_sp1.1 from training. Duration: 0.6909375 2023-10-04 07:51:24,866 WARNING [train.py:1204] (2/4) Exclude cut with ID _69884_210_9771_1_1535846392818_4166169_405-185283-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 07:51:33,800 WARNING [train.py:1204] (2/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_927-83684-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 07:51:33,809 WARNING [train.py:1204] (2/4) Exclude cut with ID _6694_210_4599_1_1527852637490_3965529_356-169233-0 from training. Duration: 0.99 2023-10-04 07:51:35,182 WARNING [train.py:1204] (2/4) Exclude cut with ID _19896_210_1883_1_1531709022662_6649569_562-233001-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 07:51:36,645 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_20120_210_6753_1_1531643035_5482859_354-78629-0_sp1.1 from training. Duration: 0.963625 2023-10-04 07:51:36,646 WARNING [train.py:1204] (2/4) Exclude cut with ID _60135_210_13427_1_1534935557922_3651319_96-168198-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 07:51:36,777 WARNING [train.py:1204] (2/4) Exclude cut with ID _31602_210_5854_1_1532677021456_4458250_249-96604-0_sp1.1 from training. Duration: 0.8090625 2023-10-04 07:51:36,937 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.0.balancer2.prob, batch_count=1579960.0, ans=0.125 2023-10-04 07:51:38,583 WARNING [train.py:1204] (2/4) Exclude cut with ID _14100_210_8341_1_1530529320613_3678069_258-224599-0_sp0.9 from training. Duration: 0.8555625 2023-10-04 07:51:41,512 WARNING [train.py:1204] (2/4) Exclude cut with ID _19708_210_9206_1_1532136650733_4238364_215-160135-0_sp1.1 from training. Duration: 0.97275 2023-10-04 07:51:41,541 WARNING [train.py:1204] (2/4) Exclude cut with ID _21878_210_3525_1_1531738772002_7264330_113-18163-0_sp0.9 from training. Duration: 0.988875 2023-10-04 07:51:41,585 WARNING [train.py:1204] (2/4) Exclude cut with ID _64135_210_2253_1_1535677267876_7608972_552-337340-0_sp1.1 from training. Duration: 0.92725 2023-10-04 07:51:41,607 WARNING [train.py:1204] (2/4) Exclude cut with ID _67725_210_27767_1_1535608756671_5105359_10-12887-0_sp1.1 from training. Duration: 0.97275 2023-10-04 07:51:41,620 WARNING [train.py:1204] (2/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_401-284840-0_sp1.1 from training. Duration: 0.97275 2023-10-04 07:51:42,910 WARNING [train.py:1204] (2/4) Exclude cut with ID _44659_210_2721_1_1534036120061_6614860_483-141285-0_sp1.1 from training. Duration: 0.936375 2023-10-04 07:51:44,257 WARNING [train.py:1204] (2/4) Exclude cut with ID _9461_210_4557_1_1529060514726_3677451_358-332734-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 07:51:45,706 WARNING [train.py:1204] (2/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_788-298907-0_sp1.1 from training. Duration: 0.8090625 2023-10-04 07:51:45,905 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.balancer1.prob, batch_count=1580026.6666666667, ans=0.125 2023-10-04 07:51:48,552 WARNING [train.py:1204] (2/4) Exclude cut with ID _39411_210_5986_1_1533538825030_3343649_354-267196-0_sp1.1 from training. Duration: 0.92725 2023-10-04 07:51:48,572 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_14953_210_2151_1_1530701710_5616165_192-309792-0_sp1.1 from training. Duration: 0.97275 2023-10-04 07:51:50,510 WARNING [train.py:1204] (2/4) Exclude cut with ID _59170_210_6721_1_1534983633960_4203335_290-151255-0_sp1.1 from training. Duration: 0.92725 2023-10-04 07:51:51,793 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_5575_210_1410_1_1527301305_2570170_249-93769-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 07:51:51,803 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_46013_210_7234_1_1533776362_3744032_463-52378-0_sp1.1 from training. Duration: 0.7818125 2023-10-04 07:51:56,143 WARNING [train.py:1204] (2/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_580-93798-0 from training. Duration: 0.56 2023-10-04 07:51:56,204 WARNING [train.py:1204] (2/4) Exclude cut with ID _19514_210_9259_1_1531530071330_5084230_385-13762-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 07:51:57,973 WARNING [train.py:1204] (2/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_458-67321-0_sp1.1 from training. Duration: 0.8818125 2023-10-04 07:51:59,859 WARNING [train.py:1204] (2/4) Exclude cut with ID _41298_210_9019_1_1533430371450_4822143_188-154791-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 07:51:59,935 WARNING [train.py:1204] (2/4) Exclude cut with ID _15037_210_2253_1_1530752294104_3267650_296-192597-0_sp0.9 from training. Duration: 0.9 2023-10-04 07:52:01,957 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.4.encoder.layers.1.conv_module1.whiten, num_groups=1, num_channels=384, metric=5.07 vs. limit=15.0 2023-10-04 07:52:05,382 WARNING [train.py:1204] (2/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_661-264857-0_sp1.1 from training. Duration: 0.8454375 2023-10-04 07:52:12,888 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_34008_210_10431_1_1533345424_8183247_662-317331-0_sp1.1 from training. Duration: 0.8545625 2023-10-04 07:52:12,928 WARNING [train.py:1204] (2/4) Exclude cut with ID _46502_210_13794_1_1533724452198_3657159_241-278329-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 07:52:12,928 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_11349_210_6753_1_1529800861_1101207_26-65602-0 from training. Duration: 0.96 2023-10-04 07:52:12,932 WARNING [train.py:1204] (2/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_448-4048-0_sp1.1 from training. Duration: 0.87275 2023-10-04 07:52:12,939 WARNING [train.py:1204] (2/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_662-275621-0_sp1.1 from training. Duration: 0.3818125 2023-10-04 07:52:14,400 WARNING [train.py:1204] (2/4) Exclude cut with ID _13938_210_9988_1_1530518718978_3693960_256-54454-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 07:52:17,115 WARNING [train.py:1204] (2/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_256-32662-0 from training. Duration: 0.9 2023-10-04 07:52:17,153 WARNING [train.py:1204] (2/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_326-17061-0 from training. Duration: 0.94 2023-10-04 07:52:18,339 WARNING [train.py:1204] (2/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_505-97987-0_sp1.1 from training. Duration: 0.7818125 2023-10-04 07:52:19,673 WARNING [train.py:1204] (2/4) Exclude cut with ID _66873_210_5517_1_1535454136506_4293370_316-294364-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 07:52:21,044 WARNING [train.py:1204] (2/4) Exclude cut with ID _19514_210_9259_1_1531530071330_5084230_762-231063-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 07:52:22,371 WARNING [train.py:1204] (2/4) Exclude cut with ID _4697_210_2069_1_1526815801953_3823098_68-219807-0_sp0.9 from training. Duration: 0.52225 2023-10-04 07:52:22,415 WARNING [train.py:1204] (2/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_479-158053-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 07:52:25,750 WARNING [train.py:1204] (2/4) Exclude cut with ID _66112_210_10635_1_1535590236305_3801484_83-38181-0_sp1.1 from training. Duration: 0.936375 2023-10-04 07:52:27,072 WARNING [train.py:1204] (2/4) Exclude cut with ID _55857_210_2346_1_1534644007831_6898209_255-146346-0_sp1.1 from training. Duration: 0.963625 2023-10-04 07:52:28,381 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.442e+02 1.929e+02 2.141e+02 2.387e+02 3.299e+02, threshold=4.283e+02, percent-clipped=0.0 2023-10-04 07:52:28,479 WARNING [train.py:1204] (2/4) Exclude cut with ID _48836_210_12210_1_1534075848293_3904150_429-35475-0 from training. Duration: 0.89 2023-10-04 07:52:28,491 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_50154_210_13731_1_1534750862_1080961_119-187163-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 07:52:31,155 INFO [train.py:1046] (2/4) Epoch 45, batch 3300, loss[loss=0.1451, simple_loss=0.2295, pruned_loss=0.03032, over 24489.00 frames. ], tot_loss[loss=0.1531, simple_loss=0.2326, pruned_loss=0.03686, over 4688282.94 frames. ], batch size: 66, lr: 2.25e-03, grad_scale: 32.0 2023-10-04 07:52:31,901 WARNING [train.py:1204] (2/4) Exclude cut with ID _8793_210_6310_1_1528542985139_2310009_121-179782-0_sp0.9 from training. Duration: 0.888875 2023-10-04 07:52:31,904 WARNING [train.py:1204] (2/4) Exclude cut with ID _14884_210_2252_1_1530707459689_5561250_591-173406-0 from training. Duration: 0.96 2023-10-04 07:52:34,653 WARNING [train.py:1204] (2/4) Exclude cut with ID _15133_210_6228_1_1530794512074_3080379_54-198193-0_sp1.1 from training. Duration: 0.8545625 2023-10-04 07:52:34,662 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6545_210_2040_1_1527774670_4245258_407-48070-0 from training. Duration: 0.76 2023-10-04 07:52:37,373 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1032-236863-0 from training. Duration: 0.84 2023-10-04 07:52:38,701 WARNING [train.py:1204] (2/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_507-303370-0 from training. Duration: 0.96 2023-10-04 07:52:38,721 WARNING [train.py:1204] (2/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_164-282366-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 07:52:40,367 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.conv_skip_rate, batch_count=1580226.6666666667, ans=0.0 2023-10-04 07:52:41,603 WARNING [train.py:1204] (2/4) Exclude cut with ID _39867_210_9826_1_1533553222030_9344259_312-158727-0_sp1.1 from training. Duration: 0.963625 2023-10-04 07:52:41,677 WARNING [train.py:1204] (2/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_174-205058-0_sp1.1 from training. Duration: 0.863625 2023-10-04 07:52:41,709 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_11305_210_1794_1_1529750428_7178148_166-237358-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 07:52:44,989 WARNING [train.py:1204] (2/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_26-175553-0_sp1.1 from training. Duration: 0.5545625 2023-10-04 07:52:45,016 WARNING [train.py:1204] (2/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_96-290039-0_sp1.1 from training. Duration: 0.5090625 2023-10-04 07:52:45,368 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.out_combiner.scale_min, batch_count=1580293.3333333333, ans=0.2 2023-10-04 07:52:46,503 WARNING [train.py:1204] (2/4) Exclude cut with ID _53219_210_4525_1_1534834836008_4064825_261-101691-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 07:52:47,950 WARNING [train.py:1204] (2/4) Exclude cut with ID _24271_210_12658_1_1531987270022_7190519_96-293390-0_sp1.1 from training. Duration: 0.936375 2023-10-04 07:52:49,394 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.1.feed_forward1.out_proj.dropout_p, batch_count=1580293.3333333333, ans=0.1 2023-10-04 07:52:51,042 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.feed_forward1.out_proj.dropout_p, batch_count=1580293.3333333333, ans=0.1 2023-10-04 07:52:52,116 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_271-234962-0 from training. Duration: 0.79 2023-10-04 07:52:52,192 WARNING [train.py:1204] (2/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_136-30660-0_sp1.1 from training. Duration: 0.97275 2023-10-04 07:52:52,206 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_20120_210_6753_1_1531643035_5482859_625-281046-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 07:52:53,571 WARNING [train.py:1204] (2/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_333-168193-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 07:52:55,428 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9029W0269-509664 from training. Duration: 0.896 2023-10-04 07:52:55,514 WARNING [train.py:1204] (2/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_450-147393-0_sp1.1 from training. Duration: 0.92725 2023-10-04 07:52:55,552 WARNING [train.py:1204] (2/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_643-52007-0_sp1.1 from training. Duration: 0.5181875 2023-10-04 07:52:55,612 WARNING [train.py:1204] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_330-327861-0_sp0.9 from training. Duration: 0.7444375 2023-10-04 07:52:55,619 WARNING [train.py:1204] (2/4) Exclude cut with ID _47790_210_16187_1_1533862814066_4207519_260-317651-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 07:52:56,946 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9044W0010-516654 from training. Duration: 0.896 2023-10-04 07:52:59,174 WARNING [train.py:1204] (2/4) Exclude cut with ID _14087_210_2253_1_1530529224512_3014029_265-118378-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 07:52:59,180 WARNING [train.py:1204] (2/4) Exclude cut with ID _6194_210_1534_1_1527511826160_3941194_321-148641-0_sp0.9 from training. Duration: 0.711125 2023-10-04 07:53:01,798 WARNING [train.py:1204] (2/4) Exclude cut with ID _54871_210_22356_1_1534665798253_3856359_101-169176-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 07:53:01,801 WARNING [train.py:1204] (2/4) Exclude cut with ID 497-129325-0061-62254-0_sp1.1 from training. Duration: 0.97725 2023-10-04 07:53:03,185 WARNING [train.py:1204] (2/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_795-33252-0_sp1.1 from training. Duration: 0.336375 2023-10-04 07:53:03,206 WARNING [train.py:1204] (2/4) Exclude cut with ID _28317_210_12742_1_1532480302381_8510325_168-246176-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 07:53:03,715 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.4.encoder.layers.2.conv_module2.whiten, num_groups=1, num_channels=384, metric=4.31 vs. limit=15.0 2023-10-04 07:53:04,573 WARNING [train.py:1204] (2/4) Exclude cut with ID _7160_210_1876_1_1527940923231_3703799_120-150943-0_sp1.1 from training. Duration: 0.736375 2023-10-04 07:53:07,242 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9084W0005-536844 from training. Duration: 0.768 2023-10-04 07:53:08,718 WARNING [train.py:1204] (2/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_234-260682-0 from training. Duration: 0.84 2023-10-04 07:53:08,749 WARNING [train.py:1204] (2/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_188-114464-0_sp0.9 from training. Duration: 0.911125 2023-10-04 07:53:11,533 WARNING [train.py:1204] (2/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_81-75849-0 from training. Duration: 0.39 2023-10-04 07:53:13,403 WARNING [train.py:1204] (2/4) Exclude cut with ID _14224_210_4381_1_1530529062037_4264010_379-184223-0_sp1.1 from training. Duration: 0.77275 2023-10-04 07:53:16,139 WARNING [train.py:1204] (2/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_454-123002-0_sp1.1 from training. Duration: 0.62725 2023-10-04 07:53:17,467 WARNING [train.py:1204] (2/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_887-249555-0_sp1.1 from training. Duration: 0.7 2023-10-04 07:53:19,011 WARNING [train.py:1204] (2/4) Exclude cut with ID _31088_210_16446_1_1532520012284_3440919_20-260902-0_sp1.1 from training. Duration: 0.97275 2023-10-04 07:53:20,389 WARNING [train.py:1204] (2/4) Exclude cut with ID _45329_210_7005_1_1534157951211_6774339_856-344574-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 07:53:20,402 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_36756_210_7234_1_1533026991_966276_4-164118-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 07:53:20,421 WARNING [train.py:1204] (2/4) Exclude cut with ID _8365_210_2064_1_1528592311070_4677690_371-122295-0_sp1.1 from training. Duration: 0.5 2023-10-04 07:53:21,851 WARNING [train.py:1204] (2/4) Exclude cut with ID _5910_210_1389_1_1527337972454_5050100_555-159815-0_sp1.1 from training. Duration: 0.8909375 2023-10-04 07:53:21,865 WARNING [train.py:1204] (2/4) Exclude cut with ID _37086_210_9038_1_1532999540891_7021250_469-147941-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 07:53:23,201 WARNING [train.py:1204] (2/4) Exclude cut with ID _3067_210_2667_1_1526212974640_4503168_459-173867-0_sp0.9 from training. Duration: 0.97775 2023-10-04 07:53:25,085 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9156W0006-572499 from training. Duration: 0.896 2023-10-04 07:53:25,231 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.0.conv_module2.balancer2.prob, batch_count=1580426.6666666667, ans=0.125 2023-10-04 07:53:26,417 WARNING [train.py:1204] (2/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_277-210798-0 from training. Duration: 0.55 2023-10-04 07:53:28,482 WARNING [train.py:1204] (2/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_103-336064-0_sp1.1 from training. Duration: 0.463625 2023-10-04 07:53:29,622 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_3414_210_1988_1_1526212718_4813195_331-290945-0_sp1.1 from training. Duration: 0.936375 2023-10-04 07:53:29,623 WARNING [train.py:1204] (2/4) Exclude cut with ID _67499_210_17282_1_1535509805220_3692430_365-29276-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 07:53:32,801 WARNING [train.py:1204] (2/4) Exclude cut with ID _14821_210_8750_1_1530680503991_6829359_69-250847-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 07:53:32,803 WARNING [train.py:1204] (2/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_1102-61301-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 07:53:32,908 WARNING [train.py:1204] (2/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_166-246557-0_sp1.1 from training. Duration: 0.5818125 2023-10-04 07:53:32,956 WARNING [train.py:1204] (2/4) Exclude cut with ID _15351_210_10626_1_1530856635653_3576210_214-129918-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 07:53:34,173 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_120-41668-0_sp0.9 from training. Duration: 0.77775 2023-10-04 07:53:34,242 WARNING [train.py:1204] (2/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_27-72278-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 07:53:35,826 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.1.bypass.scale_min, batch_count=1580493.3333333333, ans=0.2 2023-10-04 07:53:36,957 WARNING [train.py:1204] (2/4) Exclude cut with ID _24314_210_4915_1_1532135078116_7170179_521-258868-0_sp1.1 from training. Duration: 0.7181875 2023-10-04 07:53:39,607 WARNING [train.py:1204] (2/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_143-86344-0 from training. Duration: 0.77 2023-10-04 07:53:39,633 WARNING [train.py:1204] (2/4) Exclude cut with ID _53489_210_6228_1_1534298357169_3835970_465-157433-0_sp1.1 from training. Duration: 0.92725 2023-10-04 07:53:39,727 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_248-319353-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 07:53:41,173 WARNING [train.py:1204] (2/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_102-77064-0_sp1.1 from training. Duration: 0.8454375 2023-10-04 07:53:41,395 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.bypass.skip_rate, batch_count=1580493.3333333333, ans=0.07 2023-10-04 07:53:42,451 WARNING [train.py:1204] (2/4) Exclude cut with ID _9389_210_1664_1_1529217337301_5289269_379-128794-0_sp1.1 from training. Duration: 0.77275 2023-10-04 07:53:42,554 WARNING [train.py:1204] (2/4) Exclude cut with ID _3231_210_2106_1_1526205864934_6784220_236-260835-0_sp1.1 from training. Duration: 0.97275 2023-10-04 07:53:45,545 INFO [train.py:1046] (2/4) Epoch 45, batch 3350, loss[loss=0.1517, simple_loss=0.237, pruned_loss=0.03322, over 23407.00 frames. ], tot_loss[loss=0.1543, simple_loss=0.234, pruned_loss=0.0373, over 4680592.54 frames. ], batch size: 106, lr: 2.25e-03, grad_scale: 16.0 2023-10-04 07:53:45,641 WARNING [train.py:1204] (2/4) Exclude cut with ID _24271_210_12658_1_1531987270022_7190519_184-342363-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 07:53:45,642 WARNING [train.py:1204] (2/4) Exclude cut with ID _27894_210_12392_1_1532415496078_6902439_67-221663-0_sp1.1 from training. Duration: 0.92725 2023-10-04 07:53:48,486 WARNING [train.py:1204] (2/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_304-59227-0_sp1.1 from training. Duration: 0.7 2023-10-04 07:53:49,911 WARNING [train.py:1204] (2/4) Exclude cut with ID _58605_210_12210_1_1534723195939_3904259_458-42232-0_sp1.1 from training. Duration: 0.92725 2023-10-04 07:53:51,336 WARNING [train.py:1204] (2/4) Exclude cut with ID _14922_210_6228_1_1530707435273_3693310_13-165512-0_sp0.9 from training. Duration: 0.911125 2023-10-04 07:53:54,131 WARNING [train.py:1204] (2/4) Exclude cut with ID _58602_210_2346_1_1534903210085_6908830_792-102241-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 07:53:55,580 WARNING [train.py:1204] (2/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_588-125165-0_sp0.9 from training. Duration: 0.92225 2023-10-04 07:53:57,484 WARNING [train.py:1204] (2/4) Exclude cut with ID _54218_210_12210_1_1534413646388_4409259_239-98429-0_sp1.1 from training. Duration: 0.97275 2023-10-04 07:53:58,846 WARNING [train.py:1204] (2/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_538-32291-0_sp1.1 from training. Duration: 0.7545625 2023-10-04 07:54:00,810 WARNING [train.py:1204] (2/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_419-317062-0 from training. Duration: 0.91 2023-10-04 07:54:02,121 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9309W0202-640329 from training. Duration: 0.896 2023-10-04 07:54:02,156 WARNING [train.py:1204] (2/4) Exclude cut with ID _3231_210_2106_1_1526205864934_6784220_419-313291-0_sp1.1 from training. Duration: 0.97275 2023-10-04 07:54:06,919 WARNING [train.py:1204] (2/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_619-344499-0 from training. Duration: 0.63 2023-10-04 07:54:06,938 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_278-272612-0 from training. Duration: 0.73 2023-10-04 07:54:08,343 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_103-84010-0_sp0.9 from training. Duration: 0.7666875 2023-10-04 07:54:08,364 WARNING [train.py:1204] (2/4) Exclude cut with ID _26528_210_9070_1_1532570410842_3851310_497-97218-0_sp1.1 from training. Duration: 0.7818125 2023-10-04 07:54:09,649 WARNING [train.py:1204] (2/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_262-84613-0_sp1.1 from training. Duration: 0.936375 2023-10-04 07:54:10,948 WARNING [train.py:1204] (2/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_387-131538-0 from training. Duration: 0.81 2023-10-04 07:54:10,974 WARNING [train.py:1204] (2/4) Exclude cut with ID _8030_210_7497_1_1528444886781_3531089_224-300489-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 07:54:11,004 WARNING [train.py:1204] (2/4) Exclude cut with ID _14349_210_8341_1_1530615710818_3442470_170-336541-0_sp0.9 from training. Duration: 0.9555625 2023-10-04 07:54:11,130 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.0.ff3_skip_rate, batch_count=1580626.6666666667, ans=0.0 2023-10-04 07:54:15,102 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_47069_210_15368_1_1533781267_4431320_180-31746-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 07:54:16,490 WARNING [train.py:1204] (2/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_447-7041-0_sp1.1 from training. Duration: 0.92725 2023-10-04 07:54:17,770 WARNING [train.py:1204] (2/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_2-55283-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 07:54:17,828 WARNING [train.py:1204] (2/4) Exclude cut with ID _12555_210_8750_1_1530075594402_3780239_70-181911-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 07:54:20,408 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.3.encoder.layers.3.conv_module1.whiten, num_groups=1, num_channels=512, metric=4.53 vs. limit=15.0 2023-10-04 07:54:21,177 WARNING [train.py:1204] (2/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_311-328022-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 07:54:22,775 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.bypass_mid.scale_min, batch_count=1580693.3333333333, ans=0.2 2023-10-04 07:54:23,835 WARNING [train.py:1204] (2/4) Exclude cut with ID _14528_210_8254_1_1530669700514_4306939_330-2987-0_sp1.1 from training. Duration: 0.92725 2023-10-04 07:54:23,896 WARNING [train.py:1204] (2/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_385-184858-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 07:54:27,930 WARNING [train.py:1204] (2/4) Exclude cut with ID _64414_210_17301_1_1535243252048_3757759_348-333360-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 07:54:28,012 WARNING [train.py:1204] (2/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_753-214751-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 07:54:30,067 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_17613_210_2151_1_1531137569_2366160_202-208013-0_sp1.1 from training. Duration: 0.92725 2023-10-04 07:54:30,084 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_43831_210_2158_1_1533725502_5872791_610-129300-0_sp1.1 from training. Duration: 0.936375 2023-10-04 07:54:32,758 WARNING [train.py:1204] (2/4) Exclude cut with ID _54218_210_12210_1_1534413646388_4409259_321-23852-0_sp1.1 from training. Duration: 0.936375 2023-10-04 07:54:34,166 WARNING [train.py:1204] (2/4) Exclude cut with ID _39411_210_5986_1_1533538825030_3343649_173-238248-0 from training. Duration: 0.83 2023-10-04 07:54:34,414 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.conv_skip_rate, batch_count=1580760.0, ans=0.0 2023-10-04 07:54:35,467 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_405-339986-0_sp0.9 from training. Duration: 0.5666875 2023-10-04 07:54:36,129 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_2009_210_2071_1_1525481134_4588937_385-302100-0 from training. Duration: 0.87 2023-10-04 07:54:36,164 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_161-276996-0_sp1.1 from training. Duration: 0.863625 2023-10-04 07:54:37,502 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_177-58005-0 from training. Duration: 0.62 2023-10-04 07:54:38,974 WARNING [train.py:1204] (2/4) Exclude cut with ID _64639_210_5816_1_1535367619437_3528000_146-343734-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 07:54:40,358 WARNING [train.py:1204] (2/4) Exclude cut with ID _3845_210_1390_1_1526731326141_3523610_72-61441-0_sp1.1 from training. Duration: 0.92725 2023-10-04 07:54:46,013 WARNING [train.py:1204] (2/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_991-125028-0_sp1.1 from training. Duration: 0.936375 2023-10-04 07:54:46,080 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7544_210_6753_1_1528591327_1471426_172-348777-0 from training. Duration: 0.66 2023-10-04 07:54:46,125 WARNING [train.py:1204] (2/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_416-257730-0_sp1.1 from training. Duration: 0.5909375 2023-10-04 07:54:47,449 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1113-7369-0_sp1.1 from training. Duration: 0.77275 2023-10-04 07:54:49,223 WARNING [train.py:1204] (2/4) Exclude cut with ID _40402_210_18219_1_1533522324115_2953150_242-76943-0_sp1.1 from training. Duration: 0.8545625 2023-10-04 07:54:49,362 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.1.feed_forward1.hidden_balancer.prob, batch_count=1580826.6666666667, ans=0.125 2023-10-04 07:54:54,502 WARNING [train.py:1204] (2/4) Exclude cut with ID _44718_210_5816_1_1534050051869_6469030_190-49648-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 07:54:55,904 WARNING [train.py:1204] (2/4) Exclude cut with ID _47251_210_18628_1_1533814063128_4137250_214-88076-0 from training. Duration: 0.88 2023-10-04 07:54:57,147 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.659e+02 1.935e+02 2.206e+02 2.412e+02 3.759e+02, threshold=4.413e+02, percent-clipped=0.0 2023-10-04 07:54:57,217 WARNING [train.py:1204] (2/4) Exclude cut with ID _5585_210_2667_1_1527418813324_8007340_89-332072-0_sp1.1 from training. Duration: 0.5818125 2023-10-04 07:54:57,255 WARNING [train.py:1204] (2/4) Exclude cut with ID _41817_210_18835_1_1533434477160_7423049_555-293448-0_sp1.1 from training. Duration: 0.72725 2023-10-04 07:54:59,077 INFO [train.py:1046] (2/4) Epoch 45, batch 3400, loss[loss=0.1472, simple_loss=0.2243, pruned_loss=0.03502, over 21072.00 frames. ], tot_loss[loss=0.1545, simple_loss=0.2343, pruned_loss=0.03738, over 4694798.26 frames. ], batch size: 46, lr: 2.25e-03, grad_scale: 16.0 2023-10-04 07:54:59,177 WARNING [train.py:1204] (2/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_286-331314-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 07:54:59,228 WARNING [train.py:1204] (2/4) Exclude cut with ID _13921_210_9994_1_1530838702800_3003429_46-149444-0 from training. Duration: 0.94 2023-10-04 07:55:00,553 WARNING [train.py:1204] (2/4) Exclude cut with ID _44778_210_2054_1_1533798127203_7443090_768-43595-0_sp1.1 from training. Duration: 0.936375 2023-10-04 07:55:00,561 WARNING [train.py:1204] (2/4) Exclude cut with ID _35355_210_16040_1_1533212988977_4016229_74-190076-0 from training. Duration: 0.8 2023-10-04 07:55:00,842 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.conv_module2.balancer1.prob, batch_count=1580893.3333333333, ans=0.125 2023-10-04 07:55:02,001 WARNING [train.py:1204] (2/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_1007-316380-0_sp1.1 from training. Duration: 0.963625 2023-10-04 07:55:02,050 WARNING [train.py:1204] (2/4) Exclude cut with ID _23557_210_8750_1_1531890106370_6846255_150-283437-0_sp1.1 from training. Duration: 0.963625 2023-10-04 07:55:03,904 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_3474_210_2028_1_1526188513_4825246_145-309131-0_sp1.1 from training. Duration: 0.67275 2023-10-04 07:55:05,052 WARNING [train.py:1204] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_391-22148-0_sp1.1 from training. Duration: 0.7 2023-10-04 07:55:05,077 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_238-256668-0 from training. Duration: 0.69 2023-10-04 07:55:09,763 WARNING [train.py:1204] (2/4) Exclude cut with ID _8394_210_6702_1_1528590504316_3540189_372-198878-0 from training. Duration: 0.6 2023-10-04 07:55:09,773 WARNING [train.py:1204] (2/4) Exclude cut with ID ID0174W0006-766659 from training. Duration: 0.953 2023-10-04 07:55:09,789 WARNING [train.py:1204] (2/4) Exclude cut with ID _6274_210_2084_1_1527937583837_7494020_668-23019-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 07:55:13,960 WARNING [train.py:1204] (2/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_322-74328-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 07:55:13,973 WARNING [train.py:1204] (2/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_872-264525-0_sp1.1 from training. Duration: 0.5909375 2023-10-04 07:55:14,024 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_53378_210_17282_1_1534221071_5769509_641-286201-0_sp1.1 from training. Duration: 0.97275 2023-10-04 07:55:16,666 WARNING [train.py:1204] (2/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_199-295611-0_sp1.1 from training. Duration: 0.5 2023-10-04 07:55:20,204 WARNING [train.py:1204] (2/4) Exclude cut with ID _5250_210_5219_1_1527249561806_8692110_1270-317971-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 07:55:21,650 WARNING [train.py:1204] (2/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_784-213099-0 from training. Duration: 0.93 2023-10-04 07:55:25,749 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_115-233929-0_sp0.9 from training. Duration: 0.8 2023-10-04 07:55:28,640 WARNING [train.py:1204] (2/4) Exclude cut with ID _54871_210_22356_1_1534665798253_3856359_256-223809-0_sp1.1 from training. Duration: 0.97275 2023-10-04 07:55:28,679 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_20120_210_6753_1_1531643035_5482859_285-87781-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 07:55:30,469 WARNING [train.py:1204] (2/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_619-344499-0_sp1.1 from training. Duration: 0.57275 2023-10-04 07:55:31,105 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.4.encoder.layers.0.self_attn2.whiten, num_groups=1, num_channels=384, metric=13.89 vs. limit=22.5 2023-10-04 07:55:34,449 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.3.encoder.layers.1.self_attn1.whiten, num_groups=1, num_channels=512, metric=11.94 vs. limit=22.5 2023-10-04 07:55:35,195 WARNING [train.py:1204] (2/4) Exclude cut with ID _60229_210_27767_1_1534838406158_3655350_264-279573-0_sp0.9 from training. Duration: 0.92225 2023-10-04 07:55:39,720 WARNING [train.py:1204] (2/4) Exclude cut with ID _7719_210_3525_1_1528456153163_4296960_333-110399-0 from training. Duration: 0.96 2023-10-04 07:55:43,985 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_50423_210_13270_1_1534128598_5021734_759-90833-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 07:55:45,268 WARNING [train.py:1204] (2/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_151-254798-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 07:55:45,300 WARNING [train.py:1204] (2/4) Exclude cut with ID _3465_210_3525_1_1526175667060_3574049_450-203637-0 from training. Duration: 0.92 2023-10-04 07:55:46,631 WARNING [train.py:1204] (2/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_673-36456-0_sp1.1 from training. Duration: 0.936375 2023-10-04 07:55:46,690 WARNING [train.py:1204] (2/4) Exclude cut with ID _36538_210_9994_1_1533204020363_3795620_131-8685-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 07:55:48,005 WARNING [train.py:1204] (2/4) Exclude cut with ID _12556_210_8341_1_1530081024100_7327690_713-55626-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 07:55:48,051 WARNING [train.py:1204] (2/4) Exclude cut with ID _2322_210_2667_1_1525604548971_3656299_440-80494-0_sp0.9 from training. Duration: 0.97775 2023-10-04 07:55:52,675 WARNING [train.py:1204] (2/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_210-325081-0_sp1.1 from training. Duration: 0.97275 2023-10-04 07:55:55,539 WARNING [train.py:1204] (2/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_375-244976-0_sp0.9 from training. Duration: 0.8333125 2023-10-04 07:55:55,545 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_15517_210_6753_1_1530952293_1664795_119-31001-0_sp1.1 from training. Duration: 0.8545625 2023-10-04 07:55:59,119 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.feed_forward1.hidden_balancer.prob, batch_count=1581160.0, ans=0.125 2023-10-04 07:56:00,217 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_41452_210_7291_1_1533437687_3941281_319-42326-0_sp1.1 from training. Duration: 0.92725 2023-10-04 07:56:00,424 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.attention_skip_rate, batch_count=1581160.0, ans=0.0 2023-10-04 07:56:02,976 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_372-173012-0 from training. Duration: 0.95 2023-10-04 07:56:10,296 WARNING [train.py:1204] (2/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_41-31157-0_sp1.1 from training. Duration: 0.5181875 2023-10-04 07:56:12,553 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.balancer2.prob, batch_count=1581226.6666666667, ans=0.125 2023-10-04 07:56:13,663 INFO [train.py:1046] (2/4) Epoch 45, batch 3450, loss[loss=0.1467, simple_loss=0.2376, pruned_loss=0.0279, over 24690.00 frames. ], tot_loss[loss=0.1549, simple_loss=0.2348, pruned_loss=0.03752, over 4687462.21 frames. ], batch size: 73, lr: 2.25e-03, grad_scale: 16.0 2023-10-04 07:56:14,548 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.1.encoder.layers.1.whiten, num_groups=1, num_channels=256, metric=3.93 vs. limit=12.0 2023-10-04 07:56:15,089 WARNING [train.py:1204] (2/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_91-212486-0 from training. Duration: 0.68 2023-10-04 07:56:17,807 WARNING [train.py:1204] (2/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_1267-272693-0 from training. Duration: 0.73 2023-10-04 07:56:17,834 WARNING [train.py:1204] (2/4) Exclude cut with ID _13938_210_9988_1_1530518718978_3693960_152-67046-0_sp1.1 from training. Duration: 0.936375 2023-10-04 07:56:19,382 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_4528_210_3273_1_1527076654_3276777_342-168068-0_sp1.1 from training. Duration: 0.7909375 2023-10-04 07:56:19,395 WARNING [train.py:1204] (2/4) Exclude cut with ID _7606_210_4915_1_1528894253277_5080690_127-177761-0 from training. Duration: 0.64 2023-10-04 07:56:20,763 WARNING [train.py:1204] (2/4) Exclude cut with ID _45329_210_7005_1_1534157951211_6774339_335-137972-0_sp1.1 from training. Duration: 0.92725 2023-10-04 07:56:25,273 WARNING [train.py:1204] (2/4) Exclude cut with ID _5166_210_2084_1_1527073197160_4087993_331-117829-0_sp1.1 from training. Duration: 0.563625 2023-10-04 07:56:28,233 WARNING [train.py:1204] (2/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_125-125818-0_sp1.1 from training. Duration: 0.87275 2023-10-04 07:56:28,276 WARNING [train.py:1204] (2/4) Exclude cut with ID _6789_210_1534_1_1527854471671_2870519_48-294897-0_sp1.1 from training. Duration: 0.963625 2023-10-04 07:56:29,618 WARNING [train.py:1204] (2/4) Exclude cut with ID _6694_210_4599_1_1527852637490_3965529_116-109398-0_sp1.1 from training. Duration: 0.663625 2023-10-04 07:56:29,628 WARNING [train.py:1204] (2/4) Exclude cut with ID _52892_210_6721_1_1534378941259_4124129_222-172685-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 07:56:31,558 WARNING [train.py:1204] (2/4) Exclude cut with ID _50655_210_18628_1_1534229942193_3772154_320-132275-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 07:56:37,650 WARNING [train.py:1204] (2/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_76-311963-0 from training. Duration: 0.7 2023-10-04 07:56:43,609 WARNING [train.py:1204] (2/4) Exclude cut with ID _6274_210_2084_1_1527937583837_7494020_32-205288-0 from training. Duration: 0.78 2023-10-04 07:56:44,953 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_86-16881-0_sp0.9 from training. Duration: 0.6444375 2023-10-04 07:56:44,997 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_271-297505-0_sp1.1 from training. Duration: 0.9 2023-10-04 07:56:46,405 WARNING [train.py:1204] (2/4) Exclude cut with ID _57572_210_5517_1_1534921219624_3809484_187-295293-0_sp1.1 from training. Duration: 0.963625 2023-10-04 07:56:49,258 WARNING [train.py:1204] (2/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_563-329943-0 from training. Duration: 0.99 2023-10-04 07:56:50,525 WARNING [train.py:1204] (2/4) Exclude cut with ID _7160_210_1876_1_1527940923231_3703799_302-150728-0_sp0.9 from training. Duration: 0.8666875 2023-10-04 07:56:50,781 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.conv_module1.balancer1.min_positive, batch_count=1581360.0, ans=0.025 2023-10-04 07:56:55,056 WARNING [train.py:1204] (2/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_926-7526-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 07:56:56,395 WARNING [train.py:1204] (2/4) Exclude cut with ID _62574_210_2655_1_1535180407058_3933249_467-22348-0_sp1.1 from training. Duration: 0.936375 2023-10-04 07:56:57,716 WARNING [train.py:1204] (2/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_280-161633-0_sp0.9 from training. Duration: 0.811125 2023-10-04 07:56:57,831 WARNING [train.py:1204] (2/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_385-193052-0_sp1.1 from training. Duration: 0.7818125 2023-10-04 07:56:59,793 WARNING [train.py:1204] (2/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_444-28041-0 from training. Duration: 0.85 2023-10-04 07:56:59,798 WARNING [train.py:1204] (2/4) Exclude cut with ID _66070_210_7640_1_1535441393096_7805310_341-249237-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 07:57:01,274 WARNING [train.py:1204] (2/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_425-269826-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 07:57:05,742 WARNING [train.py:1204] (2/4) Exclude cut with ID _53950_210_5816_1_1534417264919_3862951_124-185597-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 07:57:07,206 WARNING [train.py:1204] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_380-204756-0 from training. Duration: 0.86 2023-10-04 07:57:12,001 WARNING [train.py:1204] (2/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_282-257791-0_sp0.9 from training. Duration: 0.9555625 2023-10-04 07:57:14,954 WARNING [train.py:1204] (2/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_372-131163-0_sp1.1 from training. Duration: 0.8545625 2023-10-04 07:57:16,321 WARNING [train.py:1204] (2/4) Exclude cut with ID _28317_210_12742_1_1532480302381_8510325_447-233513-0_sp1.1 from training. Duration: 0.963625 2023-10-04 07:57:19,178 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_54-118333-0_sp1.1 from training. Duration: 0.97275 2023-10-04 07:57:23,260 WARNING [train.py:1204] (2/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_388-230239-0_sp1.1 from training. Duration: 0.963625 2023-10-04 07:57:23,272 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_40047_210_2290_1_1533290519_2374311_141-54639-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 07:57:24,576 WARNING [train.py:1204] (2/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_91-267696-0_sp0.9 from training. Duration: 0.9666875 2023-10-04 07:57:24,619 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_14056_210_4163_1_1530604483_7572827_532-137847-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 07:57:26,407 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.671e+02 1.981e+02 2.112e+02 2.378e+02 3.937e+02, threshold=4.224e+02, percent-clipped=0.0 2023-10-04 07:57:27,802 INFO [train.py:1046] (2/4) Epoch 45, batch 3500, loss[loss=0.1346, simple_loss=0.1875, pruned_loss=0.04089, over 19255.00 frames. ], tot_loss[loss=0.1539, simple_loss=0.2332, pruned_loss=0.03724, over 4674695.32 frames. ], batch size: 390, lr: 2.25e-03, grad_scale: 16.0 2023-10-04 07:57:29,756 WARNING [train.py:1204] (2/4) Exclude cut with ID _8064_210_5399_1_1528372299291_4282973_155-283231-0_sp1.1 from training. Duration: 0.97275 2023-10-04 07:57:32,582 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_2245_210_2715_1_1525511548_3540254_310-52483-0_sp1.1 from training. Duration: 0.8 2023-10-04 07:57:33,441 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.0.layers.0.whiten, num_groups=1, num_channels=192, metric=4.40 vs. limit=12.0 2023-10-04 07:57:33,887 WARNING [train.py:1204] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_185-174805-0 from training. Duration: 0.68 2023-10-04 07:57:35,994 WARNING [train.py:1204] (2/4) Exclude cut with ID _15012_210_1968_1_1530856820429_5095350_662-210469-0_sp1.1 from training. Duration: 0.5181875 2023-10-04 07:57:38,681 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_426-313557-0_sp0.9 from training. Duration: 0.588875 2023-10-04 07:57:41,815 WARNING [train.py:1204] (2/4) Exclude cut with ID _42326_210_7770_1_1533814282531_3707269_407-204343-0_sp1.1 from training. Duration: 0.97275 2023-10-04 07:57:41,829 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_258-310570-0 from training. Duration: 0.82 2023-10-04 07:57:48,646 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_4737_210_2040_1_1526998689_3293008_378-178650-0_sp1.1 from training. Duration: 0.863625 2023-10-04 07:57:48,725 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_47896_210_20508_1_1534492518_5958868_432-327451-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 07:57:48,807 WARNING [train.py:1204] (2/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_188-114464-0_sp1.1 from training. Duration: 0.7454375 2023-10-04 07:57:50,124 WARNING [train.py:1204] (2/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_798-106179-0_sp1.1 from training. Duration: 0.7545625 2023-10-04 07:57:50,155 WARNING [train.py:1204] (2/4) Exclude cut with ID _14368_210_4220_1_1530532714870_3595529_143-117421-0_sp1.1 from training. Duration: 0.52725 2023-10-04 07:57:50,200 WARNING [train.py:1204] (2/4) Exclude cut with ID _67499_210_17282_1_1535509805220_3692430_361-78786-0_sp1.1 from training. Duration: 0.963625 2023-10-04 07:57:51,496 WARNING [train.py:1204] (2/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_712-342563-0_sp1.1 from training. Duration: 0.8818125 2023-10-04 07:57:51,556 WARNING [train.py:1204] (2/4) Exclude cut with ID _9235_210_5219_1_1528891154253_8046120_387-159008-0 from training. Duration: 0.86 2023-10-04 07:57:54,263 WARNING [train.py:1204] (2/4) Exclude cut with ID _33640_210_2655_1_1533117605479_4855469_959-18820-0_sp1.1 from training. Duration: 0.963625 2023-10-04 07:57:54,295 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_12868_210_6753_1_1530702479_3610564_133-564-0_sp0.9 from training. Duration: 0.87775 2023-10-04 07:57:55,644 WARNING [train.py:1204] (2/4) Exclude cut with ID _23514_210_1389_1_1531832139785_3031439_12-29287-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 07:57:58,479 WARNING [train.py:1204] (2/4) Exclude cut with ID _23514_210_1389_1_1531832139785_3031439_489-185052-0_sp1.1 from training. Duration: 0.963625 2023-10-04 07:57:58,543 WARNING [train.py:1204] (2/4) Exclude cut with ID _5444_210_2151_1_1527418808464_5312210_634-194174-0 from training. Duration: 0.97 2023-10-04 07:58:00,477 WARNING [train.py:1204] (2/4) Exclude cut with ID _6275_210_2084_1_1528110062213_7669049_91-26919-0_sp1.1 from training. Duration: 0.8818125 2023-10-04 07:58:03,117 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_37351_210_7925_1_1533012931_3812113_516-82303-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 07:58:05,866 WARNING [train.py:1204] (2/4) Exclude cut with ID _41314_210_12651_1_1533459476543_3766460_80-74184-0_sp0.9 from training. Duration: 0.92225 2023-10-04 07:58:07,761 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_9460_210_4211_1_1528954587_10049667_495-202963-0_sp1.1 from training. Duration: 0.963625 2023-10-04 07:58:09,205 WARNING [train.py:1204] (2/4) Exclude cut with ID _14632_210_9988_1_1530691279392_3923200_332-105904-0_sp1.1 from training. Duration: 0.8181875 2023-10-04 07:58:11,018 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_40764_210_1925_1_1533882359_3752345_493-167886-0_sp1.1 from training. Duration: 0.92725 2023-10-04 07:58:12,580 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_196-289637-0 from training. Duration: 0.75 2023-10-04 07:58:13,922 WARNING [train.py:1204] (2/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_26-175553-0 from training. Duration: 0.61 2023-10-04 07:58:13,968 WARNING [train.py:1204] (2/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_890-5117-0 from training. Duration: 0.78 2023-10-04 07:58:14,021 WARNING [train.py:1204] (2/4) Exclude cut with ID _41314_210_12651_1_1533459476543_3766460_206-306860-0_sp1.1 from training. Duration: 0.92725 2023-10-04 07:58:16,684 WARNING [train.py:1204] (2/4) Exclude cut with ID _25882_210_3036_1_1532775665815_6479510_570-79824-0_sp1.1 from training. Duration: 0.963625 2023-10-04 07:58:16,747 WARNING [train.py:1204] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_731-2428-0_sp1.1 from training. Duration: 0.7545625 2023-10-04 07:58:16,767 WARNING [train.py:1204] (2/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_138-154812-0_sp0.9 from training. Duration: 0.8666875 2023-10-04 07:58:19,087 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.1.encoder.layers.0.self_attn1.whiten, num_groups=1, num_channels=256, metric=14.31 vs. limit=22.5 2023-10-04 07:58:20,795 WARNING [train.py:1204] (2/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_178-39722-0_sp0.9 from training. Duration: 0.5444375 2023-10-04 07:58:20,859 WARNING [train.py:1204] (2/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_1285-256622-0_sp0.9 from training. Duration: 0.9333125 2023-10-04 07:58:25,075 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_41428_210_2161_1_1533774184_3992532_65-271323-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 07:58:25,172 WARNING [train.py:1204] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_308-161067-0 from training. Duration: 0.66 2023-10-04 07:58:25,180 WARNING [train.py:1204] (2/4) Exclude cut with ID _3067_210_2667_1_1526212974640_4503168_459-173867-0 from training. Duration: 0.88 2023-10-04 07:58:25,181 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_2221_210_1988_1_1525597051_4935541_415-296882-0_sp1.1 from training. Duration: 0.7 2023-10-04 07:58:26,667 WARNING [train.py:1204] (2/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_354-192266-0_sp1.1 from training. Duration: 0.936375 2023-10-04 07:58:27,953 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_258-310570-0_sp0.9 from training. Duration: 0.911125 2023-10-04 07:58:29,383 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_548-141039-0_sp1.1 from training. Duration: 0.963625 2023-10-04 07:58:30,742 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_15101_210_7927_1_1530786415_5033274_320-327527-0 from training. Duration: 0.96 2023-10-04 07:58:30,932 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.conv_module2.balancer2.prob, batch_count=1581826.6666666667, ans=0.125 2023-10-04 07:58:32,822 WARNING [train.py:1204] (2/4) Exclude cut with ID _2895_210_2477_1_1525956785794_4063750_289-11475-0_sp0.9 from training. Duration: 0.911125 2023-10-04 07:58:32,956 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7543_210_6753_1_1528543318_4552_1-85072-0_sp1.1 from training. Duration: 0.7545625 2023-10-04 07:58:34,274 WARNING [train.py:1204] (2/4) Exclude cut with ID _64639_210_5816_1_1535367619437_3528000_461-329156-0 from training. Duration: 0.96 2023-10-04 07:58:34,476 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.feed_forward1.out_proj.dropout_p, batch_count=1581826.6666666667, ans=0.1 2023-10-04 07:58:37,479 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_211-339622-0 from training. Duration: 0.45 2023-10-04 07:58:40,708 WARNING [train.py:1204] (2/4) Exclude cut with ID _20455_210_5219_1_1531565996775_8098100_402-112739-0_sp1.1 from training. Duration: 0.963625 2023-10-04 07:58:42,167 INFO [train.py:1046] (2/4) Epoch 45, batch 3550, loss[loss=0.1476, simple_loss=0.2241, pruned_loss=0.0355, over 23618.00 frames. ], tot_loss[loss=0.1533, simple_loss=0.2325, pruned_loss=0.03704, over 4681569.26 frames. ], batch size: 135, lr: 2.25e-03, grad_scale: 16.0 2023-10-04 07:58:42,252 WARNING [train.py:1204] (2/4) Exclude cut with ID _53489_210_6228_1_1534298357169_3835970_256-115565-0_sp1.1 from training. Duration: 0.936375 2023-10-04 07:58:42,278 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_26326_210_7925_1_1533002493_3592406_360-305260-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 07:58:42,301 WARNING [train.py:1204] (2/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_18-204383-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 07:58:45,068 WARNING [train.py:1204] (2/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_306-232480-0_sp1.1 from training. Duration: 0.87275 2023-10-04 07:58:52,201 WARNING [train.py:1204] (2/4) Exclude cut with ID _41161_210_20034_1_1533431249657_3555314_116-299186-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 07:58:53,583 WARNING [train.py:1204] (2/4) Exclude cut with ID _13913_210_9994_1_1530579615187_3383897_473-277682-0_sp1.1 from training. Duration: 0.3545625 2023-10-04 07:58:57,642 WARNING [train.py:1204] (2/4) Exclude cut with ID _58895_210_8750_1_1535184081320_3649089_49-225702-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 07:58:57,720 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_3080_210_2071_1_1526086102_4365166_312-8726-0_sp1.1 from training. Duration: 0.82725 2023-10-04 07:58:59,559 WARNING [train.py:1204] (2/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_489-190175-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 07:59:00,855 WARNING [train.py:1204] (2/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_345-95717-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 07:59:00,874 WARNING [train.py:1204] (2/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_85-138257-0_sp1.1 from training. Duration: 0.6454375 2023-10-04 07:59:04,358 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7046_210_6753_1_1527895337_6920917_783-278109-0_sp1.1 from training. Duration: 0.7 2023-10-04 07:59:04,383 WARNING [train.py:1204] (2/4) Exclude cut with ID _3465_210_3525_1_1526175667060_3574049_450-203637-0_sp1.1 from training. Duration: 0.836375 2023-10-04 07:59:05,701 WARNING [train.py:1204] (2/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_416-132893-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 07:59:06,339 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_2655_210_2087_1_1525688959_3897400_437-337690-0_sp0.9 from training. Duration: 0.7 2023-10-04 07:59:06,416 WARNING [train.py:1204] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_700-183549-0_sp1.1 from training. Duration: 0.6181875 2023-10-04 07:59:06,648 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.attention_skip_rate, batch_count=1581960.0, ans=0.0 2023-10-04 07:59:10,964 WARNING [train.py:1204] (2/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_191-59784-0_sp1.1 from training. Duration: 0.8 2023-10-04 07:59:10,977 WARNING [train.py:1204] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_218-295097-0_sp1.1 from training. Duration: 0.7 2023-10-04 07:59:12,319 WARNING [train.py:1204] (2/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_310-109461-0_sp1.1 from training. Duration: 0.72725 2023-10-04 07:59:12,324 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_33348_210_5632_1_1533086912_3324030_131-321149-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 07:59:13,592 WARNING [train.py:1204] (2/4) Exclude cut with ID _64135_210_2253_1_1535677267876_7608972_133-188266-0_sp1.1 from training. Duration: 0.736375 2023-10-04 07:59:13,618 WARNING [train.py:1204] (2/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_505-205270-0 from training. Duration: 0.38 2023-10-04 07:59:13,628 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_67223_210_4238_1_1535612120_2537431_102-88248-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 07:59:15,031 WARNING [train.py:1204] (2/4) Exclude cut with ID _54871_210_22356_1_1534665798253_3856359_60-235433-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 07:59:16,285 WARNING [train.py:1204] (2/4) Exclude cut with ID _5189_210_4557_1_1527248319428_3769229_199-53526-0_sp1.1 from training. Duration: 0.3818125 2023-10-04 07:59:20,405 WARNING [train.py:1204] (2/4) Exclude cut with ID _45329_210_7005_1_1534157951211_6774339_439-90979-0_sp1.1 from training. Duration: 0.963625 2023-10-04 07:59:21,774 WARNING [train.py:1204] (2/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_1162-132880-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 07:59:23,209 WARNING [train.py:1204] (2/4) Exclude cut with ID _56423_210_5854_1_1534896061049_3165969_220-22317-0_sp1.1 from training. Duration: 0.963625 2023-10-04 07:59:24,720 WARNING [train.py:1204] (2/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_909-64151-0 from training. Duration: 0.94 2023-10-04 07:59:24,788 WARNING [train.py:1204] (2/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_465-156500-0_sp0.9 from training. Duration: 0.8 2023-10-04 07:59:28,205 WARNING [train.py:1204] (2/4) Exclude cut with ID _44304_210_15374_1_1533639352765_4613129_302-22084-0 from training. Duration: 0.86 2023-10-04 07:59:28,259 WARNING [train.py:1204] (2/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_663-166973-0_sp1.1 from training. Duration: 0.72725 2023-10-04 07:59:29,839 WARNING [train.py:1204] (2/4) Exclude cut with ID _45329_210_7005_1_1534157951211_6774339_503-287883-0_sp0.9 from training. Duration: 0.97775 2023-10-04 07:59:31,121 WARNING [train.py:1204] (2/4) Exclude cut with ID _60057_210_10640_1_1534903065793_3333150_362-230377-0_sp0.9 from training. Duration: 0.9666875 2023-10-04 07:59:34,486 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_4183_210_2087_1_1526729100_1869838_143-280962-0 from training. Duration: 0.56 2023-10-04 07:59:36,510 WARNING [train.py:1204] (2/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_281-1313-0_sp1.1 from training. Duration: 0.936375 2023-10-04 07:59:42,621 WARNING [train.py:1204] (2/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_64-215118-0_sp1.1 from training. Duration: 0.936375 2023-10-04 07:59:42,691 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_2822_210_2715_1_1526112586_1999587_165-36656-0 from training. Duration: 0.89 2023-10-04 07:59:43,938 WARNING [train.py:1204] (2/4) Exclude cut with ID _22961_210_8254_1_1531976366672_4278180_402-208830-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 07:59:48,060 WARNING [train.py:1204] (2/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_98-193494-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 07:59:49,437 WARNING [train.py:1204] (2/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_383-285196-0 from training. Duration: 0.97 2023-10-04 07:59:52,804 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.5.encoder.layers.0.self_attn_weights.whiten_keys, num_groups=4, num_channels=128, metric=1.55 vs. limit=6.0 2023-10-04 07:59:54,809 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.552e+02 1.938e+02 2.121e+02 2.490e+02 3.705e+02, threshold=4.241e+02, percent-clipped=0.0 2023-10-04 07:59:54,876 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6767_210_3273_1_1527855783_1718932_142-127132-0 from training. Duration: 0.95 2023-10-04 07:59:54,915 WARNING [train.py:1204] (2/4) Exclude cut with ID _39867_210_9826_1_1533553222030_9344259_1018-339635-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 07:59:56,213 INFO [train.py:1046] (2/4) Epoch 45, batch 3600, loss[loss=0.1433, simple_loss=0.2204, pruned_loss=0.03307, over 24469.00 frames. ], tot_loss[loss=0.1532, simple_loss=0.2327, pruned_loss=0.03684, over 4691694.69 frames. ], batch size: 58, lr: 2.25e-03, grad_scale: 32.0 2023-10-04 07:59:56,296 WARNING [train.py:1204] (2/4) Exclude cut with ID _14603_210_4220_1_1530615688441_3097319_274-274798-0_sp1.1 from training. Duration: 0.8909375 2023-10-04 07:59:58,970 WARNING [train.py:1204] (2/4) Exclude cut with ID _2334_210_2667_1_1525608238880_4705129_376-263393-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 07:59:59,014 WARNING [train.py:1204] (2/4) Exclude cut with ID _29303_210_2575_1_1532392237303_7410649_363-344675-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 08:00:00,354 WARNING [train.py:1204] (2/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_58-68956-0_sp1.1 from training. Duration: 0.7545625 2023-10-04 08:00:04,842 WARNING [train.py:1204] (2/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_709-311571-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 08:00:08,181 WARNING [train.py:1204] (2/4) Exclude cut with ID _46067_210_4220_1_1534125571066_3307450_136-257407-0_sp1.1 from training. Duration: 0.963625 2023-10-04 08:00:09,531 WARNING [train.py:1204] (2/4) Exclude cut with ID _6493_210_1747_1_1528459210424_4012470_324-323854-0_sp1.1 from training. Duration: 0.836375 2023-10-04 08:00:09,584 WARNING [train.py:1204] (2/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_234-260682-0_sp1.1 from training. Duration: 0.763625 2023-10-04 08:00:11,380 WARNING [train.py:1204] (2/4) Exclude cut with ID _36634_210_1794_1_1533296352450_1399884_55-119451-0_sp1.1 from training. Duration: 0.963625 2023-10-04 08:00:11,383 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_40047_210_2290_1_1533290519_2374311_195-165693-0 from training. Duration: 0.89 2023-10-04 08:00:16,178 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_5522_210_3273_1_1527249606_3909096_366-172508-0_sp0.9 from training. Duration: 0.7333125 2023-10-04 08:00:17,562 WARNING [train.py:1204] (2/4) Exclude cut with ID _21878_210_3525_1_1531738772002_7264330_187-10504-0_sp1.1 from training. Duration: 0.963625 2023-10-04 08:00:19,155 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.conv_module1.balancer2.prob, batch_count=1582293.3333333333, ans=0.125 2023-10-04 08:00:20,364 WARNING [train.py:1204] (2/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_282-257791-0_sp1.1 from training. Duration: 0.7818125 2023-10-04 08:00:23,220 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_43831_210_2158_1_1533725502_5872791_467-288776-0_sp1.1 from training. Duration: 0.92725 2023-10-04 08:00:23,309 WARNING [train.py:1204] (2/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_138-154812-0_sp1.1 from training. Duration: 0.7090625 2023-10-04 08:00:23,346 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_1154-185930-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 08:00:23,368 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_2655_210_2087_1_1525688959_3897400_52-279899-0 from training. Duration: 0.69 2023-10-04 08:00:24,631 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_141-89113-0_sp1.1 from training. Duration: 0.7818125 2023-10-04 08:00:26,079 WARNING [train.py:1204] (2/4) Exclude cut with ID _55134_210_21982_1_1534590028377_3882170_297-47310-0_sp1.1 from training. Duration: 0.963625 2023-10-04 08:00:27,385 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_486-151131-0_sp1.1 from training. Duration: 0.77275 2023-10-04 08:00:30,021 WARNING [train.py:1204] (2/4) Exclude cut with ID _44778_210_2054_1_1533798127203_7443090_21-232980-0_sp1.1 from training. Duration: 0.936375 2023-10-04 08:00:31,437 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6165_210_3528_1_1527590982_4379841_338-106380-0_sp1.1 from training. Duration: 0.92725 2023-10-04 08:00:31,669 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.feed_forward3.hidden_balancer.prob, batch_count=1582360.0, ans=0.125 2023-10-04 08:00:32,743 WARNING [train.py:1204] (2/4) Exclude cut with ID _55162_210_17071_1_1534408248583_3753480_355-257218-0_sp1.1 from training. Duration: 0.97275 2023-10-04 08:00:34,128 WARNING [train.py:1204] (2/4) Exclude cut with ID _2895_210_2477_1_1525956785794_4063750_289-11475-0 from training. Duration: 0.82 2023-10-04 08:00:42,162 WARNING [train.py:1204] (2/4) Exclude cut with ID _16756_210_4915_1_1531047803763_7104259_839-183431-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 08:00:43,644 WARNING [train.py:1204] (2/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_427-208901-0_sp0.9 from training. Duration: 0.7555625 2023-10-04 08:00:45,335 WARNING [train.py:1204] (2/4) Exclude cut with ID _36512_210_6987_1_1533033143998_3126069_181-306500-0 from training. Duration: 0.95 2023-10-04 08:00:47,754 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=1582426.6666666667, ans=0.1 2023-10-04 08:00:48,879 WARNING [train.py:1204] (2/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_28-70987-0_sp1.1 from training. Duration: 0.7909375 2023-10-04 08:00:51,813 WARNING [train.py:1204] (2/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_590-58996-0_sp1.1 from training. Duration: 0.936375 2023-10-04 08:00:55,750 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_48730_210_5399_1_1533885886_2212769_108-47561-0_sp1.1 from training. Duration: 0.936375 2023-10-04 08:01:00,061 WARNING [train.py:1204] (2/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_401-158239-0_sp0.9 from training. Duration: 0.788875 2023-10-04 08:01:00,081 WARNING [train.py:1204] (2/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_278-154492-0_sp1.1 from training. Duration: 0.6909375 2023-10-04 08:01:00,086 WARNING [train.py:1204] (2/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_137-116442-0 from training. Duration: 0.78 2023-10-04 08:01:02,836 WARNING [train.py:1204] (2/4) Exclude cut with ID _3541_210_2477_1_1526475408709_3263973_189-317158-0 from training. Duration: 0.9 2023-10-04 08:01:04,193 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_47896_210_20508_1_1534492518_5958868_558-314572-0 from training. Duration: 0.97 2023-10-04 08:01:06,014 WARNING [train.py:1204] (2/4) Exclude cut with ID _66070_210_7640_1_1535441393096_7805310_290-40284-0_sp1.1 from training. Duration: 0.97275 2023-10-04 08:01:06,055 WARNING [train.py:1204] (2/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_110-224688-0_sp1.1 from training. Duration: 0.8818125 2023-10-04 08:01:07,358 WARNING [train.py:1204] (2/4) Exclude cut with ID _7719_210_3525_1_1528456153163_4296960_88-250259-0 from training. Duration: 0.84 2023-10-04 08:01:07,408 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8453_210_4211_1_1528502940_8562730_1157-114102-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 08:01:07,445 WARNING [train.py:1204] (2/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_317-175062-0_sp1.1 from training. Duration: 0.7454375 2023-10-04 08:01:07,451 WARNING [train.py:1204] (2/4) Exclude cut with ID _29857_210_20034_1_1532395886753_3798160_254-347254-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 08:01:09,088 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.ff3_skip_rate, batch_count=1582560.0, ans=0.0 2023-10-04 08:01:10,163 INFO [train.py:1046] (2/4) Epoch 45, batch 3650, loss[loss=0.1365, simple_loss=0.223, pruned_loss=0.025, over 24698.00 frames. ], tot_loss[loss=0.1536, simple_loss=0.2339, pruned_loss=0.03666, over 4688214.46 frames. ], batch size: 65, lr: 2.25e-03, grad_scale: 32.0 2023-10-04 08:01:10,226 WARNING [train.py:1204] (2/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_28-70987-0 from training. Duration: 0.87 2023-10-04 08:01:10,281 WARNING [train.py:1204] (2/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_321-335475-0 from training. Duration: 0.45 2023-10-04 08:01:15,271 WARNING [train.py:1204] (2/4) Exclude cut with ID _40901_210_5665_1_1533363789958_3856130_356-107330-0_sp1.1 from training. Duration: 0.936375 2023-10-04 08:01:15,344 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_3780_210_2087_1_1526467389_2145572_176-107140-0 from training. Duration: 0.69 2023-10-04 08:01:20,286 WARNING [train.py:1204] (2/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_80-135200-0 from training. Duration: 0.79 2023-10-04 08:01:21,739 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_33348_210_5632_1_1533086912_3324030_85-28583-0_sp0.9 from training. Duration: 0.911125 2023-10-04 08:01:24,465 WARNING [train.py:1204] (2/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_29-293896-0 from training. Duration: 0.61 2023-10-04 08:01:25,884 WARNING [train.py:1204] (2/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_194-252800-0 from training. Duration: 0.97 2023-10-04 08:01:30,049 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_360-52446-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 08:01:30,051 WARNING [train.py:1204] (2/4) Exclude cut with ID _9389_210_1664_1_1529217337301_5289269_289-213417-0_sp0.9 from training. Duration: 0.988875 2023-10-04 08:01:31,352 WARNING [train.py:1204] (2/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_143-86344-0_sp0.9 from training. Duration: 0.8555625 2023-10-04 08:01:34,676 WARNING [train.py:1204] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_520-126729-0_sp0.9 from training. Duration: 0.77775 2023-10-04 08:01:34,705 WARNING [train.py:1204] (2/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_76-308663-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 08:01:35,852 WARNING [train.py:1204] (2/4) Exclude cut with ID _8021_210_7994_1_1528376451358_3653069_100-140231-0 from training. Duration: 0.67 2023-10-04 08:01:35,916 WARNING [train.py:1204] (2/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_387-131538-0_sp0.9 from training. Duration: 0.9 2023-10-04 08:01:35,949 WARNING [train.py:1204] (2/4) Exclude cut with ID _6275_210_2084_1_1528110062213_7669049_365-182054-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 08:01:35,990 WARNING [train.py:1204] (2/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_345-340690-0 from training. Duration: 0.93 2023-10-04 08:01:36,221 INFO [scaling.py:1118] (2/4) WithLoss: name=encoder.encoders.4.encoder.layers.0.self_attn_weights, loss-sum=0.000e+00 2023-10-04 08:01:37,877 WARNING [train.py:1204] (2/4) Exclude cut with ID _5409_210_1388_1_1527292850189_5865191_178-127970-0_sp0.9 from training. Duration: 0.6333125 2023-10-04 08:01:37,927 WARNING [train.py:1204] (2/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_352-12734-0_sp1.1 from training. Duration: 0.92725 2023-10-04 08:01:37,930 WARNING [train.py:1204] (2/4) Exclude cut with ID _65134_210_14763_1_1535335393985_3505368_121-129373-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 08:01:41,228 WARNING [train.py:1204] (2/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_557-324258-0_sp1.1 from training. Duration: 0.736375 2023-10-04 08:01:42,683 WARNING [train.py:1204] (2/4) Exclude cut with ID _48678_210_5663_1_1533988830947_3769199_306-255886-0 from training. Duration: 0.77 2023-10-04 08:01:43,980 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7544_210_6753_1_1528587000_691468_87-178599-0 from training. Duration: 0.87 2023-10-04 08:01:45,871 WARNING [train.py:1204] (2/4) Exclude cut with ID _2941_210_2577_1_1526199673150_2489759_208-236402-0_sp1.1 from training. Duration: 0.963625 2023-10-04 08:01:47,990 WARNING [train.py:1204] (2/4) Exclude cut with ID _38670_210_15379_1_1533448810659_4391059_65-169397-0 from training. Duration: 0.97 2023-10-04 08:01:49,368 WARNING [train.py:1204] (2/4) Exclude cut with ID _30304_210_6319_1_1532476827261_7582060_306-328175-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 08:01:49,377 WARNING [train.py:1204] (2/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_212-149947-0_sp1.1 from training. Duration: 0.663625 2023-10-04 08:01:53,553 WARNING [train.py:1204] (2/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_321-22821-0_sp1.1 from training. Duration: 0.8090625 2023-10-04 08:01:54,933 WARNING [train.py:1204] (2/4) Exclude cut with ID _44010_210_4525_1_1533884262964_4013620_483-69110-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 08:01:54,949 WARNING [train.py:1204] (2/4) Exclude cut with ID _14922_210_6228_1_1530707435273_3693310_38-187851-0_sp1.1 from training. Duration: 0.563625 2023-10-04 08:01:56,390 WARNING [train.py:1204] (2/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_129-337802-0_sp0.9 from training. Duration: 0.8 2023-10-04 08:01:57,667 WARNING [train.py:1204] (2/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_268-87909-0_sp1.1 from training. Duration: 0.9 2023-10-04 08:02:00,355 WARNING [train.py:1204] (2/4) Exclude cut with ID _21878_210_3525_1_1531738772002_7264330_559-184862-0_sp1.1 from training. Duration: 0.8545625 2023-10-04 08:02:04,530 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_46032_210_14656_1_1533781412_4507969_237-120766-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 08:02:05,936 WARNING [train.py:1204] (2/4) Exclude cut with ID _28427_210_4220_1_1532581240967_3061069_330-170906-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 08:02:05,942 WARNING [train.py:1204] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_401-51578-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 08:02:07,350 WARNING [train.py:1204] (2/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_209-205365-0_sp1.1 from training. Duration: 0.7181875 2023-10-04 08:02:07,420 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_23801_210_16223_1_1532066392_3294657_57-242170-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 08:02:08,824 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_127-37371-0_sp1.1 from training. Duration: 0.936375 2023-10-04 08:02:15,056 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9171W0019-578905 from training. Duration: 0.896 2023-10-04 08:02:19,677 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.1.encoder.layers.0.self_attn2.whiten, num_groups=1, num_channels=256, metric=11.28 vs. limit=22.5 2023-10-04 08:02:20,118 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_24605_210_1794_1_1532047863_4149755_349-49056-0_sp1.1 from training. Duration: 0.92725 2023-10-04 08:02:20,130 WARNING [train.py:1204] (2/4) Exclude cut with ID _12302_210_5219_1_1529924446466_8107280_897-21237-0_sp1.1 from training. Duration: 0.936375 2023-10-04 08:02:21,491 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_9460_210_4211_1_1528954587_10049667_480-260269-0_sp0.9 from training. Duration: 0.62225 2023-10-04 08:02:21,544 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_348-152548-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 08:02:21,615 WARNING [train.py:1204] (2/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_301-330455-0_sp0.9 from training. Duration: 0.888875 2023-10-04 08:02:24,251 WARNING [train.py:1204] (2/4) Exclude cut with ID _61008_210_10633_1_1534852769198_6974370_345-91705-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 08:02:25,453 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.678e+02 2.054e+02 2.374e+02 2.935e+02 4.345e+02, threshold=4.749e+02, percent-clipped=1.0 2023-10-04 08:02:25,483 INFO [train.py:1046] (2/4) Epoch 45, batch 3700, loss[loss=0.1455, simple_loss=0.2326, pruned_loss=0.02923, over 24450.00 frames. ], tot_loss[loss=0.154, simple_loss=0.2348, pruned_loss=0.03662, over 4701316.34 frames. ], batch size: 63, lr: 2.25e-03, grad_scale: 16.0 2023-10-04 08:02:25,703 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.conv_skip_rate, batch_count=1582893.3333333333, ans=0.0 2023-10-04 08:02:26,887 WARNING [train.py:1204] (2/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_304-273433-0 from training. Duration: 0.99 2023-10-04 08:02:26,897 WARNING [train.py:1204] (2/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_359-345972-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 08:02:29,622 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_2296_210_2087_1_1525503448_3397335_57-185430-0_sp1.1 from training. Duration: 0.5454375 2023-10-04 08:02:29,962 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.ff3_skip_rate, batch_count=1582893.3333333333, ans=0.0 2023-10-04 08:02:31,030 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_1256-120974-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 08:02:31,103 WARNING [train.py:1204] (2/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_372-30302-0_sp0.9 from training. Duration: 0.9555625 2023-10-04 08:02:33,944 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_42012_210_7927_1_1533439936_3871556_451-344262-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 08:02:33,945 WARNING [train.py:1204] (2/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_28-324729-0 from training. Duration: 0.94 2023-10-04 08:02:33,953 WARNING [train.py:1204] (2/4) Exclude cut with ID _64135_210_2253_1_1535677267876_7608972_313-18394-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 08:02:35,325 WARNING [train.py:1204] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_620-276793-0_sp0.9 from training. Duration: 0.6555625 2023-10-04 08:02:35,353 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7544_210_6753_1_1528591327_1471426_172-348777-0_sp0.9 from training. Duration: 0.7333125 2023-10-04 08:02:36,816 WARNING [train.py:1204] (2/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_332-221057-0_sp0.9 from training. Duration: 0.7555625 2023-10-04 08:02:40,266 WARNING [train.py:1204] (2/4) Exclude cut with ID _19023_210_4353_1_1531378892552_5820780_5-300035-0_sp1.1 from training. Duration: 0.92725 2023-10-04 08:02:40,313 WARNING [train.py:1204] (2/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_343-309023-0_sp1.1 from training. Duration: 0.963625 2023-10-04 08:02:41,773 WARNING [train.py:1204] (2/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_37-41919-0_sp1.1 from training. Duration: 0.7909375 2023-10-04 08:02:41,799 WARNING [train.py:1204] (2/4) Exclude cut with ID _3541_210_2477_1_1526475408709_3263973_41-182910-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 08:02:41,874 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_229-45300-0_sp0.9 from training. Duration: 0.6333125 2023-10-04 08:02:44,947 WARNING [train.py:1204] (2/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_40-214716-0_sp1.1 from training. Duration: 0.963625 2023-10-04 08:02:46,327 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9309W0203-640330 from training. Duration: 0.896 2023-10-04 08:02:55,388 WARNING [train.py:1204] (2/4) Exclude cut with ID _60229_210_27767_1_1534838406158_3655350_264-279573-0_sp1.1 from training. Duration: 0.7545625 2023-10-04 08:02:55,418 WARNING [train.py:1204] (2/4) Exclude cut with ID _8394_210_6702_1_1528590504316_3540189_372-198878-0_sp0.9 from training. Duration: 0.6666875 2023-10-04 08:02:56,802 WARNING [train.py:1204] (2/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_359-118926-0_sp1.1 from training. Duration: 0.6545625 2023-10-04 08:02:56,814 WARNING [train.py:1204] (2/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_85-65056-0 from training. Duration: 0.96 2023-10-04 08:02:56,835 WARNING [train.py:1204] (2/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_89-242335-0_sp1.1 from training. Duration: 0.863625 2023-10-04 08:03:00,940 WARNING [train.py:1204] (2/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_128-49444-0_sp1.1 from training. Duration: 0.936375 2023-10-04 08:03:02,321 WARNING [train.py:1204] (2/4) Exclude cut with ID _30327_210_16049_1_1532484161295_3924169_401-70819-0 from training. Duration: 0.86 2023-10-04 08:03:02,397 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_41822_210_13270_1_1533955976_4379284_260-256391-0_sp1.1 from training. Duration: 0.936375 2023-10-04 08:03:02,589 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.conv_module2.balancer2.prob, batch_count=1583026.6666666667, ans=0.125 2023-10-04 08:03:03,765 WARNING [train.py:1204] (2/4) Exclude cut with ID _67335_210_5219_1_1535525975652_4275229_599-247317-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 08:03:03,945 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.feed_forward1.hidden_balancer.prob, batch_count=1583026.6666666667, ans=0.125 2023-10-04 08:03:06,659 WARNING [train.py:1204] (2/4) Exclude cut with ID _29857_210_20034_1_1532395886753_3798160_209-239267-0_sp1.1 from training. Duration: 0.936375 2023-10-04 08:03:06,685 WARNING [train.py:1204] (2/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_46-207599-0_sp1.1 from training. Duration: 0.5818125 2023-10-04 08:03:06,933 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.feed_forward1.out_proj.dropout_p, batch_count=1583026.6666666667, ans=0.1 2023-10-04 08:03:09,883 WARNING [train.py:1204] (2/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_912-282412-0_sp1.1 from training. Duration: 0.4454375 2023-10-04 08:03:14,473 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_4183_210_2087_1_1526729100_1869838_141-166600-0_sp1.1 from training. Duration: 0.863625 2023-10-04 08:03:14,479 WARNING [train.py:1204] (2/4) Exclude cut with ID _15037_210_2253_1_1530752294104_3267650_296-192597-0 from training. Duration: 0.81 2023-10-04 08:03:14,539 WARNING [train.py:1204] (2/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_571-220562-0_sp1.1 from training. Duration: 0.963625 2023-10-04 08:03:14,560 WARNING [train.py:1204] (2/4) Exclude cut with ID _5287_210_2084_1_1527418883040_3878670_12-221186-0 from training. Duration: 0.98 2023-10-04 08:03:14,660 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder_embed.conv.2.prob, batch_count=1583093.3333333333, ans=0.125 2023-10-04 08:03:20,324 WARNING [train.py:1204] (2/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_149-85602-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 08:03:21,551 WARNING [train.py:1204] (2/4) Exclude cut with ID _9235_210_5219_1_1528891154253_8046120_1003-101638-0_sp1.1 from training. Duration: 0.663625 2023-10-04 08:03:24,892 WARNING [train.py:1204] (2/4) Exclude cut with ID _55844_210_2346_1_1534471212569_7625707_1099-33689-0_sp1.1 from training. Duration: 0.92725 2023-10-04 08:03:24,937 WARNING [train.py:1204] (2/4) Exclude cut with ID _7606_210_4915_1_1528894253277_5080690_301-10732-0 from training. Duration: 0.81 2023-10-04 08:03:26,344 WARNING [train.py:1204] (2/4) Exclude cut with ID _22826_210_9994_1_1531908201522_3279169_403-34698-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 08:03:26,350 WARNING [train.py:1204] (2/4) Exclude cut with ID _8480_210_1874_1_1528589391205_6728330_755-112625-0_sp0.9 from training. Duration: 0.82225 2023-10-04 08:03:26,363 WARNING [train.py:1204] (2/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_527-238990-0_sp1.1 from training. Duration: 0.8181875 2023-10-04 08:03:27,637 WARNING [train.py:1204] (2/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_426-7328-0_sp1.1 from training. Duration: 0.92725 2023-10-04 08:03:29,149 WARNING [train.py:1204] (2/4) Exclude cut with ID _3541_210_2477_1_1526475408709_3263973_189-317158-0_sp1.1 from training. Duration: 0.8181875 2023-10-04 08:03:29,219 WARNING [train.py:1204] (2/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_162-103620-0 from training. Duration: 0.93 2023-10-04 08:03:30,715 WARNING [train.py:1204] (2/4) Exclude cut with ID _22083_210_8254_1_1531702862439_3678970_49-234546-0 from training. Duration: 0.83 2023-10-04 08:03:31,993 WARNING [train.py:1204] (2/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_370-331600-0_sp1.1 from training. Duration: 0.8545625 2023-10-04 08:03:32,005 WARNING [train.py:1204] (2/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_166-59042-0_sp1.1 from training. Duration: 0.97275 2023-10-04 08:03:33,357 WARNING [train.py:1204] (2/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_138-332773-0_sp1.1 from training. Duration: 0.736375 2023-10-04 08:03:35,884 WARNING [train.py:1204] (2/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_90-30673-0_sp0.9 from training. Duration: 0.9333125 2023-10-04 08:03:36,098 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.conv_module2.balancer1.prob, batch_count=1583160.0, ans=0.125 2023-10-04 08:03:38,461 INFO [train.py:1046] (2/4) Epoch 45, batch 3750, loss[loss=0.1707, simple_loss=0.2542, pruned_loss=0.04359, over 23288.00 frames. ], tot_loss[loss=0.1545, simple_loss=0.2356, pruned_loss=0.03674, over 4713345.13 frames. ], batch size: 105, lr: 2.25e-03, grad_scale: 16.0 2023-10-04 08:03:38,525 WARNING [train.py:1204] (2/4) Exclude cut with ID _27967_210_12140_1_1532588391859_8419130_737-41898-0_sp1.1 from training. Duration: 0.936375 2023-10-04 08:03:40,363 WARNING [train.py:1204] (2/4) Exclude cut with ID _8833_210_5219_1_1528592665210_7553059_329-125156-0_sp1.1 from training. Duration: 0.7454375 2023-10-04 08:03:41,768 WARNING [train.py:1204] (2/4) Exclude cut with ID _41817_210_18835_1_1533434477160_7423049_352-42537-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 08:03:43,164 WARNING [train.py:1204] (2/4) Exclude cut with ID _8365_210_2064_1_1528592311070_4677690_431-294102-0 from training. Duration: 0.89 2023-10-04 08:03:44,116 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.0.layers.1.self_attn1.whiten, num_groups=1, num_channels=192, metric=11.17 vs. limit=22.5 2023-10-04 08:03:44,598 WARNING [train.py:1204] (2/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_454-314901-0_sp1.1 from training. Duration: 0.4181875 2023-10-04 08:03:47,916 WARNING [train.py:1204] (2/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_80-135200-0_sp0.9 from training. Duration: 0.87775 2023-10-04 08:03:47,988 WARNING [train.py:1204] (2/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_181-78632-0 from training. Duration: 0.78 2023-10-04 08:03:49,345 WARNING [train.py:1204] (2/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_300-27235-0_sp0.9 from training. Duration: 0.9666875 2023-10-04 08:03:50,674 WARNING [train.py:1204] (2/4) Exclude cut with ID _47790_210_16187_1_1533862814066_4207519_96-177176-0_sp1.1 from training. Duration: 0.97275 2023-10-04 08:03:54,309 WARNING [train.py:1204] (2/4) Exclude cut with ID _35941_210_15379_1_1532995294515_4023089_246-348742-0_sp1.1 from training. Duration: 0.97275 2023-10-04 08:03:54,399 WARNING [train.py:1204] (2/4) Exclude cut with ID _38670_210_15379_1_1533448810659_4391059_65-169397-0_sp1.1 from training. Duration: 0.8818125 2023-10-04 08:03:58,486 WARNING [train.py:1204] (2/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_1003-99642-0_sp1.1 from training. Duration: 0.963625 2023-10-04 08:04:00,011 WARNING [train.py:1204] (2/4) Exclude cut with ID _40402_210_18219_1_1533522324115_2953150_320-14078-0_sp1.1 from training. Duration: 0.863625 2023-10-04 08:04:00,190 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.attention_skip_rate, batch_count=1583293.3333333333, ans=0.0 2023-10-04 08:04:01,413 WARNING [train.py:1204] (2/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_367-61330-0_sp0.9 from training. Duration: 0.8333125 2023-10-04 08:04:02,813 WARNING [train.py:1204] (2/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_515-298514-0_sp1.1 from training. Duration: 0.92725 2023-10-04 08:04:05,527 WARNING [train.py:1204] (2/4) Exclude cut with ID _53731_210_7994_1_1534553979244_3757214_471-320815-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 08:04:06,859 WARNING [train.py:1204] (2/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_306-232480-0 from training. Duration: 0.96 2023-10-04 08:04:08,343 WARNING [train.py:1204] (2/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_478-162247-0_sp0.9 from training. Duration: 0.911125 2023-10-04 08:04:09,720 WARNING [train.py:1204] (2/4) Exclude cut with ID _56423_210_5854_1_1534896061049_3165969_345-284696-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 08:04:09,754 WARNING [train.py:1204] (2/4) Exclude cut with ID _44778_210_2054_1_1533798127203_7443090_600-122253-0_sp1.1 from training. Duration: 0.963625 2023-10-04 08:04:13,679 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.3.encoder.layers.0.feed_forward3.out_whiten, num_groups=1, num_channels=512, metric=12.47 vs. limit=15.0 2023-10-04 08:04:15,689 WARNING [train.py:1204] (2/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_199-295611-0 from training. Duration: 0.55 2023-10-04 08:04:17,278 WARNING [train.py:1204] (2/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_734-129761-0 from training. Duration: 0.82 2023-10-04 08:04:19,146 WARNING [train.py:1204] (2/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_349-169257-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 08:04:19,183 WARNING [train.py:1204] (2/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_202-67017-0_sp0.9 from training. Duration: 0.911125 2023-10-04 08:04:21,881 WARNING [train.py:1204] (2/4) Exclude cut with ID _16908_210_5724_1_1531119157248_3772700_61-258978-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 08:04:25,333 WARNING [train.py:1204] (2/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_172-156931-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 08:04:25,548 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.bypass.skip_rate, batch_count=1583426.6666666667, ans=0.09899494936611666 2023-10-04 08:04:26,650 WARNING [train.py:1204] (2/4) Exclude cut with ID _4092_210_2151_1_1526643044449_3135759_356-278694-0_sp1.1 from training. Duration: 0.42725 2023-10-04 08:04:27,265 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.4.encoder.layers.1.self_attn2.whiten, num_groups=1, num_channels=384, metric=11.47 vs. limit=22.5 2023-10-04 08:04:28,177 WARNING [train.py:1204] (2/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_116-332008-0 from training. Duration: 0.98 2023-10-04 08:04:32,203 WARNING [train.py:1204] (2/4) Exclude cut with ID _29003_210_19596_1_1532507391720_3894098_358-114789-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 08:04:34,973 WARNING [train.py:1204] (2/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_152-212533-0_sp1.1 from training. Duration: 0.8818125 2023-10-04 08:04:36,254 WARNING [train.py:1204] (2/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_267-214116-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 08:04:39,553 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7543_210_6753_1_1528541907_1399322_165-323619-0_sp0.9 from training. Duration: 0.7666875 2023-10-04 08:04:41,243 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.ff2_skip_rate, batch_count=1583493.3333333333, ans=0.0 2023-10-04 08:04:43,770 WARNING [train.py:1204] (2/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_577-332600-0_sp0.9 from training. Duration: 0.6333125 2023-10-04 08:04:45,183 WARNING [train.py:1204] (2/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_342-100157-0_sp0.9 from training. Duration: 0.6 2023-10-04 08:04:47,839 WARNING [train.py:1204] (2/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_141-89727-0_sp1.1 from training. Duration: 0.6454375 2023-10-04 08:04:49,225 WARNING [train.py:1204] (2/4) Exclude cut with ID _3992_210_1747_1_1526644816031_3731060_107-251434-0_sp1.1 from training. Duration: 0.9 2023-10-04 08:04:51,323 WARNING [train.py:1204] (2/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_401-53423-0_sp1.1 from training. Duration: 0.5 2023-10-04 08:04:52,547 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.566e+02 2.048e+02 2.267e+02 2.690e+02 4.764e+02, threshold=4.534e+02, percent-clipped=1.0 2023-10-04 08:04:52,585 INFO [train.py:1046] (2/4) Epoch 45, batch 3800, loss[loss=0.1628, simple_loss=0.2512, pruned_loss=0.03718, over 24649.00 frames. ], tot_loss[loss=0.1552, simple_loss=0.2359, pruned_loss=0.03721, over 4703422.38 frames. ], batch size: 68, lr: 2.25e-03, grad_scale: 16.0 2023-10-04 08:04:58,703 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7762_210_1794_1_1528199508_4057408_422-227034-0_sp1.1 from training. Duration: 0.663625 2023-10-04 08:05:01,694 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.feed_forward2.hidden_balancer.prob, batch_count=1583560.0, ans=0.125 2023-10-04 08:05:02,908 WARNING [train.py:1204] (2/4) Exclude cut with ID _4184_210_2550_1_1526730192013_2219380_102-250299-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 08:05:02,973 WARNING [train.py:1204] (2/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_253-308377-0_sp0.9 from training. Duration: 0.5444375 2023-10-04 08:05:04,360 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_271-297505-0 from training. Duration: 0.99 2023-10-04 08:05:05,360 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.0.layers.1.nonlin_attention.whiten2, num_groups=1, num_channels=192, metric=5.43 vs. limit=15.0 2023-10-04 08:05:05,714 WARNING [train.py:1204] (2/4) Exclude cut with ID _35941_210_15379_1_1532995294515_4023089_213-142973-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 08:05:05,906 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.balancer_ff3.min_abs, batch_count=1583626.6666666667, ans=0.2 2023-10-04 08:05:07,166 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7544_210_6753_1_1528587722_3576686_162-63018-0_sp1.1 from training. Duration: 0.92725 2023-10-04 08:05:08,435 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_365-217119-0_sp0.9 from training. Duration: 0.7 2023-10-04 08:05:10,308 WARNING [train.py:1204] (2/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_662-275621-0_sp0.9 from training. Duration: 0.4666875 2023-10-04 08:05:10,309 WARNING [train.py:1204] (2/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_374-187555-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 08:05:11,685 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_289-180904-0_sp1.1 from training. Duration: 0.6909375 2023-10-04 08:05:13,089 WARNING [train.py:1204] (2/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_189-47206-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 08:05:13,124 WARNING [train.py:1204] (2/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_234-14174-0_sp1.1 from training. Duration: 0.7454375 2023-10-04 08:05:13,157 WARNING [train.py:1204] (2/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_576-76621-0_sp1.1 from training. Duration: 0.963625 2023-10-04 08:05:14,561 WARNING [train.py:1204] (2/4) Exclude cut with ID _13253_210_1925_1_1530757775819_3577873_253-246386-0 from training. Duration: 0.9 2023-10-04 08:05:18,685 WARNING [train.py:1204] (2/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_372-6869-0_sp1.1 from training. Duration: 0.4 2023-10-04 08:05:19,355 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_21434_210_4238_1_1531916339_1222252_22-54954-0_sp1.1 from training. Duration: 0.936375 2023-10-04 08:05:21,988 WARNING [train.py:1204] (2/4) Exclude cut with ID _29853_210_7497_1_1532505600851_6642110_492-170361-0_sp1.1 from training. Duration: 0.92725 2023-10-04 08:05:23,402 WARNING [train.py:1204] (2/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_505-97987-0_sp0.9 from training. Duration: 0.9555625 2023-10-04 08:05:25,334 WARNING [train.py:1204] (2/4) Exclude cut with ID _8365_210_2064_1_1528592311070_4677690_371-122295-0_sp0.9 from training. Duration: 0.611125 2023-10-04 08:05:26,804 WARNING [train.py:1204] (2/4) Exclude cut with ID _14268_210_9206_1_1530622771285_4714099_637-86630-0_sp1.1 from training. Duration: 0.463625 2023-10-04 08:05:26,823 WARNING [train.py:1204] (2/4) Exclude cut with ID _29857_210_20034_1_1532395886753_3798160_372-5678-0_sp1.1 from training. Duration: 0.963625 2023-10-04 08:05:28,247 WARNING [train.py:1204] (2/4) Exclude cut with ID _52892_210_6721_1_1534378941259_4124129_90-165108-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 08:05:29,622 WARNING [train.py:1204] (2/4) Exclude cut with ID _27052_210_5986_1_1532329197187_3585009_143-339198-0_sp1.1 from training. Duration: 0.963625 2023-10-04 08:05:32,473 WARNING [train.py:1204] (2/4) Exclude cut with ID _14268_210_9206_1_1530622771285_4714099_637-86630-0_sp0.9 from training. Duration: 0.5666875 2023-10-04 08:05:33,764 WARNING [train.py:1204] (2/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_401-255491-0 from training. Duration: 0.7 2023-10-04 08:05:35,056 WARNING [train.py:1204] (2/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_178-6946-0_sp1.1 from training. Duration: 0.97275 2023-10-04 08:05:35,933 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.1.encoder.layers.1.self_attn_weights.whiten_keys, num_groups=4, num_channels=128, metric=2.54 vs. limit=6.0 2023-10-04 08:05:42,537 WARNING [train.py:1204] (2/4) Exclude cut with ID _27485_210_4916_1_1532739191677_3668269_460-163885-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 08:05:46,658 WARNING [train.py:1204] (2/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_270-26053-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 08:05:49,087 WARNING [train.py:1204] (2/4) Exclude cut with ID _14170_210_5219_1_1530525949868_4469650_412-317077-0 from training. Duration: 0.75 2023-10-04 08:05:50,314 WARNING [train.py:1204] (2/4) Exclude cut with ID _5289_210_2084_1_1527678202736_3779299_142-103317-0 from training. Duration: 0.84 2023-10-04 08:05:51,642 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_150-143438-0_sp1.1 from training. Duration: 0.92725 2023-10-04 08:05:53,029 WARNING [train.py:1204] (2/4) Exclude cut with ID _27894_210_12392_1_1532415496078_6902439_668-269779-0_sp1.1 from training. Duration: 0.97275 2023-10-04 08:05:53,086 WARNING [train.py:1204] (2/4) Exclude cut with ID _18513_210_9206_1_1531618215223_3862620_56-222075-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 08:05:56,194 WARNING [train.py:1204] (2/4) Exclude cut with ID _4092_210_2151_1_1526643044449_3135759_356-278694-0 from training. Duration: 0.47 2023-10-04 08:05:57,954 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.0.conv_module2.balancer1.min_positive, batch_count=1583826.6666666667, ans=0.025 2023-10-04 08:05:59,080 WARNING [train.py:1204] (2/4) Exclude cut with ID _25808_210_13292_1_1532343514562_7366109_633-47524-0 from training. Duration: 0.99 2023-10-04 08:05:59,092 WARNING [train.py:1204] (2/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_872-264525-0 from training. Duration: 0.65 2023-10-04 08:05:59,140 WARNING [train.py:1204] (2/4) Exclude cut with ID _14632_210_9988_1_1530691279392_3923200_239-232545-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 08:06:00,679 WARNING [train.py:1204] (2/4) Exclude cut with ID _14349_210_8341_1_1530615710818_3442470_153-27432-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 08:06:06,068 INFO [train.py:1046] (2/4) Epoch 45, batch 3850, loss[loss=0.1561, simple_loss=0.2382, pruned_loss=0.03702, over 23191.00 frames. ], tot_loss[loss=0.1544, simple_loss=0.2347, pruned_loss=0.0371, over 4709901.31 frames. ], batch size: 105, lr: 2.25e-03, grad_scale: 16.0 2023-10-04 08:06:06,226 WARNING [train.py:1204] (2/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_37-41919-0_sp0.9 from training. Duration: 0.9666875 2023-10-04 08:06:07,691 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_196-289637-0_sp0.9 from training. Duration: 0.8333125 2023-10-04 08:06:11,897 WARNING [train.py:1204] (2/4) Exclude cut with ID _13678_210_7497_1_1530530069226_3463170_371-3221-0_sp1.1 from training. Duration: 0.8181875 2023-10-04 08:06:11,972 WARNING [train.py:1204] (2/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_557-324258-0 from training. Duration: 0.81 2023-10-04 08:06:13,934 WARNING [train.py:1204] (2/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_1012-279473-0_sp1.1 from training. Duration: 0.6090625 2023-10-04 08:06:14,705 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.2.encoder.layers.2.conv_module1.whiten, num_groups=1, num_channels=384, metric=5.43 vs. limit=15.0 2023-10-04 08:06:15,339 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_66916_210_14656_1_1535725525_326566_7-92649-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 08:06:19,024 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_9460_210_4211_1_1528954587_10049667_480-260269-0_sp1.1 from training. Duration: 0.5090625 2023-10-04 08:06:19,226 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.bypass.skip_rate, batch_count=1583893.3333333333, ans=0.07 2023-10-04 08:06:21,720 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_40896_210_15710_1_1533433956_7100802_121-155001-0_sp1.1 from training. Duration: 0.92725 2023-10-04 08:06:24,887 WARNING [train.py:1204] (2/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_188-171673-0_sp1.1 from training. Duration: 0.52725 2023-10-04 08:06:24,969 WARNING [train.py:1204] (2/4) Exclude cut with ID _47790_210_16187_1_1533862814066_4207519_2-39653-0 from training. Duration: 0.98 2023-10-04 08:06:31,097 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_23850_210_4238_1_1532232597_1632433_112-229012-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 08:06:31,339 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.conv_skip_rate, batch_count=1583960.0, ans=0.0 2023-10-04 08:06:32,546 WARNING [train.py:1204] (2/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_626-79528-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 08:06:33,929 WARNING [train.py:1204] (2/4) Exclude cut with ID _30304_210_6319_1_1532476827261_7582060_856-90052-0_sp1.1 from training. Duration: 0.963625 2023-10-04 08:06:35,221 WARNING [train.py:1204] (2/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_122-109023-0_sp0.9 from training. Duration: 0.8444375 2023-10-04 08:06:38,014 WARNING [train.py:1204] (2/4) Exclude cut with ID _64414_210_17301_1_1535243252048_3757759_37-70328-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 08:06:38,082 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_20120_210_6753_1_1531643035_5482859_622-344907-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 08:06:38,135 WARNING [train.py:1204] (2/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_410-321956-0_sp1.1 from training. Duration: 0.92725 2023-10-04 08:06:38,146 WARNING [train.py:1204] (2/4) Exclude cut with ID _7562_210_2084_1_1528887767916_3946229_186-326422-0_sp0.9 from training. Duration: 0.9333125 2023-10-04 08:06:39,555 WARNING [train.py:1204] (2/4) Exclude cut with ID _37537_210_2253_1_1533171758568_3421949_127-285192-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 08:06:42,812 WARNING [train.py:1204] (2/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_723-228168-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 08:06:45,420 WARNING [train.py:1204] (2/4) Exclude cut with ID _13678_210_7497_1_1530530069226_3463170_426-246984-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 08:06:45,437 WARNING [train.py:1204] (2/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_399-156791-0_sp0.9 from training. Duration: 0.92225 2023-10-04 08:06:45,501 WARNING [train.py:1204] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_391-22148-0 from training. Duration: 0.77 2023-10-04 08:06:45,534 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_3475_210_2028_1_1526193578_3622569_64-297072-0 from training. Duration: 0.91 2023-10-04 08:06:46,899 WARNING [train.py:1204] (2/4) Exclude cut with ID _17875_210_2087_1_1531224012907_1821339_225-280015-0_sp1.1 from training. Duration: 0.963625 2023-10-04 08:06:46,919 WARNING [train.py:1204] (2/4) Exclude cut with ID _50655_210_18628_1_1534229942193_3772154_325-246169-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 08:06:48,359 WARNING [train.py:1204] (2/4) Exclude cut with ID _29529_210_7234_1_1532393961455_3977460_597-257348-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 08:06:50,264 WARNING [train.py:1204] (2/4) Exclude cut with ID _58895_210_8750_1_1535184081320_3649089_77-173485-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 08:06:50,291 WARNING [train.py:1204] (2/4) Exclude cut with ID _6270_210_2084_1_1527850911573_3856420_26-277343-0 from training. Duration: 0.82 2023-10-04 08:06:53,063 WARNING [train.py:1204] (2/4) Exclude cut with ID _14368_210_4220_1_1530532714870_3595529_144-3287-0 from training. Duration: 0.67 2023-10-04 08:06:53,183 WARNING [train.py:1204] (2/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_293-145278-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 08:06:55,179 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_40047_210_2290_1_1533290519_2374311_257-335162-0 from training. Duration: 0.95 2023-10-04 08:06:57,953 WARNING [train.py:1204] (2/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_619-344499-0_sp0.9 from training. Duration: 0.7 2023-10-04 08:07:02,106 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.conv_skip_rate, batch_count=1584093.3333333333, ans=0.0 2023-10-04 08:07:03,279 WARNING [train.py:1204] (2/4) Exclude cut with ID _19504_210_9994_1_1531648863242_3325220_456-315379-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 08:07:04,239 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.0.layers.0.feed_forward1.out_whiten, num_groups=1, num_channels=192, metric=9.42 vs. limit=15.0 2023-10-04 08:07:04,582 WARNING [train.py:1204] (2/4) Exclude cut with ID _55857_210_2346_1_1534644007831_6898209_631-221692-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 08:07:04,884 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.bypass.scale_min, batch_count=1584160.0, ans=0.2 2023-10-04 08:07:08,783 WARNING [train.py:1204] (2/4) Exclude cut with ID _64631_210_27767_1_1535349580068_4165160_324-3682-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 08:07:08,803 WARNING [train.py:1204] (2/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_317-211107-0 from training. Duration: 0.92 2023-10-04 08:07:12,287 WARNING [train.py:1204] (2/4) Exclude cut with ID _6789_210_1534_1_1527857937218_3118950_311-61262-0 from training. Duration: 0.65 2023-10-04 08:07:12,444 WARNING [train.py:1204] (2/4) Exclude cut with ID _28323_210_16187_1_1532480395667_4100300_365-68882-0_sp1.1 from training. Duration: 0.92725 2023-10-04 08:07:13,763 WARNING [train.py:1204] (2/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_816-181825-0_sp1.1 from training. Duration: 0.92725 2023-10-04 08:07:16,559 WARNING [train.py:1204] (2/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_132-269633-0_sp1.1 from training. Duration: 0.6181875 2023-10-04 08:07:16,570 WARNING [train.py:1204] (2/4) Exclude cut with ID _44585_210_18628_1_1533605350387_4518199_280-73534-0_sp1.1 from training. Duration: 0.8090625 2023-10-04 08:07:17,866 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_68095_210_20185_1_1535718226_8888855_1009-164755-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 08:07:19,208 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_40223_210_6228_1_1533298404_4812267_400-103813-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 08:07:19,209 WARNING [train.py:1204] (2/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_491-261923-0_sp1.1 from training. Duration: 0.8909375 2023-10-04 08:07:19,214 WARNING [train.py:1204] (2/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_712-342563-0 from training. Duration: 0.97 2023-10-04 08:07:21,119 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.596e+02 1.982e+02 2.141e+02 2.479e+02 3.654e+02, threshold=4.281e+02, percent-clipped=0.0 2023-10-04 08:07:21,144 INFO [train.py:1046] (2/4) Epoch 45, batch 3900, loss[loss=0.1555, simple_loss=0.2312, pruned_loss=0.03987, over 20239.00 frames. ], tot_loss[loss=0.1534, simple_loss=0.2333, pruned_loss=0.03672, over 4713685.45 frames. ], batch size: 44, lr: 2.25e-03, grad_scale: 16.0 2023-10-04 08:07:21,184 WARNING [train.py:1204] (2/4) Exclude cut with ID _41161_210_20034_1_1533431249657_3555314_175-190032-0_sp1.1 from training. Duration: 0.963625 2023-10-04 08:07:21,945 WARNING [train.py:1204] (2/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_788-298907-0 from training. Duration: 0.89 2023-10-04 08:07:21,961 WARNING [train.py:1204] (2/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_225-70961-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 08:07:21,967 WARNING [train.py:1204] (2/4) Exclude cut with ID _58583_210_5236_1_1534726835266_3574249_235-175994-0_sp1.1 from training. Duration: 0.92725 2023-10-04 08:07:24,708 WARNING [train.py:1204] (2/4) Exclude cut with ID _12302_210_5219_1_1529924446466_8107280_907-182949-0_sp1.1 from training. Duration: 0.836375 2023-10-04 08:07:24,742 WARNING [train.py:1204] (2/4) Exclude cut with ID _54871_210_22356_1_1534665798253_3856359_94-193692-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 08:07:26,071 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_4528_210_3273_1_1527076654_3276777_342-168068-0_sp0.9 from training. Duration: 0.9666875 2023-10-04 08:07:26,130 WARNING [train.py:1204] (2/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_368-74239-0_sp1.1 from training. Duration: 0.92725 2023-10-04 08:07:26,131 WARNING [train.py:1204] (2/4) Exclude cut with ID _44831_210_13427_1_1533816011549_3689035_291-295928-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 08:07:27,913 WARNING [train.py:1204] (2/4) Exclude cut with ID _56406_210_9288_1_1534899618664_3496149_455-131997-0_sp1.1 from training. Duration: 0.97275 2023-10-04 08:07:27,933 WARNING [train.py:1204] (2/4) Exclude cut with ID _6473_210_1741_1_1527901307396_3815380_108-17335-0 from training. Duration: 0.65 2023-10-04 08:07:29,321 WARNING [train.py:1204] (2/4) Exclude cut with ID _14224_210_4381_1_1530529062037_4264010_286-135600-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 08:07:30,074 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.3.encoder.layers.3.conv_module2.whiten, num_groups=1, num_channels=512, metric=5.80 vs. limit=15.0 2023-10-04 08:07:31,297 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.4.encoder.layers.2.feed_forward3.out_whiten, num_groups=1, num_channels=384, metric=13.42 vs. limit=15.0 2023-10-04 08:07:32,072 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_14056_210_4163_1_1530604483_7572827_928-321675-0_sp1.1 from training. Duration: 0.936375 2023-10-04 08:07:33,417 WARNING [train.py:1204] (2/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_81-300212-0_sp1.1 from training. Duration: 0.5454375 2023-10-04 08:07:33,461 WARNING [train.py:1204] (2/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_29-280622-0_sp0.9 from training. Duration: 0.911125 2023-10-04 08:07:33,550 WARNING [train.py:1204] (2/4) Exclude cut with ID _61079_210_4525_1_1534921008198_4011419_228-176447-0_sp1.1 from training. Duration: 0.936375 2023-10-04 08:07:37,566 WARNING [train.py:1204] (2/4) Exclude cut with ID _8394_210_6702_1_1528590504316_3540189_372-198878-0_sp1.1 from training. Duration: 0.5454375 2023-10-04 08:07:37,577 WARNING [train.py:1204] (2/4) Exclude cut with ID _64521_210_14699_1_1535243427259_3815328_244-208725-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 08:07:38,976 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8275_210_4198_1_1528455320_3892571_491-17707-0_sp0.9 from training. Duration: 0.711125 2023-10-04 08:07:40,340 WARNING [train.py:1204] (2/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_375-245677-0 from training. Duration: 0.81 2023-10-04 08:07:40,347 WARNING [train.py:1204] (2/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_690-286055-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 08:07:42,353 WARNING [train.py:1204] (2/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_89-321955-0 from training. Duration: 0.8 2023-10-04 08:07:43,676 WARNING [train.py:1204] (2/4) Exclude cut with ID _18895_210_6228_1_1531357274234_3784930_213-98721-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 08:07:44,973 WARNING [train.py:1204] (2/4) Exclude cut with ID _19708_210_9206_1_1532136650733_4238364_347-65240-0 from training. Duration: 0.97 2023-10-04 08:07:46,300 WARNING [train.py:1204] (2/4) Exclude cut with ID _64135_210_2253_1_1535677267876_7608972_498-5039-0 from training. Duration: 0.78 2023-10-04 08:07:49,299 WARNING [train.py:1204] (2/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_146-276944-0_sp1.1 from training. Duration: 0.7545625 2023-10-04 08:07:51,975 WARNING [train.py:1204] (2/4) Exclude cut with ID _63171_210_15488_1_1535104792794_3295110_155-34488-0_sp1.1 from training. Duration: 0.97275 2023-10-04 08:07:51,988 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_5438_210_1794_1_1527336687_2600009_233-58072-0_sp0.9 from training. Duration: 0.8555625 2023-10-04 08:07:52,042 WARNING [train.py:1204] (2/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_63-101996-0_sp0.9 from training. Duration: 0.97775 2023-10-04 08:07:56,666 WARNING [train.py:1204] (2/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_271-269084-0_sp1.1 from training. Duration: 0.7545625 2023-10-04 08:07:58,038 WARNING [train.py:1204] (2/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_345-167986-0_sp1.1 from training. Duration: 0.763625 2023-10-04 08:08:01,287 WARNING [train.py:1204] (2/4) Exclude cut with ID _39290_210_4220_1_1533862778760_3075118_92-311600-0_sp1.1 from training. Duration: 0.8 2023-10-04 08:08:01,293 WARNING [train.py:1204] (2/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_209-65306-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 08:08:02,606 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_26433_210_12108_1_1532157158_4056480_317-335807-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 08:08:05,716 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.nonlin_attention.balancer.prob, batch_count=1584426.6666666667, ans=0.125 2023-10-04 08:08:06,837 WARNING [train.py:1204] (2/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_238-94127-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 08:08:06,874 WARNING [train.py:1204] (2/4) Exclude cut with ID _41314_210_12651_1_1533459476543_3766460_207-132038-0_sp1.1 from training. Duration: 0.8545625 2023-10-04 08:08:11,884 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.1.conv_module2.balancer2.prob, batch_count=1584426.6666666667, ans=0.125 2023-10-04 08:08:13,406 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.balancer1.prob, batch_count=1584426.6666666667, ans=0.125 2023-10-04 08:08:15,930 WARNING [train.py:1204] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_167-137255-0_sp1.1 from training. Duration: 0.5818125 2023-10-04 08:08:17,379 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_2009_210_2071_1_1525481134_4588937_385-302100-0_sp0.9 from training. Duration: 0.9666875 2023-10-04 08:08:23,184 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_15517_210_6753_1_1530954008_3487336_253-110617-0_sp1.1 from training. Duration: 0.936375 2023-10-04 08:08:26,460 WARNING [train.py:1204] (2/4) Exclude cut with ID _39290_210_4220_1_1533862778760_3075118_92-311600-0_sp0.9 from training. Duration: 0.97775 2023-10-04 08:08:27,846 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_42-142473-0 from training. Duration: 0.72 2023-10-04 08:08:27,878 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_2296_210_2087_1_1525503448_3397335_57-185430-0 from training. Duration: 0.6 2023-10-04 08:08:27,897 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_11305_210_1794_1_1529750428_7178148_257-312870-0_sp0.9 from training. Duration: 0.97775 2023-10-04 08:08:29,714 WARNING [train.py:1204] (2/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_222-198127-0 from training. Duration: 0.84 2023-10-04 08:08:31,066 WARNING [train.py:1204] (2/4) Exclude cut with ID _65263_210_22356_1_1535763598691_3921037_423-202699-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 08:08:31,129 WARNING [train.py:1204] (2/4) Exclude cut with ID _15037_210_2253_1_1530752294104_3267650_13-130361-0 from training. Duration: 0.99 2023-10-04 08:08:35,205 INFO [train.py:1046] (2/4) Epoch 45, batch 3950, loss[loss=0.1539, simple_loss=0.2384, pruned_loss=0.03476, over 24497.00 frames. ], tot_loss[loss=0.1534, simple_loss=0.2332, pruned_loss=0.03675, over 4716873.16 frames. ], batch size: 66, lr: 2.25e-03, grad_scale: 16.0 2023-10-04 08:08:35,601 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.conv_module2.balancer1.prob, batch_count=1584560.0, ans=0.125 2023-10-04 08:08:36,937 WARNING [train.py:1204] (2/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_628-198080-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 08:08:38,242 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_3171_210_2087_1_1526109084_2330007_37-64334-0 from training. Duration: 0.82 2023-10-04 08:08:38,299 WARNING [train.py:1204] (2/4) Exclude cut with ID _5287_210_2084_1_1527418883040_3878670_12-221186-0_sp1.1 from training. Duration: 0.8909375 2023-10-04 08:08:40,974 WARNING [train.py:1204] (2/4) Exclude cut with ID _6653_210_2629_1_1527919172441_6732679_1103-56445-0_sp1.1 from training. Duration: 0.663625 2023-10-04 08:08:42,834 WARNING [train.py:1204] (2/4) Exclude cut with ID _65263_210_22356_1_1535763598691_3921037_169-140068-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 08:08:47,049 WARNING [train.py:1204] (2/4) Exclude cut with ID IC0671W0118-331961 from training. Duration: 0.896 2023-10-04 08:08:47,103 WARNING [train.py:1204] (2/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_402-102311-0_sp1.1 from training. Duration: 0.6090625 2023-10-04 08:08:48,443 WARNING [train.py:1204] (2/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_202-293523-0 from training. Duration: 0.72 2023-10-04 08:08:48,500 WARNING [train.py:1204] (2/4) Exclude cut with ID IC0679W0038-335846 from training. Duration: 0.768 2023-10-04 08:08:48,669 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.conv_skip_rate, batch_count=1584626.6666666667, ans=0.0 2023-10-04 08:08:50,262 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_37357_210_17024_1_1533089504_2316121_79-198866-0_sp1.1 from training. Duration: 0.936375 2023-10-04 08:08:55,032 WARNING [train.py:1204] (2/4) Exclude cut with ID _7606_210_4915_1_1528894253277_5080690_292-271285-0_sp1.1 from training. Duration: 0.963625 2023-10-04 08:08:55,050 WARNING [train.py:1204] (2/4) Exclude cut with ID _3541_210_2477_1_1526475408709_3263973_377-296839-0_sp0.9 from training. Duration: 0.611125 2023-10-04 08:08:55,055 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_45886_210_17024_1_1533704917_2435333_58-131069-0_sp1.1 from training. Duration: 0.963625 2023-10-04 08:08:57,891 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6961_210_2087_1_1527937168_6815011_395-185060-0 from training. Duration: 0.95 2023-10-04 08:09:00,805 WARNING [train.py:1204] (2/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_304-266479-0_sp1.1 from training. Duration: 0.97275 2023-10-04 08:09:02,553 WARNING [train.py:1204] (2/4) Exclude cut with ID _3867_210_1883_1_1526641200575_4475830_319-349850-0_sp1.1 from training. Duration: 0.6090625 2023-10-04 08:09:02,562 WARNING [train.py:1204] (2/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_304-59227-0_sp0.9 from training. Duration: 0.8555625 2023-10-04 08:09:02,628 WARNING [train.py:1204] (2/4) Exclude cut with ID _8021_210_7994_1_1528376451358_3653069_100-140231-0_sp0.9 from training. Duration: 0.7444375 2023-10-04 08:09:02,685 WARNING [train.py:1204] (2/4) Exclude cut with ID _8833_210_5219_1_1528592665210_7553059_329-125156-0_sp0.9 from training. Duration: 0.911125 2023-10-04 08:09:07,124 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.bypass_mid.scale_min, batch_count=1584693.3333333333, ans=0.2 2023-10-04 08:09:12,765 WARNING [train.py:1204] (2/4) Exclude cut with ID _14459_210_2228_1_1531443641946_7516700_792-287234-0_sp1.1 from training. Duration: 0.92725 2023-10-04 08:09:12,781 WARNING [train.py:1204] (2/4) Exclude cut with ID _19708_210_9206_1_1532136650733_4238364_347-65240-0_sp1.1 from training. Duration: 0.8818125 2023-10-04 08:09:17,145 WARNING [train.py:1204] (2/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_70-27732-0 from training. Duration: 0.94 2023-10-04 08:09:23,103 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_397-132565-0 from training. Duration: 0.99 2023-10-04 08:09:23,114 WARNING [train.py:1204] (2/4) Exclude cut with ID _49728_210_12651_1_1534041034294_6788150_624-195301-0 from training. Duration: 0.85 2023-10-04 08:09:23,142 WARNING [train.py:1204] (2/4) Exclude cut with ID _55429_210_10626_1_1534467588527_7174143_377-169391-0_sp1.1 from training. Duration: 0.87275 2023-10-04 08:09:24,484 WARNING [train.py:1204] (2/4) Exclude cut with ID _49376_210_5986_1_1533985278732_3370740_354-344695-0_sp1.1 from training. Duration: 0.9 2023-10-04 08:09:32,315 WARNING [train.py:1204] (2/4) Exclude cut with ID _4697_210_2069_1_1526815801953_3823098_276-96593-0_sp1.1 from training. Duration: 0.7 2023-10-04 08:09:32,324 WARNING [train.py:1204] (2/4) Exclude cut with ID _15012_210_1968_1_1530856820429_5095350_688-178849-0_sp0.9 from training. Duration: 0.711125 2023-10-04 08:09:32,362 WARNING [train.py:1204] (2/4) Exclude cut with ID _25565_210_12392_1_1532423009382_3952098_372-69023-0_sp1.1 from training. Duration: 0.936375 2023-10-04 08:09:33,630 WARNING [train.py:1204] (2/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_61-327062-0_sp1.1 from training. Duration: 0.736375 2023-10-04 08:09:33,660 WARNING [train.py:1204] (2/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_215-141059-0 from training. Duration: 0.62 2023-10-04 08:09:36,813 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.bypass.skip_rate, batch_count=1584826.6666666667, ans=0.09899494936611666 2023-10-04 08:09:38,026 WARNING [train.py:1204] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_6-226750-0_sp1.1 from training. Duration: 0.8545625 2023-10-04 08:09:39,353 WARNING [train.py:1204] (2/4) Exclude cut with ID _14348_210_8341_1_1530599434996_3778640_240-232370-0_sp1.1 from training. Duration: 0.763625 2023-10-04 08:09:44,092 WARNING [train.py:1204] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_468-189058-0 from training. Duration: 0.96 2023-10-04 08:09:48,909 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.4.encoder.layers.0.nonlin_attention.whiten2, num_groups=1, num_channels=384, metric=8.55 vs. limit=15.0 2023-10-04 08:09:49,581 INFO [train.py:1046] (2/4) Epoch 45, batch 4000, loss[loss=0.1563, simple_loss=0.2461, pruned_loss=0.03329, over 24584.00 frames. ], tot_loss[loss=0.1543, simple_loss=0.2339, pruned_loss=0.03735, over 4705440.67 frames. ], batch size: 71, lr: 2.25e-03, grad_scale: 16.0 2023-10-04 08:09:51,381 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.711e+02 2.026e+02 2.265e+02 2.595e+02 5.973e+02, threshold=4.529e+02, percent-clipped=1.0 2023-10-04 08:09:51,722 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.conv_module1.balancer1.min_positive, batch_count=1584893.3333333333, ans=0.025 2023-10-04 08:09:51,749 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.balancer1.prob, batch_count=1584893.3333333333, ans=0.125 2023-10-04 08:09:53,097 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.conv_module1.balancer2.prob, batch_count=1584893.3333333333, ans=0.125 2023-10-04 08:09:53,284 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.5.encoder.layers.1.feed_forward3.out_whiten, num_groups=1, num_channels=256, metric=10.99 vs. limit=15.0 2023-10-04 08:09:53,694 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.0.layers.0.feed_forward1.out_whiten, num_groups=1, num_channels=192, metric=14.66 vs. limit=15.0 2023-10-04 08:09:54,177 WARNING [train.py:1204] (2/4) Exclude cut with ID _21987_210_5854_1_1531821500482_3637038_247-93340-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 08:09:57,343 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.balancer1.prob, batch_count=1584893.3333333333, ans=0.125 2023-10-04 08:10:00,530 WARNING [train.py:1204] (2/4) Exclude cut with ID _66868_210_9527_1_1535509287954_4561140_436-191602-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 08:10:00,706 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.conv_module1.balancer1.prob, batch_count=1584893.3333333333, ans=0.125 2023-10-04 08:10:05,472 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.feed_forward1.out_proj.dropout_p, batch_count=1584960.0, ans=0.1 2023-10-04 08:10:06,631 WARNING [train.py:1204] (2/4) Exclude cut with ID _18895_210_6228_1_1531357274234_3784930_394-58203-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 08:10:06,678 WARNING [train.py:1204] (2/4) Exclude cut with ID _58605_210_12210_1_1534723195939_3904259_258-281498-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 08:10:08,023 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_33348_210_5632_1_1533086912_3324030_245-292752-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 08:10:08,051 WARNING [train.py:1204] (2/4) Exclude cut with ID _67831_210_12392_1_1535612464202_3092612_346-2148-0 from training. Duration: 0.93 2023-10-04 08:10:09,428 WARNING [train.py:1204] (2/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_369-222060-0_sp0.9 from training. Duration: 0.788875 2023-10-04 08:10:09,501 WARNING [train.py:1204] (2/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_245-174553-0 from training. Duration: 0.88 2023-10-04 08:10:09,509 WARNING [train.py:1204] (2/4) Exclude cut with ID _3541_210_2477_1_1526475408709_3263973_231-104736-0_sp1.1 from training. Duration: 0.7090625 2023-10-04 08:10:09,515 WARNING [train.py:1204] (2/4) Exclude cut with ID _44585_210_18628_1_1533605350387_4518199_280-73534-0 from training. Duration: 0.89 2023-10-04 08:10:12,255 WARNING [train.py:1204] (2/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_22-201476-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 08:10:13,796 WARNING [train.py:1204] (2/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_325-62802-0_sp0.9 from training. Duration: 0.9666875 2023-10-04 08:10:15,074 WARNING [train.py:1204] (2/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_199-274062-0_sp1.1 from training. Duration: 0.97275 2023-10-04 08:10:15,086 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_8-319957-0_sp1.1 from training. Duration: 0.87275 2023-10-04 08:10:15,108 WARNING [train.py:1204] (2/4) Exclude cut with ID _29343_210_4381_1_1532602757752_3674109_224-349726-0_sp1.1 from training. Duration: 0.936375 2023-10-04 08:10:15,113 WARNING [train.py:1204] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_520-126729-0_sp1.1 from training. Duration: 0.636375 2023-10-04 08:10:16,594 WARNING [train.py:1204] (2/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_239-22076-0_sp1.1 from training. Duration: 0.77275 2023-10-04 08:10:19,736 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9012W0011-501146 from training. Duration: 0.896 2023-10-04 08:10:19,815 WARNING [train.py:1204] (2/4) Exclude cut with ID _29857_210_20034_1_1532395886753_3798160_279-185801-0_sp1.1 from training. Duration: 0.82725 2023-10-04 08:10:20,497 WARNING [train.py:1204] (2/4) Exclude cut with ID _43552_210_8913_1_1533726031059_3523429_261-223933-0_sp1.1 from training. Duration: 0.92725 2023-10-04 08:10:24,308 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9029W0025-509426 from training. Duration: 0.896 2023-10-04 08:10:24,372 WARNING [train.py:1204] (2/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_337-181502-0_sp1.1 from training. Duration: 0.5909375 2023-10-04 08:10:24,385 WARNING [train.py:1204] (2/4) Exclude cut with ID _68635_210_9317_1_1535770813410_4315870_159-332599-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 08:10:29,982 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_161-276996-0 from training. Duration: 0.95 2023-10-04 08:10:31,384 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7088_210_1874_1_1527939776_1101113_127-68870-0_sp1.1 from training. Duration: 0.936375 2023-10-04 08:10:34,546 WARNING [train.py:1204] (2/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_322-217220-0_sp1.1 from training. Duration: 0.7818125 2023-10-04 08:10:35,822 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9072W0002-530636 from training. Duration: 0.768 2023-10-04 08:10:37,308 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_340-336408-0_sp1.1 from training. Duration: 0.8454375 2023-10-04 08:10:37,367 WARNING [train.py:1204] (2/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_395-37222-0 from training. Duration: 0.76 2023-10-04 08:10:37,370 WARNING [train.py:1204] (2/4) Exclude cut with ID _64099_210_17301_1_1535526977656_1578188_159-209156-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 08:10:38,624 WARNING [train.py:1204] (2/4) Exclude cut with ID _53950_210_5816_1_1534417264919_3862951_28-29681-0_sp1.1 from training. Duration: 0.92725 2023-10-04 08:10:40,087 WARNING [train.py:1204] (2/4) Exclude cut with ID _4681_210_3525_1_1527250689464_120889_15-126760-0_sp1.1 from training. Duration: 0.736375 2023-10-04 08:10:40,192 WARNING [train.py:1204] (2/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_242-3472-0_sp1.1 from training. Duration: 0.663625 2023-10-04 08:10:41,498 WARNING [train.py:1204] (2/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_436-186311-0_sp0.9 from training. Duration: 0.72225 2023-10-04 08:10:41,523 WARNING [train.py:1204] (2/4) Exclude cut with ID _3675_210_4557_1_1526639934408_4182089_452-172147-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 08:10:42,972 WARNING [train.py:1204] (2/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_80-147301-0 from training. Duration: 0.84 2023-10-04 08:10:42,993 WARNING [train.py:1204] (2/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_351-107588-0_sp1.1 from training. Duration: 0.92725 2023-10-04 08:10:44,352 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9111W0002-550781 from training. Duration: 0.896 2023-10-04 08:10:44,559 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.feed_forward2.hidden_balancer.prob, batch_count=1585093.3333333333, ans=0.125 2023-10-04 08:10:46,071 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.bypass.skip_rate, batch_count=1585093.3333333333, ans=0.09899494936611666 2023-10-04 08:10:49,252 WARNING [train.py:1204] (2/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_465-156500-0_sp1.1 from training. Duration: 0.6545625 2023-10-04 08:10:52,666 WARNING [train.py:1204] (2/4) Exclude cut with ID _13938_210_9988_1_1530518718978_3693960_304-112784-0_sp1.1 from training. Duration: 0.4181875 2023-10-04 08:10:55,266 WARNING [train.py:1204] (2/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_202-67017-0_sp1.1 from training. Duration: 0.7454375 2023-10-04 08:10:55,312 WARNING [train.py:1204] (2/4) Exclude cut with ID _58414_210_8254_1_1535421609162_7151182_448-4358-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 08:10:55,388 WARNING [train.py:1204] (2/4) Exclude cut with ID _66840_210_5219_1_1535452168426_4196349_203-243234-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 08:10:56,559 WARNING [train.py:1204] (2/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_201-213513-0_sp1.1 from training. Duration: 0.97275 2023-10-04 08:10:59,503 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6291_210_3273_1_1527683649_1078498_117-187560-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 08:11:03,977 INFO [train.py:1046] (2/4) Epoch 45, batch 4050, loss[loss=0.1564, simple_loss=0.2313, pruned_loss=0.0408, over 23833.00 frames. ], tot_loss[loss=0.1546, simple_loss=0.2342, pruned_loss=0.03746, over 4680628.50 frames. ], batch size: 212, lr: 2.25e-03, grad_scale: 16.0 2023-10-04 08:11:04,062 WARNING [train.py:1204] (2/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_135-174080-0_sp0.9 from training. Duration: 0.588875 2023-10-04 08:11:04,105 WARNING [train.py:1204] (2/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_45-265684-0 from training. Duration: 0.72 2023-10-04 08:11:06,014 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_435-313305-0_sp1.1 from training. Duration: 0.6454375 2023-10-04 08:11:06,034 WARNING [train.py:1204] (2/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_235-307337-0_sp1.1 from training. Duration: 0.8545625 2023-10-04 08:11:07,456 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7762_210_1794_1_1528199508_4057408_419-125009-0_sp0.9 from training. Duration: 0.92225 2023-10-04 08:11:08,776 WARNING [train.py:1204] (2/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_215-141059-0_sp1.1 from training. Duration: 0.563625 2023-10-04 08:11:10,235 WARNING [train.py:1204] (2/4) Exclude cut with ID _27013_210_1389_1_1532437173240_3252649_222-125573-0_sp1.1 from training. Duration: 0.8909375 2023-10-04 08:11:13,002 WARNING [train.py:1204] (2/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_126-274694-0_sp1.1 from training. Duration: 0.8909375 2023-10-04 08:11:14,478 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.0.bypass.scale_min, batch_count=1585226.6666666667, ans=0.2 2023-10-04 08:11:15,859 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_2592_210_2290_1_1525697025_4882925_157-70512-0_sp1.1 from training. Duration: 0.82725 2023-10-04 08:11:17,122 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_292-265928-0_sp0.9 from training. Duration: 0.5666875 2023-10-04 08:11:18,980 WARNING [train.py:1204] (2/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_1211-226066-0_sp1.1 from training. Duration: 0.6909375 2023-10-04 08:11:19,027 WARNING [train.py:1204] (2/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_394-265747-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 08:11:22,391 WARNING [train.py:1204] (2/4) Exclude cut with ID _48782_210_12658_1_1534467701544_2300422_239-67139-0_sp1.1 from training. Duration: 0.97275 2023-10-04 08:11:23,745 WARNING [train.py:1204] (2/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_329-113173-0_sp1.1 from training. Duration: 0.563625 2023-10-04 08:11:27,911 WARNING [train.py:1204] (2/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_81-75849-0_sp0.9 from training. Duration: 0.4333125 2023-10-04 08:11:29,236 WARNING [train.py:1204] (2/4) Exclude cut with ID _9235_210_5219_1_1528891154253_8046120_1003-101638-0 from training. Duration: 0.73 2023-10-04 08:11:29,259 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9309W0189-640316 from training. Duration: 0.64 2023-10-04 08:11:31,932 WARNING [train.py:1204] (2/4) Exclude cut with ID _45329_210_7005_1_1534157951211_6774339_503-287883-0_sp1.1 from training. Duration: 0.8 2023-10-04 08:11:35,792 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.bypass.skip_rate, batch_count=1585360.0, ans=0.09899494936611666 2023-10-04 08:11:37,467 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.self_attn_weights.pos_emb_skip_rate, batch_count=1585360.0, ans=0.0 2023-10-04 08:11:38,486 WARNING [train.py:1204] (2/4) Exclude cut with ID _8833_210_5219_1_1528592665210_7553059_611-80313-0 from training. Duration: 0.64 2023-10-04 08:11:38,563 WARNING [train.py:1204] (2/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_335-106626-0_sp1.1 from training. Duration: 0.963625 2023-10-04 08:11:42,657 WARNING [train.py:1204] (2/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_237-44332-0_sp1.1 from training. Duration: 0.8545625 2023-10-04 08:11:45,458 WARNING [train.py:1204] (2/4) Exclude cut with ID _46952_210_5986_1_1534230010357_3107379_178-147341-0_sp1.1 from training. Duration: 0.97275 2023-10-04 08:11:45,508 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_13091_210_7925_1_1530609447_8987354_946-154426-0_sp1.1 from training. Duration: 0.936375 2023-10-04 08:11:45,533 WARNING [train.py:1204] (2/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_326-17061-0_sp1.1 from training. Duration: 0.8545625 2023-10-04 08:11:49,959 WARNING [train.py:1204] (2/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_38-169920-0_sp1.1 from training. Duration: 0.82725 2023-10-04 08:11:51,549 WARNING [train.py:1204] (2/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_283-220372-0 from training. Duration: 0.56 2023-10-04 08:11:53,359 WARNING [train.py:1204] (2/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_252-216674-0_sp1.1 from training. Duration: 0.4909375 2023-10-04 08:11:54,852 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_202-91290-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 08:11:55,086 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.feed_forward1.hidden_balancer.prob, batch_count=1585426.6666666667, ans=0.125 2023-10-04 08:11:56,179 WARNING [train.py:1204] (2/4) Exclude cut with ID _41314_210_12651_1_1533459476543_3766460_80-74184-0 from training. Duration: 0.83 2023-10-04 08:11:58,892 WARNING [train.py:1204] (2/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_411-271358-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 08:12:08,033 WARNING [train.py:1204] (2/4) Exclude cut with ID _68777_210_4353_1_1535810390731_3117904_129-166012-0 from training. Duration: 0.85 2023-10-04 08:12:08,109 WARNING [train.py:1204] (2/4) Exclude cut with ID _44982_210_1511_1_1534312820108_3726029_410-109759-0_sp1.1 from training. Duration: 0.963625 2023-10-04 08:12:08,121 WARNING [train.py:1204] (2/4) Exclude cut with ID _14224_210_4381_1_1530529062037_4264010_364-22817-0_sp1.1 from training. Duration: 0.6454375 2023-10-04 08:12:09,751 WARNING [train.py:1204] (2/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_89-242335-0 from training. Duration: 0.95 2023-10-04 08:12:09,758 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_12868_210_6753_1_1530702479_3610564_151-325626-0 from training. Duration: 0.97 2023-10-04 08:12:09,759 WARNING [train.py:1204] (2/4) Exclude cut with ID _6275_210_2084_1_1528110062213_7669049_786-264332-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 08:12:12,524 WARNING [train.py:1204] (2/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_35-153886-0_sp1.1 from training. Duration: 0.763625 2023-10-04 08:12:12,609 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_24007_210_1794_1_1531918925_1663875_116-100916-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 08:12:13,830 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_162-262717-0_sp1.1 from training. Duration: 0.7545625 2023-10-04 08:12:17,825 INFO [train.py:1046] (2/4) Epoch 45, batch 4100, loss[loss=0.1765, simple_loss=0.2598, pruned_loss=0.04656, over 23805.00 frames. ], tot_loss[loss=0.1555, simple_loss=0.2354, pruned_loss=0.03784, over 4687751.48 frames. ], batch size: 85, lr: 2.25e-03, grad_scale: 8.0 2023-10-04 08:12:20,951 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.704e+02 1.978e+02 2.170e+02 2.458e+02 4.039e+02, threshold=4.339e+02, percent-clipped=0.0 2023-10-04 08:12:21,114 WARNING [train.py:1204] (2/4) Exclude cut with ID _8263_210_1664_1_1528438953494_294100_40-204731-0 from training. Duration: 0.5 2023-10-04 08:12:22,848 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_12868_210_6753_1_1530702479_3610564_416-293746-0 from training. Duration: 0.9 2023-10-04 08:12:24,474 WARNING [train.py:1204] (2/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_605-219732-0 from training. Duration: 0.94 2023-10-04 08:12:25,824 WARNING [train.py:1204] (2/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_454-123002-0 from training. Duration: 0.69 2023-10-04 08:12:25,830 WARNING [train.py:1204] (2/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_25-304273-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 08:12:25,896 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6860_210_4852_1_1527917457_3833327_451-39702-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 08:12:27,371 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_304-16505-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 08:12:27,384 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6291_210_3273_1_1527683649_1078498_4-266348-0_sp1.1 from training. Duration: 0.7090625 2023-10-04 08:12:27,465 WARNING [train.py:1204] (2/4) Exclude cut with ID ID0149W0001-753701 from training. Duration: 0.989 2023-10-04 08:12:31,533 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_41251_210_15368_1_1533429767_4820695_394-55393-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 08:12:31,839 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.conv_module1.balancer2.prob, batch_count=1585626.6666666667, ans=0.125 2023-10-04 08:12:32,916 WARNING [train.py:1204] (2/4) Exclude cut with ID _8793_210_6310_1_1528542985139_2310009_121-179782-0_sp1.1 from training. Duration: 0.72725 2023-10-04 08:12:32,930 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_2729_210_3528_1_1525863505_3882214_421-73047-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 08:12:32,993 WARNING [train.py:1204] (2/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_278-154492-0_sp0.9 from training. Duration: 0.8444375 2023-10-04 08:12:38,982 WARNING [train.py:1204] (2/4) Exclude cut with ID _14170_210_5219_1_1530525949868_4469650_412-317077-0_sp0.9 from training. Duration: 0.8333125 2023-10-04 08:12:39,069 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_727-138317-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 08:12:40,776 WARNING [train.py:1204] (2/4) Exclude cut with ID _25808_210_13292_1_1532343514562_7366109_345-335048-0_sp1.1 from training. Duration: 0.92725 2023-10-04 08:12:40,795 WARNING [train.py:1204] (2/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_506-23200-0 from training. Duration: 0.97 2023-10-04 08:12:40,866 WARNING [train.py:1204] (2/4) Exclude cut with ID _44718_210_5816_1_1534050051869_6469030_643-245187-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 08:12:40,869 WARNING [train.py:1204] (2/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_42-186162-0_sp1.1 from training. Duration: 0.836375 2023-10-04 08:12:41,659 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.2.encoder.layers.0.nonlin_attention.whiten2, num_groups=1, num_channels=384, metric=5.64 vs. limit=15.0 2023-10-04 08:12:42,251 WARNING [train.py:1204] (2/4) Exclude cut with ID _3231_210_2106_1_1526205864934_6784220_171-256004-0_sp1.1 from training. Duration: 0.7818125 2023-10-04 08:12:42,279 WARNING [train.py:1204] (2/4) Exclude cut with ID _25914_210_6400_1_1532141317058_644740_43-248478-0_sp1.1 from training. Duration: 0.936375 2023-10-04 08:12:42,328 WARNING [train.py:1204] (2/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_577-287150-0 from training. Duration: 0.82 2023-10-04 08:12:45,088 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_46015_210_7234_1_1533863319_3087837_376-276068-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 08:12:45,181 WARNING [train.py:1204] (2/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_51-200220-0 from training. Duration: 0.56 2023-10-04 08:12:47,824 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_452-96101-0_sp1.1 from training. Duration: 0.963625 2023-10-04 08:12:49,373 WARNING [train.py:1204] (2/4) Exclude cut with ID _13252_210_1925_1_1530671372047_3336909_263-168388-0_sp1.1 from training. Duration: 0.7818125 2023-10-04 08:12:49,374 WARNING [train.py:1204] (2/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_469-94673-0 from training. Duration: 0.79 2023-10-04 08:12:51,262 WARNING [train.py:1204] (2/4) Exclude cut with ID _14922_210_6228_1_1530707435273_3693310_303-228738-0_sp1.1 from training. Duration: 0.763625 2023-10-04 08:12:52,650 WARNING [train.py:1204] (2/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_262-250826-0_sp1.1 from training. Duration: 0.9 2023-10-04 08:12:52,676 WARNING [train.py:1204] (2/4) Exclude cut with ID _68777_210_4353_1_1535810390731_3117904_129-166012-0_sp1.1 from training. Duration: 0.77275 2023-10-04 08:12:52,814 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.bypass.skip_rate, batch_count=1585693.3333333333, ans=0.04949747468305833 2023-10-04 08:12:56,673 WARNING [train.py:1204] (2/4) Exclude cut with ID _58895_210_8750_1_1535184081320_3649089_265-323810-0 from training. Duration: 0.96 2023-10-04 08:12:58,130 WARNING [train.py:1204] (2/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_120-76843-0_sp0.9 from training. Duration: 0.611125 2023-10-04 08:12:59,458 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7544_210_6753_1_1528587722_3576686_295-258479-0_sp0.9 from training. Duration: 0.9333125 2023-10-04 08:13:00,904 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7046_210_6753_1_1527895337_6920917_783-278109-0 from training. Duration: 0.77 2023-10-04 08:13:00,950 WARNING [train.py:1204] (2/4) Exclude cut with ID _42326_210_7770_1_1533814282531_3707269_113-308323-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 08:13:00,991 WARNING [train.py:1204] (2/4) Exclude cut with ID _7852_210_5219_1_1528286471316_8139400_242-105336-0_sp0.9 from training. Duration: 0.8 2023-10-04 08:13:05,014 WARNING [train.py:1204] (2/4) Exclude cut with ID _56235_210_10640_1_1534557335235_4161099_114-277905-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 08:13:11,030 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_13118_210_7925_1_1530755612_7608297_42-66683-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 08:13:14,311 WARNING [train.py:1204] (2/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_901-79438-0_sp1.1 from training. Duration: 0.8545625 2023-10-04 08:13:15,656 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_30280_210_1881_1_1532872779_3660139_211-182194-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 08:13:21,338 WARNING [train.py:1204] (2/4) Exclude cut with ID _14898_210_9206_1_1530766812377_4177190_621-45732-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 08:13:21,346 WARNING [train.py:1204] (2/4) Exclude cut with ID _35350_210_16040_1_1533040191183_4075299_224-133775-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 08:13:26,491 WARNING [train.py:1204] (2/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_147-293311-0_sp1.1 from training. Duration: 0.8545625 2023-10-04 08:13:27,954 WARNING [train.py:1204] (2/4) Exclude cut with ID _12302_210_5219_1_1529924446466_8107280_835-279754-0_sp1.1 from training. Duration: 0.8181875 2023-10-04 08:13:30,836 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8453_210_4211_1_1528502940_8562730_347-330673-0_sp0.9 from training. Duration: 0.8 2023-10-04 08:13:30,936 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_213-103282-0_sp0.9 from training. Duration: 0.8666875 2023-10-04 08:13:32,179 INFO [train.py:1046] (2/4) Epoch 45, batch 4150, loss[loss=0.156, simple_loss=0.2461, pruned_loss=0.03291, over 24454.00 frames. ], tot_loss[loss=0.156, simple_loss=0.236, pruned_loss=0.03802, over 4693133.40 frames. ], batch size: 69, lr: 2.25e-03, grad_scale: 8.0 2023-10-04 08:13:32,304 WARNING [train.py:1204] (2/4) Exclude cut with ID _49728_210_12651_1_1534041034294_6788150_66-43496-0_sp1.1 from training. Duration: 0.97275 2023-10-04 08:13:32,306 WARNING [train.py:1204] (2/4) Exclude cut with ID _19023_210_4353_1_1531378892552_5820780_28-36615-0_sp1.1 from training. Duration: 0.8818125 2023-10-04 08:13:33,777 WARNING [train.py:1204] (2/4) Exclude cut with ID _14349_210_8341_1_1530615710818_3442470_115-285975-0 from training. Duration: 0.86 2023-10-04 08:13:35,215 WARNING [train.py:1204] (2/4) Exclude cut with ID _12734_210_8341_1_1530183614296_3891929_236-47464-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 08:13:35,267 WARNING [train.py:1204] (2/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_177-236809-0 from training. Duration: 0.49 2023-10-04 08:13:36,496 WARNING [train.py:1204] (2/4) Exclude cut with ID _13678_210_7497_1_1530530069226_3463170_371-3221-0 from training. Duration: 0.9 2023-10-04 08:13:36,511 WARNING [train.py:1204] (2/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_48-338976-0 from training. Duration: 0.89 2023-10-04 08:13:37,895 WARNING [train.py:1204] (2/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_1167-309744-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 08:13:43,099 WARNING [train.py:1204] (2/4) Exclude cut with ID _30327_210_16049_1_1532484161295_3924169_401-70819-0_sp0.9 from training. Duration: 0.9555625 2023-10-04 08:13:43,108 WARNING [train.py:1204] (2/4) Exclude cut with ID _55134_210_21982_1_1534590028377_3882170_346-91022-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 08:13:47,274 WARNING [train.py:1204] (2/4) Exclude cut with ID _14459_210_2228_1_1531443641946_7516700_174-118801-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 08:13:48,607 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_269-278006-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 08:13:48,666 WARNING [train.py:1204] (2/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_390-339137-0_sp0.9 from training. Duration: 0.62225 2023-10-04 08:13:51,325 WARNING [train.py:1204] (2/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_253-308377-0_sp1.1 from training. Duration: 0.4454375 2023-10-04 08:13:51,343 WARNING [train.py:1204] (2/4) Exclude cut with ID _23825_210_13449_1_1531915313526_3497150_240-271367-0_sp1.1 from training. Duration: 0.8818125 2023-10-04 08:13:53,102 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1015-343884-0_sp0.9 from training. Duration: 0.688875 2023-10-04 08:13:57,845 WARNING [train.py:1204] (2/4) Exclude cut with ID _23514_210_1389_1_1531832139785_3031439_335-48210-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 08:14:00,695 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_309-65177-0_sp1.1 from training. Duration: 0.8 2023-10-04 08:14:02,047 WARNING [train.py:1204] (2/4) Exclude cut with ID _14368_210_4220_1_1530532714870_3595529_231-137892-0 from training. Duration: 0.99 2023-10-04 08:14:04,859 WARNING [train.py:1204] (2/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_57-270037-0 from training. Duration: 0.77 2023-10-04 08:14:04,864 WARNING [train.py:1204] (2/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_465-214726-0_sp1.1 from training. Duration: 0.7818125 2023-10-04 08:14:07,643 WARNING [train.py:1204] (2/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_106-300985-0 from training. Duration: 0.9 2023-10-04 08:14:07,648 WARNING [train.py:1204] (2/4) Exclude cut with ID _14632_210_9988_1_1530691279392_3923200_161-129530-0_sp1.1 from training. Duration: 0.87275 2023-10-04 08:14:07,666 WARNING [train.py:1204] (2/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_514-88370-0_sp1.1 from training. Duration: 0.963625 2023-10-04 08:14:11,121 WARNING [train.py:1204] (2/4) Exclude cut with ID _56495_210_4234_1_1534748080334_4136219_305-334505-0_sp1.1 from training. Duration: 0.92725 2023-10-04 08:14:12,544 WARNING [train.py:1204] (2/4) Exclude cut with ID _42875_210_14856_1_1533704397487_4012433_61-264914-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 08:14:15,823 WARNING [train.py:1204] (2/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_91-148831-0 from training. Duration: 0.99 2023-10-04 08:14:18,623 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_435-313305-0_sp0.9 from training. Duration: 0.788875 2023-10-04 08:14:19,981 WARNING [train.py:1204] (2/4) Exclude cut with ID _16744_210_5986_1_1531724405384_3907567_242-111316-0_sp1.1 from training. Duration: 0.8454375 2023-10-04 08:14:20,232 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.feed_forward2.hidden_balancer.prob, batch_count=1586093.3333333333, ans=0.125 2023-10-04 08:14:21,281 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_257-20733-0 from training. Duration: 0.76 2023-10-04 08:14:22,587 WARNING [train.py:1204] (2/4) Exclude cut with ID _8064_210_5399_1_1528372299291_4282973_4-91037-0_sp1.1 from training. Duration: 0.8 2023-10-04 08:14:22,683 WARNING [train.py:1204] (2/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_3-276701-0 from training. Duration: 0.91 2023-10-04 08:14:24,203 WARNING [train.py:1204] (2/4) Exclude cut with ID _3992_210_1747_1_1526644816031_3731060_36-147855-0_sp1.1 from training. Duration: 0.7090625 2023-10-04 08:14:26,195 WARNING [train.py:1204] (2/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_1296-298671-0_sp1.1 from training. Duration: 0.963625 2023-10-04 08:14:26,277 WARNING [train.py:1204] (2/4) Exclude cut with ID _68965_210_1883_1_1535698962272_3497490_309-334593-0_sp1.1 from training. Duration: 0.92725 2023-10-04 08:14:26,370 WARNING [train.py:1204] (2/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_128-296581-0 from training. Duration: 0.74 2023-10-04 08:14:26,370 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_17613_210_2151_1_1531137569_2366160_170-299079-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 08:14:26,372 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6287_210_3273_1_1527509506_2653716_182-199887-0_sp1.1 from training. Duration: 0.463625 2023-10-04 08:14:27,872 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.1.balancer2.prob, batch_count=1586093.3333333333, ans=0.125 2023-10-04 08:14:29,073 WARNING [train.py:1204] (2/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_80-135200-0_sp1.1 from training. Duration: 0.7181875 2023-10-04 08:14:30,712 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.conv_module2.balancer2.prob, batch_count=1586160.0, ans=0.125 2023-10-04 08:14:31,780 WARNING [train.py:1204] (2/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_34-183685-0 from training. Duration: 0.98 2023-10-04 08:14:31,806 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8278_210_2550_1_1528637099_4220413_418-210155-0_sp1.1 from training. Duration: 0.92725 2023-10-04 08:14:31,809 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8853_210_3273_1_1528633899_818968_51-187685-0_sp0.9 from training. Duration: 0.7666875 2023-10-04 08:14:31,824 WARNING [train.py:1204] (2/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_332-221057-0_sp1.1 from training. Duration: 0.6181875 2023-10-04 08:14:33,143 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_2221_210_1988_1_1525597051_4935541_415-296882-0 from training. Duration: 0.77 2023-10-04 08:14:34,418 WARNING [train.py:1204] (2/4) Exclude cut with ID _33640_210_2655_1_1533117605479_4855469_770-278185-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 08:14:34,463 WARNING [train.py:1204] (2/4) Exclude cut with ID _5287_210_2084_1_1527418883040_3878670_333-48508-0_sp1.1 from training. Duration: 0.5454375 2023-10-04 08:14:34,511 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_43802_210_2290_1_1533516203_8829504_142-276839-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 08:14:34,772 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.ff3_skip_rate, batch_count=1586160.0, ans=0.0 2023-10-04 08:14:35,930 WARNING [train.py:1204] (2/4) Exclude cut with ID _28394_210_8913_1_1532430049518_3650118_126-139371-0_sp1.1 from training. Duration: 0.92725 2023-10-04 08:14:36,566 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.4.encoder.layers.0.whiten, num_groups=1, num_channels=384, metric=6.08 vs. limit=12.0 2023-10-04 08:14:37,286 WARNING [train.py:1204] (2/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_76-203867-0 from training. Duration: 0.72 2023-10-04 08:14:37,340 WARNING [train.py:1204] (2/4) Exclude cut with ID _6194_210_1534_1_1527511826160_3941194_14-282876-0_sp0.9 from training. Duration: 0.788875 2023-10-04 08:14:41,896 WARNING [train.py:1204] (2/4) Exclude cut with ID _35355_210_16040_1_1533212988977_4016229_74-190076-0_sp1.1 from training. Duration: 0.72725 2023-10-04 08:14:44,685 WARNING [train.py:1204] (2/4) Exclude cut with ID _33920_210_4220_1_1533257851500_3084399_198-334063-0 from training. Duration: 0.94 2023-10-04 08:14:46,508 INFO [train.py:1046] (2/4) Epoch 45, batch 4200, loss[loss=0.1352, simple_loss=0.1972, pruned_loss=0.03657, over 23405.00 frames. ], tot_loss[loss=0.1546, simple_loss=0.2344, pruned_loss=0.03737, over 4688037.81 frames. ], batch size: 285, lr: 2.25e-03, grad_scale: 8.0 2023-10-04 08:14:46,612 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6291_210_3273_1_1527683649_1078498_4-266348-0_sp0.9 from training. Duration: 0.8666875 2023-10-04 08:14:49,036 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.632e+02 2.036e+02 2.349e+02 2.806e+02 3.824e+02, threshold=4.697e+02, percent-clipped=0.0 2023-10-04 08:14:49,177 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_23856_210_4238_1_1531994785_3087934_122-38075-0_sp1.1 from training. Duration: 0.936375 2023-10-04 08:14:49,266 WARNING [train.py:1204] (2/4) Exclude cut with ID _4697_210_2069_1_1526815801953_3823098_276-96593-0_sp0.9 from training. Duration: 0.8555625 2023-10-04 08:14:50,585 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_308-278734-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 08:14:50,586 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_5190_210_4557_1_1527245078_2965646_353-250603-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 08:14:53,823 WARNING [train.py:1204] (2/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_472-216663-0 from training. Duration: 0.92 2023-10-04 08:14:56,144 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.conv_skip_rate, batch_count=1586226.6666666667, ans=0.0 2023-10-04 08:14:57,156 WARNING [train.py:1204] (2/4) Exclude cut with ID _7537_210_2084_1_1528502395431_3924359_14-312906-0 from training. Duration: 0.84 2023-10-04 08:14:57,301 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=1586226.6666666667, ans=0.1 2023-10-04 08:14:58,516 WARNING [train.py:1204] (2/4) Exclude cut with ID _29003_210_19596_1_1532507391720_3894098_559-236660-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 08:15:00,043 WARNING [train.py:1204] (2/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_312-204798-0_sp1.1 from training. Duration: 0.8454375 2023-10-04 08:15:02,768 WARNING [train.py:1204] (2/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_17-309070-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 08:15:05,657 WARNING [train.py:1204] (2/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_344-152526-0_sp0.9 from training. Duration: 0.588875 2023-10-04 08:15:08,353 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_41452_210_7291_1_1533437687_3941281_405-11559-0_sp1.1 from training. Duration: 0.82725 2023-10-04 08:15:08,384 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_48730_210_5399_1_1533885886_2212769_157-124052-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 08:15:10,248 WARNING [train.py:1204] (2/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_415-78792-0 from training. Duration: 0.81 2023-10-04 08:15:10,260 WARNING [train.py:1204] (2/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_345-340690-0_sp1.1 from training. Duration: 0.8454375 2023-10-04 08:15:11,742 WARNING [train.py:1204] (2/4) Exclude cut with ID _23825_210_13449_1_1531915313526_3497150_124-8138-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 08:15:11,785 WARNING [train.py:1204] (2/4) Exclude cut with ID _49258_210_7994_1_1534122017863_3738861_194-24864-0_sp1.1 from training. Duration: 0.936375 2023-10-04 08:15:13,173 WARNING [train.py:1204] (2/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_369-222060-0_sp1.1 from training. Duration: 0.6454375 2023-10-04 08:15:13,296 WARNING [train.py:1204] (2/4) Exclude cut with ID _12369_210_4381_1_1530356078581_4388520_480-13146-0_sp0.9 from training. Duration: 0.9444375 2023-10-04 08:15:14,708 WARNING [train.py:1204] (2/4) Exclude cut with ID _13253_210_1925_1_1530757775819_3577873_191-144977-0 from training. Duration: 0.78 2023-10-04 08:15:14,728 WARNING [train.py:1204] (2/4) Exclude cut with ID _30304_210_6319_1_1532476827261_7582060_1128-140266-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 08:15:19,357 WARNING [train.py:1204] (2/4) Exclude cut with ID _8772_210_1850_1_1528635595972_3664011_247-336448-0_sp0.9 from training. Duration: 0.82225 2023-10-04 08:15:20,857 WARNING [train.py:1204] (2/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_318-59097-0_sp1.1 from training. Duration: 0.6909375 2023-10-04 08:15:22,235 WARNING [train.py:1204] (2/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_540-131164-0_sp1.1 from training. Duration: 0.836375 2023-10-04 08:15:23,633 WARNING [train.py:1204] (2/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_39-110836-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 08:15:28,577 WARNING [train.py:1204] (2/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_860-96198-0_sp1.1 from training. Duration: 0.663625 2023-10-04 08:15:28,585 WARNING [train.py:1204] (2/4) Exclude cut with ID _15133_210_6228_1_1530794512074_3080379_29-264458-0 from training. Duration: 0.58 2023-10-04 08:15:28,613 WARNING [train.py:1204] (2/4) Exclude cut with ID _55209_210_1531_1_1534675828996_4597160_797-348343-0_sp1.1 from training. Duration: 0.963625 2023-10-04 08:15:29,987 WARNING [train.py:1204] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_23-189234-0_sp1.1 from training. Duration: 0.7545625 2023-10-04 08:15:32,838 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.feed_forward1.out_proj.dropout_p, batch_count=1586426.6666666667, ans=0.1 2023-10-04 08:15:35,463 WARNING [train.py:1204] (2/4) Exclude cut with ID _5410_210_1388_1_1527397055795_1719020_39-112221-0_sp1.1 from training. Duration: 0.62725 2023-10-04 08:15:35,579 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_43802_210_2290_1_1533516203_8829504_100-348441-0_sp1.1 from training. Duration: 0.82725 2023-10-04 08:15:39,766 WARNING [train.py:1204] (2/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_143-86344-0_sp1.1 from training. Duration: 0.7 2023-10-04 08:15:42,436 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.2.encoder.layers.0.nonlin_attention.whiten2, num_groups=1, num_channels=384, metric=6.40 vs. limit=15.0 2023-10-04 08:15:43,215 WARNING [train.py:1204] (2/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_126-29104-0 from training. Duration: 0.85 2023-10-04 08:15:45,867 WARNING [train.py:1204] (2/4) Exclude cut with ID _39846_210_12651_1_1533603651992_1318030_138-31771-0_sp1.1 from training. Duration: 0.963625 2023-10-04 08:15:46,729 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.2.encoder.layers.1.nonlin_attention.whiten2, num_groups=1, num_channels=384, metric=6.27 vs. limit=15.0 2023-10-04 08:15:47,499 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.conv_skip_rate, batch_count=1586493.3333333333, ans=0.0 2023-10-04 08:15:50,629 WARNING [train.py:1204] (2/4) Exclude cut with ID _2220_210_1819_1_1525509109898_3658680_50-99837-0_sp1.1 from training. Duration: 0.4909375 2023-10-04 08:15:51,955 WARNING [train.py:1204] (2/4) Exclude cut with ID _6274_210_2084_1_1527937583837_7494020_836-210550-0_sp1.1 from training. Duration: 0.97275 2023-10-04 08:15:52,636 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.2.encoder.layers.1.conv_module1.whiten, num_groups=1, num_channels=384, metric=8.51 vs. limit=15.0 2023-10-04 08:15:54,609 WARNING [train.py:1204] (2/4) Exclude cut with ID _28323_210_16187_1_1532480395667_4100300_306-74561-0 from training. Duration: 0.94 2023-10-04 08:15:59,850 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_177-58005-0_sp1.1 from training. Duration: 0.563625 2023-10-04 08:16:01,104 INFO [train.py:1046] (2/4) Epoch 45, batch 4250, loss[loss=0.1551, simple_loss=0.2235, pruned_loss=0.04332, over 23444.00 frames. ], tot_loss[loss=0.1536, simple_loss=0.2334, pruned_loss=0.03692, over 4696777.34 frames. ], batch size: 285, lr: 2.25e-03, grad_scale: 8.0 2023-10-04 08:16:04,069 WARNING [train.py:1204] (2/4) Exclude cut with ID _54871_210_22356_1_1534665798253_3856359_19-342632-0_sp1.1 from training. Duration: 0.82725 2023-10-04 08:16:04,091 WARNING [train.py:1204] (2/4) Exclude cut with ID _35355_210_16040_1_1533212988977_4016229_74-190076-0_sp0.9 from training. Duration: 0.888875 2023-10-04 08:16:05,668 WARNING [train.py:1204] (2/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_285-97786-0_sp1.1 from training. Duration: 0.97275 2023-10-04 08:16:08,262 INFO [scaling.py:1022] (2/4) Whitening: name=encoder_embed.convnext.out_whiten, num_groups=1, num_channels=128, metric=4.28 vs. limit=5.0 2023-10-04 08:16:10,052 WARNING [train.py:1204] (2/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_337-181502-0_sp0.9 from training. Duration: 0.72225 2023-10-04 08:16:10,085 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_353-182855-0 from training. Duration: 0.96 2023-10-04 08:16:11,412 WARNING [train.py:1204] (2/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_409-327730-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 08:16:14,772 WARNING [train.py:1204] (2/4) Exclude cut with ID _8030_210_7497_1_1528444886781_3531089_373-19403-0_sp1.1 from training. Duration: 0.97275 2023-10-04 08:16:16,304 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.0.ff2_skip_rate, batch_count=1586626.6666666667, ans=0.0 2023-10-04 08:16:18,855 WARNING [train.py:1204] (2/4) Exclude cut with ID _6789_210_1534_1_1527857937218_3118950_231-340237-0_sp1.1 from training. Duration: 0.936375 2023-10-04 08:16:23,693 WARNING [train.py:1204] (2/4) Exclude cut with ID _6493_210_1747_1_1528459210424_4012470_536-57832-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 08:16:23,707 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7046_210_6753_1_1527895337_6920917_756-116002-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 08:16:25,458 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_37152_210_9697_1_1533106573_3844188_29-239070-0_sp1.1 from training. Duration: 0.8909375 2023-10-04 08:16:25,460 WARNING [train.py:1204] (2/4) Exclude cut with ID _19023_210_4353_1_1531378892552_5820780_61-334539-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 08:16:27,039 WARNING [train.py:1204] (2/4) Exclude cut with ID _14170_210_5219_1_1530525949868_4469650_159-28488-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 08:16:28,905 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_40896_210_15710_1_1533433956_7100802_764-71272-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 08:16:30,842 WARNING [train.py:1204] (2/4) Exclude cut with ID _44902_210_16187_1_1533888006062_3792078_258-178274-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 08:16:33,628 WARNING [train.py:1204] (2/4) Exclude cut with ID _7537_210_2084_1_1528502395431_3924359_14-312906-0_sp1.1 from training. Duration: 0.763625 2023-10-04 08:16:33,902 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.bypass.scale_min, batch_count=1586693.3333333333, ans=0.2 2023-10-04 08:16:35,003 WARNING [train.py:1204] (2/4) Exclude cut with ID _49643_210_7989_1_1534640244224_3939031_674-116639-0_sp1.1 from training. Duration: 0.963625 2023-10-04 08:16:36,294 WARNING [train.py:1204] (2/4) Exclude cut with ID _22826_210_9994_1_1531908201522_3279169_415-161-0 from training. Duration: 0.98 2023-10-04 08:16:39,160 WARNING [train.py:1204] (2/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_264-348896-0 from training. Duration: 0.85 2023-10-04 08:16:39,167 WARNING [train.py:1204] (2/4) Exclude cut with ID _51899_210_20034_1_1534129254466_3669220_233-161492-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 08:16:40,500 WARNING [train.py:1204] (2/4) Exclude cut with ID _31255_210_13427_1_1532865619495_3815143_233-200023-0_sp1.1 from training. Duration: 0.936375 2023-10-04 08:16:40,530 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_23850_210_4238_1_1532232597_1632433_100-229879-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 08:16:43,096 WARNING [train.py:1204] (2/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_375-245677-0_sp0.9 from training. Duration: 0.9 2023-10-04 08:16:43,099 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_40223_210_6228_1_1533298404_4812267_355-317415-0_sp1.1 from training. Duration: 0.97275 2023-10-04 08:16:43,141 WARNING [train.py:1204] (2/4) Exclude cut with ID _9679_210_7497_1_1529129212906_4363929_235-81488-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 08:16:46,610 WARNING [train.py:1204] (2/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_172-215932-0_sp1.1 from training. Duration: 0.6 2023-10-04 08:16:47,990 WARNING [train.py:1204] (2/4) Exclude cut with ID _12369_210_4381_1_1530356078581_4388520_64-298917-0_sp1.1 from training. Duration: 0.8 2023-10-04 08:16:50,858 WARNING [train.py:1204] (2/4) Exclude cut with ID _38020_210_5986_1_1533112252259_7006089_131-53533-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 08:16:53,620 WARNING [train.py:1204] (2/4) Exclude cut with ID _42334_210_7770_1_1534206610396_3605880_392-48154-0_sp1.1 from training. Duration: 0.963625 2023-10-04 08:16:53,678 WARNING [train.py:1204] (2/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_337-181502-0 from training. Duration: 0.65 2023-10-04 08:16:53,685 WARNING [train.py:1204] (2/4) Exclude cut with ID _6267_210_2084_1_1527764658526_3894730_375-219887-0_sp0.9 from training. Duration: 0.7444375 2023-10-04 08:16:55,017 WARNING [train.py:1204] (2/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_60-146189-0 from training. Duration: 0.98 2023-10-04 08:16:55,105 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_3133_210_2087_1_1526106446_2324690_243-143119-0_sp1.1 from training. Duration: 0.7 2023-10-04 08:16:56,479 WARNING [train.py:1204] (2/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_1267-272693-0_sp0.9 from training. Duration: 0.811125 2023-10-04 08:17:00,154 WARNING [train.py:1204] (2/4) Exclude cut with ID _69884_210_9771_1_1535846392818_4166169_83-182645-0_sp1.1 from training. Duration: 0.97275 2023-10-04 08:17:00,173 WARNING [train.py:1204] (2/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_403-292260-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 08:17:01,624 WARNING [train.py:1204] (2/4) Exclude cut with ID _55429_210_10626_1_1534467588527_7174143_377-169391-0 from training. Duration: 0.96 2023-10-04 08:17:03,408 WARNING [train.py:1204] (2/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_494-199538-0_sp1.1 from training. Duration: 0.6818125 2023-10-04 08:17:04,747 WARNING [train.py:1204] (2/4) Exclude cut with ID _3067_210_2667_1_1526212974640_4503168_119-175776-0_sp1.1 from training. Duration: 0.67275 2023-10-04 08:17:05,026 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.conv_skip_rate, batch_count=1586826.6666666667, ans=0.0 2023-10-04 08:17:07,666 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.1.bypass_mid.scale_min, batch_count=1586826.6666666667, ans=0.2 2023-10-04 08:17:08,815 WARNING [train.py:1204] (2/4) Exclude cut with ID _3231_210_2106_1_1526205864934_6784220_183-303839-0_sp1.1 from training. Duration: 0.97275 2023-10-04 08:17:11,598 WARNING [train.py:1204] (2/4) Exclude cut with ID _18513_210_9206_1_1531618215223_3862620_165-38958-0_sp1.1 from training. Duration: 0.963625 2023-10-04 08:17:11,691 WARNING [train.py:1204] (2/4) Exclude cut with ID _41314_210_12651_1_1533459476543_3766460_87-272068-0_sp1.1 from training. Duration: 0.7545625 2023-10-04 08:17:13,133 WARNING [train.py:1204] (2/4) Exclude cut with ID _3541_210_2477_1_1526475408709_3263973_353-95-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 08:17:15,114 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_2009_210_2071_1_1525481134_4588937_73-302813-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 08:17:16,428 INFO [train.py:1046] (2/4) Epoch 45, batch 4300, loss[loss=0.1582, simple_loss=0.2504, pruned_loss=0.03302, over 24439.00 frames. ], tot_loss[loss=0.1533, simple_loss=0.2333, pruned_loss=0.0366, over 4703335.84 frames. ], batch size: 69, lr: 2.25e-03, grad_scale: 8.0 2023-10-04 08:17:16,531 WARNING [train.py:1204] (2/4) Exclude cut with ID _64220_210_9527_1_1535250151772_4875259_88-95011-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 08:17:16,576 WARNING [train.py:1204] (2/4) Exclude cut with ID _44902_210_16187_1_1533888006062_3792078_160-12107-0_sp1.1 from training. Duration: 0.92725 2023-10-04 08:17:16,589 WARNING [train.py:1204] (2/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_547-5424-0 from training. Duration: 0.94 2023-10-04 08:17:18,123 WARNING [train.py:1204] (2/4) Exclude cut with ID _21783_210_11357_1_1531742458730_3762679_268-85656-0_sp1.1 from training. Duration: 0.936375 2023-10-04 08:17:19,312 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.588e+02 1.964e+02 2.175e+02 2.393e+02 4.014e+02, threshold=4.349e+02, percent-clipped=0.0 2023-10-04 08:17:22,249 WARNING [train.py:1204] (2/4) Exclude cut with ID _66210_210_12651_1_1535446812922_3365850_42-284985-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 08:17:22,281 WARNING [train.py:1204] (2/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_465-214726-0_sp0.9 from training. Duration: 0.9555625 2023-10-04 08:17:23,880 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.conv_module2.balancer1.prob, batch_count=1586893.3333333333, ans=0.125 2023-10-04 08:17:25,213 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.conv_skip_rate, batch_count=1586893.3333333333, ans=0.0 2023-10-04 08:17:28,233 WARNING [train.py:1204] (2/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_50-95770-0_sp1.1 from training. Duration: 0.936375 2023-10-04 08:17:35,050 WARNING [train.py:1204] (2/4) Exclude cut with ID _30800_210_22382_1_1532499347904_4515959_269-77567-0_sp1.1 from training. Duration: 0.963625 2023-10-04 08:17:35,053 WARNING [train.py:1204] (2/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_7-111954-0 from training. Duration: 0.56 2023-10-04 08:17:35,148 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_43802_210_2290_1_1533516203_8829504_85-195240-0_sp0.9 from training. Duration: 0.9666875 2023-10-04 08:17:37,625 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_2822_210_2715_1_1526112586_1999587_165-36656-0_sp0.9 from training. Duration: 0.988875 2023-10-04 08:17:37,661 WARNING [train.py:1204] (2/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_181-78632-0_sp1.1 from training. Duration: 0.7090625 2023-10-04 08:17:37,679 WARNING [train.py:1204] (2/4) Exclude cut with ID IC0671W0044-331887 from training. Duration: 0.896 2023-10-04 08:17:41,727 WARNING [train.py:1204] (2/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_266-137104-0_sp1.1 from training. Duration: 0.5909375 2023-10-04 08:17:43,061 WARNING [train.py:1204] (2/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_137-116442-0_sp0.9 from training. Duration: 0.8666875 2023-10-04 08:17:46,453 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_23801_210_16223_1_1532066392_3294657_570-5439-0 from training. Duration: 0.97 2023-10-04 08:17:46,465 WARNING [train.py:1204] (2/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_29-280622-0_sp1.1 from training. Duration: 0.7454375 2023-10-04 08:17:46,496 WARNING [train.py:1204] (2/4) Exclude cut with ID 3557-8342-0013-54691-0 from training. Duration: 0.92 2023-10-04 08:17:49,324 WARNING [train.py:1204] (2/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_711-15688-0_sp1.1 from training. Duration: 0.3909375 2023-10-04 08:17:50,763 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_15126_210_6753_1_1530778677_2591185_120-277707-0_sp1.1 from training. Duration: 0.72725 2023-10-04 08:17:51,066 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.ff2_skip_rate, batch_count=1587026.6666666667, ans=0.0 2023-10-04 08:17:52,248 WARNING [train.py:1204] (2/4) Exclude cut with ID _14970_210_15709_1_1530775547361_1966965_8-169924-0_sp1.1 from training. Duration: 0.836375 2023-10-04 08:17:52,249 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_4692_210_3528_1_1526899755_4323254_107-44401-0_sp0.9 from training. Duration: 0.9555625 2023-10-04 08:17:53,542 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_3475_210_2028_1_1526193578_3622569_265-276625-0_sp0.9 from training. Duration: 0.8444375 2023-10-04 08:17:54,891 WARNING [train.py:1204] (2/4) Exclude cut with ID _55153_210_2253_1_1534384786677_3162096_148-79022-0_sp1.1 from training. Duration: 0.87275 2023-10-04 08:17:54,976 WARNING [train.py:1204] (2/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_292-36336-0_sp1.1 from training. Duration: 0.92725 2023-10-04 08:17:56,323 WARNING [train.py:1204] (2/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_329-82235-0 from training. Duration: 0.82 2023-10-04 08:17:57,739 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_11305_210_1794_1_1529750428_7178148_257-312870-0 from training. Duration: 0.88 2023-10-04 08:17:59,794 WARNING [train.py:1204] (2/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_436-4493-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 08:18:02,625 WARNING [train.py:1204] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_321-54129-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 08:18:03,275 WARNING [train.py:1204] (2/4) Exclude cut with ID _5287_210_2084_1_1527418883040_3878670_333-48508-0_sp0.9 from training. Duration: 0.6666875 2023-10-04 08:18:03,304 WARNING [train.py:1204] (2/4) Exclude cut with ID _26528_210_9070_1_1532570410842_3851310_244-278158-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 08:18:04,456 WARNING [train.py:1204] (2/4) Exclude cut with ID _46952_210_5986_1_1534230010357_3107379_232-344503-0_sp1.1 from training. Duration: 0.87275 2023-10-04 08:18:04,466 WARNING [train.py:1204] (2/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_43-206757-0 from training. Duration: 0.75 2023-10-04 08:18:04,468 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_4183_210_2087_1_1526729100_1869838_141-166600-0 from training. Duration: 0.95 2023-10-04 08:18:04,524 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_296-287678-0 from training. Duration: 0.95 2023-10-04 08:18:05,924 WARNING [train.py:1204] (2/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_265-60343-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 08:18:05,961 WARNING [train.py:1204] (2/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_332-256168-0 from training. Duration: 0.75 2023-10-04 08:18:06,000 WARNING [train.py:1204] (2/4) Exclude cut with ID _13679_210_7497_1_1530534293662_3355098_411-186088-0 from training. Duration: 0.94 2023-10-04 08:18:08,714 WARNING [train.py:1204] (2/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_458-126779-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 08:18:10,098 WARNING [train.py:1204] (2/4) Exclude cut with ID IC0803W0380-397572 from training. Duration: 0.032 2023-10-04 08:18:11,471 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_278-272612-0_sp1.1 from training. Duration: 0.663625 2023-10-04 08:18:13,490 WARNING [train.py:1204] (2/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_491-263101-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 08:18:13,495 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_53555_210_5632_1_1534595220_7232297_334-336778-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 08:18:14,900 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_50-3289-0 from training. Duration: 0.91 2023-10-04 08:18:15,162 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.feed_forward1.out_proj.dropout_p, batch_count=1587160.0, ans=0.1 2023-10-04 08:18:16,362 WARNING [train.py:1204] (2/4) Exclude cut with ID _8699_210_2069_1_1528630346831_3960409_155-39766-0_sp0.9 from training. Duration: 0.8666875 2023-10-04 08:18:16,374 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_502-82145-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 08:18:16,431 WARNING [train.py:1204] (2/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_550-43132-0_sp1.1 from training. Duration: 0.97275 2023-10-04 08:18:17,728 WARNING [train.py:1204] (2/4) Exclude cut with ID _14998_210_5219_1_1530705582080_7474970_446-112117-0_sp1.1 from training. Duration: 0.8181875 2023-10-04 08:18:17,794 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_3691_210_3528_1_1526468169_4062053_355-334432-0_sp1.1 from training. Duration: 0.7909375 2023-10-04 08:18:20,612 WARNING [train.py:1204] (2/4) Exclude cut with ID _58605_210_12210_1_1534723195939_3904259_239-94720-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 08:18:23,313 WARNING [train.py:1204] (2/4) Exclude cut with ID _49376_210_5986_1_1533985278732_3370740_391-344072-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 08:18:23,499 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.self_attn_weights.pos_emb_skip_rate, batch_count=1587160.0, ans=0.0 2023-10-04 08:18:24,639 WARNING [train.py:1204] (2/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_558-322014-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 08:18:24,678 WARNING [train.py:1204] (2/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_256-32662-0_sp1.1 from training. Duration: 0.8181875 2023-10-04 08:18:27,663 WARNING [train.py:1204] (2/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_572-185150-0 from training. Duration: 0.97 2023-10-04 08:18:29,664 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_89-35748-0_sp0.9 from training. Duration: 0.87775 2023-10-04 08:18:30,994 INFO [train.py:1046] (2/4) Epoch 45, batch 4350, loss[loss=0.1536, simple_loss=0.2309, pruned_loss=0.0382, over 23619.00 frames. ], tot_loss[loss=0.1542, simple_loss=0.2343, pruned_loss=0.03703, over 4709732.73 frames. ], batch size: 256, lr: 2.25e-03, grad_scale: 8.0 2023-10-04 08:18:33,088 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6291_210_3273_1_1527683649_1078498_88-66747-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 08:18:35,784 WARNING [train.py:1204] (2/4) Exclude cut with ID _19504_210_9994_1_1531648863242_3325220_357-237029-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 08:18:38,484 WARNING [train.py:1204] (2/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_302-2550-0_sp1.1 from training. Duration: 0.736375 2023-10-04 08:18:38,484 WARNING [train.py:1204] (2/4) Exclude cut with ID _9461_210_4557_1_1529060514726_3677451_59-194017-0_sp1.1 from training. Duration: 0.963625 2023-10-04 08:18:42,022 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.0.layers.1.self_attn2.whiten, num_groups=1, num_channels=192, metric=11.55 vs. limit=22.5 2023-10-04 08:18:44,441 WARNING [train.py:1204] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_665-196155-0_sp1.1 from training. Duration: 0.6090625 2023-10-04 08:18:46,742 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.0.layers.1.feed_forward2.out_whiten, num_groups=1, num_channels=192, metric=4.14 vs. limit=15.0 2023-10-04 08:18:47,865 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.3.encoder.layers.1.feed_forward2.out_whiten, num_groups=1, num_channels=512, metric=7.00 vs. limit=15.0 2023-10-04 08:18:48,624 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_326-315007-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 08:18:51,299 WARNING [train.py:1204] (2/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_105-328174-0_sp1.1 from training. Duration: 0.6909375 2023-10-04 08:18:51,310 WARNING [train.py:1204] (2/4) Exclude cut with ID _56494_210_4220_1_1535284837625_3033169_106-263722-0_sp1.1 from training. Duration: 0.97275 2023-10-04 08:18:54,121 WARNING [train.py:1204] (2/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_770-200430-0_sp0.9 from training. Duration: 0.988875 2023-10-04 08:18:56,769 WARNING [train.py:1204] (2/4) Exclude cut with ID _65640_210_6310_1_1535368201445_3173250_209-13008-0_sp1.1 from training. Duration: 0.9 2023-10-04 08:18:58,225 WARNING [train.py:1204] (2/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_212-149947-0_sp0.9 from training. Duration: 0.811125 2023-10-04 08:19:02,271 WARNING [train.py:1204] (2/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_188-171673-0 from training. Duration: 0.58 2023-10-04 08:19:04,106 WARNING [train.py:1204] (2/4) Exclude cut with ID _58895_210_8750_1_1535184081320_3649089_286-281724-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 08:19:05,557 WARNING [train.py:1204] (2/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_16-254909-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 08:19:10,997 WARNING [train.py:1204] (2/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_52-95655-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 08:19:13,723 WARNING [train.py:1204] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_554-45617-0 from training. Duration: 0.99 2023-10-04 08:19:15,240 WARNING [train.py:1204] (2/4) Exclude cut with ID _14683_210_5219_1_1530698352608_3974859_473-344611-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 08:19:17,076 WARNING [train.py:1204] (2/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_17-25045-0_sp0.9 from training. Duration: 0.6555625 2023-10-04 08:19:17,279 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.nonlin_attention.balancer.prob, batch_count=1587426.6666666667, ans=0.125 2023-10-04 08:19:21,182 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9072W0003-530637 from training. Duration: 0.768 2023-10-04 08:19:21,351 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=1587426.6666666667, ans=0.1 2023-10-04 08:19:23,769 WARNING [train.py:1204] (2/4) Exclude cut with ID _38670_210_15379_1_1533448810659_4391059_76-74025-0_sp1.1 from training. Duration: 0.936375 2023-10-04 08:19:23,825 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6291_210_3273_1_1527683649_1078498_74-292608-0_sp0.9 from training. Duration: 0.8 2023-10-04 08:19:25,219 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9084W0025-536862 from training. Duration: 0.896 2023-10-04 08:19:25,280 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9087W0021-538392 from training. Duration: 0.768 2023-10-04 08:19:25,285 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7543_210_6753_1_1528543323_645515_56-163234-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 08:19:26,501 WARNING [train.py:1204] (2/4) Exclude cut with ID _45560_210_21982_1_1533812603951_3584999_44-332643-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 08:19:26,588 WARNING [train.py:1204] (2/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_1285-256622-0_sp1.1 from training. Duration: 0.763625 2023-10-04 08:19:27,966 WARNING [train.py:1204] (2/4) Exclude cut with ID _52781_210_16187_1_1534297729099_1118149_141-325632-0_sp1.1 from training. Duration: 0.936375 2023-10-04 08:19:29,252 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_40896_210_15710_1_1533433956_7100802_1051-137631-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 08:19:29,301 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_13091_210_7925_1_1530609447_8987354_233-18333-0_sp1.1 from training. Duration: 0.97275 2023-10-04 08:19:29,557 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.nonlin_attention.balancer.prob, batch_count=1587493.3333333333, ans=0.125 2023-10-04 08:19:32,021 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_34008_210_10431_1_1533345424_8183247_662-317331-0 from training. Duration: 0.94 2023-10-04 08:19:32,027 WARNING [train.py:1204] (2/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_291-248920-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 08:19:32,029 WARNING [train.py:1204] (2/4) Exclude cut with ID _65263_210_22356_1_1535763598691_3921037_274-288481-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 08:19:32,049 WARNING [train.py:1204] (2/4) Exclude cut with ID _5189_210_4557_1_1527248319428_3769229_233-168080-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 08:19:33,993 WARNING [train.py:1204] (2/4) Exclude cut with ID _54871_210_22356_1_1534665798253_3856359_135-184281-0 from training. Duration: 0.99 2023-10-04 08:19:35,936 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9123W0012-555972 from training. Duration: 0.896 2023-10-04 08:19:35,940 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9123W0118-556077 from training. Duration: 0.896 2023-10-04 08:19:35,949 WARNING [train.py:1204] (2/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_406-275649-0 from training. Duration: 0.42 2023-10-04 08:19:38,768 WARNING [train.py:1204] (2/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_309-13684-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 08:19:38,795 WARNING [train.py:1204] (2/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_129-337802-0_sp1.1 from training. Duration: 0.6545625 2023-10-04 08:19:40,078 WARNING [train.py:1204] (2/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_649-266825-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 08:19:40,139 WARNING [train.py:1204] (2/4) Exclude cut with ID _40519_210_13794_1_1533715228655_7337390_895-224742-0_sp1.1 from training. Duration: 0.92725 2023-10-04 08:19:42,878 WARNING [train.py:1204] (2/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_234-14174-0 from training. Duration: 0.82 2023-10-04 08:19:44,206 INFO [train.py:1046] (2/4) Epoch 45, batch 4400, loss[loss=0.1622, simple_loss=0.2391, pruned_loss=0.0426, over 23731.00 frames. ], tot_loss[loss=0.1546, simple_loss=0.2349, pruned_loss=0.03719, over 4719455.12 frames. ], batch size: 232, lr: 2.25e-03, grad_scale: 16.0 2023-10-04 08:19:45,645 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9156W0009-572502 from training. Duration: 0.768 2023-10-04 08:19:45,652 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_424-245529-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 08:19:46,927 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.624e+02 2.000e+02 2.178e+02 2.504e+02 3.543e+02, threshold=4.357e+02, percent-clipped=0.0 2023-10-04 08:19:49,242 WARNING [train.py:1204] (2/4) Exclude cut with ID _26528_210_9070_1_1532570410842_3851310_232-10149-0_sp1.1 from training. Duration: 0.963625 2023-10-04 08:19:49,249 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_17292_210_6753_1_1531125388_2943734_152-30410-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 08:19:50,661 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_35441_210_16554_1_1533347572_4092680_291-82075-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 08:19:52,096 WARNING [train.py:1204] (2/4) Exclude cut with ID _12369_210_4381_1_1530356078581_4388520_64-298917-0 from training. Duration: 0.88 2023-10-04 08:19:52,116 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_5618_210_1874_1_1527311008_5851_1-214501-0_sp0.9 from training. Duration: 0.7655625 2023-10-04 08:19:53,476 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_9460_210_4211_1_1528954587_10049667_572-37035-0 from training. Duration: 0.8 2023-10-04 08:19:53,504 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9193W0023-590322 from training. Duration: 0.896 2023-10-04 08:19:54,813 WARNING [train.py:1204] (2/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_206-10097-0_sp1.1 from training. Duration: 0.4454375 2023-10-04 08:19:54,833 WARNING [train.py:1204] (2/4) Exclude cut with ID _55134_210_21982_1_1534590028377_3882170_17-51731-0_sp1.1 from training. Duration: 0.963625 2023-10-04 08:19:56,164 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_14953_210_2151_1_1530701710_5616165_710-165400-0 from training. Duration: 0.87 2023-10-04 08:19:56,316 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.balancer1.prob, batch_count=1587560.0, ans=0.125 2023-10-04 08:19:58,984 WARNING [train.py:1204] (2/4) Exclude cut with ID _53203_210_2087_1_1534246218309_475590_61-85777-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 08:20:00,299 WARNING [train.py:1204] (2/4) Exclude cut with ID _55844_210_2346_1_1534471212569_7625707_1050-10453-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 08:20:00,309 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9218W0013-603297 from training. Duration: 0.896 2023-10-04 08:20:03,082 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_20120_210_6753_1_1531643035_5482859_267-102818-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 08:20:03,083 WARNING [train.py:1204] (2/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_191-41023-0 from training. Duration: 0.71 2023-10-04 08:20:05,117 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9231W0202-610227 from training. Duration: 0.896 2023-10-04 08:20:05,650 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.4.encoder.layers.2.feed_forward2.out_whiten, num_groups=1, num_channels=384, metric=8.89 vs. limit=15.0 2023-10-04 08:20:08,318 WARNING [train.py:1204] (2/4) Exclude cut with ID _6537_210_1883_1_1527987559710_3597129_382-133062-0 from training. Duration: 0.68 2023-10-04 08:20:08,366 WARNING [train.py:1204] (2/4) Exclude cut with ID _28427_210_4220_1_1532581240967_3061069_43-84928-0 from training. Duration: 0.99 2023-10-04 08:20:08,395 WARNING [train.py:1204] (2/4) Exclude cut with ID _59256_210_5986_1_1535076085997_3589120_121-237386-0 from training. Duration: 0.96 2023-10-04 08:20:09,165 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.2.encoder.layers.1.nonlin_attention.whiten1, num_groups=1, num_channels=288, metric=5.35 vs. limit=10.0 2023-10-04 08:20:09,711 WARNING [train.py:1204] (2/4) Exclude cut with ID _8480_210_1874_1_1528589391205_6728330_137-256986-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 08:20:09,795 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_9417_210_1794_1_1529235261_4614131_479-346963-0_sp1.1 from training. Duration: 0.936375 2023-10-04 08:20:09,826 WARNING [train.py:1204] (2/4) Exclude cut with ID _26474_210_9570_1_1532520022699_3623380_267-7362-0_sp1.1 from training. Duration: 0.936375 2023-10-04 08:20:11,175 WARNING [train.py:1204] (2/4) Exclude cut with ID _55328_210_27767_1_1534580973260_4088170_287-61272-0_sp1.1 from training. Duration: 0.97275 2023-10-04 08:20:13,927 WARNING [train.py:1204] (2/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_589-9070-0 from training. Duration: 0.71 2023-10-04 08:20:13,942 WARNING [train.py:1204] (2/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_148-158509-0_sp1.1 from training. Duration: 0.37275 2023-10-04 08:20:15,304 WARNING [train.py:1204] (2/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_164-184548-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 08:20:16,699 WARNING [train.py:1204] (2/4) Exclude cut with ID _14970_210_15709_1_1530775547361_1966965_6-23328-0_sp1.1 from training. Duration: 0.7818125 2023-10-04 08:20:16,704 WARNING [train.py:1204] (2/4) Exclude cut with ID _29303_210_2575_1_1532392237303_7410649_612-165069-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 08:20:18,210 WARNING [train.py:1204] (2/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_383-321224-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 08:20:18,258 WARNING [train.py:1204] (2/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_283-160525-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 08:20:18,269 WARNING [train.py:1204] (2/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_505-97987-0 from training. Duration: 0.86 2023-10-04 08:20:20,218 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9309W0190-640317 from training. Duration: 0.768 2023-10-04 08:20:23,088 WARNING [train.py:1204] (2/4) Exclude cut with ID _50093_210_4220_1_1534417141207_3432299_280-297390-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 08:20:29,965 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_17292_210_6753_1_1531125388_2943734_320-71385-0_sp1.1 from training. Duration: 0.97275 2023-10-04 08:20:31,445 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_213-103282-0 from training. Duration: 0.78 2023-10-04 08:20:33,002 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.conv_module2.balancer2.min_abs, batch_count=1587760.0, ans=0.5 2023-10-04 08:20:35,963 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7615_210_4211_1_1528110965_4580160_72-235211-0_sp0.9 from training. Duration: 0.8444375 2023-10-04 08:20:39,225 WARNING [train.py:1204] (2/4) Exclude cut with ID _14867_210_1968_1_1530684179890_3846969_173-320977-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 08:20:39,386 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.nonlin_attention.balancer.prob, batch_count=1587760.0, ans=0.125 2023-10-04 08:20:42,014 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7046_210_6753_1_1527895337_6920917_783-278109-0_sp0.9 from training. Duration: 0.8555625 2023-10-04 08:20:42,075 WARNING [train.py:1204] (2/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_90-30673-0 from training. Duration: 0.84 2023-10-04 08:20:43,345 WARNING [train.py:1204] (2/4) Exclude cut with ID _22826_210_9994_1_1531908201522_3279169_415-161-0_sp1.1 from training. Duration: 0.8909375 2023-10-04 08:20:43,362 WARNING [train.py:1204] (2/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_86-66192-0_sp0.9 from training. Duration: 0.611125 2023-10-04 08:20:43,365 WARNING [train.py:1204] (2/4) Exclude cut with ID _4626_210_3532_1_1526817652717_3916859_446-206656-0_sp1.1 from training. Duration: 0.6909375 2023-10-04 08:20:44,765 WARNING [train.py:1204] (2/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_166-128941-0_sp0.9 from training. Duration: 0.6 2023-10-04 08:20:47,552 WARNING [train.py:1204] (2/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_345-167986-0 from training. Duration: 0.84 2023-10-04 08:20:51,639 WARNING [train.py:1204] (2/4) Exclude cut with ID _6275_210_2084_1_1528110062213_7669049_973-198085-0 from training. Duration: 0.75 2023-10-04 08:20:53,326 WARNING [train.py:1204] (2/4) Exclude cut with ID _60057_210_10640_1_1534903065793_3333150_171-181133-0 from training. Duration: 0.88 2023-10-04 08:20:53,340 WARNING [train.py:1204] (2/4) Exclude cut with ID _19504_210_9994_1_1531648863242_3325220_336-196513-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 08:20:53,347 WARNING [train.py:1204] (2/4) Exclude cut with ID _6493_210_1747_1_1528459210424_4012470_513-140967-0 from training. Duration: 0.79 2023-10-04 08:20:54,747 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6673_210_3273_1_1527767790_3173240_327-306185-0_sp0.9 from training. Duration: 0.92225 2023-10-04 08:20:57,384 INFO [train.py:1046] (2/4) Epoch 45, batch 4450, loss[loss=0.1536, simple_loss=0.2426, pruned_loss=0.03229, over 24576.00 frames. ], tot_loss[loss=0.1551, simple_loss=0.2352, pruned_loss=0.03747, over 4713729.47 frames. ], batch size: 71, lr: 2.25e-03, grad_scale: 16.0 2023-10-04 08:20:57,472 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1032-236863-0_sp1.1 from training. Duration: 0.763625 2023-10-04 08:21:00,198 WARNING [train.py:1204] (2/4) Exclude cut with ID _14867_210_1968_1_1530684179890_3846969_413-29292-0 from training. Duration: 0.9 2023-10-04 08:21:02,986 WARNING [train.py:1204] (2/4) Exclude cut with ID _47251_210_18628_1_1533814063128_4137250_186-343322-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 08:21:04,410 WARNING [train.py:1204] (2/4) Exclude cut with ID _56235_210_10640_1_1534557335235_4161099_624-46091-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 08:21:04,442 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_791-197891-0_sp0.9 from training. Duration: 0.9333125 2023-10-04 08:21:04,594 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.balancer1.prob, batch_count=1587893.3333333333, ans=0.125 2023-10-04 08:21:12,839 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_50050_210_16223_1_1535435796_7290138_786-288457-0_sp1.1 from training. Duration: 0.92725 2023-10-04 08:21:12,863 WARNING [train.py:1204] (2/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_213-227283-0_sp1.1 from training. Duration: 0.963625 2023-10-04 08:21:15,716 WARNING [train.py:1204] (2/4) Exclude cut with ID _42659_210_2070_1_1533607442765_3120380_58-341121-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 08:21:18,325 WARNING [train.py:1204] (2/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_506-23200-0_sp1.1 from training. Duration: 0.8818125 2023-10-04 08:21:21,183 WARNING [train.py:1204] (2/4) Exclude cut with ID _14170_210_5219_1_1530525949868_4469650_336-213530-0_sp1.1 from training. Duration: 0.8181875 2023-10-04 08:21:21,214 WARNING [train.py:1204] (2/4) Exclude cut with ID _64135_210_2253_1_1535677267876_7608972_180-5312-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 08:21:22,531 WARNING [train.py:1204] (2/4) Exclude cut with ID _12302_210_5219_1_1529924446466_8107280_835-279754-0 from training. Duration: 0.9 2023-10-04 08:21:22,533 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_20120_210_6753_1_1531643035_5482859_510-96996-0_sp1.1 from training. Duration: 0.7909375 2023-10-04 08:21:23,909 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_15517_210_6753_1_1530954008_3487336_216-187834-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 08:21:23,940 WARNING [train.py:1204] (2/4) Exclude cut with ID _19504_210_9994_1_1531648863242_3325220_414-113427-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 08:21:23,941 WARNING [train.py:1204] (2/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_264-108685-0_sp0.9 from training. Duration: 0.611125 2023-10-04 08:21:25,884 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_5438_210_1794_1_1527336687_2600009_104-288274-0_sp0.9 from training. Duration: 0.6444375 2023-10-04 08:21:31,351 WARNING [train.py:1204] (2/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_520-39496-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 08:21:31,399 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_37818_210_4238_1_1533620910_4720083_315-308565-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 08:21:32,918 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_2009_210_2071_1_1525481134_4588937_385-302100-0_sp1.1 from training. Duration: 0.7909375 2023-10-04 08:21:34,204 WARNING [train.py:1204] (2/4) Exclude cut with ID _58895_210_8750_1_1535184081320_3649089_277-53219-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 08:21:34,290 WARNING [train.py:1204] (2/4) Exclude cut with ID _26664_210_7770_1_1532258943280_3418270_121-111258-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 08:21:38,972 WARNING [train.py:1204] (2/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_99-167392-0_sp1.1 from training. Duration: 0.5545625 2023-10-04 08:21:40,299 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1113-7369-0 from training. Duration: 0.85 2023-10-04 08:21:40,314 WARNING [train.py:1204] (2/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_352-59958-0 from training. Duration: 0.86 2023-10-04 08:21:40,318 WARNING [train.py:1204] (2/4) Exclude cut with ID _14886_210_4220_1_1530702020355_3516190_159-73748-0_sp1.1 from training. Duration: 0.7545625 2023-10-04 08:21:43,755 WARNING [train.py:1204] (2/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_131-119569-0_sp1.1 from training. Duration: 0.92725 2023-10-04 08:21:43,829 WARNING [train.py:1204] (2/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_216-343007-0 from training. Duration: 0.66 2023-10-04 08:21:47,752 WARNING [train.py:1204] (2/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_113-92608-0_sp0.9 from training. Duration: 0.6 2023-10-04 08:21:50,597 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_40047_210_2290_1_1533290519_2374311_132-123271-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 08:21:50,648 WARNING [train.py:1204] (2/4) Exclude cut with ID _8793_210_6310_1_1528542985139_2310009_121-179782-0 from training. Duration: 0.8 2023-10-04 08:21:50,663 WARNING [train.py:1204] (2/4) Exclude cut with ID _43997_210_4525_1_1533538632555_5189329_283-218639-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 08:21:50,665 WARNING [train.py:1204] (2/4) Exclude cut with ID _49376_210_5986_1_1533985278732_3370740_297-78397-0_sp1.1 from training. Duration: 0.936375 2023-10-04 08:21:50,679 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_3475_210_2028_1_1526193578_3622569_377-166133-0_sp0.9 from training. Duration: 0.9555625 2023-10-04 08:21:50,694 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_520-213149-0_sp1.1 from training. Duration: 0.92725 2023-10-04 08:21:53,413 WARNING [train.py:1204] (2/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_567-144483-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 08:21:56,831 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_4183_210_2087_1_1526729100_1869838_143-280962-0_sp0.9 from training. Duration: 0.62225 2023-10-04 08:21:58,173 WARNING [train.py:1204] (2/4) Exclude cut with ID _49643_210_7989_1_1534640244224_3939031_611-185515-0 from training. Duration: 0.94 2023-10-04 08:21:59,614 WARNING [train.py:1204] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_88-341718-0_sp0.9 from training. Duration: 0.7333125 2023-10-04 08:22:01,017 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_51225_210_3601_1_1534400634_2327383_49-235441-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 08:22:02,402 WARNING [train.py:1204] (2/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_240-294400-0_sp1.1 from training. Duration: 0.936375 2023-10-04 08:22:03,864 WARNING [train.py:1204] (2/4) Exclude cut with ID _55429_210_10626_1_1534467588527_7174143_640-304825-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 08:22:03,892 WARNING [train.py:1204] (2/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_187-221566-0_sp1.1 from training. Duration: 0.3909375 2023-10-04 08:22:04,646 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.1.encoder.layers.0.self_attn_weights.whiten_keys, num_groups=4, num_channels=128, metric=3.61 vs. limit=6.0 2023-10-04 08:22:07,260 WARNING [train.py:1204] (2/4) Exclude cut with ID _7606_210_4915_1_1528894253277_5080690_127-177761-0_sp0.9 from training. Duration: 0.711125 2023-10-04 08:22:08,819 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.conv_module2.balancer1.max_abs, batch_count=1588160.0, ans=10.0 2023-10-04 08:22:10,566 WARNING [train.py:1204] (2/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_430-116139-0 from training. Duration: 0.85 2023-10-04 08:22:11,841 INFO [train.py:1046] (2/4) Epoch 45, batch 4500, loss[loss=0.1596, simple_loss=0.2491, pruned_loss=0.03503, over 24543.00 frames. ], tot_loss[loss=0.1557, simple_loss=0.2359, pruned_loss=0.03774, over 4710471.24 frames. ], batch size: 71, lr: 2.25e-03, grad_scale: 16.0 2023-10-04 08:22:13,284 WARNING [train.py:1204] (2/4) Exclude cut with ID _3675_210_4557_1_1526639934408_4182089_456-18305-0_sp0.9 from training. Duration: 0.8444375 2023-10-04 08:22:15,285 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.682e+02 2.063e+02 2.420e+02 3.061e+02 5.300e+02, threshold=4.841e+02, percent-clipped=1.0 2023-10-04 08:22:16,797 WARNING [train.py:1204] (2/4) Exclude cut with ID _18513_210_9206_1_1531618215223_3862620_402-5957-0_sp1.1 from training. Duration: 0.9 2023-10-04 08:22:18,192 WARNING [train.py:1204] (2/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_465-156500-0 from training. Duration: 0.72 2023-10-04 08:22:18,193 WARNING [train.py:1204] (2/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_714-310538-0 from training. Duration: 0.86 2023-10-04 08:22:20,796 WARNING [train.py:1204] (2/4) Exclude cut with ID _39411_210_5986_1_1533538825030_3343649_335-301187-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 08:22:23,771 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_41750_210_1960_1_1533520678_3948042_241-158616-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 08:22:25,214 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_46013_210_7234_1_1533776362_3744032_50-141535-0_sp1.1 from training. Duration: 0.9 2023-10-04 08:22:27,117 WARNING [train.py:1204] (2/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_43-206757-0_sp0.9 from training. Duration: 0.8333125 2023-10-04 08:22:28,454 WARNING [train.py:1204] (2/4) Exclude cut with ID _22961_210_8254_1_1531976366672_4278180_172-276296-0_sp1.1 from training. Duration: 0.97275 2023-10-04 08:22:28,486 WARNING [train.py:1204] (2/4) Exclude cut with ID _17956_210_4353_1_1531209568125_6979970_1305-301903-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 08:22:28,518 WARNING [train.py:1204] (2/4) Exclude cut with ID _28323_210_16187_1_1532480395667_4100300_604-44507-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 08:22:40,031 WARNING [train.py:1204] (2/4) Exclude cut with ID _23514_210_1389_1_1531832139785_3031439_88-337544-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 08:22:40,101 WARNING [train.py:1204] (2/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_932-3672-0_sp1.1 from training. Duration: 0.963625 2023-10-04 08:22:43,224 WARNING [train.py:1204] (2/4) Exclude cut with ID _46502_210_13794_1_1533724452198_3657159_207-109991-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 08:22:44,622 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_216-73841-0_sp1.1 from training. Duration: 0.87275 2023-10-04 08:22:44,705 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8453_210_4211_1_1528502940_8562730_347-330673-0_sp1.1 from training. Duration: 0.6545625 2023-10-04 08:22:51,874 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_4183_210_2087_1_1526729100_1869838_143-280962-0_sp1.1 from training. Duration: 0.5090625 2023-10-04 08:22:54,745 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.1.conv_module2.balancer1.prob, batch_count=1588426.6666666667, ans=0.125 2023-10-04 08:22:56,035 WARNING [train.py:1204] (2/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_325-113798-0_sp1.1 from training. Duration: 0.77275 2023-10-04 08:22:58,042 WARNING [train.py:1204] (2/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_91-212486-0_sp0.9 from training. Duration: 0.7555625 2023-10-04 08:23:00,742 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8776_210_2040_1_1528638837_4088036_244-278763-0_sp1.1 from training. Duration: 0.8181875 2023-10-04 08:23:00,785 WARNING [train.py:1204] (2/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_106-286130-0 from training. Duration: 0.98 2023-10-04 08:23:02,110 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_2746_210_1988_1_1525872816_3795627_372-205861-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 08:23:02,145 WARNING [train.py:1204] (2/4) Exclude cut with ID _30800_210_22382_1_1532499347904_4515959_240-54646-0_sp1.1 from training. Duration: 0.936375 2023-10-04 08:23:02,374 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.feed_forward1.out_proj.dropout_p, batch_count=1588426.6666666667, ans=0.1 2023-10-04 08:23:06,165 WARNING [train.py:1204] (2/4) Exclude cut with ID _59500_210_8254_1_1534809610680_3736009_162-180695-0_sp1.1 from training. Duration: 0.936375 2023-10-04 08:23:06,186 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7544_210_6753_1_1528591327_1471426_146-93357-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 08:23:08,055 WARNING [train.py:1204] (2/4) Exclude cut with ID _42657_210_2070_1_1533520654016_3327008_405-87432-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 08:23:08,085 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_4528_210_3273_1_1527076654_3276777_209-219804-0 from training. Duration: 0.69 2023-10-04 08:23:08,085 WARNING [train.py:1204] (2/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_78-182912-0_sp0.9 from training. Duration: 0.6555625 2023-10-04 08:23:08,092 WARNING [train.py:1204] (2/4) Exclude cut with ID _31255_210_13427_1_1532865619495_3815143_322-114998-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 08:23:12,963 WARNING [train.py:1204] (2/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_848-280957-0_sp0.9 from training. Duration: 0.9666875 2023-10-04 08:23:12,983 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7696_210_2040_1_1528207167_3716099_117-125434-0_sp0.9 from training. Duration: 0.8444375 2023-10-04 08:23:16,905 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_1002-109192-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 08:23:18,917 WARNING [train.py:1204] (2/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_138-64210-0_sp0.9 from training. Duration: 0.811125 2023-10-04 08:23:20,175 WARNING [train.py:1204] (2/4) Exclude cut with ID _25808_210_13292_1_1532343514562_7366109_633-47524-0_sp1.1 from training. Duration: 0.9 2023-10-04 08:23:20,290 WARNING [train.py:1204] (2/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_297-5233-0 from training. Duration: 0.95 2023-10-04 08:23:23,036 WARNING [train.py:1204] (2/4) Exclude cut with ID _8394_210_6702_1_1528590504316_3540189_335-274780-0 from training. Duration: 0.78 2023-10-04 08:23:23,041 WARNING [train.py:1204] (2/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_301-330455-0 from training. Duration: 0.8 2023-10-04 08:23:25,636 INFO [train.py:1046] (2/4) Epoch 45, batch 4550, loss[loss=0.1478, simple_loss=0.24, pruned_loss=0.02783, over 24357.00 frames. ], tot_loss[loss=0.1554, simple_loss=0.2357, pruned_loss=0.03755, over 4713236.84 frames. ], batch size: 74, lr: 2.25e-03, grad_scale: 16.0 2023-10-04 08:23:25,773 WARNING [train.py:1204] (2/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_383-205833-0 from training. Duration: 0.95 2023-10-04 08:23:29,130 WARNING [train.py:1204] (2/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_48-182155-0 from training. Duration: 0.87 2023-10-04 08:23:30,459 WARNING [train.py:1204] (2/4) Exclude cut with ID _14832_210_8750_1_1530691351539_3171348_1-54315-0_sp1.1 from training. Duration: 0.7909375 2023-10-04 08:23:33,235 WARNING [train.py:1204] (2/4) Exclude cut with ID _44982_210_1511_1_1534312820108_3726029_413-100814-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 08:23:33,279 WARNING [train.py:1204] (2/4) Exclude cut with ID _3231_210_2106_1_1526205864934_6784220_104-261392-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 08:23:35,985 WARNING [train.py:1204] (2/4) Exclude cut with ID _24314_210_4915_1_1532135078116_7170179_1-13588-0_sp1.1 from training. Duration: 0.97275 2023-10-04 08:23:38,325 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.nonlin_attention.balancer.prob, batch_count=1588560.0, ans=0.125 2023-10-04 08:23:41,332 WARNING [train.py:1204] (2/4) Exclude cut with ID _44304_210_15374_1_1533639352765_4613129_141-8356-0_sp1.1 from training. Duration: 0.963625 2023-10-04 08:23:42,659 WARNING [train.py:1204] (2/4) Exclude cut with ID _37892_210_19596_1_1533198677897_3964058_374-335034-0_sp1.1 from training. Duration: 0.936375 2023-10-04 08:23:44,062 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_195-257512-0_sp0.9 from training. Duration: 0.7444375 2023-10-04 08:23:44,066 WARNING [train.py:1204] (2/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_1267-272693-0_sp1.1 from training. Duration: 0.663625 2023-10-04 08:23:44,067 WARNING [train.py:1204] (2/4) Exclude cut with ID _30196_210_1664_1_1533708496522_7074140_690-326155-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 08:23:46,832 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_68847_210_14656_1_1536070214_2586843_38-20717-0_sp1.1 from training. Duration: 0.97275 2023-10-04 08:23:46,880 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_99-251314-0_sp1.1 from training. Duration: 0.7909375 2023-10-04 08:23:50,116 WARNING [train.py:1204] (2/4) Exclude cut with ID _49376_210_5986_1_1533985278732_3370740_225-95630-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 08:23:52,808 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_2655_210_2087_1_1525688959_3897400_202-147557-0 from training. Duration: 0.85 2023-10-04 08:23:52,946 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.conv_skip_rate, batch_count=1588626.6666666667, ans=0.0 2023-10-04 08:23:54,102 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_5618_210_1874_1_1527311008_5851_1-214501-0_sp1.1 from training. Duration: 0.626375 2023-10-04 08:23:54,179 WARNING [train.py:1204] (2/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_225-43381-0_sp1.1 from training. Duration: 0.8545625 2023-10-04 08:23:55,564 WARNING [train.py:1204] (2/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_242-3472-0 from training. Duration: 0.73 2023-10-04 08:24:00,124 WARNING [train.py:1204] (2/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_339-104791-0 from training. Duration: 0.49 2023-10-04 08:24:01,526 WARNING [train.py:1204] (2/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_488-92261-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 08:24:03,017 WARNING [train.py:1204] (2/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_175-163894-0 from training. Duration: 0.74 2023-10-04 08:24:04,408 WARNING [train.py:1204] (2/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_41-234353-0_sp0.9 from training. Duration: 0.7555625 2023-10-04 08:24:07,285 WARNING [train.py:1204] (2/4) Exclude cut with ID _47251_210_18628_1_1533814063128_4137250_116-275927-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 08:24:07,311 WARNING [train.py:1204] (2/4) Exclude cut with ID _4697_210_2069_1_1526815801953_3823098_61-223324-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 08:24:07,323 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6961_210_2087_1_1527937168_6815011_562-165518-0_sp1.1 from training. Duration: 0.7 2023-10-04 08:24:10,928 WARNING [train.py:1204] (2/4) Exclude cut with ID _21989_210_5854_1_1531899214931_3271360_133-44759-0 from training. Duration: 0.77 2023-10-04 08:24:13,653 WARNING [train.py:1204] (2/4) Exclude cut with ID _23557_210_8750_1_1531890106370_6846255_77-284923-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 08:24:16,518 WARNING [train.py:1204] (2/4) Exclude cut with ID _13678_210_7497_1_1530530069226_3463170_464-296127-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 08:24:16,531 WARNING [train.py:1204] (2/4) Exclude cut with ID _22613_210_5219_1_1532253599686_4090170_393-36663-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 08:24:17,936 WARNING [train.py:1204] (2/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_235-262124-0_sp0.9 from training. Duration: 0.7444375 2023-10-04 08:24:18,030 WARNING [train.py:1204] (2/4) Exclude cut with ID _8480_210_1874_1_1528589391205_6728330_755-112625-0 from training. Duration: 0.74 2023-10-04 08:24:19,379 WARNING [train.py:1204] (2/4) Exclude cut with ID _6456_210_1534_1_1527681634212_3181049_215-309359-0 from training. Duration: 0.88 2023-10-04 08:24:19,395 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_3019_210_2028_1_1526044245_3264865_191-72235-0_sp1.1 from training. Duration: 0.8818125 2023-10-04 08:24:20,756 WARNING [train.py:1204] (2/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_380-162648-0 from training. Duration: 0.94 2023-10-04 08:24:22,798 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_92-46009-0 from training. Duration: 0.54 2023-10-04 08:24:24,074 WARNING [train.py:1204] (2/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_53-300544-0_sp0.9 from training. Duration: 0.7444375 2023-10-04 08:24:25,525 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_68235_210_1925_1_1535863247_4484656_101-250943-0_sp1.1 from training. Duration: 0.97275 2023-10-04 08:24:25,534 WARNING [train.py:1204] (2/4) Exclude cut with ID _14970_210_15709_1_1530775547361_1966965_204-22182-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 08:24:27,474 WARNING [train.py:1204] (2/4) Exclude cut with ID _55153_210_2253_1_1534384786677_3162096_310-130092-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 08:24:27,485 WARNING [train.py:1204] (2/4) Exclude cut with ID _60113_210_16271_1_1535164212481_4386079_559-167191-0_sp1.1 from training. Duration: 0.8454375 2023-10-04 08:24:28,868 WARNING [train.py:1204] (2/4) Exclude cut with ID _14368_210_4220_1_1530532714870_3595529_93-58002-0_sp1.1 from training. Duration: 0.5181875 2023-10-04 08:24:30,392 WARNING [train.py:1204] (2/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_110-224688-0 from training. Duration: 0.97 2023-10-04 08:24:31,834 WARNING [train.py:1204] (2/4) Exclude cut with ID _40402_210_18219_1_1533522324115_2953150_178-178004-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 08:24:31,849 WARNING [train.py:1204] (2/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_321-335475-0_sp1.1 from training. Duration: 0.4090625 2023-10-04 08:24:33,192 WARNING [train.py:1204] (2/4) Exclude cut with ID _66863_210_18053_1_1535698758334_8590319_1092-271784-0 from training. Duration: 0.96 2023-10-04 08:24:33,198 WARNING [train.py:1204] (2/4) Exclude cut with ID _28317_210_12742_1_1532480302381_8510325_607-213951-0_sp1.1 from training. Duration: 0.763625 2023-10-04 08:24:33,208 WARNING [train.py:1204] (2/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_468-283113-0 from training. Duration: 0.96 2023-10-04 08:24:35,993 WARNING [train.py:1204] (2/4) Exclude cut with ID _14922_210_6228_1_1530707435273_3693310_13-165512-0_sp1.1 from training. Duration: 0.7454375 2023-10-04 08:24:36,011 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_69630_210_7234_1_1535788670_910989_56-169762-0_sp1.1 from training. Duration: 0.92725 2023-10-04 08:24:38,743 WARNING [train.py:1204] (2/4) Exclude cut with ID _67335_210_5219_1_1535525975652_4275229_121-80357-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 08:24:38,783 WARNING [train.py:1204] (2/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_126-12079-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 08:24:40,213 INFO [train.py:1046] (2/4) Epoch 45, batch 4600, loss[loss=0.1565, simple_loss=0.2512, pruned_loss=0.0309, over 24650.00 frames. ], tot_loss[loss=0.1541, simple_loss=0.2343, pruned_loss=0.03697, over 4705628.08 frames. ], batch size: 73, lr: 2.25e-03, grad_scale: 16.0 2023-10-04 08:24:40,262 WARNING [train.py:1204] (2/4) Exclude cut with ID _5294_210_1747_1_1527249643928_3696179_342-270278-0_sp1.1 from training. Duration: 0.5 2023-10-04 08:24:40,374 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_30322_210_1925_1_1532843478_826817_35-328264-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 08:24:42,391 WARNING [train.py:1204] (2/4) Exclude cut with ID _8480_210_1874_1_1528589391205_6728330_755-112625-0_sp1.1 from training. Duration: 0.67275 2023-10-04 08:24:43,542 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.632e+02 1.936e+02 2.256e+02 2.645e+02 3.814e+02, threshold=4.511e+02, percent-clipped=0.0 2023-10-04 08:24:46,282 WARNING [train.py:1204] (2/4) Exclude cut with ID _38670_210_15379_1_1533448810659_4391059_132-146412-0_sp1.1 from training. Duration: 0.97275 2023-10-04 08:24:46,372 WARNING [train.py:1204] (2/4) Exclude cut with ID _22020_210_2461_1_1531724164513_6326900_563-39691-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 08:24:49,066 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_9460_210_4211_1_1528954587_10049667_572-37035-0_sp1.1 from training. Duration: 0.72725 2023-10-04 08:24:49,078 WARNING [train.py:1204] (2/4) Exclude cut with ID _68777_210_4353_1_1535810390731_3117904_129-166012-0_sp0.9 from training. Duration: 0.9444375 2023-10-04 08:24:50,390 WARNING [train.py:1204] (2/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_487-91921-0_sp1.1 from training. Duration: 0.963625 2023-10-04 08:24:52,337 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_195-257512-0 from training. Duration: 0.67 2023-10-04 08:24:52,441 WARNING [train.py:1204] (2/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_188-260011-0_sp1.1 from training. Duration: 0.663625 2023-10-04 08:24:56,526 WARNING [train.py:1204] (2/4) Exclude cut with ID _8480_210_1874_1_1528589391205_6728330_309-139896-0_sp0.9 from training. Duration: 0.9 2023-10-04 08:24:56,584 WARNING [train.py:1204] (2/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_866-321248-0_sp1.1 from training. Duration: 0.963625 2023-10-04 08:24:59,658 WARNING [train.py:1204] (2/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_510-216513-0_sp1.1 from training. Duration: 0.97275 2023-10-04 08:25:00,274 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.5.encoder.layers.0.self_attn1.whiten, num_groups=1, num_channels=256, metric=11.58 vs. limit=22.5 2023-10-04 08:25:04,325 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.5.encoder.layers.0.self_attn2.whiten, num_groups=1, num_channels=256, metric=10.36 vs. limit=22.5 2023-10-04 08:25:06,429 WARNING [train.py:1204] (2/4) Exclude cut with ID _6653_210_2629_1_1527919172441_6732679_758-143677-0 from training. Duration: 0.98 2023-10-04 08:25:06,517 WARNING [train.py:1204] (2/4) Exclude cut with ID _55134_210_21982_1_1534590028377_3882170_50-214427-0_sp1.1 from training. Duration: 0.97275 2023-10-04 08:25:09,799 WARNING [train.py:1204] (2/4) Exclude cut with ID _31649_210_6228_1_1532584608063_4091078_482-301330-0_sp1.1 from training. Duration: 0.97275 2023-10-04 08:25:13,263 WARNING [train.py:1204] (2/4) Exclude cut with ID _35799_210_5854_1_1533263487887_3529071_490-326847-0_sp1.1 from training. Duration: 0.9 2023-10-04 08:25:13,270 WARNING [train.py:1204] (2/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_984-90716-0_sp1.1 from training. Duration: 0.963625 2023-10-04 08:25:17,522 WARNING [train.py:1204] (2/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_166-246557-0 from training. Duration: 0.64 2023-10-04 08:25:17,523 WARNING [train.py:1204] (2/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_51-200220-0_sp1.1 from training. Duration: 0.5090625 2023-10-04 08:25:17,597 WARNING [train.py:1204] (2/4) Exclude cut with ID _51382_210_23862_1_1534492801838_3650104_275-92796-0_sp1.1 from training. Duration: 0.936375 2023-10-04 08:25:17,720 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.conv_module2.balancer2.min_abs, batch_count=1589026.6666666667, ans=0.5 2023-10-04 08:25:23,520 WARNING [train.py:1204] (2/4) Exclude cut with ID _29853_210_7497_1_1532505600851_6642110_461-142829-0_sp1.1 from training. Duration: 0.97275 2023-10-04 08:25:24,805 WARNING [train.py:1204] (2/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_788-298907-0_sp0.9 from training. Duration: 0.988875 2023-10-04 08:25:26,267 WARNING [train.py:1204] (2/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_739-2098-0_sp1.1 from training. Duration: 0.836375 2023-10-04 08:25:29,646 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7543_210_6753_1_1528541907_1399322_165-323619-0 from training. Duration: 0.69 2023-10-04 08:25:29,743 WARNING [train.py:1204] (2/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_401-255491-0_sp1.1 from training. Duration: 0.636375 2023-10-04 08:25:34,014 WARNING [train.py:1204] (2/4) Exclude cut with ID _66873_210_5517_1_1535454136506_4293370_24-143283-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 08:25:35,349 WARNING [train.py:1204] (2/4) Exclude cut with ID _14459_210_2228_1_1531443641946_7516700_1221-280439-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 08:25:38,416 WARNING [train.py:1204] (2/4) Exclude cut with ID _59202_210_26984_1_1534816810612_3549174_469-213482-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 08:25:38,417 WARNING [train.py:1204] (2/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_148-158509-0_sp0.9 from training. Duration: 0.4555625 2023-10-04 08:25:38,451 WARNING [train.py:1204] (2/4) Exclude cut with ID _24314_210_4915_1_1532135078116_7170179_863-259095-0_sp1.1 from training. Duration: 0.97275 2023-10-04 08:25:39,750 WARNING [train.py:1204] (2/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_321-22821-0 from training. Duration: 0.89 2023-10-04 08:25:39,779 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_43831_210_2158_1_1533725502_5872791_55-163137-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 08:25:39,823 WARNING [train.py:1204] (2/4) Exclude cut with ID _27430_210_7770_1_1532604689375_1378590_26-25207-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 08:25:41,233 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_17613_210_2151_1_1531137569_2366160_223-316461-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 08:25:41,299 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_53679_210_4852_1_1534556726_4186141_541-213370-0_sp1.1 from training. Duration: 0.963625 2023-10-04 08:25:43,129 WARNING [train.py:1204] (2/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_369-295086-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 08:25:43,183 WARNING [train.py:1204] (2/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_91-267696-0 from training. Duration: 0.87 2023-10-04 08:25:44,503 WARNING [train.py:1204] (2/4) Exclude cut with ID _5294_210_1747_1_1527249643928_3696179_180-100713-0 from training. Duration: 0.81 2023-10-04 08:25:44,536 WARNING [train.py:1204] (2/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_32-335103-0 from training. Duration: 0.82 2023-10-04 08:25:44,541 WARNING [train.py:1204] (2/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_133-312727-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 08:25:47,108 WARNING [train.py:1204] (2/4) Exclude cut with ID _59288_210_5854_1_1535077757774_3412246_391-165934-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 08:25:47,160 WARNING [train.py:1204] (2/4) Exclude cut with ID _28323_210_16187_1_1532480395667_4100300_488-1281-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 08:25:49,855 WARNING [train.py:1204] (2/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_124-193207-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 08:25:54,416 INFO [train.py:1046] (2/4) Epoch 45, batch 4650, loss[loss=0.1524, simple_loss=0.2408, pruned_loss=0.03195, over 24644.00 frames. ], tot_loss[loss=0.1534, simple_loss=0.2332, pruned_loss=0.03679, over 4698156.19 frames. ], batch size: 68, lr: 2.25e-03, grad_scale: 16.0 2023-10-04 08:26:00,130 WARNING [train.py:1204] (2/4) Exclude cut with ID _14087_210_2253_1_1530529224512_3014029_226-242140-0_sp0.9 from training. Duration: 0.9 2023-10-04 08:26:03,481 WARNING [train.py:1204] (2/4) Exclude cut with ID _29277_210_3279_1_1532581109293_3173069_392-3694-0_sp1.1 from training. Duration: 0.936375 2023-10-04 08:26:04,780 WARNING [train.py:1204] (2/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_563-329842-0_sp1.1 from training. Duration: 0.97275 2023-10-04 08:26:04,827 WARNING [train.py:1204] (2/4) Exclude cut with ID _58583_210_5236_1_1534726835266_3574249_391-87755-0_sp1.1 from training. Duration: 0.92725 2023-10-04 08:26:04,860 WARNING [train.py:1204] (2/4) Exclude cut with ID _28427_210_4220_1_1532581240967_3061069_281-237827-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 08:26:04,874 WARNING [train.py:1204] (2/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_44-147898-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 08:26:06,311 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_24007_210_1794_1_1531918925_1663875_125-320772-0_sp1.1 from training. Duration: 0.97275 2023-10-04 08:26:09,169 WARNING [train.py:1204] (2/4) Exclude cut with ID _14368_210_4220_1_1530532714870_3595529_143-117421-0 from training. Duration: 0.58 2023-10-04 08:26:12,515 WARNING [train.py:1204] (2/4) Exclude cut with ID _69584_210_5219_1_1535785133775_4044450_325-7656-0_sp0.9 from training. Duration: 0.9555625 2023-10-04 08:26:15,697 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_37351_210_7925_1_1533012931_3812113_265-85780-0 from training. Duration: 0.88 2023-10-04 08:26:15,727 WARNING [train.py:1204] (2/4) Exclude cut with ID _18516_210_9206_1_1531704653206_3692406_331-237896-0_sp1.1 from training. Duration: 0.936375 2023-10-04 08:26:15,799 WARNING [train.py:1204] (2/4) Exclude cut with ID _14629_210_15709_1_1530604298182_956899_6-232695-0 from training. Duration: 0.94 2023-10-04 08:26:17,077 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7544_210_6753_1_1528587000_691468_87-178599-0_sp1.1 from training. Duration: 0.7909375 2023-10-04 08:26:17,321 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.conv_module1.balancer2.prob, batch_count=1589293.3333333333, ans=0.125 2023-10-04 08:26:18,682 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_326-303945-0 from training. Duration: 0.99 2023-10-04 08:26:18,697 WARNING [train.py:1204] (2/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_503-30719-0 from training. Duration: 0.98 2023-10-04 08:26:18,712 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_11305_210_1794_1_1529750428_7178148_299-344640-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 08:26:18,774 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_53071_210_16554_1_1534490405_2954066_265-170472-0_sp1.1 from training. Duration: 0.7545625 2023-10-04 08:26:21,591 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_174-277364-0_sp1.1 from training. Duration: 0.6454375 2023-10-04 08:26:21,686 WARNING [train.py:1204] (2/4) Exclude cut with ID _58414_210_8254_1_1535421609162_7151182_193-151884-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 08:26:22,983 WARNING [train.py:1204] (2/4) Exclude cut with ID IC0651W0104-322063 from training. Duration: 0.768 2023-10-04 08:26:25,851 WARNING [train.py:1204] (2/4) Exclude cut with ID _21881_210_3525_1_1531825242757_3798380_139-305982-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 08:26:26,573 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.2.encoder.layers.0.nonlin_attention.whiten2, num_groups=1, num_channels=384, metric=9.72 vs. limit=15.0 2023-10-04 08:26:27,549 WARNING [train.py:1204] (2/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_79-200392-0 from training. Duration: 0.97 2023-10-04 08:26:30,202 WARNING [train.py:1204] (2/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_451-280894-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 08:26:30,217 WARNING [train.py:1204] (2/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_304-273433-0_sp1.1 from training. Duration: 0.9 2023-10-04 08:26:32,128 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_5959_210_1794_1_1527400400_3445726_412-44052-0 from training. Duration: 0.77 2023-10-04 08:26:33,539 WARNING [train.py:1204] (2/4) Exclude cut with ID _4681_210_3525_1_1527251040321_1235713_148-26159-0_sp1.1 from training. Duration: 0.963625 2023-10-04 08:26:33,796 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.balancer1.prob, batch_count=1589360.0, ans=0.125 2023-10-04 08:26:35,616 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.3.encoder.layers.3.self_attn2.whiten, num_groups=1, num_channels=512, metric=12.45 vs. limit=22.5 2023-10-04 08:26:36,389 WARNING [train.py:1204] (2/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_887-249555-0_sp0.9 from training. Duration: 0.8555625 2023-10-04 08:26:37,881 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_25236_210_2158_1_1532051968_280567_37-6860-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 08:26:41,279 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.nonlin_attention.balancer.prob, batch_count=1589426.6666666667, ans=0.125 2023-10-04 08:26:43,822 WARNING [train.py:1204] (2/4) Exclude cut with ID _29277_210_3279_1_1532581109293_3173069_417-115527-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 08:26:45,735 WARNING [train.py:1204] (2/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_552-134748-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 08:26:47,030 WARNING [train.py:1204] (2/4) Exclude cut with ID _43176_210_8254_1_1533524417469_4174339_188-174143-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 08:26:47,070 WARNING [train.py:1204] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_127-119109-0_sp1.1 from training. Duration: 0.6545625 2023-10-04 08:26:49,826 WARNING [train.py:1204] (2/4) Exclude cut with ID _14922_210_6228_1_1530707435273_3693310_38-187851-0 from training. Duration: 0.62 2023-10-04 08:26:49,869 WARNING [train.py:1204] (2/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_588-125165-0 from training. Duration: 0.83 2023-10-04 08:26:51,234 WARNING [train.py:1204] (2/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_344-152526-0_sp1.1 from training. Duration: 0.4818125 2023-10-04 08:26:51,235 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7002_210_1794_1_1527935634_4034444_487-236501-0 from training. Duration: 0.66 2023-10-04 08:26:54,050 WARNING [train.py:1204] (2/4) Exclude cut with ID _23825_210_13449_1_1531915313526_3497150_379-16981-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 08:27:00,112 WARNING [train.py:1204] (2/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_297-5233-0_sp1.1 from training. Duration: 0.863625 2023-10-04 08:27:00,117 WARNING [train.py:1204] (2/4) Exclude cut with ID _41298_210_9019_1_1533430371450_4822143_178-283538-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 08:27:00,155 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7046_210_6753_1_1527895337_6920917_642-106799-0 from training. Duration: 0.92 2023-10-04 08:27:00,176 WARNING [train.py:1204] (2/4) Exclude cut with ID _6274_210_2084_1_1527937583837_7494020_532-291952-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 08:27:02,149 WARNING [train.py:1204] (2/4) Exclude cut with ID _47278_210_12041_1_1533902418349_3576110_260-270468-0_sp1.1 from training. Duration: 0.92725 2023-10-04 08:27:02,154 WARNING [train.py:1204] (2/4) Exclude cut with ID _14368_210_4220_1_1530532714870_3595529_144-3287-0_sp0.9 from training. Duration: 0.7444375 2023-10-04 08:27:04,833 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_162-262717-0_sp0.9 from training. Duration: 0.92225 2023-10-04 08:27:06,280 WARNING [train.py:1204] (2/4) Exclude cut with ID _3465_210_3525_1_1526175667060_3574049_62-335552-0_sp0.9 from training. Duration: 0.9666875 2023-10-04 08:27:06,285 WARNING [train.py:1204] (2/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_726-272852-0_sp1.1 from training. Duration: 0.92725 2023-10-04 08:27:07,660 WARNING [train.py:1204] (2/4) Exclude cut with ID _14898_210_9206_1_1530766812377_4177190_26-135492-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 08:27:08,961 INFO [train.py:1046] (2/4) Epoch 45, batch 4700, loss[loss=0.1691, simple_loss=0.2415, pruned_loss=0.04833, over 22717.00 frames. ], tot_loss[loss=0.1539, simple_loss=0.2339, pruned_loss=0.03691, over 4699272.67 frames. ], batch size: 322, lr: 2.25e-03, grad_scale: 16.0 2023-10-04 08:27:10,532 WARNING [train.py:1204] (2/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_200-95304-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 08:27:11,761 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.716e+02 2.046e+02 2.404e+02 2.908e+02 6.182e+02, threshold=4.807e+02, percent-clipped=8.0 2023-10-04 08:27:11,880 WARNING [train.py:1204] (2/4) Exclude cut with ID _45012_210_12651_1_1533608435136_5371380_240-37101-0_sp1.1 from training. Duration: 0.8454375 2023-10-04 08:27:11,888 WARNING [train.py:1204] (2/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_132-269633-0_sp0.9 from training. Duration: 0.7555625 2023-10-04 08:27:12,715 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.feed_forward1.out_proj.dropout_p, batch_count=1589560.0, ans=0.1 2023-10-04 08:27:13,664 WARNING [train.py:1204] (2/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_907-151398-0_sp0.9 from training. Duration: 0.411125 2023-10-04 08:27:15,033 WARNING [train.py:1204] (2/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_45-265684-0_sp0.9 from training. Duration: 0.8 2023-10-04 08:27:16,400 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6291_210_3273_1_1527683649_1078498_74-292608-0 from training. Duration: 0.72 2023-10-04 08:27:16,676 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.conv_module1.balancer2.prob, batch_count=1589560.0, ans=0.125 2023-10-04 08:27:22,572 WARNING [train.py:1204] (2/4) Exclude cut with ID _59202_210_26984_1_1534816810612_3549174_221-84819-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 08:27:23,890 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_33696_210_3852_1_1533082780_2663375_181-325595-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 08:27:23,936 WARNING [train.py:1204] (2/4) Exclude cut with ID _47251_210_18628_1_1533814063128_4137250_351-154105-0_sp1.1 from training. Duration: 0.963625 2023-10-04 08:27:25,273 WARNING [train.py:1204] (2/4) Exclude cut with ID _35501_210_9288_1_1532998507986_3192189_351-334740-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 08:27:26,620 WARNING [train.py:1204] (2/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_342-100157-0_sp1.1 from training. Duration: 0.4909375 2023-10-04 08:27:32,617 WARNING [train.py:1204] (2/4) Exclude cut with ID _6789_210_1534_1_1527857937218_3118950_262-48853-0 from training. Duration: 0.56 2023-10-04 08:27:32,644 WARNING [train.py:1204] (2/4) Exclude cut with ID _69884_210_9771_1_1535846392818_4166169_430-201020-0 from training. Duration: 0.97 2023-10-04 08:27:34,592 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_71-136518-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 08:27:35,938 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_28-291982-0_sp1.1 from training. Duration: 0.8818125 2023-10-04 08:27:35,962 WARNING [train.py:1204] (2/4) Exclude cut with ID _54218_210_12210_1_1534413646388_4409259_437-275326-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 08:27:38,921 WARNING [train.py:1204] (2/4) Exclude cut with ID _55134_210_21982_1_1534590028377_3882170_40-309139-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 08:27:44,917 WARNING [train.py:1204] (2/4) Exclude cut with ID _32704_210_8954_1_1533017488840_6747370_266-329755-0_sp1.1 from training. Duration: 0.8090625 2023-10-04 08:27:45,009 WARNING [train.py:1204] (2/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_21-174514-0_sp1.1 from training. Duration: 0.3909375 2023-10-04 08:27:48,298 WARNING [train.py:1204] (2/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_409-207378-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 08:27:55,170 WARNING [train.py:1204] (2/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_132-269633-0 from training. Duration: 0.68 2023-10-04 08:27:56,580 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_2245_210_2715_1_1525511548_3540254_310-52483-0_sp0.9 from training. Duration: 0.97775 2023-10-04 08:27:59,262 WARNING [train.py:1204] (2/4) Exclude cut with ID _27430_210_7770_1_1532604689375_1378590_47-77403-0_sp1.1 from training. Duration: 0.92725 2023-10-04 08:27:59,624 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.feed_forward3.hidden_balancer.prob, batch_count=1589760.0, ans=0.125 2023-10-04 08:28:02,008 WARNING [train.py:1204] (2/4) Exclude cut with ID _7562_210_2084_1_1528887767916_3946229_186-326422-0 from training. Duration: 0.84 2023-10-04 08:28:02,121 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_14056_210_4163_1_1530604483_7572827_230-35993-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 08:28:07,110 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_23854_210_1794_1_1531969696_1329157_57-123491-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 08:28:07,165 WARNING [train.py:1204] (2/4) Exclude cut with ID _14747_210_8341_1_1530685797463_3654089_147-131229-0 from training. Duration: 0.76 2023-10-04 08:28:09,897 WARNING [train.py:1204] (2/4) Exclude cut with ID _30191_210_1664_1_1533189915047_7196670_430-19539-0_sp1.1 from training. Duration: 0.92725 2023-10-04 08:28:11,117 WARNING [train.py:1204] (2/4) Exclude cut with ID _67725_210_27767_1_1535608756671_5105359_44-50762-0_sp1.1 from training. Duration: 0.963625 2023-10-04 08:28:15,828 WARNING [train.py:1204] (2/4) Exclude cut with ID _27485_210_4916_1_1532739191677_3668269_217-161755-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 08:28:15,864 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_5931_210_2040_1_1527428481_4547999_321-193614-0_sp1.1 from training. Duration: 0.6090625 2023-10-04 08:28:15,880 WARNING [train.py:1204] (2/4) Exclude cut with ID _6267_210_2084_1_1527764658526_3894730_375-219887-0 from training. Duration: 0.67 2023-10-04 08:28:15,974 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9087W0022-538393 from training. Duration: 0.768 2023-10-04 08:28:16,093 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.1.ff3_skip_rate, batch_count=1589826.6666666667, ans=0.0 2023-10-04 08:28:17,338 WARNING [train.py:1204] (2/4) Exclude cut with ID _68777_210_4353_1_1535810390731_3117904_83-196459-0_sp1.1 from training. Duration: 0.963625 2023-10-04 08:28:17,450 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.1.self_attn_weights.pos_emb_skip_rate, batch_count=1589826.6666666667, ans=0.0 2023-10-04 08:28:20,056 WARNING [train.py:1204] (2/4) Exclude cut with ID _69884_210_9771_1_1535846392818_4166169_455-343812-0_sp1.1 from training. Duration: 0.92725 2023-10-04 08:28:20,057 WARNING [train.py:1204] (2/4) Exclude cut with ID _13916_210_9994_1_1530788341126_3763149_414-320174-0_sp1.1 from training. Duration: 0.92725 2023-10-04 08:28:20,061 WARNING [train.py:1204] (2/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_398-239819-0 from training. Duration: 0.99 2023-10-04 08:28:21,947 WARNING [train.py:1204] (2/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_199-232314-0_sp1.1 from training. Duration: 0.92725 2023-10-04 08:28:23,201 INFO [train.py:1046] (2/4) Epoch 45, batch 4750, loss[loss=0.1576, simple_loss=0.2477, pruned_loss=0.03381, over 24357.00 frames. ], tot_loss[loss=0.1544, simple_loss=0.2348, pruned_loss=0.03701, over 4707176.24 frames. ], batch size: 77, lr: 2.25e-03, grad_scale: 16.0 2023-10-04 08:28:26,086 WARNING [train.py:1204] (2/4) Exclude cut with ID _2334_210_2667_1_1525608238880_4705129_459-104997-0 from training. Duration: 0.95 2023-10-04 08:28:27,580 WARNING [train.py:1204] (2/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_441-209395-0_sp1.1 from training. Duration: 0.936375 2023-10-04 08:28:28,924 WARNING [train.py:1204] (2/4) Exclude cut with ID _27052_210_5986_1_1532329197187_3585009_320-80393-0_sp1.1 from training. Duration: 0.97275 2023-10-04 08:28:31,786 WARNING [train.py:1204] (2/4) Exclude cut with ID _21783_210_11357_1_1531742458730_3762679_89-217253-0_sp1.1 from training. Duration: 0.97275 2023-10-04 08:28:31,807 WARNING [train.py:1204] (2/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_453-282588-0_sp1.1 from training. Duration: 0.8909375 2023-10-04 08:28:33,195 WARNING [train.py:1204] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_88-341718-0 from training. Duration: 0.66 2023-10-04 08:28:34,901 WARNING [train.py:1204] (2/4) Exclude cut with ID _43552_210_8913_1_1533726031059_3523429_230-3521-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 08:28:38,170 WARNING [train.py:1204] (2/4) Exclude cut with ID _9679_210_7497_1_1529129212906_4363929_512-313889-0 from training. Duration: 0.81 2023-10-04 08:28:39,600 WARNING [train.py:1204] (2/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_194-252800-0_sp1.1 from training. Duration: 0.8818125 2023-10-04 08:28:39,635 WARNING [train.py:1204] (2/4) Exclude cut with ID _55209_210_1531_1_1534675828996_4597160_239-293541-0_sp1.1 from training. Duration: 0.963625 2023-10-04 08:28:40,973 WARNING [train.py:1204] (2/4) Exclude cut with ID _35799_210_5854_1_1533263487887_3529071_15-325131-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 08:28:47,101 WARNING [train.py:1204] (2/4) Exclude cut with ID _14100_210_8341_1_1530529320613_3678069_279-55760-0 from training. Duration: 0.93 2023-10-04 08:28:52,627 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_489-230703-0_sp1.1 from training. Duration: 0.77275 2023-10-04 08:28:54,133 WARNING [train.py:1204] (2/4) Exclude cut with ID _6694_210_4599_1_1527852637490_3965529_144-185051-0 from training. Duration: 0.96 2023-10-04 08:28:55,816 WARNING [train.py:1204] (2/4) Exclude cut with ID _28461_210_2253_1_1532771775189_3041299_1-228155-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 08:28:57,291 WARNING [train.py:1204] (2/4) Exclude cut with ID _32704_210_8954_1_1533017488840_6747370_686-252323-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 08:28:57,294 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_40896_210_15710_1_1533433956_7100802_1075-294633-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 08:28:58,730 WARNING [train.py:1204] (2/4) Exclude cut with ID _22083_210_8254_1_1531702862439_3678970_30-92879-0_sp1.1 from training. Duration: 0.97275 2023-10-04 08:29:00,174 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9245W0121-616888 from training. Duration: 0.896 2023-10-04 08:29:00,177 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6786_210_1388_1_1527839699_4194495_378-46070-0 from training. Duration: 0.72 2023-10-04 08:29:05,439 WARNING [train.py:1204] (2/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_317-175062-0 from training. Duration: 0.82 2023-10-04 08:29:07,524 WARNING [train.py:1204] (2/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_238-89390-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 08:29:09,473 WARNING [train.py:1204] (2/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_524-310811-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 08:29:12,266 WARNING [train.py:1204] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_307-158462-0_sp0.9 from training. Duration: 0.7666875 2023-10-04 08:29:12,267 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9309W0191-640318 from training. Duration: 0.512 2023-10-04 08:29:12,271 WARNING [train.py:1204] (2/4) Exclude cut with ID _55153_210_2253_1_1534384786677_3162096_100-204617-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 08:29:15,051 WARNING [train.py:1204] (2/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_112-149213-0_sp1.1 from training. Duration: 0.863625 2023-10-04 08:29:15,607 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.4.encoder.layers.0.whiten, num_groups=1, num_channels=384, metric=5.50 vs. limit=12.0 2023-10-04 08:29:17,198 WARNING [train.py:1204] (2/4) Exclude cut with ID _67831_210_12392_1_1535612464202_3092612_346-2148-0_sp1.1 from training. Duration: 0.8454375 2023-10-04 08:29:18,421 WARNING [train.py:1204] (2/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_51-117576-0_sp0.9 from training. Duration: 0.4444375 2023-10-04 08:29:18,452 WARNING [train.py:1204] (2/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_300-27235-0 from training. Duration: 0.87 2023-10-04 08:29:18,493 WARNING [train.py:1204] (2/4) Exclude cut with ID _40399_210_4353_1_1533305658194_3279406_195-116825-0_sp1.1 from training. Duration: 0.97275 2023-10-04 08:29:19,805 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7002_210_1794_1_1527939683_5157286_163-87552-0_sp0.9 from training. Duration: 0.9555625 2023-10-04 08:29:19,853 WARNING [train.py:1204] (2/4) Exclude cut with ID _19514_210_9259_1_1531530071330_5084230_560-237597-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 08:29:22,430 WARNING [train.py:1204] (2/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_41-234353-0_sp1.1 from training. Duration: 0.6181875 2023-10-04 08:29:22,456 WARNING [train.py:1204] (2/4) Exclude cut with ID _6274_210_2084_1_1527937583837_7494020_769-168302-0 from training. Duration: 0.71 2023-10-04 08:29:23,895 WARNING [train.py:1204] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_574-45541-0 from training. Duration: 0.65 2023-10-04 08:29:27,310 WARNING [train.py:1204] (2/4) Exclude cut with ID _50310_210_15488_1_1534068043345_3641080_262-67120-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 08:29:30,143 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_53071_210_16554_1_1534493434_1276796_35-11927-0_sp1.1 from training. Duration: 0.963625 2023-10-04 08:29:30,145 WARNING [train.py:1204] (2/4) Exclude cut with ID _3992_210_1747_1_1526644816031_3731060_107-251434-0 from training. Duration: 0.99 2023-10-04 08:29:30,191 WARNING [train.py:1204] (2/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_412-54488-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 08:29:30,375 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.conv_module1.balancer2.prob, batch_count=1590160.0, ans=0.125 2023-10-04 08:29:31,572 WARNING [train.py:1204] (2/4) Exclude cut with ID _18516_210_9206_1_1531704653206_3692406_77-258490-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 08:29:32,972 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_40047_210_2290_1_1533290519_2374311_195-165693-0_sp0.9 from training. Duration: 0.988875 2023-10-04 08:29:33,018 WARNING [train.py:1204] (2/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_296-36971-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 08:29:34,379 WARNING [train.py:1204] (2/4) Exclude cut with ID _14970_210_15709_1_1530773543493_1715730_187-20259-0_sp1.1 from training. Duration: 0.8090625 2023-10-04 08:29:37,032 INFO [train.py:1046] (2/4) Epoch 45, batch 4800, loss[loss=0.1998, simple_loss=0.272, pruned_loss=0.06378, over 19392.00 frames. ], tot_loss[loss=0.1555, simple_loss=0.2357, pruned_loss=0.03763, over 4705725.24 frames. ], batch size: 388, lr: 2.25e-03, grad_scale: 32.0 2023-10-04 08:29:38,437 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_53373_210_5399_1_1534229471_4715695_135-305927-0_sp1.1 from training. Duration: 0.936375 2023-10-04 08:29:38,476 WARNING [train.py:1204] (2/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_60-18123-0 from training. Duration: 0.88 2023-10-04 08:29:38,546 WARNING [train.py:1204] (2/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_166-166990-0 from training. Duration: 0.96 2023-10-04 08:29:40,328 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.710e+02 2.033e+02 2.288e+02 2.597e+02 3.954e+02, threshold=4.576e+02, percent-clipped=0.0 2023-10-04 08:29:41,871 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_5438_210_1794_1_1527336687_2600009_233-58072-0 from training. Duration: 0.77 2023-10-04 08:29:43,315 WARNING [train.py:1204] (2/4) Exclude cut with ID _8833_210_5219_1_1528592665210_7553059_611-80313-0_sp0.9 from training. Duration: 0.711125 2023-10-04 08:29:44,611 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_53555_210_5632_1_1534595220_7232297_118-43239-0_sp1.1 from training. Duration: 0.936375 2023-10-04 08:29:45,334 WARNING [train.py:1204] (2/4) Exclude cut with ID _12369_210_4381_1_1530356078581_4388520_480-13146-0 from training. Duration: 0.85 2023-10-04 08:29:49,270 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_892-28814-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 08:29:50,653 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_24899_210_16554_1_1532046422_3827462_160-21255-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 08:29:55,436 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_154-316450-0_sp1.1 from training. Duration: 0.6909375 2023-10-04 08:29:56,825 WARNING [train.py:1204] (2/4) Exclude cut with ID _30196_210_1664_1_1533708496522_7074140_1225-301232-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 08:29:56,855 WARNING [train.py:1204] (2/4) Exclude cut with ID _28783_210_7943_1_1532671164808_7155479_414-208100-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 08:29:56,895 WARNING [train.py:1204] (2/4) Exclude cut with ID _19023_210_4353_1_1531378892552_5820780_83-254794-0 from training. Duration: 0.86 2023-10-04 08:29:58,244 WARNING [train.py:1204] (2/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_591-83752-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 08:29:59,593 WARNING [train.py:1204] (2/4) Exclude cut with ID _29877_210_6228_1_1532397791106_5276149_266-62291-0_sp1.1 from training. Duration: 0.7909375 2023-10-04 08:30:01,144 WARNING [train.py:1204] (2/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_280-161633-0_sp1.1 from training. Duration: 0.663625 2023-10-04 08:30:03,964 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7543_210_6753_1_1528539468_333764_39-156634-0_sp1.1 from training. Duration: 0.97275 2023-10-04 08:30:05,370 WARNING [train.py:1204] (2/4) Exclude cut with ID _67499_210_17282_1_1535509805220_3692430_505-312248-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 08:30:05,401 WARNING [train.py:1204] (2/4) Exclude cut with ID _5294_210_1747_1_1527249643928_3696179_180-100713-0_sp0.9 from training. Duration: 0.9 2023-10-04 08:30:08,012 WARNING [train.py:1204] (2/4) Exclude cut with ID _26528_210_9070_1_1532570410842_3851310_508-148132-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 08:30:08,023 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_211-339622-0_sp1.1 from training. Duration: 0.4090625 2023-10-04 08:30:08,050 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_40896_210_15710_1_1533433956_7100802_339-138356-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 08:30:09,917 WARNING [train.py:1204] (2/4) Exclude cut with ID _27060_210_5986_1_1532674838922_3522549_211-19933-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 08:30:12,988 WARNING [train.py:1204] (2/4) Exclude cut with ID _60229_210_27767_1_1534838406158_3655350_457-117923-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 08:30:14,464 WARNING [train.py:1204] (2/4) Exclude cut with ID _55429_210_10626_1_1534467588527_7174143_460-141439-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 08:30:16,300 WARNING [train.py:1204] (2/4) Exclude cut with ID _59170_210_6721_1_1534983633960_4203335_263-10780-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 08:30:17,603 WARNING [train.py:1204] (2/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_872-264525-0_sp0.9 from training. Duration: 0.72225 2023-10-04 08:30:18,951 WARNING [train.py:1204] (2/4) Exclude cut with ID _7624_210_1531_1_1528109802198_4933101_418-349834-0_sp0.9 from training. Duration: 0.6666875 2023-10-04 08:30:20,322 WARNING [train.py:1204] (2/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_27-53998-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 08:30:21,741 WARNING [train.py:1204] (2/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_202-183922-0 from training. Duration: 0.85 2023-10-04 08:30:21,753 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7002_210_1794_1_1527939683_5157286_1-281264-0 from training. Duration: 0.44 2023-10-04 08:30:21,828 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_53753_210_7234_1_1534320003_3594148_493-9314-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 08:30:21,848 WARNING [train.py:1204] (2/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_217-95056-0_sp1.1 from training. Duration: 0.8909375 2023-10-04 08:30:21,887 WARNING [train.py:1204] (2/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_406-45086-0_sp1.1 from training. Duration: 0.72725 2023-10-04 08:30:21,893 WARNING [train.py:1204] (2/4) Exclude cut with ID _31649_210_6228_1_1532584608063_4091078_505-213821-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 08:30:21,901 WARNING [train.py:1204] (2/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_317-175062-0_sp0.9 from training. Duration: 0.911125 2023-10-04 08:30:23,391 WARNING [train.py:1204] (2/4) Exclude cut with ID _5444_210_2151_1_1527418808464_5312210_726-27165-0_sp1.1 from training. Duration: 0.6090625 2023-10-04 08:30:25,341 WARNING [train.py:1204] (2/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_494-170437-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 08:30:29,339 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_474-64054-0_sp1.1 from training. Duration: 0.936375 2023-10-04 08:30:29,511 WARNING [train.py:1204] (2/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_521-7556-0_sp1.1 from training. Duration: 0.97275 2023-10-04 08:30:30,933 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_269-348505-0_sp1.1 from training. Duration: 0.92725 2023-10-04 08:30:35,028 WARNING [train.py:1204] (2/4) Exclude cut with ID _21878_210_3525_1_1531738772002_7264330_559-184862-0 from training. Duration: 0.94 2023-10-04 08:30:35,057 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_68702_210_23689_1_1535799577_3420068_92-76040-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 08:30:36,228 WARNING [train.py:1204] (2/4) Exclude cut with ID _22826_210_9994_1_1531908201522_3279169_227-102052-0_sp1.1 from training. Duration: 0.97275 2023-10-04 08:30:36,266 WARNING [train.py:1204] (2/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_718-250907-0_sp0.9 from training. Duration: 0.7666875 2023-10-04 08:30:37,645 WARNING [train.py:1204] (2/4) Exclude cut with ID _14069_210_5632_1_1530512692033_4803390_352-143891-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 08:30:42,653 WARNING [train.py:1204] (2/4) Exclude cut with ID _6267_210_2084_1_1527764658526_3894730_65-106602-0_sp1.1 from training. Duration: 0.936375 2023-10-04 08:30:43,363 WARNING [train.py:1204] (2/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_202-183922-0_sp0.9 from training. Duration: 0.9444375 2023-10-04 08:30:43,370 WARNING [train.py:1204] (2/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_465-268827-0_sp1.1 from training. Duration: 0.97275 2023-10-04 08:30:44,682 WARNING [train.py:1204] (2/4) Exclude cut with ID _12369_210_4381_1_1530356078581_4388520_58-29064-0_sp1.1 from training. Duration: 0.9 2023-10-04 08:30:44,720 WARNING [train.py:1204] (2/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_1-99680-0_sp0.9 from training. Duration: 0.7444375 2023-10-04 08:30:44,774 WARNING [train.py:1204] (2/4) Exclude cut with ID _13253_210_1925_1_1530757775819_3577873_191-144977-0_sp1.1 from training. Duration: 0.7090625 2023-10-04 08:30:48,867 WARNING [train.py:1204] (2/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_276-112684-0_sp1.1 from training. Duration: 0.92725 2023-10-04 08:30:48,876 WARNING [train.py:1204] (2/4) Exclude cut with ID _64639_210_5816_1_1535367619437_3528000_358-411-0_sp1.1 from training. Duration: 0.97275 2023-10-04 08:30:50,120 WARNING [train.py:1204] (2/4) Exclude cut with ID _31649_210_6228_1_1532584608063_4091078_106-21199-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 08:30:51,428 INFO [train.py:1046] (2/4) Epoch 45, batch 4850, loss[loss=0.1631, simple_loss=0.2487, pruned_loss=0.03877, over 24064.00 frames. ], tot_loss[loss=0.156, simple_loss=0.2362, pruned_loss=0.0379, over 4702972.07 frames. ], batch size: 86, lr: 2.25e-03, grad_scale: 32.0 2023-10-04 08:30:51,532 WARNING [train.py:1204] (2/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_901-79438-0 from training. Duration: 0.94 2023-10-04 08:30:52,919 WARNING [train.py:1204] (2/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_718-250907-0 from training. Duration: 0.69 2023-10-04 08:30:54,611 WARNING [train.py:1204] (2/4) Exclude cut with ID _5287_210_2084_1_1527418883040_3878670_363-194161-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 08:30:54,615 WARNING [train.py:1204] (2/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_586-321321-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 08:30:54,706 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_68900_210_2087_1_1535713157_3708529_164-292661-0_sp1.1 from training. Duration: 0.963625 2023-10-04 08:30:54,707 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_26433_210_12108_1_1532157158_4056480_114-144878-0_sp1.1 from training. Duration: 0.97275 2023-10-04 08:30:57,483 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_15517_210_6753_1_1530954008_3487336_96-341874-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 08:31:02,115 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=1590560.0, ans=0.1 2023-10-04 08:31:03,328 WARNING [train.py:1204] (2/4) Exclude cut with ID _14268_210_9206_1_1530622771285_4714099_637-86630-0 from training. Duration: 0.51 2023-10-04 08:31:04,642 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_28211_210_6388_1_1532861506_4523224_121-320092-0_sp1.1 from training. Duration: 0.92725 2023-10-04 08:31:08,165 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.2.encoder.layers.2.feed_forward2.out_whiten, num_groups=1, num_channels=384, metric=3.84 vs. limit=15.0 2023-10-04 08:31:09,340 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_141-89113-0_sp0.9 from training. Duration: 0.9555625 2023-10-04 08:31:09,429 WARNING [train.py:1204] (2/4) Exclude cut with ID _6789_210_1534_1_1527857937218_3118950_311-61262-0_sp1.1 from training. Duration: 0.5909375 2023-10-04 08:31:09,449 WARNING [train.py:1204] (2/4) Exclude cut with ID _58480_210_12392_1_1534811121111_3646160_389-200461-0_sp1.1 from training. Duration: 0.97275 2023-10-04 08:31:14,712 WARNING [train.py:1204] (2/4) Exclude cut with ID _41326_210_5854_1_1533778076509_3492080_28-156553-0_sp1.1 from training. Duration: 0.92725 2023-10-04 08:31:15,882 WARNING [train.py:1204] (2/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_577-287150-0_sp1.1 from training. Duration: 0.7454375 2023-10-04 08:31:17,196 WARNING [train.py:1204] (2/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_40-129756-0_sp1.1 from training. Duration: 0.736375 2023-10-04 08:31:17,207 WARNING [train.py:1204] (2/4) Exclude cut with ID _60113_210_16271_1_1535164212481_4386079_490-280290-0 from training. Duration: 0.98 2023-10-04 08:31:18,845 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.ff2_skip_rate, batch_count=1590626.6666666667, ans=0.0 2023-10-04 08:31:20,093 WARNING [train.py:1204] (2/4) Exclude cut with ID _58059_210_5854_1_1534901471149_3164808_34-39905-0_sp1.1 from training. Duration: 0.963625 2023-10-04 08:31:23,300 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_791-197891-0_sp1.1 from training. Duration: 0.763625 2023-10-04 08:31:23,313 WARNING [train.py:1204] (2/4) Exclude cut with ID _6789_210_1534_1_1527857937218_3118950_262-48853-0_sp1.1 from training. Duration: 0.5090625 2023-10-04 08:31:24,609 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_2655_210_2087_1_1525688959_3897400_52-279899-0_sp0.9 from training. Duration: 0.7666875 2023-10-04 08:31:24,613 WARNING [train.py:1204] (2/4) Exclude cut with ID _14884_210_2252_1_1530707459689_5561250_114-201399-0 from training. Duration: 0.76 2023-10-04 08:31:27,371 WARNING [train.py:1204] (2/4) Exclude cut with ID _19023_210_4353_1_1531378892552_5820780_83-254794-0_sp0.9 from training. Duration: 0.9555625 2023-10-04 08:31:27,394 WARNING [train.py:1204] (2/4) Exclude cut with ID _53489_210_6228_1_1534298357169_3835970_10-297919-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 08:31:27,881 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.5.encoder.layers.1.conv_module2.whiten, num_groups=1, num_channels=256, metric=5.67 vs. limit=15.0 2023-10-04 08:31:30,396 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.feed_forward1.out_proj.dropout_p, batch_count=1590693.3333333333, ans=0.1 2023-10-04 08:31:31,501 WARNING [train.py:1204] (2/4) Exclude cut with ID _44585_210_18628_1_1533605350387_4518199_418-3055-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 08:31:31,513 WARNING [train.py:1204] (2/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_310-109461-0 from training. Duration: 0.8 2023-10-04 08:31:31,555 WARNING [train.py:1204] (2/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_235-262124-0 from training. Duration: 0.67 2023-10-04 08:31:32,927 WARNING [train.py:1204] (2/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_195-167011-0_sp1.1 from training. Duration: 0.6818125 2023-10-04 08:31:39,953 WARNING [train.py:1204] (2/4) Exclude cut with ID _7852_210_5219_1_1528286471316_8139400_188-55162-0_sp1.1 from training. Duration: 0.936375 2023-10-04 08:31:40,005 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_35361_210_4238_1_1533262810_7793476_675-87012-0 from training. Duration: 0.89 2023-10-04 08:31:40,238 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.conv_module2.balancer1.max_abs, batch_count=1590760.0, ans=10.0 2023-10-04 08:31:41,932 WARNING [train.py:1204] (2/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_465-81427-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 08:31:41,945 WARNING [train.py:1204] (2/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_597-154640-0_sp1.1 from training. Duration: 0.8181875 2023-10-04 08:31:45,250 WARNING [train.py:1204] (2/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_186-309161-0_sp0.9 from training. Duration: 0.6 2023-10-04 08:31:45,793 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.4.encoder.layers.0.feed_forward2.out_whiten, num_groups=1, num_channels=384, metric=8.48 vs. limit=15.0 2023-10-04 08:31:46,622 WARNING [train.py:1204] (2/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_546-119312-0 from training. Duration: 0.99 2023-10-04 08:31:46,626 WARNING [train.py:1204] (2/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_330-240060-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 08:31:48,009 WARNING [train.py:1204] (2/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_477-255782-0 from training. Duration: 0.82 2023-10-04 08:31:48,026 WARNING [train.py:1204] (2/4) Exclude cut with ID _14970_210_15709_1_1530773543493_1715730_150-248939-0_sp1.1 from training. Duration: 0.97275 2023-10-04 08:31:49,361 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_556-206615-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 08:31:49,418 WARNING [train.py:1204] (2/4) Exclude cut with ID _3465_210_3525_1_1526175667060_3574049_308-231735-0 from training. Duration: 0.73 2023-10-04 08:31:58,362 WARNING [train.py:1204] (2/4) Exclude cut with ID _27485_210_4916_1_1532739191677_3668269_66-236571-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 08:32:03,843 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_2221_210_1988_1_1525597051_4935541_415-296882-0_sp0.9 from training. Duration: 0.8555625 2023-10-04 08:32:03,850 WARNING [train.py:1204] (2/4) Exclude cut with ID _64521_210_14699_1_1535243427259_3815328_231-78630-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 08:32:05,190 INFO [train.py:1046] (2/4) Epoch 45, batch 4900, loss[loss=0.1534, simple_loss=0.2414, pruned_loss=0.03271, over 24662.00 frames. ], tot_loss[loss=0.155, simple_loss=0.2345, pruned_loss=0.03772, over 4706177.03 frames. ], batch size: 73, lr: 2.25e-03, grad_scale: 16.0 2023-10-04 08:32:08,052 WARNING [train.py:1204] (2/4) Exclude cut with ID _13942_210_9988_1_1530604906036_3733360_499-55676-0 from training. Duration: 0.62 2023-10-04 08:32:08,055 WARNING [train.py:1204] (2/4) Exclude cut with ID _49376_210_5986_1_1533985278732_3370740_301-154763-0_sp1.1 from training. Duration: 0.8909375 2023-10-04 08:32:09,333 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.637e+02 2.000e+02 2.208e+02 2.639e+02 4.240e+02, threshold=4.416e+02, percent-clipped=0.0 2023-10-04 08:32:12,655 WARNING [train.py:1204] (2/4) Exclude cut with ID _27485_210_4916_1_1532739191677_3668269_255-216698-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 08:32:12,746 WARNING [train.py:1204] (2/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_137-334143-0_sp1.1 from training. Duration: 0.97275 2023-10-04 08:32:12,764 WARNING [train.py:1204] (2/4) Exclude cut with ID _6456_210_1534_1_1527681634212_3181049_215-309359-0_sp1.1 from training. Duration: 0.8 2023-10-04 08:32:16,427 WARNING [train.py:1204] (2/4) Exclude cut with ID _65640_210_6310_1_1535368201445_3173250_209-13008-0 from training. Duration: 0.99 2023-10-04 08:32:22,435 WARNING [train.py:1204] (2/4) Exclude cut with ID _12369_210_4381_1_1530356078581_4388520_58-29064-0 from training. Duration: 0.99 2023-10-04 08:32:25,212 WARNING [train.py:1204] (2/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_410-287857-0 from training. Duration: 0.93 2023-10-04 08:32:26,558 WARNING [train.py:1204] (2/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_153-153537-0 from training. Duration: 0.91 2023-10-04 08:32:26,839 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=1590960.0, ans=0.1 2023-10-04 08:32:27,861 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7544_210_6753_1_1528587722_3576686_172-171750-0_sp0.9 from training. Duration: 0.72225 2023-10-04 08:32:27,892 WARNING [train.py:1204] (2/4) Exclude cut with ID _64521_210_14699_1_1535243427259_3815328_464-183853-0_sp1.1 from training. Duration: 0.97275 2023-10-04 08:32:27,906 WARNING [train.py:1204] (2/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_385-210308-0_sp1.1 from training. Duration: 0.963625 2023-10-04 08:32:27,923 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_46013_210_7234_1_1533776362_3744032_354-261865-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 08:32:27,930 WARNING [train.py:1204] (2/4) Exclude cut with ID _3067_210_2667_1_1526212974640_4503168_250-285937-0_sp1.1 from training. Duration: 0.72725 2023-10-04 08:32:28,003 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_40036_210_13405_1_1533648647_1315388_48-146016-0 from training. Duration: 0.93 2023-10-04 08:32:33,377 WARNING [train.py:1204] (2/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_280-161633-0 from training. Duration: 0.73 2023-10-04 08:32:33,430 WARNING [train.py:1204] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_308-161067-0_sp0.9 from training. Duration: 0.7333125 2023-10-04 08:32:34,790 WARNING [train.py:1204] (2/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_68-212801-0_sp0.9 from training. Duration: 0.811125 2023-10-04 08:32:36,140 WARNING [train.py:1204] (2/4) Exclude cut with ID _6789_210_1534_1_1527857937218_3118950_311-61262-0_sp0.9 from training. Duration: 0.72225 2023-10-04 08:32:37,738 WARNING [train.py:1204] (2/4) Exclude cut with ID _14632_210_9988_1_1530691279392_3923200_102-204477-0_sp1.1 from training. Duration: 0.8818125 2023-10-04 08:32:39,089 WARNING [train.py:1204] (2/4) Exclude cut with ID _65242_210_18628_1_1535369393717_3823149_290-117517-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 08:32:40,422 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_69630_210_7234_1_1535788670_910989_35-337140-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 08:32:40,431 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_5618_210_1874_1_1527311008_5851_1-214501-0 from training. Duration: 0.689 2023-10-04 08:32:42,362 WARNING [train.py:1204] (2/4) Exclude cut with ID _8064_210_5399_1_1528372299291_4282973_168-50337-0_sp0.9 from training. Duration: 0.9333125 2023-10-04 08:32:43,082 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.2.encoder.layers.2.nonlin_attention.whiten2, num_groups=1, num_channels=384, metric=4.42 vs. limit=15.0 2023-10-04 08:32:43,800 WARNING [train.py:1204] (2/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_185-14438-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 08:32:43,826 WARNING [train.py:1204] (2/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_61-327062-0 from training. Duration: 0.81 2023-10-04 08:32:43,832 WARNING [train.py:1204] (2/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_98-741-0 from training. Duration: 0.8 2023-10-04 08:32:47,189 WARNING [train.py:1204] (2/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_312-204798-0 from training. Duration: 0.93 2023-10-04 08:32:50,471 WARNING [train.py:1204] (2/4) Exclude cut with ID _14998_210_5219_1_1530705582080_7474970_787-294416-0_sp0.9 from training. Duration: 0.911125 2023-10-04 08:32:51,798 WARNING [train.py:1204] (2/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_140-181176-0_sp1.1 from training. Duration: 0.836375 2023-10-04 08:32:51,827 WARNING [train.py:1204] (2/4) Exclude cut with ID _14687_210_5854_1_1530703838259_3664889_115-239046-0_sp1.1 from training. Duration: 0.6090625 2023-10-04 08:32:51,891 WARNING [train.py:1204] (2/4) Exclude cut with ID _56423_210_5854_1_1534896061049_3165969_370-230614-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 08:32:51,929 WARNING [train.py:1204] (2/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_406-275649-0_sp0.9 from training. Duration: 0.4666875 2023-10-04 08:32:51,948 WARNING [train.py:1204] (2/4) Exclude cut with ID _15150_210_8341_1_1530752411803_7256384_22-138274-0_sp1.1 from training. Duration: 0.87275 2023-10-04 08:32:53,781 WARNING [train.py:1204] (2/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_125-125818-0 from training. Duration: 0.96 2023-10-04 08:32:56,513 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_4399_210_1874_1_1526705485_2400757_220-47974-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 08:32:57,965 WARNING [train.py:1204] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_308-161067-0_sp1.1 from training. Duration: 0.6 2023-10-04 08:32:59,346 WARNING [train.py:1204] (2/4) Exclude cut with ID _53489_210_6228_1_1534298357169_3835970_102-57645-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 08:33:00,820 WARNING [train.py:1204] (2/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_178-177582-0 from training. Duration: 0.74 2023-10-04 08:33:00,985 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.nonlin_attention.balancer.prob, batch_count=1591093.3333333333, ans=0.125 2023-10-04 08:33:02,159 WARNING [train.py:1204] (2/4) Exclude cut with ID _14349_210_8341_1_1530615710818_3442470_170-336541-0_sp1.1 from training. Duration: 0.7818125 2023-10-04 08:33:02,230 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_521-214276-0_sp0.9 from training. Duration: 0.488875 2023-10-04 08:33:02,417 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.ff3_skip_rate, batch_count=1591093.3333333333, ans=0.0 2023-10-04 08:33:03,541 WARNING [train.py:1204] (2/4) Exclude cut with ID _66070_210_7640_1_1535441393096_7805310_489-292631-0 from training. Duration: 0.79 2023-10-04 08:33:09,088 WARNING [train.py:1204] (2/4) Exclude cut with ID _14069_210_5632_1_1530512692033_4803390_438-340023-0_sp1.1 from training. Duration: 0.936375 2023-10-04 08:33:11,020 WARNING [train.py:1204] (2/4) Exclude cut with ID _7852_210_5219_1_1528286471316_8139400_242-105336-0_sp1.1 from training. Duration: 0.6545625 2023-10-04 08:33:13,946 WARNING [train.py:1204] (2/4) Exclude cut with ID _3867_210_1883_1_1526641200575_4475830_272-215563-0 from training. Duration: 0.57 2023-10-04 08:33:13,960 WARNING [train.py:1204] (2/4) Exclude cut with ID _14368_210_4220_1_1530532714870_3595529_143-117421-0_sp0.9 from training. Duration: 0.6444375 2023-10-04 08:33:13,967 WARNING [train.py:1204] (2/4) Exclude cut with ID _9235_210_5219_1_1528891154253_8046120_30-326681-0_sp1.1 from training. Duration: 0.7545625 2023-10-04 08:33:15,396 WARNING [train.py:1204] (2/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_538-218448-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 08:33:15,599 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.conv_skip_rate, batch_count=1591160.0, ans=0.0 2023-10-04 08:33:19,771 INFO [train.py:1046] (2/4) Epoch 45, batch 4950, loss[loss=0.1566, simple_loss=0.2387, pruned_loss=0.0372, over 24306.00 frames. ], tot_loss[loss=0.1537, simple_loss=0.2332, pruned_loss=0.03712, over 4709475.80 frames. ], batch size: 61, lr: 2.25e-03, grad_scale: 16.0 2023-10-04 08:33:19,905 WARNING [train.py:1204] (2/4) Exclude cut with ID _33640_210_2655_1_1533117605479_4855469_60-192970-0_sp1.1 from training. Duration: 0.963625 2023-10-04 08:33:19,915 WARNING [train.py:1204] (2/4) Exclude cut with ID _65585_210_12210_1_1535450451584_3764098_112-294301-0_sp1.1 from training. Duration: 0.8 2023-10-04 08:33:19,944 WARNING [train.py:1204] (2/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_643-37412-0_sp1.1 from training. Duration: 0.936375 2023-10-04 08:33:21,225 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_9302_210_2550_1_1528889962_4655416_638-301620-0_sp1.1 from training. Duration: 0.42725 2023-10-04 08:33:21,343 WARNING [train.py:1204] (2/4) Exclude cut with ID _3867_210_1883_1_1526641200575_4475830_272-215563-0_sp1.1 from training. Duration: 0.5181875 2023-10-04 08:33:24,635 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_53679_210_4852_1_1534556726_4186141_358-154143-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 08:33:25,900 WARNING [train.py:1204] (2/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_841-341679-0_sp0.9 from training. Duration: 0.6444375 2023-10-04 08:33:27,452 WARNING [train.py:1204] (2/4) Exclude cut with ID _7160_210_1876_1_1527940923231_3703799_302-150728-0 from training. Duration: 0.78 2023-10-04 08:33:27,483 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_125-203105-0 from training. Duration: 0.8 2023-10-04 08:33:27,505 WARNING [train.py:1204] (2/4) Exclude cut with ID _5585_210_2667_1_1527418813324_8007340_89-332072-0_sp0.9 from training. Duration: 0.711125 2023-10-04 08:33:28,889 WARNING [train.py:1204] (2/4) Exclude cut with ID _3992_210_1747_1_1526644816031_3731060_36-147855-0 from training. Duration: 0.78 2023-10-04 08:33:28,909 WARNING [train.py:1204] (2/4) Exclude cut with ID _59256_210_5986_1_1535076085997_3589120_172-239045-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 08:33:28,919 WARNING [train.py:1204] (2/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_472-216663-0_sp1.1 from training. Duration: 0.836375 2023-10-04 08:33:28,975 WARNING [train.py:1204] (2/4) Exclude cut with ID _36512_210_6987_1_1533033143998_3126069_181-306500-0_sp1.1 from training. Duration: 0.863625 2023-10-04 08:33:28,996 WARNING [train.py:1204] (2/4) Exclude cut with ID _56524_210_9642_1_1534572011779_3619170_492-95008-0_sp1.1 from training. Duration: 0.97275 2023-10-04 08:33:30,426 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_33348_210_5632_1_1533086912_3324030_119-90326-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 08:33:31,817 WARNING [train.py:1204] (2/4) Exclude cut with ID _14368_210_4220_1_1530532714870_3595529_231-137892-0_sp1.1 from training. Duration: 0.9 2023-10-04 08:33:33,284 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8453_210_4211_1_1528502940_8562730_596-306820-0_sp1.1 from training. Duration: 0.8818125 2023-10-04 08:33:34,574 WARNING [train.py:1204] (2/4) Exclude cut with ID _27052_210_5986_1_1532329197187_3585009_50-242750-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 08:33:37,184 WARNING [train.py:1204] (2/4) Exclude cut with ID _13913_210_9994_1_1530579615187_3383897_358-98435-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 08:33:37,197 WARNING [train.py:1204] (2/4) Exclude cut with ID _27013_210_1389_1_1532437173240_3252649_399-229778-0_sp1.1 from training. Duration: 0.963625 2023-10-04 08:33:39,964 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.3.encoder.layers.0.conv_module1.whiten, num_groups=1, num_channels=512, metric=5.60 vs. limit=15.0 2023-10-04 08:33:40,427 WARNING [train.py:1204] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_185-174805-0_sp0.9 from training. Duration: 0.7555625 2023-10-04 08:33:46,438 WARNING [train.py:1204] (2/4) Exclude cut with ID _65585_210_12210_1_1535450451584_3764098_196-221014-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 08:33:47,870 WARNING [train.py:1204] (2/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_202-293523-0_sp1.1 from training. Duration: 0.6545625 2023-10-04 08:33:49,307 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8275_210_4198_1_1528455320_3892571_213-127634-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 08:33:49,537 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.ff3_skip_rate, batch_count=1591360.0, ans=0.0 2023-10-04 08:33:51,153 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_40896_210_15710_1_1533433956_7100802_921-231678-0_sp1.1 from training. Duration: 0.97275 2023-10-04 08:33:51,348 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.conv_skip_rate, batch_count=1591360.0, ans=0.0 2023-10-04 08:33:52,562 WARNING [train.py:1204] (2/4) Exclude cut with ID _38020_210_5986_1_1533112252259_7006089_493-102745-0_sp1.1 from training. Duration: 0.8909375 2023-10-04 08:33:54,431 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6846_210_6753_1_1527850767_4295158_2-315073-0 from training. Duration: 0.88 2023-10-04 08:33:54,496 WARNING [train.py:1204] (2/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_249-281349-0 from training. Duration: 0.85 2023-10-04 08:33:58,562 WARNING [train.py:1204] (2/4) Exclude cut with ID _18513_210_9206_1_1531618215223_3862620_314-83429-0_sp1.1 from training. Duration: 0.97275 2023-10-04 08:33:59,888 WARNING [train.py:1204] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_215-202973-0_sp0.9 from training. Duration: 0.97775 2023-10-04 08:33:59,897 WARNING [train.py:1204] (2/4) Exclude cut with ID _7719_210_3525_1_1528456153163_4296960_88-250259-0_sp1.1 from training. Duration: 0.763625 2023-10-04 08:34:01,292 WARNING [train.py:1204] (2/4) Exclude cut with ID _7606_210_4915_1_1528894253277_5080690_301-10732-0_sp1.1 from training. Duration: 0.736375 2023-10-04 08:34:01,301 WARNING [train.py:1204] (2/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_818-11907-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 08:34:02,669 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_5959_210_1794_1_1527400400_3445726_412-44052-0_sp1.1 from training. Duration: 0.7 2023-10-04 08:34:02,892 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.conv_module2.balancer1.prob, batch_count=1591426.6666666667, ans=0.125 2023-10-04 08:34:05,326 WARNING [train.py:1204] (2/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_6-279195-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 08:34:06,755 WARNING [train.py:1204] (2/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_252-216674-0_sp0.9 from training. Duration: 0.6 2023-10-04 08:34:09,447 WARNING [train.py:1204] (2/4) Exclude cut with ID _14368_210_4220_1_1530532714870_3595529_144-3287-0_sp1.1 from training. Duration: 0.6090625 2023-10-04 08:34:09,705 INFO [scaling.py:1118] (2/4) WithLoss: name=encoder.encoders.4.encoder.layers.0.self_attn_weights, loss-sum=0.000e+00 2023-10-04 08:34:10,901 WARNING [train.py:1204] (2/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_342-3879-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 08:34:10,936 WARNING [train.py:1204] (2/4) Exclude cut with ID _33640_210_2655_1_1533117605479_4855469_584-65630-0_sp1.1 from training. Duration: 0.97275 2023-10-04 08:34:11,188 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.nonlin_attention.balancer.prob, batch_count=1591426.6666666667, ans=0.125 2023-10-04 08:34:12,879 WARNING [train.py:1204] (2/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_37-189262-0 from training. Duration: 0.98 2023-10-04 08:34:12,908 WARNING [train.py:1204] (2/4) Exclude cut with ID _9389_210_1664_1_1529217337301_5289269_379-128794-0_sp0.9 from training. Duration: 0.9444375 2023-10-04 08:34:14,228 WARNING [train.py:1204] (2/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_119-207162-0_sp1.1 from training. Duration: 0.6454375 2023-10-04 08:34:18,462 WARNING [train.py:1204] (2/4) Exclude cut with ID _2941_210_2577_1_1526199673150_2489759_200-333643-0_sp1.1 from training. Duration: 0.936375 2023-10-04 08:34:18,556 WARNING [train.py:1204] (2/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_479-125611-0_sp1.1 from training. Duration: 0.87275 2023-10-04 08:34:18,572 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_3475_210_2028_1_1526193578_3622569_64-297072-0_sp1.1 from training. Duration: 0.82725 2023-10-04 08:34:20,366 WARNING [train.py:1204] (2/4) Exclude cut with ID _53565_210_10635_1_1534337218335_3820259_139-346291-0_sp1.1 from training. Duration: 0.97275 2023-10-04 08:34:21,690 WARNING [train.py:1204] (2/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_588-125165-0_sp1.1 from training. Duration: 0.7545625 2023-10-04 08:34:21,757 WARNING [train.py:1204] (2/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_938-205135-0_sp1.1 from training. Duration: 0.7909375 2023-10-04 08:34:25,418 WARNING [train.py:1204] (2/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_216-48707-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 08:34:25,476 WARNING [train.py:1204] (2/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_477-255782-0_sp1.1 from training. Duration: 0.7454375 2023-10-04 08:34:25,512 WARNING [train.py:1204] (2/4) Exclude cut with ID _16909_210_5724_1_1531292079912_4118919_583-92842-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 08:34:26,878 WARNING [train.py:1204] (2/4) Exclude cut with ID _60860_210_8254_1_1534896015875_3557169_245-256666-0 from training. Duration: 0.97 2023-10-04 08:34:29,718 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_45399_210_4852_1_1533624725_7579737_349-116173-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 08:34:33,774 INFO [train.py:1046] (2/4) Epoch 45, batch 5000, loss[loss=0.146, simple_loss=0.225, pruned_loss=0.03354, over 24617.00 frames. ], tot_loss[loss=0.1533, simple_loss=0.2326, pruned_loss=0.03698, over 4707577.01 frames. ], batch size: 60, lr: 2.25e-03, grad_scale: 16.0 2023-10-04 08:34:35,152 WARNING [train.py:1204] (2/4) Exclude cut with ID _8699_210_2069_1_1528630346831_3960409_155-39766-0 from training. Duration: 0.78 2023-10-04 08:34:35,167 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_86-16881-0_sp1.1 from training. Duration: 0.52725 2023-10-04 08:34:39,138 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.771e+02 2.112e+02 2.442e+02 2.961e+02 4.557e+02, threshold=4.884e+02, percent-clipped=1.0 2023-10-04 08:34:43,357 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_14056_210_4163_1_1530604483_7572827_191-159384-0_sp1.1 from training. Duration: 0.97275 2023-10-04 08:34:43,365 WARNING [train.py:1204] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_307-158462-0_sp1.1 from training. Duration: 0.62725 2023-10-04 08:34:43,472 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7544_210_6753_1_1528591327_1471426_33-346031-0 from training. Duration: 0.61 2023-10-04 08:34:44,864 WARNING [train.py:1204] (2/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_112-149213-0 from training. Duration: 0.95 2023-10-04 08:34:47,967 WARNING [train.py:1204] (2/4) Exclude cut with ID _55844_210_2346_1_1534471212569_7625707_925-191342-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 08:34:48,126 WARNING [train.py:1204] (2/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_87-278686-0 from training. Duration: 0.77 2023-10-04 08:34:49,412 WARNING [train.py:1204] (2/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_415-78792-0_sp1.1 from training. Duration: 0.736375 2023-10-04 08:34:49,421 WARNING [train.py:1204] (2/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_107-332398-0_sp0.9 from training. Duration: 0.8666875 2023-10-04 08:34:49,514 WARNING [train.py:1204] (2/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_72-251255-0 from training. Duration: 0.94 2023-10-04 08:34:50,723 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_431-227273-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 08:34:50,779 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7544_210_6753_1_1528587000_691468_87-178599-0_sp0.9 from training. Duration: 0.9666875 2023-10-04 08:34:52,726 WARNING [train.py:1204] (2/4) Exclude cut with ID _13916_210_9994_1_1530788341126_3763149_60-191556-0 from training. Duration: 0.61 2023-10-04 08:34:52,728 WARNING [train.py:1204] (2/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_746-164782-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 08:34:52,777 WARNING [train.py:1204] (2/4) Exclude cut with ID _33640_210_2655_1_1533117605479_4855469_596-21066-0_sp1.1 from training. Duration: 0.92725 2023-10-04 08:34:54,714 WARNING [train.py:1204] (2/4) Exclude cut with ID _40402_210_18219_1_1533522324115_2953150_320-14078-0 from training. Duration: 0.95 2023-10-04 08:34:56,510 WARNING [train.py:1204] (2/4) Exclude cut with ID _29877_210_6228_1_1532397791106_5276149_266-62291-0 from training. Duration: 0.87 2023-10-04 08:34:56,573 WARNING [train.py:1204] (2/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_176-56120-0_sp0.9 from training. Duration: 0.92225 2023-10-04 08:34:57,918 WARNING [train.py:1204] (2/4) Exclude cut with ID _6275_210_2084_1_1528110062213_7669049_91-26919-0 from training. Duration: 0.97 2023-10-04 08:34:57,926 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_224-322394-0_sp0.9 from training. Duration: 0.7333125 2023-10-04 08:34:57,971 WARNING [train.py:1204] (2/4) Exclude cut with ID _44982_210_1511_1_1534312820108_3726029_586-297059-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 08:34:58,022 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_196-289637-0_sp1.1 from training. Duration: 0.6818125 2023-10-04 08:34:58,023 WARNING [train.py:1204] (2/4) Exclude cut with ID _18513_210_9206_1_1531618215223_3862620_402-5957-0 from training. Duration: 0.99 2023-10-04 08:34:59,207 WARNING [train.py:1204] (2/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_81-300212-0 from training. Duration: 0.6 2023-10-04 08:34:59,549 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.conv_module2.balancer2.min_positive, batch_count=1591626.6666666667, ans=0.05 2023-10-04 08:35:00,744 WARNING [train.py:1204] (2/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_166-128941-0 from training. Duration: 0.54 2023-10-04 08:35:00,758 WARNING [train.py:1204] (2/4) Exclude cut with ID _58059_210_5854_1_1534901471149_3164808_57-309660-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 08:35:00,922 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.0.feed_forward3.hidden_balancer.prob, batch_count=1591626.6666666667, ans=0.125 2023-10-04 08:35:02,116 WARNING [train.py:1204] (2/4) Exclude cut with ID _60621_210_17071_1_1535013053530_3671739_73-155814-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 08:35:02,345 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.nonlin_attention.balancer.prob, batch_count=1591693.3333333333, ans=0.125 2023-10-04 08:35:03,526 WARNING [train.py:1204] (2/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_726-270780-0 from training. Duration: 0.86 2023-10-04 08:35:03,538 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8853_210_3273_1_1528633899_818968_51-187685-0_sp1.1 from training. Duration: 0.62725 2023-10-04 08:35:04,159 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.4.encoder.layers.1.self_attn_weights.whiten_keys, num_groups=4, num_channels=128, metric=2.10 vs. limit=6.0 2023-10-04 08:35:04,909 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_315-212794-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 08:35:06,218 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_53373_210_5399_1_1534229471_4715695_314-122795-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 08:35:07,692 WARNING [train.py:1204] (2/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_187-221566-0_sp0.9 from training. Duration: 0.47775 2023-10-04 08:35:09,125 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_12868_210_6753_1_1530702479_3610564_133-564-0 from training. Duration: 0.79 2023-10-04 08:35:10,431 WARNING [train.py:1204] (2/4) Exclude cut with ID _55153_210_2253_1_1534384786677_3162096_255-228344-0_sp1.1 from training. Duration: 0.936375 2023-10-04 08:35:11,804 WARNING [train.py:1204] (2/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_383-165234-0_sp1.1 from training. Duration: 0.8545625 2023-10-04 08:35:15,966 WARNING [train.py:1204] (2/4) Exclude cut with ID IC0677W0056-334889 from training. Duration: 0.768 2023-10-04 08:35:18,101 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_14953_210_2151_1_1530701710_5616165_710-165400-0_sp0.9 from training. Duration: 0.9666875 2023-10-04 08:35:19,294 WARNING [train.py:1204] (2/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_355-236205-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 08:35:19,295 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_34450_210_9547_1_1533099443_4109050_503-246893-0_sp1.1 from training. Duration: 0.963625 2023-10-04 08:35:19,354 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder_embed.dropout.p, batch_count=1591760.0, ans=0.1 2023-10-04 08:35:23,875 WARNING [train.py:1204] (2/4) Exclude cut with ID _43552_210_8913_1_1533726031059_3523429_25-120161-0 from training. Duration: 0.98 2023-10-04 08:35:23,906 WARNING [train.py:1204] (2/4) Exclude cut with ID _66112_210_10635_1_1535590236305_3801484_205-311930-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 08:35:23,934 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_48049_210_5632_1_1533864553_3519937_185-281218-0_sp1.1 from training. Duration: 0.92725 2023-10-04 08:35:25,804 WARNING [train.py:1204] (2/4) Exclude cut with ID _48782_210_12658_1_1534467701544_2300422_191-332415-0_sp1.1 from training. Duration: 0.97275 2023-10-04 08:35:27,242 WARNING [train.py:1204] (2/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_907-151398-0_sp1.1 from training. Duration: 0.336375 2023-10-04 08:35:29,012 WARNING [train.py:1204] (2/4) Exclude cut with ID _67499_210_17282_1_1535509805220_3692430_20-179346-0_sp1.1 from training. Duration: 0.8818125 2023-10-04 08:35:30,531 WARNING [train.py:1204] (2/4) Exclude cut with ID _14998_210_5219_1_1530705582080_7474970_441-91466-0_sp1.1 from training. Duration: 0.8818125 2023-10-04 08:35:32,226 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.bypass.skip_rate, batch_count=1591826.6666666667, ans=0.07 2023-10-04 08:35:33,281 WARNING [train.py:1204] (2/4) Exclude cut with ID _46681_210_9642_1_1533893005571_7603359_786-164007-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 08:35:37,502 WARNING [train.py:1204] (2/4) Exclude cut with ID _14970_210_15709_1_1530773543493_1715730_187-20259-0 from training. Duration: 0.89 2023-10-04 08:35:41,503 WARNING [train.py:1204] (2/4) Exclude cut with ID _12734_210_8341_1_1530183614296_3891929_383-339075-0_sp1.1 from training. Duration: 0.963625 2023-10-04 08:35:47,217 INFO [train.py:1046] (2/4) Epoch 45, batch 5050, loss[loss=0.165, simple_loss=0.2554, pruned_loss=0.03727, over 23941.00 frames. ], tot_loss[loss=0.1536, simple_loss=0.2335, pruned_loss=0.03684, over 4716503.56 frames. ], batch size: 80, lr: 2.25e-03, grad_scale: 8.0 2023-10-04 08:35:49,302 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.conv_skip_rate, batch_count=1591893.3333333333, ans=0.0 2023-10-04 08:35:50,368 WARNING [train.py:1204] (2/4) Exclude cut with ID _37086_210_9038_1_1532999540891_7021250_834-322225-0_sp1.1 from training. Duration: 0.92725 2023-10-04 08:35:50,480 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_10987_210_4163_1_1529760555_4123902_56-290559-0_sp1.1 from training. Duration: 0.963625 2023-10-04 08:35:51,810 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_92-246270-0_sp0.9 from training. Duration: 0.9444375 2023-10-04 08:35:51,840 WARNING [train.py:1204] (2/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_56-13561-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 08:35:53,201 WARNING [train.py:1204] (2/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_393-57888-0_sp1.1 from training. Duration: 0.5818125 2023-10-04 08:35:53,222 WARNING [train.py:1204] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_168-32906-0_sp1.1 from training. Duration: 0.62725 2023-10-04 08:35:53,266 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_51225_210_3601_1_1534400634_2327383_127-207553-0_sp1.1 from training. Duration: 0.963625 2023-10-04 08:35:58,145 WARNING [train.py:1204] (2/4) Exclude cut with ID _58583_210_5236_1_1534726835266_3574249_494-203521-0_sp1.1 from training. Duration: 0.963625 2023-10-04 08:35:58,157 WARNING [train.py:1204] (2/4) Exclude cut with ID _49376_210_5986_1_1533985278732_3370740_354-344695-0 from training. Duration: 0.99 2023-10-04 08:35:58,362 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=1591893.3333333333, ans=0.1 2023-10-04 08:35:59,570 WARNING [train.py:1204] (2/4) Exclude cut with ID _8021_210_7994_1_1528376451358_3653069_179-45206-0_sp1.1 from training. Duration: 0.7818125 2023-10-04 08:36:00,993 WARNING [train.py:1204] (2/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_742-154017-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 08:36:01,745 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.2.encoder.layers.1.conv_module2.whiten, num_groups=1, num_channels=384, metric=4.86 vs. limit=15.0 2023-10-04 08:36:02,985 WARNING [train.py:1204] (2/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_222-198127-0_sp1.1 from training. Duration: 0.763625 2023-10-04 08:36:04,273 WARNING [train.py:1204] (2/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_235-307337-0 from training. Duration: 0.94 2023-10-04 08:36:04,337 WARNING [train.py:1204] (2/4) Exclude cut with ID _8393_210_6702_1_1528505860249_3457328_453-176500-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 08:36:05,623 WARNING [train.py:1204] (2/4) Exclude cut with ID _53565_210_10635_1_1534337218335_3820259_102-179528-0_sp1.1 from training. Duration: 0.97275 2023-10-04 08:36:07,061 WARNING [train.py:1204] (2/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_43-206757-0_sp1.1 from training. Duration: 0.6818125 2023-10-04 08:36:08,371 WARNING [train.py:1204] (2/4) Exclude cut with ID _8394_210_6702_1_1528590504316_3540189_335-274780-0_sp1.1 from training. Duration: 0.7090625 2023-10-04 08:36:08,406 WARNING [train.py:1204] (2/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_128-296581-0_sp1.1 from training. Duration: 0.67275 2023-10-04 08:36:16,932 WARNING [train.py:1204] (2/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_111-112362-0 from training. Duration: 0.91 2023-10-04 08:36:16,975 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_5522_210_3273_1_1527249606_3909096_366-172508-0_sp1.1 from training. Duration: 0.6 2023-10-04 08:36:18,306 WARNING [train.py:1204] (2/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_47-167373-0_sp1.1 from training. Duration: 0.836375 2023-10-04 08:36:18,941 WARNING [train.py:1204] (2/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_41-234353-0 from training. Duration: 0.68 2023-10-04 08:36:18,972 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6291_210_3273_1_1527683649_1078498_82-221196-0_sp0.9 from training. Duration: 0.8444375 2023-10-04 08:36:21,627 WARNING [train.py:1204] (2/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_47-4814-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 08:36:21,661 WARNING [train.py:1204] (2/4) Exclude cut with ID _43541_210_10626_1_1533796233418_3635440_40-247993-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 08:36:23,051 WARNING [train.py:1204] (2/4) Exclude cut with ID _21989_210_5854_1_1531899214931_3271360_133-44759-0_sp0.9 from training. Duration: 0.8555625 2023-10-04 08:36:23,054 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_94-195801-0 from training. Duration: 0.94 2023-10-04 08:36:24,369 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_141-89113-0 from training. Duration: 0.86 2023-10-04 08:36:25,677 WARNING [train.py:1204] (2/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_683-284578-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 08:36:27,767 WARNING [train.py:1204] (2/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_6-214815-0_sp1.1 from training. Duration: 0.77275 2023-10-04 08:36:29,334 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.balancer1.prob, batch_count=1592026.6666666667, ans=0.125 2023-10-04 08:36:30,591 WARNING [train.py:1204] (2/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_556-212082-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 08:36:32,517 WARNING [train.py:1204] (2/4) Exclude cut with ID _44902_210_16187_1_1533888006062_3792078_149-67212-0 from training. Duration: 0.87 2023-10-04 08:36:32,792 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.feed_forward1.out_proj.dropout_p, batch_count=1592093.3333333333, ans=0.1 2023-10-04 08:36:33,980 WARNING [train.py:1204] (2/4) Exclude cut with ID _61148_210_6897_1_1535094022726_4217610_333-159440-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 08:36:35,446 WARNING [train.py:1204] (2/4) Exclude cut with ID _3120_210_3525_1_1526036143112_4072980_404-249994-0 from training. Duration: 0.99 2023-10-04 08:36:36,864 WARNING [train.py:1204] (2/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_234-260682-0_sp0.9 from training. Duration: 0.9333125 2023-10-04 08:36:36,890 WARNING [train.py:1204] (2/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_374-138971-0_sp1.1 from training. Duration: 0.8545625 2023-10-04 08:36:38,207 WARNING [train.py:1204] (2/4) Exclude cut with ID _53713_210_24116_1_1534314487762_7416751_743-302085-0_sp1.1 from training. Duration: 0.92725 2023-10-04 08:36:38,266 WARNING [train.py:1204] (2/4) Exclude cut with ID _18516_210_9206_1_1531704653206_3692406_198-334870-0_sp1.1 from training. Duration: 0.836375 2023-10-04 08:36:38,389 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.1.bypass.scale_min, batch_count=1592093.3333333333, ans=0.2 2023-10-04 08:36:40,855 WARNING [train.py:1204] (2/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_395-73570-0_sp1.1 from training. Duration: 0.9 2023-10-04 08:36:42,287 WARNING [train.py:1204] (2/4) Exclude cut with ID _46683_210_9642_1_1533730913492_4591116_421-19547-0_sp1.1 from training. Duration: 0.936375 2023-10-04 08:36:43,613 WARNING [train.py:1204] (2/4) Exclude cut with ID _19896_210_1883_1_1531709022662_6649569_714-8668-0_sp1.1 from training. Duration: 0.963625 2023-10-04 08:36:43,646 WARNING [train.py:1204] (2/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_49-101072-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 08:36:43,653 WARNING [train.py:1204] (2/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_220-191080-0_sp1.1 from training. Duration: 0.8909375 2023-10-04 08:36:44,858 WARNING [train.py:1204] (2/4) Exclude cut with ID _3465_210_3525_1_1526175667060_3574049_62-335552-0 from training. Duration: 0.87 2023-10-04 08:36:44,942 WARNING [train.py:1204] (2/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_111-112362-0_sp1.1 from training. Duration: 0.82725 2023-10-04 08:36:46,388 WARNING [train.py:1204] (2/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_318-59097-0_sp0.9 from training. Duration: 0.8444375 2023-10-04 08:36:46,521 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.0.attention_skip_rate, batch_count=1592160.0, ans=0.0 2023-10-04 08:36:49,232 WARNING [train.py:1204] (2/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_98-255710-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 08:36:49,238 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9042W0003-515609 from training. Duration: 0.896 2023-10-04 08:36:49,246 WARNING [train.py:1204] (2/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_158-250116-0_sp0.9 from training. Duration: 0.688875 2023-10-04 08:36:50,667 WARNING [train.py:1204] (2/4) Exclude cut with ID _26750_210_5665_1_1532226678442_4063230_7-346885-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 08:36:52,329 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_37839_210_7234_1_1533542462_3689303_357-227078-0_sp1.1 from training. Duration: 0.963625 2023-10-04 08:36:52,364 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9052W0119-520904 from training. Duration: 0.896 2023-10-04 08:36:55,481 WARNING [train.py:1204] (2/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_249-281349-0_sp1.1 from training. Duration: 0.77275 2023-10-04 08:36:55,486 WARNING [train.py:1204] (2/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_147-293311-0 from training. Duration: 0.94 2023-10-04 08:36:55,487 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_158-21166-0_sp1.1 from training. Duration: 0.963625 2023-10-04 08:36:55,780 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.feed_forward2.hidden_balancer.prob, batch_count=1592160.0, ans=0.125 2023-10-04 08:36:57,390 WARNING [train.py:1204] (2/4) Exclude cut with ID _14100_210_8341_1_1530529320613_3678069_207-332194-0_sp1.1 from training. Duration: 0.92725 2023-10-04 08:36:58,685 WARNING [train.py:1204] (2/4) Exclude cut with ID _9389_210_1664_1_1529217337301_5289269_324-211031-0_sp1.1 from training. Duration: 0.963625 2023-10-04 08:36:58,698 WARNING [train.py:1204] (2/4) Exclude cut with ID _14603_210_4220_1_1530615688441_3097319_133-158308-0 from training. Duration: 0.84 2023-10-04 08:37:00,679 WARNING [train.py:1204] (2/4) Exclude cut with ID _13252_210_1925_1_1530671372047_3336909_166-314607-0 from training. Duration: 0.54 2023-10-04 08:37:01,988 INFO [train.py:1046] (2/4) Epoch 45, batch 5100, loss[loss=0.1935, simple_loss=0.2617, pruned_loss=0.06266, over 19249.00 frames. ], tot_loss[loss=0.1538, simple_loss=0.2341, pruned_loss=0.03679, over 4710890.97 frames. ], batch size: 388, lr: 2.24e-03, grad_scale: 8.0 2023-10-04 08:37:03,288 WARNING [train.py:1204] (2/4) Exclude cut with ID _32704_210_8954_1_1533017488840_6747370_649-79538-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 08:37:03,296 WARNING [train.py:1204] (2/4) Exclude cut with ID _28394_210_8913_1_1532430049518_3650118_343-115667-0_sp1.1 from training. Duration: 0.97275 2023-10-04 08:37:03,356 WARNING [train.py:1204] (2/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_87-278686-0_sp0.9 from training. Duration: 0.8555625 2023-10-04 08:37:05,962 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9109W0003-549749 from training. Duration: 0.768 2023-10-04 08:37:07,264 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.600e+02 1.979e+02 2.119e+02 2.358e+02 3.619e+02, threshold=4.239e+02, percent-clipped=0.0 2023-10-04 08:37:08,311 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.0.layers.0.self_attn1.whiten, num_groups=1, num_channels=192, metric=10.14 vs. limit=22.5 2023-10-04 08:37:08,615 WARNING [train.py:1204] (2/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_126-29104-0_sp1.1 from training. Duration: 0.77275 2023-10-04 08:37:08,871 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.balancer1.prob, batch_count=1592226.6666666667, ans=0.125 2023-10-04 08:37:08,895 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.attention_skip_rate, batch_count=1592226.6666666667, ans=0.0 2023-10-04 08:37:10,070 WARNING [train.py:1204] (2/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_325-113798-0 from training. Duration: 0.85 2023-10-04 08:37:10,127 WARNING [train.py:1204] (2/4) Exclude cut with ID _6694_210_4599_1_1527852637490_3965529_116-109398-0 from training. Duration: 0.73 2023-10-04 08:37:10,169 WARNING [train.py:1204] (2/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_46-57119-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 08:37:12,174 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.3.encoder.layers.3.nonlin_attention.whiten2, num_groups=1, num_channels=512, metric=5.91 vs. limit=15.0 2023-10-04 08:37:12,843 WARNING [train.py:1204] (2/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_546-119312-0_sp1.1 from training. Duration: 0.9 2023-10-04 08:37:13,019 WARNING [train.py:1204] (2/4) Exclude cut with ID _60057_210_10640_1_1534903065793_3333150_468-306939-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 08:37:14,361 WARNING [train.py:1204] (2/4) Exclude cut with ID _15150_210_8341_1_1530752411803_7256384_22-138274-0 from training. Duration: 0.96 2023-10-04 08:37:14,384 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_289-180904-0 from training. Duration: 0.76 2023-10-04 08:37:20,270 WARNING [train.py:1204] (2/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_64-133721-0_sp1.1 from training. Duration: 0.92725 2023-10-04 08:37:20,319 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_195-257512-0_sp1.1 from training. Duration: 0.6090625 2023-10-04 08:37:20,527 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.conv_module2.balancer1.prob, batch_count=1592293.3333333333, ans=0.125 2023-10-04 08:37:24,791 WARNING [train.py:1204] (2/4) Exclude cut with ID _66873_210_5517_1_1535454136506_4293370_704-101963-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 08:37:28,086 WARNING [train.py:1204] (2/4) Exclude cut with ID _4366_210_1388_1_1526792053633_3613819_231-211402-0 from training. Duration: 0.94 2023-10-04 08:37:28,142 WARNING [train.py:1204] (2/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_679-190385-0_sp1.1 from training. Duration: 0.97275 2023-10-04 08:37:29,623 WARNING [train.py:1204] (2/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_298-81459-0_sp1.1 from training. Duration: 0.963625 2023-10-04 08:37:29,639 WARNING [train.py:1204] (2/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_658-282610-0_sp0.9 from training. Duration: 0.588875 2023-10-04 08:37:32,834 WARNING [train.py:1204] (2/4) Exclude cut with ID _57572_210_5517_1_1534921219624_3809484_196-283734-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 08:37:32,918 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_53071_210_16554_1_1534490405_2954066_294-198554-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 08:37:32,930 WARNING [train.py:1204] (2/4) Exclude cut with ID _4866_210_2577_1_1527404412284_4364659_352-157930-0 from training. Duration: 0.81 2023-10-04 08:37:35,723 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9231W0113-610139 from training. Duration: 0.896 2023-10-04 08:37:37,122 WARNING [train.py:1204] (2/4) Exclude cut with ID _24271_210_12658_1_1531987270022_7190519_638-171906-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 08:37:37,151 WARNING [train.py:1204] (2/4) Exclude cut with ID _14170_210_5219_1_1530525949868_4469650_272-249985-0 from training. Duration: 0.67 2023-10-04 08:37:37,168 WARNING [train.py:1204] (2/4) Exclude cut with ID _64086_210_7994_1_1535270417019_7435949_449-31567-0 from training. Duration: 0.97 2023-10-04 08:37:39,898 WARNING [train.py:1204] (2/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_109-282568-0_sp1.1 from training. Duration: 0.97275 2023-10-04 08:37:40,106 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.conv_module1.balancer2.min_abs, batch_count=1592360.0, ans=0.5 2023-10-04 08:37:45,581 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.balancer2.prob, batch_count=1592426.6666666667, ans=0.125 2023-10-04 08:37:48,171 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_121-45934-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 08:37:49,986 WARNING [train.py:1204] (2/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_314-126819-0 from training. Duration: 0.5 2023-10-04 08:37:51,303 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9309W0192-640319 from training. Duration: 0.512 2023-10-04 08:37:51,311 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9310W0003-640649 from training. Duration: 0.896 2023-10-04 08:37:52,718 WARNING [train.py:1204] (2/4) Exclude cut with ID _7160_210_1876_1_1527940923231_3703799_120-150943-0 from training. Duration: 0.81 2023-10-04 08:37:52,719 WARNING [train.py:1204] (2/4) Exclude cut with ID _40380_210_9059_1_1533380594759_4187380_6-33508-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 08:37:55,948 WARNING [train.py:1204] (2/4) Exclude cut with ID _59202_210_26984_1_1534816810612_3549174_137-318710-0 from training. Duration: 0.89 2023-10-04 08:38:00,557 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.1.encoder.layers.1.nonlin_attention.whiten1, num_groups=1, num_channels=192, metric=5.69 vs. limit=10.0 2023-10-04 08:38:01,229 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_435-313305-0 from training. Duration: 0.71 2023-10-04 08:38:03,972 WARNING [train.py:1204] (2/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_91-212486-0_sp1.1 from training. Duration: 0.6181875 2023-10-04 08:38:06,816 WARNING [train.py:1204] (2/4) Exclude cut with ID _14603_210_4220_1_1530615688441_3097319_133-158308-0_sp1.1 from training. Duration: 0.763625 2023-10-04 08:38:07,044 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.attention_skip_rate, batch_count=1592493.3333333333, ans=0.0 2023-10-04 08:38:10,820 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_99-251314-0 from training. Duration: 0.87 2023-10-04 08:38:11,007 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.1.conv_module2.balancer2.prob, batch_count=1592493.3333333333, ans=0.125 2023-10-04 08:38:12,371 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_365-217119-0_sp1.1 from training. Duration: 0.57275 2023-10-04 08:38:12,406 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_3475_210_2028_1_1526193578_3622569_377-166133-0 from training. Duration: 0.86 2023-10-04 08:38:15,191 INFO [train.py:1046] (2/4) Epoch 45, batch 5150, loss[loss=0.1621, simple_loss=0.251, pruned_loss=0.0366, over 24645.00 frames. ], tot_loss[loss=0.1546, simple_loss=0.2351, pruned_loss=0.03702, over 4715398.14 frames. ], batch size: 68, lr: 2.24e-03, grad_scale: 8.0 2023-10-04 08:38:18,089 WARNING [train.py:1204] (2/4) Exclude cut with ID _13916_210_9994_1_1530788341126_3763149_116-21034-0_sp1.1 from training. Duration: 0.87275 2023-10-04 08:38:18,108 WARNING [train.py:1204] (2/4) Exclude cut with ID _64220_210_9527_1_1535250151772_4875259_142-216831-0_sp1.1 from training. Duration: 0.936375 2023-10-04 08:38:18,108 WARNING [train.py:1204] (2/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_490-207861-0_sp1.1 from training. Duration: 0.8909375 2023-10-04 08:38:18,171 WARNING [train.py:1204] (2/4) Exclude cut with ID _41314_210_12651_1_1533459476543_3766460_87-272068-0_sp0.9 from training. Duration: 0.92225 2023-10-04 08:38:18,192 WARNING [train.py:1204] (2/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_18-46756-0_sp1.1 from training. Duration: 0.7181875 2023-10-04 08:38:19,513 WARNING [train.py:1204] (2/4) Exclude cut with ID _39411_210_5986_1_1533538825030_3343649_98-311335-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 08:38:20,848 WARNING [train.py:1204] (2/4) Exclude cut with ID _21878_210_3525_1_1531738772002_7264330_113-18163-0 from training. Duration: 0.89 2023-10-04 08:38:20,851 WARNING [train.py:1204] (2/4) Exclude cut with ID _6653_210_2629_1_1527919172441_6732679_1103-56445-0 from training. Duration: 0.73 2023-10-04 08:38:22,183 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_3474_210_2028_1_1526188513_4825246_145-309131-0 from training. Duration: 0.74 2023-10-04 08:38:22,209 WARNING [train.py:1204] (2/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_120-211630-0_sp1.1 from training. Duration: 0.836375 2023-10-04 08:38:22,219 WARNING [train.py:1204] (2/4) Exclude cut with ID _8030_210_7497_1_1528444886781_3531089_32-31837-0 from training. Duration: 0.97 2023-10-04 08:38:23,605 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_19949_210_2161_1_1531706337_928079_8-160956-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 08:38:25,363 WARNING [train.py:1204] (2/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_321-215742-0_sp1.1 from training. Duration: 0.4545625 2023-10-04 08:38:26,782 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_583-124650-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 08:38:26,868 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_30434_210_13049_1_1532519384_981746_164-182755-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 08:38:34,778 WARNING [train.py:1204] (2/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_339-104791-0_sp1.1 from training. Duration: 0.4454375 2023-10-04 08:38:34,808 WARNING [train.py:1204] (2/4) Exclude cut with ID _15026_210_8750_1_1530777602316_3291850_211-195043-0 from training. Duration: 0.8 2023-10-04 08:38:36,256 WARNING [train.py:1204] (2/4) Exclude cut with ID _29877_210_6228_1_1532397791106_5276149_487-182647-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 08:38:36,297 WARNING [train.py:1204] (2/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_127-566-0_sp0.9 from training. Duration: 0.9333125 2023-10-04 08:38:37,725 WARNING [train.py:1204] (2/4) Exclude cut with ID _8263_210_1664_1_1528439452891_970220_68-181805-0_sp0.9 from training. Duration: 0.711125 2023-10-04 08:38:37,726 WARNING [train.py:1204] (2/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_504-72513-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 08:38:37,734 WARNING [train.py:1204] (2/4) Exclude cut with ID _64639_210_5816_1_1535367619437_3528000_324-265233-0_sp1.1 from training. Duration: 0.963625 2023-10-04 08:38:39,093 WARNING [train.py:1204] (2/4) Exclude cut with ID _6456_210_1534_1_1527681634212_3181049_215-309359-0_sp0.9 from training. Duration: 0.97775 2023-10-04 08:38:39,098 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_115-233929-0_sp1.1 from training. Duration: 0.6545625 2023-10-04 08:38:39,142 WARNING [train.py:1204] (2/4) Exclude cut with ID _14087_210_2253_1_1530529224512_3014029_219-224445-0 from training. Duration: 0.84 2023-10-04 08:38:40,559 WARNING [train.py:1204] (2/4) Exclude cut with ID _14867_210_1968_1_1530684179890_3846969_413-29292-0_sp1.1 from training. Duration: 0.8181875 2023-10-04 08:38:40,610 WARNING [train.py:1204] (2/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_1012-279473-0_sp0.9 from training. Duration: 0.7444375 2023-10-04 08:38:43,333 WARNING [train.py:1204] (2/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_605-133395-0_sp0.9 from training. Duration: 0.7333125 2023-10-04 08:38:44,751 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_89-35748-0 from training. Duration: 0.79 2023-10-04 08:38:46,128 WARNING [train.py:1204] (2/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_476-272757-0_sp1.1 from training. Duration: 0.8454375 2023-10-04 08:38:50,456 WARNING [train.py:1204] (2/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_449-15508-0_sp0.9 from training. Duration: 0.911125 2023-10-04 08:38:53,171 WARNING [train.py:1204] (2/4) Exclude cut with ID _29345_210_4381_1_1532689168263_3860169_198-174328-0 from training. Duration: 0.91 2023-10-04 08:38:55,285 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_15517_210_6753_1_1530952293_1664795_125-158157-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 08:38:55,750 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.5.encoder.layers.0.conv_module1.whiten, num_groups=1, num_channels=256, metric=7.50 vs. limit=15.0 2023-10-04 08:39:01,096 WARNING [train.py:1204] (2/4) Exclude cut with ID _25565_210_12392_1_1532423009382_3952098_420-113180-0_sp1.1 from training. Duration: 0.963625 2023-10-04 08:39:01,738 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.4.encoder.layers.0.nonlin_attention.whiten2, num_groups=1, num_channels=384, metric=9.50 vs. limit=15.0 2023-10-04 08:39:03,002 WARNING [train.py:1204] (2/4) Exclude cut with ID _51471_210_1889_1_1534399391390_3094250_236-146773-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 08:39:05,839 WARNING [train.py:1204] (2/4) Exclude cut with ID _30039_210_13270_1_1532484612900_2725150_175-35012-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 08:39:05,879 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_23850_210_4238_1_1532232597_1632433_99-275843-0_sp1.1 from training. Duration: 0.936375 2023-10-04 08:39:08,785 WARNING [train.py:1204] (2/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_658-282610-0 from training. Duration: 0.53 2023-10-04 08:39:11,544 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_37818_210_4238_1_1533620910_4720083_234-65934-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 08:39:11,666 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.1.attention_skip_rate, batch_count=1592760.0, ans=0.0 2023-10-04 08:39:12,856 WARNING [train.py:1204] (2/4) Exclude cut with ID _3067_210_2667_1_1526212974640_4503168_459-173867-0_sp1.1 from training. Duration: 0.8 2023-10-04 08:39:12,876 WARNING [train.py:1204] (2/4) Exclude cut with ID _8696_210_1876_1_1528545632440_3345058_209-54069-0_sp0.9 from training. Duration: 0.7444375 2023-10-04 08:39:15,026 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.3.encoder.layers.2.whiten, num_groups=1, num_channels=512, metric=7.92 vs. limit=12.0 2023-10-04 08:39:15,672 WARNING [train.py:1204] (2/4) Exclude cut with ID _62574_210_2655_1_1535180407058_3933249_420-236465-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 08:39:15,872 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.balancer2.prob, batch_count=1592826.6666666667, ans=0.125 2023-10-04 08:39:17,025 WARNING [train.py:1204] (2/4) Exclude cut with ID _46681_210_9642_1_1533893005571_7603359_333-4042-0_sp1.1 from training. Duration: 0.936375 2023-10-04 08:39:18,335 WARNING [train.py:1204] (2/4) Exclude cut with ID _3541_210_2477_1_1526475408709_3263973_377-296839-0 from training. Duration: 0.55 2023-10-04 08:39:22,470 WARNING [train.py:1204] (2/4) Exclude cut with ID _43997_210_4525_1_1533538632555_5189329_426-92599-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 08:39:24,527 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_43-71989-0_sp1.1 from training. Duration: 0.5181875 2023-10-04 08:39:27,087 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_2009_210_2071_1_1525481134_4588937_140-345530-0_sp1.1 from training. Duration: 0.963625 2023-10-04 08:39:27,095 WARNING [train.py:1204] (2/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_138-332773-0_sp0.9 from training. Duration: 0.9 2023-10-04 08:39:28,933 INFO [train.py:1046] (2/4) Epoch 45, batch 5200, loss[loss=0.1417, simple_loss=0.2252, pruned_loss=0.0291, over 24283.00 frames. ], tot_loss[loss=0.1547, simple_loss=0.2353, pruned_loss=0.03704, over 4720899.00 frames. ], batch size: 56, lr: 2.24e-03, grad_scale: 16.0 2023-10-04 08:39:29,008 WARNING [train.py:1204] (2/4) Exclude cut with ID _13942_210_9988_1_1530604906036_3733360_499-55676-0_sp0.9 from training. Duration: 0.688875 2023-10-04 08:39:29,035 WARNING [train.py:1204] (2/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_259-34038-0_sp1.1 from training. Duration: 0.67275 2023-10-04 08:39:29,044 WARNING [train.py:1204] (2/4) Exclude cut with ID _3120_210_3525_1_1526036143112_4072980_404-249994-0_sp1.1 from training. Duration: 0.9 2023-10-04 08:39:29,069 WARNING [train.py:1204] (2/4) Exclude cut with ID _40402_210_18219_1_1533522324115_2953150_222-108810-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 08:39:32,338 WARNING [train.py:1204] (2/4) Exclude cut with ID _22083_210_8254_1_1531702862439_3678970_49-234546-0_sp0.9 from training. Duration: 0.92225 2023-10-04 08:39:33,680 WARNING [train.py:1204] (2/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_199-295611-0_sp0.9 from training. Duration: 0.611125 2023-10-04 08:39:35,577 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.683e+02 2.121e+02 2.351e+02 2.872e+02 5.392e+02, threshold=4.702e+02, percent-clipped=2.0 2023-10-04 08:39:38,365 WARNING [train.py:1204] (2/4) Exclude cut with ID _43552_210_8913_1_1533726031059_3523429_43-192945-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 08:39:41,296 WARNING [train.py:1204] (2/4) Exclude cut with ID _14224_210_4381_1_1530529062037_4264010_364-22817-0 from training. Duration: 0.71 2023-10-04 08:39:41,381 WARNING [train.py:1204] (2/4) Exclude cut with ID _14922_210_6228_1_1530707435273_3693310_388-129552-0_sp1.1 from training. Duration: 0.87275 2023-10-04 08:39:41,431 WARNING [train.py:1204] (2/4) Exclude cut with ID _6537_210_1883_1_1527987559710_3597129_190-280227-0_sp1.1 from training. Duration: 0.97275 2023-10-04 08:39:43,366 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.4.encoder.layers.0.feed_forward1.out_whiten, num_groups=1, num_channels=384, metric=5.09 vs. limit=15.0 2023-10-04 08:39:44,328 WARNING [train.py:1204] (2/4) Exclude cut with ID _55844_210_2346_1_1534471212569_7625707_398-56897-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 08:39:44,752 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.5.encoder.layers.0.nonlin_attention.whiten1, num_groups=1, num_channels=192, metric=3.72 vs. limit=10.0 2023-10-04 08:39:45,554 WARNING [train.py:1204] (2/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_48-182155-0_sp1.1 from training. Duration: 0.7909375 2023-10-04 08:39:45,581 WARNING [train.py:1204] (2/4) Exclude cut with ID _29529_210_7234_1_1532393961455_3977460_582-18084-0_sp1.1 from training. Duration: 0.97275 2023-10-04 08:39:47,044 WARNING [train.py:1204] (2/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_912-282412-0 from training. Duration: 0.49 2023-10-04 08:39:49,769 WARNING [train.py:1204] (2/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_332-256168-0_sp0.9 from training. Duration: 0.8333125 2023-10-04 08:39:49,822 WARNING [train.py:1204] (2/4) Exclude cut with ID _64220_210_9527_1_1535250151772_4875259_55-250624-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 08:39:52,464 WARNING [train.py:1204] (2/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_264-108685-0 from training. Duration: 0.55 2023-10-04 08:39:55,795 WARNING [train.py:1204] (2/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_613-232856-0_sp0.9 from training. Duration: 0.6 2023-10-04 08:39:57,058 WARNING [train.py:1204] (2/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_68-212801-0_sp1.1 from training. Duration: 0.663625 2023-10-04 08:39:57,105 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6290_210_3273_1_1527595224_1282787_115-87095-0 from training. Duration: 0.55 2023-10-04 08:39:57,141 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_3080_210_2071_1_1526086102_4365166_312-8726-0 from training. Duration: 0.91 2023-10-04 08:40:01,916 WARNING [train.py:1204] (2/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_105-63309-0 from training. Duration: 0.98 2023-10-04 08:40:01,990 WARNING [train.py:1204] (2/4) Exclude cut with ID _2941_210_2577_1_1526199673150_2489759_201-160779-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 08:40:01,993 WARNING [train.py:1204] (2/4) Exclude cut with ID ID0406W0226-885434 from training. Duration: 0.997 2023-10-04 08:40:02,000 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_17292_210_6753_1_1531125388_2943734_353-328201-0_sp1.1 from training. Duration: 0.97275 2023-10-04 08:40:03,475 WARNING [train.py:1204] (2/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_572-96029-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 08:40:03,519 WARNING [train.py:1204] (2/4) Exclude cut with ID _14970_210_15709_1_1530775547361_1966965_6-23328-0_sp0.9 from training. Duration: 0.9555625 2023-10-04 08:40:05,428 WARNING [train.py:1204] (2/4) Exclude cut with ID _45012_210_12651_1_1533608435136_5371380_240-37101-0 from training. Duration: 0.93 2023-10-04 08:40:05,500 WARNING [train.py:1204] (2/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_685-90183-0_sp1.1 from training. Duration: 0.936375 2023-10-04 08:40:06,982 WARNING [train.py:1204] (2/4) Exclude cut with ID _13252_210_1925_1_1530671372047_3336909_41-22245-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 08:40:09,675 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_15126_210_6753_1_1530778677_2591185_120-277707-0 from training. Duration: 0.8 2023-10-04 08:40:09,724 WARNING [train.py:1204] (2/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_310-26186-0 from training. Duration: 0.7 2023-10-04 08:40:11,042 WARNING [train.py:1204] (2/4) Exclude cut with ID _14998_210_5219_1_1530705582080_7474970_787-294416-0 from training. Duration: 0.82 2023-10-04 08:40:15,219 WARNING [train.py:1204] (2/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_795-33252-0 from training. Duration: 0.37 2023-10-04 08:40:16,550 WARNING [train.py:1204] (2/4) Exclude cut with ID _7537_210_2084_1_1528502395431_3924359_113-199933-0_sp0.9 from training. Duration: 0.6555625 2023-10-04 08:40:20,876 WARNING [train.py:1204] (2/4) Exclude cut with ID _3465_210_3525_1_1526175667060_3574049_308-231735-0_sp0.9 from training. Duration: 0.811125 2023-10-04 08:40:22,284 WARNING [train.py:1204] (2/4) Exclude cut with ID _19504_210_9994_1_1531648863242_3325220_354-296765-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 08:40:22,403 WARNING [train.py:1204] (2/4) Exclude cut with ID _5409_210_1388_1_1527292850189_5865191_178-127970-0 from training. Duration: 0.57 2023-10-04 08:40:24,262 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_1466-248056-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 08:40:24,280 WARNING [train.py:1204] (2/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_535-14410-0_sp0.9 from training. Duration: 0.52225 2023-10-04 08:40:24,283 WARNING [train.py:1204] (2/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_747-129941-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 08:40:25,537 WARNING [train.py:1204] (2/4) Exclude cut with ID _15012_210_1968_1_1530856820429_5095350_688-178849-0_sp1.1 from training. Duration: 0.5818125 2023-10-04 08:40:28,890 WARNING [train.py:1204] (2/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_356-45054-0_sp1.1 from training. Duration: 0.7818125 2023-10-04 08:40:30,302 WARNING [train.py:1204] (2/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_146-276944-0_sp0.9 from training. Duration: 0.92225 2023-10-04 08:40:33,178 WARNING [train.py:1204] (2/4) Exclude cut with ID _39204_210_4220_1_1533880547368_3191120_346-180306-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 08:40:33,266 WARNING [train.py:1204] (2/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_641-158017-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 08:40:33,267 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_19952_210_2161_1_1531875469_3851750_274-42137-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 08:40:39,661 WARNING [train.py:1204] (2/4) Exclude cut with ID _60804_210_9642_1_1534831199859_7268299_87-327819-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 08:40:39,736 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_9302_210_2550_1_1528889962_4655416_638-301620-0 from training. Duration: 0.47 2023-10-04 08:40:41,051 WARNING [train.py:1204] (2/4) Exclude cut with ID _44304_210_15374_1_1533639352765_4613129_302-22084-0_sp1.1 from training. Duration: 0.7818125 2023-10-04 08:40:41,062 WARNING [train.py:1204] (2/4) Exclude cut with ID _18513_210_9206_1_1531618215223_3862620_177-227063-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 08:40:42,507 WARNING [train.py:1204] (2/4) Exclude cut with ID _41326_210_5854_1_1533778076509_3492080_510-45444-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 08:40:42,565 WARNING [train.py:1204] (2/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_76-311963-0_sp1.1 from training. Duration: 0.636375 2023-10-04 08:40:43,932 INFO [train.py:1046] (2/4) Epoch 45, batch 5250, loss[loss=0.1501, simple_loss=0.2318, pruned_loss=0.03424, over 24337.00 frames. ], tot_loss[loss=0.155, simple_loss=0.2355, pruned_loss=0.03724, over 4711062.76 frames. ], batch size: 61, lr: 2.24e-03, grad_scale: 16.0 2023-10-04 08:40:44,040 WARNING [train.py:1204] (2/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_156-302675-0_sp0.9 from training. Duration: 0.611125 2023-10-04 08:40:46,599 WARNING [train.py:1204] (2/4) Exclude cut with ID _53489_210_6228_1_1534298357169_3835970_367-148411-0_sp1.1 from training. Duration: 0.936375 2023-10-04 08:40:48,233 WARNING [train.py:1204] (2/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_389-144525-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 08:40:48,268 WARNING [train.py:1204] (2/4) Exclude cut with ID _14069_210_5632_1_1530512692033_4803390_158-238427-0_sp1.1 from training. Duration: 0.963625 2023-10-04 08:40:50,889 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_486-151131-0_sp0.9 from training. Duration: 0.9444375 2023-10-04 08:40:57,371 WARNING [train.py:1204] (2/4) Exclude cut with ID _39846_210_12651_1_1533603651992_1318030_188-71831-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 08:40:57,494 WARNING [train.py:1204] (2/4) Exclude cut with ID _39411_210_5986_1_1533538825030_3343649_173-238248-0_sp1.1 from training. Duration: 0.7545625 2023-10-04 08:41:00,281 WARNING [train.py:1204] (2/4) Exclude cut with ID _20684_210_6917_1_1531567751646_4203930_469-49081-0_sp1.1 from training. Duration: 0.92725 2023-10-04 08:41:01,636 WARNING [train.py:1204] (2/4) Exclude cut with ID _8833_210_5219_1_1528592665210_7553059_611-80313-0_sp1.1 from training. Duration: 0.5818125 2023-10-04 08:41:03,013 WARNING [train.py:1204] (2/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_440-141811-0 from training. Duration: 0.95 2023-10-04 08:41:03,035 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_41211_210_2544_1_1533348859_3702910_236-223652-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 08:41:05,144 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=1593293.3333333333, ans=0.1 2023-10-04 08:41:06,637 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_40222_210_6228_1_1533385298_4481939_220-163998-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 08:41:11,032 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=1593293.3333333333, ans=0.1 2023-10-04 08:41:27,778 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.nonlin_attention.balancer.prob, batch_count=1593426.6666666667, ans=0.125 2023-10-04 08:41:31,628 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.balancer2.prob, batch_count=1593426.6666666667, ans=0.125 2023-10-04 08:41:40,690 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.0.bypass_mid.scale_min, batch_count=1593493.3333333333, ans=0.2 2023-10-04 08:41:47,422 INFO [scaling.py:1118] (2/4) WithLoss: name=encoder.encoders.0.layers.0.self_attn_weights, loss-sum=0.000e+00 2023-10-04 08:41:52,481 INFO [train.py:1046] (2/4) Epoch 45, batch 5300, loss[loss=0.1555, simple_loss=0.247, pruned_loss=0.03198, over 24636.00 frames. ], tot_loss[loss=0.1533, simple_loss=0.2333, pruned_loss=0.03667, over 4704147.29 frames. ], batch size: 68, lr: 2.24e-03, grad_scale: 16.0 2023-10-04 08:41:57,031 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.conv_skip_rate, batch_count=1593560.0, ans=0.0 2023-10-04 08:41:58,014 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.663e+02 2.084e+02 2.270e+02 2.444e+02 3.408e+02, threshold=4.540e+02, percent-clipped=0.0 2023-10-04 08:42:01,029 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.conv_module1.balancer1.prob, batch_count=1593560.0, ans=0.125 2023-10-04 08:42:06,577 WARNING [train.py:1204] (2/4) Exclude cut with ID _14629_210_15709_1_1530604298182_956899_6-232695-0_sp1.1 from training. Duration: 0.8545625 2023-10-04 08:42:06,599 WARNING [train.py:1204] (2/4) Exclude cut with ID _13921_210_9994_1_1530841782272_3503520_327-69677-0 from training. Duration: 0.9 2023-10-04 08:42:06,626 WARNING [train.py:1204] (2/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_372-298093-0 from training. Duration: 0.8 2023-10-04 08:42:06,640 WARNING [train.py:1204] (2/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_515-139111-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 08:42:06,878 WARNING [train.py:1204] (2/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_85-65056-0_sp1.1 from training. Duration: 0.87275 2023-10-04 08:42:06,920 WARNING [train.py:1204] (2/4) Exclude cut with ID _15113_210_8254_1_1530842438579_3716279_209-54533-0_sp1.1 from training. Duration: 0.87275 2023-10-04 08:42:06,967 WARNING [train.py:1204] (2/4) Exclude cut with ID _70447_210_12392_1_1535972499300_3160420_123-312838-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 08:42:06,995 WARNING [train.py:1204] (2/4) Exclude cut with ID _60113_210_16271_1_1535164212481_4386079_70-63596-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 08:42:07,022 WARNING [train.py:1204] (2/4) Exclude cut with ID _14632_210_9988_1_1530691279392_3923200_225-188509-0_sp1.1 from training. Duration: 0.963625 2023-10-04 08:42:07,072 WARNING [train.py:1204] (2/4) Exclude cut with ID _34734_210_4525_1_1533365977542_4448720_584-180532-0_sp1.1 from training. Duration: 0.97275 2023-10-04 08:42:07,086 WARNING [train.py:1204] (2/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_351-294650-0_sp1.1 from training. Duration: 0.57275 2023-10-04 08:42:07,413 WARNING [train.py:1204] (2/4) Exclude cut with ID _13252_210_1925_1_1530671372047_3336909_263-168388-0_sp0.9 from training. Duration: 0.9555625 2023-10-04 08:42:07,441 WARNING [train.py:1204] (2/4) Exclude cut with ID _22083_210_8254_1_1531702862439_3678970_201-85738-0 from training. Duration: 0.9 2023-10-04 08:42:07,533 WARNING [train.py:1204] (2/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_237-58433-0 from training. Duration: 0.74 2023-10-04 08:42:07,560 WARNING [train.py:1204] (2/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_262-250826-0 from training. Duration: 0.99 2023-10-04 08:42:07,663 WARNING [train.py:1204] (2/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_927-43827-0_sp0.9 from training. Duration: 0.82225 2023-10-04 08:42:07,694 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_2821_210_2715_1_1526108385_3763560_73-296673-0 from training. Duration: 0.83 2023-10-04 08:42:07,770 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6287_210_3273_1_1527509506_2653716_182-199887-0 from training. Duration: 0.51 2023-10-04 08:42:07,900 WARNING [train.py:1204] (2/4) Exclude cut with ID _70519_210_4163_1_1535961595994_7458230_899-80223-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 08:42:08,159 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_210-234949-0_sp1.1 from training. Duration: 0.97275 2023-10-04 08:42:08,579 WARNING [train.py:1204] (2/4) Exclude cut with ID _30196_210_1664_1_1533708496522_7074140_768-278176-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 08:42:08,662 WARNING [train.py:1204] (2/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_315-128283-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 08:42:08,747 WARNING [train.py:1204] (2/4) Exclude cut with ID _14886_210_4220_1_1530702020355_3516190_330-323941-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 08:42:09,034 WARNING [train.py:1204] (2/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_264-348896-0_sp1.1 from training. Duration: 0.77275 2023-10-04 08:42:09,063 WARNING [train.py:1204] (2/4) Exclude cut with ID _37086_210_9038_1_1532999540891_7021250_433-10563-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 08:42:09,114 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_53638_210_21581_1_1534297319_5808449_885-80172-0_sp1.1 from training. Duration: 0.87275 2023-10-04 08:42:09,205 WARNING [train.py:1204] (2/4) Exclude cut with ID _14623_210_1925_1_1530773354962_3632609_157-218771-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 08:42:09,215 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_1368-78165-0_sp1.1 from training. Duration: 0.97275 2023-10-04 08:42:09,219 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_309-65177-0_sp0.9 from training. Duration: 0.97775 2023-10-04 08:42:09,224 WARNING [train.py:1204] (2/4) Exclude cut with ID _21632_210_2253_1_1531994001765_3744140_291-138812-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 08:42:09,237 WARNING [train.py:1204] (2/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_669-93163-0_sp1.1 from training. Duration: 0.8909375 2023-10-04 08:42:09,761 WARNING [train.py:1204] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_292-215346-0 from training. Duration: 0.97 2023-10-04 08:42:09,785 WARNING [train.py:1204] (2/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_286-222073-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 08:42:10,057 WARNING [train.py:1204] (2/4) Exclude cut with ID _59256_210_5986_1_1535076085997_3589120_121-237386-0_sp1.1 from training. Duration: 0.87275 2023-10-04 08:42:10,071 WARNING [train.py:1204] (2/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_96-320736-0 from training. Duration: 0.86 2023-10-04 08:42:10,080 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_128-202104-0 from training. Duration: 0.63 2023-10-04 08:42:10,160 WARNING [train.py:1204] (2/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_242-3472-0_sp0.9 from training. Duration: 0.811125 2023-10-04 08:42:10,202 WARNING [train.py:1204] (2/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_154-74182-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 08:42:10,206 WARNING [train.py:1204] (2/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_187-221566-0 from training. Duration: 0.43 2023-10-04 08:42:10,363 WARNING [train.py:1204] (2/4) Exclude cut with ID _6194_210_1534_1_1527511826160_3941194_14-282876-0 from training. Duration: 0.71 2023-10-04 08:42:10,405 WARNING [train.py:1204] (2/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_660-211088-0_sp1.1 from training. Duration: 0.563625 2023-10-04 08:42:11,217 WARNING [train.py:1204] (2/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_526-62079-0_sp1.1 from training. Duration: 0.6090625 2023-10-04 08:42:11,368 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_92-246270-0_sp1.1 from training. Duration: 0.77275 2023-10-04 08:42:11,453 WARNING [train.py:1204] (2/4) Exclude cut with ID IC0287W0023-142350 from training. Duration: 0.956 2023-10-04 08:42:11,528 WARNING [train.py:1204] (2/4) Exclude cut with ID _58605_210_12210_1_1534723195939_3904259_48-269402-0 from training. Duration: 0.96 2023-10-04 08:42:11,543 WARNING [train.py:1204] (2/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_416-257730-0_sp0.9 from training. Duration: 0.72225 2023-10-04 08:42:11,616 WARNING [train.py:1204] (2/4) Exclude cut with ID _7877_210_6014_1_1528286358099_3750136_185-17516-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 08:42:11,716 WARNING [train.py:1204] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_23-189234-0 from training. Duration: 0.83 2023-10-04 08:42:11,731 WARNING [train.py:1204] (2/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_237-89684-0 from training. Duration: 0.96 2023-10-04 08:42:11,798 WARNING [train.py:1204] (2/4) Exclude cut with ID _13916_210_9994_1_1530788341126_3763149_116-21034-0 from training. Duration: 0.96 2023-10-04 08:42:11,951 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_3133_210_2087_1_1526106446_2324690_246-125000-0_sp1.1 from training. Duration: 0.563625 2023-10-04 08:42:13,338 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.self_attn_weights.pos_emb_skip_rate, batch_count=1593640.0, ans=0.0 2023-10-04 08:42:13,638 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.4.encoder.layers.0.conv_module1.whiten, num_groups=1, num_channels=384, metric=4.45 vs. limit=15.0 2023-10-04 08:42:16,533 INFO [train.py:1046] (2/4) Epoch 46, batch 0, loss[loss=0.1606, simple_loss=0.2484, pruned_loss=0.03643, over 23995.00 frames. ], tot_loss[loss=0.1606, simple_loss=0.2484, pruned_loss=0.03643, over 23995.00 frames. ], batch size: 80, lr: 2.22e-03, grad_scale: 32.0 2023-10-04 08:42:16,534 INFO [train.py:1069] (2/4) Computing validation loss 2023-10-04 08:42:27,631 INFO [zipformer.py:1853] (2/4) name=encoder.encoders.3.encoder.layers.3.self_attn_weights, attn_weights_entropy = tensor([4.1034, 3.9660, 2.9049, 4.0002, 2.9992, 3.4468, 3.9064, 3.9407], device='cuda:2') 2023-10-04 08:42:28,886 INFO [train.py:1078] (2/4) Epoch 46, validation: loss=0.3372, simple_loss=0.2742, pruned_loss=0.2001, over 1125622.00 frames. 2023-10-04 08:42:28,887 INFO [train.py:1079] (2/4) Maximum memory allocated so far is 20967MB 2023-10-04 08:42:28,979 WARNING [train.py:1204] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_402-253027-0 from training. Duration: 0.54 2023-10-04 08:42:29,068 WARNING [train.py:1204] (2/4) Exclude cut with ID _59288_210_5854_1_1535077757774_3412246_158-289345-0_sp1.1 from training. Duration: 0.92725 2023-10-04 08:42:32,179 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_28211_210_6388_1_1532861506_4523224_128-328162-0_sp1.1 from training. Duration: 0.8181875 2023-10-04 08:42:36,356 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_788-278281-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 08:42:36,364 WARNING [train.py:1204] (2/4) Exclude cut with ID _49728_210_12651_1_1534041034294_6788150_624-195301-0_sp0.9 from training. Duration: 0.9444375 2023-10-04 08:42:36,406 WARNING [train.py:1204] (2/4) Exclude cut with ID _14860_210_4915_1_1530702020340_7343240_267-139775-0_sp1.1 from training. Duration: 0.963625 2023-10-04 08:42:36,473 WARNING [train.py:1204] (2/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_332-221057-0 from training. Duration: 0.68 2023-10-04 08:42:39,314 WARNING [train.py:1204] (2/4) Exclude cut with ID _40402_210_18219_1_1533522324115_2953150_242-76943-0 from training. Duration: 0.94 2023-10-04 08:42:40,751 WARNING [train.py:1204] (2/4) Exclude cut with ID _16747_210_5986_1_1531811544733_2749109_87-349587-0_sp1.1 from training. Duration: 0.963625 2023-10-04 08:42:42,150 WARNING [train.py:1204] (2/4) Exclude cut with ID _22982_210_1883_1_1531882790447_5449409_505-346452-0_sp1.1 from training. Duration: 0.963625 2023-10-04 08:42:43,575 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.1.feed_forward1.out_proj.dropout_p, batch_count=1593706.6666666667, ans=0.1 2023-10-04 08:42:43,649 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.feed_forward3.hidden_balancer.prob, batch_count=1593706.6666666667, ans=0.125 2023-10-04 08:42:46,224 WARNING [train.py:1204] (2/4) Exclude cut with ID _17956_210_4353_1_1531209568125_6979970_13-192107-0_sp1.1 from training. Duration: 0.963625 2023-10-04 08:42:47,520 WARNING [train.py:1204] (2/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_430-307666-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 08:42:47,564 WARNING [train.py:1204] (2/4) Exclude cut with ID _2895_210_2477_1_1525956785794_4063750_289-11475-0_sp1.1 from training. Duration: 0.7454375 2023-10-04 08:42:47,586 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_3910_210_1794_1_1526742306_334530_28-238909-0_sp0.9 from training. Duration: 0.9 2023-10-04 08:42:49,036 WARNING [train.py:1204] (2/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_18-130336-0 from training. Duration: 0.79 2023-10-04 08:42:51,692 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_548-294316-0_sp0.9 from training. Duration: 0.9 2023-10-04 08:42:59,284 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_42-142473-0_sp1.1 from training. Duration: 0.6545625 2023-10-04 08:42:59,290 WARNING [train.py:1204] (2/4) Exclude cut with ID _59170_210_6721_1_1534983633960_4203335_326-48066-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 08:43:01,292 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_45886_210_17024_1_1533709215_471063_7-79165-0 from training. Duration: 0.98 2023-10-04 08:43:02,871 WARNING [train.py:1204] (2/4) Exclude cut with ID _44585_210_18628_1_1533605350387_4518199_280-73534-0_sp0.9 from training. Duration: 0.988875 2023-10-04 08:43:02,880 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_3475_210_2028_1_1526193578_3622569_265-276625-0_sp1.1 from training. Duration: 0.6909375 2023-10-04 08:43:06,138 WARNING [train.py:1204] (2/4) Exclude cut with ID _64631_210_27767_1_1535349580068_4165160_88-225232-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 08:43:10,266 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_205-265518-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 08:43:10,468 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.conv_skip_rate, batch_count=1593773.3333333333, ans=0.0 2023-10-04 08:43:13,154 WARNING [train.py:1204] (2/4) Exclude cut with ID _40402_210_18219_1_1533522324115_2953150_271-253734-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 08:43:18,672 WARNING [train.py:1204] (2/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_282-257791-0 from training. Duration: 0.86 2023-10-04 08:43:22,822 WARNING [train.py:1204] (2/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_356-45054-0 from training. Duration: 0.86 2023-10-04 08:43:24,165 WARNING [train.py:1204] (2/4) Exclude cut with ID _69584_210_5219_1_1535785133775_4044450_38-348116-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 08:43:24,166 WARNING [train.py:1204] (2/4) Exclude cut with ID _66763_210_17301_1_1535788667617_241750_21-328480-0_sp1.1 from training. Duration: 0.936375 2023-10-04 08:43:24,243 WARNING [train.py:1204] (2/4) Exclude cut with ID _26528_210_9070_1_1532570410842_3851310_497-97218-0_sp0.9 from training. Duration: 0.9555625 2023-10-04 08:43:25,542 WARNING [train.py:1204] (2/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_592-101075-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 08:43:27,502 WARNING [train.py:1204] (2/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_678-159307-0 from training. Duration: 0.85 2023-10-04 08:43:30,737 WARNING [train.py:1204] (2/4) Exclude cut with ID _28427_210_4220_1_1532581240967_3061069_166-319773-0_sp1.1 from training. Duration: 0.936375 2023-10-04 08:43:32,050 WARNING [train.py:1204] (2/4) Exclude cut with ID _29877_210_6228_1_1532397791106_5276149_606-224984-0_sp1.1 from training. Duration: 0.936375 2023-10-04 08:43:34,150 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_125-203105-0_sp1.1 from training. Duration: 0.72725 2023-10-04 08:43:40,190 WARNING [train.py:1204] (2/4) Exclude cut with ID IC0620W0113-306765 from training. Duration: 0.768 2023-10-04 08:43:41,591 WARNING [train.py:1204] (2/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_410-287857-0_sp1.1 from training. Duration: 0.8454375 2023-10-04 08:43:42,894 INFO [train.py:1046] (2/4) Epoch 46, batch 50, loss[loss=0.147, simple_loss=0.2246, pruned_loss=0.03465, over 23706.00 frames. ], tot_loss[loss=0.1578, simple_loss=0.2391, pruned_loss=0.03824, over 1065239.24 frames. ], batch size: 135, lr: 2.22e-03, grad_scale: 16.0 2023-10-04 08:43:44,512 WARNING [train.py:1204] (2/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_78-95260-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 08:43:46,062 WARNING [train.py:1204] (2/4) Exclude cut with ID _58059_210_5854_1_1534901471149_3164808_6-73080-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 08:43:46,084 WARNING [train.py:1204] (2/4) Exclude cut with ID _5166_210_2084_1_1527073197160_4087993_331-117829-0 from training. Duration: 0.62 2023-10-04 08:43:47,427 WARNING [train.py:1204] (2/4) Exclude cut with ID _14880_210_8895_1_1530680426655_3795259_289-212817-0_sp0.9 from training. Duration: 0.7666875 2023-10-04 08:43:47,452 WARNING [train.py:1204] (2/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_620-144779-0_sp1.1 from training. Duration: 0.92725 2023-10-04 08:43:48,854 WARNING [train.py:1204] (2/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_561-288575-0_sp1.1 from training. Duration: 0.963625 2023-10-04 08:43:51,477 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_22657_210_6890_1_1532086002_1637216_122-188128-0_sp1.1 from training. Duration: 0.963625 2023-10-04 08:43:52,859 WARNING [train.py:1204] (2/4) Exclude cut with ID _16756_210_4915_1_1531047803763_7104259_604-86607-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 08:43:55,706 WARNING [train.py:1204] (2/4) Exclude cut with ID _15150_210_8341_1_1530752411803_7256384_469-239509-0 from training. Duration: 0.68 2023-10-04 08:43:57,038 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_10987_210_4163_1_1529760555_4123902_302-87926-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 08:44:01,693 WARNING [train.py:1204] (2/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_469-94673-0_sp0.9 from training. Duration: 0.87775 2023-10-04 08:44:03,669 WARNING [train.py:1204] (2/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_798-106179-0 from training. Duration: 0.83 2023-10-04 08:44:06,733 WARNING [train.py:1204] (2/4) Exclude cut with ID _16744_210_5986_1_1531724405384_3907567_242-111316-0 from training. Duration: 0.93 2023-10-04 08:44:06,854 WARNING [train.py:1204] (2/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_938-205135-0_sp0.9 from training. Duration: 0.9666875 2023-10-04 08:44:08,696 WARNING [train.py:1204] (2/4) Exclude cut with ID _64135_210_2253_1_1535677267876_7608972_133-188266-0_sp0.9 from training. Duration: 0.9 2023-10-04 08:44:08,700 WARNING [train.py:1204] (2/4) Exclude cut with ID _44585_210_18628_1_1533605350387_4518199_623-346820-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 08:44:10,028 WARNING [train.py:1204] (2/4) Exclude cut with ID _42326_210_7770_1_1533814282531_3707269_454-266903-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 08:44:11,344 WARNING [train.py:1204] (2/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_5-55893-0_sp1.1 from training. Duration: 0.5 2023-10-04 08:44:12,701 WARNING [train.py:1204] (2/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_577-332600-0_sp1.1 from training. Duration: 0.5181875 2023-10-04 08:44:12,702 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_957-138882-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 08:44:21,189 WARNING [train.py:1204] (2/4) Exclude cut with ID _12556_210_8341_1_1530081024100_7327690_219-38004-0_sp1.1 from training. Duration: 0.97275 2023-10-04 08:44:21,299 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_397-147196-0_sp1.1 from training. Duration: 0.72725 2023-10-04 08:44:21,309 WARNING [train.py:1204] (2/4) Exclude cut with ID _3541_210_2477_1_1526475408709_3263973_231-104736-0_sp0.9 from training. Duration: 0.8666875 2023-10-04 08:44:22,672 WARNING [train.py:1204] (2/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_398-334558-0 from training. Duration: 0.96 2023-10-04 08:44:24,152 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_257-20733-0_sp1.1 from training. Duration: 0.6909375 2023-10-04 08:44:25,477 WARNING [train.py:1204] (2/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_146-282400-0_sp1.1 from training. Duration: 0.6454375 2023-10-04 08:44:25,489 WARNING [train.py:1204] (2/4) Exclude cut with ID _51899_210_20034_1_1534129254466_3669220_265-215428-0 from training. Duration: 0.98 2023-10-04 08:44:26,885 WARNING [train.py:1204] (2/4) Exclude cut with ID _34423_210_8750_1_1533430798456_3330199_219-106541-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 08:44:28,315 WARNING [train.py:1204] (2/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_458-67321-0 from training. Duration: 0.97 2023-10-04 08:44:37,561 WARNING [train.py:1204] (2/4) Exclude cut with ID _6653_210_2629_1_1527919172441_6732679_1051-182606-0_sp1.1 from training. Duration: 0.936375 2023-10-04 08:44:37,584 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_53555_210_5632_1_1534595220_7232297_603-4396-0_sp1.1 from training. Duration: 0.97275 2023-10-04 08:44:39,005 WARNING [train.py:1204] (2/4) Exclude cut with ID _54218_210_12210_1_1534413646388_4409259_159-144367-0_sp1.1 from training. Duration: 0.963625 2023-10-04 08:44:40,954 WARNING [train.py:1204] (2/4) Exclude cut with ID _7606_210_4915_1_1528894253277_5080690_301-10732-0_sp0.9 from training. Duration: 0.9 2023-10-04 08:44:40,958 WARNING [train.py:1204] (2/4) Exclude cut with ID _15026_210_8750_1_1530777602316_3291850_211-195043-0_sp0.9 from training. Duration: 0.888875 2023-10-04 08:44:43,164 WARNING [train.py:1204] (2/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_146-282400-0 from training. Duration: 0.71 2023-10-04 08:44:44,527 WARNING [train.py:1204] (2/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_65-88242-0 from training. Duration: 0.56 2023-10-04 08:44:45,935 WARNING [train.py:1204] (2/4) Exclude cut with ID _55857_210_2346_1_1534644007831_6898209_80-107757-0_sp1.1 from training. Duration: 0.963625 2023-10-04 08:44:47,275 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.605e+02 1.970e+02 2.347e+02 2.916e+02 8.307e+02, threshold=4.693e+02, percent-clipped=7.0 2023-10-04 08:44:47,359 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_15126_210_6753_1_1530778677_2591185_120-277707-0_sp0.9 from training. Duration: 0.888875 2023-10-04 08:44:48,742 WARNING [train.py:1204] (2/4) Exclude cut with ID _44778_210_2054_1_1533798127203_7443090_1033-168593-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 08:44:48,789 WARNING [train.py:1204] (2/4) Exclude cut with ID _46683_210_9642_1_1533730913492_4591116_475-30711-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 08:44:48,813 WARNING [train.py:1204] (2/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_253-308377-0 from training. Duration: 0.49 2023-10-04 08:44:48,997 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.nonlin_attention.balancer.prob, batch_count=1594240.0, ans=0.125 2023-10-04 08:44:50,096 WARNING [train.py:1204] (2/4) Exclude cut with ID _4680_210_3525_1_1527249757953_893386_75-78979-0 from training. Duration: 0.91 2023-10-04 08:44:51,461 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_5522_210_3273_1_1527249606_3909096_318-203153-0_sp0.9 from training. Duration: 0.47775 2023-10-04 08:44:51,563 WARNING [train.py:1204] (2/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_550-170116-0_sp1.1 from training. Duration: 0.92725 2023-10-04 08:44:51,590 WARNING [train.py:1204] (2/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_277-210798-0_sp0.9 from training. Duration: 0.611125 2023-10-04 08:44:51,833 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.ff2_skip_rate, batch_count=1594240.0, ans=0.0 2023-10-04 08:44:52,977 WARNING [train.py:1204] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_620-276793-0 from training. Duration: 0.59 2023-10-04 08:44:52,987 WARNING [train.py:1204] (2/4) Exclude cut with ID _3067_210_2667_1_1526212974640_4503168_119-175776-0 from training. Duration: 0.74 2023-10-04 08:44:53,081 WARNING [train.py:1204] (2/4) Exclude cut with ID _55209_210_1531_1_1534675828996_4597160_681-195827-0_sp1.1 from training. Duration: 0.92725 2023-10-04 08:44:54,405 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_427-78097-0_sp1.1 from training. Duration: 0.72725 2023-10-04 08:44:55,702 WARNING [train.py:1204] (2/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_96-290039-0_sp0.9 from training. Duration: 0.62225 2023-10-04 08:44:55,711 WARNING [train.py:1204] (2/4) Exclude cut with ID _45329_210_7005_1_1534157951211_6774339_287-292174-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 08:44:57,050 INFO [train.py:1046] (2/4) Epoch 46, batch 100, loss[loss=0.147, simple_loss=0.2243, pruned_loss=0.03486, over 23363.00 frames. ], tot_loss[loss=0.1552, simple_loss=0.237, pruned_loss=0.03666, over 1876221.05 frames. ], batch size: 119, lr: 2.22e-03, grad_scale: 16.0 2023-10-04 08:44:57,170 WARNING [train.py:1204] (2/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_200-292228-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 08:45:01,244 WARNING [train.py:1204] (2/4) Exclude cut with ID _69884_210_9771_1_1535846392818_4166169_291-194999-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 08:45:04,074 WARNING [train.py:1204] (2/4) Exclude cut with ID _58895_210_8750_1_1535184081320_3649089_265-323810-0_sp1.1 from training. Duration: 0.87275 2023-10-04 08:45:07,667 WARNING [train.py:1204] (2/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_58-68956-0 from training. Duration: 0.83 2023-10-04 08:45:07,672 WARNING [train.py:1204] (2/4) Exclude cut with ID _39867_210_9826_1_1533553222030_9344259_322-115153-0_sp1.1 from training. Duration: 0.963625 2023-10-04 08:45:10,521 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_3371_210_1794_1_1526384462_5580777_396-339812-0_sp1.1 from training. Duration: 0.77275 2023-10-04 08:45:10,532 WARNING [train.py:1204] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_731-2428-0_sp0.9 from training. Duration: 0.92225 2023-10-04 08:45:10,553 WARNING [train.py:1204] (2/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_98-741-0_sp1.1 from training. Duration: 0.72725 2023-10-04 08:45:10,555 WARNING [train.py:1204] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_238-245655-0_sp0.9 from training. Duration: 0.9 2023-10-04 08:45:10,582 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6788_210_1388_1_1527898698_4562021_91-197981-0_sp0.9 from training. Duration: 0.92225 2023-10-04 08:45:13,790 WARNING [train.py:1204] (2/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_860-96198-0 from training. Duration: 0.73 2023-10-04 08:45:15,171 WARNING [train.py:1204] (2/4) Exclude cut with ID _14268_210_9206_1_1530622771285_4714099_331-105351-0_sp0.9 from training. Duration: 0.72225 2023-10-04 08:45:15,410 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.out_combiner.scale_min, batch_count=1594373.3333333333, ans=0.2 2023-10-04 08:45:16,529 WARNING [train.py:1204] (2/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_204-137133-0_sp1.1 from training. Duration: 0.92725 2023-10-04 08:45:16,555 WARNING [train.py:1204] (2/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_5-46496-0_sp1.1 from training. Duration: 0.936375 2023-10-04 08:45:16,563 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_160-218721-0_sp1.1 from training. Duration: 0.87275 2023-10-04 08:45:20,612 WARNING [train.py:1204] (2/4) Exclude cut with ID _45329_210_7005_1_1534157951211_6774339_503-287883-0 from training. Duration: 0.88 2023-10-04 08:45:21,918 WARNING [train.py:1204] (2/4) Exclude cut with ID _57572_210_5517_1_1534921219624_3809484_110-79134-0_sp1.1 from training. Duration: 0.92725 2023-10-04 08:45:21,995 WARNING [train.py:1204] (2/4) Exclude cut with ID _44585_210_18628_1_1533605350387_4518199_421-37641-0_sp1.1 from training. Duration: 0.936375 2023-10-04 08:45:23,438 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_42-142473-0_sp0.9 from training. Duration: 0.8 2023-10-04 08:45:24,819 WARNING [train.py:1204] (2/4) Exclude cut with ID _5166_210_2084_1_1527073197160_4087993_310-114452-0_sp0.9 from training. Duration: 0.6555625 2023-10-04 08:45:28,759 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9029W0120-509520 from training. Duration: 0.768 2023-10-04 08:45:28,782 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9029W0433-509820 from training. Duration: 0.896 2023-10-04 08:45:30,202 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_40222_210_6228_1_1533385298_4481939_163-334586-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 08:45:30,202 WARNING [train.py:1204] (2/4) Exclude cut with ID _48677_210_5663_1_1533902405919_3892989_408-40740-0_sp1.1 from training. Duration: 0.7545625 2023-10-04 08:45:34,410 WARNING [train.py:1204] (2/4) Exclude cut with ID _24314_210_4915_1_1532135078116_7170179_521-258868-0_sp0.9 from training. Duration: 0.87775 2023-10-04 08:45:37,734 WARNING [train.py:1204] (2/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_576-51198-0_sp1.1 from training. Duration: 0.92725 2023-10-04 08:45:39,194 WARNING [train.py:1204] (2/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_306-216871-0_sp1.1 from training. Duration: 0.97275 2023-10-04 08:45:45,220 WARNING [train.py:1204] (2/4) Exclude cut with ID _20684_210_6917_1_1531567751646_4203930_463-119982-0_sp1.1 from training. Duration: 0.97275 2023-10-04 08:45:47,164 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9084W0012-536850 from training. Duration: 0.64 2023-10-04 08:45:49,095 WARNING [train.py:1204] (2/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_847-41334-0_sp1.1 from training. Duration: 0.4 2023-10-04 08:45:50,704 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.conv_module2.balancer2.prob, batch_count=1594506.6666666667, ans=0.125 2023-10-04 08:45:51,983 WARNING [train.py:1204] (2/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_326-165900-0_sp1.1 from training. Duration: 0.62725 2023-10-04 08:45:53,372 WARNING [train.py:1204] (2/4) Exclude cut with ID _7537_210_2084_1_1528502395431_3924359_128-268029-0_sp1.1 from training. Duration: 0.8909375 2023-10-04 08:45:56,149 WARNING [train.py:1204] (2/4) Exclude cut with ID _67499_210_17282_1_1535509805220_3692430_333-155030-0_sp1.1 from training. Duration: 0.97275 2023-10-04 08:45:58,808 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_34008_210_10431_1_1533345424_8183247_588-257177-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 08:46:01,539 WARNING [train.py:1204] (2/4) Exclude cut with ID _25914_210_6400_1_1532141317058_644740_52-96808-0_sp1.1 from training. Duration: 0.82725 2023-10-04 08:46:02,945 WARNING [train.py:1204] (2/4) Exclude cut with ID _56494_210_4220_1_1535284837625_3033169_69-138150-0_sp1.1 from training. Duration: 0.8545625 2023-10-04 08:46:05,555 WARNING [train.py:1204] (2/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_19-143553-0_sp1.1 from training. Duration: 0.97275 2023-10-04 08:46:05,617 WARNING [train.py:1204] (2/4) Exclude cut with ID _21878_210_3525_1_1531738772002_7264330_582-130735-0_sp1.1 from training. Duration: 0.936375 2023-10-04 08:46:07,032 WARNING [train.py:1204] (2/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_943-201356-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 08:46:07,050 WARNING [train.py:1204] (2/4) Exclude cut with ID _3231_210_2106_1_1526205864934_6784220_422-229276-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 08:46:07,067 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_48049_210_5632_1_1533864553_3519937_73-198717-0_sp1.1 from training. Duration: 0.97275 2023-10-04 08:46:07,127 WARNING [train.py:1204] (2/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_329-285801-0 from training. Duration: 0.91 2023-10-04 08:46:08,976 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9171W0114-579000 from training. Duration: 0.896 2023-10-04 08:46:08,987 WARNING [train.py:1204] (2/4) Exclude cut with ID _3120_210_3525_1_1526036143112_4072980_210-132675-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 08:46:09,065 WARNING [train.py:1204] (2/4) Exclude cut with ID _55857_210_2346_1_1534644007831_6898209_745-98391-0_sp0.9 from training. Duration: 0.8444375 2023-10-04 08:46:10,305 INFO [train.py:1046] (2/4) Epoch 46, batch 150, loss[loss=0.1469, simple_loss=0.2249, pruned_loss=0.03445, over 24620.00 frames. ], tot_loss[loss=0.1561, simple_loss=0.2376, pruned_loss=0.03728, over 2506976.04 frames. ], batch size: 60, lr: 2.22e-03, grad_scale: 16.0 2023-10-04 08:46:10,376 WARNING [train.py:1204] (2/4) Exclude cut with ID _5250_210_5219_1_1527249561806_8692110_615-322035-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 08:46:10,381 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_36756_210_7234_1_1533026991_966276_119-315767-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 08:46:10,403 WARNING [train.py:1204] (2/4) Exclude cut with ID _13916_210_9994_1_1530788341126_3763149_60-191556-0_sp1.1 from training. Duration: 0.5545625 2023-10-04 08:46:10,428 WARNING [train.py:1204] (2/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_177-236809-0_sp1.1 from training. Duration: 0.4454375 2023-10-04 08:46:11,814 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7002_210_1794_1_1527939683_5157286_184-229630-0_sp1.1 from training. Duration: 0.67275 2023-10-04 08:46:11,820 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_50428_210_5724_1_1534247532_4126418_464-18322-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 08:46:11,899 WARNING [train.py:1204] (2/4) Exclude cut with ID _35799_210_5854_1_1533263487887_3529071_466-131402-0_sp1.1 from training. Duration: 0.936375 2023-10-04 08:46:13,670 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_1259-132768-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 08:46:14,434 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.0.layers.1.conv_module2.whiten, num_groups=1, num_channels=192, metric=11.11 vs. limit=15.0 2023-10-04 08:46:14,900 WARNING [train.py:1204] (2/4) Exclude cut with ID _6694_210_4599_1_1527852637490_3965529_144-185051-0_sp1.1 from training. Duration: 0.87275 2023-10-04 08:46:14,958 WARNING [train.py:1204] (2/4) Exclude cut with ID _64639_210_5816_1_1535367619437_3528000_59-133290-0_sp1.1 from training. Duration: 0.963625 2023-10-04 08:46:17,991 WARNING [train.py:1204] (2/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_223-49525-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 08:46:21,723 WARNING [train.py:1204] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_748-36150-0_sp1.1 from training. Duration: 0.82725 2023-10-04 08:46:21,724 WARNING [train.py:1204] (2/4) Exclude cut with ID _19896_210_1883_1_1531709022662_6649569_52-221627-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 08:46:23,036 WARNING [train.py:1204] (2/4) Exclude cut with ID _6789_210_1534_1_1527854471671_2870519_296-115766-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 08:46:25,691 WARNING [train.py:1204] (2/4) Exclude cut with ID _14624_210_1925_1_1530858690657_3642219_100-19871-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 08:46:25,732 WARNING [train.py:1204] (2/4) Exclude cut with ID _12555_210_8750_1_1530075594402_3780239_50-293675-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 08:46:27,290 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.nonlin_attention.balancer.max_positive, batch_count=1594706.6666666667, ans=0.95 2023-10-04 08:46:28,492 WARNING [train.py:1204] (2/4) Exclude cut with ID _14880_210_8895_1_1530680426655_3795259_289-212817-0_sp1.1 from training. Duration: 0.62725 2023-10-04 08:46:28,546 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_14056_210_4163_1_1530604483_7572827_388-211626-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 08:46:28,744 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.feed_forward3.hidden_balancer.prob, batch_count=1594706.6666666667, ans=0.125 2023-10-04 08:46:32,682 WARNING [train.py:1204] (2/4) Exclude cut with ID _5289_210_2084_1_1527678202736_3779299_392-261988-0 from training. Duration: 0.65 2023-10-04 08:46:32,690 WARNING [train.py:1204] (2/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_124-345840-0 from training. Duration: 0.52 2023-10-04 08:46:32,694 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_2655_210_2087_1_1525688959_3897400_437-337690-0 from training. Duration: 0.63 2023-10-04 08:46:35,317 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_3780_210_2087_1_1526467391_239068_13-274633-0_sp1.1 from training. Duration: 0.92725 2023-10-04 08:46:35,333 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_805-71497-0_sp1.1 from training. Duration: 0.6454375 2023-10-04 08:46:36,707 WARNING [train.py:1204] (2/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_434-83000-0_sp1.1 from training. Duration: 0.8909375 2023-10-04 08:46:36,803 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_17613_210_2151_1_1531137569_2366160_285-219531-0_sp1.1 from training. Duration: 0.97275 2023-10-04 08:46:36,808 WARNING [train.py:1204] (2/4) Exclude cut with ID _56108_210_16380_1_1534505446159_3887959_22-37858-0_sp1.1 from training. Duration: 0.936375 2023-10-04 08:46:38,615 WARNING [train.py:1204] (2/4) Exclude cut with ID _28427_210_4220_1_1532581240967_3061069_200-88444-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 08:46:38,677 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_43802_210_2290_1_1533516203_8829504_77-279042-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 08:46:40,049 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9309W0193-640320 from training. Duration: 0.896 2023-10-04 08:46:41,426 WARNING [train.py:1204] (2/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_516-324055-0_sp1.1 from training. Duration: 0.936375 2023-10-04 08:46:47,526 WARNING [train.py:1204] (2/4) Exclude cut with ID _40094_210_16380_1_1533294005773_3641159_10-261240-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 08:46:52,156 WARNING [train.py:1204] (2/4) Exclude cut with ID _6267_210_2084_1_1527764658526_3894730_375-219887-0_sp1.1 from training. Duration: 0.6090625 2023-10-04 08:46:53,507 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6788_210_1388_1_1527898698_4562021_180-75288-0 from training. Duration: 0.99 2023-10-04 08:46:57,610 WARNING [train.py:1204] (2/4) Exclude cut with ID _8030_210_7497_1_1528444886781_3531089_262-86750-0_sp1.1 from training. Duration: 0.736375 2023-10-04 08:46:57,631 WARNING [train.py:1204] (2/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_934-325498-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 08:46:57,651 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_456-22141-0_sp0.9 from training. Duration: 0.97775 2023-10-04 08:46:59,095 WARNING [train.py:1204] (2/4) Exclude cut with ID _49643_210_7989_1_1534640244224_3939031_598-161926-0_sp1.1 from training. Duration: 0.8181875 2023-10-04 08:47:00,499 WARNING [train.py:1204] (2/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_345-49721-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 08:47:00,575 WARNING [train.py:1204] (2/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_440-141811-0_sp1.1 from training. Duration: 0.863625 2023-10-04 08:47:00,653 WARNING [train.py:1204] (2/4) Exclude cut with ID _64048_210_10626_1_1535198382802_3835179_394-212120-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 08:47:02,053 WARNING [train.py:1204] (2/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_91-277684-0 from training. Duration: 0.74 2023-10-04 08:47:04,813 WARNING [train.py:1204] (2/4) Exclude cut with ID _23038_210_9570_1_1532001584158_3652419_191-346343-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 08:47:06,206 WARNING [train.py:1204] (2/4) Exclude cut with ID _15140_210_9288_1_1530791388450_4880149_351-347974-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 08:47:06,244 WARNING [train.py:1204] (2/4) Exclude cut with ID _58605_210_12210_1_1534723195939_3904259_48-269402-0_sp1.1 from training. Duration: 0.87275 2023-10-04 08:47:06,251 WARNING [train.py:1204] (2/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_401-53423-0_sp0.9 from training. Duration: 0.611125 2023-10-04 08:47:08,206 WARNING [train.py:1204] (2/4) Exclude cut with ID _42659_210_2070_1_1533607442765_3120380_158-49004-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 08:47:09,432 WARNING [train.py:1204] (2/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_81-75849-0_sp1.1 from training. Duration: 0.3545625 2023-10-04 08:47:12,628 WARNING [train.py:1204] (2/4) Exclude cut with ID _27052_210_5986_1_1532329197187_3585009_314-106499-0_sp1.1 from training. Duration: 0.836375 2023-10-04 08:47:13,934 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.745e+02 2.011e+02 2.305e+02 2.782e+02 3.592e+02, threshold=4.611e+02, percent-clipped=0.0 2023-10-04 08:47:14,061 WARNING [train.py:1204] (2/4) Exclude cut with ID _4626_210_3532_1_1526817652717_3916859_446-206656-0_sp0.9 from training. Duration: 0.8444375 2023-10-04 08:47:15,349 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_53378_210_17282_1_1534221071_5769509_158-114960-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 08:47:17,290 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_3171_210_2087_1_1526109084_2330007_37-64334-0_sp0.9 from training. Duration: 0.911125 2023-10-04 08:47:17,312 WARNING [train.py:1204] (2/4) Exclude cut with ID _5410_210_1388_1_1527397055795_1719020_39-112221-0 from training. Duration: 0.69 2023-10-04 08:47:17,329 WARNING [train.py:1204] (2/4) Exclude cut with ID _8064_210_5399_1_1528372299291_4282973_4-91037-0_sp0.9 from training. Duration: 0.97775 2023-10-04 08:47:17,352 WARNING [train.py:1204] (2/4) Exclude cut with ID ID0079W0443-717795 from training. Duration: 0.541 2023-10-04 08:47:17,568 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.conv_module1.balancer1.prob, batch_count=1594906.6666666667, ans=0.125 2023-10-04 08:47:20,774 WARNING [train.py:1204] (2/4) Exclude cut with ID _34734_210_4525_1_1533365977542_4448720_766-297928-0_sp1.1 from training. Duration: 0.936375 2023-10-04 08:47:24,660 INFO [train.py:1046] (2/4) Epoch 46, batch 200, loss[loss=0.1453, simple_loss=0.2296, pruned_loss=0.03046, over 23469.00 frames. ], tot_loss[loss=0.1568, simple_loss=0.2382, pruned_loss=0.03767, over 2993881.93 frames. ], batch size: 134, lr: 2.22e-03, grad_scale: 16.0 2023-10-04 08:47:24,767 WARNING [train.py:1204] (2/4) Exclude cut with ID _51471_210_1889_1_1534399391390_3094250_310-337793-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 08:47:26,029 WARNING [train.py:1204] (2/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_678-159307-0_sp0.9 from training. Duration: 0.9444375 2023-10-04 08:47:27,610 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.conv_module1.balancer2.prob, batch_count=1594973.3333333333, ans=0.125 2023-10-04 08:47:28,799 WARNING [train.py:1204] (2/4) Exclude cut with ID _50093_210_4220_1_1534417141207_3432299_259-322252-0 from training. Duration: 0.92 2023-10-04 08:47:28,871 WARNING [train.py:1204] (2/4) Exclude cut with ID _47790_210_16187_1_1533862814066_4207519_67-103444-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 08:47:30,082 WARNING [train.py:1204] (2/4) Exclude cut with ID _40551_210_10635_1_1533776424632_3416179_311-247578-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 08:47:30,446 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.feed_forward2.hidden_balancer.prob, batch_count=1594973.3333333333, ans=0.125 2023-10-04 08:47:31,580 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_4633_210_2040_1_1526824163_4145343_416-76117-0 from training. Duration: 0.95 2023-10-04 08:47:33,084 WARNING [train.py:1204] (2/4) Exclude cut with ID _5166_210_2084_1_1527073197160_4087993_331-117829-0_sp0.9 from training. Duration: 0.688875 2023-10-04 08:47:35,783 WARNING [train.py:1204] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_299-247059-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 08:47:35,853 WARNING [train.py:1204] (2/4) Exclude cut with ID _64002_210_9570_1_1535198605157_38979_2-240845-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 08:47:39,054 WARNING [train.py:1204] (2/4) Exclude cut with ID _4681_210_3525_1_1527250689464_120889_15-126760-0_sp0.9 from training. Duration: 0.9 2023-10-04 08:47:39,066 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_21500_210_2054_1_1532149310_2716758_256-156611-0_sp1.1 from training. Duration: 0.936375 2023-10-04 08:47:39,081 WARNING [train.py:1204] (2/4) Exclude cut with ID _16908_210_5724_1_1531119157248_3772700_624-203729-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 08:47:39,223 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.0.feed_forward1.out_proj.dropout_p, batch_count=1595040.0, ans=0.1 2023-10-04 08:47:41,794 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.2.encoder.layers.0.self_attn2.whiten, num_groups=1, num_channels=384, metric=18.88 vs. limit=22.5 2023-10-04 08:47:42,638 INFO [scaling.py:1118] (2/4) WithLoss: name=encoder.encoders.3.encoder.layers.2.self_attn_weights, loss-sum=0.000e+00 2023-10-04 08:47:43,919 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.conv_module1.balancer2.prob, batch_count=1595040.0, ans=0.125 2023-10-04 08:47:58,555 WARNING [train.py:1204] (2/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_605-219732-0_sp1.1 from training. Duration: 0.8545625 2023-10-04 08:47:58,580 WARNING [train.py:1204] (2/4) Exclude cut with ID _44718_210_5816_1_1534050051869_6469030_380-167520-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 08:47:59,912 WARNING [train.py:1204] (2/4) Exclude cut with ID _19896_210_1883_1_1531709022662_6649569_94-229927-0_sp1.1 from training. Duration: 0.8818125 2023-10-04 08:48:00,208 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.conv_skip_rate, batch_count=1595106.6666666667, ans=0.0 2023-10-04 08:48:01,301 WARNING [train.py:1204] (2/4) Exclude cut with ID _19514_210_9259_1_1531530071330_5084230_519-174300-0_sp1.1 from training. Duration: 0.963625 2023-10-04 08:48:01,360 WARNING [train.py:1204] (2/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_340-174176-0_sp0.9 from training. Duration: 0.5444375 2023-10-04 08:48:01,364 WARNING [train.py:1204] (2/4) Exclude cut with ID _8263_210_1664_1_1528439452891_970220_68-181805-0_sp1.1 from training. Duration: 0.5818125 2023-10-04 08:48:02,725 WARNING [train.py:1204] (2/4) Exclude cut with ID _3120_210_3525_1_1526036143112_4072980_348-257723-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 08:48:04,091 WARNING [train.py:1204] (2/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_453-133983-0_sp1.1 from training. Duration: 0.7090625 2023-10-04 08:48:04,137 WARNING [train.py:1204] (2/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_17-86060-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 08:48:04,175 WARNING [train.py:1204] (2/4) Exclude cut with ID _29877_210_6228_1_1532397791106_5276149_228-184657-0_sp1.1 from training. Duration: 0.97275 2023-10-04 08:48:05,634 WARNING [train.py:1204] (2/4) Exclude cut with ID _18516_210_9206_1_1531704653206_3692406_198-334870-0 from training. Duration: 0.92 2023-10-04 08:48:05,679 WARNING [train.py:1204] (2/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_340-174176-0_sp1.1 from training. Duration: 0.4454375 2023-10-04 08:48:05,691 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_3133_210_2087_1_1526106446_2324690_14-139186-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 08:48:06,375 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.1.encoder.layers.1.feed_forward3.out_whiten, num_groups=1, num_channels=256, metric=8.26 vs. limit=15.0 2023-10-04 08:48:10,262 WARNING [train.py:1204] (2/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_126-29104-0_sp0.9 from training. Duration: 0.9444375 2023-10-04 08:48:13,696 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.balancer1.prob, batch_count=1595173.3333333333, ans=0.125 2023-10-04 08:48:16,269 WARNING [train.py:1204] (2/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_84-10310-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 08:48:25,600 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_15517_210_6753_1_1530954008_3487336_79-273076-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 08:48:25,654 WARNING [train.py:1204] (2/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_371-18169-0_sp0.9 from training. Duration: 0.9555625 2023-10-04 08:48:31,885 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.3.encoder.layers.0.feed_forward3.out_whiten, num_groups=1, num_channels=512, metric=12.75 vs. limit=15.0 2023-10-04 08:48:32,479 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_68702_210_23689_1_1535799577_3420068_170-92788-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 08:48:32,671 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.conv_module2.balancer1.prob, batch_count=1595240.0, ans=0.125 2023-10-04 08:48:35,247 WARNING [train.py:1204] (2/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_78-182912-0 from training. Duration: 0.59 2023-10-04 08:48:35,308 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_3019_210_2028_1_1526044245_3264865_297-12384-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 08:48:35,313 WARNING [train.py:1204] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_147-111137-0_sp1.1 from training. Duration: 0.863625 2023-10-04 08:48:35,318 WARNING [train.py:1204] (2/4) Exclude cut with ID _28323_210_16187_1_1532480395667_4100300_117-8859-0_sp1.1 from training. Duration: 0.97275 2023-10-04 08:48:36,611 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7762_210_1794_1_1528199508_4057408_419-125009-0_sp1.1 from training. Duration: 0.7545625 2023-10-04 08:48:37,883 INFO [train.py:1046] (2/4) Epoch 46, batch 250, loss[loss=0.1707, simple_loss=0.2604, pruned_loss=0.04045, over 24048.00 frames. ], tot_loss[loss=0.156, simple_loss=0.2372, pruned_loss=0.03739, over 3381042.92 frames. ], batch size: 80, lr: 2.22e-03, grad_scale: 16.0 2023-10-04 08:48:37,983 WARNING [train.py:1204] (2/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_268-87909-0 from training. Duration: 0.99 2023-10-04 08:48:38,055 WARNING [train.py:1204] (2/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_118-311357-0_sp1.1 from training. Duration: 0.92725 2023-10-04 08:48:39,385 WARNING [train.py:1204] (2/4) Exclude cut with ID ID0377W0003-870225 from training. Duration: 0.852 2023-10-04 08:48:41,519 WARNING [train.py:1204] (2/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_56-5838-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 08:48:41,650 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.1.conv_skip_rate, batch_count=1595306.6666666667, ans=0.0 2023-10-04 08:48:44,100 WARNING [train.py:1204] (2/4) Exclude cut with ID _14603_210_4220_1_1530615688441_3097319_90-31383-0_sp0.9 from training. Duration: 0.9666875 2023-10-04 08:48:44,190 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_31583_210_3852_1_1532595851_4024413_58-86657-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 08:48:46,093 WARNING [train.py:1204] (2/4) Exclude cut with ID _30304_210_6319_1_1532476827261_7582060_344-113875-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 08:48:47,564 WARNING [train.py:1204] (2/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_278-104676-0_sp1.1 from training. Duration: 0.8909375 2023-10-04 08:48:47,592 WARNING [train.py:1204] (2/4) Exclude cut with ID _55844_210_2346_1_1534471212569_7625707_764-2387-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 08:48:49,014 WARNING [train.py:1204] (2/4) Exclude cut with ID _27430_210_7770_1_1532604689375_1378590_157-14963-0_sp1.1 from training. Duration: 0.963625 2023-10-04 08:48:51,720 WARNING [train.py:1204] (2/4) Exclude cut with ID _7160_210_1876_1_1527940923231_3703799_282-322611-0_sp1.1 from training. Duration: 0.7909375 2023-10-04 08:48:57,091 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.feed_forward2.hidden_balancer.prob, batch_count=1595373.3333333333, ans=0.125 2023-10-04 08:48:58,686 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=1595373.3333333333, ans=0.1 2023-10-04 08:49:02,504 WARNING [train.py:1204] (2/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_190-138640-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 08:49:05,189 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_33348_210_5632_1_1533086912_3324030_63-270922-0_sp1.1 from training. Duration: 0.97275 2023-10-04 08:49:06,482 WARNING [train.py:1204] (2/4) Exclude cut with ID _54871_210_22356_1_1534665798253_3856359_135-184281-0_sp1.1 from training. Duration: 0.9 2023-10-04 08:49:11,285 WARNING [train.py:1204] (2/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_277-210798-0_sp1.1 from training. Duration: 0.5 2023-10-04 08:49:12,549 WARNING [train.py:1204] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_402-253027-0_sp0.9 from training. Duration: 0.6 2023-10-04 08:49:13,968 WARNING [train.py:1204] (2/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_540-275685-0_sp0.9 from training. Duration: 0.988875 2023-10-04 08:49:15,829 WARNING [train.py:1204] (2/4) Exclude cut with ID _59493_210_17282_1_1535078419077_3225879_118-18960-0_sp1.1 from training. Duration: 0.936375 2023-10-04 08:49:15,909 WARNING [train.py:1204] (2/4) Exclude cut with ID _6194_210_1534_1_1527511826160_3941194_321-148641-0_sp1.1 from training. Duration: 0.5818125 2023-10-04 08:49:15,915 WARNING [train.py:1204] (2/4) Exclude cut with ID _61133_210_9969_1_1535196777227_3696409_333-270325-0_sp1.1 from training. Duration: 0.8454375 2023-10-04 08:49:17,262 WARNING [train.py:1204] (2/4) Exclude cut with ID _49376_210_5986_1_1533985278732_3370740_307-145221-0_sp1.1 from training. Duration: 0.936375 2023-10-04 08:49:17,540 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.conv_skip_rate, batch_count=1595440.0, ans=0.0 2023-10-04 08:49:18,738 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6846_210_6753_1_1527850767_4295158_2-315073-0_sp0.9 from training. Duration: 0.97775 2023-10-04 08:49:21,525 WARNING [train.py:1204] (2/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_17-25045-0 from training. Duration: 0.59 2023-10-04 08:49:21,549 WARNING [train.py:1204] (2/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_503-333510-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 08:49:21,669 WARNING [train.py:1204] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_238-245655-0_sp1.1 from training. Duration: 0.736375 2023-10-04 08:49:22,976 WARNING [train.py:1204] (2/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_329-82235-0_sp0.9 from training. Duration: 0.911125 2023-10-04 08:49:22,989 WARNING [train.py:1204] (2/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_325-113798-0_sp0.9 from training. Duration: 0.9444375 2023-10-04 08:49:23,082 WARNING [train.py:1204] (2/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_395-37222-0_sp0.9 from training. Duration: 0.8444375 2023-10-04 08:49:26,190 WARNING [train.py:1204] (2/4) Exclude cut with ID _64135_210_2253_1_1535677267876_7608972_498-5039-0_sp1.1 from training. Duration: 0.7090625 2023-10-04 08:49:26,200 WARNING [train.py:1204] (2/4) Exclude cut with ID _14922_210_6228_1_1530707435273_3693310_303-228738-0_sp0.9 from training. Duration: 0.9333125 2023-10-04 08:49:27,657 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_577-220899-0_sp1.1 from training. Duration: 0.92725 2023-10-04 08:49:30,800 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_3475_210_2028_1_1526193578_3622569_377-166133-0_sp1.1 from training. Duration: 0.7818125 2023-10-04 08:49:30,846 WARNING [train.py:1204] (2/4) Exclude cut with ID _49376_210_5986_1_1533985278732_3370740_272-313512-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 08:49:33,706 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7762_210_1794_1_1528199508_4057408_422-227034-0_sp0.9 from training. Duration: 0.811125 2023-10-04 08:49:39,024 WARNING [train.py:1204] (2/4) Exclude cut with ID _18895_210_6228_1_1531357274234_3784930_74-341428-0_sp1.1 from training. Duration: 0.92725 2023-10-04 08:49:42,118 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.539e+02 2.009e+02 2.186e+02 2.461e+02 3.268e+02, threshold=4.371e+02, percent-clipped=0.0 2023-10-04 08:49:42,269 WARNING [train.py:1204] (2/4) Exclude cut with ID _44902_210_16187_1_1533888006062_3792078_149-67212-0_sp1.1 from training. Duration: 0.7909375 2023-10-04 08:49:42,513 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.conv_module1.balancer1.prob, batch_count=1595573.3333333333, ans=0.125 2023-10-04 08:49:46,501 WARNING [train.py:1204] (2/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_243-126020-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 08:49:48,453 WARNING [train.py:1204] (2/4) Exclude cut with ID _54218_210_12210_1_1534413646388_4409259_259-6561-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 08:49:49,954 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.feed_forward1.hidden_balancer.prob, batch_count=1595573.3333333333, ans=0.125 2023-10-04 08:49:51,369 INFO [scaling.py:1118] (2/4) WithLoss: name=encoder.encoders.3.encoder.layers.3.self_attn_weights, loss-sum=0.000e+00 2023-10-04 08:49:52,405 INFO [train.py:1046] (2/4) Epoch 46, batch 300, loss[loss=0.135, simple_loss=0.2135, pruned_loss=0.02824, over 24410.00 frames. ], tot_loss[loss=0.1545, simple_loss=0.2345, pruned_loss=0.03726, over 3658878.86 frames. ], batch size: 58, lr: 2.22e-03, grad_scale: 16.0 2023-10-04 08:49:52,521 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_41452_210_7291_1_1533437687_3941281_405-11559-0 from training. Duration: 0.91 2023-10-04 08:49:53,866 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_20120_210_6753_1_1531643035_5482859_759-225084-0_sp1.1 from training. Duration: 0.97275 2023-10-04 08:49:53,886 WARNING [train.py:1204] (2/4) Exclude cut with ID _14747_210_8341_1_1530685797463_3654089_147-131229-0_sp0.9 from training. Duration: 0.8444375 2023-10-04 08:49:55,253 WARNING [train.py:1204] (2/4) Exclude cut with ID _14867_210_1968_1_1530684179890_3846969_319-250868-0 from training. Duration: 0.99 2023-10-04 08:49:55,275 WARNING [train.py:1204] (2/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_580-93798-0_sp0.9 from training. Duration: 0.62225 2023-10-04 08:49:56,696 WARNING [train.py:1204] (2/4) Exclude cut with ID _9679_210_7497_1_1529129212906_4363929_512-313889-0_sp0.9 from training. Duration: 0.9 2023-10-04 08:49:56,705 WARNING [train.py:1204] (2/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_18-189198-0 from training. Duration: 0.9 2023-10-04 08:50:01,476 WARNING [train.py:1204] (2/4) Exclude cut with ID _49755_210_18053_1_1534581005846_3218709_137-64179-0_sp1.1 from training. Duration: 0.92725 2023-10-04 08:50:01,551 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_48049_210_5632_1_1533864553_3519937_306-283794-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 08:50:04,849 WARNING [train.py:1204] (2/4) Exclude cut with ID _43552_210_8913_1_1533726031059_3523429_25-120161-0_sp1.1 from training. Duration: 0.8909375 2023-10-04 08:50:04,911 WARNING [train.py:1204] (2/4) Exclude cut with ID _8833_210_5219_1_1528592665210_7553059_319-349181-0 from training. Duration: 0.99 2023-10-04 08:50:06,317 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_296-332325-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 08:50:06,464 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.1.ff3_skip_rate, batch_count=1595706.6666666667, ans=0.0 2023-10-04 08:50:07,080 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.2.encoder.layers.2.feed_forward3.out_whiten, num_groups=1, num_channels=384, metric=7.71 vs. limit=15.0 2023-10-04 08:50:07,710 WARNING [train.py:1204] (2/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_436-186311-0_sp1.1 from training. Duration: 0.5909375 2023-10-04 08:50:08,999 WARNING [train.py:1204] (2/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_348-127789-0 from training. Duration: 0.85 2023-10-04 08:50:09,004 WARNING [train.py:1204] (2/4) Exclude cut with ID _3992_210_1747_1_1526644816031_3731060_443-195183-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 08:50:12,716 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.conv_module1.balancer2.prob, batch_count=1595706.6666666667, ans=0.125 2023-10-04 08:50:13,815 WARNING [train.py:1204] (2/4) Exclude cut with ID _3465_210_3525_1_1526175667060_3574049_308-231735-0_sp1.1 from training. Duration: 0.663625 2023-10-04 08:50:14,035 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.1.bypass.skip_rate, batch_count=1595706.6666666667, ans=0.035 2023-10-04 08:50:17,190 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6786_210_1388_1_1527839699_4194495_378-46070-0_sp1.1 from training. Duration: 0.6545625 2023-10-04 08:50:17,233 WARNING [train.py:1204] (2/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_63-101996-0 from training. Duration: 0.88 2023-10-04 08:50:21,344 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_2541_210_1794_1_1525590269_3084765_156-173713-0 from training. Duration: 0.81 2023-10-04 08:50:21,391 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_217-239796-0_sp1.1 from training. Duration: 0.963625 2023-10-04 08:50:24,148 WARNING [train.py:1204] (2/4) Exclude cut with ID _21878_210_3525_1_1531738772002_7264330_530-329400-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 08:50:25,575 WARNING [train.py:1204] (2/4) Exclude cut with ID _21783_210_11357_1_1531742458730_3762679_150-62920-0_sp1.1 from training. Duration: 0.963625 2023-10-04 08:50:25,575 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_5522_210_3273_1_1527249606_3909096_366-172508-0 from training. Duration: 0.66 2023-10-04 08:50:25,577 WARNING [train.py:1204] (2/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_303-340446-0_sp1.1 from training. Duration: 0.8181875 2023-10-04 08:50:27,221 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.conv_module2.balancer2.prob, batch_count=1595773.3333333333, ans=0.125 2023-10-04 08:50:28,320 WARNING [train.py:1204] (2/4) Exclude cut with ID _8699_210_2069_1_1528630346831_3960409_160-24408-0_sp1.1 from training. Duration: 0.82725 2023-10-04 08:50:29,741 WARNING [train.py:1204] (2/4) Exclude cut with ID _41314_210_12651_1_1533459476543_3766460_299-187801-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 08:50:31,602 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_34886_210_10633_1_1533203882_1755661_92-345288-0_sp1.1 from training. Duration: 0.97275 2023-10-04 08:50:35,096 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_5438_210_1794_1_1527336687_2600009_104-288274-0_sp1.1 from training. Duration: 0.52725 2023-10-04 08:50:35,100 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_154-316450-0 from training. Duration: 0.76 2023-10-04 08:50:35,165 WARNING [train.py:1204] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_23-189234-0_sp0.9 from training. Duration: 0.92225 2023-10-04 08:50:37,897 WARNING [train.py:1204] (2/4) Exclude cut with ID _4779_210_2084_1_1527318022944_3914893_346-168820-0_sp1.1 from training. Duration: 0.963625 2023-10-04 08:50:39,858 WARNING [train.py:1204] (2/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_5-55893-0 from training. Duration: 0.55 2023-10-04 08:50:41,017 WARNING [train.py:1204] (2/4) Exclude cut with ID _20455_210_5219_1_1531565996775_8098100_180-24517-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 08:50:45,521 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_3133_210_2087_1_1526106446_2324690_243-143119-0_sp0.9 from training. Duration: 0.8555625 2023-10-04 08:50:46,985 WARNING [train.py:1204] (2/4) Exclude cut with ID _65585_210_12210_1_1535450451584_3764098_293-212045-0_sp1.1 from training. Duration: 0.92725 2023-10-04 08:50:46,989 WARNING [train.py:1204] (2/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_577-332600-0 from training. Duration: 0.57 2023-10-04 08:50:49,907 WARNING [train.py:1204] (2/4) Exclude cut with ID _30039_210_13270_1_1532484612900_2725150_104-289874-0_sp1.1 from training. Duration: 0.963625 2023-10-04 08:50:49,914 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_3172_210_2087_1_1526292929_3987865_197-97762-0_sp1.1 from training. Duration: 0.6818125 2023-10-04 08:50:52,621 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_256-150908-0_sp1.1 from training. Duration: 0.963625 2023-10-04 08:50:52,810 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.feed_forward3.hidden_balancer.prob, batch_count=1595906.6666666667, ans=0.125 2023-10-04 08:50:52,855 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.balancer2.prob, batch_count=1595906.6666666667, ans=0.125 2023-10-04 08:50:55,225 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_296-287678-0_sp1.1 from training. Duration: 0.863625 2023-10-04 08:50:56,550 WARNING [train.py:1204] (2/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_443-30034-0 from training. Duration: 0.64 2023-10-04 08:50:56,588 WARNING [train.py:1204] (2/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_494-199538-0_sp0.9 from training. Duration: 0.8333125 2023-10-04 08:50:57,874 WARNING [train.py:1204] (2/4) Exclude cut with ID _19708_210_9206_1_1532136650733_4238364_61-162325-0_sp1.1 from training. Duration: 0.97275 2023-10-04 08:50:59,234 WARNING [train.py:1204] (2/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_475-127455-0 from training. Duration: 0.7 2023-10-04 08:50:59,368 WARNING [train.py:1204] (2/4) Exclude cut with ID _48677_210_5663_1_1533902405919_3892989_188-11581-0_sp1.1 from training. Duration: 0.963625 2023-10-04 08:50:59,390 WARNING [train.py:1204] (2/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_136-8381-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 08:51:01,635 WARNING [train.py:1204] (2/4) Exclude cut with ID _55328_210_27767_1_1534580973260_4088170_270-229555-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 08:51:02,856 WARNING [train.py:1204] (2/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_372-266951-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 08:51:02,924 WARNING [train.py:1204] (2/4) Exclude cut with ID _47790_210_16187_1_1533862814066_4207519_14-242149-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 08:51:07,707 INFO [train.py:1046] (2/4) Epoch 46, batch 350, loss[loss=0.1354, simple_loss=0.1861, pruned_loss=0.04236, over 19255.00 frames. ], tot_loss[loss=0.1529, simple_loss=0.2322, pruned_loss=0.03685, over 3884009.46 frames. ], batch size: 388, lr: 2.22e-03, grad_scale: 16.0 2023-10-04 08:51:07,840 WARNING [train.py:1204] (2/4) Exclude cut with ID _7877_210_6014_1_1528286358099_3750136_159-76512-0_sp1.1 from training. Duration: 0.936375 2023-10-04 08:51:07,843 WARNING [train.py:1204] (2/4) Exclude cut with ID _8263_210_1664_1_1528438953494_294100_40-204731-0_sp1.1 from training. Duration: 0.4545625 2023-10-04 08:51:11,110 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8278_210_2550_1_1528637099_4220413_694-238413-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 08:51:14,392 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.feed_forward1.hidden_balancer.prob, batch_count=1595973.3333333333, ans=0.125 2023-10-04 08:51:16,146 WARNING [train.py:1204] (2/4) Exclude cut with ID _47278_210_12041_1_1533902418349_3576110_421-172710-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 08:51:20,254 WARNING [train.py:1204] (2/4) Exclude cut with ID _14970_210_15709_1_1530773543493_1715730_215-282680-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 08:51:20,298 WARNING [train.py:1204] (2/4) Exclude cut with ID _64639_210_5816_1_1535367619437_3528000_306-149449-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 08:51:23,167 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_28-291982-0 from training. Duration: 0.97 2023-10-04 08:51:23,271 WARNING [train.py:1204] (2/4) Exclude cut with ID _55134_210_21982_1_1534590028377_3882170_412-15870-0_sp1.1 from training. Duration: 0.936375 2023-10-04 08:51:24,538 WARNING [train.py:1204] (2/4) Exclude cut with ID _13684_210_7497_1_1530619354863_3598359_312-235606-0 from training. Duration: 0.6 2023-10-04 08:51:27,325 WARNING [train.py:1204] (2/4) Exclude cut with ID _37112_210_2575_1_1532997007673_7268610_347-238993-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 08:51:27,359 WARNING [train.py:1204] (2/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_121-284152-0 from training. Duration: 0.59 2023-10-04 08:51:27,428 WARNING [train.py:1204] (2/4) Exclude cut with ID _2941_210_2577_1_1526199673150_2489759_228-2961-0_sp1.1 from training. Duration: 0.97275 2023-10-04 08:51:30,291 WARNING [train.py:1204] (2/4) Exclude cut with ID _28323_210_16187_1_1532480395667_4100300_531-24157-0 from training. Duration: 0.98 2023-10-04 08:51:33,290 WARNING [train.py:1204] (2/4) Exclude cut with ID _32704_210_8954_1_1533017488840_6747370_266-329755-0_sp0.9 from training. Duration: 0.988875 2023-10-04 08:51:33,412 WARNING [train.py:1204] (2/4) Exclude cut with ID _6653_210_2629_1_1527919172441_6732679_1038-146421-0_sp1.1 from training. Duration: 0.97275 2023-10-04 08:51:35,182 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.5.encoder.layers.1.conv_module1.whiten, num_groups=1, num_channels=256, metric=3.31 vs. limit=15.0 2023-10-04 08:51:35,974 WARNING [train.py:1204] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_391-22148-0_sp0.9 from training. Duration: 0.8555625 2023-10-04 08:51:37,468 WARNING [train.py:1204] (2/4) Exclude cut with ID _64521_210_14699_1_1535243427259_3815328_543-341117-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 08:51:39,174 WARNING [train.py:1204] (2/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_52-168712-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 08:51:39,207 WARNING [train.py:1204] (2/4) Exclude cut with ID _33920_210_4220_1_1533257851500_3084399_198-334063-0_sp1.1 from training. Duration: 0.8545625 2023-10-04 08:51:39,219 WARNING [train.py:1204] (2/4) Exclude cut with ID _50093_210_4220_1_1534417141207_3432299_69-111987-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 08:51:39,265 WARNING [train.py:1204] (2/4) Exclude cut with ID _14821_210_8750_1_1530680503991_6829359_189-297451-0_sp0.9 from training. Duration: 0.611125 2023-10-04 08:51:41,152 WARNING [train.py:1204] (2/4) Exclude cut with ID _8030_210_7497_1_1528444886781_3531089_262-86750-0_sp0.9 from training. Duration: 0.9 2023-10-04 08:51:41,158 WARNING [train.py:1204] (2/4) Exclude cut with ID _24612_210_8913_1_1531998054193_3806359_294-191196-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 08:51:42,720 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.attention_skip_rate, batch_count=1596106.6666666667, ans=0.0 2023-10-04 08:51:48,620 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.conv_module2.balancer1.max_abs, batch_count=1596106.6666666667, ans=10.0 2023-10-04 08:51:49,801 WARNING [train.py:1204] (2/4) Exclude cut with ID _16909_210_5724_1_1531292079912_4118919_707-342896-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 08:51:49,822 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_9460_210_4211_1_1528954587_10049667_572-37035-0_sp0.9 from training. Duration: 0.888875 2023-10-04 08:51:51,005 WARNING [train.py:1204] (2/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_762-109886-0_sp1.1 from training. Duration: 0.82725 2023-10-04 08:51:52,459 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_14953_210_2151_1_1530701710_5616165_519-326393-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 08:51:56,651 WARNING [train.py:1204] (2/4) Exclude cut with ID _25914_210_6400_1_1532141317058_644740_52-96808-0 from training. Duration: 0.91 2023-10-04 08:51:56,654 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_68095_210_20185_1_1535718226_8888855_1079-94525-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 08:51:56,865 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.0.bypass_mid.scale_min, batch_count=1596173.3333333333, ans=0.2 2023-10-04 08:51:59,571 WARNING [train.py:1204] (2/4) Exclude cut with ID _14860_210_4915_1_1530702020340_7343240_746-3647-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 08:51:59,572 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_70379_210_7925_1_1535941455_557455_49-101720-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 08:52:00,969 WARNING [train.py:1204] (2/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_8-307291-0_sp1.1 from training. Duration: 0.936375 2023-10-04 08:52:01,074 WARNING [train.py:1204] (2/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_117-290916-0 from training. Duration: 0.51 2023-10-04 08:52:02,875 WARNING [train.py:1204] (2/4) Exclude cut with ID _22020_210_2461_1_1531724164513_6326900_944-339840-0_sp1.1 from training. Duration: 0.963625 2023-10-04 08:52:03,047 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.feed_forward1.out_proj.dropout_p, batch_count=1596173.3333333333, ans=0.1 2023-10-04 08:52:04,254 WARNING [train.py:1204] (2/4) Exclude cut with ID IC0515W0043-254911 from training. Duration: 0.976 2023-10-04 08:52:06,856 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_3910_210_1794_1_1526742306_334530_28-238909-0 from training. Duration: 0.81 2023-10-04 08:52:06,870 WARNING [train.py:1204] (2/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_269-267275-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 08:52:09,623 WARNING [train.py:1204] (2/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_72-251255-0_sp1.1 from training. Duration: 0.8545625 2023-10-04 08:52:09,641 WARNING [train.py:1204] (2/4) Exclude cut with ID _14886_210_4220_1_1530702020355_3516190_175-229073-0 from training. Duration: 0.45 2023-10-04 08:52:11,456 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.680e+02 2.124e+02 2.443e+02 2.962e+02 4.613e+02, threshold=4.885e+02, percent-clipped=1.0 2023-10-04 08:52:13,041 WARNING [train.py:1204] (2/4) Exclude cut with ID _40519_210_13794_1_1533715228655_7337390_921-209452-0_sp1.1 from training. Duration: 0.963625 2023-10-04 08:52:15,039 WARNING [train.py:1204] (2/4) Exclude cut with ID _29857_210_20034_1_1532395886753_3798160_335-163562-0_sp0.9 from training. Duration: 0.9444375 2023-10-04 08:52:16,535 WARNING [train.py:1204] (2/4) Exclude cut with ID _58583_210_5236_1_1534726835266_3574249_384-58445-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 08:52:18,394 WARNING [train.py:1204] (2/4) Exclude cut with ID _44011_210_2046_1_1533722401691_3631310_96-315656-0_sp1.1 from training. Duration: 0.963625 2023-10-04 08:52:18,398 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_68484_210_1794_1_1535849804_7678097_549-170613-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 08:52:19,965 WARNING [train.py:1204] (2/4) Exclude cut with ID _37892_210_19596_1_1533198677897_3964058_214-148058-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 08:52:22,618 INFO [train.py:1046] (2/4) Epoch 46, batch 400, loss[loss=0.1718, simple_loss=0.2602, pruned_loss=0.04168, over 24299.00 frames. ], tot_loss[loss=0.1524, simple_loss=0.2322, pruned_loss=0.03628, over 4072419.41 frames. ], batch size: 74, lr: 2.22e-03, grad_scale: 32.0 2023-10-04 08:52:22,738 WARNING [train.py:1204] (2/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_415-78792-0_sp0.9 from training. Duration: 0.9 2023-10-04 08:52:25,707 WARNING [train.py:1204] (2/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_383-205833-0_sp1.1 from training. Duration: 0.863625 2023-10-04 08:52:27,002 WARNING [train.py:1204] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_167-137255-0 from training. Duration: 0.64 2023-10-04 08:52:27,010 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8275_210_4198_1_1528455320_3892571_484-154786-0_sp1.1 from training. Duration: 0.963625 2023-10-04 08:52:27,065 WARNING [train.py:1204] (2/4) Exclude cut with ID _40284_210_2084_1_1533513679814_3704279_396-319412-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 08:52:27,264 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.conv_module1.balancer2.min_positive, batch_count=1596306.6666666667, ans=0.05 2023-10-04 08:52:28,519 WARNING [train.py:1204] (2/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_280-120942-0_sp0.9 from training. Duration: 0.8555625 2023-10-04 08:52:28,568 WARNING [train.py:1204] (2/4) Exclude cut with ID _45630_210_10556_1_1533949174498_3902049_558-240889-0_sp1.1 from training. Duration: 0.92725 2023-10-04 08:52:31,262 WARNING [train.py:1204] (2/4) Exclude cut with ID _56235_210_10640_1_1534557335235_4161099_197-307458-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 08:52:31,352 WARNING [train.py:1204] (2/4) Exclude cut with ID _16909_210_5724_1_1531292079912_4118919_163-254843-0_sp1.1 from training. Duration: 0.92725 2023-10-04 08:52:32,676 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_103-84010-0 from training. Duration: 0.69 2023-10-04 08:52:35,787 WARNING [train.py:1204] (2/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_401-158239-0 from training. Duration: 0.71 2023-10-04 08:52:35,788 WARNING [train.py:1204] (2/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_899-233658-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 08:52:37,162 WARNING [train.py:1204] (2/4) Exclude cut with ID _23825_210_13449_1_1531915313526_3497150_240-271367-0 from training. Duration: 0.97 2023-10-04 08:52:37,319 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.conv_module2.balancer2.prob, batch_count=1596373.3333333333, ans=0.125 2023-10-04 08:52:38,396 WARNING [train.py:1204] (2/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_424-134619-0_sp1.1 from training. Duration: 0.92725 2023-10-04 08:52:41,800 WARNING [train.py:1204] (2/4) Exclude cut with ID _44831_210_13427_1_1533816011549_3689035_32-188147-0_sp1.1 from training. Duration: 0.87275 2023-10-04 08:52:41,803 WARNING [train.py:1204] (2/4) Exclude cut with ID _59170_210_6721_1_1534983633960_4203335_314-63879-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 08:52:41,827 WARNING [train.py:1204] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_602-275260-0 from training. Duration: 0.82 2023-10-04 08:52:43,326 WARNING [train.py:1204] (2/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_511-306767-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 08:52:43,361 WARNING [train.py:1204] (2/4) Exclude cut with ID _41817_210_18835_1_1533434477160_7423049_565-201775-0_sp1.1 from training. Duration: 0.92725 2023-10-04 08:52:43,378 WARNING [train.py:1204] (2/4) Exclude cut with ID _28317_210_12742_1_1532480302381_8510325_778-173151-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 08:52:43,407 WARNING [train.py:1204] (2/4) Exclude cut with ID _14998_210_5219_1_1530705582080_7474970_680-51001-0_sp1.1 from training. Duration: 0.963625 2023-10-04 08:52:47,198 WARNING [train.py:1204] (2/4) Exclude cut with ID IC0677W0059-334891 from training. Duration: 0.896 2023-10-04 08:52:47,242 WARNING [train.py:1204] (2/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_370-331600-0 from training. Duration: 0.94 2023-10-04 08:52:51,433 WARNING [train.py:1204] (2/4) Exclude cut with ID _66840_210_5219_1_1535452168426_4196349_446-240192-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 08:52:52,795 WARNING [train.py:1204] (2/4) Exclude cut with ID _65242_210_18628_1_1535369393717_3823149_38-6732-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 08:52:52,861 WARNING [train.py:1204] (2/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_302-2550-0 from training. Duration: 0.81 2023-10-04 08:52:55,480 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_43802_210_2290_1_1533516203_8829504_100-348441-0 from training. Duration: 0.91 2023-10-04 08:52:58,037 WARNING [train.py:1204] (2/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_329-285801-0_sp1.1 from training. Duration: 0.82725 2023-10-04 08:52:58,280 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.bypass.scale_min, batch_count=1596440.0, ans=0.2 2023-10-04 08:53:00,723 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_30167_210_3528_1_1532428012_4772375_343-334712-0_sp1.1 from training. Duration: 0.936375 2023-10-04 08:53:00,964 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.nonlin_attention.balancer.min_positive, batch_count=1596440.0, ans=0.05 2023-10-04 08:53:04,223 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.bypass_mid.scale_min, batch_count=1596440.0, ans=0.2 2023-10-04 08:53:06,808 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7543_210_6753_1_1528543318_4552_1-85072-0 from training. Duration: 0.83 2023-10-04 08:53:09,584 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7002_210_1794_1_1527935634_4034444_487-236501-0_sp1.1 from training. Duration: 0.6 2023-10-04 08:53:10,892 WARNING [train.py:1204] (2/4) Exclude cut with ID _7160_210_1876_1_1527940923231_3703799_282-322611-0 from training. Duration: 0.87 2023-10-04 08:53:12,456 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.1.ff3_skip_rate, batch_count=1596506.6666666667, ans=0.0 2023-10-04 08:53:14,607 WARNING [train.py:1204] (2/4) Exclude cut with ID _67335_210_5219_1_1535525975652_4275229_570-262983-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 08:53:14,716 WARNING [train.py:1204] (2/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_625-338546-0_sp1.1 from training. Duration: 0.8 2023-10-04 08:53:15,999 WARNING [train.py:1204] (2/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_176-56120-0 from training. Duration: 0.83 2023-10-04 08:53:19,478 WARNING [train.py:1204] (2/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_963-187109-0_sp1.1 from training. Duration: 0.9 2023-10-04 08:53:22,494 WARNING [train.py:1204] (2/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_121-284152-0_sp0.9 from training. Duration: 0.6555625 2023-10-04 08:53:23,715 WARNING [train.py:1204] (2/4) Exclude cut with ID _45021_210_9994_1_1533619751094_3330288_104-81711-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 08:53:26,496 WARNING [train.py:1204] (2/4) Exclude cut with ID _51382_210_23862_1_1534492801838_3650104_129-343914-0_sp1.1 from training. Duration: 0.936375 2023-10-04 08:53:26,537 WARNING [train.py:1204] (2/4) Exclude cut with ID _14970_210_15709_1_1530775547361_1966965_8-169924-0 from training. Duration: 0.92 2023-10-04 08:53:29,189 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_2295_210_2087_1_1525500993_2407863_234-83104-0_sp1.1 from training. Duration: 0.563625 2023-10-04 08:53:29,259 WARNING [train.py:1204] (2/4) Exclude cut with ID _14687_210_5854_1_1530703838259_3664889_115-239046-0 from training. Duration: 0.67 2023-10-04 08:53:30,555 WARNING [train.py:1204] (2/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_690-126306-0_sp1.1 from training. Duration: 0.7090625 2023-10-04 08:53:30,562 WARNING [train.py:1204] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_625-102955-0_sp0.9 from training. Duration: 0.8555625 2023-10-04 08:53:33,407 WARNING [train.py:1204] (2/4) Exclude cut with ID _9389_210_1664_1_1529217337301_5289269_289-213417-0 from training. Duration: 0.89 2023-10-04 08:53:36,733 INFO [train.py:1046] (2/4) Epoch 46, batch 450, loss[loss=0.1662, simple_loss=0.2351, pruned_loss=0.04864, over 23768.00 frames. ], tot_loss[loss=0.1533, simple_loss=0.2332, pruned_loss=0.03675, over 4191694.30 frames. ], batch size: 179, lr: 2.22e-03, grad_scale: 16.0 2023-10-04 08:53:36,801 WARNING [train.py:1204] (2/4) Exclude cut with ID _13253_210_1925_1_1530757775819_3577873_191-144977-0_sp0.9 from training. Duration: 0.8666875 2023-10-04 08:53:36,853 WARNING [train.py:1204] (2/4) Exclude cut with ID _21783_210_11357_1_1531742458730_3762679_325-155703-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 08:53:36,885 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_2295_210_2087_1_1525500993_2407863_116-238894-0_sp0.9 from training. Duration: 0.77775 2023-10-04 08:53:38,273 WARNING [train.py:1204] (2/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_94-126128-0 from training. Duration: 0.86 2023-10-04 08:53:38,289 WARNING [train.py:1204] (2/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_395-310220-0_sp0.9 from training. Duration: 0.988875 2023-10-04 08:53:39,673 WARNING [train.py:1204] (2/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_821-110852-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 08:53:39,731 WARNING [train.py:1204] (2/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_64-248359-0_sp1.1 from training. Duration: 0.62725 2023-10-04 08:53:39,741 WARNING [train.py:1204] (2/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_138-187031-0 from training. Duration: 0.99 2023-10-04 08:53:41,093 WARNING [train.py:1204] (2/4) Exclude cut with ID _4122_210_2084_1_1526713164417_4353280_528-206771-0_sp0.9 from training. Duration: 0.97775 2023-10-04 08:53:42,577 WARNING [train.py:1204] (2/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_844-96989-0_sp1.1 from training. Duration: 0.6454375 2023-10-04 08:53:42,896 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.conv_module2.balancer1.prob, batch_count=1596640.0, ans=0.125 2023-10-04 08:53:43,903 WARNING [train.py:1204] (2/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_106-30782-0_sp1.1 from training. Duration: 0.72725 2023-10-04 08:53:53,629 WARNING [train.py:1204] (2/4) Exclude cut with ID _14683_210_5219_1_1530698352608_3974859_281-250040-0_sp1.1 from training. Duration: 0.936375 2023-10-04 08:53:53,657 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_46013_210_7234_1_1533776362_3744032_463-52378-0_sp0.9 from training. Duration: 0.9555625 2023-10-04 08:53:55,058 WARNING [train.py:1204] (2/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_299-34413-0 from training. Duration: 0.65 2023-10-04 08:53:55,203 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.0.feed_forward1.out_proj.dropout_p, batch_count=1596706.6666666667, ans=0.1 2023-10-04 08:53:56,369 WARNING [train.py:1204] (2/4) Exclude cut with ID _35799_210_5854_1_1533263487887_3529071_490-326847-0 from training. Duration: 0.99 2023-10-04 08:53:59,674 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.3.encoder.layers.3.feed_forward1.out_whiten, num_groups=1, num_channels=512, metric=6.27 vs. limit=15.0 2023-10-04 08:54:00,523 WARNING [train.py:1204] (2/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_589-9070-0_sp0.9 from training. Duration: 0.788875 2023-10-04 08:54:03,334 WARNING [train.py:1204] (2/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_884-29515-0_sp1.1 from training. Duration: 0.936375 2023-10-04 08:54:04,763 WARNING [train.py:1204] (2/4) Exclude cut with ID _15012_210_1968_1_1530856820429_5095350_474-119995-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 08:54:08,003 WARNING [train.py:1204] (2/4) Exclude cut with ID _65585_210_12210_1_1535450451584_3764098_211-97124-0_sp1.1 from training. Duration: 0.963625 2023-10-04 08:54:08,071 WARNING [train.py:1204] (2/4) Exclude cut with ID _33640_210_2655_1_1533117605479_4855469_666-224463-0_sp1.1 from training. Duration: 0.963625 2023-10-04 08:54:10,783 WARNING [train.py:1204] (2/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_35-153886-0 from training. Duration: 0.84 2023-10-04 08:54:12,059 WARNING [train.py:1204] (2/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_210-106047-0 from training. Duration: 0.97 2023-10-04 08:54:12,179 WARNING [train.py:1204] (2/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_479-125611-0 from training. Duration: 0.96 2023-10-04 08:54:12,210 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_9417_210_1794_1_1529235261_4614131_427-153963-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 08:54:13,384 WARNING [train.py:1204] (2/4) Exclude cut with ID _23818_210_6228_1_1531917063884_3676920_133-170571-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 08:54:15,292 WARNING [train.py:1204] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_602-275260-0_sp1.1 from training. Duration: 0.7454375 2023-10-04 08:54:17,233 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9029W0121-509521 from training. Duration: 0.896 2023-10-04 08:54:17,241 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9029W0434-509821 from training. Duration: 0.64 2023-10-04 08:54:17,266 WARNING [train.py:1204] (2/4) Exclude cut with ID _23038_210_9570_1_1532001584158_3652419_289-307034-0_sp1.1 from training. Duration: 0.936375 2023-10-04 08:54:18,677 WARNING [train.py:1204] (2/4) Exclude cut with ID _29345_210_4381_1_1532689168263_3860169_149-257268-0_sp0.9 from training. Duration: 0.9 2023-10-04 08:54:20,020 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_405-339986-0_sp1.1 from training. Duration: 0.463625 2023-10-04 08:54:24,176 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_2655_210_2087_1_1525688959_3897400_437-337690-0_sp1.1 from training. Duration: 0.57275 2023-10-04 08:54:24,208 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7543_210_6753_1_1528541907_1399322_165-323619-0_sp1.1 from training. Duration: 0.62725 2023-10-04 08:54:25,428 WARNING [train.py:1204] (2/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_508-335369-0_sp1.1 from training. Duration: 0.436375 2023-10-04 08:54:26,831 WARNING [train.py:1204] (2/4) Exclude cut with ID _7852_210_5219_1_1528286471316_8139400_358-287492-0 from training. Duration: 0.96 2023-10-04 08:54:28,275 WARNING [train.py:1204] (2/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_322-217220-0_sp0.9 from training. Duration: 0.9555625 2023-10-04 08:54:29,847 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.ff3_skip_rate, batch_count=1596840.0, ans=0.0 2023-10-04 08:54:30,961 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_3171_210_2087_1_1526109084_2330007_66-260583-0_sp0.9 from training. Duration: 0.8 2023-10-04 08:54:30,997 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8558_210_1410_1_1528505225_7613406_131-329524-0_sp1.1 from training. Duration: 0.7181875 2023-10-04 08:54:32,432 WARNING [train.py:1204] (2/4) Exclude cut with ID _9235_210_5219_1_1528891154253_8046120_30-326681-0 from training. Duration: 0.83 2023-10-04 08:54:35,243 WARNING [train.py:1204] (2/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_58-68956-0_sp0.9 from training. Duration: 0.92225 2023-10-04 08:54:35,797 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.4.encoder.layers.2.conv_module1.whiten, num_groups=1, num_channels=384, metric=2.65 vs. limit=15.0 2023-10-04 08:54:36,599 WARNING [train.py:1204] (2/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_255-312654-0 from training. Duration: 0.91 2023-10-04 08:54:36,664 WARNING [train.py:1204] (2/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_99-167392-0 from training. Duration: 0.61 2023-10-04 08:54:38,549 WARNING [train.py:1204] (2/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_94-126128-0_sp0.9 from training. Duration: 0.9555625 2023-10-04 08:54:41,391 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.625e+02 1.934e+02 2.148e+02 2.455e+02 3.795e+02, threshold=4.297e+02, percent-clipped=0.0 2023-10-04 08:54:44,365 WARNING [train.py:1204] (2/4) Exclude cut with ID _68635_210_9317_1_1535770813410_4315870_187-192817-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 08:54:45,832 WARNING [train.py:1204] (2/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_277-204970-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 08:54:48,272 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6163_210_6400_1_1527506746_4263068_283-137811-0_sp0.9 from training. Duration: 0.8555625 2023-10-04 08:54:48,300 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9144W0121-566446 from training. Duration: 0.896 2023-10-04 08:54:50,906 INFO [train.py:1046] (2/4) Epoch 46, batch 500, loss[loss=0.1538, simple_loss=0.2372, pruned_loss=0.0352, over 24488.00 frames. ], tot_loss[loss=0.1536, simple_loss=0.2339, pruned_loss=0.03666, over 4317904.19 frames. ], batch size: 66, lr: 2.22e-03, grad_scale: 16.0 2023-10-04 08:54:52,403 WARNING [train.py:1204] (2/4) Exclude cut with ID _49371_210_12071_1_1533952761021_3812190_351-315519-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 08:54:53,736 WARNING [train.py:1204] (2/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_115-326778-0_sp1.1 from training. Duration: 0.8454375 2023-10-04 08:54:53,799 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_17292_210_6753_1_1531125388_2943734_436-198965-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 08:54:53,808 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9165W0117-576751 from training. Duration: 0.896 2023-10-04 08:54:54,488 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.3.encoder.layers.0.conv_module1.whiten, num_groups=1, num_channels=512, metric=4.47 vs. limit=15.0 2023-10-04 08:54:55,152 WARNING [train.py:1204] (2/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_212-149947-0 from training. Duration: 0.73 2023-10-04 08:54:55,168 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_13757_210_3528_1_1530440682_4107239_65-167544-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 08:54:58,024 WARNING [train.py:1204] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_700-183549-0_sp0.9 from training. Duration: 0.7555625 2023-10-04 08:55:02,098 WARNING [train.py:1204] (2/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_216-343007-0_sp0.9 from training. Duration: 0.7333125 2023-10-04 08:55:02,201 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7544_210_6753_1_1528587722_3576686_295-258479-0_sp1.1 from training. Duration: 0.763625 2023-10-04 08:55:05,147 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_365-287106-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 08:55:05,159 WARNING [train.py:1204] (2/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_300-3747-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 08:55:05,223 WARNING [train.py:1204] (2/4) Exclude cut with ID _14069_210_5632_1_1530512692033_4803390_487-303201-0_sp1.1 from training. Duration: 0.92725 2023-10-04 08:55:10,141 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.3.encoder.layers.0.feed_forward1.out_whiten, num_groups=1, num_channels=512, metric=6.27 vs. limit=15.0 2023-10-04 08:55:15,494 WARNING [train.py:1204] (2/4) Exclude cut with ID _14459_210_2228_1_1531443641946_7516700_668-123026-0_sp1.1 from training. Duration: 0.936375 2023-10-04 08:55:17,278 WARNING [train.py:1204] (2/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_108-53929-0_sp1.1 from training. Duration: 0.536375 2023-10-04 08:55:17,308 WARNING [train.py:1204] (2/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_266-137104-0_sp0.9 from training. Duration: 0.72225 2023-10-04 08:55:17,341 WARNING [train.py:1204] (2/4) Exclude cut with ID _12302_210_5219_1_1529924446466_8107280_902-168619-0_sp1.1 from training. Duration: 0.936375 2023-10-04 08:55:19,253 WARNING [train.py:1204] (2/4) Exclude cut with ID _59502_210_12210_1_1535107024698_1835139_146-162495-0 from training. Duration: 0.92 2023-10-04 08:55:19,282 WARNING [train.py:1204] (2/4) Exclude cut with ID _3465_210_3525_1_1526175667060_3574049_412-287526-0_sp1.1 from training. Duration: 0.6545625 2023-10-04 08:55:22,506 WARNING [train.py:1204] (2/4) Exclude cut with ID _7160_210_1876_1_1527940923231_3703799_120-150943-0_sp0.9 from training. Duration: 0.9 2023-10-04 08:55:23,945 WARNING [train.py:1204] (2/4) Exclude cut with ID _15037_210_2253_1_1530752294104_3267650_296-192597-0_sp1.1 from training. Duration: 0.736375 2023-10-04 08:55:23,972 WARNING [train.py:1204] (2/4) Exclude cut with ID _13252_210_1925_1_1530671372047_3336909_397-216497-0_sp1.1 from training. Duration: 0.87275 2023-10-04 08:55:23,987 WARNING [train.py:1204] (2/4) Exclude cut with ID _40094_210_16380_1_1533294005773_3641159_76-269320-0_sp1.1 from training. Duration: 0.936375 2023-10-04 08:55:24,036 WARNING [train.py:1204] (2/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_449-15508-0 from training. Duration: 0.82 2023-10-04 08:55:26,799 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9309W0194-640321 from training. Duration: 0.64 2023-10-04 08:55:28,244 WARNING [train.py:1204] (2/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_284-312178-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 08:55:28,482 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.conv_module2.balancer1.prob, batch_count=1597106.6666666667, ans=0.125 2023-10-04 08:55:30,866 WARNING [train.py:1204] (2/4) Exclude cut with ID _29853_210_7497_1_1532505600851_6642110_401-55937-0_sp1.1 from training. Duration: 0.92725 2023-10-04 08:55:31,698 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.2.encoder.layers.0.conv_module2.whiten, num_groups=1, num_channels=384, metric=5.33 vs. limit=15.0 2023-10-04 08:55:32,222 WARNING [train.py:1204] (2/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_716-24918-0_sp1.1 from training. Duration: 0.92725 2023-10-04 08:55:32,247 WARNING [train.py:1204] (2/4) Exclude cut with ID _19637_210_4220_1_1531537167676_3205060_314-305638-0_sp1.1 from training. Duration: 0.92725 2023-10-04 08:55:33,526 WARNING [train.py:1204] (2/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_475-127455-0_sp1.1 from training. Duration: 0.636375 2023-10-04 08:55:33,677 WARNING [train.py:1204] (2/4) Exclude cut with ID _5444_210_2151_1_1527418808464_5312210_581-315411-0 from training. Duration: 0.62 2023-10-04 08:55:37,791 WARNING [train.py:1204] (2/4) Exclude cut with ID _44902_210_16187_1_1533888006062_3792078_149-67212-0_sp0.9 from training. Duration: 0.9666875 2023-10-04 08:55:39,104 WARNING [train.py:1204] (2/4) Exclude cut with ID _31602_210_5854_1_1532677021456_4458250_63-63253-0_sp1.1 from training. Duration: 0.963625 2023-10-04 08:55:43,846 WARNING [train.py:1204] (2/4) Exclude cut with ID _55209_210_1531_1_1534675828996_4597160_660-203992-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 08:55:46,587 WARNING [train.py:1204] (2/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_83-190624-0_sp1.1 from training. Duration: 0.92725 2023-10-04 08:55:51,955 WARNING [train.py:1204] (2/4) Exclude cut with ID _58605_210_12210_1_1534723195939_3904259_154-139635-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 08:55:55,978 WARNING [train.py:1204] (2/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_164-224052-0 from training. Duration: 0.7 2023-10-04 08:55:55,995 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_1126-189071-0_sp1.1 from training. Duration: 0.963625 2023-10-04 08:55:56,013 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_15396_210_6890_1_1531308473_1938936_310-214789-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 08:55:56,158 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.feed_forward1.hidden_balancer.prob, batch_count=1597240.0, ans=0.125 2023-10-04 08:55:59,923 WARNING [train.py:1204] (2/4) Exclude cut with ID _8395_210_1531_1_1528597871523_4411579_431-132470-0 from training. Duration: 0.74 2023-10-04 08:55:59,991 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_3780_210_2087_1_1526467760_2130483_170-259044-0_sp0.9 from training. Duration: 0.77775 2023-10-04 08:56:00,096 WARNING [train.py:1204] (2/4) Exclude cut with ID _12733_210_8341_1_1530167424556_3646380_537-230640-0_sp1.1 from training. Duration: 0.963625 2023-10-04 08:56:04,258 INFO [train.py:1046] (2/4) Epoch 46, batch 550, loss[loss=0.1569, simple_loss=0.2508, pruned_loss=0.03155, over 24641.00 frames. ], tot_loss[loss=0.1542, simple_loss=0.2348, pruned_loss=0.03681, over 4415443.53 frames. ], batch size: 65, lr: 2.22e-03, grad_scale: 8.0 2023-10-04 08:56:05,717 WARNING [train.py:1204] (2/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_159-281310-0 from training. Duration: 0.95 2023-10-04 08:56:06,939 WARNING [train.py:1204] (2/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_434-83000-0 from training. Duration: 0.98 2023-10-04 08:56:08,232 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_4405_210_1849_1_1526732205_3593939_493-32464-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 08:56:08,241 WARNING [train.py:1204] (2/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_342-100157-0 from training. Duration: 0.54 2023-10-04 08:56:08,308 WARNING [train.py:1204] (2/4) Exclude cut with ID _44778_210_2054_1_1533798127203_7443090_765-250133-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 08:56:08,312 WARNING [train.py:1204] (2/4) Exclude cut with ID _38670_210_15379_1_1533448810659_4391059_81-312245-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 08:56:09,947 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_50050_210_16223_1_1535435796_7290138_1273-247611-0_sp1.1 from training. Duration: 0.936375 2023-10-04 08:56:11,310 WARNING [train.py:1204] (2/4) Exclude cut with ID _44831_210_13427_1_1533816011549_3689035_399-18591-0_sp1.1 from training. Duration: 0.936375 2023-10-04 08:56:11,334 WARNING [train.py:1204] (2/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_557-324258-0_sp0.9 from training. Duration: 0.9 2023-10-04 08:56:13,113 WARNING [train.py:1204] (2/4) Exclude cut with ID _3465_210_3525_1_1526175667060_3574049_62-335552-0_sp1.1 from training. Duration: 0.7909375 2023-10-04 08:56:14,465 WARNING [train.py:1204] (2/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_673-264125-0_sp1.1 from training. Duration: 0.963625 2023-10-04 08:56:14,560 WARNING [train.py:1204] (2/4) Exclude cut with ID _8030_210_7497_1_1528444886781_3531089_262-86750-0 from training. Duration: 0.81 2023-10-04 08:56:15,848 WARNING [train.py:1204] (2/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_444-28041-0_sp1.1 from training. Duration: 0.77275 2023-10-04 08:56:20,527 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_34008_210_10431_1_1533345424_8183247_516-14858-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 08:56:20,547 WARNING [train.py:1204] (2/4) Exclude cut with ID _32704_210_8954_1_1533017488840_6747370_54-248218-0_sp1.1 from training. Duration: 0.936375 2023-10-04 08:56:23,908 WARNING [train.py:1204] (2/4) Exclude cut with ID _13253_210_1925_1_1530757775819_3577873_300-119782-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 08:56:25,292 WARNING [train.py:1204] (2/4) Exclude cut with ID _39290_210_4220_1_1533862778760_3075118_21-240703-0_sp1.1 from training. Duration: 0.936375 2023-10-04 08:56:28,266 WARNING [train.py:1204] (2/4) Exclude cut with ID 774-127930-0014-10412-0_sp1.1 from training. Duration: 0.95 2023-10-04 08:56:29,535 WARNING [train.py:1204] (2/4) Exclude cut with ID _67499_210_17282_1_1535509805220_3692430_20-179346-0 from training. Duration: 0.97 2023-10-04 08:56:30,234 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.3.encoder.layers.2.conv_module1.whiten, num_groups=1, num_channels=512, metric=5.37 vs. limit=15.0 2023-10-04 08:56:30,904 WARNING [train.py:1204] (2/4) Exclude cut with ID _8365_210_2064_1_1528592311070_4677690_431-294102-0_sp0.9 from training. Duration: 0.988875 2023-10-04 08:56:36,399 WARNING [train.py:1204] (2/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_365-1492-0_sp1.1 from training. Duration: 0.82725 2023-10-04 08:56:36,432 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_105-114797-0_sp0.9 from training. Duration: 0.9333125 2023-10-04 08:56:36,620 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.1.conv_module2.balancer2.prob, batch_count=1597440.0, ans=0.125 2023-10-04 08:56:37,863 WARNING [train.py:1204] (2/4) Exclude cut with ID _6274_210_2084_1_1527937583837_7494020_769-168302-0_sp0.9 from training. Duration: 0.788875 2023-10-04 08:56:39,835 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.3.encoder.layers.3.whiten, num_groups=1, num_channels=512, metric=3.76 vs. limit=12.0 2023-10-04 08:56:40,551 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_40340_210_9024_1_1533433916_4132944_320-270277-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 08:56:40,562 WARNING [train.py:1204] (2/4) Exclude cut with ID ID0203W0003-781711 from training. Duration: 0.958 2023-10-04 08:56:42,411 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_30434_210_13049_1_1532519384_981746_52-12932-0_sp1.1 from training. Duration: 0.936375 2023-10-04 08:56:45,129 WARNING [train.py:1204] (2/4) Exclude cut with ID _8263_210_1664_1_1528438953494_294100_40-204731-0_sp0.9 from training. Duration: 0.5555625 2023-10-04 08:56:46,573 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1032-236863-0_sp0.9 from training. Duration: 0.9333125 2023-10-04 08:56:47,845 WARNING [train.py:1204] (2/4) Exclude cut with ID _8394_210_6702_1_1528590504316_3540189_335-274780-0_sp0.9 from training. Duration: 0.8666875 2023-10-04 08:56:47,859 WARNING [train.py:1204] (2/4) Exclude cut with ID _66873_210_5517_1_1535454136506_4293370_654-52713-0_sp1.1 from training. Duration: 0.7 2023-10-04 08:56:49,221 WARNING [train.py:1204] (2/4) Exclude cut with ID _5250_210_5219_1_1527249561806_8692110_742-267856-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 08:56:49,315 WARNING [train.py:1204] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_74-48717-0 from training. Duration: 0.58 2023-10-04 08:56:49,404 WARNING [train.py:1204] (2/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_18-46756-0 from training. Duration: 0.79 2023-10-04 08:56:49,627 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.feed_forward2.hidden_balancer.prob, batch_count=1597506.6666666667, ans=0.125 2023-10-04 08:56:51,198 WARNING [train.py:1204] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_612-56061-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 08:56:51,214 WARNING [train.py:1204] (2/4) Exclude cut with ID _56423_210_5854_1_1534896061049_3165969_412-2403-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 08:56:52,672 WARNING [train.py:1204] (2/4) Exclude cut with ID _62574_210_2655_1_1535180407058_3933249_120-196848-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 08:56:52,673 WARNING [train.py:1204] (2/4) Exclude cut with ID _40519_210_13794_1_1533715228655_7337390_916-51000-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 08:56:53,618 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.out_combiner.scale_min, batch_count=1597506.6666666667, ans=0.2 2023-10-04 08:56:56,030 WARNING [train.py:1204] (2/4) Exclude cut with ID _65263_210_22356_1_1535763598691_3921037_108-21902-0_sp1.1 from training. Duration: 0.9 2023-10-04 08:56:57,420 WARNING [train.py:1204] (2/4) Exclude cut with ID _4122_210_2084_1_1526713164417_4353280_528-206771-0_sp1.1 from training. Duration: 0.8 2023-10-04 08:57:00,193 WARNING [train.py:1204] (2/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_43-325657-0_sp1.1 from training. Duration: 0.92725 2023-10-04 08:57:00,243 WARNING [train.py:1204] (2/4) Exclude cut with ID _44659_210_2721_1_1534036120061_6614860_314-82041-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 08:57:00,304 WARNING [train.py:1204] (2/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_81-300212-0_sp0.9 from training. Duration: 0.6666875 2023-10-04 08:57:01,796 WARNING [train.py:1204] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_665-196155-0_sp0.9 from training. Duration: 0.7444375 2023-10-04 08:57:03,085 WARNING [train.py:1204] (2/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_140-336881-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 08:57:03,159 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6290_210_3273_1_1527595224_1282787_115-87095-0_sp0.9 from training. Duration: 0.611125 2023-10-04 08:57:04,454 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_53373_210_5399_1_1534229471_4715695_607-259685-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 08:57:04,568 WARNING [train.py:1204] (2/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_175-163894-0_sp1.1 from training. Duration: 0.67275 2023-10-04 08:57:05,855 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7002_210_1794_1_1527939683_5157286_1-281264-0_sp1.1 from training. Duration: 0.4 2023-10-04 08:57:10,264 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.752e+02 2.017e+02 2.263e+02 2.749e+02 3.801e+02, threshold=4.526e+02, percent-clipped=0.0 2023-10-04 08:57:11,740 WARNING [train.py:1204] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_605-257205-0 from training. Duration: 0.59 2023-10-04 08:57:11,952 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.ff2_skip_rate, batch_count=1597573.3333333333, ans=0.0 2023-10-04 08:57:14,617 WARNING [train.py:1204] (2/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_454-314901-0 from training. Duration: 0.46 2023-10-04 08:57:16,040 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_46032_210_14656_1_1533781412_4507969_287-130422-0_sp1.1 from training. Duration: 0.97275 2023-10-04 08:57:16,056 WARNING [train.py:1204] (2/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_186-309161-0_sp1.1 from training. Duration: 0.4909375 2023-10-04 08:57:16,098 WARNING [train.py:1204] (2/4) Exclude cut with ID _42659_210_2070_1_1533607442765_3120380_45-342307-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 08:57:16,142 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder_embed.dropout.p, batch_count=1597640.0, ans=0.1 2023-10-04 08:57:17,310 INFO [train.py:1046] (2/4) Epoch 46, batch 600, loss[loss=0.1376, simple_loss=0.2045, pruned_loss=0.03531, over 23582.00 frames. ], tot_loss[loss=0.1543, simple_loss=0.2349, pruned_loss=0.03683, over 4486701.21 frames. ], batch size: 256, lr: 2.22e-03, grad_scale: 8.0 2023-10-04 08:57:17,610 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=1597640.0, ans=0.1 2023-10-04 08:57:19,433 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.ff2_skip_rate, batch_count=1597640.0, ans=0.0 2023-10-04 08:57:23,957 WARNING [train.py:1204] (2/4) Exclude cut with ID _12555_210_8750_1_1530075594402_3780239_478-326818-0_sp1.1 from training. Duration: 0.963625 2023-10-04 08:57:26,015 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.balancer1.prob, batch_count=1597640.0, ans=0.125 2023-10-04 08:57:27,146 WARNING [train.py:1204] (2/4) Exclude cut with ID _7606_210_4915_1_1528894253277_5080690_127-177761-0_sp1.1 from training. Duration: 0.5818125 2023-10-04 08:57:28,565 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6788_210_1388_1_1527898698_4562021_91-197981-0 from training. Duration: 0.83 2023-10-04 08:57:31,164 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8558_210_1410_1_1528505225_7613406_131-329524-0_sp0.9 from training. Duration: 0.87775 2023-10-04 08:57:33,813 WARNING [train.py:1204] (2/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_153-153537-0_sp1.1 from training. Duration: 0.82725 2023-10-04 08:57:35,303 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_17292_210_6753_1_1531125388_2943734_285-349105-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 08:57:38,011 WARNING [train.py:1204] (2/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_37-41919-0 from training. Duration: 0.87 2023-10-04 08:57:38,031 WARNING [train.py:1204] (2/4) Exclude cut with ID _60860_210_8254_1_1534896015875_3557169_172-294852-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 08:57:40,991 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=1597706.6666666667, ans=0.1 2023-10-04 08:57:44,146 WARNING [train.py:1204] (2/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_146-276944-0 from training. Duration: 0.83 2023-10-04 08:57:47,076 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.feed_forward1.hidden_balancer.prob, batch_count=1597773.3333333333, ans=0.125 2023-10-04 08:57:48,220 WARNING [train.py:1204] (2/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_1254-44082-0_sp1.1 from training. Duration: 0.936375 2023-10-04 08:57:48,232 WARNING [train.py:1204] (2/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_1115-180350-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 08:57:48,270 WARNING [train.py:1204] (2/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_60-146189-0_sp1.1 from training. Duration: 0.8909375 2023-10-04 08:57:49,830 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.conv_skip_rate, batch_count=1597773.3333333333, ans=0.0 2023-10-04 08:57:54,366 WARNING [train.py:1204] (2/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_517-153649-0_sp1.1 from training. Duration: 0.92725 2023-10-04 08:57:54,385 WARNING [train.py:1204] (2/4) Exclude cut with ID _39571_210_13449_1_1533801584143_3651077_37-266257-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 08:57:54,422 WARNING [train.py:1204] (2/4) Exclude cut with ID _6653_210_2629_1_1527919172441_6732679_527-276118-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 08:58:02,274 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7696_210_2040_1_1528207167_3716099_117-125434-0_sp1.1 from training. Duration: 0.6909375 2023-10-04 08:58:05,402 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.attention_skip_rate, batch_count=1597840.0, ans=0.0 2023-10-04 08:58:06,537 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_25767_210_2161_1_1532307483_3867748_478-156360-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 08:58:06,541 WARNING [train.py:1204] (2/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_177-49092-0_sp1.1 from training. Duration: 0.82725 2023-10-04 08:58:06,548 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_11305_210_1794_1_1529750428_7178148_680-317073-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 08:58:12,248 WARNING [train.py:1204] (2/4) Exclude cut with ID _53565_210_10635_1_1534337218335_3820259_287-18120-0 from training. Duration: 0.97 2023-10-04 08:58:16,919 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.bypass.skip_rate, batch_count=1597906.6666666667, ans=0.04949747468305833 2023-10-04 08:58:19,437 WARNING [train.py:1204] (2/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_498-40153-0_sp1.1 from training. Duration: 0.57275 2023-10-04 08:58:19,454 WARNING [train.py:1204] (2/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_555-63066-0_sp1.1 from training. Duration: 0.963625 2023-10-04 08:58:21,013 WARNING [train.py:1204] (2/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_395-310220-0 from training. Duration: 0.89 2023-10-04 08:58:22,950 WARNING [train.py:1204] (2/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_937-208281-0_sp1.1 from training. Duration: 0.77275 2023-10-04 08:58:24,357 WARNING [train.py:1204] (2/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_120-211630-0 from training. Duration: 0.92 2023-10-04 08:58:24,395 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_454-276429-0_sp1.1 from training. Duration: 0.7909375 2023-10-04 08:58:26,399 WARNING [train.py:1204] (2/4) Exclude cut with ID _22826_210_9994_1_1531908201522_3279169_404-75889-0_sp1.1 from training. Duration: 0.8090625 2023-10-04 08:58:30,376 WARNING [train.py:1204] (2/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_282-136641-0_sp0.9 from training. Duration: 0.4666875 2023-10-04 08:58:31,235 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.2.encoder.layers.1.nonlin_attention.whiten2, num_groups=1, num_channels=384, metric=5.37 vs. limit=15.0 2023-10-04 08:58:31,711 INFO [train.py:1046] (2/4) Epoch 46, batch 650, loss[loss=0.1638, simple_loss=0.2508, pruned_loss=0.03836, over 24392.00 frames. ], tot_loss[loss=0.1541, simple_loss=0.2348, pruned_loss=0.03672, over 4545197.33 frames. ], batch size: 77, lr: 2.22e-03, grad_scale: 8.0 2023-10-04 08:58:31,804 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_809-17156-0_sp1.1 from training. Duration: 0.663625 2023-10-04 08:58:34,426 WARNING [train.py:1204] (2/4) Exclude cut with ID _9235_210_5219_1_1528891154253_8046120_1003-101638-0_sp0.9 from training. Duration: 0.811125 2023-10-04 08:58:35,785 WARNING [train.py:1204] (2/4) Exclude cut with ID _22826_210_9994_1_1531908201522_3279169_404-75889-0_sp0.9 from training. Duration: 0.988875 2023-10-04 08:58:37,287 WARNING [train.py:1204] (2/4) Exclude cut with ID _59288_210_5854_1_1535077757774_3412246_570-295255-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 08:58:37,518 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.feed_forward1.hidden_balancer.prob, batch_count=1597973.3333333333, ans=0.125 2023-10-04 08:58:38,038 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.3.encoder.layers.0.conv_module1.whiten, num_groups=1, num_channels=512, metric=3.89 vs. limit=15.0 2023-10-04 08:58:40,137 WARNING [train.py:1204] (2/4) Exclude cut with ID _3465_210_3525_1_1526175667060_3574049_412-287526-0 from training. Duration: 0.72 2023-10-04 08:58:41,936 WARNING [train.py:1204] (2/4) Exclude cut with ID _44010_210_4525_1_1533884262964_4013620_649-79655-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 08:58:44,916 WARNING [train.py:1204] (2/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_57-270037-0_sp0.9 from training. Duration: 0.8555625 2023-10-04 08:58:44,917 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_34815_210_6388_1_1533381320_3571903_302-255963-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 08:58:46,469 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.bypass.skip_rate, batch_count=1598040.0, ans=0.04949747468305833 2023-10-04 08:58:48,946 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_47449_210_5399_1_1534076445_4006929_573-301852-0_sp1.1 from training. Duration: 0.97275 2023-10-04 08:58:52,818 WARNING [train.py:1204] (2/4) Exclude cut with ID 6709-74022-0004-86860-0_sp1.1 from training. Duration: 0.9409375 2023-10-04 08:58:53,100 INFO [scaling.py:1118] (2/4) WithLoss: name=encoder.encoders.5.encoder.layers.1.self_attn_weights, loss-sum=0.000e+00 2023-10-04 08:58:55,351 WARNING [train.py:1204] (2/4) Exclude cut with ID _40519_210_13794_1_1533715228655_7337390_111-71991-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 08:58:55,406 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_30167_210_3528_1_1532428012_4772375_169-214186-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 08:58:58,867 WARNING [train.py:1204] (2/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_20-235313-0_sp1.1 from training. Duration: 0.963625 2023-10-04 08:58:58,930 WARNING [train.py:1204] (2/4) Exclude cut with ID _4681_210_3525_1_1527251040321_1235713_123-24428-0_sp0.9 from training. Duration: 0.5333125 2023-10-04 08:59:01,550 WARNING [train.py:1204] (2/4) Exclude cut with ID _31602_210_5854_1_1532677021456_4458250_444-198752-0_sp1.1 from training. Duration: 0.97275 2023-10-04 08:59:02,830 WARNING [train.py:1204] (2/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_9-279442-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 08:59:02,886 WARNING [train.py:1204] (2/4) Exclude cut with ID _6275_210_2084_1_1528110062213_7669049_973-198085-0_sp0.9 from training. Duration: 0.8333125 2023-10-04 08:59:04,247 WARNING [train.py:1204] (2/4) Exclude cut with ID _29853_210_7497_1_1532505600851_6642110_567-250221-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 08:59:04,541 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=1598106.6666666667, ans=0.1 2023-10-04 08:59:05,628 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6545_210_2040_1_1527774670_4245258_407-48070-0_sp1.1 from training. Duration: 0.6909375 2023-10-04 08:59:08,423 WARNING [train.py:1204] (2/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_224-343295-0_sp0.9 from training. Duration: 0.611125 2023-10-04 08:59:08,445 WARNING [train.py:1204] (2/4) Exclude cut with ID IC0125W0011-61817 from training. Duration: 0.88 2023-10-04 08:59:08,454 WARNING [train.py:1204] (2/4) Exclude cut with ID _60113_210_16271_1_1535164212481_4386079_435-154371-0_sp1.1 from training. Duration: 0.97275 2023-10-04 08:59:08,467 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_25836_210_16554_1_1532138167_53197_3-9394-0_sp1.1 from training. Duration: 0.92725 2023-10-04 08:59:12,548 WARNING [train.py:1204] (2/4) Exclude cut with ID _29343_210_4381_1_1532602757752_3674109_57-55125-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 08:59:13,906 WARNING [train.py:1204] (2/4) Exclude cut with ID _56235_210_10640_1_1534557335235_4161099_267-23560-0_sp1.1 from training. Duration: 0.936375 2023-10-04 08:59:15,726 WARNING [train.py:1204] (2/4) Exclude cut with ID _30196_210_1664_1_1533708496522_7074140_1150-278867-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 08:59:15,784 WARNING [train.py:1204] (2/4) Exclude cut with ID _6270_210_2084_1_1527850911573_3856420_26-277343-0_sp0.9 from training. Duration: 0.911125 2023-10-04 08:59:17,178 WARNING [train.py:1204] (2/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_325-62802-0 from training. Duration: 0.87 2023-10-04 08:59:18,504 WARNING [train.py:1204] (2/4) Exclude cut with ID _56524_210_9642_1_1534572011779_3619170_145-310842-0_sp1.1 from training. Duration: 0.9 2023-10-04 08:59:18,535 WARNING [train.py:1204] (2/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_860-96198-0_sp0.9 from training. Duration: 0.811125 2023-10-04 08:59:19,906 WARNING [train.py:1204] (2/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_17-25045-0_sp1.1 from training. Duration: 0.536375 2023-10-04 08:59:19,923 WARNING [train.py:1204] (2/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_349-69727-0_sp1.1 from training. Duration: 0.936375 2023-10-04 08:59:20,014 WARNING [train.py:1204] (2/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_580-93798-0_sp1.1 from training. Duration: 0.5090625 2023-10-04 08:59:22,796 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7762_210_1794_1_1528199508_4057408_419-125009-0 from training. Duration: 0.83 2023-10-04 08:59:24,534 WARNING [train.py:1204] (2/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_356-167816-0 from training. Duration: 0.72 2023-10-04 08:59:24,563 WARNING [train.py:1204] (2/4) Exclude cut with ID _29345_210_4381_1_1532689168263_3860169_225-97100-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 08:59:24,568 WARNING [train.py:1204] (2/4) Exclude cut with ID _14227_210_4381_1_1530701804286_3940259_374-194712-0_sp1.1 from training. Duration: 0.92725 2023-10-04 08:59:26,466 WARNING [train.py:1204] (2/4) Exclude cut with ID _40137_210_12210_1_1533290484046_3795180_249-187059-0_sp0.9 from training. Duration: 0.97775 2023-10-04 08:59:26,484 WARNING [train.py:1204] (2/4) Exclude cut with ID _22826_210_9994_1_1531908201522_3279169_389-317194-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 08:59:27,894 WARNING [train.py:1204] (2/4) Exclude cut with ID _27485_210_4916_1_1532739191677_3668269_239-122918-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 08:59:32,801 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_2245_210_2715_1_1525511548_3540254_128-117609-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 08:59:32,832 WARNING [train.py:1204] (2/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_160-276178-0_sp1.1 from training. Duration: 0.963625 2023-10-04 08:59:34,270 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_31920_210_10633_1_1532860009_1686094_1-279735-0_sp1.1 from training. Duration: 0.97275 2023-10-04 08:59:35,247 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.0.layers.1.whiten, num_groups=1, num_channels=192, metric=4.21 vs. limit=12.0 2023-10-04 08:59:35,662 WARNING [train.py:1204] (2/4) Exclude cut with ID _51078_210_9070_1_1534161557424_3805250_325-262383-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 08:59:36,978 WARNING [train.py:1204] (2/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_643-52007-0_sp0.9 from training. Duration: 0.6333125 2023-10-04 08:59:37,030 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_51937_210_5014_1_1534573610_4022025_518-299248-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 08:59:38,309 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.626e+02 2.015e+02 2.292e+02 2.659e+02 4.120e+02, threshold=4.583e+02, percent-clipped=0.0 2023-10-04 08:59:43,801 WARNING [train.py:1204] (2/4) Exclude cut with ID _13252_210_1925_1_1530671372047_3336909_166-314607-0_sp1.1 from training. Duration: 0.4909375 2023-10-04 08:59:43,805 WARNING [train.py:1204] (2/4) Exclude cut with ID _44778_210_2054_1_1533798127203_7443090_1087-282939-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 08:59:43,842 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_23850_210_4238_1_1532232597_1632433_32-185135-0_sp1.1 from training. Duration: 0.7818125 2023-10-04 08:59:43,878 WARNING [train.py:1204] (2/4) Exclude cut with ID _59500_210_8254_1_1534809610680_3736009_145-89359-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 08:59:45,587 INFO [train.py:1046] (2/4) Epoch 46, batch 700, loss[loss=0.1566, simple_loss=0.246, pruned_loss=0.03361, over 23928.00 frames. ], tot_loss[loss=0.153, simple_loss=0.2327, pruned_loss=0.03668, over 4565900.41 frames. ], batch size: 86, lr: 2.22e-03, grad_scale: 8.0 2023-10-04 08:59:49,821 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_340-336408-0 from training. Duration: 0.93 2023-10-04 08:59:49,888 WARNING [train.py:1204] (2/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_9-21737-0 from training. Duration: 0.97 2023-10-04 08:59:53,812 WARNING [train.py:1204] (2/4) Exclude cut with ID _2220_210_1819_1_1525509109898_3658680_50-99837-0 from training. Duration: 0.54 2023-10-04 08:59:54,960 WARNING [train.py:1204] (2/4) Exclude cut with ID _30304_210_6319_1_1532476827261_7582060_879-299350-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 08:59:55,090 WARNING [train.py:1204] (2/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_905-160074-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 08:59:57,774 WARNING [train.py:1204] (2/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_395-73570-0 from training. Duration: 0.99 2023-10-04 09:00:01,064 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7264_210_4938_1_1527939911_8132095_800-91995-0_sp1.1 from training. Duration: 0.7818125 2023-10-04 09:00:03,761 WARNING [train.py:1204] (2/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_77-311404-0_sp1.1 from training. Duration: 0.8545625 2023-10-04 09:00:05,103 WARNING [train.py:1204] (2/4) Exclude cut with ID _13679_210_7497_1_1530534293662_3355098_107-80772-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 09:00:07,818 WARNING [train.py:1204] (2/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_224-343295-0_sp1.1 from training. Duration: 0.5 2023-10-04 09:00:07,859 WARNING [train.py:1204] (2/4) Exclude cut with ID _54871_210_22356_1_1534665798253_3856359_156-253482-0_sp1.1 from training. Duration: 0.963625 2023-10-04 09:00:10,596 WARNING [train.py:1204] (2/4) Exclude cut with ID _49728_210_12651_1_1534041034294_6788150_309-125532-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 09:00:12,113 WARNING [train.py:1204] (2/4) Exclude cut with ID _13913_210_9994_1_1530579615187_3383897_473-277682-0_sp0.9 from training. Duration: 0.4333125 2023-10-04 09:00:12,117 WARNING [train.py:1204] (2/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_3-276701-0_sp1.1 from training. Duration: 0.82725 2023-10-04 09:00:13,803 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.ff3_skip_rate, batch_count=1598440.0, ans=0.0 2023-10-04 09:00:14,901 WARNING [train.py:1204] (2/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_282-136641-0 from training. Duration: 0.42 2023-10-04 09:00:17,002 WARNING [train.py:1204] (2/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_127-566-0 from training. Duration: 0.84 2023-10-04 09:00:20,988 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6163_210_6400_1_1527506746_4263068_283-137811-0_sp1.1 from training. Duration: 0.7 2023-10-04 09:00:21,024 WARNING [train.py:1204] (2/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_34-183685-0_sp1.1 from training. Duration: 0.8909375 2023-10-04 09:00:21,705 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.3.encoder.layers.0.whiten, num_groups=1, num_channels=512, metric=5.04 vs. limit=12.0 2023-10-04 09:00:23,109 WARNING [train.py:1204] (2/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_154-135696-0_sp0.9 from training. Duration: 0.888875 2023-10-04 09:00:24,039 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.conv_module1.balancer2.prob, batch_count=1598440.0, ans=0.125 2023-10-04 09:00:26,488 WARNING [train.py:1204] (2/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_676-51014-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 09:00:26,549 WARNING [train.py:1204] (2/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_38-169920-0 from training. Duration: 0.91 2023-10-04 09:00:31,260 WARNING [train.py:1204] (2/4) Exclude cut with ID _6653_210_2629_1_1527919172441_6732679_747-114465-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 09:00:31,310 WARNING [train.py:1204] (2/4) Exclude cut with ID _8021_210_7994_1_1528376451358_3653069_100-140231-0_sp1.1 from training. Duration: 0.6090625 2023-10-04 09:00:31,337 WARNING [train.py:1204] (2/4) Exclude cut with ID _64135_210_2253_1_1535677267876_7608972_133-188266-0 from training. Duration: 0.81 2023-10-04 09:00:35,437 WARNING [train.py:1204] (2/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_714-310538-0_sp1.1 from training. Duration: 0.7818125 2023-10-04 09:00:36,765 WARNING [train.py:1204] (2/4) Exclude cut with ID _39867_210_9826_1_1533553222030_9344259_1134-288498-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 09:00:39,485 WARNING [train.py:1204] (2/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_211-237289-0_sp1.1 from training. Duration: 0.936375 2023-10-04 09:00:44,231 WARNING [train.py:1204] (2/4) Exclude cut with ID _59202_210_26984_1_1534816810612_3549174_137-318710-0_sp0.9 from training. Duration: 0.988875 2023-10-04 09:00:44,255 WARNING [train.py:1204] (2/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_86-66192-0 from training. Duration: 0.55 2023-10-04 09:00:47,135 WARNING [train.py:1204] (2/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_540-275685-0 from training. Duration: 0.89 2023-10-04 09:00:47,174 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8776_210_2040_1_1528638837_4088036_244-278763-0 from training. Duration: 0.9 2023-10-04 09:00:50,532 WARNING [train.py:1204] (2/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_9-219972-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 09:00:51,726 WARNING [train.py:1204] (2/4) Exclude cut with ID _44659_210_2721_1_1534036120061_6614860_656-283823-0_sp1.1 from training. Duration: 0.92725 2023-10-04 09:00:53,130 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_69630_210_7234_1_1535789601_661049_77-216385-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 09:00:53,385 INFO [scaling.py:1118] (2/4) WithLoss: name=encoder.encoders.5.encoder.layers.0.self_attn_weights, loss-sum=0.000e+00 2023-10-04 09:00:55,852 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_19952_210_2161_1_1531875469_3851750_210-308919-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 09:00:55,865 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_4737_210_2040_1_1526998689_3293008_378-178650-0 from training. Duration: 0.95 2023-10-04 09:00:56,090 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.conv_module2.balancer2.prob, batch_count=1598573.3333333333, ans=0.125 2023-10-04 09:00:59,917 INFO [train.py:1046] (2/4) Epoch 46, batch 750, loss[loss=0.1521, simple_loss=0.2426, pruned_loss=0.03079, over 24467.00 frames. ], tot_loss[loss=0.153, simple_loss=0.2324, pruned_loss=0.03679, over 4589923.97 frames. ], batch size: 69, lr: 2.22e-03, grad_scale: 8.0 2023-10-04 09:01:01,413 WARNING [train.py:1204] (2/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_224-343295-0 from training. Duration: 0.55 2023-10-04 09:01:01,439 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7002_210_1794_1_1527939683_5157286_184-229630-0 from training. Duration: 0.74 2023-10-04 09:01:01,474 WARNING [train.py:1204] (2/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_6-214815-0 from training. Duration: 0.85 2023-10-04 09:01:03,478 WARNING [train.py:1204] (2/4) Exclude cut with ID _8699_210_2069_1_1528630346831_3960409_160-24408-0 from training. Duration: 0.91 2023-10-04 09:01:03,517 WARNING [train.py:1204] (2/4) Exclude cut with ID _5585_210_2667_1_1527418813324_8007340_89-332072-0 from training. Duration: 0.64 2023-10-04 09:01:03,550 WARNING [train.py:1204] (2/4) Exclude cut with ID _5250_210_5219_1_1527249561806_8692110_528-259800-0_sp1.1 from training. Duration: 0.963625 2023-10-04 09:01:06,384 WARNING [train.py:1204] (2/4) Exclude cut with ID _55383_210_10635_1_1534468426172_2231669_134-128246-0 from training. Duration: 0.89 2023-10-04 09:01:06,472 WARNING [train.py:1204] (2/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_400-338274-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 09:01:07,830 WARNING [train.py:1204] (2/4) Exclude cut with ID _14224_210_4381_1_1530529062037_4264010_364-22817-0_sp0.9 from training. Duration: 0.788875 2023-10-04 09:01:10,439 WARNING [train.py:1204] (2/4) Exclude cut with ID _42863_210_9070_1_1533902398788_3696920_331-291057-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 09:01:11,910 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_5436_210_1794_1_1527331317_2541494_229-211016-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 09:01:11,954 WARNING [train.py:1204] (2/4) Exclude cut with ID _6456_210_1534_1_1527681634212_3181049_345-252877-0_sp0.9 from training. Duration: 0.711125 2023-10-04 09:01:11,984 WARNING [train.py:1204] (2/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_96-81499-0_sp1.1 from training. Duration: 0.92725 2023-10-04 09:01:14,668 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_5575_210_1410_1_1527301305_2570170_342-265572-0_sp1.1 from training. Duration: 0.8545625 2023-10-04 09:01:16,507 WARNING [train.py:1204] (2/4) Exclude cut with ID _5444_210_2151_1_1527418808464_5312210_726-27165-0_sp0.9 from training. Duration: 0.7444375 2023-10-04 09:01:17,949 WARNING [train.py:1204] (2/4) Exclude cut with ID _22083_210_8254_1_1531702862439_3678970_304-327750-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 09:01:20,684 WARNING [train.py:1204] (2/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_245-226199-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 09:01:20,727 WARNING [train.py:1204] (2/4) Exclude cut with ID _42875_210_14856_1_1533704397487_4012433_102-199499-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 09:01:22,625 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_51225_210_3601_1_1534400634_2327383_55-301844-0 from training. Duration: 0.93 2023-10-04 09:01:23,960 WARNING [train.py:1204] (2/4) Exclude cut with ID _28317_210_12742_1_1532480302381_8510325_916-195848-0_sp1.1 from training. Duration: 0.836375 2023-10-04 09:01:24,131 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.0.conv_skip_rate, batch_count=1598706.6666666667, ans=0.0 2023-10-04 09:01:25,356 WARNING [train.py:1204] (2/4) Exclude cut with ID _44718_210_5816_1_1534050051869_6469030_497-311999-0_sp1.1 from training. Duration: 0.936375 2023-10-04 09:01:27,340 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_34886_210_10633_1_1533203882_1755661_91-194765-0_sp1.1 from training. Duration: 0.936375 2023-10-04 09:01:28,831 WARNING [train.py:1204] (2/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_310-26186-0_sp0.9 from training. Duration: 0.77775 2023-10-04 09:01:30,173 WARNING [train.py:1204] (2/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_416-257730-0 from training. Duration: 0.65 2023-10-04 09:01:30,179 WARNING [train.py:1204] (2/4) Exclude cut with ID _9389_210_1664_1_1529217337301_5289269_209-154760-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 09:01:32,831 WARNING [train.py:1204] (2/4) Exclude cut with ID _4092_210_2151_1_1526643044449_3135759_227-213972-0 from training. Duration: 0.54 2023-10-04 09:01:32,859 WARNING [train.py:1204] (2/4) Exclude cut with ID IC0679W0044-335852 from training. Duration: 0.768 2023-10-04 09:01:34,174 WARNING [train.py:1204] (2/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_42-186162-0 from training. Duration: 0.92 2023-10-04 09:01:34,180 WARNING [train.py:1204] (2/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_302-2550-0_sp0.9 from training. Duration: 0.9 2023-10-04 09:01:34,195 WARNING [train.py:1204] (2/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_356-31216-0_sp0.9 from training. Duration: 0.8333125 2023-10-04 09:01:36,970 WARNING [train.py:1204] (2/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_125-322172-0_sp1.1 from training. Duration: 0.8454375 2023-10-04 09:01:43,022 WARNING [train.py:1204] (2/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_141-89727-0_sp0.9 from training. Duration: 0.788875 2023-10-04 09:01:43,047 WARNING [train.py:1204] (2/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_647-337784-0_sp1.1 from training. Duration: 0.97275 2023-10-04 09:01:43,063 WARNING [train.py:1204] (2/4) Exclude cut with ID _6493_210_1747_1_1528459210424_4012470_513-140967-0_sp1.1 from training. Duration: 0.7181875 2023-10-04 09:01:45,653 WARNING [train.py:1204] (2/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_87-266334-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 09:01:45,762 WARNING [train.py:1204] (2/4) Exclude cut with ID _26528_210_9070_1_1532570410842_3851310_526-261338-0_sp1.1 from training. Duration: 0.92725 2023-10-04 09:01:47,141 WARNING [train.py:1204] (2/4) Exclude cut with ID _14224_210_4381_1_1530529062037_4264010_380-343670-0 from training. Duration: 0.88 2023-10-04 09:01:47,192 WARNING [train.py:1204] (2/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_449-15508-0_sp1.1 from training. Duration: 0.7454375 2023-10-04 09:01:48,514 WARNING [train.py:1204] (2/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_372-6869-0_sp0.9 from training. Duration: 0.488875 2023-10-04 09:01:50,481 WARNING [train.py:1204] (2/4) Exclude cut with ID _26907_210_7234_1_1532221740694_3205263_337-2494-0_sp0.9 from training. Duration: 0.97775 2023-10-04 09:01:53,332 WARNING [train.py:1204] (2/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_10-100367-0_sp1.1 from training. Duration: 0.8909375 2023-10-04 09:01:53,367 WARNING [train.py:1204] (2/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_47-167373-0 from training. Duration: 0.92 2023-10-04 09:01:54,730 WARNING [train.py:1204] (2/4) Exclude cut with ID _28323_210_16187_1_1532480395667_4100300_628-282179-0_sp1.1 from training. Duration: 0.97275 2023-10-04 09:02:00,029 WARNING [train.py:1204] (2/4) Exclude cut with ID _20455_210_5219_1_1531565996775_8098100_213-280898-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 09:02:01,462 WARNING [train.py:1204] (2/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_383-316000-0_sp1.1 from training. Duration: 0.8181875 2023-10-04 09:02:01,491 WARNING [train.py:1204] (2/4) Exclude cut with ID _18513_210_9206_1_1531618215223_3862620_16-78180-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 09:02:04,195 WARNING [train.py:1204] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_330-327861-0_sp1.1 from training. Duration: 0.6090625 2023-10-04 09:02:06,747 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.674e+02 2.037e+02 2.258e+02 2.551e+02 3.884e+02, threshold=4.516e+02, percent-clipped=0.0 2023-10-04 09:02:08,735 WARNING [train.py:1204] (2/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_356-31216-0 from training. Duration: 0.75 2023-10-04 09:02:08,762 WARNING [train.py:1204] (2/4) Exclude cut with ID _25248_210_8750_1_1532062825941_5554180_255-148539-0_sp1.1 from training. Duration: 0.963625 2023-10-04 09:02:10,191 WARNING [train.py:1204] (2/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_269-154726-0_sp1.1 from training. Duration: 0.7818125 2023-10-04 09:02:12,933 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7002_210_1794_1_1527939683_5157286_163-87552-0_sp1.1 from training. Duration: 0.7818125 2023-10-04 09:02:12,960 WARNING [train.py:1204] (2/4) Exclude cut with ID _54871_210_22356_1_1534665798253_3856359_97-151896-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 09:02:14,213 INFO [train.py:1046] (2/4) Epoch 46, batch 800, loss[loss=0.1496, simple_loss=0.2288, pruned_loss=0.03523, over 23445.00 frames. ], tot_loss[loss=0.1532, simple_loss=0.2331, pruned_loss=0.03666, over 4610678.36 frames. ], batch size: 285, lr: 2.22e-03, grad_scale: 16.0 2023-10-04 09:02:15,751 WARNING [train.py:1204] (2/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_207-7026-0_sp1.1 from training. Duration: 0.97275 2023-10-04 09:02:15,779 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_92-46009-0_sp0.9 from training. Duration: 0.6 2023-10-04 09:02:18,782 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.conv_module1.balancer2.prob, batch_count=1598973.3333333333, ans=0.125 2023-10-04 09:02:19,113 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.4.encoder.layers.1.nonlin_attention.whiten1, num_groups=1, num_channels=288, metric=3.99 vs. limit=10.0 2023-10-04 09:02:23,066 WARNING [train.py:1204] (2/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_122-178847-0_sp1.1 from training. Duration: 0.97275 2023-10-04 09:02:23,070 WARNING [train.py:1204] (2/4) Exclude cut with ID _44778_210_2054_1_1533798127203_7443090_94-348956-0_sp1.1 from training. Duration: 0.92725 2023-10-04 09:02:24,465 WARNING [train.py:1204] (2/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_1018-244089-0_sp1.1 from training. Duration: 0.963625 2023-10-04 09:02:25,740 WARNING [train.py:1204] (2/4) Exclude cut with ID _56235_210_10640_1_1534557335235_4161099_87-118423-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 09:02:27,098 WARNING [train.py:1204] (2/4) Exclude cut with ID _53713_210_24116_1_1534314487762_7416751_504-212292-0_sp1.1 from training. Duration: 0.92725 2023-10-04 09:02:27,125 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_446-150327-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 09:02:31,051 WARNING [train.py:1204] (2/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_682-245989-0_sp1.1 from training. Duration: 0.92725 2023-10-04 09:02:34,109 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.bypass.skip_rate, batch_count=1599040.0, ans=0.07 2023-10-04 09:02:35,261 WARNING [train.py:1204] (2/4) Exclude cut with ID _35723_210_1794_1_1533088341043_628250_8-234524-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 09:02:35,327 WARNING [train.py:1204] (2/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_64-248359-0_sp0.9 from training. Duration: 0.7666875 2023-10-04 09:02:38,120 WARNING [train.py:1204] (2/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_49-97158-0 from training. Duration: 0.76 2023-10-04 09:02:39,504 WARNING [train.py:1204] (2/4) Exclude cut with ID _24314_210_4915_1_1532135078116_7170179_743-915-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 09:02:39,606 WARNING [train.py:1204] (2/4) Exclude cut with ID _61194_210_18628_1_1535007555628_3750909_353-323468-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 09:02:41,379 WARNING [train.py:1204] (2/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_387-131538-0_sp1.1 from training. Duration: 0.736375 2023-10-04 09:02:41,407 WARNING [train.py:1204] (2/4) Exclude cut with ID _44424_210_18163_1_1533623394848_1213110_79-187603-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 09:02:41,446 WARNING [train.py:1204] (2/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_86-19601-0 from training. Duration: 0.85 2023-10-04 09:02:42,686 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_50050_210_16223_1_1535435796_7290138_103-283529-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 09:02:42,733 WARNING [train.py:1204] (2/4) Exclude cut with ID _13913_210_9994_1_1530579615187_3383897_473-277682-0 from training. Duration: 0.39 2023-10-04 09:02:45,428 WARNING [train.py:1204] (2/4) Exclude cut with ID _42326_210_7770_1_1533814282531_3707269_368-68029-0_sp1.1 from training. Duration: 0.92725 2023-10-04 09:02:46,953 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_41428_210_2161_1_1533774184_3992532_446-339645-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 09:02:49,663 WARNING [train.py:1204] (2/4) Exclude cut with ID _69584_210_5219_1_1535785133775_4044450_325-7656-0_sp1.1 from training. Duration: 0.7818125 2023-10-04 09:02:49,671 WARNING [train.py:1204] (2/4) Exclude cut with ID _21632_210_2253_1_1531994001765_3744140_134-184387-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 09:02:51,091 WARNING [train.py:1204] (2/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_550-184707-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 09:02:51,114 WARNING [train.py:1204] (2/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_106-341780-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 09:02:54,540 WARNING [train.py:1204] (2/4) Exclude cut with ID _45871_210_5665_1_1533864614245_3296060_253-132552-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 09:02:55,963 WARNING [train.py:1204] (2/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_395-310220-0_sp1.1 from training. Duration: 0.8090625 2023-10-04 09:02:55,984 WARNING [train.py:1204] (2/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_795-33252-0_sp0.9 from training. Duration: 0.411125 2023-10-04 09:02:57,389 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9002W0003-495962 from training. Duration: 0.946 2023-10-04 09:02:59,146 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_43-71989-0 from training. Duration: 0.57 2023-10-04 09:02:59,159 WARNING [train.py:1204] (2/4) Exclude cut with ID _8699_210_2069_1_1528630346831_3960409_155-39766-0_sp1.1 from training. Duration: 0.7090625 2023-10-04 09:02:59,180 WARNING [train.py:1204] (2/4) Exclude cut with ID _27894_210_12392_1_1532415496078_6902439_788-232684-0_sp1.1 from training. Duration: 0.97275 2023-10-04 09:03:02,218 WARNING [train.py:1204] (2/4) Exclude cut with ID _56494_210_4220_1_1535284837625_3033169_10-39296-0_sp1.1 from training. Duration: 0.92725 2023-10-04 09:03:02,250 WARNING [train.py:1204] (2/4) Exclude cut with ID _40901_210_5665_1_1533363789958_3856130_341-20819-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 09:03:04,274 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.3.encoder.layers.2.self_attn_weights.whiten_keys, num_groups=8, num_channels=256, metric=3.09 vs. limit=6.0 2023-10-04 09:03:06,421 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9029W0435-509822 from training. Duration: 0.896 2023-10-04 09:03:06,461 WARNING [train.py:1204] (2/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_51-117576-0 from training. Duration: 0.4 2023-10-04 09:03:07,831 WARNING [train.py:1204] (2/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_197-28164-0_sp1.1 from training. Duration: 0.72725 2023-10-04 09:03:09,382 WARNING [train.py:1204] (2/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_145-137790-0_sp1.1 from training. Duration: 0.7545625 2023-10-04 09:03:12,610 WARNING [train.py:1204] (2/4) Exclude cut with ID _47790_210_16187_1_1533862814066_4207519_2-39653-0_sp1.1 from training. Duration: 0.8909375 2023-10-04 09:03:17,888 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_3780_210_2087_1_1526467760_2130483_186-208240-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 09:03:17,962 WARNING [train.py:1204] (2/4) Exclude cut with ID _24055_210_12651_1_1532310907143_976979_92-59356-0 from training. Duration: 0.91 2023-10-04 09:03:18,242 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.attention_skip_rate, batch_count=1599240.0, ans=0.0 2023-10-04 09:03:19,374 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_3910_210_1794_1_1526742306_334530_28-238909-0_sp1.1 from training. Duration: 0.736375 2023-10-04 09:03:22,194 WARNING [train.py:1204] (2/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_269-154726-0 from training. Duration: 0.86 2023-10-04 09:03:22,394 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.0.conv_module1.balancer2.prob, batch_count=1599240.0, ans=0.125 2023-10-04 09:03:28,257 INFO [train.py:1046] (2/4) Epoch 46, batch 850, loss[loss=0.1528, simple_loss=0.2325, pruned_loss=0.03659, over 23576.00 frames. ], tot_loss[loss=0.1534, simple_loss=0.2334, pruned_loss=0.0367, over 4640721.72 frames. ], batch size: 256, lr: 2.22e-03, grad_scale: 16.0 2023-10-04 09:03:28,304 WARNING [train.py:1204] (2/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_326-165900-0_sp0.9 from training. Duration: 0.7666875 2023-10-04 09:03:28,493 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.1.bypass.scale_min, batch_count=1599306.6666666667, ans=0.2 2023-10-04 09:03:31,557 WARNING [train.py:1204] (2/4) Exclude cut with ID _5294_210_1747_1_1527249643928_3696179_470-276077-0_sp1.1 from training. Duration: 0.963625 2023-10-04 09:03:31,616 WARNING [train.py:1204] (2/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_372-131163-0 from training. Duration: 0.94 2023-10-04 09:03:33,074 WARNING [train.py:1204] (2/4) Exclude cut with ID _45297_210_2721_1_1533715351381_3264949_351-147136-0_sp1.1 from training. Duration: 0.7909375 2023-10-04 09:03:33,145 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_1072-12671-0_sp1.1 from training. Duration: 0.92725 2023-10-04 09:03:33,266 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.1.balancer1.prob, batch_count=1599306.6666666667, ans=0.125 2023-10-04 09:03:35,755 WARNING [train.py:1204] (2/4) Exclude cut with ID _15133_210_6228_1_1530794512074_3080379_54-198193-0 from training. Duration: 0.94 2023-10-04 09:03:35,772 WARNING [train.py:1204] (2/4) Exclude cut with ID _30196_210_1664_1_1533708496522_7074140_1194-270988-0_sp1.1 from training. Duration: 0.97275 2023-10-04 09:03:35,887 WARNING [train.py:1204] (2/4) Exclude cut with ID _50310_210_15488_1_1534068043345_3641080_289-193637-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 09:03:36,151 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.nonlin_attention.balancer.prob, batch_count=1599306.6666666667, ans=0.125 2023-10-04 09:03:37,278 WARNING [train.py:1204] (2/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_313-263495-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 09:03:38,843 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.ff2_skip_rate, batch_count=1599306.6666666667, ans=0.0 2023-10-04 09:03:39,971 WARNING [train.py:1204] (2/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_18-130336-0_sp1.1 from training. Duration: 0.7181875 2023-10-04 09:03:41,400 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_47896_210_20508_1_1534492518_5958868_646-287209-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 09:03:41,684 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.feed_forward3.hidden_balancer.prob, batch_count=1599373.3333333333, ans=0.125 2023-10-04 09:03:42,707 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_294-163793-0 from training. Duration: 0.82 2023-10-04 09:03:42,737 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_8-319957-0 from training. Duration: 0.96 2023-10-04 09:03:42,747 WARNING [train.py:1204] (2/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_1211-226066-0 from training. Duration: 0.76 2023-10-04 09:03:47,365 WARNING [train.py:1204] (2/4) Exclude cut with ID _4779_210_2084_1_1527318022944_3914893_500-285013-0_sp0.9 from training. Duration: 0.7666875 2023-10-04 09:03:47,384 WARNING [train.py:1204] (2/4) Exclude cut with ID _22826_210_9994_1_1531908201522_3279169_347-232268-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 09:03:47,533 WARNING [train.py:1204] (2/4) Exclude cut with ID _29877_210_6228_1_1532397791106_5276149_111-55056-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 09:03:48,848 WARNING [train.py:1204] (2/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_812-271646-0_sp1.1 from training. Duration: 0.92725 2023-10-04 09:03:48,876 WARNING [train.py:1204] (2/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_235-262124-0_sp1.1 from training. Duration: 0.6090625 2023-10-04 09:03:53,143 WARNING [train.py:1204] (2/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_14-16530-0_sp1.1 from training. Duration: 0.97275 2023-10-04 09:03:53,171 WARNING [train.py:1204] (2/4) Exclude cut with ID _44718_210_5816_1_1534050051869_6469030_568-304512-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 09:03:54,384 WARNING [train.py:1204] (2/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_1033-274376-0 from training. Duration: 0.87 2023-10-04 09:03:55,904 WARNING [train.py:1204] (2/4) Exclude cut with ID _4681_210_3525_1_1527251040321_1235713_123-24428-0 from training. Duration: 0.48 2023-10-04 09:03:57,509 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_37818_210_4238_1_1533620910_4720083_251-288764-0_sp1.1 from training. Duration: 0.97275 2023-10-04 09:03:58,883 WARNING [train.py:1204] (2/4) Exclude cut with ID _65585_210_12210_1_1535450451584_3764098_112-294301-0 from training. Duration: 0.88 2023-10-04 09:04:04,450 WARNING [train.py:1204] (2/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_329-113173-0 from training. Duration: 0.62 2023-10-04 09:04:05,832 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6767_210_3273_1_1527855783_1718932_173-256601-0 from training. Duration: 0.8 2023-10-04 09:04:08,608 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9266W0001-627677 from training. Duration: 0.896 2023-10-04 09:04:08,625 WARNING [train.py:1204] (2/4) Exclude cut with ID _49643_210_7989_1_1534640244224_3939031_611-185515-0_sp1.1 from training. Duration: 0.8545625 2023-10-04 09:04:08,630 WARNING [train.py:1204] (2/4) Exclude cut with ID _31649_210_6228_1_1532584608063_4091078_687-237771-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 09:04:08,640 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_12868_210_6753_1_1530702479_3610564_148-172861-0_sp0.9 from training. Duration: 0.5555625 2023-10-04 09:04:11,429 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_5129_210_6400_1_1527161005_3579054_107-69738-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 09:04:11,521 WARNING [train.py:1204] (2/4) Exclude cut with ID _40137_210_12210_1_1533290484046_3795180_36-301164-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 09:04:11,551 WARNING [train.py:1204] (2/4) Exclude cut with ID _7537_210_2084_1_1528502395431_3924359_113-199933-0 from training. Duration: 0.59 2023-10-04 09:04:14,893 WARNING [train.py:1204] (2/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_547-5424-0_sp1.1 from training. Duration: 0.8545625 2023-10-04 09:04:14,950 WARNING [train.py:1204] (2/4) Exclude cut with ID _43552_210_8913_1_1533726031059_3523429_147-284024-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 09:04:15,026 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_294-163793-0_sp1.1 from training. Duration: 0.7454375 2023-10-04 09:04:16,342 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_3780_210_2087_1_1526467760_2130483_170-259044-0_sp1.1 from training. Duration: 0.636375 2023-10-04 09:04:17,631 WARNING [train.py:1204] (2/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_726-270780-0_sp1.1 from training. Duration: 0.7818125 2023-10-04 09:04:20,252 WARNING [train.py:1204] (2/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_214-291369-0_sp1.1 from training. Duration: 0.57275 2023-10-04 09:04:21,640 WARNING [train.py:1204] (2/4) Exclude cut with ID _21881_210_3525_1_1531825242757_3798380_247-102591-0 from training. Duration: 0.98 2023-10-04 09:04:24,460 WARNING [train.py:1204] (2/4) Exclude cut with ID _8064_210_5399_1_1528372299291_4282973_18-174647-0_sp1.1 from training. Duration: 0.82725 2023-10-04 09:04:24,463 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_415-52611-0_sp1.1 from training. Duration: 0.936375 2023-10-04 09:04:24,533 WARNING [train.py:1204] (2/4) Exclude cut with ID _8394_210_6702_1_1528590504316_3540189_379-49457-0_sp0.9 from training. Duration: 0.9666875 2023-10-04 09:04:24,549 WARNING [train.py:1204] (2/4) Exclude cut with ID _64521_210_14699_1_1535243427259_3815328_368-195106-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 09:04:25,957 WARNING [train.py:1204] (2/4) Exclude cut with ID _28394_210_8913_1_1532430049518_3650118_51-334124-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 09:04:29,193 WARNING [train.py:1204] (2/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_317-161769-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 09:04:30,601 WARNING [train.py:1204] (2/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_40-129756-0_sp0.9 from training. Duration: 0.9 2023-10-04 09:04:32,450 WARNING [train.py:1204] (2/4) Exclude cut with ID _3067_210_2667_1_1526212974640_4503168_119-175776-0_sp0.9 from training. Duration: 0.82225 2023-10-04 09:04:32,502 WARNING [train.py:1204] (2/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_1004-304915-0_sp1.1 from training. Duration: 0.92725 2023-10-04 09:04:34,267 WARNING [train.py:1204] (2/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_359-118926-0_sp0.9 from training. Duration: 0.8 2023-10-04 09:04:35,621 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.663e+02 2.012e+02 2.248e+02 2.524e+02 3.712e+02, threshold=4.497e+02, percent-clipped=0.0 2023-10-04 09:04:42,568 INFO [train.py:1046] (2/4) Epoch 46, batch 900, loss[loss=0.1586, simple_loss=0.2274, pruned_loss=0.0449, over 23826.00 frames. ], tot_loss[loss=0.154, simple_loss=0.2342, pruned_loss=0.03694, over 4662769.76 frames. ], batch size: 195, lr: 2.21e-03, grad_scale: 16.0 2023-10-04 09:04:42,610 WARNING [train.py:1204] (2/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_51-200220-0_sp0.9 from training. Duration: 0.62225 2023-10-04 09:04:42,686 WARNING [train.py:1204] (2/4) Exclude cut with ID _46683_210_9642_1_1533730913492_4591116_530-326202-0_sp1.1 from training. Duration: 0.936375 2023-10-04 09:04:42,732 WARNING [train.py:1204] (2/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_567-135114-0 from training. Duration: 0.87 2023-10-04 09:04:44,597 WARNING [train.py:1204] (2/4) Exclude cut with ID _66863_210_18053_1_1535698758334_8590319_1092-271784-0_sp1.1 from training. Duration: 0.87275 2023-10-04 09:04:44,613 WARNING [train.py:1204] (2/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_762-282614-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 09:04:47,333 WARNING [train.py:1204] (2/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_280-120942-0 from training. Duration: 0.77 2023-10-04 09:04:49,215 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.4.encoder.layers.0.nonlin_attention.whiten2, num_groups=1, num_channels=384, metric=9.56 vs. limit=15.0 2023-10-04 09:04:51,498 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_31583_210_3852_1_1532595851_4024413_27-170931-0_sp1.1 from training. Duration: 0.963625 2023-10-04 09:04:53,064 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.conv_module1.balancer2.prob, batch_count=1599640.0, ans=0.125 2023-10-04 09:04:55,653 WARNING [train.py:1204] (2/4) Exclude cut with ID _12369_210_4381_1_1530356078581_4388520_266-271628-0_sp1.1 from training. Duration: 0.92725 2023-10-04 09:04:55,702 WARNING [train.py:1204] (2/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_41-31157-0 from training. Duration: 0.57 2023-10-04 09:04:58,501 WARNING [train.py:1204] (2/4) Exclude cut with ID _21878_210_3525_1_1531738772002_7264330_113-18163-0_sp1.1 from training. Duration: 0.8090625 2023-10-04 09:04:58,536 WARNING [train.py:1204] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_147-111137-0 from training. Duration: 0.95 2023-10-04 09:04:58,741 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.conv_module2.balancer1.max_abs, batch_count=1599706.6666666667, ans=10.0 2023-10-04 09:04:59,915 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1083-220122-0_sp1.1 from training. Duration: 0.463625 2023-10-04 09:05:01,285 WARNING [train.py:1204] (2/4) Exclude cut with ID _14884_210_2252_1_1530707459689_5561250_591-173406-0_sp1.1 from training. Duration: 0.87275 2023-10-04 09:05:01,291 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_24677_210_1794_1_1532002693_4341296_149-303793-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 09:05:01,340 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_12868_210_6753_1_1530702479_3610564_133-564-0_sp1.1 from training. Duration: 0.7181875 2023-10-04 09:05:01,361 WARNING [train.py:1204] (2/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_625-338546-0_sp0.9 from training. Duration: 0.97775 2023-10-04 09:05:10,706 WARNING [train.py:1204] (2/4) Exclude cut with ID _25882_210_3036_1_1532775665815_6479510_624-320169-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 09:05:10,713 WARNING [train.py:1204] (2/4) Exclude cut with ID _20622_210_3852_1_1531622427080_1093839_127-325326-0_sp1.1 from training. Duration: 0.92725 2023-10-04 09:05:11,897 WARNING [train.py:1204] (2/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_464-33931-0_sp1.1 from training. Duration: 0.4909375 2023-10-04 09:05:13,353 WARNING [train.py:1204] (2/4) Exclude cut with ID _66112_210_10635_1_1535590236305_3801484_120-304588-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 09:05:18,068 WARNING [train.py:1204] (2/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_110-130961-0 from training. Duration: 0.47 2023-10-04 09:05:19,386 WARNING [train.py:1204] (2/4) Exclude cut with ID _16908_210_5724_1_1531119157248_3772700_64-54020-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 09:05:23,556 WARNING [train.py:1204] (2/4) Exclude cut with ID _4068_210_2629_1_1526697350395_1546259_224-288693-0_sp1.1 from training. Duration: 0.536375 2023-10-04 09:05:24,870 WARNING [train.py:1204] (2/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_191-41023-0_sp0.9 from training. Duration: 0.788875 2023-10-04 09:05:24,937 WARNING [train.py:1204] (2/4) Exclude cut with ID ID0205W0255-783002 from training. Duration: 0.958 2023-10-04 09:05:26,764 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_162-262717-0 from training. Duration: 0.83 2023-10-04 09:05:32,850 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_224-322394-0_sp1.1 from training. Duration: 0.6 2023-10-04 09:05:32,863 WARNING [train.py:1204] (2/4) Exclude cut with ID _4866_210_2577_1_1527404412284_4364659_133-1696-0_sp1.1 from training. Duration: 0.9 2023-10-04 09:05:32,920 WARNING [train.py:1204] (2/4) Exclude cut with ID _6275_210_2084_1_1528110062213_7669049_973-198085-0_sp1.1 from training. Duration: 0.6818125 2023-10-04 09:05:37,831 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_47449_210_5399_1_1534076445_4006929_410-237806-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 09:05:37,840 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7543_210_6753_1_1528543318_4552_1-85072-0_sp0.9 from training. Duration: 0.92225 2023-10-04 09:05:40,466 WARNING [train.py:1204] (2/4) Exclude cut with ID _65263_210_22356_1_1535763598691_3921037_108-21902-0 from training. Duration: 0.99 2023-10-04 09:05:40,716 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.bypass.scale_min, batch_count=1599906.6666666667, ans=0.2 2023-10-04 09:05:41,725 WARNING [train.py:1204] (2/4) Exclude cut with ID _65585_210_12210_1_1535450451584_3764098_309-174818-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 09:05:42,073 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.attention_skip_rate, batch_count=1599906.6666666667, ans=0.0 2023-10-04 09:05:43,603 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_309-65177-0 from training. Duration: 0.88 2023-10-04 09:05:44,989 WARNING [train.py:1204] (2/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_363-185703-0_sp1.1 from training. Duration: 0.863625 2023-10-04 09:05:46,306 WARNING [train.py:1204] (2/4) Exclude cut with ID _42326_210_7770_1_1533814282531_3707269_442-238692-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 09:05:47,643 WARNING [train.py:1204] (2/4) Exclude cut with ID _69884_210_9771_1_1535846392818_4166169_226-254446-0_sp1.1 from training. Duration: 0.936375 2023-10-04 09:05:47,673 WARNING [train.py:1204] (2/4) Exclude cut with ID _29853_210_7497_1_1532505600851_6642110_287-299903-0_sp1.1 from training. Duration: 0.963625 2023-10-04 09:05:49,157 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.balancer2.prob, batch_count=1599906.6666666667, ans=0.125 2023-10-04 09:05:51,818 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1015-343884-0 from training. Duration: 0.62 2023-10-04 09:05:51,863 WARNING [train.py:1204] (2/4) Exclude cut with ID ID0305W0008-833402 from training. Duration: 0.91 2023-10-04 09:05:53,254 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6673_210_3273_1_1527767790_3173240_351-167954-0_sp1.1 from training. Duration: 0.436375 2023-10-04 09:05:53,270 WARNING [train.py:1204] (2/4) Exclude cut with ID _6456_210_1534_1_1527681634212_3181049_345-252877-0 from training. Duration: 0.64 2023-10-04 09:05:54,721 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.0.attention_skip_rate, batch_count=1599973.3333333333, ans=0.0 2023-10-04 09:05:55,827 INFO [train.py:1046] (2/4) Epoch 46, batch 950, loss[loss=0.1484, simple_loss=0.2402, pruned_loss=0.02836, over 24585.00 frames. ], tot_loss[loss=0.1547, simple_loss=0.2349, pruned_loss=0.03722, over 4661445.79 frames. ], batch size: 71, lr: 2.21e-03, grad_scale: 16.0 2023-10-04 09:05:56,137 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.self_attn_weights.pos_emb_skip_rate, batch_count=1599973.3333333333, ans=0.0 2023-10-04 09:05:57,330 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_13-205182-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 09:06:04,541 WARNING [train.py:1204] (2/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_427-208901-0 from training. Duration: 0.68 2023-10-04 09:06:07,901 WARNING [train.py:1204] (2/4) Exclude cut with ID _49039_210_6315_1_1533967537541_3846118_9-148921-0_sp1.1 from training. Duration: 0.97275 2023-10-04 09:06:10,726 WARNING [train.py:1204] (2/4) Exclude cut with ID _59493_210_17282_1_1535078419077_3225879_143-70317-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 09:06:10,756 WARNING [train.py:1204] (2/4) Exclude cut with ID _27967_210_12140_1_1532588391859_8419130_1056-236369-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 09:06:10,817 WARNING [train.py:1204] (2/4) Exclude cut with ID _6537_210_1883_1_1527987559710_3597129_382-133062-0_sp0.9 from training. Duration: 0.7555625 2023-10-04 09:06:13,486 WARNING [train.py:1204] (2/4) Exclude cut with ID ID0376W0255-869957 from training. Duration: 0.9510625 2023-10-04 09:06:18,094 WARNING [train.py:1204] (2/4) Exclude cut with ID _49371_210_12071_1_1533952761021_3812190_483-240691-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 09:06:18,163 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_12868_210_6753_1_1530701932_530275_18-76322-0_sp1.1 from training. Duration: 0.7909375 2023-10-04 09:06:18,213 WARNING [train.py:1204] (2/4) Exclude cut with ID _53950_210_5816_1_1534417264919_3862951_367-26344-0_sp1.1 from training. Duration: 0.97275 2023-10-04 09:06:19,545 WARNING [train.py:1204] (2/4) Exclude cut with ID _5444_210_2151_1_1527418808464_5312210_634-194174-0_sp1.1 from training. Duration: 0.8818125 2023-10-04 09:06:19,562 WARNING [train.py:1204] (2/4) Exclude cut with ID _56494_210_4220_1_1535284837625_3033169_69-138150-0 from training. Duration: 0.94 2023-10-04 09:06:19,665 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7543_210_6753_1_1528544010_1110402_25-183860-0_sp0.9 from training. Duration: 0.52225 2023-10-04 09:06:19,824 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.0.balancer1.prob, batch_count=1600040.0, ans=0.125 2023-10-04 09:06:21,049 WARNING [train.py:1204] (2/4) Exclude cut with ID _40284_210_2084_1_1533513679814_3704279_287-191918-0_sp1.1 from training. Duration: 0.963625 2023-10-04 09:06:22,450 WARNING [train.py:1204] (2/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_291-270881-0 from training. Duration: 0.97 2023-10-04 09:06:22,670 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.feed_forward2.hidden_balancer.prob, batch_count=1600040.0, ans=0.125 2023-10-04 09:06:23,825 WARNING [train.py:1204] (2/4) Exclude cut with ID _12555_210_8750_1_1530075594402_3780239_449-267986-0_sp0.9 from training. Duration: 0.92225 2023-10-04 09:06:27,705 WARNING [train.py:1204] (2/4) Exclude cut with ID _3992_210_1747_1_1526644816031_3731060_268-64520-0_sp1.1 from training. Duration: 0.963625 2023-10-04 09:06:27,718 WARNING [train.py:1204] (2/4) Exclude cut with ID _39411_210_5986_1_1533538825030_3343649_173-238248-0_sp0.9 from training. Duration: 0.92225 2023-10-04 09:06:27,859 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.out_combiner.scale_min, batch_count=1600106.6666666667, ans=0.2 2023-10-04 09:06:28,973 WARNING [train.py:1204] (2/4) Exclude cut with ID _32704_210_8954_1_1533017488840_6747370_828-313097-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 09:06:29,062 WARNING [train.py:1204] (2/4) Exclude cut with ID _26907_210_7234_1_1532221740694_3205263_337-2494-0 from training. Duration: 0.88 2023-10-04 09:06:32,093 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_229-45300-0_sp1.1 from training. Duration: 0.5181875 2023-10-04 09:06:34,917 WARNING [train.py:1204] (2/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_300-27235-0_sp1.1 from training. Duration: 0.7909375 2023-10-04 09:06:36,868 WARNING [train.py:1204] (2/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_48-338976-0_sp1.1 from training. Duration: 0.8090625 2023-10-04 09:06:40,333 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_13091_210_7925_1_1530609447_8987354_792-195353-0_sp1.1 from training. Duration: 0.936375 2023-10-04 09:06:40,339 WARNING [train.py:1204] (2/4) Exclude cut with ID _56524_210_9642_1_1534572011779_3619170_311-54500-0_sp1.1 from training. Duration: 0.97275 2023-10-04 09:06:42,857 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7543_210_6753_1_1528544010_1110402_25-183860-0 from training. Duration: 0.47 2023-10-04 09:06:44,307 WARNING [train.py:1204] (2/4) Exclude cut with ID 2411-132532-0017-82279-0_sp1.1 from training. Duration: 0.9681875 2023-10-04 09:06:44,307 WARNING [train.py:1204] (2/4) Exclude cut with ID _14170_210_5219_1_1530525949868_4469650_272-249985-0_sp0.9 from training. Duration: 0.7444375 2023-10-04 09:06:45,688 WARNING [train.py:1204] (2/4) Exclude cut with ID _55644_210_11856_1_1534593599582_4103719_320-229006-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 09:06:47,005 WARNING [train.py:1204] (2/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_177-100554-0_sp1.1 from training. Duration: 0.963625 2023-10-04 09:06:47,007 WARNING [train.py:1204] (2/4) Exclude cut with ID _14998_210_5219_1_1530705582080_7474970_787-294416-0_sp1.1 from training. Duration: 0.7454375 2023-10-04 09:06:51,802 WARNING [train.py:1204] (2/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_318-59097-0 from training. Duration: 0.76 2023-10-04 09:06:51,871 WARNING [train.py:1204] (2/4) Exclude cut with ID _12369_210_4381_1_1530356078581_4388520_480-13146-0_sp1.1 from training. Duration: 0.77275 2023-10-04 09:06:54,541 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_469-338558-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 09:06:54,589 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_40896_210_15710_1_1533433956_7100802_367-126897-0_sp1.1 from training. Duration: 0.963625 2023-10-04 09:06:54,615 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6545_210_2040_1_1527774670_4245258_379-14182-0 from training. Duration: 0.8 2023-10-04 09:06:54,626 WARNING [train.py:1204] (2/4) Exclude cut with ID _21631_210_2253_1_1531907880778_3160339_197-79498-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 09:06:54,637 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_12868_210_6753_1_1530702479_3610564_416-293746-0_sp1.1 from training. Duration: 0.8181875 2023-10-04 09:06:55,957 WARNING [train.py:1204] (2/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_52-211683-0 from training. Duration: 0.9 2023-10-04 09:06:59,368 WARNING [train.py:1204] (2/4) Exclude cut with ID _48678_210_5663_1_1533988830947_3769199_306-255886-0_sp0.9 from training. Duration: 0.8555625 2023-10-04 09:07:02,103 WARNING [train.py:1204] (2/4) Exclude cut with ID _65134_210_14763_1_1535335393985_3505368_296-24733-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 09:07:05,366 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.677e+02 2.058e+02 2.335e+02 2.951e+02 5.020e+02, threshold=4.671e+02, percent-clipped=3.0 2023-10-04 09:07:06,889 WARNING [train.py:1204] (2/4) Exclude cut with ID _14898_210_9206_1_1530766812377_4177190_335-99364-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 09:07:08,286 WARNING [train.py:1204] (2/4) Exclude cut with ID _54871_210_22356_1_1534665798253_3856359_19-342632-0 from training. Duration: 0.91 2023-10-04 09:07:08,309 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_53071_210_16554_1_1534490405_2954066_265-170472-0 from training. Duration: 0.83 2023-10-04 09:07:12,944 INFO [train.py:1046] (2/4) Epoch 46, batch 1000, loss[loss=0.1463, simple_loss=0.2221, pruned_loss=0.03527, over 23560.00 frames. ], tot_loss[loss=0.1538, simple_loss=0.234, pruned_loss=0.03679, over 4684971.46 frames. ], batch size: 134, lr: 2.21e-03, grad_scale: 16.0 2023-10-04 09:07:13,050 WARNING [train.py:1204] (2/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_189-298172-0_sp1.1 from training. Duration: 0.963625 2023-10-04 09:07:17,225 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7615_210_4211_1_1528110965_4580160_72-235211-0 from training. Duration: 0.76 2023-10-04 09:07:18,480 WARNING [train.py:1204] (2/4) Exclude cut with ID _47790_210_16187_1_1533862814066_4207519_155-90874-0_sp1.1 from training. Duration: 0.936375 2023-10-04 09:07:21,665 WARNING [train.py:1204] (2/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_164-216310-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 09:07:21,761 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_46013_210_7234_1_1533776362_3744032_463-52378-0 from training. Duration: 0.86 2023-10-04 09:07:21,764 WARNING [train.py:1204] (2/4) Exclude cut with ID _61133_210_9969_1_1535196777227_3696409_333-270325-0 from training. Duration: 0.93 2023-10-04 09:07:28,477 WARNING [train.py:1204] (2/4) Exclude cut with ID _5172_210_1883_1_1527246062567_3577219_18-101380-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 09:07:28,490 WARNING [train.py:1204] (2/4) Exclude cut with ID _30800_210_22382_1_1532499347904_4515959_577-236858-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 09:07:30,337 WARNING [train.py:1204] (2/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_1006-177345-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 09:07:30,563 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=1600373.3333333333, ans=0.1 2023-10-04 09:07:33,274 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6291_210_3273_1_1527683649_1078498_4-266348-0 from training. Duration: 0.78 2023-10-04 09:07:36,576 WARNING [train.py:1204] (2/4) Exclude cut with ID _55857_210_2346_1_1534644007831_6898209_745-98391-0 from training. Duration: 0.76 2023-10-04 09:07:38,095 WARNING [train.py:1204] (2/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_375-244976-0 from training. Duration: 0.75 2023-10-04 09:07:38,133 WARNING [train.py:1204] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_4-248404-0_sp1.1 from training. Duration: 0.97275 2023-10-04 09:07:38,258 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8453_210_4211_1_1528502940_8562730_596-306820-0 from training. Duration: 0.97 2023-10-04 09:07:39,635 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_211-339622-0_sp0.9 from training. Duration: 0.5 2023-10-04 09:07:39,675 WARNING [train.py:1204] (2/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_393-57888-0 from training. Duration: 0.64 2023-10-04 09:07:41,668 WARNING [train.py:1204] (2/4) Exclude cut with ID _23825_210_13449_1_1531915313526_3497150_143-343575-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 09:07:41,916 INFO [scaling.py:1118] (2/4) WithLoss: name=encoder.encoders.2.encoder.layers.2.self_attn_weights, loss-sum=0.000e+00 2023-10-04 09:07:43,005 WARNING [train.py:1204] (2/4) Exclude cut with ID _58605_210_12210_1_1534723195939_3904259_36-12256-0_sp1.1 from training. Duration: 0.936375 2023-10-04 09:07:44,694 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.ff3_skip_rate, batch_count=1600440.0, ans=0.0 2023-10-04 09:07:47,447 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.feed_forward2.hidden_balancer.prob, batch_count=1600440.0, ans=0.125 2023-10-04 09:07:51,768 WARNING [train.py:1204] (2/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_38-67795-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 09:07:51,835 WARNING [train.py:1204] (2/4) Exclude cut with ID _64631_210_27767_1_1535349580068_4165160_308-327832-0_sp1.1 from training. Duration: 0.92725 2023-10-04 09:07:54,447 WARNING [train.py:1204] (2/4) Exclude cut with ID _37086_210_9038_1_1532999540891_7021250_726-81176-0_sp1.1 from training. Duration: 0.936375 2023-10-04 09:07:54,497 WARNING [train.py:1204] (2/4) Exclude cut with ID _47790_210_16187_1_1533862814066_4207519_47-141521-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 09:07:54,502 WARNING [train.py:1204] (2/4) Exclude cut with ID _3067_210_2667_1_1526212974640_4503168_250-285937-0 from training. Duration: 0.8 2023-10-04 09:07:54,536 WARNING [train.py:1204] (2/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_239-24000-0_sp1.1 from training. Duration: 0.97275 2023-10-04 09:07:55,863 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_3371_210_1794_1_1526384462_5580777_396-339812-0_sp0.9 from training. Duration: 0.9444375 2023-10-04 09:07:57,184 WARNING [train.py:1204] (2/4) Exclude cut with ID _16909_210_5724_1_1531292079912_4118919_580-173454-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 09:07:57,239 WARNING [train.py:1204] (2/4) Exclude cut with ID IC0124W0002-61308 from training. Duration: 0.982 2023-10-04 09:07:58,860 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.bypass.scale_min, batch_count=1600506.6666666667, ans=0.2 2023-10-04 09:07:58,864 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=1600506.6666666667, ans=0.1 2023-10-04 09:08:00,670 WARNING [train.py:1204] (2/4) Exclude cut with ID _31602_210_5854_1_1532677021456_4458250_249-96604-0 from training. Duration: 0.89 2023-10-04 09:08:02,162 WARNING [train.py:1204] (2/4) Exclude cut with ID _69584_210_5219_1_1535785133775_4044450_325-7656-0 from training. Duration: 0.86 2023-10-04 09:08:02,288 WARNING [train.py:1204] (2/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_478-162247-0 from training. Duration: 0.82 2023-10-04 09:08:05,004 WARNING [train.py:1204] (2/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_686-54383-0_sp1.1 from training. Duration: 0.7909375 2023-10-04 09:08:11,484 WARNING [train.py:1204] (2/4) Exclude cut with ID _29303_210_2575_1_1532392237303_7410649_85-270092-0_sp1.1 from training. Duration: 0.936375 2023-10-04 09:08:11,495 WARNING [train.py:1204] (2/4) Exclude cut with ID _2322_210_2667_1_1525604548971_3656299_440-80494-0_sp1.1 from training. Duration: 0.8 2023-10-04 09:08:11,538 WARNING [train.py:1204] (2/4) Exclude cut with ID _47346_210_9476_1_1533815971553_3536160_201-40644-0_sp1.1 from training. Duration: 0.936375 2023-10-04 09:08:13,308 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.5.encoder.layers.1.conv_module1.whiten, num_groups=1, num_channels=256, metric=3.23 vs. limit=15.0 2023-10-04 09:08:14,210 WARNING [train.py:1204] (2/4) Exclude cut with ID _56235_210_10640_1_1534557335235_4161099_343-139898-0_sp1.1 from training. Duration: 0.87275 2023-10-04 09:08:15,582 WARNING [train.py:1204] (2/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_1012-279473-0 from training. Duration: 0.67 2023-10-04 09:08:16,983 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7931_210_1794_1_1528594728_5564059_280-279560-0_sp1.1 from training. Duration: 0.8909375 2023-10-04 09:08:17,032 WARNING [train.py:1204] (2/4) Exclude cut with ID _8263_210_1664_1_1528439452891_970220_68-181805-0 from training. Duration: 0.64 2023-10-04 09:08:18,306 WARNING [train.py:1204] (2/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_605-133395-0 from training. Duration: 0.66 2023-10-04 09:08:19,657 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_43-171109-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 09:08:19,660 WARNING [train.py:1204] (2/4) Exclude cut with ID _40094_210_16380_1_1533294005773_3641159_107-280739-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 09:08:22,920 WARNING [train.py:1204] (2/4) Exclude cut with ID _14821_210_8750_1_1530680503991_6829359_2-227151-0_sp0.9 from training. Duration: 0.92225 2023-10-04 09:08:24,455 WARNING [train.py:1204] (2/4) Exclude cut with ID _14268_210_9206_1_1530622771285_4714099_331-105351-0_sp1.1 from training. Duration: 0.5909375 2023-10-04 09:08:25,894 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_26326_210_7925_1_1533002493_3592406_451-82741-0_sp1.1 from training. Duration: 0.97275 2023-10-04 09:08:26,165 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=1600640.0, ans=0.1 2023-10-04 09:08:27,226 INFO [train.py:1046] (2/4) Epoch 46, batch 1050, loss[loss=0.1618, simple_loss=0.2498, pruned_loss=0.03691, over 24382.00 frames. ], tot_loss[loss=0.1522, simple_loss=0.2322, pruned_loss=0.03606, over 4694738.14 frames. ], batch size: 77, lr: 2.21e-03, grad_scale: 16.0 2023-10-04 09:08:28,610 WARNING [train.py:1204] (2/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_191-59784-0_sp0.9 from training. Duration: 0.97775 2023-10-04 09:08:28,711 WARNING [train.py:1204] (2/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_445-117398-0_sp1.1 from training. Duration: 0.8090625 2023-10-04 09:08:30,185 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.ff2_skip_rate, batch_count=1600640.0, ans=0.0 2023-10-04 09:08:31,347 WARNING [train.py:1204] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_605-257205-0_sp0.9 from training. Duration: 0.6555625 2023-10-04 09:08:31,416 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_50050_210_16223_1_1535435796_7290138_592-258841-0_sp1.1 from training. Duration: 0.936375 2023-10-04 09:08:33,277 WARNING [train.py:1204] (2/4) Exclude cut with ID _14348_210_8341_1_1530599434996_3778640_240-232370-0_sp0.9 from training. Duration: 0.9333125 2023-10-04 09:08:34,688 WARNING [train.py:1204] (2/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_239-316802-0_sp0.9 from training. Duration: 0.7666875 2023-10-04 09:08:36,096 WARNING [train.py:1204] (2/4) Exclude cut with ID _45560_210_21982_1_1533812603951_3584999_111-6376-0_sp1.1 from training. Duration: 0.763625 2023-10-04 09:08:39,243 WARNING [train.py:1204] (2/4) Exclude cut with ID _54189_210_16380_1_1534332670039_3911710_606-311481-0_sp1.1 from training. Duration: 0.963625 2023-10-04 09:08:39,308 WARNING [train.py:1204] (2/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_237-332027-0_sp1.1 from training. Duration: 0.863625 2023-10-04 09:08:40,639 WARNING [train.py:1204] (2/4) Exclude cut with ID _66070_210_7640_1_1535441393096_7805310_489-292631-0_sp0.9 from training. Duration: 0.87775 2023-10-04 09:08:40,723 WARNING [train.py:1204] (2/4) Exclude cut with ID _49728_210_12651_1_1534041034294_6788150_624-195301-0_sp1.1 from training. Duration: 0.77275 2023-10-04 09:08:42,031 WARNING [train.py:1204] (2/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_44-79427-0 from training. Duration: 0.98 2023-10-04 09:08:43,345 WARNING [train.py:1204] (2/4) Exclude cut with ID _35350_210_16040_1_1533040191183_4075299_490-198354-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 09:08:43,384 WARNING [train.py:1204] (2/4) Exclude cut with ID _5166_210_2084_1_1527073197160_4087993_309-222545-0 from training. Duration: 0.84 2023-10-04 09:08:47,389 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_13091_210_7925_1_1530609447_8987354_790-199469-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 09:08:47,395 WARNING [train.py:1204] (2/4) Exclude cut with ID _21632_210_2253_1_1531994001765_3744140_110-274654-0 from training. Duration: 0.82 2023-10-04 09:08:47,400 WARNING [train.py:1204] (2/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_498-40153-0_sp0.9 from training. Duration: 0.7 2023-10-04 09:08:51,870 WARNING [train.py:1204] (2/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_363-282909-0_sp1.1 from training. Duration: 0.936375 2023-10-04 09:08:53,761 WARNING [train.py:1204] (2/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_178-177582-0_sp0.9 from training. Duration: 0.82225 2023-10-04 09:08:53,772 WARNING [train.py:1204] (2/4) Exclude cut with ID _24612_210_8913_1_1531998054193_3806359_357-35487-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 09:08:55,261 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder_embed.convnext.out_balancer.prob, batch_count=1600773.3333333333, ans=0.125 2023-10-04 09:08:56,480 WARNING [train.py:1204] (2/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_138-332773-0 from training. Duration: 0.81 2023-10-04 09:08:56,508 WARNING [train.py:1204] (2/4) Exclude cut with ID _7624_210_1531_1_1528109802198_4933101_418-349834-0 from training. Duration: 0.6 2023-10-04 09:08:56,536 WARNING [train.py:1204] (2/4) Exclude cut with ID _7537_210_2084_1_1528502395431_3924359_14-312906-0_sp0.9 from training. Duration: 0.9333125 2023-10-04 09:09:01,112 WARNING [train.py:1204] (2/4) Exclude cut with ID _60057_210_10640_1_1534903065793_3333150_362-230377-0 from training. Duration: 0.87 2023-10-04 09:09:02,580 WARNING [train.py:1204] (2/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_138-64210-0 from training. Duration: 0.73 2023-10-04 09:09:04,576 WARNING [train.py:1204] (2/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_454-339504-0_sp1.1 from training. Duration: 0.97275 2023-10-04 09:09:07,988 WARNING [train.py:1204] (2/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_508-335369-0_sp0.9 from training. Duration: 0.5333125 2023-10-04 09:09:10,664 WARNING [train.py:1204] (2/4) Exclude cut with ID _5608_210_2106_1_1527415439089_6819768_909-82767-0_sp0.9 from training. Duration: 0.67775 2023-10-04 09:09:10,703 WARNING [train.py:1204] (2/4) Exclude cut with ID _6694_210_4599_1_1527852637490_3965529_99-11611-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 09:09:12,017 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_2655_210_2087_1_1525688959_3897400_52-279899-0_sp1.1 from training. Duration: 0.62725 2023-10-04 09:09:16,057 WARNING [train.py:1204] (2/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_375-245677-0_sp1.1 from training. Duration: 0.736375 2023-10-04 09:09:18,765 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_23850_210_4238_1_1532232597_1632433_32-185135-0 from training. Duration: 0.86 2023-10-04 09:09:21,637 WARNING [train.py:1204] (2/4) Exclude cut with ID _15113_210_8254_1_1530842438579_3716279_209-54533-0 from training. Duration: 0.96 2023-10-04 09:09:21,683 WARNING [train.py:1204] (2/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_401-53423-0 from training. Duration: 0.55 2023-10-04 09:09:21,711 WARNING [train.py:1204] (2/4) Exclude cut with ID _14867_210_1968_1_1530684179890_3846969_319-250868-0_sp1.1 from training. Duration: 0.9 2023-10-04 09:09:22,185 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.5.encoder.layers.0.self_attn1.whiten, num_groups=1, num_channels=256, metric=11.61 vs. limit=22.5 2023-10-04 09:09:23,010 WARNING [train.py:1204] (2/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_106-30782-0_sp0.9 from training. Duration: 0.888875 2023-10-04 09:09:23,313 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.nonlin_attention.balancer.prob, batch_count=1600840.0, ans=0.125 2023-10-04 09:09:24,431 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_5960_210_1794_1_1527403865_3990999_515-2613-0 from training. Duration: 0.88 2023-10-04 09:09:27,629 WARNING [train.py:1204] (2/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_798-106179-0_sp0.9 from training. Duration: 0.92225 2023-10-04 09:09:29,029 WARNING [train.py:1204] (2/4) Exclude cut with ID _14227_210_4381_1_1530701804286_3940259_125-275642-0_sp1.1 from training. Duration: 0.9 2023-10-04 09:09:29,031 WARNING [train.py:1204] (2/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_79-200392-0_sp1.1 from training. Duration: 0.8818125 2023-10-04 09:09:30,366 WARNING [train.py:1204] (2/4) Exclude cut with ID _48836_210_12210_1_1534075848293_3904150_429-35475-0_sp0.9 from training. Duration: 0.988875 2023-10-04 09:09:30,391 WARNING [train.py:1204] (2/4) Exclude cut with ID _69884_210_9771_1_1535846392818_4166169_224-172517-0_sp1.1 from training. Duration: 0.97275 2023-10-04 09:09:35,599 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.693e+02 1.988e+02 2.157e+02 2.390e+02 3.023e+02, threshold=4.315e+02, percent-clipped=0.0 2023-10-04 09:09:35,740 WARNING [train.py:1204] (2/4) Exclude cut with ID _15113_210_8254_1_1530842438579_3716279_76-232507-0_sp1.1 from training. Duration: 0.97275 2023-10-04 09:09:37,011 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_135-5501-0 from training. Duration: 0.98 2023-10-04 09:09:37,496 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.5.encoder.layers.1.conv_module2.whiten, num_groups=1, num_channels=256, metric=5.30 vs. limit=15.0 2023-10-04 09:09:38,424 WARNING [train.py:1204] (2/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_563-347832-0_sp0.9 from training. Duration: 0.988875 2023-10-04 09:09:38,440 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6786_210_1388_1_1527839699_4194495_273-149841-0 from training. Duration: 0.92 2023-10-04 09:09:39,741 WARNING [train.py:1204] (2/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_172-215932-0 from training. Duration: 0.66 2023-10-04 09:09:39,811 WARNING [train.py:1204] (2/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_939-132910-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 09:09:41,060 INFO [train.py:1046] (2/4) Epoch 46, batch 1100, loss[loss=0.1664, simple_loss=0.2493, pruned_loss=0.04173, over 24347.00 frames. ], tot_loss[loss=0.1518, simple_loss=0.2319, pruned_loss=0.03585, over 4704320.02 frames. ], batch size: 77, lr: 2.21e-03, grad_scale: 8.0 2023-10-04 09:09:41,445 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=1600973.3333333333, ans=0.1 2023-10-04 09:09:44,378 WARNING [train.py:1204] (2/4) Exclude cut with ID _49371_210_12071_1_1533952761021_3812190_16-246001-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 09:09:46,083 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.feed_forward1.out_proj.dropout_p, batch_count=1600973.3333333333, ans=0.1 2023-10-04 09:09:48,545 WARNING [train.py:1204] (2/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_680-189165-0_sp1.1 from training. Duration: 0.936375 2023-10-04 09:09:54,132 WARNING [train.py:1204] (2/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_53-300544-0_sp1.1 from training. Duration: 0.6090625 2023-10-04 09:09:55,559 WARNING [train.py:1204] (2/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_540-275685-0_sp1.1 from training. Duration: 0.8090625 2023-10-04 09:09:55,583 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_38965_210_4852_1_1533174906_3484735_453-106579-0_sp1.1 from training. Duration: 0.92725 2023-10-04 09:09:55,623 WARNING [train.py:1204] (2/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_105-328174-0 from training. Duration: 0.76 2023-10-04 09:09:57,128 WARNING [train.py:1204] (2/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_279-126205-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 09:10:00,130 WARNING [train.py:1204] (2/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_5-55893-0_sp0.9 from training. Duration: 0.611125 2023-10-04 09:10:02,755 WARNING [train.py:1204] (2/4) Exclude cut with ID _13678_210_7497_1_1530530069226_3463170_141-10835-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 09:10:06,088 WARNING [train.py:1204] (2/4) Exclude cut with ID _22083_210_8254_1_1531702862439_3678970_201-85738-0_sp1.1 from training. Duration: 0.8181875 2023-10-04 09:10:06,115 WARNING [train.py:1204] (2/4) Exclude cut with ID _4697_210_2069_1_1526815801953_3823098_51-1049-0 from training. Duration: 0.75 2023-10-04 09:10:08,009 WARNING [train.py:1204] (2/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_206-10097-0_sp0.9 from training. Duration: 0.5444375 2023-10-04 09:10:08,103 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_4528_210_3273_1_1527076654_3276777_360-63661-0_sp1.1 from training. Duration: 0.92725 2023-10-04 09:10:08,111 WARNING [train.py:1204] (2/4) Exclude cut with ID _64086_210_7994_1_1535270417019_7435949_449-31567-0_sp1.1 from training. Duration: 0.8818125 2023-10-04 09:10:08,250 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.0.bypass.skip_rate, batch_count=1601040.0, ans=0.035 2023-10-04 09:10:10,917 WARNING [train.py:1204] (2/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_245-174553-0_sp0.9 from training. Duration: 0.97775 2023-10-04 09:10:12,322 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6545_210_2040_1_1527774670_4245258_379-14182-0_sp1.1 from training. Duration: 0.72725 2023-10-04 09:10:17,042 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.1.conv_module2.balancer2.prob, batch_count=1601106.6666666667, ans=0.125 2023-10-04 09:10:18,221 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_3133_210_2087_1_1526106446_2324690_256-206070-0_sp1.1 from training. Duration: 0.963625 2023-10-04 09:10:19,862 WARNING [train.py:1204] (2/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_278-104676-0 from training. Duration: 0.98 2023-10-04 09:10:20,391 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.5.encoder.layers.0.conv_module1.whiten, num_groups=1, num_channels=256, metric=7.88 vs. limit=15.0 2023-10-04 09:10:21,207 WARNING [train.py:1204] (2/4) Exclude cut with ID IC0671W0065-331908 from training. Duration: 0.896 2023-10-04 09:10:21,258 WARNING [train.py:1204] (2/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_244-129917-0_sp1.1 from training. Duration: 0.97275 2023-10-04 09:10:22,651 WARNING [train.py:1204] (2/4) Exclude cut with ID _7852_210_5219_1_1528286471316_8139400_1129-170857-0_sp1.1 from training. Duration: 0.97275 2023-10-04 09:10:24,014 WARNING [train.py:1204] (2/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_443-30034-0_sp0.9 from training. Duration: 0.711125 2023-10-04 09:10:24,051 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_31583_210_3852_1_1532595851_4024413_250-332999-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 09:10:26,586 WARNING [train.py:1204] (2/4) Exclude cut with ID _8772_210_1850_1_1528635595972_3664011_247-336448-0 from training. Duration: 0.74 2023-10-04 09:10:26,648 WARNING [train.py:1204] (2/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_567-135114-0_sp0.9 from training. Duration: 0.9666875 2023-10-04 09:10:26,652 WARNING [train.py:1204] (2/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_557-349534-0_sp1.1 from training. Duration: 0.82725 2023-10-04 09:10:26,669 WARNING [train.py:1204] (2/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_1341-26728-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 09:10:27,999 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_40222_210_6228_1_1533385298_4481939_375-328051-0_sp1.1 from training. Duration: 0.97275 2023-10-04 09:10:28,029 WARNING [train.py:1204] (2/4) Exclude cut with ID _4697_210_2069_1_1526815801953_3823098_276-96593-0 from training. Duration: 0.77 2023-10-04 09:10:34,580 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_53071_210_16554_1_1534490405_2954066_265-170472-0_sp0.9 from training. Duration: 0.92225 2023-10-04 09:10:34,597 WARNING [train.py:1204] (2/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_365-1492-0 from training. Duration: 0.91 2023-10-04 09:10:36,046 WARNING [train.py:1204] (2/4) Exclude cut with ID _66873_210_5517_1_1535454136506_4293370_654-52713-0_sp0.9 from training. Duration: 0.8555625 2023-10-04 09:10:40,739 WARNING [train.py:1204] (2/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_113-92608-0_sp1.1 from training. Duration: 0.4909375 2023-10-04 09:10:42,287 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_46013_210_7234_1_1533776362_3744032_50-141535-0 from training. Duration: 0.99 2023-10-04 09:10:42,302 WARNING [train.py:1204] (2/4) Exclude cut with ID _13942_210_9988_1_1530604906036_3733360_499-55676-0_sp1.1 from training. Duration: 0.563625 2023-10-04 09:10:43,748 WARNING [train.py:1204] (2/4) Exclude cut with ID _30800_210_22382_1_1532499347904_4515959_375-65143-0_sp1.1 from training. Duration: 0.97275 2023-10-04 09:10:47,092 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.feed_forward1.out_proj.dropout_p, batch_count=1601240.0, ans=0.1 2023-10-04 09:10:48,109 WARNING [train.py:1204] (2/4) Exclude cut with ID _16756_210_4915_1_1531047803763_7104259_199-325608-0_sp1.1 from training. Duration: 0.92725 2023-10-04 09:10:48,135 WARNING [train.py:1204] (2/4) Exclude cut with ID _31649_210_6228_1_1532584608063_4091078_443-124391-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 09:10:49,478 WARNING [train.py:1204] (2/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_146-218763-0 from training. Duration: 0.86 2023-10-04 09:10:49,752 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.feed_forward2.hidden_balancer.prob, batch_count=1601240.0, ans=0.125 2023-10-04 09:10:50,805 WARNING [train.py:1204] (2/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_567-135114-0_sp1.1 from training. Duration: 0.7909375 2023-10-04 09:10:50,852 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_26326_210_7925_1_1533002493_3592406_462-288299-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 09:10:52,225 WARNING [train.py:1204] (2/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_322-217220-0 from training. Duration: 0.86 2023-10-04 09:10:52,260 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_238-256668-0_sp1.1 from training. Duration: 0.62725 2023-10-04 09:10:53,571 WARNING [train.py:1204] (2/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_371-18169-0 from training. Duration: 0.86 2023-10-04 09:10:54,823 INFO [train.py:1046] (2/4) Epoch 46, batch 1150, loss[loss=0.1247, simple_loss=0.2033, pruned_loss=0.02302, over 24291.00 frames. ], tot_loss[loss=0.1516, simple_loss=0.2319, pruned_loss=0.03563, over 4696826.65 frames. ], batch size: 56, lr: 2.21e-03, grad_scale: 8.0 2023-10-04 09:10:54,930 WARNING [train.py:1204] (2/4) Exclude cut with ID _58480_210_12392_1_1534811121111_3646160_23-74512-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 09:10:54,935 WARNING [train.py:1204] (2/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_784-213099-0_sp1.1 from training. Duration: 0.8454375 2023-10-04 09:10:55,037 WARNING [train.py:1204] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_625-102955-0_sp1.1 from training. Duration: 0.7 2023-10-04 09:10:59,246 WARNING [train.py:1204] (2/4) Exclude cut with ID _49748_210_24116_1_1533986971737_3158910_176-174327-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 09:11:01,535 WARNING [train.py:1204] (2/4) Exclude cut with ID _50310_210_15488_1_1534068043345_3641080_432-96140-0_sp1.1 from training. Duration: 0.936375 2023-10-04 09:11:04,648 WARNING [train.py:1204] (2/4) Exclude cut with ID _40402_210_18219_1_1533522324115_2953150_76-14910-0_sp1.1 from training. Duration: 0.92725 2023-10-04 09:11:04,678 WARNING [train.py:1204] (2/4) Exclude cut with ID _21783_210_11357_1_1531742458730_3762679_40-19458-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 09:11:04,718 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_14056_210_4163_1_1530604483_7572827_46-93547-0 from training. Duration: 0.95 2023-10-04 09:11:06,010 WARNING [train.py:1204] (2/4) Exclude cut with ID _21631_210_2253_1_1531907880778_3160339_321-35325-0_sp1.1 from training. Duration: 0.963625 2023-10-04 09:11:08,746 WARNING [train.py:1204] (2/4) Exclude cut with ID _4779_210_2084_1_1527318022944_3914893_500-285013-0 from training. Duration: 0.69 2023-10-04 09:11:10,095 WARNING [train.py:1204] (2/4) Exclude cut with ID _43552_210_8913_1_1533726031059_3523429_301-93324-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 09:11:11,417 WARNING [train.py:1204] (2/4) Exclude cut with ID _6473_210_1741_1_1527901307396_3815380_108-17335-0_sp1.1 from training. Duration: 0.5909375 2023-10-04 09:11:11,668 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.conv_skip_rate, batch_count=1601373.3333333333, ans=0.0 2023-10-04 09:11:15,573 WARNING [train.py:1204] (2/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_206-10097-0 from training. Duration: 0.49 2023-10-04 09:11:17,038 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_36756_210_7234_1_1533026991_966276_103-77513-0_sp1.1 from training. Duration: 0.97275 2023-10-04 09:11:21,466 WARNING [train.py:1204] (2/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_662-18193-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 09:11:21,529 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_26433_210_12108_1_1532157158_4056480_439-52438-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 09:11:21,702 INFO [scaling.py:1118] (2/4) WithLoss: name=encoder.encoders.0.layers.1.self_attn_weights, loss-sum=0.000e+00 2023-10-04 09:11:22,825 WARNING [train.py:1204] (2/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_464-33931-0 from training. Duration: 0.54 2023-10-04 09:11:22,840 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_3474_210_2028_1_1526188513_4825246_145-309131-0_sp0.9 from training. Duration: 0.82225 2023-10-04 09:11:22,861 WARNING [train.py:1204] (2/4) Exclude cut with ID _9679_210_7497_1_1529129212906_4363929_446-308321-0_sp1.1 from training. Duration: 0.963625 2023-10-04 09:11:27,035 WARNING [train.py:1204] (2/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_662-275621-0 from training. Duration: 0.42 2023-10-04 09:11:27,114 WARNING [train.py:1204] (2/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_344-337148-0_sp1.1 from training. Duration: 0.97275 2023-10-04 09:11:29,651 WARNING [train.py:1204] (2/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_759-179071-0_sp1.1 from training. Duration: 0.92725 2023-10-04 09:11:31,301 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.bypass.skip_rate, batch_count=1601440.0, ans=0.04949747468305833 2023-10-04 09:11:40,394 WARNING [train.py:1204] (2/4) Exclude cut with ID _28783_210_7943_1_1532671164808_7155479_293-12188-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 09:11:44,805 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=1601506.6666666667, ans=0.1 2023-10-04 09:11:45,931 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_17613_210_2151_1_1531137569_2366160_235-320304-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 09:11:45,953 WARNING [train.py:1204] (2/4) Exclude cut with ID _14349_210_8341_1_1530615710818_3442470_170-336541-0 from training. Duration: 0.86 2023-10-04 09:11:46,028 WARNING [train.py:1204] (2/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_375-218677-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 09:11:47,325 WARNING [train.py:1204] (2/4) Exclude cut with ID _28317_210_12742_1_1532480302381_8510325_829-202956-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 09:11:53,481 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9029W0436-509823 from training. Duration: 0.768 2023-10-04 09:11:54,916 WARNING [train.py:1204] (2/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_231-112214-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 09:12:00,489 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9059W0470-524883 from training. Duration: 0.3415625 2023-10-04 09:12:01,744 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.710e+02 2.018e+02 2.194e+02 2.403e+02 3.349e+02, threshold=4.388e+02, percent-clipped=0.0 2023-10-04 09:12:05,122 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_43266_210_1794_1_1533521789_4497218_206-79512-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 09:12:06,536 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_294-163793-0_sp0.9 from training. Duration: 0.911125 2023-10-04 09:12:06,564 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6290_210_3273_1_1527595224_1282787_115-87095-0_sp1.1 from training. Duration: 0.5 2023-10-04 09:12:06,612 WARNING [train.py:1204] (2/4) Exclude cut with ID _14100_210_8341_1_1530529320613_3678069_279-55760-0_sp1.1 from training. Duration: 0.8454375 2023-10-04 09:12:08,529 INFO [train.py:1046] (2/4) Epoch 46, batch 1200, loss[loss=0.1681, simple_loss=0.2437, pruned_loss=0.04627, over 23548.00 frames. ], tot_loss[loss=0.1522, simple_loss=0.2328, pruned_loss=0.03581, over 4708945.06 frames. ], batch size: 256, lr: 2.21e-03, grad_scale: 16.0 2023-10-04 09:12:11,238 WARNING [train.py:1204] (2/4) Exclude cut with ID _14621_210_1925_1_1530686901185_3675420_419-234200-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 09:12:15,431 WARNING [train.py:1204] (2/4) Exclude cut with ID _60229_210_27767_1_1534838406158_3655350_353-213656-0_sp1.1 from training. Duration: 0.863625 2023-10-04 09:12:15,440 WARNING [train.py:1204] (2/4) Exclude cut with ID _48678_210_5663_1_1533988830947_3769199_306-255886-0_sp1.1 from training. Duration: 0.7 2023-10-04 09:12:16,826 WARNING [train.py:1204] (2/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_466-3492-0_sp1.1 from training. Duration: 0.97275 2023-10-04 09:12:16,827 WARNING [train.py:1204] (2/4) Exclude cut with ID _8755_210_1876_1_1528628559357_3258209_199-242167-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 09:12:16,894 WARNING [train.py:1204] (2/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_237-89684-0_sp1.1 from training. Duration: 0.87275 2023-10-04 09:12:18,688 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.4.encoder.layers.2.conv_module2.whiten, num_groups=1, num_channels=384, metric=3.19 vs. limit=15.0 2023-10-04 09:12:19,573 WARNING [train.py:1204] (2/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_70-84121-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 09:12:21,006 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7544_210_6753_1_1528587722_3576686_172-171750-0_sp1.1 from training. Duration: 0.5909375 2023-10-04 09:12:21,736 WARNING [train.py:1204] (2/4) Exclude cut with ID _24271_210_12658_1_1531987270022_7190519_40-321822-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 09:12:22,982 WARNING [train.py:1204] (2/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_227-83612-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 09:12:24,428 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9156W0015-572508 from training. Duration: 0.896 2023-10-04 09:12:27,144 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_292-265928-0 from training. Duration: 0.51 2023-10-04 09:12:28,793 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.bypass.scale_min, batch_count=1601706.6666666667, ans=0.2 2023-10-04 09:12:29,998 WARNING [train.py:1204] (2/4) Exclude cut with ID _14747_210_8341_1_1530685797463_3654089_147-131229-0_sp1.1 from training. Duration: 0.6909375 2023-10-04 09:12:30,239 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.self_attn_weights.pos_emb_skip_rate, batch_count=1601706.6666666667, ans=0.0 2023-10-04 09:12:32,052 WARNING [train.py:1204] (2/4) Exclude cut with ID _45297_210_2721_1_1533715351381_3264949_351-147136-0_sp0.9 from training. Duration: 0.9666875 2023-10-04 09:12:34,686 WARNING [train.py:1204] (2/4) Exclude cut with ID _5250_210_5219_1_1527249561806_8692110_1102-324568-0_sp1.1 from training. Duration: 0.97275 2023-10-04 09:12:37,854 WARNING [train.py:1204] (2/4) Exclude cut with ID _18516_210_9206_1_1531704653206_3692406_465-75446-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 09:12:37,856 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9203W0126-595623 from training. Duration: 0.896 2023-10-04 09:12:37,937 WARNING [train.py:1204] (2/4) Exclude cut with ID _55383_210_10635_1_1534468426172_2231669_139-328410-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 09:12:38,154 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.ff3_skip_rate, batch_count=1601773.3333333333, ans=0.0 2023-10-04 09:12:45,483 WARNING [train.py:1204] (2/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_166-246557-0_sp0.9 from training. Duration: 0.711125 2023-10-04 09:12:45,487 WARNING [train.py:1204] (2/4) Exclude cut with ID _30039_210_13270_1_1532484612900_2725150_239-304872-0_sp1.1 from training. Duration: 0.963625 2023-10-04 09:12:45,510 WARNING [train.py:1204] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_6-226750-0 from training. Duration: 0.94 2023-10-04 09:12:45,586 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_135-5501-0_sp1.1 from training. Duration: 0.8909375 2023-10-04 09:12:45,811 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.feed_forward3.hidden_balancer.prob, batch_count=1601773.3333333333, ans=0.125 2023-10-04 09:12:47,850 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.0.layers.0.self_attn1.whiten, num_groups=1, num_channels=192, metric=11.05 vs. limit=22.5 2023-10-04 09:12:48,379 WARNING [train.py:1204] (2/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_359-118926-0 from training. Duration: 0.72 2023-10-04 09:12:50,667 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.balancer2.prob, batch_count=1601773.3333333333, ans=0.125 2023-10-04 09:12:52,936 WARNING [train.py:1204] (2/4) Exclude cut with ID _8064_210_5399_1_1528372299291_4282973_4-91037-0 from training. Duration: 0.88 2023-10-04 09:12:52,953 WARNING [train.py:1204] (2/4) Exclude cut with ID _6653_210_2629_1_1527919172441_6732679_415-288492-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 09:12:53,027 WARNING [train.py:1204] (2/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_544-287179-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 09:12:54,423 WARNING [train.py:1204] (2/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_122-32606-0_sp1.1 from training. Duration: 0.92725 2023-10-04 09:12:54,674 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.attention_skip_rate, batch_count=1601840.0, ans=0.0 2023-10-04 09:12:55,727 WARNING [train.py:1204] (2/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_121-284152-0_sp1.1 from training. Duration: 0.536375 2023-10-04 09:12:55,829 WARNING [train.py:1204] (2/4) Exclude cut with ID _44718_210_5816_1_1534050051869_6469030_350-299477-0_sp1.1 from training. Duration: 0.97275 2023-10-04 09:12:55,846 WARNING [train.py:1204] (2/4) Exclude cut with ID _6653_210_2629_1_1527919172441_6732679_1103-56445-0_sp0.9 from training. Duration: 0.811125 2023-10-04 09:12:57,129 WARNING [train.py:1204] (2/4) Exclude cut with ID _12369_210_4381_1_1530356078581_4388520_64-298917-0_sp0.9 from training. Duration: 0.97775 2023-10-04 09:12:58,818 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_2245_210_2715_1_1525511548_3540254_310-52483-0 from training. Duration: 0.88 2023-10-04 09:12:58,875 WARNING [train.py:1204] (2/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_401-158239-0_sp1.1 from training. Duration: 0.6454375 2023-10-04 09:12:58,909 WARNING [train.py:1204] (2/4) Exclude cut with ID _59502_210_12210_1_1535107024698_1835139_146-162495-0_sp1.1 from training. Duration: 0.836375 2023-10-04 09:12:58,926 WARNING [train.py:1204] (2/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_41-31157-0_sp0.9 from training. Duration: 0.6333125 2023-10-04 09:13:01,536 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_68717_210_3528_1_1535628683_1556759_95-36722-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 09:13:01,550 WARNING [train.py:1204] (2/4) Exclude cut with ID _30196_210_1664_1_1533708496522_7074140_654-162611-0_sp1.1 from training. Duration: 0.92725 2023-10-04 09:13:04,336 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_9302_210_2550_1_1528889962_4655416_638-301620-0_sp0.9 from training. Duration: 0.52225 2023-10-04 09:13:07,535 WARNING [train.py:1204] (2/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_478-162247-0_sp1.1 from training. Duration: 0.7454375 2023-10-04 09:13:09,555 WARNING [train.py:1204] (2/4) Exclude cut with ID _45297_210_2721_1_1533715351381_3264949_353-9541-0 from training. Duration: 0.97 2023-10-04 09:13:13,784 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9653W0190-675468 from training. Duration: 0.9935 2023-10-04 09:13:15,150 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_49730_210_13049_1_1534075294_1830676_214-84436-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 09:13:16,533 WARNING [train.py:1204] (2/4) Exclude cut with ID _50093_210_4220_1_1534417141207_3432299_259-322252-0_sp1.1 from training. Duration: 0.836375 2023-10-04 09:13:17,904 WARNING [train.py:1204] (2/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_599-50220-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 09:13:19,310 WARNING [train.py:1204] (2/4) Exclude cut with ID _21878_210_3525_1_1531738772002_7264330_174-109016-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 09:13:21,335 WARNING [train.py:1204] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_665-196155-0 from training. Duration: 0.67 2023-10-04 09:13:22,537 INFO [train.py:1046] (2/4) Epoch 46, batch 1250, loss[loss=0.1453, simple_loss=0.2297, pruned_loss=0.03048, over 24546.00 frames. ], tot_loss[loss=0.1531, simple_loss=0.2339, pruned_loss=0.03618, over 4710182.51 frames. ], batch size: 60, lr: 2.21e-03, grad_scale: 8.0 2023-10-04 09:13:25,339 WARNING [train.py:1204] (2/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_656-188361-0_sp1.1 from training. Duration: 0.7909375 2023-10-04 09:13:27,174 WARNING [train.py:1204] (2/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_60-18123-0_sp1.1 from training. Duration: 0.8 2023-10-04 09:13:27,252 WARNING [train.py:1204] (2/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_535-14410-0 from training. Duration: 0.47 2023-10-04 09:13:29,832 WARNING [train.py:1204] (2/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_94-126128-0_sp1.1 from training. Duration: 0.7818125 2023-10-04 09:13:31,321 WARNING [train.py:1204] (2/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_356-31216-0_sp1.1 from training. Duration: 0.6818125 2023-10-04 09:13:35,824 WARNING [train.py:1204] (2/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_443-30034-0_sp1.1 from training. Duration: 0.5818125 2023-10-04 09:13:35,906 WARNING [train.py:1204] (2/4) Exclude cut with ID _26907_210_7234_1_1532221740694_3205263_337-2494-0_sp1.1 from training. Duration: 0.8 2023-10-04 09:13:37,323 WARNING [train.py:1204] (2/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_49-97158-0_sp0.9 from training. Duration: 0.8444375 2023-10-04 09:13:37,340 WARNING [train.py:1204] (2/4) Exclude cut with ID _7877_210_6014_1_1528286358099_3750136_553-170128-0_sp1.1 from training. Duration: 0.963625 2023-10-04 09:13:40,496 WARNING [train.py:1204] (2/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_202-293523-0_sp0.9 from training. Duration: 0.8 2023-10-04 09:13:44,627 WARNING [train.py:1204] (2/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_188-171673-0_sp0.9 from training. Duration: 0.6444375 2023-10-04 09:13:44,642 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_120-41668-0_sp1.1 from training. Duration: 0.636375 2023-10-04 09:13:44,646 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_40223_210_6228_1_1533298404_4812267_613-78769-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 09:13:46,015 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_53861_210_5399_1_1534422156_4272077_150-336899-0_sp1.1 from training. Duration: 0.963625 2023-10-04 09:13:47,341 WARNING [train.py:1204] (2/4) Exclude cut with ID _14459_210_2228_1_1531443641946_7516700_1146-56843-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 09:13:51,423 WARNING [train.py:1204] (2/4) Exclude cut with ID _39867_210_9826_1_1533553222030_9344259_1194-299928-0_sp1.1 from training. Duration: 0.92725 2023-10-04 09:13:52,916 WARNING [train.py:1204] (2/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_214-291369-0_sp0.9 from training. Duration: 0.7 2023-10-04 09:13:56,368 WARNING [train.py:1204] (2/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_358-215558-0 from training. Duration: 0.93 2023-10-04 09:13:57,732 WARNING [train.py:1204] (2/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_577-287150-0_sp0.9 from training. Duration: 0.911125 2023-10-04 09:13:59,170 WARNING [train.py:1204] (2/4) Exclude cut with ID _60860_210_8254_1_1534896015875_3557169_229-187367-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 09:14:00,923 WARNING [train.py:1204] (2/4) Exclude cut with ID _6275_210_2084_1_1528110062213_7669049_857-298942-0 from training. Duration: 0.85 2023-10-04 09:14:00,972 WARNING [train.py:1204] (2/4) Exclude cut with ID _40137_210_12210_1_1533290484046_3795180_249-187059-0_sp1.1 from training. Duration: 0.8 2023-10-04 09:14:00,979 WARNING [train.py:1204] (2/4) Exclude cut with ID ID0171W0508-765603 from training. Duration: 0.541 2023-10-04 09:14:02,296 WARNING [train.py:1204] (2/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_521-219439-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 09:14:02,302 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_40223_210_6228_1_1533298404_4812267_741-302616-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 09:14:06,289 WARNING [train.py:1204] (2/4) Exclude cut with ID _65293_210_9070_1_1535545549765_4004210_438-154735-0_sp1.1 from training. Duration: 0.92725 2023-10-04 09:14:09,649 WARNING [train.py:1204] (2/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_564-132950-0_sp1.1 from training. Duration: 0.92725 2023-10-04 09:14:11,423 WARNING [train.py:1204] (2/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_291-270881-0_sp1.1 from training. Duration: 0.8818125 2023-10-04 09:14:11,656 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.conv_skip_rate, batch_count=1602173.3333333333, ans=0.0 2023-10-04 09:14:13,933 WARNING [train.py:1204] (2/4) Exclude cut with ID _60229_210_27767_1_1534838406158_3655350_353-213656-0 from training. Duration: 0.95 2023-10-04 09:14:13,945 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_3133_210_2087_1_1526106446_2324690_243-143119-0 from training. Duration: 0.77 2023-10-04 09:14:13,969 WARNING [train.py:1204] (2/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_690-126306-0 from training. Duration: 0.78 2023-10-04 09:14:15,561 WARNING [train.py:1204] (2/4) Exclude cut with ID _30304_210_6319_1_1532476827261_7582060_347-95003-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 09:14:17,114 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.0.feed_forward1.out_proj.dropout_p, batch_count=1602173.3333333333, ans=0.1 2023-10-04 09:14:18,436 WARNING [train.py:1204] (2/4) Exclude cut with ID _2322_210_2667_1_1525604548971_3656299_440-80494-0 from training. Duration: 0.88 2023-10-04 09:14:18,453 WARNING [train.py:1204] (2/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_370-229434-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 09:14:21,140 WARNING [train.py:1204] (2/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_124-345840-0_sp1.1 from training. Duration: 0.47275 2023-10-04 09:14:21,154 WARNING [train.py:1204] (2/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_165-123581-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 09:14:22,517 WARNING [train.py:1204] (2/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_453-282588-0 from training. Duration: 0.98 2023-10-04 09:14:22,541 WARNING [train.py:1204] (2/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_299-34413-0_sp0.9 from training. Duration: 0.72225 2023-10-04 09:14:23,811 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_40036_210_13405_1_1533648647_1315388_48-146016-0_sp1.1 from training. Duration: 0.8454375 2023-10-04 09:14:25,564 WARNING [train.py:1204] (2/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_99-167392-0_sp0.9 from training. Duration: 0.67775 2023-10-04 09:14:25,615 WARNING [train.py:1204] (2/4) Exclude cut with ID _48677_210_5663_1_1533902405919_3892989_37-207114-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 09:14:26,968 WARNING [train.py:1204] (2/4) Exclude cut with ID _29857_210_20034_1_1532395886753_3798160_335-163562-0 from training. Duration: 0.85 2023-10-04 09:14:28,405 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_36492_210_7927_1_1533261987_4790834_703-343351-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 09:14:29,980 WARNING [train.py:1204] (2/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_80-147301-0_sp0.9 from training. Duration: 0.9333125 2023-10-04 09:14:31,291 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.684e+02 2.091e+02 2.329e+02 2.674e+02 3.922e+02, threshold=4.659e+02, percent-clipped=0.0 2023-10-04 09:14:31,365 WARNING [train.py:1204] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_218-295097-0_sp0.9 from training. Duration: 0.8555625 2023-10-04 09:14:34,598 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_2295_210_2087_1_1525500993_2407863_116-238894-0_sp1.1 from training. Duration: 0.636375 2023-10-04 09:14:35,887 INFO [train.py:1046] (2/4) Epoch 46, batch 1300, loss[loss=0.1498, simple_loss=0.2192, pruned_loss=0.04025, over 22638.00 frames. ], tot_loss[loss=0.1543, simple_loss=0.2348, pruned_loss=0.03684, over 4704302.59 frames. ], batch size: 322, lr: 2.21e-03, grad_scale: 8.0 2023-10-04 09:14:36,064 WARNING [train.py:1204] (2/4) Exclude cut with ID _3231_210_2106_1_1526205864934_6784220_697-171068-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 09:14:36,089 WARNING [train.py:1204] (2/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_102-77064-0 from training. Duration: 0.93 2023-10-04 09:14:41,875 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_13091_210_7925_1_1530609447_8987354_1186-261957-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 09:14:42,679 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.1.conv_module1.whiten.whitening_limit, batch_count=1602306.6666666667, ans=15.0 2023-10-04 09:14:43,808 WARNING [train.py:1204] (2/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_91-277684-0_sp1.1 from training. Duration: 0.67275 2023-10-04 09:14:45,214 WARNING [train.py:1204] (2/4) Exclude cut with ID _31088_210_16446_1_1532520012284_3440919_159-313751-0_sp1.1 from training. Duration: 0.936375 2023-10-04 09:14:46,558 WARNING [train.py:1204] (2/4) Exclude cut with ID _42326_210_7770_1_1533814282531_3707269_227-203721-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 09:14:46,637 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_3531_210_3528_1_1526294203_5216456_309-227302-0_sp1.1 from training. Duration: 0.62725 2023-10-04 09:14:48,068 WARNING [train.py:1204] (2/4) Exclude cut with ID _14087_210_2253_1_1530529224512_3014029_226-242140-0 from training. Duration: 0.81 2023-10-04 09:14:52,934 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.conv_skip_rate, batch_count=1602373.3333333333, ans=0.0 2023-10-04 09:14:54,057 WARNING [train.py:1204] (2/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_122-109023-0_sp1.1 from training. Duration: 0.6909375 2023-10-04 09:14:55,446 WARNING [train.py:1204] (2/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_660-211088-0_sp0.9 from training. Duration: 0.688875 2023-10-04 09:14:58,234 WARNING [train.py:1204] (2/4) Exclude cut with ID _39290_210_4220_1_1533862778760_3075118_92-311600-0 from training. Duration: 0.88 2023-10-04 09:14:59,814 WARNING [train.py:1204] (2/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_390-339137-0_sp1.1 from training. Duration: 0.5090625 2023-10-04 09:15:03,943 WARNING [train.py:1204] (2/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_449-341946-0_sp1.1 from training. Duration: 0.963625 2023-10-04 09:15:04,019 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_68095_210_20185_1_1535718226_8888855_884-327007-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 09:15:05,817 WARNING [train.py:1204] (2/4) Exclude cut with ID _42326_210_7770_1_1533814282531_3707269_308-222998-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 09:15:07,273 WARNING [train.py:1204] (2/4) Exclude cut with ID _40399_210_4353_1_1533305658194_3279406_344-238708-0_sp1.1 from training. Duration: 0.963625 2023-10-04 09:15:08,614 WARNING [train.py:1204] (2/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_87-131427-0_sp1.1 from training. Duration: 0.7090625 2023-10-04 09:15:08,747 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.0.feed_forward3.hidden_balancer.prob, batch_count=1602440.0, ans=0.125 2023-10-04 09:15:09,958 WARNING [train.py:1204] (2/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_110-130961-0_sp0.9 from training. Duration: 0.52225 2023-10-04 09:15:10,003 WARNING [train.py:1204] (2/4) Exclude cut with ID _55153_210_2253_1_1534384786677_3162096_148-79022-0 from training. Duration: 0.96 2023-10-04 09:15:16,067 WARNING [train.py:1204] (2/4) Exclude cut with ID _9679_210_7497_1_1529129212906_4363929_512-313889-0_sp1.1 from training. Duration: 0.736375 2023-10-04 09:15:16,079 WARNING [train.py:1204] (2/4) Exclude cut with ID _7970_210_1883_1_1528455616296_3472230_25-178407-0_sp1.1 from training. Duration: 0.6454375 2023-10-04 09:15:18,133 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_3133_210_2087_1_1526106446_2324690_246-125000-0 from training. Duration: 0.62 2023-10-04 09:15:18,201 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_15126_210_6753_1_1530778677_2591185_113-157651-0_sp0.9 from training. Duration: 0.7555625 2023-10-04 09:15:19,558 WARNING [train.py:1204] (2/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_105-63309-0_sp1.1 from training. Duration: 0.8909375 2023-10-04 09:15:19,773 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.feed_forward1.hidden_balancer.prob, batch_count=1602506.6666666667, ans=0.125 2023-10-04 09:15:20,405 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.2.encoder.layers.2.feed_forward1.out_whiten, num_groups=1, num_channels=384, metric=6.31 vs. limit=15.0 2023-10-04 09:15:21,113 WARNING [train.py:1204] (2/4) Exclude cut with ID _29877_210_6228_1_1532397791106_5276149_578-177271-0_sp1.1 from training. Duration: 0.936375 2023-10-04 09:15:22,459 WARNING [train.py:1204] (2/4) Exclude cut with ID _44831_210_13427_1_1533816011549_3689035_32-188147-0 from training. Duration: 0.96 2023-10-04 09:15:24,387 WARNING [train.py:1204] (2/4) Exclude cut with ID _12369_210_4381_1_1530356078581_4388520_300-254337-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 09:15:24,405 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_128-241008-0 from training. Duration: 0.94 2023-10-04 09:15:25,640 WARNING [train.py:1204] (2/4) Exclude cut with ID _21878_210_3525_1_1531738772002_7264330_53-279178-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 09:15:29,666 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_41452_210_7291_1_1533437687_3941281_273-334088-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 09:15:29,668 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_128-241008-0_sp1.1 from training. Duration: 0.8545625 2023-10-04 09:15:29,914 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.balancer1.prob, batch_count=1602506.6666666667, ans=0.125 2023-10-04 09:15:32,523 WARNING [train.py:1204] (2/4) Exclude cut with ID _8394_210_6702_1_1528590504316_3540189_379-49457-0 from training. Duration: 0.87 2023-10-04 09:15:33,897 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_105-114797-0 from training. Duration: 0.84 2023-10-04 09:15:35,267 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_14928_210_5632_1_1530773461_4714000_454-112198-0 from training. Duration: 0.66 2023-10-04 09:15:39,760 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_456-22141-0_sp1.1 from training. Duration: 0.8 2023-10-04 09:15:43,128 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_115-233929-0 from training. Duration: 0.72 2023-10-04 09:15:44,485 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_616-16795-0_sp1.1 from training. Duration: 0.963625 2023-10-04 09:15:50,503 INFO [train.py:1046] (2/4) Epoch 46, batch 1350, loss[loss=0.1184, simple_loss=0.1838, pruned_loss=0.02649, over 22721.00 frames. ], tot_loss[loss=0.1539, simple_loss=0.234, pruned_loss=0.03696, over 4694695.64 frames. ], batch size: 322, lr: 2.21e-03, grad_scale: 8.0 2023-10-04 09:15:50,903 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.ff2_skip_rate, batch_count=1602640.0, ans=0.0 2023-10-04 09:15:52,004 WARNING [train.py:1204] (2/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_448-212135-0 from training. Duration: 0.91 2023-10-04 09:15:55,417 WARNING [train.py:1204] (2/4) Exclude cut with ID _46683_210_9642_1_1533730913492_4591116_201-205677-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 09:15:58,008 WARNING [train.py:1204] (2/4) Exclude cut with ID _64086_210_7994_1_1535270417019_7435949_43-80483-0_sp1.1 from training. Duration: 0.97275 2023-10-04 09:16:00,830 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_240-134124-0_sp1.1 from training. Duration: 0.963625 2023-10-04 09:16:00,866 WARNING [train.py:1204] (2/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_907-120940-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 09:16:02,277 WARNING [train.py:1204] (2/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_848-280957-0_sp1.1 from training. Duration: 0.7909375 2023-10-04 09:16:02,308 WARNING [train.py:1204] (2/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_243-42116-0_sp0.9 from training. Duration: 0.811125 2023-10-04 09:16:05,139 WARNING [train.py:1204] (2/4) Exclude cut with ID _6694_210_4599_1_1527852637490_3965529_116-109398-0_sp0.9 from training. Duration: 0.811125 2023-10-04 09:16:05,243 WARNING [train.py:1204] (2/4) Exclude cut with ID _24314_210_4915_1_1532135078116_7170179_521-258868-0 from training. Duration: 0.79 2023-10-04 09:16:07,318 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.self_attn_weights.pos_emb_skip_rate, batch_count=1602706.6666666667, ans=0.0 2023-10-04 09:16:08,456 WARNING [train.py:1204] (2/4) Exclude cut with ID _2220_210_1819_1_1525509109898_3658680_50-99837-0_sp0.9 from training. Duration: 0.6 2023-10-04 09:16:10,271 WARNING [train.py:1204] (2/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_313-134930-0_sp1.1 from training. Duration: 0.8818125 2023-10-04 09:16:11,907 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.out_combiner.scale_min, batch_count=1602706.6666666667, ans=0.2 2023-10-04 09:16:13,008 WARNING [train.py:1204] (2/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_125-144174-0 from training. Duration: 0.39 2023-10-04 09:16:14,317 WARNING [train.py:1204] (2/4) Exclude cut with ID _59288_210_5854_1_1535077757774_3412246_275-293897-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 09:16:14,427 WARNING [train.py:1204] (2/4) Exclude cut with ID _40519_210_13794_1_1533715228655_7337390_722-52880-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 09:16:14,442 WARNING [train.py:1204] (2/4) Exclude cut with ID _7537_210_2084_1_1528502395431_3924359_128-268029-0 from training. Duration: 0.98 2023-10-04 09:16:14,640 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=1602706.6666666667, ans=0.1 2023-10-04 09:16:15,842 WARNING [train.py:1204] (2/4) Exclude cut with ID _8021_210_7994_1_1528376451358_3653069_179-45206-0 from training. Duration: 0.86 2023-10-04 09:16:17,291 WARNING [train.py:1204] (2/4) Exclude cut with ID _13252_210_1925_1_1530671372047_3336909_88-276668-0 from training. Duration: 0.99 2023-10-04 09:16:19,228 WARNING [train.py:1204] (2/4) Exclude cut with ID _22613_210_5219_1_1532253599686_4090170_290-212862-0_sp1.1 from training. Duration: 0.97275 2023-10-04 09:16:19,232 WARNING [train.py:1204] (2/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_465-214726-0 from training. Duration: 0.86 2023-10-04 09:16:29,812 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.4.encoder.layers.2.conv_module1.whiten, num_groups=1, num_channels=384, metric=3.66 vs. limit=15.0 2023-10-04 09:16:32,084 WARNING [train.py:1204] (2/4) Exclude cut with ID _56480_210_4220_1_1534679942079_3075409_138-36031-0_sp1.1 from training. Duration: 0.97275 2023-10-04 09:16:40,925 WARNING [train.py:1204] (2/4) Exclude cut with ID _58605_210_12210_1_1534723195939_3904259_300-285878-0_sp1.1 from training. Duration: 0.97275 2023-10-04 09:16:40,962 WARNING [train.py:1204] (2/4) Exclude cut with ID _40901_210_5665_1_1533363789958_3856130_114-340229-0_sp1.1 from training. Duration: 0.936375 2023-10-04 09:16:41,215 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.bypass.scale_min, batch_count=1602840.0, ans=0.2 2023-10-04 09:16:42,318 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_489-230703-0 from training. Duration: 0.85 2023-10-04 09:16:43,672 WARNING [train.py:1204] (2/4) Exclude cut with ID _66873_210_5517_1_1535454136506_4293370_322-81243-0_sp1.1 from training. Duration: 0.936375 2023-10-04 09:16:46,248 WARNING [train.py:1204] (2/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_271-269084-0 from training. Duration: 0.83 2023-10-04 09:16:46,260 WARNING [train.py:1204] (2/4) Exclude cut with ID _13252_210_1925_1_1530671372047_3336909_166-314607-0_sp0.9 from training. Duration: 0.6 2023-10-04 09:16:46,331 WARNING [train.py:1204] (2/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_340-343302-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 09:16:47,774 WARNING [train.py:1204] (2/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_286-112015-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 09:16:51,056 WARNING [train.py:1204] (2/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_328-264776-0 from training. Duration: 0.67 2023-10-04 09:16:53,633 WARNING [train.py:1204] (2/4) Exclude cut with ID _4866_210_2577_1_1527404412284_4364659_256-306830-0_sp0.9 from training. Duration: 0.9666875 2023-10-04 09:16:59,673 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.650e+02 2.063e+02 2.245e+02 2.755e+02 4.029e+02, threshold=4.491e+02, percent-clipped=0.0 2023-10-04 09:16:59,742 WARNING [train.py:1204] (2/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_239-316802-0 from training. Duration: 0.69 2023-10-04 09:17:00,042 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.feed_forward1.hidden_balancer.prob, batch_count=1602906.6666666667, ans=0.125 2023-10-04 09:17:01,181 WARNING [train.py:1204] (2/4) Exclude cut with ID _29857_210_20034_1_1532395886753_3798160_279-185801-0 from training. Duration: 0.91 2023-10-04 09:17:03,911 INFO [train.py:1046] (2/4) Epoch 46, batch 1400, loss[loss=0.1465, simple_loss=0.2262, pruned_loss=0.03344, over 23445.00 frames. ], tot_loss[loss=0.1537, simple_loss=0.234, pruned_loss=0.03667, over 4709800.60 frames. ], batch size: 120, lr: 2.21e-03, grad_scale: 8.0 2023-10-04 09:17:05,427 WARNING [train.py:1204] (2/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_656-188361-0 from training. Duration: 0.87 2023-10-04 09:17:08,031 WARNING [train.py:1204] (2/4) Exclude cut with ID _44718_210_5816_1_1534050051869_6469030_100-230287-0_sp1.1 from training. Duration: 0.936375 2023-10-04 09:17:09,992 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.1.feed_forward1.hidden_balancer.prob, batch_count=1602973.3333333333, ans=0.125 2023-10-04 09:17:11,212 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8275_210_4198_1_1528455320_3892571_276-215622-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 09:17:11,256 WARNING [train.py:1204] (2/4) Exclude cut with ID _30304_210_6319_1_1532476827261_7582060_698-309430-0_sp1.1 from training. Duration: 0.963625 2023-10-04 09:17:17,488 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_365-217119-0 from training. Duration: 0.63 2023-10-04 09:17:17,613 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7264_210_4938_1_1527939911_8132095_800-91995-0 from training. Duration: 0.86 2023-10-04 09:17:27,164 WARNING [train.py:1204] (2/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_49-97158-0_sp1.1 from training. Duration: 0.6909375 2023-10-04 09:17:29,730 WARNING [train.py:1204] (2/4) Exclude cut with ID _61132_210_9969_1_1535021792964_3755419_61-57720-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 09:17:29,881 WARNING [train.py:1204] (2/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_345-306832-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 09:17:31,172 WARNING [train.py:1204] (2/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_164-224052-0_sp0.9 from training. Duration: 0.77775 2023-10-04 09:17:34,244 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_34886_210_10633_1_1533203882_1755661_76-156096-0_sp1.1 from training. Duration: 0.7909375 2023-10-04 09:17:35,634 WARNING [train.py:1204] (2/4) Exclude cut with ID 4133-6541-0027-40495-0_sp1.1 from training. Duration: 0.9681875 2023-10-04 09:17:44,768 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_514-211165-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 09:17:44,831 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_41211_210_2544_1_1533348859_3702910_97-212695-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 09:17:49,131 WARNING [train.py:1204] (2/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_406-45086-0 from training. Duration: 0.8 2023-10-04 09:17:50,368 WARNING [train.py:1204] (2/4) Exclude cut with ID _65263_210_22356_1_1535763598691_3921037_49-222917-0_sp1.1 from training. Duration: 0.836375 2023-10-04 09:17:52,213 WARNING [train.py:1204] (2/4) Exclude cut with ID _4866_210_2577_1_1527404412284_4364659_352-157930-0_sp1.1 from training. Duration: 0.736375 2023-10-04 09:17:52,290 WARNING [train.py:1204] (2/4) Exclude cut with ID _4366_210_1388_1_1526792053633_3613819_231-211402-0_sp1.1 from training. Duration: 0.8545625 2023-10-04 09:17:52,331 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_98-21576-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 09:17:52,481 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.conv_module1.balancer2.prob, batch_count=1603173.3333333333, ans=0.125 2023-10-04 09:17:53,695 WARNING [train.py:1204] (2/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_348-127789-0_sp0.9 from training. Duration: 0.9444375 2023-10-04 09:17:53,720 WARNING [train.py:1204] (2/4) Exclude cut with ID _45871_210_5665_1_1533864614245_3296060_98-329557-0_sp1.1 from training. Duration: 0.97275 2023-10-04 09:17:53,767 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6846_210_6753_1_1527850767_4295158_2-315073-0_sp1.1 from training. Duration: 0.8 2023-10-04 09:17:55,229 WARNING [train.py:1204] (2/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_145-137790-0 from training. Duration: 0.83 2023-10-04 09:17:55,247 WARNING [train.py:1204] (2/4) Exclude cut with ID _14603_210_4220_1_1530615688441_3097319_133-158308-0_sp0.9 from training. Duration: 0.9333125 2023-10-04 09:17:58,749 WARNING [train.py:1204] (2/4) Exclude cut with ID _14898_210_9206_1_1530766812377_4177190_320-11758-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 09:18:01,411 WARNING [train.py:1204] (2/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_406-45086-0_sp0.9 from training. Duration: 0.888875 2023-10-04 09:18:10,062 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_5438_210_1794_1_1527336687_2600009_104-288274-0 from training. Duration: 0.58 2023-10-04 09:18:11,964 WARNING [train.py:1204] (2/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_108-53929-0_sp0.9 from training. Duration: 0.6555625 2023-10-04 09:18:12,035 WARNING [train.py:1204] (2/4) Exclude cut with ID _30800_210_22382_1_1532499347904_4515959_444-286390-0_sp1.1 from training. Duration: 0.963625 2023-10-04 09:18:14,935 WARNING [train.py:1204] (2/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_125-144174-0_sp0.9 from training. Duration: 0.4333125 2023-10-04 09:18:15,010 WARNING [train.py:1204] (2/4) Exclude cut with ID _55209_210_1531_1_1534675828996_4597160_118-345621-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 09:18:17,641 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_45886_210_17024_1_1533709215_471063_7-79165-0_sp1.1 from training. Duration: 0.8909375 2023-10-04 09:18:17,959 INFO [scaling.py:1118] (2/4) WithLoss: name=encoder.encoders.0.layers.1.self_attn_weights, loss-sum=0.000e+00 2023-10-04 09:18:19,018 INFO [train.py:1046] (2/4) Epoch 46, batch 1450, loss[loss=0.1569, simple_loss=0.2488, pruned_loss=0.03247, over 24632.00 frames. ], tot_loss[loss=0.1534, simple_loss=0.2336, pruned_loss=0.03659, over 4702971.14 frames. ], batch size: 68, lr: 2.21e-03, grad_scale: 8.0 2023-10-04 09:18:19,161 WARNING [train.py:1204] (2/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_175-163894-0_sp0.9 from training. Duration: 0.82225 2023-10-04 09:18:20,624 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_50188_210_4198_1_1534503621_3646041_353-81682-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 09:18:20,629 WARNING [train.py:1204] (2/4) Exclude cut with ID _17875_210_2087_1_1531224012907_1821339_175-80565-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 09:18:22,426 WARNING [train.py:1204] (2/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_159-328157-0_sp1.1 from training. Duration: 0.47275 2023-10-04 09:18:26,720 WARNING [train.py:1204] (2/4) Exclude cut with ID _60057_210_10640_1_1534903065793_3333150_137-267579-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 09:18:26,764 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_92-46009-0_sp1.1 from training. Duration: 0.4909375 2023-10-04 09:18:28,668 WARNING [train.py:1204] (2/4) Exclude cut with ID _53203_210_2087_1_1534246218309_475590_48-68507-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 09:18:29,917 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_28211_210_6388_1_1532861506_4523224_128-328162-0 from training. Duration: 0.9 2023-10-04 09:18:31,310 WARNING [train.py:1204] (2/4) Exclude cut with ID _4697_210_2069_1_1526815801953_3823098_51-1049-0_sp1.1 from training. Duration: 0.6818125 2023-10-04 09:18:32,699 WARNING [train.py:1204] (2/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_383-316000-0 from training. Duration: 0.9 2023-10-04 09:18:32,749 WARNING [train.py:1204] (2/4) Exclude cut with ID _4779_210_2084_1_1527318022944_3914893_185-295153-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 09:18:32,821 WARNING [train.py:1204] (2/4) Exclude cut with ID _3231_210_2106_1_1526205864934_6784220_171-256004-0_sp0.9 from training. Duration: 0.9555625 2023-10-04 09:18:32,822 WARNING [train.py:1204] (2/4) Exclude cut with ID _41314_210_12651_1_1533459476543_3766460_87-272068-0 from training. Duration: 0.83 2023-10-04 09:18:34,225 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_164-152777-0_sp1.1 from training. Duration: 0.92725 2023-10-04 09:18:34,274 WARNING [train.py:1204] (2/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_18-130336-0_sp0.9 from training. Duration: 0.87775 2023-10-04 09:18:35,556 WARNING [train.py:1204] (2/4) Exclude cut with ID _14886_210_4220_1_1530702020355_3516190_175-229073-0_sp1.1 from training. Duration: 0.4090625 2023-10-04 09:18:35,576 WARNING [train.py:1204] (2/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_791-45149-0_sp0.9 from training. Duration: 0.9555625 2023-10-04 09:18:37,089 WARNING [train.py:1204] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_468-189058-0_sp1.1 from training. Duration: 0.87275 2023-10-04 09:18:38,451 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8278_210_2550_1_1528637099_4220413_32-142780-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 09:18:40,545 WARNING [train.py:1204] (2/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_146-218763-0_sp0.9 from training. Duration: 0.9555625 2023-10-04 09:18:43,245 WARNING [train.py:1204] (2/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_348-127789-0_sp1.1 from training. Duration: 0.77275 2023-10-04 09:18:43,252 WARNING [train.py:1204] (2/4) Exclude cut with ID _47790_210_16187_1_1533862814066_4207519_288-285445-0_sp1.1 from training. Duration: 0.936375 2023-10-04 09:18:44,577 WARNING [train.py:1204] (2/4) Exclude cut with ID _14603_210_4220_1_1530615688441_3097319_308-181732-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 09:18:45,939 WARNING [train.py:1204] (2/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_895-117428-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 09:18:47,559 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.conv_module1.balancer2.min_abs, batch_count=1603440.0, ans=0.5 2023-10-04 09:18:48,644 WARNING [train.py:1204] (2/4) Exclude cut with ID _9235_210_5219_1_1528891154253_8046120_387-159008-0_sp0.9 from training. Duration: 0.9555625 2023-10-04 09:18:48,664 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_37351_210_7925_1_1533012931_3812113_265-85780-0_sp0.9 from training. Duration: 0.97775 2023-10-04 09:18:49,889 WARNING [train.py:1204] (2/4) Exclude cut with ID _40551_210_10635_1_1533776424632_3416179_361-3964-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 09:18:49,937 WARNING [train.py:1204] (2/4) Exclude cut with ID _25882_210_3036_1_1532775665815_6479510_485-122742-0_sp1.1 from training. Duration: 0.97275 2023-10-04 09:18:54,613 WARNING [train.py:1204] (2/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_98-51676-0 from training. Duration: 0.73 2023-10-04 09:18:57,967 WARNING [train.py:1204] (2/4) Exclude cut with ID _30548_210_16040_1_1532608097130_3146969_371-280610-0_sp1.1 from training. Duration: 0.92725 2023-10-04 09:19:00,792 WARNING [train.py:1204] (2/4) Exclude cut with ID IC0671W0081-331924 from training. Duration: 0.896 2023-10-04 09:19:00,901 WARNING [train.py:1204] (2/4) Exclude cut with ID _40284_210_2084_1_1533513679814_3704279_380-160775-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 09:19:02,226 WARNING [train.py:1204] (2/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_57-270037-0_sp1.1 from training. Duration: 0.7 2023-10-04 09:19:02,334 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_853-150237-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 09:19:03,698 WARNING [train.py:1204] (2/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_491-261923-0 from training. Duration: 0.98 2023-10-04 09:19:08,307 WARNING [train.py:1204] (2/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_836-244045-0_sp1.1 from training. Duration: 0.97275 2023-10-04 09:19:09,641 WARNING [train.py:1204] (2/4) Exclude cut with ID _14880_210_8895_1_1530680426655_3795259_289-212817-0 from training. Duration: 0.69 2023-10-04 09:19:10,985 WARNING [train.py:1204] (2/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_303-340446-0 from training. Duration: 0.9 2023-10-04 09:19:11,127 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.conv_module1.balancer2.prob, batch_count=1603506.6666666667, ans=0.125 2023-10-04 09:19:14,295 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_37152_210_9697_1_1533106573_3844188_418-41736-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 09:19:19,790 WARNING [train.py:1204] (2/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_371-18169-0_sp1.1 from training. Duration: 0.7818125 2023-10-04 09:19:19,835 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_187-219984-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 09:19:22,608 WARNING [train.py:1204] (2/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_63-267585-0 from training. Duration: 0.9 2023-10-04 09:19:25,309 WARNING [train.py:1204] (2/4) Exclude cut with ID _27013_210_1389_1_1532437173240_3252649_222-125573-0 from training. Duration: 0.98 2023-10-04 09:19:25,364 WARNING [train.py:1204] (2/4) Exclude cut with ID _14821_210_8750_1_1530680503991_6829359_189-297451-0 from training. Duration: 0.55 2023-10-04 09:19:27,319 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_557-163391-0_sp1.1 from training. Duration: 0.97275 2023-10-04 09:19:28,537 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.754e+02 2.040e+02 2.275e+02 2.758e+02 4.535e+02, threshold=4.550e+02, percent-clipped=1.0 2023-10-04 09:19:28,635 WARNING [train.py:1204] (2/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_65-88242-0_sp1.1 from training. Duration: 0.5090625 2023-10-04 09:19:33,286 INFO [train.py:1046] (2/4) Epoch 46, batch 1500, loss[loss=0.1487, simple_loss=0.2321, pruned_loss=0.03261, over 24617.00 frames. ], tot_loss[loss=0.153, simple_loss=0.2335, pruned_loss=0.0363, over 4702340.79 frames. ], batch size: 60, lr: 2.21e-03, grad_scale: 8.0 2023-10-04 09:19:37,574 WARNING [train.py:1204] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_218-295097-0 from training. Duration: 0.77 2023-10-04 09:19:38,849 WARNING [train.py:1204] (2/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_686-247600-0_sp1.1 from training. Duration: 0.836375 2023-10-04 09:19:38,854 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_397-147196-0_sp0.9 from training. Duration: 0.888875 2023-10-04 09:19:40,654 WARNING [train.py:1204] (2/4) Exclude cut with ID _53950_210_5816_1_1534417264919_3862951_237-136449-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 09:19:40,706 WARNING [train.py:1204] (2/4) Exclude cut with ID _3120_210_3525_1_1526036143112_4072980_74-178400-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 09:19:41,998 WARNING [train.py:1204] (2/4) Exclude cut with ID _5166_210_2084_1_1527073197160_4087993_309-222545-0_sp0.9 from training. Duration: 0.9333125 2023-10-04 09:19:42,083 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7544_210_6753_1_1528587722_3576686_172-171750-0 from training. Duration: 0.65 2023-10-04 09:19:43,556 WARNING [train.py:1204] (2/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_3-237434-0_sp0.9 from training. Duration: 0.8444375 2023-10-04 09:19:43,597 WARNING [train.py:1204] (2/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_101-293367-0_sp0.9 from training. Duration: 0.7 2023-10-04 09:19:43,607 WARNING [train.py:1204] (2/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_818-213552-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 09:19:43,931 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.balancer2.prob, batch_count=1603640.0, ans=0.125 2023-10-04 09:19:44,993 WARNING [train.py:1204] (2/4) Exclude cut with ID _14349_210_8341_1_1530615710818_3442470_115-285975-0_sp1.1 from training. Duration: 0.7818125 2023-10-04 09:19:46,949 WARNING [train.py:1204] (2/4) Exclude cut with ID _30800_210_22382_1_1532499347904_4515959_291-309749-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 09:19:48,286 WARNING [train.py:1204] (2/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_715-3462-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 09:19:48,425 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder_embed.convnext.out_balancer.prob, batch_count=1603706.6666666667, ans=0.125 2023-10-04 09:19:51,301 WARNING [train.py:1204] (2/4) Exclude cut with ID _59502_210_12210_1_1535107024698_1835139_13-331827-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 09:19:51,312 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_4528_210_3273_1_1527076654_3276777_342-168068-0 from training. Duration: 0.87 2023-10-04 09:19:52,688 WARNING [train.py:1204] (2/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_215-141059-0_sp0.9 from training. Duration: 0.688875 2023-10-04 09:19:52,707 WARNING [train.py:1204] (2/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_106-286130-0_sp1.1 from training. Duration: 0.8909375 2023-10-04 09:19:52,779 WARNING [train.py:1204] (2/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_480-77655-0_sp1.1 from training. Duration: 0.97275 2023-10-04 09:19:57,312 WARNING [train.py:1204] (2/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_848-280957-0 from training. Duration: 0.87 2023-10-04 09:20:01,980 WARNING [train.py:1204] (2/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_122-109023-0 from training. Duration: 0.76 2023-10-04 09:20:03,243 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_11305_210_1794_1_1529750428_7178148_645-233480-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 09:20:03,304 WARNING [train.py:1204] (2/4) Exclude cut with ID _7562_210_2084_1_1528887767916_3946229_345-169156-0 from training. Duration: 0.58 2023-10-04 09:20:04,868 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.out_combiner.scale_min, batch_count=1603773.3333333333, ans=0.2 2023-10-04 09:20:06,218 WARNING [train.py:1204] (2/4) Exclude cut with ID _7562_210_2084_1_1528887767916_3946229_345-169156-0_sp1.1 from training. Duration: 0.52725 2023-10-04 09:20:09,494 WARNING [train.py:1204] (2/4) Exclude cut with ID _13253_210_1925_1_1530757775819_3577873_253-246386-0_sp1.1 from training. Duration: 0.8181875 2023-10-04 09:20:09,562 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_136-264444-0_sp1.1 from training. Duration: 0.97275 2023-10-04 09:20:09,588 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_2168_210_2290_1_1525606920_1849400_136-222991-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 09:20:12,198 WARNING [train.py:1204] (2/4) Exclude cut with ID _7852_210_5219_1_1528286471316_8139400_242-105336-0 from training. Duration: 0.72 2023-10-04 09:20:12,243 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_5960_210_1794_1_1527403865_3990999_515-2613-0_sp1.1 from training. Duration: 0.8 2023-10-04 09:20:12,265 WARNING [train.py:1204] (2/4) Exclude cut with ID _39688_210_19596_1_1533371626725_4154159_551-167834-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 09:20:13,607 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_43802_210_2290_1_1533516203_8829504_85-195240-0 from training. Duration: 0.87 2023-10-04 09:20:13,661 WARNING [train.py:1204] (2/4) Exclude cut with ID _63171_210_15488_1_1535104792794_3295110_113-113534-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 09:20:18,351 WARNING [train.py:1204] (2/4) Exclude cut with ID _8833_210_5219_1_1528592665210_7553059_319-349181-0_sp1.1 from training. Duration: 0.9 2023-10-04 09:20:18,352 WARNING [train.py:1204] (2/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_225-43381-0 from training. Duration: 0.94 2023-10-04 09:20:18,525 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.feed_forward2.hidden_balancer.prob, batch_count=1603840.0, ans=0.125 2023-10-04 09:20:23,831 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_5959_210_1794_1_1527400400_3445726_412-44052-0_sp0.9 from training. Duration: 0.8555625 2023-10-04 09:20:25,125 WARNING [train.py:1204] (2/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_356-167816-0_sp1.1 from training. Duration: 0.6545625 2023-10-04 09:20:29,862 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9012W0112-501244 from training. Duration: 0.768 2023-10-04 09:20:29,908 WARNING [train.py:1204] (2/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_177-326952-0_sp1.1 from training. Duration: 0.92725 2023-10-04 09:20:29,912 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9016W0116-502789 from training. Duration: 0.896 2023-10-04 09:20:30,012 WARNING [train.py:1204] (2/4) Exclude cut with ID _56235_210_10640_1_1534557335235_4161099_557-204815-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 09:20:31,440 WARNING [train.py:1204] (2/4) Exclude cut with ID _55857_210_2346_1_1534644007831_6898209_279-229469-0_sp1.1 from training. Duration: 0.936375 2023-10-04 09:20:32,822 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9029W0375-509764 from training. Duration: 0.896 2023-10-04 09:20:34,549 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_2295_210_2087_1_1525500993_2407863_234-83104-0_sp0.9 from training. Duration: 0.688875 2023-10-04 09:20:36,091 WARNING [train.py:1204] (2/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_266-137104-0 from training. Duration: 0.65 2023-10-04 09:20:37,498 WARNING [train.py:1204] (2/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_289-163927-0_sp1.1 from training. Duration: 0.92725 2023-10-04 09:20:40,771 WARNING [train.py:1204] (2/4) Exclude cut with ID _12733_210_8341_1_1530167424556_3646380_86-225314-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 09:20:40,784 WARNING [train.py:1204] (2/4) Exclude cut with ID _7877_210_6014_1_1528286358099_3750136_618-284064-0_sp1.1 from training. Duration: 0.92725 2023-10-04 09:20:40,831 WARNING [train.py:1204] (2/4) Exclude cut with ID _55844_210_2346_1_1534471212569_7625707_1017-31469-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 09:20:42,126 WARNING [train.py:1204] (2/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_617-241446-0_sp1.1 from training. Duration: 0.92725 2023-10-04 09:20:42,185 WARNING [train.py:1204] (2/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_866-173440-0_sp1.1 from training. Duration: 0.8181875 2023-10-04 09:20:44,914 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_9460_210_4211_1_1528954587_10049667_480-260269-0 from training. Duration: 0.56 2023-10-04 09:20:44,959 WARNING [train.py:1204] (2/4) Exclude cut with ID _44659_210_2721_1_1534036120061_6614860_626-298884-0 from training. Duration: 0.97 2023-10-04 09:20:44,993 WARNING [train.py:1204] (2/4) Exclude cut with ID _53565_210_10635_1_1534337218335_3820259_287-18120-0_sp1.1 from training. Duration: 0.8818125 2023-10-04 09:20:46,779 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_791-197891-0 from training. Duration: 0.84 2023-10-04 09:20:46,834 WARNING [train.py:1204] (2/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_527-238990-0 from training. Duration: 0.9 2023-10-04 09:20:48,077 INFO [train.py:1046] (2/4) Epoch 46, batch 1550, loss[loss=0.1544, simple_loss=0.2368, pruned_loss=0.03603, over 24484.00 frames. ], tot_loss[loss=0.1534, simple_loss=0.2341, pruned_loss=0.0364, over 4713670.80 frames. ], batch size: 66, lr: 2.21e-03, grad_scale: 8.0 2023-10-04 09:20:49,452 WARNING [train.py:1204] (2/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_449-65969-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 09:20:50,830 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_553-341452-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 09:20:50,878 WARNING [train.py:1204] (2/4) Exclude cut with ID _16909_210_5724_1_1531292079912_4118919_300-47574-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 09:20:52,095 WARNING [train.py:1204] (2/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_70-27732-0_sp1.1 from training. Duration: 0.8545625 2023-10-04 09:20:53,803 WARNING [train.py:1204] (2/4) Exclude cut with ID _44718_210_5816_1_1534050051869_6469030_727-28074-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 09:20:53,847 WARNING [train.py:1204] (2/4) Exclude cut with ID _55857_210_2346_1_1534644007831_6898209_357-123664-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 09:20:57,109 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9123W0004-555964 from training. Duration: 0.896 2023-10-04 09:20:57,130 WARNING [train.py:1204] (2/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_445-145208-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 09:20:57,176 WARNING [train.py:1204] (2/4) Exclude cut with ID _3675_210_4557_1_1526639934408_4182089_456-18305-0_sp1.1 from training. Duration: 0.6909375 2023-10-04 09:20:58,618 WARNING [train.py:1204] (2/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_1-99680-0_sp1.1 from training. Duration: 0.6090625 2023-10-04 09:21:00,008 WARNING [train.py:1204] (2/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_146-282400-0_sp0.9 from training. Duration: 0.788875 2023-10-04 09:21:00,029 WARNING [train.py:1204] (2/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_844-96989-0 from training. Duration: 0.71 2023-10-04 09:21:02,759 WARNING [train.py:1204] (2/4) Exclude cut with ID _29853_210_7497_1_1532505600851_6642110_128-146364-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 09:21:02,795 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_224-322394-0 from training. Duration: 0.66 2023-10-04 09:21:04,301 WARNING [train.py:1204] (2/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_191-59784-0 from training. Duration: 0.88 2023-10-04 09:21:04,306 WARNING [train.py:1204] (2/4) Exclude cut with ID _12556_210_8341_1_1530081024100_7327690_725-193477-0 from training. Duration: 0.98 2023-10-04 09:21:04,356 WARNING [train.py:1204] (2/4) Exclude cut with ID _5166_210_2084_1_1527073197160_4087993_523-79838-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 09:21:05,695 WARNING [train.py:1204] (2/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_398-334558-0_sp1.1 from training. Duration: 0.87275 2023-10-04 09:21:09,168 WARNING [train.py:1204] (2/4) Exclude cut with ID _6456_210_1534_1_1527681634212_3181049_171-36697-0_sp1.1 from training. Duration: 0.936375 2023-10-04 09:21:11,965 WARNING [train.py:1204] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_700-183549-0 from training. Duration: 0.68 2023-10-04 09:21:11,975 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8853_210_3273_1_1528633899_818968_51-187685-0 from training. Duration: 0.69 2023-10-04 09:21:20,883 WARNING [train.py:1204] (2/4) Exclude cut with ID _7719_210_3525_1_1528456153163_4296960_333-110399-0_sp1.1 from training. Duration: 0.87275 2023-10-04 09:21:21,358 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.5.encoder.layers.1.conv_module1.whiten, num_groups=1, num_channels=256, metric=3.25 vs. limit=15.0 2023-10-04 09:21:24,973 WARNING [train.py:1204] (2/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_1022-83740-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 09:21:24,991 WARNING [train.py:1204] (2/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_61-327062-0_sp0.9 from training. Duration: 0.9 2023-10-04 09:21:25,004 WARNING [train.py:1204] (2/4) Exclude cut with ID _47251_210_18628_1_1533814063128_4137250_214-88076-0_sp1.1 from training. Duration: 0.8 2023-10-04 09:21:26,361 WARNING [train.py:1204] (2/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_839-173017-0 from training. Duration: 0.41 2023-10-04 09:21:31,214 WARNING [train.py:1204] (2/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_299-34413-0_sp1.1 from training. Duration: 0.5909375 2023-10-04 09:21:32,602 WARNING [train.py:1204] (2/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_664-159457-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 09:21:35,376 WARNING [train.py:1204] (2/4) Exclude cut with ID _66873_210_5517_1_1535454136506_4293370_483-291760-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 09:21:38,777 WARNING [train.py:1204] (2/4) Exclude cut with ID _30800_210_22382_1_1532499347904_4515959_287-255474-0_sp1.1 from training. Duration: 0.92725 2023-10-04 09:21:38,816 WARNING [train.py:1204] (2/4) Exclude cut with ID _48677_210_5663_1_1533902405919_3892989_111-11439-0_sp1.1 from training. Duration: 0.87275 2023-10-04 09:21:38,823 WARNING [train.py:1204] (2/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_399-156791-0 from training. Duration: 0.83 2023-10-04 09:21:38,855 WARNING [train.py:1204] (2/4) Exclude cut with ID _3992_210_1747_1_1526644816031_3731060_36-147855-0_sp0.9 from training. Duration: 0.8666875 2023-10-04 09:21:41,993 WARNING [train.py:1204] (2/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_442-320317-0_sp1.1 from training. Duration: 0.7454375 2023-10-04 09:21:42,011 WARNING [train.py:1204] (2/4) Exclude cut with ID _55429_210_10626_1_1534467588527_7174143_953-80868-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 09:21:43,388 WARNING [train.py:1204] (2/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_159-328157-0_sp0.9 from training. Duration: 0.57775 2023-10-04 09:21:43,395 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9309W0197-640324 from training. Duration: 0.896 2023-10-04 09:21:44,875 WARNING [train.py:1204] (2/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_361-30822-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 09:21:50,705 WARNING [train.py:1204] (2/4) Exclude cut with ID _5250_210_5219_1_1527249561806_8692110_636-251213-0 from training. Duration: 0.94 2023-10-04 09:21:56,209 WARNING [train.py:1204] (2/4) Exclude cut with ID _49728_210_12651_1_1534041034294_6788150_468-333827-0_sp1.1 from training. Duration: 0.963625 2023-10-04 09:21:57,463 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.703e+02 2.083e+02 2.296e+02 2.597e+02 3.892e+02, threshold=4.592e+02, percent-clipped=0.0 2023-10-04 09:21:57,568 WARNING [train.py:1204] (2/4) Exclude cut with ID _25882_210_3036_1_1532775665815_6479510_623-191759-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 09:21:57,626 WARNING [train.py:1204] (2/4) Exclude cut with ID _5189_210_4557_1_1527248319428_3769229_199-53526-0 from training. Duration: 0.42 2023-10-04 09:21:58,988 WARNING [train.py:1204] (2/4) Exclude cut with ID _6274_210_2084_1_1527937583837_7494020_32-205288-0_sp0.9 from training. Duration: 0.8666875 2023-10-04 09:22:00,747 WARNING [train.py:1204] (2/4) Exclude cut with ID _65263_210_22356_1_1535763598691_3921037_401-346646-0_sp1.1 from training. Duration: 0.963625 2023-10-04 09:22:00,752 WARNING [train.py:1204] (2/4) Exclude cut with ID _12555_210_8750_1_1530075594402_3780239_449-267986-0_sp1.1 from training. Duration: 0.7545625 2023-10-04 09:22:00,772 WARNING [train.py:1204] (2/4) Exclude cut with ID _69884_210_9771_1_1535846392818_4166169_430-201020-0_sp1.1 from training. Duration: 0.8818125 2023-10-04 09:22:00,855 WARNING [train.py:1204] (2/4) Exclude cut with ID _8394_210_6702_1_1528590504316_3540189_379-49457-0_sp1.1 from training. Duration: 0.7909375 2023-10-04 09:22:02,116 INFO [train.py:1046] (2/4) Epoch 46, batch 1600, loss[loss=0.159, simple_loss=0.246, pruned_loss=0.03599, over 24572.00 frames. ], tot_loss[loss=0.154, simple_loss=0.2351, pruned_loss=0.03642, over 4711746.57 frames. ], batch size: 71, lr: 2.21e-03, grad_scale: 16.0 2023-10-04 09:22:04,978 WARNING [train.py:1204] (2/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_200-105218-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 09:22:05,039 WARNING [train.py:1204] (2/4) Exclude cut with ID _3867_210_1883_1_1526641200575_4475830_319-349850-0 from training. Duration: 0.67 2023-10-04 09:22:06,384 WARNING [train.py:1204] (2/4) Exclude cut with ID _28317_210_12742_1_1532480302381_8510325_916-195848-0 from training. Duration: 0.92 2023-10-04 09:22:06,514 WARNING [train.py:1204] (2/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_436-186311-0 from training. Duration: 0.65 2023-10-04 09:22:09,586 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_3910_210_1794_1_1526777667_3747260_302-180617-0_sp1.1 from training. Duration: 0.97275 2023-10-04 09:22:11,030 WARNING [train.py:1204] (2/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_102-92751-0 from training. Duration: 0.99 2023-10-04 09:22:12,780 WARNING [train.py:1204] (2/4) Exclude cut with ID _51899_210_20034_1_1534129254466_3669220_265-215428-0_sp1.1 from training. Duration: 0.8909375 2023-10-04 09:22:15,522 WARNING [train.py:1204] (2/4) Exclude cut with ID _58583_210_5236_1_1534726835266_3574249_438-198993-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 09:22:15,847 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.feed_forward1.hidden_balancer.prob, batch_count=1604373.3333333333, ans=0.125 2023-10-04 09:22:18,949 WARNING [train.py:1204] (2/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_166-166990-0_sp1.1 from training. Duration: 0.87275 2023-10-04 09:22:21,939 WARNING [train.py:1204] (2/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_907-151398-0 from training. Duration: 0.37 2023-10-04 09:22:25,869 WARNING [train.py:1204] (2/4) Exclude cut with ID _38670_210_15379_1_1533448810659_4391059_452-256669-0_sp1.1 from training. Duration: 0.936375 2023-10-04 09:22:25,916 WARNING [train.py:1204] (2/4) Exclude cut with ID _48677_210_5663_1_1533902405919_3892989_408-40740-0 from training. Duration: 0.83 2023-10-04 09:22:25,962 WARNING [train.py:1204] (2/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_903-158756-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 09:22:27,272 WARNING [train.py:1204] (2/4) Exclude cut with ID _13252_210_1925_1_1530671372047_3336909_397-216497-0 from training. Duration: 0.96 2023-10-04 09:22:31,537 WARNING [train.py:1204] (2/4) Exclude cut with ID _55429_210_10626_1_1534467588527_7174143_97-335084-0 from training. Duration: 0.97 2023-10-04 09:22:37,715 WARNING [train.py:1204] (2/4) Exclude cut with ID _45329_210_7005_1_1534157951211_6774339_453-267730-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 09:22:39,103 WARNING [train.py:1204] (2/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_153-97441-0 from training. Duration: 0.56 2023-10-04 09:22:40,985 WARNING [train.py:1204] (2/4) Exclude cut with ID _40901_210_5665_1_1533363789958_3856130_312-329122-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 09:22:41,027 WARNING [train.py:1204] (2/4) Exclude cut with ID _30800_210_22382_1_1532499347904_4515959_97-126471-0_sp1.1 from training. Duration: 0.97275 2023-10-04 09:22:41,027 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_11305_210_1794_1_1529750428_7178148_257-312870-0_sp1.1 from training. Duration: 0.8 2023-10-04 09:22:41,280 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.bypass_mid.scale_min, batch_count=1604440.0, ans=0.2 2023-10-04 09:22:44,067 WARNING [train.py:1204] (2/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_454-314901-0_sp0.9 from training. Duration: 0.511125 2023-10-04 09:22:48,638 WARNING [train.py:1204] (2/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_282-136641-0_sp1.1 from training. Duration: 0.3818125 2023-10-04 09:22:49,457 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.1.encoder.layers.0.conv_module1.whiten, num_groups=1, num_channels=256, metric=10.95 vs. limit=15.0 2023-10-04 09:22:51,439 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_46013_210_7234_1_1533776362_3744032_493-160724-0_sp1.1 from training. Duration: 0.963625 2023-10-04 09:22:51,497 WARNING [train.py:1204] (2/4) Exclude cut with ID _67831_210_12392_1_1535612464202_3092612_259-33442-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 09:22:52,830 WARNING [train.py:1204] (2/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_289-62384-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 09:22:52,905 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6545_210_2040_1_1527774670_4245258_379-14182-0_sp0.9 from training. Duration: 0.888875 2023-10-04 09:22:55,586 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_548-294316-0_sp1.1 from training. Duration: 0.736375 2023-10-04 09:22:55,671 WARNING [train.py:1204] (2/4) Exclude cut with ID _6275_210_2084_1_1528110062213_7669049_857-298942-0_sp1.1 from training. Duration: 0.77275 2023-10-04 09:22:58,344 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6673_210_3273_1_1527767790_3173240_327-306185-0_sp1.1 from training. Duration: 0.7545625 2023-10-04 09:23:03,799 WARNING [train.py:1204] (2/4) Exclude cut with ID _61008_210_10633_1_1534852769198_6974370_814-107647-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 09:23:03,866 WARNING [train.py:1204] (2/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_116-332008-0_sp1.1 from training. Duration: 0.8909375 2023-10-04 09:23:07,056 WARNING [train.py:1204] (2/4) Exclude cut with ID _32704_210_8954_1_1533017488840_6747370_266-329755-0 from training. Duration: 0.89 2023-10-04 09:23:07,060 WARNING [train.py:1204] (2/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_271-269084-0_sp0.9 from training. Duration: 0.92225 2023-10-04 09:23:07,128 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_3171_210_2087_1_1526109084_2330007_66-260583-0 from training. Duration: 0.72 2023-10-04 09:23:13,261 WARNING [train.py:1204] (2/4) Exclude cut with ID _5250_210_5219_1_1527249561806_8692110_1326-276386-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 09:23:14,576 WARNING [train.py:1204] (2/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_372-30302-0_sp1.1 from training. Duration: 0.7818125 2023-10-04 09:23:15,853 INFO [train.py:1046] (2/4) Epoch 46, batch 1650, loss[loss=0.1643, simple_loss=0.2476, pruned_loss=0.0405, over 23407.00 frames. ], tot_loss[loss=0.1543, simple_loss=0.2353, pruned_loss=0.03664, over 4713277.91 frames. ], batch size: 93, lr: 2.21e-03, grad_scale: 16.0 2023-10-04 09:23:15,936 WARNING [train.py:1204] (2/4) Exclude cut with ID _25882_210_3036_1_1532775665815_6479510_631-257029-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 09:23:15,945 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_3691_210_3528_1_1526468169_4062053_355-334432-0 from training. Duration: 0.87 2023-10-04 09:23:15,965 WARNING [train.py:1204] (2/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_839-173017-0_sp1.1 from training. Duration: 0.37275 2023-10-04 09:23:15,973 WARNING [train.py:1204] (2/4) Exclude cut with ID _14998_210_5219_1_1530705582080_7474970_441-91466-0 from training. Duration: 0.97 2023-10-04 09:23:17,310 WARNING [train.py:1204] (2/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_108-53929-0 from training. Duration: 0.59 2023-10-04 09:23:22,070 WARNING [train.py:1204] (2/4) Exclude cut with ID _9389_210_1664_1_1529217337301_5289269_260-1942-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 09:23:22,134 WARNING [train.py:1204] (2/4) Exclude cut with ID _56423_210_5854_1_1534896061049_3165969_198-61547-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 09:23:22,166 WARNING [train.py:1204] (2/4) Exclude cut with ID _44778_210_2054_1_1533798127203_7443090_412-142122-0_sp1.1 from training. Duration: 0.92725 2023-10-04 09:23:22,186 WARNING [train.py:1204] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_88-341718-0_sp1.1 from training. Duration: 0.6 2023-10-04 09:23:23,594 WARNING [train.py:1204] (2/4) Exclude cut with ID _61079_210_4525_1_1534921008198_4011419_334-298697-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 09:23:26,366 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_5465_210_4211_1_1527227253_55357_8-2499-0_sp1.1 from training. Duration: 0.996375 2023-10-04 09:23:29,089 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_447-201565-0_sp1.1 from training. Duration: 0.97275 2023-10-04 09:23:29,095 WARNING [train.py:1204] (2/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_253-161789-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 09:23:29,098 WARNING [train.py:1204] (2/4) Exclude cut with ID _44304_210_15374_1_1533639352765_4613129_302-22084-0_sp0.9 from training. Duration: 0.9555625 2023-10-04 09:23:29,107 WARNING [train.py:1204] (2/4) Exclude cut with ID _26664_210_7770_1_1532258943280_3418270_187-80815-0_sp1.1 from training. Duration: 0.8454375 2023-10-04 09:23:29,185 WARNING [train.py:1204] (2/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_68-212801-0 from training. Duration: 0.73 2023-10-04 09:23:30,589 WARNING [train.py:1204] (2/4) Exclude cut with ID _29345_210_4381_1_1532689168263_3860169_149-257268-0 from training. Duration: 0.81 2023-10-04 09:23:30,857 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.nonlin_attention.balancer.prob, batch_count=1604706.6666666667, ans=0.125 2023-10-04 09:23:34,841 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_213-103282-0_sp1.1 from training. Duration: 0.7090625 2023-10-04 09:23:38,082 WARNING [train.py:1204] (2/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_156-302675-0_sp1.1 from training. Duration: 0.5 2023-10-04 09:23:39,701 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.conv_module1.balancer1.prob, batch_count=1604706.6666666667, ans=0.125 2023-10-04 09:23:44,338 WARNING [train.py:1204] (2/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_125-322172-0 from training. Duration: 0.93 2023-10-04 09:23:46,139 WARNING [train.py:1204] (2/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_493-60474-0_sp1.1 from training. Duration: 0.963625 2023-10-04 09:23:48,964 WARNING [train.py:1204] (2/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_156-302675-0 from training. Duration: 0.55 2023-10-04 09:23:51,728 WARNING [train.py:1204] (2/4) Exclude cut with ID _41314_210_12651_1_1533459476543_3766460_123-267862-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 09:23:53,160 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.conv_module1.balancer1.min_positive, batch_count=1604773.3333333333, ans=0.025 2023-10-04 09:23:54,360 WARNING [train.py:1204] (2/4) Exclude cut with ID _28427_210_4220_1_1532581240967_3061069_201-143747-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 09:23:54,393 WARNING [train.py:1204] (2/4) Exclude cut with ID _8772_210_1850_1_1528635595972_3664011_96-12756-0_sp1.1 from training. Duration: 0.8909375 2023-10-04 09:23:54,424 WARNING [train.py:1204] (2/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_1347-145675-0_sp1.1 from training. Duration: 0.936375 2023-10-04 09:23:55,892 WARNING [train.py:1204] (2/4) Exclude cut with ID _9235_210_5219_1_1528891154253_8046120_387-159008-0_sp1.1 from training. Duration: 0.7818125 2023-10-04 09:23:55,916 WARNING [train.py:1204] (2/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_400-201896-0_sp1.1 from training. Duration: 0.963625 2023-10-04 09:24:00,009 WARNING [train.py:1204] (2/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_326-174612-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 09:24:00,038 WARNING [train.py:1204] (2/4) Exclude cut with ID _55844_210_2346_1_1534471212569_7625707_390-116233-0_sp1.1 from training. Duration: 0.963625 2023-10-04 09:24:00,071 WARNING [train.py:1204] (2/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_419-317062-0_sp1.1 from training. Duration: 0.82725 2023-10-04 09:24:01,330 WARNING [train.py:1204] (2/4) Exclude cut with ID _29877_210_6228_1_1532397791106_5276149_266-62291-0_sp0.9 from training. Duration: 0.9666875 2023-10-04 09:24:01,397 WARNING [train.py:1204] (2/4) Exclude cut with ID _7857_210_1534_1_1528286515745_6831470_431-338528-0_sp1.1 from training. Duration: 0.92725 2023-10-04 09:24:01,463 WARNING [train.py:1204] (2/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_87-131427-0_sp0.9 from training. Duration: 0.8666875 2023-10-04 09:24:04,248 WARNING [train.py:1204] (2/4) Exclude cut with ID _24055_210_12651_1_1532310907143_976979_92-59356-0_sp1.1 from training. Duration: 0.82725 2023-10-04 09:24:07,308 WARNING [train.py:1204] (2/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_96-290039-0 from training. Duration: 0.56 2023-10-04 09:24:08,496 WARNING [train.py:1204] (2/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_48-182155-0_sp0.9 from training. Duration: 0.9666875 2023-10-04 09:24:09,802 WARNING [train.py:1204] (2/4) Exclude cut with ID _4626_210_3532_1_1526817652717_3916859_446-206656-0 from training. Duration: 0.76 2023-10-04 09:24:11,566 WARNING [train.py:1204] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_168-32906-0 from training. Duration: 0.69 2023-10-04 09:24:11,594 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_3780_210_2087_1_1526467760_2130483_170-259044-0 from training. Duration: 0.7 2023-10-04 09:24:11,613 WARNING [train.py:1204] (2/4) Exclude cut with ID _65134_210_14763_1_1535335393985_3505368_153-216718-0_sp1.1 from training. Duration: 0.92725 2023-10-04 09:24:13,049 WARNING [train.py:1204] (2/4) Exclude cut with ID _64639_210_5816_1_1535367619437_3528000_461-329156-0_sp1.1 from training. Duration: 0.87275 2023-10-04 09:24:13,102 WARNING [train.py:1204] (2/4) Exclude cut with ID _21989_210_5854_1_1531899214931_3271360_126-347531-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 09:24:13,148 WARNING [train.py:1204] (2/4) Exclude cut with ID _7614_210_4915_1_1529326764092_4537359_96-162197-0_sp1.1 from training. Duration: 0.963625 2023-10-04 09:24:13,148 WARNING [train.py:1204] (2/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_1083-147536-0 from training. Duration: 0.97 2023-10-04 09:24:18,350 WARNING [train.py:1204] (2/4) Exclude cut with ID _64029_210_4381_1_1535178502772_3756459_275-16710-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 09:24:19,725 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7264_210_4938_1_1527939911_8132095_800-91995-0_sp0.9 from training. Duration: 0.9555625 2023-10-04 09:24:19,757 WARNING [train.py:1204] (2/4) Exclude cut with ID _3844_210_1390_1_1526727803582_2871209_50-193879-0_sp1.1 from training. Duration: 0.936375 2023-10-04 09:24:22,433 WARNING [train.py:1204] (2/4) Exclude cut with ID _46952_210_5986_1_1534230010357_3107379_232-344503-0 from training. Duration: 0.96 2023-10-04 09:24:25,065 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.671e+02 2.077e+02 2.252e+02 2.648e+02 5.011e+02, threshold=4.504e+02, percent-clipped=3.0 2023-10-04 09:24:26,520 WARNING [train.py:1204] (2/4) Exclude cut with ID _25808_210_13292_1_1532343514562_7366109_206-14076-0_sp1.1 from training. Duration: 0.936375 2023-10-04 09:24:26,521 WARNING [train.py:1204] (2/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_55-31904-0_sp0.9 from training. Duration: 0.97775 2023-10-04 09:24:26,539 WARNING [train.py:1204] (2/4) Exclude cut with ID _14632_210_9988_1_1530691279392_3923200_161-129530-0 from training. Duration: 0.96 2023-10-04 09:24:26,602 WARNING [train.py:1204] (2/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_770-200430-0_sp1.1 from training. Duration: 0.8090625 2023-10-04 09:24:26,612 WARNING [train.py:1204] (2/4) Exclude cut with ID _6270_210_2084_1_1527850911573_3856420_26-277343-0_sp1.1 from training. Duration: 0.7454375 2023-10-04 09:24:26,613 WARNING [train.py:1204] (2/4) Exclude cut with ID _69884_210_9771_1_1535846392818_4166169_88-23776-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 09:24:29,297 INFO [train.py:1046] (2/4) Epoch 46, batch 1700, loss[loss=0.1512, simple_loss=0.2312, pruned_loss=0.03558, over 24302.00 frames. ], tot_loss[loss=0.1544, simple_loss=0.2352, pruned_loss=0.03673, over 4719710.34 frames. ], batch size: 61, lr: 2.21e-03, grad_scale: 16.0 2023-10-04 09:24:29,404 WARNING [train.py:1204] (2/4) Exclude cut with ID _60860_210_8254_1_1534896015875_3557169_245-256666-0_sp1.1 from training. Duration: 0.8818125 2023-10-04 09:24:29,433 WARNING [train.py:1204] (2/4) Exclude cut with ID _5250_210_5219_1_1527249561806_8692110_636-251213-0_sp1.1 from training. Duration: 0.8545625 2023-10-04 09:24:29,454 WARNING [train.py:1204] (2/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_540-131164-0 from training. Duration: 0.92 2023-10-04 09:24:30,888 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_5931_210_2040_1_1527428481_4547999_321-193614-0_sp0.9 from training. Duration: 0.7444375 2023-10-04 09:24:38,785 WARNING [train.py:1204] (2/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_373-225403-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 09:24:40,379 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.conv_skip_rate, batch_count=1604973.3333333333, ans=0.0 2023-10-04 09:24:41,537 WARNING [train.py:1204] (2/4) Exclude cut with ID _20622_210_3852_1_1531622427080_1093839_57-326027-0_sp1.1 from training. Duration: 0.97275 2023-10-04 09:24:43,754 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.conv_module1.balancer2.min_abs, batch_count=1605040.0, ans=0.5 2023-10-04 09:24:45,021 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.ff3_skip_rate, batch_count=1605040.0, ans=0.0 2023-10-04 09:24:47,049 INFO [scaling.py:1118] (2/4) WithLoss: name=encoder.encoders.4.encoder.layers.1.self_attn_weights, loss-sum=0.000e+00 2023-10-04 09:24:48,114 WARNING [train.py:1204] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_605-257205-0_sp1.1 from training. Duration: 0.536375 2023-10-04 09:24:48,136 WARNING [train.py:1204] (2/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_442-320317-0_sp0.9 from training. Duration: 0.911125 2023-10-04 09:24:48,827 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.3.encoder.layers.2.conv_module2.whiten, num_groups=1, num_channels=512, metric=4.11 vs. limit=15.0 2023-10-04 09:24:49,501 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_35361_210_4238_1_1533262810_7793476_675-87012-0_sp1.1 from training. Duration: 0.8090625 2023-10-04 09:24:49,540 WARNING [train.py:1204] (2/4) Exclude cut with ID _4184_210_2550_1_1526730192013_2219380_306-44736-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 09:24:49,745 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.balancer2.prob, batch_count=1605040.0, ans=0.125 2023-10-04 09:24:52,214 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_124-113562-0 from training. Duration: 0.94 2023-10-04 09:24:53,699 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_2821_210_2715_1_1526108385_3763560_73-296673-0_sp0.9 from training. Duration: 0.92225 2023-10-04 09:24:55,013 WARNING [train.py:1204] (2/4) Exclude cut with ID _10301_210_5886_1_1529222275332_3910620_216-46959-0_sp1.1 from training. Duration: 0.92725 2023-10-04 09:24:56,457 WARNING [train.py:1204] (2/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_91-277684-0_sp0.9 from training. Duration: 0.82225 2023-10-04 09:24:56,666 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.balancer2.prob, batch_count=1605040.0, ans=0.125 2023-10-04 09:24:57,809 WARNING [train.py:1204] (2/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_351-294650-0_sp0.9 from training. Duration: 0.7 2023-10-04 09:24:59,277 WARNING [train.py:1204] (2/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_209-205365-0 from training. Duration: 0.79 2023-10-04 09:24:59,543 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.nonlin_attention.balancer.prob, batch_count=1605106.6666666667, ans=0.125 2023-10-04 09:25:00,628 WARNING [train.py:1204] (2/4) Exclude cut with ID _14603_210_4220_1_1530615688441_3097319_97-325053-0 from training. Duration: 0.92 2023-10-04 09:25:03,271 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_49348_210_7927_1_1533954500_3691300_109-276430-0_sp1.1 from training. Duration: 0.92725 2023-10-04 09:25:04,657 WARNING [train.py:1204] (2/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_912-47473-0 from training. Duration: 0.68 2023-10-04 09:25:06,011 WARNING [train.py:1204] (2/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_96-320736-0_sp0.9 from training. Duration: 0.9555625 2023-10-04 09:25:14,975 WARNING [train.py:1204] (2/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_1252-235481-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 09:25:16,286 WARNING [train.py:1204] (2/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_813-261554-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 09:25:16,988 WARNING [train.py:1204] (2/4) Exclude cut with ID _21632_210_2253_1_1531994001765_3744140_110-274654-0_sp0.9 from training. Duration: 0.911125 2023-10-04 09:25:19,569 WARNING [train.py:1204] (2/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_381-317791-0_sp1.1 from training. Duration: 0.663625 2023-10-04 09:25:19,590 WARNING [train.py:1204] (2/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_51-117576-0_sp1.1 from training. Duration: 0.363625 2023-10-04 09:25:19,614 WARNING [train.py:1204] (2/4) Exclude cut with ID _42321_210_7770_1_1533468631352_4366895_104-215915-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 09:25:22,203 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.2.encoder.layers.2.feed_forward1.out_whiten, num_groups=1, num_channels=384, metric=8.07 vs. limit=15.0 2023-10-04 09:25:22,862 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_40896_210_15710_1_1533433956_7100802_216-22824-0_sp1.1 from training. Duration: 0.92725 2023-10-04 09:25:22,863 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_14831_210_7927_1_1530700200_5583610_181-27467-0 from training. Duration: 0.91 2023-10-04 09:25:22,912 WARNING [train.py:1204] (2/4) Exclude cut with ID _30304_210_6319_1_1532476827261_7582060_1024-265645-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 09:25:22,917 WARNING [train.py:1204] (2/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_412-233904-0_sp1.1 from training. Duration: 0.936375 2023-10-04 09:25:22,940 WARNING [train.py:1204] (2/4) Exclude cut with ID _30191_210_1664_1_1533189915047_7196670_911-223461-0_sp1.1 from training. Duration: 0.92725 2023-10-04 09:25:22,948 WARNING [train.py:1204] (2/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_558-148266-0_sp1.1 from training. Duration: 0.963625 2023-10-04 09:25:24,665 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.feed_forward1.hidden_balancer.prob, batch_count=1605173.3333333333, ans=0.125 2023-10-04 09:25:25,795 WARNING [train.py:1204] (2/4) Exclude cut with ID _58480_210_12392_1_1534811121111_3646160_212-164495-0_sp1.1 from training. Duration: 0.936375 2023-10-04 09:25:25,796 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_454-276429-0_sp0.9 from training. Duration: 0.9666875 2023-10-04 09:25:27,073 WARNING [train.py:1204] (2/4) Exclude cut with ID _24736_210_3279_1_1532332894218_2838700_73-197086-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 09:25:28,370 WARNING [train.py:1204] (2/4) Exclude cut with ID _5294_210_1747_1_1527249643928_3696179_529-242298-0_sp1.1 from training. Duration: 0.863625 2023-10-04 09:25:28,415 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_53555_210_5632_1_1534595220_7232297_345-171438-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 09:25:32,473 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_28211_210_6388_1_1532861506_4523224_22-25766-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 09:25:33,838 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7002_210_1794_1_1527939683_5157286_163-87552-0 from training. Duration: 0.86 2023-10-04 09:25:35,220 WARNING [train.py:1204] (2/4) Exclude cut with ID _56927_210_9969_1_1534848970840_3695229_409-225129-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 09:25:36,570 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_26192_210_7927_1_1532137269_1040115_21-219331-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 09:25:39,453 WARNING [train.py:1204] (2/4) Exclude cut with ID _15012_210_1968_1_1530856820429_5095350_662-210469-0 from training. Duration: 0.57 2023-10-04 09:25:42,509 INFO [train.py:1046] (2/4) Epoch 46, batch 1750, loss[loss=0.1499, simple_loss=0.2193, pruned_loss=0.04029, over 23656.00 frames. ], tot_loss[loss=0.1531, simple_loss=0.2334, pruned_loss=0.03638, over 4732150.66 frames. ], batch size: 232, lr: 2.21e-03, grad_scale: 16.0 2023-10-04 09:25:45,412 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.2.encoder.layers.1.conv_module2.whiten, num_groups=1, num_channels=384, metric=4.05 vs. limit=15.0 2023-10-04 09:25:47,252 WARNING [train.py:1204] (2/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_582-191016-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 09:25:50,176 WARNING [train.py:1204] (2/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_270-210032-0_sp1.1 from training. Duration: 0.963625 2023-10-04 09:25:50,212 WARNING [train.py:1204] (2/4) Exclude cut with ID _6789_210_1534_1_1527857937218_3118950_262-48853-0_sp0.9 from training. Duration: 0.62225 2023-10-04 09:25:52,139 WARNING [train.py:1204] (2/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_847-41334-0 from training. Duration: 0.44 2023-10-04 09:25:52,174 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_40896_210_15710_1_1533433956_7100802_573-46532-0_sp1.1 from training. Duration: 0.92725 2023-10-04 09:25:54,272 WARNING [train.py:1204] (2/4) Exclude cut with ID _21881_210_3525_1_1531825242757_3798380_247-102591-0_sp1.1 from training. Duration: 0.8909375 2023-10-04 09:25:54,295 WARNING [train.py:1204] (2/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_200-21395-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 09:25:57,188 WARNING [train.py:1204] (2/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_711-15688-0 from training. Duration: 0.43 2023-10-04 09:25:58,818 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=1605373.3333333333, ans=0.1 2023-10-04 09:25:59,913 WARNING [train.py:1204] (2/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_53-75025-0_sp1.1 from training. Duration: 0.963625 2023-10-04 09:26:01,411 WARNING [train.py:1204] (2/4) Exclude cut with ID _6194_210_1534_1_1527511826160_3941194_321-148641-0 from training. Duration: 0.64 2023-10-04 09:26:01,444 WARNING [train.py:1204] (2/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_337-294431-0_sp1.1 from training. Duration: 0.936375 2023-10-04 09:26:02,794 WARNING [train.py:1204] (2/4) Exclude cut with ID _21632_210_2253_1_1531994001765_3744140_110-274654-0_sp1.1 from training. Duration: 0.7454375 2023-10-04 09:26:05,499 WARNING [train.py:1204] (2/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_135-174080-0_sp1.1 from training. Duration: 0.4818125 2023-10-04 09:26:08,143 WARNING [train.py:1204] (2/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_21-174514-0 from training. Duration: 0.43 2023-10-04 09:26:09,586 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_38965_210_4852_1_1533174906_3484735_215-294589-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 09:26:09,623 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_548-294316-0 from training. Duration: 0.81 2023-10-04 09:26:17,602 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_103-84010-0_sp1.1 from training. Duration: 0.62725 2023-10-04 09:26:22,115 WARNING [train.py:1204] (2/4) Exclude cut with ID _18199_210_6109_1_1531475789864_4288790_295-198247-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 09:26:22,135 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_13757_210_3528_1_1530440682_4107239_103-313224-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 09:26:24,850 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_34886_210_10633_1_1533203882_1755661_67-294673-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 09:26:24,861 WARNING [train.py:1204] (2/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_350-101734-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 09:26:26,293 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_37357_210_17024_1_1533089504_2316121_204-153986-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 09:26:27,874 WARNING [train.py:1204] (2/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_673-115569-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 09:26:28,152 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.self_attn_weights.pos_emb_skip_rate, batch_count=1605506.6666666667, ans=0.0 2023-10-04 09:26:30,638 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_489-230703-0_sp0.9 from training. Duration: 0.9444375 2023-10-04 09:26:31,944 WARNING [train.py:1204] (2/4) Exclude cut with ID _5250_210_5219_1_1527249561806_8692110_575-102496-0_sp1.1 from training. Duration: 0.936375 2023-10-04 09:26:32,034 WARNING [train.py:1204] (2/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_669-93163-0 from training. Duration: 0.98 2023-10-04 09:26:34,735 WARNING [train.py:1204] (2/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_249-281349-0_sp0.9 from training. Duration: 0.9444375 2023-10-04 09:26:34,990 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.bypass.scale_min, batch_count=1605506.6666666667, ans=0.2 2023-10-04 09:26:36,251 WARNING [train.py:1204] (2/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_106-30782-0 from training. Duration: 0.8 2023-10-04 09:26:37,604 WARNING [train.py:1204] (2/4) Exclude cut with ID _8772_210_1850_1_1528635595972_3664011_398-320049-0_sp1.1 from training. Duration: 0.8 2023-10-04 09:26:39,021 WARNING [train.py:1204] (2/4) Exclude cut with ID _44778_210_2054_1_1533798127203_7443090_516-151727-0_sp1.1 from training. Duration: 0.963625 2023-10-04 09:26:40,370 WARNING [train.py:1204] (2/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_325-62802-0_sp1.1 from training. Duration: 0.7909375 2023-10-04 09:26:42,497 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.attention_skip_rate, batch_count=1605573.3333333333, ans=0.0 2023-10-04 09:26:43,678 WARNING [train.py:1204] (2/4) Exclude cut with ID _14368_210_4220_1_1530532714870_3595529_93-58002-0_sp0.9 from training. Duration: 0.6333125 2023-10-04 09:26:44,957 WARNING [train.py:1204] (2/4) Exclude cut with ID _4681_210_3525_1_1527251040321_1235713_123-24428-0_sp1.1 from training. Duration: 0.436375 2023-10-04 09:26:45,020 WARNING [train.py:1204] (2/4) Exclude cut with ID _5250_210_5219_1_1527249561806_8692110_1185-56473-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 09:26:47,790 WARNING [train.py:1204] (2/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_96-244299-0_sp1.1 from training. Duration: 0.8 2023-10-04 09:26:53,140 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.738e+02 2.053e+02 2.343e+02 2.960e+02 5.357e+02, threshold=4.686e+02, percent-clipped=4.0 2023-10-04 09:26:53,222 WARNING [train.py:1204] (2/4) Exclude cut with ID _59500_210_8254_1_1534809610680_3736009_104-72815-0_sp1.1 from training. Duration: 0.963625 2023-10-04 09:26:55,270 WARNING [train.py:1204] (2/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_468-283113-0_sp1.1 from training. Duration: 0.87275 2023-10-04 09:26:56,618 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6239_210_4211_1_1527765584_4306698_528-102186-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 09:26:56,804 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.bypass_mid.scale_min, batch_count=1605640.0, ans=0.2 2023-10-04 09:26:57,937 INFO [train.py:1046] (2/4) Epoch 46, batch 1800, loss[loss=0.1581, simple_loss=0.2466, pruned_loss=0.03482, over 24655.00 frames. ], tot_loss[loss=0.1526, simple_loss=0.233, pruned_loss=0.03609, over 4737516.65 frames. ], batch size: 73, lr: 2.21e-03, grad_scale: 16.0 2023-10-04 09:26:58,007 WARNING [train.py:1204] (2/4) Exclude cut with ID _45297_210_2721_1_1533715351381_3264949_351-147136-0 from training. Duration: 0.87 2023-10-04 09:26:58,011 WARNING [train.py:1204] (2/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_520-51744-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 09:26:58,098 WARNING [train.py:1204] (2/4) Exclude cut with ID _5166_210_2084_1_1527073197160_4087993_310-114452-0_sp1.1 from training. Duration: 0.536375 2023-10-04 09:26:58,101 WARNING [train.py:1204] (2/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_450-165710-0_sp1.1 from training. Duration: 0.92725 2023-10-04 09:26:58,111 WARNING [train.py:1204] (2/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_280-120942-0_sp1.1 from training. Duration: 0.7 2023-10-04 09:26:59,454 WARNING [train.py:1204] (2/4) Exclude cut with ID _65585_210_12210_1_1535450451584_3764098_112-294301-0_sp0.9 from training. Duration: 0.97775 2023-10-04 09:26:59,512 WARNING [train.py:1204] (2/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_98-51676-0_sp0.9 from training. Duration: 0.811125 2023-10-04 09:27:02,177 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_33348_210_5632_1_1533086912_3324030_85-28583-0_sp1.1 from training. Duration: 0.7454375 2023-10-04 09:27:02,257 WARNING [train.py:1204] (2/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_336-208232-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 09:27:04,960 WARNING [train.py:1204] (2/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_339-104791-0_sp0.9 from training. Duration: 0.5444375 2023-10-04 09:27:06,354 WARNING [train.py:1204] (2/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_559-29840-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 09:27:09,231 WARNING [train.py:1204] (2/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_117-290916-0_sp0.9 from training. Duration: 0.5666875 2023-10-04 09:27:09,314 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_43802_210_2290_1_1533516203_8829504_185-112289-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 09:27:12,726 WARNING [train.py:1204] (2/4) Exclude cut with ID _59256_210_5986_1_1535076085997_3589120_155-298797-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 09:27:15,340 WARNING [train.py:1204] (2/4) Exclude cut with ID _44659_210_2721_1_1534036120061_6614860_246-206531-0_sp1.1 from training. Duration: 0.92725 2023-10-04 09:27:15,392 WARNING [train.py:1204] (2/4) Exclude cut with ID _23818_210_6228_1_1531917063884_3676920_469-151944-0_sp1.1 from training. Duration: 0.92725 2023-10-04 09:27:17,304 WARNING [train.py:1204] (2/4) Exclude cut with ID _13252_210_1925_1_1530671372047_3336909_88-276668-0_sp1.1 from training. Duration: 0.9 2023-10-04 09:27:20,029 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_15101_210_7927_1_1530786415_5033274_320-327527-0_sp1.1 from training. Duration: 0.87275 2023-10-04 09:27:20,045 WARNING [train.py:1204] (2/4) Exclude cut with ID _14998_210_5219_1_1530705582080_7474970_446-112117-0 from training. Duration: 0.9 2023-10-04 09:27:20,116 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_30280_210_1881_1_1532872779_3660139_400-254159-0_sp1.1 from training. Duration: 0.936375 2023-10-04 09:27:23,485 WARNING [train.py:1204] (2/4) Exclude cut with ID _40284_210_2084_1_1533513679814_3704279_158-345341-0_sp1.1 from training. Duration: 0.936375 2023-10-04 09:27:28,231 WARNING [train.py:1204] (2/4) Exclude cut with ID 3033-130750-0096-55598-0 from training. Duration: 0.83 2023-10-04 09:27:30,945 WARNING [train.py:1204] (2/4) Exclude cut with ID _9389_210_1664_1_1529217337301_5289269_379-128794-0 from training. Duration: 0.85 2023-10-04 09:27:30,960 WARNING [train.py:1204] (2/4) Exclude cut with ID _3675_210_4557_1_1526639934408_4182089_456-18305-0 from training. Duration: 0.76 2023-10-04 09:27:30,996 WARNING [train.py:1204] (2/4) Exclude cut with ID _29853_210_7497_1_1532505600851_6642110_571-249460-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 09:27:31,075 WARNING [train.py:1204] (2/4) Exclude cut with ID _4068_210_2629_1_1526697350395_1546259_230-325568-0_sp1.1 from training. Duration: 0.92725 2023-10-04 09:27:31,076 WARNING [train.py:1204] (2/4) Exclude cut with ID _34415_210_8750_1_1533085213375_3451116_213-267570-0_sp1.1 from training. Duration: 0.963625 2023-10-04 09:27:32,406 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6767_210_3273_1_1527855783_1718932_142-127132-0_sp1.1 from training. Duration: 0.863625 2023-10-04 09:27:38,078 WARNING [train.py:1204] (2/4) Exclude cut with ID IC0653W0001-322925 from training. Duration: 0.896 2023-10-04 09:27:39,878 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_4528_210_3273_1_1527076654_3276777_209-219804-0_sp1.1 from training. Duration: 0.62725 2023-10-04 09:27:41,318 WARNING [train.py:1204] (2/4) Exclude cut with ID _55844_210_2346_1_1534471212569_7625707_187-204971-0_sp1.1 from training. Duration: 0.936375 2023-10-04 09:27:42,634 WARNING [train.py:1204] (2/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_369-325184-0 from training. Duration: 0.99 2023-10-04 09:27:43,964 WARNING [train.py:1204] (2/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_791-45149-0 from training. Duration: 0.86 2023-10-04 09:27:44,009 WARNING [train.py:1204] (2/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_317-211107-0_sp1.1 from training. Duration: 0.836375 2023-10-04 09:27:45,379 WARNING [train.py:1204] (2/4) Exclude cut with ID _30800_210_22382_1_1532499347904_4515959_488-330913-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 09:27:45,480 WARNING [train.py:1204] (2/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_506-42614-0_sp1.1 from training. Duration: 0.7181875 2023-10-04 09:27:51,558 WARNING [train.py:1204] (2/4) Exclude cut with ID _8696_210_1876_1_1528545632440_3345058_209-54069-0 from training. Duration: 0.67 2023-10-04 09:27:58,118 WARNING [train.py:1204] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_215-202973-0_sp1.1 from training. Duration: 0.8 2023-10-04 09:27:58,193 WARNING [train.py:1204] (2/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_321-215742-0 from training. Duration: 0.5 2023-10-04 09:27:58,920 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.2.encoder.layers.2.self_attn2.whiten, num_groups=1, num_channels=384, metric=19.51 vs. limit=22.5 2023-10-04 09:27:59,555 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_40896_210_15710_1_1533433956_7100802_428-32114-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 09:27:59,563 WARNING [train.py:1204] (2/4) Exclude cut with ID _27013_210_1389_1_1532437173240_3252649_467-263151-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 09:28:00,892 WARNING [train.py:1204] (2/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_734-129761-0_sp0.9 from training. Duration: 0.911125 2023-10-04 09:28:00,937 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_521-214276-0 from training. Duration: 0.44 2023-10-04 09:28:05,097 WARNING [train.py:1204] (2/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_80-147301-0_sp1.1 from training. Duration: 0.763625 2023-10-04 09:28:05,099 WARNING [train.py:1204] (2/4) Exclude cut with ID _9679_210_7497_1_1529129212906_4363929_56-307796-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 09:28:06,549 WARNING [train.py:1204] (2/4) Exclude cut with ID _14832_210_8750_1_1530691351539_3171348_1-54315-0 from training. Duration: 0.87 2023-10-04 09:28:06,550 WARNING [train.py:1204] (2/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_558-155750-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 09:28:08,060 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.0.conv_module2.balancer2.prob, batch_count=1605906.6666666667, ans=0.125 2023-10-04 09:28:09,339 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_142-309370-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 09:28:10,620 WARNING [train.py:1204] (2/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_18-46756-0_sp0.9 from training. Duration: 0.87775 2023-10-04 09:28:10,627 WARNING [train.py:1204] (2/4) Exclude cut with ID _61148_210_6897_1_1535094022726_4217610_279-212159-0_sp1.1 from training. Duration: 0.936375 2023-10-04 09:28:11,938 INFO [train.py:1046] (2/4) Epoch 46, batch 1850, loss[loss=0.1853, simple_loss=0.25, pruned_loss=0.06028, over 19468.00 frames. ], tot_loss[loss=0.1532, simple_loss=0.2337, pruned_loss=0.03635, over 4732786.73 frames. ], batch size: 388, lr: 2.21e-03, grad_scale: 16.0 2023-10-04 09:28:12,021 WARNING [train.py:1204] (2/4) Exclude cut with ID _21987_210_5854_1_1531821500482_3637038_164-2641-0_sp1.1 from training. Duration: 0.936375 2023-10-04 09:28:12,063 WARNING [train.py:1204] (2/4) Exclude cut with ID _8064_210_5399_1_1528372299291_4282973_395-340852-0_sp1.1 from training. Duration: 0.8454375 2023-10-04 09:28:14,047 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1077-184261-0_sp1.1 from training. Duration: 0.963625 2023-10-04 09:28:14,063 WARNING [train.py:1204] (2/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_1176-204992-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 09:28:15,537 WARNING [train.py:1204] (2/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_563-347832-0_sp1.1 from training. Duration: 0.8090625 2023-10-04 09:28:16,912 WARNING [train.py:1204] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_380-204756-0_sp0.9 from training. Duration: 0.9555625 2023-10-04 09:28:18,731 INFO [scaling.py:1118] (2/4) WithLoss: name=encoder.encoders.5.encoder.layers.1.self_attn_weights, loss-sum=0.000e+00 2023-10-04 09:28:23,012 WARNING [train.py:1204] (2/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_212-10501-0_sp1.1 from training. Duration: 0.8545625 2023-10-04 09:28:23,034 WARNING [train.py:1204] (2/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_445-117398-0 from training. Duration: 0.89 2023-10-04 09:28:25,307 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.conv_skip_rate, batch_count=1605973.3333333333, ans=0.0 2023-10-04 09:28:27,639 WARNING [train.py:1204] (2/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_29-280622-0 from training. Duration: 0.82 2023-10-04 09:28:30,366 WARNING [train.py:1204] (2/4) Exclude cut with ID _26664_210_7770_1_1532258943280_3418270_187-80815-0 from training. Duration: 0.93 2023-10-04 09:28:33,210 WARNING [train.py:1204] (2/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_414-349640-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 09:28:33,228 WARNING [train.py:1204] (2/4) Exclude cut with ID _45021_210_9994_1_1533619751094_3330288_96-239010-0 from training. Duration: 0.91 2023-10-04 09:28:33,230 WARNING [train.py:1204] (2/4) Exclude cut with ID _5189_210_4557_1_1527248319428_3769229_199-53526-0_sp0.9 from training. Duration: 0.4666875 2023-10-04 09:28:37,616 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.conv_module1.balancer1.prob, batch_count=1606040.0, ans=0.125 2023-10-04 09:28:43,289 WARNING [train.py:1204] (2/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_63-101996-0_sp1.1 from training. Duration: 0.8 2023-10-04 09:28:46,048 WARNING [train.py:1204] (2/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_199-86432-0 from training. Duration: 0.98 2023-10-04 09:28:50,147 WARNING [train.py:1204] (2/4) Exclude cut with ID _36399_210_16049_1_1532998771377_3698333_207-259013-0_sp1.1 from training. Duration: 0.92725 2023-10-04 09:28:50,209 WARNING [train.py:1204] (2/4) Exclude cut with ID _62574_210_2655_1_1535180407058_3933249_567-111756-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 09:28:51,829 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.conv_module1.balancer1.prob, batch_count=1606106.6666666667, ans=0.125 2023-10-04 09:28:55,143 WARNING [train.py:1204] (2/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_436-318113-0 from training. Duration: 0.75 2023-10-04 09:28:55,197 WARNING [train.py:1204] (2/4) Exclude cut with ID _53379_210_20034_1_1534222027363_3854654_43-305214-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 09:28:55,211 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_15126_210_6753_1_1530778677_2591185_113-157651-0_sp1.1 from training. Duration: 0.6181875 2023-10-04 09:28:58,962 WARNING [train.py:1204] (2/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_146-218763-0_sp1.1 from training. Duration: 0.7818125 2023-10-04 09:29:00,502 WARNING [train.py:1204] (2/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_714-310538-0_sp0.9 from training. Duration: 0.9555625 2023-10-04 09:29:01,974 WARNING [train.py:1204] (2/4) Exclude cut with ID _24271_210_12658_1_1531987270022_7190519_709-16721-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 09:29:04,688 WARNING [train.py:1204] (2/4) Exclude cut with ID _5190_210_4557_1_1527244368799_373439_28-197030-0_sp0.9 from training. Duration: 0.8 2023-10-04 09:29:06,026 WARNING [train.py:1204] (2/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_267-197536-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 09:29:06,050 WARNING [train.py:1204] (2/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_658-282610-0_sp1.1 from training. Duration: 0.4818125 2023-10-04 09:29:06,067 WARNING [train.py:1204] (2/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_257-67050-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 09:29:07,498 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_14906_210_7927_1_1530754190_9408633_961-53402-0_sp1.1 from training. Duration: 0.936375 2023-10-04 09:29:08,827 WARNING [train.py:1204] (2/4) Exclude cut with ID _60113_210_16271_1_1535164212481_4386079_490-280290-0_sp1.1 from training. Duration: 0.8909375 2023-10-04 09:29:11,655 WARNING [train.py:1204] (2/4) Exclude cut with ID _14100_210_8341_1_1530529320613_3678069_258-224599-0 from training. Duration: 0.77 2023-10-04 09:29:13,006 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6961_210_2087_1_1527937168_6815011_565-326898-0_sp1.1 from training. Duration: 0.936375 2023-10-04 09:29:14,557 WARNING [train.py:1204] (2/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_239-316802-0_sp1.1 from training. Duration: 0.62725 2023-10-04 09:29:16,467 WARNING [train.py:1204] (2/4) Exclude cut with ID _55857_210_2346_1_1534644007831_6898209_745-98391-0_sp1.1 from training. Duration: 0.6909375 2023-10-04 09:29:16,478 WARNING [train.py:1204] (2/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_154-135696-0 from training. Duration: 0.8 2023-10-04 09:29:16,480 WARNING [train.py:1204] (2/4) Exclude cut with ID _14368_210_4220_1_1530532714870_3595529_93-58002-0 from training. Duration: 0.57 2023-10-04 09:29:19,084 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9025W0005-507335 from training. Duration: 0.896 2023-10-04 09:29:19,153 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9029W0034-509435 from training. Duration: 0.64 2023-10-04 09:29:20,631 WARNING [train.py:1204] (2/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_76-203867-0_sp1.1 from training. Duration: 0.6545625 2023-10-04 09:29:20,644 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_46639_210_8341_1_1533726088_4495500_66-346325-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 09:29:20,652 WARNING [train.py:1204] (2/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_145-137790-0_sp0.9 from training. Duration: 0.92225 2023-10-04 09:29:20,683 WARNING [train.py:1204] (2/4) Exclude cut with ID _36522_210_8887_1_1533177030611_1772478_7-327299-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 09:29:20,758 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9042W0009-515615 from training. Duration: 0.896 2023-10-04 09:29:20,902 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.conv_module1.balancer1.max_abs, batch_count=1606240.0, ans=10.0 2023-10-04 09:29:21,808 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.675e+02 1.971e+02 2.198e+02 2.495e+02 3.601e+02, threshold=4.397e+02, percent-clipped=0.0 2023-10-04 09:29:21,876 WARNING [train.py:1204] (2/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_39-248870-0_sp0.9 from training. Duration: 0.7666875 2023-10-04 09:29:21,919 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_35441_210_16554_1_1533347572_4092680_362-242912-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 09:29:23,818 WARNING [train.py:1204] (2/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_216-343007-0_sp1.1 from training. Duration: 0.6 2023-10-04 09:29:23,894 WARNING [train.py:1204] (2/4) Exclude cut with ID _3867_210_1883_1_1526641200575_4475830_272-215563-0_sp0.9 from training. Duration: 0.6333125 2023-10-04 09:29:26,012 WARNING [train.py:1204] (2/4) Exclude cut with ID _21783_210_11357_1_1531742458730_3762679_553-175603-0_sp1.1 from training. Duration: 0.92725 2023-10-04 09:29:26,021 WARNING [train.py:1204] (2/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_963-187109-0 from training. Duration: 0.99 2023-10-04 09:29:27,239 INFO [train.py:1046] (2/4) Epoch 46, batch 1900, loss[loss=0.1573, simple_loss=0.2395, pruned_loss=0.03751, over 24095.00 frames. ], tot_loss[loss=0.1527, simple_loss=0.2333, pruned_loss=0.03607, over 4721515.85 frames. ], batch size: 86, lr: 2.21e-03, grad_scale: 16.0 2023-10-04 09:29:28,685 WARNING [train.py:1204] (2/4) Exclude cut with ID _44973_210_1511_1_1533794381163_3634770_495-211466-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 09:29:28,707 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9068W0002-529085 from training. Duration: 0.896 2023-10-04 09:29:29,363 WARNING [train.py:1204] (2/4) Exclude cut with ID _5294_210_1747_1_1527249643928_3696179_180-100713-0_sp1.1 from training. Duration: 0.736375 2023-10-04 09:29:30,668 WARNING [train.py:1204] (2/4) Exclude cut with ID _35799_210_5854_1_1533263487887_3529071_149-49689-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 09:29:34,870 WARNING [train.py:1204] (2/4) Exclude cut with ID _67725_210_27767_1_1535608756671_5105359_36-220794-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 09:29:36,187 WARNING [train.py:1204] (2/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_338-96520-0_sp1.1 from training. Duration: 0.8545625 2023-10-04 09:29:36,258 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9104W0004-547160 from training. Duration: 0.896 2023-10-04 09:29:37,556 WARNING [train.py:1204] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_330-327861-0 from training. Duration: 0.67 2023-10-04 09:29:37,702 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.feed_forward2.hidden_balancer.prob, batch_count=1606306.6666666667, ans=0.125 2023-10-04 09:29:38,940 WARNING [train.py:1204] (2/4) Exclude cut with ID _14087_210_2253_1_1530529224512_3014029_156-257607-0_sp0.9 from training. Duration: 0.92225 2023-10-04 09:29:38,996 WARNING [train.py:1204] (2/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_476-322050-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 09:29:39,005 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9118W0017-553385 from training. Duration: 0.896 2023-10-04 09:29:39,029 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9120W0028-554435 from training. Duration: 0.896 2023-10-04 09:29:42,985 WARNING [train.py:1204] (2/4) Exclude cut with ID _56524_210_9642_1_1534572011779_3619170_145-310842-0 from training. Duration: 0.99 2023-10-04 09:29:44,994 WARNING [train.py:1204] (2/4) Exclude cut with ID _51631_210_8254_1_1534129228420_4084794_270-222508-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 09:29:49,064 WARNING [train.py:1204] (2/4) Exclude cut with ID _14268_210_9206_1_1530622771285_4714099_331-105351-0 from training. Duration: 0.65 2023-10-04 09:29:52,365 WARNING [train.py:1204] (2/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_40-129756-0 from training. Duration: 0.81 2023-10-04 09:30:03,234 WARNING [train.py:1204] (2/4) Exclude cut with ID _3541_210_2477_1_1526475408709_3263973_231-104736-0 from training. Duration: 0.78 2023-10-04 09:30:04,665 WARNING [train.py:1204] (2/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_214-291369-0 from training. Duration: 0.63 2023-10-04 09:30:04,672 WARNING [train.py:1204] (2/4) Exclude cut with ID _62574_210_2655_1_1535180407058_3933249_369-84347-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 09:30:06,063 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9210W0111-599240 from training. Duration: 0.896 2023-10-04 09:30:06,076 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_92-246270-0 from training. Duration: 0.85 2023-10-04 09:30:06,101 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_37152_210_9697_1_1533106573_3844188_29-239070-0 from training. Duration: 0.98 2023-10-04 09:30:07,448 WARNING [train.py:1204] (2/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_239-22076-0 from training. Duration: 0.85 2023-10-04 09:30:07,458 WARNING [train.py:1204] (2/4) Exclude cut with ID _65962_210_29927_1_1535436026519_4174930_368-44639-0_sp1.1 from training. Duration: 0.97275 2023-10-04 09:30:11,596 WARNING [train.py:1204] (2/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_152-212533-0 from training. Duration: 0.97 2023-10-04 09:30:13,040 WARNING [train.py:1204] (2/4) Exclude cut with ID _14603_210_4220_1_1530615688441_3097319_90-31383-0_sp1.1 from training. Duration: 0.7909375 2023-10-04 09:30:16,335 WARNING [train.py:1204] (2/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_196-152868-0_sp1.1 from training. Duration: 0.92725 2023-10-04 09:30:16,337 WARNING [train.py:1204] (2/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_140-181176-0 from training. Duration: 0.92 2023-10-04 09:30:17,872 WARNING [train.py:1204] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_168-32906-0_sp0.9 from training. Duration: 0.7666875 2023-10-04 09:30:22,518 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.5.encoder.layers.0.feed_forward1.out_whiten, num_groups=1, num_channels=256, metric=6.19 vs. limit=15.0 2023-10-04 09:30:23,843 WARNING [train.py:1204] (2/4) Exclude cut with ID _14922_210_6228_1_1530707435273_3693310_13-165512-0 from training. Duration: 0.82 2023-10-04 09:30:23,889 WARNING [train.py:1204] (2/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_381-317791-0_sp0.9 from training. Duration: 0.811125 2023-10-04 09:30:29,578 WARNING [train.py:1204] (2/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_63-267585-0_sp1.1 from training. Duration: 0.8181875 2023-10-04 09:30:29,580 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_326-303945-0_sp1.1 from training. Duration: 0.9 2023-10-04 09:30:29,593 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_194-143381-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 09:30:29,682 WARNING [train.py:1204] (2/4) Exclude cut with ID _39290_210_4220_1_1533862778760_3075118_194-16667-0_sp1.1 from training. Duration: 0.963625 2023-10-04 09:30:30,546 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.self_attn_weights.pos_emb_skip_rate, batch_count=1606573.3333333333, ans=0.0 2023-10-04 09:30:31,601 WARNING [train.py:1204] (2/4) Exclude cut with ID _15012_210_1968_1_1530856820429_5095350_662-210469-0_sp0.9 from training. Duration: 0.6333125 2023-10-04 09:30:32,254 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.3.encoder.layers.2.whiten, num_groups=1, num_channels=512, metric=4.20 vs. limit=12.0 2023-10-04 09:30:32,894 WARNING [train.py:1204] (2/4) Exclude cut with ID _4697_210_2069_1_1526815801953_3823098_68-219807-0_sp1.1 from training. Duration: 0.42725 2023-10-04 09:30:32,964 WARNING [train.py:1204] (2/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_321-22821-0_sp0.9 from training. Duration: 0.988875 2023-10-04 09:30:35,639 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_14056_210_4163_1_1530604483_7572827_1037-45317-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 09:30:35,641 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_403-39619-0_sp1.1 from training. Duration: 0.763625 2023-10-04 09:30:35,886 INFO [scaling.py:1118] (2/4) WithLoss: name=encoder.encoders.4.encoder.layers.1.self_attn_weights, loss-sum=0.000e+00 2023-10-04 09:30:38,421 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_20120_210_6753_1_1531643035_5482859_510-96996-0_sp0.9 from training. Duration: 0.9666875 2023-10-04 09:30:38,424 WARNING [train.py:1204] (2/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_225-180155-0_sp1.1 from training. Duration: 0.92725 2023-10-04 09:30:39,693 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_278-272612-0_sp0.9 from training. Duration: 0.811125 2023-10-04 09:30:39,916 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.0.conv_module1.balancer2.min_positive, batch_count=1606640.0, ans=0.05 2023-10-04 09:30:41,081 INFO [train.py:1046] (2/4) Epoch 46, batch 1950, loss[loss=0.1485, simple_loss=0.2319, pruned_loss=0.03254, over 24480.00 frames. ], tot_loss[loss=0.1539, simple_loss=0.2344, pruned_loss=0.03667, over 4715110.02 frames. ], batch size: 63, lr: 2.21e-03, grad_scale: 16.0 2023-10-04 09:30:41,129 WARNING [train.py:1204] (2/4) Exclude cut with ID _53713_210_24116_1_1534314487762_7416751_691-70244-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 09:30:43,940 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6788_210_1388_1_1527898698_4562021_91-197981-0_sp1.1 from training. Duration: 0.7545625 2023-10-04 09:30:45,339 WARNING [train.py:1204] (2/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_60-18123-0_sp0.9 from training. Duration: 0.97775 2023-10-04 09:30:45,387 WARNING [train.py:1204] (2/4) Exclude cut with ID _35799_210_5854_1_1533263487887_3529071_5-214617-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 09:30:46,951 WARNING [train.py:1204] (2/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_890-5117-0_sp1.1 from training. Duration: 0.7090625 2023-10-04 09:30:48,844 WARNING [train.py:1204] (2/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_660-211088-0 from training. Duration: 0.62 2023-10-04 09:30:50,170 WARNING [train.py:1204] (2/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_314-126819-0_sp0.9 from training. Duration: 0.5555625 2023-10-04 09:30:51,418 WARNING [train.py:1204] (2/4) Exclude cut with ID _21632_210_2253_1_1531994001765_3744140_362-66225-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 09:30:51,517 WARNING [train.py:1204] (2/4) Exclude cut with ID _61118_210_9527_1_1534897668884_3752630_173-258555-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 09:30:54,796 WARNING [train.py:1204] (2/4) Exclude cut with ID _7719_210_3525_1_1528456153163_4296960_88-250259-0_sp0.9 from training. Duration: 0.9333125 2023-10-04 09:30:56,081 WARNING [train.py:1204] (2/4) Exclude cut with ID _15140_210_9288_1_1530791388450_4880149_539-153708-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 09:30:56,103 WARNING [train.py:1204] (2/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_253-42544-0_sp1.1 from training. Duration: 0.97275 2023-10-04 09:30:57,657 WARNING [train.py:1204] (2/4) Exclude cut with ID _33640_210_2655_1_1533117605479_4855469_479-296877-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 09:31:00,367 WARNING [train.py:1204] (2/4) Exclude cut with ID 3033-130750-0096-55598-0_sp1.1 from training. Duration: 0.7545625 2023-10-04 09:31:00,378 WARNING [train.py:1204] (2/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_191-41023-0_sp1.1 from training. Duration: 0.6454375 2023-10-04 09:31:00,399 WARNING [train.py:1204] (2/4) Exclude cut with ID _8365_210_2064_1_1528592311070_4677690_431-294102-0_sp1.1 from training. Duration: 0.8090625 2023-10-04 09:31:02,399 WARNING [train.py:1204] (2/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_893-296030-0_sp1.1 from training. Duration: 0.97275 2023-10-04 09:31:05,021 WARNING [train.py:1204] (2/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_401-343820-0_sp1.1 from training. Duration: 0.97275 2023-10-04 09:31:07,280 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.conv_skip_rate, batch_count=1606706.6666666667, ans=0.0 2023-10-04 09:31:08,342 WARNING [train.py:1204] (2/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_127-566-0_sp1.1 from training. Duration: 0.763625 2023-10-04 09:31:08,343 WARNING [train.py:1204] (2/4) Exclude cut with ID _14100_210_8341_1_1530529320613_3678069_126-272847-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 09:31:08,366 WARNING [train.py:1204] (2/4) Exclude cut with ID _15133_210_6228_1_1530794512074_3080379_29-264458-0_sp1.1 from training. Duration: 0.52725 2023-10-04 09:31:08,366 WARNING [train.py:1204] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_127-119109-0 from training. Duration: 0.72 2023-10-04 09:31:08,418 WARNING [train.py:1204] (2/4) Exclude cut with ID _6537_210_1883_1_1527987559710_3597129_382-133062-0_sp1.1 from training. Duration: 0.6181875 2023-10-04 09:31:09,712 WARNING [train.py:1204] (2/4) Exclude cut with ID _29529_210_7234_1_1532393961455_3977460_391-219682-0_sp1.1 from training. Duration: 0.936375 2023-10-04 09:31:09,776 WARNING [train.py:1204] (2/4) Exclude cut with ID _37537_210_2253_1_1533171758568_3421949_96-137189-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 09:31:13,864 WARNING [train.py:1204] (2/4) Exclude cut with ID _58414_210_8254_1_1535421609162_7151182_15-276301-0_sp1.1 from training. Duration: 0.97275 2023-10-04 09:31:16,557 WARNING [train.py:1204] (2/4) Exclude cut with ID _59502_210_12210_1_1535107024698_1835139_110-289074-0_sp1.1 from training. Duration: 0.92725 2023-10-04 09:31:18,566 WARNING [train.py:1204] (2/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_367-61330-0_sp1.1 from training. Duration: 0.6818125 2023-10-04 09:31:21,444 WARNING [train.py:1204] (2/4) Exclude cut with ID _45297_210_2721_1_1533715351381_3264949_353-9541-0_sp1.1 from training. Duration: 0.8818125 2023-10-04 09:31:21,486 WARNING [train.py:1204] (2/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_85-138257-0_sp0.9 from training. Duration: 0.788875 2023-10-04 09:31:22,825 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_3787_210_2040_1_1526477544_5478586_603-120324-0 from training. Duration: 0.81 2023-10-04 09:31:22,863 WARNING [train.py:1204] (2/4) Exclude cut with ID _42863_210_9070_1_1533902398788_3696920_411-204131-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 09:31:23,025 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.conv_module2.balancer2.prob, batch_count=1606773.3333333333, ans=0.125 2023-10-04 09:31:27,432 WARNING [train.py:1204] (2/4) Exclude cut with ID _30196_210_1664_1_1533708496522_7074140_822-289909-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 09:31:28,833 WARNING [train.py:1204] (2/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_32-335103-0_sp0.9 from training. Duration: 0.911125 2023-10-04 09:31:30,211 WARNING [train.py:1204] (2/4) Exclude cut with ID _6493_210_1747_1_1528459210424_4012470_513-140967-0_sp0.9 from training. Duration: 0.87775 2023-10-04 09:31:31,088 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.feed_forward1.out_proj.dropout_p, batch_count=1606840.0, ans=0.1 2023-10-04 09:31:38,286 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_47896_210_20508_1_1534492518_5958868_439-257766-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 09:31:39,631 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_1294-63152-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 09:31:42,370 WARNING [train.py:1204] (2/4) Exclude cut with ID _59170_210_6721_1_1534983633960_4203335_319-166512-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 09:31:43,813 WARNING [train.py:1204] (2/4) Exclude cut with ID _3541_210_2477_1_1526475408709_3263973_146-3470-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 09:31:46,449 WARNING [train.py:1204] (2/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_678-159307-0_sp1.1 from training. Duration: 0.77275 2023-10-04 09:31:46,483 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_343-48822-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 09:31:46,542 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_4692_210_3528_1_1526899755_4323254_107-44401-0 from training. Duration: 0.86 2023-10-04 09:31:46,547 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6291_210_3273_1_1527683649_1078498_74-292608-0_sp1.1 from training. Duration: 0.6545625 2023-10-04 09:31:47,872 WARNING [train.py:1204] (2/4) Exclude cut with ID _55644_210_11856_1_1534593599582_4103719_293-248694-0_sp1.1 from training. Duration: 0.97275 2023-10-04 09:31:49,219 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_3172_210_2087_1_1526292929_3987865_197-97762-0 from training. Duration: 0.75 2023-10-04 09:31:50,522 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.662e+02 2.065e+02 2.286e+02 2.732e+02 4.457e+02, threshold=4.573e+02, percent-clipped=1.0 2023-10-04 09:31:50,644 WARNING [train.py:1204] (2/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_264-348896-0_sp0.9 from training. Duration: 0.9444375 2023-10-04 09:31:55,467 INFO [train.py:1046] (2/4) Epoch 46, batch 2000, loss[loss=0.1475, simple_loss=0.2213, pruned_loss=0.0369, over 23650.00 frames. ], tot_loss[loss=0.1552, simple_loss=0.2358, pruned_loss=0.03736, over 4702230.02 frames. ], batch size: 149, lr: 2.21e-03, grad_scale: 32.0 2023-10-04 09:31:55,569 WARNING [train.py:1204] (2/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_209-205365-0_sp0.9 from training. Duration: 0.87775 2023-10-04 09:31:55,646 WARNING [train.py:1204] (2/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_222-198127-0_sp0.9 from training. Duration: 0.9333125 2023-10-04 09:31:56,971 WARNING [train.py:1204] (2/4) Exclude cut with ID _57572_210_5517_1_1534921219624_3809484_52-43530-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 09:31:58,235 WARNING [train.py:1204] (2/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_96-320736-0_sp1.1 from training. Duration: 0.7818125 2023-10-04 09:31:59,654 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_13091_210_7925_1_1530609447_8987354_1159-225508-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 09:32:02,846 WARNING [train.py:1204] (2/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_237-44332-0 from training. Duration: 0.94 2023-10-04 09:32:02,893 WARNING [train.py:1204] (2/4) Exclude cut with ID _7970_210_1883_1_1528455616296_3472230_25-178407-0_sp0.9 from training. Duration: 0.788875 2023-10-04 09:32:06,133 WARNING [train.py:1204] (2/4) Exclude cut with ID _29003_210_19596_1_1532507391720_3894098_357-257062-0_sp1.1 from training. Duration: 0.936375 2023-10-04 09:32:07,576 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_809-17156-0 from training. Duration: 0.73 2023-10-04 09:32:08,885 WARNING [train.py:1204] (2/4) Exclude cut with ID _7160_210_1876_1_1527940923231_3703799_302-150728-0_sp1.1 from training. Duration: 0.7090625 2023-10-04 09:32:10,141 WARNING [train.py:1204] (2/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_444-28041-0_sp0.9 from training. Duration: 0.9444375 2023-10-04 09:32:11,520 WARNING [train.py:1204] (2/4) Exclude cut with ID _52892_210_6721_1_1534378941259_4124129_278-251666-0_sp1.1 from training. Duration: 0.92725 2023-10-04 09:32:12,889 WARNING [train.py:1204] (2/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_115-326778-0 from training. Duration: 0.93 2023-10-04 09:32:13,428 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.4.encoder.layers.1.conv_module1.whiten, num_groups=1, num_channels=384, metric=4.39 vs. limit=15.0 2023-10-04 09:32:14,316 WARNING [train.py:1204] (2/4) Exclude cut with ID _41391_210_7994_1_1533603771872_3638201_31-188961-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 09:32:17,039 WARNING [train.py:1204] (2/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_1073-6264-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 09:32:17,059 WARNING [train.py:1204] (2/4) Exclude cut with ID _66868_210_9527_1_1535509287954_4561140_435-259515-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 09:32:17,159 WARNING [train.py:1204] (2/4) Exclude cut with ID _14886_210_4220_1_1530702020355_3516190_159-73748-0 from training. Duration: 0.83 2023-10-04 09:32:18,535 WARNING [train.py:1204] (2/4) Exclude cut with ID _15150_210_8341_1_1530752411803_7256384_469-239509-0_sp1.1 from training. Duration: 0.6181875 2023-10-04 09:32:19,852 WARNING [train.py:1204] (2/4) Exclude cut with ID _8064_210_5399_1_1528372299291_4282973_395-340852-0 from training. Duration: 0.93 2023-10-04 09:32:19,855 WARNING [train.py:1204] (2/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_138-187031-0_sp1.1 from training. Duration: 0.9 2023-10-04 09:32:23,145 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_13757_210_3528_1_1530440682_4107239_48-106347-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 09:32:25,094 WARNING [train.py:1204] (2/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_164-224052-0_sp1.1 from training. Duration: 0.636375 2023-10-04 09:32:25,098 WARNING [train.py:1204] (2/4) Exclude cut with ID _59288_210_5854_1_1535077757774_3412246_408-322648-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 09:32:25,154 WARNING [train.py:1204] (2/4) Exclude cut with ID _30191_210_1664_1_1533189915047_7196670_827-36359-0_sp1.1 from training. Duration: 0.963625 2023-10-04 09:32:26,541 WARNING [train.py:1204] (2/4) Exclude cut with ID _4680_210_3525_1_1527249757953_893386_75-78979-0_sp1.1 from training. Duration: 0.82725 2023-10-04 09:32:27,858 WARNING [train.py:1204] (2/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_383-165234-0 from training. Duration: 0.94 2023-10-04 09:32:28,039 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.balancer2.prob, batch_count=1607106.6666666667, ans=0.125 2023-10-04 09:32:28,047 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.feed_forward3.hidden_balancer.prob, batch_count=1607106.6666666667, ans=0.125 2023-10-04 09:32:30,632 WARNING [train.py:1204] (2/4) Exclude cut with ID _19896_210_1883_1_1531709022662_6649569_94-229927-0 from training. Duration: 0.97 2023-10-04 09:32:30,645 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6788_210_1388_1_1527898698_4562021_180-75288-0_sp1.1 from training. Duration: 0.9 2023-10-04 09:32:30,652 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_26433_210_12108_1_1532157158_4056480_54-321349-0_sp1.1 from training. Duration: 0.97275 2023-10-04 09:32:35,416 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.self_attn_weights.pos_emb_skip_rate, batch_count=1607106.6666666667, ans=0.0 2023-10-04 09:32:38,387 WARNING [train.py:1204] (2/4) Exclude cut with ID _56108_210_16380_1_1534505446159_3887959_265-161493-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 09:32:39,725 WARNING [train.py:1204] (2/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_395-111602-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 09:32:39,731 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_154-316450-0_sp0.9 from training. Duration: 0.8444375 2023-10-04 09:32:41,114 WARNING [train.py:1204] (2/4) Exclude cut with ID _43552_210_8913_1_1533726031059_3523429_381-116041-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 09:32:42,504 WARNING [train.py:1204] (2/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_65-104124-0_sp1.1 from training. Duration: 0.963625 2023-10-04 09:32:42,565 WARNING [train.py:1204] (2/4) Exclude cut with ID _6536_210_1883_1_1527850783339_3319069_169-23811-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 09:32:42,601 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_289-180904-0_sp0.9 from training. Duration: 0.8444375 2023-10-04 09:32:42,611 WARNING [train.py:1204] (2/4) Exclude cut with ID _56406_210_9288_1_1534899618664_3496149_226-242400-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 09:32:44,022 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_24899_210_16554_1_1532046422_3827462_276-216330-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 09:32:46,876 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.balancer1.prob, batch_count=1607173.3333333333, ans=0.125 2023-10-04 09:32:48,017 WARNING [train.py:1204] (2/4) Exclude cut with ID _29345_210_4381_1_1532689168263_3860169_198-174328-0_sp1.1 from training. Duration: 0.82725 2023-10-04 09:32:48,073 WARNING [train.py:1204] (2/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_338-96520-0 from training. Duration: 0.94 2023-10-04 09:32:52,146 WARNING [train.py:1204] (2/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_7-111954-0_sp1.1 from training. Duration: 0.5090625 2023-10-04 09:32:53,541 WARNING [train.py:1204] (2/4) Exclude cut with ID _36692_210_5665_1_1533608967420_398809_8-16261-0_sp1.1 from training. Duration: 0.97275 2023-10-04 09:32:59,490 WARNING [train.py:1204] (2/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_86-212657-0_sp1.1 from training. Duration: 0.97275 2023-10-04 09:32:59,497 WARNING [train.py:1204] (2/4) Exclude cut with ID _44659_210_2721_1_1534036120061_6614860_626-298884-0_sp1.1 from training. Duration: 0.8818125 2023-10-04 09:33:01,244 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.feed_forward1.out_proj.dropout_p, batch_count=1607240.0, ans=0.1 2023-10-04 09:33:02,363 WARNING [train.py:1204] (2/4) Exclude cut with ID _49755_210_18053_1_1534581005846_3218709_120-257918-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 09:33:04,436 WARNING [train.py:1204] (2/4) Exclude cut with ID _21881_210_3525_1_1531825242757_3798380_400-113821-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 09:33:04,439 WARNING [train.py:1204] (2/4) Exclude cut with ID _21783_210_11357_1_1531742458730_3762679_387-236773-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 09:33:04,503 WARNING [train.py:1204] (2/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_613-232856-0_sp1.1 from training. Duration: 0.4909375 2023-10-04 09:33:05,871 WARNING [train.py:1204] (2/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_181-78632-0_sp0.9 from training. Duration: 0.8666875 2023-10-04 09:33:08,409 INFO [train.py:1046] (2/4) Epoch 46, batch 2050, loss[loss=0.1742, simple_loss=0.2435, pruned_loss=0.05247, over 23760.00 frames. ], tot_loss[loss=0.1551, simple_loss=0.2353, pruned_loss=0.03746, over 4691556.57 frames. ], batch size: 179, lr: 2.21e-03, grad_scale: 32.0 2023-10-04 09:33:08,470 WARNING [train.py:1204] (2/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_175-17485-0_sp1.1 from training. Duration: 0.97275 2023-10-04 09:33:10,135 WARNING [train.py:1204] (2/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_1004-183422-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 09:33:12,961 WARNING [train.py:1204] (2/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_32-65962-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 09:33:14,271 WARNING [train.py:1204] (2/4) Exclude cut with ID _15458_210_8341_1_1530858614263_3834410_40-298664-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 09:33:14,973 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.5.encoder.layers.0.feed_forward1.out_whiten, num_groups=1, num_channels=256, metric=5.94 vs. limit=15.0 2023-10-04 09:33:18,598 WARNING [train.py:1204] (2/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_380-261863-0_sp1.1 from training. Duration: 0.963625 2023-10-04 09:33:20,031 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_372-173012-0_sp1.1 from training. Duration: 0.863625 2023-10-04 09:33:20,085 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_57-82205-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 09:33:21,452 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_3691_210_3528_1_1526468169_4062053_355-334432-0_sp0.9 from training. Duration: 0.9666875 2023-10-04 09:33:22,937 WARNING [train.py:1204] (2/4) Exclude cut with ID _19023_210_4353_1_1531378892552_5820780_28-36615-0 from training. Duration: 0.97 2023-10-04 09:33:22,947 WARNING [train.py:1204] (2/4) Exclude cut with ID _29853_210_7497_1_1532505600851_6642110_286-251333-0_sp1.1 from training. Duration: 0.92725 2023-10-04 09:33:24,399 WARNING [train.py:1204] (2/4) Exclude cut with ID _31602_210_5854_1_1532677021456_4458250_518-80679-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 09:33:25,734 WARNING [train.py:1204] (2/4) Exclude cut with ID _14922_210_6228_1_1530707435273_3693310_38-187851-0_sp0.9 from training. Duration: 0.688875 2023-10-04 09:33:37,132 WARNING [train.py:1204] (2/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_39-248870-0_sp1.1 from training. Duration: 0.62725 2023-10-04 09:33:37,150 WARNING [train.py:1204] (2/4) Exclude cut with ID _16908_210_5724_1_1531119157248_3772700_154-31455-0_sp1.1 from training. Duration: 0.97275 2023-10-04 09:33:39,825 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8453_210_4211_1_1528502940_8562730_347-330673-0 from training. Duration: 0.72 2023-10-04 09:33:41,341 WARNING [train.py:1204] (2/4) Exclude cut with ID _30191_210_1664_1_1533189915047_7196670_674-204247-0_sp1.1 from training. Duration: 0.97275 2023-10-04 09:33:43,257 WARNING [train.py:1204] (2/4) Exclude cut with ID _8833_210_5219_1_1528592665210_7553059_329-125156-0 from training. Duration: 0.82 2023-10-04 09:33:43,288 WARNING [train.py:1204] (2/4) Exclude cut with ID _4779_210_2084_1_1527318022944_3914893_500-285013-0_sp1.1 from training. Duration: 0.62725 2023-10-04 09:33:44,790 WARNING [train.py:1204] (2/4) Exclude cut with ID _14421_210_8750_1_1530604795825_3542290_190-313578-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 09:33:46,225 WARNING [train.py:1204] (2/4) Exclude cut with ID _35799_210_5854_1_1533263487887_3529071_538-273574-0_sp1.1 from training. Duration: 0.936375 2023-10-04 09:33:47,561 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_14928_210_5632_1_1530773461_4714000_454-112198-0_sp1.1 from training. Duration: 0.6 2023-10-04 09:33:47,606 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_468-143126-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 09:33:50,337 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_4528_210_3273_1_1527076654_3276777_258-121681-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 09:33:52,916 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_397-132565-0_sp1.1 from training. Duration: 0.9 2023-10-04 09:33:52,935 WARNING [train.py:1204] (2/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_454-123002-0_sp0.9 from training. Duration: 0.7666875 2023-10-04 09:33:54,417 WARNING [train.py:1204] (2/4) Exclude cut with ID _27052_210_5986_1_1532329197187_3585009_297-86890-0_sp1.1 from training. Duration: 0.936375 2023-10-04 09:33:55,950 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_51225_210_3601_1_1534400634_2327383_55-301844-0_sp1.1 from training. Duration: 0.8454375 2023-10-04 09:33:59,187 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6786_210_1388_1_1527839699_4194495_378-46070-0_sp0.9 from training. Duration: 0.8 2023-10-04 09:33:59,285 WARNING [train.py:1204] (2/4) Exclude cut with ID _42334_210_7770_1_1534206610396_3605880_55-19758-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 09:34:03,361 WARNING [train.py:1204] (2/4) Exclude cut with ID _14884_210_2252_1_1530707459689_5561250_114-201399-0_sp0.9 from training. Duration: 0.8444375 2023-10-04 09:34:05,702 INFO [scaling.py:1022] (2/4) Whitening: name=encoder_embed.out_whiten, num_groups=1, num_channels=192, metric=6.76 vs. limit=8.0 2023-10-04 09:34:08,072 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_99-251314-0_sp0.9 from training. Duration: 0.9666875 2023-10-04 09:34:09,433 WARNING [train.py:1204] (2/4) Exclude cut with ID _26528_210_9070_1_1532570410842_3851310_497-97218-0 from training. Duration: 0.86 2023-10-04 09:34:15,288 WARNING [train.py:1204] (2/4) Exclude cut with ID _55328_210_27767_1_1534580973260_4088170_230-317241-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 09:34:16,650 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_125-203105-0_sp0.9 from training. Duration: 0.888875 2023-10-04 09:34:18,172 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.591e+02 2.010e+02 2.178e+02 2.574e+02 3.822e+02, threshold=4.355e+02, percent-clipped=0.0 2023-10-04 09:34:18,239 WARNING [train.py:1204] (2/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_55-31904-0_sp1.1 from training. Duration: 0.8 2023-10-04 09:34:18,435 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.feed_forward1.hidden_balancer.prob, batch_count=1607573.3333333333, ans=0.125 2023-10-04 09:34:19,192 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.1.encoder.layers.0.self_attn2.whiten, num_groups=1, num_channels=256, metric=14.10 vs. limit=22.5 2023-10-04 09:34:19,707 WARNING [train.py:1204] (2/4) Exclude cut with ID _8064_210_5399_1_1528372299291_4282973_18-174647-0 from training. Duration: 0.91 2023-10-04 09:34:21,979 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.1.encoder.layers.0.feed_forward3.out_whiten, num_groups=1, num_channels=256, metric=14.20 vs. limit=15.0 2023-10-04 09:34:22,378 INFO [train.py:1046] (2/4) Epoch 46, batch 2100, loss[loss=0.1576, simple_loss=0.2462, pruned_loss=0.03444, over 24438.00 frames. ], tot_loss[loss=0.1539, simple_loss=0.2344, pruned_loss=0.03675, over 4707919.15 frames. ], batch size: 69, lr: 2.21e-03, grad_scale: 32.0 2023-10-04 09:34:23,818 WARNING [train.py:1204] (2/4) Exclude cut with ID IC0165W0005-81726 from training. Duration: 0.952 2023-10-04 09:34:23,819 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_31583_210_3852_1_1532595851_4024413_148-171847-0_sp1.1 from training. Duration: 0.963625 2023-10-04 09:34:23,963 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.1.conv_module1.balancer1.prob, batch_count=1607640.0, ans=0.125 2023-10-04 09:34:25,112 WARNING [train.py:1204] (2/4) Exclude cut with ID _21989_210_5854_1_1531899214931_3271360_116-138769-0_sp1.1 from training. Duration: 0.936375 2023-10-04 09:34:25,152 WARNING [train.py:1204] (2/4) Exclude cut with ID _66070_210_7640_1_1535441393096_7805310_489-292631-0_sp1.1 from training. Duration: 0.7181875 2023-10-04 09:34:26,441 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_20120_210_6753_1_1531643035_5482859_202-27960-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 09:34:26,448 WARNING [train.py:1204] (2/4) Exclude cut with ID _5294_210_1747_1_1527249643928_3696179_342-270278-0 from training. Duration: 0.55 2023-10-04 09:34:26,491 WARNING [train.py:1204] (2/4) Exclude cut with ID _38323_210_12108_1_1533121233369_3303049_416-11919-0 from training. Duration: 0.96 2023-10-04 09:34:28,462 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6545_210_2040_1_1527774670_4245258_407-48070-0_sp0.9 from training. Duration: 0.8444375 2023-10-04 09:34:31,844 WARNING [train.py:1204] (2/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_503-30719-0_sp1.1 from training. Duration: 0.8909375 2023-10-04 09:34:33,145 WARNING [train.py:1204] (2/4) Exclude cut with ID _49643_210_7989_1_1534640244224_3939031_234-326497-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 09:34:35,934 WARNING [train.py:1204] (2/4) Exclude cut with ID _14170_210_5219_1_1530525949868_4469650_301-77873-0_sp1.1 from training. Duration: 0.963625 2023-10-04 09:34:35,992 WARNING [train.py:1204] (2/4) Exclude cut with ID _45297_210_2721_1_1533715351381_3264949_152-154697-0_sp1.1 from training. Duration: 0.97275 2023-10-04 09:34:35,999 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_3371_210_1794_1_1526384462_5580777_396-339812-0 from training. Duration: 0.85 2023-10-04 09:34:37,377 WARNING [train.py:1204] (2/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_399-156791-0_sp1.1 from training. Duration: 0.7545625 2023-10-04 09:34:37,427 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7696_210_2040_1_1528207167_3716099_117-125434-0 from training. Duration: 0.76 2023-10-04 09:34:37,431 WARNING [train.py:1204] (2/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_96-244299-0 from training. Duration: 0.88 2023-10-04 09:34:40,776 WARNING [train.py:1204] (2/4) Exclude cut with ID _37892_210_19596_1_1533198677897_3964058_519-343462-0_sp1.1 from training. Duration: 0.92725 2023-10-04 09:34:40,817 WARNING [train.py:1204] (2/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_202-183922-0_sp1.1 from training. Duration: 0.77275 2023-10-04 09:34:40,823 WARNING [train.py:1204] (2/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_453-133983-0 from training. Duration: 0.78 2023-10-04 09:34:42,474 WARNING [train.py:1204] (2/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_178-39722-0_sp1.1 from training. Duration: 0.4454375 2023-10-04 09:34:46,619 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_15126_210_6753_1_1530778677_2591185_113-157651-0 from training. Duration: 0.68 2023-10-04 09:34:46,620 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_271-234962-0_sp1.1 from training. Duration: 0.7181875 2023-10-04 09:34:48,284 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.ff2_skip_rate, batch_count=1607706.6666666667, ans=0.0 2023-10-04 09:34:49,481 WARNING [train.py:1204] (2/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_195-64967-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 09:34:50,808 WARNING [train.py:1204] (2/4) Exclude cut with ID _43552_210_8913_1_1533726031059_3523429_365-215065-0_sp1.1 from training. Duration: 0.936375 2023-10-04 09:34:53,567 WARNING [train.py:1204] (2/4) Exclude cut with ID _31602_210_5854_1_1532677021456_4458250_249-96604-0_sp0.9 from training. Duration: 0.988875 2023-10-04 09:34:54,913 WARNING [train.py:1204] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_307-158462-0 from training. Duration: 0.69 2023-10-04 09:34:54,962 WARNING [train.py:1204] (2/4) Exclude cut with ID _30918_210_6400_1_1532754078600_3125920_407-223394-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 09:34:54,965 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6673_210_3273_1_1527767790_3173240_351-167954-0_sp0.9 from training. Duration: 0.5333125 2023-10-04 09:34:57,568 WARNING [train.py:1204] (2/4) Exclude cut with ID _4866_210_2577_1_1527404412284_4364659_256-306830-0 from training. Duration: 0.87 2023-10-04 09:34:57,616 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_17613_210_2151_1_1531137569_2366160_222-131118-0_sp1.1 from training. Duration: 0.963625 2023-10-04 09:34:57,635 WARNING [train.py:1204] (2/4) Exclude cut with ID _45560_210_21982_1_1533812603951_3584999_111-6376-0 from training. Duration: 0.84 2023-10-04 09:34:57,655 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_216-73841-0 from training. Duration: 0.96 2023-10-04 09:34:57,698 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_14058_210_4163_1_1530690953_6327180_49-210687-0 from training. Duration: 0.94 2023-10-04 09:34:58,429 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.2.encoder.layers.2.self_attn_weights.whiten_keys, num_groups=4, num_channels=128, metric=2.89 vs. limit=6.0 2023-10-04 09:35:00,880 WARNING [train.py:1204] (2/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_506-42614-0_sp0.9 from training. Duration: 0.87775 2023-10-04 09:35:02,311 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_3133_210_2087_1_1526106446_2324690_25-332138-0_sp1.1 from training. Duration: 0.82725 2023-10-04 09:35:05,063 WARNING [train.py:1204] (2/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_589-9070-0_sp1.1 from training. Duration: 0.6454375 2023-10-04 09:35:05,321 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.balancer1.prob, batch_count=1607840.0, ans=0.125 2023-10-04 09:35:06,482 WARNING [train.py:1204] (2/4) Exclude cut with ID _6194_210_1534_1_1527511826160_3941194_14-282876-0_sp1.1 from training. Duration: 0.6454375 2023-10-04 09:35:07,866 WARNING [train.py:1204] (2/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_1166-90654-0_sp1.1 from training. Duration: 0.92725 2023-10-04 09:35:11,179 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_14056_210_4163_1_1530604483_7572827_218-255858-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 09:35:11,180 WARNING [train.py:1204] (2/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_1285-256622-0 from training. Duration: 0.84 2023-10-04 09:35:11,200 WARNING [train.py:1204] (2/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_205-286261-0_sp1.1 from training. Duration: 0.963625 2023-10-04 09:35:11,214 WARNING [train.py:1204] (2/4) Exclude cut with ID _68777_210_4353_1_1535810390731_3117904_6-154119-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 09:35:12,485 WARNING [train.py:1204] (2/4) Exclude cut with ID _22982_210_1883_1_1531882790447_5449409_565-199393-0_sp1.1 from training. Duration: 0.92725 2023-10-04 09:35:12,495 WARNING [train.py:1204] (2/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_278-154492-0 from training. Duration: 0.76 2023-10-04 09:35:13,844 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_426-313557-0 from training. Duration: 0.53 2023-10-04 09:35:13,910 WARNING [train.py:1204] (2/4) Exclude cut with ID _41817_210_18835_1_1533434477160_7423049_555-293448-0 from training. Duration: 0.8 2023-10-04 09:35:18,151 WARNING [train.py:1204] (2/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_32-335103-0_sp1.1 from training. Duration: 0.7454375 2023-10-04 09:35:20,875 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_2655_210_2087_1_1525688959_3897400_202-147557-0_sp1.1 from training. Duration: 0.77275 2023-10-04 09:35:20,914 WARNING [train.py:1204] (2/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_158-250116-0 from training. Duration: 0.62 2023-10-04 09:35:21,062 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.nonlin_attention.balancer.prob, batch_count=1607906.6666666667, ans=0.125 2023-10-04 09:35:23,895 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.conv_module2.balancer1.prob, batch_count=1607906.6666666667, ans=0.125 2023-10-04 09:35:26,433 WARNING [train.py:1204] (2/4) Exclude cut with ID _55429_210_10626_1_1534467588527_7174143_528-88083-0_sp1.1 from training. Duration: 0.963625 2023-10-04 09:35:29,170 WARNING [train.py:1204] (2/4) Exclude cut with ID _8030_210_7497_1_1528444886781_3531089_32-31837-0_sp1.1 from training. Duration: 0.8818125 2023-10-04 09:35:29,208 WARNING [train.py:1204] (2/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_744-86168-0_sp1.1 from training. Duration: 0.97275 2023-10-04 09:35:29,223 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1113-7369-0_sp0.9 from training. Duration: 0.9444375 2023-10-04 09:35:29,240 WARNING [train.py:1204] (2/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_124-345840-0_sp0.9 from training. Duration: 0.57775 2023-10-04 09:35:31,110 WARNING [train.py:1204] (2/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_328-264776-0_sp0.9 from training. Duration: 0.7444375 2023-10-04 09:35:32,546 WARNING [train.py:1204] (2/4) Exclude cut with ID _18895_210_6228_1_1531357274234_3784930_469-166615-0_sp1.1 from training. Duration: 0.963625 2023-10-04 09:35:32,559 WARNING [train.py:1204] (2/4) Exclude cut with ID _8395_210_1531_1_1528597871523_4411579_431-132470-0_sp0.9 from training. Duration: 0.82225 2023-10-04 09:35:33,880 WARNING [train.py:1204] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_554-45617-0_sp1.1 from training. Duration: 0.9 2023-10-04 09:35:33,910 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_35361_210_4238_1_1533262810_7793476_524-188242-0_sp1.1 from training. Duration: 0.92725 2023-10-04 09:35:35,163 INFO [train.py:1046] (2/4) Epoch 46, batch 2150, loss[loss=0.1385, simple_loss=0.2221, pruned_loss=0.02744, over 24580.00 frames. ], tot_loss[loss=0.1528, simple_loss=0.2333, pruned_loss=0.03617, over 4719570.26 frames. ], batch size: 60, lr: 2.21e-03, grad_scale: 16.0 2023-10-04 09:35:35,310 WARNING [train.py:1204] (2/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_938-205135-0 from training. Duration: 0.87 2023-10-04 09:35:36,665 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_5931_210_2040_1_1527428481_4547999_321-193614-0 from training. Duration: 0.67 2023-10-04 09:35:36,678 WARNING [train.py:1204] (2/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_445-269521-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 09:35:39,415 WARNING [train.py:1204] (2/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_3-33116-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 09:35:39,417 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_4633_210_2040_1_1526824163_4145343_416-76117-0_sp1.1 from training. Duration: 0.863625 2023-10-04 09:35:41,001 WARNING [train.py:1204] (2/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_1033-274376-0_sp1.1 from training. Duration: 0.7909375 2023-10-04 09:35:41,044 WARNING [train.py:1204] (2/4) Exclude cut with ID _8772_210_1850_1_1528635595972_3664011_398-320049-0_sp0.9 from training. Duration: 0.97775 2023-10-04 09:35:45,827 WARNING [train.py:1204] (2/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_321-215742-0_sp0.9 from training. Duration: 0.5555625 2023-10-04 09:35:48,503 WARNING [train.py:1204] (2/4) Exclude cut with ID _28394_210_8913_1_1532430049518_3650118_272-190950-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 09:35:49,869 WARNING [train.py:1204] (2/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_31-299371-0_sp1.1 from training. Duration: 0.92725 2023-10-04 09:35:51,322 WARNING [train.py:1204] (2/4) Exclude cut with ID _55383_210_10635_1_1534468426172_2231669_134-128246-0_sp0.9 from training. Duration: 0.988875 2023-10-04 09:35:51,325 WARNING [train.py:1204] (2/4) Exclude cut with ID _40519_210_13794_1_1533715228655_7337390_887-9547-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 09:35:51,356 WARNING [train.py:1204] (2/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_310-109461-0_sp0.9 from training. Duration: 0.888875 2023-10-04 09:35:52,912 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.conv_module2.balancer2.prob, batch_count=1608040.0, ans=0.125 2023-10-04 09:35:54,109 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_2009_210_2071_1_1525481134_4588937_258-250196-0_sp1.1 from training. Duration: 0.92725 2023-10-04 09:35:54,158 WARNING [train.py:1204] (2/4) Exclude cut with ID _9235_210_5219_1_1528891154253_8046120_1186-72702-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 09:35:54,160 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_14831_210_7927_1_1530700200_5583610_181-27467-0_sp1.1 from training. Duration: 0.82725 2023-10-04 09:35:58,763 WARNING [train.py:1204] (2/4) Exclude cut with ID _40402_210_18219_1_1533522324115_2953150_278-132109-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 09:35:58,788 WARNING [train.py:1204] (2/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_261-297503-0 from training. Duration: 0.99 2023-10-04 09:36:03,420 WARNING [train.py:1204] (2/4) Exclude cut with ID _19896_210_1883_1_1531709022662_6649569_313-265903-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 09:36:06,092 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_105-114797-0_sp1.1 from training. Duration: 0.763625 2023-10-04 09:36:07,466 WARNING [train.py:1204] (2/4) Exclude cut with ID _53755_210_8254_1_1534577312941_4707360_9-65843-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 09:36:07,495 WARNING [train.py:1204] (2/4) Exclude cut with ID _18516_210_9206_1_1531704653206_3692406_361-224549-0_sp1.1 from training. Duration: 0.97275 2023-10-04 09:36:08,902 WARNING [train.py:1204] (2/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_403-310975-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 09:36:08,943 WARNING [train.py:1204] (2/4) Exclude cut with ID _5444_210_2151_1_1527418808464_5312210_581-315411-0_sp0.9 from training. Duration: 0.688875 2023-10-04 09:36:08,988 WARNING [train.py:1204] (2/4) Exclude cut with ID _23825_210_13449_1_1531915313526_3497150_464-154904-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 09:36:08,998 WARNING [train.py:1204] (2/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_937-208281-0_sp0.9 from training. Duration: 0.9444375 2023-10-04 09:36:10,308 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_37839_210_7234_1_1533542462_3689303_158-181212-0_sp1.1 from training. Duration: 0.963625 2023-10-04 09:36:10,388 WARNING [train.py:1204] (2/4) Exclude cut with ID _8772_210_1850_1_1528635595972_3664011_398-320049-0 from training. Duration: 0.88 2023-10-04 09:36:13,141 WARNING [train.py:1204] (2/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_718-250907-0_sp1.1 from training. Duration: 0.62725 2023-10-04 09:36:14,957 WARNING [train.py:1204] (2/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_641-126395-0_sp1.1 from training. Duration: 0.92725 2023-10-04 09:36:15,590 WARNING [train.py:1204] (2/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_166-241493-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 09:36:15,684 WARNING [train.py:1204] (2/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_526-62079-0_sp0.9 from training. Duration: 0.7444375 2023-10-04 09:36:17,053 WARNING [train.py:1204] (2/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_439-275083-0_sp1.1 from training. Duration: 0.936375 2023-10-04 09:36:18,750 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.conv_module1.balancer2.prob, batch_count=1608173.3333333333, ans=0.125 2023-10-04 09:36:19,792 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_66907_210_21581_1_1535615661_3590177_493-229032-0_sp1.1 from training. Duration: 0.92725 2023-10-04 09:36:19,827 WARNING [train.py:1204] (2/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_89-321955-0_sp1.1 from training. Duration: 0.72725 2023-10-04 09:36:21,287 WARNING [train.py:1204] (2/4) Exclude cut with ID _29857_210_20034_1_1532395886753_3798160_273-226195-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 09:36:21,294 WARNING [train.py:1204] (2/4) Exclude cut with ID _22826_210_9994_1_1531908201522_3279169_404-75889-0 from training. Duration: 0.89 2023-10-04 09:36:21,334 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_271-234962-0_sp0.9 from training. Duration: 0.87775 2023-10-04 09:36:23,997 WARNING [train.py:1204] (2/4) Exclude cut with ID _22961_210_8254_1_1531976366672_4278180_156-339685-0_sp1.1 from training. Duration: 0.97275 2023-10-04 09:36:25,261 WARNING [train.py:1204] (2/4) Exclude cut with ID _37892_210_19596_1_1533198677897_3964058_425-231927-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 09:36:26,611 WARNING [train.py:1204] (2/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_102-288670-0_sp1.1 from training. Duration: 0.97275 2023-10-04 09:36:26,688 WARNING [train.py:1204] (2/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_283-220372-0_sp1.1 from training. Duration: 0.5090625 2023-10-04 09:36:27,969 WARNING [train.py:1204] (2/4) Exclude cut with ID _44304_210_15374_1_1533639352765_4613129_86-1699-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 09:36:29,942 WARNING [train.py:1204] (2/4) Exclude cut with ID _2891_210_2550_1_1525874398521_4427710_327-121590-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 09:36:29,956 WARNING [train.py:1204] (2/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_369-222060-0 from training. Duration: 0.71 2023-10-04 09:36:31,295 WARNING [train.py:1204] (2/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_762-109886-0 from training. Duration: 0.91 2023-10-04 09:36:31,316 WARNING [train.py:1204] (2/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_477-255782-0_sp0.9 from training. Duration: 0.911125 2023-10-04 09:36:33,275 WARNING [train.py:1204] (2/4) Exclude cut with ID IC0669W0061-330906 from training. Duration: 0.768 2023-10-04 09:36:33,327 WARNING [train.py:1204] (2/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_347-213071-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 09:36:34,741 WARNING [train.py:1204] (2/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_507-303370-0_sp1.1 from training. Duration: 0.87275 2023-10-04 09:36:34,816 WARNING [train.py:1204] (2/4) Exclude cut with ID _15012_210_1968_1_1530856820429_5095350_688-178849-0 from training. Duration: 0.64 2023-10-04 09:36:34,821 WARNING [train.py:1204] (2/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_28-70987-0_sp0.9 from training. Duration: 0.9666875 2023-10-04 09:36:34,824 WARNING [train.py:1204] (2/4) Exclude cut with ID _5910_210_1389_1_1527337972454_5050100_555-159815-0 from training. Duration: 0.98 2023-10-04 09:36:36,188 WARNING [train.py:1204] (2/4) Exclude cut with ID IC0679W0064-335871 from training. Duration: 0.768 2023-10-04 09:36:36,189 WARNING [train.py:1204] (2/4) Exclude cut with ID IC0679W0079-335886 from training. Duration: 0.896 2023-10-04 09:36:36,236 WARNING [train.py:1204] (2/4) Exclude cut with ID _5608_210_2106_1_1527415439089_6819768_909-82767-0 from training. Duration: 0.61 2023-10-04 09:36:37,581 WARNING [train.py:1204] (2/4) Exclude cut with ID _23038_210_9570_1_1532001584158_3652419_411-323967-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 09:36:38,964 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_40896_210_15710_1_1533433956_7100802_350-231748-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 09:36:38,980 WARNING [train.py:1204] (2/4) Exclude cut with ID _5289_210_2084_1_1527678202736_3779299_142-103317-0_sp0.9 from training. Duration: 0.9333125 2023-10-04 09:36:39,051 WARNING [train.py:1204] (2/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_314-144055-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 09:36:40,339 WARNING [train.py:1204] (2/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_177-236809-0_sp0.9 from training. Duration: 0.5444375 2023-10-04 09:36:40,453 WARNING [train.py:1204] (2/4) Exclude cut with ID _23038_210_9570_1_1532001584158_3652419_82-334733-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 09:36:40,461 WARNING [train.py:1204] (2/4) Exclude cut with ID _43552_210_8913_1_1533726031059_3523429_225-22526-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 09:36:43,943 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.conv_module2.balancer2.min_positive, batch_count=1608240.0, ans=0.05 2023-10-04 09:36:48,293 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.646e+02 1.972e+02 2.248e+02 2.510e+02 3.914e+02, threshold=4.495e+02, percent-clipped=0.0 2023-10-04 09:36:48,415 WARNING [train.py:1204] (2/4) Exclude cut with ID _30327_210_16049_1_1532484161295_3924169_401-70819-0_sp1.1 from training. Duration: 0.7818125 2023-10-04 09:36:48,465 WARNING [train.py:1204] (2/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_538-32291-0 from training. Duration: 0.83 2023-10-04 09:36:49,741 INFO [train.py:1046] (2/4) Epoch 46, batch 2200, loss[loss=0.1693, simple_loss=0.2551, pruned_loss=0.04178, over 23979.00 frames. ], tot_loss[loss=0.1527, simple_loss=0.2333, pruned_loss=0.036, over 4721437.56 frames. ], batch size: 80, lr: 2.21e-03, grad_scale: 8.0 2023-10-04 09:36:52,586 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_37357_210_17024_1_1533089504_2316121_258-211605-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 09:36:55,358 WARNING [train.py:1204] (2/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_550-335198-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 09:36:56,753 WARNING [train.py:1204] (2/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_98-741-0_sp0.9 from training. Duration: 0.888875 2023-10-04 09:36:56,803 WARNING [train.py:1204] (2/4) Exclude cut with ID _38323_210_12108_1_1533121233369_3303049_251-238988-0_sp1.1 from training. Duration: 0.92725 2023-10-04 09:36:58,206 WARNING [train.py:1204] (2/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_259-34038-0_sp0.9 from training. Duration: 0.82225 2023-10-04 09:36:58,935 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.2.encoder.layers.1.conv_module2.whiten, num_groups=1, num_channels=384, metric=6.43 vs. limit=15.0 2023-10-04 09:37:00,167 WARNING [train.py:1204] (2/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_1060-180633-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 09:37:02,105 WARNING [train.py:1204] (2/4) Exclude cut with ID _8755_210_1876_1_1528628559357_3258209_211-37473-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 09:37:02,111 WARNING [train.py:1204] (2/4) Exclude cut with ID _4122_210_2084_1_1526713164417_4353280_528-206771-0 from training. Duration: 0.88 2023-10-04 09:37:05,049 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.attention_skip_rate, batch_count=1608373.3333333333, ans=0.0 2023-10-04 09:37:06,264 WARNING [train.py:1204] (2/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_64-248359-0 from training. Duration: 0.69 2023-10-04 09:37:08,995 WARNING [train.py:1204] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_402-253027-0_sp1.1 from training. Duration: 0.4909375 2023-10-04 09:37:09,129 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.conv_module2.balancer2.prob, batch_count=1608373.3333333333, ans=0.125 2023-10-04 09:37:15,688 WARNING [train.py:1204] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_625-102955-0 from training. Duration: 0.77 2023-10-04 09:37:18,412 WARNING [train.py:1204] (2/4) Exclude cut with ID _14998_210_5219_1_1530705582080_7474970_611-219324-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 09:37:19,744 WARNING [train.py:1204] (2/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_718-68505-0_sp0.9 from training. Duration: 0.988875 2023-10-04 09:37:19,790 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_353-182855-0_sp1.1 from training. Duration: 0.87275 2023-10-04 09:37:25,022 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_5960_210_1794_1_1527403865_3990999_515-2613-0_sp0.9 from training. Duration: 0.97775 2023-10-04 09:37:25,044 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_5575_210_1410_1_1527301305_2570170_342-265572-0 from training. Duration: 0.94 2023-10-04 09:37:29,158 WARNING [train.py:1204] (2/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_329-113173-0_sp0.9 from training. Duration: 0.688875 2023-10-04 09:37:30,578 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_15396_210_6890_1_1531308473_1938936_213-312506-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 09:37:31,889 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_521-214276-0_sp1.1 from training. Duration: 0.4 2023-10-04 09:37:33,954 WARNING [train.py:1204] (2/4) Exclude cut with ID _15026_210_8750_1_1530777602316_3291850_211-195043-0_sp1.1 from training. Duration: 0.72725 2023-10-04 09:37:35,386 WARNING [train.py:1204] (2/4) Exclude cut with ID _29343_210_4381_1_1532602757752_3674109_108-257494-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 09:37:36,801 WARNING [train.py:1204] (2/4) Exclude cut with ID _64414_210_17301_1_1535243252048_3757759_403-58151-0_sp1.1 from training. Duration: 0.963625 2023-10-04 09:37:39,367 WARNING [train.py:1204] (2/4) Exclude cut with ID _36512_210_6987_1_1533033143998_3126069_109-177787-0_sp1.1 from training. Duration: 0.92725 2023-10-04 09:37:40,782 WARNING [train.py:1204] (2/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_498-40153-0 from training. Duration: 0.63 2023-10-04 09:37:40,864 WARNING [train.py:1204] (2/4) Exclude cut with ID _56447_210_1389_1_1535074245910_3697189_161-334523-0_sp1.1 from training. Duration: 0.936375 2023-10-04 09:37:42,214 WARNING [train.py:1204] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_238-245655-0 from training. Duration: 0.81 2023-10-04 09:37:46,149 WARNING [train.py:1204] (2/4) Exclude cut with ID _64631_210_27767_1_1535349580068_4165160_108-43865-0_sp1.1 from training. Duration: 0.92725 2023-10-04 09:37:46,155 WARNING [train.py:1204] (2/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_243-42116-0_sp1.1 from training. Duration: 0.663625 2023-10-04 09:37:46,186 WARNING [train.py:1204] (2/4) Exclude cut with ID _66868_210_9527_1_1535509287954_4561140_594-132160-0_sp1.1 from training. Duration: 0.92725 2023-10-04 09:37:47,389 WARNING [train.py:1204] (2/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_48-338976-0_sp0.9 from training. Duration: 0.988875 2023-10-04 09:37:47,434 WARNING [train.py:1204] (2/4) Exclude cut with ID _5250_210_5219_1_1527249561806_8692110_137-100595-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 09:37:47,440 WARNING [train.py:1204] (2/4) Exclude cut with ID _49748_210_24116_1_1533986971737_3158910_12-271669-0_sp1.1 from training. Duration: 0.936375 2023-10-04 09:37:47,460 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_37357_210_17024_1_1533089504_2316121_206-80421-0_sp1.1 from training. Duration: 0.936375 2023-10-04 09:37:48,895 WARNING [train.py:1204] (2/4) Exclude cut with ID _7537_210_2084_1_1528502395431_3924359_113-199933-0_sp1.1 from training. Duration: 0.536375 2023-10-04 09:37:49,130 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.feed_forward1.out_proj.dropout_p, batch_count=1608573.3333333333, ans=0.1 2023-10-04 09:37:50,219 WARNING [train.py:1204] (2/4) Exclude cut with ID _55440_210_12651_1_1534496713364_2980520_31-59289-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 09:37:51,709 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1083-220122-0_sp0.9 from training. Duration: 0.5666875 2023-10-04 09:37:54,480 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_426-313557-0_sp1.1 from training. Duration: 0.4818125 2023-10-04 09:37:54,530 WARNING [train.py:1204] (2/4) Exclude cut with ID _35799_210_5854_1_1533263487887_3529071_324-94843-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 09:37:55,971 WARNING [train.py:1204] (2/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_264-108685-0_sp1.1 from training. Duration: 0.5 2023-10-04 09:37:57,353 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9012W0006-501141 from training. Duration: 0.896 2023-10-04 09:38:00,077 WARNING [train.py:1204] (2/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_890-5117-0_sp0.9 from training. Duration: 0.8666875 2023-10-04 09:38:00,113 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9023W0008-506301 from training. Duration: 0.896 2023-10-04 09:38:01,938 WARNING [train.py:1204] (2/4) Exclude cut with ID _13916_210_9994_1_1530788341126_3763149_60-191556-0_sp0.9 from training. Duration: 0.67775 2023-10-04 09:38:01,976 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9029W0331-509721 from training. Duration: 0.896 2023-10-04 09:38:03,662 INFO [train.py:1046] (2/4) Epoch 46, batch 2250, loss[loss=0.154, simple_loss=0.2479, pruned_loss=0.03001, over 24654.00 frames. ], tot_loss[loss=0.1533, simple_loss=0.234, pruned_loss=0.03632, over 4716989.80 frames. ], batch size: 68, lr: 2.21e-03, grad_scale: 8.0 2023-10-04 09:38:03,766 WARNING [train.py:1204] (2/4) Exclude cut with ID _35823_210_6917_1_1533187881914_7297830_567-108596-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 09:38:05,115 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7543_210_6753_1_1528544010_1110402_25-183860-0_sp1.1 from training. Duration: 0.42725 2023-10-04 09:38:06,499 WARNING [train.py:1204] (2/4) Exclude cut with ID _47278_210_12041_1_1533902418349_3576110_495-260035-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 09:38:07,985 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9051W0015-520281 from training. Duration: 0.9045625 2023-10-04 09:38:10,731 WARNING [train.py:1204] (2/4) Exclude cut with ID _14459_210_2228_1_1531443641946_7516700_707-66193-0_sp1.1 from training. Duration: 0.97275 2023-10-04 09:38:13,572 WARNING [train.py:1204] (2/4) Exclude cut with ID _14087_210_2253_1_1530529224512_3014029_219-224445-0_sp1.1 from training. Duration: 0.763625 2023-10-04 09:38:18,328 WARNING [train.py:1204] (2/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_345-167986-0_sp0.9 from training. Duration: 0.9333125 2023-10-04 09:38:19,823 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_34815_210_6388_1_1533381320_3571903_279-305565-0_sp1.1 from training. Duration: 0.836375 2023-10-04 09:38:23,912 WARNING [train.py:1204] (2/4) Exclude cut with ID _21987_210_5854_1_1531821500482_3637038_317-179212-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 09:38:23,969 WARNING [train.py:1204] (2/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_395-37222-0_sp1.1 from training. Duration: 0.6909375 2023-10-04 09:38:25,317 WARNING [train.py:1204] (2/4) Exclude cut with ID _8064_210_5399_1_1528372299291_4282973_168-50337-0_sp1.1 from training. Duration: 0.763625 2023-10-04 09:38:26,784 WARNING [train.py:1204] (2/4) Exclude cut with ID _60113_210_16271_1_1535164212481_4386079_559-167191-0 from training. Duration: 0.93 2023-10-04 09:38:26,793 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_47228_210_4983_1_1533780021_3672027_344-179642-0_sp1.1 from training. Duration: 0.92725 2023-10-04 09:38:26,825 WARNING [train.py:1204] (2/4) Exclude cut with ID _22961_210_8254_1_1531976366672_4278180_317-199546-0_sp0.9 from training. Duration: 0.9555625 2023-10-04 09:38:28,224 WARNING [train.py:1204] (2/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_141-89727-0 from training. Duration: 0.71 2023-10-04 09:38:29,568 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_78-116075-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 09:38:29,587 WARNING [train.py:1204] (2/4) Exclude cut with ID _64086_210_7994_1_1535270417019_7435949_52-235756-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 09:38:29,807 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.nonlin_attention.balancer.prob, batch_count=1608706.6666666667, ans=0.125 2023-10-04 09:38:31,039 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.1.self_attn_weights.pos_emb_skip_rate, batch_count=1608773.3333333333, ans=0.0 2023-10-04 09:38:32,302 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7615_210_4211_1_1528110965_4580160_72-235211-0_sp1.1 from training. Duration: 0.6909375 2023-10-04 09:38:37,563 WARNING [train.py:1204] (2/4) Exclude cut with ID _14069_210_5632_1_1530512692033_4803390_287-27950-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 09:38:38,962 WARNING [train.py:1204] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_74-48717-0_sp0.9 from training. Duration: 0.6444375 2023-10-04 09:38:38,999 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7544_210_6753_1_1528591327_1471426_172-348777-0_sp1.1 from training. Duration: 0.6 2023-10-04 09:38:40,447 WARNING [train.py:1204] (2/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_220-191080-0 from training. Duration: 0.98 2023-10-04 09:38:41,785 WARNING [train.py:1204] (2/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_1081-137641-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 09:38:43,204 WARNING [train.py:1204] (2/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_278-235459-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 09:38:49,501 WARNING [train.py:1204] (2/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_1181-77470-0_sp1.1 from training. Duration: 0.936375 2023-10-04 09:38:50,967 WARNING [train.py:1204] (2/4) Exclude cut with ID _60860_210_8254_1_1534896015875_3557169_198-5531-0_sp1.1 from training. Duration: 0.936375 2023-10-04 09:38:52,237 WARNING [train.py:1204] (2/4) Exclude cut with ID _12369_210_4381_1_1530356078581_4388520_17-289781-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 09:38:52,250 WARNING [train.py:1204] (2/4) Exclude cut with ID _50310_210_15488_1_1534068043345_3641080_146-199752-0_sp1.1 from training. Duration: 0.92725 2023-10-04 09:38:53,127 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.2.encoder.layers.1.conv_module2.whiten, num_groups=1, num_channels=384, metric=4.35 vs. limit=15.0 2023-10-04 09:38:53,689 WARNING [train.py:1204] (2/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_969-213354-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 09:38:55,071 WARNING [train.py:1204] (2/4) Exclude cut with ID _70519_210_4163_1_1535961595994_7458230_66-344156-0_sp1.1 from training. Duration: 0.963625 2023-10-04 09:38:56,565 WARNING [train.py:1204] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_380-204756-0_sp1.1 from training. Duration: 0.7818125 2023-10-04 09:38:58,485 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.4.encoder.layers.1.self_attn1.whiten, num_groups=1, num_channels=384, metric=12.57 vs. limit=22.5 2023-10-04 09:38:59,313 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_128-202104-0_sp0.9 from training. Duration: 0.7 2023-10-04 09:39:05,211 WARNING [train.py:1204] (2/4) Exclude cut with ID _4697_210_2069_1_1526815801953_3823098_51-1049-0_sp0.9 from training. Duration: 0.8333125 2023-10-04 09:39:05,228 WARNING [train.py:1204] (2/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_86-66192-0_sp1.1 from training. Duration: 0.5 2023-10-04 09:39:05,298 WARNING [train.py:1204] (2/4) Exclude cut with ID _14069_210_5632_1_1530512692033_4803390_95-1896-0_sp1.1 from training. Duration: 0.8909375 2023-10-04 09:39:12,432 WARNING [train.py:1204] (2/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_29-293896-0_sp1.1 from training. Duration: 0.5545625 2023-10-04 09:39:15,085 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.631e+02 2.142e+02 2.478e+02 2.835e+02 4.262e+02, threshold=4.957e+02, percent-clipped=0.0 2023-10-04 09:39:15,170 WARNING [train.py:1204] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_167-137255-0_sp0.9 from training. Duration: 0.711125 2023-10-04 09:39:15,171 WARNING [train.py:1204] (2/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_217-95056-0 from training. Duration: 0.98 2023-10-04 09:39:15,187 WARNING [train.py:1204] (2/4) Exclude cut with ID _47278_210_12041_1_1533902418349_3576110_97-266746-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 09:39:15,244 WARNING [train.py:1204] (2/4) Exclude cut with ID _14224_210_4381_1_1530529062037_4264010_380-343670-0_sp1.1 from training. Duration: 0.8 2023-10-04 09:39:17,285 INFO [train.py:1046] (2/4) Epoch 46, batch 2300, loss[loss=0.1371, simple_loss=0.2181, pruned_loss=0.0281, over 20681.00 frames. ], tot_loss[loss=0.1538, simple_loss=0.2344, pruned_loss=0.03662, over 4717299.67 frames. ], batch size: 45, lr: 2.21e-03, grad_scale: 8.0 2023-10-04 09:39:18,794 WARNING [train.py:1204] (2/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_107-332398-0 from training. Duration: 0.78 2023-10-04 09:39:21,521 WARNING [train.py:1204] (2/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_91-267696-0_sp1.1 from training. Duration: 0.7909375 2023-10-04 09:39:21,541 WARNING [train.py:1204] (2/4) Exclude cut with ID _41298_210_9019_1_1533430371450_4822143_498-186993-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 09:39:27,232 WARNING [train.py:1204] (2/4) Exclude cut with ID _17956_210_4353_1_1531209568125_6979970_54-77610-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 09:39:27,266 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_40047_210_2290_1_1533290519_2374311_257-335162-0_sp1.1 from training. Duration: 0.863625 2023-10-04 09:39:28,713 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9653W0193-675471 from training. Duration: 0.9656875 2023-10-04 09:39:30,120 WARNING [train.py:1204] (2/4) Exclude cut with ID _25248_210_8750_1_1532062825941_5554180_243-240536-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 09:39:37,394 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_32360_210_4198_1_1532689160_3648436_60-51908-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 09:39:37,403 WARNING [train.py:1204] (2/4) Exclude cut with ID _4092_210_2151_1_1526643044449_3135759_227-213972-0_sp0.9 from training. Duration: 0.6 2023-10-04 09:39:38,790 WARNING [train.py:1204] (2/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_392-173500-0_sp1.1 from training. Duration: 0.97275 2023-10-04 09:39:38,827 WARNING [train.py:1204] (2/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_71-329150-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 09:39:38,829 WARNING [train.py:1204] (2/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_866-173440-0 from training. Duration: 0.9 2023-10-04 09:39:39,002 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.conv_module2.balancer2.prob, batch_count=1609040.0, ans=0.125 2023-10-04 09:39:40,780 WARNING [train.py:1204] (2/4) Exclude cut with ID _14832_210_8750_1_1530691351539_3171348_1-54315-0_sp0.9 from training. Duration: 0.9666875 2023-10-04 09:39:42,384 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.ff3_skip_rate, batch_count=1609040.0, ans=0.0 2023-10-04 09:39:43,602 WARNING [train.py:1204] (2/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_430-116139-0_sp1.1 from training. Duration: 0.77275 2023-10-04 09:39:44,862 WARNING [train.py:1204] (2/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_356-45054-0_sp0.9 from training. Duration: 0.9555625 2023-10-04 09:39:48,355 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_3531_210_3528_1_1526294203_5216456_309-227302-0_sp0.9 from training. Duration: 0.7666875 2023-10-04 09:39:52,347 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.3.encoder.layers.2.self_attn2.whiten, num_groups=1, num_channels=512, metric=11.04 vs. limit=22.5 2023-10-04 09:39:52,972 WARNING [train.py:1204] (2/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_301-330455-0_sp1.1 from training. Duration: 0.72725 2023-10-04 09:39:54,481 WARNING [train.py:1204] (2/4) Exclude cut with ID _23557_210_8750_1_1531890106370_6846255_667-145945-0_sp1.1 from training. Duration: 0.92725 2023-10-04 09:39:58,680 WARNING [train.py:1204] (2/4) Exclude cut with ID _14860_210_4915_1_1530702020340_7343240_836-328489-0_sp1.1 from training. Duration: 0.8181875 2023-10-04 09:39:59,934 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_37152_210_9697_1_1533106573_3844188_416-333249-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 09:40:01,377 WARNING [train.py:1204] (2/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_538-32291-0_sp0.9 from training. Duration: 0.92225 2023-10-04 09:40:04,187 WARNING [train.py:1204] (2/4) Exclude cut with ID _32704_210_8954_1_1533017488840_6747370_965-176824-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 09:40:07,032 WARNING [train.py:1204] (2/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_86-19601-0_sp1.1 from training. Duration: 0.77275 2023-10-04 09:40:08,939 WARNING [train.py:1204] (2/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_436-318113-0_sp1.1 from training. Duration: 0.6818125 2023-10-04 09:40:08,974 WARNING [train.py:1204] (2/4) Exclude cut with ID _41817_210_18835_1_1533434477160_7423049_555-293448-0_sp0.9 from training. Duration: 0.888875 2023-10-04 09:40:08,991 WARNING [train.py:1204] (2/4) Exclude cut with ID _4697_210_2069_1_1526815801953_3823098_68-219807-0 from training. Duration: 0.47 2023-10-04 09:40:12,422 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7544_210_6753_1_1528591327_1471426_33-346031-0_sp1.1 from training. Duration: 0.5545625 2023-10-04 09:40:12,424 WARNING [train.py:1204] (2/4) Exclude cut with ID _49748_210_24116_1_1533986971737_3158910_263-52113-0_sp1.1 from training. Duration: 0.97275 2023-10-04 09:40:12,468 WARNING [train.py:1204] (2/4) Exclude cut with ID _64639_210_5816_1_1535367619437_3528000_340-24580-0_sp1.1 from training. Duration: 0.92725 2023-10-04 09:40:12,481 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_47228_210_4983_1_1533780021_3672027_479-114941-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 09:40:13,798 WARNING [train.py:1204] (2/4) Exclude cut with ID _44585_210_18628_1_1533605350387_4518199_506-119659-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 09:40:13,873 WARNING [train.py:1204] (2/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_314-126819-0_sp1.1 from training. Duration: 0.4545625 2023-10-04 09:40:13,877 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_2541_210_1794_1_1525590269_3084765_156-173713-0_sp0.9 from training. Duration: 0.9 2023-10-04 09:40:15,215 WARNING [train.py:1204] (2/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_108-255430-0 from training. Duration: 0.97 2023-10-04 09:40:15,230 WARNING [train.py:1204] (2/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_791-45149-0_sp1.1 from training. Duration: 0.7818125 2023-10-04 09:40:15,236 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_53378_210_17282_1_1534227054_1261425_10-118295-0_sp1.1 from training. Duration: 0.97275 2023-10-04 09:40:15,295 WARNING [train.py:1204] (2/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_148-158509-0 from training. Duration: 0.41 2023-10-04 09:40:23,240 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_34726_210_4525_1_1533019474_5030395_687-291305-0_sp1.1 from training. Duration: 0.936375 2023-10-04 09:40:27,301 WARNING [train.py:1204] (2/4) Exclude cut with ID _14860_210_4915_1_1530702020340_7343240_264-324537-0_sp1.1 from training. Duration: 0.963625 2023-10-04 09:40:27,837 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.4.encoder.layers.2.conv_module1.whiten, num_groups=1, num_channels=384, metric=3.36 vs. limit=15.0 2023-10-04 09:40:31,423 INFO [train.py:1046] (2/4) Epoch 46, batch 2350, loss[loss=0.16, simple_loss=0.2291, pruned_loss=0.04541, over 23766.00 frames. ], tot_loss[loss=0.1546, simple_loss=0.2348, pruned_loss=0.03722, over 4712616.25 frames. ], batch size: 164, lr: 2.21e-03, grad_scale: 8.0 2023-10-04 09:40:31,545 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_41251_210_15368_1_1533429767_4820695_591-46386-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 09:40:31,577 WARNING [train.py:1204] (2/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_86-19601-0_sp0.9 from training. Duration: 0.9444375 2023-10-04 09:40:31,598 WARNING [train.py:1204] (2/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_401-255491-0_sp0.9 from training. Duration: 0.77775 2023-10-04 09:40:31,857 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.conv_module2.balancer2.prob, batch_count=1609306.6666666667, ans=0.125 2023-10-04 09:40:33,125 WARNING [train.py:1204] (2/4) Exclude cut with ID _8696_210_1876_1_1528545632440_3345058_209-54069-0_sp1.1 from training. Duration: 0.6090625 2023-10-04 09:40:33,133 WARNING [train.py:1204] (2/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_369-325184-0_sp1.1 from training. Duration: 0.9 2023-10-04 09:40:34,530 WARNING [train.py:1204] (2/4) Exclude cut with ID _14687_210_5854_1_1530703838259_3664889_115-239046-0_sp0.9 from training. Duration: 0.7444375 2023-10-04 09:40:35,896 WARNING [train.py:1204] (2/4) Exclude cut with ID _8064_210_5399_1_1528372299291_4282973_168-50337-0 from training. Duration: 0.84 2023-10-04 09:40:38,111 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.conv_skip_rate, batch_count=1609306.6666666667, ans=0.0 2023-10-04 09:40:40,809 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.ff3_skip_rate, batch_count=1609306.6666666667, ans=0.0 2023-10-04 09:40:41,989 WARNING [train.py:1204] (2/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_909-64151-0_sp1.1 from training. Duration: 0.8545625 2023-10-04 09:40:42,005 WARNING [train.py:1204] (2/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_597-154640-0 from training. Duration: 0.9 2023-10-04 09:40:46,786 WARNING [train.py:1204] (2/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_126-274694-0 from training. Duration: 0.98 2023-10-04 09:40:48,645 WARNING [train.py:1204] (2/4) Exclude cut with ID _21783_210_11357_1_1531742458730_3762679_344-269786-0_sp1.1 from training. Duration: 0.97275 2023-10-04 09:40:51,379 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_37818_210_4238_1_1533620910_4720083_322-253154-0_sp1.1 from training. Duration: 0.92725 2023-10-04 09:40:51,382 WARNING [train.py:1204] (2/4) Exclude cut with ID _58895_210_8750_1_1535184081320_3649089_221-15823-0_sp1.1 from training. Duration: 0.92725 2023-10-04 09:40:51,399 WARNING [train.py:1204] (2/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_9-21737-0_sp1.1 from training. Duration: 0.8818125 2023-10-04 09:40:51,432 WARNING [train.py:1204] (2/4) Exclude cut with ID _14069_210_5632_1_1530512692033_4803390_522-341588-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 09:40:52,775 WARNING [train.py:1204] (2/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_363-185703-0 from training. Duration: 0.95 2023-10-04 09:40:56,841 WARNING [train.py:1204] (2/4) Exclude cut with ID _41817_210_18835_1_1533434477160_7423049_265-76216-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 09:41:02,455 WARNING [train.py:1204] (2/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_841-341679-0 from training. Duration: 0.58 2023-10-04 09:41:02,564 WARNING [train.py:1204] (2/4) Exclude cut with ID _55429_210_10626_1_1534467588527_7174143_97-335084-0_sp1.1 from training. Duration: 0.8818125 2023-10-04 09:41:07,175 WARNING [train.py:1204] (2/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_690-126306-0_sp0.9 from training. Duration: 0.8666875 2023-10-04 09:41:07,190 WARNING [train.py:1204] (2/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_102-92751-0_sp1.1 from training. Duration: 0.9 2023-10-04 09:41:09,928 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_3787_210_2040_1_1526477544_5478586_603-120324-0_sp1.1 from training. Duration: 0.736375 2023-10-04 09:41:10,037 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_33348_210_5632_1_1533086912_3324030_85-28583-0 from training. Duration: 0.82 2023-10-04 09:41:10,301 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.feed_forward2.hidden_balancer.prob, batch_count=1609440.0, ans=0.125 2023-10-04 09:41:11,393 WARNING [train.py:1204] (2/4) Exclude cut with ID _60057_210_10640_1_1534903065793_3333150_362-230377-0_sp1.1 from training. Duration: 0.7909375 2023-10-04 09:41:12,899 WARNING [train.py:1204] (2/4) Exclude cut with ID _60113_210_16271_1_1535164212481_4386079_194-146486-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 09:41:12,911 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_53753_210_7234_1_1534320003_3594148_171-277668-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 09:41:12,950 WARNING [train.py:1204] (2/4) Exclude cut with ID _34423_210_8750_1_1533430798456_3330199_251-332788-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 09:41:17,498 WARNING [train.py:1204] (2/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_663-166973-0_sp0.9 from training. Duration: 0.888875 2023-10-04 09:41:20,787 WARNING [train.py:1204] (2/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_927-43827-0 from training. Duration: 0.74 2023-10-04 09:41:20,852 WARNING [train.py:1204] (2/4) Exclude cut with ID _28323_210_16187_1_1532480395667_4100300_306-74561-0_sp1.1 from training. Duration: 0.8545625 2023-10-04 09:41:23,502 WARNING [train.py:1204] (2/4) Exclude cut with ID _15037_210_2253_1_1530752294104_3267650_139-337989-0_sp1.1 from training. Duration: 0.92725 2023-10-04 09:41:23,525 WARNING [train.py:1204] (2/4) Exclude cut with ID _49039_210_6315_1_1533967537541_3846118_460-263670-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 09:41:24,959 WARNING [train.py:1204] (2/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_186-309161-0 from training. Duration: 0.54 2023-10-04 09:41:26,263 WARNING [train.py:1204] (2/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_87-278686-0_sp1.1 from training. Duration: 0.7 2023-10-04 09:41:27,784 WARNING [train.py:1204] (2/4) Exclude cut with ID _27052_210_5986_1_1532329197187_3585009_314-106499-0 from training. Duration: 0.92 2023-10-04 09:41:27,806 WARNING [train.py:1204] (2/4) Exclude cut with ID _3465_210_3525_1_1526175667060_3574049_412-287526-0_sp0.9 from training. Duration: 0.8 2023-10-04 09:41:32,022 WARNING [train.py:1204] (2/4) Exclude cut with ID _22961_210_8254_1_1531976366672_4278180_317-199546-0 from training. Duration: 0.86 2023-10-04 09:41:35,267 WARNING [train.py:1204] (2/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_381-317791-0 from training. Duration: 0.73 2023-10-04 09:41:35,324 WARNING [train.py:1204] (2/4) Exclude cut with ID _44010_210_4525_1_1533884262964_4013620_560-310411-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 09:41:35,330 WARNING [train.py:1204] (2/4) Exclude cut with ID _5294_210_1747_1_1527249643928_3696179_342-270278-0_sp0.9 from training. Duration: 0.611125 2023-10-04 09:41:35,364 WARNING [train.py:1204] (2/4) Exclude cut with ID ID0473W0002-918951 from training. Duration: 0.924 2023-10-04 09:41:35,382 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1083-220122-0 from training. Duration: 0.51 2023-10-04 09:41:38,428 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.feed_forward2.hidden_balancer.prob, batch_count=1609573.3333333333, ans=0.125 2023-10-04 09:41:39,515 WARNING [train.py:1204] (2/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_138-154812-0 from training. Duration: 0.78 2023-10-04 09:41:42,315 WARNING [train.py:1204] (2/4) Exclude cut with ID _44982_210_1511_1_1534312820108_3726029_331-19420-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 09:41:43,485 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.629e+02 2.219e+02 2.483e+02 2.970e+02 4.725e+02, threshold=4.966e+02, percent-clipped=0.0 2023-10-04 09:41:43,867 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.bypass.skip_rate, batch_count=1609640.0, ans=0.09899494936611666 2023-10-04 09:41:44,914 INFO [train.py:1046] (2/4) Epoch 46, batch 2400, loss[loss=0.152, simple_loss=0.2404, pruned_loss=0.03178, over 24540.00 frames. ], tot_loss[loss=0.1552, simple_loss=0.2361, pruned_loss=0.03718, over 4715972.37 frames. ], batch size: 71, lr: 2.21e-03, grad_scale: 16.0 2023-10-04 09:41:46,998 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_2655_210_2087_1_1525688959_3897400_202-147557-0_sp0.9 from training. Duration: 0.9444375 2023-10-04 09:41:49,092 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_23850_210_4238_1_1532232597_1632433_32-185135-0_sp0.9 from training. Duration: 0.9555625 2023-10-04 09:41:50,497 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_14056_210_4163_1_1530604483_7572827_46-93547-0_sp1.1 from training. Duration: 0.863625 2023-10-04 09:41:51,197 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7931_210_1794_1_1528594728_5564059_280-279560-0 from training. Duration: 0.98 2023-10-04 09:41:51,236 WARNING [train.py:1204] (2/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_937-208281-0 from training. Duration: 0.85 2023-10-04 09:41:57,700 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_14928_210_5632_1_1530773461_4714000_454-112198-0_sp0.9 from training. Duration: 0.7333125 2023-10-04 09:41:57,702 WARNING [train.py:1204] (2/4) Exclude cut with ID _32704_210_8954_1_1533017488840_6747370_944-153333-0_sp1.1 from training. Duration: 0.963625 2023-10-04 09:41:59,175 WARNING [train.py:1204] (2/4) Exclude cut with ID _14821_210_8750_1_1530680503991_6829359_2-227151-0 from training. Duration: 0.83 2023-10-04 09:41:59,192 WARNING [train.py:1204] (2/4) Exclude cut with ID _28461_210_2253_1_1532771775189_3041299_96-152158-0_sp1.1 from training. Duration: 0.77275 2023-10-04 09:42:00,517 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7837_210_1792_1_1528453445_5841305_654-54195-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 09:42:00,563 WARNING [train.py:1204] (2/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_344-152526-0 from training. Duration: 0.53 2023-10-04 09:42:06,738 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_30167_210_3528_1_1532428012_4772375_480-142230-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 09:42:08,190 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_160-218721-0 from training. Duration: 0.96 2023-10-04 09:42:08,335 INFO [scaling.py:1118] (2/4) WithLoss: name=encoder.encoders.1.encoder.layers.0.self_attn_weights, loss-sum=0.000e+00 2023-10-04 09:42:14,333 WARNING [train.py:1204] (2/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_65-88242-0_sp0.9 from training. Duration: 0.62225 2023-10-04 09:42:17,085 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.2.encoder.layers.0.feed_forward1.out_whiten, num_groups=1, num_channels=384, metric=5.07 vs. limit=15.0 2023-10-04 09:42:20,402 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_34886_210_10633_1_1533203882_1755661_76-156096-0 from training. Duration: 0.87 2023-10-04 09:42:22,219 WARNING [train.py:1204] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_188-38388-0_sp1.1 from training. Duration: 0.92725 2023-10-04 09:42:22,341 WARNING [train.py:1204] (2/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_81-289335-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 09:42:26,669 WARNING [train.py:1204] (2/4) Exclude cut with ID _59493_210_17282_1_1535078419077_3225879_110-8778-0_sp1.1 from training. Duration: 0.936375 2023-10-04 09:42:27,797 WARNING [train.py:1204] (2/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_188-260011-0 from training. Duration: 0.73 2023-10-04 09:42:27,848 WARNING [train.py:1204] (2/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_453-133983-0_sp0.9 from training. Duration: 0.8666875 2023-10-04 09:42:33,896 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8054_210_7927_1_1528627721_4984094_553-17379-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 09:42:36,747 WARNING [train.py:1204] (2/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_341-171072-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 09:42:38,263 WARNING [train.py:1204] (2/4) Exclude cut with ID _30800_210_22382_1_1532499347904_4515959_265-257286-0_sp1.1 from training. Duration: 0.963625 2023-10-04 09:42:39,580 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_403-39619-0_sp0.9 from training. Duration: 0.9333125 2023-10-04 09:42:39,591 WARNING [train.py:1204] (2/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_464-33931-0_sp0.9 from training. Duration: 0.6 2023-10-04 09:42:39,618 WARNING [train.py:1204] (2/4) Exclude cut with ID _60804_210_9642_1_1534831199859_7268299_632-143863-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 09:42:39,624 WARNING [train.py:1204] (2/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_431-310997-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 09:42:39,666 WARNING [train.py:1204] (2/4) Exclude cut with ID _31649_210_6228_1_1532584608063_4091078_114-263860-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 09:42:40,951 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6961_210_2087_1_1527937168_6815011_562-165518-0_sp0.9 from training. Duration: 0.8555625 2023-10-04 09:42:42,506 WARNING [train.py:1204] (2/4) Exclude cut with ID _27052_210_5986_1_1532329197187_3585009_338-325870-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 09:42:44,260 WARNING [train.py:1204] (2/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_469-94673-0_sp1.1 from training. Duration: 0.7181875 2023-10-04 09:42:44,283 WARNING [train.py:1204] (2/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_259-34038-0 from training. Duration: 0.74 2023-10-04 09:42:46,100 WARNING [train.py:1204] (2/4) Exclude cut with ID _14922_210_6228_1_1530707435273_3693310_388-129552-0 from training. Duration: 0.96 2023-10-04 09:42:48,090 WARNING [train.py:1204] (2/4) Exclude cut with ID _28394_210_8913_1_1532430049518_3650118_342-251455-0_sp1.1 from training. Duration: 0.936375 2023-10-04 09:42:49,190 WARNING [train.py:1204] (2/4) Exclude cut with ID _53731_210_7994_1_1534553979244_3757214_8-191390-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 09:42:49,238 WARNING [train.py:1204] (2/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_613-232856-0 from training. Duration: 0.54 2023-10-04 09:42:51,723 WARNING [train.py:1204] (2/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_557-349534-0 from training. Duration: 0.91 2023-10-04 09:42:51,739 WARNING [train.py:1204] (2/4) Exclude cut with ID _14224_210_4381_1_1530529062037_4264010_379-184223-0 from training. Duration: 0.85 2023-10-04 09:42:51,743 WARNING [train.py:1204] (2/4) Exclude cut with ID IC0125W0001-61807 from training. Duration: 0.966 2023-10-04 09:42:53,190 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_229-45300-0 from training. Duration: 0.57 2023-10-04 09:42:54,573 WARNING [train.py:1204] (2/4) Exclude cut with ID _30304_210_6319_1_1532476827261_7582060_1069-24743-0_sp1.1 from training. Duration: 0.92725 2023-10-04 09:42:55,967 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_41452_210_7291_1_1533437687_3941281_323-172873-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 09:42:55,977 WARNING [train.py:1204] (2/4) Exclude cut with ID _37086_210_9038_1_1532999540891_7021250_802-226709-0_sp1.1 from training. Duration: 0.97275 2023-10-04 09:42:56,058 WARNING [train.py:1204] (2/4) Exclude cut with ID IC0144W0500-71767 from training. Duration: 0.931875 2023-10-04 09:42:57,545 WARNING [train.py:1204] (2/4) Exclude cut with ID _39846_210_12651_1_1533605310265_2115212_233-129699-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 09:42:57,592 WARNING [train.py:1204] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_574-45541-0_sp0.9 from training. Duration: 0.72225 2023-10-04 09:43:00,347 INFO [train.py:1046] (2/4) Epoch 46, batch 2450, loss[loss=0.1532, simple_loss=0.225, pruned_loss=0.04066, over 23823.00 frames. ], tot_loss[loss=0.1544, simple_loss=0.2345, pruned_loss=0.03719, over 4704746.18 frames. ], batch size: 179, lr: 2.21e-03, grad_scale: 16.0 2023-10-04 09:43:00,442 WARNING [train.py:1204] (2/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_372-298093-0_sp1.1 from training. Duration: 0.72725 2023-10-04 09:43:00,454 WARNING [train.py:1204] (2/4) Exclude cut with ID _62574_210_2655_1_1535180407058_3933249_34-26343-0_sp1.1 from training. Duration: 0.97275 2023-10-04 09:43:03,413 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.conv_module2.balancer1.min_positive, batch_count=1609973.3333333333, ans=0.025 2023-10-04 09:43:05,122 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_512-256238-0_sp1.1 from training. Duration: 0.963625 2023-10-04 09:43:05,128 WARNING [train.py:1204] (2/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_196-55502-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 09:43:05,233 WARNING [train.py:1204] (2/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_195-167011-0 from training. Duration: 0.75 2023-10-04 09:43:09,602 WARNING [train.py:1204] (2/4) Exclude cut with ID _13678_210_7497_1_1530530069226_3463170_345-230416-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 09:43:09,619 WARNING [train.py:1204] (2/4) Exclude cut with ID _51078_210_9070_1_1534161557424_3805250_225-327809-0_sp1.1 from training. Duration: 0.963625 2023-10-04 09:43:12,755 WARNING [train.py:1204] (2/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_18-189198-0_sp1.1 from training. Duration: 0.8181875 2023-10-04 09:43:12,766 WARNING [train.py:1204] (2/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_329-82235-0_sp1.1 from training. Duration: 0.7454375 2023-10-04 09:43:12,775 WARNING [train.py:1204] (2/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_726-270780-0_sp0.9 from training. Duration: 0.9555625 2023-10-04 09:43:14,507 WARNING [train.py:1204] (2/4) Exclude cut with ID _66873_210_5517_1_1535454136506_4293370_654-52713-0 from training. Duration: 0.77 2023-10-04 09:43:19,198 WARNING [train.py:1204] (2/4) Exclude cut with ID _20455_210_5219_1_1531565996775_8098100_191-16006-0_sp1.1 from training. Duration: 0.963625 2023-10-04 09:43:21,550 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_3171_210_2087_1_1526109084_2330007_66-260583-0_sp1.1 from training. Duration: 0.6545625 2023-10-04 09:43:21,614 WARNING [train.py:1204] (2/4) Exclude cut with ID _38670_210_15379_1_1533448810659_4391059_63-129006-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 09:43:24,510 WARNING [train.py:1204] (2/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_393-57888-0_sp0.9 from training. Duration: 0.711125 2023-10-04 09:43:25,711 WARNING [train.py:1204] (2/4) Exclude cut with ID _6275_210_2084_1_1528110062213_7669049_857-298942-0_sp0.9 from training. Duration: 0.9444375 2023-10-04 09:43:25,812 WARNING [train.py:1204] (2/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_239-22076-0_sp0.9 from training. Duration: 0.9444375 2023-10-04 09:43:27,110 WARNING [train.py:1204] (2/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_424-227947-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 09:43:28,573 WARNING [train.py:1204] (2/4) Exclude cut with ID _14069_210_5632_1_1530512692033_4803390_95-1896-0 from training. Duration: 0.98 2023-10-04 09:43:29,967 WARNING [train.py:1204] (2/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_96-244299-0_sp0.9 from training. Duration: 0.97775 2023-10-04 09:43:37,516 WARNING [train.py:1204] (2/4) Exclude cut with ID _46683_210_9642_1_1533730913492_4591116_562-319825-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 09:43:37,620 WARNING [train.py:1204] (2/4) Exclude cut with ID _64639_210_5816_1_1535367619437_3528000_284-173474-0_sp1.1 from training. Duration: 0.963625 2023-10-04 09:43:37,647 WARNING [train.py:1204] (2/4) Exclude cut with ID _29529_210_7234_1_1532393961455_3977460_497-100133-0_sp1.1 from training. Duration: 0.936375 2023-10-04 09:43:37,675 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_12868_210_6753_1_1530701932_530275_18-76322-0_sp0.9 from training. Duration: 0.9666875 2023-10-04 09:43:38,981 WARNING [train.py:1204] (2/4) Exclude cut with ID _50093_210_4220_1_1534417141207_3432299_116-303447-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 09:43:39,075 WARNING [train.py:1204] (2/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_44-79427-0_sp1.1 from training. Duration: 0.8909375 2023-10-04 09:43:40,895 WARNING [train.py:1204] (2/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_887-249555-0 from training. Duration: 0.77 2023-10-04 09:43:44,794 WARNING [train.py:1204] (2/4) Exclude cut with ID _28461_210_2253_1_1532771775189_3041299_96-152158-0_sp0.9 from training. Duration: 0.9444375 2023-10-04 09:43:44,846 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_14058_210_4163_1_1530690953_6327180_49-210687-0_sp1.1 from training. Duration: 0.8545625 2023-10-04 09:43:48,669 WARNING [train.py:1204] (2/4) Exclude cut with ID _40399_210_4353_1_1533305658194_3279406_351-127903-0_sp1.1 from training. Duration: 0.97275 2023-10-04 09:43:48,684 WARNING [train.py:1204] (2/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_184-206939-0_sp1.1 from training. Duration: 0.936375 2023-10-04 09:43:54,317 WARNING [train.py:1204] (2/4) Exclude cut with ID _14100_210_8341_1_1530529320613_3678069_258-224599-0_sp1.1 from training. Duration: 0.7 2023-10-04 09:43:54,341 WARNING [train.py:1204] (2/4) Exclude cut with ID _6493_210_1747_1_1528459210424_4012470_324-323854-0 from training. Duration: 0.92 2023-10-04 09:43:55,711 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_12868_210_6753_1_1530702479_3610564_151-325626-0_sp1.1 from training. Duration: 0.8818125 2023-10-04 09:43:55,779 WARNING [train.py:1204] (2/4) Exclude cut with ID _14528_210_8254_1_1530669700514_4306939_106-43145-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 09:43:55,791 WARNING [train.py:1204] (2/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_686-54383-0 from training. Duration: 0.87 2023-10-04 09:43:57,151 WARNING [train.py:1204] (2/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_801-102362-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 09:43:57,221 WARNING [train.py:1204] (2/4) Exclude cut with ID _48677_210_5663_1_1533902405919_3892989_408-40740-0_sp0.9 from training. Duration: 0.92225 2023-10-04 09:44:01,246 WARNING [train.py:1204] (2/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_448-212135-0_sp1.1 from training. Duration: 0.82725 2023-10-04 09:44:02,649 WARNING [train.py:1204] (2/4) Exclude cut with ID _59170_210_6721_1_1534983633960_4203335_320-124047-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 09:44:04,549 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_4692_210_3528_1_1526899755_4323254_107-44401-0_sp1.1 from training. Duration: 0.7818125 2023-10-04 09:44:07,234 WARNING [train.py:1204] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_731-2428-0 from training. Duration: 0.83 2023-10-04 09:44:08,544 WARNING [train.py:1204] (2/4) Exclude cut with ID _29857_210_20034_1_1532395886753_3798160_335-163562-0_sp1.1 from training. Duration: 0.77275 2023-10-04 09:44:10,087 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.feed_forward3.hidden_balancer.prob, batch_count=1610240.0, ans=0.125 2023-10-04 09:44:13,039 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.731e+02 2.042e+02 2.322e+02 2.681e+02 4.445e+02, threshold=4.643e+02, percent-clipped=0.0 2023-10-04 09:44:14,566 INFO [train.py:1046] (2/4) Epoch 46, batch 2500, loss[loss=0.156, simple_loss=0.2437, pruned_loss=0.03416, over 24456.00 frames. ], tot_loss[loss=0.1539, simple_loss=0.2339, pruned_loss=0.03697, over 4696429.90 frames. ], batch size: 66, lr: 2.21e-03, grad_scale: 16.0 2023-10-04 09:44:14,713 WARNING [train.py:1204] (2/4) Exclude cut with ID _14832_210_8750_1_1530691351539_3171348_171-275004-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 09:44:25,085 WARNING [train.py:1204] (2/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_105-328174-0_sp0.9 from training. Duration: 0.8444375 2023-10-04 09:44:25,195 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.0.feed_forward1.out_proj.dropout_p, batch_count=1610306.6666666667, ans=0.1 2023-10-04 09:44:25,345 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.conv_module2.balancer2.prob, batch_count=1610306.6666666667, ans=0.125 2023-10-04 09:44:26,345 WARNING [train.py:1204] (2/4) Exclude cut with ID _68965_210_1883_1_1535698962272_3497490_164-118421-0_sp1.1 from training. Duration: 0.936375 2023-10-04 09:44:27,698 WARNING [train.py:1204] (2/4) Exclude cut with ID _19896_210_1883_1_1531709022662_6649569_380-24170-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 09:44:27,703 WARNING [train.py:1204] (2/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_505-205270-0_sp1.1 from training. Duration: 0.3454375 2023-10-04 09:44:34,671 WARNING [train.py:1204] (2/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_427-208901-0_sp1.1 from training. Duration: 0.6181875 2023-10-04 09:44:34,849 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=1610373.3333333333, ans=0.1 2023-10-04 09:44:35,968 WARNING [train.py:1204] (2/4) Exclude cut with ID _23818_210_6228_1_1531917063884_3676920_85-326916-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 09:44:37,955 WARNING [train.py:1204] (2/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_101-293367-0_sp1.1 from training. Duration: 0.57275 2023-10-04 09:44:37,971 WARNING [train.py:1204] (2/4) Exclude cut with ID _5409_210_1388_1_1527292850189_5865191_178-127970-0_sp1.1 from training. Duration: 0.5181875 2023-10-04 09:44:38,030 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_403-39619-0 from training. Duration: 0.84 2023-10-04 09:44:39,501 WARNING [train.py:1204] (2/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_427-87841-0_sp1.1 from training. Duration: 0.963625 2023-10-04 09:44:39,548 WARNING [train.py:1204] (2/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_286-68181-0_sp1.1 from training. Duration: 0.97275 2023-10-04 09:44:39,586 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6673_210_3273_1_1527767790_3173240_327-306185-0 from training. Duration: 0.83 2023-10-04 09:44:40,786 WARNING [train.py:1204] (2/4) Exclude cut with ID _3675_210_4557_1_1526639934408_4182089_106-203713-0_sp1.1 from training. Duration: 0.963625 2023-10-04 09:44:40,844 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_53071_210_16554_1_1534493434_1276796_130-36076-0 from training. Duration: 0.95 2023-10-04 09:44:40,867 WARNING [train.py:1204] (2/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_586-93054-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 09:44:46,852 WARNING [train.py:1204] (2/4) Exclude cut with ID _23825_210_13449_1_1531915313526_3497150_262-10571-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 09:44:48,095 WARNING [train.py:1204] (2/4) Exclude cut with ID _28783_210_7943_1_1532671164808_7155479_432-67885-0_sp1.1 from training. Duration: 0.97275 2023-10-04 09:44:48,313 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.bypass.scale_min, batch_count=1610440.0, ans=0.2 2023-10-04 09:44:50,830 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6287_210_3273_1_1527509506_2653716_182-199887-0_sp0.9 from training. Duration: 0.5666875 2023-10-04 09:44:50,896 WARNING [train.py:1204] (2/4) Exclude cut with ID _3231_210_2106_1_1526205864934_6784220_171-256004-0 from training. Duration: 0.86 2023-10-04 09:44:52,167 WARNING [train.py:1204] (2/4) Exclude cut with ID _69051_210_1531_1_1535885566901_3829538_332-92090-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 09:44:54,193 WARNING [train.py:1204] (2/4) Exclude cut with ID _3675_210_4557_1_1526639934408_4182089_104-324125-0_sp1.1 from training. Duration: 0.963625 2023-10-04 09:44:58,937 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_41764_210_2521_1_1533686110_4089313_501-2119-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 09:45:01,470 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_14056_210_4163_1_1530604483_7572827_56-100988-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 09:45:04,297 WARNING [train.py:1204] (2/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_398-239819-0_sp1.1 from training. Duration: 0.9 2023-10-04 09:45:04,520 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.attention_skip_rate, batch_count=1610506.6666666667, ans=0.0 2023-10-04 09:45:10,455 WARNING [train.py:1204] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_74-48717-0_sp1.1 from training. Duration: 0.52725 2023-10-04 09:45:11,892 WARNING [train.py:1204] (2/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_177-49092-0 from training. Duration: 0.91 2023-10-04 09:45:13,119 WARNING [train.py:1204] (2/4) Exclude cut with ID _62337_210_12392_1_1535009273105_3412229_44-176353-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 09:45:13,128 WARNING [train.py:1204] (2/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_110-130961-0_sp1.1 from training. Duration: 0.42725 2023-10-04 09:45:14,647 WARNING [train.py:1204] (2/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_37-189262-0_sp1.1 from training. Duration: 0.8909375 2023-10-04 09:45:14,648 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_3172_210_2087_1_1526292929_3987865_197-97762-0_sp0.9 from training. Duration: 0.8333125 2023-10-04 09:45:16,050 WARNING [train.py:1204] (2/4) Exclude cut with ID IC0677W0065-334897 from training. Duration: 0.896 2023-10-04 09:45:16,051 WARNING [train.py:1204] (2/4) Exclude cut with ID IC0677W0080-334912 from training. Duration: 0.768 2023-10-04 09:45:16,055 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_120-41668-0 from training. Duration: 0.7 2023-10-04 09:45:19,426 WARNING [train.py:1204] (2/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_460-205287-0_sp1.1 from training. Duration: 0.963625 2023-10-04 09:45:20,896 WARNING [train.py:1204] (2/4) Exclude cut with ID _14632_210_9988_1_1530691279392_3923200_102-204477-0 from training. Duration: 0.97 2023-10-04 09:45:20,901 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_34815_210_6388_1_1533381320_3571903_279-305565-0 from training. Duration: 0.92 2023-10-04 09:45:21,049 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.bypass.scale_min, batch_count=1610573.3333333333, ans=0.2 2023-10-04 09:45:22,214 WARNING [train.py:1204] (2/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_91-148831-0_sp1.1 from training. Duration: 0.9 2023-10-04 09:45:23,578 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6291_210_3273_1_1527683649_1078498_82-221196-0 from training. Duration: 0.76 2023-10-04 09:45:26,904 WARNING [train.py:1204] (2/4) Exclude cut with ID _38020_210_5986_1_1533112252259_7006089_493-102745-0 from training. Duration: 0.98 2023-10-04 09:45:28,207 INFO [train.py:1046] (2/4) Epoch 46, batch 2550, loss[loss=0.1545, simple_loss=0.231, pruned_loss=0.03903, over 23770.00 frames. ], tot_loss[loss=0.1542, simple_loss=0.2344, pruned_loss=0.03699, over 4702230.30 frames. ], batch size: 212, lr: 2.21e-03, grad_scale: 16.0 2023-10-04 09:45:28,369 WARNING [train.py:1204] (2/4) Exclude cut with ID _61133_210_9969_1_1535196777227_3696409_40-93442-0_sp1.1 from training. Duration: 0.92725 2023-10-04 09:45:29,777 WARNING [train.py:1204] (2/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_45-258852-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 09:45:29,804 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_53071_210_16554_1_1534493434_1276796_130-36076-0_sp1.1 from training. Duration: 0.863625 2023-10-04 09:45:31,266 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8694_210_6400_1_1529065754_3260756_332-13553-0_sp1.1 from training. Duration: 0.92725 2023-10-04 09:45:32,485 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_405-339986-0 from training. Duration: 0.51 2023-10-04 09:45:32,544 WARNING [train.py:1204] (2/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_927-43827-0_sp1.1 from training. Duration: 0.67275 2023-10-04 09:45:34,788 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.4.encoder.layers.0.nonlin_attention.whiten2, num_groups=1, num_channels=384, metric=7.47 vs. limit=15.0 2023-10-04 09:45:36,886 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_397-147196-0 from training. Duration: 0.8 2023-10-04 09:45:38,210 WARNING [train.py:1204] (2/4) Exclude cut with ID 3557-8342-0013-54691-0_sp1.1 from training. Duration: 0.836375 2023-10-04 09:45:38,450 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.nonlin_attention.balancer.prob, batch_count=1610640.0, ans=0.125 2023-10-04 09:45:40,191 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_47438_210_5399_1_1533797471_4513356_626-127722-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 09:45:42,962 WARNING [train.py:1204] (2/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_256-323883-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 09:45:42,966 WARNING [train.py:1204] (2/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_839-173017-0_sp0.9 from training. Duration: 0.4555625 2023-10-04 09:45:44,910 WARNING [train.py:1204] (2/4) Exclude cut with ID _5289_210_2084_1_1527678202736_3779299_392-261988-0_sp1.1 from training. Duration: 0.5909375 2023-10-04 09:45:44,958 WARNING [train.py:1204] (2/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_119-224384-0_sp1.1 from training. Duration: 0.936375 2023-10-04 09:45:44,995 WARNING [train.py:1204] (2/4) Exclude cut with ID _57523_210_1856_1_1534766294545_3732989_262-267933-0_sp1.1 from training. Duration: 0.963625 2023-10-04 09:45:47,594 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_35361_210_4238_1_1533262810_7793476_675-87012-0_sp0.9 from training. Duration: 0.988875 2023-10-04 09:45:47,612 WARNING [train.py:1204] (2/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_17-90541-0 from training. Duration: 0.94 2023-10-04 09:45:48,960 WARNING [train.py:1204] (2/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_535-14410-0_sp1.1 from training. Duration: 0.42725 2023-10-04 09:45:48,965 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_24673_210_6400_1_1531968633_778659_94-154511-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 09:45:48,969 WARNING [train.py:1204] (2/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_212-10501-0 from training. Duration: 0.94 2023-10-04 09:45:59,415 WARNING [train.py:1204] (2/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_383-285196-0_sp1.1 from training. Duration: 0.8818125 2023-10-04 09:46:04,853 WARNING [train.py:1204] (2/4) Exclude cut with ID _45560_210_21982_1_1533812603951_3584999_238-47020-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 09:46:06,214 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_21500_210_2054_1_1532149310_2716758_278-18053-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 09:46:06,223 WARNING [train.py:1204] (2/4) Exclude cut with ID _56235_210_10640_1_1534557335235_4161099_453-9626-0_sp1.1 from training. Duration: 0.936375 2023-10-04 09:46:06,321 WARNING [train.py:1204] (2/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_153-97441-0_sp1.1 from training. Duration: 0.5090625 2023-10-04 09:46:12,399 WARNING [train.py:1204] (2/4) Exclude cut with ID _67725_210_27767_1_1535608756671_5105359_431-120895-0_sp1.1 from training. Duration: 0.963625 2023-10-04 09:46:15,714 WARNING [train.py:1204] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_574-45541-0_sp1.1 from training. Duration: 0.5909375 2023-10-04 09:46:15,739 WARNING [train.py:1204] (2/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_162-103620-0_sp1.1 from training. Duration: 0.8454375 2023-10-04 09:46:15,750 WARNING [train.py:1204] (2/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_734-129761-0_sp1.1 from training. Duration: 0.7454375 2023-10-04 09:46:15,787 WARNING [train.py:1204] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_620-276793-0_sp1.1 from training. Duration: 0.536375 2023-10-04 09:46:15,821 WARNING [train.py:1204] (2/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_153-97441-0_sp0.9 from training. Duration: 0.62225 2023-10-04 09:46:18,714 WARNING [train.py:1204] (2/4) Exclude cut with ID _12733_210_8341_1_1530167424556_3646380_523-85621-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 09:46:18,725 WARNING [train.py:1204] (2/4) Exclude cut with ID _64048_210_10626_1_1535198382802_3835179_352-179834-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 09:46:24,086 WARNING [train.py:1204] (2/4) Exclude cut with ID _55162_210_17071_1_1534408248583_3753480_43-242364-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 09:46:25,275 WARNING [train.py:1204] (2/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_661-264857-0 from training. Duration: 0.93 2023-10-04 09:46:25,290 WARNING [train.py:1204] (2/4) Exclude cut with ID _6274_210_2084_1_1527937583837_7494020_325-237800-0_sp1.1 from training. Duration: 0.97275 2023-10-04 09:46:27,188 WARNING [train.py:1204] (2/4) Exclude cut with ID _55328_210_27767_1_1534580973260_4088170_385-171459-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 09:46:29,127 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_3780_210_2087_1_1526467389_2145572_176-107140-0_sp1.1 from training. Duration: 0.62725 2023-10-04 09:46:30,552 WARNING [train.py:1204] (2/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_375-244976-0_sp1.1 from training. Duration: 0.6818125 2023-10-04 09:46:31,245 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.3.encoder.layers.0.whiten, num_groups=1, num_channels=512, metric=5.31 vs. limit=12.0 2023-10-04 09:46:31,942 WARNING [train.py:1204] (2/4) Exclude cut with ID _35941_210_15379_1_1532995294515_4023089_309-33162-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 09:46:37,838 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.attention_skip_rate, batch_count=1610906.6666666667, ans=0.0 2023-10-04 09:46:39,035 WARNING [train.py:1204] (2/4) Exclude cut with ID _14459_210_2228_1_1531443641946_7516700_1185-22038-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 09:46:40,241 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.698e+02 1.961e+02 2.101e+02 2.394e+02 3.747e+02, threshold=4.203e+02, percent-clipped=0.0 2023-10-04 09:46:40,424 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_1197-69460-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 09:46:41,748 INFO [train.py:1046] (2/4) Epoch 46, batch 2600, loss[loss=0.1545, simple_loss=0.2457, pruned_loss=0.0317, over 24441.00 frames. ], tot_loss[loss=0.1537, simple_loss=0.2345, pruned_loss=0.03646, over 4720436.17 frames. ], batch size: 69, lr: 2.21e-03, grad_scale: 16.0 2023-10-04 09:46:43,237 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9012W0022-501157 from training. Duration: 0.896 2023-10-04 09:46:47,818 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9023W0009-506302 from training. Duration: 0.896 2023-10-04 09:46:47,841 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_2821_210_2715_1_1526108385_3763560_73-296673-0_sp1.1 from training. Duration: 0.7545625 2023-10-04 09:46:47,879 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9025W0113-507442 from training. Duration: 0.896 2023-10-04 09:46:47,951 WARNING [train.py:1204] (2/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_739-2098-0 from training. Duration: 0.92 2023-10-04 09:46:47,962 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9029W0332-509722 from training. Duration: 0.768 2023-10-04 09:46:51,361 WARNING [train.py:1204] (2/4) Exclude cut with ID _16908_210_5724_1_1531119157248_3772700_68-101953-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 09:46:51,386 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9042W0011-515617 from training. Duration: 0.768 2023-10-04 09:46:51,491 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_3019_210_2028_1_1526044245_3264865_191-72235-0 from training. Duration: 0.97 2023-10-04 09:46:52,780 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9052W0006-520792 from training. Duration: 0.896 2023-10-04 09:46:55,484 WARNING [train.py:1204] (2/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_245-174553-0_sp1.1 from training. Duration: 0.8 2023-10-04 09:46:56,905 WARNING [train.py:1204] (2/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_202-67017-0 from training. Duration: 0.82 2023-10-04 09:46:59,530 WARNING [train.py:1204] (2/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_304-59227-0 from training. Duration: 0.77 2023-10-04 09:47:01,430 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_3133_210_2087_1_1526106446_2324690_246-125000-0_sp0.9 from training. Duration: 0.688875 2023-10-04 09:47:01,477 WARNING [train.py:1204] (2/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_243-42116-0 from training. Duration: 0.73 2023-10-04 09:47:04,136 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9087W0121-538492 from training. Duration: 0.896 2023-10-04 09:47:04,151 WARNING [train.py:1204] (2/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_120-76843-0 from training. Duration: 0.55 2023-10-04 09:47:10,285 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.3.encoder.layers.1.self_attn2.whiten, num_groups=1, num_channels=512, metric=14.51 vs. limit=22.5 2023-10-04 09:47:12,294 WARNING [train.py:1204] (2/4) Exclude cut with ID _7852_210_5219_1_1528286471316_8139400_850-304621-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 09:47:12,315 WARNING [train.py:1204] (2/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_154-285415-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 09:47:12,327 WARNING [train.py:1204] (2/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_538-278194-0_sp1.1 from training. Duration: 0.92725 2023-10-04 09:47:12,327 WARNING [train.py:1204] (2/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_119-207162-0 from training. Duration: 0.71 2023-10-04 09:47:14,220 WARNING [train.py:1204] (2/4) Exclude cut with ID _2334_210_2667_1_1525608238880_4705129_459-104997-0_sp1.1 from training. Duration: 0.863625 2023-10-04 09:47:14,403 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.conv_skip_rate, batch_count=1611106.6666666667, ans=0.0 2023-10-04 09:47:19,697 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9147W0019-567847 from training. Duration: 0.768 2023-10-04 09:47:23,308 WARNING [train.py:1204] (2/4) Exclude cut with ID _69884_210_9771_1_1535846392818_4166169_228-189439-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 09:47:24,596 WARNING [train.py:1204] (2/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_196-77172-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 09:47:24,667 WARNING [train.py:1204] (2/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_625-338546-0 from training. Duration: 0.88 2023-10-04 09:47:25,974 WARNING [train.py:1204] (2/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_193-327630-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 09:47:25,976 WARNING [train.py:1204] (2/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_391-24006-0_sp1.1 from training. Duration: 0.92725 2023-10-04 09:47:27,429 WARNING [train.py:1204] (2/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_197-28164-0 from training. Duration: 0.8 2023-10-04 09:47:29,667 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.ff3_skip_rate, batch_count=1611173.3333333333, ans=0.0 2023-10-04 09:47:29,961 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.5.encoder.layers.1.self_attn1.whiten, num_groups=1, num_channels=256, metric=9.67 vs. limit=22.5 2023-10-04 09:47:30,726 WARNING [train.py:1204] (2/4) Exclude cut with ID _8395_210_1531_1_1528597871523_4411579_431-132470-0_sp1.1 from training. Duration: 0.67275 2023-10-04 09:47:30,745 WARNING [train.py:1204] (2/4) Exclude cut with ID _35350_210_16040_1_1533040191183_4075299_465-248326-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 09:47:32,250 WARNING [train.py:1204] (2/4) Exclude cut with ID _16756_210_4915_1_1531047803763_7104259_101-141922-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 09:47:35,472 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9212W0005-600172 from training. Duration: 0.896 2023-10-04 09:47:35,608 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.bypass.scale_min, batch_count=1611173.3333333333, ans=0.2 2023-10-04 09:47:36,742 WARNING [train.py:1204] (2/4) Exclude cut with ID _63186_210_2084_1_1535414479705_3908649_378-166820-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 09:47:36,763 WARNING [train.py:1204] (2/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_52-211683-0_sp1.1 from training. Duration: 0.8181875 2023-10-04 09:47:38,417 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.attention_skip_rate, batch_count=1611173.3333333333, ans=0.0 2023-10-04 09:47:42,215 WARNING [train.py:1204] (2/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_1147-237729-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 09:47:42,287 WARNING [train.py:1204] (2/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_234-14174-0_sp0.9 from training. Duration: 0.911125 2023-10-04 09:47:42,299 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_454-276429-0 from training. Duration: 0.87 2023-10-04 09:47:43,687 WARNING [train.py:1204] (2/4) Exclude cut with ID _53713_210_24116_1_1534314487762_7416751_755-165313-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 09:47:45,591 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8278_210_2550_1_1528637099_4220413_436-317072-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 09:47:47,153 WARNING [train.py:1204] (2/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_28-324729-0_sp1.1 from training. Duration: 0.8545625 2023-10-04 09:47:50,241 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.conv_module2.balancer1.prob, batch_count=1611240.0, ans=0.125 2023-10-04 09:47:51,266 WARNING [train.py:1204] (2/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_326-165900-0 from training. Duration: 0.69 2023-10-04 09:47:51,332 WARNING [train.py:1204] (2/4) Exclude cut with ID _53489_210_6228_1_1534298357169_3835970_96-30246-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 09:47:54,156 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.0.layers.1.nonlin_attention.whiten1, num_groups=1, num_channels=144, metric=5.18 vs. limit=10.0 2023-10-04 09:47:54,629 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8275_210_4198_1_1528455320_3892571_491-17707-0_sp1.1 from training. Duration: 0.5818125 2023-10-04 09:47:56,005 INFO [train.py:1046] (2/4) Epoch 46, batch 2650, loss[loss=0.1548, simple_loss=0.2465, pruned_loss=0.03151, over 24462.00 frames. ], tot_loss[loss=0.155, simple_loss=0.2354, pruned_loss=0.0373, over 4713324.29 frames. ], batch size: 66, lr: 2.21e-03, grad_scale: 16.0 2023-10-04 09:47:57,906 WARNING [train.py:1204] (2/4) Exclude cut with ID _14348_210_8341_1_1530599434996_3778640_240-232370-0 from training. Duration: 0.84 2023-10-04 09:47:57,916 WARNING [train.py:1204] (2/4) Exclude cut with ID _45012_210_12651_1_1533608435136_5371380_54-112623-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 09:47:59,319 WARNING [train.py:1204] (2/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_328-264776-0_sp1.1 from training. Duration: 0.6090625 2023-10-04 09:47:59,501 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.bypass.scale_min, batch_count=1611306.6666666667, ans=0.2 2023-10-04 09:48:00,618 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9325W0001-648427 from training. Duration: 0.768 2023-10-04 09:48:00,634 WARNING [train.py:1204] (2/4) Exclude cut with ID _56480_210_4220_1_1534679942079_3075409_49-74717-0_sp1.1 from training. Duration: 0.936375 2023-10-04 09:48:04,583 WARNING [train.py:1204] (2/4) Exclude cut with ID _59500_210_8254_1_1534809610680_3736009_6-241869-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 09:48:06,511 WARNING [train.py:1204] (2/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_172-215932-0_sp0.9 from training. Duration: 0.7333125 2023-10-04 09:48:07,990 WARNING [train.py:1204] (2/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_17-90541-0_sp1.1 from training. Duration: 0.8545625 2023-10-04 09:48:09,342 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_43831_210_2158_1_1533725502_5872791_525-89447-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 09:48:10,791 WARNING [train.py:1204] (2/4) Exclude cut with ID _12302_210_5219_1_1529924446466_8107280_907-182949-0 from training. Duration: 0.92 2023-10-04 09:48:10,803 WARNING [train.py:1204] (2/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_402-102311-0_sp0.9 from training. Duration: 0.7444375 2023-10-04 09:48:12,042 WARNING [train.py:1204] (2/4) Exclude cut with ID _8021_210_7994_1_1528376451358_3653069_179-45206-0_sp0.9 from training. Duration: 0.9555625 2023-10-04 09:48:14,830 WARNING [train.py:1204] (2/4) Exclude cut with ID _40137_210_12210_1_1533290484046_3795180_249-187059-0 from training. Duration: 0.88 2023-10-04 09:48:16,811 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9653W0194-675472 from training. Duration: 0.9658125 2023-10-04 09:48:19,522 WARNING [train.py:1204] (2/4) Exclude cut with ID _51332_210_9988_1_1534244371836_3801270_336-255604-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 09:48:21,093 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.balancer1.prob, batch_count=1611373.3333333333, ans=0.125 2023-10-04 09:48:23,605 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_20120_210_6753_1_1531643035_5482859_510-96996-0 from training. Duration: 0.87 2023-10-04 09:48:23,626 WARNING [train.py:1204] (2/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_1167-20437-0_sp1.1 from training. Duration: 0.92725 2023-10-04 09:48:23,673 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_3475_210_2028_1_1526193578_3622569_265-276625-0 from training. Duration: 0.76 2023-10-04 09:48:28,581 WARNING [train.py:1204] (2/4) Exclude cut with ID _14860_210_4915_1_1530702020340_7343240_470-172418-0_sp1.1 from training. Duration: 0.97275 2023-10-04 09:48:28,597 WARNING [train.py:1204] (2/4) Exclude cut with ID _5444_210_2151_1_1527418808464_5312210_581-315411-0_sp1.1 from training. Duration: 0.563625 2023-10-04 09:48:28,625 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_34450_210_9547_1_1533099443_4109050_519-342439-0_sp1.1 from training. Duration: 0.97275 2023-10-04 09:48:28,857 INFO [scaling.py:1118] (2/4) WithLoss: name=encoder.encoders.3.encoder.layers.0.self_attn_weights, loss-sum=0.000e+00 2023-10-04 09:48:29,905 WARNING [train.py:1204] (2/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_50-175961-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 09:48:30,091 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.conv_module2.balancer1.prob, batch_count=1611440.0, ans=0.125 2023-10-04 09:48:34,508 WARNING [train.py:1204] (2/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_372-6869-0 from training. Duration: 0.44 2023-10-04 09:48:34,519 WARNING [train.py:1204] (2/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_3-237434-0 from training. Duration: 0.76 2023-10-04 09:48:35,922 WARNING [train.py:1204] (2/4) Exclude cut with ID _8772_210_1850_1_1528635595972_3664011_247-336448-0_sp1.1 from training. Duration: 0.67275 2023-10-04 09:48:40,685 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_427-78097-0 from training. Duration: 0.8 2023-10-04 09:48:42,001 WARNING [train.py:1204] (2/4) Exclude cut with ID _69884_210_9771_1_1535846392818_4166169_466-252727-0_sp1.1 from training. Duration: 0.97275 2023-10-04 09:48:42,074 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_15972_210_6400_1_1530877934_3405692_316-35478-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 09:48:43,351 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_809-17156-0_sp0.9 from training. Duration: 0.811125 2023-10-04 09:48:43,395 WARNING [train.py:1204] (2/4) Exclude cut with ID _57572_210_5517_1_1534921219624_3809484_594-263474-0_sp1.1 from training. Duration: 0.936375 2023-10-04 09:48:43,425 WARNING [train.py:1204] (2/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_190-228204-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 09:48:44,772 WARNING [train.py:1204] (2/4) Exclude cut with ID _19514_210_9259_1_1531530071330_5084230_117-21193-0_sp1.1 from training. Duration: 0.936375 2023-10-04 09:48:46,211 WARNING [train.py:1204] (2/4) Exclude cut with ID _69584_210_5219_1_1535785133775_4044450_190-21315-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 09:48:48,097 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_53071_210_16554_1_1534493434_1276796_184-302050-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 09:48:48,142 WARNING [train.py:1204] (2/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_120-76843-0_sp1.1 from training. Duration: 0.5 2023-10-04 09:48:49,579 WARNING [train.py:1204] (2/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_318-162017-0_sp1.1 from training. Duration: 0.82725 2023-10-04 09:48:50,866 WARNING [train.py:1204] (2/4) Exclude cut with ID _46067_210_4220_1_1534125571066_3307450_79-46864-0_sp1.1 from training. Duration: 0.92725 2023-10-04 09:48:52,177 WARNING [train.py:1204] (2/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_358-215558-0_sp1.1 from training. Duration: 0.8454375 2023-10-04 09:48:52,256 WARNING [train.py:1204] (2/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_621-66764-0_sp1.1 from training. Duration: 0.92725 2023-10-04 09:48:53,616 WARNING [train.py:1204] (2/4) Exclude cut with ID _42863_210_9070_1_1533902398788_3696920_338-156851-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 09:48:54,950 WARNING [train.py:1204] (2/4) Exclude cut with ID _5289_210_2084_1_1527678202736_3779299_392-261988-0_sp0.9 from training. Duration: 0.72225 2023-10-04 09:48:58,199 WARNING [train.py:1204] (2/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_409-84682-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 09:48:58,316 WARNING [train.py:1204] (2/4) Exclude cut with ID _9235_210_5219_1_1528891154253_8046120_30-326681-0_sp0.9 from training. Duration: 0.92225 2023-10-04 09:48:59,149 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.1.encoder.layers.0.self_attn2.whiten, num_groups=1, num_channels=256, metric=12.71 vs. limit=22.5 2023-10-04 09:48:59,610 WARNING [train.py:1204] (2/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_389-184695-0_sp1.1 from training. Duration: 0.92725 2023-10-04 09:48:59,640 WARNING [train.py:1204] (2/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_442-320317-0 from training. Duration: 0.82 2023-10-04 09:49:01,240 WARNING [train.py:1204] (2/4) Exclude cut with ID _44778_210_2054_1_1533798127203_7443090_756-29911-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 09:49:03,194 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_401-112036-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 09:49:03,267 WARNING [train.py:1204] (2/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_181-308827-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 09:49:03,329 WARNING [train.py:1204] (2/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_332-258725-0_sp1.1 from training. Duration: 0.963625 2023-10-04 09:49:04,576 WARNING [train.py:1204] (2/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_188-260011-0_sp0.9 from training. Duration: 0.811125 2023-10-04 09:49:04,632 WARNING [train.py:1204] (2/4) Exclude cut with ID _49039_210_6315_1_1533967537541_3846118_99-302239-0_sp1.1 from training. Duration: 0.963625 2023-10-04 09:49:07,345 WARNING [train.py:1204] (2/4) Exclude cut with ID _4122_210_2084_1_1526713164417_4353280_312-294828-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 09:49:07,351 WARNING [train.py:1204] (2/4) Exclude cut with ID _14922_210_6228_1_1530707435273_3693310_303-228738-0 from training. Duration: 0.84 2023-10-04 09:49:09,084 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.641e+02 2.031e+02 2.301e+02 2.644e+02 3.771e+02, threshold=4.602e+02, percent-clipped=0.0 2023-10-04 09:49:10,569 INFO [train.py:1046] (2/4) Epoch 46, batch 2700, loss[loss=0.1689, simple_loss=0.2591, pruned_loss=0.03937, over 24064.00 frames. ], tot_loss[loss=0.1553, simple_loss=0.2361, pruned_loss=0.0373, over 4716918.38 frames. ], batch size: 80, lr: 2.21e-03, grad_scale: 16.0 2023-10-04 09:49:10,653 WARNING [train.py:1204] (2/4) Exclude cut with ID _14349_210_8341_1_1530615710818_3442470_115-285975-0_sp0.9 from training. Duration: 0.9555625 2023-10-04 09:49:12,069 WARNING [train.py:1204] (2/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_125-144174-0_sp1.1 from training. Duration: 0.3545625 2023-10-04 09:49:14,055 WARNING [train.py:1204] (2/4) Exclude cut with ID _53565_210_10635_1_1534337218335_3820259_351-54570-0_sp1.1 from training. Duration: 0.97275 2023-10-04 09:49:14,075 WARNING [train.py:1204] (2/4) Exclude cut with ID _28317_210_12742_1_1532480302381_8510325_410-40065-0_sp1.1 from training. Duration: 0.963625 2023-10-04 09:49:14,105 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_38965_210_4852_1_1533174906_3484735_167-339442-0_sp1.1 from training. Duration: 0.963625 2023-10-04 09:49:15,565 WARNING [train.py:1204] (2/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_265-164486-0_sp1.1 from training. Duration: 0.87275 2023-10-04 09:49:15,580 WARNING [train.py:1204] (2/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_725-182484-0_sp1.1 from training. Duration: 0.92725 2023-10-04 09:49:16,867 WARNING [train.py:1204] (2/4) Exclude cut with ID _28317_210_12742_1_1532480302381_8510325_607-213951-0_sp0.9 from training. Duration: 0.9333125 2023-10-04 09:49:16,888 WARNING [train.py:1204] (2/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_605-133395-0_sp1.1 from training. Duration: 0.6 2023-10-04 09:49:16,911 WARNING [train.py:1204] (2/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_351-294650-0 from training. Duration: 0.63 2023-10-04 09:49:18,164 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_14953_210_2151_1_1530701710_5616165_710-165400-0_sp1.1 from training. Duration: 0.7909375 2023-10-04 09:49:20,784 WARNING [train.py:1204] (2/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_237-58433-0_sp1.1 from training. Duration: 0.67275 2023-10-04 09:49:22,135 WARNING [train.py:1204] (2/4) Exclude cut with ID _5190_210_4557_1_1527244368799_373439_28-197030-0_sp1.1 from training. Duration: 0.6545625 2023-10-04 09:49:22,189 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_743-327651-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 09:49:25,601 WARNING [train.py:1204] (2/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_76-203867-0_sp0.9 from training. Duration: 0.8 2023-10-04 09:49:26,995 WARNING [train.py:1204] (2/4) Exclude cut with ID _14603_210_4220_1_1530615688441_3097319_274-274798-0 from training. Duration: 0.98 2023-10-04 09:49:28,359 WARNING [train.py:1204] (2/4) Exclude cut with ID _14087_210_2253_1_1530529224512_3014029_226-242140-0_sp1.1 from training. Duration: 0.736375 2023-10-04 09:49:33,028 WARNING [train.py:1204] (2/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_352-59958-0_sp1.1 from training. Duration: 0.7818125 2023-10-04 09:49:33,032 WARNING [train.py:1204] (2/4) Exclude cut with ID _39411_210_5986_1_1533538825030_3343649_191-251819-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 09:49:40,386 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7046_210_6753_1_1527895337_6920917_642-106799-0_sp1.1 from training. Duration: 0.836375 2023-10-04 09:49:40,393 WARNING [train.py:1204] (2/4) Exclude cut with ID _22826_210_9994_1_1531908201522_3279169_412-3442-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 09:49:40,415 WARNING [train.py:1204] (2/4) Exclude cut with ID _60057_210_10640_1_1534903065793_3333150_171-181133-0_sp0.9 from training. Duration: 0.97775 2023-10-04 09:49:40,427 WARNING [train.py:1204] (2/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_154-135696-0_sp1.1 from training. Duration: 0.72725 2023-10-04 09:49:44,524 WARNING [train.py:1204] (2/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_148-186805-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 09:49:47,332 WARNING [train.py:1204] (2/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_605-165641-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 09:49:47,338 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_174-277364-0_sp0.9 from training. Duration: 0.788875 2023-10-04 09:49:47,356 WARNING [train.py:1204] (2/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_89-321955-0_sp0.9 from training. Duration: 0.888875 2023-10-04 09:49:52,041 WARNING [train.py:1204] (2/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_438-105429-0_sp1.1 from training. Duration: 0.963625 2023-10-04 09:49:52,044 WARNING [train.py:1204] (2/4) Exclude cut with ID _14970_210_15709_1_1530773543493_1715730_187-20259-0_sp0.9 from training. Duration: 0.988875 2023-10-04 09:50:00,929 WARNING [train.py:1204] (2/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_385-193052-0_sp0.9 from training. Duration: 0.9555625 2023-10-04 09:50:00,979 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7113_210_1849_1_1527942685_3448768_194-311626-0_sp1.1 from training. Duration: 0.92725 2023-10-04 09:50:03,896 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_3780_210_2087_1_1526467389_2145572_176-107140-0_sp0.9 from training. Duration: 0.7666875 2023-10-04 09:50:03,898 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_36756_210_7234_1_1533026991_966276_123-213457-0_sp1.1 from training. Duration: 0.936375 2023-10-04 09:50:05,912 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.ff3_skip_rate, batch_count=1611840.0, ans=0.0 2023-10-04 09:50:07,108 WARNING [train.py:1204] (2/4) Exclude cut with ID _23557_210_8750_1_1531890106370_6846255_456-244861-0_sp1.1 from training. Duration: 0.963625 2023-10-04 09:50:08,563 WARNING [train.py:1204] (2/4) Exclude cut with ID _37086_210_9038_1_1532999540891_7021250_242-349170-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 09:50:09,965 WARNING [train.py:1204] (2/4) Exclude cut with ID _41298_210_9019_1_1533430371450_4822143_471-281351-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 09:50:11,324 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_53378_210_17282_1_1534221071_5769509_572-126176-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 09:50:11,526 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.ff3_skip_rate, batch_count=1611906.6666666667, ans=0.0 2023-10-04 09:50:13,249 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_1507-263032-0_sp1.1 from training. Duration: 0.963625 2023-10-04 09:50:13,293 WARNING [train.py:1204] (2/4) Exclude cut with ID _13679_210_7497_1_1530534293662_3355098_411-186088-0_sp1.1 from training. Duration: 0.8545625 2023-10-04 09:50:16,083 WARNING [train.py:1204] (2/4) Exclude cut with ID _29345_210_4381_1_1532689168263_3860169_149-257268-0_sp1.1 from training. Duration: 0.736375 2023-10-04 09:50:16,184 WARNING [train.py:1204] (2/4) Exclude cut with ID _47790_210_16187_1_1533862814066_4207519_381-252014-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 09:50:16,190 WARNING [train.py:1204] (2/4) Exclude cut with ID _35799_210_5854_1_1533263487887_3529071_512-2717-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 09:50:20,150 WARNING [train.py:1204] (2/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_505-205270-0_sp0.9 from training. Duration: 0.42225 2023-10-04 09:50:20,229 WARNING [train.py:1204] (2/4) Exclude cut with ID _14998_210_5219_1_1530705582080_7474970_731-20117-0_sp1.1 from training. Duration: 0.936375 2023-10-04 09:50:23,322 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_37351_210_7925_1_1533012931_3812113_265-85780-0_sp1.1 from training. Duration: 0.8 2023-10-04 09:50:23,508 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.bypass.scale_min, batch_count=1611973.3333333333, ans=0.2 2023-10-04 09:50:24,547 INFO [train.py:1046] (2/4) Epoch 46, batch 2750, loss[loss=0.1576, simple_loss=0.244, pruned_loss=0.03562, over 24040.00 frames. ], tot_loss[loss=0.1548, simple_loss=0.2355, pruned_loss=0.03707, over 4707682.32 frames. ], batch size: 80, lr: 2.21e-03, grad_scale: 16.0 2023-10-04 09:50:24,584 WARNING [train.py:1204] (2/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_313-134930-0 from training. Duration: 0.97 2023-10-04 09:50:25,980 WARNING [train.py:1204] (2/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_374-138971-0 from training. Duration: 0.94 2023-10-04 09:50:26,020 WARNING [train.py:1204] (2/4) Exclude cut with ID _30304_210_6319_1_1532476827261_7582060_1227-287886-0_sp1.1 from training. Duration: 0.936375 2023-10-04 09:50:31,044 WARNING [train.py:1204] (2/4) Exclude cut with ID _59493_210_17282_1_1535078419077_3225879_332-217544-0_sp1.1 from training. Duration: 0.97275 2023-10-04 09:50:31,086 WARNING [train.py:1204] (2/4) Exclude cut with ID _23514_210_1389_1_1531832139785_3031439_361-119640-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 09:50:32,540 WARNING [train.py:1204] (2/4) Exclude cut with ID _69584_210_5219_1_1535785133775_4044450_142-118058-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 09:50:33,756 WARNING [train.py:1204] (2/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_178-177582-0_sp1.1 from training. Duration: 0.67275 2023-10-04 09:50:33,795 WARNING [train.py:1204] (2/4) Exclude cut with ID _25303_210_17519_1_1532050207576_3635970_41-195572-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 09:50:36,585 WARNING [train.py:1204] (2/4) Exclude cut with ID _55328_210_27767_1_1534580973260_4088170_329-341322-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 09:50:37,226 WARNING [train.py:1204] (2/4) Exclude cut with ID _5608_210_2106_1_1527415439089_6819768_909-82767-0_sp1.1 from training. Duration: 0.5545625 2023-10-04 09:50:37,268 WARNING [train.py:1204] (2/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_261-297503-0_sp1.1 from training. Duration: 0.9 2023-10-04 09:50:37,274 WARNING [train.py:1204] (2/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_459-291706-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 09:50:37,275 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7762_210_1794_1_1528199508_4057408_422-227034-0 from training. Duration: 0.73 2023-10-04 09:50:37,284 WARNING [train.py:1204] (2/4) Exclude cut with ID _3067_210_2667_1_1526212974640_4503168_250-285937-0_sp0.9 from training. Duration: 0.888875 2023-10-04 09:50:38,483 WARNING [train.py:1204] (2/4) Exclude cut with ID _66873_210_5517_1_1535454136506_4293370_332-253745-0_sp1.1 from training. Duration: 0.936375 2023-10-04 09:50:41,346 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.bypass_mid.scale_min, batch_count=1612040.0, ans=0.2 2023-10-04 09:50:43,935 WARNING [train.py:1204] (2/4) Exclude cut with ID _13938_210_9988_1_1530518718978_3693960_304-112784-0 from training. Duration: 0.46 2023-10-04 09:50:45,439 WARNING [train.py:1204] (2/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_226-267957-0_sp1.1 from training. Duration: 0.92725 2023-10-04 09:50:46,788 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_41822_210_13270_1_1533955976_4379284_608-254567-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 09:50:48,469 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_124-113562-0_sp1.1 from training. Duration: 0.8545625 2023-10-04 09:50:48,509 WARNING [train.py:1204] (2/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_117-290916-0_sp1.1 from training. Duration: 0.463625 2023-10-04 09:50:49,824 WARNING [train.py:1204] (2/4) Exclude cut with ID _28427_210_4220_1_1532581240967_3061069_102-66533-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 09:50:51,162 WARNING [train.py:1204] (2/4) Exclude cut with ID _55383_210_10635_1_1534468426172_2231669_134-128246-0_sp1.1 from training. Duration: 0.8090625 2023-10-04 09:50:52,934 WARNING [train.py:1204] (2/4) Exclude cut with ID _45871_210_5665_1_1533864614245_3296060_49-308316-0_sp1.1 from training. Duration: 0.97275 2023-10-04 09:50:52,981 WARNING [train.py:1204] (2/4) Exclude cut with ID _57572_210_5517_1_1534921219624_3809484_176-199205-0_sp1.1 from training. Duration: 0.97275 2023-10-04 09:50:55,965 WARNING [train.py:1204] (2/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_45-265684-0_sp1.1 from training. Duration: 0.6545625 2023-10-04 09:50:55,990 WARNING [train.py:1204] (2/4) Exclude cut with ID _15150_210_8341_1_1530752411803_7256384_469-239509-0_sp0.9 from training. Duration: 0.7555625 2023-10-04 09:50:56,730 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.2.encoder.layers.2.conv_module2.whiten, num_groups=1, num_channels=384, metric=5.35 vs. limit=15.0 2023-10-04 09:50:57,309 WARNING [train.py:1204] (2/4) Exclude cut with ID _28461_210_2253_1_1532771775189_3041299_136-270644-0_sp1.1 from training. Duration: 0.8454375 2023-10-04 09:50:57,378 WARNING [train.py:1204] (2/4) Exclude cut with ID _69584_210_5219_1_1535785133775_4044450_302-47302-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 09:50:59,329 WARNING [train.py:1204] (2/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_166-128941-0_sp1.1 from training. Duration: 0.4909375 2023-10-04 09:51:04,887 WARNING [train.py:1204] (2/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_559-120504-0_sp1.1 from training. Duration: 0.97275 2023-10-04 09:51:07,696 WARNING [train.py:1204] (2/4) Exclude cut with ID _6456_210_1534_1_1527681634212_3181049_345-252877-0_sp1.1 from training. Duration: 0.5818125 2023-10-04 09:51:07,727 WARNING [train.py:1204] (2/4) Exclude cut with ID _28317_210_12742_1_1532480302381_8510325_902-233003-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 09:51:11,188 WARNING [train.py:1204] (2/4) Exclude cut with ID _36538_210_9994_1_1533204020363_3795620_204-261385-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 09:51:11,192 WARNING [train.py:1204] (2/4) Exclude cut with ID _14821_210_8750_1_1530680503991_6829359_189-297451-0_sp1.1 from training. Duration: 0.5 2023-10-04 09:51:12,553 WARNING [train.py:1204] (2/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_1211-226066-0_sp0.9 from training. Duration: 0.8444375 2023-10-04 09:51:14,145 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.balancer1.prob, batch_count=1612173.3333333333, ans=0.125 2023-10-04 09:51:18,733 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6786_210_1388_1_1527839699_4194495_273-149841-0_sp1.1 from training. Duration: 0.836375 2023-10-04 09:51:18,761 WARNING [train.py:1204] (2/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_380-162648-0_sp1.1 from training. Duration: 0.8545625 2023-10-04 09:51:18,762 WARNING [train.py:1204] (2/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_494-199538-0 from training. Duration: 0.75 2023-10-04 09:51:22,124 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.4.encoder.layers.0.conv_module1.whiten, num_groups=1, num_channels=384, metric=4.45 vs. limit=15.0 2023-10-04 09:51:23,533 WARNING [train.py:1204] (2/4) Exclude cut with ID _49097_210_16049_1_1533952780207_4360979_276-87053-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 09:51:25,055 WARNING [train.py:1204] (2/4) Exclude cut with ID _5444_210_2151_1_1527418808464_5312210_726-27165-0 from training. Duration: 0.67 2023-10-04 09:51:25,390 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.ff2_skip_rate, batch_count=1612240.0, ans=0.0 2023-10-04 09:51:29,705 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_128-202104-0_sp1.1 from training. Duration: 0.57275 2023-10-04 09:51:31,094 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_11349_210_6753_1_1529800861_1101207_26-65602-0_sp1.1 from training. Duration: 0.87275 2023-10-04 09:51:31,131 WARNING [train.py:1204] (2/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_159-328157-0 from training. Duration: 0.52 2023-10-04 09:51:32,537 WARNING [train.py:1204] (2/4) Exclude cut with ID _14069_210_5632_1_1530512692033_4803390_98-21934-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 09:51:33,892 WARNING [train.py:1204] (2/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_352-59958-0_sp0.9 from training. Duration: 0.9555625 2023-10-04 09:51:33,934 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6163_210_6400_1_1527506746_4263068_283-137811-0 from training. Duration: 0.77 2023-10-04 09:51:35,178 WARNING [train.py:1204] (2/4) Exclude cut with ID _21989_210_5854_1_1531899214931_3271360_133-44759-0_sp1.1 from training. Duration: 0.7 2023-10-04 09:51:37,870 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.736e+02 1.982e+02 2.271e+02 2.856e+02 5.103e+02, threshold=4.543e+02, percent-clipped=1.0 2023-10-04 09:51:37,972 WARNING [train.py:1204] (2/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_21-174514-0_sp0.9 from training. Duration: 0.47775 2023-10-04 09:51:38,008 WARNING [train.py:1204] (2/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_201-78811-0_sp1.1 from training. Duration: 0.963625 2023-10-04 09:51:39,232 INFO [train.py:1046] (2/4) Epoch 46, batch 2800, loss[loss=0.1359, simple_loss=0.2024, pruned_loss=0.03466, over 23535.00 frames. ], tot_loss[loss=0.1542, simple_loss=0.2349, pruned_loss=0.03671, over 4706446.59 frames. ], batch size: 256, lr: 2.21e-03, grad_scale: 32.0 2023-10-04 09:51:39,299 WARNING [train.py:1204] (2/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_537-164630-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 09:51:39,951 WARNING [train.py:1204] (2/4) Exclude cut with ID _14087_210_2253_1_1530529224512_3014029_156-257607-0 from training. Duration: 0.83 2023-10-04 09:51:39,962 WARNING [train.py:1204] (2/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_308-154450-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 09:51:40,002 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_45399_210_4852_1_1533624725_7579737_214-240574-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 09:51:43,964 WARNING [train.py:1204] (2/4) Exclude cut with ID _29303_210_2575_1_1532392237303_7410649_615-305517-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 09:51:44,013 WARNING [train.py:1204] (2/4) Exclude cut with ID IC0125W0002-61808 from training. Duration: 0.708 2023-10-04 09:51:44,013 WARNING [train.py:1204] (2/4) Exclude cut with ID IC0125W0017-61823 from training. Duration: 0.759 2023-10-04 09:51:47,208 WARNING [train.py:1204] (2/4) Exclude cut with ID _53219_210_4525_1_1534834836008_4064825_4-145291-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 09:51:49,948 WARNING [train.py:1204] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_185-174805-0_sp1.1 from training. Duration: 0.6181875 2023-10-04 09:51:49,970 WARNING [train.py:1204] (2/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_635-193046-0_sp1.1 from training. Duration: 0.92725 2023-10-04 09:51:50,281 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.feed_forward2.hidden_balancer.prob, batch_count=1612306.6666666667, ans=0.125 2023-10-04 09:51:53,278 WARNING [train.py:1204] (2/4) Exclude cut with ID _46067_210_4220_1_1534125571066_3307450_321-258866-0_sp1.1 from training. Duration: 0.97275 2023-10-04 09:51:54,725 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8558_210_1410_1_1528505225_7613406_131-329524-0 from training. Duration: 0.79 2023-10-04 09:51:56,080 WARNING [train.py:1204] (2/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_847-41334-0_sp0.9 from training. Duration: 0.488875 2023-10-04 09:51:58,068 WARNING [train.py:1204] (2/4) Exclude cut with ID _14227_210_4381_1_1530701804286_3940259_125-275642-0 from training. Duration: 0.99 2023-10-04 09:51:58,155 WARNING [train.py:1204] (2/4) Exclude cut with ID _25248_210_8750_1_1532062825941_5554180_248-177255-0_sp1.1 from training. Duration: 0.963625 2023-10-04 09:51:59,447 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_40047_210_2290_1_1533290519_2374311_195-165693-0_sp1.1 from training. Duration: 0.8090625 2023-10-04 09:51:59,451 WARNING [train.py:1204] (2/4) Exclude cut with ID _23109_210_4381_1_1532150828777_901949_29-258672-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 09:52:01,012 WARNING [train.py:1204] (2/4) Exclude cut with ID _14821_210_8750_1_1530680503991_6829359_2-227151-0_sp1.1 from training. Duration: 0.7545625 2023-10-04 09:52:02,316 WARNING [train.py:1204] (2/4) Exclude cut with ID _49643_210_7989_1_1534640244224_3939031_565-53982-0_sp1.1 from training. Duration: 0.963625 2023-10-04 09:52:02,320 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7544_210_6753_1_1528591327_1471426_33-346031-0_sp0.9 from training. Duration: 0.67775 2023-10-04 09:52:02,406 WARNING [train.py:1204] (2/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_95-61782-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 09:52:09,607 WARNING [train.py:1204] (2/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_686-54383-0_sp0.9 from training. Duration: 0.9666875 2023-10-04 09:52:10,923 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_14056_210_4163_1_1530604483_7572827_505-276823-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 09:52:11,168 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.ff2_skip_rate, batch_count=1612440.0, ans=0.0 2023-10-04 09:52:12,345 WARNING [train.py:1204] (2/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_745-128443-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 09:52:14,904 WARNING [train.py:1204] (2/4) Exclude cut with ID _3844_210_1390_1_1526727803582_2871209_20-251664-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 09:52:14,956 WARNING [train.py:1204] (2/4) Exclude cut with ID _36538_210_9994_1_1533204020363_3795620_487-164628-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 09:52:21,490 WARNING [train.py:1204] (2/4) Exclude cut with ID _5166_210_2084_1_1527073197160_4087993_309-222545-0_sp1.1 from training. Duration: 0.763625 2023-10-04 09:52:21,504 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_3531_210_3528_1_1526294203_5216456_309-227302-0 from training. Duration: 0.69 2023-10-04 09:52:21,561 WARNING [train.py:1204] (2/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_42-192460-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 09:52:22,956 WARNING [train.py:1204] (2/4) Exclude cut with ID _22083_210_8254_1_1531702862439_3678970_49-234546-0_sp1.1 from training. Duration: 0.7545625 2023-10-04 09:52:22,959 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_427-78097-0_sp0.9 from training. Duration: 0.888875 2023-10-04 09:52:26,481 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_43802_210_2290_1_1533516203_8829504_378-205254-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 09:52:26,521 WARNING [train.py:1204] (2/4) Exclude cut with ID _45329_210_7005_1_1534157951211_6774339_672-314830-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 09:52:26,637 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.0.ff2_skip_rate, batch_count=1612506.6666666667, ans=0.0 2023-10-04 09:52:30,940 WARNING [train.py:1204] (2/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_90-30673-0_sp1.1 from training. Duration: 0.763625 2023-10-04 09:52:32,318 WARNING [train.py:1204] (2/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_563-329943-0_sp1.1 from training. Duration: 0.9 2023-10-04 09:52:32,340 WARNING [train.py:1204] (2/4) Exclude cut with ID _21632_210_2253_1_1531994001765_3744140_154-84090-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 09:52:32,341 WARNING [train.py:1204] (2/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_103-336064-0_sp0.9 from training. Duration: 0.5666875 2023-10-04 09:52:33,623 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_43-71989-0_sp0.9 from training. Duration: 0.6333125 2023-10-04 09:52:33,689 WARNING [train.py:1204] (2/4) Exclude cut with ID _3867_210_1883_1_1526641200575_4475830_319-349850-0_sp0.9 from training. Duration: 0.7444375 2023-10-04 09:52:35,122 WARNING [train.py:1204] (2/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_629-179454-0_sp1.1 from training. Duration: 0.963625 2023-10-04 09:52:35,351 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.bypass_mid.scale_min, batch_count=1612506.6666666667, ans=0.2 2023-10-04 09:52:36,467 WARNING [train.py:1204] (2/4) Exclude cut with ID _65263_210_22356_1_1535763598691_3921037_49-222917-0 from training. Duration: 0.92 2023-10-04 09:52:36,497 WARNING [train.py:1204] (2/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_602-331772-0_sp1.1 from training. Duration: 0.936375 2023-10-04 09:52:37,824 WARNING [train.py:1204] (2/4) Exclude cut with ID _58414_210_8254_1_1535421609162_7151182_292-134978-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 09:52:38,684 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.conv_module2.balancer2.prob, batch_count=1612573.3333333333, ans=0.125 2023-10-04 09:52:39,785 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_38793_210_2544_1_1533176114_3849320_200-299292-0_sp1.1 from training. Duration: 0.936375 2023-10-04 09:52:41,159 WARNING [train.py:1204] (2/4) Exclude cut with ID _14970_210_15709_1_1530775547361_1966965_6-23328-0 from training. Duration: 0.86 2023-10-04 09:52:42,541 WARNING [train.py:1204] (2/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_386-225511-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 09:52:42,552 WARNING [train.py:1204] (2/4) Exclude cut with ID _38323_210_12108_1_1533121233369_3303049_416-11919-0_sp1.1 from training. Duration: 0.87275 2023-10-04 09:52:42,600 WARNING [train.py:1204] (2/4) Exclude cut with ID _4866_210_2577_1_1527404412284_4364659_256-306830-0_sp1.1 from training. Duration: 0.7909375 2023-10-04 09:52:45,348 WARNING [train.py:1204] (2/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_53-300544-0 from training. Duration: 0.67 2023-10-04 09:52:46,915 INFO [scaling.py:1118] (2/4) WithLoss: name=encoder.encoders.2.encoder.layers.2.self_attn_weights, loss-sum=0.000e+00 2023-10-04 09:52:51,292 WARNING [train.py:1204] (2/4) Exclude cut with ID _41314_210_12651_1_1533459476543_3766460_80-74184-0_sp1.1 from training. Duration: 0.7545625 2023-10-04 09:52:53,026 WARNING [train.py:1204] (2/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_332-256168-0_sp1.1 from training. Duration: 0.6818125 2023-10-04 09:52:53,081 WARNING [train.py:1204] (2/4) Exclude cut with ID _48836_210_12210_1_1534075848293_3904150_429-35475-0_sp1.1 from training. Duration: 0.8090625 2023-10-04 09:52:54,410 INFO [train.py:1046] (2/4) Epoch 46, batch 2850, loss[loss=0.1413, simple_loss=0.228, pruned_loss=0.02727, over 24440.00 frames. ], tot_loss[loss=0.1533, simple_loss=0.2338, pruned_loss=0.03637, over 4708405.59 frames. ], batch size: 63, lr: 2.21e-03, grad_scale: 32.0 2023-10-04 09:52:54,574 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_144-135833-0_sp1.1 from training. Duration: 0.97275 2023-10-04 09:52:59,159 WARNING [train.py:1204] (2/4) Exclude cut with ID _14224_210_4381_1_1530529062037_4264010_380-343670-0_sp0.9 from training. Duration: 0.97775 2023-10-04 09:52:59,182 WARNING [train.py:1204] (2/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_213-265174-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 09:53:00,537 WARNING [train.py:1204] (2/4) Exclude cut with ID _20455_210_5219_1_1531565996775_8098100_86-314283-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 09:53:03,265 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_41750_210_1960_1_1533520678_3948042_68-223813-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 09:53:03,318 WARNING [train.py:1204] (2/4) Exclude cut with ID _55153_210_2253_1_1534384786677_3162096_24-198429-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 09:53:03,755 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.5.encoder.layers.1.self_attn2.whiten, num_groups=1, num_channels=256, metric=10.24 vs. limit=22.5 2023-10-04 09:53:06,090 WARNING [train.py:1204] (2/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_119-207162-0_sp0.9 from training. Duration: 0.788875 2023-10-04 09:53:06,134 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_12868_210_6753_1_1530702479_3610564_148-172861-0 from training. Duration: 0.5 2023-10-04 09:53:13,682 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_5522_210_3273_1_1527249606_3909096_318-203153-0 from training. Duration: 0.43 2023-10-04 09:53:13,689 WARNING [train.py:1204] (2/4) Exclude cut with ID _14884_210_2252_1_1530707459689_5561250_288-189949-0_sp1.1 from training. Duration: 0.97275 2023-10-04 09:53:13,848 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.1.conv_module1.balancer1.prob, batch_count=1612706.6666666667, ans=0.125 2023-10-04 09:53:15,006 WARNING [train.py:1204] (2/4) Exclude cut with ID _5294_210_1747_1_1527249643928_3696179_529-242298-0 from training. Duration: 0.95 2023-10-04 09:53:16,324 WARNING [train.py:1204] (2/4) Exclude cut with ID _36538_210_9994_1_1533204020363_3795620_503-110289-0_sp1.1 from training. Duration: 0.936375 2023-10-04 09:53:17,838 WARNING [train.py:1204] (2/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_385-193052-0 from training. Duration: 0.86 2023-10-04 09:53:19,197 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_456-22141-0 from training. Duration: 0.88 2023-10-04 09:53:21,718 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_38965_210_4852_1_1533174906_3484735_284-117840-0_sp1.1 from training. Duration: 0.936375 2023-10-04 09:53:21,977 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.ff3_skip_rate, batch_count=1612773.3333333333, ans=0.0 2023-10-04 09:53:33,022 WARNING [train.py:1204] (2/4) Exclude cut with ID _32704_210_8954_1_1533017488840_6747370_75-201625-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 09:53:33,121 WARNING [train.py:1204] (2/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_255-312654-0_sp1.1 from training. Duration: 0.82725 2023-10-04 09:53:33,131 WARNING [train.py:1204] (2/4) Exclude cut with ID _47251_210_18628_1_1533814063128_4137250_214-88076-0_sp0.9 from training. Duration: 0.97775 2023-10-04 09:53:34,510 WARNING [train.py:1204] (2/4) Exclude cut with ID _4092_210_2151_1_1526643044449_3135759_227-213972-0_sp1.1 from training. Duration: 0.4909375 2023-10-04 09:53:34,526 WARNING [train.py:1204] (2/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_912-47473-0_sp1.1 from training. Duration: 0.6181875 2023-10-04 09:53:35,777 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6767_210_3273_1_1527855783_1718932_173-256601-0_sp1.1 from training. Duration: 0.72725 2023-10-04 09:53:37,229 WARNING [train.py:1204] (2/4) Exclude cut with ID _6274_210_2084_1_1527937583837_7494020_32-205288-0_sp1.1 from training. Duration: 0.7090625 2023-10-04 09:53:37,260 WARNING [train.py:1204] (2/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_39-248870-0 from training. Duration: 0.69 2023-10-04 09:53:38,713 WARNING [train.py:1204] (2/4) Exclude cut with ID _3541_210_2477_1_1526475408709_3263973_377-296839-0_sp1.1 from training. Duration: 0.5 2023-10-04 09:53:38,728 WARNING [train.py:1204] (2/4) Exclude cut with ID _13679_210_7497_1_1530534293662_3355098_413-56154-0_sp1.1 from training. Duration: 0.963625 2023-10-04 09:53:38,787 WARNING [train.py:1204] (2/4) Exclude cut with ID _14970_210_15709_1_1530775547361_1966965_201-84544-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 09:53:40,015 WARNING [train.py:1204] (2/4) Exclude cut with ID _21783_210_11357_1_1531742458730_3762679_105-95775-0_sp1.1 from training. Duration: 0.936375 2023-10-04 09:53:43,356 WARNING [train.py:1204] (2/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_131-191669-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 09:53:43,391 WARNING [train.py:1204] (2/4) Exclude cut with ID _14860_210_4915_1_1530702020340_7343240_695-4323-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 09:53:44,715 WARNING [train.py:1204] (2/4) Exclude cut with ID _39867_210_9826_1_1533553222030_9344259_991-130565-0_sp1.1 from training. Duration: 0.97275 2023-10-04 09:53:47,449 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_50-3289-0_sp1.1 from training. Duration: 0.82725 2023-10-04 09:53:48,869 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_54-347670-0_sp1.1 from training. Duration: 0.92725 2023-10-04 09:53:48,939 WARNING [train.py:1204] (2/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_200-153445-0_sp1.1 from training. Duration: 0.936375 2023-10-04 09:53:50,293 WARNING [train.py:1204] (2/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_279-302911-0_sp1.1 from training. Duration: 0.97275 2023-10-04 09:53:51,832 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=1612906.6666666667, ans=0.1 2023-10-04 09:53:53,499 WARNING [train.py:1204] (2/4) Exclude cut with ID _14886_210_4220_1_1530702020355_3516190_159-73748-0_sp0.9 from training. Duration: 0.92225 2023-10-04 09:53:59,514 WARNING [train.py:1204] (2/4) Exclude cut with ID _53379_210_20034_1_1534222027363_3854654_23-222715-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 09:53:59,631 WARNING [train.py:1204] (2/4) Exclude cut with ID _12555_210_8750_1_1530075594402_3780239_449-267986-0 from training. Duration: 0.83 2023-10-04 09:54:00,847 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7544_210_6753_1_1528587722_3576686_295-258479-0 from training. Duration: 0.84 2023-10-04 09:54:02,755 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7002_210_1794_1_1527935634_4034444_487-236501-0_sp0.9 from training. Duration: 0.7333125 2023-10-04 09:54:02,804 WARNING [train.py:1204] (2/4) Exclude cut with ID _21783_210_11357_1_1531742458730_3762679_244-19107-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 09:54:04,188 WARNING [train.py:1204] (2/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_390-339137-0 from training. Duration: 0.56 2023-10-04 09:54:04,262 WARNING [train.py:1204] (2/4) Exclude cut with ID _7562_210_2084_1_1528887767916_3946229_186-326422-0_sp1.1 from training. Duration: 0.763625 2023-10-04 09:54:05,566 WARNING [train.py:1204] (2/4) Exclude cut with ID _33640_210_2655_1_1533117605479_4855469_645-145235-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 09:54:05,588 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_25767_210_2161_1_1532307483_3867748_506-171777-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 09:54:05,615 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_5438_210_1794_1_1527336687_2600009_233-58072-0_sp1.1 from training. Duration: 0.7 2023-10-04 09:54:05,615 WARNING [train.py:1204] (2/4) Exclude cut with ID IC0669W0063-330908 from training. Duration: 0.896 2023-10-04 09:54:05,646 WARNING [train.py:1204] (2/4) Exclude cut with ID IC0671W0070-331913 from training. Duration: 0.896 2023-10-04 09:54:05,649 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_258-310570-0_sp1.1 from training. Duration: 0.7454375 2023-10-04 09:54:06,877 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.728e+02 2.024e+02 2.306e+02 2.714e+02 5.189e+02, threshold=4.613e+02, percent-clipped=2.0 2023-10-04 09:54:07,004 WARNING [train.py:1204] (2/4) Exclude cut with ID _9461_210_4557_1_1529060514726_3677451_162-177507-0_sp1.1 from training. Duration: 0.97275 2023-10-04 09:54:08,217 INFO [train.py:1046] (2/4) Epoch 46, batch 2900, loss[loss=0.1527, simple_loss=0.2335, pruned_loss=0.03595, over 23323.00 frames. ], tot_loss[loss=0.1531, simple_loss=0.2336, pruned_loss=0.03629, over 4713877.68 frames. ], batch size: 119, lr: 2.21e-03, grad_scale: 32.0 2023-10-04 09:54:11,126 WARNING [train.py:1204] (2/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_26-175553-0_sp0.9 from training. Duration: 0.67775 2023-10-04 09:54:11,142 WARNING [train.py:1204] (2/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_7-150481-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 09:54:11,204 WARNING [train.py:1204] (2/4) Exclude cut with ID _23557_210_8750_1_1531890106370_6846255_72-273262-0_sp1.1 from training. Duration: 0.963625 2023-10-04 09:54:13,212 WARNING [train.py:1204] (2/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_367-61330-0 from training. Duration: 0.75 2023-10-04 09:54:13,513 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.conv_module2.balancer2.prob, batch_count=1612973.3333333333, ans=0.125 2023-10-04 09:54:17,039 WARNING [train.py:1204] (2/4) Exclude cut with ID _39867_210_9826_1_1533553222030_9344259_1337-11141-0_sp1.1 from training. Duration: 0.936375 2023-10-04 09:54:17,065 WARNING [train.py:1204] (2/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_101-293367-0 from training. Duration: 0.63 2023-10-04 09:54:19,650 WARNING [train.py:1204] (2/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_10-100367-0 from training. Duration: 0.98 2023-10-04 09:54:19,773 WARNING [train.py:1204] (2/4) Exclude cut with ID _60057_210_10640_1_1534903065793_3333150_171-181133-0_sp1.1 from training. Duration: 0.8 2023-10-04 09:54:19,776 WARNING [train.py:1204] (2/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_844-96989-0_sp0.9 from training. Duration: 0.788875 2023-10-04 09:54:22,562 WARNING [train.py:1204] (2/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_301-144595-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 09:54:22,649 WARNING [train.py:1204] (2/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_115-147343-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 09:54:27,202 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_3171_210_2087_1_1526109084_2330007_37-64334-0_sp1.1 from training. Duration: 0.7454375 2023-10-04 09:54:27,245 WARNING [train.py:1204] (2/4) Exclude cut with ID _19514_210_9259_1_1531530071330_5084230_769-272914-0_sp1.1 from training. Duration: 0.936375 2023-10-04 09:54:30,691 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.conv_module1.balancer2.prob, batch_count=1613040.0, ans=0.125 2023-10-04 09:54:31,731 WARNING [train.py:1204] (2/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_76-311963-0_sp0.9 from training. Duration: 0.77775 2023-10-04 09:54:31,780 WARNING [train.py:1204] (2/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_526-62079-0 from training. Duration: 0.67 2023-10-04 09:54:33,604 WARNING [train.py:1204] (2/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_7-111954-0_sp0.9 from training. Duration: 0.62225 2023-10-04 09:54:33,726 WARNING [train.py:1204] (2/4) Exclude cut with ID _57997_210_4163_1_1534766425889_3961049_360-279140-0_sp1.1 from training. Duration: 0.97275 2023-10-04 09:54:36,432 WARNING [train.py:1204] (2/4) Exclude cut with ID _4866_210_2577_1_1527404412284_4364659_133-1696-0 from training. Duration: 0.99 2023-10-04 09:54:36,496 WARNING [train.py:1204] (2/4) Exclude cut with ID _5190_210_4557_1_1527244368799_373439_28-197030-0 from training. Duration: 0.72 2023-10-04 09:54:39,261 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_53-39003-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 09:54:39,264 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6673_210_3273_1_1527767790_3173240_351-167954-0 from training. Duration: 0.48 2023-10-04 09:54:39,278 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_2822_210_2715_1_1526112586_1999587_165-36656-0_sp1.1 from training. Duration: 0.8090625 2023-10-04 09:54:41,961 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_328-264892-0_sp1.1 from training. Duration: 0.92725 2023-10-04 09:54:41,965 WARNING [train.py:1204] (2/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_29-293896-0_sp0.9 from training. Duration: 0.67775 2023-10-04 09:54:42,241 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.conv_module1.balancer1.prob, batch_count=1613106.6666666667, ans=0.125 2023-10-04 09:54:43,567 WARNING [train.py:1204] (2/4) Exclude cut with ID _65263_210_22356_1_1535763598691_3921037_465-55448-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 09:54:45,338 WARNING [train.py:1204] (2/4) Exclude cut with ID _21989_210_5854_1_1531899214931_3271360_311-53106-0_sp1.1 from training. Duration: 0.97275 2023-10-04 09:54:49,456 WARNING [train.py:1204] (2/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_389-277915-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 09:54:51,765 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.0.layers.0.nonlin_attention.whiten2, num_groups=1, num_channels=192, metric=5.05 vs. limit=15.0 2023-10-04 09:54:52,205 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_69630_210_7234_1_1535791094_677837_63-277139-0_sp1.1 from training. Duration: 0.963625 2023-10-04 09:54:53,642 WARNING [train.py:1204] (2/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_85-138257-0 from training. Duration: 0.71 2023-10-04 09:54:53,665 WARNING [train.py:1204] (2/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_686-247600-0 from training. Duration: 0.92 2023-10-04 09:54:53,671 WARNING [train.py:1204] (2/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_108-255430-0_sp1.1 from training. Duration: 0.8818125 2023-10-04 09:54:58,260 WARNING [train.py:1204] (2/4) Exclude cut with ID _14884_210_2252_1_1530707459689_5561250_114-201399-0_sp1.1 from training. Duration: 0.6909375 2023-10-04 09:55:00,936 WARNING [train.py:1204] (2/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_506-42614-0 from training. Duration: 0.79 2023-10-04 09:55:02,640 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_43802_210_2290_1_1533516203_8829504_85-195240-0_sp1.1 from training. Duration: 0.7909375 2023-10-04 09:55:08,534 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_141-36243-0_sp1.1 from training. Duration: 0.97275 2023-10-04 09:55:16,785 WARNING [train.py:1204] (2/4) Exclude cut with ID _7852_210_5219_1_1528286471316_8139400_358-287492-0_sp1.1 from training. Duration: 0.87275 2023-10-04 09:55:16,810 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_805-71497-0_sp0.9 from training. Duration: 0.788875 2023-10-04 09:55:17,538 WARNING [train.py:1204] (2/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_178-39722-0 from training. Duration: 0.49 2023-10-04 09:55:19,975 WARNING [train.py:1204] (2/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_428-181925-0_sp1.1 from training. Duration: 0.963625 2023-10-04 09:55:19,987 WARNING [train.py:1204] (2/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_770-200430-0 from training. Duration: 0.89 2023-10-04 09:55:20,025 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_1104-271009-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 09:55:21,296 INFO [train.py:1046] (2/4) Epoch 46, batch 2950, loss[loss=0.1606, simple_loss=0.2455, pruned_loss=0.03785, over 23729.00 frames. ], tot_loss[loss=0.1537, simple_loss=0.2346, pruned_loss=0.03642, over 4716634.13 frames. ], batch size: 85, lr: 2.21e-03, grad_scale: 32.0 2023-10-04 09:55:21,381 WARNING [train.py:1204] (2/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_128-296581-0_sp0.9 from training. Duration: 0.82225 2023-10-04 09:55:26,933 WARNING [train.py:1204] (2/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_19-62511-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 09:55:28,855 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_805-71497-0 from training. Duration: 0.71 2023-10-04 09:55:30,248 WARNING [train.py:1204] (2/4) Exclude cut with ID _44778_210_2054_1_1533798127203_7443090_573-152077-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 09:55:30,253 WARNING [train.py:1204] (2/4) Exclude cut with ID _42875_210_14856_1_1533704397487_4012433_393-177816-0_sp1.1 from training. Duration: 0.963625 2023-10-04 09:55:33,492 WARNING [train.py:1204] (2/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_278-35159-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 09:55:33,581 WARNING [train.py:1204] (2/4) Exclude cut with ID _15037_210_2253_1_1530752294104_3267650_13-130361-0_sp1.1 from training. Duration: 0.9 2023-10-04 09:55:34,969 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_3019_210_2028_1_1526044245_3264865_127-135615-0 from training. Duration: 0.94 2023-10-04 09:55:35,025 WARNING [train.py:1204] (2/4) Exclude cut with ID _28317_210_12742_1_1532480302381_8510325_607-213951-0 from training. Duration: 0.84 2023-10-04 09:55:36,801 WARNING [train.py:1204] (2/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_436-318113-0_sp0.9 from training. Duration: 0.8333125 2023-10-04 09:55:38,083 WARNING [train.py:1204] (2/4) Exclude cut with ID _13938_210_9988_1_1530518718978_3693960_148-152225-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 09:55:43,539 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_2541_210_1794_1_1525590269_3084765_156-173713-0_sp1.1 from training. Duration: 0.736375 2023-10-04 09:55:44,957 WARNING [train.py:1204] (2/4) Exclude cut with ID _28323_210_16187_1_1532480395667_4100300_531-24157-0_sp1.1 from training. Duration: 0.8909375 2023-10-04 09:55:46,545 WARNING [train.py:1204] (2/4) Exclude cut with ID _29303_210_2575_1_1532392237303_7410649_382-217492-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 09:55:46,587 WARNING [train.py:1204] (2/4) Exclude cut with ID _58895_210_8750_1_1535184081320_3649089_270-158013-0_sp1.1 from training. Duration: 0.92725 2023-10-04 09:55:49,815 WARNING [train.py:1204] (2/4) Exclude cut with ID _29277_210_3279_1_1532581109293_3173069_311-206227-0_sp1.1 from training. Duration: 0.936375 2023-10-04 09:55:49,826 WARNING [train.py:1204] (2/4) Exclude cut with ID _7160_210_1876_1_1527940923231_3703799_282-322611-0_sp0.9 from training. Duration: 0.9666875 2023-10-04 09:55:51,185 WARNING [train.py:1204] (2/4) Exclude cut with ID _53219_210_4525_1_1534834836008_4064825_562-32295-0_sp1.1 from training. Duration: 0.963625 2023-10-04 09:55:52,700 WARNING [train.py:1204] (2/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_408-87330-0_sp1.1 from training. Duration: 0.963625 2023-10-04 09:55:52,707 WARNING [train.py:1204] (2/4) Exclude cut with ID _59202_210_26984_1_1534816810612_3549174_137-318710-0_sp1.1 from training. Duration: 0.8090625 2023-10-04 09:55:53,040 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.balancer1.prob, batch_count=1613440.0, ans=0.125 2023-10-04 09:55:55,342 WARNING [train.py:1204] (2/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_372-30302-0 from training. Duration: 0.86 2023-10-04 09:55:59,487 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7543_210_6753_1_1528541907_1399322_178-56269-0_sp1.1 from training. Duration: 0.95275 2023-10-04 09:55:59,506 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9109W0012-549758 from training. Duration: 0.512 2023-10-04 09:56:01,376 WARNING [train.py:1204] (2/4) Exclude cut with ID _5410_210_1388_1_1527397055795_1719020_39-112221-0_sp0.9 from training. Duration: 0.7666875 2023-10-04 09:56:03,308 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.4.encoder.layers.2.nonlin_attention.whiten1, num_groups=1, num_channels=288, metric=4.08 vs. limit=10.0 2023-10-04 09:56:04,078 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9121W0012-554933 from training. Duration: 0.896 2023-10-04 09:56:04,172 WARNING [train.py:1204] (2/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_476-272757-0 from training. Duration: 0.93 2023-10-04 09:56:05,614 WARNING [train.py:1204] (2/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_1085-171718-0_sp1.1 from training. Duration: 0.92725 2023-10-04 09:56:05,672 WARNING [train.py:1204] (2/4) Exclude cut with ID _8480_210_1874_1_1528589391205_6728330_309-139896-0_sp1.1 from training. Duration: 0.736375 2023-10-04 09:56:05,672 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9130W0113-559703 from training. Duration: 0.896 2023-10-04 09:56:05,676 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_292-265928-0_sp1.1 from training. Duration: 0.463625 2023-10-04 09:56:08,934 WARNING [train.py:1204] (2/4) Exclude cut with ID _8772_210_1850_1_1528635595972_3664011_96-12756-0 from training. Duration: 0.98 2023-10-04 09:56:08,995 WARNING [train.py:1204] (2/4) Exclude cut with ID _12556_210_8341_1_1530081024100_7327690_725-193477-0_sp1.1 from training. Duration: 0.8909375 2023-10-04 09:56:09,040 WARNING [train.py:1204] (2/4) Exclude cut with ID _5289_210_2084_1_1527678202736_3779299_142-103317-0_sp1.1 from training. Duration: 0.763625 2023-10-04 09:56:10,093 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.0.layers.0.conv_module2.whiten, num_groups=1, num_channels=192, metric=9.09 vs. limit=15.0 2023-10-04 09:56:10,711 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.self_attn_weights.pos_emb_skip_rate, batch_count=1613506.6666666667, ans=0.0 2023-10-04 09:56:11,794 WARNING [train.py:1204] (2/4) Exclude cut with ID _45012_210_12651_1_1533608435136_5371380_206-51075-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 09:56:13,213 WARNING [train.py:1204] (2/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_176-56120-0_sp1.1 from training. Duration: 0.7545625 2023-10-04 09:56:13,244 WARNING [train.py:1204] (2/4) Exclude cut with ID _40284_210_2084_1_1533513679814_3704279_220-201864-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 09:56:14,595 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9165W0004-576638 from training. Duration: 0.896 2023-10-04 09:56:14,661 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_5438_210_1794_1_1527336687_2600009_218-343336-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 09:56:15,983 WARNING [train.py:1204] (2/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_318-162017-0 from training. Duration: 0.91 2023-10-04 09:56:20,645 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_49544_210_1794_1_1534379770_5262083_483-349298-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 09:56:21,962 WARNING [train.py:1204] (2/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_197-28164-0_sp0.9 from training. Duration: 0.888875 2023-10-04 09:56:22,031 WARNING [train.py:1204] (2/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_643-52007-0 from training. Duration: 0.57 2023-10-04 09:56:22,048 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_38965_210_4852_1_1533174906_3484735_403-332963-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 09:56:24,173 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.2.encoder.layers.0.whiten, num_groups=1, num_channels=384, metric=5.51 vs. limit=12.0 2023-10-04 09:56:24,715 WARNING [train.py:1204] (2/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_340-174176-0 from training. Duration: 0.49 2023-10-04 09:56:27,470 WARNING [train.py:1204] (2/4) Exclude cut with ID _56708_210_13177_1_1535252351282_3422899_363-917-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 09:56:28,910 WARNING [train.py:1204] (2/4) Exclude cut with ID _19896_210_1883_1_1531709022662_6649569_525-159063-0_sp1.1 from training. Duration: 0.936375 2023-10-04 09:56:28,948 WARNING [train.py:1204] (2/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_430-116139-0_sp0.9 from training. Duration: 0.9444375 2023-10-04 09:56:32,013 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_53753_210_7234_1_1534320003_3594148_86-329250-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 09:56:32,023 WARNING [train.py:1204] (2/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_406-275649-0_sp1.1 from training. Duration: 0.3818125 2023-10-04 09:56:32,133 WARNING [train.py:1204] (2/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_1083-147536-0_sp1.1 from training. Duration: 0.8818125 2023-10-04 09:56:33,946 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.636e+02 1.976e+02 2.238e+02 2.495e+02 4.043e+02, threshold=4.475e+02, percent-clipped=0.0 2023-10-04 09:56:34,059 WARNING [train.py:1204] (2/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_54-311741-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 09:56:34,064 WARNING [train.py:1204] (2/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_237-58433-0_sp0.9 from training. Duration: 0.82225 2023-10-04 09:56:35,346 INFO [train.py:1046] (2/4) Epoch 46, batch 3000, loss[loss=0.1621, simple_loss=0.2353, pruned_loss=0.04445, over 23610.00 frames. ], tot_loss[loss=0.154, simple_loss=0.2348, pruned_loss=0.03659, over 4728104.01 frames. ], batch size: 134, lr: 2.21e-03, grad_scale: 32.0 2023-10-04 09:56:35,347 INFO [train.py:1069] (2/4) Computing validation loss 2023-10-04 09:56:47,856 INFO [train.py:1078] (2/4) Epoch 46, validation: loss=0.3542, simple_loss=0.2819, pruned_loss=0.2132, over 1125622.00 frames. 2023-10-04 09:56:47,857 INFO [train.py:1079] (2/4) Maximum memory allocated so far is 20967MB 2023-10-04 09:56:47,933 WARNING [train.py:1204] (2/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_98-51676-0_sp1.1 from training. Duration: 0.663625 2023-10-04 09:56:49,281 WARNING [train.py:1204] (2/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_386-270472-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 09:56:50,704 WARNING [train.py:1204] (2/4) Exclude cut with ID _19023_210_4353_1_1531378892552_5820780_83-254794-0_sp1.1 from training. Duration: 0.7818125 2023-10-04 09:56:52,187 WARNING [train.py:1204] (2/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_314-93656-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 09:56:52,198 WARNING [train.py:1204] (2/4) Exclude cut with ID _14170_210_5219_1_1530525949868_4469650_336-213530-0 from training. Duration: 0.9 2023-10-04 09:56:52,288 WARNING [train.py:1204] (2/4) Exclude cut with ID _36686_210_5665_1_1533778225913_3646410_65-221120-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 09:56:56,296 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_26433_210_12108_1_1532157158_4056480_271-68178-0_sp1.1 from training. Duration: 0.92725 2023-10-04 09:56:56,340 WARNING [train.py:1204] (2/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_46-207599-0_sp0.9 from training. Duration: 0.711125 2023-10-04 09:56:59,167 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9303W0114-637133 from training. Duration: 0.896 2023-10-04 09:57:00,498 WARNING [train.py:1204] (2/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_490-207861-0 from training. Duration: 0.98 2023-10-04 09:57:02,511 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6767_210_3273_1_1527855783_1718932_173-256601-0_sp0.9 from training. Duration: 0.888875 2023-10-04 09:57:03,798 WARNING [train.py:1204] (2/4) Exclude cut with ID _13921_210_9994_1_1530841782272_3503520_327-69677-0_sp1.1 from training. Duration: 0.8181875 2023-10-04 09:57:03,858 WARNING [train.py:1204] (2/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_508-335369-0 from training. Duration: 0.48 2023-10-04 09:57:03,880 WARNING [train.py:1204] (2/4) Exclude cut with ID _64631_210_27767_1_1535349580068_4165160_24-46567-0_sp1.1 from training. Duration: 0.963625 2023-10-04 09:57:10,554 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_89-35748-0_sp1.1 from training. Duration: 0.7181875 2023-10-04 09:57:20,120 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_26433_210_12108_1_1532157158_4056480_578-262107-0_sp1.1 from training. Duration: 0.9 2023-10-04 09:57:26,181 WARNING [train.py:1204] (2/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_129-337802-0 from training. Duration: 0.72 2023-10-04 09:57:27,489 WARNING [train.py:1204] (2/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_356-167816-0_sp0.9 from training. Duration: 0.8 2023-10-04 09:57:28,953 WARNING [train.py:1204] (2/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_137-116442-0_sp1.1 from training. Duration: 0.7090625 2023-10-04 09:57:28,987 WARNING [train.py:1204] (2/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_358-172940-0_sp1.1 from training. Duration: 0.963625 2023-10-04 09:57:29,015 WARNING [train.py:1204] (2/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_668-344147-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 09:57:31,850 WARNING [train.py:1204] (2/4) Exclude cut with ID _62337_210_12392_1_1535009273105_3412229_255-157667-0_sp1.1 from training. Duration: 0.936375 2023-10-04 09:57:31,853 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_486-151131-0 from training. Duration: 0.85 2023-10-04 09:57:33,811 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_53638_210_21581_1_1534297319_5808449_885-80172-0 from training. Duration: 0.96 2023-10-04 09:57:35,935 WARNING [train.py:1204] (2/4) Exclude cut with ID _50093_210_4220_1_1534417141207_3432299_105-183067-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 09:57:36,198 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.feed_forward1.out_proj.dropout_p, batch_count=1613840.0, ans=0.1 2023-10-04 09:57:37,828 WARNING [train.py:1204] (2/4) Exclude cut with ID _7562_210_2084_1_1528887767916_3946229_345-169156-0_sp0.9 from training. Duration: 0.6444375 2023-10-04 09:57:39,259 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_4528_210_3273_1_1527076654_3276777_209-219804-0_sp0.9 from training. Duration: 0.7666875 2023-10-04 09:57:40,571 WARNING [train.py:1204] (2/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_35-153886-0_sp0.9 from training. Duration: 0.9333125 2023-10-04 09:57:40,630 WARNING [train.py:1204] (2/4) Exclude cut with ID _30800_210_22382_1_1532499347904_4515959_332-121925-0_sp1.1 from training. Duration: 0.92725 2023-10-04 09:57:40,638 WARNING [train.py:1204] (2/4) Exclude cut with ID _59202_210_26984_1_1534816810612_3549174_495-65515-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 09:57:44,678 WARNING [train.py:1204] (2/4) Exclude cut with ID _6274_210_2084_1_1527937583837_7494020_769-168302-0_sp1.1 from training. Duration: 0.6454375 2023-10-04 09:57:44,721 WARNING [train.py:1204] (2/4) Exclude cut with ID _24271_210_12658_1_1531987270022_7190519_708-71345-0_sp1.1 from training. Duration: 0.936375 2023-10-04 09:57:44,721 WARNING [train.py:1204] (2/4) Exclude cut with ID 3033-130750-0096-55598-0_sp0.9 from training. Duration: 0.92225 2023-10-04 09:57:44,893 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=1613840.0, ans=0.1 2023-10-04 09:57:47,498 WARNING [train.py:1204] (2/4) Exclude cut with ID _14087_210_2253_1_1530529224512_3014029_219-224445-0_sp0.9 from training. Duration: 0.9333125 2023-10-04 09:57:48,925 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_174-277364-0 from training. Duration: 0.71 2023-10-04 09:57:49,011 WARNING [train.py:1204] (2/4) Exclude cut with ID _45021_210_9994_1_1533619751094_3330288_96-239010-0_sp1.1 from training. Duration: 0.82725 2023-10-04 09:57:50,286 WARNING [train.py:1204] (2/4) Exclude cut with ID _31602_210_5854_1_1532677021456_4458250_489-305116-0_sp1.1 from training. Duration: 0.97275 2023-10-04 09:57:50,314 WARNING [train.py:1204] (2/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_269-154726-0_sp0.9 from training. Duration: 0.9555625 2023-10-04 09:57:53,813 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.ff3_skip_rate, batch_count=1613906.6666666667, ans=0.0 2023-10-04 09:57:54,819 WARNING [train.py:1204] (2/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_126-54418-0_sp1.1 from training. Duration: 0.92725 2023-10-04 09:57:54,843 WARNING [train.py:1204] (2/4) Exclude cut with ID _42326_210_7770_1_1533814282531_3707269_182-224159-0_sp1.1 from training. Duration: 0.92725 2023-10-04 09:57:56,159 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7002_210_1794_1_1527939683_5157286_1-281264-0_sp0.9 from training. Duration: 0.488875 2023-10-04 09:57:56,188 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_86-16881-0 from training. Duration: 0.58 2023-10-04 09:57:56,212 WARNING [train.py:1204] (2/4) Exclude cut with ID _19514_210_9259_1_1531530071330_5084230_637-279094-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 09:57:56,259 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_26433_210_12108_1_1532157158_4056480_578-262107-0 from training. Duration: 0.99 2023-10-04 09:57:56,515 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.conv_skip_rate, batch_count=1613906.6666666667, ans=0.0 2023-10-04 09:57:57,610 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6291_210_3273_1_1527683649_1078498_82-221196-0_sp1.1 from training. Duration: 0.6909375 2023-10-04 09:57:57,743 WARNING [train.py:1204] (2/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_402-102311-0 from training. Duration: 0.67 2023-10-04 09:58:00,513 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1015-343884-0_sp1.1 from training. Duration: 0.563625 2023-10-04 09:58:02,146 INFO [train.py:1046] (2/4) Epoch 46, batch 3050, loss[loss=0.2096, simple_loss=0.2789, pruned_loss=0.0702, over 19718.00 frames. ], tot_loss[loss=0.1544, simple_loss=0.2355, pruned_loss=0.03661, over 4739100.87 frames. ], batch size: 388, lr: 2.21e-03, grad_scale: 16.0 2023-10-04 09:58:02,240 WARNING [train.py:1204] (2/4) Exclude cut with ID _13684_210_7497_1_1530619354863_3598359_312-235606-0_sp1.1 from training. Duration: 0.5454375 2023-10-04 09:58:02,272 WARNING [train.py:1204] (2/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_55-31904-0 from training. Duration: 0.88 2023-10-04 09:58:04,385 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_3133_210_2087_1_1526106446_2324690_25-332138-0 from training. Duration: 0.91 2023-10-04 09:58:04,387 WARNING [train.py:1204] (2/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_912-282412-0_sp0.9 from training. Duration: 0.5444375 2023-10-04 09:58:04,455 WARNING [train.py:1204] (2/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_458-299435-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 09:58:05,682 WARNING [train.py:1204] (2/4) Exclude cut with ID _69584_210_5219_1_1535785133775_4044450_275-88403-0_sp1.1 from training. Duration: 0.92725 2023-10-04 09:58:05,698 WARNING [train.py:1204] (2/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_841-341679-0_sp1.1 from training. Duration: 0.52725 2023-10-04 09:58:07,042 WARNING [train.py:1204] (2/4) Exclude cut with ID _41391_210_7994_1_1533603771872_3638201_1-54521-0_sp1.1 from training. Duration: 0.97275 2023-10-04 09:58:07,095 WARNING [train.py:1204] (2/4) Exclude cut with ID _56235_210_10640_1_1534557335235_4161099_495-186609-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 09:58:10,333 WARNING [train.py:1204] (2/4) Exclude cut with ID _4681_210_3525_1_1527250689464_120889_15-126760-0 from training. Duration: 0.81 2023-10-04 09:58:10,483 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_3019_210_2028_1_1526044245_3264865_127-135615-0_sp1.1 from training. Duration: 0.8545625 2023-10-04 09:58:10,742 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.bypass.skip_rate, batch_count=1613973.3333333333, ans=0.04949747468305833 2023-10-04 09:58:13,153 WARNING [train.py:1204] (2/4) Exclude cut with ID _30191_210_1664_1_1533189915047_7196670_915-21561-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 09:58:13,192 WARNING [train.py:1204] (2/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_6-214815-0_sp0.9 from training. Duration: 0.9444375 2023-10-04 09:58:17,249 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_45399_210_4852_1_1533624725_7579737_190-137494-0_sp1.1 from training. Duration: 0.97275 2023-10-04 09:58:20,036 WARNING [train.py:1204] (2/4) Exclude cut with ID _41314_210_12651_1_1533459476543_3766460_207-132038-0 from training. Duration: 0.94 2023-10-04 09:58:24,809 WARNING [train.py:1204] (2/4) Exclude cut with ID _14603_210_4220_1_1530615688441_3097319_90-31383-0 from training. Duration: 0.87 2023-10-04 09:58:26,133 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_15517_210_6753_1_1530952293_1664795_119-31001-0 from training. Duration: 0.94 2023-10-04 09:58:26,166 WARNING [train.py:1204] (2/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_684-85342-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 09:58:28,943 WARNING [train.py:1204] (2/4) Exclude cut with ID _14603_210_4220_1_1530615688441_3097319_97-325053-0_sp1.1 from training. Duration: 0.836375 2023-10-04 09:58:32,880 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_68702_210_23689_1_1535799577_3420068_168-25150-0_sp1.1 from training. Duration: 0.97275 2023-10-04 09:58:32,887 WARNING [train.py:1204] (2/4) Exclude cut with ID _65134_210_14763_1_1535335393985_3505368_111-27984-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 09:58:32,947 WARNING [train.py:1204] (2/4) Exclude cut with ID _48678_210_5663_1_1533988830947_3769199_276-226230-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 09:58:35,727 WARNING [train.py:1204] (2/4) Exclude cut with ID _28427_210_4220_1_1532581240967_3061069_43-84928-0_sp1.1 from training. Duration: 0.9 2023-10-04 09:58:35,761 WARNING [train.py:1204] (2/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_158-250116-0_sp1.1 from training. Duration: 0.563625 2023-10-04 09:58:36,949 WARNING [train.py:1204] (2/4) Exclude cut with ID _66873_210_5517_1_1535454136506_4293370_279-332556-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 09:58:36,990 WARNING [train.py:1204] (2/4) Exclude cut with ID _42863_210_9070_1_1533902398788_3696920_488-230146-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 09:58:36,991 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_387-247159-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 09:58:40,139 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6729_210_2550_1_1528032481_1788725_261-42619-0_sp1.1 from training. Duration: 0.97275 2023-10-04 09:58:41,621 WARNING [train.py:1204] (2/4) Exclude cut with ID _19896_210_1883_1_1531709022662_6649569_831-44724-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 09:58:43,083 WARNING [train.py:1204] (2/4) Exclude cut with ID _55153_210_2253_1_1534384786677_3162096_280-285944-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 09:58:44,341 WARNING [train.py:1204] (2/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_448-4048-0 from training. Duration: 0.96 2023-10-04 09:58:44,402 WARNING [train.py:1204] (2/4) Exclude cut with ID _29678_210_7893_1_1532421042787_3906118_456-202548-0_sp1.1 from training. Duration: 0.97275 2023-10-04 09:58:45,740 WARNING [train.py:1204] (2/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_107-332398-0_sp1.1 from training. Duration: 0.7090625 2023-10-04 09:58:47,278 WARNING [train.py:1204] (2/4) Exclude cut with ID _13921_210_9994_1_1530838702800_3003429_46-149444-0_sp1.1 from training. Duration: 0.8545625 2023-10-04 09:58:47,324 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_257-20733-0_sp0.9 from training. Duration: 0.8444375 2023-10-04 09:58:48,633 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_53753_210_7234_1_1534320003_3594148_385-234809-0_sp1.1 from training. Duration: 0.963625 2023-10-04 09:58:48,685 WARNING [train.py:1204] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_292-215346-0_sp1.1 from training. Duration: 0.8818125 2023-10-04 09:58:50,629 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.conv_skip_rate, batch_count=1614173.3333333333, ans=0.0 2023-10-04 09:58:53,153 WARNING [train.py:1204] (2/4) Exclude cut with ID _68965_210_1883_1_1535698962272_3497490_196-124268-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 09:58:53,206 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_23801_210_16223_1_1532066392_3294657_570-5439-0_sp1.1 from training. Duration: 0.8818125 2023-10-04 09:58:59,250 WARNING [train.py:1204] (2/4) Exclude cut with ID _5166_210_2084_1_1527073197160_4087993_349-6928-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 09:59:01,040 WARNING [train.py:1204] (2/4) Exclude cut with ID _60621_210_17071_1_1535013053530_3671739_226-255202-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 09:59:01,045 WARNING [train.py:1204] (2/4) Exclude cut with ID _41817_210_18835_1_1533434477160_7423049_448-130302-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 09:59:02,473 WARNING [train.py:1204] (2/4) Exclude cut with ID _6653_210_2629_1_1527919172441_6732679_758-143677-0_sp1.1 from training. Duration: 0.8909375 2023-10-04 09:59:02,524 WARNING [train.py:1204] (2/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_912-47473-0_sp0.9 from training. Duration: 0.7555625 2023-10-04 09:59:03,936 WARNING [train.py:1204] (2/4) Exclude cut with ID _6694_210_4599_1_1527852637490_3965529_356-169233-0_sp1.1 from training. Duration: 0.9 2023-10-04 09:59:04,013 WARNING [train.py:1204] (2/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_1-99680-0 from training. Duration: 0.67 2023-10-04 09:59:05,349 WARNING [train.py:1204] (2/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_199-86432-0_sp1.1 from training. Duration: 0.8909375 2023-10-04 09:59:05,376 WARNING [train.py:1204] (2/4) Exclude cut with ID _61148_210_6897_1_1535094022726_4217610_362-182430-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 09:59:06,706 WARNING [train.py:1204] (2/4) Exclude cut with ID _49376_210_5986_1_1533985278732_3370740_301-154763-0 from training. Duration: 0.98 2023-10-04 09:59:06,841 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.1.nonlin_attention.balancer.prob, batch_count=1614240.0, ans=0.125 2023-10-04 09:59:11,055 WARNING [train.py:1204] (2/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_572-185150-0_sp1.1 from training. Duration: 0.8818125 2023-10-04 09:59:15,246 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_47896_210_20508_1_1534492518_5958868_558-314572-0_sp1.1 from training. Duration: 0.8818125 2023-10-04 09:59:16,570 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.646e+02 1.988e+02 2.298e+02 2.757e+02 4.239e+02, threshold=4.595e+02, percent-clipped=0.0 2023-10-04 09:59:16,595 INFO [train.py:1046] (2/4) Epoch 46, batch 3100, loss[loss=0.1675, simple_loss=0.2489, pruned_loss=0.04303, over 24311.00 frames. ], tot_loss[loss=0.1543, simple_loss=0.2355, pruned_loss=0.03654, over 4732444.24 frames. ], batch size: 77, lr: 2.20e-03, grad_scale: 16.0 2023-10-04 09:59:16,700 WARNING [train.py:1204] (2/4) Exclude cut with ID _45560_210_21982_1_1533812603951_3584999_111-6376-0_sp0.9 from training. Duration: 0.9333125 2023-10-04 09:59:18,357 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.balancer1.prob, batch_count=1614306.6666666667, ans=0.125 2023-10-04 09:59:19,485 WARNING [train.py:1204] (2/4) Exclude cut with ID _14170_210_5219_1_1530525949868_4469650_412-317077-0_sp1.1 from training. Duration: 0.6818125 2023-10-04 09:59:20,974 WARNING [train.py:1204] (2/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_718-68505-0 from training. Duration: 0.89 2023-10-04 09:59:24,323 WARNING [train.py:1204] (2/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_77-311404-0 from training. Duration: 0.94 2023-10-04 09:59:25,785 WARNING [train.py:1204] (2/4) Exclude cut with ID _60229_210_27767_1_1534838406158_3655350_264-279573-0 from training. Duration: 0.83 2023-10-04 09:59:28,475 WARNING [train.py:1204] (2/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_106-300985-0_sp1.1 from training. Duration: 0.8181875 2023-10-04 09:59:31,210 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_4183_210_2087_1_1526729100_1869838_79-277189-0_sp1.1 from training. Duration: 0.963625 2023-10-04 09:59:31,220 WARNING [train.py:1204] (2/4) Exclude cut with ID _28427_210_4220_1_1532581240967_3061069_195-295268-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 09:59:33,339 WARNING [train.py:1204] (2/4) Exclude cut with ID _4092_210_2151_1_1526643044449_3135759_356-278694-0_sp0.9 from training. Duration: 0.52225 2023-10-04 09:59:37,319 WARNING [train.py:1204] (2/4) Exclude cut with ID _14459_210_2228_1_1531443641946_7516700_777-71446-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 09:59:39,410 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.4.encoder.layers.1.self_attn1.whiten, num_groups=1, num_channels=384, metric=13.92 vs. limit=22.5 2023-10-04 09:59:41,141 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.1.encoder.layers.0.feed_forward1.out_whiten, num_groups=1, num_channels=256, metric=4.69 vs. limit=15.0 2023-10-04 09:59:41,548 WARNING [train.py:1204] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_520-126729-0 from training. Duration: 0.7 2023-10-04 09:59:46,455 WARNING [train.py:1204] (2/4) Exclude cut with ID _13684_210_7497_1_1530619354863_3598359_312-235606-0_sp0.9 from training. Duration: 0.6666875 2023-10-04 09:59:47,719 WARNING [train.py:1204] (2/4) Exclude cut with ID _37537_210_2253_1_1533171758568_3421949_283-349402-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 09:59:47,761 WARNING [train.py:1204] (2/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_29-117932-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 09:59:47,786 WARNING [train.py:1204] (2/4) Exclude cut with ID _29303_210_2575_1_1532392237303_7410649_721-283351-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 09:59:49,198 WARNING [train.py:1204] (2/4) Exclude cut with ID _14886_210_4220_1_1530702020355_3516190_175-229073-0_sp0.9 from training. Duration: 0.5 2023-10-04 09:59:51,872 WARNING [train.py:1204] (2/4) Exclude cut with ID _15458_210_8341_1_1530858614263_3834410_70-268830-0_sp1.1 from training. Duration: 0.97275 2023-10-04 09:59:51,898 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_2295_210_2087_1_1525500993_2407863_234-83104-0 from training. Duration: 0.62 2023-10-04 09:59:51,899 WARNING [train.py:1204] (2/4) Exclude cut with ID _22961_210_8254_1_1531976366672_4278180_317-199546-0_sp1.1 from training. Duration: 0.7818125 2023-10-04 09:59:53,289 WARNING [train.py:1204] (2/4) Exclude cut with ID _27894_210_12392_1_1532415496078_6902439_547-196022-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 09:59:55,265 WARNING [train.py:1204] (2/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_46-207599-0 from training. Duration: 0.64 2023-10-04 09:59:56,863 WARNING [train.py:1204] (2/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_426-326246-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 09:59:59,501 WARNING [train.py:1204] (2/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_445-117398-0_sp0.9 from training. Duration: 0.988875 2023-10-04 09:59:59,559 WARNING [train.py:1204] (2/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_563-347832-0 from training. Duration: 0.89 2023-10-04 09:59:59,739 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=1614506.6666666667, ans=0.1 2023-10-04 10:00:01,384 WARNING [train.py:1204] (2/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_252-216674-0 from training. Duration: 0.54 2023-10-04 10:00:02,756 WARNING [train.py:1204] (2/4) Exclude cut with ID _59611_210_8895_1_1534741198240_3650361_187-212779-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 10:00:02,819 WARNING [train.py:1204] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_282-159367-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 10:00:05,994 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_68702_210_23689_1_1535799577_3420068_33-136883-0_sp1.1 from training. Duration: 0.936375 2023-10-04 10:00:06,004 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_47228_210_4983_1_1533780021_3672027_184-293706-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 10:00:06,026 WARNING [train.py:1204] (2/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_656-188361-0_sp0.9 from training. Duration: 0.9666875 2023-10-04 10:00:07,408 WARNING [train.py:1204] (2/4) Exclude cut with ID _4866_210_2577_1_1527404412284_4364659_352-157930-0_sp0.9 from training. Duration: 0.9 2023-10-04 10:00:07,409 WARNING [train.py:1204] (2/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_210-106047-0_sp1.1 from training. Duration: 0.8818125 2023-10-04 10:00:08,803 WARNING [train.py:1204] (2/4) Exclude cut with ID _9389_210_1664_1_1529217337301_5289269_289-213417-0_sp1.1 from training. Duration: 0.8090625 2023-10-04 10:00:08,854 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_3910_210_1794_1_1526777667_3747260_178-41186-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 10:00:08,859 WARNING [train.py:1204] (2/4) Exclude cut with ID _27052_210_5986_1_1532329197187_3585009_24-172263-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 10:00:08,865 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_5522_210_3273_1_1527249606_3909096_318-203153-0_sp1.1 from training. Duration: 0.3909375 2023-10-04 10:00:11,724 WARNING [train.py:1204] (2/4) Exclude cut with ID _27894_210_12392_1_1532415496078_6902439_222-51982-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 10:00:11,815 WARNING [train.py:1204] (2/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_87-131427-0 from training. Duration: 0.78 2023-10-04 10:00:14,979 WARNING [train.py:1204] (2/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_372-298093-0_sp0.9 from training. Duration: 0.888875 2023-10-04 10:00:15,037 WARNING [train.py:1204] (2/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_663-166973-0 from training. Duration: 0.8 2023-10-04 10:00:16,309 WARNING [train.py:1204] (2/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_429-177469-0_sp1.1 from training. Duration: 0.936375 2023-10-04 10:00:16,341 WARNING [train.py:1204] (2/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_724-172651-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 10:00:16,390 WARNING [train.py:1204] (2/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_103-336064-0 from training. Duration: 0.51 2023-10-04 10:00:26,747 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.conv_skip_rate, batch_count=1614573.3333333333, ans=0.0 2023-10-04 10:00:27,786 WARNING [train.py:1204] (2/4) Exclude cut with ID _48677_210_5663_1_1533902405919_3892989_111-11439-0 from training. Duration: 0.96 2023-10-04 10:00:29,825 WARNING [train.py:1204] (2/4) Exclude cut with ID _8085_210_4557_1_1528455633493_3716100_95-103398-0_sp1.1 from training. Duration: 0.963625 2023-10-04 10:00:30,999 INFO [train.py:1046] (2/4) Epoch 46, batch 3150, loss[loss=0.1372, simple_loss=0.1888, pruned_loss=0.04282, over 19394.00 frames. ], tot_loss[loss=0.1531, simple_loss=0.2339, pruned_loss=0.03613, over 4725205.61 frames. ], batch size: 388, lr: 2.20e-03, grad_scale: 16.0 2023-10-04 10:00:31,088 WARNING [train.py:1204] (2/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_1041-110837-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 10:00:32,919 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.4.encoder.layers.2.nonlin_attention.whiten2, num_groups=1, num_channels=384, metric=4.11 vs. limit=15.0 2023-10-04 10:00:34,364 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_50050_210_16223_1_1535435796_7290138_1155-121757-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 10:00:34,366 WARNING [train.py:1204] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_602-275260-0_sp0.9 from training. Duration: 0.911125 2023-10-04 10:00:34,432 WARNING [train.py:1204] (2/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_265-164486-0 from training. Duration: 0.96 2023-10-04 10:00:35,719 WARNING [train.py:1204] (2/4) Exclude cut with ID _46067_210_4220_1_1534125571066_3307450_142-229375-0_sp1.1 from training. Duration: 0.963625 2023-10-04 10:00:35,745 WARNING [train.py:1204] (2/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_310-26186-0_sp1.1 from training. Duration: 0.636375 2023-10-04 10:00:38,700 WARNING [train.py:1204] (2/4) Exclude cut with ID _28461_210_2253_1_1532771775189_3041299_136-270644-0 from training. Duration: 0.93 2023-10-04 10:00:40,157 WARNING [train.py:1204] (2/4) Exclude cut with ID _14860_210_4915_1_1530702020340_7343240_438-286372-0_sp1.1 from training. Duration: 0.936375 2023-10-04 10:00:41,593 WARNING [train.py:1204] (2/4) Exclude cut with ID IC0128W0004-63309 from training. Duration: 0.831 2023-10-04 10:00:46,241 WARNING [train.py:1204] (2/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_135-174080-0 from training. Duration: 0.53 2023-10-04 10:00:46,275 WARNING [train.py:1204] (2/4) Exclude cut with ID _41326_210_5854_1_1533778076509_3492080_277-59511-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 10:00:47,583 WARNING [train.py:1204] (2/4) Exclude cut with ID IC0144W0112-71379 from training. Duration: 0.992 2023-10-04 10:00:47,660 WARNING [train.py:1204] (2/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_711-15688-0_sp0.9 from training. Duration: 0.47775 2023-10-04 10:00:50,315 WARNING [train.py:1204] (2/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_237-332027-0 from training. Duration: 0.95 2023-10-04 10:00:51,673 WARNING [train.py:1204] (2/4) Exclude cut with ID _7970_210_1883_1_1528455616296_3472230_25-178407-0 from training. Duration: 0.71 2023-10-04 10:00:51,674 WARNING [train.py:1204] (2/4) Exclude cut with ID _8480_210_1874_1_1528589391205_6728330_309-139896-0 from training. Duration: 0.81 2023-10-04 10:00:51,691 WARNING [train.py:1204] (2/4) Exclude cut with ID _39571_210_13449_1_1533801584143_3651077_319-268877-0_sp1.1 from training. Duration: 0.936375 2023-10-04 10:00:51,695 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_43802_210_2290_1_1533516203_8829504_155-246880-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 10:00:53,081 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_566-122599-0_sp1.1 from training. Duration: 0.936375 2023-10-04 10:00:54,461 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_12868_210_6753_1_1530701932_530275_18-76322-0 from training. Duration: 0.87 2023-10-04 10:00:56,453 WARNING [train.py:1204] (2/4) Exclude cut with ID _56494_210_4220_1_1535284837625_3033169_159-106035-0_sp1.1 from training. Duration: 0.963625 2023-10-04 10:00:56,485 WARNING [train.py:1204] (2/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_98-276013-0_sp1.1 from training. Duration: 0.963625 2023-10-04 10:00:57,827 WARNING [train.py:1204] (2/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_330-216124-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 10:01:00,503 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_177-58005-0_sp0.9 from training. Duration: 0.688875 2023-10-04 10:01:02,600 WARNING [train.py:1204] (2/4) Exclude cut with ID _56235_210_10640_1_1534557335235_4161099_343-139898-0 from training. Duration: 0.96 2023-10-04 10:01:03,979 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6961_210_2087_1_1527937168_6815011_395-185060-0_sp1.1 from training. Duration: 0.863625 2023-10-04 10:01:07,175 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7002_210_1794_1_1527939683_5157286_184-229630-0_sp0.9 from training. Duration: 0.82225 2023-10-04 10:01:08,478 WARNING [train.py:1204] (2/4) Exclude cut with ID _59170_210_6721_1_1534983633960_4203335_153-18895-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 10:01:08,530 WARNING [train.py:1204] (2/4) Exclude cut with ID _49643_210_7989_1_1534640244224_3939031_598-161926-0 from training. Duration: 0.9 2023-10-04 10:01:08,775 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=1614773.3333333333, ans=0.1 2023-10-04 10:01:11,174 WARNING [train.py:1204] (2/4) Exclude cut with ID _4068_210_2629_1_1526697350395_1546259_224-288693-0 from training. Duration: 0.59 2023-10-04 10:01:11,224 WARNING [train.py:1204] (2/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_718-68505-0_sp1.1 from training. Duration: 0.8090625 2023-10-04 10:01:12,547 WARNING [train.py:1204] (2/4) Exclude cut with ID _4068_210_2629_1_1526697350395_1546259_224-288693-0_sp0.9 from training. Duration: 0.6555625 2023-10-04 10:01:12,567 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_2296_210_2087_1_1525503448_3397335_57-185430-0_sp0.9 from training. Duration: 0.6666875 2023-10-04 10:01:13,079 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.5.encoder.layers.0.nonlin_attention.whiten1, num_groups=1, num_channels=192, metric=4.30 vs. limit=10.0 2023-10-04 10:01:13,918 WARNING [train.py:1204] (2/4) Exclude cut with ID _15458_210_8341_1_1530858614263_3834410_49-235472-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 10:01:13,920 WARNING [train.py:1204] (2/4) Exclude cut with ID _64135_210_2253_1_1535677267876_7608972_498-5039-0_sp0.9 from training. Duration: 0.8666875 2023-10-04 10:01:15,320 WARNING [train.py:1204] (2/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_283-220372-0_sp0.9 from training. Duration: 0.62225 2023-10-04 10:01:15,329 WARNING [train.py:1204] (2/4) Exclude cut with ID _6473_210_1741_1_1527901307396_3815380_108-17335-0_sp0.9 from training. Duration: 0.72225 2023-10-04 10:01:15,423 WARNING [train.py:1204] (2/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_113-92608-0 from training. Duration: 0.54 2023-10-04 10:01:16,809 WARNING [train.py:1204] (2/4) Exclude cut with ID _14170_210_5219_1_1530525949868_4469650_272-249985-0_sp1.1 from training. Duration: 0.6090625 2023-10-04 10:01:16,818 WARNING [train.py:1204] (2/4) Exclude cut with ID _50376_210_13401_1_1534031502101_4217740_518-187390-0_sp1.1 from training. Duration: 0.97275 2023-10-04 10:01:18,721 WARNING [train.py:1204] (2/4) Exclude cut with ID _59738_210_15379_1_1534849512562_3941470_165-248260-0_sp1.1 from training. Duration: 0.92725 2023-10-04 10:01:18,738 WARNING [train.py:1204] (2/4) Exclude cut with ID _23818_210_6228_1_1531917063884_3676920_400-343925-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 10:01:20,120 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6961_210_2087_1_1527937168_6815011_562-165518-0 from training. Duration: 0.77 2023-10-04 10:01:20,183 WARNING [train.py:1204] (2/4) Exclude cut with ID _24271_210_12658_1_1531987270022_7190519_289-207931-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 10:01:20,380 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.conv_module2.balancer2.min_abs, batch_count=1614840.0, ans=0.5 2023-10-04 10:01:21,631 WARNING [train.py:1204] (2/4) Exclude cut with ID _14632_210_9988_1_1530691279392_3923200_332-105904-0 from training. Duration: 0.9 2023-10-04 10:01:21,649 WARNING [train.py:1204] (2/4) Exclude cut with ID _40402_210_18219_1_1533522324115_2953150_203-232438-0_sp1.1 from training. Duration: 0.97275 2023-10-04 10:01:22,960 WARNING [train.py:1204] (2/4) Exclude cut with ID _13252_210_1925_1_1530671372047_3336909_263-168388-0 from training. Duration: 0.86 2023-10-04 10:01:24,345 WARNING [train.py:1204] (2/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_174-205058-0 from training. Duration: 0.95 2023-10-04 10:01:26,372 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_94-195801-0_sp1.1 from training. Duration: 0.8545625 2023-10-04 10:01:26,394 WARNING [train.py:1204] (2/4) Exclude cut with ID _55153_210_2253_1_1534384786677_3162096_269-103204-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 10:01:27,525 WARNING [train.py:1204] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_748-36150-0 from training. Duration: 0.91 2023-10-04 10:01:28,888 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_12868_210_6753_1_1530702479_3610564_148-172861-0_sp1.1 from training. Duration: 0.4545625 2023-10-04 10:01:28,944 WARNING [train.py:1204] (2/4) Exclude cut with ID _67499_210_17282_1_1535509805220_3692430_460-77730-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 10:01:29,847 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.0.layers.0.whiten, num_groups=1, num_channels=192, metric=4.30 vs. limit=12.0 2023-10-04 10:01:32,258 WARNING [train.py:1204] (2/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_374-20753-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 10:01:33,650 WARNING [train.py:1204] (2/4) Exclude cut with ID _45329_210_7005_1_1534157951211_6774339_374-289983-0_sp1.1 from training. Duration: 0.97275 2023-10-04 10:01:33,688 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_34886_210_10633_1_1533203882_1755661_76-156096-0_sp0.9 from training. Duration: 0.9666875 2023-10-04 10:01:39,729 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_238-256668-0_sp0.9 from training. Duration: 0.7666875 2023-10-04 10:01:39,782 WARNING [train.py:1204] (2/4) Exclude cut with ID _46683_210_9642_1_1533730913492_4591116_489-146727-0_sp1.1 from training. Duration: 0.97275 2023-10-04 10:01:40,018 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.feed_forward2.hidden_balancer.prob, batch_count=1614906.6666666667, ans=0.125 2023-10-04 10:01:41,118 WARNING [train.py:1204] (2/4) Exclude cut with ID _13938_210_9988_1_1530518718978_3693960_304-112784-0_sp0.9 from training. Duration: 0.511125 2023-10-04 10:01:45,122 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.695e+02 2.157e+02 2.549e+02 3.186e+02 5.490e+02, threshold=5.099e+02, percent-clipped=3.0 2023-10-04 10:01:45,147 INFO [train.py:1046] (2/4) Epoch 46, batch 3200, loss[loss=0.1577, simple_loss=0.2422, pruned_loss=0.03656, over 23452.00 frames. ], tot_loss[loss=0.1527, simple_loss=0.2335, pruned_loss=0.03593, over 4724737.22 frames. ], batch size: 105, lr: 2.20e-03, grad_scale: 32.0 2023-10-04 10:01:46,961 WARNING [train.py:1204] (2/4) Exclude cut with ID _24271_210_12658_1_1531987270022_7190519_243-46549-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 10:01:46,966 WARNING [train.py:1204] (2/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_78-182912-0_sp1.1 from training. Duration: 0.536375 2023-10-04 10:01:50,209 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.4.encoder.layers.1.self_attn_weights.whiten_keys, num_groups=4, num_channels=128, metric=1.79 vs. limit=6.0 2023-10-04 10:01:51,023 WARNING [train.py:1204] (2/4) Exclude cut with ID _57572_210_5517_1_1534921219624_3809484_121-321334-0_sp1.1 from training. Duration: 0.97275 2023-10-04 10:01:52,452 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_40047_210_2290_1_1533290519_2374311_254-102149-0_sp1.1 from training. Duration: 0.963625 2023-10-04 10:01:52,453 WARNING [train.py:1204] (2/4) Exclude cut with ID _28461_210_2253_1_1532771775189_3041299_96-152158-0 from training. Duration: 0.85 2023-10-04 10:01:53,078 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.4.encoder.layers.0.feed_forward1.out_whiten, num_groups=1, num_channels=384, metric=6.85 vs. limit=15.0 2023-10-04 10:01:53,966 WARNING [train.py:1204] (2/4) Exclude cut with ID _29877_210_6228_1_1532397791106_5276149_161-102979-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 10:01:57,603 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.bypass_mid.scale_min, batch_count=1614973.3333333333, ans=0.2 2023-10-04 10:01:58,573 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_3787_210_2040_1_1526477544_5478586_603-120324-0_sp0.9 from training. Duration: 0.9 2023-10-04 10:02:01,407 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_41750_210_1960_1_1533520678_3948042_247-213456-0_sp1.1 from training. Duration: 0.97275 2023-10-04 10:02:10,717 WARNING [train.py:1204] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_127-119109-0_sp0.9 from training. Duration: 0.8 2023-10-04 10:02:10,913 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.conv_skip_rate, batch_count=1615040.0, ans=0.0 2023-10-04 10:02:10,959 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.conv_module2.balancer2.prob, batch_count=1615040.0, ans=0.125 2023-10-04 10:02:20,949 WARNING [train.py:1204] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_215-202973-0 from training. Duration: 0.88 2023-10-04 10:02:22,286 WARNING [train.py:1204] (2/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_17-320491-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 10:02:23,732 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.attention_skip_rate, batch_count=1615106.6666666667, ans=0.0 2023-10-04 10:02:24,990 WARNING [train.py:1204] (2/4) Exclude cut with ID _5166_210_2084_1_1527073197160_4087993_310-114452-0 from training. Duration: 0.59 2023-10-04 10:02:25,064 WARNING [train.py:1204] (2/4) Exclude cut with ID _15133_210_6228_1_1530794512074_3080379_29-264458-0_sp0.9 from training. Duration: 0.6444375 2023-10-04 10:02:27,420 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.self_attn_weights.pos_emb_skip_rate, batch_count=1615106.6666666667, ans=0.0 2023-10-04 10:02:28,500 WARNING [train.py:1204] (2/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_159-281310-0_sp1.1 from training. Duration: 0.863625 2023-10-04 10:02:28,517 WARNING [train.py:1204] (2/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_195-167011-0_sp0.9 from training. Duration: 0.8333125 2023-10-04 10:02:29,951 WARNING [train.py:1204] (2/4) Exclude cut with ID _14087_210_2253_1_1530529224512_3014029_156-257607-0_sp1.1 from training. Duration: 0.7545625 2023-10-04 10:02:34,547 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_2295_210_2087_1_1525500993_2407863_116-238894-0 from training. Duration: 0.7 2023-10-04 10:02:37,260 WARNING [train.py:1204] (2/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_321-335475-0_sp0.9 from training. Duration: 0.5 2023-10-04 10:02:38,700 WARNING [train.py:1204] (2/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_188-114464-0 from training. Duration: 0.82 2023-10-04 10:02:40,870 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_2592_210_2290_1_1525697025_4882925_157-70512-0 from training. Duration: 0.91 2023-10-04 10:02:43,514 WARNING [train.py:1204] (2/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_138-64210-0_sp1.1 from training. Duration: 0.663625 2023-10-04 10:02:47,799 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_11305_210_1794_1_1529750428_7178148_651-95702-0_sp1.1 from training. Duration: 0.963625 2023-10-04 10:02:47,812 WARNING [train.py:1204] (2/4) Exclude cut with ID _14224_210_4381_1_1530529062037_4264010_379-184223-0_sp0.9 from training. Duration: 0.9444375 2023-10-04 10:02:49,132 WARNING [train.py:1204] (2/4) Exclude cut with ID _8394_210_6702_1_1528590504316_3540189_381-125325-0_sp1.1 from training. Duration: 0.963625 2023-10-04 10:02:50,512 WARNING [train.py:1204] (2/4) Exclude cut with ID IC0622W0021-307659 from training. Duration: 0.896 2023-10-04 10:02:50,514 WARNING [train.py:1204] (2/4) Exclude cut with ID _7624_210_1531_1_1528109802198_4933101_418-349834-0_sp1.1 from training. Duration: 0.5454375 2023-10-04 10:02:53,757 WARNING [train.py:1204] (2/4) Exclude cut with ID _49728_210_12651_1_1534041034294_6788150_607-3416-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 10:02:55,141 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8275_210_4198_1_1528455320_3892571_491-17707-0 from training. Duration: 0.64 2023-10-04 10:02:55,193 WARNING [train.py:1204] (2/4) Exclude cut with ID _5287_210_2084_1_1527418883040_3878670_333-48508-0 from training. Duration: 0.6 2023-10-04 10:02:56,653 WARNING [train.py:1204] (2/4) Exclude cut with ID _8365_210_2064_1_1528592311070_4677690_371-122295-0 from training. Duration: 0.55 2023-10-04 10:02:58,001 WARNING [train.py:1204] (2/4) Exclude cut with ID _14860_210_4915_1_1530702020340_7343240_836-328489-0 from training. Duration: 0.9 2023-10-04 10:02:59,763 INFO [train.py:1046] (2/4) Epoch 46, batch 3250, loss[loss=0.1628, simple_loss=0.236, pruned_loss=0.04478, over 23788.00 frames. ], tot_loss[loss=0.1523, simple_loss=0.2325, pruned_loss=0.03604, over 4715027.22 frames. ], batch size: 179, lr: 2.20e-03, grad_scale: 16.0 2023-10-04 10:03:01,162 WARNING [train.py:1204] (2/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_1033-274376-0_sp0.9 from training. Duration: 0.9666875 2023-10-04 10:03:04,374 WARNING [train.py:1204] (2/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_475-127455-0_sp0.9 from training. Duration: 0.77775 2023-10-04 10:03:04,380 WARNING [train.py:1204] (2/4) Exclude cut with ID IC0671W0071-331914 from training. Duration: 0.896 2023-10-04 10:03:04,400 WARNING [train.py:1204] (2/4) Exclude cut with ID _30327_210_16049_1_1532484161295_3924169_2-1523-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 10:03:04,403 WARNING [train.py:1204] (2/4) Exclude cut with ID _24314_210_4915_1_1532135078116_7170179_460-208279-0_sp1.1 from training. Duration: 0.97275 2023-10-04 10:03:04,686 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.ff3_skip_rate, batch_count=1615306.6666666667, ans=0.0 2023-10-04 10:03:05,848 WARNING [train.py:1204] (2/4) Exclude cut with ID IC0679W0052-335859 from training. Duration: 0.64 2023-10-04 10:03:09,903 WARNING [train.py:1204] (2/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_3-237434-0_sp1.1 from training. Duration: 0.6909375 2023-10-04 10:03:13,150 WARNING [train.py:1204] (2/4) Exclude cut with ID _69884_210_9771_1_1535846392818_4166169_405-185283-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 10:03:19,982 WARNING [train.py:1204] (2/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_927-83684-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 10:03:19,998 WARNING [train.py:1204] (2/4) Exclude cut with ID _6694_210_4599_1_1527852637490_3965529_356-169233-0 from training. Duration: 0.99 2023-10-04 10:03:20,071 WARNING [train.py:1204] (2/4) Exclude cut with ID _19896_210_1883_1_1531709022662_6649569_562-233001-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 10:03:21,351 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_20120_210_6753_1_1531643035_5482859_354-78629-0_sp1.1 from training. Duration: 0.963625 2023-10-04 10:03:21,352 WARNING [train.py:1204] (2/4) Exclude cut with ID _60135_210_13427_1_1534935557922_3651319_96-168198-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 10:03:22,772 WARNING [train.py:1204] (2/4) Exclude cut with ID _31602_210_5854_1_1532677021456_4458250_249-96604-0_sp1.1 from training. Duration: 0.8090625 2023-10-04 10:03:22,806 WARNING [train.py:1204] (2/4) Exclude cut with ID _14100_210_8341_1_1530529320613_3678069_258-224599-0_sp0.9 from training. Duration: 0.8555625 2023-10-04 10:03:24,364 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.feed_forward3.hidden_balancer.prob, batch_count=1615373.3333333333, ans=0.125 2023-10-04 10:03:25,953 WARNING [train.py:1204] (2/4) Exclude cut with ID _19708_210_9206_1_1532136650733_4238364_215-160135-0_sp1.1 from training. Duration: 0.97275 2023-10-04 10:03:25,987 WARNING [train.py:1204] (2/4) Exclude cut with ID _21878_210_3525_1_1531738772002_7264330_113-18163-0_sp0.9 from training. Duration: 0.988875 2023-10-04 10:03:27,274 WARNING [train.py:1204] (2/4) Exclude cut with ID _64135_210_2253_1_1535677267876_7608972_552-337340-0_sp1.1 from training. Duration: 0.92725 2023-10-04 10:03:27,307 WARNING [train.py:1204] (2/4) Exclude cut with ID _67725_210_27767_1_1535608756671_5105359_10-12887-0_sp1.1 from training. Duration: 0.97275 2023-10-04 10:03:27,321 WARNING [train.py:1204] (2/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_401-284840-0_sp1.1 from training. Duration: 0.97275 2023-10-04 10:03:27,342 WARNING [train.py:1204] (2/4) Exclude cut with ID _44659_210_2721_1_1534036120061_6614860_483-141285-0_sp1.1 from training. Duration: 0.936375 2023-10-04 10:03:28,872 WARNING [train.py:1204] (2/4) Exclude cut with ID _9461_210_4557_1_1529060514726_3677451_358-332734-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 10:03:30,869 WARNING [train.py:1204] (2/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_788-298907-0_sp1.1 from training. Duration: 0.8090625 2023-10-04 10:03:32,326 WARNING [train.py:1204] (2/4) Exclude cut with ID _39411_210_5986_1_1533538825030_3343649_354-267196-0_sp1.1 from training. Duration: 0.92725 2023-10-04 10:03:32,345 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_14953_210_2151_1_1530701710_5616165_192-309792-0_sp1.1 from training. Duration: 0.97275 2023-10-04 10:03:34,192 WARNING [train.py:1204] (2/4) Exclude cut with ID _59170_210_6721_1_1534983633960_4203335_290-151255-0_sp1.1 from training. Duration: 0.92725 2023-10-04 10:03:35,478 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_5575_210_1410_1_1527301305_2570170_249-93769-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 10:03:35,488 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_46013_210_7234_1_1533776362_3744032_463-52378-0_sp1.1 from training. Duration: 0.7818125 2023-10-04 10:03:41,064 WARNING [train.py:1204] (2/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_580-93798-0 from training. Duration: 0.56 2023-10-04 10:03:41,117 WARNING [train.py:1204] (2/4) Exclude cut with ID _19514_210_9259_1_1531530071330_5084230_385-13762-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 10:03:41,128 WARNING [train.py:1204] (2/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_458-67321-0_sp1.1 from training. Duration: 0.8818125 2023-10-04 10:03:42,955 WARNING [train.py:1204] (2/4) Exclude cut with ID _41298_210_9019_1_1533430371450_4822143_188-154791-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 10:03:44,382 WARNING [train.py:1204] (2/4) Exclude cut with ID _15037_210_2253_1_1530752294104_3267650_296-192597-0_sp0.9 from training. Duration: 0.9 2023-10-04 10:03:49,308 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.0.layers.1.conv_module1.whiten, num_groups=1, num_channels=192, metric=10.34 vs. limit=15.0 2023-10-04 10:03:49,921 WARNING [train.py:1204] (2/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_661-264857-0_sp1.1 from training. Duration: 0.8454375 2023-10-04 10:03:56,656 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_34008_210_10431_1_1533345424_8183247_662-317331-0_sp1.1 from training. Duration: 0.8545625 2023-10-04 10:03:56,687 WARNING [train.py:1204] (2/4) Exclude cut with ID _46502_210_13794_1_1533724452198_3657159_241-278329-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 10:03:56,688 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_11349_210_6753_1_1529800861_1101207_26-65602-0 from training. Duration: 0.96 2023-10-04 10:03:58,393 WARNING [train.py:1204] (2/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_448-4048-0_sp1.1 from training. Duration: 0.87275 2023-10-04 10:03:58,400 WARNING [train.py:1204] (2/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_662-275621-0_sp1.1 from training. Duration: 0.3818125 2023-10-04 10:03:58,455 WARNING [train.py:1204] (2/4) Exclude cut with ID _13938_210_9988_1_1530518718978_3693960_256-54454-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 10:04:01,821 WARNING [train.py:1204] (2/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_256-32662-0 from training. Duration: 0.9 2023-10-04 10:04:01,855 WARNING [train.py:1204] (2/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_326-17061-0 from training. Duration: 0.94 2023-10-04 10:04:03,172 WARNING [train.py:1204] (2/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_505-97987-0_sp1.1 from training. Duration: 0.7818125 2023-10-04 10:04:04,531 WARNING [train.py:1204] (2/4) Exclude cut with ID _66873_210_5517_1_1535454136506_4293370_316-294364-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 10:04:05,844 WARNING [train.py:1204] (2/4) Exclude cut with ID _19514_210_9259_1_1531530071330_5084230_762-231063-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 10:04:07,504 WARNING [train.py:1204] (2/4) Exclude cut with ID _4697_210_2069_1_1526815801953_3823098_68-219807-0_sp0.9 from training. Duration: 0.52225 2023-10-04 10:04:07,554 WARNING [train.py:1204] (2/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_479-158053-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 10:04:09,089 WARNING [train.py:1204] (2/4) Exclude cut with ID _66112_210_10635_1_1535590236305_3801484_83-38181-0_sp1.1 from training. Duration: 0.936375 2023-10-04 10:04:09,103 WARNING [train.py:1204] (2/4) Exclude cut with ID _55857_210_2346_1_1534644007831_6898209_255-146346-0_sp1.1 from training. Duration: 0.963625 2023-10-04 10:04:10,541 WARNING [train.py:1204] (2/4) Exclude cut with ID _48836_210_12210_1_1534075848293_3904150_429-35475-0 from training. Duration: 0.89 2023-10-04 10:04:10,558 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_50154_210_13731_1_1534750862_1080961_119-187163-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 10:04:13,232 INFO [train.py:1046] (2/4) Epoch 46, batch 3300, loss[loss=0.1508, simple_loss=0.2214, pruned_loss=0.04013, over 23763.00 frames. ], tot_loss[loss=0.1527, simple_loss=0.2334, pruned_loss=0.03602, over 4726685.34 frames. ], batch size: 164, lr: 2.20e-03, grad_scale: 16.0 2023-10-04 10:04:13,308 WARNING [train.py:1204] (2/4) Exclude cut with ID _8793_210_6310_1_1528542985139_2310009_121-179782-0_sp0.9 from training. Duration: 0.888875 2023-10-04 10:04:13,311 WARNING [train.py:1204] (2/4) Exclude cut with ID _14884_210_2252_1_1530707459689_5561250_591-173406-0 from training. Duration: 0.96 2023-10-04 10:04:15,083 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.777e+02 2.072e+02 2.407e+02 3.224e+02 5.952e+02, threshold=4.814e+02, percent-clipped=1.0 2023-10-04 10:04:17,791 WARNING [train.py:1204] (2/4) Exclude cut with ID _15133_210_6228_1_1530794512074_3080379_54-198193-0_sp1.1 from training. Duration: 0.8545625 2023-10-04 10:04:17,801 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6545_210_2040_1_1527774670_4245258_407-48070-0 from training. Duration: 0.76 2023-10-04 10:04:19,243 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1032-236863-0 from training. Duration: 0.84 2023-10-04 10:04:19,317 WARNING [train.py:1204] (2/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_507-303370-0 from training. Duration: 0.96 2023-10-04 10:04:19,337 WARNING [train.py:1204] (2/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_164-282366-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 10:04:21,430 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.3.encoder.layers.0.feed_forward2.out_whiten, num_groups=1, num_channels=512, metric=3.69 vs. limit=15.0 2023-10-04 10:04:23,368 WARNING [train.py:1204] (2/4) Exclude cut with ID _39867_210_9826_1_1533553222030_9344259_312-158727-0_sp1.1 from training. Duration: 0.963625 2023-10-04 10:04:23,445 WARNING [train.py:1204] (2/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_174-205058-0_sp1.1 from training. Duration: 0.863625 2023-10-04 10:04:24,752 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_11305_210_1794_1_1529750428_7178148_166-237358-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 10:04:26,191 WARNING [train.py:1204] (2/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_26-175553-0_sp1.1 from training. Duration: 0.5545625 2023-10-04 10:04:26,211 WARNING [train.py:1204] (2/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_96-290039-0_sp1.1 from training. Duration: 0.5090625 2023-10-04 10:04:28,061 WARNING [train.py:1204] (2/4) Exclude cut with ID _53219_210_4525_1_1534834836008_4064825_261-101691-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 10:04:30,116 WARNING [train.py:1204] (2/4) Exclude cut with ID _24271_210_12658_1_1531987270022_7190519_96-293390-0_sp1.1 from training. Duration: 0.936375 2023-10-04 10:04:34,018 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_271-234962-0 from training. Duration: 0.79 2023-10-04 10:04:34,087 WARNING [train.py:1204] (2/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_136-30660-0_sp1.1 from training. Duration: 0.97275 2023-10-04 10:04:34,109 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_20120_210_6753_1_1531643035_5482859_625-281046-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 10:04:36,940 WARNING [train.py:1204] (2/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_333-168193-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 10:04:37,010 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9029W0269-509664 from training. Duration: 0.896 2023-10-04 10:04:38,686 WARNING [train.py:1204] (2/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_450-147393-0_sp1.1 from training. Duration: 0.92725 2023-10-04 10:04:39,977 WARNING [train.py:1204] (2/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_643-52007-0_sp1.1 from training. Duration: 0.5181875 2023-10-04 10:04:40,054 WARNING [train.py:1204] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_330-327861-0_sp0.9 from training. Duration: 0.7444375 2023-10-04 10:04:40,063 WARNING [train.py:1204] (2/4) Exclude cut with ID _47790_210_16187_1_1533862814066_4207519_260-317651-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 10:04:40,089 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9044W0010-516654 from training. Duration: 0.896 2023-10-04 10:04:42,908 WARNING [train.py:1204] (2/4) Exclude cut with ID _14087_210_2253_1_1530529224512_3014029_265-118378-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 10:04:42,914 WARNING [train.py:1204] (2/4) Exclude cut with ID _6194_210_1534_1_1527511826160_3941194_321-148641-0_sp0.9 from training. Duration: 0.711125 2023-10-04 10:04:46,049 WARNING [train.py:1204] (2/4) Exclude cut with ID _54871_210_22356_1_1534665798253_3856359_101-169176-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 10:04:46,051 WARNING [train.py:1204] (2/4) Exclude cut with ID 497-129325-0061-62254-0_sp1.1 from training. Duration: 0.97725 2023-10-04 10:04:46,152 WARNING [train.py:1204] (2/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_795-33252-0_sp1.1 from training. Duration: 0.336375 2023-10-04 10:04:46,173 WARNING [train.py:1204] (2/4) Exclude cut with ID _28317_210_12742_1_1532480302381_8510325_168-246176-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 10:04:46,768 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.4.encoder.layers.0.feed_forward2.out_whiten, num_groups=1, num_channels=384, metric=5.36 vs. limit=15.0 2023-10-04 10:04:47,958 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.4.encoder.layers.0.whiten, num_groups=1, num_channels=384, metric=5.54 vs. limit=12.0 2023-10-04 10:04:48,784 WARNING [train.py:1204] (2/4) Exclude cut with ID _7160_210_1876_1_1527940923231_3703799_120-150943-0_sp1.1 from training. Duration: 0.736375 2023-10-04 10:04:51,513 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9084W0005-536844 from training. Duration: 0.768 2023-10-04 10:04:52,875 WARNING [train.py:1204] (2/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_234-260682-0 from training. Duration: 0.84 2023-10-04 10:04:52,922 WARNING [train.py:1204] (2/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_188-114464-0_sp0.9 from training. Duration: 0.911125 2023-10-04 10:04:55,792 WARNING [train.py:1204] (2/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_81-75849-0 from training. Duration: 0.39 2023-10-04 10:04:59,014 WARNING [train.py:1204] (2/4) Exclude cut with ID _14224_210_4381_1_1530529062037_4264010_379-184223-0_sp1.1 from training. Duration: 0.77275 2023-10-04 10:05:02,185 WARNING [train.py:1204] (2/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_454-123002-0_sp1.1 from training. Duration: 0.62725 2023-10-04 10:05:02,236 WARNING [train.py:1204] (2/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_887-249555-0_sp1.1 from training. Duration: 0.7 2023-10-04 10:05:03,811 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.conv_module2.balancer2.prob, batch_count=1615840.0, ans=0.125 2023-10-04 10:05:04,969 WARNING [train.py:1204] (2/4) Exclude cut with ID _31088_210_16446_1_1532520012284_3440919_20-260902-0_sp1.1 from training. Duration: 0.97275 2023-10-04 10:05:05,018 WARNING [train.py:1204] (2/4) Exclude cut with ID _45329_210_7005_1_1534157951211_6774339_856-344574-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 10:05:05,020 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_36756_210_7234_1_1533026991_966276_4-164118-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 10:05:06,360 WARNING [train.py:1204] (2/4) Exclude cut with ID _8365_210_2064_1_1528592311070_4677690_371-122295-0_sp1.1 from training. Duration: 0.5 2023-10-04 10:05:06,514 WARNING [train.py:1204] (2/4) Exclude cut with ID _5910_210_1389_1_1527337972454_5050100_555-159815-0_sp1.1 from training. Duration: 0.8909375 2023-10-04 10:05:06,523 WARNING [train.py:1204] (2/4) Exclude cut with ID _37086_210_9038_1_1532999540891_7021250_469-147941-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 10:05:08,387 WARNING [train.py:1204] (2/4) Exclude cut with ID _3067_210_2667_1_1526212974640_4503168_459-173867-0_sp0.9 from training. Duration: 0.97775 2023-10-04 10:05:11,082 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9156W0006-572499 from training. Duration: 0.896 2023-10-04 10:05:12,469 WARNING [train.py:1204] (2/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_277-210798-0 from training. Duration: 0.55 2023-10-04 10:05:15,008 WARNING [train.py:1204] (2/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_103-336064-0_sp1.1 from training. Duration: 0.463625 2023-10-04 10:05:16,358 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_3414_210_1988_1_1526212718_4813195_331-290945-0_sp1.1 from training. Duration: 0.936375 2023-10-04 10:05:16,359 WARNING [train.py:1204] (2/4) Exclude cut with ID _67499_210_17282_1_1535509805220_3692430_365-29276-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 10:05:17,799 WARNING [train.py:1204] (2/4) Exclude cut with ID _14821_210_8750_1_1530680503991_6829359_69-250847-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 10:05:17,801 WARNING [train.py:1204] (2/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_1102-61301-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 10:05:19,798 WARNING [train.py:1204] (2/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_166-246557-0_sp1.1 from training. Duration: 0.5818125 2023-10-04 10:05:21,079 WARNING [train.py:1204] (2/4) Exclude cut with ID _15351_210_10626_1_1530856635653_3576210_214-129918-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 10:05:21,098 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_120-41668-0_sp0.9 from training. Duration: 0.77775 2023-10-04 10:05:22,476 WARNING [train.py:1204] (2/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_27-72278-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 10:05:23,846 WARNING [train.py:1204] (2/4) Exclude cut with ID _24314_210_4915_1_1532135078116_7170179_521-258868-0_sp1.1 from training. Duration: 0.7181875 2023-10-04 10:05:25,263 WARNING [train.py:1204] (2/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_143-86344-0 from training. Duration: 0.77 2023-10-04 10:05:25,301 WARNING [train.py:1204] (2/4) Exclude cut with ID _53489_210_6228_1_1534298357169_3835970_465-157433-0_sp1.1 from training. Duration: 0.92725 2023-10-04 10:05:26,613 INFO [train.py:1046] (2/4) Epoch 46, batch 3350, loss[loss=0.1802, simple_loss=0.2529, pruned_loss=0.0537, over 23380.00 frames. ], tot_loss[loss=0.1533, simple_loss=0.2339, pruned_loss=0.03631, over 4726911.72 frames. ], batch size: 285, lr: 2.20e-03, grad_scale: 16.0 2023-10-04 10:05:26,714 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_248-319353-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 10:05:28,095 WARNING [train.py:1204] (2/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_102-77064-0_sp1.1 from training. Duration: 0.8454375 2023-10-04 10:05:29,464 WARNING [train.py:1204] (2/4) Exclude cut with ID _9389_210_1664_1_1529217337301_5289269_379-128794-0_sp1.1 from training. Duration: 0.77275 2023-10-04 10:05:29,560 WARNING [train.py:1204] (2/4) Exclude cut with ID _3231_210_2106_1_1526205864934_6784220_236-260835-0_sp1.1 from training. Duration: 0.97275 2023-10-04 10:05:30,869 WARNING [train.py:1204] (2/4) Exclude cut with ID _24271_210_12658_1_1531987270022_7190519_184-342363-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 10:05:30,870 WARNING [train.py:1204] (2/4) Exclude cut with ID _27894_210_12392_1_1532415496078_6902439_67-221663-0_sp1.1 from training. Duration: 0.92725 2023-10-04 10:05:34,501 WARNING [train.py:1204] (2/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_304-59227-0_sp1.1 from training. Duration: 0.7 2023-10-04 10:05:37,170 WARNING [train.py:1204] (2/4) Exclude cut with ID _58605_210_12210_1_1534723195939_3904259_458-42232-0_sp1.1 from training. Duration: 0.92725 2023-10-04 10:05:37,253 WARNING [train.py:1204] (2/4) Exclude cut with ID _14922_210_6228_1_1530707435273_3693310_13-165512-0_sp0.9 from training. Duration: 0.911125 2023-10-04 10:05:40,538 WARNING [train.py:1204] (2/4) Exclude cut with ID _58602_210_2346_1_1534903210085_6908830_792-102241-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 10:05:41,943 WARNING [train.py:1204] (2/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_588-125165-0_sp0.9 from training. Duration: 0.92225 2023-10-04 10:05:43,392 WARNING [train.py:1204] (2/4) Exclude cut with ID _54218_210_12210_1_1534413646388_4409259_239-98429-0_sp1.1 from training. Duration: 0.97275 2023-10-04 10:05:44,610 WARNING [train.py:1204] (2/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_538-32291-0_sp1.1 from training. Duration: 0.7545625 2023-10-04 10:05:46,027 WARNING [train.py:1204] (2/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_419-317062-0 from training. Duration: 0.91 2023-10-04 10:05:46,130 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9309W0202-640329 from training. Duration: 0.896 2023-10-04 10:05:46,163 WARNING [train.py:1204] (2/4) Exclude cut with ID _3231_210_2106_1_1526205864934_6784220_419-313291-0_sp1.1 from training. Duration: 0.97275 2023-10-04 10:05:49,460 WARNING [train.py:1204] (2/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_619-344499-0 from training. Duration: 0.63 2023-10-04 10:05:50,747 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_278-272612-0 from training. Duration: 0.73 2023-10-04 10:05:52,161 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_103-84010-0_sp0.9 from training. Duration: 0.7666875 2023-10-04 10:05:52,187 WARNING [train.py:1204] (2/4) Exclude cut with ID _26528_210_9070_1_1532570410842_3851310_497-97218-0_sp1.1 from training. Duration: 0.7818125 2023-10-04 10:05:53,561 WARNING [train.py:1204] (2/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_262-84613-0_sp1.1 from training. Duration: 0.936375 2023-10-04 10:05:54,861 WARNING [train.py:1204] (2/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_387-131538-0 from training. Duration: 0.81 2023-10-04 10:05:54,892 WARNING [train.py:1204] (2/4) Exclude cut with ID _8030_210_7497_1_1528444886781_3531089_224-300489-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 10:05:54,917 WARNING [train.py:1204] (2/4) Exclude cut with ID _14349_210_8341_1_1530615710818_3442470_170-336541-0_sp0.9 from training. Duration: 0.9555625 2023-10-04 10:05:57,733 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_47069_210_15368_1_1533781267_4431320_180-31746-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 10:05:58,032 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.nonlin_attention.balancer.prob, batch_count=1616106.6666666667, ans=0.125 2023-10-04 10:05:59,076 WARNING [train.py:1204] (2/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_447-7041-0_sp1.1 from training. Duration: 0.92725 2023-10-04 10:05:59,116 WARNING [train.py:1204] (2/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_2-55283-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 10:06:00,531 WARNING [train.py:1204] (2/4) Exclude cut with ID _12555_210_8750_1_1530075594402_3780239_70-181911-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 10:06:02,546 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.3.encoder.layers.2.self_attn1.whiten, num_groups=1, num_channels=512, metric=11.26 vs. limit=22.5 2023-10-04 10:06:03,282 WARNING [train.py:1204] (2/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_311-328022-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 10:06:04,054 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.self_attn_weights.pos_emb_skip_rate, batch_count=1616106.6666666667, ans=0.0 2023-10-04 10:06:06,599 WARNING [train.py:1204] (2/4) Exclude cut with ID _14528_210_8254_1_1530669700514_4306939_330-2987-0_sp1.1 from training. Duration: 0.92725 2023-10-04 10:06:06,649 WARNING [train.py:1204] (2/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_385-184858-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 10:06:09,474 WARNING [train.py:1204] (2/4) Exclude cut with ID _64414_210_17301_1_1535243252048_3757759_348-333360-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 10:06:10,874 WARNING [train.py:1204] (2/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_753-214751-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 10:06:13,911 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_17613_210_2151_1_1531137569_2366160_202-208013-0_sp1.1 from training. Duration: 0.92725 2023-10-04 10:06:13,926 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_43831_210_2158_1_1533725502_5872791_610-129300-0_sp1.1 from training. Duration: 0.936375 2023-10-04 10:06:15,403 WARNING [train.py:1204] (2/4) Exclude cut with ID _54218_210_12210_1_1534413646388_4409259_321-23852-0_sp1.1 from training. Duration: 0.936375 2023-10-04 10:06:18,189 WARNING [train.py:1204] (2/4) Exclude cut with ID _39411_210_5986_1_1533538825030_3343649_173-238248-0 from training. Duration: 0.83 2023-10-04 10:06:18,196 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_405-339986-0_sp0.9 from training. Duration: 0.5666875 2023-10-04 10:06:18,220 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_2009_210_2071_1_1525481134_4588937_385-302100-0 from training. Duration: 0.87 2023-10-04 10:06:18,257 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_161-276996-0_sp1.1 from training. Duration: 0.863625 2023-10-04 10:06:20,102 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_177-58005-0 from training. Duration: 0.62 2023-10-04 10:06:21,417 WARNING [train.py:1204] (2/4) Exclude cut with ID _64639_210_5816_1_1535367619437_3528000_146-343734-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 10:06:22,777 WARNING [train.py:1204] (2/4) Exclude cut with ID _3845_210_1390_1_1526731326141_3523610_72-61441-0_sp1.1 from training. Duration: 0.92725 2023-10-04 10:06:29,658 WARNING [train.py:1204] (2/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_991-125028-0_sp1.1 from training. Duration: 0.936375 2023-10-04 10:06:30,958 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7544_210_6753_1_1528591327_1471426_172-348777-0 from training. Duration: 0.66 2023-10-04 10:06:31,012 WARNING [train.py:1204] (2/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_416-257730-0_sp1.1 from training. Duration: 0.5909375 2023-10-04 10:06:33,557 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1113-7369-0_sp1.1 from training. Duration: 0.77275 2023-10-04 10:06:33,654 WARNING [train.py:1204] (2/4) Exclude cut with ID _40402_210_18219_1_1533522324115_2953150_242-76943-0_sp1.1 from training. Duration: 0.8545625 2023-10-04 10:06:37,193 WARNING [train.py:1204] (2/4) Exclude cut with ID _44718_210_5816_1_1534050051869_6469030_190-49648-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 10:06:39,983 INFO [train.py:1046] (2/4) Epoch 46, batch 3400, loss[loss=0.1562, simple_loss=0.244, pruned_loss=0.03418, over 24065.00 frames. ], tot_loss[loss=0.1538, simple_loss=0.2347, pruned_loss=0.03643, over 4734368.36 frames. ], batch size: 80, lr: 2.20e-03, grad_scale: 16.0 2023-10-04 10:06:40,045 WARNING [train.py:1204] (2/4) Exclude cut with ID _47251_210_18628_1_1533814063128_4137250_214-88076-0 from training. Duration: 0.88 2023-10-04 10:06:40,075 WARNING [train.py:1204] (2/4) Exclude cut with ID _5585_210_2667_1_1527418813324_8007340_89-332072-0_sp1.1 from training. Duration: 0.5818125 2023-10-04 10:06:40,108 WARNING [train.py:1204] (2/4) Exclude cut with ID _41817_210_18835_1_1533434477160_7423049_555-293448-0_sp1.1 from training. Duration: 0.72725 2023-10-04 10:06:40,391 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.feed_forward3.hidden_balancer.prob, batch_count=1616306.6666666667, ans=0.125 2023-10-04 10:06:41,416 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.680e+02 2.075e+02 2.237e+02 2.507e+02 4.047e+02, threshold=4.473e+02, percent-clipped=0.0 2023-10-04 10:06:41,505 WARNING [train.py:1204] (2/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_286-331314-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 10:06:41,557 WARNING [train.py:1204] (2/4) Exclude cut with ID _13921_210_9994_1_1530838702800_3003429_46-149444-0 from training. Duration: 0.94 2023-10-04 10:06:41,762 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.feed_forward3.hidden_balancer.prob, batch_count=1616306.6666666667, ans=0.125 2023-10-04 10:06:43,310 WARNING [train.py:1204] (2/4) Exclude cut with ID _44778_210_2054_1_1533798127203_7443090_768-43595-0_sp1.1 from training. Duration: 0.936375 2023-10-04 10:06:43,320 WARNING [train.py:1204] (2/4) Exclude cut with ID _35355_210_16040_1_1533212988977_4016229_74-190076-0 from training. Duration: 0.8 2023-10-04 10:06:46,047 WARNING [train.py:1204] (2/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_1007-316380-0_sp1.1 from training. Duration: 0.963625 2023-10-04 10:06:46,097 WARNING [train.py:1204] (2/4) Exclude cut with ID _23557_210_8750_1_1531890106370_6846255_150-283437-0_sp1.1 from training. Duration: 0.963625 2023-10-04 10:06:46,132 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_3474_210_2028_1_1526188513_4825246_145-309131-0_sp1.1 from training. Duration: 0.67275 2023-10-04 10:06:46,301 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.0.conv_skip_rate, batch_count=1616306.6666666667, ans=0.0 2023-10-04 10:06:47,480 WARNING [train.py:1204] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_391-22148-0_sp1.1 from training. Duration: 0.7 2023-10-04 10:06:47,502 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_238-256668-0 from training. Duration: 0.69 2023-10-04 10:06:52,383 WARNING [train.py:1204] (2/4) Exclude cut with ID _8394_210_6702_1_1528590504316_3540189_372-198878-0 from training. Duration: 0.6 2023-10-04 10:06:52,401 WARNING [train.py:1204] (2/4) Exclude cut with ID ID0174W0006-766659 from training. Duration: 0.953 2023-10-04 10:06:52,419 WARNING [train.py:1204] (2/4) Exclude cut with ID _6274_210_2084_1_1527937583837_7494020_668-23019-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 10:06:55,220 WARNING [train.py:1204] (2/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_322-74328-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 10:06:55,229 WARNING [train.py:1204] (2/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_872-264525-0_sp1.1 from training. Duration: 0.5909375 2023-10-04 10:06:55,472 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.feed_forward1.hidden_balancer.prob, batch_count=1616373.3333333333, ans=0.125 2023-10-04 10:06:56,530 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_53378_210_17282_1_1534221071_5769509_641-286201-0_sp1.1 from training. Duration: 0.97275 2023-10-04 10:06:57,985 WARNING [train.py:1204] (2/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_199-295611-0_sp1.1 from training. Duration: 0.5 2023-10-04 10:07:04,177 WARNING [train.py:1204] (2/4) Exclude cut with ID _5250_210_5219_1_1527249561806_8692110_1270-317971-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 10:07:05,389 WARNING [train.py:1204] (2/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_784-213099-0 from training. Duration: 0.93 2023-10-04 10:07:07,545 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.bypass.skip_rate, batch_count=1616373.3333333333, ans=0.07 2023-10-04 10:07:10,136 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_115-233929-0_sp0.9 from training. Duration: 0.8 2023-10-04 10:07:11,580 WARNING [train.py:1204] (2/4) Exclude cut with ID _54871_210_22356_1_1534665798253_3856359_256-223809-0_sp1.1 from training. Duration: 0.97275 2023-10-04 10:07:11,634 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_20120_210_6753_1_1531643035_5482859_285-87781-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 10:07:13,377 WARNING [train.py:1204] (2/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_619-344499-0_sp1.1 from training. Duration: 0.57275 2023-10-04 10:07:14,073 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.4.encoder.layers.0.feed_forward2.out_whiten, num_groups=1, num_channels=384, metric=5.80 vs. limit=15.0 2023-10-04 10:07:18,870 WARNING [train.py:1204] (2/4) Exclude cut with ID _60229_210_27767_1_1534838406158_3655350_264-279573-0_sp0.9 from training. Duration: 0.92225 2023-10-04 10:07:23,390 WARNING [train.py:1204] (2/4) Exclude cut with ID _7719_210_3525_1_1528456153163_4296960_333-110399-0 from training. Duration: 0.96 2023-10-04 10:07:28,945 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_50423_210_13270_1_1534128598_5021734_759-90833-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 10:07:28,998 WARNING [train.py:1204] (2/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_151-254798-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 10:07:30,424 WARNING [train.py:1204] (2/4) Exclude cut with ID _3465_210_3525_1_1526175667060_3574049_450-203637-0 from training. Duration: 0.92 2023-10-04 10:07:30,457 WARNING [train.py:1204] (2/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_673-36456-0_sp1.1 from training. Duration: 0.936375 2023-10-04 10:07:30,501 WARNING [train.py:1204] (2/4) Exclude cut with ID _36538_210_9994_1_1533204020363_3795620_131-8685-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 10:07:30,549 WARNING [train.py:1204] (2/4) Exclude cut with ID _12556_210_8341_1_1530081024100_7327690_713-55626-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 10:07:31,817 WARNING [train.py:1204] (2/4) Exclude cut with ID _2322_210_2667_1_1525604548971_3656299_440-80494-0_sp0.9 from training. Duration: 0.97775 2023-10-04 10:07:34,452 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.2.encoder.layers.2.feed_forward1.out_whiten, num_groups=1, num_channels=384, metric=7.15 vs. limit=15.0 2023-10-04 10:07:36,357 WARNING [train.py:1204] (2/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_210-325081-0_sp1.1 from training. Duration: 0.97275 2023-10-04 10:07:38,173 WARNING [train.py:1204] (2/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_375-244976-0_sp0.9 from training. Duration: 0.8333125 2023-10-04 10:07:39,570 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_15517_210_6753_1_1530952293_1664795_119-31001-0_sp1.1 from training. Duration: 0.8545625 2023-10-04 10:07:42,956 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.ff2_skip_rate, batch_count=1616573.3333333333, ans=0.0 2023-10-04 10:07:45,534 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_41452_210_7291_1_1533437687_3941281_319-42326-0_sp1.1 from training. Duration: 0.92725 2023-10-04 10:07:46,889 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_372-173012-0 from training. Duration: 0.95 2023-10-04 10:07:51,147 WARNING [train.py:1204] (2/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_41-31157-0_sp1.1 from training. Duration: 0.5181875 2023-10-04 10:07:54,245 INFO [train.py:1046] (2/4) Epoch 46, batch 3450, loss[loss=0.1598, simple_loss=0.2506, pruned_loss=0.03453, over 24464.00 frames. ], tot_loss[loss=0.1536, simple_loss=0.2342, pruned_loss=0.03646, over 4738703.67 frames. ], batch size: 69, lr: 2.20e-03, grad_scale: 16.0 2023-10-04 10:07:54,394 WARNING [train.py:1204] (2/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_91-212486-0 from training. Duration: 0.68 2023-10-04 10:07:57,364 WARNING [train.py:1204] (2/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_1267-272693-0 from training. Duration: 0.73 2023-10-04 10:07:57,398 WARNING [train.py:1204] (2/4) Exclude cut with ID _13938_210_9988_1_1530518718978_3693960_152-67046-0_sp1.1 from training. Duration: 0.936375 2023-10-04 10:08:00,009 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_4528_210_3273_1_1527076654_3276777_342-168068-0_sp1.1 from training. Duration: 0.7909375 2023-10-04 10:08:00,017 WARNING [train.py:1204] (2/4) Exclude cut with ID _7606_210_4915_1_1528894253277_5080690_127-177761-0 from training. Duration: 0.64 2023-10-04 10:08:01,272 WARNING [train.py:1204] (2/4) Exclude cut with ID _45329_210_7005_1_1534157951211_6774339_335-137972-0_sp1.1 from training. Duration: 0.92725 2023-10-04 10:08:04,666 WARNING [train.py:1204] (2/4) Exclude cut with ID _5166_210_2084_1_1527073197160_4087993_331-117829-0_sp1.1 from training. Duration: 0.563625 2023-10-04 10:08:09,516 WARNING [train.py:1204] (2/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_125-125818-0_sp1.1 from training. Duration: 0.87275 2023-10-04 10:08:11,358 WARNING [train.py:1204] (2/4) Exclude cut with ID _6789_210_1534_1_1527854471671_2870519_48-294897-0_sp1.1 from training. Duration: 0.963625 2023-10-04 10:08:12,816 WARNING [train.py:1204] (2/4) Exclude cut with ID _6694_210_4599_1_1527852637490_3965529_116-109398-0_sp1.1 from training. Duration: 0.663625 2023-10-04 10:08:12,833 WARNING [train.py:1204] (2/4) Exclude cut with ID _52892_210_6721_1_1534378941259_4124129_222-172685-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 10:08:15,762 WARNING [train.py:1204] (2/4) Exclude cut with ID _50655_210_18628_1_1534229942193_3772154_320-132275-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 10:08:17,366 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.balancer2.prob, batch_count=1616706.6666666667, ans=0.125 2023-10-04 10:08:19,880 WARNING [train.py:1204] (2/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_76-311963-0 from training. Duration: 0.7 2023-10-04 10:08:22,795 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=1616773.3333333333, ans=0.1 2023-10-04 10:08:25,967 WARNING [train.py:1204] (2/4) Exclude cut with ID _6274_210_2084_1_1527937583837_7494020_32-205288-0 from training. Duration: 0.78 2023-10-04 10:08:26,147 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.conv_module1.balancer1.prob, batch_count=1616773.3333333333, ans=0.125 2023-10-04 10:08:27,242 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_86-16881-0_sp0.9 from training. Duration: 0.6444375 2023-10-04 10:08:27,278 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_271-297505-0_sp1.1 from training. Duration: 0.9 2023-10-04 10:08:27,381 WARNING [train.py:1204] (2/4) Exclude cut with ID _57572_210_5517_1_1534921219624_3809484_187-295293-0_sp1.1 from training. Duration: 0.963625 2023-10-04 10:08:29,305 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.feed_forward1.out_whiten.whitening_limit, batch_count=1616773.3333333333, ans=15.0 2023-10-04 10:08:31,541 WARNING [train.py:1204] (2/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_563-329943-0 from training. Duration: 0.99 2023-10-04 10:08:32,873 WARNING [train.py:1204] (2/4) Exclude cut with ID _7160_210_1876_1_1527940923231_3703799_302-150728-0_sp0.9 from training. Duration: 0.8666875 2023-10-04 10:08:34,790 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.balancer2.prob, batch_count=1616773.3333333333, ans=0.125 2023-10-04 10:08:36,123 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.0.balancer1.prob, batch_count=1616773.3333333333, ans=0.125 2023-10-04 10:08:37,801 WARNING [train.py:1204] (2/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_926-7526-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 10:08:39,113 WARNING [train.py:1204] (2/4) Exclude cut with ID _62574_210_2655_1_1535180407058_3933249_467-22348-0_sp1.1 from training. Duration: 0.936375 2023-10-04 10:08:39,213 WARNING [train.py:1204] (2/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_280-161633-0_sp0.9 from training. Duration: 0.811125 2023-10-04 10:08:40,625 WARNING [train.py:1204] (2/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_385-193052-0_sp1.1 from training. Duration: 0.7818125 2023-10-04 10:08:42,418 WARNING [train.py:1204] (2/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_444-28041-0 from training. Duration: 0.85 2023-10-04 10:08:43,653 WARNING [train.py:1204] (2/4) Exclude cut with ID _66070_210_7640_1_1535441393096_7805310_341-249237-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 10:08:44,897 WARNING [train.py:1204] (2/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_425-269826-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 10:08:46,354 WARNING [train.py:1204] (2/4) Exclude cut with ID _53950_210_5816_1_1534417264919_3862951_124-185597-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 10:08:50,431 WARNING [train.py:1204] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_380-204756-0 from training. Duration: 0.86 2023-10-04 10:08:53,276 WARNING [train.py:1204] (2/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_282-257791-0_sp0.9 from training. Duration: 0.9555625 2023-10-04 10:08:59,148 WARNING [train.py:1204] (2/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_372-131163-0_sp1.1 from training. Duration: 0.8545625 2023-10-04 10:09:00,487 WARNING [train.py:1204] (2/4) Exclude cut with ID _28317_210_12742_1_1532480302381_8510325_447-233513-0_sp1.1 from training. Duration: 0.963625 2023-10-04 10:09:01,924 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_54-118333-0_sp1.1 from training. Duration: 0.97275 2023-10-04 10:09:06,713 WARNING [train.py:1204] (2/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_388-230239-0_sp1.1 from training. Duration: 0.963625 2023-10-04 10:09:06,724 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_40047_210_2290_1_1533290519_2374311_141-54639-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 10:09:06,791 WARNING [train.py:1204] (2/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_91-267696-0_sp0.9 from training. Duration: 0.9666875 2023-10-04 10:09:06,821 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_14056_210_4163_1_1530604483_7572827_532-137847-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 10:09:07,944 INFO [train.py:1046] (2/4) Epoch 46, batch 3500, loss[loss=0.1292, simple_loss=0.1945, pruned_loss=0.032, over 22732.00 frames. ], tot_loss[loss=0.153, simple_loss=0.2334, pruned_loss=0.03633, over 4736541.68 frames. ], batch size: 322, lr: 2.20e-03, grad_scale: 8.0 2023-10-04 10:09:11,223 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.693e+02 2.024e+02 2.165e+02 2.480e+02 4.406e+02, threshold=4.331e+02, percent-clipped=0.0 2023-10-04 10:09:11,362 WARNING [train.py:1204] (2/4) Exclude cut with ID _8064_210_5399_1_1528372299291_4282973_155-283231-0_sp1.1 from training. Duration: 0.97275 2023-10-04 10:09:14,133 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_2245_210_2715_1_1525511548_3540254_310-52483-0_sp1.1 from training. Duration: 0.8 2023-10-04 10:09:15,453 WARNING [train.py:1204] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_185-174805-0 from training. Duration: 0.68 2023-10-04 10:09:16,925 WARNING [train.py:1204] (2/4) Exclude cut with ID _15012_210_1968_1_1530856820429_5095350_662-210469-0_sp1.1 from training. Duration: 0.5181875 2023-10-04 10:09:20,911 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_426-313557-0_sp0.9 from training. Duration: 0.588875 2023-10-04 10:09:23,757 WARNING [train.py:1204] (2/4) Exclude cut with ID _42326_210_7770_1_1533814282531_3707269_407-204343-0_sp1.1 from training. Duration: 0.97275 2023-10-04 10:09:23,772 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_258-310570-0 from training. Duration: 0.82 2023-10-04 10:09:27,112 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_4737_210_2040_1_1526998689_3293008_378-178650-0_sp1.1 from training. Duration: 0.863625 2023-10-04 10:09:28,454 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_47896_210_20508_1_1534492518_5958868_432-327451-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 10:09:29,838 WARNING [train.py:1204] (2/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_188-114464-0_sp1.1 from training. Duration: 0.7454375 2023-10-04 10:09:29,846 WARNING [train.py:1204] (2/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_798-106179-0_sp1.1 from training. Duration: 0.7545625 2023-10-04 10:09:31,081 WARNING [train.py:1204] (2/4) Exclude cut with ID _14368_210_4220_1_1530532714870_3595529_143-117421-0_sp1.1 from training. Duration: 0.52725 2023-10-04 10:09:31,118 WARNING [train.py:1204] (2/4) Exclude cut with ID _67499_210_17282_1_1535509805220_3692430_361-78786-0_sp1.1 from training. Duration: 0.963625 2023-10-04 10:09:32,445 WARNING [train.py:1204] (2/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_712-342563-0_sp1.1 from training. Duration: 0.8818125 2023-10-04 10:09:32,502 WARNING [train.py:1204] (2/4) Exclude cut with ID _9235_210_5219_1_1528891154253_8046120_387-159008-0 from training. Duration: 0.86 2023-10-04 10:09:32,729 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.conv_module1.balancer1.prob, batch_count=1617040.0, ans=0.125 2023-10-04 10:09:35,652 WARNING [train.py:1204] (2/4) Exclude cut with ID _33640_210_2655_1_1533117605479_4855469_959-18820-0_sp1.1 from training. Duration: 0.963625 2023-10-04 10:09:35,683 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_12868_210_6753_1_1530702479_3610564_133-564-0_sp0.9 from training. Duration: 0.87775 2023-10-04 10:09:37,111 WARNING [train.py:1204] (2/4) Exclude cut with ID _23514_210_1389_1_1531832139785_3031439_12-29287-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 10:09:40,401 WARNING [train.py:1204] (2/4) Exclude cut with ID _23514_210_1389_1_1531832139785_3031439_489-185052-0_sp1.1 from training. Duration: 0.963625 2023-10-04 10:09:42,366 WARNING [train.py:1204] (2/4) Exclude cut with ID _5444_210_2151_1_1527418808464_5312210_634-194174-0 from training. Duration: 0.97 2023-10-04 10:09:42,383 WARNING [train.py:1204] (2/4) Exclude cut with ID _6275_210_2084_1_1528110062213_7669049_91-26919-0_sp1.1 from training. Duration: 0.8818125 2023-10-04 10:09:45,203 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_37351_210_7925_1_1533012931_3812113_516-82303-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 10:09:45,411 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.ff2_skip_rate, batch_count=1617106.6666666667, ans=0.0 2023-10-04 10:09:45,413 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.conv_module2.balancer2.prob, batch_count=1617106.6666666667, ans=0.125 2023-10-04 10:09:46,561 WARNING [train.py:1204] (2/4) Exclude cut with ID _41314_210_12651_1_1533459476543_3766460_80-74184-0_sp0.9 from training. Duration: 0.92225 2023-10-04 10:09:47,846 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_9460_210_4211_1_1528954587_10049667_495-202963-0_sp1.1 from training. Duration: 0.963625 2023-10-04 10:09:49,314 WARNING [train.py:1204] (2/4) Exclude cut with ID _14632_210_9988_1_1530691279392_3923200_332-105904-0_sp1.1 from training. Duration: 0.8181875 2023-10-04 10:09:49,331 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_40764_210_1925_1_1533882359_3752345_493-167886-0_sp1.1 from training. Duration: 0.92725 2023-10-04 10:09:52,133 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_196-289637-0 from training. Duration: 0.75 2023-10-04 10:09:53,480 WARNING [train.py:1204] (2/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_26-175553-0 from training. Duration: 0.61 2023-10-04 10:09:53,526 WARNING [train.py:1204] (2/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_890-5117-0 from training. Duration: 0.78 2023-10-04 10:09:53,579 WARNING [train.py:1204] (2/4) Exclude cut with ID _41314_210_12651_1_1533459476543_3766460_206-306860-0_sp1.1 from training. Duration: 0.92725 2023-10-04 10:09:55,530 WARNING [train.py:1204] (2/4) Exclude cut with ID _25882_210_3036_1_1532775665815_6479510_570-79824-0_sp1.1 from training. Duration: 0.963625 2023-10-04 10:09:56,836 WARNING [train.py:1204] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_731-2428-0_sp1.1 from training. Duration: 0.7545625 2023-10-04 10:09:56,866 WARNING [train.py:1204] (2/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_138-154812-0_sp0.9 from training. Duration: 0.8666875 2023-10-04 10:09:59,717 WARNING [train.py:1204] (2/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_178-39722-0_sp0.9 from training. Duration: 0.5444375 2023-10-04 10:09:59,777 WARNING [train.py:1204] (2/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_1285-256622-0_sp0.9 from training. Duration: 0.9333125 2023-10-04 10:10:05,891 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_41428_210_2161_1_1533774184_3992532_65-271323-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 10:10:06,067 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.0.bypass_mid.scale_min, batch_count=1617240.0, ans=0.2 2023-10-04 10:10:07,053 WARNING [train.py:1204] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_308-161067-0 from training. Duration: 0.66 2023-10-04 10:10:07,065 WARNING [train.py:1204] (2/4) Exclude cut with ID _3067_210_2667_1_1526212974640_4503168_459-173867-0 from training. Duration: 0.88 2023-10-04 10:10:07,066 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_2221_210_1988_1_1525597051_4935541_415-296882-0_sp1.1 from training. Duration: 0.7 2023-10-04 10:10:08,457 WARNING [train.py:1204] (2/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_354-192266-0_sp1.1 from training. Duration: 0.936375 2023-10-04 10:10:08,527 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_258-310570-0_sp0.9 from training. Duration: 0.911125 2023-10-04 10:10:10,442 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_548-141039-0_sp1.1 from training. Duration: 0.963625 2023-10-04 10:10:13,740 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_15101_210_7927_1_1530786415_5033274_320-327527-0 from training. Duration: 0.96 2023-10-04 10:10:13,793 WARNING [train.py:1204] (2/4) Exclude cut with ID _2895_210_2477_1_1525956785794_4063750_289-11475-0_sp0.9 from training. Duration: 0.911125 2023-10-04 10:10:15,198 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7543_210_6753_1_1528543318_4552_1-85072-0_sp1.1 from training. Duration: 0.7545625 2023-10-04 10:10:16,432 WARNING [train.py:1204] (2/4) Exclude cut with ID _64639_210_5816_1_1535367619437_3528000_461-329156-0 from training. Duration: 0.96 2023-10-04 10:10:19,117 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_211-339622-0 from training. Duration: 0.45 2023-10-04 10:10:20,534 WARNING [train.py:1204] (2/4) Exclude cut with ID _20455_210_5219_1_1531565996775_8098100_402-112739-0_sp1.1 from training. Duration: 0.963625 2023-10-04 10:10:21,783 INFO [train.py:1046] (2/4) Epoch 46, batch 3550, loss[loss=0.1521, simple_loss=0.2312, pruned_loss=0.03648, over 23298.00 frames. ], tot_loss[loss=0.1524, simple_loss=0.2325, pruned_loss=0.03618, over 4709092.89 frames. ], batch size: 119, lr: 2.20e-03, grad_scale: 8.0 2023-10-04 10:10:21,906 WARNING [train.py:1204] (2/4) Exclude cut with ID _53489_210_6228_1_1534298357169_3835970_256-115565-0_sp1.1 from training. Duration: 0.936375 2023-10-04 10:10:21,921 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_26326_210_7925_1_1533002493_3592406_360-305260-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 10:10:23,759 WARNING [train.py:1204] (2/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_18-204383-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 10:10:25,300 WARNING [train.py:1204] (2/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_306-232480-0_sp1.1 from training. Duration: 0.87275 2023-10-04 10:10:32,664 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.bypass.skip_rate, batch_count=1617306.6666666667, ans=0.09899494936611666 2023-10-04 10:10:35,053 WARNING [train.py:1204] (2/4) Exclude cut with ID _41161_210_20034_1_1533431249657_3555314_116-299186-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 10:10:37,015 WARNING [train.py:1204] (2/4) Exclude cut with ID _13913_210_9994_1_1530579615187_3383897_473-277682-0_sp1.1 from training. Duration: 0.3545625 2023-10-04 10:10:39,332 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.2.encoder.layers.0.conv_module1.whiten, num_groups=1, num_channels=384, metric=6.23 vs. limit=15.0 2023-10-04 10:10:39,817 WARNING [train.py:1204] (2/4) Exclude cut with ID _58895_210_8750_1_1535184081320_3649089_49-225702-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 10:10:39,894 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_3080_210_2071_1_1526086102_4365166_312-8726-0_sp1.1 from training. Duration: 0.82725 2023-10-04 10:10:41,820 WARNING [train.py:1204] (2/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_489-190175-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 10:10:41,889 WARNING [train.py:1204] (2/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_345-95717-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 10:10:42,024 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=1617373.3333333333, ans=0.1 2023-10-04 10:10:43,690 WARNING [train.py:1204] (2/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_85-138257-0_sp1.1 from training. Duration: 0.6454375 2023-10-04 10:10:46,402 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7046_210_6753_1_1527895337_6920917_783-278109-0_sp1.1 from training. Duration: 0.7 2023-10-04 10:10:47,687 WARNING [train.py:1204] (2/4) Exclude cut with ID _3465_210_3525_1_1526175667060_3574049_450-203637-0_sp1.1 from training. Duration: 0.836375 2023-10-04 10:10:48,940 WARNING [train.py:1204] (2/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_416-132893-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 10:10:48,964 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_2655_210_2087_1_1525688959_3897400_437-337690-0_sp0.9 from training. Duration: 0.7 2023-10-04 10:10:49,040 WARNING [train.py:1204] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_700-183549-0_sp1.1 from training. Duration: 0.6181875 2023-10-04 10:10:49,734 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.3.encoder.layers.1.self_attn1.whiten, num_groups=1, num_channels=512, metric=14.00 vs. limit=22.5 2023-10-04 10:10:51,995 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.ff3_skip_rate, batch_count=1617440.0, ans=0.0 2023-10-04 10:10:54,601 WARNING [train.py:1204] (2/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_191-59784-0_sp1.1 from training. Duration: 0.8 2023-10-04 10:10:54,619 WARNING [train.py:1204] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_218-295097-0_sp1.1 from training. Duration: 0.7 2023-10-04 10:10:57,876 WARNING [train.py:1204] (2/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_310-109461-0_sp1.1 from training. Duration: 0.72725 2023-10-04 10:10:57,881 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_33348_210_5632_1_1533086912_3324030_131-321149-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 10:10:57,937 WARNING [train.py:1204] (2/4) Exclude cut with ID _64135_210_2253_1_1535677267876_7608972_133-188266-0_sp1.1 from training. Duration: 0.736375 2023-10-04 10:10:57,960 WARNING [train.py:1204] (2/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_505-205270-0 from training. Duration: 0.38 2023-10-04 10:10:57,970 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_67223_210_4238_1_1535612120_2537431_102-88248-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 10:10:58,218 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.bypass.skip_rate, batch_count=1617440.0, ans=0.04949747468305833 2023-10-04 10:10:59,416 WARNING [train.py:1204] (2/4) Exclude cut with ID _54871_210_22356_1_1534665798253_3856359_60-235433-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 10:11:00,793 WARNING [train.py:1204] (2/4) Exclude cut with ID _5189_210_4557_1_1527248319428_3769229_199-53526-0_sp1.1 from training. Duration: 0.3818125 2023-10-04 10:11:05,128 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.conv_module2.balancer1.prob, batch_count=1617506.6666666667, ans=0.125 2023-10-04 10:11:06,742 WARNING [train.py:1204] (2/4) Exclude cut with ID _45329_210_7005_1_1534157951211_6774339_439-90979-0_sp1.1 from training. Duration: 0.963625 2023-10-04 10:11:06,805 WARNING [train.py:1204] (2/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_1162-132880-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 10:11:06,967 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.ff2_skip_rate, batch_count=1617506.6666666667, ans=0.0 2023-10-04 10:11:08,129 WARNING [train.py:1204] (2/4) Exclude cut with ID _56423_210_5854_1_1534896061049_3165969_220-22317-0_sp1.1 from training. Duration: 0.963625 2023-10-04 10:11:08,336 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=1617506.6666666667, ans=0.1 2023-10-04 10:11:10,847 WARNING [train.py:1204] (2/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_909-64151-0 from training. Duration: 0.94 2023-10-04 10:11:10,900 WARNING [train.py:1204] (2/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_465-156500-0_sp0.9 from training. Duration: 0.8 2023-10-04 10:11:10,994 WARNING [train.py:1204] (2/4) Exclude cut with ID _44304_210_15374_1_1533639352765_4613129_302-22084-0 from training. Duration: 0.86 2023-10-04 10:11:12,317 WARNING [train.py:1204] (2/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_663-166973-0_sp1.1 from training. Duration: 0.72725 2023-10-04 10:11:15,914 WARNING [train.py:1204] (2/4) Exclude cut with ID _45329_210_7005_1_1534157951211_6774339_503-287883-0_sp0.9 from training. Duration: 0.97775 2023-10-04 10:11:15,936 WARNING [train.py:1204] (2/4) Exclude cut with ID _60057_210_10640_1_1534903065793_3333150_362-230377-0_sp0.9 from training. Duration: 0.9666875 2023-10-04 10:11:18,757 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_4183_210_2087_1_1526729100_1869838_143-280962-0 from training. Duration: 0.56 2023-10-04 10:11:18,837 WARNING [train.py:1204] (2/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_281-1313-0_sp1.1 from training. Duration: 0.936375 2023-10-04 10:11:24,577 WARNING [train.py:1204] (2/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_64-215118-0_sp1.1 from training. Duration: 0.936375 2023-10-04 10:11:24,639 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_2822_210_2715_1_1526112586_1999587_165-36656-0 from training. Duration: 0.89 2023-10-04 10:11:25,965 WARNING [train.py:1204] (2/4) Exclude cut with ID _22961_210_8254_1_1531976366672_4278180_402-208830-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 10:11:30,480 WARNING [train.py:1204] (2/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_98-193494-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 10:11:30,563 WARNING [train.py:1204] (2/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_383-285196-0 from training. Duration: 0.97 2023-10-04 10:11:36,580 INFO [train.py:1046] (2/4) Epoch 46, batch 3600, loss[loss=0.1651, simple_loss=0.2515, pruned_loss=0.0393, over 24685.00 frames. ], tot_loss[loss=0.1521, simple_loss=0.2322, pruned_loss=0.03606, over 4710043.01 frames. ], batch size: 73, lr: 2.20e-03, grad_scale: 16.0 2023-10-04 10:11:36,659 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6767_210_3273_1_1527855783_1718932_142-127132-0 from training. Duration: 0.95 2023-10-04 10:11:36,688 WARNING [train.py:1204] (2/4) Exclude cut with ID _39867_210_9826_1_1533553222030_9344259_1018-339635-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 10:11:38,083 WARNING [train.py:1204] (2/4) Exclude cut with ID _14603_210_4220_1_1530615688441_3097319_274-274798-0_sp1.1 from training. Duration: 0.8909375 2023-10-04 10:11:39,478 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.650e+02 1.969e+02 2.172e+02 2.569e+02 3.736e+02, threshold=4.344e+02, percent-clipped=0.0 2023-10-04 10:11:40,947 WARNING [train.py:1204] (2/4) Exclude cut with ID _2334_210_2667_1_1525608238880_4705129_376-263393-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 10:11:41,008 WARNING [train.py:1204] (2/4) Exclude cut with ID _29303_210_2575_1_1532392237303_7410649_363-344675-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 10:11:44,164 WARNING [train.py:1204] (2/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_58-68956-0_sp1.1 from training. Duration: 0.7545625 2023-10-04 10:11:45,802 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.conv_skip_rate, batch_count=1617640.0, ans=0.0 2023-10-04 10:11:46,937 WARNING [train.py:1204] (2/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_709-311571-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 10:11:49,039 WARNING [train.py:1204] (2/4) Exclude cut with ID _46067_210_4220_1_1534125571066_3307450_136-257407-0_sp1.1 from training. Duration: 0.963625 2023-10-04 10:11:49,114 WARNING [train.py:1204] (2/4) Exclude cut with ID _6493_210_1747_1_1528459210424_4012470_324-323854-0_sp1.1 from training. Duration: 0.836375 2023-10-04 10:11:50,519 WARNING [train.py:1204] (2/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_234-260682-0_sp1.1 from training. Duration: 0.763625 2023-10-04 10:11:50,580 WARNING [train.py:1204] (2/4) Exclude cut with ID _36634_210_1794_1_1533296352450_1399884_55-119451-0_sp1.1 from training. Duration: 0.963625 2023-10-04 10:11:50,583 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_40047_210_2290_1_1533290519_2374311_195-165693-0 from training. Duration: 0.89 2023-10-04 10:11:53,409 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_5522_210_3273_1_1527249606_3909096_366-172508-0_sp0.9 from training. Duration: 0.7333125 2023-10-04 10:11:53,636 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.feed_forward2.hidden_balancer.prob, batch_count=1617706.6666666667, ans=0.125 2023-10-04 10:11:53,656 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.conv_module2.balancer1.prob, batch_count=1617706.6666666667, ans=0.125 2023-10-04 10:11:54,781 WARNING [train.py:1204] (2/4) Exclude cut with ID _21878_210_3525_1_1531738772002_7264330_187-10504-0_sp1.1 from training. Duration: 0.963625 2023-10-04 10:11:57,468 WARNING [train.py:1204] (2/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_282-257791-0_sp1.1 from training. Duration: 0.7818125 2023-10-04 10:12:00,541 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_43831_210_2158_1_1533725502_5872791_467-288776-0_sp1.1 from training. Duration: 0.92725 2023-10-04 10:12:01,961 WARNING [train.py:1204] (2/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_138-154812-0_sp1.1 from training. Duration: 0.7090625 2023-10-04 10:12:03,274 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_1154-185930-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 10:12:03,301 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_2655_210_2087_1_1525688959_3897400_52-279899-0 from training. Duration: 0.69 2023-10-04 10:12:05,163 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_141-89113-0_sp1.1 from training. Duration: 0.7818125 2023-10-04 10:12:06,627 WARNING [train.py:1204] (2/4) Exclude cut with ID _55134_210_21982_1_1534590028377_3882170_297-47310-0_sp1.1 from training. Duration: 0.963625 2023-10-04 10:12:07,952 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_486-151131-0_sp1.1 from training. Duration: 0.77275 2023-10-04 10:12:09,414 WARNING [train.py:1204] (2/4) Exclude cut with ID _44778_210_2054_1_1533798127203_7443090_21-232980-0_sp1.1 from training. Duration: 0.936375 2023-10-04 10:12:10,070 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.4.encoder.layers.2.feed_forward3.out_whiten, num_groups=1, num_channels=384, metric=15.68 vs. limit=15.0 2023-10-04 10:12:10,993 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6165_210_3528_1_1527590982_4379841_338-106380-0_sp1.1 from training. Duration: 0.92725 2023-10-04 10:12:12,209 WARNING [train.py:1204] (2/4) Exclude cut with ID _55162_210_17071_1_1534408248583_3753480_355-257218-0_sp1.1 from training. Duration: 0.97275 2023-10-04 10:12:14,061 WARNING [train.py:1204] (2/4) Exclude cut with ID _2895_210_2477_1_1525956785794_4063750_289-11475-0 from training. Duration: 0.82 2023-10-04 10:12:20,241 WARNING [train.py:1204] (2/4) Exclude cut with ID _16756_210_4915_1_1531047803763_7104259_839-183431-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 10:12:21,607 WARNING [train.py:1204] (2/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_427-208901-0_sp0.9 from training. Duration: 0.7555625 2023-10-04 10:12:21,656 WARNING [train.py:1204] (2/4) Exclude cut with ID _36512_210_6987_1_1533033143998_3126069_181-306500-0 from training. Duration: 0.95 2023-10-04 10:12:25,795 WARNING [train.py:1204] (2/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_28-70987-0_sp1.1 from training. Duration: 0.7909375 2023-10-04 10:12:31,941 WARNING [train.py:1204] (2/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_590-58996-0_sp1.1 from training. Duration: 0.936375 2023-10-04 10:12:32,082 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.attention_skip_rate, batch_count=1617840.0, ans=0.0 2023-10-04 10:12:34,557 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_48730_210_5399_1_1533885886_2212769_108-47561-0_sp1.1 from training. Duration: 0.936375 2023-10-04 10:12:40,673 WARNING [train.py:1204] (2/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_401-158239-0_sp0.9 from training. Duration: 0.788875 2023-10-04 10:12:40,695 WARNING [train.py:1204] (2/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_278-154492-0_sp1.1 from training. Duration: 0.6909375 2023-10-04 10:12:40,701 WARNING [train.py:1204] (2/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_137-116442-0 from training. Duration: 0.78 2023-10-04 10:12:42,606 WARNING [train.py:1204] (2/4) Exclude cut with ID _3541_210_2477_1_1526475408709_3263973_189-317158-0 from training. Duration: 0.9 2023-10-04 10:12:42,689 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_47896_210_20508_1_1534492518_5958868_558-314572-0 from training. Duration: 0.97 2023-10-04 10:12:45,380 WARNING [train.py:1204] (2/4) Exclude cut with ID _66070_210_7640_1_1535441393096_7805310_290-40284-0_sp1.1 from training. Duration: 0.97275 2023-10-04 10:12:45,419 WARNING [train.py:1204] (2/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_110-224688-0_sp1.1 from training. Duration: 0.8818125 2023-10-04 10:12:48,131 WARNING [train.py:1204] (2/4) Exclude cut with ID _7719_210_3525_1_1528456153163_4296960_88-250259-0 from training. Duration: 0.84 2023-10-04 10:12:48,185 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8453_210_4211_1_1528502940_8562730_1157-114102-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 10:12:49,497 WARNING [train.py:1204] (2/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_317-175062-0_sp1.1 from training. Duration: 0.7454375 2023-10-04 10:12:49,503 WARNING [train.py:1204] (2/4) Exclude cut with ID _29857_210_20034_1_1532395886753_3798160_254-347254-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 10:12:49,592 WARNING [train.py:1204] (2/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_28-70987-0 from training. Duration: 0.87 2023-10-04 10:12:50,888 INFO [train.py:1046] (2/4) Epoch 46, batch 3650, loss[loss=0.1426, simple_loss=0.2212, pruned_loss=0.03202, over 24338.00 frames. ], tot_loss[loss=0.1522, simple_loss=0.2323, pruned_loss=0.03607, over 4691163.02 frames. ], batch size: 61, lr: 2.20e-03, grad_scale: 16.0 2023-10-04 10:12:50,967 WARNING [train.py:1204] (2/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_321-335475-0 from training. Duration: 0.45 2023-10-04 10:12:54,235 WARNING [train.py:1204] (2/4) Exclude cut with ID _40901_210_5665_1_1533363789958_3856130_356-107330-0_sp1.1 from training. Duration: 0.936375 2023-10-04 10:12:55,280 INFO [scaling.py:1022] (2/4) Whitening: name=encoder_embed.convnext.out_whiten, num_groups=1, num_channels=128, metric=4.57 vs. limit=5.0 2023-10-04 10:12:55,654 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_3780_210_2087_1_1526467389_2145572_176-107140-0 from training. Duration: 0.69 2023-10-04 10:12:55,892 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.conv_skip_rate, batch_count=1617973.3333333333, ans=0.0 2023-10-04 10:12:59,768 WARNING [train.py:1204] (2/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_80-135200-0 from training. Duration: 0.79 2023-10-04 10:13:01,132 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_33348_210_5632_1_1533086912_3324030_85-28583-0_sp0.9 from training. Duration: 0.911125 2023-10-04 10:13:03,108 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.1.feed_forward1.out_proj.dropout_p, batch_count=1617973.3333333333, ans=0.1 2023-10-04 10:13:04,502 WARNING [train.py:1204] (2/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_29-293896-0 from training. Duration: 0.61 2023-10-04 10:13:07,738 WARNING [train.py:1204] (2/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_194-252800-0 from training. Duration: 0.97 2023-10-04 10:13:10,689 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.bypass.skip_rate, batch_count=1618040.0, ans=0.07 2023-10-04 10:13:11,812 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_360-52446-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 10:13:11,814 WARNING [train.py:1204] (2/4) Exclude cut with ID _9389_210_1664_1_1529217337301_5289269_289-213417-0_sp0.9 from training. Duration: 0.988875 2023-10-04 10:13:11,869 WARNING [train.py:1204] (2/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_143-86344-0_sp0.9 from training. Duration: 0.8555625 2023-10-04 10:13:15,241 WARNING [train.py:1204] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_520-126729-0_sp0.9 from training. Duration: 0.77775 2023-10-04 10:13:15,268 WARNING [train.py:1204] (2/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_76-308663-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 10:13:16,575 WARNING [train.py:1204] (2/4) Exclude cut with ID _8021_210_7994_1_1528376451358_3653069_100-140231-0 from training. Duration: 0.67 2023-10-04 10:13:17,939 WARNING [train.py:1204] (2/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_387-131538-0_sp0.9 from training. Duration: 0.9 2023-10-04 10:13:17,970 WARNING [train.py:1204] (2/4) Exclude cut with ID _6275_210_2084_1_1528110062213_7669049_365-182054-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 10:13:18,017 WARNING [train.py:1204] (2/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_345-340690-0 from training. Duration: 0.93 2023-10-04 10:13:19,368 WARNING [train.py:1204] (2/4) Exclude cut with ID _5409_210_1388_1_1527292850189_5865191_178-127970-0_sp0.9 from training. Duration: 0.6333125 2023-10-04 10:13:20,709 WARNING [train.py:1204] (2/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_352-12734-0_sp1.1 from training. Duration: 0.92725 2023-10-04 10:13:20,712 WARNING [train.py:1204] (2/4) Exclude cut with ID _65134_210_14763_1_1535335393985_3505368_121-129373-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 10:13:22,219 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.conv_module1.balancer1.prob, batch_count=1618106.6666666667, ans=0.125 2023-10-04 10:13:23,343 WARNING [train.py:1204] (2/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_557-324258-0_sp1.1 from training. Duration: 0.736375 2023-10-04 10:13:24,805 WARNING [train.py:1204] (2/4) Exclude cut with ID _48678_210_5663_1_1533988830947_3769199_306-255886-0 from training. Duration: 0.77 2023-10-04 10:13:26,647 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7544_210_6753_1_1528587000_691468_87-178599-0 from training. Duration: 0.87 2023-10-04 10:13:28,023 WARNING [train.py:1204] (2/4) Exclude cut with ID _2941_210_2577_1_1526199673150_2489759_208-236402-0_sp1.1 from training. Duration: 0.963625 2023-10-04 10:13:29,484 WARNING [train.py:1204] (2/4) Exclude cut with ID _38670_210_15379_1_1533448810659_4391059_65-169397-0 from training. Duration: 0.97 2023-10-04 10:13:29,726 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.bypass.scale_min, batch_count=1618106.6666666667, ans=0.2 2023-10-04 10:13:30,907 WARNING [train.py:1204] (2/4) Exclude cut with ID _30304_210_6319_1_1532476827261_7582060_306-328175-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 10:13:32,210 WARNING [train.py:1204] (2/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_212-149947-0_sp1.1 from training. Duration: 0.663625 2023-10-04 10:13:32,863 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.4.encoder.layers.0.self_attn1.whiten, num_groups=1, num_channels=384, metric=13.22 vs. limit=22.5 2023-10-04 10:13:36,855 WARNING [train.py:1204] (2/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_321-22821-0_sp1.1 from training. Duration: 0.8090625 2023-10-04 10:13:38,895 WARNING [train.py:1204] (2/4) Exclude cut with ID _44010_210_4525_1_1533884262964_4013620_483-69110-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 10:13:38,904 WARNING [train.py:1204] (2/4) Exclude cut with ID _14922_210_6228_1_1530707435273_3693310_38-187851-0_sp1.1 from training. Duration: 0.563625 2023-10-04 10:13:40,236 WARNING [train.py:1204] (2/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_129-337802-0_sp0.9 from training. Duration: 0.8 2023-10-04 10:13:41,522 WARNING [train.py:1204] (2/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_268-87909-0_sp1.1 from training. Duration: 0.9 2023-10-04 10:13:42,932 WARNING [train.py:1204] (2/4) Exclude cut with ID _21878_210_3525_1_1531738772002_7264330_559-184862-0_sp1.1 from training. Duration: 0.8545625 2023-10-04 10:13:43,242 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.conv_skip_rate, batch_count=1618173.3333333333, ans=0.0 2023-10-04 10:13:45,140 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.conv_module2.balancer1.max_abs, batch_count=1618173.3333333333, ans=10.0 2023-10-04 10:13:46,167 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_46032_210_14656_1_1533781412_4507969_237-120766-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 10:13:47,593 WARNING [train.py:1204] (2/4) Exclude cut with ID _28427_210_4220_1_1532581240967_3061069_330-170906-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 10:13:47,605 WARNING [train.py:1204] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_401-51578-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 10:13:50,138 WARNING [train.py:1204] (2/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_209-205365-0_sp1.1 from training. Duration: 0.7181875 2023-10-04 10:13:50,219 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_23801_210_16223_1_1532066392_3294657_57-242170-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 10:13:51,499 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_127-37371-0_sp1.1 from training. Duration: 0.936375 2023-10-04 10:13:56,938 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9171W0019-578905 from training. Duration: 0.896 2023-10-04 10:14:01,680 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_24605_210_1794_1_1532047863_4149755_349-49056-0_sp1.1 from training. Duration: 0.92725 2023-10-04 10:14:01,698 WARNING [train.py:1204] (2/4) Exclude cut with ID _12302_210_5219_1_1529924446466_8107280_897-21237-0_sp1.1 from training. Duration: 0.936375 2023-10-04 10:14:03,094 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_9460_210_4211_1_1528954587_10049667_480-260269-0_sp0.9 from training. Duration: 0.62225 2023-10-04 10:14:03,138 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_348-152548-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 10:14:04,397 INFO [train.py:1046] (2/4) Epoch 46, batch 3700, loss[loss=0.1638, simple_loss=0.2365, pruned_loss=0.04552, over 23445.00 frames. ], tot_loss[loss=0.1531, simple_loss=0.2337, pruned_loss=0.03627, over 4709691.40 frames. ], batch size: 285, lr: 2.20e-03, grad_scale: 16.0 2023-10-04 10:14:04,476 WARNING [train.py:1204] (2/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_301-330455-0_sp0.9 from training. Duration: 0.888875 2023-10-04 10:14:06,453 WARNING [train.py:1204] (2/4) Exclude cut with ID _61008_210_10633_1_1534852769198_6974370_345-91705-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 10:14:06,563 WARNING [train.py:1204] (2/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_304-273433-0 from training. Duration: 0.99 2023-10-04 10:14:06,566 WARNING [train.py:1204] (2/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_359-345972-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 10:14:07,880 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.739e+02 2.030e+02 2.313e+02 2.831e+02 4.310e+02, threshold=4.627e+02, percent-clipped=0.0 2023-10-04 10:14:08,713 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_2296_210_2087_1_1525503448_3397335_57-185430-0_sp1.1 from training. Duration: 0.5454375 2023-10-04 10:14:08,903 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.bypass_mid.scale_min, batch_count=1618306.6666666667, ans=0.2 2023-10-04 10:14:09,621 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.1.encoder.layers.0.feed_forward3.out_whiten, num_groups=1, num_channels=256, metric=13.30 vs. limit=15.0 2023-10-04 10:14:10,134 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_1256-120974-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 10:14:11,479 WARNING [train.py:1204] (2/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_372-30302-0_sp0.9 from training. Duration: 0.9555625 2023-10-04 10:14:14,209 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_42012_210_7927_1_1533439936_3871556_451-344262-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 10:14:14,210 WARNING [train.py:1204] (2/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_28-324729-0 from training. Duration: 0.94 2023-10-04 10:14:14,225 WARNING [train.py:1204] (2/4) Exclude cut with ID _64135_210_2253_1_1535677267876_7608972_313-18394-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 10:14:15,548 WARNING [train.py:1204] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_620-276793-0_sp0.9 from training. Duration: 0.6555625 2023-10-04 10:14:17,373 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7544_210_6753_1_1528591327_1471426_172-348777-0_sp0.9 from training. Duration: 0.7333125 2023-10-04 10:14:20,122 WARNING [train.py:1204] (2/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_332-221057-0_sp0.9 from training. Duration: 0.7555625 2023-10-04 10:14:23,034 WARNING [train.py:1204] (2/4) Exclude cut with ID _19023_210_4353_1_1531378892552_5820780_5-300035-0_sp1.1 from training. Duration: 0.92725 2023-10-04 10:14:23,078 WARNING [train.py:1204] (2/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_343-309023-0_sp1.1 from training. Duration: 0.963625 2023-10-04 10:14:24,379 WARNING [train.py:1204] (2/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_37-41919-0_sp1.1 from training. Duration: 0.7909375 2023-10-04 10:14:24,404 WARNING [train.py:1204] (2/4) Exclude cut with ID _3541_210_2477_1_1526475408709_3263973_41-182910-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 10:14:25,761 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_229-45300-0_sp0.9 from training. Duration: 0.6333125 2023-10-04 10:14:28,888 WARNING [train.py:1204] (2/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_40-214716-0_sp1.1 from training. Duration: 0.963625 2023-10-04 10:14:30,324 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9309W0203-640330 from training. Duration: 0.896 2023-10-04 10:14:30,522 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.balancer1.prob, batch_count=1618373.3333333333, ans=0.125 2023-10-04 10:14:34,618 WARNING [train.py:1204] (2/4) Exclude cut with ID _60229_210_27767_1_1534838406158_3655350_264-279573-0_sp1.1 from training. Duration: 0.7545625 2023-10-04 10:14:34,653 WARNING [train.py:1204] (2/4) Exclude cut with ID _8394_210_6702_1_1528590504316_3540189_372-198878-0_sp0.9 from training. Duration: 0.6666875 2023-10-04 10:14:34,886 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.self_attn_weights.pos_emb_skip_rate, batch_count=1618440.0, ans=0.0 2023-10-04 10:14:37,877 WARNING [train.py:1204] (2/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_359-118926-0_sp1.1 from training. Duration: 0.6545625 2023-10-04 10:14:37,890 WARNING [train.py:1204] (2/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_85-65056-0 from training. Duration: 0.96 2023-10-04 10:14:37,922 WARNING [train.py:1204] (2/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_89-242335-0_sp1.1 from training. Duration: 0.863625 2023-10-04 10:14:41,227 WARNING [train.py:1204] (2/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_128-49444-0_sp1.1 from training. Duration: 0.936375 2023-10-04 10:14:42,569 WARNING [train.py:1204] (2/4) Exclude cut with ID _30327_210_16049_1_1532484161295_3924169_401-70819-0 from training. Duration: 0.86 2023-10-04 10:14:43,962 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_41822_210_13270_1_1533955976_4379284_260-256391-0_sp1.1 from training. Duration: 0.936375 2023-10-04 10:14:45,374 WARNING [train.py:1204] (2/4) Exclude cut with ID _67335_210_5219_1_1535525975652_4275229_599-247317-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 10:14:46,752 WARNING [train.py:1204] (2/4) Exclude cut with ID _29857_210_20034_1_1532395886753_3798160_209-239267-0_sp1.1 from training. Duration: 0.936375 2023-10-04 10:14:46,770 WARNING [train.py:1204] (2/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_46-207599-0_sp1.1 from training. Duration: 0.5818125 2023-10-04 10:14:50,147 WARNING [train.py:1204] (2/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_912-282412-0_sp1.1 from training. Duration: 0.4454375 2023-10-04 10:14:54,208 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_4183_210_2087_1_1526729100_1869838_141-166600-0_sp1.1 from training. Duration: 0.863625 2023-10-04 10:14:54,213 WARNING [train.py:1204] (2/4) Exclude cut with ID _15037_210_2253_1_1530752294104_3267650_296-192597-0 from training. Duration: 0.81 2023-10-04 10:14:54,257 WARNING [train.py:1204] (2/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_571-220562-0_sp1.1 from training. Duration: 0.963625 2023-10-04 10:14:55,550 WARNING [train.py:1204] (2/4) Exclude cut with ID _5287_210_2084_1_1527418883040_3878670_12-221186-0 from training. Duration: 0.98 2023-10-04 10:15:00,548 WARNING [train.py:1204] (2/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_149-85602-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 10:15:01,850 WARNING [train.py:1204] (2/4) Exclude cut with ID _9235_210_5219_1_1528891154253_8046120_1003-101638-0_sp1.1 from training. Duration: 0.663625 2023-10-04 10:15:04,676 WARNING [train.py:1204] (2/4) Exclude cut with ID _55844_210_2346_1_1534471212569_7625707_1099-33689-0_sp1.1 from training. Duration: 0.92725 2023-10-04 10:15:04,715 WARNING [train.py:1204] (2/4) Exclude cut with ID _7606_210_4915_1_1528894253277_5080690_301-10732-0 from training. Duration: 0.81 2023-10-04 10:15:08,022 WARNING [train.py:1204] (2/4) Exclude cut with ID _22826_210_9994_1_1531908201522_3279169_403-34698-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 10:15:08,030 WARNING [train.py:1204] (2/4) Exclude cut with ID _8480_210_1874_1_1528589391205_6728330_755-112625-0_sp0.9 from training. Duration: 0.82225 2023-10-04 10:15:08,042 WARNING [train.py:1204] (2/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_527-238990-0_sp1.1 from training. Duration: 0.8181875 2023-10-04 10:15:08,072 WARNING [train.py:1204] (2/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_426-7328-0_sp1.1 from training. Duration: 0.92725 2023-10-04 10:15:11,728 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.feed_forward1.out_proj.dropout_p, batch_count=1618573.3333333333, ans=0.1 2023-10-04 10:15:12,864 WARNING [train.py:1204] (2/4) Exclude cut with ID _3541_210_2477_1_1526475408709_3263973_189-317158-0_sp1.1 from training. Duration: 0.8181875 2023-10-04 10:15:12,939 WARNING [train.py:1204] (2/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_162-103620-0 from training. Duration: 0.93 2023-10-04 10:15:14,334 WARNING [train.py:1204] (2/4) Exclude cut with ID _22083_210_8254_1_1531702862439_3678970_49-234546-0 from training. Duration: 0.83 2023-10-04 10:15:14,385 WARNING [train.py:1204] (2/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_370-331600-0_sp1.1 from training. Duration: 0.8545625 2023-10-04 10:15:14,397 WARNING [train.py:1204] (2/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_166-59042-0_sp1.1 from training. Duration: 0.97275 2023-10-04 10:15:15,688 WARNING [train.py:1204] (2/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_138-332773-0_sp1.1 from training. Duration: 0.736375 2023-10-04 10:15:17,072 WARNING [train.py:1204] (2/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_90-30673-0_sp0.9 from training. Duration: 0.9333125 2023-10-04 10:15:17,785 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.3.encoder.layers.0.nonlin_attention.whiten1, num_groups=1, num_channels=384, metric=6.84 vs. limit=10.0 2023-10-04 10:15:18,620 WARNING [train.py:1204] (2/4) Exclude cut with ID _27967_210_12140_1_1532588391859_8419130_737-41898-0_sp1.1 from training. Duration: 0.936375 2023-10-04 10:15:20,322 INFO [train.py:1046] (2/4) Epoch 46, batch 3750, loss[loss=0.1602, simple_loss=0.2354, pruned_loss=0.04254, over 22790.00 frames. ], tot_loss[loss=0.1534, simple_loss=0.2341, pruned_loss=0.03638, over 4711248.13 frames. ], batch size: 322, lr: 2.20e-03, grad_scale: 8.0 2023-10-04 10:15:20,446 WARNING [train.py:1204] (2/4) Exclude cut with ID _8833_210_5219_1_1528592665210_7553059_329-125156-0_sp1.1 from training. Duration: 0.7454375 2023-10-04 10:15:20,542 WARNING [train.py:1204] (2/4) Exclude cut with ID _41817_210_18835_1_1533434477160_7423049_352-42537-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 10:15:23,257 WARNING [train.py:1204] (2/4) Exclude cut with ID _8365_210_2064_1_1528592311070_4677690_431-294102-0 from training. Duration: 0.89 2023-10-04 10:15:23,379 WARNING [train.py:1204] (2/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_454-314901-0_sp1.1 from training. Duration: 0.4181875 2023-10-04 10:15:23,595 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.self_attn_weights.pos_emb_skip_rate, batch_count=1618640.0, ans=0.0 2023-10-04 10:15:27,618 WARNING [train.py:1204] (2/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_80-135200-0_sp0.9 from training. Duration: 0.87775 2023-10-04 10:15:27,670 WARNING [train.py:1204] (2/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_181-78632-0 from training. Duration: 0.78 2023-10-04 10:15:29,037 WARNING [train.py:1204] (2/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_300-27235-0_sp0.9 from training. Duration: 0.9666875 2023-10-04 10:15:29,110 WARNING [train.py:1204] (2/4) Exclude cut with ID _47790_210_16187_1_1533862814066_4207519_96-177176-0_sp1.1 from training. Duration: 0.97275 2023-10-04 10:15:31,166 WARNING [train.py:1204] (2/4) Exclude cut with ID _35941_210_15379_1_1532995294515_4023089_246-348742-0_sp1.1 from training. Duration: 0.97275 2023-10-04 10:15:32,603 WARNING [train.py:1204] (2/4) Exclude cut with ID _38670_210_15379_1_1533448810659_4391059_65-169397-0_sp1.1 from training. Duration: 0.8818125 2023-10-04 10:15:35,417 WARNING [train.py:1204] (2/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_1003-99642-0_sp1.1 from training. Duration: 0.963625 2023-10-04 10:15:37,531 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.conv_skip_rate, batch_count=1618706.6666666667, ans=0.0 2023-10-04 10:15:40,633 WARNING [train.py:1204] (2/4) Exclude cut with ID _40402_210_18219_1_1533522324115_2953150_320-14078-0_sp1.1 from training. Duration: 0.863625 2023-10-04 10:15:41,902 WARNING [train.py:1204] (2/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_367-61330-0_sp0.9 from training. Duration: 0.8333125 2023-10-04 10:15:43,384 WARNING [train.py:1204] (2/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_515-298514-0_sp1.1 from training. Duration: 0.92725 2023-10-04 10:15:47,456 WARNING [train.py:1204] (2/4) Exclude cut with ID _53731_210_7994_1_1534553979244_3757214_471-320815-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 10:15:47,514 WARNING [train.py:1204] (2/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_306-232480-0 from training. Duration: 0.96 2023-10-04 10:15:48,837 WARNING [train.py:1204] (2/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_478-162247-0_sp0.9 from training. Duration: 0.911125 2023-10-04 10:15:48,954 WARNING [train.py:1204] (2/4) Exclude cut with ID _56423_210_5854_1_1534896061049_3165969_345-284696-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 10:15:48,984 WARNING [train.py:1204] (2/4) Exclude cut with ID _44778_210_2054_1_1533798127203_7443090_600-122253-0_sp1.1 from training. Duration: 0.963625 2023-10-04 10:15:53,670 WARNING [train.py:1204] (2/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_199-295611-0 from training. Duration: 0.55 2023-10-04 10:15:56,459 WARNING [train.py:1204] (2/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_734-129761-0 from training. Duration: 0.82 2023-10-04 10:15:59,062 WARNING [train.py:1204] (2/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_349-169257-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 10:15:59,093 WARNING [train.py:1204] (2/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_202-67017-0_sp0.9 from training. Duration: 0.911125 2023-10-04 10:16:00,484 WARNING [train.py:1204] (2/4) Exclude cut with ID _16908_210_5724_1_1531119157248_3772700_61-258978-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 10:16:05,043 WARNING [train.py:1204] (2/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_172-156931-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 10:16:05,144 WARNING [train.py:1204] (2/4) Exclude cut with ID _4092_210_2151_1_1526643044449_3135759_356-278694-0_sp1.1 from training. Duration: 0.42725 2023-10-04 10:16:10,365 WARNING [train.py:1204] (2/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_116-332008-0 from training. Duration: 0.98 2023-10-04 10:16:13,086 WARNING [train.py:1204] (2/4) Exclude cut with ID _29003_210_19596_1_1532507391720_3894098_358-114789-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 10:16:13,197 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.1.self_attn_weights.pos_emb_skip_rate, batch_count=1618840.0, ans=0.0 2023-10-04 10:16:15,999 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.feed_forward1.hidden_balancer.prob, batch_count=1618840.0, ans=0.125 2023-10-04 10:16:18,558 WARNING [train.py:1204] (2/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_152-212533-0_sp1.1 from training. Duration: 0.8818125 2023-10-04 10:16:18,622 WARNING [train.py:1204] (2/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_267-214116-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 10:16:18,884 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.ff3_skip_rate, batch_count=1618906.6666666667, ans=0.0 2023-10-04 10:16:22,102 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7543_210_6753_1_1528541907_1399322_165-323619-0_sp0.9 from training. Duration: 0.7666875 2023-10-04 10:16:22,364 INFO [scaling.py:1118] (2/4) WithLoss: name=encoder.encoders.3.encoder.layers.2.self_attn_weights, loss-sum=0.000e+00 2023-10-04 10:16:24,927 WARNING [train.py:1204] (2/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_577-332600-0_sp0.9 from training. Duration: 0.6333125 2023-10-04 10:16:26,300 WARNING [train.py:1204] (2/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_342-100157-0_sp0.9 from training. Duration: 0.6 2023-10-04 10:16:28,942 WARNING [train.py:1204] (2/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_141-89727-0_sp1.1 from training. Duration: 0.6454375 2023-10-04 10:16:29,047 WARNING [train.py:1204] (2/4) Exclude cut with ID _3992_210_1747_1_1526644816031_3731060_107-251434-0_sp1.1 from training. Duration: 0.9 2023-10-04 10:16:31,810 WARNING [train.py:1204] (2/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_401-53423-0_sp1.1 from training. Duration: 0.5 2023-10-04 10:16:35,119 INFO [train.py:1046] (2/4) Epoch 46, batch 3800, loss[loss=0.168, simple_loss=0.2436, pruned_loss=0.04624, over 23793.00 frames. ], tot_loss[loss=0.1538, simple_loss=0.2343, pruned_loss=0.03663, over 4695686.65 frames. ], batch size: 164, lr: 2.20e-03, grad_scale: 8.0 2023-10-04 10:16:39,644 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.696e+02 2.060e+02 2.402e+02 2.784e+02 4.041e+02, threshold=4.803e+02, percent-clipped=0.0 2023-10-04 10:16:40,356 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7762_210_1794_1_1528199508_4057408_422-227034-0_sp1.1 from training. Duration: 0.663625 2023-10-04 10:16:41,827 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.bypass_mid.scale_min, batch_count=1618973.3333333333, ans=0.2 2023-10-04 10:16:43,012 WARNING [train.py:1204] (2/4) Exclude cut with ID _4184_210_2550_1_1526730192013_2219380_102-250299-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 10:16:44,354 WARNING [train.py:1204] (2/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_253-308377-0_sp0.9 from training. Duration: 0.5444375 2023-10-04 10:16:44,430 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_271-297505-0 from training. Duration: 0.99 2023-10-04 10:16:45,793 WARNING [train.py:1204] (2/4) Exclude cut with ID _35941_210_15379_1_1532995294515_4023089_213-142973-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 10:16:47,216 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7544_210_6753_1_1528587722_3576686_162-63018-0_sp1.1 from training. Duration: 0.92725 2023-10-04 10:16:48,524 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_365-217119-0_sp0.9 from training. Duration: 0.7 2023-10-04 10:16:50,447 WARNING [train.py:1204] (2/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_662-275621-0_sp0.9 from training. Duration: 0.4666875 2023-10-04 10:16:50,448 WARNING [train.py:1204] (2/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_374-187555-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 10:16:51,926 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_289-180904-0_sp1.1 from training. Duration: 0.6909375 2023-10-04 10:16:52,053 WARNING [train.py:1204] (2/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_189-47206-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 10:16:53,304 WARNING [train.py:1204] (2/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_234-14174-0_sp1.1 from training. Duration: 0.7454375 2023-10-04 10:16:53,346 WARNING [train.py:1204] (2/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_576-76621-0_sp1.1 from training. Duration: 0.963625 2023-10-04 10:16:54,717 WARNING [train.py:1204] (2/4) Exclude cut with ID _13253_210_1925_1_1530757775819_3577873_253-246386-0 from training. Duration: 0.9 2023-10-04 10:16:57,617 WARNING [train.py:1204] (2/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_372-6869-0_sp1.1 from training. Duration: 0.4 2023-10-04 10:16:58,876 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_21434_210_4238_1_1531916339_1222252_22-54954-0_sp1.1 from training. Duration: 0.936375 2023-10-04 10:17:01,671 WARNING [train.py:1204] (2/4) Exclude cut with ID _29853_210_7497_1_1532505600851_6642110_492-170361-0_sp1.1 from training. Duration: 0.92725 2023-10-04 10:17:01,797 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=1619040.0, ans=0.1 2023-10-04 10:17:04,380 WARNING [train.py:1204] (2/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_505-97987-0_sp0.9 from training. Duration: 0.9555625 2023-10-04 10:17:06,273 WARNING [train.py:1204] (2/4) Exclude cut with ID _8365_210_2064_1_1528592311070_4677690_371-122295-0_sp0.9 from training. Duration: 0.611125 2023-10-04 10:17:06,393 WARNING [train.py:1204] (2/4) Exclude cut with ID _14268_210_9206_1_1530622771285_4714099_637-86630-0_sp1.1 from training. Duration: 0.463625 2023-10-04 10:17:06,894 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.4.encoder.layers.0.self_attn_weights.whiten_keys, num_groups=4, num_channels=128, metric=1.65 vs. limit=6.0 2023-10-04 10:17:07,686 WARNING [train.py:1204] (2/4) Exclude cut with ID _29857_210_20034_1_1532395886753_3798160_372-5678-0_sp1.1 from training. Duration: 0.963625 2023-10-04 10:17:09,917 WARNING [train.py:1204] (2/4) Exclude cut with ID _52892_210_6721_1_1534378941259_4124129_90-165108-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 10:17:11,653 WARNING [train.py:1204] (2/4) Exclude cut with ID _27052_210_5986_1_1532329197187_3585009_143-339198-0_sp1.1 from training. Duration: 0.963625 2023-10-04 10:17:15,904 WARNING [train.py:1204] (2/4) Exclude cut with ID _14268_210_9206_1_1530622771285_4714099_637-86630-0_sp0.9 from training. Duration: 0.5666875 2023-10-04 10:17:15,906 WARNING [train.py:1204] (2/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_401-255491-0 from training. Duration: 0.7 2023-10-04 10:17:17,341 WARNING [train.py:1204] (2/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_178-6946-0_sp1.1 from training. Duration: 0.97275 2023-10-04 10:17:24,933 WARNING [train.py:1204] (2/4) Exclude cut with ID _27485_210_4916_1_1532739191677_3668269_460-163885-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 10:17:29,207 WARNING [train.py:1204] (2/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_270-26053-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 10:17:30,650 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.bypass_mid.scale_min, batch_count=1619173.3333333333, ans=0.2 2023-10-04 10:17:33,318 WARNING [train.py:1204] (2/4) Exclude cut with ID _14170_210_5219_1_1530525949868_4469650_412-317077-0 from training. Duration: 0.75 2023-10-04 10:17:34,826 WARNING [train.py:1204] (2/4) Exclude cut with ID _5289_210_2084_1_1527678202736_3779299_142-103317-0 from training. Duration: 0.84 2023-10-04 10:17:36,100 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_150-143438-0_sp1.1 from training. Duration: 0.92725 2023-10-04 10:17:37,522 WARNING [train.py:1204] (2/4) Exclude cut with ID _27894_210_12392_1_1532415496078_6902439_668-269779-0_sp1.1 from training. Duration: 0.97275 2023-10-04 10:17:38,872 WARNING [train.py:1204] (2/4) Exclude cut with ID _18513_210_9206_1_1531618215223_3862620_56-222075-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 10:17:38,989 WARNING [train.py:1204] (2/4) Exclude cut with ID _4092_210_2151_1_1526643044449_3135759_356-278694-0 from training. Duration: 0.47 2023-10-04 10:17:43,818 WARNING [train.py:1204] (2/4) Exclude cut with ID _25808_210_13292_1_1532343514562_7366109_633-47524-0 from training. Duration: 0.99 2023-10-04 10:17:43,839 WARNING [train.py:1204] (2/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_872-264525-0 from training. Duration: 0.65 2023-10-04 10:17:44,485 WARNING [train.py:1204] (2/4) Exclude cut with ID _14632_210_9988_1_1530691279392_3923200_239-232545-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 10:17:45,716 WARNING [train.py:1204] (2/4) Exclude cut with ID _14349_210_8341_1_1530615710818_3442470_153-27432-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 10:17:49,784 INFO [train.py:1046] (2/4) Epoch 46, batch 3850, loss[loss=0.1389, simple_loss=0.2148, pruned_loss=0.03155, over 23659.00 frames. ], tot_loss[loss=0.1529, simple_loss=0.2332, pruned_loss=0.03634, over 4707385.05 frames. ], batch size: 149, lr: 2.20e-03, grad_scale: 8.0 2023-10-04 10:17:51,704 WARNING [train.py:1204] (2/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_37-41919-0_sp0.9 from training. Duration: 0.9666875 2023-10-04 10:17:51,782 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_196-289637-0_sp0.9 from training. Duration: 0.8333125 2023-10-04 10:17:52,210 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.5.encoder.layers.1.conv_module2.whiten, num_groups=1, num_channels=256, metric=5.50 vs. limit=15.0 2023-10-04 10:17:57,264 WARNING [train.py:1204] (2/4) Exclude cut with ID _13678_210_7497_1_1530530069226_3463170_371-3221-0_sp1.1 from training. Duration: 0.8181875 2023-10-04 10:17:58,572 WARNING [train.py:1204] (2/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_557-324258-0 from training. Duration: 0.81 2023-10-04 10:17:58,644 WARNING [train.py:1204] (2/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_1012-279473-0_sp1.1 from training. Duration: 0.6090625 2023-10-04 10:17:59,971 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_66916_210_14656_1_1535725525_326566_7-92649-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 10:18:02,846 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_9460_210_4211_1_1528954587_10049667_480-260269-0_sp1.1 from training. Duration: 0.5090625 2023-10-04 10:18:05,511 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_40896_210_15710_1_1533433956_7100802_121-155001-0_sp1.1 from training. Duration: 0.92725 2023-10-04 10:18:06,938 WARNING [train.py:1204] (2/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_188-171673-0_sp1.1 from training. Duration: 0.52725 2023-10-04 10:18:08,415 WARNING [train.py:1204] (2/4) Exclude cut with ID _47790_210_16187_1_1533862814066_4207519_2-39653-0 from training. Duration: 0.98 2023-10-04 10:18:13,746 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_23850_210_4238_1_1532232597_1632433_112-229012-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 10:18:16,898 WARNING [train.py:1204] (2/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_626-79528-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 10:18:18,347 WARNING [train.py:1204] (2/4) Exclude cut with ID _30304_210_6319_1_1532476827261_7582060_856-90052-0_sp1.1 from training. Duration: 0.963625 2023-10-04 10:18:18,394 WARNING [train.py:1204] (2/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_122-109023-0_sp0.9 from training. Duration: 0.8444375 2023-10-04 10:18:18,531 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.balancer1.prob, batch_count=1619440.0, ans=0.125 2023-10-04 10:18:23,089 WARNING [train.py:1204] (2/4) Exclude cut with ID _64414_210_17301_1_1535243252048_3757759_37-70328-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 10:18:23,166 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_20120_210_6753_1_1531643035_5482859_622-344907-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 10:18:24,522 WARNING [train.py:1204] (2/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_410-321956-0_sp1.1 from training. Duration: 0.92725 2023-10-04 10:18:24,540 WARNING [train.py:1204] (2/4) Exclude cut with ID _7562_210_2084_1_1528887767916_3946229_186-326422-0_sp0.9 from training. Duration: 0.9333125 2023-10-04 10:18:25,948 WARNING [train.py:1204] (2/4) Exclude cut with ID _37537_210_2253_1_1533171758568_3421949_127-285192-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 10:18:27,398 WARNING [train.py:1204] (2/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_723-228168-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 10:18:28,654 WARNING [train.py:1204] (2/4) Exclude cut with ID _13678_210_7497_1_1530530069226_3463170_426-246984-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 10:18:28,676 WARNING [train.py:1204] (2/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_399-156791-0_sp0.9 from training. Duration: 0.92225 2023-10-04 10:18:28,731 WARNING [train.py:1204] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_391-22148-0 from training. Duration: 0.77 2023-10-04 10:18:28,760 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_3475_210_2028_1_1526193578_3622569_64-297072-0 from training. Duration: 0.91 2023-10-04 10:18:30,120 WARNING [train.py:1204] (2/4) Exclude cut with ID _17875_210_2087_1_1531224012907_1821339_225-280015-0_sp1.1 from training. Duration: 0.963625 2023-10-04 10:18:30,144 WARNING [train.py:1204] (2/4) Exclude cut with ID _50655_210_18628_1_1534229942193_3772154_325-246169-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 10:18:31,636 WARNING [train.py:1204] (2/4) Exclude cut with ID _29529_210_7234_1_1532393961455_3977460_597-257348-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 10:18:32,960 WARNING [train.py:1204] (2/4) Exclude cut with ID _58895_210_8750_1_1535184081320_3649089_77-173485-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 10:18:32,981 WARNING [train.py:1204] (2/4) Exclude cut with ID _6270_210_2084_1_1527850911573_3856420_26-277343-0 from training. Duration: 0.82 2023-10-04 10:18:35,725 WARNING [train.py:1204] (2/4) Exclude cut with ID _14368_210_4220_1_1530532714870_3595529_144-3287-0 from training. Duration: 0.67 2023-10-04 10:18:37,107 WARNING [train.py:1204] (2/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_293-145278-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 10:18:38,569 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_40047_210_2290_1_1533290519_2374311_257-335162-0 from training. Duration: 0.95 2023-10-04 10:18:41,951 WARNING [train.py:1204] (2/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_619-344499-0_sp0.9 from training. Duration: 0.7 2023-10-04 10:18:46,704 WARNING [train.py:1204] (2/4) Exclude cut with ID _19504_210_9994_1_1531648863242_3325220_456-315379-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 10:18:48,012 WARNING [train.py:1204] (2/4) Exclude cut with ID _55857_210_2346_1_1534644007831_6898209_631-221692-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 10:18:50,830 WARNING [train.py:1204] (2/4) Exclude cut with ID _64631_210_27767_1_1535349580068_4165160_324-3682-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 10:18:52,790 WARNING [train.py:1204] (2/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_317-211107-0 from training. Duration: 0.92 2023-10-04 10:18:55,629 WARNING [train.py:1204] (2/4) Exclude cut with ID _6789_210_1534_1_1527857937218_3118950_311-61262-0 from training. Duration: 0.65 2023-10-04 10:18:57,077 WARNING [train.py:1204] (2/4) Exclude cut with ID _28323_210_16187_1_1532480395667_4100300_365-68882-0_sp1.1 from training. Duration: 0.92725 2023-10-04 10:18:58,334 WARNING [train.py:1204] (2/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_816-181825-0_sp1.1 from training. Duration: 0.92725 2023-10-04 10:18:59,791 WARNING [train.py:1204] (2/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_132-269633-0_sp1.1 from training. Duration: 0.6181875 2023-10-04 10:18:59,801 WARNING [train.py:1204] (2/4) Exclude cut with ID _44585_210_18628_1_1533605350387_4518199_280-73534-0_sp1.1 from training. Duration: 0.8090625 2023-10-04 10:18:59,853 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_68095_210_20185_1_1535718226_8888855_1009-164755-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 10:19:01,278 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_40223_210_6228_1_1533298404_4812267_400-103813-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 10:19:01,281 WARNING [train.py:1204] (2/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_491-261923-0_sp1.1 from training. Duration: 0.8909375 2023-10-04 10:19:01,286 WARNING [train.py:1204] (2/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_712-342563-0 from training. Duration: 0.97 2023-10-04 10:19:02,590 WARNING [train.py:1204] (2/4) Exclude cut with ID _41161_210_20034_1_1533431249657_3555314_175-190032-0_sp1.1 from training. Duration: 0.963625 2023-10-04 10:19:03,858 INFO [train.py:1046] (2/4) Epoch 46, batch 3900, loss[loss=0.1535, simple_loss=0.2427, pruned_loss=0.03215, over 24651.00 frames. ], tot_loss[loss=0.1524, simple_loss=0.2326, pruned_loss=0.0361, over 4706426.03 frames. ], batch size: 68, lr: 2.20e-03, grad_scale: 8.0 2023-10-04 10:19:03,963 WARNING [train.py:1204] (2/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_788-298907-0 from training. Duration: 0.89 2023-10-04 10:19:03,986 WARNING [train.py:1204] (2/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_225-70961-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 10:19:03,996 WARNING [train.py:1204] (2/4) Exclude cut with ID _58583_210_5236_1_1534726835266_3574249_235-175994-0_sp1.1 from training. Duration: 0.92725 2023-10-04 10:19:05,418 WARNING [train.py:1204] (2/4) Exclude cut with ID _12302_210_5219_1_1529924446466_8107280_907-182949-0_sp1.1 from training. Duration: 0.836375 2023-10-04 10:19:05,448 WARNING [train.py:1204] (2/4) Exclude cut with ID _54871_210_22356_1_1534665798253_3856359_94-193692-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 10:19:07,972 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.616e+02 1.999e+02 2.284e+02 2.578e+02 4.359e+02, threshold=4.569e+02, percent-clipped=0.0 2023-10-04 10:19:08,062 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_4528_210_3273_1_1527076654_3276777_342-168068-0_sp0.9 from training. Duration: 0.9666875 2023-10-04 10:19:08,277 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.ff3_skip_rate, batch_count=1619640.0, ans=0.0 2023-10-04 10:19:09,428 WARNING [train.py:1204] (2/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_368-74239-0_sp1.1 from training. Duration: 0.92725 2023-10-04 10:19:09,430 WARNING [train.py:1204] (2/4) Exclude cut with ID _44831_210_13427_1_1533816011549_3689035_291-295928-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 10:19:09,510 WARNING [train.py:1204] (2/4) Exclude cut with ID _56406_210_9288_1_1534899618664_3496149_455-131997-0_sp1.1 from training. Duration: 0.97275 2023-10-04 10:19:09,517 WARNING [train.py:1204] (2/4) Exclude cut with ID _6473_210_1741_1_1527901307396_3815380_108-17335-0 from training. Duration: 0.65 2023-10-04 10:19:11,383 WARNING [train.py:1204] (2/4) Exclude cut with ID _14224_210_4381_1_1530529062037_4264010_286-135600-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 10:19:14,438 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_14056_210_4163_1_1530604483_7572827_928-321675-0_sp1.1 from training. Duration: 0.936375 2023-10-04 10:19:16,365 WARNING [train.py:1204] (2/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_81-300212-0_sp1.1 from training. Duration: 0.5454375 2023-10-04 10:19:17,658 WARNING [train.py:1204] (2/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_29-280622-0_sp0.9 from training. Duration: 0.911125 2023-10-04 10:19:17,755 WARNING [train.py:1204] (2/4) Exclude cut with ID _61079_210_4525_1_1534921008198_4011419_228-176447-0_sp1.1 from training. Duration: 0.936375 2023-10-04 10:19:20,480 WARNING [train.py:1204] (2/4) Exclude cut with ID _8394_210_6702_1_1528590504316_3540189_372-198878-0_sp1.1 from training. Duration: 0.5454375 2023-10-04 10:19:20,498 WARNING [train.py:1204] (2/4) Exclude cut with ID _64521_210_14699_1_1535243427259_3815328_244-208725-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 10:19:21,926 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8275_210_4198_1_1528455320_3892571_491-17707-0_sp0.9 from training. Duration: 0.711125 2023-10-04 10:19:23,385 WARNING [train.py:1204] (2/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_375-245677-0 from training. Duration: 0.81 2023-10-04 10:19:23,399 WARNING [train.py:1204] (2/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_690-286055-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 10:19:25,139 WARNING [train.py:1204] (2/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_89-321955-0 from training. Duration: 0.8 2023-10-04 10:19:25,166 WARNING [train.py:1204] (2/4) Exclude cut with ID _18895_210_6228_1_1531357274234_3784930_213-98721-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 10:19:26,528 WARNING [train.py:1204] (2/4) Exclude cut with ID _19708_210_9206_1_1532136650733_4238364_347-65240-0 from training. Duration: 0.97 2023-10-04 10:19:29,184 WARNING [train.py:1204] (2/4) Exclude cut with ID _64135_210_2253_1_1535677267876_7608972_498-5039-0 from training. Duration: 0.78 2023-10-04 10:19:32,161 WARNING [train.py:1204] (2/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_146-276944-0_sp1.1 from training. Duration: 0.7545625 2023-10-04 10:19:33,524 WARNING [train.py:1204] (2/4) Exclude cut with ID _63171_210_15488_1_1535104792794_3295110_155-34488-0_sp1.1 from training. Duration: 0.97275 2023-10-04 10:19:33,537 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_5438_210_1794_1_1527336687_2600009_233-58072-0_sp0.9 from training. Duration: 0.8555625 2023-10-04 10:19:34,925 WARNING [train.py:1204] (2/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_63-101996-0_sp0.9 from training. Duration: 0.97775 2023-10-04 10:19:38,881 WARNING [train.py:1204] (2/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_271-269084-0_sp1.1 from training. Duration: 0.7545625 2023-10-04 10:19:39,142 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.conv_module1.balancer2.prob, batch_count=1619773.3333333333, ans=0.125 2023-10-04 10:19:41,928 WARNING [train.py:1204] (2/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_345-167986-0_sp1.1 from training. Duration: 0.763625 2023-10-04 10:19:43,940 WARNING [train.py:1204] (2/4) Exclude cut with ID _39290_210_4220_1_1533862778760_3075118_92-311600-0_sp1.1 from training. Duration: 0.8 2023-10-04 10:19:43,949 WARNING [train.py:1204] (2/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_209-65306-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 10:19:44,041 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_26433_210_12108_1_1532157158_4056480_317-335807-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 10:19:48,236 WARNING [train.py:1204] (2/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_238-94127-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 10:19:49,545 WARNING [train.py:1204] (2/4) Exclude cut with ID _41314_210_12651_1_1533459476543_3766460_207-132038-0_sp1.1 from training. Duration: 0.8545625 2023-10-04 10:19:55,558 WARNING [train.py:1204] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_167-137255-0_sp1.1 from training. Duration: 0.5818125 2023-10-04 10:19:56,973 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_2009_210_2071_1_1525481134_4588937_385-302100-0_sp0.9 from training. Duration: 0.9666875 2023-10-04 10:20:03,911 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_15517_210_6753_1_1530954008_3487336_253-110617-0_sp1.1 from training. Duration: 0.936375 2023-10-04 10:20:06,745 WARNING [train.py:1204] (2/4) Exclude cut with ID _39290_210_4220_1_1533862778760_3075118_92-311600-0_sp0.9 from training. Duration: 0.97775 2023-10-04 10:20:06,779 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_42-142473-0 from training. Duration: 0.72 2023-10-04 10:20:06,826 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_2296_210_2087_1_1525503448_3397335_57-185430-0 from training. Duration: 0.6 2023-10-04 10:20:06,848 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_11305_210_1794_1_1529750428_7178148_257-312870-0_sp0.9 from training. Duration: 0.97775 2023-10-04 10:20:08,217 WARNING [train.py:1204] (2/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_222-198127-0 from training. Duration: 0.84 2023-10-04 10:20:10,098 WARNING [train.py:1204] (2/4) Exclude cut with ID _65263_210_22356_1_1535763598691_3921037_423-202699-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 10:20:10,160 WARNING [train.py:1204] (2/4) Exclude cut with ID _15037_210_2253_1_1530752294104_3267650_13-130361-0 from training. Duration: 0.99 2023-10-04 10:20:11,001 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.conv_module1.balancer2.prob, batch_count=1619906.6666666667, ans=0.125 2023-10-04 10:20:17,366 INFO [train.py:1046] (2/4) Epoch 46, batch 3950, loss[loss=0.1663, simple_loss=0.2406, pruned_loss=0.04604, over 23810.00 frames. ], tot_loss[loss=0.1518, simple_loss=0.2316, pruned_loss=0.03599, over 4694772.07 frames. ], batch size: 179, lr: 2.20e-03, grad_scale: 8.0 2023-10-04 10:20:18,915 WARNING [train.py:1204] (2/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_628-198080-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 10:20:19,012 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_3171_210_2087_1_1526109084_2330007_37-64334-0 from training. Duration: 0.82 2023-10-04 10:20:20,317 WARNING [train.py:1204] (2/4) Exclude cut with ID _5287_210_2084_1_1527418883040_3878670_12-221186-0_sp1.1 from training. Duration: 0.8909375 2023-10-04 10:20:22,386 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.2.encoder.layers.1.self_attn1.whiten, num_groups=1, num_channels=384, metric=14.96 vs. limit=22.5 2023-10-04 10:20:23,273 WARNING [train.py:1204] (2/4) Exclude cut with ID _6653_210_2629_1_1527919172441_6732679_1103-56445-0_sp1.1 from training. Duration: 0.663625 2023-10-04 10:20:23,382 WARNING [train.py:1204] (2/4) Exclude cut with ID _65263_210_22356_1_1535763598691_3921037_169-140068-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 10:20:30,398 WARNING [train.py:1204] (2/4) Exclude cut with ID IC0671W0118-331961 from training. Duration: 0.896 2023-10-04 10:20:30,461 WARNING [train.py:1204] (2/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_402-102311-0_sp1.1 from training. Duration: 0.6090625 2023-10-04 10:20:32,005 WARNING [train.py:1204] (2/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_202-293523-0 from training. Duration: 0.72 2023-10-04 10:20:32,067 WARNING [train.py:1204] (2/4) Exclude cut with ID IC0679W0038-335846 from training. Duration: 0.768 2023-10-04 10:20:32,092 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_37357_210_17024_1_1533089504_2316121_79-198866-0_sp1.1 from training. Duration: 0.936375 2023-10-04 10:20:35,001 WARNING [train.py:1204] (2/4) Exclude cut with ID _7606_210_4915_1_1528894253277_5080690_292-271285-0_sp1.1 from training. Duration: 0.963625 2023-10-04 10:20:35,012 WARNING [train.py:1204] (2/4) Exclude cut with ID _3541_210_2477_1_1526475408709_3263973_377-296839-0_sp0.9 from training. Duration: 0.611125 2023-10-04 10:20:35,015 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_45886_210_17024_1_1533704917_2435333_58-131069-0_sp1.1 from training. Duration: 0.963625 2023-10-04 10:20:36,525 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6961_210_2087_1_1527937168_6815011_395-185060-0 from training. Duration: 0.95 2023-10-04 10:20:37,264 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.1.encoder.layers.1.self_attn1.whiten, num_groups=1, num_channels=256, metric=12.23 vs. limit=22.5 2023-10-04 10:20:39,737 WARNING [train.py:1204] (2/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_304-266479-0_sp1.1 from training. Duration: 0.97275 2023-10-04 10:20:39,790 WARNING [train.py:1204] (2/4) Exclude cut with ID _3867_210_1883_1_1526641200575_4475830_319-349850-0_sp1.1 from training. Duration: 0.6090625 2023-10-04 10:20:39,805 WARNING [train.py:1204] (2/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_304-59227-0_sp0.9 from training. Duration: 0.8555625 2023-10-04 10:20:41,815 WARNING [train.py:1204] (2/4) Exclude cut with ID _8021_210_7994_1_1528376451358_3653069_100-140231-0_sp0.9 from training. Duration: 0.7444375 2023-10-04 10:20:41,875 WARNING [train.py:1204] (2/4) Exclude cut with ID _8833_210_5219_1_1528592665210_7553059_329-125156-0_sp0.9 from training. Duration: 0.911125 2023-10-04 10:20:44,809 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.feed_forward2.hidden_balancer.prob, batch_count=1620040.0, ans=0.125 2023-10-04 10:20:52,859 WARNING [train.py:1204] (2/4) Exclude cut with ID _14459_210_2228_1_1531443641946_7516700_792-287234-0_sp1.1 from training. Duration: 0.92725 2023-10-04 10:20:52,874 WARNING [train.py:1204] (2/4) Exclude cut with ID _19708_210_9206_1_1532136650733_4238364_347-65240-0_sp1.1 from training. Duration: 0.8818125 2023-10-04 10:20:55,886 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.feed_forward2.hidden_balancer.prob, batch_count=1620106.6666666667, ans=0.125 2023-10-04 10:20:59,066 WARNING [train.py:1204] (2/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_70-27732-0 from training. Duration: 0.94 2023-10-04 10:21:04,619 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_397-132565-0 from training. Duration: 0.99 2023-10-04 10:21:04,622 WARNING [train.py:1204] (2/4) Exclude cut with ID _49728_210_12651_1_1534041034294_6788150_624-195301-0 from training. Duration: 0.85 2023-10-04 10:21:05,896 WARNING [train.py:1204] (2/4) Exclude cut with ID _55429_210_10626_1_1534467588527_7174143_377-169391-0_sp1.1 from training. Duration: 0.87275 2023-10-04 10:21:07,271 WARNING [train.py:1204] (2/4) Exclude cut with ID _49376_210_5986_1_1533985278732_3370740_354-344695-0_sp1.1 from training. Duration: 0.9 2023-10-04 10:21:13,992 WARNING [train.py:1204] (2/4) Exclude cut with ID _4697_210_2069_1_1526815801953_3823098_276-96593-0_sp1.1 from training. Duration: 0.7 2023-10-04 10:21:14,000 WARNING [train.py:1204] (2/4) Exclude cut with ID _15012_210_1968_1_1530856820429_5095350_688-178849-0_sp0.9 from training. Duration: 0.711125 2023-10-04 10:21:14,037 WARNING [train.py:1204] (2/4) Exclude cut with ID _25565_210_12392_1_1532423009382_3952098_372-69023-0_sp1.1 from training. Duration: 0.936375 2023-10-04 10:21:15,180 WARNING [train.py:1204] (2/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_61-327062-0_sp1.1 from training. Duration: 0.736375 2023-10-04 10:21:15,210 WARNING [train.py:1204] (2/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_215-141059-0 from training. Duration: 0.62 2023-10-04 10:21:19,958 WARNING [train.py:1204] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_6-226750-0_sp1.1 from training. Duration: 0.8545625 2023-10-04 10:21:21,343 WARNING [train.py:1204] (2/4) Exclude cut with ID _14348_210_8341_1_1530599434996_3778640_240-232370-0_sp1.1 from training. Duration: 0.763625 2023-10-04 10:21:26,651 WARNING [train.py:1204] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_468-189058-0 from training. Duration: 0.96 2023-10-04 10:21:30,841 INFO [train.py:1046] (2/4) Epoch 46, batch 4000, loss[loss=0.1517, simple_loss=0.2339, pruned_loss=0.03481, over 24325.00 frames. ], tot_loss[loss=0.1523, simple_loss=0.2324, pruned_loss=0.03616, over 4692282.30 frames. ], batch size: 61, lr: 2.20e-03, grad_scale: 16.0 2023-10-04 10:21:35,612 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.692e+02 2.007e+02 2.186e+02 2.659e+02 3.847e+02, threshold=4.373e+02, percent-clipped=0.0 2023-10-04 10:21:35,693 WARNING [train.py:1204] (2/4) Exclude cut with ID _21987_210_5854_1_1531821500482_3637038_247-93340-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 10:21:38,015 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.1.encoder.layers.1.feed_forward3.out_whiten, num_groups=1, num_channels=256, metric=8.89 vs. limit=15.0 2023-10-04 10:21:40,642 WARNING [train.py:1204] (2/4) Exclude cut with ID _66868_210_9527_1_1535509287954_4561140_436-191602-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 10:21:47,013 WARNING [train.py:1204] (2/4) Exclude cut with ID _18895_210_6228_1_1531357274234_3784930_394-58203-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 10:21:47,071 WARNING [train.py:1204] (2/4) Exclude cut with ID _58605_210_12210_1_1534723195939_3904259_258-281498-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 10:21:47,099 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_33348_210_5632_1_1533086912_3324030_245-292752-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 10:21:48,354 WARNING [train.py:1204] (2/4) Exclude cut with ID _67831_210_12392_1_1535612464202_3092612_346-2148-0 from training. Duration: 0.93 2023-10-04 10:21:49,689 WARNING [train.py:1204] (2/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_369-222060-0_sp0.9 from training. Duration: 0.788875 2023-10-04 10:21:49,753 WARNING [train.py:1204] (2/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_245-174553-0 from training. Duration: 0.88 2023-10-04 10:21:49,766 WARNING [train.py:1204] (2/4) Exclude cut with ID _3541_210_2477_1_1526475408709_3263973_231-104736-0_sp1.1 from training. Duration: 0.7090625 2023-10-04 10:21:49,772 WARNING [train.py:1204] (2/4) Exclude cut with ID _44585_210_18628_1_1533605350387_4518199_280-73534-0 from training. Duration: 0.89 2023-10-04 10:21:52,468 WARNING [train.py:1204] (2/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_22-201476-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 10:21:56,545 WARNING [train.py:1204] (2/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_325-62802-0_sp0.9 from training. Duration: 0.9666875 2023-10-04 10:21:56,566 WARNING [train.py:1204] (2/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_199-274062-0_sp1.1 from training. Duration: 0.97275 2023-10-04 10:21:56,569 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_8-319957-0_sp1.1 from training. Duration: 0.87275 2023-10-04 10:21:57,804 WARNING [train.py:1204] (2/4) Exclude cut with ID _29343_210_4381_1_1532602757752_3674109_224-349726-0_sp1.1 from training. Duration: 0.936375 2023-10-04 10:21:57,808 WARNING [train.py:1204] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_520-126729-0_sp1.1 from training. Duration: 0.636375 2023-10-04 10:21:59,252 WARNING [train.py:1204] (2/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_239-22076-0_sp1.1 from training. Duration: 0.77275 2023-10-04 10:22:00,637 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9012W0011-501146 from training. Duration: 0.896 2023-10-04 10:22:00,709 WARNING [train.py:1204] (2/4) Exclude cut with ID _29857_210_20034_1_1532395886753_3798160_279-185801-0_sp1.1 from training. Duration: 0.82725 2023-10-04 10:22:01,987 WARNING [train.py:1204] (2/4) Exclude cut with ID _43552_210_8913_1_1533726031059_3523429_261-223933-0_sp1.1 from training. Duration: 0.92725 2023-10-04 10:22:05,349 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9029W0025-509426 from training. Duration: 0.896 2023-10-04 10:22:05,424 WARNING [train.py:1204] (2/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_337-181502-0_sp1.1 from training. Duration: 0.5909375 2023-10-04 10:22:05,447 WARNING [train.py:1204] (2/4) Exclude cut with ID _68635_210_9317_1_1535770813410_4315870_159-332599-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 10:22:12,744 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_161-276996-0 from training. Duration: 0.95 2023-10-04 10:22:14,075 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7088_210_1874_1_1527939776_1101113_127-68870-0_sp1.1 from training. Duration: 0.936375 2023-10-04 10:22:16,893 WARNING [train.py:1204] (2/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_322-217220-0_sp1.1 from training. Duration: 0.7818125 2023-10-04 10:22:16,949 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9072W0002-530636 from training. Duration: 0.768 2023-10-04 10:22:17,041 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_340-336408-0_sp1.1 from training. Duration: 0.8454375 2023-10-04 10:22:18,377 WARNING [train.py:1204] (2/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_395-37222-0 from training. Duration: 0.76 2023-10-04 10:22:18,381 WARNING [train.py:1204] (2/4) Exclude cut with ID _64099_210_17301_1_1535526977656_1578188_159-209156-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 10:22:20,265 WARNING [train.py:1204] (2/4) Exclude cut with ID _53950_210_5816_1_1534417264919_3862951_28-29681-0_sp1.1 from training. Duration: 0.92725 2023-10-04 10:22:22,061 WARNING [train.py:1204] (2/4) Exclude cut with ID _4681_210_3525_1_1527250689464_120889_15-126760-0_sp1.1 from training. Duration: 0.736375 2023-10-04 10:22:23,483 WARNING [train.py:1204] (2/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_242-3472-0_sp1.1 from training. Duration: 0.663625 2023-10-04 10:22:23,515 WARNING [train.py:1204] (2/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_436-186311-0_sp0.9 from training. Duration: 0.72225 2023-10-04 10:22:23,540 WARNING [train.py:1204] (2/4) Exclude cut with ID _3675_210_4557_1_1526639934408_4182089_452-172147-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 10:22:24,781 WARNING [train.py:1204] (2/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_80-147301-0 from training. Duration: 0.84 2023-10-04 10:22:26,069 WARNING [train.py:1204] (2/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_351-107588-0_sp1.1 from training. Duration: 0.92725 2023-10-04 10:22:27,445 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9111W0002-550781 from training. Duration: 0.896 2023-10-04 10:22:31,557 WARNING [train.py:1204] (2/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_465-156500-0_sp1.1 from training. Duration: 0.6545625 2023-10-04 10:22:36,351 WARNING [train.py:1204] (2/4) Exclude cut with ID _13938_210_9988_1_1530518718978_3693960_304-112784-0_sp1.1 from training. Duration: 0.4181875 2023-10-04 10:22:37,887 WARNING [train.py:1204] (2/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_202-67017-0_sp1.1 from training. Duration: 0.7454375 2023-10-04 10:22:39,223 WARNING [train.py:1204] (2/4) Exclude cut with ID _58414_210_8254_1_1535421609162_7151182_448-4358-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 10:22:39,302 WARNING [train.py:1204] (2/4) Exclude cut with ID _66840_210_5219_1_1535452168426_4196349_203-243234-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 10:22:40,760 WARNING [train.py:1204] (2/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_201-213513-0_sp1.1 from training. Duration: 0.97275 2023-10-04 10:22:45,494 INFO [train.py:1046] (2/4) Epoch 46, batch 4050, loss[loss=0.1572, simple_loss=0.2432, pruned_loss=0.03559, over 24631.00 frames. ], tot_loss[loss=0.1527, simple_loss=0.233, pruned_loss=0.03623, over 4710856.06 frames. ], batch size: 68, lr: 2.20e-03, grad_scale: 16.0 2023-10-04 10:22:45,567 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6291_210_3273_1_1527683649_1078498_117-187560-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 10:22:48,199 WARNING [train.py:1204] (2/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_135-174080-0_sp0.9 from training. Duration: 0.588875 2023-10-04 10:22:48,250 WARNING [train.py:1204] (2/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_45-265684-0 from training. Duration: 0.72 2023-10-04 10:22:49,637 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_435-313305-0_sp1.1 from training. Duration: 0.6454375 2023-10-04 10:22:51,398 WARNING [train.py:1204] (2/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_235-307337-0_sp1.1 from training. Duration: 0.8545625 2023-10-04 10:22:53,363 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7762_210_1794_1_1528199508_4057408_419-125009-0_sp0.9 from training. Duration: 0.92225 2023-10-04 10:22:54,742 WARNING [train.py:1204] (2/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_215-141059-0_sp1.1 from training. Duration: 0.563625 2023-10-04 10:22:54,824 WARNING [train.py:1204] (2/4) Exclude cut with ID _27013_210_1389_1_1532437173240_3252649_222-125573-0_sp1.1 from training. Duration: 0.8909375 2023-10-04 10:22:58,910 WARNING [train.py:1204] (2/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_126-274694-0_sp1.1 from training. Duration: 0.8909375 2023-10-04 10:23:01,766 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_2592_210_2290_1_1525697025_4882925_157-70512-0_sp1.1 from training. Duration: 0.82725 2023-10-04 10:23:01,824 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_292-265928-0_sp0.9 from training. Duration: 0.5666875 2023-10-04 10:23:03,252 WARNING [train.py:1204] (2/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_1211-226066-0_sp1.1 from training. Duration: 0.6909375 2023-10-04 10:23:03,479 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=1620706.6666666667, ans=0.1 2023-10-04 10:23:04,503 WARNING [train.py:1204] (2/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_394-265747-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 10:23:04,787 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.feed_forward2.hidden_balancer.prob, batch_count=1620706.6666666667, ans=0.125 2023-10-04 10:23:07,854 WARNING [train.py:1204] (2/4) Exclude cut with ID _48782_210_12658_1_1534467701544_2300422_239-67139-0_sp1.1 from training. Duration: 0.97275 2023-10-04 10:23:08,074 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.conv_module1.balancer1.min_positive, batch_count=1620706.6666666667, ans=0.025 2023-10-04 10:23:10,640 WARNING [train.py:1204] (2/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_329-113173-0_sp1.1 from training. Duration: 0.563625 2023-10-04 10:23:13,388 WARNING [train.py:1204] (2/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_81-75849-0_sp0.9 from training. Duration: 0.4333125 2023-10-04 10:23:13,523 WARNING [train.py:1204] (2/4) Exclude cut with ID _9235_210_5219_1_1528891154253_8046120_1003-101638-0 from training. Duration: 0.73 2023-10-04 10:23:14,801 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9309W0189-640316 from training. Duration: 0.64 2023-10-04 10:23:16,230 WARNING [train.py:1204] (2/4) Exclude cut with ID _45329_210_7005_1_1534157951211_6774339_503-287883-0_sp1.1 from training. Duration: 0.8 2023-10-04 10:23:16,364 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.self_attn_weights.pos_emb_skip_rate, batch_count=1620773.3333333333, ans=0.0 2023-10-04 10:23:16,441 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.bypass.skip_rate, batch_count=1620773.3333333333, ans=0.09899494936611666 2023-10-04 10:23:22,708 WARNING [train.py:1204] (2/4) Exclude cut with ID _8833_210_5219_1_1528592665210_7553059_611-80313-0 from training. Duration: 0.64 2023-10-04 10:23:24,810 WARNING [train.py:1204] (2/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_335-106626-0_sp1.1 from training. Duration: 0.963625 2023-10-04 10:23:27,328 WARNING [train.py:1204] (2/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_237-44332-0_sp1.1 from training. Duration: 0.8545625 2023-10-04 10:23:31,391 WARNING [train.py:1204] (2/4) Exclude cut with ID _46952_210_5986_1_1534230010357_3107379_178-147341-0_sp1.1 from training. Duration: 0.97275 2023-10-04 10:23:32,698 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_13091_210_7925_1_1530609447_8987354_946-154426-0_sp1.1 from training. Duration: 0.936375 2023-10-04 10:23:32,714 WARNING [train.py:1204] (2/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_326-17061-0_sp1.1 from training. Duration: 0.8545625 2023-10-04 10:23:34,224 WARNING [train.py:1204] (2/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_38-169920-0_sp1.1 from training. Duration: 0.82725 2023-10-04 10:23:36,368 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.conv_skip_rate, batch_count=1620840.0, ans=0.0 2023-10-04 10:23:37,690 WARNING [train.py:1204] (2/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_283-220372-0 from training. Duration: 0.56 2023-10-04 10:23:37,698 WARNING [train.py:1204] (2/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_252-216674-0_sp1.1 from training. Duration: 0.4909375 2023-10-04 10:23:39,152 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_202-91290-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 10:23:39,266 WARNING [train.py:1204] (2/4) Exclude cut with ID _41314_210_12651_1_1533459476543_3766460_80-74184-0 from training. Duration: 0.83 2023-10-04 10:23:43,243 WARNING [train.py:1204] (2/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_411-271358-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 10:23:51,074 WARNING [train.py:1204] (2/4) Exclude cut with ID _68777_210_4353_1_1535810390731_3117904_129-166012-0 from training. Duration: 0.85 2023-10-04 10:23:52,484 WARNING [train.py:1204] (2/4) Exclude cut with ID _44982_210_1511_1_1534312820108_3726029_410-109759-0_sp1.1 from training. Duration: 0.963625 2023-10-04 10:23:52,489 WARNING [train.py:1204] (2/4) Exclude cut with ID _14224_210_4381_1_1530529062037_4264010_364-22817-0_sp1.1 from training. Duration: 0.6454375 2023-10-04 10:23:55,736 WARNING [train.py:1204] (2/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_89-242335-0 from training. Duration: 0.95 2023-10-04 10:23:55,752 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_12868_210_6753_1_1530702479_3610564_151-325626-0 from training. Duration: 0.97 2023-10-04 10:23:55,753 WARNING [train.py:1204] (2/4) Exclude cut with ID _6275_210_2084_1_1528110062213_7669049_786-264332-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 10:23:57,189 WARNING [train.py:1204] (2/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_35-153886-0_sp1.1 from training. Duration: 0.763625 2023-10-04 10:23:58,560 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_24007_210_1794_1_1531918925_1663875_116-100916-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 10:23:58,580 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_162-262717-0_sp1.1 from training. Duration: 0.7545625 2023-10-04 10:23:59,808 INFO [train.py:1046] (2/4) Epoch 46, batch 4100, loss[loss=0.167, simple_loss=0.2572, pruned_loss=0.03839, over 24547.00 frames. ], tot_loss[loss=0.1533, simple_loss=0.2343, pruned_loss=0.03622, over 4724997.32 frames. ], batch size: 71, lr: 2.20e-03, grad_scale: 16.0 2023-10-04 10:24:04,012 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.730e+02 2.073e+02 2.274e+02 2.591e+02 3.994e+02, threshold=4.547e+02, percent-clipped=0.0 2023-10-04 10:24:07,612 WARNING [train.py:1204] (2/4) Exclude cut with ID _8263_210_1664_1_1528438953494_294100_40-204731-0 from training. Duration: 0.5 2023-10-04 10:24:09,115 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.bypass.scale_min, batch_count=1620973.3333333333, ans=0.2 2023-10-04 10:24:10,370 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_12868_210_6753_1_1530702479_3610564_416-293746-0 from training. Duration: 0.9 2023-10-04 10:24:11,727 WARNING [train.py:1204] (2/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_605-219732-0 from training. Duration: 0.94 2023-10-04 10:24:13,133 WARNING [train.py:1204] (2/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_454-123002-0 from training. Duration: 0.69 2023-10-04 10:24:13,147 WARNING [train.py:1204] (2/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_25-304273-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 10:24:13,213 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6860_210_4852_1_1527917457_3833327_451-39702-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 10:24:14,550 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_304-16505-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 10:24:14,569 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6291_210_3273_1_1527683649_1078498_4-266348-0_sp1.1 from training. Duration: 0.7090625 2023-10-04 10:24:16,012 WARNING [train.py:1204] (2/4) Exclude cut with ID ID0149W0001-753701 from training. Duration: 0.989 2023-10-04 10:24:18,609 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_41251_210_15368_1_1533429767_4820695_394-55393-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 10:24:18,678 WARNING [train.py:1204] (2/4) Exclude cut with ID _8793_210_6310_1_1528542985139_2310009_121-179782-0_sp1.1 from training. Duration: 0.72725 2023-10-04 10:24:18,691 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_2729_210_3528_1_1525863505_3882214_421-73047-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 10:24:19,993 WARNING [train.py:1204] (2/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_278-154492-0_sp0.9 from training. Duration: 0.8444375 2023-10-04 10:24:23,160 WARNING [train.py:1204] (2/4) Exclude cut with ID _14170_210_5219_1_1530525949868_4469650_412-317077-0_sp0.9 from training. Duration: 0.8333125 2023-10-04 10:24:23,244 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_727-138317-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 10:24:23,394 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.conv_module2.balancer1.prob, batch_count=1621040.0, ans=0.125 2023-10-04 10:24:24,915 WARNING [train.py:1204] (2/4) Exclude cut with ID _25808_210_13292_1_1532343514562_7366109_345-335048-0_sp1.1 from training. Duration: 0.92725 2023-10-04 10:24:24,934 WARNING [train.py:1204] (2/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_506-23200-0 from training. Duration: 0.97 2023-10-04 10:24:25,004 WARNING [train.py:1204] (2/4) Exclude cut with ID _44718_210_5816_1_1534050051869_6469030_643-245187-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 10:24:25,201 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.conv_module1.balancer1.prob, batch_count=1621040.0, ans=0.125 2023-10-04 10:24:26,404 WARNING [train.py:1204] (2/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_42-186162-0_sp1.1 from training. Duration: 0.836375 2023-10-04 10:24:26,423 WARNING [train.py:1204] (2/4) Exclude cut with ID _3231_210_2106_1_1526205864934_6784220_171-256004-0_sp1.1 from training. Duration: 0.7818125 2023-10-04 10:24:26,440 WARNING [train.py:1204] (2/4) Exclude cut with ID _25914_210_6400_1_1532141317058_644740_43-248478-0_sp1.1 from training. Duration: 0.936375 2023-10-04 10:24:26,491 WARNING [train.py:1204] (2/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_577-287150-0 from training. Duration: 0.82 2023-10-04 10:24:31,053 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_46015_210_7234_1_1533863319_3087837_376-276068-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 10:24:32,395 WARNING [train.py:1204] (2/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_51-200220-0 from training. Duration: 0.56 2023-10-04 10:24:33,781 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_452-96101-0_sp1.1 from training. Duration: 0.963625 2023-10-04 10:24:36,466 WARNING [train.py:1204] (2/4) Exclude cut with ID _13252_210_1925_1_1530671372047_3336909_263-168388-0_sp1.1 from training. Duration: 0.7818125 2023-10-04 10:24:36,467 WARNING [train.py:1204] (2/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_469-94673-0 from training. Duration: 0.79 2023-10-04 10:24:37,899 WARNING [train.py:1204] (2/4) Exclude cut with ID _14922_210_6228_1_1530707435273_3693310_303-228738-0_sp1.1 from training. Duration: 0.763625 2023-10-04 10:24:37,940 WARNING [train.py:1204] (2/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_262-250826-0_sp1.1 from training. Duration: 0.9 2023-10-04 10:24:39,703 WARNING [train.py:1204] (2/4) Exclude cut with ID _68777_210_4353_1_1535810390731_3117904_129-166012-0_sp1.1 from training. Duration: 0.77275 2023-10-04 10:24:41,205 WARNING [train.py:1204] (2/4) Exclude cut with ID _58895_210_8750_1_1535184081320_3649089_265-323810-0 from training. Duration: 0.96 2023-10-04 10:24:42,750 WARNING [train.py:1204] (2/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_120-76843-0_sp0.9 from training. Duration: 0.611125 2023-10-04 10:24:42,823 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7544_210_6753_1_1528587722_3576686_295-258479-0_sp0.9 from training. Duration: 0.9333125 2023-10-04 10:24:45,582 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7046_210_6753_1_1527895337_6920917_783-278109-0 from training. Duration: 0.77 2023-10-04 10:24:45,631 WARNING [train.py:1204] (2/4) Exclude cut with ID _42326_210_7770_1_1533814282531_3707269_113-308323-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 10:24:45,664 WARNING [train.py:1204] (2/4) Exclude cut with ID _7852_210_5219_1_1528286471316_8139400_242-105336-0_sp0.9 from training. Duration: 0.8 2023-10-04 10:24:47,144 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.0.ff3_skip_rate, batch_count=1621173.3333333333, ans=0.0 2023-10-04 10:24:48,387 WARNING [train.py:1204] (2/4) Exclude cut with ID _56235_210_10640_1_1534557335235_4161099_114-277905-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 10:24:54,931 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_13118_210_7925_1_1530755612_7608297_42-66683-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 10:24:59,025 WARNING [train.py:1204] (2/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_901-79438-0_sp1.1 from training. Duration: 0.8545625 2023-10-04 10:24:59,074 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_30280_210_1881_1_1532872779_3660139_211-182194-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 10:25:06,532 WARNING [train.py:1204] (2/4) Exclude cut with ID _14898_210_9206_1_1530766812377_4177190_621-45732-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 10:25:06,541 WARNING [train.py:1204] (2/4) Exclude cut with ID _35350_210_16040_1_1533040191183_4075299_224-133775-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 10:25:09,702 WARNING [train.py:1204] (2/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_147-293311-0_sp1.1 from training. Duration: 0.8545625 2023-10-04 10:25:09,841 WARNING [train.py:1204] (2/4) Exclude cut with ID _12302_210_5219_1_1529924446466_8107280_835-279754-0_sp1.1 from training. Duration: 0.8181875 2023-10-04 10:25:13,859 INFO [train.py:1046] (2/4) Epoch 46, batch 4150, loss[loss=0.1522, simple_loss=0.2298, pruned_loss=0.03735, over 23248.00 frames. ], tot_loss[loss=0.1533, simple_loss=0.2342, pruned_loss=0.03618, over 4725576.69 frames. ], batch size: 93, lr: 2.20e-03, grad_scale: 16.0 2023-10-04 10:25:15,244 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8453_210_4211_1_1528502940_8562730_347-330673-0_sp0.9 from training. Duration: 0.8 2023-10-04 10:25:16,553 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_213-103282-0_sp0.9 from training. Duration: 0.8666875 2023-10-04 10:25:17,957 WARNING [train.py:1204] (2/4) Exclude cut with ID _49728_210_12651_1_1534041034294_6788150_66-43496-0_sp1.1 from training. Duration: 0.97275 2023-10-04 10:25:17,959 WARNING [train.py:1204] (2/4) Exclude cut with ID _19023_210_4353_1_1531378892552_5820780_28-36615-0_sp1.1 from training. Duration: 0.8818125 2023-10-04 10:25:21,194 WARNING [train.py:1204] (2/4) Exclude cut with ID _14349_210_8341_1_1530615710818_3442470_115-285975-0 from training. Duration: 0.86 2023-10-04 10:25:21,223 WARNING [train.py:1204] (2/4) Exclude cut with ID _12734_210_8341_1_1530183614296_3891929_236-47464-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 10:25:21,266 WARNING [train.py:1204] (2/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_177-236809-0 from training. Duration: 0.49 2023-10-04 10:25:22,578 WARNING [train.py:1204] (2/4) Exclude cut with ID _13678_210_7497_1_1530530069226_3463170_371-3221-0 from training. Duration: 0.9 2023-10-04 10:25:22,605 WARNING [train.py:1204] (2/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_48-338976-0 from training. Duration: 0.89 2023-10-04 10:25:22,758 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.nonlin_attention.balancer.max_positive, batch_count=1621306.6666666667, ans=0.95 2023-10-04 10:25:24,020 WARNING [train.py:1204] (2/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_1167-309744-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 10:25:27,804 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.4.encoder.layers.1.feed_forward2.out_whiten, num_groups=1, num_channels=384, metric=10.94 vs. limit=15.0 2023-10-04 10:25:28,631 WARNING [train.py:1204] (2/4) Exclude cut with ID _30327_210_16049_1_1532484161295_3924169_401-70819-0_sp0.9 from training. Duration: 0.9555625 2023-10-04 10:25:28,641 WARNING [train.py:1204] (2/4) Exclude cut with ID _55134_210_21982_1_1534590028377_3882170_346-91022-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 10:25:29,037 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.5.encoder.layers.1.self_attn2.whiten, num_groups=1, num_channels=256, metric=10.50 vs. limit=22.5 2023-10-04 10:25:33,313 WARNING [train.py:1204] (2/4) Exclude cut with ID _14459_210_2228_1_1531443641946_7516700_174-118801-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 10:25:34,642 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_269-278006-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 10:25:34,705 WARNING [train.py:1204] (2/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_390-339137-0_sp0.9 from training. Duration: 0.62225 2023-10-04 10:25:36,132 WARNING [train.py:1204] (2/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_253-308377-0_sp1.1 from training. Duration: 0.4454375 2023-10-04 10:25:37,434 WARNING [train.py:1204] (2/4) Exclude cut with ID _23825_210_13449_1_1531915313526_3497150_240-271367-0_sp1.1 from training. Duration: 0.8818125 2023-10-04 10:25:37,516 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1015-343884-0_sp0.9 from training. Duration: 0.688875 2023-10-04 10:25:39,167 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.ff3_skip_rate, batch_count=1621373.3333333333, ans=0.0 2023-10-04 10:25:42,068 WARNING [train.py:1204] (2/4) Exclude cut with ID _23514_210_1389_1_1531832139785_3031439_335-48210-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 10:25:42,253 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.conv_module2.balancer1.prob, batch_count=1621440.0, ans=0.125 2023-10-04 10:25:44,968 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_309-65177-0_sp1.1 from training. Duration: 0.8 2023-10-04 10:25:46,331 WARNING [train.py:1204] (2/4) Exclude cut with ID _14368_210_4220_1_1530532714870_3595529_231-137892-0 from training. Duration: 0.99 2023-10-04 10:25:47,785 WARNING [train.py:1204] (2/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_57-270037-0 from training. Duration: 0.77 2023-10-04 10:25:47,791 WARNING [train.py:1204] (2/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_465-214726-0_sp1.1 from training. Duration: 0.7818125 2023-10-04 10:25:49,192 WARNING [train.py:1204] (2/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_106-300985-0 from training. Duration: 0.9 2023-10-04 10:25:49,203 WARNING [train.py:1204] (2/4) Exclude cut with ID _14632_210_9988_1_1530691279392_3923200_161-129530-0_sp1.1 from training. Duration: 0.87275 2023-10-04 10:25:49,214 WARNING [train.py:1204] (2/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_514-88370-0_sp1.1 from training. Duration: 0.963625 2023-10-04 10:25:52,358 WARNING [train.py:1204] (2/4) Exclude cut with ID _56495_210_4234_1_1534748080334_4136219_305-334505-0_sp1.1 from training. Duration: 0.92725 2023-10-04 10:25:54,073 WARNING [train.py:1204] (2/4) Exclude cut with ID _42875_210_14856_1_1533704397487_4012433_61-264914-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 10:25:57,026 WARNING [train.py:1204] (2/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_91-148831-0 from training. Duration: 0.99 2023-10-04 10:26:00,303 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_435-313305-0_sp0.9 from training. Duration: 0.788875 2023-10-04 10:26:01,692 WARNING [train.py:1204] (2/4) Exclude cut with ID _16744_210_5986_1_1531724405384_3907567_242-111316-0_sp1.1 from training. Duration: 0.8454375 2023-10-04 10:26:03,029 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_257-20733-0 from training. Duration: 0.76 2023-10-04 10:26:03,076 WARNING [train.py:1204] (2/4) Exclude cut with ID _8064_210_5399_1_1528372299291_4282973_4-91037-0_sp1.1 from training. Duration: 0.8 2023-10-04 10:26:04,401 WARNING [train.py:1204] (2/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_3-276701-0 from training. Duration: 0.91 2023-10-04 10:26:07,063 WARNING [train.py:1204] (2/4) Exclude cut with ID _3992_210_1747_1_1526644816031_3731060_36-147855-0_sp1.1 from training. Duration: 0.7090625 2023-10-04 10:26:08,415 WARNING [train.py:1204] (2/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_1296-298671-0_sp1.1 from training. Duration: 0.963625 2023-10-04 10:26:09,803 WARNING [train.py:1204] (2/4) Exclude cut with ID _68965_210_1883_1_1535698962272_3497490_309-334593-0_sp1.1 from training. Duration: 0.92725 2023-10-04 10:26:13,069 WARNING [train.py:1204] (2/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_128-296581-0 from training. Duration: 0.74 2023-10-04 10:26:13,070 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_17613_210_2151_1_1531137569_2366160_170-299079-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 10:26:13,072 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6287_210_3273_1_1527509506_2653716_182-199887-0_sp1.1 from training. Duration: 0.463625 2023-10-04 10:26:14,542 WARNING [train.py:1204] (2/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_80-135200-0_sp1.1 from training. Duration: 0.7181875 2023-10-04 10:26:16,064 WARNING [train.py:1204] (2/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_34-183685-0 from training. Duration: 0.98 2023-10-04 10:26:16,081 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8278_210_2550_1_1528637099_4220413_418-210155-0_sp1.1 from training. Duration: 0.92725 2023-10-04 10:26:16,085 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8853_210_3273_1_1528633899_818968_51-187685-0_sp0.9 from training. Duration: 0.7666875 2023-10-04 10:26:17,357 WARNING [train.py:1204] (2/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_332-221057-0_sp1.1 from training. Duration: 0.6181875 2023-10-04 10:26:17,431 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_2221_210_1988_1_1525597051_4935541_415-296882-0 from training. Duration: 0.77 2023-10-04 10:26:17,450 WARNING [train.py:1204] (2/4) Exclude cut with ID _33640_210_2655_1_1533117605479_4855469_770-278185-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 10:26:17,484 WARNING [train.py:1204] (2/4) Exclude cut with ID _5287_210_2084_1_1527418883040_3878670_333-48508-0_sp1.1 from training. Duration: 0.5454375 2023-10-04 10:26:18,937 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_43802_210_2290_1_1533516203_8829504_142-276839-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 10:26:20,375 WARNING [train.py:1204] (2/4) Exclude cut with ID _28394_210_8913_1_1532430049518_3650118_126-139371-0_sp1.1 from training. Duration: 0.92725 2023-10-04 10:26:20,389 WARNING [train.py:1204] (2/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_76-203867-0 from training. Duration: 0.72 2023-10-04 10:26:20,432 WARNING [train.py:1204] (2/4) Exclude cut with ID _6194_210_1534_1_1527511826160_3941194_14-282876-0_sp0.9 from training. Duration: 0.788875 2023-10-04 10:26:22,521 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.feed_forward1.hidden_balancer.prob, batch_count=1621573.3333333333, ans=0.125 2023-10-04 10:26:25,596 WARNING [train.py:1204] (2/4) Exclude cut with ID _35355_210_16040_1_1533212988977_4016229_74-190076-0_sp1.1 from training. Duration: 0.72725 2023-10-04 10:26:26,973 WARNING [train.py:1204] (2/4) Exclude cut with ID _33920_210_4220_1_1533257851500_3084399_198-334063-0 from training. Duration: 0.94 2023-10-04 10:26:28,281 INFO [train.py:1046] (2/4) Epoch 46, batch 4200, loss[loss=0.1586, simple_loss=0.2461, pruned_loss=0.03561, over 23779.00 frames. ], tot_loss[loss=0.1528, simple_loss=0.2334, pruned_loss=0.03603, over 4719699.58 frames. ], batch size: 85, lr: 2.20e-03, grad_scale: 16.0 2023-10-04 10:26:30,205 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6291_210_3273_1_1527683649_1078498_4-266348-0_sp0.9 from training. Duration: 0.8666875 2023-10-04 10:26:32,764 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.580e+02 2.054e+02 2.218e+02 2.491e+02 3.350e+02, threshold=4.435e+02, percent-clipped=0.0 2023-10-04 10:26:32,838 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_23856_210_4238_1_1531994785_3087934_122-38075-0_sp1.1 from training. Duration: 0.936375 2023-10-04 10:26:34,222 WARNING [train.py:1204] (2/4) Exclude cut with ID _4697_210_2069_1_1526815801953_3823098_276-96593-0_sp0.9 from training. Duration: 0.8555625 2023-10-04 10:26:35,670 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_308-278734-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 10:26:35,672 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_5190_210_4557_1_1527245078_2965646_353-250603-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 10:26:37,135 WARNING [train.py:1204] (2/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_472-216663-0 from training. Duration: 0.92 2023-10-04 10:26:39,891 WARNING [train.py:1204] (2/4) Exclude cut with ID _7537_210_2084_1_1528502395431_3924359_14-312906-0 from training. Duration: 0.84 2023-10-04 10:26:39,928 WARNING [train.py:1204] (2/4) Exclude cut with ID _29003_210_19596_1_1532507391720_3894098_559-236660-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 10:26:43,063 WARNING [train.py:1204] (2/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_312-204798-0_sp1.1 from training. Duration: 0.8454375 2023-10-04 10:26:44,536 WARNING [train.py:1204] (2/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_17-309070-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 10:26:47,226 WARNING [train.py:1204] (2/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_344-152526-0_sp0.9 from training. Duration: 0.588875 2023-10-04 10:26:48,658 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_41452_210_7291_1_1533437687_3941281_405-11559-0_sp1.1 from training. Duration: 0.82725 2023-10-04 10:26:48,693 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_48730_210_5399_1_1533885886_2212769_157-124052-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 10:26:49,974 WARNING [train.py:1204] (2/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_415-78792-0 from training. Duration: 0.81 2023-10-04 10:26:49,977 WARNING [train.py:1204] (2/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_345-340690-0_sp1.1 from training. Duration: 0.8454375 2023-10-04 10:26:50,189 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.balancer1.prob, batch_count=1621706.6666666667, ans=0.125 2023-10-04 10:26:51,825 WARNING [train.py:1204] (2/4) Exclude cut with ID _23825_210_13449_1_1531915313526_3497150_124-8138-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 10:26:51,872 WARNING [train.py:1204] (2/4) Exclude cut with ID _49258_210_7994_1_1534122017863_3738861_194-24864-0_sp1.1 from training. Duration: 0.936375 2023-10-04 10:26:53,188 WARNING [train.py:1204] (2/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_369-222060-0_sp1.1 from training. Duration: 0.6454375 2023-10-04 10:26:54,581 WARNING [train.py:1204] (2/4) Exclude cut with ID _12369_210_4381_1_1530356078581_4388520_480-13146-0_sp0.9 from training. Duration: 0.9444375 2023-10-04 10:26:56,447 WARNING [train.py:1204] (2/4) Exclude cut with ID _13253_210_1925_1_1530757775819_3577873_191-144977-0 from training. Duration: 0.78 2023-10-04 10:26:56,464 WARNING [train.py:1204] (2/4) Exclude cut with ID _30304_210_6319_1_1532476827261_7582060_1128-140266-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 10:27:01,305 WARNING [train.py:1204] (2/4) Exclude cut with ID _8772_210_1850_1_1528635595972_3664011_247-336448-0_sp0.9 from training. Duration: 0.82225 2023-10-04 10:27:03,901 WARNING [train.py:1204] (2/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_318-59097-0_sp1.1 from training. Duration: 0.6909375 2023-10-04 10:27:05,417 WARNING [train.py:1204] (2/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_540-131164-0_sp1.1 from training. Duration: 0.836375 2023-10-04 10:27:06,897 WARNING [train.py:1204] (2/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_39-110836-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 10:27:08,379 WARNING [train.py:1204] (2/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_860-96198-0_sp1.1 from training. Duration: 0.663625 2023-10-04 10:27:08,387 WARNING [train.py:1204] (2/4) Exclude cut with ID _15133_210_6228_1_1530794512074_3080379_29-264458-0 from training. Duration: 0.58 2023-10-04 10:27:09,804 WARNING [train.py:1204] (2/4) Exclude cut with ID _55209_210_1531_1_1534675828996_4597160_797-348343-0_sp1.1 from training. Duration: 0.963625 2023-10-04 10:27:11,600 WARNING [train.py:1204] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_23-189234-0_sp1.1 from training. Duration: 0.7545625 2023-10-04 10:27:14,487 WARNING [train.py:1204] (2/4) Exclude cut with ID _5410_210_1388_1_1527397055795_1719020_39-112221-0_sp1.1 from training. Duration: 0.62725 2023-10-04 10:27:15,911 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_43802_210_2290_1_1533516203_8829504_100-348441-0_sp1.1 from training. Duration: 0.82725 2023-10-04 10:27:23,239 WARNING [train.py:1204] (2/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_143-86344-0_sp1.1 from training. Duration: 0.7 2023-10-04 10:27:25,318 WARNING [train.py:1204] (2/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_126-29104-0 from training. Duration: 0.85 2023-10-04 10:27:28,088 WARNING [train.py:1204] (2/4) Exclude cut with ID _39846_210_12651_1_1533603651992_1318030_138-31771-0_sp1.1 from training. Duration: 0.963625 2023-10-04 10:27:32,923 WARNING [train.py:1204] (2/4) Exclude cut with ID _2220_210_1819_1_1525509109898_3658680_50-99837-0_sp1.1 from training. Duration: 0.4909375 2023-10-04 10:27:34,216 WARNING [train.py:1204] (2/4) Exclude cut with ID _6274_210_2084_1_1527937583837_7494020_836-210550-0_sp1.1 from training. Duration: 0.97275 2023-10-04 10:27:35,532 WARNING [train.py:1204] (2/4) Exclude cut with ID _28323_210_16187_1_1532480395667_4100300_306-74561-0 from training. Duration: 0.94 2023-10-04 10:27:40,966 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_177-58005-0_sp1.1 from training. Duration: 0.563625 2023-10-04 10:27:41,313 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=1621973.3333333333, ans=0.1 2023-10-04 10:27:42,955 INFO [train.py:1046] (2/4) Epoch 46, batch 4250, loss[loss=0.1485, simple_loss=0.222, pruned_loss=0.03747, over 23342.00 frames. ], tot_loss[loss=0.152, simple_loss=0.2325, pruned_loss=0.03578, over 4694831.28 frames. ], batch size: 119, lr: 2.20e-03, grad_scale: 16.0 2023-10-04 10:27:44,356 WARNING [train.py:1204] (2/4) Exclude cut with ID _54871_210_22356_1_1534665798253_3856359_19-342632-0_sp1.1 from training. Duration: 0.82725 2023-10-04 10:27:44,370 WARNING [train.py:1204] (2/4) Exclude cut with ID _35355_210_16040_1_1533212988977_4016229_74-190076-0_sp0.9 from training. Duration: 0.888875 2023-10-04 10:27:45,889 WARNING [train.py:1204] (2/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_285-97786-0_sp1.1 from training. Duration: 0.97275 2023-10-04 10:27:50,077 WARNING [train.py:1204] (2/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_337-181502-0_sp0.9 from training. Duration: 0.72225 2023-10-04 10:27:51,221 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_353-182855-0 from training. Duration: 0.96 2023-10-04 10:27:51,256 WARNING [train.py:1204] (2/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_409-327730-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 10:27:54,642 WARNING [train.py:1204] (2/4) Exclude cut with ID _8030_210_7497_1_1528444886781_3531089_373-19403-0_sp1.1 from training. Duration: 0.97275 2023-10-04 10:27:58,834 WARNING [train.py:1204] (2/4) Exclude cut with ID _6789_210_1534_1_1527857937218_3118950_231-340237-0_sp1.1 from training. Duration: 0.936375 2023-10-04 10:28:03,390 WARNING [train.py:1204] (2/4) Exclude cut with ID _6493_210_1747_1_1528459210424_4012470_536-57832-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 10:28:03,410 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7046_210_6753_1_1527895337_6920917_756-116002-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 10:28:04,715 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_37152_210_9697_1_1533106573_3844188_29-239070-0_sp1.1 from training. Duration: 0.8909375 2023-10-04 10:28:04,717 WARNING [train.py:1204] (2/4) Exclude cut with ID _19023_210_4353_1_1531378892552_5820780_61-334539-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 10:28:07,425 WARNING [train.py:1204] (2/4) Exclude cut with ID _14170_210_5219_1_1530525949868_4469650_159-28488-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 10:28:07,500 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_40896_210_15710_1_1533433956_7100802_764-71272-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 10:28:08,861 WARNING [train.py:1204] (2/4) Exclude cut with ID _44902_210_16187_1_1533888006062_3792078_258-178274-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 10:28:09,006 WARNING [train.py:1204] (2/4) Exclude cut with ID _7537_210_2084_1_1528502395431_3924359_14-312906-0_sp1.1 from training. Duration: 0.763625 2023-10-04 10:28:10,340 WARNING [train.py:1204] (2/4) Exclude cut with ID _49643_210_7989_1_1534640244224_3939031_674-116639-0_sp1.1 from training. Duration: 0.963625 2023-10-04 10:28:10,510 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.1.bypass_mid.scale_min, batch_count=1622106.6666666667, ans=0.2 2023-10-04 10:28:12,479 WARNING [train.py:1204] (2/4) Exclude cut with ID _22826_210_9994_1_1531908201522_3279169_415-161-0 from training. Duration: 0.98 2023-10-04 10:28:14,148 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=1622106.6666666667, ans=0.1 2023-10-04 10:28:16,725 WARNING [train.py:1204] (2/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_264-348896-0 from training. Duration: 0.85 2023-10-04 10:28:16,732 WARNING [train.py:1204] (2/4) Exclude cut with ID _51899_210_20034_1_1534129254466_3669220_233-161492-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 10:28:18,089 WARNING [train.py:1204] (2/4) Exclude cut with ID _31255_210_13427_1_1532865619495_3815143_233-200023-0_sp1.1 from training. Duration: 0.936375 2023-10-04 10:28:18,112 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_23850_210_4238_1_1532232597_1632433_100-229879-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 10:28:18,195 WARNING [train.py:1204] (2/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_375-245677-0_sp0.9 from training. Duration: 0.9 2023-10-04 10:28:18,205 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_40223_210_6228_1_1533298404_4812267_355-317415-0_sp1.1 from training. Duration: 0.97275 2023-10-04 10:28:19,550 WARNING [train.py:1204] (2/4) Exclude cut with ID _9679_210_7497_1_1529129212906_4363929_235-81488-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 10:28:22,850 WARNING [train.py:1204] (2/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_172-215932-0_sp1.1 from training. Duration: 0.6 2023-10-04 10:28:22,909 WARNING [train.py:1204] (2/4) Exclude cut with ID _12369_210_4381_1_1530356078581_4388520_64-298917-0_sp1.1 from training. Duration: 0.8 2023-10-04 10:28:26,424 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.bypass.scale_min, batch_count=1622173.3333333333, ans=0.2 2023-10-04 10:28:27,454 WARNING [train.py:1204] (2/4) Exclude cut with ID _38020_210_5986_1_1533112252259_7006089_131-53533-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 10:28:29,010 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.ff3_skip_rate, batch_count=1622173.3333333333, ans=0.0 2023-10-04 10:28:29,615 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.0.layers.0.conv_module1.whiten, num_groups=1, num_channels=192, metric=11.15 vs. limit=15.0 2023-10-04 10:28:30,107 WARNING [train.py:1204] (2/4) Exclude cut with ID _42334_210_7770_1_1534206610396_3605880_392-48154-0_sp1.1 from training. Duration: 0.963625 2023-10-04 10:28:31,482 WARNING [train.py:1204] (2/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_337-181502-0 from training. Duration: 0.65 2023-10-04 10:28:31,497 WARNING [train.py:1204] (2/4) Exclude cut with ID _6267_210_2084_1_1527764658526_3894730_375-219887-0_sp0.9 from training. Duration: 0.7444375 2023-10-04 10:28:32,203 WARNING [train.py:1204] (2/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_60-146189-0 from training. Duration: 0.98 2023-10-04 10:28:33,483 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_3133_210_2087_1_1526106446_2324690_243-143119-0_sp1.1 from training. Duration: 0.7 2023-10-04 10:28:34,882 WARNING [train.py:1204] (2/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_1267-272693-0_sp0.9 from training. Duration: 0.811125 2023-10-04 10:28:35,012 WARNING [train.py:1204] (2/4) Exclude cut with ID _69884_210_9771_1_1535846392818_4166169_83-182645-0_sp1.1 from training. Duration: 0.97275 2023-10-04 10:28:35,030 WARNING [train.py:1204] (2/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_403-292260-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 10:28:39,096 WARNING [train.py:1204] (2/4) Exclude cut with ID _55429_210_10626_1_1534467588527_7174143_377-169391-0 from training. Duration: 0.96 2023-10-04 10:28:39,326 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.bypass_mid.scale_min, batch_count=1622173.3333333333, ans=0.2 2023-10-04 10:28:40,403 WARNING [train.py:1204] (2/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_494-199538-0_sp1.1 from training. Duration: 0.6818125 2023-10-04 10:28:41,654 WARNING [train.py:1204] (2/4) Exclude cut with ID _3067_210_2667_1_1526212974640_4503168_119-175776-0_sp1.1 from training. Duration: 0.67275 2023-10-04 10:28:46,314 WARNING [train.py:1204] (2/4) Exclude cut with ID _3231_210_2106_1_1526205864934_6784220_183-303839-0_sp1.1 from training. Duration: 0.97275 2023-10-04 10:28:47,747 WARNING [train.py:1204] (2/4) Exclude cut with ID _18513_210_9206_1_1531618215223_3862620_165-38958-0_sp1.1 from training. Duration: 0.963625 2023-10-04 10:28:49,092 WARNING [train.py:1204] (2/4) Exclude cut with ID _41314_210_12651_1_1533459476543_3766460_87-272068-0_sp1.1 from training. Duration: 0.7545625 2023-10-04 10:28:49,326 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.conv_module1.balancer2.prob, batch_count=1622240.0, ans=0.125 2023-10-04 10:28:50,467 WARNING [train.py:1204] (2/4) Exclude cut with ID _3541_210_2477_1_1526475408709_3263973_353-95-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 10:28:52,465 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_2009_210_2071_1_1525481134_4588937_73-302813-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 10:28:52,582 WARNING [train.py:1204] (2/4) Exclude cut with ID _64220_210_9527_1_1535250151772_4875259_88-95011-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 10:28:53,911 WARNING [train.py:1204] (2/4) Exclude cut with ID _44902_210_16187_1_1533888006062_3792078_160-12107-0_sp1.1 from training. Duration: 0.92725 2023-10-04 10:28:53,917 WARNING [train.py:1204] (2/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_547-5424-0 from training. Duration: 0.94 2023-10-04 10:28:55,372 WARNING [train.py:1204] (2/4) Exclude cut with ID _21783_210_11357_1_1531742458730_3762679_268-85656-0_sp1.1 from training. Duration: 0.936375 2023-10-04 10:28:57,094 INFO [train.py:1046] (2/4) Epoch 46, batch 4300, loss[loss=0.1475, simple_loss=0.2256, pruned_loss=0.03474, over 23414.00 frames. ], tot_loss[loss=0.1521, simple_loss=0.2322, pruned_loss=0.03599, over 4692432.35 frames. ], batch size: 285, lr: 2.20e-03, grad_scale: 16.0 2023-10-04 10:29:01,507 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.623e+02 2.043e+02 2.222e+02 2.546e+02 3.786e+02, threshold=4.445e+02, percent-clipped=0.0 2023-10-04 10:29:01,636 WARNING [train.py:1204] (2/4) Exclude cut with ID _66210_210_12651_1_1535446812922_3365850_42-284985-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 10:29:01,660 WARNING [train.py:1204] (2/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_465-214726-0_sp0.9 from training. Duration: 0.9555625 2023-10-04 10:29:05,906 WARNING [train.py:1204] (2/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_50-95770-0_sp1.1 from training. Duration: 0.936375 2023-10-04 10:29:10,454 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.feed_forward1.hidden_balancer.prob, batch_count=1622373.3333333333, ans=0.125 2023-10-04 10:29:10,703 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.whiten.whitening_limit, batch_count=1622373.3333333333, ans=12.0 2023-10-04 10:29:11,716 WARNING [train.py:1204] (2/4) Exclude cut with ID _30800_210_22382_1_1532499347904_4515959_269-77567-0_sp1.1 from training. Duration: 0.963625 2023-10-04 10:29:11,718 WARNING [train.py:1204] (2/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_7-111954-0 from training. Duration: 0.56 2023-10-04 10:29:14,261 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_43802_210_2290_1_1533516203_8829504_85-195240-0_sp0.9 from training. Duration: 0.9666875 2023-10-04 10:29:16,049 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_2822_210_2715_1_1526112586_1999587_165-36656-0_sp0.9 from training. Duration: 0.988875 2023-10-04 10:29:16,075 WARNING [train.py:1204] (2/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_181-78632-0_sp1.1 from training. Duration: 0.7090625 2023-10-04 10:29:16,094 WARNING [train.py:1204] (2/4) Exclude cut with ID IC0671W0044-331887 from training. Duration: 0.896 2023-10-04 10:29:17,559 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.feed_forward2.hidden_balancer.prob, batch_count=1622373.3333333333, ans=0.125 2023-10-04 10:29:20,065 WARNING [train.py:1204] (2/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_266-137104-0_sp1.1 from training. Duration: 0.5909375 2023-10-04 10:29:21,539 WARNING [train.py:1204] (2/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_137-116442-0_sp0.9 from training. Duration: 0.8666875 2023-10-04 10:29:24,934 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_23801_210_16223_1_1532066392_3294657_570-5439-0 from training. Duration: 0.97 2023-10-04 10:29:24,947 WARNING [train.py:1204] (2/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_29-280622-0_sp1.1 from training. Duration: 0.7454375 2023-10-04 10:29:24,986 WARNING [train.py:1204] (2/4) Exclude cut with ID 3557-8342-0013-54691-0 from training. Duration: 0.92 2023-10-04 10:29:26,539 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.bypass.skip_rate, batch_count=1622440.0, ans=0.09899494936611666 2023-10-04 10:29:27,628 WARNING [train.py:1204] (2/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_711-15688-0_sp1.1 from training. Duration: 0.3909375 2023-10-04 10:29:29,010 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_15126_210_6753_1_1530778677_2591185_120-277707-0_sp1.1 from training. Duration: 0.72725 2023-10-04 10:29:33,968 WARNING [train.py:1204] (2/4) Exclude cut with ID _14970_210_15709_1_1530775547361_1966965_8-169924-0_sp1.1 from training. Duration: 0.836375 2023-10-04 10:29:33,970 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_4692_210_3528_1_1526899755_4323254_107-44401-0_sp0.9 from training. Duration: 0.9555625 2023-10-04 10:29:35,346 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_3475_210_2028_1_1526193578_3622569_265-276625-0_sp0.9 from training. Duration: 0.8444375 2023-10-04 10:29:36,796 WARNING [train.py:1204] (2/4) Exclude cut with ID _55153_210_2253_1_1534384786677_3162096_148-79022-0_sp1.1 from training. Duration: 0.87275 2023-10-04 10:29:36,870 WARNING [train.py:1204] (2/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_292-36336-0_sp1.1 from training. Duration: 0.92725 2023-10-04 10:29:36,887 WARNING [train.py:1204] (2/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_329-82235-0 from training. Duration: 0.82 2023-10-04 10:29:38,205 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_11305_210_1794_1_1529750428_7178148_257-312870-0 from training. Duration: 0.88 2023-10-04 10:29:41,041 WARNING [train.py:1204] (2/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_436-4493-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 10:29:42,514 WARNING [train.py:1204] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_321-54129-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 10:29:42,521 WARNING [train.py:1204] (2/4) Exclude cut with ID _5287_210_2084_1_1527418883040_3878670_333-48508-0_sp0.9 from training. Duration: 0.6666875 2023-10-04 10:29:42,535 WARNING [train.py:1204] (2/4) Exclude cut with ID _26528_210_9070_1_1532570410842_3851310_244-278158-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 10:29:43,829 WARNING [train.py:1204] (2/4) Exclude cut with ID _46952_210_5986_1_1534230010357_3107379_232-344503-0_sp1.1 from training. Duration: 0.87275 2023-10-04 10:29:43,839 WARNING [train.py:1204] (2/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_43-206757-0 from training. Duration: 0.75 2023-10-04 10:29:43,841 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_4183_210_2087_1_1526729100_1869838_141-166600-0 from training. Duration: 0.95 2023-10-04 10:29:43,906 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_296-287678-0 from training. Duration: 0.95 2023-10-04 10:29:45,317 WARNING [train.py:1204] (2/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_265-60343-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 10:29:47,089 WARNING [train.py:1204] (2/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_332-256168-0 from training. Duration: 0.75 2023-10-04 10:29:47,123 WARNING [train.py:1204] (2/4) Exclude cut with ID _13679_210_7497_1_1530534293662_3355098_411-186088-0 from training. Duration: 0.94 2023-10-04 10:29:51,304 WARNING [train.py:1204] (2/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_458-126779-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 10:29:53,976 WARNING [train.py:1204] (2/4) Exclude cut with ID IC0803W0380-397572 from training. Duration: 0.032 2023-10-04 10:29:54,046 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_278-272612-0_sp1.1 from training. Duration: 0.663625 2023-10-04 10:29:55,420 WARNING [train.py:1204] (2/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_491-263101-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 10:29:56,713 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_53555_210_5632_1_1534595220_7232297_334-336778-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 10:29:58,465 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_50-3289-0 from training. Duration: 0.91 2023-10-04 10:29:58,529 WARNING [train.py:1204] (2/4) Exclude cut with ID _8699_210_2069_1_1528630346831_3960409_155-39766-0_sp0.9 from training. Duration: 0.8666875 2023-10-04 10:29:58,533 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_502-82145-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 10:29:58,577 WARNING [train.py:1204] (2/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_550-43132-0_sp1.1 from training. Duration: 0.97275 2023-10-04 10:29:59,892 WARNING [train.py:1204] (2/4) Exclude cut with ID _14998_210_5219_1_1530705582080_7474970_446-112117-0_sp1.1 from training. Duration: 0.8181875 2023-10-04 10:29:59,952 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_3691_210_3528_1_1526468169_4062053_355-334432-0_sp1.1 from training. Duration: 0.7909375 2023-10-04 10:30:01,384 WARNING [train.py:1204] (2/4) Exclude cut with ID _58605_210_12210_1_1534723195939_3904259_239-94720-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 10:30:03,485 WARNING [train.py:1204] (2/4) Exclude cut with ID _49376_210_5986_1_1533985278732_3370740_391-344072-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 10:30:04,885 WARNING [train.py:1204] (2/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_558-322014-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 10:30:04,915 WARNING [train.py:1204] (2/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_256-32662-0_sp1.1 from training. Duration: 0.8181875 2023-10-04 10:30:10,219 INFO [train.py:1046] (2/4) Epoch 46, batch 4350, loss[loss=0.1616, simple_loss=0.2395, pruned_loss=0.04184, over 23803.00 frames. ], tot_loss[loss=0.1531, simple_loss=0.2336, pruned_loss=0.03628, over 4707655.35 frames. ], batch size: 195, lr: 2.20e-03, grad_scale: 16.0 2023-10-04 10:30:10,305 WARNING [train.py:1204] (2/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_572-185150-0 from training. Duration: 0.97 2023-10-04 10:30:10,348 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_89-35748-0_sp0.9 from training. Duration: 0.87775 2023-10-04 10:30:16,355 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6291_210_3273_1_1527683649_1078498_88-66747-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 10:30:17,814 WARNING [train.py:1204] (2/4) Exclude cut with ID _19504_210_9994_1_1531648863242_3325220_357-237029-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 10:30:18,035 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.conv_skip_rate, batch_count=1622640.0, ans=0.0 2023-10-04 10:30:20,660 WARNING [train.py:1204] (2/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_302-2550-0_sp1.1 from training. Duration: 0.736375 2023-10-04 10:30:20,661 WARNING [train.py:1204] (2/4) Exclude cut with ID _9461_210_4557_1_1529060514726_3677451_59-194017-0_sp1.1 from training. Duration: 0.963625 2023-10-04 10:30:20,925 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=1622640.0, ans=0.1 2023-10-04 10:30:23,607 INFO [scaling.py:1118] (2/4) WithLoss: name=encoder.encoders.1.encoder.layers.1.self_attn_weights, loss-sum=0.000e+00 2023-10-04 10:30:25,997 WARNING [train.py:1204] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_665-196155-0_sp1.1 from training. Duration: 0.6090625 2023-10-04 10:30:26,259 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.conv_skip_rate, batch_count=1622706.6666666667, ans=0.0 2023-10-04 10:30:28,811 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_326-315007-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 10:30:30,846 WARNING [train.py:1204] (2/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_105-328174-0_sp1.1 from training. Duration: 0.6909375 2023-10-04 10:30:30,854 WARNING [train.py:1204] (2/4) Exclude cut with ID _56494_210_4220_1_1535284837625_3033169_106-263722-0_sp1.1 from training. Duration: 0.97275 2023-10-04 10:30:33,080 INFO [scaling.py:1118] (2/4) WithLoss: name=encoder.encoders.4.encoder.layers.0.self_attn_weights, loss-sum=0.000e+00 2023-10-04 10:30:34,139 WARNING [train.py:1204] (2/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_770-200430-0_sp0.9 from training. Duration: 0.988875 2023-10-04 10:30:37,227 WARNING [train.py:1204] (2/4) Exclude cut with ID _65640_210_6310_1_1535368201445_3173250_209-13008-0_sp1.1 from training. Duration: 0.9 2023-10-04 10:30:38,611 WARNING [train.py:1204] (2/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_212-149947-0_sp0.9 from training. Duration: 0.811125 2023-10-04 10:30:40,314 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.balancer1.prob, batch_count=1622773.3333333333, ans=0.125 2023-10-04 10:30:41,563 WARNING [train.py:1204] (2/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_188-171673-0 from training. Duration: 0.58 2023-10-04 10:30:42,879 WARNING [train.py:1204] (2/4) Exclude cut with ID _58895_210_8750_1_1535184081320_3649089_286-281724-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 10:30:44,723 WARNING [train.py:1204] (2/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_16-254909-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 10:30:50,237 WARNING [train.py:1204] (2/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_52-95655-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 10:30:51,738 WARNING [train.py:1204] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_554-45617-0 from training. Duration: 0.99 2023-10-04 10:30:54,603 WARNING [train.py:1204] (2/4) Exclude cut with ID _14683_210_5219_1_1530698352608_3974859_473-344611-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 10:30:55,952 WARNING [train.py:1204] (2/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_17-25045-0_sp0.9 from training. Duration: 0.6555625 2023-10-04 10:31:01,945 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9072W0003-530637 from training. Duration: 0.768 2023-10-04 10:31:02,718 WARNING [train.py:1204] (2/4) Exclude cut with ID _38670_210_15379_1_1533448810659_4391059_76-74025-0_sp1.1 from training. Duration: 0.936375 2023-10-04 10:31:03,976 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6291_210_3273_1_1527683649_1078498_74-292608-0_sp0.9 from training. Duration: 0.8 2023-10-04 10:31:04,048 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9084W0025-536862 from training. Duration: 0.896 2023-10-04 10:31:05,367 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9087W0021-538392 from training. Duration: 0.768 2023-10-04 10:31:05,372 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7543_210_6753_1_1528543323_645515_56-163234-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 10:31:05,399 WARNING [train.py:1204] (2/4) Exclude cut with ID _45560_210_21982_1_1533812603951_3584999_44-332643-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 10:31:07,367 WARNING [train.py:1204] (2/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_1285-256622-0_sp1.1 from training. Duration: 0.763625 2023-10-04 10:31:08,653 WARNING [train.py:1204] (2/4) Exclude cut with ID _52781_210_16187_1_1534297729099_1118149_141-325632-0_sp1.1 from training. Duration: 0.936375 2023-10-04 10:31:10,058 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_40896_210_15710_1_1533433956_7100802_1051-137631-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 10:31:10,110 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_13091_210_7925_1_1530609447_8987354_233-18333-0_sp1.1 from training. Duration: 0.97275 2023-10-04 10:31:12,808 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_34008_210_10431_1_1533345424_8183247_662-317331-0 from training. Duration: 0.94 2023-10-04 10:31:12,814 WARNING [train.py:1204] (2/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_291-248920-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 10:31:12,815 WARNING [train.py:1204] (2/4) Exclude cut with ID _65263_210_22356_1_1535763598691_3921037_274-288481-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 10:31:12,836 WARNING [train.py:1204] (2/4) Exclude cut with ID _5189_210_4557_1_1527248319428_3769229_233-168080-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 10:31:14,131 WARNING [train.py:1204] (2/4) Exclude cut with ID _54871_210_22356_1_1534665798253_3856359_135-184281-0 from training. Duration: 0.99 2023-10-04 10:31:15,571 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9123W0012-555972 from training. Duration: 0.896 2023-10-04 10:31:15,575 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9123W0118-556077 from training. Duration: 0.896 2023-10-04 10:31:15,584 WARNING [train.py:1204] (2/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_406-275649-0 from training. Duration: 0.42 2023-10-04 10:31:18,857 WARNING [train.py:1204] (2/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_309-13684-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 10:31:18,887 WARNING [train.py:1204] (2/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_129-337802-0_sp1.1 from training. Duration: 0.6545625 2023-10-04 10:31:18,911 WARNING [train.py:1204] (2/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_649-266825-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 10:31:20,214 WARNING [train.py:1204] (2/4) Exclude cut with ID _40519_210_13794_1_1533715228655_7337390_895-224742-0_sp1.1 from training. Duration: 0.92725 2023-10-04 10:31:21,605 WARNING [train.py:1204] (2/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_234-14174-0 from training. Duration: 0.82 2023-10-04 10:31:23,072 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9156W0009-572502 from training. Duration: 0.768 2023-10-04 10:31:23,078 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_424-245529-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 10:31:24,236 INFO [train.py:1046] (2/4) Epoch 46, batch 4400, loss[loss=0.1623, simple_loss=0.2485, pruned_loss=0.03806, over 23573.00 frames. ], tot_loss[loss=0.1537, simple_loss=0.2347, pruned_loss=0.0364, over 4706414.12 frames. ], batch size: 94, lr: 2.20e-03, grad_scale: 32.0 2023-10-04 10:31:24,430 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder_embed.dropout.p, batch_count=1622973.3333333333, ans=0.1 2023-10-04 10:31:27,139 WARNING [train.py:1204] (2/4) Exclude cut with ID _26528_210_9070_1_1532570410842_3851310_232-10149-0_sp1.1 from training. Duration: 0.963625 2023-10-04 10:31:27,148 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_17292_210_6753_1_1531125388_2943734_152-30410-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 10:31:28,567 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.667e+02 1.967e+02 2.244e+02 2.506e+02 3.290e+02, threshold=4.488e+02, percent-clipped=0.0 2023-10-04 10:31:28,686 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_35441_210_16554_1_1533347572_4092680_291-82075-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 10:31:31,336 WARNING [train.py:1204] (2/4) Exclude cut with ID _12369_210_4381_1_1530356078581_4388520_64-298917-0 from training. Duration: 0.88 2023-10-04 10:31:31,355 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_5618_210_1874_1_1527311008_5851_1-214501-0_sp0.9 from training. Duration: 0.7655625 2023-10-04 10:31:32,796 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_9460_210_4211_1_1528954587_10049667_572-37035-0 from training. Duration: 0.8 2023-10-04 10:31:32,829 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9193W0023-590322 from training. Duration: 0.896 2023-10-04 10:31:34,708 WARNING [train.py:1204] (2/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_206-10097-0_sp1.1 from training. Duration: 0.4454375 2023-10-04 10:31:34,720 WARNING [train.py:1204] (2/4) Exclude cut with ID _55134_210_21982_1_1534590028377_3882170_17-51731-0_sp1.1 from training. Duration: 0.963625 2023-10-04 10:31:37,879 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_14953_210_2151_1_1530701710_5616165_710-165400-0 from training. Duration: 0.87 2023-10-04 10:31:41,179 WARNING [train.py:1204] (2/4) Exclude cut with ID _53203_210_2087_1_1534246218309_475590_61-85777-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 10:31:42,473 WARNING [train.py:1204] (2/4) Exclude cut with ID _55844_210_2346_1_1534471212569_7625707_1050-10453-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 10:31:42,482 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9218W0013-603297 from training. Duration: 0.896 2023-10-04 10:31:45,224 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_20120_210_6753_1_1531643035_5482859_267-102818-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 10:31:45,225 WARNING [train.py:1204] (2/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_191-41023-0 from training. Duration: 0.71 2023-10-04 10:31:45,268 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9231W0202-610227 from training. Duration: 0.896 2023-10-04 10:31:48,323 WARNING [train.py:1204] (2/4) Exclude cut with ID _6537_210_1883_1_1527987559710_3597129_382-133062-0 from training. Duration: 0.68 2023-10-04 10:31:48,369 WARNING [train.py:1204] (2/4) Exclude cut with ID _28427_210_4220_1_1532581240967_3061069_43-84928-0 from training. Duration: 0.99 2023-10-04 10:31:48,409 WARNING [train.py:1204] (2/4) Exclude cut with ID _59256_210_5986_1_1535076085997_3589120_121-237386-0 from training. Duration: 0.96 2023-10-04 10:31:49,739 WARNING [train.py:1204] (2/4) Exclude cut with ID _8480_210_1874_1_1528589391205_6728330_137-256986-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 10:31:51,129 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_9417_210_1794_1_1529235261_4614131_479-346963-0_sp1.1 from training. Duration: 0.936375 2023-10-04 10:31:51,161 WARNING [train.py:1204] (2/4) Exclude cut with ID _26474_210_9570_1_1532520022699_3623380_267-7362-0_sp1.1 from training. Duration: 0.936375 2023-10-04 10:31:52,483 WARNING [train.py:1204] (2/4) Exclude cut with ID _55328_210_27767_1_1534580973260_4088170_287-61272-0_sp1.1 from training. Duration: 0.97275 2023-10-04 10:31:53,922 WARNING [train.py:1204] (2/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_589-9070-0 from training. Duration: 0.71 2023-10-04 10:31:53,936 WARNING [train.py:1204] (2/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_148-158509-0_sp1.1 from training. Duration: 0.37275 2023-10-04 10:31:53,997 WARNING [train.py:1204] (2/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_164-184548-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 10:31:55,404 WARNING [train.py:1204] (2/4) Exclude cut with ID _14970_210_15709_1_1530775547361_1966965_6-23328-0_sp1.1 from training. Duration: 0.7818125 2023-10-04 10:31:55,409 WARNING [train.py:1204] (2/4) Exclude cut with ID _29303_210_2575_1_1532392237303_7410649_612-165069-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 10:31:56,968 WARNING [train.py:1204] (2/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_383-321224-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 10:31:58,319 WARNING [train.py:1204] (2/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_283-160525-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 10:31:58,330 WARNING [train.py:1204] (2/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_505-97987-0 from training. Duration: 0.86 2023-10-04 10:31:58,401 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9309W0190-640317 from training. Duration: 0.768 2023-10-04 10:32:02,540 WARNING [train.py:1204] (2/4) Exclude cut with ID _50093_210_4220_1_1534417141207_3432299_280-297390-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 10:32:09,052 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_17292_210_6753_1_1531125388_2943734_320-71385-0_sp1.1 from training. Duration: 0.97275 2023-10-04 10:32:10,490 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_213-103282-0 from training. Duration: 0.78 2023-10-04 10:32:14,549 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7615_210_4211_1_1528110965_4580160_72-235211-0_sp0.9 from training. Duration: 0.8444375 2023-10-04 10:32:17,868 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.feed_forward1.hidden_balancer.prob, batch_count=1623173.3333333333, ans=0.125 2023-10-04 10:32:19,055 WARNING [train.py:1204] (2/4) Exclude cut with ID _14867_210_1968_1_1530684179890_3846969_173-320977-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 10:32:21,759 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7046_210_6753_1_1527895337_6920917_783-278109-0_sp0.9 from training. Duration: 0.8555625 2023-10-04 10:32:21,812 WARNING [train.py:1204] (2/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_90-30673-0 from training. Duration: 0.84 2023-10-04 10:32:21,827 WARNING [train.py:1204] (2/4) Exclude cut with ID _22826_210_9994_1_1531908201522_3279169_415-161-0_sp1.1 from training. Duration: 0.8909375 2023-10-04 10:32:21,838 WARNING [train.py:1204] (2/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_86-66192-0_sp0.9 from training. Duration: 0.611125 2023-10-04 10:32:21,840 WARNING [train.py:1204] (2/4) Exclude cut with ID _4626_210_3532_1_1526817652717_3916859_446-206656-0_sp1.1 from training. Duration: 0.6909375 2023-10-04 10:32:23,238 WARNING [train.py:1204] (2/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_166-128941-0_sp0.9 from training. Duration: 0.6 2023-10-04 10:32:27,624 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.ff2_skip_rate, batch_count=1623240.0, ans=0.0 2023-10-04 10:32:28,625 WARNING [train.py:1204] (2/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_345-167986-0 from training. Duration: 0.84 2023-10-04 10:32:30,125 WARNING [train.py:1204] (2/4) Exclude cut with ID _6275_210_2084_1_1528110062213_7669049_973-198085-0 from training. Duration: 0.75 2023-10-04 10:32:30,199 WARNING [train.py:1204] (2/4) Exclude cut with ID _60057_210_10640_1_1534903065793_3333150_171-181133-0 from training. Duration: 0.88 2023-10-04 10:32:30,219 WARNING [train.py:1204] (2/4) Exclude cut with ID _19504_210_9994_1_1531648863242_3325220_336-196513-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 10:32:30,226 WARNING [train.py:1204] (2/4) Exclude cut with ID _6493_210_1747_1_1528459210424_4012470_513-140967-0 from training. Duration: 0.79 2023-10-04 10:32:32,933 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6673_210_3273_1_1527767790_3173240_327-306185-0_sp0.9 from training. Duration: 0.92225 2023-10-04 10:32:34,643 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=1623240.0, ans=0.1 2023-10-04 10:32:35,700 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1032-236863-0_sp1.1 from training. Duration: 0.763625 2023-10-04 10:32:35,857 WARNING [train.py:1204] (2/4) Exclude cut with ID _14867_210_1968_1_1530684179890_3846969_413-29292-0 from training. Duration: 0.9 2023-10-04 10:32:37,701 INFO [train.py:1046] (2/4) Epoch 46, batch 4450, loss[loss=0.144, simple_loss=0.2328, pruned_loss=0.02763, over 24323.00 frames. ], tot_loss[loss=0.1537, simple_loss=0.2348, pruned_loss=0.03629, over 4707981.67 frames. ], batch size: 74, lr: 2.20e-03, grad_scale: 32.0 2023-10-04 10:32:41,020 WARNING [train.py:1204] (2/4) Exclude cut with ID _47251_210_18628_1_1533814063128_4137250_186-343322-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 10:32:44,077 WARNING [train.py:1204] (2/4) Exclude cut with ID _56235_210_10640_1_1534557335235_4161099_624-46091-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 10:32:44,113 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_791-197891-0_sp0.9 from training. Duration: 0.9333125 2023-10-04 10:32:50,375 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_50050_210_16223_1_1535435796_7290138_786-288457-0_sp1.1 from training. Duration: 0.92725 2023-10-04 10:32:50,392 WARNING [train.py:1204] (2/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_213-227283-0_sp1.1 from training. Duration: 0.963625 2023-10-04 10:32:52,072 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.bypass.scale_min, batch_count=1623373.3333333333, ans=0.2 2023-10-04 10:32:53,401 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.ff3_skip_rate, batch_count=1623373.3333333333, ans=0.0 2023-10-04 10:32:54,412 WARNING [train.py:1204] (2/4) Exclude cut with ID _42659_210_2070_1_1533607442765_3120380_58-341121-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 10:32:57,076 WARNING [train.py:1204] (2/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_506-23200-0_sp1.1 from training. Duration: 0.8818125 2023-10-04 10:32:58,502 WARNING [train.py:1204] (2/4) Exclude cut with ID _14170_210_5219_1_1530525949868_4469650_336-213530-0_sp1.1 from training. Duration: 0.8181875 2023-10-04 10:32:58,528 WARNING [train.py:1204] (2/4) Exclude cut with ID _64135_210_2253_1_1535677267876_7608972_180-5312-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 10:32:59,955 WARNING [train.py:1204] (2/4) Exclude cut with ID _12302_210_5219_1_1529924446466_8107280_835-279754-0 from training. Duration: 0.9 2023-10-04 10:32:59,957 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_20120_210_6753_1_1531643035_5482859_510-96996-0_sp1.1 from training. Duration: 0.7909375 2023-10-04 10:33:00,016 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_15517_210_6753_1_1530954008_3487336_216-187834-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 10:33:01,388 WARNING [train.py:1204] (2/4) Exclude cut with ID _19504_210_9994_1_1531648863242_3325220_414-113427-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 10:33:01,390 WARNING [train.py:1204] (2/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_264-108685-0_sp0.9 from training. Duration: 0.611125 2023-10-04 10:33:04,095 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_5438_210_1794_1_1527336687_2600009_104-288274-0_sp0.9 from training. Duration: 0.6444375 2023-10-04 10:33:05,583 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.bypass.scale_min, batch_count=1623440.0, ans=0.2 2023-10-04 10:33:08,268 WARNING [train.py:1204] (2/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_520-39496-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 10:33:08,319 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_37818_210_4238_1_1533620910_4720083_315-308565-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 10:33:10,229 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_2009_210_2071_1_1525481134_4588937_385-302100-0_sp1.1 from training. Duration: 0.7909375 2023-10-04 10:33:11,592 WARNING [train.py:1204] (2/4) Exclude cut with ID _58895_210_8750_1_1535184081320_3649089_277-53219-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 10:33:12,987 WARNING [train.py:1204] (2/4) Exclude cut with ID _26664_210_7770_1_1532258943280_3418270_121-111258-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 10:33:16,294 WARNING [train.py:1204] (2/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_99-167392-0_sp1.1 from training. Duration: 0.5545625 2023-10-04 10:33:17,601 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1113-7369-0 from training. Duration: 0.85 2023-10-04 10:33:19,007 WARNING [train.py:1204] (2/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_352-59958-0 from training. Duration: 0.86 2023-10-04 10:33:19,012 WARNING [train.py:1204] (2/4) Exclude cut with ID _14886_210_4220_1_1530702020355_3516190_159-73748-0_sp1.1 from training. Duration: 0.7545625 2023-10-04 10:33:19,940 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.0.layers.1.self_attn_weights.whiten_keys, num_groups=4, num_channels=128, metric=2.15 vs. limit=6.0 2023-10-04 10:33:22,234 WARNING [train.py:1204] (2/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_131-119569-0_sp1.1 from training. Duration: 0.92725 2023-10-04 10:33:23,463 WARNING [train.py:1204] (2/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_216-343007-0 from training. Duration: 0.66 2023-10-04 10:33:26,283 WARNING [train.py:1204] (2/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_113-92608-0_sp0.9 from training. Duration: 0.6 2023-10-04 10:33:30,419 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_40047_210_2290_1_1533290519_2374311_132-123271-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 10:33:30,490 WARNING [train.py:1204] (2/4) Exclude cut with ID _8793_210_6310_1_1528542985139_2310009_121-179782-0 from training. Duration: 0.8 2023-10-04 10:33:31,730 WARNING [train.py:1204] (2/4) Exclude cut with ID _43997_210_4525_1_1533538632555_5189329_283-218639-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 10:33:31,733 WARNING [train.py:1204] (2/4) Exclude cut with ID _49376_210_5986_1_1533985278732_3370740_297-78397-0_sp1.1 from training. Duration: 0.936375 2023-10-04 10:33:31,755 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_3475_210_2028_1_1526193578_3622569_377-166133-0_sp0.9 from training. Duration: 0.9555625 2023-10-04 10:33:31,762 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_520-213149-0_sp1.1 from training. Duration: 0.92725 2023-10-04 10:33:34,528 WARNING [train.py:1204] (2/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_567-144483-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 10:33:37,399 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_4183_210_2087_1_1526729100_1869838_143-280962-0_sp0.9 from training. Duration: 0.62225 2023-10-04 10:33:37,436 WARNING [train.py:1204] (2/4) Exclude cut with ID _49643_210_7989_1_1534640244224_3939031_611-185515-0 from training. Duration: 0.94 2023-10-04 10:33:39,419 WARNING [train.py:1204] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_88-341718-0_sp0.9 from training. Duration: 0.7333125 2023-10-04 10:33:39,644 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.self_attn_weights.pos_emb_skip_rate, batch_count=1623573.3333333333, ans=0.0 2023-10-04 10:33:40,772 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_51225_210_3601_1_1534400634_2327383_49-235441-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 10:33:42,766 WARNING [train.py:1204] (2/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_240-294400-0_sp1.1 from training. Duration: 0.936375 2023-10-04 10:33:42,891 WARNING [train.py:1204] (2/4) Exclude cut with ID _55429_210_10626_1_1534467588527_7174143_640-304825-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 10:33:44,092 WARNING [train.py:1204] (2/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_187-221566-0_sp1.1 from training. Duration: 0.3909375 2023-10-04 10:33:45,626 WARNING [train.py:1204] (2/4) Exclude cut with ID _7606_210_4915_1_1528894253277_5080690_127-177761-0_sp0.9 from training. Duration: 0.711125 2023-10-04 10:33:47,495 WARNING [train.py:1204] (2/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_430-116139-0 from training. Duration: 0.85 2023-10-04 10:33:48,890 WARNING [train.py:1204] (2/4) Exclude cut with ID _3675_210_4557_1_1526639934408_4182089_456-18305-0_sp0.9 from training. Duration: 0.8444375 2023-10-04 10:33:49,509 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.3.encoder.layers.2.nonlin_attention.whiten1, num_groups=1, num_channels=384, metric=5.33 vs. limit=10.0 2023-10-04 10:33:51,989 INFO [train.py:1046] (2/4) Epoch 46, batch 4500, loss[loss=0.1423, simple_loss=0.2247, pruned_loss=0.02991, over 24679.00 frames. ], tot_loss[loss=0.1537, simple_loss=0.2347, pruned_loss=0.03633, over 4710177.45 frames. ], batch size: 65, lr: 2.20e-03, grad_scale: 32.0 2023-10-04 10:33:53,408 WARNING [train.py:1204] (2/4) Exclude cut with ID _18513_210_9206_1_1531618215223_3862620_402-5957-0_sp1.1 from training. Duration: 0.9 2023-10-04 10:33:54,895 WARNING [train.py:1204] (2/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_465-156500-0 from training. Duration: 0.72 2023-10-04 10:33:54,896 WARNING [train.py:1204] (2/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_714-310538-0 from training. Duration: 0.86 2023-10-04 10:33:56,101 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.660e+02 2.108e+02 2.336e+02 2.689e+02 4.416e+02, threshold=4.672e+02, percent-clipped=0.0 2023-10-04 10:33:56,229 WARNING [train.py:1204] (2/4) Exclude cut with ID _39411_210_5986_1_1533538825030_3343649_335-301187-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 10:34:00,509 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_41750_210_1960_1_1533520678_3948042_241-158616-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 10:34:01,783 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_46013_210_7234_1_1533776362_3744032_50-141535-0_sp1.1 from training. Duration: 0.9 2023-10-04 10:34:03,137 WARNING [train.py:1204] (2/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_43-206757-0_sp0.9 from training. Duration: 0.8333125 2023-10-04 10:34:03,186 WARNING [train.py:1204] (2/4) Exclude cut with ID _22961_210_8254_1_1531976366672_4278180_172-276296-0_sp1.1 from training. Duration: 0.97275 2023-10-04 10:34:03,216 WARNING [train.py:1204] (2/4) Exclude cut with ID _17956_210_4353_1_1531209568125_6979970_1305-301903-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 10:34:04,489 WARNING [train.py:1204] (2/4) Exclude cut with ID _28323_210_16187_1_1532480395667_4100300_604-44507-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 10:34:17,142 WARNING [train.py:1204] (2/4) Exclude cut with ID _23514_210_1389_1_1531832139785_3031439_88-337544-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 10:34:18,473 WARNING [train.py:1204] (2/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_932-3672-0_sp1.1 from training. Duration: 0.963625 2023-10-04 10:34:20,362 WARNING [train.py:1204] (2/4) Exclude cut with ID _46502_210_13794_1_1533724452198_3657159_207-109991-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 10:34:20,405 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_216-73841-0_sp1.1 from training. Duration: 0.87275 2023-10-04 10:34:21,798 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8453_210_4211_1_1528502940_8562730_347-330673-0_sp1.1 from training. Duration: 0.6545625 2023-10-04 10:34:23,423 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=1623773.3333333333, ans=0.1 2023-10-04 10:34:28,579 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_4183_210_2087_1_1526729100_1869838_143-280962-0_sp1.1 from training. Duration: 0.5090625 2023-10-04 10:34:30,175 WARNING [train.py:1204] (2/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_325-113798-0_sp1.1 from training. Duration: 0.77275 2023-10-04 10:34:34,212 WARNING [train.py:1204] (2/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_91-212486-0_sp0.9 from training. Duration: 0.7555625 2023-10-04 10:34:36,920 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8776_210_2040_1_1528638837_4088036_244-278763-0_sp1.1 from training. Duration: 0.8181875 2023-10-04 10:34:38,167 WARNING [train.py:1204] (2/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_106-286130-0 from training. Duration: 0.98 2023-10-04 10:34:38,235 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_2746_210_1988_1_1525872816_3795627_372-205861-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 10:34:39,984 WARNING [train.py:1204] (2/4) Exclude cut with ID _30800_210_22382_1_1532499347904_4515959_240-54646-0_sp1.1 from training. Duration: 0.936375 2023-10-04 10:34:41,900 WARNING [train.py:1204] (2/4) Exclude cut with ID _59500_210_8254_1_1534809610680_3736009_162-180695-0_sp1.1 from training. Duration: 0.936375 2023-10-04 10:34:43,241 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7544_210_6753_1_1528591327_1471426_146-93357-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 10:34:46,444 WARNING [train.py:1204] (2/4) Exclude cut with ID _42657_210_2070_1_1533520654016_3327008_405-87432-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 10:34:46,475 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_4528_210_3273_1_1527076654_3276777_209-219804-0 from training. Duration: 0.69 2023-10-04 10:34:46,476 WARNING [train.py:1204] (2/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_78-182912-0_sp0.9 from training. Duration: 0.6555625 2023-10-04 10:34:46,483 WARNING [train.py:1204] (2/4) Exclude cut with ID _31255_210_13427_1_1532865619495_3815143_322-114998-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 10:34:51,148 WARNING [train.py:1204] (2/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_848-280957-0_sp0.9 from training. Duration: 0.9666875 2023-10-04 10:34:51,180 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7696_210_2040_1_1528207167_3716099_117-125434-0_sp0.9 from training. Duration: 0.8444375 2023-10-04 10:34:55,171 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_1002-109192-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 10:34:55,346 WARNING [train.py:1204] (2/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_138-64210-0_sp0.9 from training. Duration: 0.811125 2023-10-04 10:34:55,361 WARNING [train.py:1204] (2/4) Exclude cut with ID _25808_210_13292_1_1532343514562_7366109_633-47524-0_sp1.1 from training. Duration: 0.9 2023-10-04 10:34:55,621 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.conv_module1.balancer1.min_positive, batch_count=1623906.6666666667, ans=0.025 2023-10-04 10:34:57,898 WARNING [train.py:1204] (2/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_297-5233-0 from training. Duration: 0.95 2023-10-04 10:34:58,039 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.0.attention_skip_rate, batch_count=1623906.6666666667, ans=0.0 2023-10-04 10:34:59,425 WARNING [train.py:1204] (2/4) Exclude cut with ID _8394_210_6702_1_1528590504316_3540189_335-274780-0 from training. Duration: 0.78 2023-10-04 10:34:59,430 WARNING [train.py:1204] (2/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_301-330455-0 from training. Duration: 0.8 2023-10-04 10:34:59,732 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.self_attn_weights.pos_emb_skip_rate, batch_count=1623906.6666666667, ans=0.0 2023-10-04 10:35:02,411 WARNING [train.py:1204] (2/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_383-205833-0 from training. Duration: 0.95 2023-10-04 10:35:05,110 INFO [train.py:1046] (2/4) Epoch 46, batch 4550, loss[loss=0.1513, simple_loss=0.2312, pruned_loss=0.0357, over 23358.00 frames. ], tot_loss[loss=0.1532, simple_loss=0.2338, pruned_loss=0.03629, over 4700943.77 frames. ], batch size: 119, lr: 2.20e-03, grad_scale: 16.0 2023-10-04 10:35:06,580 WARNING [train.py:1204] (2/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_48-182155-0 from training. Duration: 0.87 2023-10-04 10:35:07,920 WARNING [train.py:1204] (2/4) Exclude cut with ID _14832_210_8750_1_1530691351539_3171348_1-54315-0_sp1.1 from training. Duration: 0.7909375 2023-10-04 10:35:11,095 WARNING [train.py:1204] (2/4) Exclude cut with ID _44982_210_1511_1_1534312820108_3726029_413-100814-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 10:35:12,889 WARNING [train.py:1204] (2/4) Exclude cut with ID _3231_210_2106_1_1526205864934_6784220_104-261392-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 10:35:14,845 WARNING [train.py:1204] (2/4) Exclude cut with ID _24314_210_4915_1_1532135078116_7170179_1-13588-0_sp1.1 from training. Duration: 0.97275 2023-10-04 10:35:18,191 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.conv_module2.balancer2.prob, batch_count=1623973.3333333333, ans=0.125 2023-10-04 10:35:19,362 WARNING [train.py:1204] (2/4) Exclude cut with ID _44304_210_15374_1_1533639352765_4613129_141-8356-0_sp1.1 from training. Duration: 0.963625 2023-10-04 10:35:21,107 WARNING [train.py:1204] (2/4) Exclude cut with ID _37892_210_19596_1_1533198677897_3964058_374-335034-0_sp1.1 from training. Duration: 0.936375 2023-10-04 10:35:23,763 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_195-257512-0_sp0.9 from training. Duration: 0.7444375 2023-10-04 10:35:23,765 WARNING [train.py:1204] (2/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_1267-272693-0_sp1.1 from training. Duration: 0.663625 2023-10-04 10:35:23,766 WARNING [train.py:1204] (2/4) Exclude cut with ID _30196_210_1664_1_1533708496522_7074140_690-326155-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 10:35:27,913 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_68847_210_14656_1_1536070214_2586843_38-20717-0_sp1.1 from training. Duration: 0.97275 2023-10-04 10:35:27,953 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_99-251314-0_sp1.1 from training. Duration: 0.7909375 2023-10-04 10:35:28,237 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.conv_module1.balancer2.prob, batch_count=1624040.0, ans=0.125 2023-10-04 10:35:29,414 WARNING [train.py:1204] (2/4) Exclude cut with ID _49376_210_5986_1_1533985278732_3370740_225-95630-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 10:35:32,273 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_2655_210_2087_1_1525688959_3897400_202-147557-0 from training. Duration: 0.85 2023-10-04 10:35:33,575 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_5618_210_1874_1_1527311008_5851_1-214501-0_sp1.1 from training. Duration: 0.626375 2023-10-04 10:35:34,898 WARNING [train.py:1204] (2/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_225-43381-0_sp1.1 from training. Duration: 0.8545625 2023-10-04 10:35:36,273 WARNING [train.py:1204] (2/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_242-3472-0 from training. Duration: 0.73 2023-10-04 10:35:39,008 WARNING [train.py:1204] (2/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_339-104791-0 from training. Duration: 0.49 2023-10-04 10:35:39,067 WARNING [train.py:1204] (2/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_488-92261-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 10:35:42,394 WARNING [train.py:1204] (2/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_175-163894-0 from training. Duration: 0.74 2023-10-04 10:35:45,505 WARNING [train.py:1204] (2/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_41-234353-0_sp0.9 from training. Duration: 0.7555625 2023-10-04 10:35:48,279 WARNING [train.py:1204] (2/4) Exclude cut with ID _47251_210_18628_1_1533814063128_4137250_116-275927-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 10:35:48,312 WARNING [train.py:1204] (2/4) Exclude cut with ID _4697_210_2069_1_1526815801953_3823098_61-223324-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 10:35:49,605 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6961_210_2087_1_1527937168_6815011_562-165518-0_sp1.1 from training. Duration: 0.7 2023-10-04 10:35:51,447 WARNING [train.py:1204] (2/4) Exclude cut with ID _21989_210_5854_1_1531899214931_3271360_133-44759-0 from training. Duration: 0.77 2023-10-04 10:35:52,371 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.conv_module2.balancer1.prob, batch_count=1624173.3333333333, ans=0.125 2023-10-04 10:35:52,738 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.3.encoder.layers.3.conv_module1.whiten, num_groups=1, num_channels=512, metric=4.71 vs. limit=15.0 2023-10-04 10:35:53,409 WARNING [train.py:1204] (2/4) Exclude cut with ID _23557_210_8750_1_1531890106370_6846255_77-284923-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 10:35:53,706 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.feed_forward3.hidden_balancer.prob, batch_count=1624173.3333333333, ans=0.125 2023-10-04 10:35:54,889 WARNING [train.py:1204] (2/4) Exclude cut with ID _13678_210_7497_1_1530530069226_3463170_464-296127-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 10:35:54,911 WARNING [train.py:1204] (2/4) Exclude cut with ID _22613_210_5219_1_1532253599686_4090170_393-36663-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 10:35:56,365 WARNING [train.py:1204] (2/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_235-262124-0_sp0.9 from training. Duration: 0.7444375 2023-10-04 10:35:57,705 WARNING [train.py:1204] (2/4) Exclude cut with ID _8480_210_1874_1_1528589391205_6728330_755-112625-0 from training. Duration: 0.74 2023-10-04 10:35:57,895 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.balancer1.prob, batch_count=1624173.3333333333, ans=0.125 2023-10-04 10:35:58,974 WARNING [train.py:1204] (2/4) Exclude cut with ID _6456_210_1534_1_1527681634212_3181049_215-309359-0 from training. Duration: 0.88 2023-10-04 10:35:58,994 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_3019_210_2028_1_1526044245_3264865_191-72235-0_sp1.1 from training. Duration: 0.8818125 2023-10-04 10:35:59,070 WARNING [train.py:1204] (2/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_380-162648-0 from training. Duration: 0.94 2023-10-04 10:36:00,499 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_92-46009-0 from training. Duration: 0.54 2023-10-04 10:36:01,772 WARNING [train.py:1204] (2/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_53-300544-0_sp0.9 from training. Duration: 0.7444375 2023-10-04 10:36:03,199 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_68235_210_1925_1_1535863247_4484656_101-250943-0_sp1.1 from training. Duration: 0.97275 2023-10-04 10:36:03,215 WARNING [train.py:1204] (2/4) Exclude cut with ID _14970_210_15709_1_1530775547361_1966965_204-22182-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 10:36:03,470 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.conv_module1.balancer1.prob, batch_count=1624240.0, ans=0.125 2023-10-04 10:36:04,667 WARNING [train.py:1204] (2/4) Exclude cut with ID _55153_210_2253_1_1534384786677_3162096_310-130092-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 10:36:04,678 WARNING [train.py:1204] (2/4) Exclude cut with ID _60113_210_16271_1_1535164212481_4386079_559-167191-0_sp1.1 from training. Duration: 0.8454375 2023-10-04 10:36:07,386 WARNING [train.py:1204] (2/4) Exclude cut with ID _14368_210_4220_1_1530532714870_3595529_93-58002-0_sp1.1 from training. Duration: 0.5181875 2023-10-04 10:36:07,440 WARNING [train.py:1204] (2/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_110-224688-0 from training. Duration: 0.97 2023-10-04 10:36:10,051 WARNING [train.py:1204] (2/4) Exclude cut with ID _40402_210_18219_1_1533522324115_2953150_178-178004-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 10:36:10,068 WARNING [train.py:1204] (2/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_321-335475-0_sp1.1 from training. Duration: 0.4090625 2023-10-04 10:36:10,134 WARNING [train.py:1204] (2/4) Exclude cut with ID _66863_210_18053_1_1535698758334_8590319_1092-271784-0 from training. Duration: 0.96 2023-10-04 10:36:10,140 WARNING [train.py:1204] (2/4) Exclude cut with ID _28317_210_12742_1_1532480302381_8510325_607-213951-0_sp1.1 from training. Duration: 0.763625 2023-10-04 10:36:10,148 WARNING [train.py:1204] (2/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_468-283113-0 from training. Duration: 0.96 2023-10-04 10:36:13,492 WARNING [train.py:1204] (2/4) Exclude cut with ID _14922_210_6228_1_1530707435273_3693310_13-165512-0_sp1.1 from training. Duration: 0.7454375 2023-10-04 10:36:13,503 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_69630_210_7234_1_1535788670_910989_56-169762-0_sp1.1 from training. Duration: 0.92725 2023-10-04 10:36:15,500 WARNING [train.py:1204] (2/4) Exclude cut with ID _67335_210_5219_1_1535525975652_4275229_121-80357-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 10:36:16,892 WARNING [train.py:1204] (2/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_126-12079-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 10:36:16,942 WARNING [train.py:1204] (2/4) Exclude cut with ID _5294_210_1747_1_1527249643928_3696179_342-270278-0_sp1.1 from training. Duration: 0.5 2023-10-04 10:36:19,472 INFO [train.py:1046] (2/4) Epoch 46, batch 4600, loss[loss=0.1433, simple_loss=0.2174, pruned_loss=0.03463, over 24004.00 frames. ], tot_loss[loss=0.1526, simple_loss=0.2332, pruned_loss=0.03598, over 4704166.12 frames. ], batch size: 196, lr: 2.20e-03, grad_scale: 16.0 2023-10-04 10:36:19,556 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_30322_210_1925_1_1532843478_826817_35-328264-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 10:36:21,477 WARNING [train.py:1204] (2/4) Exclude cut with ID _8480_210_1874_1_1528589391205_6728330_755-112625-0_sp1.1 from training. Duration: 0.67275 2023-10-04 10:36:24,215 WARNING [train.py:1204] (2/4) Exclude cut with ID _38670_210_15379_1_1533448810659_4391059_132-146412-0_sp1.1 from training. Duration: 0.97275 2023-10-04 10:36:25,550 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.623e+02 1.983e+02 2.157e+02 2.520e+02 3.849e+02, threshold=4.313e+02, percent-clipped=0.0 2023-10-04 10:36:25,620 WARNING [train.py:1204] (2/4) Exclude cut with ID _22020_210_2461_1_1531724164513_6326900_563-39691-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 10:36:27,212 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.attention_skip_rate, batch_count=1624306.6666666667, ans=0.0 2023-10-04 10:36:28,377 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_9460_210_4211_1_1528954587_10049667_572-37035-0_sp1.1 from training. Duration: 0.72725 2023-10-04 10:36:28,395 WARNING [train.py:1204] (2/4) Exclude cut with ID _68777_210_4353_1_1535810390731_3117904_129-166012-0_sp0.9 from training. Duration: 0.9444375 2023-10-04 10:36:29,765 WARNING [train.py:1204] (2/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_487-91921-0_sp1.1 from training. Duration: 0.963625 2023-10-04 10:36:31,141 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_195-257512-0 from training. Duration: 0.67 2023-10-04 10:36:32,555 WARNING [train.py:1204] (2/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_188-260011-0_sp1.1 from training. Duration: 0.663625 2023-10-04 10:36:34,100 WARNING [train.py:1204] (2/4) Exclude cut with ID _8480_210_1874_1_1528589391205_6728330_309-139896-0_sp0.9 from training. Duration: 0.9 2023-10-04 10:36:34,156 WARNING [train.py:1204] (2/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_866-321248-0_sp1.1 from training. Duration: 0.963625 2023-10-04 10:36:35,715 WARNING [train.py:1204] (2/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_510-216513-0_sp1.1 from training. Duration: 0.97275 2023-10-04 10:36:40,002 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.bypass.scale_min, batch_count=1624373.3333333333, ans=0.2 2023-10-04 10:36:42,831 WARNING [train.py:1204] (2/4) Exclude cut with ID _6653_210_2629_1_1527919172441_6732679_758-143677-0 from training. Duration: 0.98 2023-10-04 10:36:44,231 WARNING [train.py:1204] (2/4) Exclude cut with ID _55134_210_21982_1_1534590028377_3882170_50-214427-0_sp1.1 from training. Duration: 0.97275 2023-10-04 10:36:46,902 WARNING [train.py:1204] (2/4) Exclude cut with ID _31649_210_6228_1_1532584608063_4091078_482-301330-0_sp1.1 from training. Duration: 0.97275 2023-10-04 10:36:48,836 WARNING [train.py:1204] (2/4) Exclude cut with ID _35799_210_5854_1_1533263487887_3529071_490-326847-0_sp1.1 from training. Duration: 0.9 2023-10-04 10:36:48,847 WARNING [train.py:1204] (2/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_984-90716-0_sp1.1 from training. Duration: 0.963625 2023-10-04 10:36:54,943 WARNING [train.py:1204] (2/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_166-246557-0 from training. Duration: 0.64 2023-10-04 10:36:54,943 WARNING [train.py:1204] (2/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_51-200220-0_sp1.1 from training. Duration: 0.5090625 2023-10-04 10:36:55,015 WARNING [train.py:1204] (2/4) Exclude cut with ID _51382_210_23862_1_1534492801838_3650104_275-92796-0_sp1.1 from training. Duration: 0.936375 2023-10-04 10:37:00,491 WARNING [train.py:1204] (2/4) Exclude cut with ID _29853_210_7497_1_1532505600851_6642110_461-142829-0_sp1.1 from training. Duration: 0.97275 2023-10-04 10:37:00,541 WARNING [train.py:1204] (2/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_788-298907-0_sp0.9 from training. Duration: 0.988875 2023-10-04 10:37:01,887 WARNING [train.py:1204] (2/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_739-2098-0_sp1.1 from training. Duration: 0.836375 2023-10-04 10:37:06,005 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7543_210_6753_1_1528541907_1399322_165-323619-0 from training. Duration: 0.69 2023-10-04 10:37:07,351 WARNING [train.py:1204] (2/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_401-255491-0_sp1.1 from training. Duration: 0.636375 2023-10-04 10:37:09,438 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.4.encoder.layers.0.self_attn_weights.whiten_keys, num_groups=4, num_channels=128, metric=1.58 vs. limit=6.0 2023-10-04 10:37:10,280 WARNING [train.py:1204] (2/4) Exclude cut with ID _66873_210_5517_1_1535454136506_4293370_24-143283-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 10:37:11,618 WARNING [train.py:1204] (2/4) Exclude cut with ID _14459_210_2228_1_1531443641946_7516700_1221-280439-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 10:37:13,433 WARNING [train.py:1204] (2/4) Exclude cut with ID _59202_210_26984_1_1534816810612_3549174_469-213482-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 10:37:13,434 WARNING [train.py:1204] (2/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_148-158509-0_sp0.9 from training. Duration: 0.4555625 2023-10-04 10:37:13,474 WARNING [train.py:1204] (2/4) Exclude cut with ID _24314_210_4915_1_1532135078116_7170179_863-259095-0_sp1.1 from training. Duration: 0.97275 2023-10-04 10:37:14,899 WARNING [train.py:1204] (2/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_321-22821-0 from training. Duration: 0.89 2023-10-04 10:37:14,919 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_43831_210_2158_1_1533725502_5872791_55-163137-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 10:37:14,956 WARNING [train.py:1204] (2/4) Exclude cut with ID _27430_210_7770_1_1532604689375_1378590_26-25207-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 10:37:16,835 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_17613_210_2151_1_1531137569_2366160_223-316461-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 10:37:18,888 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_53679_210_4852_1_1534556726_4186141_541-213370-0_sp1.1 from training. Duration: 0.963625 2023-10-04 10:37:20,010 WARNING [train.py:1204] (2/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_369-295086-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 10:37:20,068 WARNING [train.py:1204] (2/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_91-267696-0 from training. Duration: 0.87 2023-10-04 10:37:20,728 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.3.encoder.layers.1.whiten, num_groups=1, num_channels=512, metric=4.65 vs. limit=12.0 2023-10-04 10:37:21,356 WARNING [train.py:1204] (2/4) Exclude cut with ID _5294_210_1747_1_1527249643928_3696179_180-100713-0 from training. Duration: 0.81 2023-10-04 10:37:21,386 WARNING [train.py:1204] (2/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_32-335103-0 from training. Duration: 0.82 2023-10-04 10:37:21,391 WARNING [train.py:1204] (2/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_133-312727-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 10:37:21,654 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.ff3_skip_rate, batch_count=1624573.3333333333, ans=0.0 2023-10-04 10:37:22,834 WARNING [train.py:1204] (2/4) Exclude cut with ID _59288_210_5854_1_1535077757774_3412246_391-165934-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 10:37:24,168 WARNING [train.py:1204] (2/4) Exclude cut with ID _28323_210_16187_1_1532480395667_4100300_488-1281-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 10:37:24,273 WARNING [train.py:1204] (2/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_124-193207-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 10:37:24,432 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.balancer1.prob, batch_count=1624573.3333333333, ans=0.125 2023-10-04 10:37:25,884 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.bypass_mid.scale_min, batch_count=1624573.3333333333, ans=0.2 2023-10-04 10:37:32,677 INFO [train.py:1046] (2/4) Epoch 46, batch 4650, loss[loss=0.1367, simple_loss=0.2147, pruned_loss=0.02941, over 24305.00 frames. ], tot_loss[loss=0.1516, simple_loss=0.2325, pruned_loss=0.03536, over 4706813.14 frames. ], batch size: 56, lr: 2.20e-03, grad_scale: 16.0 2023-10-04 10:37:34,154 WARNING [train.py:1204] (2/4) Exclude cut with ID _14087_210_2253_1_1530529224512_3014029_226-242140-0_sp0.9 from training. Duration: 0.9 2023-10-04 10:37:35,494 WARNING [train.py:1204] (2/4) Exclude cut with ID _29277_210_3279_1_1532581109293_3173069_392-3694-0_sp1.1 from training. Duration: 0.936375 2023-10-04 10:37:36,800 WARNING [train.py:1204] (2/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_563-329842-0_sp1.1 from training. Duration: 0.97275 2023-10-04 10:37:36,840 WARNING [train.py:1204] (2/4) Exclude cut with ID _58583_210_5236_1_1534726835266_3574249_391-87755-0_sp1.1 from training. Duration: 0.92725 2023-10-04 10:37:36,871 WARNING [train.py:1204] (2/4) Exclude cut with ID _28427_210_4220_1_1532581240967_3061069_281-237827-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 10:37:36,885 WARNING [train.py:1204] (2/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_44-147898-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 10:37:38,245 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_24007_210_1794_1_1531918925_1663875_125-320772-0_sp1.1 from training. Duration: 0.97275 2023-10-04 10:37:39,790 WARNING [train.py:1204] (2/4) Exclude cut with ID _14368_210_4220_1_1530532714870_3595529_143-117421-0 from training. Duration: 0.58 2023-10-04 10:37:46,751 WARNING [train.py:1204] (2/4) Exclude cut with ID _69584_210_5219_1_1535785133775_4044450_325-7656-0_sp0.9 from training. Duration: 0.9555625 2023-10-04 10:37:48,077 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_37351_210_7925_1_1533012931_3812113_265-85780-0 from training. Duration: 0.88 2023-10-04 10:37:48,106 WARNING [train.py:1204] (2/4) Exclude cut with ID _18516_210_9206_1_1531704653206_3692406_331-237896-0_sp1.1 from training. Duration: 0.936375 2023-10-04 10:37:50,035 WARNING [train.py:1204] (2/4) Exclude cut with ID _14629_210_15709_1_1530604298182_956899_6-232695-0 from training. Duration: 0.94 2023-10-04 10:37:51,277 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7544_210_6753_1_1528587000_691468_87-178599-0_sp1.1 from training. Duration: 0.7909375 2023-10-04 10:37:51,333 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_326-303945-0 from training. Duration: 0.99 2023-10-04 10:37:51,347 WARNING [train.py:1204] (2/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_503-30719-0 from training. Duration: 0.98 2023-10-04 10:37:51,361 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_11305_210_1794_1_1529750428_7178148_299-344640-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 10:37:52,703 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_53071_210_16554_1_1534490405_2954066_265-170472-0_sp1.1 from training. Duration: 0.7545625 2023-10-04 10:37:55,525 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_174-277364-0_sp1.1 from training. Duration: 0.6454375 2023-10-04 10:37:56,930 WARNING [train.py:1204] (2/4) Exclude cut with ID _58414_210_8254_1_1535421609162_7151182_193-151884-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 10:37:58,128 WARNING [train.py:1204] (2/4) Exclude cut with ID IC0651W0104-322063 from training. Duration: 0.768 2023-10-04 10:38:00,881 WARNING [train.py:1204] (2/4) Exclude cut with ID _21881_210_3525_1_1531825242757_3798380_139-305982-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 10:38:01,004 WARNING [train.py:1204] (2/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_79-200392-0 from training. Duration: 0.97 2023-10-04 10:38:03,765 WARNING [train.py:1204] (2/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_451-280894-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 10:38:03,778 WARNING [train.py:1204] (2/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_304-273433-0_sp1.1 from training. Duration: 0.9 2023-10-04 10:38:04,030 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.bypass.skip_rate, batch_count=1624773.3333333333, ans=0.04949747468305833 2023-10-04 10:38:05,165 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_5959_210_1794_1_1527400400_3445726_412-44052-0 from training. Duration: 0.77 2023-10-04 10:38:06,533 WARNING [train.py:1204] (2/4) Exclude cut with ID _4681_210_3525_1_1527251040321_1235713_148-26159-0_sp1.1 from training. Duration: 0.963625 2023-10-04 10:38:09,231 WARNING [train.py:1204] (2/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_887-249555-0_sp0.9 from training. Duration: 0.8555625 2023-10-04 10:38:12,053 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_25236_210_2158_1_1532051968_280567_37-6860-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 10:38:20,465 WARNING [train.py:1204] (2/4) Exclude cut with ID _29277_210_3279_1_1532581109293_3173069_417-115527-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 10:38:21,404 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.self_attn_weights.pos_emb_skip_rate, batch_count=1624840.0, ans=0.0 2023-10-04 10:38:22,470 WARNING [train.py:1204] (2/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_552-134748-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 10:38:22,535 WARNING [train.py:1204] (2/4) Exclude cut with ID _43176_210_8254_1_1533524417469_4174339_188-174143-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 10:38:22,573 WARNING [train.py:1204] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_127-119109-0_sp1.1 from training. Duration: 0.6545625 2023-10-04 10:38:23,347 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.1.encoder.layers.0.feed_forward3.out_whiten, num_groups=1, num_channels=256, metric=10.93 vs. limit=15.0 2023-10-04 10:38:24,097 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.0.conv_module1.balancer1.prob, batch_count=1624840.0, ans=0.125 2023-10-04 10:38:24,165 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.ff3_skip_rate, batch_count=1624840.0, ans=0.0 2023-10-04 10:38:25,258 WARNING [train.py:1204] (2/4) Exclude cut with ID _14922_210_6228_1_1530707435273_3693310_38-187851-0 from training. Duration: 0.62 2023-10-04 10:38:25,304 WARNING [train.py:1204] (2/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_588-125165-0 from training. Duration: 0.83 2023-10-04 10:38:25,537 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.conv_module1.balancer2.min_positive, batch_count=1624840.0, ans=0.05 2023-10-04 10:38:26,700 WARNING [train.py:1204] (2/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_344-152526-0_sp1.1 from training. Duration: 0.4818125 2023-10-04 10:38:26,701 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7002_210_1794_1_1527935634_4034444_487-236501-0 from training. Duration: 0.66 2023-10-04 10:38:26,831 WARNING [train.py:1204] (2/4) Exclude cut with ID _23825_210_13449_1_1531915313526_3497150_379-16981-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 10:38:34,859 WARNING [train.py:1204] (2/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_297-5233-0_sp1.1 from training. Duration: 0.863625 2023-10-04 10:38:34,864 WARNING [train.py:1204] (2/4) Exclude cut with ID _41298_210_9019_1_1533430371450_4822143_178-283538-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 10:38:34,900 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7046_210_6753_1_1527895337_6920917_642-106799-0 from training. Duration: 0.92 2023-10-04 10:38:34,914 WARNING [train.py:1204] (2/4) Exclude cut with ID _6274_210_2084_1_1527937583837_7494020_532-291952-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 10:38:37,569 WARNING [train.py:1204] (2/4) Exclude cut with ID _47278_210_12041_1_1533902418349_3576110_260-270468-0_sp1.1 from training. Duration: 0.92725 2023-10-04 10:38:37,574 WARNING [train.py:1204] (2/4) Exclude cut with ID _14368_210_4220_1_1530532714870_3595529_144-3287-0_sp0.9 from training. Duration: 0.7444375 2023-10-04 10:38:37,690 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_162-262717-0_sp0.9 from training. Duration: 0.92225 2023-10-04 10:38:41,661 WARNING [train.py:1204] (2/4) Exclude cut with ID _3465_210_3525_1_1526175667060_3574049_62-335552-0_sp0.9 from training. Duration: 0.9666875 2023-10-04 10:38:41,666 WARNING [train.py:1204] (2/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_726-272852-0_sp1.1 from training. Duration: 0.92725 2023-10-04 10:38:41,744 WARNING [train.py:1204] (2/4) Exclude cut with ID _14898_210_9206_1_1530766812377_4177190_26-135492-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 10:38:43,397 WARNING [train.py:1204] (2/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_200-95304-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 10:38:44,743 WARNING [train.py:1204] (2/4) Exclude cut with ID _45012_210_12651_1_1533608435136_5371380_240-37101-0_sp1.1 from training. Duration: 0.8454375 2023-10-04 10:38:44,751 WARNING [train.py:1204] (2/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_132-269633-0_sp0.9 from training. Duration: 0.7555625 2023-10-04 10:38:44,832 WARNING [train.py:1204] (2/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_907-151398-0_sp0.9 from training. Duration: 0.411125 2023-10-04 10:38:46,695 INFO [train.py:1046] (2/4) Epoch 46, batch 4700, loss[loss=0.1662, simple_loss=0.2516, pruned_loss=0.04041, over 23895.00 frames. ], tot_loss[loss=0.1523, simple_loss=0.2332, pruned_loss=0.0357, over 4709539.21 frames. ], batch size: 86, lr: 2.20e-03, grad_scale: 16.0 2023-10-04 10:38:48,510 WARNING [train.py:1204] (2/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_45-265684-0_sp0.9 from training. Duration: 0.8 2023-10-04 10:38:50,471 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6291_210_3273_1_1527683649_1078498_74-292608-0 from training. Duration: 0.72 2023-10-04 10:38:53,569 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.606e+02 2.144e+02 2.474e+02 2.959e+02 4.299e+02, threshold=4.947e+02, percent-clipped=0.0 2023-10-04 10:38:57,889 WARNING [train.py:1204] (2/4) Exclude cut with ID _59202_210_26984_1_1534816810612_3549174_221-84819-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 10:38:59,240 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_33696_210_3852_1_1533082780_2663375_181-325595-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 10:38:59,278 WARNING [train.py:1204] (2/4) Exclude cut with ID _47251_210_18628_1_1533814063128_4137250_351-154105-0_sp1.1 from training. Duration: 0.963625 2023-10-04 10:39:02,015 WARNING [train.py:1204] (2/4) Exclude cut with ID _35501_210_9288_1_1532998507986_3192189_351-334740-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 10:39:03,404 WARNING [train.py:1204] (2/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_342-100157-0_sp1.1 from training. Duration: 0.4909375 2023-10-04 10:39:05,248 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.conv_skip_rate, batch_count=1625040.0, ans=0.0 2023-10-04 10:39:06,279 WARNING [train.py:1204] (2/4) Exclude cut with ID _6789_210_1534_1_1527857937218_3118950_262-48853-0 from training. Duration: 0.56 2023-10-04 10:39:06,316 WARNING [train.py:1204] (2/4) Exclude cut with ID _69884_210_9771_1_1535846392818_4166169_430-201020-0 from training. Duration: 0.97 2023-10-04 10:39:09,087 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_71-136518-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 10:39:09,174 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_28-291982-0_sp1.1 from training. Duration: 0.8818125 2023-10-04 10:39:10,474 WARNING [train.py:1204] (2/4) Exclude cut with ID _54218_210_12210_1_1534413646388_4409259_437-275326-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 10:39:13,292 WARNING [train.py:1204] (2/4) Exclude cut with ID _55134_210_21982_1_1534590028377_3882170_40-309139-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 10:39:18,220 WARNING [train.py:1204] (2/4) Exclude cut with ID _32704_210_8954_1_1533017488840_6747370_266-329755-0_sp1.1 from training. Duration: 0.8090625 2023-10-04 10:39:20,186 WARNING [train.py:1204] (2/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_21-174514-0_sp1.1 from training. Duration: 0.3909375 2023-10-04 10:39:23,519 WARNING [train.py:1204] (2/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_409-207378-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 10:39:24,179 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.feed_forward2.out_whiten.whitening_limit, batch_count=1625106.6666666667, ans=15.0 2023-10-04 10:39:27,794 WARNING [train.py:1204] (2/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_132-269633-0 from training. Duration: 0.68 2023-10-04 10:39:29,210 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_2245_210_2715_1_1525511548_3540254_310-52483-0_sp0.9 from training. Duration: 0.97775 2023-10-04 10:39:30,621 WARNING [train.py:1204] (2/4) Exclude cut with ID _27430_210_7770_1_1532604689375_1378590_47-77403-0_sp1.1 from training. Duration: 0.92725 2023-10-04 10:39:34,755 WARNING [train.py:1204] (2/4) Exclude cut with ID _7562_210_2084_1_1528887767916_3946229_186-326422-0 from training. Duration: 0.84 2023-10-04 10:39:34,863 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_14056_210_4163_1_1530604483_7572827_230-35993-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 10:39:38,892 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_23854_210_1794_1_1531969696_1329157_57-123491-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 10:39:38,931 WARNING [train.py:1204] (2/4) Exclude cut with ID _14747_210_8341_1_1530685797463_3654089_147-131229-0 from training. Duration: 0.76 2023-10-04 10:39:40,344 WARNING [train.py:1204] (2/4) Exclude cut with ID _30191_210_1664_1_1533189915047_7196670_430-19539-0_sp1.1 from training. Duration: 0.92725 2023-10-04 10:39:40,358 WARNING [train.py:1204] (2/4) Exclude cut with ID _67725_210_27767_1_1535608756671_5105359_44-50762-0_sp1.1 from training. Duration: 0.963625 2023-10-04 10:39:42,020 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.conv_module1.balancer1.prob, batch_count=1625173.3333333333, ans=0.125 2023-10-04 10:39:44,252 WARNING [train.py:1204] (2/4) Exclude cut with ID _27485_210_4916_1_1532739191677_3668269_217-161755-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 10:39:44,304 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_5931_210_2040_1_1527428481_4547999_321-193614-0_sp1.1 from training. Duration: 0.6090625 2023-10-04 10:39:46,181 WARNING [train.py:1204] (2/4) Exclude cut with ID _6267_210_2084_1_1527764658526_3894730_375-219887-0 from training. Duration: 0.67 2023-10-04 10:39:47,491 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9087W0022-538393 from training. Duration: 0.768 2023-10-04 10:39:49,297 WARNING [train.py:1204] (2/4) Exclude cut with ID _68777_210_4353_1_1535810390731_3117904_83-196459-0_sp1.1 from training. Duration: 0.963625 2023-10-04 10:39:51,326 WARNING [train.py:1204] (2/4) Exclude cut with ID _69884_210_9771_1_1535846392818_4166169_455-343812-0_sp1.1 from training. Duration: 0.92725 2023-10-04 10:39:51,326 WARNING [train.py:1204] (2/4) Exclude cut with ID _13916_210_9994_1_1530788341126_3763149_414-320174-0_sp1.1 from training. Duration: 0.92725 2023-10-04 10:39:51,330 WARNING [train.py:1204] (2/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_398-239819-0 from training. Duration: 0.99 2023-10-04 10:39:51,414 WARNING [train.py:1204] (2/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_199-232314-0_sp1.1 from training. Duration: 0.92725 2023-10-04 10:39:53,556 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.0.feed_forward1.hidden_balancer.prob, batch_count=1625240.0, ans=0.125 2023-10-04 10:39:56,139 WARNING [train.py:1204] (2/4) Exclude cut with ID _2334_210_2667_1_1525608238880_4705129_459-104997-0 from training. Duration: 0.95 2023-10-04 10:40:00,192 WARNING [train.py:1204] (2/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_441-209395-0_sp1.1 from training. Duration: 0.936375 2023-10-04 10:40:00,275 WARNING [train.py:1204] (2/4) Exclude cut with ID _27052_210_5986_1_1532329197187_3585009_320-80393-0_sp1.1 from training. Duration: 0.97275 2023-10-04 10:40:01,522 INFO [train.py:1046] (2/4) Epoch 46, batch 4750, loss[loss=0.1508, simple_loss=0.2259, pruned_loss=0.03787, over 23646.00 frames. ], tot_loss[loss=0.1536, simple_loss=0.2346, pruned_loss=0.03633, over 4709396.39 frames. ], batch size: 256, lr: 2.20e-03, grad_scale: 8.0 2023-10-04 10:40:03,892 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.0.layers.1.self_attn2.whiten, num_groups=1, num_channels=192, metric=10.37 vs. limit=22.5 2023-10-04 10:40:05,773 WARNING [train.py:1204] (2/4) Exclude cut with ID _21783_210_11357_1_1531742458730_3762679_89-217253-0_sp1.1 from training. Duration: 0.97275 2023-10-04 10:40:05,789 WARNING [train.py:1204] (2/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_453-282588-0_sp1.1 from training. Duration: 0.8909375 2023-10-04 10:40:06,103 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.balancer2.prob, batch_count=1625306.6666666667, ans=0.125 2023-10-04 10:40:07,264 WARNING [train.py:1204] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_88-341718-0 from training. Duration: 0.66 2023-10-04 10:40:07,322 WARNING [train.py:1204] (2/4) Exclude cut with ID _43552_210_8913_1_1533726031059_3523429_230-3521-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 10:40:10,225 WARNING [train.py:1204] (2/4) Exclude cut with ID _9679_210_7497_1_1529129212906_4363929_512-313889-0 from training. Duration: 0.81 2023-10-04 10:40:11,619 WARNING [train.py:1204] (2/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_194-252800-0_sp1.1 from training. Duration: 0.8818125 2023-10-04 10:40:12,950 WARNING [train.py:1204] (2/4) Exclude cut with ID _55209_210_1531_1_1534675828996_4597160_239-293541-0_sp1.1 from training. Duration: 0.963625 2023-10-04 10:40:12,997 WARNING [train.py:1204] (2/4) Exclude cut with ID _35799_210_5854_1_1533263487887_3529071_15-325131-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 10:40:16,044 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.bypass.skip_rate, batch_count=1625373.3333333333, ans=0.07 2023-10-04 10:40:17,275 WARNING [train.py:1204] (2/4) Exclude cut with ID _14100_210_8341_1_1530529320613_3678069_279-55760-0 from training. Duration: 0.93 2023-10-04 10:40:22,454 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_489-230703-0_sp1.1 from training. Duration: 0.77275 2023-10-04 10:40:24,258 WARNING [train.py:1204] (2/4) Exclude cut with ID _6694_210_4599_1_1527852637490_3965529_144-185051-0 from training. Duration: 0.96 2023-10-04 10:40:24,329 WARNING [train.py:1204] (2/4) Exclude cut with ID _28461_210_2253_1_1532771775189_3041299_1-228155-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 10:40:28,386 WARNING [train.py:1204] (2/4) Exclude cut with ID _32704_210_8954_1_1533017488840_6747370_686-252323-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 10:40:28,388 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_40896_210_15710_1_1533433956_7100802_1075-294633-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 10:40:28,402 WARNING [train.py:1204] (2/4) Exclude cut with ID _22083_210_8254_1_1531702862439_3678970_30-92879-0_sp1.1 from training. Duration: 0.97275 2023-10-04 10:40:29,224 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.2.encoder.layers.1.self_attn2.whiten, num_groups=1, num_channels=384, metric=15.69 vs. limit=22.5 2023-10-04 10:40:29,805 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9245W0121-616888 from training. Duration: 0.896 2023-10-04 10:40:29,808 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6786_210_1388_1_1527839699_4194495_378-46070-0 from training. Duration: 0.72 2023-10-04 10:40:30,033 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.conv_module2.balancer2.prob, batch_count=1625440.0, ans=0.125 2023-10-04 10:40:35,552 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.conv_skip_rate, batch_count=1625440.0, ans=0.0 2023-10-04 10:40:36,604 WARNING [train.py:1204] (2/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_317-175062-0 from training. Duration: 0.82 2023-10-04 10:40:36,837 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.feed_forward3.hidden_balancer.prob, batch_count=1625440.0, ans=0.125 2023-10-04 10:40:39,329 WARNING [train.py:1204] (2/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_238-89390-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 10:40:40,780 WARNING [train.py:1204] (2/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_524-310811-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 10:40:43,583 WARNING [train.py:1204] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_307-158462-0_sp0.9 from training. Duration: 0.7666875 2023-10-04 10:40:43,584 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9309W0191-640318 from training. Duration: 0.512 2023-10-04 10:40:43,588 WARNING [train.py:1204] (2/4) Exclude cut with ID _55153_210_2253_1_1534384786677_3162096_100-204617-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 10:40:46,412 WARNING [train.py:1204] (2/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_112-149213-0_sp1.1 from training. Duration: 0.863625 2023-10-04 10:40:48,104 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.balancer1.prob, batch_count=1625506.6666666667, ans=0.125 2023-10-04 10:40:49,189 WARNING [train.py:1204] (2/4) Exclude cut with ID _67831_210_12392_1_1535612464202_3092612_346-2148-0_sp1.1 from training. Duration: 0.8454375 2023-10-04 10:40:52,424 WARNING [train.py:1204] (2/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_51-117576-0_sp0.9 from training. Duration: 0.4444375 2023-10-04 10:40:52,455 WARNING [train.py:1204] (2/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_300-27235-0 from training. Duration: 0.87 2023-10-04 10:40:53,788 WARNING [train.py:1204] (2/4) Exclude cut with ID _40399_210_4353_1_1533305658194_3279406_195-116825-0_sp1.1 from training. Duration: 0.97275 2023-10-04 10:40:54,471 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7002_210_1794_1_1527939683_5157286_163-87552-0_sp0.9 from training. Duration: 0.9555625 2023-10-04 10:40:54,514 WARNING [train.py:1204] (2/4) Exclude cut with ID _19514_210_9259_1_1531530071330_5084230_560-237597-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 10:40:57,115 WARNING [train.py:1204] (2/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_41-234353-0_sp1.1 from training. Duration: 0.6181875 2023-10-04 10:40:57,139 WARNING [train.py:1204] (2/4) Exclude cut with ID _6274_210_2084_1_1527937583837_7494020_769-168302-0 from training. Duration: 0.71 2023-10-04 10:40:59,084 WARNING [train.py:1204] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_574-45541-0 from training. Duration: 0.65 2023-10-04 10:41:03,163 WARNING [train.py:1204] (2/4) Exclude cut with ID _50310_210_15488_1_1534068043345_3641080_262-67120-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 10:41:04,689 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_53071_210_16554_1_1534493434_1276796_35-11927-0_sp1.1 from training. Duration: 0.963625 2023-10-04 10:41:04,690 WARNING [train.py:1204] (2/4) Exclude cut with ID _3992_210_1747_1_1526644816031_3731060_107-251434-0 from training. Duration: 0.99 2023-10-04 10:41:05,978 WARNING [train.py:1204] (2/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_412-54488-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 10:41:07,361 WARNING [train.py:1204] (2/4) Exclude cut with ID _18516_210_9206_1_1531704653206_3692406_77-258490-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 10:41:08,831 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_40047_210_2290_1_1533290519_2374311_195-165693-0_sp0.9 from training. Duration: 0.988875 2023-10-04 10:41:10,087 WARNING [train.py:1204] (2/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_296-36971-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 10:41:10,141 WARNING [train.py:1204] (2/4) Exclude cut with ID _14970_210_15709_1_1530773543493_1715730_187-20259-0_sp1.1 from training. Duration: 0.8090625 2023-10-04 10:41:12,949 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_53373_210_5399_1_1534229471_4715695_135-305927-0_sp1.1 from training. Duration: 0.936375 2023-10-04 10:41:12,981 WARNING [train.py:1204] (2/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_60-18123-0 from training. Duration: 0.88 2023-10-04 10:41:14,302 INFO [train.py:1046] (2/4) Epoch 46, batch 4800, loss[loss=0.1645, simple_loss=0.2535, pruned_loss=0.03777, over 24463.00 frames. ], tot_loss[loss=0.1542, simple_loss=0.2354, pruned_loss=0.03655, over 4707442.62 frames. ], batch size: 69, lr: 2.20e-03, grad_scale: 16.0 2023-10-04 10:41:14,349 WARNING [train.py:1204] (2/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_166-166990-0 from training. Duration: 0.96 2023-10-04 10:41:14,432 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_5438_210_1794_1_1527336687_2600009_233-58072-0 from training. Duration: 0.77 2023-10-04 10:41:15,940 WARNING [train.py:1204] (2/4) Exclude cut with ID _8833_210_5219_1_1528592665210_7553059_611-80313-0_sp0.9 from training. Duration: 0.711125 2023-10-04 10:41:17,179 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_53555_210_5632_1_1534595220_7232297_118-43239-0_sp1.1 from training. Duration: 0.936375 2023-10-04 10:41:18,582 WARNING [train.py:1204] (2/4) Exclude cut with ID _12369_210_4381_1_1530356078581_4388520_480-13146-0 from training. Duration: 0.85 2023-10-04 10:41:21,365 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.563e+02 1.989e+02 2.249e+02 2.577e+02 3.690e+02, threshold=4.497e+02, percent-clipped=0.0 2023-10-04 10:41:22,771 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_892-28814-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 10:41:22,822 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_24899_210_16554_1_1532046422_3827462_160-21255-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 10:41:28,389 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_154-316450-0_sp1.1 from training. Duration: 0.6909375 2023-10-04 10:41:29,802 WARNING [train.py:1204] (2/4) Exclude cut with ID _30196_210_1664_1_1533708496522_7074140_1225-301232-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 10:41:29,842 WARNING [train.py:1204] (2/4) Exclude cut with ID _28783_210_7943_1_1532671164808_7155479_414-208100-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 10:41:30,012 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=1625706.6666666667, ans=0.1 2023-10-04 10:41:31,296 WARNING [train.py:1204] (2/4) Exclude cut with ID _19023_210_4353_1_1531378892552_5820780_83-254794-0 from training. Duration: 0.86 2023-10-04 10:41:31,364 WARNING [train.py:1204] (2/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_591-83752-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 10:41:32,656 WARNING [train.py:1204] (2/4) Exclude cut with ID _29877_210_6228_1_1532397791106_5276149_266-62291-0_sp1.1 from training. Duration: 0.7909375 2023-10-04 10:41:34,085 WARNING [train.py:1204] (2/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_280-161633-0_sp1.1 from training. Duration: 0.663625 2023-10-04 10:41:35,578 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.ff3_skip_rate, batch_count=1625706.6666666667, ans=0.0 2023-10-04 10:41:36,970 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=1625706.6666666667, ans=0.1 2023-10-04 10:41:38,101 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7543_210_6753_1_1528539468_333764_39-156634-0_sp1.1 from training. Duration: 0.97275 2023-10-04 10:41:40,716 WARNING [train.py:1204] (2/4) Exclude cut with ID _67499_210_17282_1_1535509805220_3692430_505-312248-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 10:41:40,740 WARNING [train.py:1204] (2/4) Exclude cut with ID _5294_210_1747_1_1527249643928_3696179_180-100713-0_sp0.9 from training. Duration: 0.9 2023-10-04 10:41:42,162 WARNING [train.py:1204] (2/4) Exclude cut with ID _26528_210_9070_1_1532570410842_3851310_508-148132-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 10:41:42,179 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_211-339622-0_sp1.1 from training. Duration: 0.4090625 2023-10-04 10:41:42,191 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_40896_210_15710_1_1533433956_7100802_339-138356-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 10:41:43,523 WARNING [train.py:1204] (2/4) Exclude cut with ID _27060_210_5986_1_1532674838922_3522549_211-19933-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 10:41:45,276 WARNING [train.py:1204] (2/4) Exclude cut with ID _60229_210_27767_1_1534838406158_3655350_457-117923-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 10:41:48,122 WARNING [train.py:1204] (2/4) Exclude cut with ID _55429_210_10626_1_1534467588527_7174143_460-141439-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 10:41:49,490 WARNING [train.py:1204] (2/4) Exclude cut with ID _59170_210_6721_1_1534983633960_4203335_263-10780-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 10:41:49,503 WARNING [train.py:1204] (2/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_872-264525-0_sp0.9 from training. Duration: 0.72225 2023-10-04 10:41:51,557 WARNING [train.py:1204] (2/4) Exclude cut with ID _7624_210_1531_1_1528109802198_4933101_418-349834-0_sp0.9 from training. Duration: 0.6666875 2023-10-04 10:41:51,796 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.ff2_skip_rate, batch_count=1625773.3333333333, ans=0.0 2023-10-04 10:41:52,829 WARNING [train.py:1204] (2/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_27-53998-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 10:41:55,358 WARNING [train.py:1204] (2/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_202-183922-0 from training. Duration: 0.85 2023-10-04 10:41:55,383 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7002_210_1794_1_1527939683_5157286_1-281264-0 from training. Duration: 0.44 2023-10-04 10:41:55,452 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_53753_210_7234_1_1534320003_3594148_493-9314-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 10:41:57,139 WARNING [train.py:1204] (2/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_217-95056-0_sp1.1 from training. Duration: 0.8909375 2023-10-04 10:41:57,176 WARNING [train.py:1204] (2/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_406-45086-0_sp1.1 from training. Duration: 0.72725 2023-10-04 10:41:57,182 WARNING [train.py:1204] (2/4) Exclude cut with ID _31649_210_6228_1_1532584608063_4091078_505-213821-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 10:41:57,198 WARNING [train.py:1204] (2/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_317-175062-0_sp0.9 from training. Duration: 0.911125 2023-10-04 10:41:58,635 WARNING [train.py:1204] (2/4) Exclude cut with ID _5444_210_2151_1_1527418808464_5312210_726-27165-0_sp1.1 from training. Duration: 0.6090625 2023-10-04 10:42:00,477 WARNING [train.py:1204] (2/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_494-170437-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 10:42:05,098 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_474-64054-0_sp1.1 from training. Duration: 0.936375 2023-10-04 10:42:06,631 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.0.nonlin_attention.balancer.prob, batch_count=1625840.0, ans=0.125 2023-10-04 10:42:07,841 WARNING [train.py:1204] (2/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_521-7556-0_sp1.1 from training. Duration: 0.97275 2023-10-04 10:42:09,238 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_269-348505-0_sp1.1 from training. Duration: 0.92725 2023-10-04 10:42:12,235 WARNING [train.py:1204] (2/4) Exclude cut with ID _21878_210_3525_1_1531738772002_7264330_559-184862-0 from training. Duration: 0.94 2023-10-04 10:42:12,271 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_68702_210_23689_1_1535799577_3420068_92-76040-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 10:42:13,516 WARNING [train.py:1204] (2/4) Exclude cut with ID _22826_210_9994_1_1531908201522_3279169_227-102052-0_sp1.1 from training. Duration: 0.97275 2023-10-04 10:42:13,546 WARNING [train.py:1204] (2/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_718-250907-0_sp0.9 from training. Duration: 0.7666875 2023-10-04 10:42:15,033 WARNING [train.py:1204] (2/4) Exclude cut with ID _14069_210_5632_1_1530512692033_4803390_352-143891-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 10:42:16,579 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.bypass.scale_min, batch_count=1625906.6666666667, ans=0.2 2023-10-04 10:42:18,978 WARNING [train.py:1204] (2/4) Exclude cut with ID _6267_210_2084_1_1527764658526_3894730_65-106602-0_sp1.1 from training. Duration: 0.936375 2023-10-04 10:42:20,415 WARNING [train.py:1204] (2/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_202-183922-0_sp0.9 from training. Duration: 0.9444375 2023-10-04 10:42:20,424 WARNING [train.py:1204] (2/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_465-268827-0_sp1.1 from training. Duration: 0.97275 2023-10-04 10:42:20,485 WARNING [train.py:1204] (2/4) Exclude cut with ID _12369_210_4381_1_1530356078581_4388520_58-29064-0_sp1.1 from training. Duration: 0.9 2023-10-04 10:42:22,318 WARNING [train.py:1204] (2/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_1-99680-0_sp0.9 from training. Duration: 0.7444375 2023-10-04 10:42:22,363 WARNING [train.py:1204] (2/4) Exclude cut with ID _13253_210_1925_1_1530757775819_3577873_191-144977-0_sp1.1 from training. Duration: 0.7090625 2023-10-04 10:42:25,239 WARNING [train.py:1204] (2/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_276-112684-0_sp1.1 from training. Duration: 0.92725 2023-10-04 10:42:25,248 WARNING [train.py:1204] (2/4) Exclude cut with ID _64639_210_5816_1_1535367619437_3528000_358-411-0_sp1.1 from training. Duration: 0.97275 2023-10-04 10:42:25,254 WARNING [train.py:1204] (2/4) Exclude cut with ID _31649_210_6228_1_1532584608063_4091078_106-21199-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 10:42:26,700 WARNING [train.py:1204] (2/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_901-79438-0 from training. Duration: 0.94 2023-10-04 10:42:28,549 INFO [train.py:1046] (2/4) Epoch 46, batch 4850, loss[loss=0.1491, simple_loss=0.2334, pruned_loss=0.03243, over 23267.00 frames. ], tot_loss[loss=0.1545, simple_loss=0.2354, pruned_loss=0.03676, over 4699568.30 frames. ], batch size: 105, lr: 2.20e-03, grad_scale: 16.0 2023-10-04 10:42:29,963 WARNING [train.py:1204] (2/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_718-250907-0 from training. Duration: 0.69 2023-10-04 10:42:29,969 WARNING [train.py:1204] (2/4) Exclude cut with ID _5287_210_2084_1_1527418883040_3878670_363-194161-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 10:42:29,972 WARNING [train.py:1204] (2/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_586-321321-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 10:42:30,060 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_68900_210_2087_1_1535713157_3708529_164-292661-0_sp1.1 from training. Duration: 0.963625 2023-10-04 10:42:30,061 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_26433_210_12108_1_1532157158_4056480_114-144878-0_sp1.1 from training. Duration: 0.97275 2023-10-04 10:42:35,262 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_15517_210_6753_1_1530954008_3487336_96-341874-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 10:42:36,911 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.feed_forward1.out_proj.dropout_p, batch_count=1625973.3333333333, ans=0.1 2023-10-04 10:42:40,743 WARNING [train.py:1204] (2/4) Exclude cut with ID _14268_210_9206_1_1530622771285_4714099_637-86630-0 from training. Duration: 0.51 2023-10-04 10:42:42,150 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_28211_210_6388_1_1532861506_4523224_121-320092-0_sp1.1 from training. Duration: 0.92725 2023-10-04 10:42:47,570 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_141-89113-0_sp0.9 from training. Duration: 0.9555625 2023-10-04 10:42:47,663 WARNING [train.py:1204] (2/4) Exclude cut with ID _6789_210_1534_1_1527857937218_3118950_311-61262-0_sp1.1 from training. Duration: 0.5909375 2023-10-04 10:42:48,986 WARNING [train.py:1204] (2/4) Exclude cut with ID _58480_210_12392_1_1534811121111_3646160_389-200461-0_sp1.1 from training. Duration: 0.97275 2023-10-04 10:42:52,596 WARNING [train.py:1204] (2/4) Exclude cut with ID _41326_210_5854_1_1533778076509_3492080_28-156553-0_sp1.1 from training. Duration: 0.92725 2023-10-04 10:42:53,938 WARNING [train.py:1204] (2/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_577-287150-0_sp1.1 from training. Duration: 0.7454375 2023-10-04 10:42:55,337 WARNING [train.py:1204] (2/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_40-129756-0_sp1.1 from training. Duration: 0.736375 2023-10-04 10:42:55,347 WARNING [train.py:1204] (2/4) Exclude cut with ID _60113_210_16271_1_1535164212481_4386079_490-280290-0 from training. Duration: 0.98 2023-10-04 10:42:58,820 WARNING [train.py:1204] (2/4) Exclude cut with ID _58059_210_5854_1_1534901471149_3164808_34-39905-0_sp1.1 from training. Duration: 0.963625 2023-10-04 10:43:00,244 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_791-197891-0_sp1.1 from training. Duration: 0.763625 2023-10-04 10:43:01,544 WARNING [train.py:1204] (2/4) Exclude cut with ID _6789_210_1534_1_1527857937218_3118950_262-48853-0_sp1.1 from training. Duration: 0.5090625 2023-10-04 10:43:01,615 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_2655_210_2087_1_1525688959_3897400_52-279899-0_sp0.9 from training. Duration: 0.7666875 2023-10-04 10:43:01,619 WARNING [train.py:1204] (2/4) Exclude cut with ID _14884_210_2252_1_1530707459689_5561250_114-201399-0 from training. Duration: 0.76 2023-10-04 10:43:04,939 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.1.nonlin_attention.balancer.prob, batch_count=1626106.6666666667, ans=0.125 2023-10-04 10:43:06,127 WARNING [train.py:1204] (2/4) Exclude cut with ID _19023_210_4353_1_1531378892552_5820780_83-254794-0_sp0.9 from training. Duration: 0.9555625 2023-10-04 10:43:06,142 WARNING [train.py:1204] (2/4) Exclude cut with ID _53489_210_6228_1_1534298357169_3835970_10-297919-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 10:43:10,845 WARNING [train.py:1204] (2/4) Exclude cut with ID _44585_210_18628_1_1533605350387_4518199_418-3055-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 10:43:10,870 WARNING [train.py:1204] (2/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_310-109461-0 from training. Duration: 0.8 2023-10-04 10:43:12,185 WARNING [train.py:1204] (2/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_235-262124-0 from training. Duration: 0.67 2023-10-04 10:43:12,262 WARNING [train.py:1204] (2/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_195-167011-0_sp1.1 from training. Duration: 0.6818125 2023-10-04 10:43:17,992 WARNING [train.py:1204] (2/4) Exclude cut with ID _7852_210_5219_1_1528286471316_8139400_188-55162-0_sp1.1 from training. Duration: 0.936375 2023-10-04 10:43:18,039 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_35361_210_4238_1_1533262810_7793476_675-87012-0 from training. Duration: 0.89 2023-10-04 10:43:19,347 WARNING [train.py:1204] (2/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_465-81427-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 10:43:19,361 WARNING [train.py:1204] (2/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_597-154640-0_sp1.1 from training. Duration: 0.8181875 2023-10-04 10:43:20,795 WARNING [train.py:1204] (2/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_186-309161-0_sp0.9 from training. Duration: 0.6 2023-10-04 10:43:22,637 WARNING [train.py:1204] (2/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_546-119312-0 from training. Duration: 0.99 2023-10-04 10:43:22,641 WARNING [train.py:1204] (2/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_330-240060-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 10:43:24,006 WARNING [train.py:1204] (2/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_477-255782-0 from training. Duration: 0.82 2023-10-04 10:43:24,024 WARNING [train.py:1204] (2/4) Exclude cut with ID _14970_210_15709_1_1530773543493_1715730_150-248939-0_sp1.1 from training. Duration: 0.97275 2023-10-04 10:43:26,479 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_556-206615-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 10:43:26,538 WARNING [train.py:1204] (2/4) Exclude cut with ID _3465_210_3525_1_1526175667060_3574049_308-231735-0 from training. Duration: 0.73 2023-10-04 10:43:32,439 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.0.layers.0.feed_forward1.out_whiten, num_groups=1, num_channels=192, metric=9.71 vs. limit=15.0 2023-10-04 10:43:34,798 WARNING [train.py:1204] (2/4) Exclude cut with ID _27485_210_4916_1_1532739191677_3668269_66-236571-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 10:43:35,410 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.4.encoder.layers.0.whiten, num_groups=1, num_channels=384, metric=4.26 vs. limit=12.0 2023-10-04 10:43:39,423 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_2221_210_1988_1_1525597051_4935541_415-296882-0_sp0.9 from training. Duration: 0.8555625 2023-10-04 10:43:39,430 WARNING [train.py:1204] (2/4) Exclude cut with ID _64521_210_14699_1_1535243427259_3815328_231-78630-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 10:43:42,422 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.self_attn_weights.pos_emb_skip_rate, batch_count=1626306.6666666667, ans=0.0 2023-10-04 10:43:43,381 INFO [train.py:1046] (2/4) Epoch 46, batch 4900, loss[loss=0.1483, simple_loss=0.2166, pruned_loss=0.03996, over 23847.00 frames. ], tot_loss[loss=0.1538, simple_loss=0.2344, pruned_loss=0.03659, over 4708576.25 frames. ], batch size: 212, lr: 2.20e-03, grad_scale: 16.0 2023-10-04 10:43:46,079 WARNING [train.py:1204] (2/4) Exclude cut with ID _13942_210_9988_1_1530604906036_3733360_499-55676-0 from training. Duration: 0.62 2023-10-04 10:43:46,082 WARNING [train.py:1204] (2/4) Exclude cut with ID _49376_210_5986_1_1533985278732_3370740_301-154763-0_sp1.1 from training. Duration: 0.8909375 2023-10-04 10:43:50,262 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.700e+02 2.004e+02 2.251e+02 2.578e+02 5.064e+02, threshold=4.503e+02, percent-clipped=3.0 2023-10-04 10:43:50,349 WARNING [train.py:1204] (2/4) Exclude cut with ID _27485_210_4916_1_1532739191677_3668269_255-216698-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 10:43:50,613 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=1626306.6666666667, ans=0.1 2023-10-04 10:43:51,735 WARNING [train.py:1204] (2/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_137-334143-0_sp1.1 from training. Duration: 0.97275 2023-10-04 10:43:51,763 WARNING [train.py:1204] (2/4) Exclude cut with ID _6456_210_1534_1_1527681634212_3181049_215-309359-0_sp1.1 from training. Duration: 0.8 2023-10-04 10:43:56,340 WARNING [train.py:1204] (2/4) Exclude cut with ID _65640_210_6310_1_1535368201445_3173250_209-13008-0 from training. Duration: 0.99 2023-10-04 10:43:58,591 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.self_attn_weights.pos_emb_skip_rate, batch_count=1626373.3333333333, ans=0.0 2023-10-04 10:44:01,045 WARNING [train.py:1204] (2/4) Exclude cut with ID _12369_210_4381_1_1530356078581_4388520_58-29064-0 from training. Duration: 0.99 2023-10-04 10:44:04,319 WARNING [train.py:1204] (2/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_410-287857-0 from training. Duration: 0.93 2023-10-04 10:44:05,707 WARNING [train.py:1204] (2/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_153-153537-0 from training. Duration: 0.91 2023-10-04 10:44:07,020 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7544_210_6753_1_1528587722_3576686_172-171750-0_sp0.9 from training. Duration: 0.72225 2023-10-04 10:44:07,050 WARNING [train.py:1204] (2/4) Exclude cut with ID _64521_210_14699_1_1535243427259_3815328_464-183853-0_sp1.1 from training. Duration: 0.97275 2023-10-04 10:44:07,073 WARNING [train.py:1204] (2/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_385-210308-0_sp1.1 from training. Duration: 0.963625 2023-10-04 10:44:07,087 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_46013_210_7234_1_1533776362_3744032_354-261865-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 10:44:07,094 WARNING [train.py:1204] (2/4) Exclude cut with ID _3067_210_2667_1_1526212974640_4503168_250-285937-0_sp1.1 from training. Duration: 0.72725 2023-10-04 10:44:07,153 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_40036_210_13405_1_1533648647_1315388_48-146016-0 from training. Duration: 0.93 2023-10-04 10:44:08,049 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.0.layers.1.whiten, num_groups=1, num_channels=192, metric=3.98 vs. limit=12.0 2023-10-04 10:44:11,218 WARNING [train.py:1204] (2/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_280-161633-0 from training. Duration: 0.73 2023-10-04 10:44:11,259 WARNING [train.py:1204] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_308-161067-0_sp0.9 from training. Duration: 0.7333125 2023-10-04 10:44:13,252 WARNING [train.py:1204] (2/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_68-212801-0_sp0.9 from training. Duration: 0.811125 2023-10-04 10:44:14,549 WARNING [train.py:1204] (2/4) Exclude cut with ID _6789_210_1534_1_1527857937218_3118950_311-61262-0_sp0.9 from training. Duration: 0.72225 2023-10-04 10:44:14,711 WARNING [train.py:1204] (2/4) Exclude cut with ID _14632_210_9988_1_1530691279392_3923200_102-204477-0_sp1.1 from training. Duration: 0.8818125 2023-10-04 10:44:16,072 WARNING [train.py:1204] (2/4) Exclude cut with ID _65242_210_18628_1_1535369393717_3823149_290-117517-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 10:44:16,364 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.conv_module2.balancer2.prob, batch_count=1626440.0, ans=0.125 2023-10-04 10:44:17,438 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_69630_210_7234_1_1535788670_910989_35-337140-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 10:44:17,450 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_5618_210_1874_1_1527311008_5851_1-214501-0 from training. Duration: 0.689 2023-10-04 10:44:18,854 WARNING [train.py:1204] (2/4) Exclude cut with ID _8064_210_5399_1_1528372299291_4282973_168-50337-0_sp0.9 from training. Duration: 0.9333125 2023-10-04 10:44:18,948 WARNING [train.py:1204] (2/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_185-14438-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 10:44:20,319 WARNING [train.py:1204] (2/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_61-327062-0 from training. Duration: 0.81 2023-10-04 10:44:20,332 WARNING [train.py:1204] (2/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_98-741-0 from training. Duration: 0.8 2023-10-04 10:44:24,763 WARNING [train.py:1204] (2/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_312-204798-0 from training. Duration: 0.93 2023-10-04 10:44:26,207 WARNING [train.py:1204] (2/4) Exclude cut with ID _14998_210_5219_1_1530705582080_7474970_787-294416-0_sp0.9 from training. Duration: 0.911125 2023-10-04 10:44:26,400 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.bypass.scale_min, batch_count=1626506.6666666667, ans=0.2 2023-10-04 10:44:27,528 WARNING [train.py:1204] (2/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_140-181176-0_sp1.1 from training. Duration: 0.836375 2023-10-04 10:44:27,568 WARNING [train.py:1204] (2/4) Exclude cut with ID _14687_210_5854_1_1530703838259_3664889_115-239046-0_sp1.1 from training. Duration: 0.6090625 2023-10-04 10:44:28,268 WARNING [train.py:1204] (2/4) Exclude cut with ID _56423_210_5854_1_1534896061049_3165969_370-230614-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 10:44:29,480 WARNING [train.py:1204] (2/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_406-275649-0_sp0.9 from training. Duration: 0.4666875 2023-10-04 10:44:29,500 WARNING [train.py:1204] (2/4) Exclude cut with ID _15150_210_8341_1_1530752411803_7256384_22-138274-0_sp1.1 from training. Duration: 0.87275 2023-10-04 10:44:29,532 WARNING [train.py:1204] (2/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_125-125818-0 from training. Duration: 0.96 2023-10-04 10:44:32,387 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_4399_210_1874_1_1526705485_2400757_220-47974-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 10:44:32,642 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.ff2_skip_rate, batch_count=1626506.6666666667, ans=0.0 2023-10-04 10:44:35,069 WARNING [train.py:1204] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_308-161067-0_sp1.1 from training. Duration: 0.6 2023-10-04 10:44:36,469 WARNING [train.py:1204] (2/4) Exclude cut with ID _53489_210_6228_1_1534298357169_3835970_102-57645-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 10:44:39,298 WARNING [train.py:1204] (2/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_178-177582-0 from training. Duration: 0.74 2023-10-04 10:44:39,356 WARNING [train.py:1204] (2/4) Exclude cut with ID _14349_210_8341_1_1530615710818_3442470_170-336541-0_sp1.1 from training. Duration: 0.7818125 2023-10-04 10:44:39,560 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.balancer2.prob, batch_count=1626506.6666666667, ans=0.125 2023-10-04 10:44:40,683 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_521-214276-0_sp0.9 from training. Duration: 0.488875 2023-10-04 10:44:42,439 WARNING [train.py:1204] (2/4) Exclude cut with ID _66070_210_7640_1_1535441393096_7805310_489-292631-0 from training. Duration: 0.79 2023-10-04 10:44:47,874 WARNING [train.py:1204] (2/4) Exclude cut with ID _14069_210_5632_1_1530512692033_4803390_438-340023-0_sp1.1 from training. Duration: 0.936375 2023-10-04 10:44:49,208 WARNING [train.py:1204] (2/4) Exclude cut with ID _7852_210_5219_1_1528286471316_8139400_242-105336-0_sp1.1 from training. Duration: 0.6545625 2023-10-04 10:44:49,305 WARNING [train.py:1204] (2/4) Exclude cut with ID _3867_210_1883_1_1526641200575_4475830_272-215563-0 from training. Duration: 0.57 2023-10-04 10:44:50,521 WARNING [train.py:1204] (2/4) Exclude cut with ID _14368_210_4220_1_1530532714870_3595529_143-117421-0_sp0.9 from training. Duration: 0.6444375 2023-10-04 10:44:50,534 WARNING [train.py:1204] (2/4) Exclude cut with ID _9235_210_5219_1_1528891154253_8046120_30-326681-0_sp1.1 from training. Duration: 0.7545625 2023-10-04 10:44:51,987 WARNING [train.py:1204] (2/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_538-218448-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 10:44:55,452 WARNING [train.py:1204] (2/4) Exclude cut with ID _33640_210_2655_1_1533117605479_4855469_60-192970-0_sp1.1 from training. Duration: 0.963625 2023-10-04 10:44:55,459 WARNING [train.py:1204] (2/4) Exclude cut with ID _65585_210_12210_1_1535450451584_3764098_112-294301-0_sp1.1 from training. Duration: 0.8 2023-10-04 10:44:56,686 INFO [train.py:1046] (2/4) Epoch 46, batch 4950, loss[loss=0.1369, simple_loss=0.1904, pruned_loss=0.04168, over 19271.00 frames. ], tot_loss[loss=0.1532, simple_loss=0.2331, pruned_loss=0.03663, over 4697048.06 frames. ], batch size: 388, lr: 2.20e-03, grad_scale: 16.0 2023-10-04 10:44:56,732 WARNING [train.py:1204] (2/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_643-37412-0_sp1.1 from training. Duration: 0.936375 2023-10-04 10:44:56,749 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_9302_210_2550_1_1528889962_4655416_638-301620-0_sp1.1 from training. Duration: 0.42725 2023-10-04 10:44:58,721 WARNING [train.py:1204] (2/4) Exclude cut with ID _3867_210_1883_1_1526641200575_4475830_272-215563-0_sp1.1 from training. Duration: 0.5181875 2023-10-04 10:45:00,207 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_53679_210_4852_1_1534556726_4186141_358-154143-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 10:45:00,224 WARNING [train.py:1204] (2/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_841-341679-0_sp0.9 from training. Duration: 0.6444375 2023-10-04 10:45:00,441 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=1626640.0, ans=0.1 2023-10-04 10:45:05,536 WARNING [train.py:1204] (2/4) Exclude cut with ID _7160_210_1876_1_1527940923231_3703799_302-150728-0 from training. Duration: 0.78 2023-10-04 10:45:06,903 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_125-203105-0 from training. Duration: 0.8 2023-10-04 10:45:06,928 WARNING [train.py:1204] (2/4) Exclude cut with ID _5585_210_2667_1_1527418813324_8007340_89-332072-0_sp0.9 from training. Duration: 0.711125 2023-10-04 10:45:08,328 WARNING [train.py:1204] (2/4) Exclude cut with ID _3992_210_1747_1_1526644816031_3731060_36-147855-0 from training. Duration: 0.78 2023-10-04 10:45:08,506 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.conv_module2.balancer1.prob, batch_count=1626640.0, ans=0.125 2023-10-04 10:45:09,585 WARNING [train.py:1204] (2/4) Exclude cut with ID _59256_210_5986_1_1535076085997_3589120_172-239045-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 10:45:09,601 WARNING [train.py:1204] (2/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_472-216663-0_sp1.1 from training. Duration: 0.836375 2023-10-04 10:45:09,662 WARNING [train.py:1204] (2/4) Exclude cut with ID _36512_210_6987_1_1533033143998_3126069_181-306500-0_sp1.1 from training. Duration: 0.863625 2023-10-04 10:45:10,920 WARNING [train.py:1204] (2/4) Exclude cut with ID _56524_210_9642_1_1534572011779_3619170_492-95008-0_sp1.1 from training. Duration: 0.97275 2023-10-04 10:45:12,377 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_33348_210_5632_1_1533086912_3324030_119-90326-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 10:45:12,411 WARNING [train.py:1204] (2/4) Exclude cut with ID _14368_210_4220_1_1530532714870_3595529_231-137892-0_sp1.1 from training. Duration: 0.9 2023-10-04 10:45:14,356 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8453_210_4211_1_1528502940_8562730_596-306820-0_sp1.1 from training. Duration: 0.8818125 2023-10-04 10:45:14,586 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=1626706.6666666667, ans=0.1 2023-10-04 10:45:15,824 WARNING [train.py:1204] (2/4) Exclude cut with ID _27052_210_5986_1_1532329197187_3585009_50-242750-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 10:45:17,313 WARNING [train.py:1204] (2/4) Exclude cut with ID _13913_210_9994_1_1530579615187_3383897_358-98435-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 10:45:17,326 WARNING [train.py:1204] (2/4) Exclude cut with ID _27013_210_1389_1_1532437173240_3252649_399-229778-0_sp1.1 from training. Duration: 0.963625 2023-10-04 10:45:20,291 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.0.balancer2.prob, batch_count=1626706.6666666667, ans=0.125 2023-10-04 10:45:21,517 WARNING [train.py:1204] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_185-174805-0_sp0.9 from training. Duration: 0.7555625 2023-10-04 10:45:24,481 WARNING [train.py:1204] (2/4) Exclude cut with ID _65585_210_12210_1_1535450451584_3764098_196-221014-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 10:45:25,842 WARNING [train.py:1204] (2/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_202-293523-0_sp1.1 from training. Duration: 0.6545625 2023-10-04 10:45:25,999 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.self_attn_weights.pos_emb_skip_rate, batch_count=1626706.6666666667, ans=0.0 2023-10-04 10:45:29,278 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8275_210_4198_1_1528455320_3892571_213-127634-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 10:45:30,635 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_40896_210_15710_1_1533433956_7100802_921-231678-0_sp1.1 from training. Duration: 0.97275 2023-10-04 10:45:31,374 WARNING [train.py:1204] (2/4) Exclude cut with ID _38020_210_5986_1_1533112252259_7006089_493-102745-0_sp1.1 from training. Duration: 0.8909375 2023-10-04 10:45:32,767 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6846_210_6753_1_1527850767_4295158_2-315073-0 from training. Duration: 0.88 2023-10-04 10:45:32,821 WARNING [train.py:1204] (2/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_249-281349-0 from training. Duration: 0.85 2023-10-04 10:45:33,490 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.3.encoder.layers.0.self_attn_weights.whiten_keys, num_groups=8, num_channels=256, metric=4.72 vs. limit=6.0 2023-10-04 10:45:34,194 WARNING [train.py:1204] (2/4) Exclude cut with ID _18513_210_9206_1_1531618215223_3862620_314-83429-0_sp1.1 from training. Duration: 0.97275 2023-10-04 10:45:35,627 WARNING [train.py:1204] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_215-202973-0_sp0.9 from training. Duration: 0.97775 2023-10-04 10:45:35,637 WARNING [train.py:1204] (2/4) Exclude cut with ID _7719_210_3525_1_1528456153163_4296960_88-250259-0_sp1.1 from training. Duration: 0.763625 2023-10-04 10:45:35,738 WARNING [train.py:1204] (2/4) Exclude cut with ID _7606_210_4915_1_1528894253277_5080690_301-10732-0_sp1.1 from training. Duration: 0.736375 2023-10-04 10:45:35,755 WARNING [train.py:1204] (2/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_818-11907-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 10:45:37,175 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_5959_210_1794_1_1527400400_3445726_412-44052-0_sp1.1 from training. Duration: 0.7 2023-10-04 10:45:37,393 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.conv_module2.balancer1.max_abs, batch_count=1626773.3333333333, ans=10.0 2023-10-04 10:45:39,758 WARNING [train.py:1204] (2/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_6-279195-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 10:45:42,855 WARNING [train.py:1204] (2/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_252-216674-0_sp0.9 from training. Duration: 0.6 2023-10-04 10:45:44,366 WARNING [train.py:1204] (2/4) Exclude cut with ID _14368_210_4220_1_1530532714870_3595529_144-3287-0_sp1.1 from training. Duration: 0.6090625 2023-10-04 10:45:44,540 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.balancer1.prob, batch_count=1626840.0, ans=0.125 2023-10-04 10:45:47,046 WARNING [train.py:1204] (2/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_342-3879-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 10:45:47,073 WARNING [train.py:1204] (2/4) Exclude cut with ID _33640_210_2655_1_1533117605479_4855469_584-65630-0_sp1.1 from training. Duration: 0.97275 2023-10-04 10:45:48,418 WARNING [train.py:1204] (2/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_37-189262-0 from training. Duration: 0.98 2023-10-04 10:45:48,462 WARNING [train.py:1204] (2/4) Exclude cut with ID _9389_210_1664_1_1529217337301_5289269_379-128794-0_sp0.9 from training. Duration: 0.9444375 2023-10-04 10:45:49,856 WARNING [train.py:1204] (2/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_119-207162-0_sp1.1 from training. Duration: 0.6454375 2023-10-04 10:45:52,926 WARNING [train.py:1204] (2/4) Exclude cut with ID _2941_210_2577_1_1526199673150_2489759_200-333643-0_sp1.1 from training. Duration: 0.936375 2023-10-04 10:45:54,312 WARNING [train.py:1204] (2/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_479-125611-0_sp1.1 from training. Duration: 0.87275 2023-10-04 10:45:54,324 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_3475_210_2028_1_1526193578_3622569_64-297072-0_sp1.1 from training. Duration: 0.82725 2023-10-04 10:45:54,367 WARNING [train.py:1204] (2/4) Exclude cut with ID _53565_210_10635_1_1534337218335_3820259_139-346291-0_sp1.1 from training. Duration: 0.97275 2023-10-04 10:45:54,535 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.conv_module2.balancer1.min_positive, batch_count=1626840.0, ans=0.025 2023-10-04 10:45:54,883 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.4.encoder.layers.2.nonlin_attention.whiten1, num_groups=1, num_channels=288, metric=5.16 vs. limit=10.0 2023-10-04 10:45:56,237 WARNING [train.py:1204] (2/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_588-125165-0_sp1.1 from training. Duration: 0.7545625 2023-10-04 10:45:57,435 WARNING [train.py:1204] (2/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_938-205135-0_sp1.1 from training. Duration: 0.7909375 2023-10-04 10:45:57,572 WARNING [train.py:1204] (2/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_216-48707-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 10:45:58,917 WARNING [train.py:1204] (2/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_477-255782-0_sp1.1 from training. Duration: 0.7454375 2023-10-04 10:46:00,284 WARNING [train.py:1204] (2/4) Exclude cut with ID _16909_210_5724_1_1531292079912_4118919_583-92842-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 10:46:00,377 WARNING [train.py:1204] (2/4) Exclude cut with ID _60860_210_8254_1_1534896015875_3557169_245-256666-0 from training. Duration: 0.97 2023-10-04 10:46:02,748 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.4.encoder.layers.1.conv_module1.whiten, num_groups=1, num_channels=384, metric=4.03 vs. limit=15.0 2023-10-04 10:46:05,569 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_45399_210_4852_1_1533624725_7579737_349-116173-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 10:46:11,082 WARNING [train.py:1204] (2/4) Exclude cut with ID _8699_210_2069_1_1528630346831_3960409_155-39766-0 from training. Duration: 0.78 2023-10-04 10:46:11,091 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_86-16881-0_sp1.1 from training. Duration: 0.52725 2023-10-04 10:46:14,441 INFO [train.py:1046] (2/4) Epoch 46, batch 5000, loss[loss=0.1445, simple_loss=0.233, pruned_loss=0.02803, over 24607.00 frames. ], tot_loss[loss=0.1529, simple_loss=0.233, pruned_loss=0.03635, over 4688862.74 frames. ], batch size: 68, lr: 2.20e-03, grad_scale: 16.0 2023-10-04 10:46:16,482 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.4.encoder.layers.0.feed_forward1.out_whiten, num_groups=1, num_channels=384, metric=5.07 vs. limit=15.0 2023-10-04 10:46:17,226 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_14056_210_4163_1_1530604483_7572827_191-159384-0_sp1.1 from training. Duration: 0.97275 2023-10-04 10:46:17,233 WARNING [train.py:1204] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_307-158462-0_sp1.1 from training. Duration: 0.62725 2023-10-04 10:46:18,566 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7544_210_6753_1_1528591327_1471426_33-346031-0 from training. Duration: 0.61 2023-10-04 10:46:18,667 WARNING [train.py:1204] (2/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_112-149213-0 from training. Duration: 0.95 2023-10-04 10:46:21,352 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.638e+02 2.198e+02 2.645e+02 3.298e+02 6.003e+02, threshold=5.290e+02, percent-clipped=6.0 2023-10-04 10:46:21,438 WARNING [train.py:1204] (2/4) Exclude cut with ID _55844_210_2346_1_1534471212569_7625707_925-191342-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 10:46:22,856 WARNING [train.py:1204] (2/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_87-278686-0 from training. Duration: 0.77 2023-10-04 10:46:22,882 WARNING [train.py:1204] (2/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_415-78792-0_sp1.1 from training. Duration: 0.736375 2023-10-04 10:46:24,064 WARNING [train.py:1204] (2/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_107-332398-0_sp0.9 from training. Duration: 0.8666875 2023-10-04 10:46:24,143 WARNING [train.py:1204] (2/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_72-251255-0 from training. Duration: 0.94 2023-10-04 10:46:24,173 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_431-227273-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 10:46:25,507 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7544_210_6753_1_1528587000_691468_87-178599-0_sp0.9 from training. Duration: 0.9666875 2023-10-04 10:46:27,283 WARNING [train.py:1204] (2/4) Exclude cut with ID _13916_210_9994_1_1530788341126_3763149_60-191556-0 from training. Duration: 0.61 2023-10-04 10:46:27,294 WARNING [train.py:1204] (2/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_746-164782-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 10:46:27,340 WARNING [train.py:1204] (2/4) Exclude cut with ID _33640_210_2655_1_1533117605479_4855469_596-21066-0_sp1.1 from training. Duration: 0.92725 2023-10-04 10:46:27,526 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.conv_module1.balancer1.prob, batch_count=1627040.0, ans=0.125 2023-10-04 10:46:28,782 WARNING [train.py:1204] (2/4) Exclude cut with ID _40402_210_18219_1_1533522324115_2953150_320-14078-0 from training. Duration: 0.95 2023-10-04 10:46:30,086 WARNING [train.py:1204] (2/4) Exclude cut with ID _29877_210_6228_1_1532397791106_5276149_266-62291-0 from training. Duration: 0.87 2023-10-04 10:46:31,328 WARNING [train.py:1204] (2/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_176-56120-0_sp0.9 from training. Duration: 0.92225 2023-10-04 10:46:31,388 WARNING [train.py:1204] (2/4) Exclude cut with ID _6275_210_2084_1_1528110062213_7669049_91-26919-0 from training. Duration: 0.97 2023-10-04 10:46:31,395 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_224-322394-0_sp0.9 from training. Duration: 0.7333125 2023-10-04 10:46:32,693 WARNING [train.py:1204] (2/4) Exclude cut with ID _44982_210_1511_1_1534312820108_3726029_586-297059-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 10:46:33,338 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_196-289637-0_sp1.1 from training. Duration: 0.6818125 2023-10-04 10:46:33,339 WARNING [train.py:1204] (2/4) Exclude cut with ID _18513_210_9206_1_1531618215223_3862620_402-5957-0 from training. Duration: 0.99 2023-10-04 10:46:33,345 WARNING [train.py:1204] (2/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_81-300212-0 from training. Duration: 0.6 2023-10-04 10:46:34,755 WARNING [train.py:1204] (2/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_166-128941-0 from training. Duration: 0.54 2023-10-04 10:46:34,772 WARNING [train.py:1204] (2/4) Exclude cut with ID _58059_210_5854_1_1534901471149_3164808_57-309660-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 10:46:36,161 WARNING [train.py:1204] (2/4) Exclude cut with ID _60621_210_17071_1_1535013053530_3671739_73-155814-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 10:46:37,446 WARNING [train.py:1204] (2/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_726-270780-0 from training. Duration: 0.86 2023-10-04 10:46:37,465 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8853_210_3273_1_1528633899_818968_51-187685-0_sp1.1 from training. Duration: 0.62725 2023-10-04 10:46:40,113 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_315-212794-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 10:46:41,429 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_53373_210_5399_1_1534229471_4715695_314-122795-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 10:46:41,542 WARNING [train.py:1204] (2/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_187-221566-0_sp0.9 from training. Duration: 0.47775 2023-10-04 10:46:41,717 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.conv_module2.balancer2.prob, batch_count=1627106.6666666667, ans=0.125 2023-10-04 10:46:43,423 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_12868_210_6753_1_1530702479_3610564_133-564-0 from training. Duration: 0.79 2023-10-04 10:46:43,680 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.attention_skip_rate, batch_count=1627106.6666666667, ans=0.0 2023-10-04 10:46:44,766 WARNING [train.py:1204] (2/4) Exclude cut with ID _55153_210_2253_1_1534384786677_3162096_255-228344-0_sp1.1 from training. Duration: 0.936375 2023-10-04 10:46:45,080 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.balancer1.prob, batch_count=1627106.6666666667, ans=0.125 2023-10-04 10:46:46,122 WARNING [train.py:1204] (2/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_383-165234-0_sp1.1 from training. Duration: 0.8545625 2023-10-04 10:46:48,922 WARNING [train.py:1204] (2/4) Exclude cut with ID IC0677W0056-334889 from training. Duration: 0.768 2023-10-04 10:46:52,925 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_14953_210_2151_1_1530701710_5616165_710-165400-0_sp0.9 from training. Duration: 0.9666875 2023-10-04 10:46:54,423 WARNING [train.py:1204] (2/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_355-236205-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 10:46:54,424 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_34450_210_9547_1_1533099443_4109050_503-246893-0_sp1.1 from training. Duration: 0.963625 2023-10-04 10:46:57,795 WARNING [train.py:1204] (2/4) Exclude cut with ID _43552_210_8913_1_1533726031059_3523429_25-120161-0 from training. Duration: 0.98 2023-10-04 10:46:57,833 WARNING [train.py:1204] (2/4) Exclude cut with ID _66112_210_10635_1_1535590236305_3801484_205-311930-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 10:46:57,862 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_48049_210_5632_1_1533864553_3519937_185-281218-0_sp1.1 from training. Duration: 0.92725 2023-10-04 10:46:59,223 WARNING [train.py:1204] (2/4) Exclude cut with ID _48782_210_12658_1_1534467701544_2300422_191-332415-0_sp1.1 from training. Duration: 0.97275 2023-10-04 10:47:01,747 WARNING [train.py:1204] (2/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_907-151398-0_sp1.1 from training. Duration: 0.336375 2023-10-04 10:47:01,789 WARNING [train.py:1204] (2/4) Exclude cut with ID _67499_210_17282_1_1535509805220_3692430_20-179346-0_sp1.1 from training. Duration: 0.8818125 2023-10-04 10:47:05,083 WARNING [train.py:1204] (2/4) Exclude cut with ID _14998_210_5219_1_1530705582080_7474970_441-91466-0_sp1.1 from training. Duration: 0.8818125 2023-10-04 10:47:07,036 WARNING [train.py:1204] (2/4) Exclude cut with ID _46681_210_9642_1_1533893005571_7603359_786-164007-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 10:47:11,262 WARNING [train.py:1204] (2/4) Exclude cut with ID _14970_210_15709_1_1530773543493_1715730_187-20259-0 from training. Duration: 0.89 2023-10-04 10:47:14,696 WARNING [train.py:1204] (2/4) Exclude cut with ID _12734_210_8341_1_1530183614296_3891929_383-339075-0_sp1.1 from training. Duration: 0.963625 2023-10-04 10:47:14,826 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.0.conv_module1.balancer2.prob, batch_count=1627240.0, ans=0.125 2023-10-04 10:47:18,940 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.conv_module1.balancer1.prob, batch_count=1627240.0, ans=0.125 2023-10-04 10:47:21,681 WARNING [train.py:1204] (2/4) Exclude cut with ID _37086_210_9038_1_1532999540891_7021250_834-322225-0_sp1.1 from training. Duration: 0.92725 2023-10-04 10:47:21,919 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=1627240.0, ans=0.1 2023-10-04 10:47:23,070 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_10987_210_4163_1_1529760555_4123902_56-290559-0_sp1.1 from training. Duration: 0.963625 2023-10-04 10:47:23,077 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_92-246270-0_sp0.9 from training. Duration: 0.9444375 2023-10-04 10:47:23,103 WARNING [train.py:1204] (2/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_56-13561-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 10:47:24,431 WARNING [train.py:1204] (2/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_393-57888-0_sp1.1 from training. Duration: 0.5818125 2023-10-04 10:47:24,461 WARNING [train.py:1204] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_168-32906-0_sp1.1 from training. Duration: 0.62725 2023-10-04 10:47:24,507 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_51225_210_3601_1_1534400634_2327383_127-207553-0_sp1.1 from training. Duration: 0.963625 2023-10-04 10:47:27,695 INFO [train.py:1046] (2/4) Epoch 46, batch 5050, loss[loss=0.1799, simple_loss=0.2634, pruned_loss=0.04816, over 24009.00 frames. ], tot_loss[loss=0.1532, simple_loss=0.2337, pruned_loss=0.03636, over 4698152.20 frames. ], batch size: 80, lr: 2.20e-03, grad_scale: 16.0 2023-10-04 10:47:27,822 WARNING [train.py:1204] (2/4) Exclude cut with ID _58583_210_5236_1_1534726835266_3574249_494-203521-0_sp1.1 from training. Duration: 0.963625 2023-10-04 10:47:27,843 WARNING [train.py:1204] (2/4) Exclude cut with ID _49376_210_5986_1_1533985278732_3370740_354-344695-0 from training. Duration: 0.99 2023-10-04 10:47:29,204 WARNING [train.py:1204] (2/4) Exclude cut with ID _8021_210_7994_1_1528376451358_3653069_179-45206-0_sp1.1 from training. Duration: 0.7818125 2023-10-04 10:47:30,641 WARNING [train.py:1204] (2/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_742-154017-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 10:47:30,893 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.conv_skip_rate, batch_count=1627306.6666666667, ans=0.0 2023-10-04 10:47:31,987 WARNING [train.py:1204] (2/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_222-198127-0_sp1.1 from training. Duration: 0.763625 2023-10-04 10:47:32,040 WARNING [train.py:1204] (2/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_235-307337-0 from training. Duration: 0.94 2023-10-04 10:47:33,859 WARNING [train.py:1204] (2/4) Exclude cut with ID _8393_210_6702_1_1528505860249_3457328_453-176500-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 10:47:35,192 WARNING [train.py:1204] (2/4) Exclude cut with ID _53565_210_10635_1_1534337218335_3820259_102-179528-0_sp1.1 from training. Duration: 0.97275 2023-10-04 10:47:36,773 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.ff2_skip_rate, batch_count=1627306.6666666667, ans=0.0 2023-10-04 10:47:39,146 WARNING [train.py:1204] (2/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_43-206757-0_sp1.1 from training. Duration: 0.6818125 2023-10-04 10:47:40,486 WARNING [train.py:1204] (2/4) Exclude cut with ID _8394_210_6702_1_1528590504316_3540189_335-274780-0_sp1.1 from training. Duration: 0.7090625 2023-10-04 10:47:41,952 WARNING [train.py:1204] (2/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_128-296581-0_sp1.1 from training. Duration: 0.67275 2023-10-04 10:47:49,702 WARNING [train.py:1204] (2/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_111-112362-0 from training. Duration: 0.91 2023-10-04 10:47:49,838 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.ff3_skip_rate, batch_count=1627373.3333333333, ans=0.0 2023-10-04 10:47:49,912 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.balancer1.prob, batch_count=1627373.3333333333, ans=0.125 2023-10-04 10:47:51,067 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_5522_210_3273_1_1527249606_3909096_366-172508-0_sp1.1 from training. Duration: 0.6 2023-10-04 10:47:51,145 WARNING [train.py:1204] (2/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_47-167373-0_sp1.1 from training. Duration: 0.836375 2023-10-04 10:47:52,431 WARNING [train.py:1204] (2/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_41-234353-0 from training. Duration: 0.68 2023-10-04 10:47:52,471 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6291_210_3273_1_1527683649_1078498_82-221196-0_sp0.9 from training. Duration: 0.8444375 2023-10-04 10:47:53,875 WARNING [train.py:1204] (2/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_47-4814-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 10:47:53,910 WARNING [train.py:1204] (2/4) Exclude cut with ID _43541_210_10626_1_1533796233418_3635440_40-247993-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 10:47:55,178 WARNING [train.py:1204] (2/4) Exclude cut with ID _21989_210_5854_1_1531899214931_3271360_133-44759-0_sp0.9 from training. Duration: 0.8555625 2023-10-04 10:47:55,180 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_94-195801-0 from training. Duration: 0.94 2023-10-04 10:47:56,479 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_141-89113-0 from training. Duration: 0.86 2023-10-04 10:47:58,323 WARNING [train.py:1204] (2/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_683-284578-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 10:47:59,745 WARNING [train.py:1204] (2/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_6-214815-0_sp1.1 from training. Duration: 0.77275 2023-10-04 10:48:03,849 WARNING [train.py:1204] (2/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_556-212082-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 10:48:03,888 WARNING [train.py:1204] (2/4) Exclude cut with ID _44902_210_16187_1_1533888006062_3792078_149-67212-0 from training. Duration: 0.87 2023-10-04 10:48:05,233 WARNING [train.py:1204] (2/4) Exclude cut with ID _61148_210_6897_1_1535094022726_4217610_333-159440-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 10:48:07,905 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.3.encoder.layers.0.feed_forward3.out_whiten, num_groups=1, num_channels=512, metric=11.48 vs. limit=15.0 2023-10-04 10:48:09,870 WARNING [train.py:1204] (2/4) Exclude cut with ID _3120_210_3525_1_1526036143112_4072980_404-249994-0 from training. Duration: 0.99 2023-10-04 10:48:09,961 WARNING [train.py:1204] (2/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_234-260682-0_sp0.9 from training. Duration: 0.9333125 2023-10-04 10:48:11,235 WARNING [train.py:1204] (2/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_374-138971-0_sp1.1 from training. Duration: 0.8545625 2023-10-04 10:48:11,304 WARNING [train.py:1204] (2/4) Exclude cut with ID _53713_210_24116_1_1534314487762_7416751_743-302085-0_sp1.1 from training. Duration: 0.92725 2023-10-04 10:48:11,371 WARNING [train.py:1204] (2/4) Exclude cut with ID _18516_210_9206_1_1531704653206_3692406_198-334870-0_sp1.1 from training. Duration: 0.836375 2023-10-04 10:48:12,771 WARNING [train.py:1204] (2/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_395-73570-0_sp1.1 from training. Duration: 0.9 2023-10-04 10:48:14,220 WARNING [train.py:1204] (2/4) Exclude cut with ID _46683_210_9642_1_1533730913492_4591116_421-19547-0_sp1.1 from training. Duration: 0.936375 2023-10-04 10:48:15,595 WARNING [train.py:1204] (2/4) Exclude cut with ID _19896_210_1883_1_1531709022662_6649569_714-8668-0_sp1.1 from training. Duration: 0.963625 2023-10-04 10:48:15,624 WARNING [train.py:1204] (2/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_49-101072-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 10:48:16,905 WARNING [train.py:1204] (2/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_220-191080-0_sp1.1 from training. Duration: 0.8909375 2023-10-04 10:48:16,933 WARNING [train.py:1204] (2/4) Exclude cut with ID _3465_210_3525_1_1526175667060_3574049_62-335552-0 from training. Duration: 0.87 2023-10-04 10:48:18,826 WARNING [train.py:1204] (2/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_111-112362-0_sp1.1 from training. Duration: 0.82725 2023-10-04 10:48:20,181 WARNING [train.py:1204] (2/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_318-59097-0_sp0.9 from training. Duration: 0.8444375 2023-10-04 10:48:22,919 WARNING [train.py:1204] (2/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_98-255710-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 10:48:22,933 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9042W0003-515609 from training. Duration: 0.896 2023-10-04 10:48:22,942 WARNING [train.py:1204] (2/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_158-250116-0_sp0.9 from training. Duration: 0.688875 2023-10-04 10:48:25,642 WARNING [train.py:1204] (2/4) Exclude cut with ID _26750_210_5665_1_1532226678442_4063230_7-346885-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 10:48:27,089 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_37839_210_7234_1_1533542462_3689303_357-227078-0_sp1.1 from training. Duration: 0.963625 2023-10-04 10:48:27,119 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9052W0119-520904 from training. Duration: 0.896 2023-10-04 10:48:29,266 WARNING [train.py:1204] (2/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_249-281349-0_sp1.1 from training. Duration: 0.77275 2023-10-04 10:48:29,271 WARNING [train.py:1204] (2/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_147-293311-0 from training. Duration: 0.94 2023-10-04 10:48:29,272 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_158-21166-0_sp1.1 from training. Duration: 0.963625 2023-10-04 10:48:33,403 WARNING [train.py:1204] (2/4) Exclude cut with ID _14100_210_8341_1_1530529320613_3678069_207-332194-0_sp1.1 from training. Duration: 0.92725 2023-10-04 10:48:34,801 WARNING [train.py:1204] (2/4) Exclude cut with ID _9389_210_1664_1_1529217337301_5289269_324-211031-0_sp1.1 from training. Duration: 0.963625 2023-10-04 10:48:34,813 WARNING [train.py:1204] (2/4) Exclude cut with ID _14603_210_4220_1_1530615688441_3097319_133-158308-0 from training. Duration: 0.84 2023-10-04 10:48:36,790 WARNING [train.py:1204] (2/4) Exclude cut with ID _13252_210_1925_1_1530671372047_3336909_166-314607-0 from training. Duration: 0.54 2023-10-04 10:48:38,797 WARNING [train.py:1204] (2/4) Exclude cut with ID _32704_210_8954_1_1533017488840_6747370_649-79538-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 10:48:38,808 WARNING [train.py:1204] (2/4) Exclude cut with ID _28394_210_8913_1_1532430049518_3650118_343-115667-0_sp1.1 from training. Duration: 0.97275 2023-10-04 10:48:39,940 WARNING [train.py:1204] (2/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_87-278686-0_sp0.9 from training. Duration: 0.8555625 2023-10-04 10:48:41,337 INFO [train.py:1046] (2/4) Epoch 46, batch 5100, loss[loss=0.1592, simple_loss=0.2413, pruned_loss=0.03857, over 23286.00 frames. ], tot_loss[loss=0.1541, simple_loss=0.2348, pruned_loss=0.03668, over 4691318.23 frames. ], batch size: 119, lr: 2.20e-03, grad_scale: 8.0 2023-10-04 10:48:41,462 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9109W0003-549749 from training. Duration: 0.768 2023-10-04 10:48:44,103 WARNING [train.py:1204] (2/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_126-29104-0_sp1.1 from training. Duration: 0.77275 2023-10-04 10:48:45,600 WARNING [train.py:1204] (2/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_325-113798-0 from training. Duration: 0.85 2023-10-04 10:48:46,000 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.5.encoder.layers.1.feed_forward3.out_whiten, num_groups=1, num_channels=256, metric=10.37 vs. limit=15.0 2023-10-04 10:48:46,913 WARNING [train.py:1204] (2/4) Exclude cut with ID _6694_210_4599_1_1527852637490_3965529_116-109398-0 from training. Duration: 0.73 2023-10-04 10:48:48,166 WARNING [train.py:1204] (2/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_46-57119-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 10:48:49,335 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.646e+02 1.978e+02 2.173e+02 2.426e+02 3.698e+02, threshold=4.345e+02, percent-clipped=0.0 2023-10-04 10:48:49,459 WARNING [train.py:1204] (2/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_546-119312-0_sp1.1 from training. Duration: 0.9 2023-10-04 10:48:51,514 WARNING [train.py:1204] (2/4) Exclude cut with ID _60057_210_10640_1_1534903065793_3333150_468-306939-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 10:48:52,917 WARNING [train.py:1204] (2/4) Exclude cut with ID _15150_210_8341_1_1530752411803_7256384_22-138274-0 from training. Duration: 0.96 2023-10-04 10:48:52,941 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_289-180904-0 from training. Duration: 0.76 2023-10-04 10:48:58,461 WARNING [train.py:1204] (2/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_64-133721-0_sp1.1 from training. Duration: 0.92725 2023-10-04 10:48:58,519 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_195-257512-0_sp1.1 from training. Duration: 0.6090625 2023-10-04 10:49:04,315 WARNING [train.py:1204] (2/4) Exclude cut with ID _66873_210_5517_1_1535454136506_4293370_704-101963-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 10:49:05,877 WARNING [train.py:1204] (2/4) Exclude cut with ID _4366_210_1388_1_1526792053633_3613819_231-211402-0 from training. Duration: 0.94 2023-10-04 10:49:05,935 WARNING [train.py:1204] (2/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_679-190385-0_sp1.1 from training. Duration: 0.97275 2023-10-04 10:49:07,369 WARNING [train.py:1204] (2/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_298-81459-0_sp1.1 from training. Duration: 0.963625 2023-10-04 10:49:08,035 WARNING [train.py:1204] (2/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_658-282610-0_sp0.9 from training. Duration: 0.588875 2023-10-04 10:49:09,299 WARNING [train.py:1204] (2/4) Exclude cut with ID _57572_210_5517_1_1534921219624_3809484_196-283734-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 10:49:11,321 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_53071_210_16554_1_1534490405_2954066_294-198554-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 10:49:11,336 WARNING [train.py:1204] (2/4) Exclude cut with ID _4866_210_2577_1_1527404412284_4364659_352-157930-0 from training. Duration: 0.81 2023-10-04 10:49:12,607 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9231W0113-610139 from training. Duration: 0.896 2023-10-04 10:49:13,892 WARNING [train.py:1204] (2/4) Exclude cut with ID _24271_210_12658_1_1531987270022_7190519_638-171906-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 10:49:13,942 WARNING [train.py:1204] (2/4) Exclude cut with ID _14170_210_5219_1_1530525949868_4469650_272-249985-0 from training. Duration: 0.67 2023-10-04 10:49:15,157 WARNING [train.py:1204] (2/4) Exclude cut with ID _64086_210_7994_1_1535270417019_7435949_449-31567-0 from training. Duration: 0.97 2023-10-04 10:49:16,648 WARNING [train.py:1204] (2/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_109-282568-0_sp1.1 from training. Duration: 0.97275 2023-10-04 10:49:24,138 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_121-45934-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 10:49:27,435 WARNING [train.py:1204] (2/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_314-126819-0 from training. Duration: 0.5 2023-10-04 10:49:27,464 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9309W0192-640319 from training. Duration: 0.512 2023-10-04 10:49:27,472 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9310W0003-640649 from training. Duration: 0.896 2023-10-04 10:49:28,914 WARNING [train.py:1204] (2/4) Exclude cut with ID _7160_210_1876_1_1527940923231_3703799_120-150943-0 from training. Duration: 0.81 2023-10-04 10:49:28,915 WARNING [train.py:1204] (2/4) Exclude cut with ID _40380_210_9059_1_1533380594759_4187380_6-33508-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 10:49:31,592 WARNING [train.py:1204] (2/4) Exclude cut with ID _59202_210_26984_1_1534816810612_3549174_137-318710-0 from training. Duration: 0.89 2023-10-04 10:49:35,623 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_435-313305-0 from training. Duration: 0.71 2023-10-04 10:49:39,377 WARNING [train.py:1204] (2/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_91-212486-0_sp1.1 from training. Duration: 0.6181875 2023-10-04 10:49:40,806 WARNING [train.py:1204] (2/4) Exclude cut with ID _14603_210_4220_1_1530615688441_3097319_133-158308-0_sp1.1 from training. Duration: 0.763625 2023-10-04 10:49:42,207 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_99-251314-0 from training. Duration: 0.87 2023-10-04 10:49:43,587 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_365-217119-0_sp1.1 from training. Duration: 0.57275 2023-10-04 10:49:43,630 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_3475_210_2028_1_1526193578_3622569_377-166133-0 from training. Duration: 0.86 2023-10-04 10:49:49,278 WARNING [train.py:1204] (2/4) Exclude cut with ID _13916_210_9994_1_1530788341126_3763149_116-21034-0_sp1.1 from training. Duration: 0.87275 2023-10-04 10:49:49,288 WARNING [train.py:1204] (2/4) Exclude cut with ID _64220_210_9527_1_1535250151772_4875259_142-216831-0_sp1.1 from training. Duration: 0.936375 2023-10-04 10:49:49,288 WARNING [train.py:1204] (2/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_490-207861-0_sp1.1 from training. Duration: 0.8909375 2023-10-04 10:49:49,348 WARNING [train.py:1204] (2/4) Exclude cut with ID _41314_210_12651_1_1533459476543_3766460_87-272068-0_sp0.9 from training. Duration: 0.92225 2023-10-04 10:49:49,372 WARNING [train.py:1204] (2/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_18-46756-0_sp1.1 from training. Duration: 0.7181875 2023-10-04 10:49:51,514 WARNING [train.py:1204] (2/4) Exclude cut with ID _39411_210_5986_1_1533538825030_3343649_98-311335-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 10:49:52,907 WARNING [train.py:1204] (2/4) Exclude cut with ID _21878_210_3525_1_1531738772002_7264330_113-18163-0 from training. Duration: 0.89 2023-10-04 10:49:52,912 WARNING [train.py:1204] (2/4) Exclude cut with ID _6653_210_2629_1_1527919172441_6732679_1103-56445-0 from training. Duration: 0.73 2023-10-04 10:49:53,436 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.4.encoder.layers.2.self_attn1.whiten, num_groups=1, num_channels=384, metric=13.15 vs. limit=22.5 2023-10-04 10:49:54,179 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_3474_210_2028_1_1526188513_4825246_145-309131-0 from training. Duration: 0.74 2023-10-04 10:49:54,216 WARNING [train.py:1204] (2/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_120-211630-0_sp1.1 from training. Duration: 0.836375 2023-10-04 10:49:54,232 WARNING [train.py:1204] (2/4) Exclude cut with ID _8030_210_7497_1_1528444886781_3531089_32-31837-0 from training. Duration: 0.97 2023-10-04 10:49:55,478 INFO [train.py:1046] (2/4) Epoch 46, batch 5150, loss[loss=0.141, simple_loss=0.2293, pruned_loss=0.02631, over 24491.00 frames. ], tot_loss[loss=0.1551, simple_loss=0.2355, pruned_loss=0.03732, over 4697129.53 frames. ], batch size: 63, lr: 2.20e-03, grad_scale: 8.0 2023-10-04 10:49:56,962 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_19949_210_2161_1_1531706337_928079_8-160956-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 10:49:57,009 WARNING [train.py:1204] (2/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_321-215742-0_sp1.1 from training. Duration: 0.4545625 2023-10-04 10:49:59,761 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_583-124650-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 10:50:00,034 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.feed_forward3.hidden_balancer.prob, batch_count=1627973.3333333333, ans=0.125 2023-10-04 10:50:01,708 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_30434_210_13049_1_1532519384_981746_164-182755-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 10:50:05,921 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.conv_skip_rate, batch_count=1627973.3333333333, ans=0.0 2023-10-04 10:50:07,135 WARNING [train.py:1204] (2/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_339-104791-0_sp1.1 from training. Duration: 0.4454375 2023-10-04 10:50:07,153 WARNING [train.py:1204] (2/4) Exclude cut with ID _15026_210_8750_1_1530777602316_3291850_211-195043-0 from training. Duration: 0.8 2023-10-04 10:50:09,124 WARNING [train.py:1204] (2/4) Exclude cut with ID _29877_210_6228_1_1532397791106_5276149_487-182647-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 10:50:10,465 WARNING [train.py:1204] (2/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_127-566-0_sp0.9 from training. Duration: 0.9333125 2023-10-04 10:50:10,759 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.conv_module2.balancer2.min_positive, batch_count=1628040.0, ans=0.05 2023-10-04 10:50:11,930 WARNING [train.py:1204] (2/4) Exclude cut with ID _8263_210_1664_1_1528439452891_970220_68-181805-0_sp0.9 from training. Duration: 0.711125 2023-10-04 10:50:11,931 WARNING [train.py:1204] (2/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_504-72513-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 10:50:12,593 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.1.encoder.layers.1.conv_module2.whiten, num_groups=1, num_channels=256, metric=10.11 vs. limit=15.0 2023-10-04 10:50:13,237 WARNING [train.py:1204] (2/4) Exclude cut with ID _64639_210_5816_1_1535367619437_3528000_324-265233-0_sp1.1 from training. Duration: 0.963625 2023-10-04 10:50:13,289 WARNING [train.py:1204] (2/4) Exclude cut with ID _6456_210_1534_1_1527681634212_3181049_215-309359-0_sp0.9 from training. Duration: 0.97775 2023-10-04 10:50:13,293 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_115-233929-0_sp1.1 from training. Duration: 0.6545625 2023-10-04 10:50:13,349 WARNING [train.py:1204] (2/4) Exclude cut with ID _14087_210_2253_1_1530529224512_3014029_219-224445-0 from training. Duration: 0.84 2023-10-04 10:50:15,203 WARNING [train.py:1204] (2/4) Exclude cut with ID _14867_210_1968_1_1530684179890_3846969_413-29292-0_sp1.1 from training. Duration: 0.8181875 2023-10-04 10:50:16,539 WARNING [train.py:1204] (2/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_1012-279473-0_sp0.9 from training. Duration: 0.7444375 2023-10-04 10:50:17,963 WARNING [train.py:1204] (2/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_605-133395-0_sp0.9 from training. Duration: 0.7333125 2023-10-04 10:50:19,377 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_89-35748-0 from training. Duration: 0.79 2023-10-04 10:50:20,789 WARNING [train.py:1204] (2/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_476-272757-0_sp1.1 from training. Duration: 0.8454375 2023-10-04 10:50:25,230 WARNING [train.py:1204] (2/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_449-15508-0_sp0.9 from training. Duration: 0.911125 2023-10-04 10:50:28,393 WARNING [train.py:1204] (2/4) Exclude cut with ID _29345_210_4381_1_1532689168263_3860169_198-174328-0 from training. Duration: 0.91 2023-10-04 10:50:32,567 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_15517_210_6753_1_1530952293_1664795_125-158157-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 10:50:38,757 WARNING [train.py:1204] (2/4) Exclude cut with ID _25565_210_12392_1_1532423009382_3952098_420-113180-0_sp1.1 from training. Duration: 0.963625 2023-10-04 10:50:40,084 WARNING [train.py:1204] (2/4) Exclude cut with ID _51471_210_1889_1_1534399391390_3094250_236-146773-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 10:50:42,467 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.out_combiner.scale_min, batch_count=1628173.3333333333, ans=0.2 2023-10-04 10:50:43,532 WARNING [train.py:1204] (2/4) Exclude cut with ID _30039_210_13270_1_1532484612900_2725150_175-35012-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 10:50:43,572 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_23850_210_4238_1_1532232597_1632433_99-275843-0_sp1.1 from training. Duration: 0.936375 2023-10-04 10:50:44,987 WARNING [train.py:1204] (2/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_658-282610-0 from training. Duration: 0.53 2023-10-04 10:50:49,469 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_37818_210_4238_1_1533620910_4720083_234-65934-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 10:50:50,917 WARNING [train.py:1204] (2/4) Exclude cut with ID _3067_210_2667_1_1526212974640_4503168_459-173867-0_sp1.1 from training. Duration: 0.8 2023-10-04 10:50:50,945 WARNING [train.py:1204] (2/4) Exclude cut with ID _8696_210_1876_1_1528545632440_3345058_209-54069-0_sp0.9 from training. Duration: 0.7444375 2023-10-04 10:50:53,726 WARNING [train.py:1204] (2/4) Exclude cut with ID _62574_210_2655_1_1535180407058_3933249_420-236465-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 10:50:53,811 WARNING [train.py:1204] (2/4) Exclude cut with ID _46681_210_9642_1_1533893005571_7603359_333-4042-0_sp1.1 from training. Duration: 0.936375 2023-10-04 10:50:55,124 WARNING [train.py:1204] (2/4) Exclude cut with ID _3541_210_2477_1_1526475408709_3263973_377-296839-0 from training. Duration: 0.55 2023-10-04 10:50:58,536 WARNING [train.py:1204] (2/4) Exclude cut with ID _43997_210_4525_1_1533538632555_5189329_426-92599-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 10:50:59,928 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_43-71989-0_sp1.1 from training. Duration: 0.5181875 2023-10-04 10:51:01,812 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_2009_210_2071_1_1525481134_4588937_140-345530-0_sp1.1 from training. Duration: 0.963625 2023-10-04 10:51:01,820 WARNING [train.py:1204] (2/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_138-332773-0_sp0.9 from training. Duration: 0.9 2023-10-04 10:51:03,159 WARNING [train.py:1204] (2/4) Exclude cut with ID _13942_210_9988_1_1530604906036_3733360_499-55676-0_sp0.9 from training. Duration: 0.688875 2023-10-04 10:51:04,534 WARNING [train.py:1204] (2/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_259-34038-0_sp1.1 from training. Duration: 0.67275 2023-10-04 10:51:04,543 WARNING [train.py:1204] (2/4) Exclude cut with ID _3120_210_3525_1_1526036143112_4072980_404-249994-0_sp1.1 from training. Duration: 0.9 2023-10-04 10:51:04,576 WARNING [train.py:1204] (2/4) Exclude cut with ID _40402_210_18219_1_1533522324115_2953150_222-108810-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 10:51:07,457 WARNING [train.py:1204] (2/4) Exclude cut with ID _22083_210_8254_1_1531702862439_3678970_49-234546-0_sp0.9 from training. Duration: 0.92225 2023-10-04 10:51:10,507 INFO [train.py:1046] (2/4) Epoch 46, batch 5200, loss[loss=0.1518, simple_loss=0.2346, pruned_loss=0.03452, over 23733.00 frames. ], tot_loss[loss=0.1547, simple_loss=0.2351, pruned_loss=0.03716, over 4699639.79 frames. ], batch size: 149, lr: 2.20e-03, grad_scale: 16.0 2023-10-04 10:51:10,598 WARNING [train.py:1204] (2/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_199-295611-0_sp0.9 from training. Duration: 0.611125 2023-10-04 10:51:13,320 WARNING [train.py:1204] (2/4) Exclude cut with ID _43552_210_8913_1_1533726031059_3523429_43-192945-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 10:51:15,421 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.nonlin_attention.balancer.prob, batch_count=1628306.6666666667, ans=0.125 2023-10-04 10:51:16,826 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.bypass.skip_rate, batch_count=1628306.6666666667, ans=0.09899494936611666 2023-10-04 10:51:17,908 WARNING [train.py:1204] (2/4) Exclude cut with ID _14224_210_4381_1_1530529062037_4264010_364-22817-0 from training. Duration: 0.71 2023-10-04 10:51:19,044 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.713e+02 2.096e+02 2.291e+02 2.477e+02 4.065e+02, threshold=4.583e+02, percent-clipped=0.0 2023-10-04 10:51:19,123 WARNING [train.py:1204] (2/4) Exclude cut with ID _14922_210_6228_1_1530707435273_3693310_388-129552-0_sp1.1 from training. Duration: 0.87275 2023-10-04 10:51:19,173 WARNING [train.py:1204] (2/4) Exclude cut with ID _6537_210_1883_1_1527987559710_3597129_190-280227-0_sp1.1 from training. Duration: 0.97275 2023-10-04 10:51:21,892 WARNING [train.py:1204] (2/4) Exclude cut with ID _55844_210_2346_1_1534471212569_7625707_398-56897-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 10:51:23,226 WARNING [train.py:1204] (2/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_48-182155-0_sp1.1 from training. Duration: 0.7909375 2023-10-04 10:51:23,258 WARNING [train.py:1204] (2/4) Exclude cut with ID _29529_210_7234_1_1532393961455_3977460_582-18084-0_sp1.1 from training. Duration: 0.97275 2023-10-04 10:51:24,642 WARNING [train.py:1204] (2/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_912-282412-0 from training. Duration: 0.49 2023-10-04 10:51:26,088 WARNING [train.py:1204] (2/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_332-256168-0_sp0.9 from training. Duration: 0.8333125 2023-10-04 10:51:26,248 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.conv_module1.balancer1.prob, batch_count=1628373.3333333333, ans=0.125 2023-10-04 10:51:27,386 WARNING [train.py:1204] (2/4) Exclude cut with ID _64220_210_9527_1_1535250151772_4875259_55-250624-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 10:51:30,136 WARNING [train.py:1204] (2/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_264-108685-0 from training. Duration: 0.55 2023-10-04 10:51:30,298 WARNING [train.py:1204] (2/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_613-232856-0_sp0.9 from training. Duration: 0.6 2023-10-04 10:51:32,179 WARNING [train.py:1204] (2/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_68-212801-0_sp1.1 from training. Duration: 0.663625 2023-10-04 10:51:34,041 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6290_210_3273_1_1527595224_1282787_115-87095-0 from training. Duration: 0.55 2023-10-04 10:51:34,088 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_3080_210_2071_1_1526086102_4365166_312-8726-0 from training. Duration: 0.91 2023-10-04 10:51:36,830 WARNING [train.py:1204] (2/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_105-63309-0 from training. Duration: 0.98 2023-10-04 10:51:37,060 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.bypass.skip_rate, batch_count=1628373.3333333333, ans=0.09899494936611666 2023-10-04 10:51:38,112 WARNING [train.py:1204] (2/4) Exclude cut with ID _2941_210_2577_1_1526199673150_2489759_201-160779-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 10:51:38,115 WARNING [train.py:1204] (2/4) Exclude cut with ID ID0406W0226-885434 from training. Duration: 0.997 2023-10-04 10:51:38,128 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_17292_210_6753_1_1531125388_2943734_353-328201-0_sp1.1 from training. Duration: 0.97275 2023-10-04 10:51:39,470 WARNING [train.py:1204] (2/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_572-96029-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 10:51:41,189 WARNING [train.py:1204] (2/4) Exclude cut with ID _14970_210_15709_1_1530775547361_1966965_6-23328-0_sp0.9 from training. Duration: 0.9555625 2023-10-04 10:51:42,576 WARNING [train.py:1204] (2/4) Exclude cut with ID _45012_210_12651_1_1533608435136_5371380_240-37101-0 from training. Duration: 0.93 2023-10-04 10:51:43,879 WARNING [train.py:1204] (2/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_685-90183-0_sp1.1 from training. Duration: 0.936375 2023-10-04 10:51:45,281 WARNING [train.py:1204] (2/4) Exclude cut with ID _13252_210_1925_1_1530671372047_3336909_41-22245-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 10:51:48,533 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_15126_210_6753_1_1530778677_2591185_120-277707-0 from training. Duration: 0.8 2023-10-04 10:51:48,570 WARNING [train.py:1204] (2/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_310-26186-0 from training. Duration: 0.7 2023-10-04 10:51:48,593 WARNING [train.py:1204] (2/4) Exclude cut with ID _14998_210_5219_1_1530705582080_7474970_787-294416-0 from training. Duration: 0.82 2023-10-04 10:51:50,271 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.conv_skip_rate, batch_count=1628440.0, ans=0.0 2023-10-04 10:51:54,079 WARNING [train.py:1204] (2/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_795-33252-0 from training. Duration: 0.37 2023-10-04 10:51:54,112 WARNING [train.py:1204] (2/4) Exclude cut with ID _7537_210_2084_1_1528502395431_3924359_113-199933-0_sp0.9 from training. Duration: 0.6555625 2023-10-04 10:51:57,026 WARNING [train.py:1204] (2/4) Exclude cut with ID _3465_210_3525_1_1526175667060_3574049_308-231735-0_sp0.9 from training. Duration: 0.811125 2023-10-04 10:51:57,054 WARNING [train.py:1204] (2/4) Exclude cut with ID _19504_210_9994_1_1531648863242_3325220_354-296765-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 10:51:58,436 WARNING [train.py:1204] (2/4) Exclude cut with ID _5409_210_1388_1_1527292850189_5865191_178-127970-0 from training. Duration: 0.57 2023-10-04 10:51:58,481 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_1466-248056-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 10:51:58,499 WARNING [train.py:1204] (2/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_535-14410-0_sp0.9 from training. Duration: 0.52225 2023-10-04 10:51:58,502 WARNING [train.py:1204] (2/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_747-129941-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 10:52:00,262 WARNING [train.py:1204] (2/4) Exclude cut with ID _15012_210_1968_1_1530856820429_5095350_688-178849-0_sp1.1 from training. Duration: 0.5818125 2023-10-04 10:52:00,526 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.ff3_skip_rate, batch_count=1628506.6666666667, ans=0.0 2023-10-04 10:52:03,412 WARNING [train.py:1204] (2/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_356-45054-0_sp1.1 from training. Duration: 0.7818125 2023-10-04 10:52:04,799 WARNING [train.py:1204] (2/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_146-276944-0_sp0.9 from training. Duration: 0.92225 2023-10-04 10:52:07,617 WARNING [train.py:1204] (2/4) Exclude cut with ID _39204_210_4220_1_1533880547368_3191120_346-180306-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 10:52:08,914 WARNING [train.py:1204] (2/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_641-158017-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 10:52:08,915 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_19952_210_2161_1_1531875469_3851750_274-42137-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 10:52:14,913 WARNING [train.py:1204] (2/4) Exclude cut with ID _60804_210_9642_1_1534831199859_7268299_87-327819-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 10:52:14,996 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_9302_210_2550_1_1528889962_4655416_638-301620-0 from training. Duration: 0.47 2023-10-04 10:52:16,798 WARNING [train.py:1204] (2/4) Exclude cut with ID _44304_210_15374_1_1533639352765_4613129_302-22084-0_sp1.1 from training. Duration: 0.7818125 2023-10-04 10:52:16,818 WARNING [train.py:1204] (2/4) Exclude cut with ID _18513_210_9206_1_1531618215223_3862620_177-227063-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 10:52:18,214 WARNING [train.py:1204] (2/4) Exclude cut with ID _41326_210_5854_1_1533778076509_3492080_510-45444-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 10:52:18,333 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=1628573.3333333333, ans=0.1 2023-10-04 10:52:19,504 WARNING [train.py:1204] (2/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_76-311963-0_sp1.1 from training. Duration: 0.636375 2023-10-04 10:52:20,924 WARNING [train.py:1204] (2/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_156-302675-0_sp0.9 from training. Duration: 0.611125 2023-10-04 10:52:23,573 INFO [train.py:1046] (2/4) Epoch 46, batch 5250, loss[loss=0.1523, simple_loss=0.2416, pruned_loss=0.03153, over 24359.00 frames. ], tot_loss[loss=0.1544, simple_loss=0.2347, pruned_loss=0.03702, over 4702436.39 frames. ], batch size: 74, lr: 2.20e-03, grad_scale: 16.0 2023-10-04 10:52:23,663 WARNING [train.py:1204] (2/4) Exclude cut with ID _53489_210_6228_1_1534298357169_3835970_367-148411-0_sp1.1 from training. Duration: 0.936375 2023-10-04 10:52:27,663 WARNING [train.py:1204] (2/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_389-144525-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 10:52:27,709 WARNING [train.py:1204] (2/4) Exclude cut with ID _14069_210_5632_1_1530512692033_4803390_158-238427-0_sp1.1 from training. Duration: 0.963625 2023-10-04 10:52:29,043 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_486-151131-0_sp0.9 from training. Duration: 0.9444375 2023-10-04 10:52:33,999 WARNING [train.py:1204] (2/4) Exclude cut with ID _39846_210_12651_1_1533603651992_1318030_188-71831-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 10:52:37,178 WARNING [train.py:1204] (2/4) Exclude cut with ID _39411_210_5986_1_1533538825030_3343649_173-238248-0_sp1.1 from training. Duration: 0.7545625 2023-10-04 10:52:39,897 WARNING [train.py:1204] (2/4) Exclude cut with ID _20684_210_6917_1_1531567751646_4203930_469-49081-0_sp1.1 from training. Duration: 0.92725 2023-10-04 10:52:39,996 WARNING [train.py:1204] (2/4) Exclude cut with ID _8833_210_5219_1_1528592665210_7553059_611-80313-0_sp1.1 from training. Duration: 0.5818125 2023-10-04 10:52:42,677 WARNING [train.py:1204] (2/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_440-141811-0 from training. Duration: 0.95 2023-10-04 10:52:42,698 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_41211_210_2544_1_1533348859_3702910_236-223652-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 10:52:43,462 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.1.conv_skip_rate, batch_count=1628706.6666666667, ans=0.0 2023-10-04 10:52:44,623 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_40222_210_6228_1_1533385298_4481939_220-163998-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 10:52:46,078 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.attention_skip_rate, batch_count=1628706.6666666667, ans=0.0 2023-10-04 10:52:57,266 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.4.encoder.layers.1.self_attn_weights.whiten_keys, num_groups=4, num_channels=128, metric=2.60 vs. limit=6.0 2023-10-04 10:53:00,911 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.ff3_skip_rate, batch_count=1628773.3333333333, ans=0.0 2023-10-04 10:53:03,364 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.1.conv_skip_rate, batch_count=1628773.3333333333, ans=0.0 2023-10-04 10:53:32,594 INFO [train.py:1046] (2/4) Epoch 46, batch 5300, loss[loss=0.1406, simple_loss=0.22, pruned_loss=0.03055, over 24603.00 frames. ], tot_loss[loss=0.1534, simple_loss=0.2335, pruned_loss=0.03667, over 4687737.78 frames. ], batch size: 60, lr: 2.19e-03, grad_scale: 16.0 2023-10-04 10:53:40,495 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.649e+02 2.050e+02 2.264e+02 2.625e+02 3.522e+02, threshold=4.529e+02, percent-clipped=0.0 2023-10-04 10:53:41,370 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.2.encoder.layers.1.nonlin_attention.whiten2, num_groups=1, num_channels=384, metric=5.50 vs. limit=15.0 2023-10-04 10:53:46,714 WARNING [train.py:1204] (2/4) Exclude cut with ID _14629_210_15709_1_1530604298182_956899_6-232695-0_sp1.1 from training. Duration: 0.8545625 2023-10-04 10:53:47,072 WARNING [train.py:1204] (2/4) Exclude cut with ID _13921_210_9994_1_1530841782272_3503520_327-69677-0 from training. Duration: 0.9 2023-10-04 10:53:47,099 WARNING [train.py:1204] (2/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_372-298093-0 from training. Duration: 0.8 2023-10-04 10:53:47,112 WARNING [train.py:1204] (2/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_515-139111-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 10:53:47,348 WARNING [train.py:1204] (2/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_85-65056-0_sp1.1 from training. Duration: 0.87275 2023-10-04 10:53:47,388 WARNING [train.py:1204] (2/4) Exclude cut with ID _15113_210_8254_1_1530842438579_3716279_209-54533-0_sp1.1 from training. Duration: 0.87275 2023-10-04 10:53:47,437 WARNING [train.py:1204] (2/4) Exclude cut with ID _70447_210_12392_1_1535972499300_3160420_123-312838-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 10:53:47,465 WARNING [train.py:1204] (2/4) Exclude cut with ID _60113_210_16271_1_1535164212481_4386079_70-63596-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 10:53:47,493 WARNING [train.py:1204] (2/4) Exclude cut with ID _14632_210_9988_1_1530691279392_3923200_225-188509-0_sp1.1 from training. Duration: 0.963625 2023-10-04 10:53:47,541 WARNING [train.py:1204] (2/4) Exclude cut with ID _34734_210_4525_1_1533365977542_4448720_584-180532-0_sp1.1 from training. Duration: 0.97275 2023-10-04 10:53:47,556 WARNING [train.py:1204] (2/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_351-294650-0_sp1.1 from training. Duration: 0.57275 2023-10-04 10:53:47,892 WARNING [train.py:1204] (2/4) Exclude cut with ID _13252_210_1925_1_1530671372047_3336909_263-168388-0_sp0.9 from training. Duration: 0.9555625 2023-10-04 10:53:47,921 WARNING [train.py:1204] (2/4) Exclude cut with ID _22083_210_8254_1_1531702862439_3678970_201-85738-0 from training. Duration: 0.9 2023-10-04 10:53:48,015 WARNING [train.py:1204] (2/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_237-58433-0 from training. Duration: 0.74 2023-10-04 10:53:48,042 WARNING [train.py:1204] (2/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_262-250826-0 from training. Duration: 0.99 2023-10-04 10:53:48,148 WARNING [train.py:1204] (2/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_927-43827-0_sp0.9 from training. Duration: 0.82225 2023-10-04 10:53:48,180 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_2821_210_2715_1_1526108385_3763560_73-296673-0 from training. Duration: 0.83 2023-10-04 10:53:48,256 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6287_210_3273_1_1527509506_2653716_182-199887-0 from training. Duration: 0.51 2023-10-04 10:53:48,390 WARNING [train.py:1204] (2/4) Exclude cut with ID _70519_210_4163_1_1535961595994_7458230_899-80223-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 10:53:48,664 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_210-234949-0_sp1.1 from training. Duration: 0.97275 2023-10-04 10:53:48,703 WARNING [train.py:1204] (2/4) Exclude cut with ID _30196_210_1664_1_1533708496522_7074140_768-278176-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 10:53:48,785 WARNING [train.py:1204] (2/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_315-128283-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 10:53:49,241 WARNING [train.py:1204] (2/4) Exclude cut with ID _14886_210_4220_1_1530702020355_3516190_330-323941-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 10:53:49,511 WARNING [train.py:1204] (2/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_264-348896-0_sp1.1 from training. Duration: 0.77275 2023-10-04 10:53:49,533 WARNING [train.py:1204] (2/4) Exclude cut with ID _37086_210_9038_1_1532999540891_7021250_433-10563-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 10:53:49,569 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_53638_210_21581_1_1534297319_5808449_885-80172-0_sp1.1 from training. Duration: 0.87275 2023-10-04 10:53:49,660 WARNING [train.py:1204] (2/4) Exclude cut with ID _14623_210_1925_1_1530773354962_3632609_157-218771-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 10:53:49,671 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_1368-78165-0_sp1.1 from training. Duration: 0.97275 2023-10-04 10:53:49,675 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_309-65177-0_sp0.9 from training. Duration: 0.97775 2023-10-04 10:53:49,681 WARNING [train.py:1204] (2/4) Exclude cut with ID _21632_210_2253_1_1531994001765_3744140_291-138812-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 10:53:49,693 WARNING [train.py:1204] (2/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_669-93163-0_sp1.1 from training. Duration: 0.8909375 2023-10-04 10:53:50,247 WARNING [train.py:1204] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_292-215346-0 from training. Duration: 0.97 2023-10-04 10:53:50,272 WARNING [train.py:1204] (2/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_286-222073-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 10:53:50,564 WARNING [train.py:1204] (2/4) Exclude cut with ID _59256_210_5986_1_1535076085997_3589120_121-237386-0_sp1.1 from training. Duration: 0.87275 2023-10-04 10:53:50,578 WARNING [train.py:1204] (2/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_96-320736-0 from training. Duration: 0.86 2023-10-04 10:53:50,588 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_128-202104-0 from training. Duration: 0.63 2023-10-04 10:53:50,675 WARNING [train.py:1204] (2/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_242-3472-0_sp0.9 from training. Duration: 0.811125 2023-10-04 10:53:50,720 WARNING [train.py:1204] (2/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_154-74182-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 10:53:50,724 WARNING [train.py:1204] (2/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_187-221566-0 from training. Duration: 0.43 2023-10-04 10:53:50,895 WARNING [train.py:1204] (2/4) Exclude cut with ID _6194_210_1534_1_1527511826160_3941194_14-282876-0 from training. Duration: 0.71 2023-10-04 10:53:50,938 WARNING [train.py:1204] (2/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_660-211088-0_sp1.1 from training. Duration: 0.563625 2023-10-04 10:53:51,708 WARNING [train.py:1204] (2/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_526-62079-0_sp1.1 from training. Duration: 0.6090625 2023-10-04 10:53:51,855 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_92-246270-0_sp1.1 from training. Duration: 0.77275 2023-10-04 10:53:51,945 WARNING [train.py:1204] (2/4) Exclude cut with ID IC0287W0023-142350 from training. Duration: 0.956 2023-10-04 10:53:52,022 WARNING [train.py:1204] (2/4) Exclude cut with ID _58605_210_12210_1_1534723195939_3904259_48-269402-0 from training. Duration: 0.96 2023-10-04 10:53:52,037 WARNING [train.py:1204] (2/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_416-257730-0_sp0.9 from training. Duration: 0.72225 2023-10-04 10:53:52,116 WARNING [train.py:1204] (2/4) Exclude cut with ID _7877_210_6014_1_1528286358099_3750136_185-17516-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 10:53:52,224 WARNING [train.py:1204] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_23-189234-0 from training. Duration: 0.83 2023-10-04 10:53:52,241 WARNING [train.py:1204] (2/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_237-89684-0 from training. Duration: 0.96 2023-10-04 10:53:52,316 WARNING [train.py:1204] (2/4) Exclude cut with ID _13916_210_9994_1_1530788341126_3763149_116-21034-0 from training. Duration: 0.96 2023-10-04 10:53:52,457 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_3133_210_2087_1_1526106446_2324690_246-125000-0_sp1.1 from training. Duration: 0.563625 2023-10-04 10:53:58,615 INFO [train.py:1046] (2/4) Epoch 47, batch 0, loss[loss=0.1489, simple_loss=0.2279, pruned_loss=0.03498, over 23454.00 frames. ], tot_loss[loss=0.1489, simple_loss=0.2279, pruned_loss=0.03498, over 23454.00 frames. ], batch size: 106, lr: 2.17e-03, grad_scale: 32.0 2023-10-04 10:53:58,616 INFO [train.py:1069] (2/4) Computing validation loss 2023-10-04 10:54:09,054 INFO [zipformer.py:1853] (2/4) name=encoder.encoders.1.encoder.layers.1.self_attn_weights, attn_weights_entropy = tensor([5.4366, 5.0441, 4.6373, 5.0236], device='cuda:2') 2023-10-04 10:54:11,059 INFO [train.py:1078] (2/4) Epoch 47, validation: loss=0.3566, simple_loss=0.2776, pruned_loss=0.2178, over 1125622.00 frames. 2023-10-04 10:54:11,060 INFO [train.py:1079] (2/4) Maximum memory allocated so far is 20967MB 2023-10-04 10:54:12,523 WARNING [train.py:1204] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_402-253027-0 from training. Duration: 0.54 2023-10-04 10:54:12,586 WARNING [train.py:1204] (2/4) Exclude cut with ID _59288_210_5854_1_1535077757774_3412246_158-289345-0_sp1.1 from training. Duration: 0.92725 2023-10-04 10:54:14,539 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_28211_210_6388_1_1532861506_4523224_128-328162-0_sp1.1 from training. Duration: 0.8181875 2023-10-04 10:54:14,762 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.feed_forward2.hidden_balancer.prob, batch_count=1629053.3333333333, ans=0.125 2023-10-04 10:54:15,975 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.1.feed_forward2.hidden_balancer.prob, batch_count=1629053.3333333333, ans=0.125 2023-10-04 10:54:19,935 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_788-278281-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 10:54:19,943 WARNING [train.py:1204] (2/4) Exclude cut with ID _49728_210_12651_1_1534041034294_6788150_624-195301-0_sp0.9 from training. Duration: 0.9444375 2023-10-04 10:54:21,207 WARNING [train.py:1204] (2/4) Exclude cut with ID _14860_210_4915_1_1530702020340_7343240_267-139775-0_sp1.1 from training. Duration: 0.963625 2023-10-04 10:54:21,276 WARNING [train.py:1204] (2/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_332-221057-0 from training. Duration: 0.68 2023-10-04 10:54:23,953 WARNING [train.py:1204] (2/4) Exclude cut with ID _40402_210_18219_1_1533522324115_2953150_242-76943-0 from training. Duration: 0.94 2023-10-04 10:54:25,417 WARNING [train.py:1204] (2/4) Exclude cut with ID _16747_210_5986_1_1531811544733_2749109_87-349587-0_sp1.1 from training. Duration: 0.963625 2023-10-04 10:54:25,479 WARNING [train.py:1204] (2/4) Exclude cut with ID _22982_210_1883_1_1531882790447_5449409_505-346452-0_sp1.1 from training. Duration: 0.963625 2023-10-04 10:54:29,615 WARNING [train.py:1204] (2/4) Exclude cut with ID _17956_210_4353_1_1531209568125_6979970_13-192107-0_sp1.1 from training. Duration: 0.963625 2023-10-04 10:54:30,900 WARNING [train.py:1204] (2/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_430-307666-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 10:54:30,951 WARNING [train.py:1204] (2/4) Exclude cut with ID _2895_210_2477_1_1525956785794_4063750_289-11475-0_sp1.1 from training. Duration: 0.7454375 2023-10-04 10:54:30,968 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_3910_210_1794_1_1526742306_334530_28-238909-0_sp0.9 from training. Duration: 0.9 2023-10-04 10:54:33,693 WARNING [train.py:1204] (2/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_18-130336-0 from training. Duration: 0.79 2023-10-04 10:54:35,809 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_548-294316-0_sp0.9 from training. Duration: 0.9 2023-10-04 10:54:42,873 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_42-142473-0_sp1.1 from training. Duration: 0.6545625 2023-10-04 10:54:42,886 WARNING [train.py:1204] (2/4) Exclude cut with ID _59170_210_6721_1_1534983633960_4203335_326-48066-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 10:54:46,322 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_45886_210_17024_1_1533709215_471063_7-79165-0 from training. Duration: 0.98 2023-10-04 10:54:49,547 WARNING [train.py:1204] (2/4) Exclude cut with ID _44585_210_18628_1_1533605350387_4518199_280-73534-0_sp0.9 from training. Duration: 0.988875 2023-10-04 10:54:49,553 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_3475_210_2028_1_1526193578_3622569_265-276625-0_sp1.1 from training. Duration: 0.6909375 2023-10-04 10:54:52,769 WARNING [train.py:1204] (2/4) Exclude cut with ID _64631_210_27767_1_1535349580068_4165160_88-225232-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 10:54:56,842 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_205-265518-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 10:54:59,882 WARNING [train.py:1204] (2/4) Exclude cut with ID _40402_210_18219_1_1533522324115_2953150_271-253734-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 10:55:03,903 WARNING [train.py:1204] (2/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_282-257791-0 from training. Duration: 0.86 2023-10-04 10:55:08,561 WARNING [train.py:1204] (2/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_356-45054-0 from training. Duration: 0.86 2023-10-04 10:55:09,849 WARNING [train.py:1204] (2/4) Exclude cut with ID _69584_210_5219_1_1535785133775_4044450_38-348116-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 10:55:09,851 WARNING [train.py:1204] (2/4) Exclude cut with ID _66763_210_17301_1_1535788667617_241750_21-328480-0_sp1.1 from training. Duration: 0.936375 2023-10-04 10:55:11,107 WARNING [train.py:1204] (2/4) Exclude cut with ID _26528_210_9070_1_1532570410842_3851310_497-97218-0_sp0.9 from training. Duration: 0.9555625 2023-10-04 10:55:11,157 WARNING [train.py:1204] (2/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_592-101075-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 10:55:12,931 WARNING [train.py:1204] (2/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_678-159307-0 from training. Duration: 0.85 2023-10-04 10:55:14,986 WARNING [train.py:1204] (2/4) Exclude cut with ID _28427_210_4220_1_1532581240967_3061069_166-319773-0_sp1.1 from training. Duration: 0.936375 2023-10-04 10:55:16,375 WARNING [train.py:1204] (2/4) Exclude cut with ID _29877_210_6228_1_1532397791106_5276149_606-224984-0_sp1.1 from training. Duration: 0.936375 2023-10-04 10:55:17,078 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.4.encoder.layers.0.self_attn1.whiten, num_groups=1, num_channels=384, metric=14.68 vs. limit=22.5 2023-10-04 10:55:19,681 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_125-203105-0_sp1.1 from training. Duration: 0.72725 2023-10-04 10:55:24,367 WARNING [train.py:1204] (2/4) Exclude cut with ID IC0620W0113-306765 from training. Duration: 0.768 2023-10-04 10:55:24,488 WARNING [train.py:1204] (2/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_410-287857-0_sp1.1 from training. Duration: 0.8454375 2023-10-04 10:55:25,781 INFO [train.py:1046] (2/4) Epoch 47, batch 50, loss[loss=0.1548, simple_loss=0.2413, pruned_loss=0.03409, over 24282.00 frames. ], tot_loss[loss=0.1545, simple_loss=0.2356, pruned_loss=0.03668, over 1053850.08 frames. ], batch size: 61, lr: 2.17e-03, grad_scale: 32.0 2023-10-04 10:55:27,277 WARNING [train.py:1204] (2/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_78-95260-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 10:55:27,520 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.conv_module1.balancer1.min_positive, batch_count=1629386.6666666667, ans=0.025 2023-10-04 10:55:29,504 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.1.encoder.layers.1.nonlin_attention.whiten1, num_groups=1, num_channels=192, metric=5.84 vs. limit=10.0 2023-10-04 10:55:30,016 WARNING [train.py:1204] (2/4) Exclude cut with ID _58059_210_5854_1_1534901471149_3164808_6-73080-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 10:55:30,030 WARNING [train.py:1204] (2/4) Exclude cut with ID _5166_210_2084_1_1527073197160_4087993_331-117829-0 from training. Duration: 0.62 2023-10-04 10:55:30,092 WARNING [train.py:1204] (2/4) Exclude cut with ID _14880_210_8895_1_1530680426655_3795259_289-212817-0_sp0.9 from training. Duration: 0.7666875 2023-10-04 10:55:30,134 WARNING [train.py:1204] (2/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_620-144779-0_sp1.1 from training. Duration: 0.92725 2023-10-04 10:55:31,736 INFO [scaling.py:1118] (2/4) WithLoss: name=encoder.encoders.2.encoder.layers.1.self_attn_weights, loss-sum=0.000e+00 2023-10-04 10:55:32,814 WARNING [train.py:1204] (2/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_561-288575-0_sp1.1 from training. Duration: 0.963625 2023-10-04 10:55:34,216 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_22657_210_6890_1_1532086002_1637216_122-188128-0_sp1.1 from training. Duration: 0.963625 2023-10-04 10:55:36,975 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.balancer1.prob, batch_count=1629386.6666666667, ans=0.125 2023-10-04 10:55:38,498 WARNING [train.py:1204] (2/4) Exclude cut with ID _16756_210_4915_1_1531047803763_7104259_604-86607-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 10:55:41,347 WARNING [train.py:1204] (2/4) Exclude cut with ID _15150_210_8341_1_1530752411803_7256384_469-239509-0 from training. Duration: 0.68 2023-10-04 10:55:41,355 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_10987_210_4163_1_1529760555_4123902_302-87926-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 10:55:47,333 WARNING [train.py:1204] (2/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_469-94673-0_sp0.9 from training. Duration: 0.87775 2023-10-04 10:55:47,452 WARNING [train.py:1204] (2/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_798-106179-0 from training. Duration: 0.83 2023-10-04 10:55:50,184 WARNING [train.py:1204] (2/4) Exclude cut with ID _16744_210_5986_1_1531724405384_3907567_242-111316-0 from training. Duration: 0.93 2023-10-04 10:55:50,485 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.bypass.scale_min, batch_count=1629453.3333333333, ans=0.2 2023-10-04 10:55:51,670 WARNING [train.py:1204] (2/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_938-205135-0_sp0.9 from training. Duration: 0.9666875 2023-10-04 10:55:53,639 WARNING [train.py:1204] (2/4) Exclude cut with ID _64135_210_2253_1_1535677267876_7608972_133-188266-0_sp0.9 from training. Duration: 0.9 2023-10-04 10:55:53,643 WARNING [train.py:1204] (2/4) Exclude cut with ID _44585_210_18628_1_1533605350387_4518199_623-346820-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 10:55:53,889 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.out_combiner.scale_min, batch_count=1629520.0, ans=0.2 2023-10-04 10:55:55,469 WARNING [train.py:1204] (2/4) Exclude cut with ID _42326_210_7770_1_1533814282531_3707269_454-266903-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 10:55:56,800 WARNING [train.py:1204] (2/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_5-55893-0_sp1.1 from training. Duration: 0.5 2023-10-04 10:55:56,839 WARNING [train.py:1204] (2/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_577-332600-0_sp1.1 from training. Duration: 0.5181875 2023-10-04 10:55:56,839 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_957-138882-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 10:55:59,885 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.conv_skip_rate, batch_count=1629520.0, ans=0.0 2023-10-04 10:56:02,370 WARNING [train.py:1204] (2/4) Exclude cut with ID _12556_210_8341_1_1530081024100_7327690_219-38004-0_sp1.1 from training. Duration: 0.97275 2023-10-04 10:56:02,473 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_397-147196-0_sp1.1 from training. Duration: 0.72725 2023-10-04 10:56:03,806 WARNING [train.py:1204] (2/4) Exclude cut with ID _3541_210_2477_1_1526475408709_3263973_231-104736-0_sp0.9 from training. Duration: 0.8666875 2023-10-04 10:56:03,905 WARNING [train.py:1204] (2/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_398-334558-0 from training. Duration: 0.96 2023-10-04 10:56:05,193 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_257-20733-0_sp1.1 from training. Duration: 0.6909375 2023-10-04 10:56:06,536 WARNING [train.py:1204] (2/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_146-282400-0_sp1.1 from training. Duration: 0.6454375 2023-10-04 10:56:06,547 WARNING [train.py:1204] (2/4) Exclude cut with ID _51899_210_20034_1_1534129254466_3669220_265-215428-0 from training. Duration: 0.98 2023-10-04 10:56:07,050 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.4.encoder.layers.2.self_attn1.whiten, num_groups=1, num_channels=384, metric=14.23 vs. limit=22.5 2023-10-04 10:56:07,842 WARNING [train.py:1204] (2/4) Exclude cut with ID _34423_210_8750_1_1533430798456_3330199_219-106541-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 10:56:08,052 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=1629586.6666666667, ans=0.1 2023-10-04 10:56:11,168 WARNING [train.py:1204] (2/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_458-67321-0 from training. Duration: 0.97 2023-10-04 10:56:12,839 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.conv_module1.balancer1.min_positive, batch_count=1629586.6666666667, ans=0.025 2023-10-04 10:56:14,688 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.1.conv_module2.balancer1.min_positive, batch_count=1629586.6666666667, ans=0.025 2023-10-04 10:56:17,242 WARNING [train.py:1204] (2/4) Exclude cut with ID _6653_210_2629_1_1527919172441_6732679_1051-182606-0_sp1.1 from training. Duration: 0.936375 2023-10-04 10:56:17,266 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_53555_210_5632_1_1534595220_7232297_603-4396-0_sp1.1 from training. Duration: 0.97275 2023-10-04 10:56:18,651 WARNING [train.py:1204] (2/4) Exclude cut with ID _54218_210_12210_1_1534413646388_4409259_159-144367-0_sp1.1 from training. Duration: 0.963625 2023-10-04 10:56:19,971 WARNING [train.py:1204] (2/4) Exclude cut with ID _7606_210_4915_1_1528894253277_5080690_301-10732-0_sp0.9 from training. Duration: 0.9 2023-10-04 10:56:19,983 WARNING [train.py:1204] (2/4) Exclude cut with ID _15026_210_8750_1_1530777602316_3291850_211-195043-0_sp0.9 from training. Duration: 0.888875 2023-10-04 10:56:21,285 WARNING [train.py:1204] (2/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_146-282400-0 from training. Duration: 0.71 2023-10-04 10:56:23,215 WARNING [train.py:1204] (2/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_65-88242-0 from training. Duration: 0.56 2023-10-04 10:56:24,815 WARNING [train.py:1204] (2/4) Exclude cut with ID _55857_210_2346_1_1534644007831_6898209_80-107757-0_sp1.1 from training. Duration: 0.963625 2023-10-04 10:56:24,862 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_15126_210_6753_1_1530778677_2591185_120-277707-0_sp0.9 from training. Duration: 0.888875 2023-10-04 10:56:27,419 WARNING [train.py:1204] (2/4) Exclude cut with ID _44778_210_2054_1_1533798127203_7443090_1033-168593-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 10:56:28,750 WARNING [train.py:1204] (2/4) Exclude cut with ID _46683_210_9642_1_1533730913492_4591116_475-30711-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 10:56:28,782 WARNING [train.py:1204] (2/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_253-308377-0 from training. Duration: 0.49 2023-10-04 10:56:28,835 WARNING [train.py:1204] (2/4) Exclude cut with ID _4680_210_3525_1_1527249757953_893386_75-78979-0 from training. Duration: 0.91 2023-10-04 10:56:30,213 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_5522_210_3273_1_1527249606_3909096_318-203153-0_sp0.9 from training. Duration: 0.47775 2023-10-04 10:56:31,479 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.529e+02 2.113e+02 2.383e+02 2.822e+02 6.328e+02, threshold=4.766e+02, percent-clipped=7.0 2023-10-04 10:56:31,551 WARNING [train.py:1204] (2/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_550-170116-0_sp1.1 from training. Duration: 0.92725 2023-10-04 10:56:31,578 WARNING [train.py:1204] (2/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_277-210798-0_sp0.9 from training. Duration: 0.611125 2023-10-04 10:56:31,650 WARNING [train.py:1204] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_620-276793-0 from training. Duration: 0.59 2023-10-04 10:56:31,668 WARNING [train.py:1204] (2/4) Exclude cut with ID _3067_210_2667_1_1526212974640_4503168_119-175776-0 from training. Duration: 0.74 2023-10-04 10:56:33,015 WARNING [train.py:1204] (2/4) Exclude cut with ID _55209_210_1531_1_1534675828996_4597160_681-195827-0_sp1.1 from training. Duration: 0.92725 2023-10-04 10:56:33,076 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_427-78097-0_sp1.1 from training. Duration: 0.72725 2023-10-04 10:56:35,751 WARNING [train.py:1204] (2/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_96-290039-0_sp0.9 from training. Duration: 0.62225 2023-10-04 10:56:35,769 WARNING [train.py:1204] (2/4) Exclude cut with ID _45329_210_7005_1_1534157951211_6774339_287-292174-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 10:56:38,365 INFO [train.py:1046] (2/4) Epoch 47, batch 100, loss[loss=0.1612, simple_loss=0.2414, pruned_loss=0.04045, over 23498.00 frames. ], tot_loss[loss=0.1551, simple_loss=0.2359, pruned_loss=0.03716, over 1871443.10 frames. ], batch size: 106, lr: 2.17e-03, grad_scale: 16.0 2023-10-04 10:56:40,215 WARNING [train.py:1204] (2/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_200-292228-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 10:56:42,977 WARNING [train.py:1204] (2/4) Exclude cut with ID _69884_210_9771_1_1535846392818_4166169_291-194999-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 10:56:47,682 WARNING [train.py:1204] (2/4) Exclude cut with ID _58895_210_8750_1_1535184081320_3649089_265-323810-0_sp1.1 from training. Duration: 0.87275 2023-10-04 10:56:48,960 WARNING [train.py:1204] (2/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_58-68956-0 from training. Duration: 0.83 2023-10-04 10:56:48,965 WARNING [train.py:1204] (2/4) Exclude cut with ID _39867_210_9826_1_1533553222030_9344259_322-115153-0_sp1.1 from training. Duration: 0.963625 2023-10-04 10:56:53,132 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_3371_210_1794_1_1526384462_5580777_396-339812-0_sp1.1 from training. Duration: 0.77275 2023-10-04 10:56:53,143 WARNING [train.py:1204] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_731-2428-0_sp0.9 from training. Duration: 0.92225 2023-10-04 10:56:54,376 WARNING [train.py:1204] (2/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_98-741-0_sp1.1 from training. Duration: 0.72725 2023-10-04 10:56:54,378 WARNING [train.py:1204] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_238-245655-0_sp0.9 from training. Duration: 0.9 2023-10-04 10:56:54,413 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6788_210_1388_1_1527898698_4562021_91-197981-0_sp0.9 from training. Duration: 0.92225 2023-10-04 10:56:55,816 WARNING [train.py:1204] (2/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_860-96198-0 from training. Duration: 0.73 2023-10-04 10:56:59,195 WARNING [train.py:1204] (2/4) Exclude cut with ID _14268_210_9206_1_1530622771285_4714099_331-105351-0_sp0.9 from training. Duration: 0.72225 2023-10-04 10:56:59,231 WARNING [train.py:1204] (2/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_204-137133-0_sp1.1 from training. Duration: 0.92725 2023-10-04 10:57:00,528 WARNING [train.py:1204] (2/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_5-46496-0_sp1.1 from training. Duration: 0.936375 2023-10-04 10:57:00,536 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_160-218721-0_sp1.1 from training. Duration: 0.87275 2023-10-04 10:57:04,673 WARNING [train.py:1204] (2/4) Exclude cut with ID _45329_210_7005_1_1534157951211_6774339_503-287883-0 from training. Duration: 0.88 2023-10-04 10:57:04,748 WARNING [train.py:1204] (2/4) Exclude cut with ID _57572_210_5517_1_1534921219624_3809484_110-79134-0_sp1.1 from training. Duration: 0.92725 2023-10-04 10:57:06,247 WARNING [train.py:1204] (2/4) Exclude cut with ID _44585_210_18628_1_1533605350387_4518199_421-37641-0_sp1.1 from training. Duration: 0.936375 2023-10-04 10:57:07,674 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_42-142473-0_sp0.9 from training. Duration: 0.8 2023-10-04 10:57:09,053 WARNING [train.py:1204] (2/4) Exclude cut with ID _5166_210_2084_1_1527073197160_4087993_310-114452-0_sp0.9 from training. Duration: 0.6555625 2023-10-04 10:57:13,173 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9029W0120-509520 from training. Duration: 0.768 2023-10-04 10:57:13,187 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9029W0433-509820 from training. Duration: 0.896 2023-10-04 10:57:13,391 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.conv_skip_rate, batch_count=1629853.3333333333, ans=0.0 2023-10-04 10:57:14,581 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_40222_210_6228_1_1533385298_4481939_163-334586-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 10:57:14,581 WARNING [train.py:1204] (2/4) Exclude cut with ID _48677_210_5663_1_1533902405919_3892989_408-40740-0_sp1.1 from training. Duration: 0.7545625 2023-10-04 10:57:17,949 WARNING [train.py:1204] (2/4) Exclude cut with ID _24314_210_4915_1_1532135078116_7170179_521-258868-0_sp0.9 from training. Duration: 0.87775 2023-10-04 10:57:19,364 WARNING [train.py:1204] (2/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_576-51198-0_sp1.1 from training. Duration: 0.92725 2023-10-04 10:57:19,497 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.balancer1.prob, batch_count=1629853.3333333333, ans=0.125 2023-10-04 10:57:20,703 WARNING [train.py:1204] (2/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_306-216871-0_sp1.1 from training. Duration: 0.97275 2023-10-04 10:57:25,330 WARNING [train.py:1204] (2/4) Exclude cut with ID _20684_210_6917_1_1531567751646_4203930_463-119982-0_sp1.1 from training. Duration: 0.97275 2023-10-04 10:57:27,219 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9084W0012-536850 from training. Duration: 0.64 2023-10-04 10:57:28,627 WARNING [train.py:1204] (2/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_847-41334-0_sp1.1 from training. Duration: 0.4 2023-10-04 10:57:31,922 WARNING [train.py:1204] (2/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_326-165900-0_sp1.1 from training. Duration: 0.62725 2023-10-04 10:57:33,246 WARNING [train.py:1204] (2/4) Exclude cut with ID _7537_210_2084_1_1528502395431_3924359_128-268029-0_sp1.1 from training. Duration: 0.8909375 2023-10-04 10:57:33,476 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.conv_module1.balancer2.min_positive, batch_count=1629920.0, ans=0.05 2023-10-04 10:57:35,924 WARNING [train.py:1204] (2/4) Exclude cut with ID _67499_210_17282_1_1535509805220_3692430_333-155030-0_sp1.1 from training. Duration: 0.97275 2023-10-04 10:57:38,039 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.4.encoder.layers.2.feed_forward2.out_whiten, num_groups=1, num_channels=384, metric=8.54 vs. limit=15.0 2023-10-04 10:57:38,916 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_34008_210_10431_1_1533345424_8183247_588-257177-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 10:57:40,406 WARNING [train.py:1204] (2/4) Exclude cut with ID _25914_210_6400_1_1532141317058_644740_52-96808-0_sp1.1 from training. Duration: 0.82725 2023-10-04 10:57:41,779 WARNING [train.py:1204] (2/4) Exclude cut with ID _56494_210_4220_1_1535284837625_3033169_69-138150-0_sp1.1 from training. Duration: 0.8545625 2023-10-04 10:57:44,673 WARNING [train.py:1204] (2/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_19-143553-0_sp1.1 from training. Duration: 0.97275 2023-10-04 10:57:44,729 WARNING [train.py:1204] (2/4) Exclude cut with ID _21878_210_3525_1_1531738772002_7264330_582-130735-0_sp1.1 from training. Duration: 0.936375 2023-10-04 10:57:47,882 WARNING [train.py:1204] (2/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_943-201356-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 10:57:47,892 WARNING [train.py:1204] (2/4) Exclude cut with ID _3231_210_2106_1_1526205864934_6784220_422-229276-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 10:57:47,925 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_48049_210_5632_1_1533864553_3519937_73-198717-0_sp1.1 from training. Duration: 0.97275 2023-10-04 10:57:49,269 WARNING [train.py:1204] (2/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_329-285801-0 from training. Duration: 0.91 2023-10-04 10:57:49,284 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9171W0114-579000 from training. Duration: 0.896 2023-10-04 10:57:49,305 WARNING [train.py:1204] (2/4) Exclude cut with ID _3120_210_3525_1_1526036143112_4072980_210-132675-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 10:57:49,488 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=1629986.6666666667, ans=0.1 2023-10-04 10:57:50,761 WARNING [train.py:1204] (2/4) Exclude cut with ID _55857_210_2346_1_1534644007831_6898209_745-98391-0_sp0.9 from training. Duration: 0.8444375 2023-10-04 10:57:50,814 WARNING [train.py:1204] (2/4) Exclude cut with ID _5250_210_5219_1_1527249561806_8692110_615-322035-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 10:57:50,828 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_36756_210_7234_1_1533026991_966276_119-315767-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 10:57:50,851 WARNING [train.py:1204] (2/4) Exclude cut with ID _13916_210_9994_1_1530788341126_3763149_60-191556-0_sp1.1 from training. Duration: 0.5545625 2023-10-04 10:57:50,869 WARNING [train.py:1204] (2/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_177-236809-0_sp1.1 from training. Duration: 0.4454375 2023-10-04 10:57:50,902 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7002_210_1794_1_1527939683_5157286_184-229630-0_sp1.1 from training. Duration: 0.67275 2023-10-04 10:57:50,915 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_50428_210_5724_1_1534247532_4126418_464-18322-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 10:57:52,184 INFO [train.py:1046] (2/4) Epoch 47, batch 150, loss[loss=0.1675, simple_loss=0.2387, pruned_loss=0.04812, over 23833.00 frames. ], tot_loss[loss=0.1535, simple_loss=0.2348, pruned_loss=0.03611, over 2520463.61 frames. ], batch size: 164, lr: 2.17e-03, grad_scale: 16.0 2023-10-04 10:57:52,237 WARNING [train.py:1204] (2/4) Exclude cut with ID _35799_210_5854_1_1533263487887_3529071_466-131402-0_sp1.1 from training. Duration: 0.936375 2023-10-04 10:57:52,498 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=1630053.3333333333, ans=0.1 2023-10-04 10:57:53,584 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_1259-132768-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 10:57:53,641 WARNING [train.py:1204] (2/4) Exclude cut with ID _6694_210_4599_1_1527852637490_3965529_144-185051-0_sp1.1 from training. Duration: 0.87275 2023-10-04 10:57:53,801 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.bypass_mid.scale_min, batch_count=1630053.3333333333, ans=0.2 2023-10-04 10:57:55,532 WARNING [train.py:1204] (2/4) Exclude cut with ID _64639_210_5816_1_1535367619437_3528000_59-133290-0_sp1.1 from training. Duration: 0.963625 2023-10-04 10:57:57,494 WARNING [train.py:1204] (2/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_223-49525-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 10:57:58,851 WARNING [train.py:1204] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_748-36150-0_sp1.1 from training. Duration: 0.82725 2023-10-04 10:57:58,852 WARNING [train.py:1204] (2/4) Exclude cut with ID _19896_210_1883_1_1531709022662_6649569_52-221627-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 10:57:58,913 WARNING [train.py:1204] (2/4) Exclude cut with ID _6789_210_1534_1_1527854471671_2870519_296-115766-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 10:58:03,580 WARNING [train.py:1204] (2/4) Exclude cut with ID _14624_210_1925_1_1530858690657_3642219_100-19871-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 10:58:04,921 WARNING [train.py:1204] (2/4) Exclude cut with ID _12555_210_8750_1_1530075594402_3780239_50-293675-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 10:58:06,398 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=1630120.0, ans=0.1 2023-10-04 10:58:08,845 WARNING [train.py:1204] (2/4) Exclude cut with ID _14880_210_8895_1_1530680426655_3795259_289-212817-0_sp1.1 from training. Duration: 0.62725 2023-10-04 10:58:08,901 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_14056_210_4163_1_1530604483_7572827_388-211626-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 10:58:11,836 WARNING [train.py:1204] (2/4) Exclude cut with ID _5289_210_2084_1_1527678202736_3779299_392-261988-0 from training. Duration: 0.65 2023-10-04 10:58:11,843 WARNING [train.py:1204] (2/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_124-345840-0 from training. Duration: 0.52 2023-10-04 10:58:11,847 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_2655_210_2087_1_1525688959_3897400_437-337690-0 from training. Duration: 0.63 2023-10-04 10:58:14,540 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_3780_210_2087_1_1526467391_239068_13-274633-0_sp1.1 from training. Duration: 0.92725 2023-10-04 10:58:14,544 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_805-71497-0_sp1.1 from training. Duration: 0.6454375 2023-10-04 10:58:15,982 WARNING [train.py:1204] (2/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_434-83000-0_sp1.1 from training. Duration: 0.8909375 2023-10-04 10:58:17,873 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_17613_210_2151_1_1531137569_2366160_285-219531-0_sp1.1 from training. Duration: 0.97275 2023-10-04 10:58:17,877 WARNING [train.py:1204] (2/4) Exclude cut with ID _56108_210_16380_1_1534505446159_3887959_22-37858-0_sp1.1 from training. Duration: 0.936375 2023-10-04 10:58:17,910 WARNING [train.py:1204] (2/4) Exclude cut with ID _28427_210_4220_1_1532581240967_3061069_200-88444-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 10:58:19,169 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_43802_210_2290_1_1533516203_8829504_77-279042-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 10:58:20,535 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9309W0193-640320 from training. Duration: 0.896 2023-10-04 10:58:23,135 WARNING [train.py:1204] (2/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_516-324055-0_sp1.1 from training. Duration: 0.936375 2023-10-04 10:58:29,644 WARNING [train.py:1204] (2/4) Exclude cut with ID _40094_210_16380_1_1533294005773_3641159_10-261240-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 10:58:33,871 WARNING [train.py:1204] (2/4) Exclude cut with ID _6267_210_2084_1_1527764658526_3894730_375-219887-0_sp1.1 from training. Duration: 0.6090625 2023-10-04 10:58:33,962 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6788_210_1388_1_1527898698_4562021_180-75288-0 from training. Duration: 0.99 2023-10-04 10:58:35,611 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.self_attn_weights.pos_emb_skip_rate, batch_count=1630253.3333333333, ans=0.0 2023-10-04 10:58:38,754 WARNING [train.py:1204] (2/4) Exclude cut with ID _8030_210_7497_1_1528444886781_3531089_262-86750-0_sp1.1 from training. Duration: 0.736375 2023-10-04 10:58:38,773 WARNING [train.py:1204] (2/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_934-325498-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 10:58:38,800 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_456-22141-0_sp0.9 from training. Duration: 0.97775 2023-10-04 10:58:38,979 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.0.conv_module1.balancer1.prob, batch_count=1630253.3333333333, ans=0.125 2023-10-04 10:58:41,692 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.conv_module1.balancer2.prob, batch_count=1630253.3333333333, ans=0.125 2023-10-04 10:58:42,760 WARNING [train.py:1204] (2/4) Exclude cut with ID _49643_210_7989_1_1534640244224_3939031_598-161926-0_sp1.1 from training. Duration: 0.8181875 2023-10-04 10:58:44,180 WARNING [train.py:1204] (2/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_345-49721-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 10:58:45,535 WARNING [train.py:1204] (2/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_440-141811-0_sp1.1 from training. Duration: 0.863625 2023-10-04 10:58:45,623 WARNING [train.py:1204] (2/4) Exclude cut with ID _64048_210_10626_1_1535198382802_3835179_394-212120-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 10:58:46,980 WARNING [train.py:1204] (2/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_91-277684-0 from training. Duration: 0.74 2023-10-04 10:58:47,876 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.2.encoder.layers.1.self_attn1.whiten, num_groups=1, num_channels=384, metric=15.70 vs. limit=22.5 2023-10-04 10:58:51,732 WARNING [train.py:1204] (2/4) Exclude cut with ID _23038_210_9570_1_1532001584158_3652419_191-346343-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 10:58:53,016 WARNING [train.py:1204] (2/4) Exclude cut with ID _15140_210_9288_1_1530791388450_4880149_351-347974-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 10:58:53,053 WARNING [train.py:1204] (2/4) Exclude cut with ID _58605_210_12210_1_1534723195939_3904259_48-269402-0_sp1.1 from training. Duration: 0.87275 2023-10-04 10:58:53,060 WARNING [train.py:1204] (2/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_401-53423-0_sp0.9 from training. Duration: 0.611125 2023-10-04 10:58:54,469 WARNING [train.py:1204] (2/4) Exclude cut with ID _42659_210_2070_1_1533607442765_3120380_158-49004-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 10:58:57,195 WARNING [train.py:1204] (2/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_81-75849-0_sp1.1 from training. Duration: 0.3545625 2023-10-04 10:58:57,457 INFO [scaling.py:1118] (2/4) WithLoss: name=encoder.encoders.3.encoder.layers.3.self_attn_weights, loss-sum=0.000e+00 2023-10-04 10:58:58,419 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.636e+02 1.983e+02 2.136e+02 2.354e+02 3.118e+02, threshold=4.272e+02, percent-clipped=0.0 2023-10-04 10:58:58,607 WARNING [train.py:1204] (2/4) Exclude cut with ID _27052_210_5986_1_1532329197187_3585009_314-106499-0_sp1.1 from training. Duration: 0.836375 2023-10-04 10:59:01,906 WARNING [train.py:1204] (2/4) Exclude cut with ID _4626_210_3532_1_1526817652717_3916859_446-206656-0_sp0.9 from training. Duration: 0.8444375 2023-10-04 10:59:03,324 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_53378_210_17282_1_1534221071_5769509_158-114960-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 10:59:04,755 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_3171_210_2087_1_1526109084_2330007_37-64334-0_sp0.9 from training. Duration: 0.911125 2023-10-04 10:59:04,778 WARNING [train.py:1204] (2/4) Exclude cut with ID _5410_210_1388_1_1527397055795_1719020_39-112221-0 from training. Duration: 0.69 2023-10-04 10:59:04,796 WARNING [train.py:1204] (2/4) Exclude cut with ID _8064_210_5399_1_1528372299291_4282973_4-91037-0_sp0.9 from training. Duration: 0.97775 2023-10-04 10:59:04,814 WARNING [train.py:1204] (2/4) Exclude cut with ID ID0079W0443-717795 from training. Duration: 0.541 2023-10-04 10:59:06,082 INFO [train.py:1046] (2/4) Epoch 47, batch 200, loss[loss=0.1589, simple_loss=0.2414, pruned_loss=0.03817, over 23388.00 frames. ], tot_loss[loss=0.1545, simple_loss=0.2356, pruned_loss=0.0367, over 3015777.02 frames. ], batch size: 119, lr: 2.17e-03, grad_scale: 16.0 2023-10-04 10:59:09,420 WARNING [train.py:1204] (2/4) Exclude cut with ID _34734_210_4525_1_1533365977542_4448720_766-297928-0_sp1.1 from training. Duration: 0.936375 2023-10-04 10:59:12,224 WARNING [train.py:1204] (2/4) Exclude cut with ID _51471_210_1889_1_1534399391390_3094250_310-337793-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 10:59:12,240 WARNING [train.py:1204] (2/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_678-159307-0_sp0.9 from training. Duration: 0.9444375 2023-10-04 10:59:14,873 WARNING [train.py:1204] (2/4) Exclude cut with ID _50093_210_4220_1_1534417141207_3432299_259-322252-0 from training. Duration: 0.92 2023-10-04 10:59:16,061 WARNING [train.py:1204] (2/4) Exclude cut with ID _47790_210_16187_1_1533862814066_4207519_67-103444-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 10:59:16,081 WARNING [train.py:1204] (2/4) Exclude cut with ID _40551_210_10635_1_1533776424632_3416179_311-247578-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 10:59:18,100 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.5.encoder.layers.0.feed_forward1.out_whiten, num_groups=1, num_channels=256, metric=5.83 vs. limit=15.0 2023-10-04 10:59:19,476 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_4633_210_2040_1_1526824163_4145343_416-76117-0 from training. Duration: 0.95 2023-10-04 10:59:20,911 WARNING [train.py:1204] (2/4) Exclude cut with ID _5166_210_2084_1_1527073197160_4087993_331-117829-0_sp0.9 from training. Duration: 0.688875 2023-10-04 10:59:22,299 WARNING [train.py:1204] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_299-247059-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 10:59:22,367 WARNING [train.py:1204] (2/4) Exclude cut with ID _64002_210_9570_1_1535198605157_38979_2-240845-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 10:59:23,984 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.feed_forward1.hidden_balancer.prob, batch_count=1630453.3333333333, ans=0.125 2023-10-04 10:59:25,258 WARNING [train.py:1204] (2/4) Exclude cut with ID _4681_210_3525_1_1527250689464_120889_15-126760-0_sp0.9 from training. Duration: 0.9 2023-10-04 10:59:25,271 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_21500_210_2054_1_1532149310_2716758_256-156611-0_sp1.1 from training. Duration: 0.936375 2023-10-04 10:59:25,286 WARNING [train.py:1204] (2/4) Exclude cut with ID _16908_210_5724_1_1531119157248_3772700_624-203729-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 10:59:34,777 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.conv_module1.balancer1.prob, batch_count=1630520.0, ans=0.125 2023-10-04 10:59:40,224 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.ff2_skip_rate, batch_count=1630520.0, ans=0.0 2023-10-04 10:59:44,747 WARNING [train.py:1204] (2/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_605-219732-0_sp1.1 from training. Duration: 0.8545625 2023-10-04 10:59:44,779 WARNING [train.py:1204] (2/4) Exclude cut with ID _44718_210_5816_1_1534050051869_6469030_380-167520-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 10:59:45,991 WARNING [train.py:1204] (2/4) Exclude cut with ID _19896_210_1883_1_1531709022662_6649569_94-229927-0_sp1.1 from training. Duration: 0.8818125 2023-10-04 10:59:47,274 WARNING [train.py:1204] (2/4) Exclude cut with ID _19514_210_9259_1_1531530071330_5084230_519-174300-0_sp1.1 from training. Duration: 0.963625 2023-10-04 10:59:47,350 WARNING [train.py:1204] (2/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_340-174176-0_sp0.9 from training. Duration: 0.5444375 2023-10-04 10:59:47,353 WARNING [train.py:1204] (2/4) Exclude cut with ID _8263_210_1664_1_1528439452891_970220_68-181805-0_sp1.1 from training. Duration: 0.5818125 2023-10-04 10:59:50,022 WARNING [train.py:1204] (2/4) Exclude cut with ID _3120_210_3525_1_1526036143112_4072980_348-257723-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 10:59:50,104 WARNING [train.py:1204] (2/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_453-133983-0_sp1.1 from training. Duration: 0.7090625 2023-10-04 10:59:51,982 WARNING [train.py:1204] (2/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_17-86060-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 10:59:52,011 WARNING [train.py:1204] (2/4) Exclude cut with ID _29877_210_6228_1_1532397791106_5276149_228-184657-0_sp1.1 from training. Duration: 0.97275 2023-10-04 10:59:53,374 WARNING [train.py:1204] (2/4) Exclude cut with ID _18516_210_9206_1_1531704653206_3692406_198-334870-0 from training. Duration: 0.92 2023-10-04 10:59:54,703 WARNING [train.py:1204] (2/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_340-174176-0_sp1.1 from training. Duration: 0.4454375 2023-10-04 10:59:54,716 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_3133_210_2087_1_1526106446_2324690_14-139186-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 10:59:57,697 WARNING [train.py:1204] (2/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_126-29104-0_sp0.9 from training. Duration: 0.9444375 2023-10-04 11:00:03,754 WARNING [train.py:1204] (2/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_84-10310-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 11:00:10,786 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_15517_210_6753_1_1530954008_3487336_79-273076-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 11:00:12,105 WARNING [train.py:1204] (2/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_371-18169-0_sp0.9 from training. Duration: 0.9555625 2023-10-04 11:00:18,184 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_68702_210_23689_1_1535799577_3420068_170-92788-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 11:00:19,532 INFO [train.py:1046] (2/4) Epoch 47, batch 250, loss[loss=0.1467, simple_loss=0.2119, pruned_loss=0.04077, over 22638.00 frames. ], tot_loss[loss=0.1552, simple_loss=0.2361, pruned_loss=0.03717, over 3386933.33 frames. ], batch size: 322, lr: 2.17e-03, grad_scale: 16.0 2023-10-04 11:00:21,499 WARNING [train.py:1204] (2/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_78-182912-0 from training. Duration: 0.59 2023-10-04 11:00:21,557 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_3019_210_2028_1_1526044245_3264865_297-12384-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 11:00:21,563 WARNING [train.py:1204] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_147-111137-0_sp1.1 from training. Duration: 0.863625 2023-10-04 11:00:21,568 WARNING [train.py:1204] (2/4) Exclude cut with ID _28323_210_16187_1_1532480395667_4100300_117-8859-0_sp1.1 from training. Duration: 0.97275 2023-10-04 11:00:21,645 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7762_210_1794_1_1528199508_4057408_419-125009-0_sp1.1 from training. Duration: 0.7545625 2023-10-04 11:00:22,971 WARNING [train.py:1204] (2/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_268-87909-0 from training. Duration: 0.99 2023-10-04 11:00:23,531 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.4.encoder.layers.1.self_attn1.whiten, num_groups=1, num_channels=384, metric=13.22 vs. limit=22.5 2023-10-04 11:00:24,386 WARNING [train.py:1204] (2/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_118-311357-0_sp1.1 from training. Duration: 0.92725 2023-10-04 11:00:24,400 WARNING [train.py:1204] (2/4) Exclude cut with ID ID0377W0003-870225 from training. Duration: 0.852 2023-10-04 11:00:25,865 WARNING [train.py:1204] (2/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_56-5838-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 11:00:28,393 WARNING [train.py:1204] (2/4) Exclude cut with ID _14603_210_4220_1_1530615688441_3097319_90-31383-0_sp0.9 from training. Duration: 0.9666875 2023-10-04 11:00:29,767 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_31583_210_3852_1_1532595851_4024413_58-86657-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 11:00:29,812 WARNING [train.py:1204] (2/4) Exclude cut with ID _30304_210_6319_1_1532476827261_7582060_344-113875-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 11:00:29,992 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.bypass_mid.scale_min, batch_count=1630720.0, ans=0.2 2023-10-04 11:00:30,977 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.4.encoder.layers.1.nonlin_attention.whiten1, num_groups=1, num_channels=288, metric=4.01 vs. limit=10.0 2023-10-04 11:00:31,808 WARNING [train.py:1204] (2/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_278-104676-0_sp1.1 from training. Duration: 0.8909375 2023-10-04 11:00:31,837 WARNING [train.py:1204] (2/4) Exclude cut with ID _55844_210_2346_1_1534471212569_7625707_764-2387-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 11:00:32,435 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.4.encoder.layers.1.feed_forward3.out_whiten, num_groups=1, num_channels=384, metric=13.09 vs. limit=15.0 2023-10-04 11:00:33,249 WARNING [train.py:1204] (2/4) Exclude cut with ID _27430_210_7770_1_1532604689375_1378590_157-14963-0_sp1.1 from training. Duration: 0.963625 2023-10-04 11:00:33,424 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.1.conv_module1.balancer1.prob, batch_count=1630786.6666666667, ans=0.125 2023-10-04 11:00:35,243 WARNING [train.py:1204] (2/4) Exclude cut with ID _7160_210_1876_1_1527940923231_3703799_282-322611-0_sp1.1 from training. Duration: 0.7909375 2023-10-04 11:00:46,561 WARNING [train.py:1204] (2/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_190-138640-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 11:00:48,879 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.5.encoder.layers.0.whiten, num_groups=1, num_channels=256, metric=2.72 vs. limit=12.0 2023-10-04 11:00:49,772 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_33348_210_5632_1_1533086912_3324030_63-270922-0_sp1.1 from training. Duration: 0.97275 2023-10-04 11:00:49,797 WARNING [train.py:1204] (2/4) Exclude cut with ID _54871_210_22356_1_1534665798253_3856359_135-184281-0_sp1.1 from training. Duration: 0.9 2023-10-04 11:00:50,006 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.feed_forward2.hidden_balancer.prob, batch_count=1630853.3333333333, ans=0.125 2023-10-04 11:00:52,686 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.self_attn_weights.pos_emb_skip_rate, batch_count=1630853.3333333333, ans=0.0 2023-10-04 11:00:56,591 WARNING [train.py:1204] (2/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_277-210798-0_sp1.1 from training. Duration: 0.5 2023-10-04 11:00:57,906 WARNING [train.py:1204] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_402-253027-0_sp0.9 from training. Duration: 0.6 2023-10-04 11:00:57,979 WARNING [train.py:1204] (2/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_540-275685-0_sp0.9 from training. Duration: 0.988875 2023-10-04 11:00:59,304 WARNING [train.py:1204] (2/4) Exclude cut with ID _59493_210_17282_1_1535078419077_3225879_118-18960-0_sp1.1 from training. Duration: 0.936375 2023-10-04 11:01:00,682 WARNING [train.py:1204] (2/4) Exclude cut with ID _6194_210_1534_1_1527511826160_3941194_321-148641-0_sp1.1 from training. Duration: 0.5818125 2023-10-04 11:01:00,689 WARNING [train.py:1204] (2/4) Exclude cut with ID _61133_210_9969_1_1535196777227_3696409_333-270325-0_sp1.1 from training. Duration: 0.8454375 2023-10-04 11:01:00,735 WARNING [train.py:1204] (2/4) Exclude cut with ID _49376_210_5986_1_1533985278732_3370740_307-145221-0_sp1.1 from training. Duration: 0.936375 2023-10-04 11:01:01,682 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.bypass_mid.scale_min, batch_count=1630853.3333333333, ans=0.2 2023-10-04 11:01:01,932 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.5.encoder.layers.1.self_attn_weights.whiten_keys, num_groups=4, num_channels=128, metric=1.45 vs. limit=6.0 2023-10-04 11:01:02,833 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6846_210_6753_1_1527850767_4295158_2-315073-0_sp0.9 from training. Duration: 0.97775 2023-10-04 11:01:05,691 WARNING [train.py:1204] (2/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_17-25045-0 from training. Duration: 0.59 2023-10-04 11:01:05,704 WARNING [train.py:1204] (2/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_503-333510-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 11:01:07,570 WARNING [train.py:1204] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_238-245655-0_sp1.1 from training. Duration: 0.736375 2023-10-04 11:01:07,597 WARNING [train.py:1204] (2/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_329-82235-0_sp0.9 from training. Duration: 0.911125 2023-10-04 11:01:07,602 WARNING [train.py:1204] (2/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_325-113798-0_sp0.9 from training. Duration: 0.9444375 2023-10-04 11:01:08,972 WARNING [train.py:1204] (2/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_395-37222-0_sp0.9 from training. Duration: 0.8444375 2023-10-04 11:01:10,324 WARNING [train.py:1204] (2/4) Exclude cut with ID _64135_210_2253_1_1535677267876_7608972_498-5039-0_sp1.1 from training. Duration: 0.7090625 2023-10-04 11:01:10,333 WARNING [train.py:1204] (2/4) Exclude cut with ID _14922_210_6228_1_1530707435273_3693310_303-228738-0_sp0.9 from training. Duration: 0.9333125 2023-10-04 11:01:11,782 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_577-220899-0_sp1.1 from training. Duration: 0.92725 2023-10-04 11:01:12,004 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=1630920.0, ans=0.1 2023-10-04 11:01:13,213 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_3475_210_2028_1_1526193578_3622569_377-166133-0_sp1.1 from training. Duration: 0.7818125 2023-10-04 11:01:14,443 WARNING [train.py:1204] (2/4) Exclude cut with ID _49376_210_5986_1_1533985278732_3370740_272-313512-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 11:01:14,786 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.attention_skip_rate, batch_count=1630920.0, ans=0.0 2023-10-04 11:01:16,460 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7762_210_1794_1_1528199508_4057408_422-227034-0_sp0.9 from training. Duration: 0.811125 2023-10-04 11:01:21,140 WARNING [train.py:1204] (2/4) Exclude cut with ID _18895_210_6228_1_1531357274234_3784930_74-341428-0_sp1.1 from training. Duration: 0.92725 2023-10-04 11:01:25,333 WARNING [train.py:1204] (2/4) Exclude cut with ID _44902_210_16187_1_1533888006062_3792078_149-67212-0_sp1.1 from training. Duration: 0.7909375 2023-10-04 11:01:25,617 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.conv_skip_rate, batch_count=1630986.6666666667, ans=0.0 2023-10-04 11:01:28,127 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.764e+02 2.109e+02 2.410e+02 2.789e+02 4.244e+02, threshold=4.821e+02, percent-clipped=0.0 2023-10-04 11:01:29,539 WARNING [train.py:1204] (2/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_243-126020-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 11:01:31,501 WARNING [train.py:1204] (2/4) Exclude cut with ID _54218_210_12210_1_1534413646388_4409259_259-6561-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 11:01:33,981 INFO [train.py:1046] (2/4) Epoch 47, batch 300, loss[loss=0.1354, simple_loss=0.212, pruned_loss=0.02943, over 24264.00 frames. ], tot_loss[loss=0.1542, simple_loss=0.2346, pruned_loss=0.03691, over 3672492.85 frames. ], batch size: 56, lr: 2.17e-03, grad_scale: 8.0 2023-10-04 11:01:34,041 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_41452_210_7291_1_1533437687_3941281_405-11559-0 from training. Duration: 0.91 2023-10-04 11:01:34,135 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_20120_210_6753_1_1531643035_5482859_759-225084-0_sp1.1 from training. Duration: 0.97275 2023-10-04 11:01:34,148 WARNING [train.py:1204] (2/4) Exclude cut with ID _14747_210_8341_1_1530685797463_3654089_147-131229-0_sp0.9 from training. Duration: 0.8444375 2023-10-04 11:01:35,602 WARNING [train.py:1204] (2/4) Exclude cut with ID _14867_210_1968_1_1530684179890_3846969_319-250868-0 from training. Duration: 0.99 2023-10-04 11:01:35,623 WARNING [train.py:1204] (2/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_580-93798-0_sp0.9 from training. Duration: 0.62225 2023-10-04 11:01:37,530 WARNING [train.py:1204] (2/4) Exclude cut with ID _9679_210_7497_1_1529129212906_4363929_512-313889-0_sp0.9 from training. Duration: 0.9 2023-10-04 11:01:37,539 WARNING [train.py:1204] (2/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_18-189198-0 from training. Duration: 0.9 2023-10-04 11:01:42,938 WARNING [train.py:1204] (2/4) Exclude cut with ID _49755_210_18053_1_1534581005846_3218709_137-64179-0_sp1.1 from training. Duration: 0.92725 2023-10-04 11:01:42,997 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_48049_210_5632_1_1533864553_3519937_306-283794-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 11:01:45,814 WARNING [train.py:1204] (2/4) Exclude cut with ID _43552_210_8913_1_1533726031059_3523429_25-120161-0_sp1.1 from training. Duration: 0.8909375 2023-10-04 11:01:45,891 WARNING [train.py:1204] (2/4) Exclude cut with ID _8833_210_5219_1_1528592665210_7553059_319-349181-0 from training. Duration: 0.99 2023-10-04 11:01:46,654 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.2.encoder.layers.0.self_attn_weights.whiten_keys, num_groups=4, num_channels=128, metric=3.88 vs. limit=6.0 2023-10-04 11:01:49,019 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_296-332325-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 11:01:50,362 WARNING [train.py:1204] (2/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_436-186311-0_sp1.1 from training. Duration: 0.5909375 2023-10-04 11:01:50,379 WARNING [train.py:1204] (2/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_348-127789-0 from training. Duration: 0.85 2023-10-04 11:01:50,384 WARNING [train.py:1204] (2/4) Exclude cut with ID _3992_210_1747_1_1526644816031_3731060_443-195183-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 11:01:54,866 WARNING [train.py:1204] (2/4) Exclude cut with ID _3465_210_3525_1_1526175667060_3574049_308-231735-0_sp1.1 from training. Duration: 0.663625 2023-10-04 11:01:57,436 INFO [scaling.py:1022] (2/4) Whitening: name=encoder_embed.out_whiten, num_groups=1, num_channels=192, metric=7.59 vs. limit=8.0 2023-10-04 11:01:59,837 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6786_210_1388_1_1527839699_4194495_378-46070-0_sp1.1 from training. Duration: 0.6545625 2023-10-04 11:02:01,150 WARNING [train.py:1204] (2/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_63-101996-0 from training. Duration: 0.88 2023-10-04 11:02:01,379 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=1631120.0, ans=0.1 2023-10-04 11:02:03,957 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_2541_210_1794_1_1525590269_3084765_156-173713-0 from training. Duration: 0.81 2023-10-04 11:02:03,992 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_217-239796-0_sp1.1 from training. Duration: 0.963625 2023-10-04 11:02:06,715 WARNING [train.py:1204] (2/4) Exclude cut with ID _21878_210_3525_1_1531738772002_7264330_530-329400-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 11:02:08,582 WARNING [train.py:1204] (2/4) Exclude cut with ID _21783_210_11357_1_1531742458730_3762679_150-62920-0_sp1.1 from training. Duration: 0.963625 2023-10-04 11:02:08,582 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_5522_210_3273_1_1527249606_3909096_366-172508-0 from training. Duration: 0.66 2023-10-04 11:02:08,585 WARNING [train.py:1204] (2/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_303-340446-0_sp1.1 from training. Duration: 0.8181875 2023-10-04 11:02:11,313 WARNING [train.py:1204] (2/4) Exclude cut with ID _8699_210_2069_1_1528630346831_3960409_160-24408-0_sp1.1 from training. Duration: 0.82725 2023-10-04 11:02:14,009 WARNING [train.py:1204] (2/4) Exclude cut with ID _41314_210_12651_1_1533459476543_3766460_299-187801-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 11:02:14,038 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_34886_210_10633_1_1533203882_1755661_92-345288-0_sp1.1 from training. Duration: 0.97275 2023-10-04 11:02:15,577 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.bypass_mid.scale_min, batch_count=1631186.6666666667, ans=0.2 2023-10-04 11:02:16,847 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_5438_210_1794_1_1527336687_2600009_104-288274-0_sp1.1 from training. Duration: 0.52725 2023-10-04 11:02:16,855 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_154-316450-0 from training. Duration: 0.76 2023-10-04 11:02:18,744 WARNING [train.py:1204] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_23-189234-0_sp0.9 from training. Duration: 0.92225 2023-10-04 11:02:21,439 WARNING [train.py:1204] (2/4) Exclude cut with ID _4779_210_2084_1_1527318022944_3914893_346-168820-0_sp1.1 from training. Duration: 0.963625 2023-10-04 11:02:22,807 WARNING [train.py:1204] (2/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_5-55893-0 from training. Duration: 0.55 2023-10-04 11:02:24,774 WARNING [train.py:1204] (2/4) Exclude cut with ID _20455_210_5219_1_1531565996775_8098100_180-24517-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 11:02:27,722 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_3133_210_2087_1_1526106446_2324690_243-143119-0_sp0.9 from training. Duration: 0.8555625 2023-10-04 11:02:29,788 WARNING [train.py:1204] (2/4) Exclude cut with ID _65585_210_12210_1_1535450451584_3764098_293-212045-0_sp1.1 from training. Duration: 0.92725 2023-10-04 11:02:29,793 WARNING [train.py:1204] (2/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_577-332600-0 from training. Duration: 0.57 2023-10-04 11:02:29,968 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.bypass_mid.scale_min, batch_count=1631253.3333333333, ans=0.2 2023-10-04 11:02:34,938 WARNING [train.py:1204] (2/4) Exclude cut with ID _30039_210_13270_1_1532484612900_2725150_104-289874-0_sp1.1 from training. Duration: 0.963625 2023-10-04 11:02:34,945 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_3172_210_2087_1_1526292929_3987865_197-97762-0_sp1.1 from training. Duration: 0.6818125 2023-10-04 11:02:38,007 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_256-150908-0_sp1.1 from training. Duration: 0.963625 2023-10-04 11:02:39,436 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_296-287678-0_sp1.1 from training. Duration: 0.863625 2023-10-04 11:02:39,469 WARNING [train.py:1204] (2/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_443-30034-0 from training. Duration: 0.64 2023-10-04 11:02:39,494 WARNING [train.py:1204] (2/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_494-199538-0_sp0.9 from training. Duration: 0.8333125 2023-10-04 11:02:40,851 WARNING [train.py:1204] (2/4) Exclude cut with ID _19708_210_9206_1_1532136650733_4238364_61-162325-0_sp1.1 from training. Duration: 0.97275 2023-10-04 11:02:42,182 WARNING [train.py:1204] (2/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_475-127455-0 from training. Duration: 0.7 2023-10-04 11:02:42,307 WARNING [train.py:1204] (2/4) Exclude cut with ID _48677_210_5663_1_1533902405919_3892989_188-11581-0_sp1.1 from training. Duration: 0.963625 2023-10-04 11:02:43,623 WARNING [train.py:1204] (2/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_136-8381-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 11:02:43,728 WARNING [train.py:1204] (2/4) Exclude cut with ID _55328_210_27767_1_1534580973260_4088170_270-229555-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 11:02:45,041 WARNING [train.py:1204] (2/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_372-266951-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 11:02:46,410 WARNING [train.py:1204] (2/4) Exclude cut with ID _47790_210_16187_1_1533862814066_4207519_14-242149-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 11:02:46,653 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.1.conv_module1.balancer1.prob, batch_count=1631386.6666666667, ans=0.125 2023-10-04 11:02:48,375 INFO [train.py:1046] (2/4) Epoch 47, batch 350, loss[loss=0.1473, simple_loss=0.2263, pruned_loss=0.03417, over 23559.00 frames. ], tot_loss[loss=0.153, simple_loss=0.2332, pruned_loss=0.03635, over 3907403.69 frames. ], batch size: 149, lr: 2.17e-03, grad_scale: 8.0 2023-10-04 11:02:48,624 WARNING [train.py:1204] (2/4) Exclude cut with ID _7877_210_6014_1_1528286358099_3750136_159-76512-0_sp1.1 from training. Duration: 0.936375 2023-10-04 11:02:48,626 WARNING [train.py:1204] (2/4) Exclude cut with ID _8263_210_1664_1_1528438953494_294100_40-204731-0_sp1.1 from training. Duration: 0.4545625 2023-10-04 11:02:51,383 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8278_210_2550_1_1528637099_4220413_694-238413-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 11:02:56,225 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.conv_module2.balancer2.prob, batch_count=1631386.6666666667, ans=0.125 2023-10-04 11:02:57,412 WARNING [train.py:1204] (2/4) Exclude cut with ID _47278_210_12041_1_1533902418349_3576110_421-172710-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 11:03:01,246 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.3.encoder.layers.3.feed_forward2.out_whiten, num_groups=1, num_channels=512, metric=4.90 vs. limit=15.0 2023-10-04 11:03:02,009 WARNING [train.py:1204] (2/4) Exclude cut with ID _14970_210_15709_1_1530773543493_1715730_215-282680-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 11:03:02,038 WARNING [train.py:1204] (2/4) Exclude cut with ID _64639_210_5816_1_1535367619437_3528000_306-149449-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 11:03:04,807 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_28-291982-0 from training. Duration: 0.97 2023-10-04 11:03:04,959 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.nonlin_attention.balancer.prob, batch_count=1631453.3333333333, ans=0.125 2023-10-04 11:03:06,264 WARNING [train.py:1204] (2/4) Exclude cut with ID _55134_210_21982_1_1534590028377_3882170_412-15870-0_sp1.1 from training. Duration: 0.936375 2023-10-04 11:03:06,289 WARNING [train.py:1204] (2/4) Exclude cut with ID _13684_210_7497_1_1530619354863_3598359_312-235606-0 from training. Duration: 0.6 2023-10-04 11:03:08,323 WARNING [train.py:1204] (2/4) Exclude cut with ID _37112_210_2575_1_1532997007673_7268610_347-238993-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 11:03:09,657 WARNING [train.py:1204] (2/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_121-284152-0 from training. Duration: 0.59 2023-10-04 11:03:09,726 WARNING [train.py:1204] (2/4) Exclude cut with ID _2941_210_2577_1_1526199673150_2489759_228-2961-0_sp1.1 from training. Duration: 0.97275 2023-10-04 11:03:12,438 WARNING [train.py:1204] (2/4) Exclude cut with ID _28323_210_16187_1_1532480395667_4100300_531-24157-0 from training. Duration: 0.98 2023-10-04 11:03:13,854 WARNING [train.py:1204] (2/4) Exclude cut with ID _32704_210_8954_1_1533017488840_6747370_266-329755-0_sp0.9 from training. Duration: 0.988875 2023-10-04 11:03:15,240 WARNING [train.py:1204] (2/4) Exclude cut with ID _6653_210_2629_1_1527919172441_6732679_1038-146421-0_sp1.1 from training. Duration: 0.97275 2023-10-04 11:03:15,308 WARNING [train.py:1204] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_391-22148-0_sp0.9 from training. Duration: 0.8555625 2023-10-04 11:03:16,701 WARNING [train.py:1204] (2/4) Exclude cut with ID _64521_210_14699_1_1535243427259_3815328_543-341117-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 11:03:16,719 WARNING [train.py:1204] (2/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_52-168712-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 11:03:18,533 WARNING [train.py:1204] (2/4) Exclude cut with ID _33920_210_4220_1_1533257851500_3084399_198-334063-0_sp1.1 from training. Duration: 0.8545625 2023-10-04 11:03:18,554 WARNING [train.py:1204] (2/4) Exclude cut with ID _50093_210_4220_1_1534417141207_3432299_69-111987-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 11:03:18,592 WARNING [train.py:1204] (2/4) Exclude cut with ID _14821_210_8750_1_1530680503991_6829359_189-297451-0_sp0.9 from training. Duration: 0.611125 2023-10-04 11:03:21,145 WARNING [train.py:1204] (2/4) Exclude cut with ID _8030_210_7497_1_1528444886781_3531089_262-86750-0_sp0.9 from training. Duration: 0.9 2023-10-04 11:03:21,159 WARNING [train.py:1204] (2/4) Exclude cut with ID _24612_210_8913_1_1531998054193_3806359_294-191196-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 11:03:28,017 WARNING [train.py:1204] (2/4) Exclude cut with ID _16909_210_5724_1_1531292079912_4118919_707-342896-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 11:03:28,041 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_9460_210_4211_1_1528954587_10049667_572-37035-0_sp0.9 from training. Duration: 0.888875 2023-10-04 11:03:29,279 WARNING [train.py:1204] (2/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_762-109886-0_sp1.1 from training. Duration: 0.82725 2023-10-04 11:03:29,317 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_14953_210_2151_1_1530701710_5616165_519-326393-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 11:03:33,618 WARNING [train.py:1204] (2/4) Exclude cut with ID _25914_210_6400_1_1532141317058_644740_52-96808-0 from training. Duration: 0.91 2023-10-04 11:03:33,629 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_68095_210_20185_1_1535718226_8888855_1079-94525-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 11:03:39,553 WARNING [train.py:1204] (2/4) Exclude cut with ID _14860_210_4915_1_1530702020340_7343240_746-3647-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 11:03:39,553 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_70379_210_7925_1_1535941455_557455_49-101720-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 11:03:39,572 WARNING [train.py:1204] (2/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_8-307291-0_sp1.1 from training. Duration: 0.936375 2023-10-04 11:03:39,751 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.self_attn_weights.pos_emb_skip_rate, batch_count=1631586.6666666667, ans=0.0 2023-10-04 11:03:40,953 WARNING [train.py:1204] (2/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_117-290916-0 from training. Duration: 0.51 2023-10-04 11:03:43,636 WARNING [train.py:1204] (2/4) Exclude cut with ID _22020_210_2461_1_1531724164513_6326900_944-339840-0_sp1.1 from training. Duration: 0.963625 2023-10-04 11:03:43,728 WARNING [train.py:1204] (2/4) Exclude cut with ID IC0515W0043-254911 from training. Duration: 0.976 2023-10-04 11:03:43,820 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_3910_210_1794_1_1526742306_334530_28-238909-0 from training. Duration: 0.81 2023-10-04 11:03:45,106 WARNING [train.py:1204] (2/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_269-267275-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 11:03:46,644 WARNING [train.py:1204] (2/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_72-251255-0_sp1.1 from training. Duration: 0.8545625 2023-10-04 11:03:48,568 WARNING [train.py:1204] (2/4) Exclude cut with ID _14886_210_4220_1_1530702020355_3516190_175-229073-0 from training. Duration: 0.45 2023-10-04 11:03:48,735 WARNING [train.py:1204] (2/4) Exclude cut with ID _40519_210_13794_1_1533715228655_7337390_921-209452-0_sp1.1 from training. Duration: 0.963625 2023-10-04 11:03:51,488 WARNING [train.py:1204] (2/4) Exclude cut with ID _29857_210_20034_1_1532395886753_3798160_335-163562-0_sp0.9 from training. Duration: 0.9444375 2023-10-04 11:03:53,346 WARNING [train.py:1204] (2/4) Exclude cut with ID _58583_210_5236_1_1534726835266_3574249_384-58445-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 11:03:53,412 WARNING [train.py:1204] (2/4) Exclude cut with ID _44011_210_2046_1_1533722401691_3631310_96-315656-0_sp1.1 from training. Duration: 0.963625 2023-10-04 11:03:53,416 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_68484_210_1794_1_1535849804_7678097_549-170613-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 11:03:54,884 WARNING [train.py:1204] (2/4) Exclude cut with ID _37892_210_19596_1_1533198677897_3964058_214-148058-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 11:03:57,101 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=1631653.3333333333, ans=0.1 2023-10-04 11:03:58,116 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.708e+02 2.078e+02 2.298e+02 2.651e+02 3.732e+02, threshold=4.596e+02, percent-clipped=0.0 2023-10-04 11:03:58,304 WARNING [train.py:1204] (2/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_415-78792-0_sp0.9 from training. Duration: 0.9 2023-10-04 11:04:00,821 WARNING [train.py:1204] (2/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_383-205833-0_sp1.1 from training. Duration: 0.863625 2023-10-04 11:04:02,287 WARNING [train.py:1204] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_167-137255-0 from training. Duration: 0.64 2023-10-04 11:04:02,303 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8275_210_4198_1_1528455320_3892571_484-154786-0_sp1.1 from training. Duration: 0.963625 2023-10-04 11:04:03,546 INFO [train.py:1046] (2/4) Epoch 47, batch 400, loss[loss=0.155, simple_loss=0.2307, pruned_loss=0.0396, over 23526.00 frames. ], tot_loss[loss=0.153, simple_loss=0.2327, pruned_loss=0.03667, over 4071393.59 frames. ], batch size: 134, lr: 2.17e-03, grad_scale: 16.0 2023-10-04 11:04:03,632 WARNING [train.py:1204] (2/4) Exclude cut with ID _40284_210_2084_1_1533513679814_3704279_396-319412-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 11:04:04,948 WARNING [train.py:1204] (2/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_280-120942-0_sp0.9 from training. Duration: 0.8555625 2023-10-04 11:04:06,291 WARNING [train.py:1204] (2/4) Exclude cut with ID _45630_210_10556_1_1533949174498_3902049_558-240889-0_sp1.1 from training. Duration: 0.92725 2023-10-04 11:04:06,557 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.balancer1.prob, batch_count=1631720.0, ans=0.125 2023-10-04 11:04:06,587 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.balancer2.prob, batch_count=1631720.0, ans=0.125 2023-10-04 11:04:08,375 WARNING [train.py:1204] (2/4) Exclude cut with ID _56235_210_10640_1_1534557335235_4161099_197-307458-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 11:04:09,861 WARNING [train.py:1204] (2/4) Exclude cut with ID _16909_210_5724_1_1531292079912_4118919_163-254843-0_sp1.1 from training. Duration: 0.92725 2023-10-04 11:04:12,686 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_103-84010-0 from training. Duration: 0.69 2023-10-04 11:04:14,052 WARNING [train.py:1204] (2/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_401-158239-0 from training. Duration: 0.71 2023-10-04 11:04:14,053 WARNING [train.py:1204] (2/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_899-233658-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 11:04:15,510 WARNING [train.py:1204] (2/4) Exclude cut with ID _23825_210_13449_1_1531915313526_3497150_240-271367-0 from training. Duration: 0.97 2023-10-04 11:04:16,843 WARNING [train.py:1204] (2/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_424-134619-0_sp1.1 from training. Duration: 0.92725 2023-10-04 11:04:21,496 WARNING [train.py:1204] (2/4) Exclude cut with ID _44831_210_13427_1_1533816011549_3689035_32-188147-0_sp1.1 from training. Duration: 0.87275 2023-10-04 11:04:21,499 WARNING [train.py:1204] (2/4) Exclude cut with ID _59170_210_6721_1_1534983633960_4203335_314-63879-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 11:04:21,521 WARNING [train.py:1204] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_602-275260-0 from training. Duration: 0.82 2023-10-04 11:04:21,557 WARNING [train.py:1204] (2/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_511-306767-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 11:04:21,694 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.conv_module2.balancer2.prob, batch_count=1631786.6666666667, ans=0.125 2023-10-04 11:04:22,851 WARNING [train.py:1204] (2/4) Exclude cut with ID _41817_210_18835_1_1533434477160_7423049_565-201775-0_sp1.1 from training. Duration: 0.92725 2023-10-04 11:04:22,861 WARNING [train.py:1204] (2/4) Exclude cut with ID _28317_210_12742_1_1532480302381_8510325_778-173151-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 11:04:22,901 WARNING [train.py:1204] (2/4) Exclude cut with ID _14998_210_5219_1_1530705582080_7474970_680-51001-0_sp1.1 from training. Duration: 0.963625 2023-10-04 11:04:24,834 WARNING [train.py:1204] (2/4) Exclude cut with ID IC0677W0059-334891 from training. Duration: 0.896 2023-10-04 11:04:24,880 WARNING [train.py:1204] (2/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_370-331600-0 from training. Duration: 0.94 2023-10-04 11:04:30,846 WARNING [train.py:1204] (2/4) Exclude cut with ID _66840_210_5219_1_1535452168426_4196349_446-240192-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 11:04:32,134 WARNING [train.py:1204] (2/4) Exclude cut with ID _65242_210_18628_1_1535369393717_3823149_38-6732-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 11:04:32,197 WARNING [train.py:1204] (2/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_302-2550-0 from training. Duration: 0.81 2023-10-04 11:04:33,517 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_43802_210_2290_1_1533516203_8829504_100-348441-0 from training. Duration: 0.91 2023-10-04 11:04:36,396 WARNING [train.py:1204] (2/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_329-285801-0_sp1.1 from training. Duration: 0.82725 2023-10-04 11:04:39,815 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_30167_210_3528_1_1532428012_4772375_343-334712-0_sp1.1 from training. Duration: 0.936375 2023-10-04 11:04:44,098 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7543_210_6753_1_1528543318_4552_1-85072-0 from training. Duration: 0.83 2023-10-04 11:04:46,683 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7002_210_1794_1_1527935634_4034444_487-236501-0_sp1.1 from training. Duration: 0.6 2023-10-04 11:04:48,096 WARNING [train.py:1204] (2/4) Exclude cut with ID _7160_210_1876_1_1527940923231_3703799_282-322611-0 from training. Duration: 0.87 2023-10-04 11:04:51,161 WARNING [train.py:1204] (2/4) Exclude cut with ID _67335_210_5219_1_1535525975652_4275229_570-262983-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 11:04:51,267 WARNING [train.py:1204] (2/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_625-338546-0_sp1.1 from training. Duration: 0.8 2023-10-04 11:04:53,171 WARNING [train.py:1204] (2/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_176-56120-0 from training. Duration: 0.83 2023-10-04 11:04:56,007 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.0.conv_module1.balancer1.prob, batch_count=1631920.0, ans=0.125 2023-10-04 11:04:56,039 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.bypass.scale_min, batch_count=1631920.0, ans=0.2 2023-10-04 11:04:57,155 WARNING [train.py:1204] (2/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_963-187109-0_sp1.1 from training. Duration: 0.9 2023-10-04 11:05:00,460 WARNING [train.py:1204] (2/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_121-284152-0_sp0.9 from training. Duration: 0.6555625 2023-10-04 11:05:01,752 WARNING [train.py:1204] (2/4) Exclude cut with ID _45021_210_9994_1_1533619751094_3330288_104-81711-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 11:05:03,374 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=1631986.6666666667, ans=0.1 2023-10-04 11:05:04,586 WARNING [train.py:1204] (2/4) Exclude cut with ID _51382_210_23862_1_1534492801838_3650104_129-343914-0_sp1.1 from training. Duration: 0.936375 2023-10-04 11:05:04,616 WARNING [train.py:1204] (2/4) Exclude cut with ID _14970_210_15709_1_1530775547361_1966965_8-169924-0 from training. Duration: 0.92 2023-10-04 11:05:07,438 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_2295_210_2087_1_1525500993_2407863_234-83104-0_sp1.1 from training. Duration: 0.563625 2023-10-04 11:05:07,519 WARNING [train.py:1204] (2/4) Exclude cut with ID _14687_210_5854_1_1530703838259_3664889_115-239046-0 from training. Duration: 0.67 2023-10-04 11:05:10,625 WARNING [train.py:1204] (2/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_690-126306-0_sp1.1 from training. Duration: 0.7090625 2023-10-04 11:05:10,633 WARNING [train.py:1204] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_625-102955-0_sp0.9 from training. Duration: 0.8555625 2023-10-04 11:05:12,006 WARNING [train.py:1204] (2/4) Exclude cut with ID _9389_210_1664_1_1529217337301_5289269_289-213417-0 from training. Duration: 0.89 2023-10-04 11:05:13,452 WARNING [train.py:1204] (2/4) Exclude cut with ID _13253_210_1925_1_1530757775819_3577873_191-144977-0_sp0.9 from training. Duration: 0.8666875 2023-10-04 11:05:14,789 WARNING [train.py:1204] (2/4) Exclude cut with ID _21783_210_11357_1_1531742458730_3762679_325-155703-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 11:05:14,818 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_2295_210_2087_1_1525500993_2407863_116-238894-0_sp0.9 from training. Duration: 0.77775 2023-10-04 11:05:17,425 INFO [train.py:1046] (2/4) Epoch 47, batch 450, loss[loss=0.1596, simple_loss=0.2437, pruned_loss=0.03774, over 23976.00 frames. ], tot_loss[loss=0.153, simple_loss=0.2332, pruned_loss=0.03637, over 4228684.45 frames. ], batch size: 86, lr: 2.17e-03, grad_scale: 16.0 2023-10-04 11:05:17,485 WARNING [train.py:1204] (2/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_94-126128-0 from training. Duration: 0.86 2023-10-04 11:05:17,509 WARNING [train.py:1204] (2/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_395-310220-0_sp0.9 from training. Duration: 0.988875 2023-10-04 11:05:17,580 WARNING [train.py:1204] (2/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_821-110852-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 11:05:18,948 WARNING [train.py:1204] (2/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_64-248359-0_sp1.1 from training. Duration: 0.62725 2023-10-04 11:05:18,960 WARNING [train.py:1204] (2/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_138-187031-0 from training. Duration: 0.99 2023-10-04 11:05:19,003 WARNING [train.py:1204] (2/4) Exclude cut with ID _4122_210_2084_1_1526713164417_4353280_528-206771-0_sp0.9 from training. Duration: 0.97775 2023-10-04 11:05:19,179 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.self_attn_weights.pos_emb_skip_rate, batch_count=1632053.3333333333, ans=0.0 2023-10-04 11:05:20,363 WARNING [train.py:1204] (2/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_844-96989-0_sp1.1 from training. Duration: 0.6454375 2023-10-04 11:05:21,825 WARNING [train.py:1204] (2/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_106-30782-0_sp1.1 from training. Duration: 0.72725 2023-10-04 11:05:31,483 WARNING [train.py:1204] (2/4) Exclude cut with ID _14683_210_5219_1_1530698352608_3974859_281-250040-0_sp1.1 from training. Duration: 0.936375 2023-10-04 11:05:31,984 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.5.encoder.layers.0.self_attn1.whiten, num_groups=1, num_channels=256, metric=11.75 vs. limit=22.5 2023-10-04 11:05:32,824 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_46013_210_7234_1_1533776362_3744032_463-52378-0_sp0.9 from training. Duration: 0.9555625 2023-10-04 11:05:34,307 WARNING [train.py:1204] (2/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_299-34413-0 from training. Duration: 0.65 2023-10-04 11:05:35,679 WARNING [train.py:1204] (2/4) Exclude cut with ID _35799_210_5854_1_1533263487887_3529071_490-326847-0 from training. Duration: 0.99 2023-10-04 11:05:36,027 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=1632120.0, ans=0.1 2023-10-04 11:05:37,204 WARNING [train.py:1204] (2/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_589-9070-0_sp0.9 from training. Duration: 0.788875 2023-10-04 11:05:40,441 WARNING [train.py:1204] (2/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_884-29515-0_sp1.1 from training. Duration: 0.936375 2023-10-04 11:05:41,783 WARNING [train.py:1204] (2/4) Exclude cut with ID _15012_210_1968_1_1530856820429_5095350_474-119995-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 11:05:43,658 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.4.encoder.layers.2.feed_forward3.out_whiten, num_groups=1, num_channels=384, metric=13.52 vs. limit=15.0 2023-10-04 11:05:44,635 WARNING [train.py:1204] (2/4) Exclude cut with ID _65585_210_12210_1_1535450451584_3764098_211-97124-0_sp1.1 from training. Duration: 0.963625 2023-10-04 11:05:45,945 WARNING [train.py:1204] (2/4) Exclude cut with ID _33640_210_2655_1_1533117605479_4855469_666-224463-0_sp1.1 from training. Duration: 0.963625 2023-10-04 11:05:47,455 WARNING [train.py:1204] (2/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_35-153886-0 from training. Duration: 0.84 2023-10-04 11:05:48,982 WARNING [train.py:1204] (2/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_210-106047-0 from training. Duration: 0.97 2023-10-04 11:05:49,105 WARNING [train.py:1204] (2/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_479-125611-0 from training. Duration: 0.96 2023-10-04 11:05:50,414 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_9417_210_1794_1_1529235261_4614131_427-153963-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 11:05:50,508 WARNING [train.py:1204] (2/4) Exclude cut with ID _23818_210_6228_1_1531917063884_3676920_133-170571-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 11:05:52,012 WARNING [train.py:1204] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_602-275260-0_sp1.1 from training. Duration: 0.7454375 2023-10-04 11:05:53,983 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9029W0121-509521 from training. Duration: 0.896 2023-10-04 11:05:53,992 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9029W0434-509821 from training. Duration: 0.64 2023-10-04 11:05:54,009 WARNING [train.py:1204] (2/4) Exclude cut with ID _23038_210_9570_1_1532001584158_3652419_289-307034-0_sp1.1 from training. Duration: 0.936375 2023-10-04 11:05:56,010 WARNING [train.py:1204] (2/4) Exclude cut with ID _29345_210_4381_1_1532689168263_3860169_149-257268-0_sp0.9 from training. Duration: 0.9 2023-10-04 11:05:56,087 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_405-339986-0_sp1.1 from training. Duration: 0.463625 2023-10-04 11:05:57,542 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=1632186.6666666667, ans=0.1 2023-10-04 11:05:59,274 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_2655_210_2087_1_1525688959_3897400_437-337690-0_sp1.1 from training. Duration: 0.57275 2023-10-04 11:05:59,306 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7543_210_6753_1_1528541907_1399322_165-323619-0_sp1.1 from training. Duration: 0.62725 2023-10-04 11:06:00,475 WARNING [train.py:1204] (2/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_508-335369-0_sp1.1 from training. Duration: 0.436375 2023-10-04 11:06:01,725 WARNING [train.py:1204] (2/4) Exclude cut with ID _7852_210_5219_1_1528286471316_8139400_358-287492-0 from training. Duration: 0.96 2023-10-04 11:06:03,110 WARNING [train.py:1204] (2/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_322-217220-0_sp0.9 from training. Duration: 0.9555625 2023-10-04 11:06:04,530 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_3171_210_2087_1_1526109084_2330007_66-260583-0_sp0.9 from training. Duration: 0.8 2023-10-04 11:06:04,565 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8558_210_1410_1_1528505225_7613406_131-329524-0_sp1.1 from training. Duration: 0.7181875 2023-10-04 11:06:06,310 WARNING [train.py:1204] (2/4) Exclude cut with ID _9235_210_5219_1_1528891154253_8046120_30-326681-0 from training. Duration: 0.83 2023-10-04 11:06:11,661 WARNING [train.py:1204] (2/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_58-68956-0_sp0.9 from training. Duration: 0.92225 2023-10-04 11:06:12,987 WARNING [train.py:1204] (2/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_255-312654-0 from training. Duration: 0.91 2023-10-04 11:06:13,058 WARNING [train.py:1204] (2/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_99-167392-0 from training. Duration: 0.61 2023-10-04 11:06:14,414 WARNING [train.py:1204] (2/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_94-126128-0_sp0.9 from training. Duration: 0.9555625 2023-10-04 11:06:18,639 WARNING [train.py:1204] (2/4) Exclude cut with ID _68635_210_9317_1_1535770813410_4315870_187-192817-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 11:06:21,189 WARNING [train.py:1204] (2/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_277-204970-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 11:06:21,989 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.2.encoder.layers.1.nonlin_attention.whiten2, num_groups=1, num_channels=384, metric=5.19 vs. limit=15.0 2023-10-04 11:06:23,101 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6163_210_6400_1_1527506746_4263068_283-137811-0_sp0.9 from training. Duration: 0.8555625 2023-10-04 11:06:23,121 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9144W0121-566446 from training. Duration: 0.896 2023-10-04 11:06:23,232 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.1.feed_forward1.hidden_balancer.prob, batch_count=1632320.0, ans=0.125 2023-10-04 11:06:23,570 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.5.encoder.layers.0.conv_module1.whiten, num_groups=1, num_channels=256, metric=6.78 vs. limit=15.0 2023-10-04 11:06:27,507 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.678e+02 1.974e+02 2.147e+02 2.474e+02 4.734e+02, threshold=4.294e+02, percent-clipped=1.0 2023-10-04 11:06:27,621 WARNING [train.py:1204] (2/4) Exclude cut with ID _49371_210_12071_1_1533952761021_3812190_351-315519-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 11:06:28,949 WARNING [train.py:1204] (2/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_115-326778-0_sp1.1 from training. Duration: 0.8454375 2023-10-04 11:06:30,647 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_17292_210_6753_1_1531125388_2943734_436-198965-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 11:06:30,656 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9165W0117-576751 from training. Duration: 0.896 2023-10-04 11:06:31,997 INFO [train.py:1046] (2/4) Epoch 47, batch 500, loss[loss=0.1576, simple_loss=0.2453, pruned_loss=0.03501, over 24623.00 frames. ], tot_loss[loss=0.154, simple_loss=0.2343, pruned_loss=0.03682, over 4338031.44 frames. ], batch size: 73, lr: 2.17e-03, grad_scale: 8.0 2023-10-04 11:06:32,080 WARNING [train.py:1204] (2/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_212-149947-0 from training. Duration: 0.73 2023-10-04 11:06:32,089 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_13757_210_3528_1_1530440682_4107239_65-167544-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 11:06:33,622 WARNING [train.py:1204] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_700-183549-0_sp0.9 from training. Duration: 0.7555625 2023-10-04 11:06:37,699 WARNING [train.py:1204] (2/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_216-343007-0_sp0.9 from training. Duration: 0.7333125 2023-10-04 11:06:39,621 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7544_210_6753_1_1528587722_3576686_295-258479-0_sp1.1 from training. Duration: 0.763625 2023-10-04 11:06:41,135 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_365-287106-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 11:06:41,150 WARNING [train.py:1204] (2/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_300-3747-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 11:06:42,511 WARNING [train.py:1204] (2/4) Exclude cut with ID _14069_210_5632_1_1530512692033_4803390_487-303201-0_sp1.1 from training. Duration: 0.92725 2023-10-04 11:06:52,147 WARNING [train.py:1204] (2/4) Exclude cut with ID _14459_210_2228_1_1531443641946_7516700_668-123026-0_sp1.1 from training. Duration: 0.936375 2023-10-04 11:06:53,943 WARNING [train.py:1204] (2/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_108-53929-0_sp1.1 from training. Duration: 0.536375 2023-10-04 11:06:53,989 WARNING [train.py:1204] (2/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_266-137104-0_sp0.9 from training. Duration: 0.72225 2023-10-04 11:06:55,296 WARNING [train.py:1204] (2/4) Exclude cut with ID _12302_210_5219_1_1529924446466_8107280_902-168619-0_sp1.1 from training. Duration: 0.936375 2023-10-04 11:06:55,317 WARNING [train.py:1204] (2/4) Exclude cut with ID _59502_210_12210_1_1535107024698_1835139_146-162495-0 from training. Duration: 0.92 2023-10-04 11:06:55,332 WARNING [train.py:1204] (2/4) Exclude cut with ID _3465_210_3525_1_1526175667060_3574049_412-287526-0_sp1.1 from training. Duration: 0.6545625 2023-10-04 11:06:58,127 WARNING [train.py:1204] (2/4) Exclude cut with ID _7160_210_1876_1_1527940923231_3703799_120-150943-0_sp0.9 from training. Duration: 0.9 2023-10-04 11:06:58,185 WARNING [train.py:1204] (2/4) Exclude cut with ID _15037_210_2253_1_1530752294104_3267650_296-192597-0_sp1.1 from training. Duration: 0.736375 2023-10-04 11:06:58,204 WARNING [train.py:1204] (2/4) Exclude cut with ID _13252_210_1925_1_1530671372047_3336909_397-216497-0_sp1.1 from training. Duration: 0.87275 2023-10-04 11:06:58,226 WARNING [train.py:1204] (2/4) Exclude cut with ID _40094_210_16380_1_1533294005773_3641159_76-269320-0_sp1.1 from training. Duration: 0.936375 2023-10-04 11:06:59,965 WARNING [train.py:1204] (2/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_449-15508-0 from training. Duration: 0.82 2023-10-04 11:07:04,499 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9309W0194-640321 from training. Duration: 0.64 2023-10-04 11:07:07,293 WARNING [train.py:1204] (2/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_284-312178-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 11:07:07,393 WARNING [train.py:1204] (2/4) Exclude cut with ID _29853_210_7497_1_1532505600851_6642110_401-55937-0_sp1.1 from training. Duration: 0.92725 2023-10-04 11:07:08,744 WARNING [train.py:1204] (2/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_716-24918-0_sp1.1 from training. Duration: 0.92725 2023-10-04 11:07:08,774 WARNING [train.py:1204] (2/4) Exclude cut with ID _19637_210_4220_1_1531537167676_3205060_314-305638-0_sp1.1 from training. Duration: 0.92725 2023-10-04 11:07:10,170 WARNING [train.py:1204] (2/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_475-127455-0_sp1.1 from training. Duration: 0.636375 2023-10-04 11:07:12,123 WARNING [train.py:1204] (2/4) Exclude cut with ID _5444_210_2151_1_1527418808464_5312210_581-315411-0 from training. Duration: 0.62 2023-10-04 11:07:14,925 WARNING [train.py:1204] (2/4) Exclude cut with ID _44902_210_16187_1_1533888006062_3792078_149-67212-0_sp0.9 from training. Duration: 0.9666875 2023-10-04 11:07:14,992 WARNING [train.py:1204] (2/4) Exclude cut with ID _31602_210_5854_1_1532677021456_4458250_63-63253-0_sp1.1 from training. Duration: 0.963625 2023-10-04 11:07:16,429 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.1.conv_skip_rate, batch_count=1632586.6666666667, ans=0.0 2023-10-04 11:07:20,415 WARNING [train.py:1204] (2/4) Exclude cut with ID _55209_210_1531_1_1534675828996_4597160_660-203992-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 11:07:20,684 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.ff2_skip_rate, batch_count=1632586.6666666667, ans=0.0 2023-10-04 11:07:20,737 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.balancer2.prob, batch_count=1632586.6666666667, ans=0.125 2023-10-04 11:07:21,753 WARNING [train.py:1204] (2/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_83-190624-0_sp1.1 from training. Duration: 0.92725 2023-10-04 11:07:27,780 WARNING [train.py:1204] (2/4) Exclude cut with ID _58605_210_12210_1_1534723195939_3904259_154-139635-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 11:07:30,563 WARNING [train.py:1204] (2/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_164-224052-0 from training. Duration: 0.7 2023-10-04 11:07:30,572 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_1126-189071-0_sp1.1 from training. Duration: 0.963625 2023-10-04 11:07:30,582 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_15396_210_6890_1_1531308473_1938936_310-214789-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 11:07:34,418 WARNING [train.py:1204] (2/4) Exclude cut with ID _8395_210_1531_1_1528597871523_4411579_431-132470-0 from training. Duration: 0.74 2023-10-04 11:07:34,474 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_3780_210_2087_1_1526467760_2130483_170-259044-0_sp0.9 from training. Duration: 0.77775 2023-10-04 11:07:35,837 WARNING [train.py:1204] (2/4) Exclude cut with ID _12733_210_8341_1_1530167424556_3646380_537-230640-0_sp1.1 from training. Duration: 0.963625 2023-10-04 11:07:36,091 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.out_combiner.scale_min, batch_count=1632653.3333333333, ans=0.2 2023-10-04 11:07:40,424 WARNING [train.py:1204] (2/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_159-281310-0 from training. Duration: 0.95 2023-10-04 11:07:41,903 WARNING [train.py:1204] (2/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_434-83000-0 from training. Duration: 0.98 2023-10-04 11:07:41,923 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_4405_210_1849_1_1526732205_3593939_493-32464-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 11:07:41,932 WARNING [train.py:1204] (2/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_342-100157-0 from training. Duration: 0.54 2023-10-04 11:07:41,999 WARNING [train.py:1204] (2/4) Exclude cut with ID _44778_210_2054_1_1533798127203_7443090_765-250133-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 11:07:42,003 WARNING [train.py:1204] (2/4) Exclude cut with ID _38670_210_15379_1_1533448810659_4391059_81-312245-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 11:07:43,359 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_50050_210_16223_1_1535435796_7290138_1273-247611-0_sp1.1 from training. Duration: 0.936375 2023-10-04 11:07:44,679 WARNING [train.py:1204] (2/4) Exclude cut with ID _44831_210_13427_1_1533816011549_3689035_399-18591-0_sp1.1 from training. Duration: 0.936375 2023-10-04 11:07:44,695 WARNING [train.py:1204] (2/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_557-324258-0_sp0.9 from training. Duration: 0.9 2023-10-04 11:07:45,985 INFO [train.py:1046] (2/4) Epoch 47, batch 550, loss[loss=0.146, simple_loss=0.2253, pruned_loss=0.03341, over 24578.00 frames. ], tot_loss[loss=0.1544, simple_loss=0.2349, pruned_loss=0.03699, over 4419707.02 frames. ], batch size: 60, lr: 2.17e-03, grad_scale: 8.0 2023-10-04 11:07:46,067 WARNING [train.py:1204] (2/4) Exclude cut with ID _3465_210_3525_1_1526175667060_3574049_62-335552-0_sp1.1 from training. Duration: 0.7909375 2023-10-04 11:07:46,934 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.1.encoder.layers.0.feed_forward3.out_whiten, num_groups=1, num_channels=256, metric=12.02 vs. limit=15.0 2023-10-04 11:07:48,764 WARNING [train.py:1204] (2/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_673-264125-0_sp1.1 from training. Duration: 0.963625 2023-10-04 11:07:48,850 WARNING [train.py:1204] (2/4) Exclude cut with ID _8030_210_7497_1_1528444886781_3531089_262-86750-0 from training. Duration: 0.81 2023-10-04 11:07:50,124 WARNING [train.py:1204] (2/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_444-28041-0_sp1.1 from training. Duration: 0.77275 2023-10-04 11:07:51,655 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=1632720.0, ans=0.1 2023-10-04 11:07:51,665 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.ff2_skip_rate, batch_count=1632720.0, ans=0.0 2023-10-04 11:07:56,003 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_34008_210_10431_1_1533345424_8183247_516-14858-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 11:07:56,023 WARNING [train.py:1204] (2/4) Exclude cut with ID _32704_210_8954_1_1533017488840_6747370_54-248218-0_sp1.1 from training. Duration: 0.936375 2023-10-04 11:07:58,821 WARNING [train.py:1204] (2/4) Exclude cut with ID _13253_210_1925_1_1530757775819_3577873_300-119782-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 11:08:00,188 WARNING [train.py:1204] (2/4) Exclude cut with ID _39290_210_4220_1_1533862778760_3075118_21-240703-0_sp1.1 from training. Duration: 0.936375 2023-10-04 11:08:04,964 WARNING [train.py:1204] (2/4) Exclude cut with ID 774-127930-0014-10412-0_sp1.1 from training. Duration: 0.95 2023-10-04 11:08:06,853 WARNING [train.py:1204] (2/4) Exclude cut with ID _67499_210_17282_1_1535509805220_3692430_20-179346-0 from training. Duration: 0.97 2023-10-04 11:08:07,443 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.2.encoder.layers.2.whiten, num_groups=1, num_channels=384, metric=4.44 vs. limit=12.0 2023-10-04 11:08:09,476 WARNING [train.py:1204] (2/4) Exclude cut with ID _8365_210_2064_1_1528592311070_4677690_431-294102-0_sp0.9 from training. Duration: 0.988875 2023-10-04 11:08:15,800 WARNING [train.py:1204] (2/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_365-1492-0_sp1.1 from training. Duration: 0.82725 2023-10-04 11:08:15,834 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_105-114797-0_sp0.9 from training. Duration: 0.9333125 2023-10-04 11:08:17,185 WARNING [train.py:1204] (2/4) Exclude cut with ID _6274_210_2084_1_1527937583837_7494020_769-168302-0_sp0.9 from training. Duration: 0.788875 2023-10-04 11:08:19,931 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_40340_210_9024_1_1533433916_4132944_320-270277-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 11:08:19,936 WARNING [train.py:1204] (2/4) Exclude cut with ID ID0203W0003-781711 from training. Duration: 0.958 2023-10-04 11:08:20,014 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_30434_210_13049_1_1532519384_981746_52-12932-0_sp1.1 from training. Duration: 0.936375 2023-10-04 11:08:21,485 WARNING [train.py:1204] (2/4) Exclude cut with ID _8263_210_1664_1_1528438953494_294100_40-204731-0_sp0.9 from training. Duration: 0.5555625 2023-10-04 11:08:22,365 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.0.layers.0.conv_module1.whiten, num_groups=1, num_channels=192, metric=11.59 vs. limit=15.0 2023-10-04 11:08:24,209 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1032-236863-0_sp0.9 from training. Duration: 0.9333125 2023-10-04 11:08:25,504 WARNING [train.py:1204] (2/4) Exclude cut with ID _8394_210_6702_1_1528590504316_3540189_335-274780-0_sp0.9 from training. Duration: 0.8666875 2023-10-04 11:08:25,512 WARNING [train.py:1204] (2/4) Exclude cut with ID _66873_210_5517_1_1535454136506_4293370_654-52713-0_sp1.1 from training. Duration: 0.7 2023-10-04 11:08:26,832 WARNING [train.py:1204] (2/4) Exclude cut with ID _5250_210_5219_1_1527249561806_8692110_742-267856-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 11:08:28,692 WARNING [train.py:1204] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_74-48717-0 from training. Duration: 0.58 2023-10-04 11:08:30,162 WARNING [train.py:1204] (2/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_18-46756-0 from training. Duration: 0.79 2023-10-04 11:08:30,213 WARNING [train.py:1204] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_612-56061-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 11:08:30,221 WARNING [train.py:1204] (2/4) Exclude cut with ID _56423_210_5854_1_1534896061049_3165969_412-2403-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 11:08:31,525 WARNING [train.py:1204] (2/4) Exclude cut with ID _62574_210_2655_1_1535180407058_3933249_120-196848-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 11:08:31,525 WARNING [train.py:1204] (2/4) Exclude cut with ID _40519_210_13794_1_1533715228655_7337390_916-51000-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 11:08:34,303 WARNING [train.py:1204] (2/4) Exclude cut with ID _65263_210_22356_1_1535763598691_3921037_108-21902-0_sp1.1 from training. Duration: 0.9 2023-10-04 11:08:36,323 WARNING [train.py:1204] (2/4) Exclude cut with ID _4122_210_2084_1_1526713164417_4353280_528-206771-0_sp1.1 from training. Duration: 0.8 2023-10-04 11:08:37,213 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.0.layers.0.conv_module1.whiten, num_groups=1, num_channels=192, metric=11.17 vs. limit=15.0 2023-10-04 11:08:38,881 WARNING [train.py:1204] (2/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_43-325657-0_sp1.1 from training. Duration: 0.92725 2023-10-04 11:08:38,932 WARNING [train.py:1204] (2/4) Exclude cut with ID _44659_210_2721_1_1534036120061_6614860_314-82041-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 11:08:40,301 WARNING [train.py:1204] (2/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_81-300212-0_sp0.9 from training. Duration: 0.6666875 2023-10-04 11:08:40,374 WARNING [train.py:1204] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_665-196155-0_sp0.9 from training. Duration: 0.7444375 2023-10-04 11:08:42,310 WARNING [train.py:1204] (2/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_140-336881-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 11:08:44,078 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6290_210_3273_1_1527595224_1282787_115-87095-0_sp0.9 from training. Duration: 0.611125 2023-10-04 11:08:44,136 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_53373_210_5399_1_1534229471_4715695_607-259685-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 11:08:46,831 WARNING [train.py:1204] (2/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_175-163894-0_sp1.1 from training. Duration: 0.67275 2023-10-04 11:08:46,869 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7002_210_1794_1_1527939683_5157286_1-281264-0_sp1.1 from training. Duration: 0.4 2023-10-04 11:08:52,376 WARNING [train.py:1204] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_605-257205-0 from training. Duration: 0.59 2023-10-04 11:08:55,005 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.730e+02 2.110e+02 2.356e+02 2.673e+02 4.561e+02, threshold=4.712e+02, percent-clipped=1.0 2023-10-04 11:08:55,137 WARNING [train.py:1204] (2/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_454-314901-0 from training. Duration: 0.46 2023-10-04 11:08:56,579 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_46032_210_14656_1_1533781412_4507969_287-130422-0_sp1.1 from training. Duration: 0.97275 2023-10-04 11:08:57,847 WARNING [train.py:1204] (2/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_186-309161-0_sp1.1 from training. Duration: 0.4909375 2023-10-04 11:08:57,895 WARNING [train.py:1204] (2/4) Exclude cut with ID _42659_210_2070_1_1533607442765_3120380_45-342307-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 11:08:59,186 INFO [train.py:1046] (2/4) Epoch 47, batch 600, loss[loss=0.145, simple_loss=0.2365, pruned_loss=0.02676, over 24658.00 frames. ], tot_loss[loss=0.1542, simple_loss=0.2348, pruned_loss=0.03682, over 4484497.06 frames. ], batch size: 68, lr: 2.17e-03, grad_scale: 8.0 2023-10-04 11:09:02,873 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.conv_module1.balancer2.min_positive, batch_count=1633053.3333333333, ans=0.05 2023-10-04 11:09:04,126 WARNING [train.py:1204] (2/4) Exclude cut with ID _12555_210_8750_1_1530075594402_3780239_478-326818-0_sp1.1 from training. Duration: 0.963625 2023-10-04 11:09:07,366 WARNING [train.py:1204] (2/4) Exclude cut with ID _7606_210_4915_1_1528894253277_5080690_127-177761-0_sp1.1 from training. Duration: 0.5818125 2023-10-04 11:09:08,739 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6788_210_1388_1_1527898698_4562021_91-197981-0 from training. Duration: 0.83 2023-10-04 11:09:10,400 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.feed_forward1.hidden_balancer.prob, batch_count=1633053.3333333333, ans=0.125 2023-10-04 11:09:11,523 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8558_210_1410_1_1528505225_7613406_131-329524-0_sp0.9 from training. Duration: 0.87775 2023-10-04 11:09:13,459 WARNING [train.py:1204] (2/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_153-153537-0_sp1.1 from training. Duration: 0.82725 2023-10-04 11:09:14,930 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_17292_210_6753_1_1531125388_2943734_285-349105-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 11:09:15,107 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.bypass.scale_min, batch_count=1633120.0, ans=0.2 2023-10-04 11:09:17,629 WARNING [train.py:1204] (2/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_37-41919-0 from training. Duration: 0.87 2023-10-04 11:09:17,652 WARNING [train.py:1204] (2/4) Exclude cut with ID _60860_210_8254_1_1534896015875_3557169_172-294852-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 11:09:23,183 WARNING [train.py:1204] (2/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_146-276944-0 from training. Duration: 0.83 2023-10-04 11:09:27,163 WARNING [train.py:1204] (2/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_1254-44082-0_sp1.1 from training. Duration: 0.936375 2023-10-04 11:09:27,185 WARNING [train.py:1204] (2/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_1115-180350-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 11:09:27,208 WARNING [train.py:1204] (2/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_60-146189-0_sp1.1 from training. Duration: 0.8909375 2023-10-04 11:09:33,337 WARNING [train.py:1204] (2/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_517-153649-0_sp1.1 from training. Duration: 0.92725 2023-10-04 11:09:33,348 WARNING [train.py:1204] (2/4) Exclude cut with ID _39571_210_13449_1_1533801584143_3651077_37-266257-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 11:09:34,719 WARNING [train.py:1204] (2/4) Exclude cut with ID _6653_210_2629_1_1527919172441_6732679_527-276118-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 11:09:40,867 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7696_210_2040_1_1528207167_3716099_117-125434-0_sp1.1 from training. Duration: 0.6909375 2023-10-04 11:09:45,971 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_25767_210_2161_1_1532307483_3867748_478-156360-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 11:09:45,975 WARNING [train.py:1204] (2/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_177-49092-0_sp1.1 from training. Duration: 0.82725 2023-10-04 11:09:45,982 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_11305_210_1794_1_1529750428_7178148_680-317073-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 11:09:54,186 WARNING [train.py:1204] (2/4) Exclude cut with ID _53565_210_10635_1_1534337218335_3820259_287-18120-0 from training. Duration: 0.97 2023-10-04 11:09:55,855 INFO [scaling.py:1118] (2/4) WithLoss: name=encoder.encoders.0.layers.1.self_attn_weights, loss-sum=0.000e+00 2023-10-04 11:09:58,494 WARNING [train.py:1204] (2/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_498-40153-0_sp1.1 from training. Duration: 0.57275 2023-10-04 11:09:58,512 WARNING [train.py:1204] (2/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_555-63066-0_sp1.1 from training. Duration: 0.963625 2023-10-04 11:10:01,776 WARNING [train.py:1204] (2/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_395-310220-0 from training. Duration: 0.89 2023-10-04 11:10:03,646 WARNING [train.py:1204] (2/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_937-208281-0_sp1.1 from training. Duration: 0.77275 2023-10-04 11:10:05,094 WARNING [train.py:1204] (2/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_120-211630-0 from training. Duration: 0.92 2023-10-04 11:10:05,132 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_454-276429-0_sp1.1 from training. Duration: 0.7909375 2023-10-04 11:10:06,394 WARNING [train.py:1204] (2/4) Exclude cut with ID _22826_210_9994_1_1531908201522_3279169_404-75889-0_sp1.1 from training. Duration: 0.8090625 2023-10-04 11:10:08,543 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.conv_module1.balancer1.prob, batch_count=1633320.0, ans=0.125 2023-10-04 11:10:12,311 WARNING [train.py:1204] (2/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_282-136641-0_sp0.9 from training. Duration: 0.4666875 2023-10-04 11:10:13,951 INFO [train.py:1046] (2/4) Epoch 47, batch 650, loss[loss=0.1403, simple_loss=0.2297, pruned_loss=0.02551, over 24494.00 frames. ], tot_loss[loss=0.1538, simple_loss=0.2343, pruned_loss=0.03663, over 4546387.72 frames. ], batch size: 66, lr: 2.17e-03, grad_scale: 8.0 2023-10-04 11:10:14,066 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_809-17156-0_sp1.1 from training. Duration: 0.663625 2023-10-04 11:10:14,317 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.conv_module2.balancer1.min_positive, batch_count=1633386.6666666667, ans=0.025 2023-10-04 11:10:16,054 WARNING [train.py:1204] (2/4) Exclude cut with ID _9235_210_5219_1_1528891154253_8046120_1003-101638-0_sp0.9 from training. Duration: 0.811125 2023-10-04 11:10:17,511 WARNING [train.py:1204] (2/4) Exclude cut with ID _22826_210_9994_1_1531908201522_3279169_404-75889-0_sp0.9 from training. Duration: 0.988875 2023-10-04 11:10:20,234 WARNING [train.py:1204] (2/4) Exclude cut with ID _59288_210_5854_1_1535077757774_3412246_570-295255-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 11:10:22,945 WARNING [train.py:1204] (2/4) Exclude cut with ID _3465_210_3525_1_1526175667060_3574049_412-287526-0 from training. Duration: 0.72 2023-10-04 11:10:24,176 WARNING [train.py:1204] (2/4) Exclude cut with ID _44010_210_4525_1_1533884262964_4013620_649-79655-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 11:10:24,383 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.conv_module1.balancer1.prob, batch_count=1633386.6666666667, ans=0.125 2023-10-04 11:10:29,719 WARNING [train.py:1204] (2/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_57-270037-0_sp0.9 from training. Duration: 0.8555625 2023-10-04 11:10:29,719 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_34815_210_6388_1_1533381320_3571903_302-255963-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 11:10:32,638 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_47449_210_5399_1_1534076445_4006929_573-301852-0_sp1.1 from training. Duration: 0.97275 2023-10-04 11:10:36,102 WARNING [train.py:1204] (2/4) Exclude cut with ID 6709-74022-0004-86860-0_sp1.1 from training. Duration: 0.9409375 2023-10-04 11:10:38,708 WARNING [train.py:1204] (2/4) Exclude cut with ID _40519_210_13794_1_1533715228655_7337390_111-71991-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 11:10:38,745 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_30167_210_3528_1_1532428012_4772375_169-214186-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 11:10:43,347 WARNING [train.py:1204] (2/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_20-235313-0_sp1.1 from training. Duration: 0.963625 2023-10-04 11:10:43,393 WARNING [train.py:1204] (2/4) Exclude cut with ID _4681_210_3525_1_1527251040321_1235713_123-24428-0_sp0.9 from training. Duration: 0.5333125 2023-10-04 11:10:45,433 WARNING [train.py:1204] (2/4) Exclude cut with ID _31602_210_5854_1_1532677021456_4458250_444-198752-0_sp1.1 from training. Duration: 0.97275 2023-10-04 11:10:46,787 WARNING [train.py:1204] (2/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_9-279442-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 11:10:46,846 WARNING [train.py:1204] (2/4) Exclude cut with ID _6275_210_2084_1_1528110062213_7669049_973-198085-0_sp0.9 from training. Duration: 0.8333125 2023-10-04 11:10:48,937 WARNING [train.py:1204] (2/4) Exclude cut with ID _29853_210_7497_1_1532505600851_6642110_567-250221-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 11:10:49,036 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6545_210_2040_1_1527774670_4245258_407-48070-0_sp1.1 from training. Duration: 0.6909375 2023-10-04 11:10:50,603 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.conv_module1.balancer2.prob, batch_count=1633520.0, ans=0.125 2023-10-04 11:10:51,685 WARNING [train.py:1204] (2/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_224-343295-0_sp0.9 from training. Duration: 0.611125 2023-10-04 11:10:51,700 WARNING [train.py:1204] (2/4) Exclude cut with ID IC0125W0011-61817 from training. Duration: 0.88 2023-10-04 11:10:51,708 WARNING [train.py:1204] (2/4) Exclude cut with ID _60113_210_16271_1_1535164212481_4386079_435-154371-0_sp1.1 from training. Duration: 0.97275 2023-10-04 11:10:51,721 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_25836_210_16554_1_1532138167_53197_3-9394-0_sp1.1 from training. Duration: 0.92725 2023-10-04 11:10:51,925 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.1.conv_module1.balancer2.prob, batch_count=1633520.0, ans=0.125 2023-10-04 11:10:54,614 WARNING [train.py:1204] (2/4) Exclude cut with ID _29343_210_4381_1_1532602757752_3674109_57-55125-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 11:10:56,069 WARNING [train.py:1204] (2/4) Exclude cut with ID _56235_210_10640_1_1534557335235_4161099_267-23560-0_sp1.1 from training. Duration: 0.936375 2023-10-04 11:10:56,094 WARNING [train.py:1204] (2/4) Exclude cut with ID _30196_210_1664_1_1533708496522_7074140_1150-278867-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 11:10:57,316 WARNING [train.py:1204] (2/4) Exclude cut with ID _6270_210_2084_1_1527850911573_3856420_26-277343-0_sp0.9 from training. Duration: 0.911125 2023-10-04 11:10:58,738 WARNING [train.py:1204] (2/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_325-62802-0 from training. Duration: 0.87 2023-10-04 11:11:00,138 WARNING [train.py:1204] (2/4) Exclude cut with ID _56524_210_9642_1_1534572011779_3619170_145-310842-0_sp1.1 from training. Duration: 0.9 2023-10-04 11:11:00,181 WARNING [train.py:1204] (2/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_860-96198-0_sp0.9 from training. Duration: 0.811125 2023-10-04 11:11:01,455 WARNING [train.py:1204] (2/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_17-25045-0_sp1.1 from training. Duration: 0.536375 2023-10-04 11:11:01,482 WARNING [train.py:1204] (2/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_349-69727-0_sp1.1 from training. Duration: 0.936375 2023-10-04 11:11:02,823 WARNING [train.py:1204] (2/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_580-93798-0_sp1.1 from training. Duration: 0.5090625 2023-10-04 11:11:04,178 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7762_210_1794_1_1528199508_4057408_419-125009-0 from training. Duration: 0.83 2023-10-04 11:11:04,371 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.0.conv_skip_rate, batch_count=1633586.6666666667, ans=0.0 2023-10-04 11:11:05,678 WARNING [train.py:1204] (2/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_356-167816-0 from training. Duration: 0.72 2023-10-04 11:11:05,706 WARNING [train.py:1204] (2/4) Exclude cut with ID _29345_210_4381_1_1532689168263_3860169_225-97100-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 11:11:05,710 WARNING [train.py:1204] (2/4) Exclude cut with ID _14227_210_4381_1_1530701804286_3940259_374-194712-0_sp1.1 from training. Duration: 0.92725 2023-10-04 11:11:05,745 WARNING [train.py:1204] (2/4) Exclude cut with ID _40137_210_12210_1_1533290484046_3795180_249-187059-0_sp0.9 from training. Duration: 0.97775 2023-10-04 11:11:07,652 WARNING [train.py:1204] (2/4) Exclude cut with ID _22826_210_9994_1_1531908201522_3279169_389-317194-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 11:11:09,044 WARNING [train.py:1204] (2/4) Exclude cut with ID _27485_210_4916_1_1532739191677_3668269_239-122918-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 11:11:16,161 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.4.encoder.layers.0.self_attn2.whiten, num_groups=1, num_channels=384, metric=12.99 vs. limit=22.5 2023-10-04 11:11:16,909 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_2245_210_2715_1_1525511548_3540254_128-117609-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 11:11:16,937 WARNING [train.py:1204] (2/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_160-276178-0_sp1.1 from training. Duration: 0.963625 2023-10-04 11:11:17,034 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_31920_210_10633_1_1532860009_1686094_1-279735-0_sp1.1 from training. Duration: 0.97275 2023-10-04 11:11:21,001 WARNING [train.py:1204] (2/4) Exclude cut with ID _51078_210_9070_1_1534161557424_3805250_325-262383-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 11:11:21,031 WARNING [train.py:1204] (2/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_643-52007-0_sp0.9 from training. Duration: 0.6333125 2023-10-04 11:11:21,070 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_51937_210_5014_1_1534573610_4022025_518-299248-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 11:11:24,156 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.722e+02 2.080e+02 2.396e+02 2.903e+02 4.504e+02, threshold=4.792e+02, percent-clipped=0.0 2023-10-04 11:11:27,259 WARNING [train.py:1204] (2/4) Exclude cut with ID _13252_210_1925_1_1530671372047_3336909_166-314607-0_sp1.1 from training. Duration: 0.4909375 2023-10-04 11:11:27,263 WARNING [train.py:1204] (2/4) Exclude cut with ID _44778_210_2054_1_1533798127203_7443090_1087-282939-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 11:11:28,583 INFO [train.py:1046] (2/4) Epoch 47, batch 700, loss[loss=0.1608, simple_loss=0.2418, pruned_loss=0.03991, over 23343.00 frames. ], tot_loss[loss=0.1533, simple_loss=0.2334, pruned_loss=0.03656, over 4578936.83 frames. ], batch size: 93, lr: 2.17e-03, grad_scale: 8.0 2023-10-04 11:11:28,650 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_23850_210_4238_1_1532232597_1632433_32-185135-0_sp1.1 from training. Duration: 0.7818125 2023-10-04 11:11:28,679 WARNING [train.py:1204] (2/4) Exclude cut with ID _59500_210_8254_1_1534809610680_3736009_145-89359-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 11:11:32,743 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_340-336408-0 from training. Duration: 0.93 2023-10-04 11:11:34,098 WARNING [train.py:1204] (2/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_9-21737-0 from training. Duration: 0.97 2023-10-04 11:11:35,559 WARNING [train.py:1204] (2/4) Exclude cut with ID _2220_210_1819_1_1525509109898_3658680_50-99837-0 from training. Duration: 0.54 2023-10-04 11:11:35,633 WARNING [train.py:1204] (2/4) Exclude cut with ID _30304_210_6319_1_1532476827261_7582060_879-299350-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 11:11:37,049 WARNING [train.py:1204] (2/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_905-160074-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 11:11:39,487 WARNING [train.py:1204] (2/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_395-73570-0 from training. Duration: 0.99 2023-10-04 11:11:43,952 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7264_210_4938_1_1527939911_8132095_800-91995-0_sp1.1 from training. Duration: 0.7818125 2023-10-04 11:11:46,801 WARNING [train.py:1204] (2/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_77-311404-0_sp1.1 from training. Duration: 0.8545625 2023-10-04 11:11:48,188 WARNING [train.py:1204] (2/4) Exclude cut with ID _13679_210_7497_1_1530534293662_3355098_107-80772-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 11:11:48,306 WARNING [train.py:1204] (2/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_224-343295-0_sp1.1 from training. Duration: 0.5 2023-10-04 11:11:48,344 WARNING [train.py:1204] (2/4) Exclude cut with ID _54871_210_22356_1_1534665798253_3856359_156-253482-0_sp1.1 from training. Duration: 0.963625 2023-10-04 11:11:50,387 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.feed_forward1.hidden_balancer.prob, batch_count=1633786.6666666667, ans=0.125 2023-10-04 11:11:51,553 WARNING [train.py:1204] (2/4) Exclude cut with ID _49728_210_12651_1_1534041034294_6788150_309-125532-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 11:11:54,385 WARNING [train.py:1204] (2/4) Exclude cut with ID _13913_210_9994_1_1530579615187_3383897_473-277682-0_sp0.9 from training. Duration: 0.4333125 2023-10-04 11:11:54,390 WARNING [train.py:1204] (2/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_3-276701-0_sp1.1 from training. Duration: 0.82725 2023-10-04 11:11:54,503 WARNING [train.py:1204] (2/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_282-136641-0 from training. Duration: 0.42 2023-10-04 11:11:57,100 WARNING [train.py:1204] (2/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_127-566-0 from training. Duration: 0.84 2023-10-04 11:12:02,382 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6163_210_6400_1_1527506746_4263068_283-137811-0_sp1.1 from training. Duration: 0.7 2023-10-04 11:12:02,420 WARNING [train.py:1204] (2/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_34-183685-0_sp1.1 from training. Duration: 0.8909375 2023-10-04 11:12:03,825 WARNING [train.py:1204] (2/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_154-135696-0_sp0.9 from training. Duration: 0.888875 2023-10-04 11:12:06,706 WARNING [train.py:1204] (2/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_676-51014-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 11:12:08,625 WARNING [train.py:1204] (2/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_38-169920-0 from training. Duration: 0.91 2023-10-04 11:12:12,135 WARNING [train.py:1204] (2/4) Exclude cut with ID _6653_210_2629_1_1527919172441_6732679_747-114465-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 11:12:13,463 WARNING [train.py:1204] (2/4) Exclude cut with ID _8021_210_7994_1_1528376451358_3653069_100-140231-0_sp1.1 from training. Duration: 0.6090625 2023-10-04 11:12:13,481 WARNING [train.py:1204] (2/4) Exclude cut with ID _64135_210_2253_1_1535677267876_7608972_133-188266-0 from training. Duration: 0.81 2023-10-04 11:12:18,028 WARNING [train.py:1204] (2/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_714-310538-0_sp1.1 from training. Duration: 0.7818125 2023-10-04 11:12:18,129 WARNING [train.py:1204] (2/4) Exclude cut with ID _39867_210_9826_1_1533553222030_9344259_1134-288498-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 11:12:20,897 WARNING [train.py:1204] (2/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_211-237289-0_sp1.1 from training. Duration: 0.936375 2023-10-04 11:12:27,815 WARNING [train.py:1204] (2/4) Exclude cut with ID _59202_210_26984_1_1534816810612_3549174_137-318710-0_sp0.9 from training. Duration: 0.988875 2023-10-04 11:12:27,849 WARNING [train.py:1204] (2/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_86-66192-0 from training. Duration: 0.55 2023-10-04 11:12:30,655 WARNING [train.py:1204] (2/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_540-275685-0 from training. Duration: 0.89 2023-10-04 11:12:30,683 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8776_210_2040_1_1528638837_4088036_244-278763-0 from training. Duration: 0.9 2023-10-04 11:12:32,277 WARNING [train.py:1204] (2/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_9-219972-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 11:12:33,747 WARNING [train.py:1204] (2/4) Exclude cut with ID _44659_210_2721_1_1534036120061_6614860_656-283823-0_sp1.1 from training. Duration: 0.92725 2023-10-04 11:12:35,139 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_69630_210_7234_1_1535789601_661049_77-216385-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 11:12:38,956 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_19952_210_2161_1_1531875469_3851750_210-308919-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 11:12:38,962 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_4737_210_2040_1_1526998689_3293008_378-178650-0 from training. Duration: 0.95 2023-10-04 11:12:43,403 INFO [train.py:1046] (2/4) Epoch 47, batch 750, loss[loss=0.1471, simple_loss=0.2365, pruned_loss=0.02881, over 24516.00 frames. ], tot_loss[loss=0.1527, simple_loss=0.2331, pruned_loss=0.03612, over 4612274.37 frames. ], batch size: 63, lr: 2.17e-03, grad_scale: 8.0 2023-10-04 11:12:44,930 WARNING [train.py:1204] (2/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_224-343295-0 from training. Duration: 0.55 2023-10-04 11:12:44,946 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7002_210_1794_1_1527939683_5157286_184-229630-0 from training. Duration: 0.74 2023-10-04 11:12:44,971 WARNING [train.py:1204] (2/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_6-214815-0 from training. Duration: 0.85 2023-10-04 11:12:46,374 WARNING [train.py:1204] (2/4) Exclude cut with ID _8699_210_2069_1_1528630346831_3960409_160-24408-0 from training. Duration: 0.91 2023-10-04 11:12:46,412 WARNING [train.py:1204] (2/4) Exclude cut with ID _5585_210_2667_1_1527418813324_8007340_89-332072-0 from training. Duration: 0.64 2023-10-04 11:12:47,720 WARNING [train.py:1204] (2/4) Exclude cut with ID _5250_210_5219_1_1527249561806_8692110_528-259800-0_sp1.1 from training. Duration: 0.963625 2023-10-04 11:12:47,801 WARNING [train.py:1204] (2/4) Exclude cut with ID _55383_210_10635_1_1534468426172_2231669_134-128246-0 from training. Duration: 0.89 2023-10-04 11:12:49,631 WARNING [train.py:1204] (2/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_400-338274-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 11:12:49,693 WARNING [train.py:1204] (2/4) Exclude cut with ID _14224_210_4381_1_1530529062037_4264010_364-22817-0_sp0.9 from training. Duration: 0.788875 2023-10-04 11:12:51,117 WARNING [train.py:1204] (2/4) Exclude cut with ID _42863_210_9070_1_1533902398788_3696920_331-291057-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 11:12:52,576 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_5436_210_1794_1_1527331317_2541494_229-211016-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 11:12:52,621 WARNING [train.py:1204] (2/4) Exclude cut with ID _6456_210_1534_1_1527681634212_3181049_345-252877-0_sp0.9 from training. Duration: 0.711125 2023-10-04 11:12:52,650 WARNING [train.py:1204] (2/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_96-81499-0_sp1.1 from training. Duration: 0.92725 2023-10-04 11:12:56,520 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_5575_210_1410_1_1527301305_2570170_342-265572-0_sp1.1 from training. Duration: 0.8545625 2023-10-04 11:12:56,592 WARNING [train.py:1204] (2/4) Exclude cut with ID _5444_210_2151_1_1527418808464_5312210_726-27165-0_sp0.9 from training. Duration: 0.7444375 2023-10-04 11:12:57,322 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.3.encoder.layers.0.nonlin_attention.whiten1, num_groups=1, num_channels=384, metric=7.40 vs. limit=10.0 2023-10-04 11:12:59,344 WARNING [train.py:1204] (2/4) Exclude cut with ID _22083_210_8254_1_1531702862439_3678970_304-327750-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 11:13:03,361 WARNING [train.py:1204] (2/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_245-226199-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 11:13:03,398 WARNING [train.py:1204] (2/4) Exclude cut with ID _42875_210_14856_1_1533704397487_4012433_102-199499-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 11:13:03,453 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_51225_210_3601_1_1534400634_2327383_55-301844-0 from training. Duration: 0.93 2023-10-04 11:13:04,794 WARNING [train.py:1204] (2/4) Exclude cut with ID _28317_210_12742_1_1532480302381_8510325_916-195848-0_sp1.1 from training. Duration: 0.836375 2023-10-04 11:13:06,133 WARNING [train.py:1204] (2/4) Exclude cut with ID _44718_210_5816_1_1534050051869_6469030_497-311999-0_sp1.1 from training. Duration: 0.936375 2023-10-04 11:13:07,602 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_34886_210_10633_1_1533203882_1755661_91-194765-0_sp1.1 from training. Duration: 0.936375 2023-10-04 11:13:07,721 WARNING [train.py:1204] (2/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_310-26186-0_sp0.9 from training. Duration: 0.77775 2023-10-04 11:13:09,636 WARNING [train.py:1204] (2/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_416-257730-0 from training. Duration: 0.65 2023-10-04 11:13:09,641 WARNING [train.py:1204] (2/4) Exclude cut with ID _9389_210_1664_1_1529217337301_5289269_209-154760-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 11:13:12,304 WARNING [train.py:1204] (2/4) Exclude cut with ID _4092_210_2151_1_1526643044449_3135759_227-213972-0 from training. Duration: 0.54 2023-10-04 11:13:12,331 WARNING [train.py:1204] (2/4) Exclude cut with ID IC0679W0044-335852 from training. Duration: 0.768 2023-10-04 11:13:13,180 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.conv_module2.balancer2.prob, batch_count=1634186.6666666667, ans=0.125 2023-10-04 11:13:14,287 WARNING [train.py:1204] (2/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_42-186162-0 from training. Duration: 0.92 2023-10-04 11:13:14,303 WARNING [train.py:1204] (2/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_302-2550-0_sp0.9 from training. Duration: 0.9 2023-10-04 11:13:14,316 WARNING [train.py:1204] (2/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_356-31216-0_sp0.9 from training. Duration: 0.8333125 2023-10-04 11:13:15,736 WARNING [train.py:1204] (2/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_125-322172-0_sp1.1 from training. Duration: 0.8454375 2023-10-04 11:13:18,567 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.conv_skip_rate, batch_count=1634186.6666666667, ans=0.0 2023-10-04 11:13:24,249 WARNING [train.py:1204] (2/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_141-89727-0_sp0.9 from training. Duration: 0.788875 2023-10-04 11:13:24,275 WARNING [train.py:1204] (2/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_647-337784-0_sp1.1 from training. Duration: 0.97275 2023-10-04 11:13:24,288 WARNING [train.py:1204] (2/4) Exclude cut with ID _6493_210_1747_1_1528459210424_4012470_513-140967-0_sp1.1 from training. Duration: 0.7181875 2023-10-04 11:13:25,742 WARNING [train.py:1204] (2/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_87-266334-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 11:13:27,135 WARNING [train.py:1204] (2/4) Exclude cut with ID _26528_210_9070_1_1532570410842_3851310_526-261338-0_sp1.1 from training. Duration: 0.92725 2023-10-04 11:13:28,496 WARNING [train.py:1204] (2/4) Exclude cut with ID _14224_210_4381_1_1530529062037_4264010_380-343670-0 from training. Duration: 0.88 2023-10-04 11:13:28,551 WARNING [train.py:1204] (2/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_449-15508-0_sp1.1 from training. Duration: 0.7454375 2023-10-04 11:13:29,912 WARNING [train.py:1204] (2/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_372-6869-0_sp0.9 from training. Duration: 0.488875 2023-10-04 11:13:29,985 WARNING [train.py:1204] (2/4) Exclude cut with ID _26907_210_7234_1_1532221740694_3205263_337-2494-0_sp0.9 from training. Duration: 0.97775 2023-10-04 11:13:32,797 WARNING [train.py:1204] (2/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_10-100367-0_sp1.1 from training. Duration: 0.8909375 2023-10-04 11:13:32,928 INFO [scaling.py:1118] (2/4) WithLoss: name=encoder.encoders.0.layers.0.self_attn_weights, loss-sum=0.000e+00 2023-10-04 11:13:34,082 WARNING [train.py:1204] (2/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_47-167373-0 from training. Duration: 0.92 2023-10-04 11:13:34,132 WARNING [train.py:1204] (2/4) Exclude cut with ID _28323_210_16187_1_1532480395667_4100300_628-282179-0_sp1.1 from training. Duration: 0.97275 2023-10-04 11:13:39,654 WARNING [train.py:1204] (2/4) Exclude cut with ID _20455_210_5219_1_1531565996775_8098100_213-280898-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 11:13:41,569 WARNING [train.py:1204] (2/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_383-316000-0_sp1.1 from training. Duration: 0.8181875 2023-10-04 11:13:42,892 WARNING [train.py:1204] (2/4) Exclude cut with ID _18513_210_9206_1_1531618215223_3862620_16-78180-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 11:13:44,386 WARNING [train.py:1204] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_330-327861-0_sp1.1 from training. Duration: 0.6090625 2023-10-04 11:13:48,373 WARNING [train.py:1204] (2/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_356-31216-0 from training. Duration: 0.75 2023-10-04 11:13:48,400 WARNING [train.py:1204] (2/4) Exclude cut with ID _25248_210_8750_1_1532062825941_5554180_255-148539-0_sp1.1 from training. Duration: 0.963625 2023-10-04 11:13:49,584 WARNING [train.py:1204] (2/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_269-154726-0_sp1.1 from training. Duration: 0.7818125 2023-10-04 11:13:51,243 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.attention_skip_rate, batch_count=1634320.0, ans=0.0 2023-10-04 11:13:52,913 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.606e+02 1.939e+02 2.140e+02 2.417e+02 3.613e+02, threshold=4.280e+02, percent-clipped=0.0 2023-10-04 11:13:53,015 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7002_210_1794_1_1527939683_5157286_163-87552-0_sp1.1 from training. Duration: 0.7818125 2023-10-04 11:13:53,048 WARNING [train.py:1204] (2/4) Exclude cut with ID _54871_210_22356_1_1534665798253_3856359_97-151896-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 11:13:56,949 INFO [train.py:1046] (2/4) Epoch 47, batch 800, loss[loss=0.1691, simple_loss=0.2558, pruned_loss=0.0412, over 24378.00 frames. ], tot_loss[loss=0.153, simple_loss=0.2341, pruned_loss=0.03593, over 4656786.00 frames. ], batch size: 77, lr: 2.17e-03, grad_scale: 16.0 2023-10-04 11:13:57,040 WARNING [train.py:1204] (2/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_207-7026-0_sp1.1 from training. Duration: 0.97275 2023-10-04 11:13:58,355 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_92-46009-0_sp0.9 from training. Duration: 0.6 2023-10-04 11:14:05,388 WARNING [train.py:1204] (2/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_122-178847-0_sp1.1 from training. Duration: 0.97275 2023-10-04 11:14:05,392 WARNING [train.py:1204] (2/4) Exclude cut with ID _44778_210_2054_1_1533798127203_7443090_94-348956-0_sp1.1 from training. Duration: 0.92725 2023-10-04 11:14:06,807 WARNING [train.py:1204] (2/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_1018-244089-0_sp1.1 from training. Duration: 0.963625 2023-10-04 11:14:06,826 WARNING [train.py:1204] (2/4) Exclude cut with ID _56235_210_10640_1_1534557335235_4161099_87-118423-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 11:14:08,224 WARNING [train.py:1204] (2/4) Exclude cut with ID _53713_210_24116_1_1534314487762_7416751_504-212292-0_sp1.1 from training. Duration: 0.92725 2023-10-04 11:14:08,253 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_446-150327-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 11:14:09,550 WARNING [train.py:1204] (2/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_682-245989-0_sp1.1 from training. Duration: 0.92725 2023-10-04 11:14:12,420 WARNING [train.py:1204] (2/4) Exclude cut with ID _35723_210_1794_1_1533088341043_628250_8-234524-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 11:14:14,273 WARNING [train.py:1204] (2/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_64-248359-0_sp0.9 from training. Duration: 0.7666875 2023-10-04 11:14:15,642 WARNING [train.py:1204] (2/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_49-97158-0 from training. Duration: 0.76 2023-10-04 11:14:15,695 WARNING [train.py:1204] (2/4) Exclude cut with ID _24314_210_4915_1_1532135078116_7170179_743-915-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 11:14:17,714 WARNING [train.py:1204] (2/4) Exclude cut with ID _61194_210_18628_1_1535007555628_3750909_353-323468-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 11:14:18,978 WARNING [train.py:1204] (2/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_387-131538-0_sp1.1 from training. Duration: 0.736375 2023-10-04 11:14:19,011 WARNING [train.py:1204] (2/4) Exclude cut with ID _44424_210_18163_1_1533623394848_1213110_79-187603-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 11:14:20,280 WARNING [train.py:1204] (2/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_86-19601-0 from training. Duration: 0.85 2023-10-04 11:14:20,300 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_50050_210_16223_1_1535435796_7290138_103-283529-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 11:14:20,346 WARNING [train.py:1204] (2/4) Exclude cut with ID _13913_210_9994_1_1530579615187_3383897_473-277682-0 from training. Duration: 0.39 2023-10-04 11:14:23,023 WARNING [train.py:1204] (2/4) Exclude cut with ID _42326_210_7770_1_1533814282531_3707269_368-68029-0_sp1.1 from training. Duration: 0.92725 2023-10-04 11:14:26,387 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_41428_210_2161_1_1533774184_3992532_446-339645-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 11:14:27,620 WARNING [train.py:1204] (2/4) Exclude cut with ID _69584_210_5219_1_1535785133775_4044450_325-7656-0_sp1.1 from training. Duration: 0.7818125 2023-10-04 11:14:27,635 WARNING [train.py:1204] (2/4) Exclude cut with ID _21632_210_2253_1_1531994001765_3744140_134-184387-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 11:14:30,381 WARNING [train.py:1204] (2/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_550-184707-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 11:14:31,675 WARNING [train.py:1204] (2/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_106-341780-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 11:14:35,789 WARNING [train.py:1204] (2/4) Exclude cut with ID _45871_210_5665_1_1533864614245_3296060_253-132552-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 11:14:35,825 WARNING [train.py:1204] (2/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_395-310220-0_sp1.1 from training. Duration: 0.8090625 2023-10-04 11:14:35,848 WARNING [train.py:1204] (2/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_795-33252-0_sp0.9 from training. Duration: 0.411125 2023-10-04 11:14:37,301 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9002W0003-495962 from training. Duration: 0.946 2023-10-04 11:14:38,652 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_43-71989-0 from training. Duration: 0.57 2023-10-04 11:14:38,672 WARNING [train.py:1204] (2/4) Exclude cut with ID _8699_210_2069_1_1528630346831_3960409_155-39766-0_sp1.1 from training. Duration: 0.7090625 2023-10-04 11:14:38,691 WARNING [train.py:1204] (2/4) Exclude cut with ID _27894_210_12392_1_1532415496078_6902439_788-232684-0_sp1.1 from training. Duration: 0.97275 2023-10-04 11:14:41,405 WARNING [train.py:1204] (2/4) Exclude cut with ID _56494_210_4220_1_1535284837625_3033169_10-39296-0_sp1.1 from training. Duration: 0.92725 2023-10-04 11:14:41,434 WARNING [train.py:1204] (2/4) Exclude cut with ID _40901_210_5665_1_1533363789958_3856130_341-20819-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 11:14:47,372 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9029W0435-509822 from training. Duration: 0.896 2023-10-04 11:14:47,417 WARNING [train.py:1204] (2/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_51-117576-0 from training. Duration: 0.4 2023-10-04 11:14:48,162 WARNING [train.py:1204] (2/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_197-28164-0_sp1.1 from training. Duration: 0.72725 2023-10-04 11:14:49,406 WARNING [train.py:1204] (2/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_145-137790-0_sp1.1 from training. Duration: 0.7545625 2023-10-04 11:14:55,286 WARNING [train.py:1204] (2/4) Exclude cut with ID _47790_210_16187_1_1533862814066_4207519_2-39653-0_sp1.1 from training. Duration: 0.8909375 2023-10-04 11:14:58,369 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_3780_210_2087_1_1526467760_2130483_186-208240-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 11:14:58,630 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.conv_module1.balancer2.prob, batch_count=1634653.3333333333, ans=0.125 2023-10-04 11:15:00,198 WARNING [train.py:1204] (2/4) Exclude cut with ID _24055_210_12651_1_1532310907143_976979_92-59356-0 from training. Duration: 0.91 2023-10-04 11:15:00,246 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_3910_210_1794_1_1526742306_334530_28-238909-0_sp1.1 from training. Duration: 0.736375 2023-10-04 11:15:03,326 INFO [scaling.py:1118] (2/4) WithLoss: name=encoder.encoders.4.encoder.layers.2.self_attn_weights, loss-sum=0.000e+00 2023-10-04 11:15:04,339 WARNING [train.py:1204] (2/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_269-154726-0 from training. Duration: 0.86 2023-10-04 11:15:08,530 WARNING [train.py:1204] (2/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_326-165900-0_sp0.9 from training. Duration: 0.7666875 2023-10-04 11:15:09,816 INFO [train.py:1046] (2/4) Epoch 47, batch 850, loss[loss=0.1586, simple_loss=0.2387, pruned_loss=0.03923, over 23781.00 frames. ], tot_loss[loss=0.1531, simple_loss=0.2341, pruned_loss=0.03606, over 4667429.39 frames. ], batch size: 212, lr: 2.17e-03, grad_scale: 16.0 2023-10-04 11:15:11,350 WARNING [train.py:1204] (2/4) Exclude cut with ID _5294_210_1747_1_1527249643928_3696179_470-276077-0_sp1.1 from training. Duration: 0.963625 2023-10-04 11:15:12,664 WARNING [train.py:1204] (2/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_372-131163-0 from training. Duration: 0.94 2023-10-04 11:15:12,707 WARNING [train.py:1204] (2/4) Exclude cut with ID _45297_210_2721_1_1533715351381_3264949_351-147136-0_sp1.1 from training. Duration: 0.7909375 2023-10-04 11:15:14,117 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_1072-12671-0_sp1.1 from training. Duration: 0.92725 2023-10-04 11:15:14,212 WARNING [train.py:1204] (2/4) Exclude cut with ID _15133_210_6228_1_1530794512074_3080379_54-198193-0 from training. Duration: 0.94 2023-10-04 11:15:14,231 WARNING [train.py:1204] (2/4) Exclude cut with ID _30196_210_1664_1_1533708496522_7074140_1194-270988-0_sp1.1 from training. Duration: 0.97275 2023-10-04 11:15:14,449 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.conv_module2.balancer1.prob, batch_count=1634720.0, ans=0.125 2023-10-04 11:15:15,539 WARNING [train.py:1204] (2/4) Exclude cut with ID _50310_210_15488_1_1534068043345_3641080_289-193637-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 11:15:17,547 WARNING [train.py:1204] (2/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_313-263495-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 11:15:20,826 WARNING [train.py:1204] (2/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_18-130336-0_sp1.1 from training. Duration: 0.7181875 2023-10-04 11:15:22,108 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_47896_210_20508_1_1534492518_5958868_646-287209-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 11:15:23,419 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_294-163793-0 from training. Duration: 0.82 2023-10-04 11:15:23,467 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_8-319957-0 from training. Duration: 0.96 2023-10-04 11:15:24,808 WARNING [train.py:1204] (2/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_1211-226066-0 from training. Duration: 0.76 2023-10-04 11:15:26,216 WARNING [train.py:1204] (2/4) Exclude cut with ID _4779_210_2084_1_1527318022944_3914893_500-285013-0_sp0.9 from training. Duration: 0.7666875 2023-10-04 11:15:26,245 WARNING [train.py:1204] (2/4) Exclude cut with ID _22826_210_9994_1_1531908201522_3279169_347-232268-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 11:15:27,645 WARNING [train.py:1204] (2/4) Exclude cut with ID _29877_210_6228_1_1532397791106_5276149_111-55056-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 11:15:27,667 WARNING [train.py:1204] (2/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_812-271646-0_sp1.1 from training. Duration: 0.92725 2023-10-04 11:15:27,696 WARNING [train.py:1204] (2/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_235-262124-0_sp1.1 from training. Duration: 0.6090625 2023-10-04 11:15:33,197 WARNING [train.py:1204] (2/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_14-16530-0_sp1.1 from training. Duration: 0.97275 2023-10-04 11:15:34,995 WARNING [train.py:1204] (2/4) Exclude cut with ID _44718_210_5816_1_1534050051869_6469030_568-304512-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 11:15:35,027 WARNING [train.py:1204] (2/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_1033-274376-0 from training. Duration: 0.87 2023-10-04 11:15:36,505 WARNING [train.py:1204] (2/4) Exclude cut with ID _4681_210_3525_1_1527251040321_1235713_123-24428-0 from training. Duration: 0.48 2023-10-04 11:15:40,627 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_37818_210_4238_1_1533620910_4720083_251-288764-0_sp1.1 from training. Duration: 0.97275 2023-10-04 11:15:41,956 WARNING [train.py:1204] (2/4) Exclude cut with ID _65585_210_12210_1_1535450451584_3764098_112-294301-0 from training. Duration: 0.88 2023-10-04 11:15:44,759 WARNING [train.py:1204] (2/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_329-113173-0 from training. Duration: 0.62 2023-10-04 11:15:46,136 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6767_210_3273_1_1527855783_1718932_173-256601-0 from training. Duration: 0.8 2023-10-04 11:15:46,305 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9266W0001-627677 from training. Duration: 0.896 2023-10-04 11:15:46,315 WARNING [train.py:1204] (2/4) Exclude cut with ID _49643_210_7989_1_1534640244224_3939031_611-185515-0_sp1.1 from training. Duration: 0.8545625 2023-10-04 11:15:46,319 WARNING [train.py:1204] (2/4) Exclude cut with ID _31649_210_6228_1_1532584608063_4091078_687-237771-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 11:15:48,190 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_12868_210_6753_1_1530702479_3610564_148-172861-0_sp0.9 from training. Duration: 0.5555625 2023-10-04 11:15:50,309 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_5129_210_6400_1_1527161005_3579054_107-69738-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 11:15:51,508 WARNING [train.py:1204] (2/4) Exclude cut with ID _40137_210_12210_1_1533290484046_3795180_36-301164-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 11:15:51,531 WARNING [train.py:1204] (2/4) Exclude cut with ID _7537_210_2084_1_1528502395431_3924359_113-199933-0 from training. Duration: 0.59 2023-10-04 11:15:54,736 WARNING [train.py:1204] (2/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_547-5424-0_sp1.1 from training. Duration: 0.8545625 2023-10-04 11:15:54,798 WARNING [train.py:1204] (2/4) Exclude cut with ID _43552_210_8913_1_1533726031059_3523429_147-284024-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 11:15:54,868 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_294-163793-0_sp1.1 from training. Duration: 0.7454375 2023-10-04 11:15:56,173 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_3780_210_2087_1_1526467760_2130483_170-259044-0_sp1.1 from training. Duration: 0.636375 2023-10-04 11:15:56,299 WARNING [train.py:1204] (2/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_726-270780-0_sp1.1 from training. Duration: 0.7818125 2023-10-04 11:15:56,882 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.3.encoder.layers.3.self_attn1.whiten, num_groups=1, num_channels=512, metric=10.32 vs. limit=22.5 2023-10-04 11:15:57,643 WARNING [train.py:1204] (2/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_214-291369-0_sp1.1 from training. Duration: 0.57275 2023-10-04 11:15:58,918 WARNING [train.py:1204] (2/4) Exclude cut with ID _21881_210_3525_1_1531825242757_3798380_247-102591-0 from training. Duration: 0.98 2023-10-04 11:16:04,756 WARNING [train.py:1204] (2/4) Exclude cut with ID _8064_210_5399_1_1528372299291_4282973_18-174647-0_sp1.1 from training. Duration: 0.82725 2023-10-04 11:16:04,758 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_415-52611-0_sp1.1 from training. Duration: 0.936375 2023-10-04 11:16:04,815 WARNING [train.py:1204] (2/4) Exclude cut with ID _8394_210_6702_1_1528590504316_3540189_379-49457-0_sp0.9 from training. Duration: 0.9666875 2023-10-04 11:16:06,135 WARNING [train.py:1204] (2/4) Exclude cut with ID _64521_210_14699_1_1535243427259_3815328_368-195106-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 11:16:06,375 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.balancer1.prob, batch_count=1634920.0, ans=0.125 2023-10-04 11:16:07,533 WARNING [train.py:1204] (2/4) Exclude cut with ID _28394_210_8913_1_1532430049518_3650118_51-334124-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 11:16:10,422 WARNING [train.py:1204] (2/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_317-161769-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 11:16:11,809 WARNING [train.py:1204] (2/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_40-129756-0_sp0.9 from training. Duration: 0.9 2023-10-04 11:16:11,896 WARNING [train.py:1204] (2/4) Exclude cut with ID _3067_210_2667_1_1526212974640_4503168_119-175776-0_sp0.9 from training. Duration: 0.82225 2023-10-04 11:16:13,185 WARNING [train.py:1204] (2/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_1004-304915-0_sp1.1 from training. Duration: 0.92725 2023-10-04 11:16:13,241 WARNING [train.py:1204] (2/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_359-118926-0_sp0.9 from training. Duration: 0.8 2023-10-04 11:16:16,291 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.self_attn_weights.pos_emb_skip_rate, batch_count=1634986.6666666667, ans=0.0 2023-10-04 11:16:19,634 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.675e+02 2.035e+02 2.355e+02 2.898e+02 4.168e+02, threshold=4.711e+02, percent-clipped=0.0 2023-10-04 11:16:21,107 WARNING [train.py:1204] (2/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_51-200220-0_sp0.9 from training. Duration: 0.62225 2023-10-04 11:16:21,178 WARNING [train.py:1204] (2/4) Exclude cut with ID _46683_210_9642_1_1533730913492_4591116_530-326202-0_sp1.1 from training. Duration: 0.936375 2023-10-04 11:16:23,202 WARNING [train.py:1204] (2/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_567-135114-0 from training. Duration: 0.87 2023-10-04 11:16:23,235 WARNING [train.py:1204] (2/4) Exclude cut with ID _66863_210_18053_1_1535698758334_8590319_1092-271784-0_sp1.1 from training. Duration: 0.87275 2023-10-04 11:16:23,245 WARNING [train.py:1204] (2/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_762-282614-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 11:16:24,530 INFO [train.py:1046] (2/4) Epoch 47, batch 900, loss[loss=0.1666, simple_loss=0.2479, pruned_loss=0.0426, over 23602.00 frames. ], tot_loss[loss=0.1547, simple_loss=0.2356, pruned_loss=0.03687, over 4669521.03 frames. ], batch size: 85, lr: 2.17e-03, grad_scale: 16.0 2023-10-04 11:16:25,984 WARNING [train.py:1204] (2/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_280-120942-0 from training. Duration: 0.77 2023-10-04 11:16:31,586 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_31583_210_3852_1_1532595851_4024413_27-170931-0_sp1.1 from training. Duration: 0.963625 2023-10-04 11:16:34,732 WARNING [train.py:1204] (2/4) Exclude cut with ID _12369_210_4381_1_1530356078581_4388520_266-271628-0_sp1.1 from training. Duration: 0.92725 2023-10-04 11:16:34,786 WARNING [train.py:1204] (2/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_41-31157-0 from training. Duration: 0.57 2023-10-04 11:16:37,610 WARNING [train.py:1204] (2/4) Exclude cut with ID _21878_210_3525_1_1531738772002_7264330_113-18163-0_sp1.1 from training. Duration: 0.8090625 2023-10-04 11:16:37,636 WARNING [train.py:1204] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_147-111137-0 from training. Duration: 0.95 2023-10-04 11:16:39,032 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1083-220122-0_sp1.1 from training. Duration: 0.463625 2023-10-04 11:16:40,367 WARNING [train.py:1204] (2/4) Exclude cut with ID _14884_210_2252_1_1530707459689_5561250_591-173406-0_sp1.1 from training. Duration: 0.87275 2023-10-04 11:16:40,381 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_24677_210_1794_1_1532002693_4341296_149-303793-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 11:16:40,429 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_12868_210_6753_1_1530702479_3610564_133-564-0_sp1.1 from training. Duration: 0.7181875 2023-10-04 11:16:40,460 WARNING [train.py:1204] (2/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_625-338546-0_sp0.9 from training. Duration: 0.97775 2023-10-04 11:16:45,301 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.2.encoder.layers.1.self_attn2.whiten, num_groups=1, num_channels=384, metric=16.23 vs. limit=22.5 2023-10-04 11:16:50,785 WARNING [train.py:1204] (2/4) Exclude cut with ID _25882_210_3036_1_1532775665815_6479510_624-320169-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 11:16:50,792 WARNING [train.py:1204] (2/4) Exclude cut with ID _20622_210_3852_1_1531622427080_1093839_127-325326-0_sp1.1 from training. Duration: 0.92725 2023-10-04 11:16:50,828 WARNING [train.py:1204] (2/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_464-33931-0_sp1.1 from training. Duration: 0.4909375 2023-10-04 11:16:52,540 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.feed_forward1.out_proj.dropout_p, batch_count=1635186.6666666667, ans=0.1 2023-10-04 11:16:55,479 WARNING [train.py:1204] (2/4) Exclude cut with ID _66112_210_10635_1_1535590236305_3801484_120-304588-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 11:16:58,768 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.5.encoder.layers.0.conv_module2.whiten, num_groups=1, num_channels=256, metric=5.14 vs. limit=15.0 2023-10-04 11:16:59,643 WARNING [train.py:1204] (2/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_110-130961-0 from training. Duration: 0.47 2023-10-04 11:17:02,404 WARNING [train.py:1204] (2/4) Exclude cut with ID _16908_210_5724_1_1531119157248_3772700_64-54020-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 11:17:07,138 WARNING [train.py:1204] (2/4) Exclude cut with ID _4068_210_2629_1_1526697350395_1546259_224-288693-0_sp1.1 from training. Duration: 0.536375 2023-10-04 11:17:07,190 WARNING [train.py:1204] (2/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_191-41023-0_sp0.9 from training. Duration: 0.788875 2023-10-04 11:17:08,549 WARNING [train.py:1204] (2/4) Exclude cut with ID ID0205W0255-783002 from training. Duration: 0.958 2023-10-04 11:17:08,633 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_162-262717-0 from training. Duration: 0.83 2023-10-04 11:17:11,611 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.ff2_skip_rate, batch_count=1635253.3333333333, ans=0.0 2023-10-04 11:17:11,717 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.balancer2.prob, batch_count=1635253.3333333333, ans=0.125 2023-10-04 11:17:14,202 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_224-322394-0_sp1.1 from training. Duration: 0.6 2023-10-04 11:17:15,490 WARNING [train.py:1204] (2/4) Exclude cut with ID _4866_210_2577_1_1527404412284_4364659_133-1696-0_sp1.1 from training. Duration: 0.9 2023-10-04 11:17:15,551 WARNING [train.py:1204] (2/4) Exclude cut with ID _6275_210_2084_1_1528110062213_7669049_973-198085-0_sp1.1 from training. Duration: 0.6818125 2023-10-04 11:17:15,826 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.self_attn_weights.pos_emb_skip_rate, batch_count=1635253.3333333333, ans=0.0 2023-10-04 11:17:21,840 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_47449_210_5399_1_1534076445_4006929_410-237806-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 11:17:21,850 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7543_210_6753_1_1528543318_4552_1-85072-0_sp0.9 from training. Duration: 0.92225 2023-10-04 11:17:22,700 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.1.encoder.layers.1.self_attn1.whiten, num_groups=1, num_channels=256, metric=11.53 vs. limit=22.5 2023-10-04 11:17:23,261 WARNING [train.py:1204] (2/4) Exclude cut with ID _65263_210_22356_1_1535763598691_3921037_108-21902-0 from training. Duration: 0.99 2023-10-04 11:17:23,267 WARNING [train.py:1204] (2/4) Exclude cut with ID _65585_210_12210_1_1535450451584_3764098_309-174818-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 11:17:25,966 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_309-65177-0 from training. Duration: 0.88 2023-10-04 11:17:27,898 WARNING [train.py:1204] (2/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_363-185703-0_sp1.1 from training. Duration: 0.863625 2023-10-04 11:17:27,933 WARNING [train.py:1204] (2/4) Exclude cut with ID _42326_210_7770_1_1533814282531_3707269_442-238692-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 11:17:29,349 WARNING [train.py:1204] (2/4) Exclude cut with ID _69884_210_9771_1_1535846392818_4166169_226-254446-0_sp1.1 from training. Duration: 0.936375 2023-10-04 11:17:29,373 WARNING [train.py:1204] (2/4) Exclude cut with ID _29853_210_7497_1_1532505600851_6642110_287-299903-0_sp1.1 from training. Duration: 0.963625 2023-10-04 11:17:34,956 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1015-343884-0 from training. Duration: 0.62 2023-10-04 11:17:34,996 WARNING [train.py:1204] (2/4) Exclude cut with ID ID0305W0008-833402 from training. Duration: 0.91 2023-10-04 11:17:35,095 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6673_210_3273_1_1527767790_3173240_351-167954-0_sp1.1 from training. Duration: 0.436375 2023-10-04 11:17:35,113 WARNING [train.py:1204] (2/4) Exclude cut with ID _6456_210_1534_1_1527681634212_3181049_345-252877-0 from training. Duration: 0.64 2023-10-04 11:17:37,678 INFO [train.py:1046] (2/4) Epoch 47, batch 950, loss[loss=0.1483, simple_loss=0.2334, pruned_loss=0.03159, over 24622.00 frames. ], tot_loss[loss=0.156, simple_loss=0.2371, pruned_loss=0.03747, over 4667260.91 frames. ], batch size: 65, lr: 2.17e-03, grad_scale: 8.0 2023-10-04 11:17:37,789 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_13-205182-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 11:17:41,125 WARNING [train.py:1204] (2/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_427-208901-0 from training. Duration: 0.68 2023-10-04 11:17:46,605 WARNING [train.py:1204] (2/4) Exclude cut with ID _49039_210_6315_1_1533967537541_3846118_9-148921-0_sp1.1 from training. Duration: 0.97275 2023-10-04 11:17:50,544 WARNING [train.py:1204] (2/4) Exclude cut with ID _59493_210_17282_1_1535078419077_3225879_143-70317-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 11:17:50,587 WARNING [train.py:1204] (2/4) Exclude cut with ID _27967_210_12140_1_1532588391859_8419130_1056-236369-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 11:17:51,843 WARNING [train.py:1204] (2/4) Exclude cut with ID _6537_210_1883_1_1527987559710_3597129_382-133062-0_sp0.9 from training. Duration: 0.7555625 2023-10-04 11:17:54,623 WARNING [train.py:1204] (2/4) Exclude cut with ID ID0376W0255-869957 from training. Duration: 0.9510625 2023-10-04 11:17:54,878 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.conv_module1.balancer1.max_abs, batch_count=1635453.3333333333, ans=10.0 2023-10-04 11:17:57,480 WARNING [train.py:1204] (2/4) Exclude cut with ID _49371_210_12071_1_1533952761021_3812190_483-240691-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 11:17:57,548 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_12868_210_6753_1_1530701932_530275_18-76322-0_sp1.1 from training. Duration: 0.7909375 2023-10-04 11:17:59,501 WARNING [train.py:1204] (2/4) Exclude cut with ID _53950_210_5816_1_1534417264919_3862951_367-26344-0_sp1.1 from training. Duration: 0.97275 2023-10-04 11:17:59,521 WARNING [train.py:1204] (2/4) Exclude cut with ID _5444_210_2151_1_1527418808464_5312210_634-194174-0_sp1.1 from training. Duration: 0.8818125 2023-10-04 11:17:59,538 WARNING [train.py:1204] (2/4) Exclude cut with ID _56494_210_4220_1_1535284837625_3033169_69-138150-0 from training. Duration: 0.94 2023-10-04 11:18:00,932 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7543_210_6753_1_1528544010_1110402_25-183860-0_sp0.9 from training. Duration: 0.52225 2023-10-04 11:18:02,724 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.4.encoder.layers.2.feed_forward1.out_whiten, num_groups=1, num_channels=384, metric=10.59 vs. limit=15.0 2023-10-04 11:18:03,548 WARNING [train.py:1204] (2/4) Exclude cut with ID _40284_210_2084_1_1533513679814_3704279_287-191918-0_sp1.1 from training. Duration: 0.963625 2023-10-04 11:18:03,677 WARNING [train.py:1204] (2/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_291-270881-0 from training. Duration: 0.97 2023-10-04 11:18:04,976 WARNING [train.py:1204] (2/4) Exclude cut with ID _12555_210_8750_1_1530075594402_3780239_449-267986-0_sp0.9 from training. Duration: 0.92225 2023-10-04 11:18:08,043 WARNING [train.py:1204] (2/4) Exclude cut with ID _3992_210_1747_1_1526644816031_3731060_268-64520-0_sp1.1 from training. Duration: 0.963625 2023-10-04 11:18:08,050 WARNING [train.py:1204] (2/4) Exclude cut with ID _39411_210_5986_1_1533538825030_3343649_173-238248-0_sp0.9 from training. Duration: 0.92225 2023-10-04 11:18:08,091 WARNING [train.py:1204] (2/4) Exclude cut with ID _32704_210_8954_1_1533017488840_6747370_828-313097-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 11:18:09,981 WARNING [train.py:1204] (2/4) Exclude cut with ID _26907_210_7234_1_1532221740694_3205263_337-2494-0 from training. Duration: 0.88 2023-10-04 11:18:11,316 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_229-45300-0_sp1.1 from training. Duration: 0.5181875 2023-10-04 11:18:12,748 WARNING [train.py:1204] (2/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_300-27235-0_sp1.1 from training. Duration: 0.7909375 2023-10-04 11:18:14,141 WARNING [train.py:1204] (2/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_48-338976-0_sp1.1 from training. Duration: 0.8090625 2023-10-04 11:18:21,400 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_13091_210_7925_1_1530609447_8987354_792-195353-0_sp1.1 from training. Duration: 0.936375 2023-10-04 11:18:21,407 WARNING [train.py:1204] (2/4) Exclude cut with ID _56524_210_9642_1_1534572011779_3619170_311-54500-0_sp1.1 from training. Duration: 0.97275 2023-10-04 11:18:24,885 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7543_210_6753_1_1528544010_1110402_25-183860-0 from training. Duration: 0.47 2023-10-04 11:18:26,298 WARNING [train.py:1204] (2/4) Exclude cut with ID 2411-132532-0017-82279-0_sp1.1 from training. Duration: 0.9681875 2023-10-04 11:18:26,299 WARNING [train.py:1204] (2/4) Exclude cut with ID _14170_210_5219_1_1530525949868_4469650_272-249985-0_sp0.9 from training. Duration: 0.7444375 2023-10-04 11:18:26,364 WARNING [train.py:1204] (2/4) Exclude cut with ID _55644_210_11856_1_1534593599582_4103719_320-229006-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 11:18:26,416 WARNING [train.py:1204] (2/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_177-100554-0_sp1.1 from training. Duration: 0.963625 2023-10-04 11:18:26,418 WARNING [train.py:1204] (2/4) Exclude cut with ID _14998_210_5219_1_1530705582080_7474970_787-294416-0_sp1.1 from training. Duration: 0.7454375 2023-10-04 11:18:32,315 WARNING [train.py:1204] (2/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_318-59097-0 from training. Duration: 0.76 2023-10-04 11:18:33,693 WARNING [train.py:1204] (2/4) Exclude cut with ID _12369_210_4381_1_1530356078581_4388520_480-13146-0_sp1.1 from training. Duration: 0.77275 2023-10-04 11:18:35,083 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_469-338558-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 11:18:35,144 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_40896_210_15710_1_1533433956_7100802_367-126897-0_sp1.1 from training. Duration: 0.963625 2023-10-04 11:18:35,166 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6545_210_2040_1_1527774670_4245258_379-14182-0 from training. Duration: 0.8 2023-10-04 11:18:35,177 WARNING [train.py:1204] (2/4) Exclude cut with ID _21631_210_2253_1_1531907880778_3160339_197-79498-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 11:18:35,187 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_12868_210_6753_1_1530702479_3610564_416-293746-0_sp1.1 from training. Duration: 0.8181875 2023-10-04 11:18:36,573 WARNING [train.py:1204] (2/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_52-211683-0 from training. Duration: 0.9 2023-10-04 11:18:39,905 WARNING [train.py:1204] (2/4) Exclude cut with ID _48678_210_5663_1_1533988830947_3769199_306-255886-0_sp0.9 from training. Duration: 0.8555625 2023-10-04 11:18:42,702 WARNING [train.py:1204] (2/4) Exclude cut with ID _65134_210_14763_1_1535335393985_3505368_296-24733-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 11:18:44,148 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=1635653.3333333333, ans=0.1 2023-10-04 11:18:46,857 WARNING [train.py:1204] (2/4) Exclude cut with ID _14898_210_9206_1_1530766812377_4177190_335-99364-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 11:18:48,256 WARNING [train.py:1204] (2/4) Exclude cut with ID _54871_210_22356_1_1534665798253_3856359_19-342632-0 from training. Duration: 0.91 2023-10-04 11:18:48,273 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_53071_210_16554_1_1534490405_2954066_265-170472-0 from training. Duration: 0.83 2023-10-04 11:18:50,046 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.730e+02 2.047e+02 2.206e+02 2.421e+02 3.668e+02, threshold=4.411e+02, percent-clipped=0.0 2023-10-04 11:18:53,455 INFO [train.py:1046] (2/4) Epoch 47, batch 1000, loss[loss=0.15, simple_loss=0.2339, pruned_loss=0.03302, over 24694.00 frames. ], tot_loss[loss=0.1554, simple_loss=0.2364, pruned_loss=0.03725, over 4689081.19 frames. ], batch size: 65, lr: 2.17e-03, grad_scale: 8.0 2023-10-04 11:18:53,506 WARNING [train.py:1204] (2/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_189-298172-0_sp1.1 from training. Duration: 0.963625 2023-10-04 11:18:57,615 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7615_210_4211_1_1528110965_4580160_72-235211-0 from training. Duration: 0.76 2023-10-04 11:18:57,661 WARNING [train.py:1204] (2/4) Exclude cut with ID _47790_210_16187_1_1533862814066_4207519_155-90874-0_sp1.1 from training. Duration: 0.936375 2023-10-04 11:19:00,348 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.0.layers.1.feed_forward3.out_whiten, num_groups=1, num_channels=192, metric=10.44 vs. limit=15.0 2023-10-04 11:19:03,592 WARNING [train.py:1204] (2/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_164-216310-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 11:19:04,924 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_46013_210_7234_1_1533776362_3744032_463-52378-0 from training. Duration: 0.86 2023-10-04 11:19:04,928 WARNING [train.py:1204] (2/4) Exclude cut with ID _61133_210_9969_1_1535196777227_3696409_333-270325-0 from training. Duration: 0.93 2023-10-04 11:19:09,199 WARNING [train.py:1204] (2/4) Exclude cut with ID _5172_210_1883_1_1527246062567_3577219_18-101380-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 11:19:09,207 WARNING [train.py:1204] (2/4) Exclude cut with ID _30800_210_22382_1_1532499347904_4515959_577-236858-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 11:19:10,623 WARNING [train.py:1204] (2/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_1006-177345-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 11:19:13,713 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6291_210_3273_1_1527683649_1078498_4-266348-0 from training. Duration: 0.78 2023-10-04 11:19:16,541 WARNING [train.py:1204] (2/4) Exclude cut with ID _55857_210_2346_1_1534644007831_6898209_745-98391-0 from training. Duration: 0.76 2023-10-04 11:19:17,889 WARNING [train.py:1204] (2/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_375-244976-0 from training. Duration: 0.75 2023-10-04 11:19:19,655 WARNING [train.py:1204] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_4-248404-0_sp1.1 from training. Duration: 0.97275 2023-10-04 11:19:19,767 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8453_210_4211_1_1528502940_8562730_596-306820-0 from training. Duration: 0.97 2023-10-04 11:19:22,534 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_211-339622-0_sp0.9 from training. Duration: 0.5 2023-10-04 11:19:23,189 WARNING [train.py:1204] (2/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_393-57888-0 from training. Duration: 0.64 2023-10-04 11:19:24,509 WARNING [train.py:1204] (2/4) Exclude cut with ID _23825_210_13449_1_1531915313526_3497150_143-343575-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 11:19:25,844 WARNING [train.py:1204] (2/4) Exclude cut with ID _58605_210_12210_1_1534723195939_3904259_36-12256-0_sp1.1 from training. Duration: 0.936375 2023-10-04 11:19:30,916 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.balancer1.prob, batch_count=1635853.3333333333, ans=0.125 2023-10-04 11:19:30,980 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.ff2_skip_rate, batch_count=1635853.3333333333, ans=0.0 2023-10-04 11:19:32,179 WARNING [train.py:1204] (2/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_38-67795-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 11:19:33,459 WARNING [train.py:1204] (2/4) Exclude cut with ID _64631_210_27767_1_1535349580068_4165160_308-327832-0_sp1.1 from training. Duration: 0.92725 2023-10-04 11:19:34,912 WARNING [train.py:1204] (2/4) Exclude cut with ID _37086_210_9038_1_1532999540891_7021250_726-81176-0_sp1.1 from training. Duration: 0.936375 2023-10-04 11:19:34,973 WARNING [train.py:1204] (2/4) Exclude cut with ID _47790_210_16187_1_1533862814066_4207519_47-141521-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 11:19:34,983 WARNING [train.py:1204] (2/4) Exclude cut with ID _3067_210_2667_1_1526212974640_4503168_250-285937-0 from training. Duration: 0.8 2023-10-04 11:19:36,208 WARNING [train.py:1204] (2/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_239-24000-0_sp1.1 from training. Duration: 0.97275 2023-10-04 11:19:36,274 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_3371_210_1794_1_1526384462_5580777_396-339812-0_sp0.9 from training. Duration: 0.9444375 2023-10-04 11:19:36,321 WARNING [train.py:1204] (2/4) Exclude cut with ID _16909_210_5724_1_1531292079912_4118919_580-173454-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 11:19:37,654 WARNING [train.py:1204] (2/4) Exclude cut with ID IC0124W0002-61308 from training. Duration: 0.982 2023-10-04 11:19:40,392 WARNING [train.py:1204] (2/4) Exclude cut with ID _31602_210_5854_1_1532677021456_4458250_249-96604-0 from training. Duration: 0.89 2023-10-04 11:19:42,309 WARNING [train.py:1204] (2/4) Exclude cut with ID _69584_210_5219_1_1535785133775_4044450_325-7656-0 from training. Duration: 0.86 2023-10-04 11:19:42,594 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=1635920.0, ans=0.1 2023-10-04 11:19:43,705 WARNING [train.py:1204] (2/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_478-162247-0 from training. Duration: 0.82 2023-10-04 11:19:45,139 WARNING [train.py:1204] (2/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_686-54383-0_sp1.1 from training. Duration: 0.7909375 2023-10-04 11:19:45,475 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=1635920.0, ans=0.1 2023-10-04 11:19:50,037 WARNING [train.py:1204] (2/4) Exclude cut with ID _29303_210_2575_1_1532392237303_7410649_85-270092-0_sp1.1 from training. Duration: 0.936375 2023-10-04 11:19:50,048 WARNING [train.py:1204] (2/4) Exclude cut with ID _2322_210_2667_1_1525604548971_3656299_440-80494-0_sp1.1 from training. Duration: 0.8 2023-10-04 11:19:50,086 WARNING [train.py:1204] (2/4) Exclude cut with ID _47346_210_9476_1_1533815971553_3536160_201-40644-0_sp1.1 from training. Duration: 0.936375 2023-10-04 11:19:51,424 WARNING [train.py:1204] (2/4) Exclude cut with ID _56235_210_10640_1_1534557335235_4161099_343-139898-0_sp1.1 from training. Duration: 0.87275 2023-10-04 11:19:53,300 WARNING [train.py:1204] (2/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_1012-279473-0 from training. Duration: 0.67 2023-10-04 11:19:53,931 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.2.encoder.layers.2.nonlin_attention.whiten1, num_groups=1, num_channels=288, metric=4.35 vs. limit=10.0 2023-10-04 11:19:54,730 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7931_210_1794_1_1528594728_5564059_280-279560-0_sp1.1 from training. Duration: 0.8909375 2023-10-04 11:19:56,288 WARNING [train.py:1204] (2/4) Exclude cut with ID _8263_210_1664_1_1528439452891_970220_68-181805-0 from training. Duration: 0.64 2023-10-04 11:19:58,113 WARNING [train.py:1204] (2/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_605-133395-0 from training. Duration: 0.66 2023-10-04 11:19:58,223 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_43-171109-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 11:19:58,227 WARNING [train.py:1204] (2/4) Exclude cut with ID _40094_210_16380_1_1533294005773_3641159_107-280739-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 11:20:00,918 WARNING [train.py:1204] (2/4) Exclude cut with ID _14821_210_8750_1_1530680503991_6829359_2-227151-0_sp0.9 from training. Duration: 0.92225 2023-10-04 11:20:03,820 WARNING [train.py:1204] (2/4) Exclude cut with ID _14268_210_9206_1_1530622771285_4714099_331-105351-0_sp1.1 from training. Duration: 0.5909375 2023-10-04 11:20:05,228 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_26326_210_7925_1_1533002493_3592406_451-82741-0_sp1.1 from training. Duration: 0.97275 2023-10-04 11:20:06,039 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.1.encoder.layers.1.self_attn_weights.whiten_keys, num_groups=4, num_channels=128, metric=2.89 vs. limit=6.0 2023-10-04 11:20:07,906 INFO [train.py:1046] (2/4) Epoch 47, batch 1050, loss[loss=0.1519, simple_loss=0.2214, pruned_loss=0.04119, over 23457.00 frames. ], tot_loss[loss=0.1546, simple_loss=0.2352, pruned_loss=0.037, over 4690485.38 frames. ], batch size: 285, lr: 2.17e-03, grad_scale: 4.0 2023-10-04 11:20:09,307 WARNING [train.py:1204] (2/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_191-59784-0_sp0.9 from training. Duration: 0.97775 2023-10-04 11:20:10,623 WARNING [train.py:1204] (2/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_445-117398-0_sp1.1 from training. Duration: 0.8090625 2023-10-04 11:20:13,832 WARNING [train.py:1204] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_605-257205-0_sp0.9 from training. Duration: 0.6555625 2023-10-04 11:20:13,896 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_50050_210_16223_1_1535435796_7290138_592-258841-0_sp1.1 from training. Duration: 0.936375 2023-10-04 11:20:15,417 WARNING [train.py:1204] (2/4) Exclude cut with ID _14348_210_8341_1_1530599434996_3778640_240-232370-0_sp0.9 from training. Duration: 0.9333125 2023-10-04 11:20:16,013 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.3.encoder.layers.3.whiten, num_groups=1, num_channels=512, metric=3.83 vs. limit=12.0 2023-10-04 11:20:18,092 WARNING [train.py:1204] (2/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_239-316802-0_sp0.9 from training. Duration: 0.7666875 2023-10-04 11:20:19,502 WARNING [train.py:1204] (2/4) Exclude cut with ID _45560_210_21982_1_1533812603951_3584999_111-6376-0_sp1.1 from training. Duration: 0.763625 2023-10-04 11:20:22,574 WARNING [train.py:1204] (2/4) Exclude cut with ID _54189_210_16380_1_1534332670039_3911710_606-311481-0_sp1.1 from training. Duration: 0.963625 2023-10-04 11:20:23,931 WARNING [train.py:1204] (2/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_237-332027-0_sp1.1 from training. Duration: 0.863625 2023-10-04 11:20:23,956 WARNING [train.py:1204] (2/4) Exclude cut with ID _66070_210_7640_1_1535441393096_7805310_489-292631-0_sp0.9 from training. Duration: 0.87775 2023-10-04 11:20:24,671 WARNING [train.py:1204] (2/4) Exclude cut with ID _49728_210_12651_1_1534041034294_6788150_624-195301-0_sp1.1 from training. Duration: 0.77275 2023-10-04 11:20:25,851 WARNING [train.py:1204] (2/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_44-79427-0 from training. Duration: 0.98 2023-10-04 11:20:25,998 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.ff3_skip_rate, batch_count=1636120.0, ans=0.0 2023-10-04 11:20:27,222 WARNING [train.py:1204] (2/4) Exclude cut with ID _35350_210_16040_1_1533040191183_4075299_490-198354-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 11:20:28,638 WARNING [train.py:1204] (2/4) Exclude cut with ID _5166_210_2084_1_1527073197160_4087993_309-222545-0 from training. Duration: 0.84 2023-10-04 11:20:30,092 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_13091_210_7925_1_1530609447_8987354_790-199469-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 11:20:30,098 WARNING [train.py:1204] (2/4) Exclude cut with ID _21632_210_2253_1_1531994001765_3744140_110-274654-0 from training. Duration: 0.82 2023-10-04 11:20:30,110 WARNING [train.py:1204] (2/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_498-40153-0_sp0.9 from training. Duration: 0.7 2023-10-04 11:20:37,530 WARNING [train.py:1204] (2/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_363-282909-0_sp1.1 from training. Duration: 0.936375 2023-10-04 11:20:37,584 WARNING [train.py:1204] (2/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_178-177582-0_sp0.9 from training. Duration: 0.82225 2023-10-04 11:20:37,595 WARNING [train.py:1204] (2/4) Exclude cut with ID _24612_210_8913_1_1531998054193_3806359_357-35487-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 11:20:39,103 WARNING [train.py:1204] (2/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_138-332773-0 from training. Duration: 0.81 2023-10-04 11:20:39,123 WARNING [train.py:1204] (2/4) Exclude cut with ID _7624_210_1531_1_1528109802198_4933101_418-349834-0 from training. Duration: 0.6 2023-10-04 11:20:39,145 WARNING [train.py:1204] (2/4) Exclude cut with ID _7537_210_2084_1_1528502395431_3924359_14-312906-0_sp0.9 from training. Duration: 0.9333125 2023-10-04 11:20:42,272 WARNING [train.py:1204] (2/4) Exclude cut with ID _60057_210_10640_1_1534903065793_3333150_362-230377-0 from training. Duration: 0.87 2023-10-04 11:20:46,230 WARNING [train.py:1204] (2/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_138-64210-0 from training. Duration: 0.73 2023-10-04 11:20:47,496 WARNING [train.py:1204] (2/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_454-339504-0_sp1.1 from training. Duration: 0.97275 2023-10-04 11:20:47,819 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.feed_forward1.hidden_balancer.prob, batch_count=1636186.6666666667, ans=0.125 2023-10-04 11:20:49,743 WARNING [train.py:1204] (2/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_508-335369-0_sp0.9 from training. Duration: 0.5333125 2023-10-04 11:20:52,444 WARNING [train.py:1204] (2/4) Exclude cut with ID _5608_210_2106_1_1527415439089_6819768_909-82767-0_sp0.9 from training. Duration: 0.67775 2023-10-04 11:20:52,469 WARNING [train.py:1204] (2/4) Exclude cut with ID _6694_210_4599_1_1527852637490_3965529_99-11611-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 11:20:52,534 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_2655_210_2087_1_1525688959_3897400_52-279899-0_sp1.1 from training. Duration: 0.62725 2023-10-04 11:20:58,549 WARNING [train.py:1204] (2/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_375-245677-0_sp1.1 from training. Duration: 0.736375 2023-10-04 11:21:01,874 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_23850_210_4238_1_1532232597_1632433_32-185135-0 from training. Duration: 0.86 2023-10-04 11:21:01,992 WARNING [train.py:1204] (2/4) Exclude cut with ID _15113_210_8254_1_1530842438579_3716279_209-54533-0 from training. Duration: 0.96 2023-10-04 11:21:03,284 WARNING [train.py:1204] (2/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_401-53423-0 from training. Duration: 0.55 2023-10-04 11:21:03,330 WARNING [train.py:1204] (2/4) Exclude cut with ID _14867_210_1968_1_1530684179890_3846969_319-250868-0_sp1.1 from training. Duration: 0.9 2023-10-04 11:21:03,344 WARNING [train.py:1204] (2/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_106-30782-0_sp0.9 from training. Duration: 0.888875 2023-10-04 11:21:06,019 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_5960_210_1794_1_1527403865_3990999_515-2613-0 from training. Duration: 0.88 2023-10-04 11:21:08,913 WARNING [train.py:1204] (2/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_798-106179-0_sp0.9 from training. Duration: 0.92225 2023-10-04 11:21:11,581 WARNING [train.py:1204] (2/4) Exclude cut with ID _14227_210_4381_1_1530701804286_3940259_125-275642-0_sp1.1 from training. Duration: 0.9 2023-10-04 11:21:11,583 WARNING [train.py:1204] (2/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_79-200392-0_sp1.1 from training. Duration: 0.8818125 2023-10-04 11:21:11,636 WARNING [train.py:1204] (2/4) Exclude cut with ID _48836_210_12210_1_1534075848293_3904150_429-35475-0_sp0.9 from training. Duration: 0.988875 2023-10-04 11:21:11,652 WARNING [train.py:1204] (2/4) Exclude cut with ID _69884_210_9771_1_1535846392818_4166169_224-172517-0_sp1.1 from training. Duration: 0.97275 2023-10-04 11:21:16,185 WARNING [train.py:1204] (2/4) Exclude cut with ID _15113_210_8254_1_1530842438579_3716279_76-232507-0_sp1.1 from training. Duration: 0.97275 2023-10-04 11:21:16,188 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_135-5501-0 from training. Duration: 0.98 2023-10-04 11:21:16,320 WARNING [train.py:1204] (2/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_563-347832-0_sp0.9 from training. Duration: 0.988875 2023-10-04 11:21:16,331 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6786_210_1388_1_1527839699_4194495_273-149841-0 from training. Duration: 0.92 2023-10-04 11:21:17,640 WARNING [train.py:1204] (2/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_172-215932-0 from training. Duration: 0.66 2023-10-04 11:21:17,699 WARNING [train.py:1204] (2/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_939-132910-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 11:21:20,774 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.762e+02 2.061e+02 2.280e+02 2.915e+02 5.230e+02, threshold=4.560e+02, percent-clipped=2.0 2023-10-04 11:21:22,023 INFO [train.py:1046] (2/4) Epoch 47, batch 1100, loss[loss=0.1493, simple_loss=0.2237, pruned_loss=0.03748, over 24310.00 frames. ], tot_loss[loss=0.1537, simple_loss=0.2347, pruned_loss=0.03641, over 4713380.15 frames. ], batch size: 56, lr: 2.17e-03, grad_scale: 8.0 2023-10-04 11:21:22,063 WARNING [train.py:1204] (2/4) Exclude cut with ID _49371_210_12071_1_1533952761021_3812190_16-246001-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 11:21:26,302 WARNING [train.py:1204] (2/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_680-189165-0_sp1.1 from training. Duration: 0.936375 2023-10-04 11:21:26,444 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder_embed.convnext.hidden_balancer.prob, batch_count=1636386.6666666667, ans=0.125 2023-10-04 11:21:26,510 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.0.feed_forward1.out_proj.dropout_p, batch_count=1636386.6666666667, ans=0.1 2023-10-04 11:21:29,892 WARNING [train.py:1204] (2/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_53-300544-0_sp1.1 from training. Duration: 0.6090625 2023-10-04 11:21:32,524 WARNING [train.py:1204] (2/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_540-275685-0_sp1.1 from training. Duration: 0.8090625 2023-10-04 11:21:32,554 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_38965_210_4852_1_1533174906_3484735_453-106579-0_sp1.1 from training. Duration: 0.92725 2023-10-04 11:21:32,596 WARNING [train.py:1204] (2/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_105-328174-0 from training. Duration: 0.76 2023-10-04 11:21:33,983 WARNING [train.py:1204] (2/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_279-126205-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 11:21:36,787 WARNING [train.py:1204] (2/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_5-55893-0_sp0.9 from training. Duration: 0.611125 2023-10-04 11:21:39,527 WARNING [train.py:1204] (2/4) Exclude cut with ID _13678_210_7497_1_1530530069226_3463170_141-10835-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 11:21:42,788 WARNING [train.py:1204] (2/4) Exclude cut with ID _22083_210_8254_1_1531702862439_3678970_201-85738-0_sp1.1 from training. Duration: 0.8181875 2023-10-04 11:21:42,817 WARNING [train.py:1204] (2/4) Exclude cut with ID _4697_210_2069_1_1526815801953_3823098_51-1049-0 from training. Duration: 0.75 2023-10-04 11:21:44,170 WARNING [train.py:1204] (2/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_206-10097-0_sp0.9 from training. Duration: 0.5444375 2023-10-04 11:21:44,370 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.conv_module1.balancer2.min_positive, batch_count=1636453.3333333333, ans=0.05 2023-10-04 11:21:45,589 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_4528_210_3273_1_1527076654_3276777_360-63661-0_sp1.1 from training. Duration: 0.92725 2023-10-04 11:21:45,598 WARNING [train.py:1204] (2/4) Exclude cut with ID _64086_210_7994_1_1535270417019_7435949_449-31567-0_sp1.1 from training. Duration: 0.8818125 2023-10-04 11:21:48,353 WARNING [train.py:1204] (2/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_245-174553-0_sp0.9 from training. Duration: 0.97775 2023-10-04 11:21:49,785 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6545_210_2040_1_1527774670_4245258_379-14182-0_sp1.1 from training. Duration: 0.72725 2023-10-04 11:21:54,356 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_3133_210_2087_1_1526106446_2324690_256-206070-0_sp1.1 from training. Duration: 0.963625 2023-10-04 11:21:57,153 WARNING [train.py:1204] (2/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_278-104676-0 from training. Duration: 0.98 2023-10-04 11:21:58,562 WARNING [train.py:1204] (2/4) Exclude cut with ID IC0671W0065-331908 from training. Duration: 0.896 2023-10-04 11:21:58,610 WARNING [train.py:1204] (2/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_244-129917-0_sp1.1 from training. Duration: 0.97275 2023-10-04 11:22:00,564 WARNING [train.py:1204] (2/4) Exclude cut with ID _7852_210_5219_1_1528286471316_8139400_1129-170857-0_sp1.1 from training. Duration: 0.97275 2023-10-04 11:22:03,718 WARNING [train.py:1204] (2/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_443-30034-0_sp0.9 from training. Duration: 0.711125 2023-10-04 11:22:03,761 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_31583_210_3852_1_1532595851_4024413_250-332999-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 11:22:05,113 WARNING [train.py:1204] (2/4) Exclude cut with ID _8772_210_1850_1_1528635595972_3664011_247-336448-0 from training. Duration: 0.74 2023-10-04 11:22:06,429 WARNING [train.py:1204] (2/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_567-135114-0_sp0.9 from training. Duration: 0.9666875 2023-10-04 11:22:06,441 WARNING [train.py:1204] (2/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_557-349534-0_sp1.1 from training. Duration: 0.82725 2023-10-04 11:22:06,455 WARNING [train.py:1204] (2/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_1341-26728-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 11:22:06,585 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.0.ff3_skip_rate, batch_count=1636586.6666666667, ans=0.0 2023-10-04 11:22:07,733 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_40222_210_6228_1_1533385298_4481939_375-328051-0_sp1.1 from training. Duration: 0.97275 2023-10-04 11:22:07,754 WARNING [train.py:1204] (2/4) Exclude cut with ID _4697_210_2069_1_1526815801953_3823098_276-96593-0 from training. Duration: 0.77 2023-10-04 11:22:11,939 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_53071_210_16554_1_1534490405_2954066_265-170472-0_sp0.9 from training. Duration: 0.92225 2023-10-04 11:22:11,956 WARNING [train.py:1204] (2/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_365-1492-0 from training. Duration: 0.91 2023-10-04 11:22:14,099 WARNING [train.py:1204] (2/4) Exclude cut with ID _66873_210_5517_1_1535454136506_4293370_654-52713-0_sp0.9 from training. Duration: 0.8555625 2023-10-04 11:22:19,628 WARNING [train.py:1204] (2/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_113-92608-0_sp1.1 from training. Duration: 0.4909375 2023-10-04 11:22:22,965 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_46013_210_7234_1_1533776362_3744032_50-141535-0 from training. Duration: 0.99 2023-10-04 11:22:24,263 WARNING [train.py:1204] (2/4) Exclude cut with ID _13942_210_9988_1_1530604906036_3733360_499-55676-0_sp1.1 from training. Duration: 0.563625 2023-10-04 11:22:24,382 WARNING [train.py:1204] (2/4) Exclude cut with ID _30800_210_22382_1_1532499347904_4515959_375-65143-0_sp1.1 from training. Duration: 0.97275 2023-10-04 11:22:27,207 WARNING [train.py:1204] (2/4) Exclude cut with ID _16756_210_4915_1_1531047803763_7104259_199-325608-0_sp1.1 from training. Duration: 0.92725 2023-10-04 11:22:27,230 WARNING [train.py:1204] (2/4) Exclude cut with ID _31649_210_6228_1_1532584608063_4091078_443-124391-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 11:22:27,325 WARNING [train.py:1204] (2/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_146-218763-0 from training. Duration: 0.86 2023-10-04 11:22:27,537 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.feed_forward2.hidden_balancer.prob, batch_count=1636653.3333333333, ans=0.125 2023-10-04 11:22:28,653 WARNING [train.py:1204] (2/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_567-135114-0_sp1.1 from training. Duration: 0.7909375 2023-10-04 11:22:28,697 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_26326_210_7925_1_1533002493_3592406_462-288299-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 11:22:30,442 WARNING [train.py:1204] (2/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_322-217220-0 from training. Duration: 0.86 2023-10-04 11:22:30,475 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_238-256668-0_sp1.1 from training. Duration: 0.62725 2023-10-04 11:22:31,784 WARNING [train.py:1204] (2/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_371-18169-0 from training. Duration: 0.86 2023-10-04 11:22:33,267 WARNING [train.py:1204] (2/4) Exclude cut with ID _58480_210_12392_1_1534811121111_3646160_23-74512-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 11:22:33,272 WARNING [train.py:1204] (2/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_784-213099-0_sp1.1 from training. Duration: 0.8454375 2023-10-04 11:22:35,168 WARNING [train.py:1204] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_625-102955-0_sp1.1 from training. Duration: 0.7 2023-10-04 11:22:36,337 INFO [train.py:1046] (2/4) Epoch 47, batch 1150, loss[loss=0.1536, simple_loss=0.2311, pruned_loss=0.03801, over 23666.00 frames. ], tot_loss[loss=0.1533, simple_loss=0.2349, pruned_loss=0.03581, over 4731086.77 frames. ], batch size: 232, lr: 2.17e-03, grad_scale: 8.0 2023-10-04 11:22:37,936 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.1.nonlin_attention.balancer.prob, batch_count=1636720.0, ans=0.125 2023-10-04 11:22:39,151 WARNING [train.py:1204] (2/4) Exclude cut with ID _49748_210_24116_1_1533986971737_3158910_176-174327-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 11:22:41,930 WARNING [train.py:1204] (2/4) Exclude cut with ID _50310_210_15488_1_1534068043345_3641080_432-96140-0_sp1.1 from training. Duration: 0.936375 2023-10-04 11:22:43,391 WARNING [train.py:1204] (2/4) Exclude cut with ID _40402_210_18219_1_1533522324115_2953150_76-14910-0_sp1.1 from training. Duration: 0.92725 2023-10-04 11:22:43,420 WARNING [train.py:1204] (2/4) Exclude cut with ID _21783_210_11357_1_1531742458730_3762679_40-19458-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 11:22:43,458 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_14056_210_4163_1_1530604483_7572827_46-93547-0 from training. Duration: 0.95 2023-10-04 11:22:45,244 WARNING [train.py:1204] (2/4) Exclude cut with ID _21631_210_2253_1_1531907880778_3160339_321-35325-0_sp1.1 from training. Duration: 0.963625 2023-10-04 11:22:47,957 WARNING [train.py:1204] (2/4) Exclude cut with ID _4779_210_2084_1_1527318022944_3914893_500-285013-0 from training. Duration: 0.69 2023-10-04 11:22:49,321 WARNING [train.py:1204] (2/4) Exclude cut with ID _43552_210_8913_1_1533726031059_3523429_301-93324-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 11:22:49,334 WARNING [train.py:1204] (2/4) Exclude cut with ID _6473_210_1741_1_1527901307396_3815380_108-17335-0_sp1.1 from training. Duration: 0.5909375 2023-10-04 11:22:49,600 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.attention_skip_rate, batch_count=1636786.6666666667, ans=0.0 2023-10-04 11:22:54,514 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.5.encoder.layers.0.whiten, num_groups=1, num_channels=256, metric=2.33 vs. limit=12.0 2023-10-04 11:22:55,412 WARNING [train.py:1204] (2/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_206-10097-0 from training. Duration: 0.49 2023-10-04 11:22:55,646 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.conv_module2.balancer2.prob, batch_count=1636786.6666666667, ans=0.125 2023-10-04 11:22:55,681 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.conv_module2.balancer2.prob, batch_count=1636786.6666666667, ans=0.125 2023-10-04 11:22:59,417 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_36756_210_7234_1_1533026991_966276_103-77513-0_sp1.1 from training. Duration: 0.97275 2023-10-04 11:23:02,803 WARNING [train.py:1204] (2/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_662-18193-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 11:23:02,877 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_26433_210_12108_1_1532157158_4056480_439-52438-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 11:23:02,930 WARNING [train.py:1204] (2/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_464-33931-0 from training. Duration: 0.54 2023-10-04 11:23:02,938 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_3474_210_2028_1_1526188513_4825246_145-309131-0_sp0.9 from training. Duration: 0.82225 2023-10-04 11:23:02,950 WARNING [train.py:1204] (2/4) Exclude cut with ID _9679_210_7497_1_1529129212906_4363929_446-308321-0_sp1.1 from training. Duration: 0.963625 2023-10-04 11:23:07,085 WARNING [train.py:1204] (2/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_662-275621-0 from training. Duration: 0.42 2023-10-04 11:23:07,831 WARNING [train.py:1204] (2/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_344-337148-0_sp1.1 from training. Duration: 0.97275 2023-10-04 11:23:10,275 WARNING [train.py:1204] (2/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_759-179071-0_sp1.1 from training. Duration: 0.92725 2023-10-04 11:23:19,100 WARNING [train.py:1204] (2/4) Exclude cut with ID _28783_210_7943_1_1532671164808_7155479_293-12188-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 11:23:19,236 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.1.feed_forward1.out_proj.dropout_p, batch_count=1636920.0, ans=0.1 2023-10-04 11:23:27,678 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_17613_210_2151_1_1531137569_2366160_235-320304-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 11:23:27,694 WARNING [train.py:1204] (2/4) Exclude cut with ID _14349_210_8341_1_1530615710818_3442470_170-336541-0 from training. Duration: 0.86 2023-10-04 11:23:27,745 WARNING [train.py:1204] (2/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_375-218677-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 11:23:29,051 WARNING [train.py:1204] (2/4) Exclude cut with ID _28317_210_12742_1_1532480302381_8510325_829-202956-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 11:23:34,987 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9029W0436-509823 from training. Duration: 0.768 2023-10-04 11:23:36,411 WARNING [train.py:1204] (2/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_231-112214-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 11:23:44,125 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9059W0470-524883 from training. Duration: 0.3415625 2023-10-04 11:23:45,661 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_43266_210_1794_1_1533521789_4497218_206-79512-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 11:23:45,908 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.feed_forward1.out_proj.dropout_p, batch_count=1636986.6666666667, ans=0.1 2023-10-04 11:23:47,024 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_294-163793-0_sp0.9 from training. Duration: 0.911125 2023-10-04 11:23:47,053 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6290_210_3273_1_1527595224_1282787_115-87095-0_sp1.1 from training. Duration: 0.5 2023-10-04 11:23:48,702 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.583e+02 2.016e+02 2.247e+02 2.696e+02 4.163e+02, threshold=4.494e+02, percent-clipped=0.0 2023-10-04 11:23:48,801 WARNING [train.py:1204] (2/4) Exclude cut with ID _14100_210_8341_1_1530529320613_3678069_279-55760-0_sp1.1 from training. Duration: 0.8454375 2023-10-04 11:23:50,022 INFO [train.py:1046] (2/4) Epoch 47, batch 1200, loss[loss=0.1349, simple_loss=0.2157, pruned_loss=0.02702, over 24345.00 frames. ], tot_loss[loss=0.1533, simple_loss=0.235, pruned_loss=0.03582, over 4729953.00 frames. ], batch size: 56, lr: 2.17e-03, grad_scale: 16.0 2023-10-04 11:23:52,856 WARNING [train.py:1204] (2/4) Exclude cut with ID _14621_210_1925_1_1530686901185_3675420_419-234200-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 11:23:53,131 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.conv_skip_rate, batch_count=1637053.3333333333, ans=0.0 2023-10-04 11:23:57,130 WARNING [train.py:1204] (2/4) Exclude cut with ID _60229_210_27767_1_1534838406158_3655350_353-213656-0_sp1.1 from training. Duration: 0.863625 2023-10-04 11:23:57,139 WARNING [train.py:1204] (2/4) Exclude cut with ID _48678_210_5663_1_1533988830947_3769199_306-255886-0_sp1.1 from training. Duration: 0.7 2023-10-04 11:23:59,187 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.nonlin_attention.balancer.prob, batch_count=1637053.3333333333, ans=0.125 2023-10-04 11:24:00,362 WARNING [train.py:1204] (2/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_466-3492-0_sp1.1 from training. Duration: 0.97275 2023-10-04 11:24:00,362 WARNING [train.py:1204] (2/4) Exclude cut with ID _8755_210_1876_1_1528628559357_3258209_199-242167-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 11:24:00,423 WARNING [train.py:1204] (2/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_237-89684-0_sp1.1 from training. Duration: 0.87275 2023-10-04 11:24:02,456 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.2.encoder.layers.1.self_attn1.whiten, num_groups=1, num_channels=384, metric=15.25 vs. limit=22.5 2023-10-04 11:24:03,080 WARNING [train.py:1204] (2/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_70-84121-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 11:24:03,308 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.conv_module2.balancer1.prob, batch_count=1637120.0, ans=0.125 2023-10-04 11:24:04,504 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7544_210_6753_1_1528587722_3576686_172-171750-0_sp1.1 from training. Duration: 0.5909375 2023-10-04 11:24:04,663 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.balancer2.prob, batch_count=1637120.0, ans=0.125 2023-10-04 11:24:05,862 WARNING [train.py:1204] (2/4) Exclude cut with ID _24271_210_12658_1_1531987270022_7190519_40-321822-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 11:24:05,880 WARNING [train.py:1204] (2/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_227-83612-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 11:24:10,390 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9156W0015-572508 from training. Duration: 0.896 2023-10-04 11:24:11,870 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_292-265928-0 from training. Duration: 0.51 2023-10-04 11:24:12,643 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.2.encoder.layers.2.feed_forward3.out_whiten, num_groups=1, num_channels=384, metric=12.10 vs. limit=15.0 2023-10-04 11:24:15,373 WARNING [train.py:1204] (2/4) Exclude cut with ID _14747_210_8341_1_1530685797463_3654089_147-131229-0_sp1.1 from training. Duration: 0.6909375 2023-10-04 11:24:16,696 WARNING [train.py:1204] (2/4) Exclude cut with ID _45297_210_2721_1_1533715351381_3264949_351-147136-0_sp0.9 from training. Duration: 0.9666875 2023-10-04 11:24:19,793 WARNING [train.py:1204] (2/4) Exclude cut with ID _5250_210_5219_1_1527249561806_8692110_1102-324568-0_sp1.1 from training. Duration: 0.97275 2023-10-04 11:24:21,233 WARNING [train.py:1204] (2/4) Exclude cut with ID _18516_210_9206_1_1531704653206_3692406_465-75446-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 11:24:21,235 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9203W0126-595623 from training. Duration: 0.896 2023-10-04 11:24:22,658 WARNING [train.py:1204] (2/4) Exclude cut with ID _55383_210_10635_1_1534468426172_2231669_139-328410-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 11:24:22,919 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.0.conv_module1.balancer1.prob, batch_count=1637186.6666666667, ans=0.125 2023-10-04 11:24:28,165 WARNING [train.py:1204] (2/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_166-246557-0_sp0.9 from training. Duration: 0.711125 2023-10-04 11:24:28,177 WARNING [train.py:1204] (2/4) Exclude cut with ID _30039_210_13270_1_1532484612900_2725150_239-304872-0_sp1.1 from training. Duration: 0.963625 2023-10-04 11:24:28,210 WARNING [train.py:1204] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_6-226750-0 from training. Duration: 0.94 2023-10-04 11:24:31,384 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_135-5501-0_sp1.1 from training. Duration: 0.8909375 2023-10-04 11:24:31,658 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.conv_module2.balancer2.prob, batch_count=1637186.6666666667, ans=0.125 2023-10-04 11:24:34,148 WARNING [train.py:1204] (2/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_359-118926-0 from training. Duration: 0.72 2023-10-04 11:24:37,451 WARNING [train.py:1204] (2/4) Exclude cut with ID _8064_210_5399_1_1528372299291_4282973_4-91037-0 from training. Duration: 0.88 2023-10-04 11:24:37,478 WARNING [train.py:1204] (2/4) Exclude cut with ID _6653_210_2629_1_1527919172441_6732679_415-288492-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 11:24:38,859 WARNING [train.py:1204] (2/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_544-287179-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 11:24:40,262 WARNING [train.py:1204] (2/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_122-32606-0_sp1.1 from training. Duration: 0.92725 2023-10-04 11:24:40,321 WARNING [train.py:1204] (2/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_121-284152-0_sp1.1 from training. Duration: 0.536375 2023-10-04 11:24:41,590 WARNING [train.py:1204] (2/4) Exclude cut with ID _44718_210_5816_1_1534050051869_6469030_350-299477-0_sp1.1 from training. Duration: 0.97275 2023-10-04 11:24:41,601 WARNING [train.py:1204] (2/4) Exclude cut with ID _6653_210_2629_1_1527919172441_6732679_1103-56445-0_sp0.9 from training. Duration: 0.811125 2023-10-04 11:24:41,654 WARNING [train.py:1204] (2/4) Exclude cut with ID _12369_210_4381_1_1530356078581_4388520_64-298917-0_sp0.9 from training. Duration: 0.97775 2023-10-04 11:24:42,999 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_2245_210_2715_1_1525511548_3540254_310-52483-0 from training. Duration: 0.88 2023-10-04 11:24:43,048 WARNING [train.py:1204] (2/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_401-158239-0_sp1.1 from training. Duration: 0.6454375 2023-10-04 11:24:43,092 WARNING [train.py:1204] (2/4) Exclude cut with ID _59502_210_12210_1_1535107024698_1835139_146-162495-0_sp1.1 from training. Duration: 0.836375 2023-10-04 11:24:43,101 WARNING [train.py:1204] (2/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_41-31157-0_sp0.9 from training. Duration: 0.6333125 2023-10-04 11:24:46,442 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_68717_210_3528_1_1535628683_1556759_95-36722-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 11:24:46,449 WARNING [train.py:1204] (2/4) Exclude cut with ID _30196_210_1664_1_1533708496522_7074140_654-162611-0_sp1.1 from training. Duration: 0.92725 2023-10-04 11:24:49,754 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_9302_210_2550_1_1528889962_4655416_638-301620-0_sp0.9 from training. Duration: 0.52225 2023-10-04 11:24:51,130 WARNING [train.py:1204] (2/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_478-162247-0_sp1.1 from training. Duration: 0.7454375 2023-10-04 11:24:54,334 WARNING [train.py:1204] (2/4) Exclude cut with ID _45297_210_2721_1_1533715351381_3264949_353-9541-0 from training. Duration: 0.97 2023-10-04 11:24:58,830 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9653W0190-675468 from training. Duration: 0.9935 2023-10-04 11:25:00,221 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_49730_210_13049_1_1534075294_1830676_214-84436-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 11:25:01,735 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.0.conv_module2.balancer1.prob, batch_count=1637320.0, ans=0.125 2023-10-04 11:25:03,036 WARNING [train.py:1204] (2/4) Exclude cut with ID _50093_210_4220_1_1534417141207_3432299_259-322252-0_sp1.1 from training. Duration: 0.836375 2023-10-04 11:25:04,356 INFO [train.py:1046] (2/4) Epoch 47, batch 1250, loss[loss=0.1561, simple_loss=0.2392, pruned_loss=0.03648, over 23333.00 frames. ], tot_loss[loss=0.1541, simple_loss=0.2361, pruned_loss=0.03604, over 4732255.81 frames. ], batch size: 93, lr: 2.17e-03, grad_scale: 16.0 2023-10-04 11:25:04,405 WARNING [train.py:1204] (2/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_599-50220-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 11:25:05,774 WARNING [train.py:1204] (2/4) Exclude cut with ID _21878_210_3525_1_1531738772002_7264330_174-109016-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 11:25:07,713 WARNING [train.py:1204] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_665-196155-0 from training. Duration: 0.67 2023-10-04 11:25:11,919 WARNING [train.py:1204] (2/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_656-188361-0_sp1.1 from training. Duration: 0.7909375 2023-10-04 11:25:11,984 WARNING [train.py:1204] (2/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_60-18123-0_sp1.1 from training. Duration: 0.8 2023-10-04 11:25:13,240 WARNING [train.py:1204] (2/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_535-14410-0 from training. Duration: 0.47 2023-10-04 11:25:16,301 WARNING [train.py:1204] (2/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_94-126128-0_sp1.1 from training. Duration: 0.7818125 2023-10-04 11:25:16,388 WARNING [train.py:1204] (2/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_356-31216-0_sp1.1 from training. Duration: 0.6818125 2023-10-04 11:25:16,632 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.ff2_skip_rate, batch_count=1637386.6666666667, ans=0.0 2023-10-04 11:25:19,640 WARNING [train.py:1204] (2/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_443-30034-0_sp1.1 from training. Duration: 0.5818125 2023-10-04 11:25:19,712 WARNING [train.py:1204] (2/4) Exclude cut with ID _26907_210_7234_1_1532221740694_3205263_337-2494-0_sp1.1 from training. Duration: 0.8 2023-10-04 11:25:21,095 WARNING [train.py:1204] (2/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_49-97158-0_sp0.9 from training. Duration: 0.8444375 2023-10-04 11:25:21,105 WARNING [train.py:1204] (2/4) Exclude cut with ID _7877_210_6014_1_1528286358099_3750136_553-170128-0_sp1.1 from training. Duration: 0.963625 2023-10-04 11:25:23,870 WARNING [train.py:1204] (2/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_202-293523-0_sp0.9 from training. Duration: 0.8 2023-10-04 11:25:29,781 WARNING [train.py:1204] (2/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_188-171673-0_sp0.9 from training. Duration: 0.6444375 2023-10-04 11:25:29,796 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_120-41668-0_sp1.1 from training. Duration: 0.636375 2023-10-04 11:25:29,808 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_40223_210_6228_1_1533298404_4812267_613-78769-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 11:25:32,600 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_53861_210_5399_1_1534422156_4272077_150-336899-0_sp1.1 from training. Duration: 0.963625 2023-10-04 11:25:32,665 WARNING [train.py:1204] (2/4) Exclude cut with ID _14459_210_2228_1_1531443641946_7516700_1146-56843-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 11:25:35,461 WARNING [train.py:1204] (2/4) Exclude cut with ID _39867_210_9826_1_1533553222030_9344259_1194-299928-0_sp1.1 from training. Duration: 0.92725 2023-10-04 11:25:37,258 WARNING [train.py:1204] (2/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_214-291369-0_sp0.9 from training. Duration: 0.7 2023-10-04 11:25:42,828 WARNING [train.py:1204] (2/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_358-215558-0 from training. Duration: 0.93 2023-10-04 11:25:42,895 WARNING [train.py:1204] (2/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_577-287150-0_sp0.9 from training. Duration: 0.911125 2023-10-04 11:25:45,629 WARNING [train.py:1204] (2/4) Exclude cut with ID _60860_210_8254_1_1534896015875_3557169_229-187367-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 11:25:46,940 WARNING [train.py:1204] (2/4) Exclude cut with ID _6275_210_2084_1_1528110062213_7669049_857-298942-0 from training. Duration: 0.85 2023-10-04 11:25:46,983 WARNING [train.py:1204] (2/4) Exclude cut with ID _40137_210_12210_1_1533290484046_3795180_249-187059-0_sp1.1 from training. Duration: 0.8 2023-10-04 11:25:46,996 WARNING [train.py:1204] (2/4) Exclude cut with ID ID0171W0508-765603 from training. Duration: 0.541 2023-10-04 11:25:48,697 WARNING [train.py:1204] (2/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_521-219439-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 11:25:48,703 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_40223_210_6228_1_1533298404_4812267_741-302616-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 11:25:53,445 WARNING [train.py:1204] (2/4) Exclude cut with ID _65293_210_9070_1_1535545549765_4004210_438-154735-0_sp1.1 from training. Duration: 0.92725 2023-10-04 11:25:56,209 WARNING [train.py:1204] (2/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_564-132950-0_sp1.1 from training. Duration: 0.92725 2023-10-04 11:25:56,248 WARNING [train.py:1204] (2/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_291-270881-0_sp1.1 from training. Duration: 0.8818125 2023-10-04 11:25:56,349 WARNING [train.py:1204] (2/4) Exclude cut with ID _60229_210_27767_1_1534838406158_3655350_353-213656-0 from training. Duration: 0.95 2023-10-04 11:25:56,360 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_3133_210_2087_1_1526106446_2324690_243-143119-0 from training. Duration: 0.77 2023-10-04 11:25:56,538 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.conv_module2.balancer2.min_abs, batch_count=1637586.6666666667, ans=0.5 2023-10-04 11:25:57,734 WARNING [train.py:1204] (2/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_690-126306-0 from training. Duration: 0.78 2023-10-04 11:25:59,676 WARNING [train.py:1204] (2/4) Exclude cut with ID _30304_210_6319_1_1532476827261_7582060_347-95003-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 11:26:02,395 WARNING [train.py:1204] (2/4) Exclude cut with ID _2322_210_2667_1_1525604548971_3656299_440-80494-0 from training. Duration: 0.88 2023-10-04 11:26:02,410 WARNING [train.py:1204] (2/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_370-229434-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 11:26:03,856 WARNING [train.py:1204] (2/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_124-345840-0_sp1.1 from training. Duration: 0.47275 2023-10-04 11:26:03,872 WARNING [train.py:1204] (2/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_165-123581-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 11:26:05,424 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.bypass.skip_rate, batch_count=1637653.3333333333, ans=0.07 2023-10-04 11:26:06,541 WARNING [train.py:1204] (2/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_453-282588-0 from training. Duration: 0.98 2023-10-04 11:26:06,559 WARNING [train.py:1204] (2/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_299-34413-0_sp0.9 from training. Duration: 0.72225 2023-10-04 11:26:06,601 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_40036_210_13405_1_1533648647_1315388_48-146016-0_sp1.1 from training. Duration: 0.8454375 2023-10-04 11:26:06,628 WARNING [train.py:1204] (2/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_99-167392-0_sp0.9 from training. Duration: 0.67775 2023-10-04 11:26:06,795 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.self_attn_weights.pos_emb_skip_rate, batch_count=1637653.3333333333, ans=0.0 2023-10-04 11:26:06,895 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.conv_module1.balancer2.prob, batch_count=1637653.3333333333, ans=0.125 2023-10-04 11:26:08,621 WARNING [train.py:1204] (2/4) Exclude cut with ID _48677_210_5663_1_1533902405919_3892989_37-207114-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 11:26:08,730 WARNING [train.py:1204] (2/4) Exclude cut with ID _29857_210_20034_1_1532395886753_3798160_335-163562-0 from training. Duration: 0.85 2023-10-04 11:26:11,481 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_36492_210_7927_1_1533261987_4790834_703-343351-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 11:26:12,816 WARNING [train.py:1204] (2/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_80-147301-0_sp0.9 from training. Duration: 0.9333125 2023-10-04 11:26:14,161 WARNING [train.py:1204] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_218-295097-0_sp0.9 from training. Duration: 0.8555625 2023-10-04 11:26:16,847 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.698e+02 2.071e+02 2.257e+02 2.568e+02 3.764e+02, threshold=4.514e+02, percent-clipped=0.0 2023-10-04 11:26:16,999 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_2295_210_2087_1_1525500993_2407863_116-238894-0_sp1.1 from training. Duration: 0.636375 2023-10-04 11:26:17,197 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.feed_forward1.hidden_balancer.prob, batch_count=1637720.0, ans=0.125 2023-10-04 11:26:18,270 INFO [train.py:1046] (2/4) Epoch 47, batch 1300, loss[loss=0.1559, simple_loss=0.242, pruned_loss=0.03489, over 24652.00 frames. ], tot_loss[loss=0.1543, simple_loss=0.2361, pruned_loss=0.0362, over 4735198.13 frames. ], batch size: 68, lr: 2.17e-03, grad_scale: 16.0 2023-10-04 11:26:20,174 WARNING [train.py:1204] (2/4) Exclude cut with ID _3231_210_2106_1_1526205864934_6784220_697-171068-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 11:26:20,196 WARNING [train.py:1204] (2/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_102-77064-0 from training. Duration: 0.93 2023-10-04 11:26:24,356 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_13091_210_7925_1_1530609447_8987354_1186-261957-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 11:26:26,172 WARNING [train.py:1204] (2/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_91-277684-0_sp1.1 from training. Duration: 0.67275 2023-10-04 11:26:26,319 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.conv_module1.balancer2.prob, batch_count=1637720.0, ans=0.125 2023-10-04 11:26:28,860 WARNING [train.py:1204] (2/4) Exclude cut with ID _31088_210_16446_1_1532520012284_3440919_159-313751-0_sp1.1 from training. Duration: 0.936375 2023-10-04 11:26:28,973 WARNING [train.py:1204] (2/4) Exclude cut with ID _42326_210_7770_1_1533814282531_3707269_227-203721-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 11:26:30,312 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_3531_210_3528_1_1526294203_5216456_309-227302-0_sp1.1 from training. Duration: 0.62725 2023-10-04 11:26:31,568 WARNING [train.py:1204] (2/4) Exclude cut with ID _14087_210_2253_1_1530529224512_3014029_226-242140-0 from training. Duration: 0.81 2023-10-04 11:26:34,495 WARNING [train.py:1204] (2/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_122-109023-0_sp1.1 from training. Duration: 0.6909375 2023-10-04 11:26:35,894 WARNING [train.py:1204] (2/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_660-211088-0_sp0.9 from training. Duration: 0.688875 2023-10-04 11:26:37,241 WARNING [train.py:1204] (2/4) Exclude cut with ID _39290_210_4220_1_1533862778760_3075118_92-311600-0 from training. Duration: 0.88 2023-10-04 11:26:40,621 WARNING [train.py:1204] (2/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_390-339137-0_sp1.1 from training. Duration: 0.5090625 2023-10-04 11:26:44,605 WARNING [train.py:1204] (2/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_449-341946-0_sp1.1 from training. Duration: 0.963625 2023-10-04 11:26:44,680 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_68095_210_20185_1_1535718226_8888855_884-327007-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 11:26:46,165 WARNING [train.py:1204] (2/4) Exclude cut with ID _42326_210_7770_1_1533814282531_3707269_308-222998-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 11:26:48,069 WARNING [train.py:1204] (2/4) Exclude cut with ID _40399_210_4353_1_1533305658194_3279406_344-238708-0_sp1.1 from training. Duration: 0.963625 2023-10-04 11:26:48,136 WARNING [train.py:1204] (2/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_87-131427-0_sp1.1 from training. Duration: 0.7090625 2023-10-04 11:26:48,697 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.4.encoder.layers.1.whiten, num_groups=1, num_channels=384, metric=3.94 vs. limit=12.0 2023-10-04 11:26:50,007 WARNING [train.py:1204] (2/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_110-130961-0_sp0.9 from training. Duration: 0.52225 2023-10-04 11:26:51,349 WARNING [train.py:1204] (2/4) Exclude cut with ID _55153_210_2253_1_1534384786677_3162096_148-79022-0 from training. Duration: 0.96 2023-10-04 11:26:55,407 WARNING [train.py:1204] (2/4) Exclude cut with ID _9679_210_7497_1_1529129212906_4363929_512-313889-0_sp1.1 from training. Duration: 0.736375 2023-10-04 11:26:55,418 WARNING [train.py:1204] (2/4) Exclude cut with ID _7970_210_1883_1_1528455616296_3472230_25-178407-0_sp1.1 from training. Duration: 0.6454375 2023-10-04 11:26:55,571 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.0.conv_module1.balancer1.prob, batch_count=1637853.3333333333, ans=0.125 2023-10-04 11:26:57,082 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.balancer2.prob, batch_count=1637853.3333333333, ans=0.125 2023-10-04 11:26:57,576 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.1.encoder.layers.1.self_attn2.whiten, num_groups=1, num_channels=256, metric=10.23 vs. limit=22.5 2023-10-04 11:26:58,148 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_3133_210_2087_1_1526106446_2324690_246-125000-0 from training. Duration: 0.62 2023-10-04 11:27:00,060 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_15126_210_6753_1_1530778677_2591185_113-157651-0_sp0.9 from training. Duration: 0.7555625 2023-10-04 11:27:01,435 WARNING [train.py:1204] (2/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_105-63309-0_sp1.1 from training. Duration: 0.8909375 2023-10-04 11:27:03,010 WARNING [train.py:1204] (2/4) Exclude cut with ID _29877_210_6228_1_1532397791106_5276149_578-177271-0_sp1.1 from training. Duration: 0.936375 2023-10-04 11:27:04,365 WARNING [train.py:1204] (2/4) Exclude cut with ID _44831_210_13427_1_1533816011549_3689035_32-188147-0 from training. Duration: 0.96 2023-10-04 11:27:04,407 WARNING [train.py:1204] (2/4) Exclude cut with ID _12369_210_4381_1_1530356078581_4388520_300-254337-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 11:27:04,429 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_128-241008-0 from training. Duration: 0.94 2023-10-04 11:27:05,878 WARNING [train.py:1204] (2/4) Exclude cut with ID _21878_210_3525_1_1531738772002_7264330_53-279178-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 11:27:09,302 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_41452_210_7291_1_1533437687_3941281_273-334088-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 11:27:09,305 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_128-241008-0_sp1.1 from training. Duration: 0.8545625 2023-10-04 11:27:13,420 WARNING [train.py:1204] (2/4) Exclude cut with ID _8394_210_6702_1_1528590504316_3540189_379-49457-0 from training. Duration: 0.87 2023-10-04 11:27:14,774 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_105-114797-0 from training. Duration: 0.84 2023-10-04 11:27:14,870 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_14928_210_5632_1_1530773461_4714000_454-112198-0 from training. Duration: 0.66 2023-10-04 11:27:19,368 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_456-22141-0_sp1.1 from training. Duration: 0.8 2023-10-04 11:27:22,207 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_115-233929-0 from training. Duration: 0.72 2023-10-04 11:27:23,583 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_616-16795-0_sp1.1 from training. Duration: 0.963625 2023-10-04 11:27:27,329 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.1.encoder.layers.0.self_attn1.whiten, num_groups=1, num_channels=256, metric=12.97 vs. limit=22.5 2023-10-04 11:27:29,254 WARNING [train.py:1204] (2/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_448-212135-0 from training. Duration: 0.91 2023-10-04 11:27:32,242 INFO [train.py:1046] (2/4) Epoch 47, batch 1350, loss[loss=0.1671, simple_loss=0.2465, pruned_loss=0.04386, over 23286.00 frames. ], tot_loss[loss=0.1541, simple_loss=0.2354, pruned_loss=0.03638, over 4728313.70 frames. ], batch size: 105, lr: 2.17e-03, grad_scale: 16.0 2023-10-04 11:27:32,444 WARNING [train.py:1204] (2/4) Exclude cut with ID _46683_210_9642_1_1533730913492_4591116_201-205677-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 11:27:35,162 WARNING [train.py:1204] (2/4) Exclude cut with ID _64086_210_7994_1_1535270417019_7435949_43-80483-0_sp1.1 from training. Duration: 0.97275 2023-10-04 11:27:39,721 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_240-134124-0_sp1.1 from training. Duration: 0.963625 2023-10-04 11:27:39,753 WARNING [train.py:1204] (2/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_907-120940-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 11:27:41,150 WARNING [train.py:1204] (2/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_848-280957-0_sp1.1 from training. Duration: 0.7909375 2023-10-04 11:27:41,181 WARNING [train.py:1204] (2/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_243-42116-0_sp0.9 from training. Duration: 0.811125 2023-10-04 11:27:45,484 WARNING [train.py:1204] (2/4) Exclude cut with ID _6694_210_4599_1_1527852637490_3965529_116-109398-0_sp0.9 from training. Duration: 0.811125 2023-10-04 11:27:48,676 WARNING [train.py:1204] (2/4) Exclude cut with ID _24314_210_4915_1_1532135078116_7170179_521-258868-0 from training. Duration: 0.79 2023-10-04 11:27:50,468 WARNING [train.py:1204] (2/4) Exclude cut with ID _2220_210_1819_1_1525509109898_3658680_50-99837-0_sp0.9 from training. Duration: 0.6 2023-10-04 11:27:50,724 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.conv_module1.balancer1.prob, batch_count=1638120.0, ans=0.125 2023-10-04 11:27:51,790 WARNING [train.py:1204] (2/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_313-134930-0_sp1.1 from training. Duration: 0.8818125 2023-10-04 11:27:53,207 WARNING [train.py:1204] (2/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_125-144174-0 from training. Duration: 0.39 2023-10-04 11:27:54,531 WARNING [train.py:1204] (2/4) Exclude cut with ID _59288_210_5854_1_1535077757774_3412246_275-293897-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 11:27:56,021 WARNING [train.py:1204] (2/4) Exclude cut with ID _40519_210_13794_1_1533715228655_7337390_722-52880-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 11:27:56,045 WARNING [train.py:1204] (2/4) Exclude cut with ID _7537_210_2084_1_1528502395431_3924359_128-268029-0 from training. Duration: 0.98 2023-10-04 11:27:57,448 WARNING [train.py:1204] (2/4) Exclude cut with ID _8021_210_7994_1_1528376451358_3653069_179-45206-0 from training. Duration: 0.86 2023-10-04 11:27:58,817 WARNING [train.py:1204] (2/4) Exclude cut with ID _13252_210_1925_1_1530671372047_3336909_88-276668-0 from training. Duration: 0.99 2023-10-04 11:28:00,074 WARNING [train.py:1204] (2/4) Exclude cut with ID _22613_210_5219_1_1532253599686_4090170_290-212862-0_sp1.1 from training. Duration: 0.97275 2023-10-04 11:28:00,086 WARNING [train.py:1204] (2/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_465-214726-0 from training. Duration: 0.86 2023-10-04 11:28:10,711 WARNING [train.py:1204] (2/4) Exclude cut with ID _56480_210_4220_1_1534679942079_3075409_138-36031-0_sp1.1 from training. Duration: 0.97275 2023-10-04 11:28:20,815 WARNING [train.py:1204] (2/4) Exclude cut with ID _58605_210_12210_1_1534723195939_3904259_300-285878-0_sp1.1 from training. Duration: 0.97275 2023-10-04 11:28:21,510 WARNING [train.py:1204] (2/4) Exclude cut with ID _40901_210_5665_1_1533363789958_3856130_114-340229-0_sp1.1 from training. Duration: 0.936375 2023-10-04 11:28:22,705 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_489-230703-0 from training. Duration: 0.85 2023-10-04 11:28:22,899 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.feed_forward1.hidden_balancer.prob, batch_count=1638253.3333333333, ans=0.125 2023-10-04 11:28:25,512 WARNING [train.py:1204] (2/4) Exclude cut with ID _66873_210_5517_1_1535454136506_4293370_322-81243-0_sp1.1 from training. Duration: 0.936375 2023-10-04 11:28:26,875 WARNING [train.py:1204] (2/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_271-269084-0 from training. Duration: 0.83 2023-10-04 11:28:26,894 WARNING [train.py:1204] (2/4) Exclude cut with ID _13252_210_1925_1_1530671372047_3336909_166-314607-0_sp0.9 from training. Duration: 0.6 2023-10-04 11:28:26,949 WARNING [train.py:1204] (2/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_340-343302-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 11:28:29,766 WARNING [train.py:1204] (2/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_286-112015-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 11:28:32,999 WARNING [train.py:1204] (2/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_328-264776-0 from training. Duration: 0.67 2023-10-04 11:28:33,110 WARNING [train.py:1204] (2/4) Exclude cut with ID _4866_210_2577_1_1527404412284_4364659_256-306830-0_sp0.9 from training. Duration: 0.9666875 2023-10-04 11:28:35,991 WARNING [train.py:1204] (2/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_239-316802-0 from training. Duration: 0.69 2023-10-04 11:28:36,190 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.conv_skip_rate, batch_count=1638320.0, ans=0.0 2023-10-04 11:28:36,237 INFO [scaling.py:1118] (2/4) WithLoss: name=encoder.encoders.4.encoder.layers.1.self_attn_weights, loss-sum=0.000e+00 2023-10-04 11:28:39,384 WARNING [train.py:1204] (2/4) Exclude cut with ID _29857_210_20034_1_1532395886753_3798160_279-185801-0 from training. Duration: 0.91 2023-10-04 11:28:44,050 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.4.encoder.layers.1.self_attn_weights.whiten_keys, num_groups=4, num_channels=128, metric=1.84 vs. limit=6.0 2023-10-04 11:28:44,663 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.643e+02 2.056e+02 2.395e+02 2.955e+02 4.262e+02, threshold=4.789e+02, percent-clipped=0.0 2023-10-04 11:28:44,767 WARNING [train.py:1204] (2/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_656-188361-0 from training. Duration: 0.87 2023-10-04 11:28:44,883 WARNING [train.py:1204] (2/4) Exclude cut with ID _44718_210_5816_1_1534050051869_6469030_100-230287-0_sp1.1 from training. Duration: 0.936375 2023-10-04 11:28:46,166 INFO [train.py:1046] (2/4) Epoch 47, batch 1400, loss[loss=0.1633, simple_loss=0.2513, pruned_loss=0.03763, over 23707.00 frames. ], tot_loss[loss=0.1532, simple_loss=0.2343, pruned_loss=0.03609, over 4717005.18 frames. ], batch size: 85, lr: 2.16e-03, grad_scale: 16.0 2023-10-04 11:28:49,519 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8275_210_4198_1_1528455320_3892571_276-215622-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 11:28:51,377 WARNING [train.py:1204] (2/4) Exclude cut with ID _30304_210_6319_1_1532476827261_7582060_698-309430-0_sp1.1 from training. Duration: 0.963625 2023-10-04 11:28:51,623 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.balancer2.prob, batch_count=1638386.6666666667, ans=0.125 2023-10-04 11:28:56,796 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_365-217119-0 from training. Duration: 0.63 2023-10-04 11:28:58,145 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7264_210_4938_1_1527939911_8132095_800-91995-0 from training. Duration: 0.86 2023-10-04 11:29:08,237 WARNING [train.py:1204] (2/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_49-97158-0_sp1.1 from training. Duration: 0.6909375 2023-10-04 11:29:09,639 WARNING [train.py:1204] (2/4) Exclude cut with ID _61132_210_9969_1_1535021792964_3755419_61-57720-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 11:29:13,034 WARNING [train.py:1204] (2/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_345-306832-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 11:29:13,072 WARNING [train.py:1204] (2/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_164-224052-0_sp0.9 from training. Duration: 0.77775 2023-10-04 11:29:17,119 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_34886_210_10633_1_1533203882_1755661_76-156096-0_sp1.1 from training. Duration: 0.7909375 2023-10-04 11:29:17,202 WARNING [train.py:1204] (2/4) Exclude cut with ID 4133-6541-0027-40495-0_sp1.1 from training. Duration: 0.9681875 2023-10-04 11:29:18,734 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.self_attn_weights.pos_emb_skip_rate, batch_count=1638520.0, ans=0.0 2023-10-04 11:29:25,301 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_514-211165-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 11:29:26,793 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_41211_210_2544_1_1533348859_3702910_97-212695-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 11:29:31,112 WARNING [train.py:1204] (2/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_406-45086-0 from training. Duration: 0.8 2023-10-04 11:29:31,185 WARNING [train.py:1204] (2/4) Exclude cut with ID _65263_210_22356_1_1535763598691_3921037_49-222917-0_sp1.1 from training. Duration: 0.836375 2023-10-04 11:29:32,502 WARNING [train.py:1204] (2/4) Exclude cut with ID _4866_210_2577_1_1527404412284_4364659_352-157930-0_sp1.1 from training. Duration: 0.736375 2023-10-04 11:29:33,739 WARNING [train.py:1204] (2/4) Exclude cut with ID _4366_210_1388_1_1526792053633_3613819_231-211402-0_sp1.1 from training. Duration: 0.8545625 2023-10-04 11:29:35,005 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_98-21576-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 11:29:35,092 WARNING [train.py:1204] (2/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_348-127789-0_sp0.9 from training. Duration: 0.9444375 2023-10-04 11:29:36,365 WARNING [train.py:1204] (2/4) Exclude cut with ID _45871_210_5665_1_1533864614245_3296060_98-329557-0_sp1.1 from training. Duration: 0.97275 2023-10-04 11:29:36,415 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6846_210_6753_1_1527850767_4295158_2-315073-0_sp1.1 from training. Duration: 0.8 2023-10-04 11:29:36,523 WARNING [train.py:1204] (2/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_145-137790-0 from training. Duration: 0.83 2023-10-04 11:29:38,427 WARNING [train.py:1204] (2/4) Exclude cut with ID _14603_210_4220_1_1530615688441_3097319_133-158308-0_sp0.9 from training. Duration: 0.9333125 2023-10-04 11:29:39,962 WARNING [train.py:1204] (2/4) Exclude cut with ID _14898_210_9206_1_1530766812377_4177190_320-11758-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 11:29:40,116 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.conv_module2.balancer1.prob, batch_count=1638586.6666666667, ans=0.125 2023-10-04 11:29:44,688 WARNING [train.py:1204] (2/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_406-45086-0_sp0.9 from training. Duration: 0.888875 2023-10-04 11:29:47,490 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.bypass.skip_rate, batch_count=1638653.3333333333, ans=0.04949747468305833 2023-10-04 11:29:49,125 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.1.ff3_skip_rate, batch_count=1638653.3333333333, ans=0.0 2023-10-04 11:29:52,522 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.bypass_mid.scale_min, batch_count=1638653.3333333333, ans=0.2 2023-10-04 11:29:52,539 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.feed_forward2.hidden_balancer.prob, batch_count=1638653.3333333333, ans=0.125 2023-10-04 11:29:52,565 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.nonlin_attention.balancer.prob, batch_count=1638653.3333333333, ans=0.125 2023-10-04 11:29:53,659 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_5438_210_1794_1_1527336687_2600009_104-288274-0 from training. Duration: 0.58 2023-10-04 11:29:53,734 WARNING [train.py:1204] (2/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_108-53929-0_sp0.9 from training. Duration: 0.6555625 2023-10-04 11:29:53,797 WARNING [train.py:1204] (2/4) Exclude cut with ID _30800_210_22382_1_1532499347904_4515959_444-286390-0_sp1.1 from training. Duration: 0.963625 2023-10-04 11:29:55,227 WARNING [train.py:1204] (2/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_125-144174-0_sp0.9 from training. Duration: 0.4333125 2023-10-04 11:29:55,372 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.1.conv_skip_rate, batch_count=1638653.3333333333, ans=0.0 2023-10-04 11:29:56,634 WARNING [train.py:1204] (2/4) Exclude cut with ID _55209_210_1531_1_1534675828996_4597160_118-345621-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 11:29:58,180 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_45886_210_17024_1_1533709215_471063_7-79165-0_sp1.1 from training. Duration: 0.8909375 2023-10-04 11:30:01,008 INFO [train.py:1046] (2/4) Epoch 47, batch 1450, loss[loss=0.1461, simple_loss=0.2207, pruned_loss=0.03574, over 18806.00 frames. ], tot_loss[loss=0.1531, simple_loss=0.2334, pruned_loss=0.03637, over 4706381.90 frames. ], batch size: 41, lr: 2.16e-03, grad_scale: 8.0 2023-10-04 11:30:01,117 WARNING [train.py:1204] (2/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_175-163894-0_sp0.9 from training. Duration: 0.82225 2023-10-04 11:30:02,483 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_50188_210_4198_1_1534503621_3646041_353-81682-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 11:30:03,727 WARNING [train.py:1204] (2/4) Exclude cut with ID _17875_210_2087_1_1531224012907_1821339_175-80565-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 11:30:03,736 WARNING [train.py:1204] (2/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_159-328157-0_sp1.1 from training. Duration: 0.47275 2023-10-04 11:30:08,367 WARNING [train.py:1204] (2/4) Exclude cut with ID _60057_210_10640_1_1534903065793_3333150_137-267579-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 11:30:09,658 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_92-46009-0_sp1.1 from training. Duration: 0.4909375 2023-10-04 11:30:11,580 WARNING [train.py:1204] (2/4) Exclude cut with ID _53203_210_2087_1_1534246218309_475590_48-68507-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 11:30:11,595 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_28211_210_6388_1_1532861506_4523224_128-328162-0 from training. Duration: 0.9 2023-10-04 11:30:12,972 WARNING [train.py:1204] (2/4) Exclude cut with ID _4697_210_2069_1_1526815801953_3823098_51-1049-0_sp1.1 from training. Duration: 0.6818125 2023-10-04 11:30:14,372 WARNING [train.py:1204] (2/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_383-316000-0 from training. Duration: 0.9 2023-10-04 11:30:14,435 WARNING [train.py:1204] (2/4) Exclude cut with ID _4779_210_2084_1_1527318022944_3914893_185-295153-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 11:30:15,798 WARNING [train.py:1204] (2/4) Exclude cut with ID _3231_210_2106_1_1526205864934_6784220_171-256004-0_sp0.9 from training. Duration: 0.9555625 2023-10-04 11:30:15,799 WARNING [train.py:1204] (2/4) Exclude cut with ID _41314_210_12651_1_1533459476543_3766460_87-272068-0 from training. Duration: 0.83 2023-10-04 11:30:17,198 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_164-152777-0_sp1.1 from training. Duration: 0.92725 2023-10-04 11:30:18,513 WARNING [train.py:1204] (2/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_18-130336-0_sp0.9 from training. Duration: 0.87775 2023-10-04 11:30:18,571 WARNING [train.py:1204] (2/4) Exclude cut with ID _14886_210_4220_1_1530702020355_3516190_175-229073-0_sp1.1 from training. Duration: 0.4090625 2023-10-04 11:30:18,589 WARNING [train.py:1204] (2/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_791-45149-0_sp0.9 from training. Duration: 0.9555625 2023-10-04 11:30:18,766 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.attention_skip_rate, batch_count=1638786.6666666667, ans=0.0 2023-10-04 11:30:20,605 WARNING [train.py:1204] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_468-189058-0_sp1.1 from training. Duration: 0.87275 2023-10-04 11:30:21,705 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8278_210_2550_1_1528637099_4220413_32-142780-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 11:30:24,998 WARNING [train.py:1204] (2/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_146-218763-0_sp0.9 from training. Duration: 0.9555625 2023-10-04 11:30:30,302 WARNING [train.py:1204] (2/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_348-127789-0_sp1.1 from training. Duration: 0.77275 2023-10-04 11:30:30,307 WARNING [train.py:1204] (2/4) Exclude cut with ID _47790_210_16187_1_1533862814066_4207519_288-285445-0_sp1.1 from training. Duration: 0.936375 2023-10-04 11:30:30,640 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.ff2_skip_rate, batch_count=1638853.3333333333, ans=0.0 2023-10-04 11:30:31,767 WARNING [train.py:1204] (2/4) Exclude cut with ID _14603_210_4220_1_1530615688441_3097319_308-181732-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 11:30:31,791 WARNING [train.py:1204] (2/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_895-117428-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 11:30:33,399 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.feed_forward1.hidden_balancer.prob, batch_count=1638853.3333333333, ans=0.125 2023-10-04 11:30:34,434 WARNING [train.py:1204] (2/4) Exclude cut with ID _9235_210_5219_1_1528891154253_8046120_387-159008-0_sp0.9 from training. Duration: 0.9555625 2023-10-04 11:30:34,460 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_37351_210_7925_1_1533012931_3812113_265-85780-0_sp0.9 from training. Duration: 0.97775 2023-10-04 11:30:35,726 WARNING [train.py:1204] (2/4) Exclude cut with ID _40551_210_10635_1_1533776424632_3416179_361-3964-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 11:30:35,772 WARNING [train.py:1204] (2/4) Exclude cut with ID _25882_210_3036_1_1532775665815_6479510_485-122742-0_sp1.1 from training. Duration: 0.97275 2023-10-04 11:30:38,530 WARNING [train.py:1204] (2/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_98-51676-0 from training. Duration: 0.73 2023-10-04 11:30:40,563 WARNING [train.py:1204] (2/4) Exclude cut with ID _30548_210_16040_1_1532608097130_3146969_371-280610-0_sp1.1 from training. Duration: 0.92725 2023-10-04 11:30:45,121 WARNING [train.py:1204] (2/4) Exclude cut with ID IC0671W0081-331924 from training. Duration: 0.896 2023-10-04 11:30:46,454 WARNING [train.py:1204] (2/4) Exclude cut with ID _40284_210_2084_1_1533513679814_3704279_380-160775-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 11:30:47,826 WARNING [train.py:1204] (2/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_57-270037-0_sp1.1 from training. Duration: 0.7 2023-10-04 11:30:49,224 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_853-150237-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 11:30:50,579 WARNING [train.py:1204] (2/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_491-261923-0 from training. Duration: 0.98 2023-10-04 11:30:53,579 WARNING [train.py:1204] (2/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_836-244045-0_sp1.1 from training. Duration: 0.97275 2023-10-04 11:30:55,598 WARNING [train.py:1204] (2/4) Exclude cut with ID _14880_210_8895_1_1530680426655_3795259_289-212817-0 from training. Duration: 0.69 2023-10-04 11:30:55,690 WARNING [train.py:1204] (2/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_303-340446-0 from training. Duration: 0.9 2023-10-04 11:30:55,852 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.conv_skip_rate, batch_count=1638920.0, ans=0.0 2023-10-04 11:30:58,312 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_37152_210_9697_1_1533106573_3844188_418-41736-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 11:30:59,811 WARNING [train.py:1204] (2/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_371-18169-0_sp1.1 from training. Duration: 0.7818125 2023-10-04 11:31:01,116 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_187-219984-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 11:31:02,472 WARNING [train.py:1204] (2/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_63-267585-0 from training. Duration: 0.9 2023-10-04 11:31:03,901 WARNING [train.py:1204] (2/4) Exclude cut with ID _27013_210_1389_1_1532437173240_3252649_222-125573-0 from training. Duration: 0.98 2023-10-04 11:31:05,160 WARNING [train.py:1204] (2/4) Exclude cut with ID _14821_210_8750_1_1530680503991_6829359_189-297451-0 from training. Duration: 0.55 2023-10-04 11:31:06,686 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_557-163391-0_sp1.1 from training. Duration: 0.97275 2023-10-04 11:31:08,094 WARNING [train.py:1204] (2/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_65-88242-0_sp1.1 from training. Duration: 0.5090625 2023-10-04 11:31:14,654 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.630e+02 2.021e+02 2.399e+02 2.893e+02 4.968e+02, threshold=4.797e+02, percent-clipped=1.0 2023-10-04 11:31:14,679 INFO [train.py:1046] (2/4) Epoch 47, batch 1500, loss[loss=0.1599, simple_loss=0.2374, pruned_loss=0.04123, over 23581.00 frames. ], tot_loss[loss=0.153, simple_loss=0.2336, pruned_loss=0.03623, over 4721742.76 frames. ], batch size: 256, lr: 2.16e-03, grad_scale: 8.0 2023-10-04 11:31:18,657 WARNING [train.py:1204] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_218-295097-0 from training. Duration: 0.77 2023-10-04 11:31:18,679 WARNING [train.py:1204] (2/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_686-247600-0_sp1.1 from training. Duration: 0.836375 2023-10-04 11:31:18,688 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_397-147196-0_sp0.9 from training. Duration: 0.888875 2023-10-04 11:31:19,984 WARNING [train.py:1204] (2/4) Exclude cut with ID _53950_210_5816_1_1534417264919_3862951_237-136449-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 11:31:21,369 WARNING [train.py:1204] (2/4) Exclude cut with ID _3120_210_3525_1_1526036143112_4072980_74-178400-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 11:31:22,725 WARNING [train.py:1204] (2/4) Exclude cut with ID _5166_210_2084_1_1527073197160_4087993_309-222545-0_sp0.9 from training. Duration: 0.9333125 2023-10-04 11:31:24,410 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7544_210_6753_1_1528587722_3576686_172-171750-0 from training. Duration: 0.65 2023-10-04 11:31:24,514 WARNING [train.py:1204] (2/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_3-237434-0_sp0.9 from training. Duration: 0.8444375 2023-10-04 11:31:26,363 WARNING [train.py:1204] (2/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_101-293367-0_sp0.9 from training. Duration: 0.7 2023-10-04 11:31:26,370 WARNING [train.py:1204] (2/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_818-213552-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 11:31:26,448 WARNING [train.py:1204] (2/4) Exclude cut with ID _14349_210_8341_1_1530615710818_3442470_115-285975-0_sp1.1 from training. Duration: 0.7818125 2023-10-04 11:31:29,100 WARNING [train.py:1204] (2/4) Exclude cut with ID _30800_210_22382_1_1532499347904_4515959_291-309749-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 11:31:29,184 WARNING [train.py:1204] (2/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_715-3462-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 11:31:33,357 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.1.self_attn_weights.pos_emb_skip_rate, batch_count=1639120.0, ans=0.0 2023-10-04 11:31:34,587 WARNING [train.py:1204] (2/4) Exclude cut with ID _59502_210_12210_1_1535107024698_1835139_13-331827-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 11:31:34,597 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_4528_210_3273_1_1527076654_3276777_342-168068-0 from training. Duration: 0.87 2023-10-04 11:31:34,783 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.self_attn_weights.pos_emb_skip_rate, batch_count=1639120.0, ans=0.0 2023-10-04 11:31:35,890 WARNING [train.py:1204] (2/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_215-141059-0_sp0.9 from training. Duration: 0.688875 2023-10-04 11:31:35,921 WARNING [train.py:1204] (2/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_106-286130-0_sp1.1 from training. Duration: 0.8909375 2023-10-04 11:31:37,289 WARNING [train.py:1204] (2/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_480-77655-0_sp1.1 from training. Duration: 0.97275 2023-10-04 11:31:41,319 WARNING [train.py:1204] (2/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_848-280957-0 from training. Duration: 0.87 2023-10-04 11:31:46,554 WARNING [train.py:1204] (2/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_122-109023-0 from training. Duration: 0.76 2023-10-04 11:31:46,673 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_11305_210_1794_1_1529750428_7178148_645-233480-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 11:31:48,046 WARNING [train.py:1204] (2/4) Exclude cut with ID _7562_210_2084_1_1528887767916_3946229_345-169156-0 from training. Duration: 0.58 2023-10-04 11:31:50,768 WARNING [train.py:1204] (2/4) Exclude cut with ID _7562_210_2084_1_1528887767916_3946229_345-169156-0_sp1.1 from training. Duration: 0.52725 2023-10-04 11:31:50,912 WARNING [train.py:1204] (2/4) Exclude cut with ID _13253_210_1925_1_1530757775819_3577873_253-246386-0_sp1.1 from training. Duration: 0.8181875 2023-10-04 11:31:52,366 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_136-264444-0_sp1.1 from training. Duration: 0.97275 2023-10-04 11:31:52,381 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_2168_210_2290_1_1525606920_1849400_136-222991-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 11:31:53,788 WARNING [train.py:1204] (2/4) Exclude cut with ID _7852_210_5219_1_1528286471316_8139400_242-105336-0 from training. Duration: 0.72 2023-10-04 11:31:53,834 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_5960_210_1794_1_1527403865_3990999_515-2613-0_sp1.1 from training. Duration: 0.8 2023-10-04 11:31:53,849 WARNING [train.py:1204] (2/4) Exclude cut with ID _39688_210_19596_1_1533371626725_4154159_551-167834-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 11:31:53,905 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_43802_210_2290_1_1533516203_8829504_85-195240-0 from training. Duration: 0.87 2023-10-04 11:31:53,955 WARNING [train.py:1204] (2/4) Exclude cut with ID _63171_210_15488_1_1535104792794_3295110_113-113534-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 11:32:01,107 WARNING [train.py:1204] (2/4) Exclude cut with ID _8833_210_5219_1_1528592665210_7553059_319-349181-0_sp1.1 from training. Duration: 0.9 2023-10-04 11:32:01,108 WARNING [train.py:1204] (2/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_225-43381-0 from training. Duration: 0.94 2023-10-04 11:32:05,440 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_5959_210_1794_1_1527400400_3445726_412-44052-0_sp0.9 from training. Duration: 0.8555625 2023-10-04 11:32:05,560 WARNING [train.py:1204] (2/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_356-167816-0_sp1.1 from training. Duration: 0.6545625 2023-10-04 11:32:09,634 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9012W0112-501244 from training. Duration: 0.768 2023-10-04 11:32:09,680 WARNING [train.py:1204] (2/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_177-326952-0_sp1.1 from training. Duration: 0.92725 2023-10-04 11:32:09,683 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9016W0116-502789 from training. Duration: 0.896 2023-10-04 11:32:11,713 WARNING [train.py:1204] (2/4) Exclude cut with ID _56235_210_10640_1_1534557335235_4161099_557-204815-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 11:32:11,805 WARNING [train.py:1204] (2/4) Exclude cut with ID _55857_210_2346_1_1534644007831_6898209_279-229469-0_sp1.1 from training. Duration: 0.936375 2023-10-04 11:32:13,677 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9029W0375-509764 from training. Duration: 0.896 2023-10-04 11:32:15,067 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_2295_210_2087_1_1525500993_2407863_234-83104-0_sp0.9 from training. Duration: 0.688875 2023-10-04 11:32:17,845 WARNING [train.py:1204] (2/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_266-137104-0 from training. Duration: 0.65 2023-10-04 11:32:19,195 WARNING [train.py:1204] (2/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_289-163927-0_sp1.1 from training. Duration: 0.92725 2023-10-04 11:32:20,683 WARNING [train.py:1204] (2/4) Exclude cut with ID _12733_210_8341_1_1530167424556_3646380_86-225314-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 11:32:20,819 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.conv_module2.balancer2.prob, batch_count=1639320.0, ans=0.125 2023-10-04 11:32:21,927 WARNING [train.py:1204] (2/4) Exclude cut with ID _7877_210_6014_1_1528286358099_3750136_618-284064-0_sp1.1 from training. Duration: 0.92725 2023-10-04 11:32:21,973 WARNING [train.py:1204] (2/4) Exclude cut with ID _55844_210_2346_1_1534471212569_7625707_1017-31469-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 11:32:23,314 WARNING [train.py:1204] (2/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_617-241446-0_sp1.1 from training. Duration: 0.92725 2023-10-04 11:32:23,381 WARNING [train.py:1204] (2/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_866-173440-0_sp1.1 from training. Duration: 0.8181875 2023-10-04 11:32:23,640 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.balancer_na.min_abs, batch_count=1639320.0, ans=0.02 2023-10-04 11:32:25,337 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_9460_210_4211_1_1528954587_10049667_480-260269-0 from training. Duration: 0.56 2023-10-04 11:32:25,374 WARNING [train.py:1204] (2/4) Exclude cut with ID _44659_210_2721_1_1534036120061_6614860_626-298884-0 from training. Duration: 0.97 2023-10-04 11:32:25,401 WARNING [train.py:1204] (2/4) Exclude cut with ID _53565_210_10635_1_1534337218335_3820259_287-18120-0_sp1.1 from training. Duration: 0.8818125 2023-10-04 11:32:26,729 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_791-197891-0 from training. Duration: 0.84 2023-10-04 11:32:28,065 INFO [train.py:1046] (2/4) Epoch 47, batch 1550, loss[loss=0.1689, simple_loss=0.2544, pruned_loss=0.04174, over 24030.00 frames. ], tot_loss[loss=0.1536, simple_loss=0.2343, pruned_loss=0.03646, over 4719763.64 frames. ], batch size: 80, lr: 2.16e-03, grad_scale: 8.0 2023-10-04 11:32:28,122 WARNING [train.py:1204] (2/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_527-238990-0 from training. Duration: 0.9 2023-10-04 11:32:30,903 WARNING [train.py:1204] (2/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_449-65969-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 11:32:32,319 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_553-341452-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 11:32:32,361 WARNING [train.py:1204] (2/4) Exclude cut with ID _16909_210_5724_1_1531292079912_4118919_300-47574-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 11:32:32,373 WARNING [train.py:1204] (2/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_70-27732-0_sp1.1 from training. Duration: 0.8545625 2023-10-04 11:32:33,708 WARNING [train.py:1204] (2/4) Exclude cut with ID _44718_210_5816_1_1534050051869_6469030_727-28074-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 11:32:35,053 WARNING [train.py:1204] (2/4) Exclude cut with ID _55857_210_2346_1_1534644007831_6898209_357-123664-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 11:32:35,218 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.feed_forward2.hidden_balancer.prob, batch_count=1639386.6666666667, ans=0.125 2023-10-04 11:32:37,976 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.conv_module2.balancer2.min_positive, batch_count=1639386.6666666667, ans=0.05 2023-10-04 11:32:39,076 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9123W0004-555964 from training. Duration: 0.896 2023-10-04 11:32:39,104 WARNING [train.py:1204] (2/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_445-145208-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 11:32:39,129 WARNING [train.py:1204] (2/4) Exclude cut with ID _3675_210_4557_1_1526639934408_4182089_456-18305-0_sp1.1 from training. Duration: 0.6909375 2023-10-04 11:32:40,525 WARNING [train.py:1204] (2/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_1-99680-0_sp1.1 from training. Duration: 0.6090625 2023-10-04 11:32:42,598 WARNING [train.py:1204] (2/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_146-282400-0_sp0.9 from training. Duration: 0.788875 2023-10-04 11:32:42,611 WARNING [train.py:1204] (2/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_844-96989-0 from training. Duration: 0.71 2023-10-04 11:32:44,497 WARNING [train.py:1204] (2/4) Exclude cut with ID _29853_210_7497_1_1532505600851_6642110_128-146364-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 11:32:44,527 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_224-322394-0 from training. Duration: 0.66 2023-10-04 11:32:47,045 WARNING [train.py:1204] (2/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_191-59784-0 from training. Duration: 0.88 2023-10-04 11:32:47,051 WARNING [train.py:1204] (2/4) Exclude cut with ID _12556_210_8341_1_1530081024100_7327690_725-193477-0 from training. Duration: 0.98 2023-10-04 11:32:47,090 WARNING [train.py:1204] (2/4) Exclude cut with ID _5166_210_2084_1_1527073197160_4087993_523-79838-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 11:32:47,181 WARNING [train.py:1204] (2/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_398-334558-0_sp1.1 from training. Duration: 0.87275 2023-10-04 11:32:51,295 WARNING [train.py:1204] (2/4) Exclude cut with ID _6456_210_1534_1_1527681634212_3181049_171-36697-0_sp1.1 from training. Duration: 0.936375 2023-10-04 11:32:54,094 WARNING [train.py:1204] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_700-183549-0 from training. Duration: 0.68 2023-10-04 11:32:54,096 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8853_210_3273_1_1528633899_818968_51-187685-0 from training. Duration: 0.69 2023-10-04 11:33:02,259 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.4.encoder.layers.1.nonlin_attention.whiten2, num_groups=1, num_channels=384, metric=3.25 vs. limit=15.0 2023-10-04 11:33:03,098 WARNING [train.py:1204] (2/4) Exclude cut with ID _7719_210_3525_1_1528456153163_4296960_333-110399-0_sp1.1 from training. Duration: 0.87275 2023-10-04 11:33:05,976 WARNING [train.py:1204] (2/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_1022-83740-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 11:33:06,001 WARNING [train.py:1204] (2/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_61-327062-0_sp0.9 from training. Duration: 0.9 2023-10-04 11:33:06,006 WARNING [train.py:1204] (2/4) Exclude cut with ID _47251_210_18628_1_1533814063128_4137250_214-88076-0_sp1.1 from training. Duration: 0.8 2023-10-04 11:33:07,245 WARNING [train.py:1204] (2/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_839-173017-0 from training. Duration: 0.41 2023-10-04 11:33:13,180 WARNING [train.py:1204] (2/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_299-34413-0_sp1.1 from training. Duration: 0.5909375 2023-10-04 11:33:15,087 WARNING [train.py:1204] (2/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_664-159457-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 11:33:17,960 WARNING [train.py:1204] (2/4) Exclude cut with ID _66873_210_5517_1_1535454136506_4293370_483-291760-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 11:33:20,675 WARNING [train.py:1204] (2/4) Exclude cut with ID _30800_210_22382_1_1532499347904_4515959_287-255474-0_sp1.1 from training. Duration: 0.92725 2023-10-04 11:33:20,726 WARNING [train.py:1204] (2/4) Exclude cut with ID _48677_210_5663_1_1533902405919_3892989_111-11439-0_sp1.1 from training. Duration: 0.87275 2023-10-04 11:33:20,733 WARNING [train.py:1204] (2/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_399-156791-0 from training. Duration: 0.83 2023-10-04 11:33:20,765 WARNING [train.py:1204] (2/4) Exclude cut with ID _3992_210_1747_1_1526644816031_3731060_36-147855-0_sp0.9 from training. Duration: 0.8666875 2023-10-04 11:33:20,955 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.feed_forward1.out_proj.dropout_p, batch_count=1639586.6666666667, ans=0.1 2023-10-04 11:33:23,427 WARNING [train.py:1204] (2/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_442-320317-0_sp1.1 from training. Duration: 0.7454375 2023-10-04 11:33:23,448 WARNING [train.py:1204] (2/4) Exclude cut with ID _55429_210_10626_1_1534467588527_7174143_953-80868-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 11:33:23,525 WARNING [train.py:1204] (2/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_159-328157-0_sp0.9 from training. Duration: 0.57775 2023-10-04 11:33:23,532 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9309W0197-640324 from training. Duration: 0.896 2023-10-04 11:33:26,258 WARNING [train.py:1204] (2/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_361-30822-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 11:33:31,128 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.self_attn_weights.pos_emb_skip_rate, batch_count=1639653.3333333333, ans=0.0 2023-10-04 11:33:32,220 WARNING [train.py:1204] (2/4) Exclude cut with ID _5250_210_5219_1_1527249561806_8692110_636-251213-0 from training. Duration: 0.94 2023-10-04 11:33:35,622 WARNING [train.py:1204] (2/4) Exclude cut with ID _49728_210_12651_1_1534041034294_6788150_468-333827-0_sp1.1 from training. Duration: 0.963625 2023-10-04 11:33:35,702 WARNING [train.py:1204] (2/4) Exclude cut with ID _25882_210_3036_1_1532775665815_6479510_623-191759-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 11:33:37,072 WARNING [train.py:1204] (2/4) Exclude cut with ID _5189_210_4557_1_1527248319428_3769229_199-53526-0 from training. Duration: 0.42 2023-10-04 11:33:37,319 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.conv_module2.balancer1.max_abs, batch_count=1639653.3333333333, ans=10.0 2023-10-04 11:33:38,595 WARNING [train.py:1204] (2/4) Exclude cut with ID _6274_210_2084_1_1527937583837_7494020_32-205288-0_sp0.9 from training. Duration: 0.8666875 2023-10-04 11:33:38,678 WARNING [train.py:1204] (2/4) Exclude cut with ID _65263_210_22356_1_1535763598691_3921037_401-346646-0_sp1.1 from training. Duration: 0.963625 2023-10-04 11:33:38,682 WARNING [train.py:1204] (2/4) Exclude cut with ID _12555_210_8750_1_1530075594402_3780239_449-267986-0_sp1.1 from training. Duration: 0.7545625 2023-10-04 11:33:39,990 WARNING [train.py:1204] (2/4) Exclude cut with ID _69884_210_9771_1_1535846392818_4166169_430-201020-0_sp1.1 from training. Duration: 0.8818125 2023-10-04 11:33:40,074 WARNING [train.py:1204] (2/4) Exclude cut with ID _8394_210_6702_1_1528590504316_3540189_379-49457-0_sp1.1 from training. Duration: 0.7909375 2023-10-04 11:33:41,231 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.799e+02 2.051e+02 2.234e+02 2.804e+02 4.353e+02, threshold=4.467e+02, percent-clipped=0.0 2023-10-04 11:33:41,257 INFO [train.py:1046] (2/4) Epoch 47, batch 1600, loss[loss=0.164, simple_loss=0.2352, pruned_loss=0.0464, over 23739.00 frames. ], tot_loss[loss=0.1543, simple_loss=0.2351, pruned_loss=0.0367, over 4727993.37 frames. ], batch size: 179, lr: 2.16e-03, grad_scale: 16.0 2023-10-04 11:33:44,966 WARNING [train.py:1204] (2/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_200-105218-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 11:33:45,017 WARNING [train.py:1204] (2/4) Exclude cut with ID _3867_210_1883_1_1526641200575_4475830_319-349850-0 from training. Duration: 0.67 2023-10-04 11:33:45,092 WARNING [train.py:1204] (2/4) Exclude cut with ID _28317_210_12742_1_1532480302381_8510325_916-195848-0 from training. Duration: 0.92 2023-10-04 11:33:47,940 WARNING [train.py:1204] (2/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_436-186311-0 from training. Duration: 0.65 2023-10-04 11:33:50,687 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_3910_210_1794_1_1526777667_3747260_302-180617-0_sp1.1 from training. Duration: 0.97275 2023-10-04 11:33:52,053 WARNING [train.py:1204] (2/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_102-92751-0 from training. Duration: 0.99 2023-10-04 11:33:53,408 WARNING [train.py:1204] (2/4) Exclude cut with ID _51899_210_20034_1_1534129254466_3669220_265-215428-0_sp1.1 from training. Duration: 0.8909375 2023-10-04 11:33:56,242 WARNING [train.py:1204] (2/4) Exclude cut with ID _58583_210_5236_1_1534726835266_3574249_438-198993-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 11:33:59,697 WARNING [train.py:1204] (2/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_166-166990-0_sp1.1 from training. Duration: 0.87275 2023-10-04 11:34:04,257 WARNING [train.py:1204] (2/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_907-151398-0 from training. Duration: 0.37 2023-10-04 11:34:06,981 WARNING [train.py:1204] (2/4) Exclude cut with ID _38670_210_15379_1_1533448810659_4391059_452-256669-0_sp1.1 from training. Duration: 0.936375 2023-10-04 11:34:08,280 WARNING [train.py:1204] (2/4) Exclude cut with ID _48677_210_5663_1_1533902405919_3892989_408-40740-0 from training. Duration: 0.83 2023-10-04 11:34:08,314 WARNING [train.py:1204] (2/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_903-158756-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 11:34:08,365 WARNING [train.py:1204] (2/4) Exclude cut with ID _13252_210_1925_1_1530671372047_3336909_397-216497-0 from training. Duration: 0.96 2023-10-04 11:34:13,100 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.4.encoder.layers.2.feed_forward1.out_whiten, num_groups=1, num_channels=384, metric=9.71 vs. limit=15.0 2023-10-04 11:34:13,956 WARNING [train.py:1204] (2/4) Exclude cut with ID _55429_210_10626_1_1534467588527_7174143_97-335084-0 from training. Duration: 0.97 2023-10-04 11:34:14,150 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.bypass.scale_min, batch_count=1639853.3333333333, ans=0.2 2023-10-04 11:34:16,070 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.3.encoder.layers.2.self_attn2.whiten, num_groups=1, num_channels=512, metric=9.89 vs. limit=22.5 2023-10-04 11:34:21,301 WARNING [train.py:1204] (2/4) Exclude cut with ID _45329_210_7005_1_1534157951211_6774339_453-267730-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 11:34:22,644 WARNING [train.py:1204] (2/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_153-97441-0 from training. Duration: 0.56 2023-10-04 11:34:22,703 WARNING [train.py:1204] (2/4) Exclude cut with ID _40901_210_5665_1_1533363789958_3856130_312-329122-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 11:34:22,739 WARNING [train.py:1204] (2/4) Exclude cut with ID _30800_210_22382_1_1532499347904_4515959_97-126471-0_sp1.1 from training. Duration: 0.97275 2023-10-04 11:34:22,739 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_11305_210_1794_1_1529750428_7178148_257-312870-0_sp1.1 from training. Duration: 0.8 2023-10-04 11:34:25,516 WARNING [train.py:1204] (2/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_454-314901-0_sp0.9 from training. Duration: 0.511125 2023-10-04 11:34:29,589 WARNING [train.py:1204] (2/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_282-136641-0_sp1.1 from training. Duration: 0.3818125 2023-10-04 11:34:29,725 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_46013_210_7234_1_1533776362_3744032_493-160724-0_sp1.1 from training. Duration: 0.963625 2023-10-04 11:34:31,424 WARNING [train.py:1204] (2/4) Exclude cut with ID _67831_210_12392_1_1535612464202_3092612_259-33442-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 11:34:32,820 WARNING [train.py:1204] (2/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_289-62384-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 11:34:32,895 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6545_210_2040_1_1527774670_4245258_379-14182-0_sp0.9 from training. Duration: 0.888875 2023-10-04 11:34:36,012 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_548-294316-0_sp1.1 from training. Duration: 0.736375 2023-10-04 11:34:38,770 WARNING [train.py:1204] (2/4) Exclude cut with ID _6275_210_2084_1_1528110062213_7669049_857-298942-0_sp1.1 from training. Duration: 0.77275 2023-10-04 11:34:38,884 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6673_210_3273_1_1527767790_3173240_327-306185-0_sp1.1 from training. Duration: 0.7545625 2023-10-04 11:34:44,491 WARNING [train.py:1204] (2/4) Exclude cut with ID _61008_210_10633_1_1534852769198_6974370_814-107647-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 11:34:45,841 WARNING [train.py:1204] (2/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_116-332008-0_sp1.1 from training. Duration: 0.8909375 2023-10-04 11:34:47,723 WARNING [train.py:1204] (2/4) Exclude cut with ID _32704_210_8954_1_1533017488840_6747370_266-329755-0 from training. Duration: 0.89 2023-10-04 11:34:47,726 WARNING [train.py:1204] (2/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_271-269084-0_sp0.9 from training. Duration: 0.92225 2023-10-04 11:34:49,113 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_3171_210_2087_1_1526109084_2330007_66-260583-0 from training. Duration: 0.72 2023-10-04 11:34:52,194 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.self_attn_weights.pos_emb_skip_rate, batch_count=1639986.6666666667, ans=0.0 2023-10-04 11:34:54,419 INFO [train.py:1046] (2/4) Epoch 47, batch 1650, loss[loss=0.1553, simple_loss=0.2246, pruned_loss=0.04302, over 23808.00 frames. ], tot_loss[loss=0.1546, simple_loss=0.2355, pruned_loss=0.03687, over 4731228.51 frames. ], batch size: 179, lr: 2.16e-03, grad_scale: 16.0 2023-10-04 11:34:54,524 WARNING [train.py:1204] (2/4) Exclude cut with ID _5250_210_5219_1_1527249561806_8692110_1326-276386-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 11:34:56,009 WARNING [train.py:1204] (2/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_372-30302-0_sp1.1 from training. Duration: 0.7818125 2023-10-04 11:34:57,354 WARNING [train.py:1204] (2/4) Exclude cut with ID _25882_210_3036_1_1532775665815_6479510_631-257029-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 11:34:57,363 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_3691_210_3528_1_1526468169_4062053_355-334432-0 from training. Duration: 0.87 2023-10-04 11:34:57,378 WARNING [train.py:1204] (2/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_839-173017-0_sp1.1 from training. Duration: 0.37275 2023-10-04 11:34:57,386 WARNING [train.py:1204] (2/4) Exclude cut with ID _14998_210_5219_1_1530705582080_7474970_441-91466-0 from training. Duration: 0.97 2023-10-04 11:34:57,418 WARNING [train.py:1204] (2/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_108-53929-0 from training. Duration: 0.59 2023-10-04 11:35:01,119 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.0.layers.1.feed_forward2.out_whiten, num_groups=1, num_channels=192, metric=4.14 vs. limit=15.0 2023-10-04 11:35:02,912 WARNING [train.py:1204] (2/4) Exclude cut with ID _9389_210_1664_1_1529217337301_5289269_260-1942-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 11:35:02,981 WARNING [train.py:1204] (2/4) Exclude cut with ID _56423_210_5854_1_1534896061049_3165969_198-61547-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 11:35:03,798 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.balancer1.prob, batch_count=1640053.3333333333, ans=0.125 2023-10-04 11:35:04,850 WARNING [train.py:1204] (2/4) Exclude cut with ID _44778_210_2054_1_1533798127203_7443090_412-142122-0_sp1.1 from training. Duration: 0.92725 2023-10-04 11:35:04,881 WARNING [train.py:1204] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_88-341718-0_sp1.1 from training. Duration: 0.6 2023-10-04 11:35:06,487 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.conv_skip_rate, batch_count=1640053.3333333333, ans=0.0 2023-10-04 11:35:08,090 WARNING [train.py:1204] (2/4) Exclude cut with ID _61079_210_4525_1_1534921008198_4011419_334-298697-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 11:35:08,369 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.balancer1.prob, batch_count=1640120.0, ans=0.125 2023-10-04 11:35:09,704 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.feed_forward1.out_proj.dropout_p, batch_count=1640120.0, ans=0.1 2023-10-04 11:35:10,781 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_5465_210_4211_1_1527227253_55357_8-2499-0_sp1.1 from training. Duration: 0.996375 2023-10-04 11:35:12,316 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_447-201565-0_sp1.1 from training. Duration: 0.97275 2023-10-04 11:35:12,328 WARNING [train.py:1204] (2/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_253-161789-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 11:35:12,331 WARNING [train.py:1204] (2/4) Exclude cut with ID _44304_210_15374_1_1533639352765_4613129_302-22084-0_sp0.9 from training. Duration: 0.9555625 2023-10-04 11:35:12,346 WARNING [train.py:1204] (2/4) Exclude cut with ID _26664_210_7770_1_1532258943280_3418270_187-80815-0_sp1.1 from training. Duration: 0.8454375 2023-10-04 11:35:13,739 WARNING [train.py:1204] (2/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_68-212801-0 from training. Duration: 0.73 2023-10-04 11:35:13,789 WARNING [train.py:1204] (2/4) Exclude cut with ID _29345_210_4381_1_1532689168263_3860169_149-257268-0 from training. Duration: 0.81 2023-10-04 11:35:15,330 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.conv_module1.balancer2.prob, batch_count=1640120.0, ans=0.125 2023-10-04 11:35:16,741 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.feed_forward3.hidden_balancer.prob, batch_count=1640120.0, ans=0.125 2023-10-04 11:35:19,901 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_213-103282-0_sp1.1 from training. Duration: 0.7090625 2023-10-04 11:35:20,027 WARNING [train.py:1204] (2/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_156-302675-0_sp1.1 from training. Duration: 0.5 2023-10-04 11:35:27,677 WARNING [train.py:1204] (2/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_125-322172-0 from training. Duration: 0.93 2023-10-04 11:35:28,203 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.4.encoder.layers.2.nonlin_attention.whiten1, num_groups=1, num_channels=288, metric=4.34 vs. limit=10.0 2023-10-04 11:35:29,002 WARNING [train.py:1204] (2/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_493-60474-0_sp1.1 from training. Duration: 0.963625 2023-10-04 11:35:32,251 WARNING [train.py:1204] (2/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_156-302675-0 from training. Duration: 0.55 2023-10-04 11:35:33,531 WARNING [train.py:1204] (2/4) Exclude cut with ID _41314_210_12651_1_1533459476543_3766460_123-267862-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 11:35:35,771 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.2.encoder.layers.0.feed_forward1.out_whiten, num_groups=1, num_channels=384, metric=4.71 vs. limit=15.0 2023-10-04 11:35:36,340 WARNING [train.py:1204] (2/4) Exclude cut with ID _28427_210_4220_1_1532581240967_3061069_201-143747-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 11:35:36,371 WARNING [train.py:1204] (2/4) Exclude cut with ID _8772_210_1850_1_1528635595972_3664011_96-12756-0_sp1.1 from training. Duration: 0.8909375 2023-10-04 11:35:37,693 WARNING [train.py:1204] (2/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_1347-145675-0_sp1.1 from training. Duration: 0.936375 2023-10-04 11:35:39,508 WARNING [train.py:1204] (2/4) Exclude cut with ID _9235_210_5219_1_1528891154253_8046120_387-159008-0_sp1.1 from training. Duration: 0.7818125 2023-10-04 11:35:39,533 WARNING [train.py:1204] (2/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_400-201896-0_sp1.1 from training. Duration: 0.963625 2023-10-04 11:35:43,396 WARNING [train.py:1204] (2/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_326-174612-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 11:35:43,540 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.1.feed_forward3.hidden_balancer.prob, batch_count=1640253.3333333333, ans=0.125 2023-10-04 11:35:44,746 WARNING [train.py:1204] (2/4) Exclude cut with ID _55844_210_2346_1_1534471212569_7625707_390-116233-0_sp1.1 from training. Duration: 0.963625 2023-10-04 11:35:44,773 WARNING [train.py:1204] (2/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_419-317062-0_sp1.1 from training. Duration: 0.82725 2023-10-04 11:35:44,819 WARNING [train.py:1204] (2/4) Exclude cut with ID _29877_210_6228_1_1532397791106_5276149_266-62291-0_sp0.9 from training. Duration: 0.9666875 2023-10-04 11:35:44,892 WARNING [train.py:1204] (2/4) Exclude cut with ID _7857_210_1534_1_1528286515745_6831470_431-338528-0_sp1.1 from training. Duration: 0.92725 2023-10-04 11:35:46,246 WARNING [train.py:1204] (2/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_87-131427-0_sp0.9 from training. Duration: 0.8666875 2023-10-04 11:35:49,173 WARNING [train.py:1204] (2/4) Exclude cut with ID _24055_210_12651_1_1532310907143_976979_92-59356-0_sp1.1 from training. Duration: 0.82725 2023-10-04 11:35:49,437 INFO [scaling.py:1118] (2/4) WithLoss: name=encoder.encoders.2.encoder.layers.2.self_attn_weights, loss-sum=0.000e+00 2023-10-04 11:35:50,959 WARNING [train.py:1204] (2/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_96-290039-0 from training. Duration: 0.56 2023-10-04 11:35:52,355 WARNING [train.py:1204] (2/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_48-182155-0_sp0.9 from training. Duration: 0.9666875 2023-10-04 11:35:52,400 WARNING [train.py:1204] (2/4) Exclude cut with ID _4626_210_3532_1_1526817652717_3916859_446-206656-0 from training. Duration: 0.76 2023-10-04 11:35:53,729 WARNING [train.py:1204] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_168-32906-0 from training. Duration: 0.69 2023-10-04 11:35:55,031 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_3780_210_2087_1_1526467760_2130483_170-259044-0 from training. Duration: 0.7 2023-10-04 11:35:55,049 WARNING [train.py:1204] (2/4) Exclude cut with ID _65134_210_14763_1_1535335393985_3505368_153-216718-0_sp1.1 from training. Duration: 0.92725 2023-10-04 11:35:55,112 WARNING [train.py:1204] (2/4) Exclude cut with ID _64639_210_5816_1_1535367619437_3528000_461-329156-0_sp1.1 from training. Duration: 0.87275 2023-10-04 11:35:56,443 WARNING [train.py:1204] (2/4) Exclude cut with ID _21989_210_5854_1_1531899214931_3271360_126-347531-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 11:35:56,492 WARNING [train.py:1204] (2/4) Exclude cut with ID _7614_210_4915_1_1529326764092_4537359_96-162197-0_sp1.1 from training. Duration: 0.963625 2023-10-04 11:35:56,493 WARNING [train.py:1204] (2/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_1083-147536-0 from training. Duration: 0.97 2023-10-04 11:35:59,272 WARNING [train.py:1204] (2/4) Exclude cut with ID _64029_210_4381_1_1535178502772_3756459_275-16710-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 11:36:01,947 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7264_210_4938_1_1527939911_8132095_800-91995-0_sp0.9 from training. Duration: 0.9555625 2023-10-04 11:36:01,988 WARNING [train.py:1204] (2/4) Exclude cut with ID _3844_210_1390_1_1526727803582_2871209_50-193879-0_sp1.1 from training. Duration: 0.936375 2023-10-04 11:36:04,014 WARNING [train.py:1204] (2/4) Exclude cut with ID _46952_210_5986_1_1534230010357_3107379_232-344503-0 from training. Duration: 0.96 2023-10-04 11:36:07,884 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.511e+02 2.048e+02 2.297e+02 2.769e+02 5.278e+02, threshold=4.594e+02, percent-clipped=2.0 2023-10-04 11:36:07,909 INFO [train.py:1046] (2/4) Epoch 47, batch 1700, loss[loss=0.1449, simple_loss=0.2206, pruned_loss=0.03463, over 23459.00 frames. ], tot_loss[loss=0.1542, simple_loss=0.2352, pruned_loss=0.03658, over 4732037.24 frames. ], batch size: 134, lr: 2.16e-03, grad_scale: 16.0 2023-10-04 11:36:08,062 WARNING [train.py:1204] (2/4) Exclude cut with ID _25808_210_13292_1_1532343514562_7366109_206-14076-0_sp1.1 from training. Duration: 0.936375 2023-10-04 11:36:08,063 WARNING [train.py:1204] (2/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_55-31904-0_sp0.9 from training. Duration: 0.97775 2023-10-04 11:36:09,825 WARNING [train.py:1204] (2/4) Exclude cut with ID _14632_210_9988_1_1530691279392_3923200_161-129530-0 from training. Duration: 0.96 2023-10-04 11:36:10,087 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=1640386.6666666667, ans=0.1 2023-10-04 11:36:11,225 WARNING [train.py:1204] (2/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_770-200430-0_sp1.1 from training. Duration: 0.8090625 2023-10-04 11:36:11,246 WARNING [train.py:1204] (2/4) Exclude cut with ID _6270_210_2084_1_1527850911573_3856420_26-277343-0_sp1.1 from training. Duration: 0.7454375 2023-10-04 11:36:11,247 WARNING [train.py:1204] (2/4) Exclude cut with ID _69884_210_9771_1_1535846392818_4166169_88-23776-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 11:36:12,735 WARNING [train.py:1204] (2/4) Exclude cut with ID _60860_210_8254_1_1534896015875_3557169_245-256666-0_sp1.1 from training. Duration: 0.8818125 2023-10-04 11:36:12,772 WARNING [train.py:1204] (2/4) Exclude cut with ID _5250_210_5219_1_1527249561806_8692110_636-251213-0_sp1.1 from training. Duration: 0.8545625 2023-10-04 11:36:12,784 WARNING [train.py:1204] (2/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_540-131164-0 from training. Duration: 0.92 2023-10-04 11:36:15,540 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_5931_210_2040_1_1527428481_4547999_321-193614-0_sp0.9 from training. Duration: 0.7444375 2023-10-04 11:36:24,851 WARNING [train.py:1204] (2/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_373-225403-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 11:36:25,031 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.conv_module1.balancer1.prob, batch_count=1640453.3333333333, ans=0.125 2023-10-04 11:36:25,094 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.conv_skip_rate, batch_count=1640453.3333333333, ans=0.0 2023-10-04 11:36:26,322 WARNING [train.py:1204] (2/4) Exclude cut with ID _20622_210_3852_1_1531622427080_1093839_57-326027-0_sp1.1 from training. Duration: 0.97275 2023-10-04 11:36:31,779 WARNING [train.py:1204] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_605-257205-0_sp1.1 from training. Duration: 0.536375 2023-10-04 11:36:31,808 WARNING [train.py:1204] (2/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_442-320317-0_sp0.9 from training. Duration: 0.911125 2023-10-04 11:36:33,796 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_35361_210_4238_1_1533262810_7793476_675-87012-0_sp1.1 from training. Duration: 0.8090625 2023-10-04 11:36:33,823 WARNING [train.py:1204] (2/4) Exclude cut with ID _4184_210_2550_1_1526730192013_2219380_306-44736-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 11:36:36,355 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_124-113562-0 from training. Duration: 0.94 2023-10-04 11:36:39,140 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_2821_210_2715_1_1526108385_3763560_73-296673-0_sp0.9 from training. Duration: 0.92225 2023-10-04 11:36:39,156 WARNING [train.py:1204] (2/4) Exclude cut with ID _10301_210_5886_1_1529222275332_3910620_216-46959-0_sp1.1 from training. Duration: 0.92725 2023-10-04 11:36:41,001 WARNING [train.py:1204] (2/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_91-277684-0_sp0.9 from training. Duration: 0.82225 2023-10-04 11:36:42,364 WARNING [train.py:1204] (2/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_351-294650-0_sp0.9 from training. Duration: 0.7 2023-10-04 11:36:43,801 WARNING [train.py:1204] (2/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_209-205365-0 from training. Duration: 0.79 2023-10-04 11:36:45,198 WARNING [train.py:1204] (2/4) Exclude cut with ID _14603_210_4220_1_1530615688441_3097319_97-325053-0 from training. Duration: 0.92 2023-10-04 11:36:46,633 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_49348_210_7927_1_1533954500_3691300_109-276430-0_sp1.1 from training. Duration: 0.92725 2023-10-04 11:36:46,749 WARNING [train.py:1204] (2/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_912-47473-0 from training. Duration: 0.68 2023-10-04 11:36:48,725 WARNING [train.py:1204] (2/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_96-320736-0_sp0.9 from training. Duration: 0.9555625 2023-10-04 11:36:57,379 WARNING [train.py:1204] (2/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_1252-235481-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 11:36:57,483 WARNING [train.py:1204] (2/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_813-261554-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 11:36:58,829 WARNING [train.py:1204] (2/4) Exclude cut with ID _21632_210_2253_1_1531994001765_3744140_110-274654-0_sp0.9 from training. Duration: 0.911125 2023-10-04 11:37:00,304 WARNING [train.py:1204] (2/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_381-317791-0_sp1.1 from training. Duration: 0.663625 2023-10-04 11:37:00,327 WARNING [train.py:1204] (2/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_51-117576-0_sp1.1 from training. Duration: 0.363625 2023-10-04 11:37:00,351 WARNING [train.py:1204] (2/4) Exclude cut with ID _42321_210_7770_1_1533468631352_4366895_104-215915-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 11:37:01,828 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_40896_210_15710_1_1533433956_7100802_216-22824-0_sp1.1 from training. Duration: 0.92725 2023-10-04 11:37:01,829 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_14831_210_7927_1_1530700200_5583610_181-27467-0 from training. Duration: 0.91 2023-10-04 11:37:03,075 WARNING [train.py:1204] (2/4) Exclude cut with ID _30304_210_6319_1_1532476827261_7582060_1024-265645-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 11:37:03,078 WARNING [train.py:1204] (2/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_412-233904-0_sp1.1 from training. Duration: 0.936375 2023-10-04 11:37:03,093 WARNING [train.py:1204] (2/4) Exclude cut with ID _30191_210_1664_1_1533189915047_7196670_911-223461-0_sp1.1 from training. Duration: 0.92725 2023-10-04 11:37:03,110 WARNING [train.py:1204] (2/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_558-148266-0_sp1.1 from training. Duration: 0.963625 2023-10-04 11:37:04,561 WARNING [train.py:1204] (2/4) Exclude cut with ID _58480_210_12392_1_1534811121111_3646160_212-164495-0_sp1.1 from training. Duration: 0.936375 2023-10-04 11:37:04,563 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_454-276429-0_sp0.9 from training. Duration: 0.9666875 2023-10-04 11:37:06,453 WARNING [train.py:1204] (2/4) Exclude cut with ID _24736_210_3279_1_1532332894218_2838700_73-197086-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 11:37:06,620 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.bypass_mid.scale_min, batch_count=1640653.3333333333, ans=0.2 2023-10-04 11:37:07,920 WARNING [train.py:1204] (2/4) Exclude cut with ID _5294_210_1747_1_1527249643928_3696179_529-242298-0_sp1.1 from training. Duration: 0.863625 2023-10-04 11:37:07,958 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_53555_210_5632_1_1534595220_7232297_345-171438-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 11:37:11,187 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_28211_210_6388_1_1532861506_4523224_22-25766-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 11:37:11,417 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.self_attn_weights.pos_emb_skip_rate, batch_count=1640653.3333333333, ans=0.0 2023-10-04 11:37:12,520 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7002_210_1794_1_1527939683_5157286_163-87552-0 from training. Duration: 0.86 2023-10-04 11:37:15,381 WARNING [train.py:1204] (2/4) Exclude cut with ID _56927_210_9969_1_1534848970840_3695229_409-225129-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 11:37:16,772 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_26192_210_7927_1_1532137269_1040115_21-219331-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 11:37:18,949 WARNING [train.py:1204] (2/4) Exclude cut with ID _15012_210_1968_1_1530856820429_5095350_662-210469-0 from training. Duration: 0.57 2023-10-04 11:37:22,915 INFO [train.py:1046] (2/4) Epoch 47, batch 1750, loss[loss=0.1694, simple_loss=0.257, pruned_loss=0.04086, over 24352.00 frames. ], tot_loss[loss=0.1534, simple_loss=0.234, pruned_loss=0.03641, over 4722355.14 frames. ], batch size: 77, lr: 2.16e-03, grad_scale: 8.0 2023-10-04 11:37:24,420 WARNING [train.py:1204] (2/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_582-191016-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 11:37:25,960 WARNING [train.py:1204] (2/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_270-210032-0_sp1.1 from training. Duration: 0.963625 2023-10-04 11:37:27,302 WARNING [train.py:1204] (2/4) Exclude cut with ID _6789_210_1534_1_1527857937218_3118950_262-48853-0_sp0.9 from training. Duration: 0.62225 2023-10-04 11:37:28,763 WARNING [train.py:1204] (2/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_847-41334-0 from training. Duration: 0.44 2023-10-04 11:37:28,787 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_40896_210_15710_1_1533433956_7100802_573-46532-0_sp1.1 from training. Duration: 0.92725 2023-10-04 11:37:31,481 WARNING [train.py:1204] (2/4) Exclude cut with ID _21881_210_3525_1_1531825242757_3798380_247-102591-0_sp1.1 from training. Duration: 0.8909375 2023-10-04 11:37:31,517 WARNING [train.py:1204] (2/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_200-21395-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 11:37:36,181 WARNING [train.py:1204] (2/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_711-15688-0 from training. Duration: 0.43 2023-10-04 11:37:37,658 WARNING [train.py:1204] (2/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_53-75025-0_sp1.1 from training. Duration: 0.963625 2023-10-04 11:37:40,377 WARNING [train.py:1204] (2/4) Exclude cut with ID _6194_210_1534_1_1527511826160_3941194_321-148641-0 from training. Duration: 0.64 2023-10-04 11:37:40,400 WARNING [train.py:1204] (2/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_337-294431-0_sp1.1 from training. Duration: 0.936375 2023-10-04 11:37:42,125 WARNING [train.py:1204] (2/4) Exclude cut with ID _21632_210_2253_1_1531994001765_3744140_110-274654-0_sp1.1 from training. Duration: 0.7454375 2023-10-04 11:37:44,922 WARNING [train.py:1204] (2/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_135-174080-0_sp1.1 from training. Duration: 0.4818125 2023-10-04 11:37:46,283 WARNING [train.py:1204] (2/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_21-174514-0 from training. Duration: 0.43 2023-10-04 11:37:48,291 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_38965_210_4852_1_1533174906_3484735_215-294589-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 11:37:49,627 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_548-294316-0 from training. Duration: 0.81 2023-10-04 11:37:57,799 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_103-84010-0_sp1.1 from training. Duration: 0.62725 2023-10-04 11:37:59,322 WARNING [train.py:1204] (2/4) Exclude cut with ID _18199_210_6109_1_1531475789864_4288790_295-198247-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 11:38:00,562 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_13757_210_3528_1_1530440682_4107239_103-313224-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 11:38:03,322 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_34886_210_10633_1_1533203882_1755661_67-294673-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 11:38:03,340 WARNING [train.py:1204] (2/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_350-101734-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 11:38:04,792 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_37357_210_17024_1_1533089504_2316121_204-153986-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 11:38:06,285 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.1.encoder.layers.1.self_attn1.whiten, num_groups=1, num_channels=256, metric=12.35 vs. limit=22.5 2023-10-04 11:38:06,824 WARNING [train.py:1204] (2/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_673-115569-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 11:38:08,294 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_489-230703-0_sp0.9 from training. Duration: 0.9444375 2023-10-04 11:38:09,549 WARNING [train.py:1204] (2/4) Exclude cut with ID _5250_210_5219_1_1527249561806_8692110_575-102496-0_sp1.1 from training. Duration: 0.936375 2023-10-04 11:38:09,637 WARNING [train.py:1204] (2/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_669-93163-0 from training. Duration: 0.98 2023-10-04 11:38:10,143 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.4.encoder.layers.2.conv_module1.whiten, num_groups=1, num_channels=384, metric=3.26 vs. limit=15.0 2023-10-04 11:38:11,046 WARNING [train.py:1204] (2/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_249-281349-0_sp0.9 from training. Duration: 0.9444375 2023-10-04 11:38:14,277 WARNING [train.py:1204] (2/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_106-30782-0 from training. Duration: 0.8 2023-10-04 11:38:15,569 WARNING [train.py:1204] (2/4) Exclude cut with ID _8772_210_1850_1_1528635595972_3664011_398-320049-0_sp1.1 from training. Duration: 0.8 2023-10-04 11:38:17,669 WARNING [train.py:1204] (2/4) Exclude cut with ID _44778_210_2054_1_1533798127203_7443090_516-151727-0_sp1.1 from training. Duration: 0.963625 2023-10-04 11:38:17,726 WARNING [train.py:1204] (2/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_325-62802-0_sp1.1 from training. Duration: 0.7909375 2023-10-04 11:38:20,874 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.ff2_skip_rate, batch_count=1640986.6666666667, ans=0.0 2023-10-04 11:38:22,002 WARNING [train.py:1204] (2/4) Exclude cut with ID _14368_210_4220_1_1530532714870_3595529_93-58002-0_sp0.9 from training. Duration: 0.6333125 2023-10-04 11:38:22,032 WARNING [train.py:1204] (2/4) Exclude cut with ID _4681_210_3525_1_1527251040321_1235713_123-24428-0_sp1.1 from training. Duration: 0.436375 2023-10-04 11:38:22,088 WARNING [train.py:1204] (2/4) Exclude cut with ID _5250_210_5219_1_1527249561806_8692110_1185-56473-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 11:38:23,474 WARNING [train.py:1204] (2/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_96-244299-0_sp1.1 from training. Duration: 0.8 2023-10-04 11:38:26,284 WARNING [train.py:1204] (2/4) Exclude cut with ID _59500_210_8254_1_1534809610680_3736009_104-72815-0_sp1.1 from training. Duration: 0.963625 2023-10-04 11:38:29,045 WARNING [train.py:1204] (2/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_468-283113-0_sp1.1 from training. Duration: 0.87275 2023-10-04 11:38:31,717 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6239_210_4211_1_1527765584_4306698_528-102186-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 11:38:31,786 WARNING [train.py:1204] (2/4) Exclude cut with ID _45297_210_2721_1_1533715351381_3264949_351-147136-0 from training. Duration: 0.87 2023-10-04 11:38:31,791 WARNING [train.py:1204] (2/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_520-51744-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 11:38:33,016 WARNING [train.py:1204] (2/4) Exclude cut with ID _5166_210_2084_1_1527073197160_4087993_310-114452-0_sp1.1 from training. Duration: 0.536375 2023-10-04 11:38:33,019 WARNING [train.py:1204] (2/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_450-165710-0_sp1.1 from training. Duration: 0.92725 2023-10-04 11:38:33,029 WARNING [train.py:1204] (2/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_280-120942-0_sp1.1 from training. Duration: 0.7 2023-10-04 11:38:34,295 WARNING [train.py:1204] (2/4) Exclude cut with ID _65585_210_12210_1_1535450451584_3764098_112-294301-0_sp0.9 from training. Duration: 0.97775 2023-10-04 11:38:36,283 INFO [train.py:1046] (2/4) Epoch 47, batch 1800, loss[loss=0.1321, simple_loss=0.2124, pruned_loss=0.02597, over 24444.00 frames. ], tot_loss[loss=0.153, simple_loss=0.2339, pruned_loss=0.03612, over 4735454.60 frames. ], batch size: 58, lr: 2.16e-03, grad_scale: 8.0 2023-10-04 11:38:36,324 WARNING [train.py:1204] (2/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_98-51676-0_sp0.9 from training. Duration: 0.811125 2023-10-04 11:38:37,619 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.539e+02 2.081e+02 2.379e+02 2.770e+02 6.213e+02, threshold=4.757e+02, percent-clipped=1.0 2023-10-04 11:38:39,131 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_33348_210_5632_1_1533086912_3324030_85-28583-0_sp1.1 from training. Duration: 0.7454375 2023-10-04 11:38:39,217 WARNING [train.py:1204] (2/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_336-208232-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 11:38:39,404 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.feed_forward1.out_proj.dropout_p, batch_count=1641053.3333333333, ans=0.1 2023-10-04 11:38:42,046 WARNING [train.py:1204] (2/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_339-104791-0_sp0.9 from training. Duration: 0.5444375 2023-10-04 11:38:44,015 WARNING [train.py:1204] (2/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_559-29840-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 11:38:44,290 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.conv_module1.balancer1.prob, batch_count=1641053.3333333333, ans=0.125 2023-10-04 11:38:48,465 WARNING [train.py:1204] (2/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_117-290916-0_sp0.9 from training. Duration: 0.5666875 2023-10-04 11:38:50,395 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_43802_210_2290_1_1533516203_8829504_185-112289-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 11:38:53,187 WARNING [train.py:1204] (2/4) Exclude cut with ID _59256_210_5986_1_1535076085997_3589120_155-298797-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 11:38:54,832 WARNING [train.py:1204] (2/4) Exclude cut with ID _44659_210_2721_1_1534036120061_6614860_246-206531-0_sp1.1 from training. Duration: 0.92725 2023-10-04 11:38:56,108 WARNING [train.py:1204] (2/4) Exclude cut with ID _23818_210_6228_1_1531917063884_3676920_469-151944-0_sp1.1 from training. Duration: 0.92725 2023-10-04 11:38:57,415 WARNING [train.py:1204] (2/4) Exclude cut with ID _13252_210_1925_1_1530671372047_3336909_88-276668-0_sp1.1 from training. Duration: 0.9 2023-10-04 11:38:58,874 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_15101_210_7927_1_1530786415_5033274_320-327527-0_sp1.1 from training. Duration: 0.87275 2023-10-04 11:38:58,890 WARNING [train.py:1204] (2/4) Exclude cut with ID _14998_210_5219_1_1530705582080_7474970_446-112117-0 from training. Duration: 0.9 2023-10-04 11:39:00,213 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_30280_210_1881_1_1532872779_3660139_400-254159-0_sp1.1 from training. Duration: 0.936375 2023-10-04 11:39:00,516 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.conv_module2.balancer1.prob, batch_count=1641120.0, ans=0.125 2023-10-04 11:39:01,705 WARNING [train.py:1204] (2/4) Exclude cut with ID _40284_210_2084_1_1533513679814_3704279_158-345341-0_sp1.1 from training. Duration: 0.936375 2023-10-04 11:39:05,857 WARNING [train.py:1204] (2/4) Exclude cut with ID 3033-130750-0096-55598-0 from training. Duration: 0.83 2023-10-04 11:39:07,258 WARNING [train.py:1204] (2/4) Exclude cut with ID _9389_210_1664_1_1529217337301_5289269_379-128794-0 from training. Duration: 0.85 2023-10-04 11:39:07,273 WARNING [train.py:1204] (2/4) Exclude cut with ID _3675_210_4557_1_1526639934408_4182089_456-18305-0 from training. Duration: 0.76 2023-10-04 11:39:09,039 WARNING [train.py:1204] (2/4) Exclude cut with ID _29853_210_7497_1_1532505600851_6642110_571-249460-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 11:39:09,122 WARNING [train.py:1204] (2/4) Exclude cut with ID _4068_210_2629_1_1526697350395_1546259_230-325568-0_sp1.1 from training. Duration: 0.92725 2023-10-04 11:39:09,123 WARNING [train.py:1204] (2/4) Exclude cut with ID _34415_210_8750_1_1533085213375_3451116_213-267570-0_sp1.1 from training. Duration: 0.963625 2023-10-04 11:39:10,432 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6767_210_3273_1_1527855783_1718932_142-127132-0_sp1.1 from training. Duration: 0.863625 2023-10-04 11:39:16,996 WARNING [train.py:1204] (2/4) Exclude cut with ID IC0653W0001-322925 from training. Duration: 0.896 2023-10-04 11:39:18,399 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_4528_210_3273_1_1527076654_3276777_209-219804-0_sp1.1 from training. Duration: 0.62725 2023-10-04 11:39:21,687 WARNING [train.py:1204] (2/4) Exclude cut with ID _55844_210_2346_1_1534471212569_7625707_187-204971-0_sp1.1 from training. Duration: 0.936375 2023-10-04 11:39:23,084 WARNING [train.py:1204] (2/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_369-325184-0 from training. Duration: 0.99 2023-10-04 11:39:23,116 WARNING [train.py:1204] (2/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_791-45149-0 from training. Duration: 0.86 2023-10-04 11:39:23,847 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.2.encoder.layers.0.conv_module1.whiten, num_groups=1, num_channels=384, metric=10.11 vs. limit=15.0 2023-10-04 11:39:24,498 WARNING [train.py:1204] (2/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_317-211107-0_sp1.1 from training. Duration: 0.836375 2023-10-04 11:39:24,590 WARNING [train.py:1204] (2/4) Exclude cut with ID _30800_210_22382_1_1532499347904_4515959_488-330913-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 11:39:26,116 WARNING [train.py:1204] (2/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_506-42614-0_sp1.1 from training. Duration: 0.7181875 2023-10-04 11:39:30,177 WARNING [train.py:1204] (2/4) Exclude cut with ID _8696_210_1876_1_1528545632440_3345058_209-54069-0 from training. Duration: 0.67 2023-10-04 11:39:34,394 WARNING [train.py:1204] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_215-202973-0_sp1.1 from training. Duration: 0.8 2023-10-04 11:39:36,311 WARNING [train.py:1204] (2/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_321-215742-0 from training. Duration: 0.5 2023-10-04 11:39:37,601 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_40896_210_15710_1_1533433956_7100802_428-32114-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 11:39:37,617 WARNING [train.py:1204] (2/4) Exclude cut with ID _27013_210_1389_1_1532437173240_3252649_467-263151-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 11:39:37,668 WARNING [train.py:1204] (2/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_734-129761-0_sp0.9 from training. Duration: 0.911125 2023-10-04 11:39:37,710 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_521-214276-0 from training. Duration: 0.44 2023-10-04 11:39:40,745 WARNING [train.py:1204] (2/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_80-147301-0_sp1.1 from training. Duration: 0.763625 2023-10-04 11:39:40,747 WARNING [train.py:1204] (2/4) Exclude cut with ID _9679_210_7497_1_1529129212906_4363929_56-307796-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 11:39:43,571 WARNING [train.py:1204] (2/4) Exclude cut with ID _14832_210_8750_1_1530691351539_3171348_1-54315-0 from training. Duration: 0.87 2023-10-04 11:39:43,571 WARNING [train.py:1204] (2/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_558-155750-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 11:39:45,023 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_142-309370-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 11:39:45,055 WARNING [train.py:1204] (2/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_18-46756-0_sp0.9 from training. Duration: 0.87775 2023-10-04 11:39:45,061 WARNING [train.py:1204] (2/4) Exclude cut with ID _61148_210_6897_1_1535094022726_4217610_279-212159-0_sp1.1 from training. Duration: 0.936375 2023-10-04 11:39:48,098 WARNING [train.py:1204] (2/4) Exclude cut with ID _21987_210_5854_1_1531821500482_3637038_164-2641-0_sp1.1 from training. Duration: 0.936375 2023-10-04 11:39:48,129 WARNING [train.py:1204] (2/4) Exclude cut with ID _8064_210_5399_1_1528372299291_4282973_395-340852-0_sp1.1 from training. Duration: 0.8454375 2023-10-04 11:39:49,727 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1077-184261-0_sp1.1 from training. Duration: 0.963625 2023-10-04 11:39:49,735 WARNING [train.py:1204] (2/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_1176-204992-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 11:39:51,555 INFO [train.py:1046] (2/4) Epoch 47, batch 1850, loss[loss=0.1616, simple_loss=0.2378, pruned_loss=0.04271, over 23833.00 frames. ], tot_loss[loss=0.1536, simple_loss=0.2343, pruned_loss=0.03642, over 4736050.55 frames. ], batch size: 164, lr: 2.16e-03, grad_scale: 8.0 2023-10-04 11:39:53,033 WARNING [train.py:1204] (2/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_563-347832-0_sp1.1 from training. Duration: 0.8090625 2023-10-04 11:39:53,097 WARNING [train.py:1204] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_380-204756-0_sp0.9 from training. Duration: 0.9555625 2023-10-04 11:39:54,833 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.conv_skip_rate, batch_count=1641386.6666666667, ans=0.0 2023-10-04 11:39:58,717 WARNING [train.py:1204] (2/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_212-10501-0_sp1.1 from training. Duration: 0.8545625 2023-10-04 11:39:58,739 WARNING [train.py:1204] (2/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_445-117398-0 from training. Duration: 0.89 2023-10-04 11:40:03,380 WARNING [train.py:1204] (2/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_29-280622-0 from training. Duration: 0.82 2023-10-04 11:40:04,965 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.conv_module2.balancer2.prob, batch_count=1641453.3333333333, ans=0.125 2023-10-04 11:40:06,208 WARNING [train.py:1204] (2/4) Exclude cut with ID _26664_210_7770_1_1532258943280_3418270_187-80815-0 from training. Duration: 0.93 2023-10-04 11:40:10,325 WARNING [train.py:1204] (2/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_414-349640-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 11:40:10,349 WARNING [train.py:1204] (2/4) Exclude cut with ID _45021_210_9994_1_1533619751094_3330288_96-239010-0 from training. Duration: 0.91 2023-10-04 11:40:10,351 WARNING [train.py:1204] (2/4) Exclude cut with ID _5189_210_4557_1_1527248319428_3769229_199-53526-0_sp0.9 from training. Duration: 0.4666875 2023-10-04 11:40:21,275 WARNING [train.py:1204] (2/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_63-101996-0_sp1.1 from training. Duration: 0.8 2023-10-04 11:40:22,676 WARNING [train.py:1204] (2/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_199-86432-0 from training. Duration: 0.98 2023-10-04 11:40:27,213 WARNING [train.py:1204] (2/4) Exclude cut with ID _36399_210_16049_1_1532998771377_3698333_207-259013-0_sp1.1 from training. Duration: 0.92725 2023-10-04 11:40:27,246 WARNING [train.py:1204] (2/4) Exclude cut with ID _62574_210_2655_1_1535180407058_3933249_567-111756-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 11:40:30,221 WARNING [train.py:1204] (2/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_436-318113-0 from training. Duration: 0.75 2023-10-04 11:40:31,635 WARNING [train.py:1204] (2/4) Exclude cut with ID _53379_210_20034_1_1534222027363_3854654_43-305214-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 11:40:31,659 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_15126_210_6753_1_1530778677_2591185_113-157651-0_sp1.1 from training. Duration: 0.6181875 2023-10-04 11:40:33,086 WARNING [train.py:1204] (2/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_146-218763-0_sp1.1 from training. Duration: 0.7818125 2023-10-04 11:40:35,781 WARNING [train.py:1204] (2/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_714-310538-0_sp0.9 from training. Duration: 0.9555625 2023-10-04 11:40:39,052 WARNING [train.py:1204] (2/4) Exclude cut with ID _24271_210_12658_1_1531987270022_7190519_709-16721-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 11:40:40,527 WARNING [train.py:1204] (2/4) Exclude cut with ID _5190_210_4557_1_1527244368799_373439_28-197030-0_sp0.9 from training. Duration: 0.8 2023-10-04 11:40:40,562 WARNING [train.py:1204] (2/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_267-197536-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 11:40:40,592 WARNING [train.py:1204] (2/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_658-282610-0_sp1.1 from training. Duration: 0.4818125 2023-10-04 11:40:40,601 WARNING [train.py:1204] (2/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_257-67050-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 11:40:42,538 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.3.encoder.layers.3.feed_forward2.out_whiten, num_groups=1, num_channels=512, metric=7.42 vs. limit=15.0 2023-10-04 11:40:42,705 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.1.encoder.layers.0.whiten, num_groups=1, num_channels=256, metric=5.11 vs. limit=12.0 2023-10-04 11:40:43,262 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_14906_210_7927_1_1530754190_9408633_961-53402-0_sp1.1 from training. Duration: 0.936375 2023-10-04 11:40:44,669 WARNING [train.py:1204] (2/4) Exclude cut with ID _60113_210_16271_1_1535164212481_4386079_490-280290-0_sp1.1 from training. Duration: 0.8909375 2023-10-04 11:40:46,044 WARNING [train.py:1204] (2/4) Exclude cut with ID _14100_210_8341_1_1530529320613_3678069_258-224599-0 from training. Duration: 0.77 2023-10-04 11:40:46,471 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.5.encoder.layers.0.whiten, num_groups=1, num_channels=256, metric=2.50 vs. limit=12.0 2023-10-04 11:40:47,235 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6961_210_2087_1_1527937168_6815011_565-326898-0_sp1.1 from training. Duration: 0.936375 2023-10-04 11:40:52,320 WARNING [train.py:1204] (2/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_239-316802-0_sp1.1 from training. Duration: 0.62725 2023-10-04 11:40:52,395 WARNING [train.py:1204] (2/4) Exclude cut with ID _55857_210_2346_1_1534644007831_6898209_745-98391-0_sp1.1 from training. Duration: 0.6909375 2023-10-04 11:40:52,398 WARNING [train.py:1204] (2/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_154-135696-0 from training. Duration: 0.8 2023-10-04 11:40:52,400 WARNING [train.py:1204] (2/4) Exclude cut with ID _14368_210_4220_1_1530532714870_3595529_93-58002-0 from training. Duration: 0.57 2023-10-04 11:40:54,405 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9025W0005-507335 from training. Duration: 0.896 2023-10-04 11:40:55,911 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9029W0034-509435 from training. Duration: 0.64 2023-10-04 11:40:59,814 WARNING [train.py:1204] (2/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_76-203867-0_sp1.1 from training. Duration: 0.6545625 2023-10-04 11:40:59,823 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_46639_210_8341_1_1533726088_4495500_66-346325-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 11:40:59,829 WARNING [train.py:1204] (2/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_145-137790-0_sp0.9 from training. Duration: 0.92225 2023-10-04 11:40:59,851 WARNING [train.py:1204] (2/4) Exclude cut with ID _36522_210_8887_1_1533177030611_1772478_7-327299-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 11:41:00,818 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.0.layers.0.nonlin_attention.whiten2, num_groups=1, num_channels=192, metric=4.59 vs. limit=15.0 2023-10-04 11:41:01,271 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9042W0009-515615 from training. Duration: 0.896 2023-10-04 11:41:01,280 WARNING [train.py:1204] (2/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_39-248870-0_sp0.9 from training. Duration: 0.7666875 2023-10-04 11:41:01,325 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_35441_210_16554_1_1533347572_4092680_362-242912-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 11:41:04,047 WARNING [train.py:1204] (2/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_216-343007-0_sp1.1 from training. Duration: 0.6 2023-10-04 11:41:04,119 WARNING [train.py:1204] (2/4) Exclude cut with ID _3867_210_1883_1_1526641200575_4475830_272-215563-0_sp0.9 from training. Duration: 0.6333125 2023-10-04 11:41:05,473 INFO [train.py:1046] (2/4) Epoch 47, batch 1900, loss[loss=0.1543, simple_loss=0.2473, pruned_loss=0.0307, over 24432.00 frames. ], tot_loss[loss=0.1542, simple_loss=0.235, pruned_loss=0.03669, over 4739304.32 frames. ], batch size: 69, lr: 2.16e-03, grad_scale: 8.0 2023-10-04 11:41:05,571 WARNING [train.py:1204] (2/4) Exclude cut with ID _21783_210_11357_1_1531742458730_3762679_553-175603-0_sp1.1 from training. Duration: 0.92725 2023-10-04 11:41:06,793 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.738e+02 2.127e+02 2.369e+02 2.756e+02 3.360e+02, threshold=4.738e+02, percent-clipped=0.0 2023-10-04 11:41:06,865 WARNING [train.py:1204] (2/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_963-187109-0 from training. Duration: 0.99 2023-10-04 11:41:08,403 WARNING [train.py:1204] (2/4) Exclude cut with ID _44973_210_1511_1_1533794381163_3634770_495-211466-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 11:41:10,250 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9068W0002-529085 from training. Duration: 0.896 2023-10-04 11:41:10,277 WARNING [train.py:1204] (2/4) Exclude cut with ID _5294_210_1747_1_1527249643928_3696179_180-100713-0_sp1.1 from training. Duration: 0.736375 2023-10-04 11:41:11,726 WARNING [train.py:1204] (2/4) Exclude cut with ID _35799_210_5854_1_1533263487887_3529071_149-49689-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 11:41:15,842 WARNING [train.py:1204] (2/4) Exclude cut with ID _67725_210_27767_1_1535608756671_5105359_36-220794-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 11:41:17,265 WARNING [train.py:1204] (2/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_338-96520-0_sp1.1 from training. Duration: 0.8545625 2023-10-04 11:41:18,620 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9104W0004-547160 from training. Duration: 0.896 2023-10-04 11:41:18,720 WARNING [train.py:1204] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_330-327861-0 from training. Duration: 0.67 2023-10-04 11:41:21,353 WARNING [train.py:1204] (2/4) Exclude cut with ID _14087_210_2253_1_1530529224512_3014029_156-257607-0_sp0.9 from training. Duration: 0.92225 2023-10-04 11:41:21,421 WARNING [train.py:1204] (2/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_476-322050-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 11:41:21,429 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9118W0017-553385 from training. Duration: 0.896 2023-10-04 11:41:22,668 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9120W0028-554435 from training. Duration: 0.896 2023-10-04 11:41:27,747 WARNING [train.py:1204] (2/4) Exclude cut with ID _56524_210_9642_1_1534572011779_3619170_145-310842-0 from training. Duration: 0.99 2023-10-04 11:41:29,091 WARNING [train.py:1204] (2/4) Exclude cut with ID _51631_210_8254_1_1534129228420_4084794_270-222508-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 11:41:31,866 WARNING [train.py:1204] (2/4) Exclude cut with ID _14268_210_9206_1_1530622771285_4714099_331-105351-0 from training. Duration: 0.65 2023-10-04 11:41:33,247 WARNING [train.py:1204] (2/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_40-129756-0 from training. Duration: 0.81 2023-10-04 11:41:40,924 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.1.feed_forward1.out_proj.dropout_p, batch_count=1641853.3333333333, ans=0.1 2023-10-04 11:41:43,472 WARNING [train.py:1204] (2/4) Exclude cut with ID _3541_210_2477_1_1526475408709_3263973_231-104736-0 from training. Duration: 0.78 2023-10-04 11:41:46,234 WARNING [train.py:1204] (2/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_214-291369-0 from training. Duration: 0.63 2023-10-04 11:41:46,240 WARNING [train.py:1204] (2/4) Exclude cut with ID _62574_210_2655_1_1535180407058_3933249_369-84347-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 11:41:47,525 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9210W0111-599240 from training. Duration: 0.896 2023-10-04 11:41:47,529 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_92-246270-0 from training. Duration: 0.85 2023-10-04 11:41:47,555 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_37152_210_9697_1_1533106573_3844188_29-239070-0 from training. Duration: 0.98 2023-10-04 11:41:47,623 WARNING [train.py:1204] (2/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_239-22076-0 from training. Duration: 0.85 2023-10-04 11:41:47,626 WARNING [train.py:1204] (2/4) Exclude cut with ID _65962_210_29927_1_1535436026519_4174930_368-44639-0_sp1.1 from training. Duration: 0.97275 2023-10-04 11:41:51,721 WARNING [train.py:1204] (2/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_152-212533-0 from training. Duration: 0.97 2023-10-04 11:41:54,494 WARNING [train.py:1204] (2/4) Exclude cut with ID _14603_210_4220_1_1530615688441_3097319_90-31383-0_sp1.1 from training. Duration: 0.7909375 2023-10-04 11:41:57,790 WARNING [train.py:1204] (2/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_196-152868-0_sp1.1 from training. Duration: 0.92725 2023-10-04 11:41:57,792 WARNING [train.py:1204] (2/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_140-181176-0 from training. Duration: 0.92 2023-10-04 11:41:58,007 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.feed_forward2.hidden_balancer.prob, batch_count=1641920.0, ans=0.125 2023-10-04 11:41:59,770 WARNING [train.py:1204] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_168-32906-0_sp0.9 from training. Duration: 0.7666875 2023-10-04 11:42:02,760 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.conv_module1.balancer1.max_abs, batch_count=1641986.6666666667, ans=10.0 2023-10-04 11:42:03,948 WARNING [train.py:1204] (2/4) Exclude cut with ID _14922_210_6228_1_1530707435273_3693310_13-165512-0 from training. Duration: 0.82 2023-10-04 11:42:03,980 WARNING [train.py:1204] (2/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_381-317791-0_sp0.9 from training. Duration: 0.811125 2023-10-04 11:42:05,639 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.feed_forward2.hidden_balancer.prob, batch_count=1641986.6666666667, ans=0.125 2023-10-04 11:42:09,576 WARNING [train.py:1204] (2/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_63-267585-0_sp1.1 from training. Duration: 0.8181875 2023-10-04 11:42:09,578 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_326-303945-0_sp1.1 from training. Duration: 0.9 2023-10-04 11:42:09,599 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_194-143381-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 11:42:10,902 WARNING [train.py:1204] (2/4) Exclude cut with ID _39290_210_4220_1_1533862778760_3075118_194-16667-0_sp1.1 from training. Duration: 0.963625 2023-10-04 11:42:11,691 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.0.bypass_mid.scale_min, batch_count=1641986.6666666667, ans=0.2 2023-10-04 11:42:12,260 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.1.encoder.layers.1.whiten, num_groups=1, num_channels=256, metric=4.68 vs. limit=12.0 2023-10-04 11:42:12,702 WARNING [train.py:1204] (2/4) Exclude cut with ID _15012_210_1968_1_1530856820429_5095350_662-210469-0_sp0.9 from training. Duration: 0.6333125 2023-10-04 11:42:13,879 WARNING [train.py:1204] (2/4) Exclude cut with ID _4697_210_2069_1_1526815801953_3823098_68-219807-0_sp1.1 from training. Duration: 0.42725 2023-10-04 11:42:13,938 WARNING [train.py:1204] (2/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_321-22821-0_sp0.9 from training. Duration: 0.988875 2023-10-04 11:42:15,422 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_14056_210_4163_1_1530604483_7572827_1037-45317-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 11:42:15,424 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_403-39619-0_sp1.1 from training. Duration: 0.763625 2023-10-04 11:42:15,681 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.balancer_na.min_abs, batch_count=1641986.6666666667, ans=0.02 2023-10-04 11:42:18,179 INFO [train.py:1046] (2/4) Epoch 47, batch 1950, loss[loss=0.1647, simple_loss=0.255, pruned_loss=0.03717, over 24605.00 frames. ], tot_loss[loss=0.1548, simple_loss=0.2359, pruned_loss=0.03685, over 4737742.47 frames. ], batch size: 71, lr: 2.16e-03, grad_scale: 8.0 2023-10-04 11:42:18,233 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_20120_210_6753_1_1531643035_5482859_510-96996-0_sp0.9 from training. Duration: 0.9666875 2023-10-04 11:42:18,243 WARNING [train.py:1204] (2/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_225-180155-0_sp1.1 from training. Duration: 0.92725 2023-10-04 11:42:18,286 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_278-272612-0_sp0.9 from training. Duration: 0.811125 2023-10-04 11:42:19,674 WARNING [train.py:1204] (2/4) Exclude cut with ID _53713_210_24116_1_1534314487762_7416751_691-70244-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 11:42:22,428 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6788_210_1388_1_1527898698_4562021_91-197981-0_sp1.1 from training. Duration: 0.7545625 2023-10-04 11:42:25,730 WARNING [train.py:1204] (2/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_60-18123-0_sp0.9 from training. Duration: 0.97775 2023-10-04 11:42:25,778 WARNING [train.py:1204] (2/4) Exclude cut with ID _35799_210_5854_1_1533263487887_3529071_5-214617-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 11:42:25,782 WARNING [train.py:1204] (2/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_890-5117-0_sp1.1 from training. Duration: 0.7090625 2023-10-04 11:42:27,604 WARNING [train.py:1204] (2/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_660-211088-0 from training. Duration: 0.62 2023-10-04 11:42:28,365 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.2.encoder.layers.1.self_attn_weights.whiten_keys, num_groups=4, num_channels=128, metric=2.73 vs. limit=6.0 2023-10-04 11:42:28,907 WARNING [train.py:1204] (2/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_314-126819-0_sp0.9 from training. Duration: 0.5555625 2023-10-04 11:42:28,933 WARNING [train.py:1204] (2/4) Exclude cut with ID _21632_210_2253_1_1531994001765_3744140_362-66225-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 11:42:30,819 WARNING [train.py:1204] (2/4) Exclude cut with ID _61118_210_9527_1_1534897668884_3752630_173-258555-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 11:42:34,769 WARNING [train.py:1204] (2/4) Exclude cut with ID _7719_210_3525_1_1528456153163_4296960_88-250259-0_sp0.9 from training. Duration: 0.9333125 2023-10-04 11:42:34,807 WARNING [train.py:1204] (2/4) Exclude cut with ID _15140_210_9288_1_1530791388450_4880149_539-153708-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 11:42:34,833 WARNING [train.py:1204] (2/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_253-42544-0_sp1.1 from training. Duration: 0.97275 2023-10-04 11:42:37,654 WARNING [train.py:1204] (2/4) Exclude cut with ID _33640_210_2655_1_1533117605479_4855469_479-296877-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 11:42:39,132 WARNING [train.py:1204] (2/4) Exclude cut with ID 3033-130750-0096-55598-0_sp1.1 from training. Duration: 0.7545625 2023-10-04 11:42:40,417 WARNING [train.py:1204] (2/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_191-41023-0_sp1.1 from training. Duration: 0.6454375 2023-10-04 11:42:40,440 WARNING [train.py:1204] (2/4) Exclude cut with ID _8365_210_2064_1_1528592311070_4677690_431-294102-0_sp1.1 from training. Duration: 0.8090625 2023-10-04 11:42:40,477 WARNING [train.py:1204] (2/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_893-296030-0_sp1.1 from training. Duration: 0.97275 2023-10-04 11:42:43,622 WARNING [train.py:1204] (2/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_401-343820-0_sp1.1 from training. Duration: 0.97275 2023-10-04 11:42:46,451 WARNING [train.py:1204] (2/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_127-566-0_sp1.1 from training. Duration: 0.763625 2023-10-04 11:42:46,452 WARNING [train.py:1204] (2/4) Exclude cut with ID _14100_210_8341_1_1530529320613_3678069_126-272847-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 11:42:46,466 WARNING [train.py:1204] (2/4) Exclude cut with ID _15133_210_6228_1_1530794512074_3080379_29-264458-0_sp1.1 from training. Duration: 0.52725 2023-10-04 11:42:46,473 WARNING [train.py:1204] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_127-119109-0 from training. Duration: 0.72 2023-10-04 11:42:47,800 WARNING [train.py:1204] (2/4) Exclude cut with ID _6537_210_1883_1_1527987559710_3597129_382-133062-0_sp1.1 from training. Duration: 0.6181875 2023-10-04 11:42:47,819 WARNING [train.py:1204] (2/4) Exclude cut with ID _29529_210_7234_1_1532393961455_3977460_391-219682-0_sp1.1 from training. Duration: 0.936375 2023-10-04 11:42:47,859 WARNING [train.py:1204] (2/4) Exclude cut with ID _37537_210_2253_1_1533171758568_3421949_96-137189-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 11:42:52,082 WARNING [train.py:1204] (2/4) Exclude cut with ID _58414_210_8254_1_1535421609162_7151182_15-276301-0_sp1.1 from training. Duration: 0.97275 2023-10-04 11:42:52,932 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.1.encoder.layers.1.self_attn2.whiten, num_groups=1, num_channels=256, metric=11.18 vs. limit=22.5 2023-10-04 11:42:54,707 WARNING [train.py:1204] (2/4) Exclude cut with ID _59502_210_12210_1_1535107024698_1835139_110-289074-0_sp1.1 from training. Duration: 0.92725 2023-10-04 11:42:58,610 WARNING [train.py:1204] (2/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_367-61330-0_sp1.1 from training. Duration: 0.6818125 2023-10-04 11:43:01,489 WARNING [train.py:1204] (2/4) Exclude cut with ID _45297_210_2721_1_1533715351381_3264949_353-9541-0_sp1.1 from training. Duration: 0.8818125 2023-10-04 11:43:01,526 WARNING [train.py:1204] (2/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_85-138257-0_sp0.9 from training. Duration: 0.788875 2023-10-04 11:43:01,554 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_3787_210_2040_1_1526477544_5478586_603-120324-0 from training. Duration: 0.81 2023-10-04 11:43:03,589 WARNING [train.py:1204] (2/4) Exclude cut with ID _42863_210_9070_1_1533902398788_3696920_411-204131-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 11:43:06,393 WARNING [train.py:1204] (2/4) Exclude cut with ID _30196_210_1664_1_1533708496522_7074140_822-289909-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 11:43:07,741 WARNING [train.py:1204] (2/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_32-335103-0_sp0.9 from training. Duration: 0.911125 2023-10-04 11:43:07,799 WARNING [train.py:1204] (2/4) Exclude cut with ID _6493_210_1747_1_1528459210424_4012470_513-140967-0_sp0.9 from training. Duration: 0.87775 2023-10-04 11:43:16,691 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_47896_210_20508_1_1534492518_5958868_439-257766-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 11:43:18,054 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_1294-63152-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 11:43:19,562 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.self_attn_weights.pos_emb_skip_rate, batch_count=1642320.0, ans=0.0 2023-10-04 11:43:21,939 WARNING [train.py:1204] (2/4) Exclude cut with ID _59170_210_6721_1_1534983633960_4203335_319-166512-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 11:43:23,365 WARNING [train.py:1204] (2/4) Exclude cut with ID _3541_210_2477_1_1526475408709_3263973_146-3470-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 11:43:24,850 WARNING [train.py:1204] (2/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_678-159307-0_sp1.1 from training. Duration: 0.77275 2023-10-04 11:43:24,892 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_343-48822-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 11:43:25,476 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.3.encoder.layers.3.feed_forward2.out_whiten, num_groups=1, num_channels=512, metric=5.03 vs. limit=15.0 2023-10-04 11:43:26,241 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_4692_210_3528_1_1526899755_4323254_107-44401-0 from training. Duration: 0.86 2023-10-04 11:43:26,247 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6291_210_3273_1_1527683649_1078498_74-292608-0_sp1.1 from training. Duration: 0.6545625 2023-10-04 11:43:26,324 WARNING [train.py:1204] (2/4) Exclude cut with ID _55644_210_11856_1_1534593599582_4103719_293-248694-0_sp1.1 from training. Duration: 0.97275 2023-10-04 11:43:28,261 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_3172_210_2087_1_1526292929_3987865_197-97762-0 from training. Duration: 0.75 2023-10-04 11:43:29,634 WARNING [train.py:1204] (2/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_264-348896-0_sp0.9 from training. Duration: 0.9444375 2023-10-04 11:43:33,277 INFO [train.py:1046] (2/4) Epoch 47, batch 2000, loss[loss=0.1543, simple_loss=0.2288, pruned_loss=0.03994, over 23685.00 frames. ], tot_loss[loss=0.1555, simple_loss=0.2367, pruned_loss=0.0371, over 4729407.53 frames. ], batch size: 232, lr: 2.16e-03, grad_scale: 16.0 2023-10-04 11:43:34,755 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.650e+02 2.107e+02 2.252e+02 2.646e+02 4.173e+02, threshold=4.505e+02, percent-clipped=0.0 2023-10-04 11:43:36,182 WARNING [train.py:1204] (2/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_209-205365-0_sp0.9 from training. Duration: 0.87775 2023-10-04 11:43:37,543 WARNING [train.py:1204] (2/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_222-198127-0_sp0.9 from training. Duration: 0.9333125 2023-10-04 11:43:37,581 WARNING [train.py:1204] (2/4) Exclude cut with ID _57572_210_5517_1_1534921219624_3809484_52-43530-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 11:43:38,972 WARNING [train.py:1204] (2/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_96-320736-0_sp1.1 from training. Duration: 0.7818125 2023-10-04 11:43:40,449 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_13091_210_7925_1_1530609447_8987354_1159-225508-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 11:43:41,928 WARNING [train.py:1204] (2/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_237-44332-0 from training. Duration: 0.94 2023-10-04 11:43:41,967 WARNING [train.py:1204] (2/4) Exclude cut with ID _7970_210_1883_1_1528455616296_3472230_25-178407-0_sp0.9 from training. Duration: 0.788875 2023-10-04 11:43:43,518 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.self_attn_weights.pos_emb_skip_rate, batch_count=1642386.6666666667, ans=0.0 2023-10-04 11:43:46,564 WARNING [train.py:1204] (2/4) Exclude cut with ID _29003_210_19596_1_1532507391720_3894098_357-257062-0_sp1.1 from training. Duration: 0.936375 2023-10-04 11:43:47,982 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_809-17156-0 from training. Duration: 0.73 2023-10-04 11:43:49,357 WARNING [train.py:1204] (2/4) Exclude cut with ID _7160_210_1876_1_1527940923231_3703799_302-150728-0_sp1.1 from training. Duration: 0.7090625 2023-10-04 11:43:49,365 WARNING [train.py:1204] (2/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_444-28041-0_sp0.9 from training. Duration: 0.9444375 2023-10-04 11:43:53,423 WARNING [train.py:1204] (2/4) Exclude cut with ID _52892_210_6721_1_1534378941259_4124129_278-251666-0_sp1.1 from training. Duration: 0.92725 2023-10-04 11:43:53,528 WARNING [train.py:1204] (2/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_115-326778-0 from training. Duration: 0.93 2023-10-04 11:43:53,704 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.feed_forward3.hidden_balancer.prob, batch_count=1642453.3333333333, ans=0.125 2023-10-04 11:43:54,863 WARNING [train.py:1204] (2/4) Exclude cut with ID _41391_210_7994_1_1533603771872_3638201_31-188961-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 11:43:56,300 WARNING [train.py:1204] (2/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_1073-6264-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 11:43:56,318 WARNING [train.py:1204] (2/4) Exclude cut with ID _66868_210_9527_1_1535509287954_4561140_435-259515-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 11:43:58,270 WARNING [train.py:1204] (2/4) Exclude cut with ID _14886_210_4220_1_1530702020355_3516190_159-73748-0 from training. Duration: 0.83 2023-10-04 11:43:58,308 WARNING [train.py:1204] (2/4) Exclude cut with ID _15150_210_8341_1_1530752411803_7256384_469-239509-0_sp1.1 from training. Duration: 0.6181875 2023-10-04 11:43:58,546 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.balancer1.prob, batch_count=1642453.3333333333, ans=0.125 2023-10-04 11:43:59,700 WARNING [train.py:1204] (2/4) Exclude cut with ID _8064_210_5399_1_1528372299291_4282973_395-340852-0 from training. Duration: 0.93 2023-10-04 11:43:59,709 WARNING [train.py:1204] (2/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_138-187031-0_sp1.1 from training. Duration: 0.9 2023-10-04 11:44:04,718 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_13757_210_3528_1_1530440682_4107239_48-106347-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 11:44:04,778 WARNING [train.py:1204] (2/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_164-224052-0_sp1.1 from training. Duration: 0.636375 2023-10-04 11:44:04,782 WARNING [train.py:1204] (2/4) Exclude cut with ID _59288_210_5854_1_1535077757774_3412246_408-322648-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 11:44:05,932 WARNING [train.py:1204] (2/4) Exclude cut with ID _30191_210_1664_1_1533189915047_7196670_827-36359-0_sp1.1 from training. Duration: 0.963625 2023-10-04 11:44:07,275 WARNING [train.py:1204] (2/4) Exclude cut with ID _4680_210_3525_1_1527249757953_893386_75-78979-0_sp1.1 from training. Duration: 0.82725 2023-10-04 11:44:07,332 WARNING [train.py:1204] (2/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_383-165234-0 from training. Duration: 0.94 2023-10-04 11:44:11,431 WARNING [train.py:1204] (2/4) Exclude cut with ID _19896_210_1883_1_1531709022662_6649569_94-229927-0 from training. Duration: 0.97 2023-10-04 11:44:11,438 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6788_210_1388_1_1527898698_4562021_180-75288-0_sp1.1 from training. Duration: 0.9 2023-10-04 11:44:11,447 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_26433_210_12108_1_1532157158_4056480_54-321349-0_sp1.1 from training. Duration: 0.97275 2023-10-04 11:44:11,634 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.attention_skip_rate, batch_count=1642520.0, ans=0.0 2023-10-04 11:44:17,501 WARNING [train.py:1204] (2/4) Exclude cut with ID _56108_210_16380_1_1534505446159_3887959_265-161493-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 11:44:17,602 WARNING [train.py:1204] (2/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_395-111602-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 11:44:17,613 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_154-316450-0_sp0.9 from training. Duration: 0.8444375 2023-10-04 11:44:18,942 WARNING [train.py:1204] (2/4) Exclude cut with ID _43552_210_8913_1_1533726031059_3523429_381-116041-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 11:44:20,389 WARNING [train.py:1204] (2/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_65-104124-0_sp1.1 from training. Duration: 0.963625 2023-10-04 11:44:20,438 WARNING [train.py:1204] (2/4) Exclude cut with ID _6536_210_1883_1_1527850783339_3319069_169-23811-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 11:44:21,740 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_289-180904-0_sp0.9 from training. Duration: 0.8444375 2023-10-04 11:44:21,758 WARNING [train.py:1204] (2/4) Exclude cut with ID _56406_210_9288_1_1534899618664_3496149_226-242400-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 11:44:24,379 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_24899_210_16554_1_1532046422_3827462_276-216330-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 11:44:25,869 WARNING [train.py:1204] (2/4) Exclude cut with ID _29345_210_4381_1_1532689168263_3860169_198-174328-0_sp1.1 from training. Duration: 0.82725 2023-10-04 11:44:25,941 WARNING [train.py:1204] (2/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_338-96520-0 from training. Duration: 0.94 2023-10-04 11:44:32,834 WARNING [train.py:1204] (2/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_7-111954-0_sp1.1 from training. Duration: 0.5090625 2023-10-04 11:44:34,101 WARNING [train.py:1204] (2/4) Exclude cut with ID _36692_210_5665_1_1533608967420_398809_8-16261-0_sp1.1 from training. Duration: 0.97275 2023-10-04 11:44:36,705 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.conv_module1.balancer1.prob, batch_count=1642653.3333333333, ans=0.125 2023-10-04 11:44:37,777 WARNING [train.py:1204] (2/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_86-212657-0_sp1.1 from training. Duration: 0.97275 2023-10-04 11:44:37,784 WARNING [train.py:1204] (2/4) Exclude cut with ID _44659_210_2721_1_1534036120061_6614860_626-298884-0_sp1.1 from training. Duration: 0.8818125 2023-10-04 11:44:40,671 WARNING [train.py:1204] (2/4) Exclude cut with ID _49755_210_18053_1_1534581005846_3218709_120-257918-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 11:44:43,322 WARNING [train.py:1204] (2/4) Exclude cut with ID _21881_210_3525_1_1531825242757_3798380_400-113821-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 11:44:43,324 WARNING [train.py:1204] (2/4) Exclude cut with ID _21783_210_11357_1_1531742458730_3762679_387-236773-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 11:44:43,611 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.conv_skip_rate, batch_count=1642653.3333333333, ans=0.0 2023-10-04 11:44:45,273 WARNING [train.py:1204] (2/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_613-232856-0_sp1.1 from training. Duration: 0.4909375 2023-10-04 11:44:45,299 WARNING [train.py:1204] (2/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_181-78632-0_sp0.9 from training. Duration: 0.8666875 2023-10-04 11:44:47,635 INFO [train.py:1046] (2/4) Epoch 47, batch 2050, loss[loss=0.1579, simple_loss=0.246, pruned_loss=0.03494, over 24377.00 frames. ], tot_loss[loss=0.1543, simple_loss=0.2351, pruned_loss=0.03675, over 4732287.29 frames. ], batch size: 77, lr: 2.16e-03, grad_scale: 16.0 2023-10-04 11:44:47,745 WARNING [train.py:1204] (2/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_175-17485-0_sp1.1 from training. Duration: 0.97275 2023-10-04 11:44:47,880 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.0.conv_skip_rate, batch_count=1642720.0, ans=0.0 2023-10-04 11:44:49,074 WARNING [train.py:1204] (2/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_1004-183422-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 11:44:50,644 WARNING [train.py:1204] (2/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_32-65962-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 11:44:51,979 WARNING [train.py:1204] (2/4) Exclude cut with ID _15458_210_8341_1_1530858614263_3834410_40-298664-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 11:44:55,568 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.3.encoder.layers.0.feed_forward3.out_whiten, num_groups=1, num_channels=512, metric=9.47 vs. limit=15.0 2023-10-04 11:44:56,227 WARNING [train.py:1204] (2/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_380-261863-0_sp1.1 from training. Duration: 0.963625 2023-10-04 11:44:59,453 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_372-173012-0_sp1.1 from training. Duration: 0.863625 2023-10-04 11:45:00,918 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_57-82205-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 11:45:02,291 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_3691_210_3528_1_1526468169_4062053_355-334432-0_sp0.9 from training. Duration: 0.9666875 2023-10-04 11:45:02,406 WARNING [train.py:1204] (2/4) Exclude cut with ID _19023_210_4353_1_1531378892552_5820780_28-36615-0 from training. Duration: 0.97 2023-10-04 11:45:02,411 WARNING [train.py:1204] (2/4) Exclude cut with ID _29853_210_7497_1_1532505600851_6642110_286-251333-0_sp1.1 from training. Duration: 0.92725 2023-10-04 11:45:04,255 WARNING [train.py:1204] (2/4) Exclude cut with ID _31602_210_5854_1_1532677021456_4458250_518-80679-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 11:45:04,298 WARNING [train.py:1204] (2/4) Exclude cut with ID _14922_210_6228_1_1530707435273_3693310_38-187851-0_sp0.9 from training. Duration: 0.688875 2023-10-04 11:45:11,873 INFO [scaling.py:1118] (2/4) WithLoss: name=encoder.encoders.2.encoder.layers.2.self_attn_weights, loss-sum=0.000e+00 2023-10-04 11:45:13,036 WARNING [train.py:1204] (2/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_39-248870-0_sp1.1 from training. Duration: 0.62725 2023-10-04 11:45:13,053 WARNING [train.py:1204] (2/4) Exclude cut with ID _16908_210_5724_1_1531119157248_3772700_154-31455-0_sp1.1 from training. Duration: 0.97275 2023-10-04 11:45:13,890 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.1.whiten.whitening_limit, batch_count=1642786.6666666667, ans=12.0 2023-10-04 11:45:14,537 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8453_210_4211_1_1528502940_8562730_347-330673-0 from training. Duration: 0.72 2023-10-04 11:45:16,383 WARNING [train.py:1204] (2/4) Exclude cut with ID _30191_210_1664_1_1533189915047_7196670_674-204247-0_sp1.1 from training. Duration: 0.97275 2023-10-04 11:45:19,082 WARNING [train.py:1204] (2/4) Exclude cut with ID _8833_210_5219_1_1528592665210_7553059_329-125156-0 from training. Duration: 0.82 2023-10-04 11:45:19,111 WARNING [train.py:1204] (2/4) Exclude cut with ID _4779_210_2084_1_1527318022944_3914893_500-285013-0_sp1.1 from training. Duration: 0.62725 2023-10-04 11:45:23,142 WARNING [train.py:1204] (2/4) Exclude cut with ID _14421_210_8750_1_1530604795825_3542290_190-313578-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 11:45:24,508 WARNING [train.py:1204] (2/4) Exclude cut with ID _35799_210_5854_1_1533263487887_3529071_538-273574-0_sp1.1 from training. Duration: 0.936375 2023-10-04 11:45:25,949 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_14928_210_5632_1_1530773461_4714000_454-112198-0_sp1.1 from training. Duration: 0.6 2023-10-04 11:45:25,990 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_468-143126-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 11:45:27,390 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_4528_210_3273_1_1527076654_3276777_258-121681-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 11:45:28,783 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_397-132565-0_sp1.1 from training. Duration: 0.9 2023-10-04 11:45:28,802 WARNING [train.py:1204] (2/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_454-123002-0_sp0.9 from training. Duration: 0.7666875 2023-10-04 11:45:32,214 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.attention_skip_rate, batch_count=1642920.0, ans=0.0 2023-10-04 11:45:33,851 WARNING [train.py:1204] (2/4) Exclude cut with ID _27052_210_5986_1_1532329197187_3585009_297-86890-0_sp1.1 from training. Duration: 0.936375 2023-10-04 11:45:35,257 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_51225_210_3601_1_1534400634_2327383_55-301844-0_sp1.1 from training. Duration: 0.8454375 2023-10-04 11:45:35,573 INFO [scaling.py:1118] (2/4) WithLoss: name=encoder.encoders.5.encoder.layers.1.self_attn_weights, loss-sum=0.000e+00 2023-10-04 11:45:38,062 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6786_210_1388_1_1527839699_4194495_378-46070-0_sp0.9 from training. Duration: 0.8 2023-10-04 11:45:39,402 WARNING [train.py:1204] (2/4) Exclude cut with ID _42334_210_7770_1_1534206610396_3605880_55-19758-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 11:45:44,198 WARNING [train.py:1204] (2/4) Exclude cut with ID _14884_210_2252_1_1530707459689_5561250_114-201399-0_sp0.9 from training. Duration: 0.8444375 2023-10-04 11:45:47,258 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_99-251314-0_sp0.9 from training. Duration: 0.9666875 2023-10-04 11:45:48,593 WARNING [train.py:1204] (2/4) Exclude cut with ID _26528_210_9070_1_1532570410842_3851310_497-97218-0 from training. Duration: 0.86 2023-10-04 11:45:53,309 WARNING [train.py:1204] (2/4) Exclude cut with ID _55328_210_27767_1_1534580973260_4088170_230-317241-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 11:45:53,383 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_125-203105-0_sp0.9 from training. Duration: 0.888875 2023-10-04 11:45:56,022 WARNING [train.py:1204] (2/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_55-31904-0_sp1.1 from training. Duration: 0.8 2023-10-04 11:45:57,394 WARNING [train.py:1204] (2/4) Exclude cut with ID _8064_210_5399_1_1528372299291_4282973_18-174647-0 from training. Duration: 0.91 2023-10-04 11:46:00,913 WARNING [train.py:1204] (2/4) Exclude cut with ID IC0165W0005-81726 from training. Duration: 0.952 2023-10-04 11:46:00,913 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_31583_210_3852_1_1532595851_4024413_148-171847-0_sp1.1 from training. Duration: 0.963625 2023-10-04 11:46:02,621 INFO [train.py:1046] (2/4) Epoch 47, batch 2100, loss[loss=0.1455, simple_loss=0.2274, pruned_loss=0.03178, over 24624.00 frames. ], tot_loss[loss=0.1535, simple_loss=0.2342, pruned_loss=0.03639, over 4720474.45 frames. ], batch size: 60, lr: 2.16e-03, grad_scale: 8.0 2023-10-04 11:46:02,681 WARNING [train.py:1204] (2/4) Exclude cut with ID _21989_210_5854_1_1531899214931_3271360_116-138769-0_sp1.1 from training. Duration: 0.936375 2023-10-04 11:46:02,744 WARNING [train.py:1204] (2/4) Exclude cut with ID _66070_210_7640_1_1535441393096_7805310_489-292631-0_sp1.1 from training. Duration: 0.7181875 2023-10-04 11:46:05,226 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.730e+02 2.063e+02 2.330e+02 2.672e+02 3.956e+02, threshold=4.660e+02, percent-clipped=0.0 2023-10-04 11:46:05,321 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_20120_210_6753_1_1531643035_5482859_202-27960-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 11:46:05,337 WARNING [train.py:1204] (2/4) Exclude cut with ID _5294_210_1747_1_1527249643928_3696179_342-270278-0 from training. Duration: 0.55 2023-10-04 11:46:05,373 WARNING [train.py:1204] (2/4) Exclude cut with ID _38323_210_12108_1_1533121233369_3303049_416-11919-0 from training. Duration: 0.96 2023-10-04 11:46:05,649 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.conv_skip_rate, batch_count=1643053.3333333333, ans=0.0 2023-10-04 11:46:06,821 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6545_210_2040_1_1527774670_4245258_407-48070-0_sp0.9 from training. Duration: 0.8444375 2023-10-04 11:46:10,048 WARNING [train.py:1204] (2/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_503-30719-0_sp1.1 from training. Duration: 0.8909375 2023-10-04 11:46:11,339 WARNING [train.py:1204] (2/4) Exclude cut with ID _49643_210_7989_1_1534640244224_3939031_234-326497-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 11:46:14,018 WARNING [train.py:1204] (2/4) Exclude cut with ID _14170_210_5219_1_1530525949868_4469650_301-77873-0_sp1.1 from training. Duration: 0.963625 2023-10-04 11:46:15,356 WARNING [train.py:1204] (2/4) Exclude cut with ID _45297_210_2721_1_1533715351381_3264949_152-154697-0_sp1.1 from training. Duration: 0.97275 2023-10-04 11:46:15,363 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_3371_210_1794_1_1526384462_5580777_396-339812-0 from training. Duration: 0.85 2023-10-04 11:46:18,069 WARNING [train.py:1204] (2/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_399-156791-0_sp1.1 from training. Duration: 0.7545625 2023-10-04 11:46:18,119 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7696_210_2040_1_1528207167_3716099_117-125434-0 from training. Duration: 0.76 2023-10-04 11:46:18,124 WARNING [train.py:1204] (2/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_96-244299-0 from training. Duration: 0.88 2023-10-04 11:46:19,480 WARNING [train.py:1204] (2/4) Exclude cut with ID _37892_210_19596_1_1533198677897_3964058_519-343462-0_sp1.1 from training. Duration: 0.92725 2023-10-04 11:46:21,342 WARNING [train.py:1204] (2/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_202-183922-0_sp1.1 from training. Duration: 0.77275 2023-10-04 11:46:21,348 WARNING [train.py:1204] (2/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_453-133983-0 from training. Duration: 0.78 2023-10-04 11:46:21,376 WARNING [train.py:1204] (2/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_178-39722-0_sp1.1 from training. Duration: 0.4454375 2023-10-04 11:46:24,326 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_15126_210_6753_1_1530778677_2591185_113-157651-0 from training. Duration: 0.68 2023-10-04 11:46:24,327 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_271-234962-0_sp1.1 from training. Duration: 0.7181875 2023-10-04 11:46:27,099 WARNING [train.py:1204] (2/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_195-64967-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 11:46:27,132 WARNING [train.py:1204] (2/4) Exclude cut with ID _43552_210_8913_1_1533726031059_3523429_365-215065-0_sp1.1 from training. Duration: 0.936375 2023-10-04 11:46:29,918 WARNING [train.py:1204] (2/4) Exclude cut with ID _31602_210_5854_1_1532677021456_4458250_249-96604-0_sp0.9 from training. Duration: 0.988875 2023-10-04 11:46:31,802 WARNING [train.py:1204] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_307-158462-0 from training. Duration: 0.69 2023-10-04 11:46:31,839 WARNING [train.py:1204] (2/4) Exclude cut with ID _30918_210_6400_1_1532754078600_3125920_407-223394-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 11:46:31,842 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6673_210_3273_1_1527767790_3173240_351-167954-0_sp0.9 from training. Duration: 0.5333125 2023-10-04 11:46:34,537 WARNING [train.py:1204] (2/4) Exclude cut with ID _4866_210_2577_1_1527404412284_4364659_256-306830-0 from training. Duration: 0.87 2023-10-04 11:46:34,584 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_17613_210_2151_1_1531137569_2366160_222-131118-0_sp1.1 from training. Duration: 0.963625 2023-10-04 11:46:34,603 WARNING [train.py:1204] (2/4) Exclude cut with ID _45560_210_21982_1_1533812603951_3584999_111-6376-0 from training. Duration: 0.84 2023-10-04 11:46:35,910 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_216-73841-0 from training. Duration: 0.96 2023-10-04 11:46:35,960 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_14058_210_4163_1_1530690953_6327180_49-210687-0 from training. Duration: 0.94 2023-10-04 11:46:37,382 WARNING [train.py:1204] (2/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_506-42614-0_sp0.9 from training. Duration: 0.87775 2023-10-04 11:46:39,242 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_3133_210_2087_1_1526106446_2324690_25-332138-0_sp1.1 from training. Duration: 0.82725 2023-10-04 11:46:41,996 WARNING [train.py:1204] (2/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_589-9070-0_sp1.1 from training. Duration: 0.6454375 2023-10-04 11:46:43,392 WARNING [train.py:1204] (2/4) Exclude cut with ID _6194_210_1534_1_1527511826160_3941194_14-282876-0_sp1.1 from training. Duration: 0.6454375 2023-10-04 11:46:45,924 WARNING [train.py:1204] (2/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_1166-90654-0_sp1.1 from training. Duration: 0.92725 2023-10-04 11:46:46,040 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_14056_210_4163_1_1530604483_7572827_218-255858-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 11:46:46,042 WARNING [train.py:1204] (2/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_1285-256622-0 from training. Duration: 0.84 2023-10-04 11:46:47,353 WARNING [train.py:1204] (2/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_205-286261-0_sp1.1 from training. Duration: 0.963625 2023-10-04 11:46:47,367 WARNING [train.py:1204] (2/4) Exclude cut with ID _68777_210_4353_1_1535810390731_3117904_6-154119-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 11:46:47,435 WARNING [train.py:1204] (2/4) Exclude cut with ID _22982_210_1883_1_1531882790447_5449409_565-199393-0_sp1.1 from training. Duration: 0.92725 2023-10-04 11:46:48,841 WARNING [train.py:1204] (2/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_278-154492-0 from training. Duration: 0.76 2023-10-04 11:46:50,845 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_426-313557-0 from training. Duration: 0.53 2023-10-04 11:46:52,000 WARNING [train.py:1204] (2/4) Exclude cut with ID _41817_210_18835_1_1533434477160_7423049_555-293448-0 from training. Duration: 0.8 2023-10-04 11:46:54,697 WARNING [train.py:1204] (2/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_32-335103-0_sp1.1 from training. Duration: 0.7454375 2023-10-04 11:46:58,976 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_2655_210_2087_1_1525688959_3897400_202-147557-0_sp1.1 from training. Duration: 0.77275 2023-10-04 11:46:59,005 WARNING [train.py:1204] (2/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_158-250116-0 from training. Duration: 0.62 2023-10-04 11:47:03,872 WARNING [train.py:1204] (2/4) Exclude cut with ID _55429_210_10626_1_1534467588527_7174143_528-88083-0_sp1.1 from training. Duration: 0.963625 2023-10-04 11:47:06,780 WARNING [train.py:1204] (2/4) Exclude cut with ID _8030_210_7497_1_1528444886781_3531089_32-31837-0_sp1.1 from training. Duration: 0.8818125 2023-10-04 11:47:06,815 WARNING [train.py:1204] (2/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_744-86168-0_sp1.1 from training. Duration: 0.97275 2023-10-04 11:47:06,823 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1113-7369-0_sp0.9 from training. Duration: 0.9444375 2023-10-04 11:47:08,218 WARNING [train.py:1204] (2/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_124-345840-0_sp0.9 from training. Duration: 0.57775 2023-10-04 11:47:08,272 WARNING [train.py:1204] (2/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_328-264776-0_sp0.9 from training. Duration: 0.7444375 2023-10-04 11:47:10,122 WARNING [train.py:1204] (2/4) Exclude cut with ID _18895_210_6228_1_1531357274234_3784930_469-166615-0_sp1.1 from training. Duration: 0.963625 2023-10-04 11:47:10,129 WARNING [train.py:1204] (2/4) Exclude cut with ID _8395_210_1531_1_1528597871523_4411579_431-132470-0_sp0.9 from training. Duration: 0.82225 2023-10-04 11:47:11,499 WARNING [train.py:1204] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_554-45617-0_sp1.1 from training. Duration: 0.9 2023-10-04 11:47:12,901 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_35361_210_4238_1_1533262810_7793476_524-188242-0_sp1.1 from training. Duration: 0.92725 2023-10-04 11:47:14,407 WARNING [train.py:1204] (2/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_938-205135-0 from training. Duration: 0.87 2023-10-04 11:47:15,695 INFO [train.py:1046] (2/4) Epoch 47, batch 2150, loss[loss=0.1557, simple_loss=0.2485, pruned_loss=0.03147, over 24563.00 frames. ], tot_loss[loss=0.1535, simple_loss=0.2345, pruned_loss=0.03629, over 4730987.26 frames. ], batch size: 71, lr: 2.16e-03, grad_scale: 8.0 2023-10-04 11:47:15,791 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_5931_210_2040_1_1527428481_4547999_321-193614-0 from training. Duration: 0.67 2023-10-04 11:47:15,804 WARNING [train.py:1204] (2/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_445-269521-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 11:47:18,495 WARNING [train.py:1204] (2/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_3-33116-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 11:47:18,496 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_4633_210_2040_1_1526824163_4145343_416-76117-0_sp1.1 from training. Duration: 0.863625 2023-10-04 11:47:18,531 WARNING [train.py:1204] (2/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_1033-274376-0_sp1.1 from training. Duration: 0.7909375 2023-10-04 11:47:18,567 WARNING [train.py:1204] (2/4) Exclude cut with ID _8772_210_1850_1_1528635595972_3664011_398-320049-0_sp0.9 from training. Duration: 0.97775 2023-10-04 11:47:23,436 WARNING [train.py:1204] (2/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_321-215742-0_sp0.9 from training. Duration: 0.5555625 2023-10-04 11:47:24,813 WARNING [train.py:1204] (2/4) Exclude cut with ID _28394_210_8913_1_1532430049518_3650118_272-190950-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 11:47:26,120 WARNING [train.py:1204] (2/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_31-299371-0_sp1.1 from training. Duration: 0.92725 2023-10-04 11:47:28,790 WARNING [train.py:1204] (2/4) Exclude cut with ID _55383_210_10635_1_1534468426172_2231669_134-128246-0_sp0.9 from training. Duration: 0.988875 2023-10-04 11:47:28,792 WARNING [train.py:1204] (2/4) Exclude cut with ID _40519_210_13794_1_1533715228655_7337390_887-9547-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 11:47:28,828 WARNING [train.py:1204] (2/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_310-109461-0_sp0.9 from training. Duration: 0.888875 2023-10-04 11:47:30,387 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_2009_210_2071_1_1525481134_4588937_258-250196-0_sp1.1 from training. Duration: 0.92725 2023-10-04 11:47:32,290 WARNING [train.py:1204] (2/4) Exclude cut with ID _9235_210_5219_1_1528891154253_8046120_1186-72702-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 11:47:32,293 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_14831_210_7927_1_1530700200_5583610_181-27467-0_sp1.1 from training. Duration: 0.82725 2023-10-04 11:47:35,659 WARNING [train.py:1204] (2/4) Exclude cut with ID _40402_210_18219_1_1533522324115_2953150_278-132109-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 11:47:36,966 WARNING [train.py:1204] (2/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_261-297503-0 from training. Duration: 0.99 2023-10-04 11:47:39,857 WARNING [train.py:1204] (2/4) Exclude cut with ID _19896_210_1883_1_1531709022662_6649569_313-265903-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 11:47:41,825 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_105-114797-0_sp1.1 from training. Duration: 0.763625 2023-10-04 11:47:43,240 WARNING [train.py:1204] (2/4) Exclude cut with ID _53755_210_8254_1_1534577312941_4707360_9-65843-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 11:47:43,266 WARNING [train.py:1204] (2/4) Exclude cut with ID _18516_210_9206_1_1531704653206_3692406_361-224549-0_sp1.1 from training. Duration: 0.97275 2023-10-04 11:47:43,296 WARNING [train.py:1204] (2/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_403-310975-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 11:47:43,329 WARNING [train.py:1204] (2/4) Exclude cut with ID _5444_210_2151_1_1527418808464_5312210_581-315411-0_sp0.9 from training. Duration: 0.688875 2023-10-04 11:47:44,666 WARNING [train.py:1204] (2/4) Exclude cut with ID _23825_210_13449_1_1531915313526_3497150_464-154904-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 11:47:44,675 WARNING [train.py:1204] (2/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_937-208281-0_sp0.9 from training. Duration: 0.9444375 2023-10-04 11:47:44,743 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_37839_210_7234_1_1533542462_3689303_158-181212-0_sp1.1 from training. Duration: 0.963625 2023-10-04 11:47:46,116 WARNING [train.py:1204] (2/4) Exclude cut with ID _8772_210_1850_1_1528635595972_3664011_398-320049-0 from training. Duration: 0.88 2023-10-04 11:47:46,388 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.conv_module2.balancer2.prob, batch_count=1643520.0, ans=0.125 2023-10-04 11:47:48,794 WARNING [train.py:1204] (2/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_718-250907-0_sp1.1 from training. Duration: 0.62725 2023-10-04 11:47:50,789 WARNING [train.py:1204] (2/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_641-126395-0_sp1.1 from training. Duration: 0.92725 2023-10-04 11:47:50,813 WARNING [train.py:1204] (2/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_166-241493-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 11:47:52,029 WARNING [train.py:1204] (2/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_526-62079-0_sp0.9 from training. Duration: 0.7444375 2023-10-04 11:47:53,495 WARNING [train.py:1204] (2/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_439-275083-0_sp1.1 from training. Duration: 0.936375 2023-10-04 11:47:56,244 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_66907_210_21581_1_1535615661_3590177_493-229032-0_sp1.1 from training. Duration: 0.92725 2023-10-04 11:47:56,273 WARNING [train.py:1204] (2/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_89-321955-0_sp1.1 from training. Duration: 0.72725 2023-10-04 11:47:58,934 WARNING [train.py:1204] (2/4) Exclude cut with ID _29857_210_20034_1_1532395886753_3798160_273-226195-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 11:47:58,941 WARNING [train.py:1204] (2/4) Exclude cut with ID _22826_210_9994_1_1531908201522_3279169_404-75889-0 from training. Duration: 0.89 2023-10-04 11:47:58,974 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_271-234962-0_sp0.9 from training. Duration: 0.87775 2023-10-04 11:47:59,256 INFO [scaling.py:1118] (2/4) WithLoss: name=encoder.encoders.3.encoder.layers.3.self_attn_weights, loss-sum=0.000e+00 2023-10-04 11:48:02,973 WARNING [train.py:1204] (2/4) Exclude cut with ID _22961_210_8254_1_1531976366672_4278180_156-339685-0_sp1.1 from training. Duration: 0.97275 2023-10-04 11:48:03,018 WARNING [train.py:1204] (2/4) Exclude cut with ID _37892_210_19596_1_1533198677897_3964058_425-231927-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 11:48:04,399 WARNING [train.py:1204] (2/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_102-288670-0_sp1.1 from training. Duration: 0.97275 2023-10-04 11:48:04,483 WARNING [train.py:1204] (2/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_283-220372-0_sp1.1 from training. Duration: 0.5090625 2023-10-04 11:48:06,596 WARNING [train.py:1204] (2/4) Exclude cut with ID _44304_210_15374_1_1533639352765_4613129_86-1699-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 11:48:08,018 WARNING [train.py:1204] (2/4) Exclude cut with ID _2891_210_2550_1_1525874398521_4427710_327-121590-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 11:48:08,025 WARNING [train.py:1204] (2/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_369-222060-0 from training. Duration: 0.71 2023-10-04 11:48:09,483 WARNING [train.py:1204] (2/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_762-109886-0 from training. Duration: 0.91 2023-10-04 11:48:09,505 WARNING [train.py:1204] (2/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_477-255782-0_sp0.9 from training. Duration: 0.911125 2023-10-04 11:48:09,541 WARNING [train.py:1204] (2/4) Exclude cut with ID IC0669W0061-330906 from training. Duration: 0.768 2023-10-04 11:48:10,850 WARNING [train.py:1204] (2/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_347-213071-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 11:48:10,877 WARNING [train.py:1204] (2/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_507-303370-0_sp1.1 from training. Duration: 0.87275 2023-10-04 11:48:12,843 WARNING [train.py:1204] (2/4) Exclude cut with ID _15012_210_1968_1_1530856820429_5095350_688-178849-0 from training. Duration: 0.64 2023-10-04 11:48:12,849 WARNING [train.py:1204] (2/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_28-70987-0_sp0.9 from training. Duration: 0.9666875 2023-10-04 11:48:12,851 WARNING [train.py:1204] (2/4) Exclude cut with ID _5910_210_1389_1_1527337972454_5050100_555-159815-0 from training. Duration: 0.98 2023-10-04 11:48:12,881 WARNING [train.py:1204] (2/4) Exclude cut with ID IC0679W0064-335871 from training. Duration: 0.768 2023-10-04 11:48:12,881 WARNING [train.py:1204] (2/4) Exclude cut with ID IC0679W0079-335886 from training. Duration: 0.896 2023-10-04 11:48:12,917 WARNING [train.py:1204] (2/4) Exclude cut with ID _5608_210_2106_1_1527415439089_6819768_909-82767-0 from training. Duration: 0.61 2023-10-04 11:48:14,396 WARNING [train.py:1204] (2/4) Exclude cut with ID _23038_210_9570_1_1532001584158_3652419_411-323967-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 11:48:15,728 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_40896_210_15710_1_1533433956_7100802_350-231748-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 11:48:16,571 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.1.self_attn2.whiten.whitening_limit, batch_count=1643653.3333333333, ans=22.5 2023-10-04 11:48:17,009 WARNING [train.py:1204] (2/4) Exclude cut with ID _5289_210_2084_1_1527678202736_3779299_142-103317-0_sp0.9 from training. Duration: 0.9333125 2023-10-04 11:48:17,083 WARNING [train.py:1204] (2/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_314-144055-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 11:48:17,148 WARNING [train.py:1204] (2/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_177-236809-0_sp0.9 from training. Duration: 0.5444375 2023-10-04 11:48:18,483 WARNING [train.py:1204] (2/4) Exclude cut with ID _23038_210_9570_1_1532001584158_3652419_82-334733-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 11:48:18,492 WARNING [train.py:1204] (2/4) Exclude cut with ID _43552_210_8913_1_1533726031059_3523429_225-22526-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 11:48:27,385 WARNING [train.py:1204] (2/4) Exclude cut with ID _30327_210_16049_1_1532484161295_3924169_401-70819-0_sp1.1 from training. Duration: 0.7818125 2023-10-04 11:48:27,510 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.0.bypass.skip_rate, batch_count=1643653.3333333333, ans=0.035 2023-10-04 11:48:28,686 WARNING [train.py:1204] (2/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_538-32291-0 from training. Duration: 0.83 2023-10-04 11:48:30,049 INFO [train.py:1046] (2/4) Epoch 47, batch 2200, loss[loss=0.137, simple_loss=0.2201, pruned_loss=0.02689, over 24620.00 frames. ], tot_loss[loss=0.1532, simple_loss=0.2338, pruned_loss=0.03631, over 4727420.95 frames. ], batch size: 60, lr: 2.16e-03, grad_scale: 8.0 2023-10-04 11:48:31,517 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_37357_210_17024_1_1533089504_2316121_258-211605-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 11:48:34,221 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.776e+02 2.021e+02 2.246e+02 2.617e+02 4.385e+02, threshold=4.493e+02, percent-clipped=0.0 2023-10-04 11:48:36,272 WARNING [train.py:1204] (2/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_550-335198-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 11:48:36,320 WARNING [train.py:1204] (2/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_98-741-0_sp0.9 from training. Duration: 0.888875 2023-10-04 11:48:36,374 WARNING [train.py:1204] (2/4) Exclude cut with ID _38323_210_12108_1_1533121233369_3303049_251-238988-0_sp1.1 from training. Duration: 0.92725 2023-10-04 11:48:37,702 WARNING [train.py:1204] (2/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_259-34038-0_sp0.9 from training. Duration: 0.82225 2023-10-04 11:48:40,564 WARNING [train.py:1204] (2/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_1060-180633-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 11:48:40,607 WARNING [train.py:1204] (2/4) Exclude cut with ID _8755_210_1876_1_1528628559357_3258209_211-37473-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 11:48:40,613 WARNING [train.py:1204] (2/4) Exclude cut with ID _4122_210_2084_1_1526713164417_4353280_528-206771-0 from training. Duration: 0.88 2023-10-04 11:48:46,489 WARNING [train.py:1204] (2/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_64-248359-0 from training. Duration: 0.69 2023-10-04 11:48:47,915 WARNING [train.py:1204] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_402-253027-0_sp1.1 from training. Duration: 0.4909375 2023-10-04 11:48:54,031 WARNING [train.py:1204] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_625-102955-0 from training. Duration: 0.77 2023-10-04 11:48:55,446 WARNING [train.py:1204] (2/4) Exclude cut with ID _14998_210_5219_1_1530705582080_7474970_611-219324-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 11:48:55,523 WARNING [train.py:1204] (2/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_718-68505-0_sp0.9 from training. Duration: 0.988875 2023-10-04 11:48:56,815 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_353-182855-0_sp1.1 from training. Duration: 0.87275 2023-10-04 11:49:00,859 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_5960_210_1794_1_1527403865_3990999_515-2613-0_sp0.9 from training. Duration: 0.97775 2023-10-04 11:49:02,140 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_5575_210_1410_1_1527301305_2570170_342-265572-0 from training. Duration: 0.94 2023-10-04 11:49:05,558 WARNING [train.py:1204] (2/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_329-113173-0_sp0.9 from training. Duration: 0.688875 2023-10-04 11:49:05,655 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_15396_210_6890_1_1531308473_1938936_213-312506-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 11:49:07,444 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_521-214276-0_sp1.1 from training. Duration: 0.4 2023-10-04 11:49:10,333 WARNING [train.py:1204] (2/4) Exclude cut with ID _15026_210_8750_1_1530777602316_3291850_211-195043-0_sp1.1 from training. Duration: 0.72725 2023-10-04 11:49:13,292 WARNING [train.py:1204] (2/4) Exclude cut with ID _29343_210_4381_1_1532602757752_3674109_108-257494-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 11:49:14,799 WARNING [train.py:1204] (2/4) Exclude cut with ID _64414_210_17301_1_1535243252048_3757759_403-58151-0_sp1.1 from training. Duration: 0.963625 2023-10-04 11:49:16,210 WARNING [train.py:1204] (2/4) Exclude cut with ID _36512_210_6987_1_1533033143998_3126069_109-177787-0_sp1.1 from training. Duration: 0.92725 2023-10-04 11:49:20,125 WARNING [train.py:1204] (2/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_498-40153-0 from training. Duration: 0.63 2023-10-04 11:49:20,199 WARNING [train.py:1204] (2/4) Exclude cut with ID _56447_210_1389_1_1535074245910_3697189_161-334523-0_sp1.1 from training. Duration: 0.936375 2023-10-04 11:49:21,538 WARNING [train.py:1204] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_238-245655-0 from training. Duration: 0.81 2023-10-04 11:49:24,214 WARNING [train.py:1204] (2/4) Exclude cut with ID _64631_210_27767_1_1535349580068_4165160_108-43865-0_sp1.1 from training. Duration: 0.92725 2023-10-04 11:49:24,228 WARNING [train.py:1204] (2/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_243-42116-0_sp1.1 from training. Duration: 0.663625 2023-10-04 11:49:24,253 WARNING [train.py:1204] (2/4) Exclude cut with ID _66868_210_9527_1_1535509287954_4561140_594-132160-0_sp1.1 from training. Duration: 0.92725 2023-10-04 11:49:26,398 WARNING [train.py:1204] (2/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_48-338976-0_sp0.9 from training. Duration: 0.988875 2023-10-04 11:49:26,450 WARNING [train.py:1204] (2/4) Exclude cut with ID _5250_210_5219_1_1527249561806_8692110_137-100595-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 11:49:26,456 WARNING [train.py:1204] (2/4) Exclude cut with ID _49748_210_24116_1_1533986971737_3158910_12-271669-0_sp1.1 from training. Duration: 0.936375 2023-10-04 11:49:26,475 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_37357_210_17024_1_1533089504_2316121_206-80421-0_sp1.1 from training. Duration: 0.936375 2023-10-04 11:49:27,757 WARNING [train.py:1204] (2/4) Exclude cut with ID _7537_210_2084_1_1528502395431_3924359_113-199933-0_sp1.1 from training. Duration: 0.536375 2023-10-04 11:49:27,801 WARNING [train.py:1204] (2/4) Exclude cut with ID _55440_210_12651_1_1534496713364_2980520_31-59289-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 11:49:29,280 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1083-220122-0_sp0.9 from training. Duration: 0.5666875 2023-10-04 11:49:29,511 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.feed_forward3.hidden_balancer.prob, batch_count=1643986.6666666667, ans=0.125 2023-10-04 11:49:33,367 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_426-313557-0_sp1.1 from training. Duration: 0.4818125 2023-10-04 11:49:33,417 WARNING [train.py:1204] (2/4) Exclude cut with ID _35799_210_5854_1_1533263487887_3529071_324-94843-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 11:49:34,808 WARNING [train.py:1204] (2/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_264-108685-0_sp1.1 from training. Duration: 0.5 2023-10-04 11:49:36,088 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9012W0006-501141 from training. Duration: 0.896 2023-10-04 11:49:39,434 WARNING [train.py:1204] (2/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_890-5117-0_sp0.9 from training. Duration: 0.8666875 2023-10-04 11:49:39,485 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9023W0008-506301 from training. Duration: 0.896 2023-10-04 11:49:40,778 WARNING [train.py:1204] (2/4) Exclude cut with ID _13916_210_9994_1_1530788341126_3763149_60-191556-0_sp0.9 from training. Duration: 0.67775 2023-10-04 11:49:41,643 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.0.layers.1.self_attn2.whiten, num_groups=1, num_channels=192, metric=10.65 vs. limit=22.5 2023-10-04 11:49:42,104 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9029W0331-509721 from training. Duration: 0.896 2023-10-04 11:49:43,434 INFO [train.py:1046] (2/4) Epoch 47, batch 2250, loss[loss=0.145, simple_loss=0.2278, pruned_loss=0.03111, over 23692.00 frames. ], tot_loss[loss=0.1534, simple_loss=0.234, pruned_loss=0.03636, over 4713642.34 frames. ], batch size: 149, lr: 2.16e-03, grad_scale: 8.0 2023-10-04 11:49:43,476 WARNING [train.py:1204] (2/4) Exclude cut with ID _35823_210_6917_1_1533187881914_7297830_567-108596-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 11:49:43,521 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7543_210_6753_1_1528544010_1110402_25-183860-0_sp1.1 from training. Duration: 0.42725 2023-10-04 11:49:45,521 WARNING [train.py:1204] (2/4) Exclude cut with ID _47278_210_12041_1_1533902418349_3576110_495-260035-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 11:49:47,019 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9051W0015-520281 from training. Duration: 0.9045625 2023-10-04 11:49:48,431 WARNING [train.py:1204] (2/4) Exclude cut with ID _14459_210_2228_1_1531443641946_7516700_707-66193-0_sp1.1 from training. Duration: 0.97275 2023-10-04 11:49:49,872 WARNING [train.py:1204] (2/4) Exclude cut with ID _14087_210_2253_1_1530529224512_3014029_219-224445-0_sp1.1 from training. Duration: 0.763625 2023-10-04 11:49:55,800 WARNING [train.py:1204] (2/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_345-167986-0_sp0.9 from training. Duration: 0.9333125 2023-10-04 11:49:57,191 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_34815_210_6388_1_1533381320_3571903_279-305565-0_sp1.1 from training. Duration: 0.836375 2023-10-04 11:49:58,775 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.0.bypass_mid.scale_min, batch_count=1644120.0, ans=0.2 2023-10-04 11:49:59,946 WARNING [train.py:1204] (2/4) Exclude cut with ID _21987_210_5854_1_1531821500482_3637038_317-179212-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 11:50:00,016 WARNING [train.py:1204] (2/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_395-37222-0_sp1.1 from training. Duration: 0.6909375 2023-10-04 11:50:01,378 WARNING [train.py:1204] (2/4) Exclude cut with ID _8064_210_5399_1_1528372299291_4282973_168-50337-0_sp1.1 from training. Duration: 0.763625 2023-10-04 11:50:05,284 WARNING [train.py:1204] (2/4) Exclude cut with ID _60113_210_16271_1_1535164212481_4386079_559-167191-0 from training. Duration: 0.93 2023-10-04 11:50:05,294 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_47228_210_4983_1_1533780021_3672027_344-179642-0_sp1.1 from training. Duration: 0.92725 2023-10-04 11:50:05,318 WARNING [train.py:1204] (2/4) Exclude cut with ID _22961_210_8254_1_1531976366672_4278180_317-199546-0_sp0.9 from training. Duration: 0.9555625 2023-10-04 11:50:06,780 WARNING [train.py:1204] (2/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_141-89727-0 from training. Duration: 0.71 2023-10-04 11:50:06,856 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_78-116075-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 11:50:06,876 WARNING [train.py:1204] (2/4) Exclude cut with ID _64086_210_7994_1_1535270417019_7435949_52-235756-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 11:50:08,909 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7615_210_4211_1_1528110965_4580160_72-235211-0_sp1.1 from training. Duration: 0.6909375 2023-10-04 11:50:09,082 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.conv_skip_rate, batch_count=1644120.0, ans=0.0 2023-10-04 11:50:10,374 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.bypass.skip_rate, batch_count=1644120.0, ans=0.07 2023-10-04 11:50:14,698 WARNING [train.py:1204] (2/4) Exclude cut with ID _14069_210_5632_1_1530512692033_4803390_287-27950-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 11:50:16,075 WARNING [train.py:1204] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_74-48717-0_sp0.9 from training. Duration: 0.6444375 2023-10-04 11:50:17,373 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7544_210_6753_1_1528591327_1471426_172-348777-0_sp1.1 from training. Duration: 0.6 2023-10-04 11:50:17,627 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.conv_module1.balancer2.min_positive, batch_count=1644186.6666666667, ans=0.05 2023-10-04 11:50:18,777 WARNING [train.py:1204] (2/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_220-191080-0 from training. Duration: 0.98 2023-10-04 11:50:20,114 WARNING [train.py:1204] (2/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_1081-137641-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 11:50:22,753 WARNING [train.py:1204] (2/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_278-235459-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 11:50:24,469 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.1.conv_module2.balancer1.prob, batch_count=1644186.6666666667, ans=0.125 2023-10-04 11:50:24,565 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.feed_forward2.hidden_balancer.prob, batch_count=1644186.6666666667, ans=0.125 2023-10-04 11:50:27,101 WARNING [train.py:1204] (2/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_1181-77470-0_sp1.1 from training. Duration: 0.936375 2023-10-04 11:50:30,519 WARNING [train.py:1204] (2/4) Exclude cut with ID _60860_210_8254_1_1534896015875_3557169_198-5531-0_sp1.1 from training. Duration: 0.936375 2023-10-04 11:50:31,867 WARNING [train.py:1204] (2/4) Exclude cut with ID _12369_210_4381_1_1530356078581_4388520_17-289781-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 11:50:31,875 WARNING [train.py:1204] (2/4) Exclude cut with ID _50310_210_15488_1_1534068043345_3641080_146-199752-0_sp1.1 from training. Duration: 0.92725 2023-10-04 11:50:34,599 WARNING [train.py:1204] (2/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_969-213354-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 11:50:35,982 WARNING [train.py:1204] (2/4) Exclude cut with ID _70519_210_4163_1_1535961595994_7458230_66-344156-0_sp1.1 from training. Duration: 0.963625 2023-10-04 11:50:38,879 WARNING [train.py:1204] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_380-204756-0_sp1.1 from training. Duration: 0.7818125 2023-10-04 11:50:40,956 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_128-202104-0_sp0.9 from training. Duration: 0.7 2023-10-04 11:50:41,226 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.balancer2.prob, batch_count=1644320.0, ans=0.125 2023-10-04 11:50:45,586 WARNING [train.py:1204] (2/4) Exclude cut with ID _4697_210_2069_1_1526815801953_3823098_51-1049-0_sp0.9 from training. Duration: 0.8333125 2023-10-04 11:50:45,602 WARNING [train.py:1204] (2/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_86-66192-0_sp1.1 from training. Duration: 0.5 2023-10-04 11:50:46,963 WARNING [train.py:1204] (2/4) Exclude cut with ID _14069_210_5632_1_1530512692033_4803390_95-1896-0_sp1.1 from training. Duration: 0.8909375 2023-10-04 11:50:47,121 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.1.bypass_mid.scale_min, batch_count=1644320.0, ans=0.2 2023-10-04 11:50:52,943 WARNING [train.py:1204] (2/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_29-293896-0_sp1.1 from training. Duration: 0.5545625 2023-10-04 11:50:54,414 WARNING [train.py:1204] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_167-137255-0_sp0.9 from training. Duration: 0.711125 2023-10-04 11:50:54,415 WARNING [train.py:1204] (2/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_217-95056-0 from training. Duration: 0.98 2023-10-04 11:50:54,430 WARNING [train.py:1204] (2/4) Exclude cut with ID _47278_210_12041_1_1533902418349_3576110_97-266746-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 11:50:54,473 WARNING [train.py:1204] (2/4) Exclude cut with ID _14224_210_4381_1_1530529062037_4264010_380-343670-0_sp1.1 from training. Duration: 0.8 2023-10-04 11:50:57,181 INFO [train.py:1046] (2/4) Epoch 47, batch 2300, loss[loss=0.1434, simple_loss=0.2218, pruned_loss=0.03248, over 23607.00 frames. ], tot_loss[loss=0.1534, simple_loss=0.234, pruned_loss=0.03646, over 4711664.84 frames. ], batch size: 135, lr: 2.16e-03, grad_scale: 8.0 2023-10-04 11:50:57,299 WARNING [train.py:1204] (2/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_107-332398-0 from training. Duration: 0.78 2023-10-04 11:51:00,493 WARNING [train.py:1204] (2/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_91-267696-0_sp1.1 from training. Duration: 0.7909375 2023-10-04 11:51:00,516 WARNING [train.py:1204] (2/4) Exclude cut with ID _41298_210_9019_1_1533430371450_4822143_498-186993-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 11:51:01,758 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.761e+02 2.249e+02 2.496e+02 2.917e+02 4.902e+02, threshold=4.992e+02, percent-clipped=2.0 2023-10-04 11:51:07,396 WARNING [train.py:1204] (2/4) Exclude cut with ID _17956_210_4353_1_1531209568125_6979970_54-77610-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 11:51:07,435 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_40047_210_2290_1_1533290519_2374311_257-335162-0_sp1.1 from training. Duration: 0.863625 2023-10-04 11:51:08,862 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9653W0193-675471 from training. Duration: 0.9656875 2023-10-04 11:51:10,745 WARNING [train.py:1204] (2/4) Exclude cut with ID _25248_210_8750_1_1532062825941_5554180_243-240536-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 11:51:16,964 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_32360_210_4198_1_1532689160_3648436_60-51908-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 11:51:16,973 WARNING [train.py:1204] (2/4) Exclude cut with ID _4092_210_2151_1_1526643044449_3135759_227-213972-0_sp0.9 from training. Duration: 0.6 2023-10-04 11:51:17,001 WARNING [train.py:1204] (2/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_392-173500-0_sp1.1 from training. Duration: 0.97275 2023-10-04 11:51:18,295 WARNING [train.py:1204] (2/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_71-329150-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 11:51:18,297 WARNING [train.py:1204] (2/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_866-173440-0 from training. Duration: 0.9 2023-10-04 11:51:19,669 WARNING [train.py:1204] (2/4) Exclude cut with ID _14832_210_8750_1_1530691351539_3171348_1-54315-0_sp0.9 from training. Duration: 0.9666875 2023-10-04 11:51:21,577 WARNING [train.py:1204] (2/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_430-116139-0_sp1.1 from training. Duration: 0.77275 2023-10-04 11:51:21,602 WARNING [train.py:1204] (2/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_356-45054-0_sp0.9 from training. Duration: 0.9555625 2023-10-04 11:51:25,683 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_3531_210_3528_1_1526294203_5216456_309-227302-0_sp0.9 from training. Duration: 0.7666875 2023-10-04 11:51:28,976 WARNING [train.py:1204] (2/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_301-330455-0_sp1.1 from training. Duration: 0.72725 2023-10-04 11:51:30,516 WARNING [train.py:1204] (2/4) Exclude cut with ID _23557_210_8750_1_1531890106370_6846255_667-145945-0_sp1.1 from training. Duration: 0.92725 2023-10-04 11:51:35,446 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.0.layers.0.feed_forward2.out_whiten, num_groups=1, num_channels=192, metric=6.72 vs. limit=15.0 2023-10-04 11:51:35,879 WARNING [train.py:1204] (2/4) Exclude cut with ID _14860_210_4915_1_1530702020340_7343240_836-328489-0_sp1.1 from training. Duration: 0.8181875 2023-10-04 11:51:35,929 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_37152_210_9697_1_1533106573_3844188_416-333249-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 11:51:39,961 WARNING [train.py:1204] (2/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_538-32291-0_sp0.9 from training. Duration: 0.92225 2023-10-04 11:51:41,438 WARNING [train.py:1204] (2/4) Exclude cut with ID _32704_210_8954_1_1533017488840_6747370_965-176824-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 11:51:44,946 WARNING [train.py:1204] (2/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_86-19601-0_sp1.1 from training. Duration: 0.77275 2023-10-04 11:51:46,288 WARNING [train.py:1204] (2/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_436-318113-0_sp1.1 from training. Duration: 0.6818125 2023-10-04 11:51:46,329 WARNING [train.py:1204] (2/4) Exclude cut with ID _41817_210_18835_1_1533434477160_7423049_555-293448-0_sp0.9 from training. Duration: 0.888875 2023-10-04 11:51:47,624 WARNING [train.py:1204] (2/4) Exclude cut with ID _4697_210_2069_1_1526815801953_3823098_68-219807-0 from training. Duration: 0.47 2023-10-04 11:51:50,402 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7544_210_6753_1_1528591327_1471426_33-346031-0_sp1.1 from training. Duration: 0.5545625 2023-10-04 11:51:50,404 WARNING [train.py:1204] (2/4) Exclude cut with ID _49748_210_24116_1_1533986971737_3158910_263-52113-0_sp1.1 from training. Duration: 0.97275 2023-10-04 11:51:50,451 WARNING [train.py:1204] (2/4) Exclude cut with ID _64639_210_5816_1_1535367619437_3528000_340-24580-0_sp1.1 from training. Duration: 0.92725 2023-10-04 11:51:50,466 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_47228_210_4983_1_1533780021_3672027_479-114941-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 11:51:50,493 WARNING [train.py:1204] (2/4) Exclude cut with ID _44585_210_18628_1_1533605350387_4518199_506-119659-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 11:51:52,202 WARNING [train.py:1204] (2/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_314-126819-0_sp1.1 from training. Duration: 0.4545625 2023-10-04 11:51:52,206 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_2541_210_1794_1_1525590269_3084765_156-173713-0_sp0.9 from training. Duration: 0.9 2023-10-04 11:51:52,265 WARNING [train.py:1204] (2/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_108-255430-0 from training. Duration: 0.97 2023-10-04 11:51:52,273 WARNING [train.py:1204] (2/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_791-45149-0_sp1.1 from training. Duration: 0.7818125 2023-10-04 11:51:52,278 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_53378_210_17282_1_1534227054_1261425_10-118295-0_sp1.1 from training. Duration: 0.97275 2023-10-04 11:51:53,693 WARNING [train.py:1204] (2/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_148-158509-0 from training. Duration: 0.41 2023-10-04 11:51:59,692 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_34726_210_4525_1_1533019474_5030395_687-291305-0_sp1.1 from training. Duration: 0.936375 2023-10-04 11:52:05,039 WARNING [train.py:1204] (2/4) Exclude cut with ID _14860_210_4915_1_1530702020340_7343240_264-324537-0_sp1.1 from training. Duration: 0.963625 2023-10-04 11:52:07,937 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_41251_210_15368_1_1533429767_4820695_591-46386-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 11:52:08,169 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.bypass_mid.scale_min, batch_count=1644653.3333333333, ans=0.2 2023-10-04 11:52:08,428 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.self_attn1.whiten.whitening_limit, batch_count=1644653.3333333333, ans=22.5 2023-10-04 11:52:09,270 WARNING [train.py:1204] (2/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_86-19601-0_sp0.9 from training. Duration: 0.9444375 2023-10-04 11:52:09,291 WARNING [train.py:1204] (2/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_401-255491-0_sp0.9 from training. Duration: 0.77775 2023-10-04 11:52:10,658 INFO [train.py:1046] (2/4) Epoch 47, batch 2350, loss[loss=0.1632, simple_loss=0.2494, pruned_loss=0.03847, over 24464.00 frames. ], tot_loss[loss=0.1546, simple_loss=0.2353, pruned_loss=0.03693, over 4705114.91 frames. ], batch size: 63, lr: 2.16e-03, grad_scale: 8.0 2023-10-04 11:52:10,761 WARNING [train.py:1204] (2/4) Exclude cut with ID _8696_210_1876_1_1528545632440_3345058_209-54069-0_sp1.1 from training. Duration: 0.6090625 2023-10-04 11:52:10,776 WARNING [train.py:1204] (2/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_369-325184-0_sp1.1 from training. Duration: 0.9 2023-10-04 11:52:12,644 WARNING [train.py:1204] (2/4) Exclude cut with ID _14687_210_5854_1_1530703838259_3664889_115-239046-0_sp0.9 from training. Duration: 0.7444375 2023-10-04 11:52:12,690 WARNING [train.py:1204] (2/4) Exclude cut with ID _8064_210_5399_1_1528372299291_4282973_168-50337-0 from training. Duration: 0.84 2023-10-04 11:52:15,025 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.ff3_skip_rate, batch_count=1644720.0, ans=0.0 2023-10-04 11:52:17,441 WARNING [train.py:1204] (2/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_909-64151-0_sp1.1 from training. Duration: 0.8545625 2023-10-04 11:52:18,810 WARNING [train.py:1204] (2/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_597-154640-0 from training. Duration: 0.9 2023-10-04 11:52:20,267 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder_embed.convnext.layerdrop_rate, batch_count=1644720.0, ans=0.015 2023-10-04 11:52:23,311 WARNING [train.py:1204] (2/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_126-274694-0 from training. Duration: 0.98 2023-10-04 11:52:24,792 WARNING [train.py:1204] (2/4) Exclude cut with ID _21783_210_11357_1_1531742458730_3762679_344-269786-0_sp1.1 from training. Duration: 0.97275 2023-10-04 11:52:26,248 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.ff3_skip_rate, batch_count=1644786.6666666667, ans=0.0 2023-10-04 11:52:28,767 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_37818_210_4238_1_1533620910_4720083_322-253154-0_sp1.1 from training. Duration: 0.92725 2023-10-04 11:52:28,770 WARNING [train.py:1204] (2/4) Exclude cut with ID _58895_210_8750_1_1535184081320_3649089_221-15823-0_sp1.1 from training. Duration: 0.92725 2023-10-04 11:52:28,780 WARNING [train.py:1204] (2/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_9-21737-0_sp1.1 from training. Duration: 0.8818125 2023-10-04 11:52:28,833 WARNING [train.py:1204] (2/4) Exclude cut with ID _14069_210_5632_1_1530512692033_4803390_522-341588-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 11:52:30,630 WARNING [train.py:1204] (2/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_363-185703-0 from training. Duration: 0.95 2023-10-04 11:52:33,392 WARNING [train.py:1204] (2/4) Exclude cut with ID _41817_210_18835_1_1533434477160_7423049_265-76216-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 11:52:38,276 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.3.encoder.layers.1.conv_module1.whiten, num_groups=1, num_channels=512, metric=3.25 vs. limit=15.0 2023-10-04 11:52:38,863 WARNING [train.py:1204] (2/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_841-341679-0 from training. Duration: 0.58 2023-10-04 11:52:40,314 WARNING [train.py:1204] (2/4) Exclude cut with ID _55429_210_10626_1_1534467588527_7174143_97-335084-0_sp1.1 from training. Duration: 0.8818125 2023-10-04 11:52:42,341 WARNING [train.py:1204] (2/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_690-126306-0_sp0.9 from training. Duration: 0.8666875 2023-10-04 11:52:42,349 WARNING [train.py:1204] (2/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_102-92751-0_sp1.1 from training. Duration: 0.9 2023-10-04 11:52:45,744 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_3787_210_2040_1_1526477544_5478586_603-120324-0_sp1.1 from training. Duration: 0.736375 2023-10-04 11:52:45,868 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_33348_210_5632_1_1533086912_3324030_85-28583-0 from training. Duration: 0.82 2023-10-04 11:52:47,201 WARNING [train.py:1204] (2/4) Exclude cut with ID _60057_210_10640_1_1534903065793_3333150_362-230377-0_sp1.1 from training. Duration: 0.7909375 2023-10-04 11:52:50,408 WARNING [train.py:1204] (2/4) Exclude cut with ID _60113_210_16271_1_1535164212481_4386079_194-146486-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 11:52:50,420 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_53753_210_7234_1_1534320003_3594148_171-277668-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 11:52:50,448 WARNING [train.py:1204] (2/4) Exclude cut with ID _34423_210_8750_1_1533430798456_3330199_251-332788-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 11:52:51,953 WARNING [train.py:1204] (2/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_663-166973-0_sp0.9 from training. Duration: 0.888875 2023-10-04 11:52:53,411 WARNING [train.py:1204] (2/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_927-43827-0 from training. Duration: 0.74 2023-10-04 11:52:54,813 WARNING [train.py:1204] (2/4) Exclude cut with ID _28323_210_16187_1_1532480395667_4100300_306-74561-0_sp1.1 from training. Duration: 0.8545625 2023-10-04 11:52:56,311 WARNING [train.py:1204] (2/4) Exclude cut with ID _15037_210_2253_1_1530752294104_3267650_139-337989-0_sp1.1 from training. Duration: 0.92725 2023-10-04 11:52:56,324 WARNING [train.py:1204] (2/4) Exclude cut with ID _49039_210_6315_1_1533967537541_3846118_460-263670-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 11:52:58,338 WARNING [train.py:1204] (2/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_186-309161-0 from training. Duration: 0.54 2023-10-04 11:52:59,654 WARNING [train.py:1204] (2/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_87-278686-0_sp1.1 from training. Duration: 0.7 2023-10-04 11:53:01,141 WARNING [train.py:1204] (2/4) Exclude cut with ID _27052_210_5986_1_1532329197187_3585009_314-106499-0 from training. Duration: 0.92 2023-10-04 11:53:01,163 WARNING [train.py:1204] (2/4) Exclude cut with ID _3465_210_3525_1_1526175667060_3574049_412-287526-0_sp0.9 from training. Duration: 0.8 2023-10-04 11:53:04,051 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.1.conv_skip_rate, batch_count=1644920.0, ans=0.0 2023-10-04 11:53:06,649 WARNING [train.py:1204] (2/4) Exclude cut with ID _22961_210_8254_1_1531976366672_4278180_317-199546-0 from training. Duration: 0.86 2023-10-04 11:53:11,185 WARNING [train.py:1204] (2/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_381-317791-0 from training. Duration: 0.73 2023-10-04 11:53:12,521 WARNING [train.py:1204] (2/4) Exclude cut with ID _44010_210_4525_1_1533884262964_4013620_560-310411-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 11:53:12,528 WARNING [train.py:1204] (2/4) Exclude cut with ID _5294_210_1747_1_1527249643928_3696179_342-270278-0_sp0.9 from training. Duration: 0.611125 2023-10-04 11:53:12,548 WARNING [train.py:1204] (2/4) Exclude cut with ID ID0473W0002-918951 from training. Duration: 0.924 2023-10-04 11:53:12,565 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1083-220122-0 from training. Duration: 0.51 2023-10-04 11:53:15,933 WARNING [train.py:1204] (2/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_138-154812-0 from training. Duration: 0.78 2023-10-04 11:53:18,742 WARNING [train.py:1204] (2/4) Exclude cut with ID _44982_210_1511_1_1534312820108_3726029_331-19420-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 11:53:22,060 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_2655_210_2087_1_1525688959_3897400_202-147557-0_sp0.9 from training. Duration: 0.9444375 2023-10-04 11:53:23,574 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_23850_210_4238_1_1532232597_1632433_32-185135-0_sp0.9 from training. Duration: 0.9555625 2023-10-04 11:53:25,085 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_14056_210_4163_1_1530604483_7572827_46-93547-0_sp1.1 from training. Duration: 0.863625 2023-10-04 11:53:25,130 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7931_210_1794_1_1528594728_5564059_280-279560-0 from training. Duration: 0.98 2023-10-04 11:53:25,166 WARNING [train.py:1204] (2/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_937-208281-0 from training. Duration: 0.85 2023-10-04 11:53:26,464 INFO [train.py:1046] (2/4) Epoch 47, batch 2400, loss[loss=0.1656, simple_loss=0.2327, pruned_loss=0.04929, over 23748.00 frames. ], tot_loss[loss=0.1547, simple_loss=0.2354, pruned_loss=0.037, over 4712749.90 frames. ], batch size: 179, lr: 2.16e-03, grad_scale: 16.0 2023-10-04 11:53:30,966 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.780e+02 2.104e+02 2.348e+02 2.668e+02 4.204e+02, threshold=4.695e+02, percent-clipped=0.0 2023-10-04 11:53:32,342 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_14928_210_5632_1_1530773461_4714000_454-112198-0_sp0.9 from training. Duration: 0.7333125 2023-10-04 11:53:32,343 WARNING [train.py:1204] (2/4) Exclude cut with ID _32704_210_8954_1_1533017488840_6747370_944-153333-0_sp1.1 from training. Duration: 0.963625 2023-10-04 11:53:35,156 WARNING [train.py:1204] (2/4) Exclude cut with ID _14821_210_8750_1_1530680503991_6829359_2-227151-0 from training. Duration: 0.83 2023-10-04 11:53:35,180 WARNING [train.py:1204] (2/4) Exclude cut with ID _28461_210_2253_1_1532771775189_3041299_96-152158-0_sp1.1 from training. Duration: 0.77275 2023-10-04 11:53:35,348 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=1645053.3333333333, ans=0.1 2023-10-04 11:53:36,545 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7837_210_1792_1_1528453445_5841305_654-54195-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 11:53:37,078 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.5.encoder.layers.0.whiten, num_groups=1, num_channels=256, metric=3.27 vs. limit=12.0 2023-10-04 11:53:38,314 WARNING [train.py:1204] (2/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_344-152526-0 from training. Duration: 0.53 2023-10-04 11:53:38,625 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.balancer1.prob, batch_count=1645053.3333333333, ans=0.125 2023-10-04 11:53:43,298 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_30167_210_3528_1_1532428012_4772375_480-142230-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 11:53:46,007 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_160-218721-0 from training. Duration: 0.96 2023-10-04 11:53:50,131 WARNING [train.py:1204] (2/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_65-88242-0_sp0.9 from training. Duration: 0.62225 2023-10-04 11:53:55,114 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_34886_210_10633_1_1533203882_1755661_76-156096-0 from training. Duration: 0.87 2023-10-04 11:53:56,524 WARNING [train.py:1204] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_188-38388-0_sp1.1 from training. Duration: 0.92725 2023-10-04 11:53:59,676 WARNING [train.py:1204] (2/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_81-289335-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 11:54:05,202 WARNING [train.py:1204] (2/4) Exclude cut with ID _59493_210_17282_1_1535078419077_3225879_110-8778-0_sp1.1 from training. Duration: 0.936375 2023-10-04 11:54:05,243 WARNING [train.py:1204] (2/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_188-260011-0 from training. Duration: 0.73 2023-10-04 11:54:06,569 WARNING [train.py:1204] (2/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_453-133983-0_sp0.9 from training. Duration: 0.8666875 2023-10-04 11:54:15,642 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8054_210_7927_1_1528627721_4984094_553-17379-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 11:54:17,093 WARNING [train.py:1204] (2/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_341-171072-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 11:54:19,990 WARNING [train.py:1204] (2/4) Exclude cut with ID _30800_210_22382_1_1532499347904_4515959_265-257286-0_sp1.1 from training. Duration: 0.963625 2023-10-04 11:54:21,324 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_403-39619-0_sp0.9 from training. Duration: 0.9333125 2023-10-04 11:54:21,328 WARNING [train.py:1204] (2/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_464-33931-0_sp0.9 from training. Duration: 0.6 2023-10-04 11:54:21,362 WARNING [train.py:1204] (2/4) Exclude cut with ID _60804_210_9642_1_1534831199859_7268299_632-143863-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 11:54:21,369 WARNING [train.py:1204] (2/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_431-310997-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 11:54:22,717 WARNING [train.py:1204] (2/4) Exclude cut with ID _31649_210_6228_1_1532584608063_4091078_114-263860-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 11:54:22,731 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6961_210_2087_1_1527937168_6815011_562-165518-0_sp0.9 from training. Duration: 0.8555625 2023-10-04 11:54:27,499 WARNING [train.py:1204] (2/4) Exclude cut with ID _27052_210_5986_1_1532329197187_3585009_338-325870-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 11:54:27,567 WARNING [train.py:1204] (2/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_469-94673-0_sp1.1 from training. Duration: 0.7181875 2023-10-04 11:54:27,589 WARNING [train.py:1204] (2/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_259-34038-0 from training. Duration: 0.74 2023-10-04 11:54:29,192 WARNING [train.py:1204] (2/4) Exclude cut with ID _14922_210_6228_1_1530707435273_3693310_388-129552-0 from training. Duration: 0.96 2023-10-04 11:54:30,599 WARNING [train.py:1204] (2/4) Exclude cut with ID _28394_210_8913_1_1532430049518_3650118_342-251455-0_sp1.1 from training. Duration: 0.936375 2023-10-04 11:54:30,630 WARNING [train.py:1204] (2/4) Exclude cut with ID _53731_210_7994_1_1534553979244_3757214_8-191390-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 11:54:30,797 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.feed_forward1.out_proj.dropout_p, batch_count=1645320.0, ans=0.1 2023-10-04 11:54:31,371 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.2.encoder.layers.1.self_attn2.whiten, num_groups=1, num_channels=384, metric=15.91 vs. limit=22.5 2023-10-04 11:54:31,900 WARNING [train.py:1204] (2/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_613-232856-0 from training. Duration: 0.54 2023-10-04 11:54:33,209 WARNING [train.py:1204] (2/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_557-349534-0 from training. Duration: 0.91 2023-10-04 11:54:33,218 WARNING [train.py:1204] (2/4) Exclude cut with ID _14224_210_4381_1_1530529062037_4264010_379-184223-0 from training. Duration: 0.85 2023-10-04 11:54:33,221 WARNING [train.py:1204] (2/4) Exclude cut with ID IC0125W0001-61807 from training. Duration: 0.966 2023-10-04 11:54:35,094 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_229-45300-0 from training. Duration: 0.57 2023-10-04 11:54:36,470 WARNING [train.py:1204] (2/4) Exclude cut with ID _30304_210_6319_1_1532476827261_7582060_1069-24743-0_sp1.1 from training. Duration: 0.92725 2023-10-04 11:54:37,878 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_41452_210_7291_1_1533437687_3941281_323-172873-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 11:54:37,895 WARNING [train.py:1204] (2/4) Exclude cut with ID _37086_210_9038_1_1532999540891_7021250_802-226709-0_sp1.1 from training. Duration: 0.97275 2023-10-04 11:54:39,217 WARNING [train.py:1204] (2/4) Exclude cut with ID IC0144W0500-71767 from training. Duration: 0.931875 2023-10-04 11:54:39,281 WARNING [train.py:1204] (2/4) Exclude cut with ID _39846_210_12651_1_1533605310265_2115212_233-129699-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 11:54:40,507 INFO [train.py:1046] (2/4) Epoch 47, batch 2450, loss[loss=0.1519, simple_loss=0.2306, pruned_loss=0.03657, over 23495.00 frames. ], tot_loss[loss=0.1538, simple_loss=0.2344, pruned_loss=0.03654, over 4716133.49 frames. ], batch size: 134, lr: 2.16e-03, grad_scale: 8.0 2023-10-04 11:54:40,574 WARNING [train.py:1204] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_574-45541-0_sp0.9 from training. Duration: 0.72225 2023-10-04 11:54:43,400 WARNING [train.py:1204] (2/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_372-298093-0_sp1.1 from training. Duration: 0.72725 2023-10-04 11:54:43,421 WARNING [train.py:1204] (2/4) Exclude cut with ID _62574_210_2655_1_1535180407058_3933249_34-26343-0_sp1.1 from training. Duration: 0.97275 2023-10-04 11:54:46,843 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_512-256238-0_sp1.1 from training. Duration: 0.963625 2023-10-04 11:54:46,855 WARNING [train.py:1204] (2/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_196-55502-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 11:54:48,160 WARNING [train.py:1204] (2/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_195-167011-0 from training. Duration: 0.75 2023-10-04 11:54:54,016 WARNING [train.py:1204] (2/4) Exclude cut with ID _13678_210_7497_1_1530530069226_3463170_345-230416-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 11:54:54,027 WARNING [train.py:1204] (2/4) Exclude cut with ID _51078_210_9070_1_1534161557424_3805250_225-327809-0_sp1.1 from training. Duration: 0.963625 2023-10-04 11:54:55,498 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.ff3_skip_rate, batch_count=1645453.3333333333, ans=0.0 2023-10-04 11:54:58,556 WARNING [train.py:1204] (2/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_18-189198-0_sp1.1 from training. Duration: 0.8181875 2023-10-04 11:54:58,567 WARNING [train.py:1204] (2/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_329-82235-0_sp1.1 from training. Duration: 0.7454375 2023-10-04 11:54:58,575 WARNING [train.py:1204] (2/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_726-270780-0_sp0.9 from training. Duration: 0.9555625 2023-10-04 11:54:58,616 WARNING [train.py:1204] (2/4) Exclude cut with ID _66873_210_5517_1_1535454136506_4293370_654-52713-0 from training. Duration: 0.77 2023-10-04 11:55:01,508 WARNING [train.py:1204] (2/4) Exclude cut with ID _20455_210_5219_1_1531565996775_8098100_191-16006-0_sp1.1 from training. Duration: 0.963625 2023-10-04 11:55:04,779 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_3171_210_2087_1_1526109084_2330007_66-260583-0_sp1.1 from training. Duration: 0.6545625 2023-10-04 11:55:04,819 WARNING [train.py:1204] (2/4) Exclude cut with ID _38670_210_15379_1_1533448810659_4391059_63-129006-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 11:55:10,112 WARNING [train.py:1204] (2/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_393-57888-0_sp0.9 from training. Duration: 0.711125 2023-10-04 11:55:10,150 WARNING [train.py:1204] (2/4) Exclude cut with ID _6275_210_2084_1_1528110062213_7669049_857-298942-0_sp0.9 from training. Duration: 0.9444375 2023-10-04 11:55:11,638 WARNING [train.py:1204] (2/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_239-22076-0_sp0.9 from training. Duration: 0.9444375 2023-10-04 11:55:11,671 WARNING [train.py:1204] (2/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_424-227947-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 11:55:11,911 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.feed_forward1.out_proj.dropout_p, batch_count=1645520.0, ans=0.1 2023-10-04 11:55:13,080 WARNING [train.py:1204] (2/4) Exclude cut with ID _14069_210_5632_1_1530512692033_4803390_95-1896-0 from training. Duration: 0.98 2023-10-04 11:55:14,500 WARNING [train.py:1204] (2/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_96-244299-0_sp0.9 from training. Duration: 0.97775 2023-10-04 11:55:18,706 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.balancer1.prob, batch_count=1645520.0, ans=0.125 2023-10-04 11:55:22,564 WARNING [train.py:1204] (2/4) Exclude cut with ID _46683_210_9642_1_1533730913492_4591116_562-319825-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 11:55:23,980 WARNING [train.py:1204] (2/4) Exclude cut with ID _64639_210_5816_1_1535367619437_3528000_284-173474-0_sp1.1 from training. Duration: 0.963625 2023-10-04 11:55:24,003 WARNING [train.py:1204] (2/4) Exclude cut with ID _29529_210_7234_1_1532393961455_3977460_497-100133-0_sp1.1 from training. Duration: 0.936375 2023-10-04 11:55:25,167 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_12868_210_6753_1_1530701932_530275_18-76322-0_sp0.9 from training. Duration: 0.9666875 2023-10-04 11:55:25,213 WARNING [train.py:1204] (2/4) Exclude cut with ID _50093_210_4220_1_1534417141207_3432299_116-303447-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 11:55:25,895 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.3.encoder.layers.1.feed_forward2.out_whiten, num_groups=1, num_channels=512, metric=5.68 vs. limit=15.0 2023-10-04 11:55:26,646 WARNING [train.py:1204] (2/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_44-79427-0_sp1.1 from training. Duration: 0.8909375 2023-10-04 11:55:26,696 WARNING [train.py:1204] (2/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_887-249555-0 from training. Duration: 0.77 2023-10-04 11:55:29,974 WARNING [train.py:1204] (2/4) Exclude cut with ID _28461_210_2253_1_1532771775189_3041299_96-152158-0_sp0.9 from training. Duration: 0.9444375 2023-10-04 11:55:31,371 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_14058_210_4163_1_1530690953_6327180_49-210687-0_sp1.1 from training. Duration: 0.8545625 2023-10-04 11:55:34,137 WARNING [train.py:1204] (2/4) Exclude cut with ID _40399_210_4353_1_1533305658194_3279406_351-127903-0_sp1.1 from training. Duration: 0.97275 2023-10-04 11:55:34,149 WARNING [train.py:1204] (2/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_184-206939-0_sp1.1 from training. Duration: 0.936375 2023-10-04 11:55:37,791 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.conv_module2.balancer2.prob, batch_count=1645586.6666666667, ans=0.125 2023-10-04 11:55:39,085 WARNING [train.py:1204] (2/4) Exclude cut with ID _14100_210_8341_1_1530529320613_3678069_258-224599-0_sp1.1 from training. Duration: 0.7 2023-10-04 11:55:39,103 WARNING [train.py:1204] (2/4) Exclude cut with ID _6493_210_1747_1_1528459210424_4012470_324-323854-0 from training. Duration: 0.92 2023-10-04 11:55:40,425 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_12868_210_6753_1_1530702479_3610564_151-325626-0_sp1.1 from training. Duration: 0.8818125 2023-10-04 11:55:41,791 WARNING [train.py:1204] (2/4) Exclude cut with ID _14528_210_8254_1_1530669700514_4306939_106-43145-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 11:55:41,815 WARNING [train.py:1204] (2/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_686-54383-0 from training. Duration: 0.87 2023-10-04 11:55:43,093 WARNING [train.py:1204] (2/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_801-102362-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 11:55:43,164 WARNING [train.py:1204] (2/4) Exclude cut with ID _48677_210_5663_1_1533902405919_3892989_408-40740-0_sp0.9 from training. Duration: 0.92225 2023-10-04 11:55:47,843 WARNING [train.py:1204] (2/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_448-212135-0_sp1.1 from training. Duration: 0.82725 2023-10-04 11:55:50,265 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.3.encoder.layers.2.feed_forward2.out_whiten, num_groups=1, num_channels=512, metric=9.49 vs. limit=15.0 2023-10-04 11:55:51,128 WARNING [train.py:1204] (2/4) Exclude cut with ID _59170_210_6721_1_1534983633960_4203335_320-124047-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 11:55:52,407 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_4692_210_3528_1_1526899755_4323254_107-44401-0_sp1.1 from training. Duration: 0.7818125 2023-10-04 11:55:54,084 WARNING [train.py:1204] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_731-2428-0 from training. Duration: 0.83 2023-10-04 11:55:55,340 INFO [train.py:1046] (2/4) Epoch 47, batch 2500, loss[loss=0.1605, simple_loss=0.2398, pruned_loss=0.0406, over 24611.00 frames. ], tot_loss[loss=0.1532, simple_loss=0.2338, pruned_loss=0.03629, over 4701273.06 frames. ], batch size: 60, lr: 2.16e-03, grad_scale: 8.0 2023-10-04 11:55:55,395 WARNING [train.py:1204] (2/4) Exclude cut with ID _29857_210_20034_1_1532395886753_3798160_335-163562-0_sp1.1 from training. Duration: 0.77275 2023-10-04 11:55:59,721 WARNING [train.py:1204] (2/4) Exclude cut with ID _14832_210_8750_1_1530691351539_3171348_171-275004-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 11:56:01,447 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.662e+02 2.014e+02 2.322e+02 2.787e+02 4.599e+02, threshold=4.643e+02, percent-clipped=0.0 2023-10-04 11:56:08,142 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.conv_module2.balancer1.prob, batch_count=1645720.0, ans=0.125 2023-10-04 11:56:09,229 WARNING [train.py:1204] (2/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_105-328174-0_sp0.9 from training. Duration: 0.8444375 2023-10-04 11:56:09,254 WARNING [train.py:1204] (2/4) Exclude cut with ID _68965_210_1883_1_1535698962272_3497490_164-118421-0_sp1.1 from training. Duration: 0.936375 2023-10-04 11:56:10,457 WARNING [train.py:1204] (2/4) Exclude cut with ID _19896_210_1883_1_1531709022662_6649569_380-24170-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 11:56:10,462 WARNING [train.py:1204] (2/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_505-205270-0_sp1.1 from training. Duration: 0.3454375 2023-10-04 11:56:16,491 WARNING [train.py:1204] (2/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_427-208901-0_sp1.1 from training. Duration: 0.6181875 2023-10-04 11:56:17,874 WARNING [train.py:1204] (2/4) Exclude cut with ID _23818_210_6228_1_1531917063884_3676920_85-326916-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 11:56:19,125 WARNING [train.py:1204] (2/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_101-293367-0_sp1.1 from training. Duration: 0.57275 2023-10-04 11:56:19,141 WARNING [train.py:1204] (2/4) Exclude cut with ID _5409_210_1388_1_1527292850189_5865191_178-127970-0_sp1.1 from training. Duration: 0.5181875 2023-10-04 11:56:19,185 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_403-39619-0 from training. Duration: 0.84 2023-10-04 11:56:21,864 WARNING [train.py:1204] (2/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_427-87841-0_sp1.1 from training. Duration: 0.963625 2023-10-04 11:56:21,918 WARNING [train.py:1204] (2/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_286-68181-0_sp1.1 from training. Duration: 0.97275 2023-10-04 11:56:23,333 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6673_210_3273_1_1527767790_3173240_327-306185-0 from training. Duration: 0.83 2023-10-04 11:56:23,348 WARNING [train.py:1204] (2/4) Exclude cut with ID _3675_210_4557_1_1526639934408_4182089_106-203713-0_sp1.1 from training. Duration: 0.963625 2023-10-04 11:56:23,407 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_53071_210_16554_1_1534493434_1276796_130-36076-0 from training. Duration: 0.95 2023-10-04 11:56:23,442 WARNING [train.py:1204] (2/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_586-93054-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 11:56:27,682 WARNING [train.py:1204] (2/4) Exclude cut with ID _23825_210_13449_1_1531915313526_3497150_262-10571-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 11:56:27,753 WARNING [train.py:1204] (2/4) Exclude cut with ID _28783_210_7943_1_1532671164808_7155479_432-67885-0_sp1.1 from training. Duration: 0.97275 2023-10-04 11:56:28,415 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.3.encoder.layers.0.self_attn1.whiten, num_groups=1, num_channels=512, metric=14.20 vs. limit=22.5 2023-10-04 11:56:32,287 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6287_210_3273_1_1527509506_2653716_182-199887-0_sp0.9 from training. Duration: 0.5666875 2023-10-04 11:56:32,346 WARNING [train.py:1204] (2/4) Exclude cut with ID _3231_210_2106_1_1526205864934_6784220_171-256004-0 from training. Duration: 0.86 2023-10-04 11:56:32,400 WARNING [train.py:1204] (2/4) Exclude cut with ID _69051_210_1531_1_1535885566901_3829538_332-92090-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 11:56:32,550 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.feed_forward1.hidden_balancer.prob, batch_count=1645853.3333333333, ans=0.125 2023-10-04 11:56:33,721 WARNING [train.py:1204] (2/4) Exclude cut with ID _3675_210_4557_1_1526639934408_4182089_104-324125-0_sp1.1 from training. Duration: 0.963625 2023-10-04 11:56:38,489 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_41764_210_2521_1_1533686110_4089313_501-2119-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 11:56:41,215 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_14056_210_4163_1_1530604483_7572827_56-100988-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 11:56:43,988 WARNING [train.py:1204] (2/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_398-239819-0_sp1.1 from training. Duration: 0.9 2023-10-04 11:56:50,393 WARNING [train.py:1204] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_74-48717-0_sp1.1 from training. Duration: 0.52725 2023-10-04 11:56:54,420 WARNING [train.py:1204] (2/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_177-49092-0 from training. Duration: 0.91 2023-10-04 11:56:54,440 WARNING [train.py:1204] (2/4) Exclude cut with ID _62337_210_12392_1_1535009273105_3412229_44-176353-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 11:56:54,449 WARNING [train.py:1204] (2/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_110-130961-0_sp1.1 from training. Duration: 0.42725 2023-10-04 11:56:54,719 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.conv_module1.balancer2.min_abs, batch_count=1645986.6666666667, ans=0.5 2023-10-04 11:56:56,046 WARNING [train.py:1204] (2/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_37-189262-0_sp1.1 from training. Duration: 0.8909375 2023-10-04 11:56:56,047 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_3172_210_2087_1_1526292929_3987865_197-97762-0_sp0.9 from training. Duration: 0.8333125 2023-10-04 11:56:57,285 WARNING [train.py:1204] (2/4) Exclude cut with ID IC0677W0065-334897 from training. Duration: 0.896 2023-10-04 11:56:57,286 WARNING [train.py:1204] (2/4) Exclude cut with ID IC0677W0080-334912 from training. Duration: 0.768 2023-10-04 11:56:57,298 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_120-41668-0 from training. Duration: 0.7 2023-10-04 11:57:01,421 WARNING [train.py:1204] (2/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_460-205287-0_sp1.1 from training. Duration: 0.963625 2023-10-04 11:57:01,562 WARNING [train.py:1204] (2/4) Exclude cut with ID _14632_210_9988_1_1530691279392_3923200_102-204477-0 from training. Duration: 0.97 2023-10-04 11:57:01,567 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_34815_210_6388_1_1533381320_3571903_279-305565-0 from training. Duration: 0.92 2023-10-04 11:57:01,833 INFO [scaling.py:1118] (2/4) WithLoss: name=encoder.encoders.3.encoder.layers.3.self_attn_weights, loss-sum=0.000e+00 2023-10-04 11:57:02,874 WARNING [train.py:1204] (2/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_91-148831-0_sp1.1 from training. Duration: 0.9 2023-10-04 11:57:02,946 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6291_210_3273_1_1527683649_1078498_82-221196-0 from training. Duration: 0.76 2023-10-04 11:57:05,152 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.feed_forward1.hidden_balancer.prob, batch_count=1645986.6666666667, ans=0.125 2023-10-04 11:57:07,754 WARNING [train.py:1204] (2/4) Exclude cut with ID _38020_210_5986_1_1533112252259_7006089_493-102745-0 from training. Duration: 0.98 2023-10-04 11:57:08,881 INFO [train.py:1046] (2/4) Epoch 47, batch 2550, loss[loss=0.19, simple_loss=0.2497, pruned_loss=0.06517, over 19218.00 frames. ], tot_loss[loss=0.1536, simple_loss=0.2343, pruned_loss=0.03648, over 4704071.69 frames. ], batch size: 389, lr: 2.16e-03, grad_scale: 8.0 2023-10-04 11:57:10,795 WARNING [train.py:1204] (2/4) Exclude cut with ID _61133_210_9969_1_1535196777227_3696409_40-93442-0_sp1.1 from training. Duration: 0.92725 2023-10-04 11:57:12,207 WARNING [train.py:1204] (2/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_45-258852-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 11:57:13,618 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_53071_210_16554_1_1534493434_1276796_130-36076-0_sp1.1 from training. Duration: 0.863625 2023-10-04 11:57:16,361 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8694_210_6400_1_1529065754_3260756_332-13553-0_sp1.1 from training. Duration: 0.92725 2023-10-04 11:57:16,441 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_405-339986-0 from training. Duration: 0.51 2023-10-04 11:57:17,718 WARNING [train.py:1204] (2/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_927-43827-0_sp1.1 from training. Duration: 0.67275 2023-10-04 11:57:22,558 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_397-147196-0 from training. Duration: 0.8 2023-10-04 11:57:23,871 WARNING [train.py:1204] (2/4) Exclude cut with ID 3557-8342-0013-54691-0_sp1.1 from training. Duration: 0.836375 2023-10-04 11:57:25,335 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_47438_210_5399_1_1533797471_4513356_626-127722-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 11:57:28,123 WARNING [train.py:1204] (2/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_256-323883-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 11:57:28,127 WARNING [train.py:1204] (2/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_839-173017-0_sp0.9 from training. Duration: 0.4555625 2023-10-04 11:57:28,165 WARNING [train.py:1204] (2/4) Exclude cut with ID _5289_210_2084_1_1527678202736_3779299_392-261988-0_sp1.1 from training. Duration: 0.5909375 2023-10-04 11:57:29,347 WARNING [train.py:1204] (2/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_119-224384-0_sp1.1 from training. Duration: 0.936375 2023-10-04 11:57:29,378 WARNING [train.py:1204] (2/4) Exclude cut with ID _57523_210_1856_1_1534766294545_3732989_262-267933-0_sp1.1 from training. Duration: 0.963625 2023-10-04 11:57:30,897 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_35361_210_4238_1_1533262810_7793476_675-87012-0_sp0.9 from training. Duration: 0.988875 2023-10-04 11:57:32,149 WARNING [train.py:1204] (2/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_17-90541-0 from training. Duration: 0.94 2023-10-04 11:57:32,176 WARNING [train.py:1204] (2/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_535-14410-0_sp1.1 from training. Duration: 0.42725 2023-10-04 11:57:32,182 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_24673_210_6400_1_1531968633_778659_94-154511-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 11:57:32,193 WARNING [train.py:1204] (2/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_212-10501-0 from training. Duration: 0.94 2023-10-04 11:57:39,384 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.0.layers.1.nonlin_attention.whiten2, num_groups=1, num_channels=192, metric=5.20 vs. limit=15.0 2023-10-04 11:57:43,000 WARNING [train.py:1204] (2/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_383-285196-0_sp1.1 from training. Duration: 0.8818125 2023-10-04 11:57:46,302 WARNING [train.py:1204] (2/4) Exclude cut with ID _45560_210_21982_1_1533812603951_3584999_238-47020-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 11:57:46,311 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_21500_210_2054_1_1532149310_2716758_278-18053-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 11:57:46,317 WARNING [train.py:1204] (2/4) Exclude cut with ID _56235_210_10640_1_1534557335235_4161099_453-9626-0_sp1.1 from training. Duration: 0.936375 2023-10-04 11:57:47,580 WARNING [train.py:1204] (2/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_153-97441-0_sp1.1 from training. Duration: 0.5090625 2023-10-04 11:57:47,760 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.attention_skip_rate, batch_count=1646186.6666666667, ans=0.0 2023-10-04 11:57:51,877 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.1.encoder.layers.0.conv_module1.whiten, num_groups=1, num_channels=256, metric=10.12 vs. limit=15.0 2023-10-04 11:57:53,674 WARNING [train.py:1204] (2/4) Exclude cut with ID _67725_210_27767_1_1535608756671_5105359_431-120895-0_sp1.1 from training. Duration: 0.963625 2023-10-04 11:57:56,436 WARNING [train.py:1204] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_574-45541-0_sp1.1 from training. Duration: 0.5909375 2023-10-04 11:57:56,462 WARNING [train.py:1204] (2/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_162-103620-0_sp1.1 from training. Duration: 0.8454375 2023-10-04 11:57:56,479 WARNING [train.py:1204] (2/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_734-129761-0_sp1.1 from training. Duration: 0.7454375 2023-10-04 11:57:57,092 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.2.encoder.layers.2.nonlin_attention.whiten1, num_groups=1, num_channels=288, metric=5.69 vs. limit=10.0 2023-10-04 11:57:57,794 WARNING [train.py:1204] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_620-276793-0_sp1.1 from training. Duration: 0.536375 2023-10-04 11:57:57,836 WARNING [train.py:1204] (2/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_153-97441-0_sp0.9 from training. Duration: 0.62225 2023-10-04 11:58:00,653 WARNING [train.py:1204] (2/4) Exclude cut with ID _12733_210_8341_1_1530167424556_3646380_523-85621-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 11:58:00,673 WARNING [train.py:1204] (2/4) Exclude cut with ID _64048_210_10626_1_1535198382802_3835179_352-179834-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 11:58:03,772 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.balancer2.prob, batch_count=1646253.3333333333, ans=0.125 2023-10-04 11:58:05,322 WARNING [train.py:1204] (2/4) Exclude cut with ID _55162_210_17071_1_1534408248583_3753480_43-242364-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 11:58:05,344 WARNING [train.py:1204] (2/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_661-264857-0 from training. Duration: 0.93 2023-10-04 11:58:05,351 WARNING [train.py:1204] (2/4) Exclude cut with ID _6274_210_2084_1_1527937583837_7494020_325-237800-0_sp1.1 from training. Duration: 0.97275 2023-10-04 11:58:06,711 WARNING [train.py:1204] (2/4) Exclude cut with ID _55328_210_27767_1_1534580973260_4088170_385-171459-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 11:58:06,796 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_3780_210_2087_1_1526467389_2145572_176-107140-0_sp1.1 from training. Duration: 0.62725 2023-10-04 11:58:08,146 WARNING [train.py:1204] (2/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_375-244976-0_sp1.1 from training. Duration: 0.6818125 2023-10-04 11:58:09,658 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.feed_forward2.hidden_balancer.prob, batch_count=1646320.0, ans=0.125 2023-10-04 11:58:11,161 WARNING [train.py:1204] (2/4) Exclude cut with ID _35941_210_15379_1_1532995294515_4023089_309-33162-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 11:58:13,006 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.feed_forward1.out_proj.dropout_p, batch_count=1646320.0, ans=0.1 2023-10-04 11:58:17,258 WARNING [train.py:1204] (2/4) Exclude cut with ID _14459_210_2228_1_1531443641946_7516700_1185-22038-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 11:58:20,378 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_1197-69460-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 11:58:23,339 INFO [train.py:1046] (2/4) Epoch 47, batch 2600, loss[loss=0.157, simple_loss=0.2431, pruned_loss=0.0355, over 24017.00 frames. ], tot_loss[loss=0.1543, simple_loss=0.2353, pruned_loss=0.03666, over 4715706.88 frames. ], batch size: 80, lr: 2.16e-03, grad_scale: 8.0 2023-10-04 11:58:23,422 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9012W0022-501157 from training. Duration: 0.896 2023-10-04 11:58:26,148 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9023W0009-506302 from training. Duration: 0.896 2023-10-04 11:58:26,171 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_2821_210_2715_1_1526108385_3763560_73-296673-0_sp1.1 from training. Duration: 0.7545625 2023-10-04 11:58:26,212 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9025W0113-507442 from training. Duration: 0.896 2023-10-04 11:58:27,518 WARNING [train.py:1204] (2/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_739-2098-0 from training. Duration: 0.92 2023-10-04 11:58:27,539 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9029W0332-509722 from training. Duration: 0.768 2023-10-04 11:58:28,868 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.692e+02 2.029e+02 2.282e+02 2.767e+02 4.631e+02, threshold=4.564e+02, percent-clipped=0.0 2023-10-04 11:58:30,313 WARNING [train.py:1204] (2/4) Exclude cut with ID _16908_210_5724_1_1531119157248_3772700_68-101953-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 11:58:30,329 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9042W0011-515617 from training. Duration: 0.768 2023-10-04 11:58:31,756 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_3019_210_2028_1_1526044245_3264865_191-72235-0 from training. Duration: 0.97 2023-10-04 11:58:33,120 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9052W0006-520792 from training. Duration: 0.896 2023-10-04 11:58:34,528 WARNING [train.py:1204] (2/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_245-174553-0_sp1.1 from training. Duration: 0.8 2023-10-04 11:58:35,896 WARNING [train.py:1204] (2/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_202-67017-0 from training. Duration: 0.82 2023-10-04 11:58:37,742 WARNING [train.py:1204] (2/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_304-59227-0 from training. Duration: 0.77 2023-10-04 11:58:39,183 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_3133_210_2087_1_1526106446_2324690_246-125000-0_sp0.9 from training. Duration: 0.688875 2023-10-04 11:58:39,235 WARNING [train.py:1204] (2/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_243-42116-0 from training. Duration: 0.73 2023-10-04 11:58:41,244 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.0.feed_forward2.hidden_balancer.prob, batch_count=1646453.3333333333, ans=0.125 2023-10-04 11:58:42,430 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9087W0121-538492 from training. Duration: 0.896 2023-10-04 11:58:42,446 WARNING [train.py:1204] (2/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_120-76843-0 from training. Duration: 0.55 2023-10-04 11:58:49,836 WARNING [train.py:1204] (2/4) Exclude cut with ID _7852_210_5219_1_1528286471316_8139400_850-304621-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 11:58:49,859 WARNING [train.py:1204] (2/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_154-285415-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 11:58:49,870 WARNING [train.py:1204] (2/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_538-278194-0_sp1.1 from training. Duration: 0.92725 2023-10-04 11:58:49,871 WARNING [train.py:1204] (2/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_119-207162-0 from training. Duration: 0.71 2023-10-04 11:58:50,104 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.conv_module1.balancer2.prob, batch_count=1646453.3333333333, ans=0.125 2023-10-04 11:58:52,600 WARNING [train.py:1204] (2/4) Exclude cut with ID _2334_210_2667_1_1525608238880_4705129_459-104997-0_sp1.1 from training. Duration: 0.863625 2023-10-04 11:58:56,875 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9147W0019-567847 from training. Duration: 0.768 2023-10-04 11:59:02,251 WARNING [train.py:1204] (2/4) Exclude cut with ID _69884_210_9771_1_1535846392818_4166169_228-189439-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 11:59:03,584 WARNING [train.py:1204] (2/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_196-77172-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 11:59:04,985 WARNING [train.py:1204] (2/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_625-338546-0 from training. Duration: 0.88 2023-10-04 11:59:05,035 WARNING [train.py:1204] (2/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_193-327630-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 11:59:05,036 WARNING [train.py:1204] (2/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_391-24006-0_sp1.1 from training. Duration: 0.92725 2023-10-04 11:59:05,149 WARNING [train.py:1204] (2/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_197-28164-0 from training. Duration: 0.8 2023-10-04 11:59:08,571 WARNING [train.py:1204] (2/4) Exclude cut with ID _8395_210_1531_1_1528597871523_4411579_431-132470-0_sp1.1 from training. Duration: 0.67275 2023-10-04 11:59:08,596 WARNING [train.py:1204] (2/4) Exclude cut with ID _35350_210_16040_1_1533040191183_4075299_465-248326-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 11:59:10,051 WARNING [train.py:1204] (2/4) Exclude cut with ID _16756_210_4915_1_1531047803763_7104259_101-141922-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 11:59:14,151 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.3.encoder.layers.2.feed_forward1.out_whiten, num_groups=1, num_channels=512, metric=7.01 vs. limit=15.0 2023-10-04 11:59:14,786 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9212W0005-600172 from training. Duration: 0.896 2023-10-04 11:59:14,816 WARNING [train.py:1204] (2/4) Exclude cut with ID _63186_210_2084_1_1535414479705_3908649_378-166820-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 11:59:14,837 WARNING [train.py:1204] (2/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_52-211683-0_sp1.1 from training. Duration: 0.8181875 2023-10-04 11:59:15,508 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.3.encoder.layers.1.whiten, num_groups=1, num_channels=512, metric=4.94 vs. limit=12.0 2023-10-04 11:59:16,779 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.conv_module2.balancer1.prob, batch_count=1646586.6666666667, ans=0.125 2023-10-04 11:59:19,897 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.self_attn_weights.pos_emb_skip_rate, batch_count=1646586.6666666667, ans=0.0 2023-10-04 11:59:21,061 WARNING [train.py:1204] (2/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_1147-237729-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 11:59:22,326 WARNING [train.py:1204] (2/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_234-14174-0_sp0.9 from training. Duration: 0.911125 2023-10-04 11:59:22,338 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_454-276429-0 from training. Duration: 0.87 2023-10-04 11:59:22,409 WARNING [train.py:1204] (2/4) Exclude cut with ID _53713_210_24116_1_1534314487762_7416751_755-165313-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 11:59:25,377 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8278_210_2550_1_1528637099_4220413_436-317072-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 11:59:25,437 WARNING [train.py:1204] (2/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_28-324729-0_sp1.1 from training. Duration: 0.8545625 2023-10-04 11:59:29,796 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.conv_module1.balancer1.prob, batch_count=1646653.3333333333, ans=0.125 2023-10-04 11:59:31,087 WARNING [train.py:1204] (2/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_326-165900-0 from training. Duration: 0.69 2023-10-04 11:59:32,435 WARNING [train.py:1204] (2/4) Exclude cut with ID _53489_210_6228_1_1534298357169_3835970_96-30246-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 11:59:35,202 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8275_210_4198_1_1528455320_3892571_491-17707-0_sp1.1 from training. Duration: 0.5818125 2023-10-04 11:59:36,496 INFO [train.py:1046] (2/4) Epoch 47, batch 2650, loss[loss=0.1639, simple_loss=0.2505, pruned_loss=0.03869, over 24007.00 frames. ], tot_loss[loss=0.1548, simple_loss=0.236, pruned_loss=0.03679, over 4710875.93 frames. ], batch size: 80, lr: 2.16e-03, grad_scale: 8.0 2023-10-04 11:59:39,278 WARNING [train.py:1204] (2/4) Exclude cut with ID _14348_210_8341_1_1530599434996_3778640_240-232370-0 from training. Duration: 0.84 2023-10-04 11:59:39,304 WARNING [train.py:1204] (2/4) Exclude cut with ID _45012_210_12651_1_1533608435136_5371380_54-112623-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 11:59:40,666 WARNING [train.py:1204] (2/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_328-264776-0_sp1.1 from training. Duration: 0.6090625 2023-10-04 11:59:42,077 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9325W0001-648427 from training. Duration: 0.768 2023-10-04 11:59:42,086 WARNING [train.py:1204] (2/4) Exclude cut with ID _56480_210_4220_1_1534679942079_3075409_49-74717-0_sp1.1 from training. Duration: 0.936375 2023-10-04 11:59:44,141 WARNING [train.py:1204] (2/4) Exclude cut with ID _59500_210_8254_1_1534809610680_3736009_6-241869-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 11:59:46,246 WARNING [train.py:1204] (2/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_172-215932-0_sp0.9 from training. Duration: 0.7333125 2023-10-04 11:59:46,428 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.conv_module1.balancer1.min_positive, batch_count=1646720.0, ans=0.025 2023-10-04 11:59:47,491 WARNING [train.py:1204] (2/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_17-90541-0_sp1.1 from training. Duration: 0.8545625 2023-10-04 11:59:51,121 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_43831_210_2158_1_1533725502_5872791_525-89447-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 11:59:52,463 WARNING [train.py:1204] (2/4) Exclude cut with ID _12302_210_5219_1_1529924446466_8107280_907-182949-0 from training. Duration: 0.92 2023-10-04 11:59:52,476 WARNING [train.py:1204] (2/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_402-102311-0_sp0.9 from training. Duration: 0.7444375 2023-10-04 11:59:52,516 WARNING [train.py:1204] (2/4) Exclude cut with ID _8021_210_7994_1_1528376451358_3653069_179-45206-0_sp0.9 from training. Duration: 0.9555625 2023-10-04 11:59:54,399 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.4.encoder.layers.1.self_attn2.whiten, num_groups=1, num_channels=384, metric=11.60 vs. limit=22.5 2023-10-04 11:59:55,273 WARNING [train.py:1204] (2/4) Exclude cut with ID _40137_210_12210_1_1533290484046_3795180_249-187059-0 from training. Duration: 0.88 2023-10-04 11:59:56,569 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9653W0194-675472 from training. Duration: 0.9658125 2023-10-04 11:59:59,339 WARNING [train.py:1204] (2/4) Exclude cut with ID _51332_210_9988_1_1534244371836_3801270_336-255604-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 12:00:00,763 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_20120_210_6753_1_1531643035_5482859_510-96996-0 from training. Duration: 0.87 2023-10-04 12:00:00,774 WARNING [train.py:1204] (2/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_1167-20437-0_sp1.1 from training. Duration: 0.92725 2023-10-04 12:00:02,032 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_3475_210_2028_1_1526193578_3622569_265-276625-0 from training. Duration: 0.76 2023-10-04 12:00:06,188 WARNING [train.py:1204] (2/4) Exclude cut with ID _14860_210_4915_1_1530702020340_7343240_470-172418-0_sp1.1 from training. Duration: 0.97275 2023-10-04 12:00:06,196 WARNING [train.py:1204] (2/4) Exclude cut with ID _5444_210_2151_1_1527418808464_5312210_581-315411-0_sp1.1 from training. Duration: 0.563625 2023-10-04 12:00:06,211 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_34450_210_9547_1_1533099443_4109050_519-342439-0_sp1.1 from training. Duration: 0.97275 2023-10-04 12:00:06,250 WARNING [train.py:1204] (2/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_50-175961-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 12:00:09,129 WARNING [train.py:1204] (2/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_372-6869-0 from training. Duration: 0.44 2023-10-04 12:00:10,420 WARNING [train.py:1204] (2/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_3-237434-0 from training. Duration: 0.76 2023-10-04 12:00:13,811 WARNING [train.py:1204] (2/4) Exclude cut with ID _8772_210_1850_1_1528635595972_3664011_247-336448-0_sp1.1 from training. Duration: 0.67275 2023-10-04 12:00:17,193 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_427-78097-0 from training. Duration: 0.8 2023-10-04 12:00:17,208 WARNING [train.py:1204] (2/4) Exclude cut with ID _69884_210_9771_1_1535846392818_4166169_466-252727-0_sp1.1 from training. Duration: 0.97275 2023-10-04 12:00:19,132 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_15972_210_6400_1_1530877934_3405692_316-35478-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 12:00:19,279 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.bypass_mid.scale_min, batch_count=1646853.3333333333, ans=0.2 2023-10-04 12:00:20,389 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_809-17156-0_sp0.9 from training. Duration: 0.811125 2023-10-04 12:00:21,742 WARNING [train.py:1204] (2/4) Exclude cut with ID _57572_210_5517_1_1534921219624_3809484_594-263474-0_sp1.1 from training. Duration: 0.936375 2023-10-04 12:00:21,769 WARNING [train.py:1204] (2/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_190-228204-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 12:00:23,649 WARNING [train.py:1204] (2/4) Exclude cut with ID _19514_210_9259_1_1531530071330_5084230_117-21193-0_sp1.1 from training. Duration: 0.936375 2023-10-04 12:00:25,039 WARNING [train.py:1204] (2/4) Exclude cut with ID _69584_210_5219_1_1535785133775_4044450_190-21315-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 12:00:26,461 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_53071_210_16554_1_1534493434_1276796_184-302050-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 12:00:26,501 WARNING [train.py:1204] (2/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_120-76843-0_sp1.1 from training. Duration: 0.5 2023-10-04 12:00:26,598 WARNING [train.py:1204] (2/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_318-162017-0_sp1.1 from training. Duration: 0.82725 2023-10-04 12:00:28,035 WARNING [train.py:1204] (2/4) Exclude cut with ID _46067_210_4220_1_1534125571066_3307450_79-46864-0_sp1.1 from training. Duration: 0.92725 2023-10-04 12:00:29,463 WARNING [train.py:1204] (2/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_358-215558-0_sp1.1 from training. Duration: 0.8454375 2023-10-04 12:00:29,542 WARNING [train.py:1204] (2/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_621-66764-0_sp1.1 from training. Duration: 0.92725 2023-10-04 12:00:30,947 WARNING [train.py:1204] (2/4) Exclude cut with ID _42863_210_9070_1_1533902398788_3696920_338-156851-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 12:00:30,989 WARNING [train.py:1204] (2/4) Exclude cut with ID _5289_210_2084_1_1527678202736_3779299_392-261988-0_sp0.9 from training. Duration: 0.72225 2023-10-04 12:00:33,826 WARNING [train.py:1204] (2/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_409-84682-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 12:00:35,162 WARNING [train.py:1204] (2/4) Exclude cut with ID _9235_210_5219_1_1528891154253_8046120_30-326681-0_sp0.9 from training. Duration: 0.92225 2023-10-04 12:00:35,166 WARNING [train.py:1204] (2/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_389-184695-0_sp1.1 from training. Duration: 0.92725 2023-10-04 12:00:35,206 WARNING [train.py:1204] (2/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_442-320317-0 from training. Duration: 0.82 2023-10-04 12:00:39,245 WARNING [train.py:1204] (2/4) Exclude cut with ID _44778_210_2054_1_1533798127203_7443090_756-29911-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 12:00:41,846 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_401-112036-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 12:00:41,942 WARNING [train.py:1204] (2/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_181-308827-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 12:00:44,502 WARNING [train.py:1204] (2/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_332-258725-0_sp1.1 from training. Duration: 0.963625 2023-10-04 12:00:45,880 WARNING [train.py:1204] (2/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_188-260011-0_sp0.9 from training. Duration: 0.811125 2023-10-04 12:00:45,917 WARNING [train.py:1204] (2/4) Exclude cut with ID _49039_210_6315_1_1533967537541_3846118_99-302239-0_sp1.1 from training. Duration: 0.963625 2023-10-04 12:00:49,419 WARNING [train.py:1204] (2/4) Exclude cut with ID _4122_210_2084_1_1526713164417_4353280_312-294828-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 12:00:49,424 WARNING [train.py:1204] (2/4) Exclude cut with ID _14922_210_6228_1_1530707435273_3693310_303-228738-0 from training. Duration: 0.84 2023-10-04 12:00:50,825 INFO [train.py:1046] (2/4) Epoch 47, batch 2700, loss[loss=0.1458, simple_loss=0.23, pruned_loss=0.03076, over 22492.00 frames. ], tot_loss[loss=0.1549, simple_loss=0.2362, pruned_loss=0.03677, over 4703327.26 frames. ], batch size: 49, lr: 2.16e-03, grad_scale: 8.0 2023-10-04 12:00:52,809 WARNING [train.py:1204] (2/4) Exclude cut with ID _14349_210_8341_1_1530615710818_3442470_115-285975-0_sp0.9 from training. Duration: 0.9555625 2023-10-04 12:00:52,935 WARNING [train.py:1204] (2/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_125-144174-0_sp1.1 from training. Duration: 0.3545625 2023-10-04 12:00:54,510 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.ff3_skip_rate, batch_count=1647053.3333333333, ans=0.0 2023-10-04 12:00:55,643 WARNING [train.py:1204] (2/4) Exclude cut with ID _53565_210_10635_1_1534337218335_3820259_351-54570-0_sp1.1 from training. Duration: 0.97275 2023-10-04 12:00:56,838 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.516e+02 2.063e+02 2.218e+02 2.649e+02 4.383e+02, threshold=4.436e+02, percent-clipped=0.0 2023-10-04 12:00:56,906 WARNING [train.py:1204] (2/4) Exclude cut with ID _28317_210_12742_1_1532480302381_8510325_410-40065-0_sp1.1 from training. Duration: 0.963625 2023-10-04 12:00:56,935 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_38965_210_4852_1_1533174906_3484735_167-339442-0_sp1.1 from training. Duration: 0.963625 2023-10-04 12:00:58,348 WARNING [train.py:1204] (2/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_265-164486-0_sp1.1 from training. Duration: 0.87275 2023-10-04 12:00:58,358 WARNING [train.py:1204] (2/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_725-182484-0_sp1.1 from training. Duration: 0.92725 2023-10-04 12:00:58,391 WARNING [train.py:1204] (2/4) Exclude cut with ID _28317_210_12742_1_1532480302381_8510325_607-213951-0_sp0.9 from training. Duration: 0.9333125 2023-10-04 12:00:58,404 WARNING [train.py:1204] (2/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_605-133395-0_sp1.1 from training. Duration: 0.6 2023-10-04 12:00:58,424 WARNING [train.py:1204] (2/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_351-294650-0 from training. Duration: 0.63 2023-10-04 12:00:59,661 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_14953_210_2151_1_1530701710_5616165_710-165400-0_sp1.1 from training. Duration: 0.7909375 2023-10-04 12:01:01,059 WARNING [train.py:1204] (2/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_237-58433-0_sp1.1 from training. Duration: 0.67275 2023-10-04 12:01:01,149 WARNING [train.py:1204] (2/4) Exclude cut with ID _5190_210_4557_1_1527244368799_373439_28-197030-0_sp1.1 from training. Duration: 0.6545625 2023-10-04 12:01:02,559 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_743-327651-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 12:01:06,639 WARNING [train.py:1204] (2/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_76-203867-0_sp0.9 from training. Duration: 0.8 2023-10-04 12:01:08,083 WARNING [train.py:1204] (2/4) Exclude cut with ID _14603_210_4220_1_1530615688441_3097319_274-274798-0 from training. Duration: 0.98 2023-10-04 12:01:08,124 WARNING [train.py:1204] (2/4) Exclude cut with ID _14087_210_2253_1_1530529224512_3014029_226-242140-0_sp1.1 from training. Duration: 0.736375 2023-10-04 12:01:12,375 WARNING [train.py:1204] (2/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_352-59958-0_sp1.1 from training. Duration: 0.7818125 2023-10-04 12:01:12,379 WARNING [train.py:1204] (2/4) Exclude cut with ID _39411_210_5986_1_1533538825030_3343649_191-251819-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 12:01:17,657 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7046_210_6753_1_1527895337_6920917_642-106799-0_sp1.1 from training. Duration: 0.836375 2023-10-04 12:01:17,670 WARNING [train.py:1204] (2/4) Exclude cut with ID _22826_210_9994_1_1531908201522_3279169_412-3442-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 12:01:18,955 WARNING [train.py:1204] (2/4) Exclude cut with ID _60057_210_10640_1_1534903065793_3333150_171-181133-0_sp0.9 from training. Duration: 0.97775 2023-10-04 12:01:18,971 WARNING [train.py:1204] (2/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_154-135696-0_sp1.1 from training. Duration: 0.72725 2023-10-04 12:01:22,756 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.3.encoder.layers.1.whiten, num_groups=1, num_channels=512, metric=5.96 vs. limit=12.0 2023-10-04 12:01:23,464 WARNING [train.py:1204] (2/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_148-186805-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 12:01:24,979 WARNING [train.py:1204] (2/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_605-165641-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 12:01:26,270 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_174-277364-0_sp0.9 from training. Duration: 0.788875 2023-10-04 12:01:26,283 WARNING [train.py:1204] (2/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_89-321955-0_sp0.9 from training. Duration: 0.888875 2023-10-04 12:01:30,339 WARNING [train.py:1204] (2/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_438-105429-0_sp1.1 from training. Duration: 0.963625 2023-10-04 12:01:30,343 WARNING [train.py:1204] (2/4) Exclude cut with ID _14970_210_15709_1_1530773543493_1715730_187-20259-0_sp0.9 from training. Duration: 0.988875 2023-10-04 12:01:37,772 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.5.encoder.layers.0.nonlin_attention.whiten2, num_groups=1, num_channels=256, metric=3.35 vs. limit=15.0 2023-10-04 12:01:39,971 WARNING [train.py:1204] (2/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_385-193052-0_sp0.9 from training. Duration: 0.9555625 2023-10-04 12:01:40,025 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7113_210_1849_1_1527942685_3448768_194-311626-0_sp1.1 from training. Duration: 0.92725 2023-10-04 12:01:42,939 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_3780_210_2087_1_1526467389_2145572_176-107140-0_sp0.9 from training. Duration: 0.7666875 2023-10-04 12:01:42,940 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_36756_210_7234_1_1533026991_966276_123-213457-0_sp1.1 from training. Duration: 0.936375 2023-10-04 12:01:45,794 WARNING [train.py:1204] (2/4) Exclude cut with ID _23557_210_8750_1_1531890106370_6846255_456-244861-0_sp1.1 from training. Duration: 0.963625 2023-10-04 12:01:47,749 WARNING [train.py:1204] (2/4) Exclude cut with ID _37086_210_9038_1_1532999540891_7021250_242-349170-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 12:01:47,818 WARNING [train.py:1204] (2/4) Exclude cut with ID _41298_210_9019_1_1533430371450_4822143_471-281351-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 12:01:50,881 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_53378_210_17282_1_1534221071_5769509_572-126176-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 12:01:52,604 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_1507-263032-0_sp1.1 from training. Duration: 0.963625 2023-10-04 12:01:52,634 WARNING [train.py:1204] (2/4) Exclude cut with ID _13679_210_7497_1_1530534293662_3355098_411-186088-0_sp1.1 from training. Duration: 0.8545625 2023-10-04 12:01:53,198 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.self_attn2.whiten.whitening_limit, batch_count=1647320.0, ans=22.5 2023-10-04 12:01:54,093 WARNING [train.py:1204] (2/4) Exclude cut with ID _29345_210_4381_1_1532689168263_3860169_149-257268-0_sp1.1 from training. Duration: 0.736375 2023-10-04 12:01:57,384 WARNING [train.py:1204] (2/4) Exclude cut with ID _47790_210_16187_1_1533862814066_4207519_381-252014-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 12:01:57,397 WARNING [train.py:1204] (2/4) Exclude cut with ID _35799_210_5854_1_1533263487887_3529071_512-2717-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 12:01:57,657 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.ff2_skip_rate, batch_count=1647320.0, ans=0.0 2023-10-04 12:01:58,870 WARNING [train.py:1204] (2/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_505-205270-0_sp0.9 from training. Duration: 0.42225 2023-10-04 12:02:00,222 WARNING [train.py:1204] (2/4) Exclude cut with ID _14998_210_5219_1_1530705582080_7474970_731-20117-0_sp1.1 from training. Duration: 0.936375 2023-10-04 12:02:01,912 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.self_attn_weights.pos_emb_skip_rate, batch_count=1647320.0, ans=0.0 2023-10-04 12:02:02,946 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_37351_210_7925_1_1533012931_3812113_265-85780-0_sp1.1 from training. Duration: 0.8 2023-10-04 12:02:02,953 WARNING [train.py:1204] (2/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_313-134930-0 from training. Duration: 0.97 2023-10-04 12:02:04,334 INFO [train.py:1046] (2/4) Epoch 47, batch 2750, loss[loss=0.1529, simple_loss=0.2358, pruned_loss=0.035, over 24333.00 frames. ], tot_loss[loss=0.1544, simple_loss=0.2354, pruned_loss=0.0367, over 4718079.83 frames. ], batch size: 61, lr: 2.16e-03, grad_scale: 8.0 2023-10-04 12:02:04,396 WARNING [train.py:1204] (2/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_374-138971-0 from training. Duration: 0.94 2023-10-04 12:02:04,431 WARNING [train.py:1204] (2/4) Exclude cut with ID _30304_210_6319_1_1532476827261_7582060_1227-287886-0_sp1.1 from training. Duration: 0.936375 2023-10-04 12:02:05,931 WARNING [train.py:1204] (2/4) Exclude cut with ID _59493_210_17282_1_1535078419077_3225879_332-217544-0_sp1.1 from training. Duration: 0.97275 2023-10-04 12:02:05,974 WARNING [train.py:1204] (2/4) Exclude cut with ID _23514_210_1389_1_1531832139785_3031439_361-119640-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 12:02:07,399 WARNING [train.py:1204] (2/4) Exclude cut with ID _69584_210_5219_1_1535785133775_4044450_142-118058-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 12:02:07,428 WARNING [train.py:1204] (2/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_178-177582-0_sp1.1 from training. Duration: 0.67275 2023-10-04 12:02:08,709 WARNING [train.py:1204] (2/4) Exclude cut with ID _25303_210_17519_1_1532050207576_3635970_41-195572-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 12:02:10,250 WARNING [train.py:1204] (2/4) Exclude cut with ID _55328_210_27767_1_1534580973260_4088170_329-341322-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 12:02:11,045 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.1.encoder.layers.1.feed_forward3.out_whiten, num_groups=1, num_channels=256, metric=8.19 vs. limit=15.0 2023-10-04 12:02:11,531 WARNING [train.py:1204] (2/4) Exclude cut with ID _5608_210_2106_1_1527415439089_6819768_909-82767-0_sp1.1 from training. Duration: 0.5545625 2023-10-04 12:02:11,579 WARNING [train.py:1204] (2/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_261-297503-0_sp1.1 from training. Duration: 0.9 2023-10-04 12:02:11,585 WARNING [train.py:1204] (2/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_459-291706-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 12:02:11,586 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7762_210_1794_1_1528199508_4057408_422-227034-0 from training. Duration: 0.73 2023-10-04 12:02:12,877 WARNING [train.py:1204] (2/4) Exclude cut with ID _3067_210_2667_1_1526212974640_4503168_250-285937-0_sp0.9 from training. Duration: 0.888875 2023-10-04 12:02:12,899 WARNING [train.py:1204] (2/4) Exclude cut with ID _66873_210_5517_1_1535454136506_4293370_332-253745-0_sp1.1 from training. Duration: 0.936375 2023-10-04 12:02:16,093 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.bypass.scale_min, batch_count=1647386.6666666667, ans=0.2 2023-10-04 12:02:19,230 WARNING [train.py:1204] (2/4) Exclude cut with ID _13938_210_9988_1_1530518718978_3693960_304-112784-0 from training. Duration: 0.46 2023-10-04 12:02:21,224 WARNING [train.py:1204] (2/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_226-267957-0_sp1.1 from training. Duration: 0.92725 2023-10-04 12:02:22,544 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_41822_210_13270_1_1533955976_4379284_608-254567-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 12:02:22,609 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_124-113562-0_sp1.1 from training. Duration: 0.8545625 2023-10-04 12:02:24,505 WARNING [train.py:1204] (2/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_117-290916-0_sp1.1 from training. Duration: 0.463625 2023-10-04 12:02:25,832 WARNING [train.py:1204] (2/4) Exclude cut with ID _28427_210_4220_1_1532581240967_3061069_102-66533-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 12:02:25,910 WARNING [train.py:1204] (2/4) Exclude cut with ID _55383_210_10635_1_1534468426172_2231669_134-128246-0_sp1.1 from training. Duration: 0.8090625 2023-10-04 12:02:27,267 WARNING [train.py:1204] (2/4) Exclude cut with ID _45871_210_5665_1_1533864614245_3296060_49-308316-0_sp1.1 from training. Duration: 0.97275 2023-10-04 12:02:27,326 WARNING [train.py:1204] (2/4) Exclude cut with ID _57572_210_5517_1_1534921219624_3809484_176-199205-0_sp1.1 from training. Duration: 0.97275 2023-10-04 12:02:30,095 WARNING [train.py:1204] (2/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_45-265684-0_sp1.1 from training. Duration: 0.6545625 2023-10-04 12:02:31,522 WARNING [train.py:1204] (2/4) Exclude cut with ID _15150_210_8341_1_1530752411803_7256384_469-239509-0_sp0.9 from training. Duration: 0.7555625 2023-10-04 12:02:32,732 WARNING [train.py:1204] (2/4) Exclude cut with ID _28461_210_2253_1_1532771775189_3041299_136-270644-0_sp1.1 from training. Duration: 0.8454375 2023-10-04 12:02:34,107 WARNING [train.py:1204] (2/4) Exclude cut with ID _69584_210_5219_1_1535785133775_4044450_302-47302-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 12:02:36,735 WARNING [train.py:1204] (2/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_166-128941-0_sp1.1 from training. Duration: 0.4909375 2023-10-04 12:02:39,943 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.feed_forward1.out_proj.dropout_p, batch_count=1647520.0, ans=0.1 2023-10-04 12:02:42,540 WARNING [train.py:1204] (2/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_559-120504-0_sp1.1 from training. Duration: 0.97275 2023-10-04 12:02:42,692 WARNING [train.py:1204] (2/4) Exclude cut with ID _6456_210_1534_1_1527681634212_3181049_345-252877-0_sp1.1 from training. Duration: 0.5818125 2023-10-04 12:02:44,038 WARNING [train.py:1204] (2/4) Exclude cut with ID _28317_210_12742_1_1532480302381_8510325_902-233003-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 12:02:48,573 WARNING [train.py:1204] (2/4) Exclude cut with ID _36538_210_9994_1_1533204020363_3795620_204-261385-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 12:02:48,577 WARNING [train.py:1204] (2/4) Exclude cut with ID _14821_210_8750_1_1530680503991_6829359_189-297451-0_sp1.1 from training. Duration: 0.5 2023-10-04 12:02:48,627 WARNING [train.py:1204] (2/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_1211-226066-0_sp0.9 from training. Duration: 0.8444375 2023-10-04 12:02:56,574 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6786_210_1388_1_1527839699_4194495_273-149841-0_sp1.1 from training. Duration: 0.836375 2023-10-04 12:02:56,602 WARNING [train.py:1204] (2/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_380-162648-0_sp1.1 from training. Duration: 0.8545625 2023-10-04 12:02:56,603 WARNING [train.py:1204] (2/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_494-199538-0 from training. Duration: 0.75 2023-10-04 12:03:01,239 WARNING [train.py:1204] (2/4) Exclude cut with ID _49097_210_16049_1_1533952780207_4360979_276-87053-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 12:03:02,767 WARNING [train.py:1204] (2/4) Exclude cut with ID _5444_210_2151_1_1527418808464_5312210_726-27165-0 from training. Duration: 0.67 2023-10-04 12:03:07,050 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_128-202104-0_sp1.1 from training. Duration: 0.57275 2023-10-04 12:03:09,715 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_11349_210_6753_1_1529800861_1101207_26-65602-0_sp1.1 from training. Duration: 0.87275 2023-10-04 12:03:11,000 WARNING [train.py:1204] (2/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_159-328157-0 from training. Duration: 0.52 2023-10-04 12:03:12,327 WARNING [train.py:1204] (2/4) Exclude cut with ID _14069_210_5632_1_1530512692033_4803390_98-21934-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 12:03:13,760 WARNING [train.py:1204] (2/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_352-59958-0_sp0.9 from training. Duration: 0.9555625 2023-10-04 12:03:13,809 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6163_210_6400_1_1527506746_4263068_283-137811-0 from training. Duration: 0.77 2023-10-04 12:03:13,839 WARNING [train.py:1204] (2/4) Exclude cut with ID _21989_210_5854_1_1531899214931_3271360_133-44759-0_sp1.1 from training. Duration: 0.7 2023-10-04 12:03:17,163 WARNING [train.py:1204] (2/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_21-174514-0_sp0.9 from training. Duration: 0.47775 2023-10-04 12:03:17,202 WARNING [train.py:1204] (2/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_201-78811-0_sp1.1 from training. Duration: 0.963625 2023-10-04 12:03:18,394 INFO [train.py:1046] (2/4) Epoch 47, batch 2800, loss[loss=0.1708, simple_loss=0.2584, pruned_loss=0.04156, over 24036.00 frames. ], tot_loss[loss=0.1537, simple_loss=0.2343, pruned_loss=0.0366, over 4717235.29 frames. ], batch size: 80, lr: 2.16e-03, grad_scale: 16.0 2023-10-04 12:03:18,483 WARNING [train.py:1204] (2/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_537-164630-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 12:03:19,841 WARNING [train.py:1204] (2/4) Exclude cut with ID _14087_210_2253_1_1530529224512_3014029_156-257607-0 from training. Duration: 0.83 2023-10-04 12:03:19,854 WARNING [train.py:1204] (2/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_308-154450-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 12:03:19,897 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_45399_210_4852_1_1533624725_7579737_214-240574-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 12:03:23,565 WARNING [train.py:1204] (2/4) Exclude cut with ID _29303_210_2575_1_1532392237303_7410649_615-305517-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 12:03:23,614 WARNING [train.py:1204] (2/4) Exclude cut with ID IC0125W0002-61808 from training. Duration: 0.708 2023-10-04 12:03:23,615 WARNING [train.py:1204] (2/4) Exclude cut with ID IC0125W0017-61823 from training. Duration: 0.759 2023-10-04 12:03:24,722 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.793e+02 2.018e+02 2.201e+02 2.488e+02 4.073e+02, threshold=4.402e+02, percent-clipped=0.0 2023-10-04 12:03:26,347 WARNING [train.py:1204] (2/4) Exclude cut with ID _53219_210_4525_1_1534834836008_4064825_4-145291-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 12:03:27,822 WARNING [train.py:1204] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_185-174805-0_sp1.1 from training. Duration: 0.6181875 2023-10-04 12:03:27,836 WARNING [train.py:1204] (2/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_635-193046-0_sp1.1 from training. Duration: 0.92725 2023-10-04 12:03:31,144 WARNING [train.py:1204] (2/4) Exclude cut with ID _46067_210_4220_1_1534125571066_3307450_321-258866-0_sp1.1 from training. Duration: 0.97275 2023-10-04 12:03:32,603 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8558_210_1410_1_1528505225_7613406_131-329524-0 from training. Duration: 0.79 2023-10-04 12:03:35,339 WARNING [train.py:1204] (2/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_847-41334-0_sp0.9 from training. Duration: 0.488875 2023-10-04 12:03:36,684 WARNING [train.py:1204] (2/4) Exclude cut with ID _14227_210_4381_1_1530701804286_3940259_125-275642-0 from training. Duration: 0.99 2023-10-04 12:03:38,056 WARNING [train.py:1204] (2/4) Exclude cut with ID _25248_210_8750_1_1532062825941_5554180_248-177255-0_sp1.1 from training. Duration: 0.963625 2023-10-04 12:03:38,092 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_40047_210_2290_1_1533290519_2374311_195-165693-0_sp1.1 from training. Duration: 0.8090625 2023-10-04 12:03:38,102 WARNING [train.py:1204] (2/4) Exclude cut with ID _23109_210_4381_1_1532150828777_901949_29-258672-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 12:03:43,517 WARNING [train.py:1204] (2/4) Exclude cut with ID _14821_210_8750_1_1530680503991_6829359_2-227151-0_sp1.1 from training. Duration: 0.7545625 2023-10-04 12:03:43,558 WARNING [train.py:1204] (2/4) Exclude cut with ID _49643_210_7989_1_1534640244224_3939031_565-53982-0_sp1.1 from training. Duration: 0.963625 2023-10-04 12:03:43,562 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7544_210_6753_1_1528591327_1471426_33-346031-0_sp0.9 from training. Duration: 0.67775 2023-10-04 12:03:44,879 WARNING [train.py:1204] (2/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_95-61782-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 12:03:51,944 WARNING [train.py:1204] (2/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_686-54383-0_sp0.9 from training. Duration: 0.9666875 2023-10-04 12:03:53,687 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_14056_210_4163_1_1530604483_7572827_505-276823-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 12:03:55,258 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.conv_module2.balancer2.prob, batch_count=1647853.3333333333, ans=0.125 2023-10-04 12:03:56,341 WARNING [train.py:1204] (2/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_745-128443-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 12:03:57,660 WARNING [train.py:1204] (2/4) Exclude cut with ID _3844_210_1390_1_1526727803582_2871209_20-251664-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 12:03:57,731 WARNING [train.py:1204] (2/4) Exclude cut with ID _36538_210_9994_1_1533204020363_3795620_487-164628-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 12:04:02,692 WARNING [train.py:1204] (2/4) Exclude cut with ID _5166_210_2084_1_1527073197160_4087993_309-222545-0_sp1.1 from training. Duration: 0.763625 2023-10-04 12:04:02,712 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_3531_210_3528_1_1526294203_5216456_309-227302-0 from training. Duration: 0.69 2023-10-04 12:04:04,031 WARNING [train.py:1204] (2/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_42-192460-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 12:04:05,420 WARNING [train.py:1204] (2/4) Exclude cut with ID _22083_210_8254_1_1531702862439_3678970_49-234546-0_sp1.1 from training. Duration: 0.7545625 2023-10-04 12:04:05,433 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_427-78097-0_sp0.9 from training. Duration: 0.888875 2023-10-04 12:04:08,150 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_43802_210_2290_1_1533516203_8829504_378-205254-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 12:04:09,473 WARNING [train.py:1204] (2/4) Exclude cut with ID _45329_210_7005_1_1534157951211_6774339_672-314830-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 12:04:12,320 WARNING [train.py:1204] (2/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_90-30673-0_sp1.1 from training. Duration: 0.763625 2023-10-04 12:04:15,139 WARNING [train.py:1204] (2/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_563-329943-0_sp1.1 from training. Duration: 0.9 2023-10-04 12:04:15,161 WARNING [train.py:1204] (2/4) Exclude cut with ID _21632_210_2253_1_1531994001765_3744140_154-84090-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 12:04:15,162 WARNING [train.py:1204] (2/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_103-336064-0_sp0.9 from training. Duration: 0.5666875 2023-10-04 12:04:16,510 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_43-71989-0_sp0.9 from training. Duration: 0.6333125 2023-10-04 12:04:16,576 WARNING [train.py:1204] (2/4) Exclude cut with ID _3867_210_1883_1_1526641200575_4475830_319-349850-0_sp0.9 from training. Duration: 0.7444375 2023-10-04 12:04:17,904 WARNING [train.py:1204] (2/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_629-179454-0_sp1.1 from training. Duration: 0.963625 2023-10-04 12:04:17,911 WARNING [train.py:1204] (2/4) Exclude cut with ID _65263_210_22356_1_1535763598691_3921037_49-222917-0 from training. Duration: 0.92 2023-10-04 12:04:17,939 WARNING [train.py:1204] (2/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_602-331772-0_sp1.1 from training. Duration: 0.936375 2023-10-04 12:04:19,886 WARNING [train.py:1204] (2/4) Exclude cut with ID _58414_210_8254_1_1535421609162_7151182_292-134978-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 12:04:19,922 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_38793_210_2544_1_1533176114_3849320_200-299292-0_sp1.1 from training. Duration: 0.936375 2023-10-04 12:04:21,479 WARNING [train.py:1204] (2/4) Exclude cut with ID _14970_210_15709_1_1530775547361_1966965_6-23328-0 from training. Duration: 0.86 2023-10-04 12:04:24,022 WARNING [train.py:1204] (2/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_386-225511-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 12:04:24,044 WARNING [train.py:1204] (2/4) Exclude cut with ID _38323_210_12108_1_1533121233369_3303049_416-11919-0_sp1.1 from training. Duration: 0.87275 2023-10-04 12:04:25,332 WARNING [train.py:1204] (2/4) Exclude cut with ID _4866_210_2577_1_1527404412284_4364659_256-306830-0_sp1.1 from training. Duration: 0.7909375 2023-10-04 12:04:26,704 WARNING [train.py:1204] (2/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_53-300544-0 from training. Duration: 0.67 2023-10-04 12:04:32,182 WARNING [train.py:1204] (2/4) Exclude cut with ID _41314_210_12651_1_1533459476543_3766460_80-74184-0_sp1.1 from training. Duration: 0.7545625 2023-10-04 12:04:33,917 INFO [train.py:1046] (2/4) Epoch 47, batch 2850, loss[loss=0.1403, simple_loss=0.2206, pruned_loss=0.02998, over 23572.00 frames. ], tot_loss[loss=0.1528, simple_loss=0.2333, pruned_loss=0.03612, over 4718559.05 frames. ], batch size: 256, lr: 2.16e-03, grad_scale: 16.0 2023-10-04 12:04:33,981 WARNING [train.py:1204] (2/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_332-256168-0_sp1.1 from training. Duration: 0.6818125 2023-10-04 12:04:34,028 WARNING [train.py:1204] (2/4) Exclude cut with ID _48836_210_12210_1_1534075848293_3904150_429-35475-0_sp1.1 from training. Duration: 0.8090625 2023-10-04 12:04:35,519 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_144-135833-0_sp1.1 from training. Duration: 0.97275 2023-10-04 12:04:38,323 WARNING [train.py:1204] (2/4) Exclude cut with ID _14224_210_4381_1_1530529062037_4264010_380-343670-0_sp0.9 from training. Duration: 0.97775 2023-10-04 12:04:38,357 WARNING [train.py:1204] (2/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_213-265174-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 12:04:38,386 WARNING [train.py:1204] (2/4) Exclude cut with ID _20455_210_5219_1_1531565996775_8098100_86-314283-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 12:04:41,166 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_41750_210_1960_1_1533520678_3948042_68-223813-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 12:04:41,246 WARNING [train.py:1204] (2/4) Exclude cut with ID _55153_210_2253_1_1534384786677_3162096_24-198429-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 12:04:43,897 WARNING [train.py:1204] (2/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_119-207162-0_sp0.9 from training. Duration: 0.788875 2023-10-04 12:04:43,948 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_12868_210_6753_1_1530702479_3610564_148-172861-0 from training. Duration: 0.5 2023-10-04 12:04:50,057 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_5522_210_3273_1_1527249606_3909096_318-203153-0 from training. Duration: 0.43 2023-10-04 12:04:50,064 WARNING [train.py:1204] (2/4) Exclude cut with ID _14884_210_2252_1_1530707459689_5561250_288-189949-0_sp1.1 from training. Duration: 0.97275 2023-10-04 12:04:50,251 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=1648120.0, ans=0.1 2023-10-04 12:04:52,797 WARNING [train.py:1204] (2/4) Exclude cut with ID _5294_210_1747_1_1527249643928_3696179_529-242298-0 from training. Duration: 0.95 2023-10-04 12:04:53,505 WARNING [train.py:1204] (2/4) Exclude cut with ID _36538_210_9994_1_1533204020363_3795620_503-110289-0_sp1.1 from training. Duration: 0.936375 2023-10-04 12:04:56,495 WARNING [train.py:1204] (2/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_385-193052-0 from training. Duration: 0.86 2023-10-04 12:04:56,566 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_456-22141-0 from training. Duration: 0.88 2023-10-04 12:04:57,950 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_38965_210_4852_1_1533174906_3484735_284-117840-0_sp1.1 from training. Duration: 0.936375 2023-10-04 12:05:01,209 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.4.encoder.layers.1.conv_module2.whiten, num_groups=1, num_channels=384, metric=4.55 vs. limit=15.0 2023-10-04 12:05:10,929 WARNING [train.py:1204] (2/4) Exclude cut with ID _32704_210_8954_1_1533017488840_6747370_75-201625-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 12:05:12,406 WARNING [train.py:1204] (2/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_255-312654-0_sp1.1 from training. Duration: 0.82725 2023-10-04 12:05:12,416 WARNING [train.py:1204] (2/4) Exclude cut with ID _47251_210_18628_1_1533814063128_4137250_214-88076-0_sp0.9 from training. Duration: 0.97775 2023-10-04 12:05:12,495 WARNING [train.py:1204] (2/4) Exclude cut with ID _4092_210_2151_1_1526643044449_3135759_227-213972-0_sp1.1 from training. Duration: 0.4909375 2023-10-04 12:05:12,510 WARNING [train.py:1204] (2/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_912-47473-0_sp1.1 from training. Duration: 0.6181875 2023-10-04 12:05:12,544 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6767_210_3273_1_1527855783_1718932_173-256601-0_sp1.1 from training. Duration: 0.72725 2023-10-04 12:05:13,981 WARNING [train.py:1204] (2/4) Exclude cut with ID _6274_210_2084_1_1527937583837_7494020_32-205288-0_sp1.1 from training. Duration: 0.7090625 2023-10-04 12:05:15,286 WARNING [train.py:1204] (2/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_39-248870-0 from training. Duration: 0.69 2023-10-04 12:05:16,698 WARNING [train.py:1204] (2/4) Exclude cut with ID _3541_210_2477_1_1526475408709_3263973_377-296839-0_sp1.1 from training. Duration: 0.5 2023-10-04 12:05:16,719 WARNING [train.py:1204] (2/4) Exclude cut with ID _13679_210_7497_1_1530534293662_3355098_413-56154-0_sp1.1 from training. Duration: 0.963625 2023-10-04 12:05:18,090 WARNING [train.py:1204] (2/4) Exclude cut with ID _14970_210_15709_1_1530775547361_1966965_201-84544-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 12:05:19,321 WARNING [train.py:1204] (2/4) Exclude cut with ID _21783_210_11357_1_1531742458730_3762679_105-95775-0_sp1.1 from training. Duration: 0.936375 2023-10-04 12:05:22,365 WARNING [train.py:1204] (2/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_131-191669-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 12:05:23,711 WARNING [train.py:1204] (2/4) Exclude cut with ID _14860_210_4915_1_1530702020340_7343240_695-4323-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 12:05:24,044 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.balancer1.prob, batch_count=1648253.3333333333, ans=0.125 2023-10-04 12:05:25,021 WARNING [train.py:1204] (2/4) Exclude cut with ID _39867_210_9826_1_1533553222030_9344259_991-130565-0_sp1.1 from training. Duration: 0.97275 2023-10-04 12:05:27,037 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_50-3289-0_sp1.1 from training. Duration: 0.82725 2023-10-04 12:05:30,335 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_54-347670-0_sp1.1 from training. Duration: 0.92725 2023-10-04 12:05:30,387 WARNING [train.py:1204] (2/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_200-153445-0_sp1.1 from training. Duration: 0.936375 2023-10-04 12:05:31,686 WARNING [train.py:1204] (2/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_279-302911-0_sp1.1 from training. Duration: 0.97275 2023-10-04 12:05:33,210 WARNING [train.py:1204] (2/4) Exclude cut with ID _14886_210_4220_1_1530702020355_3516190_159-73748-0_sp0.9 from training. Duration: 0.92225 2023-10-04 12:05:37,959 WARNING [train.py:1204] (2/4) Exclude cut with ID _53379_210_20034_1_1534222027363_3854654_23-222715-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 12:05:39,334 WARNING [train.py:1204] (2/4) Exclude cut with ID _12555_210_8750_1_1530075594402_3780239_449-267986-0 from training. Duration: 0.83 2023-10-04 12:05:39,352 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7544_210_6753_1_1528587722_3576686_295-258479-0 from training. Duration: 0.84 2023-10-04 12:05:40,729 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7002_210_1794_1_1527935634_4034444_487-236501-0_sp0.9 from training. Duration: 0.7333125 2023-10-04 12:05:42,145 WARNING [train.py:1204] (2/4) Exclude cut with ID _21783_210_11357_1_1531742458730_3762679_244-19107-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 12:05:42,158 WARNING [train.py:1204] (2/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_390-339137-0 from training. Duration: 0.56 2023-10-04 12:05:43,411 WARNING [train.py:1204] (2/4) Exclude cut with ID _7562_210_2084_1_1528887767916_3946229_186-326422-0_sp1.1 from training. Duration: 0.763625 2023-10-04 12:05:44,108 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.2.encoder.layers.1.whiten, num_groups=1, num_channels=384, metric=4.53 vs. limit=12.0 2023-10-04 12:05:44,724 WARNING [train.py:1204] (2/4) Exclude cut with ID _33640_210_2655_1_1533117605479_4855469_645-145235-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 12:05:44,741 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_25767_210_2161_1_1532307483_3867748_506-171777-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 12:05:44,772 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_5438_210_1794_1_1527336687_2600009_233-58072-0_sp1.1 from training. Duration: 0.7 2023-10-04 12:05:44,773 WARNING [train.py:1204] (2/4) Exclude cut with ID IC0669W0063-330908 from training. Duration: 0.896 2023-10-04 12:05:44,809 WARNING [train.py:1204] (2/4) Exclude cut with ID IC0671W0070-331913 from training. Duration: 0.896 2023-10-04 12:05:44,812 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_258-310570-0_sp1.1 from training. Duration: 0.7454375 2023-10-04 12:05:46,137 WARNING [train.py:1204] (2/4) Exclude cut with ID _9461_210_4557_1_1529060514726_3677451_162-177507-0_sp1.1 from training. Duration: 0.97275 2023-10-04 12:05:47,444 INFO [train.py:1046] (2/4) Epoch 47, batch 2900, loss[loss=0.1429, simple_loss=0.2312, pruned_loss=0.02731, over 24686.00 frames. ], tot_loss[loss=0.1528, simple_loss=0.2335, pruned_loss=0.03608, over 4716990.41 frames. ], batch size: 65, lr: 2.16e-03, grad_scale: 16.0 2023-10-04 12:05:50,455 WARNING [train.py:1204] (2/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_26-175553-0_sp0.9 from training. Duration: 0.67775 2023-10-04 12:05:50,478 WARNING [train.py:1204] (2/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_7-150481-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 12:05:50,530 WARNING [train.py:1204] (2/4) Exclude cut with ID _23557_210_8750_1_1531890106370_6846255_72-273262-0_sp1.1 from training. Duration: 0.963625 2023-10-04 12:05:51,854 WARNING [train.py:1204] (2/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_367-61330-0 from training. Duration: 0.75 2023-10-04 12:05:53,560 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.729e+02 2.030e+02 2.253e+02 2.601e+02 4.096e+02, threshold=4.506e+02, percent-clipped=0.0 2023-10-04 12:05:56,417 WARNING [train.py:1204] (2/4) Exclude cut with ID _39867_210_9826_1_1533553222030_9344259_1337-11141-0_sp1.1 from training. Duration: 0.936375 2023-10-04 12:05:56,443 WARNING [train.py:1204] (2/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_101-293367-0 from training. Duration: 0.63 2023-10-04 12:05:58,179 WARNING [train.py:1204] (2/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_10-100367-0 from training. Duration: 0.98 2023-10-04 12:05:59,621 WARNING [train.py:1204] (2/4) Exclude cut with ID _60057_210_10640_1_1534903065793_3333150_171-181133-0_sp1.1 from training. Duration: 0.8 2023-10-04 12:05:59,624 WARNING [train.py:1204] (2/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_844-96989-0_sp0.9 from training. Duration: 0.788875 2023-10-04 12:06:02,780 WARNING [train.py:1204] (2/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_301-144595-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 12:06:04,173 WARNING [train.py:1204] (2/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_115-147343-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 12:06:05,585 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_3171_210_2087_1_1526109084_2330007_37-64334-0_sp1.1 from training. Duration: 0.7454375 2023-10-04 12:06:07,333 WARNING [train.py:1204] (2/4) Exclude cut with ID _19514_210_9259_1_1531530071330_5084230_769-272914-0_sp1.1 from training. Duration: 0.936375 2023-10-04 12:06:07,597 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.conv_module1.balancer1.prob, batch_count=1648453.3333333333, ans=0.125 2023-10-04 12:06:10,149 WARNING [train.py:1204] (2/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_76-311963-0_sp0.9 from training. Duration: 0.77775 2023-10-04 12:06:10,186 WARNING [train.py:1204] (2/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_526-62079-0 from training. Duration: 0.67 2023-10-04 12:06:10,238 WARNING [train.py:1204] (2/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_7-111954-0_sp0.9 from training. Duration: 0.62225 2023-10-04 12:06:12,723 WARNING [train.py:1204] (2/4) Exclude cut with ID _57997_210_4163_1_1534766425889_3961049_360-279140-0_sp1.1 from training. Duration: 0.97275 2023-10-04 12:06:13,461 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.3.encoder.layers.0.feed_forward3.out_whiten, num_groups=1, num_channels=512, metric=11.77 vs. limit=15.0 2023-10-04 12:06:14,190 WARNING [train.py:1204] (2/4) Exclude cut with ID _4866_210_2577_1_1527404412284_4364659_133-1696-0 from training. Duration: 0.99 2023-10-04 12:06:15,565 WARNING [train.py:1204] (2/4) Exclude cut with ID _5190_210_4557_1_1527244368799_373439_28-197030-0 from training. Duration: 0.72 2023-10-04 12:06:16,409 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.2.encoder.layers.0.self_attn2.whiten, num_groups=1, num_channels=384, metric=19.82 vs. limit=22.5 2023-10-04 12:06:18,349 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_53-39003-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 12:06:18,352 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6673_210_3273_1_1527767790_3173240_351-167954-0 from training. Duration: 0.48 2023-10-04 12:06:18,367 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_2822_210_2715_1_1526112586_1999587_165-36656-0_sp1.1 from training. Duration: 0.8090625 2023-10-04 12:06:19,783 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_328-264892-0_sp1.1 from training. Duration: 0.92725 2023-10-04 12:06:19,795 WARNING [train.py:1204] (2/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_29-293896-0_sp0.9 from training. Duration: 0.67775 2023-10-04 12:06:22,560 WARNING [train.py:1204] (2/4) Exclude cut with ID _65263_210_22356_1_1535763598691_3921037_465-55448-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 12:06:24,251 WARNING [train.py:1204] (2/4) Exclude cut with ID _21989_210_5854_1_1531899214931_3271360_311-53106-0_sp1.1 from training. Duration: 0.97275 2023-10-04 12:06:28,433 WARNING [train.py:1204] (2/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_389-277915-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 12:06:31,506 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_69630_210_7234_1_1535791094_677837_63-277139-0_sp1.1 from training. Duration: 0.963625 2023-10-04 12:06:32,922 WARNING [train.py:1204] (2/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_85-138257-0 from training. Duration: 0.71 2023-10-04 12:06:32,937 WARNING [train.py:1204] (2/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_686-247600-0 from training. Duration: 0.92 2023-10-04 12:06:32,942 WARNING [train.py:1204] (2/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_108-255430-0_sp1.1 from training. Duration: 0.8818125 2023-10-04 12:06:36,950 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.feed_forward1.hidden_balancer.prob, batch_count=1648586.6666666667, ans=0.125 2023-10-04 12:06:38,302 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.balancer2.prob, batch_count=1648586.6666666667, ans=0.125 2023-10-04 12:06:39,424 WARNING [train.py:1204] (2/4) Exclude cut with ID _14884_210_2252_1_1530707459689_5561250_114-201399-0_sp1.1 from training. Duration: 0.6909375 2023-10-04 12:06:40,870 WARNING [train.py:1204] (2/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_506-42614-0 from training. Duration: 0.79 2023-10-04 12:06:40,963 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_43802_210_2290_1_1533516203_8829504_85-195240-0_sp1.1 from training. Duration: 0.7909375 2023-10-04 12:06:46,630 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_141-36243-0_sp1.1 from training. Duration: 0.97275 2023-10-04 12:06:53,666 WARNING [train.py:1204] (2/4) Exclude cut with ID _7852_210_5219_1_1528286471316_8139400_358-287492-0_sp1.1 from training. Duration: 0.87275 2023-10-04 12:06:53,690 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_805-71497-0_sp0.9 from training. Duration: 0.788875 2023-10-04 12:06:53,851 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=1648653.3333333333, ans=0.1 2023-10-04 12:06:55,540 WARNING [train.py:1204] (2/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_178-39722-0 from training. Duration: 0.49 2023-10-04 12:06:58,314 WARNING [train.py:1204] (2/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_428-181925-0_sp1.1 from training. Duration: 0.963625 2023-10-04 12:06:58,318 WARNING [train.py:1204] (2/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_770-200430-0 from training. Duration: 0.89 2023-10-04 12:06:59,613 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_1104-271009-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 12:07:00,878 INFO [train.py:1046] (2/4) Epoch 47, batch 2950, loss[loss=0.1548, simple_loss=0.234, pruned_loss=0.03782, over 23759.00 frames. ], tot_loss[loss=0.1533, simple_loss=0.2343, pruned_loss=0.03614, over 4717453.83 frames. ], batch size: 164, lr: 2.16e-03, grad_scale: 16.0 2023-10-04 12:07:00,918 WARNING [train.py:1204] (2/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_128-296581-0_sp0.9 from training. Duration: 0.82225 2023-10-04 12:07:06,114 WARNING [train.py:1204] (2/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_19-62511-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 12:07:07,464 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_805-71497-0 from training. Duration: 0.71 2023-10-04 12:07:09,288 WARNING [train.py:1204] (2/4) Exclude cut with ID _44778_210_2054_1_1533798127203_7443090_573-152077-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 12:07:09,294 WARNING [train.py:1204] (2/4) Exclude cut with ID _42875_210_14856_1_1533704397487_4012433_393-177816-0_sp1.1 from training. Duration: 0.963625 2023-10-04 12:07:10,803 WARNING [train.py:1204] (2/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_278-35159-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 12:07:12,245 WARNING [train.py:1204] (2/4) Exclude cut with ID _15037_210_2253_1_1530752294104_3267650_13-130361-0_sp1.1 from training. Duration: 0.9 2023-10-04 12:07:13,498 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_3019_210_2028_1_1526044245_3264865_127-135615-0 from training. Duration: 0.94 2023-10-04 12:07:13,765 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.conv_module1.balancer1.prob, batch_count=1648720.0, ans=0.125 2023-10-04 12:07:14,809 WARNING [train.py:1204] (2/4) Exclude cut with ID _28317_210_12742_1_1532480302381_8510325_607-213951-0 from training. Duration: 0.84 2023-10-04 12:07:16,144 WARNING [train.py:1204] (2/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_436-318113-0_sp0.9 from training. Duration: 0.8333125 2023-10-04 12:07:16,144 WARNING [train.py:1204] (2/4) Exclude cut with ID _13938_210_9988_1_1530518718978_3693960_148-152225-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 12:07:20,375 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_2541_210_1794_1_1525590269_3084765_156-173713-0_sp1.1 from training. Duration: 0.736375 2023-10-04 12:07:21,717 WARNING [train.py:1204] (2/4) Exclude cut with ID _28323_210_16187_1_1532480395667_4100300_531-24157-0_sp1.1 from training. Duration: 0.8909375 2023-10-04 12:07:21,898 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.nonlin_attention.balancer.prob, batch_count=1648786.6666666667, ans=0.125 2023-10-04 12:07:24,394 WARNING [train.py:1204] (2/4) Exclude cut with ID _29303_210_2575_1_1532392237303_7410649_382-217492-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 12:07:24,433 WARNING [train.py:1204] (2/4) Exclude cut with ID _58895_210_8750_1_1535184081320_3649089_270-158013-0_sp1.1 from training. Duration: 0.92725 2023-10-04 12:07:27,652 WARNING [train.py:1204] (2/4) Exclude cut with ID _29277_210_3279_1_1532581109293_3173069_311-206227-0_sp1.1 from training. Duration: 0.936375 2023-10-04 12:07:27,664 WARNING [train.py:1204] (2/4) Exclude cut with ID _7160_210_1876_1_1527940923231_3703799_282-322611-0_sp0.9 from training. Duration: 0.9666875 2023-10-04 12:07:29,231 WARNING [train.py:1204] (2/4) Exclude cut with ID _53219_210_4525_1_1534834836008_4064825_562-32295-0_sp1.1 from training. Duration: 0.963625 2023-10-04 12:07:29,319 WARNING [train.py:1204] (2/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_408-87330-0_sp1.1 from training. Duration: 0.963625 2023-10-04 12:07:29,326 WARNING [train.py:1204] (2/4) Exclude cut with ID _59202_210_26984_1_1534816810612_3549174_137-318710-0_sp1.1 from training. Duration: 0.8090625 2023-10-04 12:07:30,055 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.3.encoder.layers.0.conv_module1.whiten, num_groups=1, num_channels=512, metric=4.13 vs. limit=15.0 2023-10-04 12:07:32,658 WARNING [train.py:1204] (2/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_372-30302-0 from training. Duration: 0.86 2023-10-04 12:07:37,481 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.0.feed_forward2.hidden_balancer.prob, batch_count=1648853.3333333333, ans=0.125 2023-10-04 12:07:39,311 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7543_210_6753_1_1528541907_1399322_178-56269-0_sp1.1 from training. Duration: 0.95275 2023-10-04 12:07:39,337 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9109W0012-549758 from training. Duration: 0.512 2023-10-04 12:07:40,620 WARNING [train.py:1204] (2/4) Exclude cut with ID _5410_210_1388_1_1527397055795_1719020_39-112221-0_sp0.9 from training. Duration: 0.7666875 2023-10-04 12:07:42,039 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9121W0012-554933 from training. Duration: 0.896 2023-10-04 12:07:42,132 WARNING [train.py:1204] (2/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_476-272757-0 from training. Duration: 0.93 2023-10-04 12:07:43,417 WARNING [train.py:1204] (2/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_1085-171718-0_sp1.1 from training. Duration: 0.92725 2023-10-04 12:07:43,477 WARNING [train.py:1204] (2/4) Exclude cut with ID _8480_210_1874_1_1528589391205_6728330_309-139896-0_sp1.1 from training. Duration: 0.736375 2023-10-04 12:07:43,477 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9130W0113-559703 from training. Duration: 0.896 2023-10-04 12:07:43,481 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_292-265928-0_sp1.1 from training. Duration: 0.463625 2023-10-04 12:07:46,205 WARNING [train.py:1204] (2/4) Exclude cut with ID _8772_210_1850_1_1528635595972_3664011_96-12756-0 from training. Duration: 0.98 2023-10-04 12:07:47,510 WARNING [train.py:1204] (2/4) Exclude cut with ID _12556_210_8341_1_1530081024100_7327690_725-193477-0_sp1.1 from training. Duration: 0.8909375 2023-10-04 12:07:47,554 WARNING [train.py:1204] (2/4) Exclude cut with ID _5289_210_2084_1_1527678202736_3779299_142-103317-0_sp1.1 from training. Duration: 0.763625 2023-10-04 12:07:50,377 WARNING [train.py:1204] (2/4) Exclude cut with ID _45012_210_12651_1_1533608435136_5371380_206-51075-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 12:07:50,589 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.1.attention_skip_rate, batch_count=1648920.0, ans=0.0 2023-10-04 12:07:51,611 WARNING [train.py:1204] (2/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_176-56120-0_sp1.1 from training. Duration: 0.7545625 2023-10-04 12:07:51,645 WARNING [train.py:1204] (2/4) Exclude cut with ID _40284_210_2084_1_1533513679814_3704279_220-201864-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 12:07:51,675 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9165W0004-576638 from training. Duration: 0.896 2023-10-04 12:07:51,705 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_5438_210_1794_1_1527336687_2600009_218-343336-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 12:07:51,734 WARNING [train.py:1204] (2/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_318-162017-0 from training. Duration: 0.91 2023-10-04 12:07:54,723 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.conv_module2.balancer1.min_positive, batch_count=1648920.0, ans=0.025 2023-10-04 12:07:59,038 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_49544_210_1794_1_1534379770_5262083_483-349298-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 12:08:00,419 WARNING [train.py:1204] (2/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_197-28164-0_sp0.9 from training. Duration: 0.888875 2023-10-04 12:08:01,826 WARNING [train.py:1204] (2/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_643-52007-0 from training. Duration: 0.57 2023-10-04 12:08:01,860 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_38965_210_4852_1_1533174906_3484735_403-332963-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 12:08:05,141 WARNING [train.py:1204] (2/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_340-174176-0 from training. Duration: 0.49 2023-10-04 12:08:06,578 WARNING [train.py:1204] (2/4) Exclude cut with ID _56708_210_13177_1_1535252351282_3422899_363-917-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 12:08:07,983 WARNING [train.py:1204] (2/4) Exclude cut with ID _19896_210_1883_1_1531709022662_6649569_525-159063-0_sp1.1 from training. Duration: 0.936375 2023-10-04 12:08:08,021 WARNING [train.py:1204] (2/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_430-116139-0_sp0.9 from training. Duration: 0.9444375 2023-10-04 12:08:11,313 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_53753_210_7234_1_1534320003_3594148_86-329250-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 12:08:11,329 WARNING [train.py:1204] (2/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_406-275649-0_sp1.1 from training. Duration: 0.3818125 2023-10-04 12:08:13,099 WARNING [train.py:1204] (2/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_1083-147536-0_sp1.1 from training. Duration: 0.8818125 2023-10-04 12:08:14,414 WARNING [train.py:1204] (2/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_54-311741-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 12:08:14,419 WARNING [train.py:1204] (2/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_237-58433-0_sp0.9 from training. Duration: 0.82225 2023-10-04 12:08:14,462 WARNING [train.py:1204] (2/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_98-51676-0_sp1.1 from training. Duration: 0.663625 2023-10-04 12:08:14,511 WARNING [train.py:1204] (2/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_386-270472-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 12:08:15,876 INFO [train.py:1046] (2/4) Epoch 47, batch 3000, loss[loss=0.1692, simple_loss=0.2401, pruned_loss=0.04918, over 23690.00 frames. ], tot_loss[loss=0.1546, simple_loss=0.2356, pruned_loss=0.03676, over 4717927.92 frames. ], batch size: 164, lr: 2.16e-03, grad_scale: 8.0 2023-10-04 12:08:15,876 INFO [train.py:1069] (2/4) Computing validation loss 2023-10-04 12:08:28,119 INFO [train.py:1078] (2/4) Epoch 47, validation: loss=0.3516, simple_loss=0.269, pruned_loss=0.2171, over 1125622.00 frames. 2023-10-04 12:08:28,120 INFO [train.py:1079] (2/4) Maximum memory allocated so far is 20967MB 2023-10-04 12:08:28,200 WARNING [train.py:1204] (2/4) Exclude cut with ID _19023_210_4353_1_1531378892552_5820780_83-254794-0_sp1.1 from training. Duration: 0.7818125 2023-10-04 12:08:29,503 WARNING [train.py:1204] (2/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_314-93656-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 12:08:29,523 WARNING [train.py:1204] (2/4) Exclude cut with ID _14170_210_5219_1_1530525949868_4469650_336-213530-0 from training. Duration: 0.9 2023-10-04 12:08:30,013 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.feed_forward2.out_whiten.whitening_limit, batch_count=1649053.3333333333, ans=15.0 2023-10-04 12:08:31,571 WARNING [train.py:1204] (2/4) Exclude cut with ID _36686_210_5665_1_1533778225913_3646410_65-221120-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 12:08:33,956 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_26433_210_12108_1_1532157158_4056480_271-68178-0_sp1.1 from training. Duration: 0.92725 2023-10-04 12:08:34,003 WARNING [train.py:1204] (2/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_46-207599-0_sp0.9 from training. Duration: 0.711125 2023-10-04 12:08:34,617 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.4.encoder.layers.1.conv_module2.whiten, num_groups=1, num_channels=384, metric=2.80 vs. limit=15.0 2023-10-04 12:08:35,323 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.622e+02 2.022e+02 2.270e+02 2.675e+02 4.950e+02, threshold=4.541e+02, percent-clipped=1.0 2023-10-04 12:08:36,890 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9303W0114-637133 from training. Duration: 0.896 2023-10-04 12:08:36,924 WARNING [train.py:1204] (2/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_490-207861-0 from training. Duration: 0.98 2023-10-04 12:08:39,077 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.bypass.skip_rate, batch_count=1649053.3333333333, ans=0.09899494936611666 2023-10-04 12:08:40,264 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6767_210_3273_1_1527855783_1718932_173-256601-0_sp0.9 from training. Duration: 0.888875 2023-10-04 12:08:40,299 WARNING [train.py:1204] (2/4) Exclude cut with ID _13921_210_9994_1_1530841782272_3503520_327-69677-0_sp1.1 from training. Duration: 0.8181875 2023-10-04 12:08:41,639 WARNING [train.py:1204] (2/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_508-335369-0 from training. Duration: 0.48 2023-10-04 12:08:41,662 WARNING [train.py:1204] (2/4) Exclude cut with ID _64631_210_27767_1_1535349580068_4165160_24-46567-0_sp1.1 from training. Duration: 0.963625 2023-10-04 12:08:49,968 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_89-35748-0_sp1.1 from training. Duration: 0.7181875 2023-10-04 12:08:54,396 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.ff2_skip_rate, batch_count=1649120.0, ans=0.0 2023-10-04 12:08:55,827 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.feed_forward1.out_proj.dropout_p, batch_count=1649186.6666666667, ans=0.1 2023-10-04 12:08:56,879 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_26433_210_12108_1_1532157158_4056480_578-262107-0_sp1.1 from training. Duration: 0.9 2023-10-04 12:09:03,537 WARNING [train.py:1204] (2/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_129-337802-0 from training. Duration: 0.72 2023-10-04 12:09:04,965 WARNING [train.py:1204] (2/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_356-167816-0_sp0.9 from training. Duration: 0.8 2023-10-04 12:09:07,692 WARNING [train.py:1204] (2/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_137-116442-0_sp1.1 from training. Duration: 0.7090625 2023-10-04 12:09:07,721 WARNING [train.py:1204] (2/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_358-172940-0_sp1.1 from training. Duration: 0.963625 2023-10-04 12:09:09,112 WARNING [train.py:1204] (2/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_668-344147-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 12:09:10,954 WARNING [train.py:1204] (2/4) Exclude cut with ID _62337_210_12392_1_1535009273105_3412229_255-157667-0_sp1.1 from training. Duration: 0.936375 2023-10-04 12:09:10,957 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_486-151131-0 from training. Duration: 0.85 2023-10-04 12:09:14,299 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_53638_210_21581_1_1534297319_5808449_885-80172-0 from training. Duration: 0.96 2023-10-04 12:09:14,396 WARNING [train.py:1204] (2/4) Exclude cut with ID _50093_210_4220_1_1534417141207_3432299_105-183067-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 12:09:15,844 WARNING [train.py:1204] (2/4) Exclude cut with ID _7562_210_2084_1_1528887767916_3946229_345-169156-0_sp0.9 from training. Duration: 0.6444375 2023-10-04 12:09:17,249 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_4528_210_3273_1_1527076654_3276777_209-219804-0_sp0.9 from training. Duration: 0.7666875 2023-10-04 12:09:17,285 WARNING [train.py:1204] (2/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_35-153886-0_sp0.9 from training. Duration: 0.9333125 2023-10-04 12:09:17,343 WARNING [train.py:1204] (2/4) Exclude cut with ID _30800_210_22382_1_1532499347904_4515959_332-121925-0_sp1.1 from training. Duration: 0.92725 2023-10-04 12:09:17,345 WARNING [train.py:1204] (2/4) Exclude cut with ID _59202_210_26984_1_1534816810612_3549174_495-65515-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 12:09:21,491 WARNING [train.py:1204] (2/4) Exclude cut with ID _6274_210_2084_1_1527937583837_7494020_769-168302-0_sp1.1 from training. Duration: 0.6454375 2023-10-04 12:09:22,884 WARNING [train.py:1204] (2/4) Exclude cut with ID _24271_210_12658_1_1531987270022_7190519_708-71345-0_sp1.1 from training. Duration: 0.936375 2023-10-04 12:09:22,885 WARNING [train.py:1204] (2/4) Exclude cut with ID 3033-130750-0096-55598-0_sp0.9 from training. Duration: 0.92225 2023-10-04 12:09:25,470 WARNING [train.py:1204] (2/4) Exclude cut with ID _14087_210_2253_1_1530529224512_3014029_219-224445-0_sp0.9 from training. Duration: 0.9333125 2023-10-04 12:09:26,982 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_174-277364-0 from training. Duration: 0.71 2023-10-04 12:09:28,292 WARNING [train.py:1204] (2/4) Exclude cut with ID _45021_210_9994_1_1533619751094_3330288_96-239010-0_sp1.1 from training. Duration: 0.82725 2023-10-04 12:09:28,326 WARNING [train.py:1204] (2/4) Exclude cut with ID _31602_210_5854_1_1532677021456_4458250_489-305116-0_sp1.1 from training. Duration: 0.97275 2023-10-04 12:09:30,012 WARNING [train.py:1204] (2/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_269-154726-0_sp0.9 from training. Duration: 0.9555625 2023-10-04 12:09:33,920 WARNING [train.py:1204] (2/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_126-54418-0_sp1.1 from training. Duration: 0.92725 2023-10-04 12:09:33,947 WARNING [train.py:1204] (2/4) Exclude cut with ID _42326_210_7770_1_1533814282531_3707269_182-224159-0_sp1.1 from training. Duration: 0.92725 2023-10-04 12:09:35,900 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.2.encoder.layers.2.nonlin_attention.whiten2, num_groups=1, num_channels=384, metric=4.79 vs. limit=15.0 2023-10-04 12:09:36,532 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7002_210_1794_1_1527939683_5157286_1-281264-0_sp0.9 from training. Duration: 0.488875 2023-10-04 12:09:36,566 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_86-16881-0 from training. Duration: 0.58 2023-10-04 12:09:36,588 WARNING [train.py:1204] (2/4) Exclude cut with ID _19514_210_9259_1_1531530071330_5084230_637-279094-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 12:09:36,624 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_26433_210_12108_1_1532157158_4056480_578-262107-0 from training. Duration: 0.99 2023-10-04 12:09:37,896 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6291_210_3273_1_1527683649_1078498_82-221196-0_sp1.1 from training. Duration: 0.6909375 2023-10-04 12:09:39,305 WARNING [train.py:1204] (2/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_402-102311-0 from training. Duration: 0.67 2023-10-04 12:09:40,754 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1015-343884-0_sp1.1 from training. Duration: 0.563625 2023-10-04 12:09:42,640 INFO [train.py:1046] (2/4) Epoch 47, batch 3050, loss[loss=0.1498, simple_loss=0.236, pruned_loss=0.03184, over 23257.00 frames. ], tot_loss[loss=0.155, simple_loss=0.236, pruned_loss=0.037, over 4717667.96 frames. ], batch size: 93, lr: 2.16e-03, grad_scale: 8.0 2023-10-04 12:09:42,730 WARNING [train.py:1204] (2/4) Exclude cut with ID _13684_210_7497_1_1530619354863_3598359_312-235606-0_sp1.1 from training. Duration: 0.5454375 2023-10-04 12:09:42,752 WARNING [train.py:1204] (2/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_55-31904-0 from training. Duration: 0.88 2023-10-04 12:09:44,694 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_3133_210_2087_1_1526106446_2324690_25-332138-0 from training. Duration: 0.91 2023-10-04 12:09:44,696 WARNING [train.py:1204] (2/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_912-282412-0_sp0.9 from training. Duration: 0.5444375 2023-10-04 12:09:46,043 WARNING [train.py:1204] (2/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_458-299435-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 12:09:46,116 WARNING [train.py:1204] (2/4) Exclude cut with ID _69584_210_5219_1_1535785133775_4044450_275-88403-0_sp1.1 from training. Duration: 0.92725 2023-10-04 12:09:46,124 WARNING [train.py:1204] (2/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_841-341679-0_sp1.1 from training. Duration: 0.52725 2023-10-04 12:09:46,131 WARNING [train.py:1204] (2/4) Exclude cut with ID _41391_210_7994_1_1533603771872_3638201_1-54521-0_sp1.1 from training. Duration: 0.97275 2023-10-04 12:09:47,490 WARNING [train.py:1204] (2/4) Exclude cut with ID _56235_210_10640_1_1534557335235_4161099_495-186609-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 12:09:48,967 WARNING [train.py:1204] (2/4) Exclude cut with ID _4681_210_3525_1_1527250689464_120889_15-126760-0 from training. Duration: 0.81 2023-10-04 12:09:51,665 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_3019_210_2028_1_1526044245_3264865_127-135615-0_sp1.1 from training. Duration: 0.8545625 2023-10-04 12:09:51,909 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.conv_module1.balancer2.prob, batch_count=1649386.6666666667, ans=0.125 2023-10-04 12:09:53,067 WARNING [train.py:1204] (2/4) Exclude cut with ID _30191_210_1664_1_1533189915047_7196670_915-21561-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 12:09:53,533 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.4.encoder.layers.2.nonlin_attention.whiten2, num_groups=1, num_channels=384, metric=3.74 vs. limit=15.0 2023-10-04 12:09:54,394 WARNING [train.py:1204] (2/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_6-214815-0_sp0.9 from training. Duration: 0.9444375 2023-10-04 12:09:57,215 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_45399_210_4852_1_1533624725_7579737_190-137494-0_sp1.1 from training. Duration: 0.97275 2023-10-04 12:10:00,032 WARNING [train.py:1204] (2/4) Exclude cut with ID _41314_210_12651_1_1533459476543_3766460_207-132038-0 from training. Duration: 0.94 2023-10-04 12:10:06,709 WARNING [train.py:1204] (2/4) Exclude cut with ID _14603_210_4220_1_1530615688441_3097319_90-31383-0 from training. Duration: 0.87 2023-10-04 12:10:06,748 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_15517_210_6753_1_1530952293_1664795_119-31001-0 from training. Duration: 0.94 2023-10-04 12:10:08,015 WARNING [train.py:1204] (2/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_684-85342-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 12:10:09,495 WARNING [train.py:1204] (2/4) Exclude cut with ID _14603_210_4220_1_1530615688441_3097319_97-325053-0_sp1.1 from training. Duration: 0.836375 2023-10-04 12:10:14,683 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_68702_210_23689_1_1535799577_3420068_168-25150-0_sp1.1 from training. Duration: 0.97275 2023-10-04 12:10:14,691 WARNING [train.py:1204] (2/4) Exclude cut with ID _65134_210_14763_1_1535335393985_3505368_111-27984-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 12:10:16,099 WARNING [train.py:1204] (2/4) Exclude cut with ID _48678_210_5663_1_1533988830947_3769199_276-226230-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 12:10:17,589 WARNING [train.py:1204] (2/4) Exclude cut with ID _28427_210_4220_1_1532581240967_3061069_43-84928-0_sp1.1 from training. Duration: 0.9 2023-10-04 12:10:17,749 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.conv_module2.balancer2.prob, batch_count=1649520.0, ans=0.125 2023-10-04 12:10:18,866 WARNING [train.py:1204] (2/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_158-250116-0_sp1.1 from training. Duration: 0.563625 2023-10-04 12:10:18,904 WARNING [train.py:1204] (2/4) Exclude cut with ID _66873_210_5517_1_1535454136506_4293370_279-332556-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 12:10:18,941 WARNING [train.py:1204] (2/4) Exclude cut with ID _42863_210_9070_1_1533902398788_3696920_488-230146-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 12:10:18,941 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_387-247159-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 12:10:20,329 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6729_210_2550_1_1528032481_1788725_261-42619-0_sp1.1 from training. Duration: 0.97275 2023-10-04 12:10:20,461 WARNING [train.py:1204] (2/4) Exclude cut with ID _19896_210_1883_1_1531709022662_6649569_831-44724-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 12:10:24,656 WARNING [train.py:1204] (2/4) Exclude cut with ID _55153_210_2253_1_1534384786677_3162096_280-285944-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 12:10:24,686 WARNING [train.py:1204] (2/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_448-4048-0 from training. Duration: 0.96 2023-10-04 12:10:25,920 WARNING [train.py:1204] (2/4) Exclude cut with ID _29678_210_7893_1_1532421042787_3906118_456-202548-0_sp1.1 from training. Duration: 0.97275 2023-10-04 12:10:25,940 WARNING [train.py:1204] (2/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_107-332398-0_sp1.1 from training. Duration: 0.7090625 2023-10-04 12:10:27,350 WARNING [train.py:1204] (2/4) Exclude cut with ID _13921_210_9994_1_1530838702800_3003429_46-149444-0_sp1.1 from training. Duration: 0.8545625 2023-10-04 12:10:28,646 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_257-20733-0_sp0.9 from training. Duration: 0.8444375 2023-10-04 12:10:28,700 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_53753_210_7234_1_1534320003_3594148_385-234809-0_sp1.1 from training. Duration: 0.963625 2023-10-04 12:10:30,603 WARNING [train.py:1204] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_292-215346-0_sp1.1 from training. Duration: 0.8818125 2023-10-04 12:10:35,303 WARNING [train.py:1204] (2/4) Exclude cut with ID _68965_210_1883_1_1535698962272_3497490_196-124268-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 12:10:36,678 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_23801_210_16223_1_1532066392_3294657_570-5439-0_sp1.1 from training. Duration: 0.8818125 2023-10-04 12:10:40,857 WARNING [train.py:1204] (2/4) Exclude cut with ID _5166_210_2084_1_1527073197160_4087993_349-6928-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 12:10:42,623 WARNING [train.py:1204] (2/4) Exclude cut with ID _60621_210_17071_1_1535013053530_3671739_226-255202-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 12:10:42,634 WARNING [train.py:1204] (2/4) Exclude cut with ID _41817_210_18835_1_1533434477160_7423049_448-130302-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 12:10:42,885 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=1649653.3333333333, ans=0.1 2023-10-04 12:10:44,678 WARNING [train.py:1204] (2/4) Exclude cut with ID _6653_210_2629_1_1527919172441_6732679_758-143677-0_sp1.1 from training. Duration: 0.8909375 2023-10-04 12:10:44,726 WARNING [train.py:1204] (2/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_912-47473-0_sp0.9 from training. Duration: 0.7555625 2023-10-04 12:10:44,761 WARNING [train.py:1204] (2/4) Exclude cut with ID _6694_210_4599_1_1527852637490_3965529_356-169233-0_sp1.1 from training. Duration: 0.9 2023-10-04 12:10:46,338 INFO [scaling.py:1118] (2/4) WithLoss: name=encoder.encoders.4.encoder.layers.0.self_attn_weights, loss-sum=0.000e+00 2023-10-04 12:10:47,439 WARNING [train.py:1204] (2/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_1-99680-0 from training. Duration: 0.67 2023-10-04 12:10:48,826 WARNING [train.py:1204] (2/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_199-86432-0_sp1.1 from training. Duration: 0.8909375 2023-10-04 12:10:48,858 WARNING [train.py:1204] (2/4) Exclude cut with ID _61148_210_6897_1_1535094022726_4217610_362-182430-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 12:10:50,216 WARNING [train.py:1204] (2/4) Exclude cut with ID _49376_210_5986_1_1533985278732_3370740_301-154763-0 from training. Duration: 0.98 2023-10-04 12:10:50,391 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.1.feed_forward1.out_proj.dropout_p, batch_count=1649653.3333333333, ans=0.1 2023-10-04 12:10:51,607 WARNING [train.py:1204] (2/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_572-185150-0_sp1.1 from training. Duration: 0.8818125 2023-10-04 12:10:55,897 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_47896_210_20508_1_1534492518_5958868_558-314572-0_sp1.1 from training. Duration: 0.8818125 2023-10-04 12:10:57,083 INFO [train.py:1046] (2/4) Epoch 47, batch 3100, loss[loss=0.1437, simple_loss=0.2254, pruned_loss=0.03102, over 23575.00 frames. ], tot_loss[loss=0.154, simple_loss=0.2348, pruned_loss=0.03657, over 4722793.45 frames. ], batch size: 134, lr: 2.16e-03, grad_scale: 8.0 2023-10-04 12:10:58,500 WARNING [train.py:1204] (2/4) Exclude cut with ID _45560_210_21982_1_1533812603951_3584999_111-6376-0_sp0.9 from training. Duration: 0.9333125 2023-10-04 12:10:59,905 WARNING [train.py:1204] (2/4) Exclude cut with ID _14170_210_5219_1_1530525949868_4469650_412-317077-0_sp1.1 from training. Duration: 0.6818125 2023-10-04 12:11:01,315 WARNING [train.py:1204] (2/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_718-68505-0 from training. Duration: 0.89 2023-10-04 12:11:01,587 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=1649720.0, ans=0.1 2023-10-04 12:11:03,428 WARNING [train.py:1204] (2/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_77-311404-0 from training. Duration: 0.94 2023-10-04 12:11:04,586 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.632e+02 2.016e+02 2.223e+02 2.559e+02 3.925e+02, threshold=4.446e+02, percent-clipped=0.0 2023-10-04 12:11:04,726 WARNING [train.py:1204] (2/4) Exclude cut with ID _60229_210_27767_1_1534838406158_3655350_264-279573-0 from training. Duration: 0.83 2023-10-04 12:11:06,577 WARNING [train.py:1204] (2/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_106-300985-0_sp1.1 from training. Duration: 0.8181875 2023-10-04 12:11:09,486 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_4183_210_2087_1_1526729100_1869838_79-277189-0_sp1.1 from training. Duration: 0.963625 2023-10-04 12:11:09,503 WARNING [train.py:1204] (2/4) Exclude cut with ID _28427_210_4220_1_1532581240967_3061069_195-295268-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 12:11:13,454 WARNING [train.py:1204] (2/4) Exclude cut with ID _4092_210_2151_1_1526643044449_3135759_356-278694-0_sp0.9 from training. Duration: 0.52225 2023-10-04 12:11:17,517 WARNING [train.py:1204] (2/4) Exclude cut with ID _14459_210_2228_1_1531443641946_7516700_777-71446-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 12:11:21,843 WARNING [train.py:1204] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_520-126729-0 from training. Duration: 0.7 2023-10-04 12:11:25,938 WARNING [train.py:1204] (2/4) Exclude cut with ID _13684_210_7497_1_1530619354863_3598359_312-235606-0_sp0.9 from training. Duration: 0.6666875 2023-10-04 12:11:25,989 WARNING [train.py:1204] (2/4) Exclude cut with ID _37537_210_2253_1_1533171758568_3421949_283-349402-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 12:11:27,293 WARNING [train.py:1204] (2/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_29-117932-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 12:11:27,313 WARNING [train.py:1204] (2/4) Exclude cut with ID _29303_210_2575_1_1532392237303_7410649_721-283351-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 12:11:27,378 WARNING [train.py:1204] (2/4) Exclude cut with ID _14886_210_4220_1_1530702020355_3516190_175-229073-0_sp0.9 from training. Duration: 0.5 2023-10-04 12:11:28,786 WARNING [train.py:1204] (2/4) Exclude cut with ID _15458_210_8341_1_1530858614263_3834410_70-268830-0_sp1.1 from training. Duration: 0.97275 2023-10-04 12:11:28,803 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_2295_210_2087_1_1525500993_2407863_234-83104-0 from training. Duration: 0.62 2023-10-04 12:11:28,804 WARNING [train.py:1204] (2/4) Exclude cut with ID _22961_210_8254_1_1531976366672_4278180_317-199546-0_sp1.1 from training. Duration: 0.7818125 2023-10-04 12:11:30,198 WARNING [train.py:1204] (2/4) Exclude cut with ID _27894_210_12392_1_1532415496078_6902439_547-196022-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 12:11:31,560 WARNING [train.py:1204] (2/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_46-207599-0 from training. Duration: 0.64 2023-10-04 12:11:34,195 WARNING [train.py:1204] (2/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_426-326246-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 12:11:38,245 WARNING [train.py:1204] (2/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_445-117398-0_sp0.9 from training. Duration: 0.988875 2023-10-04 12:11:38,316 WARNING [train.py:1204] (2/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_563-347832-0 from training. Duration: 0.89 2023-10-04 12:11:39,640 WARNING [train.py:1204] (2/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_252-216674-0 from training. Duration: 0.54 2023-10-04 12:11:42,895 WARNING [train.py:1204] (2/4) Exclude cut with ID _59611_210_8895_1_1534741198240_3650361_187-212779-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 12:11:42,954 WARNING [train.py:1204] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_282-159367-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 12:11:43,165 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.bypass.skip_rate, batch_count=1649920.0, ans=0.09899494936611666 2023-10-04 12:11:44,437 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_68702_210_23689_1_1535799577_3420068_33-136883-0_sp1.1 from training. Duration: 0.936375 2023-10-04 12:11:44,448 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_47228_210_4983_1_1533780021_3672027_184-293706-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 12:11:44,475 WARNING [train.py:1204] (2/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_656-188361-0_sp0.9 from training. Duration: 0.9666875 2023-10-04 12:11:45,972 WARNING [train.py:1204] (2/4) Exclude cut with ID _4866_210_2577_1_1527404412284_4364659_352-157930-0_sp0.9 from training. Duration: 0.9 2023-10-04 12:11:45,973 WARNING [train.py:1204] (2/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_210-106047-0_sp1.1 from training. Duration: 0.8818125 2023-10-04 12:11:48,682 WARNING [train.py:1204] (2/4) Exclude cut with ID _9389_210_1664_1_1529217337301_5289269_289-213417-0_sp1.1 from training. Duration: 0.8090625 2023-10-04 12:11:50,069 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_3910_210_1794_1_1526777667_3747260_178-41186-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 12:11:50,073 WARNING [train.py:1204] (2/4) Exclude cut with ID _27052_210_5986_1_1532329197187_3585009_24-172263-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 12:11:50,077 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_5522_210_3273_1_1527249606_3909096_318-203153-0_sp1.1 from training. Duration: 0.3909375 2023-10-04 12:11:54,326 WARNING [train.py:1204] (2/4) Exclude cut with ID _27894_210_12392_1_1532415496078_6902439_222-51982-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 12:11:55,708 WARNING [train.py:1204] (2/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_87-131427-0 from training. Duration: 0.78 2023-10-04 12:11:58,420 WARNING [train.py:1204] (2/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_372-298093-0_sp0.9 from training. Duration: 0.888875 2023-10-04 12:11:58,475 WARNING [train.py:1204] (2/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_663-166973-0 from training. Duration: 0.8 2023-10-04 12:11:59,804 WARNING [train.py:1204] (2/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_429-177469-0_sp1.1 from training. Duration: 0.936375 2023-10-04 12:11:59,846 WARNING [train.py:1204] (2/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_724-172651-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 12:11:59,891 WARNING [train.py:1204] (2/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_103-336064-0 from training. Duration: 0.51 2023-10-04 12:12:11,724 INFO [train.py:1046] (2/4) Epoch 47, batch 3150, loss[loss=0.1492, simple_loss=0.2341, pruned_loss=0.03216, over 23427.00 frames. ], tot_loss[loss=0.1531, simple_loss=0.2335, pruned_loss=0.03634, over 4712946.60 frames. ], batch size: 93, lr: 2.16e-03, grad_scale: 4.0 2023-10-04 12:12:11,783 WARNING [train.py:1204] (2/4) Exclude cut with ID _48677_210_5663_1_1533902405919_3892989_111-11439-0 from training. Duration: 0.96 2023-10-04 12:12:13,789 WARNING [train.py:1204] (2/4) Exclude cut with ID _8085_210_4557_1_1528455633493_3716100_95-103398-0_sp1.1 from training. Duration: 0.963625 2023-10-04 12:12:15,145 WARNING [train.py:1204] (2/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_1041-110837-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 12:12:17,147 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_50050_210_16223_1_1535435796_7290138_1155-121757-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 12:12:17,154 WARNING [train.py:1204] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_602-275260-0_sp0.9 from training. Duration: 0.911125 2023-10-04 12:12:18,471 WARNING [train.py:1204] (2/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_265-164486-0 from training. Duration: 0.96 2023-10-04 12:12:19,706 WARNING [train.py:1204] (2/4) Exclude cut with ID _46067_210_4220_1_1534125571066_3307450_142-229375-0_sp1.1 from training. Duration: 0.963625 2023-10-04 12:12:19,732 WARNING [train.py:1204] (2/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_310-26186-0_sp1.1 from training. Duration: 0.636375 2023-10-04 12:12:19,838 WARNING [train.py:1204] (2/4) Exclude cut with ID _28461_210_2253_1_1532771775189_3041299_136-270644-0 from training. Duration: 0.93 2023-10-04 12:12:22,554 WARNING [train.py:1204] (2/4) Exclude cut with ID _14860_210_4915_1_1530702020340_7343240_438-286372-0_sp1.1 from training. Duration: 0.936375 2023-10-04 12:12:24,001 WARNING [train.py:1204] (2/4) Exclude cut with ID IC0128W0004-63309 from training. Duration: 0.831 2023-10-04 12:12:26,793 WARNING [train.py:1204] (2/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_135-174080-0 from training. Duration: 0.53 2023-10-04 12:12:26,831 WARNING [train.py:1204] (2/4) Exclude cut with ID _41326_210_5854_1_1533778076509_3492080_277-59511-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 12:12:28,229 WARNING [train.py:1204] (2/4) Exclude cut with ID IC0144W0112-71379 from training. Duration: 0.992 2023-10-04 12:12:28,301 WARNING [train.py:1204] (2/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_711-15688-0_sp0.9 from training. Duration: 0.47775 2023-10-04 12:12:28,433 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.conv_module2.balancer1.prob, batch_count=1650120.0, ans=0.125 2023-10-04 12:12:29,626 WARNING [train.py:1204] (2/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_237-332027-0 from training. Duration: 0.95 2023-10-04 12:12:30,903 WARNING [train.py:1204] (2/4) Exclude cut with ID _7970_210_1883_1_1528455616296_3472230_25-178407-0 from training. Duration: 0.71 2023-10-04 12:12:30,904 WARNING [train.py:1204] (2/4) Exclude cut with ID _8480_210_1874_1_1528589391205_6728330_309-139896-0 from training. Duration: 0.81 2023-10-04 12:12:30,927 WARNING [train.py:1204] (2/4) Exclude cut with ID _39571_210_13449_1_1533801584143_3651077_319-268877-0_sp1.1 from training. Duration: 0.936375 2023-10-04 12:12:32,233 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_43802_210_2290_1_1533516203_8829504_155-246880-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 12:12:32,329 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_566-122599-0_sp1.1 from training. Duration: 0.936375 2023-10-04 12:12:33,858 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_12868_210_6753_1_1530701932_530275_18-76322-0 from training. Duration: 0.87 2023-10-04 12:12:35,615 WARNING [train.py:1204] (2/4) Exclude cut with ID _56494_210_4220_1_1535284837625_3033169_159-106035-0_sp1.1 from training. Duration: 0.963625 2023-10-04 12:12:35,650 WARNING [train.py:1204] (2/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_98-276013-0_sp1.1 from training. Duration: 0.963625 2023-10-04 12:12:37,443 WARNING [train.py:1204] (2/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_330-216124-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 12:12:38,857 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_177-58005-0_sp0.9 from training. Duration: 0.688875 2023-10-04 12:12:41,725 WARNING [train.py:1204] (2/4) Exclude cut with ID _56235_210_10640_1_1534557335235_4161099_343-139898-0 from training. Duration: 0.96 2023-10-04 12:12:43,512 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6961_210_2087_1_1527937168_6815011_395-185060-0_sp1.1 from training. Duration: 0.863625 2023-10-04 12:12:44,881 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7002_210_1794_1_1527939683_5157286_184-229630-0_sp0.9 from training. Duration: 0.82225 2023-10-04 12:12:46,701 WARNING [train.py:1204] (2/4) Exclude cut with ID _59170_210_6721_1_1534983633960_4203335_153-18895-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 12:12:48,004 WARNING [train.py:1204] (2/4) Exclude cut with ID _49643_210_7989_1_1534640244224_3939031_598-161926-0 from training. Duration: 0.9 2023-10-04 12:12:50,598 WARNING [train.py:1204] (2/4) Exclude cut with ID _4068_210_2629_1_1526697350395_1546259_224-288693-0 from training. Duration: 0.59 2023-10-04 12:12:51,947 WARNING [train.py:1204] (2/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_718-68505-0_sp1.1 from training. Duration: 0.8090625 2023-10-04 12:12:52,004 WARNING [train.py:1204] (2/4) Exclude cut with ID _4068_210_2629_1_1526697350395_1546259_224-288693-0_sp0.9 from training. Duration: 0.6555625 2023-10-04 12:12:52,016 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_2296_210_2087_1_1525503448_3397335_57-185430-0_sp0.9 from training. Duration: 0.6666875 2023-10-04 12:12:53,365 WARNING [train.py:1204] (2/4) Exclude cut with ID _15458_210_8341_1_1530858614263_3834410_49-235472-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 12:12:53,368 WARNING [train.py:1204] (2/4) Exclude cut with ID _64135_210_2253_1_1535677267876_7608972_498-5039-0_sp0.9 from training. Duration: 0.8666875 2023-10-04 12:12:54,709 WARNING [train.py:1204] (2/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_283-220372-0_sp0.9 from training. Duration: 0.62225 2023-10-04 12:12:54,725 WARNING [train.py:1204] (2/4) Exclude cut with ID _6473_210_1741_1_1527901307396_3815380_108-17335-0_sp0.9 from training. Duration: 0.72225 2023-10-04 12:12:56,129 WARNING [train.py:1204] (2/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_113-92608-0 from training. Duration: 0.54 2023-10-04 12:12:56,188 WARNING [train.py:1204] (2/4) Exclude cut with ID _14170_210_5219_1_1530525949868_4469650_272-249985-0_sp1.1 from training. Duration: 0.6090625 2023-10-04 12:12:56,198 WARNING [train.py:1204] (2/4) Exclude cut with ID _50376_210_13401_1_1534031502101_4217740_518-187390-0_sp1.1 from training. Duration: 0.97275 2023-10-04 12:12:56,354 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.ff2_skip_rate, batch_count=1650253.3333333333, ans=0.0 2023-10-04 12:12:57,734 WARNING [train.py:1204] (2/4) Exclude cut with ID _59738_210_15379_1_1534849512562_3941470_165-248260-0_sp1.1 from training. Duration: 0.92725 2023-10-04 12:12:59,106 WARNING [train.py:1204] (2/4) Exclude cut with ID _23818_210_6228_1_1531917063884_3676920_400-343925-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 12:12:59,174 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6961_210_2087_1_1527937168_6815011_562-165518-0 from training. Duration: 0.77 2023-10-04 12:12:59,220 WARNING [train.py:1204] (2/4) Exclude cut with ID _24271_210_12658_1_1531987270022_7190519_289-207931-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 12:13:01,876 WARNING [train.py:1204] (2/4) Exclude cut with ID _14632_210_9988_1_1530691279392_3923200_332-105904-0 from training. Duration: 0.9 2023-10-04 12:13:01,889 WARNING [train.py:1204] (2/4) Exclude cut with ID _40402_210_18219_1_1533522324115_2953150_203-232438-0_sp1.1 from training. Duration: 0.97275 2023-10-04 12:13:03,236 WARNING [train.py:1204] (2/4) Exclude cut with ID _13252_210_1925_1_1530671372047_3336909_263-168388-0 from training. Duration: 0.86 2023-10-04 12:13:03,313 WARNING [train.py:1204] (2/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_174-205058-0 from training. Duration: 0.95 2023-10-04 12:13:04,686 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_94-195801-0_sp1.1 from training. Duration: 0.8545625 2023-10-04 12:13:04,709 WARNING [train.py:1204] (2/4) Exclude cut with ID _55153_210_2253_1_1534384786677_3162096_269-103204-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 12:13:05,002 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.conv_module1.balancer1.prob, batch_count=1650253.3333333333, ans=0.125 2023-10-04 12:13:06,605 WARNING [train.py:1204] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_748-36150-0 from training. Duration: 0.91 2023-10-04 12:13:07,968 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_12868_210_6753_1_1530702479_3610564_148-172861-0_sp1.1 from training. Duration: 0.4545625 2023-10-04 12:13:08,028 WARNING [train.py:1204] (2/4) Exclude cut with ID _67499_210_17282_1_1535509805220_3692430_460-77730-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 12:13:12,629 WARNING [train.py:1204] (2/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_374-20753-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 12:13:14,001 WARNING [train.py:1204] (2/4) Exclude cut with ID _45329_210_7005_1_1534157951211_6774339_374-289983-0_sp1.1 from training. Duration: 0.97275 2023-10-04 12:13:14,039 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_34886_210_10633_1_1533203882_1755661_76-156096-0_sp0.9 from training. Duration: 0.9666875 2023-10-04 12:13:14,326 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.bypass.scale_min, batch_count=1650320.0, ans=0.2 2023-10-04 12:13:18,252 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_238-256668-0_sp0.9 from training. Duration: 0.7666875 2023-10-04 12:13:18,307 WARNING [train.py:1204] (2/4) Exclude cut with ID _46683_210_9642_1_1533730913492_4591116_489-146727-0_sp1.1 from training. Duration: 0.97275 2023-10-04 12:13:21,576 WARNING [train.py:1204] (2/4) Exclude cut with ID _13938_210_9988_1_1530518718978_3693960_304-112784-0_sp0.9 from training. Duration: 0.511125 2023-10-04 12:13:25,664 INFO [train.py:1046] (2/4) Epoch 47, batch 3200, loss[loss=0.1511, simple_loss=0.2183, pruned_loss=0.04191, over 23749.00 frames. ], tot_loss[loss=0.152, simple_loss=0.2319, pruned_loss=0.03599, over 4709674.01 frames. ], batch size: 164, lr: 2.16e-03, grad_scale: 8.0 2023-10-04 12:13:25,793 WARNING [train.py:1204] (2/4) Exclude cut with ID _24271_210_12658_1_1531987270022_7190519_243-46549-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 12:13:25,797 WARNING [train.py:1204] (2/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_78-182912-0_sp1.1 from training. Duration: 0.536375 2023-10-04 12:13:28,672 WARNING [train.py:1204] (2/4) Exclude cut with ID _57572_210_5517_1_1534921219624_3809484_121-321334-0_sp1.1 from training. Duration: 0.97275 2023-10-04 12:13:31,380 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_40047_210_2290_1_1533290519_2374311_254-102149-0_sp1.1 from training. Duration: 0.963625 2023-10-04 12:13:31,381 WARNING [train.py:1204] (2/4) Exclude cut with ID _28461_210_2253_1_1532771775189_3041299_96-152158-0 from training. Duration: 0.85 2023-10-04 12:13:31,630 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.conv_skip_rate, batch_count=1650386.6666666667, ans=0.0 2023-10-04 12:13:32,909 WARNING [train.py:1204] (2/4) Exclude cut with ID _29877_210_6228_1_1532397791106_5276149_161-102979-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 12:13:34,113 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.740e+02 2.055e+02 2.237e+02 2.558e+02 4.162e+02, threshold=4.474e+02, percent-clipped=0.0 2023-10-04 12:13:38,890 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_3787_210_2040_1_1526477544_5478586_603-120324-0_sp0.9 from training. Duration: 0.9 2023-10-04 12:13:41,727 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_41750_210_1960_1_1533520678_3948042_247-213456-0_sp1.1 from training. Duration: 0.97275 2023-10-04 12:13:49,389 WARNING [train.py:1204] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_127-119109-0_sp0.9 from training. Duration: 0.8 2023-10-04 12:13:49,576 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.conv_module1.balancer1.prob, batch_count=1650453.3333333333, ans=0.125 2023-10-04 12:13:54,391 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.nonlin_attention.balancer.prob, batch_count=1650520.0, ans=0.125 2023-10-04 12:13:58,363 WARNING [train.py:1204] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_215-202973-0 from training. Duration: 0.88 2023-10-04 12:13:58,438 WARNING [train.py:1204] (2/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_17-320491-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 12:14:01,200 WARNING [train.py:1204] (2/4) Exclude cut with ID _5166_210_2084_1_1527073197160_4087993_310-114452-0 from training. Duration: 0.59 2023-10-04 12:14:03,078 WARNING [train.py:1204] (2/4) Exclude cut with ID _15133_210_6228_1_1530794512074_3080379_29-264458-0_sp0.9 from training. Duration: 0.6444375 2023-10-04 12:14:05,828 WARNING [train.py:1204] (2/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_159-281310-0_sp1.1 from training. Duration: 0.863625 2023-10-04 12:14:05,848 WARNING [train.py:1204] (2/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_195-167011-0_sp0.9 from training. Duration: 0.8333125 2023-10-04 12:14:07,727 WARNING [train.py:1204] (2/4) Exclude cut with ID _14087_210_2253_1_1530529224512_3014029_156-257607-0_sp1.1 from training. Duration: 0.7545625 2023-10-04 12:14:12,253 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_2295_210_2087_1_1525500993_2407863_116-238894-0 from training. Duration: 0.7 2023-10-04 12:14:13,641 WARNING [train.py:1204] (2/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_321-335475-0_sp0.9 from training. Duration: 0.5 2023-10-04 12:14:15,080 WARNING [train.py:1204] (2/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_188-114464-0 from training. Duration: 0.82 2023-10-04 12:14:17,864 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_2592_210_2290_1_1525697025_4882925_157-70512-0 from training. Duration: 0.91 2023-10-04 12:14:18,447 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.3.encoder.layers.3.nonlin_attention.whiten1, num_groups=1, num_channels=384, metric=5.29 vs. limit=10.0 2023-10-04 12:14:20,625 WARNING [train.py:1204] (2/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_138-64210-0_sp1.1 from training. Duration: 0.663625 2023-10-04 12:14:26,789 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_11305_210_1794_1_1529750428_7178148_651-95702-0_sp1.1 from training. Duration: 0.963625 2023-10-04 12:14:26,812 WARNING [train.py:1204] (2/4) Exclude cut with ID _14224_210_4381_1_1530529062037_4264010_379-184223-0_sp0.9 from training. Duration: 0.9444375 2023-10-04 12:14:28,427 WARNING [train.py:1204] (2/4) Exclude cut with ID _8394_210_6702_1_1528590504316_3540189_381-125325-0_sp1.1 from training. Duration: 0.963625 2023-10-04 12:14:28,487 WARNING [train.py:1204] (2/4) Exclude cut with ID IC0622W0021-307659 from training. Duration: 0.896 2023-10-04 12:14:28,489 WARNING [train.py:1204] (2/4) Exclude cut with ID _7624_210_1531_1_1528109802198_4933101_418-349834-0_sp1.1 from training. Duration: 0.5454375 2023-10-04 12:14:31,390 WARNING [train.py:1204] (2/4) Exclude cut with ID _49728_210_12651_1_1534041034294_6788150_607-3416-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 12:14:32,756 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8275_210_4198_1_1528455320_3892571_491-17707-0 from training. Duration: 0.64 2023-10-04 12:14:34,084 WARNING [train.py:1204] (2/4) Exclude cut with ID _5287_210_2084_1_1527418883040_3878670_333-48508-0 from training. Duration: 0.6 2023-10-04 12:14:35,753 WARNING [train.py:1204] (2/4) Exclude cut with ID _8365_210_2064_1_1528592311070_4677690_371-122295-0 from training. Duration: 0.55 2023-10-04 12:14:37,049 WARNING [train.py:1204] (2/4) Exclude cut with ID _14860_210_4915_1_1530702020340_7343240_836-328489-0 from training. Duration: 0.9 2023-10-04 12:14:39,053 WARNING [train.py:1204] (2/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_1033-274376-0_sp0.9 from training. Duration: 0.9666875 2023-10-04 12:14:40,328 INFO [train.py:1046] (2/4) Epoch 47, batch 3250, loss[loss=0.1519, simple_loss=0.2347, pruned_loss=0.0345, over 24580.00 frames. ], tot_loss[loss=0.1525, simple_loss=0.2326, pruned_loss=0.03623, over 4712281.80 frames. ], batch size: 60, lr: 2.16e-03, grad_scale: 8.0 2023-10-04 12:14:40,470 WARNING [train.py:1204] (2/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_475-127455-0_sp0.9 from training. Duration: 0.77775 2023-10-04 12:14:40,476 WARNING [train.py:1204] (2/4) Exclude cut with ID IC0671W0071-331914 from training. Duration: 0.896 2023-10-04 12:14:40,635 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.bypass_mid.scale_min, batch_count=1650720.0, ans=0.2 2023-10-04 12:14:41,778 WARNING [train.py:1204] (2/4) Exclude cut with ID _30327_210_16049_1_1532484161295_3924169_2-1523-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 12:14:41,780 WARNING [train.py:1204] (2/4) Exclude cut with ID _24314_210_4915_1_1532135078116_7170179_460-208279-0_sp1.1 from training. Duration: 0.97275 2023-10-04 12:14:43,668 WARNING [train.py:1204] (2/4) Exclude cut with ID IC0679W0052-335859 from training. Duration: 0.64 2023-10-04 12:14:46,428 WARNING [train.py:1204] (2/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_3-237434-0_sp1.1 from training. Duration: 0.6909375 2023-10-04 12:14:47,954 WARNING [train.py:1204] (2/4) Exclude cut with ID _69884_210_9771_1_1535846392818_4166169_405-185283-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 12:14:55,431 WARNING [train.py:1204] (2/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_927-83684-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 12:14:55,446 WARNING [train.py:1204] (2/4) Exclude cut with ID _6694_210_4599_1_1527852637490_3965529_356-169233-0 from training. Duration: 0.99 2023-10-04 12:14:56,810 WARNING [train.py:1204] (2/4) Exclude cut with ID _19896_210_1883_1_1531709022662_6649569_562-233001-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 12:14:56,850 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_20120_210_6753_1_1531643035_5482859_354-78629-0_sp1.1 from training. Duration: 0.963625 2023-10-04 12:14:56,851 WARNING [train.py:1204] (2/4) Exclude cut with ID _60135_210_13427_1_1534935557922_3651319_96-168198-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 12:14:57,063 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.conv_skip_rate, batch_count=1650786.6666666667, ans=0.0 2023-10-04 12:14:58,250 WARNING [train.py:1204] (2/4) Exclude cut with ID _31602_210_5854_1_1532677021456_4458250_249-96604-0_sp1.1 from training. Duration: 0.8090625 2023-10-04 12:14:58,282 WARNING [train.py:1204] (2/4) Exclude cut with ID _14100_210_8341_1_1530529320613_3678069_258-224599-0_sp0.9 from training. Duration: 0.8555625 2023-10-04 12:14:59,816 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.nonlin_attention.balancer.prob, batch_count=1650786.6666666667, ans=0.125 2023-10-04 12:15:01,007 WARNING [train.py:1204] (2/4) Exclude cut with ID _19708_210_9206_1_1532136650733_4238364_215-160135-0_sp1.1 from training. Duration: 0.97275 2023-10-04 12:15:02,990 WARNING [train.py:1204] (2/4) Exclude cut with ID _21878_210_3525_1_1531738772002_7264330_113-18163-0_sp0.9 from training. Duration: 0.988875 2023-10-04 12:15:03,034 WARNING [train.py:1204] (2/4) Exclude cut with ID _64135_210_2253_1_1535677267876_7608972_552-337340-0_sp1.1 from training. Duration: 0.92725 2023-10-04 12:15:03,071 WARNING [train.py:1204] (2/4) Exclude cut with ID _67725_210_27767_1_1535608756671_5105359_10-12887-0_sp1.1 from training. Duration: 0.97275 2023-10-04 12:15:03,076 WARNING [train.py:1204] (2/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_401-284840-0_sp1.1 from training. Duration: 0.97275 2023-10-04 12:15:04,117 WARNING [train.py:1204] (2/4) Exclude cut with ID _44659_210_2721_1_1534036120061_6614860_483-141285-0_sp1.1 from training. Duration: 0.936375 2023-10-04 12:15:05,541 WARNING [train.py:1204] (2/4) Exclude cut with ID _9461_210_4557_1_1529060514726_3677451_358-332734-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 12:15:06,919 WARNING [train.py:1204] (2/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_788-298907-0_sp1.1 from training. Duration: 0.8090625 2023-10-04 12:15:10,100 WARNING [train.py:1204] (2/4) Exclude cut with ID _39411_210_5986_1_1533538825030_3343649_354-267196-0_sp1.1 from training. Duration: 0.92725 2023-10-04 12:15:10,120 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_14953_210_2151_1_1530701710_5616165_192-309792-0_sp1.1 from training. Duration: 0.97275 2023-10-04 12:15:12,060 WARNING [train.py:1204] (2/4) Exclude cut with ID _59170_210_6721_1_1534983633960_4203335_290-151255-0_sp1.1 from training. Duration: 0.92725 2023-10-04 12:15:12,093 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_5575_210_1410_1_1527301305_2570170_249-93769-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 12:15:12,103 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_46013_210_7234_1_1533776362_3744032_463-52378-0_sp1.1 from training. Duration: 0.7818125 2023-10-04 12:15:17,757 WARNING [train.py:1204] (2/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_580-93798-0 from training. Duration: 0.56 2023-10-04 12:15:17,812 WARNING [train.py:1204] (2/4) Exclude cut with ID _19514_210_9259_1_1531530071330_5084230_385-13762-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 12:15:17,817 WARNING [train.py:1204] (2/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_458-67321-0_sp1.1 from training. Duration: 0.8818125 2023-10-04 12:15:19,239 WARNING [train.py:1204] (2/4) Exclude cut with ID _41298_210_9019_1_1533430371450_4822143_188-154791-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 12:15:20,621 WARNING [train.py:1204] (2/4) Exclude cut with ID _15037_210_2253_1_1530752294104_3267650_296-192597-0_sp0.9 from training. Duration: 0.9 2023-10-04 12:15:28,054 WARNING [train.py:1204] (2/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_661-264857-0_sp1.1 from training. Duration: 0.8454375 2023-10-04 12:15:34,925 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_34008_210_10431_1_1533345424_8183247_662-317331-0_sp1.1 from training. Duration: 0.8545625 2023-10-04 12:15:34,954 WARNING [train.py:1204] (2/4) Exclude cut with ID _46502_210_13794_1_1533724452198_3657159_241-278329-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 12:15:34,954 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_11349_210_6753_1_1529800861_1101207_26-65602-0 from training. Duration: 0.96 2023-10-04 12:15:34,958 WARNING [train.py:1204] (2/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_448-4048-0_sp1.1 from training. Duration: 0.87275 2023-10-04 12:15:34,970 WARNING [train.py:1204] (2/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_662-275621-0_sp1.1 from training. Duration: 0.3818125 2023-10-04 12:15:36,891 WARNING [train.py:1204] (2/4) Exclude cut with ID _13938_210_9988_1_1530518718978_3693960_256-54454-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 12:15:38,321 WARNING [train.py:1204] (2/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_256-32662-0 from training. Duration: 0.9 2023-10-04 12:15:38,366 WARNING [train.py:1204] (2/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_326-17061-0 from training. Duration: 0.94 2023-10-04 12:15:39,658 WARNING [train.py:1204] (2/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_505-97987-0_sp1.1 from training. Duration: 0.7818125 2023-10-04 12:15:40,982 WARNING [train.py:1204] (2/4) Exclude cut with ID _66873_210_5517_1_1535454136506_4293370_316-294364-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 12:15:41,043 WARNING [train.py:1204] (2/4) Exclude cut with ID _19514_210_9259_1_1531530071330_5084230_762-231063-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 12:15:42,973 WARNING [train.py:1204] (2/4) Exclude cut with ID _4697_210_2069_1_1526815801953_3823098_68-219807-0_sp0.9 from training. Duration: 0.52225 2023-10-04 12:15:43,022 WARNING [train.py:1204] (2/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_479-158053-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 12:15:44,986 WARNING [train.py:1204] (2/4) Exclude cut with ID _66112_210_10635_1_1535590236305_3801484_83-38181-0_sp1.1 from training. Duration: 0.936375 2023-10-04 12:15:46,316 WARNING [train.py:1204] (2/4) Exclude cut with ID _55857_210_2346_1_1534644007831_6898209_255-146346-0_sp1.1 from training. Duration: 0.963625 2023-10-04 12:15:47,730 WARNING [train.py:1204] (2/4) Exclude cut with ID _48836_210_12210_1_1534075848293_3904150_429-35475-0 from training. Duration: 0.89 2023-10-04 12:15:47,748 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_50154_210_13731_1_1534750862_1080961_119-187163-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 12:15:49,189 WARNING [train.py:1204] (2/4) Exclude cut with ID _8793_210_6310_1_1528542985139_2310009_121-179782-0_sp0.9 from training. Duration: 0.888875 2023-10-04 12:15:50,517 WARNING [train.py:1204] (2/4) Exclude cut with ID _14884_210_2252_1_1530707459689_5561250_591-173406-0 from training. Duration: 0.96 2023-10-04 12:15:54,445 INFO [train.py:1046] (2/4) Epoch 47, batch 3300, loss[loss=0.1478, simple_loss=0.2281, pruned_loss=0.03376, over 23223.00 frames. ], tot_loss[loss=0.1533, simple_loss=0.2336, pruned_loss=0.0365, over 4711620.84 frames. ], batch size: 105, lr: 2.16e-03, grad_scale: 8.0 2023-10-04 12:15:54,516 WARNING [train.py:1204] (2/4) Exclude cut with ID _15133_210_6228_1_1530794512074_3080379_54-198193-0_sp1.1 from training. Duration: 0.8545625 2023-10-04 12:15:54,528 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6545_210_2040_1_1527774670_4245258_407-48070-0 from training. Duration: 0.76 2023-10-04 12:15:55,949 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1032-236863-0 from training. Duration: 0.84 2023-10-04 12:15:57,416 WARNING [train.py:1204] (2/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_507-303370-0 from training. Duration: 0.96 2023-10-04 12:15:57,436 WARNING [train.py:1204] (2/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_164-282366-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 12:16:01,005 WARNING [train.py:1204] (2/4) Exclude cut with ID _39867_210_9826_1_1533553222030_9344259_312-158727-0_sp1.1 from training. Duration: 0.963625 2023-10-04 12:16:02,403 WARNING [train.py:1204] (2/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_174-205058-0_sp1.1 from training. Duration: 0.863625 2023-10-04 12:16:02,430 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_11305_210_1794_1_1529750428_7178148_166-237358-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 12:16:03,620 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.695e+02 2.067e+02 2.305e+02 2.787e+02 4.644e+02, threshold=4.609e+02, percent-clipped=2.0 2023-10-04 12:16:05,642 WARNING [train.py:1204] (2/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_26-175553-0_sp1.1 from training. Duration: 0.5545625 2023-10-04 12:16:05,667 WARNING [train.py:1204] (2/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_96-290039-0_sp1.1 from training. Duration: 0.5090625 2023-10-04 12:16:06,007 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.bypass_mid.scale_min, batch_count=1651053.3333333333, ans=0.2 2023-10-04 12:16:07,250 WARNING [train.py:1204] (2/4) Exclude cut with ID _53219_210_4525_1_1534834836008_4064825_261-101691-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 12:16:08,589 WARNING [train.py:1204] (2/4) Exclude cut with ID _24271_210_12658_1_1531987270022_7190519_96-293390-0_sp1.1 from training. Duration: 0.936375 2023-10-04 12:16:12,642 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_271-234962-0 from training. Duration: 0.79 2023-10-04 12:16:14,430 WARNING [train.py:1204] (2/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_136-30660-0_sp1.1 from training. Duration: 0.97275 2023-10-04 12:16:14,451 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_20120_210_6753_1_1531643035_5482859_625-281046-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 12:16:16,271 WARNING [train.py:1204] (2/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_333-168193-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 12:16:17,561 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9029W0269-509664 from training. Duration: 0.896 2023-10-04 12:16:17,663 WARNING [train.py:1204] (2/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_450-147393-0_sp1.1 from training. Duration: 0.92725 2023-10-04 12:16:19,009 WARNING [train.py:1204] (2/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_643-52007-0_sp1.1 from training. Duration: 0.5181875 2023-10-04 12:16:19,101 WARNING [train.py:1204] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_330-327861-0_sp0.9 from training. Duration: 0.7444375 2023-10-04 12:16:19,110 WARNING [train.py:1204] (2/4) Exclude cut with ID _47790_210_16187_1_1533862814066_4207519_260-317651-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 12:16:19,613 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.5.encoder.layers.0.nonlin_attention.whiten1, num_groups=1, num_channels=192, metric=4.35 vs. limit=10.0 2023-10-04 12:16:20,476 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9044W0010-516654 from training. Duration: 0.896 2023-10-04 12:16:24,521 WARNING [train.py:1204] (2/4) Exclude cut with ID _14087_210_2253_1_1530529224512_3014029_265-118378-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 12:16:24,535 WARNING [train.py:1204] (2/4) Exclude cut with ID _6194_210_1534_1_1527511826160_3941194_321-148641-0_sp0.9 from training. Duration: 0.711125 2023-10-04 12:16:27,220 WARNING [train.py:1204] (2/4) Exclude cut with ID _54871_210_22356_1_1534665798253_3856359_101-169176-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 12:16:27,222 WARNING [train.py:1204] (2/4) Exclude cut with ID 497-129325-0061-62254-0_sp1.1 from training. Duration: 0.97725 2023-10-04 12:16:27,474 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.nonlin_attention.balancer.prob, batch_count=1651186.6666666667, ans=0.125 2023-10-04 12:16:28,574 WARNING [train.py:1204] (2/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_795-33252-0_sp1.1 from training. Duration: 0.336375 2023-10-04 12:16:28,596 WARNING [train.py:1204] (2/4) Exclude cut with ID _28317_210_12742_1_1532480302381_8510325_168-246176-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 12:16:29,970 WARNING [train.py:1204] (2/4) Exclude cut with ID _7160_210_1876_1_1527940923231_3703799_120-150943-0_sp1.1 from training. Duration: 0.736375 2023-10-04 12:16:31,408 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9084W0005-536844 from training. Duration: 0.768 2023-10-04 12:16:33,450 WARNING [train.py:1204] (2/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_234-260682-0 from training. Duration: 0.84 2023-10-04 12:16:33,497 WARNING [train.py:1204] (2/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_188-114464-0_sp0.9 from training. Duration: 0.911125 2023-10-04 12:16:33,659 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.conv_module1.balancer1.prob, batch_count=1651186.6666666667, ans=0.125 2023-10-04 12:16:36,759 WARNING [train.py:1204] (2/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_81-75849-0 from training. Duration: 0.39 2023-10-04 12:16:38,177 WARNING [train.py:1204] (2/4) Exclude cut with ID _14224_210_4381_1_1530529062037_4264010_379-184223-0_sp1.1 from training. Duration: 0.77275 2023-10-04 12:16:40,784 WARNING [train.py:1204] (2/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_454-123002-0_sp1.1 from training. Duration: 0.62725 2023-10-04 12:16:42,029 WARNING [train.py:1204] (2/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_887-249555-0_sp1.1 from training. Duration: 0.7 2023-10-04 12:16:44,744 WARNING [train.py:1204] (2/4) Exclude cut with ID _31088_210_16446_1_1532520012284_3440919_20-260902-0_sp1.1 from training. Duration: 0.97275 2023-10-04 12:16:44,788 WARNING [train.py:1204] (2/4) Exclude cut with ID _45329_210_7005_1_1534157951211_6774339_856-344574-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 12:16:44,790 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_36756_210_7234_1_1533026991_966276_4-164118-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 12:16:44,813 WARNING [train.py:1204] (2/4) Exclude cut with ID _8365_210_2064_1_1528592311070_4677690_371-122295-0_sp1.1 from training. Duration: 0.5 2023-10-04 12:16:46,264 WARNING [train.py:1204] (2/4) Exclude cut with ID _5910_210_1389_1_1527337972454_5050100_555-159815-0_sp1.1 from training. Duration: 0.8909375 2023-10-04 12:16:46,274 WARNING [train.py:1204] (2/4) Exclude cut with ID _37086_210_9038_1_1532999540891_7021250_469-147941-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 12:16:48,187 WARNING [train.py:1204] (2/4) Exclude cut with ID _3067_210_2667_1_1526212974640_4503168_459-173867-0_sp0.9 from training. Duration: 0.97775 2023-10-04 12:16:49,589 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9156W0006-572499 from training. Duration: 0.896 2023-10-04 12:16:49,678 WARNING [train.py:1204] (2/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_277-210798-0 from training. Duration: 0.55 2023-10-04 12:16:51,153 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.conv_skip_rate, batch_count=1651253.3333333333, ans=0.0 2023-10-04 12:16:53,775 WARNING [train.py:1204] (2/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_103-336064-0_sp1.1 from training. Duration: 0.463625 2023-10-04 12:16:55,052 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_3414_210_1988_1_1526212718_4813195_331-290945-0_sp1.1 from training. Duration: 0.936375 2023-10-04 12:16:55,053 WARNING [train.py:1204] (2/4) Exclude cut with ID _67499_210_17282_1_1535509805220_3692430_365-29276-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 12:16:56,554 WARNING [train.py:1204] (2/4) Exclude cut with ID _14821_210_8750_1_1530680503991_6829359_69-250847-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 12:16:56,556 WARNING [train.py:1204] (2/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_1102-61301-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 12:16:57,942 WARNING [train.py:1204] (2/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_166-246557-0_sp1.1 from training. Duration: 0.5818125 2023-10-04 12:16:57,992 WARNING [train.py:1204] (2/4) Exclude cut with ID _15351_210_10626_1_1530856635653_3576210_214-129918-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 12:16:58,004 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_120-41668-0_sp0.9 from training. Duration: 0.77775 2023-10-04 12:16:58,081 WARNING [train.py:1204] (2/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_27-72278-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 12:16:59,471 WARNING [train.py:1204] (2/4) Exclude cut with ID _24314_210_4915_1_1532135078116_7170179_521-258868-0_sp1.1 from training. Duration: 0.7181875 2023-10-04 12:17:00,983 WARNING [train.py:1204] (2/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_143-86344-0 from training. Duration: 0.77 2023-10-04 12:17:01,009 WARNING [train.py:1204] (2/4) Exclude cut with ID _53489_210_6228_1_1534298357169_3835970_465-157433-0_sp1.1 from training. Duration: 0.92725 2023-10-04 12:17:03,040 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_248-319353-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 12:17:06,086 WARNING [train.py:1204] (2/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_102-77064-0_sp1.1 from training. Duration: 0.8454375 2023-10-04 12:17:06,110 WARNING [train.py:1204] (2/4) Exclude cut with ID _9389_210_1664_1_1529217337301_5289269_379-128794-0_sp1.1 from training. Duration: 0.77275 2023-10-04 12:17:07,524 WARNING [train.py:1204] (2/4) Exclude cut with ID _3231_210_2106_1_1526205864934_6784220_236-260835-0_sp1.1 from training. Duration: 0.97275 2023-10-04 12:17:08,858 INFO [train.py:1046] (2/4) Epoch 47, batch 3350, loss[loss=0.1559, simple_loss=0.2318, pruned_loss=0.03995, over 23772.00 frames. ], tot_loss[loss=0.154, simple_loss=0.2345, pruned_loss=0.03676, over 4716971.63 frames. ], batch size: 179, lr: 2.16e-03, grad_scale: 8.0 2023-10-04 12:17:10,326 WARNING [train.py:1204] (2/4) Exclude cut with ID _24271_210_12658_1_1531987270022_7190519_184-342363-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 12:17:10,327 WARNING [train.py:1204] (2/4) Exclude cut with ID _27894_210_12392_1_1532415496078_6902439_67-221663-0_sp1.1 from training. Duration: 0.92725 2023-10-04 12:17:13,067 WARNING [train.py:1204] (2/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_304-59227-0_sp1.1 from training. Duration: 0.7 2023-10-04 12:17:15,811 WARNING [train.py:1204] (2/4) Exclude cut with ID _58605_210_12210_1_1534723195939_3904259_458-42232-0_sp1.1 from training. Duration: 0.92725 2023-10-04 12:17:15,898 WARNING [train.py:1204] (2/4) Exclude cut with ID _14922_210_6228_1_1530707435273_3693310_13-165512-0_sp0.9 from training. Duration: 0.911125 2023-10-04 12:17:19,160 WARNING [train.py:1204] (2/4) Exclude cut with ID _58602_210_2346_1_1534903210085_6908830_792-102241-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 12:17:20,554 WARNING [train.py:1204] (2/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_588-125165-0_sp0.9 from training. Duration: 0.92225 2023-10-04 12:17:23,263 WARNING [train.py:1204] (2/4) Exclude cut with ID _54218_210_12210_1_1534413646388_4409259_239-98429-0_sp1.1 from training. Duration: 0.97275 2023-10-04 12:17:23,335 WARNING [train.py:1204] (2/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_538-32291-0_sp1.1 from training. Duration: 0.7545625 2023-10-04 12:17:24,674 WARNING [train.py:1204] (2/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_419-317062-0 from training. Duration: 0.91 2023-10-04 12:17:26,030 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9309W0202-640329 from training. Duration: 0.896 2023-10-04 12:17:26,064 WARNING [train.py:1204] (2/4) Exclude cut with ID _3231_210_2106_1_1526205864934_6784220_419-313291-0_sp1.1 from training. Duration: 0.97275 2023-10-04 12:17:26,202 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.0.conv_module2.balancer2.prob, batch_count=1651453.3333333333, ans=0.125 2023-10-04 12:17:26,322 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=1651453.3333333333, ans=0.1 2023-10-04 12:17:28,889 WARNING [train.py:1204] (2/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_619-344499-0 from training. Duration: 0.63 2023-10-04 12:17:28,902 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_278-272612-0 from training. Duration: 0.73 2023-10-04 12:17:29,103 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.balancer1.prob, batch_count=1651453.3333333333, ans=0.125 2023-10-04 12:17:30,259 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_103-84010-0_sp0.9 from training. Duration: 0.7666875 2023-10-04 12:17:30,278 WARNING [train.py:1204] (2/4) Exclude cut with ID _26528_210_9070_1_1532570410842_3851310_497-97218-0_sp1.1 from training. Duration: 0.7818125 2023-10-04 12:17:31,630 WARNING [train.py:1204] (2/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_262-84613-0_sp1.1 from training. Duration: 0.936375 2023-10-04 12:17:31,672 WARNING [train.py:1204] (2/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_387-131538-0 from training. Duration: 0.81 2023-10-04 12:17:31,725 WARNING [train.py:1204] (2/4) Exclude cut with ID _8030_210_7497_1_1528444886781_3531089_224-300489-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 12:17:32,998 WARNING [train.py:1204] (2/4) Exclude cut with ID _14349_210_8341_1_1530615710818_3442470_170-336541-0_sp0.9 from training. Duration: 0.9555625 2023-10-04 12:17:34,751 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_47069_210_15368_1_1533781267_4431320_180-31746-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 12:17:34,863 WARNING [train.py:1204] (2/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_447-7041-0_sp1.1 from training. Duration: 0.92725 2023-10-04 12:17:34,905 WARNING [train.py:1204] (2/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_2-55283-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 12:17:36,677 WARNING [train.py:1204] (2/4) Exclude cut with ID _12555_210_8750_1_1530075594402_3780239_70-181911-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 12:17:40,702 WARNING [train.py:1204] (2/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_311-328022-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 12:17:42,090 WARNING [train.py:1204] (2/4) Exclude cut with ID _14528_210_8254_1_1530669700514_4306939_330-2987-0_sp1.1 from training. Duration: 0.92725 2023-10-04 12:17:42,137 WARNING [train.py:1204] (2/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_385-184858-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 12:17:46,292 WARNING [train.py:1204] (2/4) Exclude cut with ID _64414_210_17301_1_1535243252048_3757759_348-333360-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 12:17:48,206 WARNING [train.py:1204] (2/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_753-214751-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 12:17:50,261 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_17613_210_2151_1_1531137569_2366160_202-208013-0_sp1.1 from training. Duration: 0.92725 2023-10-04 12:17:50,269 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_43831_210_2158_1_1533725502_5872791_610-129300-0_sp1.1 from training. Duration: 0.936375 2023-10-04 12:17:51,966 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.conv_skip_rate, batch_count=1651586.6666666667, ans=0.0 2023-10-04 12:17:51,999 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.balancer_ff2.min_abs, batch_count=1651586.6666666667, ans=0.1 2023-10-04 12:17:53,069 WARNING [train.py:1204] (2/4) Exclude cut with ID _54218_210_12210_1_1534413646388_4409259_321-23852-0_sp1.1 from training. Duration: 0.936375 2023-10-04 12:17:54,500 WARNING [train.py:1204] (2/4) Exclude cut with ID _39411_210_5986_1_1533538825030_3343649_173-238248-0 from training. Duration: 0.83 2023-10-04 12:17:54,507 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_405-339986-0_sp0.9 from training. Duration: 0.5666875 2023-10-04 12:17:54,532 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_2009_210_2071_1_1525481134_4588937_385-302100-0 from training. Duration: 0.87 2023-10-04 12:17:54,578 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_161-276996-0_sp1.1 from training. Duration: 0.863625 2023-10-04 12:17:57,193 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_177-58005-0 from training. Duration: 0.62 2023-10-04 12:17:57,271 WARNING [train.py:1204] (2/4) Exclude cut with ID _64639_210_5816_1_1535367619437_3528000_146-343734-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 12:17:58,693 WARNING [train.py:1204] (2/4) Exclude cut with ID _3845_210_1390_1_1526731326141_3523610_72-61441-0_sp1.1 from training. Duration: 0.92725 2023-10-04 12:17:58,843 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.1.ff2_skip_rate, batch_count=1651586.6666666667, ans=0.0 2023-10-04 12:18:03,105 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.ff3_skip_rate, batch_count=1651586.6666666667, ans=0.0 2023-10-04 12:18:07,342 WARNING [train.py:1204] (2/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_991-125028-0_sp1.1 from training. Duration: 0.936375 2023-10-04 12:18:07,401 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7544_210_6753_1_1528591327_1471426_172-348777-0 from training. Duration: 0.66 2023-10-04 12:18:08,765 WARNING [train.py:1204] (2/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_416-257730-0_sp1.1 from training. Duration: 0.5909375 2023-10-04 12:18:10,245 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1113-7369-0_sp1.1 from training. Duration: 0.77275 2023-10-04 12:18:12,231 WARNING [train.py:1204] (2/4) Exclude cut with ID _40402_210_18219_1_1533522324115_2953150_242-76943-0_sp1.1 from training. Duration: 0.8545625 2023-10-04 12:18:16,450 WARNING [train.py:1204] (2/4) Exclude cut with ID _44718_210_5816_1_1534050051869_6469030_190-49648-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 12:18:18,416 WARNING [train.py:1204] (2/4) Exclude cut with ID _47251_210_18628_1_1533814063128_4137250_214-88076-0 from training. Duration: 0.88 2023-10-04 12:18:18,465 WARNING [train.py:1204] (2/4) Exclude cut with ID _5585_210_2667_1_1527418813324_8007340_89-332072-0_sp1.1 from training. Duration: 0.5818125 2023-10-04 12:18:18,490 WARNING [train.py:1204] (2/4) Exclude cut with ID _41817_210_18835_1_1533434477160_7423049_555-293448-0_sp1.1 from training. Duration: 0.72725 2023-10-04 12:18:19,995 WARNING [train.py:1204] (2/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_286-331314-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 12:18:20,053 WARNING [train.py:1204] (2/4) Exclude cut with ID _13921_210_9994_1_1530838702800_3003429_46-149444-0 from training. Duration: 0.94 2023-10-04 12:18:21,720 WARNING [train.py:1204] (2/4) Exclude cut with ID _44778_210_2054_1_1533798127203_7443090_768-43595-0_sp1.1 from training. Duration: 0.936375 2023-10-04 12:18:22,926 INFO [train.py:1046] (2/4) Epoch 47, batch 3400, loss[loss=0.2078, simple_loss=0.283, pruned_loss=0.06631, over 19363.00 frames. ], tot_loss[loss=0.1539, simple_loss=0.235, pruned_loss=0.03639, over 4723148.14 frames. ], batch size: 388, lr: 2.16e-03, grad_scale: 8.0 2023-10-04 12:18:22,968 WARNING [train.py:1204] (2/4) Exclude cut with ID _35355_210_16040_1_1533212988977_4016229_74-190076-0 from training. Duration: 0.8 2023-10-04 12:18:24,289 WARNING [train.py:1204] (2/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_1007-316380-0_sp1.1 from training. Duration: 0.963625 2023-10-04 12:18:24,345 WARNING [train.py:1204] (2/4) Exclude cut with ID _23557_210_8750_1_1531890106370_6846255_150-283437-0_sp1.1 from training. Duration: 0.963625 2023-10-04 12:18:25,541 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_3474_210_2028_1_1526188513_4825246_145-309131-0_sp1.1 from training. Duration: 0.67275 2023-10-04 12:18:26,945 WARNING [train.py:1204] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_391-22148-0_sp1.1 from training. Duration: 0.7 2023-10-04 12:18:26,965 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_238-256668-0 from training. Duration: 0.69 2023-10-04 12:18:27,155 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.balancer2.prob, batch_count=1651720.0, ans=0.125 2023-10-04 12:18:31,053 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.777e+02 2.016e+02 2.273e+02 2.707e+02 4.180e+02, threshold=4.545e+02, percent-clipped=0.0 2023-10-04 12:18:31,162 WARNING [train.py:1204] (2/4) Exclude cut with ID _8394_210_6702_1_1528590504316_3540189_372-198878-0 from training. Duration: 0.6 2023-10-04 12:18:31,172 WARNING [train.py:1204] (2/4) Exclude cut with ID ID0174W0006-766659 from training. Duration: 0.953 2023-10-04 12:18:31,190 WARNING [train.py:1204] (2/4) Exclude cut with ID _6274_210_2084_1_1527937583837_7494020_668-23019-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 12:18:31,445 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.nonlin_attention.balancer.prob, batch_count=1651720.0, ans=0.125 2023-10-04 12:18:37,064 WARNING [train.py:1204] (2/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_322-74328-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 12:18:37,072 WARNING [train.py:1204] (2/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_872-264525-0_sp1.1 from training. Duration: 0.5909375 2023-10-04 12:18:37,133 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_53378_210_17282_1_1534221071_5769509_641-286201-0_sp1.1 from training. Duration: 0.97275 2023-10-04 12:18:38,545 WARNING [train.py:1204] (2/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_199-295611-0_sp1.1 from training. Duration: 0.5 2023-10-04 12:18:44,444 WARNING [train.py:1204] (2/4) Exclude cut with ID _5250_210_5219_1_1527249561806_8692110_1270-317971-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 12:18:44,543 WARNING [train.py:1204] (2/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_784-213099-0 from training. Duration: 0.93 2023-10-04 12:18:49,691 INFO [scaling.py:1022] (2/4) Whitening: name=encoder_embed.out_whiten, num_groups=1, num_channels=192, metric=7.35 vs. limit=8.0 2023-10-04 12:18:50,492 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_115-233929-0_sp0.9 from training. Duration: 0.8 2023-10-04 12:18:51,931 WARNING [train.py:1204] (2/4) Exclude cut with ID _54871_210_22356_1_1534665798253_3856359_256-223809-0_sp1.1 from training. Duration: 0.97275 2023-10-04 12:18:51,973 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_20120_210_6753_1_1531643035_5482859_285-87781-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 12:18:53,880 WARNING [train.py:1204] (2/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_619-344499-0_sp1.1 from training. Duration: 0.57275 2023-10-04 12:18:56,831 WARNING [train.py:1204] (2/4) Exclude cut with ID _60229_210_27767_1_1534838406158_3655350_264-279573-0_sp0.9 from training. Duration: 0.92225 2023-10-04 12:19:01,010 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.conv_module2.balancer2.prob, batch_count=1651853.3333333333, ans=0.125 2023-10-04 12:19:02,084 WARNING [train.py:1204] (2/4) Exclude cut with ID _7719_210_3525_1_1528456153163_4296960_333-110399-0 from training. Duration: 0.96 2023-10-04 12:19:05,088 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.ff3_skip_rate, batch_count=1651920.0, ans=0.0 2023-10-04 12:19:08,158 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_50423_210_13270_1_1534128598_5021734_759-90833-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 12:19:08,211 WARNING [train.py:1204] (2/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_151-254798-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 12:19:09,545 WARNING [train.py:1204] (2/4) Exclude cut with ID _3465_210_3525_1_1526175667060_3574049_450-203637-0 from training. Duration: 0.92 2023-10-04 12:19:09,580 WARNING [train.py:1204] (2/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_673-36456-0_sp1.1 from training. Duration: 0.936375 2023-10-04 12:19:10,910 WARNING [train.py:1204] (2/4) Exclude cut with ID _36538_210_9994_1_1533204020363_3795620_131-8685-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 12:19:11,577 WARNING [train.py:1204] (2/4) Exclude cut with ID _12556_210_8341_1_1530081024100_7327690_713-55626-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 12:19:11,622 WARNING [train.py:1204] (2/4) Exclude cut with ID _2322_210_2667_1_1525604548971_3656299_440-80494-0_sp0.9 from training. Duration: 0.97775 2023-10-04 12:19:15,518 WARNING [train.py:1204] (2/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_210-325081-0_sp1.1 from training. Duration: 0.97275 2023-10-04 12:19:18,339 WARNING [train.py:1204] (2/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_375-244976-0_sp0.9 from training. Duration: 0.8333125 2023-10-04 12:19:18,351 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_15517_210_6753_1_1530952293_1664795_119-31001-0_sp1.1 from training. Duration: 0.8545625 2023-10-04 12:19:23,644 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_41452_210_7291_1_1533437687_3941281_319-42326-0_sp1.1 from training. Duration: 0.92725 2023-10-04 12:19:25,223 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_372-173012-0 from training. Duration: 0.95 2023-10-04 12:19:29,498 WARNING [train.py:1204] (2/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_41-31157-0_sp1.1 from training. Duration: 0.5181875 2023-10-04 12:19:33,755 WARNING [train.py:1204] (2/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_91-212486-0 from training. Duration: 0.68 2023-10-04 12:19:36,416 INFO [train.py:1046] (2/4) Epoch 47, batch 3450, loss[loss=0.1683, simple_loss=0.2526, pruned_loss=0.04202, over 24075.00 frames. ], tot_loss[loss=0.1535, simple_loss=0.2344, pruned_loss=0.0363, over 4724858.95 frames. ], batch size: 86, lr: 2.16e-03, grad_scale: 8.0 2023-10-04 12:19:36,615 WARNING [train.py:1204] (2/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_1267-272693-0 from training. Duration: 0.73 2023-10-04 12:19:38,497 WARNING [train.py:1204] (2/4) Exclude cut with ID _13938_210_9988_1_1530518718978_3693960_152-67046-0_sp1.1 from training. Duration: 0.936375 2023-10-04 12:19:40,087 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_4528_210_3273_1_1527076654_3276777_342-168068-0_sp1.1 from training. Duration: 0.7909375 2023-10-04 12:19:40,096 WARNING [train.py:1204] (2/4) Exclude cut with ID _7606_210_4915_1_1528894253277_5080690_127-177761-0 from training. Duration: 0.64 2023-10-04 12:19:40,258 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.1.conv_module1.balancer2.prob, batch_count=1652053.3333333333, ans=0.125 2023-10-04 12:19:41,352 WARNING [train.py:1204] (2/4) Exclude cut with ID _45329_210_7005_1_1534157951211_6774339_335-137972-0_sp1.1 from training. Duration: 0.92725 2023-10-04 12:19:43,989 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.3.encoder.layers.3.self_attn_weights.whiten_keys, num_groups=8, num_channels=256, metric=2.44 vs. limit=6.0 2023-10-04 12:19:44,526 WARNING [train.py:1204] (2/4) Exclude cut with ID _5166_210_2084_1_1527073197160_4087993_331-117829-0_sp1.1 from training. Duration: 0.563625 2023-10-04 12:19:50,540 WARNING [train.py:1204] (2/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_125-125818-0_sp1.1 from training. Duration: 0.87275 2023-10-04 12:19:50,586 WARNING [train.py:1204] (2/4) Exclude cut with ID _6789_210_1534_1_1527854471671_2870519_48-294897-0_sp1.1 from training. Duration: 0.963625 2023-10-04 12:19:52,017 WARNING [train.py:1204] (2/4) Exclude cut with ID _6694_210_4599_1_1527852637490_3965529_116-109398-0_sp1.1 from training. Duration: 0.663625 2023-10-04 12:19:52,032 WARNING [train.py:1204] (2/4) Exclude cut with ID _52892_210_6721_1_1534378941259_4124129_222-172685-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 12:19:55,349 WARNING [train.py:1204] (2/4) Exclude cut with ID _50655_210_18628_1_1534229942193_3772154_320-132275-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 12:19:58,345 WARNING [train.py:1204] (2/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_76-311963-0 from training. Duration: 0.7 2023-10-04 12:20:03,730 WARNING [train.py:1204] (2/4) Exclude cut with ID _6274_210_2084_1_1527937583837_7494020_32-205288-0 from training. Duration: 0.78 2023-10-04 12:20:03,758 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_86-16881-0_sp0.9 from training. Duration: 0.6444375 2023-10-04 12:20:05,038 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_271-297505-0_sp1.1 from training. Duration: 0.9 2023-10-04 12:20:07,725 WARNING [train.py:1204] (2/4) Exclude cut with ID _57572_210_5517_1_1534921219624_3809484_187-295293-0_sp1.1 from training. Duration: 0.963625 2023-10-04 12:20:08,763 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.0.layers.1.feed_forward1.out_whiten, num_groups=1, num_channels=192, metric=4.13 vs. limit=15.0 2023-10-04 12:20:11,336 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.bypass_mid.scale_min, batch_count=1652186.6666666667, ans=0.2 2023-10-04 12:20:14,389 WARNING [train.py:1204] (2/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_563-329943-0 from training. Duration: 0.99 2023-10-04 12:20:14,463 WARNING [train.py:1204] (2/4) Exclude cut with ID _7160_210_1876_1_1527940923231_3703799_302-150728-0_sp0.9 from training. Duration: 0.8666875 2023-10-04 12:20:20,269 WARNING [train.py:1204] (2/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_926-7526-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 12:20:20,304 WARNING [train.py:1204] (2/4) Exclude cut with ID _62574_210_2655_1_1535180407058_3933249_467-22348-0_sp1.1 from training. Duration: 0.936375 2023-10-04 12:20:21,694 WARNING [train.py:1204] (2/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_280-161633-0_sp0.9 from training. Duration: 0.811125 2023-10-04 12:20:21,802 WARNING [train.py:1204] (2/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_385-193052-0_sp1.1 from training. Duration: 0.7818125 2023-10-04 12:20:23,236 WARNING [train.py:1204] (2/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_444-28041-0 from training. Duration: 0.85 2023-10-04 12:20:23,241 WARNING [train.py:1204] (2/4) Exclude cut with ID _66070_210_7640_1_1535441393096_7805310_341-249237-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 12:20:25,115 WARNING [train.py:1204] (2/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_425-269826-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 12:20:28,995 WARNING [train.py:1204] (2/4) Exclude cut with ID _53950_210_5816_1_1534417264919_3862951_124-185597-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 12:20:31,666 WARNING [train.py:1204] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_380-204756-0 from training. Duration: 0.86 2023-10-04 12:20:34,503 WARNING [train.py:1204] (2/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_282-257791-0_sp0.9 from training. Duration: 0.9555625 2023-10-04 12:20:36,212 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.feed_forward1.out_proj.dropout_p, batch_count=1652320.0, ans=0.1 2023-10-04 12:20:40,027 WARNING [train.py:1204] (2/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_372-131163-0_sp1.1 from training. Duration: 0.8545625 2023-10-04 12:20:41,967 WARNING [train.py:1204] (2/4) Exclude cut with ID _28317_210_12742_1_1532480302381_8510325_447-233513-0_sp1.1 from training. Duration: 0.963625 2023-10-04 12:20:43,446 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_54-118333-0_sp1.1 from training. Duration: 0.97275 2023-10-04 12:20:44,391 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.out_combiner.scale_min, batch_count=1652320.0, ans=0.2 2023-10-04 12:20:46,717 WARNING [train.py:1204] (2/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_388-230239-0_sp1.1 from training. Duration: 0.963625 2023-10-04 12:20:46,738 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_40047_210_2290_1_1533290519_2374311_141-54639-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 12:20:48,043 WARNING [train.py:1204] (2/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_91-267696-0_sp0.9 from training. Duration: 0.9666875 2023-10-04 12:20:49,307 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_14056_210_4163_1_1530604483_7572827_532-137847-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 12:20:50,999 INFO [train.py:1046] (2/4) Epoch 47, batch 3500, loss[loss=0.1454, simple_loss=0.1989, pruned_loss=0.0459, over 18953.00 frames. ], tot_loss[loss=0.1526, simple_loss=0.233, pruned_loss=0.03613, over 4698377.64 frames. ], batch size: 388, lr: 2.16e-03, grad_scale: 8.0 2023-10-04 12:20:52,559 WARNING [train.py:1204] (2/4) Exclude cut with ID _8064_210_5399_1_1528372299291_4282973_155-283231-0_sp1.1 from training. Duration: 0.97275 2023-10-04 12:20:52,789 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.attention_skip_rate, batch_count=1652386.6666666667, ans=0.0 2023-10-04 12:20:57,206 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_2245_210_2715_1_1525511548_3540254_310-52483-0_sp1.1 from training. Duration: 0.8 2023-10-04 12:20:57,273 WARNING [train.py:1204] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_185-174805-0 from training. Duration: 0.68 2023-10-04 12:20:58,852 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.balancer1.prob, batch_count=1652386.6666666667, ans=0.125 2023-10-04 12:20:59,813 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.687e+02 2.045e+02 2.266e+02 2.721e+02 4.311e+02, threshold=4.532e+02, percent-clipped=0.0 2023-10-04 12:20:59,932 WARNING [train.py:1204] (2/4) Exclude cut with ID _15012_210_1968_1_1530856820429_5095350_662-210469-0_sp1.1 from training. Duration: 0.5181875 2023-10-04 12:21:02,631 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_426-313557-0_sp0.9 from training. Duration: 0.588875 2023-10-04 12:21:04,133 WARNING [train.py:1204] (2/4) Exclude cut with ID _42326_210_7770_1_1533814282531_3707269_407-204343-0_sp1.1 from training. Duration: 0.97275 2023-10-04 12:21:05,380 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_258-310570-0 from training. Duration: 0.82 2023-10-04 12:21:09,778 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_4737_210_2040_1_1526998689_3293008_378-178650-0_sp1.1 from training. Duration: 0.863625 2023-10-04 12:21:09,867 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_47896_210_20508_1_1534492518_5958868_432-327451-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 12:21:11,230 WARNING [train.py:1204] (2/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_188-114464-0_sp1.1 from training. Duration: 0.7454375 2023-10-04 12:21:11,245 WARNING [train.py:1204] (2/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_798-106179-0_sp1.1 from training. Duration: 0.7545625 2023-10-04 12:21:11,281 WARNING [train.py:1204] (2/4) Exclude cut with ID _14368_210_4220_1_1530532714870_3595529_143-117421-0_sp1.1 from training. Duration: 0.52725 2023-10-04 12:21:13,047 WARNING [train.py:1204] (2/4) Exclude cut with ID _67499_210_17282_1_1535509805220_3692430_361-78786-0_sp1.1 from training. Duration: 0.963625 2023-10-04 12:21:13,093 WARNING [train.py:1204] (2/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_712-342563-0_sp1.1 from training. Duration: 0.8818125 2023-10-04 12:21:13,772 WARNING [train.py:1204] (2/4) Exclude cut with ID _9235_210_5219_1_1528891154253_8046120_387-159008-0 from training. Duration: 0.86 2023-10-04 12:21:16,445 WARNING [train.py:1204] (2/4) Exclude cut with ID _33640_210_2655_1_1533117605479_4855469_959-18820-0_sp1.1 from training. Duration: 0.963625 2023-10-04 12:21:16,473 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_12868_210_6753_1_1530702479_3610564_133-564-0_sp0.9 from training. Duration: 0.87775 2023-10-04 12:21:19,706 WARNING [train.py:1204] (2/4) Exclude cut with ID _23514_210_1389_1_1531832139785_3031439_12-29287-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 12:21:22,564 WARNING [train.py:1204] (2/4) Exclude cut with ID _23514_210_1389_1_1531832139785_3031439_489-185052-0_sp1.1 from training. Duration: 0.963625 2023-10-04 12:21:23,884 WARNING [train.py:1204] (2/4) Exclude cut with ID _5444_210_2151_1_1527418808464_5312210_634-194174-0 from training. Duration: 0.97 2023-10-04 12:21:23,911 WARNING [train.py:1204] (2/4) Exclude cut with ID _6275_210_2084_1_1528110062213_7669049_91-26919-0_sp1.1 from training. Duration: 0.8818125 2023-10-04 12:21:26,589 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_37351_210_7925_1_1533012931_3812113_516-82303-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 12:21:26,698 WARNING [train.py:1204] (2/4) Exclude cut with ID _41314_210_12651_1_1533459476543_3766460_80-74184-0_sp0.9 from training. Duration: 0.92225 2023-10-04 12:21:28,704 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_9460_210_4211_1_1528954587_10049667_495-202963-0_sp1.1 from training. Duration: 0.963625 2023-10-04 12:21:30,095 WARNING [train.py:1204] (2/4) Exclude cut with ID _14632_210_9988_1_1530691279392_3923200_332-105904-0_sp1.1 from training. Duration: 0.8181875 2023-10-04 12:21:30,123 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_40764_210_1925_1_1533882359_3752345_493-167886-0_sp1.1 from training. Duration: 0.92725 2023-10-04 12:21:31,542 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_196-289637-0 from training. Duration: 0.75 2023-10-04 12:21:32,946 WARNING [train.py:1204] (2/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_26-175553-0 from training. Duration: 0.61 2023-10-04 12:21:34,266 WARNING [train.py:1204] (2/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_890-5117-0 from training. Duration: 0.78 2023-10-04 12:21:34,320 WARNING [train.py:1204] (2/4) Exclude cut with ID _41314_210_12651_1_1533459476543_3766460_206-306860-0_sp1.1 from training. Duration: 0.92725 2023-10-04 12:21:35,742 WARNING [train.py:1204] (2/4) Exclude cut with ID _25882_210_3036_1_1532775665815_6479510_570-79824-0_sp1.1 from training. Duration: 0.963625 2023-10-04 12:21:37,141 WARNING [train.py:1204] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_731-2428-0_sp1.1 from training. Duration: 0.7545625 2023-10-04 12:21:37,176 WARNING [train.py:1204] (2/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_138-154812-0_sp0.9 from training. Duration: 0.8666875 2023-10-04 12:21:41,219 WARNING [train.py:1204] (2/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_178-39722-0_sp0.9 from training. Duration: 0.5444375 2023-10-04 12:21:42,569 WARNING [train.py:1204] (2/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_1285-256622-0_sp0.9 from training. Duration: 0.9333125 2023-10-04 12:21:44,032 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder_embed.convnext.out_balancer.prob, batch_count=1652586.6666666667, ans=0.125 2023-10-04 12:21:50,421 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_41428_210_2161_1_1533774184_3992532_65-271323-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 12:21:50,514 WARNING [train.py:1204] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_308-161067-0 from training. Duration: 0.66 2023-10-04 12:21:50,519 WARNING [train.py:1204] (2/4) Exclude cut with ID _3067_210_2667_1_1526212974640_4503168_459-173867-0 from training. Duration: 0.88 2023-10-04 12:21:50,521 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_2221_210_1988_1_1525597051_4935541_415-296882-0_sp1.1 from training. Duration: 0.7 2023-10-04 12:21:51,994 WARNING [train.py:1204] (2/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_354-192266-0_sp1.1 from training. Duration: 0.936375 2023-10-04 12:21:53,382 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_258-310570-0_sp0.9 from training. Duration: 0.911125 2023-10-04 12:21:54,812 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_548-141039-0_sp1.1 from training. Duration: 0.963625 2023-10-04 12:21:56,329 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.conv_module2.balancer2.prob, batch_count=1652653.3333333333, ans=0.125 2023-10-04 12:21:57,443 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_15101_210_7927_1_1530786415_5033274_320-327527-0 from training. Duration: 0.96 2023-10-04 12:21:57,497 WARNING [train.py:1204] (2/4) Exclude cut with ID _2895_210_2477_1_1525956785794_4063750_289-11475-0_sp0.9 from training. Duration: 0.911125 2023-10-04 12:22:00,103 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7543_210_6753_1_1528543318_4552_1-85072-0_sp1.1 from training. Duration: 0.7545625 2023-10-04 12:22:00,181 WARNING [train.py:1204] (2/4) Exclude cut with ID _64639_210_5816_1_1535367619437_3528000_461-329156-0 from training. Duration: 0.96 2023-10-04 12:22:01,923 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_211-339622-0 from training. Duration: 0.45 2023-10-04 12:22:03,322 WARNING [train.py:1204] (2/4) Exclude cut with ID _20455_210_5219_1_1531565996775_8098100_402-112739-0_sp1.1 from training. Duration: 0.963625 2023-10-04 12:22:03,415 WARNING [train.py:1204] (2/4) Exclude cut with ID _53489_210_6228_1_1534298357169_3835970_256-115565-0_sp1.1 from training. Duration: 0.936375 2023-10-04 12:22:03,740 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.bypass.scale_min, batch_count=1652720.0, ans=0.2 2023-10-04 12:22:04,900 INFO [train.py:1046] (2/4) Epoch 47, batch 3550, loss[loss=0.1383, simple_loss=0.2234, pruned_loss=0.02666, over 13722.00 frames. ], tot_loss[loss=0.1522, simple_loss=0.2323, pruned_loss=0.036, over 4695639.43 frames. ], batch size: 29, lr: 2.16e-03, grad_scale: 8.0 2023-10-04 12:22:05,006 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_26326_210_7925_1_1533002493_3592406_360-305260-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 12:22:05,032 WARNING [train.py:1204] (2/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_18-204383-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 12:22:07,864 WARNING [train.py:1204] (2/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_306-232480-0_sp1.1 from training. Duration: 0.87275 2023-10-04 12:22:15,743 WARNING [train.py:1204] (2/4) Exclude cut with ID _41161_210_20034_1_1533431249657_3555314_116-299186-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 12:22:17,277 WARNING [train.py:1204] (2/4) Exclude cut with ID _13913_210_9994_1_1530579615187_3383897_473-277682-0_sp1.1 from training. Duration: 0.3545625 2023-10-04 12:22:19,893 WARNING [train.py:1204] (2/4) Exclude cut with ID _58895_210_8750_1_1535184081320_3649089_49-225702-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 12:22:21,193 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_3080_210_2071_1_1526086102_4365166_312-8726-0_sp1.1 from training. Duration: 0.82725 2023-10-04 12:22:21,302 WARNING [train.py:1204] (2/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_489-190175-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 12:22:22,625 WARNING [train.py:1204] (2/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_345-95717-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 12:22:22,652 WARNING [train.py:1204] (2/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_85-138257-0_sp1.1 from training. Duration: 0.6454375 2023-10-04 12:22:26,767 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7046_210_6753_1_1527895337_6920917_783-278109-0_sp1.1 from training. Duration: 0.7 2023-10-04 12:22:26,803 WARNING [train.py:1204] (2/4) Exclude cut with ID _3465_210_3525_1_1526175667060_3574049_450-203637-0_sp1.1 from training. Duration: 0.836375 2023-10-04 12:22:28,549 WARNING [train.py:1204] (2/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_416-132893-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 12:22:28,565 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_2655_210_2087_1_1525688959_3897400_437-337690-0_sp0.9 from training. Duration: 0.7 2023-10-04 12:22:28,637 WARNING [train.py:1204] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_700-183549-0_sp1.1 from training. Duration: 0.6181875 2023-10-04 12:22:34,112 WARNING [train.py:1204] (2/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_191-59784-0_sp1.1 from training. Duration: 0.8 2023-10-04 12:22:35,383 WARNING [train.py:1204] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_218-295097-0_sp1.1 from training. Duration: 0.7 2023-10-04 12:22:35,630 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.conv_module2.balancer2.prob, batch_count=1652853.3333333333, ans=0.125 2023-10-04 12:22:38,071 WARNING [train.py:1204] (2/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_310-109461-0_sp1.1 from training. Duration: 0.72725 2023-10-04 12:22:38,076 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_33348_210_5632_1_1533086912_3324030_131-321149-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 12:22:38,138 WARNING [train.py:1204] (2/4) Exclude cut with ID _64135_210_2253_1_1535677267876_7608972_133-188266-0_sp1.1 from training. Duration: 0.736375 2023-10-04 12:22:39,461 WARNING [train.py:1204] (2/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_505-205270-0 from training. Duration: 0.38 2023-10-04 12:22:39,471 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_67223_210_4238_1_1535612120_2537431_102-88248-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 12:22:40,972 WARNING [train.py:1204] (2/4) Exclude cut with ID _54871_210_22356_1_1534665798253_3856359_60-235433-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 12:22:42,440 WARNING [train.py:1204] (2/4) Exclude cut with ID _5189_210_4557_1_1527248319428_3769229_199-53526-0_sp1.1 from training. Duration: 0.3818125 2023-10-04 12:22:47,128 WARNING [train.py:1204] (2/4) Exclude cut with ID _45329_210_7005_1_1534157951211_6774339_439-90979-0_sp1.1 from training. Duration: 0.963625 2023-10-04 12:22:47,810 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.3.encoder.layers.3.conv_module2.whiten, num_groups=1, num_channels=512, metric=4.97 vs. limit=15.0 2023-10-04 12:22:49,105 WARNING [train.py:1204] (2/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_1162-132880-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 12:22:49,175 WARNING [train.py:1204] (2/4) Exclude cut with ID _56423_210_5854_1_1534896061049_3165969_220-22317-0_sp1.1 from training. Duration: 0.963625 2023-10-04 12:22:50,406 WARNING [train.py:1204] (2/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_909-64151-0 from training. Duration: 0.94 2023-10-04 12:22:51,669 WARNING [train.py:1204] (2/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_465-156500-0_sp0.9 from training. Duration: 0.8 2023-10-04 12:22:53,145 WARNING [train.py:1204] (2/4) Exclude cut with ID _44304_210_15374_1_1533639352765_4613129_302-22084-0 from training. Duration: 0.86 2023-10-04 12:22:53,198 WARNING [train.py:1204] (2/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_663-166973-0_sp1.1 from training. Duration: 0.72725 2023-10-04 12:22:54,779 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.conv_module2.balancer1.max_abs, batch_count=1652920.0, ans=10.0 2023-10-04 12:22:57,319 WARNING [train.py:1204] (2/4) Exclude cut with ID _45329_210_7005_1_1534157951211_6774339_503-287883-0_sp0.9 from training. Duration: 0.97775 2023-10-04 12:22:57,344 WARNING [train.py:1204] (2/4) Exclude cut with ID _60057_210_10640_1_1534903065793_3333150_362-230377-0_sp0.9 from training. Duration: 0.9666875 2023-10-04 12:23:00,140 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_4183_210_2087_1_1526729100_1869838_143-280962-0 from training. Duration: 0.56 2023-10-04 12:23:01,540 WARNING [train.py:1204] (2/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_281-1313-0_sp1.1 from training. Duration: 0.936375 2023-10-04 12:23:04,015 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.5.encoder.layers.0.whiten, num_groups=1, num_channels=256, metric=3.16 vs. limit=12.0 2023-10-04 12:23:06,261 WARNING [train.py:1204] (2/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_64-215118-0_sp1.1 from training. Duration: 0.936375 2023-10-04 12:23:06,487 INFO [scaling.py:1118] (2/4) WithLoss: name=encoder.encoders.2.encoder.layers.0.self_attn_weights, loss-sum=0.000e+00 2023-10-04 12:23:07,703 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_2822_210_2715_1_1526112586_1999587_165-36656-0 from training. Duration: 0.89 2023-10-04 12:23:08,987 WARNING [train.py:1204] (2/4) Exclude cut with ID _22961_210_8254_1_1531976366672_4278180_402-208830-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 12:23:11,783 WARNING [train.py:1204] (2/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_98-193494-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 12:23:13,162 WARNING [train.py:1204] (2/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_383-285196-0 from training. Duration: 0.97 2023-10-04 12:23:18,606 INFO [train.py:1046] (2/4) Epoch 47, batch 3600, loss[loss=0.1535, simple_loss=0.233, pruned_loss=0.03701, over 23547.00 frames. ], tot_loss[loss=0.1523, simple_loss=0.2328, pruned_loss=0.03585, over 4708359.45 frames. ], batch size: 120, lr: 2.16e-03, grad_scale: 16.0 2023-10-04 12:23:20,428 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6767_210_3273_1_1527855783_1718932_142-127132-0 from training. Duration: 0.95 2023-10-04 12:23:20,465 WARNING [train.py:1204] (2/4) Exclude cut with ID _39867_210_9826_1_1533553222030_9344259_1018-339635-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 12:23:20,531 WARNING [train.py:1204] (2/4) Exclude cut with ID _14603_210_4220_1_1530615688441_3097319_274-274798-0_sp1.1 from training. Duration: 0.8909375 2023-10-04 12:23:23,083 WARNING [train.py:1204] (2/4) Exclude cut with ID _2334_210_2667_1_1525608238880_4705129_376-263393-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 12:23:23,121 WARNING [train.py:1204] (2/4) Exclude cut with ID _29303_210_2575_1_1532392237303_7410649_363-344675-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 12:23:24,440 WARNING [train.py:1204] (2/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_58-68956-0_sp1.1 from training. Duration: 0.7545625 2023-10-04 12:23:27,224 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.730e+02 2.182e+02 2.453e+02 2.820e+02 4.666e+02, threshold=4.905e+02, percent-clipped=3.0 2023-10-04 12:23:27,323 WARNING [train.py:1204] (2/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_709-311571-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 12:23:27,455 WARNING [train.py:1204] (2/4) Exclude cut with ID _46067_210_4220_1_1534125571066_3307450_136-257407-0_sp1.1 from training. Duration: 0.963625 2023-10-04 12:23:30,315 WARNING [train.py:1204] (2/4) Exclude cut with ID _6493_210_1747_1_1528459210424_4012470_324-323854-0_sp1.1 from training. Duration: 0.836375 2023-10-04 12:23:31,698 WARNING [train.py:1204] (2/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_234-260682-0_sp1.1 from training. Duration: 0.763625 2023-10-04 12:23:31,789 WARNING [train.py:1204] (2/4) Exclude cut with ID _36634_210_1794_1_1533296352450_1399884_55-119451-0_sp1.1 from training. Duration: 0.963625 2023-10-04 12:23:31,794 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_40047_210_2290_1_1533290519_2374311_195-165693-0 from training. Duration: 0.89 2023-10-04 12:23:33,827 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.2.encoder.layers.0.whiten, num_groups=1, num_channels=384, metric=5.74 vs. limit=12.0 2023-10-04 12:23:36,187 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_5522_210_3273_1_1527249606_3909096_366-172508-0_sp0.9 from training. Duration: 0.7333125 2023-10-04 12:23:37,546 WARNING [train.py:1204] (2/4) Exclude cut with ID _21878_210_3525_1_1531738772002_7264330_187-10504-0_sp1.1 from training. Duration: 0.963625 2023-10-04 12:23:40,282 WARNING [train.py:1204] (2/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_282-257791-0_sp1.1 from training. Duration: 0.7818125 2023-10-04 12:23:41,742 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_43831_210_2158_1_1533725502_5872791_467-288776-0_sp1.1 from training. Duration: 0.92725 2023-10-04 12:23:42,704 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.0.layers.0.self_attn_weights.whiten_keys, num_groups=4, num_channels=128, metric=2.81 vs. limit=6.0 2023-10-04 12:23:43,065 WARNING [train.py:1204] (2/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_138-154812-0_sp1.1 from training. Duration: 0.7090625 2023-10-04 12:23:43,107 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_1154-185930-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 12:23:43,129 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_2655_210_2087_1_1525688959_3897400_52-279899-0 from training. Duration: 0.69 2023-10-04 12:23:44,443 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_141-89113-0_sp1.1 from training. Duration: 0.7818125 2023-10-04 12:23:46,358 WARNING [train.py:1204] (2/4) Exclude cut with ID _55134_210_21982_1_1534590028377_3882170_297-47310-0_sp1.1 from training. Duration: 0.963625 2023-10-04 12:23:46,424 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_486-151131-0_sp1.1 from training. Duration: 0.77275 2023-10-04 12:23:50,267 WARNING [train.py:1204] (2/4) Exclude cut with ID _44778_210_2054_1_1533798127203_7443090_21-232980-0_sp1.1 from training. Duration: 0.936375 2023-10-04 12:23:53,032 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6165_210_3528_1_1527590982_4379841_338-106380-0_sp1.1 from training. Duration: 0.92725 2023-10-04 12:23:54,313 WARNING [train.py:1204] (2/4) Exclude cut with ID _55162_210_17071_1_1534408248583_3753480_355-257218-0_sp1.1 from training. Duration: 0.97275 2023-10-04 12:23:55,735 WARNING [train.py:1204] (2/4) Exclude cut with ID _2895_210_2477_1_1525956785794_4063750_289-11475-0 from training. Duration: 0.82 2023-10-04 12:23:58,773 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=1653186.6666666667, ans=0.1 2023-10-04 12:24:00,115 WARNING [train.py:1204] (2/4) Exclude cut with ID _16756_210_4915_1_1531047803763_7104259_839-183431-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 12:24:01,467 WARNING [train.py:1204] (2/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_427-208901-0_sp0.9 from training. Duration: 0.7555625 2023-10-04 12:24:01,502 WARNING [train.py:1204] (2/4) Exclude cut with ID _36512_210_6987_1_1533033143998_3126069_181-306500-0 from training. Duration: 0.95 2023-10-04 12:24:04,989 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.feed_forward3.hidden_balancer.prob, batch_count=1653253.3333333333, ans=0.125 2023-10-04 12:24:06,051 WARNING [train.py:1204] (2/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_28-70987-0_sp1.1 from training. Duration: 0.7909375 2023-10-04 12:24:11,418 WARNING [train.py:1204] (2/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_590-58996-0_sp1.1 from training. Duration: 0.936375 2023-10-04 12:24:14,877 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_48730_210_5399_1_1533885886_2212769_108-47561-0_sp1.1 from training. Duration: 0.936375 2023-10-04 12:24:23,542 WARNING [train.py:1204] (2/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_401-158239-0_sp0.9 from training. Duration: 0.788875 2023-10-04 12:24:23,571 WARNING [train.py:1204] (2/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_278-154492-0_sp1.1 from training. Duration: 0.6909375 2023-10-04 12:24:23,577 WARNING [train.py:1204] (2/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_137-116442-0 from training. Duration: 0.78 2023-10-04 12:24:24,882 WARNING [train.py:1204] (2/4) Exclude cut with ID _3541_210_2477_1_1526475408709_3263973_189-317158-0 from training. Duration: 0.9 2023-10-04 12:24:26,225 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_47896_210_20508_1_1534492518_5958868_558-314572-0 from training. Duration: 0.97 2023-10-04 12:24:29,040 WARNING [train.py:1204] (2/4) Exclude cut with ID _66070_210_7640_1_1535441393096_7805310_290-40284-0_sp1.1 from training. Duration: 0.97275 2023-10-04 12:24:29,084 WARNING [train.py:1204] (2/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_110-224688-0_sp1.1 from training. Duration: 0.8818125 2023-10-04 12:24:31,802 WARNING [train.py:1204] (2/4) Exclude cut with ID _7719_210_3525_1_1528456153163_4296960_88-250259-0 from training. Duration: 0.84 2023-10-04 12:24:31,857 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8453_210_4211_1_1528502940_8562730_1157-114102-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 12:24:31,889 WARNING [train.py:1204] (2/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_317-175062-0_sp1.1 from training. Duration: 0.7454375 2023-10-04 12:24:31,895 WARNING [train.py:1204] (2/4) Exclude cut with ID _29857_210_20034_1_1532395886753_3798160_254-347254-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 12:24:34,580 INFO [train.py:1046] (2/4) Epoch 47, batch 3650, loss[loss=0.1492, simple_loss=0.2354, pruned_loss=0.0315, over 24452.00 frames. ], tot_loss[loss=0.153, simple_loss=0.2335, pruned_loss=0.03626, over 4709189.14 frames. ], batch size: 63, lr: 2.15e-03, grad_scale: 16.0 2023-10-04 12:24:34,632 WARNING [train.py:1204] (2/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_28-70987-0 from training. Duration: 0.87 2023-10-04 12:24:35,976 WARNING [train.py:1204] (2/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_321-335475-0 from training. Duration: 0.45 2023-10-04 12:24:38,695 WARNING [train.py:1204] (2/4) Exclude cut with ID _40901_210_5665_1_1533363789958_3856130_356-107330-0_sp1.1 from training. Duration: 0.936375 2023-10-04 12:24:38,747 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_3780_210_2087_1_1526467389_2145572_176-107140-0 from training. Duration: 0.69 2023-10-04 12:24:42,077 WARNING [train.py:1204] (2/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_80-135200-0 from training. Duration: 0.79 2023-10-04 12:24:43,528 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_33348_210_5632_1_1533086912_3324030_85-28583-0_sp0.9 from training. Duration: 0.911125 2023-10-04 12:24:45,041 WARNING [train.py:1204] (2/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_29-293896-0 from training. Duration: 0.61 2023-10-04 12:24:46,921 WARNING [train.py:1204] (2/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_194-252800-0 from training. Duration: 0.97 2023-10-04 12:24:51,335 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_360-52446-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 12:24:51,336 WARNING [train.py:1204] (2/4) Exclude cut with ID _9389_210_1664_1_1529217337301_5289269_289-213417-0_sp0.9 from training. Duration: 0.988875 2023-10-04 12:24:51,390 WARNING [train.py:1204] (2/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_143-86344-0_sp0.9 from training. Duration: 0.8555625 2023-10-04 12:24:53,451 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.balancer2.prob, batch_count=1653453.3333333333, ans=0.125 2023-10-04 12:24:55,915 WARNING [train.py:1204] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_520-126729-0_sp0.9 from training. Duration: 0.77775 2023-10-04 12:24:55,934 WARNING [train.py:1204] (2/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_76-308663-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 12:24:57,291 WARNING [train.py:1204] (2/4) Exclude cut with ID _8021_210_7994_1_1528376451358_3653069_100-140231-0 from training. Duration: 0.67 2023-10-04 12:24:58,579 WARNING [train.py:1204] (2/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_387-131538-0_sp0.9 from training. Duration: 0.9 2023-10-04 12:24:58,628 WARNING [train.py:1204] (2/4) Exclude cut with ID _6275_210_2084_1_1528110062213_7669049_365-182054-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 12:24:58,672 WARNING [train.py:1204] (2/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_345-340690-0 from training. Duration: 0.93 2023-10-04 12:24:58,861 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.ff3_skip_rate, batch_count=1653453.3333333333, ans=0.0 2023-10-04 12:25:00,118 WARNING [train.py:1204] (2/4) Exclude cut with ID _5409_210_1388_1_1527292850189_5865191_178-127970-0_sp0.9 from training. Duration: 0.6333125 2023-10-04 12:25:00,168 WARNING [train.py:1204] (2/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_352-12734-0_sp1.1 from training. Duration: 0.92725 2023-10-04 12:25:00,171 WARNING [train.py:1204] (2/4) Exclude cut with ID _65134_210_14763_1_1535335393985_3505368_121-129373-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 12:25:04,332 WARNING [train.py:1204] (2/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_557-324258-0_sp1.1 from training. Duration: 0.736375 2023-10-04 12:25:05,752 WARNING [train.py:1204] (2/4) Exclude cut with ID _48678_210_5663_1_1533988830947_3769199_306-255886-0 from training. Duration: 0.77 2023-10-04 12:25:07,084 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7544_210_6753_1_1528587000_691468_87-178599-0 from training. Duration: 0.87 2023-10-04 12:25:07,140 WARNING [train.py:1204] (2/4) Exclude cut with ID _2941_210_2577_1_1526199673150_2489759_208-236402-0_sp1.1 from training. Duration: 0.963625 2023-10-04 12:25:09,918 WARNING [train.py:1204] (2/4) Exclude cut with ID _38670_210_15379_1_1533448810659_4391059_65-169397-0 from training. Duration: 0.97 2023-10-04 12:25:10,121 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.bypass.skip_rate, batch_count=1653520.0, ans=0.09899494936611666 2023-10-04 12:25:11,737 WARNING [train.py:1204] (2/4) Exclude cut with ID _30304_210_6319_1_1532476827261_7582060_306-328175-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 12:25:11,745 WARNING [train.py:1204] (2/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_212-149947-0_sp1.1 from training. Duration: 0.663625 2023-10-04 12:25:17,401 WARNING [train.py:1204] (2/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_321-22821-0_sp1.1 from training. Duration: 0.8090625 2023-10-04 12:25:17,527 WARNING [train.py:1204] (2/4) Exclude cut with ID _44010_210_4525_1_1533884262964_4013620_483-69110-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 12:25:17,536 WARNING [train.py:1204] (2/4) Exclude cut with ID _14922_210_6228_1_1530707435273_3693310_38-187851-0_sp1.1 from training. Duration: 0.563625 2023-10-04 12:25:19,509 WARNING [train.py:1204] (2/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_129-337802-0_sp0.9 from training. Duration: 0.8 2023-10-04 12:25:20,798 WARNING [train.py:1204] (2/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_268-87909-0_sp1.1 from training. Duration: 0.9 2023-10-04 12:25:24,430 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.conv_skip_rate, batch_count=1653586.6666666667, ans=0.0 2023-10-04 12:25:25,256 WARNING [train.py:1204] (2/4) Exclude cut with ID _21878_210_3525_1_1531738772002_7264330_559-184862-0_sp1.1 from training. Duration: 0.8545625 2023-10-04 12:25:28,584 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_46032_210_14656_1_1533781412_4507969_237-120766-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 12:25:28,679 WARNING [train.py:1204] (2/4) Exclude cut with ID _28427_210_4220_1_1532581240967_3061069_330-170906-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 12:25:29,950 WARNING [train.py:1204] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_401-51578-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 12:25:31,309 WARNING [train.py:1204] (2/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_209-205365-0_sp1.1 from training. Duration: 0.7181875 2023-10-04 12:25:32,650 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_23801_210_16223_1_1532066392_3294657_57-242170-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 12:25:32,705 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_127-37371-0_sp1.1 from training. Duration: 0.936375 2023-10-04 12:25:38,403 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9171W0019-578905 from training. Duration: 0.896 2023-10-04 12:25:40,037 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.conv_module1.balancer1.prob, batch_count=1653653.3333333333, ans=0.125 2023-10-04 12:25:41,724 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_24605_210_1794_1_1532047863_4149755_349-49056-0_sp1.1 from training. Duration: 0.92725 2023-10-04 12:25:41,735 WARNING [train.py:1204] (2/4) Exclude cut with ID _12302_210_5219_1_1529924446466_8107280_897-21237-0_sp1.1 from training. Duration: 0.936375 2023-10-04 12:25:42,349 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.4.encoder.layers.1.self_attn_weights.whiten_keys, num_groups=4, num_channels=128, metric=2.02 vs. limit=6.0 2023-10-04 12:25:43,067 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_9460_210_4211_1_1528954587_10049667_480-260269-0_sp0.9 from training. Duration: 0.62225 2023-10-04 12:25:43,120 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_348-152548-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 12:25:44,512 WARNING [train.py:1204] (2/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_301-330455-0_sp0.9 from training. Duration: 0.888875 2023-10-04 12:25:44,614 WARNING [train.py:1204] (2/4) Exclude cut with ID _61008_210_10633_1_1534852769198_6974370_345-91705-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 12:25:47,290 WARNING [train.py:1204] (2/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_304-273433-0 from training. Duration: 0.99 2023-10-04 12:25:47,293 WARNING [train.py:1204] (2/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_359-345972-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 12:25:48,590 INFO [train.py:1046] (2/4) Epoch 47, batch 3700, loss[loss=0.1617, simple_loss=0.2437, pruned_loss=0.03984, over 23260.00 frames. ], tot_loss[loss=0.1534, simple_loss=0.2339, pruned_loss=0.0365, over 4705713.58 frames. ], batch size: 93, lr: 2.15e-03, grad_scale: 16.0 2023-10-04 12:25:48,746 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_2296_210_2087_1_1525503448_3397335_57-185430-0_sp1.1 from training. Duration: 0.5454375 2023-10-04 12:25:50,191 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_1256-120974-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 12:25:52,020 WARNING [train.py:1204] (2/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_372-30302-0_sp0.9 from training. Duration: 0.9555625 2023-10-04 12:25:56,473 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_42012_210_7927_1_1533439936_3871556_451-344262-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 12:25:56,474 WARNING [train.py:1204] (2/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_28-324729-0 from training. Duration: 0.94 2023-10-04 12:25:56,482 WARNING [train.py:1204] (2/4) Exclude cut with ID _64135_210_2253_1_1535677267876_7608972_313-18394-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 12:25:56,576 WARNING [train.py:1204] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_620-276793-0_sp0.9 from training. Duration: 0.6555625 2023-10-04 12:25:58,225 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.684e+02 1.988e+02 2.215e+02 2.636e+02 3.694e+02, threshold=4.430e+02, percent-clipped=0.0 2023-10-04 12:25:58,310 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7544_210_6753_1_1528591327_1471426_172-348777-0_sp0.9 from training. Duration: 0.7333125 2023-10-04 12:26:01,066 WARNING [train.py:1204] (2/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_332-221057-0_sp0.9 from training. Duration: 0.7555625 2023-10-04 12:26:02,555 WARNING [train.py:1204] (2/4) Exclude cut with ID _19023_210_4353_1_1531378892552_5820780_5-300035-0_sp1.1 from training. Duration: 0.92725 2023-10-04 12:26:02,610 WARNING [train.py:1204] (2/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_343-309023-0_sp1.1 from training. Duration: 0.963625 2023-10-04 12:26:03,939 WARNING [train.py:1204] (2/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_37-41919-0_sp1.1 from training. Duration: 0.7909375 2023-10-04 12:26:05,300 WARNING [train.py:1204] (2/4) Exclude cut with ID _3541_210_2477_1_1526475408709_3263973_41-182910-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 12:26:05,385 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_229-45300-0_sp0.9 from training. Duration: 0.6333125 2023-10-04 12:26:08,138 WARNING [train.py:1204] (2/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_40-214716-0_sp1.1 from training. Duration: 0.963625 2023-10-04 12:26:09,468 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9309W0203-640330 from training. Duration: 0.896 2023-10-04 12:26:16,872 WARNING [train.py:1204] (2/4) Exclude cut with ID _60229_210_27767_1_1534838406158_3655350_264-279573-0_sp1.1 from training. Duration: 0.7545625 2023-10-04 12:26:16,915 WARNING [train.py:1204] (2/4) Exclude cut with ID _8394_210_6702_1_1528590504316_3540189_372-198878-0_sp0.9 from training. Duration: 0.6666875 2023-10-04 12:26:19,704 WARNING [train.py:1204] (2/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_359-118926-0_sp1.1 from training. Duration: 0.6545625 2023-10-04 12:26:19,721 WARNING [train.py:1204] (2/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_85-65056-0 from training. Duration: 0.96 2023-10-04 12:26:19,745 WARNING [train.py:1204] (2/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_89-242335-0_sp1.1 from training. Duration: 0.863625 2023-10-04 12:26:24,379 WARNING [train.py:1204] (2/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_128-49444-0_sp1.1 from training. Duration: 0.936375 2023-10-04 12:26:24,461 WARNING [train.py:1204] (2/4) Exclude cut with ID _30327_210_16049_1_1532484161295_3924169_401-70819-0 from training. Duration: 0.86 2023-10-04 12:26:25,806 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_41822_210_13270_1_1533955976_4379284_260-256391-0_sp1.1 from training. Duration: 0.936375 2023-10-04 12:26:27,700 WARNING [train.py:1204] (2/4) Exclude cut with ID _67335_210_5219_1_1535525975652_4275229_599-247317-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 12:26:27,872 WARNING [train.py:1204] (2/4) Exclude cut with ID _29857_210_20034_1_1532395886753_3798160_209-239267-0_sp1.1 from training. Duration: 0.936375 2023-10-04 12:26:29,785 WARNING [train.py:1204] (2/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_46-207599-0_sp1.1 from training. Duration: 0.5818125 2023-10-04 12:26:31,084 WARNING [train.py:1204] (2/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_912-282412-0_sp1.1 from training. Duration: 0.4454375 2023-10-04 12:26:36,423 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_4183_210_2087_1_1526729100_1869838_141-166600-0_sp1.1 from training. Duration: 0.863625 2023-10-04 12:26:37,738 WARNING [train.py:1204] (2/4) Exclude cut with ID _15037_210_2253_1_1530752294104_3267650_296-192597-0 from training. Duration: 0.81 2023-10-04 12:26:37,788 WARNING [train.py:1204] (2/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_571-220562-0_sp1.1 from training. Duration: 0.963625 2023-10-04 12:26:37,814 WARNING [train.py:1204] (2/4) Exclude cut with ID _5287_210_2084_1_1527418883040_3878670_12-221186-0 from training. Duration: 0.98 2023-10-04 12:26:42,166 WARNING [train.py:1204] (2/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_149-85602-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 12:26:42,191 WARNING [train.py:1204] (2/4) Exclude cut with ID _9235_210_5219_1_1528891154253_8046120_1003-101638-0_sp1.1 from training. Duration: 0.663625 2023-10-04 12:26:45,626 WARNING [train.py:1204] (2/4) Exclude cut with ID _55844_210_2346_1_1534471212569_7625707_1099-33689-0_sp1.1 from training. Duration: 0.92725 2023-10-04 12:26:45,664 WARNING [train.py:1204] (2/4) Exclude cut with ID _7606_210_4915_1_1528894253277_5080690_301-10732-0 from training. Duration: 0.81 2023-10-04 12:26:48,391 WARNING [train.py:1204] (2/4) Exclude cut with ID _22826_210_9994_1_1531908201522_3279169_403-34698-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 12:26:48,403 WARNING [train.py:1204] (2/4) Exclude cut with ID _8480_210_1874_1_1528589391205_6728330_755-112625-0_sp0.9 from training. Duration: 0.82225 2023-10-04 12:26:48,416 WARNING [train.py:1204] (2/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_527-238990-0_sp1.1 from training. Duration: 0.8181875 2023-10-04 12:26:48,436 WARNING [train.py:1204] (2/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_426-7328-0_sp1.1 from training. Duration: 0.92725 2023-10-04 12:26:50,013 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.ff2_skip_rate, batch_count=1653986.6666666667, ans=0.0 2023-10-04 12:26:52,509 WARNING [train.py:1204] (2/4) Exclude cut with ID _3541_210_2477_1_1526475408709_3263973_189-317158-0_sp1.1 from training. Duration: 0.8181875 2023-10-04 12:26:52,570 WARNING [train.py:1204] (2/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_162-103620-0 from training. Duration: 0.93 2023-10-04 12:26:54,353 WARNING [train.py:1204] (2/4) Exclude cut with ID _22083_210_8254_1_1531702862439_3678970_49-234546-0 from training. Duration: 0.83 2023-10-04 12:26:55,685 WARNING [train.py:1204] (2/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_370-331600-0_sp1.1 from training. Duration: 0.8545625 2023-10-04 12:26:55,698 WARNING [train.py:1204] (2/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_166-59042-0_sp1.1 from training. Duration: 0.97275 2023-10-04 12:26:55,809 WARNING [train.py:1204] (2/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_138-332773-0_sp1.1 from training. Duration: 0.736375 2023-10-04 12:26:57,133 WARNING [train.py:1204] (2/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_90-30673-0_sp0.9 from training. Duration: 0.9333125 2023-10-04 12:26:59,196 WARNING [train.py:1204] (2/4) Exclude cut with ID _27967_210_12140_1_1532588391859_8419130_737-41898-0_sp1.1 from training. Duration: 0.936375 2023-10-04 12:27:01,107 WARNING [train.py:1204] (2/4) Exclude cut with ID _8833_210_5219_1_1528592665210_7553059_329-125156-0_sp1.1 from training. Duration: 0.7454375 2023-10-04 12:27:02,438 WARNING [train.py:1204] (2/4) Exclude cut with ID _41817_210_18835_1_1533434477160_7423049_352-42537-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 12:27:03,832 INFO [train.py:1046] (2/4) Epoch 47, batch 3750, loss[loss=0.1326, simple_loss=0.2085, pruned_loss=0.02831, over 19386.00 frames. ], tot_loss[loss=0.1536, simple_loss=0.2344, pruned_loss=0.03643, over 4710591.72 frames. ], batch size: 42, lr: 2.15e-03, grad_scale: 16.0 2023-10-04 12:27:04,089 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.balancer1.prob, batch_count=1654053.3333333333, ans=0.125 2023-10-04 12:27:04,112 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.bypass_mid.scale_min, batch_count=1654053.3333333333, ans=0.2 2023-10-04 12:27:05,275 WARNING [train.py:1204] (2/4) Exclude cut with ID _8365_210_2064_1_1528592311070_4677690_431-294102-0 from training. Duration: 0.89 2023-10-04 12:27:05,409 WARNING [train.py:1204] (2/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_454-314901-0_sp1.1 from training. Duration: 0.4181875 2023-10-04 12:27:08,210 WARNING [train.py:1204] (2/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_80-135200-0_sp0.9 from training. Duration: 0.87775 2023-10-04 12:27:08,251 WARNING [train.py:1204] (2/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_181-78632-0 from training. Duration: 0.78 2023-10-04 12:27:09,571 WARNING [train.py:1204] (2/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_300-27235-0_sp0.9 from training. Duration: 0.9666875 2023-10-04 12:27:10,903 WARNING [train.py:1204] (2/4) Exclude cut with ID _47790_210_16187_1_1533862814066_4207519_96-177176-0_sp1.1 from training. Duration: 0.97275 2023-10-04 12:27:12,888 WARNING [train.py:1204] (2/4) Exclude cut with ID _35941_210_15379_1_1532995294515_4023089_246-348742-0_sp1.1 from training. Duration: 0.97275 2023-10-04 12:27:14,409 WARNING [train.py:1204] (2/4) Exclude cut with ID _38670_210_15379_1_1533448810659_4391059_65-169397-0_sp1.1 from training. Duration: 0.8818125 2023-10-04 12:27:18,531 WARNING [train.py:1204] (2/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_1003-99642-0_sp1.1 from training. Duration: 0.963625 2023-10-04 12:27:20,785 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.0.layers.1.self_attn2.whiten, num_groups=1, num_channels=192, metric=10.13 vs. limit=22.5 2023-10-04 12:27:21,446 WARNING [train.py:1204] (2/4) Exclude cut with ID _40402_210_18219_1_1533522324115_2953150_320-14078-0_sp1.1 from training. Duration: 0.863625 2023-10-04 12:27:22,779 WARNING [train.py:1204] (2/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_367-61330-0_sp0.9 from training. Duration: 0.8333125 2023-10-04 12:27:24,243 WARNING [train.py:1204] (2/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_515-298514-0_sp1.1 from training. Duration: 0.92725 2023-10-04 12:27:27,560 WARNING [train.py:1204] (2/4) Exclude cut with ID _53731_210_7994_1_1534553979244_3757214_471-320815-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 12:27:30,855 WARNING [train.py:1204] (2/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_306-232480-0 from training. Duration: 0.96 2023-10-04 12:27:31,553 WARNING [train.py:1204] (2/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_478-162247-0_sp0.9 from training. Duration: 0.911125 2023-10-04 12:27:34,130 WARNING [train.py:1204] (2/4) Exclude cut with ID _56423_210_5854_1_1534896061049_3165969_345-284696-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 12:27:34,166 WARNING [train.py:1204] (2/4) Exclude cut with ID _44778_210_2054_1_1533798127203_7443090_600-122253-0_sp1.1 from training. Duration: 0.963625 2023-10-04 12:27:35,642 WARNING [train.py:1204] (2/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_199-295611-0 from training. Duration: 0.55 2023-10-04 12:27:35,887 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.conv_module2.balancer1.prob, batch_count=1654186.6666666667, ans=0.125 2023-10-04 12:27:39,784 WARNING [train.py:1204] (2/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_734-129761-0 from training. Duration: 0.82 2023-10-04 12:27:39,897 WARNING [train.py:1204] (2/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_349-169257-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 12:27:41,172 WARNING [train.py:1204] (2/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_202-67017-0_sp0.9 from training. Duration: 0.911125 2023-10-04 12:27:42,597 WARNING [train.py:1204] (2/4) Exclude cut with ID _16908_210_5724_1_1531119157248_3772700_61-258978-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 12:27:47,107 WARNING [train.py:1204] (2/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_172-156931-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 12:27:48,537 WARNING [train.py:1204] (2/4) Exclude cut with ID _4092_210_2151_1_1526643044449_3135759_356-278694-0_sp1.1 from training. Duration: 0.42725 2023-10-04 12:27:51,281 WARNING [train.py:1204] (2/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_116-332008-0 from training. Duration: 0.98 2023-10-04 12:27:54,047 WARNING [train.py:1204] (2/4) Exclude cut with ID _29003_210_19596_1_1532507391720_3894098_358-114789-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 12:27:54,707 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.3.encoder.layers.1.nonlin_attention.whiten2, num_groups=1, num_channels=512, metric=5.05 vs. limit=15.0 2023-10-04 12:27:57,397 WARNING [train.py:1204] (2/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_152-212533-0_sp1.1 from training. Duration: 0.8818125 2023-10-04 12:27:58,754 WARNING [train.py:1204] (2/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_267-214116-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 12:28:01,374 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7543_210_6753_1_1528541907_1399322_165-323619-0_sp0.9 from training. Duration: 0.7666875 2023-10-04 12:28:05,466 WARNING [train.py:1204] (2/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_577-332600-0_sp0.9 from training. Duration: 0.6333125 2023-10-04 12:28:06,799 WARNING [train.py:1204] (2/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_342-100157-0_sp0.9 from training. Duration: 0.6 2023-10-04 12:28:08,254 WARNING [train.py:1204] (2/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_141-89727-0_sp1.1 from training. Duration: 0.6454375 2023-10-04 12:28:08,463 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.nonlin_attention.balancer.min_positive, batch_count=1654320.0, ans=0.05 2023-10-04 12:28:09,667 WARNING [train.py:1204] (2/4) Exclude cut with ID _3992_210_1747_1_1526644816031_3731060_107-251434-0_sp1.1 from training. Duration: 0.9 2023-10-04 12:28:11,168 WARNING [train.py:1204] (2/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_401-53423-0_sp1.1 from training. Duration: 0.5 2023-10-04 12:28:14,642 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.ff2_skip_rate, batch_count=1654320.0, ans=0.0 2023-10-04 12:28:14,981 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.5.encoder.layers.1.feed_forward2.out_whiten, num_groups=1, num_channels=256, metric=9.54 vs. limit=15.0 2023-10-04 12:28:18,583 INFO [train.py:1046] (2/4) Epoch 47, batch 3800, loss[loss=0.1511, simple_loss=0.2226, pruned_loss=0.03976, over 23844.00 frames. ], tot_loss[loss=0.1537, simple_loss=0.2344, pruned_loss=0.03651, over 4715058.40 frames. ], batch size: 195, lr: 2.15e-03, grad_scale: 16.0 2023-10-04 12:28:18,711 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7762_210_1794_1_1528199508_4057408_422-227034-0_sp1.1 from training. Duration: 0.663625 2023-10-04 12:28:22,947 WARNING [train.py:1204] (2/4) Exclude cut with ID _4184_210_2550_1_1526730192013_2219380_102-250299-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 12:28:24,298 WARNING [train.py:1204] (2/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_253-308377-0_sp0.9 from training. Duration: 0.5444375 2023-10-04 12:28:24,378 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_271-297505-0 from training. Duration: 0.99 2023-10-04 12:28:25,840 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.1.balancer1.prob, batch_count=1654386.6666666667, ans=0.125 2023-10-04 12:28:27,395 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.785e+02 2.014e+02 2.169e+02 2.643e+02 4.021e+02, threshold=4.338e+02, percent-clipped=0.0 2023-10-04 12:28:27,510 WARNING [train.py:1204] (2/4) Exclude cut with ID _35941_210_15379_1_1532995294515_4023089_213-142973-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 12:28:28,949 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7544_210_6753_1_1528587722_3576686_162-63018-0_sp1.1 from training. Duration: 0.92725 2023-10-04 12:28:30,257 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_365-217119-0_sp0.9 from training. Duration: 0.7 2023-10-04 12:28:32,424 WARNING [train.py:1204] (2/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_662-275621-0_sp0.9 from training. Duration: 0.4666875 2023-10-04 12:28:32,425 WARNING [train.py:1204] (2/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_374-187555-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 12:28:32,509 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_289-180904-0_sp1.1 from training. Duration: 0.6909375 2023-10-04 12:28:32,678 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.nonlin_attention.balancer.prob, batch_count=1654453.3333333333, ans=0.125 2023-10-04 12:28:33,906 WARNING [train.py:1204] (2/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_189-47206-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 12:28:33,932 WARNING [train.py:1204] (2/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_234-14174-0_sp1.1 from training. Duration: 0.7454375 2023-10-04 12:28:35,277 WARNING [train.py:1204] (2/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_576-76621-0_sp1.1 from training. Duration: 0.963625 2023-10-04 12:28:35,370 WARNING [train.py:1204] (2/4) Exclude cut with ID _13253_210_1925_1_1530757775819_3577873_253-246386-0 from training. Duration: 0.9 2023-10-04 12:28:35,566 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.ff3_skip_rate, batch_count=1654453.3333333333, ans=0.0 2023-10-04 12:28:38,162 WARNING [train.py:1204] (2/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_372-6869-0_sp1.1 from training. Duration: 0.4 2023-10-04 12:28:38,220 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_21434_210_4238_1_1531916339_1222252_22-54954-0_sp1.1 from training. Duration: 0.936375 2023-10-04 12:28:41,537 WARNING [train.py:1204] (2/4) Exclude cut with ID _29853_210_7497_1_1532505600851_6642110_492-170361-0_sp1.1 from training. Duration: 0.92725 2023-10-04 12:28:44,285 WARNING [train.py:1204] (2/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_505-97987-0_sp0.9 from training. Duration: 0.9555625 2023-10-04 12:28:44,322 WARNING [train.py:1204] (2/4) Exclude cut with ID _8365_210_2064_1_1528592311070_4677690_371-122295-0_sp0.9 from training. Duration: 0.611125 2023-10-04 12:28:45,708 WARNING [train.py:1204] (2/4) Exclude cut with ID _14268_210_9206_1_1530622771285_4714099_637-86630-0_sp1.1 from training. Duration: 0.463625 2023-10-04 12:28:45,726 WARNING [train.py:1204] (2/4) Exclude cut with ID _29857_210_20034_1_1532395886753_3798160_372-5678-0_sp1.1 from training. Duration: 0.963625 2023-10-04 12:28:48,524 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.bypass.scale_min, batch_count=1654520.0, ans=0.2 2023-10-04 12:28:49,585 WARNING [train.py:1204] (2/4) Exclude cut with ID _52892_210_6721_1_1534378941259_4124129_90-165108-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 12:28:50,920 WARNING [train.py:1204] (2/4) Exclude cut with ID _27052_210_5986_1_1532329197187_3585009_143-339198-0_sp1.1 from training. Duration: 0.963625 2023-10-04 12:28:51,140 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.feed_forward3.hidden_balancer.prob, batch_count=1654520.0, ans=0.125 2023-10-04 12:28:55,085 WARNING [train.py:1204] (2/4) Exclude cut with ID _14268_210_9206_1_1530622771285_4714099_637-86630-0_sp0.9 from training. Duration: 0.5666875 2023-10-04 12:28:55,088 WARNING [train.py:1204] (2/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_401-255491-0 from training. Duration: 0.7 2023-10-04 12:28:57,046 WARNING [train.py:1204] (2/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_178-6946-0_sp1.1 from training. Duration: 0.97275 2023-10-04 12:29:03,427 WARNING [train.py:1204] (2/4) Exclude cut with ID _27485_210_4916_1_1532739191677_3668269_460-163885-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 12:29:03,622 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.attention_skip_rate, batch_count=1654586.6666666667, ans=0.0 2023-10-04 12:29:07,754 WARNING [train.py:1204] (2/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_270-26053-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 12:29:10,461 WARNING [train.py:1204] (2/4) Exclude cut with ID _14170_210_5219_1_1530525949868_4469650_412-317077-0 from training. Duration: 0.75 2023-10-04 12:29:10,603 WARNING [train.py:1204] (2/4) Exclude cut with ID _5289_210_2084_1_1527678202736_3779299_142-103317-0 from training. Duration: 0.84 2023-10-04 12:29:12,439 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_150-143438-0_sp1.1 from training. Duration: 0.92725 2023-10-04 12:29:13,970 WARNING [train.py:1204] (2/4) Exclude cut with ID _27894_210_12392_1_1532415496078_6902439_668-269779-0_sp1.1 from training. Duration: 0.97275 2023-10-04 12:29:14,039 WARNING [train.py:1204] (2/4) Exclude cut with ID _18513_210_9206_1_1531618215223_3862620_56-222075-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 12:29:15,452 WARNING [train.py:1204] (2/4) Exclude cut with ID _4092_210_2151_1_1526643044449_3135759_356-278694-0 from training. Duration: 0.47 2023-10-04 12:29:21,213 WARNING [train.py:1204] (2/4) Exclude cut with ID _25808_210_13292_1_1532343514562_7366109_633-47524-0 from training. Duration: 0.99 2023-10-04 12:29:21,224 WARNING [train.py:1204] (2/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_872-264525-0 from training. Duration: 0.65 2023-10-04 12:29:21,263 WARNING [train.py:1204] (2/4) Exclude cut with ID _14632_210_9988_1_1530691279392_3923200_239-232545-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 12:29:22,622 WARNING [train.py:1204] (2/4) Exclude cut with ID _14349_210_8341_1_1530615710818_3442470_153-27432-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 12:29:26,126 WARNING [train.py:1204] (2/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_37-41919-0_sp0.9 from training. Duration: 0.9666875 2023-10-04 12:29:26,179 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_196-289637-0_sp0.9 from training. Duration: 0.8333125 2023-10-04 12:29:33,658 INFO [train.py:1046] (2/4) Epoch 47, batch 3850, loss[loss=0.1109, simple_loss=0.1649, pruned_loss=0.02846, over 19489.00 frames. ], tot_loss[loss=0.1525, simple_loss=0.2329, pruned_loss=0.03606, over 4713851.12 frames. ], batch size: 389, lr: 2.15e-03, grad_scale: 16.0 2023-10-04 12:29:33,699 WARNING [train.py:1204] (2/4) Exclude cut with ID _13678_210_7497_1_1530530069226_3463170_371-3221-0_sp1.1 from training. Duration: 0.8181875 2023-10-04 12:29:33,779 WARNING [train.py:1204] (2/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_557-324258-0 from training. Duration: 0.81 2023-10-04 12:29:36,429 WARNING [train.py:1204] (2/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_1012-279473-0_sp1.1 from training. Duration: 0.6090625 2023-10-04 12:29:36,520 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_66916_210_14656_1_1535725525_326566_7-92649-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 12:29:40,481 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_9460_210_4211_1_1528954587_10049667_480-260269-0_sp1.1 from training. Duration: 0.5090625 2023-10-04 12:29:43,930 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_40896_210_15710_1_1533433956_7100802_121-155001-0_sp1.1 from training. Duration: 0.92725 2023-10-04 12:29:46,542 WARNING [train.py:1204] (2/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_188-171673-0_sp1.1 from training. Duration: 0.52725 2023-10-04 12:29:46,620 WARNING [train.py:1204] (2/4) Exclude cut with ID _47790_210_16187_1_1533862814066_4207519_2-39653-0 from training. Duration: 0.98 2023-10-04 12:29:52,215 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_23850_210_4238_1_1532232597_1632433_112-229012-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 12:29:54,029 WARNING [train.py:1204] (2/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_626-79528-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 12:29:55,552 WARNING [train.py:1204] (2/4) Exclude cut with ID _30304_210_6319_1_1532476827261_7582060_856-90052-0_sp1.1 from training. Duration: 0.963625 2023-10-04 12:29:56,876 WARNING [train.py:1204] (2/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_122-109023-0_sp0.9 from training. Duration: 0.8444375 2023-10-04 12:30:00,173 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.feed_forward3.out_whiten.whitening_limit, batch_count=1654786.6666666667, ans=15.0 2023-10-04 12:30:00,437 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.2.encoder.layers.0.self_attn_weights.whiten_keys, num_groups=4, num_channels=128, metric=3.82 vs. limit=6.0 2023-10-04 12:30:00,965 WARNING [train.py:1204] (2/4) Exclude cut with ID _64414_210_17301_1_1535243252048_3757759_37-70328-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 12:30:01,025 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_20120_210_6753_1_1531643035_5482859_622-344907-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 12:30:01,070 WARNING [train.py:1204] (2/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_410-321956-0_sp1.1 from training. Duration: 0.92725 2023-10-04 12:30:01,088 WARNING [train.py:1204] (2/4) Exclude cut with ID _7562_210_2084_1_1528887767916_3946229_186-326422-0_sp0.9 from training. Duration: 0.9333125 2023-10-04 12:30:03,024 WARNING [train.py:1204] (2/4) Exclude cut with ID _37537_210_2253_1_1533171758568_3421949_127-285192-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 12:30:05,161 WARNING [train.py:1204] (2/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_723-228168-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 12:30:05,228 WARNING [train.py:1204] (2/4) Exclude cut with ID _13678_210_7497_1_1530530069226_3463170_426-246984-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 12:30:05,250 WARNING [train.py:1204] (2/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_399-156791-0_sp0.9 from training. Duration: 0.92225 2023-10-04 12:30:05,300 WARNING [train.py:1204] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_391-22148-0 from training. Duration: 0.77 2023-10-04 12:30:06,604 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_3475_210_2028_1_1526193578_3622569_64-297072-0 from training. Duration: 0.91 2023-10-04 12:30:08,105 WARNING [train.py:1204] (2/4) Exclude cut with ID _17875_210_2087_1_1531224012907_1821339_225-280015-0_sp1.1 from training. Duration: 0.963625 2023-10-04 12:30:08,132 WARNING [train.py:1204] (2/4) Exclude cut with ID _50655_210_18628_1_1534229942193_3772154_325-246169-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 12:30:09,803 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.conv_skip_rate, batch_count=1654853.3333333333, ans=0.0 2023-10-04 12:30:11,334 WARNING [train.py:1204] (2/4) Exclude cut with ID _29529_210_7234_1_1532393961455_3977460_597-257348-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 12:30:11,377 WARNING [train.py:1204] (2/4) Exclude cut with ID _58895_210_8750_1_1535184081320_3649089_77-173485-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 12:30:12,675 WARNING [train.py:1204] (2/4) Exclude cut with ID _6270_210_2084_1_1527850911573_3856420_26-277343-0 from training. Duration: 0.82 2023-10-04 12:30:12,944 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.conv_skip_rate, batch_count=1654853.3333333333, ans=0.0 2023-10-04 12:30:14,186 WARNING [train.py:1204] (2/4) Exclude cut with ID _14368_210_4220_1_1530532714870_3595529_144-3287-0 from training. Duration: 0.67 2023-10-04 12:30:16,839 WARNING [train.py:1204] (2/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_293-145278-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 12:30:18,235 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_40047_210_2290_1_1533290519_2374311_257-335162-0 from training. Duration: 0.95 2023-10-04 12:30:19,721 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.bypass_mid.scale_min, batch_count=1654920.0, ans=0.2 2023-10-04 12:30:20,805 WARNING [train.py:1204] (2/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_619-344499-0_sp0.9 from training. Duration: 0.7 2023-10-04 12:30:26,886 WARNING [train.py:1204] (2/4) Exclude cut with ID _19504_210_9994_1_1531648863242_3325220_456-315379-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 12:30:27,039 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.nonlin_attention.balancer.prob, batch_count=1654920.0, ans=0.125 2023-10-04 12:30:28,208 WARNING [train.py:1204] (2/4) Exclude cut with ID _55857_210_2346_1_1534644007831_6898209_631-221692-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 12:30:30,983 WARNING [train.py:1204] (2/4) Exclude cut with ID _64631_210_27767_1_1535349580068_4165160_324-3682-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 12:30:31,022 WARNING [train.py:1204] (2/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_317-211107-0 from training. Duration: 0.92 2023-10-04 12:30:32,470 WARNING [train.py:1204] (2/4) Exclude cut with ID _6789_210_1534_1_1527857937218_3118950_311-61262-0 from training. Duration: 0.65 2023-10-04 12:30:36,938 WARNING [train.py:1204] (2/4) Exclude cut with ID _28323_210_16187_1_1532480395667_4100300_365-68882-0_sp1.1 from training. Duration: 0.92725 2023-10-04 12:30:37,580 WARNING [train.py:1204] (2/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_816-181825-0_sp1.1 from training. Duration: 0.92725 2023-10-04 12:30:40,148 WARNING [train.py:1204] (2/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_132-269633-0_sp1.1 from training. Duration: 0.6181875 2023-10-04 12:30:40,166 WARNING [train.py:1204] (2/4) Exclude cut with ID _44585_210_18628_1_1533605350387_4518199_280-73534-0_sp1.1 from training. Duration: 0.8090625 2023-10-04 12:30:40,218 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_68095_210_20185_1_1535718226_8888855_1009-164755-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 12:30:41,558 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_40223_210_6228_1_1533298404_4812267_400-103813-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 12:30:41,559 WARNING [train.py:1204] (2/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_491-261923-0_sp1.1 from training. Duration: 0.8909375 2023-10-04 12:30:41,564 WARNING [train.py:1204] (2/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_712-342563-0 from training. Duration: 0.97 2023-10-04 12:30:43,434 WARNING [train.py:1204] (2/4) Exclude cut with ID _41161_210_20034_1_1533431249657_3555314_175-190032-0_sp1.1 from training. Duration: 0.963625 2023-10-04 12:30:44,810 WARNING [train.py:1204] (2/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_788-298907-0 from training. Duration: 0.89 2023-10-04 12:30:44,833 WARNING [train.py:1204] (2/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_225-70961-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 12:30:44,839 WARNING [train.py:1204] (2/4) Exclude cut with ID _58583_210_5236_1_1534726835266_3574249_235-175994-0_sp1.1 from training. Duration: 0.92725 2023-10-04 12:30:47,532 INFO [train.py:1046] (2/4) Epoch 47, batch 3900, loss[loss=0.1319, simple_loss=0.2054, pruned_loss=0.0292, over 23644.00 frames. ], tot_loss[loss=0.1523, simple_loss=0.2331, pruned_loss=0.03575, over 4735204.91 frames. ], batch size: 256, lr: 2.15e-03, grad_scale: 16.0 2023-10-04 12:30:47,598 WARNING [train.py:1204] (2/4) Exclude cut with ID _12302_210_5219_1_1529924446466_8107280_907-182949-0_sp1.1 from training. Duration: 0.836375 2023-10-04 12:30:47,639 WARNING [train.py:1204] (2/4) Exclude cut with ID _54871_210_22356_1_1534665798253_3856359_94-193692-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 12:30:50,303 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_4528_210_3273_1_1527076654_3276777_342-168068-0_sp0.9 from training. Duration: 0.9666875 2023-10-04 12:30:50,362 WARNING [train.py:1204] (2/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_368-74239-0_sp1.1 from training. Duration: 0.92725 2023-10-04 12:30:50,364 WARNING [train.py:1204] (2/4) Exclude cut with ID _44831_210_13427_1_1533816011549_3689035_291-295928-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 12:30:50,429 WARNING [train.py:1204] (2/4) Exclude cut with ID _56406_210_9288_1_1534899618664_3496149_455-131997-0_sp1.1 from training. Duration: 0.97275 2023-10-04 12:30:50,436 WARNING [train.py:1204] (2/4) Exclude cut with ID _6473_210_1741_1_1527901307396_3815380_108-17335-0 from training. Duration: 0.65 2023-10-04 12:30:50,476 WARNING [train.py:1204] (2/4) Exclude cut with ID _14224_210_4381_1_1530529062037_4264010_286-135600-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 12:30:54,726 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_14056_210_4163_1_1530604483_7572827_928-321675-0_sp1.1 from training. Duration: 0.936375 2023-10-04 12:30:56,161 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.655e+02 1.990e+02 2.251e+02 2.622e+02 3.660e+02, threshold=4.502e+02, percent-clipped=0.0 2023-10-04 12:30:56,246 WARNING [train.py:1204] (2/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_81-300212-0_sp1.1 from training. Duration: 0.5454375 2023-10-04 12:30:56,292 WARNING [train.py:1204] (2/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_29-280622-0_sp0.9 from training. Duration: 0.911125 2023-10-04 12:30:58,197 WARNING [train.py:1204] (2/4) Exclude cut with ID _61079_210_4525_1_1534921008198_4011419_228-176447-0_sp1.1 from training. Duration: 0.936375 2023-10-04 12:31:01,002 WARNING [train.py:1204] (2/4) Exclude cut with ID _8394_210_6702_1_1528590504316_3540189_372-198878-0_sp1.1 from training. Duration: 0.5454375 2023-10-04 12:31:01,023 WARNING [train.py:1204] (2/4) Exclude cut with ID _64521_210_14699_1_1535243427259_3815328_244-208725-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 12:31:01,270 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.feed_forward1.hidden_balancer.prob, batch_count=1655120.0, ans=0.125 2023-10-04 12:31:02,308 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8275_210_4198_1_1528455320_3892571_491-17707-0_sp0.9 from training. Duration: 0.711125 2023-10-04 12:31:02,423 WARNING [train.py:1204] (2/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_375-245677-0 from training. Duration: 0.81 2023-10-04 12:31:02,430 WARNING [train.py:1204] (2/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_690-286055-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 12:31:03,796 WARNING [train.py:1204] (2/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_89-321955-0 from training. Duration: 0.8 2023-10-04 12:31:05,622 WARNING [train.py:1204] (2/4) Exclude cut with ID _18895_210_6228_1_1531357274234_3784930_213-98721-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 12:31:05,693 WARNING [train.py:1204] (2/4) Exclude cut with ID _19708_210_9206_1_1532136650733_4238364_347-65240-0 from training. Duration: 0.97 2023-10-04 12:31:08,385 WARNING [train.py:1204] (2/4) Exclude cut with ID _64135_210_2253_1_1535677267876_7608972_498-5039-0 from training. Duration: 0.78 2023-10-04 12:31:11,620 WARNING [train.py:1204] (2/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_146-276944-0_sp1.1 from training. Duration: 0.7545625 2023-10-04 12:31:13,522 WARNING [train.py:1204] (2/4) Exclude cut with ID _63171_210_15488_1_1535104792794_3295110_155-34488-0_sp1.1 from training. Duration: 0.97275 2023-10-04 12:31:13,542 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_5438_210_1794_1_1527336687_2600009_233-58072-0_sp0.9 from training. Duration: 0.8555625 2023-10-04 12:31:14,879 WARNING [train.py:1204] (2/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_63-101996-0_sp0.9 from training. Duration: 0.97775 2023-10-04 12:31:19,021 WARNING [train.py:1204] (2/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_271-269084-0_sp1.1 from training. Duration: 0.7545625 2023-10-04 12:31:21,877 WARNING [train.py:1204] (2/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_345-167986-0_sp1.1 from training. Duration: 0.763625 2023-10-04 12:31:23,311 WARNING [train.py:1204] (2/4) Exclude cut with ID _39290_210_4220_1_1533862778760_3075118_92-311600-0_sp1.1 from training. Duration: 0.8 2023-10-04 12:31:23,318 WARNING [train.py:1204] (2/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_209-65306-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 12:31:24,655 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_26433_210_12108_1_1532157158_4056480_317-335807-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 12:31:30,746 WARNING [train.py:1204] (2/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_238-94127-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 12:31:30,791 WARNING [train.py:1204] (2/4) Exclude cut with ID _41314_210_12651_1_1533459476543_3766460_207-132038-0_sp1.1 from training. Duration: 0.8545625 2023-10-04 12:31:31,579 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.1.encoder.layers.0.conv_module1.whiten, num_groups=1, num_channels=256, metric=10.52 vs. limit=15.0 2023-10-04 12:31:37,844 WARNING [train.py:1204] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_167-137255-0_sp1.1 from training. Duration: 0.5818125 2023-10-04 12:31:39,443 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_2009_210_2071_1_1525481134_4588937_385-302100-0_sp0.9 from training. Duration: 0.9666875 2023-10-04 12:31:47,666 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_15517_210_6753_1_1530954008_3487336_253-110617-0_sp1.1 from training. Duration: 0.936375 2023-10-04 12:31:51,730 WARNING [train.py:1204] (2/4) Exclude cut with ID _39290_210_4220_1_1533862778760_3075118_92-311600-0_sp0.9 from training. Duration: 0.97775 2023-10-04 12:31:53,053 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_42-142473-0 from training. Duration: 0.72 2023-10-04 12:31:53,081 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_2296_210_2087_1_1525503448_3397335_57-185430-0 from training. Duration: 0.6 2023-10-04 12:31:53,092 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_11305_210_1794_1_1529750428_7178148_257-312870-0_sp0.9 from training. Duration: 0.97775 2023-10-04 12:31:54,452 WARNING [train.py:1204] (2/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_222-198127-0 from training. Duration: 0.84 2023-10-04 12:31:55,894 WARNING [train.py:1204] (2/4) Exclude cut with ID _65263_210_22356_1_1535763598691_3921037_423-202699-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 12:31:55,945 WARNING [train.py:1204] (2/4) Exclude cut with ID _15037_210_2253_1_1530752294104_3267650_13-130361-0 from training. Duration: 0.99 2023-10-04 12:32:01,860 INFO [train.py:1046] (2/4) Epoch 47, batch 3950, loss[loss=0.1569, simple_loss=0.242, pruned_loss=0.0359, over 23362.00 frames. ], tot_loss[loss=0.1523, simple_loss=0.2331, pruned_loss=0.0358, over 4716549.95 frames. ], batch size: 93, lr: 2.15e-03, grad_scale: 16.0 2023-10-04 12:32:03,315 WARNING [train.py:1204] (2/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_628-198080-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 12:32:04,794 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_3171_210_2087_1_1526109084_2330007_37-64334-0 from training. Duration: 0.82 2023-10-04 12:32:04,855 WARNING [train.py:1204] (2/4) Exclude cut with ID _5287_210_2084_1_1527418883040_3878670_12-221186-0_sp1.1 from training. Duration: 0.8909375 2023-10-04 12:32:05,081 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=1655386.6666666667, ans=0.1 2023-10-04 12:32:08,156 WARNING [train.py:1204] (2/4) Exclude cut with ID _6653_210_2629_1_1527919172441_6732679_1103-56445-0_sp1.1 from training. Duration: 0.663625 2023-10-04 12:32:10,883 WARNING [train.py:1204] (2/4) Exclude cut with ID _65263_210_22356_1_1535763598691_3921037_169-140068-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 12:32:15,622 WARNING [train.py:1204] (2/4) Exclude cut with ID IC0671W0118-331961 from training. Duration: 0.896 2023-10-04 12:32:15,687 WARNING [train.py:1204] (2/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_402-102311-0_sp1.1 from training. Duration: 0.6090625 2023-10-04 12:32:17,004 WARNING [train.py:1204] (2/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_202-293523-0 from training. Duration: 0.72 2023-10-04 12:32:17,067 WARNING [train.py:1204] (2/4) Exclude cut with ID IC0679W0038-335846 from training. Duration: 0.768 2023-10-04 12:32:17,098 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_37357_210_17024_1_1533089504_2316121_79-198866-0_sp1.1 from training. Duration: 0.936375 2023-10-04 12:32:20,279 WARNING [train.py:1204] (2/4) Exclude cut with ID _7606_210_4915_1_1528894253277_5080690_292-271285-0_sp1.1 from training. Duration: 0.963625 2023-10-04 12:32:20,298 WARNING [train.py:1204] (2/4) Exclude cut with ID _3541_210_2477_1_1526475408709_3263973_377-296839-0_sp0.9 from training. Duration: 0.611125 2023-10-04 12:32:20,300 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_45886_210_17024_1_1533704917_2435333_58-131069-0_sp1.1 from training. Duration: 0.963625 2023-10-04 12:32:24,305 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6961_210_2087_1_1527937168_6815011_395-185060-0 from training. Duration: 0.95 2023-10-04 12:32:25,717 WARNING [train.py:1204] (2/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_304-266479-0_sp1.1 from training. Duration: 0.97275 2023-10-04 12:32:25,765 WARNING [train.py:1204] (2/4) Exclude cut with ID _3867_210_1883_1_1526641200575_4475830_319-349850-0_sp1.1 from training. Duration: 0.6090625 2023-10-04 12:32:26,973 WARNING [train.py:1204] (2/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_304-59227-0_sp0.9 from training. Duration: 0.8555625 2023-10-04 12:32:27,037 WARNING [train.py:1204] (2/4) Exclude cut with ID _8021_210_7994_1_1528376451358_3653069_100-140231-0_sp0.9 from training. Duration: 0.7444375 2023-10-04 12:32:27,102 WARNING [train.py:1204] (2/4) Exclude cut with ID _8833_210_5219_1_1528592665210_7553059_329-125156-0_sp0.9 from training. Duration: 0.911125 2023-10-04 12:32:28,746 INFO [scaling.py:1118] (2/4) WithLoss: name=encoder.encoders.0.layers.1.self_attn_weights, loss-sum=0.000e+00 2023-10-04 12:32:39,242 WARNING [train.py:1204] (2/4) Exclude cut with ID _14459_210_2228_1_1531443641946_7516700_792-287234-0_sp1.1 from training. Duration: 0.92725 2023-10-04 12:32:39,259 WARNING [train.py:1204] (2/4) Exclude cut with ID _19708_210_9206_1_1532136650733_4238364_347-65240-0_sp1.1 from training. Duration: 0.8818125 2023-10-04 12:32:43,678 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.ff2_skip_rate, batch_count=1655520.0, ans=0.0 2023-10-04 12:32:45,412 WARNING [train.py:1204] (2/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_70-27732-0 from training. Duration: 0.94 2023-10-04 12:32:49,958 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_397-132565-0 from training. Duration: 0.99 2023-10-04 12:32:49,961 WARNING [train.py:1204] (2/4) Exclude cut with ID _49728_210_12651_1_1534041034294_6788150_624-195301-0 from training. Duration: 0.85 2023-10-04 12:32:50,005 WARNING [train.py:1204] (2/4) Exclude cut with ID _55429_210_10626_1_1534467588527_7174143_377-169391-0_sp1.1 from training. Duration: 0.87275 2023-10-04 12:32:51,338 WARNING [train.py:1204] (2/4) Exclude cut with ID _49376_210_5986_1_1533985278732_3370740_354-344695-0_sp1.1 from training. Duration: 0.9 2023-10-04 12:32:51,484 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.self_attn_weights.pos_emb_skip_rate, batch_count=1655586.6666666667, ans=0.0 2023-10-04 12:32:56,915 WARNING [train.py:1204] (2/4) Exclude cut with ID _4697_210_2069_1_1526815801953_3823098_276-96593-0_sp1.1 from training. Duration: 0.7 2023-10-04 12:32:58,332 WARNING [train.py:1204] (2/4) Exclude cut with ID _15012_210_1968_1_1530856820429_5095350_688-178849-0_sp0.9 from training. Duration: 0.711125 2023-10-04 12:32:58,369 WARNING [train.py:1204] (2/4) Exclude cut with ID _25565_210_12392_1_1532423009382_3952098_372-69023-0_sp1.1 from training. Duration: 0.936375 2023-10-04 12:32:58,414 WARNING [train.py:1204] (2/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_61-327062-0_sp1.1 from training. Duration: 0.736375 2023-10-04 12:32:58,452 WARNING [train.py:1204] (2/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_215-141059-0 from training. Duration: 0.62 2023-10-04 12:33:02,607 WARNING [train.py:1204] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_6-226750-0_sp1.1 from training. Duration: 0.8545625 2023-10-04 12:33:04,018 WARNING [train.py:1204] (2/4) Exclude cut with ID _14348_210_8341_1_1530599434996_3778640_240-232370-0_sp1.1 from training. Duration: 0.763625 2023-10-04 12:33:07,401 WARNING [train.py:1204] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_468-189058-0 from training. Duration: 0.96 2023-10-04 12:33:16,702 INFO [train.py:1046] (2/4) Epoch 47, batch 4000, loss[loss=0.1508, simple_loss=0.2421, pruned_loss=0.02973, over 24366.00 frames. ], tot_loss[loss=0.1531, simple_loss=0.2341, pruned_loss=0.03607, over 4723169.21 frames. ], batch size: 74, lr: 2.15e-03, grad_scale: 32.0 2023-10-04 12:33:18,753 WARNING [train.py:1204] (2/4) Exclude cut with ID _21987_210_5854_1_1531821500482_3637038_247-93340-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 12:33:24,292 WARNING [train.py:1204] (2/4) Exclude cut with ID _66868_210_9527_1_1535509287954_4561140_436-191602-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 12:33:25,572 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.625e+02 2.039e+02 2.177e+02 2.497e+02 4.705e+02, threshold=4.355e+02, percent-clipped=2.0 2023-10-04 12:33:28,573 WARNING [train.py:1204] (2/4) Exclude cut with ID _18895_210_6228_1_1531357274234_3784930_394-58203-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 12:33:29,917 WARNING [train.py:1204] (2/4) Exclude cut with ID _58605_210_12210_1_1534723195939_3904259_258-281498-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 12:33:29,960 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_33348_210_5632_1_1533086912_3324030_245-292752-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 12:33:31,290 WARNING [train.py:1204] (2/4) Exclude cut with ID _67831_210_12392_1_1535612464202_3092612_346-2148-0 from training. Duration: 0.93 2023-10-04 12:33:31,365 WARNING [train.py:1204] (2/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_369-222060-0_sp0.9 from training. Duration: 0.788875 2023-10-04 12:33:32,659 WARNING [train.py:1204] (2/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_245-174553-0 from training. Duration: 0.88 2023-10-04 12:33:32,665 WARNING [train.py:1204] (2/4) Exclude cut with ID _3541_210_2477_1_1526475408709_3263973_231-104736-0_sp1.1 from training. Duration: 0.7090625 2023-10-04 12:33:32,671 WARNING [train.py:1204] (2/4) Exclude cut with ID _44585_210_18628_1_1533605350387_4518199_280-73534-0 from training. Duration: 0.89 2023-10-04 12:33:32,922 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.bypass.skip_rate, batch_count=1655786.6666666667, ans=0.07 2023-10-04 12:33:34,134 WARNING [train.py:1204] (2/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_22-201476-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 12:33:34,334 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.conv_module1.balancer1.prob, batch_count=1655786.6666666667, ans=0.125 2023-10-04 12:33:37,528 WARNING [train.py:1204] (2/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_325-62802-0_sp0.9 from training. Duration: 0.9666875 2023-10-04 12:33:37,540 WARNING [train.py:1204] (2/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_199-274062-0_sp1.1 from training. Duration: 0.97275 2023-10-04 12:33:37,543 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_8-319957-0_sp1.1 from training. Duration: 0.87275 2023-10-04 12:33:37,572 WARNING [train.py:1204] (2/4) Exclude cut with ID _29343_210_4381_1_1532602757752_3674109_224-349726-0_sp1.1 from training. Duration: 0.936375 2023-10-04 12:33:37,577 WARNING [train.py:1204] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_520-126729-0_sp1.1 from training. Duration: 0.636375 2023-10-04 12:33:40,859 WARNING [train.py:1204] (2/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_239-22076-0_sp1.1 from training. Duration: 0.77275 2023-10-04 12:33:42,201 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9012W0011-501146 from training. Duration: 0.896 2023-10-04 12:33:42,274 WARNING [train.py:1204] (2/4) Exclude cut with ID _29857_210_20034_1_1532395886753_3798160_279-185801-0_sp1.1 from training. Duration: 0.82725 2023-10-04 12:33:43,630 WARNING [train.py:1204] (2/4) Exclude cut with ID _43552_210_8913_1_1533726031059_3523429_261-223933-0_sp1.1 from training. Duration: 0.92725 2023-10-04 12:33:46,960 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9029W0025-509426 from training. Duration: 0.896 2023-10-04 12:33:47,035 WARNING [train.py:1204] (2/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_337-181502-0_sp1.1 from training. Duration: 0.5909375 2023-10-04 12:33:48,254 WARNING [train.py:1204] (2/4) Exclude cut with ID _68635_210_9317_1_1535770813410_4315870_159-332599-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 12:33:54,312 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_161-276996-0 from training. Duration: 0.95 2023-10-04 12:33:55,496 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7088_210_1874_1_1527939776_1101113_127-68870-0_sp1.1 from training. Duration: 0.936375 2023-10-04 12:33:55,818 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.bypass.scale_min, batch_count=1655853.3333333333, ans=0.2 2023-10-04 12:33:56,998 WARNING [train.py:1204] (2/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_322-217220-0_sp1.1 from training. Duration: 0.7818125 2023-10-04 12:33:58,318 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9072W0002-530636 from training. Duration: 0.768 2023-10-04 12:33:59,687 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_340-336408-0_sp1.1 from training. Duration: 0.8454375 2023-10-04 12:34:01,034 WARNING [train.py:1204] (2/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_395-37222-0 from training. Duration: 0.76 2023-10-04 12:34:01,044 WARNING [train.py:1204] (2/4) Exclude cut with ID _64099_210_17301_1_1535526977656_1578188_159-209156-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 12:34:02,287 WARNING [train.py:1204] (2/4) Exclude cut with ID _53950_210_5816_1_1534417264919_3862951_28-29681-0_sp1.1 from training. Duration: 0.92725 2023-10-04 12:34:03,619 WARNING [train.py:1204] (2/4) Exclude cut with ID _4681_210_3525_1_1527250689464_120889_15-126760-0_sp1.1 from training. Duration: 0.736375 2023-10-04 12:34:05,473 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.5.encoder.layers.1.nonlin_attention.whiten2, num_groups=1, num_channels=256, metric=9.54 vs. limit=15.0 2023-10-04 12:34:06,375 WARNING [train.py:1204] (2/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_242-3472-0_sp1.1 from training. Duration: 0.663625 2023-10-04 12:34:06,419 WARNING [train.py:1204] (2/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_436-186311-0_sp0.9 from training. Duration: 0.72225 2023-10-04 12:34:07,663 WARNING [train.py:1204] (2/4) Exclude cut with ID _3675_210_4557_1_1526639934408_4182089_452-172147-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 12:34:10,713 WARNING [train.py:1204] (2/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_80-147301-0 from training. Duration: 0.84 2023-10-04 12:34:10,737 WARNING [train.py:1204] (2/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_351-107588-0_sp1.1 from training. Duration: 0.92725 2023-10-04 12:34:10,855 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9111W0002-550781 from training. Duration: 0.896 2023-10-04 12:34:15,696 WARNING [train.py:1204] (2/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_465-156500-0_sp1.1 from training. Duration: 0.6545625 2023-10-04 12:34:18,584 WARNING [train.py:1204] (2/4) Exclude cut with ID _13938_210_9988_1_1530518718978_3693960_304-112784-0_sp1.1 from training. Duration: 0.4181875 2023-10-04 12:34:19,978 WARNING [train.py:1204] (2/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_202-67017-0_sp1.1 from training. Duration: 0.7454375 2023-10-04 12:34:21,872 WARNING [train.py:1204] (2/4) Exclude cut with ID _58414_210_8254_1_1535421609162_7151182_448-4358-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 12:34:22,547 WARNING [train.py:1204] (2/4) Exclude cut with ID _66840_210_5219_1_1535452168426_4196349_203-243234-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 12:34:23,698 WARNING [train.py:1204] (2/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_201-213513-0_sp1.1 from training. Duration: 0.97275 2023-10-04 12:34:23,957 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.nonlin_attention.balancer.prob, batch_count=1655986.6666666667, ans=0.125 2023-10-04 12:34:27,993 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6291_210_3273_1_1527683649_1078498_117-187560-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 12:34:30,569 INFO [train.py:1046] (2/4) Epoch 47, batch 4050, loss[loss=0.2012, simple_loss=0.2734, pruned_loss=0.06454, over 19475.00 frames. ], tot_loss[loss=0.1534, simple_loss=0.2346, pruned_loss=0.0361, over 4731227.70 frames. ], batch size: 388, lr: 2.15e-03, grad_scale: 8.0 2023-10-04 12:34:30,675 WARNING [train.py:1204] (2/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_135-174080-0_sp0.9 from training. Duration: 0.588875 2023-10-04 12:34:31,980 WARNING [train.py:1204] (2/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_45-265684-0 from training. Duration: 0.72 2023-10-04 12:34:33,508 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_435-313305-0_sp1.1 from training. Duration: 0.6454375 2023-10-04 12:34:34,747 WARNING [train.py:1204] (2/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_235-307337-0_sp1.1 from training. Duration: 0.8545625 2023-10-04 12:34:36,121 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7762_210_1794_1_1528199508_4057408_419-125009-0_sp0.9 from training. Duration: 0.92225 2023-10-04 12:34:36,206 WARNING [train.py:1204] (2/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_215-141059-0_sp1.1 from training. Duration: 0.563625 2023-10-04 12:34:37,620 WARNING [train.py:1204] (2/4) Exclude cut with ID _27013_210_1389_1_1532437173240_3252649_222-125573-0_sp1.1 from training. Duration: 0.8909375 2023-10-04 12:34:40,857 WARNING [train.py:1204] (2/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_126-274694-0_sp1.1 from training. Duration: 0.8909375 2023-10-04 12:34:43,561 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_2592_210_2290_1_1525697025_4882925_157-70512-0_sp1.1 from training. Duration: 0.82725 2023-10-04 12:34:44,872 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_292-265928-0_sp0.9 from training. Duration: 0.5666875 2023-10-04 12:34:46,760 WARNING [train.py:1204] (2/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_1211-226066-0_sp1.1 from training. Duration: 0.6909375 2023-10-04 12:34:46,818 WARNING [train.py:1204] (2/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_394-265747-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 12:34:50,869 WARNING [train.py:1204] (2/4) Exclude cut with ID _48782_210_12658_1_1534467701544_2300422_239-67139-0_sp1.1 from training. Duration: 0.97275 2023-10-04 12:34:51,005 WARNING [train.py:1204] (2/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_329-113173-0_sp1.1 from training. Duration: 0.563625 2023-10-04 12:34:54,707 WARNING [train.py:1204] (2/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_81-75849-0_sp0.9 from training. Duration: 0.4333125 2023-10-04 12:34:56,123 WARNING [train.py:1204] (2/4) Exclude cut with ID _9235_210_5219_1_1528891154253_8046120_1003-101638-0 from training. Duration: 0.73 2023-10-04 12:34:56,146 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9309W0189-640316 from training. Duration: 0.64 2023-10-04 12:34:58,688 WARNING [train.py:1204] (2/4) Exclude cut with ID _45329_210_7005_1_1534157951211_6774339_503-287883-0_sp1.1 from training. Duration: 0.8 2023-10-04 12:35:00,331 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.conv_module1.balancer1.min_positive, batch_count=1656186.6666666667, ans=0.025 2023-10-04 12:35:05,486 WARNING [train.py:1204] (2/4) Exclude cut with ID _8833_210_5219_1_1528592665210_7553059_611-80313-0 from training. Duration: 0.64 2023-10-04 12:35:06,830 WARNING [train.py:1204] (2/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_335-106626-0_sp1.1 from training. Duration: 0.963625 2023-10-04 12:35:10,206 WARNING [train.py:1204] (2/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_237-44332-0_sp1.1 from training. Duration: 0.8545625 2023-10-04 12:35:14,373 WARNING [train.py:1204] (2/4) Exclude cut with ID _46952_210_5986_1_1534230010357_3107379_178-147341-0_sp1.1 from training. Duration: 0.97275 2023-10-04 12:35:15,651 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_13091_210_7925_1_1530609447_8987354_946-154426-0_sp1.1 from training. Duration: 0.936375 2023-10-04 12:35:15,681 WARNING [train.py:1204] (2/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_326-17061-0_sp1.1 from training. Duration: 0.8545625 2023-10-04 12:35:19,044 WARNING [train.py:1204] (2/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_38-169920-0_sp1.1 from training. Duration: 0.82725 2023-10-04 12:35:19,444 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.balancer1.prob, batch_count=1656253.3333333333, ans=0.125 2023-10-04 12:35:20,802 INFO [scaling.py:1118] (2/4) WithLoss: name=encoder.encoders.5.encoder.layers.0.self_attn_weights, loss-sum=0.000e+00 2023-10-04 12:35:21,910 WARNING [train.py:1204] (2/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_283-220372-0 from training. Duration: 0.56 2023-10-04 12:35:21,925 WARNING [train.py:1204] (2/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_252-216674-0_sp1.1 from training. Duration: 0.4909375 2023-10-04 12:35:22,445 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.3.encoder.layers.2.feed_forward3.out_whiten, num_groups=1, num_channels=512, metric=7.67 vs. limit=15.0 2023-10-04 12:35:25,262 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_202-91290-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 12:35:26,497 WARNING [train.py:1204] (2/4) Exclude cut with ID _41314_210_12651_1_1533459476543_3766460_80-74184-0 from training. Duration: 0.83 2023-10-04 12:35:30,710 WARNING [train.py:1204] (2/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_411-271358-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 12:35:35,132 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.feed_forward2.hidden_balancer.prob, batch_count=1656320.0, ans=0.125 2023-10-04 12:35:36,226 WARNING [train.py:1204] (2/4) Exclude cut with ID _68777_210_4353_1_1535810390731_3117904_129-166012-0 from training. Duration: 0.85 2023-10-04 12:35:38,786 WARNING [train.py:1204] (2/4) Exclude cut with ID _44982_210_1511_1_1534312820108_3726029_410-109759-0_sp1.1 from training. Duration: 0.963625 2023-10-04 12:35:38,790 WARNING [train.py:1204] (2/4) Exclude cut with ID _14224_210_4381_1_1530529062037_4264010_364-22817-0_sp1.1 from training. Duration: 0.6454375 2023-10-04 12:35:40,303 WARNING [train.py:1204] (2/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_89-242335-0 from training. Duration: 0.95 2023-10-04 12:35:40,318 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_12868_210_6753_1_1530702479_3610564_151-325626-0 from training. Duration: 0.97 2023-10-04 12:35:40,319 WARNING [train.py:1204] (2/4) Exclude cut with ID _6275_210_2084_1_1528110062213_7669049_786-264332-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 12:35:40,514 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.out_combiner.scale_min, batch_count=1656320.0, ans=0.2 2023-10-04 12:35:41,800 WARNING [train.py:1204] (2/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_35-153886-0_sp1.1 from training. Duration: 0.763625 2023-10-04 12:35:43,426 INFO [train.py:1046] (2/4) Epoch 47, batch 4100, loss[loss=0.1435, simple_loss=0.2232, pruned_loss=0.03191, over 24311.00 frames. ], tot_loss[loss=0.154, simple_loss=0.2353, pruned_loss=0.03633, over 4716217.00 frames. ], batch size: 56, lr: 2.15e-03, grad_scale: 8.0 2023-10-04 12:35:43,501 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_24007_210_1794_1_1531918925_1663875_116-100916-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 12:35:43,517 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_162-262717-0_sp1.1 from training. Duration: 0.7545625 2023-10-04 12:35:47,391 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.1.encoder.layers.0.self_attn1.whiten, num_groups=1, num_channels=256, metric=12.96 vs. limit=22.5 2023-10-04 12:35:47,847 WARNING [train.py:1204] (2/4) Exclude cut with ID _8263_210_1664_1_1528438953494_294100_40-204731-0 from training. Duration: 0.5 2023-10-04 12:35:49,656 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_12868_210_6753_1_1530702479_3610564_416-293746-0 from training. Duration: 0.9 2023-10-04 12:35:50,941 WARNING [train.py:1204] (2/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_605-219732-0 from training. Duration: 0.94 2023-10-04 12:35:53,000 WARNING [train.py:1204] (2/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_454-123002-0 from training. Duration: 0.69 2023-10-04 12:35:53,006 WARNING [train.py:1204] (2/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_25-304273-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 12:35:53,061 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6860_210_4852_1_1527917457_3833327_451-39702-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 12:35:54,314 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_304-16505-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 12:35:54,334 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6291_210_3273_1_1527683649_1078498_4-266348-0_sp1.1 from training. Duration: 0.7090625 2023-10-04 12:35:55,639 WARNING [train.py:1204] (2/4) Exclude cut with ID ID0149W0001-753701 from training. Duration: 0.989 2023-10-04 12:35:56,922 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.647e+02 2.115e+02 2.367e+02 2.901e+02 4.348e+02, threshold=4.733e+02, percent-clipped=0.0 2023-10-04 12:35:57,130 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_41251_210_15368_1_1533429767_4820695_394-55393-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 12:35:58,392 WARNING [train.py:1204] (2/4) Exclude cut with ID _8793_210_6310_1_1528542985139_2310009_121-179782-0_sp1.1 from training. Duration: 0.72725 2023-10-04 12:35:58,413 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_2729_210_3528_1_1525863505_3882214_421-73047-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 12:35:58,469 WARNING [train.py:1204] (2/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_278-154492-0_sp0.9 from training. Duration: 0.8444375 2023-10-04 12:36:01,975 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.0.layers.1.self_attn2.whiten, num_groups=1, num_channels=192, metric=10.77 vs. limit=22.5 2023-10-04 12:36:03,900 WARNING [train.py:1204] (2/4) Exclude cut with ID _14170_210_5219_1_1530525949868_4469650_412-317077-0_sp0.9 from training. Duration: 0.8333125 2023-10-04 12:36:05,228 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_727-138317-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 12:36:06,528 WARNING [train.py:1204] (2/4) Exclude cut with ID _25808_210_13292_1_1532343514562_7366109_345-335048-0_sp1.1 from training. Duration: 0.92725 2023-10-04 12:36:06,550 WARNING [train.py:1204] (2/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_506-23200-0 from training. Duration: 0.97 2023-10-04 12:36:07,902 WARNING [train.py:1204] (2/4) Exclude cut with ID _44718_210_5816_1_1534050051869_6469030_643-245187-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 12:36:07,906 WARNING [train.py:1204] (2/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_42-186162-0_sp1.1 from training. Duration: 0.836375 2023-10-04 12:36:07,925 WARNING [train.py:1204] (2/4) Exclude cut with ID _3231_210_2106_1_1526205864934_6784220_171-256004-0_sp1.1 from training. Duration: 0.7818125 2023-10-04 12:36:07,950 WARNING [train.py:1204] (2/4) Exclude cut with ID _25914_210_6400_1_1532141317058_644740_43-248478-0_sp1.1 from training. Duration: 0.936375 2023-10-04 12:36:07,997 WARNING [train.py:1204] (2/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_577-287150-0 from training. Duration: 0.82 2023-10-04 12:36:10,870 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_46015_210_7234_1_1533863319_3087837_376-276068-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 12:36:14,097 WARNING [train.py:1204] (2/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_51-200220-0 from training. Duration: 0.56 2023-10-04 12:36:14,186 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_452-96101-0_sp1.1 from training. Duration: 0.963625 2023-10-04 12:36:16,981 WARNING [train.py:1204] (2/4) Exclude cut with ID _13252_210_1925_1_1530671372047_3336909_263-168388-0_sp1.1 from training. Duration: 0.7818125 2023-10-04 12:36:16,982 WARNING [train.py:1204] (2/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_469-94673-0 from training. Duration: 0.79 2023-10-04 12:36:18,254 WARNING [train.py:1204] (2/4) Exclude cut with ID _14922_210_6228_1_1530707435273_3693310_303-228738-0_sp1.1 from training. Duration: 0.763625 2023-10-04 12:36:19,499 WARNING [train.py:1204] (2/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_262-250826-0_sp1.1 from training. Duration: 0.9 2023-10-04 12:36:19,520 WARNING [train.py:1204] (2/4) Exclude cut with ID _68777_210_4353_1_1535810390731_3117904_129-166012-0_sp1.1 from training. Duration: 0.77275 2023-10-04 12:36:21,505 WARNING [train.py:1204] (2/4) Exclude cut with ID _58895_210_8750_1_1535184081320_3649089_265-323810-0 from training. Duration: 0.96 2023-10-04 12:36:22,903 WARNING [train.py:1204] (2/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_120-76843-0_sp0.9 from training. Duration: 0.611125 2023-10-04 12:36:24,759 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7544_210_6753_1_1528587722_3576686_295-258479-0_sp0.9 from training. Duration: 0.9333125 2023-10-04 12:36:26,223 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7046_210_6753_1_1527895337_6920917_783-278109-0 from training. Duration: 0.77 2023-10-04 12:36:26,282 WARNING [train.py:1204] (2/4) Exclude cut with ID _42326_210_7770_1_1533814282531_3707269_113-308323-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 12:36:27,614 WARNING [train.py:1204] (2/4) Exclude cut with ID _7852_210_5219_1_1528286471316_8139400_242-105336-0_sp0.9 from training. Duration: 0.8 2023-10-04 12:36:30,344 WARNING [train.py:1204] (2/4) Exclude cut with ID _56235_210_10640_1_1534557335235_4161099_114-277905-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 12:36:35,874 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_13118_210_7925_1_1530755612_7608297_42-66683-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 12:36:39,855 WARNING [train.py:1204] (2/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_901-79438-0_sp1.1 from training. Duration: 0.8545625 2023-10-04 12:36:39,914 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_30280_210_1881_1_1532872779_3660139_211-182194-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 12:36:47,315 WARNING [train.py:1204] (2/4) Exclude cut with ID _14898_210_9206_1_1530766812377_4177190_621-45732-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 12:36:47,323 WARNING [train.py:1204] (2/4) Exclude cut with ID _35350_210_16040_1_1533040191183_4075299_224-133775-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 12:36:51,792 WARNING [train.py:1204] (2/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_147-293311-0_sp1.1 from training. Duration: 0.8545625 2023-10-04 12:36:53,344 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=1656653.3333333333, ans=0.1 2023-10-04 12:36:54,414 WARNING [train.py:1204] (2/4) Exclude cut with ID _12302_210_5219_1_1529924446466_8107280_835-279754-0_sp1.1 from training. Duration: 0.8181875 2023-10-04 12:36:56,202 INFO [train.py:1046] (2/4) Epoch 47, batch 4150, loss[loss=0.1476, simple_loss=0.2241, pruned_loss=0.03558, over 24326.00 frames. ], tot_loss[loss=0.1549, simple_loss=0.236, pruned_loss=0.03684, over 4714084.18 frames. ], batch size: 56, lr: 2.15e-03, grad_scale: 4.0 2023-10-04 12:36:58,424 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8453_210_4211_1_1528502940_8562730_347-330673-0_sp0.9 from training. Duration: 0.8 2023-10-04 12:36:58,711 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.feed_forward3.hidden_balancer.prob, batch_count=1656720.0, ans=0.125 2023-10-04 12:36:59,876 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_213-103282-0_sp0.9 from training. Duration: 0.8666875 2023-10-04 12:37:01,223 WARNING [train.py:1204] (2/4) Exclude cut with ID _49728_210_12651_1_1534041034294_6788150_66-43496-0_sp1.1 from training. Duration: 0.97275 2023-10-04 12:37:01,225 WARNING [train.py:1204] (2/4) Exclude cut with ID _19023_210_4353_1_1531378892552_5820780_28-36615-0_sp1.1 from training. Duration: 0.8818125 2023-10-04 12:37:04,062 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.feed_forward1.out_proj.dropout_p, batch_count=1656720.0, ans=0.1 2023-10-04 12:37:05,126 WARNING [train.py:1204] (2/4) Exclude cut with ID _14349_210_8341_1_1530615710818_3442470_115-285975-0 from training. Duration: 0.86 2023-10-04 12:37:05,146 WARNING [train.py:1204] (2/4) Exclude cut with ID _12734_210_8341_1_1530183614296_3891929_236-47464-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 12:37:05,204 WARNING [train.py:1204] (2/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_177-236809-0 from training. Duration: 0.49 2023-10-04 12:37:05,387 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.attention_skip_rate, batch_count=1656720.0, ans=0.0 2023-10-04 12:37:06,623 WARNING [train.py:1204] (2/4) Exclude cut with ID _13678_210_7497_1_1530530069226_3463170_371-3221-0 from training. Duration: 0.9 2023-10-04 12:37:06,638 WARNING [train.py:1204] (2/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_48-338976-0 from training. Duration: 0.89 2023-10-04 12:37:06,753 WARNING [train.py:1204] (2/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_1167-309744-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 12:37:11,032 WARNING [train.py:1204] (2/4) Exclude cut with ID _30327_210_16049_1_1532484161295_3924169_401-70819-0_sp0.9 from training. Duration: 0.9555625 2023-10-04 12:37:11,051 WARNING [train.py:1204] (2/4) Exclude cut with ID _55134_210_21982_1_1534590028377_3882170_346-91022-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 12:37:15,603 WARNING [train.py:1204] (2/4) Exclude cut with ID _14459_210_2228_1_1531443641946_7516700_174-118801-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 12:37:15,664 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_269-278006-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 12:37:16,970 WARNING [train.py:1204] (2/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_390-339137-0_sp0.9 from training. Duration: 0.62225 2023-10-04 12:37:18,396 WARNING [train.py:1204] (2/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_253-308377-0_sp1.1 from training. Duration: 0.4454375 2023-10-04 12:37:18,409 WARNING [train.py:1204] (2/4) Exclude cut with ID _23825_210_13449_1_1531915313526_3497150_240-271367-0_sp1.1 from training. Duration: 0.8818125 2023-10-04 12:37:18,571 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.1.conv_skip_rate, batch_count=1656786.6666666667, ans=0.0 2023-10-04 12:37:20,426 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1015-343884-0_sp0.9 from training. Duration: 0.688875 2023-10-04 12:37:24,343 WARNING [train.py:1204] (2/4) Exclude cut with ID _23514_210_1389_1_1531832139785_3031439_335-48210-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 12:37:29,097 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_309-65177-0_sp1.1 from training. Duration: 0.8 2023-10-04 12:37:29,166 WARNING [train.py:1204] (2/4) Exclude cut with ID _14368_210_4220_1_1530532714870_3595529_231-137892-0 from training. Duration: 0.99 2023-10-04 12:37:30,643 WARNING [train.py:1204] (2/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_57-270037-0 from training. Duration: 0.77 2023-10-04 12:37:31,895 WARNING [train.py:1204] (2/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_465-214726-0_sp1.1 from training. Duration: 0.7818125 2023-10-04 12:37:32,009 WARNING [train.py:1204] (2/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_106-300985-0 from training. Duration: 0.9 2023-10-04 12:37:32,022 WARNING [train.py:1204] (2/4) Exclude cut with ID _14632_210_9988_1_1530691279392_3923200_161-129530-0_sp1.1 from training. Duration: 0.87275 2023-10-04 12:37:32,041 WARNING [train.py:1204] (2/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_514-88370-0_sp1.1 from training. Duration: 0.963625 2023-10-04 12:37:32,249 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.ff3_skip_rate, batch_count=1656853.3333333333, ans=0.0 2023-10-04 12:37:36,056 WARNING [train.py:1204] (2/4) Exclude cut with ID _56495_210_4234_1_1534748080334_4136219_305-334505-0_sp1.1 from training. Duration: 0.92725 2023-10-04 12:37:36,875 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.1.encoder.layers.0.nonlin_attention.whiten2, num_groups=1, num_channels=256, metric=9.71 vs. limit=15.0 2023-10-04 12:37:37,356 WARNING [train.py:1204] (2/4) Exclude cut with ID _42875_210_14856_1_1533704397487_4012433_61-264914-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 12:37:40,241 WARNING [train.py:1204] (2/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_91-148831-0 from training. Duration: 0.99 2023-10-04 12:37:42,991 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_435-313305-0_sp0.9 from training. Duration: 0.788875 2023-10-04 12:37:44,369 WARNING [train.py:1204] (2/4) Exclude cut with ID _16744_210_5986_1_1531724405384_3907567_242-111316-0_sp1.1 from training. Duration: 0.8454375 2023-10-04 12:37:44,597 INFO [scaling.py:1118] (2/4) WithLoss: name=encoder.encoders.3.encoder.layers.1.self_attn_weights, loss-sum=0.000e+00 2023-10-04 12:37:46,300 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_257-20733-0 from training. Duration: 0.76 2023-10-04 12:37:46,347 WARNING [train.py:1204] (2/4) Exclude cut with ID _8064_210_5399_1_1528372299291_4282973_4-91037-0_sp1.1 from training. Duration: 0.8 2023-10-04 12:37:47,733 WARNING [train.py:1204] (2/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_3-276701-0 from training. Duration: 0.91 2023-10-04 12:37:49,118 WARNING [train.py:1204] (2/4) Exclude cut with ID _3992_210_1747_1_1526644816031_3731060_36-147855-0_sp1.1 from training. Duration: 0.7090625 2023-10-04 12:37:52,095 WARNING [train.py:1204] (2/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_1296-298671-0_sp1.1 from training. Duration: 0.963625 2023-10-04 12:37:53,405 WARNING [train.py:1204] (2/4) Exclude cut with ID _68965_210_1883_1_1535698962272_3497490_309-334593-0_sp1.1 from training. Duration: 0.92725 2023-10-04 12:37:54,803 WARNING [train.py:1204] (2/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_128-296581-0 from training. Duration: 0.74 2023-10-04 12:37:54,803 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_17613_210_2151_1_1531137569_2366160_170-299079-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 12:37:54,805 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6287_210_3273_1_1527509506_2653716_182-199887-0_sp1.1 from training. Duration: 0.463625 2023-10-04 12:37:55,488 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.4.encoder.layers.2.feed_forward2.out_whiten, num_groups=1, num_channels=384, metric=8.18 vs. limit=15.0 2023-10-04 12:37:57,216 WARNING [train.py:1204] (2/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_80-135200-0_sp1.1 from training. Duration: 0.7181875 2023-10-04 12:37:59,861 WARNING [train.py:1204] (2/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_34-183685-0 from training. Duration: 0.98 2023-10-04 12:37:59,884 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8278_210_2550_1_1528637099_4220413_418-210155-0_sp1.1 from training. Duration: 0.92725 2023-10-04 12:37:59,888 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8853_210_3273_1_1528633899_818968_51-187685-0_sp0.9 from training. Duration: 0.7666875 2023-10-04 12:37:59,904 WARNING [train.py:1204] (2/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_332-221057-0_sp1.1 from training. Duration: 0.6181875 2023-10-04 12:38:00,078 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.feed_forward1.hidden_balancer.prob, batch_count=1656986.6666666667, ans=0.125 2023-10-04 12:38:01,249 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_2221_210_1988_1_1525597051_4935541_415-296882-0 from training. Duration: 0.77 2023-10-04 12:38:01,283 WARNING [train.py:1204] (2/4) Exclude cut with ID _33640_210_2655_1_1533117605479_4855469_770-278185-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 12:38:02,602 WARNING [train.py:1204] (2/4) Exclude cut with ID _5287_210_2084_1_1527418883040_3878670_333-48508-0_sp1.1 from training. Duration: 0.5454375 2023-10-04 12:38:02,673 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_43802_210_2290_1_1533516203_8829504_142-276839-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 12:38:04,051 WARNING [train.py:1204] (2/4) Exclude cut with ID _28394_210_8913_1_1532430049518_3650118_126-139371-0_sp1.1 from training. Duration: 0.92725 2023-10-04 12:38:04,066 WARNING [train.py:1204] (2/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_76-203867-0 from training. Duration: 0.72 2023-10-04 12:38:04,117 WARNING [train.py:1204] (2/4) Exclude cut with ID _6194_210_1534_1_1527511826160_3941194_14-282876-0_sp0.9 from training. Duration: 0.788875 2023-10-04 12:38:08,302 WARNING [train.py:1204] (2/4) Exclude cut with ID _35355_210_16040_1_1533212988977_4016229_74-190076-0_sp1.1 from training. Duration: 0.72725 2023-10-04 12:38:09,619 INFO [train.py:1046] (2/4) Epoch 47, batch 4200, loss[loss=0.1313, simple_loss=0.2182, pruned_loss=0.02222, over 24309.00 frames. ], tot_loss[loss=0.1543, simple_loss=0.2351, pruned_loss=0.03671, over 4703413.67 frames. ], batch size: 61, lr: 2.15e-03, grad_scale: 8.0 2023-10-04 12:38:09,768 WARNING [train.py:1204] (2/4) Exclude cut with ID _33920_210_4220_1_1533257851500_3084399_198-334063-0 from training. Duration: 0.94 2023-10-04 12:38:11,153 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6291_210_3273_1_1527683649_1078498_4-266348-0_sp0.9 from training. Duration: 0.8666875 2023-10-04 12:38:12,529 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_23856_210_4238_1_1531994785_3087934_122-38075-0_sp1.1 from training. Duration: 0.936375 2023-10-04 12:38:13,993 WARNING [train.py:1204] (2/4) Exclude cut with ID _4697_210_2069_1_1526815801953_3823098_276-96593-0_sp0.9 from training. Duration: 0.8555625 2023-10-04 12:38:14,057 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_308-278734-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 12:38:14,058 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_5190_210_4557_1_1527245078_2965646_353-250603-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 12:38:18,620 WARNING [train.py:1204] (2/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_472-216663-0 from training. Duration: 0.92 2023-10-04 12:38:20,507 WARNING [train.py:1204] (2/4) Exclude cut with ID _7537_210_2084_1_1528502395431_3924359_14-312906-0 from training. Duration: 0.84 2023-10-04 12:38:20,559 WARNING [train.py:1204] (2/4) Exclude cut with ID _29003_210_19596_1_1532507391720_3894098_559-236660-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 12:38:23,200 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.708e+02 2.120e+02 2.356e+02 2.706e+02 4.391e+02, threshold=4.711e+02, percent-clipped=0.0 2023-10-04 12:38:24,045 WARNING [train.py:1204] (2/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_312-204798-0_sp1.1 from training. Duration: 0.8454375 2023-10-04 12:38:27,037 WARNING [train.py:1204] (2/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_17-309070-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 12:38:28,506 WARNING [train.py:1204] (2/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_344-152526-0_sp0.9 from training. Duration: 0.588875 2023-10-04 12:38:29,911 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_41452_210_7291_1_1533437687_3941281_405-11559-0_sp1.1 from training. Duration: 0.82725 2023-10-04 12:38:31,204 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_48730_210_5399_1_1533885886_2212769_157-124052-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 12:38:31,257 WARNING [train.py:1204] (2/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_415-78792-0 from training. Duration: 0.81 2023-10-04 12:38:31,261 WARNING [train.py:1204] (2/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_345-340690-0_sp1.1 from training. Duration: 0.8454375 2023-10-04 12:38:34,035 WARNING [train.py:1204] (2/4) Exclude cut with ID _23825_210_13449_1_1531915313526_3497150_124-8138-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 12:38:34,081 WARNING [train.py:1204] (2/4) Exclude cut with ID _49258_210_7994_1_1534122017863_3738861_194-24864-0_sp1.1 from training. Duration: 0.936375 2023-10-04 12:38:34,115 WARNING [train.py:1204] (2/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_369-222060-0_sp1.1 from training. Duration: 0.6454375 2023-10-04 12:38:36,750 WARNING [train.py:1204] (2/4) Exclude cut with ID _12369_210_4381_1_1530356078581_4388520_480-13146-0_sp0.9 from training. Duration: 0.9444375 2023-10-04 12:38:38,153 WARNING [train.py:1204] (2/4) Exclude cut with ID _13253_210_1925_1_1530757775819_3577873_191-144977-0 from training. Duration: 0.78 2023-10-04 12:38:38,178 WARNING [train.py:1204] (2/4) Exclude cut with ID _30304_210_6319_1_1532476827261_7582060_1128-140266-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 12:38:42,409 WARNING [train.py:1204] (2/4) Exclude cut with ID _8772_210_1850_1_1528635595972_3664011_247-336448-0_sp0.9 from training. Duration: 0.82225 2023-10-04 12:38:43,813 WARNING [train.py:1204] (2/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_318-59097-0_sp1.1 from training. Duration: 0.6909375 2023-10-04 12:38:46,697 WARNING [train.py:1204] (2/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_540-131164-0_sp1.1 from training. Duration: 0.836375 2023-10-04 12:38:46,812 WARNING [train.py:1204] (2/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_39-110836-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 12:38:47,003 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.conv_skip_rate, batch_count=1657186.6666666667, ans=0.0 2023-10-04 12:38:48,959 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.bypass.scale_min, batch_count=1657186.6666666667, ans=0.2 2023-10-04 12:38:50,514 WARNING [train.py:1204] (2/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_860-96198-0_sp1.1 from training. Duration: 0.663625 2023-10-04 12:38:50,529 WARNING [train.py:1204] (2/4) Exclude cut with ID _15133_210_6228_1_1530794512074_3080379_29-264458-0 from training. Duration: 0.58 2023-10-04 12:38:50,548 WARNING [train.py:1204] (2/4) Exclude cut with ID _55209_210_1531_1_1534675828996_4597160_797-348343-0_sp1.1 from training. Duration: 0.963625 2023-10-04 12:38:51,907 WARNING [train.py:1204] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_23-189234-0_sp1.1 from training. Duration: 0.7545625 2023-10-04 12:38:55,117 WARNING [train.py:1204] (2/4) Exclude cut with ID _5410_210_1388_1_1527397055795_1719020_39-112221-0_sp1.1 from training. Duration: 0.62725 2023-10-04 12:38:56,487 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_43802_210_2290_1_1533516203_8829504_100-348441-0_sp1.1 from training. Duration: 0.82725 2023-10-04 12:38:56,660 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.feed_forward1.out_proj.dropout_p, batch_count=1657253.3333333333, ans=0.1 2023-10-04 12:39:02,455 WARNING [train.py:1204] (2/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_143-86344-0_sp1.1 from training. Duration: 0.7 2023-10-04 12:39:05,166 WARNING [train.py:1204] (2/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_126-29104-0 from training. Duration: 0.85 2023-10-04 12:39:06,574 WARNING [train.py:1204] (2/4) Exclude cut with ID _39846_210_12651_1_1533603651992_1318030_138-31771-0_sp1.1 from training. Duration: 0.963625 2023-10-04 12:39:10,824 WARNING [train.py:1204] (2/4) Exclude cut with ID _2220_210_1819_1_1525509109898_3658680_50-99837-0_sp1.1 from training. Duration: 0.4909375 2023-10-04 12:39:10,904 WARNING [train.py:1204] (2/4) Exclude cut with ID _6274_210_2084_1_1527937583837_7494020_836-210550-0_sp1.1 from training. Duration: 0.97275 2023-10-04 12:39:12,460 WARNING [train.py:1204] (2/4) Exclude cut with ID _28323_210_16187_1_1532480395667_4100300_306-74561-0 from training. Duration: 0.94 2023-10-04 12:39:18,875 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_177-58005-0_sp1.1 from training. Duration: 0.563625 2023-10-04 12:39:23,479 WARNING [train.py:1204] (2/4) Exclude cut with ID _54871_210_22356_1_1534665798253_3856359_19-342632-0_sp1.1 from training. Duration: 0.82725 2023-10-04 12:39:24,713 INFO [train.py:1046] (2/4) Epoch 47, batch 4250, loss[loss=0.1431, simple_loss=0.2225, pruned_loss=0.03184, over 23438.00 frames. ], tot_loss[loss=0.1531, simple_loss=0.2336, pruned_loss=0.03628, over 4708926.64 frames. ], batch size: 119, lr: 2.15e-03, grad_scale: 8.0 2023-10-04 12:39:24,749 WARNING [train.py:1204] (2/4) Exclude cut with ID _35355_210_16040_1_1533212988977_4016229_74-190076-0_sp0.9 from training. Duration: 0.888875 2023-10-04 12:39:26,208 WARNING [train.py:1204] (2/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_285-97786-0_sp1.1 from training. Duration: 0.97275 2023-10-04 12:39:32,485 WARNING [train.py:1204] (2/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_337-181502-0_sp0.9 from training. Duration: 0.72225 2023-10-04 12:39:32,516 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_353-182855-0 from training. Duration: 0.96 2023-10-04 12:39:32,551 WARNING [train.py:1204] (2/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_409-327730-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 12:39:35,351 WARNING [train.py:1204] (2/4) Exclude cut with ID _8030_210_7497_1_1528444886781_3531089_373-19403-0_sp1.1 from training. Duration: 0.97275 2023-10-04 12:39:38,110 WARNING [train.py:1204] (2/4) Exclude cut with ID _6789_210_1534_1_1527857937218_3118950_231-340237-0_sp1.1 from training. Duration: 0.936375 2023-10-04 12:39:42,320 WARNING [train.py:1204] (2/4) Exclude cut with ID _6493_210_1747_1_1528459210424_4012470_536-57832-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 12:39:42,340 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7046_210_6753_1_1527895337_6920917_756-116002-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 12:39:45,067 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_37152_210_9697_1_1533106573_3844188_29-239070-0_sp1.1 from training. Duration: 0.8909375 2023-10-04 12:39:45,069 WARNING [train.py:1204] (2/4) Exclude cut with ID _19023_210_4353_1_1531378892552_5820780_61-334539-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 12:39:46,542 WARNING [train.py:1204] (2/4) Exclude cut with ID _14170_210_5219_1_1530525949868_4469650_159-28488-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 12:39:48,434 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_40896_210_15710_1_1533433956_7100802_764-71272-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 12:39:48,502 WARNING [train.py:1204] (2/4) Exclude cut with ID _44902_210_16187_1_1533888006062_3792078_258-178274-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 12:39:48,764 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.feed_forward1.hidden_balancer.prob, batch_count=1657453.3333333333, ans=0.125 2023-10-04 12:39:50,444 WARNING [train.py:1204] (2/4) Exclude cut with ID _7537_210_2084_1_1528502395431_3924359_14-312906-0_sp1.1 from training. Duration: 0.763625 2023-10-04 12:39:52,279 WARNING [train.py:1204] (2/4) Exclude cut with ID _49643_210_7989_1_1534640244224_3939031_674-116639-0_sp1.1 from training. Duration: 0.963625 2023-10-04 12:39:53,780 WARNING [train.py:1204] (2/4) Exclude cut with ID _22826_210_9994_1_1531908201522_3279169_415-161-0 from training. Duration: 0.98 2023-10-04 12:39:56,555 WARNING [train.py:1204] (2/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_264-348896-0 from training. Duration: 0.85 2023-10-04 12:39:56,562 WARNING [train.py:1204] (2/4) Exclude cut with ID _51899_210_20034_1_1534129254466_3669220_233-161492-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 12:39:56,718 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.1.bypass_mid.scale_min, batch_count=1657520.0, ans=0.2 2023-10-04 12:39:57,783 WARNING [train.py:1204] (2/4) Exclude cut with ID _31255_210_13427_1_1532865619495_3815143_233-200023-0_sp1.1 from training. Duration: 0.936375 2023-10-04 12:39:57,800 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_23850_210_4238_1_1532232597_1632433_100-229879-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 12:39:59,770 WARNING [train.py:1204] (2/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_375-245677-0_sp0.9 from training. Duration: 0.9 2023-10-04 12:39:59,773 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_40223_210_6228_1_1533298404_4812267_355-317415-0_sp1.1 from training. Duration: 0.97275 2023-10-04 12:39:59,813 WARNING [train.py:1204] (2/4) Exclude cut with ID _9679_210_7497_1_1529129212906_4363929_235-81488-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 12:40:03,864 WARNING [train.py:1204] (2/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_172-215932-0_sp1.1 from training. Duration: 0.6 2023-10-04 12:40:03,939 WARNING [train.py:1204] (2/4) Exclude cut with ID _12369_210_4381_1_1530356078581_4388520_64-298917-0_sp1.1 from training. Duration: 0.8 2023-10-04 12:40:08,166 WARNING [train.py:1204] (2/4) Exclude cut with ID _38020_210_5986_1_1533112252259_7006089_131-53533-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 12:40:09,543 WARNING [train.py:1204] (2/4) Exclude cut with ID _42334_210_7770_1_1534206610396_3605880_392-48154-0_sp1.1 from training. Duration: 0.963625 2023-10-04 12:40:09,602 WARNING [train.py:1204] (2/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_337-181502-0 from training. Duration: 0.65 2023-10-04 12:40:09,616 WARNING [train.py:1204] (2/4) Exclude cut with ID _6267_210_2084_1_1527764658526_3894730_375-219887-0_sp0.9 from training. Duration: 0.7444375 2023-10-04 12:40:10,934 WARNING [train.py:1204] (2/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_60-146189-0 from training. Duration: 0.98 2023-10-04 12:40:11,019 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_3133_210_2087_1_1526106446_2324690_243-143119-0_sp1.1 from training. Duration: 0.7 2023-10-04 12:40:14,206 WARNING [train.py:1204] (2/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_1267-272693-0_sp0.9 from training. Duration: 0.811125 2023-10-04 12:40:14,336 WARNING [train.py:1204] (2/4) Exclude cut with ID _69884_210_9771_1_1535846392818_4166169_83-182645-0_sp1.1 from training. Duration: 0.97275 2023-10-04 12:40:14,353 WARNING [train.py:1204] (2/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_403-292260-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 12:40:17,587 WARNING [train.py:1204] (2/4) Exclude cut with ID _55429_210_10626_1_1534467588527_7174143_377-169391-0 from training. Duration: 0.96 2023-10-04 12:40:18,995 WARNING [train.py:1204] (2/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_494-199538-0_sp1.1 from training. Duration: 0.6818125 2023-10-04 12:40:20,340 WARNING [train.py:1204] (2/4) Exclude cut with ID _3067_210_2667_1_1526212974640_4503168_119-175776-0_sp1.1 from training. Duration: 0.67275 2023-10-04 12:40:24,821 WARNING [train.py:1204] (2/4) Exclude cut with ID _3231_210_2106_1_1526205864934_6784220_183-303839-0_sp1.1 from training. Duration: 0.97275 2023-10-04 12:40:27,627 WARNING [train.py:1204] (2/4) Exclude cut with ID _18513_210_9206_1_1531618215223_3862620_165-38958-0_sp1.1 from training. Duration: 0.963625 2023-10-04 12:40:29,665 WARNING [train.py:1204] (2/4) Exclude cut with ID _41314_210_12651_1_1533459476543_3766460_87-272068-0_sp1.1 from training. Duration: 0.7545625 2023-10-04 12:40:31,164 WARNING [train.py:1204] (2/4) Exclude cut with ID _3541_210_2477_1_1526475408709_3263973_353-95-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 12:40:31,278 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_2009_210_2071_1_1525481134_4588937_73-302813-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 12:40:34,001 WARNING [train.py:1204] (2/4) Exclude cut with ID _64220_210_9527_1_1535250151772_4875259_88-95011-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 12:40:34,066 WARNING [train.py:1204] (2/4) Exclude cut with ID _44902_210_16187_1_1533888006062_3792078_160-12107-0_sp1.1 from training. Duration: 0.92725 2023-10-04 12:40:34,073 WARNING [train.py:1204] (2/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_547-5424-0 from training. Duration: 0.94 2023-10-04 12:40:35,509 WARNING [train.py:1204] (2/4) Exclude cut with ID _21783_210_11357_1_1531742458730_3762679_268-85656-0_sp1.1 from training. Duration: 0.936375 2023-10-04 12:40:39,649 INFO [train.py:1046] (2/4) Epoch 47, batch 4300, loss[loss=0.1373, simple_loss=0.2267, pruned_loss=0.024, over 24453.00 frames. ], tot_loss[loss=0.1522, simple_loss=0.2326, pruned_loss=0.03594, over 4704686.37 frames. ], batch size: 63, lr: 2.15e-03, grad_scale: 8.0 2023-10-04 12:40:39,688 WARNING [train.py:1204] (2/4) Exclude cut with ID _66210_210_12651_1_1535446812922_3365850_42-284985-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 12:40:39,720 WARNING [train.py:1204] (2/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_465-214726-0_sp0.9 from training. Duration: 0.9555625 2023-10-04 12:40:39,923 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.1.balancer1.prob, batch_count=1657720.0, ans=0.125 2023-10-04 12:40:45,685 WARNING [train.py:1204] (2/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_50-95770-0_sp1.1 from training. Duration: 0.936375 2023-10-04 12:40:45,907 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.conv_module1.balancer1.prob, batch_count=1657720.0, ans=0.125 2023-10-04 12:40:52,820 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.617e+02 2.021e+02 2.286e+02 2.632e+02 3.512e+02, threshold=4.572e+02, percent-clipped=0.0 2023-10-04 12:40:54,804 WARNING [train.py:1204] (2/4) Exclude cut with ID _30800_210_22382_1_1532499347904_4515959_269-77567-0_sp1.1 from training. Duration: 0.963625 2023-10-04 12:40:54,807 WARNING [train.py:1204] (2/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_7-111954-0 from training. Duration: 0.56 2023-10-04 12:40:54,911 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_43802_210_2290_1_1533516203_8829504_85-195240-0_sp0.9 from training. Duration: 0.9666875 2023-10-04 12:40:57,674 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_2822_210_2715_1_1526112586_1999587_165-36656-0_sp0.9 from training. Duration: 0.988875 2023-10-04 12:40:57,696 WARNING [train.py:1204] (2/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_181-78632-0_sp1.1 from training. Duration: 0.7090625 2023-10-04 12:40:57,708 WARNING [train.py:1204] (2/4) Exclude cut with ID IC0671W0044-331887 from training. Duration: 0.896 2023-10-04 12:40:59,273 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.bypass.scale_min, batch_count=1657786.6666666667, ans=0.2 2023-10-04 12:41:02,128 WARNING [train.py:1204] (2/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_266-137104-0_sp1.1 from training. Duration: 0.5909375 2023-10-04 12:41:03,610 WARNING [train.py:1204] (2/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_137-116442-0_sp0.9 from training. Duration: 0.8666875 2023-10-04 12:41:06,432 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_23801_210_16223_1_1532066392_3294657_570-5439-0 from training. Duration: 0.97 2023-10-04 12:41:06,444 WARNING [train.py:1204] (2/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_29-280622-0_sp1.1 from training. Duration: 0.7454375 2023-10-04 12:41:07,735 WARNING [train.py:1204] (2/4) Exclude cut with ID 3557-8342-0013-54691-0 from training. Duration: 0.92 2023-10-04 12:41:08,059 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.conv_module2.balancer2.prob, batch_count=1657853.3333333333, ans=0.125 2023-10-04 12:41:10,464 WARNING [train.py:1204] (2/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_711-15688-0_sp1.1 from training. Duration: 0.3909375 2023-10-04 12:41:11,866 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_15126_210_6753_1_1530778677_2591185_120-277707-0_sp1.1 from training. Duration: 0.72725 2023-10-04 12:41:14,604 WARNING [train.py:1204] (2/4) Exclude cut with ID _14970_210_15709_1_1530775547361_1966965_8-169924-0_sp1.1 from training. Duration: 0.836375 2023-10-04 12:41:14,605 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_4692_210_3528_1_1526899755_4323254_107-44401-0_sp0.9 from training. Duration: 0.9555625 2023-10-04 12:41:14,706 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_3475_210_2028_1_1526193578_3622569_265-276625-0_sp0.9 from training. Duration: 0.8444375 2023-10-04 12:41:16,018 WARNING [train.py:1204] (2/4) Exclude cut with ID _55153_210_2253_1_1534384786677_3162096_148-79022-0_sp1.1 from training. Duration: 0.87275 2023-10-04 12:41:16,101 WARNING [train.py:1204] (2/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_292-36336-0_sp1.1 from training. Duration: 0.92725 2023-10-04 12:41:16,117 WARNING [train.py:1204] (2/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_329-82235-0 from training. Duration: 0.82 2023-10-04 12:41:17,448 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_11305_210_1794_1_1529750428_7178148_257-312870-0 from training. Duration: 0.88 2023-10-04 12:41:20,595 WARNING [train.py:1204] (2/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_436-4493-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 12:41:23,853 WARNING [train.py:1204] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_321-54129-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 12:41:23,860 WARNING [train.py:1204] (2/4) Exclude cut with ID _5287_210_2084_1_1527418883040_3878670_333-48508-0_sp0.9 from training. Duration: 0.6666875 2023-10-04 12:41:25,147 WARNING [train.py:1204] (2/4) Exclude cut with ID _26528_210_9070_1_1532570410842_3851310_244-278158-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 12:41:25,178 WARNING [train.py:1204] (2/4) Exclude cut with ID _46952_210_5986_1_1534230010357_3107379_232-344503-0_sp1.1 from training. Duration: 0.87275 2023-10-04 12:41:25,188 WARNING [train.py:1204] (2/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_43-206757-0 from training. Duration: 0.75 2023-10-04 12:41:25,190 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_4183_210_2087_1_1526729100_1869838_141-166600-0 from training. Duration: 0.95 2023-10-04 12:41:25,888 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_296-287678-0 from training. Duration: 0.95 2023-10-04 12:41:25,953 WARNING [train.py:1204] (2/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_265-60343-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 12:41:27,163 WARNING [train.py:1204] (2/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_332-256168-0 from training. Duration: 0.75 2023-10-04 12:41:27,198 WARNING [train.py:1204] (2/4) Exclude cut with ID _13679_210_7497_1_1530534293662_3355098_411-186088-0 from training. Duration: 0.94 2023-10-04 12:41:30,170 WARNING [train.py:1204] (2/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_458-126779-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 12:41:31,560 WARNING [train.py:1204] (2/4) Exclude cut with ID IC0803W0380-397572 from training. Duration: 0.032 2023-10-04 12:41:32,620 INFO [scaling.py:1022] (2/4) Whitening: name=encoder_embed.convnext.out_whiten, num_groups=1, num_channels=128, metric=4.75 vs. limit=5.0 2023-10-04 12:41:33,286 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_278-272612-0_sp1.1 from training. Duration: 0.663625 2023-10-04 12:41:33,542 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.bypass_mid.scale_min, batch_count=1657920.0, ans=0.2 2023-10-04 12:41:35,913 WARNING [train.py:1204] (2/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_491-263101-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 12:41:35,918 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_53555_210_5632_1_1534595220_7232297_334-336778-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 12:41:36,256 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.ff3_skip_rate, batch_count=1657920.0, ans=0.0 2023-10-04 12:41:37,323 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_50-3289-0 from training. Duration: 0.91 2023-10-04 12:41:37,387 WARNING [train.py:1204] (2/4) Exclude cut with ID _8699_210_2069_1_1528630346831_3960409_155-39766-0_sp0.9 from training. Duration: 0.8666875 2023-10-04 12:41:37,393 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_502-82145-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 12:41:38,732 WARNING [train.py:1204] (2/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_550-43132-0_sp1.1 from training. Duration: 0.97275 2023-10-04 12:41:38,764 WARNING [train.py:1204] (2/4) Exclude cut with ID _14998_210_5219_1_1530705582080_7474970_446-112117-0_sp1.1 from training. Duration: 0.8181875 2023-10-04 12:41:40,148 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_3691_210_3528_1_1526468169_4062053_355-334432-0_sp1.1 from training. Duration: 0.7909375 2023-10-04 12:41:42,761 WARNING [train.py:1204] (2/4) Exclude cut with ID _58605_210_12210_1_1534723195939_3904259_239-94720-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 12:41:44,231 WARNING [train.py:1204] (2/4) Exclude cut with ID _49376_210_5986_1_1533985278732_3370740_391-344072-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 12:41:45,561 WARNING [train.py:1204] (2/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_558-322014-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 12:41:45,593 WARNING [train.py:1204] (2/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_256-32662-0_sp1.1 from training. Duration: 0.8181875 2023-10-04 12:41:47,154 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.bypass.scale_min, batch_count=1657986.6666666667, ans=0.2 2023-10-04 12:41:51,808 WARNING [train.py:1204] (2/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_572-185150-0 from training. Duration: 0.97 2023-10-04 12:41:53,003 INFO [train.py:1046] (2/4) Epoch 47, batch 4350, loss[loss=0.1584, simple_loss=0.234, pruned_loss=0.04146, over 23422.00 frames. ], tot_loss[loss=0.153, simple_loss=0.2336, pruned_loss=0.03617, over 4716376.02 frames. ], batch size: 285, lr: 2.15e-03, grad_scale: 8.0 2023-10-04 12:41:53,084 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_89-35748-0_sp0.9 from training. Duration: 0.87775 2023-10-04 12:41:57,216 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6291_210_3273_1_1527683649_1078498_88-66747-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 12:41:58,755 WARNING [train.py:1204] (2/4) Exclude cut with ID _19504_210_9994_1_1531648863242_3325220_357-237029-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 12:42:01,414 WARNING [train.py:1204] (2/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_302-2550-0_sp1.1 from training. Duration: 0.736375 2023-10-04 12:42:01,414 WARNING [train.py:1204] (2/4) Exclude cut with ID _9461_210_4557_1_1529060514726_3677451_59-194017-0_sp1.1 from training. Duration: 0.963625 2023-10-04 12:42:04,773 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.attention_skip_rate, batch_count=1658053.3333333333, ans=0.0 2023-10-04 12:42:07,147 WARNING [train.py:1204] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_665-196155-0_sp1.1 from training. Duration: 0.6090625 2023-10-04 12:42:10,031 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_326-315007-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 12:42:11,478 WARNING [train.py:1204] (2/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_105-328174-0_sp1.1 from training. Duration: 0.6909375 2023-10-04 12:42:11,485 WARNING [train.py:1204] (2/4) Exclude cut with ID _56494_210_4220_1_1535284837625_3033169_106-263722-0_sp1.1 from training. Duration: 0.97275 2023-10-04 12:42:14,473 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.conv_skip_rate, batch_count=1658120.0, ans=0.0 2023-10-04 12:42:15,635 WARNING [train.py:1204] (2/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_770-200430-0_sp0.9 from training. Duration: 0.988875 2023-10-04 12:42:18,428 WARNING [train.py:1204] (2/4) Exclude cut with ID _65640_210_6310_1_1535368201445_3173250_209-13008-0_sp1.1 from training. Duration: 0.9 2023-10-04 12:42:20,419 WARNING [train.py:1204] (2/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_212-149947-0_sp0.9 from training. Duration: 0.811125 2023-10-04 12:42:26,490 WARNING [train.py:1204] (2/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_188-171673-0 from training. Duration: 0.58 2023-10-04 12:42:26,563 WARNING [train.py:1204] (2/4) Exclude cut with ID _58895_210_8750_1_1535184081320_3649089_286-281724-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 12:42:28,009 WARNING [train.py:1204] (2/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_16-254909-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 12:42:30,457 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.out_combiner.scale_min, batch_count=1658186.6666666667, ans=0.2 2023-10-04 12:42:31,624 WARNING [train.py:1204] (2/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_52-95655-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 12:42:34,387 WARNING [train.py:1204] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_554-45617-0 from training. Duration: 0.99 2023-10-04 12:42:34,574 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.nonlin_attention.balancer.prob, batch_count=1658186.6666666667, ans=0.125 2023-10-04 12:42:38,818 WARNING [train.py:1204] (2/4) Exclude cut with ID _14683_210_5219_1_1530698352608_3974859_473-344611-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 12:42:40,217 WARNING [train.py:1204] (2/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_17-25045-0_sp0.9 from training. Duration: 0.6555625 2023-10-04 12:42:44,378 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9072W0003-530637 from training. Duration: 0.768 2023-10-04 12:42:45,798 WARNING [train.py:1204] (2/4) Exclude cut with ID _38670_210_15379_1_1533448810659_4391059_76-74025-0_sp1.1 from training. Duration: 0.936375 2023-10-04 12:42:47,136 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6291_210_3273_1_1527683649_1078498_74-292608-0_sp0.9 from training. Duration: 0.8 2023-10-04 12:42:47,208 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9084W0025-536862 from training. Duration: 0.896 2023-10-04 12:42:48,602 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9087W0021-538392 from training. Duration: 0.768 2023-10-04 12:42:48,607 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7543_210_6753_1_1528543323_645515_56-163234-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 12:42:48,626 WARNING [train.py:1204] (2/4) Exclude cut with ID _45560_210_21982_1_1533812603951_3584999_44-332643-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 12:42:49,983 WARNING [train.py:1204] (2/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_1285-256622-0_sp1.1 from training. Duration: 0.763625 2023-10-04 12:42:50,040 WARNING [train.py:1204] (2/4) Exclude cut with ID _52781_210_16187_1_1534297729099_1118149_141-325632-0_sp1.1 from training. Duration: 0.936375 2023-10-04 12:42:51,408 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_40896_210_15710_1_1533433956_7100802_1051-137631-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 12:42:51,458 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_13091_210_7925_1_1530609447_8987354_233-18333-0_sp1.1 from training. Duration: 0.97275 2023-10-04 12:42:54,620 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_34008_210_10431_1_1533345424_8183247_662-317331-0 from training. Duration: 0.94 2023-10-04 12:42:54,634 WARNING [train.py:1204] (2/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_291-248920-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 12:42:54,636 WARNING [train.py:1204] (2/4) Exclude cut with ID _65263_210_22356_1_1535763598691_3921037_274-288481-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 12:42:54,660 WARNING [train.py:1204] (2/4) Exclude cut with ID _5189_210_4557_1_1527248319428_3769229_233-168080-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 12:42:54,717 WARNING [train.py:1204] (2/4) Exclude cut with ID _54871_210_22356_1_1534665798253_3856359_135-184281-0 from training. Duration: 0.99 2023-10-04 12:42:56,113 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9123W0012-555972 from training. Duration: 0.896 2023-10-04 12:42:56,117 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9123W0118-556077 from training. Duration: 0.896 2023-10-04 12:42:56,126 WARNING [train.py:1204] (2/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_406-275649-0 from training. Duration: 0.42 2023-10-04 12:43:00,664 WARNING [train.py:1204] (2/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_309-13684-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 12:43:01,950 WARNING [train.py:1204] (2/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_129-337802-0_sp1.1 from training. Duration: 0.6545625 2023-10-04 12:43:01,976 WARNING [train.py:1204] (2/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_649-266825-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 12:43:03,848 WARNING [train.py:1204] (2/4) Exclude cut with ID _40519_210_13794_1_1533715228655_7337390_895-224742-0_sp1.1 from training. Duration: 0.92725 2023-10-04 12:43:04,857 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.0.layers.1.nonlin_attention.whiten1, num_groups=1, num_channels=144, metric=5.37 vs. limit=10.0 2023-10-04 12:43:05,280 WARNING [train.py:1204] (2/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_234-14174-0 from training. Duration: 0.82 2023-10-04 12:43:06,655 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9156W0009-572502 from training. Duration: 0.768 2023-10-04 12:43:06,675 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_424-245529-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 12:43:07,965 INFO [train.py:1046] (2/4) Epoch 47, batch 4400, loss[loss=0.1371, simple_loss=0.2163, pruned_loss=0.02896, over 24589.00 frames. ], tot_loss[loss=0.1533, simple_loss=0.2342, pruned_loss=0.0362, over 4726007.70 frames. ], batch size: 60, lr: 2.15e-03, grad_scale: 16.0 2023-10-04 12:43:11,272 WARNING [train.py:1204] (2/4) Exclude cut with ID _26528_210_9070_1_1532570410842_3851310_232-10149-0_sp1.1 from training. Duration: 0.963625 2023-10-04 12:43:11,282 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_17292_210_6753_1_1531125388_2943734_152-30410-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 12:43:12,818 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_35441_210_16554_1_1533347572_4092680_291-82075-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 12:43:14,381 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.conv_skip_rate, batch_count=1658386.6666666667, ans=0.0 2023-10-04 12:43:15,539 WARNING [train.py:1204] (2/4) Exclude cut with ID _12369_210_4381_1_1530356078581_4388520_64-298917-0 from training. Duration: 0.88 2023-10-04 12:43:15,567 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_5618_210_1874_1_1527311008_5851_1-214501-0_sp0.9 from training. Duration: 0.7655625 2023-10-04 12:43:15,746 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.bypass_mid.scale_min, batch_count=1658386.6666666667, ans=0.2 2023-10-04 12:43:16,840 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_9460_210_4211_1_1528954587_10049667_572-37035-0 from training. Duration: 0.8 2023-10-04 12:43:16,858 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9193W0023-590322 from training. Duration: 0.896 2023-10-04 12:43:18,334 WARNING [train.py:1204] (2/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_206-10097-0_sp1.1 from training. Duration: 0.4454375 2023-10-04 12:43:18,353 WARNING [train.py:1204] (2/4) Exclude cut with ID _55134_210_21982_1_1534590028377_3882170_17-51731-0_sp1.1 from training. Duration: 0.963625 2023-10-04 12:43:20,886 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.809e+02 2.049e+02 2.248e+02 2.577e+02 4.164e+02, threshold=4.496e+02, percent-clipped=0.0 2023-10-04 12:43:21,013 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_14953_210_2151_1_1530701710_5616165_710-165400-0 from training. Duration: 0.87 2023-10-04 12:43:22,322 WARNING [train.py:1204] (2/4) Exclude cut with ID _53203_210_2087_1_1534246218309_475590_61-85777-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 12:43:22,405 WARNING [train.py:1204] (2/4) Exclude cut with ID _55844_210_2346_1_1534471212569_7625707_1050-10453-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 12:43:22,414 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9218W0013-603297 from training. Duration: 0.896 2023-10-04 12:43:25,705 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_20120_210_6753_1_1531643035_5482859_267-102818-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 12:43:25,706 WARNING [train.py:1204] (2/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_191-41023-0 from training. Duration: 0.71 2023-10-04 12:43:25,754 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9231W0202-610227 from training. Duration: 0.896 2023-10-04 12:43:25,886 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.conv_module1.balancer1.prob, batch_count=1658453.3333333333, ans=0.125 2023-10-04 12:43:28,977 WARNING [train.py:1204] (2/4) Exclude cut with ID _6537_210_1883_1_1527987559710_3597129_382-133062-0 from training. Duration: 0.68 2023-10-04 12:43:30,333 WARNING [train.py:1204] (2/4) Exclude cut with ID _28427_210_4220_1_1532581240967_3061069_43-84928-0 from training. Duration: 0.99 2023-10-04 12:43:30,371 WARNING [train.py:1204] (2/4) Exclude cut with ID _59256_210_5986_1_1535076085997_3589120_121-237386-0 from training. Duration: 0.96 2023-10-04 12:43:31,564 WARNING [train.py:1204] (2/4) Exclude cut with ID _8480_210_1874_1_1528589391205_6728330_137-256986-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 12:43:33,108 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_9417_210_1794_1_1529235261_4614131_479-346963-0_sp1.1 from training. Duration: 0.936375 2023-10-04 12:43:33,140 WARNING [train.py:1204] (2/4) Exclude cut with ID _26474_210_9570_1_1532520022699_3623380_267-7362-0_sp1.1 from training. Duration: 0.936375 2023-10-04 12:43:35,075 WARNING [train.py:1204] (2/4) Exclude cut with ID _55328_210_27767_1_1534580973260_4088170_287-61272-0_sp1.1 from training. Duration: 0.97275 2023-10-04 12:43:35,341 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.conv_skip_rate, batch_count=1658453.3333333333, ans=0.0 2023-10-04 12:43:36,329 WARNING [train.py:1204] (2/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_589-9070-0 from training. Duration: 0.71 2023-10-04 12:43:37,626 WARNING [train.py:1204] (2/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_148-158509-0_sp1.1 from training. Duration: 0.37275 2023-10-04 12:43:38,943 WARNING [train.py:1204] (2/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_164-184548-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 12:43:41,727 WARNING [train.py:1204] (2/4) Exclude cut with ID _14970_210_15709_1_1530775547361_1966965_6-23328-0_sp1.1 from training. Duration: 0.7818125 2023-10-04 12:43:41,732 WARNING [train.py:1204] (2/4) Exclude cut with ID _29303_210_2575_1_1532392237303_7410649_612-165069-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 12:43:43,752 WARNING [train.py:1204] (2/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_383-321224-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 12:43:43,785 WARNING [train.py:1204] (2/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_283-160525-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 12:43:43,811 WARNING [train.py:1204] (2/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_505-97987-0 from training. Duration: 0.86 2023-10-04 12:43:45,055 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9309W0190-640317 from training. Duration: 0.768 2023-10-04 12:43:45,598 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.5.encoder.layers.0.feed_forward3.out_whiten, num_groups=1, num_channels=256, metric=9.21 vs. limit=15.0 2023-10-04 12:43:48,035 WARNING [train.py:1204] (2/4) Exclude cut with ID _50093_210_4220_1_1534417141207_3432299_280-297390-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 12:43:48,247 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.balancer2.prob, batch_count=1658520.0, ans=0.125 2023-10-04 12:43:50,514 INFO [scaling.py:1022] (2/4) Whitening: name=encoder_embed.out_whiten, num_groups=1, num_channels=192, metric=7.24 vs. limit=8.0 2023-10-04 12:43:53,602 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_17292_210_6753_1_1531125388_2943734_320-71385-0_sp1.1 from training. Duration: 0.97275 2023-10-04 12:43:55,010 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_213-103282-0 from training. Duration: 0.78 2023-10-04 12:43:58,265 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7615_210_4211_1_1528110965_4580160_72-235211-0_sp0.9 from training. Duration: 0.8444375 2023-10-04 12:44:01,662 WARNING [train.py:1204] (2/4) Exclude cut with ID _14867_210_1968_1_1530684179890_3846969_173-320977-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 12:44:01,844 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.1.ff2_skip_rate, batch_count=1658586.6666666667, ans=0.0 2023-10-04 12:44:04,316 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7046_210_6753_1_1527895337_6920917_783-278109-0_sp0.9 from training. Duration: 0.8555625 2023-10-04 12:44:06,278 WARNING [train.py:1204] (2/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_90-30673-0 from training. Duration: 0.84 2023-10-04 12:44:06,293 WARNING [train.py:1204] (2/4) Exclude cut with ID _22826_210_9994_1_1531908201522_3279169_415-161-0_sp1.1 from training. Duration: 0.8909375 2023-10-04 12:44:06,310 WARNING [train.py:1204] (2/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_86-66192-0_sp0.9 from training. Duration: 0.611125 2023-10-04 12:44:06,313 WARNING [train.py:1204] (2/4) Exclude cut with ID _4626_210_3532_1_1526817652717_3916859_446-206656-0_sp1.1 from training. Duration: 0.6909375 2023-10-04 12:44:06,368 WARNING [train.py:1204] (2/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_166-128941-0_sp0.9 from training. Duration: 0.6 2023-10-04 12:44:06,575 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.balancer1.prob, batch_count=1658653.3333333333, ans=0.125 2023-10-04 12:44:10,678 WARNING [train.py:1204] (2/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_345-167986-0 from training. Duration: 0.84 2023-10-04 12:44:13,425 WARNING [train.py:1204] (2/4) Exclude cut with ID _6275_210_2084_1_1528110062213_7669049_973-198085-0 from training. Duration: 0.75 2023-10-04 12:44:15,321 WARNING [train.py:1204] (2/4) Exclude cut with ID _60057_210_10640_1_1534903065793_3333150_171-181133-0 from training. Duration: 0.88 2023-10-04 12:44:15,335 WARNING [train.py:1204] (2/4) Exclude cut with ID _19504_210_9994_1_1531648863242_3325220_336-196513-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 12:44:15,342 WARNING [train.py:1204] (2/4) Exclude cut with ID _6493_210_1747_1_1528459210424_4012470_513-140967-0 from training. Duration: 0.79 2023-10-04 12:44:16,674 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6673_210_3273_1_1527767790_3173240_327-306185-0_sp0.9 from training. Duration: 0.92225 2023-10-04 12:44:19,447 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1032-236863-0_sp1.1 from training. Duration: 0.763625 2023-10-04 12:44:22,206 INFO [train.py:1046] (2/4) Epoch 47, batch 4450, loss[loss=0.1571, simple_loss=0.2299, pruned_loss=0.04215, over 23753.00 frames. ], tot_loss[loss=0.1544, simple_loss=0.2355, pruned_loss=0.03663, over 4722824.12 frames. ], batch size: 164, lr: 2.15e-03, grad_scale: 16.0 2023-10-04 12:44:23,542 WARNING [train.py:1204] (2/4) Exclude cut with ID _14867_210_1968_1_1530684179890_3846969_413-29292-0 from training. Duration: 0.9 2023-10-04 12:44:26,525 WARNING [train.py:1204] (2/4) Exclude cut with ID _47251_210_18628_1_1533814063128_4137250_186-343322-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 12:44:29,252 WARNING [train.py:1204] (2/4) Exclude cut with ID _56235_210_10640_1_1534557335235_4161099_624-46091-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 12:44:29,294 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_791-197891-0_sp0.9 from training. Duration: 0.9333125 2023-10-04 12:44:31,413 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.balancer1.prob, batch_count=1658720.0, ans=0.125 2023-10-04 12:44:35,903 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_50050_210_16223_1_1535435796_7290138_786-288457-0_sp1.1 from training. Duration: 0.92725 2023-10-04 12:44:35,927 WARNING [train.py:1204] (2/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_213-227283-0_sp1.1 from training. Duration: 0.963625 2023-10-04 12:44:38,271 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.feed_forward1.hidden_balancer.prob, batch_count=1658786.6666666667, ans=0.125 2023-10-04 12:44:39,157 WARNING [train.py:1204] (2/4) Exclude cut with ID _42659_210_2070_1_1533607442765_3120380_58-341121-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 12:44:40,558 WARNING [train.py:1204] (2/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_506-23200-0_sp1.1 from training. Duration: 0.8818125 2023-10-04 12:44:43,228 WARNING [train.py:1204] (2/4) Exclude cut with ID _14170_210_5219_1_1530525949868_4469650_336-213530-0_sp1.1 from training. Duration: 0.8181875 2023-10-04 12:44:43,247 WARNING [train.py:1204] (2/4) Exclude cut with ID _64135_210_2253_1_1535677267876_7608972_180-5312-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 12:44:43,333 WARNING [train.py:1204] (2/4) Exclude cut with ID _12302_210_5219_1_1529924446466_8107280_835-279754-0 from training. Duration: 0.9 2023-10-04 12:44:43,335 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_20120_210_6753_1_1531643035_5482859_510-96996-0_sp1.1 from training. Duration: 0.7909375 2023-10-04 12:44:44,660 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_15517_210_6753_1_1530954008_3487336_216-187834-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 12:44:44,685 WARNING [train.py:1204] (2/4) Exclude cut with ID _19504_210_9994_1_1531648863242_3325220_414-113427-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 12:44:44,686 WARNING [train.py:1204] (2/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_264-108685-0_sp0.9 from training. Duration: 0.611125 2023-10-04 12:44:48,041 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_5438_210_1794_1_1527336687_2600009_104-288274-0_sp0.9 from training. Duration: 0.6444375 2023-10-04 12:44:53,735 WARNING [train.py:1204] (2/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_520-39496-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 12:44:53,772 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_37818_210_4238_1_1533620910_4720083_315-308565-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 12:44:55,179 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_2009_210_2071_1_1525481134_4588937_385-302100-0_sp1.1 from training. Duration: 0.7909375 2023-10-04 12:44:55,215 WARNING [train.py:1204] (2/4) Exclude cut with ID _58895_210_8750_1_1535184081320_3649089_277-53219-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 12:44:56,619 WARNING [train.py:1204] (2/4) Exclude cut with ID _26664_210_7770_1_1532258943280_3418270_121-111258-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 12:45:00,020 WARNING [train.py:1204] (2/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_99-167392-0_sp1.1 from training. Duration: 0.5545625 2023-10-04 12:45:01,837 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1113-7369-0 from training. Duration: 0.85 2023-10-04 12:45:01,860 WARNING [train.py:1204] (2/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_352-59958-0 from training. Duration: 0.86 2023-10-04 12:45:01,864 WARNING [train.py:1204] (2/4) Exclude cut with ID _14886_210_4220_1_1530702020355_3516190_159-73748-0_sp1.1 from training. Duration: 0.7545625 2023-10-04 12:45:03,402 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.feed_forward1.hidden_balancer.prob, batch_count=1658853.3333333333, ans=0.125 2023-10-04 12:45:04,526 WARNING [train.py:1204] (2/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_131-119569-0_sp1.1 from training. Duration: 0.92725 2023-10-04 12:45:05,842 WARNING [train.py:1204] (2/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_216-343007-0 from training. Duration: 0.66 2023-10-04 12:45:08,036 WARNING [train.py:1204] (2/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_113-92608-0_sp0.9 from training. Duration: 0.6 2023-10-04 12:45:08,807 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.0.layers.0.feed_forward3.out_whiten, num_groups=1, num_channels=192, metric=12.29 vs. limit=15.0 2023-10-04 12:45:09,318 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.1.conv_module2.balancer2.prob, batch_count=1658920.0, ans=0.125 2023-10-04 12:45:11,934 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_40047_210_2290_1_1533290519_2374311_132-123271-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 12:45:13,210 WARNING [train.py:1204] (2/4) Exclude cut with ID _8793_210_6310_1_1528542985139_2310009_121-179782-0 from training. Duration: 0.8 2023-10-04 12:45:13,239 WARNING [train.py:1204] (2/4) Exclude cut with ID _43997_210_4525_1_1533538632555_5189329_283-218639-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 12:45:13,242 WARNING [train.py:1204] (2/4) Exclude cut with ID _49376_210_5986_1_1533985278732_3370740_297-78397-0_sp1.1 from training. Duration: 0.936375 2023-10-04 12:45:13,261 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_3475_210_2028_1_1526193578_3622569_377-166133-0_sp0.9 from training. Duration: 0.9555625 2023-10-04 12:45:14,375 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_520-213149-0_sp1.1 from training. Duration: 0.92725 2023-10-04 12:45:16,295 WARNING [train.py:1204] (2/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_567-144483-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 12:45:19,232 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_4183_210_2087_1_1526729100_1869838_143-280962-0_sp0.9 from training. Duration: 0.62225 2023-10-04 12:45:19,279 WARNING [train.py:1204] (2/4) Exclude cut with ID _49643_210_7989_1_1534640244224_3939031_611-185515-0 from training. Duration: 0.94 2023-10-04 12:45:20,654 WARNING [train.py:1204] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_88-341718-0_sp0.9 from training. Duration: 0.7333125 2023-10-04 12:45:20,799 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.0.self_attn_weights.pos_emb_skip_rate, batch_count=1658986.6666666667, ans=0.0 2023-10-04 12:45:23,501 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_51225_210_3601_1_1534400634_2327383_49-235441-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 12:45:24,899 WARNING [train.py:1204] (2/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_240-294400-0_sp1.1 from training. Duration: 0.936375 2023-10-04 12:45:26,219 WARNING [train.py:1204] (2/4) Exclude cut with ID _55429_210_10626_1_1534467588527_7174143_640-304825-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 12:45:26,246 WARNING [train.py:1204] (2/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_187-221566-0_sp1.1 from training. Duration: 0.3909375 2023-10-04 12:45:29,001 WARNING [train.py:1204] (2/4) Exclude cut with ID _7606_210_4915_1_1528894253277_5080690_127-177761-0_sp0.9 from training. Duration: 0.711125 2023-10-04 12:45:30,917 WARNING [train.py:1204] (2/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_430-116139-0 from training. Duration: 0.85 2023-10-04 12:45:32,396 WARNING [train.py:1204] (2/4) Exclude cut with ID _3675_210_4557_1_1526639934408_4182089_456-18305-0_sp0.9 from training. Duration: 0.8444375 2023-10-04 12:45:36,890 INFO [train.py:1046] (2/4) Epoch 47, batch 4500, loss[loss=0.1444, simple_loss=0.2113, pruned_loss=0.03877, over 22775.00 frames. ], tot_loss[loss=0.1547, simple_loss=0.2354, pruned_loss=0.03701, over 4714164.84 frames. ], batch size: 322, lr: 2.15e-03, grad_scale: 16.0 2023-10-04 12:45:37,026 WARNING [train.py:1204] (2/4) Exclude cut with ID _18513_210_9206_1_1531618215223_3862620_402-5957-0_sp1.1 from training. Duration: 0.9 2023-10-04 12:45:39,627 WARNING [train.py:1204] (2/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_465-156500-0 from training. Duration: 0.72 2023-10-04 12:45:39,629 WARNING [train.py:1204] (2/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_714-310538-0 from training. Duration: 0.86 2023-10-04 12:45:41,004 WARNING [train.py:1204] (2/4) Exclude cut with ID _39411_210_5986_1_1533538825030_3343649_335-301187-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 12:45:45,140 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_41750_210_1960_1_1533520678_3948042_241-158616-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 12:45:45,188 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_46013_210_7234_1_1533776362_3744032_50-141535-0_sp1.1 from training. Duration: 0.9 2023-10-04 12:45:47,060 WARNING [train.py:1204] (2/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_43-206757-0_sp0.9 from training. Duration: 0.8333125 2023-10-04 12:45:47,126 WARNING [train.py:1204] (2/4) Exclude cut with ID _22961_210_8254_1_1531976366672_4278180_172-276296-0_sp1.1 from training. Duration: 0.97275 2023-10-04 12:45:48,432 WARNING [train.py:1204] (2/4) Exclude cut with ID _17956_210_4353_1_1531209568125_6979970_1305-301903-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 12:45:48,470 WARNING [train.py:1204] (2/4) Exclude cut with ID _28323_210_16187_1_1532480395667_4100300_604-44507-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 12:45:49,651 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.758e+02 2.076e+02 2.315e+02 2.981e+02 4.706e+02, threshold=4.629e+02, percent-clipped=1.0 2023-10-04 12:45:59,639 WARNING [train.py:1204] (2/4) Exclude cut with ID _23514_210_1389_1_1531832139785_3031439_88-337544-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 12:45:59,708 WARNING [train.py:1204] (2/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_932-3672-0_sp1.1 from training. Duration: 0.963625 2023-10-04 12:46:01,636 WARNING [train.py:1204] (2/4) Exclude cut with ID _46502_210_13794_1_1533724452198_3657159_207-109991-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 12:46:02,972 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_216-73841-0_sp1.1 from training. Duration: 0.87275 2023-10-04 12:46:05,613 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8453_210_4211_1_1528502940_8562730_347-330673-0_sp1.1 from training. Duration: 0.6545625 2023-10-04 12:46:11,624 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_4183_210_2087_1_1526729100_1869838_143-280962-0_sp1.1 from training. Duration: 0.5090625 2023-10-04 12:46:14,312 WARNING [train.py:1204] (2/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_325-113798-0_sp1.1 from training. Duration: 0.77275 2023-10-04 12:46:18,207 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.2.encoder.layers.1.nonlin_attention.whiten2, num_groups=1, num_channels=384, metric=4.58 vs. limit=15.0 2023-10-04 12:46:18,790 WARNING [train.py:1204] (2/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_91-212486-0_sp0.9 from training. Duration: 0.7555625 2023-10-04 12:46:21,472 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8776_210_2040_1_1528638837_4088036_244-278763-0_sp1.1 from training. Duration: 0.8181875 2023-10-04 12:46:21,509 WARNING [train.py:1204] (2/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_106-286130-0 from training. Duration: 0.98 2023-10-04 12:46:22,866 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_2746_210_1988_1_1525872816_3795627_372-205861-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 12:46:24,195 WARNING [train.py:1204] (2/4) Exclude cut with ID _30800_210_22382_1_1532499347904_4515959_240-54646-0_sp1.1 from training. Duration: 0.936375 2023-10-04 12:46:25,777 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.0.feed_forward1.hidden_balancer.prob, batch_count=1659253.3333333333, ans=0.125 2023-10-04 12:46:27,070 WARNING [train.py:1204] (2/4) Exclude cut with ID _59500_210_8254_1_1534809610680_3736009_162-180695-0_sp1.1 from training. Duration: 0.936375 2023-10-04 12:46:27,091 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7544_210_6753_1_1528591327_1471426_146-93357-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 12:46:29,850 WARNING [train.py:1204] (2/4) Exclude cut with ID _42657_210_2070_1_1533520654016_3327008_405-87432-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 12:46:29,885 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_4528_210_3273_1_1527076654_3276777_209-219804-0 from training. Duration: 0.69 2023-10-04 12:46:29,885 WARNING [train.py:1204] (2/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_78-182912-0_sp0.9 from training. Duration: 0.6555625 2023-10-04 12:46:29,899 WARNING [train.py:1204] (2/4) Exclude cut with ID _31255_210_13427_1_1532865619495_3815143_322-114998-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 12:46:34,322 WARNING [train.py:1204] (2/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_848-280957-0_sp0.9 from training. Duration: 0.9666875 2023-10-04 12:46:34,348 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7696_210_2040_1_1528207167_3716099_117-125434-0_sp0.9 from training. Duration: 0.8444375 2023-10-04 12:46:37,561 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_1002-109192-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 12:46:39,625 WARNING [train.py:1204] (2/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_138-64210-0_sp0.9 from training. Duration: 0.811125 2023-10-04 12:46:39,645 WARNING [train.py:1204] (2/4) Exclude cut with ID _25808_210_13292_1_1532343514562_7366109_633-47524-0_sp1.1 from training. Duration: 0.9 2023-10-04 12:46:41,065 WARNING [train.py:1204] (2/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_297-5233-0 from training. Duration: 0.95 2023-10-04 12:46:43,740 WARNING [train.py:1204] (2/4) Exclude cut with ID _8394_210_6702_1_1528590504316_3540189_335-274780-0 from training. Duration: 0.78 2023-10-04 12:46:43,745 WARNING [train.py:1204] (2/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_301-330455-0 from training. Duration: 0.8 2023-10-04 12:46:48,285 WARNING [train.py:1204] (2/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_383-205833-0 from training. Duration: 0.95 2023-10-04 12:46:48,614 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.balancer2.prob, batch_count=1659386.6666666667, ans=0.125 2023-10-04 12:46:49,739 INFO [train.py:1046] (2/4) Epoch 47, batch 4550, loss[loss=0.124, simple_loss=0.1984, pruned_loss=0.02478, over 24431.00 frames. ], tot_loss[loss=0.1538, simple_loss=0.2342, pruned_loss=0.03665, over 4716389.97 frames. ], batch size: 58, lr: 2.15e-03, grad_scale: 16.0 2023-10-04 12:46:49,868 WARNING [train.py:1204] (2/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_48-182155-0 from training. Duration: 0.87 2023-10-04 12:46:51,258 WARNING [train.py:1204] (2/4) Exclude cut with ID _14832_210_8750_1_1530691351539_3171348_1-54315-0_sp1.1 from training. Duration: 0.7909375 2023-10-04 12:46:53,998 WARNING [train.py:1204] (2/4) Exclude cut with ID _44982_210_1511_1_1534312820108_3726029_413-100814-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 12:46:55,265 WARNING [train.py:1204] (2/4) Exclude cut with ID _3231_210_2106_1_1526205864934_6784220_104-261392-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 12:46:58,011 WARNING [train.py:1204] (2/4) Exclude cut with ID _24314_210_4915_1_1532135078116_7170179_1-13588-0_sp1.1 from training. Duration: 0.97275 2023-10-04 12:46:58,167 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.0.feed_forward1.out_proj.dropout_p, batch_count=1659386.6666666667, ans=0.1 2023-10-04 12:46:58,303 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.feed_forward1.hidden_balancer.prob, batch_count=1659386.6666666667, ans=0.125 2023-10-04 12:47:03,536 WARNING [train.py:1204] (2/4) Exclude cut with ID _44304_210_15374_1_1533639352765_4613129_141-8356-0_sp1.1 from training. Duration: 0.963625 2023-10-04 12:47:05,524 WARNING [train.py:1204] (2/4) Exclude cut with ID _37892_210_19596_1_1533198677897_3964058_374-335034-0_sp1.1 from training. Duration: 0.936375 2023-10-04 12:47:05,658 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_195-257512-0_sp0.9 from training. Duration: 0.7444375 2023-10-04 12:47:05,661 WARNING [train.py:1204] (2/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_1267-272693-0_sp1.1 from training. Duration: 0.663625 2023-10-04 12:47:05,661 WARNING [train.py:1204] (2/4) Exclude cut with ID _30196_210_1664_1_1533708496522_7074140_690-326155-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 12:47:08,357 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_68847_210_14656_1_1536070214_2586843_38-20717-0_sp1.1 from training. Duration: 0.97275 2023-10-04 12:47:09,021 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_99-251314-0_sp1.1 from training. Duration: 0.7909375 2023-10-04 12:47:12,905 WARNING [train.py:1204] (2/4) Exclude cut with ID _49376_210_5986_1_1533985278732_3370740_225-95630-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 12:47:15,680 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_2655_210_2087_1_1525688959_3897400_202-147557-0 from training. Duration: 0.85 2023-10-04 12:47:16,956 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_5618_210_1874_1_1527311008_5851_1-214501-0_sp1.1 from training. Duration: 0.626375 2023-10-04 12:47:17,030 WARNING [train.py:1204] (2/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_225-43381-0_sp1.1 from training. Duration: 0.8545625 2023-10-04 12:47:18,787 WARNING [train.py:1204] (2/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_242-3472-0 from training. Duration: 0.73 2023-10-04 12:47:19,051 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.self_attn_weights.pos_emb_skip_rate, batch_count=1659520.0, ans=0.0 2023-10-04 12:47:21,601 WARNING [train.py:1204] (2/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_339-104791-0 from training. Duration: 0.49 2023-10-04 12:47:21,659 WARNING [train.py:1204] (2/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_488-92261-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 12:47:25,787 WARNING [train.py:1204] (2/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_175-163894-0 from training. Duration: 0.74 2023-10-04 12:47:25,911 WARNING [train.py:1204] (2/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_41-234353-0_sp0.9 from training. Duration: 0.7555625 2023-10-04 12:47:30,014 WARNING [train.py:1204] (2/4) Exclude cut with ID _47251_210_18628_1_1533814063128_4137250_116-275927-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 12:47:30,048 WARNING [train.py:1204] (2/4) Exclude cut with ID _4697_210_2069_1_1526815801953_3823098_61-223324-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 12:47:30,060 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6961_210_2087_1_1527937168_6815011_562-165518-0_sp1.1 from training. Duration: 0.7 2023-10-04 12:47:30,395 INFO [scaling.py:1118] (2/4) WithLoss: name=encoder.encoders.4.encoder.layers.2.self_attn_weights, loss-sum=0.000e+00 2023-10-04 12:47:31,477 WARNING [train.py:1204] (2/4) Exclude cut with ID _21989_210_5854_1_1531899214931_3271360_133-44759-0 from training. Duration: 0.77 2023-10-04 12:47:33,300 WARNING [train.py:1204] (2/4) Exclude cut with ID _23557_210_8750_1_1531890106370_6846255_77-284923-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 12:47:36,430 WARNING [train.py:1204] (2/4) Exclude cut with ID _13678_210_7497_1_1530530069226_3463170_464-296127-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 12:47:36,444 WARNING [train.py:1204] (2/4) Exclude cut with ID _22613_210_5219_1_1532253599686_4090170_393-36663-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 12:47:36,537 WARNING [train.py:1204] (2/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_235-262124-0_sp0.9 from training. Duration: 0.7444375 2023-10-04 12:47:37,857 WARNING [train.py:1204] (2/4) Exclude cut with ID _8480_210_1874_1_1528589391205_6728330_755-112625-0 from training. Duration: 0.74 2023-10-04 12:47:39,801 WARNING [train.py:1204] (2/4) Exclude cut with ID _6456_210_1534_1_1527681634212_3181049_215-309359-0 from training. Duration: 0.88 2023-10-04 12:47:39,826 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_3019_210_2028_1_1526044245_3264865_191-72235-0_sp1.1 from training. Duration: 0.8818125 2023-10-04 12:47:40,976 WARNING [train.py:1204] (2/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_380-162648-0 from training. Duration: 0.94 2023-10-04 12:47:42,399 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_92-46009-0 from training. Duration: 0.54 2023-10-04 12:47:42,419 WARNING [train.py:1204] (2/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_53-300544-0_sp0.9 from training. Duration: 0.7444375 2023-10-04 12:47:45,200 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_68235_210_1925_1_1535863247_4484656_101-250943-0_sp1.1 from training. Duration: 0.97275 2023-10-04 12:47:45,210 WARNING [train.py:1204] (2/4) Exclude cut with ID _14970_210_15709_1_1530775547361_1966965_204-22182-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 12:47:46,581 WARNING [train.py:1204] (2/4) Exclude cut with ID _55153_210_2253_1_1534384786677_3162096_310-130092-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 12:47:46,592 WARNING [train.py:1204] (2/4) Exclude cut with ID _60113_210_16271_1_1535164212481_4386079_559-167191-0_sp1.1 from training. Duration: 0.8454375 2023-10-04 12:47:48,409 WARNING [train.py:1204] (2/4) Exclude cut with ID _14368_210_4220_1_1530532714870_3595529_93-58002-0_sp1.1 from training. Duration: 0.5181875 2023-10-04 12:47:49,781 WARNING [train.py:1204] (2/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_110-224688-0 from training. Duration: 0.97 2023-10-04 12:47:51,160 WARNING [train.py:1204] (2/4) Exclude cut with ID _40402_210_18219_1_1533522324115_2953150_178-178004-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 12:47:51,176 WARNING [train.py:1204] (2/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_321-335475-0_sp1.1 from training. Duration: 0.4090625 2023-10-04 12:47:51,246 WARNING [train.py:1204] (2/4) Exclude cut with ID _66863_210_18053_1_1535698758334_8590319_1092-271784-0 from training. Duration: 0.96 2023-10-04 12:47:51,253 WARNING [train.py:1204] (2/4) Exclude cut with ID _28317_210_12742_1_1532480302381_8510325_607-213951-0_sp1.1 from training. Duration: 0.763625 2023-10-04 12:47:51,262 WARNING [train.py:1204] (2/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_468-283113-0 from training. Duration: 0.96 2023-10-04 12:47:54,103 WARNING [train.py:1204] (2/4) Exclude cut with ID _14922_210_6228_1_1530707435273_3693310_13-165512-0_sp1.1 from training. Duration: 0.7454375 2023-10-04 12:47:54,120 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_69630_210_7234_1_1535788670_910989_56-169762-0_sp1.1 from training. Duration: 0.92725 2023-10-04 12:47:54,276 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.conv_skip_rate, batch_count=1659653.3333333333, ans=0.0 2023-10-04 12:47:56,811 WARNING [train.py:1204] (2/4) Exclude cut with ID _67335_210_5219_1_1535525975652_4275229_121-80357-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 12:47:56,859 WARNING [train.py:1204] (2/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_126-12079-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 12:47:58,158 WARNING [train.py:1204] (2/4) Exclude cut with ID _5294_210_1747_1_1527249643928_3696179_342-270278-0_sp1.1 from training. Duration: 0.5 2023-10-04 12:47:59,672 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_30322_210_1925_1_1532843478_826817_35-328264-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 12:47:59,791 WARNING [train.py:1204] (2/4) Exclude cut with ID _8480_210_1874_1_1528589391205_6728330_755-112625-0_sp1.1 from training. Duration: 0.67275 2023-10-04 12:48:03,437 INFO [train.py:1046] (2/4) Epoch 47, batch 4600, loss[loss=0.1358, simple_loss=0.2202, pruned_loss=0.02572, over 24462.00 frames. ], tot_loss[loss=0.1531, simple_loss=0.2336, pruned_loss=0.03627, over 4725169.30 frames. ], batch size: 63, lr: 2.15e-03, grad_scale: 16.0 2023-10-04 12:48:03,493 WARNING [train.py:1204] (2/4) Exclude cut with ID _38670_210_15379_1_1533448810659_4391059_132-146412-0_sp1.1 from training. Duration: 0.97275 2023-10-04 12:48:03,579 WARNING [train.py:1204] (2/4) Exclude cut with ID _22020_210_2461_1_1531724164513_6326900_563-39691-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 12:48:06,364 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_9460_210_4211_1_1528954587_10049667_572-37035-0_sp1.1 from training. Duration: 0.72725 2023-10-04 12:48:06,376 WARNING [train.py:1204] (2/4) Exclude cut with ID _68777_210_4353_1_1535810390731_3117904_129-166012-0_sp0.9 from training. Duration: 0.9444375 2023-10-04 12:48:06,442 WARNING [train.py:1204] (2/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_487-91921-0_sp1.1 from training. Duration: 0.963625 2023-10-04 12:48:08,364 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_195-257512-0 from training. Duration: 0.67 2023-10-04 12:48:09,717 WARNING [train.py:1204] (2/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_188-260011-0_sp1.1 from training. Duration: 0.663625 2023-10-04 12:48:12,628 WARNING [train.py:1204] (2/4) Exclude cut with ID _8480_210_1874_1_1528589391205_6728330_309-139896-0_sp0.9 from training. Duration: 0.9 2023-10-04 12:48:13,946 WARNING [train.py:1204] (2/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_866-321248-0_sp1.1 from training. Duration: 0.963625 2023-10-04 12:48:14,124 WARNING [train.py:1204] (2/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_510-216513-0_sp1.1 from training. Duration: 0.97275 2023-10-04 12:48:17,261 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.789e+02 2.101e+02 2.353e+02 2.748e+02 3.773e+02, threshold=4.707e+02, percent-clipped=0.0 2023-10-04 12:48:21,488 WARNING [train.py:1204] (2/4) Exclude cut with ID _6653_210_2629_1_1527919172441_6732679_758-143677-0 from training. Duration: 0.98 2023-10-04 12:48:21,567 WARNING [train.py:1204] (2/4) Exclude cut with ID _55134_210_21982_1_1534590028377_3882170_50-214427-0_sp1.1 from training. Duration: 0.97275 2023-10-04 12:48:25,683 WARNING [train.py:1204] (2/4) Exclude cut with ID _31649_210_6228_1_1532584608063_4091078_482-301330-0_sp1.1 from training. Duration: 0.97275 2023-10-04 12:48:27,173 WARNING [train.py:1204] (2/4) Exclude cut with ID _35799_210_5854_1_1533263487887_3529071_490-326847-0_sp1.1 from training. Duration: 0.9 2023-10-04 12:48:28,415 WARNING [train.py:1204] (2/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_984-90716-0_sp1.1 from training. Duration: 0.963625 2023-10-04 12:48:33,756 WARNING [train.py:1204] (2/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_166-246557-0 from training. Duration: 0.64 2023-10-04 12:48:33,757 WARNING [train.py:1204] (2/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_51-200220-0_sp1.1 from training. Duration: 0.5090625 2023-10-04 12:48:35,171 WARNING [train.py:1204] (2/4) Exclude cut with ID _51382_210_23862_1_1534492801838_3650104_275-92796-0_sp1.1 from training. Duration: 0.936375 2023-10-04 12:48:39,840 WARNING [train.py:1204] (2/4) Exclude cut with ID _29853_210_7497_1_1532505600851_6642110_461-142829-0_sp1.1 from training. Duration: 0.97275 2023-10-04 12:48:41,189 WARNING [train.py:1204] (2/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_788-298907-0_sp0.9 from training. Duration: 0.988875 2023-10-04 12:48:43,872 WARNING [train.py:1204] (2/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_739-2098-0_sp1.1 from training. Duration: 0.836375 2023-10-04 12:48:46,684 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7543_210_6753_1_1528541907_1399322_165-323619-0 from training. Duration: 0.69 2023-10-04 12:48:48,619 WARNING [train.py:1204] (2/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_401-255491-0_sp1.1 from training. Duration: 0.636375 2023-10-04 12:48:52,808 WARNING [train.py:1204] (2/4) Exclude cut with ID _66873_210_5517_1_1535454136506_4293370_24-143283-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 12:48:55,512 WARNING [train.py:1204] (2/4) Exclude cut with ID _14459_210_2228_1_1531443641946_7516700_1221-280439-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 12:48:58,306 WARNING [train.py:1204] (2/4) Exclude cut with ID _59202_210_26984_1_1534816810612_3549174_469-213482-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 12:48:58,307 WARNING [train.py:1204] (2/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_148-158509-0_sp0.9 from training. Duration: 0.4555625 2023-10-04 12:48:58,340 WARNING [train.py:1204] (2/4) Exclude cut with ID _24314_210_4915_1_1532135078116_7170179_863-259095-0_sp1.1 from training. Duration: 0.97275 2023-10-04 12:48:59,755 WARNING [train.py:1204] (2/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_321-22821-0 from training. Duration: 0.89 2023-10-04 12:48:59,775 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_43831_210_2158_1_1533725502_5872791_55-163137-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 12:48:59,820 WARNING [train.py:1204] (2/4) Exclude cut with ID _27430_210_7770_1_1532604689375_1378590_26-25207-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 12:49:01,198 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_17613_210_2151_1_1531137569_2366160_223-316461-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 12:49:02,574 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_53679_210_4852_1_1534556726_4186141_541-213370-0_sp1.1 from training. Duration: 0.963625 2023-10-04 12:49:02,656 WARNING [train.py:1204] (2/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_369-295086-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 12:49:04,603 WARNING [train.py:1204] (2/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_91-267696-0 from training. Duration: 0.87 2023-10-04 12:49:04,655 WARNING [train.py:1204] (2/4) Exclude cut with ID _5294_210_1747_1_1527249643928_3696179_180-100713-0 from training. Duration: 0.81 2023-10-04 12:49:04,679 WARNING [train.py:1204] (2/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_32-335103-0 from training. Duration: 0.82 2023-10-04 12:49:04,684 WARNING [train.py:1204] (2/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_133-312727-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 12:49:06,365 WARNING [train.py:1204] (2/4) Exclude cut with ID _59288_210_5854_1_1535077757774_3412246_391-165934-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 12:49:08,151 WARNING [train.py:1204] (2/4) Exclude cut with ID _28323_210_16187_1_1532480395667_4100300_488-1281-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 12:49:08,227 WARNING [train.py:1204] (2/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_124-193207-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 12:49:12,230 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.3.encoder.layers.3.self_attn_weights.whiten_keys, num_groups=8, num_channels=256, metric=2.37 vs. limit=6.0 2023-10-04 12:49:18,384 INFO [train.py:1046] (2/4) Epoch 47, batch 4650, loss[loss=0.1458, simple_loss=0.2269, pruned_loss=0.03236, over 23319.00 frames. ], tot_loss[loss=0.1529, simple_loss=0.2334, pruned_loss=0.03615, over 4712745.77 frames. ], batch size: 93, lr: 2.15e-03, grad_scale: 16.0 2023-10-04 12:49:18,452 WARNING [train.py:1204] (2/4) Exclude cut with ID _14087_210_2253_1_1530529224512_3014029_226-242140-0_sp0.9 from training. Duration: 0.9 2023-10-04 12:49:20,669 WARNING [train.py:1204] (2/4) Exclude cut with ID _29277_210_3279_1_1532581109293_3173069_392-3694-0_sp1.1 from training. Duration: 0.936375 2023-10-04 12:49:21,388 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.2.encoder.layers.0.self_attn_weights.whiten_keys, num_groups=4, num_channels=128, metric=4.16 vs. limit=6.0 2023-10-04 12:49:21,935 WARNING [train.py:1204] (2/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_563-329842-0_sp1.1 from training. Duration: 0.97275 2023-10-04 12:49:21,986 WARNING [train.py:1204] (2/4) Exclude cut with ID _58583_210_5236_1_1534726835266_3574249_391-87755-0_sp1.1 from training. Duration: 0.92725 2023-10-04 12:49:22,008 WARNING [train.py:1204] (2/4) Exclude cut with ID _28427_210_4220_1_1532581240967_3061069_281-237827-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 12:49:22,022 WARNING [train.py:1204] (2/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_44-147898-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 12:49:23,383 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_24007_210_1794_1_1531918925_1663875_125-320772-0_sp1.1 from training. Duration: 0.97275 2023-10-04 12:49:26,187 WARNING [train.py:1204] (2/4) Exclude cut with ID _14368_210_4220_1_1530532714870_3595529_143-117421-0 from training. Duration: 0.58 2023-10-04 12:49:30,309 WARNING [train.py:1204] (2/4) Exclude cut with ID _69584_210_5219_1_1535785133775_4044450_325-7656-0_sp0.9 from training. Duration: 0.9555625 2023-10-04 12:49:33,486 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_37351_210_7925_1_1533012931_3812113_265-85780-0 from training. Duration: 0.88 2023-10-04 12:49:33,523 WARNING [train.py:1204] (2/4) Exclude cut with ID _18516_210_9206_1_1531704653206_3692406_331-237896-0_sp1.1 from training. Duration: 0.936375 2023-10-04 12:49:34,840 WARNING [train.py:1204] (2/4) Exclude cut with ID _14629_210_15709_1_1530604298182_956899_6-232695-0 from training. Duration: 0.94 2023-10-04 12:49:34,861 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7544_210_6753_1_1528587000_691468_87-178599-0_sp1.1 from training. Duration: 0.7909375 2023-10-04 12:49:34,921 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_326-303945-0 from training. Duration: 0.99 2023-10-04 12:49:34,936 WARNING [train.py:1204] (2/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_503-30719-0 from training. Duration: 0.98 2023-10-04 12:49:34,944 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_11305_210_1794_1_1529750428_7178148_299-344640-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 12:49:36,310 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_53071_210_16554_1_1534490405_2954066_265-170472-0_sp1.1 from training. Duration: 0.7545625 2023-10-04 12:49:40,948 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_174-277364-0_sp1.1 from training. Duration: 0.6454375 2023-10-04 12:49:42,523 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.conv_skip_rate, batch_count=1660120.0, ans=0.0 2023-10-04 12:49:43,681 WARNING [train.py:1204] (2/4) Exclude cut with ID _58414_210_8254_1_1535421609162_7151182_193-151884-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 12:49:43,702 WARNING [train.py:1204] (2/4) Exclude cut with ID IC0651W0104-322063 from training. Duration: 0.768 2023-10-04 12:49:46,529 WARNING [train.py:1204] (2/4) Exclude cut with ID _21881_210_3525_1_1531825242757_3798380_139-305982-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 12:49:46,756 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.bypass_mid.scale_min, batch_count=1660186.6666666667, ans=0.2 2023-10-04 12:49:48,047 WARNING [train.py:1204] (2/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_79-200392-0 from training. Duration: 0.97 2023-10-04 12:49:48,401 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.bypass_mid.scale_min, batch_count=1660186.6666666667, ans=0.2 2023-10-04 12:49:49,576 WARNING [train.py:1204] (2/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_451-280894-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 12:49:49,584 WARNING [train.py:1204] (2/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_304-273433-0_sp1.1 from training. Duration: 0.9 2023-10-04 12:49:50,948 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_5959_210_1794_1_1527400400_3445726_412-44052-0 from training. Duration: 0.77 2023-10-04 12:49:52,311 WARNING [train.py:1204] (2/4) Exclude cut with ID _4681_210_3525_1_1527251040321_1235713_148-26159-0_sp1.1 from training. Duration: 0.963625 2023-10-04 12:49:53,094 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.2.encoder.layers.2.feed_forward3.out_whiten, num_groups=1, num_channels=384, metric=9.43 vs. limit=15.0 2023-10-04 12:49:53,820 WARNING [train.py:1204] (2/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_887-249555-0_sp0.9 from training. Duration: 0.8555625 2023-10-04 12:49:56,035 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.conv_module1.balancer2.min_abs, batch_count=1660186.6666666667, ans=0.5 2023-10-04 12:49:57,236 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_25236_210_2158_1_1532051968_280567_37-6860-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 12:50:00,089 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.feed_forward3.hidden_balancer.prob, batch_count=1660186.6666666667, ans=0.125 2023-10-04 12:50:02,622 WARNING [train.py:1204] (2/4) Exclude cut with ID _29277_210_3279_1_1532581109293_3173069_417-115527-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 12:50:06,075 WARNING [train.py:1204] (2/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_552-134748-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 12:50:07,459 WARNING [train.py:1204] (2/4) Exclude cut with ID _43176_210_8254_1_1533524417469_4174339_188-174143-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 12:50:07,493 WARNING [train.py:1204] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_127-119109-0_sp1.1 from training. Duration: 0.6545625 2023-10-04 12:50:08,948 WARNING [train.py:1204] (2/4) Exclude cut with ID _14922_210_6228_1_1530707435273_3693310_38-187851-0 from training. Duration: 0.62 2023-10-04 12:50:08,979 WARNING [train.py:1204] (2/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_588-125165-0 from training. Duration: 0.83 2023-10-04 12:50:10,754 WARNING [train.py:1204] (2/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_344-152526-0_sp1.1 from training. Duration: 0.4818125 2023-10-04 12:50:10,756 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7002_210_1794_1_1527935634_4034444_487-236501-0 from training. Duration: 0.66 2023-10-04 12:50:12,700 WARNING [train.py:1204] (2/4) Exclude cut with ID _23825_210_13449_1_1531915313526_3497150_379-16981-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 12:50:14,241 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=1660253.3333333333, ans=0.1 2023-10-04 12:50:19,504 WARNING [train.py:1204] (2/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_297-5233-0_sp1.1 from training. Duration: 0.863625 2023-10-04 12:50:19,508 WARNING [train.py:1204] (2/4) Exclude cut with ID _41298_210_9019_1_1533430371450_4822143_178-283538-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 12:50:19,533 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7046_210_6753_1_1527895337_6920917_642-106799-0 from training. Duration: 0.92 2023-10-04 12:50:19,546 WARNING [train.py:1204] (2/4) Exclude cut with ID _6274_210_2084_1_1527937583837_7494020_532-291952-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 12:50:19,659 WARNING [train.py:1204] (2/4) Exclude cut with ID _47278_210_12041_1_1533902418349_3576110_260-270468-0_sp1.1 from training. Duration: 0.92725 2023-10-04 12:50:19,667 WARNING [train.py:1204] (2/4) Exclude cut with ID _14368_210_4220_1_1530532714870_3595529_144-3287-0_sp0.9 from training. Duration: 0.7444375 2023-10-04 12:50:22,311 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_162-262717-0_sp0.9 from training. Duration: 0.92225 2023-10-04 12:50:23,718 WARNING [train.py:1204] (2/4) Exclude cut with ID _3465_210_3525_1_1526175667060_3574049_62-335552-0_sp0.9 from training. Duration: 0.9666875 2023-10-04 12:50:23,732 WARNING [train.py:1204] (2/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_726-272852-0_sp1.1 from training. Duration: 0.92725 2023-10-04 12:50:24,995 WARNING [train.py:1204] (2/4) Exclude cut with ID _14898_210_9206_1_1530766812377_4177190_26-135492-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 12:50:28,601 WARNING [train.py:1204] (2/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_200-95304-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 12:50:28,632 WARNING [train.py:1204] (2/4) Exclude cut with ID _45012_210_12651_1_1533608435136_5371380_240-37101-0_sp1.1 from training. Duration: 0.8454375 2023-10-04 12:50:28,640 WARNING [train.py:1204] (2/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_132-269633-0_sp0.9 from training. Duration: 0.7555625 2023-10-04 12:50:29,936 WARNING [train.py:1204] (2/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_907-151398-0_sp0.9 from training. Duration: 0.411125 2023-10-04 12:50:31,309 WARNING [train.py:1204] (2/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_45-265684-0_sp0.9 from training. Duration: 0.8 2023-10-04 12:50:32,620 INFO [train.py:1046] (2/4) Epoch 47, batch 4700, loss[loss=0.1512, simple_loss=0.2437, pruned_loss=0.02934, over 24673.00 frames. ], tot_loss[loss=0.1532, simple_loss=0.2339, pruned_loss=0.03631, over 4714700.09 frames. ], batch size: 73, lr: 2.15e-03, grad_scale: 16.0 2023-10-04 12:50:32,663 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6291_210_3273_1_1527683649_1078498_74-292608-0 from training. Duration: 0.72 2023-10-04 12:50:40,864 WARNING [train.py:1204] (2/4) Exclude cut with ID _59202_210_26984_1_1534816810612_3549174_221-84819-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 12:50:42,197 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_33696_210_3852_1_1533082780_2663375_181-325595-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 12:50:42,240 WARNING [train.py:1204] (2/4) Exclude cut with ID _47251_210_18628_1_1533814063128_4137250_351-154105-0_sp1.1 from training. Duration: 0.963625 2023-10-04 12:50:44,873 WARNING [train.py:1204] (2/4) Exclude cut with ID _35501_210_9288_1_1532998507986_3192189_351-334740-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 12:50:45,007 WARNING [train.py:1204] (2/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_342-100157-0_sp1.1 from training. Duration: 0.4909375 2023-10-04 12:50:46,212 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.725e+02 2.148e+02 2.347e+02 2.759e+02 4.268e+02, threshold=4.695e+02, percent-clipped=0.0 2023-10-04 12:50:49,139 WARNING [train.py:1204] (2/4) Exclude cut with ID _6789_210_1534_1_1527857937218_3118950_262-48853-0 from training. Duration: 0.56 2023-10-04 12:50:49,165 WARNING [train.py:1204] (2/4) Exclude cut with ID _69884_210_9771_1_1535846392818_4166169_430-201020-0 from training. Duration: 0.97 2023-10-04 12:50:51,963 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_71-136518-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 12:50:52,062 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_28-291982-0_sp1.1 from training. Duration: 0.8818125 2023-10-04 12:50:53,361 WARNING [train.py:1204] (2/4) Exclude cut with ID _54218_210_12210_1_1534413646388_4409259_437-275326-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 12:50:54,809 WARNING [train.py:1204] (2/4) Exclude cut with ID _55134_210_21982_1_1534590028377_3882170_40-309139-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 12:50:57,659 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.balancer1.prob, batch_count=1660453.3333333333, ans=0.125 2023-10-04 12:50:59,679 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.conv_module1.balancer2.prob, batch_count=1660453.3333333333, ans=0.125 2023-10-04 12:51:02,116 WARNING [train.py:1204] (2/4) Exclude cut with ID _32704_210_8954_1_1533017488840_6747370_266-329755-0_sp1.1 from training. Duration: 0.8090625 2023-10-04 12:51:04,096 WARNING [train.py:1204] (2/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_21-174514-0_sp1.1 from training. Duration: 0.3909375 2023-10-04 12:51:05,562 WARNING [train.py:1204] (2/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_409-207378-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 12:51:13,372 WARNING [train.py:1204] (2/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_132-269633-0 from training. Duration: 0.68 2023-10-04 12:51:14,790 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_2245_210_2715_1_1525511548_3540254_310-52483-0_sp0.9 from training. Duration: 0.97775 2023-10-04 12:51:17,557 WARNING [train.py:1204] (2/4) Exclude cut with ID _27430_210_7770_1_1532604689375_1378590_47-77403-0_sp1.1 from training. Duration: 0.92725 2023-10-04 12:51:19,364 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.ff3_skip_rate, batch_count=1660586.6666666667, ans=0.0 2023-10-04 12:51:21,781 WARNING [train.py:1204] (2/4) Exclude cut with ID _7562_210_2084_1_1528887767916_3946229_186-326422-0 from training. Duration: 0.84 2023-10-04 12:51:21,887 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_14056_210_4163_1_1530604483_7572827_230-35993-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 12:51:26,025 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_23854_210_1794_1_1531969696_1329157_57-123491-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 12:51:27,427 WARNING [train.py:1204] (2/4) Exclude cut with ID _14747_210_8341_1_1530685797463_3654089_147-131229-0 from training. Duration: 0.76 2023-10-04 12:51:28,787 WARNING [train.py:1204] (2/4) Exclude cut with ID _30191_210_1664_1_1533189915047_7196670_430-19539-0_sp1.1 from training. Duration: 0.92725 2023-10-04 12:51:28,808 WARNING [train.py:1204] (2/4) Exclude cut with ID _67725_210_27767_1_1535608756671_5105359_44-50762-0_sp1.1 from training. Duration: 0.963625 2023-10-04 12:51:30,365 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.feed_forward3.hidden_balancer.prob, batch_count=1660653.3333333333, ans=0.125 2023-10-04 12:51:31,585 WARNING [train.py:1204] (2/4) Exclude cut with ID _27485_210_4916_1_1532739191677_3668269_217-161755-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 12:51:32,903 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_5931_210_2040_1_1527428481_4547999_321-193614-0_sp1.1 from training. Duration: 0.6090625 2023-10-04 12:51:32,928 WARNING [train.py:1204] (2/4) Exclude cut with ID _6267_210_2084_1_1527764658526_3894730_375-219887-0 from training. Duration: 0.67 2023-10-04 12:51:33,006 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9087W0022-538393 from training. Duration: 0.768 2023-10-04 12:51:34,962 WARNING [train.py:1204] (2/4) Exclude cut with ID _68777_210_4353_1_1535810390731_3117904_83-196459-0_sp1.1 from training. Duration: 0.963625 2023-10-04 12:51:38,227 WARNING [train.py:1204] (2/4) Exclude cut with ID _69884_210_9771_1_1535846392818_4166169_455-343812-0_sp1.1 from training. Duration: 0.92725 2023-10-04 12:51:38,228 WARNING [train.py:1204] (2/4) Exclude cut with ID _13916_210_9994_1_1530788341126_3763149_414-320174-0_sp1.1 from training. Duration: 0.92725 2023-10-04 12:51:38,232 WARNING [train.py:1204] (2/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_398-239819-0 from training. Duration: 0.99 2023-10-04 12:51:39,659 WARNING [train.py:1204] (2/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_199-232314-0_sp1.1 from training. Duration: 0.92725 2023-10-04 12:51:42,928 WARNING [train.py:1204] (2/4) Exclude cut with ID _2334_210_2667_1_1525608238880_4705129_459-104997-0 from training. Duration: 0.95 2023-10-04 12:51:46,334 WARNING [train.py:1204] (2/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_441-209395-0_sp1.1 from training. Duration: 0.936375 2023-10-04 12:51:47,644 INFO [train.py:1046] (2/4) Epoch 47, batch 4750, loss[loss=0.1652, simple_loss=0.2439, pruned_loss=0.04328, over 23768.00 frames. ], tot_loss[loss=0.1537, simple_loss=0.2343, pruned_loss=0.03661, over 4718088.60 frames. ], batch size: 179, lr: 2.15e-03, grad_scale: 16.0 2023-10-04 12:51:47,696 WARNING [train.py:1204] (2/4) Exclude cut with ID _27052_210_5986_1_1532329197187_3585009_320-80393-0_sp1.1 from training. Duration: 0.97275 2023-10-04 12:51:52,018 WARNING [train.py:1204] (2/4) Exclude cut with ID _21783_210_11357_1_1531742458730_3762679_89-217253-0_sp1.1 from training. Duration: 0.97275 2023-10-04 12:51:52,033 WARNING [train.py:1204] (2/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_453-282588-0_sp1.1 from training. Duration: 0.8909375 2023-10-04 12:51:54,619 WARNING [train.py:1204] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_88-341718-0 from training. Duration: 0.66 2023-10-04 12:51:54,666 WARNING [train.py:1204] (2/4) Exclude cut with ID _43552_210_8913_1_1533726031059_3523429_230-3521-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 12:51:54,838 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.1.bypass.skip_rate, batch_count=1660720.0, ans=0.035 2023-10-04 12:51:57,569 WARNING [train.py:1204] (2/4) Exclude cut with ID _9679_210_7497_1_1529129212906_4363929_512-313889-0 from training. Duration: 0.81 2023-10-04 12:51:58,924 WARNING [train.py:1204] (2/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_194-252800-0_sp1.1 from training. Duration: 0.8818125 2023-10-04 12:51:58,952 WARNING [train.py:1204] (2/4) Exclude cut with ID _55209_210_1531_1_1534675828996_4597160_239-293541-0_sp1.1 from training. Duration: 0.963625 2023-10-04 12:51:58,988 WARNING [train.py:1204] (2/4) Exclude cut with ID _35799_210_5854_1_1533263487887_3529071_15-325131-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 12:52:05,058 WARNING [train.py:1204] (2/4) Exclude cut with ID _14100_210_8341_1_1530529320613_3678069_279-55760-0 from training. Duration: 0.93 2023-10-04 12:52:08,378 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_489-230703-0_sp1.1 from training. Duration: 0.77275 2023-10-04 12:52:11,052 WARNING [train.py:1204] (2/4) Exclude cut with ID _6694_210_4599_1_1527852637490_3965529_144-185051-0 from training. Duration: 0.96 2023-10-04 12:52:12,274 WARNING [train.py:1204] (2/4) Exclude cut with ID _28461_210_2253_1_1532771775189_3041299_1-228155-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 12:52:15,121 WARNING [train.py:1204] (2/4) Exclude cut with ID _32704_210_8954_1_1533017488840_6747370_686-252323-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 12:52:15,123 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_40896_210_15710_1_1533433956_7100802_1075-294633-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 12:52:15,134 WARNING [train.py:1204] (2/4) Exclude cut with ID _22083_210_8254_1_1531702862439_3678970_30-92879-0_sp1.1 from training. Duration: 0.97275 2023-10-04 12:52:17,097 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9245W0121-616888 from training. Duration: 0.896 2023-10-04 12:52:17,100 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6786_210_1388_1_1527839699_4194495_378-46070-0 from training. Duration: 0.72 2023-10-04 12:52:21,129 WARNING [train.py:1204] (2/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_317-175062-0 from training. Duration: 0.82 2023-10-04 12:52:23,923 WARNING [train.py:1204] (2/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_238-89390-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 12:52:25,302 WARNING [train.py:1204] (2/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_524-310811-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 12:52:26,713 WARNING [train.py:1204] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_307-158462-0_sp0.9 from training. Duration: 0.7666875 2023-10-04 12:52:26,714 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9309W0191-640318 from training. Duration: 0.512 2023-10-04 12:52:26,726 WARNING [train.py:1204] (2/4) Exclude cut with ID _55153_210_2253_1_1534384786677_3162096_100-204617-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 12:52:28,157 WARNING [train.py:1204] (2/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_112-149213-0_sp1.1 from training. Duration: 0.863625 2023-10-04 12:52:31,046 WARNING [train.py:1204] (2/4) Exclude cut with ID _67831_210_12392_1_1535612464202_3092612_346-2148-0_sp1.1 from training. Duration: 0.8454375 2023-10-04 12:52:34,061 WARNING [train.py:1204] (2/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_51-117576-0_sp0.9 from training. Duration: 0.4444375 2023-10-04 12:52:34,082 WARNING [train.py:1204] (2/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_300-27235-0 from training. Duration: 0.87 2023-10-04 12:52:35,418 WARNING [train.py:1204] (2/4) Exclude cut with ID _40399_210_4353_1_1533305658194_3279406_195-116825-0_sp1.1 from training. Duration: 0.97275 2023-10-04 12:52:36,701 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7002_210_1794_1_1527939683_5157286_163-87552-0_sp0.9 from training. Duration: 0.9555625 2023-10-04 12:52:36,745 WARNING [train.py:1204] (2/4) Exclude cut with ID _19514_210_9259_1_1531530071330_5084230_560-237597-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 12:52:38,621 WARNING [train.py:1204] (2/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_41-234353-0_sp1.1 from training. Duration: 0.6181875 2023-10-04 12:52:38,647 WARNING [train.py:1204] (2/4) Exclude cut with ID _6274_210_2084_1_1527937583837_7494020_769-168302-0 from training. Duration: 0.71 2023-10-04 12:52:42,796 WARNING [train.py:1204] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_574-45541-0 from training. Duration: 0.65 2023-10-04 12:52:44,225 WARNING [train.py:1204] (2/4) Exclude cut with ID _50310_210_15488_1_1534068043345_3641080_262-67120-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 12:52:47,879 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_53071_210_16554_1_1534493434_1276796_35-11927-0_sp1.1 from training. Duration: 0.963625 2023-10-04 12:52:47,880 WARNING [train.py:1204] (2/4) Exclude cut with ID _3992_210_1747_1_1526644816031_3731060_107-251434-0 from training. Duration: 0.99 2023-10-04 12:52:47,932 WARNING [train.py:1204] (2/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_412-54488-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 12:52:49,283 WARNING [train.py:1204] (2/4) Exclude cut with ID _18516_210_9206_1_1531704653206_3692406_77-258490-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 12:52:51,919 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_40047_210_2290_1_1533290519_2374311_195-165693-0_sp0.9 from training. Duration: 0.988875 2023-10-04 12:52:53,243 WARNING [train.py:1204] (2/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_296-36971-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 12:52:53,306 WARNING [train.py:1204] (2/4) Exclude cut with ID _14970_210_15709_1_1530773543493_1715730_187-20259-0_sp1.1 from training. Duration: 0.8090625 2023-10-04 12:52:53,547 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.conv_module2.balancer1.prob, batch_count=1660986.6666666667, ans=0.125 2023-10-04 12:52:58,637 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_53373_210_5399_1_1534229471_4715695_135-305927-0_sp1.1 from training. Duration: 0.936375 2023-10-04 12:52:58,662 WARNING [train.py:1204] (2/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_60-18123-0 from training. Duration: 0.88 2023-10-04 12:52:58,743 WARNING [train.py:1204] (2/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_166-166990-0 from training. Duration: 0.96 2023-10-04 12:52:59,014 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.ff3_skip_rate, batch_count=1661053.3333333333, ans=0.0 2023-10-04 12:53:00,048 INFO [train.py:1046] (2/4) Epoch 47, batch 4800, loss[loss=0.1572, simple_loss=0.2444, pruned_loss=0.035, over 24564.00 frames. ], tot_loss[loss=0.1547, simple_loss=0.2357, pruned_loss=0.03681, over 4723067.02 frames. ], batch size: 71, lr: 2.15e-03, grad_scale: 16.0 2023-10-04 12:53:00,149 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_5438_210_1794_1_1527336687_2600009_233-58072-0 from training. Duration: 0.77 2023-10-04 12:53:01,696 WARNING [train.py:1204] (2/4) Exclude cut with ID _8833_210_5219_1_1528592665210_7553059_611-80313-0_sp0.9 from training. Duration: 0.711125 2023-10-04 12:53:01,718 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_53555_210_5632_1_1534595220_7232297_118-43239-0_sp1.1 from training. Duration: 0.936375 2023-10-04 12:53:04,346 WARNING [train.py:1204] (2/4) Exclude cut with ID _12369_210_4381_1_1530356078581_4388520_480-13146-0 from training. Duration: 0.85 2023-10-04 12:53:07,926 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_892-28814-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 12:53:09,395 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_24899_210_16554_1_1532046422_3827462_160-21255-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 12:53:14,121 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_154-316450-0_sp1.1 from training. Duration: 0.6909375 2023-10-04 12:53:15,288 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.757e+02 2.075e+02 2.450e+02 3.076e+02 6.025e+02, threshold=4.900e+02, percent-clipped=3.0 2023-10-04 12:53:15,375 WARNING [train.py:1204] (2/4) Exclude cut with ID _30196_210_1664_1_1533708496522_7074140_1225-301232-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 12:53:15,413 WARNING [train.py:1204] (2/4) Exclude cut with ID _28783_210_7943_1_1532671164808_7155479_414-208100-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 12:53:17,373 WARNING [train.py:1204] (2/4) Exclude cut with ID _19023_210_4353_1_1531378892552_5820780_83-254794-0 from training. Duration: 0.86 2023-10-04 12:53:17,431 WARNING [train.py:1204] (2/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_591-83752-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 12:53:19,260 WARNING [train.py:1204] (2/4) Exclude cut with ID _29877_210_6228_1_1532397791106_5276149_266-62291-0_sp1.1 from training. Duration: 0.7909375 2023-10-04 12:53:20,676 WARNING [train.py:1204] (2/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_280-161633-0_sp1.1 from training. Duration: 0.663625 2023-10-04 12:53:24,860 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7543_210_6753_1_1528539468_333764_39-156634-0_sp1.1 from training. Duration: 0.97275 2023-10-04 12:53:26,183 WARNING [train.py:1204] (2/4) Exclude cut with ID _67499_210_17282_1_1535509805220_3692430_505-312248-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 12:53:26,213 WARNING [train.py:1204] (2/4) Exclude cut with ID _5294_210_1747_1_1527249643928_3696179_180-100713-0_sp0.9 from training. Duration: 0.9 2023-10-04 12:53:27,536 WARNING [train.py:1204] (2/4) Exclude cut with ID _26528_210_9070_1_1532570410842_3851310_508-148132-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 12:53:27,546 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_211-339622-0_sp1.1 from training. Duration: 0.4090625 2023-10-04 12:53:27,559 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_40896_210_15710_1_1533433956_7100802_339-138356-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 12:53:28,953 WARNING [train.py:1204] (2/4) Exclude cut with ID _27060_210_5986_1_1532674838922_3522549_211-19933-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 12:53:31,672 WARNING [train.py:1204] (2/4) Exclude cut with ID _60229_210_27767_1_1534838406158_3655350_457-117923-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 12:53:33,064 WARNING [train.py:1204] (2/4) Exclude cut with ID _55429_210_10626_1_1534467588527_7174143_460-141439-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 12:53:35,717 WARNING [train.py:1204] (2/4) Exclude cut with ID _59170_210_6721_1_1534983633960_4203335_263-10780-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 12:53:35,736 WARNING [train.py:1204] (2/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_872-264525-0_sp0.9 from training. Duration: 0.72225 2023-10-04 12:53:37,083 WARNING [train.py:1204] (2/4) Exclude cut with ID _7624_210_1531_1_1528109802198_4933101_418-349834-0_sp0.9 from training. Duration: 0.6666875 2023-10-04 12:53:37,177 WARNING [train.py:1204] (2/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_27-53998-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 12:53:38,935 WARNING [train.py:1204] (2/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_202-183922-0 from training. Duration: 0.85 2023-10-04 12:53:38,954 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7002_210_1794_1_1527939683_5157286_1-281264-0 from training. Duration: 0.44 2023-10-04 12:53:40,462 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.feed_forward1.hidden_balancer.prob, batch_count=1661186.6666666667, ans=0.125 2023-10-04 12:53:41,984 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_53753_210_7234_1_1534320003_3594148_493-9314-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 12:53:42,004 WARNING [train.py:1204] (2/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_217-95056-0_sp1.1 from training. Duration: 0.8909375 2023-10-04 12:53:43,314 WARNING [train.py:1204] (2/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_406-45086-0_sp1.1 from training. Duration: 0.72725 2023-10-04 12:53:43,320 WARNING [train.py:1204] (2/4) Exclude cut with ID _31649_210_6228_1_1532584608063_4091078_505-213821-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 12:53:43,336 WARNING [train.py:1204] (2/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_317-175062-0_sp0.9 from training. Duration: 0.911125 2023-10-04 12:53:44,718 WARNING [train.py:1204] (2/4) Exclude cut with ID _5444_210_2151_1_1527418808464_5312210_726-27165-0_sp1.1 from training. Duration: 0.6090625 2023-10-04 12:53:44,766 WARNING [train.py:1204] (2/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_494-170437-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 12:53:48,146 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_474-64054-0_sp1.1 from training. Duration: 0.936375 2023-10-04 12:53:50,356 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=1661253.3333333333, ans=0.1 2023-10-04 12:53:51,449 WARNING [train.py:1204] (2/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_521-7556-0_sp1.1 from training. Duration: 0.97275 2023-10-04 12:53:54,042 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_269-348505-0_sp1.1 from training. Duration: 0.92725 2023-10-04 12:53:57,103 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.feed_forward1.out_proj.dropout_p, batch_count=1661253.3333333333, ans=0.1 2023-10-04 12:53:58,274 WARNING [train.py:1204] (2/4) Exclude cut with ID _21878_210_3525_1_1531738772002_7264330_559-184862-0 from training. Duration: 0.94 2023-10-04 12:53:58,302 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_68702_210_23689_1_1535799577_3420068_92-76040-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 12:53:59,683 WARNING [train.py:1204] (2/4) Exclude cut with ID _22826_210_9994_1_1531908201522_3279169_227-102052-0_sp1.1 from training. Duration: 0.97275 2023-10-04 12:53:59,713 WARNING [train.py:1204] (2/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_718-250907-0_sp0.9 from training. Duration: 0.7666875 2023-10-04 12:54:00,999 WARNING [train.py:1204] (2/4) Exclude cut with ID _14069_210_5632_1_1530512692033_4803390_352-143891-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 12:54:04,063 WARNING [train.py:1204] (2/4) Exclude cut with ID _6267_210_2084_1_1527764658526_3894730_65-106602-0_sp1.1 from training. Duration: 0.936375 2023-10-04 12:54:05,414 WARNING [train.py:1204] (2/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_202-183922-0_sp0.9 from training. Duration: 0.9444375 2023-10-04 12:54:05,421 WARNING [train.py:1204] (2/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_465-268827-0_sp1.1 from training. Duration: 0.97275 2023-10-04 12:54:05,477 WARNING [train.py:1204] (2/4) Exclude cut with ID _12369_210_4381_1_1530356078581_4388520_58-29064-0_sp1.1 from training. Duration: 0.9 2023-10-04 12:54:05,506 WARNING [train.py:1204] (2/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_1-99680-0_sp0.9 from training. Duration: 0.7444375 2023-10-04 12:54:06,903 WARNING [train.py:1204] (2/4) Exclude cut with ID _13253_210_1925_1_1530757775819_3577873_191-144977-0_sp1.1 from training. Duration: 0.7090625 2023-10-04 12:54:07,155 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=1661320.0, ans=0.1 2023-10-04 12:54:09,660 WARNING [train.py:1204] (2/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_276-112684-0_sp1.1 from training. Duration: 0.92725 2023-10-04 12:54:11,426 WARNING [train.py:1204] (2/4) Exclude cut with ID _64639_210_5816_1_1535367619437_3528000_358-411-0_sp1.1 from training. Duration: 0.97275 2023-10-04 12:54:11,445 WARNING [train.py:1204] (2/4) Exclude cut with ID _31649_210_6228_1_1532584608063_4091078_106-21199-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 12:54:13,254 WARNING [train.py:1204] (2/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_901-79438-0 from training. Duration: 0.94 2023-10-04 12:54:13,427 WARNING [train.py:1204] (2/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_718-250907-0 from training. Duration: 0.69 2023-10-04 12:54:14,688 INFO [train.py:1046] (2/4) Epoch 47, batch 4850, loss[loss=0.1472, simple_loss=0.2249, pruned_loss=0.03477, over 23580.00 frames. ], tot_loss[loss=0.154, simple_loss=0.2348, pruned_loss=0.0366, over 4729389.18 frames. ], batch size: 149, lr: 2.15e-03, grad_scale: 16.0 2023-10-04 12:54:14,738 WARNING [train.py:1204] (2/4) Exclude cut with ID _5287_210_2084_1_1527418883040_3878670_363-194161-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 12:54:14,742 WARNING [train.py:1204] (2/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_586-321321-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 12:54:14,816 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_68900_210_2087_1_1535713157_3708529_164-292661-0_sp1.1 from training. Duration: 0.963625 2023-10-04 12:54:14,817 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_26433_210_12108_1_1532157158_4056480_114-144878-0_sp1.1 from training. Duration: 0.97275 2023-10-04 12:54:17,560 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_15517_210_6753_1_1530954008_3487336_96-341874-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 12:54:22,859 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.self_attn_weights.pos_emb_skip_rate, batch_count=1661386.6666666667, ans=0.0 2023-10-04 12:54:26,814 WARNING [train.py:1204] (2/4) Exclude cut with ID _14268_210_9206_1_1530622771285_4714099_637-86630-0 from training. Duration: 0.51 2023-10-04 12:54:26,904 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_28211_210_6388_1_1532861506_4523224_121-320092-0_sp1.1 from training. Duration: 0.92725 2023-10-04 12:54:31,053 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_141-89113-0_sp0.9 from training. Duration: 0.9555625 2023-10-04 12:54:32,384 WARNING [train.py:1204] (2/4) Exclude cut with ID _6789_210_1534_1_1527857937218_3118950_311-61262-0_sp1.1 from training. Duration: 0.5909375 2023-10-04 12:54:32,409 WARNING [train.py:1204] (2/4) Exclude cut with ID _58480_210_12392_1_1534811121111_3646160_389-200461-0_sp1.1 from training. Duration: 0.97275 2023-10-04 12:54:35,209 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.0.conv_module1.balancer1.min_positive, batch_count=1661453.3333333333, ans=0.025 2023-10-04 12:54:36,411 WARNING [train.py:1204] (2/4) Exclude cut with ID _41326_210_5854_1_1533778076509_3492080_28-156553-0_sp1.1 from training. Duration: 0.92725 2023-10-04 12:54:38,013 WARNING [train.py:1204] (2/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_577-287150-0_sp1.1 from training. Duration: 0.7454375 2023-10-04 12:54:39,246 WARNING [train.py:1204] (2/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_40-129756-0_sp1.1 from training. Duration: 0.736375 2023-10-04 12:54:39,263 WARNING [train.py:1204] (2/4) Exclude cut with ID _60113_210_16271_1_1535164212481_4386079_490-280290-0 from training. Duration: 0.98 2023-10-04 12:54:42,482 WARNING [train.py:1204] (2/4) Exclude cut with ID _58059_210_5854_1_1534901471149_3164808_34-39905-0_sp1.1 from training. Duration: 0.963625 2023-10-04 12:54:44,516 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_791-197891-0_sp1.1 from training. Duration: 0.763625 2023-10-04 12:54:44,536 WARNING [train.py:1204] (2/4) Exclude cut with ID _6789_210_1534_1_1527857937218_3118950_262-48853-0_sp1.1 from training. Duration: 0.5090625 2023-10-04 12:54:45,891 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_2655_210_2087_1_1525688959_3897400_52-279899-0_sp0.9 from training. Duration: 0.7666875 2023-10-04 12:54:45,894 WARNING [train.py:1204] (2/4) Exclude cut with ID _14884_210_2252_1_1530707459689_5561250_114-201399-0 from training. Duration: 0.76 2023-10-04 12:54:49,883 WARNING [train.py:1204] (2/4) Exclude cut with ID _19023_210_4353_1_1531378892552_5820780_83-254794-0_sp0.9 from training. Duration: 0.9555625 2023-10-04 12:54:49,905 WARNING [train.py:1204] (2/4) Exclude cut with ID _53489_210_6228_1_1534298357169_3835970_10-297919-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 12:54:55,214 WARNING [train.py:1204] (2/4) Exclude cut with ID _44585_210_18628_1_1533605350387_4518199_418-3055-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 12:54:55,233 WARNING [train.py:1204] (2/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_310-109461-0 from training. Duration: 0.8 2023-10-04 12:54:56,564 WARNING [train.py:1204] (2/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_235-262124-0 from training. Duration: 0.67 2023-10-04 12:54:57,866 WARNING [train.py:1204] (2/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_195-167011-0_sp1.1 from training. Duration: 0.6818125 2023-10-04 12:54:59,552 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.bypass_mid.scale_min, batch_count=1661586.6666666667, ans=0.2 2023-10-04 12:55:03,523 WARNING [train.py:1204] (2/4) Exclude cut with ID _7852_210_5219_1_1528286471316_8139400_188-55162-0_sp1.1 from training. Duration: 0.936375 2023-10-04 12:55:04,782 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_35361_210_4238_1_1533262810_7793476_675-87012-0 from training. Duration: 0.89 2023-10-04 12:55:04,859 WARNING [train.py:1204] (2/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_465-81427-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 12:55:04,869 WARNING [train.py:1204] (2/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_597-154640-0_sp1.1 from training. Duration: 0.8181875 2023-10-04 12:55:06,212 WARNING [train.py:1204] (2/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_186-309161-0_sp0.9 from training. Duration: 0.6 2023-10-04 12:55:06,297 WARNING [train.py:1204] (2/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_546-119312-0 from training. Duration: 0.99 2023-10-04 12:55:06,301 WARNING [train.py:1204] (2/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_330-240060-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 12:55:07,656 WARNING [train.py:1204] (2/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_477-255782-0 from training. Duration: 0.82 2023-10-04 12:55:07,680 WARNING [train.py:1204] (2/4) Exclude cut with ID _14970_210_15709_1_1530773543493_1715730_150-248939-0_sp1.1 from training. Duration: 0.97275 2023-10-04 12:55:07,763 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_556-206615-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 12:55:07,989 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.conv_module1.balancer2.prob, batch_count=1661586.6666666667, ans=0.125 2023-10-04 12:55:09,121 WARNING [train.py:1204] (2/4) Exclude cut with ID _3465_210_3525_1_1526175667060_3574049_308-231735-0 from training. Duration: 0.73 2023-10-04 12:55:17,030 WARNING [train.py:1204] (2/4) Exclude cut with ID _27485_210_4916_1_1532739191677_3668269_66-236571-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 12:55:20,353 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.1.ff3_skip_rate, batch_count=1661653.3333333333, ans=0.0 2023-10-04 12:55:21,112 INFO [scaling.py:1118] (2/4) WithLoss: name=encoder.encoders.4.encoder.layers.2.self_attn_weights, loss-sum=0.000e+00 2023-10-04 12:55:23,446 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_2221_210_1988_1_1525597051_4935541_415-296882-0_sp0.9 from training. Duration: 0.8555625 2023-10-04 12:55:23,454 WARNING [train.py:1204] (2/4) Exclude cut with ID _64521_210_14699_1_1535243427259_3815328_231-78630-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 12:55:28,914 INFO [train.py:1046] (2/4) Epoch 47, batch 4900, loss[loss=0.1536, simple_loss=0.2415, pruned_loss=0.03289, over 24355.00 frames. ], tot_loss[loss=0.1535, simple_loss=0.2339, pruned_loss=0.03658, over 4708076.03 frames. ], batch size: 77, lr: 2.15e-03, grad_scale: 16.0 2023-10-04 12:55:28,983 WARNING [train.py:1204] (2/4) Exclude cut with ID _13942_210_9988_1_1530604906036_3733360_499-55676-0 from training. Duration: 0.62 2023-10-04 12:55:28,985 WARNING [train.py:1204] (2/4) Exclude cut with ID _49376_210_5986_1_1533985278732_3370740_301-154763-0_sp1.1 from training. Duration: 0.8909375 2023-10-04 12:55:33,209 WARNING [train.py:1204] (2/4) Exclude cut with ID _27485_210_4916_1_1532739191677_3668269_255-216698-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 12:55:34,500 WARNING [train.py:1204] (2/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_137-334143-0_sp1.1 from training. Duration: 0.97275 2023-10-04 12:55:34,526 WARNING [train.py:1204] (2/4) Exclude cut with ID _6456_210_1534_1_1527681634212_3181049_215-309359-0_sp1.1 from training. Duration: 0.8 2023-10-04 12:55:36,552 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.3.encoder.layers.2.nonlin_attention.whiten1, num_groups=1, num_channels=384, metric=3.61 vs. limit=10.0 2023-10-04 12:55:38,129 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.0.layers.0.self_attn1.whiten, num_groups=1, num_channels=192, metric=11.08 vs. limit=22.5 2023-10-04 12:55:38,429 WARNING [train.py:1204] (2/4) Exclude cut with ID _65640_210_6310_1_1535368201445_3173250_209-13008-0 from training. Duration: 0.99 2023-10-04 12:55:43,496 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.716e+02 2.085e+02 2.441e+02 2.898e+02 5.040e+02, threshold=4.881e+02, percent-clipped=1.0 2023-10-04 12:55:44,380 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.2.encoder.layers.2.self_attn1.whiten, num_groups=1, num_channels=384, metric=17.00 vs. limit=22.5 2023-10-04 12:55:44,972 WARNING [train.py:1204] (2/4) Exclude cut with ID _12369_210_4381_1_1530356078581_4388520_58-29064-0 from training. Duration: 0.99 2023-10-04 12:55:47,765 WARNING [train.py:1204] (2/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_410-287857-0 from training. Duration: 0.93 2023-10-04 12:55:49,193 WARNING [train.py:1204] (2/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_153-153537-0 from training. Duration: 0.91 2023-10-04 12:55:49,244 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7544_210_6753_1_1528587722_3576686_172-171750-0_sp0.9 from training. Duration: 0.72225 2023-10-04 12:55:49,282 WARNING [train.py:1204] (2/4) Exclude cut with ID _64521_210_14699_1_1535243427259_3815328_464-183853-0_sp1.1 from training. Duration: 0.97275 2023-10-04 12:55:49,296 WARNING [train.py:1204] (2/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_385-210308-0_sp1.1 from training. Duration: 0.963625 2023-10-04 12:55:49,304 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_46013_210_7234_1_1533776362_3744032_354-261865-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 12:55:49,310 WARNING [train.py:1204] (2/4) Exclude cut with ID _3067_210_2667_1_1526212974640_4503168_250-285937-0_sp1.1 from training. Duration: 0.72725 2023-10-04 12:55:50,647 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_40036_210_13405_1_1533648647_1315388_48-146016-0 from training. Duration: 0.93 2023-10-04 12:55:55,834 WARNING [train.py:1204] (2/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_280-161633-0 from training. Duration: 0.73 2023-10-04 12:55:55,887 WARNING [train.py:1204] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_308-161067-0_sp0.9 from training. Duration: 0.7333125 2023-10-04 12:55:57,218 WARNING [train.py:1204] (2/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_68-212801-0_sp0.9 from training. Duration: 0.811125 2023-10-04 12:55:57,273 WARNING [train.py:1204] (2/4) Exclude cut with ID _6789_210_1534_1_1527857937218_3118950_311-61262-0_sp0.9 from training. Duration: 0.72225 2023-10-04 12:55:59,975 WARNING [train.py:1204] (2/4) Exclude cut with ID _14632_210_9988_1_1530691279392_3923200_102-204477-0_sp1.1 from training. Duration: 0.8818125 2023-10-04 12:56:00,038 WARNING [train.py:1204] (2/4) Exclude cut with ID _65242_210_18628_1_1535369393717_3823149_290-117517-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 12:56:01,332 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_69630_210_7234_1_1535788670_910989_35-337140-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 12:56:01,345 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_5618_210_1874_1_1527311008_5851_1-214501-0 from training. Duration: 0.689 2023-10-04 12:56:04,113 WARNING [train.py:1204] (2/4) Exclude cut with ID _8064_210_5399_1_1528372299291_4282973_168-50337-0_sp0.9 from training. Duration: 0.9333125 2023-10-04 12:56:05,421 WARNING [train.py:1204] (2/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_185-14438-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 12:56:05,434 WARNING [train.py:1204] (2/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_61-327062-0 from training. Duration: 0.81 2023-10-04 12:56:05,440 WARNING [train.py:1204] (2/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_98-741-0 from training. Duration: 0.8 2023-10-04 12:56:09,614 WARNING [train.py:1204] (2/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_312-204798-0 from training. Duration: 0.93 2023-10-04 12:56:11,050 WARNING [train.py:1204] (2/4) Exclude cut with ID _14998_210_5219_1_1530705582080_7474970_787-294416-0_sp0.9 from training. Duration: 0.911125 2023-10-04 12:56:11,153 WARNING [train.py:1204] (2/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_140-181176-0_sp1.1 from training. Duration: 0.836375 2023-10-04 12:56:11,183 WARNING [train.py:1204] (2/4) Exclude cut with ID _14687_210_5854_1_1530703838259_3664889_115-239046-0_sp1.1 from training. Duration: 0.6090625 2023-10-04 12:56:13,047 WARNING [train.py:1204] (2/4) Exclude cut with ID _56423_210_5854_1_1534896061049_3165969_370-230614-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 12:56:13,939 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.0.layers.1.feed_forward2.out_whiten, num_groups=1, num_channels=192, metric=4.81 vs. limit=15.0 2023-10-04 12:56:14,478 WARNING [train.py:1204] (2/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_406-275649-0_sp0.9 from training. Duration: 0.4666875 2023-10-04 12:56:14,489 WARNING [train.py:1204] (2/4) Exclude cut with ID _15150_210_8341_1_1530752411803_7256384_22-138274-0_sp1.1 from training. Duration: 0.87275 2023-10-04 12:56:14,540 WARNING [train.py:1204] (2/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_125-125818-0 from training. Duration: 0.96 2023-10-04 12:56:14,692 INFO [scaling.py:1118] (2/4) WithLoss: name=encoder.encoders.0.layers.1.self_attn_weights, loss-sum=0.000e+00 2023-10-04 12:56:17,861 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_4399_210_1874_1_1526705485_2400757_220-47974-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 12:56:19,315 WARNING [train.py:1204] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_308-161067-0_sp1.1 from training. Duration: 0.6 2023-10-04 12:56:20,734 WARNING [train.py:1204] (2/4) Exclude cut with ID _53489_210_6228_1_1534298357169_3835970_102-57645-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 12:56:25,637 WARNING [train.py:1204] (2/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_178-177582-0 from training. Duration: 0.74 2023-10-04 12:56:25,694 WARNING [train.py:1204] (2/4) Exclude cut with ID _14349_210_8341_1_1530615710818_3442470_170-336541-0_sp1.1 from training. Duration: 0.7818125 2023-10-04 12:56:28,356 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_521-214276-0_sp0.9 from training. Duration: 0.488875 2023-10-04 12:56:28,400 WARNING [train.py:1204] (2/4) Exclude cut with ID _66070_210_7640_1_1535441393096_7805310_489-292631-0 from training. Duration: 0.79 2023-10-04 12:56:30,029 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.0.conv_module2.balancer2.min_positive, batch_count=1661986.6666666667, ans=0.05 2023-10-04 12:56:32,787 WARNING [train.py:1204] (2/4) Exclude cut with ID _14069_210_5632_1_1530512692033_4803390_438-340023-0_sp1.1 from training. Duration: 0.936375 2023-10-04 12:56:34,152 WARNING [train.py:1204] (2/4) Exclude cut with ID _7852_210_5219_1_1528286471316_8139400_242-105336-0_sp1.1 from training. Duration: 0.6545625 2023-10-04 12:56:35,563 WARNING [train.py:1204] (2/4) Exclude cut with ID _3867_210_1883_1_1526641200575_4475830_272-215563-0 from training. Duration: 0.57 2023-10-04 12:56:35,576 WARNING [train.py:1204] (2/4) Exclude cut with ID _14368_210_4220_1_1530532714870_3595529_143-117421-0_sp0.9 from training. Duration: 0.6444375 2023-10-04 12:56:35,583 WARNING [train.py:1204] (2/4) Exclude cut with ID _9235_210_5219_1_1528891154253_8046120_30-326681-0_sp1.1 from training. Duration: 0.7545625 2023-10-04 12:56:38,227 WARNING [train.py:1204] (2/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_538-218448-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 12:56:41,063 WARNING [train.py:1204] (2/4) Exclude cut with ID _33640_210_2655_1_1533117605479_4855469_60-192970-0_sp1.1 from training. Duration: 0.963625 2023-10-04 12:56:41,071 WARNING [train.py:1204] (2/4) Exclude cut with ID _65585_210_12210_1_1535450451584_3764098_112-294301-0_sp1.1 from training. Duration: 0.8 2023-10-04 12:56:41,097 WARNING [train.py:1204] (2/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_643-37412-0_sp1.1 from training. Duration: 0.936375 2023-10-04 12:56:42,874 INFO [train.py:1046] (2/4) Epoch 47, batch 4950, loss[loss=0.1538, simple_loss=0.2326, pruned_loss=0.03754, over 23251.00 frames. ], tot_loss[loss=0.1525, simple_loss=0.2328, pruned_loss=0.03608, over 4709003.49 frames. ], batch size: 105, lr: 2.15e-03, grad_scale: 16.0 2023-10-04 12:56:42,937 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_9302_210_2550_1_1528889962_4655416_638-301620-0_sp1.1 from training. Duration: 0.42725 2023-10-04 12:56:44,368 WARNING [train.py:1204] (2/4) Exclude cut with ID _3867_210_1883_1_1526641200575_4475830_272-215563-0_sp1.1 from training. Duration: 0.5181875 2023-10-04 12:56:47,516 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_53679_210_4852_1_1534556726_4186141_358-154143-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 12:56:47,534 WARNING [train.py:1204] (2/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_841-341679-0_sp0.9 from training. Duration: 0.6444375 2023-10-04 12:56:50,348 WARNING [train.py:1204] (2/4) Exclude cut with ID _7160_210_1876_1_1527940923231_3703799_302-150728-0 from training. Duration: 0.78 2023-10-04 12:56:50,385 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_125-203105-0 from training. Duration: 0.8 2023-10-04 12:56:50,399 WARNING [train.py:1204] (2/4) Exclude cut with ID _5585_210_2667_1_1527418813324_8007340_89-332072-0_sp0.9 from training. Duration: 0.711125 2023-10-04 12:56:51,718 WARNING [train.py:1204] (2/4) Exclude cut with ID _3992_210_1747_1_1526644816031_3731060_36-147855-0 from training. Duration: 0.78 2023-10-04 12:56:51,742 WARNING [train.py:1204] (2/4) Exclude cut with ID _59256_210_5986_1_1535076085997_3589120_172-239045-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 12:56:51,756 WARNING [train.py:1204] (2/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_472-216663-0_sp1.1 from training. Duration: 0.836375 2023-10-04 12:56:53,094 WARNING [train.py:1204] (2/4) Exclude cut with ID _36512_210_6987_1_1533033143998_3126069_181-306500-0_sp1.1 from training. Duration: 0.863625 2023-10-04 12:56:53,771 WARNING [train.py:1204] (2/4) Exclude cut with ID _56524_210_9642_1_1534572011779_3619170_492-95008-0_sp1.1 from training. Duration: 0.97275 2023-10-04 12:56:56,236 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_33348_210_5632_1_1533086912_3324030_119-90326-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 12:56:56,274 WARNING [train.py:1204] (2/4) Exclude cut with ID _14368_210_4220_1_1530532714870_3595529_231-137892-0_sp1.1 from training. Duration: 0.9 2023-10-04 12:56:57,691 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8453_210_4211_1_1528502940_8562730_596-306820-0_sp1.1 from training. Duration: 0.8818125 2023-10-04 12:56:59,118 WARNING [train.py:1204] (2/4) Exclude cut with ID _27052_210_5986_1_1532329197187_3585009_50-242750-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 12:57:00,589 WARNING [train.py:1204] (2/4) Exclude cut with ID _13913_210_9994_1_1530579615187_3383897_358-98435-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 12:57:01,835 WARNING [train.py:1204] (2/4) Exclude cut with ID _27013_210_1389_1_1532437173240_3252649_399-229778-0_sp1.1 from training. Duration: 0.963625 2023-10-04 12:57:03,464 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.0.feed_forward3.hidden_balancer.prob, batch_count=1662120.0, ans=0.125 2023-10-04 12:57:04,722 WARNING [train.py:1204] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_185-174805-0_sp0.9 from training. Duration: 0.7555625 2023-10-04 12:57:08,986 WARNING [train.py:1204] (2/4) Exclude cut with ID _65585_210_12210_1_1535450451584_3764098_196-221014-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 12:57:11,705 WARNING [train.py:1204] (2/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_202-293523-0_sp1.1 from training. Duration: 0.6545625 2023-10-04 12:57:13,527 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8275_210_4198_1_1528455320_3892571_213-127634-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 12:57:13,573 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_40896_210_15710_1_1533433956_7100802_921-231678-0_sp1.1 from training. Duration: 0.97275 2023-10-04 12:57:15,119 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.feed_forward3.hidden_balancer.prob, batch_count=1662186.6666666667, ans=0.125 2023-10-04 12:57:16,303 WARNING [train.py:1204] (2/4) Exclude cut with ID _38020_210_5986_1_1533112252259_7006089_493-102745-0_sp1.1 from training. Duration: 0.8909375 2023-10-04 12:57:16,403 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6846_210_6753_1_1527850767_4295158_2-315073-0 from training. Duration: 0.88 2023-10-04 12:57:17,723 WARNING [train.py:1204] (2/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_249-281349-0 from training. Duration: 0.85 2023-10-04 12:57:19,829 WARNING [train.py:1204] (2/4) Exclude cut with ID _18513_210_9206_1_1531618215223_3862620_314-83429-0_sp1.1 from training. Duration: 0.97275 2023-10-04 12:57:21,165 WARNING [train.py:1204] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_215-202973-0_sp0.9 from training. Duration: 0.97775 2023-10-04 12:57:21,175 WARNING [train.py:1204] (2/4) Exclude cut with ID _7719_210_3525_1_1528456153163_4296960_88-250259-0_sp1.1 from training. Duration: 0.763625 2023-10-04 12:57:21,293 WARNING [train.py:1204] (2/4) Exclude cut with ID _7606_210_4915_1_1528894253277_5080690_301-10732-0_sp1.1 from training. Duration: 0.736375 2023-10-04 12:57:21,302 WARNING [train.py:1204] (2/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_818-11907-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 12:57:22,686 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_5959_210_1794_1_1527400400_3445726_412-44052-0_sp1.1 from training. Duration: 0.7 2023-10-04 12:57:25,781 WARNING [train.py:1204] (2/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_6-279195-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 12:57:27,777 WARNING [train.py:1204] (2/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_252-216674-0_sp0.9 from training. Duration: 0.6 2023-10-04 12:57:29,095 WARNING [train.py:1204] (2/4) Exclude cut with ID _14368_210_4220_1_1530532714870_3595529_144-3287-0_sp1.1 from training. Duration: 0.6090625 2023-10-04 12:57:31,706 WARNING [train.py:1204] (2/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_342-3879-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 12:57:31,742 WARNING [train.py:1204] (2/4) Exclude cut with ID _33640_210_2655_1_1533117605479_4855469_584-65630-0_sp1.1 from training. Duration: 0.97275 2023-10-04 12:57:33,032 WARNING [train.py:1204] (2/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_37-189262-0 from training. Duration: 0.98 2023-10-04 12:57:33,073 WARNING [train.py:1204] (2/4) Exclude cut with ID _9389_210_1664_1_1529217337301_5289269_379-128794-0_sp0.9 from training. Duration: 0.9444375 2023-10-04 12:57:34,456 WARNING [train.py:1204] (2/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_119-207162-0_sp1.1 from training. Duration: 0.6454375 2023-10-04 12:57:38,610 WARNING [train.py:1204] (2/4) Exclude cut with ID _2941_210_2577_1_1526199673150_2489759_200-333643-0_sp1.1 from training. Duration: 0.936375 2023-10-04 12:57:40,040 WARNING [train.py:1204] (2/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_479-125611-0_sp1.1 from training. Duration: 0.87275 2023-10-04 12:57:40,050 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_3475_210_2028_1_1526193578_3622569_64-297072-0_sp1.1 from training. Duration: 0.82725 2023-10-04 12:57:40,087 WARNING [train.py:1204] (2/4) Exclude cut with ID _53565_210_10635_1_1534337218335_3820259_139-346291-0_sp1.1 from training. Duration: 0.97275 2023-10-04 12:57:40,124 WARNING [train.py:1204] (2/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_588-125165-0_sp1.1 from training. Duration: 0.7545625 2023-10-04 12:57:41,953 WARNING [train.py:1204] (2/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_938-205135-0_sp1.1 from training. Duration: 0.7909375 2023-10-04 12:57:43,277 WARNING [train.py:1204] (2/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_216-48707-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 12:57:44,541 WARNING [train.py:1204] (2/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_477-255782-0_sp1.1 from training. Duration: 0.7454375 2023-10-04 12:57:44,571 WARNING [train.py:1204] (2/4) Exclude cut with ID _16909_210_5724_1_1531292079912_4118919_583-92842-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 12:57:45,866 WARNING [train.py:1204] (2/4) Exclude cut with ID _60860_210_8254_1_1534896015875_3557169_245-256666-0 from training. Duration: 0.97 2023-10-04 12:57:51,698 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_45399_210_4852_1_1533624725_7579737_349-116173-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 12:57:56,420 INFO [train.py:1046] (2/4) Epoch 47, batch 5000, loss[loss=0.143, simple_loss=0.2187, pruned_loss=0.03367, over 24268.00 frames. ], tot_loss[loss=0.1529, simple_loss=0.2333, pruned_loss=0.03627, over 4714325.04 frames. ], batch size: 56, lr: 2.15e-03, grad_scale: 8.0 2023-10-04 12:57:56,508 WARNING [train.py:1204] (2/4) Exclude cut with ID _8699_210_2069_1_1528630346831_3960409_155-39766-0 from training. Duration: 0.78 2023-10-04 12:57:56,527 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_86-16881-0_sp1.1 from training. Duration: 0.52725 2023-10-04 12:58:04,039 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_14056_210_4163_1_1530604483_7572827_191-159384-0_sp1.1 from training. Duration: 0.97275 2023-10-04 12:58:04,046 WARNING [train.py:1204] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_307-158462-0_sp1.1 from training. Duration: 0.62725 2023-10-04 12:58:05,394 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7544_210_6753_1_1528591327_1471426_33-346031-0 from training. Duration: 0.61 2023-10-04 12:58:06,790 WARNING [train.py:1204] (2/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_112-149213-0 from training. Duration: 0.95 2023-10-04 12:58:09,429 WARNING [train.py:1204] (2/4) Exclude cut with ID _55844_210_2346_1_1534471212569_7625707_925-191342-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 12:58:10,793 WARNING [train.py:1204] (2/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_87-278686-0 from training. Duration: 0.77 2023-10-04 12:58:10,813 WARNING [train.py:1204] (2/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_415-78792-0_sp1.1 from training. Duration: 0.736375 2023-10-04 12:58:10,829 WARNING [train.py:1204] (2/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_107-332398-0_sp0.9 from training. Duration: 0.8666875 2023-10-04 12:58:12,123 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.690e+02 2.008e+02 2.119e+02 2.497e+02 3.372e+02, threshold=4.238e+02, percent-clipped=0.0 2023-10-04 12:58:13,520 WARNING [train.py:1204] (2/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_72-251255-0 from training. Duration: 0.94 2023-10-04 12:58:13,555 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_431-227273-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 12:58:14,836 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7544_210_6753_1_1528587000_691468_87-178599-0_sp0.9 from training. Duration: 0.9666875 2023-10-04 12:58:14,914 WARNING [train.py:1204] (2/4) Exclude cut with ID _13916_210_9994_1_1530788341126_3763149_60-191556-0 from training. Duration: 0.61 2023-10-04 12:58:14,917 WARNING [train.py:1204] (2/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_746-164782-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 12:58:14,950 WARNING [train.py:1204] (2/4) Exclude cut with ID _33640_210_2655_1_1533117605479_4855469_596-21066-0_sp1.1 from training. Duration: 0.92725 2023-10-04 12:58:17,055 WARNING [train.py:1204] (2/4) Exclude cut with ID _40402_210_18219_1_1533522324115_2953150_320-14078-0 from training. Duration: 0.95 2023-10-04 12:58:17,080 WARNING [train.py:1204] (2/4) Exclude cut with ID _29877_210_6228_1_1532397791106_5276149_266-62291-0 from training. Duration: 0.87 2023-10-04 12:58:18,368 WARNING [train.py:1204] (2/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_176-56120-0_sp0.9 from training. Duration: 0.92225 2023-10-04 12:58:18,410 WARNING [train.py:1204] (2/4) Exclude cut with ID _6275_210_2084_1_1528110062213_7669049_91-26919-0 from training. Duration: 0.97 2023-10-04 12:58:18,417 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_224-322394-0_sp0.9 from training. Duration: 0.7333125 2023-10-04 12:58:19,729 WARNING [train.py:1204] (2/4) Exclude cut with ID _44982_210_1511_1_1534312820108_3726029_586-297059-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 12:58:19,776 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_196-289637-0_sp1.1 from training. Duration: 0.6818125 2023-10-04 12:58:19,777 WARNING [train.py:1204] (2/4) Exclude cut with ID _18513_210_9206_1_1531618215223_3862620_402-5957-0 from training. Duration: 0.99 2023-10-04 12:58:21,197 WARNING [train.py:1204] (2/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_81-300212-0 from training. Duration: 0.6 2023-10-04 12:58:22,614 WARNING [train.py:1204] (2/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_166-128941-0 from training. Duration: 0.54 2023-10-04 12:58:22,630 WARNING [train.py:1204] (2/4) Exclude cut with ID _58059_210_5854_1_1534901471149_3164808_57-309660-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 12:58:25,012 WARNING [train.py:1204] (2/4) Exclude cut with ID _60621_210_17071_1_1535013053530_3671739_73-155814-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 12:58:26,146 WARNING [train.py:1204] (2/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_726-270780-0 from training. Duration: 0.86 2023-10-04 12:58:26,158 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8853_210_3273_1_1528633899_818968_51-187685-0_sp1.1 from training. Duration: 0.62725 2023-10-04 12:58:27,590 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_315-212794-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 12:58:29,055 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_53373_210_5399_1_1534229471_4715695_314-122795-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 12:58:29,168 WARNING [train.py:1204] (2/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_187-221566-0_sp0.9 from training. Duration: 0.47775 2023-10-04 12:58:32,310 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_12868_210_6753_1_1530702479_3610564_133-564-0 from training. Duration: 0.79 2023-10-04 12:58:33,623 WARNING [train.py:1204] (2/4) Exclude cut with ID _55153_210_2253_1_1534384786677_3162096_255-228344-0_sp1.1 from training. Duration: 0.936375 2023-10-04 12:58:34,878 WARNING [train.py:1204] (2/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_383-165234-0_sp1.1 from training. Duration: 0.8545625 2023-10-04 12:58:35,055 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.nonlin_attention.balancer.prob, batch_count=1662520.0, ans=0.125 2023-10-04 12:58:37,680 WARNING [train.py:1204] (2/4) Exclude cut with ID IC0677W0056-334889 from training. Duration: 0.768 2023-10-04 12:58:41,817 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_14953_210_2151_1_1530701710_5616165_710-165400-0_sp0.9 from training. Duration: 0.9666875 2023-10-04 12:58:42,053 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=1662586.6666666667, ans=0.1 2023-10-04 12:58:43,200 WARNING [train.py:1204] (2/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_355-236205-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 12:58:43,201 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_34450_210_9547_1_1533099443_4109050_503-246893-0_sp1.1 from training. Duration: 0.963625 2023-10-04 12:58:45,875 WARNING [train.py:1204] (2/4) Exclude cut with ID _43552_210_8913_1_1533726031059_3523429_25-120161-0 from training. Duration: 0.98 2023-10-04 12:58:45,907 WARNING [train.py:1204] (2/4) Exclude cut with ID _66112_210_10635_1_1535590236305_3801484_205-311930-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 12:58:45,927 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_48049_210_5632_1_1533864553_3519937_185-281218-0_sp1.1 from training. Duration: 0.92725 2023-10-04 12:58:45,968 WARNING [train.py:1204] (2/4) Exclude cut with ID _48782_210_12658_1_1534467701544_2300422_191-332415-0_sp1.1 from training. Duration: 0.97275 2023-10-04 12:58:47,351 WARNING [train.py:1204] (2/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_907-151398-0_sp1.1 from training. Duration: 0.336375 2023-10-04 12:58:49,168 WARNING [train.py:1204] (2/4) Exclude cut with ID _67499_210_17282_1_1535509805220_3692430_20-179346-0_sp1.1 from training. Duration: 0.8818125 2023-10-04 12:58:51,855 WARNING [train.py:1204] (2/4) Exclude cut with ID _14998_210_5219_1_1530705582080_7474970_441-91466-0_sp1.1 from training. Duration: 0.8818125 2023-10-04 12:58:53,220 WARNING [train.py:1204] (2/4) Exclude cut with ID _46681_210_9642_1_1533893005571_7603359_786-164007-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 12:58:58,473 WARNING [train.py:1204] (2/4) Exclude cut with ID _14970_210_15709_1_1530773543493_1715730_187-20259-0 from training. Duration: 0.89 2023-10-04 12:59:01,836 WARNING [train.py:1204] (2/4) Exclude cut with ID _12734_210_8341_1_1530183614296_3891929_383-339075-0_sp1.1 from training. Duration: 0.963625 2023-10-04 12:59:09,859 INFO [train.py:1046] (2/4) Epoch 47, batch 5050, loss[loss=0.1473, simple_loss=0.2334, pruned_loss=0.03064, over 24451.00 frames. ], tot_loss[loss=0.1528, simple_loss=0.2335, pruned_loss=0.0361, over 4717422.00 frames. ], batch size: 63, lr: 2.15e-03, grad_scale: 4.0 2023-10-04 12:59:11,427 WARNING [train.py:1204] (2/4) Exclude cut with ID _37086_210_9038_1_1532999540891_7021250_834-322225-0_sp1.1 from training. Duration: 0.92725 2023-10-04 12:59:12,728 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_10987_210_4163_1_1529760555_4123902_56-290559-0_sp1.1 from training. Duration: 0.963625 2023-10-04 12:59:12,735 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_92-246270-0_sp0.9 from training. Duration: 0.9444375 2023-10-04 12:59:12,754 WARNING [train.py:1204] (2/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_56-13561-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 12:59:12,784 WARNING [train.py:1204] (2/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_393-57888-0_sp1.1 from training. Duration: 0.5818125 2023-10-04 12:59:14,046 WARNING [train.py:1204] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_168-32906-0_sp1.1 from training. Duration: 0.62725 2023-10-04 12:59:14,096 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_51225_210_3601_1_1534400634_2327383_127-207553-0_sp1.1 from training. Duration: 0.963625 2023-10-04 12:59:15,772 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.out_combiner.scale_min, batch_count=1662720.0, ans=0.2 2023-10-04 12:59:18,847 WARNING [train.py:1204] (2/4) Exclude cut with ID _58583_210_5236_1_1534726835266_3574249_494-203521-0_sp1.1 from training. Duration: 0.963625 2023-10-04 12:59:18,867 WARNING [train.py:1204] (2/4) Exclude cut with ID _49376_210_5986_1_1533985278732_3370740_354-344695-0 from training. Duration: 0.99 2023-10-04 12:59:20,210 WARNING [train.py:1204] (2/4) Exclude cut with ID _8021_210_7994_1_1528376451358_3653069_179-45206-0_sp1.1 from training. Duration: 0.7818125 2023-10-04 12:59:21,596 WARNING [train.py:1204] (2/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_742-154017-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 12:59:22,990 WARNING [train.py:1204] (2/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_222-198127-0_sp1.1 from training. Duration: 0.763625 2023-10-04 12:59:23,043 WARNING [train.py:1204] (2/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_235-307337-0 from training. Duration: 0.94 2023-10-04 12:59:26,223 WARNING [train.py:1204] (2/4) Exclude cut with ID _8393_210_6702_1_1528505860249_3457328_453-176500-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 12:59:26,266 WARNING [train.py:1204] (2/4) Exclude cut with ID _53565_210_10635_1_1534337218335_3820259_102-179528-0_sp1.1 from training. Duration: 0.97275 2023-10-04 12:59:29,469 WARNING [train.py:1204] (2/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_43-206757-0_sp1.1 from training. Duration: 0.6818125 2023-10-04 12:59:29,564 WARNING [train.py:1204] (2/4) Exclude cut with ID _8394_210_6702_1_1528590504316_3540189_335-274780-0_sp1.1 from training. Duration: 0.7090625 2023-10-04 12:59:30,886 WARNING [train.py:1204] (2/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_128-296581-0_sp1.1 from training. Duration: 0.67275 2023-10-04 12:59:35,728 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.feed_forward1.out_proj.dropout_p, batch_count=1662786.6666666667, ans=0.1 2023-10-04 12:59:39,545 WARNING [train.py:1204] (2/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_111-112362-0 from training. Duration: 0.91 2023-10-04 12:59:40,877 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_5522_210_3273_1_1527249606_3909096_366-172508-0_sp1.1 from training. Duration: 0.6 2023-10-04 12:59:40,959 WARNING [train.py:1204] (2/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_47-167373-0_sp1.1 from training. Duration: 0.836375 2023-10-04 12:59:41,000 WARNING [train.py:1204] (2/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_41-234353-0 from training. Duration: 0.68 2023-10-04 12:59:41,633 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.3.encoder.layers.0.conv_module2.whiten, num_groups=1, num_channels=512, metric=2.89 vs. limit=15.0 2023-10-04 12:59:42,368 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6291_210_3273_1_1527683649_1078498_82-221196-0_sp0.9 from training. Duration: 0.8444375 2023-10-04 12:59:43,752 WARNING [train.py:1204] (2/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_47-4814-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 12:59:43,779 WARNING [train.py:1204] (2/4) Exclude cut with ID _43541_210_10626_1_1533796233418_3635440_40-247993-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 12:59:45,155 WARNING [train.py:1204] (2/4) Exclude cut with ID _21989_210_5854_1_1531899214931_3271360_133-44759-0_sp0.9 from training. Duration: 0.8555625 2023-10-04 12:59:45,158 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_94-195801-0 from training. Duration: 0.94 2023-10-04 12:59:46,499 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_141-89113-0 from training. Duration: 0.86 2023-10-04 12:59:46,590 WARNING [train.py:1204] (2/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_683-284578-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 12:59:49,787 WARNING [train.py:1204] (2/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_6-214815-0_sp1.1 from training. Duration: 0.77275 2023-10-04 12:59:49,869 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder_embed.conv.8.prob, batch_count=1662853.3333333333, ans=0.125 2023-10-04 12:59:52,702 WARNING [train.py:1204] (2/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_556-212082-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 12:59:53,987 WARNING [train.py:1204] (2/4) Exclude cut with ID _44902_210_16187_1_1533888006062_3792078_149-67212-0 from training. Duration: 0.87 2023-10-04 12:59:55,418 WARNING [train.py:1204] (2/4) Exclude cut with ID _61148_210_6897_1_1535094022726_4217610_333-159440-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 12:59:56,917 WARNING [train.py:1204] (2/4) Exclude cut with ID _3120_210_3525_1_1526036143112_4072980_404-249994-0 from training. Duration: 0.99 2023-10-04 12:59:58,970 WARNING [train.py:1204] (2/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_234-260682-0_sp0.9 from training. Duration: 0.9333125 2023-10-04 12:59:58,989 WARNING [train.py:1204] (2/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_374-138971-0_sp1.1 from training. Duration: 0.8545625 2023-10-04 13:00:00,219 WARNING [train.py:1204] (2/4) Exclude cut with ID _53713_210_24116_1_1534314487762_7416751_743-302085-0_sp1.1 from training. Duration: 0.92725 2023-10-04 13:00:00,300 WARNING [train.py:1204] (2/4) Exclude cut with ID _18516_210_9206_1_1531704653206_3692406_198-334870-0_sp1.1 from training. Duration: 0.836375 2023-10-04 13:00:03,027 WARNING [train.py:1204] (2/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_395-73570-0_sp1.1 from training. Duration: 0.9 2023-10-04 13:00:06,301 WARNING [train.py:1204] (2/4) Exclude cut with ID _46683_210_9642_1_1533730913492_4591116_421-19547-0_sp1.1 from training. Duration: 0.936375 2023-10-04 13:00:06,343 WARNING [train.py:1204] (2/4) Exclude cut with ID _19896_210_1883_1_1531709022662_6649569_714-8668-0_sp1.1 from training. Duration: 0.963625 2023-10-04 13:00:06,381 WARNING [train.py:1204] (2/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_49-101072-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 13:00:06,388 WARNING [train.py:1204] (2/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_220-191080-0_sp1.1 from training. Duration: 0.8909375 2023-10-04 13:00:06,422 WARNING [train.py:1204] (2/4) Exclude cut with ID _3465_210_3525_1_1526175667060_3574049_62-335552-0 from training. Duration: 0.87 2023-10-04 13:00:07,862 WARNING [train.py:1204] (2/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_111-112362-0_sp1.1 from training. Duration: 0.82725 2023-10-04 13:00:09,248 WARNING [train.py:1204] (2/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_318-59097-0_sp0.9 from training. Duration: 0.8444375 2023-10-04 13:00:13,266 WARNING [train.py:1204] (2/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_98-255710-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 13:00:13,273 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9042W0003-515609 from training. Duration: 0.896 2023-10-04 13:00:13,281 WARNING [train.py:1204] (2/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_158-250116-0_sp0.9 from training. Duration: 0.688875 2023-10-04 13:00:14,633 WARNING [train.py:1204] (2/4) Exclude cut with ID _26750_210_5665_1_1532226678442_4063230_7-346885-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 13:00:14,673 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_37839_210_7234_1_1533542462_3689303_357-227078-0_sp1.1 from training. Duration: 0.963625 2023-10-04 13:00:15,837 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9052W0119-520904 from training. Duration: 0.896 2023-10-04 13:00:17,352 WARNING [train.py:1204] (2/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_249-281349-0_sp1.1 from training. Duration: 0.77275 2023-10-04 13:00:17,364 WARNING [train.py:1204] (2/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_147-293311-0 from training. Duration: 0.94 2023-10-04 13:00:17,365 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_158-21166-0_sp1.1 from training. Duration: 0.963625 2023-10-04 13:00:18,913 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.balancer1.prob, batch_count=1662986.6666666667, ans=0.125 2023-10-04 13:00:22,041 WARNING [train.py:1204] (2/4) Exclude cut with ID _14100_210_8341_1_1530529320613_3678069_207-332194-0_sp1.1 from training. Duration: 0.92725 2023-10-04 13:00:22,081 WARNING [train.py:1204] (2/4) Exclude cut with ID _9389_210_1664_1_1529217337301_5289269_324-211031-0_sp1.1 from training. Duration: 0.963625 2023-10-04 13:00:22,094 WARNING [train.py:1204] (2/4) Exclude cut with ID _14603_210_4220_1_1530615688441_3097319_133-158308-0 from training. Duration: 0.84 2023-10-04 13:00:23,423 INFO [train.py:1046] (2/4) Epoch 47, batch 5100, loss[loss=0.1593, simple_loss=0.2353, pruned_loss=0.04165, over 23419.00 frames. ], tot_loss[loss=0.1535, simple_loss=0.2342, pruned_loss=0.0364, over 4722058.63 frames. ], batch size: 285, lr: 2.15e-03, grad_scale: 8.0 2023-10-04 13:00:23,475 WARNING [train.py:1204] (2/4) Exclude cut with ID _13252_210_1925_1_1530671372047_3336909_166-314607-0 from training. Duration: 0.54 2023-10-04 13:00:24,899 WARNING [train.py:1204] (2/4) Exclude cut with ID _32704_210_8954_1_1533017488840_6747370_649-79538-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 13:00:24,908 WARNING [train.py:1204] (2/4) Exclude cut with ID _28394_210_8913_1_1532430049518_3650118_343-115667-0_sp1.1 from training. Duration: 0.97275 2023-10-04 13:00:24,943 WARNING [train.py:1204] (2/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_87-278686-0_sp0.9 from training. Duration: 0.8555625 2023-10-04 13:00:28,042 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9109W0003-549749 from training. Duration: 0.768 2023-10-04 13:00:29,920 WARNING [train.py:1204] (2/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_126-29104-0_sp1.1 from training. Duration: 0.77275 2023-10-04 13:00:33,186 WARNING [train.py:1204] (2/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_325-113798-0 from training. Duration: 0.85 2023-10-04 13:00:33,255 WARNING [train.py:1204] (2/4) Exclude cut with ID _6694_210_4599_1_1527852637490_3965529_116-109398-0 from training. Duration: 0.73 2023-10-04 13:00:34,616 WARNING [train.py:1204] (2/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_46-57119-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 13:00:34,875 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.conv_module1.balancer1.prob, batch_count=1663053.3333333333, ans=0.125 2023-10-04 13:00:35,872 WARNING [train.py:1204] (2/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_546-119312-0_sp1.1 from training. Duration: 0.9 2023-10-04 13:00:38,573 WARNING [train.py:1204] (2/4) Exclude cut with ID _60057_210_10640_1_1534903065793_3333150_468-306939-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 13:00:38,616 WARNING [train.py:1204] (2/4) Exclude cut with ID _15150_210_8341_1_1530752411803_7256384_22-138274-0 from training. Duration: 0.96 2023-10-04 13:00:38,800 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.nonlin_attention.balancer.prob, batch_count=1663120.0, ans=0.125 2023-10-04 13:00:40,010 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_289-180904-0 from training. Duration: 0.76 2023-10-04 13:00:41,329 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.691e+02 2.103e+02 2.372e+02 2.934e+02 4.830e+02, threshold=4.743e+02, percent-clipped=2.0 2023-10-04 13:00:44,183 WARNING [train.py:1204] (2/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_64-133721-0_sp1.1 from training. Duration: 0.92725 2023-10-04 13:00:44,244 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_195-257512-0_sp1.1 from training. Duration: 0.6090625 2023-10-04 13:00:48,923 WARNING [train.py:1204] (2/4) Exclude cut with ID _66873_210_5517_1_1535454136506_4293370_704-101963-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 13:00:52,984 WARNING [train.py:1204] (2/4) Exclude cut with ID _4366_210_1388_1_1526792053633_3613819_231-211402-0 from training. Duration: 0.94 2023-10-04 13:00:53,040 WARNING [train.py:1204] (2/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_679-190385-0_sp1.1 from training. Duration: 0.97275 2023-10-04 13:00:54,493 WARNING [train.py:1204] (2/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_298-81459-0_sp1.1 from training. Duration: 0.963625 2023-10-04 13:00:54,503 WARNING [train.py:1204] (2/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_658-282610-0_sp0.9 from training. Duration: 0.588875 2023-10-04 13:00:57,784 WARNING [train.py:1204] (2/4) Exclude cut with ID _57572_210_5517_1_1534921219624_3809484_196-283734-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 13:01:00,498 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_53071_210_16554_1_1534490405_2954066_294-198554-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 13:01:00,510 WARNING [train.py:1204] (2/4) Exclude cut with ID _4866_210_2577_1_1527404412284_4364659_352-157930-0 from training. Duration: 0.81 2023-10-04 13:01:03,224 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9231W0113-610139 from training. Duration: 0.896 2023-10-04 13:01:03,276 WARNING [train.py:1204] (2/4) Exclude cut with ID _24271_210_12658_1_1531987270022_7190519_638-171906-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 13:01:03,312 WARNING [train.py:1204] (2/4) Exclude cut with ID _14170_210_5219_1_1530525949868_4469650_272-249985-0 from training. Duration: 0.67 2023-10-04 13:01:03,328 WARNING [train.py:1204] (2/4) Exclude cut with ID _64086_210_7994_1_1535270417019_7435949_449-31567-0 from training. Duration: 0.97 2023-10-04 13:01:07,303 WARNING [train.py:1204] (2/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_109-282568-0_sp1.1 from training. Duration: 0.97275 2023-10-04 13:01:13,055 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.conv_skip_rate, batch_count=1663253.3333333333, ans=0.0 2023-10-04 13:01:15,710 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_121-45934-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 13:01:18,896 WARNING [train.py:1204] (2/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_314-126819-0 from training. Duration: 0.5 2023-10-04 13:01:18,918 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9309W0192-640319 from training. Duration: 0.512 2023-10-04 13:01:18,933 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9310W0003-640649 from training. Duration: 0.896 2023-10-04 13:01:20,305 WARNING [train.py:1204] (2/4) Exclude cut with ID _7160_210_1876_1_1527940923231_3703799_120-150943-0 from training. Duration: 0.81 2023-10-04 13:01:20,307 WARNING [train.py:1204] (2/4) Exclude cut with ID _40380_210_9059_1_1533380594759_4187380_6-33508-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 13:01:22,961 WARNING [train.py:1204] (2/4) Exclude cut with ID _59202_210_26984_1_1534816810612_3549174_137-318710-0 from training. Duration: 0.89 2023-10-04 13:01:27,039 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_435-313305-0 from training. Duration: 0.71 2023-10-04 13:01:30,136 WARNING [train.py:1204] (2/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_91-212486-0_sp1.1 from training. Duration: 0.6181875 2023-10-04 13:01:30,227 WARNING [train.py:1204] (2/4) Exclude cut with ID _14603_210_4220_1_1530615688441_3097319_133-158308-0_sp1.1 from training. Duration: 0.763625 2023-10-04 13:01:31,735 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_99-251314-0 from training. Duration: 0.87 2023-10-04 13:01:34,448 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_365-217119-0_sp1.1 from training. Duration: 0.57275 2023-10-04 13:01:34,484 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_3475_210_2028_1_1526193578_3622569_377-166133-0 from training. Duration: 0.86 2023-10-04 13:01:37,695 INFO [train.py:1046] (2/4) Epoch 47, batch 5150, loss[loss=0.176, simple_loss=0.2471, pruned_loss=0.05249, over 22673.00 frames. ], tot_loss[loss=0.1539, simple_loss=0.2344, pruned_loss=0.03668, over 4714619.42 frames. ], batch size: 322, lr: 2.15e-03, grad_scale: 8.0 2023-10-04 13:01:39,815 WARNING [train.py:1204] (2/4) Exclude cut with ID _13916_210_9994_1_1530788341126_3763149_116-21034-0_sp1.1 from training. Duration: 0.87275 2023-10-04 13:01:39,833 WARNING [train.py:1204] (2/4) Exclude cut with ID _64220_210_9527_1_1535250151772_4875259_142-216831-0_sp1.1 from training. Duration: 0.936375 2023-10-04 13:01:39,834 WARNING [train.py:1204] (2/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_490-207861-0_sp1.1 from training. Duration: 0.8909375 2023-10-04 13:01:41,110 WARNING [train.py:1204] (2/4) Exclude cut with ID _41314_210_12651_1_1533459476543_3766460_87-272068-0_sp0.9 from training. Duration: 0.92225 2023-10-04 13:01:41,140 WARNING [train.py:1204] (2/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_18-46756-0_sp1.1 from training. Duration: 0.7181875 2023-10-04 13:01:42,528 WARNING [train.py:1204] (2/4) Exclude cut with ID _39411_210_5986_1_1533538825030_3343649_98-311335-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 13:01:43,902 WARNING [train.py:1204] (2/4) Exclude cut with ID _21878_210_3525_1_1531738772002_7264330_113-18163-0 from training. Duration: 0.89 2023-10-04 13:01:43,904 WARNING [train.py:1204] (2/4) Exclude cut with ID _6653_210_2629_1_1527919172441_6732679_1103-56445-0 from training. Duration: 0.73 2023-10-04 13:01:43,954 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_3474_210_2028_1_1526188513_4825246_145-309131-0 from training. Duration: 0.74 2023-10-04 13:01:45,209 WARNING [train.py:1204] (2/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_120-211630-0_sp1.1 from training. Duration: 0.836375 2023-10-04 13:01:45,220 WARNING [train.py:1204] (2/4) Exclude cut with ID _8030_210_7497_1_1528444886781_3531089_32-31837-0 from training. Duration: 0.97 2023-10-04 13:01:45,472 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=1663386.6666666667, ans=0.1 2023-10-04 13:01:46,596 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_19949_210_2161_1_1531706337_928079_8-160956-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 13:01:46,646 WARNING [train.py:1204] (2/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_321-215742-0_sp1.1 from training. Duration: 0.4545625 2023-10-04 13:01:48,623 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_583-124650-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 13:01:48,726 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_30434_210_13049_1_1532519384_981746_164-182755-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 13:01:54,361 WARNING [train.py:1204] (2/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_339-104791-0_sp1.1 from training. Duration: 0.4454375 2023-10-04 13:01:55,627 WARNING [train.py:1204] (2/4) Exclude cut with ID _15026_210_8750_1_1530777602316_3291850_211-195043-0 from training. Duration: 0.8 2023-10-04 13:01:55,722 WARNING [train.py:1204] (2/4) Exclude cut with ID _29877_210_6228_1_1532397791106_5276149_487-182647-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 13:01:55,755 WARNING [train.py:1204] (2/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_127-566-0_sp0.9 from training. Duration: 0.9333125 2023-10-04 13:01:58,610 WARNING [train.py:1204] (2/4) Exclude cut with ID _8263_210_1664_1_1528439452891_970220_68-181805-0_sp0.9 from training. Duration: 0.711125 2023-10-04 13:01:58,612 WARNING [train.py:1204] (2/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_504-72513-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 13:01:58,628 WARNING [train.py:1204] (2/4) Exclude cut with ID _64639_210_5816_1_1535367619437_3528000_324-265233-0_sp1.1 from training. Duration: 0.963625 2023-10-04 13:01:59,928 WARNING [train.py:1204] (2/4) Exclude cut with ID _6456_210_1534_1_1527681634212_3181049_215-309359-0_sp0.9 from training. Duration: 0.97775 2023-10-04 13:01:59,932 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_115-233929-0_sp1.1 from training. Duration: 0.6545625 2023-10-04 13:02:01,813 WARNING [train.py:1204] (2/4) Exclude cut with ID _14087_210_2253_1_1530529224512_3014029_219-224445-0 from training. Duration: 0.84 2023-10-04 13:02:03,199 WARNING [train.py:1204] (2/4) Exclude cut with ID _14867_210_1968_1_1530684179890_3846969_413-29292-0_sp1.1 from training. Duration: 0.8181875 2023-10-04 13:02:03,238 WARNING [train.py:1204] (2/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_1012-279473-0_sp0.9 from training. Duration: 0.7444375 2023-10-04 13:02:04,707 WARNING [train.py:1204] (2/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_605-133395-0_sp0.9 from training. Duration: 0.7333125 2023-10-04 13:02:06,102 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_89-35748-0 from training. Duration: 0.79 2023-10-04 13:02:08,061 WARNING [train.py:1204] (2/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_476-272757-0_sp1.1 from training. Duration: 0.8454375 2023-10-04 13:02:10,365 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.feed_forward1.out_proj.dropout_p, batch_count=1663520.0, ans=0.1 2023-10-04 13:02:12,782 WARNING [train.py:1204] (2/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_449-15508-0_sp0.9 from training. Duration: 0.911125 2023-10-04 13:02:14,178 WARNING [train.py:1204] (2/4) Exclude cut with ID _29345_210_4381_1_1532689168263_3860169_198-174328-0 from training. Duration: 0.91 2023-10-04 13:02:15,741 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_15517_210_6753_1_1530952293_1664795_125-158157-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 13:02:18,683 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.0.self_attn_weights.pos_emb_skip_rate, batch_count=1663520.0, ans=0.0 2023-10-04 13:02:23,162 WARNING [train.py:1204] (2/4) Exclude cut with ID _25565_210_12392_1_1532423009382_3952098_420-113180-0_sp1.1 from training. Duration: 0.963625 2023-10-04 13:02:23,244 WARNING [train.py:1204] (2/4) Exclude cut with ID _51471_210_1889_1_1534399391390_3094250_236-146773-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 13:02:27,371 WARNING [train.py:1204] (2/4) Exclude cut with ID _30039_210_13270_1_1532484612900_2725150_175-35012-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 13:02:27,403 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_23850_210_4238_1_1532232597_1632433_99-275843-0_sp1.1 from training. Duration: 0.936375 2023-10-04 13:02:31,829 WARNING [train.py:1204] (2/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_658-282610-0 from training. Duration: 0.53 2023-10-04 13:02:34,660 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_37818_210_4238_1_1533620910_4720083_234-65934-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 13:02:34,737 WARNING [train.py:1204] (2/4) Exclude cut with ID _3067_210_2667_1_1526212974640_4503168_459-173867-0_sp1.1 from training. Duration: 0.8 2023-10-04 13:02:36,473 WARNING [train.py:1204] (2/4) Exclude cut with ID _8696_210_1876_1_1528545632440_3345058_209-54069-0_sp0.9 from training. Duration: 0.7444375 2023-10-04 13:02:39,752 WARNING [train.py:1204] (2/4) Exclude cut with ID _62574_210_2655_1_1535180407058_3933249_420-236465-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 13:02:39,840 WARNING [train.py:1204] (2/4) Exclude cut with ID _46681_210_9642_1_1533893005571_7603359_333-4042-0_sp1.1 from training. Duration: 0.936375 2023-10-04 13:02:40,029 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.conv_module2.balancer2.prob, batch_count=1663653.3333333333, ans=0.125 2023-10-04 13:02:41,174 WARNING [train.py:1204] (2/4) Exclude cut with ID _3541_210_2477_1_1526475408709_3263973_377-296839-0 from training. Duration: 0.55 2023-10-04 13:02:45,489 WARNING [train.py:1204] (2/4) Exclude cut with ID _43997_210_4525_1_1533538632555_5189329_426-92599-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 13:02:46,911 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_43-71989-0_sp1.1 from training. Duration: 0.5181875 2023-10-04 13:02:51,343 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_2009_210_2071_1_1525481134_4588937_140-345530-0_sp1.1 from training. Duration: 0.963625 2023-10-04 13:02:51,352 WARNING [train.py:1204] (2/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_138-332773-0_sp0.9 from training. Duration: 0.9 2023-10-04 13:02:51,426 WARNING [train.py:1204] (2/4) Exclude cut with ID _13942_210_9988_1_1530604906036_3733360_499-55676-0_sp0.9 from training. Duration: 0.688875 2023-10-04 13:02:51,454 WARNING [train.py:1204] (2/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_259-34038-0_sp1.1 from training. Duration: 0.67275 2023-10-04 13:02:51,464 WARNING [train.py:1204] (2/4) Exclude cut with ID _3120_210_3525_1_1526036143112_4072980_404-249994-0_sp1.1 from training. Duration: 0.9 2023-10-04 13:02:51,493 WARNING [train.py:1204] (2/4) Exclude cut with ID _40402_210_18219_1_1533522324115_2953150_222-108810-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 13:02:52,709 INFO [train.py:1046] (2/4) Epoch 47, batch 5200, loss[loss=0.1332, simple_loss=0.2026, pruned_loss=0.03187, over 23556.00 frames. ], tot_loss[loss=0.1543, simple_loss=0.2351, pruned_loss=0.0368, over 4717832.54 frames. ], batch size: 256, lr: 2.15e-03, grad_scale: 16.0 2023-10-04 13:02:55,503 WARNING [train.py:1204] (2/4) Exclude cut with ID _22083_210_8254_1_1531702862439_3678970_49-234546-0_sp0.9 from training. Duration: 0.92225 2023-10-04 13:02:56,899 WARNING [train.py:1204] (2/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_199-295611-0_sp0.9 from training. Duration: 0.611125 2023-10-04 13:02:59,626 WARNING [train.py:1204] (2/4) Exclude cut with ID _43552_210_8913_1_1533726031059_3523429_43-192945-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 13:03:04,478 WARNING [train.py:1204] (2/4) Exclude cut with ID _14224_210_4381_1_1530529062037_4264010_364-22817-0 from training. Duration: 0.71 2023-10-04 13:03:04,583 WARNING [train.py:1204] (2/4) Exclude cut with ID _14922_210_6228_1_1530707435273_3693310_388-129552-0_sp1.1 from training. Duration: 0.87275 2023-10-04 13:03:05,925 WARNING [train.py:1204] (2/4) Exclude cut with ID _6537_210_1883_1_1527987559710_3597129_190-280227-0_sp1.1 from training. Duration: 0.97275 2023-10-04 13:03:07,929 WARNING [train.py:1204] (2/4) Exclude cut with ID _55844_210_2346_1_1534471212569_7625707_398-56897-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 13:03:09,320 WARNING [train.py:1204] (2/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_48-182155-0_sp1.1 from training. Duration: 0.7909375 2023-10-04 13:03:09,347 WARNING [train.py:1204] (2/4) Exclude cut with ID _29529_210_7234_1_1532393961455_3977460_582-18084-0_sp1.1 from training. Duration: 0.97275 2023-10-04 13:03:11,072 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.772e+02 2.051e+02 2.238e+02 2.543e+02 5.592e+02, threshold=4.477e+02, percent-clipped=1.0 2023-10-04 13:03:12,642 WARNING [train.py:1204] (2/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_912-282412-0 from training. Duration: 0.49 2023-10-04 13:03:15,368 WARNING [train.py:1204] (2/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_332-256168-0_sp0.9 from training. Duration: 0.8333125 2023-10-04 13:03:16,620 WARNING [train.py:1204] (2/4) Exclude cut with ID _64220_210_9527_1_1535250151772_4875259_55-250624-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 13:03:18,108 WARNING [train.py:1204] (2/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_264-108685-0 from training. Duration: 0.55 2023-10-04 13:03:20,064 WARNING [train.py:1204] (2/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_613-232856-0_sp0.9 from training. Duration: 0.6 2023-10-04 13:03:20,147 WARNING [train.py:1204] (2/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_68-212801-0_sp1.1 from training. Duration: 0.663625 2023-10-04 13:03:21,548 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6290_210_3273_1_1527595224_1282787_115-87095-0 from training. Duration: 0.55 2023-10-04 13:03:21,583 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_3080_210_2071_1_1526086102_4365166_312-8726-0 from training. Duration: 0.91 2023-10-04 13:03:23,023 WARNING [train.py:1204] (2/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_105-63309-0 from training. Duration: 0.98 2023-10-04 13:03:24,291 WARNING [train.py:1204] (2/4) Exclude cut with ID _2941_210_2577_1_1526199673150_2489759_201-160779-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 13:03:24,301 WARNING [train.py:1204] (2/4) Exclude cut with ID ID0406W0226-885434 from training. Duration: 0.997 2023-10-04 13:03:24,308 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_17292_210_6753_1_1531125388_2943734_353-328201-0_sp1.1 from training. Duration: 0.97275 2023-10-04 13:03:25,601 WARNING [train.py:1204] (2/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_572-96029-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 13:03:25,643 WARNING [train.py:1204] (2/4) Exclude cut with ID _14970_210_15709_1_1530775547361_1966965_6-23328-0_sp0.9 from training. Duration: 0.9555625 2023-10-04 13:03:26,858 WARNING [train.py:1204] (2/4) Exclude cut with ID _45012_210_12651_1_1533608435136_5371380_240-37101-0 from training. Duration: 0.93 2023-10-04 13:03:26,913 WARNING [train.py:1204] (2/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_685-90183-0_sp1.1 from training. Duration: 0.936375 2023-10-04 13:03:28,694 WARNING [train.py:1204] (2/4) Exclude cut with ID _13252_210_1925_1_1530671372047_3336909_41-22245-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 13:03:30,144 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_15126_210_6753_1_1530778677_2591185_120-277707-0 from training. Duration: 0.8 2023-10-04 13:03:31,443 WARNING [train.py:1204] (2/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_310-26186-0 from training. Duration: 0.7 2023-10-04 13:03:31,459 WARNING [train.py:1204] (2/4) Exclude cut with ID _14998_210_5219_1_1530705582080_7474970_787-294416-0 from training. Duration: 0.82 2023-10-04 13:03:36,164 WARNING [train.py:1204] (2/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_795-33252-0 from training. Duration: 0.37 2023-10-04 13:03:37,508 WARNING [train.py:1204] (2/4) Exclude cut with ID _7537_210_2084_1_1528502395431_3924359_113-199933-0_sp0.9 from training. Duration: 0.6555625 2023-10-04 13:03:43,441 WARNING [train.py:1204] (2/4) Exclude cut with ID _3465_210_3525_1_1526175667060_3574049_308-231735-0_sp0.9 from training. Duration: 0.811125 2023-10-04 13:03:44,739 WARNING [train.py:1204] (2/4) Exclude cut with ID _19504_210_9994_1_1531648863242_3325220_354-296765-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 13:03:44,854 WARNING [train.py:1204] (2/4) Exclude cut with ID _5409_210_1388_1_1527292850189_5865191_178-127970-0 from training. Duration: 0.57 2023-10-04 13:03:45,849 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.0.layers.0.feed_forward3.out_whiten, num_groups=1, num_channels=192, metric=10.76 vs. limit=15.0 2023-10-04 13:03:46,637 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_1466-248056-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 13:03:46,662 WARNING [train.py:1204] (2/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_535-14410-0_sp0.9 from training. Duration: 0.52225 2023-10-04 13:03:46,665 WARNING [train.py:1204] (2/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_747-129941-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 13:03:46,705 WARNING [train.py:1204] (2/4) Exclude cut with ID _15012_210_1968_1_1530856820429_5095350_688-178849-0_sp1.1 from training. Duration: 0.5818125 2023-10-04 13:03:51,023 WARNING [train.py:1204] (2/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_356-45054-0_sp1.1 from training. Duration: 0.7818125 2023-10-04 13:03:52,261 WARNING [train.py:1204] (2/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_146-276944-0_sp0.9 from training. Duration: 0.92225 2023-10-04 13:03:52,428 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.bypass.scale_min, batch_count=1663986.6666666667, ans=0.2 2023-10-04 13:03:55,282 WARNING [train.py:1204] (2/4) Exclude cut with ID _39204_210_4220_1_1533880547368_3191120_346-180306-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 13:03:55,357 WARNING [train.py:1204] (2/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_641-158017-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 13:03:55,357 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_19952_210_2161_1_1531875469_3851750_274-42137-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 13:04:01,388 WARNING [train.py:1204] (2/4) Exclude cut with ID _60804_210_9642_1_1534831199859_7268299_87-327819-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 13:04:02,716 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_9302_210_2550_1_1528889962_4655416_638-301620-0 from training. Duration: 0.47 2023-10-04 13:04:02,793 WARNING [train.py:1204] (2/4) Exclude cut with ID _44304_210_15374_1_1533639352765_4613129_302-22084-0_sp1.1 from training. Duration: 0.7818125 2023-10-04 13:04:04,102 WARNING [train.py:1204] (2/4) Exclude cut with ID _18513_210_9206_1_1531618215223_3862620_177-227063-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 13:04:05,543 WARNING [train.py:1204] (2/4) Exclude cut with ID _41326_210_5854_1_1533778076509_3492080_510-45444-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 13:04:05,710 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.balancer1.prob, batch_count=1664053.3333333333, ans=0.125 2023-10-04 13:04:06,771 INFO [train.py:1046] (2/4) Epoch 47, batch 5250, loss[loss=0.1671, simple_loss=0.252, pruned_loss=0.04114, over 23395.00 frames. ], tot_loss[loss=0.1535, simple_loss=0.234, pruned_loss=0.03647, over 4724124.15 frames. ], batch size: 93, lr: 2.15e-03, grad_scale: 16.0 2023-10-04 13:04:06,818 WARNING [train.py:1204] (2/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_76-311963-0_sp1.1 from training. Duration: 0.636375 2023-10-04 13:04:06,922 WARNING [train.py:1204] (2/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_156-302675-0_sp0.9 from training. Duration: 0.611125 2023-10-04 13:04:10,243 WARNING [train.py:1204] (2/4) Exclude cut with ID _53489_210_6228_1_1534298357169_3835970_367-148411-0_sp1.1 from training. Duration: 0.936375 2023-10-04 13:04:12,324 WARNING [train.py:1204] (2/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_389-144525-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 13:04:13,650 WARNING [train.py:1204] (2/4) Exclude cut with ID _14069_210_5632_1_1530512692033_4803390_158-238427-0_sp1.1 from training. Duration: 0.963625 2023-10-04 13:04:16,251 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_486-151131-0_sp0.9 from training. Duration: 0.9444375 2023-10-04 13:04:17,762 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.1.bypass_mid.scale_min, batch_count=1664053.3333333333, ans=0.2 2023-10-04 13:04:20,967 WARNING [train.py:1204] (2/4) Exclude cut with ID _39846_210_12651_1_1533603651992_1318030_188-71831-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 13:04:21,166 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.conv_skip_rate, batch_count=1664120.0, ans=0.0 2023-10-04 13:04:22,428 WARNING [train.py:1204] (2/4) Exclude cut with ID _39411_210_5986_1_1533538825030_3343649_173-238248-0_sp1.1 from training. Duration: 0.7545625 2023-10-04 13:04:25,066 WARNING [train.py:1204] (2/4) Exclude cut with ID _20684_210_6917_1_1531567751646_4203930_469-49081-0_sp1.1 from training. Duration: 0.92725 2023-10-04 13:04:25,157 WARNING [train.py:1204] (2/4) Exclude cut with ID _8833_210_5219_1_1528592665210_7553059_611-80313-0_sp1.1 from training. Duration: 0.5818125 2023-10-04 13:04:27,907 WARNING [train.py:1204] (2/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_440-141811-0 from training. Duration: 0.95 2023-10-04 13:04:27,928 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_41211_210_2544_1_1533348859_3702910_236-223652-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 13:04:29,206 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_40222_210_6228_1_1533385298_4481939_220-163998-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 13:04:30,128 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.0.layers.0.self_attn1.whiten, num_groups=1, num_channels=192, metric=12.16 vs. limit=22.5 2023-10-04 13:04:30,942 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.5.encoder.layers.1.conv_module2.whiten, num_groups=1, num_channels=256, metric=5.47 vs. limit=15.0 2023-10-04 13:04:46,579 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.feed_forward3.hidden_balancer.prob, batch_count=1664186.6666666667, ans=0.125 2023-10-04 13:04:52,827 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.0.layers.0.nonlin_attention.whiten1, num_groups=1, num_channels=144, metric=9.89 vs. limit=10.0 2023-10-04 13:04:57,591 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.bypass_mid.scale_min, batch_count=1664253.3333333333, ans=0.2 2023-10-04 13:05:00,383 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.5.encoder.layers.1.self_attn2.whiten, num_groups=1, num_channels=256, metric=9.71 vs. limit=22.5 2023-10-04 13:05:15,763 INFO [train.py:1046] (2/4) Epoch 47, batch 5300, loss[loss=0.1487, simple_loss=0.2212, pruned_loss=0.0381, over 23867.00 frames. ], tot_loss[loss=0.1526, simple_loss=0.2332, pruned_loss=0.03599, over 4718611.92 frames. ], batch size: 195, lr: 2.15e-03, grad_scale: 8.0 2023-10-04 13:05:29,675 WARNING [train.py:1204] (2/4) Exclude cut with ID _14629_210_15709_1_1530604298182_956899_6-232695-0_sp1.1 from training. Duration: 0.8545625 2023-10-04 13:05:29,699 WARNING [train.py:1204] (2/4) Exclude cut with ID _13921_210_9994_1_1530841782272_3503520_327-69677-0 from training. Duration: 0.9 2023-10-04 13:05:29,726 WARNING [train.py:1204] (2/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_372-298093-0 from training. Duration: 0.8 2023-10-04 13:05:29,740 WARNING [train.py:1204] (2/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_515-139111-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 13:05:29,981 WARNING [train.py:1204] (2/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_85-65056-0_sp1.1 from training. Duration: 0.87275 2023-10-04 13:05:30,022 WARNING [train.py:1204] (2/4) Exclude cut with ID _15113_210_8254_1_1530842438579_3716279_209-54533-0_sp1.1 from training. Duration: 0.87275 2023-10-04 13:05:30,072 WARNING [train.py:1204] (2/4) Exclude cut with ID _70447_210_12392_1_1535972499300_3160420_123-312838-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 13:05:30,099 WARNING [train.py:1204] (2/4) Exclude cut with ID _60113_210_16271_1_1535164212481_4386079_70-63596-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 13:05:30,126 WARNING [train.py:1204] (2/4) Exclude cut with ID _14632_210_9988_1_1530691279392_3923200_225-188509-0_sp1.1 from training. Duration: 0.963625 2023-10-04 13:05:30,527 WARNING [train.py:1204] (2/4) Exclude cut with ID _34734_210_4525_1_1533365977542_4448720_584-180532-0_sp1.1 from training. Duration: 0.97275 2023-10-04 13:05:30,540 WARNING [train.py:1204] (2/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_351-294650-0_sp1.1 from training. Duration: 0.57275 2023-10-04 13:05:30,866 WARNING [train.py:1204] (2/4) Exclude cut with ID _13252_210_1925_1_1530671372047_3336909_263-168388-0_sp0.9 from training. Duration: 0.9555625 2023-10-04 13:05:30,896 WARNING [train.py:1204] (2/4) Exclude cut with ID _22083_210_8254_1_1531702862439_3678970_201-85738-0 from training. Duration: 0.9 2023-10-04 13:05:30,987 WARNING [train.py:1204] (2/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_237-58433-0 from training. Duration: 0.74 2023-10-04 13:05:31,013 WARNING [train.py:1204] (2/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_262-250826-0 from training. Duration: 0.99 2023-10-04 13:05:31,115 WARNING [train.py:1204] (2/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_927-43827-0_sp0.9 from training. Duration: 0.82225 2023-10-04 13:05:31,146 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_2821_210_2715_1_1526108385_3763560_73-296673-0 from training. Duration: 0.83 2023-10-04 13:05:31,221 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6287_210_3273_1_1527509506_2653716_182-199887-0 from training. Duration: 0.51 2023-10-04 13:05:31,351 WARNING [train.py:1204] (2/4) Exclude cut with ID _70519_210_4163_1_1535961595994_7458230_899-80223-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 13:05:31,615 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_210-234949-0_sp1.1 from training. Duration: 0.97275 2023-10-04 13:05:31,655 WARNING [train.py:1204] (2/4) Exclude cut with ID _30196_210_1664_1_1533708496522_7074140_768-278176-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 13:05:31,735 WARNING [train.py:1204] (2/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_315-128283-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 13:05:31,819 WARNING [train.py:1204] (2/4) Exclude cut with ID _14886_210_4220_1_1530702020355_3516190_330-323941-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 13:05:32,089 WARNING [train.py:1204] (2/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_264-348896-0_sp1.1 from training. Duration: 0.77275 2023-10-04 13:05:32,112 WARNING [train.py:1204] (2/4) Exclude cut with ID _37086_210_9038_1_1532999540891_7021250_433-10563-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 13:05:32,149 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_53638_210_21581_1_1534297319_5808449_885-80172-0_sp1.1 from training. Duration: 0.87275 2023-10-04 13:05:32,241 WARNING [train.py:1204] (2/4) Exclude cut with ID _14623_210_1925_1_1530773354962_3632609_157-218771-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 13:05:32,251 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_1368-78165-0_sp1.1 from training. Duration: 0.97275 2023-10-04 13:05:32,256 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_309-65177-0_sp0.9 from training. Duration: 0.97775 2023-10-04 13:05:32,261 WARNING [train.py:1204] (2/4) Exclude cut with ID _21632_210_2253_1_1531994001765_3744140_291-138812-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 13:05:32,274 WARNING [train.py:1204] (2/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_669-93163-0_sp1.1 from training. Duration: 0.8909375 2023-10-04 13:05:33,236 WARNING [train.py:1204] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_292-215346-0 from training. Duration: 0.97 2023-10-04 13:05:33,263 WARNING [train.py:1204] (2/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_286-222073-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 13:05:33,558 WARNING [train.py:1204] (2/4) Exclude cut with ID _59256_210_5986_1_1535076085997_3589120_121-237386-0_sp1.1 from training. Duration: 0.87275 2023-10-04 13:05:33,573 WARNING [train.py:1204] (2/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_96-320736-0 from training. Duration: 0.86 2023-10-04 13:05:33,583 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_128-202104-0 from training. Duration: 0.63 2023-10-04 13:05:33,676 WARNING [train.py:1204] (2/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_242-3472-0_sp0.9 from training. Duration: 0.811125 2023-10-04 13:05:33,726 WARNING [train.py:1204] (2/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_154-74182-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 13:05:33,730 WARNING [train.py:1204] (2/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_187-221566-0 from training. Duration: 0.43 2023-10-04 13:05:33,899 WARNING [train.py:1204] (2/4) Exclude cut with ID _6194_210_1534_1_1527511826160_3941194_14-282876-0 from training. Duration: 0.71 2023-10-04 13:05:33,942 WARNING [train.py:1204] (2/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_660-211088-0_sp1.1 from training. Duration: 0.563625 2023-10-04 13:05:34,288 WARNING [train.py:1204] (2/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_526-62079-0_sp1.1 from training. Duration: 0.6090625 2023-10-04 13:05:34,434 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_92-246270-0_sp1.1 from training. Duration: 0.77275 2023-10-04 13:05:34,523 WARNING [train.py:1204] (2/4) Exclude cut with ID IC0287W0023-142350 from training. Duration: 0.956 2023-10-04 13:05:34,597 WARNING [train.py:1204] (2/4) Exclude cut with ID _58605_210_12210_1_1534723195939_3904259_48-269402-0 from training. Duration: 0.96 2023-10-04 13:05:34,612 WARNING [train.py:1204] (2/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_416-257730-0_sp0.9 from training. Duration: 0.72225 2023-10-04 13:05:34,685 WARNING [train.py:1204] (2/4) Exclude cut with ID _7877_210_6014_1_1528286358099_3750136_185-17516-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 13:05:34,782 WARNING [train.py:1204] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_23-189234-0 from training. Duration: 0.83 2023-10-04 13:05:34,798 WARNING [train.py:1204] (2/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_237-89684-0 from training. Duration: 0.96 2023-10-04 13:05:35,361 WARNING [train.py:1204] (2/4) Exclude cut with ID _13916_210_9994_1_1530788341126_3763149_116-21034-0 from training. Duration: 0.96 2023-10-04 13:05:35,499 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_3133_210_2087_1_1526106446_2324690_246-125000-0_sp1.1 from training. Duration: 0.563625 2023-10-04 13:05:37,199 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.nonlin_attention.balancer.prob, batch_count=1664466.6666666667, ans=0.125 2023-10-04 13:05:41,669 INFO [train.py:1046] (2/4) Epoch 48, batch 0, loss[loss=0.1553, simple_loss=0.2378, pruned_loss=0.03634, over 23536.00 frames. ], tot_loss[loss=0.1553, simple_loss=0.2378, pruned_loss=0.03634, over 23536.00 frames. ], batch size: 94, lr: 2.12e-03, grad_scale: 16.0 2023-10-04 13:05:41,670 INFO [train.py:1069] (2/4) Computing validation loss 2023-10-04 13:05:54,824 INFO [train.py:1078] (2/4) Epoch 48, validation: loss=0.3604, simple_loss=0.2801, pruned_loss=0.2204, over 1125622.00 frames. 2023-10-04 13:05:54,825 INFO [train.py:1079] (2/4) Maximum memory allocated so far is 20967MB 2023-10-04 13:05:56,146 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.805e+02 2.072e+02 2.267e+02 2.671e+02 6.295e+02, threshold=4.535e+02, percent-clipped=1.0 2023-10-04 13:05:56,261 WARNING [train.py:1204] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_402-253027-0 from training. Duration: 0.54 2023-10-04 13:05:56,332 WARNING [train.py:1204] (2/4) Exclude cut with ID _59288_210_5854_1_1535077757774_3412246_158-289345-0_sp1.1 from training. Duration: 0.92725 2023-10-04 13:05:59,016 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_28211_210_6388_1_1532861506_4523224_128-328162-0_sp1.1 from training. Duration: 0.8181875 2023-10-04 13:06:05,056 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_788-278281-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 13:06:05,066 WARNING [train.py:1204] (2/4) Exclude cut with ID _49728_210_12651_1_1534041034294_6788150_624-195301-0_sp0.9 from training. Duration: 0.9444375 2023-10-04 13:06:05,113 WARNING [train.py:1204] (2/4) Exclude cut with ID _14860_210_4915_1_1530702020340_7343240_267-139775-0_sp1.1 from training. Duration: 0.963625 2023-10-04 13:06:06,425 WARNING [train.py:1204] (2/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_332-221057-0 from training. Duration: 0.68 2023-10-04 13:06:07,816 WARNING [train.py:1204] (2/4) Exclude cut with ID _40402_210_18219_1_1533522324115_2953150_242-76943-0 from training. Duration: 0.94 2023-10-04 13:06:09,325 WARNING [train.py:1204] (2/4) Exclude cut with ID _16747_210_5986_1_1531811544733_2749109_87-349587-0_sp1.1 from training. Duration: 0.963625 2023-10-04 13:06:09,486 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.out_combiner.scale_min, batch_count=1664533.3333333333, ans=0.2 2023-10-04 13:06:11,220 WARNING [train.py:1204] (2/4) Exclude cut with ID _22982_210_1883_1_1531882790447_5449409_505-346452-0_sp1.1 from training. Duration: 0.963625 2023-10-04 13:06:15,831 WARNING [train.py:1204] (2/4) Exclude cut with ID _17956_210_4353_1_1531209568125_6979970_13-192107-0_sp1.1 from training. Duration: 0.963625 2023-10-04 13:06:17,133 WARNING [train.py:1204] (2/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_430-307666-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 13:06:17,186 WARNING [train.py:1204] (2/4) Exclude cut with ID _2895_210_2477_1_1525956785794_4063750_289-11475-0_sp1.1 from training. Duration: 0.7454375 2023-10-04 13:06:17,205 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_3910_210_1794_1_1526742306_334530_28-238909-0_sp0.9 from training. Duration: 0.9 2023-10-04 13:06:18,553 WARNING [train.py:1204] (2/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_18-130336-0 from training. Duration: 0.79 2023-10-04 13:06:20,140 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.balancer_na.min_abs, batch_count=1664533.3333333333, ans=0.02 2023-10-04 13:06:21,378 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_548-294316-0_sp0.9 from training. Duration: 0.9 2023-10-04 13:06:23,599 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.1.encoder.layers.0.self_attn2.whiten, num_groups=1, num_channels=256, metric=13.34 vs. limit=22.5 2023-10-04 13:06:27,479 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_42-142473-0_sp1.1 from training. Duration: 0.6545625 2023-10-04 13:06:28,104 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.3.encoder.layers.3.self_attn_weights.whiten_keys, num_groups=8, num_channels=256, metric=2.31 vs. limit=6.0 2023-10-04 13:06:28,768 WARNING [train.py:1204] (2/4) Exclude cut with ID _59170_210_6721_1_1534983633960_4203335_326-48066-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 13:06:30,192 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_45886_210_17024_1_1533709215_471063_7-79165-0 from training. Duration: 0.98 2023-10-04 13:06:33,205 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.nonlin_attention.balancer.prob, batch_count=1664600.0, ans=0.125 2023-10-04 13:06:34,359 WARNING [train.py:1204] (2/4) Exclude cut with ID _44585_210_18628_1_1533605350387_4518199_280-73534-0_sp0.9 from training. Duration: 0.988875 2023-10-04 13:06:34,374 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_3475_210_2028_1_1526193578_3622569_265-276625-0_sp1.1 from training. Duration: 0.6909375 2023-10-04 13:06:36,494 WARNING [train.py:1204] (2/4) Exclude cut with ID _64631_210_27767_1_1535349580068_4165160_88-225232-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 13:06:36,729 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.ff3_skip_rate, batch_count=1664600.0, ans=0.0 2023-10-04 13:06:40,616 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_205-265518-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 13:06:43,919 WARNING [train.py:1204] (2/4) Exclude cut with ID _40402_210_18219_1_1533522324115_2953150_271-253734-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 13:06:48,877 WARNING [train.py:1204] (2/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_282-257791-0 from training. Duration: 0.86 2023-10-04 13:06:52,964 WARNING [train.py:1204] (2/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_356-45054-0 from training. Duration: 0.86 2023-10-04 13:06:53,020 WARNING [train.py:1204] (2/4) Exclude cut with ID _69584_210_5219_1_1535785133775_4044450_38-348116-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 13:06:53,022 WARNING [train.py:1204] (2/4) Exclude cut with ID _66763_210_17301_1_1535788667617_241750_21-328480-0_sp1.1 from training. Duration: 0.936375 2023-10-04 13:06:54,383 WARNING [train.py:1204] (2/4) Exclude cut with ID _26528_210_9070_1_1532570410842_3851310_497-97218-0_sp0.9 from training. Duration: 0.9555625 2023-10-04 13:06:54,435 WARNING [train.py:1204] (2/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_592-101075-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 13:06:56,375 WARNING [train.py:1204] (2/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_678-159307-0 from training. Duration: 0.85 2023-10-04 13:06:57,787 WARNING [train.py:1204] (2/4) Exclude cut with ID _28427_210_4220_1_1532581240967_3061069_166-319773-0_sp1.1 from training. Duration: 0.936375 2023-10-04 13:06:57,875 WARNING [train.py:1204] (2/4) Exclude cut with ID _29877_210_6228_1_1532397791106_5276149_606-224984-0_sp1.1 from training. Duration: 0.936375 2023-10-04 13:07:00,839 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.1.conv_skip_rate, batch_count=1664733.3333333333, ans=0.0 2023-10-04 13:07:02,017 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_125-203105-0_sp1.1 from training. Duration: 0.72725 2023-10-04 13:07:04,838 WARNING [train.py:1204] (2/4) Exclude cut with ID IC0620W0113-306765 from training. Duration: 0.768 2023-10-04 13:07:06,398 WARNING [train.py:1204] (2/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_410-287857-0_sp1.1 from training. Duration: 0.8454375 2023-10-04 13:07:06,549 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.conv_module1.balancer1.prob, batch_count=1664733.3333333333, ans=0.125 2023-10-04 13:07:09,440 INFO [train.py:1046] (2/4) Epoch 48, batch 50, loss[loss=0.159, simple_loss=0.2314, pruned_loss=0.04325, over 23782.00 frames. ], tot_loss[loss=0.1538, simple_loss=0.2354, pruned_loss=0.03613, over 1065794.39 frames. ], batch size: 164, lr: 2.12e-03, grad_scale: 16.0 2023-10-04 13:07:11,407 WARNING [train.py:1204] (2/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_78-95260-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 13:07:12,756 WARNING [train.py:1204] (2/4) Exclude cut with ID _58059_210_5854_1_1534901471149_3164808_6-73080-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 13:07:12,769 WARNING [train.py:1204] (2/4) Exclude cut with ID _5166_210_2084_1_1527073197160_4087993_331-117829-0 from training. Duration: 0.62 2023-10-04 13:07:14,077 WARNING [train.py:1204] (2/4) Exclude cut with ID _14880_210_8895_1_1530680426655_3795259_289-212817-0_sp0.9 from training. Duration: 0.7666875 2023-10-04 13:07:14,119 WARNING [train.py:1204] (2/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_620-144779-0_sp1.1 from training. Duration: 0.92725 2023-10-04 13:07:16,795 WARNING [train.py:1204] (2/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_561-288575-0_sp1.1 from training. Duration: 0.963625 2023-10-04 13:07:17,081 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.feed_forward1.out_proj.dropout_p, batch_count=1664800.0, ans=0.1 2023-10-04 13:07:18,242 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_22657_210_6890_1_1532086002_1637216_122-188128-0_sp1.1 from training. Duration: 0.963625 2023-10-04 13:07:20,069 WARNING [train.py:1204] (2/4) Exclude cut with ID _16756_210_4915_1_1531047803763_7104259_604-86607-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 13:07:22,960 WARNING [train.py:1204] (2/4) Exclude cut with ID _15150_210_8341_1_1530752411803_7256384_469-239509-0 from training. Duration: 0.68 2023-10-04 13:07:22,968 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_10987_210_4163_1_1529760555_4123902_302-87926-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 13:07:30,284 WARNING [train.py:1204] (2/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_469-94673-0_sp0.9 from training. Duration: 0.87775 2023-10-04 13:07:32,983 WARNING [train.py:1204] (2/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_798-106179-0 from training. Duration: 0.83 2023-10-04 13:07:33,119 WARNING [train.py:1204] (2/4) Exclude cut with ID _16744_210_5986_1_1531724405384_3907567_242-111316-0 from training. Duration: 0.93 2023-10-04 13:07:33,663 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.4.encoder.layers.2.feed_forward1.out_whiten, num_groups=1, num_channels=384, metric=8.22 vs. limit=15.0 2023-10-04 13:07:35,847 WARNING [train.py:1204] (2/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_938-205135-0_sp0.9 from training. Duration: 0.9666875 2023-10-04 13:07:37,283 WARNING [train.py:1204] (2/4) Exclude cut with ID _64135_210_2253_1_1535677267876_7608972_133-188266-0_sp0.9 from training. Duration: 0.9 2023-10-04 13:07:37,288 WARNING [train.py:1204] (2/4) Exclude cut with ID _44585_210_18628_1_1533605350387_4518199_623-346820-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 13:07:37,371 WARNING [train.py:1204] (2/4) Exclude cut with ID _42326_210_7770_1_1533814282531_3707269_454-266903-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 13:07:38,718 WARNING [train.py:1204] (2/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_5-55893-0_sp1.1 from training. Duration: 0.5 2023-10-04 13:07:38,769 WARNING [train.py:1204] (2/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_577-332600-0_sp1.1 from training. Duration: 0.5181875 2023-10-04 13:07:38,770 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_957-138882-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 13:07:46,905 WARNING [train.py:1204] (2/4) Exclude cut with ID _12556_210_8341_1_1530081024100_7327690_219-38004-0_sp1.1 from training. Duration: 0.97275 2023-10-04 13:07:48,262 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_397-147196-0_sp1.1 from training. Duration: 0.72725 2023-10-04 13:07:48,278 WARNING [train.py:1204] (2/4) Exclude cut with ID _3541_210_2477_1_1526475408709_3263973_231-104736-0_sp0.9 from training. Duration: 0.8666875 2023-10-04 13:07:50,099 WARNING [train.py:1204] (2/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_398-334558-0 from training. Duration: 0.96 2023-10-04 13:07:51,542 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_257-20733-0_sp1.1 from training. Duration: 0.6909375 2023-10-04 13:07:52,889 WARNING [train.py:1204] (2/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_146-282400-0_sp1.1 from training. Duration: 0.6454375 2023-10-04 13:07:52,893 WARNING [train.py:1204] (2/4) Exclude cut with ID _51899_210_20034_1_1534129254466_3669220_265-215428-0 from training. Duration: 0.98 2023-10-04 13:07:54,236 WARNING [train.py:1204] (2/4) Exclude cut with ID _34423_210_8750_1_1533430798456_3330199_219-106541-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 13:07:55,660 WARNING [train.py:1204] (2/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_458-67321-0 from training. Duration: 0.97 2023-10-04 13:07:57,101 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.feed_forward2.hidden_balancer.prob, batch_count=1665000.0, ans=0.125 2023-10-04 13:08:04,423 WARNING [train.py:1204] (2/4) Exclude cut with ID _6653_210_2629_1_1527919172441_6732679_1051-182606-0_sp1.1 from training. Duration: 0.936375 2023-10-04 13:08:04,440 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_53555_210_5632_1_1534595220_7232297_603-4396-0_sp1.1 from training. Duration: 0.97275 2023-10-04 13:08:04,542 WARNING [train.py:1204] (2/4) Exclude cut with ID _54218_210_12210_1_1534413646388_4409259_159-144367-0_sp1.1 from training. Duration: 0.963625 2023-10-04 13:08:05,948 WARNING [train.py:1204] (2/4) Exclude cut with ID _7606_210_4915_1_1528894253277_5080690_301-10732-0_sp0.9 from training. Duration: 0.9 2023-10-04 13:08:05,951 WARNING [train.py:1204] (2/4) Exclude cut with ID _15026_210_8750_1_1530777602316_3291850_211-195043-0_sp0.9 from training. Duration: 0.888875 2023-10-04 13:08:08,837 WARNING [train.py:1204] (2/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_146-282400-0 from training. Duration: 0.71 2023-10-04 13:08:08,865 WARNING [train.py:1204] (2/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_65-88242-0 from training. Duration: 0.56 2023-10-04 13:08:10,274 WARNING [train.py:1204] (2/4) Exclude cut with ID _55857_210_2346_1_1534644007831_6898209_80-107757-0_sp1.1 from training. Duration: 0.963625 2023-10-04 13:08:11,597 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_15126_210_6753_1_1530778677_2591185_120-277707-0_sp0.9 from training. Duration: 0.888875 2023-10-04 13:08:12,949 WARNING [train.py:1204] (2/4) Exclude cut with ID _44778_210_2054_1_1533798127203_7443090_1033-168593-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 13:08:12,993 WARNING [train.py:1204] (2/4) Exclude cut with ID _46683_210_9642_1_1533730913492_4591116_475-30711-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 13:08:13,028 WARNING [train.py:1204] (2/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_253-308377-0 from training. Duration: 0.49 2023-10-04 13:08:14,779 WARNING [train.py:1204] (2/4) Exclude cut with ID _4680_210_3525_1_1527249757953_893386_75-78979-0 from training. Duration: 0.91 2023-10-04 13:08:16,679 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_5522_210_3273_1_1527249606_3909096_318-203153-0_sp0.9 from training. Duration: 0.47775 2023-10-04 13:08:18,111 WARNING [train.py:1204] (2/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_550-170116-0_sp1.1 from training. Duration: 0.92725 2023-10-04 13:08:18,145 WARNING [train.py:1204] (2/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_277-210798-0_sp0.9 from training. Duration: 0.611125 2023-10-04 13:08:19,495 WARNING [train.py:1204] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_620-276793-0 from training. Duration: 0.59 2023-10-04 13:08:19,514 WARNING [train.py:1204] (2/4) Exclude cut with ID _3067_210_2667_1_1526212974640_4503168_119-175776-0 from training. Duration: 0.74 2023-10-04 13:08:21,368 WARNING [train.py:1204] (2/4) Exclude cut with ID _55209_210_1531_1_1534675828996_4597160_681-195827-0_sp1.1 from training. Duration: 0.92725 2023-10-04 13:08:21,443 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_427-78097-0_sp1.1 from training. Duration: 0.72725 2023-10-04 13:08:24,070 INFO [train.py:1046] (2/4) Epoch 48, batch 100, loss[loss=0.1629, simple_loss=0.2514, pruned_loss=0.03726, over 24069.00 frames. ], tot_loss[loss=0.1538, simple_loss=0.2361, pruned_loss=0.03575, over 1888314.00 frames. ], batch size: 80, lr: 2.12e-03, grad_scale: 16.0 2023-10-04 13:08:24,140 WARNING [train.py:1204] (2/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_96-290039-0_sp0.9 from training. Duration: 0.62225 2023-10-04 13:08:24,149 WARNING [train.py:1204] (2/4) Exclude cut with ID _45329_210_7005_1_1534157951211_6774339_287-292174-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 13:08:25,441 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.658e+02 2.052e+02 2.272e+02 2.677e+02 5.287e+02, threshold=4.544e+02, percent-clipped=2.0 2023-10-04 13:08:25,641 WARNING [train.py:1204] (2/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_200-292228-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 13:08:29,692 WARNING [train.py:1204] (2/4) Exclude cut with ID _69884_210_9771_1_1535846392818_4166169_291-194999-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 13:08:30,345 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.4.encoder.layers.0.whiten, num_groups=1, num_channels=384, metric=3.95 vs. limit=12.0 2023-10-04 13:08:32,801 WARNING [train.py:1204] (2/4) Exclude cut with ID _58895_210_8750_1_1535184081320_3649089_265-323810-0_sp1.1 from training. Duration: 0.87275 2023-10-04 13:08:34,263 WARNING [train.py:1204] (2/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_58-68956-0 from training. Duration: 0.83 2023-10-04 13:08:34,268 WARNING [train.py:1204] (2/4) Exclude cut with ID _39867_210_9826_1_1533553222030_9344259_322-115153-0_sp1.1 from training. Duration: 0.963625 2023-10-04 13:08:35,925 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.bypass_mid.scale_min, batch_count=1665133.3333333333, ans=0.2 2023-10-04 13:08:37,944 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.0.layers.0.self_attn1.whiten, num_groups=1, num_channels=192, metric=10.84 vs. limit=22.5 2023-10-04 13:08:38,461 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_3371_210_1794_1_1526384462_5580777_396-339812-0_sp1.1 from training. Duration: 0.77275 2023-10-04 13:08:38,481 WARNING [train.py:1204] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_731-2428-0_sp0.9 from training. Duration: 0.92225 2023-10-04 13:08:38,500 WARNING [train.py:1204] (2/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_98-741-0_sp1.1 from training. Duration: 0.72725 2023-10-04 13:08:38,507 WARNING [train.py:1204] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_238-245655-0_sp0.9 from training. Duration: 0.9 2023-10-04 13:08:38,541 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6788_210_1388_1_1527898698_4562021_91-197981-0_sp0.9 from training. Duration: 0.92225 2023-10-04 13:08:40,017 WARNING [train.py:1204] (2/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_860-96198-0 from training. Duration: 0.73 2023-10-04 13:08:41,401 WARNING [train.py:1204] (2/4) Exclude cut with ID _14268_210_9206_1_1530622771285_4714099_331-105351-0_sp0.9 from training. Duration: 0.72225 2023-10-04 13:08:41,430 WARNING [train.py:1204] (2/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_204-137133-0_sp1.1 from training. Duration: 0.92725 2023-10-04 13:08:42,761 WARNING [train.py:1204] (2/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_5-46496-0_sp1.1 from training. Duration: 0.936375 2023-10-04 13:08:42,782 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_160-218721-0_sp1.1 from training. Duration: 0.87275 2023-10-04 13:08:44,434 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.conv_skip_rate, batch_count=1665200.0, ans=0.0 2023-10-04 13:08:45,622 WARNING [train.py:1204] (2/4) Exclude cut with ID _45329_210_7005_1_1534157951211_6774339_503-287883-0 from training. Duration: 0.88 2023-10-04 13:08:47,361 WARNING [train.py:1204] (2/4) Exclude cut with ID _57572_210_5517_1_1534921219624_3809484_110-79134-0_sp1.1 from training. Duration: 0.92725 2023-10-04 13:08:48,677 WARNING [train.py:1204] (2/4) Exclude cut with ID _44585_210_18628_1_1533605350387_4518199_421-37641-0_sp1.1 from training. Duration: 0.936375 2023-10-04 13:08:50,411 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_42-142473-0_sp0.9 from training. Duration: 0.8 2023-10-04 13:08:52,368 WARNING [train.py:1204] (2/4) Exclude cut with ID _5166_210_2084_1_1527073197160_4087993_310-114452-0_sp0.9 from training. Duration: 0.6555625 2023-10-04 13:08:52,607 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.nonlin_attention.balancer.max_positive, batch_count=1665266.6666666667, ans=0.95 2023-10-04 13:08:57,928 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9029W0120-509520 from training. Duration: 0.768 2023-10-04 13:08:57,950 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9029W0433-509820 from training. Duration: 0.896 2023-10-04 13:09:00,560 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_40222_210_6228_1_1533385298_4481939_163-334586-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 13:09:00,560 WARNING [train.py:1204] (2/4) Exclude cut with ID _48677_210_5663_1_1533902405919_3892989_408-40740-0_sp1.1 from training. Duration: 0.7545625 2023-10-04 13:09:03,500 WARNING [train.py:1204] (2/4) Exclude cut with ID _24314_210_4915_1_1532135078116_7170179_521-258868-0_sp0.9 from training. Duration: 0.87775 2023-10-04 13:09:04,893 WARNING [train.py:1204] (2/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_576-51198-0_sp1.1 from training. Duration: 0.92725 2023-10-04 13:09:06,749 WARNING [train.py:1204] (2/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_306-216871-0_sp1.1 from training. Duration: 0.97275 2023-10-04 13:09:10,451 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.0.layers.1.conv_module2.whiten, num_groups=1, num_channels=192, metric=10.87 vs. limit=15.0 2023-10-04 13:09:11,024 WARNING [train.py:1204] (2/4) Exclude cut with ID _20684_210_6917_1_1531567751646_4203930_463-119982-0_sp1.1 from training. Duration: 0.97275 2023-10-04 13:09:12,425 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9084W0012-536850 from training. Duration: 0.64 2023-10-04 13:09:13,804 WARNING [train.py:1204] (2/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_847-41334-0_sp1.1 from training. Duration: 0.4 2023-10-04 13:09:16,603 WARNING [train.py:1204] (2/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_326-165900-0_sp1.1 from training. Duration: 0.62725 2023-10-04 13:09:16,844 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.bypass.skip_rate, batch_count=1665333.3333333333, ans=0.09899494936611666 2023-10-04 13:09:18,451 WARNING [train.py:1204] (2/4) Exclude cut with ID _7537_210_2084_1_1528502395431_3924359_128-268029-0_sp1.1 from training. Duration: 0.8909375 2023-10-04 13:09:19,791 WARNING [train.py:1204] (2/4) Exclude cut with ID _67499_210_17282_1_1535509805220_3692430_333-155030-0_sp1.1 from training. Duration: 0.97275 2023-10-04 13:09:24,350 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_34008_210_10431_1_1533345424_8183247_588-257177-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 13:09:27,388 WARNING [train.py:1204] (2/4) Exclude cut with ID _25914_210_6400_1_1532141317058_644740_52-96808-0_sp1.1 from training. Duration: 0.82725 2023-10-04 13:09:28,864 WARNING [train.py:1204] (2/4) Exclude cut with ID _56494_210_4220_1_1535284837625_3033169_69-138150-0_sp1.1 from training. Duration: 0.8545625 2023-10-04 13:09:30,365 WARNING [train.py:1204] (2/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_19-143553-0_sp1.1 from training. Duration: 0.97275 2023-10-04 13:09:31,656 WARNING [train.py:1204] (2/4) Exclude cut with ID _21878_210_3525_1_1531738772002_7264330_582-130735-0_sp1.1 from training. Duration: 0.936375 2023-10-04 13:09:33,073 WARNING [train.py:1204] (2/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_943-201356-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 13:09:33,091 WARNING [train.py:1204] (2/4) Exclude cut with ID _3231_210_2106_1_1526205864934_6784220_422-229276-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 13:09:33,108 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_48049_210_5632_1_1533864553_3519937_73-198717-0_sp1.1 from training. Duration: 0.97275 2023-10-04 13:09:34,412 WARNING [train.py:1204] (2/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_329-285801-0 from training. Duration: 0.91 2023-10-04 13:09:34,437 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9171W0114-579000 from training. Duration: 0.896 2023-10-04 13:09:34,456 WARNING [train.py:1204] (2/4) Exclude cut with ID _3120_210_3525_1_1526036143112_4072980_210-132675-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 13:09:35,336 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.1.encoder.layers.0.whiten, num_groups=1, num_channels=256, metric=5.01 vs. limit=12.0 2023-10-04 13:09:35,826 WARNING [train.py:1204] (2/4) Exclude cut with ID _55857_210_2346_1_1534644007831_6898209_745-98391-0_sp0.9 from training. Duration: 0.8444375 2023-10-04 13:09:35,901 WARNING [train.py:1204] (2/4) Exclude cut with ID _5250_210_5219_1_1527249561806_8692110_615-322035-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 13:09:35,906 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_36756_210_7234_1_1533026991_966276_119-315767-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 13:09:37,758 INFO [train.py:1046] (2/4) Epoch 48, batch 150, loss[loss=0.1505, simple_loss=0.2329, pruned_loss=0.03401, over 23303.00 frames. ], tot_loss[loss=0.1538, simple_loss=0.2359, pruned_loss=0.03588, over 2525188.89 frames. ], batch size: 105, lr: 2.12e-03, grad_scale: 8.0 2023-10-04 13:09:37,811 WARNING [train.py:1204] (2/4) Exclude cut with ID _13916_210_9994_1_1530788341126_3763149_60-191556-0_sp1.1 from training. Duration: 0.5545625 2023-10-04 13:09:37,833 WARNING [train.py:1204] (2/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_177-236809-0_sp1.1 from training. Duration: 0.4454375 2023-10-04 13:09:37,881 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7002_210_1794_1_1527939683_5157286_184-229630-0_sp1.1 from training. Duration: 0.67275 2023-10-04 13:09:37,887 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_50428_210_5724_1_1534247532_4126418_464-18322-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 13:09:39,239 WARNING [train.py:1204] (2/4) Exclude cut with ID _35799_210_5854_1_1533263487887_3529071_466-131402-0_sp1.1 from training. Duration: 0.936375 2023-10-04 13:09:39,327 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_1259-132768-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 13:09:39,371 WARNING [train.py:1204] (2/4) Exclude cut with ID _6694_210_4599_1_1527852637490_3965529_144-185051-0_sp1.1 from training. Duration: 0.87275 2023-10-04 13:09:40,668 WARNING [train.py:1204] (2/4) Exclude cut with ID _64639_210_5816_1_1535367619437_3528000_59-133290-0_sp1.1 from training. Duration: 0.963625 2023-10-04 13:09:43,390 WARNING [train.py:1204] (2/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_223-49525-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 13:09:43,695 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.bypass.skip_rate, batch_count=1665466.6666666667, ans=0.04949747468305833 2023-10-04 13:09:46,191 WARNING [train.py:1204] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_748-36150-0_sp1.1 from training. Duration: 0.82725 2023-10-04 13:09:46,192 WARNING [train.py:1204] (2/4) Exclude cut with ID _19896_210_1883_1_1531709022662_6649569_52-221627-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 13:09:46,245 WARNING [train.py:1204] (2/4) Exclude cut with ID _6789_210_1534_1_1527854471671_2870519_296-115766-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 13:09:49,602 WARNING [train.py:1204] (2/4) Exclude cut with ID _14624_210_1925_1_1530858690657_3642219_100-19871-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 13:09:49,651 WARNING [train.py:1204] (2/4) Exclude cut with ID _12555_210_8750_1_1530075594402_3780239_50-293675-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 13:09:51,258 WARNING [train.py:1204] (2/4) Exclude cut with ID _14880_210_8895_1_1530680426655_3795259_289-212817-0_sp1.1 from training. Duration: 0.62725 2023-10-04 13:09:52,594 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_14056_210_4163_1_1530604483_7572827_388-211626-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 13:09:55,834 WARNING [train.py:1204] (2/4) Exclude cut with ID _5289_210_2084_1_1527678202736_3779299_392-261988-0 from training. Duration: 0.65 2023-10-04 13:09:55,842 WARNING [train.py:1204] (2/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_124-345840-0 from training. Duration: 0.52 2023-10-04 13:09:55,847 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_2655_210_2087_1_1525688959_3897400_437-337690-0 from training. Duration: 0.63 2023-10-04 13:10:00,441 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_3780_210_2087_1_1526467391_239068_13-274633-0_sp1.1 from training. Duration: 0.92725 2023-10-04 13:10:00,455 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_805-71497-0_sp1.1 from training. Duration: 0.6454375 2023-10-04 13:10:01,808 WARNING [train.py:1204] (2/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_434-83000-0_sp1.1 from training. Duration: 0.8909375 2023-10-04 13:10:02,001 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.bypass_mid.scale_min, batch_count=1665533.3333333333, ans=0.2 2023-10-04 13:10:03,102 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_17613_210_2151_1_1531137569_2366160_285-219531-0_sp1.1 from training. Duration: 0.97275 2023-10-04 13:10:03,116 WARNING [train.py:1204] (2/4) Exclude cut with ID _56108_210_16380_1_1534505446159_3887959_22-37858-0_sp1.1 from training. Duration: 0.936375 2023-10-04 13:10:03,143 WARNING [train.py:1204] (2/4) Exclude cut with ID _28427_210_4220_1_1532581240967_3061069_200-88444-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 13:10:03,202 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_43802_210_2290_1_1533516203_8829504_77-279042-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 13:10:04,611 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9309W0193-640320 from training. Duration: 0.896 2023-10-04 13:10:06,620 WARNING [train.py:1204] (2/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_516-324055-0_sp1.1 from training. Duration: 0.936375 2023-10-04 13:10:06,882 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=1665600.0, ans=0.1 2023-10-04 13:10:12,094 WARNING [train.py:1204] (2/4) Exclude cut with ID _40094_210_16380_1_1533294005773_3641159_10-261240-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 13:10:12,251 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.conv_module2.balancer1.prob, batch_count=1665600.0, ans=0.125 2023-10-04 13:10:12,263 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.nonlin_attention.balancer.prob, batch_count=1665600.0, ans=0.125 2023-10-04 13:10:16,038 WARNING [train.py:1204] (2/4) Exclude cut with ID _6267_210_2084_1_1527764658526_3894730_375-219887-0_sp1.1 from training. Duration: 0.6090625 2023-10-04 13:10:16,135 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6788_210_1388_1_1527898698_4562021_180-75288-0 from training. Duration: 0.99 2023-10-04 13:10:19,085 WARNING [train.py:1204] (2/4) Exclude cut with ID _8030_210_7497_1_1528444886781_3531089_262-86750-0_sp1.1 from training. Duration: 0.736375 2023-10-04 13:10:19,099 WARNING [train.py:1204] (2/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_934-325498-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 13:10:19,135 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_456-22141-0_sp0.9 from training. Duration: 0.97775 2023-10-04 13:10:22,369 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.ff2_skip_rate, batch_count=1665666.6666666667, ans=0.0 2023-10-04 13:10:23,519 WARNING [train.py:1204] (2/4) Exclude cut with ID _49643_210_7989_1_1534640244224_3939031_598-161926-0_sp1.1 from training. Duration: 0.8181875 2023-10-04 13:10:23,628 WARNING [train.py:1204] (2/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_345-49721-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 13:10:25,541 WARNING [train.py:1204] (2/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_440-141811-0_sp1.1 from training. Duration: 0.863625 2023-10-04 13:10:26,905 WARNING [train.py:1204] (2/4) Exclude cut with ID _64048_210_10626_1_1535198382802_3835179_394-212120-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 13:10:28,233 WARNING [train.py:1204] (2/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_91-277684-0 from training. Duration: 0.74 2023-10-04 13:10:34,203 WARNING [train.py:1204] (2/4) Exclude cut with ID _23038_210_9570_1_1532001584158_3652419_191-346343-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 13:10:35,589 WARNING [train.py:1204] (2/4) Exclude cut with ID _15140_210_9288_1_1530791388450_4880149_351-347974-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 13:10:35,617 WARNING [train.py:1204] (2/4) Exclude cut with ID _58605_210_12210_1_1534723195939_3904259_48-269402-0_sp1.1 from training. Duration: 0.87275 2023-10-04 13:10:35,623 WARNING [train.py:1204] (2/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_401-53423-0_sp0.9 from training. Duration: 0.611125 2023-10-04 13:10:38,296 WARNING [train.py:1204] (2/4) Exclude cut with ID _42659_210_2070_1_1533607442765_3120380_158-49004-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 13:10:38,412 WARNING [train.py:1204] (2/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_81-75849-0_sp1.1 from training. Duration: 0.3545625 2023-10-04 13:10:41,742 WARNING [train.py:1204] (2/4) Exclude cut with ID _27052_210_5986_1_1532329197187_3585009_314-106499-0_sp1.1 from training. Duration: 0.836375 2023-10-04 13:10:43,234 WARNING [train.py:1204] (2/4) Exclude cut with ID _4626_210_3532_1_1526817652717_3916859_446-206656-0_sp0.9 from training. Duration: 0.8444375 2023-10-04 13:10:44,578 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_53378_210_17282_1_1534221071_5769509_158-114960-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 13:10:46,051 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_3171_210_2087_1_1526109084_2330007_37-64334-0_sp0.9 from training. Duration: 0.911125 2023-10-04 13:10:47,296 WARNING [train.py:1204] (2/4) Exclude cut with ID _5410_210_1388_1_1527397055795_1719020_39-112221-0 from training. Duration: 0.69 2023-10-04 13:10:47,331 WARNING [train.py:1204] (2/4) Exclude cut with ID _8064_210_5399_1_1528372299291_4282973_4-91037-0_sp0.9 from training. Duration: 0.97775 2023-10-04 13:10:47,345 WARNING [train.py:1204] (2/4) Exclude cut with ID ID0079W0443-717795 from training. Duration: 0.541 2023-10-04 13:10:51,282 INFO [train.py:1046] (2/4) Epoch 48, batch 200, loss[loss=0.1506, simple_loss=0.2391, pruned_loss=0.03109, over 24560.00 frames. ], tot_loss[loss=0.1549, simple_loss=0.237, pruned_loss=0.03639, over 3012820.11 frames. ], batch size: 71, lr: 2.12e-03, grad_scale: 8.0 2023-10-04 13:10:51,349 WARNING [train.py:1204] (2/4) Exclude cut with ID _34734_210_4525_1_1533365977542_4448720_766-297928-0_sp1.1 from training. Duration: 0.936375 2023-10-04 13:10:54,669 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.717e+02 2.085e+02 2.349e+02 2.813e+02 4.148e+02, threshold=4.699e+02, percent-clipped=0.0 2023-10-04 13:10:54,826 WARNING [train.py:1204] (2/4) Exclude cut with ID _51471_210_1889_1_1534399391390_3094250_310-337793-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 13:10:54,838 WARNING [train.py:1204] (2/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_678-159307-0_sp0.9 from training. Duration: 0.9444375 2023-10-04 13:10:57,655 WARNING [train.py:1204] (2/4) Exclude cut with ID _50093_210_4220_1_1534417141207_3432299_259-322252-0 from training. Duration: 0.92 2023-10-04 13:10:57,831 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.balancer1.prob, batch_count=1665800.0, ans=0.125 2023-10-04 13:10:59,141 WARNING [train.py:1204] (2/4) Exclude cut with ID _47790_210_16187_1_1533862814066_4207519_67-103444-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 13:10:59,159 WARNING [train.py:1204] (2/4) Exclude cut with ID _40551_210_10635_1_1533776424632_3416179_311-247578-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 13:11:02,516 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_4633_210_2040_1_1526824163_4145343_416-76117-0 from training. Duration: 0.95 2023-10-04 13:11:03,831 WARNING [train.py:1204] (2/4) Exclude cut with ID _5166_210_2084_1_1527073197160_4087993_331-117829-0_sp0.9 from training. Duration: 0.688875 2023-10-04 13:11:05,266 WARNING [train.py:1204] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_299-247059-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 13:11:06,730 WARNING [train.py:1204] (2/4) Exclude cut with ID _64002_210_9570_1_1535198605157_38979_2-240845-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 13:11:09,500 WARNING [train.py:1204] (2/4) Exclude cut with ID _4681_210_3525_1_1527250689464_120889_15-126760-0_sp0.9 from training. Duration: 0.9 2023-10-04 13:11:10,817 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_21500_210_2054_1_1532149310_2716758_256-156611-0_sp1.1 from training. Duration: 0.936375 2023-10-04 13:11:10,841 WARNING [train.py:1204] (2/4) Exclude cut with ID _16908_210_5724_1_1531119157248_3772700_624-203729-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 13:11:30,009 WARNING [train.py:1204] (2/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_605-219732-0_sp1.1 from training. Duration: 0.8545625 2023-10-04 13:11:30,051 WARNING [train.py:1204] (2/4) Exclude cut with ID _44718_210_5816_1_1534050051869_6469030_380-167520-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 13:11:31,370 WARNING [train.py:1204] (2/4) Exclude cut with ID _19896_210_1883_1_1531709022662_6649569_94-229927-0_sp1.1 from training. Duration: 0.8818125 2023-10-04 13:11:31,422 WARNING [train.py:1204] (2/4) Exclude cut with ID _19514_210_9259_1_1531530071330_5084230_519-174300-0_sp1.1 from training. Duration: 0.963625 2023-10-04 13:11:31,483 WARNING [train.py:1204] (2/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_340-174176-0_sp0.9 from training. Duration: 0.5444375 2023-10-04 13:11:31,486 WARNING [train.py:1204] (2/4) Exclude cut with ID _8263_210_1664_1_1528439452891_970220_68-181805-0_sp1.1 from training. Duration: 0.5818125 2023-10-04 13:11:35,257 WARNING [train.py:1204] (2/4) Exclude cut with ID _3120_210_3525_1_1526036143112_4072980_348-257723-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 13:11:35,504 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.ff3_skip_rate, batch_count=1666000.0, ans=0.0 2023-10-04 13:11:36,617 WARNING [train.py:1204] (2/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_453-133983-0_sp1.1 from training. Duration: 0.7090625 2023-10-04 13:11:36,686 WARNING [train.py:1204] (2/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_17-86060-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 13:11:37,958 WARNING [train.py:1204] (2/4) Exclude cut with ID _29877_210_6228_1_1532397791106_5276149_228-184657-0_sp1.1 from training. Duration: 0.97275 2023-10-04 13:11:39,333 WARNING [train.py:1204] (2/4) Exclude cut with ID _18516_210_9206_1_1531704653206_3692406_198-334870-0 from training. Duration: 0.92 2023-10-04 13:11:39,372 WARNING [train.py:1204] (2/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_340-174176-0_sp1.1 from training. Duration: 0.4454375 2023-10-04 13:11:40,609 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_3133_210_2087_1_1526106446_2324690_14-139186-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 13:11:43,887 WARNING [train.py:1204] (2/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_126-29104-0_sp0.9 from training. Duration: 0.9444375 2023-10-04 13:11:49,356 WARNING [train.py:1204] (2/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_84-10310-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 13:11:56,349 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_15517_210_6753_1_1530954008_3487336_79-273076-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 13:11:56,399 WARNING [train.py:1204] (2/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_371-18169-0_sp0.9 from training. Duration: 0.9555625 2023-10-04 13:12:01,950 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_68702_210_23689_1_1535799577_3420068_170-92788-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 13:12:02,410 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.5.encoder.layers.1.feed_forward1.out_whiten, num_groups=1, num_channels=256, metric=5.87 vs. limit=15.0 2023-10-04 13:12:04,538 WARNING [train.py:1204] (2/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_78-182912-0 from training. Duration: 0.59 2023-10-04 13:12:05,069 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.4.encoder.layers.1.feed_forward2.out_whiten, num_groups=1, num_channels=384, metric=13.69 vs. limit=15.0 2023-10-04 13:12:05,741 INFO [train.py:1046] (2/4) Epoch 48, batch 250, loss[loss=0.1784, simple_loss=0.2488, pruned_loss=0.05401, over 19571.00 frames. ], tot_loss[loss=0.1545, simple_loss=0.2358, pruned_loss=0.03666, over 3378927.04 frames. ], batch size: 388, lr: 2.12e-03, grad_scale: 8.0 2023-10-04 13:12:05,816 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_3019_210_2028_1_1526044245_3264865_297-12384-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 13:12:05,821 WARNING [train.py:1204] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_147-111137-0_sp1.1 from training. Duration: 0.863625 2023-10-04 13:12:05,826 WARNING [train.py:1204] (2/4) Exclude cut with ID _28323_210_16187_1_1532480395667_4100300_117-8859-0_sp1.1 from training. Duration: 0.97275 2023-10-04 13:12:07,208 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7762_210_1794_1_1528199508_4057408_419-125009-0_sp1.1 from training. Duration: 0.7545625 2023-10-04 13:12:07,304 WARNING [train.py:1204] (2/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_268-87909-0 from training. Duration: 0.99 2023-10-04 13:12:09,217 WARNING [train.py:1204] (2/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_118-311357-0_sp1.1 from training. Duration: 0.92725 2023-10-04 13:12:09,237 WARNING [train.py:1204] (2/4) Exclude cut with ID ID0377W0003-870225 from training. Duration: 0.852 2023-10-04 13:12:10,651 WARNING [train.py:1204] (2/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_56-5838-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 13:12:13,765 WARNING [train.py:1204] (2/4) Exclude cut with ID _14603_210_4220_1_1530615688441_3097319_90-31383-0_sp0.9 from training. Duration: 0.9666875 2023-10-04 13:12:13,854 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_31583_210_3852_1_1532595851_4024413_58-86657-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 13:12:14,059 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.feed_forward1.hidden_balancer.prob, batch_count=1666133.3333333333, ans=0.125 2023-10-04 13:12:15,094 WARNING [train.py:1204] (2/4) Exclude cut with ID _30304_210_6319_1_1532476827261_7582060_344-113875-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 13:12:17,823 WARNING [train.py:1204] (2/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_278-104676-0_sp1.1 from training. Duration: 0.8909375 2023-10-04 13:12:17,844 WARNING [train.py:1204] (2/4) Exclude cut with ID _55844_210_2346_1_1534471212569_7625707_764-2387-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 13:12:19,271 WARNING [train.py:1204] (2/4) Exclude cut with ID _27430_210_7770_1_1532604689375_1378590_157-14963-0_sp1.1 from training. Duration: 0.963625 2023-10-04 13:12:19,601 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.1.conv_skip_rate, batch_count=1666200.0, ans=0.0 2023-10-04 13:12:19,626 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.feed_forward3.hidden_balancer.prob, batch_count=1666200.0, ans=0.125 2023-10-04 13:12:20,787 WARNING [train.py:1204] (2/4) Exclude cut with ID _7160_210_1876_1_1527940923231_3703799_282-322611-0_sp1.1 from training. Duration: 0.7909375 2023-10-04 13:12:32,780 WARNING [train.py:1204] (2/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_190-138640-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 13:12:33,076 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.bypass.scale_min, batch_count=1666200.0, ans=0.2 2023-10-04 13:12:34,267 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_33348_210_5632_1_1533086912_3324030_63-270922-0_sp1.1 from training. Duration: 0.97275 2023-10-04 13:12:35,610 WARNING [train.py:1204] (2/4) Exclude cut with ID _54871_210_22356_1_1534665798253_3856359_135-184281-0_sp1.1 from training. Duration: 0.9 2023-10-04 13:12:42,815 WARNING [train.py:1204] (2/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_277-210798-0_sp1.1 from training. Duration: 0.5 2023-10-04 13:12:42,872 WARNING [train.py:1204] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_402-253027-0_sp0.9 from training. Duration: 0.6 2023-10-04 13:12:44,266 WARNING [train.py:1204] (2/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_540-275685-0_sp0.9 from training. Duration: 0.988875 2023-10-04 13:12:44,309 WARNING [train.py:1204] (2/4) Exclude cut with ID _59493_210_17282_1_1535078419077_3225879_118-18960-0_sp1.1 from training. Duration: 0.936375 2023-10-04 13:12:46,266 WARNING [train.py:1204] (2/4) Exclude cut with ID _6194_210_1534_1_1527511826160_3941194_321-148641-0_sp1.1 from training. Duration: 0.5818125 2023-10-04 13:12:46,273 WARNING [train.py:1204] (2/4) Exclude cut with ID _61133_210_9969_1_1535196777227_3696409_333-270325-0_sp1.1 from training. Duration: 0.8454375 2023-10-04 13:12:46,338 WARNING [train.py:1204] (2/4) Exclude cut with ID _49376_210_5986_1_1533985278732_3370740_307-145221-0_sp1.1 from training. Duration: 0.936375 2023-10-04 13:12:49,004 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6846_210_6753_1_1527850767_4295158_2-315073-0_sp0.9 from training. Duration: 0.97775 2023-10-04 13:12:51,728 WARNING [train.py:1204] (2/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_17-25045-0 from training. Duration: 0.59 2023-10-04 13:12:51,740 WARNING [train.py:1204] (2/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_503-333510-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 13:12:53,133 WARNING [train.py:1204] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_238-245655-0_sp1.1 from training. Duration: 0.736375 2023-10-04 13:12:53,159 WARNING [train.py:1204] (2/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_329-82235-0_sp0.9 from training. Duration: 0.911125 2023-10-04 13:12:53,164 WARNING [train.py:1204] (2/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_325-113798-0_sp0.9 from training. Duration: 0.9444375 2023-10-04 13:12:54,553 WARNING [train.py:1204] (2/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_395-37222-0_sp0.9 from training. Duration: 0.8444375 2023-10-04 13:12:55,855 WARNING [train.py:1204] (2/4) Exclude cut with ID _64135_210_2253_1_1535677267876_7608972_498-5039-0_sp1.1 from training. Duration: 0.7090625 2023-10-04 13:12:55,864 WARNING [train.py:1204] (2/4) Exclude cut with ID _14922_210_6228_1_1530707435273_3693310_303-228738-0_sp0.9 from training. Duration: 0.9333125 2023-10-04 13:12:57,260 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_577-220899-0_sp1.1 from training. Duration: 0.92725 2023-10-04 13:12:59,207 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_3475_210_2028_1_1526193578_3622569_377-166133-0_sp1.1 from training. Duration: 0.7818125 2023-10-04 13:12:59,243 WARNING [train.py:1204] (2/4) Exclude cut with ID _49376_210_5986_1_1533985278732_3370740_272-313512-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 13:12:59,413 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.conv_module1.balancer1.prob, batch_count=1666333.3333333333, ans=0.125 2023-10-04 13:13:01,995 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7762_210_1794_1_1528199508_4057408_422-227034-0_sp0.9 from training. Duration: 0.811125 2023-10-04 13:13:06,527 WARNING [train.py:1204] (2/4) Exclude cut with ID _18895_210_6228_1_1531357274234_3784930_74-341428-0_sp1.1 from training. Duration: 0.92725 2023-10-04 13:13:08,364 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.4.encoder.layers.1.nonlin_attention.whiten2, num_groups=1, num_channels=384, metric=4.17 vs. limit=15.0 2023-10-04 13:13:11,281 WARNING [train.py:1204] (2/4) Exclude cut with ID _44902_210_16187_1_1533888006062_3792078_149-67212-0_sp1.1 from training. Duration: 0.7909375 2023-10-04 13:13:14,637 WARNING [train.py:1204] (2/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_243-126020-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 13:13:16,123 WARNING [train.py:1204] (2/4) Exclude cut with ID _54218_210_12210_1_1534413646388_4409259_259-6561-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 13:13:16,377 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=1666400.0, ans=0.1 2023-10-04 13:13:18,951 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_41452_210_7291_1_1533437687_3941281_405-11559-0 from training. Duration: 0.91 2023-10-04 13:13:20,236 INFO [train.py:1046] (2/4) Epoch 48, batch 300, loss[loss=0.1436, simple_loss=0.2304, pruned_loss=0.02841, over 24492.00 frames. ], tot_loss[loss=0.1539, simple_loss=0.2349, pruned_loss=0.03647, over 3681295.56 frames. ], batch size: 66, lr: 2.12e-03, grad_scale: 8.0 2023-10-04 13:13:20,330 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_20120_210_6753_1_1531643035_5482859_759-225084-0_sp1.1 from training. Duration: 0.97275 2023-10-04 13:13:20,348 WARNING [train.py:1204] (2/4) Exclude cut with ID _14747_210_8341_1_1530685797463_3654089_147-131229-0_sp0.9 from training. Duration: 0.8444375 2023-10-04 13:13:22,911 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.682e+02 2.014e+02 2.190e+02 2.558e+02 4.207e+02, threshold=4.380e+02, percent-clipped=0.0 2023-10-04 13:13:22,996 WARNING [train.py:1204] (2/4) Exclude cut with ID _14867_210_1968_1_1530684179890_3846969_319-250868-0 from training. Duration: 0.99 2023-10-04 13:13:23,030 WARNING [train.py:1204] (2/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_580-93798-0_sp0.9 from training. Duration: 0.62225 2023-10-04 13:13:24,421 WARNING [train.py:1204] (2/4) Exclude cut with ID _9679_210_7497_1_1529129212906_4363929_512-313889-0_sp0.9 from training. Duration: 0.9 2023-10-04 13:13:24,434 WARNING [train.py:1204] (2/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_18-189198-0 from training. Duration: 0.9 2023-10-04 13:13:27,382 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.conv_skip_rate, batch_count=1666466.6666666667, ans=0.0 2023-10-04 13:13:29,893 WARNING [train.py:1204] (2/4) Exclude cut with ID _49755_210_18053_1_1534581005846_3218709_137-64179-0_sp1.1 from training. Duration: 0.92725 2023-10-04 13:13:29,969 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_48049_210_5632_1_1533864553_3519937_306-283794-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 13:13:34,449 WARNING [train.py:1204] (2/4) Exclude cut with ID _43552_210_8913_1_1533726031059_3523429_25-120161-0_sp1.1 from training. Duration: 0.8909375 2023-10-04 13:13:34,507 WARNING [train.py:1204] (2/4) Exclude cut with ID _8833_210_5219_1_1528592665210_7553059_319-349181-0 from training. Duration: 0.99 2023-10-04 13:13:36,171 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_296-332325-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 13:13:37,490 WARNING [train.py:1204] (2/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_436-186311-0_sp1.1 from training. Duration: 0.5909375 2023-10-04 13:13:37,502 WARNING [train.py:1204] (2/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_348-127789-0 from training. Duration: 0.85 2023-10-04 13:13:37,507 WARNING [train.py:1204] (2/4) Exclude cut with ID _3992_210_1747_1_1526644816031_3731060_443-195183-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 13:13:42,141 WARNING [train.py:1204] (2/4) Exclude cut with ID _3465_210_3525_1_1526175667060_3574049_308-231735-0_sp1.1 from training. Duration: 0.663625 2023-10-04 13:13:46,808 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6786_210_1388_1_1527839699_4194495_378-46070-0_sp1.1 from training. Duration: 0.6545625 2023-10-04 13:13:46,848 WARNING [train.py:1204] (2/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_63-101996-0 from training. Duration: 0.88 2023-10-04 13:13:47,530 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.2.encoder.layers.2.self_attn1.whiten, num_groups=1, num_channels=384, metric=17.62 vs. limit=22.5 2023-10-04 13:13:49,667 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_2541_210_1794_1_1525590269_3084765_156-173713-0 from training. Duration: 0.81 2023-10-04 13:13:49,707 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_217-239796-0_sp1.1 from training. Duration: 0.963625 2023-10-04 13:13:52,507 WARNING [train.py:1204] (2/4) Exclude cut with ID _21878_210_3525_1_1531738772002_7264330_530-329400-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 13:13:55,210 WARNING [train.py:1204] (2/4) Exclude cut with ID _21783_210_11357_1_1531742458730_3762679_150-62920-0_sp1.1 from training. Duration: 0.963625 2023-10-04 13:13:55,210 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_5522_210_3273_1_1527249606_3909096_366-172508-0 from training. Duration: 0.66 2023-10-04 13:13:55,213 WARNING [train.py:1204] (2/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_303-340446-0_sp1.1 from training. Duration: 0.8181875 2023-10-04 13:13:56,619 WARNING [train.py:1204] (2/4) Exclude cut with ID _8699_210_2069_1_1528630346831_3960409_160-24408-0_sp1.1 from training. Duration: 0.82725 2023-10-04 13:13:56,963 INFO [scaling.py:1118] (2/4) WithLoss: name=encoder.encoders.5.encoder.layers.1.self_attn_weights, loss-sum=0.000e+00 2023-10-04 13:13:58,117 WARNING [train.py:1204] (2/4) Exclude cut with ID _41314_210_12651_1_1533459476543_3766460_299-187801-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 13:13:58,144 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_34886_210_10633_1_1533203882_1755661_92-345288-0_sp1.1 from training. Duration: 0.97275 2023-10-04 13:14:02,784 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_5438_210_1794_1_1527336687_2600009_104-288274-0_sp1.1 from training. Duration: 0.52725 2023-10-04 13:14:02,789 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_154-316450-0 from training. Duration: 0.76 2023-10-04 13:14:02,877 WARNING [train.py:1204] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_23-189234-0_sp0.9 from training. Duration: 0.92225 2023-10-04 13:14:03,116 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.conv_module2.balancer1.prob, batch_count=1666666.6666666667, ans=0.125 2023-10-04 13:14:07,138 WARNING [train.py:1204] (2/4) Exclude cut with ID _4779_210_2084_1_1527318022944_3914893_346-168820-0_sp1.1 from training. Duration: 0.963625 2023-10-04 13:14:09,093 WARNING [train.py:1204] (2/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_5-55893-0 from training. Duration: 0.55 2023-10-04 13:14:09,180 WARNING [train.py:1204] (2/4) Exclude cut with ID _20455_210_5219_1_1531565996775_8098100_180-24517-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 13:14:14,508 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_3133_210_2087_1_1526106446_2324690_243-143119-0_sp0.9 from training. Duration: 0.8555625 2023-10-04 13:14:17,283 WARNING [train.py:1204] (2/4) Exclude cut with ID _65585_210_12210_1_1535450451584_3764098_293-212045-0_sp1.1 from training. Duration: 0.92725 2023-10-04 13:14:17,289 WARNING [train.py:1204] (2/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_577-332600-0 from training. Duration: 0.57 2023-10-04 13:14:20,130 WARNING [train.py:1204] (2/4) Exclude cut with ID _30039_210_13270_1_1532484612900_2725150_104-289874-0_sp1.1 from training. Duration: 0.963625 2023-10-04 13:14:20,137 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_3172_210_2087_1_1526292929_3987865_197-97762-0_sp1.1 from training. Duration: 0.6818125 2023-10-04 13:14:22,857 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_256-150908-0_sp1.1 from training. Duration: 0.963625 2023-10-04 13:14:22,967 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_296-287678-0_sp1.1 from training. Duration: 0.863625 2023-10-04 13:14:24,305 WARNING [train.py:1204] (2/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_443-30034-0 from training. Duration: 0.64 2023-10-04 13:14:24,349 WARNING [train.py:1204] (2/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_494-199538-0_sp0.9 from training. Duration: 0.8333125 2023-10-04 13:14:24,396 WARNING [train.py:1204] (2/4) Exclude cut with ID _19708_210_9206_1_1532136650733_4238364_61-162325-0_sp1.1 from training. Duration: 0.97275 2023-10-04 13:14:25,677 WARNING [train.py:1204] (2/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_475-127455-0 from training. Duration: 0.7 2023-10-04 13:14:28,700 WARNING [train.py:1204] (2/4) Exclude cut with ID _48677_210_5663_1_1533902405919_3892989_188-11581-0_sp1.1 from training. Duration: 0.963625 2023-10-04 13:14:28,723 WARNING [train.py:1204] (2/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_136-8381-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 13:14:30,132 WARNING [train.py:1204] (2/4) Exclude cut with ID _55328_210_27767_1_1534580973260_4088170_270-229555-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 13:14:30,169 WARNING [train.py:1204] (2/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_372-266951-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 13:14:31,423 WARNING [train.py:1204] (2/4) Exclude cut with ID _47790_210_16187_1_1533862814066_4207519_14-242149-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 13:14:34,161 INFO [train.py:1046] (2/4) Epoch 48, batch 350, loss[loss=0.1603, simple_loss=0.2323, pruned_loss=0.04417, over 23809.00 frames. ], tot_loss[loss=0.1526, simple_loss=0.2335, pruned_loss=0.03584, over 3923021.67 frames. ], batch size: 179, lr: 2.12e-03, grad_scale: 8.0 2023-10-04 13:14:35,666 WARNING [train.py:1204] (2/4) Exclude cut with ID _7877_210_6014_1_1528286358099_3750136_159-76512-0_sp1.1 from training. Duration: 0.936375 2023-10-04 13:14:36,929 WARNING [train.py:1204] (2/4) Exclude cut with ID _8263_210_1664_1_1528438953494_294100_40-204731-0_sp1.1 from training. Duration: 0.4545625 2023-10-04 13:14:38,415 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8278_210_2550_1_1528637099_4220413_694-238413-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 13:14:42,254 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.conv_skip_rate, batch_count=1666800.0, ans=0.0 2023-10-04 13:14:44,774 WARNING [train.py:1204] (2/4) Exclude cut with ID _47278_210_12041_1_1533902418349_3576110_421-172710-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 13:14:47,592 WARNING [train.py:1204] (2/4) Exclude cut with ID _14970_210_15709_1_1530773543493_1715730_215-282680-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 13:14:47,637 WARNING [train.py:1204] (2/4) Exclude cut with ID _64639_210_5816_1_1535367619437_3528000_306-149449-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 13:14:50,419 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_28-291982-0 from training. Duration: 0.97 2023-10-04 13:14:51,830 WARNING [train.py:1204] (2/4) Exclude cut with ID _55134_210_21982_1_1534590028377_3882170_412-15870-0_sp1.1 from training. Duration: 0.936375 2023-10-04 13:14:51,871 WARNING [train.py:1204] (2/4) Exclude cut with ID _13684_210_7497_1_1530619354863_3598359_312-235606-0 from training. Duration: 0.6 2023-10-04 13:14:55,947 WARNING [train.py:1204] (2/4) Exclude cut with ID _37112_210_2575_1_1532997007673_7268610_347-238993-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 13:14:55,989 WARNING [train.py:1204] (2/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_121-284152-0 from training. Duration: 0.59 2023-10-04 13:14:56,046 WARNING [train.py:1204] (2/4) Exclude cut with ID _2941_210_2577_1_1526199673150_2489759_228-2961-0_sp1.1 from training. Duration: 0.97275 2023-10-04 13:14:57,526 WARNING [train.py:1204] (2/4) Exclude cut with ID _28323_210_16187_1_1532480395667_4100300_531-24157-0 from training. Duration: 0.98 2023-10-04 13:14:59,503 WARNING [train.py:1204] (2/4) Exclude cut with ID _32704_210_8954_1_1533017488840_6747370_266-329755-0_sp0.9 from training. Duration: 0.988875 2023-10-04 13:15:00,897 WARNING [train.py:1204] (2/4) Exclude cut with ID _6653_210_2629_1_1527919172441_6732679_1038-146421-0_sp1.1 from training. Duration: 0.97275 2023-10-04 13:15:02,375 WARNING [train.py:1204] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_391-22148-0_sp0.9 from training. Duration: 0.8555625 2023-10-04 13:15:03,716 WARNING [train.py:1204] (2/4) Exclude cut with ID _64521_210_14699_1_1535243427259_3815328_543-341117-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 13:15:03,735 WARNING [train.py:1204] (2/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_52-168712-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 13:15:03,775 WARNING [train.py:1204] (2/4) Exclude cut with ID _33920_210_4220_1_1533257851500_3084399_198-334063-0_sp1.1 from training. Duration: 0.8545625 2023-10-04 13:15:03,787 WARNING [train.py:1204] (2/4) Exclude cut with ID _50093_210_4220_1_1534417141207_3432299_69-111987-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 13:15:05,013 WARNING [train.py:1204] (2/4) Exclude cut with ID _14821_210_8750_1_1530680503991_6829359_189-297451-0_sp0.9 from training. Duration: 0.611125 2023-10-04 13:15:06,443 WARNING [train.py:1204] (2/4) Exclude cut with ID _8030_210_7497_1_1528444886781_3531089_262-86750-0_sp0.9 from training. Duration: 0.9 2023-10-04 13:15:06,448 WARNING [train.py:1204] (2/4) Exclude cut with ID _24612_210_8913_1_1531998054193_3806359_294-191196-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 13:15:13,215 WARNING [train.py:1204] (2/4) Exclude cut with ID _16909_210_5724_1_1531292079912_4118919_707-342896-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 13:15:14,514 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_9460_210_4211_1_1528954587_10049667_572-37035-0_sp0.9 from training. Duration: 0.888875 2023-10-04 13:15:14,561 WARNING [train.py:1204] (2/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_762-109886-0_sp1.1 from training. Duration: 0.82725 2023-10-04 13:15:15,940 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_14953_210_2151_1_1530701710_5616165_519-326393-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 13:15:20,138 WARNING [train.py:1204] (2/4) Exclude cut with ID _25914_210_6400_1_1532141317058_644740_52-96808-0 from training. Duration: 0.91 2023-10-04 13:15:20,142 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_68095_210_20185_1_1535718226_8888855_1079-94525-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 13:15:24,316 WARNING [train.py:1204] (2/4) Exclude cut with ID _14860_210_4915_1_1530702020340_7343240_746-3647-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 13:15:24,317 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_70379_210_7925_1_1535941455_557455_49-101720-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 13:15:24,329 WARNING [train.py:1204] (2/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_8-307291-0_sp1.1 from training. Duration: 0.936375 2023-10-04 13:15:25,749 WARNING [train.py:1204] (2/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_117-290916-0 from training. Duration: 0.51 2023-10-04 13:15:28,507 WARNING [train.py:1204] (2/4) Exclude cut with ID _22020_210_2461_1_1531724164513_6326900_944-339840-0_sp1.1 from training. Duration: 0.963625 2023-10-04 13:15:28,592 WARNING [train.py:1204] (2/4) Exclude cut with ID IC0515W0043-254911 from training. Duration: 0.976 2023-10-04 13:15:30,447 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_3910_210_1794_1_1526742306_334530_28-238909-0 from training. Duration: 0.81 2023-10-04 13:15:30,469 WARNING [train.py:1204] (2/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_269-267275-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 13:15:33,026 WARNING [train.py:1204] (2/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_72-251255-0_sp1.1 from training. Duration: 0.8545625 2023-10-04 13:15:33,041 WARNING [train.py:1204] (2/4) Exclude cut with ID _14886_210_4220_1_1530702020355_3516190_175-229073-0 from training. Duration: 0.45 2023-10-04 13:15:34,485 WARNING [train.py:1204] (2/4) Exclude cut with ID _40519_210_13794_1_1533715228655_7337390_921-209452-0_sp1.1 from training. Duration: 0.963625 2023-10-04 13:15:39,044 WARNING [train.py:1204] (2/4) Exclude cut with ID _29857_210_20034_1_1532395886753_3798160_335-163562-0_sp0.9 from training. Duration: 0.9444375 2023-10-04 13:15:39,151 WARNING [train.py:1204] (2/4) Exclude cut with ID _58583_210_5236_1_1534726835266_3574249_384-58445-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 13:15:41,114 WARNING [train.py:1204] (2/4) Exclude cut with ID _44011_210_2046_1_1533722401691_3631310_96-315656-0_sp1.1 from training. Duration: 0.963625 2023-10-04 13:15:41,119 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_68484_210_1794_1_1535849804_7678097_549-170613-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 13:15:43,996 WARNING [train.py:1204] (2/4) Exclude cut with ID _37892_210_19596_1_1533198677897_3964058_214-148058-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 13:15:46,725 WARNING [train.py:1204] (2/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_415-78792-0_sp0.9 from training. Duration: 0.9 2023-10-04 13:15:47,968 INFO [train.py:1046] (2/4) Epoch 48, batch 400, loss[loss=0.1467, simple_loss=0.2221, pruned_loss=0.03568, over 23417.00 frames. ], tot_loss[loss=0.1521, simple_loss=0.2331, pruned_loss=0.03554, over 4100916.91 frames. ], batch size: 285, lr: 2.12e-03, grad_scale: 16.0 2023-10-04 13:15:49,416 WARNING [train.py:1204] (2/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_383-205833-0_sp1.1 from training. Duration: 0.863625 2023-10-04 13:15:49,494 WARNING [train.py:1204] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_167-137255-0 from training. Duration: 0.64 2023-10-04 13:15:49,502 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8275_210_4198_1_1528455320_3892571_484-154786-0_sp1.1 from training. Duration: 0.963625 2023-10-04 13:15:50,745 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.677e+02 2.047e+02 2.274e+02 2.611e+02 3.617e+02, threshold=4.549e+02, percent-clipped=0.0 2023-10-04 13:15:50,854 WARNING [train.py:1204] (2/4) Exclude cut with ID _40284_210_2084_1_1533513679814_3704279_396-319412-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 13:15:52,466 WARNING [train.py:1204] (2/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_280-120942-0_sp0.9 from training. Duration: 0.8555625 2023-10-04 13:15:53,790 WARNING [train.py:1204] (2/4) Exclude cut with ID _45630_210_10556_1_1533949174498_3902049_558-240889-0_sp1.1 from training. Duration: 0.92725 2023-10-04 13:15:56,528 WARNING [train.py:1204] (2/4) Exclude cut with ID _56235_210_10640_1_1534557335235_4161099_197-307458-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 13:15:57,879 WARNING [train.py:1204] (2/4) Exclude cut with ID _16909_210_5724_1_1531292079912_4118919_163-254843-0_sp1.1 from training. Duration: 0.92725 2023-10-04 13:15:59,250 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_103-84010-0 from training. Duration: 0.69 2023-10-04 13:16:01,229 WARNING [train.py:1204] (2/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_401-158239-0 from training. Duration: 0.71 2023-10-04 13:16:01,230 WARNING [train.py:1204] (2/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_899-233658-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 13:16:02,574 WARNING [train.py:1204] (2/4) Exclude cut with ID _23825_210_13449_1_1531915313526_3497150_240-271367-0 from training. Duration: 0.97 2023-10-04 13:16:03,938 WARNING [train.py:1204] (2/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_424-134619-0_sp1.1 from training. Duration: 0.92725 2023-10-04 13:16:06,824 WARNING [train.py:1204] (2/4) Exclude cut with ID _44831_210_13427_1_1533816011549_3689035_32-188147-0_sp1.1 from training. Duration: 0.87275 2023-10-04 13:16:06,827 WARNING [train.py:1204] (2/4) Exclude cut with ID _59170_210_6721_1_1534983633960_4203335_314-63879-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 13:16:06,850 WARNING [train.py:1204] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_602-275260-0 from training. Duration: 0.82 2023-10-04 13:16:08,610 WARNING [train.py:1204] (2/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_511-306767-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 13:16:08,644 WARNING [train.py:1204] (2/4) Exclude cut with ID _41817_210_18835_1_1533434477160_7423049_565-201775-0_sp1.1 from training. Duration: 0.92725 2023-10-04 13:16:08,653 WARNING [train.py:1204] (2/4) Exclude cut with ID _28317_210_12742_1_1532480302381_8510325_778-173151-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 13:16:08,691 WARNING [train.py:1204] (2/4) Exclude cut with ID _14998_210_5219_1_1530705582080_7474970_680-51001-0_sp1.1 from training. Duration: 0.963625 2023-10-04 13:16:13,140 WARNING [train.py:1204] (2/4) Exclude cut with ID IC0677W0059-334891 from training. Duration: 0.896 2023-10-04 13:16:13,187 WARNING [train.py:1204] (2/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_370-331600-0 from training. Duration: 0.94 2023-10-04 13:16:17,929 WARNING [train.py:1204] (2/4) Exclude cut with ID _66840_210_5219_1_1535452168426_4196349_446-240192-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 13:16:18,018 WARNING [train.py:1204] (2/4) Exclude cut with ID _65242_210_18628_1_1535369393717_3823149_38-6732-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 13:16:19,424 WARNING [train.py:1204] (2/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_302-2550-0 from training. Duration: 0.81 2023-10-04 13:16:20,868 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_43802_210_2290_1_1533516203_8829504_100-348441-0 from training. Duration: 0.91 2023-10-04 13:16:24,734 WARNING [train.py:1204] (2/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_329-285801-0_sp1.1 from training. Duration: 0.82725 2023-10-04 13:16:27,529 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_30167_210_3528_1_1532428012_4772375_343-334712-0_sp1.1 from training. Duration: 0.936375 2023-10-04 13:16:33,479 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7543_210_6753_1_1528543318_4552_1-85072-0 from training. Duration: 0.83 2023-10-04 13:16:36,316 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7002_210_1794_1_1527935634_4034444_487-236501-0_sp1.1 from training. Duration: 0.6 2023-10-04 13:16:37,708 WARNING [train.py:1204] (2/4) Exclude cut with ID _7160_210_1876_1_1527940923231_3703799_282-322611-0 from training. Duration: 0.87 2023-10-04 13:16:39,164 WARNING [train.py:1204] (2/4) Exclude cut with ID _67335_210_5219_1_1535525975652_4275229_570-262983-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 13:16:41,230 WARNING [train.py:1204] (2/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_625-338546-0_sp1.1 from training. Duration: 0.8 2023-10-04 13:16:41,251 WARNING [train.py:1204] (2/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_176-56120-0 from training. Duration: 0.83 2023-10-04 13:16:45,674 WARNING [train.py:1204] (2/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_963-187109-0_sp1.1 from training. Duration: 0.9 2023-10-04 13:16:47,163 WARNING [train.py:1204] (2/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_121-284152-0_sp0.9 from training. Duration: 0.6555625 2023-10-04 13:16:48,519 WARNING [train.py:1204] (2/4) Exclude cut with ID _45021_210_9994_1_1533619751094_3330288_104-81711-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 13:16:51,360 WARNING [train.py:1204] (2/4) Exclude cut with ID _51382_210_23862_1_1534492801838_3650104_129-343914-0_sp1.1 from training. Duration: 0.936375 2023-10-04 13:16:52,697 WARNING [train.py:1204] (2/4) Exclude cut with ID _14970_210_15709_1_1530775547361_1966965_8-169924-0 from training. Duration: 0.92 2023-10-04 13:16:54,108 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_2295_210_2087_1_1525500993_2407863_234-83104-0_sp1.1 from training. Duration: 0.563625 2023-10-04 13:16:55,452 WARNING [train.py:1204] (2/4) Exclude cut with ID _14687_210_5854_1_1530703838259_3664889_115-239046-0 from training. Duration: 0.67 2023-10-04 13:16:56,893 WARNING [train.py:1204] (2/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_690-126306-0_sp1.1 from training. Duration: 0.7090625 2023-10-04 13:16:56,901 WARNING [train.py:1204] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_625-102955-0_sp0.9 from training. Duration: 0.8555625 2023-10-04 13:16:57,171 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.balancer2.prob, batch_count=1667400.0, ans=0.125 2023-10-04 13:16:58,338 WARNING [train.py:1204] (2/4) Exclude cut with ID _9389_210_1664_1_1529217337301_5289269_289-213417-0 from training. Duration: 0.89 2023-10-04 13:16:59,916 WARNING [train.py:1204] (2/4) Exclude cut with ID _13253_210_1925_1_1530757775819_3577873_191-144977-0_sp0.9 from training. Duration: 0.8666875 2023-10-04 13:16:59,956 WARNING [train.py:1204] (2/4) Exclude cut with ID _21783_210_11357_1_1531742458730_3762679_325-155703-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 13:16:59,997 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_2295_210_2087_1_1525500993_2407863_116-238894-0_sp0.9 from training. Duration: 0.77775 2023-10-04 13:17:01,184 INFO [train.py:1046] (2/4) Epoch 48, batch 450, loss[loss=0.1512, simple_loss=0.228, pruned_loss=0.03725, over 23800.00 frames. ], tot_loss[loss=0.1524, simple_loss=0.2336, pruned_loss=0.03567, over 4236347.94 frames. ], batch size: 195, lr: 2.12e-03, grad_scale: 16.0 2023-10-04 13:17:01,332 WARNING [train.py:1204] (2/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_94-126128-0 from training. Duration: 0.86 2023-10-04 13:17:01,349 WARNING [train.py:1204] (2/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_395-310220-0_sp0.9 from training. Duration: 0.988875 2023-10-04 13:17:01,422 WARNING [train.py:1204] (2/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_821-110852-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 13:17:02,769 WARNING [train.py:1204] (2/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_64-248359-0_sp1.1 from training. Duration: 0.62725 2023-10-04 13:17:02,790 WARNING [train.py:1204] (2/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_138-187031-0 from training. Duration: 0.99 2023-10-04 13:17:04,707 WARNING [train.py:1204] (2/4) Exclude cut with ID _4122_210_2084_1_1526713164417_4353280_528-206771-0_sp0.9 from training. Duration: 0.97775 2023-10-04 13:17:06,023 WARNING [train.py:1204] (2/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_844-96989-0_sp1.1 from training. Duration: 0.6454375 2023-10-04 13:17:07,452 WARNING [train.py:1204] (2/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_106-30782-0_sp1.1 from training. Duration: 0.72725 2023-10-04 13:17:18,462 WARNING [train.py:1204] (2/4) Exclude cut with ID _14683_210_5219_1_1530698352608_3974859_281-250040-0_sp1.1 from training. Duration: 0.936375 2023-10-04 13:17:18,498 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_46013_210_7234_1_1533776362_3744032_463-52378-0_sp0.9 from training. Duration: 0.9555625 2023-10-04 13:17:20,067 WARNING [train.py:1204] (2/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_299-34413-0 from training. Duration: 0.65 2023-10-04 13:17:21,414 WARNING [train.py:1204] (2/4) Exclude cut with ID _35799_210_5854_1_1533263487887_3529071_490-326847-0 from training. Duration: 0.99 2023-10-04 13:17:22,989 WARNING [train.py:1204] (2/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_589-9070-0_sp0.9 from training. Duration: 0.788875 2023-10-04 13:17:25,837 WARNING [train.py:1204] (2/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_884-29515-0_sp1.1 from training. Duration: 0.936375 2023-10-04 13:17:27,088 WARNING [train.py:1204] (2/4) Exclude cut with ID _15012_210_1968_1_1530856820429_5095350_474-119995-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 13:17:29,943 WARNING [train.py:1204] (2/4) Exclude cut with ID _65585_210_12210_1_1535450451584_3764098_211-97124-0_sp1.1 from training. Duration: 0.963625 2023-10-04 13:17:30,167 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.balancer2.prob, batch_count=1667600.0, ans=0.125 2023-10-04 13:17:31,211 WARNING [train.py:1204] (2/4) Exclude cut with ID _33640_210_2655_1_1533117605479_4855469_666-224463-0_sp1.1 from training. Duration: 0.963625 2023-10-04 13:17:34,290 WARNING [train.py:1204] (2/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_35-153886-0 from training. Duration: 0.84 2023-10-04 13:17:35,648 WARNING [train.py:1204] (2/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_210-106047-0 from training. Duration: 0.97 2023-10-04 13:17:38,270 WARNING [train.py:1204] (2/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_479-125611-0 from training. Duration: 0.96 2023-10-04 13:17:38,308 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_9417_210_1794_1_1529235261_4614131_427-153963-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 13:17:39,702 WARNING [train.py:1204] (2/4) Exclude cut with ID _23818_210_6228_1_1531917063884_3676920_133-170571-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 13:17:41,623 WARNING [train.py:1204] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_602-275260-0_sp1.1 from training. Duration: 0.7454375 2023-10-04 13:17:43,588 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9029W0121-509521 from training. Duration: 0.896 2023-10-04 13:17:43,596 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9029W0434-509821 from training. Duration: 0.64 2023-10-04 13:17:44,922 WARNING [train.py:1204] (2/4) Exclude cut with ID _23038_210_9570_1_1532001584158_3652419_289-307034-0_sp1.1 from training. Duration: 0.936375 2023-10-04 13:17:46,363 WARNING [train.py:1204] (2/4) Exclude cut with ID _29345_210_4381_1_1532689168263_3860169_149-257268-0_sp0.9 from training. Duration: 0.9 2023-10-04 13:17:47,778 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_405-339986-0_sp1.1 from training. Duration: 0.463625 2023-10-04 13:17:51,068 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_2655_210_2087_1_1525688959_3897400_437-337690-0_sp1.1 from training. Duration: 0.57275 2023-10-04 13:17:51,103 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7543_210_6753_1_1528541907_1399322_165-323619-0_sp1.1 from training. Duration: 0.62725 2023-10-04 13:17:52,455 WARNING [train.py:1204] (2/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_508-335369-0_sp1.1 from training. Duration: 0.436375 2023-10-04 13:17:53,786 WARNING [train.py:1204] (2/4) Exclude cut with ID _7852_210_5219_1_1528286471316_8139400_358-287492-0 from training. Duration: 0.96 2023-10-04 13:17:55,286 WARNING [train.py:1204] (2/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_322-217220-0_sp0.9 from training. Duration: 0.9555625 2023-10-04 13:17:56,244 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.1.encoder.layers.0.self_attn_weights.whiten_keys, num_groups=4, num_channels=128, metric=3.74 vs. limit=6.0 2023-10-04 13:17:56,712 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_3171_210_2087_1_1526109084_2330007_66-260583-0_sp0.9 from training. Duration: 0.8 2023-10-04 13:17:56,739 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8558_210_1410_1_1528505225_7613406_131-329524-0_sp1.1 from training. Duration: 0.7181875 2023-10-04 13:17:59,623 WARNING [train.py:1204] (2/4) Exclude cut with ID _9235_210_5219_1_1528891154253_8046120_30-326681-0 from training. Duration: 0.83 2023-10-04 13:18:03,641 WARNING [train.py:1204] (2/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_58-68956-0_sp0.9 from training. Duration: 0.92225 2023-10-04 13:18:03,713 WARNING [train.py:1204] (2/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_255-312654-0 from training. Duration: 0.91 2023-10-04 13:18:05,073 WARNING [train.py:1204] (2/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_99-167392-0 from training. Duration: 0.61 2023-10-04 13:18:06,349 WARNING [train.py:1204] (2/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_94-126128-0_sp0.9 from training. Duration: 0.9555625 2023-10-04 13:18:09,872 INFO [scaling.py:1118] (2/4) WithLoss: name=encoder.encoders.5.encoder.layers.0.self_attn_weights, loss-sum=0.000e+00 2023-10-04 13:18:11,054 WARNING [train.py:1204] (2/4) Exclude cut with ID _68635_210_9317_1_1535770813410_4315870_187-192817-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 13:18:11,179 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.conv_module2.balancer1.max_abs, batch_count=1667733.3333333333, ans=10.0 2023-10-04 13:18:14,093 WARNING [train.py:1204] (2/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_277-204970-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 13:18:15,888 INFO [train.py:1046] (2/4) Epoch 48, batch 500, loss[loss=0.1485, simple_loss=0.2318, pruned_loss=0.03264, over 24326.00 frames. ], tot_loss[loss=0.1529, simple_loss=0.234, pruned_loss=0.03593, over 4339029.52 frames. ], batch size: 61, lr: 2.12e-03, grad_scale: 16.0 2023-10-04 13:18:15,961 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6163_210_6400_1_1527506746_4263068_283-137811-0_sp0.9 from training. Duration: 0.8555625 2023-10-04 13:18:15,983 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9144W0121-566446 from training. Duration: 0.896 2023-10-04 13:18:18,919 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.682e+02 1.967e+02 2.163e+02 2.442e+02 3.421e+02, threshold=4.326e+02, percent-clipped=0.0 2023-10-04 13:18:19,020 WARNING [train.py:1204] (2/4) Exclude cut with ID _49371_210_12071_1_1533952761021_3812190_351-315519-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 13:18:20,374 WARNING [train.py:1204] (2/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_115-326778-0_sp1.1 from training. Duration: 0.8454375 2023-10-04 13:18:20,430 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_17292_210_6753_1_1531125388_2943734_436-198965-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 13:18:20,446 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9165W0117-576751 from training. Duration: 0.896 2023-10-04 13:18:21,857 WARNING [train.py:1204] (2/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_212-149947-0 from training. Duration: 0.73 2023-10-04 13:18:21,865 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_13757_210_3528_1_1530440682_4107239_65-167544-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 13:18:25,066 WARNING [train.py:1204] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_700-183549-0_sp0.9 from training. Duration: 0.7555625 2023-10-04 13:18:27,922 WARNING [train.py:1204] (2/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_216-343007-0_sp0.9 from training. Duration: 0.7333125 2023-10-04 13:18:29,307 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7544_210_6753_1_1528587722_3576686_295-258479-0_sp1.1 from training. Duration: 0.763625 2023-10-04 13:18:29,538 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=1667866.6666666667, ans=0.1 2023-10-04 13:18:31,965 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_365-287106-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 13:18:31,983 WARNING [train.py:1204] (2/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_300-3747-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 13:18:33,291 WARNING [train.py:1204] (2/4) Exclude cut with ID _14069_210_5632_1_1530512692033_4803390_487-303201-0_sp1.1 from training. Duration: 0.92725 2023-10-04 13:18:36,071 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.balancer1.prob, batch_count=1667866.6666666667, ans=0.125 2023-10-04 13:18:40,670 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.self_attn_weights.pos_emb_skip_rate, batch_count=1667866.6666666667, ans=0.0 2023-10-04 13:18:44,669 WARNING [train.py:1204] (2/4) Exclude cut with ID _14459_210_2228_1_1531443641946_7516700_668-123026-0_sp1.1 from training. Duration: 0.936375 2023-10-04 13:18:44,703 WARNING [train.py:1204] (2/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_108-53929-0_sp1.1 from training. Duration: 0.536375 2023-10-04 13:18:44,731 WARNING [train.py:1204] (2/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_266-137104-0_sp0.9 from training. Duration: 0.72225 2023-10-04 13:18:46,662 WARNING [train.py:1204] (2/4) Exclude cut with ID _12302_210_5219_1_1529924446466_8107280_902-168619-0_sp1.1 from training. Duration: 0.936375 2023-10-04 13:18:46,684 WARNING [train.py:1204] (2/4) Exclude cut with ID _59502_210_12210_1_1535107024698_1835139_146-162495-0 from training. Duration: 0.92 2023-10-04 13:18:47,318 WARNING [train.py:1204] (2/4) Exclude cut with ID _3465_210_3525_1_1526175667060_3574049_412-287526-0_sp1.1 from training. Duration: 0.6545625 2023-10-04 13:18:49,953 WARNING [train.py:1204] (2/4) Exclude cut with ID _7160_210_1876_1_1527940923231_3703799_120-150943-0_sp0.9 from training. Duration: 0.9 2023-10-04 13:18:51,277 WARNING [train.py:1204] (2/4) Exclude cut with ID _15037_210_2253_1_1530752294104_3267650_296-192597-0_sp1.1 from training. Duration: 0.736375 2023-10-04 13:18:51,309 WARNING [train.py:1204] (2/4) Exclude cut with ID _13252_210_1925_1_1530671372047_3336909_397-216497-0_sp1.1 from training. Duration: 0.87275 2023-10-04 13:18:51,324 WARNING [train.py:1204] (2/4) Exclude cut with ID _40094_210_16380_1_1533294005773_3641159_76-269320-0_sp1.1 from training. Duration: 0.936375 2023-10-04 13:18:51,526 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.conv_module1.balancer2.prob, batch_count=1667933.3333333333, ans=0.125 2023-10-04 13:18:52,622 WARNING [train.py:1204] (2/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_449-15508-0 from training. Duration: 0.82 2023-10-04 13:18:56,680 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9309W0194-640321 from training. Duration: 0.64 2023-10-04 13:18:58,720 WARNING [train.py:1204] (2/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_284-312178-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 13:18:58,957 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.balancer_ff3.min_abs, batch_count=1668000.0, ans=0.2 2023-10-04 13:19:00,335 WARNING [train.py:1204] (2/4) Exclude cut with ID _29853_210_7497_1_1532505600851_6642110_401-55937-0_sp1.1 from training. Duration: 0.92725 2023-10-04 13:19:01,696 WARNING [train.py:1204] (2/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_716-24918-0_sp1.1 from training. Duration: 0.92725 2023-10-04 13:19:01,722 WARNING [train.py:1204] (2/4) Exclude cut with ID _19637_210_4220_1_1531537167676_3205060_314-305638-0_sp1.1 from training. Duration: 0.92725 2023-10-04 13:19:01,751 WARNING [train.py:1204] (2/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_475-127455-0_sp1.1 from training. Duration: 0.636375 2023-10-04 13:19:04,546 WARNING [train.py:1204] (2/4) Exclude cut with ID _5444_210_2151_1_1527418808464_5312210_581-315411-0 from training. Duration: 0.62 2023-10-04 13:19:07,292 WARNING [train.py:1204] (2/4) Exclude cut with ID _44902_210_16187_1_1533888006062_3792078_149-67212-0_sp0.9 from training. Duration: 0.9666875 2023-10-04 13:19:07,480 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.conv_module1.balancer2.prob, batch_count=1668000.0, ans=0.125 2023-10-04 13:19:08,618 WARNING [train.py:1204] (2/4) Exclude cut with ID _31602_210_5854_1_1532677021456_4458250_63-63253-0_sp1.1 from training. Duration: 0.963625 2023-10-04 13:19:13,291 WARNING [train.py:1204] (2/4) Exclude cut with ID _55209_210_1531_1_1534675828996_4597160_660-203992-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 13:19:14,759 WARNING [train.py:1204] (2/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_83-190624-0_sp1.1 from training. Duration: 0.92725 2023-10-04 13:19:20,014 WARNING [train.py:1204] (2/4) Exclude cut with ID _58605_210_12210_1_1534723195939_3904259_154-139635-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 13:19:22,697 WARNING [train.py:1204] (2/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_164-224052-0 from training. Duration: 0.7 2023-10-04 13:19:22,705 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_1126-189071-0_sp1.1 from training. Duration: 0.963625 2023-10-04 13:19:22,716 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_15396_210_6890_1_1531308473_1938936_310-214789-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 13:19:25,393 WARNING [train.py:1204] (2/4) Exclude cut with ID _8395_210_1531_1_1528597871523_4411579_431-132470-0 from training. Duration: 0.74 2023-10-04 13:19:26,758 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_3780_210_2087_1_1526467760_2130483_170-259044-0_sp0.9 from training. Duration: 0.77775 2023-10-04 13:19:28,179 WARNING [train.py:1204] (2/4) Exclude cut with ID _12733_210_8341_1_1530167424556_3646380_537-230640-0_sp1.1 from training. Duration: 0.963625 2023-10-04 13:19:29,989 INFO [train.py:1046] (2/4) Epoch 48, batch 550, loss[loss=0.1769, simple_loss=0.2494, pruned_loss=0.05216, over 22708.00 frames. ], tot_loss[loss=0.1536, simple_loss=0.2352, pruned_loss=0.03602, over 4434943.61 frames. ], batch size: 322, lr: 2.12e-03, grad_scale: 16.0 2023-10-04 13:19:31,771 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.balancer_ff3.min_abs, batch_count=1668133.3333333333, ans=0.2 2023-10-04 13:19:34,185 WARNING [train.py:1204] (2/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_159-281310-0 from training. Duration: 0.95 2023-10-04 13:19:36,911 WARNING [train.py:1204] (2/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_434-83000-0 from training. Duration: 0.98 2023-10-04 13:19:36,930 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_4405_210_1849_1_1526732205_3593939_493-32464-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 13:19:36,945 WARNING [train.py:1204] (2/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_342-100157-0 from training. Duration: 0.54 2023-10-04 13:19:37,001 WARNING [train.py:1204] (2/4) Exclude cut with ID _44778_210_2054_1_1533798127203_7443090_765-250133-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 13:19:37,005 WARNING [train.py:1204] (2/4) Exclude cut with ID _38670_210_15379_1_1533448810659_4391059_81-312245-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 13:19:38,365 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_50050_210_16223_1_1535435796_7290138_1273-247611-0_sp1.1 from training. Duration: 0.936375 2023-10-04 13:19:39,739 WARNING [train.py:1204] (2/4) Exclude cut with ID _44831_210_13427_1_1533816011549_3689035_399-18591-0_sp1.1 from training. Duration: 0.936375 2023-10-04 13:19:39,767 WARNING [train.py:1204] (2/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_557-324258-0_sp0.9 from training. Duration: 0.9 2023-10-04 13:19:39,957 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.conv_skip_rate, batch_count=1668133.3333333333, ans=0.0 2023-10-04 13:19:42,930 WARNING [train.py:1204] (2/4) Exclude cut with ID _3465_210_3525_1_1526175667060_3574049_62-335552-0_sp1.1 from training. Duration: 0.7909375 2023-10-04 13:19:44,280 WARNING [train.py:1204] (2/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_673-264125-0_sp1.1 from training. Duration: 0.963625 2023-10-04 13:19:44,365 WARNING [train.py:1204] (2/4) Exclude cut with ID _8030_210_7497_1_1528444886781_3531089_262-86750-0 from training. Duration: 0.81 2023-10-04 13:19:44,378 WARNING [train.py:1204] (2/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_444-28041-0_sp1.1 from training. Duration: 0.77275 2023-10-04 13:19:45,995 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.attention_skip_rate, batch_count=1668200.0, ans=0.0 2023-10-04 13:19:49,122 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_34008_210_10431_1_1533345424_8183247_516-14858-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 13:19:50,887 WARNING [train.py:1204] (2/4) Exclude cut with ID _32704_210_8954_1_1533017488840_6747370_54-248218-0_sp1.1 from training. Duration: 0.936375 2023-10-04 13:19:52,323 WARNING [train.py:1204] (2/4) Exclude cut with ID _13253_210_1925_1_1530757775819_3577873_300-119782-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 13:19:53,674 WARNING [train.py:1204] (2/4) Exclude cut with ID _39290_210_4220_1_1533862778760_3075118_21-240703-0_sp1.1 from training. Duration: 0.936375 2023-10-04 13:19:55,289 WARNING [train.py:1204] (2/4) Exclude cut with ID 774-127930-0014-10412-0_sp1.1 from training. Duration: 0.95 2023-10-04 13:19:56,678 WARNING [train.py:1204] (2/4) Exclude cut with ID _67499_210_17282_1_1535509805220_3692430_20-179346-0 from training. Duration: 0.97 2023-10-04 13:19:56,841 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.0.nonlin_attention.balancer.prob, batch_count=1668200.0, ans=0.125 2023-10-04 13:19:58,056 WARNING [train.py:1204] (2/4) Exclude cut with ID _8365_210_2064_1_1528592311070_4677690_431-294102-0_sp0.9 from training. Duration: 0.988875 2023-10-04 13:20:02,654 WARNING [train.py:1204] (2/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_365-1492-0_sp1.1 from training. Duration: 0.82725 2023-10-04 13:20:02,687 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_105-114797-0_sp0.9 from training. Duration: 0.9333125 2023-10-04 13:20:05,379 WARNING [train.py:1204] (2/4) Exclude cut with ID _6274_210_2084_1_1527937583837_7494020_769-168302-0_sp0.9 from training. Duration: 0.788875 2023-10-04 13:20:06,853 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_40340_210_9024_1_1533433916_4132944_320-270277-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 13:20:08,206 WARNING [train.py:1204] (2/4) Exclude cut with ID ID0203W0003-781711 from training. Duration: 0.958 2023-10-04 13:20:08,282 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_30434_210_13049_1_1532519384_981746_52-12932-0_sp1.1 from training. Duration: 0.936375 2023-10-04 13:20:09,653 WARNING [train.py:1204] (2/4) Exclude cut with ID _8263_210_1664_1_1528438953494_294100_40-204731-0_sp0.9 from training. Duration: 0.5555625 2023-10-04 13:20:12,973 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1032-236863-0_sp0.9 from training. Duration: 0.9333125 2023-10-04 13:20:14,310 WARNING [train.py:1204] (2/4) Exclude cut with ID _8394_210_6702_1_1528590504316_3540189_335-274780-0_sp0.9 from training. Duration: 0.8666875 2023-10-04 13:20:14,331 WARNING [train.py:1204] (2/4) Exclude cut with ID _66873_210_5517_1_1535454136506_4293370_654-52713-0_sp1.1 from training. Duration: 0.7 2023-10-04 13:20:16,216 WARNING [train.py:1204] (2/4) Exclude cut with ID _5250_210_5219_1_1527249561806_8692110_742-267856-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 13:20:17,560 WARNING [train.py:1204] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_74-48717-0 from training. Duration: 0.58 2023-10-04 13:20:20,585 WARNING [train.py:1204] (2/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_18-46756-0 from training. Duration: 0.79 2023-10-04 13:20:20,649 WARNING [train.py:1204] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_612-56061-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 13:20:20,657 WARNING [train.py:1204] (2/4) Exclude cut with ID _56423_210_5854_1_1534896061049_3165969_412-2403-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 13:20:21,990 WARNING [train.py:1204] (2/4) Exclude cut with ID _62574_210_2655_1_1535180407058_3933249_120-196848-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 13:20:21,990 WARNING [train.py:1204] (2/4) Exclude cut with ID _40519_210_13794_1_1533715228655_7337390_916-51000-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 13:20:23,435 WARNING [train.py:1204] (2/4) Exclude cut with ID _65263_210_22356_1_1535763598691_3921037_108-21902-0_sp1.1 from training. Duration: 0.9 2023-10-04 13:20:24,813 WARNING [train.py:1204] (2/4) Exclude cut with ID _4122_210_2084_1_1526713164417_4353280_528-206771-0_sp1.1 from training. Duration: 0.8 2023-10-04 13:20:26,258 WARNING [train.py:1204] (2/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_43-325657-0_sp1.1 from training. Duration: 0.92725 2023-10-04 13:20:27,646 WARNING [train.py:1204] (2/4) Exclude cut with ID _44659_210_2721_1_1534036120061_6614860_314-82041-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 13:20:28,984 WARNING [train.py:1204] (2/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_81-300212-0_sp0.9 from training. Duration: 0.6666875 2023-10-04 13:20:29,065 WARNING [train.py:1204] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_665-196155-0_sp0.9 from training. Duration: 0.7444375 2023-10-04 13:20:30,489 WARNING [train.py:1204] (2/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_140-336881-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 13:20:32,326 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6290_210_3273_1_1527595224_1282787_115-87095-0_sp0.9 from training. Duration: 0.611125 2023-10-04 13:20:32,380 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_53373_210_5399_1_1534229471_4715695_607-259685-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 13:20:33,776 WARNING [train.py:1204] (2/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_175-163894-0_sp1.1 from training. Duration: 0.67275 2023-10-04 13:20:33,797 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7002_210_1794_1_1527939683_5157286_1-281264-0_sp1.1 from training. Duration: 0.4 2023-10-04 13:20:40,527 WARNING [train.py:1204] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_605-257205-0 from training. Duration: 0.59 2023-10-04 13:20:43,294 INFO [train.py:1046] (2/4) Epoch 48, batch 600, loss[loss=0.1599, simple_loss=0.2416, pruned_loss=0.03916, over 23713.00 frames. ], tot_loss[loss=0.155, simple_loss=0.2367, pruned_loss=0.03667, over 4493719.07 frames. ], batch size: 149, lr: 2.12e-03, grad_scale: 16.0 2023-10-04 13:20:43,384 WARNING [train.py:1204] (2/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_454-314901-0 from training. Duration: 0.46 2023-10-04 13:20:45,194 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_46032_210_14656_1_1533781412_4507969_287-130422-0_sp1.1 from training. Duration: 0.97275 2023-10-04 13:20:45,215 WARNING [train.py:1204] (2/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_186-309161-0_sp1.1 from training. Duration: 0.4909375 2023-10-04 13:20:45,253 WARNING [train.py:1204] (2/4) Exclude cut with ID _42659_210_2070_1_1533607442765_3120380_45-342307-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 13:20:46,919 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.584e+02 2.073e+02 2.337e+02 2.691e+02 3.660e+02, threshold=4.674e+02, percent-clipped=0.0 2023-10-04 13:20:49,081 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.conv_module1.balancer2.prob, batch_count=1668466.6666666667, ans=0.125 2023-10-04 13:20:53,017 WARNING [train.py:1204] (2/4) Exclude cut with ID _12555_210_8750_1_1530075594402_3780239_478-326818-0_sp1.1 from training. Duration: 0.963625 2023-10-04 13:20:55,786 WARNING [train.py:1204] (2/4) Exclude cut with ID _7606_210_4915_1_1528894253277_5080690_127-177761-0_sp1.1 from training. Duration: 0.5818125 2023-10-04 13:20:57,259 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6788_210_1388_1_1527898698_4562021_91-197981-0 from training. Duration: 0.83 2023-10-04 13:20:59,868 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8558_210_1410_1_1528505225_7613406_131-329524-0_sp0.9 from training. Duration: 0.87775 2023-10-04 13:21:01,268 WARNING [train.py:1204] (2/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_153-153537-0_sp1.1 from training. Duration: 0.82725 2023-10-04 13:21:03,216 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_17292_210_6753_1_1531125388_2943734_285-349105-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 13:21:04,625 WARNING [train.py:1204] (2/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_37-41919-0 from training. Duration: 0.87 2023-10-04 13:21:04,653 WARNING [train.py:1204] (2/4) Exclude cut with ID _60860_210_8254_1_1534896015875_3557169_172-294852-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 13:21:10,164 WARNING [train.py:1204] (2/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_146-276944-0 from training. Duration: 0.83 2023-10-04 13:21:12,918 WARNING [train.py:1204] (2/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_1254-44082-0_sp1.1 from training. Duration: 0.936375 2023-10-04 13:21:12,930 WARNING [train.py:1204] (2/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_1115-180350-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 13:21:14,247 WARNING [train.py:1204] (2/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_60-146189-0_sp1.1 from training. Duration: 0.8909375 2023-10-04 13:21:18,409 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=1668600.0, ans=0.1 2023-10-04 13:21:19,548 WARNING [train.py:1204] (2/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_517-153649-0_sp1.1 from training. Duration: 0.92725 2023-10-04 13:21:19,560 WARNING [train.py:1204] (2/4) Exclude cut with ID _39571_210_13449_1_1533801584143_3651077_37-266257-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 13:21:20,922 WARNING [train.py:1204] (2/4) Exclude cut with ID _6653_210_2629_1_1527919172441_6732679_527-276118-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 13:21:25,205 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.conv_module2.balancer2.prob, batch_count=1668600.0, ans=0.125 2023-10-04 13:21:29,188 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7696_210_2040_1_1528207167_3716099_117-125434-0_sp1.1 from training. Duration: 0.6909375 2023-10-04 13:21:32,082 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_25767_210_2161_1_1532307483_3867748_478-156360-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 13:21:32,086 WARNING [train.py:1204] (2/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_177-49092-0_sp1.1 from training. Duration: 0.82725 2023-10-04 13:21:32,093 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_11305_210_1794_1_1529750428_7178148_680-317073-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 13:21:38,083 WARNING [train.py:1204] (2/4) Exclude cut with ID _53565_210_10635_1_1534337218335_3820259_287-18120-0 from training. Duration: 0.97 2023-10-04 13:21:43,550 WARNING [train.py:1204] (2/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_498-40153-0_sp1.1 from training. Duration: 0.57275 2023-10-04 13:21:43,567 WARNING [train.py:1204] (2/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_555-63066-0_sp1.1 from training. Duration: 0.963625 2023-10-04 13:21:45,721 WARNING [train.py:1204] (2/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_395-310220-0 from training. Duration: 0.89 2023-10-04 13:21:47,021 WARNING [train.py:1204] (2/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_937-208281-0_sp1.1 from training. Duration: 0.77275 2023-10-04 13:21:48,486 WARNING [train.py:1204] (2/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_120-211630-0 from training. Duration: 0.92 2023-10-04 13:21:49,846 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_454-276429-0_sp1.1 from training. Duration: 0.7909375 2023-10-04 13:21:49,898 WARNING [train.py:1204] (2/4) Exclude cut with ID _22826_210_9994_1_1531908201522_3279169_404-75889-0_sp1.1 from training. Duration: 0.8090625 2023-10-04 13:21:54,167 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.conv_module2.balancer1.min_positive, batch_count=1668733.3333333333, ans=0.025 2023-10-04 13:21:56,566 INFO [train.py:1046] (2/4) Epoch 48, batch 650, loss[loss=0.1595, simple_loss=0.2426, pruned_loss=0.03819, over 24634.00 frames. ], tot_loss[loss=0.1545, simple_loss=0.2357, pruned_loss=0.03666, over 4535053.52 frames. ], batch size: 68, lr: 2.12e-03, grad_scale: 16.0 2023-10-04 13:21:56,658 WARNING [train.py:1204] (2/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_282-136641-0_sp0.9 from training. Duration: 0.4666875 2023-10-04 13:21:58,013 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_809-17156-0_sp1.1 from training. Duration: 0.663625 2023-10-04 13:21:59,461 WARNING [train.py:1204] (2/4) Exclude cut with ID _9235_210_5219_1_1528891154253_8046120_1003-101638-0_sp0.9 from training. Duration: 0.811125 2023-10-04 13:22:00,847 WARNING [train.py:1204] (2/4) Exclude cut with ID _22826_210_9994_1_1531908201522_3279169_404-75889-0_sp0.9 from training. Duration: 0.988875 2023-10-04 13:22:02,838 WARNING [train.py:1204] (2/4) Exclude cut with ID _59288_210_5854_1_1535077757774_3412246_570-295255-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 13:22:03,650 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.2.encoder.layers.0.whiten, num_groups=1, num_channels=384, metric=5.53 vs. limit=12.0 2023-10-04 13:22:06,904 WARNING [train.py:1204] (2/4) Exclude cut with ID _3465_210_3525_1_1526175667060_3574049_412-287526-0 from training. Duration: 0.72 2023-10-04 13:22:06,967 WARNING [train.py:1204] (2/4) Exclude cut with ID _44010_210_4525_1_1533884262964_4013620_649-79655-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 13:22:11,698 WARNING [train.py:1204] (2/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_57-270037-0_sp0.9 from training. Duration: 0.8555625 2023-10-04 13:22:11,699 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_34815_210_6388_1_1533381320_3571903_302-255963-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 13:22:16,276 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_47449_210_5399_1_1534076445_4006929_573-301852-0_sp1.1 from training. Duration: 0.97275 2023-10-04 13:22:19,613 WARNING [train.py:1204] (2/4) Exclude cut with ID 6709-74022-0004-86860-0_sp1.1 from training. Duration: 0.9409375 2023-10-04 13:22:20,998 WARNING [train.py:1204] (2/4) Exclude cut with ID _40519_210_13794_1_1533715228655_7337390_111-71991-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 13:22:21,043 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_30167_210_3528_1_1532428012_4772375_169-214186-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 13:22:22,589 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.1.feed_forward1.hidden_balancer.prob, batch_count=1668866.6666666667, ans=0.125 2023-10-04 13:22:23,853 WARNING [train.py:1204] (2/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_20-235313-0_sp1.1 from training. Duration: 0.963625 2023-10-04 13:22:23,908 WARNING [train.py:1204] (2/4) Exclude cut with ID _4681_210_3525_1_1527251040321_1235713_123-24428-0_sp0.9 from training. Duration: 0.5333125 2023-10-04 13:22:26,796 WARNING [train.py:1204] (2/4) Exclude cut with ID _31602_210_5854_1_1532677021456_4458250_444-198752-0_sp1.1 from training. Duration: 0.97275 2023-10-04 13:22:28,098 WARNING [train.py:1204] (2/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_9-279442-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 13:22:28,158 WARNING [train.py:1204] (2/4) Exclude cut with ID _6275_210_2084_1_1528110062213_7669049_973-198085-0_sp0.9 from training. Duration: 0.8333125 2023-10-04 13:22:29,528 WARNING [train.py:1204] (2/4) Exclude cut with ID _29853_210_7497_1_1532505600851_6642110_567-250221-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 13:22:30,944 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6545_210_2040_1_1527774670_4245258_407-48070-0_sp1.1 from training. Duration: 0.6909375 2023-10-04 13:22:34,245 WARNING [train.py:1204] (2/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_224-343295-0_sp0.9 from training. Duration: 0.611125 2023-10-04 13:22:34,269 WARNING [train.py:1204] (2/4) Exclude cut with ID IC0125W0011-61817 from training. Duration: 0.88 2023-10-04 13:22:34,285 WARNING [train.py:1204] (2/4) Exclude cut with ID _60113_210_16271_1_1535164212481_4386079_435-154371-0_sp1.1 from training. Duration: 0.97275 2023-10-04 13:22:34,306 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_25836_210_16554_1_1532138167_53197_3-9394-0_sp1.1 from training. Duration: 0.92725 2023-10-04 13:22:38,557 WARNING [train.py:1204] (2/4) Exclude cut with ID _29343_210_4381_1_1532602757752_3674109_57-55125-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 13:22:38,658 WARNING [train.py:1204] (2/4) Exclude cut with ID _56235_210_10640_1_1534557335235_4161099_267-23560-0_sp1.1 from training. Duration: 0.936375 2023-10-04 13:22:39,931 WARNING [train.py:1204] (2/4) Exclude cut with ID _30196_210_1664_1_1533708496522_7074140_1150-278867-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 13:22:39,990 WARNING [train.py:1204] (2/4) Exclude cut with ID _6270_210_2084_1_1527850911573_3856420_26-277343-0_sp0.9 from training. Duration: 0.911125 2023-10-04 13:22:41,297 WARNING [train.py:1204] (2/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_325-62802-0 from training. Duration: 0.87 2023-10-04 13:22:43,094 WARNING [train.py:1204] (2/4) Exclude cut with ID _56524_210_9642_1_1534572011779_3619170_145-310842-0_sp1.1 from training. Duration: 0.9 2023-10-04 13:22:43,124 WARNING [train.py:1204] (2/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_860-96198-0_sp0.9 from training. Duration: 0.811125 2023-10-04 13:22:43,216 WARNING [train.py:1204] (2/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_17-25045-0_sp1.1 from training. Duration: 0.536375 2023-10-04 13:22:43,241 WARNING [train.py:1204] (2/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_349-69727-0_sp1.1 from training. Duration: 0.936375 2023-10-04 13:22:45,159 WARNING [train.py:1204] (2/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_580-93798-0_sp1.1 from training. Duration: 0.5090625 2023-10-04 13:22:46,497 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7762_210_1794_1_1528199508_4057408_419-125009-0 from training. Duration: 0.83 2023-10-04 13:22:49,632 WARNING [train.py:1204] (2/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_356-167816-0 from training. Duration: 0.72 2023-10-04 13:22:49,662 WARNING [train.py:1204] (2/4) Exclude cut with ID _29345_210_4381_1_1532689168263_3860169_225-97100-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 13:22:49,671 WARNING [train.py:1204] (2/4) Exclude cut with ID _14227_210_4381_1_1530701804286_3940259_374-194712-0_sp1.1 from training. Duration: 0.92725 2023-10-04 13:22:50,978 WARNING [train.py:1204] (2/4) Exclude cut with ID _40137_210_12210_1_1533290484046_3795180_249-187059-0_sp0.9 from training. Duration: 0.97775 2023-10-04 13:22:50,996 WARNING [train.py:1204] (2/4) Exclude cut with ID _22826_210_9994_1_1531908201522_3279169_389-317194-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 13:22:52,358 WARNING [train.py:1204] (2/4) Exclude cut with ID _27485_210_4916_1_1532739191677_3668269_239-122918-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 13:22:56,729 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_2245_210_2715_1_1525511548_3540254_128-117609-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 13:22:56,766 WARNING [train.py:1204] (2/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_160-276178-0_sp1.1 from training. Duration: 0.963625 2023-10-04 13:22:57,619 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.1.encoder.layers.0.self_attn_weights.whiten_keys, num_groups=4, num_channels=128, metric=3.37 vs. limit=6.0 2023-10-04 13:22:58,109 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_31920_210_10633_1_1532860009_1686094_1-279735-0_sp1.1 from training. Duration: 0.97275 2023-10-04 13:23:00,926 WARNING [train.py:1204] (2/4) Exclude cut with ID _51078_210_9070_1_1534161557424_3805250_325-262383-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 13:23:00,955 WARNING [train.py:1204] (2/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_643-52007-0_sp0.9 from training. Duration: 0.6333125 2023-10-04 13:23:00,995 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_51937_210_5014_1_1534573610_4022025_518-299248-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 13:23:09,467 WARNING [train.py:1204] (2/4) Exclude cut with ID _13252_210_1925_1_1530671372047_3336909_166-314607-0_sp1.1 from training. Duration: 0.4909375 2023-10-04 13:23:09,479 WARNING [train.py:1204] (2/4) Exclude cut with ID _44778_210_2054_1_1533798127203_7443090_1087-282939-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 13:23:09,510 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_23850_210_4238_1_1532232597_1632433_32-185135-0_sp1.1 from training. Duration: 0.7818125 2023-10-04 13:23:10,815 INFO [train.py:1046] (2/4) Epoch 48, batch 700, loss[loss=0.1554, simple_loss=0.2441, pruned_loss=0.03332, over 24645.00 frames. ], tot_loss[loss=0.1533, simple_loss=0.2338, pruned_loss=0.03637, over 4575830.95 frames. ], batch size: 68, lr: 2.12e-03, grad_scale: 8.0 2023-10-04 13:23:10,859 WARNING [train.py:1204] (2/4) Exclude cut with ID _59500_210_8254_1_1534809610680_3736009_145-89359-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 13:23:16,003 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.592e+02 2.014e+02 2.298e+02 2.689e+02 4.568e+02, threshold=4.597e+02, percent-clipped=0.0 2023-10-04 13:23:16,101 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_340-336408-0 from training. Duration: 0.93 2023-10-04 13:23:16,161 WARNING [train.py:1204] (2/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_9-21737-0 from training. Duration: 0.97 2023-10-04 13:23:17,665 WARNING [train.py:1204] (2/4) Exclude cut with ID _2220_210_1819_1_1525509109898_3658680_50-99837-0 from training. Duration: 0.54 2023-10-04 13:23:19,586 WARNING [train.py:1204] (2/4) Exclude cut with ID _30304_210_6319_1_1532476827261_7582060_879-299350-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 13:23:21,119 WARNING [train.py:1204] (2/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_905-160074-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 13:23:21,361 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.out_combiner.scale_min, batch_count=1669133.3333333333, ans=0.2 2023-10-04 13:23:22,525 WARNING [train.py:1204] (2/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_395-73570-0 from training. Duration: 0.99 2023-10-04 13:23:27,861 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7264_210_4938_1_1527939911_8132095_800-91995-0_sp1.1 from training. Duration: 0.7818125 2023-10-04 13:23:29,304 WARNING [train.py:1204] (2/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_77-311404-0_sp1.1 from training. Duration: 0.8545625 2023-10-04 13:23:32,045 WARNING [train.py:1204] (2/4) Exclude cut with ID _13679_210_7497_1_1530534293662_3355098_107-80772-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 13:23:32,153 WARNING [train.py:1204] (2/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_224-343295-0_sp1.1 from training. Duration: 0.5 2023-10-04 13:23:32,190 WARNING [train.py:1204] (2/4) Exclude cut with ID _54871_210_22356_1_1534665798253_3856359_156-253482-0_sp1.1 from training. Duration: 0.963625 2023-10-04 13:23:34,957 WARNING [train.py:1204] (2/4) Exclude cut with ID _49728_210_12651_1_1534041034294_6788150_309-125532-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 13:23:36,600 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.conv_module1.balancer1.prob, batch_count=1669200.0, ans=0.125 2023-10-04 13:23:39,597 WARNING [train.py:1204] (2/4) Exclude cut with ID _13913_210_9994_1_1530579615187_3383897_473-277682-0_sp0.9 from training. Duration: 0.4333125 2023-10-04 13:23:39,602 WARNING [train.py:1204] (2/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_3-276701-0_sp1.1 from training. Duration: 0.82725 2023-10-04 13:23:40,928 WARNING [train.py:1204] (2/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_282-136641-0 from training. Duration: 0.42 2023-10-04 13:23:44,247 WARNING [train.py:1204] (2/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_127-566-0 from training. Duration: 0.84 2023-10-04 13:23:47,682 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6163_210_6400_1_1527506746_4263068_283-137811-0_sp1.1 from training. Duration: 0.7 2023-10-04 13:23:47,719 WARNING [train.py:1204] (2/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_34-183685-0_sp1.1 from training. Duration: 0.8909375 2023-10-04 13:23:49,763 WARNING [train.py:1204] (2/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_154-135696-0_sp0.9 from training. Duration: 0.888875 2023-10-04 13:23:54,038 WARNING [train.py:1204] (2/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_676-51014-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 13:23:54,097 WARNING [train.py:1204] (2/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_38-169920-0 from training. Duration: 0.91 2023-10-04 13:23:58,524 WARNING [train.py:1204] (2/4) Exclude cut with ID _6653_210_2629_1_1527919172441_6732679_747-114465-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 13:23:58,579 WARNING [train.py:1204] (2/4) Exclude cut with ID _8021_210_7994_1_1528376451358_3653069_100-140231-0_sp1.1 from training. Duration: 0.6090625 2023-10-04 13:23:59,896 WARNING [train.py:1204] (2/4) Exclude cut with ID _64135_210_2253_1_1535677267876_7608972_133-188266-0 from training. Duration: 0.81 2023-10-04 13:24:02,750 WARNING [train.py:1204] (2/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_714-310538-0_sp1.1 from training. Duration: 0.7818125 2023-10-04 13:24:04,117 WARNING [train.py:1204] (2/4) Exclude cut with ID _39867_210_9826_1_1533553222030_9344259_1134-288498-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 13:24:08,552 WARNING [train.py:1204] (2/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_211-237289-0_sp1.1 from training. Duration: 0.936375 2023-10-04 13:24:10,260 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.nonlin_attention.balancer.min_positive, batch_count=1669400.0, ans=0.05 2023-10-04 13:24:11,479 WARNING [train.py:1204] (2/4) Exclude cut with ID _59202_210_26984_1_1534816810612_3549174_137-318710-0_sp0.9 from training. Duration: 0.988875 2023-10-04 13:24:11,504 WARNING [train.py:1204] (2/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_86-66192-0 from training. Duration: 0.55 2023-10-04 13:24:14,898 WARNING [train.py:1204] (2/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_540-275685-0 from training. Duration: 0.89 2023-10-04 13:24:14,936 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8776_210_2040_1_1528638837_4088036_244-278763-0 from training. Duration: 0.9 2023-10-04 13:24:19,143 WARNING [train.py:1204] (2/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_9-219972-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 13:24:21,143 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.3.encoder.layers.2.feed_forward2.out_whiten, num_groups=1, num_channels=512, metric=12.07 vs. limit=15.0 2023-10-04 13:24:21,777 WARNING [train.py:1204] (2/4) Exclude cut with ID _44659_210_2721_1_1534036120061_6614860_656-283823-0_sp1.1 from training. Duration: 0.92725 2023-10-04 13:24:21,858 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_69630_210_7234_1_1535789601_661049_77-216385-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 13:24:24,470 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_19952_210_2161_1_1531875469_3851750_210-308919-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 13:24:24,475 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_4737_210_2040_1_1526998689_3293008_378-178650-0 from training. Duration: 0.95 2023-10-04 13:24:25,872 INFO [train.py:1046] (2/4) Epoch 48, batch 750, loss[loss=0.1465, simple_loss=0.231, pruned_loss=0.03103, over 23274.00 frames. ], tot_loss[loss=0.1529, simple_loss=0.2335, pruned_loss=0.03617, over 4602513.77 frames. ], batch size: 119, lr: 2.12e-03, grad_scale: 8.0 2023-10-04 13:24:27,409 WARNING [train.py:1204] (2/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_224-343295-0 from training. Duration: 0.55 2023-10-04 13:24:27,441 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7002_210_1794_1_1527939683_5157286_184-229630-0 from training. Duration: 0.74 2023-10-04 13:24:27,469 WARNING [train.py:1204] (2/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_6-214815-0 from training. Duration: 0.85 2023-10-04 13:24:28,840 WARNING [train.py:1204] (2/4) Exclude cut with ID _8699_210_2069_1_1528630346831_3960409_160-24408-0 from training. Duration: 0.91 2023-10-04 13:24:28,876 WARNING [train.py:1204] (2/4) Exclude cut with ID _5585_210_2667_1_1527418813324_8007340_89-332072-0 from training. Duration: 0.64 2023-10-04 13:24:30,206 WARNING [train.py:1204] (2/4) Exclude cut with ID _5250_210_5219_1_1527249561806_8692110_528-259800-0_sp1.1 from training. Duration: 0.963625 2023-10-04 13:24:30,295 WARNING [train.py:1204] (2/4) Exclude cut with ID _55383_210_10635_1_1534468426172_2231669_134-128246-0 from training. Duration: 0.89 2023-10-04 13:24:31,812 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.balancer2.prob, batch_count=1669466.6666666667, ans=0.125 2023-10-04 13:24:32,881 WARNING [train.py:1204] (2/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_400-338274-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 13:24:32,943 WARNING [train.py:1204] (2/4) Exclude cut with ID _14224_210_4381_1_1530529062037_4264010_364-22817-0_sp0.9 from training. Duration: 0.788875 2023-10-04 13:24:34,348 WARNING [train.py:1204] (2/4) Exclude cut with ID _42863_210_9070_1_1533902398788_3696920_331-291057-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 13:24:35,721 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_5436_210_1794_1_1527331317_2541494_229-211016-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 13:24:35,765 WARNING [train.py:1204] (2/4) Exclude cut with ID _6456_210_1534_1_1527681634212_3181049_345-252877-0_sp0.9 from training. Duration: 0.711125 2023-10-04 13:24:37,468 WARNING [train.py:1204] (2/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_96-81499-0_sp1.1 from training. Duration: 0.92725 2023-10-04 13:24:39,649 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.1.encoder.layers.1.whiten, num_groups=1, num_channels=256, metric=3.99 vs. limit=12.0 2023-10-04 13:24:40,194 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_5575_210_1410_1_1527301305_2570170_342-265572-0_sp1.1 from training. Duration: 0.8545625 2023-10-04 13:24:40,256 WARNING [train.py:1204] (2/4) Exclude cut with ID _5444_210_2151_1_1527418808464_5312210_726-27165-0_sp0.9 from training. Duration: 0.7444375 2023-10-04 13:24:41,649 WARNING [train.py:1204] (2/4) Exclude cut with ID _22083_210_8254_1_1531702862439_3678970_304-327750-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 13:24:41,882 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.balancer2.prob, batch_count=1669533.3333333333, ans=0.125 2023-10-04 13:24:41,950 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.conv_module1.balancer1.prob, batch_count=1669533.3333333333, ans=0.125 2023-10-04 13:24:43,158 WARNING [train.py:1204] (2/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_245-226199-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 13:24:45,042 WARNING [train.py:1204] (2/4) Exclude cut with ID _42875_210_14856_1_1533704397487_4012433_102-199499-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 13:24:46,411 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_51225_210_3601_1_1534400634_2327383_55-301844-0 from training. Duration: 0.93 2023-10-04 13:24:47,758 WARNING [train.py:1204] (2/4) Exclude cut with ID _28317_210_12742_1_1532480302381_8510325_916-195848-0_sp1.1 from training. Duration: 0.836375 2023-10-04 13:24:49,561 WARNING [train.py:1204] (2/4) Exclude cut with ID _44718_210_5816_1_1534050051869_6469030_497-311999-0_sp1.1 from training. Duration: 0.936375 2023-10-04 13:24:52,235 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_34886_210_10633_1_1533203882_1755661_91-194765-0_sp1.1 from training. Duration: 0.936375 2023-10-04 13:24:52,351 WARNING [train.py:1204] (2/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_310-26186-0_sp0.9 from training. Duration: 0.77775 2023-10-04 13:24:53,720 WARNING [train.py:1204] (2/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_416-257730-0 from training. Duration: 0.65 2023-10-04 13:24:53,732 WARNING [train.py:1204] (2/4) Exclude cut with ID _9389_210_1664_1_1529217337301_5289269_209-154760-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 13:24:56,449 WARNING [train.py:1204] (2/4) Exclude cut with ID _4092_210_2151_1_1526643044449_3135759_227-213972-0 from training. Duration: 0.54 2023-10-04 13:24:56,485 WARNING [train.py:1204] (2/4) Exclude cut with ID IC0679W0044-335852 from training. Duration: 0.768 2023-10-04 13:24:57,772 WARNING [train.py:1204] (2/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_42-186162-0 from training. Duration: 0.92 2023-10-04 13:24:57,788 WARNING [train.py:1204] (2/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_302-2550-0_sp0.9 from training. Duration: 0.9 2023-10-04 13:24:59,048 WARNING [train.py:1204] (2/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_356-31216-0_sp0.9 from training. Duration: 0.8333125 2023-10-04 13:25:00,536 WARNING [train.py:1204] (2/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_125-322172-0_sp1.1 from training. Duration: 0.8454375 2023-10-04 13:25:07,418 WARNING [train.py:1204] (2/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_141-89727-0_sp0.9 from training. Duration: 0.788875 2023-10-04 13:25:07,441 WARNING [train.py:1204] (2/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_647-337784-0_sp1.1 from training. Duration: 0.97275 2023-10-04 13:25:07,451 WARNING [train.py:1204] (2/4) Exclude cut with ID _6493_210_1747_1_1528459210424_4012470_513-140967-0_sp1.1 from training. Duration: 0.7181875 2023-10-04 13:25:10,710 WARNING [train.py:1204] (2/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_87-266334-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 13:25:12,167 WARNING [train.py:1204] (2/4) Exclude cut with ID _26528_210_9070_1_1532570410842_3851310_526-261338-0_sp1.1 from training. Duration: 0.92725 2023-10-04 13:25:12,204 WARNING [train.py:1204] (2/4) Exclude cut with ID _14224_210_4381_1_1530529062037_4264010_380-343670-0 from training. Duration: 0.88 2023-10-04 13:25:12,269 WARNING [train.py:1204] (2/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_449-15508-0_sp1.1 from training. Duration: 0.7454375 2023-10-04 13:25:13,642 WARNING [train.py:1204] (2/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_372-6869-0_sp0.9 from training. Duration: 0.488875 2023-10-04 13:25:15,088 WARNING [train.py:1204] (2/4) Exclude cut with ID _26907_210_7234_1_1532221740694_3205263_337-2494-0_sp0.9 from training. Duration: 0.97775 2023-10-04 13:25:17,716 WARNING [train.py:1204] (2/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_10-100367-0_sp1.1 from training. Duration: 0.8909375 2023-10-04 13:25:17,756 WARNING [train.py:1204] (2/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_47-167373-0 from training. Duration: 0.92 2023-10-04 13:25:19,078 WARNING [train.py:1204] (2/4) Exclude cut with ID _28323_210_16187_1_1532480395667_4100300_628-282179-0_sp1.1 from training. Duration: 0.97275 2023-10-04 13:25:24,951 WARNING [train.py:1204] (2/4) Exclude cut with ID _20455_210_5219_1_1531565996775_8098100_213-280898-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 13:25:26,379 WARNING [train.py:1204] (2/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_383-316000-0_sp1.1 from training. Duration: 0.8181875 2023-10-04 13:25:27,633 WARNING [train.py:1204] (2/4) Exclude cut with ID _18513_210_9206_1_1531618215223_3862620_16-78180-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 13:25:29,108 WARNING [train.py:1204] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_330-327861-0_sp1.1 from training. Duration: 0.6090625 2023-10-04 13:25:33,112 WARNING [train.py:1204] (2/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_356-31216-0 from training. Duration: 0.75 2023-10-04 13:25:33,138 WARNING [train.py:1204] (2/4) Exclude cut with ID _25248_210_8750_1_1532062825941_5554180_255-148539-0_sp1.1 from training. Duration: 0.963625 2023-10-04 13:25:34,436 WARNING [train.py:1204] (2/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_269-154726-0_sp1.1 from training. Duration: 0.7818125 2023-10-04 13:25:35,927 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7002_210_1794_1_1527939683_5157286_163-87552-0_sp1.1 from training. Duration: 0.7818125 2023-10-04 13:25:37,245 WARNING [train.py:1204] (2/4) Exclude cut with ID _54871_210_22356_1_1534665798253_3856359_97-151896-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 13:25:38,585 INFO [train.py:1046] (2/4) Epoch 48, batch 800, loss[loss=0.1624, simple_loss=0.2516, pruned_loss=0.03659, over 24576.00 frames. ], tot_loss[loss=0.1532, simple_loss=0.234, pruned_loss=0.03624, over 4634042.09 frames. ], batch size: 71, lr: 2.12e-03, grad_scale: 16.0 2023-10-04 13:25:38,675 WARNING [train.py:1204] (2/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_207-7026-0_sp1.1 from training. Duration: 0.97275 2023-10-04 13:25:38,694 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_92-46009-0_sp0.9 from training. Duration: 0.6 2023-10-04 13:25:43,439 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.667e+02 1.959e+02 2.276e+02 2.649e+02 3.901e+02, threshold=4.552e+02, percent-clipped=0.0 2023-10-04 13:25:44,969 WARNING [train.py:1204] (2/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_122-178847-0_sp1.1 from training. Duration: 0.97275 2023-10-04 13:25:44,973 WARNING [train.py:1204] (2/4) Exclude cut with ID _44778_210_2054_1_1533798127203_7443090_94-348956-0_sp1.1 from training. Duration: 0.92725 2023-10-04 13:25:48,083 WARNING [train.py:1204] (2/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_1018-244089-0_sp1.1 from training. Duration: 0.963625 2023-10-04 13:25:48,098 WARNING [train.py:1204] (2/4) Exclude cut with ID _56235_210_10640_1_1534557335235_4161099_87-118423-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 13:25:48,203 WARNING [train.py:1204] (2/4) Exclude cut with ID _53713_210_24116_1_1534314487762_7416751_504-212292-0_sp1.1 from training. Duration: 0.92725 2023-10-04 13:25:48,832 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_446-150327-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 13:25:49,011 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.feed_forward1.out_proj.dropout_p, batch_count=1669800.0, ans=0.1 2023-10-04 13:25:50,166 WARNING [train.py:1204] (2/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_682-245989-0_sp1.1 from training. Duration: 0.92725 2023-10-04 13:25:54,295 WARNING [train.py:1204] (2/4) Exclude cut with ID _35723_210_1794_1_1533088341043_628250_8-234524-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 13:25:55,578 WARNING [train.py:1204] (2/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_64-248359-0_sp0.9 from training. Duration: 0.7666875 2023-10-04 13:25:57,071 WARNING [train.py:1204] (2/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_49-97158-0 from training. Duration: 0.76 2023-10-04 13:25:58,428 WARNING [train.py:1204] (2/4) Exclude cut with ID _24314_210_4915_1_1532135078116_7170179_743-915-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 13:25:59,782 WARNING [train.py:1204] (2/4) Exclude cut with ID _61194_210_18628_1_1535007555628_3750909_353-323468-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 13:25:59,806 WARNING [train.py:1204] (2/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_387-131538-0_sp1.1 from training. Duration: 0.736375 2023-10-04 13:25:59,838 WARNING [train.py:1204] (2/4) Exclude cut with ID _44424_210_18163_1_1533623394848_1213110_79-187603-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 13:25:59,886 WARNING [train.py:1204] (2/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_86-19601-0 from training. Duration: 0.85 2023-10-04 13:26:01,171 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_50050_210_16223_1_1535435796_7290138_103-283529-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 13:26:01,202 WARNING [train.py:1204] (2/4) Exclude cut with ID _13913_210_9994_1_1530579615187_3383897_473-277682-0 from training. Duration: 0.39 2023-10-04 13:26:03,921 WARNING [train.py:1204] (2/4) Exclude cut with ID _42326_210_7770_1_1533814282531_3707269_368-68029-0_sp1.1 from training. Duration: 0.92725 2023-10-04 13:26:06,748 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_41428_210_2161_1_1533774184_3992532_446-339645-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 13:26:07,073 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.feed_forward3.hidden_balancer.prob, batch_count=1669933.3333333333, ans=0.125 2023-10-04 13:26:08,137 WARNING [train.py:1204] (2/4) Exclude cut with ID _69584_210_5219_1_1535785133775_4044450_325-7656-0_sp1.1 from training. Duration: 0.7818125 2023-10-04 13:26:08,144 WARNING [train.py:1204] (2/4) Exclude cut with ID _21632_210_2253_1_1531994001765_3744140_134-184387-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 13:26:11,062 WARNING [train.py:1204] (2/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_550-184707-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 13:26:11,086 WARNING [train.py:1204] (2/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_106-341780-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 13:26:14,586 WARNING [train.py:1204] (2/4) Exclude cut with ID _45871_210_5665_1_1533864614245_3296060_253-132552-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 13:26:15,847 WARNING [train.py:1204] (2/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_395-310220-0_sp1.1 from training. Duration: 0.8090625 2023-10-04 13:26:15,865 WARNING [train.py:1204] (2/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_795-33252-0_sp0.9 from training. Duration: 0.411125 2023-10-04 13:26:17,401 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9002W0003-495962 from training. Duration: 0.946 2023-10-04 13:26:17,430 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_43-71989-0 from training. Duration: 0.57 2023-10-04 13:26:17,442 WARNING [train.py:1204] (2/4) Exclude cut with ID _8699_210_2069_1_1528630346831_3960409_155-39766-0_sp1.1 from training. Duration: 0.7090625 2023-10-04 13:26:19,339 WARNING [train.py:1204] (2/4) Exclude cut with ID _27894_210_12392_1_1532415496078_6902439_788-232684-0_sp1.1 from training. Duration: 0.97275 2023-10-04 13:26:19,454 WARNING [train.py:1204] (2/4) Exclude cut with ID _56494_210_4220_1_1535284837625_3033169_10-39296-0_sp1.1 from training. Duration: 0.92725 2023-10-04 13:26:19,485 WARNING [train.py:1204] (2/4) Exclude cut with ID _40901_210_5665_1_1533363789958_3856130_341-20819-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 13:26:22,322 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.0.self_attn_weights.pos_emb_skip_rate, batch_count=1670000.0, ans=0.0 2023-10-04 13:26:24,806 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9029W0435-509822 from training. Duration: 0.896 2023-10-04 13:26:24,865 WARNING [train.py:1204] (2/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_51-117576-0 from training. Duration: 0.4 2023-10-04 13:26:25,544 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.2.encoder.layers.1.whiten, num_groups=1, num_channels=384, metric=4.68 vs. limit=12.0 2023-10-04 13:26:26,269 WARNING [train.py:1204] (2/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_197-28164-0_sp1.1 from training. Duration: 0.72725 2023-10-04 13:26:29,020 WARNING [train.py:1204] (2/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_145-137790-0_sp1.1 from training. Duration: 0.7545625 2023-10-04 13:26:33,013 WARNING [train.py:1204] (2/4) Exclude cut with ID _47790_210_16187_1_1533862814066_4207519_2-39653-0_sp1.1 from training. Duration: 0.8909375 2023-10-04 13:26:37,168 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_3780_210_2087_1_1526467760_2130483_186-208240-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 13:26:37,263 WARNING [train.py:1204] (2/4) Exclude cut with ID _24055_210_12651_1_1532310907143_976979_92-59356-0 from training. Duration: 0.91 2023-10-04 13:26:37,303 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_3910_210_1794_1_1526742306_334530_28-238909-0_sp1.1 from training. Duration: 0.736375 2023-10-04 13:26:40,148 WARNING [train.py:1204] (2/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_269-154726-0 from training. Duration: 0.86 2023-10-04 13:26:42,220 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.2.encoder.layers.2.feed_forward3.out_whiten, num_groups=1, num_channels=384, metric=9.72 vs. limit=15.0 2023-10-04 13:26:46,581 WARNING [train.py:1204] (2/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_326-165900-0_sp0.9 from training. Duration: 0.7666875 2023-10-04 13:26:48,682 WARNING [train.py:1204] (2/4) Exclude cut with ID _5294_210_1747_1_1527249643928_3696179_470-276077-0_sp1.1 from training. Duration: 0.963625 2023-10-04 13:26:49,874 WARNING [train.py:1204] (2/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_372-131163-0 from training. Duration: 0.94 2023-10-04 13:26:49,930 WARNING [train.py:1204] (2/4) Exclude cut with ID _45297_210_2721_1_1533715351381_3264949_351-147136-0_sp1.1 from training. Duration: 0.7909375 2023-10-04 13:26:51,794 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_1072-12671-0_sp1.1 from training. Duration: 0.92725 2023-10-04 13:26:53,135 INFO [train.py:1046] (2/4) Epoch 48, batch 850, loss[loss=0.1601, simple_loss=0.2358, pruned_loss=0.04218, over 23761.00 frames. ], tot_loss[loss=0.1539, simple_loss=0.2347, pruned_loss=0.03656, over 4651308.88 frames. ], batch size: 135, lr: 2.12e-03, grad_scale: 16.0 2023-10-04 13:26:53,189 WARNING [train.py:1204] (2/4) Exclude cut with ID _15133_210_6228_1_1530794512074_3080379_54-198193-0 from training. Duration: 0.94 2023-10-04 13:26:53,205 WARNING [train.py:1204] (2/4) Exclude cut with ID _30196_210_1664_1_1533708496522_7074140_1194-270988-0_sp1.1 from training. Duration: 0.97275 2023-10-04 13:26:53,310 WARNING [train.py:1204] (2/4) Exclude cut with ID _50310_210_15488_1_1534068043345_3641080_289-193637-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 13:26:55,955 WARNING [train.py:1204] (2/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_313-263495-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 13:26:57,433 WARNING [train.py:1204] (2/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_18-130336-0_sp1.1 from training. Duration: 0.7181875 2023-10-04 13:26:58,803 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_47896_210_20508_1_1534492518_5958868_646-287209-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 13:26:59,424 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.3.encoder.layers.3.feed_forward1.out_whiten, num_groups=1, num_channels=512, metric=4.67 vs. limit=15.0 2023-10-04 13:27:00,068 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_294-163793-0 from training. Duration: 0.82 2023-10-04 13:27:00,107 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_8-319957-0 from training. Duration: 0.96 2023-10-04 13:27:00,111 WARNING [train.py:1204] (2/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_1211-226066-0 from training. Duration: 0.76 2023-10-04 13:27:02,826 WARNING [train.py:1204] (2/4) Exclude cut with ID _4779_210_2084_1_1527318022944_3914893_500-285013-0_sp0.9 from training. Duration: 0.7666875 2023-10-04 13:27:02,852 WARNING [train.py:1204] (2/4) Exclude cut with ID _22826_210_9994_1_1531908201522_3279169_347-232268-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 13:27:04,277 WARNING [train.py:1204] (2/4) Exclude cut with ID _29877_210_6228_1_1532397791106_5276149_111-55056-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 13:27:05,557 WARNING [train.py:1204] (2/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_812-271646-0_sp1.1 from training. Duration: 0.92725 2023-10-04 13:27:05,594 WARNING [train.py:1204] (2/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_235-262124-0_sp1.1 from training. Duration: 0.6090625 2023-10-04 13:27:09,790 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.ff2_skip_rate, batch_count=1670200.0, ans=0.0 2023-10-04 13:27:11,003 WARNING [train.py:1204] (2/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_14-16530-0_sp1.1 from training. Duration: 0.97275 2023-10-04 13:27:11,029 WARNING [train.py:1204] (2/4) Exclude cut with ID _44718_210_5816_1_1534050051869_6469030_568-304512-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 13:27:11,055 WARNING [train.py:1204] (2/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_1033-274376-0 from training. Duration: 0.87 2023-10-04 13:27:13,953 WARNING [train.py:1204] (2/4) Exclude cut with ID _4681_210_3525_1_1527251040321_1235713_123-24428-0 from training. Duration: 0.48 2023-10-04 13:27:19,306 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_37818_210_4238_1_1533620910_4720083_251-288764-0_sp1.1 from training. Duration: 0.97275 2023-10-04 13:27:19,569 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.balancer2.prob, batch_count=1670200.0, ans=0.125 2023-10-04 13:27:21,204 WARNING [train.py:1204] (2/4) Exclude cut with ID _65585_210_12210_1_1535450451584_3764098_112-294301-0 from training. Duration: 0.88 2023-10-04 13:27:23,747 WARNING [train.py:1204] (2/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_329-113173-0 from training. Duration: 0.62 2023-10-04 13:27:23,850 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6767_210_3273_1_1527855783_1718932_173-256601-0 from training. Duration: 0.8 2023-10-04 13:27:27,432 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9266W0001-627677 from training. Duration: 0.896 2023-10-04 13:27:27,444 WARNING [train.py:1204] (2/4) Exclude cut with ID _49643_210_7989_1_1534640244224_3939031_611-185515-0_sp1.1 from training. Duration: 0.8545625 2023-10-04 13:27:27,448 WARNING [train.py:1204] (2/4) Exclude cut with ID _31649_210_6228_1_1532584608063_4091078_687-237771-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 13:27:27,458 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_12868_210_6753_1_1530702479_3610564_148-172861-0_sp0.9 from training. Duration: 0.5555625 2023-10-04 13:27:30,197 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_5129_210_6400_1_1527161005_3579054_107-69738-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 13:27:31,592 WARNING [train.py:1204] (2/4) Exclude cut with ID _40137_210_12210_1_1533290484046_3795180_36-301164-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 13:27:32,835 WARNING [train.py:1204] (2/4) Exclude cut with ID _7537_210_2084_1_1528502395431_3924359_113-199933-0 from training. Duration: 0.59 2023-10-04 13:27:35,520 WARNING [train.py:1204] (2/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_547-5424-0_sp1.1 from training. Duration: 0.8545625 2023-10-04 13:27:35,587 WARNING [train.py:1204] (2/4) Exclude cut with ID _43552_210_8913_1_1533726031059_3523429_147-284024-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 13:27:36,970 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_294-163793-0_sp1.1 from training. Duration: 0.7454375 2023-10-04 13:27:38,298 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_3780_210_2087_1_1526467760_2130483_170-259044-0_sp1.1 from training. Duration: 0.636375 2023-10-04 13:27:39,733 WARNING [train.py:1204] (2/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_726-270780-0_sp1.1 from training. Duration: 0.7818125 2023-10-04 13:27:41,074 WARNING [train.py:1204] (2/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_214-291369-0_sp1.1 from training. Duration: 0.57275 2023-10-04 13:27:41,115 WARNING [train.py:1204] (2/4) Exclude cut with ID _21881_210_3525_1_1531825242757_3798380_247-102591-0 from training. Duration: 0.98 2023-10-04 13:27:44,800 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.2.encoder.layers.0.conv_module1.whiten, num_groups=1, num_channels=384, metric=6.78 vs. limit=15.0 2023-10-04 13:27:45,352 WARNING [train.py:1204] (2/4) Exclude cut with ID _8064_210_5399_1_1528372299291_4282973_18-174647-0_sp1.1 from training. Duration: 0.82725 2023-10-04 13:27:45,355 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_415-52611-0_sp1.1 from training. Duration: 0.936375 2023-10-04 13:27:45,429 WARNING [train.py:1204] (2/4) Exclude cut with ID _8394_210_6702_1_1528590504316_3540189_379-49457-0_sp0.9 from training. Duration: 0.9666875 2023-10-04 13:27:45,442 WARNING [train.py:1204] (2/4) Exclude cut with ID _64521_210_14699_1_1535243427259_3815328_368-195106-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 13:27:46,833 WARNING [train.py:1204] (2/4) Exclude cut with ID _28394_210_8913_1_1532430049518_3650118_51-334124-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 13:27:50,158 WARNING [train.py:1204] (2/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_317-161769-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 13:27:53,464 WARNING [train.py:1204] (2/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_40-129756-0_sp0.9 from training. Duration: 0.9 2023-10-04 13:27:53,553 WARNING [train.py:1204] (2/4) Exclude cut with ID _3067_210_2667_1_1526212974640_4503168_119-175776-0_sp0.9 from training. Duration: 0.82225 2023-10-04 13:27:54,672 WARNING [train.py:1204] (2/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_1004-304915-0_sp1.1 from training. Duration: 0.92725 2023-10-04 13:27:55,913 WARNING [train.py:1204] (2/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_359-118926-0_sp0.9 from training. Duration: 0.8 2023-10-04 13:28:03,661 WARNING [train.py:1204] (2/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_51-200220-0_sp0.9 from training. Duration: 0.62225 2023-10-04 13:28:06,355 INFO [train.py:1046] (2/4) Epoch 48, batch 900, loss[loss=0.1612, simple_loss=0.236, pruned_loss=0.04319, over 23891.00 frames. ], tot_loss[loss=0.155, simple_loss=0.236, pruned_loss=0.037, over 4670838.48 frames. ], batch size: 195, lr: 2.12e-03, grad_scale: 16.0 2023-10-04 13:28:06,393 WARNING [train.py:1204] (2/4) Exclude cut with ID _46683_210_9642_1_1533730913492_4591116_530-326202-0_sp1.1 from training. Duration: 0.936375 2023-10-04 13:28:06,442 WARNING [train.py:1204] (2/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_567-135114-0 from training. Duration: 0.87 2023-10-04 13:28:06,476 WARNING [train.py:1204] (2/4) Exclude cut with ID _66863_210_18053_1_1535698758334_8590319_1092-271784-0_sp1.1 from training. Duration: 0.87275 2023-10-04 13:28:06,486 WARNING [train.py:1204] (2/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_762-282614-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 13:28:08,085 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.ff2_skip_rate, batch_count=1670466.6666666667, ans=0.0 2023-10-04 13:28:09,280 WARNING [train.py:1204] (2/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_280-120942-0 from training. Duration: 0.77 2023-10-04 13:28:10,517 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.664e+02 2.013e+02 2.239e+02 2.502e+02 3.512e+02, threshold=4.478e+02, percent-clipped=0.0 2023-10-04 13:28:12,277 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.conv_module1.balancer2.prob, batch_count=1670466.6666666667, ans=0.125 2023-10-04 13:28:14,877 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_31583_210_3852_1_1532595851_4024413_27-170931-0_sp1.1 from training. Duration: 0.963625 2023-10-04 13:28:16,430 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.balancer1.prob, batch_count=1670466.6666666667, ans=0.125 2023-10-04 13:28:17,568 WARNING [train.py:1204] (2/4) Exclude cut with ID _12369_210_4381_1_1530356078581_4388520_266-271628-0_sp1.1 from training. Duration: 0.92725 2023-10-04 13:28:18,853 WARNING [train.py:1204] (2/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_41-31157-0 from training. Duration: 0.57 2023-10-04 13:28:22,298 WARNING [train.py:1204] (2/4) Exclude cut with ID _21878_210_3525_1_1531738772002_7264330_113-18163-0_sp1.1 from training. Duration: 0.8090625 2023-10-04 13:28:22,333 WARNING [train.py:1204] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_147-111137-0 from training. Duration: 0.95 2023-10-04 13:28:22,406 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1083-220122-0_sp1.1 from training. Duration: 0.463625 2023-10-04 13:28:23,806 WARNING [train.py:1204] (2/4) Exclude cut with ID _14884_210_2252_1_1530707459689_5561250_591-173406-0_sp1.1 from training. Duration: 0.87275 2023-10-04 13:28:23,812 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_24677_210_1794_1_1532002693_4341296_149-303793-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 13:28:23,860 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_12868_210_6753_1_1530702479_3610564_133-564-0_sp1.1 from training. Duration: 0.7181875 2023-10-04 13:28:25,136 WARNING [train.py:1204] (2/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_625-338546-0_sp0.9 from training. Duration: 0.97775 2023-10-04 13:28:30,075 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.conv_module1.balancer1.min_positive, batch_count=1670533.3333333333, ans=0.025 2023-10-04 13:28:33,274 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.conv_skip_rate, batch_count=1670533.3333333333, ans=0.0 2023-10-04 13:28:36,126 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.attention_skip_rate, batch_count=1670600.0, ans=0.0 2023-10-04 13:28:37,349 WARNING [train.py:1204] (2/4) Exclude cut with ID _25882_210_3036_1_1532775665815_6479510_624-320169-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 13:28:37,356 WARNING [train.py:1204] (2/4) Exclude cut with ID _20622_210_3852_1_1531622427080_1093839_127-325326-0_sp1.1 from training. Duration: 0.92725 2023-10-04 13:28:37,378 WARNING [train.py:1204] (2/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_464-33931-0_sp1.1 from training. Duration: 0.4909375 2023-10-04 13:28:40,213 WARNING [train.py:1204] (2/4) Exclude cut with ID _66112_210_10635_1_1535590236305_3801484_120-304588-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 13:28:43,125 WARNING [train.py:1204] (2/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_110-130961-0 from training. Duration: 0.47 2023-10-04 13:28:45,067 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.4.encoder.layers.0.self_attn1.whiten, num_groups=1, num_channels=384, metric=14.14 vs. limit=22.5 2023-10-04 13:28:45,889 WARNING [train.py:1204] (2/4) Exclude cut with ID _16908_210_5724_1_1531119157248_3772700_64-54020-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 13:28:47,626 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.feed_forward1.hidden_balancer.prob, batch_count=1670600.0, ans=0.125 2023-10-04 13:28:48,647 WARNING [train.py:1204] (2/4) Exclude cut with ID _4068_210_2629_1_1526697350395_1546259_224-288693-0_sp1.1 from training. Duration: 0.536375 2023-10-04 13:28:48,700 WARNING [train.py:1204] (2/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_191-41023-0_sp0.9 from training. Duration: 0.788875 2023-10-04 13:28:48,762 WARNING [train.py:1204] (2/4) Exclude cut with ID ID0205W0255-783002 from training. Duration: 0.958 2023-10-04 13:28:48,967 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.nonlin_attention.balancer.min_positive, batch_count=1670666.6666666667, ans=0.05 2023-10-04 13:28:50,693 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_162-262717-0 from training. Duration: 0.83 2023-10-04 13:28:58,479 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_224-322394-0_sp1.1 from training. Duration: 0.6 2023-10-04 13:28:58,485 WARNING [train.py:1204] (2/4) Exclude cut with ID _4866_210_2577_1_1527404412284_4364659_133-1696-0_sp1.1 from training. Duration: 0.9 2023-10-04 13:29:00,397 WARNING [train.py:1204] (2/4) Exclude cut with ID _6275_210_2084_1_1528110062213_7669049_973-198085-0_sp1.1 from training. Duration: 0.6818125 2023-10-04 13:29:06,051 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_47449_210_5399_1_1534076445_4006929_410-237806-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 13:29:06,071 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7543_210_6753_1_1528543318_4552_1-85072-0_sp0.9 from training. Duration: 0.92225 2023-10-04 13:29:07,364 WARNING [train.py:1204] (2/4) Exclude cut with ID _65263_210_22356_1_1535763598691_3921037_108-21902-0 from training. Duration: 0.99 2023-10-04 13:29:07,370 WARNING [train.py:1204] (2/4) Exclude cut with ID _65585_210_12210_1_1535450451584_3764098_309-174818-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 13:29:10,223 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_309-65177-0 from training. Duration: 0.88 2023-10-04 13:29:11,579 WARNING [train.py:1204] (2/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_363-185703-0_sp1.1 from training. Duration: 0.863625 2023-10-04 13:29:11,612 WARNING [train.py:1204] (2/4) Exclude cut with ID _42326_210_7770_1_1533814282531_3707269_442-238692-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 13:29:13,019 WARNING [train.py:1204] (2/4) Exclude cut with ID _69884_210_9771_1_1535846392818_4166169_226-254446-0_sp1.1 from training. Duration: 0.936375 2023-10-04 13:29:13,036 WARNING [train.py:1204] (2/4) Exclude cut with ID _29853_210_7497_1_1532505600851_6642110_287-299903-0_sp1.1 from training. Duration: 0.963625 2023-10-04 13:29:17,151 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1015-343884-0 from training. Duration: 0.62 2023-10-04 13:29:17,188 WARNING [train.py:1204] (2/4) Exclude cut with ID ID0305W0008-833402 from training. Duration: 0.91 2023-10-04 13:29:18,597 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6673_210_3273_1_1527767790_3173240_351-167954-0_sp1.1 from training. Duration: 0.436375 2023-10-04 13:29:20,556 INFO [train.py:1046] (2/4) Epoch 48, batch 950, loss[loss=0.1475, simple_loss=0.2327, pruned_loss=0.03119, over 24692.00 frames. ], tot_loss[loss=0.1547, simple_loss=0.2358, pruned_loss=0.03682, over 4693551.76 frames. ], batch size: 65, lr: 2.12e-03, grad_scale: 16.0 2023-10-04 13:29:20,618 WARNING [train.py:1204] (2/4) Exclude cut with ID _6456_210_1534_1_1527681634212_3181049_345-252877-0 from training. Duration: 0.64 2023-10-04 13:29:21,994 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_13-205182-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 13:29:24,249 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.2.encoder.layers.1.feed_forward1.out_whiten, num_groups=1, num_channels=384, metric=6.25 vs. limit=15.0 2023-10-04 13:29:24,914 WARNING [train.py:1204] (2/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_427-208901-0 from training. Duration: 0.68 2023-10-04 13:29:25,698 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.bypass.scale_min, batch_count=1670800.0, ans=0.2 2023-10-04 13:29:28,925 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.0.layers.1.conv_module1.whiten, num_groups=1, num_channels=192, metric=9.05 vs. limit=15.0 2023-10-04 13:29:30,906 WARNING [train.py:1204] (2/4) Exclude cut with ID _49039_210_6315_1_1533967537541_3846118_9-148921-0_sp1.1 from training. Duration: 0.97275 2023-10-04 13:29:34,262 WARNING [train.py:1204] (2/4) Exclude cut with ID _59493_210_17282_1_1535078419077_3225879_143-70317-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 13:29:34,309 WARNING [train.py:1204] (2/4) Exclude cut with ID _27967_210_12140_1_1532588391859_8419130_1056-236369-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 13:29:35,613 WARNING [train.py:1204] (2/4) Exclude cut with ID _6537_210_1883_1_1527987559710_3597129_382-133062-0_sp0.9 from training. Duration: 0.7555625 2023-10-04 13:29:38,450 WARNING [train.py:1204] (2/4) Exclude cut with ID ID0376W0255-869957 from training. Duration: 0.9510625 2023-10-04 13:29:41,188 WARNING [train.py:1204] (2/4) Exclude cut with ID _49371_210_12071_1_1533952761021_3812190_483-240691-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 13:29:41,249 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_12868_210_6753_1_1530701932_530275_18-76322-0_sp1.1 from training. Duration: 0.7909375 2023-10-04 13:29:42,575 WARNING [train.py:1204] (2/4) Exclude cut with ID _53950_210_5816_1_1534417264919_3862951_367-26344-0_sp1.1 from training. Duration: 0.97275 2023-10-04 13:29:42,599 WARNING [train.py:1204] (2/4) Exclude cut with ID _5444_210_2151_1_1527418808464_5312210_634-194174-0_sp1.1 from training. Duration: 0.8818125 2023-10-04 13:29:42,616 WARNING [train.py:1204] (2/4) Exclude cut with ID _56494_210_4220_1_1535284837625_3033169_69-138150-0 from training. Duration: 0.94 2023-10-04 13:29:42,860 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.ff3_skip_rate, batch_count=1670866.6666666667, ans=0.0 2023-10-04 13:29:45,392 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7543_210_6753_1_1528544010_1110402_25-183860-0_sp0.9 from training. Duration: 0.52225 2023-10-04 13:29:45,510 WARNING [train.py:1204] (2/4) Exclude cut with ID _40284_210_2084_1_1533513679814_3704279_287-191918-0_sp1.1 from training. Duration: 0.963625 2023-10-04 13:29:46,859 WARNING [train.py:1204] (2/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_291-270881-0 from training. Duration: 0.97 2023-10-04 13:29:46,897 WARNING [train.py:1204] (2/4) Exclude cut with ID _12555_210_8750_1_1530075594402_3780239_449-267986-0_sp0.9 from training. Duration: 0.92225 2023-10-04 13:29:51,461 WARNING [train.py:1204] (2/4) Exclude cut with ID _3992_210_1747_1_1526644816031_3731060_268-64520-0_sp1.1 from training. Duration: 0.963625 2023-10-04 13:29:51,476 WARNING [train.py:1204] (2/4) Exclude cut with ID _39411_210_5986_1_1533538825030_3343649_173-238248-0_sp0.9 from training. Duration: 0.92225 2023-10-04 13:29:51,516 WARNING [train.py:1204] (2/4) Exclude cut with ID _32704_210_8954_1_1533017488840_6747370_828-313097-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 13:29:52,983 WARNING [train.py:1204] (2/4) Exclude cut with ID _26907_210_7234_1_1532221740694_3205263_337-2494-0 from training. Duration: 0.88 2023-10-04 13:29:54,329 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_229-45300-0_sp1.1 from training. Duration: 0.5181875 2023-10-04 13:29:55,747 WARNING [train.py:1204] (2/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_300-27235-0_sp1.1 from training. Duration: 0.7909375 2023-10-04 13:29:57,712 WARNING [train.py:1204] (2/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_48-338976-0_sp1.1 from training. Duration: 0.8090625 2023-10-04 13:30:05,037 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_13091_210_7925_1_1530609447_8987354_792-195353-0_sp1.1 from training. Duration: 0.936375 2023-10-04 13:30:05,043 WARNING [train.py:1204] (2/4) Exclude cut with ID _56524_210_9642_1_1534572011779_3619170_311-54500-0_sp1.1 from training. Duration: 0.97275 2023-10-04 13:30:07,852 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7543_210_6753_1_1528544010_1110402_25-183860-0 from training. Duration: 0.47 2023-10-04 13:30:09,236 WARNING [train.py:1204] (2/4) Exclude cut with ID 2411-132532-0017-82279-0_sp1.1 from training. Duration: 0.9681875 2023-10-04 13:30:09,237 WARNING [train.py:1204] (2/4) Exclude cut with ID _14170_210_5219_1_1530525949868_4469650_272-249985-0_sp0.9 from training. Duration: 0.7444375 2023-10-04 13:30:09,298 WARNING [train.py:1204] (2/4) Exclude cut with ID _55644_210_11856_1_1534593599582_4103719_320-229006-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 13:30:09,351 WARNING [train.py:1204] (2/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_177-100554-0_sp1.1 from training. Duration: 0.963625 2023-10-04 13:30:09,353 WARNING [train.py:1204] (2/4) Exclude cut with ID _14998_210_5219_1_1530705582080_7474970_787-294416-0_sp1.1 from training. Duration: 0.7454375 2023-10-04 13:30:13,609 WARNING [train.py:1204] (2/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_318-59097-0 from training. Duration: 0.76 2023-10-04 13:30:14,887 WARNING [train.py:1204] (2/4) Exclude cut with ID _12369_210_4381_1_1530356078581_4388520_480-13146-0_sp1.1 from training. Duration: 0.77275 2023-10-04 13:30:15,686 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.1.encoder.layers.0.conv_module2.whiten, num_groups=1, num_channels=256, metric=13.23 vs. limit=15.0 2023-10-04 13:30:16,840 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.3.encoder.layers.3.feed_forward2.out_whiten, num_groups=1, num_channels=512, metric=5.36 vs. limit=15.0 2023-10-04 13:30:17,544 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_469-338558-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 13:30:17,594 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_40896_210_15710_1_1533433956_7100802_367-126897-0_sp1.1 from training. Duration: 0.963625 2023-10-04 13:30:17,616 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6545_210_2040_1_1527774670_4245258_379-14182-0 from training. Duration: 0.8 2023-10-04 13:30:17,628 WARNING [train.py:1204] (2/4) Exclude cut with ID _21631_210_2253_1_1531907880778_3160339_197-79498-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 13:30:17,632 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_12868_210_6753_1_1530702479_3610564_416-293746-0_sp1.1 from training. Duration: 0.8181875 2023-10-04 13:30:17,670 WARNING [train.py:1204] (2/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_52-211683-0 from training. Duration: 0.9 2023-10-04 13:30:22,298 WARNING [train.py:1204] (2/4) Exclude cut with ID _48678_210_5663_1_1533988830947_3769199_306-255886-0_sp0.9 from training. Duration: 0.8555625 2023-10-04 13:30:24,098 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.feed_forward3.hidden_balancer.prob, batch_count=1671066.6666666667, ans=0.125 2023-10-04 13:30:25,062 WARNING [train.py:1204] (2/4) Exclude cut with ID _65134_210_14763_1_1535335393985_3505368_296-24733-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 13:30:27,133 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.conv_skip_rate, batch_count=1671066.6666666667, ans=0.0 2023-10-04 13:30:29,102 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.ff3_skip_rate, batch_count=1671066.6666666667, ans=0.0 2023-10-04 13:30:30,729 WARNING [train.py:1204] (2/4) Exclude cut with ID _14898_210_9206_1_1530766812377_4177190_335-99364-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 13:30:30,827 WARNING [train.py:1204] (2/4) Exclude cut with ID _54871_210_22356_1_1534665798253_3856359_19-342632-0 from training. Duration: 0.91 2023-10-04 13:30:32,152 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_53071_210_16554_1_1534490405_2954066_265-170472-0 from training. Duration: 0.83 2023-10-04 13:30:34,970 INFO [train.py:1046] (2/4) Epoch 48, batch 1000, loss[loss=0.1545, simple_loss=0.2345, pruned_loss=0.03726, over 23295.00 frames. ], tot_loss[loss=0.1537, simple_loss=0.2343, pruned_loss=0.03651, over 4692230.72 frames. ], batch size: 119, lr: 2.12e-03, grad_scale: 16.0 2023-10-04 13:30:35,061 WARNING [train.py:1204] (2/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_189-298172-0_sp1.1 from training. Duration: 0.963625 2023-10-04 13:30:37,903 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7615_210_4211_1_1528110965_4580160_72-235211-0 from training. Duration: 0.76 2023-10-04 13:30:39,097 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.794e+02 2.109e+02 2.410e+02 2.800e+02 4.729e+02, threshold=4.820e+02, percent-clipped=1.0 2023-10-04 13:30:39,182 WARNING [train.py:1204] (2/4) Exclude cut with ID _47790_210_16187_1_1533862814066_4207519_155-90874-0_sp1.1 from training. Duration: 0.936375 2023-10-04 13:30:43,588 WARNING [train.py:1204] (2/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_164-216310-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 13:30:45,076 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.feed_forward1.out_proj.dropout_p, batch_count=1671133.3333333333, ans=0.1 2023-10-04 13:30:46,192 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_46013_210_7234_1_1533776362_3744032_463-52378-0 from training. Duration: 0.86 2023-10-04 13:30:46,197 WARNING [train.py:1204] (2/4) Exclude cut with ID _61133_210_9969_1_1535196777227_3696409_333-270325-0 from training. Duration: 0.93 2023-10-04 13:30:49,622 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.bypass.scale_min, batch_count=1671200.0, ans=0.2 2023-10-04 13:30:50,787 WARNING [train.py:1204] (2/4) Exclude cut with ID _5172_210_1883_1_1527246062567_3577219_18-101380-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 13:30:50,796 WARNING [train.py:1204] (2/4) Exclude cut with ID _30800_210_22382_1_1532499347904_4515959_577-236858-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 13:30:51,036 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.conv_module1.balancer2.prob, batch_count=1671200.0, ans=0.125 2023-10-04 13:30:52,239 WARNING [train.py:1204] (2/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_1006-177345-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 13:30:56,308 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6291_210_3273_1_1527683649_1078498_4-266348-0 from training. Duration: 0.78 2023-10-04 13:30:59,388 WARNING [train.py:1204] (2/4) Exclude cut with ID _55857_210_2346_1_1534644007831_6898209_745-98391-0 from training. Duration: 0.76 2023-10-04 13:30:59,511 WARNING [train.py:1204] (2/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_375-244976-0 from training. Duration: 0.75 2023-10-04 13:31:01,431 WARNING [train.py:1204] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_4-248404-0_sp1.1 from training. Duration: 0.97275 2023-10-04 13:31:03,448 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8453_210_4211_1_1528502940_8562730_596-306820-0 from training. Duration: 0.97 2023-10-04 13:31:06,078 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_211-339622-0_sp0.9 from training. Duration: 0.5 2023-10-04 13:31:06,118 WARNING [train.py:1204] (2/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_393-57888-0 from training. Duration: 0.64 2023-10-04 13:31:06,763 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.3.encoder.layers.2.nonlin_attention.whiten2, num_groups=1, num_channels=512, metric=5.33 vs. limit=15.0 2023-10-04 13:31:07,501 WARNING [train.py:1204] (2/4) Exclude cut with ID _23825_210_13449_1_1531915313526_3497150_143-343575-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 13:31:07,561 WARNING [train.py:1204] (2/4) Exclude cut with ID _58605_210_12210_1_1534723195939_3904259_36-12256-0_sp1.1 from training. Duration: 0.936375 2023-10-04 13:31:14,516 WARNING [train.py:1204] (2/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_38-67795-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 13:31:14,587 WARNING [train.py:1204] (2/4) Exclude cut with ID _64631_210_27767_1_1535349580068_4165160_308-327832-0_sp1.1 from training. Duration: 0.92725 2023-10-04 13:31:15,863 WARNING [train.py:1204] (2/4) Exclude cut with ID _37086_210_9038_1_1532999540891_7021250_726-81176-0_sp1.1 from training. Duration: 0.936375 2023-10-04 13:31:15,934 WARNING [train.py:1204] (2/4) Exclude cut with ID _47790_210_16187_1_1533862814066_4207519_47-141521-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 13:31:15,939 WARNING [train.py:1204] (2/4) Exclude cut with ID _3067_210_2667_1_1526212974640_4503168_250-285937-0 from training. Duration: 0.8 2023-10-04 13:31:15,964 WARNING [train.py:1204] (2/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_239-24000-0_sp1.1 from training. Duration: 0.97275 2023-10-04 13:31:19,068 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_3371_210_1794_1_1526384462_5580777_396-339812-0_sp0.9 from training. Duration: 0.9444375 2023-10-04 13:31:19,130 WARNING [train.py:1204] (2/4) Exclude cut with ID _16909_210_5724_1_1531292079912_4118919_580-173454-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 13:31:20,435 WARNING [train.py:1204] (2/4) Exclude cut with ID IC0124W0002-61308 from training. Duration: 0.982 2023-10-04 13:31:23,167 WARNING [train.py:1204] (2/4) Exclude cut with ID _31602_210_5854_1_1532677021456_4458250_249-96604-0 from training. Duration: 0.89 2023-10-04 13:31:24,589 WARNING [train.py:1204] (2/4) Exclude cut with ID _69584_210_5219_1_1535785133775_4044450_325-7656-0 from training. Duration: 0.86 2023-10-04 13:31:27,194 WARNING [train.py:1204] (2/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_478-162247-0 from training. Duration: 0.82 2023-10-04 13:31:27,335 WARNING [train.py:1204] (2/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_686-54383-0_sp1.1 from training. Duration: 0.7909375 2023-10-04 13:31:33,173 WARNING [train.py:1204] (2/4) Exclude cut with ID _29303_210_2575_1_1532392237303_7410649_85-270092-0_sp1.1 from training. Duration: 0.936375 2023-10-04 13:31:33,190 WARNING [train.py:1204] (2/4) Exclude cut with ID _2322_210_2667_1_1525604548971_3656299_440-80494-0_sp1.1 from training. Duration: 0.8 2023-10-04 13:31:34,563 WARNING [train.py:1204] (2/4) Exclude cut with ID _47346_210_9476_1_1533815971553_3536160_201-40644-0_sp1.1 from training. Duration: 0.936375 2023-10-04 13:31:35,861 WARNING [train.py:1204] (2/4) Exclude cut with ID _56235_210_10640_1_1534557335235_4161099_343-139898-0_sp1.1 from training. Duration: 0.87275 2023-10-04 13:31:37,214 WARNING [train.py:1204] (2/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_1012-279473-0 from training. Duration: 0.67 2023-10-04 13:31:37,390 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.balancer1.prob, batch_count=1671400.0, ans=0.125 2023-10-04 13:31:39,840 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7931_210_1794_1_1528594728_5564059_280-279560-0_sp1.1 from training. Duration: 0.8909375 2023-10-04 13:31:39,878 WARNING [train.py:1204] (2/4) Exclude cut with ID _8263_210_1664_1_1528439452891_970220_68-181805-0 from training. Duration: 0.64 2023-10-04 13:31:39,926 WARNING [train.py:1204] (2/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_605-133395-0 from training. Duration: 0.66 2023-10-04 13:31:41,306 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_43-171109-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 13:31:41,310 WARNING [train.py:1204] (2/4) Exclude cut with ID _40094_210_16380_1_1533294005773_3641159_107-280739-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 13:31:43,055 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.feed_forward1.hidden_balancer.prob, batch_count=1671400.0, ans=0.125 2023-10-04 13:31:44,134 WARNING [train.py:1204] (2/4) Exclude cut with ID _14821_210_8750_1_1530680503991_6829359_2-227151-0_sp0.9 from training. Duration: 0.92225 2023-10-04 13:31:46,847 WARNING [train.py:1204] (2/4) Exclude cut with ID _14268_210_9206_1_1530622771285_4714099_331-105351-0_sp1.1 from training. Duration: 0.5909375 2023-10-04 13:31:48,215 INFO [train.py:1046] (2/4) Epoch 48, batch 1050, loss[loss=0.1584, simple_loss=0.248, pruned_loss=0.03444, over 24626.00 frames. ], tot_loss[loss=0.1529, simple_loss=0.2334, pruned_loss=0.03617, over 4700135.11 frames. ], batch size: 65, lr: 2.12e-03, grad_scale: 16.0 2023-10-04 13:31:48,321 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_26326_210_7925_1_1533002493_3592406_451-82741-0_sp1.1 from training. Duration: 0.97275 2023-10-04 13:31:48,556 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.ff2_skip_rate, batch_count=1671466.6666666667, ans=0.0 2023-10-04 13:31:51,654 WARNING [train.py:1204] (2/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_191-59784-0_sp0.9 from training. Duration: 0.97775 2023-10-04 13:31:53,010 WARNING [train.py:1204] (2/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_445-117398-0_sp1.1 from training. Duration: 0.8090625 2023-10-04 13:31:55,650 WARNING [train.py:1204] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_605-257205-0_sp0.9 from training. Duration: 0.6555625 2023-10-04 13:31:56,993 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_50050_210_16223_1_1535435796_7290138_592-258841-0_sp1.1 from training. Duration: 0.936375 2023-10-04 13:31:57,238 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=1671466.6666666667, ans=0.1 2023-10-04 13:31:57,743 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.3.encoder.layers.1.nonlin_attention.whiten2, num_groups=1, num_channels=512, metric=6.18 vs. limit=15.0 2023-10-04 13:31:58,291 WARNING [train.py:1204] (2/4) Exclude cut with ID _14348_210_8341_1_1530599434996_3778640_240-232370-0_sp0.9 from training. Duration: 0.9333125 2023-10-04 13:32:01,828 WARNING [train.py:1204] (2/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_239-316802-0_sp0.9 from training. Duration: 0.7666875 2023-10-04 13:32:03,793 WARNING [train.py:1204] (2/4) Exclude cut with ID _45560_210_21982_1_1533812603951_3584999_111-6376-0_sp1.1 from training. Duration: 0.763625 2023-10-04 13:32:05,255 WARNING [train.py:1204] (2/4) Exclude cut with ID _54189_210_16380_1_1534332670039_3911710_606-311481-0_sp1.1 from training. Duration: 0.963625 2023-10-04 13:32:06,589 WARNING [train.py:1204] (2/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_237-332027-0_sp1.1 from training. Duration: 0.863625 2023-10-04 13:32:07,078 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.5.encoder.layers.0.feed_forward3.out_whiten, num_groups=1, num_channels=256, metric=9.18 vs. limit=15.0 2023-10-04 13:32:07,895 WARNING [train.py:1204] (2/4) Exclude cut with ID _66070_210_7640_1_1535441393096_7805310_489-292631-0_sp0.9 from training. Duration: 0.87775 2023-10-04 13:32:07,969 WARNING [train.py:1204] (2/4) Exclude cut with ID _49728_210_12651_1_1534041034294_6788150_624-195301-0_sp1.1 from training. Duration: 0.77275 2023-10-04 13:32:08,012 WARNING [train.py:1204] (2/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_44-79427-0 from training. Duration: 0.98 2023-10-04 13:32:08,818 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.2.encoder.layers.2.whiten, num_groups=1, num_channels=384, metric=3.70 vs. limit=12.0 2023-10-04 13:32:09,390 WARNING [train.py:1204] (2/4) Exclude cut with ID _35350_210_16040_1_1533040191183_4075299_490-198354-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 13:32:09,432 WARNING [train.py:1204] (2/4) Exclude cut with ID _5166_210_2084_1_1527073197160_4087993_309-222545-0 from training. Duration: 0.84 2023-10-04 13:32:10,887 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_13091_210_7925_1_1530609447_8987354_790-199469-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 13:32:10,893 WARNING [train.py:1204] (2/4) Exclude cut with ID _21632_210_2253_1_1531994001765_3744140_110-274654-0 from training. Duration: 0.82 2023-10-04 13:32:10,899 WARNING [train.py:1204] (2/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_498-40153-0_sp0.9 from training. Duration: 0.7 2023-10-04 13:32:16,416 WARNING [train.py:1204] (2/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_363-282909-0_sp1.1 from training. Duration: 0.936375 2023-10-04 13:32:16,646 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.conv_skip_rate, batch_count=1671600.0, ans=0.0 2023-10-04 13:32:17,747 WARNING [train.py:1204] (2/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_178-177582-0_sp0.9 from training. Duration: 0.82225 2023-10-04 13:32:17,758 WARNING [train.py:1204] (2/4) Exclude cut with ID _24612_210_8913_1_1531998054193_3806359_357-35487-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 13:32:21,010 WARNING [train.py:1204] (2/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_138-332773-0 from training. Duration: 0.81 2023-10-04 13:32:21,045 WARNING [train.py:1204] (2/4) Exclude cut with ID _7624_210_1531_1_1528109802198_4933101_418-349834-0 from training. Duration: 0.6 2023-10-04 13:32:22,346 WARNING [train.py:1204] (2/4) Exclude cut with ID _7537_210_2084_1_1528502395431_3924359_14-312906-0_sp0.9 from training. Duration: 0.9333125 2023-10-04 13:32:23,710 WARNING [train.py:1204] (2/4) Exclude cut with ID _60057_210_10640_1_1534903065793_3333150_362-230377-0 from training. Duration: 0.87 2023-10-04 13:32:27,026 WARNING [train.py:1204] (2/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_138-64210-0 from training. Duration: 0.73 2023-10-04 13:32:29,594 WARNING [train.py:1204] (2/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_454-339504-0_sp1.1 from training. Duration: 0.97275 2023-10-04 13:32:29,806 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.balancer2.prob, batch_count=1671600.0, ans=0.125 2023-10-04 13:32:32,941 WARNING [train.py:1204] (2/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_508-335369-0_sp0.9 from training. Duration: 0.5333125 2023-10-04 13:32:34,400 WARNING [train.py:1204] (2/4) Exclude cut with ID _5608_210_2106_1_1527415439089_6819768_909-82767-0_sp0.9 from training. Duration: 0.67775 2023-10-04 13:32:34,426 WARNING [train.py:1204] (2/4) Exclude cut with ID _6694_210_4599_1_1527852637490_3965529_99-11611-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 13:32:34,499 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_2655_210_2087_1_1525688959_3897400_52-279899-0_sp1.1 from training. Duration: 0.62725 2023-10-04 13:32:34,746 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=1671666.6666666667, ans=0.1 2023-10-04 13:32:38,464 WARNING [train.py:1204] (2/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_375-245677-0_sp1.1 from training. Duration: 0.736375 2023-10-04 13:32:40,159 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_23850_210_4238_1_1532232597_1632433_32-185135-0 from training. Duration: 0.86 2023-10-04 13:32:41,511 WARNING [train.py:1204] (2/4) Exclude cut with ID _15113_210_8254_1_1530842438579_3716279_209-54533-0 from training. Duration: 0.96 2023-10-04 13:32:41,545 WARNING [train.py:1204] (2/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_401-53423-0 from training. Duration: 0.55 2023-10-04 13:32:41,584 WARNING [train.py:1204] (2/4) Exclude cut with ID _14867_210_1968_1_1530684179890_3846969_319-250868-0_sp1.1 from training. Duration: 0.9 2023-10-04 13:32:42,927 WARNING [train.py:1204] (2/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_106-30782-0_sp0.9 from training. Duration: 0.888875 2023-10-04 13:32:43,053 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_5960_210_1794_1_1527403865_3990999_515-2613-0 from training. Duration: 0.88 2023-10-04 13:32:43,232 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.self_attn_weights.pos_emb_skip_rate, batch_count=1671666.6666666667, ans=0.0 2023-10-04 13:32:47,092 WARNING [train.py:1204] (2/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_798-106179-0_sp0.9 from training. Duration: 0.92225 2023-10-04 13:32:49,134 WARNING [train.py:1204] (2/4) Exclude cut with ID _14227_210_4381_1_1530701804286_3940259_125-275642-0_sp1.1 from training. Duration: 0.9 2023-10-04 13:32:49,137 WARNING [train.py:1204] (2/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_79-200392-0_sp1.1 from training. Duration: 0.8818125 2023-10-04 13:32:49,178 WARNING [train.py:1204] (2/4) Exclude cut with ID _48836_210_12210_1_1534075848293_3904150_429-35475-0_sp0.9 from training. Duration: 0.988875 2023-10-04 13:32:49,200 WARNING [train.py:1204] (2/4) Exclude cut with ID _69884_210_9771_1_1535846392818_4166169_224-172517-0_sp1.1 from training. Duration: 0.97275 2023-10-04 13:32:53,800 WARNING [train.py:1204] (2/4) Exclude cut with ID _15113_210_8254_1_1530842438579_3716279_76-232507-0_sp1.1 from training. Duration: 0.97275 2023-10-04 13:32:53,803 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_135-5501-0 from training. Duration: 0.98 2023-10-04 13:32:55,149 WARNING [train.py:1204] (2/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_563-347832-0_sp0.9 from training. Duration: 0.988875 2023-10-04 13:32:55,160 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6786_210_1388_1_1527839699_4194495_273-149841-0 from training. Duration: 0.92 2023-10-04 13:32:55,186 WARNING [train.py:1204] (2/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_172-215932-0 from training. Duration: 0.66 2023-10-04 13:32:56,461 WARNING [train.py:1204] (2/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_939-132910-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 13:32:56,947 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.5.encoder.layers.0.self_attn_weights.whiten_keys, num_groups=4, num_channels=128, metric=1.94 vs. limit=6.0 2023-10-04 13:32:59,802 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=1671733.3333333333, ans=0.1 2023-10-04 13:33:00,885 WARNING [train.py:1204] (2/4) Exclude cut with ID _49371_210_12071_1_1533952761021_3812190_16-246001-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 13:33:02,717 INFO [train.py:1046] (2/4) Epoch 48, batch 1100, loss[loss=0.1514, simple_loss=0.2407, pruned_loss=0.03104, over 24454.00 frames. ], tot_loss[loss=0.1524, simple_loss=0.2326, pruned_loss=0.03609, over 4689923.34 frames. ], batch size: 66, lr: 2.12e-03, grad_scale: 8.0 2023-10-04 13:33:05,490 WARNING [train.py:1204] (2/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_680-189165-0_sp1.1 from training. Duration: 0.936375 2023-10-04 13:33:05,599 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.1.feed_forward1.hidden_balancer.prob, batch_count=1671800.0, ans=0.125 2023-10-04 13:33:07,948 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.798e+02 2.096e+02 2.413e+02 2.876e+02 5.398e+02, threshold=4.826e+02, percent-clipped=2.0 2023-10-04 13:33:10,972 WARNING [train.py:1204] (2/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_53-300544-0_sp1.1 from training. Duration: 0.6090625 2023-10-04 13:33:12,635 WARNING [train.py:1204] (2/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_540-275685-0_sp1.1 from training. Duration: 0.8090625 2023-10-04 13:33:13,944 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_38965_210_4852_1_1533174906_3484735_453-106579-0_sp1.1 from training. Duration: 0.92725 2023-10-04 13:33:13,997 WARNING [train.py:1204] (2/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_105-328174-0 from training. Duration: 0.76 2023-10-04 13:33:16,691 WARNING [train.py:1204] (2/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_279-126205-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 13:33:19,476 WARNING [train.py:1204] (2/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_5-55893-0_sp0.9 from training. Duration: 0.611125 2023-10-04 13:33:20,841 WARNING [train.py:1204] (2/4) Exclude cut with ID _13678_210_7497_1_1530530069226_3463170_141-10835-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 13:33:24,101 WARNING [train.py:1204] (2/4) Exclude cut with ID _22083_210_8254_1_1531702862439_3678970_201-85738-0_sp1.1 from training. Duration: 0.8181875 2023-10-04 13:33:24,127 WARNING [train.py:1204] (2/4) Exclude cut with ID _4697_210_2069_1_1526815801953_3823098_51-1049-0 from training. Duration: 0.75 2023-10-04 13:33:25,466 WARNING [train.py:1204] (2/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_206-10097-0_sp0.9 from training. Duration: 0.5444375 2023-10-04 13:33:27,216 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_4528_210_3273_1_1527076654_3276777_360-63661-0_sp1.1 from training. Duration: 0.92725 2023-10-04 13:33:27,224 WARNING [train.py:1204] (2/4) Exclude cut with ID _64086_210_7994_1_1535270417019_7435949_449-31567-0_sp1.1 from training. Duration: 0.8818125 2023-10-04 13:33:28,690 WARNING [train.py:1204] (2/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_245-174553-0_sp0.9 from training. Duration: 0.97775 2023-10-04 13:33:32,017 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6545_210_2040_1_1527774670_4245258_379-14182-0_sp1.1 from training. Duration: 0.72725 2023-10-04 13:33:35,303 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_3133_210_2087_1_1526106446_2324690_256-206070-0_sp1.1 from training. Duration: 0.963625 2023-10-04 13:33:38,046 WARNING [train.py:1204] (2/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_278-104676-0 from training. Duration: 0.98 2023-10-04 13:33:39,441 WARNING [train.py:1204] (2/4) Exclude cut with ID IC0671W0065-331908 from training. Duration: 0.896 2023-10-04 13:33:39,497 WARNING [train.py:1204] (2/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_244-129917-0_sp1.1 from training. Duration: 0.97275 2023-10-04 13:33:42,211 WARNING [train.py:1204] (2/4) Exclude cut with ID _7852_210_5219_1_1528286471316_8139400_1129-170857-0_sp1.1 from training. Duration: 0.97275 2023-10-04 13:33:43,512 WARNING [train.py:1204] (2/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_443-30034-0_sp0.9 from training. Duration: 0.711125 2023-10-04 13:33:43,550 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_31583_210_3852_1_1532595851_4024413_250-332999-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 13:33:44,930 WARNING [train.py:1204] (2/4) Exclude cut with ID _8772_210_1850_1_1528635595972_3664011_247-336448-0 from training. Duration: 0.74 2023-10-04 13:33:46,602 WARNING [train.py:1204] (2/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_567-135114-0_sp0.9 from training. Duration: 0.9666875 2023-10-04 13:33:46,613 WARNING [train.py:1204] (2/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_557-349534-0_sp1.1 from training. Duration: 0.82725 2023-10-04 13:33:46,630 WARNING [train.py:1204] (2/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_1341-26728-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 13:33:47,926 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_40222_210_6228_1_1533385298_4481939_375-328051-0_sp1.1 from training. Duration: 0.97275 2023-10-04 13:33:47,956 WARNING [train.py:1204] (2/4) Exclude cut with ID _4697_210_2069_1_1526815801953_3823098_276-96593-0 from training. Duration: 0.77 2023-10-04 13:33:51,135 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.ff3_skip_rate, batch_count=1672000.0, ans=0.0 2023-10-04 13:33:55,434 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_53071_210_16554_1_1534490405_2954066_265-170472-0_sp0.9 from training. Duration: 0.92225 2023-10-04 13:33:55,452 WARNING [train.py:1204] (2/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_365-1492-0 from training. Duration: 0.91 2023-10-04 13:33:56,894 WARNING [train.py:1204] (2/4) Exclude cut with ID _66873_210_5517_1_1535454136506_4293370_654-52713-0_sp0.9 from training. Duration: 0.8555625 2023-10-04 13:34:01,725 WARNING [train.py:1204] (2/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_113-92608-0_sp1.1 from training. Duration: 0.4909375 2023-10-04 13:34:05,154 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_46013_210_7234_1_1533776362_3744032_50-141535-0 from training. Duration: 0.99 2023-10-04 13:34:05,168 WARNING [train.py:1204] (2/4) Exclude cut with ID _13942_210_9988_1_1530604906036_3733360_499-55676-0_sp1.1 from training. Duration: 0.563625 2023-10-04 13:34:07,208 WARNING [train.py:1204] (2/4) Exclude cut with ID _30800_210_22382_1_1532499347904_4515959_375-65143-0_sp1.1 from training. Duration: 0.97275 2023-10-04 13:34:11,172 WARNING [train.py:1204] (2/4) Exclude cut with ID _16756_210_4915_1_1531047803763_7104259_199-325608-0_sp1.1 from training. Duration: 0.92725 2023-10-04 13:34:11,196 WARNING [train.py:1204] (2/4) Exclude cut with ID _31649_210_6228_1_1532584608063_4091078_443-124391-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 13:34:12,627 WARNING [train.py:1204] (2/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_146-218763-0 from training. Duration: 0.86 2023-10-04 13:34:12,664 WARNING [train.py:1204] (2/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_567-135114-0_sp1.1 from training. Duration: 0.7909375 2023-10-04 13:34:12,706 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_26326_210_7925_1_1533002493_3592406_462-288299-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 13:34:14,059 WARNING [train.py:1204] (2/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_322-217220-0 from training. Duration: 0.86 2023-10-04 13:34:14,093 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_238-256668-0_sp1.1 from training. Duration: 0.62725 2023-10-04 13:34:14,138 WARNING [train.py:1204] (2/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_371-18169-0 from training. Duration: 0.86 2023-10-04 13:34:16,811 INFO [train.py:1046] (2/4) Epoch 48, batch 1150, loss[loss=0.1483, simple_loss=0.2283, pruned_loss=0.03416, over 24553.00 frames. ], tot_loss[loss=0.1529, simple_loss=0.2337, pruned_loss=0.03602, over 4715879.73 frames. ], batch size: 60, lr: 2.12e-03, grad_scale: 8.0 2023-10-04 13:34:16,890 WARNING [train.py:1204] (2/4) Exclude cut with ID _58480_210_12392_1_1534811121111_3646160_23-74512-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 13:34:16,895 WARNING [train.py:1204] (2/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_784-213099-0_sp1.1 from training. Duration: 0.8454375 2023-10-04 13:34:18,237 WARNING [train.py:1204] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_625-102955-0_sp1.1 from training. Duration: 0.7 2023-10-04 13:34:21,089 WARNING [train.py:1204] (2/4) Exclude cut with ID _49748_210_24116_1_1533986971737_3158910_176-174327-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 13:34:22,720 INFO [scaling.py:1118] (2/4) WithLoss: name=encoder.encoders.4.encoder.layers.0.self_attn_weights, loss-sum=0.000e+00 2023-10-04 13:34:24,249 WARNING [train.py:1204] (2/4) Exclude cut with ID _50310_210_15488_1_1534068043345_3641080_432-96140-0_sp1.1 from training. Duration: 0.936375 2023-10-04 13:34:24,524 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.feed_forward1.out_proj.dropout_p, batch_count=1672133.3333333333, ans=0.1 2023-10-04 13:34:25,600 WARNING [train.py:1204] (2/4) Exclude cut with ID _40402_210_18219_1_1533522324115_2953150_76-14910-0_sp1.1 from training. Duration: 0.92725 2023-10-04 13:34:26,846 WARNING [train.py:1204] (2/4) Exclude cut with ID _21783_210_11357_1_1531742458730_3762679_40-19458-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 13:34:26,877 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_14056_210_4163_1_1530604483_7572827_46-93547-0 from training. Duration: 0.95 2023-10-04 13:34:26,912 WARNING [train.py:1204] (2/4) Exclude cut with ID _21631_210_2253_1_1531907880778_3160339_321-35325-0_sp1.1 from training. Duration: 0.963625 2023-10-04 13:34:29,640 WARNING [train.py:1204] (2/4) Exclude cut with ID _4779_210_2084_1_1527318022944_3914893_500-285013-0 from training. Duration: 0.69 2023-10-04 13:34:31,518 WARNING [train.py:1204] (2/4) Exclude cut with ID _43552_210_8913_1_1533726031059_3523429_301-93324-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 13:34:31,526 WARNING [train.py:1204] (2/4) Exclude cut with ID _6473_210_1741_1_1527901307396_3815380_108-17335-0_sp1.1 from training. Duration: 0.5909375 2023-10-04 13:34:37,913 WARNING [train.py:1204] (2/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_206-10097-0 from training. Duration: 0.49 2023-10-04 13:34:39,305 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_36756_210_7234_1_1533026991_966276_103-77513-0_sp1.1 from training. Duration: 0.97275 2023-10-04 13:34:40,867 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.feed_forward1.hidden_balancer.prob, batch_count=1672200.0, ans=0.125 2023-10-04 13:34:43,436 WARNING [train.py:1204] (2/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_662-18193-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 13:34:43,500 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_26433_210_12108_1_1532157158_4056480_439-52438-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 13:34:44,842 WARNING [train.py:1204] (2/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_464-33931-0 from training. Duration: 0.54 2023-10-04 13:34:44,848 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_3474_210_2028_1_1526188513_4825246_145-309131-0_sp0.9 from training. Duration: 0.82225 2023-10-04 13:34:44,870 WARNING [train.py:1204] (2/4) Exclude cut with ID _9679_210_7497_1_1529129212906_4363929_446-308321-0_sp1.1 from training. Duration: 0.963625 2023-10-04 13:34:48,978 WARNING [train.py:1204] (2/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_662-275621-0 from training. Duration: 0.42 2023-10-04 13:34:49,057 WARNING [train.py:1204] (2/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_344-337148-0_sp1.1 from training. Duration: 0.97275 2023-10-04 13:34:51,703 WARNING [train.py:1204] (2/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_759-179071-0_sp1.1 from training. Duration: 0.92725 2023-10-04 13:35:01,776 WARNING [train.py:1204] (2/4) Exclude cut with ID _28783_210_7943_1_1532671164808_7155479_293-12188-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 13:35:06,916 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_17613_210_2151_1_1531137569_2366160_235-320304-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 13:35:06,934 WARNING [train.py:1204] (2/4) Exclude cut with ID _14349_210_8341_1_1530615710818_3442470_170-336541-0 from training. Duration: 0.86 2023-10-04 13:35:08,283 WARNING [train.py:1204] (2/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_375-218677-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 13:35:08,335 WARNING [train.py:1204] (2/4) Exclude cut with ID _28317_210_12742_1_1532480302381_8510325_829-202956-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 13:35:15,777 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9029W0436-509823 from training. Duration: 0.768 2023-10-04 13:35:15,991 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.conv_module1.balancer2.prob, batch_count=1672400.0, ans=0.125 2023-10-04 13:35:17,234 WARNING [train.py:1204] (2/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_231-112214-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 13:35:17,539 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.attention_skip_rate, batch_count=1672400.0, ans=0.0 2023-10-04 13:35:24,310 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9059W0470-524883 from training. Duration: 0.3415625 2023-10-04 13:35:27,706 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_43266_210_1794_1_1533521789_4497218_206-79512-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 13:35:27,794 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_294-163793-0_sp0.9 from training. Duration: 0.911125 2023-10-04 13:35:27,815 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6290_210_3273_1_1527595224_1282787_115-87095-0_sp1.1 from training. Duration: 0.5 2023-10-04 13:35:29,207 WARNING [train.py:1204] (2/4) Exclude cut with ID _14100_210_8341_1_1530529320613_3678069_279-55760-0_sp1.1 from training. Duration: 0.8454375 2023-10-04 13:35:30,491 INFO [train.py:1046] (2/4) Epoch 48, batch 1200, loss[loss=0.1656, simple_loss=0.2422, pruned_loss=0.04449, over 23502.00 frames. ], tot_loss[loss=0.1533, simple_loss=0.2344, pruned_loss=0.03615, over 4717778.33 frames. ], batch size: 285, lr: 2.12e-03, grad_scale: 16.0 2023-10-04 13:35:31,934 WARNING [train.py:1204] (2/4) Exclude cut with ID _14621_210_1925_1_1530686901185_3675420_419-234200-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 13:35:36,974 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.737e+02 1.971e+02 2.130e+02 2.381e+02 3.707e+02, threshold=4.260e+02, percent-clipped=0.0 2023-10-04 13:35:37,071 WARNING [train.py:1204] (2/4) Exclude cut with ID _60229_210_27767_1_1534838406158_3655350_353-213656-0_sp1.1 from training. Duration: 0.863625 2023-10-04 13:35:37,087 WARNING [train.py:1204] (2/4) Exclude cut with ID _48678_210_5663_1_1533988830947_3769199_306-255886-0_sp1.1 from training. Duration: 0.7 2023-10-04 13:35:38,444 WARNING [train.py:1204] (2/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_466-3492-0_sp1.1 from training. Duration: 0.97275 2023-10-04 13:35:38,445 WARNING [train.py:1204] (2/4) Exclude cut with ID _8755_210_1876_1_1528628559357_3258209_199-242167-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 13:35:38,505 WARNING [train.py:1204] (2/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_237-89684-0_sp1.1 from training. Duration: 0.87275 2023-10-04 13:35:41,282 WARNING [train.py:1204] (2/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_70-84121-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 13:35:43,292 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7544_210_6753_1_1528587722_3576686_172-171750-0_sp1.1 from training. Duration: 0.5909375 2023-10-04 13:35:43,946 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.3.encoder.layers.2.conv_module1.whiten, num_groups=1, num_channels=512, metric=6.38 vs. limit=15.0 2023-10-04 13:35:44,570 WARNING [train.py:1204] (2/4) Exclude cut with ID _24271_210_12658_1_1531987270022_7190519_40-321822-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 13:35:44,589 WARNING [train.py:1204] (2/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_227-83612-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 13:35:46,097 INFO [scaling.py:1118] (2/4) WithLoss: name=encoder.encoders.1.encoder.layers.0.self_attn_weights, loss-sum=0.000e+00 2023-10-04 13:35:47,297 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9156W0015-572508 from training. Duration: 0.896 2023-10-04 13:35:51,390 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_292-265928-0 from training. Duration: 0.51 2023-10-04 13:35:54,186 WARNING [train.py:1204] (2/4) Exclude cut with ID _14747_210_8341_1_1530685797463_3654089_147-131229-0_sp1.1 from training. Duration: 0.6909375 2023-10-04 13:35:55,635 WARNING [train.py:1204] (2/4) Exclude cut with ID _45297_210_2721_1_1533715351381_3264949_351-147136-0_sp0.9 from training. Duration: 0.9666875 2023-10-04 13:35:58,363 WARNING [train.py:1204] (2/4) Exclude cut with ID _5250_210_5219_1_1527249561806_8692110_1102-324568-0_sp1.1 from training. Duration: 0.97275 2023-10-04 13:36:01,558 WARNING [train.py:1204] (2/4) Exclude cut with ID _18516_210_9206_1_1531704653206_3692406_465-75446-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 13:36:01,560 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9203W0126-595623 from training. Duration: 0.896 2023-10-04 13:36:01,635 WARNING [train.py:1204] (2/4) Exclude cut with ID _55383_210_10635_1_1534468426172_2231669_139-328410-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 13:36:08,886 WARNING [train.py:1204] (2/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_166-246557-0_sp0.9 from training. Duration: 0.711125 2023-10-04 13:36:08,890 WARNING [train.py:1204] (2/4) Exclude cut with ID _30039_210_13270_1_1532484612900_2725150_239-304872-0_sp1.1 from training. Duration: 0.963625 2023-10-04 13:36:08,925 WARNING [train.py:1204] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_6-226750-0 from training. Duration: 0.94 2023-10-04 13:36:10,277 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_135-5501-0_sp1.1 from training. Duration: 0.8909375 2023-10-04 13:36:13,034 WARNING [train.py:1204] (2/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_359-118926-0 from training. Duration: 0.72 2023-10-04 13:36:16,501 WARNING [train.py:1204] (2/4) Exclude cut with ID _8064_210_5399_1_1528372299291_4282973_4-91037-0 from training. Duration: 0.88 2023-10-04 13:36:16,534 WARNING [train.py:1204] (2/4) Exclude cut with ID _6653_210_2629_1_1527919172441_6732679_415-288492-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 13:36:17,045 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.5.encoder.layers.1.self_attn1.whiten, num_groups=1, num_channels=256, metric=11.91 vs. limit=22.5 2023-10-04 13:36:17,842 WARNING [train.py:1204] (2/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_544-287179-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 13:36:17,951 WARNING [train.py:1204] (2/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_122-32606-0_sp1.1 from training. Duration: 0.92725 2023-10-04 13:36:19,138 WARNING [train.py:1204] (2/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_121-284152-0_sp1.1 from training. Duration: 0.536375 2023-10-04 13:36:20,468 WARNING [train.py:1204] (2/4) Exclude cut with ID _44718_210_5816_1_1534050051869_6469030_350-299477-0_sp1.1 from training. Duration: 0.97275 2023-10-04 13:36:20,479 WARNING [train.py:1204] (2/4) Exclude cut with ID _6653_210_2629_1_1527919172441_6732679_1103-56445-0_sp0.9 from training. Duration: 0.811125 2023-10-04 13:36:20,542 WARNING [train.py:1204] (2/4) Exclude cut with ID _12369_210_4381_1_1530356078581_4388520_64-298917-0_sp0.9 from training. Duration: 0.97775 2023-10-04 13:36:21,911 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_2245_210_2715_1_1525511548_3540254_310-52483-0 from training. Duration: 0.88 2023-10-04 13:36:21,963 WARNING [train.py:1204] (2/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_401-158239-0_sp1.1 from training. Duration: 0.6454375 2023-10-04 13:36:23,278 WARNING [train.py:1204] (2/4) Exclude cut with ID _59502_210_12210_1_1535107024698_1835139_146-162495-0_sp1.1 from training. Duration: 0.836375 2023-10-04 13:36:23,288 WARNING [train.py:1204] (2/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_41-31157-0_sp0.9 from training. Duration: 0.6333125 2023-10-04 13:36:24,709 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_68717_210_3528_1_1535628683_1556759_95-36722-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 13:36:24,715 WARNING [train.py:1204] (2/4) Exclude cut with ID _30196_210_1664_1_1533708496522_7074140_654-162611-0_sp1.1 from training. Duration: 0.92725 2023-10-04 13:36:28,118 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_9302_210_2550_1_1528889962_4655416_638-301620-0_sp0.9 from training. Duration: 0.52225 2023-10-04 13:36:31,053 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.feed_forward3.hidden_balancer.prob, batch_count=1672733.3333333333, ans=0.125 2023-10-04 13:36:32,080 WARNING [train.py:1204] (2/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_478-162247-0_sp1.1 from training. Duration: 0.7454375 2023-10-04 13:36:35,320 WARNING [train.py:1204] (2/4) Exclude cut with ID _45297_210_2721_1_1533715351381_3264949_353-9541-0 from training. Duration: 0.97 2023-10-04 13:36:39,986 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9653W0190-675468 from training. Duration: 0.9935 2023-10-04 13:36:41,414 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_49730_210_13049_1_1534075294_1830676_214-84436-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 13:36:42,857 WARNING [train.py:1204] (2/4) Exclude cut with ID _50093_210_4220_1_1534417141207_3432299_259-322252-0_sp1.1 from training. Duration: 0.836375 2023-10-04 13:36:44,128 INFO [train.py:1046] (2/4) Epoch 48, batch 1250, loss[loss=0.1442, simple_loss=0.2264, pruned_loss=0.03101, over 24446.00 frames. ], tot_loss[loss=0.1543, simple_loss=0.2354, pruned_loss=0.03654, over 4722376.31 frames. ], batch size: 63, lr: 2.12e-03, grad_scale: 16.0 2023-10-04 13:36:44,230 WARNING [train.py:1204] (2/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_599-50220-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 13:36:46,081 WARNING [train.py:1204] (2/4) Exclude cut with ID _21878_210_3525_1_1531738772002_7264330_174-109016-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 13:36:47,555 WARNING [train.py:1204] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_665-196155-0 from training. Duration: 0.67 2023-10-04 13:36:50,306 WARNING [train.py:1204] (2/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_656-188361-0_sp1.1 from training. Duration: 0.7909375 2023-10-04 13:36:51,654 WARNING [train.py:1204] (2/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_60-18123-0_sp1.1 from training. Duration: 0.8 2023-10-04 13:36:51,710 WARNING [train.py:1204] (2/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_535-14410-0 from training. Duration: 0.47 2023-10-04 13:36:53,114 WARNING [train.py:1204] (2/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_94-126128-0_sp1.1 from training. Duration: 0.7818125 2023-10-04 13:36:56,110 WARNING [train.py:1204] (2/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_356-31216-0_sp1.1 from training. Duration: 0.6818125 2023-10-04 13:36:59,095 WARNING [train.py:1204] (2/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_443-30034-0_sp1.1 from training. Duration: 0.5818125 2023-10-04 13:37:00,344 WARNING [train.py:1204] (2/4) Exclude cut with ID _26907_210_7234_1_1532221740694_3205263_337-2494-0_sp1.1 from training. Duration: 0.8 2023-10-04 13:37:01,776 WARNING [train.py:1204] (2/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_49-97158-0_sp0.9 from training. Duration: 0.8444375 2023-10-04 13:37:01,787 WARNING [train.py:1204] (2/4) Exclude cut with ID _7877_210_6014_1_1528286358099_3750136_553-170128-0_sp1.1 from training. Duration: 0.963625 2023-10-04 13:37:02,362 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.5.encoder.layers.0.self_attn_weights.whiten_keys, num_groups=4, num_channels=128, metric=1.97 vs. limit=6.0 2023-10-04 13:37:03,185 WARNING [train.py:1204] (2/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_202-293523-0_sp0.9 from training. Duration: 0.8 2023-10-04 13:37:06,560 WARNING [train.py:1204] (2/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_188-171673-0_sp0.9 from training. Duration: 0.6444375 2023-10-04 13:37:06,777 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.balancer2.prob, batch_count=1672866.6666666667, ans=0.125 2023-10-04 13:37:07,863 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_120-41668-0_sp1.1 from training. Duration: 0.636375 2023-10-04 13:37:07,875 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_40223_210_6228_1_1533298404_4812267_613-78769-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 13:37:09,276 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_53861_210_5399_1_1534422156_4272077_150-336899-0_sp1.1 from training. Duration: 0.963625 2023-10-04 13:37:10,564 WARNING [train.py:1204] (2/4) Exclude cut with ID _14459_210_2228_1_1531443641946_7516700_1146-56843-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 13:37:13,219 WARNING [train.py:1204] (2/4) Exclude cut with ID _39867_210_9826_1_1533553222030_9344259_1194-299928-0_sp1.1 from training. Duration: 0.92725 2023-10-04 13:37:15,275 WARNING [train.py:1204] (2/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_214-291369-0_sp0.9 from training. Duration: 0.7 2023-10-04 13:37:21,014 WARNING [train.py:1204] (2/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_358-215558-0 from training. Duration: 0.93 2023-10-04 13:37:21,071 WARNING [train.py:1204] (2/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_577-287150-0_sp0.9 from training. Duration: 0.911125 2023-10-04 13:37:21,347 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.bypass.scale_min, batch_count=1672933.3333333333, ans=0.2 2023-10-04 13:37:22,537 WARNING [train.py:1204] (2/4) Exclude cut with ID _60860_210_8254_1_1534896015875_3557169_229-187367-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 13:37:23,844 WARNING [train.py:1204] (2/4) Exclude cut with ID _6275_210_2084_1_1528110062213_7669049_857-298942-0 from training. Duration: 0.85 2023-10-04 13:37:24,027 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.bypass.skip_rate, batch_count=1672933.3333333333, ans=0.07 2023-10-04 13:37:25,696 WARNING [train.py:1204] (2/4) Exclude cut with ID _40137_210_12210_1_1533290484046_3795180_249-187059-0_sp1.1 from training. Duration: 0.8 2023-10-04 13:37:25,703 WARNING [train.py:1204] (2/4) Exclude cut with ID ID0171W0508-765603 from training. Duration: 0.541 2023-10-04 13:37:25,729 WARNING [train.py:1204] (2/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_521-219439-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 13:37:25,735 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_40223_210_6228_1_1533298404_4812267_741-302616-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 13:37:28,535 WARNING [train.py:1204] (2/4) Exclude cut with ID _65293_210_9070_1_1535545549765_4004210_438-154735-0_sp1.1 from training. Duration: 0.92725 2023-10-04 13:37:28,676 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.ff2_skip_rate, batch_count=1673000.0, ans=0.0 2023-10-04 13:37:32,600 WARNING [train.py:1204] (2/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_564-132950-0_sp1.1 from training. Duration: 0.92725 2023-10-04 13:37:32,638 WARNING [train.py:1204] (2/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_291-270881-0_sp1.1 from training. Duration: 0.8818125 2023-10-04 13:37:32,914 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.conv_module1.balancer2.prob, batch_count=1673000.0, ans=0.125 2023-10-04 13:37:34,032 WARNING [train.py:1204] (2/4) Exclude cut with ID _60229_210_27767_1_1534838406158_3655350_353-213656-0 from training. Duration: 0.95 2023-10-04 13:37:34,044 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_3133_210_2087_1_1526106446_2324690_243-143119-0 from training. Duration: 0.77 2023-10-04 13:37:34,152 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.0.attention_skip_rate, batch_count=1673000.0, ans=0.0 2023-10-04 13:37:35,389 WARNING [train.py:1204] (2/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_690-126306-0 from training. Duration: 0.78 2023-10-04 13:37:38,231 WARNING [train.py:1204] (2/4) Exclude cut with ID _30304_210_6319_1_1532476827261_7582060_347-95003-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 13:37:39,568 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.3.encoder.layers.0.feed_forward3.out_whiten, num_groups=1, num_channels=512, metric=11.30 vs. limit=15.0 2023-10-04 13:37:40,203 WARNING [train.py:1204] (2/4) Exclude cut with ID _2322_210_2667_1_1525604548971_3656299_440-80494-0 from training. Duration: 0.88 2023-10-04 13:37:40,231 WARNING [train.py:1204] (2/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_370-229434-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 13:37:44,288 WARNING [train.py:1204] (2/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_124-345840-0_sp1.1 from training. Duration: 0.47275 2023-10-04 13:37:44,296 WARNING [train.py:1204] (2/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_165-123581-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 13:37:45,661 WARNING [train.py:1204] (2/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_453-282588-0 from training. Duration: 0.98 2023-10-04 13:37:45,685 WARNING [train.py:1204] (2/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_299-34413-0_sp0.9 from training. Duration: 0.72225 2023-10-04 13:37:47,536 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_40036_210_13405_1_1533648647_1315388_48-146016-0_sp1.1 from training. Duration: 0.8454375 2023-10-04 13:37:47,565 WARNING [train.py:1204] (2/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_99-167392-0_sp0.9 from training. Duration: 0.67775 2023-10-04 13:37:47,614 WARNING [train.py:1204] (2/4) Exclude cut with ID _48677_210_5663_1_1533902405919_3892989_37-207114-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 13:37:50,242 WARNING [train.py:1204] (2/4) Exclude cut with ID _29857_210_20034_1_1532395886753_3798160_335-163562-0 from training. Duration: 0.85 2023-10-04 13:37:53,005 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_36492_210_7927_1_1533261987_4790834_703-343351-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 13:37:53,085 WARNING [train.py:1204] (2/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_80-147301-0_sp0.9 from training. Duration: 0.9333125 2023-10-04 13:37:54,403 WARNING [train.py:1204] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_218-295097-0_sp0.9 from training. Duration: 0.8555625 2023-10-04 13:37:56,379 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_2295_210_2087_1_1525500993_2407863_116-238894-0_sp1.1 from training. Duration: 0.636375 2023-10-04 13:37:57,678 INFO [train.py:1046] (2/4) Epoch 48, batch 1300, loss[loss=0.1561, simple_loss=0.244, pruned_loss=0.03406, over 24413.00 frames. ], tot_loss[loss=0.1542, simple_loss=0.2356, pruned_loss=0.03642, over 4719881.39 frames. ], batch size: 69, lr: 2.12e-03, grad_scale: 16.0 2023-10-04 13:38:00,439 WARNING [train.py:1204] (2/4) Exclude cut with ID _3231_210_2106_1_1526205864934_6784220_697-171068-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 13:38:01,797 WARNING [train.py:1204] (2/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_102-77064-0 from training. Duration: 0.93 2023-10-04 13:38:03,127 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.754e+02 2.065e+02 2.223e+02 2.420e+02 4.502e+02, threshold=4.446e+02, percent-clipped=1.0 2023-10-04 13:38:04,636 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_13091_210_7925_1_1530609447_8987354_1186-261957-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 13:38:06,035 WARNING [train.py:1204] (2/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_91-277684-0_sp1.1 from training. Duration: 0.67275 2023-10-04 13:38:07,458 WARNING [train.py:1204] (2/4) Exclude cut with ID _31088_210_16446_1_1532520012284_3440919_159-313751-0_sp1.1 from training. Duration: 0.936375 2023-10-04 13:38:10,755 WARNING [train.py:1204] (2/4) Exclude cut with ID _42326_210_7770_1_1533814282531_3707269_227-203721-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 13:38:12,108 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_3531_210_3528_1_1526294203_5216456_309-227302-0_sp1.1 from training. Duration: 0.62725 2023-10-04 13:38:12,161 WARNING [train.py:1204] (2/4) Exclude cut with ID _14087_210_2253_1_1530529224512_3014029_226-242140-0 from training. Duration: 0.81 2023-10-04 13:38:12,898 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.3.encoder.layers.0.self_attn_weights.whiten_keys, num_groups=8, num_channels=256, metric=4.38 vs. limit=6.0 2023-10-04 13:38:16,370 WARNING [train.py:1204] (2/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_122-109023-0_sp1.1 from training. Duration: 0.6909375 2023-10-04 13:38:17,771 WARNING [train.py:1204] (2/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_660-211088-0_sp0.9 from training. Duration: 0.688875 2023-10-04 13:38:19,611 WARNING [train.py:1204] (2/4) Exclude cut with ID _39290_210_4220_1_1533862778760_3075118_92-311600-0 from training. Duration: 0.88 2023-10-04 13:38:21,260 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.balancer1.prob, batch_count=1673200.0, ans=0.125 2023-10-04 13:38:22,349 WARNING [train.py:1204] (2/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_390-339137-0_sp1.1 from training. Duration: 0.5090625 2023-10-04 13:38:25,606 WARNING [train.py:1204] (2/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_449-341946-0_sp1.1 from training. Duration: 0.963625 2023-10-04 13:38:26,850 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_68095_210_20185_1_1535718226_8888855_884-327007-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 13:38:26,954 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder_embed.conv.5.prob, batch_count=1673266.6666666667, ans=0.125 2023-10-04 13:38:28,249 WARNING [train.py:1204] (2/4) Exclude cut with ID _42326_210_7770_1_1533814282531_3707269_308-222998-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 13:38:28,349 WARNING [train.py:1204] (2/4) Exclude cut with ID _40399_210_4353_1_1533305658194_3279406_344-238708-0_sp1.1 from training. Duration: 0.963625 2023-10-04 13:38:29,694 WARNING [train.py:1204] (2/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_87-131427-0_sp1.1 from training. Duration: 0.7090625 2023-10-04 13:38:29,756 WARNING [train.py:1204] (2/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_110-130961-0_sp0.9 from training. Duration: 0.52225 2023-10-04 13:38:31,036 WARNING [train.py:1204] (2/4) Exclude cut with ID _55153_210_2253_1_1534384786677_3162096_148-79022-0 from training. Duration: 0.96 2023-10-04 13:38:35,203 WARNING [train.py:1204] (2/4) Exclude cut with ID _9679_210_7497_1_1529129212906_4363929_512-313889-0_sp1.1 from training. Duration: 0.736375 2023-10-04 13:38:35,218 WARNING [train.py:1204] (2/4) Exclude cut with ID _7970_210_1883_1_1528455616296_3472230_25-178407-0_sp1.1 from training. Duration: 0.6454375 2023-10-04 13:38:35,953 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.3.encoder.layers.0.self_attn_weights.whiten_keys, num_groups=8, num_channels=256, metric=4.09 vs. limit=6.0 2023-10-04 13:38:36,606 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_3133_210_2087_1_1526106446_2324690_246-125000-0 from training. Duration: 0.62 2023-10-04 13:38:37,919 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_15126_210_6753_1_1530778677_2591185_113-157651-0_sp0.9 from training. Duration: 0.7555625 2023-10-04 13:38:39,823 WARNING [train.py:1204] (2/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_105-63309-0_sp1.1 from training. Duration: 0.8909375 2023-10-04 13:38:40,069 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.balancer1.prob, batch_count=1673333.3333333333, ans=0.125 2023-10-04 13:38:41,635 WARNING [train.py:1204] (2/4) Exclude cut with ID _29877_210_6228_1_1532397791106_5276149_578-177271-0_sp1.1 from training. Duration: 0.936375 2023-10-04 13:38:41,690 WARNING [train.py:1204] (2/4) Exclude cut with ID _44831_210_13427_1_1533816011549_3689035_32-188147-0 from training. Duration: 0.96 2023-10-04 13:38:43,128 WARNING [train.py:1204] (2/4) Exclude cut with ID _12369_210_4381_1_1530356078581_4388520_300-254337-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 13:38:43,142 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_128-241008-0 from training. Duration: 0.94 2023-10-04 13:38:44,536 WARNING [train.py:1204] (2/4) Exclude cut with ID _21878_210_3525_1_1531738772002_7264330_53-279178-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 13:38:44,777 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.bypass.scale_min, batch_count=1673333.3333333333, ans=0.2 2023-10-04 13:38:49,190 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_41452_210_7291_1_1533437687_3941281_273-334088-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 13:38:49,192 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_128-241008-0_sp1.1 from training. Duration: 0.8545625 2023-10-04 13:38:49,521 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.conv_module2.balancer1.prob, batch_count=1673333.3333333333, ans=0.125 2023-10-04 13:38:52,527 WARNING [train.py:1204] (2/4) Exclude cut with ID _8394_210_6702_1_1528590504316_3540189_379-49457-0 from training. Duration: 0.87 2023-10-04 13:38:52,589 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_105-114797-0 from training. Duration: 0.84 2023-10-04 13:38:55,324 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_14928_210_5632_1_1530773461_4714000_454-112198-0 from training. Duration: 0.66 2023-10-04 13:38:59,632 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_456-22141-0_sp1.1 from training. Duration: 0.8 2023-10-04 13:39:01,233 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_115-233929-0 from training. Duration: 0.72 2023-10-04 13:39:01,501 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.feed_forward2.hidden_balancer.prob, batch_count=1673400.0, ans=0.125 2023-10-04 13:39:02,645 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_616-16795-0_sp1.1 from training. Duration: 0.963625 2023-10-04 13:39:09,917 WARNING [train.py:1204] (2/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_448-212135-0 from training. Duration: 0.91 2023-10-04 13:39:11,723 INFO [train.py:1046] (2/4) Epoch 48, batch 1350, loss[loss=0.1506, simple_loss=0.23, pruned_loss=0.0356, over 23427.00 frames. ], tot_loss[loss=0.1537, simple_loss=0.2344, pruned_loss=0.03647, over 4717689.96 frames. ], batch size: 119, lr: 2.12e-03, grad_scale: 16.0 2023-10-04 13:39:11,884 WARNING [train.py:1204] (2/4) Exclude cut with ID _46683_210_9642_1_1533730913492_4591116_201-205677-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 13:39:14,594 WARNING [train.py:1204] (2/4) Exclude cut with ID _64086_210_7994_1_1535270417019_7435949_43-80483-0_sp1.1 from training. Duration: 0.97275 2023-10-04 13:39:19,027 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_240-134124-0_sp1.1 from training. Duration: 0.963625 2023-10-04 13:39:19,064 WARNING [train.py:1204] (2/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_907-120940-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 13:39:20,481 WARNING [train.py:1204] (2/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_848-280957-0_sp1.1 from training. Duration: 0.7909375 2023-10-04 13:39:20,524 WARNING [train.py:1204] (2/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_243-42116-0_sp0.9 from training. Duration: 0.811125 2023-10-04 13:39:25,384 WARNING [train.py:1204] (2/4) Exclude cut with ID _6694_210_4599_1_1527852637490_3965529_116-109398-0_sp0.9 from training. Duration: 0.811125 2023-10-04 13:39:26,750 WARNING [train.py:1204] (2/4) Exclude cut with ID _24314_210_4915_1_1532135078116_7170179_521-258868-0 from training. Duration: 0.79 2023-10-04 13:39:27,018 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.attention_skip_rate, batch_count=1673533.3333333333, ans=0.0 2023-10-04 13:39:28,185 WARNING [train.py:1204] (2/4) Exclude cut with ID _2220_210_1819_1_1525509109898_3658680_50-99837-0_sp0.9 from training. Duration: 0.6 2023-10-04 13:39:29,574 WARNING [train.py:1204] (2/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_313-134930-0_sp1.1 from training. Duration: 0.8818125 2023-10-04 13:39:30,913 WARNING [train.py:1204] (2/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_125-144174-0 from training. Duration: 0.39 2023-10-04 13:39:32,302 WARNING [train.py:1204] (2/4) Exclude cut with ID _59288_210_5854_1_1535077757774_3412246_275-293897-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 13:39:33,705 WARNING [train.py:1204] (2/4) Exclude cut with ID _40519_210_13794_1_1533715228655_7337390_722-52880-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 13:39:34,978 WARNING [train.py:1204] (2/4) Exclude cut with ID _7537_210_2084_1_1528502395431_3924359_128-268029-0 from training. Duration: 0.98 2023-10-04 13:39:36,422 WARNING [train.py:1204] (2/4) Exclude cut with ID _8021_210_7994_1_1528376451358_3653069_179-45206-0 from training. Duration: 0.86 2023-10-04 13:39:38,356 WARNING [train.py:1204] (2/4) Exclude cut with ID _13252_210_1925_1_1530671372047_3336909_88-276668-0 from training. Duration: 0.99 2023-10-04 13:39:39,724 WARNING [train.py:1204] (2/4) Exclude cut with ID _22613_210_5219_1_1532253599686_4090170_290-212862-0_sp1.1 from training. Duration: 0.97275 2023-10-04 13:39:39,729 WARNING [train.py:1204] (2/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_465-214726-0 from training. Duration: 0.86 2023-10-04 13:39:51,812 WARNING [train.py:1204] (2/4) Exclude cut with ID _56480_210_4220_1_1534679942079_3075409_138-36031-0_sp1.1 from training. Duration: 0.97275 2023-10-04 13:40:00,642 WARNING [train.py:1204] (2/4) Exclude cut with ID _58605_210_12210_1_1534723195939_3904259_300-285878-0_sp1.1 from training. Duration: 0.97275 2023-10-04 13:40:00,675 WARNING [train.py:1204] (2/4) Exclude cut with ID _40901_210_5665_1_1533363789958_3856130_114-340229-0_sp1.1 from training. Duration: 0.936375 2023-10-04 13:40:00,716 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_489-230703-0 from training. Duration: 0.85 2023-10-04 13:40:04,943 WARNING [train.py:1204] (2/4) Exclude cut with ID _66873_210_5517_1_1535454136506_4293370_322-81243-0_sp1.1 from training. Duration: 0.936375 2023-10-04 13:40:05,033 WARNING [train.py:1204] (2/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_271-269084-0 from training. Duration: 0.83 2023-10-04 13:40:05,052 WARNING [train.py:1204] (2/4) Exclude cut with ID _13252_210_1925_1_1530671372047_3336909_166-314607-0_sp0.9 from training. Duration: 0.6 2023-10-04 13:40:05,730 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.2.encoder.layers.1.self_attn1.whiten, num_groups=1, num_channels=384, metric=15.60 vs. limit=22.5 2023-10-04 13:40:06,397 WARNING [train.py:1204] (2/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_340-343302-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 13:40:08,340 WARNING [train.py:1204] (2/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_286-112015-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 13:40:09,788 WARNING [train.py:1204] (2/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_328-264776-0 from training. Duration: 0.67 2023-10-04 13:40:12,424 WARNING [train.py:1204] (2/4) Exclude cut with ID _4866_210_2577_1_1527404412284_4364659_256-306830-0_sp0.9 from training. Duration: 0.9666875 2023-10-04 13:40:17,244 WARNING [train.py:1204] (2/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_239-316802-0 from training. Duration: 0.69 2023-10-04 13:40:18,763 WARNING [train.py:1204] (2/4) Exclude cut with ID _29857_210_20034_1_1532395886753_3798160_279-185801-0 from training. Duration: 0.91 2023-10-04 13:40:25,198 WARNING [train.py:1204] (2/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_656-188361-0 from training. Duration: 0.87 2023-10-04 13:40:25,314 WARNING [train.py:1204] (2/4) Exclude cut with ID _44718_210_5816_1_1534050051869_6469030_100-230287-0_sp1.1 from training. Duration: 0.936375 2023-10-04 13:40:26,570 INFO [train.py:1046] (2/4) Epoch 48, batch 1400, loss[loss=0.1426, simple_loss=0.2184, pruned_loss=0.03336, over 23867.00 frames. ], tot_loss[loss=0.1521, simple_loss=0.2321, pruned_loss=0.03599, over 4714203.29 frames. ], batch size: 150, lr: 2.12e-03, grad_scale: 16.0 2023-10-04 13:40:29,502 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8275_210_4198_1_1528455320_3892571_276-215622-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 13:40:29,795 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.conv_skip_rate, batch_count=1673800.0, ans=0.0 2023-10-04 13:40:30,891 WARNING [train.py:1204] (2/4) Exclude cut with ID _30304_210_6319_1_1532476827261_7582060_698-309430-0_sp1.1 from training. Duration: 0.963625 2023-10-04 13:40:32,027 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.732e+02 2.089e+02 2.315e+02 2.656e+02 4.133e+02, threshold=4.629e+02, percent-clipped=0.0 2023-10-04 13:40:34,951 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_365-217119-0 from training. Duration: 0.63 2023-10-04 13:40:36,374 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7264_210_4938_1_1527939911_8132095_800-91995-0 from training. Duration: 0.86 2023-10-04 13:40:45,667 WARNING [train.py:1204] (2/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_49-97158-0_sp1.1 from training. Duration: 0.6909375 2023-10-04 13:40:47,045 WARNING [train.py:1204] (2/4) Exclude cut with ID _61132_210_9969_1_1535021792964_3755419_61-57720-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 13:40:49,839 WARNING [train.py:1204] (2/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_345-306832-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 13:40:49,862 WARNING [train.py:1204] (2/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_164-224052-0_sp0.9 from training. Duration: 0.77775 2023-10-04 13:40:50,204 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.bypass_mid.scale_min, batch_count=1673866.6666666667, ans=0.2 2023-10-04 13:40:53,692 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_34886_210_10633_1_1533203882_1755661_76-156096-0_sp1.1 from training. Duration: 0.7909375 2023-10-04 13:40:55,074 WARNING [train.py:1204] (2/4) Exclude cut with ID 4133-6541-0027-40495-0_sp1.1 from training. Duration: 0.9681875 2023-10-04 13:41:03,387 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_514-211165-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 13:41:04,763 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_41211_210_2544_1_1533348859_3702910_97-212695-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 13:41:09,019 WARNING [train.py:1204] (2/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_406-45086-0 from training. Duration: 0.8 2023-10-04 13:41:09,095 WARNING [train.py:1204] (2/4) Exclude cut with ID _65263_210_22356_1_1535763598691_3921037_49-222917-0_sp1.1 from training. Duration: 0.836375 2023-10-04 13:41:10,827 WARNING [train.py:1204] (2/4) Exclude cut with ID _4866_210_2577_1_1527404412284_4364659_352-157930-0_sp1.1 from training. Duration: 0.736375 2023-10-04 13:41:10,898 WARNING [train.py:1204] (2/4) Exclude cut with ID _4366_210_1388_1_1526792053633_3613819_231-211402-0_sp1.1 from training. Duration: 0.8545625 2023-10-04 13:41:12,253 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_98-21576-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 13:41:13,626 WARNING [train.py:1204] (2/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_348-127789-0_sp0.9 from training. Duration: 0.9444375 2023-10-04 13:41:13,649 WARNING [train.py:1204] (2/4) Exclude cut with ID _45871_210_5665_1_1533864614245_3296060_98-329557-0_sp1.1 from training. Duration: 0.97275 2023-10-04 13:41:13,820 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.bypass_mid.scale_min, batch_count=1674000.0, ans=0.2 2023-10-04 13:41:15,446 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6846_210_6753_1_1527850767_4295158_2-315073-0_sp1.1 from training. Duration: 0.8 2023-10-04 13:41:16,786 WARNING [train.py:1204] (2/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_145-137790-0 from training. Duration: 0.83 2023-10-04 13:41:16,804 WARNING [train.py:1204] (2/4) Exclude cut with ID _14603_210_4220_1_1530615688441_3097319_133-158308-0_sp0.9 from training. Duration: 0.9333125 2023-10-04 13:41:20,963 WARNING [train.py:1204] (2/4) Exclude cut with ID _14898_210_9206_1_1530766812377_4177190_320-11758-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 13:41:24,299 WARNING [train.py:1204] (2/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_406-45086-0_sp0.9 from training. Duration: 0.888875 2023-10-04 13:41:26,392 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.1.encoder.layers.1.nonlin_attention.whiten2, num_groups=1, num_channels=256, metric=7.63 vs. limit=15.0 2023-10-04 13:41:27,639 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=1674066.6666666667, ans=0.1 2023-10-04 13:41:31,791 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_5438_210_1794_1_1527336687_2600009_104-288274-0 from training. Duration: 0.58 2023-10-04 13:41:33,106 WARNING [train.py:1204] (2/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_108-53929-0_sp0.9 from training. Duration: 0.6555625 2023-10-04 13:41:33,168 WARNING [train.py:1204] (2/4) Exclude cut with ID _30800_210_22382_1_1532499347904_4515959_444-286390-0_sp1.1 from training. Duration: 0.963625 2023-10-04 13:41:34,476 WARNING [train.py:1204] (2/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_125-144174-0_sp0.9 from training. Duration: 0.4333125 2023-10-04 13:41:34,554 WARNING [train.py:1204] (2/4) Exclude cut with ID _55209_210_1531_1_1534675828996_4597160_118-345621-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 13:41:37,196 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_45886_210_17024_1_1533709215_471063_7-79165-0_sp1.1 from training. Duration: 0.8909375 2023-10-04 13:41:39,725 INFO [train.py:1046] (2/4) Epoch 48, batch 1450, loss[loss=0.1525, simple_loss=0.2103, pruned_loss=0.04736, over 19488.00 frames. ], tot_loss[loss=0.1517, simple_loss=0.2318, pruned_loss=0.0358, over 4725525.22 frames. ], batch size: 388, lr: 2.12e-03, grad_scale: 16.0 2023-10-04 13:41:41,647 WARNING [train.py:1204] (2/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_175-163894-0_sp0.9 from training. Duration: 0.82225 2023-10-04 13:41:43,134 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_50188_210_4198_1_1534503621_3646041_353-81682-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 13:41:43,140 WARNING [train.py:1204] (2/4) Exclude cut with ID _17875_210_2087_1_1531224012907_1821339_175-80565-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 13:41:43,148 WARNING [train.py:1204] (2/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_159-328157-0_sp1.1 from training. Duration: 0.47275 2023-10-04 13:41:46,559 WARNING [train.py:1204] (2/4) Exclude cut with ID _60057_210_10640_1_1534903065793_3333150_137-267579-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 13:41:46,605 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_92-46009-0_sp1.1 from training. Duration: 0.4909375 2023-10-04 13:41:49,347 WARNING [train.py:1204] (2/4) Exclude cut with ID _53203_210_2087_1_1534246218309_475590_48-68507-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 13:41:49,379 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_28211_210_6388_1_1532861506_4523224_128-328162-0 from training. Duration: 0.9 2023-10-04 13:41:49,600 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.balancer1.prob, batch_count=1674133.3333333333, ans=0.125 2023-10-04 13:41:50,656 WARNING [train.py:1204] (2/4) Exclude cut with ID _4697_210_2069_1_1526815801953_3823098_51-1049-0_sp1.1 from training. Duration: 0.6818125 2023-10-04 13:41:52,040 WARNING [train.py:1204] (2/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_383-316000-0 from training. Duration: 0.9 2023-10-04 13:41:53,826 WARNING [train.py:1204] (2/4) Exclude cut with ID _4779_210_2084_1_1527318022944_3914893_185-295153-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 13:41:55,796 WARNING [train.py:1204] (2/4) Exclude cut with ID _3231_210_2106_1_1526205864934_6784220_171-256004-0_sp0.9 from training. Duration: 0.9555625 2023-10-04 13:41:55,797 WARNING [train.py:1204] (2/4) Exclude cut with ID _41314_210_12651_1_1533459476543_3766460_87-272068-0 from training. Duration: 0.83 2023-10-04 13:41:55,916 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_164-152777-0_sp1.1 from training. Duration: 0.92725 2023-10-04 13:41:57,255 WARNING [train.py:1204] (2/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_18-130336-0_sp0.9 from training. Duration: 0.87775 2023-10-04 13:41:57,298 WARNING [train.py:1204] (2/4) Exclude cut with ID _14886_210_4220_1_1530702020355_3516190_175-229073-0_sp1.1 from training. Duration: 0.4090625 2023-10-04 13:41:57,320 WARNING [train.py:1204] (2/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_791-45149-0_sp0.9 from training. Duration: 0.9555625 2023-10-04 13:41:58,692 WARNING [train.py:1204] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_468-189058-0_sp1.1 from training. Duration: 0.87275 2023-10-04 13:42:00,200 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8278_210_2550_1_1528637099_4220413_32-142780-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 13:42:02,834 WARNING [train.py:1204] (2/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_146-218763-0_sp0.9 from training. Duration: 0.9555625 2023-10-04 13:42:05,690 WARNING [train.py:1204] (2/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_348-127789-0_sp1.1 from training. Duration: 0.77275 2023-10-04 13:42:05,695 WARNING [train.py:1204] (2/4) Exclude cut with ID _47790_210_16187_1_1533862814066_4207519_288-285445-0_sp1.1 from training. Duration: 0.936375 2023-10-04 13:42:08,452 WARNING [train.py:1204] (2/4) Exclude cut with ID _14603_210_4220_1_1530615688441_3097319_308-181732-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 13:42:08,488 WARNING [train.py:1204] (2/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_895-117428-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 13:42:09,867 WARNING [train.py:1204] (2/4) Exclude cut with ID _9235_210_5219_1_1528891154253_8046120_387-159008-0_sp0.9 from training. Duration: 0.9555625 2023-10-04 13:42:09,880 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_37351_210_7925_1_1533012931_3812113_265-85780-0_sp0.9 from training. Duration: 0.97775 2023-10-04 13:42:11,713 WARNING [train.py:1204] (2/4) Exclude cut with ID _40551_210_10635_1_1533776424632_3416179_361-3964-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 13:42:11,755 WARNING [train.py:1204] (2/4) Exclude cut with ID _25882_210_3036_1_1532775665815_6479510_485-122742-0_sp1.1 from training. Duration: 0.97275 2023-10-04 13:42:14,569 WARNING [train.py:1204] (2/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_98-51676-0 from training. Duration: 0.73 2023-10-04 13:42:16,620 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.conv_module2.balancer2.prob, batch_count=1674266.6666666667, ans=0.125 2023-10-04 13:42:16,636 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.conv_module1.balancer2.prob, batch_count=1674266.6666666667, ans=0.125 2023-10-04 13:42:17,815 WARNING [train.py:1204] (2/4) Exclude cut with ID _30548_210_16040_1_1532608097130_3146969_371-280610-0_sp1.1 from training. Duration: 0.92725 2023-10-04 13:42:21,966 WARNING [train.py:1204] (2/4) Exclude cut with ID IC0671W0081-331924 from training. Duration: 0.896 2023-10-04 13:42:23,363 WARNING [train.py:1204] (2/4) Exclude cut with ID _40284_210_2084_1_1533513679814_3704279_380-160775-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 13:42:25,198 WARNING [train.py:1204] (2/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_57-270037-0_sp1.1 from training. Duration: 0.7 2023-10-04 13:42:27,076 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_853-150237-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 13:42:28,462 WARNING [train.py:1204] (2/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_491-261923-0 from training. Duration: 0.98 2023-10-04 13:42:32,656 WARNING [train.py:1204] (2/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_836-244045-0_sp1.1 from training. Duration: 0.97275 2023-10-04 13:42:33,987 WARNING [train.py:1204] (2/4) Exclude cut with ID _14880_210_8895_1_1530680426655_3795259_289-212817-0 from training. Duration: 0.69 2023-10-04 13:42:35,391 WARNING [train.py:1204] (2/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_303-340446-0 from training. Duration: 0.9 2023-10-04 13:42:35,499 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_37152_210_9697_1_1533106573_3844188_418-41736-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 13:42:37,877 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.0.layers.1.feed_forward1.out_whiten, num_groups=1, num_channels=192, metric=4.80 vs. limit=15.0 2023-10-04 13:42:38,248 WARNING [train.py:1204] (2/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_371-18169-0_sp1.1 from training. Duration: 0.7818125 2023-10-04 13:42:38,275 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_187-219984-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 13:42:41,040 WARNING [train.py:1204] (2/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_63-267585-0 from training. Duration: 0.9 2023-10-04 13:42:43,165 WARNING [train.py:1204] (2/4) Exclude cut with ID _27013_210_1389_1_1532437173240_3252649_222-125573-0 from training. Duration: 0.98 2023-10-04 13:42:44,293 WARNING [train.py:1204] (2/4) Exclude cut with ID _14821_210_8750_1_1530680503991_6829359_189-297451-0 from training. Duration: 0.55 2023-10-04 13:42:45,689 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_557-163391-0_sp1.1 from training. Duration: 0.97275 2023-10-04 13:42:45,775 WARNING [train.py:1204] (2/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_65-88242-0_sp1.1 from training. Duration: 0.5090625 2023-10-04 13:42:45,965 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.conv_skip_rate, batch_count=1674400.0, ans=0.0 2023-10-04 13:42:55,122 INFO [train.py:1046] (2/4) Epoch 48, batch 1500, loss[loss=0.1939, simple_loss=0.2593, pruned_loss=0.06426, over 19532.00 frames. ], tot_loss[loss=0.1525, simple_loss=0.2328, pruned_loss=0.03612, over 4719741.29 frames. ], batch size: 388, lr: 2.12e-03, grad_scale: 16.0 2023-10-04 13:42:58,362 WARNING [train.py:1204] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_218-295097-0 from training. Duration: 0.77 2023-10-04 13:42:58,389 WARNING [train.py:1204] (2/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_686-247600-0_sp1.1 from training. Duration: 0.836375 2023-10-04 13:42:58,393 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_397-147196-0_sp0.9 from training. Duration: 0.888875 2023-10-04 13:42:59,769 WARNING [train.py:1204] (2/4) Exclude cut with ID _53950_210_5816_1_1534417264919_3862951_237-136449-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 13:42:59,839 WARNING [train.py:1204] (2/4) Exclude cut with ID _3120_210_3525_1_1526036143112_4072980_74-178400-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 13:43:01,095 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.657e+02 2.007e+02 2.219e+02 2.655e+02 4.541e+02, threshold=4.439e+02, percent-clipped=0.0 2023-10-04 13:43:01,224 WARNING [train.py:1204] (2/4) Exclude cut with ID _5166_210_2084_1_1527073197160_4087993_309-222545-0_sp0.9 from training. Duration: 0.9333125 2023-10-04 13:43:01,306 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7544_210_6753_1_1528587722_3576686_172-171750-0 from training. Duration: 0.65 2023-10-04 13:43:02,712 WARNING [train.py:1204] (2/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_3-237434-0_sp0.9 from training. Duration: 0.8444375 2023-10-04 13:43:02,759 WARNING [train.py:1204] (2/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_101-293367-0_sp0.9 from training. Duration: 0.7 2023-10-04 13:43:02,765 WARNING [train.py:1204] (2/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_818-213552-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 13:43:04,139 WARNING [train.py:1204] (2/4) Exclude cut with ID _14349_210_8341_1_1530615710818_3442470_115-285975-0_sp1.1 from training. Duration: 0.7818125 2023-10-04 13:43:05,564 WARNING [train.py:1204] (2/4) Exclude cut with ID _30800_210_22382_1_1532499347904_4515959_291-309749-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 13:43:05,652 WARNING [train.py:1204] (2/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_715-3462-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 13:43:08,572 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.bypass.scale_min, batch_count=1674533.3333333333, ans=0.2 2023-10-04 13:43:11,844 WARNING [train.py:1204] (2/4) Exclude cut with ID _59502_210_12210_1_1535107024698_1835139_13-331827-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 13:43:11,863 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_4528_210_3273_1_1527076654_3276777_342-168068-0 from training. Duration: 0.87 2023-10-04 13:43:13,041 WARNING [train.py:1204] (2/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_215-141059-0_sp0.9 from training. Duration: 0.688875 2023-10-04 13:43:13,071 WARNING [train.py:1204] (2/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_106-286130-0_sp1.1 from training. Duration: 0.8909375 2023-10-04 13:43:14,427 WARNING [train.py:1204] (2/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_480-77655-0_sp1.1 from training. Duration: 0.97275 2023-10-04 13:43:17,063 WARNING [train.py:1204] (2/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_848-280957-0 from training. Duration: 0.87 2023-10-04 13:43:18,732 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.bypass.scale_min, batch_count=1674533.3333333333, ans=0.2 2023-10-04 13:43:21,837 WARNING [train.py:1204] (2/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_122-109023-0 from training. Duration: 0.76 2023-10-04 13:43:23,837 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_11305_210_1794_1_1529750428_7178148_645-233480-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 13:43:25,147 WARNING [train.py:1204] (2/4) Exclude cut with ID _7562_210_2084_1_1528887767916_3946229_345-169156-0 from training. Duration: 0.58 2023-10-04 13:43:26,582 WARNING [train.py:1204] (2/4) Exclude cut with ID _7562_210_2084_1_1528887767916_3946229_345-169156-0_sp1.1 from training. Duration: 0.52725 2023-10-04 13:43:28,625 WARNING [train.py:1204] (2/4) Exclude cut with ID _13253_210_1925_1_1530757775819_3577873_253-246386-0_sp1.1 from training. Duration: 0.8181875 2023-10-04 13:43:28,691 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_136-264444-0_sp1.1 from training. Duration: 0.97275 2023-10-04 13:43:30,098 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_2168_210_2290_1_1525606920_1849400_136-222991-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 13:43:30,213 WARNING [train.py:1204] (2/4) Exclude cut with ID _7852_210_5219_1_1528286471316_8139400_242-105336-0 from training. Duration: 0.72 2023-10-04 13:43:31,543 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_5960_210_1794_1_1527403865_3990999_515-2613-0_sp1.1 from training. Duration: 0.8 2023-10-04 13:43:31,565 WARNING [train.py:1204] (2/4) Exclude cut with ID _39688_210_19596_1_1533371626725_4154159_551-167834-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 13:43:32,178 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.3.encoder.layers.1.conv_module2.whiten, num_groups=1, num_channels=512, metric=4.72 vs. limit=15.0 2023-10-04 13:43:32,914 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_43802_210_2290_1_1533516203_8829504_85-195240-0 from training. Duration: 0.87 2023-10-04 13:43:33,005 WARNING [train.py:1204] (2/4) Exclude cut with ID _63171_210_15488_1_1535104792794_3295110_113-113534-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 13:43:38,381 WARNING [train.py:1204] (2/4) Exclude cut with ID _8833_210_5219_1_1528592665210_7553059_319-349181-0_sp1.1 from training. Duration: 0.9 2023-10-04 13:43:38,382 WARNING [train.py:1204] (2/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_225-43381-0 from training. Duration: 0.94 2023-10-04 13:43:43,397 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_5959_210_1794_1_1527400400_3445726_412-44052-0_sp0.9 from training. Duration: 0.8555625 2023-10-04 13:43:44,754 WARNING [train.py:1204] (2/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_356-167816-0_sp1.1 from training. Duration: 0.6545625 2023-10-04 13:43:47,526 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9012W0112-501244 from training. Duration: 0.768 2023-10-04 13:43:48,878 WARNING [train.py:1204] (2/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_177-326952-0_sp1.1 from training. Duration: 0.92725 2023-10-04 13:43:48,882 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9016W0116-502789 from training. Duration: 0.896 2023-10-04 13:43:50,846 WARNING [train.py:1204] (2/4) Exclude cut with ID _56235_210_10640_1_1534557335235_4161099_557-204815-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 13:43:50,939 WARNING [train.py:1204] (2/4) Exclude cut with ID _55857_210_2346_1_1534644007831_6898209_279-229469-0_sp1.1 from training. Duration: 0.936375 2023-10-04 13:43:52,704 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9029W0375-509764 from training. Duration: 0.896 2023-10-04 13:43:54,072 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_2295_210_2087_1_1525500993_2407863_234-83104-0_sp0.9 from training. Duration: 0.688875 2023-10-04 13:43:55,668 WARNING [train.py:1204] (2/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_266-137104-0 from training. Duration: 0.65 2023-10-04 13:43:58,363 WARNING [train.py:1204] (2/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_289-163927-0_sp1.1 from training. Duration: 0.92725 2023-10-04 13:44:02,780 WARNING [train.py:1204] (2/4) Exclude cut with ID _12733_210_8341_1_1530167424556_3646380_86-225314-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 13:44:02,796 WARNING [train.py:1204] (2/4) Exclude cut with ID _7877_210_6014_1_1528286358099_3750136_618-284064-0_sp1.1 from training. Duration: 0.92725 2023-10-04 13:44:02,842 WARNING [train.py:1204] (2/4) Exclude cut with ID _55844_210_2346_1_1534471212569_7625707_1017-31469-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 13:44:02,874 WARNING [train.py:1204] (2/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_617-241446-0_sp1.1 from training. Duration: 0.92725 2023-10-04 13:44:04,174 WARNING [train.py:1204] (2/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_866-173440-0_sp1.1 from training. Duration: 0.8181875 2023-10-04 13:44:06,974 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_9460_210_4211_1_1528954587_10049667_480-260269-0 from training. Duration: 0.56 2023-10-04 13:44:07,005 WARNING [train.py:1204] (2/4) Exclude cut with ID _44659_210_2721_1_1534036120061_6614860_626-298884-0 from training. Duration: 0.97 2023-10-04 13:44:07,052 WARNING [train.py:1204] (2/4) Exclude cut with ID _53565_210_10635_1_1534337218335_3820259_287-18120-0_sp1.1 from training. Duration: 0.8818125 2023-10-04 13:44:08,454 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_791-197891-0 from training. Duration: 0.84 2023-10-04 13:44:08,500 WARNING [train.py:1204] (2/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_527-238990-0 from training. Duration: 0.9 2023-10-04 13:44:09,769 INFO [train.py:1046] (2/4) Epoch 48, batch 1550, loss[loss=0.1386, simple_loss=0.2251, pruned_loss=0.02607, over 24275.00 frames. ], tot_loss[loss=0.1531, simple_loss=0.2338, pruned_loss=0.03621, over 4720078.85 frames. ], batch size: 61, lr: 2.12e-03, grad_scale: 16.0 2023-10-04 13:44:11,350 WARNING [train.py:1204] (2/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_449-65969-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 13:44:12,715 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_553-341452-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 13:44:12,763 WARNING [train.py:1204] (2/4) Exclude cut with ID _16909_210_5724_1_1531292079912_4118919_300-47574-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 13:44:12,775 WARNING [train.py:1204] (2/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_70-27732-0_sp1.1 from training. Duration: 0.8545625 2023-10-04 13:44:14,146 WARNING [train.py:1204] (2/4) Exclude cut with ID _44718_210_5816_1_1534050051869_6469030_727-28074-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 13:44:15,375 WARNING [train.py:1204] (2/4) Exclude cut with ID _55857_210_2346_1_1534644007831_6898209_357-123664-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 13:44:18,751 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9123W0004-555964 from training. Duration: 0.896 2023-10-04 13:44:20,077 WARNING [train.py:1204] (2/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_445-145208-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 13:44:20,102 WARNING [train.py:1204] (2/4) Exclude cut with ID _3675_210_4557_1_1526639934408_4182089_456-18305-0_sp1.1 from training. Duration: 0.6909375 2023-10-04 13:44:20,190 WARNING [train.py:1204] (2/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_1-99680-0_sp1.1 from training. Duration: 0.6090625 2023-10-04 13:44:23,490 WARNING [train.py:1204] (2/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_146-282400-0_sp0.9 from training. Duration: 0.788875 2023-10-04 13:44:23,509 WARNING [train.py:1204] (2/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_844-96989-0 from training. Duration: 0.71 2023-10-04 13:44:24,874 WARNING [train.py:1204] (2/4) Exclude cut with ID _29853_210_7497_1_1532505600851_6642110_128-146364-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 13:44:26,435 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_224-322394-0 from training. Duration: 0.66 2023-10-04 13:44:29,011 WARNING [train.py:1204] (2/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_191-59784-0 from training. Duration: 0.88 2023-10-04 13:44:29,017 WARNING [train.py:1204] (2/4) Exclude cut with ID _12556_210_8341_1_1530081024100_7327690_725-193477-0 from training. Duration: 0.98 2023-10-04 13:44:29,054 WARNING [train.py:1204] (2/4) Exclude cut with ID _5166_210_2084_1_1527073197160_4087993_523-79838-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 13:44:29,144 WARNING [train.py:1204] (2/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_398-334558-0_sp1.1 from training. Duration: 0.87275 2023-10-04 13:44:33,406 WARNING [train.py:1204] (2/4) Exclude cut with ID _6456_210_1534_1_1527681634212_3181049_171-36697-0_sp1.1 from training. Duration: 0.936375 2023-10-04 13:44:35,446 WARNING [train.py:1204] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_700-183549-0 from training. Duration: 0.68 2023-10-04 13:44:35,449 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8853_210_3273_1_1528633899_818968_51-187685-0 from training. Duration: 0.69 2023-10-04 13:44:43,578 WARNING [train.py:1204] (2/4) Exclude cut with ID _7719_210_3525_1_1528456153163_4296960_333-110399-0_sp1.1 from training. Duration: 0.87275 2023-10-04 13:44:45,433 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.4.encoder.layers.1.feed_forward3.out_whiten, num_groups=1, num_channels=384, metric=12.75 vs. limit=15.0 2023-10-04 13:44:47,556 WARNING [train.py:1204] (2/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_1022-83740-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 13:44:47,582 WARNING [train.py:1204] (2/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_61-327062-0_sp0.9 from training. Duration: 0.9 2023-10-04 13:44:47,595 WARNING [train.py:1204] (2/4) Exclude cut with ID _47251_210_18628_1_1533814063128_4137250_214-88076-0_sp1.1 from training. Duration: 0.8 2023-10-04 13:44:48,920 WARNING [train.py:1204] (2/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_839-173017-0 from training. Duration: 0.41 2023-10-04 13:44:54,023 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.conv_module1.balancer2.prob, batch_count=1675000.0, ans=0.125 2023-10-04 13:44:55,741 WARNING [train.py:1204] (2/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_299-34413-0_sp1.1 from training. Duration: 0.5909375 2023-10-04 13:44:57,223 WARNING [train.py:1204] (2/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_664-159457-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 13:45:00,519 WARNING [train.py:1204] (2/4) Exclude cut with ID _66873_210_5517_1_1535454136506_4293370_483-291760-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 13:45:01,955 WARNING [train.py:1204] (2/4) Exclude cut with ID _30800_210_22382_1_1532499347904_4515959_287-255474-0_sp1.1 from training. Duration: 0.92725 2023-10-04 13:45:03,293 WARNING [train.py:1204] (2/4) Exclude cut with ID _48677_210_5663_1_1533902405919_3892989_111-11439-0_sp1.1 from training. Duration: 0.87275 2023-10-04 13:45:03,300 WARNING [train.py:1204] (2/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_399-156791-0 from training. Duration: 0.83 2023-10-04 13:45:03,338 WARNING [train.py:1204] (2/4) Exclude cut with ID _3992_210_1747_1_1526644816031_3731060_36-147855-0_sp0.9 from training. Duration: 0.8666875 2023-10-04 13:45:04,737 WARNING [train.py:1204] (2/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_442-320317-0_sp1.1 from training. Duration: 0.7454375 2023-10-04 13:45:06,046 WARNING [train.py:1204] (2/4) Exclude cut with ID _55429_210_10626_1_1534467588527_7174143_953-80868-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 13:45:06,227 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.feed_forward2.hidden_balancer.prob, batch_count=1675000.0, ans=0.125 2023-10-04 13:45:07,359 WARNING [train.py:1204] (2/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_159-328157-0_sp0.9 from training. Duration: 0.57775 2023-10-04 13:45:07,367 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9309W0197-640324 from training. Duration: 0.896 2023-10-04 13:45:09,462 WARNING [train.py:1204] (2/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_361-30822-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 13:45:09,726 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.conv_module2.balancer1.prob, batch_count=1675066.6666666667, ans=0.125 2023-10-04 13:45:13,773 WARNING [train.py:1204] (2/4) Exclude cut with ID _5250_210_5219_1_1527249561806_8692110_636-251213-0 from training. Duration: 0.94 2023-10-04 13:45:18,051 WARNING [train.py:1204] (2/4) Exclude cut with ID _49728_210_12651_1_1534041034294_6788150_468-333827-0_sp1.1 from training. Duration: 0.963625 2023-10-04 13:45:20,740 WARNING [train.py:1204] (2/4) Exclude cut with ID _25882_210_3036_1_1532775665815_6479510_623-191759-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 13:45:20,796 WARNING [train.py:1204] (2/4) Exclude cut with ID _5189_210_4557_1_1527248319428_3769229_199-53526-0 from training. Duration: 0.42 2023-10-04 13:45:22,202 WARNING [train.py:1204] (2/4) Exclude cut with ID _6274_210_2084_1_1527937583837_7494020_32-205288-0_sp0.9 from training. Duration: 0.8666875 2023-10-04 13:45:22,291 WARNING [train.py:1204] (2/4) Exclude cut with ID _65263_210_22356_1_1535763598691_3921037_401-346646-0_sp1.1 from training. Duration: 0.963625 2023-10-04 13:45:22,295 WARNING [train.py:1204] (2/4) Exclude cut with ID _12555_210_8750_1_1530075594402_3780239_449-267986-0_sp1.1 from training. Duration: 0.7545625 2023-10-04 13:45:22,310 WARNING [train.py:1204] (2/4) Exclude cut with ID _69884_210_9771_1_1535846392818_4166169_430-201020-0_sp1.1 from training. Duration: 0.8818125 2023-10-04 13:45:22,487 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.ff3_skip_rate, batch_count=1675133.3333333333, ans=0.0 2023-10-04 13:45:23,601 INFO [train.py:1046] (2/4) Epoch 48, batch 1600, loss[loss=0.168, simple_loss=0.2459, pruned_loss=0.04505, over 23971.00 frames. ], tot_loss[loss=0.1528, simple_loss=0.2339, pruned_loss=0.03583, over 4732074.81 frames. ], batch size: 86, lr: 2.12e-03, grad_scale: 32.0 2023-10-04 13:45:23,680 WARNING [train.py:1204] (2/4) Exclude cut with ID _8394_210_6702_1_1528590504316_3540189_379-49457-0_sp1.1 from training. Duration: 0.7909375 2023-10-04 13:45:26,650 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.0.layers.0.whiten, num_groups=1, num_channels=192, metric=4.44 vs. limit=12.0 2023-10-04 13:45:27,530 WARNING [train.py:1204] (2/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_200-105218-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 13:45:27,590 WARNING [train.py:1204] (2/4) Exclude cut with ID _3867_210_1883_1_1526641200575_4475830_319-349850-0 from training. Duration: 0.67 2023-10-04 13:45:28,933 WARNING [train.py:1204] (2/4) Exclude cut with ID _28317_210_12742_1_1532480302381_8510325_916-195848-0 from training. Duration: 0.92 2023-10-04 13:45:29,066 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.conv_skip_rate, batch_count=1675133.3333333333, ans=0.0 2023-10-04 13:45:30,109 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.777e+02 2.052e+02 2.355e+02 2.599e+02 3.468e+02, threshold=4.711e+02, percent-clipped=0.0 2023-10-04 13:45:30,289 WARNING [train.py:1204] (2/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_436-186311-0 from training. Duration: 0.65 2023-10-04 13:45:30,508 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.balancer2.prob, batch_count=1675133.3333333333, ans=0.125 2023-10-04 13:45:30,964 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.feed_forward1.out_whiten.whitening_limit, batch_count=1675133.3333333333, ans=15.0 2023-10-04 13:45:32,934 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_3910_210_1794_1_1526777667_3747260_302-180617-0_sp1.1 from training. Duration: 0.97275 2023-10-04 13:45:34,384 WARNING [train.py:1204] (2/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_102-92751-0 from training. Duration: 0.99 2023-10-04 13:45:35,731 WARNING [train.py:1204] (2/4) Exclude cut with ID _51899_210_20034_1_1534129254466_3669220_265-215428-0_sp1.1 from training. Duration: 0.8909375 2023-10-04 13:45:37,692 WARNING [train.py:1204] (2/4) Exclude cut with ID _58583_210_5236_1_1534726835266_3574249_438-198993-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 13:45:41,843 WARNING [train.py:1204] (2/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_166-166990-0_sp1.1 from training. Duration: 0.87275 2023-10-04 13:45:46,024 WARNING [train.py:1204] (2/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_907-151398-0 from training. Duration: 0.37 2023-10-04 13:45:48,814 WARNING [train.py:1204] (2/4) Exclude cut with ID _38670_210_15379_1_1533448810659_4391059_452-256669-0_sp1.1 from training. Duration: 0.936375 2023-10-04 13:45:48,861 WARNING [train.py:1204] (2/4) Exclude cut with ID _48677_210_5663_1_1533902405919_3892989_408-40740-0 from training. Duration: 0.83 2023-10-04 13:45:48,904 WARNING [train.py:1204] (2/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_903-158756-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 13:45:50,286 WARNING [train.py:1204] (2/4) Exclude cut with ID _13252_210_1925_1_1530671372047_3336909_397-216497-0 from training. Duration: 0.96 2023-10-04 13:45:56,330 WARNING [train.py:1204] (2/4) Exclude cut with ID _55429_210_10626_1_1534467588527_7174143_97-335084-0 from training. Duration: 0.97 2023-10-04 13:46:04,456 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.nonlin_attention.whiten2.whitening_limit, batch_count=1675266.6666666667, ans=15.0 2023-10-04 13:46:05,116 WARNING [train.py:1204] (2/4) Exclude cut with ID _45329_210_7005_1_1534157951211_6774339_453-267730-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 13:46:05,201 WARNING [train.py:1204] (2/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_153-97441-0 from training. Duration: 0.56 2023-10-04 13:46:06,513 WARNING [train.py:1204] (2/4) Exclude cut with ID _40901_210_5665_1_1533363789958_3856130_312-329122-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 13:46:06,559 WARNING [train.py:1204] (2/4) Exclude cut with ID _30800_210_22382_1_1532499347904_4515959_97-126471-0_sp1.1 from training. Duration: 0.97275 2023-10-04 13:46:06,559 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_11305_210_1794_1_1529750428_7178148_257-312870-0_sp1.1 from training. Duration: 0.8 2023-10-04 13:46:09,713 WARNING [train.py:1204] (2/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_454-314901-0_sp0.9 from training. Duration: 0.511125 2023-10-04 13:46:13,670 WARNING [train.py:1204] (2/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_282-136641-0_sp1.1 from training. Duration: 0.3818125 2023-10-04 13:46:15,147 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.feed_forward1.hidden_balancer.prob, batch_count=1675333.3333333333, ans=0.125 2023-10-04 13:46:16,326 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_46013_210_7234_1_1533776362_3744032_493-160724-0_sp1.1 from training. Duration: 0.963625 2023-10-04 13:46:17,690 WARNING [train.py:1204] (2/4) Exclude cut with ID _67831_210_12392_1_1535612464202_3092612_259-33442-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 13:46:17,756 WARNING [train.py:1204] (2/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_289-62384-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 13:46:17,816 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6545_210_2040_1_1527774670_4245258_379-14182-0_sp0.9 from training. Duration: 0.888875 2023-10-04 13:46:20,440 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_548-294316-0_sp1.1 from training. Duration: 0.736375 2023-10-04 13:46:20,535 WARNING [train.py:1204] (2/4) Exclude cut with ID _6275_210_2084_1_1528110062213_7669049_857-298942-0_sp1.1 from training. Duration: 0.77275 2023-10-04 13:46:21,916 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6673_210_3273_1_1527767790_3173240_327-306185-0_sp1.1 from training. Duration: 0.7545625 2023-10-04 13:46:29,695 WARNING [train.py:1204] (2/4) Exclude cut with ID _61008_210_10633_1_1534852769198_6974370_814-107647-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 13:46:29,779 WARNING [train.py:1204] (2/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_116-332008-0_sp1.1 from training. Duration: 0.8909375 2023-10-04 13:46:31,216 WARNING [train.py:1204] (2/4) Exclude cut with ID _32704_210_8954_1_1533017488840_6747370_266-329755-0 from training. Duration: 0.89 2023-10-04 13:46:31,219 WARNING [train.py:1204] (2/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_271-269084-0_sp0.9 from training. Duration: 0.92225 2023-10-04 13:46:33,187 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_3171_210_2087_1_1526109084_2330007_66-260583-0 from training. Duration: 0.72 2023-10-04 13:46:33,447 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=1675400.0, ans=0.1 2023-10-04 13:46:36,116 WARNING [train.py:1204] (2/4) Exclude cut with ID _5250_210_5219_1_1527249561806_8692110_1326-276386-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 13:46:37,337 INFO [train.py:1046] (2/4) Epoch 48, batch 1650, loss[loss=0.1593, simple_loss=0.2282, pruned_loss=0.04523, over 23445.00 frames. ], tot_loss[loss=0.1532, simple_loss=0.2344, pruned_loss=0.03604, over 4727845.23 frames. ], batch size: 285, lr: 2.12e-03, grad_scale: 32.0 2023-10-04 13:46:38,831 WARNING [train.py:1204] (2/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_372-30302-0_sp1.1 from training. Duration: 0.7818125 2023-10-04 13:46:38,904 WARNING [train.py:1204] (2/4) Exclude cut with ID _25882_210_3036_1_1532775665815_6479510_631-257029-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 13:46:40,121 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_3691_210_3528_1_1526468169_4062053_355-334432-0 from training. Duration: 0.87 2023-10-04 13:46:40,137 WARNING [train.py:1204] (2/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_839-173017-0_sp1.1 from training. Duration: 0.37275 2023-10-04 13:46:40,151 WARNING [train.py:1204] (2/4) Exclude cut with ID _14998_210_5219_1_1530705582080_7474970_441-91466-0 from training. Duration: 0.97 2023-10-04 13:46:41,524 WARNING [train.py:1204] (2/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_108-53929-0 from training. Duration: 0.59 2023-10-04 13:46:44,816 WARNING [train.py:1204] (2/4) Exclude cut with ID _9389_210_1664_1_1529217337301_5289269_260-1942-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 13:46:44,895 WARNING [train.py:1204] (2/4) Exclude cut with ID _56423_210_5854_1_1534896061049_3165969_198-61547-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 13:46:46,226 WARNING [train.py:1204] (2/4) Exclude cut with ID _44778_210_2054_1_1533798127203_7443090_412-142122-0_sp1.1 from training. Duration: 0.92725 2023-10-04 13:46:46,244 WARNING [train.py:1204] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_88-341718-0_sp1.1 from training. Duration: 0.6 2023-10-04 13:46:47,760 WARNING [train.py:1204] (2/4) Exclude cut with ID _61079_210_4525_1_1534921008198_4011419_334-298697-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 13:46:49,187 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_5465_210_4211_1_1527227253_55357_8-2499-0_sp1.1 from training. Duration: 0.996375 2023-10-04 13:46:50,694 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.feed_forward1.hidden_balancer.prob, batch_count=1675533.3333333333, ans=0.125 2023-10-04 13:46:50,934 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.5.encoder.layers.1.self_attn2.whiten, num_groups=1, num_channels=256, metric=12.90 vs. limit=22.5 2023-10-04 13:46:51,874 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_447-201565-0_sp1.1 from training. Duration: 0.97275 2023-10-04 13:46:51,879 WARNING [train.py:1204] (2/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_253-161789-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 13:46:51,882 WARNING [train.py:1204] (2/4) Exclude cut with ID _44304_210_15374_1_1533639352765_4613129_302-22084-0_sp0.9 from training. Duration: 0.9555625 2023-10-04 13:46:51,891 WARNING [train.py:1204] (2/4) Exclude cut with ID _26664_210_7770_1_1532258943280_3418270_187-80815-0_sp1.1 from training. Duration: 0.8454375 2023-10-04 13:46:51,967 WARNING [train.py:1204] (2/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_68-212801-0 from training. Duration: 0.73 2023-10-04 13:46:52,003 WARNING [train.py:1204] (2/4) Exclude cut with ID _29345_210_4381_1_1532689168263_3860169_149-257268-0 from training. Duration: 0.81 2023-10-04 13:46:55,410 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.bypass.scale_min, batch_count=1675533.3333333333, ans=0.2 2023-10-04 13:46:58,590 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_213-103282-0_sp1.1 from training. Duration: 0.7090625 2023-10-04 13:46:59,963 WARNING [train.py:1204] (2/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_156-302675-0_sp1.1 from training. Duration: 0.5 2023-10-04 13:47:04,957 INFO [scaling.py:1118] (2/4) WithLoss: name=encoder.encoders.4.encoder.layers.0.self_attn_weights, loss-sum=0.000e+00 2023-10-04 13:47:08,772 WARNING [train.py:1204] (2/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_125-322172-0 from training. Duration: 0.93 2023-10-04 13:47:08,825 WARNING [train.py:1204] (2/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_493-60474-0_sp1.1 from training. Duration: 0.963625 2023-10-04 13:47:09,054 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.feed_forward1.out_proj.dropout_p, batch_count=1675600.0, ans=0.1 2023-10-04 13:47:11,535 WARNING [train.py:1204] (2/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_156-302675-0 from training. Duration: 0.55 2023-10-04 13:47:16,148 WARNING [train.py:1204] (2/4) Exclude cut with ID _41314_210_12651_1_1533459476543_3766460_123-267862-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 13:47:19,015 WARNING [train.py:1204] (2/4) Exclude cut with ID _28427_210_4220_1_1532581240967_3061069_201-143747-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 13:47:19,048 WARNING [train.py:1204] (2/4) Exclude cut with ID _8772_210_1850_1_1528635595972_3664011_96-12756-0_sp1.1 from training. Duration: 0.8909375 2023-10-04 13:47:20,389 WARNING [train.py:1204] (2/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_1347-145675-0_sp1.1 from training. Duration: 0.936375 2023-10-04 13:47:20,680 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.feed_forward1.hidden_balancer.prob, batch_count=1675666.6666666667, ans=0.125 2023-10-04 13:47:23,083 WARNING [train.py:1204] (2/4) Exclude cut with ID _9235_210_5219_1_1528891154253_8046120_387-159008-0_sp1.1 from training. Duration: 0.7818125 2023-10-04 13:47:23,100 WARNING [train.py:1204] (2/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_400-201896-0_sp1.1 from training. Duration: 0.963625 2023-10-04 13:47:24,547 WARNING [train.py:1204] (2/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_326-174612-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 13:47:24,587 WARNING [train.py:1204] (2/4) Exclude cut with ID _55844_210_2346_1_1534471212569_7625707_390-116233-0_sp1.1 from training. Duration: 0.963625 2023-10-04 13:47:26,352 WARNING [train.py:1204] (2/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_419-317062-0_sp1.1 from training. Duration: 0.82725 2023-10-04 13:47:26,396 WARNING [train.py:1204] (2/4) Exclude cut with ID _29877_210_6228_1_1532397791106_5276149_266-62291-0_sp0.9 from training. Duration: 0.9666875 2023-10-04 13:47:26,466 WARNING [train.py:1204] (2/4) Exclude cut with ID _7857_210_1534_1_1528286515745_6831470_431-338528-0_sp1.1 from training. Duration: 0.92725 2023-10-04 13:47:28,504 WARNING [train.py:1204] (2/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_87-131427-0_sp0.9 from training. Duration: 0.8666875 2023-10-04 13:47:32,362 WARNING [train.py:1204] (2/4) Exclude cut with ID _24055_210_12651_1_1532310907143_976979_92-59356-0_sp1.1 from training. Duration: 0.82725 2023-10-04 13:47:32,450 WARNING [train.py:1204] (2/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_96-290039-0 from training. Duration: 0.56 2023-10-04 13:47:34,358 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.conv_module2.balancer2.prob, batch_count=1675666.6666666667, ans=0.125 2023-10-04 13:47:35,545 WARNING [train.py:1204] (2/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_48-182155-0_sp0.9 from training. Duration: 0.9666875 2023-10-04 13:47:35,583 WARNING [train.py:1204] (2/4) Exclude cut with ID _4626_210_3532_1_1526817652717_3916859_446-206656-0 from training. Duration: 0.76 2023-10-04 13:47:36,938 WARNING [train.py:1204] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_168-32906-0 from training. Duration: 0.69 2023-10-04 13:47:36,961 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_3780_210_2087_1_1526467760_2130483_170-259044-0 from training. Duration: 0.7 2023-10-04 13:47:36,980 WARNING [train.py:1204] (2/4) Exclude cut with ID _65134_210_14763_1_1535335393985_3505368_153-216718-0_sp1.1 from training. Duration: 0.92725 2023-10-04 13:47:38,322 WARNING [train.py:1204] (2/4) Exclude cut with ID _64639_210_5816_1_1535367619437_3528000_461-329156-0_sp1.1 from training. Duration: 0.87275 2023-10-04 13:47:38,360 WARNING [train.py:1204] (2/4) Exclude cut with ID _21989_210_5854_1_1531899214931_3271360_126-347531-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 13:47:39,819 WARNING [train.py:1204] (2/4) Exclude cut with ID _7614_210_4915_1_1529326764092_4537359_96-162197-0_sp1.1 from training. Duration: 0.963625 2023-10-04 13:47:39,820 WARNING [train.py:1204] (2/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_1083-147536-0 from training. Duration: 0.97 2023-10-04 13:47:43,951 WARNING [train.py:1204] (2/4) Exclude cut with ID _64029_210_4381_1_1535178502772_3756459_275-16710-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 13:47:44,224 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.feed_forward1.hidden_balancer.prob, batch_count=1675733.3333333333, ans=0.125 2023-10-04 13:47:45,403 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7264_210_4938_1_1527939911_8132095_800-91995-0_sp0.9 from training. Duration: 0.9555625 2023-10-04 13:47:45,446 WARNING [train.py:1204] (2/4) Exclude cut with ID _3844_210_1390_1_1526727803582_2871209_50-193879-0_sp1.1 from training. Duration: 0.936375 2023-10-04 13:47:45,642 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.conv_module1.balancer2.prob, batch_count=1675733.3333333333, ans=0.125 2023-10-04 13:47:48,706 WARNING [train.py:1204] (2/4) Exclude cut with ID _46952_210_5986_1_1534230010357_3107379_232-344503-0 from training. Duration: 0.96 2023-10-04 13:47:51,544 INFO [train.py:1046] (2/4) Epoch 48, batch 1700, loss[loss=0.1332, simple_loss=0.1961, pruned_loss=0.03519, over 22766.00 frames. ], tot_loss[loss=0.1529, simple_loss=0.2338, pruned_loss=0.03595, over 4722188.82 frames. ], batch size: 322, lr: 2.12e-03, grad_scale: 16.0 2023-10-04 13:47:52,933 WARNING [train.py:1204] (2/4) Exclude cut with ID _25808_210_13292_1_1532343514562_7366109_206-14076-0_sp1.1 from training. Duration: 0.936375 2023-10-04 13:47:52,934 WARNING [train.py:1204] (2/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_55-31904-0_sp0.9 from training. Duration: 0.97775 2023-10-04 13:47:52,958 WARNING [train.py:1204] (2/4) Exclude cut with ID _14632_210_9988_1_1530691279392_3923200_161-129530-0 from training. Duration: 0.96 2023-10-04 13:47:54,359 WARNING [train.py:1204] (2/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_770-200430-0_sp1.1 from training. Duration: 0.8090625 2023-10-04 13:47:54,368 WARNING [train.py:1204] (2/4) Exclude cut with ID _6270_210_2084_1_1527850911573_3856420_26-277343-0_sp1.1 from training. Duration: 0.7454375 2023-10-04 13:47:54,369 WARNING [train.py:1204] (2/4) Exclude cut with ID _69884_210_9771_1_1535846392818_4166169_88-23776-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 13:47:57,181 WARNING [train.py:1204] (2/4) Exclude cut with ID _60860_210_8254_1_1534896015875_3557169_245-256666-0_sp1.1 from training. Duration: 0.8818125 2023-10-04 13:47:57,225 WARNING [train.py:1204] (2/4) Exclude cut with ID _5250_210_5219_1_1527249561806_8692110_636-251213-0_sp1.1 from training. Duration: 0.8545625 2023-10-04 13:47:58,925 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.807e+02 2.221e+02 2.607e+02 3.070e+02 5.494e+02, threshold=5.214e+02, percent-clipped=5.0 2023-10-04 13:47:58,993 WARNING [train.py:1204] (2/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_540-131164-0 from training. Duration: 0.92 2023-10-04 13:48:02,198 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_5931_210_2040_1_1527428481_4547999_321-193614-0_sp0.9 from training. Duration: 0.7444375 2023-10-04 13:48:02,455 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.conv_module1.balancer1.max_abs, batch_count=1675800.0, ans=10.0 2023-10-04 13:48:09,603 WARNING [train.py:1204] (2/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_373-225403-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 13:48:12,437 WARNING [train.py:1204] (2/4) Exclude cut with ID _20622_210_3852_1_1531622427080_1093839_57-326027-0_sp1.1 from training. Duration: 0.97275 2023-10-04 13:48:16,734 WARNING [train.py:1204] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_605-257205-0_sp1.1 from training. Duration: 0.536375 2023-10-04 13:48:18,566 WARNING [train.py:1204] (2/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_442-320317-0_sp0.9 from training. Duration: 0.911125 2023-10-04 13:48:18,626 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_35361_210_4238_1_1533262810_7793476_675-87012-0_sp1.1 from training. Duration: 0.8090625 2023-10-04 13:48:18,653 WARNING [train.py:1204] (2/4) Exclude cut with ID _4184_210_2550_1_1526730192013_2219380_306-44736-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 13:48:21,249 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_124-113562-0 from training. Duration: 0.94 2023-10-04 13:48:22,605 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_2821_210_2715_1_1526108385_3763560_73-296673-0_sp0.9 from training. Duration: 0.92225 2023-10-04 13:48:23,953 WARNING [train.py:1204] (2/4) Exclude cut with ID _10301_210_5886_1_1529222275332_3910620_216-46959-0_sp1.1 from training. Duration: 0.92725 2023-10-04 13:48:24,082 WARNING [train.py:1204] (2/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_91-277684-0_sp0.9 from training. Duration: 0.82225 2023-10-04 13:48:24,274 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.bypass.skip_rate, batch_count=1675933.3333333333, ans=0.07 2023-10-04 13:48:25,501 WARNING [train.py:1204] (2/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_351-294650-0_sp0.9 from training. Duration: 0.7 2023-10-04 13:48:28,641 WARNING [train.py:1204] (2/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_209-205365-0 from training. Duration: 0.79 2023-10-04 13:48:28,692 WARNING [train.py:1204] (2/4) Exclude cut with ID _14603_210_4220_1_1530615688441_3097319_97-325053-0 from training. Duration: 0.92 2023-10-04 13:48:29,574 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.1.encoder.layers.1.nonlin_attention.whiten2, num_groups=1, num_channels=256, metric=7.64 vs. limit=15.0 2023-10-04 13:48:30,068 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_49348_210_7927_1_1533954500_3691300_109-276430-0_sp1.1 from training. Duration: 0.92725 2023-10-04 13:48:31,904 WARNING [train.py:1204] (2/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_912-47473-0 from training. Duration: 0.68 2023-10-04 13:48:31,995 WARNING [train.py:1204] (2/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_96-320736-0_sp0.9 from training. Duration: 0.9555625 2023-10-04 13:48:39,308 WARNING [train.py:1204] (2/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_1252-235481-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 13:48:40,662 WARNING [train.py:1204] (2/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_813-261554-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 13:48:41,916 WARNING [train.py:1204] (2/4) Exclude cut with ID _21632_210_2253_1_1531994001765_3744140_110-274654-0_sp0.9 from training. Duration: 0.911125 2023-10-04 13:48:43,343 WARNING [train.py:1204] (2/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_381-317791-0_sp1.1 from training. Duration: 0.663625 2023-10-04 13:48:43,356 WARNING [train.py:1204] (2/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_51-117576-0_sp1.1 from training. Duration: 0.363625 2023-10-04 13:48:43,375 WARNING [train.py:1204] (2/4) Exclude cut with ID _42321_210_7770_1_1533468631352_4366895_104-215915-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 13:48:46,050 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_40896_210_15710_1_1533433956_7100802_216-22824-0_sp1.1 from training. Duration: 0.92725 2023-10-04 13:48:46,051 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_14831_210_7927_1_1530700200_5583610_181-27467-0 from training. Duration: 0.91 2023-10-04 13:48:46,110 WARNING [train.py:1204] (2/4) Exclude cut with ID _30304_210_6319_1_1532476827261_7582060_1024-265645-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 13:48:46,113 WARNING [train.py:1204] (2/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_412-233904-0_sp1.1 from training. Duration: 0.936375 2023-10-04 13:48:46,129 WARNING [train.py:1204] (2/4) Exclude cut with ID _30191_210_1664_1_1533189915047_7196670_911-223461-0_sp1.1 from training. Duration: 0.92725 2023-10-04 13:48:46,136 WARNING [train.py:1204] (2/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_558-148266-0_sp1.1 from training. Duration: 0.963625 2023-10-04 13:48:48,174 WARNING [train.py:1204] (2/4) Exclude cut with ID _58480_210_12392_1_1534811121111_3646160_212-164495-0_sp1.1 from training. Duration: 0.936375 2023-10-04 13:48:48,175 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_454-276429-0_sp0.9 from training. Duration: 0.9666875 2023-10-04 13:48:50,794 WARNING [train.py:1204] (2/4) Exclude cut with ID _24736_210_3279_1_1532332894218_2838700_73-197086-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 13:48:50,856 WARNING [train.py:1204] (2/4) Exclude cut with ID _5294_210_1747_1_1527249643928_3696179_529-242298-0_sp1.1 from training. Duration: 0.863625 2023-10-04 13:48:50,892 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_53555_210_5632_1_1534595220_7232297_345-171438-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 13:48:51,185 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.attention_skip_rate, batch_count=1676066.6666666667, ans=0.0 2023-10-04 13:48:55,376 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_28211_210_6388_1_1532861506_4523224_22-25766-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 13:48:55,461 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7002_210_1794_1_1527939683_5157286_163-87552-0 from training. Duration: 0.86 2023-10-04 13:48:58,132 WARNING [train.py:1204] (2/4) Exclude cut with ID _56927_210_9969_1_1534848970840_3695229_409-225129-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 13:48:59,981 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_26192_210_7927_1_1532137269_1040115_21-219331-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 13:49:01,441 WARNING [train.py:1204] (2/4) Exclude cut with ID _15012_210_1968_1_1530856820429_5095350_662-210469-0 from training. Duration: 0.57 2023-10-04 13:49:05,484 INFO [train.py:1046] (2/4) Epoch 48, batch 1750, loss[loss=0.1532, simple_loss=0.2165, pruned_loss=0.04494, over 22725.00 frames. ], tot_loss[loss=0.1522, simple_loss=0.2328, pruned_loss=0.03583, over 4711055.65 frames. ], batch size: 322, lr: 2.12e-03, grad_scale: 16.0 2023-10-04 13:49:07,642 WARNING [train.py:1204] (2/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_582-191016-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 13:49:07,899 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.feed_forward2.hidden_balancer.prob, batch_count=1676133.3333333333, ans=0.125 2023-10-04 13:49:10,399 WARNING [train.py:1204] (2/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_270-210032-0_sp1.1 from training. Duration: 0.963625 2023-10-04 13:49:10,433 WARNING [train.py:1204] (2/4) Exclude cut with ID _6789_210_1534_1_1527857937218_3118950_262-48853-0_sp0.9 from training. Duration: 0.62225 2023-10-04 13:49:11,775 WARNING [train.py:1204] (2/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_847-41334-0 from training. Duration: 0.44 2023-10-04 13:49:11,814 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_40896_210_15710_1_1533433956_7100802_573-46532-0_sp1.1 from training. Duration: 0.92725 2023-10-04 13:49:13,286 WARNING [train.py:1204] (2/4) Exclude cut with ID _21881_210_3525_1_1531825242757_3798380_247-102591-0_sp1.1 from training. Duration: 0.8909375 2023-10-04 13:49:13,317 WARNING [train.py:1204] (2/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_200-21395-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 13:49:17,763 WARNING [train.py:1204] (2/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_711-15688-0 from training. Duration: 0.43 2023-10-04 13:49:20,461 WARNING [train.py:1204] (2/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_53-75025-0_sp1.1 from training. Duration: 0.963625 2023-10-04 13:49:21,879 WARNING [train.py:1204] (2/4) Exclude cut with ID _6194_210_1534_1_1527511826160_3941194_321-148641-0 from training. Duration: 0.64 2023-10-04 13:49:21,912 WARNING [train.py:1204] (2/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_337-294431-0_sp1.1 from training. Duration: 0.936375 2023-10-04 13:49:23,310 WARNING [train.py:1204] (2/4) Exclude cut with ID _21632_210_2253_1_1531994001765_3744140_110-274654-0_sp1.1 from training. Duration: 0.7454375 2023-10-04 13:49:23,549 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.self_attn_weights.pos_emb_skip_rate, batch_count=1676200.0, ans=0.0 2023-10-04 13:49:26,485 WARNING [train.py:1204] (2/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_135-174080-0_sp1.1 from training. Duration: 0.4818125 2023-10-04 13:49:28,253 WARNING [train.py:1204] (2/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_21-174514-0 from training. Duration: 0.43 2023-10-04 13:49:29,719 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_38965_210_4852_1_1533174906_3484735_215-294589-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 13:49:31,098 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_548-294316-0 from training. Duration: 0.81 2023-10-04 13:49:38,474 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_103-84010-0_sp1.1 from training. Duration: 0.62725 2023-10-04 13:49:42,481 WARNING [train.py:1204] (2/4) Exclude cut with ID _18199_210_6109_1_1531475789864_4288790_295-198247-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 13:49:42,502 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_13757_210_3528_1_1530440682_4107239_103-313224-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 13:49:45,215 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_34886_210_10633_1_1533203882_1755661_67-294673-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 13:49:45,224 WARNING [train.py:1204] (2/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_350-101734-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 13:49:48,396 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_37357_210_17024_1_1533089504_2316121_204-153986-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 13:49:49,727 WARNING [train.py:1204] (2/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_673-115569-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 13:49:51,299 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_489-230703-0_sp0.9 from training. Duration: 0.9444375 2023-10-04 13:49:51,360 WARNING [train.py:1204] (2/4) Exclude cut with ID _5250_210_5219_1_1527249561806_8692110_575-102496-0_sp1.1 from training. Duration: 0.936375 2023-10-04 13:49:51,436 WARNING [train.py:1204] (2/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_669-93163-0 from training. Duration: 0.98 2023-10-04 13:49:54,071 WARNING [train.py:1204] (2/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_249-281349-0_sp0.9 from training. Duration: 0.9444375 2023-10-04 13:49:59,376 WARNING [train.py:1204] (2/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_106-30782-0 from training. Duration: 0.8 2023-10-04 13:49:59,459 WARNING [train.py:1204] (2/4) Exclude cut with ID _8772_210_1850_1_1528635595972_3664011_398-320049-0_sp1.1 from training. Duration: 0.8 2023-10-04 13:50:00,793 WARNING [train.py:1204] (2/4) Exclude cut with ID _44778_210_2054_1_1533798127203_7443090_516-151727-0_sp1.1 from training. Duration: 0.963625 2023-10-04 13:50:00,846 WARNING [train.py:1204] (2/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_325-62802-0_sp1.1 from training. Duration: 0.7909375 2023-10-04 13:50:04,619 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.0.layers.1.feed_forward2.out_whiten, num_groups=1, num_channels=192, metric=4.39 vs. limit=15.0 2023-10-04 13:50:05,162 WARNING [train.py:1204] (2/4) Exclude cut with ID _14368_210_4220_1_1530532714870_3595529_93-58002-0_sp0.9 from training. Duration: 0.6333125 2023-10-04 13:50:06,493 WARNING [train.py:1204] (2/4) Exclude cut with ID _4681_210_3525_1_1527251040321_1235713_123-24428-0_sp1.1 from training. Duration: 0.436375 2023-10-04 13:50:08,261 WARNING [train.py:1204] (2/4) Exclude cut with ID _5250_210_5219_1_1527249561806_8692110_1185-56473-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 13:50:09,661 WARNING [train.py:1204] (2/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_96-244299-0_sp1.1 from training. Duration: 0.8 2023-10-04 13:50:12,549 WARNING [train.py:1204] (2/4) Exclude cut with ID _59500_210_8254_1_1534809610680_3736009_104-72815-0_sp1.1 from training. Duration: 0.963625 2023-10-04 13:50:14,064 WARNING [train.py:1204] (2/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_468-283113-0_sp1.1 from training. Duration: 0.87275 2023-10-04 13:50:15,482 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6239_210_4211_1_1527765584_4306698_528-102186-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 13:50:16,865 WARNING [train.py:1204] (2/4) Exclude cut with ID _45297_210_2721_1_1533715351381_3264949_351-147136-0 from training. Duration: 0.87 2023-10-04 13:50:16,872 WARNING [train.py:1204] (2/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_520-51744-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 13:50:17,014 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.0.bypass.skip_rate, batch_count=1676400.0, ans=0.035 2023-10-04 13:50:18,660 WARNING [train.py:1204] (2/4) Exclude cut with ID _5166_210_2084_1_1527073197160_4087993_310-114452-0_sp1.1 from training. Duration: 0.536375 2023-10-04 13:50:18,663 WARNING [train.py:1204] (2/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_450-165710-0_sp1.1 from training. Duration: 0.92725 2023-10-04 13:50:20,038 INFO [train.py:1046] (2/4) Epoch 48, batch 1800, loss[loss=0.1607, simple_loss=0.2354, pruned_loss=0.04299, over 23698.00 frames. ], tot_loss[loss=0.1516, simple_loss=0.232, pruned_loss=0.03563, over 4718696.71 frames. ], batch size: 232, lr: 2.12e-03, grad_scale: 16.0 2023-10-04 13:50:20,094 WARNING [train.py:1204] (2/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_280-120942-0_sp1.1 from training. Duration: 0.7 2023-10-04 13:50:20,115 WARNING [train.py:1204] (2/4) Exclude cut with ID _65585_210_12210_1_1535450451584_3764098_112-294301-0_sp0.9 from training. Duration: 0.97775 2023-10-04 13:50:20,182 WARNING [train.py:1204] (2/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_98-51676-0_sp0.9 from training. Duration: 0.811125 2023-10-04 13:50:22,974 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_33348_210_5632_1_1533086912_3324030_85-28583-0_sp1.1 from training. Duration: 0.7454375 2023-10-04 13:50:24,284 WARNING [train.py:1204] (2/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_336-208232-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 13:50:25,649 WARNING [train.py:1204] (2/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_339-104791-0_sp0.9 from training. Duration: 0.5444375 2023-10-04 13:50:27,376 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.717e+02 2.032e+02 2.223e+02 2.665e+02 4.084e+02, threshold=4.447e+02, percent-clipped=0.0 2023-10-04 13:50:27,538 WARNING [train.py:1204] (2/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_559-29840-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 13:50:30,937 WARNING [train.py:1204] (2/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_117-290916-0_sp0.9 from training. Duration: 0.5666875 2023-10-04 13:50:32,391 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_43802_210_2290_1_1533516203_8829504_185-112289-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 13:50:36,465 WARNING [train.py:1204] (2/4) Exclude cut with ID _59256_210_5986_1_1535076085997_3589120_155-298797-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 13:50:37,935 WARNING [train.py:1204] (2/4) Exclude cut with ID _44659_210_2721_1_1534036120061_6614860_246-206531-0_sp1.1 from training. Duration: 0.92725 2023-10-04 13:50:37,971 WARNING [train.py:1204] (2/4) Exclude cut with ID _23818_210_6228_1_1531917063884_3676920_469-151944-0_sp1.1 from training. Duration: 0.92725 2023-10-04 13:50:41,296 WARNING [train.py:1204] (2/4) Exclude cut with ID _13252_210_1925_1_1530671372047_3336909_88-276668-0_sp1.1 from training. Duration: 0.9 2023-10-04 13:50:42,715 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_15101_210_7927_1_1530786415_5033274_320-327527-0_sp1.1 from training. Duration: 0.87275 2023-10-04 13:50:42,723 WARNING [train.py:1204] (2/4) Exclude cut with ID _14998_210_5219_1_1530705582080_7474970_446-112117-0 from training. Duration: 0.9 2023-10-04 13:50:44,080 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_30280_210_1881_1_1532872779_3660139_400-254159-0_sp1.1 from training. Duration: 0.936375 2023-10-04 13:50:47,013 WARNING [train.py:1204] (2/4) Exclude cut with ID _40284_210_2084_1_1533513679814_3704279_158-345341-0_sp1.1 from training. Duration: 0.936375 2023-10-04 13:50:51,122 WARNING [train.py:1204] (2/4) Exclude cut with ID 3033-130750-0096-55598-0 from training. Duration: 0.83 2023-10-04 13:50:53,095 WARNING [train.py:1204] (2/4) Exclude cut with ID _9389_210_1664_1_1529217337301_5289269_379-128794-0 from training. Duration: 0.85 2023-10-04 13:50:53,116 WARNING [train.py:1204] (2/4) Exclude cut with ID _3675_210_4557_1_1526639934408_4182089_456-18305-0 from training. Duration: 0.76 2023-10-04 13:50:53,261 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.bypass_mid.scale_min, batch_count=1676600.0, ans=0.2 2023-10-04 13:50:54,417 WARNING [train.py:1204] (2/4) Exclude cut with ID _29853_210_7497_1_1532505600851_6642110_571-249460-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 13:50:55,799 WARNING [train.py:1204] (2/4) Exclude cut with ID _4068_210_2629_1_1526697350395_1546259_230-325568-0_sp1.1 from training. Duration: 0.92725 2023-10-04 13:50:55,799 WARNING [train.py:1204] (2/4) Exclude cut with ID _34415_210_8750_1_1533085213375_3451116_213-267570-0_sp1.1 from training. Duration: 0.963625 2023-10-04 13:50:57,159 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6767_210_3273_1_1527855783_1718932_142-127132-0_sp1.1 from training. Duration: 0.863625 2023-10-04 13:51:03,221 WARNING [train.py:1204] (2/4) Exclude cut with ID IC0653W0001-322925 from training. Duration: 0.896 2023-10-04 13:51:04,634 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_4528_210_3273_1_1527076654_3276777_209-219804-0_sp1.1 from training. Duration: 0.62725 2023-10-04 13:51:06,041 WARNING [train.py:1204] (2/4) Exclude cut with ID _55844_210_2346_1_1534471212569_7625707_187-204971-0_sp1.1 from training. Duration: 0.936375 2023-10-04 13:51:08,898 WARNING [train.py:1204] (2/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_369-325184-0 from training. Duration: 0.99 2023-10-04 13:51:08,939 WARNING [train.py:1204] (2/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_791-45149-0 from training. Duration: 0.86 2023-10-04 13:51:08,978 WARNING [train.py:1204] (2/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_317-211107-0_sp1.1 from training. Duration: 0.836375 2023-10-04 13:51:10,331 WARNING [train.py:1204] (2/4) Exclude cut with ID _30800_210_22382_1_1532499347904_4515959_488-330913-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 13:51:10,435 WARNING [train.py:1204] (2/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_506-42614-0_sp1.1 from training. Duration: 0.7181875 2023-10-04 13:51:10,704 INFO [scaling.py:1118] (2/4) WithLoss: name=encoder.encoders.4.encoder.layers.2.self_attn_weights, loss-sum=0.000e+00 2023-10-04 13:51:15,055 WARNING [train.py:1204] (2/4) Exclude cut with ID _8696_210_1876_1_1528545632440_3345058_209-54069-0 from training. Duration: 0.67 2023-10-04 13:51:20,639 WARNING [train.py:1204] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_215-202973-0_sp1.1 from training. Duration: 0.8 2023-10-04 13:51:21,953 WARNING [train.py:1204] (2/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_321-215742-0 from training. Duration: 0.5 2023-10-04 13:51:23,854 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_40896_210_15710_1_1533433956_7100802_428-32114-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 13:51:23,862 WARNING [train.py:1204] (2/4) Exclude cut with ID _27013_210_1389_1_1532437173240_3252649_467-263151-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 13:51:23,921 WARNING [train.py:1204] (2/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_734-129761-0_sp0.9 from training. Duration: 0.911125 2023-10-04 13:51:23,954 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_521-214276-0 from training. Duration: 0.44 2023-10-04 13:51:26,761 WARNING [train.py:1204] (2/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_80-147301-0_sp1.1 from training. Duration: 0.763625 2023-10-04 13:51:28,046 WARNING [train.py:1204] (2/4) Exclude cut with ID _9679_210_7497_1_1529129212906_4363929_56-307796-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 13:51:29,969 WARNING [train.py:1204] (2/4) Exclude cut with ID _14832_210_8750_1_1530691351539_3171348_1-54315-0 from training. Duration: 0.87 2023-10-04 13:51:29,969 WARNING [train.py:1204] (2/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_558-155750-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 13:51:33,230 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_142-309370-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 13:51:33,273 WARNING [train.py:1204] (2/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_18-46756-0_sp0.9 from training. Duration: 0.87775 2023-10-04 13:51:33,281 WARNING [train.py:1204] (2/4) Exclude cut with ID _61148_210_6897_1_1535094022726_4217610_279-212159-0_sp1.1 from training. Duration: 0.936375 2023-10-04 13:51:34,530 INFO [train.py:1046] (2/4) Epoch 48, batch 1850, loss[loss=0.1348, simple_loss=0.2144, pruned_loss=0.02758, over 24293.00 frames. ], tot_loss[loss=0.1523, simple_loss=0.2328, pruned_loss=0.03584, over 4711365.68 frames. ], batch size: 56, lr: 2.12e-03, grad_scale: 16.0 2023-10-04 13:51:34,579 WARNING [train.py:1204] (2/4) Exclude cut with ID _21987_210_5854_1_1531821500482_3637038_164-2641-0_sp1.1 from training. Duration: 0.936375 2023-10-04 13:51:34,625 WARNING [train.py:1204] (2/4) Exclude cut with ID _8064_210_5399_1_1528372299291_4282973_395-340852-0_sp1.1 from training. Duration: 0.8454375 2023-10-04 13:51:37,312 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1077-184261-0_sp1.1 from training. Duration: 0.963625 2023-10-04 13:51:37,322 WARNING [train.py:1204] (2/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_1176-204992-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 13:51:38,749 WARNING [train.py:1204] (2/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_563-347832-0_sp1.1 from training. Duration: 0.8090625 2023-10-04 13:51:40,628 WARNING [train.py:1204] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_380-204756-0_sp0.9 from training. Duration: 0.9555625 2023-10-04 13:51:43,964 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.balancer1.prob, batch_count=1676800.0, ans=0.125 2023-10-04 13:51:45,017 WARNING [train.py:1204] (2/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_212-10501-0_sp1.1 from training. Duration: 0.8545625 2023-10-04 13:51:45,048 WARNING [train.py:1204] (2/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_445-117398-0 from training. Duration: 0.89 2023-10-04 13:51:49,125 WARNING [train.py:1204] (2/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_29-280622-0 from training. Duration: 0.82 2023-10-04 13:51:51,852 WARNING [train.py:1204] (2/4) Exclude cut with ID _26664_210_7770_1_1532258943280_3418270_187-80815-0 from training. Duration: 0.93 2023-10-04 13:51:55,219 WARNING [train.py:1204] (2/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_414-349640-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 13:51:56,488 WARNING [train.py:1204] (2/4) Exclude cut with ID _45021_210_9994_1_1533619751094_3330288_96-239010-0 from training. Duration: 0.91 2023-10-04 13:51:56,491 WARNING [train.py:1204] (2/4) Exclude cut with ID _5189_210_4557_1_1527248319428_3769229_199-53526-0_sp0.9 from training. Duration: 0.4666875 2023-10-04 13:51:57,434 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.nonlin_attention.balancer.max_positive, batch_count=1676866.6666666667, ans=0.95 2023-10-04 13:51:58,556 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.conv_module1.balancer1.prob, batch_count=1676866.6666666667, ans=0.125 2023-10-04 13:52:07,030 WARNING [train.py:1204] (2/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_63-101996-0_sp1.1 from training. Duration: 0.8 2023-10-04 13:52:08,382 WARNING [train.py:1204] (2/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_199-86432-0 from training. Duration: 0.98 2023-10-04 13:52:09,882 WARNING [train.py:1204] (2/4) Exclude cut with ID _36399_210_16049_1_1532998771377_3698333_207-259013-0_sp1.1 from training. Duration: 0.92725 2023-10-04 13:52:11,857 WARNING [train.py:1204] (2/4) Exclude cut with ID _62574_210_2655_1_1535180407058_3933249_567-111756-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 13:52:14,791 WARNING [train.py:1204] (2/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_436-318113-0 from training. Duration: 0.75 2023-10-04 13:52:16,102 WARNING [train.py:1204] (2/4) Exclude cut with ID _53379_210_20034_1_1534222027363_3854654_43-305214-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 13:52:16,124 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_15126_210_6753_1_1530778677_2591185_113-157651-0_sp1.1 from training. Duration: 0.6181875 2023-10-04 13:52:17,491 WARNING [train.py:1204] (2/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_146-218763-0_sp1.1 from training. Duration: 0.7818125 2023-10-04 13:52:19,087 WARNING [train.py:1204] (2/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_714-310538-0_sp0.9 from training. Duration: 0.9555625 2023-10-04 13:52:19,366 INFO [scaling.py:1118] (2/4) WithLoss: name=encoder.encoders.3.encoder.layers.2.self_attn_weights, loss-sum=0.000e+00 2023-10-04 13:52:20,519 WARNING [train.py:1204] (2/4) Exclude cut with ID _24271_210_12658_1_1531987270022_7190519_709-16721-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 13:52:22,671 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.2.encoder.layers.0.self_attn2.whiten, num_groups=1, num_channels=384, metric=16.71 vs. limit=22.5 2023-10-04 13:52:23,434 WARNING [train.py:1204] (2/4) Exclude cut with ID _5190_210_4557_1_1527244368799_373439_28-197030-0_sp0.9 from training. Duration: 0.8 2023-10-04 13:52:24,764 WARNING [train.py:1204] (2/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_267-197536-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 13:52:24,788 WARNING [train.py:1204] (2/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_658-282610-0_sp1.1 from training. Duration: 0.4818125 2023-10-04 13:52:24,797 WARNING [train.py:1204] (2/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_257-67050-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 13:52:27,418 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_14906_210_7927_1_1530754190_9408633_961-53402-0_sp1.1 from training. Duration: 0.936375 2023-10-04 13:52:28,649 WARNING [train.py:1204] (2/4) Exclude cut with ID _60113_210_16271_1_1535164212481_4386079_490-280290-0_sp1.1 from training. Duration: 0.8909375 2023-10-04 13:52:31,319 WARNING [train.py:1204] (2/4) Exclude cut with ID _14100_210_8341_1_1530529320613_3678069_258-224599-0 from training. Duration: 0.77 2023-10-04 13:52:32,570 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6961_210_2087_1_1527937168_6815011_565-326898-0_sp1.1 from training. Duration: 0.936375 2023-10-04 13:52:32,839 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.bypass.skip_rate, batch_count=1677066.6666666667, ans=0.04949747468305833 2023-10-04 13:52:36,140 WARNING [train.py:1204] (2/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_239-316802-0_sp1.1 from training. Duration: 0.62725 2023-10-04 13:52:36,195 WARNING [train.py:1204] (2/4) Exclude cut with ID _55857_210_2346_1_1534644007831_6898209_745-98391-0_sp1.1 from training. Duration: 0.6909375 2023-10-04 13:52:36,198 WARNING [train.py:1204] (2/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_154-135696-0 from training. Duration: 0.8 2023-10-04 13:52:36,200 WARNING [train.py:1204] (2/4) Exclude cut with ID _14368_210_4220_1_1530532714870_3595529_93-58002-0 from training. Duration: 0.57 2023-10-04 13:52:37,661 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.0.bypass.scale_min, batch_count=1677066.6666666667, ans=0.2 2023-10-04 13:52:38,906 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9025W0005-507335 from training. Duration: 0.896 2023-10-04 13:52:40,310 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9029W0034-509435 from training. Duration: 0.64 2023-10-04 13:52:42,192 WARNING [train.py:1204] (2/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_76-203867-0_sp1.1 from training. Duration: 0.6545625 2023-10-04 13:52:42,201 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_46639_210_8341_1_1533726088_4495500_66-346325-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 13:52:42,208 WARNING [train.py:1204] (2/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_145-137790-0_sp0.9 from training. Duration: 0.92225 2023-10-04 13:52:42,222 WARNING [train.py:1204] (2/4) Exclude cut with ID _36522_210_8887_1_1533177030611_1772478_7-327299-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 13:52:42,298 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9042W0009-515615 from training. Duration: 0.896 2023-10-04 13:52:43,560 WARNING [train.py:1204] (2/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_39-248870-0_sp0.9 from training. Duration: 0.7666875 2023-10-04 13:52:43,591 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_35441_210_16554_1_1533347572_4092680_362-242912-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 13:52:43,677 WARNING [train.py:1204] (2/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_216-343007-0_sp1.1 from training. Duration: 0.6 2023-10-04 13:52:45,071 WARNING [train.py:1204] (2/4) Exclude cut with ID _3867_210_1883_1_1526641200575_4475830_272-215563-0_sp0.9 from training. Duration: 0.6333125 2023-10-04 13:52:46,381 WARNING [train.py:1204] (2/4) Exclude cut with ID _21783_210_11357_1_1531742458730_3762679_553-175603-0_sp1.1 from training. Duration: 0.92725 2023-10-04 13:52:46,391 WARNING [train.py:1204] (2/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_963-187109-0 from training. Duration: 0.99 2023-10-04 13:52:48,908 INFO [train.py:1046] (2/4) Epoch 48, batch 1900, loss[loss=0.1492, simple_loss=0.2273, pruned_loss=0.03557, over 23761.00 frames. ], tot_loss[loss=0.1533, simple_loss=0.2345, pruned_loss=0.03601, over 4707686.80 frames. ], batch size: 232, lr: 2.12e-03, grad_scale: 16.0 2023-10-04 13:52:49,004 WARNING [train.py:1204] (2/4) Exclude cut with ID _44973_210_1511_1_1533794381163_3634770_495-211466-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 13:52:49,016 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9068W0002-529085 from training. Duration: 0.896 2023-10-04 13:52:49,037 WARNING [train.py:1204] (2/4) Exclude cut with ID _5294_210_1747_1_1527249643928_3696179_180-100713-0_sp1.1 from training. Duration: 0.736375 2023-10-04 13:52:50,407 WARNING [train.py:1204] (2/4) Exclude cut with ID _35799_210_5854_1_1533263487887_3529071_149-49689-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 13:52:50,509 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.1.self_attn_weights.pos_emb_skip_rate, batch_count=1677133.3333333333, ans=0.0 2023-10-04 13:52:51,021 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.2.encoder.layers.2.nonlin_attention.whiten1, num_groups=1, num_channels=288, metric=4.65 vs. limit=10.0 2023-10-04 13:52:55,963 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.666e+02 2.090e+02 2.355e+02 2.808e+02 4.439e+02, threshold=4.709e+02, percent-clipped=0.0 2023-10-04 13:52:56,104 WARNING [train.py:1204] (2/4) Exclude cut with ID _67725_210_27767_1_1535608756671_5105359_36-220794-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 13:52:59,179 WARNING [train.py:1204] (2/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_338-96520-0_sp1.1 from training. Duration: 0.8545625 2023-10-04 13:52:59,866 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9104W0004-547160 from training. Duration: 0.896 2023-10-04 13:52:59,934 WARNING [train.py:1204] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_330-327861-0 from training. Duration: 0.67 2023-10-04 13:53:02,598 WARNING [train.py:1204] (2/4) Exclude cut with ID _14087_210_2253_1_1530529224512_3014029_156-257607-0_sp0.9 from training. Duration: 0.92225 2023-10-04 13:53:02,660 WARNING [train.py:1204] (2/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_476-322050-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 13:53:02,669 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9118W0017-553385 from training. Duration: 0.896 2023-10-04 13:53:03,985 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9120W0028-554435 from training. Duration: 0.896 2023-10-04 13:53:05,635 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=1677200.0, ans=0.1 2023-10-04 13:53:07,449 WARNING [train.py:1204] (2/4) Exclude cut with ID _56524_210_9642_1_1534572011779_3619170_145-310842-0 from training. Duration: 0.99 2023-10-04 13:53:08,884 WARNING [train.py:1204] (2/4) Exclude cut with ID _51631_210_8254_1_1534129228420_4084794_270-222508-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 13:53:13,655 WARNING [train.py:1204] (2/4) Exclude cut with ID _14268_210_9206_1_1530622771285_4714099_331-105351-0 from training. Duration: 0.65 2023-10-04 13:53:16,247 WARNING [train.py:1204] (2/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_40-129756-0 from training. Duration: 0.81 2023-10-04 13:53:24,432 WARNING [train.py:1204] (2/4) Exclude cut with ID _3541_210_2477_1_1526475408709_3263973_231-104736-0 from training. Duration: 0.78 2023-10-04 13:53:26,056 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.conv_module2.balancer2.prob, batch_count=1677266.6666666667, ans=0.125 2023-10-04 13:53:27,347 WARNING [train.py:1204] (2/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_214-291369-0 from training. Duration: 0.63 2023-10-04 13:53:27,354 WARNING [train.py:1204] (2/4) Exclude cut with ID _62574_210_2655_1_1535180407058_3933249_369-84347-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 13:53:27,398 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9210W0111-599240 from training. Duration: 0.896 2023-10-04 13:53:27,409 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_92-246270-0 from training. Duration: 0.85 2023-10-04 13:53:29,133 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_37152_210_9697_1_1533106573_3844188_29-239070-0 from training. Duration: 0.98 2023-10-04 13:53:29,201 WARNING [train.py:1204] (2/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_239-22076-0 from training. Duration: 0.85 2023-10-04 13:53:29,208 WARNING [train.py:1204] (2/4) Exclude cut with ID _65962_210_29927_1_1535436026519_4174930_368-44639-0_sp1.1 from training. Duration: 0.97275 2023-10-04 13:53:33,968 WARNING [train.py:1204] (2/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_152-212533-0 from training. Duration: 0.97 2023-10-04 13:53:35,411 WARNING [train.py:1204] (2/4) Exclude cut with ID _14603_210_4220_1_1530615688441_3097319_90-31383-0_sp1.1 from training. Duration: 0.7909375 2023-10-04 13:53:38,652 WARNING [train.py:1204] (2/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_196-152868-0_sp1.1 from training. Duration: 0.92725 2023-10-04 13:53:38,653 WARNING [train.py:1204] (2/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_140-181176-0 from training. Duration: 0.92 2023-10-04 13:53:39,993 WARNING [train.py:1204] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_168-32906-0_sp0.9 from training. Duration: 0.7666875 2023-10-04 13:53:44,757 WARNING [train.py:1204] (2/4) Exclude cut with ID _14922_210_6228_1_1530707435273_3693310_13-165512-0 from training. Duration: 0.82 2023-10-04 13:53:46,082 WARNING [train.py:1204] (2/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_381-317791-0_sp0.9 from training. Duration: 0.811125 2023-10-04 13:53:50,352 WARNING [train.py:1204] (2/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_63-267585-0_sp1.1 from training. Duration: 0.8181875 2023-10-04 13:53:50,362 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_326-303945-0_sp1.1 from training. Duration: 0.9 2023-10-04 13:53:50,374 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_194-143381-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 13:53:51,795 WARNING [train.py:1204] (2/4) Exclude cut with ID _39290_210_4220_1_1533862778760_3075118_194-16667-0_sp1.1 from training. Duration: 0.963625 2023-10-04 13:53:53,058 WARNING [train.py:1204] (2/4) Exclude cut with ID _15012_210_1968_1_1530856820429_5095350_662-210469-0_sp0.9 from training. Duration: 0.6333125 2023-10-04 13:53:53,083 WARNING [train.py:1204] (2/4) Exclude cut with ID _4697_210_2069_1_1526815801953_3823098_68-219807-0_sp1.1 from training. Duration: 0.42725 2023-10-04 13:53:53,144 WARNING [train.py:1204] (2/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_321-22821-0_sp0.9 from training. Duration: 0.988875 2023-10-04 13:53:57,162 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_14056_210_4163_1_1530604483_7572827_1037-45317-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 13:53:57,164 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_403-39619-0_sp1.1 from training. Duration: 0.763625 2023-10-04 13:53:58,572 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_20120_210_6753_1_1531643035_5482859_510-96996-0_sp0.9 from training. Duration: 0.9666875 2023-10-04 13:53:58,577 WARNING [train.py:1204] (2/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_225-180155-0_sp1.1 from training. Duration: 0.92725 2023-10-04 13:54:00,402 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_278-272612-0_sp0.9 from training. Duration: 0.811125 2023-10-04 13:54:01,731 WARNING [train.py:1204] (2/4) Exclude cut with ID _53713_210_24116_1_1534314487762_7416751_691-70244-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 13:54:03,586 INFO [train.py:1046] (2/4) Epoch 48, batch 1950, loss[loss=0.1449, simple_loss=0.2357, pruned_loss=0.02704, over 24481.00 frames. ], tot_loss[loss=0.153, simple_loss=0.2344, pruned_loss=0.03579, over 4721817.63 frames. ], batch size: 66, lr: 2.12e-03, grad_scale: 16.0 2023-10-04 13:54:05,098 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6788_210_1388_1_1527898698_4562021_91-197981-0_sp1.1 from training. Duration: 0.7545625 2023-10-04 13:54:06,479 WARNING [train.py:1204] (2/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_60-18123-0_sp0.9 from training. Duration: 0.97775 2023-10-04 13:54:08,437 WARNING [train.py:1204] (2/4) Exclude cut with ID _35799_210_5854_1_1533263487887_3529071_5-214617-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 13:54:08,443 WARNING [train.py:1204] (2/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_890-5117-0_sp1.1 from training. Duration: 0.7090625 2023-10-04 13:54:09,869 WARNING [train.py:1204] (2/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_660-211088-0 from training. Duration: 0.62 2023-10-04 13:54:11,221 WARNING [train.py:1204] (2/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_314-126819-0_sp0.9 from training. Duration: 0.5555625 2023-10-04 13:54:13,005 WARNING [train.py:1204] (2/4) Exclude cut with ID _21632_210_2253_1_1531994001765_3744140_362-66225-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 13:54:14,403 WARNING [train.py:1204] (2/4) Exclude cut with ID _61118_210_9527_1_1534897668884_3752630_173-258555-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 13:54:15,842 WARNING [train.py:1204] (2/4) Exclude cut with ID _7719_210_3525_1_1528456153163_4296960_88-250259-0_sp0.9 from training. Duration: 0.9333125 2023-10-04 13:54:17,133 WARNING [train.py:1204] (2/4) Exclude cut with ID _15140_210_9288_1_1530791388450_4880149_539-153708-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 13:54:17,164 WARNING [train.py:1204] (2/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_253-42544-0_sp1.1 from training. Duration: 0.97275 2023-10-04 13:54:18,572 WARNING [train.py:1204] (2/4) Exclude cut with ID _33640_210_2655_1_1533117605479_4855469_479-296877-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 13:54:20,045 WARNING [train.py:1204] (2/4) Exclude cut with ID 3033-130750-0096-55598-0_sp1.1 from training. Duration: 0.7545625 2023-10-04 13:54:20,064 WARNING [train.py:1204] (2/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_191-41023-0_sp1.1 from training. Duration: 0.6454375 2023-10-04 13:54:21,321 WARNING [train.py:1204] (2/4) Exclude cut with ID _8365_210_2064_1_1528592311070_4677690_431-294102-0_sp1.1 from training. Duration: 0.8090625 2023-10-04 13:54:21,346 WARNING [train.py:1204] (2/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_893-296030-0_sp1.1 from training. Duration: 0.97275 2023-10-04 13:54:22,858 WARNING [train.py:1204] (2/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_401-343820-0_sp1.1 from training. Duration: 0.97275 2023-10-04 13:54:25,624 WARNING [train.py:1204] (2/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_127-566-0_sp1.1 from training. Duration: 0.763625 2023-10-04 13:54:25,625 WARNING [train.py:1204] (2/4) Exclude cut with ID _14100_210_8341_1_1530529320613_3678069_126-272847-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 13:54:25,639 WARNING [train.py:1204] (2/4) Exclude cut with ID _15133_210_6228_1_1530794512074_3080379_29-264458-0_sp1.1 from training. Duration: 0.52725 2023-10-04 13:54:25,640 WARNING [train.py:1204] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_127-119109-0 from training. Duration: 0.72 2023-10-04 13:54:27,001 WARNING [train.py:1204] (2/4) Exclude cut with ID _6537_210_1883_1_1527987559710_3597129_382-133062-0_sp1.1 from training. Duration: 0.6181875 2023-10-04 13:54:27,021 WARNING [train.py:1204] (2/4) Exclude cut with ID _29529_210_7234_1_1532393961455_3977460_391-219682-0_sp1.1 from training. Duration: 0.936375 2023-10-04 13:54:27,078 WARNING [train.py:1204] (2/4) Exclude cut with ID _37537_210_2253_1_1533171758568_3421949_96-137189-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 13:54:31,900 WARNING [train.py:1204] (2/4) Exclude cut with ID _58414_210_8254_1_1535421609162_7151182_15-276301-0_sp1.1 from training. Duration: 0.97275 2023-10-04 13:54:35,097 WARNING [train.py:1204] (2/4) Exclude cut with ID _59502_210_12210_1_1535107024698_1835139_110-289074-0_sp1.1 from training. Duration: 0.92725 2023-10-04 13:54:38,370 WARNING [train.py:1204] (2/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_367-61330-0_sp1.1 from training. Duration: 0.6818125 2023-10-04 13:54:40,235 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.4.encoder.layers.2.self_attn1.whiten, num_groups=1, num_channels=384, metric=11.84 vs. limit=22.5 2023-10-04 13:54:41,099 WARNING [train.py:1204] (2/4) Exclude cut with ID _45297_210_2721_1_1533715351381_3264949_353-9541-0_sp1.1 from training. Duration: 0.8818125 2023-10-04 13:54:41,138 WARNING [train.py:1204] (2/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_85-138257-0_sp0.9 from training. Duration: 0.788875 2023-10-04 13:54:43,033 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_3787_210_2040_1_1526477544_5478586_603-120324-0 from training. Duration: 0.81 2023-10-04 13:54:43,085 WARNING [train.py:1204] (2/4) Exclude cut with ID _42863_210_9070_1_1533902398788_3696920_411-204131-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 13:54:47,121 WARNING [train.py:1204] (2/4) Exclude cut with ID _30196_210_1664_1_1533708496522_7074140_822-289909-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 13:54:48,468 WARNING [train.py:1204] (2/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_32-335103-0_sp0.9 from training. Duration: 0.911125 2023-10-04 13:54:48,521 WARNING [train.py:1204] (2/4) Exclude cut with ID _6493_210_1747_1_1528459210424_4012470_513-140967-0_sp0.9 from training. Duration: 0.87775 2023-10-04 13:54:56,895 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_47896_210_20508_1_1534492518_5958868_439-257766-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 13:54:58,382 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_1294-63152-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 13:55:01,589 WARNING [train.py:1204] (2/4) Exclude cut with ID _59170_210_6721_1_1534983633960_4203335_319-166512-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 13:55:04,391 WARNING [train.py:1204] (2/4) Exclude cut with ID _3541_210_2477_1_1526475408709_3263973_146-3470-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 13:55:07,698 WARNING [train.py:1204] (2/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_678-159307-0_sp1.1 from training. Duration: 0.77275 2023-10-04 13:55:07,737 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_343-48822-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 13:55:09,507 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_4692_210_3528_1_1526899755_4323254_107-44401-0 from training. Duration: 0.86 2023-10-04 13:55:09,512 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6291_210_3273_1_1527683649_1078498_74-292608-0_sp1.1 from training. Duration: 0.6545625 2023-10-04 13:55:10,948 WARNING [train.py:1204] (2/4) Exclude cut with ID _55644_210_11856_1_1534593599582_4103719_293-248694-0_sp1.1 from training. Duration: 0.97275 2023-10-04 13:55:12,370 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_3172_210_2087_1_1526292929_3987865_197-97762-0 from training. Duration: 0.75 2023-10-04 13:55:16,325 WARNING [train.py:1204] (2/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_264-348896-0_sp0.9 from training. Duration: 0.9444375 2023-10-04 13:55:18,190 INFO [train.py:1046] (2/4) Epoch 48, batch 2000, loss[loss=0.1419, simple_loss=0.2233, pruned_loss=0.03027, over 24304.00 frames. ], tot_loss[loss=0.1536, simple_loss=0.2346, pruned_loss=0.03627, over 4731327.95 frames. ], batch size: 61, lr: 2.12e-03, grad_scale: 32.0 2023-10-04 13:55:19,783 WARNING [train.py:1204] (2/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_209-205365-0_sp0.9 from training. Duration: 0.87775 2023-10-04 13:55:21,190 WARNING [train.py:1204] (2/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_222-198127-0_sp0.9 from training. Duration: 0.9333125 2023-10-04 13:55:22,583 WARNING [train.py:1204] (2/4) Exclude cut with ID _57572_210_5517_1_1534921219624_3809484_52-43530-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 13:55:22,689 WARNING [train.py:1204] (2/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_96-320736-0_sp1.1 from training. Duration: 0.7818125 2023-10-04 13:55:24,193 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.nonlin_attention.balancer.prob, batch_count=1677800.0, ans=0.125 2023-10-04 13:55:25,197 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.707e+02 2.085e+02 2.271e+02 2.591e+02 3.651e+02, threshold=4.543e+02, percent-clipped=0.0 2023-10-04 13:55:25,323 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_13091_210_7925_1_1530609447_8987354_1159-225508-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 13:55:28,284 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.balancer1.prob, batch_count=1677800.0, ans=0.125 2023-10-04 13:55:29,327 WARNING [train.py:1204] (2/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_237-44332-0 from training. Duration: 0.94 2023-10-04 13:55:29,373 WARNING [train.py:1204] (2/4) Exclude cut with ID _7970_210_1883_1_1528455616296_3472230_25-178407-0_sp0.9 from training. Duration: 0.788875 2023-10-04 13:55:32,208 WARNING [train.py:1204] (2/4) Exclude cut with ID _29003_210_19596_1_1532507391720_3894098_357-257062-0_sp1.1 from training. Duration: 0.936375 2023-10-04 13:55:34,959 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_809-17156-0 from training. Duration: 0.73 2023-10-04 13:55:36,317 WARNING [train.py:1204] (2/4) Exclude cut with ID _7160_210_1876_1_1527940923231_3703799_302-150728-0_sp1.1 from training. Duration: 0.7090625 2023-10-04 13:55:36,332 WARNING [train.py:1204] (2/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_444-28041-0_sp0.9 from training. Duration: 0.9444375 2023-10-04 13:55:37,183 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.balancer2.prob, batch_count=1677866.6666666667, ans=0.125 2023-10-04 13:55:38,308 WARNING [train.py:1204] (2/4) Exclude cut with ID _52892_210_6721_1_1534378941259_4124129_278-251666-0_sp1.1 from training. Duration: 0.92725 2023-10-04 13:55:40,252 WARNING [train.py:1204] (2/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_115-326778-0 from training. Duration: 0.93 2023-10-04 13:55:41,688 WARNING [train.py:1204] (2/4) Exclude cut with ID _41391_210_7994_1_1533603771872_3638201_31-188961-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 13:55:43,134 WARNING [train.py:1204] (2/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_1073-6264-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 13:55:43,152 WARNING [train.py:1204] (2/4) Exclude cut with ID _66868_210_9527_1_1535509287954_4561140_435-259515-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 13:55:43,753 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.4.encoder.layers.0.whiten, num_groups=1, num_channels=384, metric=4.38 vs. limit=12.0 2023-10-04 13:55:44,526 WARNING [train.py:1204] (2/4) Exclude cut with ID _14886_210_4220_1_1530702020355_3516190_159-73748-0 from training. Duration: 0.83 2023-10-04 13:55:44,573 WARNING [train.py:1204] (2/4) Exclude cut with ID _15150_210_8341_1_1530752411803_7256384_469-239509-0_sp1.1 from training. Duration: 0.6181875 2023-10-04 13:55:46,531 WARNING [train.py:1204] (2/4) Exclude cut with ID _8064_210_5399_1_1528372299291_4282973_395-340852-0 from training. Duration: 0.93 2023-10-04 13:55:46,662 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.feed_forward1.hidden_balancer.prob, batch_count=1677933.3333333333, ans=0.125 2023-10-04 13:55:48,217 WARNING [train.py:1204] (2/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_138-187031-0_sp1.1 from training. Duration: 0.9 2023-10-04 13:55:49,731 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_13757_210_3528_1_1530440682_4107239_48-106347-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 13:55:51,014 WARNING [train.py:1204] (2/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_164-224052-0_sp1.1 from training. Duration: 0.636375 2023-10-04 13:55:51,018 WARNING [train.py:1204] (2/4) Exclude cut with ID _59288_210_5854_1_1535077757774_3412246_408-322648-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 13:55:51,055 WARNING [train.py:1204] (2/4) Exclude cut with ID _30191_210_1664_1_1533189915047_7196670_827-36359-0_sp1.1 from training. Duration: 0.963625 2023-10-04 13:55:53,819 WARNING [train.py:1204] (2/4) Exclude cut with ID _4680_210_3525_1_1527249757953_893386_75-78979-0_sp1.1 from training. Duration: 0.82725 2023-10-04 13:55:53,877 WARNING [train.py:1204] (2/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_383-165234-0 from training. Duration: 0.94 2023-10-04 13:55:56,533 WARNING [train.py:1204] (2/4) Exclude cut with ID _19896_210_1883_1_1531709022662_6649569_94-229927-0 from training. Duration: 0.97 2023-10-04 13:55:56,539 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6788_210_1388_1_1527898698_4562021_180-75288-0_sp1.1 from training. Duration: 0.9 2023-10-04 13:55:56,546 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_26433_210_12108_1_1532157158_4056480_54-321349-0_sp1.1 from training. Duration: 0.97275 2023-10-04 13:56:00,701 WARNING [train.py:1204] (2/4) Exclude cut with ID _56108_210_16380_1_1534505446159_3887959_265-161493-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 13:56:00,809 WARNING [train.py:1204] (2/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_395-111602-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 13:56:00,815 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_154-316450-0_sp0.9 from training. Duration: 0.8444375 2023-10-04 13:56:02,197 WARNING [train.py:1204] (2/4) Exclude cut with ID _43552_210_8913_1_1533726031059_3523429_381-116041-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 13:56:02,424 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.conv_module1.balancer1.prob, batch_count=1678000.0, ans=0.125 2023-10-04 13:56:04,129 WARNING [train.py:1204] (2/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_65-104124-0_sp1.1 from training. Duration: 0.963625 2023-10-04 13:56:05,260 WARNING [train.py:1204] (2/4) Exclude cut with ID _6536_210_1883_1_1527850783339_3319069_169-23811-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 13:56:05,299 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_289-180904-0_sp0.9 from training. Duration: 0.8444375 2023-10-04 13:56:05,309 WARNING [train.py:1204] (2/4) Exclude cut with ID _56406_210_9288_1_1534899618664_3496149_226-242400-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 13:56:06,654 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_24899_210_16554_1_1532046422_3827462_276-216330-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 13:56:06,846 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.ff3_skip_rate, batch_count=1678000.0, ans=0.0 2023-10-04 13:56:09,941 WARNING [train.py:1204] (2/4) Exclude cut with ID _29345_210_4381_1_1532689168263_3860169_198-174328-0_sp1.1 from training. Duration: 0.82725 2023-10-04 13:56:10,004 WARNING [train.py:1204] (2/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_338-96520-0 from training. Duration: 0.94 2023-10-04 13:56:16,222 WARNING [train.py:1204] (2/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_7-111954-0_sp1.1 from training. Duration: 0.5090625 2023-10-04 13:56:16,314 WARNING [train.py:1204] (2/4) Exclude cut with ID _36692_210_5665_1_1533608967420_398809_8-16261-0_sp1.1 from training. Duration: 0.97275 2023-10-04 13:56:20,473 WARNING [train.py:1204] (2/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_86-212657-0_sp1.1 from training. Duration: 0.97275 2023-10-04 13:56:20,488 WARNING [train.py:1204] (2/4) Exclude cut with ID _44659_210_2721_1_1534036120061_6614860_626-298884-0_sp1.1 from training. Duration: 0.8818125 2023-10-04 13:56:24,669 WARNING [train.py:1204] (2/4) Exclude cut with ID _49755_210_18053_1_1534581005846_3218709_120-257918-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 13:56:26,055 WARNING [train.py:1204] (2/4) Exclude cut with ID _21881_210_3525_1_1531825242757_3798380_400-113821-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 13:56:26,057 WARNING [train.py:1204] (2/4) Exclude cut with ID _21783_210_11357_1_1531742458730_3762679_387-236773-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 13:56:27,378 WARNING [train.py:1204] (2/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_613-232856-0_sp1.1 from training. Duration: 0.4909375 2023-10-04 13:56:27,403 WARNING [train.py:1204] (2/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_181-78632-0_sp0.9 from training. Duration: 0.8666875 2023-10-04 13:56:28,968 WARNING [train.py:1204] (2/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_175-17485-0_sp1.1 from training. Duration: 0.97275 2023-10-04 13:56:30,351 WARNING [train.py:1204] (2/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_1004-183422-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 13:56:31,602 INFO [train.py:1046] (2/4) Epoch 48, batch 2050, loss[loss=0.1408, simple_loss=0.2213, pruned_loss=0.03013, over 23591.00 frames. ], tot_loss[loss=0.1534, simple_loss=0.2339, pruned_loss=0.03641, over 4723374.88 frames. ], batch size: 149, lr: 2.12e-03, grad_scale: 32.0 2023-10-04 13:56:33,159 WARNING [train.py:1204] (2/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_32-65962-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 13:56:34,555 WARNING [train.py:1204] (2/4) Exclude cut with ID _15458_210_8341_1_1530858614263_3834410_40-298664-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 13:56:40,880 WARNING [train.py:1204] (2/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_380-261863-0_sp1.1 from training. Duration: 0.963625 2023-10-04 13:56:42,264 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_372-173012-0_sp1.1 from training. Duration: 0.863625 2023-10-04 13:56:43,629 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_57-82205-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 13:56:43,685 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_3691_210_3528_1_1526468169_4062053_355-334432-0_sp0.9 from training. Duration: 0.9666875 2023-10-04 13:56:45,086 WARNING [train.py:1204] (2/4) Exclude cut with ID _19023_210_4353_1_1531378892552_5820780_28-36615-0 from training. Duration: 0.97 2023-10-04 13:56:45,091 WARNING [train.py:1204] (2/4) Exclude cut with ID _29853_210_7497_1_1532505600851_6642110_286-251333-0_sp1.1 from training. Duration: 0.92725 2023-10-04 13:56:48,247 WARNING [train.py:1204] (2/4) Exclude cut with ID _31602_210_5854_1_1532677021456_4458250_518-80679-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 13:56:48,296 WARNING [train.py:1204] (2/4) Exclude cut with ID _14922_210_6228_1_1530707435273_3693310_38-187851-0_sp0.9 from training. Duration: 0.688875 2023-10-04 13:56:57,100 WARNING [train.py:1204] (2/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_39-248870-0_sp1.1 from training. Duration: 0.62725 2023-10-04 13:56:57,136 WARNING [train.py:1204] (2/4) Exclude cut with ID _16908_210_5724_1_1531119157248_3772700_154-31455-0_sp1.1 from training. Duration: 0.97275 2023-10-04 13:56:59,860 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8453_210_4211_1_1528502940_8562730_347-330673-0 from training. Duration: 0.72 2023-10-04 13:57:01,332 WARNING [train.py:1204] (2/4) Exclude cut with ID _30191_210_1664_1_1533189915047_7196670_674-204247-0_sp1.1 from training. Duration: 0.97275 2023-10-04 13:57:02,684 WARNING [train.py:1204] (2/4) Exclude cut with ID _8833_210_5219_1_1528592665210_7553059_329-125156-0 from training. Duration: 0.82 2023-10-04 13:57:02,725 WARNING [train.py:1204] (2/4) Exclude cut with ID _4779_210_2084_1_1527318022944_3914893_500-285013-0_sp1.1 from training. Duration: 0.62725 2023-10-04 13:57:05,505 WARNING [train.py:1204] (2/4) Exclude cut with ID _14421_210_8750_1_1530604795825_3542290_190-313578-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 13:57:05,756 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.attention_skip_rate, batch_count=1678266.6666666667, ans=0.0 2023-10-04 13:57:08,707 WARNING [train.py:1204] (2/4) Exclude cut with ID _35799_210_5854_1_1533263487887_3529071_538-273574-0_sp1.1 from training. Duration: 0.936375 2023-10-04 13:57:10,154 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_14928_210_5632_1_1530773461_4714000_454-112198-0_sp1.1 from training. Duration: 0.6 2023-10-04 13:57:10,207 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_468-143126-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 13:57:11,660 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_4528_210_3273_1_1527076654_3276777_258-121681-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 13:57:13,427 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_397-132565-0_sp1.1 from training. Duration: 0.9 2023-10-04 13:57:13,451 WARNING [train.py:1204] (2/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_454-123002-0_sp0.9 from training. Duration: 0.7666875 2023-10-04 13:57:18,755 WARNING [train.py:1204] (2/4) Exclude cut with ID _27052_210_5986_1_1532329197187_3585009_297-86890-0_sp1.1 from training. Duration: 0.936375 2023-10-04 13:57:20,168 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_51225_210_3601_1_1534400634_2327383_55-301844-0_sp1.1 from training. Duration: 0.8454375 2023-10-04 13:57:21,589 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6786_210_1388_1_1527839699_4194495_378-46070-0_sp0.9 from training. Duration: 0.8 2023-10-04 13:57:23,549 WARNING [train.py:1204] (2/4) Exclude cut with ID _42334_210_7770_1_1534206610396_3605880_55-19758-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 13:57:26,614 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.self_attn_weights.pos_emb_skip_rate, batch_count=1678333.3333333333, ans=0.0 2023-10-04 13:57:27,690 WARNING [train.py:1204] (2/4) Exclude cut with ID _14884_210_2252_1_1530707459689_5561250_114-201399-0_sp0.9 from training. Duration: 0.8444375 2023-10-04 13:57:33,123 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_99-251314-0_sp0.9 from training. Duration: 0.9666875 2023-10-04 13:57:34,509 WARNING [train.py:1204] (2/4) Exclude cut with ID _26528_210_9070_1_1532570410842_3851310_497-97218-0 from training. Duration: 0.86 2023-10-04 13:57:38,809 WARNING [train.py:1204] (2/4) Exclude cut with ID _55328_210_27767_1_1534580973260_4088170_230-317241-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 13:57:40,848 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_125-203105-0_sp0.9 from training. Duration: 0.888875 2023-10-04 13:57:41,078 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.conv_module2.balancer2.prob, batch_count=1678400.0, ans=0.125 2023-10-04 13:57:42,172 WARNING [train.py:1204] (2/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_55-31904-0_sp1.1 from training. Duration: 0.8 2023-10-04 13:57:43,554 WARNING [train.py:1204] (2/4) Exclude cut with ID _8064_210_5399_1_1528372299291_4282973_18-174647-0 from training. Duration: 0.91 2023-10-04 13:57:45,296 INFO [train.py:1046] (2/4) Epoch 48, batch 2100, loss[loss=0.146, simple_loss=0.2229, pruned_loss=0.03462, over 23768.00 frames. ], tot_loss[loss=0.1524, simple_loss=0.233, pruned_loss=0.0359, over 4735054.47 frames. ], batch size: 164, lr: 2.12e-03, grad_scale: 8.0 2023-10-04 13:57:46,953 WARNING [train.py:1204] (2/4) Exclude cut with ID IC0165W0005-81726 from training. Duration: 0.952 2023-10-04 13:57:46,953 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_31583_210_3852_1_1532595851_4024413_148-171847-0_sp1.1 from training. Duration: 0.963625 2023-10-04 13:57:47,108 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.1.attention_skip_rate, batch_count=1678466.6666666667, ans=0.0 2023-10-04 13:57:48,304 WARNING [train.py:1204] (2/4) Exclude cut with ID _21989_210_5854_1_1531899214931_3271360_116-138769-0_sp1.1 from training. Duration: 0.936375 2023-10-04 13:57:49,670 WARNING [train.py:1204] (2/4) Exclude cut with ID _66070_210_7640_1_1535441393096_7805310_489-292631-0_sp1.1 from training. Duration: 0.7181875 2023-10-04 13:57:49,755 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_20120_210_6753_1_1531643035_5482859_202-27960-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 13:57:49,763 WARNING [train.py:1204] (2/4) Exclude cut with ID _5294_210_1747_1_1527249643928_3696179_342-270278-0 from training. Duration: 0.55 2023-10-04 13:57:51,041 WARNING [train.py:1204] (2/4) Exclude cut with ID _38323_210_12108_1_1533121233369_3303049_416-11919-0 from training. Duration: 0.96 2023-10-04 13:57:52,382 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6545_210_2040_1_1527774670_4245258_407-48070-0_sp0.9 from training. Duration: 0.8444375 2023-10-04 13:57:55,616 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.644e+02 2.079e+02 2.318e+02 2.598e+02 4.333e+02, threshold=4.637e+02, percent-clipped=0.0 2023-10-04 13:57:55,720 WARNING [train.py:1204] (2/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_503-30719-0_sp1.1 from training. Duration: 0.8909375 2023-10-04 13:57:55,778 WARNING [train.py:1204] (2/4) Exclude cut with ID _49643_210_7989_1_1534640244224_3939031_234-326497-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 13:57:58,471 WARNING [train.py:1204] (2/4) Exclude cut with ID _14170_210_5219_1_1530525949868_4469650_301-77873-0_sp1.1 from training. Duration: 0.963625 2023-10-04 13:57:59,819 WARNING [train.py:1204] (2/4) Exclude cut with ID _45297_210_2721_1_1533715351381_3264949_152-154697-0_sp1.1 from training. Duration: 0.97275 2023-10-04 13:57:59,833 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_3371_210_1794_1_1526384462_5580777_396-339812-0 from training. Duration: 0.85 2023-10-04 13:58:01,168 WARNING [train.py:1204] (2/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_399-156791-0_sp1.1 from training. Duration: 0.7545625 2023-10-04 13:58:01,214 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7696_210_2040_1_1528207167_3716099_117-125434-0 from training. Duration: 0.76 2023-10-04 13:58:01,219 WARNING [train.py:1204] (2/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_96-244299-0 from training. Duration: 0.88 2023-10-04 13:58:02,630 WARNING [train.py:1204] (2/4) Exclude cut with ID _37892_210_19596_1_1533198677897_3964058_519-343462-0_sp1.1 from training. Duration: 0.92725 2023-10-04 13:58:02,653 WARNING [train.py:1204] (2/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_202-183922-0_sp1.1 from training. Duration: 0.77275 2023-10-04 13:58:02,659 WARNING [train.py:1204] (2/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_453-133983-0 from training. Duration: 0.78 2023-10-04 13:58:03,980 WARNING [train.py:1204] (2/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_178-39722-0_sp1.1 from training. Duration: 0.4454375 2023-10-04 13:58:08,290 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_15126_210_6753_1_1530778677_2591185_113-157651-0 from training. Duration: 0.68 2023-10-04 13:58:08,291 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_271-234962-0_sp1.1 from training. Duration: 0.7181875 2023-10-04 13:58:12,786 WARNING [train.py:1204] (2/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_195-64967-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 13:58:12,828 WARNING [train.py:1204] (2/4) Exclude cut with ID _43552_210_8913_1_1533726031059_3523429_365-215065-0_sp1.1 from training. Duration: 0.936375 2023-10-04 13:58:13,095 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.conv_module1.balancer2.prob, batch_count=1678600.0, ans=0.125 2023-10-04 13:58:14,960 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.out_combiner.scale_min, batch_count=1678600.0, ans=0.2 2023-10-04 13:58:17,433 WARNING [train.py:1204] (2/4) Exclude cut with ID _31602_210_5854_1_1532677021456_4458250_249-96604-0_sp0.9 from training. Duration: 0.988875 2023-10-04 13:58:17,479 WARNING [train.py:1204] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_307-158462-0 from training. Duration: 0.69 2023-10-04 13:58:17,532 WARNING [train.py:1204] (2/4) Exclude cut with ID _30918_210_6400_1_1532754078600_3125920_407-223394-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 13:58:17,535 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6673_210_3273_1_1527767790_3173240_351-167954-0_sp0.9 from training. Duration: 0.5333125 2023-10-04 13:58:20,286 WARNING [train.py:1204] (2/4) Exclude cut with ID _4866_210_2577_1_1527404412284_4364659_256-306830-0 from training. Duration: 0.87 2023-10-04 13:58:20,332 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_17613_210_2151_1_1531137569_2366160_222-131118-0_sp1.1 from training. Duration: 0.963625 2023-10-04 13:58:20,362 WARNING [train.py:1204] (2/4) Exclude cut with ID _45560_210_21982_1_1533812603951_3584999_111-6376-0 from training. Duration: 0.84 2023-10-04 13:58:21,628 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_216-73841-0 from training. Duration: 0.96 2023-10-04 13:58:21,672 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_14058_210_4163_1_1530690953_6327180_49-210687-0 from training. Duration: 0.94 2023-10-04 13:58:23,095 WARNING [train.py:1204] (2/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_506-42614-0_sp0.9 from training. Duration: 0.87775 2023-10-04 13:58:26,229 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_3133_210_2087_1_1526106446_2324690_25-332138-0_sp1.1 from training. Duration: 0.82725 2023-10-04 13:58:27,700 WARNING [train.py:1204] (2/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_589-9070-0_sp1.1 from training. Duration: 0.6454375 2023-10-04 13:58:29,407 WARNING [train.py:1204] (2/4) Exclude cut with ID _6194_210_1534_1_1527511826160_3941194_14-282876-0_sp1.1 from training. Duration: 0.6454375 2023-10-04 13:58:30,715 WARNING [train.py:1204] (2/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_1166-90654-0_sp1.1 from training. Duration: 0.92725 2023-10-04 13:58:32,158 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_14056_210_4163_1_1530604483_7572827_218-255858-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 13:58:32,160 WARNING [train.py:1204] (2/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_1285-256622-0 from training. Duration: 0.84 2023-10-04 13:58:32,179 WARNING [train.py:1204] (2/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_205-286261-0_sp1.1 from training. Duration: 0.963625 2023-10-04 13:58:32,193 WARNING [train.py:1204] (2/4) Exclude cut with ID _68777_210_4353_1_1535810390731_3117904_6-154119-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 13:58:33,523 WARNING [train.py:1204] (2/4) Exclude cut with ID _22982_210_1883_1_1531882790447_5449409_565-199393-0_sp1.1 from training. Duration: 0.92725 2023-10-04 13:58:33,540 WARNING [train.py:1204] (2/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_278-154492-0 from training. Duration: 0.76 2023-10-04 13:58:34,138 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.4.encoder.layers.1.feed_forward1.out_whiten, num_groups=1, num_channels=384, metric=8.82 vs. limit=15.0 2023-10-04 13:58:36,244 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_426-313557-0 from training. Duration: 0.53 2023-10-04 13:58:36,299 WARNING [train.py:1204] (2/4) Exclude cut with ID _41817_210_18835_1_1533434477160_7423049_555-293448-0 from training. Duration: 0.8 2023-10-04 13:58:39,107 WARNING [train.py:1204] (2/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_32-335103-0_sp1.1 from training. Duration: 0.7454375 2023-10-04 13:58:41,700 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_2655_210_2087_1_1525688959_3897400_202-147557-0_sp1.1 from training. Duration: 0.77275 2023-10-04 13:58:42,967 WARNING [train.py:1204] (2/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_158-250116-0 from training. Duration: 0.62 2023-10-04 13:58:43,143 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.conv_module1.balancer2.prob, batch_count=1678733.3333333333, ans=0.125 2023-10-04 13:58:48,009 WARNING [train.py:1204] (2/4) Exclude cut with ID _55429_210_10626_1_1534467588527_7174143_528-88083-0_sp1.1 from training. Duration: 0.963625 2023-10-04 13:58:49,414 WARNING [train.py:1204] (2/4) Exclude cut with ID _8030_210_7497_1_1528444886781_3531089_32-31837-0_sp1.1 from training. Duration: 0.8818125 2023-10-04 13:58:49,433 WARNING [train.py:1204] (2/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_744-86168-0_sp1.1 from training. Duration: 0.97275 2023-10-04 13:58:51,144 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1113-7369-0_sp0.9 from training. Duration: 0.9444375 2023-10-04 13:58:51,161 WARNING [train.py:1204] (2/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_124-345840-0_sp0.9 from training. Duration: 0.57775 2023-10-04 13:58:51,208 WARNING [train.py:1204] (2/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_328-264776-0_sp0.9 from training. Duration: 0.7444375 2023-10-04 13:58:52,613 WARNING [train.py:1204] (2/4) Exclude cut with ID _18895_210_6228_1_1531357274234_3784930_469-166615-0_sp1.1 from training. Duration: 0.963625 2023-10-04 13:58:52,620 WARNING [train.py:1204] (2/4) Exclude cut with ID _8395_210_1531_1_1528597871523_4411579_431-132470-0_sp0.9 from training. Duration: 0.82225 2023-10-04 13:58:52,705 WARNING [train.py:1204] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_554-45617-0_sp1.1 from training. Duration: 0.9 2023-10-04 13:58:52,727 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_35361_210_4238_1_1533262810_7793476_524-188242-0_sp1.1 from training. Duration: 0.92725 2023-10-04 13:58:55,372 WARNING [train.py:1204] (2/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_938-205135-0 from training. Duration: 0.87 2023-10-04 13:58:56,763 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_5931_210_2040_1_1527428481_4547999_321-193614-0 from training. Duration: 0.67 2023-10-04 13:58:56,775 WARNING [train.py:1204] (2/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_445-269521-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 13:58:59,285 INFO [train.py:1046] (2/4) Epoch 48, batch 2150, loss[loss=0.1459, simple_loss=0.2196, pruned_loss=0.0361, over 22831.00 frames. ], tot_loss[loss=0.1517, simple_loss=0.2318, pruned_loss=0.03574, over 4715262.37 frames. ], batch size: 322, lr: 2.12e-03, grad_scale: 8.0 2023-10-04 13:59:00,705 WARNING [train.py:1204] (2/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_3-33116-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 13:59:00,706 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_4633_210_2040_1_1526824163_4145343_416-76117-0_sp1.1 from training. Duration: 0.863625 2023-10-04 13:59:00,743 WARNING [train.py:1204] (2/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_1033-274376-0_sp1.1 from training. Duration: 0.7909375 2023-10-04 13:59:00,781 WARNING [train.py:1204] (2/4) Exclude cut with ID _8772_210_1850_1_1528635595972_3664011_398-320049-0_sp0.9 from training. Duration: 0.97775 2023-10-04 13:59:02,951 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.3.encoder.layers.0.self_attn_weights.whiten_keys, num_groups=8, num_channels=256, metric=4.10 vs. limit=6.0 2023-10-04 13:59:04,992 WARNING [train.py:1204] (2/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_321-215742-0_sp0.9 from training. Duration: 0.5555625 2023-10-04 13:59:07,640 WARNING [train.py:1204] (2/4) Exclude cut with ID _28394_210_8913_1_1532430049518_3650118_272-190950-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 13:59:09,388 WARNING [train.py:1204] (2/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_31-299371-0_sp1.1 from training. Duration: 0.92725 2023-10-04 13:59:10,863 WARNING [train.py:1204] (2/4) Exclude cut with ID _55383_210_10635_1_1534468426172_2231669_134-128246-0_sp0.9 from training. Duration: 0.988875 2023-10-04 13:59:10,866 WARNING [train.py:1204] (2/4) Exclude cut with ID _40519_210_13794_1_1533715228655_7337390_887-9547-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 13:59:12,230 WARNING [train.py:1204] (2/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_310-109461-0_sp0.9 from training. Duration: 0.888875 2023-10-04 13:59:12,469 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.balancer2.prob, batch_count=1678866.6666666667, ans=0.125 2023-10-04 13:59:16,200 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_2009_210_2071_1_1525481134_4588937_258-250196-0_sp1.1 from training. Duration: 0.92725 2023-10-04 13:59:16,261 WARNING [train.py:1204] (2/4) Exclude cut with ID _9235_210_5219_1_1528891154253_8046120_1186-72702-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 13:59:16,263 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_14831_210_7927_1_1530700200_5583610_181-27467-0_sp1.1 from training. Duration: 0.82725 2023-10-04 13:59:19,277 WARNING [train.py:1204] (2/4) Exclude cut with ID _40402_210_18219_1_1533522324115_2953150_278-132109-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 13:59:19,466 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.feed_forward3.hidden_balancer.prob, batch_count=1678866.6666666667, ans=0.125 2023-10-04 13:59:21,160 WARNING [train.py:1204] (2/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_261-297503-0 from training. Duration: 0.99 2023-10-04 13:59:23,503 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.0.layers.0.self_attn2.whiten, num_groups=1, num_channels=192, metric=12.41 vs. limit=22.5 2023-10-04 13:59:23,503 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.0.self_attn2.whiten.whitening_limit, batch_count=1678866.6666666667, ans=22.5 2023-10-04 13:59:25,257 WARNING [train.py:1204] (2/4) Exclude cut with ID _19896_210_1883_1_1531709022662_6649569_313-265903-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 13:59:25,401 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.bypass_mid.scale_min, batch_count=1678866.6666666667, ans=0.2 2023-10-04 13:59:26,629 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_105-114797-0_sp1.1 from training. Duration: 0.763625 2023-10-04 13:59:27,921 WARNING [train.py:1204] (2/4) Exclude cut with ID _53755_210_8254_1_1534577312941_4707360_9-65843-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 13:59:27,944 WARNING [train.py:1204] (2/4) Exclude cut with ID _18516_210_9206_1_1531704653206_3692406_361-224549-0_sp1.1 from training. Duration: 0.97275 2023-10-04 13:59:27,976 WARNING [train.py:1204] (2/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_403-310975-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 13:59:28,023 WARNING [train.py:1204] (2/4) Exclude cut with ID _5444_210_2151_1_1527418808464_5312210_581-315411-0_sp0.9 from training. Duration: 0.688875 2023-10-04 13:59:29,353 WARNING [train.py:1204] (2/4) Exclude cut with ID _23825_210_13449_1_1531915313526_3497150_464-154904-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 13:59:29,370 WARNING [train.py:1204] (2/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_937-208281-0_sp0.9 from training. Duration: 0.9444375 2023-10-04 13:59:29,440 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_37839_210_7234_1_1533542462_3689303_158-181212-0_sp1.1 from training. Duration: 0.963625 2023-10-04 13:59:30,851 WARNING [train.py:1204] (2/4) Exclude cut with ID _8772_210_1850_1_1528635595972_3664011_398-320049-0 from training. Duration: 0.88 2023-10-04 13:59:33,352 WARNING [train.py:1204] (2/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_718-250907-0_sp1.1 from training. Duration: 0.62725 2023-10-04 13:59:34,688 WARNING [train.py:1204] (2/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_641-126395-0_sp1.1 from training. Duration: 0.92725 2023-10-04 13:59:34,720 WARNING [train.py:1204] (2/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_166-241493-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 13:59:36,130 WARNING [train.py:1204] (2/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_526-62079-0_sp0.9 from training. Duration: 0.7444375 2023-10-04 13:59:36,231 WARNING [train.py:1204] (2/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_439-275083-0_sp1.1 from training. Duration: 0.936375 2023-10-04 13:59:38,897 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_66907_210_21581_1_1535615661_3590177_493-229032-0_sp1.1 from training. Duration: 0.92725 2023-10-04 13:59:38,930 WARNING [train.py:1204] (2/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_89-321955-0_sp1.1 from training. Duration: 0.72725 2023-10-04 13:59:42,140 WARNING [train.py:1204] (2/4) Exclude cut with ID _29857_210_20034_1_1532395886753_3798160_273-226195-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 13:59:42,147 WARNING [train.py:1204] (2/4) Exclude cut with ID _22826_210_9994_1_1531908201522_3279169_404-75889-0 from training. Duration: 0.89 2023-10-04 13:59:42,181 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_271-234962-0_sp0.9 from training. Duration: 0.87775 2023-10-04 13:59:45,074 WARNING [train.py:1204] (2/4) Exclude cut with ID _22961_210_8254_1_1531976366672_4278180_156-339685-0_sp1.1 from training. Duration: 0.97275 2023-10-04 13:59:45,119 WARNING [train.py:1204] (2/4) Exclude cut with ID _37892_210_19596_1_1533198677897_3964058_425-231927-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 13:59:46,994 WARNING [train.py:1204] (2/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_102-288670-0_sp1.1 from training. Duration: 0.97275 2023-10-04 13:59:47,075 WARNING [train.py:1204] (2/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_283-220372-0_sp1.1 from training. Duration: 0.5090625 2023-10-04 13:59:48,453 WARNING [train.py:1204] (2/4) Exclude cut with ID _44304_210_15374_1_1533639352765_4613129_86-1699-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 13:59:48,531 WARNING [train.py:1204] (2/4) Exclude cut with ID _2891_210_2550_1_1525874398521_4427710_327-121590-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 13:59:48,544 WARNING [train.py:1204] (2/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_369-222060-0 from training. Duration: 0.71 2023-10-04 13:59:48,716 INFO [scaling.py:1118] (2/4) WithLoss: name=encoder.encoders.1.encoder.layers.1.self_attn_weights, loss-sum=0.000e+00 2023-10-04 13:59:48,804 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.conv_module2.balancer1.prob, batch_count=1679000.0, ans=0.125 2023-10-04 13:59:51,848 WARNING [train.py:1204] (2/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_762-109886-0 from training. Duration: 0.91 2023-10-04 13:59:51,877 WARNING [train.py:1204] (2/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_477-255782-0_sp0.9 from training. Duration: 0.911125 2023-10-04 13:59:51,913 WARNING [train.py:1204] (2/4) Exclude cut with ID IC0669W0061-330906 from training. Duration: 0.768 2023-10-04 13:59:51,958 WARNING [train.py:1204] (2/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_347-213071-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 13:59:51,984 WARNING [train.py:1204] (2/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_507-303370-0_sp1.1 from training. Duration: 0.87275 2023-10-04 13:59:53,460 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.conv_module1.balancer1.prob, batch_count=1679000.0, ans=0.125 2023-10-04 13:59:54,537 WARNING [train.py:1204] (2/4) Exclude cut with ID _15012_210_1968_1_1530856820429_5095350_688-178849-0 from training. Duration: 0.64 2023-10-04 13:59:54,549 WARNING [train.py:1204] (2/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_28-70987-0_sp0.9 from training. Duration: 0.9666875 2023-10-04 13:59:54,551 WARNING [train.py:1204] (2/4) Exclude cut with ID _5910_210_1389_1_1527337972454_5050100_555-159815-0 from training. Duration: 0.98 2023-10-04 13:59:54,580 WARNING [train.py:1204] (2/4) Exclude cut with ID IC0679W0064-335871 from training. Duration: 0.768 2023-10-04 13:59:54,580 WARNING [train.py:1204] (2/4) Exclude cut with ID IC0679W0079-335886 from training. Duration: 0.896 2023-10-04 13:59:55,800 WARNING [train.py:1204] (2/4) Exclude cut with ID _5608_210_2106_1_1527415439089_6819768_909-82767-0 from training. Duration: 0.61 2023-10-04 13:59:58,512 WARNING [train.py:1204] (2/4) Exclude cut with ID _23038_210_9570_1_1532001584158_3652419_411-323967-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 13:59:58,577 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_40896_210_15710_1_1533433956_7100802_350-231748-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 13:59:58,587 WARNING [train.py:1204] (2/4) Exclude cut with ID _5289_210_2084_1_1527678202736_3779299_142-103317-0_sp0.9 from training. Duration: 0.9333125 2023-10-04 13:59:59,933 WARNING [train.py:1204] (2/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_314-144055-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 14:00:01,340 WARNING [train.py:1204] (2/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_177-236809-0_sp0.9 from training. Duration: 0.5444375 2023-10-04 14:00:02,854 WARNING [train.py:1204] (2/4) Exclude cut with ID _23038_210_9570_1_1532001584158_3652419_82-334733-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 14:00:02,870 WARNING [train.py:1204] (2/4) Exclude cut with ID _43552_210_8913_1_1533726031059_3523429_225-22526-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 14:00:04,758 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.5.encoder.layers.1.self_attn1.whiten, num_groups=1, num_channels=256, metric=13.48 vs. limit=22.5 2023-10-04 14:00:11,182 WARNING [train.py:1204] (2/4) Exclude cut with ID _30327_210_16049_1_1532484161295_3924169_401-70819-0_sp1.1 from training. Duration: 0.7818125 2023-10-04 14:00:12,991 INFO [train.py:1046] (2/4) Epoch 48, batch 2200, loss[loss=0.1423, simple_loss=0.2261, pruned_loss=0.02928, over 24566.00 frames. ], tot_loss[loss=0.1516, simple_loss=0.232, pruned_loss=0.03558, over 4726038.48 frames. ], batch size: 60, lr: 2.12e-03, grad_scale: 8.0 2023-10-04 14:00:13,051 WARNING [train.py:1204] (2/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_538-32291-0 from training. Duration: 0.83 2023-10-04 14:00:16,376 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_37357_210_17024_1_1533089504_2316121_258-211605-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 14:00:18,475 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.3.encoder.layers.2.feed_forward3.out_whiten, num_groups=1, num_channels=512, metric=6.49 vs. limit=15.0 2023-10-04 14:00:22,364 WARNING [train.py:1204] (2/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_550-335198-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 14:00:23,617 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.795e+02 2.040e+02 2.222e+02 2.604e+02 4.042e+02, threshold=4.443e+02, percent-clipped=0.0 2023-10-04 14:00:23,758 WARNING [train.py:1204] (2/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_98-741-0_sp0.9 from training. Duration: 0.888875 2023-10-04 14:00:23,804 WARNING [train.py:1204] (2/4) Exclude cut with ID _38323_210_12108_1_1533121233369_3303049_251-238988-0_sp1.1 from training. Duration: 0.92725 2023-10-04 14:00:23,999 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.nonlin_attention.balancer.prob, batch_count=1679133.3333333333, ans=0.125 2023-10-04 14:00:24,033 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.conv_module2.balancer1.prob, batch_count=1679133.3333333333, ans=0.125 2023-10-04 14:00:25,800 WARNING [train.py:1204] (2/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_259-34038-0_sp0.9 from training. Duration: 0.82225 2023-10-04 14:00:27,333 WARNING [train.py:1204] (2/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_1060-180633-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 14:00:27,363 WARNING [train.py:1204] (2/4) Exclude cut with ID _8755_210_1876_1_1528628559357_3258209_211-37473-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 14:00:27,368 WARNING [train.py:1204] (2/4) Exclude cut with ID _4122_210_2084_1_1526713164417_4353280_528-206771-0 from training. Duration: 0.88 2023-10-04 14:00:33,033 WARNING [train.py:1204] (2/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_64-248359-0 from training. Duration: 0.69 2023-10-04 14:00:36,922 WARNING [train.py:1204] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_402-253027-0_sp1.1 from training. Duration: 0.4909375 2023-10-04 14:00:41,253 WARNING [train.py:1204] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_625-102955-0 from training. Duration: 0.77 2023-10-04 14:00:42,674 WARNING [train.py:1204] (2/4) Exclude cut with ID _14998_210_5219_1_1530705582080_7474970_611-219324-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 14:00:44,613 WARNING [train.py:1204] (2/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_718-68505-0_sp0.9 from training. Duration: 0.988875 2023-10-04 14:00:45,764 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_353-182855-0_sp1.1 from training. Duration: 0.87275 2023-10-04 14:00:47,385 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_5960_210_1794_1_1527403865_3990999_515-2613-0_sp0.9 from training. Duration: 0.97775 2023-10-04 14:00:49,272 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_5575_210_1410_1_1527301305_2570170_342-265572-0 from training. Duration: 0.94 2023-10-04 14:00:52,004 WARNING [train.py:1204] (2/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_329-113173-0_sp0.9 from training. Duration: 0.688875 2023-10-04 14:00:52,092 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_15396_210_6890_1_1531308473_1938936_213-312506-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 14:00:52,147 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_521-214276-0_sp1.1 from training. Duration: 0.4 2023-10-04 14:00:57,129 WARNING [train.py:1204] (2/4) Exclude cut with ID _15026_210_8750_1_1530777602316_3291850_211-195043-0_sp1.1 from training. Duration: 0.72725 2023-10-04 14:00:58,599 WARNING [train.py:1204] (2/4) Exclude cut with ID _29343_210_4381_1_1532602757752_3674109_108-257494-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 14:01:01,257 WARNING [train.py:1204] (2/4) Exclude cut with ID _64414_210_17301_1_1535243252048_3757759_403-58151-0_sp1.1 from training. Duration: 0.963625 2023-10-04 14:01:02,590 WARNING [train.py:1204] (2/4) Exclude cut with ID _36512_210_6987_1_1533033143998_3126069_109-177787-0_sp1.1 from training. Duration: 0.92725 2023-10-04 14:01:02,848 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.conv_module2.balancer2.prob, batch_count=1679333.3333333333, ans=0.125 2023-10-04 14:01:04,031 WARNING [train.py:1204] (2/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_498-40153-0 from training. Duration: 0.63 2023-10-04 14:01:04,109 WARNING [train.py:1204] (2/4) Exclude cut with ID _56447_210_1389_1_1535074245910_3697189_161-334523-0_sp1.1 from training. Duration: 0.936375 2023-10-04 14:01:05,538 WARNING [train.py:1204] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_238-245655-0 from training. Duration: 0.81 2023-10-04 14:01:05,879 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.conv_module1.balancer1.prob, batch_count=1679333.3333333333, ans=0.125 2023-10-04 14:01:06,317 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.3.encoder.layers.0.conv_module2.whiten, num_groups=1, num_channels=512, metric=3.17 vs. limit=15.0 2023-10-04 14:01:06,947 WARNING [train.py:1204] (2/4) Exclude cut with ID _64631_210_27767_1_1535349580068_4165160_108-43865-0_sp1.1 from training. Duration: 0.92725 2023-10-04 14:01:06,952 WARNING [train.py:1204] (2/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_243-42116-0_sp1.1 from training. Duration: 0.663625 2023-10-04 14:01:06,986 WARNING [train.py:1204] (2/4) Exclude cut with ID _66868_210_9527_1_1535509287954_4561140_594-132160-0_sp1.1 from training. Duration: 0.92725 2023-10-04 14:01:09,584 WARNING [train.py:1204] (2/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_48-338976-0_sp0.9 from training. Duration: 0.988875 2023-10-04 14:01:09,631 WARNING [train.py:1204] (2/4) Exclude cut with ID _5250_210_5219_1_1527249561806_8692110_137-100595-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 14:01:09,637 WARNING [train.py:1204] (2/4) Exclude cut with ID _49748_210_24116_1_1533986971737_3158910_12-271669-0_sp1.1 from training. Duration: 0.936375 2023-10-04 14:01:10,920 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_37357_210_17024_1_1533089504_2316121_206-80421-0_sp1.1 from training. Duration: 0.936375 2023-10-04 14:01:12,246 WARNING [train.py:1204] (2/4) Exclude cut with ID _7537_210_2084_1_1528502395431_3924359_113-199933-0_sp1.1 from training. Duration: 0.536375 2023-10-04 14:01:12,289 WARNING [train.py:1204] (2/4) Exclude cut with ID _55440_210_12651_1_1534496713364_2980520_31-59289-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 14:01:15,137 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1083-220122-0_sp0.9 from training. Duration: 0.5666875 2023-10-04 14:01:17,829 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.2.encoder.layers.2.self_attn2.whiten, num_groups=1, num_channels=384, metric=20.11 vs. limit=22.5 2023-10-04 14:01:18,419 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_426-313557-0_sp1.1 from training. Duration: 0.4818125 2023-10-04 14:01:18,459 WARNING [train.py:1204] (2/4) Exclude cut with ID _35799_210_5854_1_1533263487887_3529071_324-94843-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 14:01:21,089 WARNING [train.py:1204] (2/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_264-108685-0_sp1.1 from training. Duration: 0.5 2023-10-04 14:01:23,028 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9012W0006-501141 from training. Duration: 0.896 2023-10-04 14:01:26,394 WARNING [train.py:1204] (2/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_890-5117-0_sp0.9 from training. Duration: 0.8666875 2023-10-04 14:01:26,449 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9023W0008-506301 from training. Duration: 0.896 2023-10-04 14:01:27,697 INFO [train.py:1046] (2/4) Epoch 48, batch 2250, loss[loss=0.1497, simple_loss=0.2415, pruned_loss=0.02896, over 24346.00 frames. ], tot_loss[loss=0.1519, simple_loss=0.2326, pruned_loss=0.03557, over 4739531.54 frames. ], batch size: 74, lr: 2.12e-03, grad_scale: 8.0 2023-10-04 14:01:27,816 WARNING [train.py:1204] (2/4) Exclude cut with ID _13916_210_9994_1_1530788341126_3763149_60-191556-0_sp0.9 from training. Duration: 0.67775 2023-10-04 14:01:29,102 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9029W0331-509721 from training. Duration: 0.896 2023-10-04 14:01:30,626 WARNING [train.py:1204] (2/4) Exclude cut with ID _35823_210_6917_1_1533187881914_7297830_567-108596-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 14:01:30,669 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7543_210_6753_1_1528544010_1110402_25-183860-0_sp1.1 from training. Duration: 0.42725 2023-10-04 14:01:32,051 WARNING [train.py:1204] (2/4) Exclude cut with ID _47278_210_12041_1_1533902418349_3576110_495-260035-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 14:01:33,546 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9051W0015-520281 from training. Duration: 0.9045625 2023-10-04 14:01:36,100 WARNING [train.py:1204] (2/4) Exclude cut with ID _14459_210_2228_1_1531443641946_7516700_707-66193-0_sp1.1 from training. Duration: 0.97275 2023-10-04 14:01:36,382 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.conv_module2.balancer1.max_abs, batch_count=1679466.6666666667, ans=10.0 2023-10-04 14:01:37,485 WARNING [train.py:1204] (2/4) Exclude cut with ID _14087_210_2253_1_1530529224512_3014029_219-224445-0_sp1.1 from training. Duration: 0.763625 2023-10-04 14:01:44,552 WARNING [train.py:1204] (2/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_345-167986-0_sp0.9 from training. Duration: 0.9333125 2023-10-04 14:01:44,813 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.out_combiner.scale_min, batch_count=1679533.3333333333, ans=0.2 2023-10-04 14:01:46,078 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_34815_210_6388_1_1533381320_3571903_279-305565-0_sp1.1 from training. Duration: 0.836375 2023-10-04 14:01:47,716 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.conv_module1.balancer2.prob, batch_count=1679533.3333333333, ans=0.125 2023-10-04 14:01:48,141 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.4.encoder.layers.0.self_attn2.whiten, num_groups=1, num_channels=384, metric=15.55 vs. limit=22.5 2023-10-04 14:01:48,913 WARNING [train.py:1204] (2/4) Exclude cut with ID _21987_210_5854_1_1531821500482_3637038_317-179212-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 14:01:48,976 WARNING [train.py:1204] (2/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_395-37222-0_sp1.1 from training. Duration: 0.6909375 2023-10-04 14:01:50,874 WARNING [train.py:1204] (2/4) Exclude cut with ID _8064_210_5399_1_1528372299291_4282973_168-50337-0_sp1.1 from training. Duration: 0.763625 2023-10-04 14:01:53,552 WARNING [train.py:1204] (2/4) Exclude cut with ID _60113_210_16271_1_1535164212481_4386079_559-167191-0 from training. Duration: 0.93 2023-10-04 14:01:53,562 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_47228_210_4983_1_1533780021_3672027_344-179642-0_sp1.1 from training. Duration: 0.92725 2023-10-04 14:01:53,593 WARNING [train.py:1204] (2/4) Exclude cut with ID _22961_210_8254_1_1531976366672_4278180_317-199546-0_sp0.9 from training. Duration: 0.9555625 2023-10-04 14:01:56,349 WARNING [train.py:1204] (2/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_141-89727-0 from training. Duration: 0.71 2023-10-04 14:01:58,039 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_78-116075-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 14:01:58,051 WARNING [train.py:1204] (2/4) Exclude cut with ID _64086_210_7994_1_1535270417019_7435949_52-235756-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 14:01:59,474 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7615_210_4211_1_1528110965_4580160_72-235211-0_sp1.1 from training. Duration: 0.6909375 2023-10-04 14:02:04,231 WARNING [train.py:1204] (2/4) Exclude cut with ID _14069_210_5632_1_1530512692033_4803390_287-27950-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 14:02:04,325 WARNING [train.py:1204] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_74-48717-0_sp0.9 from training. Duration: 0.6444375 2023-10-04 14:02:04,348 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7544_210_6753_1_1528591327_1471426_172-348777-0_sp1.1 from training. Duration: 0.6 2023-10-04 14:02:06,937 WARNING [train.py:1204] (2/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_220-191080-0 from training. Duration: 0.98 2023-10-04 14:02:08,329 WARNING [train.py:1204] (2/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_1081-137641-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 14:02:09,818 WARNING [train.py:1204] (2/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_278-235459-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 14:02:13,838 WARNING [train.py:1204] (2/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_1181-77470-0_sp1.1 from training. Duration: 0.936375 2023-10-04 14:02:15,194 WARNING [train.py:1204] (2/4) Exclude cut with ID _60860_210_8254_1_1534896015875_3557169_198-5531-0_sp1.1 from training. Duration: 0.936375 2023-10-04 14:02:16,609 WARNING [train.py:1204] (2/4) Exclude cut with ID _12369_210_4381_1_1530356078581_4388520_17-289781-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 14:02:16,614 WARNING [train.py:1204] (2/4) Exclude cut with ID _50310_210_15488_1_1534068043345_3641080_146-199752-0_sp1.1 from training. Duration: 0.92725 2023-10-04 14:02:19,801 WARNING [train.py:1204] (2/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_969-213354-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 14:02:19,890 WARNING [train.py:1204] (2/4) Exclude cut with ID _70519_210_4163_1_1535961595994_7458230_66-344156-0_sp1.1 from training. Duration: 0.963625 2023-10-04 14:02:21,986 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder_embed.conv.5.prob, batch_count=1679666.6666666667, ans=0.125 2023-10-04 14:02:24,419 WARNING [train.py:1204] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_380-204756-0_sp1.1 from training. Duration: 0.7818125 2023-10-04 14:02:27,590 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_128-202104-0_sp0.9 from training. Duration: 0.7 2023-10-04 14:02:33,443 WARNING [train.py:1204] (2/4) Exclude cut with ID _4697_210_2069_1_1526815801953_3823098_51-1049-0_sp0.9 from training. Duration: 0.8333125 2023-10-04 14:02:33,462 WARNING [train.py:1204] (2/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_86-66192-0_sp1.1 from training. Duration: 0.5 2023-10-04 14:02:33,525 WARNING [train.py:1204] (2/4) Exclude cut with ID _14069_210_5632_1_1530512692033_4803390_95-1896-0_sp1.1 from training. Duration: 0.8909375 2023-10-04 14:02:35,433 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.nonlin_attention.whiten2.whitening_limit, batch_count=1679733.3333333333, ans=15.0 2023-10-04 14:02:37,696 WARNING [train.py:1204] (2/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_29-293896-0_sp1.1 from training. Duration: 0.5545625 2023-10-04 14:02:38,001 INFO [scaling.py:1118] (2/4) WithLoss: name=encoder.encoders.5.encoder.layers.0.self_attn_weights, loss-sum=0.000e+00 2023-10-04 14:02:39,159 WARNING [train.py:1204] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_167-137255-0_sp0.9 from training. Duration: 0.711125 2023-10-04 14:02:39,160 WARNING [train.py:1204] (2/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_217-95056-0 from training. Duration: 0.98 2023-10-04 14:02:40,415 INFO [train.py:1046] (2/4) Epoch 48, batch 2300, loss[loss=0.1545, simple_loss=0.2294, pruned_loss=0.03984, over 23462.00 frames. ], tot_loss[loss=0.1528, simple_loss=0.2338, pruned_loss=0.03595, over 4740769.83 frames. ], batch size: 120, lr: 2.12e-03, grad_scale: 8.0 2023-10-04 14:02:40,464 WARNING [train.py:1204] (2/4) Exclude cut with ID _47278_210_12041_1_1533902418349_3576110_97-266746-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 14:02:40,519 WARNING [train.py:1204] (2/4) Exclude cut with ID _14224_210_4381_1_1530529062037_4264010_380-343670-0_sp1.1 from training. Duration: 0.8 2023-10-04 14:02:43,316 WARNING [train.py:1204] (2/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_107-332398-0 from training. Duration: 0.78 2023-10-04 14:02:44,698 WARNING [train.py:1204] (2/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_91-267696-0_sp1.1 from training. Duration: 0.7909375 2023-10-04 14:02:44,727 WARNING [train.py:1204] (2/4) Exclude cut with ID _41298_210_9019_1_1533430371450_4822143_498-186993-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 14:02:50,761 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.746e+02 2.018e+02 2.211e+02 2.601e+02 3.731e+02, threshold=4.421e+02, percent-clipped=0.0 2023-10-04 14:02:50,863 WARNING [train.py:1204] (2/4) Exclude cut with ID _17956_210_4353_1_1531209568125_6979970_54-77610-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 14:02:51,532 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_40047_210_2290_1_1533290519_2374311_257-335162-0_sp1.1 from training. Duration: 0.863625 2023-10-04 14:02:52,783 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9653W0193-675471 from training. Duration: 0.9656875 2023-10-04 14:02:54,153 WARNING [train.py:1204] (2/4) Exclude cut with ID _25248_210_8750_1_1532062825941_5554180_243-240536-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 14:03:02,049 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_32360_210_4198_1_1532689160_3648436_60-51908-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 14:03:02,058 WARNING [train.py:1204] (2/4) Exclude cut with ID _4092_210_2151_1_1526643044449_3135759_227-213972-0_sp0.9 from training. Duration: 0.6 2023-10-04 14:03:02,087 WARNING [train.py:1204] (2/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_392-173500-0_sp1.1 from training. Duration: 0.97275 2023-10-04 14:03:02,126 WARNING [train.py:1204] (2/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_71-329150-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 14:03:02,135 WARNING [train.py:1204] (2/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_866-173440-0 from training. Duration: 0.9 2023-10-04 14:03:03,508 WARNING [train.py:1204] (2/4) Exclude cut with ID _14832_210_8750_1_1530691351539_3171348_1-54315-0_sp0.9 from training. Duration: 0.9666875 2023-10-04 14:03:06,276 WARNING [train.py:1204] (2/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_430-116139-0_sp1.1 from training. Duration: 0.77275 2023-10-04 14:03:07,559 WARNING [train.py:1204] (2/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_356-45054-0_sp0.9 from training. Duration: 0.9555625 2023-10-04 14:03:09,087 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_3531_210_3528_1_1526294203_5216456_309-227302-0_sp0.9 from training. Duration: 0.7666875 2023-10-04 14:03:10,665 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.ff3_skip_rate, batch_count=1679933.3333333333, ans=0.0 2023-10-04 14:03:11,843 WARNING [train.py:1204] (2/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_301-330455-0_sp1.1 from training. Duration: 0.72725 2023-10-04 14:03:14,528 WARNING [train.py:1204] (2/4) Exclude cut with ID _23557_210_8750_1_1531890106370_6846255_667-145945-0_sp1.1 from training. Duration: 0.92725 2023-10-04 14:03:19,109 WARNING [train.py:1204] (2/4) Exclude cut with ID _14860_210_4915_1_1530702020340_7343240_836-328489-0_sp1.1 from training. Duration: 0.8181875 2023-10-04 14:03:20,432 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_37152_210_9697_1_1533106573_3844188_416-333249-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 14:03:22,971 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.3.encoder.layers.2.self_attn1.whiten, num_groups=1, num_channels=512, metric=13.47 vs. limit=22.5 2023-10-04 14:03:23,647 WARNING [train.py:1204] (2/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_538-32291-0_sp0.9 from training. Duration: 0.92225 2023-10-04 14:03:28,975 WARNING [train.py:1204] (2/4) Exclude cut with ID _32704_210_8954_1_1533017488840_6747370_965-176824-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 14:03:32,253 WARNING [train.py:1204] (2/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_86-19601-0_sp1.1 from training. Duration: 0.77275 2023-10-04 14:03:33,668 WARNING [train.py:1204] (2/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_436-318113-0_sp1.1 from training. Duration: 0.6818125 2023-10-04 14:03:34,971 WARNING [train.py:1204] (2/4) Exclude cut with ID _41817_210_18835_1_1533434477160_7423049_555-293448-0_sp0.9 from training. Duration: 0.888875 2023-10-04 14:03:34,982 WARNING [train.py:1204] (2/4) Exclude cut with ID _4697_210_2069_1_1526815801953_3823098_68-219807-0 from training. Duration: 0.47 2023-10-04 14:03:36,541 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.conv_skip_rate, batch_count=1680000.0, ans=0.0 2023-10-04 14:03:37,794 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7544_210_6753_1_1528591327_1471426_33-346031-0_sp1.1 from training. Duration: 0.5545625 2023-10-04 14:03:37,982 INFO [scaling.py:1118] (2/4) WithLoss: name=encoder.encoders.1.encoder.layers.1.self_attn_weights, loss-sum=0.000e+00 2023-10-04 14:03:39,111 WARNING [train.py:1204] (2/4) Exclude cut with ID _49748_210_24116_1_1533986971737_3158910_263-52113-0_sp1.1 from training. Duration: 0.97275 2023-10-04 14:03:39,167 WARNING [train.py:1204] (2/4) Exclude cut with ID _64639_210_5816_1_1535367619437_3528000_340-24580-0_sp1.1 from training. Duration: 0.92725 2023-10-04 14:03:39,184 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_47228_210_4983_1_1533780021_3672027_479-114941-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 14:03:39,211 WARNING [train.py:1204] (2/4) Exclude cut with ID _44585_210_18628_1_1533605350387_4518199_506-119659-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 14:03:40,481 WARNING [train.py:1204] (2/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_314-126819-0_sp1.1 from training. Duration: 0.4545625 2023-10-04 14:03:40,484 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_2541_210_1794_1_1525590269_3084765_156-173713-0_sp0.9 from training. Duration: 0.9 2023-10-04 14:03:41,956 WARNING [train.py:1204] (2/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_108-255430-0 from training. Duration: 0.97 2023-10-04 14:03:41,972 WARNING [train.py:1204] (2/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_791-45149-0_sp1.1 from training. Duration: 0.7818125 2023-10-04 14:03:41,977 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_53378_210_17282_1_1534227054_1261425_10-118295-0_sp1.1 from training. Duration: 0.97275 2023-10-04 14:03:42,033 WARNING [train.py:1204] (2/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_148-158509-0 from training. Duration: 0.41 2023-10-04 14:03:46,270 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_34726_210_4525_1_1533019474_5030395_687-291305-0_sp1.1 from training. Duration: 0.936375 2023-10-04 14:03:51,028 WARNING [train.py:1204] (2/4) Exclude cut with ID _14860_210_4915_1_1530702020340_7343240_264-324537-0_sp1.1 from training. Duration: 0.963625 2023-10-04 14:03:53,879 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_41251_210_15368_1_1533429767_4820695_591-46386-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 14:03:53,905 WARNING [train.py:1204] (2/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_86-19601-0_sp0.9 from training. Duration: 0.9444375 2023-10-04 14:03:53,924 WARNING [train.py:1204] (2/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_401-255491-0_sp0.9 from training. Duration: 0.77775 2023-10-04 14:03:55,878 WARNING [train.py:1204] (2/4) Exclude cut with ID _8696_210_1876_1_1528545632440_3345058_209-54069-0_sp1.1 from training. Duration: 0.6090625 2023-10-04 14:03:55,887 WARNING [train.py:1204] (2/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_369-325184-0_sp1.1 from training. Duration: 0.9 2023-10-04 14:03:55,957 WARNING [train.py:1204] (2/4) Exclude cut with ID _14687_210_5854_1_1530703838259_3664889_115-239046-0_sp0.9 from training. Duration: 0.7444375 2023-10-04 14:03:57,263 INFO [train.py:1046] (2/4) Epoch 48, batch 2350, loss[loss=0.1616, simple_loss=0.2321, pruned_loss=0.04555, over 23449.00 frames. ], tot_loss[loss=0.1534, simple_loss=0.2344, pruned_loss=0.03623, over 4738373.15 frames. ], batch size: 285, lr: 2.12e-03, grad_scale: 8.0 2023-10-04 14:03:57,351 WARNING [train.py:1204] (2/4) Exclude cut with ID _8064_210_5399_1_1528372299291_4282973_168-50337-0 from training. Duration: 0.84 2023-10-04 14:04:04,792 WARNING [train.py:1204] (2/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_909-64151-0_sp1.1 from training. Duration: 0.8545625 2023-10-04 14:04:04,804 WARNING [train.py:1204] (2/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_597-154640-0 from training. Duration: 0.9 2023-10-04 14:04:10,441 WARNING [train.py:1204] (2/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_126-274694-0 from training. Duration: 0.98 2023-10-04 14:04:11,917 WARNING [train.py:1204] (2/4) Exclude cut with ID _21783_210_11357_1_1531742458730_3762679_344-269786-0_sp1.1 from training. Duration: 0.97275 2023-10-04 14:04:15,897 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_37818_210_4238_1_1533620910_4720083_322-253154-0_sp1.1 from training. Duration: 0.92725 2023-10-04 14:04:15,900 WARNING [train.py:1204] (2/4) Exclude cut with ID _58895_210_8750_1_1535184081320_3649089_221-15823-0_sp1.1 from training. Duration: 0.92725 2023-10-04 14:04:15,911 WARNING [train.py:1204] (2/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_9-21737-0_sp1.1 from training. Duration: 0.8818125 2023-10-04 14:04:15,959 WARNING [train.py:1204] (2/4) Exclude cut with ID _14069_210_5632_1_1530512692033_4803390_522-341588-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 14:04:19,234 WARNING [train.py:1204] (2/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_363-185703-0 from training. Duration: 0.95 2023-10-04 14:04:22,119 WARNING [train.py:1204] (2/4) Exclude cut with ID _41817_210_18835_1_1533434477160_7423049_265-76216-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 14:04:23,785 INFO [scaling.py:1118] (2/4) WithLoss: name=encoder.encoders.2.encoder.layers.0.self_attn_weights, loss-sum=0.000e+00 2023-10-04 14:04:26,889 WARNING [train.py:1204] (2/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_841-341679-0 from training. Duration: 0.58 2023-10-04 14:04:27,100 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.0.conv_module2.balancer1.prob, batch_count=1680266.6666666667, ans=0.125 2023-10-04 14:04:28,138 WARNING [train.py:1204] (2/4) Exclude cut with ID _55429_210_10626_1_1534467588527_7174143_97-335084-0_sp1.1 from training. Duration: 0.8818125 2023-10-04 14:04:31,207 WARNING [train.py:1204] (2/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_690-126306-0_sp0.9 from training. Duration: 0.8666875 2023-10-04 14:04:31,223 WARNING [train.py:1204] (2/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_102-92751-0_sp1.1 from training. Duration: 0.9 2023-10-04 14:04:34,654 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_3787_210_2040_1_1526477544_5478586_603-120324-0_sp1.1 from training. Duration: 0.736375 2023-10-04 14:04:36,016 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_33348_210_5632_1_1533086912_3324030_85-28583-0 from training. Duration: 0.82 2023-10-04 14:04:36,091 WARNING [train.py:1204] (2/4) Exclude cut with ID _60057_210_10640_1_1534903065793_3333150_362-230377-0_sp1.1 from training. Duration: 0.7909375 2023-10-04 14:04:37,518 WARNING [train.py:1204] (2/4) Exclude cut with ID _60113_210_16271_1_1535164212481_4386079_194-146486-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 14:04:38,874 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_53753_210_7234_1_1534320003_3594148_171-277668-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 14:04:38,916 WARNING [train.py:1204] (2/4) Exclude cut with ID _34423_210_8750_1_1533430798456_3330199_251-332788-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 14:04:41,574 WARNING [train.py:1204] (2/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_663-166973-0_sp0.9 from training. Duration: 0.888875 2023-10-04 14:04:44,374 WARNING [train.py:1204] (2/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_927-43827-0 from training. Duration: 0.74 2023-10-04 14:04:44,433 WARNING [train.py:1204] (2/4) Exclude cut with ID _28323_210_16187_1_1532480395667_4100300_306-74561-0_sp1.1 from training. Duration: 0.8545625 2023-10-04 14:04:45,843 WARNING [train.py:1204] (2/4) Exclude cut with ID _15037_210_2253_1_1530752294104_3267650_139-337989-0_sp1.1 from training. Duration: 0.92725 2023-10-04 14:04:45,864 WARNING [train.py:1204] (2/4) Exclude cut with ID _49039_210_6315_1_1533967537541_3846118_460-263670-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 14:04:47,274 WARNING [train.py:1204] (2/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_186-309161-0 from training. Duration: 0.54 2023-10-04 14:04:48,710 WARNING [train.py:1204] (2/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_87-278686-0_sp1.1 from training. Duration: 0.7 2023-10-04 14:04:50,674 WARNING [train.py:1204] (2/4) Exclude cut with ID _27052_210_5986_1_1532329197187_3585009_314-106499-0 from training. Duration: 0.92 2023-10-04 14:04:50,690 WARNING [train.py:1204] (2/4) Exclude cut with ID _3465_210_3525_1_1526175667060_3574049_412-287526-0_sp0.9 from training. Duration: 0.8 2023-10-04 14:04:55,175 WARNING [train.py:1204] (2/4) Exclude cut with ID _22961_210_8254_1_1531976366672_4278180_317-199546-0 from training. Duration: 0.86 2023-10-04 14:04:59,535 WARNING [train.py:1204] (2/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_381-317791-0 from training. Duration: 0.73 2023-10-04 14:05:00,829 WARNING [train.py:1204] (2/4) Exclude cut with ID _44010_210_4525_1_1533884262964_4013620_560-310411-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 14:05:00,835 WARNING [train.py:1204] (2/4) Exclude cut with ID _5294_210_1747_1_1527249643928_3696179_342-270278-0_sp0.9 from training. Duration: 0.611125 2023-10-04 14:05:00,856 WARNING [train.py:1204] (2/4) Exclude cut with ID ID0473W0002-918951 from training. Duration: 0.924 2023-10-04 14:05:00,881 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1083-220122-0 from training. Duration: 0.51 2023-10-04 14:05:02,871 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.feed_forward2.hidden_balancer.prob, batch_count=1680400.0, ans=0.125 2023-10-04 14:05:04,093 WARNING [train.py:1204] (2/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_138-154812-0 from training. Duration: 0.78 2023-10-04 14:05:06,940 WARNING [train.py:1204] (2/4) Exclude cut with ID _44982_210_1511_1_1534312820108_3726029_331-19420-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 14:05:08,502 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_2655_210_2087_1_1525688959_3897400_202-147557-0_sp0.9 from training. Duration: 0.9444375 2023-10-04 14:05:11,126 INFO [train.py:1046] (2/4) Epoch 48, batch 2400, loss[loss=0.133, simple_loss=0.204, pruned_loss=0.03104, over 23455.00 frames. ], tot_loss[loss=0.1533, simple_loss=0.2338, pruned_loss=0.0364, over 4725093.61 frames. ], batch size: 285, lr: 2.11e-03, grad_scale: 8.0 2023-10-04 14:05:12,629 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_23850_210_4238_1_1532232597_1632433_32-185135-0_sp0.9 from training. Duration: 0.9555625 2023-10-04 14:05:13,953 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_14056_210_4163_1_1530604483_7572827_46-93547-0_sp1.1 from training. Duration: 0.863625 2023-10-04 14:05:15,167 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7931_210_1794_1_1528594728_5564059_280-279560-0 from training. Duration: 0.98 2023-10-04 14:05:15,208 WARNING [train.py:1204] (2/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_937-208281-0 from training. Duration: 0.85 2023-10-04 14:05:22,853 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.743e+02 2.080e+02 2.454e+02 2.862e+02 5.375e+02, threshold=4.908e+02, percent-clipped=3.0 2023-10-04 14:05:24,366 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_14928_210_5632_1_1530773461_4714000_454-112198-0_sp0.9 from training. Duration: 0.7333125 2023-10-04 14:05:24,367 WARNING [train.py:1204] (2/4) Exclude cut with ID _32704_210_8954_1_1533017488840_6747370_944-153333-0_sp1.1 from training. Duration: 0.963625 2023-10-04 14:05:27,224 WARNING [train.py:1204] (2/4) Exclude cut with ID _14821_210_8750_1_1530680503991_6829359_2-227151-0 from training. Duration: 0.83 2023-10-04 14:05:27,240 WARNING [train.py:1204] (2/4) Exclude cut with ID _28461_210_2253_1_1532771775189_3041299_96-152158-0_sp1.1 from training. Duration: 0.77275 2023-10-04 14:05:27,316 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7837_210_1792_1_1528453445_5841305_654-54195-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 14:05:27,354 WARNING [train.py:1204] (2/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_344-152526-0 from training. Duration: 0.53 2023-10-04 14:05:35,107 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_30167_210_3528_1_1532428012_4772375_480-142230-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 14:05:35,336 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.1.feed_forward3.hidden_balancer.prob, batch_count=1680533.3333333333, ans=0.125 2023-10-04 14:05:36,521 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_160-218721-0 from training. Duration: 0.96 2023-10-04 14:05:39,391 WARNING [train.py:1204] (2/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_65-88242-0_sp0.9 from training. Duration: 0.62225 2023-10-04 14:05:43,506 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_34886_210_10633_1_1533203882_1755661_76-156096-0 from training. Duration: 0.87 2023-10-04 14:05:46,252 WARNING [train.py:1204] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_188-38388-0_sp1.1 from training. Duration: 0.92725 2023-10-04 14:05:47,683 WARNING [train.py:1204] (2/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_81-289335-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 14:05:47,888 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.conv_module1.balancer1.prob, batch_count=1680600.0, ans=0.125 2023-10-04 14:05:54,244 WARNING [train.py:1204] (2/4) Exclude cut with ID _59493_210_17282_1_1535078419077_3225879_110-8778-0_sp1.1 from training. Duration: 0.936375 2023-10-04 14:05:54,283 WARNING [train.py:1204] (2/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_188-260011-0 from training. Duration: 0.73 2023-10-04 14:05:54,534 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.conv_module2.balancer2.min_abs, batch_count=1680666.6666666667, ans=0.5 2023-10-04 14:05:55,720 WARNING [train.py:1204] (2/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_453-133983-0_sp0.9 from training. Duration: 0.8666875 2023-10-04 14:06:02,757 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.attention_skip_rate, batch_count=1680666.6666666667, ans=0.0 2023-10-04 14:06:03,861 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8054_210_7927_1_1528627721_4984094_553-17379-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 14:06:05,903 WARNING [train.py:1204] (2/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_341-171072-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 14:06:09,190 WARNING [train.py:1204] (2/4) Exclude cut with ID _30800_210_22382_1_1532499347904_4515959_265-257286-0_sp1.1 from training. Duration: 0.963625 2023-10-04 14:06:10,654 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_403-39619-0_sp0.9 from training. Duration: 0.9333125 2023-10-04 14:06:10,659 WARNING [train.py:1204] (2/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_464-33931-0_sp0.9 from training. Duration: 0.6 2023-10-04 14:06:10,681 WARNING [train.py:1204] (2/4) Exclude cut with ID _60804_210_9642_1_1534831199859_7268299_632-143863-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 14:06:10,688 WARNING [train.py:1204] (2/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_431-310997-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 14:06:10,737 WARNING [train.py:1204] (2/4) Exclude cut with ID _31649_210_6228_1_1532584608063_4091078_114-263860-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 14:06:10,755 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6961_210_2087_1_1527937168_6815011_562-165518-0_sp0.9 from training. Duration: 0.8555625 2023-10-04 14:06:10,934 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.conv_module1.balancer1.max_abs, batch_count=1680733.3333333333, ans=10.0 2023-10-04 14:06:13,404 WARNING [train.py:1204] (2/4) Exclude cut with ID _27052_210_5986_1_1532329197187_3585009_338-325870-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 14:06:13,454 WARNING [train.py:1204] (2/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_469-94673-0_sp1.1 from training. Duration: 0.7181875 2023-10-04 14:06:14,756 WARNING [train.py:1204] (2/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_259-34038-0 from training. Duration: 0.74 2023-10-04 14:06:14,862 WARNING [train.py:1204] (2/4) Exclude cut with ID _14922_210_6228_1_1530707435273_3693310_388-129552-0 from training. Duration: 0.96 2023-10-04 14:06:17,560 WARNING [train.py:1204] (2/4) Exclude cut with ID _28394_210_8913_1_1532430049518_3650118_342-251455-0_sp1.1 from training. Duration: 0.936375 2023-10-04 14:06:17,593 WARNING [train.py:1204] (2/4) Exclude cut with ID _53731_210_7994_1_1534553979244_3757214_8-191390-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 14:06:18,859 WARNING [train.py:1204] (2/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_613-232856-0 from training. Duration: 0.54 2023-10-04 14:06:18,930 WARNING [train.py:1204] (2/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_557-349534-0 from training. Duration: 0.91 2023-10-04 14:06:20,258 WARNING [train.py:1204] (2/4) Exclude cut with ID _14224_210_4381_1_1530529062037_4264010_379-184223-0 from training. Duration: 0.85 2023-10-04 14:06:20,261 WARNING [train.py:1204] (2/4) Exclude cut with ID IC0125W0001-61807 from training. Duration: 0.966 2023-10-04 14:06:20,352 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_229-45300-0 from training. Duration: 0.57 2023-10-04 14:06:22,866 WARNING [train.py:1204] (2/4) Exclude cut with ID _30304_210_6319_1_1532476827261_7582060_1069-24743-0_sp1.1 from training. Duration: 0.92725 2023-10-04 14:06:24,037 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_41452_210_7291_1_1533437687_3941281_323-172873-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 14:06:24,046 WARNING [train.py:1204] (2/4) Exclude cut with ID _37086_210_9038_1_1532999540891_7021250_802-226709-0_sp1.1 from training. Duration: 0.97275 2023-10-04 14:06:25,327 INFO [train.py:1046] (2/4) Epoch 48, batch 2450, loss[loss=0.1693, simple_loss=0.256, pruned_loss=0.04124, over 24124.00 frames. ], tot_loss[loss=0.1519, simple_loss=0.2318, pruned_loss=0.03595, over 4715487.59 frames. ], batch size: 86, lr: 2.11e-03, grad_scale: 4.0 2023-10-04 14:06:25,426 WARNING [train.py:1204] (2/4) Exclude cut with ID IC0144W0500-71767 from training. Duration: 0.931875 2023-10-04 14:06:26,659 WARNING [train.py:1204] (2/4) Exclude cut with ID _39846_210_12651_1_1533605310265_2115212_233-129699-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 14:06:26,714 WARNING [train.py:1204] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_574-45541-0_sp0.9 from training. Duration: 0.72225 2023-10-04 14:06:29,500 WARNING [train.py:1204] (2/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_372-298093-0_sp1.1 from training. Duration: 0.72725 2023-10-04 14:06:29,512 WARNING [train.py:1204] (2/4) Exclude cut with ID _62574_210_2655_1_1535180407058_3933249_34-26343-0_sp1.1 from training. Duration: 0.97275 2023-10-04 14:06:32,315 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_512-256238-0_sp1.1 from training. Duration: 0.963625 2023-10-04 14:06:33,635 WARNING [train.py:1204] (2/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_196-55502-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 14:06:35,527 WARNING [train.py:1204] (2/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_195-167011-0 from training. Duration: 0.75 2023-10-04 14:06:41,420 WARNING [train.py:1204] (2/4) Exclude cut with ID _13678_210_7497_1_1530530069226_3463170_345-230416-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 14:06:41,441 WARNING [train.py:1204] (2/4) Exclude cut with ID _51078_210_9070_1_1534161557424_3805250_225-327809-0_sp1.1 from training. Duration: 0.963625 2023-10-04 14:06:44,332 WARNING [train.py:1204] (2/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_18-189198-0_sp1.1 from training. Duration: 0.8181875 2023-10-04 14:06:44,350 WARNING [train.py:1204] (2/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_329-82235-0_sp1.1 from training. Duration: 0.7454375 2023-10-04 14:06:44,359 WARNING [train.py:1204] (2/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_726-270780-0_sp0.9 from training. Duration: 0.9555625 2023-10-04 14:06:45,692 WARNING [train.py:1204] (2/4) Exclude cut with ID _66873_210_5517_1_1535454136506_4293370_654-52713-0 from training. Duration: 0.77 2023-10-04 14:06:49,880 WARNING [train.py:1204] (2/4) Exclude cut with ID _20455_210_5219_1_1531565996775_8098100_191-16006-0_sp1.1 from training. Duration: 0.963625 2023-10-04 14:06:51,912 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_3171_210_2087_1_1526109084_2330007_66-260583-0_sp1.1 from training. Duration: 0.6545625 2023-10-04 14:06:53,680 WARNING [train.py:1204] (2/4) Exclude cut with ID _38670_210_15379_1_1533448810659_4391059_63-129006-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 14:06:56,582 WARNING [train.py:1204] (2/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_393-57888-0_sp0.9 from training. Duration: 0.711125 2023-10-04 14:06:56,619 WARNING [train.py:1204] (2/4) Exclude cut with ID _6275_210_2084_1_1528110062213_7669049_857-298942-0_sp0.9 from training. Duration: 0.9444375 2023-10-04 14:06:58,033 WARNING [train.py:1204] (2/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_239-22076-0_sp0.9 from training. Duration: 0.9444375 2023-10-04 14:06:58,056 WARNING [train.py:1204] (2/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_424-227947-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 14:06:59,488 WARNING [train.py:1204] (2/4) Exclude cut with ID _14069_210_5632_1_1530512692033_4803390_95-1896-0 from training. Duration: 0.98 2023-10-04 14:07:00,850 WARNING [train.py:1204] (2/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_96-244299-0_sp0.9 from training. Duration: 0.97775 2023-10-04 14:07:06,884 WARNING [train.py:1204] (2/4) Exclude cut with ID _46683_210_9642_1_1533730913492_4591116_562-319825-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 14:07:07,124 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.0.balancer2.prob, batch_count=1680933.3333333333, ans=0.125 2023-10-04 14:07:08,746 WARNING [train.py:1204] (2/4) Exclude cut with ID _64639_210_5816_1_1535367619437_3528000_284-173474-0_sp1.1 from training. Duration: 0.963625 2023-10-04 14:07:08,777 WARNING [train.py:1204] (2/4) Exclude cut with ID _29529_210_7234_1_1532393961455_3977460_497-100133-0_sp1.1 from training. Duration: 0.936375 2023-10-04 14:07:08,812 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_12868_210_6753_1_1530701932_530275_18-76322-0_sp0.9 from training. Duration: 0.9666875 2023-10-04 14:07:10,150 WARNING [train.py:1204] (2/4) Exclude cut with ID _50093_210_4220_1_1534417141207_3432299_116-303447-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 14:07:11,467 WARNING [train.py:1204] (2/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_44-79427-0_sp1.1 from training. Duration: 0.8909375 2023-10-04 14:07:12,833 WARNING [train.py:1204] (2/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_887-249555-0 from training. Duration: 0.77 2023-10-04 14:07:15,645 WARNING [train.py:1204] (2/4) Exclude cut with ID _28461_210_2253_1_1532771775189_3041299_96-152158-0_sp0.9 from training. Duration: 0.9444375 2023-10-04 14:07:15,697 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_14058_210_4163_1_1530690953_6327180_49-210687-0_sp1.1 from training. Duration: 0.8545625 2023-10-04 14:07:18,522 WARNING [train.py:1204] (2/4) Exclude cut with ID _40399_210_4353_1_1533305658194_3279406_351-127903-0_sp1.1 from training. Duration: 0.97275 2023-10-04 14:07:18,527 WARNING [train.py:1204] (2/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_184-206939-0_sp1.1 from training. Duration: 0.936375 2023-10-04 14:07:23,314 WARNING [train.py:1204] (2/4) Exclude cut with ID _14100_210_8341_1_1530529320613_3678069_258-224599-0_sp1.1 from training. Duration: 0.7 2023-10-04 14:07:23,336 WARNING [train.py:1204] (2/4) Exclude cut with ID _6493_210_1747_1_1528459210424_4012470_324-323854-0 from training. Duration: 0.92 2023-10-04 14:07:24,495 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_12868_210_6753_1_1530702479_3610564_151-325626-0_sp1.1 from training. Duration: 0.8818125 2023-10-04 14:07:25,190 WARNING [train.py:1204] (2/4) Exclude cut with ID _14528_210_8254_1_1530669700514_4306939_106-43145-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 14:07:25,204 WARNING [train.py:1204] (2/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_686-54383-0 from training. Duration: 0.87 2023-10-04 14:07:26,538 WARNING [train.py:1204] (2/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_801-102362-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 14:07:26,607 WARNING [train.py:1204] (2/4) Exclude cut with ID _48677_210_5663_1_1533902405919_3892989_408-40740-0_sp0.9 from training. Duration: 0.92225 2023-10-04 14:07:30,702 WARNING [train.py:1204] (2/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_448-212135-0_sp1.1 from training. Duration: 0.82725 2023-10-04 14:07:32,532 WARNING [train.py:1204] (2/4) Exclude cut with ID _59170_210_6721_1_1534983633960_4203335_320-124047-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 14:07:32,568 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_4692_210_3528_1_1526899755_4323254_107-44401-0_sp1.1 from training. Duration: 0.7818125 2023-10-04 14:07:36,763 WARNING [train.py:1204] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_731-2428-0 from training. Duration: 0.83 2023-10-04 14:07:37,066 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.bypass_mid.scale_min, batch_count=1681066.6666666667, ans=0.2 2023-10-04 14:07:38,087 WARNING [train.py:1204] (2/4) Exclude cut with ID _29857_210_20034_1_1532395886753_3798160_335-163562-0_sp1.1 from training. Duration: 0.77275 2023-10-04 14:07:39,854 INFO [train.py:1046] (2/4) Epoch 48, batch 2500, loss[loss=0.1433, simple_loss=0.215, pruned_loss=0.03584, over 23425.00 frames. ], tot_loss[loss=0.1515, simple_loss=0.2315, pruned_loss=0.03572, over 4717136.92 frames. ], batch size: 285, lr: 2.11e-03, grad_scale: 8.0 2023-10-04 14:07:44,087 WARNING [train.py:1204] (2/4) Exclude cut with ID _14832_210_8750_1_1530691351539_3171348_171-275004-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 14:07:51,725 WARNING [train.py:1204] (2/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_105-328174-0_sp0.9 from training. Duration: 0.8444375 2023-10-04 14:07:52,951 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.739e+02 2.028e+02 2.220e+02 2.568e+02 3.726e+02, threshold=4.440e+02, percent-clipped=0.0 2023-10-04 14:07:53,036 WARNING [train.py:1204] (2/4) Exclude cut with ID _68965_210_1883_1_1535698962272_3497490_164-118421-0_sp1.1 from training. Duration: 0.936375 2023-10-04 14:07:54,388 WARNING [train.py:1204] (2/4) Exclude cut with ID _19896_210_1883_1_1531709022662_6649569_380-24170-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 14:07:54,393 WARNING [train.py:1204] (2/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_505-205270-0_sp1.1 from training. Duration: 0.3454375 2023-10-04 14:07:58,494 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.1.encoder.layers.1.self_attn1.whiten, num_groups=1, num_channels=256, metric=11.97 vs. limit=22.5 2023-10-04 14:08:00,411 WARNING [train.py:1204] (2/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_427-208901-0_sp1.1 from training. Duration: 0.6181875 2023-10-04 14:08:01,761 WARNING [train.py:1204] (2/4) Exclude cut with ID _23818_210_6228_1_1531917063884_3676920_85-326916-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 14:08:01,863 WARNING [train.py:1204] (2/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_101-293367-0_sp1.1 from training. Duration: 0.57275 2023-10-04 14:08:03,134 WARNING [train.py:1204] (2/4) Exclude cut with ID _5409_210_1388_1_1527292850189_5865191_178-127970-0_sp1.1 from training. Duration: 0.5181875 2023-10-04 14:08:03,194 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_403-39619-0 from training. Duration: 0.84 2023-10-04 14:08:05,049 WARNING [train.py:1204] (2/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_427-87841-0_sp1.1 from training. Duration: 0.963625 2023-10-04 14:08:06,361 WARNING [train.py:1204] (2/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_286-68181-0_sp1.1 from training. Duration: 0.97275 2023-10-04 14:08:06,405 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6673_210_3273_1_1527767790_3173240_327-306185-0 from training. Duration: 0.83 2023-10-04 14:08:06,427 WARNING [train.py:1204] (2/4) Exclude cut with ID _3675_210_4557_1_1526639934408_4182089_106-203713-0_sp1.1 from training. Duration: 0.963625 2023-10-04 14:08:08,321 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_53071_210_16554_1_1534493434_1276796_130-36076-0 from training. Duration: 0.95 2023-10-04 14:08:08,362 WARNING [train.py:1204] (2/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_586-93054-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 14:08:13,670 WARNING [train.py:1204] (2/4) Exclude cut with ID _23825_210_13449_1_1531915313526_3497150_262-10571-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 14:08:13,750 WARNING [train.py:1204] (2/4) Exclude cut with ID _28783_210_7943_1_1532671164808_7155479_432-67885-0_sp1.1 from training. Duration: 0.97275 2023-10-04 14:08:16,621 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6287_210_3273_1_1527509506_2653716_182-199887-0_sp0.9 from training. Duration: 0.5666875 2023-10-04 14:08:16,672 WARNING [train.py:1204] (2/4) Exclude cut with ID _3231_210_2106_1_1526205864934_6784220_171-256004-0 from training. Duration: 0.86 2023-10-04 14:08:17,941 WARNING [train.py:1204] (2/4) Exclude cut with ID _69051_210_1531_1_1535885566901_3829538_332-92090-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 14:08:18,072 WARNING [train.py:1204] (2/4) Exclude cut with ID _3675_210_4557_1_1526639934408_4182089_104-324125-0_sp1.1 from training. Duration: 0.963625 2023-10-04 14:08:22,558 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_41764_210_2521_1_1533686110_4089313_501-2119-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 14:08:25,607 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_14056_210_4163_1_1530604483_7572827_56-100988-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 14:08:27,619 WARNING [train.py:1204] (2/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_398-239819-0_sp1.1 from training. Duration: 0.9 2023-10-04 14:08:33,223 WARNING [train.py:1204] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_74-48717-0_sp1.1 from training. Duration: 0.52725 2023-10-04 14:08:34,751 WARNING [train.py:1204] (2/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_177-49092-0 from training. Duration: 0.91 2023-10-04 14:08:34,768 WARNING [train.py:1204] (2/4) Exclude cut with ID _62337_210_12392_1_1535009273105_3412229_44-176353-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 14:08:36,776 WARNING [train.py:1204] (2/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_110-130961-0_sp1.1 from training. Duration: 0.42725 2023-10-04 14:08:39,985 WARNING [train.py:1204] (2/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_37-189262-0_sp1.1 from training. Duration: 0.8909375 2023-10-04 14:08:39,986 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_3172_210_2087_1_1526292929_3987865_197-97762-0_sp0.9 from training. Duration: 0.8333125 2023-10-04 14:08:41,455 WARNING [train.py:1204] (2/4) Exclude cut with ID IC0677W0065-334897 from training. Duration: 0.896 2023-10-04 14:08:41,456 WARNING [train.py:1204] (2/4) Exclude cut with ID IC0677W0080-334912 from training. Duration: 0.768 2023-10-04 14:08:41,462 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_120-41668-0 from training. Duration: 0.7 2023-10-04 14:08:42,910 WARNING [train.py:1204] (2/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_460-205287-0_sp1.1 from training. Duration: 0.963625 2023-10-04 14:08:45,649 WARNING [train.py:1204] (2/4) Exclude cut with ID _14632_210_9988_1_1530691279392_3923200_102-204477-0 from training. Duration: 0.97 2023-10-04 14:08:47,001 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_34815_210_6388_1_1533381320_3571903_279-305565-0 from training. Duration: 0.92 2023-10-04 14:08:47,070 WARNING [train.py:1204] (2/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_91-148831-0_sp1.1 from training. Duration: 0.9 2023-10-04 14:08:48,267 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6291_210_3273_1_1527683649_1078498_82-221196-0 from training. Duration: 0.76 2023-10-04 14:08:51,095 WARNING [train.py:1204] (2/4) Exclude cut with ID _38020_210_5986_1_1533112252259_7006089_493-102745-0 from training. Duration: 0.98 2023-10-04 14:08:54,280 INFO [train.py:1046] (2/4) Epoch 48, batch 2550, loss[loss=0.1784, simple_loss=0.2597, pruned_loss=0.04858, over 24408.00 frames. ], tot_loss[loss=0.1521, simple_loss=0.2325, pruned_loss=0.03582, over 4727014.75 frames. ], batch size: 77, lr: 2.11e-03, grad_scale: 8.0 2023-10-04 14:08:54,374 WARNING [train.py:1204] (2/4) Exclude cut with ID _61133_210_9969_1_1535196777227_3696409_40-93442-0_sp1.1 from training. Duration: 0.92725 2023-10-04 14:08:55,778 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=1681466.6666666667, ans=0.1 2023-10-04 14:08:57,030 WARNING [train.py:1204] (2/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_45-258852-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 14:08:58,362 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_53071_210_16554_1_1534493434_1276796_130-36076-0_sp1.1 from training. Duration: 0.863625 2023-10-04 14:09:00,336 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8694_210_6400_1_1529065754_3260756_332-13553-0_sp1.1 from training. Duration: 0.92725 2023-10-04 14:09:00,404 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_405-339986-0 from training. Duration: 0.51 2023-10-04 14:09:01,716 WARNING [train.py:1204] (2/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_927-43827-0_sp1.1 from training. Duration: 0.67275 2023-10-04 14:09:05,865 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_397-147196-0 from training. Duration: 0.8 2023-10-04 14:09:07,349 WARNING [train.py:1204] (2/4) Exclude cut with ID 3557-8342-0013-54691-0_sp1.1 from training. Duration: 0.836375 2023-10-04 14:09:07,547 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.feed_forward1.hidden_balancer.prob, batch_count=1681533.3333333333, ans=0.125 2023-10-04 14:09:08,777 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_47438_210_5399_1_1533797471_4513356_626-127722-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 14:09:10,792 WARNING [train.py:1204] (2/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_256-323883-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 14:09:10,796 WARNING [train.py:1204] (2/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_839-173017-0_sp0.9 from training. Duration: 0.4555625 2023-10-04 14:09:10,823 WARNING [train.py:1204] (2/4) Exclude cut with ID _5289_210_2084_1_1527678202736_3779299_392-261988-0_sp1.1 from training. Duration: 0.5909375 2023-10-04 14:09:12,577 WARNING [train.py:1204] (2/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_119-224384-0_sp1.1 from training. Duration: 0.936375 2023-10-04 14:09:12,620 WARNING [train.py:1204] (2/4) Exclude cut with ID _57523_210_1856_1_1534766294545_3732989_262-267933-0_sp1.1 from training. Duration: 0.963625 2023-10-04 14:09:14,036 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_35361_210_4238_1_1533262810_7793476_675-87012-0_sp0.9 from training. Duration: 0.988875 2023-10-04 14:09:15,359 WARNING [train.py:1204] (2/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_17-90541-0 from training. Duration: 0.94 2023-10-04 14:09:15,385 WARNING [train.py:1204] (2/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_535-14410-0_sp1.1 from training. Duration: 0.42725 2023-10-04 14:09:15,391 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_24673_210_6400_1_1531968633_778659_94-154511-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 14:09:15,394 WARNING [train.py:1204] (2/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_212-10501-0 from training. Duration: 0.94 2023-10-04 14:09:20,002 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.5.encoder.layers.0.conv_module2.whiten, num_groups=1, num_channels=256, metric=4.13 vs. limit=15.0 2023-10-04 14:09:26,429 WARNING [train.py:1204] (2/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_383-285196-0_sp1.1 from training. Duration: 0.8818125 2023-10-04 14:09:29,733 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.3.encoder.layers.3.nonlin_attention.whiten1, num_groups=1, num_channels=384, metric=5.30 vs. limit=10.0 2023-10-04 14:09:30,405 WARNING [train.py:1204] (2/4) Exclude cut with ID _45560_210_21982_1_1533812603951_3584999_238-47020-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 14:09:30,421 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_21500_210_2054_1_1532149310_2716758_278-18053-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 14:09:31,697 WARNING [train.py:1204] (2/4) Exclude cut with ID _56235_210_10640_1_1534557335235_4161099_453-9626-0_sp1.1 from training. Duration: 0.936375 2023-10-04 14:09:32,460 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.3.encoder.layers.2.conv_module1.whiten, num_groups=1, num_channels=512, metric=4.01 vs. limit=15.0 2023-10-04 14:09:33,052 WARNING [train.py:1204] (2/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_153-97441-0_sp1.1 from training. Duration: 0.5090625 2023-10-04 14:09:40,633 WARNING [train.py:1204] (2/4) Exclude cut with ID _67725_210_27767_1_1535608756671_5105359_431-120895-0_sp1.1 from training. Duration: 0.963625 2023-10-04 14:09:42,586 WARNING [train.py:1204] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_574-45541-0_sp1.1 from training. Duration: 0.5909375 2023-10-04 14:09:42,603 WARNING [train.py:1204] (2/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_162-103620-0_sp1.1 from training. Duration: 0.8454375 2023-10-04 14:09:42,614 WARNING [train.py:1204] (2/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_734-129761-0_sp1.1 from training. Duration: 0.7454375 2023-10-04 14:09:43,134 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.4.encoder.layers.1.whiten, num_groups=1, num_channels=384, metric=3.96 vs. limit=12.0 2023-10-04 14:09:43,958 WARNING [train.py:1204] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_620-276793-0_sp1.1 from training. Duration: 0.536375 2023-10-04 14:09:44,013 WARNING [train.py:1204] (2/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_153-97441-0_sp0.9 from training. Duration: 0.62225 2023-10-04 14:09:48,121 WARNING [train.py:1204] (2/4) Exclude cut with ID _12733_210_8341_1_1530167424556_3646380_523-85621-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 14:09:48,141 WARNING [train.py:1204] (2/4) Exclude cut with ID _64048_210_10626_1_1535198382802_3835179_352-179834-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 14:09:52,473 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.balancer2.prob, batch_count=1681733.3333333333, ans=0.125 2023-10-04 14:09:53,571 WARNING [train.py:1204] (2/4) Exclude cut with ID _55162_210_17071_1_1534408248583_3753480_43-242364-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 14:09:53,586 WARNING [train.py:1204] (2/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_661-264857-0 from training. Duration: 0.93 2023-10-04 14:09:53,593 WARNING [train.py:1204] (2/4) Exclude cut with ID _6274_210_2084_1_1527937583837_7494020_325-237800-0_sp1.1 from training. Duration: 0.97275 2023-10-04 14:09:53,653 WARNING [train.py:1204] (2/4) Exclude cut with ID _55328_210_27767_1_1534580973260_4088170_385-171459-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 14:09:55,481 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_3780_210_2087_1_1526467389_2145572_176-107140-0_sp1.1 from training. Duration: 0.62725 2023-10-04 14:09:58,726 WARNING [train.py:1204] (2/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_375-244976-0_sp1.1 from training. Duration: 0.6818125 2023-10-04 14:10:00,059 WARNING [train.py:1204] (2/4) Exclude cut with ID _35941_210_15379_1_1532995294515_4023089_309-33162-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 14:10:05,858 WARNING [train.py:1204] (2/4) Exclude cut with ID _14459_210_2228_1_1531443641946_7516700_1185-22038-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 14:10:06,054 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.conv_module1.balancer1.prob, batch_count=1681733.3333333333, ans=0.125 2023-10-04 14:10:07,358 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_1197-69460-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 14:10:08,547 INFO [train.py:1046] (2/4) Epoch 48, batch 2600, loss[loss=0.1711, simple_loss=0.2463, pruned_loss=0.0479, over 23866.00 frames. ], tot_loss[loss=0.153, simple_loss=0.2334, pruned_loss=0.03629, over 4723670.83 frames. ], batch size: 195, lr: 2.11e-03, grad_scale: 8.0 2023-10-04 14:10:08,723 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9012W0022-501157 from training. Duration: 0.896 2023-10-04 14:10:12,882 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9023W0009-506302 from training. Duration: 0.896 2023-10-04 14:10:12,897 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_2821_210_2715_1_1526108385_3763560_73-296673-0_sp1.1 from training. Duration: 0.7545625 2023-10-04 14:10:12,943 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9025W0113-507442 from training. Duration: 0.896 2023-10-04 14:10:14,894 WARNING [train.py:1204] (2/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_739-2098-0 from training. Duration: 0.92 2023-10-04 14:10:14,906 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9029W0332-509722 from training. Duration: 0.768 2023-10-04 14:10:17,678 WARNING [train.py:1204] (2/4) Exclude cut with ID _16908_210_5724_1_1531119157248_3772700_68-101953-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 14:10:17,695 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9042W0011-515617 from training. Duration: 0.768 2023-10-04 14:10:19,165 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_3019_210_2028_1_1526044245_3264865_191-72235-0 from training. Duration: 0.97 2023-10-04 14:10:20,566 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9052W0006-520792 from training. Duration: 0.896 2023-10-04 14:10:21,868 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.727e+02 2.037e+02 2.298e+02 2.610e+02 5.474e+02, threshold=4.596e+02, percent-clipped=1.0 2023-10-04 14:10:22,043 WARNING [train.py:1204] (2/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_245-174553-0_sp1.1 from training. Duration: 0.8 2023-10-04 14:10:24,651 WARNING [train.py:1204] (2/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_202-67017-0 from training. Duration: 0.82 2023-10-04 14:10:25,989 WARNING [train.py:1204] (2/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_304-59227-0 from training. Duration: 0.77 2023-10-04 14:10:28,086 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_3133_210_2087_1_1526106446_2324690_246-125000-0_sp0.9 from training. Duration: 0.688875 2023-10-04 14:10:28,119 WARNING [train.py:1204] (2/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_243-42116-0 from training. Duration: 0.73 2023-10-04 14:10:31,261 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9087W0121-538492 from training. Duration: 0.896 2023-10-04 14:10:31,281 WARNING [train.py:1204] (2/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_120-76843-0 from training. Duration: 0.55 2023-10-04 14:10:36,825 WARNING [train.py:1204] (2/4) Exclude cut with ID _7852_210_5219_1_1528286471316_8139400_850-304621-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 14:10:36,846 WARNING [train.py:1204] (2/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_154-285415-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 14:10:36,858 WARNING [train.py:1204] (2/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_538-278194-0_sp1.1 from training. Duration: 0.92725 2023-10-04 14:10:36,858 WARNING [train.py:1204] (2/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_119-207162-0 from training. Duration: 0.71 2023-10-04 14:10:37,075 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.conv_skip_rate, batch_count=1681933.3333333333, ans=0.0 2023-10-04 14:10:38,393 WARNING [train.py:1204] (2/4) Exclude cut with ID _2334_210_2667_1_1525608238880_4705129_459-104997-0_sp1.1 from training. Duration: 0.863625 2023-10-04 14:10:45,723 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9147W0019-567847 from training. Duration: 0.768 2023-10-04 14:10:50,582 WARNING [train.py:1204] (2/4) Exclude cut with ID _69884_210_9771_1_1535846392818_4166169_228-189439-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 14:10:50,605 WARNING [train.py:1204] (2/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_196-77172-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 14:10:51,940 WARNING [train.py:1204] (2/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_625-338546-0 from training. Duration: 0.88 2023-10-04 14:10:52,008 WARNING [train.py:1204] (2/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_193-327630-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 14:10:52,010 WARNING [train.py:1204] (2/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_391-24006-0_sp1.1 from training. Duration: 0.92725 2023-10-04 14:10:53,390 WARNING [train.py:1204] (2/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_197-28164-0 from training. Duration: 0.8 2023-10-04 14:10:54,781 WARNING [train.py:1204] (2/4) Exclude cut with ID _8395_210_1531_1_1528597871523_4411579_431-132470-0_sp1.1 from training. Duration: 0.67275 2023-10-04 14:10:54,812 WARNING [train.py:1204] (2/4) Exclude cut with ID _35350_210_16040_1_1533040191183_4075299_465-248326-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 14:10:57,536 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.1.conv_skip_rate, batch_count=1682000.0, ans=0.0 2023-10-04 14:10:58,535 WARNING [train.py:1204] (2/4) Exclude cut with ID _16756_210_4915_1_1531047803763_7104259_101-141922-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 14:11:00,094 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.out_combiner.scale_min, batch_count=1682000.0, ans=0.2 2023-10-04 14:11:02,549 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9212W0005-600172 from training. Duration: 0.896 2023-10-04 14:11:02,578 WARNING [train.py:1204] (2/4) Exclude cut with ID _63186_210_2084_1_1535414479705_3908649_378-166820-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 14:11:02,591 WARNING [train.py:1204] (2/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_52-211683-0_sp1.1 from training. Duration: 0.8181875 2023-10-04 14:11:08,084 WARNING [train.py:1204] (2/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_1147-237729-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 14:11:08,150 WARNING [train.py:1204] (2/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_234-14174-0_sp0.9 from training. Duration: 0.911125 2023-10-04 14:11:08,161 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_454-276429-0 from training. Duration: 0.87 2023-10-04 14:11:09,517 WARNING [train.py:1204] (2/4) Exclude cut with ID _53713_210_24116_1_1534314487762_7416751_755-165313-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 14:11:09,793 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.balancer2.prob, batch_count=1682066.6666666667, ans=0.125 2023-10-04 14:11:10,941 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8278_210_2550_1_1528637099_4220413_436-317072-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 14:11:11,002 WARNING [train.py:1204] (2/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_28-324729-0_sp1.1 from training. Duration: 0.8545625 2023-10-04 14:11:17,043 WARNING [train.py:1204] (2/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_326-165900-0 from training. Duration: 0.69 2023-10-04 14:11:18,698 WARNING [train.py:1204] (2/4) Exclude cut with ID _53489_210_6228_1_1534298357169_3835970_96-30246-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 14:11:20,141 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8275_210_4198_1_1528455320_3892571_491-17707-0_sp1.1 from training. Duration: 0.5818125 2023-10-04 14:11:20,924 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.1.encoder.layers.0.feed_forward1.out_whiten, num_groups=1, num_channels=256, metric=4.83 vs. limit=15.0 2023-10-04 14:11:22,761 INFO [train.py:1046] (2/4) Epoch 48, batch 2650, loss[loss=0.1597, simple_loss=0.2344, pruned_loss=0.04248, over 23598.00 frames. ], tot_loss[loss=0.1538, simple_loss=0.2342, pruned_loss=0.03667, over 4714649.16 frames. ], batch size: 256, lr: 2.11e-03, grad_scale: 8.0 2023-10-04 14:11:24,178 WARNING [train.py:1204] (2/4) Exclude cut with ID _14348_210_8341_1_1530599434996_3778640_240-232370-0 from training. Duration: 0.84 2023-10-04 14:11:25,418 WARNING [train.py:1204] (2/4) Exclude cut with ID _45012_210_12651_1_1533608435136_5371380_54-112623-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 14:11:25,513 WARNING [train.py:1204] (2/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_328-264776-0_sp1.1 from training. Duration: 0.6090625 2023-10-04 14:11:28,640 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9325W0001-648427 from training. Duration: 0.768 2023-10-04 14:11:28,649 WARNING [train.py:1204] (2/4) Exclude cut with ID _56480_210_4220_1_1534679942079_3075409_49-74717-0_sp1.1 from training. Duration: 0.936375 2023-10-04 14:11:30,742 WARNING [train.py:1204] (2/4) Exclude cut with ID _59500_210_8254_1_1534809610680_3736009_6-241869-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 14:11:33,524 WARNING [train.py:1204] (2/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_172-215932-0_sp0.9 from training. Duration: 0.7333125 2023-10-04 14:11:33,622 WARNING [train.py:1204] (2/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_17-90541-0_sp1.1 from training. Duration: 0.8545625 2023-10-04 14:11:35,637 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.3.encoder.layers.3.feed_forward1.out_whiten, num_groups=1, num_channels=512, metric=5.42 vs. limit=15.0 2023-10-04 14:11:36,389 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_43831_210_2158_1_1533725502_5872791_525-89447-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 14:11:37,684 WARNING [train.py:1204] (2/4) Exclude cut with ID _12302_210_5219_1_1529924446466_8107280_907-182949-0 from training. Duration: 0.92 2023-10-04 14:11:37,698 WARNING [train.py:1204] (2/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_402-102311-0_sp0.9 from training. Duration: 0.7444375 2023-10-04 14:11:37,728 WARNING [train.py:1204] (2/4) Exclude cut with ID _8021_210_7994_1_1528376451358_3653069_179-45206-0_sp0.9 from training. Duration: 0.9555625 2023-10-04 14:11:39,220 WARNING [train.py:1204] (2/4) Exclude cut with ID _40137_210_12210_1_1533290484046_3795180_249-187059-0 from training. Duration: 0.88 2023-10-04 14:11:39,409 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.bypass_mid.scale_min, batch_count=1682200.0, ans=0.2 2023-10-04 14:11:40,522 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9653W0194-675472 from training. Duration: 0.9658125 2023-10-04 14:11:41,898 WARNING [train.py:1204] (2/4) Exclude cut with ID _51332_210_9988_1_1534244371836_3801270_336-255604-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 14:11:45,157 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_20120_210_6753_1_1531643035_5482859_510-96996-0 from training. Duration: 0.87 2023-10-04 14:11:45,170 WARNING [train.py:1204] (2/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_1167-20437-0_sp1.1 from training. Duration: 0.92725 2023-10-04 14:11:45,226 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_3475_210_2028_1_1526193578_3622569_265-276625-0 from training. Duration: 0.76 2023-10-04 14:11:49,349 WARNING [train.py:1204] (2/4) Exclude cut with ID _14860_210_4915_1_1530702020340_7343240_470-172418-0_sp1.1 from training. Duration: 0.97275 2023-10-04 14:11:49,357 WARNING [train.py:1204] (2/4) Exclude cut with ID _5444_210_2151_1_1527418808464_5312210_581-315411-0_sp1.1 from training. Duration: 0.563625 2023-10-04 14:11:49,372 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_34450_210_9547_1_1533099443_4109050_519-342439-0_sp1.1 from training. Duration: 0.97275 2023-10-04 14:11:49,405 WARNING [train.py:1204] (2/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_50-175961-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 14:11:55,490 WARNING [train.py:1204] (2/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_372-6869-0 from training. Duration: 0.44 2023-10-04 14:11:55,496 WARNING [train.py:1204] (2/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_3-237434-0 from training. Duration: 0.76 2023-10-04 14:11:55,878 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.ff2_skip_rate, batch_count=1682266.6666666667, ans=0.0 2023-10-04 14:11:57,542 WARNING [train.py:1204] (2/4) Exclude cut with ID _8772_210_1850_1_1528635595972_3664011_247-336448-0_sp1.1 from training. Duration: 0.67275 2023-10-04 14:12:02,936 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_427-78097-0 from training. Duration: 0.8 2023-10-04 14:12:02,958 WARNING [train.py:1204] (2/4) Exclude cut with ID _69884_210_9771_1_1535846392818_4166169_466-252727-0_sp1.1 from training. Duration: 0.97275 2023-10-04 14:12:04,375 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_15972_210_6400_1_1530877934_3405692_316-35478-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 14:12:04,402 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_809-17156-0_sp0.9 from training. Duration: 0.811125 2023-10-04 14:12:04,440 WARNING [train.py:1204] (2/4) Exclude cut with ID _57572_210_5517_1_1534921219624_3809484_594-263474-0_sp1.1 from training. Duration: 0.936375 2023-10-04 14:12:05,756 WARNING [train.py:1204] (2/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_190-228204-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 14:12:07,291 WARNING [train.py:1204] (2/4) Exclude cut with ID _19514_210_9259_1_1531530071330_5084230_117-21193-0_sp1.1 from training. Duration: 0.936375 2023-10-04 14:12:08,692 WARNING [train.py:1204] (2/4) Exclude cut with ID _69584_210_5219_1_1535785133775_4044450_190-21315-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 14:12:10,102 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_53071_210_16554_1_1534493434_1276796_184-302050-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 14:12:11,388 WARNING [train.py:1204] (2/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_120-76843-0_sp1.1 from training. Duration: 0.5 2023-10-04 14:12:11,472 WARNING [train.py:1204] (2/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_318-162017-0_sp1.1 from training. Duration: 0.82725 2023-10-04 14:12:12,866 WARNING [train.py:1204] (2/4) Exclude cut with ID _46067_210_4220_1_1534125571066_3307450_79-46864-0_sp1.1 from training. Duration: 0.92725 2023-10-04 14:12:12,915 WARNING [train.py:1204] (2/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_358-215558-0_sp1.1 from training. Duration: 0.8454375 2023-10-04 14:12:14,712 WARNING [train.py:1204] (2/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_621-66764-0_sp1.1 from training. Duration: 0.92725 2023-10-04 14:12:16,166 WARNING [train.py:1204] (2/4) Exclude cut with ID _42863_210_9070_1_1533902398788_3696920_338-156851-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 14:12:16,200 WARNING [train.py:1204] (2/4) Exclude cut with ID _5289_210_2084_1_1527678202736_3779299_392-261988-0_sp0.9 from training. Duration: 0.72225 2023-10-04 14:12:20,161 WARNING [train.py:1204] (2/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_409-84682-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 14:12:20,248 WARNING [train.py:1204] (2/4) Exclude cut with ID _9235_210_5219_1_1528891154253_8046120_30-326681-0_sp0.9 from training. Duration: 0.92225 2023-10-04 14:12:20,253 WARNING [train.py:1204] (2/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_389-184695-0_sp1.1 from training. Duration: 0.92725 2023-10-04 14:12:20,283 WARNING [train.py:1204] (2/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_442-320317-0 from training. Duration: 0.82 2023-10-04 14:12:26,141 WARNING [train.py:1204] (2/4) Exclude cut with ID _44778_210_2054_1_1533798127203_7443090_756-29911-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 14:12:27,498 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_401-112036-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 14:12:28,938 WARNING [train.py:1204] (2/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_181-308827-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 14:12:29,770 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.ff3_skip_rate, batch_count=1682400.0, ans=0.0 2023-10-04 14:12:30,789 WARNING [train.py:1204] (2/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_332-258725-0_sp1.1 from training. Duration: 0.963625 2023-10-04 14:12:30,874 WARNING [train.py:1204] (2/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_188-260011-0_sp0.9 from training. Duration: 0.811125 2023-10-04 14:12:32,201 WARNING [train.py:1204] (2/4) Exclude cut with ID _49039_210_6315_1_1533967537541_3846118_99-302239-0_sp1.1 from training. Duration: 0.963625 2023-10-04 14:12:34,888 WARNING [train.py:1204] (2/4) Exclude cut with ID _4122_210_2084_1_1526713164417_4353280_312-294828-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 14:12:36,115 INFO [train.py:1046] (2/4) Epoch 48, batch 2700, loss[loss=0.1518, simple_loss=0.2346, pruned_loss=0.03446, over 23597.00 frames. ], tot_loss[loss=0.154, simple_loss=0.2349, pruned_loss=0.03658, over 4720753.10 frames. ], batch size: 93, lr: 2.11e-03, grad_scale: 8.0 2023-10-04 14:12:36,161 WARNING [train.py:1204] (2/4) Exclude cut with ID _14922_210_6228_1_1530707435273_3693310_303-228738-0 from training. Duration: 0.84 2023-10-04 14:12:39,007 WARNING [train.py:1204] (2/4) Exclude cut with ID _14349_210_8341_1_1530615710818_3442470_115-285975-0_sp0.9 from training. Duration: 0.9555625 2023-10-04 14:12:41,852 WARNING [train.py:1204] (2/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_125-144174-0_sp1.1 from training. Duration: 0.3545625 2023-10-04 14:12:43,409 WARNING [train.py:1204] (2/4) Exclude cut with ID _53565_210_10635_1_1534337218335_3820259_351-54570-0_sp1.1 from training. Duration: 0.97275 2023-10-04 14:12:43,435 WARNING [train.py:1204] (2/4) Exclude cut with ID _28317_210_12742_1_1532480302381_8510325_410-40065-0_sp1.1 from training. Duration: 0.963625 2023-10-04 14:12:44,752 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_38965_210_4852_1_1533174906_3484735_167-339442-0_sp1.1 from training. Duration: 0.963625 2023-10-04 14:12:46,102 WARNING [train.py:1204] (2/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_265-164486-0_sp1.1 from training. Duration: 0.87275 2023-10-04 14:12:46,118 WARNING [train.py:1204] (2/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_725-182484-0_sp1.1 from training. Duration: 0.92725 2023-10-04 14:12:47,913 WARNING [train.py:1204] (2/4) Exclude cut with ID _28317_210_12742_1_1532480302381_8510325_607-213951-0_sp0.9 from training. Duration: 0.9333125 2023-10-04 14:12:47,934 WARNING [train.py:1204] (2/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_605-133395-0_sp1.1 from training. Duration: 0.6 2023-10-04 14:12:47,949 WARNING [train.py:1204] (2/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_351-294650-0 from training. Duration: 0.63 2023-10-04 14:12:48,009 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_14953_210_2151_1_1530701710_5616165_710-165400-0_sp1.1 from training. Duration: 0.7909375 2023-10-04 14:12:49,271 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.735e+02 2.027e+02 2.284e+02 2.628e+02 4.660e+02, threshold=4.569e+02, percent-clipped=1.0 2023-10-04 14:12:49,429 WARNING [train.py:1204] (2/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_237-58433-0_sp1.1 from training. Duration: 0.67275 2023-10-04 14:12:49,516 WARNING [train.py:1204] (2/4) Exclude cut with ID _5190_210_4557_1_1527244368799_373439_28-197030-0_sp1.1 from training. Duration: 0.6545625 2023-10-04 14:12:50,967 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_743-327651-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 14:12:53,857 WARNING [train.py:1204] (2/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_76-203867-0_sp0.9 from training. Duration: 0.8 2023-10-04 14:12:55,183 WARNING [train.py:1204] (2/4) Exclude cut with ID _14603_210_4220_1_1530615688441_3097319_274-274798-0 from training. Duration: 0.98 2023-10-04 14:12:55,238 WARNING [train.py:1204] (2/4) Exclude cut with ID _14087_210_2253_1_1530529224512_3014029_226-242140-0_sp1.1 from training. Duration: 0.736375 2023-10-04 14:12:59,969 WARNING [train.py:1204] (2/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_352-59958-0_sp1.1 from training. Duration: 0.7818125 2023-10-04 14:12:59,973 WARNING [train.py:1204] (2/4) Exclude cut with ID _39411_210_5986_1_1533538825030_3343649_191-251819-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 14:13:06,559 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7046_210_6753_1_1527895337_6920917_642-106799-0_sp1.1 from training. Duration: 0.836375 2023-10-04 14:13:06,566 WARNING [train.py:1204] (2/4) Exclude cut with ID _22826_210_9994_1_1531908201522_3279169_412-3442-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 14:13:07,886 WARNING [train.py:1204] (2/4) Exclude cut with ID _60057_210_10640_1_1534903065793_3333150_171-181133-0_sp0.9 from training. Duration: 0.97775 2023-10-04 14:13:07,895 WARNING [train.py:1204] (2/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_154-135696-0_sp1.1 from training. Duration: 0.72725 2023-10-04 14:13:09,445 WARNING [train.py:1204] (2/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_148-186805-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 14:13:12,294 WARNING [train.py:1204] (2/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_605-165641-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 14:13:12,300 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_174-277364-0_sp0.9 from training. Duration: 0.788875 2023-10-04 14:13:12,313 WARNING [train.py:1204] (2/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_89-321955-0_sp0.9 from training. Duration: 0.888875 2023-10-04 14:13:15,228 WARNING [train.py:1204] (2/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_438-105429-0_sp1.1 from training. Duration: 0.963625 2023-10-04 14:13:15,232 WARNING [train.py:1204] (2/4) Exclude cut with ID _14970_210_15709_1_1530773543493_1715730_187-20259-0_sp0.9 from training. Duration: 0.988875 2023-10-04 14:13:20,001 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.conv_module1.balancer2.prob, batch_count=1682666.6666666667, ans=0.125 2023-10-04 14:13:25,630 WARNING [train.py:1204] (2/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_385-193052-0_sp0.9 from training. Duration: 0.9555625 2023-10-04 14:13:26,912 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7113_210_1849_1_1527942685_3448768_194-311626-0_sp1.1 from training. Duration: 0.92725 2023-10-04 14:13:30,357 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_3780_210_2087_1_1526467389_2145572_176-107140-0_sp0.9 from training. Duration: 0.7666875 2023-10-04 14:13:30,359 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_36756_210_7234_1_1533026991_966276_123-213457-0_sp1.1 from training. Duration: 0.936375 2023-10-04 14:13:34,974 WARNING [train.py:1204] (2/4) Exclude cut with ID _23557_210_8750_1_1531890106370_6846255_456-244861-0_sp1.1 from training. Duration: 0.963625 2023-10-04 14:13:35,034 WARNING [train.py:1204] (2/4) Exclude cut with ID _37086_210_9038_1_1532999540891_7021250_242-349170-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 14:13:35,093 WARNING [train.py:1204] (2/4) Exclude cut with ID _41298_210_9019_1_1533430371450_4822143_471-281351-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 14:13:37,087 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_53378_210_17282_1_1534221071_5769509_572-126176-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 14:13:38,449 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_1507-263032-0_sp1.1 from training. Duration: 0.963625 2023-10-04 14:13:38,487 WARNING [train.py:1204] (2/4) Exclude cut with ID _13679_210_7497_1_1530534293662_3355098_411-186088-0_sp1.1 from training. Duration: 0.8545625 2023-10-04 14:13:41,211 WARNING [train.py:1204] (2/4) Exclude cut with ID _29345_210_4381_1_1532689168263_3860169_149-257268-0_sp1.1 from training. Duration: 0.736375 2023-10-04 14:13:41,312 WARNING [train.py:1204] (2/4) Exclude cut with ID _47790_210_16187_1_1533862814066_4207519_381-252014-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 14:13:41,318 WARNING [train.py:1204] (2/4) Exclude cut with ID _35799_210_5854_1_1533263487887_3529071_512-2717-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 14:13:43,003 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=1682733.3333333333, ans=0.1 2023-10-04 14:13:45,580 WARNING [train.py:1204] (2/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_505-205270-0_sp0.9 from training. Duration: 0.42225 2023-10-04 14:13:45,658 WARNING [train.py:1204] (2/4) Exclude cut with ID _14998_210_5219_1_1530705582080_7474970_731-20117-0_sp1.1 from training. Duration: 0.936375 2023-10-04 14:13:49,737 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_37351_210_7925_1_1533012931_3812113_265-85780-0_sp1.1 from training. Duration: 0.8 2023-10-04 14:13:49,744 WARNING [train.py:1204] (2/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_313-134930-0 from training. Duration: 0.97 2023-10-04 14:13:49,869 WARNING [train.py:1204] (2/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_374-138971-0 from training. Duration: 0.94 2023-10-04 14:13:51,617 INFO [train.py:1046] (2/4) Epoch 48, batch 2750, loss[loss=0.1471, simple_loss=0.2295, pruned_loss=0.03241, over 24420.00 frames. ], tot_loss[loss=0.1535, simple_loss=0.2344, pruned_loss=0.03626, over 4715587.30 frames. ], batch size: 58, lr: 2.11e-03, grad_scale: 8.0 2023-10-04 14:13:51,704 WARNING [train.py:1204] (2/4) Exclude cut with ID _30304_210_6319_1_1532476827261_7582060_1227-287886-0_sp1.1 from training. Duration: 0.936375 2023-10-04 14:13:54,424 WARNING [train.py:1204] (2/4) Exclude cut with ID _59493_210_17282_1_1535078419077_3225879_332-217544-0_sp1.1 from training. Duration: 0.97275 2023-10-04 14:13:55,739 WARNING [train.py:1204] (2/4) Exclude cut with ID _23514_210_1389_1_1531832139785_3031439_361-119640-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 14:13:57,123 WARNING [train.py:1204] (2/4) Exclude cut with ID _69584_210_5219_1_1535785133775_4044450_142-118058-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 14:13:57,143 WARNING [train.py:1204] (2/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_178-177582-0_sp1.1 from training. Duration: 0.67275 2023-10-04 14:13:57,179 WARNING [train.py:1204] (2/4) Exclude cut with ID _25303_210_17519_1_1532050207576_3635970_41-195572-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 14:14:01,802 WARNING [train.py:1204] (2/4) Exclude cut with ID _55328_210_27767_1_1534580973260_4088170_329-341322-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 14:14:03,108 WARNING [train.py:1204] (2/4) Exclude cut with ID _5608_210_2106_1_1527415439089_6819768_909-82767-0_sp1.1 from training. Duration: 0.5545625 2023-10-04 14:14:03,151 WARNING [train.py:1204] (2/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_261-297503-0_sp1.1 from training. Duration: 0.9 2023-10-04 14:14:03,158 WARNING [train.py:1204] (2/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_459-291706-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 14:14:03,159 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7762_210_1794_1_1528199508_4057408_422-227034-0 from training. Duration: 0.73 2023-10-04 14:14:03,169 WARNING [train.py:1204] (2/4) Exclude cut with ID _3067_210_2667_1_1526212974640_4503168_250-285937-0_sp0.9 from training. Duration: 0.888875 2023-10-04 14:14:03,184 WARNING [train.py:1204] (2/4) Exclude cut with ID _66873_210_5517_1_1535454136506_4293370_332-253745-0_sp1.1 from training. Duration: 0.936375 2023-10-04 14:14:07,122 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.conv_skip_rate, batch_count=1682866.6666666667, ans=0.0 2023-10-04 14:14:08,142 WARNING [train.py:1204] (2/4) Exclude cut with ID _13938_210_9988_1_1530518718978_3693960_304-112784-0 from training. Duration: 0.46 2023-10-04 14:14:09,586 WARNING [train.py:1204] (2/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_226-267957-0_sp1.1 from training. Duration: 0.92725 2023-10-04 14:14:10,736 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_41822_210_13270_1_1533955976_4379284_608-254567-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 14:14:10,793 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_124-113562-0_sp1.1 from training. Duration: 0.8545625 2023-10-04 14:14:12,159 WARNING [train.py:1204] (2/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_117-290916-0_sp1.1 from training. Duration: 0.463625 2023-10-04 14:14:13,542 WARNING [train.py:1204] (2/4) Exclude cut with ID _28427_210_4220_1_1532581240967_3061069_102-66533-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 14:14:13,612 WARNING [train.py:1204] (2/4) Exclude cut with ID _55383_210_10635_1_1534468426172_2231669_134-128246-0_sp1.1 from training. Duration: 0.8090625 2023-10-04 14:14:14,955 WARNING [train.py:1204] (2/4) Exclude cut with ID _45871_210_5665_1_1533864614245_3296060_49-308316-0_sp1.1 from training. Duration: 0.97275 2023-10-04 14:14:14,997 WARNING [train.py:1204] (2/4) Exclude cut with ID _57572_210_5517_1_1534921219624_3809484_176-199205-0_sp1.1 from training. Duration: 0.97275 2023-10-04 14:14:19,307 WARNING [train.py:1204] (2/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_45-265684-0_sp1.1 from training. Duration: 0.6545625 2023-10-04 14:14:19,344 WARNING [train.py:1204] (2/4) Exclude cut with ID _15150_210_8341_1_1530752411803_7256384_469-239509-0_sp0.9 from training. Duration: 0.7555625 2023-10-04 14:14:21,117 WARNING [train.py:1204] (2/4) Exclude cut with ID _28461_210_2253_1_1532771775189_3041299_136-270644-0_sp1.1 from training. Duration: 0.8454375 2023-10-04 14:14:21,196 WARNING [train.py:1204] (2/4) Exclude cut with ID _69584_210_5219_1_1535785133775_4044450_302-47302-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 14:14:22,643 WARNING [train.py:1204] (2/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_166-128941-0_sp1.1 from training. Duration: 0.4909375 2023-10-04 14:14:28,267 WARNING [train.py:1204] (2/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_559-120504-0_sp1.1 from training. Duration: 0.97275 2023-10-04 14:14:30,223 WARNING [train.py:1204] (2/4) Exclude cut with ID _6456_210_1534_1_1527681634212_3181049_345-252877-0_sp1.1 from training. Duration: 0.5818125 2023-10-04 14:14:30,251 WARNING [train.py:1204] (2/4) Exclude cut with ID _28317_210_12742_1_1532480302381_8510325_902-233003-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 14:14:36,348 WARNING [train.py:1204] (2/4) Exclude cut with ID _36538_210_9994_1_1533204020363_3795620_204-261385-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 14:14:36,352 WARNING [train.py:1204] (2/4) Exclude cut with ID _14821_210_8750_1_1530680503991_6829359_189-297451-0_sp1.1 from training. Duration: 0.5 2023-10-04 14:14:36,384 WARNING [train.py:1204] (2/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_1211-226066-0_sp0.9 from training. Duration: 0.8444375 2023-10-04 14:14:43,933 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6786_210_1388_1_1527839699_4194495_273-149841-0_sp1.1 from training. Duration: 0.836375 2023-10-04 14:14:43,968 WARNING [train.py:1204] (2/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_380-162648-0_sp1.1 from training. Duration: 0.8545625 2023-10-04 14:14:43,968 WARNING [train.py:1204] (2/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_494-199538-0 from training. Duration: 0.75 2023-10-04 14:14:46,862 WARNING [train.py:1204] (2/4) Exclude cut with ID _49097_210_16049_1_1533952780207_4360979_276-87053-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 14:14:49,668 WARNING [train.py:1204] (2/4) Exclude cut with ID _5444_210_2151_1_1527418808464_5312210_726-27165-0 from training. Duration: 0.67 2023-10-04 14:14:53,055 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.conv_skip_rate, batch_count=1683066.6666666667, ans=0.0 2023-10-04 14:14:54,223 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_128-202104-0_sp1.1 from training. Duration: 0.57275 2023-10-04 14:14:55,780 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_11349_210_6753_1_1529800861_1101207_26-65602-0_sp1.1 from training. Duration: 0.87275 2023-10-04 14:14:57,095 WARNING [train.py:1204] (2/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_159-328157-0 from training. Duration: 0.52 2023-10-04 14:14:57,177 WARNING [train.py:1204] (2/4) Exclude cut with ID _14069_210_5632_1_1530512692033_4803390_98-21934-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 14:14:59,873 WARNING [train.py:1204] (2/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_352-59958-0_sp0.9 from training. Duration: 0.9555625 2023-10-04 14:14:59,928 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6163_210_6400_1_1527506746_4263068_283-137811-0 from training. Duration: 0.77 2023-10-04 14:15:01,776 WARNING [train.py:1204] (2/4) Exclude cut with ID _21989_210_5854_1_1531899214931_3271360_133-44759-0_sp1.1 from training. Duration: 0.7 2023-10-04 14:15:05,157 WARNING [train.py:1204] (2/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_21-174514-0_sp0.9 from training. Duration: 0.47775 2023-10-04 14:15:06,638 INFO [train.py:1046] (2/4) Epoch 48, batch 2800, loss[loss=0.1454, simple_loss=0.2259, pruned_loss=0.03243, over 24445.00 frames. ], tot_loss[loss=0.1525, simple_loss=0.233, pruned_loss=0.03597, over 4712306.46 frames. ], batch size: 58, lr: 2.11e-03, grad_scale: 16.0 2023-10-04 14:15:06,692 WARNING [train.py:1204] (2/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_201-78811-0_sp1.1 from training. Duration: 0.963625 2023-10-04 14:15:06,730 WARNING [train.py:1204] (2/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_537-164630-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 14:15:08,086 WARNING [train.py:1204] (2/4) Exclude cut with ID _14087_210_2253_1_1530529224512_3014029_156-257607-0 from training. Duration: 0.83 2023-10-04 14:15:08,099 WARNING [train.py:1204] (2/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_308-154450-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 14:15:08,133 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_45399_210_4852_1_1533624725_7579737_214-240574-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 14:15:09,577 WARNING [train.py:1204] (2/4) Exclude cut with ID _29303_210_2575_1_1532392237303_7410649_615-305517-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 14:15:09,730 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.attention_skip_rate, batch_count=1683133.3333333333, ans=0.0 2023-10-04 14:15:09,788 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.bypass_mid.scale_min, batch_count=1683133.3333333333, ans=0.2 2023-10-04 14:15:10,854 WARNING [train.py:1204] (2/4) Exclude cut with ID IC0125W0002-61808 from training. Duration: 0.708 2023-10-04 14:15:10,855 WARNING [train.py:1204] (2/4) Exclude cut with ID IC0125W0017-61823 from training. Duration: 0.759 2023-10-04 14:15:15,547 WARNING [train.py:1204] (2/4) Exclude cut with ID _53219_210_4525_1_1534834836008_4064825_4-145291-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 14:15:15,813 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.conv_module1.balancer1.prob, batch_count=1683133.3333333333, ans=0.125 2023-10-04 14:15:16,895 WARNING [train.py:1204] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_185-174805-0_sp1.1 from training. Duration: 0.6181875 2023-10-04 14:15:16,918 WARNING [train.py:1204] (2/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_635-193046-0_sp1.1 from training. Duration: 0.92725 2023-10-04 14:15:19,685 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.647e+02 2.049e+02 2.249e+02 2.706e+02 5.185e+02, threshold=4.498e+02, percent-clipped=5.0 2023-10-04 14:15:19,820 WARNING [train.py:1204] (2/4) Exclude cut with ID _46067_210_4220_1_1534125571066_3307450_321-258866-0_sp1.1 from training. Duration: 0.97275 2023-10-04 14:15:21,682 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8558_210_1410_1_1528505225_7613406_131-329524-0 from training. Duration: 0.79 2023-10-04 14:15:24,352 WARNING [train.py:1204] (2/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_847-41334-0_sp0.9 from training. Duration: 0.488875 2023-10-04 14:15:24,444 WARNING [train.py:1204] (2/4) Exclude cut with ID _14227_210_4381_1_1530701804286_3940259_125-275642-0 from training. Duration: 0.99 2023-10-04 14:15:27,018 WARNING [train.py:1204] (2/4) Exclude cut with ID _25248_210_8750_1_1532062825941_5554180_248-177255-0_sp1.1 from training. Duration: 0.963625 2023-10-04 14:15:27,060 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_40047_210_2290_1_1533290519_2374311_195-165693-0_sp1.1 from training. Duration: 0.8090625 2023-10-04 14:15:27,063 WARNING [train.py:1204] (2/4) Exclude cut with ID _23109_210_4381_1_1532150828777_901949_29-258672-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 14:15:27,994 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.1.encoder.layers.0.conv_module2.whiten, num_groups=1, num_channels=256, metric=8.46 vs. limit=15.0 2023-10-04 14:15:30,005 WARNING [train.py:1204] (2/4) Exclude cut with ID _14821_210_8750_1_1530680503991_6829359_2-227151-0_sp1.1 from training. Duration: 0.7545625 2023-10-04 14:15:30,042 WARNING [train.py:1204] (2/4) Exclude cut with ID _49643_210_7989_1_1534640244224_3939031_565-53982-0_sp1.1 from training. Duration: 0.963625 2023-10-04 14:15:30,053 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7544_210_6753_1_1528591327_1471426_33-346031-0_sp0.9 from training. Duration: 0.67775 2023-10-04 14:15:32,870 WARNING [train.py:1204] (2/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_95-61782-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 14:15:33,163 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.feed_forward2.hidden_balancer.prob, batch_count=1683200.0, ans=0.125 2023-10-04 14:15:41,040 WARNING [train.py:1204] (2/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_686-54383-0_sp0.9 from training. Duration: 0.9666875 2023-10-04 14:15:42,543 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_14056_210_4163_1_1530604483_7572827_505-276823-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 14:15:44,651 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.1.feed_forward2.hidden_balancer.prob, batch_count=1683266.6666666667, ans=0.125 2023-10-04 14:15:45,690 WARNING [train.py:1204] (2/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_745-128443-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 14:15:47,020 WARNING [train.py:1204] (2/4) Exclude cut with ID _3844_210_1390_1_1526727803582_2871209_20-251664-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 14:15:47,082 WARNING [train.py:1204] (2/4) Exclude cut with ID _36538_210_9994_1_1533204020363_3795620_487-164628-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 14:15:51,226 WARNING [train.py:1204] (2/4) Exclude cut with ID _5166_210_2084_1_1527073197160_4087993_309-222545-0_sp1.1 from training. Duration: 0.763625 2023-10-04 14:15:52,934 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_3531_210_3528_1_1526294203_5216456_309-227302-0 from training. Duration: 0.69 2023-10-04 14:15:53,001 WARNING [train.py:1204] (2/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_42-192460-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 14:15:54,359 WARNING [train.py:1204] (2/4) Exclude cut with ID _22083_210_8254_1_1531702862439_3678970_49-234546-0_sp1.1 from training. Duration: 0.7545625 2023-10-04 14:15:54,366 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_427-78097-0_sp0.9 from training. Duration: 0.888875 2023-10-04 14:15:57,286 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_43802_210_2290_1_1533516203_8829504_378-205254-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 14:15:57,338 WARNING [train.py:1204] (2/4) Exclude cut with ID _45329_210_7005_1_1534157951211_6774339_672-314830-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 14:15:57,972 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.4.encoder.layers.0.self_attn2.whiten, num_groups=1, num_channels=384, metric=13.60 vs. limit=22.5 2023-10-04 14:16:00,284 WARNING [train.py:1204] (2/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_90-30673-0_sp1.1 from training. Duration: 0.763625 2023-10-04 14:16:02,981 WARNING [train.py:1204] (2/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_563-329943-0_sp1.1 from training. Duration: 0.9 2023-10-04 14:16:03,002 WARNING [train.py:1204] (2/4) Exclude cut with ID _21632_210_2253_1_1531994001765_3744140_154-84090-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 14:16:03,003 WARNING [train.py:1204] (2/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_103-336064-0_sp0.9 from training. Duration: 0.5666875 2023-10-04 14:16:04,749 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_43-71989-0_sp0.9 from training. Duration: 0.6333125 2023-10-04 14:16:04,919 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.1.feed_forward1.out_proj.dropout_p, batch_count=1683400.0, ans=0.1 2023-10-04 14:16:06,628 WARNING [train.py:1204] (2/4) Exclude cut with ID _3867_210_1883_1_1526641200575_4475830_319-349850-0_sp0.9 from training. Duration: 0.7444375 2023-10-04 14:16:06,708 WARNING [train.py:1204] (2/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_629-179454-0_sp1.1 from training. Duration: 0.963625 2023-10-04 14:16:06,716 WARNING [train.py:1204] (2/4) Exclude cut with ID _65263_210_22356_1_1535763598691_3921037_49-222917-0 from training. Duration: 0.92 2023-10-04 14:16:06,745 WARNING [train.py:1204] (2/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_602-331772-0_sp1.1 from training. Duration: 0.936375 2023-10-04 14:16:09,326 WARNING [train.py:1204] (2/4) Exclude cut with ID _58414_210_8254_1_1535421609162_7151182_292-134978-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 14:16:09,369 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_38793_210_2544_1_1533176114_3849320_200-299292-0_sp1.1 from training. Duration: 0.936375 2023-10-04 14:16:10,791 WARNING [train.py:1204] (2/4) Exclude cut with ID _14970_210_15709_1_1530775547361_1966965_6-23328-0 from training. Duration: 0.86 2023-10-04 14:16:12,248 WARNING [train.py:1204] (2/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_386-225511-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 14:16:12,259 WARNING [train.py:1204] (2/4) Exclude cut with ID _38323_210_12108_1_1533121233369_3303049_416-11919-0_sp1.1 from training. Duration: 0.87275 2023-10-04 14:16:12,318 WARNING [train.py:1204] (2/4) Exclude cut with ID _4866_210_2577_1_1527404412284_4364659_256-306830-0_sp1.1 from training. Duration: 0.7909375 2023-10-04 14:16:13,644 WARNING [train.py:1204] (2/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_53-300544-0 from training. Duration: 0.67 2023-10-04 14:16:19,586 WARNING [train.py:1204] (2/4) Exclude cut with ID _41314_210_12651_1_1533459476543_3766460_80-74184-0_sp1.1 from training. Duration: 0.7545625 2023-10-04 14:16:19,613 WARNING [train.py:1204] (2/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_332-256168-0_sp1.1 from training. Duration: 0.6818125 2023-10-04 14:16:19,657 WARNING [train.py:1204] (2/4) Exclude cut with ID _48836_210_12210_1_1534075848293_3904150_429-35475-0_sp1.1 from training. Duration: 0.8090625 2023-10-04 14:16:20,899 INFO [train.py:1046] (2/4) Epoch 48, batch 2850, loss[loss=0.1167, simple_loss=0.1703, pruned_loss=0.03157, over 19257.00 frames. ], tot_loss[loss=0.1517, simple_loss=0.2322, pruned_loss=0.03561, over 4709151.52 frames. ], batch size: 388, lr: 2.11e-03, grad_scale: 16.0 2023-10-04 14:16:22,807 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_144-135833-0_sp1.1 from training. Duration: 0.97275 2023-10-04 14:16:25,622 WARNING [train.py:1204] (2/4) Exclude cut with ID _14224_210_4381_1_1530529062037_4264010_380-343670-0_sp0.9 from training. Duration: 0.97775 2023-10-04 14:16:26,884 WARNING [train.py:1204] (2/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_213-265174-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 14:16:26,915 WARNING [train.py:1204] (2/4) Exclude cut with ID _20455_210_5219_1_1531565996775_8098100_86-314283-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 14:16:28,401 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_41750_210_1960_1_1533520678_3948042_68-223813-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 14:16:28,460 WARNING [train.py:1204] (2/4) Exclude cut with ID _55153_210_2253_1_1534384786677_3162096_24-198429-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 14:16:29,901 WARNING [train.py:1204] (2/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_119-207162-0_sp0.9 from training. Duration: 0.788875 2023-10-04 14:16:30,153 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.feed_forward3.hidden_balancer.prob, batch_count=1683466.6666666667, ans=0.125 2023-10-04 14:16:31,250 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_12868_210_6753_1_1530702479_3610564_148-172861-0 from training. Duration: 0.5 2023-10-04 14:16:35,811 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_5522_210_3273_1_1527249606_3909096_318-203153-0 from training. Duration: 0.43 2023-10-04 14:16:35,818 WARNING [train.py:1204] (2/4) Exclude cut with ID _14884_210_2252_1_1530707459689_5561250_288-189949-0_sp1.1 from training. Duration: 0.97275 2023-10-04 14:16:38,400 WARNING [train.py:1204] (2/4) Exclude cut with ID _5294_210_1747_1_1527249643928_3696179_529-242298-0 from training. Duration: 0.95 2023-10-04 14:16:39,778 WARNING [train.py:1204] (2/4) Exclude cut with ID _36538_210_9994_1_1533204020363_3795620_503-110289-0_sp1.1 from training. Duration: 0.936375 2023-10-04 14:16:41,338 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.balancer1.prob, batch_count=1683533.3333333333, ans=0.125 2023-10-04 14:16:42,426 WARNING [train.py:1204] (2/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_385-193052-0 from training. Duration: 0.86 2023-10-04 14:16:42,621 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.balancer2.prob, batch_count=1683533.3333333333, ans=0.125 2023-10-04 14:16:43,822 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_456-22141-0 from training. Duration: 0.88 2023-10-04 14:16:44,559 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_38965_210_4852_1_1533174906_3484735_284-117840-0_sp1.1 from training. Duration: 0.936375 2023-10-04 14:16:50,743 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.3.encoder.layers.2.whiten, num_groups=1, num_channels=512, metric=3.62 vs. limit=12.0 2023-10-04 14:16:57,446 WARNING [train.py:1204] (2/4) Exclude cut with ID _32704_210_8954_1_1533017488840_6747370_75-201625-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 14:16:58,780 WARNING [train.py:1204] (2/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_255-312654-0_sp1.1 from training. Duration: 0.82725 2023-10-04 14:16:58,790 WARNING [train.py:1204] (2/4) Exclude cut with ID _47251_210_18628_1_1533814063128_4137250_214-88076-0_sp0.9 from training. Duration: 0.97775 2023-10-04 14:17:00,155 WARNING [train.py:1204] (2/4) Exclude cut with ID _4092_210_2151_1_1526643044449_3135759_227-213972-0_sp1.1 from training. Duration: 0.4909375 2023-10-04 14:17:00,171 WARNING [train.py:1204] (2/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_912-47473-0_sp1.1 from training. Duration: 0.6181875 2023-10-04 14:17:00,200 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6767_210_3273_1_1527855783_1718932_173-256601-0_sp1.1 from training. Duration: 0.72725 2023-10-04 14:17:01,567 WARNING [train.py:1204] (2/4) Exclude cut with ID _6274_210_2084_1_1527937583837_7494020_32-205288-0_sp1.1 from training. Duration: 0.7090625 2023-10-04 14:17:02,908 WARNING [train.py:1204] (2/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_39-248870-0 from training. Duration: 0.69 2023-10-04 14:17:06,157 WARNING [train.py:1204] (2/4) Exclude cut with ID _3541_210_2477_1_1526475408709_3263973_377-296839-0_sp1.1 from training. Duration: 0.5 2023-10-04 14:17:06,180 WARNING [train.py:1204] (2/4) Exclude cut with ID _13679_210_7497_1_1530534293662_3355098_413-56154-0_sp1.1 from training. Duration: 0.963625 2023-10-04 14:17:08,037 WARNING [train.py:1204] (2/4) Exclude cut with ID _14970_210_15709_1_1530775547361_1966965_201-84544-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 14:17:08,097 WARNING [train.py:1204] (2/4) Exclude cut with ID _21783_210_11357_1_1531742458730_3762679_105-95775-0_sp1.1 from training. Duration: 0.936375 2023-10-04 14:17:10,839 WARNING [train.py:1204] (2/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_131-191669-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 14:17:10,881 WARNING [train.py:1204] (2/4) Exclude cut with ID _14860_210_4915_1_1530702020340_7343240_695-4323-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 14:17:12,376 WARNING [train.py:1204] (2/4) Exclude cut with ID _39867_210_9826_1_1533553222030_9344259_991-130565-0_sp1.1 from training. Duration: 0.97275 2023-10-04 14:17:13,798 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_50-3289-0_sp1.1 from training. Duration: 0.82725 2023-10-04 14:17:15,161 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_54-347670-0_sp1.1 from training. Duration: 0.92725 2023-10-04 14:17:15,217 WARNING [train.py:1204] (2/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_200-153445-0_sp1.1 from training. Duration: 0.936375 2023-10-04 14:17:17,823 WARNING [train.py:1204] (2/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_279-302911-0_sp1.1 from training. Duration: 0.97275 2023-10-04 14:17:21,218 WARNING [train.py:1204] (2/4) Exclude cut with ID _14886_210_4220_1_1530702020355_3516190_159-73748-0_sp0.9 from training. Duration: 0.92225 2023-10-04 14:17:24,146 WARNING [train.py:1204] (2/4) Exclude cut with ID _53379_210_20034_1_1534222027363_3854654_23-222715-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 14:17:24,393 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.balancer2.prob, batch_count=1683733.3333333333, ans=0.125 2023-10-04 14:17:25,682 WARNING [train.py:1204] (2/4) Exclude cut with ID _12555_210_8750_1_1530075594402_3780239_449-267986-0 from training. Duration: 0.83 2023-10-04 14:17:25,692 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7544_210_6753_1_1528587722_3576686_295-258479-0 from training. Duration: 0.84 2023-10-04 14:17:27,485 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7002_210_1794_1_1527935634_4034444_487-236501-0_sp0.9 from training. Duration: 0.7333125 2023-10-04 14:17:27,734 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.balancer2.prob, batch_count=1683733.3333333333, ans=0.125 2023-10-04 14:17:28,806 WARNING [train.py:1204] (2/4) Exclude cut with ID _21783_210_11357_1_1531742458730_3762679_244-19107-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 14:17:28,826 WARNING [train.py:1204] (2/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_390-339137-0 from training. Duration: 0.56 2023-10-04 14:17:28,903 WARNING [train.py:1204] (2/4) Exclude cut with ID _7562_210_2084_1_1528887767916_3946229_186-326422-0_sp1.1 from training. Duration: 0.763625 2023-10-04 14:17:30,242 WARNING [train.py:1204] (2/4) Exclude cut with ID _33640_210_2655_1_1533117605479_4855469_645-145235-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 14:17:30,260 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_25767_210_2161_1_1532307483_3867748_506-171777-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 14:17:30,286 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_5438_210_1794_1_1527336687_2600009_233-58072-0_sp1.1 from training. Duration: 0.7 2023-10-04 14:17:30,287 WARNING [train.py:1204] (2/4) Exclude cut with ID IC0669W0063-330908 from training. Duration: 0.896 2023-10-04 14:17:31,494 WARNING [train.py:1204] (2/4) Exclude cut with ID IC0671W0070-331913 from training. Duration: 0.896 2023-10-04 14:17:31,497 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_258-310570-0_sp1.1 from training. Duration: 0.7454375 2023-10-04 14:17:32,839 WARNING [train.py:1204] (2/4) Exclude cut with ID _9461_210_4557_1_1529060514726_3677451_162-177507-0_sp1.1 from training. Duration: 0.97275 2023-10-04 14:17:34,197 INFO [train.py:1046] (2/4) Epoch 48, batch 2900, loss[loss=0.1669, simple_loss=0.2427, pruned_loss=0.04558, over 23680.00 frames. ], tot_loss[loss=0.1523, simple_loss=0.2331, pruned_loss=0.03575, over 4712992.68 frames. ], batch size: 232, lr: 2.11e-03, grad_scale: 16.0 2023-10-04 14:17:35,785 WARNING [train.py:1204] (2/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_26-175553-0_sp0.9 from training. Duration: 0.67775 2023-10-04 14:17:35,801 WARNING [train.py:1204] (2/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_7-150481-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 14:17:37,746 WARNING [train.py:1204] (2/4) Exclude cut with ID _23557_210_8750_1_1531890106370_6846255_72-273262-0_sp1.1 from training. Duration: 0.963625 2023-10-04 14:17:39,191 WARNING [train.py:1204] (2/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_367-61330-0 from training. Duration: 0.75 2023-10-04 14:17:41,965 WARNING [train.py:1204] (2/4) Exclude cut with ID _39867_210_9826_1_1533553222030_9344259_1337-11141-0_sp1.1 from training. Duration: 0.936375 2023-10-04 14:17:41,990 WARNING [train.py:1204] (2/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_101-293367-0 from training. Duration: 0.63 2023-10-04 14:17:43,285 WARNING [train.py:1204] (2/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_10-100367-0 from training. Duration: 0.98 2023-10-04 14:17:43,405 WARNING [train.py:1204] (2/4) Exclude cut with ID _60057_210_10640_1_1534903065793_3333150_171-181133-0_sp1.1 from training. Duration: 0.8 2023-10-04 14:17:43,408 WARNING [train.py:1204] (2/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_844-96989-0_sp0.9 from training. Duration: 0.788875 2023-10-04 14:17:47,414 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.804e+02 2.112e+02 2.348e+02 2.859e+02 4.205e+02, threshold=4.696e+02, percent-clipped=0.0 2023-10-04 14:17:47,491 WARNING [train.py:1204] (2/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_301-144595-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 14:17:47,573 WARNING [train.py:1204] (2/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_115-147343-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 14:17:50,822 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_3171_210_2087_1_1526109084_2330007_37-64334-0_sp1.1 from training. Duration: 0.7454375 2023-10-04 14:17:52,109 WARNING [train.py:1204] (2/4) Exclude cut with ID _19514_210_9259_1_1531530071330_5084230_769-272914-0_sp1.1 from training. Duration: 0.936375 2023-10-04 14:17:55,354 WARNING [train.py:1204] (2/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_76-311963-0_sp0.9 from training. Duration: 0.77775 2023-10-04 14:17:55,382 WARNING [train.py:1204] (2/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_526-62079-0 from training. Duration: 0.67 2023-10-04 14:17:55,442 WARNING [train.py:1204] (2/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_7-111954-0_sp0.9 from training. Duration: 0.62225 2023-10-04 14:17:56,785 WARNING [train.py:1204] (2/4) Exclude cut with ID _57997_210_4163_1_1534766425889_3961049_360-279140-0_sp1.1 from training. Duration: 0.97275 2023-10-04 14:17:59,602 WARNING [train.py:1204] (2/4) Exclude cut with ID _4866_210_2577_1_1527404412284_4364659_133-1696-0 from training. Duration: 0.99 2023-10-04 14:17:59,659 WARNING [train.py:1204] (2/4) Exclude cut with ID _5190_210_4557_1_1527244368799_373439_28-197030-0 from training. Duration: 0.72 2023-10-04 14:18:02,357 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_53-39003-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 14:18:02,360 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6673_210_3273_1_1527767790_3173240_351-167954-0 from training. Duration: 0.48 2023-10-04 14:18:03,622 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_2822_210_2715_1_1526112586_1999587_165-36656-0_sp1.1 from training. Duration: 0.8090625 2023-10-04 14:18:05,154 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_328-264892-0_sp1.1 from training. Duration: 0.92725 2023-10-04 14:18:05,158 WARNING [train.py:1204] (2/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_29-293896-0_sp0.9 from training. Duration: 0.67775 2023-10-04 14:18:05,332 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.1.feed_forward2.hidden_balancer.prob, batch_count=1683933.3333333333, ans=0.125 2023-10-04 14:18:08,418 WARNING [train.py:1204] (2/4) Exclude cut with ID _65263_210_22356_1_1535763598691_3921037_465-55448-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 14:18:09,868 WARNING [train.py:1204] (2/4) Exclude cut with ID _21989_210_5854_1_1531899214931_3271360_311-53106-0_sp1.1 from training. Duration: 0.97275 2023-10-04 14:18:13,938 WARNING [train.py:1204] (2/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_389-277915-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 14:18:16,663 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_69630_210_7234_1_1535791094_677837_63-277139-0_sp1.1 from training. Duration: 0.963625 2023-10-04 14:18:18,998 WARNING [train.py:1204] (2/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_85-138257-0 from training. Duration: 0.71 2023-10-04 14:18:19,014 WARNING [train.py:1204] (2/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_686-247600-0 from training. Duration: 0.92 2023-10-04 14:18:19,026 WARNING [train.py:1204] (2/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_108-255430-0_sp1.1 from training. Duration: 0.8818125 2023-10-04 14:18:19,726 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.2.encoder.layers.2.whiten, num_groups=1, num_channels=384, metric=3.74 vs. limit=12.0 2023-10-04 14:18:23,157 WARNING [train.py:1204] (2/4) Exclude cut with ID _14884_210_2252_1_1530707459689_5561250_114-201399-0_sp1.1 from training. Duration: 0.6909375 2023-10-04 14:18:24,696 WARNING [train.py:1204] (2/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_506-42614-0 from training. Duration: 0.79 2023-10-04 14:18:26,519 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_43802_210_2290_1_1533516203_8829504_85-195240-0_sp1.1 from training. Duration: 0.7909375 2023-10-04 14:18:26,864 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.nonlin_attention.balancer.prob, batch_count=1684000.0, ans=0.125 2023-10-04 14:18:29,380 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_141-36243-0_sp1.1 from training. Duration: 0.97275 2023-10-04 14:18:37,483 WARNING [train.py:1204] (2/4) Exclude cut with ID _7852_210_5219_1_1528286471316_8139400_358-287492-0_sp1.1 from training. Duration: 0.87275 2023-10-04 14:18:37,508 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_805-71497-0_sp0.9 from training. Duration: 0.788875 2023-10-04 14:18:39,428 WARNING [train.py:1204] (2/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_178-39722-0 from training. Duration: 0.49 2023-10-04 14:18:42,318 WARNING [train.py:1204] (2/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_428-181925-0_sp1.1 from training. Duration: 0.963625 2023-10-04 14:18:42,322 WARNING [train.py:1204] (2/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_770-200430-0 from training. Duration: 0.89 2023-10-04 14:18:43,680 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_1104-271009-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 14:18:43,716 WARNING [train.py:1204] (2/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_128-296581-0_sp0.9 from training. Duration: 0.82225 2023-10-04 14:18:47,731 INFO [train.py:1046] (2/4) Epoch 48, batch 2950, loss[loss=0.1528, simple_loss=0.2262, pruned_loss=0.03971, over 23529.00 frames. ], tot_loss[loss=0.1529, simple_loss=0.2336, pruned_loss=0.03605, over 4704909.91 frames. ], batch size: 256, lr: 2.11e-03, grad_scale: 16.0 2023-10-04 14:18:51,000 WARNING [train.py:1204] (2/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_19-62511-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 14:18:52,439 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_805-71497-0 from training. Duration: 0.71 2023-10-04 14:18:52,516 WARNING [train.py:1204] (2/4) Exclude cut with ID _44778_210_2054_1_1533798127203_7443090_573-152077-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 14:18:52,521 WARNING [train.py:1204] (2/4) Exclude cut with ID _42875_210_14856_1_1533704397487_4012433_393-177816-0_sp1.1 from training. Duration: 0.963625 2023-10-04 14:18:53,053 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.5.encoder.layers.0.self_attn2.whiten, num_groups=1, num_channels=256, metric=10.16 vs. limit=22.5 2023-10-04 14:18:54,321 WARNING [train.py:1204] (2/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_278-35159-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 14:18:55,708 WARNING [train.py:1204] (2/4) Exclude cut with ID _15037_210_2253_1_1530752294104_3267650_13-130361-0_sp1.1 from training. Duration: 0.9 2023-10-04 14:18:55,812 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_3019_210_2028_1_1526044245_3264865_127-135615-0 from training. Duration: 0.94 2023-10-04 14:18:57,113 WARNING [train.py:1204] (2/4) Exclude cut with ID _28317_210_12742_1_1532480302381_8510325_607-213951-0 from training. Duration: 0.84 2023-10-04 14:18:57,173 WARNING [train.py:1204] (2/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_436-318113-0_sp0.9 from training. Duration: 0.8333125 2023-10-04 14:18:57,173 WARNING [train.py:1204] (2/4) Exclude cut with ID _13938_210_9988_1_1530518718978_3693960_148-152225-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 14:19:04,062 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_2541_210_1794_1_1525590269_3084765_156-173713-0_sp1.1 from training. Duration: 0.736375 2023-10-04 14:19:05,457 WARNING [train.py:1204] (2/4) Exclude cut with ID _28323_210_16187_1_1532480395667_4100300_531-24157-0_sp1.1 from training. Duration: 0.8909375 2023-10-04 14:19:08,167 WARNING [train.py:1204] (2/4) Exclude cut with ID _29303_210_2575_1_1532392237303_7410649_382-217492-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 14:19:09,435 WARNING [train.py:1204] (2/4) Exclude cut with ID _58895_210_8750_1_1535184081320_3649089_270-158013-0_sp1.1 from training. Duration: 0.92725 2023-10-04 14:19:12,864 WARNING [train.py:1204] (2/4) Exclude cut with ID _29277_210_3279_1_1532581109293_3173069_311-206227-0_sp1.1 from training. Duration: 0.936375 2023-10-04 14:19:12,883 WARNING [train.py:1204] (2/4) Exclude cut with ID _7160_210_1876_1_1527940923231_3703799_282-322611-0_sp0.9 from training. Duration: 0.9666875 2023-10-04 14:19:15,549 WARNING [train.py:1204] (2/4) Exclude cut with ID _53219_210_4525_1_1534834836008_4064825_562-32295-0_sp1.1 from training. Duration: 0.963625 2023-10-04 14:19:16,865 WARNING [train.py:1204] (2/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_408-87330-0_sp1.1 from training. Duration: 0.963625 2023-10-04 14:19:16,882 WARNING [train.py:1204] (2/4) Exclude cut with ID _59202_210_26984_1_1534816810612_3549174_137-318710-0_sp1.1 from training. Duration: 0.8090625 2023-10-04 14:19:19,609 WARNING [train.py:1204] (2/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_372-30302-0 from training. Duration: 0.86 2023-10-04 14:19:22,996 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7543_210_6753_1_1528541907_1399322_178-56269-0_sp1.1 from training. Duration: 0.95275 2023-10-04 14:19:24,270 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9109W0012-549758 from training. Duration: 0.512 2023-10-04 14:19:24,347 WARNING [train.py:1204] (2/4) Exclude cut with ID _5410_210_1388_1_1527397055795_1719020_39-112221-0_sp0.9 from training. Duration: 0.7666875 2023-10-04 14:19:25,831 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9121W0012-554933 from training. Duration: 0.896 2023-10-04 14:19:27,714 WARNING [train.py:1204] (2/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_476-272757-0 from training. Duration: 0.93 2023-10-04 14:19:27,743 WARNING [train.py:1204] (2/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_1085-171718-0_sp1.1 from training. Duration: 0.92725 2023-10-04 14:19:27,786 WARNING [train.py:1204] (2/4) Exclude cut with ID _8480_210_1874_1_1528589391205_6728330_309-139896-0_sp1.1 from training. Duration: 0.736375 2023-10-04 14:19:27,786 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9130W0113-559703 from training. Duration: 0.896 2023-10-04 14:19:27,790 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_292-265928-0_sp1.1 from training. Duration: 0.463625 2023-10-04 14:19:30,493 WARNING [train.py:1204] (2/4) Exclude cut with ID _8772_210_1850_1_1528635595972_3664011_96-12756-0 from training. Duration: 0.98 2023-10-04 14:19:31,825 WARNING [train.py:1204] (2/4) Exclude cut with ID _12556_210_8341_1_1530081024100_7327690_725-193477-0_sp1.1 from training. Duration: 0.8909375 2023-10-04 14:19:31,871 WARNING [train.py:1204] (2/4) Exclude cut with ID _5289_210_2084_1_1527678202736_3779299_142-103317-0_sp1.1 from training. Duration: 0.763625 2023-10-04 14:19:34,572 WARNING [train.py:1204] (2/4) Exclude cut with ID _45012_210_12651_1_1533608435136_5371380_206-51075-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 14:19:37,267 WARNING [train.py:1204] (2/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_176-56120-0_sp1.1 from training. Duration: 0.7545625 2023-10-04 14:19:37,292 WARNING [train.py:1204] (2/4) Exclude cut with ID _40284_210_2084_1_1533513679814_3704279_220-201864-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 14:19:37,328 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9165W0004-576638 from training. Duration: 0.896 2023-10-04 14:19:37,356 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_5438_210_1794_1_1527336687_2600009_218-343336-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 14:19:37,384 WARNING [train.py:1204] (2/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_318-162017-0 from training. Duration: 0.91 2023-10-04 14:19:42,073 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_49544_210_1794_1_1534379770_5262083_483-349298-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 14:19:43,423 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.1.bypass_mid.scale_min, batch_count=1684333.3333333333, ans=0.2 2023-10-04 14:19:44,658 WARNING [train.py:1204] (2/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_197-28164-0_sp0.9 from training. Duration: 0.888875 2023-10-04 14:19:44,722 WARNING [train.py:1204] (2/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_643-52007-0 from training. Duration: 0.57 2023-10-04 14:19:44,740 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_38965_210_4852_1_1533174906_3484735_403-332963-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 14:19:47,357 WARNING [train.py:1204] (2/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_340-174176-0 from training. Duration: 0.49 2023-10-04 14:19:51,395 WARNING [train.py:1204] (2/4) Exclude cut with ID _56708_210_13177_1_1535252351282_3422899_363-917-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 14:19:51,515 WARNING [train.py:1204] (2/4) Exclude cut with ID _19896_210_1883_1_1531709022662_6649569_525-159063-0_sp1.1 from training. Duration: 0.936375 2023-10-04 14:19:51,541 WARNING [train.py:1204] (2/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_430-116139-0_sp0.9 from training. Duration: 0.9444375 2023-10-04 14:19:54,809 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_53753_210_7234_1_1534320003_3594148_86-329250-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 14:19:54,828 WARNING [train.py:1204] (2/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_406-275649-0_sp1.1 from training. Duration: 0.3818125 2023-10-04 14:19:55,164 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.conv_module1.balancer1.prob, batch_count=1684400.0, ans=0.125 2023-10-04 14:19:56,340 WARNING [train.py:1204] (2/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_1083-147536-0_sp1.1 from training. Duration: 0.8818125 2023-10-04 14:19:56,413 WARNING [train.py:1204] (2/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_54-311741-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 14:19:56,418 WARNING [train.py:1204] (2/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_237-58433-0_sp0.9 from training. Duration: 0.82225 2023-10-04 14:19:58,410 WARNING [train.py:1204] (2/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_98-51676-0_sp1.1 from training. Duration: 0.663625 2023-10-04 14:19:58,468 WARNING [train.py:1204] (2/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_386-270472-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 14:19:58,726 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.balancer2.prob, batch_count=1684400.0, ans=0.125 2023-10-04 14:19:59,770 WARNING [train.py:1204] (2/4) Exclude cut with ID _19023_210_4353_1_1531378892552_5820780_83-254794-0_sp1.1 from training. Duration: 0.7818125 2023-10-04 14:20:01,166 INFO [train.py:1046] (2/4) Epoch 48, batch 3000, loss[loss=0.1439, simple_loss=0.2229, pruned_loss=0.03245, over 23371.00 frames. ], tot_loss[loss=0.1536, simple_loss=0.2346, pruned_loss=0.03635, over 4709296.41 frames. ], batch size: 119, lr: 2.11e-03, grad_scale: 16.0 2023-10-04 14:20:01,166 INFO [train.py:1069] (2/4) Computing validation loss 2023-10-04 14:20:13,592 INFO [train.py:1078] (2/4) Epoch 48, validation: loss=0.3623, simple_loss=0.2785, pruned_loss=0.223, over 1125622.00 frames. 2023-10-04 14:20:13,593 INFO [train.py:1079] (2/4) Maximum memory allocated so far is 20967MB 2023-10-04 14:20:13,668 WARNING [train.py:1204] (2/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_314-93656-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 14:20:13,680 WARNING [train.py:1204] (2/4) Exclude cut with ID _14170_210_5219_1_1530525949868_4469650_336-213530-0 from training. Duration: 0.9 2023-10-04 14:20:13,804 WARNING [train.py:1204] (2/4) Exclude cut with ID _36686_210_5665_1_1533778225913_3646410_65-221120-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 14:20:16,430 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_26433_210_12108_1_1532157158_4056480_271-68178-0_sp1.1 from training. Duration: 0.92725 2023-10-04 14:20:17,718 WARNING [train.py:1204] (2/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_46-207599-0_sp0.9 from training. Duration: 0.711125 2023-10-04 14:20:20,996 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9303W0114-637133 from training. Duration: 0.896 2023-10-04 14:20:21,031 WARNING [train.py:1204] (2/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_490-207861-0 from training. Duration: 0.98 2023-10-04 14:20:22,447 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6767_210_3273_1_1527855783_1718932_173-256601-0_sp0.9 from training. Duration: 0.888875 2023-10-04 14:20:22,479 WARNING [train.py:1204] (2/4) Exclude cut with ID _13921_210_9994_1_1530841782272_3503520_327-69677-0_sp1.1 from training. Duration: 0.8181875 2023-10-04 14:20:24,348 WARNING [train.py:1204] (2/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_508-335369-0 from training. Duration: 0.48 2023-10-04 14:20:24,371 WARNING [train.py:1204] (2/4) Exclude cut with ID _64631_210_27767_1_1535349580068_4165160_24-46567-0_sp1.1 from training. Duration: 0.963625 2023-10-04 14:20:27,082 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.705e+02 2.063e+02 2.332e+02 2.863e+02 4.745e+02, threshold=4.665e+02, percent-clipped=1.0 2023-10-04 14:20:30,005 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_89-35748-0_sp1.1 from training. Duration: 0.7181875 2023-10-04 14:20:40,387 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_26433_210_12108_1_1532157158_4056480_578-262107-0_sp1.1 from training. Duration: 0.9 2023-10-04 14:20:46,552 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.3.encoder.layers.3.conv_module1.whiten, num_groups=1, num_channels=512, metric=6.82 vs. limit=15.0 2023-10-04 14:20:47,756 WARNING [train.py:1204] (2/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_129-337802-0 from training. Duration: 0.72 2023-10-04 14:20:49,702 WARNING [train.py:1204] (2/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_356-167816-0_sp0.9 from training. Duration: 0.8 2023-10-04 14:20:51,169 WARNING [train.py:1204] (2/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_137-116442-0_sp1.1 from training. Duration: 0.7090625 2023-10-04 14:20:51,205 WARNING [train.py:1204] (2/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_358-172940-0_sp1.1 from training. Duration: 0.963625 2023-10-04 14:20:51,224 WARNING [train.py:1204] (2/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_668-344147-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 14:20:52,607 WARNING [train.py:1204] (2/4) Exclude cut with ID _62337_210_12392_1_1535009273105_3412229_255-157667-0_sp1.1 from training. Duration: 0.936375 2023-10-04 14:20:53,860 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_486-151131-0 from training. Duration: 0.85 2023-10-04 14:20:54,031 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_53638_210_21581_1_1534297319_5808449_885-80172-0 from training. Duration: 0.96 2023-10-04 14:20:55,399 WARNING [train.py:1204] (2/4) Exclude cut with ID _50093_210_4220_1_1534417141207_3432299_105-183067-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 14:20:57,261 WARNING [train.py:1204] (2/4) Exclude cut with ID _7562_210_2084_1_1528887767916_3946229_345-169156-0_sp0.9 from training. Duration: 0.6444375 2023-10-04 14:20:59,949 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_4528_210_3273_1_1527076654_3276777_209-219804-0_sp0.9 from training. Duration: 0.7666875 2023-10-04 14:20:59,986 WARNING [train.py:1204] (2/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_35-153886-0_sp0.9 from training. Duration: 0.9333125 2023-10-04 14:21:00,049 WARNING [train.py:1204] (2/4) Exclude cut with ID _30800_210_22382_1_1532499347904_4515959_332-121925-0_sp1.1 from training. Duration: 0.92725 2023-10-04 14:21:00,051 WARNING [train.py:1204] (2/4) Exclude cut with ID _59202_210_26984_1_1534816810612_3549174_495-65515-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 14:21:00,296 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.feed_forward1.out_proj.dropout_p, batch_count=1684666.6666666667, ans=0.1 2023-10-04 14:21:04,169 WARNING [train.py:1204] (2/4) Exclude cut with ID _6274_210_2084_1_1527937583837_7494020_769-168302-0_sp1.1 from training. Duration: 0.6454375 2023-10-04 14:21:04,205 WARNING [train.py:1204] (2/4) Exclude cut with ID _24271_210_12658_1_1531987270022_7190519_708-71345-0_sp1.1 from training. Duration: 0.936375 2023-10-04 14:21:04,205 WARNING [train.py:1204] (2/4) Exclude cut with ID 3033-130750-0096-55598-0_sp0.9 from training. Duration: 0.92225 2023-10-04 14:21:06,966 WARNING [train.py:1204] (2/4) Exclude cut with ID _14087_210_2253_1_1530529224512_3014029_219-224445-0_sp0.9 from training. Duration: 0.9333125 2023-10-04 14:21:10,433 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_174-277364-0 from training. Duration: 0.71 2023-10-04 14:21:11,732 WARNING [train.py:1204] (2/4) Exclude cut with ID _45021_210_9994_1_1533619751094_3330288_96-239010-0_sp1.1 from training. Duration: 0.82725 2023-10-04 14:21:11,768 WARNING [train.py:1204] (2/4) Exclude cut with ID _31602_210_5854_1_1532677021456_4458250_489-305116-0_sp1.1 from training. Duration: 0.97275 2023-10-04 14:21:11,791 WARNING [train.py:1204] (2/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_269-154726-0_sp0.9 from training. Duration: 0.9555625 2023-10-04 14:21:14,840 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.attention_skip_rate, batch_count=1684733.3333333333, ans=0.0 2023-10-04 14:21:15,959 WARNING [train.py:1204] (2/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_126-54418-0_sp1.1 from training. Duration: 0.92725 2023-10-04 14:21:16,005 WARNING [train.py:1204] (2/4) Exclude cut with ID _42326_210_7770_1_1533814282531_3707269_182-224159-0_sp1.1 from training. Duration: 0.92725 2023-10-04 14:21:17,412 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7002_210_1794_1_1527939683_5157286_1-281264-0_sp0.9 from training. Duration: 0.488875 2023-10-04 14:21:17,438 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_86-16881-0 from training. Duration: 0.58 2023-10-04 14:21:18,810 WARNING [train.py:1204] (2/4) Exclude cut with ID _19514_210_9259_1_1531530071330_5084230_637-279094-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 14:21:18,845 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_26433_210_12108_1_1532157158_4056480_578-262107-0 from training. Duration: 0.99 2023-10-04 14:21:18,890 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6291_210_3273_1_1527683649_1078498_82-221196-0_sp1.1 from training. Duration: 0.6909375 2023-10-04 14:21:20,978 WARNING [train.py:1204] (2/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_402-102311-0 from training. Duration: 0.67 2023-10-04 14:21:25,656 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1015-343884-0_sp1.1 from training. Duration: 0.563625 2023-10-04 14:21:25,738 WARNING [train.py:1204] (2/4) Exclude cut with ID _13684_210_7497_1_1530619354863_3598359_312-235606-0_sp1.1 from training. Duration: 0.5454375 2023-10-04 14:21:25,768 WARNING [train.py:1204] (2/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_55-31904-0 from training. Duration: 0.88 2023-10-04 14:21:27,165 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_3133_210_2087_1_1526106446_2324690_25-332138-0 from training. Duration: 0.91 2023-10-04 14:21:27,167 WARNING [train.py:1204] (2/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_912-282412-0_sp0.9 from training. Duration: 0.5444375 2023-10-04 14:21:27,237 WARNING [train.py:1204] (2/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_458-299435-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 14:21:29,209 INFO [train.py:1046] (2/4) Epoch 48, batch 3050, loss[loss=0.1548, simple_loss=0.2347, pruned_loss=0.03751, over 23872.00 frames. ], tot_loss[loss=0.154, simple_loss=0.2349, pruned_loss=0.03659, over 4700961.28 frames. ], batch size: 212, lr: 2.11e-03, grad_scale: 16.0 2023-10-04 14:21:30,617 WARNING [train.py:1204] (2/4) Exclude cut with ID _69584_210_5219_1_1535785133775_4044450_275-88403-0_sp1.1 from training. Duration: 0.92725 2023-10-04 14:21:30,626 WARNING [train.py:1204] (2/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_841-341679-0_sp1.1 from training. Duration: 0.52725 2023-10-04 14:21:30,632 WARNING [train.py:1204] (2/4) Exclude cut with ID _41391_210_7994_1_1533603771872_3638201_1-54521-0_sp1.1 from training. Duration: 0.97275 2023-10-04 14:21:30,678 WARNING [train.py:1204] (2/4) Exclude cut with ID _56235_210_10640_1_1534557335235_4161099_495-186609-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 14:21:32,160 WARNING [train.py:1204] (2/4) Exclude cut with ID _4681_210_3525_1_1527250689464_120889_15-126760-0 from training. Duration: 0.81 2023-10-04 14:21:33,608 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_3019_210_2028_1_1526044245_3264865_127-135615-0_sp1.1 from training. Duration: 0.8545625 2023-10-04 14:21:36,294 WARNING [train.py:1204] (2/4) Exclude cut with ID _30191_210_1664_1_1533189915047_7196670_915-21561-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 14:21:36,335 WARNING [train.py:1204] (2/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_6-214815-0_sp0.9 from training. Duration: 0.9444375 2023-10-04 14:21:39,464 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_45399_210_4852_1_1533624725_7579737_190-137494-0_sp1.1 from training. Duration: 0.97275 2023-10-04 14:21:42,131 WARNING [train.py:1204] (2/4) Exclude cut with ID _41314_210_12651_1_1533459476543_3766460_207-132038-0 from training. Duration: 0.94 2023-10-04 14:21:44,416 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.feed_forward3.out_whiten.whitening_limit, batch_count=1684866.6666666667, ans=15.0 2023-10-04 14:21:45,102 WARNING [train.py:1204] (2/4) Exclude cut with ID _14603_210_4220_1_1530615688441_3097319_90-31383-0 from training. Duration: 0.87 2023-10-04 14:21:46,361 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_15517_210_6753_1_1530952293_1664795_119-31001-0 from training. Duration: 0.94 2023-10-04 14:21:46,392 WARNING [train.py:1204] (2/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_684-85342-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 14:21:51,381 WARNING [train.py:1204] (2/4) Exclude cut with ID _14603_210_4220_1_1530615688441_3097319_97-325053-0_sp1.1 from training. Duration: 0.836375 2023-10-04 14:21:51,535 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=1684866.6666666667, ans=0.1 2023-10-04 14:21:55,546 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_68702_210_23689_1_1535799577_3420068_168-25150-0_sp1.1 from training. Duration: 0.97275 2023-10-04 14:21:55,561 WARNING [train.py:1204] (2/4) Exclude cut with ID _65134_210_14763_1_1535335393985_3505368_111-27984-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 14:21:55,606 WARNING [train.py:1204] (2/4) Exclude cut with ID _48678_210_5663_1_1533988830947_3769199_276-226230-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 14:21:58,673 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=1684933.3333333333, ans=0.1 2023-10-04 14:22:00,237 WARNING [train.py:1204] (2/4) Exclude cut with ID _28427_210_4220_1_1532581240967_3061069_43-84928-0_sp1.1 from training. Duration: 0.9 2023-10-04 14:22:00,275 WARNING [train.py:1204] (2/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_158-250116-0_sp1.1 from training. Duration: 0.563625 2023-10-04 14:22:00,531 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.self_attn_weights.pos_emb_skip_rate, batch_count=1684933.3333333333, ans=0.0 2023-10-04 14:22:01,640 WARNING [train.py:1204] (2/4) Exclude cut with ID _66873_210_5517_1_1535454136506_4293370_279-332556-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 14:22:01,689 WARNING [train.py:1204] (2/4) Exclude cut with ID _42863_210_9070_1_1533902398788_3696920_488-230146-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 14:22:01,689 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_387-247159-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 14:22:04,529 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6729_210_2550_1_1528032481_1788725_261-42619-0_sp1.1 from training. Duration: 0.97275 2023-10-04 14:22:04,865 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.attention_skip_rate, batch_count=1684933.3333333333, ans=0.0 2023-10-04 14:22:05,936 WARNING [train.py:1204] (2/4) Exclude cut with ID _19896_210_1883_1_1531709022662_6649569_831-44724-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 14:22:08,732 WARNING [train.py:1204] (2/4) Exclude cut with ID _55153_210_2253_1_1534384786677_3162096_280-285944-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 14:22:08,764 WARNING [train.py:1204] (2/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_448-4048-0 from training. Duration: 0.96 2023-10-04 14:22:08,814 WARNING [train.py:1204] (2/4) Exclude cut with ID _29678_210_7893_1_1532421042787_3906118_456-202548-0_sp1.1 from training. Duration: 0.97275 2023-10-04 14:22:08,835 WARNING [train.py:1204] (2/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_107-332398-0_sp1.1 from training. Duration: 0.7090625 2023-10-04 14:22:12,204 WARNING [train.py:1204] (2/4) Exclude cut with ID _13921_210_9994_1_1530838702800_3003429_46-149444-0_sp1.1 from training. Duration: 0.8545625 2023-10-04 14:22:13,512 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_257-20733-0_sp0.9 from training. Duration: 0.8444375 2023-10-04 14:22:13,560 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_53753_210_7234_1_1534320003_3594148_385-234809-0_sp1.1 from training. Duration: 0.963625 2023-10-04 14:22:14,945 WARNING [train.py:1204] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_292-215346-0_sp1.1 from training. Duration: 0.8818125 2023-10-04 14:22:17,662 WARNING [train.py:1204] (2/4) Exclude cut with ID _68965_210_1883_1_1535698962272_3497490_196-124268-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 14:22:17,702 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_23801_210_16223_1_1532066392_3294657_570-5439-0_sp1.1 from training. Duration: 0.8818125 2023-10-04 14:22:23,986 WARNING [train.py:1204] (2/4) Exclude cut with ID _5166_210_2084_1_1527073197160_4087993_349-6928-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 14:22:25,732 WARNING [train.py:1204] (2/4) Exclude cut with ID _60621_210_17071_1_1535013053530_3671739_226-255202-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 14:22:25,742 WARNING [train.py:1204] (2/4) Exclude cut with ID _41817_210_18835_1_1533434477160_7423049_448-130302-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 14:22:27,264 WARNING [train.py:1204] (2/4) Exclude cut with ID _6653_210_2629_1_1527919172441_6732679_758-143677-0_sp1.1 from training. Duration: 0.8909375 2023-10-04 14:22:29,083 WARNING [train.py:1204] (2/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_912-47473-0_sp0.9 from training. Duration: 0.7555625 2023-10-04 14:22:30,298 WARNING [train.py:1204] (2/4) Exclude cut with ID _6694_210_4599_1_1527852637490_3965529_356-169233-0_sp1.1 from training. Duration: 0.9 2023-10-04 14:22:30,388 WARNING [train.py:1204] (2/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_1-99680-0 from training. Duration: 0.67 2023-10-04 14:22:31,746 WARNING [train.py:1204] (2/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_199-86432-0_sp1.1 from training. Duration: 0.8909375 2023-10-04 14:22:31,777 WARNING [train.py:1204] (2/4) Exclude cut with ID _61148_210_6897_1_1535094022726_4217610_362-182430-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 14:22:34,465 WARNING [train.py:1204] (2/4) Exclude cut with ID _49376_210_5986_1_1533985278732_3370740_301-154763-0 from training. Duration: 0.98 2023-10-04 14:22:35,935 WARNING [train.py:1204] (2/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_572-185150-0_sp1.1 from training. Duration: 0.8818125 2023-10-04 14:22:36,486 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.3.encoder.layers.3.self_attn2.whiten, num_groups=1, num_channels=512, metric=12.42 vs. limit=22.5 2023-10-04 14:22:37,448 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.0.balancer1.prob, batch_count=1685066.6666666667, ans=0.125 2023-10-04 14:22:38,904 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=1685066.6666666667, ans=0.1 2023-10-04 14:22:41,484 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_47896_210_20508_1_1534492518_5958868_558-314572-0_sp1.1 from training. Duration: 0.8818125 2023-10-04 14:22:43,339 INFO [train.py:1046] (2/4) Epoch 48, batch 3100, loss[loss=0.1632, simple_loss=0.2501, pruned_loss=0.03822, over 24098.00 frames. ], tot_loss[loss=0.1533, simple_loss=0.234, pruned_loss=0.03624, over 4713826.82 frames. ], batch size: 80, lr: 2.11e-03, grad_scale: 8.0 2023-10-04 14:22:43,438 WARNING [train.py:1204] (2/4) Exclude cut with ID _45560_210_21982_1_1533812603951_3584999_111-6376-0_sp0.9 from training. Duration: 0.9333125 2023-10-04 14:22:44,883 WARNING [train.py:1204] (2/4) Exclude cut with ID _14170_210_5219_1_1530525949868_4469650_412-317077-0_sp1.1 from training. Duration: 0.6818125 2023-10-04 14:22:46,427 WARNING [train.py:1204] (2/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_718-68505-0 from training. Duration: 0.89 2023-10-04 14:22:49,110 WARNING [train.py:1204] (2/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_77-311404-0 from training. Duration: 0.94 2023-10-04 14:22:50,432 WARNING [train.py:1204] (2/4) Exclude cut with ID _60229_210_27767_1_1534838406158_3655350_264-279573-0 from training. Duration: 0.83 2023-10-04 14:22:50,545 WARNING [train.py:1204] (2/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_106-300985-0_sp1.1 from training. Duration: 0.8181875 2023-10-04 14:22:53,654 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_4183_210_2087_1_1526729100_1869838_79-277189-0_sp1.1 from training. Duration: 0.963625 2023-10-04 14:22:53,663 WARNING [train.py:1204] (2/4) Exclude cut with ID _28427_210_4220_1_1532581240967_3061069_195-295268-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 14:22:58,167 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.605e+02 2.071e+02 2.302e+02 2.680e+02 4.838e+02, threshold=4.605e+02, percent-clipped=1.0 2023-10-04 14:22:58,297 WARNING [train.py:1204] (2/4) Exclude cut with ID _4092_210_2151_1_1526643044449_3135759_356-278694-0_sp0.9 from training. Duration: 0.52225 2023-10-04 14:23:02,776 WARNING [train.py:1204] (2/4) Exclude cut with ID _14459_210_2228_1_1531443641946_7516700_777-71446-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 14:23:06,963 WARNING [train.py:1204] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_520-126729-0 from training. Duration: 0.7 2023-10-04 14:23:11,136 WARNING [train.py:1204] (2/4) Exclude cut with ID _13684_210_7497_1_1530619354863_3598359_312-235606-0_sp0.9 from training. Duration: 0.6666875 2023-10-04 14:23:11,174 WARNING [train.py:1204] (2/4) Exclude cut with ID _37537_210_2253_1_1533171758568_3421949_283-349402-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 14:23:12,484 WARNING [train.py:1204] (2/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_29-117932-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 14:23:12,517 WARNING [train.py:1204] (2/4) Exclude cut with ID _29303_210_2575_1_1532392237303_7410649_721-283351-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 14:23:14,382 WARNING [train.py:1204] (2/4) Exclude cut with ID _14886_210_4220_1_1530702020355_3516190_175-229073-0_sp0.9 from training. Duration: 0.5 2023-10-04 14:23:17,104 WARNING [train.py:1204] (2/4) Exclude cut with ID _15458_210_8341_1_1530858614263_3834410_70-268830-0_sp1.1 from training. Duration: 0.97275 2023-10-04 14:23:17,121 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_2295_210_2087_1_1525500993_2407863_234-83104-0 from training. Duration: 0.62 2023-10-04 14:23:17,122 WARNING [train.py:1204] (2/4) Exclude cut with ID _22961_210_8254_1_1531976366672_4278180_317-199546-0_sp1.1 from training. Duration: 0.7818125 2023-10-04 14:23:18,483 WARNING [train.py:1204] (2/4) Exclude cut with ID _27894_210_12392_1_1532415496078_6902439_547-196022-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 14:23:18,590 WARNING [train.py:1204] (2/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_46-207599-0 from training. Duration: 0.64 2023-10-04 14:23:20,028 WARNING [train.py:1204] (2/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_426-326246-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 14:23:22,202 WARNING [train.py:1204] (2/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_445-117398-0_sp0.9 from training. Duration: 0.988875 2023-10-04 14:23:23,474 WARNING [train.py:1204] (2/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_563-347832-0 from training. Duration: 0.89 2023-10-04 14:23:23,564 WARNING [train.py:1204] (2/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_252-216674-0 from training. Duration: 0.54 2023-10-04 14:23:24,983 WARNING [train.py:1204] (2/4) Exclude cut with ID _59611_210_8895_1_1534741198240_3650361_187-212779-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 14:23:25,049 WARNING [train.py:1204] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_282-159367-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 14:23:28,749 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_68702_210_23689_1_1535799577_3420068_33-136883-0_sp1.1 from training. Duration: 0.936375 2023-10-04 14:23:28,768 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_47228_210_4983_1_1533780021_3672027_184-293706-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 14:23:28,796 WARNING [train.py:1204] (2/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_656-188361-0_sp0.9 from training. Duration: 0.9666875 2023-10-04 14:23:31,909 WARNING [train.py:1204] (2/4) Exclude cut with ID _4866_210_2577_1_1527404412284_4364659_352-157930-0_sp0.9 from training. Duration: 0.9 2023-10-04 14:23:31,910 WARNING [train.py:1204] (2/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_210-106047-0_sp1.1 from training. Duration: 0.8818125 2023-10-04 14:23:33,312 WARNING [train.py:1204] (2/4) Exclude cut with ID _9389_210_1664_1_1529217337301_5289269_289-213417-0_sp1.1 from training. Duration: 0.8090625 2023-10-04 14:23:33,354 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_3910_210_1794_1_1526777667_3747260_178-41186-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 14:23:33,358 WARNING [train.py:1204] (2/4) Exclude cut with ID _27052_210_5986_1_1532329197187_3585009_24-172263-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 14:23:33,362 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_5522_210_3273_1_1527249606_3909096_318-203153-0_sp1.1 from training. Duration: 0.3909375 2023-10-04 14:23:38,541 WARNING [train.py:1204] (2/4) Exclude cut with ID _27894_210_12392_1_1532415496078_6902439_222-51982-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 14:23:39,958 WARNING [train.py:1204] (2/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_87-131427-0 from training. Duration: 0.78 2023-10-04 14:23:41,294 WARNING [train.py:1204] (2/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_372-298093-0_sp0.9 from training. Duration: 0.888875 2023-10-04 14:23:42,605 WARNING [train.py:1204] (2/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_663-166973-0 from training. Duration: 0.8 2023-10-04 14:23:44,536 WARNING [train.py:1204] (2/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_429-177469-0_sp1.1 from training. Duration: 0.936375 2023-10-04 14:23:44,571 WARNING [train.py:1204] (2/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_724-172651-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 14:23:44,623 WARNING [train.py:1204] (2/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_103-336064-0 from training. Duration: 0.51 2023-10-04 14:23:56,038 WARNING [train.py:1204] (2/4) Exclude cut with ID _48677_210_5663_1_1533902405919_3892989_111-11439-0 from training. Duration: 0.96 2023-10-04 14:23:57,941 INFO [train.py:1046] (2/4) Epoch 48, batch 3150, loss[loss=0.1625, simple_loss=0.2525, pruned_loss=0.03619, over 24237.00 frames. ], tot_loss[loss=0.1526, simple_loss=0.2332, pruned_loss=0.03599, over 4710531.76 frames. ], batch size: 74, lr: 2.11e-03, grad_scale: 8.0 2023-10-04 14:23:59,327 WARNING [train.py:1204] (2/4) Exclude cut with ID _8085_210_4557_1_1528455633493_3716100_95-103398-0_sp1.1 from training. Duration: 0.963625 2023-10-04 14:23:59,390 WARNING [train.py:1204] (2/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_1041-110837-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 14:24:00,850 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_50050_210_16223_1_1535435796_7290138_1155-121757-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 14:24:00,852 WARNING [train.py:1204] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_602-275260-0_sp0.9 from training. Duration: 0.911125 2023-10-04 14:24:02,236 WARNING [train.py:1204] (2/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_265-164486-0 from training. Duration: 0.96 2023-10-04 14:24:03,510 WARNING [train.py:1204] (2/4) Exclude cut with ID _46067_210_4220_1_1534125571066_3307450_142-229375-0_sp1.1 from training. Duration: 0.963625 2023-10-04 14:24:04,868 WARNING [train.py:1204] (2/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_310-26186-0_sp1.1 from training. Duration: 0.636375 2023-10-04 14:24:06,868 WARNING [train.py:1204] (2/4) Exclude cut with ID _28461_210_2253_1_1532771775189_3041299_136-270644-0 from training. Duration: 0.93 2023-10-04 14:24:08,305 WARNING [train.py:1204] (2/4) Exclude cut with ID _14860_210_4915_1_1530702020340_7343240_438-286372-0_sp1.1 from training. Duration: 0.936375 2023-10-04 14:24:09,780 WARNING [train.py:1204] (2/4) Exclude cut with ID IC0128W0004-63309 from training. Duration: 0.831 2023-10-04 14:24:09,977 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.bypass.scale_min, batch_count=1685466.6666666667, ans=0.2 2023-10-04 14:24:12,660 WARNING [train.py:1204] (2/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_135-174080-0 from training. Duration: 0.53 2023-10-04 14:24:12,701 WARNING [train.py:1204] (2/4) Exclude cut with ID _41326_210_5854_1_1533778076509_3492080_277-59511-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 14:24:12,772 WARNING [train.py:1204] (2/4) Exclude cut with ID IC0144W0112-71379 from training. Duration: 0.992 2023-10-04 14:24:14,112 WARNING [train.py:1204] (2/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_711-15688-0_sp0.9 from training. Duration: 0.47775 2023-10-04 14:24:14,224 WARNING [train.py:1204] (2/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_237-332027-0 from training. Duration: 0.95 2023-10-04 14:24:14,268 WARNING [train.py:1204] (2/4) Exclude cut with ID _7970_210_1883_1_1528455616296_3472230_25-178407-0 from training. Duration: 0.71 2023-10-04 14:24:14,380 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.0.ff3_skip_rate, batch_count=1685533.3333333333, ans=0.0 2023-10-04 14:24:16,017 WARNING [train.py:1204] (2/4) Exclude cut with ID _8480_210_1874_1_1528589391205_6728330_309-139896-0 from training. Duration: 0.81 2023-10-04 14:24:16,035 WARNING [train.py:1204] (2/4) Exclude cut with ID _39571_210_13449_1_1533801584143_3651077_319-268877-0_sp1.1 from training. Duration: 0.936375 2023-10-04 14:24:16,039 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_43802_210_2290_1_1533516203_8829504_155-246880-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 14:24:17,442 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_566-122599-0_sp1.1 from training. Duration: 0.936375 2023-10-04 14:24:18,826 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_12868_210_6753_1_1530701932_530275_18-76322-0 from training. Duration: 0.87 2023-10-04 14:24:19,359 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.5.encoder.layers.0.feed_forward3.out_whiten, num_groups=1, num_channels=256, metric=12.18 vs. limit=15.0 2023-10-04 14:24:20,208 WARNING [train.py:1204] (2/4) Exclude cut with ID _56494_210_4220_1_1535284837625_3033169_159-106035-0_sp1.1 from training. Duration: 0.963625 2023-10-04 14:24:20,246 WARNING [train.py:1204] (2/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_98-276013-0_sp1.1 from training. Duration: 0.963625 2023-10-04 14:24:20,305 WARNING [train.py:1204] (2/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_330-216124-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 14:24:23,303 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_177-58005-0_sp0.9 from training. Duration: 0.688875 2023-10-04 14:24:26,791 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.nonlin_attention.balancer.prob, batch_count=1685600.0, ans=0.125 2023-10-04 14:24:29,171 WARNING [train.py:1204] (2/4) Exclude cut with ID _56235_210_10640_1_1534557335235_4161099_343-139898-0 from training. Duration: 0.96 2023-10-04 14:24:29,239 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6961_210_2087_1_1527937168_6815011_395-185060-0_sp1.1 from training. Duration: 0.863625 2023-10-04 14:24:32,085 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7002_210_1794_1_1527939683_5157286_184-229630-0_sp0.9 from training. Duration: 0.82225 2023-10-04 14:24:33,380 WARNING [train.py:1204] (2/4) Exclude cut with ID _59170_210_6721_1_1534983633960_4203335_153-18895-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 14:24:33,430 WARNING [train.py:1204] (2/4) Exclude cut with ID _49643_210_7989_1_1534640244224_3939031_598-161926-0 from training. Duration: 0.9 2023-10-04 14:24:36,133 WARNING [train.py:1204] (2/4) Exclude cut with ID _4068_210_2629_1_1526697350395_1546259_224-288693-0 from training. Duration: 0.59 2023-10-04 14:24:36,199 WARNING [train.py:1204] (2/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_718-68505-0_sp1.1 from training. Duration: 0.8090625 2023-10-04 14:24:37,955 WARNING [train.py:1204] (2/4) Exclude cut with ID _4068_210_2629_1_1526697350395_1546259_224-288693-0_sp0.9 from training. Duration: 0.6555625 2023-10-04 14:24:37,968 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_2296_210_2087_1_1525503448_3397335_57-185430-0_sp0.9 from training. Duration: 0.6666875 2023-10-04 14:24:39,313 WARNING [train.py:1204] (2/4) Exclude cut with ID _15458_210_8341_1_1530858614263_3834410_49-235472-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 14:24:39,316 WARNING [train.py:1204] (2/4) Exclude cut with ID _64135_210_2253_1_1535677267876_7608972_498-5039-0_sp0.9 from training. Duration: 0.8666875 2023-10-04 14:24:39,415 WARNING [train.py:1204] (2/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_283-220372-0_sp0.9 from training. Duration: 0.62225 2023-10-04 14:24:40,674 WARNING [train.py:1204] (2/4) Exclude cut with ID _6473_210_1741_1_1527901307396_3815380_108-17335-0_sp0.9 from training. Duration: 0.72225 2023-10-04 14:24:40,770 WARNING [train.py:1204] (2/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_113-92608-0 from training. Duration: 0.54 2023-10-04 14:24:42,031 WARNING [train.py:1204] (2/4) Exclude cut with ID _14170_210_5219_1_1530525949868_4469650_272-249985-0_sp1.1 from training. Duration: 0.6090625 2023-10-04 14:24:42,041 WARNING [train.py:1204] (2/4) Exclude cut with ID _50376_210_13401_1_1534031502101_4217740_518-187390-0_sp1.1 from training. Duration: 0.97275 2023-10-04 14:24:42,170 WARNING [train.py:1204] (2/4) Exclude cut with ID _59738_210_15379_1_1534849512562_3941470_165-248260-0_sp1.1 from training. Duration: 0.92725 2023-10-04 14:24:42,182 WARNING [train.py:1204] (2/4) Exclude cut with ID _23818_210_6228_1_1531917063884_3676920_400-343925-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 14:24:43,484 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6961_210_2087_1_1527937168_6815011_562-165518-0 from training. Duration: 0.77 2023-10-04 14:24:45,364 WARNING [train.py:1204] (2/4) Exclude cut with ID _24271_210_12658_1_1531987270022_7190519_289-207931-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 14:24:46,753 WARNING [train.py:1204] (2/4) Exclude cut with ID _14632_210_9988_1_1530691279392_3923200_332-105904-0 from training. Duration: 0.9 2023-10-04 14:24:46,763 WARNING [train.py:1204] (2/4) Exclude cut with ID _40402_210_18219_1_1533522324115_2953150_203-232438-0_sp1.1 from training. Duration: 0.97275 2023-10-04 14:24:48,060 WARNING [train.py:1204] (2/4) Exclude cut with ID _13252_210_1925_1_1530671372047_3336909_263-168388-0 from training. Duration: 0.86 2023-10-04 14:24:49,424 WARNING [train.py:1204] (2/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_174-205058-0 from training. Duration: 0.95 2023-10-04 14:24:49,549 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_94-195801-0_sp1.1 from training. Duration: 0.8545625 2023-10-04 14:24:50,886 WARNING [train.py:1204] (2/4) Exclude cut with ID _55153_210_2253_1_1534384786677_3162096_269-103204-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 14:24:50,970 WARNING [train.py:1204] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_748-36150-0 from training. Duration: 0.91 2023-10-04 14:24:52,288 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_12868_210_6753_1_1530702479_3610564_148-172861-0_sp1.1 from training. Duration: 0.4545625 2023-10-04 14:24:52,343 WARNING [train.py:1204] (2/4) Exclude cut with ID _67499_210_17282_1_1535509805220_3692430_460-77730-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 14:24:56,096 WARNING [train.py:1204] (2/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_374-20753-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 14:24:57,589 WARNING [train.py:1204] (2/4) Exclude cut with ID _45329_210_7005_1_1534157951211_6774339_374-289983-0_sp1.1 from training. Duration: 0.97275 2023-10-04 14:24:57,624 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_34886_210_10633_1_1533203882_1755661_76-156096-0_sp0.9 from training. Duration: 0.9666875 2023-10-04 14:25:00,426 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.conv_module2.balancer1.max_abs, batch_count=1685733.3333333333, ans=10.0 2023-10-04 14:25:02,983 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_238-256668-0_sp0.9 from training. Duration: 0.7666875 2023-10-04 14:25:03,036 WARNING [train.py:1204] (2/4) Exclude cut with ID _46683_210_9642_1_1533730913492_4591116_489-146727-0_sp1.1 from training. Duration: 0.97275 2023-10-04 14:25:04,519 WARNING [train.py:1204] (2/4) Exclude cut with ID _13938_210_9988_1_1530518718978_3693960_304-112784-0_sp0.9 from training. Duration: 0.511125 2023-10-04 14:25:09,283 WARNING [train.py:1204] (2/4) Exclude cut with ID _24271_210_12658_1_1531987270022_7190519_243-46549-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 14:25:09,295 WARNING [train.py:1204] (2/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_78-182912-0_sp1.1 from training. Duration: 0.536375 2023-10-04 14:25:12,326 INFO [train.py:1046] (2/4) Epoch 48, batch 3200, loss[loss=0.1513, simple_loss=0.2377, pruned_loss=0.0325, over 24364.00 frames. ], tot_loss[loss=0.1517, simple_loss=0.2322, pruned_loss=0.03559, over 4703689.63 frames. ], batch size: 77, lr: 2.11e-03, grad_scale: 16.0 2023-10-04 14:25:12,478 WARNING [train.py:1204] (2/4) Exclude cut with ID _57572_210_5517_1_1534921219624_3809484_121-321334-0_sp1.1 from training. Duration: 0.97275 2023-10-04 14:25:13,874 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_40047_210_2290_1_1533290519_2374311_254-102149-0_sp1.1 from training. Duration: 0.963625 2023-10-04 14:25:13,875 WARNING [train.py:1204] (2/4) Exclude cut with ID _28461_210_2253_1_1532771775189_3041299_96-152158-0 from training. Duration: 0.85 2023-10-04 14:25:16,502 WARNING [train.py:1204] (2/4) Exclude cut with ID _29877_210_6228_1_1532397791106_5276149_161-102979-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 14:25:21,156 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_3787_210_2040_1_1526477544_5478586_603-120324-0_sp0.9 from training. Duration: 0.9 2023-10-04 14:25:25,717 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_41750_210_1960_1_1533520678_3948042_247-213456-0_sp1.1 from training. Duration: 0.97275 2023-10-04 14:25:27,011 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.669e+02 2.180e+02 2.542e+02 3.306e+02 4.972e+02, threshold=5.085e+02, percent-clipped=5.0 2023-10-04 14:25:27,430 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.bypass.scale_min, batch_count=1685866.6666666667, ans=0.2 2023-10-04 14:25:33,019 WARNING [train.py:1204] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_127-119109-0_sp0.9 from training. Duration: 0.8 2023-10-04 14:25:41,934 WARNING [train.py:1204] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_215-202973-0 from training. Duration: 0.88 2023-10-04 14:25:43,847 WARNING [train.py:1204] (2/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_17-320491-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 14:25:46,635 WARNING [train.py:1204] (2/4) Exclude cut with ID _5166_210_2084_1_1527073197160_4087993_310-114452-0 from training. Duration: 0.59 2023-10-04 14:25:47,981 WARNING [train.py:1204] (2/4) Exclude cut with ID _15133_210_6228_1_1530794512074_3080379_29-264458-0_sp0.9 from training. Duration: 0.6444375 2023-10-04 14:25:52,114 WARNING [train.py:1204] (2/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_159-281310-0_sp1.1 from training. Duration: 0.863625 2023-10-04 14:25:52,132 WARNING [train.py:1204] (2/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_195-167011-0_sp0.9 from training. Duration: 0.8333125 2023-10-04 14:25:54,044 WARNING [train.py:1204] (2/4) Exclude cut with ID _14087_210_2253_1_1530529224512_3014029_156-257607-0_sp1.1 from training. Duration: 0.7545625 2023-10-04 14:25:57,338 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_2295_210_2087_1_1525500993_2407863_116-238894-0 from training. Duration: 0.7 2023-10-04 14:25:58,742 WARNING [train.py:1204] (2/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_321-335475-0_sp0.9 from training. Duration: 0.5 2023-10-04 14:26:00,132 WARNING [train.py:1204] (2/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_188-114464-0 from training. Duration: 0.82 2023-10-04 14:26:00,773 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.3.encoder.layers.0.nonlin_attention.whiten2, num_groups=1, num_channels=512, metric=6.66 vs. limit=15.0 2023-10-04 14:26:04,161 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_2592_210_2290_1_1525697025_4882925_157-70512-0 from training. Duration: 0.91 2023-10-04 14:26:06,899 WARNING [train.py:1204] (2/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_138-64210-0_sp1.1 from training. Duration: 0.663625 2023-10-04 14:26:14,249 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_11305_210_1794_1_1529750428_7178148_651-95702-0_sp1.1 from training. Duration: 0.963625 2023-10-04 14:26:14,273 WARNING [train.py:1204] (2/4) Exclude cut with ID _14224_210_4381_1_1530529062037_4264010_379-184223-0_sp0.9 from training. Duration: 0.9444375 2023-10-04 14:26:14,292 WARNING [train.py:1204] (2/4) Exclude cut with ID _8394_210_6702_1_1528590504316_3540189_381-125325-0_sp1.1 from training. Duration: 0.963625 2023-10-04 14:26:14,348 WARNING [train.py:1204] (2/4) Exclude cut with ID IC0622W0021-307659 from training. Duration: 0.896 2023-10-04 14:26:14,350 WARNING [train.py:1204] (2/4) Exclude cut with ID _7624_210_1531_1_1528109802198_4933101_418-349834-0_sp1.1 from training. Duration: 0.5454375 2023-10-04 14:26:19,083 WARNING [train.py:1204] (2/4) Exclude cut with ID _49728_210_12651_1_1534041034294_6788150_607-3416-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 14:26:20,530 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8275_210_4198_1_1528455320_3892571_491-17707-0 from training. Duration: 0.64 2023-10-04 14:26:20,574 WARNING [train.py:1204] (2/4) Exclude cut with ID _5287_210_2084_1_1527418883040_3878670_333-48508-0 from training. Duration: 0.6 2023-10-04 14:26:21,913 WARNING [train.py:1204] (2/4) Exclude cut with ID _8365_210_2064_1_1528592311070_4677690_371-122295-0 from training. Duration: 0.55 2023-10-04 14:26:23,390 WARNING [train.py:1204] (2/4) Exclude cut with ID _14860_210_4915_1_1530702020340_7343240_836-328489-0 from training. Duration: 0.9 2023-10-04 14:26:24,851 WARNING [train.py:1204] (2/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_1033-274376-0_sp0.9 from training. Duration: 0.9666875 2023-10-04 14:26:26,709 INFO [train.py:1046] (2/4) Epoch 48, batch 3250, loss[loss=0.1578, simple_loss=0.2457, pruned_loss=0.03493, over 24668.00 frames. ], tot_loss[loss=0.1519, simple_loss=0.2326, pruned_loss=0.03555, over 4718421.57 frames. ], batch size: 68, lr: 2.11e-03, grad_scale: 16.0 2023-10-04 14:26:26,832 WARNING [train.py:1204] (2/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_475-127455-0_sp0.9 from training. Duration: 0.77775 2023-10-04 14:26:28,743 WARNING [train.py:1204] (2/4) Exclude cut with ID IC0671W0071-331914 from training. Duration: 0.896 2023-10-04 14:26:28,771 WARNING [train.py:1204] (2/4) Exclude cut with ID _30327_210_16049_1_1532484161295_3924169_2-1523-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 14:26:28,773 WARNING [train.py:1204] (2/4) Exclude cut with ID _24314_210_4915_1_1532135078116_7170179_460-208279-0_sp1.1 from training. Duration: 0.97275 2023-10-04 14:26:28,883 WARNING [train.py:1204] (2/4) Exclude cut with ID IC0679W0052-335859 from training. Duration: 0.64 2023-10-04 14:26:30,127 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.0.feed_forward1.out_proj.dropout_p, batch_count=1686133.3333333333, ans=0.1 2023-10-04 14:26:32,866 WARNING [train.py:1204] (2/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_3-237434-0_sp1.1 from training. Duration: 0.6909375 2023-10-04 14:26:35,616 WARNING [train.py:1204] (2/4) Exclude cut with ID _69884_210_9771_1_1535846392818_4166169_405-185283-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 14:26:40,577 WARNING [train.py:1204] (2/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_927-83684-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 14:26:41,750 WARNING [train.py:1204] (2/4) Exclude cut with ID _6694_210_4599_1_1527852637490_3965529_356-169233-0 from training. Duration: 0.99 2023-10-04 14:26:41,966 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=1686200.0, ans=0.1 2023-10-04 14:26:43,142 WARNING [train.py:1204] (2/4) Exclude cut with ID _19896_210_1883_1_1531709022662_6649569_562-233001-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 14:26:43,194 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_20120_210_6753_1_1531643035_5482859_354-78629-0_sp1.1 from training. Duration: 0.963625 2023-10-04 14:26:43,196 WARNING [train.py:1204] (2/4) Exclude cut with ID _60135_210_13427_1_1534935557922_3651319_96-168198-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 14:26:44,620 WARNING [train.py:1204] (2/4) Exclude cut with ID _31602_210_5854_1_1532677021456_4458250_249-96604-0_sp1.1 from training. Duration: 0.8090625 2023-10-04 14:26:44,650 WARNING [train.py:1204] (2/4) Exclude cut with ID _14100_210_8341_1_1530529320613_3678069_258-224599-0_sp0.9 from training. Duration: 0.8555625 2023-10-04 14:26:47,886 WARNING [train.py:1204] (2/4) Exclude cut with ID _19708_210_9206_1_1532136650733_4238364_215-160135-0_sp1.1 from training. Duration: 0.97275 2023-10-04 14:26:47,915 WARNING [train.py:1204] (2/4) Exclude cut with ID _21878_210_3525_1_1531738772002_7264330_113-18163-0_sp0.9 from training. Duration: 0.988875 2023-10-04 14:26:47,957 WARNING [train.py:1204] (2/4) Exclude cut with ID _64135_210_2253_1_1535677267876_7608972_552-337340-0_sp1.1 from training. Duration: 0.92725 2023-10-04 14:26:49,306 WARNING [train.py:1204] (2/4) Exclude cut with ID _67725_210_27767_1_1535608756671_5105359_10-12887-0_sp1.1 from training. Duration: 0.97275 2023-10-04 14:26:49,312 WARNING [train.py:1204] (2/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_401-284840-0_sp1.1 from training. Duration: 0.97275 2023-10-04 14:26:49,347 WARNING [train.py:1204] (2/4) Exclude cut with ID _44659_210_2721_1_1534036120061_6614860_483-141285-0_sp1.1 from training. Duration: 0.936375 2023-10-04 14:26:52,116 WARNING [train.py:1204] (2/4) Exclude cut with ID _9461_210_4557_1_1529060514726_3677451_358-332734-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 14:26:53,899 WARNING [train.py:1204] (2/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_788-298907-0_sp1.1 from training. Duration: 0.8090625 2023-10-04 14:26:56,565 WARNING [train.py:1204] (2/4) Exclude cut with ID _39411_210_5986_1_1533538825030_3343649_354-267196-0_sp1.1 from training. Duration: 0.92725 2023-10-04 14:26:56,586 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_14953_210_2151_1_1530701710_5616165_192-309792-0_sp1.1 from training. Duration: 0.97275 2023-10-04 14:26:58,605 WARNING [train.py:1204] (2/4) Exclude cut with ID _59170_210_6721_1_1534983633960_4203335_290-151255-0_sp1.1 from training. Duration: 0.92725 2023-10-04 14:26:59,851 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_5575_210_1410_1_1527301305_2570170_249-93769-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 14:26:59,861 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_46013_210_7234_1_1533776362_3744032_463-52378-0_sp1.1 from training. Duration: 0.7818125 2023-10-04 14:27:04,080 WARNING [train.py:1204] (2/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_580-93798-0 from training. Duration: 0.56 2023-10-04 14:27:05,369 WARNING [train.py:1204] (2/4) Exclude cut with ID _19514_210_9259_1_1531530071330_5084230_385-13762-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 14:27:05,382 WARNING [train.py:1204] (2/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_458-67321-0_sp1.1 from training. Duration: 0.8818125 2023-10-04 14:27:06,755 WARNING [train.py:1204] (2/4) Exclude cut with ID _41298_210_9019_1_1533430371450_4822143_188-154791-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 14:27:06,819 WARNING [train.py:1204] (2/4) Exclude cut with ID _15037_210_2253_1_1530752294104_3267650_296-192597-0_sp0.9 from training. Duration: 0.9 2023-10-04 14:27:12,973 WARNING [train.py:1204] (2/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_661-264857-0_sp1.1 from training. Duration: 0.8454375 2023-10-04 14:27:21,599 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_34008_210_10431_1_1533345424_8183247_662-317331-0_sp1.1 from training. Duration: 0.8545625 2023-10-04 14:27:21,639 WARNING [train.py:1204] (2/4) Exclude cut with ID _46502_210_13794_1_1533724452198_3657159_241-278329-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 14:27:21,640 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_11349_210_6753_1_1529800861_1101207_26-65602-0 from training. Duration: 0.96 2023-10-04 14:27:21,644 WARNING [train.py:1204] (2/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_448-4048-0_sp1.1 from training. Duration: 0.87275 2023-10-04 14:27:21,659 WARNING [train.py:1204] (2/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_662-275621-0_sp1.1 from training. Duration: 0.3818125 2023-10-04 14:27:22,970 WARNING [train.py:1204] (2/4) Exclude cut with ID _13938_210_9988_1_1530518718978_3693960_256-54454-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 14:27:23,932 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.1.encoder.layers.0.feed_forward2.out_whiten, num_groups=1, num_channels=256, metric=4.47 vs. limit=15.0 2023-10-04 14:27:24,395 WARNING [train.py:1204] (2/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_256-32662-0 from training. Duration: 0.9 2023-10-04 14:27:24,436 WARNING [train.py:1204] (2/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_326-17061-0 from training. Duration: 0.94 2023-10-04 14:27:26,384 WARNING [train.py:1204] (2/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_505-97987-0_sp1.1 from training. Duration: 0.7818125 2023-10-04 14:27:27,779 WARNING [train.py:1204] (2/4) Exclude cut with ID _66873_210_5517_1_1535454136506_4293370_316-294364-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 14:27:27,837 WARNING [train.py:1204] (2/4) Exclude cut with ID _19514_210_9259_1_1531530071330_5084230_762-231063-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 14:27:27,887 WARNING [train.py:1204] (2/4) Exclude cut with ID _4697_210_2069_1_1526815801953_3823098_68-219807-0_sp0.9 from training. Duration: 0.52225 2023-10-04 14:27:29,207 WARNING [train.py:1204] (2/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_479-158053-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 14:27:33,908 WARNING [train.py:1204] (2/4) Exclude cut with ID _66112_210_10635_1_1535590236305_3801484_83-38181-0_sp1.1 from training. Duration: 0.936375 2023-10-04 14:27:33,923 WARNING [train.py:1204] (2/4) Exclude cut with ID _55857_210_2346_1_1534644007831_6898209_255-146346-0_sp1.1 from training. Duration: 0.963625 2023-10-04 14:27:36,579 WARNING [train.py:1204] (2/4) Exclude cut with ID _48836_210_12210_1_1534075848293_3904150_429-35475-0 from training. Duration: 0.89 2023-10-04 14:27:36,591 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_50154_210_13731_1_1534750862_1080961_119-187163-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 14:27:39,473 WARNING [train.py:1204] (2/4) Exclude cut with ID _8793_210_6310_1_1528542985139_2310009_121-179782-0_sp0.9 from training. Duration: 0.888875 2023-10-04 14:27:39,476 WARNING [train.py:1204] (2/4) Exclude cut with ID _14884_210_2252_1_1530707459689_5561250_591-173406-0 from training. Duration: 0.96 2023-10-04 14:27:40,898 INFO [train.py:1046] (2/4) Epoch 48, batch 3300, loss[loss=0.1393, simple_loss=0.217, pruned_loss=0.03086, over 17881.00 frames. ], tot_loss[loss=0.152, simple_loss=0.2331, pruned_loss=0.03546, over 4718847.44 frames. ], batch size: 38, lr: 2.11e-03, grad_scale: 8.0 2023-10-04 14:27:43,138 WARNING [train.py:1204] (2/4) Exclude cut with ID _15133_210_6228_1_1530794512074_3080379_54-198193-0_sp1.1 from training. Duration: 0.8545625 2023-10-04 14:27:43,150 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6545_210_2040_1_1527774670_4245258_407-48070-0 from training. Duration: 0.76 2023-10-04 14:27:45,947 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1032-236863-0 from training. Duration: 0.84 2023-10-04 14:27:46,031 WARNING [train.py:1204] (2/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_507-303370-0 from training. Duration: 0.96 2023-10-04 14:27:47,312 WARNING [train.py:1204] (2/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_164-282366-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 14:27:50,210 WARNING [train.py:1204] (2/4) Exclude cut with ID _39867_210_9826_1_1533553222030_9344259_312-158727-0_sp1.1 from training. Duration: 0.963625 2023-10-04 14:27:51,595 WARNING [train.py:1204] (2/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_174-205058-0_sp1.1 from training. Duration: 0.863625 2023-10-04 14:27:51,627 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_11305_210_1794_1_1529750428_7178148_166-237358-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 14:27:52,998 WARNING [train.py:1204] (2/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_26-175553-0_sp1.1 from training. Duration: 0.5545625 2023-10-04 14:27:53,016 WARNING [train.py:1204] (2/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_96-290039-0_sp1.1 from training. Duration: 0.5090625 2023-10-04 14:27:56,437 WARNING [train.py:1204] (2/4) Exclude cut with ID _53219_210_4525_1_1534834836008_4064825_261-101691-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 14:27:57,693 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.840e+02 2.042e+02 2.235e+02 2.474e+02 3.621e+02, threshold=4.470e+02, percent-clipped=0.0 2023-10-04 14:27:57,828 WARNING [train.py:1204] (2/4) Exclude cut with ID _24271_210_12658_1_1531987270022_7190519_96-293390-0_sp1.1 from training. Duration: 0.936375 2023-10-04 14:28:02,365 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_271-234962-0 from training. Duration: 0.79 2023-10-04 14:28:02,436 WARNING [train.py:1204] (2/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_136-30660-0_sp1.1 from training. Duration: 0.97275 2023-10-04 14:28:02,451 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_20120_210_6753_1_1531643035_5482859_625-281046-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 14:28:04,333 WARNING [train.py:1204] (2/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_333-168193-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 14:28:05,631 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9029W0269-509664 from training. Duration: 0.896 2023-10-04 14:28:05,722 WARNING [train.py:1204] (2/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_450-147393-0_sp1.1 from training. Duration: 0.92725 2023-10-04 14:28:05,765 WARNING [train.py:1204] (2/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_643-52007-0_sp1.1 from training. Duration: 0.5181875 2023-10-04 14:28:07,105 WARNING [train.py:1204] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_330-327861-0_sp0.9 from training. Duration: 0.7444375 2023-10-04 14:28:07,113 WARNING [train.py:1204] (2/4) Exclude cut with ID _47790_210_16187_1_1533862814066_4207519_260-317651-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 14:28:08,374 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9044W0010-516654 from training. Duration: 0.896 2023-10-04 14:28:11,198 WARNING [train.py:1204] (2/4) Exclude cut with ID _14087_210_2253_1_1530529224512_3014029_265-118378-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 14:28:11,204 WARNING [train.py:1204] (2/4) Exclude cut with ID _6194_210_1534_1_1527511826160_3941194_321-148641-0_sp0.9 from training. Duration: 0.711125 2023-10-04 14:28:11,429 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.bypass.scale_min, batch_count=1686600.0, ans=0.2 2023-10-04 14:28:13,977 WARNING [train.py:1204] (2/4) Exclude cut with ID _54871_210_22356_1_1534665798253_3856359_101-169176-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 14:28:13,980 WARNING [train.py:1204] (2/4) Exclude cut with ID 497-129325-0061-62254-0_sp1.1 from training. Duration: 0.97725 2023-10-04 14:28:14,074 WARNING [train.py:1204] (2/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_795-33252-0_sp1.1 from training. Duration: 0.336375 2023-10-04 14:28:16,050 WARNING [train.py:1204] (2/4) Exclude cut with ID _28317_210_12742_1_1532480302381_8510325_168-246176-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 14:28:17,509 WARNING [train.py:1204] (2/4) Exclude cut with ID _7160_210_1876_1_1527940923231_3703799_120-150943-0_sp1.1 from training. Duration: 0.736375 2023-10-04 14:28:17,811 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.nonlin_attention.balancer.prob, batch_count=1686600.0, ans=0.125 2023-10-04 14:28:18,069 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.self_attn_weights.whiten_keys.whitening_limit, batch_count=1686600.0, ans=6.0 2023-10-04 14:28:18,899 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9084W0005-536844 from training. Duration: 0.768 2023-10-04 14:28:20,335 WARNING [train.py:1204] (2/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_234-260682-0 from training. Duration: 0.84 2023-10-04 14:28:21,693 WARNING [train.py:1204] (2/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_188-114464-0_sp0.9 from training. Duration: 0.911125 2023-10-04 14:28:23,274 WARNING [train.py:1204] (2/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_81-75849-0 from training. Duration: 0.39 2023-10-04 14:28:24,838 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.bypass.scale_min, batch_count=1686666.6666666667, ans=0.2 2023-10-04 14:28:26,180 WARNING [train.py:1204] (2/4) Exclude cut with ID _14224_210_4381_1_1530529062037_4264010_379-184223-0_sp1.1 from training. Duration: 0.77275 2023-10-04 14:28:28,072 WARNING [train.py:1204] (2/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_454-123002-0_sp1.1 from training. Duration: 0.62725 2023-10-04 14:28:28,126 WARNING [train.py:1204] (2/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_887-249555-0_sp1.1 from training. Duration: 0.7 2023-10-04 14:28:30,902 WARNING [train.py:1204] (2/4) Exclude cut with ID _31088_210_16446_1_1532520012284_3440919_20-260902-0_sp1.1 from training. Duration: 0.97275 2023-10-04 14:28:30,946 WARNING [train.py:1204] (2/4) Exclude cut with ID _45329_210_7005_1_1534157951211_6774339_856-344574-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 14:28:30,948 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_36756_210_7234_1_1533026991_966276_4-164118-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 14:28:32,871 WARNING [train.py:1204] (2/4) Exclude cut with ID _8365_210_2064_1_1528592311070_4677690_371-122295-0_sp1.1 from training. Duration: 0.5 2023-10-04 14:28:34,793 WARNING [train.py:1204] (2/4) Exclude cut with ID _5910_210_1389_1_1527337972454_5050100_555-159815-0_sp1.1 from training. Duration: 0.8909375 2023-10-04 14:28:34,810 WARNING [train.py:1204] (2/4) Exclude cut with ID _37086_210_9038_1_1532999540891_7021250_469-147941-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 14:28:36,112 WARNING [train.py:1204] (2/4) Exclude cut with ID _3067_210_2667_1_1526212974640_4503168_459-173867-0_sp0.9 from training. Duration: 0.97775 2023-10-04 14:28:37,525 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9156W0006-572499 from training. Duration: 0.896 2023-10-04 14:28:37,611 WARNING [train.py:1204] (2/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_277-210798-0 from training. Duration: 0.55 2023-10-04 14:28:37,765 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.feed_forward2.hidden_balancer.prob, batch_count=1686666.6666666667, ans=0.125 2023-10-04 14:28:39,178 WARNING [train.py:1204] (2/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_103-336064-0_sp1.1 from training. Duration: 0.463625 2023-10-04 14:28:40,470 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_3414_210_1988_1_1526212718_4813195_331-290945-0_sp1.1 from training. Duration: 0.936375 2023-10-04 14:28:40,471 WARNING [train.py:1204] (2/4) Exclude cut with ID _67499_210_17282_1_1535509805220_3692430_365-29276-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 14:28:41,916 WARNING [train.py:1204] (2/4) Exclude cut with ID _14821_210_8750_1_1530680503991_6829359_69-250847-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 14:28:41,918 WARNING [train.py:1204] (2/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_1102-61301-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 14:28:43,264 WARNING [train.py:1204] (2/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_166-246557-0_sp1.1 from training. Duration: 0.5818125 2023-10-04 14:28:45,023 WARNING [train.py:1204] (2/4) Exclude cut with ID _15351_210_10626_1_1530856635653_3576210_214-129918-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 14:28:45,044 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_120-41668-0_sp0.9 from training. Duration: 0.77775 2023-10-04 14:28:46,455 WARNING [train.py:1204] (2/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_27-72278-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 14:28:47,903 WARNING [train.py:1204] (2/4) Exclude cut with ID _24314_210_4915_1_1532135078116_7170179_521-258868-0_sp1.1 from training. Duration: 0.7181875 2023-10-04 14:28:50,574 WARNING [train.py:1204] (2/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_143-86344-0 from training. Duration: 0.77 2023-10-04 14:28:50,607 WARNING [train.py:1204] (2/4) Exclude cut with ID _53489_210_6228_1_1534298357169_3835970_465-157433-0_sp1.1 from training. Duration: 0.92725 2023-10-04 14:28:50,781 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.nonlin_attention.balancer.min_positive, batch_count=1686733.3333333333, ans=0.05 2023-10-04 14:28:51,931 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_248-319353-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 14:28:53,332 WARNING [train.py:1204] (2/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_102-77064-0_sp1.1 from training. Duration: 0.8454375 2023-10-04 14:28:53,369 WARNING [train.py:1204] (2/4) Exclude cut with ID _9389_210_1664_1_1529217337301_5289269_379-128794-0_sp1.1 from training. Duration: 0.77275 2023-10-04 14:28:54,719 WARNING [train.py:1204] (2/4) Exclude cut with ID _3231_210_2106_1_1526205864934_6784220_236-260835-0_sp1.1 from training. Duration: 0.97275 2023-10-04 14:28:56,573 INFO [train.py:1046] (2/4) Epoch 48, batch 3350, loss[loss=0.174, simple_loss=0.2553, pruned_loss=0.04634, over 24092.00 frames. ], tot_loss[loss=0.1524, simple_loss=0.2334, pruned_loss=0.0357, over 4722548.80 frames. ], batch size: 80, lr: 2.11e-03, grad_scale: 8.0 2023-10-04 14:28:56,666 WARNING [train.py:1204] (2/4) Exclude cut with ID _24271_210_12658_1_1531987270022_7190519_184-342363-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 14:28:56,667 WARNING [train.py:1204] (2/4) Exclude cut with ID _27894_210_12392_1_1532415496078_6902439_67-221663-0_sp1.1 from training. Duration: 0.92725 2023-10-04 14:28:59,443 WARNING [train.py:1204] (2/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_304-59227-0_sp1.1 from training. Duration: 0.7 2023-10-04 14:28:59,621 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.ff3_skip_rate, batch_count=1686800.0, ans=0.0 2023-10-04 14:29:00,774 WARNING [train.py:1204] (2/4) Exclude cut with ID _58605_210_12210_1_1534723195939_3904259_458-42232-0_sp1.1 from training. Duration: 0.92725 2023-10-04 14:29:02,579 WARNING [train.py:1204] (2/4) Exclude cut with ID _14922_210_6228_1_1530707435273_3693310_13-165512-0_sp0.9 from training. Duration: 0.911125 2023-10-04 14:29:05,903 WARNING [train.py:1204] (2/4) Exclude cut with ID _58602_210_2346_1_1534903210085_6908830_792-102241-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 14:29:07,391 WARNING [train.py:1204] (2/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_588-125165-0_sp0.9 from training. Duration: 0.92225 2023-10-04 14:29:08,871 WARNING [train.py:1204] (2/4) Exclude cut with ID _54218_210_12210_1_1534413646388_4409259_239-98429-0_sp1.1 from training. Duration: 0.97275 2023-10-04 14:29:08,920 WARNING [train.py:1204] (2/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_538-32291-0_sp1.1 from training. Duration: 0.7545625 2023-10-04 14:29:10,165 WARNING [train.py:1204] (2/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_419-317062-0 from training. Duration: 0.91 2023-10-04 14:29:11,635 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9309W0202-640329 from training. Duration: 0.896 2023-10-04 14:29:12,915 WARNING [train.py:1204] (2/4) Exclude cut with ID _3231_210_2106_1_1526205864934_6784220_419-313291-0_sp1.1 from training. Duration: 0.97275 2023-10-04 14:29:17,543 WARNING [train.py:1204] (2/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_619-344499-0 from training. Duration: 0.63 2023-10-04 14:29:17,563 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_278-272612-0 from training. Duration: 0.73 2023-10-04 14:29:18,880 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_103-84010-0_sp0.9 from training. Duration: 0.7666875 2023-10-04 14:29:18,900 WARNING [train.py:1204] (2/4) Exclude cut with ID _26528_210_9070_1_1532570410842_3851310_497-97218-0_sp1.1 from training. Duration: 0.7818125 2023-10-04 14:29:20,292 WARNING [train.py:1204] (2/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_262-84613-0_sp1.1 from training. Duration: 0.936375 2023-10-04 14:29:20,332 WARNING [train.py:1204] (2/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_387-131538-0 from training. Duration: 0.81 2023-10-04 14:29:20,365 WARNING [train.py:1204] (2/4) Exclude cut with ID _8030_210_7497_1_1528444886781_3531089_224-300489-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 14:29:20,380 WARNING [train.py:1204] (2/4) Exclude cut with ID _14349_210_8341_1_1530615710818_3442470_170-336541-0_sp0.9 from training. Duration: 0.9555625 2023-10-04 14:29:23,193 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_47069_210_15368_1_1533781267_4431320_180-31746-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 14:29:25,950 WARNING [train.py:1204] (2/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_447-7041-0_sp1.1 from training. Duration: 0.92725 2023-10-04 14:29:25,985 WARNING [train.py:1204] (2/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_2-55283-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 14:29:26,042 WARNING [train.py:1204] (2/4) Exclude cut with ID _12555_210_8750_1_1530075594402_3780239_70-181911-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 14:29:28,864 WARNING [train.py:1204] (2/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_311-328022-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 14:29:29,411 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.4.encoder.layers.2.conv_module2.whiten, num_groups=1, num_channels=384, metric=3.25 vs. limit=15.0 2023-10-04 14:29:32,322 WARNING [train.py:1204] (2/4) Exclude cut with ID _14528_210_8254_1_1530669700514_4306939_330-2987-0_sp1.1 from training. Duration: 0.92725 2023-10-04 14:29:32,369 WARNING [train.py:1204] (2/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_385-184858-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 14:29:37,008 WARNING [train.py:1204] (2/4) Exclude cut with ID _64414_210_17301_1_1535243252048_3757759_348-333360-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 14:29:38,367 WARNING [train.py:1204] (2/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_753-214751-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 14:29:40,368 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_17613_210_2151_1_1531137569_2366160_202-208013-0_sp1.1 from training. Duration: 0.92725 2023-10-04 14:29:40,379 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_43831_210_2158_1_1533725502_5872791_610-129300-0_sp1.1 from training. Duration: 0.936375 2023-10-04 14:29:41,799 WARNING [train.py:1204] (2/4) Exclude cut with ID _54218_210_12210_1_1534413646388_4409259_321-23852-0_sp1.1 from training. Duration: 0.936375 2023-10-04 14:29:43,217 WARNING [train.py:1204] (2/4) Exclude cut with ID _39411_210_5986_1_1533538825030_3343649_173-238248-0 from training. Duration: 0.83 2023-10-04 14:29:44,484 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_405-339986-0_sp0.9 from training. Duration: 0.5666875 2023-10-04 14:29:44,524 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_2009_210_2071_1_1525481134_4588937_385-302100-0 from training. Duration: 0.87 2023-10-04 14:29:44,562 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_161-276996-0_sp1.1 from training. Duration: 0.863625 2023-10-04 14:29:46,055 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_177-58005-0 from training. Duration: 0.62 2023-10-04 14:29:46,126 WARNING [train.py:1204] (2/4) Exclude cut with ID _64639_210_5816_1_1535367619437_3528000_146-343734-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 14:29:46,364 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=1687000.0, ans=0.1 2023-10-04 14:29:47,913 WARNING [train.py:1204] (2/4) Exclude cut with ID _3845_210_1390_1_1526731326141_3523610_72-61441-0_sp1.1 from training. Duration: 0.92725 2023-10-04 14:29:54,839 WARNING [train.py:1204] (2/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_991-125028-0_sp1.1 from training. Duration: 0.936375 2023-10-04 14:29:56,231 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7544_210_6753_1_1528591327_1471426_172-348777-0 from training. Duration: 0.66 2023-10-04 14:29:56,275 WARNING [train.py:1204] (2/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_416-257730-0_sp1.1 from training. Duration: 0.5909375 2023-10-04 14:29:57,643 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1113-7369-0_sp1.1 from training. Duration: 0.77275 2023-10-04 14:29:59,076 WARNING [train.py:1204] (2/4) Exclude cut with ID _40402_210_18219_1_1533522324115_2953150_242-76943-0_sp1.1 from training. Duration: 0.8545625 2023-10-04 14:30:01,248 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.feed_forward1.hidden_balancer.prob, batch_count=1687066.6666666667, ans=0.125 2023-10-04 14:30:04,631 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.1.encoder.layers.1.self_attn2.whiten, num_groups=1, num_channels=256, metric=9.97 vs. limit=22.5 2023-10-04 14:30:05,045 WARNING [train.py:1204] (2/4) Exclude cut with ID _44718_210_5816_1_1534050051869_6469030_190-49648-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 14:30:06,625 WARNING [train.py:1204] (2/4) Exclude cut with ID _47251_210_18628_1_1533814063128_4137250_214-88076-0 from training. Duration: 0.88 2023-10-04 14:30:08,591 WARNING [train.py:1204] (2/4) Exclude cut with ID _5585_210_2667_1_1527418813324_8007340_89-332072-0_sp1.1 from training. Duration: 0.5818125 2023-10-04 14:30:08,628 WARNING [train.py:1204] (2/4) Exclude cut with ID _41817_210_18835_1_1533434477160_7423049_555-293448-0_sp1.1 from training. Duration: 0.72725 2023-10-04 14:30:10,639 WARNING [train.py:1204] (2/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_286-331314-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 14:30:11,858 INFO [train.py:1046] (2/4) Epoch 48, batch 3400, loss[loss=0.152, simple_loss=0.2376, pruned_loss=0.0332, over 24676.00 frames. ], tot_loss[loss=0.1528, simple_loss=0.2339, pruned_loss=0.03583, over 4729554.46 frames. ], batch size: 65, lr: 2.11e-03, grad_scale: 8.0 2023-10-04 14:30:11,920 WARNING [train.py:1204] (2/4) Exclude cut with ID _13921_210_9994_1_1530838702800_3003429_46-149444-0 from training. Duration: 0.94 2023-10-04 14:30:11,981 WARNING [train.py:1204] (2/4) Exclude cut with ID _44778_210_2054_1_1533798127203_7443090_768-43595-0_sp1.1 from training. Duration: 0.936375 2023-10-04 14:30:11,990 WARNING [train.py:1204] (2/4) Exclude cut with ID _35355_210_16040_1_1533212988977_4016229_74-190076-0 from training. Duration: 0.8 2023-10-04 14:30:12,153 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.bypass.skip_rate, batch_count=1687133.3333333333, ans=0.07 2023-10-04 14:30:14,741 WARNING [train.py:1204] (2/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_1007-316380-0_sp1.1 from training. Duration: 0.963625 2023-10-04 14:30:14,792 WARNING [train.py:1204] (2/4) Exclude cut with ID _23557_210_8750_1_1531890106370_6846255_150-283437-0_sp1.1 from training. Duration: 0.963625 2023-10-04 14:30:16,083 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_3474_210_2028_1_1526188513_4825246_145-309131-0_sp1.1 from training. Duration: 0.67275 2023-10-04 14:30:16,174 WARNING [train.py:1204] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_391-22148-0_sp1.1 from training. Duration: 0.7 2023-10-04 14:30:16,189 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_238-256668-0 from training. Duration: 0.69 2023-10-04 14:30:19,518 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.3.encoder.layers.1.feed_forward1.out_whiten, num_groups=1, num_channels=512, metric=8.96 vs. limit=15.0 2023-10-04 14:30:22,270 WARNING [train.py:1204] (2/4) Exclude cut with ID _8394_210_6702_1_1528590504316_3540189_372-198878-0 from training. Duration: 0.6 2023-10-04 14:30:22,280 WARNING [train.py:1204] (2/4) Exclude cut with ID ID0174W0006-766659 from training. Duration: 0.953 2023-10-04 14:30:22,304 WARNING [train.py:1204] (2/4) Exclude cut with ID _6274_210_2084_1_1527937583837_7494020_668-23019-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 14:30:26,441 WARNING [train.py:1204] (2/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_322-74328-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 14:30:26,448 WARNING [train.py:1204] (2/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_872-264525-0_sp1.1 from training. Duration: 0.5909375 2023-10-04 14:30:27,669 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.768e+02 2.082e+02 2.351e+02 2.856e+02 4.234e+02, threshold=4.702e+02, percent-clipped=0.0 2023-10-04 14:30:27,778 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_53378_210_17282_1_1534221071_5769509_641-286201-0_sp1.1 from training. Duration: 0.97275 2023-10-04 14:30:29,038 WARNING [train.py:1204] (2/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_199-295611-0_sp1.1 from training. Duration: 0.5 2023-10-04 14:30:34,951 WARNING [train.py:1204] (2/4) Exclude cut with ID _5250_210_5219_1_1527249561806_8692110_1270-317971-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 14:30:36,419 WARNING [train.py:1204] (2/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_784-213099-0 from training. Duration: 0.93 2023-10-04 14:30:39,405 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_115-233929-0_sp0.9 from training. Duration: 0.8 2023-10-04 14:30:40,295 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.feed_forward1.out_proj.dropout_p, batch_count=1687266.6666666667, ans=0.1 2023-10-04 14:30:43,141 WARNING [train.py:1204] (2/4) Exclude cut with ID _54871_210_22356_1_1534665798253_3856359_256-223809-0_sp1.1 from training. Duration: 0.97275 2023-10-04 14:30:43,177 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_20120_210_6753_1_1531643035_5482859_285-87781-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 14:30:43,257 WARNING [train.py:1204] (2/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_619-344499-0_sp1.1 from training. Duration: 0.57275 2023-10-04 14:30:46,669 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.2.encoder.layers.1.conv_module1.whiten, num_groups=1, num_channels=384, metric=7.29 vs. limit=15.0 2023-10-04 14:30:47,354 WARNING [train.py:1204] (2/4) Exclude cut with ID _60229_210_27767_1_1534838406158_3655350_264-279573-0_sp0.9 from training. Duration: 0.92225 2023-10-04 14:30:51,887 WARNING [train.py:1204] (2/4) Exclude cut with ID _7719_210_3525_1_1528456153163_4296960_333-110399-0 from training. Duration: 0.96 2023-10-04 14:30:57,461 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_50423_210_13270_1_1534128598_5021734_759-90833-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 14:30:57,509 WARNING [train.py:1204] (2/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_151-254798-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 14:30:57,553 WARNING [train.py:1204] (2/4) Exclude cut with ID _3465_210_3525_1_1526175667060_3574049_450-203637-0 from training. Duration: 0.92 2023-10-04 14:30:57,578 WARNING [train.py:1204] (2/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_673-36456-0_sp1.1 from training. Duration: 0.936375 2023-10-04 14:30:57,761 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.attention_skip_rate, batch_count=1687333.3333333333, ans=0.0 2023-10-04 14:30:58,927 WARNING [train.py:1204] (2/4) Exclude cut with ID _36538_210_9994_1_1533204020363_3795620_131-8685-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 14:31:00,150 WARNING [train.py:1204] (2/4) Exclude cut with ID _12556_210_8341_1_1530081024100_7327690_713-55626-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 14:31:00,185 WARNING [train.py:1204] (2/4) Exclude cut with ID _2322_210_2667_1_1525604548971_3656299_440-80494-0_sp0.9 from training. Duration: 0.97775 2023-10-04 14:31:02,221 WARNING [train.py:1204] (2/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_210-325081-0_sp1.1 from training. Duration: 0.97275 2023-10-04 14:31:06,408 WARNING [train.py:1204] (2/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_375-244976-0_sp0.9 from training. Duration: 0.8333125 2023-10-04 14:31:06,413 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_15517_210_6753_1_1530952293_1664795_119-31001-0_sp1.1 from training. Duration: 0.8545625 2023-10-04 14:31:06,605 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.bypass.skip_rate, batch_count=1687333.3333333333, ans=0.04949747468305833 2023-10-04 14:31:11,328 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_41452_210_7291_1_1533437687_3941281_319-42326-0_sp1.1 from training. Duration: 0.92725 2023-10-04 14:31:13,318 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_372-173012-0 from training. Duration: 0.95 2023-10-04 14:31:18,835 WARNING [train.py:1204] (2/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_41-31157-0_sp1.1 from training. Duration: 0.5181875 2023-10-04 14:31:19,063 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=1687400.0, ans=0.1 2023-10-04 14:31:23,414 WARNING [train.py:1204] (2/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_91-212486-0 from training. Duration: 0.68 2023-10-04 14:31:26,011 INFO [train.py:1046] (2/4) Epoch 48, batch 3450, loss[loss=0.1446, simple_loss=0.2356, pruned_loss=0.02681, over 24607.00 frames. ], tot_loss[loss=0.1525, simple_loss=0.2336, pruned_loss=0.0357, over 4724106.78 frames. ], batch size: 68, lr: 2.11e-03, grad_scale: 8.0 2023-10-04 14:31:27,540 WARNING [train.py:1204] (2/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_1267-272693-0 from training. Duration: 0.73 2023-10-04 14:31:27,585 WARNING [train.py:1204] (2/4) Exclude cut with ID _13938_210_9988_1_1530518718978_3693960_152-67046-0_sp1.1 from training. Duration: 0.936375 2023-10-04 14:31:28,922 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_4528_210_3273_1_1527076654_3276777_342-168068-0_sp1.1 from training. Duration: 0.7909375 2023-10-04 14:31:28,937 WARNING [train.py:1204] (2/4) Exclude cut with ID _7606_210_4915_1_1528894253277_5080690_127-177761-0 from training. Duration: 0.64 2023-10-04 14:31:29,146 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.balancer1.prob, batch_count=1687466.6666666667, ans=0.125 2023-10-04 14:31:30,754 WARNING [train.py:1204] (2/4) Exclude cut with ID _45329_210_7005_1_1534157951211_6774339_335-137972-0_sp1.1 from training. Duration: 0.92725 2023-10-04 14:31:33,597 WARNING [train.py:1204] (2/4) Exclude cut with ID _5166_210_2084_1_1527073197160_4087993_331-117829-0_sp1.1 from training. Duration: 0.563625 2023-10-04 14:31:38,949 WARNING [train.py:1204] (2/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_125-125818-0_sp1.1 from training. Duration: 0.87275 2023-10-04 14:31:40,310 WARNING [train.py:1204] (2/4) Exclude cut with ID _6789_210_1534_1_1527854471671_2870519_48-294897-0_sp1.1 from training. Duration: 0.963625 2023-10-04 14:31:40,380 WARNING [train.py:1204] (2/4) Exclude cut with ID _6694_210_4599_1_1527852637490_3965529_116-109398-0_sp1.1 from training. Duration: 0.663625 2023-10-04 14:31:40,395 WARNING [train.py:1204] (2/4) Exclude cut with ID _52892_210_6721_1_1534378941259_4124129_222-172685-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 14:31:43,771 WARNING [train.py:1204] (2/4) Exclude cut with ID _50655_210_18628_1_1534229942193_3772154_320-132275-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 14:31:44,067 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.conv_skip_rate, batch_count=1687533.3333333333, ans=0.0 2023-10-04 14:31:50,529 WARNING [train.py:1204] (2/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_76-311963-0 from training. Duration: 0.7 2023-10-04 14:31:56,577 WARNING [train.py:1204] (2/4) Exclude cut with ID _6274_210_2084_1_1527937583837_7494020_32-205288-0 from training. Duration: 0.78 2023-10-04 14:31:56,594 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_86-16881-0_sp0.9 from training. Duration: 0.6444375 2023-10-04 14:31:56,637 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_271-297505-0_sp1.1 from training. Duration: 0.9 2023-10-04 14:31:57,987 WARNING [train.py:1204] (2/4) Exclude cut with ID _57572_210_5517_1_1534921219624_3809484_187-295293-0_sp1.1 from training. Duration: 0.963625 2023-10-04 14:32:04,230 WARNING [train.py:1204] (2/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_563-329943-0 from training. Duration: 0.99 2023-10-04 14:32:04,306 WARNING [train.py:1204] (2/4) Exclude cut with ID _7160_210_1876_1_1527940923231_3703799_302-150728-0_sp0.9 from training. Duration: 0.8666875 2023-10-04 14:32:05,824 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.bypass.scale_min, batch_count=1687600.0, ans=0.2 2023-10-04 14:32:08,345 WARNING [train.py:1204] (2/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_926-7526-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 14:32:08,372 WARNING [train.py:1204] (2/4) Exclude cut with ID _62574_210_2655_1_1535180407058_3933249_467-22348-0_sp1.1 from training. Duration: 0.936375 2023-10-04 14:32:08,623 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.bypass.skip_rate, batch_count=1687666.6666666667, ans=0.04949747468305833 2023-10-04 14:32:09,822 WARNING [train.py:1204] (2/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_280-161633-0_sp0.9 from training. Duration: 0.811125 2023-10-04 14:32:11,599 WARNING [train.py:1204] (2/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_385-193052-0_sp1.1 from training. Duration: 0.7818125 2023-10-04 14:32:12,993 WARNING [train.py:1204] (2/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_444-28041-0 from training. Duration: 0.85 2023-10-04 14:32:14,710 WARNING [train.py:1204] (2/4) Exclude cut with ID _66070_210_7640_1_1535441393096_7805310_341-249237-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 14:32:16,075 WARNING [train.py:1204] (2/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_425-269826-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 14:32:18,758 WARNING [train.py:1204] (2/4) Exclude cut with ID _53950_210_5816_1_1534417264919_3862951_124-185597-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 14:32:19,486 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.3.encoder.layers.2.nonlin_attention.whiten1, num_groups=1, num_channels=384, metric=4.09 vs. limit=10.0 2023-10-04 14:32:20,206 WARNING [train.py:1204] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_380-204756-0 from training. Duration: 0.86 2023-10-04 14:32:22,952 WARNING [train.py:1204] (2/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_282-257791-0_sp0.9 from training. Duration: 0.9555625 2023-10-04 14:32:28,860 WARNING [train.py:1204] (2/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_372-131163-0_sp1.1 from training. Duration: 0.8545625 2023-10-04 14:32:30,114 WARNING [train.py:1204] (2/4) Exclude cut with ID _28317_210_12742_1_1532480302381_8510325_447-233513-0_sp1.1 from training. Duration: 0.963625 2023-10-04 14:32:30,995 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.1.encoder.layers.1.self_attn_weights.whiten_keys, num_groups=4, num_channels=128, metric=2.30 vs. limit=6.0 2023-10-04 14:32:32,801 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_54-118333-0_sp1.1 from training. Duration: 0.97275 2023-10-04 14:32:36,424 WARNING [train.py:1204] (2/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_388-230239-0_sp1.1 from training. Duration: 0.963625 2023-10-04 14:32:37,718 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_40047_210_2290_1_1533290519_2374311_141-54639-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 14:32:37,781 WARNING [train.py:1204] (2/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_91-267696-0_sp0.9 from training. Duration: 0.9666875 2023-10-04 14:32:39,007 INFO [train.py:1046] (2/4) Epoch 48, batch 3500, loss[loss=0.1581, simple_loss=0.2385, pruned_loss=0.03883, over 23342.00 frames. ], tot_loss[loss=0.1516, simple_loss=0.2321, pruned_loss=0.03553, over 4714855.63 frames. ], batch size: 105, lr: 2.11e-03, grad_scale: 8.0 2023-10-04 14:32:39,057 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_14056_210_4163_1_1530604483_7572827_532-137847-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 14:32:43,947 WARNING [train.py:1204] (2/4) Exclude cut with ID _8064_210_5399_1_1528372299291_4282973_155-283231-0_sp1.1 from training. Duration: 0.97275 2023-10-04 14:32:46,697 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_2245_210_2715_1_1525511548_3540254_310-52483-0_sp1.1 from training. Duration: 0.8 2023-10-04 14:32:46,757 WARNING [train.py:1204] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_185-174805-0 from training. Duration: 0.68 2023-10-04 14:32:48,251 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.1.bypass.skip_rate, batch_count=1687800.0, ans=0.035 2023-10-04 14:32:49,456 WARNING [train.py:1204] (2/4) Exclude cut with ID _15012_210_1968_1_1530856820429_5095350_662-210469-0_sp1.1 from training. Duration: 0.5181875 2023-10-04 14:32:52,310 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_426-313557-0_sp0.9 from training. Duration: 0.588875 2023-10-04 14:32:53,650 WARNING [train.py:1204] (2/4) Exclude cut with ID _42326_210_7770_1_1533814282531_3707269_407-204343-0_sp1.1 from training. Duration: 0.97275 2023-10-04 14:32:53,658 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_258-310570-0 from training. Duration: 0.82 2023-10-04 14:32:54,995 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.623e+02 2.040e+02 2.265e+02 2.652e+02 4.123e+02, threshold=4.530e+02, percent-clipped=0.0 2023-10-04 14:32:55,642 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.5.encoder.layers.1.self_attn2.whiten, num_groups=1, num_channels=256, metric=12.55 vs. limit=22.5 2023-10-04 14:32:58,500 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_4737_210_2040_1_1526998689_3293008_378-178650-0_sp1.1 from training. Duration: 0.863625 2023-10-04 14:32:59,891 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_47896_210_20508_1_1534492518_5958868_432-327451-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 14:32:59,978 WARNING [train.py:1204] (2/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_188-114464-0_sp1.1 from training. Duration: 0.7454375 2023-10-04 14:32:59,984 WARNING [train.py:1204] (2/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_798-106179-0_sp1.1 from training. Duration: 0.7545625 2023-10-04 14:33:00,018 WARNING [train.py:1204] (2/4) Exclude cut with ID _14368_210_4220_1_1530532714870_3595529_143-117421-0_sp1.1 from training. Duration: 0.52725 2023-10-04 14:33:01,352 WARNING [train.py:1204] (2/4) Exclude cut with ID _67499_210_17282_1_1535509805220_3692430_361-78786-0_sp1.1 from training. Duration: 0.963625 2023-10-04 14:33:02,700 WARNING [train.py:1204] (2/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_712-342563-0_sp1.1 from training. Duration: 0.8818125 2023-10-04 14:33:02,762 WARNING [train.py:1204] (2/4) Exclude cut with ID _9235_210_5219_1_1528891154253_8046120_387-159008-0 from training. Duration: 0.86 2023-10-04 14:33:05,560 WARNING [train.py:1204] (2/4) Exclude cut with ID _33640_210_2655_1_1533117605479_4855469_959-18820-0_sp1.1 from training. Duration: 0.963625 2023-10-04 14:33:05,587 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_12868_210_6753_1_1530702479_3610564_133-564-0_sp0.9 from training. Duration: 0.87775 2023-10-04 14:33:07,309 WARNING [train.py:1204] (2/4) Exclude cut with ID _23514_210_1389_1_1531832139785_3031439_12-29287-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 14:33:12,328 WARNING [train.py:1204] (2/4) Exclude cut with ID _23514_210_1389_1_1531832139785_3031439_489-185052-0_sp1.1 from training. Duration: 0.963625 2023-10-04 14:33:12,398 WARNING [train.py:1204] (2/4) Exclude cut with ID _5444_210_2151_1_1527418808464_5312210_634-194174-0 from training. Duration: 0.97 2023-10-04 14:33:12,419 WARNING [train.py:1204] (2/4) Exclude cut with ID _6275_210_2084_1_1528110062213_7669049_91-26919-0_sp1.1 from training. Duration: 0.8818125 2023-10-04 14:33:15,199 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_37351_210_7925_1_1533012931_3812113_516-82303-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 14:33:15,308 WARNING [train.py:1204] (2/4) Exclude cut with ID _41314_210_12651_1_1533459476543_3766460_80-74184-0_sp0.9 from training. Duration: 0.92225 2023-10-04 14:33:16,633 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_9460_210_4211_1_1528954587_10049667_495-202963-0_sp1.1 from training. Duration: 0.963625 2023-10-04 14:33:18,075 WARNING [train.py:1204] (2/4) Exclude cut with ID _14632_210_9988_1_1530691279392_3923200_332-105904-0_sp1.1 from training. Duration: 0.8181875 2023-10-04 14:33:18,091 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_40764_210_1925_1_1533882359_3752345_493-167886-0_sp1.1 from training. Duration: 0.92725 2023-10-04 14:33:19,545 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_196-289637-0 from training. Duration: 0.75 2023-10-04 14:33:20,882 WARNING [train.py:1204] (2/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_26-175553-0 from training. Duration: 0.61 2023-10-04 14:33:20,938 WARNING [train.py:1204] (2/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_890-5117-0 from training. Duration: 0.78 2023-10-04 14:33:20,980 WARNING [train.py:1204] (2/4) Exclude cut with ID _41314_210_12651_1_1533459476543_3766460_206-306860-0_sp1.1 from training. Duration: 0.92725 2023-10-04 14:33:22,976 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.3.encoder.layers.1.nonlin_attention.whiten2, num_groups=1, num_channels=512, metric=6.23 vs. limit=15.0 2023-10-04 14:33:23,881 WARNING [train.py:1204] (2/4) Exclude cut with ID _25882_210_3036_1_1532775665815_6479510_570-79824-0_sp1.1 from training. Duration: 0.963625 2023-10-04 14:33:25,297 WARNING [train.py:1204] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_731-2428-0_sp1.1 from training. Duration: 0.7545625 2023-10-04 14:33:25,332 WARNING [train.py:1204] (2/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_138-154812-0_sp0.9 from training. Duration: 0.8666875 2023-10-04 14:33:28,072 WARNING [train.py:1204] (2/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_178-39722-0_sp0.9 from training. Duration: 0.5444375 2023-10-04 14:33:30,002 WARNING [train.py:1204] (2/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_1285-256622-0_sp0.9 from training. Duration: 0.9333125 2023-10-04 14:33:34,765 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_41428_210_2161_1_1533774184_3992532_65-271323-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 14:33:36,211 WARNING [train.py:1204] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_308-161067-0 from training. Duration: 0.66 2023-10-04 14:33:36,216 WARNING [train.py:1204] (2/4) Exclude cut with ID _3067_210_2667_1_1526212974640_4503168_459-173867-0 from training. Duration: 0.88 2023-10-04 14:33:36,217 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_2221_210_1988_1_1525597051_4935541_415-296882-0_sp1.1 from training. Duration: 0.7 2023-10-04 14:33:37,676 WARNING [train.py:1204] (2/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_354-192266-0_sp1.1 from training. Duration: 0.936375 2023-10-04 14:33:40,044 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_258-310570-0_sp0.9 from training. Duration: 0.911125 2023-10-04 14:33:41,318 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_548-141039-0_sp1.1 from training. Duration: 0.963625 2023-10-04 14:33:44,190 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_15101_210_7927_1_1530786415_5033274_320-327527-0 from training. Duration: 0.96 2023-10-04 14:33:44,250 WARNING [train.py:1204] (2/4) Exclude cut with ID _2895_210_2477_1_1525956785794_4063750_289-11475-0_sp0.9 from training. Duration: 0.911125 2023-10-04 14:33:46,851 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7543_210_6753_1_1528543318_4552_1-85072-0_sp1.1 from training. Duration: 0.7545625 2023-10-04 14:33:46,939 WARNING [train.py:1204] (2/4) Exclude cut with ID _64639_210_5816_1_1535367619437_3528000_461-329156-0 from training. Duration: 0.96 2023-10-04 14:33:49,558 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_211-339622-0 from training. Duration: 0.45 2023-10-04 14:33:52,271 WARNING [train.py:1204] (2/4) Exclude cut with ID _20455_210_5219_1_1531565996775_8098100_402-112739-0_sp1.1 from training. Duration: 0.963625 2023-10-04 14:33:53,595 INFO [train.py:1046] (2/4) Epoch 48, batch 3550, loss[loss=0.1499, simple_loss=0.2406, pruned_loss=0.02961, over 24297.00 frames. ], tot_loss[loss=0.1507, simple_loss=0.2306, pruned_loss=0.03539, over 4698665.11 frames. ], batch size: 74, lr: 2.11e-03, grad_scale: 8.0 2023-10-04 14:33:53,639 WARNING [train.py:1204] (2/4) Exclude cut with ID _53489_210_6228_1_1534298357169_3835970_256-115565-0_sp1.1 from training. Duration: 0.936375 2023-10-04 14:33:53,655 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_26326_210_7925_1_1533002493_3592406_360-305260-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 14:33:53,678 WARNING [train.py:1204] (2/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_18-204383-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 14:33:56,528 WARNING [train.py:1204] (2/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_306-232480-0_sp1.1 from training. Duration: 0.87275 2023-10-04 14:33:56,867 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=1688133.3333333333, ans=0.1 2023-10-04 14:33:59,702 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.attention_skip_rate, batch_count=1688133.3333333333, ans=0.0 2023-10-04 14:34:05,283 WARNING [train.py:1204] (2/4) Exclude cut with ID _41161_210_20034_1_1533431249657_3555314_116-299186-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 14:34:05,937 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.2.encoder.layers.1.whiten, num_groups=1, num_channels=384, metric=4.96 vs. limit=12.0 2023-10-04 14:34:08,376 WARNING [train.py:1204] (2/4) Exclude cut with ID _13913_210_9994_1_1530579615187_3383897_473-277682-0_sp1.1 from training. Duration: 0.3545625 2023-10-04 14:34:10,547 WARNING [train.py:1204] (2/4) Exclude cut with ID _58895_210_8750_1_1535184081320_3649089_49-225702-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 14:34:12,498 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_3080_210_2071_1_1526086102_4365166_312-8726-0_sp1.1 from training. Duration: 0.82725 2023-10-04 14:34:13,879 WARNING [train.py:1204] (2/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_489-190175-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 14:34:13,951 WARNING [train.py:1204] (2/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_345-95717-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 14:34:13,963 WARNING [train.py:1204] (2/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_85-138257-0_sp1.1 from training. Duration: 0.6454375 2023-10-04 14:34:18,109 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7046_210_6753_1_1527895337_6920917_783-278109-0_sp1.1 from training. Duration: 0.7 2023-10-04 14:34:19,443 WARNING [train.py:1204] (2/4) Exclude cut with ID _3465_210_3525_1_1526175667060_3574049_450-203637-0_sp1.1 from training. Duration: 0.836375 2023-10-04 14:34:19,475 WARNING [train.py:1204] (2/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_416-132893-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 14:34:19,492 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_2655_210_2087_1_1525688959_3897400_437-337690-0_sp0.9 from training. Duration: 0.7 2023-10-04 14:34:20,816 WARNING [train.py:1204] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_700-183549-0_sp1.1 from training. Duration: 0.6181875 2023-10-04 14:34:25,308 WARNING [train.py:1204] (2/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_191-59784-0_sp1.1 from training. Duration: 0.8 2023-10-04 14:34:25,322 WARNING [train.py:1204] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_218-295097-0_sp1.1 from training. Duration: 0.7 2023-10-04 14:34:28,019 WARNING [train.py:1204] (2/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_310-109461-0_sp1.1 from training. Duration: 0.72725 2023-10-04 14:34:28,030 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_33348_210_5632_1_1533086912_3324030_131-321149-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 14:34:29,397 WARNING [train.py:1204] (2/4) Exclude cut with ID _64135_210_2253_1_1535677267876_7608972_133-188266-0_sp1.1 from training. Duration: 0.736375 2023-10-04 14:34:29,422 WARNING [train.py:1204] (2/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_505-205270-0 from training. Duration: 0.38 2023-10-04 14:34:29,432 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_67223_210_4238_1_1535612120_2537431_102-88248-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 14:34:30,705 WARNING [train.py:1204] (2/4) Exclude cut with ID _54871_210_22356_1_1534665798253_3856359_60-235433-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 14:34:32,085 WARNING [train.py:1204] (2/4) Exclude cut with ID _5189_210_4557_1_1527248319428_3769229_199-53526-0_sp1.1 from training. Duration: 0.3818125 2023-10-04 14:34:35,872 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.conv_module1.balancer1.prob, batch_count=1688266.6666666667, ans=0.125 2023-10-04 14:34:36,867 WARNING [train.py:1204] (2/4) Exclude cut with ID _45329_210_7005_1_1534157951211_6774339_439-90979-0_sp1.1 from training. Duration: 0.963625 2023-10-04 14:34:37,690 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.2.encoder.layers.1.nonlin_attention.whiten1, num_groups=1, num_channels=288, metric=5.44 vs. limit=10.0 2023-10-04 14:34:38,248 WARNING [train.py:1204] (2/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_1162-132880-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 14:34:40,085 WARNING [train.py:1204] (2/4) Exclude cut with ID _56423_210_5854_1_1534896061049_3165969_220-22317-0_sp1.1 from training. Duration: 0.963625 2023-10-04 14:34:41,589 WARNING [train.py:1204] (2/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_909-64151-0 from training. Duration: 0.94 2023-10-04 14:34:43,461 WARNING [train.py:1204] (2/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_465-156500-0_sp0.9 from training. Duration: 0.8 2023-10-04 14:34:43,666 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.nonlin_attention.balancer.prob, batch_count=1688333.3333333333, ans=0.125 2023-10-04 14:34:44,827 WARNING [train.py:1204] (2/4) Exclude cut with ID _44304_210_15374_1_1533639352765_4613129_302-22084-0 from training. Duration: 0.86 2023-10-04 14:34:44,873 WARNING [train.py:1204] (2/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_663-166973-0_sp1.1 from training. Duration: 0.72725 2023-10-04 14:34:45,060 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.conv_skip_rate, batch_count=1688333.3333333333, ans=0.0 2023-10-04 14:34:47,621 WARNING [train.py:1204] (2/4) Exclude cut with ID _45329_210_7005_1_1534157951211_6774339_503-287883-0_sp0.9 from training. Duration: 0.97775 2023-10-04 14:34:47,640 WARNING [train.py:1204] (2/4) Exclude cut with ID _60057_210_10640_1_1534903065793_3333150_362-230377-0_sp0.9 from training. Duration: 0.9666875 2023-10-04 14:34:50,330 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_4183_210_2087_1_1526729100_1869838_143-280962-0 from training. Duration: 0.56 2023-10-04 14:34:51,746 WARNING [train.py:1204] (2/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_281-1313-0_sp1.1 from training. Duration: 0.936375 2023-10-04 14:34:55,974 WARNING [train.py:1204] (2/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_64-215118-0_sp1.1 from training. Duration: 0.936375 2023-10-04 14:34:56,019 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_2822_210_2715_1_1526112586_1999587_165-36656-0 from training. Duration: 0.89 2023-10-04 14:34:57,360 WARNING [train.py:1204] (2/4) Exclude cut with ID _22961_210_8254_1_1531976366672_4278180_402-208830-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 14:35:00,206 WARNING [train.py:1204] (2/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_98-193494-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 14:35:01,686 WARNING [train.py:1204] (2/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_383-285196-0 from training. Duration: 0.97 2023-10-04 14:35:06,902 INFO [train.py:1046] (2/4) Epoch 48, batch 3600, loss[loss=0.1568, simple_loss=0.2327, pruned_loss=0.04047, over 23824.00 frames. ], tot_loss[loss=0.1514, simple_loss=0.2314, pruned_loss=0.03567, over 4702042.25 frames. ], batch size: 179, lr: 2.11e-03, grad_scale: 16.0 2023-10-04 14:35:10,478 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6767_210_3273_1_1527855783_1718932_142-127132-0 from training. Duration: 0.95 2023-10-04 14:35:10,519 WARNING [train.py:1204] (2/4) Exclude cut with ID _39867_210_9826_1_1533553222030_9344259_1018-339635-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 14:35:10,576 WARNING [train.py:1204] (2/4) Exclude cut with ID _14603_210_4220_1_1530615688441_3097319_274-274798-0_sp1.1 from training. Duration: 0.8909375 2023-10-04 14:35:12,684 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.bypass_mid.scale_min, batch_count=1688466.6666666667, ans=0.2 2023-10-04 14:35:14,844 WARNING [train.py:1204] (2/4) Exclude cut with ID _2334_210_2667_1_1525608238880_4705129_376-263393-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 14:35:14,890 WARNING [train.py:1204] (2/4) Exclude cut with ID _29303_210_2575_1_1532392237303_7410649_363-344675-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 14:35:16,196 WARNING [train.py:1204] (2/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_58-68956-0_sp1.1 from training. Duration: 0.7545625 2023-10-04 14:35:18,948 WARNING [train.py:1204] (2/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_709-311571-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 14:35:19,070 WARNING [train.py:1204] (2/4) Exclude cut with ID _46067_210_4220_1_1534125571066_3307450_136-257407-0_sp1.1 from training. Duration: 0.963625 2023-10-04 14:35:20,437 WARNING [train.py:1204] (2/4) Exclude cut with ID _6493_210_1747_1_1528459210424_4012470_324-323854-0_sp1.1 from training. Duration: 0.836375 2023-10-04 14:35:20,664 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.feed_forward1.out_proj.dropout_p, batch_count=1688466.6666666667, ans=0.1 2023-10-04 14:35:21,654 WARNING [train.py:1204] (2/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_234-260682-0_sp1.1 from training. Duration: 0.763625 2023-10-04 14:35:23,088 WARNING [train.py:1204] (2/4) Exclude cut with ID _36634_210_1794_1_1533296352450_1399884_55-119451-0_sp1.1 from training. Duration: 0.963625 2023-10-04 14:35:23,091 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_40047_210_2290_1_1533290519_2374311_195-165693-0 from training. Duration: 0.89 2023-10-04 14:35:25,594 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.608e+02 2.134e+02 2.513e+02 3.119e+02 5.278e+02, threshold=5.026e+02, percent-clipped=3.0 2023-10-04 14:35:25,760 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_5522_210_3273_1_1527249606_3909096_366-172508-0_sp0.9 from training. Duration: 0.7333125 2023-10-04 14:35:27,117 WARNING [train.py:1204] (2/4) Exclude cut with ID _21878_210_3525_1_1531738772002_7264330_187-10504-0_sp1.1 from training. Duration: 0.963625 2023-10-04 14:35:29,948 WARNING [train.py:1204] (2/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_282-257791-0_sp1.1 from training. Duration: 0.7818125 2023-10-04 14:35:32,648 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_43831_210_2158_1_1533725502_5872791_467-288776-0_sp1.1 from training. Duration: 0.92725 2023-10-04 14:35:32,739 WARNING [train.py:1204] (2/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_138-154812-0_sp1.1 from training. Duration: 0.7090625 2023-10-04 14:35:34,039 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_1154-185930-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 14:35:34,056 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_2655_210_2087_1_1525688959_3897400_52-279899-0 from training. Duration: 0.69 2023-10-04 14:35:34,114 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_141-89113-0_sp1.1 from training. Duration: 0.7818125 2023-10-04 14:35:36,794 WARNING [train.py:1204] (2/4) Exclude cut with ID _55134_210_21982_1_1534590028377_3882170_297-47310-0_sp1.1 from training. Duration: 0.963625 2023-10-04 14:35:36,873 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_486-151131-0_sp1.1 from training. Duration: 0.77275 2023-10-04 14:35:40,048 WARNING [train.py:1204] (2/4) Exclude cut with ID _44778_210_2054_1_1533798127203_7443090_21-232980-0_sp1.1 from training. Duration: 0.936375 2023-10-04 14:35:41,928 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6165_210_3528_1_1527590982_4379841_338-106380-0_sp1.1 from training. Duration: 0.92725 2023-10-04 14:35:42,629 WARNING [train.py:1204] (2/4) Exclude cut with ID _55162_210_17071_1_1534408248583_3753480_355-257218-0_sp1.1 from training. Duration: 0.97275 2023-10-04 14:35:43,983 WARNING [train.py:1204] (2/4) Exclude cut with ID _2895_210_2477_1_1525956785794_4063750_289-11475-0 from training. Duration: 0.82 2023-10-04 14:35:49,530 WARNING [train.py:1204] (2/4) Exclude cut with ID _16756_210_4915_1_1531047803763_7104259_839-183431-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 14:35:50,944 WARNING [train.py:1204] (2/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_427-208901-0_sp0.9 from training. Duration: 0.7555625 2023-10-04 14:35:52,295 WARNING [train.py:1204] (2/4) Exclude cut with ID _36512_210_6987_1_1533033143998_3126069_181-306500-0 from training. Duration: 0.95 2023-10-04 14:35:53,928 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.feed_forward2.hidden_balancer.prob, batch_count=1688666.6666666667, ans=0.125 2023-10-04 14:35:56,565 WARNING [train.py:1204] (2/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_28-70987-0_sp1.1 from training. Duration: 0.7909375 2023-10-04 14:35:59,427 WARNING [train.py:1204] (2/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_590-58996-0_sp1.1 from training. Duration: 0.936375 2023-10-04 14:35:59,663 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.bypass.skip_rate, batch_count=1688666.6666666667, ans=0.09899494936611666 2023-10-04 14:36:02,256 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_48730_210_5399_1_1533885886_2212769_108-47561-0_sp1.1 from training. Duration: 0.936375 2023-10-04 14:36:06,286 WARNING [train.py:1204] (2/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_401-158239-0_sp0.9 from training. Duration: 0.788875 2023-10-04 14:36:06,298 WARNING [train.py:1204] (2/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_278-154492-0_sp1.1 from training. Duration: 0.6909375 2023-10-04 14:36:06,304 WARNING [train.py:1204] (2/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_137-116442-0 from training. Duration: 0.78 2023-10-04 14:36:08,279 WARNING [train.py:1204] (2/4) Exclude cut with ID _3541_210_2477_1_1526475408709_3263973_189-317158-0 from training. Duration: 0.9 2023-10-04 14:36:10,145 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_47896_210_20508_1_1534492518_5958868_558-314572-0 from training. Duration: 0.97 2023-10-04 14:36:11,629 WARNING [train.py:1204] (2/4) Exclude cut with ID _66070_210_7640_1_1535441393096_7805310_290-40284-0_sp1.1 from training. Duration: 0.97275 2023-10-04 14:36:12,993 WARNING [train.py:1204] (2/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_110-224688-0_sp1.1 from training. Duration: 0.8818125 2023-10-04 14:36:14,376 WARNING [train.py:1204] (2/4) Exclude cut with ID _7719_210_3525_1_1528456153163_4296960_88-250259-0 from training. Duration: 0.84 2023-10-04 14:36:14,430 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8453_210_4211_1_1528502940_8562730_1157-114102-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 14:36:14,463 WARNING [train.py:1204] (2/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_317-175062-0_sp1.1 from training. Duration: 0.7454375 2023-10-04 14:36:14,469 WARNING [train.py:1204] (2/4) Exclude cut with ID _29857_210_20034_1_1532395886753_3798160_254-347254-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 14:36:15,816 WARNING [train.py:1204] (2/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_28-70987-0 from training. Duration: 0.87 2023-10-04 14:36:15,895 WARNING [train.py:1204] (2/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_321-335475-0 from training. Duration: 0.45 2023-10-04 14:36:18,848 WARNING [train.py:1204] (2/4) Exclude cut with ID _40901_210_5665_1_1533363789958_3856130_356-107330-0_sp1.1 from training. Duration: 0.936375 2023-10-04 14:36:18,916 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_3780_210_2087_1_1526467389_2145572_176-107140-0 from training. Duration: 0.69 2023-10-04 14:36:19,195 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.bypass.skip_rate, batch_count=1688733.3333333333, ans=0.07 2023-10-04 14:36:21,505 INFO [train.py:1046] (2/4) Epoch 48, batch 3650, loss[loss=0.201, simple_loss=0.274, pruned_loss=0.06399, over 19339.00 frames. ], tot_loss[loss=0.1524, simple_loss=0.2325, pruned_loss=0.03615, over 4696646.46 frames. ], batch size: 388, lr: 2.11e-03, grad_scale: 8.0 2023-10-04 14:36:22,958 WARNING [train.py:1204] (2/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_80-135200-0 from training. Duration: 0.79 2023-10-04 14:36:24,411 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_33348_210_5632_1_1533086912_3324030_85-28583-0_sp0.9 from training. Duration: 0.911125 2023-10-04 14:36:24,642 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.nonlin_attention.balancer.prob, batch_count=1688800.0, ans=0.125 2023-10-04 14:36:28,585 WARNING [train.py:1204] (2/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_29-293896-0 from training. Duration: 0.61 2023-10-04 14:36:29,986 WARNING [train.py:1204] (2/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_194-252800-0 from training. Duration: 0.97 2023-10-04 14:36:35,475 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_360-52446-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 14:36:35,477 WARNING [train.py:1204] (2/4) Exclude cut with ID _9389_210_1664_1_1529217337301_5289269_289-213417-0_sp0.9 from training. Duration: 0.988875 2023-10-04 14:36:35,525 WARNING [train.py:1204] (2/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_143-86344-0_sp0.9 from training. Duration: 0.8555625 2023-10-04 14:36:40,567 WARNING [train.py:1204] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_520-126729-0_sp0.9 from training. Duration: 0.77775 2023-10-04 14:36:40,599 WARNING [train.py:1204] (2/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_76-308663-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 14:36:42,359 WARNING [train.py:1204] (2/4) Exclude cut with ID _8021_210_7994_1_1528376451358_3653069_100-140231-0 from training. Duration: 0.67 2023-10-04 14:36:43,732 WARNING [train.py:1204] (2/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_387-131538-0_sp0.9 from training. Duration: 0.9 2023-10-04 14:36:44,971 WARNING [train.py:1204] (2/4) Exclude cut with ID _6275_210_2084_1_1528110062213_7669049_365-182054-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 14:36:45,019 WARNING [train.py:1204] (2/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_345-340690-0 from training. Duration: 0.93 2023-10-04 14:36:46,376 WARNING [train.py:1204] (2/4) Exclude cut with ID _5409_210_1388_1_1527292850189_5865191_178-127970-0_sp0.9 from training. Duration: 0.6333125 2023-10-04 14:36:46,432 WARNING [train.py:1204] (2/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_352-12734-0_sp1.1 from training. Duration: 0.92725 2023-10-04 14:36:46,435 WARNING [train.py:1204] (2/4) Exclude cut with ID _65134_210_14763_1_1535335393985_3505368_121-129373-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 14:36:49,359 WARNING [train.py:1204] (2/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_557-324258-0_sp1.1 from training. Duration: 0.736375 2023-10-04 14:36:51,009 WARNING [train.py:1204] (2/4) Exclude cut with ID _48678_210_5663_1_1533988830947_3769199_306-255886-0 from training. Duration: 0.77 2023-10-04 14:36:52,395 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7544_210_6753_1_1528587000_691468_87-178599-0 from training. Duration: 0.87 2023-10-04 14:36:53,788 WARNING [train.py:1204] (2/4) Exclude cut with ID _2941_210_2577_1_1526199673150_2489759_208-236402-0_sp1.1 from training. Duration: 0.963625 2023-10-04 14:36:55,214 WARNING [train.py:1204] (2/4) Exclude cut with ID _38670_210_15379_1_1533448810659_4391059_65-169397-0 from training. Duration: 0.97 2023-10-04 14:36:57,865 WARNING [train.py:1204] (2/4) Exclude cut with ID _30304_210_6319_1_1532476827261_7582060_306-328175-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 14:36:57,875 WARNING [train.py:1204] (2/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_212-149947-0_sp1.1 from training. Duration: 0.663625 2023-10-04 14:36:58,234 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.nonlin_attention.balancer.prob, batch_count=1688933.3333333333, ans=0.125 2023-10-04 14:37:03,448 WARNING [train.py:1204] (2/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_321-22821-0_sp1.1 from training. Duration: 0.8090625 2023-10-04 14:37:05,033 WARNING [train.py:1204] (2/4) Exclude cut with ID _44010_210_4525_1_1533884262964_4013620_483-69110-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 14:37:05,041 WARNING [train.py:1204] (2/4) Exclude cut with ID _14922_210_6228_1_1530707435273_3693310_38-187851-0_sp1.1 from training. Duration: 0.563625 2023-10-04 14:37:06,344 WARNING [train.py:1204] (2/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_129-337802-0_sp0.9 from training. Duration: 0.8 2023-10-04 14:37:06,417 WARNING [train.py:1204] (2/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_268-87909-0_sp1.1 from training. Duration: 0.9 2023-10-04 14:37:09,196 WARNING [train.py:1204] (2/4) Exclude cut with ID _21878_210_3525_1_1531738772002_7264330_559-184862-0_sp1.1 from training. Duration: 0.8545625 2023-10-04 14:37:12,314 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_46032_210_14656_1_1533781412_4507969_237-120766-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 14:37:12,406 WARNING [train.py:1204] (2/4) Exclude cut with ID _28427_210_4220_1_1532581240967_3061069_330-170906-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 14:37:14,335 WARNING [train.py:1204] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_401-51578-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 14:37:16,121 WARNING [train.py:1204] (2/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_209-205365-0_sp1.1 from training. Duration: 0.7181875 2023-10-04 14:37:17,528 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_23801_210_16223_1_1532066392_3294657_57-242170-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 14:37:17,573 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_127-37371-0_sp1.1 from training. Duration: 0.936375 2023-10-04 14:37:21,745 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9171W0019-578905 from training. Duration: 0.896 2023-10-04 14:37:25,848 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_24605_210_1794_1_1532047863_4149755_349-49056-0_sp1.1 from training. Duration: 0.92725 2023-10-04 14:37:25,859 WARNING [train.py:1204] (2/4) Exclude cut with ID _12302_210_5219_1_1529924446466_8107280_897-21237-0_sp1.1 from training. Duration: 0.936375 2023-10-04 14:37:27,250 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_9460_210_4211_1_1528954587_10049667_480-260269-0_sp0.9 from training. Duration: 0.62225 2023-10-04 14:37:27,290 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_348-152548-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 14:37:28,612 WARNING [train.py:1204] (2/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_301-330455-0_sp0.9 from training. Duration: 0.888875 2023-10-04 14:37:29,952 WARNING [train.py:1204] (2/4) Exclude cut with ID _61008_210_10633_1_1534852769198_6974370_345-91705-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 14:37:31,344 WARNING [train.py:1204] (2/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_304-273433-0 from training. Duration: 0.99 2023-10-04 14:37:31,348 WARNING [train.py:1204] (2/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_359-345972-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 14:37:34,168 INFO [train.py:1046] (2/4) Epoch 48, batch 3700, loss[loss=0.1666, simple_loss=0.2413, pruned_loss=0.04594, over 22766.00 frames. ], tot_loss[loss=0.1539, simple_loss=0.2342, pruned_loss=0.03673, over 4705349.28 frames. ], batch size: 322, lr: 2.11e-03, grad_scale: 8.0 2023-10-04 14:37:34,240 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_2296_210_2087_1_1525503448_3397335_57-185430-0_sp1.1 from training. Duration: 0.5454375 2023-10-04 14:37:35,648 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_1256-120974-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 14:37:36,918 WARNING [train.py:1204] (2/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_372-30302-0_sp0.9 from training. Duration: 0.9555625 2023-10-04 14:37:39,728 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_42012_210_7927_1_1533439936_3871556_451-344262-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 14:37:39,729 WARNING [train.py:1204] (2/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_28-324729-0 from training. Duration: 0.94 2023-10-04 14:37:39,745 WARNING [train.py:1204] (2/4) Exclude cut with ID _64135_210_2253_1_1535677267876_7608972_313-18394-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 14:37:41,208 WARNING [train.py:1204] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_620-276793-0_sp0.9 from training. Duration: 0.6555625 2023-10-04 14:37:42,945 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7544_210_6753_1_1528591327_1471426_172-348777-0_sp0.9 from training. Duration: 0.7333125 2023-10-04 14:37:46,667 WARNING [train.py:1204] (2/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_332-221057-0_sp0.9 from training. Duration: 0.7555625 2023-10-04 14:37:51,375 WARNING [train.py:1204] (2/4) Exclude cut with ID _19023_210_4353_1_1531378892552_5820780_5-300035-0_sp1.1 from training. Duration: 0.92725 2023-10-04 14:37:51,419 WARNING [train.py:1204] (2/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_343-309023-0_sp1.1 from training. Duration: 0.963625 2023-10-04 14:37:52,596 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.693e+02 2.073e+02 2.286e+02 2.591e+02 3.912e+02, threshold=4.572e+02, percent-clipped=0.0 2023-10-04 14:37:52,689 WARNING [train.py:1204] (2/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_37-41919-0_sp1.1 from training. Duration: 0.7909375 2023-10-04 14:37:52,731 WARNING [train.py:1204] (2/4) Exclude cut with ID _3541_210_2477_1_1526475408709_3263973_41-182910-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 14:37:53,974 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_229-45300-0_sp0.9 from training. Duration: 0.6333125 2023-10-04 14:37:55,471 WARNING [train.py:1204] (2/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_40-214716-0_sp1.1 from training. Duration: 0.963625 2023-10-04 14:37:56,945 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9309W0203-640330 from training. Duration: 0.896 2023-10-04 14:38:03,991 WARNING [train.py:1204] (2/4) Exclude cut with ID _60229_210_27767_1_1534838406158_3655350_264-279573-0_sp1.1 from training. Duration: 0.7545625 2023-10-04 14:38:04,023 WARNING [train.py:1204] (2/4) Exclude cut with ID _8394_210_6702_1_1528590504316_3540189_372-198878-0_sp0.9 from training. Duration: 0.6666875 2023-10-04 14:38:05,412 WARNING [train.py:1204] (2/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_359-118926-0_sp1.1 from training. Duration: 0.6545625 2023-10-04 14:38:05,434 WARNING [train.py:1204] (2/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_85-65056-0 from training. Duration: 0.96 2023-10-04 14:38:05,452 WARNING [train.py:1204] (2/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_89-242335-0_sp1.1 from training. Duration: 0.863625 2023-10-04 14:38:10,823 WARNING [train.py:1204] (2/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_128-49444-0_sp1.1 from training. Duration: 0.936375 2023-10-04 14:38:12,153 WARNING [train.py:1204] (2/4) Exclude cut with ID _30327_210_16049_1_1532484161295_3924169_401-70819-0 from training. Duration: 0.86 2023-10-04 14:38:13,472 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_41822_210_13270_1_1533955976_4379284_260-256391-0_sp1.1 from training. Duration: 0.936375 2023-10-04 14:38:13,587 WARNING [train.py:1204] (2/4) Exclude cut with ID _67335_210_5219_1_1535525975652_4275229_599-247317-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 14:38:17,548 WARNING [train.py:1204] (2/4) Exclude cut with ID _29857_210_20034_1_1532395886753_3798160_209-239267-0_sp1.1 from training. Duration: 0.936375 2023-10-04 14:38:17,570 WARNING [train.py:1204] (2/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_46-207599-0_sp1.1 from training. Duration: 0.5818125 2023-10-04 14:38:19,485 WARNING [train.py:1204] (2/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_912-282412-0_sp1.1 from training. Duration: 0.4454375 2023-10-04 14:38:24,134 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_4183_210_2087_1_1526729100_1869838_141-166600-0_sp1.1 from training. Duration: 0.863625 2023-10-04 14:38:24,139 WARNING [train.py:1204] (2/4) Exclude cut with ID _15037_210_2253_1_1530752294104_3267650_296-192597-0 from training. Duration: 0.81 2023-10-04 14:38:24,200 WARNING [train.py:1204] (2/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_571-220562-0_sp1.1 from training. Duration: 0.963625 2023-10-04 14:38:24,225 WARNING [train.py:1204] (2/4) Exclude cut with ID _5287_210_2084_1_1527418883040_3878670_12-221186-0 from training. Duration: 0.98 2023-10-04 14:38:29,666 WARNING [train.py:1204] (2/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_149-85602-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 14:38:31,021 WARNING [train.py:1204] (2/4) Exclude cut with ID _9235_210_5219_1_1528891154253_8046120_1003-101638-0_sp1.1 from training. Duration: 0.663625 2023-10-04 14:38:33,641 WARNING [train.py:1204] (2/4) Exclude cut with ID _55844_210_2346_1_1534471212569_7625707_1099-33689-0_sp1.1 from training. Duration: 0.92725 2023-10-04 14:38:33,680 WARNING [train.py:1204] (2/4) Exclude cut with ID _7606_210_4915_1_1528894253277_5080690_301-10732-0 from training. Duration: 0.81 2023-10-04 14:38:35,136 WARNING [train.py:1204] (2/4) Exclude cut with ID _22826_210_9994_1_1531908201522_3279169_403-34698-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 14:38:35,148 WARNING [train.py:1204] (2/4) Exclude cut with ID _8480_210_1874_1_1528589391205_6728330_755-112625-0_sp0.9 from training. Duration: 0.82225 2023-10-04 14:38:35,892 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.2.encoder.layers.1.self_attn_weights.whiten_keys, num_groups=4, num_channels=128, metric=2.98 vs. limit=6.0 2023-10-04 14:38:36,417 WARNING [train.py:1204] (2/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_527-238990-0_sp1.1 from training. Duration: 0.8181875 2023-10-04 14:38:36,452 WARNING [train.py:1204] (2/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_426-7328-0_sp1.1 from training. Duration: 0.92725 2023-10-04 14:38:39,272 WARNING [train.py:1204] (2/4) Exclude cut with ID _3541_210_2477_1_1526475408709_3263973_189-317158-0_sp1.1 from training. Duration: 0.8181875 2023-10-04 14:38:39,441 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.conv_module1.balancer1.prob, batch_count=1689400.0, ans=0.125 2023-10-04 14:38:40,556 WARNING [train.py:1204] (2/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_162-103620-0 from training. Duration: 0.93 2023-10-04 14:38:41,960 WARNING [train.py:1204] (2/4) Exclude cut with ID _22083_210_8254_1_1531702862439_3678970_49-234546-0 from training. Duration: 0.83 2023-10-04 14:38:43,305 WARNING [train.py:1204] (2/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_370-331600-0_sp1.1 from training. Duration: 0.8545625 2023-10-04 14:38:43,319 WARNING [train.py:1204] (2/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_166-59042-0_sp1.1 from training. Duration: 0.97275 2023-10-04 14:38:45,155 WARNING [train.py:1204] (2/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_138-332773-0_sp1.1 from training. Duration: 0.736375 2023-10-04 14:38:45,221 WARNING [train.py:1204] (2/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_90-30673-0_sp0.9 from training. Duration: 0.9333125 2023-10-04 14:38:46,605 WARNING [train.py:1204] (2/4) Exclude cut with ID _27967_210_12140_1_1532588391859_8419130_737-41898-0_sp1.1 from training. Duration: 0.936375 2023-10-04 14:38:48,516 INFO [train.py:1046] (2/4) Epoch 48, batch 3750, loss[loss=0.131, simple_loss=0.2109, pruned_loss=0.02554, over 24326.00 frames. ], tot_loss[loss=0.1539, simple_loss=0.2343, pruned_loss=0.03678, over 4709289.33 frames. ], batch size: 56, lr: 2.11e-03, grad_scale: 8.0 2023-10-04 14:38:49,903 WARNING [train.py:1204] (2/4) Exclude cut with ID _8833_210_5219_1_1528592665210_7553059_329-125156-0_sp1.1 from training. Duration: 0.7454375 2023-10-04 14:38:49,997 WARNING [train.py:1204] (2/4) Exclude cut with ID _41817_210_18835_1_1533434477160_7423049_352-42537-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 14:38:51,345 WARNING [train.py:1204] (2/4) Exclude cut with ID _8365_210_2064_1_1528592311070_4677690_431-294102-0 from training. Duration: 0.89 2023-10-04 14:38:54,589 WARNING [train.py:1204] (2/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_454-314901-0_sp1.1 from training. Duration: 0.4181875 2023-10-04 14:38:56,021 WARNING [train.py:1204] (2/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_80-135200-0_sp0.9 from training. Duration: 0.87775 2023-10-04 14:38:56,067 WARNING [train.py:1204] (2/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_181-78632-0 from training. Duration: 0.78 2023-10-04 14:38:56,297 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.feed_forward1.out_proj.dropout_p, batch_count=1689466.6666666667, ans=0.1 2023-10-04 14:38:57,391 WARNING [train.py:1204] (2/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_300-27235-0_sp0.9 from training. Duration: 0.9666875 2023-10-04 14:38:58,740 WARNING [train.py:1204] (2/4) Exclude cut with ID _47790_210_16187_1_1533862814066_4207519_96-177176-0_sp1.1 from training. Duration: 0.97275 2023-10-04 14:39:00,132 WARNING [train.py:1204] (2/4) Exclude cut with ID _35941_210_15379_1_1532995294515_4023089_246-348742-0_sp1.1 from training. Duration: 0.97275 2023-10-04 14:39:01,514 WARNING [train.py:1204] (2/4) Exclude cut with ID _38670_210_15379_1_1533448810659_4391059_65-169397-0_sp1.1 from training. Duration: 0.8818125 2023-10-04 14:39:04,372 WARNING [train.py:1204] (2/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_1003-99642-0_sp1.1 from training. Duration: 0.963625 2023-10-04 14:39:07,373 WARNING [train.py:1204] (2/4) Exclude cut with ID _40402_210_18219_1_1533522324115_2953150_320-14078-0_sp1.1 from training. Duration: 0.863625 2023-10-04 14:39:08,668 WARNING [train.py:1204] (2/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_367-61330-0_sp0.9 from training. Duration: 0.8333125 2023-10-04 14:39:10,073 WARNING [train.py:1204] (2/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_515-298514-0_sp1.1 from training. Duration: 0.92725 2023-10-04 14:39:10,316 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.conv_module1.balancer1.prob, batch_count=1689533.3333333333, ans=0.125 2023-10-04 14:39:12,821 WARNING [train.py:1204] (2/4) Exclude cut with ID _53731_210_7994_1_1534553979244_3757214_471-320815-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 14:39:12,881 WARNING [train.py:1204] (2/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_306-232480-0 from training. Duration: 0.96 2023-10-04 14:39:12,942 WARNING [train.py:1204] (2/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_478-162247-0_sp0.9 from training. Duration: 0.911125 2023-10-04 14:39:16,859 WARNING [train.py:1204] (2/4) Exclude cut with ID _56423_210_5854_1_1534896061049_3165969_345-284696-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 14:39:16,905 WARNING [train.py:1204] (2/4) Exclude cut with ID _44778_210_2054_1_1533798127203_7443090_600-122253-0_sp1.1 from training. Duration: 0.963625 2023-10-04 14:39:19,540 WARNING [train.py:1204] (2/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_199-295611-0 from training. Duration: 0.55 2023-10-04 14:39:22,935 WARNING [train.py:1204] (2/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_734-129761-0 from training. Duration: 0.82 2023-10-04 14:39:24,333 WARNING [train.py:1204] (2/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_349-169257-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 14:39:24,371 WARNING [train.py:1204] (2/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_202-67017-0_sp0.9 from training. Duration: 0.911125 2023-10-04 14:39:27,109 WARNING [train.py:1204] (2/4) Exclude cut with ID _16908_210_5724_1_1531119157248_3772700_61-258978-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 14:39:28,676 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.feed_forward3.hidden_balancer.prob, batch_count=1689600.0, ans=0.125 2023-10-04 14:39:31,138 WARNING [train.py:1204] (2/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_172-156931-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 14:39:32,561 WARNING [train.py:1204] (2/4) Exclude cut with ID _4092_210_2151_1_1526643044449_3135759_356-278694-0_sp1.1 from training. Duration: 0.42725 2023-10-04 14:39:35,342 WARNING [train.py:1204] (2/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_116-332008-0 from training. Duration: 0.98 2023-10-04 14:39:35,628 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.self_attn_weights.pos_emb_skip_rate, batch_count=1689666.6666666667, ans=0.0 2023-10-04 14:39:36,908 WARNING [train.py:1204] (2/4) Exclude cut with ID _29003_210_19596_1_1532507391720_3894098_358-114789-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 14:39:40,785 WARNING [train.py:1204] (2/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_152-212533-0_sp1.1 from training. Duration: 0.8818125 2023-10-04 14:39:40,988 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.ff3_skip_rate, batch_count=1689666.6666666667, ans=0.0 2023-10-04 14:39:42,117 WARNING [train.py:1204] (2/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_267-214116-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 14:39:45,598 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.feed_forward3.hidden_balancer.prob, batch_count=1689733.3333333333, ans=0.125 2023-10-04 14:39:46,726 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7543_210_6753_1_1528541907_1399322_165-323619-0_sp0.9 from training. Duration: 0.7666875 2023-10-04 14:39:49,576 WARNING [train.py:1204] (2/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_577-332600-0_sp0.9 from training. Duration: 0.6333125 2023-10-04 14:39:51,032 WARNING [train.py:1204] (2/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_342-100157-0_sp0.9 from training. Duration: 0.6 2023-10-04 14:39:54,076 WARNING [train.py:1204] (2/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_141-89727-0_sp1.1 from training. Duration: 0.6454375 2023-10-04 14:39:55,472 WARNING [train.py:1204] (2/4) Exclude cut with ID _3992_210_1747_1_1526644816031_3731060_107-251434-0_sp1.1 from training. Duration: 0.9 2023-10-04 14:39:56,873 WARNING [train.py:1204] (2/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_401-53423-0_sp1.1 from training. Duration: 0.5 2023-10-04 14:40:00,938 INFO [train.py:1046] (2/4) Epoch 48, batch 3800, loss[loss=0.1453, simple_loss=0.2117, pruned_loss=0.03941, over 23450.00 frames. ], tot_loss[loss=0.1549, simple_loss=0.2354, pruned_loss=0.03722, over 4708302.23 frames. ], batch size: 285, lr: 2.11e-03, grad_scale: 8.0 2023-10-04 14:40:03,721 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7762_210_1794_1_1528199508_4057408_422-227034-0_sp1.1 from training. Duration: 0.663625 2023-10-04 14:40:07,844 WARNING [train.py:1204] (2/4) Exclude cut with ID _4184_210_2550_1_1526730192013_2219380_102-250299-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 14:40:07,908 WARNING [train.py:1204] (2/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_253-308377-0_sp0.9 from training. Duration: 0.5444375 2023-10-04 14:40:09,280 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_271-297505-0 from training. Duration: 0.99 2023-10-04 14:40:10,907 WARNING [train.py:1204] (2/4) Exclude cut with ID _35941_210_15379_1_1532995294515_4023089_213-142973-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 14:40:12,721 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7544_210_6753_1_1528587722_3576686_162-63018-0_sp1.1 from training. Duration: 0.92725 2023-10-04 14:40:12,798 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_365-217119-0_sp0.9 from training. Duration: 0.7 2023-10-04 14:40:14,853 WARNING [train.py:1204] (2/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_662-275621-0_sp0.9 from training. Duration: 0.4666875 2023-10-04 14:40:14,855 WARNING [train.py:1204] (2/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_374-187555-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 14:40:15,114 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.balancer1.prob, batch_count=1689866.6666666667, ans=0.125 2023-10-04 14:40:16,058 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_289-180904-0_sp1.1 from training. Duration: 0.6909375 2023-10-04 14:40:16,397 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.balancer2.prob, batch_count=1689866.6666666667, ans=0.125 2023-10-04 14:40:17,506 WARNING [train.py:1204] (2/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_189-47206-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 14:40:17,533 WARNING [train.py:1204] (2/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_234-14174-0_sp1.1 from training. Duration: 0.7454375 2023-10-04 14:40:17,569 WARNING [train.py:1204] (2/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_576-76621-0_sp1.1 from training. Duration: 0.963625 2023-10-04 14:40:18,858 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.708e+02 2.051e+02 2.338e+02 2.944e+02 4.276e+02, threshold=4.676e+02, percent-clipped=0.0 2023-10-04 14:40:20,263 WARNING [train.py:1204] (2/4) Exclude cut with ID _13253_210_1925_1_1530757775819_3577873_253-246386-0 from training. Duration: 0.9 2023-10-04 14:40:21,193 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.1.encoder.layers.0.feed_forward1.out_whiten, num_groups=1, num_channels=256, metric=4.71 vs. limit=15.0 2023-10-04 14:40:24,939 WARNING [train.py:1204] (2/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_372-6869-0_sp1.1 from training. Duration: 0.4 2023-10-04 14:40:26,293 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_21434_210_4238_1_1531916339_1222252_22-54954-0_sp1.1 from training. Duration: 0.936375 2023-10-04 14:40:27,792 WARNING [train.py:1204] (2/4) Exclude cut with ID _29853_210_7497_1_1532505600851_6642110_492-170361-0_sp1.1 from training. Duration: 0.92725 2023-10-04 14:40:29,333 WARNING [train.py:1204] (2/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_505-97987-0_sp0.9 from training. Duration: 0.9555625 2023-10-04 14:40:30,607 WARNING [train.py:1204] (2/4) Exclude cut with ID _8365_210_2064_1_1528592311070_4677690_371-122295-0_sp0.9 from training. Duration: 0.611125 2023-10-04 14:40:30,728 WARNING [train.py:1204] (2/4) Exclude cut with ID _14268_210_9206_1_1530622771285_4714099_637-86630-0_sp1.1 from training. Duration: 0.463625 2023-10-04 14:40:30,739 WARNING [train.py:1204] (2/4) Exclude cut with ID _29857_210_20034_1_1532395886753_3798160_372-5678-0_sp1.1 from training. Duration: 0.963625 2023-10-04 14:40:32,232 WARNING [train.py:1204] (2/4) Exclude cut with ID _52892_210_6721_1_1534378941259_4124129_90-165108-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 14:40:33,621 WARNING [train.py:1204] (2/4) Exclude cut with ID _27052_210_5986_1_1532329197187_3585009_143-339198-0_sp1.1 from training. Duration: 0.963625 2023-10-04 14:40:39,100 WARNING [train.py:1204] (2/4) Exclude cut with ID _14268_210_9206_1_1530622771285_4714099_637-86630-0_sp0.9 from training. Duration: 0.5666875 2023-10-04 14:40:39,103 WARNING [train.py:1204] (2/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_401-255491-0 from training. Duration: 0.7 2023-10-04 14:40:40,585 WARNING [train.py:1204] (2/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_178-6946-0_sp1.1 from training. Duration: 0.97275 2023-10-04 14:40:40,821 INFO [scaling.py:1118] (2/4) WithLoss: name=encoder.encoders.3.encoder.layers.0.self_attn_weights, loss-sum=0.000e+00 2023-10-04 14:40:41,223 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.3.encoder.layers.3.conv_module1.whiten, num_groups=1, num_channels=512, metric=4.79 vs. limit=15.0 2023-10-04 14:40:47,372 WARNING [train.py:1204] (2/4) Exclude cut with ID _27485_210_4916_1_1532739191677_3668269_460-163885-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 14:40:53,215 WARNING [train.py:1204] (2/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_270-26053-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 14:40:55,939 WARNING [train.py:1204] (2/4) Exclude cut with ID _14170_210_5219_1_1530525949868_4469650_412-317077-0 from training. Duration: 0.75 2023-10-04 14:40:58,778 WARNING [train.py:1204] (2/4) Exclude cut with ID _5289_210_2084_1_1527678202736_3779299_142-103317-0 from training. Duration: 0.84 2023-10-04 14:40:58,834 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_150-143438-0_sp1.1 from training. Duration: 0.92725 2023-10-04 14:41:00,282 WARNING [train.py:1204] (2/4) Exclude cut with ID _27894_210_12392_1_1532415496078_6902439_668-269779-0_sp1.1 from training. Duration: 0.97275 2023-10-04 14:41:01,617 WARNING [train.py:1204] (2/4) Exclude cut with ID _18513_210_9206_1_1531618215223_3862620_56-222075-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 14:41:04,447 WARNING [train.py:1204] (2/4) Exclude cut with ID _4092_210_2151_1_1526643044449_3135759_356-278694-0 from training. Duration: 0.47 2023-10-04 14:41:04,673 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.feed_forward1.out_proj.dropout_p, batch_count=1690066.6666666667, ans=0.1 2023-10-04 14:41:07,340 WARNING [train.py:1204] (2/4) Exclude cut with ID _25808_210_13292_1_1532343514562_7366109_633-47524-0 from training. Duration: 0.99 2023-10-04 14:41:08,608 WARNING [train.py:1204] (2/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_872-264525-0 from training. Duration: 0.65 2023-10-04 14:41:08,646 WARNING [train.py:1204] (2/4) Exclude cut with ID _14632_210_9988_1_1530691279392_3923200_239-232545-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 14:41:10,024 WARNING [train.py:1204] (2/4) Exclude cut with ID _14349_210_8341_1_1530615710818_3442470_153-27432-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 14:41:14,181 INFO [train.py:1046] (2/4) Epoch 48, batch 3850, loss[loss=0.1601, simple_loss=0.2488, pruned_loss=0.03571, over 24688.00 frames. ], tot_loss[loss=0.1538, simple_loss=0.2344, pruned_loss=0.03661, over 4704975.82 frames. ], batch size: 73, lr: 2.11e-03, grad_scale: 8.0 2023-10-04 14:41:14,274 WARNING [train.py:1204] (2/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_37-41919-0_sp0.9 from training. Duration: 0.9666875 2023-10-04 14:41:14,341 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_196-289637-0_sp0.9 from training. Duration: 0.8333125 2023-10-04 14:41:19,656 WARNING [train.py:1204] (2/4) Exclude cut with ID _13678_210_7497_1_1530530069226_3463170_371-3221-0_sp1.1 from training. Duration: 0.8181875 2023-10-04 14:41:19,723 WARNING [train.py:1204] (2/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_557-324258-0 from training. Duration: 0.81 2023-10-04 14:41:21,053 WARNING [train.py:1204] (2/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_1012-279473-0_sp1.1 from training. Duration: 0.6090625 2023-10-04 14:41:21,132 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_66916_210_14656_1_1535725525_326566_7-92649-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 14:41:23,397 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.0.layers.1.whiten, num_groups=1, num_channels=192, metric=3.85 vs. limit=12.0 2023-10-04 14:41:25,561 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_9460_210_4211_1_1528954587_10049667_480-260269-0_sp1.1 from training. Duration: 0.5090625 2023-10-04 14:41:27,099 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_40896_210_15710_1_1533433956_7100802_121-155001-0_sp1.1 from training. Duration: 0.92725 2023-10-04 14:41:29,854 WARNING [train.py:1204] (2/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_188-171673-0_sp1.1 from training. Duration: 0.52725 2023-10-04 14:41:29,928 WARNING [train.py:1204] (2/4) Exclude cut with ID _47790_210_16187_1_1533862814066_4207519_2-39653-0 from training. Duration: 0.98 2023-10-04 14:41:35,565 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.ff2_skip_rate, batch_count=1690200.0, ans=0.0 2023-10-04 14:41:36,731 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_23850_210_4238_1_1532232597_1632433_112-229012-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 14:41:38,271 WARNING [train.py:1204] (2/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_626-79528-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 14:41:39,684 WARNING [train.py:1204] (2/4) Exclude cut with ID _30304_210_6319_1_1532476827261_7582060_856-90052-0_sp1.1 from training. Duration: 0.963625 2023-10-04 14:41:41,039 WARNING [train.py:1204] (2/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_122-109023-0_sp0.9 from training. Duration: 0.8444375 2023-10-04 14:41:43,781 WARNING [train.py:1204] (2/4) Exclude cut with ID _64414_210_17301_1_1535243252048_3757759_37-70328-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 14:41:45,547 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_20120_210_6753_1_1531643035_5482859_622-344907-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 14:41:46,859 WARNING [train.py:1204] (2/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_410-321956-0_sp1.1 from training. Duration: 0.92725 2023-10-04 14:41:46,871 WARNING [train.py:1204] (2/4) Exclude cut with ID _7562_210_2084_1_1528887767916_3946229_186-326422-0_sp0.9 from training. Duration: 0.9333125 2023-10-04 14:41:46,970 WARNING [train.py:1204] (2/4) Exclude cut with ID _37537_210_2253_1_1533171758568_3421949_127-285192-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 14:41:50,255 WARNING [train.py:1204] (2/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_723-228168-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 14:41:51,669 WARNING [train.py:1204] (2/4) Exclude cut with ID _13678_210_7497_1_1530530069226_3463170_426-246984-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 14:41:51,684 WARNING [train.py:1204] (2/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_399-156791-0_sp0.9 from training. Duration: 0.92225 2023-10-04 14:41:52,946 WARNING [train.py:1204] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_391-22148-0 from training. Duration: 0.77 2023-10-04 14:41:52,971 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_3475_210_2028_1_1526193578_3622569_64-297072-0 from training. Duration: 0.91 2023-10-04 14:41:53,040 WARNING [train.py:1204] (2/4) Exclude cut with ID _17875_210_2087_1_1531224012907_1821339_225-280015-0_sp1.1 from training. Duration: 0.963625 2023-10-04 14:41:53,067 WARNING [train.py:1204] (2/4) Exclude cut with ID _50655_210_18628_1_1534229942193_3772154_325-246169-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 14:41:56,404 WARNING [train.py:1204] (2/4) Exclude cut with ID _29529_210_7234_1_1532393961455_3977460_597-257348-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 14:41:56,449 WARNING [train.py:1204] (2/4) Exclude cut with ID _58895_210_8750_1_1535184081320_3649089_77-173485-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 14:41:56,478 WARNING [train.py:1204] (2/4) Exclude cut with ID _6270_210_2084_1_1527850911573_3856420_26-277343-0 from training. Duration: 0.82 2023-10-04 14:41:57,226 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.3.encoder.layers.0.feed_forward2.out_whiten, num_groups=1, num_channels=512, metric=3.52 vs. limit=15.0 2023-10-04 14:41:59,122 WARNING [train.py:1204] (2/4) Exclude cut with ID _14368_210_4220_1_1530532714870_3595529_144-3287-0 from training. Duration: 0.67 2023-10-04 14:42:00,503 WARNING [train.py:1204] (2/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_293-145278-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 14:42:01,820 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_40047_210_2290_1_1533290519_2374311_257-335162-0 from training. Duration: 0.95 2023-10-04 14:42:03,268 WARNING [train.py:1204] (2/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_619-344499-0_sp0.9 from training. Duration: 0.7 2023-10-04 14:42:04,853 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.conv_module1.balancer1.prob, batch_count=1690333.3333333333, ans=0.125 2023-10-04 14:42:06,853 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.2.encoder.layers.0.self_attn_weights.whiten_keys, num_groups=4, num_channels=128, metric=4.25 vs. limit=6.0 2023-10-04 14:42:08,740 WARNING [train.py:1204] (2/4) Exclude cut with ID _19504_210_9994_1_1531648863242_3325220_456-315379-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 14:42:08,829 WARNING [train.py:1204] (2/4) Exclude cut with ID _55857_210_2346_1_1534644007831_6898209_631-221692-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 14:42:13,189 WARNING [train.py:1204] (2/4) Exclude cut with ID _64631_210_27767_1_1535349580068_4165160_324-3682-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 14:42:15,151 WARNING [train.py:1204] (2/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_317-211107-0 from training. Duration: 0.92 2023-10-04 14:42:18,533 WARNING [train.py:1204] (2/4) Exclude cut with ID _6789_210_1534_1_1527857937218_3118950_311-61262-0 from training. Duration: 0.65 2023-10-04 14:42:21,346 WARNING [train.py:1204] (2/4) Exclude cut with ID _28323_210_16187_1_1532480395667_4100300_365-68882-0_sp1.1 from training. Duration: 0.92725 2023-10-04 14:42:21,389 WARNING [train.py:1204] (2/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_816-181825-0_sp1.1 from training. Duration: 0.92725 2023-10-04 14:42:24,110 WARNING [train.py:1204] (2/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_132-269633-0_sp1.1 from training. Duration: 0.6181875 2023-10-04 14:42:24,128 WARNING [train.py:1204] (2/4) Exclude cut with ID _44585_210_18628_1_1533605350387_4518199_280-73534-0_sp1.1 from training. Duration: 0.8090625 2023-10-04 14:42:24,188 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_68095_210_20185_1_1535718226_8888855_1009-164755-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 14:42:24,326 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=1690400.0, ans=0.1 2023-10-04 14:42:24,430 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=1690400.0, ans=0.1 2023-10-04 14:42:25,584 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_40223_210_6228_1_1533298404_4812267_400-103813-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 14:42:25,585 WARNING [train.py:1204] (2/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_491-261923-0_sp1.1 from training. Duration: 0.8909375 2023-10-04 14:42:25,598 WARNING [train.py:1204] (2/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_712-342563-0 from training. Duration: 0.97 2023-10-04 14:42:26,928 WARNING [train.py:1204] (2/4) Exclude cut with ID _41161_210_20034_1_1533431249657_3555314_175-190032-0_sp1.1 from training. Duration: 0.963625 2023-10-04 14:42:28,215 INFO [train.py:1046] (2/4) Epoch 48, batch 3900, loss[loss=0.1749, simple_loss=0.2608, pruned_loss=0.04447, over 24078.00 frames. ], tot_loss[loss=0.1529, simple_loss=0.2332, pruned_loss=0.03627, over 4705661.80 frames. ], batch size: 80, lr: 2.11e-03, grad_scale: 8.0 2023-10-04 14:42:28,337 WARNING [train.py:1204] (2/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_788-298907-0 from training. Duration: 0.89 2023-10-04 14:42:28,935 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.3.encoder.layers.0.self_attn1.whiten, num_groups=1, num_channels=512, metric=21.85 vs. limit=22.5 2023-10-04 14:42:30,380 WARNING [train.py:1204] (2/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_225-70961-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 14:42:30,387 WARNING [train.py:1204] (2/4) Exclude cut with ID _58583_210_5236_1_1534726835266_3574249_235-175994-0_sp1.1 from training. Duration: 0.92725 2023-10-04 14:42:31,694 WARNING [train.py:1204] (2/4) Exclude cut with ID _12302_210_5219_1_1529924446466_8107280_907-182949-0_sp1.1 from training. Duration: 0.836375 2023-10-04 14:42:31,734 WARNING [train.py:1204] (2/4) Exclude cut with ID _54871_210_22356_1_1534665798253_3856359_94-193692-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 14:42:33,103 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_4528_210_3273_1_1527076654_3276777_342-168068-0_sp0.9 from training. Duration: 0.9666875 2023-10-04 14:42:34,439 WARNING [train.py:1204] (2/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_368-74239-0_sp1.1 from training. Duration: 0.92725 2023-10-04 14:42:34,440 WARNING [train.py:1204] (2/4) Exclude cut with ID _44831_210_13427_1_1533816011549_3689035_291-295928-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 14:42:35,816 WARNING [train.py:1204] (2/4) Exclude cut with ID _56406_210_9288_1_1534899618664_3496149_455-131997-0_sp1.1 from training. Duration: 0.97275 2023-10-04 14:42:35,830 WARNING [train.py:1204] (2/4) Exclude cut with ID _6473_210_1741_1_1527901307396_3815380_108-17335-0 from training. Duration: 0.65 2023-10-04 14:42:35,870 WARNING [train.py:1204] (2/4) Exclude cut with ID _14224_210_4381_1_1530529062037_4264010_286-135600-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 14:42:40,193 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_14056_210_4163_1_1530604483_7572827_928-321675-0_sp1.1 from training. Duration: 0.936375 2023-10-04 14:42:40,315 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.0.conv_skip_rate, batch_count=1690466.6666666667, ans=0.0 2023-10-04 14:42:41,601 WARNING [train.py:1204] (2/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_81-300212-0_sp1.1 from training. Duration: 0.5454375 2023-10-04 14:42:41,640 WARNING [train.py:1204] (2/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_29-280622-0_sp0.9 from training. Duration: 0.911125 2023-10-04 14:42:42,967 WARNING [train.py:1204] (2/4) Exclude cut with ID _61079_210_4525_1_1534921008198_4011419_228-176447-0_sp1.1 from training. Duration: 0.936375 2023-10-04 14:42:44,505 WARNING [train.py:1204] (2/4) Exclude cut with ID _8394_210_6702_1_1528590504316_3540189_372-198878-0_sp1.1 from training. Duration: 0.5454375 2023-10-04 14:42:44,516 WARNING [train.py:1204] (2/4) Exclude cut with ID _64521_210_14699_1_1535243427259_3815328_244-208725-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 14:42:44,978 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.4.encoder.layers.2.whiten, num_groups=1, num_channels=384, metric=3.44 vs. limit=12.0 2023-10-04 14:42:45,738 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.750e+02 2.072e+02 2.246e+02 2.580e+02 4.191e+02, threshold=4.491e+02, percent-clipped=0.0 2023-10-04 14:42:47,215 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8275_210_4198_1_1528455320_3892571_491-17707-0_sp0.9 from training. Duration: 0.711125 2023-10-04 14:42:47,334 WARNING [train.py:1204] (2/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_375-245677-0 from training. Duration: 0.81 2023-10-04 14:42:47,348 WARNING [train.py:1204] (2/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_690-286055-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 14:42:48,664 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.1.encoder.layers.0.whiten, num_groups=1, num_channels=256, metric=7.20 vs. limit=12.0 2023-10-04 14:42:49,211 WARNING [train.py:1204] (2/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_89-321955-0 from training. Duration: 0.8 2023-10-04 14:42:50,527 WARNING [train.py:1204] (2/4) Exclude cut with ID _18895_210_6228_1_1531357274234_3784930_213-98721-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 14:42:51,812 WARNING [train.py:1204] (2/4) Exclude cut with ID _19708_210_9206_1_1532136650733_4238364_347-65240-0 from training. Duration: 0.97 2023-10-04 14:42:53,222 WARNING [train.py:1204] (2/4) Exclude cut with ID _64135_210_2253_1_1535677267876_7608972_498-5039-0 from training. Duration: 0.78 2023-10-04 14:42:53,483 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.conv_module1.balancer1.prob, batch_count=1690533.3333333333, ans=0.125 2023-10-04 14:42:54,804 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.feed_forward1.hidden_balancer.prob, batch_count=1690533.3333333333, ans=0.125 2023-10-04 14:42:57,615 WARNING [train.py:1204] (2/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_146-276944-0_sp1.1 from training. Duration: 0.7545625 2023-10-04 14:42:57,700 WARNING [train.py:1204] (2/4) Exclude cut with ID _63171_210_15488_1_1535104792794_3295110_155-34488-0_sp1.1 from training. Duration: 0.97275 2023-10-04 14:42:57,718 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_5438_210_1794_1_1527336687_2600009_233-58072-0_sp0.9 from training. Duration: 0.8555625 2023-10-04 14:42:59,012 WARNING [train.py:1204] (2/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_63-101996-0_sp0.9 from training. Duration: 0.97775 2023-10-04 14:43:03,477 WARNING [train.py:1204] (2/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_271-269084-0_sp1.1 from training. Duration: 0.7545625 2023-10-04 14:43:04,895 WARNING [train.py:1204] (2/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_345-167986-0_sp1.1 from training. Duration: 0.763625 2023-10-04 14:43:06,355 WARNING [train.py:1204] (2/4) Exclude cut with ID _39290_210_4220_1_1533862778760_3075118_92-311600-0_sp1.1 from training. Duration: 0.8 2023-10-04 14:43:06,361 WARNING [train.py:1204] (2/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_209-65306-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 14:43:07,076 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.2.encoder.layers.0.nonlin_attention.whiten2, num_groups=1, num_channels=384, metric=13.08 vs. limit=15.0 2023-10-04 14:43:07,737 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_26433_210_12108_1_1532157158_4056480_317-335807-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 14:43:13,523 WARNING [train.py:1204] (2/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_238-94127-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 14:43:13,571 WARNING [train.py:1204] (2/4) Exclude cut with ID _41314_210_12651_1_1533459476543_3766460_207-132038-0_sp1.1 from training. Duration: 0.8545625 2023-10-04 14:43:20,985 WARNING [train.py:1204] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_167-137255-0_sp1.1 from training. Duration: 0.5818125 2023-10-04 14:43:22,392 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_2009_210_2071_1_1525481134_4588937_385-302100-0_sp0.9 from training. Duration: 0.9666875 2023-10-04 14:43:26,916 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.conv_skip_rate, batch_count=1690733.3333333333, ans=0.0 2023-10-04 14:43:31,263 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_15517_210_6753_1_1530954008_3487336_253-110617-0_sp1.1 from training. Duration: 0.936375 2023-10-04 14:43:34,106 WARNING [train.py:1204] (2/4) Exclude cut with ID _39290_210_4220_1_1533862778760_3075118_92-311600-0_sp0.9 from training. Duration: 0.97775 2023-10-04 14:43:34,151 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_42-142473-0 from training. Duration: 0.72 2023-10-04 14:43:35,479 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_2296_210_2087_1_1525503448_3397335_57-185430-0 from training. Duration: 0.6 2023-10-04 14:43:35,493 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_11305_210_1794_1_1529750428_7178148_257-312870-0_sp0.9 from training. Duration: 0.97775 2023-10-04 14:43:35,608 WARNING [train.py:1204] (2/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_222-198127-0 from training. Duration: 0.84 2023-10-04 14:43:35,737 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.1.bypass_mid.scale_min, batch_count=1690733.3333333333, ans=0.2 2023-10-04 14:43:38,196 WARNING [train.py:1204] (2/4) Exclude cut with ID _65263_210_22356_1_1535763598691_3921037_423-202699-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 14:43:38,253 WARNING [train.py:1204] (2/4) Exclude cut with ID _15037_210_2253_1_1530752294104_3267650_13-130361-0 from training. Duration: 0.99 2023-10-04 14:43:41,087 INFO [train.py:1046] (2/4) Epoch 48, batch 3950, loss[loss=0.1526, simple_loss=0.2299, pruned_loss=0.03767, over 23739.00 frames. ], tot_loss[loss=0.1522, simple_loss=0.2322, pruned_loss=0.03607, over 4701173.35 frames. ], batch size: 212, lr: 2.11e-03, grad_scale: 8.0 2023-10-04 14:43:45,123 WARNING [train.py:1204] (2/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_628-198080-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 14:43:45,199 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_3171_210_2087_1_1526109084_2330007_37-64334-0 from training. Duration: 0.82 2023-10-04 14:43:47,016 WARNING [train.py:1204] (2/4) Exclude cut with ID _5287_210_2084_1_1527418883040_3878670_12-221186-0_sp1.1 from training. Duration: 0.8909375 2023-10-04 14:43:48,494 WARNING [train.py:1204] (2/4) Exclude cut with ID _6653_210_2629_1_1527919172441_6732679_1103-56445-0_sp1.1 from training. Duration: 0.663625 2023-10-04 14:43:50,340 WARNING [train.py:1204] (2/4) Exclude cut with ID _65263_210_22356_1_1535763598691_3921037_169-140068-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 14:43:57,840 WARNING [train.py:1204] (2/4) Exclude cut with ID IC0671W0118-331961 from training. Duration: 0.896 2023-10-04 14:43:59,180 WARNING [train.py:1204] (2/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_402-102311-0_sp1.1 from training. Duration: 0.6090625 2023-10-04 14:43:59,207 WARNING [train.py:1204] (2/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_202-293523-0 from training. Duration: 0.72 2023-10-04 14:43:59,262 WARNING [train.py:1204] (2/4) Exclude cut with ID IC0679W0038-335846 from training. Duration: 0.768 2023-10-04 14:43:59,284 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_37357_210_17024_1_1533089504_2316121_79-198866-0_sp1.1 from training. Duration: 0.936375 2023-10-04 14:44:02,567 WARNING [train.py:1204] (2/4) Exclude cut with ID _7606_210_4915_1_1528894253277_5080690_292-271285-0_sp1.1 from training. Duration: 0.963625 2023-10-04 14:44:02,578 WARNING [train.py:1204] (2/4) Exclude cut with ID _3541_210_2477_1_1526475408709_3263973_377-296839-0_sp0.9 from training. Duration: 0.611125 2023-10-04 14:44:03,947 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_45886_210_17024_1_1533704917_2435333_58-131069-0_sp1.1 from training. Duration: 0.963625 2023-10-04 14:44:05,434 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6961_210_2087_1_1527937168_6815011_395-185060-0 from training. Duration: 0.95 2023-10-04 14:44:08,218 WARNING [train.py:1204] (2/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_304-266479-0_sp1.1 from training. Duration: 0.97275 2023-10-04 14:44:09,554 WARNING [train.py:1204] (2/4) Exclude cut with ID _3867_210_1883_1_1526641200575_4475830_319-349850-0_sp1.1 from training. Duration: 0.6090625 2023-10-04 14:44:09,563 WARNING [train.py:1204] (2/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_304-59227-0_sp0.9 from training. Duration: 0.8555625 2023-10-04 14:44:09,635 WARNING [train.py:1204] (2/4) Exclude cut with ID _8021_210_7994_1_1528376451358_3653069_100-140231-0_sp0.9 from training. Duration: 0.7444375 2023-10-04 14:44:11,002 WARNING [train.py:1204] (2/4) Exclude cut with ID _8833_210_5219_1_1528592665210_7553059_329-125156-0_sp0.9 from training. Duration: 0.911125 2023-10-04 14:44:11,299 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.feed_forward3.hidden_balancer.prob, batch_count=1690933.3333333333, ans=0.125 2023-10-04 14:44:18,913 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.1.encoder.layers.0.self_attn_weights.whiten_keys, num_groups=4, num_channels=128, metric=3.79 vs. limit=6.0 2023-10-04 14:44:21,405 WARNING [train.py:1204] (2/4) Exclude cut with ID _14459_210_2228_1_1531443641946_7516700_792-287234-0_sp1.1 from training. Duration: 0.92725 2023-10-04 14:44:21,421 WARNING [train.py:1204] (2/4) Exclude cut with ID _19708_210_9206_1_1532136650733_4238364_347-65240-0_sp1.1 from training. Duration: 0.8818125 2023-10-04 14:44:26,127 WARNING [train.py:1204] (2/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_70-27732-0 from training. Duration: 0.94 2023-10-04 14:44:30,422 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_397-132565-0 from training. Duration: 0.99 2023-10-04 14:44:30,425 WARNING [train.py:1204] (2/4) Exclude cut with ID _49728_210_12651_1_1534041034294_6788150_624-195301-0 from training. Duration: 0.85 2023-10-04 14:44:31,646 WARNING [train.py:1204] (2/4) Exclude cut with ID _55429_210_10626_1_1534467588527_7174143_377-169391-0_sp1.1 from training. Duration: 0.87275 2023-10-04 14:44:31,756 WARNING [train.py:1204] (2/4) Exclude cut with ID _49376_210_5986_1_1533985278732_3370740_354-344695-0_sp1.1 from training. Duration: 0.9 2023-10-04 14:44:38,092 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=1691000.0, ans=0.1 2023-10-04 14:44:39,223 WARNING [train.py:1204] (2/4) Exclude cut with ID _4697_210_2069_1_1526815801953_3823098_276-96593-0_sp1.1 from training. Duration: 0.7 2023-10-04 14:44:40,501 WARNING [train.py:1204] (2/4) Exclude cut with ID _15012_210_1968_1_1530856820429_5095350_688-178849-0_sp0.9 from training. Duration: 0.711125 2023-10-04 14:44:40,545 WARNING [train.py:1204] (2/4) Exclude cut with ID _25565_210_12392_1_1532423009382_3952098_372-69023-0_sp1.1 from training. Duration: 0.936375 2023-10-04 14:44:40,580 WARNING [train.py:1204] (2/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_61-327062-0_sp1.1 from training. Duration: 0.736375 2023-10-04 14:44:40,608 WARNING [train.py:1204] (2/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_215-141059-0 from training. Duration: 0.62 2023-10-04 14:44:46,009 WARNING [train.py:1204] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_6-226750-0_sp1.1 from training. Duration: 0.8545625 2023-10-04 14:44:46,112 WARNING [train.py:1204] (2/4) Exclude cut with ID _14348_210_8341_1_1530599434996_3778640_240-232370-0_sp1.1 from training. Duration: 0.763625 2023-10-04 14:44:51,205 WARNING [train.py:1204] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_468-189058-0 from training. Duration: 0.96 2023-10-04 14:44:55,216 INFO [train.py:1046] (2/4) Epoch 48, batch 4000, loss[loss=0.1729, simple_loss=0.2486, pruned_loss=0.04856, over 23838.00 frames. ], tot_loss[loss=0.1522, simple_loss=0.2327, pruned_loss=0.03586, over 4719503.26 frames. ], batch size: 179, lr: 2.11e-03, grad_scale: 16.0 2023-10-04 14:44:59,520 WARNING [train.py:1204] (2/4) Exclude cut with ID _21987_210_5854_1_1531821500482_3637038_247-93340-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 14:45:01,094 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.conv_module1.balancer1.prob, batch_count=1691133.3333333333, ans=0.125 2023-10-04 14:45:05,106 WARNING [train.py:1204] (2/4) Exclude cut with ID _66868_210_9527_1_1535509287954_4561140_436-191602-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 14:45:09,946 WARNING [train.py:1204] (2/4) Exclude cut with ID _18895_210_6228_1_1531357274234_3784930_394-58203-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 14:45:09,998 WARNING [train.py:1204] (2/4) Exclude cut with ID _58605_210_12210_1_1534723195939_3904259_258-281498-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 14:45:10,024 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_33348_210_5632_1_1533086912_3324030_245-292752-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 14:45:11,317 WARNING [train.py:1204] (2/4) Exclude cut with ID _67831_210_12392_1_1535612464202_3092612_346-2148-0 from training. Duration: 0.93 2023-10-04 14:45:11,377 WARNING [train.py:1204] (2/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_369-222060-0_sp0.9 from training. Duration: 0.788875 2023-10-04 14:45:11,444 WARNING [train.py:1204] (2/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_245-174553-0 from training. Duration: 0.88 2023-10-04 14:45:11,449 WARNING [train.py:1204] (2/4) Exclude cut with ID _3541_210_2477_1_1526475408709_3263973_231-104736-0_sp1.1 from training. Duration: 0.7090625 2023-10-04 14:45:12,665 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.578e+02 2.159e+02 2.640e+02 3.092e+02 4.998e+02, threshold=5.279e+02, percent-clipped=1.0 2023-10-04 14:45:12,739 WARNING [train.py:1204] (2/4) Exclude cut with ID _44585_210_18628_1_1533605350387_4518199_280-73534-0 from training. Duration: 0.89 2023-10-04 14:45:15,416 WARNING [train.py:1204] (2/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_22-201476-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 14:45:17,907 WARNING [train.py:1204] (2/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_325-62802-0_sp0.9 from training. Duration: 0.9666875 2023-10-04 14:45:17,920 WARNING [train.py:1204] (2/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_199-274062-0_sp1.1 from training. Duration: 0.97275 2023-10-04 14:45:17,923 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_8-319957-0_sp1.1 from training. Duration: 0.87275 2023-10-04 14:45:17,954 WARNING [train.py:1204] (2/4) Exclude cut with ID _29343_210_4381_1_1532602757752_3674109_224-349726-0_sp1.1 from training. Duration: 0.936375 2023-10-04 14:45:18,994 WARNING [train.py:1204] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_520-126729-0_sp1.1 from training. Duration: 0.636375 2023-10-04 14:45:19,132 WARNING [train.py:1204] (2/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_239-22076-0_sp1.1 from training. Duration: 0.77275 2023-10-04 14:45:21,076 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9012W0011-501146 from training. Duration: 0.896 2023-10-04 14:45:21,577 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.4.encoder.layers.2.conv_module1.whiten, num_groups=1, num_channels=384, metric=3.20 vs. limit=15.0 2023-10-04 14:45:22,362 WARNING [train.py:1204] (2/4) Exclude cut with ID _29857_210_20034_1_1532395886753_3798160_279-185801-0_sp1.1 from training. Duration: 0.82725 2023-10-04 14:45:23,645 WARNING [train.py:1204] (2/4) Exclude cut with ID _43552_210_8913_1_1533726031059_3523429_261-223933-0_sp1.1 from training. Duration: 0.92725 2023-10-04 14:45:25,114 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9029W0025-509426 from training. Duration: 0.896 2023-10-04 14:45:26,408 WARNING [train.py:1204] (2/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_337-181502-0_sp1.1 from training. Duration: 0.5909375 2023-10-04 14:45:26,421 WARNING [train.py:1204] (2/4) Exclude cut with ID _68635_210_9317_1_1535770813410_4315870_159-332599-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 14:45:31,921 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_161-276996-0 from training. Duration: 0.95 2023-10-04 14:45:31,967 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7088_210_1874_1_1527939776_1101113_127-68870-0_sp1.1 from training. Duration: 0.936375 2023-10-04 14:45:33,726 WARNING [train.py:1204] (2/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_322-217220-0_sp1.1 from training. Duration: 0.7818125 2023-10-04 14:45:35,210 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9072W0002-530636 from training. Duration: 0.768 2023-10-04 14:45:36,558 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_340-336408-0_sp1.1 from training. Duration: 0.8454375 2023-10-04 14:45:37,877 WARNING [train.py:1204] (2/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_395-37222-0 from training. Duration: 0.76 2023-10-04 14:45:37,889 WARNING [train.py:1204] (2/4) Exclude cut with ID _64099_210_17301_1_1535526977656_1578188_159-209156-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 14:45:39,288 WARNING [train.py:1204] (2/4) Exclude cut with ID _53950_210_5816_1_1534417264919_3862951_28-29681-0_sp1.1 from training. Duration: 0.92725 2023-10-04 14:45:39,366 WARNING [train.py:1204] (2/4) Exclude cut with ID _4681_210_3525_1_1527250689464_120889_15-126760-0_sp1.1 from training. Duration: 0.736375 2023-10-04 14:45:42,135 WARNING [train.py:1204] (2/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_242-3472-0_sp1.1 from training. Duration: 0.663625 2023-10-04 14:45:43,522 WARNING [train.py:1204] (2/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_436-186311-0_sp0.9 from training. Duration: 0.72225 2023-10-04 14:45:43,552 WARNING [train.py:1204] (2/4) Exclude cut with ID _3675_210_4557_1_1526639934408_4182089_452-172147-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 14:45:46,328 WARNING [train.py:1204] (2/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_80-147301-0 from training. Duration: 0.84 2023-10-04 14:45:46,358 WARNING [train.py:1204] (2/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_351-107588-0_sp1.1 from training. Duration: 0.92725 2023-10-04 14:45:47,793 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9111W0002-550781 from training. Duration: 0.896 2023-10-04 14:45:52,523 WARNING [train.py:1204] (2/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_465-156500-0_sp1.1 from training. Duration: 0.6545625 2023-10-04 14:45:55,352 WARNING [train.py:1204] (2/4) Exclude cut with ID _13938_210_9988_1_1530518718978_3693960_304-112784-0_sp1.1 from training. Duration: 0.4181875 2023-10-04 14:45:58,179 WARNING [train.py:1204] (2/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_202-67017-0_sp1.1 from training. Duration: 0.7454375 2023-10-04 14:45:59,544 WARNING [train.py:1204] (2/4) Exclude cut with ID _58414_210_8254_1_1535421609162_7151182_448-4358-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 14:46:00,848 WARNING [train.py:1204] (2/4) Exclude cut with ID _66840_210_5219_1_1535452168426_4196349_203-243234-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 14:46:02,162 WARNING [train.py:1204] (2/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_201-213513-0_sp1.1 from training. Duration: 0.97275 2023-10-04 14:46:05,114 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6291_210_3273_1_1527683649_1078498_117-187560-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 14:46:06,666 WARNING [train.py:1204] (2/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_135-174080-0_sp0.9 from training. Duration: 0.588875 2023-10-04 14:46:06,712 WARNING [train.py:1204] (2/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_45-265684-0 from training. Duration: 0.72 2023-10-04 14:46:08,404 INFO [train.py:1046] (2/4) Epoch 48, batch 4050, loss[loss=0.1401, simple_loss=0.2206, pruned_loss=0.02985, over 23636.00 frames. ], tot_loss[loss=0.1529, simple_loss=0.2335, pruned_loss=0.0362, over 4711636.53 frames. ], batch size: 149, lr: 2.11e-03, grad_scale: 16.0 2023-10-04 14:46:08,568 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_435-313305-0_sp1.1 from training. Duration: 0.6454375 2023-10-04 14:46:09,830 WARNING [train.py:1204] (2/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_235-307337-0_sp1.1 from training. Duration: 0.8545625 2023-10-04 14:46:09,911 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7762_210_1794_1_1528199508_4057408_419-125009-0_sp0.9 from training. Duration: 0.92225 2023-10-04 14:46:11,277 WARNING [train.py:1204] (2/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_215-141059-0_sp1.1 from training. Duration: 0.563625 2023-10-04 14:46:12,541 WARNING [train.py:1204] (2/4) Exclude cut with ID _27013_210_1389_1_1532437173240_3252649_222-125573-0_sp1.1 from training. Duration: 0.8909375 2023-10-04 14:46:14,033 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.attention_skip_rate, batch_count=1691466.6666666667, ans=0.0 2023-10-04 14:46:16,469 WARNING [train.py:1204] (2/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_126-274694-0_sp1.1 from training. Duration: 0.8909375 2023-10-04 14:46:20,187 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_2592_210_2290_1_1525697025_4882925_157-70512-0_sp1.1 from training. Duration: 0.82725 2023-10-04 14:46:20,233 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_292-265928-0_sp0.9 from training. Duration: 0.5666875 2023-10-04 14:46:22,944 WARNING [train.py:1204] (2/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_1211-226066-0_sp1.1 from training. Duration: 0.6909375 2023-10-04 14:46:24,272 WARNING [train.py:1204] (2/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_394-265747-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 14:46:27,151 WARNING [train.py:1204] (2/4) Exclude cut with ID _48782_210_12658_1_1534467701544_2300422_239-67139-0_sp1.1 from training. Duration: 0.97275 2023-10-04 14:46:28,606 WARNING [train.py:1204] (2/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_329-113173-0_sp1.1 from training. Duration: 0.563625 2023-10-04 14:46:30,090 WARNING [train.py:1204] (2/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_81-75849-0_sp0.9 from training. Duration: 0.4333125 2023-10-04 14:46:31,392 WARNING [train.py:1204] (2/4) Exclude cut with ID _9235_210_5219_1_1528891154253_8046120_1003-101638-0 from training. Duration: 0.73 2023-10-04 14:46:32,745 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9309W0189-640316 from training. Duration: 0.64 2023-10-04 14:46:34,143 WARNING [train.py:1204] (2/4) Exclude cut with ID _45329_210_7005_1_1534157951211_6774339_503-287883-0_sp1.1 from training. Duration: 0.8 2023-10-04 14:46:40,252 WARNING [train.py:1204] (2/4) Exclude cut with ID _8833_210_5219_1_1528592665210_7553059_611-80313-0 from training. Duration: 0.64 2023-10-04 14:46:41,696 WARNING [train.py:1204] (2/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_335-106626-0_sp1.1 from training. Duration: 0.963625 2023-10-04 14:46:43,278 WARNING [train.py:1204] (2/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_237-44332-0_sp1.1 from training. Duration: 0.8545625 2023-10-04 14:46:45,385 WARNING [train.py:1204] (2/4) Exclude cut with ID _46952_210_5986_1_1534230010357_3107379_178-147341-0_sp1.1 from training. Duration: 0.97275 2023-10-04 14:46:46,735 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_13091_210_7925_1_1530609447_8987354_946-154426-0_sp1.1 from training. Duration: 0.936375 2023-10-04 14:46:46,759 WARNING [train.py:1204] (2/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_326-17061-0_sp1.1 from training. Duration: 0.8545625 2023-10-04 14:46:48,909 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=1691600.0, ans=0.1 2023-10-04 14:46:50,661 WARNING [train.py:1204] (2/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_38-169920-0_sp1.1 from training. Duration: 0.82725 2023-10-04 14:46:53,403 WARNING [train.py:1204] (2/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_283-220372-0 from training. Duration: 0.56 2023-10-04 14:46:53,411 WARNING [train.py:1204] (2/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_252-216674-0_sp1.1 from training. Duration: 0.4909375 2023-10-04 14:46:54,852 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_202-91290-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 14:46:56,225 WARNING [train.py:1204] (2/4) Exclude cut with ID _41314_210_12651_1_1533459476543_3766460_80-74184-0 from training. Duration: 0.83 2023-10-04 14:47:00,249 WARNING [train.py:1204] (2/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_411-271358-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 14:47:07,605 WARNING [train.py:1204] (2/4) Exclude cut with ID _68777_210_4353_1_1535810390731_3117904_129-166012-0 from training. Duration: 0.85 2023-10-04 14:47:08,926 WARNING [train.py:1204] (2/4) Exclude cut with ID _44982_210_1511_1_1534312820108_3726029_410-109759-0_sp1.1 from training. Duration: 0.963625 2023-10-04 14:47:08,931 WARNING [train.py:1204] (2/4) Exclude cut with ID _14224_210_4381_1_1530529062037_4264010_364-22817-0_sp1.1 from training. Duration: 0.6454375 2023-10-04 14:47:10,377 WARNING [train.py:1204] (2/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_89-242335-0 from training. Duration: 0.95 2023-10-04 14:47:10,386 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_12868_210_6753_1_1530702479_3610564_151-325626-0 from training. Duration: 0.97 2023-10-04 14:47:10,386 WARNING [train.py:1204] (2/4) Exclude cut with ID _6275_210_2084_1_1528110062213_7669049_786-264332-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 14:47:11,855 WARNING [train.py:1204] (2/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_35-153886-0_sp1.1 from training. Duration: 0.763625 2023-10-04 14:47:13,217 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_24007_210_1794_1_1531918925_1663875_116-100916-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 14:47:13,231 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_162-262717-0_sp1.1 from training. Duration: 0.7545625 2023-10-04 14:47:18,746 WARNING [train.py:1204] (2/4) Exclude cut with ID _8263_210_1664_1_1528438953494_294100_40-204731-0 from training. Duration: 0.5 2023-10-04 14:47:18,851 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_12868_210_6753_1_1530702479_3610564_416-293746-0 from training. Duration: 0.9 2023-10-04 14:47:21,719 WARNING [train.py:1204] (2/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_605-219732-0 from training. Duration: 0.94 2023-10-04 14:47:21,806 WARNING [train.py:1204] (2/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_454-123002-0 from training. Duration: 0.69 2023-10-04 14:47:21,819 WARNING [train.py:1204] (2/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_25-304273-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 14:47:23,030 INFO [train.py:1046] (2/4) Epoch 48, batch 4100, loss[loss=0.1991, simple_loss=0.2704, pruned_loss=0.0639, over 20022.00 frames. ], tot_loss[loss=0.1541, simple_loss=0.235, pruned_loss=0.03664, over 4719470.46 frames. ], batch size: 389, lr: 2.11e-03, grad_scale: 16.0 2023-10-04 14:47:23,102 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6860_210_4852_1_1527917457_3833327_451-39702-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 14:47:23,137 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_304-16505-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 14:47:23,150 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6291_210_3273_1_1527683649_1078498_4-266348-0_sp1.1 from training. Duration: 0.7090625 2023-10-04 14:47:23,224 WARNING [train.py:1204] (2/4) Exclude cut with ID ID0149W0001-753701 from training. Duration: 0.989 2023-10-04 14:47:25,973 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_41251_210_15368_1_1533429767_4820695_394-55393-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 14:47:27,385 WARNING [train.py:1204] (2/4) Exclude cut with ID _8793_210_6310_1_1528542985139_2310009_121-179782-0_sp1.1 from training. Duration: 0.72725 2023-10-04 14:47:27,407 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_2729_210_3528_1_1525863505_3882214_421-73047-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 14:47:28,741 WARNING [train.py:1204] (2/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_278-154492-0_sp0.9 from training. Duration: 0.8444375 2023-10-04 14:47:30,300 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.attention_skip_rate, batch_count=1691800.0, ans=0.0 2023-10-04 14:47:33,392 WARNING [train.py:1204] (2/4) Exclude cut with ID _14170_210_5219_1_1530525949868_4469650_412-317077-0_sp0.9 from training. Duration: 0.8333125 2023-10-04 14:47:33,464 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_727-138317-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 14:47:33,508 WARNING [train.py:1204] (2/4) Exclude cut with ID _25808_210_13292_1_1532343514562_7366109_345-335048-0_sp1.1 from training. Duration: 0.92725 2023-10-04 14:47:33,520 WARNING [train.py:1204] (2/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_506-23200-0 from training. Duration: 0.97 2023-10-04 14:47:36,289 WARNING [train.py:1204] (2/4) Exclude cut with ID _44718_210_5816_1_1534050051869_6469030_643-245187-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 14:47:36,293 WARNING [train.py:1204] (2/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_42-186162-0_sp1.1 from training. Duration: 0.836375 2023-10-04 14:47:36,313 WARNING [train.py:1204] (2/4) Exclude cut with ID _3231_210_2106_1_1526205864934_6784220_171-256004-0_sp1.1 from training. Duration: 0.7818125 2023-10-04 14:47:36,340 WARNING [train.py:1204] (2/4) Exclude cut with ID _25914_210_6400_1_1532141317058_644740_43-248478-0_sp1.1 from training. Duration: 0.936375 2023-10-04 14:47:36,381 WARNING [train.py:1204] (2/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_577-287150-0 from training. Duration: 0.82 2023-10-04 14:47:40,385 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.712e+02 2.089e+02 2.278e+02 2.511e+02 3.603e+02, threshold=4.556e+02, percent-clipped=0.0 2023-10-04 14:47:40,498 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_46015_210_7234_1_1533863319_3087837_376-276068-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 14:47:42,460 WARNING [train.py:1204] (2/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_51-200220-0 from training. Duration: 0.56 2023-10-04 14:47:43,833 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_452-96101-0_sp1.1 from training. Duration: 0.963625 2023-10-04 14:47:47,435 WARNING [train.py:1204] (2/4) Exclude cut with ID _13252_210_1925_1_1530671372047_3336909_263-168388-0_sp1.1 from training. Duration: 0.7818125 2023-10-04 14:47:47,436 WARNING [train.py:1204] (2/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_469-94673-0 from training. Duration: 0.79 2023-10-04 14:47:47,551 WARNING [train.py:1204] (2/4) Exclude cut with ID _14922_210_6228_1_1530707435273_3693310_303-228738-0_sp1.1 from training. Duration: 0.763625 2023-10-04 14:47:48,709 WARNING [train.py:1204] (2/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_262-250826-0_sp1.1 from training. Duration: 0.9 2023-10-04 14:47:48,737 WARNING [train.py:1204] (2/4) Exclude cut with ID _68777_210_4353_1_1535810390731_3117904_129-166012-0_sp1.1 from training. Duration: 0.77275 2023-10-04 14:47:51,483 WARNING [train.py:1204] (2/4) Exclude cut with ID _58895_210_8750_1_1535184081320_3649089_265-323810-0 from training. Duration: 0.96 2023-10-04 14:47:52,843 WARNING [train.py:1204] (2/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_120-76843-0_sp0.9 from training. Duration: 0.611125 2023-10-04 14:47:54,139 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7544_210_6753_1_1528587722_3576686_295-258479-0_sp0.9 from training. Duration: 0.9333125 2023-10-04 14:47:56,926 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7046_210_6753_1_1527895337_6920917_783-278109-0 from training. Duration: 0.77 2023-10-04 14:47:56,990 WARNING [train.py:1204] (2/4) Exclude cut with ID _42326_210_7770_1_1533814282531_3707269_113-308323-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 14:47:57,667 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.3.encoder.layers.0.nonlin_attention.whiten2, num_groups=1, num_channels=512, metric=5.71 vs. limit=15.0 2023-10-04 14:47:58,254 WARNING [train.py:1204] (2/4) Exclude cut with ID _7852_210_5219_1_1528286471316_8139400_242-105336-0_sp0.9 from training. Duration: 0.8 2023-10-04 14:48:00,981 WARNING [train.py:1204] (2/4) Exclude cut with ID _56235_210_10640_1_1534557335235_4161099_114-277905-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 14:48:05,134 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_13118_210_7925_1_1530755612_7608297_42-66683-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 14:48:05,263 INFO [scaling.py:1118] (2/4) WithLoss: name=encoder.encoders.0.layers.0.self_attn_weights, loss-sum=0.000e+00 2023-10-04 14:48:05,613 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.5.encoder.layers.1.nonlin_attention.whiten1, num_groups=1, num_channels=192, metric=2.70 vs. limit=10.0 2023-10-04 14:48:10,221 WARNING [train.py:1204] (2/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_901-79438-0_sp1.1 from training. Duration: 0.8545625 2023-10-04 14:48:11,489 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_30280_210_1881_1_1532872779_3660139_211-182194-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 14:48:19,138 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.conv_module2.balancer1.prob, batch_count=1692000.0, ans=0.125 2023-10-04 14:48:20,862 WARNING [train.py:1204] (2/4) Exclude cut with ID _14898_210_9206_1_1530766812377_4177190_621-45732-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 14:48:20,870 WARNING [train.py:1204] (2/4) Exclude cut with ID _35350_210_16040_1_1533040191183_4075299_224-133775-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 14:48:23,622 WARNING [train.py:1204] (2/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_147-293311-0_sp1.1 from training. Duration: 0.8545625 2023-10-04 14:48:26,350 WARNING [train.py:1204] (2/4) Exclude cut with ID _12302_210_5219_1_1529924446466_8107280_835-279754-0_sp1.1 from training. Duration: 0.8181875 2023-10-04 14:48:29,345 INFO [scaling.py:1118] (2/4) WithLoss: name=encoder.encoders.4.encoder.layers.0.self_attn_weights, loss-sum=0.000e+00 2023-10-04 14:48:30,421 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8453_210_4211_1_1528502940_8562730_347-330673-0_sp0.9 from training. Duration: 0.8 2023-10-04 14:48:31,787 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_213-103282-0_sp0.9 from training. Duration: 0.8666875 2023-10-04 14:48:31,860 WARNING [train.py:1204] (2/4) Exclude cut with ID _49728_210_12651_1_1534041034294_6788150_66-43496-0_sp1.1 from training. Duration: 0.97275 2023-10-04 14:48:31,862 WARNING [train.py:1204] (2/4) Exclude cut with ID _19023_210_4353_1_1531378892552_5820780_28-36615-0_sp1.1 from training. Duration: 0.8818125 2023-10-04 14:48:32,059 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.conv_module1.balancer2.prob, batch_count=1692066.6666666667, ans=0.125 2023-10-04 14:48:34,679 WARNING [train.py:1204] (2/4) Exclude cut with ID _14349_210_8341_1_1530615710818_3442470_115-285975-0 from training. Duration: 0.86 2023-10-04 14:48:35,905 INFO [train.py:1046] (2/4) Epoch 48, batch 4150, loss[loss=0.1532, simple_loss=0.2467, pruned_loss=0.02986, over 24659.00 frames. ], tot_loss[loss=0.1537, simple_loss=0.2348, pruned_loss=0.0363, over 4736115.23 frames. ], batch size: 73, lr: 2.11e-03, grad_scale: 8.0 2023-10-04 14:48:35,969 WARNING [train.py:1204] (2/4) Exclude cut with ID _12734_210_8341_1_1530183614296_3891929_236-47464-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 14:48:36,031 WARNING [train.py:1204] (2/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_177-236809-0 from training. Duration: 0.49 2023-10-04 14:48:37,420 WARNING [train.py:1204] (2/4) Exclude cut with ID _13678_210_7497_1_1530530069226_3463170_371-3221-0 from training. Duration: 0.9 2023-10-04 14:48:37,435 WARNING [train.py:1204] (2/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_48-338976-0 from training. Duration: 0.89 2023-10-04 14:48:37,629 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.bypass_mid.scale_min, batch_count=1692133.3333333333, ans=0.2 2023-10-04 14:48:39,288 WARNING [train.py:1204] (2/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_1167-309744-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 14:48:43,447 WARNING [train.py:1204] (2/4) Exclude cut with ID _30327_210_16049_1_1532484161295_3924169_401-70819-0_sp0.9 from training. Duration: 0.9555625 2023-10-04 14:48:43,456 WARNING [train.py:1204] (2/4) Exclude cut with ID _55134_210_21982_1_1534590028377_3882170_346-91022-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 14:48:48,078 WARNING [train.py:1204] (2/4) Exclude cut with ID _14459_210_2228_1_1531443641946_7516700_174-118801-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 14:48:49,368 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_269-278006-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 14:48:49,440 WARNING [train.py:1204] (2/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_390-339137-0_sp0.9 from training. Duration: 0.62225 2023-10-04 14:48:49,667 INFO [scaling.py:1118] (2/4) WithLoss: name=encoder.encoders.0.layers.1.self_attn_weights, loss-sum=0.000e+00 2023-10-04 14:48:50,920 WARNING [train.py:1204] (2/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_253-308377-0_sp1.1 from training. Duration: 0.4454375 2023-10-04 14:48:50,932 WARNING [train.py:1204] (2/4) Exclude cut with ID _23825_210_13449_1_1531915313526_3497150_240-271367-0_sp1.1 from training. Duration: 0.8818125 2023-10-04 14:48:52,738 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1015-343884-0_sp0.9 from training. Duration: 0.688875 2023-10-04 14:48:56,163 WARNING [train.py:1204] (2/4) Exclude cut with ID _23514_210_1389_1_1531832139785_3031439_335-48210-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 14:49:00,176 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_309-65177-0_sp1.1 from training. Duration: 0.8 2023-10-04 14:49:01,515 WARNING [train.py:1204] (2/4) Exclude cut with ID _14368_210_4220_1_1530532714870_3595529_231-137892-0 from training. Duration: 0.99 2023-10-04 14:49:03,018 WARNING [train.py:1204] (2/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_57-270037-0 from training. Duration: 0.77 2023-10-04 14:49:03,034 WARNING [train.py:1204] (2/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_465-214726-0_sp1.1 from training. Duration: 0.7818125 2023-10-04 14:49:04,364 WARNING [train.py:1204] (2/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_106-300985-0 from training. Duration: 0.9 2023-10-04 14:49:04,368 WARNING [train.py:1204] (2/4) Exclude cut with ID _14632_210_9988_1_1530691279392_3923200_161-129530-0_sp1.1 from training. Duration: 0.87275 2023-10-04 14:49:04,380 WARNING [train.py:1204] (2/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_514-88370-0_sp1.1 from training. Duration: 0.963625 2023-10-04 14:49:08,543 WARNING [train.py:1204] (2/4) Exclude cut with ID _56495_210_4234_1_1534748080334_4136219_305-334505-0_sp1.1 from training. Duration: 0.92725 2023-10-04 14:49:10,279 WARNING [train.py:1204] (2/4) Exclude cut with ID _42875_210_14856_1_1533704397487_4012433_61-264914-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 14:49:13,030 WARNING [train.py:1204] (2/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_91-148831-0 from training. Duration: 0.99 2023-10-04 14:49:13,405 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.bypass_mid.scale_min, batch_count=1692266.6666666667, ans=0.2 2023-10-04 14:49:15,052 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_435-313305-0_sp0.9 from training. Duration: 0.788875 2023-10-04 14:49:15,687 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.3.encoder.layers.2.nonlin_attention.whiten1, num_groups=1, num_channels=384, metric=3.62 vs. limit=10.0 2023-10-04 14:49:17,728 WARNING [train.py:1204] (2/4) Exclude cut with ID _16744_210_5986_1_1531724405384_3907567_242-111316-0_sp1.1 from training. Duration: 0.8454375 2023-10-04 14:49:17,790 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_257-20733-0 from training. Duration: 0.76 2023-10-04 14:49:17,832 WARNING [train.py:1204] (2/4) Exclude cut with ID _8064_210_5399_1_1528372299291_4282973_4-91037-0_sp1.1 from training. Duration: 0.8 2023-10-04 14:49:19,240 WARNING [train.py:1204] (2/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_3-276701-0 from training. Duration: 0.91 2023-10-04 14:49:22,115 WARNING [train.py:1204] (2/4) Exclude cut with ID _3992_210_1747_1_1526644816031_3731060_36-147855-0_sp1.1 from training. Duration: 0.7090625 2023-10-04 14:49:22,825 WARNING [train.py:1204] (2/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_1296-298671-0_sp1.1 from training. Duration: 0.963625 2023-10-04 14:49:24,254 WARNING [train.py:1204] (2/4) Exclude cut with ID _68965_210_1883_1_1535698962272_3497490_309-334593-0_sp1.1 from training. Duration: 0.92725 2023-10-04 14:49:25,539 WARNING [train.py:1204] (2/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_128-296581-0 from training. Duration: 0.74 2023-10-04 14:49:25,539 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_17613_210_2151_1_1531137569_2366160_170-299079-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 14:49:25,541 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6287_210_3273_1_1527509506_2653716_182-199887-0_sp1.1 from training. Duration: 0.463625 2023-10-04 14:49:28,365 WARNING [train.py:1204] (2/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_80-135200-0_sp1.1 from training. Duration: 0.7181875 2023-10-04 14:49:29,142 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.2.encoder.layers.0.feed_forward1.out_whiten, num_groups=1, num_channels=384, metric=7.24 vs. limit=15.0 2023-10-04 14:49:31,120 WARNING [train.py:1204] (2/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_34-183685-0 from training. Duration: 0.98 2023-10-04 14:49:31,145 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8278_210_2550_1_1528637099_4220413_418-210155-0_sp1.1 from training. Duration: 0.92725 2023-10-04 14:49:31,149 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8853_210_3273_1_1528633899_818968_51-187685-0_sp0.9 from training. Duration: 0.7666875 2023-10-04 14:49:31,164 WARNING [train.py:1204] (2/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_332-221057-0_sp1.1 from training. Duration: 0.6181875 2023-10-04 14:49:31,244 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_2221_210_1988_1_1525597051_4935541_415-296882-0 from training. Duration: 0.77 2023-10-04 14:49:31,266 WARNING [train.py:1204] (2/4) Exclude cut with ID _33640_210_2655_1_1533117605479_4855469_770-278185-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 14:49:32,548 WARNING [train.py:1204] (2/4) Exclude cut with ID _5287_210_2084_1_1527418883040_3878670_333-48508-0_sp1.1 from training. Duration: 0.5454375 2023-10-04 14:49:32,610 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_43802_210_2290_1_1533516203_8829504_142-276839-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 14:49:32,910 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.feed_forward2.hidden_balancer.prob, batch_count=1692333.3333333333, ans=0.125 2023-10-04 14:49:33,948 WARNING [train.py:1204] (2/4) Exclude cut with ID _28394_210_8913_1_1532430049518_3650118_126-139371-0_sp1.1 from training. Duration: 0.92725 2023-10-04 14:49:33,969 WARNING [train.py:1204] (2/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_76-203867-0 from training. Duration: 0.72 2023-10-04 14:49:34,007 WARNING [train.py:1204] (2/4) Exclude cut with ID _6194_210_1534_1_1527511826160_3941194_14-282876-0_sp0.9 from training. Duration: 0.788875 2023-10-04 14:49:34,188 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.ff2_skip_rate, batch_count=1692400.0, ans=0.0 2023-10-04 14:49:39,135 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.3.encoder.layers.0.feed_forward2.out_whiten, num_groups=1, num_channels=512, metric=4.37 vs. limit=15.0 2023-10-04 14:49:41,215 WARNING [train.py:1204] (2/4) Exclude cut with ID _35355_210_16040_1_1533212988977_4016229_74-190076-0_sp1.1 from training. Duration: 0.72725 2023-10-04 14:49:44,547 WARNING [train.py:1204] (2/4) Exclude cut with ID _33920_210_4220_1_1533257851500_3084399_198-334063-0 from training. Duration: 0.94 2023-10-04 14:49:45,912 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6291_210_3273_1_1527683649_1078498_4-266348-0_sp0.9 from training. Duration: 0.8666875 2023-10-04 14:49:46,137 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.balancer2.prob, batch_count=1692400.0, ans=0.125 2023-10-04 14:49:46,155 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.conv_module1.balancer2.min_abs, batch_count=1692400.0, ans=0.5 2023-10-04 14:49:48,710 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_23856_210_4238_1_1531994785_3087934_122-38075-0_sp1.1 from training. Duration: 0.936375 2023-10-04 14:49:50,109 INFO [train.py:1046] (2/4) Epoch 48, batch 4200, loss[loss=0.1388, simple_loss=0.2259, pruned_loss=0.02588, over 24313.00 frames. ], tot_loss[loss=0.1523, simple_loss=0.2328, pruned_loss=0.03593, over 4712751.53 frames. ], batch size: 61, lr: 2.11e-03, grad_scale: 8.0 2023-10-04 14:49:50,171 WARNING [train.py:1204] (2/4) Exclude cut with ID _4697_210_2069_1_1526815801953_3823098_276-96593-0_sp0.9 from training. Duration: 0.8555625 2023-10-04 14:49:50,224 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_308-278734-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 14:49:50,226 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_5190_210_4557_1_1527245078_2965646_353-250603-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 14:49:51,831 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.self_attn_weights.pos_emb_skip_rate, batch_count=1692466.6666666667, ans=0.0 2023-10-04 14:49:53,386 WARNING [train.py:1204] (2/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_472-216663-0 from training. Duration: 0.92 2023-10-04 14:49:54,956 WARNING [train.py:1204] (2/4) Exclude cut with ID _7537_210_2084_1_1528502395431_3924359_14-312906-0 from training. Duration: 0.84 2023-10-04 14:49:56,882 WARNING [train.py:1204] (2/4) Exclude cut with ID _29003_210_19596_1_1532507391720_3894098_559-236660-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 14:49:58,300 WARNING [train.py:1204] (2/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_312-204798-0_sp1.1 from training. Duration: 0.8454375 2023-10-04 14:50:01,155 WARNING [train.py:1204] (2/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_17-309070-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 14:50:03,913 WARNING [train.py:1204] (2/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_344-152526-0_sp0.9 from training. Duration: 0.588875 2023-10-04 14:50:06,817 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_41452_210_7291_1_1533437687_3941281_405-11559-0_sp1.1 from training. Duration: 0.82725 2023-10-04 14:50:06,840 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_48730_210_5399_1_1533885886_2212769_157-124052-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 14:50:06,897 WARNING [train.py:1204] (2/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_415-78792-0 from training. Duration: 0.81 2023-10-04 14:50:06,901 WARNING [train.py:1204] (2/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_345-340690-0_sp1.1 from training. Duration: 0.8454375 2023-10-04 14:50:08,212 WARNING [train.py:1204] (2/4) Exclude cut with ID _23825_210_13449_1_1531915313526_3497150_124-8138-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 14:50:08,254 WARNING [train.py:1204] (2/4) Exclude cut with ID _49258_210_7994_1_1534122017863_3738861_194-24864-0_sp1.1 from training. Duration: 0.936375 2023-10-04 14:50:08,271 WARNING [train.py:1204] (2/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_369-222060-0_sp1.1 from training. Duration: 0.6454375 2023-10-04 14:50:09,683 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.738e+02 2.019e+02 2.306e+02 2.735e+02 4.885e+02, threshold=4.613e+02, percent-clipped=1.0 2023-10-04 14:50:09,833 WARNING [train.py:1204] (2/4) Exclude cut with ID _12369_210_4381_1_1530356078581_4388520_480-13146-0_sp0.9 from training. Duration: 0.9444375 2023-10-04 14:50:10,577 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.3.encoder.layers.3.feed_forward3.out_whiten, num_groups=1, num_channels=512, metric=12.82 vs. limit=15.0 2023-10-04 14:50:11,651 WARNING [train.py:1204] (2/4) Exclude cut with ID _13253_210_1925_1_1530757775819_3577873_191-144977-0 from training. Duration: 0.78 2023-10-04 14:50:11,678 WARNING [train.py:1204] (2/4) Exclude cut with ID _30304_210_6319_1_1532476827261_7582060_1128-140266-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 14:50:16,167 WARNING [train.py:1204] (2/4) Exclude cut with ID _8772_210_1850_1_1528635595972_3664011_247-336448-0_sp0.9 from training. Duration: 0.82225 2023-10-04 14:50:17,487 WARNING [train.py:1204] (2/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_318-59097-0_sp1.1 from training. Duration: 0.6909375 2023-10-04 14:50:20,130 WARNING [train.py:1204] (2/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_540-131164-0_sp1.1 from training. Duration: 0.836375 2023-10-04 14:50:21,501 WARNING [train.py:1204] (2/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_39-110836-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 14:50:22,887 WARNING [train.py:1204] (2/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_860-96198-0_sp1.1 from training. Duration: 0.663625 2023-10-04 14:50:22,895 WARNING [train.py:1204] (2/4) Exclude cut with ID _15133_210_6228_1_1530794512074_3080379_29-264458-0 from training. Duration: 0.58 2023-10-04 14:50:22,923 WARNING [train.py:1204] (2/4) Exclude cut with ID _55209_210_1531_1_1534675828996_4597160_797-348343-0_sp1.1 from training. Duration: 0.963625 2023-10-04 14:50:24,897 WARNING [train.py:1204] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_23-189234-0_sp1.1 from training. Duration: 0.7545625 2023-10-04 14:50:29,012 WARNING [train.py:1204] (2/4) Exclude cut with ID _5410_210_1388_1_1527397055795_1719020_39-112221-0_sp1.1 from training. Duration: 0.62725 2023-10-04 14:50:31,732 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_43802_210_2290_1_1533516203_8829504_100-348441-0_sp1.1 from training. Duration: 0.82725 2023-10-04 14:50:36,354 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.5.encoder.layers.0.self_attn_weights.whiten_keys, num_groups=4, num_channels=128, metric=1.58 vs. limit=6.0 2023-10-04 14:50:39,990 WARNING [train.py:1204] (2/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_143-86344-0_sp1.1 from training. Duration: 0.7 2023-10-04 14:50:42,866 WARNING [train.py:1204] (2/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_126-29104-0 from training. Duration: 0.85 2023-10-04 14:50:46,608 WARNING [train.py:1204] (2/4) Exclude cut with ID _39846_210_12651_1_1533603651992_1318030_138-31771-0_sp1.1 from training. Duration: 0.963625 2023-10-04 14:50:52,142 WARNING [train.py:1204] (2/4) Exclude cut with ID _2220_210_1819_1_1525509109898_3658680_50-99837-0_sp1.1 from training. Duration: 0.4909375 2023-10-04 14:50:53,393 WARNING [train.py:1204] (2/4) Exclude cut with ID _6274_210_2084_1_1527937583837_7494020_836-210550-0_sp1.1 from training. Duration: 0.97275 2023-10-04 14:50:54,845 WARNING [train.py:1204] (2/4) Exclude cut with ID _28323_210_16187_1_1532480395667_4100300_306-74561-0 from training. Duration: 0.94 2023-10-04 14:50:59,301 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.0.layers.0.feed_forward2.out_whiten, num_groups=1, num_channels=192, metric=5.92 vs. limit=15.0 2023-10-04 14:51:01,011 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_177-58005-0_sp1.1 from training. Duration: 0.563625 2023-10-04 14:51:02,738 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=1692800.0, ans=0.1 2023-10-04 14:51:03,757 INFO [train.py:1046] (2/4) Epoch 48, batch 4250, loss[loss=0.1446, simple_loss=0.2176, pruned_loss=0.03581, over 23662.00 frames. ], tot_loss[loss=0.1515, simple_loss=0.2322, pruned_loss=0.03541, over 4724538.32 frames. ], batch size: 232, lr: 2.11e-03, grad_scale: 8.0 2023-10-04 14:51:03,867 WARNING [train.py:1204] (2/4) Exclude cut with ID _54871_210_22356_1_1534665798253_3856359_19-342632-0_sp1.1 from training. Duration: 0.82725 2023-10-04 14:51:03,900 WARNING [train.py:1204] (2/4) Exclude cut with ID _35355_210_16040_1_1533212988977_4016229_74-190076-0_sp0.9 from training. Duration: 0.888875 2023-10-04 14:51:06,775 WARNING [train.py:1204] (2/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_285-97786-0_sp1.1 from training. Duration: 0.97275 2023-10-04 14:51:10,868 WARNING [train.py:1204] (2/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_337-181502-0_sp0.9 from training. Duration: 0.72225 2023-10-04 14:51:11,110 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.conv_module2.balancer2.prob, batch_count=1692800.0, ans=0.125 2023-10-04 14:51:12,159 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_353-182855-0 from training. Duration: 0.96 2023-10-04 14:51:12,195 WARNING [train.py:1204] (2/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_409-327730-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 14:51:13,690 WARNING [train.py:1204] (2/4) Exclude cut with ID _8030_210_7497_1_1528444886781_3531089_373-19403-0_sp1.1 from training. Duration: 0.97275 2023-10-04 14:51:18,188 WARNING [train.py:1204] (2/4) Exclude cut with ID _6789_210_1534_1_1527857937218_3118950_231-340237-0_sp1.1 from training. Duration: 0.936375 2023-10-04 14:51:21,535 WARNING [train.py:1204] (2/4) Exclude cut with ID _6493_210_1747_1_1528459210424_4012470_536-57832-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 14:51:21,556 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7046_210_6753_1_1527895337_6920917_756-116002-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 14:51:24,415 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_37152_210_9697_1_1533106573_3844188_29-239070-0_sp1.1 from training. Duration: 0.8909375 2023-10-04 14:51:24,418 WARNING [train.py:1204] (2/4) Exclude cut with ID _19023_210_4353_1_1531378892552_5820780_61-334539-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 14:51:25,885 WARNING [train.py:1204] (2/4) Exclude cut with ID _14170_210_5219_1_1530525949868_4469650_159-28488-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 14:51:25,961 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_40896_210_15710_1_1533433956_7100802_764-71272-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 14:51:27,909 WARNING [train.py:1204] (2/4) Exclude cut with ID _44902_210_16187_1_1533888006062_3792078_258-178274-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 14:51:29,956 WARNING [train.py:1204] (2/4) Exclude cut with ID _7537_210_2084_1_1528502395431_3924359_14-312906-0_sp1.1 from training. Duration: 0.763625 2023-10-04 14:51:32,456 WARNING [train.py:1204] (2/4) Exclude cut with ID _49643_210_7989_1_1534640244224_3939031_674-116639-0_sp1.1 from training. Duration: 0.963625 2023-10-04 14:51:32,685 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.balancer_ff3.min_abs, batch_count=1692933.3333333333, ans=0.2 2023-10-04 14:51:33,886 WARNING [train.py:1204] (2/4) Exclude cut with ID _22826_210_9994_1_1531908201522_3279169_415-161-0 from training. Duration: 0.98 2023-10-04 14:51:37,907 WARNING [train.py:1204] (2/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_264-348896-0 from training. Duration: 0.85 2023-10-04 14:51:37,914 WARNING [train.py:1204] (2/4) Exclude cut with ID _51899_210_20034_1_1534129254466_3669220_233-161492-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 14:51:37,986 WARNING [train.py:1204] (2/4) Exclude cut with ID _31255_210_13427_1_1532865619495_3815143_233-200023-0_sp1.1 from training. Duration: 0.936375 2023-10-04 14:51:38,009 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_23850_210_4238_1_1532232597_1632433_100-229879-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 14:51:39,348 WARNING [train.py:1204] (2/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_375-245677-0_sp0.9 from training. Duration: 0.9 2023-10-04 14:51:39,351 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_40223_210_6228_1_1533298404_4812267_355-317415-0_sp1.1 from training. Duration: 0.97275 2023-10-04 14:51:40,743 WARNING [train.py:1204] (2/4) Exclude cut with ID _9679_210_7497_1_1529129212906_4363929_235-81488-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 14:51:42,316 WARNING [train.py:1204] (2/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_172-215932-0_sp1.1 from training. Duration: 0.6 2023-10-04 14:51:43,639 WARNING [train.py:1204] (2/4) Exclude cut with ID _12369_210_4381_1_1530356078581_4388520_64-298917-0_sp1.1 from training. Duration: 0.8 2023-10-04 14:51:47,016 WARNING [train.py:1204] (2/4) Exclude cut with ID _38020_210_5986_1_1533112252259_7006089_131-53533-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 14:51:48,425 WARNING [train.py:1204] (2/4) Exclude cut with ID _42334_210_7770_1_1534206610396_3605880_392-48154-0_sp1.1 from training. Duration: 0.963625 2023-10-04 14:51:50,215 WARNING [train.py:1204] (2/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_337-181502-0 from training. Duration: 0.65 2023-10-04 14:51:50,225 WARNING [train.py:1204] (2/4) Exclude cut with ID _6267_210_2084_1_1527764658526_3894730_375-219887-0_sp0.9 from training. Duration: 0.7444375 2023-10-04 14:51:51,576 WARNING [train.py:1204] (2/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_60-146189-0 from training. Duration: 0.98 2023-10-04 14:51:52,913 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_3133_210_2087_1_1526106446_2324690_243-143119-0_sp1.1 from training. Duration: 0.7 2023-10-04 14:51:54,345 WARNING [train.py:1204] (2/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_1267-272693-0_sp0.9 from training. Duration: 0.811125 2023-10-04 14:51:55,836 WARNING [train.py:1204] (2/4) Exclude cut with ID _69884_210_9771_1_1535846392818_4166169_83-182645-0_sp1.1 from training. Duration: 0.97275 2023-10-04 14:51:55,854 WARNING [train.py:1204] (2/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_403-292260-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 14:51:59,877 WARNING [train.py:1204] (2/4) Exclude cut with ID _55429_210_10626_1_1534467588527_7174143_377-169391-0 from training. Duration: 0.96 2023-10-04 14:52:01,282 WARNING [train.py:1204] (2/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_494-199538-0_sp1.1 from training. Duration: 0.6818125 2023-10-04 14:52:02,639 WARNING [train.py:1204] (2/4) Exclude cut with ID _3067_210_2667_1_1526212974640_4503168_119-175776-0_sp1.1 from training. Duration: 0.67275 2023-10-04 14:52:05,440 WARNING [train.py:1204] (2/4) Exclude cut with ID _3231_210_2106_1_1526205864934_6784220_183-303839-0_sp1.1 from training. Duration: 0.97275 2023-10-04 14:52:09,447 WARNING [train.py:1204] (2/4) Exclude cut with ID _18513_210_9206_1_1531618215223_3862620_165-38958-0_sp1.1 from training. Duration: 0.963625 2023-10-04 14:52:10,852 WARNING [train.py:1204] (2/4) Exclude cut with ID _41314_210_12651_1_1533459476543_3766460_87-272068-0_sp1.1 from training. Duration: 0.7545625 2023-10-04 14:52:12,257 WARNING [train.py:1204] (2/4) Exclude cut with ID _3541_210_2477_1_1526475408709_3263973_353-95-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 14:52:13,643 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_2009_210_2071_1_1525481134_4588937_73-302813-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 14:52:15,099 WARNING [train.py:1204] (2/4) Exclude cut with ID _64220_210_9527_1_1535250151772_4875259_88-95011-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 14:52:16,396 WARNING [train.py:1204] (2/4) Exclude cut with ID _44902_210_16187_1_1533888006062_3792078_160-12107-0_sp1.1 from training. Duration: 0.92725 2023-10-04 14:52:16,402 WARNING [train.py:1204] (2/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_547-5424-0 from training. Duration: 0.94 2023-10-04 14:52:17,682 INFO [train.py:1046] (2/4) Epoch 48, batch 4300, loss[loss=0.1564, simple_loss=0.2378, pruned_loss=0.03749, over 23980.00 frames. ], tot_loss[loss=0.1516, simple_loss=0.2322, pruned_loss=0.03548, over 4725000.55 frames. ], batch size: 80, lr: 2.11e-03, grad_scale: 8.0 2023-10-04 14:52:17,799 WARNING [train.py:1204] (2/4) Exclude cut with ID _21783_210_11357_1_1531742458730_3762679_268-85656-0_sp1.1 from training. Duration: 0.936375 2023-10-04 14:52:21,223 WARNING [train.py:1204] (2/4) Exclude cut with ID _66210_210_12651_1_1535446812922_3365850_42-284985-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 14:52:21,353 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.1.conv_module2.balancer1.prob, batch_count=1693133.3333333333, ans=0.125 2023-10-04 14:52:22,606 WARNING [train.py:1204] (2/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_465-214726-0_sp0.9 from training. Duration: 0.9555625 2023-10-04 14:52:24,760 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.bypass_mid.scale_min, batch_count=1693133.3333333333, ans=0.2 2023-10-04 14:52:27,217 WARNING [train.py:1204] (2/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_50-95770-0_sp1.1 from training. Duration: 0.936375 2023-10-04 14:52:33,826 WARNING [train.py:1204] (2/4) Exclude cut with ID _30800_210_22382_1_1532499347904_4515959_269-77567-0_sp1.1 from training. Duration: 0.963625 2023-10-04 14:52:33,828 WARNING [train.py:1204] (2/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_7-111954-0 from training. Duration: 0.56 2023-10-04 14:52:35,245 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_43802_210_2290_1_1533516203_8829504_85-195240-0_sp0.9 from training. Duration: 0.9666875 2023-10-04 14:52:35,391 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_2822_210_2715_1_1526112586_1999587_165-36656-0_sp0.9 from training. Duration: 0.988875 2023-10-04 14:52:36,701 WARNING [train.py:1204] (2/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_181-78632-0_sp1.1 from training. Duration: 0.7090625 2023-10-04 14:52:36,725 WARNING [train.py:1204] (2/4) Exclude cut with ID IC0671W0044-331887 from training. Duration: 0.896 2023-10-04 14:52:37,908 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.751e+02 2.092e+02 2.342e+02 2.811e+02 4.039e+02, threshold=4.683e+02, percent-clipped=0.0 2023-10-04 14:52:40,706 WARNING [train.py:1204] (2/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_266-137104-0_sp1.1 from training. Duration: 0.5909375 2023-10-04 14:52:42,011 WARNING [train.py:1204] (2/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_137-116442-0_sp0.9 from training. Duration: 0.8666875 2023-10-04 14:52:44,772 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_23801_210_16223_1_1532066392_3294657_570-5439-0 from training. Duration: 0.97 2023-10-04 14:52:44,797 WARNING [train.py:1204] (2/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_29-280622-0_sp1.1 from training. Duration: 0.7454375 2023-10-04 14:52:44,961 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.conv_module2.balancer1.prob, batch_count=1693200.0, ans=0.125 2023-10-04 14:52:46,136 WARNING [train.py:1204] (2/4) Exclude cut with ID 3557-8342-0013-54691-0 from training. Duration: 0.92 2023-10-04 14:52:49,430 WARNING [train.py:1204] (2/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_711-15688-0_sp1.1 from training. Duration: 0.3909375 2023-10-04 14:52:49,551 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_15126_210_6753_1_1530778677_2591185_120-277707-0_sp1.1 from training. Duration: 0.72725 2023-10-04 14:52:52,312 WARNING [train.py:1204] (2/4) Exclude cut with ID _14970_210_15709_1_1530775547361_1966965_8-169924-0_sp1.1 from training. Duration: 0.836375 2023-10-04 14:52:52,314 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_4692_210_3528_1_1526899755_4323254_107-44401-0_sp0.9 from training. Duration: 0.9555625 2023-10-04 14:52:53,704 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_3475_210_2028_1_1526193578_3622569_265-276625-0_sp0.9 from training. Duration: 0.8444375 2023-10-04 14:52:53,817 WARNING [train.py:1204] (2/4) Exclude cut with ID _55153_210_2253_1_1534384786677_3162096_148-79022-0_sp1.1 from training. Duration: 0.87275 2023-10-04 14:52:56,825 WARNING [train.py:1204] (2/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_292-36336-0_sp1.1 from training. Duration: 0.92725 2023-10-04 14:52:56,841 WARNING [train.py:1204] (2/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_329-82235-0 from training. Duration: 0.82 2023-10-04 14:52:58,140 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_11305_210_1794_1_1529750428_7178148_257-312870-0 from training. Duration: 0.88 2023-10-04 14:52:58,437 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.feed_forward2.hidden_balancer.prob, batch_count=1693266.6666666667, ans=0.125 2023-10-04 14:52:59,606 WARNING [train.py:1204] (2/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_436-4493-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 14:53:03,066 WARNING [train.py:1204] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_321-54129-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 14:53:03,080 WARNING [train.py:1204] (2/4) Exclude cut with ID _5287_210_2084_1_1527418883040_3878670_333-48508-0_sp0.9 from training. Duration: 0.6666875 2023-10-04 14:53:03,094 WARNING [train.py:1204] (2/4) Exclude cut with ID _26528_210_9070_1_1532570410842_3851310_244-278158-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 14:53:03,123 WARNING [train.py:1204] (2/4) Exclude cut with ID _46952_210_5986_1_1534230010357_3107379_232-344503-0_sp1.1 from training. Duration: 0.87275 2023-10-04 14:53:03,132 WARNING [train.py:1204] (2/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_43-206757-0 from training. Duration: 0.75 2023-10-04 14:53:03,134 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_4183_210_2087_1_1526729100_1869838_141-166600-0 from training. Duration: 0.95 2023-10-04 14:53:03,186 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_296-287678-0 from training. Duration: 0.95 2023-10-04 14:53:03,352 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.balancer1.prob, batch_count=1693333.3333333333, ans=0.125 2023-10-04 14:53:04,560 WARNING [train.py:1204] (2/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_265-60343-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 14:53:05,816 WARNING [train.py:1204] (2/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_332-256168-0 from training. Duration: 0.75 2023-10-04 14:53:05,861 WARNING [train.py:1204] (2/4) Exclude cut with ID _13679_210_7497_1_1530534293662_3355098_411-186088-0 from training. Duration: 0.94 2023-10-04 14:53:09,966 WARNING [train.py:1204] (2/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_458-126779-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 14:53:11,404 WARNING [train.py:1204] (2/4) Exclude cut with ID IC0803W0380-397572 from training. Duration: 0.032 2023-10-04 14:53:11,488 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_278-272612-0_sp1.1 from training. Duration: 0.663625 2023-10-04 14:53:12,911 WARNING [train.py:1204] (2/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_491-263101-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 14:53:12,915 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_53555_210_5632_1_1534595220_7232297_334-336778-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 14:53:16,169 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_50-3289-0 from training. Duration: 0.91 2023-10-04 14:53:16,243 WARNING [train.py:1204] (2/4) Exclude cut with ID _8699_210_2069_1_1528630346831_3960409_155-39766-0_sp0.9 from training. Duration: 0.8666875 2023-10-04 14:53:16,247 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_502-82145-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 14:53:17,486 WARNING [train.py:1204] (2/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_550-43132-0_sp1.1 from training. Duration: 0.97275 2023-10-04 14:53:17,518 WARNING [train.py:1204] (2/4) Exclude cut with ID _14998_210_5219_1_1530705582080_7474970_446-112117-0_sp1.1 from training. Duration: 0.8181875 2023-10-04 14:53:17,579 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_3691_210_3528_1_1526468169_4062053_355-334432-0_sp1.1 from training. Duration: 0.7909375 2023-10-04 14:53:17,783 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.balancer2.prob, batch_count=1693400.0, ans=0.125 2023-10-04 14:53:21,608 WARNING [train.py:1204] (2/4) Exclude cut with ID _58605_210_12210_1_1534723195939_3904259_239-94720-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 14:53:24,339 WARNING [train.py:1204] (2/4) Exclude cut with ID _49376_210_5986_1_1533985278732_3370740_391-344072-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 14:53:25,692 WARNING [train.py:1204] (2/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_558-322014-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 14:53:25,736 WARNING [train.py:1204] (2/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_256-32662-0_sp1.1 from training. Duration: 0.8181875 2023-10-04 14:53:30,325 WARNING [train.py:1204] (2/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_572-185150-0 from training. Duration: 0.97 2023-10-04 14:53:30,363 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_89-35748-0_sp0.9 from training. Duration: 0.87775 2023-10-04 14:53:31,219 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.self_attn_weights.pos_emb_skip_rate, batch_count=1693466.6666666667, ans=0.0 2023-10-04 14:53:31,235 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.balancer2.prob, batch_count=1693466.6666666667, ans=0.125 2023-10-04 14:53:32,111 INFO [train.py:1046] (2/4) Epoch 48, batch 4350, loss[loss=0.1637, simple_loss=0.2392, pruned_loss=0.0441, over 22802.00 frames. ], tot_loss[loss=0.1518, simple_loss=0.2325, pruned_loss=0.03558, over 4721915.64 frames. ], batch size: 322, lr: 2.11e-03, grad_scale: 8.0 2023-10-04 14:53:32,453 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.feed_forward3.hidden_balancer.prob, batch_count=1693466.6666666667, ans=0.125 2023-10-04 14:53:33,760 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=1693466.6666666667, ans=0.1 2023-10-04 14:53:34,892 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6291_210_3273_1_1527683649_1078498_88-66747-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 14:53:37,564 WARNING [train.py:1204] (2/4) Exclude cut with ID _19504_210_9994_1_1531648863242_3325220_357-237029-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 14:53:40,364 WARNING [train.py:1204] (2/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_302-2550-0_sp1.1 from training. Duration: 0.736375 2023-10-04 14:53:40,364 WARNING [train.py:1204] (2/4) Exclude cut with ID _9461_210_4557_1_1529060514726_3677451_59-194017-0_sp1.1 from training. Duration: 0.963625 2023-10-04 14:53:44,551 WARNING [train.py:1204] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_665-196155-0_sp1.1 from training. Duration: 0.6090625 2023-10-04 14:53:46,606 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.2.encoder.layers.2.self_attn1.whiten, num_groups=1, num_channels=384, metric=16.66 vs. limit=22.5 2023-10-04 14:53:49,150 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_326-315007-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 14:53:50,657 WARNING [train.py:1204] (2/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_105-328174-0_sp1.1 from training. Duration: 0.6909375 2023-10-04 14:53:50,666 WARNING [train.py:1204] (2/4) Exclude cut with ID _56494_210_4220_1_1535284837625_3033169_106-263722-0_sp1.1 from training. Duration: 0.97275 2023-10-04 14:53:53,445 WARNING [train.py:1204] (2/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_770-200430-0_sp0.9 from training. Duration: 0.988875 2023-10-04 14:53:54,841 WARNING [train.py:1204] (2/4) Exclude cut with ID _65640_210_6310_1_1535368201445_3173250_209-13008-0_sp1.1 from training. Duration: 0.9 2023-10-04 14:53:56,411 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.feed_forward1.hidden_balancer.prob, batch_count=1693533.3333333333, ans=0.125 2023-10-04 14:53:57,600 WARNING [train.py:1204] (2/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_212-149947-0_sp0.9 from training. Duration: 0.811125 2023-10-04 14:54:03,881 WARNING [train.py:1204] (2/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_188-171673-0 from training. Duration: 0.58 2023-10-04 14:54:05,851 WARNING [train.py:1204] (2/4) Exclude cut with ID _58895_210_8750_1_1535184081320_3649089_286-281724-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 14:54:07,185 WARNING [train.py:1204] (2/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_16-254909-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 14:54:08,829 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.conv_module2.balancer2.prob, batch_count=1693600.0, ans=0.125 2023-10-04 14:54:10,065 WARNING [train.py:1204] (2/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_52-95655-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 14:54:12,784 WARNING [train.py:1204] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_554-45617-0 from training. Duration: 0.99 2023-10-04 14:54:15,503 WARNING [train.py:1204] (2/4) Exclude cut with ID _14683_210_5219_1_1530698352608_3974859_473-344611-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 14:54:16,928 WARNING [train.py:1204] (2/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_17-25045-0_sp0.9 from training. Duration: 0.6555625 2023-10-04 14:54:19,737 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9072W0003-530637 from training. Duration: 0.768 2023-10-04 14:54:22,963 WARNING [train.py:1204] (2/4) Exclude cut with ID _38670_210_15379_1_1533448810659_4391059_76-74025-0_sp1.1 from training. Duration: 0.936375 2023-10-04 14:54:23,014 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6291_210_3273_1_1527683649_1078498_74-292608-0_sp0.9 from training. Duration: 0.8 2023-10-04 14:54:24,354 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9084W0025-536862 from training. Duration: 0.896 2023-10-04 14:54:25,657 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9087W0021-538392 from training. Duration: 0.768 2023-10-04 14:54:25,662 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7543_210_6753_1_1528543323_645515_56-163234-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 14:54:25,697 WARNING [train.py:1204] (2/4) Exclude cut with ID _45560_210_21982_1_1533812603951_3584999_44-332643-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 14:54:26,978 WARNING [train.py:1204] (2/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_1285-256622-0_sp1.1 from training. Duration: 0.763625 2023-10-04 14:54:28,292 WARNING [train.py:1204] (2/4) Exclude cut with ID _52781_210_16187_1_1534297729099_1118149_141-325632-0_sp1.1 from training. Duration: 0.936375 2023-10-04 14:54:29,723 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_40896_210_15710_1_1533433956_7100802_1051-137631-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 14:54:29,778 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_13091_210_7925_1_1530609447_8987354_233-18333-0_sp1.1 from training. Duration: 0.97275 2023-10-04 14:54:31,846 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_34008_210_10431_1_1533345424_8183247_662-317331-0 from training. Duration: 0.94 2023-10-04 14:54:31,859 WARNING [train.py:1204] (2/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_291-248920-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 14:54:31,861 WARNING [train.py:1204] (2/4) Exclude cut with ID _65263_210_22356_1_1535763598691_3921037_274-288481-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 14:54:33,132 WARNING [train.py:1204] (2/4) Exclude cut with ID _5189_210_4557_1_1527248319428_3769229_233-168080-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 14:54:33,194 WARNING [train.py:1204] (2/4) Exclude cut with ID _54871_210_22356_1_1534665798253_3856359_135-184281-0 from training. Duration: 0.99 2023-10-04 14:54:34,655 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9123W0012-555972 from training. Duration: 0.896 2023-10-04 14:54:34,661 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9123W0118-556077 from training. Duration: 0.896 2023-10-04 14:54:34,676 WARNING [train.py:1204] (2/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_406-275649-0 from training. Duration: 0.42 2023-10-04 14:54:34,884 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.nonlin_attention.balancer.prob, batch_count=1693733.3333333333, ans=0.125 2023-10-04 14:54:39,193 WARNING [train.py:1204] (2/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_309-13684-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 14:54:39,226 WARNING [train.py:1204] (2/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_129-337802-0_sp1.1 from training. Duration: 0.6545625 2023-10-04 14:54:39,250 WARNING [train.py:1204] (2/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_649-266825-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 14:54:40,499 WARNING [train.py:1204] (2/4) Exclude cut with ID _40519_210_13794_1_1533715228655_7337390_895-224742-0_sp1.1 from training. Duration: 0.92725 2023-10-04 14:54:41,943 WARNING [train.py:1204] (2/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_234-14174-0 from training. Duration: 0.82 2023-10-04 14:54:43,431 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9156W0009-572502 from training. Duration: 0.768 2023-10-04 14:54:43,444 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_424-245529-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 14:54:44,716 INFO [train.py:1046] (2/4) Epoch 48, batch 4400, loss[loss=0.1363, simple_loss=0.2144, pruned_loss=0.02905, over 24630.00 frames. ], tot_loss[loss=0.153, simple_loss=0.2338, pruned_loss=0.03614, over 4720226.34 frames. ], batch size: 60, lr: 2.11e-03, grad_scale: 16.0 2023-10-04 14:54:46,195 WARNING [train.py:1204] (2/4) Exclude cut with ID _26528_210_9070_1_1532570410842_3851310_232-10149-0_sp1.1 from training. Duration: 0.963625 2023-10-04 14:54:46,211 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_17292_210_6753_1_1531125388_2943734_152-30410-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 14:54:47,605 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_35441_210_16554_1_1533347572_4092680_291-82075-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 14:54:50,314 WARNING [train.py:1204] (2/4) Exclude cut with ID _12369_210_4381_1_1530356078581_4388520_64-298917-0 from training. Duration: 0.88 2023-10-04 14:54:50,345 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_5618_210_1874_1_1527311008_5851_1-214501-0_sp0.9 from training. Duration: 0.7655625 2023-10-04 14:54:51,720 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_9460_210_4211_1_1528954587_10049667_572-37035-0 from training. Duration: 0.8 2023-10-04 14:54:51,746 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9193W0023-590322 from training. Duration: 0.896 2023-10-04 14:54:53,587 WARNING [train.py:1204] (2/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_206-10097-0_sp1.1 from training. Duration: 0.4454375 2023-10-04 14:54:53,605 WARNING [train.py:1204] (2/4) Exclude cut with ID _55134_210_21982_1_1534590028377_3882170_17-51731-0_sp1.1 from training. Duration: 0.963625 2023-10-04 14:54:56,299 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_14953_210_2151_1_1530701710_5616165_710-165400-0 from training. Duration: 0.87 2023-10-04 14:54:58,982 WARNING [train.py:1204] (2/4) Exclude cut with ID _53203_210_2087_1_1534246218309_475590_61-85777-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 14:54:59,070 WARNING [train.py:1204] (2/4) Exclude cut with ID _55844_210_2346_1_1534471212569_7625707_1050-10453-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 14:54:59,086 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9218W0013-603297 from training. Duration: 0.896 2023-10-04 14:55:01,992 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.conv_module2.balancer2.min_abs, batch_count=1693866.6666666667, ans=0.5 2023-10-04 14:55:03,067 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.718e+02 2.088e+02 2.355e+02 2.772e+02 4.860e+02, threshold=4.710e+02, percent-clipped=1.0 2023-10-04 14:55:03,213 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_20120_210_6753_1_1531643035_5482859_267-102818-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 14:55:03,213 WARNING [train.py:1204] (2/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_191-41023-0 from training. Duration: 0.71 2023-10-04 14:55:05,666 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9231W0202-610227 from training. Duration: 0.896 2023-10-04 14:55:08,560 WARNING [train.py:1204] (2/4) Exclude cut with ID _6537_210_1883_1_1527987559710_3597129_382-133062-0 from training. Duration: 0.68 2023-10-04 14:55:08,603 WARNING [train.py:1204] (2/4) Exclude cut with ID _28427_210_4220_1_1532581240967_3061069_43-84928-0 from training. Duration: 0.99 2023-10-04 14:55:09,916 WARNING [train.py:1204] (2/4) Exclude cut with ID _59256_210_5986_1_1535076085997_3589120_121-237386-0 from training. Duration: 0.96 2023-10-04 14:55:09,939 WARNING [train.py:1204] (2/4) Exclude cut with ID _8480_210_1874_1_1528589391205_6728330_137-256986-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 14:55:10,040 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_9417_210_1794_1_1529235261_4614131_479-346963-0_sp1.1 from training. Duration: 0.936375 2023-10-04 14:55:10,727 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.3.encoder.layers.0.conv_module1.whiten, num_groups=1, num_channels=512, metric=3.83 vs. limit=15.0 2023-10-04 14:55:11,310 WARNING [train.py:1204] (2/4) Exclude cut with ID _26474_210_9570_1_1532520022699_3623380_267-7362-0_sp1.1 from training. Duration: 0.936375 2023-10-04 14:55:11,376 WARNING [train.py:1204] (2/4) Exclude cut with ID _55328_210_27767_1_1534580973260_4088170_287-61272-0_sp1.1 from training. Duration: 0.97275 2023-10-04 14:55:12,800 WARNING [train.py:1204] (2/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_589-9070-0 from training. Duration: 0.71 2023-10-04 14:55:12,808 WARNING [train.py:1204] (2/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_148-158509-0_sp1.1 from training. Duration: 0.37275 2023-10-04 14:55:14,192 WARNING [train.py:1204] (2/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_164-184548-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 14:55:14,420 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.balancer1.prob, batch_count=1693933.3333333333, ans=0.125 2023-10-04 14:55:17,035 WARNING [train.py:1204] (2/4) Exclude cut with ID _14970_210_15709_1_1530775547361_1966965_6-23328-0_sp1.1 from training. Duration: 0.7818125 2023-10-04 14:55:17,048 WARNING [train.py:1204] (2/4) Exclude cut with ID _29303_210_2575_1_1532392237303_7410649_612-165069-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 14:55:17,272 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.out_combiner.scale_min, batch_count=1693933.3333333333, ans=0.2 2023-10-04 14:55:18,644 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.conv_module2.balancer2.prob, batch_count=1693933.3333333333, ans=0.125 2023-10-04 14:55:19,674 WARNING [train.py:1204] (2/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_383-321224-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 14:55:19,710 WARNING [train.py:1204] (2/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_283-160525-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 14:55:19,731 WARNING [train.py:1204] (2/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_505-97987-0 from training. Duration: 0.86 2023-10-04 14:55:21,115 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9309W0190-640317 from training. Duration: 0.768 2023-10-04 14:55:25,227 WARNING [train.py:1204] (2/4) Exclude cut with ID _50093_210_4220_1_1534417141207_3432299_280-297390-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 14:55:31,669 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_17292_210_6753_1_1531125388_2943734_320-71385-0_sp1.1 from training. Duration: 0.97275 2023-10-04 14:55:33,054 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_213-103282-0 from training. Duration: 0.78 2023-10-04 14:55:36,254 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7615_210_4211_1_1528110965_4580160_72-235211-0_sp0.9 from training. Duration: 0.8444375 2023-10-04 14:55:36,420 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.feed_forward1.hidden_balancer.prob, batch_count=1694000.0, ans=0.125 2023-10-04 14:55:39,975 WARNING [train.py:1204] (2/4) Exclude cut with ID _14867_210_1968_1_1530684179890_3846969_173-320977-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 14:55:40,164 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.conv_skip_rate, batch_count=1694000.0, ans=0.0 2023-10-04 14:55:43,942 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7046_210_6753_1_1527895337_6920917_783-278109-0_sp0.9 from training. Duration: 0.8555625 2023-10-04 14:55:43,987 WARNING [train.py:1204] (2/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_90-30673-0 from training. Duration: 0.84 2023-10-04 14:55:44,002 WARNING [train.py:1204] (2/4) Exclude cut with ID _22826_210_9994_1_1531908201522_3279169_415-161-0_sp1.1 from training. Duration: 0.8909375 2023-10-04 14:55:44,019 WARNING [train.py:1204] (2/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_86-66192-0_sp0.9 from training. Duration: 0.611125 2023-10-04 14:55:44,022 WARNING [train.py:1204] (2/4) Exclude cut with ID _4626_210_3532_1_1526817652717_3916859_446-206656-0_sp1.1 from training. Duration: 0.6909375 2023-10-04 14:55:45,340 WARNING [train.py:1204] (2/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_166-128941-0_sp0.9 from training. Duration: 0.6 2023-10-04 14:55:48,174 WARNING [train.py:1204] (2/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_345-167986-0 from training. Duration: 0.84 2023-10-04 14:55:50,990 WARNING [train.py:1204] (2/4) Exclude cut with ID _6275_210_2084_1_1528110062213_7669049_973-198085-0 from training. Duration: 0.75 2023-10-04 14:55:51,079 WARNING [train.py:1204] (2/4) Exclude cut with ID _60057_210_10640_1_1534903065793_3333150_171-181133-0 from training. Duration: 0.88 2023-10-04 14:55:51,093 WARNING [train.py:1204] (2/4) Exclude cut with ID _19504_210_9994_1_1531648863242_3325220_336-196513-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 14:55:51,100 WARNING [train.py:1204] (2/4) Exclude cut with ID _6493_210_1747_1_1528459210424_4012470_513-140967-0 from training. Duration: 0.79 2023-10-04 14:55:52,347 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6673_210_3273_1_1527767790_3173240_327-306185-0_sp0.9 from training. Duration: 0.92225 2023-10-04 14:55:56,582 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1032-236863-0_sp1.1 from training. Duration: 0.763625 2023-10-04 14:55:57,896 INFO [train.py:1046] (2/4) Epoch 48, batch 4450, loss[loss=0.1631, simple_loss=0.2362, pruned_loss=0.04505, over 22811.00 frames. ], tot_loss[loss=0.1541, simple_loss=0.2348, pruned_loss=0.03672, over 4710458.87 frames. ], batch size: 322, lr: 2.11e-03, grad_scale: 16.0 2023-10-04 14:55:58,016 WARNING [train.py:1204] (2/4) Exclude cut with ID _14867_210_1968_1_1530684179890_3846969_413-29292-0 from training. Duration: 0.9 2023-10-04 14:55:58,177 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.conv_skip_rate, batch_count=1694133.3333333333, ans=0.0 2023-10-04 14:56:02,543 WARNING [train.py:1204] (2/4) Exclude cut with ID _47251_210_18628_1_1533814063128_4137250_186-343322-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 14:56:04,608 WARNING [train.py:1204] (2/4) Exclude cut with ID _56235_210_10640_1_1534557335235_4161099_624-46091-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 14:56:05,910 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_791-197891-0_sp0.9 from training. Duration: 0.9333125 2023-10-04 14:56:12,104 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_50050_210_16223_1_1535435796_7290138_786-288457-0_sp1.1 from training. Duration: 0.92725 2023-10-04 14:56:12,129 WARNING [train.py:1204] (2/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_213-227283-0_sp1.1 from training. Duration: 0.963625 2023-10-04 14:56:14,909 WARNING [train.py:1204] (2/4) Exclude cut with ID _42659_210_2070_1_1533607442765_3120380_58-341121-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 14:56:16,334 WARNING [train.py:1204] (2/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_506-23200-0_sp1.1 from training. Duration: 0.8818125 2023-10-04 14:56:17,809 WARNING [train.py:1204] (2/4) Exclude cut with ID _14170_210_5219_1_1530525949868_4469650_336-213530-0_sp1.1 from training. Duration: 0.8181875 2023-10-04 14:56:17,828 WARNING [train.py:1204] (2/4) Exclude cut with ID _64135_210_2253_1_1535677267876_7608972_180-5312-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 14:56:19,237 WARNING [train.py:1204] (2/4) Exclude cut with ID _12302_210_5219_1_1529924446466_8107280_835-279754-0 from training. Duration: 0.9 2023-10-04 14:56:19,239 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_20120_210_6753_1_1531643035_5482859_510-96996-0_sp1.1 from training. Duration: 0.7909375 2023-10-04 14:56:19,324 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_15517_210_6753_1_1530954008_3487336_216-187834-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 14:56:20,599 WARNING [train.py:1204] (2/4) Exclude cut with ID _19504_210_9994_1_1531648863242_3325220_414-113427-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 14:56:20,600 WARNING [train.py:1204] (2/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_264-108685-0_sp0.9 from training. Duration: 0.611125 2023-10-04 14:56:20,799 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.conv_skip_rate, batch_count=1694200.0, ans=0.0 2023-10-04 14:56:23,325 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_5438_210_1794_1_1527336687_2600009_104-288274-0_sp0.9 from training. Duration: 0.6444375 2023-10-04 14:56:27,361 WARNING [train.py:1204] (2/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_520-39496-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 14:56:28,748 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_37818_210_4238_1_1533620910_4720083_315-308565-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 14:56:30,675 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_2009_210_2071_1_1525481134_4588937_385-302100-0_sp1.1 from training. Duration: 0.7909375 2023-10-04 14:56:30,716 WARNING [train.py:1204] (2/4) Exclude cut with ID _58895_210_8750_1_1535184081320_3649089_277-53219-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 14:56:32,675 WARNING [train.py:1204] (2/4) Exclude cut with ID _26664_210_7770_1_1532258943280_3418270_121-111258-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 14:56:37,287 WARNING [train.py:1204] (2/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_99-167392-0_sp1.1 from training. Duration: 0.5545625 2023-10-04 14:56:38,599 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1113-7369-0 from training. Duration: 0.85 2023-10-04 14:56:38,614 WARNING [train.py:1204] (2/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_352-59958-0 from training. Duration: 0.86 2023-10-04 14:56:38,619 WARNING [train.py:1204] (2/4) Exclude cut with ID _14886_210_4220_1_1530702020355_3516190_159-73748-0_sp1.1 from training. Duration: 0.7545625 2023-10-04 14:56:40,131 WARNING [train.py:1204] (2/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_131-119569-0_sp1.1 from training. Duration: 0.92725 2023-10-04 14:56:41,904 WARNING [train.py:1204] (2/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_216-343007-0 from training. Duration: 0.66 2023-10-04 14:56:47,332 WARNING [train.py:1204] (2/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_113-92608-0_sp0.9 from training. Duration: 0.6 2023-10-04 14:56:50,164 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_40047_210_2290_1_1533290519_2374311_132-123271-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 14:56:50,234 WARNING [train.py:1204] (2/4) Exclude cut with ID _8793_210_6310_1_1528542985139_2310009_121-179782-0 from training. Duration: 0.8 2023-10-04 14:56:51,559 WARNING [train.py:1204] (2/4) Exclude cut with ID _43997_210_4525_1_1533538632555_5189329_283-218639-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 14:56:51,562 WARNING [train.py:1204] (2/4) Exclude cut with ID _49376_210_5986_1_1533985278732_3370740_297-78397-0_sp1.1 from training. Duration: 0.936375 2023-10-04 14:56:51,582 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_3475_210_2028_1_1526193578_3622569_377-166133-0_sp0.9 from training. Duration: 0.9555625 2023-10-04 14:56:51,589 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_520-213149-0_sp1.1 from training. Duration: 0.92725 2023-10-04 14:56:54,274 WARNING [train.py:1204] (2/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_567-144483-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 14:56:56,906 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_4183_210_2087_1_1526729100_1869838_143-280962-0_sp0.9 from training. Duration: 0.62225 2023-10-04 14:56:56,958 WARNING [train.py:1204] (2/4) Exclude cut with ID _49643_210_7989_1_1534640244224_3939031_611-185515-0 from training. Duration: 0.94 2023-10-04 14:56:58,344 WARNING [train.py:1204] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_88-341718-0_sp0.9 from training. Duration: 0.7333125 2023-10-04 14:57:01,082 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_51225_210_3601_1_1534400634_2327383_49-235441-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 14:57:01,172 WARNING [train.py:1204] (2/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_240-294400-0_sp1.1 from training. Duration: 0.936375 2023-10-04 14:57:03,276 WARNING [train.py:1204] (2/4) Exclude cut with ID _55429_210_10626_1_1534467588527_7174143_640-304825-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 14:57:04,591 WARNING [train.py:1204] (2/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_187-221566-0_sp1.1 from training. Duration: 0.3909375 2023-10-04 14:57:04,733 WARNING [train.py:1204] (2/4) Exclude cut with ID _7606_210_4915_1_1528894253277_5080690_127-177761-0_sp0.9 from training. Duration: 0.711125 2023-10-04 14:57:09,272 WARNING [train.py:1204] (2/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_430-116139-0 from training. Duration: 0.85 2023-10-04 14:57:09,394 WARNING [train.py:1204] (2/4) Exclude cut with ID _3675_210_4557_1_1526639934408_4182089_456-18305-0_sp0.9 from training. Duration: 0.8444375 2023-10-04 14:57:11,971 INFO [train.py:1046] (2/4) Epoch 48, batch 4500, loss[loss=0.1399, simple_loss=0.207, pruned_loss=0.03645, over 22813.00 frames. ], tot_loss[loss=0.1547, simple_loss=0.2353, pruned_loss=0.03699, over 4717034.42 frames. ], batch size: 322, lr: 2.11e-03, grad_scale: 16.0 2023-10-04 14:57:14,059 WARNING [train.py:1204] (2/4) Exclude cut with ID _18513_210_9206_1_1531618215223_3862620_402-5957-0_sp1.1 from training. Duration: 0.9 2023-10-04 14:57:15,509 WARNING [train.py:1204] (2/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_465-156500-0 from training. Duration: 0.72 2023-10-04 14:57:15,511 WARNING [train.py:1204] (2/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_714-310538-0 from training. Duration: 0.86 2023-10-04 14:57:18,141 WARNING [train.py:1204] (2/4) Exclude cut with ID _39411_210_5986_1_1533538825030_3343649_335-301187-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 14:57:19,815 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.conv_module2.balancer1.max_abs, batch_count=1694466.6666666667, ans=10.0 2023-10-04 14:57:22,223 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_41750_210_1960_1_1533520678_3948042_241-158616-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 14:57:22,270 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_46013_210_7234_1_1533776362_3744032_50-141535-0_sp1.1 from training. Duration: 0.9 2023-10-04 14:57:22,478 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.bypass.scale_min, batch_count=1694466.6666666667, ans=0.2 2023-10-04 14:57:23,635 WARNING [train.py:1204] (2/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_43-206757-0_sp0.9 from training. Duration: 0.8333125 2023-10-04 14:57:23,693 WARNING [train.py:1204] (2/4) Exclude cut with ID _22961_210_8254_1_1531976366672_4278180_172-276296-0_sp1.1 from training. Duration: 0.97275 2023-10-04 14:57:25,033 WARNING [train.py:1204] (2/4) Exclude cut with ID _17956_210_4353_1_1531209568125_6979970_1305-301903-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 14:57:25,080 WARNING [train.py:1204] (2/4) Exclude cut with ID _28323_210_16187_1_1532480395667_4100300_604-44507-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 14:57:30,301 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.718e+02 2.084e+02 2.253e+02 2.565e+02 3.939e+02, threshold=4.505e+02, percent-clipped=0.0 2023-10-04 14:57:35,781 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.conv_skip_rate, batch_count=1694533.3333333333, ans=0.0 2023-10-04 14:57:38,217 WARNING [train.py:1204] (2/4) Exclude cut with ID _23514_210_1389_1_1531832139785_3031439_88-337544-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 14:57:38,295 WARNING [train.py:1204] (2/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_932-3672-0_sp1.1 from training. Duration: 0.963625 2023-10-04 14:57:41,140 WARNING [train.py:1204] (2/4) Exclude cut with ID _46502_210_13794_1_1533724452198_3657159_207-109991-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 14:57:41,191 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_216-73841-0_sp1.1 from training. Duration: 0.87275 2023-10-04 14:57:42,474 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8453_210_4211_1_1528502940_8562730_347-330673-0_sp1.1 from training. Duration: 0.6545625 2023-10-04 14:57:49,811 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_4183_210_2087_1_1526729100_1869838_143-280962-0_sp1.1 from training. Duration: 0.5090625 2023-10-04 14:57:52,730 WARNING [train.py:1204] (2/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_325-113798-0_sp1.1 from training. Duration: 0.77275 2023-10-04 14:57:57,003 WARNING [train.py:1204] (2/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_91-212486-0_sp0.9 from training. Duration: 0.7555625 2023-10-04 14:57:59,768 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8776_210_2040_1_1528638837_4088036_244-278763-0_sp1.1 from training. Duration: 0.8181875 2023-10-04 14:57:59,809 WARNING [train.py:1204] (2/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_106-286130-0 from training. Duration: 0.98 2023-10-04 14:58:01,135 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_2746_210_1988_1_1525872816_3795627_372-205861-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 14:58:01,184 WARNING [train.py:1204] (2/4) Exclude cut with ID _30800_210_22382_1_1532499347904_4515959_240-54646-0_sp1.1 from training. Duration: 0.936375 2023-10-04 14:58:03,108 WARNING [train.py:1204] (2/4) Exclude cut with ID _59500_210_8254_1_1534809610680_3736009_162-180695-0_sp1.1 from training. Duration: 0.936375 2023-10-04 14:58:03,142 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7544_210_6753_1_1528591327_1471426_146-93357-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 14:58:05,046 WARNING [train.py:1204] (2/4) Exclude cut with ID _42657_210_2070_1_1533520654016_3327008_405-87432-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 14:58:06,350 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_4528_210_3273_1_1527076654_3276777_209-219804-0 from training. Duration: 0.69 2023-10-04 14:58:06,351 WARNING [train.py:1204] (2/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_78-182912-0_sp0.9 from training. Duration: 0.6555625 2023-10-04 14:58:06,365 WARNING [train.py:1204] (2/4) Exclude cut with ID _31255_210_13427_1_1532865619495_3815143_322-114998-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 14:58:09,660 WARNING [train.py:1204] (2/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_848-280957-0_sp0.9 from training. Duration: 0.9666875 2023-10-04 14:58:10,988 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7696_210_2040_1_1528207167_3716099_117-125434-0_sp0.9 from training. Duration: 0.8444375 2023-10-04 14:58:12,473 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_1002-109192-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 14:58:16,913 WARNING [train.py:1204] (2/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_138-64210-0_sp0.9 from training. Duration: 0.811125 2023-10-04 14:58:16,936 WARNING [train.py:1204] (2/4) Exclude cut with ID _25808_210_13292_1_1532343514562_7366109_633-47524-0_sp1.1 from training. Duration: 0.9 2023-10-04 14:58:18,321 WARNING [train.py:1204] (2/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_297-5233-0 from training. Duration: 0.95 2023-10-04 14:58:19,738 WARNING [train.py:1204] (2/4) Exclude cut with ID _8394_210_6702_1_1528590504316_3540189_335-274780-0 from training. Duration: 0.78 2023-10-04 14:58:19,743 WARNING [train.py:1204] (2/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_301-330455-0 from training. Duration: 0.8 2023-10-04 14:58:23,952 WARNING [train.py:1204] (2/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_383-205833-0 from training. Duration: 0.95 2023-10-04 14:58:25,456 INFO [train.py:1046] (2/4) Epoch 48, batch 4550, loss[loss=0.1288, simple_loss=0.185, pruned_loss=0.03631, over 19569.00 frames. ], tot_loss[loss=0.154, simple_loss=0.2342, pruned_loss=0.03691, over 4707206.55 frames. ], batch size: 388, lr: 2.11e-03, grad_scale: 16.0 2023-10-04 14:58:26,932 WARNING [train.py:1204] (2/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_48-182155-0 from training. Duration: 0.87 2023-10-04 14:58:28,440 WARNING [train.py:1204] (2/4) Exclude cut with ID _14832_210_8750_1_1530691351539_3171348_1-54315-0_sp1.1 from training. Duration: 0.7909375 2023-10-04 14:58:31,186 WARNING [train.py:1204] (2/4) Exclude cut with ID _44982_210_1511_1_1534312820108_3726029_413-100814-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 14:58:31,244 WARNING [train.py:1204] (2/4) Exclude cut with ID _3231_210_2106_1_1526205864934_6784220_104-261392-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 14:58:34,465 WARNING [train.py:1204] (2/4) Exclude cut with ID _24314_210_4915_1_1532135078116_7170179_1-13588-0_sp1.1 from training. Duration: 0.97275 2023-10-04 14:58:37,927 WARNING [train.py:1204] (2/4) Exclude cut with ID _44304_210_15374_1_1533639352765_4613129_141-8356-0_sp1.1 from training. Duration: 0.963625 2023-10-04 14:58:39,314 WARNING [train.py:1204] (2/4) Exclude cut with ID _37892_210_19596_1_1533198677897_3964058_374-335034-0_sp1.1 from training. Duration: 0.936375 2023-10-04 14:58:40,806 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_195-257512-0_sp0.9 from training. Duration: 0.7444375 2023-10-04 14:58:40,809 WARNING [train.py:1204] (2/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_1267-272693-0_sp1.1 from training. Duration: 0.663625 2023-10-04 14:58:40,809 WARNING [train.py:1204] (2/4) Exclude cut with ID _30196_210_1664_1_1533708496522_7074140_690-326155-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 14:58:40,986 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.balancer1.prob, batch_count=1694866.6666666667, ans=0.125 2023-10-04 14:58:45,194 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_68847_210_14656_1_1536070214_2586843_38-20717-0_sp1.1 from training. Duration: 0.97275 2023-10-04 14:58:45,228 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_99-251314-0_sp1.1 from training. Duration: 0.7909375 2023-10-04 14:58:49,376 WARNING [train.py:1204] (2/4) Exclude cut with ID _49376_210_5986_1_1533985278732_3370740_225-95630-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 14:58:52,160 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_2655_210_2087_1_1525688959_3897400_202-147557-0 from training. Duration: 0.85 2023-10-04 14:58:52,217 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_5618_210_1874_1_1527311008_5851_1-214501-0_sp1.1 from training. Duration: 0.626375 2023-10-04 14:58:52,292 WARNING [train.py:1204] (2/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_225-43381-0_sp1.1 from training. Duration: 0.8545625 2023-10-04 14:58:53,874 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.feed_forward1.out_proj.dropout_p, batch_count=1694933.3333333333, ans=0.1 2023-10-04 14:58:55,004 WARNING [train.py:1204] (2/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_242-3472-0 from training. Duration: 0.73 2023-10-04 14:58:57,776 WARNING [train.py:1204] (2/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_339-104791-0 from training. Duration: 0.49 2023-10-04 14:58:59,124 WARNING [train.py:1204] (2/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_488-92261-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 14:59:00,718 WARNING [train.py:1204] (2/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_175-163894-0 from training. Duration: 0.74 2023-10-04 14:59:02,834 WARNING [train.py:1204] (2/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_41-234353-0_sp0.9 from training. Duration: 0.7555625 2023-10-04 14:59:06,095 WARNING [train.py:1204] (2/4) Exclude cut with ID _47251_210_18628_1_1533814063128_4137250_116-275927-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 14:59:07,370 WARNING [train.py:1204] (2/4) Exclude cut with ID _4697_210_2069_1_1526815801953_3823098_61-223324-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 14:59:07,383 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6961_210_2087_1_1527937168_6815011_562-165518-0_sp1.1 from training. Duration: 0.7 2023-10-04 14:59:08,788 WARNING [train.py:1204] (2/4) Exclude cut with ID _21989_210_5854_1_1531899214931_3271360_133-44759-0 from training. Duration: 0.77 2023-10-04 14:59:10,358 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.feed_forward3.hidden_balancer.prob, batch_count=1695000.0, ans=0.125 2023-10-04 14:59:11,535 WARNING [train.py:1204] (2/4) Exclude cut with ID _23557_210_8750_1_1531890106370_6846255_77-284923-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 14:59:13,028 WARNING [train.py:1204] (2/4) Exclude cut with ID _13678_210_7497_1_1530530069226_3463170_464-296127-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 14:59:13,041 WARNING [train.py:1204] (2/4) Exclude cut with ID _22613_210_5219_1_1532253599686_4090170_393-36663-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 14:59:14,482 WARNING [train.py:1204] (2/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_235-262124-0_sp0.9 from training. Duration: 0.7444375 2023-10-04 14:59:14,683 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.feed_forward3.hidden_balancer.prob, batch_count=1695000.0, ans=0.125 2023-10-04 14:59:16,346 WARNING [train.py:1204] (2/4) Exclude cut with ID _8480_210_1874_1_1528589391205_6728330_755-112625-0 from training. Duration: 0.74 2023-10-04 14:59:17,773 WARNING [train.py:1204] (2/4) Exclude cut with ID _6456_210_1534_1_1527681634212_3181049_215-309359-0 from training. Duration: 0.88 2023-10-04 14:59:17,801 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_3019_210_2028_1_1526044245_3264865_191-72235-0_sp1.1 from training. Duration: 0.8818125 2023-10-04 14:59:17,875 WARNING [train.py:1204] (2/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_380-162648-0 from training. Duration: 0.94 2023-10-04 14:59:20,565 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_92-46009-0 from training. Duration: 0.54 2023-10-04 14:59:20,582 WARNING [train.py:1204] (2/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_53-300544-0_sp0.9 from training. Duration: 0.7444375 2023-10-04 14:59:22,002 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_68235_210_1925_1_1535863247_4484656_101-250943-0_sp1.1 from training. Duration: 0.97275 2023-10-04 14:59:23,251 WARNING [train.py:1204] (2/4) Exclude cut with ID _14970_210_15709_1_1530775547361_1966965_204-22182-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 14:59:23,334 WARNING [train.py:1204] (2/4) Exclude cut with ID _55153_210_2253_1_1534384786677_3162096_310-130092-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 14:59:23,344 WARNING [train.py:1204] (2/4) Exclude cut with ID _60113_210_16271_1_1535164212481_4386079_559-167191-0_sp1.1 from training. Duration: 0.8454375 2023-10-04 14:59:24,771 WARNING [train.py:1204] (2/4) Exclude cut with ID _14368_210_4220_1_1530532714870_3595529_93-58002-0_sp1.1 from training. Duration: 0.5181875 2023-10-04 14:59:26,162 WARNING [train.py:1204] (2/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_110-224688-0 from training. Duration: 0.97 2023-10-04 14:59:27,519 WARNING [train.py:1204] (2/4) Exclude cut with ID _40402_210_18219_1_1533522324115_2953150_178-178004-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 14:59:27,535 WARNING [train.py:1204] (2/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_321-335475-0_sp1.1 from training. Duration: 0.4090625 2023-10-04 14:59:27,586 WARNING [train.py:1204] (2/4) Exclude cut with ID _66863_210_18053_1_1535698758334_8590319_1092-271784-0 from training. Duration: 0.96 2023-10-04 14:59:27,599 WARNING [train.py:1204] (2/4) Exclude cut with ID _28317_210_12742_1_1532480302381_8510325_607-213951-0_sp1.1 from training. Duration: 0.763625 2023-10-04 14:59:27,609 WARNING [train.py:1204] (2/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_468-283113-0 from training. Duration: 0.96 2023-10-04 14:59:30,478 WARNING [train.py:1204] (2/4) Exclude cut with ID _14922_210_6228_1_1530707435273_3693310_13-165512-0_sp1.1 from training. Duration: 0.7454375 2023-10-04 14:59:30,489 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_69630_210_7234_1_1535788670_910989_56-169762-0_sp1.1 from training. Duration: 0.92725 2023-10-04 14:59:32,563 WARNING [train.py:1204] (2/4) Exclude cut with ID _67335_210_5219_1_1535525975652_4275229_121-80357-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 14:59:32,612 WARNING [train.py:1204] (2/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_126-12079-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 14:59:33,951 WARNING [train.py:1204] (2/4) Exclude cut with ID _5294_210_1747_1_1527249643928_3696179_342-270278-0_sp1.1 from training. Duration: 0.5 2023-10-04 14:59:35,833 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_30322_210_1925_1_1532843478_826817_35-328264-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 14:59:37,199 WARNING [train.py:1204] (2/4) Exclude cut with ID _8480_210_1874_1_1528589391205_6728330_755-112625-0_sp1.1 from training. Duration: 0.67275 2023-10-04 14:59:38,551 WARNING [train.py:1204] (2/4) Exclude cut with ID _38670_210_15379_1_1533448810659_4391059_132-146412-0_sp1.1 from training. Duration: 0.97275 2023-10-04 14:59:40,350 INFO [train.py:1046] (2/4) Epoch 48, batch 4600, loss[loss=0.1595, simple_loss=0.2355, pruned_loss=0.04178, over 23840.00 frames. ], tot_loss[loss=0.153, simple_loss=0.233, pruned_loss=0.03646, over 4705771.44 frames. ], batch size: 179, lr: 2.11e-03, grad_scale: 16.0 2023-10-04 14:59:40,418 WARNING [train.py:1204] (2/4) Exclude cut with ID _22020_210_2461_1_1531724164513_6326900_563-39691-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 14:59:43,168 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_9460_210_4211_1_1528954587_10049667_572-37035-0_sp1.1 from training. Duration: 0.72725 2023-10-04 14:59:43,187 WARNING [train.py:1204] (2/4) Exclude cut with ID _68777_210_4353_1_1535810390731_3117904_129-166012-0_sp0.9 from training. Duration: 0.9444375 2023-10-04 14:59:44,433 WARNING [train.py:1204] (2/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_487-91921-0_sp1.1 from training. Duration: 0.963625 2023-10-04 14:59:45,819 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_195-257512-0 from training. Duration: 0.67 2023-10-04 14:59:45,920 WARNING [train.py:1204] (2/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_188-260011-0_sp1.1 from training. Duration: 0.663625 2023-10-04 14:59:50,146 WARNING [train.py:1204] (2/4) Exclude cut with ID _8480_210_1874_1_1528589391205_6728330_309-139896-0_sp0.9 from training. Duration: 0.9 2023-10-04 14:59:51,527 WARNING [train.py:1204] (2/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_866-321248-0_sp1.1 from training. Duration: 0.963625 2023-10-04 14:59:52,947 WARNING [train.py:1204] (2/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_510-216513-0_sp1.1 from training. Duration: 0.97275 2023-10-04 14:59:58,276 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.701e+02 2.165e+02 2.492e+02 2.948e+02 4.714e+02, threshold=4.983e+02, percent-clipped=2.0 2023-10-04 14:59:58,397 WARNING [train.py:1204] (2/4) Exclude cut with ID _6653_210_2629_1_1527919172441_6732679_758-143677-0 from training. Duration: 0.98 2023-10-04 15:00:00,401 WARNING [train.py:1204] (2/4) Exclude cut with ID _55134_210_21982_1_1534590028377_3882170_50-214427-0_sp1.1 from training. Duration: 0.97275 2023-10-04 15:00:02,879 WARNING [train.py:1204] (2/4) Exclude cut with ID _31649_210_6228_1_1532584608063_4091078_482-301330-0_sp1.1 from training. Duration: 0.97275 2023-10-04 15:00:06,139 WARNING [train.py:1204] (2/4) Exclude cut with ID _35799_210_5854_1_1533263487887_3529071_490-326847-0_sp1.1 from training. Duration: 0.9 2023-10-04 15:00:06,146 WARNING [train.py:1204] (2/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_984-90716-0_sp1.1 from training. Duration: 0.963625 2023-10-04 15:00:10,898 WARNING [train.py:1204] (2/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_166-246557-0 from training. Duration: 0.64 2023-10-04 15:00:10,898 WARNING [train.py:1204] (2/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_51-200220-0_sp1.1 from training. Duration: 0.5090625 2023-10-04 15:00:13,434 WARNING [train.py:1204] (2/4) Exclude cut with ID _51382_210_23862_1_1534492801838_3650104_275-92796-0_sp1.1 from training. Duration: 0.936375 2023-10-04 15:00:13,699 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.ff3_skip_rate, batch_count=1695266.6666666667, ans=0.0 2023-10-04 15:00:16,289 WARNING [train.py:1204] (2/4) Exclude cut with ID _29853_210_7497_1_1532505600851_6642110_461-142829-0_sp1.1 from training. Duration: 0.97275 2023-10-04 15:00:16,338 WARNING [train.py:1204] (2/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_788-298907-0_sp0.9 from training. Duration: 0.988875 2023-10-04 15:00:19,074 WARNING [train.py:1204] (2/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_739-2098-0_sp1.1 from training. Duration: 0.836375 2023-10-04 15:00:21,841 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7543_210_6753_1_1528541907_1399322_165-323619-0 from training. Duration: 0.69 2023-10-04 15:00:24,451 WARNING [train.py:1204] (2/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_401-255491-0_sp1.1 from training. Duration: 0.636375 2023-10-04 15:00:24,625 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.0.conv_skip_rate, batch_count=1695333.3333333333, ans=0.0 2023-10-04 15:00:28,621 WARNING [train.py:1204] (2/4) Exclude cut with ID _66873_210_5517_1_1535454136506_4293370_24-143283-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 15:00:28,706 WARNING [train.py:1204] (2/4) Exclude cut with ID _14459_210_2228_1_1531443641946_7516700_1221-280439-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 15:00:31,492 WARNING [train.py:1204] (2/4) Exclude cut with ID _59202_210_26984_1_1534816810612_3549174_469-213482-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 15:00:31,493 WARNING [train.py:1204] (2/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_148-158509-0_sp0.9 from training. Duration: 0.4555625 2023-10-04 15:00:31,535 WARNING [train.py:1204] (2/4) Exclude cut with ID _24314_210_4915_1_1532135078116_7170179_863-259095-0_sp1.1 from training. Duration: 0.97275 2023-10-04 15:00:33,532 WARNING [train.py:1204] (2/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_321-22821-0 from training. Duration: 0.89 2023-10-04 15:00:33,561 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_43831_210_2158_1_1533725502_5872791_55-163137-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 15:00:33,595 WARNING [train.py:1204] (2/4) Exclude cut with ID _27430_210_7770_1_1532604689375_1378590_26-25207-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 15:00:33,696 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_17613_210_2151_1_1531137569_2366160_223-316461-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 15:00:35,019 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_53679_210_4852_1_1534556726_4186141_541-213370-0_sp1.1 from training. Duration: 0.963625 2023-10-04 15:00:36,947 WARNING [train.py:1204] (2/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_369-295086-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 15:00:37,012 WARNING [train.py:1204] (2/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_91-267696-0 from training. Duration: 0.87 2023-10-04 15:00:37,050 WARNING [train.py:1204] (2/4) Exclude cut with ID _5294_210_1747_1_1527249643928_3696179_180-100713-0 from training. Duration: 0.81 2023-10-04 15:00:37,081 WARNING [train.py:1204] (2/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_32-335103-0 from training. Duration: 0.82 2023-10-04 15:00:37,086 WARNING [train.py:1204] (2/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_133-312727-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 15:00:39,050 WARNING [train.py:1204] (2/4) Exclude cut with ID _59288_210_5854_1_1535077757774_3412246_391-165934-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 15:00:39,088 WARNING [train.py:1204] (2/4) Exclude cut with ID _28323_210_16187_1_1532480395667_4100300_488-1281-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 15:00:40,505 WARNING [train.py:1204] (2/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_124-193207-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 15:00:51,273 WARNING [train.py:1204] (2/4) Exclude cut with ID _14087_210_2253_1_1530529224512_3014029_226-242140-0_sp0.9 from training. Duration: 0.9 2023-10-04 15:00:52,548 INFO [train.py:1046] (2/4) Epoch 48, batch 4650, loss[loss=0.1319, simple_loss=0.2165, pruned_loss=0.02361, over 24366.00 frames. ], tot_loss[loss=0.1527, simple_loss=0.233, pruned_loss=0.03618, over 4709275.47 frames. ], batch size: 61, lr: 2.11e-03, grad_scale: 16.0 2023-10-04 15:00:52,944 INFO [scaling.py:1118] (2/4) WithLoss: name=encoder.encoders.4.encoder.layers.0.self_attn_weights, loss-sum=0.000e+00 2023-10-04 15:00:54,047 WARNING [train.py:1204] (2/4) Exclude cut with ID _29277_210_3279_1_1532581109293_3173069_392-3694-0_sp1.1 from training. Duration: 0.936375 2023-10-04 15:00:54,087 WARNING [train.py:1204] (2/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_563-329842-0_sp1.1 from training. Duration: 0.97275 2023-10-04 15:00:55,384 WARNING [train.py:1204] (2/4) Exclude cut with ID _58583_210_5236_1_1534726835266_3574249_391-87755-0_sp1.1 from training. Duration: 0.92725 2023-10-04 15:00:55,411 WARNING [train.py:1204] (2/4) Exclude cut with ID _28427_210_4220_1_1532581240967_3061069_281-237827-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 15:00:55,425 WARNING [train.py:1204] (2/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_44-147898-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 15:00:56,757 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_24007_210_1794_1_1531918925_1663875_125-320772-0_sp1.1 from training. Duration: 0.97275 2023-10-04 15:01:00,769 WARNING [train.py:1204] (2/4) Exclude cut with ID _14368_210_4220_1_1530532714870_3595529_143-117421-0 from training. Duration: 0.58 2023-10-04 15:01:02,748 WARNING [train.py:1204] (2/4) Exclude cut with ID _69584_210_5219_1_1535785133775_4044450_325-7656-0_sp0.9 from training. Duration: 0.9555625 2023-10-04 15:01:05,909 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_37351_210_7925_1_1533012931_3812113_265-85780-0 from training. Duration: 0.88 2023-10-04 15:01:05,931 WARNING [train.py:1204] (2/4) Exclude cut with ID _18516_210_9206_1_1531704653206_3692406_331-237896-0_sp1.1 from training. Duration: 0.936375 2023-10-04 15:01:07,912 WARNING [train.py:1204] (2/4) Exclude cut with ID _14629_210_15709_1_1530604298182_956899_6-232695-0 from training. Duration: 0.94 2023-10-04 15:01:07,933 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7544_210_6753_1_1528587000_691468_87-178599-0_sp1.1 from training. Duration: 0.7909375 2023-10-04 15:01:07,994 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_326-303945-0 from training. Duration: 0.99 2023-10-04 15:01:08,009 WARNING [train.py:1204] (2/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_503-30719-0 from training. Duration: 0.98 2023-10-04 15:01:08,017 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_11305_210_1794_1_1529750428_7178148_299-344640-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 15:01:09,775 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_53071_210_16554_1_1534490405_2954066_265-170472-0_sp1.1 from training. Duration: 0.7545625 2023-10-04 15:01:10,450 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.4.encoder.layers.0.self_attn_weights.whiten_keys, num_groups=4, num_channels=128, metric=1.58 vs. limit=6.0 2023-10-04 15:01:11,180 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_174-277364-0_sp1.1 from training. Duration: 0.6454375 2023-10-04 15:01:12,502 WARNING [train.py:1204] (2/4) Exclude cut with ID _58414_210_8254_1_1535421609162_7151182_193-151884-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 15:01:12,529 WARNING [train.py:1204] (2/4) Exclude cut with ID IC0651W0104-322063 from training. Duration: 0.768 2023-10-04 15:01:15,260 WARNING [train.py:1204] (2/4) Exclude cut with ID _21881_210_3525_1_1531825242757_3798380_139-305982-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 15:01:16,558 WARNING [train.py:1204] (2/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_79-200392-0 from training. Duration: 0.97 2023-10-04 15:01:19,442 WARNING [train.py:1204] (2/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_451-280894-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 15:01:19,457 WARNING [train.py:1204] (2/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_304-273433-0_sp1.1 from training. Duration: 0.9 2023-10-04 15:01:20,811 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_5959_210_1794_1_1527400400_3445726_412-44052-0 from training. Duration: 0.77 2023-10-04 15:01:20,923 WARNING [train.py:1204] (2/4) Exclude cut with ID _4681_210_3525_1_1527251040321_1235713_148-26159-0_sp1.1 from training. Duration: 0.963625 2023-10-04 15:01:23,719 WARNING [train.py:1204] (2/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_887-249555-0_sp0.9 from training. Duration: 0.8555625 2023-10-04 15:01:26,409 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_25236_210_2158_1_1532051968_280567_37-6860-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 15:01:32,395 WARNING [train.py:1204] (2/4) Exclude cut with ID _29277_210_3279_1_1532581109293_3173069_417-115527-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 15:01:35,532 WARNING [train.py:1204] (2/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_552-134748-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 15:01:37,506 WARNING [train.py:1204] (2/4) Exclude cut with ID _43176_210_8254_1_1533524417469_4174339_188-174143-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 15:01:37,553 WARNING [train.py:1204] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_127-119109-0_sp1.1 from training. Duration: 0.6545625 2023-10-04 15:01:42,025 WARNING [train.py:1204] (2/4) Exclude cut with ID _14922_210_6228_1_1530707435273_3693310_38-187851-0 from training. Duration: 0.62 2023-10-04 15:01:42,070 WARNING [train.py:1204] (2/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_588-125165-0 from training. Duration: 0.83 2023-10-04 15:01:42,142 WARNING [train.py:1204] (2/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_344-152526-0_sp1.1 from training. Duration: 0.4818125 2023-10-04 15:01:42,143 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7002_210_1794_1_1527935634_4034444_487-236501-0 from training. Duration: 0.66 2023-10-04 15:01:43,625 WARNING [train.py:1204] (2/4) Exclude cut with ID _23825_210_13449_1_1531915313526_3497150_379-16981-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 15:01:50,362 WARNING [train.py:1204] (2/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_297-5233-0_sp1.1 from training. Duration: 0.863625 2023-10-04 15:01:50,366 WARNING [train.py:1204] (2/4) Exclude cut with ID _41298_210_9019_1_1533430371450_4822143_178-283538-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 15:01:50,404 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7046_210_6753_1_1527895337_6920917_642-106799-0 from training. Duration: 0.92 2023-10-04 15:01:51,614 WARNING [train.py:1204] (2/4) Exclude cut with ID _6274_210_2084_1_1527937583837_7494020_532-291952-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 15:01:53,005 WARNING [train.py:1204] (2/4) Exclude cut with ID _47278_210_12041_1_1533902418349_3576110_260-270468-0_sp1.1 from training. Duration: 0.92725 2023-10-04 15:01:53,011 WARNING [train.py:1204] (2/4) Exclude cut with ID _14368_210_4220_1_1530532714870_3595529_144-3287-0_sp0.9 from training. Duration: 0.7444375 2023-10-04 15:01:53,257 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.conv_module2.balancer2.prob, batch_count=1695733.3333333333, ans=0.125 2023-10-04 15:01:54,386 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_162-262717-0_sp0.9 from training. Duration: 0.92225 2023-10-04 15:01:57,100 WARNING [train.py:1204] (2/4) Exclude cut with ID _3465_210_3525_1_1526175667060_3574049_62-335552-0_sp0.9 from training. Duration: 0.9666875 2023-10-04 15:01:57,105 WARNING [train.py:1204] (2/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_726-272852-0_sp1.1 from training. Duration: 0.92725 2023-10-04 15:01:57,175 WARNING [train.py:1204] (2/4) Exclude cut with ID _14898_210_9206_1_1530766812377_4177190_26-135492-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 15:02:00,077 WARNING [train.py:1204] (2/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_200-95304-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 15:02:00,120 WARNING [train.py:1204] (2/4) Exclude cut with ID _45012_210_12651_1_1533608435136_5371380_240-37101-0_sp1.1 from training. Duration: 0.8454375 2023-10-04 15:02:00,134 WARNING [train.py:1204] (2/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_132-269633-0_sp0.9 from training. Duration: 0.7555625 2023-10-04 15:02:01,474 WARNING [train.py:1204] (2/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_907-151398-0_sp0.9 from training. Duration: 0.411125 2023-10-04 15:02:01,562 WARNING [train.py:1204] (2/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_45-265684-0_sp0.9 from training. Duration: 0.8 2023-10-04 15:02:03,316 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6291_210_3273_1_1527683649_1078498_74-292608-0 from training. Duration: 0.72 2023-10-04 15:02:06,547 INFO [train.py:1046] (2/4) Epoch 48, batch 4700, loss[loss=0.1551, simple_loss=0.2499, pruned_loss=0.03009, over 24566.00 frames. ], tot_loss[loss=0.1527, simple_loss=0.2337, pruned_loss=0.03589, over 4723142.20 frames. ], batch size: 71, lr: 2.11e-03, grad_scale: 16.0 2023-10-04 15:02:13,149 WARNING [train.py:1204] (2/4) Exclude cut with ID _59202_210_26984_1_1534816810612_3549174_221-84819-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 15:02:14,416 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_33696_210_3852_1_1533082780_2663375_181-325595-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 15:02:15,734 WARNING [train.py:1204] (2/4) Exclude cut with ID _47251_210_18628_1_1533814063128_4137250_351-154105-0_sp1.1 from training. Duration: 0.963625 2023-10-04 15:02:17,083 WARNING [train.py:1204] (2/4) Exclude cut with ID _35501_210_9288_1_1532998507986_3192189_351-334740-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 15:02:18,484 WARNING [train.py:1204] (2/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_342-100157-0_sp1.1 from training. Duration: 0.4909375 2023-10-04 15:02:23,202 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.4.encoder.layers.2.nonlin_attention.whiten2, num_groups=1, num_channels=384, metric=3.55 vs. limit=15.0 2023-10-04 15:02:24,025 WARNING [train.py:1204] (2/4) Exclude cut with ID _6789_210_1534_1_1527857937218_3118950_262-48853-0 from training. Duration: 0.56 2023-10-04 15:02:24,059 WARNING [train.py:1204] (2/4) Exclude cut with ID _69884_210_9771_1_1535846392818_4166169_430-201020-0 from training. Duration: 0.97 2023-10-04 15:02:25,309 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.718e+02 2.030e+02 2.257e+02 2.587e+02 3.872e+02, threshold=4.514e+02, percent-clipped=0.0 2023-10-04 15:02:26,801 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_71-136518-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 15:02:28,159 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_28-291982-0_sp1.1 from training. Duration: 0.8818125 2023-10-04 15:02:28,186 WARNING [train.py:1204] (2/4) Exclude cut with ID _54218_210_12210_1_1534413646388_4409259_437-275326-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 15:02:29,986 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.bypass.skip_rate, batch_count=1695866.6666666667, ans=0.04949747468305833 2023-10-04 15:02:31,095 WARNING [train.py:1204] (2/4) Exclude cut with ID _55134_210_21982_1_1534590028377_3882170_40-309139-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 15:02:32,621 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.feed_forward1.hidden_balancer.prob, batch_count=1695866.6666666667, ans=0.125 2023-10-04 15:02:32,676 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.conv_module2.balancer2.prob, batch_count=1695866.6666666667, ans=0.125 2023-10-04 15:02:36,995 WARNING [train.py:1204] (2/4) Exclude cut with ID _32704_210_8954_1_1533017488840_6747370_266-329755-0_sp1.1 from training. Duration: 0.8090625 2023-10-04 15:02:37,217 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.ff2_skip_rate, batch_count=1695933.3333333333, ans=0.0 2023-10-04 15:02:38,814 WARNING [train.py:1204] (2/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_21-174514-0_sp1.1 from training. Duration: 0.3909375 2023-10-04 15:02:40,394 WARNING [train.py:1204] (2/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_409-207378-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 15:02:45,419 WARNING [train.py:1204] (2/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_132-269633-0 from training. Duration: 0.68 2023-10-04 15:02:46,816 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_2245_210_2715_1_1525511548_3540254_310-52483-0_sp0.9 from training. Duration: 0.97775 2023-10-04 15:02:49,536 WARNING [train.py:1204] (2/4) Exclude cut with ID _27430_210_7770_1_1532604689375_1378590_47-77403-0_sp1.1 from training. Duration: 0.92725 2023-10-04 15:02:51,382 WARNING [train.py:1204] (2/4) Exclude cut with ID _7562_210_2084_1_1528887767916_3946229_186-326422-0 from training. Duration: 0.84 2023-10-04 15:02:53,203 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.5.encoder.layers.0.nonlin_attention.whiten2, num_groups=1, num_channels=256, metric=3.22 vs. limit=15.0 2023-10-04 15:02:54,020 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_14056_210_4163_1_1530604483_7572827_230-35993-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 15:02:59,430 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_23854_210_1794_1_1531969696_1329157_57-123491-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 15:03:00,830 WARNING [train.py:1204] (2/4) Exclude cut with ID _14747_210_8341_1_1530685797463_3654089_147-131229-0 from training. Duration: 0.76 2023-10-04 15:03:02,227 WARNING [train.py:1204] (2/4) Exclude cut with ID _30191_210_1664_1_1533189915047_7196670_430-19539-0_sp1.1 from training. Duration: 0.92725 2023-10-04 15:03:02,242 WARNING [train.py:1204] (2/4) Exclude cut with ID _67725_210_27767_1_1535608756671_5105359_44-50762-0_sp1.1 from training. Duration: 0.963625 2023-10-04 15:03:04,984 WARNING [train.py:1204] (2/4) Exclude cut with ID _27485_210_4916_1_1532739191677_3668269_217-161755-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 15:03:05,037 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_5931_210_2040_1_1527428481_4547999_321-193614-0_sp1.1 from training. Duration: 0.6090625 2023-10-04 15:03:05,054 WARNING [train.py:1204] (2/4) Exclude cut with ID _6267_210_2084_1_1527764658526_3894730_375-219887-0 from training. Duration: 0.67 2023-10-04 15:03:07,084 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9087W0022-538393 from training. Duration: 0.768 2023-10-04 15:03:08,543 WARNING [train.py:1204] (2/4) Exclude cut with ID _68777_210_4353_1_1535810390731_3117904_83-196459-0_sp1.1 from training. Duration: 0.963625 2023-10-04 15:03:09,915 WARNING [train.py:1204] (2/4) Exclude cut with ID _69884_210_9771_1_1535846392818_4166169_455-343812-0_sp1.1 from training. Duration: 0.92725 2023-10-04 15:03:09,916 WARNING [train.py:1204] (2/4) Exclude cut with ID _13916_210_9994_1_1530788341126_3763149_414-320174-0_sp1.1 from training. Duration: 0.92725 2023-10-04 15:03:09,920 WARNING [train.py:1204] (2/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_398-239819-0 from training. Duration: 0.99 2023-10-04 15:03:11,960 WARNING [train.py:1204] (2/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_199-232314-0_sp1.1 from training. Duration: 0.92725 2023-10-04 15:03:15,116 WARNING [train.py:1204] (2/4) Exclude cut with ID _2334_210_2667_1_1525608238880_4705129_459-104997-0 from training. Duration: 0.95 2023-10-04 15:03:18,561 WARNING [train.py:1204] (2/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_441-209395-0_sp1.1 from training. Duration: 0.936375 2023-10-04 15:03:20,050 WARNING [train.py:1204] (2/4) Exclude cut with ID _27052_210_5986_1_1532329197187_3585009_320-80393-0_sp1.1 from training. Duration: 0.97275 2023-10-04 15:03:21,228 INFO [train.py:1046] (2/4) Epoch 48, batch 4750, loss[loss=0.1683, simple_loss=0.2389, pruned_loss=0.04887, over 22806.00 frames. ], tot_loss[loss=0.1531, simple_loss=0.2339, pruned_loss=0.03612, over 4735723.66 frames. ], batch size: 322, lr: 2.11e-03, grad_scale: 16.0 2023-10-04 15:03:25,340 WARNING [train.py:1204] (2/4) Exclude cut with ID _21783_210_11357_1_1531742458730_3762679_89-217253-0_sp1.1 from training. Duration: 0.97275 2023-10-04 15:03:25,361 WARNING [train.py:1204] (2/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_453-282588-0_sp1.1 from training. Duration: 0.8909375 2023-10-04 15:03:26,794 WARNING [train.py:1204] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_88-341718-0 from training. Duration: 0.66 2023-10-04 15:03:26,839 WARNING [train.py:1204] (2/4) Exclude cut with ID _43552_210_8913_1_1533726031059_3523429_230-3521-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 15:03:31,021 WARNING [train.py:1204] (2/4) Exclude cut with ID _9679_210_7497_1_1529129212906_4363929_512-313889-0 from training. Duration: 0.81 2023-10-04 15:03:32,217 WARNING [train.py:1204] (2/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_194-252800-0_sp1.1 from training. Duration: 0.8818125 2023-10-04 15:03:33,567 WARNING [train.py:1204] (2/4) Exclude cut with ID _55209_210_1531_1_1534675828996_4597160_239-293541-0_sp1.1 from training. Duration: 0.963625 2023-10-04 15:03:33,619 WARNING [train.py:1204] (2/4) Exclude cut with ID _35799_210_5854_1_1533263487887_3529071_15-325131-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 15:03:38,476 WARNING [train.py:1204] (2/4) Exclude cut with ID _14100_210_8341_1_1530529320613_3678069_279-55760-0 from training. Duration: 0.93 2023-10-04 15:03:42,010 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_489-230703-0_sp1.1 from training. Duration: 0.77275 2023-10-04 15:03:43,281 WARNING [train.py:1204] (2/4) Exclude cut with ID _6694_210_4599_1_1527852637490_3965529_144-185051-0 from training. Duration: 0.96 2023-10-04 15:03:44,600 WARNING [train.py:1204] (2/4) Exclude cut with ID _28461_210_2253_1_1532771775189_3041299_1-228155-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 15:03:49,367 WARNING [train.py:1204] (2/4) Exclude cut with ID _32704_210_8954_1_1533017488840_6747370_686-252323-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 15:03:49,369 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_40896_210_15710_1_1533433956_7100802_1075-294633-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 15:03:49,380 WARNING [train.py:1204] (2/4) Exclude cut with ID _22083_210_8254_1_1531702862439_3678970_30-92879-0_sp1.1 from training. Duration: 0.97275 2023-10-04 15:03:49,464 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9245W0121-616888 from training. Duration: 0.896 2023-10-04 15:03:49,466 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6786_210_1388_1_1527839699_4194495_378-46070-0 from training. Duration: 0.72 2023-10-04 15:03:56,695 WARNING [train.py:1204] (2/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_317-175062-0 from training. Duration: 0.82 2023-10-04 15:03:59,415 WARNING [train.py:1204] (2/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_238-89390-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 15:04:02,135 WARNING [train.py:1204] (2/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_524-310811-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 15:04:04,880 WARNING [train.py:1204] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_307-158462-0_sp0.9 from training. Duration: 0.7666875 2023-10-04 15:04:04,881 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9309W0191-640318 from training. Duration: 0.512 2023-10-04 15:04:04,885 WARNING [train.py:1204] (2/4) Exclude cut with ID _55153_210_2253_1_1534384786677_3162096_100-204617-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 15:04:06,361 WARNING [train.py:1204] (2/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_112-149213-0_sp1.1 from training. Duration: 0.863625 2023-10-04 15:04:09,497 WARNING [train.py:1204] (2/4) Exclude cut with ID _67831_210_12392_1_1535612464202_3092612_346-2148-0_sp1.1 from training. Duration: 0.8454375 2023-10-04 15:04:10,842 WARNING [train.py:1204] (2/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_51-117576-0_sp0.9 from training. Duration: 0.4444375 2023-10-04 15:04:12,197 WARNING [train.py:1204] (2/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_300-27235-0 from training. Duration: 0.87 2023-10-04 15:04:13,458 WARNING [train.py:1204] (2/4) Exclude cut with ID _40399_210_4353_1_1533305658194_3279406_195-116825-0_sp1.1 from training. Duration: 0.97275 2023-10-04 15:04:13,481 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7002_210_1794_1_1527939683_5157286_163-87552-0_sp0.9 from training. Duration: 0.9555625 2023-10-04 15:04:13,529 WARNING [train.py:1204] (2/4) Exclude cut with ID _19514_210_9259_1_1531530071330_5084230_560-237597-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 15:04:15,399 WARNING [train.py:1204] (2/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_41-234353-0_sp1.1 from training. Duration: 0.6181875 2023-10-04 15:04:15,417 WARNING [train.py:1204] (2/4) Exclude cut with ID _6274_210_2084_1_1527937583837_7494020_769-168302-0 from training. Duration: 0.71 2023-10-04 15:04:18,173 WARNING [train.py:1204] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_574-45541-0 from training. Duration: 0.65 2023-10-04 15:04:20,981 WARNING [train.py:1204] (2/4) Exclude cut with ID _50310_210_15488_1_1534068043345_3641080_262-67120-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 15:04:24,199 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_53071_210_16554_1_1534493434_1276796_35-11927-0_sp1.1 from training. Duration: 0.963625 2023-10-04 15:04:24,200 WARNING [train.py:1204] (2/4) Exclude cut with ID _3992_210_1747_1_1526644816031_3731060_107-251434-0 from training. Duration: 0.99 2023-10-04 15:04:25,540 WARNING [train.py:1204] (2/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_412-54488-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 15:04:27,314 WARNING [train.py:1204] (2/4) Exclude cut with ID _18516_210_9206_1_1531704653206_3692406_77-258490-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 15:04:28,811 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_40047_210_2290_1_1533290519_2374311_195-165693-0_sp0.9 from training. Duration: 0.988875 2023-10-04 15:04:30,169 WARNING [train.py:1204] (2/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_296-36971-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 15:04:31,584 WARNING [train.py:1204] (2/4) Exclude cut with ID _14970_210_15709_1_1530773543493_1715730_187-20259-0_sp1.1 from training. Duration: 0.8090625 2023-10-04 15:04:33,093 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_53373_210_5399_1_1534229471_4715695_135-305927-0_sp1.1 from training. Duration: 0.936375 2023-10-04 15:04:33,130 WARNING [train.py:1204] (2/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_60-18123-0 from training. Duration: 0.88 2023-10-04 15:04:34,293 INFO [train.py:1046] (2/4) Epoch 48, batch 4800, loss[loss=0.1714, simple_loss=0.2557, pruned_loss=0.04353, over 24634.00 frames. ], tot_loss[loss=0.1541, simple_loss=0.2347, pruned_loss=0.0368, over 4724235.76 frames. ], batch size: 73, lr: 2.10e-03, grad_scale: 16.0 2023-10-04 15:04:34,416 WARNING [train.py:1204] (2/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_166-166990-0 from training. Duration: 0.96 2023-10-04 15:04:37,113 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_5438_210_1794_1_1527336687_2600009_233-58072-0 from training. Duration: 0.77 2023-10-04 15:04:37,380 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.conv_module1.balancer2.prob, batch_count=1696466.6666666667, ans=0.125 2023-10-04 15:04:38,645 WARNING [train.py:1204] (2/4) Exclude cut with ID _8833_210_5219_1_1528592665210_7553059_611-80313-0_sp0.9 from training. Duration: 0.711125 2023-10-04 15:04:38,667 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_53555_210_5632_1_1534595220_7232297_118-43239-0_sp1.1 from training. Duration: 0.936375 2023-10-04 15:04:40,051 WARNING [train.py:1204] (2/4) Exclude cut with ID _12369_210_4381_1_1530356078581_4388520_480-13146-0 from training. Duration: 0.85 2023-10-04 15:04:41,696 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.bypass.scale_min, batch_count=1696466.6666666667, ans=0.2 2023-10-04 15:04:44,585 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_892-28814-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 15:04:44,646 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_24899_210_16554_1_1532046422_3827462_160-21255-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 15:04:49,265 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_154-316450-0_sp1.1 from training. Duration: 0.6909375 2023-10-04 15:04:50,602 WARNING [train.py:1204] (2/4) Exclude cut with ID _30196_210_1664_1_1533708496522_7074140_1225-301232-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 15:04:51,839 WARNING [train.py:1204] (2/4) Exclude cut with ID _28783_210_7943_1_1532671164808_7155479_414-208100-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 15:04:51,876 WARNING [train.py:1204] (2/4) Exclude cut with ID _19023_210_4353_1_1531378892552_5820780_83-254794-0 from training. Duration: 0.86 2023-10-04 15:04:51,958 WARNING [train.py:1204] (2/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_591-83752-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 15:04:53,321 WARNING [train.py:1204] (2/4) Exclude cut with ID _29877_210_6228_1_1532397791106_5276149_266-62291-0_sp1.1 from training. Duration: 0.7909375 2023-10-04 15:04:55,216 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.610e+02 2.147e+02 2.492e+02 2.796e+02 5.306e+02, threshold=4.985e+02, percent-clipped=1.0 2023-10-04 15:04:55,307 WARNING [train.py:1204] (2/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_280-161633-0_sp1.1 from training. Duration: 0.663625 2023-10-04 15:04:59,895 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7543_210_6753_1_1528539468_333764_39-156634-0_sp1.1 from training. Duration: 0.97275 2023-10-04 15:05:01,356 WARNING [train.py:1204] (2/4) Exclude cut with ID _67499_210_17282_1_1535509805220_3692430_505-312248-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 15:05:01,396 WARNING [train.py:1204] (2/4) Exclude cut with ID _5294_210_1747_1_1527249643928_3696179_180-100713-0_sp0.9 from training. Duration: 0.9 2023-10-04 15:05:04,192 WARNING [train.py:1204] (2/4) Exclude cut with ID _26528_210_9070_1_1532570410842_3851310_508-148132-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 15:05:04,203 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_211-339622-0_sp1.1 from training. Duration: 0.4090625 2023-10-04 15:05:04,216 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_40896_210_15710_1_1533433956_7100802_339-138356-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 15:05:04,286 WARNING [train.py:1204] (2/4) Exclude cut with ID _27060_210_5986_1_1532674838922_3522549_211-19933-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 15:05:06,930 WARNING [train.py:1204] (2/4) Exclude cut with ID _60229_210_27767_1_1534838406158_3655350_457-117923-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 15:05:08,392 WARNING [train.py:1204] (2/4) Exclude cut with ID _55429_210_10626_1_1534467588527_7174143_460-141439-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 15:05:10,254 WARNING [train.py:1204] (2/4) Exclude cut with ID _59170_210_6721_1_1534983633960_4203335_263-10780-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 15:05:10,266 WARNING [train.py:1204] (2/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_872-264525-0_sp0.9 from training. Duration: 0.72225 2023-10-04 15:05:12,871 WARNING [train.py:1204] (2/4) Exclude cut with ID _7624_210_1531_1_1528109802198_4933101_418-349834-0_sp0.9 from training. Duration: 0.6666875 2023-10-04 15:05:12,971 WARNING [train.py:1204] (2/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_27-53998-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 15:05:15,680 WARNING [train.py:1204] (2/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_202-183922-0 from training. Duration: 0.85 2023-10-04 15:05:15,700 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7002_210_1794_1_1527939683_5157286_1-281264-0 from training. Duration: 0.44 2023-10-04 15:05:17,429 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_53753_210_7234_1_1534320003_3594148_493-9314-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 15:05:17,442 WARNING [train.py:1204] (2/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_217-95056-0_sp1.1 from training. Duration: 0.8909375 2023-10-04 15:05:17,493 WARNING [train.py:1204] (2/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_406-45086-0_sp1.1 from training. Duration: 0.72725 2023-10-04 15:05:17,499 WARNING [train.py:1204] (2/4) Exclude cut with ID _31649_210_6228_1_1532584608063_4091078_505-213821-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 15:05:17,507 WARNING [train.py:1204] (2/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_317-175062-0_sp0.9 from training. Duration: 0.911125 2023-10-04 15:05:20,213 WARNING [train.py:1204] (2/4) Exclude cut with ID _5444_210_2151_1_1527418808464_5312210_726-27165-0_sp1.1 from training. Duration: 0.6090625 2023-10-04 15:05:21,513 WARNING [train.py:1204] (2/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_494-170437-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 15:05:23,109 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_474-64054-0_sp1.1 from training. Duration: 0.936375 2023-10-04 15:05:24,681 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.conv_skip_rate, batch_count=1696666.6666666667, ans=0.0 2023-10-04 15:05:26,487 WARNING [train.py:1204] (2/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_521-7556-0_sp1.1 from training. Duration: 0.97275 2023-10-04 15:05:28,453 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_269-348505-0_sp1.1 from training. Duration: 0.92725 2023-10-04 15:05:31,199 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.conv_module2.balancer1.prob, batch_count=1696666.6666666667, ans=0.125 2023-10-04 15:05:32,520 WARNING [train.py:1204] (2/4) Exclude cut with ID _21878_210_3525_1_1531738772002_7264330_559-184862-0 from training. Duration: 0.94 2023-10-04 15:05:33,830 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_68702_210_23689_1_1535799577_3420068_92-76040-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 15:05:33,871 WARNING [train.py:1204] (2/4) Exclude cut with ID _22826_210_9994_1_1531908201522_3279169_227-102052-0_sp1.1 from training. Duration: 0.97275 2023-10-04 15:05:33,909 WARNING [train.py:1204] (2/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_718-250907-0_sp0.9 from training. Duration: 0.7666875 2023-10-04 15:05:35,295 WARNING [train.py:1204] (2/4) Exclude cut with ID _14069_210_5632_1_1530512692033_4803390_352-143891-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 15:05:38,054 WARNING [train.py:1204] (2/4) Exclude cut with ID _6267_210_2084_1_1527764658526_3894730_65-106602-0_sp1.1 from training. Duration: 0.936375 2023-10-04 15:05:40,700 WARNING [train.py:1204] (2/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_202-183922-0_sp0.9 from training. Duration: 0.9444375 2023-10-04 15:05:40,714 WARNING [train.py:1204] (2/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_465-268827-0_sp1.1 from training. Duration: 0.97275 2023-10-04 15:05:40,768 WARNING [train.py:1204] (2/4) Exclude cut with ID _12369_210_4381_1_1530356078581_4388520_58-29064-0_sp1.1 from training. Duration: 0.9 2023-10-04 15:05:42,721 WARNING [train.py:1204] (2/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_1-99680-0_sp0.9 from training. Duration: 0.7444375 2023-10-04 15:05:42,777 WARNING [train.py:1204] (2/4) Exclude cut with ID _13253_210_1925_1_1530757775819_3577873_191-144977-0_sp1.1 from training. Duration: 0.7090625 2023-10-04 15:05:46,855 WARNING [train.py:1204] (2/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_276-112684-0_sp1.1 from training. Duration: 0.92725 2023-10-04 15:05:46,864 WARNING [train.py:1204] (2/4) Exclude cut with ID _64639_210_5816_1_1535367619437_3528000_358-411-0_sp1.1 from training. Duration: 0.97275 2023-10-04 15:05:46,874 WARNING [train.py:1204] (2/4) Exclude cut with ID _31649_210_6228_1_1532584608063_4091078_106-21199-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 15:05:48,161 INFO [train.py:1046] (2/4) Epoch 48, batch 4850, loss[loss=0.1592, simple_loss=0.2409, pruned_loss=0.03875, over 23294.00 frames. ], tot_loss[loss=0.1539, simple_loss=0.2344, pruned_loss=0.03672, over 4724719.37 frames. ], batch size: 93, lr: 2.10e-03, grad_scale: 16.0 2023-10-04 15:05:48,891 WARNING [train.py:1204] (2/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_901-79438-0 from training. Duration: 0.94 2023-10-04 15:05:51,589 WARNING [train.py:1204] (2/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_718-250907-0 from training. Duration: 0.69 2023-10-04 15:05:51,594 WARNING [train.py:1204] (2/4) Exclude cut with ID _5287_210_2084_1_1527418883040_3878670_363-194161-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 15:05:51,598 WARNING [train.py:1204] (2/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_586-321321-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 15:05:51,672 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_68900_210_2087_1_1535713157_3708529_164-292661-0_sp1.1 from training. Duration: 0.963625 2023-10-04 15:05:51,677 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_26433_210_12108_1_1532157158_4056480_114-144878-0_sp1.1 from training. Duration: 0.97275 2023-10-04 15:05:55,655 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_15517_210_6753_1_1530954008_3487336_96-341874-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 15:06:02,152 WARNING [train.py:1204] (2/4) Exclude cut with ID _14268_210_9206_1_1530622771285_4714099_637-86630-0 from training. Duration: 0.51 2023-10-04 15:06:03,582 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_28211_210_6388_1_1532861506_4523224_121-320092-0_sp1.1 from training. Duration: 0.92725 2023-10-04 15:06:09,067 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_141-89113-0_sp0.9 from training. Duration: 0.9555625 2023-10-04 15:06:09,157 WARNING [train.py:1204] (2/4) Exclude cut with ID _6789_210_1534_1_1527857937218_3118950_311-61262-0_sp1.1 from training. Duration: 0.5909375 2023-10-04 15:06:10,513 WARNING [train.py:1204] (2/4) Exclude cut with ID _58480_210_12392_1_1534811121111_3646160_389-200461-0_sp1.1 from training. Duration: 0.97275 2023-10-04 15:06:13,361 WARNING [train.py:1204] (2/4) Exclude cut with ID _41326_210_5854_1_1533778076509_3492080_28-156553-0_sp1.1 from training. Duration: 0.92725 2023-10-04 15:06:13,441 WARNING [train.py:1204] (2/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_577-287150-0_sp1.1 from training. Duration: 0.7454375 2023-10-04 15:06:15,354 WARNING [train.py:1204] (2/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_40-129756-0_sp1.1 from training. Duration: 0.736375 2023-10-04 15:06:15,364 WARNING [train.py:1204] (2/4) Exclude cut with ID _60113_210_16271_1_1535164212481_4386079_490-280290-0 from training. Duration: 0.98 2023-10-04 15:06:18,279 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.ff2_skip_rate, batch_count=1696933.3333333333, ans=0.0 2023-10-04 15:06:19,462 WARNING [train.py:1204] (2/4) Exclude cut with ID _58059_210_5854_1_1534901471149_3164808_34-39905-0_sp1.1 from training. Duration: 0.963625 2023-10-04 15:06:21,451 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_791-197891-0_sp1.1 from training. Duration: 0.763625 2023-10-04 15:06:22,684 WARNING [train.py:1204] (2/4) Exclude cut with ID _6789_210_1534_1_1527857937218_3118950_262-48853-0_sp1.1 from training. Duration: 0.5090625 2023-10-04 15:06:22,748 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_2655_210_2087_1_1525688959_3897400_52-279899-0_sp0.9 from training. Duration: 0.7666875 2023-10-04 15:06:22,752 WARNING [train.py:1204] (2/4) Exclude cut with ID _14884_210_2252_1_1530707459689_5561250_114-201399-0 from training. Duration: 0.76 2023-10-04 15:06:25,593 WARNING [train.py:1204] (2/4) Exclude cut with ID _19023_210_4353_1_1531378892552_5820780_83-254794-0_sp0.9 from training. Duration: 0.9555625 2023-10-04 15:06:25,615 WARNING [train.py:1204] (2/4) Exclude cut with ID _53489_210_6228_1_1534298357169_3835970_10-297919-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 15:06:25,777 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=1696933.3333333333, ans=0.1 2023-10-04 15:06:30,147 WARNING [train.py:1204] (2/4) Exclude cut with ID _44585_210_18628_1_1533605350387_4518199_418-3055-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 15:06:30,160 WARNING [train.py:1204] (2/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_310-109461-0 from training. Duration: 0.8 2023-10-04 15:06:30,199 WARNING [train.py:1204] (2/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_235-262124-0 from training. Duration: 0.67 2023-10-04 15:06:32,098 WARNING [train.py:1204] (2/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_195-167011-0_sp1.1 from training. Duration: 0.6818125 2023-10-04 15:06:37,779 WARNING [train.py:1204] (2/4) Exclude cut with ID _7852_210_5219_1_1528286471316_8139400_188-55162-0_sp1.1 from training. Duration: 0.936375 2023-10-04 15:06:37,828 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_35361_210_4238_1_1533262810_7793476_675-87012-0 from training. Duration: 0.89 2023-10-04 15:06:39,204 WARNING [train.py:1204] (2/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_465-81427-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 15:06:39,211 WARNING [train.py:1204] (2/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_597-154640-0_sp1.1 from training. Duration: 0.8181875 2023-10-04 15:06:40,594 WARNING [train.py:1204] (2/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_186-309161-0_sp0.9 from training. Duration: 0.6 2023-10-04 15:06:40,691 WARNING [train.py:1204] (2/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_546-119312-0 from training. Duration: 0.99 2023-10-04 15:06:40,695 WARNING [train.py:1204] (2/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_330-240060-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 15:06:42,683 WARNING [train.py:1204] (2/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_477-255782-0 from training. Duration: 0.82 2023-10-04 15:06:43,979 WARNING [train.py:1204] (2/4) Exclude cut with ID _14970_210_15709_1_1530773543493_1715730_150-248939-0_sp1.1 from training. Duration: 0.97275 2023-10-04 15:06:44,583 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.3.encoder.layers.2.whiten, num_groups=1, num_channels=512, metric=4.79 vs. limit=12.0 2023-10-04 15:06:45,311 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_556-206615-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 15:06:45,369 WARNING [train.py:1204] (2/4) Exclude cut with ID _3465_210_3525_1_1526175667060_3574049_308-231735-0 from training. Duration: 0.73 2023-10-04 15:06:52,241 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.2.encoder.layers.1.self_attn1.whiten, num_groups=1, num_channels=384, metric=15.53 vs. limit=22.5 2023-10-04 15:06:54,180 WARNING [train.py:1204] (2/4) Exclude cut with ID _27485_210_4916_1_1532739191677_3668269_66-236571-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 15:06:58,301 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.0.feed_forward1.hidden_balancer.prob, batch_count=1697066.6666666667, ans=0.125 2023-10-04 15:06:58,331 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.conv_module1.balancer2.prob, batch_count=1697066.6666666667, ans=0.125 2023-10-04 15:07:00,051 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_2221_210_1988_1_1525597051_4935541_415-296882-0_sp0.9 from training. Duration: 0.8555625 2023-10-04 15:07:00,058 WARNING [train.py:1204] (2/4) Exclude cut with ID _64521_210_14699_1_1535243427259_3815328_231-78630-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 15:07:00,357 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.conv_module1.balancer2.prob, batch_count=1697066.6666666667, ans=0.125 2023-10-04 15:07:03,389 INFO [train.py:1046] (2/4) Epoch 48, batch 4900, loss[loss=0.1576, simple_loss=0.2349, pruned_loss=0.04015, over 23840.00 frames. ], tot_loss[loss=0.153, simple_loss=0.2334, pruned_loss=0.03626, over 4725300.25 frames. ], batch size: 195, lr: 2.10e-03, grad_scale: 16.0 2023-10-04 15:07:06,410 WARNING [train.py:1204] (2/4) Exclude cut with ID _13942_210_9988_1_1530604906036_3733360_499-55676-0 from training. Duration: 0.62 2023-10-04 15:07:06,412 WARNING [train.py:1204] (2/4) Exclude cut with ID _49376_210_5986_1_1533985278732_3370740_301-154763-0_sp1.1 from training. Duration: 0.8909375 2023-10-04 15:07:11,948 WARNING [train.py:1204] (2/4) Exclude cut with ID _27485_210_4916_1_1532739191677_3668269_255-216698-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 15:07:12,032 WARNING [train.py:1204] (2/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_137-334143-0_sp1.1 from training. Duration: 0.97275 2023-10-04 15:07:12,051 WARNING [train.py:1204] (2/4) Exclude cut with ID _6456_210_1534_1_1527681634212_3181049_215-309359-0_sp1.1 from training. Duration: 0.8 2023-10-04 15:07:12,220 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.conv_module2.balancer2.prob, batch_count=1697133.3333333333, ans=0.125 2023-10-04 15:07:15,361 WARNING [train.py:1204] (2/4) Exclude cut with ID _65640_210_6310_1_1535368201445_3173250_209-13008-0 from training. Duration: 0.99 2023-10-04 15:07:16,992 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.attention_skip_rate, batch_count=1697200.0, ans=0.0 2023-10-04 15:07:20,834 WARNING [train.py:1204] (2/4) Exclude cut with ID _12369_210_4381_1_1530356078581_4388520_58-29064-0 from training. Duration: 0.99 2023-10-04 15:07:24,146 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.777e+02 2.053e+02 2.279e+02 2.584e+02 3.777e+02, threshold=4.559e+02, percent-clipped=0.0 2023-10-04 15:07:24,300 WARNING [train.py:1204] (2/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_410-287857-0 from training. Duration: 0.93 2023-10-04 15:07:24,383 WARNING [train.py:1204] (2/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_153-153537-0 from training. Duration: 0.91 2023-10-04 15:07:25,651 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7544_210_6753_1_1528587722_3576686_172-171750-0_sp0.9 from training. Duration: 0.72225 2023-10-04 15:07:25,682 WARNING [train.py:1204] (2/4) Exclude cut with ID _64521_210_14699_1_1535243427259_3815328_464-183853-0_sp1.1 from training. Duration: 0.97275 2023-10-04 15:07:25,708 WARNING [train.py:1204] (2/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_385-210308-0_sp1.1 from training. Duration: 0.963625 2023-10-04 15:07:25,722 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_46013_210_7234_1_1533776362_3744032_354-261865-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 15:07:25,729 WARNING [train.py:1204] (2/4) Exclude cut with ID _3067_210_2667_1_1526212974640_4503168_250-285937-0_sp1.1 from training. Duration: 0.72725 2023-10-04 15:07:27,073 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_40036_210_13405_1_1533648647_1315388_48-146016-0 from training. Duration: 0.93 2023-10-04 15:07:28,637 WARNING [train.py:1204] (2/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_280-161633-0 from training. Duration: 0.73 2023-10-04 15:07:30,417 WARNING [train.py:1204] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_308-161067-0_sp0.9 from training. Duration: 0.7333125 2023-10-04 15:07:30,522 WARNING [train.py:1204] (2/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_68-212801-0_sp0.9 from training. Duration: 0.811125 2023-10-04 15:07:32,395 WARNING [train.py:1204] (2/4) Exclude cut with ID _6789_210_1534_1_1527857937218_3118950_311-61262-0_sp0.9 from training. Duration: 0.72225 2023-10-04 15:07:34,900 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.0.layers.0.self_attn_weights.whiten_keys, num_groups=4, num_channels=128, metric=2.67 vs. limit=6.0 2023-10-04 15:07:35,149 WARNING [train.py:1204] (2/4) Exclude cut with ID _14632_210_9988_1_1530691279392_3923200_102-204477-0_sp1.1 from training. Duration: 0.8818125 2023-10-04 15:07:35,212 WARNING [train.py:1204] (2/4) Exclude cut with ID _65242_210_18628_1_1535369393717_3823149_290-117517-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 15:07:36,629 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_69630_210_7234_1_1535788670_910989_35-337140-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 15:07:36,637 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_5618_210_1874_1_1527311008_5851_1-214501-0 from training. Duration: 0.689 2023-10-04 15:07:36,863 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.conv_module2.balancer2.prob, batch_count=1697266.6666666667, ans=0.125 2023-10-04 15:07:37,975 WARNING [train.py:1204] (2/4) Exclude cut with ID _8064_210_5399_1_1528372299291_4282973_168-50337-0_sp0.9 from training. Duration: 0.9333125 2023-10-04 15:07:39,334 WARNING [train.py:1204] (2/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_185-14438-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 15:07:39,356 WARNING [train.py:1204] (2/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_61-327062-0 from training. Duration: 0.81 2023-10-04 15:07:39,361 WARNING [train.py:1204] (2/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_98-741-0 from training. Duration: 0.8 2023-10-04 15:07:45,411 WARNING [train.py:1204] (2/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_312-204798-0 from training. Duration: 0.93 2023-10-04 15:07:45,693 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.conv_module2.balancer1.prob, batch_count=1697266.6666666667, ans=0.125 2023-10-04 15:07:47,103 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.nonlin_attention.balancer.prob, batch_count=1697333.3333333333, ans=0.125 2023-10-04 15:07:48,488 WARNING [train.py:1204] (2/4) Exclude cut with ID _14998_210_5219_1_1530705582080_7474970_787-294416-0_sp0.9 from training. Duration: 0.911125 2023-10-04 15:07:48,586 WARNING [train.py:1204] (2/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_140-181176-0_sp1.1 from training. Duration: 0.836375 2023-10-04 15:07:49,905 WARNING [train.py:1204] (2/4) Exclude cut with ID _14687_210_5854_1_1530703838259_3664889_115-239046-0_sp1.1 from training. Duration: 0.6090625 2023-10-04 15:07:49,956 WARNING [train.py:1204] (2/4) Exclude cut with ID _56423_210_5854_1_1534896061049_3165969_370-230614-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 15:07:50,013 WARNING [train.py:1204] (2/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_406-275649-0_sp0.9 from training. Duration: 0.4666875 2023-10-04 15:07:50,025 WARNING [train.py:1204] (2/4) Exclude cut with ID _15150_210_8341_1_1530752411803_7256384_22-138274-0_sp1.1 from training. Duration: 0.87275 2023-10-04 15:07:51,392 WARNING [train.py:1204] (2/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_125-125818-0 from training. Duration: 0.96 2023-10-04 15:07:53,423 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_4399_210_1874_1_1526705485_2400757_220-47974-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 15:07:56,113 WARNING [train.py:1204] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_308-161067-0_sp1.1 from training. Duration: 0.6 2023-10-04 15:07:57,427 WARNING [train.py:1204] (2/4) Exclude cut with ID _53489_210_6228_1_1534298357169_3835970_102-57645-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 15:08:00,065 WARNING [train.py:1204] (2/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_178-177582-0 from training. Duration: 0.74 2023-10-04 15:08:01,872 WARNING [train.py:1204] (2/4) Exclude cut with ID _14349_210_8341_1_1530615710818_3442470_170-336541-0_sp1.1 from training. Duration: 0.7818125 2023-10-04 15:08:01,942 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_521-214276-0_sp0.9 from training. Duration: 0.488875 2023-10-04 15:08:02,547 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.4.encoder.layers.2.self_attn2.whiten, num_groups=1, num_channels=384, metric=11.46 vs. limit=22.5 2023-10-04 15:08:03,825 WARNING [train.py:1204] (2/4) Exclude cut with ID _66070_210_7640_1_1535441393096_7805310_489-292631-0 from training. Duration: 0.79 2023-10-04 15:08:08,203 WARNING [train.py:1204] (2/4) Exclude cut with ID _14069_210_5632_1_1530512692033_4803390_438-340023-0_sp1.1 from training. Duration: 0.936375 2023-10-04 15:08:09,593 WARNING [train.py:1204] (2/4) Exclude cut with ID _7852_210_5219_1_1528286471316_8139400_242-105336-0_sp1.1 from training. Duration: 0.6545625 2023-10-04 15:08:10,964 WARNING [train.py:1204] (2/4) Exclude cut with ID _3867_210_1883_1_1526641200575_4475830_272-215563-0 from training. Duration: 0.57 2023-10-04 15:08:10,971 WARNING [train.py:1204] (2/4) Exclude cut with ID _14368_210_4220_1_1530532714870_3595529_143-117421-0_sp0.9 from training. Duration: 0.6444375 2023-10-04 15:08:10,977 WARNING [train.py:1204] (2/4) Exclude cut with ID _9235_210_5219_1_1528891154253_8046120_30-326681-0_sp1.1 from training. Duration: 0.7545625 2023-10-04 15:08:12,352 WARNING [train.py:1204] (2/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_538-218448-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 15:08:18,226 INFO [train.py:1046] (2/4) Epoch 48, batch 4950, loss[loss=0.1365, simple_loss=0.2109, pruned_loss=0.03106, over 23663.00 frames. ], tot_loss[loss=0.1522, simple_loss=0.2326, pruned_loss=0.0359, over 4721038.30 frames. ], batch size: 232, lr: 2.10e-03, grad_scale: 16.0 2023-10-04 15:08:18,280 WARNING [train.py:1204] (2/4) Exclude cut with ID _33640_210_2655_1_1533117605479_4855469_60-192970-0_sp1.1 from training. Duration: 0.963625 2023-10-04 15:08:18,287 WARNING [train.py:1204] (2/4) Exclude cut with ID _65585_210_12210_1_1535450451584_3764098_112-294301-0_sp1.1 from training. Duration: 0.8 2023-10-04 15:08:18,318 WARNING [train.py:1204] (2/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_643-37412-0_sp1.1 from training. Duration: 0.936375 2023-10-04 15:08:18,335 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_9302_210_2550_1_1528889962_4655416_638-301620-0_sp1.1 from training. Duration: 0.42725 2023-10-04 15:08:21,346 WARNING [train.py:1204] (2/4) Exclude cut with ID _3867_210_1883_1_1526641200575_4475830_272-215563-0_sp1.1 from training. Duration: 0.5181875 2023-10-04 15:08:24,544 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_53679_210_4852_1_1534556726_4186141_358-154143-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 15:08:24,562 WARNING [train.py:1204] (2/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_841-341679-0_sp0.9 from training. Duration: 0.6444375 2023-10-04 15:08:27,356 WARNING [train.py:1204] (2/4) Exclude cut with ID _7160_210_1876_1_1527940923231_3703799_302-150728-0 from training. Duration: 0.78 2023-10-04 15:08:27,383 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_125-203105-0 from training. Duration: 0.8 2023-10-04 15:08:27,403 WARNING [train.py:1204] (2/4) Exclude cut with ID _5585_210_2667_1_1527418813324_8007340_89-332072-0_sp0.9 from training. Duration: 0.711125 2023-10-04 15:08:28,734 WARNING [train.py:1204] (2/4) Exclude cut with ID _3992_210_1747_1_1526644816031_3731060_36-147855-0 from training. Duration: 0.78 2023-10-04 15:08:28,751 WARNING [train.py:1204] (2/4) Exclude cut with ID _59256_210_5986_1_1535076085997_3589120_172-239045-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 15:08:28,760 WARNING [train.py:1204] (2/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_472-216663-0_sp1.1 from training. Duration: 0.836375 2023-10-04 15:08:28,829 WARNING [train.py:1204] (2/4) Exclude cut with ID _36512_210_6987_1_1533033143998_3126069_181-306500-0_sp1.1 from training. Duration: 0.863625 2023-10-04 15:08:30,177 WARNING [train.py:1204] (2/4) Exclude cut with ID _56524_210_9642_1_1534572011779_3619170_492-95008-0_sp1.1 from training. Duration: 0.97275 2023-10-04 15:08:31,532 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_33348_210_5632_1_1533086912_3324030_119-90326-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 15:08:31,575 WARNING [train.py:1204] (2/4) Exclude cut with ID _14368_210_4220_1_1530532714870_3595529_231-137892-0_sp1.1 from training. Duration: 0.9 2023-10-04 15:08:35,297 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8453_210_4211_1_1528502940_8562730_596-306820-0_sp1.1 from training. Duration: 0.8818125 2023-10-04 15:08:36,712 WARNING [train.py:1204] (2/4) Exclude cut with ID _27052_210_5986_1_1532329197187_3585009_50-242750-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 15:08:37,293 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.4.encoder.layers.1.self_attn1.whiten, num_groups=1, num_channels=384, metric=12.28 vs. limit=22.5 2023-10-04 15:08:38,026 WARNING [train.py:1204] (2/4) Exclude cut with ID _13913_210_9994_1_1530579615187_3383897_358-98435-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 15:08:39,361 WARNING [train.py:1204] (2/4) Exclude cut with ID _27013_210_1389_1_1532437173240_3252649_399-229778-0_sp1.1 from training. Duration: 0.963625 2023-10-04 15:08:42,228 WARNING [train.py:1204] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_185-174805-0_sp0.9 from training. Duration: 0.7555625 2023-10-04 15:08:46,299 WARNING [train.py:1204] (2/4) Exclude cut with ID _65585_210_12210_1_1535450451584_3764098_196-221014-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 15:08:46,416 WARNING [train.py:1204] (2/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_202-293523-0_sp1.1 from training. Duration: 0.6545625 2023-10-04 15:08:48,458 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8275_210_4198_1_1528455320_3892571_213-127634-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 15:08:49,723 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_40896_210_15710_1_1533433956_7100802_921-231678-0_sp1.1 from training. Duration: 0.97275 2023-10-04 15:08:49,834 WARNING [train.py:1204] (2/4) Exclude cut with ID _38020_210_5986_1_1533112252259_7006089_493-102745-0_sp1.1 from training. Duration: 0.8909375 2023-10-04 15:08:51,449 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6846_210_6753_1_1527850767_4295158_2-315073-0 from training. Duration: 0.88 2023-10-04 15:08:53,258 WARNING [train.py:1204] (2/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_249-281349-0 from training. Duration: 0.85 2023-10-04 15:08:55,883 WARNING [train.py:1204] (2/4) Exclude cut with ID _18513_210_9206_1_1531618215223_3862620_314-83429-0_sp1.1 from training. Duration: 0.97275 2023-10-04 15:08:57,352 WARNING [train.py:1204] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_215-202973-0_sp0.9 from training. Duration: 0.97775 2023-10-04 15:08:57,362 WARNING [train.py:1204] (2/4) Exclude cut with ID _7719_210_3525_1_1528456153163_4296960_88-250259-0_sp1.1 from training. Duration: 0.763625 2023-10-04 15:08:57,467 WARNING [train.py:1204] (2/4) Exclude cut with ID _7606_210_4915_1_1528894253277_5080690_301-10732-0_sp1.1 from training. Duration: 0.736375 2023-10-04 15:08:57,484 WARNING [train.py:1204] (2/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_818-11907-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 15:08:58,807 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_5959_210_1794_1_1527400400_3445726_412-44052-0_sp1.1 from training. Duration: 0.7 2023-10-04 15:09:01,571 WARNING [train.py:1204] (2/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_6-279195-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 15:09:05,161 WARNING [train.py:1204] (2/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_252-216674-0_sp0.9 from training. Duration: 0.6 2023-10-04 15:09:06,617 WARNING [train.py:1204] (2/4) Exclude cut with ID _14368_210_4220_1_1530532714870_3595529_144-3287-0_sp1.1 from training. Duration: 0.6090625 2023-10-04 15:09:07,990 WARNING [train.py:1204] (2/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_342-3879-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 15:09:08,014 WARNING [train.py:1204] (2/4) Exclude cut with ID _33640_210_2655_1_1533117605479_4855469_584-65630-0_sp1.1 from training. Duration: 0.97275 2023-10-04 15:09:09,434 WARNING [train.py:1204] (2/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_37-189262-0 from training. Duration: 0.98 2023-10-04 15:09:10,638 WARNING [train.py:1204] (2/4) Exclude cut with ID _9389_210_1664_1_1529217337301_5289269_379-128794-0_sp0.9 from training. Duration: 0.9444375 2023-10-04 15:09:11,946 WARNING [train.py:1204] (2/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_119-207162-0_sp1.1 from training. Duration: 0.6454375 2023-10-04 15:09:16,117 WARNING [train.py:1204] (2/4) Exclude cut with ID _2941_210_2577_1_1526199673150_2489759_200-333643-0_sp1.1 from training. Duration: 0.936375 2023-10-04 15:09:17,504 WARNING [train.py:1204] (2/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_479-125611-0_sp1.1 from training. Duration: 0.87275 2023-10-04 15:09:17,515 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_3475_210_2028_1_1526193578_3622569_64-297072-0_sp1.1 from training. Duration: 0.82725 2023-10-04 15:09:17,566 WARNING [train.py:1204] (2/4) Exclude cut with ID _53565_210_10635_1_1534337218335_3820259_139-346291-0_sp1.1 from training. Duration: 0.97275 2023-10-04 15:09:18,906 WARNING [train.py:1204] (2/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_588-125165-0_sp1.1 from training. Duration: 0.7545625 2023-10-04 15:09:20,361 WARNING [train.py:1204] (2/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_938-205135-0_sp1.1 from training. Duration: 0.7909375 2023-10-04 15:09:20,497 WARNING [train.py:1204] (2/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_216-48707-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 15:09:22,296 WARNING [train.py:1204] (2/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_477-255782-0_sp1.1 from training. Duration: 0.7454375 2023-10-04 15:09:22,335 WARNING [train.py:1204] (2/4) Exclude cut with ID _16909_210_5724_1_1531292079912_4118919_583-92842-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 15:09:22,551 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.nonlin_attention.balancer.prob, batch_count=1697733.3333333333, ans=0.125 2023-10-04 15:09:23,650 WARNING [train.py:1204] (2/4) Exclude cut with ID _60860_210_8254_1_1534896015875_3557169_245-256666-0 from training. Duration: 0.97 2023-10-04 15:09:27,046 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_45399_210_4852_1_1533624725_7579737_349-116173-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 15:09:32,405 INFO [train.py:1046] (2/4) Epoch 48, batch 5000, loss[loss=0.1442, simple_loss=0.2304, pruned_loss=0.02897, over 23608.00 frames. ], tot_loss[loss=0.1522, simple_loss=0.2328, pruned_loss=0.0358, over 4721567.13 frames. ], batch size: 149, lr: 2.10e-03, grad_scale: 16.0 2023-10-04 15:09:32,464 WARNING [train.py:1204] (2/4) Exclude cut with ID _8699_210_2069_1_1528630346831_3960409_155-39766-0 from training. Duration: 0.78 2023-10-04 15:09:32,473 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_86-16881-0_sp1.1 from training. Duration: 0.52725 2023-10-04 15:09:38,464 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_14056_210_4163_1_1530604483_7572827_191-159384-0_sp1.1 from training. Duration: 0.97275 2023-10-04 15:09:38,472 WARNING [train.py:1204] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_307-158462-0_sp1.1 from training. Duration: 0.62725 2023-10-04 15:09:39,963 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7544_210_6753_1_1528591327_1471426_33-346031-0 from training. Duration: 0.61 2023-10-04 15:09:41,379 WARNING [train.py:1204] (2/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_112-149213-0 from training. Duration: 0.95 2023-10-04 15:09:45,186 WARNING [train.py:1204] (2/4) Exclude cut with ID _55844_210_2346_1_1534471212569_7625707_925-191342-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 15:09:46,508 WARNING [train.py:1204] (2/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_87-278686-0 from training. Duration: 0.77 2023-10-04 15:09:46,540 WARNING [train.py:1204] (2/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_415-78792-0_sp1.1 from training. Duration: 0.736375 2023-10-04 15:09:46,556 WARNING [train.py:1204] (2/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_107-332398-0_sp0.9 from training. Duration: 0.8666875 2023-10-04 15:09:47,952 WARNING [train.py:1204] (2/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_72-251255-0 from training. Duration: 0.94 2023-10-04 15:09:49,260 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_431-227273-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 15:09:49,320 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7544_210_6753_1_1528587000_691468_87-178599-0_sp0.9 from training. Duration: 0.9666875 2023-10-04 15:09:51,196 WARNING [train.py:1204] (2/4) Exclude cut with ID _13916_210_9994_1_1530788341126_3763149_60-191556-0 from training. Duration: 0.61 2023-10-04 15:09:51,199 WARNING [train.py:1204] (2/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_746-164782-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 15:09:51,240 WARNING [train.py:1204] (2/4) Exclude cut with ID _33640_210_2655_1_1533117605479_4855469_596-21066-0_sp1.1 from training. Duration: 0.92725 2023-10-04 15:09:52,449 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.773e+02 2.112e+02 2.381e+02 2.670e+02 4.654e+02, threshold=4.763e+02, percent-clipped=1.0 2023-10-04 15:09:52,626 WARNING [train.py:1204] (2/4) Exclude cut with ID _40402_210_18219_1_1533522324115_2953150_320-14078-0 from training. Duration: 0.95 2023-10-04 15:09:52,658 WARNING [train.py:1204] (2/4) Exclude cut with ID _29877_210_6228_1_1532397791106_5276149_266-62291-0 from training. Duration: 0.87 2023-10-04 15:09:52,715 WARNING [train.py:1204] (2/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_176-56120-0_sp0.9 from training. Duration: 0.92225 2023-10-04 15:09:53,987 WARNING [train.py:1204] (2/4) Exclude cut with ID _6275_210_2084_1_1528110062213_7669049_91-26919-0 from training. Duration: 0.97 2023-10-04 15:09:53,995 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_224-322394-0_sp0.9 from training. Duration: 0.7333125 2023-10-04 15:09:54,051 WARNING [train.py:1204] (2/4) Exclude cut with ID _44982_210_1511_1_1534312820108_3726029_586-297059-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 15:09:55,367 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_196-289637-0_sp1.1 from training. Duration: 0.6818125 2023-10-04 15:09:55,375 WARNING [train.py:1204] (2/4) Exclude cut with ID _18513_210_9206_1_1531618215223_3862620_402-5957-0 from training. Duration: 0.99 2023-10-04 15:09:55,381 WARNING [train.py:1204] (2/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_81-300212-0 from training. Duration: 0.6 2023-10-04 15:09:58,465 WARNING [train.py:1204] (2/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_166-128941-0 from training. Duration: 0.54 2023-10-04 15:09:58,479 WARNING [train.py:1204] (2/4) Exclude cut with ID _58059_210_5854_1_1534901471149_3164808_57-309660-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 15:09:59,677 WARNING [train.py:1204] (2/4) Exclude cut with ID _60621_210_17071_1_1535013053530_3671739_73-155814-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 15:09:59,780 WARNING [train.py:1204] (2/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_726-270780-0 from training. Duration: 0.86 2023-10-04 15:09:59,792 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8853_210_3273_1_1528633899_818968_51-187685-0_sp1.1 from training. Duration: 0.62725 2023-10-04 15:10:01,257 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_315-212794-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 15:10:02,723 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_53373_210_5399_1_1534229471_4715695_314-122795-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 15:10:04,080 WARNING [train.py:1204] (2/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_187-221566-0_sp0.9 from training. Duration: 0.47775 2023-10-04 15:10:05,909 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_12868_210_6753_1_1530702479_3610564_133-564-0 from training. Duration: 0.79 2023-10-04 15:10:05,954 WARNING [train.py:1204] (2/4) Exclude cut with ID _55153_210_2253_1_1534384786677_3162096_255-228344-0_sp1.1 from training. Duration: 0.936375 2023-10-04 15:10:07,413 WARNING [train.py:1204] (2/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_383-165234-0_sp1.1 from training. Duration: 0.8545625 2023-10-04 15:10:10,213 WARNING [train.py:1204] (2/4) Exclude cut with ID IC0677W0056-334889 from training. Duration: 0.768 2023-10-04 15:10:12,992 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_14953_210_2151_1_1530701710_5616165_710-165400-0_sp0.9 from training. Duration: 0.9666875 2023-10-04 15:10:15,657 WARNING [train.py:1204] (2/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_355-236205-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 15:10:15,658 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_34450_210_9547_1_1533099443_4109050_503-246893-0_sp1.1 from training. Duration: 0.963625 2023-10-04 15:10:15,928 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.0.attention_skip_rate, batch_count=1698000.0, ans=0.0 2023-10-04 15:10:17,271 WARNING [train.py:1204] (2/4) Exclude cut with ID _43552_210_8913_1_1533726031059_3523429_25-120161-0 from training. Duration: 0.98 2023-10-04 15:10:17,294 WARNING [train.py:1204] (2/4) Exclude cut with ID _66112_210_10635_1_1535590236305_3801484_205-311930-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 15:10:18,987 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_48049_210_5632_1_1533864553_3519937_185-281218-0_sp1.1 from training. Duration: 0.92725 2023-10-04 15:10:19,031 WARNING [train.py:1204] (2/4) Exclude cut with ID _48782_210_12658_1_1534467701544_2300422_191-332415-0_sp1.1 from training. Duration: 0.97275 2023-10-04 15:10:21,807 WARNING [train.py:1204] (2/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_907-151398-0_sp1.1 from training. Duration: 0.336375 2023-10-04 15:10:22,001 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.conv_module1.balancer2.min_abs, batch_count=1698000.0, ans=0.5 2023-10-04 15:10:22,068 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.bypass_mid.scale_min, batch_count=1698000.0, ans=0.2 2023-10-04 15:10:23,154 WARNING [train.py:1204] (2/4) Exclude cut with ID _67499_210_17282_1_1535509805220_3692430_20-179346-0_sp1.1 from training. Duration: 0.8818125 2023-10-04 15:10:24,087 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.2.encoder.layers.1.self_attn_weights.whiten_keys, num_groups=4, num_channels=128, metric=2.84 vs. limit=6.0 2023-10-04 15:10:24,679 WARNING [train.py:1204] (2/4) Exclude cut with ID _14998_210_5219_1_1530705582080_7474970_441-91466-0_sp1.1 from training. Duration: 0.8818125 2023-10-04 15:10:26,562 WARNING [train.py:1204] (2/4) Exclude cut with ID _46681_210_9642_1_1533893005571_7603359_786-164007-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 15:10:30,820 WARNING [train.py:1204] (2/4) Exclude cut with ID _14970_210_15709_1_1530773543493_1715730_187-20259-0 from training. Duration: 0.89 2023-10-04 15:10:34,222 WARNING [train.py:1204] (2/4) Exclude cut with ID _12734_210_8341_1_1530183614296_3891929_383-339075-0_sp1.1 from training. Duration: 0.963625 2023-10-04 15:10:35,012 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.3.encoder.layers.0.self_attn_weights.whiten_keys, num_groups=8, num_channels=256, metric=4.40 vs. limit=6.0 2023-10-04 15:10:41,763 WARNING [train.py:1204] (2/4) Exclude cut with ID _37086_210_9038_1_1532999540891_7021250_834-322225-0_sp1.1 from training. Duration: 0.92725 2023-10-04 15:10:43,136 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_10987_210_4163_1_1529760555_4123902_56-290559-0_sp1.1 from training. Duration: 0.963625 2023-10-04 15:10:43,148 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_92-246270-0_sp0.9 from training. Duration: 0.9444375 2023-10-04 15:10:43,277 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.balancer_ff3.min_abs, batch_count=1698066.6666666667, ans=0.2 2023-10-04 15:10:44,425 WARNING [train.py:1204] (2/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_56-13561-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 15:10:44,459 WARNING [train.py:1204] (2/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_393-57888-0_sp1.1 from training. Duration: 0.5818125 2023-10-04 15:10:44,489 WARNING [train.py:1204] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_168-32906-0_sp1.1 from training. Duration: 0.62725 2023-10-04 15:10:45,698 INFO [train.py:1046] (2/4) Epoch 48, batch 5050, loss[loss=0.2082, simple_loss=0.2744, pruned_loss=0.07103, over 19209.00 frames. ], tot_loss[loss=0.1527, simple_loss=0.2333, pruned_loss=0.036, over 4714065.38 frames. ], batch size: 388, lr: 2.10e-03, grad_scale: 16.0 2023-10-04 15:10:45,769 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_51225_210_3601_1_1534400634_2327383_127-207553-0_sp1.1 from training. Duration: 0.963625 2023-10-04 15:10:49,165 WARNING [train.py:1204] (2/4) Exclude cut with ID _58583_210_5236_1_1534726835266_3574249_494-203521-0_sp1.1 from training. Duration: 0.963625 2023-10-04 15:10:49,178 WARNING [train.py:1204] (2/4) Exclude cut with ID _49376_210_5986_1_1533985278732_3370740_354-344695-0 from training. Duration: 0.99 2023-10-04 15:10:50,620 WARNING [train.py:1204] (2/4) Exclude cut with ID _8021_210_7994_1_1528376451358_3653069_179-45206-0_sp1.1 from training. Duration: 0.7818125 2023-10-04 15:10:52,045 WARNING [train.py:1204] (2/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_742-154017-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 15:10:53,348 WARNING [train.py:1204] (2/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_222-198127-0_sp1.1 from training. Duration: 0.763625 2023-10-04 15:10:54,026 WARNING [train.py:1204] (2/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_235-307337-0 from training. Duration: 0.94 2023-10-04 15:10:55,167 WARNING [train.py:1204] (2/4) Exclude cut with ID _8393_210_6702_1_1528505860249_3457328_453-176500-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 15:10:55,206 WARNING [train.py:1204] (2/4) Exclude cut with ID _53565_210_10635_1_1534337218335_3820259_102-179528-0_sp1.1 from training. Duration: 0.97275 2023-10-04 15:10:57,798 WARNING [train.py:1204] (2/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_43-206757-0_sp1.1 from training. Duration: 0.6818125 2023-10-04 15:10:58,014 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.ff3_skip_rate, batch_count=1698133.3333333333, ans=0.0 2023-10-04 15:10:59,168 WARNING [train.py:1204] (2/4) Exclude cut with ID _8394_210_6702_1_1528590504316_3540189_335-274780-0_sp1.1 from training. Duration: 0.7090625 2023-10-04 15:10:59,203 WARNING [train.py:1204] (2/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_128-296581-0_sp1.1 from training. Duration: 0.67275 2023-10-04 15:11:03,614 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.ff2_skip_rate, batch_count=1698200.0, ans=0.0 2023-10-04 15:11:09,522 WARNING [train.py:1204] (2/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_111-112362-0 from training. Duration: 0.91 2023-10-04 15:11:10,790 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_5522_210_3273_1_1527249606_3909096_366-172508-0_sp1.1 from training. Duration: 0.6 2023-10-04 15:11:10,872 WARNING [train.py:1204] (2/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_47-167373-0_sp1.1 from training. Duration: 0.836375 2023-10-04 15:11:12,227 WARNING [train.py:1204] (2/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_41-234353-0 from training. Duration: 0.68 2023-10-04 15:11:12,260 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6291_210_3273_1_1527683649_1078498_82-221196-0_sp0.9 from training. Duration: 0.8444375 2023-10-04 15:11:13,732 WARNING [train.py:1204] (2/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_47-4814-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 15:11:13,761 WARNING [train.py:1204] (2/4) Exclude cut with ID _43541_210_10626_1_1533796233418_3635440_40-247993-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 15:11:14,007 INFO [scaling.py:1118] (2/4) WithLoss: name=encoder.encoders.2.encoder.layers.2.self_attn_weights, loss-sum=0.000e+00 2023-10-04 15:11:15,282 WARNING [train.py:1204] (2/4) Exclude cut with ID _21989_210_5854_1_1531899214931_3271360_133-44759-0_sp0.9 from training. Duration: 0.8555625 2023-10-04 15:11:15,285 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_94-195801-0 from training. Duration: 0.94 2023-10-04 15:11:16,585 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_141-89113-0 from training. Duration: 0.86 2023-10-04 15:11:16,680 WARNING [train.py:1204] (2/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_683-284578-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 15:11:18,734 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.3.encoder.layers.1.conv_module2.whiten, num_groups=1, num_channels=512, metric=2.93 vs. limit=15.0 2023-10-04 15:11:19,479 WARNING [train.py:1204] (2/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_6-214815-0_sp1.1 from training. Duration: 0.77275 2023-10-04 15:11:21,751 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.ff2_skip_rate, batch_count=1698266.6666666667, ans=0.0 2023-10-04 15:11:22,913 WARNING [train.py:1204] (2/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_556-212082-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 15:11:22,945 WARNING [train.py:1204] (2/4) Exclude cut with ID _44902_210_16187_1_1533888006062_3792078_149-67212-0 from training. Duration: 0.87 2023-10-04 15:11:24,973 WARNING [train.py:1204] (2/4) Exclude cut with ID _61148_210_6897_1_1535094022726_4217610_333-159440-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 15:11:27,534 WARNING [train.py:1204] (2/4) Exclude cut with ID _3120_210_3525_1_1526036143112_4072980_404-249994-0 from training. Duration: 0.99 2023-10-04 15:11:28,905 WARNING [train.py:1204] (2/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_234-260682-0_sp0.9 from training. Duration: 0.9333125 2023-10-04 15:11:28,932 WARNING [train.py:1204] (2/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_374-138971-0_sp1.1 from training. Duration: 0.8545625 2023-10-04 15:11:30,274 WARNING [train.py:1204] (2/4) Exclude cut with ID _53713_210_24116_1_1534314487762_7416751_743-302085-0_sp1.1 from training. Duration: 0.92725 2023-10-04 15:11:30,347 WARNING [train.py:1204] (2/4) Exclude cut with ID _18516_210_9206_1_1531704653206_3692406_198-334870-0_sp1.1 from training. Duration: 0.836375 2023-10-04 15:11:33,048 WARNING [train.py:1204] (2/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_395-73570-0_sp1.1 from training. Duration: 0.9 2023-10-04 15:11:36,187 WARNING [train.py:1204] (2/4) Exclude cut with ID _46683_210_9642_1_1533730913492_4591116_421-19547-0_sp1.1 from training. Duration: 0.936375 2023-10-04 15:11:36,247 WARNING [train.py:1204] (2/4) Exclude cut with ID _19896_210_1883_1_1531709022662_6649569_714-8668-0_sp1.1 from training. Duration: 0.963625 2023-10-04 15:11:36,268 WARNING [train.py:1204] (2/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_49-101072-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 15:11:36,275 WARNING [train.py:1204] (2/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_220-191080-0_sp1.1 from training. Duration: 0.8909375 2023-10-04 15:11:36,313 WARNING [train.py:1204] (2/4) Exclude cut with ID _3465_210_3525_1_1526175667060_3574049_62-335552-0 from training. Duration: 0.87 2023-10-04 15:11:38,225 WARNING [train.py:1204] (2/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_111-112362-0_sp1.1 from training. Duration: 0.82725 2023-10-04 15:11:39,625 WARNING [train.py:1204] (2/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_318-59097-0_sp0.9 from training. Duration: 0.8444375 2023-10-04 15:11:43,848 WARNING [train.py:1204] (2/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_98-255710-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 15:11:43,854 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9042W0003-515609 from training. Duration: 0.896 2023-10-04 15:11:43,872 WARNING [train.py:1204] (2/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_158-250116-0_sp0.9 from training. Duration: 0.688875 2023-10-04 15:11:45,345 WARNING [train.py:1204] (2/4) Exclude cut with ID _26750_210_5665_1_1532226678442_4063230_7-346885-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 15:11:46,679 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_37839_210_7234_1_1533542462_3689303_357-227078-0_sp1.1 from training. Duration: 0.963625 2023-10-04 15:11:46,701 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9052W0119-520904 from training. Duration: 0.896 2023-10-04 15:11:49,375 WARNING [train.py:1204] (2/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_249-281349-0_sp1.1 from training. Duration: 0.77275 2023-10-04 15:11:49,380 WARNING [train.py:1204] (2/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_147-293311-0 from training. Duration: 0.94 2023-10-04 15:11:49,381 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_158-21166-0_sp1.1 from training. Duration: 0.963625 2023-10-04 15:11:54,026 WARNING [train.py:1204] (2/4) Exclude cut with ID _14100_210_8341_1_1530529320613_3678069_207-332194-0_sp1.1 from training. Duration: 0.92725 2023-10-04 15:11:54,066 WARNING [train.py:1204] (2/4) Exclude cut with ID _9389_210_1664_1_1529217337301_5289269_324-211031-0_sp1.1 from training. Duration: 0.963625 2023-10-04 15:11:54,079 WARNING [train.py:1204] (2/4) Exclude cut with ID _14603_210_4220_1_1530615688441_3097319_133-158308-0 from training. Duration: 0.84 2023-10-04 15:11:55,573 WARNING [train.py:1204] (2/4) Exclude cut with ID _13252_210_1925_1_1530671372047_3336909_166-314607-0 from training. Duration: 0.54 2023-10-04 15:11:57,630 WARNING [train.py:1204] (2/4) Exclude cut with ID _32704_210_8954_1_1533017488840_6747370_649-79538-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 15:11:57,644 WARNING [train.py:1204] (2/4) Exclude cut with ID _28394_210_8913_1_1532430049518_3650118_343-115667-0_sp1.1 from training. Duration: 0.97275 2023-10-04 15:11:57,864 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.conv_module1.balancer2.min_positive, batch_count=1698400.0, ans=0.05 2023-10-04 15:11:59,090 WARNING [train.py:1204] (2/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_87-278686-0_sp0.9 from training. Duration: 0.8555625 2023-10-04 15:12:00,419 INFO [train.py:1046] (2/4) Epoch 48, batch 5100, loss[loss=0.1556, simple_loss=0.2261, pruned_loss=0.04258, over 22749.00 frames. ], tot_loss[loss=0.1532, simple_loss=0.2339, pruned_loss=0.03623, over 4715439.52 frames. ], batch size: 322, lr: 2.10e-03, grad_scale: 16.0 2023-10-04 15:12:03,112 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9109W0003-549749 from training. Duration: 0.768 2023-10-04 15:12:04,569 WARNING [train.py:1204] (2/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_126-29104-0_sp1.1 from training. Duration: 0.77275 2023-10-04 15:12:07,389 WARNING [train.py:1204] (2/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_325-113798-0 from training. Duration: 0.85 2023-10-04 15:12:08,172 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.feed_forward1.out_whiten.whitening_limit, batch_count=1698466.6666666667, ans=15.0 2023-10-04 15:12:09,365 WARNING [train.py:1204] (2/4) Exclude cut with ID _6694_210_4599_1_1527852637490_3965529_116-109398-0 from training. Duration: 0.73 2023-10-04 15:12:09,412 WARNING [train.py:1204] (2/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_46-57119-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 15:12:11,353 WARNING [train.py:1204] (2/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_546-119312-0_sp1.1 from training. Duration: 0.9 2023-10-04 15:12:12,924 WARNING [train.py:1204] (2/4) Exclude cut with ID _60057_210_10640_1_1534903065793_3333150_468-306939-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 15:12:14,293 WARNING [train.py:1204] (2/4) Exclude cut with ID _15150_210_8341_1_1530752411803_7256384_22-138274-0 from training. Duration: 0.96 2023-10-04 15:12:14,318 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_289-180904-0 from training. Duration: 0.76 2023-10-04 15:12:18,257 WARNING [train.py:1204] (2/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_64-133721-0_sp1.1 from training. Duration: 0.92725 2023-10-04 15:12:19,586 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_195-257512-0_sp1.1 from training. Duration: 0.6090625 2023-10-04 15:12:20,880 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.729e+02 2.048e+02 2.253e+02 2.620e+02 3.978e+02, threshold=4.506e+02, percent-clipped=0.0 2023-10-04 15:12:22,445 WARNING [train.py:1204] (2/4) Exclude cut with ID _66873_210_5517_1_1535454136506_4293370_704-101963-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 15:12:26,929 WARNING [train.py:1204] (2/4) Exclude cut with ID _4366_210_1388_1_1526792053633_3613819_231-211402-0 from training. Duration: 0.94 2023-10-04 15:12:28,232 WARNING [train.py:1204] (2/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_679-190385-0_sp1.1 from training. Duration: 0.97275 2023-10-04 15:12:31,590 WARNING [train.py:1204] (2/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_298-81459-0_sp1.1 from training. Duration: 0.963625 2023-10-04 15:12:31,607 WARNING [train.py:1204] (2/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_658-282610-0_sp0.9 from training. Duration: 0.588875 2023-10-04 15:12:33,035 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder_embed.convnext.hidden_balancer.prob, batch_count=1698600.0, ans=0.125 2023-10-04 15:12:34,342 WARNING [train.py:1204] (2/4) Exclude cut with ID _57572_210_5517_1_1534921219624_3809484_196-283734-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 15:12:35,710 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_53071_210_16554_1_1534490405_2954066_294-198554-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 15:12:35,715 WARNING [train.py:1204] (2/4) Exclude cut with ID _4866_210_2577_1_1527404412284_4364659_352-157930-0 from training. Duration: 0.81 2023-10-04 15:12:36,345 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.4.encoder.layers.1.feed_forward3.out_whiten, num_groups=1, num_channels=384, metric=12.89 vs. limit=15.0 2023-10-04 15:12:37,164 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9231W0113-610139 from training. Duration: 0.896 2023-10-04 15:12:37,231 WARNING [train.py:1204] (2/4) Exclude cut with ID _24271_210_12658_1_1531987270022_7190519_638-171906-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 15:12:38,530 WARNING [train.py:1204] (2/4) Exclude cut with ID _14170_210_5219_1_1530525949868_4469650_272-249985-0 from training. Duration: 0.67 2023-10-04 15:12:38,546 WARNING [train.py:1204] (2/4) Exclude cut with ID _64086_210_7994_1_1535270417019_7435949_449-31567-0 from training. Duration: 0.97 2023-10-04 15:12:41,815 WARNING [train.py:1204] (2/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_109-282568-0_sp1.1 from training. Duration: 0.97275 2023-10-04 15:12:42,243 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.5.encoder.layers.0.self_attn2.whiten, num_groups=1, num_channels=256, metric=10.63 vs. limit=22.5 2023-10-04 15:12:50,533 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_121-45934-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 15:12:52,041 WARNING [train.py:1204] (2/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_314-126819-0 from training. Duration: 0.5 2023-10-04 15:12:52,071 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9309W0192-640319 from training. Duration: 0.512 2023-10-04 15:12:53,304 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9310W0003-640649 from training. Duration: 0.896 2023-10-04 15:12:54,735 WARNING [train.py:1204] (2/4) Exclude cut with ID _7160_210_1876_1_1527940923231_3703799_120-150943-0 from training. Duration: 0.81 2023-10-04 15:12:54,736 WARNING [train.py:1204] (2/4) Exclude cut with ID _40380_210_9059_1_1533380594759_4187380_6-33508-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 15:12:57,565 WARNING [train.py:1204] (2/4) Exclude cut with ID _59202_210_26984_1_1534816810612_3549174_137-318710-0 from training. Duration: 0.89 2023-10-04 15:13:02,427 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_435-313305-0 from training. Duration: 0.71 2023-10-04 15:13:03,850 WARNING [train.py:1204] (2/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_91-212486-0_sp1.1 from training. Duration: 0.6181875 2023-10-04 15:13:05,229 WARNING [train.py:1204] (2/4) Exclude cut with ID _14603_210_4220_1_1530615688441_3097319_133-158308-0_sp1.1 from training. Duration: 0.763625 2023-10-04 15:13:07,972 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_99-251314-0 from training. Duration: 0.87 2023-10-04 15:13:09,404 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_365-217119-0_sp1.1 from training. Duration: 0.57275 2023-10-04 15:13:10,758 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_3475_210_2028_1_1526193578_3622569_377-166133-0 from training. Duration: 0.86 2023-10-04 15:13:14,211 INFO [train.py:1046] (2/4) Epoch 48, batch 5150, loss[loss=0.1562, simple_loss=0.234, pruned_loss=0.03924, over 23390.00 frames. ], tot_loss[loss=0.1543, simple_loss=0.2351, pruned_loss=0.03674, over 4712875.25 frames. ], batch size: 119, lr: 2.10e-03, grad_scale: 16.0 2023-10-04 15:13:14,517 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.ff2_skip_rate, batch_count=1698800.0, ans=0.0 2023-10-04 15:13:16,257 WARNING [train.py:1204] (2/4) Exclude cut with ID _13916_210_9994_1_1530788341126_3763149_116-21034-0_sp1.1 from training. Duration: 0.87275 2023-10-04 15:13:16,270 WARNING [train.py:1204] (2/4) Exclude cut with ID _64220_210_9527_1_1535250151772_4875259_142-216831-0_sp1.1 from training. Duration: 0.936375 2023-10-04 15:13:16,271 WARNING [train.py:1204] (2/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_490-207861-0_sp1.1 from training. Duration: 0.8909375 2023-10-04 15:13:17,509 WARNING [train.py:1204] (2/4) Exclude cut with ID _41314_210_12651_1_1533459476543_3766460_87-272068-0_sp0.9 from training. Duration: 0.92225 2023-10-04 15:13:17,531 WARNING [train.py:1204] (2/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_18-46756-0_sp1.1 from training. Duration: 0.7181875 2023-10-04 15:13:18,890 WARNING [train.py:1204] (2/4) Exclude cut with ID _39411_210_5986_1_1533538825030_3343649_98-311335-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 15:13:18,965 WARNING [train.py:1204] (2/4) Exclude cut with ID _21878_210_3525_1_1531738772002_7264330_113-18163-0 from training. Duration: 0.89 2023-10-04 15:13:18,967 WARNING [train.py:1204] (2/4) Exclude cut with ID _6653_210_2629_1_1527919172441_6732679_1103-56445-0 from training. Duration: 0.73 2023-10-04 15:13:20,252 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_3474_210_2028_1_1526188513_4825246_145-309131-0 from training. Duration: 0.74 2023-10-04 15:13:20,271 WARNING [train.py:1204] (2/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_120-211630-0_sp1.1 from training. Duration: 0.836375 2023-10-04 15:13:20,281 WARNING [train.py:1204] (2/4) Exclude cut with ID _8030_210_7497_1_1528444886781_3531089_32-31837-0 from training. Duration: 0.97 2023-10-04 15:13:21,634 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_19949_210_2161_1_1531706337_928079_8-160956-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 15:13:21,692 WARNING [train.py:1204] (2/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_321-215742-0_sp1.1 from training. Duration: 0.4545625 2023-10-04 15:13:23,180 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.1.self_attn_weights.pos_emb_skip_rate, batch_count=1698800.0, ans=0.0 2023-10-04 15:13:24,290 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_583-124650-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 15:13:25,673 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_30434_210_13049_1_1532519384_981746_164-182755-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 15:13:28,924 WARNING [train.py:1204] (2/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_339-104791-0_sp1.1 from training. Duration: 0.4454375 2023-10-04 15:13:28,954 WARNING [train.py:1204] (2/4) Exclude cut with ID _15026_210_8750_1_1530777602316_3291850_211-195043-0 from training. Duration: 0.8 2023-10-04 15:13:31,629 WARNING [train.py:1204] (2/4) Exclude cut with ID _29877_210_6228_1_1532397791106_5276149_487-182647-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 15:13:33,583 WARNING [train.py:1204] (2/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_127-566-0_sp0.9 from training. Duration: 0.9333125 2023-10-04 15:13:33,871 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.self_attn_weights.pos_emb_skip_rate, batch_count=1698866.6666666667, ans=0.0 2023-10-04 15:13:34,872 WARNING [train.py:1204] (2/4) Exclude cut with ID _8263_210_1664_1_1528439452891_970220_68-181805-0_sp0.9 from training. Duration: 0.711125 2023-10-04 15:13:34,874 WARNING [train.py:1204] (2/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_504-72513-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 15:13:34,888 WARNING [train.py:1204] (2/4) Exclude cut with ID _64639_210_5816_1_1535367619437_3528000_324-265233-0_sp1.1 from training. Duration: 0.963625 2023-10-04 15:13:36,189 WARNING [train.py:1204] (2/4) Exclude cut with ID _6456_210_1534_1_1527681634212_3181049_215-309359-0_sp0.9 from training. Duration: 0.97775 2023-10-04 15:13:36,193 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_115-233929-0_sp1.1 from training. Duration: 0.6545625 2023-10-04 15:13:36,236 WARNING [train.py:1204] (2/4) Exclude cut with ID _14087_210_2253_1_1530529224512_3014029_219-224445-0 from training. Duration: 0.84 2023-10-04 15:13:36,481 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=1698866.6666666667, ans=0.1 2023-10-04 15:13:37,629 WARNING [train.py:1204] (2/4) Exclude cut with ID _14867_210_1968_1_1530684179890_3846969_413-29292-0_sp1.1 from training. Duration: 0.8181875 2023-10-04 15:13:37,691 WARNING [train.py:1204] (2/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_1012-279473-0_sp0.9 from training. Duration: 0.7444375 2023-10-04 15:13:39,070 WARNING [train.py:1204] (2/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_605-133395-0_sp0.9 from training. Duration: 0.7333125 2023-10-04 15:13:41,835 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_89-35748-0 from training. Duration: 0.79 2023-10-04 15:13:41,935 WARNING [train.py:1204] (2/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_476-272757-0_sp1.1 from training. Duration: 0.8454375 2023-10-04 15:13:42,073 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.feed_forward2.hidden_balancer.prob, batch_count=1698933.3333333333, ans=0.125 2023-10-04 15:13:48,429 WARNING [train.py:1204] (2/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_449-15508-0_sp0.9 from training. Duration: 0.911125 2023-10-04 15:13:49,767 WARNING [train.py:1204] (2/4) Exclude cut with ID _29345_210_4381_1_1532689168263_3860169_198-174328-0 from training. Duration: 0.91 2023-10-04 15:13:53,799 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_15517_210_6753_1_1530952293_1664795_125-158157-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 15:13:55,395 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.conv_skip_rate, batch_count=1698933.3333333333, ans=0.0 2023-10-04 15:13:59,863 WARNING [train.py:1204] (2/4) Exclude cut with ID _25565_210_12392_1_1532423009382_3952098_420-113180-0_sp1.1 from training. Duration: 0.963625 2023-10-04 15:14:01,237 WARNING [train.py:1204] (2/4) Exclude cut with ID _51471_210_1889_1_1534399391390_3094250_236-146773-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 15:14:04,094 WARNING [train.py:1204] (2/4) Exclude cut with ID _30039_210_13270_1_1532484612900_2725150_175-35012-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 15:14:05,837 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_23850_210_4238_1_1532232597_1632433_99-275843-0_sp1.1 from training. Duration: 0.936375 2023-10-04 15:14:07,224 WARNING [train.py:1204] (2/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_658-282610-0 from training. Duration: 0.53 2023-10-04 15:14:09,972 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_37818_210_4238_1_1533620910_4720083_234-65934-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 15:14:10,079 WARNING [train.py:1204] (2/4) Exclude cut with ID _3067_210_2667_1_1526212974640_4503168_459-173867-0_sp1.1 from training. Duration: 0.8 2023-10-04 15:14:11,400 WARNING [train.py:1204] (2/4) Exclude cut with ID _8696_210_1876_1_1528545632440_3345058_209-54069-0_sp0.9 from training. Duration: 0.7444375 2023-10-04 15:14:14,562 WARNING [train.py:1204] (2/4) Exclude cut with ID _62574_210_2655_1_1535180407058_3933249_420-236465-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 15:14:14,646 WARNING [train.py:1204] (2/4) Exclude cut with ID _46681_210_9642_1_1533893005571_7603359_333-4042-0_sp1.1 from training. Duration: 0.936375 2023-10-04 15:14:15,969 WARNING [train.py:1204] (2/4) Exclude cut with ID _3541_210_2477_1_1526475408709_3263973_377-296839-0 from training. Duration: 0.55 2023-10-04 15:14:16,172 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.feed_forward1.out_proj.dropout_p, batch_count=1699066.6666666667, ans=0.1 2023-10-04 15:14:21,693 WARNING [train.py:1204] (2/4) Exclude cut with ID _43997_210_4525_1_1533538632555_5189329_426-92599-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 15:14:23,102 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_43-71989-0_sp1.1 from training. Duration: 0.5181875 2023-10-04 15:14:24,592 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_2009_210_2071_1_1525481134_4588937_140-345530-0_sp1.1 from training. Duration: 0.963625 2023-10-04 15:14:25,888 WARNING [train.py:1204] (2/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_138-332773-0_sp0.9 from training. Duration: 0.9 2023-10-04 15:14:25,964 WARNING [train.py:1204] (2/4) Exclude cut with ID _13942_210_9988_1_1530604906036_3733360_499-55676-0_sp0.9 from training. Duration: 0.688875 2023-10-04 15:14:26,152 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.attention_skip_rate, batch_count=1699133.3333333333, ans=0.0 2023-10-04 15:14:27,617 INFO [train.py:1046] (2/4) Epoch 48, batch 5200, loss[loss=0.1377, simple_loss=0.2197, pruned_loss=0.02785, over 24567.00 frames. ], tot_loss[loss=0.1545, simple_loss=0.2355, pruned_loss=0.03676, over 4709960.00 frames. ], batch size: 60, lr: 2.10e-03, grad_scale: 16.0 2023-10-04 15:14:27,681 WARNING [train.py:1204] (2/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_259-34038-0_sp1.1 from training. Duration: 0.67275 2023-10-04 15:14:27,690 WARNING [train.py:1204] (2/4) Exclude cut with ID _3120_210_3525_1_1526036143112_4072980_404-249994-0_sp1.1 from training. Duration: 0.9 2023-10-04 15:14:27,720 WARNING [train.py:1204] (2/4) Exclude cut with ID _40402_210_18219_1_1533522324115_2953150_222-108810-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 15:14:30,512 WARNING [train.py:1204] (2/4) Exclude cut with ID _22083_210_8254_1_1531702862439_3678970_49-234546-0_sp0.9 from training. Duration: 0.92225 2023-10-04 15:14:31,902 WARNING [train.py:1204] (2/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_199-295611-0_sp0.9 from training. Duration: 0.611125 2023-10-04 15:14:32,523 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.3.encoder.layers.1.feed_forward3.out_whiten, num_groups=1, num_channels=512, metric=8.80 vs. limit=15.0 2023-10-04 15:14:36,316 WARNING [train.py:1204] (2/4) Exclude cut with ID _43552_210_8913_1_1533726031059_3523429_43-192945-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 15:14:39,199 WARNING [train.py:1204] (2/4) Exclude cut with ID _14224_210_4381_1_1530529062037_4264010_364-22817-0 from training. Duration: 0.71 2023-10-04 15:14:39,418 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.bypass.skip_rate, batch_count=1699133.3333333333, ans=0.09899494936611666 2023-10-04 15:14:41,818 WARNING [train.py:1204] (2/4) Exclude cut with ID _14922_210_6228_1_1530707435273_3693310_388-129552-0_sp1.1 from training. Duration: 0.87275 2023-10-04 15:14:41,877 WARNING [train.py:1204] (2/4) Exclude cut with ID _6537_210_1883_1_1527987559710_3597129_190-280227-0_sp1.1 from training. Duration: 0.97275 2023-10-04 15:14:43,340 WARNING [train.py:1204] (2/4) Exclude cut with ID _55844_210_2346_1_1534471212569_7625707_398-56897-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 15:14:44,654 WARNING [train.py:1204] (2/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_48-182155-0_sp1.1 from training. Duration: 0.7909375 2023-10-04 15:14:44,679 WARNING [train.py:1204] (2/4) Exclude cut with ID _29529_210_7234_1_1532393961455_3977460_582-18084-0_sp1.1 from training. Duration: 0.97275 2023-10-04 15:14:44,867 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.ff3_skip_rate, batch_count=1699200.0, ans=0.0 2023-10-04 15:14:46,583 WARNING [train.py:1204] (2/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_912-282412-0 from training. Duration: 0.49 2023-10-04 15:14:48,035 WARNING [train.py:1204] (2/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_332-256168-0_sp0.9 from training. Duration: 0.8333125 2023-10-04 15:14:49,850 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.717e+02 2.200e+02 2.450e+02 2.998e+02 4.674e+02, threshold=4.900e+02, percent-clipped=1.0 2023-10-04 15:14:49,943 WARNING [train.py:1204] (2/4) Exclude cut with ID _64220_210_9527_1_1535250151772_4875259_55-250624-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 15:14:51,419 WARNING [train.py:1204] (2/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_264-108685-0 from training. Duration: 0.55 2023-10-04 15:14:52,829 INFO [scaling.py:1118] (2/4) WithLoss: name=encoder.encoders.1.encoder.layers.0.self_attn_weights, loss-sum=0.000e+00 2023-10-04 15:14:55,175 WARNING [train.py:1204] (2/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_613-232856-0_sp0.9 from training. Duration: 0.6 2023-10-04 15:14:55,263 WARNING [train.py:1204] (2/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_68-212801-0_sp1.1 from training. Duration: 0.663625 2023-10-04 15:14:56,560 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6290_210_3273_1_1527595224_1282787_115-87095-0 from training. Duration: 0.55 2023-10-04 15:14:56,601 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_3080_210_2071_1_1526086102_4365166_312-8726-0 from training. Duration: 0.91 2023-10-04 15:14:59,918 WARNING [train.py:1204] (2/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_105-63309-0 from training. Duration: 0.98 2023-10-04 15:14:59,991 WARNING [train.py:1204] (2/4) Exclude cut with ID _2941_210_2577_1_1526199673150_2489759_201-160779-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 15:15:00,001 WARNING [train.py:1204] (2/4) Exclude cut with ID ID0406W0226-885434 from training. Duration: 0.997 2023-10-04 15:15:00,008 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_17292_210_6753_1_1531125388_2943734_353-328201-0_sp1.1 from training. Duration: 0.97275 2023-10-04 15:15:01,430 WARNING [train.py:1204] (2/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_572-96029-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 15:15:02,886 WARNING [train.py:1204] (2/4) Exclude cut with ID _14970_210_15709_1_1530775547361_1966965_6-23328-0_sp0.9 from training. Duration: 0.9555625 2023-10-04 15:15:02,939 WARNING [train.py:1204] (2/4) Exclude cut with ID _45012_210_12651_1_1533608435136_5371380_240-37101-0 from training. Duration: 0.93 2023-10-04 15:15:02,994 WARNING [train.py:1204] (2/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_685-90183-0_sp1.1 from training. Duration: 0.936375 2023-10-04 15:15:07,688 WARNING [train.py:1204] (2/4) Exclude cut with ID _13252_210_1925_1_1530671372047_3336909_41-22245-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 15:15:07,886 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.conv_module1.balancer2.prob, batch_count=1699266.6666666667, ans=0.125 2023-10-04 15:15:09,125 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_15126_210_6753_1_1530778677_2591185_120-277707-0 from training. Duration: 0.8 2023-10-04 15:15:10,453 WARNING [train.py:1204] (2/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_310-26186-0 from training. Duration: 0.7 2023-10-04 15:15:10,476 WARNING [train.py:1204] (2/4) Exclude cut with ID _14998_210_5219_1_1530705582080_7474970_787-294416-0 from training. Duration: 0.82 2023-10-04 15:15:13,534 WARNING [train.py:1204] (2/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_795-33252-0 from training. Duration: 0.37 2023-10-04 15:15:14,801 WARNING [train.py:1204] (2/4) Exclude cut with ID _7537_210_2084_1_1528502395431_3924359_113-199933-0_sp0.9 from training. Duration: 0.6555625 2023-10-04 15:15:21,392 WARNING [train.py:1204] (2/4) Exclude cut with ID _3465_210_3525_1_1526175667060_3574049_308-231735-0_sp0.9 from training. Duration: 0.811125 2023-10-04 15:15:21,420 WARNING [train.py:1204] (2/4) Exclude cut with ID _19504_210_9994_1_1531648863242_3325220_354-296765-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 15:15:22,893 WARNING [train.py:1204] (2/4) Exclude cut with ID _5409_210_1388_1_1527292850189_5865191_178-127970-0 from training. Duration: 0.57 2023-10-04 15:15:22,930 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_1466-248056-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 15:15:23,092 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.conv_module1.balancer1.prob, batch_count=1699333.3333333333, ans=0.125 2023-10-04 15:15:24,172 WARNING [train.py:1204] (2/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_535-14410-0_sp0.9 from training. Duration: 0.52225 2023-10-04 15:15:24,175 WARNING [train.py:1204] (2/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_747-129941-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 15:15:24,217 WARNING [train.py:1204] (2/4) Exclude cut with ID _15012_210_1968_1_1530856820429_5095350_688-178849-0_sp1.1 from training. Duration: 0.5818125 2023-10-04 15:15:26,937 WARNING [train.py:1204] (2/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_356-45054-0_sp1.1 from training. Duration: 0.7818125 2023-10-04 15:15:27,207 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.nonlin_attention.balancer.prob, batch_count=1699400.0, ans=0.125 2023-10-04 15:15:28,313 WARNING [train.py:1204] (2/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_146-276944-0_sp0.9 from training. Duration: 0.92225 2023-10-04 15:15:31,648 WARNING [train.py:1204] (2/4) Exclude cut with ID _39204_210_4220_1_1533880547368_3191120_346-180306-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 15:15:33,059 WARNING [train.py:1204] (2/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_641-158017-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 15:15:33,060 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_19952_210_2161_1_1531875469_3851750_274-42137-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 15:15:33,847 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.2.encoder.layers.2.nonlin_attention.whiten2, num_groups=1, num_channels=384, metric=4.33 vs. limit=15.0 2023-10-04 15:15:37,611 WARNING [train.py:1204] (2/4) Exclude cut with ID _60804_210_9642_1_1534831199859_7268299_87-327819-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 15:15:38,956 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_9302_210_2550_1_1528889962_4655416_638-301620-0 from training. Duration: 0.47 2023-10-04 15:15:39,022 WARNING [train.py:1204] (2/4) Exclude cut with ID _44304_210_15374_1_1533639352765_4613129_302-22084-0_sp1.1 from training. Duration: 0.7818125 2023-10-04 15:15:40,266 WARNING [train.py:1204] (2/4) Exclude cut with ID _18513_210_9206_1_1531618215223_3862620_177-227063-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 15:15:40,365 WARNING [train.py:1204] (2/4) Exclude cut with ID _41326_210_5854_1_1533778076509_3492080_510-45444-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 15:15:41,700 INFO [train.py:1046] (2/4) Epoch 48, batch 5250, loss[loss=0.1543, simple_loss=0.2321, pruned_loss=0.03827, over 23692.00 frames. ], tot_loss[loss=0.154, simple_loss=0.2346, pruned_loss=0.03665, over 4706747.67 frames. ], batch size: 149, lr: 2.10e-03, grad_scale: 16.0 2023-10-04 15:15:41,751 WARNING [train.py:1204] (2/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_76-311963-0_sp1.1 from training. Duration: 0.636375 2023-10-04 15:15:41,861 WARNING [train.py:1204] (2/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_156-302675-0_sp0.9 from training. Duration: 0.611125 2023-10-04 15:15:44,670 WARNING [train.py:1204] (2/4) Exclude cut with ID _53489_210_6228_1_1534298357169_3835970_367-148411-0_sp1.1 from training. Duration: 0.936375 2023-10-04 15:15:46,504 WARNING [train.py:1204] (2/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_389-144525-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 15:15:46,666 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.ff2_skip_rate, batch_count=1699466.6666666667, ans=0.0 2023-10-04 15:15:48,393 WARNING [train.py:1204] (2/4) Exclude cut with ID _14069_210_5632_1_1530512692033_4803390_158-238427-0_sp1.1 from training. Duration: 0.963625 2023-10-04 15:15:49,726 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_486-151131-0_sp0.9 from training. Duration: 0.9444375 2023-10-04 15:15:51,426 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.conv_module2.balancer2.prob, batch_count=1699466.6666666667, ans=0.125 2023-10-04 15:15:54,175 WARNING [train.py:1204] (2/4) Exclude cut with ID _39846_210_12651_1_1533603651992_1318030_188-71831-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 15:15:55,545 WARNING [train.py:1204] (2/4) Exclude cut with ID _39411_210_5986_1_1533538825030_3343649_173-238248-0_sp1.1 from training. Duration: 0.7545625 2023-10-04 15:15:59,582 WARNING [train.py:1204] (2/4) Exclude cut with ID _20684_210_6917_1_1531567751646_4203930_469-49081-0_sp1.1 from training. Duration: 0.92725 2023-10-04 15:16:00,938 WARNING [train.py:1204] (2/4) Exclude cut with ID _8833_210_5219_1_1528592665210_7553059_611-80313-0_sp1.1 from training. Duration: 0.5818125 2023-10-04 15:16:04,165 WARNING [train.py:1204] (2/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_440-141811-0 from training. Duration: 0.95 2023-10-04 15:16:06,193 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_41211_210_2544_1_1533348859_3702910_236-223652-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 15:16:07,534 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_40222_210_6228_1_1533385298_4481939_220-163998-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 15:16:16,851 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.out_combiner.scale_min, batch_count=1699600.0, ans=0.2 2023-10-04 15:16:23,259 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.ff3_skip_rate, batch_count=1699666.6666666667, ans=0.0 2023-10-04 15:16:26,946 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.1.conv_skip_rate, batch_count=1699666.6666666667, ans=0.0 2023-10-04 15:16:36,703 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.out_combiner.scale_min, batch_count=1699733.3333333333, ans=0.2 2023-10-04 15:16:39,347 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.bypass_mid.scale_min, batch_count=1699733.3333333333, ans=0.2 2023-10-04 15:16:50,661 INFO [train.py:1046] (2/4) Epoch 48, batch 5300, loss[loss=0.1467, simple_loss=0.2286, pruned_loss=0.03239, over 24249.00 frames. ], tot_loss[loss=0.1528, simple_loss=0.233, pruned_loss=0.03632, over 4703212.62 frames. ], batch size: 61, lr: 2.10e-03, grad_scale: 16.0 2023-10-04 15:16:52,183 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.conv_module1.balancer2.prob, batch_count=1699800.0, ans=0.125 2023-10-04 15:16:54,109 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.1.encoder.layers.0.self_attn_weights.whiten_keys, num_groups=4, num_channels=128, metric=3.72 vs. limit=6.0 2023-10-04 15:17:02,711 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.ff3_skip_rate, batch_count=1699866.6666666667, ans=0.0 2023-10-04 15:17:04,727 WARNING [train.py:1204] (2/4) Exclude cut with ID _14629_210_15709_1_1530604298182_956899_6-232695-0_sp1.1 from training. Duration: 0.8545625 2023-10-04 15:17:04,749 WARNING [train.py:1204] (2/4) Exclude cut with ID _13921_210_9994_1_1530841782272_3503520_327-69677-0 from training. Duration: 0.9 2023-10-04 15:17:04,775 WARNING [train.py:1204] (2/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_372-298093-0 from training. Duration: 0.8 2023-10-04 15:17:04,788 WARNING [train.py:1204] (2/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_515-139111-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 15:17:05,029 WARNING [train.py:1204] (2/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_85-65056-0_sp1.1 from training. Duration: 0.87275 2023-10-04 15:17:05,071 WARNING [train.py:1204] (2/4) Exclude cut with ID _15113_210_8254_1_1530842438579_3716279_209-54533-0_sp1.1 from training. Duration: 0.87275 2023-10-04 15:17:05,118 WARNING [train.py:1204] (2/4) Exclude cut with ID _70447_210_12392_1_1535972499300_3160420_123-312838-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 15:17:05,148 WARNING [train.py:1204] (2/4) Exclude cut with ID _60113_210_16271_1_1535164212481_4386079_70-63596-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 15:17:05,176 WARNING [train.py:1204] (2/4) Exclude cut with ID _14632_210_9988_1_1530691279392_3923200_225-188509-0_sp1.1 from training. Duration: 0.963625 2023-10-04 15:17:05,225 WARNING [train.py:1204] (2/4) Exclude cut with ID _34734_210_4525_1_1533365977542_4448720_584-180532-0_sp1.1 from training. Duration: 0.97275 2023-10-04 15:17:05,239 WARNING [train.py:1204] (2/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_351-294650-0_sp1.1 from training. Duration: 0.57275 2023-10-04 15:17:05,566 WARNING [train.py:1204] (2/4) Exclude cut with ID _13252_210_1925_1_1530671372047_3336909_263-168388-0_sp0.9 from training. Duration: 0.9555625 2023-10-04 15:17:05,592 WARNING [train.py:1204] (2/4) Exclude cut with ID _22083_210_8254_1_1531702862439_3678970_201-85738-0 from training. Duration: 0.9 2023-10-04 15:17:05,683 WARNING [train.py:1204] (2/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_237-58433-0 from training. Duration: 0.74 2023-10-04 15:17:05,710 WARNING [train.py:1204] (2/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_262-250826-0 from training. Duration: 0.99 2023-10-04 15:17:05,813 WARNING [train.py:1204] (2/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_927-43827-0_sp0.9 from training. Duration: 0.82225 2023-10-04 15:17:05,843 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_2821_210_2715_1_1526108385_3763560_73-296673-0 from training. Duration: 0.83 2023-10-04 15:17:05,919 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6287_210_3273_1_1527509506_2653716_182-199887-0 from training. Duration: 0.51 2023-10-04 15:17:06,049 WARNING [train.py:1204] (2/4) Exclude cut with ID _70519_210_4163_1_1535961595994_7458230_899-80223-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 15:17:06,700 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_210-234949-0_sp1.1 from training. Duration: 0.97275 2023-10-04 15:17:06,739 WARNING [train.py:1204] (2/4) Exclude cut with ID _30196_210_1664_1_1533708496522_7074140_768-278176-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 15:17:06,821 WARNING [train.py:1204] (2/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_315-128283-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 15:17:06,904 WARNING [train.py:1204] (2/4) Exclude cut with ID _14886_210_4220_1_1530702020355_3516190_330-323941-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 15:17:07,173 WARNING [train.py:1204] (2/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_264-348896-0_sp1.1 from training. Duration: 0.77275 2023-10-04 15:17:07,195 WARNING [train.py:1204] (2/4) Exclude cut with ID _37086_210_9038_1_1532999540891_7021250_433-10563-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 15:17:07,235 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_53638_210_21581_1_1534297319_5808449_885-80172-0_sp1.1 from training. Duration: 0.87275 2023-10-04 15:17:07,332 WARNING [train.py:1204] (2/4) Exclude cut with ID _14623_210_1925_1_1530773354962_3632609_157-218771-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 15:17:07,343 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_1368-78165-0_sp1.1 from training. Duration: 0.97275 2023-10-04 15:17:07,347 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_309-65177-0_sp0.9 from training. Duration: 0.97775 2023-10-04 15:17:07,353 WARNING [train.py:1204] (2/4) Exclude cut with ID _21632_210_2253_1_1531994001765_3744140_291-138812-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 15:17:07,365 WARNING [train.py:1204] (2/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_669-93163-0_sp1.1 from training. Duration: 0.8909375 2023-10-04 15:17:07,916 WARNING [train.py:1204] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_292-215346-0 from training. Duration: 0.97 2023-10-04 15:17:07,940 WARNING [train.py:1204] (2/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_286-222073-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 15:17:08,229 WARNING [train.py:1204] (2/4) Exclude cut with ID _59256_210_5986_1_1535076085997_3589120_121-237386-0_sp1.1 from training. Duration: 0.87275 2023-10-04 15:17:08,242 WARNING [train.py:1204] (2/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_96-320736-0 from training. Duration: 0.86 2023-10-04 15:17:08,251 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_128-202104-0 from training. Duration: 0.63 2023-10-04 15:17:08,336 WARNING [train.py:1204] (2/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_242-3472-0_sp0.9 from training. Duration: 0.811125 2023-10-04 15:17:08,382 WARNING [train.py:1204] (2/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_154-74182-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 15:17:08,386 WARNING [train.py:1204] (2/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_187-221566-0 from training. Duration: 0.43 2023-10-04 15:17:08,548 WARNING [train.py:1204] (2/4) Exclude cut with ID _6194_210_1534_1_1527511826160_3941194_14-282876-0 from training. Duration: 0.71 2023-10-04 15:17:08,589 WARNING [train.py:1204] (2/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_660-211088-0_sp1.1 from training. Duration: 0.563625 2023-10-04 15:17:09,420 WARNING [train.py:1204] (2/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_526-62079-0_sp1.1 from training. Duration: 0.6090625 2023-10-04 15:17:09,573 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_92-246270-0_sp1.1 from training. Duration: 0.77275 2023-10-04 15:17:09,663 WARNING [train.py:1204] (2/4) Exclude cut with ID IC0287W0023-142350 from training. Duration: 0.956 2023-10-04 15:17:09,740 WARNING [train.py:1204] (2/4) Exclude cut with ID _58605_210_12210_1_1534723195939_3904259_48-269402-0 from training. Duration: 0.96 2023-10-04 15:17:09,755 WARNING [train.py:1204] (2/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_416-257730-0_sp0.9 from training. Duration: 0.72225 2023-10-04 15:17:09,831 WARNING [train.py:1204] (2/4) Exclude cut with ID _7877_210_6014_1_1528286358099_3750136_185-17516-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 15:17:09,937 WARNING [train.py:1204] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_23-189234-0 from training. Duration: 0.83 2023-10-04 15:17:09,954 WARNING [train.py:1204] (2/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_237-89684-0 from training. Duration: 0.96 2023-10-04 15:17:10,025 WARNING [train.py:1204] (2/4) Exclude cut with ID _13916_210_9994_1_1530788341126_3763149_116-21034-0 from training. Duration: 0.96 2023-10-04 15:17:10,170 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_3133_210_2087_1_1526106446_2324690_246-125000-0_sp1.1 from training. Duration: 0.563625 2023-10-04 15:17:16,733 INFO [train.py:1046] (2/4) Epoch 49, batch 0, loss[loss=0.1367, simple_loss=0.2167, pruned_loss=0.02833, over 23719.00 frames. ], tot_loss[loss=0.1367, simple_loss=0.2167, pruned_loss=0.02833, over 23719.00 frames. ], batch size: 232, lr: 2.08e-03, grad_scale: 32.0 2023-10-04 15:17:16,733 INFO [train.py:1069] (2/4) Computing validation loss 2023-10-04 15:17:29,906 INFO [train.py:1078] (2/4) Epoch 49, validation: loss=0.3215, simple_loss=0.2741, pruned_loss=0.1844, over 1125622.00 frames. 2023-10-04 15:17:29,907 INFO [train.py:1079] (2/4) Maximum memory allocated so far is 20967MB 2023-10-04 15:17:32,800 WARNING [train.py:1204] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_402-253027-0 from training. Duration: 0.54 2023-10-04 15:17:32,850 WARNING [train.py:1204] (2/4) Exclude cut with ID _59288_210_5854_1_1535077757774_3412246_158-289345-0_sp1.1 from training. Duration: 0.92725 2023-10-04 15:17:33,994 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.707e+02 2.045e+02 2.321e+02 2.638e+02 8.969e+02, threshold=4.643e+02, percent-clipped=2.0 2023-10-04 15:17:34,180 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_28211_210_6388_1_1532861506_4523224_128-328162-0_sp1.1 from training. Duration: 0.8181875 2023-10-04 15:17:38,417 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_788-278281-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 15:17:38,426 WARNING [train.py:1204] (2/4) Exclude cut with ID _49728_210_12651_1_1534041034294_6788150_624-195301-0_sp0.9 from training. Duration: 0.9444375 2023-10-04 15:17:39,751 WARNING [train.py:1204] (2/4) Exclude cut with ID _14860_210_4915_1_1530702020340_7343240_267-139775-0_sp1.1 from training. Duration: 0.963625 2023-10-04 15:17:39,828 WARNING [train.py:1204] (2/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_332-221057-0 from training. Duration: 0.68 2023-10-04 15:17:41,219 WARNING [train.py:1204] (2/4) Exclude cut with ID _40402_210_18219_1_1533522324115_2953150_242-76943-0 from training. Duration: 0.94 2023-10-04 15:17:41,335 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.balancer2.prob, batch_count=1699880.0, ans=0.125 2023-10-04 15:17:42,766 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.ff2_skip_rate, batch_count=1699946.6666666667, ans=0.0 2023-10-04 15:17:43,935 WARNING [train.py:1204] (2/4) Exclude cut with ID _16747_210_5986_1_1531811544733_2749109_87-349587-0_sp1.1 from training. Duration: 0.963625 2023-10-04 15:17:43,990 WARNING [train.py:1204] (2/4) Exclude cut with ID _22982_210_1883_1_1531882790447_5449409_505-346452-0_sp1.1 from training. Duration: 0.963625 2023-10-04 15:17:46,861 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.feed_forward3.hidden_balancer.prob, batch_count=1699946.6666666667, ans=0.125 2023-10-04 15:17:46,903 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.feed_forward2.hidden_balancer.prob, batch_count=1699946.6666666667, ans=0.125 2023-10-04 15:17:48,182 WARNING [train.py:1204] (2/4) Exclude cut with ID _17956_210_4353_1_1531209568125_6979970_13-192107-0_sp1.1 from training. Duration: 0.963625 2023-10-04 15:17:48,224 WARNING [train.py:1204] (2/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_430-307666-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 15:17:48,341 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.1.conv_module2.balancer2.prob, batch_count=1699946.6666666667, ans=0.125 2023-10-04 15:17:49,964 WARNING [train.py:1204] (2/4) Exclude cut with ID _2895_210_2477_1_1525956785794_4063750_289-11475-0_sp1.1 from training. Duration: 0.7454375 2023-10-04 15:17:49,984 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_3910_210_1794_1_1526742306_334530_28-238909-0_sp0.9 from training. Duration: 0.9 2023-10-04 15:17:51,211 WARNING [train.py:1204] (2/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_18-130336-0 from training. Duration: 0.79 2023-10-04 15:17:54,024 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_548-294316-0_sp0.9 from training. Duration: 0.9 2023-10-04 15:18:02,924 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_42-142473-0_sp1.1 from training. Duration: 0.6545625 2023-10-04 15:18:02,938 WARNING [train.py:1204] (2/4) Exclude cut with ID _59170_210_6721_1_1534983633960_4203335_326-48066-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 15:18:05,662 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_45886_210_17024_1_1533709215_471063_7-79165-0 from training. Duration: 0.98 2023-10-04 15:18:05,903 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.attention_skip_rate, batch_count=1700013.3333333333, ans=0.0 2023-10-04 15:18:08,615 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.conv_module1.balancer1.prob, batch_count=1700013.3333333333, ans=0.125 2023-10-04 15:18:11,044 WARNING [train.py:1204] (2/4) Exclude cut with ID _44585_210_18628_1_1533605350387_4518199_280-73534-0_sp0.9 from training. Duration: 0.988875 2023-10-04 15:18:11,059 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_3475_210_2028_1_1526193578_3622569_265-276625-0_sp1.1 from training. Duration: 0.6909375 2023-10-04 15:18:12,399 WARNING [train.py:1204] (2/4) Exclude cut with ID _64631_210_27767_1_1535349580068_4165160_88-225232-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 15:18:15,256 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_205-265518-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 15:18:16,909 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.0.conv_module2.balancer1.prob, batch_count=1700080.0, ans=0.125 2023-10-04 15:18:19,449 WARNING [train.py:1204] (2/4) Exclude cut with ID _40402_210_18219_1_1533522324115_2953150_271-253734-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 15:18:19,940 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.5.encoder.layers.1.self_attn2.whiten, num_groups=1, num_channels=256, metric=9.73 vs. limit=22.5 2023-10-04 15:18:23,585 WARNING [train.py:1204] (2/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_282-257791-0 from training. Duration: 0.86 2023-10-04 15:18:27,003 WARNING [train.py:1204] (2/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_356-45054-0 from training. Duration: 0.86 2023-10-04 15:18:28,394 WARNING [train.py:1204] (2/4) Exclude cut with ID _69584_210_5219_1_1535785133775_4044450_38-348116-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 15:18:28,396 WARNING [train.py:1204] (2/4) Exclude cut with ID _66763_210_17301_1_1535788667617_241750_21-328480-0_sp1.1 from training. Duration: 0.936375 2023-10-04 15:18:28,465 WARNING [train.py:1204] (2/4) Exclude cut with ID _26528_210_9070_1_1532570410842_3851310_497-97218-0_sp0.9 from training. Duration: 0.9555625 2023-10-04 15:18:30,314 WARNING [train.py:1204] (2/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_592-101075-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 15:18:32,121 WARNING [train.py:1204] (2/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_678-159307-0 from training. Duration: 0.85 2023-10-04 15:18:34,172 WARNING [train.py:1204] (2/4) Exclude cut with ID _28427_210_4220_1_1532581240967_3061069_166-319773-0_sp1.1 from training. Duration: 0.936375 2023-10-04 15:18:35,535 WARNING [train.py:1204] (2/4) Exclude cut with ID _29877_210_6228_1_1532397791106_5276149_606-224984-0_sp1.1 from training. Duration: 0.936375 2023-10-04 15:18:38,703 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.conv_skip_rate, batch_count=1700146.6666666667, ans=0.0 2023-10-04 15:18:39,299 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.1.encoder.layers.0.nonlin_attention.whiten2, num_groups=1, num_channels=256, metric=8.47 vs. limit=15.0 2023-10-04 15:18:39,754 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_125-203105-0_sp1.1 from training. Duration: 0.72725 2023-10-04 15:18:42,435 WARNING [train.py:1204] (2/4) Exclude cut with ID IC0620W0113-306765 from training. Duration: 0.768 2023-10-04 15:18:42,552 WARNING [train.py:1204] (2/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_410-287857-0_sp1.1 from training. Duration: 0.8454375 2023-10-04 15:18:43,768 INFO [train.py:1046] (2/4) Epoch 49, batch 50, loss[loss=0.154, simple_loss=0.236, pruned_loss=0.03602, over 23412.00 frames. ], tot_loss[loss=0.1534, simple_loss=0.2334, pruned_loss=0.03665, over 1064196.53 frames. ], batch size: 93, lr: 2.08e-03, grad_scale: 16.0 2023-10-04 15:18:45,263 WARNING [train.py:1204] (2/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_78-95260-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 15:18:47,955 WARNING [train.py:1204] (2/4) Exclude cut with ID _58059_210_5854_1_1534901471149_3164808_6-73080-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 15:18:47,984 WARNING [train.py:1204] (2/4) Exclude cut with ID _5166_210_2084_1_1527073197160_4087993_331-117829-0 from training. Duration: 0.62 2023-10-04 15:18:49,312 WARNING [train.py:1204] (2/4) Exclude cut with ID _14880_210_8895_1_1530680426655_3795259_289-212817-0_sp0.9 from training. Duration: 0.7666875 2023-10-04 15:18:50,503 WARNING [train.py:1204] (2/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_620-144779-0_sp1.1 from training. Duration: 0.92725 2023-10-04 15:18:52,019 WARNING [train.py:1204] (2/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_561-288575-0_sp1.1 from training. Duration: 0.963625 2023-10-04 15:18:53,442 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_22657_210_6890_1_1532086002_1637216_122-188128-0_sp1.1 from training. Duration: 0.963625 2023-10-04 15:18:54,931 WARNING [train.py:1204] (2/4) Exclude cut with ID _16756_210_4915_1_1531047803763_7104259_604-86607-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 15:18:58,250 WARNING [train.py:1204] (2/4) Exclude cut with ID _15150_210_8341_1_1530752411803_7256384_469-239509-0 from training. Duration: 0.68 2023-10-04 15:18:58,258 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_10987_210_4163_1_1529760555_4123902_302-87926-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 15:19:05,006 WARNING [train.py:1204] (2/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_469-94673-0_sp0.9 from training. Duration: 0.87775 2023-10-04 15:19:06,399 WARNING [train.py:1204] (2/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_798-106179-0 from training. Duration: 0.83 2023-10-04 15:19:07,802 WARNING [train.py:1204] (2/4) Exclude cut with ID _16744_210_5986_1_1531724405384_3907567_242-111316-0 from training. Duration: 0.93 2023-10-04 15:19:09,242 WARNING [train.py:1204] (2/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_938-205135-0_sp0.9 from training. Duration: 0.9666875 2023-10-04 15:19:09,346 WARNING [train.py:1204] (2/4) Exclude cut with ID _64135_210_2253_1_1535677267876_7608972_133-188266-0_sp0.9 from training. Duration: 0.9 2023-10-04 15:19:09,351 WARNING [train.py:1204] (2/4) Exclude cut with ID _44585_210_18628_1_1533605350387_4518199_623-346820-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 15:19:09,600 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.ff3_skip_rate, batch_count=1700280.0, ans=0.0 2023-10-04 15:19:10,727 WARNING [train.py:1204] (2/4) Exclude cut with ID _42326_210_7770_1_1533814282531_3707269_454-266903-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 15:19:10,804 WARNING [train.py:1204] (2/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_5-55893-0_sp1.1 from training. Duration: 0.5 2023-10-04 15:19:10,972 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.ff2_skip_rate, batch_count=1700280.0, ans=0.0 2023-10-04 15:19:12,130 WARNING [train.py:1204] (2/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_577-332600-0_sp1.1 from training. Duration: 0.5181875 2023-10-04 15:19:12,130 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_957-138882-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 15:19:18,909 WARNING [train.py:1204] (2/4) Exclude cut with ID _12556_210_8341_1_1530081024100_7327690_219-38004-0_sp1.1 from training. Duration: 0.97275 2023-10-04 15:19:20,291 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_397-147196-0_sp1.1 from training. Duration: 0.72725 2023-10-04 15:19:21,567 WARNING [train.py:1204] (2/4) Exclude cut with ID _3541_210_2477_1_1526475408709_3263973_231-104736-0_sp0.9 from training. Duration: 0.8666875 2023-10-04 15:19:22,291 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.2.encoder.layers.0.feed_forward2.out_whiten, num_groups=1, num_channels=384, metric=4.56 vs. limit=15.0 2023-10-04 15:19:22,897 WARNING [train.py:1204] (2/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_398-334558-0 from training. Duration: 0.96 2023-10-04 15:19:25,646 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_257-20733-0_sp1.1 from training. Duration: 0.6909375 2023-10-04 15:19:25,735 WARNING [train.py:1204] (2/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_146-282400-0_sp1.1 from training. Duration: 0.6454375 2023-10-04 15:19:27,537 WARNING [train.py:1204] (2/4) Exclude cut with ID _51899_210_20034_1_1534129254466_3669220_265-215428-0 from training. Duration: 0.98 2023-10-04 15:19:27,605 WARNING [train.py:1204] (2/4) Exclude cut with ID _34423_210_8750_1_1533430798456_3330199_219-106541-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 15:19:29,009 WARNING [train.py:1204] (2/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_458-67321-0 from training. Duration: 0.97 2023-10-04 15:19:38,562 WARNING [train.py:1204] (2/4) Exclude cut with ID _6653_210_2629_1_1527919172441_6732679_1051-182606-0_sp1.1 from training. Duration: 0.936375 2023-10-04 15:19:38,584 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_53555_210_5632_1_1534595220_7232297_603-4396-0_sp1.1 from training. Duration: 0.97275 2023-10-04 15:19:39,947 WARNING [train.py:1204] (2/4) Exclude cut with ID _54218_210_12210_1_1534413646388_4409259_159-144367-0_sp1.1 from training. Duration: 0.963625 2023-10-04 15:19:41,316 WARNING [train.py:1204] (2/4) Exclude cut with ID _7606_210_4915_1_1528894253277_5080690_301-10732-0_sp0.9 from training. Duration: 0.9 2023-10-04 15:19:41,321 WARNING [train.py:1204] (2/4) Exclude cut with ID _15026_210_8750_1_1530777602316_3291850_211-195043-0_sp0.9 from training. Duration: 0.888875 2023-10-04 15:19:42,819 WARNING [train.py:1204] (2/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_146-282400-0 from training. Duration: 0.71 2023-10-04 15:19:44,171 WARNING [train.py:1204] (2/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_65-88242-0 from training. Duration: 0.56 2023-10-04 15:19:44,257 WARNING [train.py:1204] (2/4) Exclude cut with ID _55857_210_2346_1_1534644007831_6898209_80-107757-0_sp1.1 from training. Duration: 0.963625 2023-10-04 15:19:45,631 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_15126_210_6753_1_1530778677_2591185_120-277707-0_sp0.9 from training. Duration: 0.888875 2023-10-04 15:19:46,992 WARNING [train.py:1204] (2/4) Exclude cut with ID _44778_210_2054_1_1533798127203_7443090_1033-168593-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 15:19:48,349 WARNING [train.py:1204] (2/4) Exclude cut with ID _46683_210_9642_1_1533730913492_4591116_475-30711-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 15:19:48,374 WARNING [train.py:1204] (2/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_253-308377-0 from training. Duration: 0.49 2023-10-04 15:19:48,434 WARNING [train.py:1204] (2/4) Exclude cut with ID _4680_210_3525_1_1527249757953_893386_75-78979-0 from training. Duration: 0.91 2023-10-04 15:19:49,919 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_5522_210_3273_1_1527249606_3909096_318-203153-0_sp0.9 from training. Duration: 0.47775 2023-10-04 15:19:51,225 WARNING [train.py:1204] (2/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_550-170116-0_sp1.1 from training. Duration: 0.92725 2023-10-04 15:19:51,264 WARNING [train.py:1204] (2/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_277-210798-0_sp0.9 from training. Duration: 0.611125 2023-10-04 15:19:53,975 WARNING [train.py:1204] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_620-276793-0 from training. Duration: 0.59 2023-10-04 15:19:53,994 WARNING [train.py:1204] (2/4) Exclude cut with ID _3067_210_2667_1_1526212974640_4503168_119-175776-0 from training. Duration: 0.74 2023-10-04 15:19:54,074 WARNING [train.py:1204] (2/4) Exclude cut with ID _55209_210_1531_1_1534675828996_4597160_681-195827-0_sp1.1 from training. Duration: 0.92725 2023-10-04 15:19:55,427 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_427-78097-0_sp1.1 from training. Duration: 0.72725 2023-10-04 15:19:56,868 INFO [train.py:1046] (2/4) Epoch 49, batch 100, loss[loss=0.1391, simple_loss=0.2144, pruned_loss=0.03191, over 20379.00 frames. ], tot_loss[loss=0.1545, simple_loss=0.2359, pruned_loss=0.0365, over 1884009.09 frames. ], batch size: 44, lr: 2.08e-03, grad_scale: 16.0 2023-10-04 15:19:58,316 WARNING [train.py:1204] (2/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_96-290039-0_sp0.9 from training. Duration: 0.62225 2023-10-04 15:19:58,325 WARNING [train.py:1204] (2/4) Exclude cut with ID _45329_210_7005_1_1534157951211_6774339_287-292174-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 15:20:00,400 WARNING [train.py:1204] (2/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_200-292228-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 15:20:00,654 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.feed_forward1.hidden_balancer.prob, batch_count=1700546.6666666667, ans=0.125 2023-10-04 15:20:03,028 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.765e+02 2.158e+02 2.509e+02 3.581e+02 6.857e+02, threshold=5.017e+02, percent-clipped=12.0 2023-10-04 15:20:03,118 WARNING [train.py:1204] (2/4) Exclude cut with ID _69884_210_9771_1_1535846392818_4166169_291-194999-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 15:20:05,112 WARNING [train.py:1204] (2/4) Exclude cut with ID _58895_210_8750_1_1535184081320_3649089_265-323810-0_sp1.1 from training. Duration: 0.87275 2023-10-04 15:20:06,396 WARNING [train.py:1204] (2/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_58-68956-0 from training. Duration: 0.83 2023-10-04 15:20:06,401 WARNING [train.py:1204] (2/4) Exclude cut with ID _39867_210_9826_1_1533553222030_9344259_322-115153-0_sp1.1 from training. Duration: 0.963625 2023-10-04 15:20:09,269 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_3371_210_1794_1_1526384462_5580777_396-339812-0_sp1.1 from training. Duration: 0.77275 2023-10-04 15:20:09,280 WARNING [train.py:1204] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_731-2428-0_sp0.9 from training. Duration: 0.92225 2023-10-04 15:20:10,529 WARNING [train.py:1204] (2/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_98-741-0_sp1.1 from training. Duration: 0.72725 2023-10-04 15:20:10,531 WARNING [train.py:1204] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_238-245655-0_sp0.9 from training. Duration: 0.9 2023-10-04 15:20:10,552 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6788_210_1388_1_1527898698_4562021_91-197981-0_sp0.9 from training. Duration: 0.92225 2023-10-04 15:20:11,846 WARNING [train.py:1204] (2/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_860-96198-0 from training. Duration: 0.73 2023-10-04 15:20:14,556 WARNING [train.py:1204] (2/4) Exclude cut with ID _14268_210_9206_1_1530622771285_4714099_331-105351-0_sp0.9 from training. Duration: 0.72225 2023-10-04 15:20:14,577 WARNING [train.py:1204] (2/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_204-137133-0_sp1.1 from training. Duration: 0.92725 2023-10-04 15:20:14,603 WARNING [train.py:1204] (2/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_5-46496-0_sp1.1 from training. Duration: 0.936375 2023-10-04 15:20:14,611 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_160-218721-0_sp1.1 from training. Duration: 0.87275 2023-10-04 15:20:17,383 WARNING [train.py:1204] (2/4) Exclude cut with ID _45329_210_7005_1_1534157951211_6774339_503-287883-0 from training. Duration: 0.88 2023-10-04 15:20:18,782 WARNING [train.py:1204] (2/4) Exclude cut with ID _57572_210_5517_1_1534921219624_3809484_110-79134-0_sp1.1 from training. Duration: 0.92725 2023-10-04 15:20:20,141 WARNING [train.py:1204] (2/4) Exclude cut with ID _44585_210_18628_1_1533605350387_4518199_421-37641-0_sp1.1 from training. Duration: 0.936375 2023-10-04 15:20:20,231 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_42-142473-0_sp0.9 from training. Duration: 0.8 2023-10-04 15:20:22,895 WARNING [train.py:1204] (2/4) Exclude cut with ID _5166_210_2084_1_1527073197160_4087993_310-114452-0_sp0.9 from training. Duration: 0.6555625 2023-10-04 15:20:25,732 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9029W0120-509520 from training. Duration: 0.768 2023-10-04 15:20:27,050 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9029W0433-509820 from training. Duration: 0.896 2023-10-04 15:20:27,173 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_40222_210_6228_1_1533385298_4481939_163-334586-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 15:20:27,173 WARNING [train.py:1204] (2/4) Exclude cut with ID _48677_210_5663_1_1533902405919_3892989_408-40740-0_sp1.1 from training. Duration: 0.7545625 2023-10-04 15:20:30,510 WARNING [train.py:1204] (2/4) Exclude cut with ID _24314_210_4915_1_1532135078116_7170179_521-258868-0_sp0.9 from training. Duration: 0.87775 2023-10-04 15:20:34,271 WARNING [train.py:1204] (2/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_576-51198-0_sp1.1 from training. Duration: 0.92725 2023-10-04 15:20:34,380 WARNING [train.py:1204] (2/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_306-216871-0_sp1.1 from training. Duration: 0.97275 2023-10-04 15:20:38,865 WARNING [train.py:1204] (2/4) Exclude cut with ID _20684_210_6917_1_1531567751646_4203930_463-119982-0_sp1.1 from training. Duration: 0.97275 2023-10-04 15:20:38,931 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9084W0012-536850 from training. Duration: 0.64 2023-10-04 15:20:41,498 WARNING [train.py:1204] (2/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_847-41334-0_sp1.1 from training. Duration: 0.4 2023-10-04 15:20:43,100 WARNING [train.py:1204] (2/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_326-165900-0_sp1.1 from training. Duration: 0.62725 2023-10-04 15:20:44,378 WARNING [train.py:1204] (2/4) Exclude cut with ID _7537_210_2084_1_1528502395431_3924359_128-268029-0_sp1.1 from training. Duration: 0.8909375 2023-10-04 15:20:48,345 WARNING [train.py:1204] (2/4) Exclude cut with ID _67499_210_17282_1_1535509805220_3692430_333-155030-0_sp1.1 from training. Duration: 0.97275 2023-10-04 15:20:51,219 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_34008_210_10431_1_1533345424_8183247_588-257177-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 15:20:51,484 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.feed_forward2.hidden_balancer.prob, batch_count=1700746.6666666667, ans=0.125 2023-10-04 15:20:55,342 WARNING [train.py:1204] (2/4) Exclude cut with ID _25914_210_6400_1_1532141317058_644740_52-96808-0_sp1.1 from training. Duration: 0.82725 2023-10-04 15:20:56,852 WARNING [train.py:1204] (2/4) Exclude cut with ID _56494_210_4220_1_1535284837625_3033169_69-138150-0_sp1.1 from training. Duration: 0.8545625 2023-10-04 15:20:59,654 WARNING [train.py:1204] (2/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_19-143553-0_sp1.1 from training. Duration: 0.97275 2023-10-04 15:21:01,653 WARNING [train.py:1204] (2/4) Exclude cut with ID _21878_210_3525_1_1531738772002_7264330_582-130735-0_sp1.1 from training. Duration: 0.936375 2023-10-04 15:21:02,990 WARNING [train.py:1204] (2/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_943-201356-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 15:21:03,007 WARNING [train.py:1204] (2/4) Exclude cut with ID _3231_210_2106_1_1526205864934_6784220_422-229276-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 15:21:03,030 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_48049_210_5632_1_1533864553_3519937_73-198717-0_sp1.1 from training. Duration: 0.97275 2023-10-04 15:21:04,375 WARNING [train.py:1204] (2/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_329-285801-0 from training. Duration: 0.91 2023-10-04 15:21:06,210 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9171W0114-579000 from training. Duration: 0.896 2023-10-04 15:21:06,222 WARNING [train.py:1204] (2/4) Exclude cut with ID _3120_210_3525_1_1526036143112_4072980_210-132675-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 15:21:06,304 WARNING [train.py:1204] (2/4) Exclude cut with ID _55857_210_2346_1_1534644007831_6898209_745-98391-0_sp0.9 from training. Duration: 0.8444375 2023-10-04 15:21:08,095 WARNING [train.py:1204] (2/4) Exclude cut with ID _5250_210_5219_1_1527249561806_8692110_615-322035-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 15:21:08,110 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_36756_210_7234_1_1533026991_966276_119-315767-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 15:21:08,768 WARNING [train.py:1204] (2/4) Exclude cut with ID _13916_210_9994_1_1530788341126_3763149_60-191556-0_sp1.1 from training. Duration: 0.5545625 2023-10-04 15:21:08,783 WARNING [train.py:1204] (2/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_177-236809-0_sp1.1 from training. Duration: 0.4454375 2023-10-04 15:21:09,026 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.bypass_mid.scale_min, batch_count=1700813.3333333333, ans=0.2 2023-10-04 15:21:10,011 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7002_210_1794_1_1527939683_5157286_184-229630-0_sp1.1 from training. Duration: 0.67275 2023-10-04 15:21:10,025 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_50428_210_5724_1_1534247532_4126418_464-18322-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 15:21:10,097 WARNING [train.py:1204] (2/4) Exclude cut with ID _35799_210_5854_1_1533263487887_3529071_466-131402-0_sp1.1 from training. Duration: 0.936375 2023-10-04 15:21:11,382 INFO [train.py:1046] (2/4) Epoch 49, batch 150, loss[loss=0.1353, simple_loss=0.2169, pruned_loss=0.02691, over 24438.00 frames. ], tot_loss[loss=0.1546, simple_loss=0.236, pruned_loss=0.03666, over 2519723.86 frames. ], batch size: 58, lr: 2.08e-03, grad_scale: 16.0 2023-10-04 15:21:11,485 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_1259-132768-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 15:21:12,848 WARNING [train.py:1204] (2/4) Exclude cut with ID _6694_210_4599_1_1527852637490_3965529_144-185051-0_sp1.1 from training. Duration: 0.87275 2023-10-04 15:21:12,898 WARNING [train.py:1204] (2/4) Exclude cut with ID _64639_210_5816_1_1535367619437_3528000_59-133290-0_sp1.1 from training. Duration: 0.963625 2023-10-04 15:21:14,519 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.conv_module2.balancer2.prob, batch_count=1700880.0, ans=0.125 2023-10-04 15:21:15,595 WARNING [train.py:1204] (2/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_223-49525-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 15:21:18,355 WARNING [train.py:1204] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_748-36150-0_sp1.1 from training. Duration: 0.82725 2023-10-04 15:21:18,356 WARNING [train.py:1204] (2/4) Exclude cut with ID _19896_210_1883_1_1531709022662_6649569_52-221627-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 15:21:18,410 WARNING [train.py:1204] (2/4) Exclude cut with ID _6789_210_1534_1_1527854471671_2870519_296-115766-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 15:21:19,847 WARNING [train.py:1204] (2/4) Exclude cut with ID _14624_210_1925_1_1530858690657_3642219_100-19871-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 15:21:21,099 WARNING [train.py:1204] (2/4) Exclude cut with ID _12555_210_8750_1_1530075594402_3780239_50-293675-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 15:21:22,516 WARNING [train.py:1204] (2/4) Exclude cut with ID _14880_210_8895_1_1530680426655_3795259_289-212817-0_sp1.1 from training. Duration: 0.62725 2023-10-04 15:21:23,818 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_14056_210_4163_1_1530604483_7572827_388-211626-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 15:21:27,878 WARNING [train.py:1204] (2/4) Exclude cut with ID _5289_210_2084_1_1527678202736_3779299_392-261988-0 from training. Duration: 0.65 2023-10-04 15:21:27,889 WARNING [train.py:1204] (2/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_124-345840-0 from training. Duration: 0.52 2023-10-04 15:21:27,901 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_2655_210_2087_1_1525688959_3897400_437-337690-0 from training. Duration: 0.63 2023-10-04 15:21:28,050 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.balancer1.prob, batch_count=1700946.6666666667, ans=0.125 2023-10-04 15:21:31,904 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_3780_210_2087_1_1526467391_239068_13-274633-0_sp1.1 from training. Duration: 0.92725 2023-10-04 15:21:31,916 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_805-71497-0_sp1.1 from training. Duration: 0.6454375 2023-10-04 15:21:33,865 WARNING [train.py:1204] (2/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_434-83000-0_sp1.1 from training. Duration: 0.8909375 2023-10-04 15:21:33,953 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_17613_210_2151_1_1531137569_2366160_285-219531-0_sp1.1 from training. Duration: 0.97275 2023-10-04 15:21:33,958 WARNING [train.py:1204] (2/4) Exclude cut with ID _56108_210_16380_1_1534505446159_3887959_22-37858-0_sp1.1 from training. Duration: 0.936375 2023-10-04 15:21:33,998 WARNING [train.py:1204] (2/4) Exclude cut with ID _28427_210_4220_1_1532581240967_3061069_200-88444-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 15:21:36,626 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_43802_210_2290_1_1533516203_8829504_77-279042-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 15:21:38,592 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9309W0193-640320 from training. Duration: 0.896 2023-10-04 15:21:40,109 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.feed_forward1.out_proj.dropout_p, batch_count=1701013.3333333333, ans=0.1 2023-10-04 15:21:41,748 WARNING [train.py:1204] (2/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_516-324055-0_sp1.1 from training. Duration: 0.936375 2023-10-04 15:21:45,121 WARNING [train.py:1204] (2/4) Exclude cut with ID _40094_210_16380_1_1533294005773_3641159_10-261240-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 15:21:50,567 WARNING [train.py:1204] (2/4) Exclude cut with ID _6267_210_2084_1_1527764658526_3894730_375-219887-0_sp1.1 from training. Duration: 0.6090625 2023-10-04 15:21:51,969 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6788_210_1388_1_1527898698_4562021_180-75288-0 from training. Duration: 0.99 2023-10-04 15:21:54,863 WARNING [train.py:1204] (2/4) Exclude cut with ID _8030_210_7497_1_1528444886781_3531089_262-86750-0_sp1.1 from training. Duration: 0.736375 2023-10-04 15:21:54,877 WARNING [train.py:1204] (2/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_934-325498-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 15:21:54,898 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_456-22141-0_sp0.9 from training. Duration: 0.97775 2023-10-04 15:21:57,668 WARNING [train.py:1204] (2/4) Exclude cut with ID _49643_210_7989_1_1534640244224_3939031_598-161926-0_sp1.1 from training. Duration: 0.8181875 2023-10-04 15:21:58,997 WARNING [train.py:1204] (2/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_345-49721-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 15:22:00,441 WARNING [train.py:1204] (2/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_440-141811-0_sp1.1 from training. Duration: 0.863625 2023-10-04 15:22:01,780 WARNING [train.py:1204] (2/4) Exclude cut with ID _64048_210_10626_1_1535198382802_3835179_394-212120-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 15:22:01,831 WARNING [train.py:1204] (2/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_91-277684-0 from training. Duration: 0.74 2023-10-04 15:22:05,206 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.4.encoder.layers.0.whiten, num_groups=1, num_channels=384, metric=4.27 vs. limit=12.0 2023-10-04 15:22:07,876 WARNING [train.py:1204] (2/4) Exclude cut with ID _23038_210_9570_1_1532001584158_3652419_191-346343-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 15:22:07,935 WARNING [train.py:1204] (2/4) Exclude cut with ID _15140_210_9288_1_1530791388450_4880149_351-347974-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 15:22:09,699 WARNING [train.py:1204] (2/4) Exclude cut with ID _58605_210_12210_1_1534723195939_3904259_48-269402-0_sp1.1 from training. Duration: 0.87275 2023-10-04 15:22:09,706 WARNING [train.py:1204] (2/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_401-53423-0_sp0.9 from training. Duration: 0.611125 2023-10-04 15:22:09,863 WARNING [train.py:1204] (2/4) Exclude cut with ID _42659_210_2070_1_1533607442765_3120380_158-49004-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 15:22:13,033 WARNING [train.py:1204] (2/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_81-75849-0_sp1.1 from training. Duration: 0.3545625 2023-10-04 15:22:15,800 WARNING [train.py:1204] (2/4) Exclude cut with ID _27052_210_5986_1_1532329197187_3585009_314-106499-0_sp1.1 from training. Duration: 0.836375 2023-10-04 15:22:19,041 WARNING [train.py:1204] (2/4) Exclude cut with ID _4626_210_3532_1_1526817652717_3916859_446-206656-0_sp0.9 from training. Duration: 0.8444375 2023-10-04 15:22:20,398 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_53378_210_17282_1_1534221071_5769509_158-114960-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 15:22:21,949 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_3171_210_2087_1_1526109084_2330007_37-64334-0_sp0.9 from training. Duration: 0.911125 2023-10-04 15:22:21,971 WARNING [train.py:1204] (2/4) Exclude cut with ID _5410_210_1388_1_1527397055795_1719020_39-112221-0 from training. Duration: 0.69 2023-10-04 15:22:21,994 WARNING [train.py:1204] (2/4) Exclude cut with ID _8064_210_5399_1_1528372299291_4282973_4-91037-0_sp0.9 from training. Duration: 0.97775 2023-10-04 15:22:22,007 WARNING [train.py:1204] (2/4) Exclude cut with ID ID0079W0443-717795 from training. Duration: 0.541 2023-10-04 15:22:24,638 INFO [train.py:1046] (2/4) Epoch 49, batch 200, loss[loss=0.1642, simple_loss=0.2535, pruned_loss=0.0374, over 24314.00 frames. ], tot_loss[loss=0.155, simple_loss=0.2362, pruned_loss=0.03686, over 3017902.76 frames. ], batch size: 74, lr: 2.08e-03, grad_scale: 8.0 2023-10-04 15:22:26,069 WARNING [train.py:1204] (2/4) Exclude cut with ID _34734_210_4525_1_1533365977542_4448720_766-297928-0_sp1.1 from training. Duration: 0.936375 2023-10-04 15:22:26,280 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.balancer1.prob, batch_count=1701213.3333333333, ans=0.125 2023-10-04 15:22:28,865 WARNING [train.py:1204] (2/4) Exclude cut with ID _51471_210_1889_1_1534399391390_3094250_310-337793-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 15:22:28,883 WARNING [train.py:1204] (2/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_678-159307-0_sp0.9 from training. Duration: 0.9444375 2023-10-04 15:22:31,468 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.785e+02 2.091e+02 2.483e+02 2.820e+02 4.218e+02, threshold=4.965e+02, percent-clipped=0.0 2023-10-04 15:22:32,964 WARNING [train.py:1204] (2/4) Exclude cut with ID _50093_210_4220_1_1534417141207_3432299_259-322252-0 from training. Duration: 0.92 2023-10-04 15:22:34,249 WARNING [train.py:1204] (2/4) Exclude cut with ID _47790_210_16187_1_1533862814066_4207519_67-103444-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 15:22:34,278 WARNING [train.py:1204] (2/4) Exclude cut with ID _40551_210_10635_1_1533776424632_3416179_311-247578-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 15:22:37,572 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_4633_210_2040_1_1526824163_4145343_416-76117-0 from training. Duration: 0.95 2023-10-04 15:22:38,962 WARNING [train.py:1204] (2/4) Exclude cut with ID _5166_210_2084_1_1527073197160_4087993_331-117829-0_sp0.9 from training. Duration: 0.688875 2023-10-04 15:22:41,555 WARNING [train.py:1204] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_299-247059-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 15:22:41,631 WARNING [train.py:1204] (2/4) Exclude cut with ID _64002_210_9570_1_1535198605157_38979_2-240845-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 15:22:44,927 WARNING [train.py:1204] (2/4) Exclude cut with ID _4681_210_3525_1_1527250689464_120889_15-126760-0_sp0.9 from training. Duration: 0.9 2023-10-04 15:22:44,939 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_21500_210_2054_1_1532149310_2716758_256-156611-0_sp1.1 from training. Duration: 0.936375 2023-10-04 15:22:46,340 WARNING [train.py:1204] (2/4) Exclude cut with ID _16908_210_5724_1_1531119157248_3772700_624-203729-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 15:23:01,300 WARNING [train.py:1204] (2/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_605-219732-0_sp1.1 from training. Duration: 0.8545625 2023-10-04 15:23:01,333 WARNING [train.py:1204] (2/4) Exclude cut with ID _44718_210_5816_1_1534050051869_6469030_380-167520-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 15:23:01,569 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.conv_module2.balancer1.prob, batch_count=1701346.6666666667, ans=0.125 2023-10-04 15:23:02,705 WARNING [train.py:1204] (2/4) Exclude cut with ID _19896_210_1883_1_1531709022662_6649569_94-229927-0_sp1.1 from training. Duration: 0.8818125 2023-10-04 15:23:04,059 WARNING [train.py:1204] (2/4) Exclude cut with ID _19514_210_9259_1_1531530071330_5084230_519-174300-0_sp1.1 from training. Duration: 0.963625 2023-10-04 15:23:04,138 WARNING [train.py:1204] (2/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_340-174176-0_sp0.9 from training. Duration: 0.5444375 2023-10-04 15:23:04,142 WARNING [train.py:1204] (2/4) Exclude cut with ID _8263_210_1664_1_1528439452891_970220_68-181805-0_sp1.1 from training. Duration: 0.5818125 2023-10-04 15:23:04,264 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.bypass.scale_min, batch_count=1701346.6666666667, ans=0.2 2023-10-04 15:23:07,352 WARNING [train.py:1204] (2/4) Exclude cut with ID _3120_210_3525_1_1526036143112_4072980_348-257723-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 15:23:07,438 WARNING [train.py:1204] (2/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_453-133983-0_sp1.1 from training. Duration: 0.7090625 2023-10-04 15:23:08,902 WARNING [train.py:1204] (2/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_17-86060-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 15:23:08,926 WARNING [train.py:1204] (2/4) Exclude cut with ID _29877_210_6228_1_1532397791106_5276149_228-184657-0_sp1.1 from training. Duration: 0.97275 2023-10-04 15:23:10,331 WARNING [train.py:1204] (2/4) Exclude cut with ID _18516_210_9206_1_1531704653206_3692406_198-334870-0 from training. Duration: 0.92 2023-10-04 15:23:10,373 WARNING [train.py:1204] (2/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_340-174176-0_sp1.1 from training. Duration: 0.4454375 2023-10-04 15:23:10,385 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_3133_210_2087_1_1526106446_2324690_14-139186-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 15:23:13,746 WARNING [train.py:1204] (2/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_126-29104-0_sp0.9 from training. Duration: 0.9444375 2023-10-04 15:23:19,574 WARNING [train.py:1204] (2/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_84-10310-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 15:23:27,033 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_15517_210_6753_1_1530954008_3487336_79-273076-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 15:23:28,400 WARNING [train.py:1204] (2/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_371-18169-0_sp0.9 from training. Duration: 0.9555625 2023-10-04 15:23:35,300 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_68702_210_23689_1_1535799577_3420068_170-92788-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 15:23:37,296 WARNING [train.py:1204] (2/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_78-182912-0 from training. Duration: 0.59 2023-10-04 15:23:37,489 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.feed_forward1.out_proj.dropout_p, batch_count=1701546.6666666667, ans=0.1 2023-10-04 15:23:38,498 INFO [train.py:1046] (2/4) Epoch 49, batch 250, loss[loss=0.1355, simple_loss=0.2019, pruned_loss=0.03455, over 22740.00 frames. ], tot_loss[loss=0.1541, simple_loss=0.2351, pruned_loss=0.03653, over 3391537.65 frames. ], batch size: 322, lr: 2.08e-03, grad_scale: 8.0 2023-10-04 15:23:38,598 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_3019_210_2028_1_1526044245_3264865_297-12384-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 15:23:38,604 WARNING [train.py:1204] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_147-111137-0_sp1.1 from training. Duration: 0.863625 2023-10-04 15:23:38,609 WARNING [train.py:1204] (2/4) Exclude cut with ID _28323_210_16187_1_1532480395667_4100300_117-8859-0_sp1.1 from training. Duration: 0.97275 2023-10-04 15:23:40,068 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7762_210_1794_1_1528199508_4057408_419-125009-0_sp1.1 from training. Duration: 0.7545625 2023-10-04 15:23:41,463 WARNING [train.py:1204] (2/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_268-87909-0 from training. Duration: 0.99 2023-10-04 15:23:42,772 WARNING [train.py:1204] (2/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_118-311357-0_sp1.1 from training. Duration: 0.92725 2023-10-04 15:23:42,792 WARNING [train.py:1204] (2/4) Exclude cut with ID ID0377W0003-870225 from training. Duration: 0.852 2023-10-04 15:23:44,838 WARNING [train.py:1204] (2/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_56-5838-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 15:23:46,185 WARNING [train.py:1204] (2/4) Exclude cut with ID _14603_210_4220_1_1530615688441_3097319_90-31383-0_sp0.9 from training. Duration: 0.9666875 2023-10-04 15:23:47,567 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_31583_210_3852_1_1532595851_4024413_58-86657-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 15:23:47,614 WARNING [train.py:1204] (2/4) Exclude cut with ID _30304_210_6319_1_1532476827261_7582060_344-113875-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 15:23:49,023 WARNING [train.py:1204] (2/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_278-104676-0_sp1.1 from training. Duration: 0.8909375 2023-10-04 15:23:50,817 WARNING [train.py:1204] (2/4) Exclude cut with ID _55844_210_2346_1_1534471212569_7625707_764-2387-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 15:23:52,133 WARNING [train.py:1204] (2/4) Exclude cut with ID _27430_210_7770_1_1532604689375_1378590_157-14963-0_sp1.1 from training. Duration: 0.963625 2023-10-04 15:23:55,645 WARNING [train.py:1204] (2/4) Exclude cut with ID _7160_210_1876_1_1527940923231_3703799_282-322611-0_sp1.1 from training. Duration: 0.7909375 2023-10-04 15:23:57,222 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.conv_module1.balancer1.prob, batch_count=1701613.3333333333, ans=0.125 2023-10-04 15:24:03,988 WARNING [train.py:1204] (2/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_190-138640-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 15:24:07,387 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.nonlin_attention.balancer.max_positive, batch_count=1701680.0, ans=0.95 2023-10-04 15:24:08,454 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_33348_210_5632_1_1533086912_3324030_63-270922-0_sp1.1 from training. Duration: 0.97275 2023-10-04 15:24:08,497 WARNING [train.py:1204] (2/4) Exclude cut with ID _54871_210_22356_1_1534665798253_3856359_135-184281-0_sp1.1 from training. Duration: 0.9 2023-10-04 15:24:14,826 WARNING [train.py:1204] (2/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_277-210798-0_sp1.1 from training. Duration: 0.5 2023-10-04 15:24:16,108 WARNING [train.py:1204] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_402-253027-0_sp0.9 from training. Duration: 0.6 2023-10-04 15:24:16,191 WARNING [train.py:1204] (2/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_540-275685-0_sp0.9 from training. Duration: 0.988875 2023-10-04 15:24:16,222 WARNING [train.py:1204] (2/4) Exclude cut with ID _59493_210_17282_1_1535078419077_3225879_118-18960-0_sp1.1 from training. Duration: 0.936375 2023-10-04 15:24:16,432 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.ff2_skip_rate, batch_count=1701680.0, ans=0.0 2023-10-04 15:24:17,623 WARNING [train.py:1204] (2/4) Exclude cut with ID _6194_210_1534_1_1527511826160_3941194_321-148641-0_sp1.1 from training. Duration: 0.5818125 2023-10-04 15:24:17,638 WARNING [train.py:1204] (2/4) Exclude cut with ID _61133_210_9969_1_1535196777227_3696409_333-270325-0_sp1.1 from training. Duration: 0.8454375 2023-10-04 15:24:17,696 WARNING [train.py:1204] (2/4) Exclude cut with ID _49376_210_5986_1_1533985278732_3370740_307-145221-0_sp1.1 from training. Duration: 0.936375 2023-10-04 15:24:21,000 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6846_210_6753_1_1527850767_4295158_2-315073-0_sp0.9 from training. Duration: 0.97775 2023-10-04 15:24:23,741 WARNING [train.py:1204] (2/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_17-25045-0 from training. Duration: 0.59 2023-10-04 15:24:23,762 WARNING [train.py:1204] (2/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_503-333510-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 15:24:26,371 WARNING [train.py:1204] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_238-245655-0_sp1.1 from training. Duration: 0.736375 2023-10-04 15:24:26,392 WARNING [train.py:1204] (2/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_329-82235-0_sp0.9 from training. Duration: 0.911125 2023-10-04 15:24:26,399 WARNING [train.py:1204] (2/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_325-113798-0_sp0.9 from training. Duration: 0.9444375 2023-10-04 15:24:27,799 WARNING [train.py:1204] (2/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_395-37222-0_sp0.9 from training. Duration: 0.8444375 2023-10-04 15:24:27,885 WARNING [train.py:1204] (2/4) Exclude cut with ID _64135_210_2253_1_1535677267876_7608972_498-5039-0_sp1.1 from training. Duration: 0.7090625 2023-10-04 15:24:27,895 WARNING [train.py:1204] (2/4) Exclude cut with ID _14922_210_6228_1_1530707435273_3693310_303-228738-0_sp0.9 from training. Duration: 0.9333125 2023-10-04 15:24:29,962 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_577-220899-0_sp1.1 from training. Duration: 0.92725 2023-10-04 15:24:32,804 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_3475_210_2028_1_1526193578_3622569_377-166133-0_sp1.1 from training. Duration: 0.7818125 2023-10-04 15:24:32,849 WARNING [train.py:1204] (2/4) Exclude cut with ID _49376_210_5986_1_1533985278732_3370740_272-313512-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 15:24:34,738 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.4.encoder.layers.0.self_attn_weights.whiten_keys, num_groups=4, num_channels=128, metric=1.57 vs. limit=6.0 2023-10-04 15:24:36,913 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7762_210_1794_1_1528199508_4057408_422-227034-0_sp0.9 from training. Duration: 0.811125 2023-10-04 15:24:40,448 WARNING [train.py:1204] (2/4) Exclude cut with ID _18895_210_6228_1_1531357274234_3784930_74-341428-0_sp1.1 from training. Duration: 0.92725 2023-10-04 15:24:43,209 WARNING [train.py:1204] (2/4) Exclude cut with ID _44902_210_16187_1_1533888006062_3792078_149-67212-0_sp1.1 from training. Duration: 0.7909375 2023-10-04 15:24:46,770 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=1701813.3333333333, ans=0.1 2023-10-04 15:24:47,438 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.1.encoder.layers.0.self_attn_weights.whiten_keys, num_groups=4, num_channels=128, metric=3.58 vs. limit=6.0 2023-10-04 15:24:47,940 WARNING [train.py:1204] (2/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_243-126020-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 15:24:48,122 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.feed_forward1.hidden_balancer.prob, batch_count=1701813.3333333333, ans=0.125 2023-10-04 15:24:49,390 WARNING [train.py:1204] (2/4) Exclude cut with ID _54218_210_12210_1_1534413646388_4409259_259-6561-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 15:24:53,840 INFO [train.py:1046] (2/4) Epoch 49, batch 300, loss[loss=0.1421, simple_loss=0.2217, pruned_loss=0.03126, over 24323.00 frames. ], tot_loss[loss=0.1523, simple_loss=0.2326, pruned_loss=0.03601, over 3678783.72 frames. ], batch size: 61, lr: 2.08e-03, grad_scale: 8.0 2023-10-04 15:24:53,896 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_41452_210_7291_1_1533437687_3941281_405-11559-0 from training. Duration: 0.91 2023-10-04 15:24:53,989 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_20120_210_6753_1_1531643035_5482859_759-225084-0_sp1.1 from training. Duration: 0.97275 2023-10-04 15:24:54,010 WARNING [train.py:1204] (2/4) Exclude cut with ID _14747_210_8341_1_1530685797463_3654089_147-131229-0_sp0.9 from training. Duration: 0.8444375 2023-10-04 15:24:58,058 WARNING [train.py:1204] (2/4) Exclude cut with ID _14867_210_1968_1_1530684179890_3846969_319-250868-0 from training. Duration: 0.99 2023-10-04 15:24:58,093 WARNING [train.py:1204] (2/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_580-93798-0_sp0.9 from training. Duration: 0.62225 2023-10-04 15:24:59,470 WARNING [train.py:1204] (2/4) Exclude cut with ID _9679_210_7497_1_1529129212906_4363929_512-313889-0_sp0.9 from training. Duration: 0.9 2023-10-04 15:24:59,479 WARNING [train.py:1204] (2/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_18-189198-0 from training. Duration: 0.9 2023-10-04 15:25:00,859 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.699e+02 2.248e+02 2.762e+02 3.176e+02 5.077e+02, threshold=5.525e+02, percent-clipped=1.0 2023-10-04 15:25:03,907 WARNING [train.py:1204] (2/4) Exclude cut with ID _49755_210_18053_1_1534581005846_3218709_137-64179-0_sp1.1 from training. Duration: 0.92725 2023-10-04 15:25:03,965 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_48049_210_5632_1_1533864553_3519937_306-283794-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 15:25:09,005 WARNING [train.py:1204] (2/4) Exclude cut with ID _43552_210_8913_1_1533726031059_3523429_25-120161-0_sp1.1 from training. Duration: 0.8909375 2023-10-04 15:25:10,348 WARNING [train.py:1204] (2/4) Exclude cut with ID _8833_210_5219_1_1528592665210_7553059_319-349181-0 from training. Duration: 0.99 2023-10-04 15:25:10,443 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_296-332325-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 15:25:11,659 WARNING [train.py:1204] (2/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_436-186311-0_sp1.1 from training. Duration: 0.5909375 2023-10-04 15:25:11,680 WARNING [train.py:1204] (2/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_348-127789-0 from training. Duration: 0.85 2023-10-04 15:25:11,687 WARNING [train.py:1204] (2/4) Exclude cut with ID _3992_210_1747_1_1526644816031_3731060_443-195183-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 15:25:16,441 WARNING [train.py:1204] (2/4) Exclude cut with ID _3465_210_3525_1_1526175667060_3574049_308-231735-0_sp1.1 from training. Duration: 0.663625 2023-10-04 15:25:20,665 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6786_210_1388_1_1527839699_4194495_378-46070-0_sp1.1 from training. Duration: 0.6545625 2023-10-04 15:25:20,706 WARNING [train.py:1204] (2/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_63-101996-0 from training. Duration: 0.88 2023-10-04 15:25:20,808 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.0.attention_skip_rate, batch_count=1701946.6666666667, ans=0.0 2023-10-04 15:25:24,043 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_2541_210_1794_1_1525590269_3084765_156-173713-0 from training. Duration: 0.81 2023-10-04 15:25:25,374 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_217-239796-0_sp1.1 from training. Duration: 0.963625 2023-10-04 15:25:26,850 WARNING [train.py:1204] (2/4) Exclude cut with ID _21878_210_3525_1_1531738772002_7264330_530-329400-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 15:25:29,520 WARNING [train.py:1204] (2/4) Exclude cut with ID _21783_210_11357_1_1531742458730_3762679_150-62920-0_sp1.1 from training. Duration: 0.963625 2023-10-04 15:25:29,521 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_5522_210_3273_1_1527249606_3909096_366-172508-0 from training. Duration: 0.66 2023-10-04 15:25:29,523 WARNING [train.py:1204] (2/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_303-340446-0_sp1.1 from training. Duration: 0.8181875 2023-10-04 15:25:30,930 WARNING [train.py:1204] (2/4) Exclude cut with ID _8699_210_2069_1_1528630346831_3960409_160-24408-0_sp1.1 from training. Duration: 0.82725 2023-10-04 15:25:32,415 WARNING [train.py:1204] (2/4) Exclude cut with ID _41314_210_12651_1_1533459476543_3766460_299-187801-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 15:25:33,708 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_34886_210_10633_1_1533203882_1755661_92-345288-0_sp1.1 from training. Duration: 0.97275 2023-10-04 15:25:37,782 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_5438_210_1794_1_1527336687_2600009_104-288274-0_sp1.1 from training. Duration: 0.52725 2023-10-04 15:25:37,789 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_154-316450-0 from training. Duration: 0.76 2023-10-04 15:25:39,081 WARNING [train.py:1204] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_23-189234-0_sp0.9 from training. Duration: 0.92225 2023-10-04 15:25:41,883 WARNING [train.py:1204] (2/4) Exclude cut with ID _4779_210_2084_1_1527318022944_3914893_346-168820-0_sp1.1 from training. Duration: 0.963625 2023-10-04 15:25:43,240 WARNING [train.py:1204] (2/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_5-55893-0 from training. Duration: 0.55 2023-10-04 15:25:43,324 WARNING [train.py:1204] (2/4) Exclude cut with ID _20455_210_5219_1_1531565996775_8098100_180-24517-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 15:25:47,932 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_3133_210_2087_1_1526106446_2324690_243-143119-0_sp0.9 from training. Duration: 0.8555625 2023-10-04 15:25:50,699 WARNING [train.py:1204] (2/4) Exclude cut with ID _65585_210_12210_1_1535450451584_3764098_293-212045-0_sp1.1 from training. Duration: 0.92725 2023-10-04 15:25:50,709 WARNING [train.py:1204] (2/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_577-332600-0 from training. Duration: 0.57 2023-10-04 15:25:55,359 WARNING [train.py:1204] (2/4) Exclude cut with ID _30039_210_13270_1_1532484612900_2725150_104-289874-0_sp1.1 from training. Duration: 0.963625 2023-10-04 15:25:55,366 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_3172_210_2087_1_1526292929_3987865_197-97762-0_sp1.1 from training. Duration: 0.6818125 2023-10-04 15:25:56,925 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_256-150908-0_sp1.1 from training. Duration: 0.963625 2023-10-04 15:25:59,612 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_296-287678-0_sp1.1 from training. Duration: 0.863625 2023-10-04 15:25:59,651 WARNING [train.py:1204] (2/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_443-30034-0 from training. Duration: 0.64 2023-10-04 15:25:59,677 WARNING [train.py:1204] (2/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_494-199538-0_sp0.9 from training. Duration: 0.8333125 2023-10-04 15:26:00,902 WARNING [train.py:1204] (2/4) Exclude cut with ID _19708_210_9206_1_1532136650733_4238364_61-162325-0_sp1.1 from training. Duration: 0.97275 2023-10-04 15:26:00,995 WARNING [train.py:1204] (2/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_475-127455-0 from training. Duration: 0.7 2023-10-04 15:26:02,363 WARNING [train.py:1204] (2/4) Exclude cut with ID _48677_210_5663_1_1533902405919_3892989_188-11581-0_sp1.1 from training. Duration: 0.963625 2023-10-04 15:26:02,386 WARNING [train.py:1204] (2/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_136-8381-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 15:26:03,771 WARNING [train.py:1204] (2/4) Exclude cut with ID _55328_210_27767_1_1534580973260_4088170_270-229555-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 15:26:03,808 WARNING [train.py:1204] (2/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_372-266951-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 15:26:03,858 WARNING [train.py:1204] (2/4) Exclude cut with ID _47790_210_16187_1_1533862814066_4207519_14-242149-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 15:26:08,421 INFO [train.py:1046] (2/4) Epoch 49, batch 350, loss[loss=0.1463, simple_loss=0.2147, pruned_loss=0.03892, over 23644.00 frames. ], tot_loss[loss=0.1514, simple_loss=0.2314, pruned_loss=0.03567, over 3910660.85 frames. ], batch size: 232, lr: 2.08e-03, grad_scale: 8.0 2023-10-04 15:26:09,890 WARNING [train.py:1204] (2/4) Exclude cut with ID _7877_210_6014_1_1528286358099_3750136_159-76512-0_sp1.1 from training. Duration: 0.936375 2023-10-04 15:26:09,892 WARNING [train.py:1204] (2/4) Exclude cut with ID _8263_210_1664_1_1528438953494_294100_40-204731-0_sp1.1 from training. Duration: 0.4545625 2023-10-04 15:26:12,644 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8278_210_2550_1_1528637099_4220413_694-238413-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 15:26:14,891 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.2.encoder.layers.2.conv_module2.whiten, num_groups=1, num_channels=384, metric=6.74 vs. limit=15.0 2023-10-04 15:26:17,633 WARNING [train.py:1204] (2/4) Exclude cut with ID _47278_210_12041_1_1533902418349_3576110_421-172710-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 15:26:21,051 WARNING [train.py:1204] (2/4) Exclude cut with ID _14970_210_15709_1_1530773543493_1715730_215-282680-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 15:26:21,080 WARNING [train.py:1204] (2/4) Exclude cut with ID _64639_210_5816_1_1535367619437_3528000_306-149449-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 15:26:22,490 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_28-291982-0 from training. Duration: 0.97 2023-10-04 15:26:23,090 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.4.encoder.layers.1.self_attn2.whiten, num_groups=1, num_channels=384, metric=11.67 vs. limit=22.5 2023-10-04 15:26:23,895 WARNING [train.py:1204] (2/4) Exclude cut with ID _55134_210_21982_1_1534590028377_3882170_412-15870-0_sp1.1 from training. Duration: 0.936375 2023-10-04 15:26:25,156 WARNING [train.py:1204] (2/4) Exclude cut with ID _13684_210_7497_1_1530619354863_3598359_312-235606-0 from training. Duration: 0.6 2023-10-04 15:26:26,634 WARNING [train.py:1204] (2/4) Exclude cut with ID _37112_210_2575_1_1532997007673_7268610_347-238993-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 15:26:28,011 WARNING [train.py:1204] (2/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_121-284152-0 from training. Duration: 0.59 2023-10-04 15:26:28,080 WARNING [train.py:1204] (2/4) Exclude cut with ID _2941_210_2577_1_1526199673150_2489759_228-2961-0_sp1.1 from training. Duration: 0.97275 2023-10-04 15:26:32,205 WARNING [train.py:1204] (2/4) Exclude cut with ID _28323_210_16187_1_1532480395667_4100300_531-24157-0 from training. Duration: 0.98 2023-10-04 15:26:33,593 WARNING [train.py:1204] (2/4) Exclude cut with ID _32704_210_8954_1_1533017488840_6747370_266-329755-0_sp0.9 from training. Duration: 0.988875 2023-10-04 15:26:35,458 WARNING [train.py:1204] (2/4) Exclude cut with ID _6653_210_2629_1_1527919172441_6732679_1038-146421-0_sp1.1 from training. Duration: 0.97275 2023-10-04 15:26:35,784 INFO [scaling.py:1118] (2/4) WithLoss: name=encoder.encoders.4.encoder.layers.2.self_attn_weights, loss-sum=0.000e+00 2023-10-04 15:26:37,337 WARNING [train.py:1204] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_391-22148-0_sp0.9 from training. Duration: 0.8555625 2023-10-04 15:26:37,423 WARNING [train.py:1204] (2/4) Exclude cut with ID _64521_210_14699_1_1535243427259_3815328_543-341117-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 15:26:37,440 WARNING [train.py:1204] (2/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_52-168712-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 15:26:38,690 WARNING [train.py:1204] (2/4) Exclude cut with ID _33920_210_4220_1_1533257851500_3084399_198-334063-0_sp1.1 from training. Duration: 0.8545625 2023-10-04 15:26:38,711 WARNING [train.py:1204] (2/4) Exclude cut with ID _50093_210_4220_1_1534417141207_3432299_69-111987-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 15:26:38,761 WARNING [train.py:1204] (2/4) Exclude cut with ID _14821_210_8750_1_1530680503991_6829359_189-297451-0_sp0.9 from training. Duration: 0.611125 2023-10-04 15:26:41,545 WARNING [train.py:1204] (2/4) Exclude cut with ID _8030_210_7497_1_1528444886781_3531089_262-86750-0_sp0.9 from training. Duration: 0.9 2023-10-04 15:26:41,551 WARNING [train.py:1204] (2/4) Exclude cut with ID _24612_210_8913_1_1531998054193_3806359_294-191196-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 15:26:46,177 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.5.encoder.layers.0.nonlin_attention.whiten1, num_groups=1, num_channels=192, metric=4.24 vs. limit=10.0 2023-10-04 15:26:48,981 WARNING [train.py:1204] (2/4) Exclude cut with ID _16909_210_5724_1_1531292079912_4118919_707-342896-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 15:26:49,004 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_9460_210_4211_1_1528954587_10049667_572-37035-0_sp0.9 from training. Duration: 0.888875 2023-10-04 15:26:50,870 WARNING [train.py:1204] (2/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_762-109886-0_sp1.1 from training. Duration: 0.82725 2023-10-04 15:26:50,900 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_14953_210_2151_1_1530701710_5616165_519-326393-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 15:26:55,119 WARNING [train.py:1204] (2/4) Exclude cut with ID _25914_210_6400_1_1532141317058_644740_52-96808-0 from training. Duration: 0.91 2023-10-04 15:26:55,129 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_68095_210_20185_1_1535718226_8888855_1079-94525-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 15:26:59,253 WARNING [train.py:1204] (2/4) Exclude cut with ID _14860_210_4915_1_1530702020340_7343240_746-3647-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 15:26:59,254 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_70379_210_7925_1_1535941455_557455_49-101720-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 15:27:00,484 WARNING [train.py:1204] (2/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_8-307291-0_sp1.1 from training. Duration: 0.936375 2023-10-04 15:27:01,894 WARNING [train.py:1204] (2/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_117-290916-0 from training. Duration: 0.51 2023-10-04 15:27:04,705 WARNING [train.py:1204] (2/4) Exclude cut with ID _22020_210_2461_1_1531724164513_6326900_944-339840-0_sp1.1 from training. Duration: 0.963625 2023-10-04 15:27:06,175 WARNING [train.py:1204] (2/4) Exclude cut with ID IC0515W0043-254911 from training. Duration: 0.976 2023-10-04 15:27:08,097 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_3910_210_1794_1_1526742306_334530_28-238909-0 from training. Duration: 0.81 2023-10-04 15:27:08,124 WARNING [train.py:1204] (2/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_269-267275-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 15:27:10,141 WARNING [train.py:1204] (2/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_72-251255-0_sp1.1 from training. Duration: 0.8545625 2023-10-04 15:27:10,156 WARNING [train.py:1204] (2/4) Exclude cut with ID _14886_210_4220_1_1530702020355_3516190_175-229073-0 from training. Duration: 0.45 2023-10-04 15:27:12,942 WARNING [train.py:1204] (2/4) Exclude cut with ID _40519_210_13794_1_1533715228655_7337390_921-209452-0_sp1.1 from training. Duration: 0.963625 2023-10-04 15:27:13,241 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.bypass.scale_min, batch_count=1702480.0, ans=0.2 2023-10-04 15:27:15,733 WARNING [train.py:1204] (2/4) Exclude cut with ID _29857_210_20034_1_1532395886753_3798160_335-163562-0_sp0.9 from training. Duration: 0.9444375 2023-10-04 15:27:15,838 WARNING [train.py:1204] (2/4) Exclude cut with ID _58583_210_5236_1_1534726835266_3574249_384-58445-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 15:27:17,147 WARNING [train.py:1204] (2/4) Exclude cut with ID _44011_210_2046_1_1533722401691_3631310_96-315656-0_sp1.1 from training. Duration: 0.963625 2023-10-04 15:27:17,152 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_68484_210_1794_1_1535849804_7678097_549-170613-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 15:27:19,362 WARNING [train.py:1204] (2/4) Exclude cut with ID _37892_210_19596_1_1533198677897_3964058_214-148058-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 15:27:21,201 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.2.encoder.layers.0.whiten, num_groups=1, num_channels=384, metric=5.78 vs. limit=12.0 2023-10-04 15:27:22,440 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.conv_skip_rate, batch_count=1702546.6666666667, ans=0.0 2023-10-04 15:27:23,457 INFO [train.py:1046] (2/4) Epoch 49, batch 400, loss[loss=0.1574, simple_loss=0.2497, pruned_loss=0.03253, over 24622.00 frames. ], tot_loss[loss=0.1518, simple_loss=0.2314, pruned_loss=0.03612, over 4078213.82 frames. ], batch size: 68, lr: 2.08e-03, grad_scale: 16.0 2023-10-04 15:27:23,538 WARNING [train.py:1204] (2/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_415-78792-0_sp0.9 from training. Duration: 0.9 2023-10-04 15:27:26,452 WARNING [train.py:1204] (2/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_383-205833-0_sp1.1 from training. Duration: 0.863625 2023-10-04 15:27:27,782 WARNING [train.py:1204] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_167-137255-0 from training. Duration: 0.64 2023-10-04 15:27:27,798 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8275_210_4198_1_1528455320_3892571_484-154786-0_sp1.1 from training. Duration: 0.963625 2023-10-04 15:27:27,853 WARNING [train.py:1204] (2/4) Exclude cut with ID _40284_210_2084_1_1533513679814_3704279_396-319412-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 15:27:29,295 WARNING [train.py:1204] (2/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_280-120942-0_sp0.9 from training. Duration: 0.8555625 2023-10-04 15:27:30,603 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.687e+02 2.007e+02 2.289e+02 2.741e+02 4.265e+02, threshold=4.578e+02, percent-clipped=0.0 2023-10-04 15:27:30,676 WARNING [train.py:1204] (2/4) Exclude cut with ID _45630_210_10556_1_1533949174498_3902049_558-240889-0_sp1.1 from training. Duration: 0.92725 2023-10-04 15:27:32,186 WARNING [train.py:1204] (2/4) Exclude cut with ID _56235_210_10640_1_1534557335235_4161099_197-307458-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 15:27:33,514 WARNING [train.py:1204] (2/4) Exclude cut with ID _16909_210_5724_1_1531292079912_4118919_163-254843-0_sp1.1 from training. Duration: 0.92725 2023-10-04 15:27:34,877 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_103-84010-0 from training. Duration: 0.69 2023-10-04 15:27:36,216 WARNING [train.py:1204] (2/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_401-158239-0 from training. Duration: 0.71 2023-10-04 15:27:36,217 WARNING [train.py:1204] (2/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_899-233658-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 15:27:37,598 WARNING [train.py:1204] (2/4) Exclude cut with ID _23825_210_13449_1_1531915313526_3497150_240-271367-0 from training. Duration: 0.97 2023-10-04 15:27:37,651 WARNING [train.py:1204] (2/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_424-134619-0_sp1.1 from training. Duration: 0.92725 2023-10-04 15:27:40,874 WARNING [train.py:1204] (2/4) Exclude cut with ID _44831_210_13427_1_1533816011549_3689035_32-188147-0_sp1.1 from training. Duration: 0.87275 2023-10-04 15:27:40,878 WARNING [train.py:1204] (2/4) Exclude cut with ID _59170_210_6721_1_1534983633960_4203335_314-63879-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 15:27:42,790 WARNING [train.py:1204] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_602-275260-0 from training. Duration: 0.82 2023-10-04 15:27:42,840 WARNING [train.py:1204] (2/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_511-306767-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 15:27:42,871 WARNING [train.py:1204] (2/4) Exclude cut with ID _41817_210_18835_1_1533434477160_7423049_565-201775-0_sp1.1 from training. Duration: 0.92725 2023-10-04 15:27:42,881 WARNING [train.py:1204] (2/4) Exclude cut with ID _28317_210_12742_1_1532480302381_8510325_778-173151-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 15:27:42,921 WARNING [train.py:1204] (2/4) Exclude cut with ID _14998_210_5219_1_1530705582080_7474970_680-51001-0_sp1.1 from training. Duration: 0.963625 2023-10-04 15:27:46,956 WARNING [train.py:1204] (2/4) Exclude cut with ID IC0677W0059-334891 from training. Duration: 0.896 2023-10-04 15:27:47,702 WARNING [train.py:1204] (2/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_370-331600-0 from training. Duration: 0.94 2023-10-04 15:27:51,919 WARNING [train.py:1204] (2/4) Exclude cut with ID _66840_210_5219_1_1535452168426_4196349_446-240192-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 15:27:54,953 WARNING [train.py:1204] (2/4) Exclude cut with ID _65242_210_18628_1_1535369393717_3823149_38-6732-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 15:27:55,020 WARNING [train.py:1204] (2/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_302-2550-0 from training. Duration: 0.81 2023-10-04 15:27:56,431 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_43802_210_2290_1_1533516203_8829504_100-348441-0 from training. Duration: 0.91 2023-10-04 15:27:56,646 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.ff3_skip_rate, batch_count=1702680.0, ans=0.0 2023-10-04 15:27:59,242 WARNING [train.py:1204] (2/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_329-285801-0_sp1.1 from training. Duration: 0.82725 2023-10-04 15:28:02,003 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_30167_210_3528_1_1532428012_4772375_343-334712-0_sp1.1 from training. Duration: 0.936375 2023-10-04 15:28:07,539 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7543_210_6753_1_1528543318_4552_1-85072-0 from training. Duration: 0.83 2023-10-04 15:28:11,008 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7002_210_1794_1_1527935634_4034444_487-236501-0_sp1.1 from training. Duration: 0.6 2023-10-04 15:28:12,289 WARNING [train.py:1204] (2/4) Exclude cut with ID _7160_210_1876_1_1527940923231_3703799_282-322611-0 from training. Duration: 0.87 2023-10-04 15:28:15,542 WARNING [train.py:1204] (2/4) Exclude cut with ID _67335_210_5219_1_1535525975652_4275229_570-262983-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 15:28:17,001 WARNING [train.py:1204] (2/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_625-338546-0_sp1.1 from training. Duration: 0.8 2023-10-04 15:28:17,024 WARNING [train.py:1204] (2/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_176-56120-0 from training. Duration: 0.83 2023-10-04 15:28:17,187 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.1.conv_module2.balancer2.prob, batch_count=1702746.6666666667, ans=0.125 2023-10-04 15:28:17,307 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.self_attn_weights.pos_emb_skip_rate, batch_count=1702746.6666666667, ans=0.0 2023-10-04 15:28:20,469 WARNING [train.py:1204] (2/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_963-187109-0_sp1.1 from training. Duration: 0.9 2023-10-04 15:28:22,701 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.2.encoder.layers.0.self_attn2.whiten, num_groups=1, num_channels=384, metric=17.99 vs. limit=22.5 2023-10-04 15:28:23,263 WARNING [train.py:1204] (2/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_121-284152-0_sp0.9 from training. Duration: 0.6555625 2023-10-04 15:28:23,367 WARNING [train.py:1204] (2/4) Exclude cut with ID _45021_210_9994_1_1533619751094_3330288_104-81711-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 15:28:26,702 WARNING [train.py:1204] (2/4) Exclude cut with ID _51382_210_23862_1_1534492801838_3650104_129-343914-0_sp1.1 from training. Duration: 0.936375 2023-10-04 15:28:26,743 WARNING [train.py:1204] (2/4) Exclude cut with ID _14970_210_15709_1_1530775547361_1966965_8-169924-0 from training. Duration: 0.92 2023-10-04 15:28:28,129 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_2295_210_2087_1_1525500993_2407863_234-83104-0_sp1.1 from training. Duration: 0.563625 2023-10-04 15:28:29,527 WARNING [train.py:1204] (2/4) Exclude cut with ID _14687_210_5854_1_1530703838259_3664889_115-239046-0 from training. Duration: 0.67 2023-10-04 15:28:30,989 WARNING [train.py:1204] (2/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_690-126306-0_sp1.1 from training. Duration: 0.7090625 2023-10-04 15:28:30,996 WARNING [train.py:1204] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_625-102955-0_sp0.9 from training. Duration: 0.8555625 2023-10-04 15:28:32,455 WARNING [train.py:1204] (2/4) Exclude cut with ID _9389_210_1664_1_1529217337301_5289269_289-213417-0 from training. Duration: 0.89 2023-10-04 15:28:34,974 WARNING [train.py:1204] (2/4) Exclude cut with ID _13253_210_1925_1_1530757775819_3577873_191-144977-0_sp0.9 from training. Duration: 0.8666875 2023-10-04 15:28:35,035 WARNING [train.py:1204] (2/4) Exclude cut with ID _21783_210_11357_1_1531742458730_3762679_325-155703-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 15:28:36,354 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_2295_210_2087_1_1525500993_2407863_116-238894-0_sp0.9 from training. Duration: 0.77775 2023-10-04 15:28:38,229 INFO [train.py:1046] (2/4) Epoch 49, batch 450, loss[loss=0.1712, simple_loss=0.2525, pruned_loss=0.0449, over 24336.00 frames. ], tot_loss[loss=0.1521, simple_loss=0.2317, pruned_loss=0.03622, over 4216718.25 frames. ], batch size: 77, lr: 2.08e-03, grad_scale: 16.0 2023-10-04 15:28:38,317 WARNING [train.py:1204] (2/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_94-126128-0 from training. Duration: 0.86 2023-10-04 15:28:38,333 WARNING [train.py:1204] (2/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_395-310220-0_sp0.9 from training. Duration: 0.988875 2023-10-04 15:28:39,718 WARNING [train.py:1204] (2/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_821-110852-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 15:28:39,778 WARNING [train.py:1204] (2/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_64-248359-0_sp1.1 from training. Duration: 0.62725 2023-10-04 15:28:39,790 WARNING [train.py:1204] (2/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_138-187031-0 from training. Duration: 0.99 2023-10-04 15:28:41,126 WARNING [train.py:1204] (2/4) Exclude cut with ID _4122_210_2084_1_1526713164417_4353280_528-206771-0_sp0.9 from training. Duration: 0.97775 2023-10-04 15:28:41,227 WARNING [train.py:1204] (2/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_844-96989-0_sp1.1 from training. Duration: 0.6454375 2023-10-04 15:28:43,895 WARNING [train.py:1204] (2/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_106-30782-0_sp1.1 from training. Duration: 0.72725 2023-10-04 15:28:54,799 WARNING [train.py:1204] (2/4) Exclude cut with ID _14683_210_5219_1_1530698352608_3974859_281-250040-0_sp1.1 from training. Duration: 0.936375 2023-10-04 15:28:54,835 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_46013_210_7234_1_1533776362_3744032_463-52378-0_sp0.9 from training. Duration: 0.9555625 2023-10-04 15:28:56,796 WARNING [train.py:1204] (2/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_299-34413-0 from training. Duration: 0.65 2023-10-04 15:28:58,101 WARNING [train.py:1204] (2/4) Exclude cut with ID _35799_210_5854_1_1533263487887_3529071_490-326847-0 from training. Duration: 0.99 2023-10-04 15:28:59,702 WARNING [train.py:1204] (2/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_589-9070-0_sp0.9 from training. Duration: 0.788875 2023-10-04 15:29:01,217 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.0.nonlin_attention.balancer.min_positive, batch_count=1702946.6666666667, ans=0.05 2023-10-04 15:29:02,408 WARNING [train.py:1204] (2/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_884-29515-0_sp1.1 from training. Duration: 0.936375 2023-10-04 15:29:03,802 WARNING [train.py:1204] (2/4) Exclude cut with ID _15012_210_1968_1_1530856820429_5095350_474-119995-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 15:29:06,566 WARNING [train.py:1204] (2/4) Exclude cut with ID _65585_210_12210_1_1535450451584_3764098_211-97124-0_sp1.1 from training. Duration: 0.963625 2023-10-04 15:29:06,744 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.1.conv_module2.balancer1.prob, batch_count=1703013.3333333333, ans=0.125 2023-10-04 15:29:08,515 WARNING [train.py:1204] (2/4) Exclude cut with ID _33640_210_2655_1_1533117605479_4855469_666-224463-0_sp1.1 from training. Duration: 0.963625 2023-10-04 15:29:10,128 WARNING [train.py:1204] (2/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_35-153886-0 from training. Duration: 0.84 2023-10-04 15:29:10,185 WARNING [train.py:1204] (2/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_210-106047-0 from training. Duration: 0.97 2023-10-04 15:29:10,415 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.balancer_ff2.min_abs, batch_count=1703013.3333333333, ans=0.1 2023-10-04 15:29:12,859 WARNING [train.py:1204] (2/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_479-125611-0 from training. Duration: 0.96 2023-10-04 15:29:12,889 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_9417_210_1794_1_1529235261_4614131_427-153963-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 15:29:14,178 WARNING [train.py:1204] (2/4) Exclude cut with ID _23818_210_6228_1_1531917063884_3676920_133-170571-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 15:29:14,248 WARNING [train.py:1204] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_602-275260-0_sp1.1 from training. Duration: 0.7454375 2023-10-04 15:29:16,223 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.conv_skip_rate, batch_count=1703013.3333333333, ans=0.0 2023-10-04 15:29:17,500 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9029W0121-509521 from training. Duration: 0.896 2023-10-04 15:29:17,509 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9029W0434-509821 from training. Duration: 0.64 2023-10-04 15:29:17,534 WARNING [train.py:1204] (2/4) Exclude cut with ID _23038_210_9570_1_1532001584158_3652419_289-307034-0_sp1.1 from training. Duration: 0.936375 2023-10-04 15:29:20,798 WARNING [train.py:1204] (2/4) Exclude cut with ID _29345_210_4381_1_1532689168263_3860169_149-257268-0_sp0.9 from training. Duration: 0.9 2023-10-04 15:29:20,870 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_405-339986-0_sp1.1 from training. Duration: 0.463625 2023-10-04 15:29:23,954 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_2655_210_2087_1_1525688959_3897400_437-337690-0_sp1.1 from training. Duration: 0.57275 2023-10-04 15:29:25,411 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7543_210_6753_1_1528541907_1399322_165-323619-0_sp1.1 from training. Duration: 0.62725 2023-10-04 15:29:25,480 WARNING [train.py:1204] (2/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_508-335369-0_sp1.1 from training. Duration: 0.436375 2023-10-04 15:29:26,782 WARNING [train.py:1204] (2/4) Exclude cut with ID _7852_210_5219_1_1528286471316_8139400_358-287492-0 from training. Duration: 0.96 2023-10-04 15:29:29,536 WARNING [train.py:1204] (2/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_322-217220-0_sp0.9 from training. Duration: 0.9555625 2023-10-04 15:29:29,766 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=1703080.0, ans=0.1 2023-10-04 15:29:31,000 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_3171_210_2087_1_1526109084_2330007_66-260583-0_sp0.9 from training. Duration: 0.8 2023-10-04 15:29:32,331 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8558_210_1410_1_1528505225_7613406_131-329524-0_sp1.1 from training. Duration: 0.7181875 2023-10-04 15:29:33,656 WARNING [train.py:1204] (2/4) Exclude cut with ID _9235_210_5219_1_1528891154253_8046120_30-326681-0 from training. Duration: 0.83 2023-10-04 15:29:37,756 WARNING [train.py:1204] (2/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_58-68956-0_sp0.9 from training. Duration: 0.92225 2023-10-04 15:29:37,841 WARNING [train.py:1204] (2/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_255-312654-0 from training. Duration: 0.91 2023-10-04 15:29:39,663 WARNING [train.py:1204] (2/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_99-167392-0 from training. Duration: 0.61 2023-10-04 15:29:41,063 WARNING [train.py:1204] (2/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_94-126128-0_sp0.9 from training. Duration: 0.9555625 2023-10-04 15:29:44,145 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.conv_module2.balancer1.prob, batch_count=1703146.6666666667, ans=0.125 2023-10-04 15:29:45,297 WARNING [train.py:1204] (2/4) Exclude cut with ID _68635_210_9317_1_1535770813410_4315870_187-192817-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 15:29:46,682 WARNING [train.py:1204] (2/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_277-204970-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 15:29:48,054 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6163_210_6400_1_1527506746_4263068_283-137811-0_sp0.9 from training. Duration: 0.8555625 2023-10-04 15:29:49,400 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9144W0121-566446 from training. Duration: 0.896 2023-10-04 15:29:51,950 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.2.encoder.layers.2.conv_module1.whiten, num_groups=1, num_channels=384, metric=6.12 vs. limit=15.0 2023-10-04 15:29:52,629 INFO [train.py:1046] (2/4) Epoch 49, batch 500, loss[loss=0.1455, simple_loss=0.2261, pruned_loss=0.03246, over 23802.00 frames. ], tot_loss[loss=0.1529, simple_loss=0.2332, pruned_loss=0.03633, over 4327542.83 frames. ], batch size: 212, lr: 2.08e-03, grad_scale: 16.0 2023-10-04 15:29:54,117 WARNING [train.py:1204] (2/4) Exclude cut with ID _49371_210_12071_1_1533952761021_3812190_351-315519-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 15:29:54,186 WARNING [train.py:1204] (2/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_115-326778-0_sp1.1 from training. Duration: 0.8454375 2023-10-04 15:29:56,098 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_17292_210_6753_1_1531125388_2943734_436-198965-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 15:29:56,113 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9165W0117-576751 from training. Duration: 0.896 2023-10-04 15:29:58,019 WARNING [train.py:1204] (2/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_212-149947-0 from training. Duration: 0.73 2023-10-04 15:29:58,027 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_13757_210_3528_1_1530440682_4107239_65-167544-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 15:30:00,587 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.603e+02 1.999e+02 2.217e+02 2.687e+02 4.030e+02, threshold=4.434e+02, percent-clipped=0.0 2023-10-04 15:30:02,074 WARNING [train.py:1204] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_700-183549-0_sp0.9 from training. Duration: 0.7555625 2023-10-04 15:30:05,032 WARNING [train.py:1204] (2/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_216-343007-0_sp0.9 from training. Duration: 0.7333125 2023-10-04 15:30:06,396 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7544_210_6753_1_1528587722_3576686_295-258479-0_sp1.1 from training. Duration: 0.763625 2023-10-04 15:30:09,079 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_365-287106-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 15:30:09,100 WARNING [train.py:1204] (2/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_300-3747-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 15:30:09,154 WARNING [train.py:1204] (2/4) Exclude cut with ID _14069_210_5632_1_1530512692033_4803390_487-303201-0_sp1.1 from training. Duration: 0.92725 2023-10-04 15:30:19,289 WARNING [train.py:1204] (2/4) Exclude cut with ID _14459_210_2228_1_1531443641946_7516700_668-123026-0_sp1.1 from training. Duration: 0.936375 2023-10-04 15:30:19,318 WARNING [train.py:1204] (2/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_108-53929-0_sp1.1 from training. Duration: 0.536375 2023-10-04 15:30:19,345 WARNING [train.py:1204] (2/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_266-137104-0_sp0.9 from training. Duration: 0.72225 2023-10-04 15:30:19,379 WARNING [train.py:1204] (2/4) Exclude cut with ID _12302_210_5219_1_1529924446466_8107280_902-168619-0_sp1.1 from training. Duration: 0.936375 2023-10-04 15:30:20,667 WARNING [train.py:1204] (2/4) Exclude cut with ID _59502_210_12210_1_1535107024698_1835139_146-162495-0 from training. Duration: 0.92 2023-10-04 15:30:20,691 WARNING [train.py:1204] (2/4) Exclude cut with ID _3465_210_3525_1_1526175667060_3574049_412-287526-0_sp1.1 from training. Duration: 0.6545625 2023-10-04 15:30:22,322 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.nonlin_attention.balancer.max_positive, batch_count=1703346.6666666667, ans=0.95 2023-10-04 15:30:23,512 WARNING [train.py:1204] (2/4) Exclude cut with ID _7160_210_1876_1_1527940923231_3703799_120-150943-0_sp0.9 from training. Duration: 0.9 2023-10-04 15:30:23,586 WARNING [train.py:1204] (2/4) Exclude cut with ID _15037_210_2253_1_1530752294104_3267650_296-192597-0_sp1.1 from training. Duration: 0.736375 2023-10-04 15:30:23,606 WARNING [train.py:1204] (2/4) Exclude cut with ID _13252_210_1925_1_1530671372047_3336909_397-216497-0_sp1.1 from training. Duration: 0.87275 2023-10-04 15:30:25,308 WARNING [train.py:1204] (2/4) Exclude cut with ID _40094_210_16380_1_1533294005773_3641159_76-269320-0_sp1.1 from training. Duration: 0.936375 2023-10-04 15:30:27,270 WARNING [train.py:1204] (2/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_449-15508-0 from training. Duration: 0.82 2023-10-04 15:30:30,606 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9309W0194-640321 from training. Duration: 0.64 2023-10-04 15:30:33,475 WARNING [train.py:1204] (2/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_284-312178-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 15:30:33,594 WARNING [train.py:1204] (2/4) Exclude cut with ID _29853_210_7497_1_1532505600851_6642110_401-55937-0_sp1.1 from training. Duration: 0.92725 2023-10-04 15:30:34,959 WARNING [train.py:1204] (2/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_716-24918-0_sp1.1 from training. Duration: 0.92725 2023-10-04 15:30:34,978 WARNING [train.py:1204] (2/4) Exclude cut with ID _19637_210_4220_1_1531537167676_3205060_314-305638-0_sp1.1 from training. Duration: 0.92725 2023-10-04 15:30:36,294 WARNING [train.py:1204] (2/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_475-127455-0_sp1.1 from training. Duration: 0.636375 2023-10-04 15:30:37,725 WARNING [train.py:1204] (2/4) Exclude cut with ID _5444_210_2151_1_1527418808464_5312210_581-315411-0 from training. Duration: 0.62 2023-10-04 15:30:42,166 WARNING [train.py:1204] (2/4) Exclude cut with ID _44902_210_16187_1_1533888006062_3792078_149-67212-0_sp0.9 from training. Duration: 0.9666875 2023-10-04 15:30:42,249 WARNING [train.py:1204] (2/4) Exclude cut with ID _31602_210_5854_1_1532677021456_4458250_63-63253-0_sp1.1 from training. Duration: 0.963625 2023-10-04 15:30:45,050 WARNING [train.py:1204] (2/4) Exclude cut with ID _55209_210_1531_1_1534675828996_4597160_660-203992-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 15:30:45,283 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.ff3_skip_rate, batch_count=1703413.3333333333, ans=0.0 2023-10-04 15:30:47,895 WARNING [train.py:1204] (2/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_83-190624-0_sp1.1 from training. Duration: 0.92725 2023-10-04 15:30:53,301 WARNING [train.py:1204] (2/4) Exclude cut with ID _58605_210_12210_1_1534723195939_3904259_154-139635-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 15:30:56,651 WARNING [train.py:1204] (2/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_164-224052-0 from training. Duration: 0.7 2023-10-04 15:30:56,660 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_1126-189071-0_sp1.1 from training. Duration: 0.963625 2023-10-04 15:30:56,671 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_15396_210_6890_1_1531308473_1938936_310-214789-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 15:30:57,609 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.feed_forward1.hidden_balancer.prob, batch_count=1703480.0, ans=0.125 2023-10-04 15:31:00,581 WARNING [train.py:1204] (2/4) Exclude cut with ID _8395_210_1531_1_1528597871523_4411579_431-132470-0 from training. Duration: 0.74 2023-10-04 15:31:01,846 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_3780_210_2087_1_1526467760_2130483_170-259044-0_sp0.9 from training. Duration: 0.77775 2023-10-04 15:31:01,956 WARNING [train.py:1204] (2/4) Exclude cut with ID _12733_210_8341_1_1530167424556_3646380_537-230640-0_sp1.1 from training. Duration: 0.963625 2023-10-04 15:31:07,457 INFO [train.py:1046] (2/4) Epoch 49, batch 550, loss[loss=0.1434, simple_loss=0.2269, pruned_loss=0.02996, over 24474.00 frames. ], tot_loss[loss=0.1534, simple_loss=0.2341, pruned_loss=0.03638, over 4419581.36 frames. ], batch size: 63, lr: 2.08e-03, grad_scale: 16.0 2023-10-04 15:31:07,589 WARNING [train.py:1204] (2/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_159-281310-0 from training. Duration: 0.95 2023-10-04 15:31:08,990 WARNING [train.py:1204] (2/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_434-83000-0 from training. Duration: 0.98 2023-10-04 15:31:09,009 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_4405_210_1849_1_1526732205_3593939_493-32464-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 15:31:09,018 WARNING [train.py:1204] (2/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_342-100157-0 from training. Duration: 0.54 2023-10-04 15:31:09,280 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.bypass.scale_min, batch_count=1703546.6666666667, ans=0.2 2023-10-04 15:31:09,296 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.balancer1.prob, batch_count=1703546.6666666667, ans=0.125 2023-10-04 15:31:10,956 WARNING [train.py:1204] (2/4) Exclude cut with ID _44778_210_2054_1_1533798127203_7443090_765-250133-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 15:31:10,960 WARNING [train.py:1204] (2/4) Exclude cut with ID _38670_210_15379_1_1533448810659_4391059_81-312245-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 15:31:12,238 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_50050_210_16223_1_1535435796_7290138_1273-247611-0_sp1.1 from training. Duration: 0.936375 2023-10-04 15:31:12,300 WARNING [train.py:1204] (2/4) Exclude cut with ID _44831_210_13427_1_1533816011549_3689035_399-18591-0_sp1.1 from training. Duration: 0.936375 2023-10-04 15:31:12,315 WARNING [train.py:1204] (2/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_557-324258-0_sp0.9 from training. Duration: 0.9 2023-10-04 15:31:13,645 WARNING [train.py:1204] (2/4) Exclude cut with ID _3465_210_3525_1_1526175667060_3574049_62-335552-0_sp1.1 from training. Duration: 0.7909375 2023-10-04 15:31:14,948 WARNING [train.py:1204] (2/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_673-264125-0_sp1.1 from training. Duration: 0.963625 2023-10-04 15:31:16,326 WARNING [train.py:1204] (2/4) Exclude cut with ID _8030_210_7497_1_1528444886781_3531089_262-86750-0 from training. Duration: 0.81 2023-10-04 15:31:16,347 WARNING [train.py:1204] (2/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_444-28041-0_sp1.1 from training. Duration: 0.77275 2023-10-04 15:31:20,378 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_34008_210_10431_1_1533345424_8183247_516-14858-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 15:31:20,392 WARNING [train.py:1204] (2/4) Exclude cut with ID _32704_210_8954_1_1533017488840_6747370_54-248218-0_sp1.1 from training. Duration: 0.936375 2023-10-04 15:31:23,191 WARNING [train.py:1204] (2/4) Exclude cut with ID _13253_210_1925_1_1530757775819_3577873_300-119782-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 15:31:23,365 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.conv_module1.balancer2.min_abs, batch_count=1703613.3333333333, ans=0.5 2023-10-04 15:31:24,571 WARNING [train.py:1204] (2/4) Exclude cut with ID _39290_210_4220_1_1533862778760_3075118_21-240703-0_sp1.1 from training. Duration: 0.936375 2023-10-04 15:31:27,845 WARNING [train.py:1204] (2/4) Exclude cut with ID 774-127930-0014-10412-0_sp1.1 from training. Duration: 0.95 2023-10-04 15:31:29,795 WARNING [train.py:1204] (2/4) Exclude cut with ID _67499_210_17282_1_1535509805220_3692430_20-179346-0 from training. Duration: 0.97 2023-10-04 15:31:32,398 WARNING [train.py:1204] (2/4) Exclude cut with ID _8365_210_2064_1_1528592311070_4677690_431-294102-0_sp0.9 from training. Duration: 0.988875 2023-10-04 15:31:36,661 WARNING [train.py:1204] (2/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_365-1492-0_sp1.1 from training. Duration: 0.82725 2023-10-04 15:31:36,686 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_105-114797-0_sp0.9 from training. Duration: 0.9333125 2023-10-04 15:31:38,044 WARNING [train.py:1204] (2/4) Exclude cut with ID _6274_210_2084_1_1527937583837_7494020_769-168302-0_sp0.9 from training. Duration: 0.788875 2023-10-04 15:31:40,170 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.feed_forward2.hidden_balancer.prob, batch_count=1703680.0, ans=0.125 2023-10-04 15:31:42,786 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_40340_210_9024_1_1533433916_4132944_320-270277-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 15:31:42,792 WARNING [train.py:1204] (2/4) Exclude cut with ID ID0203W0003-781711 from training. Duration: 0.958 2023-10-04 15:31:42,872 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_30434_210_13049_1_1532519384_981746_52-12932-0_sp1.1 from training. Duration: 0.936375 2023-10-04 15:31:45,560 WARNING [train.py:1204] (2/4) Exclude cut with ID _8263_210_1664_1_1528438953494_294100_40-204731-0_sp0.9 from training. Duration: 0.5555625 2023-10-04 15:31:46,996 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1032-236863-0_sp0.9 from training. Duration: 0.9333125 2023-10-04 15:31:48,361 WARNING [train.py:1204] (2/4) Exclude cut with ID _8394_210_6702_1_1528590504316_3540189_335-274780-0_sp0.9 from training. Duration: 0.8666875 2023-10-04 15:31:48,369 WARNING [train.py:1204] (2/4) Exclude cut with ID _66873_210_5517_1_1535454136506_4293370_654-52713-0_sp1.1 from training. Duration: 0.7 2023-10-04 15:31:49,704 WARNING [train.py:1204] (2/4) Exclude cut with ID _5250_210_5219_1_1527249561806_8692110_742-267856-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 15:31:51,126 WARNING [train.py:1204] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_74-48717-0 from training. Duration: 0.58 2023-10-04 15:31:51,335 INFO [scaling.py:1118] (2/4) WithLoss: name=encoder.encoders.0.layers.1.self_attn_weights, loss-sum=0.000e+00 2023-10-04 15:31:52,495 WARNING [train.py:1204] (2/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_18-46756-0 from training. Duration: 0.79 2023-10-04 15:31:53,818 WARNING [train.py:1204] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_612-56061-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 15:31:53,827 WARNING [train.py:1204] (2/4) Exclude cut with ID _56423_210_5854_1_1534896061049_3165969_412-2403-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 15:31:53,890 WARNING [train.py:1204] (2/4) Exclude cut with ID _62574_210_2655_1_1535180407058_3933249_120-196848-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 15:31:53,891 WARNING [train.py:1204] (2/4) Exclude cut with ID _40519_210_13794_1_1533715228655_7337390_916-51000-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 15:31:54,182 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.conv_module2.balancer1.prob, batch_count=1703746.6666666667, ans=0.125 2023-10-04 15:31:57,880 WARNING [train.py:1204] (2/4) Exclude cut with ID _65263_210_22356_1_1535763598691_3921037_108-21902-0_sp1.1 from training. Duration: 0.9 2023-10-04 15:31:57,973 WARNING [train.py:1204] (2/4) Exclude cut with ID _4122_210_2084_1_1526713164417_4353280_528-206771-0_sp1.1 from training. Duration: 0.8 2023-10-04 15:32:01,225 WARNING [train.py:1204] (2/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_43-325657-0_sp1.1 from training. Duration: 0.92725 2023-10-04 15:32:02,662 WARNING [train.py:1204] (2/4) Exclude cut with ID _44659_210_2721_1_1534036120061_6614860_314-82041-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 15:32:02,720 WARNING [train.py:1204] (2/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_81-300212-0_sp0.9 from training. Duration: 0.6666875 2023-10-04 15:32:04,619 WARNING [train.py:1204] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_665-196155-0_sp0.9 from training. Duration: 0.7444375 2023-10-04 15:32:06,004 WARNING [train.py:1204] (2/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_140-336881-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 15:32:07,345 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6290_210_3273_1_1527595224_1282787_115-87095-0_sp0.9 from training. Duration: 0.611125 2023-10-04 15:32:07,400 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_53373_210_5399_1_1534229471_4715695_607-259685-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 15:32:09,390 WARNING [train.py:1204] (2/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_175-163894-0_sp1.1 from training. Duration: 0.67275 2023-10-04 15:32:09,411 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7002_210_1794_1_1527939683_5157286_1-281264-0_sp1.1 from training. Duration: 0.4 2023-10-04 15:32:16,142 WARNING [train.py:1204] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_605-257205-0 from training. Duration: 0.59 2023-10-04 15:32:16,402 INFO [scaling.py:1118] (2/4) WithLoss: name=encoder.encoders.3.encoder.layers.1.self_attn_weights, loss-sum=0.000e+00 2023-10-04 15:32:20,227 INFO [train.py:1046] (2/4) Epoch 49, batch 600, loss[loss=0.1513, simple_loss=0.2406, pruned_loss=0.03094, over 24369.00 frames. ], tot_loss[loss=0.1534, simple_loss=0.2339, pruned_loss=0.03641, over 4496156.74 frames. ], batch size: 74, lr: 2.08e-03, grad_scale: 16.0 2023-10-04 15:32:20,284 WARNING [train.py:1204] (2/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_454-314901-0 from training. Duration: 0.46 2023-10-04 15:32:21,644 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_46032_210_14656_1_1533781412_4507969_287-130422-0_sp1.1 from training. Duration: 0.97275 2023-10-04 15:32:21,669 WARNING [train.py:1204] (2/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_186-309161-0_sp1.1 from training. Duration: 0.4909375 2023-10-04 15:32:21,712 WARNING [train.py:1204] (2/4) Exclude cut with ID _42659_210_2070_1_1533607442765_3120380_45-342307-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 15:32:22,253 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.4.encoder.layers.1.feed_forward2.out_whiten, num_groups=1, num_channels=384, metric=11.01 vs. limit=15.0 2023-10-04 15:32:27,153 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.684e+02 1.961e+02 2.215e+02 2.451e+02 3.698e+02, threshold=4.430e+02, percent-clipped=0.0 2023-10-04 15:32:27,307 WARNING [train.py:1204] (2/4) Exclude cut with ID _12555_210_8750_1_1530075594402_3780239_478-326818-0_sp1.1 from training. Duration: 0.963625 2023-10-04 15:32:29,176 WARNING [train.py:1204] (2/4) Exclude cut with ID _7606_210_4915_1_1528894253277_5080690_127-177761-0_sp1.1 from training. Duration: 0.5818125 2023-10-04 15:32:31,896 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6788_210_1388_1_1527898698_4562021_91-197981-0 from training. Duration: 0.83 2023-10-04 15:32:33,856 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8558_210_1410_1_1528505225_7613406_131-329524-0_sp0.9 from training. Duration: 0.87775 2023-10-04 15:32:37,020 WARNING [train.py:1204] (2/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_153-153537-0_sp1.1 from training. Duration: 0.82725 2023-10-04 15:32:38,529 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_17292_210_6753_1_1531125388_2943734_285-349105-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 15:32:41,755 WARNING [train.py:1204] (2/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_37-41919-0 from training. Duration: 0.87 2023-10-04 15:32:41,787 WARNING [train.py:1204] (2/4) Exclude cut with ID _60860_210_8254_1_1534896015875_3557169_172-294852-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 15:32:47,729 WARNING [train.py:1204] (2/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_146-276944-0 from training. Duration: 0.83 2023-10-04 15:32:49,240 WARNING [train.py:1204] (2/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_1254-44082-0_sp1.1 from training. Duration: 0.936375 2023-10-04 15:32:49,252 WARNING [train.py:1204] (2/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_1115-180350-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 15:32:49,275 WARNING [train.py:1204] (2/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_60-146189-0_sp1.1 from training. Duration: 0.8909375 2023-10-04 15:32:54,686 WARNING [train.py:1204] (2/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_517-153649-0_sp1.1 from training. Duration: 0.92725 2023-10-04 15:32:54,698 WARNING [train.py:1204] (2/4) Exclude cut with ID _39571_210_13449_1_1533801584143_3651077_37-266257-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 15:32:56,013 WARNING [train.py:1204] (2/4) Exclude cut with ID _6653_210_2629_1_1527919172441_6732679_527-276118-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 15:32:58,368 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.1.encoder.layers.0.feed_forward2.out_whiten, num_groups=1, num_channels=256, metric=4.19 vs. limit=15.0 2023-10-04 15:33:05,076 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7696_210_2040_1_1528207167_3716099_117-125434-0_sp1.1 from training. Duration: 0.6909375 2023-10-04 15:33:09,683 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_25767_210_2161_1_1532307483_3867748_478-156360-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 15:33:09,687 WARNING [train.py:1204] (2/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_177-49092-0_sp1.1 from training. Duration: 0.82725 2023-10-04 15:33:09,699 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_11305_210_1794_1_1529750428_7178148_680-317073-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 15:33:11,748 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.3.encoder.layers.0.nonlin_attention.whiten1, num_groups=1, num_channels=384, metric=6.43 vs. limit=10.0 2023-10-04 15:33:14,771 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.bypass.skip_rate, batch_count=1704080.0, ans=0.09899494936611666 2023-10-04 15:33:14,778 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.conv_module2.balancer1.prob, batch_count=1704080.0, ans=0.125 2023-10-04 15:33:14,808 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.nonlin_attention.balancer.prob, batch_count=1704080.0, ans=0.125 2023-10-04 15:33:16,072 WARNING [train.py:1204] (2/4) Exclude cut with ID _53565_210_10635_1_1534337218335_3820259_287-18120-0 from training. Duration: 0.97 2023-10-04 15:33:19,564 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.2.encoder.layers.0.feed_forward2.out_whiten, num_groups=1, num_channels=384, metric=3.97 vs. limit=15.0 2023-10-04 15:33:22,917 WARNING [train.py:1204] (2/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_498-40153-0_sp1.1 from training. Duration: 0.57275 2023-10-04 15:33:22,936 WARNING [train.py:1204] (2/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_555-63066-0_sp1.1 from training. Duration: 0.963625 2023-10-04 15:33:23,304 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.feed_forward1.out_proj.dropout_p, batch_count=1704146.6666666667, ans=0.1 2023-10-04 15:33:24,475 WARNING [train.py:1204] (2/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_395-310220-0 from training. Duration: 0.89 2023-10-04 15:33:25,895 WARNING [train.py:1204] (2/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_937-208281-0_sp1.1 from training. Duration: 0.77275 2023-10-04 15:33:28,607 WARNING [train.py:1204] (2/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_120-211630-0 from training. Duration: 0.92 2023-10-04 15:33:28,813 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.attention_skip_rate, batch_count=1704146.6666666667, ans=0.0 2023-10-04 15:33:29,816 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_454-276429-0_sp1.1 from training. Duration: 0.7909375 2023-10-04 15:33:29,872 WARNING [train.py:1204] (2/4) Exclude cut with ID _22826_210_9994_1_1531908201522_3279169_404-75889-0_sp1.1 from training. Duration: 0.8090625 2023-10-04 15:33:35,019 INFO [train.py:1046] (2/4) Epoch 49, batch 650, loss[loss=0.1379, simple_loss=0.2231, pruned_loss=0.02634, over 24590.00 frames. ], tot_loss[loss=0.1528, simple_loss=0.233, pruned_loss=0.03626, over 4530780.47 frames. ], batch size: 60, lr: 2.08e-03, grad_scale: 16.0 2023-10-04 15:33:36,506 WARNING [train.py:1204] (2/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_282-136641-0_sp0.9 from training. Duration: 0.4666875 2023-10-04 15:33:36,605 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_809-17156-0_sp1.1 from training. Duration: 0.663625 2023-10-04 15:33:40,033 WARNING [train.py:1204] (2/4) Exclude cut with ID _9235_210_5219_1_1528891154253_8046120_1003-101638-0_sp0.9 from training. Duration: 0.811125 2023-10-04 15:33:41,297 WARNING [train.py:1204] (2/4) Exclude cut with ID _22826_210_9994_1_1531908201522_3279169_404-75889-0_sp0.9 from training. Duration: 0.988875 2023-10-04 15:33:44,100 WARNING [train.py:1204] (2/4) Exclude cut with ID _59288_210_5854_1_1535077757774_3412246_570-295255-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 15:33:45,935 WARNING [train.py:1204] (2/4) Exclude cut with ID _3465_210_3525_1_1526175667060_3574049_412-287526-0 from training. Duration: 0.72 2023-10-04 15:33:47,243 WARNING [train.py:1204] (2/4) Exclude cut with ID _44010_210_4525_1_1533884262964_4013620_649-79655-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 15:33:51,392 WARNING [train.py:1204] (2/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_57-270037-0_sp0.9 from training. Duration: 0.8555625 2023-10-04 15:33:51,398 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_34815_210_6388_1_1533381320_3571903_302-255963-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 15:33:54,181 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_47449_210_5399_1_1534076445_4006929_573-301852-0_sp1.1 from training. Duration: 0.97275 2023-10-04 15:33:56,108 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.4.encoder.layers.1.self_attn2.whiten, num_groups=1, num_channels=384, metric=11.91 vs. limit=22.5 2023-10-04 15:33:58,379 WARNING [train.py:1204] (2/4) Exclude cut with ID 6709-74022-0004-86860-0_sp1.1 from training. Duration: 0.9409375 2023-10-04 15:33:59,790 WARNING [train.py:1204] (2/4) Exclude cut with ID _40519_210_13794_1_1533715228655_7337390_111-71991-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 15:33:59,829 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_30167_210_3528_1_1532428012_4772375_169-214186-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 15:34:03,214 WARNING [train.py:1204] (2/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_20-235313-0_sp1.1 from training. Duration: 0.963625 2023-10-04 15:34:04,612 WARNING [train.py:1204] (2/4) Exclude cut with ID _4681_210_3525_1_1527251040321_1235713_123-24428-0_sp0.9 from training. Duration: 0.5333125 2023-10-04 15:34:06,132 WARNING [train.py:1204] (2/4) Exclude cut with ID _31602_210_5854_1_1532677021456_4458250_444-198752-0_sp1.1 from training. Duration: 0.97275 2023-10-04 15:34:06,174 WARNING [train.py:1204] (2/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_9-279442-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 15:34:07,630 WARNING [train.py:1204] (2/4) Exclude cut with ID _6275_210_2084_1_1528110062213_7669049_973-198085-0_sp0.9 from training. Duration: 0.8333125 2023-10-04 15:34:09,000 WARNING [train.py:1204] (2/4) Exclude cut with ID _29853_210_7497_1_1532505600851_6642110_567-250221-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 15:34:10,813 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6545_210_2040_1_1527774670_4245258_407-48070-0_sp1.1 from training. Duration: 0.6909375 2023-10-04 15:34:11,017 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.self_attn_weights.pos_emb_skip_rate, batch_count=1704346.6666666667, ans=0.0 2023-10-04 15:34:12,255 WARNING [train.py:1204] (2/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_224-343295-0_sp0.9 from training. Duration: 0.611125 2023-10-04 15:34:12,287 WARNING [train.py:1204] (2/4) Exclude cut with ID IC0125W0011-61817 from training. Duration: 0.88 2023-10-04 15:34:14,166 WARNING [train.py:1204] (2/4) Exclude cut with ID _60113_210_16271_1_1535164212481_4386079_435-154371-0_sp1.1 from training. Duration: 0.97275 2023-10-04 15:34:14,187 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_25836_210_16554_1_1532138167_53197_3-9394-0_sp1.1 from training. Duration: 0.92725 2023-10-04 15:34:16,943 WARNING [train.py:1204] (2/4) Exclude cut with ID _29343_210_4381_1_1532602757752_3674109_57-55125-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 15:34:17,043 WARNING [train.py:1204] (2/4) Exclude cut with ID _56235_210_10640_1_1534557335235_4161099_267-23560-0_sp1.1 from training. Duration: 0.936375 2023-10-04 15:34:18,355 WARNING [train.py:1204] (2/4) Exclude cut with ID _30196_210_1664_1_1533708496522_7074140_1150-278867-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 15:34:18,414 WARNING [train.py:1204] (2/4) Exclude cut with ID _6270_210_2084_1_1527850911573_3856420_26-277343-0_sp0.9 from training. Duration: 0.911125 2023-10-04 15:34:18,479 WARNING [train.py:1204] (2/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_325-62802-0 from training. Duration: 0.87 2023-10-04 15:34:18,624 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.1.feed_forward2.hidden_balancer.prob, batch_count=1704413.3333333333, ans=0.125 2023-10-04 15:34:19,890 WARNING [train.py:1204] (2/4) Exclude cut with ID _56524_210_9642_1_1534572011779_3619170_145-310842-0_sp1.1 from training. Duration: 0.9 2023-10-04 15:34:19,923 WARNING [train.py:1204] (2/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_860-96198-0_sp0.9 from training. Duration: 0.811125 2023-10-04 15:34:20,184 INFO [scaling.py:1118] (2/4) WithLoss: name=encoder.encoders.1.encoder.layers.1.self_attn_weights, loss-sum=0.000e+00 2023-10-04 15:34:21,293 WARNING [train.py:1204] (2/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_17-25045-0_sp1.1 from training. Duration: 0.536375 2023-10-04 15:34:21,319 WARNING [train.py:1204] (2/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_349-69727-0_sp1.1 from training. Duration: 0.936375 2023-10-04 15:34:22,748 WARNING [train.py:1204] (2/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_580-93798-0_sp1.1 from training. Duration: 0.5090625 2023-10-04 15:34:25,449 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7762_210_1794_1_1528199508_4057408_419-125009-0 from training. Duration: 0.83 2023-10-04 15:34:26,753 WARNING [train.py:1204] (2/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_356-167816-0 from training. Duration: 0.72 2023-10-04 15:34:26,789 WARNING [train.py:1204] (2/4) Exclude cut with ID _29345_210_4381_1_1532689168263_3860169_225-97100-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 15:34:26,799 WARNING [train.py:1204] (2/4) Exclude cut with ID _14227_210_4381_1_1530701804286_3940259_374-194712-0_sp1.1 from training. Duration: 0.92725 2023-10-04 15:34:26,824 WARNING [train.py:1204] (2/4) Exclude cut with ID _40137_210_12210_1_1533290484046_3795180_249-187059-0_sp0.9 from training. Duration: 0.97775 2023-10-04 15:34:28,148 WARNING [train.py:1204] (2/4) Exclude cut with ID _22826_210_9994_1_1531908201522_3279169_389-317194-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 15:34:29,605 WARNING [train.py:1204] (2/4) Exclude cut with ID _27485_210_4916_1_1532739191677_3668269_239-122918-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 15:34:33,645 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.balancer1.prob, batch_count=1704480.0, ans=0.125 2023-10-04 15:34:36,285 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_2245_210_2715_1_1525511548_3540254_128-117609-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 15:34:37,599 WARNING [train.py:1204] (2/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_160-276178-0_sp1.1 from training. Duration: 0.963625 2023-10-04 15:34:37,865 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.balancer2.prob, batch_count=1704480.0, ans=0.125 2023-10-04 15:34:39,037 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_31920_210_10633_1_1532860009_1686094_1-279735-0_sp1.1 from training. Duration: 0.97275 2023-10-04 15:34:41,012 WARNING [train.py:1204] (2/4) Exclude cut with ID _51078_210_9070_1_1534161557424_3805250_325-262383-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 15:34:41,040 WARNING [train.py:1204] (2/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_643-52007-0_sp0.9 from training. Duration: 0.6333125 2023-10-04 15:34:41,091 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_51937_210_5014_1_1534573610_4022025_518-299248-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 15:34:46,094 INFO [scaling.py:1118] (2/4) WithLoss: name=encoder.encoders.3.encoder.layers.1.self_attn_weights, loss-sum=0.000e+00 2023-10-04 15:34:47,137 WARNING [train.py:1204] (2/4) Exclude cut with ID _13252_210_1925_1_1530671372047_3336909_166-314607-0_sp1.1 from training. Duration: 0.4909375 2023-10-04 15:34:47,143 WARNING [train.py:1204] (2/4) Exclude cut with ID _44778_210_2054_1_1533798127203_7443090_1087-282939-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 15:34:47,179 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_23850_210_4238_1_1532232597_1632433_32-185135-0_sp1.1 from training. Duration: 0.7818125 2023-10-04 15:34:47,210 WARNING [train.py:1204] (2/4) Exclude cut with ID _59500_210_8254_1_1534809610680_3736009_145-89359-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 15:34:49,847 INFO [train.py:1046] (2/4) Epoch 49, batch 700, loss[loss=0.1458, simple_loss=0.2307, pruned_loss=0.03046, over 24318.00 frames. ], tot_loss[loss=0.1523, simple_loss=0.2323, pruned_loss=0.03614, over 4575460.51 frames. ], batch size: 61, lr: 2.08e-03, grad_scale: 16.0 2023-10-04 15:34:51,893 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.3.encoder.layers.3.feed_forward1.out_whiten, num_groups=1, num_channels=512, metric=12.64 vs. limit=15.0 2023-10-04 15:34:52,550 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_340-336408-0 from training. Duration: 0.93 2023-10-04 15:34:52,625 WARNING [train.py:1204] (2/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_9-21737-0 from training. Duration: 0.97 2023-10-04 15:34:55,348 WARNING [train.py:1204] (2/4) Exclude cut with ID _2220_210_1819_1_1525509109898_3658680_50-99837-0 from training. Duration: 0.54 2023-10-04 15:34:55,408 WARNING [train.py:1204] (2/4) Exclude cut with ID _30304_210_6319_1_1532476827261_7582060_879-299350-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 15:34:56,641 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.769e+02 2.097e+02 2.386e+02 2.709e+02 4.404e+02, threshold=4.772e+02, percent-clipped=0.0 2023-10-04 15:34:56,804 WARNING [train.py:1204] (2/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_905-160074-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 15:34:59,458 WARNING [train.py:1204] (2/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_395-73570-0 from training. Duration: 0.99 2023-10-04 15:35:04,275 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7264_210_4938_1_1527939911_8132095_800-91995-0_sp1.1 from training. Duration: 0.7818125 2023-10-04 15:35:07,014 WARNING [train.py:1204] (2/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_77-311404-0_sp1.1 from training. Duration: 0.8545625 2023-10-04 15:35:08,429 WARNING [train.py:1204] (2/4) Exclude cut with ID _13679_210_7497_1_1530534293662_3355098_107-80772-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 15:35:10,369 WARNING [train.py:1204] (2/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_224-343295-0_sp1.1 from training. Duration: 0.5 2023-10-04 15:35:10,407 WARNING [train.py:1204] (2/4) Exclude cut with ID _54871_210_22356_1_1534665798253_3856359_156-253482-0_sp1.1 from training. Duration: 0.963625 2023-10-04 15:35:14,979 WARNING [train.py:1204] (2/4) Exclude cut with ID _49728_210_12651_1_1534041034294_6788150_309-125532-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 15:35:16,461 WARNING [train.py:1204] (2/4) Exclude cut with ID _13913_210_9994_1_1530579615187_3383897_473-277682-0_sp0.9 from training. Duration: 0.4333125 2023-10-04 15:35:16,465 WARNING [train.py:1204] (2/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_3-276701-0_sp1.1 from training. Duration: 0.82725 2023-10-04 15:35:17,975 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.ff3_skip_rate, batch_count=1704680.0, ans=0.0 2023-10-04 15:35:19,152 WARNING [train.py:1204] (2/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_282-136641-0 from training. Duration: 0.42 2023-10-04 15:35:20,556 WARNING [train.py:1204] (2/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_127-566-0 from training. Duration: 0.84 2023-10-04 15:35:23,446 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.bypass.scale_min, batch_count=1704680.0, ans=0.2 2023-10-04 15:35:24,543 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6163_210_6400_1_1527506746_4263068_283-137811-0_sp1.1 from training. Duration: 0.7 2023-10-04 15:35:24,586 WARNING [train.py:1204] (2/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_34-183685-0_sp1.1 from training. Duration: 0.8909375 2023-10-04 15:35:25,936 WARNING [train.py:1204] (2/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_154-135696-0_sp0.9 from training. Duration: 0.888875 2023-10-04 15:35:29,982 WARNING [train.py:1204] (2/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_676-51014-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 15:35:31,929 WARNING [train.py:1204] (2/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_38-169920-0 from training. Duration: 0.91 2023-10-04 15:35:35,366 INFO [scaling.py:1118] (2/4) WithLoss: name=encoder.encoders.0.layers.1.self_attn_weights, loss-sum=0.000e+00 2023-10-04 15:35:36,440 WARNING [train.py:1204] (2/4) Exclude cut with ID _6653_210_2629_1_1527919172441_6732679_747-114465-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 15:35:36,494 WARNING [train.py:1204] (2/4) Exclude cut with ID _8021_210_7994_1_1528376451358_3653069_100-140231-0_sp1.1 from training. Duration: 0.6090625 2023-10-04 15:35:36,512 WARNING [train.py:1204] (2/4) Exclude cut with ID _64135_210_2253_1_1535677267876_7608972_133-188266-0 from training. Duration: 0.81 2023-10-04 15:35:40,746 WARNING [train.py:1204] (2/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_714-310538-0_sp1.1 from training. Duration: 0.7818125 2023-10-04 15:35:42,624 WARNING [train.py:1204] (2/4) Exclude cut with ID _39867_210_9826_1_1533553222030_9344259_1134-288498-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 15:35:45,800 WARNING [train.py:1204] (2/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_211-237289-0_sp1.1 from training. Duration: 0.936375 2023-10-04 15:35:50,598 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.1.encoder.layers.1.self_attn_weights.whiten_keys, num_groups=4, num_channels=128, metric=2.82 vs. limit=6.0 2023-10-04 15:35:51,161 WARNING [train.py:1204] (2/4) Exclude cut with ID _59202_210_26984_1_1534816810612_3549174_137-318710-0_sp0.9 from training. Duration: 0.988875 2023-10-04 15:35:51,197 WARNING [train.py:1204] (2/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_86-66192-0 from training. Duration: 0.55 2023-10-04 15:35:53,969 WARNING [train.py:1204] (2/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_540-275685-0 from training. Duration: 0.89 2023-10-04 15:35:55,314 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8776_210_2040_1_1528638837_4088036_244-278763-0 from training. Duration: 0.9 2023-10-04 15:35:56,818 WARNING [train.py:1204] (2/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_9-219972-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 15:35:58,385 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.bypass.scale_min, batch_count=1704813.3333333333, ans=0.2 2023-10-04 15:35:59,430 WARNING [train.py:1204] (2/4) Exclude cut with ID _44659_210_2721_1_1534036120061_6614860_656-283823-0_sp1.1 from training. Duration: 0.92725 2023-10-04 15:35:59,500 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_69630_210_7234_1_1535789601_661049_77-216385-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 15:36:02,142 INFO [train.py:1046] (2/4) Epoch 49, batch 750, loss[loss=0.1516, simple_loss=0.2404, pruned_loss=0.03137, over 24475.00 frames. ], tot_loss[loss=0.1517, simple_loss=0.2316, pruned_loss=0.03592, over 4601745.80 frames. ], batch size: 66, lr: 2.08e-03, grad_scale: 16.0 2023-10-04 15:36:02,251 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_19952_210_2161_1_1531875469_3851750_210-308919-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 15:36:02,257 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_4737_210_2040_1_1526998689_3293008_378-178650-0 from training. Duration: 0.95 2023-10-04 15:36:04,389 WARNING [train.py:1204] (2/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_224-343295-0 from training. Duration: 0.55 2023-10-04 15:36:05,683 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7002_210_1794_1_1527939683_5157286_184-229630-0 from training. Duration: 0.74 2023-10-04 15:36:05,706 WARNING [train.py:1204] (2/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_6-214815-0 from training. Duration: 0.85 2023-10-04 15:36:07,091 WARNING [train.py:1204] (2/4) Exclude cut with ID _8699_210_2069_1_1528630346831_3960409_160-24408-0 from training. Duration: 0.91 2023-10-04 15:36:07,135 WARNING [train.py:1204] (2/4) Exclude cut with ID _5585_210_2667_1_1527418813324_8007340_89-332072-0 from training. Duration: 0.64 2023-10-04 15:36:08,469 WARNING [train.py:1204] (2/4) Exclude cut with ID _5250_210_5219_1_1527249561806_8692110_528-259800-0_sp1.1 from training. Duration: 0.963625 2023-10-04 15:36:09,856 WARNING [train.py:1204] (2/4) Exclude cut with ID _55383_210_10635_1_1534468426172_2231669_134-128246-0 from training. Duration: 0.89 2023-10-04 15:36:09,926 WARNING [train.py:1204] (2/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_400-338274-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 15:36:11,360 WARNING [train.py:1204] (2/4) Exclude cut with ID _14224_210_4381_1_1530529062037_4264010_364-22817-0_sp0.9 from training. Duration: 0.788875 2023-10-04 15:36:13,269 WARNING [train.py:1204] (2/4) Exclude cut with ID _42863_210_9070_1_1533902398788_3696920_331-291057-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 15:36:15,227 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_5436_210_1794_1_1527331317_2541494_229-211016-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 15:36:16,576 WARNING [train.py:1204] (2/4) Exclude cut with ID _6456_210_1534_1_1527681634212_3181049_345-252877-0_sp0.9 from training. Duration: 0.711125 2023-10-04 15:36:16,608 WARNING [train.py:1204] (2/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_96-81499-0_sp1.1 from training. Duration: 0.92725 2023-10-04 15:36:16,808 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.1.conv_module1.balancer2.prob, batch_count=1704946.6666666667, ans=0.125 2023-10-04 15:36:19,281 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_5575_210_1410_1_1527301305_2570170_342-265572-0_sp1.1 from training. Duration: 0.8545625 2023-10-04 15:36:19,335 WARNING [train.py:1204] (2/4) Exclude cut with ID _5444_210_2151_1_1527418808464_5312210_726-27165-0_sp0.9 from training. Duration: 0.7444375 2023-10-04 15:36:20,700 WARNING [train.py:1204] (2/4) Exclude cut with ID _22083_210_8254_1_1531702862439_3678970_304-327750-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 15:36:21,090 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.bypass.scale_min, batch_count=1704946.6666666667, ans=0.2 2023-10-04 15:36:22,108 WARNING [train.py:1204] (2/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_245-226199-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 15:36:22,154 WARNING [train.py:1204] (2/4) Exclude cut with ID _42875_210_14856_1_1533704397487_4012433_102-199499-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 15:36:22,749 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.4.encoder.layers.2.nonlin_attention.whiten1, num_groups=1, num_channels=288, metric=4.28 vs. limit=10.0 2023-10-04 15:36:23,454 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_51225_210_3601_1_1534400634_2327383_55-301844-0 from training. Duration: 0.93 2023-10-04 15:36:24,848 WARNING [train.py:1204] (2/4) Exclude cut with ID _28317_210_12742_1_1532480302381_8510325_916-195848-0_sp1.1 from training. Duration: 0.836375 2023-10-04 15:36:24,920 WARNING [train.py:1204] (2/4) Exclude cut with ID _44718_210_5816_1_1534050051869_6469030_497-311999-0_sp1.1 from training. Duration: 0.936375 2023-10-04 15:36:27,496 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_34886_210_10633_1_1533203882_1755661_91-194765-0_sp1.1 from training. Duration: 0.936375 2023-10-04 15:36:27,604 WARNING [train.py:1204] (2/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_310-26186-0_sp0.9 from training. Duration: 0.77775 2023-10-04 15:36:28,982 WARNING [train.py:1204] (2/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_416-257730-0 from training. Duration: 0.65 2023-10-04 15:36:28,988 WARNING [train.py:1204] (2/4) Exclude cut with ID _9389_210_1664_1_1529217337301_5289269_209-154760-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 15:36:29,586 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.4.encoder.layers.0.feed_forward3.out_whiten, num_groups=1, num_channels=384, metric=13.60 vs. limit=15.0 2023-10-04 15:36:32,244 WARNING [train.py:1204] (2/4) Exclude cut with ID _4092_210_2151_1_1526643044449_3135759_227-213972-0 from training. Duration: 0.54 2023-10-04 15:36:32,277 WARNING [train.py:1204] (2/4) Exclude cut with ID IC0679W0044-335852 from training. Duration: 0.768 2023-10-04 15:36:34,076 WARNING [train.py:1204] (2/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_42-186162-0 from training. Duration: 0.92 2023-10-04 15:36:34,085 WARNING [train.py:1204] (2/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_302-2550-0_sp0.9 from training. Duration: 0.9 2023-10-04 15:36:34,107 WARNING [train.py:1204] (2/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_356-31216-0_sp0.9 from training. Duration: 0.8333125 2023-10-04 15:36:35,594 WARNING [train.py:1204] (2/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_125-322172-0_sp1.1 from training. Duration: 0.8454375 2023-10-04 15:36:40,967 WARNING [train.py:1204] (2/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_141-89727-0_sp0.9 from training. Duration: 0.788875 2023-10-04 15:36:40,984 WARNING [train.py:1204] (2/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_647-337784-0_sp1.1 from training. Duration: 0.97275 2023-10-04 15:36:40,993 WARNING [train.py:1204] (2/4) Exclude cut with ID _6493_210_1747_1_1528459210424_4012470_513-140967-0_sp1.1 from training. Duration: 0.7181875 2023-10-04 15:36:43,025 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.conv_skip_rate, batch_count=1705013.3333333333, ans=0.0 2023-10-04 15:36:45,677 WARNING [train.py:1204] (2/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_87-266334-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 15:36:47,055 WARNING [train.py:1204] (2/4) Exclude cut with ID _26528_210_9070_1_1532570410842_3851310_526-261338-0_sp1.1 from training. Duration: 0.92725 2023-10-04 15:36:47,105 WARNING [train.py:1204] (2/4) Exclude cut with ID _14224_210_4381_1_1530529062037_4264010_380-343670-0 from training. Duration: 0.88 2023-10-04 15:36:48,388 WARNING [train.py:1204] (2/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_449-15508-0_sp1.1 from training. Duration: 0.7454375 2023-10-04 15:36:48,495 WARNING [train.py:1204] (2/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_372-6869-0_sp0.9 from training. Duration: 0.488875 2023-10-04 15:36:49,858 WARNING [train.py:1204] (2/4) Exclude cut with ID _26907_210_7234_1_1532221740694_3205263_337-2494-0_sp0.9 from training. Duration: 0.97775 2023-10-04 15:36:52,543 WARNING [train.py:1204] (2/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_10-100367-0_sp1.1 from training. Duration: 0.8909375 2023-10-04 15:36:52,570 WARNING [train.py:1204] (2/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_47-167373-0 from training. Duration: 0.92 2023-10-04 15:36:53,917 WARNING [train.py:1204] (2/4) Exclude cut with ID _28323_210_16187_1_1532480395667_4100300_628-282179-0_sp1.1 from training. Duration: 0.97275 2023-10-04 15:36:58,103 WARNING [train.py:1204] (2/4) Exclude cut with ID _20455_210_5219_1_1531565996775_8098100_213-280898-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 15:36:59,514 WARNING [train.py:1204] (2/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_383-316000-0_sp1.1 from training. Duration: 0.8181875 2023-10-04 15:36:59,545 WARNING [train.py:1204] (2/4) Exclude cut with ID _18513_210_9206_1_1531618215223_3862620_16-78180-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 15:37:02,701 WARNING [train.py:1204] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_330-327861-0_sp1.1 from training. Duration: 0.6090625 2023-10-04 15:37:07,282 WARNING [train.py:1204] (2/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_356-31216-0 from training. Duration: 0.75 2023-10-04 15:37:07,303 WARNING [train.py:1204] (2/4) Exclude cut with ID _25248_210_8750_1_1532062825941_5554180_255-148539-0_sp1.1 from training. Duration: 0.963625 2023-10-04 15:37:08,569 WARNING [train.py:1204] (2/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_269-154726-0_sp1.1 from training. Duration: 0.7818125 2023-10-04 15:37:10,185 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7002_210_1794_1_1527939683_5157286_163-87552-0_sp1.1 from training. Duration: 0.7818125 2023-10-04 15:37:11,525 WARNING [train.py:1204] (2/4) Exclude cut with ID _54871_210_22356_1_1534665798253_3856359_97-151896-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 15:37:13,519 WARNING [train.py:1204] (2/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_207-7026-0_sp1.1 from training. Duration: 0.97275 2023-10-04 15:37:13,539 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_92-46009-0_sp0.9 from training. Duration: 0.6 2023-10-04 15:37:16,978 INFO [train.py:1046] (2/4) Epoch 49, batch 800, loss[loss=0.144, simple_loss=0.2285, pruned_loss=0.02973, over 19786.00 frames. ], tot_loss[loss=0.1522, simple_loss=0.2324, pruned_loss=0.03601, over 4633249.59 frames. ], batch size: 43, lr: 2.08e-03, grad_scale: 32.0 2023-10-04 15:37:21,390 WARNING [train.py:1204] (2/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_122-178847-0_sp1.1 from training. Duration: 0.97275 2023-10-04 15:37:21,394 WARNING [train.py:1204] (2/4) Exclude cut with ID _44778_210_2054_1_1533798127203_7443090_94-348956-0_sp1.1 from training. Duration: 0.92725 2023-10-04 15:37:22,798 WARNING [train.py:1204] (2/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_1018-244089-0_sp1.1 from training. Duration: 0.963625 2023-10-04 15:37:22,810 WARNING [train.py:1204] (2/4) Exclude cut with ID _56235_210_10640_1_1534557335235_4161099_87-118423-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 15:37:24,005 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.715e+02 2.062e+02 2.313e+02 2.836e+02 4.157e+02, threshold=4.626e+02, percent-clipped=0.0 2023-10-04 15:37:24,161 WARNING [train.py:1204] (2/4) Exclude cut with ID _53713_210_24116_1_1534314487762_7416751_504-212292-0_sp1.1 from training. Duration: 0.92725 2023-10-04 15:37:24,187 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_446-150327-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 15:37:25,538 WARNING [train.py:1204] (2/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_682-245989-0_sp1.1 from training. Duration: 0.92725 2023-10-04 15:37:28,367 WARNING [train.py:1204] (2/4) Exclude cut with ID _35723_210_1794_1_1533088341043_628250_8-234524-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 15:37:29,709 WARNING [train.py:1204] (2/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_64-248359-0_sp0.9 from training. Duration: 0.7666875 2023-10-04 15:37:32,886 WARNING [train.py:1204] (2/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_49-97158-0 from training. Duration: 0.76 2023-10-04 15:37:34,257 WARNING [train.py:1204] (2/4) Exclude cut with ID _24314_210_4915_1_1532135078116_7170179_743-915-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 15:37:36,125 WARNING [train.py:1204] (2/4) Exclude cut with ID _61194_210_18628_1_1535007555628_3750909_353-323468-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 15:37:36,158 WARNING [train.py:1204] (2/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_387-131538-0_sp1.1 from training. Duration: 0.736375 2023-10-04 15:37:37,505 WARNING [train.py:1204] (2/4) Exclude cut with ID _44424_210_18163_1_1533623394848_1213110_79-187603-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 15:37:37,544 WARNING [train.py:1204] (2/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_86-19601-0 from training. Duration: 0.85 2023-10-04 15:37:37,581 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_50050_210_16223_1_1535435796_7290138_103-283529-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 15:37:37,612 WARNING [train.py:1204] (2/4) Exclude cut with ID _13913_210_9994_1_1530579615187_3383897_473-277682-0 from training. Duration: 0.39 2023-10-04 15:37:42,085 WARNING [train.py:1204] (2/4) Exclude cut with ID _42326_210_7770_1_1533814282531_3707269_368-68029-0_sp1.1 from training. Duration: 0.92725 2023-10-04 15:37:42,245 INFO [scaling.py:1118] (2/4) WithLoss: name=encoder.encoders.0.layers.0.self_attn_weights, loss-sum=0.000e+00 2023-10-04 15:37:45,617 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_41428_210_2161_1_1533774184_3992532_446-339645-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 15:37:48,405 WARNING [train.py:1204] (2/4) Exclude cut with ID _69584_210_5219_1_1535785133775_4044450_325-7656-0_sp1.1 from training. Duration: 0.7818125 2023-10-04 15:37:48,413 WARNING [train.py:1204] (2/4) Exclude cut with ID _21632_210_2253_1_1531994001765_3744140_134-184387-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 15:37:51,152 WARNING [train.py:1204] (2/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_550-184707-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 15:37:51,174 WARNING [train.py:1204] (2/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_106-341780-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 15:37:55,299 WARNING [train.py:1204] (2/4) Exclude cut with ID _45871_210_5665_1_1533864614245_3296060_253-132552-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 15:37:56,579 WARNING [train.py:1204] (2/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_395-310220-0_sp1.1 from training. Duration: 0.8090625 2023-10-04 15:37:56,593 WARNING [train.py:1204] (2/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_795-33252-0_sp0.9 from training. Duration: 0.411125 2023-10-04 15:37:59,285 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9002W0003-495962 from training. Duration: 0.946 2023-10-04 15:37:59,313 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_43-71989-0 from training. Duration: 0.57 2023-10-04 15:37:59,332 WARNING [train.py:1204] (2/4) Exclude cut with ID _8699_210_2069_1_1528630346831_3960409_155-39766-0_sp1.1 from training. Duration: 0.7090625 2023-10-04 15:37:59,352 WARNING [train.py:1204] (2/4) Exclude cut with ID _27894_210_12392_1_1532415496078_6902439_788-232684-0_sp1.1 from training. Duration: 0.97275 2023-10-04 15:38:00,687 WARNING [train.py:1204] (2/4) Exclude cut with ID _56494_210_4220_1_1535284837625_3033169_10-39296-0_sp1.1 from training. Duration: 0.92725 2023-10-04 15:38:01,974 WARNING [train.py:1204] (2/4) Exclude cut with ID _40901_210_5665_1_1533363789958_3856130_341-20819-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 15:38:06,621 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9029W0435-509822 from training. Duration: 0.896 2023-10-04 15:38:06,663 WARNING [train.py:1204] (2/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_51-117576-0 from training. Duration: 0.4 2023-10-04 15:38:08,031 WARNING [train.py:1204] (2/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_197-28164-0_sp1.1 from training. Duration: 0.72725 2023-10-04 15:38:08,317 INFO [scaling.py:1118] (2/4) WithLoss: name=encoder.encoders.5.encoder.layers.0.self_attn_weights, loss-sum=0.000e+00 2023-10-04 15:38:09,515 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.ff2_skip_rate, batch_count=1705413.3333333333, ans=0.0 2023-10-04 15:38:10,705 WARNING [train.py:1204] (2/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_145-137790-0_sp1.1 from training. Duration: 0.7545625 2023-10-04 15:38:16,073 WARNING [train.py:1204] (2/4) Exclude cut with ID _47790_210_16187_1_1533862814066_4207519_2-39653-0_sp1.1 from training. Duration: 0.8909375 2023-10-04 15:38:19,011 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_3780_210_2087_1_1526467760_2130483_186-208240-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 15:38:19,101 WARNING [train.py:1204] (2/4) Exclude cut with ID _24055_210_12651_1_1532310907143_976979_92-59356-0 from training. Duration: 0.91 2023-10-04 15:38:21,068 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_3910_210_1794_1_1526742306_334530_28-238909-0_sp1.1 from training. Duration: 0.736375 2023-10-04 15:38:23,869 WARNING [train.py:1204] (2/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_269-154726-0 from training. Duration: 0.86 2023-10-04 15:38:26,862 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.out_combiner.scale_min, batch_count=1705480.0, ans=0.2 2023-10-04 15:38:28,078 WARNING [train.py:1204] (2/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_326-165900-0_sp0.9 from training. Duration: 0.7666875 2023-10-04 15:38:28,381 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.bypass_mid.scale_min, batch_count=1705480.0, ans=0.2 2023-10-04 15:38:30,714 INFO [train.py:1046] (2/4) Epoch 49, batch 850, loss[loss=0.1422, simple_loss=0.2218, pruned_loss=0.0313, over 23675.00 frames. ], tot_loss[loss=0.1531, simple_loss=0.2334, pruned_loss=0.03638, over 4660433.10 frames. ], batch size: 135, lr: 2.08e-03, grad_scale: 32.0 2023-10-04 15:38:30,788 WARNING [train.py:1204] (2/4) Exclude cut with ID _5294_210_1747_1_1527249643928_3696179_470-276077-0_sp1.1 from training. Duration: 0.963625 2023-10-04 15:38:30,823 WARNING [train.py:1204] (2/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_372-131163-0 from training. Duration: 0.94 2023-10-04 15:38:30,885 WARNING [train.py:1204] (2/4) Exclude cut with ID _45297_210_2721_1_1533715351381_3264949_351-147136-0_sp1.1 from training. Duration: 0.7909375 2023-10-04 15:38:32,207 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_1072-12671-0_sp1.1 from training. Duration: 0.92725 2023-10-04 15:38:32,309 WARNING [train.py:1204] (2/4) Exclude cut with ID _15133_210_6228_1_1530794512074_3080379_54-198193-0 from training. Duration: 0.94 2023-10-04 15:38:32,325 WARNING [train.py:1204] (2/4) Exclude cut with ID _30196_210_1664_1_1533708496522_7074140_1194-270988-0_sp1.1 from training. Duration: 0.97275 2023-10-04 15:38:34,950 WARNING [train.py:1204] (2/4) Exclude cut with ID _50310_210_15488_1_1534068043345_3641080_289-193637-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 15:38:36,248 WARNING [train.py:1204] (2/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_313-263495-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 15:38:38,112 WARNING [train.py:1204] (2/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_18-130336-0_sp1.1 from training. Duration: 0.7181875 2023-10-04 15:38:39,405 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_47896_210_20508_1_1534492518_5958868_646-287209-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 15:38:41,468 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_294-163793-0 from training. Duration: 0.82 2023-10-04 15:38:41,515 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_8-319957-0 from training. Duration: 0.96 2023-10-04 15:38:41,522 WARNING [train.py:1204] (2/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_1211-226066-0 from training. Duration: 0.76 2023-10-04 15:38:41,691 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.feed_forward3.hidden_balancer.prob, batch_count=1705546.6666666667, ans=0.125 2023-10-04 15:38:44,201 WARNING [train.py:1204] (2/4) Exclude cut with ID _4779_210_2084_1_1527318022944_3914893_500-285013-0_sp0.9 from training. Duration: 0.7666875 2023-10-04 15:38:45,886 WARNING [train.py:1204] (2/4) Exclude cut with ID _22826_210_9994_1_1531908201522_3279169_347-232268-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 15:38:47,366 WARNING [train.py:1204] (2/4) Exclude cut with ID _29877_210_6228_1_1532397791106_5276149_111-55056-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 15:38:47,389 WARNING [train.py:1204] (2/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_812-271646-0_sp1.1 from training. Duration: 0.92725 2023-10-04 15:38:48,625 WARNING [train.py:1204] (2/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_235-262124-0_sp1.1 from training. Duration: 0.6090625 2023-10-04 15:38:53,446 WARNING [train.py:1204] (2/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_14-16530-0_sp1.1 from training. Duration: 0.97275 2023-10-04 15:38:54,752 WARNING [train.py:1204] (2/4) Exclude cut with ID _44718_210_5816_1_1534050051869_6469030_568-304512-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 15:38:54,781 WARNING [train.py:1204] (2/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_1033-274376-0 from training. Duration: 0.87 2023-10-04 15:38:56,365 WARNING [train.py:1204] (2/4) Exclude cut with ID _4681_210_3525_1_1527251040321_1235713_123-24428-0 from training. Duration: 0.48 2023-10-04 15:39:00,599 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_37818_210_4238_1_1533620910_4720083_251-288764-0_sp1.1 from training. Duration: 0.97275 2023-10-04 15:39:02,011 WARNING [train.py:1204] (2/4) Exclude cut with ID _65585_210_12210_1_1535450451584_3764098_112-294301-0 from training. Duration: 0.88 2023-10-04 15:39:06,160 WARNING [train.py:1204] (2/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_329-113173-0 from training. Duration: 0.62 2023-10-04 15:39:07,627 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6767_210_3273_1_1527855783_1718932_173-256601-0 from training. Duration: 0.8 2023-10-04 15:39:09,686 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9266W0001-627677 from training. Duration: 0.896 2023-10-04 15:39:09,702 WARNING [train.py:1204] (2/4) Exclude cut with ID _49643_210_7989_1_1534640244224_3939031_611-185515-0_sp1.1 from training. Duration: 0.8545625 2023-10-04 15:39:09,712 WARNING [train.py:1204] (2/4) Exclude cut with ID _31649_210_6228_1_1532584608063_4091078_687-237771-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 15:39:09,721 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_12868_210_6753_1_1530702479_3610564_148-172861-0_sp0.9 from training. Duration: 0.5555625 2023-10-04 15:39:12,474 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_5129_210_6400_1_1527161005_3579054_107-69738-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 15:39:13,861 WARNING [train.py:1204] (2/4) Exclude cut with ID _40137_210_12210_1_1533290484046_3795180_36-301164-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 15:39:13,889 WARNING [train.py:1204] (2/4) Exclude cut with ID _7537_210_2084_1_1528502395431_3924359_113-199933-0 from training. Duration: 0.59 2023-10-04 15:39:17,054 WARNING [train.py:1204] (2/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_547-5424-0_sp1.1 from training. Duration: 0.8545625 2023-10-04 15:39:17,110 WARNING [train.py:1204] (2/4) Exclude cut with ID _43552_210_8913_1_1533726031059_3523429_147-284024-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 15:39:18,891 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_294-163793-0_sp1.1 from training. Duration: 0.7454375 2023-10-04 15:39:18,920 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_3780_210_2087_1_1526467760_2130483_170-259044-0_sp1.1 from training. Duration: 0.636375 2023-10-04 15:39:19,816 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.2.encoder.layers.0.feed_forward1.out_whiten, num_groups=1, num_channels=384, metric=4.65 vs. limit=15.0 2023-10-04 15:39:20,315 WARNING [train.py:1204] (2/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_726-270780-0_sp1.1 from training. Duration: 0.7818125 2023-10-04 15:39:20,403 WARNING [train.py:1204] (2/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_214-291369-0_sp1.1 from training. Duration: 0.57275 2023-10-04 15:39:21,682 WARNING [train.py:1204] (2/4) Exclude cut with ID _21881_210_3525_1_1531825242757_3798380_247-102591-0 from training. Duration: 0.98 2023-10-04 15:39:25,151 WARNING [train.py:1204] (2/4) Exclude cut with ID _8064_210_5399_1_1528372299291_4282973_18-174647-0_sp1.1 from training. Duration: 0.82725 2023-10-04 15:39:25,153 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_415-52611-0_sp1.1 from training. Duration: 0.936375 2023-10-04 15:39:26,414 WARNING [train.py:1204] (2/4) Exclude cut with ID _8394_210_6702_1_1528590504316_3540189_379-49457-0_sp0.9 from training. Duration: 0.9666875 2023-10-04 15:39:26,434 WARNING [train.py:1204] (2/4) Exclude cut with ID _64521_210_14699_1_1535243427259_3815328_368-195106-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 15:39:27,752 WARNING [train.py:1204] (2/4) Exclude cut with ID _28394_210_8913_1_1532430049518_3650118_51-334124-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 15:39:29,237 WARNING [train.py:1204] (2/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_317-161769-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 15:39:29,409 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.1.bypass_mid.scale_min, batch_count=1705813.3333333333, ans=0.2 2023-10-04 15:39:32,056 WARNING [train.py:1204] (2/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_40-129756-0_sp0.9 from training. Duration: 0.9 2023-10-04 15:39:33,452 WARNING [train.py:1204] (2/4) Exclude cut with ID _3067_210_2667_1_1526212974640_4503168_119-175776-0_sp0.9 from training. Duration: 0.82225 2023-10-04 15:39:34,809 WARNING [train.py:1204] (2/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_1004-304915-0_sp1.1 from training. Duration: 0.92725 2023-10-04 15:39:34,871 WARNING [train.py:1204] (2/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_359-118926-0_sp0.9 from training. Duration: 0.8 2023-10-04 15:39:42,277 WARNING [train.py:1204] (2/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_51-200220-0_sp0.9 from training. Duration: 0.62225 2023-10-04 15:39:42,442 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=1705813.3333333333, ans=0.1 2023-10-04 15:39:44,136 WARNING [train.py:1204] (2/4) Exclude cut with ID _46683_210_9642_1_1533730913492_4591116_530-326202-0_sp1.1 from training. Duration: 0.936375 2023-10-04 15:39:45,385 INFO [train.py:1046] (2/4) Epoch 49, batch 900, loss[loss=0.145, simple_loss=0.2276, pruned_loss=0.03123, over 24504.00 frames. ], tot_loss[loss=0.1535, simple_loss=0.2339, pruned_loss=0.03649, over 4663206.24 frames. ], batch size: 63, lr: 2.08e-03, grad_scale: 8.0 2023-10-04 15:39:45,446 WARNING [train.py:1204] (2/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_567-135114-0 from training. Duration: 0.87 2023-10-04 15:39:45,485 WARNING [train.py:1204] (2/4) Exclude cut with ID _66863_210_18053_1_1535698758334_8590319_1092-271784-0_sp1.1 from training. Duration: 0.87275 2023-10-04 15:39:46,766 WARNING [train.py:1204] (2/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_762-282614-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 15:39:48,198 WARNING [train.py:1204] (2/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_280-120942-0 from training. Duration: 0.77 2023-10-04 15:39:53,109 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_31583_210_3852_1_1532595851_4024413_27-170931-0_sp1.1 from training. Duration: 0.963625 2023-10-04 15:39:55,910 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.730e+02 2.096e+02 2.361e+02 2.721e+02 4.850e+02, threshold=4.722e+02, percent-clipped=1.0 2023-10-04 15:39:56,018 WARNING [train.py:1204] (2/4) Exclude cut with ID _12369_210_4381_1_1530356078581_4388520_266-271628-0_sp1.1 from training. Duration: 0.92725 2023-10-04 15:39:56,172 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.out_combiner.scale_min, batch_count=1705880.0, ans=0.2 2023-10-04 15:39:57,861 WARNING [train.py:1204] (2/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_41-31157-0 from training. Duration: 0.57 2023-10-04 15:40:00,630 WARNING [train.py:1204] (2/4) Exclude cut with ID _21878_210_3525_1_1531738772002_7264330_113-18163-0_sp1.1 from training. Duration: 0.8090625 2023-10-04 15:40:00,677 WARNING [train.py:1204] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_147-111137-0 from training. Duration: 0.95 2023-10-04 15:40:02,059 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1083-220122-0_sp1.1 from training. Duration: 0.463625 2023-10-04 15:40:03,409 WARNING [train.py:1204] (2/4) Exclude cut with ID _14884_210_2252_1_1530707459689_5561250_591-173406-0_sp1.1 from training. Duration: 0.87275 2023-10-04 15:40:03,416 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_24677_210_1794_1_1532002693_4341296_149-303793-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 15:40:03,473 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_12868_210_6753_1_1530702479_3610564_133-564-0_sp1.1 from training. Duration: 0.7181875 2023-10-04 15:40:03,503 WARNING [train.py:1204] (2/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_625-338546-0_sp0.9 from training. Duration: 0.97775 2023-10-04 15:40:13,703 WARNING [train.py:1204] (2/4) Exclude cut with ID _25882_210_3036_1_1532775665815_6479510_624-320169-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 15:40:13,713 WARNING [train.py:1204] (2/4) Exclude cut with ID _20622_210_3852_1_1531622427080_1093839_127-325326-0_sp1.1 from training. Duration: 0.92725 2023-10-04 15:40:13,753 WARNING [train.py:1204] (2/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_464-33931-0_sp1.1 from training. Duration: 0.4909375 2023-10-04 15:40:14,447 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.3.encoder.layers.2.conv_module2.whiten, num_groups=1, num_channels=512, metric=5.45 vs. limit=15.0 2023-10-04 15:40:15,243 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.self_attn_weights.pos_emb_skip_rate, batch_count=1706013.3333333333, ans=0.0 2023-10-04 15:40:16,396 WARNING [train.py:1204] (2/4) Exclude cut with ID _66112_210_10635_1_1535590236305_3801484_120-304588-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 15:40:20,927 WARNING [train.py:1204] (2/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_110-130961-0 from training. Duration: 0.47 2023-10-04 15:40:21,054 WARNING [train.py:1204] (2/4) Exclude cut with ID _16908_210_5724_1_1531119157248_3772700_64-54020-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 15:40:26,417 WARNING [train.py:1204] (2/4) Exclude cut with ID _4068_210_2629_1_1526697350395_1546259_224-288693-0_sp1.1 from training. Duration: 0.536375 2023-10-04 15:40:26,470 WARNING [train.py:1204] (2/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_191-41023-0_sp0.9 from training. Duration: 0.788875 2023-10-04 15:40:26,528 WARNING [train.py:1204] (2/4) Exclude cut with ID ID0205W0255-783002 from training. Duration: 0.958 2023-10-04 15:40:28,296 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_162-262717-0 from training. Duration: 0.83 2023-10-04 15:40:31,258 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_224-322394-0_sp1.1 from training. Duration: 0.6 2023-10-04 15:40:31,264 WARNING [train.py:1204] (2/4) Exclude cut with ID _4866_210_2577_1_1527404412284_4364659_133-1696-0_sp1.1 from training. Duration: 0.9 2023-10-04 15:40:32,533 WARNING [train.py:1204] (2/4) Exclude cut with ID _6275_210_2084_1_1528110062213_7669049_973-198085-0_sp1.1 from training. Duration: 0.6818125 2023-10-04 15:40:34,152 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.balancer1.prob, batch_count=1706080.0, ans=0.125 2023-10-04 15:40:38,025 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_47449_210_5399_1_1534076445_4006929_410-237806-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 15:40:38,036 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7543_210_6753_1_1528543318_4552_1-85072-0_sp0.9 from training. Duration: 0.92225 2023-10-04 15:40:42,027 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.0.balancer1.prob, batch_count=1706080.0, ans=0.125 2023-10-04 15:40:43,208 WARNING [train.py:1204] (2/4) Exclude cut with ID _65263_210_22356_1_1535763598691_3921037_108-21902-0 from training. Duration: 0.99 2023-10-04 15:40:43,221 WARNING [train.py:1204] (2/4) Exclude cut with ID _65585_210_12210_1_1535450451584_3764098_309-174818-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 15:40:44,684 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_309-65177-0 from training. Duration: 0.88 2023-10-04 15:40:46,707 WARNING [train.py:1204] (2/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_363-185703-0_sp1.1 from training. Duration: 0.863625 2023-10-04 15:40:46,743 WARNING [train.py:1204] (2/4) Exclude cut with ID _42326_210_7770_1_1533814282531_3707269_442-238692-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 15:40:46,845 WARNING [train.py:1204] (2/4) Exclude cut with ID _69884_210_9771_1_1535846392818_4166169_226-254446-0_sp1.1 from training. Duration: 0.936375 2023-10-04 15:40:46,861 WARNING [train.py:1204] (2/4) Exclude cut with ID _29853_210_7497_1_1532505600851_6642110_287-299903-0_sp1.1 from training. Duration: 0.963625 2023-10-04 15:40:49,873 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.conv_module2.balancer2.prob, batch_count=1706146.6666666667, ans=0.125 2023-10-04 15:40:51,023 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1015-343884-0 from training. Duration: 0.62 2023-10-04 15:40:51,054 WARNING [train.py:1204] (2/4) Exclude cut with ID ID0305W0008-833402 from training. Duration: 0.91 2023-10-04 15:40:52,259 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6673_210_3273_1_1527767790_3173240_351-167954-0_sp1.1 from training. Duration: 0.436375 2023-10-04 15:40:52,274 WARNING [train.py:1204] (2/4) Exclude cut with ID _6456_210_1534_1_1527681634212_3181049_345-252877-0 from training. Duration: 0.64 2023-10-04 15:40:56,790 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_13-205182-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 15:40:57,021 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.feed_forward1.out_proj.dropout_p, batch_count=1706146.6666666667, ans=0.1 2023-10-04 15:40:59,273 INFO [train.py:1046] (2/4) Epoch 49, batch 950, loss[loss=0.1563, simple_loss=0.2335, pruned_loss=0.0395, over 23803.00 frames. ], tot_loss[loss=0.1541, simple_loss=0.2348, pruned_loss=0.03674, over 4671616.18 frames. ], batch size: 179, lr: 2.08e-03, grad_scale: 8.0 2023-10-04 15:41:00,721 WARNING [train.py:1204] (2/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_427-208901-0 from training. Duration: 0.68 2023-10-04 15:41:02,761 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.4.encoder.layers.2.nonlin_attention.whiten2, num_groups=1, num_channels=384, metric=3.72 vs. limit=15.0 2023-10-04 15:41:05,100 WARNING [train.py:1204] (2/4) Exclude cut with ID _49039_210_6315_1_1533967537541_3846118_9-148921-0_sp1.1 from training. Duration: 0.97275 2023-10-04 15:41:06,572 WARNING [train.py:1204] (2/4) Exclude cut with ID _59493_210_17282_1_1535078419077_3225879_143-70317-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 15:41:07,844 WARNING [train.py:1204] (2/4) Exclude cut with ID _27967_210_12140_1_1532588391859_8419130_1056-236369-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 15:41:07,903 WARNING [train.py:1204] (2/4) Exclude cut with ID _6537_210_1883_1_1527987559710_3597129_382-133062-0_sp0.9 from training. Duration: 0.7555625 2023-10-04 15:41:11,198 WARNING [train.py:1204] (2/4) Exclude cut with ID ID0376W0255-869957 from training. Duration: 0.9510625 2023-10-04 15:41:14,473 WARNING [train.py:1204] (2/4) Exclude cut with ID _49371_210_12071_1_1533952761021_3812190_483-240691-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 15:41:14,562 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_12868_210_6753_1_1530701932_530275_18-76322-0_sp1.1 from training. Duration: 0.7909375 2023-10-04 15:41:15,893 WARNING [train.py:1204] (2/4) Exclude cut with ID _53950_210_5816_1_1534417264919_3862951_367-26344-0_sp1.1 from training. Duration: 0.97275 2023-10-04 15:41:15,928 WARNING [train.py:1204] (2/4) Exclude cut with ID _5444_210_2151_1_1527418808464_5312210_634-194174-0_sp1.1 from training. Duration: 0.8818125 2023-10-04 15:41:15,954 WARNING [train.py:1204] (2/4) Exclude cut with ID _56494_210_4220_1_1535284837625_3033169_69-138150-0 from training. Duration: 0.94 2023-10-04 15:41:18,568 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7543_210_6753_1_1528544010_1110402_25-183860-0_sp0.9 from training. Duration: 0.52225 2023-10-04 15:41:19,871 WARNING [train.py:1204] (2/4) Exclude cut with ID _40284_210_2084_1_1533513679814_3704279_287-191918-0_sp1.1 from training. Duration: 0.963625 2023-10-04 15:41:21,502 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.bypass.scale_min, batch_count=1706280.0, ans=0.2 2023-10-04 15:41:22,572 WARNING [train.py:1204] (2/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_291-270881-0 from training. Duration: 0.97 2023-10-04 15:41:22,614 WARNING [train.py:1204] (2/4) Exclude cut with ID _12555_210_8750_1_1530075594402_3780239_449-267986-0_sp0.9 from training. Duration: 0.92225 2023-10-04 15:41:25,431 WARNING [train.py:1204] (2/4) Exclude cut with ID _3992_210_1747_1_1526644816031_3731060_268-64520-0_sp1.1 from training. Duration: 0.963625 2023-10-04 15:41:25,438 WARNING [train.py:1204] (2/4) Exclude cut with ID _39411_210_5986_1_1533538825030_3343649_173-238248-0_sp0.9 from training. Duration: 0.92225 2023-10-04 15:41:25,459 WARNING [train.py:1204] (2/4) Exclude cut with ID _32704_210_8954_1_1533017488840_6747370_828-313097-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 15:41:28,267 WARNING [train.py:1204] (2/4) Exclude cut with ID _26907_210_7234_1_1532221740694_3205263_337-2494-0 from training. Duration: 0.88 2023-10-04 15:41:31,587 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_229-45300-0_sp1.1 from training. Duration: 0.5181875 2023-10-04 15:41:33,009 WARNING [train.py:1204] (2/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_300-27235-0_sp1.1 from training. Duration: 0.7909375 2023-10-04 15:41:34,528 WARNING [train.py:1204] (2/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_48-338976-0_sp1.1 from training. Duration: 0.8090625 2023-10-04 15:41:39,976 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_13091_210_7925_1_1530609447_8987354_792-195353-0_sp1.1 from training. Duration: 0.936375 2023-10-04 15:41:39,982 WARNING [train.py:1204] (2/4) Exclude cut with ID _56524_210_9642_1_1534572011779_3619170_311-54500-0_sp1.1 from training. Duration: 0.97275 2023-10-04 15:41:43,303 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7543_210_6753_1_1528544010_1110402_25-183860-0 from training. Duration: 0.47 2023-10-04 15:41:45,128 WARNING [train.py:1204] (2/4) Exclude cut with ID 2411-132532-0017-82279-0_sp1.1 from training. Duration: 0.9681875 2023-10-04 15:41:45,129 WARNING [train.py:1204] (2/4) Exclude cut with ID _14170_210_5219_1_1530525949868_4469650_272-249985-0_sp0.9 from training. Duration: 0.7444375 2023-10-04 15:41:45,183 WARNING [train.py:1204] (2/4) Exclude cut with ID _55644_210_11856_1_1534593599582_4103719_320-229006-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 15:41:47,066 WARNING [train.py:1204] (2/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_177-100554-0_sp1.1 from training. Duration: 0.963625 2023-10-04 15:41:47,068 WARNING [train.py:1204] (2/4) Exclude cut with ID _14998_210_5219_1_1530705582080_7474970_787-294416-0_sp1.1 from training. Duration: 0.7454375 2023-10-04 15:41:51,241 WARNING [train.py:1204] (2/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_318-59097-0 from training. Duration: 0.76 2023-10-04 15:41:51,309 WARNING [train.py:1204] (2/4) Exclude cut with ID _12369_210_4381_1_1530356078581_4388520_480-13146-0_sp1.1 from training. Duration: 0.77275 2023-10-04 15:41:55,253 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_469-338558-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 15:41:55,310 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_40896_210_15710_1_1533433956_7100802_367-126897-0_sp1.1 from training. Duration: 0.963625 2023-10-04 15:41:55,333 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6545_210_2040_1_1527774670_4245258_379-14182-0 from training. Duration: 0.8 2023-10-04 15:41:55,349 WARNING [train.py:1204] (2/4) Exclude cut with ID _21631_210_2253_1_1531907880778_3160339_197-79498-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 15:41:56,674 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_12868_210_6753_1_1530702479_3610564_416-293746-0_sp1.1 from training. Duration: 0.8181875 2023-10-04 15:41:56,715 WARNING [train.py:1204] (2/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_52-211683-0 from training. Duration: 0.9 2023-10-04 15:42:00,881 WARNING [train.py:1204] (2/4) Exclude cut with ID _48678_210_5663_1_1533988830947_3769199_306-255886-0_sp0.9 from training. Duration: 0.8555625 2023-10-04 15:42:04,266 WARNING [train.py:1204] (2/4) Exclude cut with ID _65134_210_14763_1_1535335393985_3505368_296-24733-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 15:42:08,273 WARNING [train.py:1204] (2/4) Exclude cut with ID _14898_210_9206_1_1530766812377_4177190_335-99364-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 15:42:08,366 WARNING [train.py:1204] (2/4) Exclude cut with ID _54871_210_22356_1_1534665798253_3856359_19-342632-0 from training. Duration: 0.91 2023-10-04 15:42:08,382 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_53071_210_16554_1_1534490405_2954066_265-170472-0 from training. Duration: 0.83 2023-10-04 15:42:08,571 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.bypass.skip_rate, batch_count=1706480.0, ans=0.09899494936611666 2023-10-04 15:42:12,428 INFO [train.py:1046] (2/4) Epoch 49, batch 1000, loss[loss=0.1437, simple_loss=0.222, pruned_loss=0.03268, over 23623.00 frames. ], tot_loss[loss=0.1536, simple_loss=0.2343, pruned_loss=0.03643, over 4698410.37 frames. ], batch size: 135, lr: 2.08e-03, grad_scale: 8.0 2023-10-04 15:42:12,568 WARNING [train.py:1204] (2/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_189-298172-0_sp1.1 from training. Duration: 0.963625 2023-10-04 15:42:15,402 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.bypass.scale_min, batch_count=1706546.6666666667, ans=0.2 2023-10-04 15:42:16,367 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7615_210_4211_1_1528110965_4580160_72-235211-0 from training. Duration: 0.76 2023-10-04 15:42:17,668 WARNING [train.py:1204] (2/4) Exclude cut with ID _47790_210_16187_1_1533862814066_4207519_155-90874-0_sp1.1 from training. Duration: 0.936375 2023-10-04 15:42:17,931 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.conv_skip_rate, batch_count=1706546.6666666667, ans=0.0 2023-10-04 15:42:19,790 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.3.encoder.layers.0.whiten, num_groups=1, num_channels=512, metric=4.75 vs. limit=12.0 2023-10-04 15:42:21,888 WARNING [train.py:1204] (2/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_164-216310-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 15:42:23,144 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.605e+02 2.037e+02 2.263e+02 2.669e+02 4.122e+02, threshold=4.525e+02, percent-clipped=0.0 2023-10-04 15:42:23,233 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_46013_210_7234_1_1533776362_3744032_463-52378-0 from training. Duration: 0.86 2023-10-04 15:42:23,237 WARNING [train.py:1204] (2/4) Exclude cut with ID _61133_210_9969_1_1535196777227_3696409_333-270325-0 from training. Duration: 0.93 2023-10-04 15:42:23,561 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.conv_module1.balancer1.prob, batch_count=1706546.6666666667, ans=0.125 2023-10-04 15:42:26,139 WARNING [train.py:1204] (2/4) Exclude cut with ID _5172_210_1883_1_1527246062567_3577219_18-101380-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 15:42:27,405 WARNING [train.py:1204] (2/4) Exclude cut with ID _30800_210_22382_1_1532499347904_4515959_577-236858-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 15:42:28,845 WARNING [train.py:1204] (2/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_1006-177345-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 15:42:30,431 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.ff2_skip_rate, batch_count=1706613.3333333333, ans=0.0 2023-10-04 15:42:31,505 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6291_210_3273_1_1527683649_1078498_4-266348-0 from training. Duration: 0.78 2023-10-04 15:42:34,885 WARNING [train.py:1204] (2/4) Exclude cut with ID _55857_210_2346_1_1534644007831_6898209_745-98391-0 from training. Duration: 0.76 2023-10-04 15:42:36,252 WARNING [train.py:1204] (2/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_375-244976-0 from training. Duration: 0.75 2023-10-04 15:42:37,549 WARNING [train.py:1204] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_4-248404-0_sp1.1 from training. Duration: 0.97275 2023-10-04 15:42:41,495 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8453_210_4211_1_1528502940_8562730_596-306820-0 from training. Duration: 0.97 2023-10-04 15:42:42,954 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_211-339622-0_sp0.9 from training. Duration: 0.5 2023-10-04 15:42:42,992 WARNING [train.py:1204] (2/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_393-57888-0 from training. Duration: 0.64 2023-10-04 15:42:44,897 WARNING [train.py:1204] (2/4) Exclude cut with ID _23825_210_13449_1_1531915313526_3497150_143-343575-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 15:42:46,279 WARNING [train.py:1204] (2/4) Exclude cut with ID _58605_210_12210_1_1534723195939_3904259_36-12256-0_sp1.1 from training. Duration: 0.936375 2023-10-04 15:42:53,957 WARNING [train.py:1204] (2/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_38-67795-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 15:42:55,396 WARNING [train.py:1204] (2/4) Exclude cut with ID _64631_210_27767_1_1535349580068_4165160_308-327832-0_sp1.1 from training. Duration: 0.92725 2023-10-04 15:42:56,705 WARNING [train.py:1204] (2/4) Exclude cut with ID _37086_210_9038_1_1532999540891_7021250_726-81176-0_sp1.1 from training. Duration: 0.936375 2023-10-04 15:42:56,768 WARNING [train.py:1204] (2/4) Exclude cut with ID _47790_210_16187_1_1533862814066_4207519_47-141521-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 15:42:56,772 WARNING [train.py:1204] (2/4) Exclude cut with ID _3067_210_2667_1_1526212974640_4503168_250-285937-0 from training. Duration: 0.8 2023-10-04 15:42:56,804 WARNING [train.py:1204] (2/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_239-24000-0_sp1.1 from training. Duration: 0.97275 2023-10-04 15:42:58,118 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_3371_210_1794_1_1526384462_5580777_396-339812-0_sp0.9 from training. Duration: 0.9444375 2023-10-04 15:42:58,169 WARNING [train.py:1204] (2/4) Exclude cut with ID _16909_210_5724_1_1531292079912_4118919_580-173454-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 15:42:59,529 WARNING [train.py:1204] (2/4) Exclude cut with ID IC0124W0002-61308 from training. Duration: 0.982 2023-10-04 15:43:03,586 WARNING [train.py:1204] (2/4) Exclude cut with ID _31602_210_5854_1_1532677021456_4458250_249-96604-0 from training. Duration: 0.89 2023-10-04 15:43:04,953 WARNING [train.py:1204] (2/4) Exclude cut with ID _69584_210_5219_1_1535785133775_4044450_325-7656-0 from training. Duration: 0.86 2023-10-04 15:43:06,329 WARNING [train.py:1204] (2/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_478-162247-0 from training. Duration: 0.82 2023-10-04 15:43:08,275 WARNING [train.py:1204] (2/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_686-54383-0_sp1.1 from training. Duration: 0.7909375 2023-10-04 15:43:08,497 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.balancer2.prob, batch_count=1706746.6666666667, ans=0.125 2023-10-04 15:43:12,919 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.5.encoder.layers.0.conv_module1.whiten, num_groups=1, num_channels=256, metric=7.53 vs. limit=15.0 2023-10-04 15:43:15,545 WARNING [train.py:1204] (2/4) Exclude cut with ID _29303_210_2575_1_1532392237303_7410649_85-270092-0_sp1.1 from training. Duration: 0.936375 2023-10-04 15:43:15,562 WARNING [train.py:1204] (2/4) Exclude cut with ID _2322_210_2667_1_1525604548971_3656299_440-80494-0_sp1.1 from training. Duration: 0.8 2023-10-04 15:43:16,941 WARNING [train.py:1204] (2/4) Exclude cut with ID _47346_210_9476_1_1533815971553_3536160_201-40644-0_sp1.1 from training. Duration: 0.936375 2023-10-04 15:43:17,047 WARNING [train.py:1204] (2/4) Exclude cut with ID _56235_210_10640_1_1534557335235_4161099_343-139898-0_sp1.1 from training. Duration: 0.87275 2023-10-04 15:43:19,005 WARNING [train.py:1204] (2/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_1012-279473-0 from training. Duration: 0.67 2023-10-04 15:43:20,388 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7931_210_1794_1_1528594728_5564059_280-279560-0_sp1.1 from training. Duration: 0.8909375 2023-10-04 15:43:20,435 WARNING [train.py:1204] (2/4) Exclude cut with ID _8263_210_1664_1_1528439452891_970220_68-181805-0 from training. Duration: 0.64 2023-10-04 15:43:21,755 WARNING [train.py:1204] (2/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_605-133395-0 from training. Duration: 0.66 2023-10-04 15:43:21,864 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_43-171109-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 15:43:21,867 WARNING [train.py:1204] (2/4) Exclude cut with ID _40094_210_16380_1_1533294005773_3641159_107-280739-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 15:43:24,567 WARNING [train.py:1204] (2/4) Exclude cut with ID _14821_210_8750_1_1530680503991_6829359_2-227151-0_sp0.9 from training. Duration: 0.92225 2023-10-04 15:43:27,326 WARNING [train.py:1204] (2/4) Exclude cut with ID _14268_210_9206_1_1530622771285_4714099_331-105351-0_sp1.1 from training. Duration: 0.5909375 2023-10-04 15:43:28,538 INFO [train.py:1046] (2/4) Epoch 49, batch 1050, loss[loss=0.1694, simple_loss=0.2576, pruned_loss=0.04055, over 24023.00 frames. ], tot_loss[loss=0.1531, simple_loss=0.2337, pruned_loss=0.03618, over 4705982.16 frames. ], batch size: 80, lr: 2.08e-03, grad_scale: 8.0 2023-10-04 15:43:28,616 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_26326_210_7925_1_1533002493_3592406_451-82741-0_sp1.1 from training. Duration: 0.97275 2023-10-04 15:43:28,903 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.conv_module2.balancer1.prob, batch_count=1706880.0, ans=0.125 2023-10-04 15:43:31,443 WARNING [train.py:1204] (2/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_191-59784-0_sp0.9 from training. Duration: 0.97775 2023-10-04 15:43:32,903 WARNING [train.py:1204] (2/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_445-117398-0_sp1.1 from training. Duration: 0.8090625 2023-10-04 15:43:35,711 WARNING [train.py:1204] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_605-257205-0_sp0.9 from training. Duration: 0.6555625 2023-10-04 15:43:35,790 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_50050_210_16223_1_1535435796_7290138_592-258841-0_sp1.1 from training. Duration: 0.936375 2023-10-04 15:43:38,446 WARNING [train.py:1204] (2/4) Exclude cut with ID _14348_210_8341_1_1530599434996_3778640_240-232370-0_sp0.9 from training. Duration: 0.9333125 2023-10-04 15:43:41,743 WARNING [train.py:1204] (2/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_239-316802-0_sp0.9 from training. Duration: 0.7666875 2023-10-04 15:43:43,729 WARNING [train.py:1204] (2/4) Exclude cut with ID _45560_210_21982_1_1533812603951_3584999_111-6376-0_sp1.1 from training. Duration: 0.763625 2023-10-04 15:43:45,157 WARNING [train.py:1204] (2/4) Exclude cut with ID _54189_210_16380_1_1534332670039_3911710_606-311481-0_sp1.1 from training. Duration: 0.963625 2023-10-04 15:43:45,216 WARNING [train.py:1204] (2/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_237-332027-0_sp1.1 from training. Duration: 0.863625 2023-10-04 15:43:46,568 WARNING [train.py:1204] (2/4) Exclude cut with ID _66070_210_7640_1_1535441393096_7805310_489-292631-0_sp0.9 from training. Duration: 0.87775 2023-10-04 15:43:46,641 WARNING [train.py:1204] (2/4) Exclude cut with ID _49728_210_12651_1_1534041034294_6788150_624-195301-0_sp1.1 from training. Duration: 0.77275 2023-10-04 15:43:46,684 WARNING [train.py:1204] (2/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_44-79427-0 from training. Duration: 0.98 2023-10-04 15:43:48,059 WARNING [train.py:1204] (2/4) Exclude cut with ID _35350_210_16040_1_1533040191183_4075299_490-198354-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 15:43:48,093 WARNING [train.py:1204] (2/4) Exclude cut with ID _5166_210_2084_1_1527073197160_4087993_309-222545-0 from training. Duration: 0.84 2023-10-04 15:43:51,334 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_13091_210_7925_1_1530609447_8987354_790-199469-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 15:43:51,340 WARNING [train.py:1204] (2/4) Exclude cut with ID _21632_210_2253_1_1531994001765_3744140_110-274654-0 from training. Duration: 0.82 2023-10-04 15:43:51,353 WARNING [train.py:1204] (2/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_498-40153-0_sp0.9 from training. Duration: 0.7 2023-10-04 15:43:58,108 WARNING [train.py:1204] (2/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_363-282909-0_sp1.1 from training. Duration: 0.936375 2023-10-04 15:43:59,571 WARNING [train.py:1204] (2/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_178-177582-0_sp0.9 from training. Duration: 0.82225 2023-10-04 15:43:59,587 WARNING [train.py:1204] (2/4) Exclude cut with ID _24612_210_8913_1_1531998054193_3806359_357-35487-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 15:44:02,309 WARNING [train.py:1204] (2/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_138-332773-0 from training. Duration: 0.81 2023-10-04 15:44:02,337 WARNING [train.py:1204] (2/4) Exclude cut with ID _7624_210_1531_1_1528109802198_4933101_418-349834-0 from training. Duration: 0.6 2023-10-04 15:44:02,358 WARNING [train.py:1204] (2/4) Exclude cut with ID _7537_210_2084_1_1528502395431_3924359_14-312906-0_sp0.9 from training. Duration: 0.9333125 2023-10-04 15:44:06,519 WARNING [train.py:1204] (2/4) Exclude cut with ID _60057_210_10640_1_1534903065793_3333150_362-230377-0 from training. Duration: 0.87 2023-10-04 15:44:06,834 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.ff3_skip_rate, batch_count=1707013.3333333333, ans=0.0 2023-10-04 15:44:10,567 WARNING [train.py:1204] (2/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_138-64210-0 from training. Duration: 0.73 2023-10-04 15:44:10,629 WARNING [train.py:1204] (2/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_454-339504-0_sp1.1 from training. Duration: 0.97275 2023-10-04 15:44:14,062 WARNING [train.py:1204] (2/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_508-335369-0_sp0.9 from training. Duration: 0.5333125 2023-10-04 15:44:16,180 WARNING [train.py:1204] (2/4) Exclude cut with ID _5608_210_2106_1_1527415439089_6819768_909-82767-0_sp0.9 from training. Duration: 0.67775 2023-10-04 15:44:16,206 WARNING [train.py:1204] (2/4) Exclude cut with ID _6694_210_4599_1_1527852637490_3965529_99-11611-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 15:44:16,274 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_2655_210_2087_1_1525688959_3897400_52-279899-0_sp1.1 from training. Duration: 0.62725 2023-10-04 15:44:21,427 WARNING [train.py:1204] (2/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_375-245677-0_sp1.1 from training. Duration: 0.736375 2023-10-04 15:44:25,595 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_23850_210_4238_1_1532232597_1632433_32-185135-0 from training. Duration: 0.86 2023-10-04 15:44:25,712 WARNING [train.py:1204] (2/4) Exclude cut with ID _15113_210_8254_1_1530842438579_3716279_209-54533-0 from training. Duration: 0.96 2023-10-04 15:44:25,929 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.conv_module1.balancer1.prob, batch_count=1707080.0, ans=0.125 2023-10-04 15:44:25,961 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.feed_forward1.hidden_balancer.prob, batch_count=1707080.0, ans=0.125 2023-10-04 15:44:26,983 WARNING [train.py:1204] (2/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_401-53423-0 from training. Duration: 0.55 2023-10-04 15:44:27,019 WARNING [train.py:1204] (2/4) Exclude cut with ID _14867_210_1968_1_1530684179890_3846969_319-250868-0_sp1.1 from training. Duration: 0.9 2023-10-04 15:44:27,033 WARNING [train.py:1204] (2/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_106-30782-0_sp0.9 from training. Duration: 0.888875 2023-10-04 15:44:29,832 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_5960_210_1794_1_1527403865_3990999_515-2613-0 from training. Duration: 0.88 2023-10-04 15:44:31,446 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.ff2_skip_rate, batch_count=1707146.6666666667, ans=0.0 2023-10-04 15:44:32,587 WARNING [train.py:1204] (2/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_798-106179-0_sp0.9 from training. Duration: 0.92225 2023-10-04 15:44:34,001 WARNING [train.py:1204] (2/4) Exclude cut with ID _14227_210_4381_1_1530701804286_3940259_125-275642-0_sp1.1 from training. Duration: 0.9 2023-10-04 15:44:34,011 WARNING [train.py:1204] (2/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_79-200392-0_sp1.1 from training. Duration: 0.8818125 2023-10-04 15:44:35,361 WARNING [train.py:1204] (2/4) Exclude cut with ID _48836_210_12210_1_1534075848293_3904150_429-35475-0_sp0.9 from training. Duration: 0.988875 2023-10-04 15:44:35,377 WARNING [train.py:1204] (2/4) Exclude cut with ID _69884_210_9771_1_1535846392818_4166169_224-172517-0_sp1.1 from training. Duration: 0.97275 2023-10-04 15:44:37,072 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.nonlin_attention.balancer.max_positive, batch_count=1707146.6666666667, ans=0.95 2023-10-04 15:44:39,591 WARNING [train.py:1204] (2/4) Exclude cut with ID _15113_210_8254_1_1530842438579_3716279_76-232507-0_sp1.1 from training. Duration: 0.97275 2023-10-04 15:44:39,595 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_135-5501-0 from training. Duration: 0.98 2023-10-04 15:44:41,301 WARNING [train.py:1204] (2/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_563-347832-0_sp0.9 from training. Duration: 0.988875 2023-10-04 15:44:41,312 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6786_210_1388_1_1527839699_4194495_273-149841-0 from training. Duration: 0.92 2023-10-04 15:44:41,336 WARNING [train.py:1204] (2/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_172-215932-0 from training. Duration: 0.66 2023-10-04 15:44:42,563 INFO [train.py:1046] (2/4) Epoch 49, batch 1100, loss[loss=0.1566, simple_loss=0.2423, pruned_loss=0.03544, over 23482.00 frames. ], tot_loss[loss=0.1528, simple_loss=0.2335, pruned_loss=0.03607, over 4707404.10 frames. ], batch size: 93, lr: 2.08e-03, grad_scale: 8.0 2023-10-04 15:44:42,658 WARNING [train.py:1204] (2/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_939-132910-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 15:44:44,216 WARNING [train.py:1204] (2/4) Exclude cut with ID _49371_210_12071_1_1533952761021_3812190_16-246001-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 15:44:47,685 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.balancer2.prob, batch_count=1707213.3333333333, ans=0.125 2023-10-04 15:44:49,569 WARNING [train.py:1204] (2/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_680-189165-0_sp1.1 from training. Duration: 0.936375 2023-10-04 15:44:49,727 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.conv_module2.balancer1.prob, batch_count=1707213.3333333333, ans=0.125 2023-10-04 15:44:53,337 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.768e+02 2.066e+02 2.358e+02 2.772e+02 6.166e+02, threshold=4.716e+02, percent-clipped=1.0 2023-10-04 15:44:53,508 WARNING [train.py:1204] (2/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_53-300544-0_sp1.1 from training. Duration: 0.6090625 2023-10-04 15:44:54,812 WARNING [train.py:1204] (2/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_540-275685-0_sp1.1 from training. Duration: 0.8090625 2023-10-04 15:44:54,835 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_38965_210_4852_1_1533174906_3484735_453-106579-0_sp1.1 from training. Duration: 0.92725 2023-10-04 15:44:56,168 WARNING [train.py:1204] (2/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_105-328174-0 from training. Duration: 0.76 2023-10-04 15:44:57,627 WARNING [train.py:1204] (2/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_279-126205-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 15:44:59,041 WARNING [train.py:1204] (2/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_5-55893-0_sp0.9 from training. Duration: 0.611125 2023-10-04 15:45:03,196 WARNING [train.py:1204] (2/4) Exclude cut with ID _13678_210_7497_1_1530530069226_3463170_141-10835-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 15:45:05,935 WARNING [train.py:1204] (2/4) Exclude cut with ID _22083_210_8254_1_1531702862439_3678970_201-85738-0_sp1.1 from training. Duration: 0.8181875 2023-10-04 15:45:07,199 WARNING [train.py:1204] (2/4) Exclude cut with ID _4697_210_2069_1_1526815801953_3823098_51-1049-0 from training. Duration: 0.75 2023-10-04 15:45:07,281 WARNING [train.py:1204] (2/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_206-10097-0_sp0.9 from training. Duration: 0.5444375 2023-10-04 15:45:09,905 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_4528_210_3273_1_1527076654_3276777_360-63661-0_sp1.1 from training. Duration: 0.92725 2023-10-04 15:45:09,913 WARNING [train.py:1204] (2/4) Exclude cut with ID _64086_210_7994_1_1535270417019_7435949_449-31567-0_sp1.1 from training. Duration: 0.8818125 2023-10-04 15:45:11,438 WARNING [train.py:1204] (2/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_245-174553-0_sp0.9 from training. Duration: 0.97775 2023-10-04 15:45:13,408 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6545_210_2040_1_1527774670_4245258_379-14182-0_sp1.1 from training. Duration: 0.72725 2023-10-04 15:45:19,548 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_3133_210_2087_1_1526106446_2324690_256-206070-0_sp1.1 from training. Duration: 0.963625 2023-10-04 15:45:23,068 WARNING [train.py:1204] (2/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_278-104676-0 from training. Duration: 0.98 2023-10-04 15:45:24,507 WARNING [train.py:1204] (2/4) Exclude cut with ID IC0671W0065-331908 from training. Duration: 0.896 2023-10-04 15:45:24,544 WARNING [train.py:1204] (2/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_244-129917-0_sp1.1 from training. Duration: 0.97275 2023-10-04 15:45:26,065 WARNING [train.py:1204] (2/4) Exclude cut with ID _7852_210_5219_1_1528286471316_8139400_1129-170857-0_sp1.1 from training. Duration: 0.97275 2023-10-04 15:45:27,439 WARNING [train.py:1204] (2/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_443-30034-0_sp0.9 from training. Duration: 0.711125 2023-10-04 15:45:28,757 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_31583_210_3852_1_1532595851_4024413_250-332999-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 15:45:28,992 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.conv_skip_rate, batch_count=1707413.3333333333, ans=0.0 2023-10-04 15:45:30,117 WARNING [train.py:1204] (2/4) Exclude cut with ID _8772_210_1850_1_1528635595972_3664011_247-336448-0 from training. Duration: 0.74 2023-10-04 15:45:30,181 WARNING [train.py:1204] (2/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_567-135114-0_sp0.9 from training. Duration: 0.9666875 2023-10-04 15:45:30,185 WARNING [train.py:1204] (2/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_557-349534-0_sp1.1 from training. Duration: 0.82725 2023-10-04 15:45:30,197 WARNING [train.py:1204] (2/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_1341-26728-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 15:45:31,546 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_40222_210_6228_1_1533385298_4481939_375-328051-0_sp1.1 from training. Duration: 0.97275 2023-10-04 15:45:31,576 WARNING [train.py:1204] (2/4) Exclude cut with ID _4697_210_2069_1_1526815801953_3823098_276-96593-0 from training. Duration: 0.77 2023-10-04 15:45:35,757 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_53071_210_16554_1_1534490405_2954066_265-170472-0_sp0.9 from training. Duration: 0.92225 2023-10-04 15:45:35,780 WARNING [train.py:1204] (2/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_365-1492-0 from training. Duration: 0.91 2023-10-04 15:45:36,028 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.feed_forward1.out_proj.dropout_p, batch_count=1707413.3333333333, ans=0.1 2023-10-04 15:45:39,730 WARNING [train.py:1204] (2/4) Exclude cut with ID _66873_210_5517_1_1535454136506_4293370_654-52713-0_sp0.9 from training. Duration: 0.8555625 2023-10-04 15:45:44,060 WARNING [train.py:1204] (2/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_113-92608-0_sp1.1 from training. Duration: 0.4909375 2023-10-04 15:45:47,328 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_46013_210_7234_1_1533776362_3744032_50-141535-0 from training. Duration: 0.99 2023-10-04 15:45:47,336 WARNING [train.py:1204] (2/4) Exclude cut with ID _13942_210_9988_1_1530604906036_3733360_499-55676-0_sp1.1 from training. Duration: 0.563625 2023-10-04 15:45:48,764 WARNING [train.py:1204] (2/4) Exclude cut with ID _30800_210_22382_1_1532499347904_4515959_375-65143-0_sp1.1 from training. Duration: 0.97275 2023-10-04 15:45:49,507 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.feed_forward2.out_whiten.whitening_limit, batch_count=1707480.0, ans=15.0 2023-10-04 15:45:50,609 WARNING [train.py:1204] (2/4) Exclude cut with ID _16756_210_4915_1_1531047803763_7104259_199-325608-0_sp1.1 from training. Duration: 0.92725 2023-10-04 15:45:50,646 WARNING [train.py:1204] (2/4) Exclude cut with ID _31649_210_6228_1_1532584608063_4091078_443-124391-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 15:45:51,999 WARNING [train.py:1204] (2/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_146-218763-0 from training. Duration: 0.86 2023-10-04 15:45:54,361 WARNING [train.py:1204] (2/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_567-135114-0_sp1.1 from training. Duration: 0.7909375 2023-10-04 15:45:54,403 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_26326_210_7925_1_1533002493_3592406_462-288299-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 15:45:54,496 WARNING [train.py:1204] (2/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_322-217220-0 from training. Duration: 0.86 2023-10-04 15:45:55,584 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_238-256668-0_sp1.1 from training. Duration: 0.62725 2023-10-04 15:45:55,629 WARNING [train.py:1204] (2/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_371-18169-0 from training. Duration: 0.86 2023-10-04 15:45:56,829 INFO [train.py:1046] (2/4) Epoch 49, batch 1150, loss[loss=0.1714, simple_loss=0.237, pruned_loss=0.0529, over 19310.00 frames. ], tot_loss[loss=0.1528, simple_loss=0.2334, pruned_loss=0.03612, over 4707998.60 frames. ], batch size: 388, lr: 2.08e-03, grad_scale: 8.0 2023-10-04 15:45:56,935 WARNING [train.py:1204] (2/4) Exclude cut with ID _58480_210_12392_1_1534811121111_3646160_23-74512-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 15:45:56,947 WARNING [train.py:1204] (2/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_784-213099-0_sp1.1 from training. Duration: 0.8454375 2023-10-04 15:45:58,391 WARNING [train.py:1204] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_625-102955-0_sp1.1 from training. Duration: 0.7 2023-10-04 15:46:00,365 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.4.encoder.layers.1.nonlin_attention.whiten1, num_groups=1, num_channels=288, metric=4.31 vs. limit=10.0 2023-10-04 15:46:03,846 WARNING [train.py:1204] (2/4) Exclude cut with ID _49748_210_24116_1_1533986971737_3158910_176-174327-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 15:46:05,272 WARNING [train.py:1204] (2/4) Exclude cut with ID _50310_210_15488_1_1534068043345_3641080_432-96140-0_sp1.1 from training. Duration: 0.936375 2023-10-04 15:46:06,723 WARNING [train.py:1204] (2/4) Exclude cut with ID _40402_210_18219_1_1533522324115_2953150_76-14910-0_sp1.1 from training. Duration: 0.92725 2023-10-04 15:46:08,000 WARNING [train.py:1204] (2/4) Exclude cut with ID _21783_210_11357_1_1531742458730_3762679_40-19458-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 15:46:08,034 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_14056_210_4163_1_1530604483_7572827_46-93547-0 from training. Duration: 0.95 2023-10-04 15:46:08,074 WARNING [train.py:1204] (2/4) Exclude cut with ID _21631_210_2253_1_1531907880778_3160339_321-35325-0_sp1.1 from training. Duration: 0.963625 2023-10-04 15:46:10,896 WARNING [train.py:1204] (2/4) Exclude cut with ID _4779_210_2084_1_1527318022944_3914893_500-285013-0 from training. Duration: 0.69 2023-10-04 15:46:12,258 WARNING [train.py:1204] (2/4) Exclude cut with ID _43552_210_8913_1_1533726031059_3523429_301-93324-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 15:46:12,265 WARNING [train.py:1204] (2/4) Exclude cut with ID _6473_210_1741_1_1527901307396_3815380_108-17335-0_sp1.1 from training. Duration: 0.5909375 2023-10-04 15:46:18,504 WARNING [train.py:1204] (2/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_206-10097-0 from training. Duration: 0.49 2023-10-04 15:46:19,124 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.5.encoder.layers.0.feed_forward2.out_whiten, num_groups=1, num_channels=256, metric=15.15 vs. limit=15.0 2023-10-04 15:46:20,090 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_36756_210_7234_1_1533026991_966276_103-77513-0_sp1.1 from training. Duration: 0.97275 2023-10-04 15:46:23,215 WARNING [train.py:1204] (2/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_662-18193-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 15:46:24,600 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_26433_210_12108_1_1532157158_4056480_439-52438-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 15:46:25,230 WARNING [train.py:1204] (2/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_464-33931-0 from training. Duration: 0.54 2023-10-04 15:46:26,444 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_3474_210_2028_1_1526188513_4825246_145-309131-0_sp0.9 from training. Duration: 0.82225 2023-10-04 15:46:26,462 WARNING [train.py:1204] (2/4) Exclude cut with ID _9679_210_7497_1_1529129212906_4363929_446-308321-0_sp1.1 from training. Duration: 0.963625 2023-10-04 15:46:29,282 WARNING [train.py:1204] (2/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_662-275621-0 from training. Duration: 0.42 2023-10-04 15:46:30,621 WARNING [train.py:1204] (2/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_344-337148-0_sp1.1 from training. Duration: 0.97275 2023-10-04 15:46:32,062 WARNING [train.py:1204] (2/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_759-179071-0_sp1.1 from training. Duration: 0.92725 2023-10-04 15:46:40,399 WARNING [train.py:1204] (2/4) Exclude cut with ID _28783_210_7943_1_1532671164808_7155479_293-12188-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 15:46:48,010 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_17613_210_2151_1_1531137569_2366160_235-320304-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 15:46:48,034 WARNING [train.py:1204] (2/4) Exclude cut with ID _14349_210_8341_1_1530615710818_3442470_170-336541-0 from training. Duration: 0.86 2023-10-04 15:46:49,415 WARNING [train.py:1204] (2/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_375-218677-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 15:46:49,467 WARNING [train.py:1204] (2/4) Exclude cut with ID _28317_210_12742_1_1532480302381_8510325_829-202956-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 15:46:54,990 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9029W0436-509823 from training. Duration: 0.768 2023-10-04 15:46:57,073 WARNING [train.py:1204] (2/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_231-112214-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 15:47:02,597 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9059W0470-524883 from training. Duration: 0.3415625 2023-10-04 15:47:05,342 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_43266_210_1794_1_1533521789_4497218_206-79512-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 15:47:06,713 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_294-163793-0_sp0.9 from training. Duration: 0.911125 2023-10-04 15:47:06,741 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6290_210_3273_1_1527595224_1282787_115-87095-0_sp1.1 from training. Duration: 0.5 2023-10-04 15:47:08,225 WARNING [train.py:1204] (2/4) Exclude cut with ID _14100_210_8341_1_1530529320613_3678069_279-55760-0_sp1.1 from training. Duration: 0.8454375 2023-10-04 15:47:09,608 INFO [train.py:1046] (2/4) Epoch 49, batch 1200, loss[loss=0.1667, simple_loss=0.2589, pruned_loss=0.03727, over 24661.00 frames. ], tot_loss[loss=0.1532, simple_loss=0.234, pruned_loss=0.03616, over 4713364.56 frames. ], batch size: 73, lr: 2.08e-03, grad_scale: 16.0 2023-10-04 15:47:11,039 WARNING [train.py:1204] (2/4) Exclude cut with ID _14621_210_1925_1_1530686901185_3675420_419-234200-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 15:47:17,107 WARNING [train.py:1204] (2/4) Exclude cut with ID _60229_210_27767_1_1534838406158_3655350_353-213656-0_sp1.1 from training. Duration: 0.863625 2023-10-04 15:47:17,123 WARNING [train.py:1204] (2/4) Exclude cut with ID _48678_210_5663_1_1533988830947_3769199_306-255886-0_sp1.1 from training. Duration: 0.7 2023-10-04 15:47:19,121 WARNING [train.py:1204] (2/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_466-3492-0_sp1.1 from training. Duration: 0.97275 2023-10-04 15:47:19,121 WARNING [train.py:1204] (2/4) Exclude cut with ID _8755_210_1876_1_1528628559357_3258209_199-242167-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 15:47:19,188 WARNING [train.py:1204] (2/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_237-89684-0_sp1.1 from training. Duration: 0.87275 2023-10-04 15:47:20,502 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.710e+02 2.013e+02 2.245e+02 2.650e+02 4.852e+02, threshold=4.489e+02, percent-clipped=1.0 2023-10-04 15:47:20,618 WARNING [train.py:1204] (2/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_70-84121-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 15:47:22,032 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7544_210_6753_1_1528587722_3576686_172-171750-0_sp1.1 from training. Duration: 0.5909375 2023-10-04 15:47:22,966 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.1.encoder.layers.0.feed_forward2.out_whiten, num_groups=1, num_channels=256, metric=4.10 vs. limit=15.0 2023-10-04 15:47:23,476 WARNING [train.py:1204] (2/4) Exclude cut with ID _24271_210_12658_1_1531987270022_7190519_40-321822-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 15:47:23,505 WARNING [train.py:1204] (2/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_227-83612-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 15:47:24,988 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9156W0015-572508 from training. Duration: 0.896 2023-10-04 15:47:28,710 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_292-265928-0 from training. Duration: 0.51 2023-10-04 15:47:32,659 WARNING [train.py:1204] (2/4) Exclude cut with ID _14747_210_8341_1_1530685797463_3654089_147-131229-0_sp1.1 from training. Duration: 0.6909375 2023-10-04 15:47:35,412 WARNING [train.py:1204] (2/4) Exclude cut with ID _45297_210_2721_1_1533715351381_3264949_351-147136-0_sp0.9 from training. Duration: 0.9666875 2023-10-04 15:47:38,254 WARNING [train.py:1204] (2/4) Exclude cut with ID _5250_210_5219_1_1527249561806_8692110_1102-324568-0_sp1.1 from training. Duration: 0.97275 2023-10-04 15:47:39,651 WARNING [train.py:1204] (2/4) Exclude cut with ID _18516_210_9206_1_1531704653206_3692406_465-75446-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 15:47:39,653 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9203W0126-595623 from training. Duration: 0.896 2023-10-04 15:47:39,721 WARNING [train.py:1204] (2/4) Exclude cut with ID _55383_210_10635_1_1534468426172_2231669_139-328410-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 15:47:42,666 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.conv_module2.balancer1.prob, batch_count=1708013.3333333333, ans=0.125 2023-10-04 15:47:48,516 WARNING [train.py:1204] (2/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_166-246557-0_sp0.9 from training. Duration: 0.711125 2023-10-04 15:47:48,521 WARNING [train.py:1204] (2/4) Exclude cut with ID _30039_210_13270_1_1532484612900_2725150_239-304872-0_sp1.1 from training. Duration: 0.963625 2023-10-04 15:47:48,546 WARNING [train.py:1204] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_6-226750-0 from training. Duration: 0.94 2023-10-04 15:47:49,903 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_135-5501-0_sp1.1 from training. Duration: 0.8909375 2023-10-04 15:47:50,135 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.conv_module1.balancer2.prob, batch_count=1708013.3333333333, ans=0.125 2023-10-04 15:47:50,198 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.nonlin_attention.balancer.prob, batch_count=1708013.3333333333, ans=0.125 2023-10-04 15:47:53,277 WARNING [train.py:1204] (2/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_359-118926-0 from training. Duration: 0.72 2023-10-04 15:47:56,206 WARNING [train.py:1204] (2/4) Exclude cut with ID _8064_210_5399_1_1528372299291_4282973_4-91037-0 from training. Duration: 0.88 2023-10-04 15:47:56,225 WARNING [train.py:1204] (2/4) Exclude cut with ID _6653_210_2629_1_1527919172441_6732679_415-288492-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 15:47:58,037 WARNING [train.py:1204] (2/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_544-287179-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 15:47:59,975 WARNING [train.py:1204] (2/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_122-32606-0_sp1.1 from training. Duration: 0.92725 2023-10-04 15:48:01,041 WARNING [train.py:1204] (2/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_121-284152-0_sp1.1 from training. Duration: 0.536375 2023-10-04 15:48:01,145 WARNING [train.py:1204] (2/4) Exclude cut with ID _44718_210_5816_1_1534050051869_6469030_350-299477-0_sp1.1 from training. Duration: 0.97275 2023-10-04 15:48:02,463 WARNING [train.py:1204] (2/4) Exclude cut with ID _6653_210_2629_1_1527919172441_6732679_1103-56445-0_sp0.9 from training. Duration: 0.811125 2023-10-04 15:48:02,520 WARNING [train.py:1204] (2/4) Exclude cut with ID _12369_210_4381_1_1530356078581_4388520_64-298917-0_sp0.9 from training. Duration: 0.97775 2023-10-04 15:48:02,570 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_2245_210_2715_1_1525511548_3540254_310-52483-0 from training. Duration: 0.88 2023-10-04 15:48:03,833 WARNING [train.py:1204] (2/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_401-158239-0_sp1.1 from training. Duration: 0.6454375 2023-10-04 15:48:05,272 WARNING [train.py:1204] (2/4) Exclude cut with ID _59502_210_12210_1_1535107024698_1835139_146-162495-0_sp1.1 from training. Duration: 0.836375 2023-10-04 15:48:05,289 WARNING [train.py:1204] (2/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_41-31157-0_sp0.9 from training. Duration: 0.6333125 2023-10-04 15:48:08,077 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_68717_210_3528_1_1535628683_1556759_95-36722-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 15:48:08,084 WARNING [train.py:1204] (2/4) Exclude cut with ID _30196_210_1664_1_1533708496522_7074140_654-162611-0_sp1.1 from training. Duration: 0.92725 2023-10-04 15:48:08,288 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.feed_forward3.hidden_balancer.prob, batch_count=1708146.6666666667, ans=0.125 2023-10-04 15:48:13,563 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_9302_210_2550_1_1528889962_4655416_638-301620-0_sp0.9 from training. Duration: 0.52225 2023-10-04 15:48:14,968 WARNING [train.py:1204] (2/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_478-162247-0_sp1.1 from training. Duration: 0.7454375 2023-10-04 15:48:15,262 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.feed_forward3.hidden_balancer.prob, batch_count=1708146.6666666667, ans=0.125 2023-10-04 15:48:16,497 WARNING [train.py:1204] (2/4) Exclude cut with ID _45297_210_2721_1_1533715351381_3264949_353-9541-0 from training. Duration: 0.97 2023-10-04 15:48:17,480 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.0.layers.0.feed_forward2.out_whiten, num_groups=1, num_channels=192, metric=5.36 vs. limit=15.0 2023-10-04 15:48:22,509 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9653W0190-675468 from training. Duration: 0.9935 2023-10-04 15:48:22,750 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.balancer2.prob, batch_count=1708213.3333333333, ans=0.125 2023-10-04 15:48:23,777 INFO [train.py:1046] (2/4) Epoch 49, batch 1250, loss[loss=0.1668, simple_loss=0.2431, pruned_loss=0.04525, over 23725.00 frames. ], tot_loss[loss=0.1538, simple_loss=0.2348, pruned_loss=0.0364, over 4717993.71 frames. ], batch size: 232, lr: 2.08e-03, grad_scale: 16.0 2023-10-04 15:48:23,926 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_49730_210_13049_1_1534075294_1830676_214-84436-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 15:48:25,742 WARNING [train.py:1204] (2/4) Exclude cut with ID _50093_210_4220_1_1534417141207_3432299_259-322252-0_sp1.1 from training. Duration: 0.836375 2023-10-04 15:48:28,373 WARNING [train.py:1204] (2/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_599-50220-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 15:48:31,602 WARNING [train.py:1204] (2/4) Exclude cut with ID _21878_210_3525_1_1531738772002_7264330_174-109016-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 15:48:33,532 WARNING [train.py:1204] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_665-196155-0 from training. Duration: 0.67 2023-10-04 15:48:35,605 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.3.encoder.layers.0.feed_forward3.out_whiten, num_groups=1, num_channels=512, metric=10.68 vs. limit=15.0 2023-10-04 15:48:37,691 WARNING [train.py:1204] (2/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_656-188361-0_sp1.1 from training. Duration: 0.7909375 2023-10-04 15:48:38,971 WARNING [train.py:1204] (2/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_60-18123-0_sp1.1 from training. Duration: 0.8 2023-10-04 15:48:39,028 WARNING [train.py:1204] (2/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_535-14410-0 from training. Duration: 0.47 2023-10-04 15:48:39,192 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.0.feed_forward1.out_proj.dropout_p, batch_count=1708280.0, ans=0.1 2023-10-04 15:48:40,472 WARNING [train.py:1204] (2/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_94-126128-0_sp1.1 from training. Duration: 0.7818125 2023-10-04 15:48:41,797 WARNING [train.py:1204] (2/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_356-31216-0_sp1.1 from training. Duration: 0.6818125 2023-10-04 15:48:42,039 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.0.conv_module2.balancer2.prob, batch_count=1708280.0, ans=0.125 2023-10-04 15:48:42,088 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.feed_forward3.hidden_balancer.prob, batch_count=1708280.0, ans=0.125 2023-10-04 15:48:44,608 WARNING [train.py:1204] (2/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_443-30034-0_sp1.1 from training. Duration: 0.5818125 2023-10-04 15:48:44,693 WARNING [train.py:1204] (2/4) Exclude cut with ID _26907_210_7234_1_1532221740694_3205263_337-2494-0_sp1.1 from training. Duration: 0.8 2023-10-04 15:48:46,037 WARNING [train.py:1204] (2/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_49-97158-0_sp0.9 from training. Duration: 0.8444375 2023-10-04 15:48:46,049 WARNING [train.py:1204] (2/4) Exclude cut with ID _7877_210_6014_1_1528286358099_3750136_553-170128-0_sp1.1 from training. Duration: 0.963625 2023-10-04 15:48:49,159 WARNING [train.py:1204] (2/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_202-293523-0_sp0.9 from training. Duration: 0.8 2023-10-04 15:48:50,667 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.self_attn_weights.pos_emb_skip_rate, batch_count=1708280.0, ans=0.0 2023-10-04 15:48:52,054 WARNING [train.py:1204] (2/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_188-171673-0_sp0.9 from training. Duration: 0.6444375 2023-10-04 15:48:52,069 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_120-41668-0_sp1.1 from training. Duration: 0.636375 2023-10-04 15:48:53,198 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_40223_210_6228_1_1533298404_4812267_613-78769-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 15:48:53,314 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_53861_210_5399_1_1534422156_4272077_150-336899-0_sp1.1 from training. Duration: 0.963625 2023-10-04 15:48:54,621 WARNING [train.py:1204] (2/4) Exclude cut with ID _14459_210_2228_1_1531443641946_7516700_1146-56843-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 15:48:57,828 WARNING [train.py:1204] (2/4) Exclude cut with ID _39867_210_9826_1_1533553222030_9344259_1194-299928-0_sp1.1 from training. Duration: 0.92725 2023-10-04 15:48:59,268 WARNING [train.py:1204] (2/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_214-291369-0_sp0.9 from training. Duration: 0.7 2023-10-04 15:49:03,448 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.1.bypass_mid.scale_min, batch_count=1708346.6666666667, ans=0.2 2023-10-04 15:49:04,768 WARNING [train.py:1204] (2/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_358-215558-0 from training. Duration: 0.93 2023-10-04 15:49:04,828 WARNING [train.py:1204] (2/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_577-287150-0_sp0.9 from training. Duration: 0.911125 2023-10-04 15:49:07,737 WARNING [train.py:1204] (2/4) Exclude cut with ID _60860_210_8254_1_1534896015875_3557169_229-187367-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 15:49:07,785 WARNING [train.py:1204] (2/4) Exclude cut with ID _6275_210_2084_1_1528110062213_7669049_857-298942-0 from training. Duration: 0.85 2023-10-04 15:49:09,178 WARNING [train.py:1204] (2/4) Exclude cut with ID _40137_210_12210_1_1533290484046_3795180_249-187059-0_sp1.1 from training. Duration: 0.8 2023-10-04 15:49:09,185 WARNING [train.py:1204] (2/4) Exclude cut with ID ID0171W0508-765603 from training. Duration: 0.541 2023-10-04 15:49:09,207 WARNING [train.py:1204] (2/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_521-219439-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 15:49:10,468 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_40223_210_6228_1_1533298404_4812267_741-302616-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 15:49:13,343 WARNING [train.py:1204] (2/4) Exclude cut with ID _65293_210_9070_1_1535545549765_4004210_438-154735-0_sp1.1 from training. Duration: 0.92725 2023-10-04 15:49:13,493 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.0.nonlin_attention.balancer.prob, batch_count=1708413.3333333333, ans=0.125 2023-10-04 15:49:15,976 WARNING [train.py:1204] (2/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_564-132950-0_sp1.1 from training. Duration: 0.92725 2023-10-04 15:49:17,316 WARNING [train.py:1204] (2/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_291-270881-0_sp1.1 from training. Duration: 0.8818125 2023-10-04 15:49:18,829 WARNING [train.py:1204] (2/4) Exclude cut with ID _60229_210_27767_1_1534838406158_3655350_353-213656-0 from training. Duration: 0.95 2023-10-04 15:49:18,841 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_3133_210_2087_1_1526106446_2324690_243-143119-0 from training. Duration: 0.77 2023-10-04 15:49:18,870 WARNING [train.py:1204] (2/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_690-126306-0 from training. Duration: 0.78 2023-10-04 15:49:20,873 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.conv_module2.balancer1.min_positive, batch_count=1708413.3333333333, ans=0.025 2023-10-04 15:49:22,141 WARNING [train.py:1204] (2/4) Exclude cut with ID _30304_210_6319_1_1532476827261_7582060_347-95003-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 15:49:23,560 WARNING [train.py:1204] (2/4) Exclude cut with ID _2322_210_2667_1_1525604548971_3656299_440-80494-0 from training. Duration: 0.88 2023-10-04 15:49:23,577 WARNING [train.py:1204] (2/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_370-229434-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 15:49:25,095 WARNING [train.py:1204] (2/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_124-345840-0_sp1.1 from training. Duration: 0.47275 2023-10-04 15:49:25,104 WARNING [train.py:1204] (2/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_165-123581-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 15:49:25,280 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.conv_module1.balancer2.prob, batch_count=1708480.0, ans=0.125 2023-10-04 15:49:29,108 WARNING [train.py:1204] (2/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_453-282588-0 from training. Duration: 0.98 2023-10-04 15:49:29,125 WARNING [train.py:1204] (2/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_299-34413-0_sp0.9 from training. Duration: 0.72225 2023-10-04 15:49:29,182 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_40036_210_13405_1_1533648647_1315388_48-146016-0_sp1.1 from training. Duration: 0.8454375 2023-10-04 15:49:29,204 WARNING [train.py:1204] (2/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_99-167392-0_sp0.9 from training. Duration: 0.67775 2023-10-04 15:49:31,567 WARNING [train.py:1204] (2/4) Exclude cut with ID _48677_210_5663_1_1533902405919_3892989_37-207114-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 15:49:32,819 WARNING [train.py:1204] (2/4) Exclude cut with ID _29857_210_20034_1_1532395886753_3798160_335-163562-0 from training. Duration: 0.85 2023-10-04 15:49:35,545 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_36492_210_7927_1_1533261987_4790834_703-343351-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 15:49:36,939 WARNING [train.py:1204] (2/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_80-147301-0_sp0.9 from training. Duration: 0.9333125 2023-10-04 15:49:37,033 WARNING [train.py:1204] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_218-295097-0_sp0.9 from training. Duration: 0.8555625 2023-10-04 15:49:38,251 INFO [train.py:1046] (2/4) Epoch 49, batch 1300, loss[loss=0.1446, simple_loss=0.2255, pruned_loss=0.03186, over 23794.00 frames. ], tot_loss[loss=0.1544, simple_loss=0.2354, pruned_loss=0.03673, over 4708392.85 frames. ], batch size: 150, lr: 2.08e-03, grad_scale: 16.0 2023-10-04 15:49:41,034 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_2295_210_2087_1_1525500993_2407863_116-238894-0_sp1.1 from training. Duration: 0.636375 2023-10-04 15:49:42,487 WARNING [train.py:1204] (2/4) Exclude cut with ID _3231_210_2106_1_1526205864934_6784220_697-171068-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 15:49:42,689 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.conv_module1.balancer2.min_positive, batch_count=1708546.6666666667, ans=0.05 2023-10-04 15:49:43,733 WARNING [train.py:1204] (2/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_102-77064-0 from training. Duration: 0.93 2023-10-04 15:49:47,690 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.698e+02 2.055e+02 2.239e+02 2.568e+02 3.936e+02, threshold=4.477e+02, percent-clipped=0.0 2023-10-04 15:49:49,050 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_13091_210_7925_1_1530609447_8987354_1186-261957-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 15:49:49,165 WARNING [train.py:1204] (2/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_91-277684-0_sp1.1 from training. Duration: 0.67275 2023-10-04 15:49:49,373 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.nonlin_attention.balancer.min_positive, batch_count=1708546.6666666667, ans=0.05 2023-10-04 15:49:51,133 WARNING [train.py:1204] (2/4) Exclude cut with ID _31088_210_16446_1_1532520012284_3440919_159-313751-0_sp1.1 from training. Duration: 0.936375 2023-10-04 15:49:51,252 WARNING [train.py:1204] (2/4) Exclude cut with ID _42326_210_7770_1_1533814282531_3707269_227-203721-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 15:49:52,635 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_3531_210_3528_1_1526294203_5216456_309-227302-0_sp1.1 from training. Duration: 0.62725 2023-10-04 15:49:53,818 WARNING [train.py:1204] (2/4) Exclude cut with ID _14087_210_2253_1_1530529224512_3014029_226-242140-0 from training. Duration: 0.81 2023-10-04 15:49:58,042 WARNING [train.py:1204] (2/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_122-109023-0_sp1.1 from training. Duration: 0.6909375 2023-10-04 15:49:59,464 WARNING [train.py:1204] (2/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_660-211088-0_sp0.9 from training. Duration: 0.688875 2023-10-04 15:50:01,400 WARNING [train.py:1204] (2/4) Exclude cut with ID _39290_210_4220_1_1533862778760_3075118_92-311600-0 from training. Duration: 0.88 2023-10-04 15:50:04,720 WARNING [train.py:1204] (2/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_390-339137-0_sp1.1 from training. Duration: 0.5090625 2023-10-04 15:50:07,566 WARNING [train.py:1204] (2/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_449-341946-0_sp1.1 from training. Duration: 0.963625 2023-10-04 15:50:08,895 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_68095_210_20185_1_1535718226_8888855_884-327007-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 15:50:09,166 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.out_combiner.scale_min, batch_count=1708680.0, ans=0.2 2023-10-04 15:50:10,373 WARNING [train.py:1204] (2/4) Exclude cut with ID _42326_210_7770_1_1533814282531_3707269_308-222998-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 15:50:10,474 WARNING [train.py:1204] (2/4) Exclude cut with ID _40399_210_4353_1_1533305658194_3279406_344-238708-0_sp1.1 from training. Duration: 0.963625 2023-10-04 15:50:10,668 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.conv_module2.balancer2.prob, batch_count=1708680.0, ans=0.125 2023-10-04 15:50:10,740 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.conv_skip_rate, batch_count=1708680.0, ans=0.0 2023-10-04 15:50:11,723 WARNING [train.py:1204] (2/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_87-131427-0_sp1.1 from training. Duration: 0.7090625 2023-10-04 15:50:11,772 WARNING [train.py:1204] (2/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_110-130961-0_sp0.9 from training. Duration: 0.52225 2023-10-04 15:50:13,048 WARNING [train.py:1204] (2/4) Exclude cut with ID _55153_210_2253_1_1534384786677_3162096_148-79022-0 from training. Duration: 0.96 2023-10-04 15:50:14,940 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.conv_module2.balancer2.prob, batch_count=1708680.0, ans=0.125 2023-10-04 15:50:17,479 WARNING [train.py:1204] (2/4) Exclude cut with ID _9679_210_7497_1_1529129212906_4363929_512-313889-0_sp1.1 from training. Duration: 0.736375 2023-10-04 15:50:17,497 WARNING [train.py:1204] (2/4) Exclude cut with ID _7970_210_1883_1_1528455616296_3472230_25-178407-0_sp1.1 from training. Duration: 0.6454375 2023-10-04 15:50:19,381 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_3133_210_2087_1_1526106446_2324690_246-125000-0 from training. Duration: 0.62 2023-10-04 15:50:19,434 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_15126_210_6753_1_1530778677_2591185_113-157651-0_sp0.9 from training. Duration: 0.7555625 2023-10-04 15:50:22,034 WARNING [train.py:1204] (2/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_105-63309-0_sp1.1 from training. Duration: 0.8909375 2023-10-04 15:50:23,520 WARNING [train.py:1204] (2/4) Exclude cut with ID _29877_210_6228_1_1532397791106_5276149_578-177271-0_sp1.1 from training. Duration: 0.936375 2023-10-04 15:50:24,823 WARNING [train.py:1204] (2/4) Exclude cut with ID _44831_210_13427_1_1533816011549_3689035_32-188147-0 from training. Duration: 0.96 2023-10-04 15:50:26,152 WARNING [train.py:1204] (2/4) Exclude cut with ID _12369_210_4381_1_1530356078581_4388520_300-254337-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 15:50:26,171 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_128-241008-0 from training. Duration: 0.94 2023-10-04 15:50:27,586 WARNING [train.py:1204] (2/4) Exclude cut with ID _21878_210_3525_1_1531738772002_7264330_53-279178-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 15:50:31,328 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_41452_210_7291_1_1533437687_3941281_273-334088-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 15:50:31,330 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_128-241008-0_sp1.1 from training. Duration: 0.8545625 2023-10-04 15:50:34,112 WARNING [train.py:1204] (2/4) Exclude cut with ID _8394_210_6702_1_1528590504316_3540189_379-49457-0 from training. Duration: 0.87 2023-10-04 15:50:35,484 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_105-114797-0 from training. Duration: 0.84 2023-10-04 15:50:35,696 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.feed_forward3.hidden_balancer.prob, batch_count=1708813.3333333333, ans=0.125 2023-10-04 15:50:36,786 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_14928_210_5632_1_1530773461_4714000_454-112198-0 from training. Duration: 0.66 2023-10-04 15:50:40,867 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_456-22141-0_sp1.1 from training. Duration: 0.8 2023-10-04 15:50:43,736 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_115-233929-0 from training. Duration: 0.72 2023-10-04 15:50:45,075 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_616-16795-0_sp1.1 from training. Duration: 0.963625 2023-10-04 15:50:49,990 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.1.encoder.layers.1.feed_forward1.out_whiten, num_groups=1, num_channels=256, metric=7.27 vs. limit=15.0 2023-10-04 15:50:51,096 INFO [train.py:1046] (2/4) Epoch 49, batch 1350, loss[loss=0.1498, simple_loss=0.2136, pruned_loss=0.04303, over 22662.00 frames. ], tot_loss[loss=0.1536, simple_loss=0.234, pruned_loss=0.03666, over 4698846.30 frames. ], batch size: 322, lr: 2.08e-03, grad_scale: 16.0 2023-10-04 15:50:52,513 WARNING [train.py:1204] (2/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_448-212135-0 from training. Duration: 0.91 2023-10-04 15:50:55,327 WARNING [train.py:1204] (2/4) Exclude cut with ID _46683_210_9642_1_1533730913492_4591116_201-205677-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 15:50:57,335 WARNING [train.py:1204] (2/4) Exclude cut with ID _64086_210_7994_1_1535270417019_7435949_43-80483-0_sp1.1 from training. Duration: 0.97275 2023-10-04 15:51:00,626 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_240-134124-0_sp1.1 from training. Duration: 0.963625 2023-10-04 15:51:00,656 WARNING [train.py:1204] (2/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_907-120940-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 15:51:03,255 WARNING [train.py:1204] (2/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_848-280957-0_sp1.1 from training. Duration: 0.7909375 2023-10-04 15:51:03,294 WARNING [train.py:1204] (2/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_243-42116-0_sp0.9 from training. Duration: 0.811125 2023-10-04 15:51:03,590 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.feed_forward1.hidden_balancer.prob, batch_count=1708880.0, ans=0.125 2023-10-04 15:51:03,983 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.2.encoder.layers.2.feed_forward2.out_whiten, num_groups=1, num_channels=384, metric=12.97 vs. limit=15.0 2023-10-04 15:51:08,145 WARNING [train.py:1204] (2/4) Exclude cut with ID _6694_210_4599_1_1527852637490_3965529_116-109398-0_sp0.9 from training. Duration: 0.811125 2023-10-04 15:51:09,587 WARNING [train.py:1204] (2/4) Exclude cut with ID _24314_210_4915_1_1532135078116_7170179_521-258868-0 from training. Duration: 0.79 2023-10-04 15:51:10,931 WARNING [train.py:1204] (2/4) Exclude cut with ID _2220_210_1819_1_1525509109898_3658680_50-99837-0_sp0.9 from training. Duration: 0.6 2023-10-04 15:51:12,313 WARNING [train.py:1204] (2/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_313-134930-0_sp1.1 from training. Duration: 0.8818125 2023-10-04 15:51:15,057 WARNING [train.py:1204] (2/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_125-144174-0 from training. Duration: 0.39 2023-10-04 15:51:15,123 WARNING [train.py:1204] (2/4) Exclude cut with ID _59288_210_5854_1_1535077757774_3412246_275-293897-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 15:51:16,454 WARNING [train.py:1204] (2/4) Exclude cut with ID _40519_210_13794_1_1533715228655_7337390_722-52880-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 15:51:16,469 WARNING [train.py:1204] (2/4) Exclude cut with ID _7537_210_2084_1_1528502395431_3924359_128-268029-0 from training. Duration: 0.98 2023-10-04 15:51:19,216 WARNING [train.py:1204] (2/4) Exclude cut with ID _8021_210_7994_1_1528376451358_3653069_179-45206-0 from training. Duration: 0.86 2023-10-04 15:51:21,025 WARNING [train.py:1204] (2/4) Exclude cut with ID _13252_210_1925_1_1530671372047_3336909_88-276668-0 from training. Duration: 0.99 2023-10-04 15:51:21,130 WARNING [train.py:1204] (2/4) Exclude cut with ID _22613_210_5219_1_1532253599686_4090170_290-212862-0_sp1.1 from training. Duration: 0.97275 2023-10-04 15:51:21,134 WARNING [train.py:1204] (2/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_465-214726-0 from training. Duration: 0.86 2023-10-04 15:51:31,706 WARNING [train.py:1204] (2/4) Exclude cut with ID _56480_210_4220_1_1534679942079_3075409_138-36031-0_sp1.1 from training. Duration: 0.97275 2023-10-04 15:51:41,888 WARNING [train.py:1204] (2/4) Exclude cut with ID _58605_210_12210_1_1534723195939_3904259_300-285878-0_sp1.1 from training. Duration: 0.97275 2023-10-04 15:51:43,228 WARNING [train.py:1204] (2/4) Exclude cut with ID _40901_210_5665_1_1533363789958_3856130_114-340229-0_sp1.1 from training. Duration: 0.936375 2023-10-04 15:51:43,268 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_489-230703-0 from training. Duration: 0.85 2023-10-04 15:51:44,762 WARNING [train.py:1204] (2/4) Exclude cut with ID _66873_210_5517_1_1535454136506_4293370_322-81243-0_sp1.1 from training. Duration: 0.936375 2023-10-04 15:51:44,859 WARNING [train.py:1204] (2/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_271-269084-0 from training. Duration: 0.83 2023-10-04 15:51:46,144 WARNING [train.py:1204] (2/4) Exclude cut with ID _13252_210_1925_1_1530671372047_3336909_166-314607-0_sp0.9 from training. Duration: 0.6 2023-10-04 15:51:46,224 WARNING [train.py:1204] (2/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_340-343302-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 15:51:46,408 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.conv_module1.balancer2.prob, batch_count=1709080.0, ans=0.125 2023-10-04 15:51:47,698 WARNING [train.py:1204] (2/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_286-112015-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 15:51:50,740 WARNING [train.py:1204] (2/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_328-264776-0 from training. Duration: 0.67 2023-10-04 15:51:52,258 WARNING [train.py:1204] (2/4) Exclude cut with ID _4866_210_2577_1_1527404412284_4364659_256-306830-0_sp0.9 from training. Duration: 0.9666875 2023-10-04 15:51:58,260 WARNING [train.py:1204] (2/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_239-316802-0 from training. Duration: 0.69 2023-10-04 15:51:59,676 WARNING [train.py:1204] (2/4) Exclude cut with ID _29857_210_20034_1_1532395886753_3798160_279-185801-0 from training. Duration: 0.91 2023-10-04 15:52:05,605 INFO [train.py:1046] (2/4) Epoch 49, batch 1400, loss[loss=0.1592, simple_loss=0.233, pruned_loss=0.04268, over 23796.00 frames. ], tot_loss[loss=0.1523, simple_loss=0.2326, pruned_loss=0.03598, over 4695821.35 frames. ], batch size: 179, lr: 2.08e-03, grad_scale: 8.0 2023-10-04 15:52:07,005 WARNING [train.py:1204] (2/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_656-188361-0 from training. Duration: 0.87 2023-10-04 15:52:08,480 WARNING [train.py:1204] (2/4) Exclude cut with ID _44718_210_5816_1_1534050051869_6469030_100-230287-0_sp1.1 from training. Duration: 0.936375 2023-10-04 15:52:11,274 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8275_210_4198_1_1528455320_3892571_276-215622-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 15:52:12,540 WARNING [train.py:1204] (2/4) Exclude cut with ID _30304_210_6319_1_1532476827261_7582060_698-309430-0_sp1.1 from training. Duration: 0.963625 2023-10-04 15:52:14,277 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.bypass.skip_rate, batch_count=1709213.3333333333, ans=0.04949747468305833 2023-10-04 15:52:16,622 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.734e+02 2.032e+02 2.354e+02 2.703e+02 4.112e+02, threshold=4.708e+02, percent-clipped=0.0 2023-10-04 15:52:18,081 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_365-217119-0 from training. Duration: 0.63 2023-10-04 15:52:19,536 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7264_210_4938_1_1527939911_8132095_800-91995-0 from training. Duration: 0.86 2023-10-04 15:52:23,226 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.feed_forward3.hidden_balancer.prob, batch_count=1709280.0, ans=0.125 2023-10-04 15:52:27,341 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=1709280.0, ans=0.1 2023-10-04 15:52:30,228 WARNING [train.py:1204] (2/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_49-97158-0_sp1.1 from training. Duration: 0.6909375 2023-10-04 15:52:30,357 WARNING [train.py:1204] (2/4) Exclude cut with ID _61132_210_9969_1_1535021792964_3755419_61-57720-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 15:52:33,276 WARNING [train.py:1204] (2/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_345-306832-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 15:52:33,307 WARNING [train.py:1204] (2/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_164-224052-0_sp0.9 from training. Duration: 0.77775 2023-10-04 15:52:36,246 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_34886_210_10633_1_1533203882_1755661_76-156096-0_sp1.1 from training. Duration: 0.7909375 2023-10-04 15:52:38,157 WARNING [train.py:1204] (2/4) Exclude cut with ID 4133-6541-0027-40495-0_sp1.1 from training. Duration: 0.9681875 2023-10-04 15:52:46,576 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_514-211165-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 15:52:46,620 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_41211_210_2544_1_1533348859_3702910_97-212695-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 15:52:49,446 WARNING [train.py:1204] (2/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_406-45086-0 from training. Duration: 0.8 2023-10-04 15:52:51,387 WARNING [train.py:1204] (2/4) Exclude cut with ID _65263_210_22356_1_1535763598691_3921037_49-222917-0_sp1.1 from training. Duration: 0.836375 2023-10-04 15:52:51,602 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.feed_forward2.hidden_balancer.prob, batch_count=1709413.3333333333, ans=0.125 2023-10-04 15:52:52,793 WARNING [train.py:1204] (2/4) Exclude cut with ID _4866_210_2577_1_1527404412284_4364659_352-157930-0_sp1.1 from training. Duration: 0.736375 2023-10-04 15:52:54,109 WARNING [train.py:1204] (2/4) Exclude cut with ID _4366_210_1388_1_1526792053633_3613819_231-211402-0_sp1.1 from training. Duration: 0.8545625 2023-10-04 15:52:54,161 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_98-21576-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 15:52:55,363 WARNING [train.py:1204] (2/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_348-127789-0_sp0.9 from training. Duration: 0.9444375 2023-10-04 15:52:55,380 WARNING [train.py:1204] (2/4) Exclude cut with ID _45871_210_5665_1_1533864614245_3296060_98-329557-0_sp1.1 from training. Duration: 0.97275 2023-10-04 15:52:55,435 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6846_210_6753_1_1527850767_4295158_2-315073-0_sp1.1 from training. Duration: 0.8 2023-10-04 15:52:55,597 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.self_attn_weights.pos_emb_skip_rate, batch_count=1709413.3333333333, ans=0.0 2023-10-04 15:52:58,153 WARNING [train.py:1204] (2/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_145-137790-0 from training. Duration: 0.83 2023-10-04 15:52:58,177 WARNING [train.py:1204] (2/4) Exclude cut with ID _14603_210_4220_1_1530615688441_3097319_133-158308-0_sp0.9 from training. Duration: 0.9333125 2023-10-04 15:53:02,883 WARNING [train.py:1204] (2/4) Exclude cut with ID _14898_210_9206_1_1530766812377_4177190_320-11758-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 15:53:06,947 WARNING [train.py:1204] (2/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_406-45086-0_sp0.9 from training. Duration: 0.888875 2023-10-04 15:53:11,985 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_5438_210_1794_1_1527336687_2600009_104-288274-0 from training. Duration: 0.58 2023-10-04 15:53:12,136 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.conv_module2.balancer1.prob, batch_count=1709480.0, ans=0.125 2023-10-04 15:53:13,313 WARNING [train.py:1204] (2/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_108-53929-0_sp0.9 from training. Duration: 0.6555625 2023-10-04 15:53:13,371 WARNING [train.py:1204] (2/4) Exclude cut with ID _30800_210_22382_1_1532499347904_4515959_444-286390-0_sp1.1 from training. Duration: 0.963625 2023-10-04 15:53:16,089 WARNING [train.py:1204] (2/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_125-144174-0_sp0.9 from training. Duration: 0.4333125 2023-10-04 15:53:16,148 WARNING [train.py:1204] (2/4) Exclude cut with ID _55209_210_1531_1_1534675828996_4597160_118-345621-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 15:53:18,646 INFO [train.py:1046] (2/4) Epoch 49, batch 1450, loss[loss=0.1547, simple_loss=0.2316, pruned_loss=0.0389, over 22738.00 frames. ], tot_loss[loss=0.152, simple_loss=0.2325, pruned_loss=0.03573, over 4705600.78 frames. ], batch size: 322, lr: 2.07e-03, grad_scale: 8.0 2023-10-04 15:53:18,700 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_45886_210_17024_1_1533709215_471063_7-79165-0_sp1.1 from training. Duration: 0.8909375 2023-10-04 15:53:19,062 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.ff2_skip_rate, batch_count=1709546.6666666667, ans=0.0 2023-10-04 15:53:23,333 WARNING [train.py:1204] (2/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_175-163894-0_sp0.9 from training. Duration: 0.82225 2023-10-04 15:53:23,470 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_50188_210_4198_1_1534503621_3646041_353-81682-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 15:53:23,476 WARNING [train.py:1204] (2/4) Exclude cut with ID _17875_210_2087_1_1531224012907_1821339_175-80565-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 15:53:23,484 WARNING [train.py:1204] (2/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_159-328157-0_sp1.1 from training. Duration: 0.47275 2023-10-04 15:53:26,391 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.conv_module1.balancer1.prob, batch_count=1709546.6666666667, ans=0.125 2023-10-04 15:53:31,323 WARNING [train.py:1204] (2/4) Exclude cut with ID _60057_210_10640_1_1534903065793_3333150_137-267579-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 15:53:31,361 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_92-46009-0_sp1.1 from training. Duration: 0.4909375 2023-10-04 15:53:32,840 WARNING [train.py:1204] (2/4) Exclude cut with ID _53203_210_2087_1_1534246218309_475590_48-68507-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 15:53:32,856 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_28211_210_6388_1_1532861506_4523224_128-328162-0 from training. Duration: 0.9 2023-10-04 15:53:34,139 WARNING [train.py:1204] (2/4) Exclude cut with ID _4697_210_2069_1_1526815801953_3823098_51-1049-0_sp1.1 from training. Duration: 0.6818125 2023-10-04 15:53:35,547 WARNING [train.py:1204] (2/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_383-316000-0 from training. Duration: 0.9 2023-10-04 15:53:35,598 WARNING [train.py:1204] (2/4) Exclude cut with ID _4779_210_2084_1_1527318022944_3914893_185-295153-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 15:53:35,672 WARNING [train.py:1204] (2/4) Exclude cut with ID _3231_210_2106_1_1526205864934_6784220_171-256004-0_sp0.9 from training. Duration: 0.9555625 2023-10-04 15:53:35,672 WARNING [train.py:1204] (2/4) Exclude cut with ID _41314_210_12651_1_1533459476543_3766460_87-272068-0 from training. Duration: 0.83 2023-10-04 15:53:37,058 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_164-152777-0_sp1.1 from training. Duration: 0.92725 2023-10-04 15:53:38,973 WARNING [train.py:1204] (2/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_18-130336-0_sp0.9 from training. Duration: 0.87775 2023-10-04 15:53:39,030 WARNING [train.py:1204] (2/4) Exclude cut with ID _14886_210_4220_1_1530702020355_3516190_175-229073-0_sp1.1 from training. Duration: 0.4090625 2023-10-04 15:53:39,042 WARNING [train.py:1204] (2/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_791-45149-0_sp0.9 from training. Duration: 0.9555625 2023-10-04 15:53:40,338 WARNING [train.py:1204] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_468-189058-0_sp1.1 from training. Duration: 0.87275 2023-10-04 15:53:42,962 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8278_210_2550_1_1528637099_4220413_32-142780-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 15:53:44,492 WARNING [train.py:1204] (2/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_146-218763-0_sp0.9 from training. Duration: 0.9555625 2023-10-04 15:53:46,732 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.1.encoder.layers.1.self_attn1.whiten, num_groups=1, num_channels=256, metric=12.80 vs. limit=22.5 2023-10-04 15:53:47,286 WARNING [train.py:1204] (2/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_348-127789-0_sp1.1 from training. Duration: 0.77275 2023-10-04 15:53:47,291 WARNING [train.py:1204] (2/4) Exclude cut with ID _47790_210_16187_1_1533862814066_4207519_288-285445-0_sp1.1 from training. Duration: 0.936375 2023-10-04 15:53:48,734 WARNING [train.py:1204] (2/4) Exclude cut with ID _14603_210_4220_1_1530615688441_3097319_308-181732-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 15:53:48,758 WARNING [train.py:1204] (2/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_895-117428-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 15:53:50,388 WARNING [train.py:1204] (2/4) Exclude cut with ID _9235_210_5219_1_1528891154253_8046120_387-159008-0_sp0.9 from training. Duration: 0.9555625 2023-10-04 15:53:52,060 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_37351_210_7925_1_1533012931_3812113_265-85780-0_sp0.9 from training. Duration: 0.97775 2023-10-04 15:53:52,076 WARNING [train.py:1204] (2/4) Exclude cut with ID _40551_210_10635_1_1533776424632_3416179_361-3964-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 15:53:52,118 WARNING [train.py:1204] (2/4) Exclude cut with ID _25882_210_3036_1_1532775665815_6479510_485-122742-0_sp1.1 from training. Duration: 0.97275 2023-10-04 15:53:56,355 WARNING [train.py:1204] (2/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_98-51676-0 from training. Duration: 0.73 2023-10-04 15:53:58,991 WARNING [train.py:1204] (2/4) Exclude cut with ID _30548_210_16040_1_1532608097130_3146969_371-280610-0_sp1.1 from training. Duration: 0.92725 2023-10-04 15:54:02,407 WARNING [train.py:1204] (2/4) Exclude cut with ID IC0671W0081-331924 from training. Duration: 0.896 2023-10-04 15:54:03,723 WARNING [train.py:1204] (2/4) Exclude cut with ID _40284_210_2084_1_1533513679814_3704279_380-160775-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 15:54:05,160 WARNING [train.py:1204] (2/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_57-270037-0_sp1.1 from training. Duration: 0.7 2023-10-04 15:54:06,456 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_853-150237-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 15:54:06,555 WARNING [train.py:1204] (2/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_491-261923-0 from training. Duration: 0.98 2023-10-04 15:54:11,105 WARNING [train.py:1204] (2/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_836-244045-0_sp1.1 from training. Duration: 0.97275 2023-10-04 15:54:12,476 WARNING [train.py:1204] (2/4) Exclude cut with ID _14880_210_8895_1_1530680426655_3795259_289-212817-0 from training. Duration: 0.69 2023-10-04 15:54:13,766 WARNING [train.py:1204] (2/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_303-340446-0 from training. Duration: 0.9 2023-10-04 15:54:13,875 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_37152_210_9697_1_1533106573_3844188_418-41736-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 15:54:17,939 WARNING [train.py:1204] (2/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_371-18169-0_sp1.1 from training. Duration: 0.7818125 2023-10-04 15:54:19,745 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_187-219984-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 15:54:21,147 WARNING [train.py:1204] (2/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_63-267585-0 from training. Duration: 0.9 2023-10-04 15:54:22,561 WARNING [train.py:1204] (2/4) Exclude cut with ID _27013_210_1389_1_1532437173240_3252649_222-125573-0 from training. Duration: 0.98 2023-10-04 15:54:23,860 WARNING [train.py:1204] (2/4) Exclude cut with ID _14821_210_8750_1_1530680503991_6829359_189-297451-0 from training. Duration: 0.55 2023-10-04 15:54:25,263 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_557-163391-0_sp1.1 from training. Duration: 0.97275 2023-10-04 15:54:25,352 WARNING [train.py:1204] (2/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_65-88242-0_sp1.1 from training. Duration: 0.5090625 2023-10-04 15:54:32,980 INFO [train.py:1046] (2/4) Epoch 49, batch 1500, loss[loss=0.1502, simple_loss=0.2226, pruned_loss=0.03887, over 23837.00 frames. ], tot_loss[loss=0.1521, simple_loss=0.2326, pruned_loss=0.03586, over 4703606.54 frames. ], batch size: 179, lr: 2.07e-03, grad_scale: 8.0 2023-10-04 15:54:37,198 WARNING [train.py:1204] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_218-295097-0 from training. Duration: 0.77 2023-10-04 15:54:37,222 WARNING [train.py:1204] (2/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_686-247600-0_sp1.1 from training. Duration: 0.836375 2023-10-04 15:54:37,225 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_397-147196-0_sp0.9 from training. Duration: 0.888875 2023-10-04 15:54:38,543 WARNING [train.py:1204] (2/4) Exclude cut with ID _53950_210_5816_1_1534417264919_3862951_237-136449-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 15:54:39,906 WARNING [train.py:1204] (2/4) Exclude cut with ID _3120_210_3525_1_1526036143112_4072980_74-178400-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 15:54:41,259 WARNING [train.py:1204] (2/4) Exclude cut with ID _5166_210_2084_1_1527073197160_4087993_309-222545-0_sp0.9 from training. Duration: 0.9333125 2023-10-04 15:54:41,343 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7544_210_6753_1_1528587722_3576686_172-171750-0 from training. Duration: 0.65 2023-10-04 15:54:43,332 WARNING [train.py:1204] (2/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_3-237434-0_sp0.9 from training. Duration: 0.8444375 2023-10-04 15:54:44,515 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.778e+02 2.098e+02 2.317e+02 2.688e+02 4.133e+02, threshold=4.633e+02, percent-clipped=0.0 2023-10-04 15:54:44,594 WARNING [train.py:1204] (2/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_101-293367-0_sp0.9 from training. Duration: 0.7 2023-10-04 15:54:44,601 WARNING [train.py:1204] (2/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_818-213552-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 15:54:44,676 WARNING [train.py:1204] (2/4) Exclude cut with ID _14349_210_8341_1_1530615710818_3442470_115-285975-0_sp1.1 from training. Duration: 0.7818125 2023-10-04 15:54:46,101 WARNING [train.py:1204] (2/4) Exclude cut with ID _30800_210_22382_1_1532499347904_4515959_291-309749-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 15:54:47,466 WARNING [train.py:1204] (2/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_715-3462-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 15:54:53,377 WARNING [train.py:1204] (2/4) Exclude cut with ID _59502_210_12210_1_1535107024698_1835139_13-331827-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 15:54:53,387 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_4528_210_3273_1_1527076654_3276777_342-168068-0 from training. Duration: 0.87 2023-10-04 15:54:54,757 WARNING [train.py:1204] (2/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_215-141059-0_sp0.9 from training. Duration: 0.688875 2023-10-04 15:54:54,777 WARNING [train.py:1204] (2/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_106-286130-0_sp1.1 from training. Duration: 0.8909375 2023-10-04 15:54:56,104 WARNING [train.py:1204] (2/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_480-77655-0_sp1.1 from training. Duration: 0.97275 2023-10-04 15:54:57,630 WARNING [train.py:1204] (2/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_848-280957-0 from training. Duration: 0.87 2023-10-04 15:55:00,901 WARNING [train.py:1204] (2/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_122-109023-0 from training. Duration: 0.76 2023-10-04 15:55:03,941 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_11305_210_1794_1_1529750428_7178148_645-233480-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 15:55:03,995 WARNING [train.py:1204] (2/4) Exclude cut with ID _7562_210_2084_1_1528887767916_3946229_345-169156-0 from training. Duration: 0.58 2023-10-04 15:55:06,757 WARNING [train.py:1204] (2/4) Exclude cut with ID _7562_210_2084_1_1528887767916_3946229_345-169156-0_sp1.1 from training. Duration: 0.52725 2023-10-04 15:55:10,745 WARNING [train.py:1204] (2/4) Exclude cut with ID _13253_210_1925_1_1530757775819_3577873_253-246386-0_sp1.1 from training. Duration: 0.8181875 2023-10-04 15:55:10,819 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_136-264444-0_sp1.1 from training. Duration: 0.97275 2023-10-04 15:55:10,838 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_2168_210_2290_1_1525606920_1849400_136-222991-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 15:55:11,314 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.4.encoder.layers.2.conv_module2.whiten, num_groups=1, num_channels=384, metric=3.20 vs. limit=15.0 2023-10-04 15:55:12,246 WARNING [train.py:1204] (2/4) Exclude cut with ID _7852_210_5219_1_1528286471316_8139400_242-105336-0 from training. Duration: 0.72 2023-10-04 15:55:13,569 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_5960_210_1794_1_1527403865_3990999_515-2613-0_sp1.1 from training. Duration: 0.8 2023-10-04 15:55:13,592 WARNING [train.py:1204] (2/4) Exclude cut with ID _39688_210_19596_1_1533371626725_4154159_551-167834-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 15:55:14,971 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_43802_210_2290_1_1533516203_8829504_85-195240-0 from training. Duration: 0.87 2023-10-04 15:55:15,026 WARNING [train.py:1204] (2/4) Exclude cut with ID _63171_210_15488_1_1535104792794_3295110_113-113534-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 15:55:21,002 WARNING [train.py:1204] (2/4) Exclude cut with ID _8833_210_5219_1_1528592665210_7553059_319-349181-0_sp1.1 from training. Duration: 0.9 2023-10-04 15:55:21,002 WARNING [train.py:1204] (2/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_225-43381-0 from training. Duration: 0.94 2023-10-04 15:55:21,874 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.0.layers.1.feed_forward1.out_whiten, num_groups=1, num_channels=192, metric=4.12 vs. limit=15.0 2023-10-04 15:55:25,770 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_5959_210_1794_1_1527400400_3445726_412-44052-0_sp0.9 from training. Duration: 0.8555625 2023-10-04 15:55:27,176 WARNING [train.py:1204] (2/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_356-167816-0_sp1.1 from training. Duration: 0.6545625 2023-10-04 15:55:31,845 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9012W0112-501244 from training. Duration: 0.768 2023-10-04 15:55:31,883 WARNING [train.py:1204] (2/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_177-326952-0_sp1.1 from training. Duration: 0.92725 2023-10-04 15:55:31,887 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9016W0116-502789 from training. Duration: 0.896 2023-10-04 15:55:33,255 WARNING [train.py:1204] (2/4) Exclude cut with ID _56235_210_10640_1_1534557335235_4161099_557-204815-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 15:55:35,183 WARNING [train.py:1204] (2/4) Exclude cut with ID _55857_210_2346_1_1534644007831_6898209_279-229469-0_sp1.1 from training. Duration: 0.936375 2023-10-04 15:55:35,258 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9029W0375-509764 from training. Duration: 0.896 2023-10-04 15:55:36,657 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_2295_210_2087_1_1525500993_2407863_234-83104-0_sp0.9 from training. Duration: 0.688875 2023-10-04 15:55:40,515 WARNING [train.py:1204] (2/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_266-137104-0 from training. Duration: 0.65 2023-10-04 15:55:41,875 WARNING [train.py:1204] (2/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_289-163927-0_sp1.1 from training. Duration: 0.92725 2023-10-04 15:55:44,638 WARNING [train.py:1204] (2/4) Exclude cut with ID _12733_210_8341_1_1530167424556_3646380_86-225314-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 15:55:44,651 WARNING [train.py:1204] (2/4) Exclude cut with ID _7877_210_6014_1_1528286358099_3750136_618-284064-0_sp1.1 from training. Duration: 0.92725 2023-10-04 15:55:45,890 INFO [train.py:1046] (2/4) Epoch 49, batch 1550, loss[loss=0.151, simple_loss=0.2435, pruned_loss=0.02923, over 24446.00 frames. ], tot_loss[loss=0.1522, simple_loss=0.2329, pruned_loss=0.03575, over 4704403.09 frames. ], batch size: 69, lr: 2.07e-03, grad_scale: 8.0 2023-10-04 15:55:45,964 WARNING [train.py:1204] (2/4) Exclude cut with ID _55844_210_2346_1_1534471212569_7625707_1017-31469-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 15:55:45,998 WARNING [train.py:1204] (2/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_617-241446-0_sp1.1 from training. Duration: 0.92725 2023-10-04 15:55:46,049 WARNING [train.py:1204] (2/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_866-173440-0_sp1.1 from training. Duration: 0.8181875 2023-10-04 15:55:47,530 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_9460_210_4211_1_1528954587_10049667_480-260269-0 from training. Duration: 0.56 2023-10-04 15:55:49,409 WARNING [train.py:1204] (2/4) Exclude cut with ID _44659_210_2721_1_1534036120061_6614860_626-298884-0 from training. Duration: 0.97 2023-10-04 15:55:49,449 WARNING [train.py:1204] (2/4) Exclude cut with ID _53565_210_10635_1_1534337218335_3820259_287-18120-0_sp1.1 from training. Duration: 0.8818125 2023-10-04 15:55:49,514 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_791-197891-0 from training. Duration: 0.84 2023-10-04 15:55:50,856 WARNING [train.py:1204] (2/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_527-238990-0 from training. Duration: 0.9 2023-10-04 15:55:53,581 WARNING [train.py:1204] (2/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_449-65969-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 15:55:53,729 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.1.ff2_skip_rate, batch_count=1710213.3333333333, ans=0.0 2023-10-04 15:55:55,568 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_553-341452-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 15:55:55,603 WARNING [train.py:1204] (2/4) Exclude cut with ID _16909_210_5724_1_1531292079912_4118919_300-47574-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 15:55:56,940 WARNING [train.py:1204] (2/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_70-27732-0_sp1.1 from training. Duration: 0.8545625 2023-10-04 15:55:57,034 WARNING [train.py:1204] (2/4) Exclude cut with ID _44718_210_5816_1_1534050051869_6469030_727-28074-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 15:55:58,351 WARNING [train.py:1204] (2/4) Exclude cut with ID _55857_210_2346_1_1534644007831_6898209_357-123664-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 15:56:01,155 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9123W0004-555964 from training. Duration: 0.896 2023-10-04 15:56:01,176 WARNING [train.py:1204] (2/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_445-145208-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 15:56:01,209 WARNING [train.py:1204] (2/4) Exclude cut with ID _3675_210_4557_1_1526639934408_4182089_456-18305-0_sp1.1 from training. Duration: 0.6909375 2023-10-04 15:56:01,274 WARNING [train.py:1204] (2/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_1-99680-0_sp1.1 from training. Duration: 0.6090625 2023-10-04 15:56:03,495 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.ff2_skip_rate, batch_count=1710280.0, ans=0.0 2023-10-04 15:56:04,543 WARNING [train.py:1204] (2/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_146-282400-0_sp0.9 from training. Duration: 0.788875 2023-10-04 15:56:04,566 WARNING [train.py:1204] (2/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_844-96989-0 from training. Duration: 0.71 2023-10-04 15:56:05,958 WARNING [train.py:1204] (2/4) Exclude cut with ID _29853_210_7497_1_1532505600851_6642110_128-146364-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 15:56:05,984 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_224-322394-0 from training. Duration: 0.66 2023-10-04 15:56:07,913 WARNING [train.py:1204] (2/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_191-59784-0 from training. Duration: 0.88 2023-10-04 15:56:07,919 WARNING [train.py:1204] (2/4) Exclude cut with ID _12556_210_8341_1_1530081024100_7327690_725-193477-0 from training. Duration: 0.98 2023-10-04 15:56:07,973 WARNING [train.py:1204] (2/4) Exclude cut with ID _5166_210_2084_1_1527073197160_4087993_523-79838-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 15:56:09,342 WARNING [train.py:1204] (2/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_398-334558-0_sp1.1 from training. Duration: 0.87275 2023-10-04 15:56:12,146 WARNING [train.py:1204] (2/4) Exclude cut with ID _6456_210_1534_1_1527681634212_3181049_171-36697-0_sp1.1 from training. Duration: 0.936375 2023-10-04 15:56:13,638 WARNING [train.py:1204] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_700-183549-0 from training. Duration: 0.68 2023-10-04 15:56:13,640 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8853_210_3273_1_1528633899_818968_51-187685-0 from training. Duration: 0.69 2023-10-04 15:56:14,422 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.1.encoder.layers.0.feed_forward1.out_whiten, num_groups=1, num_channels=256, metric=4.74 vs. limit=15.0 2023-10-04 15:56:22,713 WARNING [train.py:1204] (2/4) Exclude cut with ID _7719_210_3525_1_1528456153163_4296960_333-110399-0_sp1.1 from training. Duration: 0.87275 2023-10-04 15:56:25,631 WARNING [train.py:1204] (2/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_1022-83740-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 15:56:25,653 WARNING [train.py:1204] (2/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_61-327062-0_sp0.9 from training. Duration: 0.9 2023-10-04 15:56:25,664 WARNING [train.py:1204] (2/4) Exclude cut with ID _47251_210_18628_1_1533814063128_4137250_214-88076-0_sp1.1 from training. Duration: 0.8 2023-10-04 15:56:26,988 WARNING [train.py:1204] (2/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_839-173017-0 from training. Duration: 0.41 2023-10-04 15:56:31,219 WARNING [train.py:1204] (2/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_299-34413-0_sp1.1 from training. Duration: 0.5909375 2023-10-04 15:56:33,159 WARNING [train.py:1204] (2/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_664-159457-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 15:56:33,476 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.feed_forward1.out_proj.dropout_p, batch_count=1710413.3333333333, ans=0.1 2023-10-04 15:56:34,515 WARNING [train.py:1204] (2/4) Exclude cut with ID _66873_210_5517_1_1535454136506_4293370_483-291760-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 15:56:36,592 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.3.encoder.layers.1.self_attn1.whiten, num_groups=1, num_channels=512, metric=13.47 vs. limit=22.5 2023-10-04 15:56:37,113 WARNING [train.py:1204] (2/4) Exclude cut with ID _30800_210_22382_1_1532499347904_4515959_287-255474-0_sp1.1 from training. Duration: 0.92725 2023-10-04 15:56:37,758 WARNING [train.py:1204] (2/4) Exclude cut with ID _48677_210_5663_1_1533902405919_3892989_111-11439-0_sp1.1 from training. Duration: 0.87275 2023-10-04 15:56:37,765 WARNING [train.py:1204] (2/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_399-156791-0 from training. Duration: 0.83 2023-10-04 15:56:37,797 WARNING [train.py:1204] (2/4) Exclude cut with ID _3992_210_1747_1_1526644816031_3731060_36-147855-0_sp0.9 from training. Duration: 0.8666875 2023-10-04 15:56:40,304 WARNING [train.py:1204] (2/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_442-320317-0_sp1.1 from training. Duration: 0.7454375 2023-10-04 15:56:40,324 WARNING [train.py:1204] (2/4) Exclude cut with ID _55429_210_10626_1_1534467588527_7174143_953-80868-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 15:56:40,407 WARNING [train.py:1204] (2/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_159-328157-0_sp0.9 from training. Duration: 0.57775 2023-10-04 15:56:40,417 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9309W0197-640324 from training. Duration: 0.896 2023-10-04 15:56:44,460 WARNING [train.py:1204] (2/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_361-30822-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 15:56:47,978 WARNING [train.py:1204] (2/4) Exclude cut with ID _5250_210_5219_1_1527249561806_8692110_636-251213-0 from training. Duration: 0.94 2023-10-04 15:56:52,878 WARNING [train.py:1204] (2/4) Exclude cut with ID _49728_210_12651_1_1534041034294_6788150_468-333827-0_sp1.1 from training. Duration: 0.963625 2023-10-04 15:56:54,250 WARNING [train.py:1204] (2/4) Exclude cut with ID _25882_210_3036_1_1532775665815_6479510_623-191759-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 15:56:54,312 WARNING [train.py:1204] (2/4) Exclude cut with ID _5189_210_4557_1_1527248319428_3769229_199-53526-0 from training. Duration: 0.42 2023-10-04 15:56:55,704 WARNING [train.py:1204] (2/4) Exclude cut with ID _6274_210_2084_1_1527937583837_7494020_32-205288-0_sp0.9 from training. Duration: 0.8666875 2023-10-04 15:56:56,976 WARNING [train.py:1204] (2/4) Exclude cut with ID _65263_210_22356_1_1535763598691_3921037_401-346646-0_sp1.1 from training. Duration: 0.963625 2023-10-04 15:56:56,980 WARNING [train.py:1204] (2/4) Exclude cut with ID _12555_210_8750_1_1530075594402_3780239_449-267986-0_sp1.1 from training. Duration: 0.7545625 2023-10-04 15:56:56,992 WARNING [train.py:1204] (2/4) Exclude cut with ID _69884_210_9771_1_1535846392818_4166169_430-201020-0_sp1.1 from training. Duration: 0.8818125 2023-10-04 15:56:58,490 WARNING [train.py:1204] (2/4) Exclude cut with ID _8394_210_6702_1_1528590504316_3540189_379-49457-0_sp1.1 from training. Duration: 0.7909375 2023-10-04 15:57:01,261 INFO [train.py:1046] (2/4) Epoch 49, batch 1600, loss[loss=0.1625, simple_loss=0.2385, pruned_loss=0.04328, over 22665.00 frames. ], tot_loss[loss=0.152, simple_loss=0.2327, pruned_loss=0.03564, over 4719515.87 frames. ], batch size: 322, lr: 2.07e-03, grad_scale: 16.0 2023-10-04 15:57:03,253 WARNING [train.py:1204] (2/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_200-105218-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 15:57:03,319 WARNING [train.py:1204] (2/4) Exclude cut with ID _3867_210_1883_1_1526641200575_4475830_319-349850-0 from training. Duration: 0.67 2023-10-04 15:57:03,391 WARNING [train.py:1204] (2/4) Exclude cut with ID _28317_210_12742_1_1532480302381_8510325_916-195848-0 from training. Duration: 0.92 2023-10-04 15:57:05,686 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.0.nonlin_attention.whiten1.whitening_limit, batch_count=1710546.6666666667, ans=10.0 2023-10-04 15:57:06,045 WARNING [train.py:1204] (2/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_436-186311-0 from training. Duration: 0.65 2023-10-04 15:57:10,457 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_3910_210_1794_1_1526777667_3747260_302-180617-0_sp1.1 from training. Duration: 0.97275 2023-10-04 15:57:11,927 WARNING [train.py:1204] (2/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_102-92751-0 from training. Duration: 0.99 2023-10-04 15:57:11,987 WARNING [train.py:1204] (2/4) Exclude cut with ID _51899_210_20034_1_1534129254466_3669220_265-215428-0_sp1.1 from training. Duration: 0.8909375 2023-10-04 15:57:13,212 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.738e+02 2.000e+02 2.241e+02 2.538e+02 3.324e+02, threshold=4.482e+02, percent-clipped=0.0 2023-10-04 15:57:14,726 WARNING [train.py:1204] (2/4) Exclude cut with ID _58583_210_5236_1_1534726835266_3574249_438-198993-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 15:57:16,404 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.attention_skip_rate, batch_count=1710613.3333333333, ans=0.0 2023-10-04 15:57:19,382 WARNING [train.py:1204] (2/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_166-166990-0_sp1.1 from training. Duration: 0.87275 2023-10-04 15:57:22,571 WARNING [train.py:1204] (2/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_907-151398-0 from training. Duration: 0.37 2023-10-04 15:57:24,270 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.conv_skip_rate, batch_count=1710613.3333333333, ans=0.0 2023-10-04 15:57:25,307 WARNING [train.py:1204] (2/4) Exclude cut with ID _38670_210_15379_1_1533448810659_4391059_452-256669-0_sp1.1 from training. Duration: 0.936375 2023-10-04 15:57:25,354 WARNING [train.py:1204] (2/4) Exclude cut with ID _48677_210_5663_1_1533902405919_3892989_408-40740-0 from training. Duration: 0.83 2023-10-04 15:57:25,390 WARNING [train.py:1204] (2/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_903-158756-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 15:57:26,782 WARNING [train.py:1204] (2/4) Exclude cut with ID _13252_210_1925_1_1530671372047_3336909_397-216497-0 from training. Duration: 0.96 2023-10-04 15:57:26,946 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.nonlin_attention.balancer.prob, batch_count=1710613.3333333333, ans=0.125 2023-10-04 15:57:33,851 WARNING [train.py:1204] (2/4) Exclude cut with ID _55429_210_10626_1_1534467588527_7174143_97-335084-0 from training. Duration: 0.97 2023-10-04 15:57:44,156 WARNING [train.py:1204] (2/4) Exclude cut with ID _45329_210_7005_1_1534157951211_6774339_453-267730-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 15:57:44,229 WARNING [train.py:1204] (2/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_153-97441-0 from training. Duration: 0.56 2023-10-04 15:57:45,689 WARNING [train.py:1204] (2/4) Exclude cut with ID _40901_210_5665_1_1533363789958_3856130_312-329122-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 15:57:45,732 WARNING [train.py:1204] (2/4) Exclude cut with ID _30800_210_22382_1_1532499347904_4515959_97-126471-0_sp1.1 from training. Duration: 0.97275 2023-10-04 15:57:45,732 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_11305_210_1794_1_1529750428_7178148_257-312870-0_sp1.1 from training. Duration: 0.8 2023-10-04 15:57:48,477 WARNING [train.py:1204] (2/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_454-314901-0_sp0.9 from training. Duration: 0.511125 2023-10-04 15:57:51,357 WARNING [train.py:1204] (2/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_282-136641-0_sp1.1 from training. Duration: 0.3818125 2023-10-04 15:57:52,773 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_46013_210_7234_1_1533776362_3744032_493-160724-0_sp1.1 from training. Duration: 0.963625 2023-10-04 15:57:52,813 WARNING [train.py:1204] (2/4) Exclude cut with ID _67831_210_12392_1_1535612464202_3092612_259-33442-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 15:57:54,716 WARNING [train.py:1204] (2/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_289-62384-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 15:57:56,045 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6545_210_2040_1_1527774670_4245258_379-14182-0_sp0.9 from training. Duration: 0.888875 2023-10-04 15:57:58,028 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_548-294316-0_sp1.1 from training. Duration: 0.736375 2023-10-04 15:57:59,481 WARNING [train.py:1204] (2/4) Exclude cut with ID _6275_210_2084_1_1528110062213_7669049_857-298942-0_sp1.1 from training. Duration: 0.77275 2023-10-04 15:58:00,930 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6673_210_3273_1_1527767790_3173240_327-306185-0_sp1.1 from training. Duration: 0.7545625 2023-10-04 15:58:06,889 WARNING [train.py:1204] (2/4) Exclude cut with ID _61008_210_10633_1_1534852769198_6974370_814-107647-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 15:58:08,242 WARNING [train.py:1204] (2/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_116-332008-0_sp1.1 from training. Duration: 0.8909375 2023-10-04 15:58:09,781 WARNING [train.py:1204] (2/4) Exclude cut with ID _32704_210_8954_1_1533017488840_6747370_266-329755-0 from training. Duration: 0.89 2023-10-04 15:58:09,784 WARNING [train.py:1204] (2/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_271-269084-0_sp0.9 from training. Duration: 0.92225 2023-10-04 15:58:11,175 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_3171_210_2087_1_1526109084_2330007_66-260583-0 from training. Duration: 0.72 2023-10-04 15:58:15,894 INFO [train.py:1046] (2/4) Epoch 49, batch 1650, loss[loss=0.1659, simple_loss=0.251, pruned_loss=0.04044, over 24342.00 frames. ], tot_loss[loss=0.1524, simple_loss=0.2332, pruned_loss=0.0358, over 4727244.53 frames. ], batch size: 77, lr: 2.07e-03, grad_scale: 8.0 2023-10-04 15:58:17,334 WARNING [train.py:1204] (2/4) Exclude cut with ID _5250_210_5219_1_1527249561806_8692110_1326-276386-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 15:58:17,463 WARNING [train.py:1204] (2/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_372-30302-0_sp1.1 from training. Duration: 0.7818125 2023-10-04 15:58:18,835 WARNING [train.py:1204] (2/4) Exclude cut with ID _25882_210_3036_1_1532775665815_6479510_631-257029-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 15:58:18,843 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_3691_210_3528_1_1526468169_4062053_355-334432-0 from training. Duration: 0.87 2023-10-04 15:58:18,851 WARNING [train.py:1204] (2/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_839-173017-0_sp1.1 from training. Duration: 0.37275 2023-10-04 15:58:18,860 WARNING [train.py:1204] (2/4) Exclude cut with ID _14998_210_5219_1_1530705582080_7474970_441-91466-0 from training. Duration: 0.97 2023-10-04 15:58:20,216 WARNING [train.py:1204] (2/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_108-53929-0 from training. Duration: 0.59 2023-10-04 15:58:23,230 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.nonlin_attention.balancer.prob, batch_count=1710880.0, ans=0.125 2023-10-04 15:58:24,316 WARNING [train.py:1204] (2/4) Exclude cut with ID _9389_210_1664_1_1529217337301_5289269_260-1942-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 15:58:24,385 WARNING [train.py:1204] (2/4) Exclude cut with ID _56423_210_5854_1_1534896061049_3165969_198-61547-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 15:58:26,185 WARNING [train.py:1204] (2/4) Exclude cut with ID _44778_210_2054_1_1533798127203_7443090_412-142122-0_sp1.1 from training. Duration: 0.92725 2023-10-04 15:58:26,202 WARNING [train.py:1204] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_88-341718-0_sp1.1 from training. Duration: 0.6 2023-10-04 15:58:27,585 WARNING [train.py:1204] (2/4) Exclude cut with ID _61079_210_4525_1_1534921008198_4011419_334-298697-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 15:58:29,637 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_5465_210_4211_1_1527227253_55357_8-2499-0_sp1.1 from training. Duration: 0.996375 2023-10-04 15:58:32,499 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_447-201565-0_sp1.1 from training. Duration: 0.97275 2023-10-04 15:58:32,505 WARNING [train.py:1204] (2/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_253-161789-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 15:58:32,515 WARNING [train.py:1204] (2/4) Exclude cut with ID _44304_210_15374_1_1533639352765_4613129_302-22084-0_sp0.9 from training. Duration: 0.9555625 2023-10-04 15:58:32,524 WARNING [train.py:1204] (2/4) Exclude cut with ID _26664_210_7770_1_1532258943280_3418270_187-80815-0_sp1.1 from training. Duration: 0.8454375 2023-10-04 15:58:35,224 WARNING [train.py:1204] (2/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_68-212801-0 from training. Duration: 0.73 2023-10-04 15:58:35,269 WARNING [train.py:1204] (2/4) Exclude cut with ID _29345_210_4381_1_1532689168263_3860169_149-257268-0 from training. Duration: 0.81 2023-10-04 15:58:38,786 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.feed_forward3.hidden_balancer.prob, batch_count=1710946.6666666667, ans=0.125 2023-10-04 15:58:39,981 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_213-103282-0_sp1.1 from training. Duration: 0.7090625 2023-10-04 15:58:41,398 WARNING [train.py:1204] (2/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_156-302675-0_sp1.1 from training. Duration: 0.5 2023-10-04 15:58:50,367 WARNING [train.py:1204] (2/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_125-322172-0 from training. Duration: 0.93 2023-10-04 15:58:51,609 WARNING [train.py:1204] (2/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_493-60474-0_sp1.1 from training. Duration: 0.963625 2023-10-04 15:58:52,997 WARNING [train.py:1204] (2/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_156-302675-0 from training. Duration: 0.55 2023-10-04 15:58:55,677 WARNING [train.py:1204] (2/4) Exclude cut with ID _41314_210_12651_1_1533459476543_3766460_123-267862-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 15:58:59,015 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.0.feed_forward1.out_proj.dropout_p, batch_count=1711080.0, ans=0.1 2023-10-04 15:59:00,372 WARNING [train.py:1204] (2/4) Exclude cut with ID _28427_210_4220_1_1532581240967_3061069_201-143747-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 15:59:00,409 WARNING [train.py:1204] (2/4) Exclude cut with ID _8772_210_1850_1_1528635595972_3664011_96-12756-0_sp1.1 from training. Duration: 0.8909375 2023-10-04 15:59:02,215 WARNING [train.py:1204] (2/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_1347-145675-0_sp1.1 from training. Duration: 0.936375 2023-10-04 15:59:03,559 WARNING [train.py:1204] (2/4) Exclude cut with ID _9235_210_5219_1_1528891154253_8046120_387-159008-0_sp1.1 from training. Duration: 0.7818125 2023-10-04 15:59:03,585 WARNING [train.py:1204] (2/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_400-201896-0_sp1.1 from training. Duration: 0.963625 2023-10-04 15:59:05,068 WARNING [train.py:1204] (2/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_326-174612-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 15:59:05,621 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.4.encoder.layers.2.self_attn1.whiten, num_groups=1, num_channels=384, metric=12.20 vs. limit=22.5 2023-10-04 15:59:06,383 WARNING [train.py:1204] (2/4) Exclude cut with ID _55844_210_2346_1_1534471212569_7625707_390-116233-0_sp1.1 from training. Duration: 0.963625 2023-10-04 15:59:06,417 WARNING [train.py:1204] (2/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_419-317062-0_sp1.1 from training. Duration: 0.82725 2023-10-04 15:59:06,466 WARNING [train.py:1204] (2/4) Exclude cut with ID _29877_210_6228_1_1532397791106_5276149_266-62291-0_sp0.9 from training. Duration: 0.9666875 2023-10-04 15:59:07,823 WARNING [train.py:1204] (2/4) Exclude cut with ID _7857_210_1534_1_1528286515745_6831470_431-338528-0_sp1.1 from training. Duration: 0.92725 2023-10-04 15:59:09,644 WARNING [train.py:1204] (2/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_87-131427-0_sp0.9 from training. Duration: 0.8666875 2023-10-04 15:59:12,397 WARNING [train.py:1204] (2/4) Exclude cut with ID _24055_210_12651_1_1532310907143_976979_92-59356-0_sp1.1 from training. Duration: 0.82725 2023-10-04 15:59:14,457 WARNING [train.py:1204] (2/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_96-290039-0 from training. Duration: 0.56 2023-10-04 15:59:15,814 WARNING [train.py:1204] (2/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_48-182155-0_sp0.9 from training. Duration: 0.9666875 2023-10-04 15:59:15,851 WARNING [train.py:1204] (2/4) Exclude cut with ID _4626_210_3532_1_1526817652717_3916859_446-206656-0 from training. Duration: 0.76 2023-10-04 15:59:17,227 WARNING [train.py:1204] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_168-32906-0 from training. Duration: 0.69 2023-10-04 15:59:17,243 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_3780_210_2087_1_1526467760_2130483_170-259044-0 from training. Duration: 0.7 2023-10-04 15:59:18,578 WARNING [train.py:1204] (2/4) Exclude cut with ID _65134_210_14763_1_1535335393985_3505368_153-216718-0_sp1.1 from training. Duration: 0.92725 2023-10-04 15:59:19,924 WARNING [train.py:1204] (2/4) Exclude cut with ID _64639_210_5816_1_1535367619437_3528000_461-329156-0_sp1.1 from training. Duration: 0.87275 2023-10-04 15:59:19,963 WARNING [train.py:1204] (2/4) Exclude cut with ID _21989_210_5854_1_1531899214931_3271360_126-347531-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 15:59:20,010 WARNING [train.py:1204] (2/4) Exclude cut with ID _7614_210_4915_1_1529326764092_4537359_96-162197-0_sp1.1 from training. Duration: 0.963625 2023-10-04 15:59:20,011 WARNING [train.py:1204] (2/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_1083-147536-0 from training. Duration: 0.97 2023-10-04 15:59:24,211 WARNING [train.py:1204] (2/4) Exclude cut with ID _64029_210_4381_1_1535178502772_3756459_275-16710-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 15:59:25,618 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7264_210_4938_1_1527939911_8132095_800-91995-0_sp0.9 from training. Duration: 0.9555625 2023-10-04 15:59:26,894 WARNING [train.py:1204] (2/4) Exclude cut with ID _3844_210_1390_1_1526727803582_2871209_50-193879-0_sp1.1 from training. Duration: 0.936375 2023-10-04 15:59:28,320 WARNING [train.py:1204] (2/4) Exclude cut with ID _46952_210_5986_1_1534230010357_3107379_232-344503-0 from training. Duration: 0.96 2023-10-04 15:59:29,734 INFO [train.py:1046] (2/4) Epoch 49, batch 1700, loss[loss=0.1476, simple_loss=0.2199, pruned_loss=0.03762, over 23647.00 frames. ], tot_loss[loss=0.1526, simple_loss=0.233, pruned_loss=0.03611, over 4709928.44 frames. ], batch size: 256, lr: 2.07e-03, grad_scale: 8.0 2023-10-04 15:59:33,720 WARNING [train.py:1204] (2/4) Exclude cut with ID _25808_210_13292_1_1532343514562_7366109_206-14076-0_sp1.1 from training. Duration: 0.936375 2023-10-04 15:59:33,721 WARNING [train.py:1204] (2/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_55-31904-0_sp0.9 from training. Duration: 0.97775 2023-10-04 15:59:33,745 WARNING [train.py:1204] (2/4) Exclude cut with ID _14632_210_9988_1_1530691279392_3923200_161-129530-0 from training. Duration: 0.96 2023-10-04 15:59:35,106 WARNING [train.py:1204] (2/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_770-200430-0_sp1.1 from training. Duration: 0.8090625 2023-10-04 15:59:35,116 WARNING [train.py:1204] (2/4) Exclude cut with ID _6270_210_2084_1_1527850911573_3856420_26-277343-0_sp1.1 from training. Duration: 0.7454375 2023-10-04 15:59:35,117 WARNING [train.py:1204] (2/4) Exclude cut with ID _69884_210_9771_1_1535846392818_4166169_88-23776-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 15:59:37,822 WARNING [train.py:1204] (2/4) Exclude cut with ID _60860_210_8254_1_1534896015875_3557169_245-256666-0_sp1.1 from training. Duration: 0.8818125 2023-10-04 15:59:37,851 WARNING [train.py:1204] (2/4) Exclude cut with ID _5250_210_5219_1_1527249561806_8692110_636-251213-0_sp1.1 from training. Duration: 0.8545625 2023-10-04 15:59:37,870 WARNING [train.py:1204] (2/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_540-131164-0 from training. Duration: 0.92 2023-10-04 15:59:40,612 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_5931_210_2040_1_1527428481_4547999_321-193614-0_sp0.9 from training. Duration: 0.7444375 2023-10-04 15:59:44,086 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.787e+02 2.122e+02 2.407e+02 2.844e+02 4.213e+02, threshold=4.814e+02, percent-clipped=0.0 2023-10-04 15:59:47,603 WARNING [train.py:1204] (2/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_373-225403-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 15:59:50,206 WARNING [train.py:1204] (2/4) Exclude cut with ID _20622_210_3852_1_1531622427080_1093839_57-326027-0_sp1.1 from training. Duration: 0.97275 2023-10-04 15:59:55,672 WARNING [train.py:1204] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_605-257205-0_sp1.1 from training. Duration: 0.536375 2023-10-04 15:59:55,702 WARNING [train.py:1204] (2/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_442-320317-0_sp0.9 from training. Duration: 0.911125 2023-10-04 15:59:55,746 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_35361_210_4238_1_1533262810_7793476_675-87012-0_sp1.1 from training. Duration: 0.8090625 2023-10-04 15:59:55,780 WARNING [train.py:1204] (2/4) Exclude cut with ID _4184_210_2550_1_1526730192013_2219380_306-44736-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 15:59:58,568 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_124-113562-0 from training. Duration: 0.94 2023-10-04 16:00:00,565 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_2821_210_2715_1_1526108385_3763560_73-296673-0_sp0.9 from training. Duration: 0.92225 2023-10-04 16:00:00,590 WARNING [train.py:1204] (2/4) Exclude cut with ID _10301_210_5886_1_1529222275332_3910620_216-46959-0_sp1.1 from training. Duration: 0.92725 2023-10-04 16:00:02,006 WARNING [train.py:1204] (2/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_91-277684-0_sp0.9 from training. Duration: 0.82225 2023-10-04 16:00:02,110 WARNING [train.py:1204] (2/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_351-294650-0_sp0.9 from training. Duration: 0.7 2023-10-04 16:00:05,309 WARNING [train.py:1204] (2/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_209-205365-0 from training. Duration: 0.79 2023-10-04 16:00:05,370 WARNING [train.py:1204] (2/4) Exclude cut with ID _14603_210_4220_1_1530615688441_3097319_97-325053-0 from training. Duration: 0.92 2023-10-04 16:00:05,984 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.4.encoder.layers.1.feed_forward3.out_whiten, num_groups=1, num_channels=384, metric=12.11 vs. limit=15.0 2023-10-04 16:00:08,096 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_49348_210_7927_1_1533954500_3691300_109-276430-0_sp1.1 from training. Duration: 0.92725 2023-10-04 16:00:09,965 WARNING [train.py:1204] (2/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_912-47473-0 from training. Duration: 0.68 2023-10-04 16:00:12,000 WARNING [train.py:1204] (2/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_96-320736-0_sp0.9 from training. Duration: 0.9555625 2023-10-04 16:00:18,720 WARNING [train.py:1204] (2/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_1252-235481-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 16:00:20,093 WARNING [train.py:1204] (2/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_813-261554-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 16:00:20,168 WARNING [train.py:1204] (2/4) Exclude cut with ID _21632_210_2253_1_1531994001765_3744140_110-274654-0_sp0.9 from training. Duration: 0.911125 2023-10-04 16:00:22,911 WARNING [train.py:1204] (2/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_381-317791-0_sp1.1 from training. Duration: 0.663625 2023-10-04 16:00:22,924 WARNING [train.py:1204] (2/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_51-117576-0_sp1.1 from training. Duration: 0.363625 2023-10-04 16:00:22,941 WARNING [train.py:1204] (2/4) Exclude cut with ID _42321_210_7770_1_1533468631352_4366895_104-215915-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 16:00:25,680 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_40896_210_15710_1_1533433956_7100802_216-22824-0_sp1.1 from training. Duration: 0.92725 2023-10-04 16:00:25,681 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_14831_210_7927_1_1530700200_5583610_181-27467-0 from training. Duration: 0.91 2023-10-04 16:00:26,922 WARNING [train.py:1204] (2/4) Exclude cut with ID _30304_210_6319_1_1532476827261_7582060_1024-265645-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 16:00:26,925 WARNING [train.py:1204] (2/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_412-233904-0_sp1.1 from training. Duration: 0.936375 2023-10-04 16:00:28,237 WARNING [train.py:1204] (2/4) Exclude cut with ID _30191_210_1664_1_1533189915047_7196670_911-223461-0_sp1.1 from training. Duration: 0.92725 2023-10-04 16:00:28,245 WARNING [train.py:1204] (2/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_558-148266-0_sp1.1 from training. Duration: 0.963625 2023-10-04 16:00:30,249 WARNING [train.py:1204] (2/4) Exclude cut with ID _58480_210_12392_1_1534811121111_3646160_212-164495-0_sp1.1 from training. Duration: 0.936375 2023-10-04 16:00:30,250 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_454-276429-0_sp0.9 from training. Duration: 0.9666875 2023-10-04 16:00:31,626 WARNING [train.py:1204] (2/4) Exclude cut with ID _24736_210_3279_1_1532332894218_2838700_73-197086-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 16:00:31,678 WARNING [train.py:1204] (2/4) Exclude cut with ID _5294_210_1747_1_1527249643928_3696179_529-242298-0_sp1.1 from training. Duration: 0.863625 2023-10-04 16:00:32,949 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_53555_210_5632_1_1534595220_7232297_345-171438-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 16:00:35,720 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_28211_210_6388_1_1532861506_4523224_22-25766-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 16:00:35,803 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7002_210_1794_1_1527939683_5157286_163-87552-0 from training. Duration: 0.86 2023-10-04 16:00:38,024 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.balancer2.prob, batch_count=1711480.0, ans=0.125 2023-10-04 16:00:39,111 WARNING [train.py:1204] (2/4) Exclude cut with ID _56927_210_9969_1_1534848970840_3695229_409-225129-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 16:00:40,494 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_26192_210_7927_1_1532137269_1040115_21-219331-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 16:00:41,376 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.conv_module2.balancer2.min_positive, batch_count=1711480.0, ans=0.05 2023-10-04 16:00:42,443 WARNING [train.py:1204] (2/4) Exclude cut with ID _15012_210_1968_1_1530856820429_5095350_662-210469-0 from training. Duration: 0.57 2023-10-04 16:00:45,048 INFO [train.py:1046] (2/4) Epoch 49, batch 1750, loss[loss=0.1481, simple_loss=0.2188, pruned_loss=0.03874, over 22679.00 frames. ], tot_loss[loss=0.1519, simple_loss=0.2318, pruned_loss=0.03597, over 4703621.29 frames. ], batch size: 322, lr: 2.07e-03, grad_scale: 8.0 2023-10-04 16:00:48,135 WARNING [train.py:1204] (2/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_582-191016-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 16:00:49,560 WARNING [train.py:1204] (2/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_270-210032-0_sp1.1 from training. Duration: 0.963625 2023-10-04 16:00:50,864 WARNING [train.py:1204] (2/4) Exclude cut with ID _6789_210_1534_1_1527857937218_3118950_262-48853-0_sp0.9 from training. Duration: 0.62225 2023-10-04 16:00:50,936 WARNING [train.py:1204] (2/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_847-41334-0 from training. Duration: 0.44 2023-10-04 16:00:52,222 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_40896_210_15710_1_1533433956_7100802_573-46532-0_sp1.1 from training. Duration: 0.92725 2023-10-04 16:00:55,007 WARNING [train.py:1204] (2/4) Exclude cut with ID _21881_210_3525_1_1531825242757_3798380_247-102591-0_sp1.1 from training. Duration: 0.8909375 2023-10-04 16:00:55,030 WARNING [train.py:1204] (2/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_200-21395-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 16:01:00,440 WARNING [train.py:1204] (2/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_711-15688-0 from training. Duration: 0.43 2023-10-04 16:01:01,865 WARNING [train.py:1204] (2/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_53-75025-0_sp1.1 from training. Duration: 0.963625 2023-10-04 16:01:03,839 WARNING [train.py:1204] (2/4) Exclude cut with ID _6194_210_1534_1_1527511826160_3941194_321-148641-0 from training. Duration: 0.64 2023-10-04 16:01:03,862 WARNING [train.py:1204] (2/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_337-294431-0_sp1.1 from training. Duration: 0.936375 2023-10-04 16:01:07,158 WARNING [train.py:1204] (2/4) Exclude cut with ID _21632_210_2253_1_1531994001765_3744140_110-274654-0_sp1.1 from training. Duration: 0.7454375 2023-10-04 16:01:09,890 WARNING [train.py:1204] (2/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_135-174080-0_sp1.1 from training. Duration: 0.4818125 2023-10-04 16:01:09,996 WARNING [train.py:1204] (2/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_21-174514-0 from training. Duration: 0.43 2023-10-04 16:01:12,037 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_38965_210_4852_1_1533174906_3484735_215-294589-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 16:01:13,311 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_548-294316-0 from training. Duration: 0.81 2023-10-04 16:01:20,443 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_103-84010-0_sp1.1 from training. Duration: 0.62725 2023-10-04 16:01:21,886 WARNING [train.py:1204] (2/4) Exclude cut with ID _18199_210_6109_1_1531475789864_4288790_295-198247-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 16:01:21,924 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_13757_210_3528_1_1530440682_4107239_103-313224-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 16:01:24,700 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_34886_210_10633_1_1533203882_1755661_67-294673-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 16:01:24,709 WARNING [train.py:1204] (2/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_350-101734-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 16:01:27,362 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_37357_210_17024_1_1533089504_2316121_204-153986-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 16:01:27,479 WARNING [train.py:1204] (2/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_673-115569-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 16:01:30,936 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_489-230703-0_sp0.9 from training. Duration: 0.9444375 2023-10-04 16:01:30,996 WARNING [train.py:1204] (2/4) Exclude cut with ID _5250_210_5219_1_1527249561806_8692110_575-102496-0_sp1.1 from training. Duration: 0.936375 2023-10-04 16:01:32,347 WARNING [train.py:1204] (2/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_669-93163-0 from training. Duration: 0.98 2023-10-04 16:01:34,262 WARNING [train.py:1204] (2/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_249-281349-0_sp0.9 from training. Duration: 0.9444375 2023-10-04 16:01:36,974 WARNING [train.py:1204] (2/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_106-30782-0 from training. Duration: 0.8 2023-10-04 16:01:37,046 WARNING [train.py:1204] (2/4) Exclude cut with ID _8772_210_1850_1_1528635595972_3664011_398-320049-0_sp1.1 from training. Duration: 0.8 2023-10-04 16:01:40,033 WARNING [train.py:1204] (2/4) Exclude cut with ID _44778_210_2054_1_1533798127203_7443090_516-151727-0_sp1.1 from training. Duration: 0.963625 2023-10-04 16:01:40,096 WARNING [train.py:1204] (2/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_325-62802-0_sp1.1 from training. Duration: 0.7909375 2023-10-04 16:01:46,170 WARNING [train.py:1204] (2/4) Exclude cut with ID _14368_210_4220_1_1530532714870_3595529_93-58002-0_sp0.9 from training. Duration: 0.6333125 2023-10-04 16:01:46,220 WARNING [train.py:1204] (2/4) Exclude cut with ID _4681_210_3525_1_1527251040321_1235713_123-24428-0_sp1.1 from training. Duration: 0.436375 2023-10-04 16:01:47,583 WARNING [train.py:1204] (2/4) Exclude cut with ID _5250_210_5219_1_1527249561806_8692110_1185-56473-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 16:01:48,459 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.2.encoder.layers.0.nonlin_attention.whiten1, num_groups=1, num_channels=288, metric=7.22 vs. limit=10.0 2023-10-04 16:01:48,983 WARNING [train.py:1204] (2/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_96-244299-0_sp1.1 from training. Duration: 0.8 2023-10-04 16:01:53,123 WARNING [train.py:1204] (2/4) Exclude cut with ID _59500_210_8254_1_1534809610680_3736009_104-72815-0_sp1.1 from training. Duration: 0.963625 2023-10-04 16:01:55,940 WARNING [train.py:1204] (2/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_468-283113-0_sp1.1 from training. Duration: 0.87275 2023-10-04 16:01:57,342 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6239_210_4211_1_1527765584_4306698_528-102186-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 16:01:57,400 WARNING [train.py:1204] (2/4) Exclude cut with ID _45297_210_2721_1_1533715351381_3264949_351-147136-0 from training. Duration: 0.87 2023-10-04 16:01:57,404 WARNING [train.py:1204] (2/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_520-51744-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 16:01:58,775 INFO [train.py:1046] (2/4) Epoch 49, batch 1800, loss[loss=0.1455, simple_loss=0.2239, pruned_loss=0.03359, over 23621.00 frames. ], tot_loss[loss=0.1515, simple_loss=0.2316, pruned_loss=0.03572, over 4711477.26 frames. ], batch size: 149, lr: 2.07e-03, grad_scale: 8.0 2023-10-04 16:01:58,841 WARNING [train.py:1204] (2/4) Exclude cut with ID _5166_210_2084_1_1527073197160_4087993_310-114452-0_sp1.1 from training. Duration: 0.536375 2023-10-04 16:01:58,844 WARNING [train.py:1204] (2/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_450-165710-0_sp1.1 from training. Duration: 0.92725 2023-10-04 16:01:58,854 WARNING [train.py:1204] (2/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_280-120942-0_sp1.1 from training. Duration: 0.7 2023-10-04 16:01:58,875 WARNING [train.py:1204] (2/4) Exclude cut with ID _65585_210_12210_1_1535450451584_3764098_112-294301-0_sp0.9 from training. Duration: 0.97775 2023-10-04 16:01:58,937 WARNING [train.py:1204] (2/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_98-51676-0_sp0.9 from training. Duration: 0.811125 2023-10-04 16:02:03,542 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_33348_210_5632_1_1533086912_3324030_85-28583-0_sp1.1 from training. Duration: 0.7454375 2023-10-04 16:02:04,871 WARNING [train.py:1204] (2/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_336-208232-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 16:02:06,261 WARNING [train.py:1204] (2/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_339-104791-0_sp0.9 from training. Duration: 0.5444375 2023-10-04 16:02:09,569 WARNING [train.py:1204] (2/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_559-29840-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 16:02:11,049 WARNING [train.py:1204] (2/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_117-290916-0_sp0.9 from training. Duration: 0.5666875 2023-10-04 16:02:12,224 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.742e+02 2.071e+02 2.300e+02 2.752e+02 3.980e+02, threshold=4.601e+02, percent-clipped=0.0 2023-10-04 16:02:12,368 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_43802_210_2290_1_1533516203_8829504_185-112289-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 16:02:14,360 WARNING [train.py:1204] (2/4) Exclude cut with ID _59256_210_5986_1_1535076085997_3589120_155-298797-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 16:02:15,861 WARNING [train.py:1204] (2/4) Exclude cut with ID _44659_210_2721_1_1534036120061_6614860_246-206531-0_sp1.1 from training. Duration: 0.92725 2023-10-04 16:02:15,899 WARNING [train.py:1204] (2/4) Exclude cut with ID _23818_210_6228_1_1531917063884_3676920_469-151944-0_sp1.1 from training. Duration: 0.92725 2023-10-04 16:02:17,280 WARNING [train.py:1204] (2/4) Exclude cut with ID _13252_210_1925_1_1530671372047_3336909_88-276668-0_sp1.1 from training. Duration: 0.9 2023-10-04 16:02:20,036 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_15101_210_7927_1_1530786415_5033274_320-327527-0_sp1.1 from training. Duration: 0.87275 2023-10-04 16:02:20,045 WARNING [train.py:1204] (2/4) Exclude cut with ID _14998_210_5219_1_1530705582080_7474970_446-112117-0 from training. Duration: 0.9 2023-10-04 16:02:21,368 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_30280_210_1881_1_1532872779_3660139_400-254159-0_sp1.1 from training. Duration: 0.936375 2023-10-04 16:02:25,619 WARNING [train.py:1204] (2/4) Exclude cut with ID _40284_210_2084_1_1533513679814_3704279_158-345341-0_sp1.1 from training. Duration: 0.936375 2023-10-04 16:02:29,767 WARNING [train.py:1204] (2/4) Exclude cut with ID 3033-130750-0096-55598-0 from training. Duration: 0.83 2023-10-04 16:02:32,443 WARNING [train.py:1204] (2/4) Exclude cut with ID _9389_210_1664_1_1529217337301_5289269_379-128794-0 from training. Duration: 0.85 2023-10-04 16:02:32,468 WARNING [train.py:1204] (2/4) Exclude cut with ID _3675_210_4557_1_1526639934408_4182089_456-18305-0 from training. Duration: 0.76 2023-10-04 16:02:32,495 WARNING [train.py:1204] (2/4) Exclude cut with ID _29853_210_7497_1_1532505600851_6642110_571-249460-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 16:02:34,369 WARNING [train.py:1204] (2/4) Exclude cut with ID _4068_210_2629_1_1526697350395_1546259_230-325568-0_sp1.1 from training. Duration: 0.92725 2023-10-04 16:02:34,370 WARNING [train.py:1204] (2/4) Exclude cut with ID _34415_210_8750_1_1533085213375_3451116_213-267570-0_sp1.1 from training. Duration: 0.963625 2023-10-04 16:02:35,712 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6767_210_3273_1_1527855783_1718932_142-127132-0_sp1.1 from training. Duration: 0.863625 2023-10-04 16:02:43,286 WARNING [train.py:1204] (2/4) Exclude cut with ID IC0653W0001-322925 from training. Duration: 0.896 2023-10-04 16:02:44,073 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.1.encoder.layers.0.feed_forward2.out_whiten, num_groups=1, num_channels=256, metric=4.28 vs. limit=15.0 2023-10-04 16:02:44,600 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_4528_210_3273_1_1527076654_3276777_209-219804-0_sp1.1 from training. Duration: 0.62725 2023-10-04 16:02:46,818 WARNING [train.py:1204] (2/4) Exclude cut with ID _55844_210_2346_1_1534471212569_7625707_187-204971-0_sp1.1 from training. Duration: 0.936375 2023-10-04 16:02:48,201 WARNING [train.py:1204] (2/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_369-325184-0 from training. Duration: 0.99 2023-10-04 16:02:48,235 WARNING [train.py:1204] (2/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_791-45149-0 from training. Duration: 0.86 2023-10-04 16:02:49,585 WARNING [train.py:1204] (2/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_317-211107-0_sp1.1 from training. Duration: 0.836375 2023-10-04 16:02:49,685 WARNING [train.py:1204] (2/4) Exclude cut with ID _30800_210_22382_1_1532499347904_4515959_488-330913-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 16:02:51,011 WARNING [train.py:1204] (2/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_506-42614-0_sp1.1 from training. Duration: 0.7181875 2023-10-04 16:02:55,140 WARNING [train.py:1204] (2/4) Exclude cut with ID _8696_210_1876_1_1528545632440_3345058_209-54069-0 from training. Duration: 0.67 2023-10-04 16:03:00,607 WARNING [train.py:1204] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_215-202973-0_sp1.1 from training. Duration: 0.8 2023-10-04 16:03:00,684 WARNING [train.py:1204] (2/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_321-215742-0 from training. Duration: 0.5 2023-10-04 16:03:02,060 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_40896_210_15710_1_1533433956_7100802_428-32114-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 16:03:02,071 WARNING [train.py:1204] (2/4) Exclude cut with ID _27013_210_1389_1_1532437173240_3252649_467-263151-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 16:03:02,138 WARNING [train.py:1204] (2/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_734-129761-0_sp0.9 from training. Duration: 0.911125 2023-10-04 16:03:03,452 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_521-214276-0 from training. Duration: 0.44 2023-10-04 16:03:06,802 WARNING [train.py:1204] (2/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_80-147301-0_sp1.1 from training. Duration: 0.763625 2023-10-04 16:03:06,804 WARNING [train.py:1204] (2/4) Exclude cut with ID _9679_210_7497_1_1529129212906_4363929_56-307796-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 16:03:09,895 WARNING [train.py:1204] (2/4) Exclude cut with ID _14832_210_8750_1_1530691351539_3171348_1-54315-0 from training. Duration: 0.87 2023-10-04 16:03:09,896 WARNING [train.py:1204] (2/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_558-155750-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 16:03:12,564 INFO [train.py:1046] (2/4) Epoch 49, batch 1850, loss[loss=0.1342, simple_loss=0.2203, pruned_loss=0.02407, over 24445.00 frames. ], tot_loss[loss=0.1516, simple_loss=0.2317, pruned_loss=0.0358, over 4714254.40 frames. ], batch size: 58, lr: 2.07e-03, grad_scale: 8.0 2023-10-04 16:03:12,653 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_142-309370-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 16:03:12,689 WARNING [train.py:1204] (2/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_18-46756-0_sp0.9 from training. Duration: 0.87775 2023-10-04 16:03:12,696 WARNING [train.py:1204] (2/4) Exclude cut with ID _61148_210_6897_1_1535094022726_4217610_279-212159-0_sp1.1 from training. Duration: 0.936375 2023-10-04 16:03:15,337 WARNING [train.py:1204] (2/4) Exclude cut with ID _21987_210_5854_1_1531821500482_3637038_164-2641-0_sp1.1 from training. Duration: 0.936375 2023-10-04 16:03:15,373 WARNING [train.py:1204] (2/4) Exclude cut with ID _8064_210_5399_1_1528372299291_4282973_395-340852-0_sp1.1 from training. Duration: 0.8454375 2023-10-04 16:03:16,428 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.5.encoder.layers.0.nonlin_attention.whiten2, num_groups=1, num_channels=256, metric=3.26 vs. limit=15.0 2023-10-04 16:03:18,428 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1077-184261-0_sp1.1 from training. Duration: 0.963625 2023-10-04 16:03:18,436 WARNING [train.py:1204] (2/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_1176-204992-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 16:03:19,911 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.0.ff2_skip_rate, batch_count=1712213.3333333333, ans=0.0 2023-10-04 16:03:21,134 WARNING [train.py:1204] (2/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_563-347832-0_sp1.1 from training. Duration: 0.8090625 2023-10-04 16:03:22,542 WARNING [train.py:1204] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_380-204756-0_sp0.9 from training. Duration: 0.9555625 2023-10-04 16:03:28,173 WARNING [train.py:1204] (2/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_212-10501-0_sp1.1 from training. Duration: 0.8545625 2023-10-04 16:03:28,196 WARNING [train.py:1204] (2/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_445-117398-0 from training. Duration: 0.89 2023-10-04 16:03:31,006 WARNING [train.py:1204] (2/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_29-280622-0 from training. Duration: 0.82 2023-10-04 16:03:32,537 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.ff2_skip_rate, batch_count=1712280.0, ans=0.0 2023-10-04 16:03:35,368 WARNING [train.py:1204] (2/4) Exclude cut with ID _26664_210_7770_1_1532258943280_3418270_187-80815-0 from training. Duration: 0.93 2023-10-04 16:03:38,076 WARNING [train.py:1204] (2/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_414-349640-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 16:03:38,102 WARNING [train.py:1204] (2/4) Exclude cut with ID _45021_210_9994_1_1533619751094_3330288_96-239010-0 from training. Duration: 0.91 2023-10-04 16:03:38,104 WARNING [train.py:1204] (2/4) Exclude cut with ID _5189_210_4557_1_1527248319428_3769229_199-53526-0_sp0.9 from training. Duration: 0.4666875 2023-10-04 16:03:41,620 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.3.encoder.layers.3.nonlin_attention.whiten1, num_groups=1, num_channels=384, metric=4.96 vs. limit=10.0 2023-10-04 16:03:46,284 WARNING [train.py:1204] (2/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_63-101996-0_sp1.1 from training. Duration: 0.8 2023-10-04 16:03:47,602 WARNING [train.py:1204] (2/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_199-86432-0 from training. Duration: 0.98 2023-10-04 16:03:50,388 WARNING [train.py:1204] (2/4) Exclude cut with ID _36399_210_16049_1_1532998771377_3698333_207-259013-0_sp1.1 from training. Duration: 0.92725 2023-10-04 16:03:50,424 WARNING [train.py:1204] (2/4) Exclude cut with ID _62574_210_2655_1_1535180407058_3933249_567-111756-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 16:03:54,603 WARNING [train.py:1204] (2/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_436-318113-0 from training. Duration: 0.75 2023-10-04 16:03:54,641 WARNING [train.py:1204] (2/4) Exclude cut with ID _53379_210_20034_1_1534222027363_3854654_43-305214-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 16:03:54,664 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_15126_210_6753_1_1530778677_2591185_113-157651-0_sp1.1 from training. Duration: 0.6181875 2023-10-04 16:03:56,150 WARNING [train.py:1204] (2/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_146-218763-0_sp1.1 from training. Duration: 0.7818125 2023-10-04 16:03:56,372 INFO [scaling.py:1118] (2/4) WithLoss: name=encoder.encoders.4.encoder.layers.0.self_attn_weights, loss-sum=0.000e+00 2023-10-04 16:03:58,687 WARNING [train.py:1204] (2/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_714-310538-0_sp0.9 from training. Duration: 0.9555625 2023-10-04 16:04:00,560 WARNING [train.py:1204] (2/4) Exclude cut with ID _24271_210_12658_1_1531987270022_7190519_709-16721-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 16:04:04,683 WARNING [train.py:1204] (2/4) Exclude cut with ID _5190_210_4557_1_1527244368799_373439_28-197030-0_sp0.9 from training. Duration: 0.8 2023-10-04 16:04:04,722 WARNING [train.py:1204] (2/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_267-197536-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 16:04:05,911 WARNING [train.py:1204] (2/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_658-282610-0_sp1.1 from training. Duration: 0.4818125 2023-10-04 16:04:05,920 WARNING [train.py:1204] (2/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_257-67050-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 16:04:07,421 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_14906_210_7927_1_1530754190_9408633_961-53402-0_sp1.1 from training. Duration: 0.936375 2023-10-04 16:04:08,903 WARNING [train.py:1204] (2/4) Exclude cut with ID _60113_210_16271_1_1535164212481_4386079_490-280290-0_sp1.1 from training. Duration: 0.8909375 2023-10-04 16:04:11,647 WARNING [train.py:1204] (2/4) Exclude cut with ID _14100_210_8341_1_1530529320613_3678069_258-224599-0 from training. Duration: 0.77 2023-10-04 16:04:11,702 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6961_210_2087_1_1527937168_6815011_565-326898-0_sp1.1 from training. Duration: 0.936375 2023-10-04 16:04:11,852 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.feed_forward1.hidden_balancer.prob, batch_count=1712480.0, ans=0.125 2023-10-04 16:04:14,446 WARNING [train.py:1204] (2/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_239-316802-0_sp1.1 from training. Duration: 0.62725 2023-10-04 16:04:15,134 WARNING [train.py:1204] (2/4) Exclude cut with ID _55857_210_2346_1_1534644007831_6898209_745-98391-0_sp1.1 from training. Duration: 0.6909375 2023-10-04 16:04:15,137 WARNING [train.py:1204] (2/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_154-135696-0 from training. Duration: 0.8 2023-10-04 16:04:15,139 WARNING [train.py:1204] (2/4) Exclude cut with ID _14368_210_4220_1_1530532714870_3595529_93-58002-0 from training. Duration: 0.57 2023-10-04 16:04:18,376 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9025W0005-507335 from training. Duration: 0.896 2023-10-04 16:04:18,457 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9029W0034-509435 from training. Duration: 0.64 2023-10-04 16:04:19,953 WARNING [train.py:1204] (2/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_76-203867-0_sp1.1 from training. Duration: 0.6545625 2023-10-04 16:04:19,962 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_46639_210_8341_1_1533726088_4495500_66-346325-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 16:04:21,271 WARNING [train.py:1204] (2/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_145-137790-0_sp0.9 from training. Duration: 0.92225 2023-10-04 16:04:21,286 WARNING [train.py:1204] (2/4) Exclude cut with ID _36522_210_8887_1_1533177030611_1772478_7-327299-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 16:04:21,370 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9042W0009-515615 from training. Duration: 0.896 2023-10-04 16:04:22,543 WARNING [train.py:1204] (2/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_39-248870-0_sp0.9 from training. Duration: 0.7666875 2023-10-04 16:04:22,588 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_35441_210_16554_1_1533347572_4092680_362-242912-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 16:04:23,957 WARNING [train.py:1204] (2/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_216-343007-0_sp1.1 from training. Duration: 0.6 2023-10-04 16:04:25,278 INFO [train.py:1046] (2/4) Epoch 49, batch 1900, loss[loss=0.144, simple_loss=0.2207, pruned_loss=0.03361, over 23412.00 frames. ], tot_loss[loss=0.1524, simple_loss=0.2327, pruned_loss=0.03603, over 4720068.14 frames. ], batch size: 285, lr: 2.07e-03, grad_scale: 8.0 2023-10-04 16:04:25,399 WARNING [train.py:1204] (2/4) Exclude cut with ID _3867_210_1883_1_1526641200575_4475830_272-215563-0_sp0.9 from training. Duration: 0.6333125 2023-10-04 16:04:26,654 WARNING [train.py:1204] (2/4) Exclude cut with ID _21783_210_11357_1_1531742458730_3762679_553-175603-0_sp1.1 from training. Duration: 0.92725 2023-10-04 16:04:26,662 WARNING [train.py:1204] (2/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_963-187109-0 from training. Duration: 0.99 2023-10-04 16:04:28,099 WARNING [train.py:1204] (2/4) Exclude cut with ID _44973_210_1511_1_1533794381163_3634770_495-211466-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 16:04:29,914 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9068W0002-529085 from training. Duration: 0.896 2023-10-04 16:04:29,936 WARNING [train.py:1204] (2/4) Exclude cut with ID _5294_210_1747_1_1527249643928_3696179_180-100713-0_sp1.1 from training. Duration: 0.736375 2023-10-04 16:04:31,279 WARNING [train.py:1204] (2/4) Exclude cut with ID _35799_210_5854_1_1533263487887_3529071_149-49689-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 16:04:35,636 WARNING [train.py:1204] (2/4) Exclude cut with ID _67725_210_27767_1_1535608756671_5105359_36-220794-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 16:04:38,341 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.723e+02 2.070e+02 2.219e+02 2.494e+02 3.485e+02, threshold=4.439e+02, percent-clipped=0.0 2023-10-04 16:04:38,448 WARNING [train.py:1204] (2/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_338-96520-0_sp1.1 from training. Duration: 0.8545625 2023-10-04 16:04:39,841 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9104W0004-547160 from training. Duration: 0.896 2023-10-04 16:04:41,204 WARNING [train.py:1204] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_330-327861-0 from training. Duration: 0.67 2023-10-04 16:04:42,608 WARNING [train.py:1204] (2/4) Exclude cut with ID _14087_210_2253_1_1530529224512_3014029_156-257607-0_sp0.9 from training. Duration: 0.92225 2023-10-04 16:04:42,661 WARNING [train.py:1204] (2/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_476-322050-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 16:04:43,930 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9118W0017-553385 from training. Duration: 0.896 2023-10-04 16:04:43,965 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9120W0028-554435 from training. Duration: 0.896 2023-10-04 16:04:48,576 WARNING [train.py:1204] (2/4) Exclude cut with ID _56524_210_9642_1_1534572011779_3619170_145-310842-0 from training. Duration: 0.99 2023-10-04 16:04:50,382 WARNING [train.py:1204] (2/4) Exclude cut with ID _51631_210_8254_1_1534129228420_4084794_270-222508-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 16:04:51,232 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.2.encoder.layers.1.whiten, num_groups=1, num_channels=384, metric=4.60 vs. limit=12.0 2023-10-04 16:04:52,376 WARNING [train.py:1204] (2/4) Exclude cut with ID _14268_210_9206_1_1530622771285_4714099_331-105351-0 from training. Duration: 0.65 2023-10-04 16:04:55,158 WARNING [train.py:1204] (2/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_40-129756-0 from training. Duration: 0.81 2023-10-04 16:05:02,488 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.feed_forward1.hidden_balancer.prob, batch_count=1712680.0, ans=0.125 2023-10-04 16:05:03,722 WARNING [train.py:1204] (2/4) Exclude cut with ID _3541_210_2477_1_1526475408709_3263973_231-104736-0 from training. Duration: 0.78 2023-10-04 16:05:06,463 WARNING [train.py:1204] (2/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_214-291369-0 from training. Duration: 0.63 2023-10-04 16:05:06,469 WARNING [train.py:1204] (2/4) Exclude cut with ID _62574_210_2655_1_1535180407058_3933249_369-84347-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 16:05:07,795 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9210W0111-599240 from training. Duration: 0.896 2023-10-04 16:05:07,805 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_92-246270-0 from training. Duration: 0.85 2023-10-04 16:05:09,266 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_37152_210_9697_1_1533106573_3844188_29-239070-0 from training. Duration: 0.98 2023-10-04 16:05:09,323 WARNING [train.py:1204] (2/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_239-22076-0 from training. Duration: 0.85 2023-10-04 16:05:09,325 WARNING [train.py:1204] (2/4) Exclude cut with ID _65962_210_29927_1_1535436026519_4174930_368-44639-0_sp1.1 from training. Duration: 0.97275 2023-10-04 16:05:11,021 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.self_attn_weights.pos_emb_skip_rate, batch_count=1712746.6666666667, ans=0.0 2023-10-04 16:05:13,503 WARNING [train.py:1204] (2/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_152-212533-0 from training. Duration: 0.97 2023-10-04 16:05:16,316 WARNING [train.py:1204] (2/4) Exclude cut with ID _14603_210_4220_1_1530615688441_3097319_90-31383-0_sp1.1 from training. Duration: 0.7909375 2023-10-04 16:05:21,452 WARNING [train.py:1204] (2/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_196-152868-0_sp1.1 from training. Duration: 0.92725 2023-10-04 16:05:21,454 WARNING [train.py:1204] (2/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_140-181176-0 from training. Duration: 0.92 2023-10-04 16:05:21,580 WARNING [train.py:1204] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_168-32906-0_sp0.9 from training. Duration: 0.7666875 2023-10-04 16:05:24,941 WARNING [train.py:1204] (2/4) Exclude cut with ID _14922_210_6228_1_1530707435273_3693310_13-165512-0 from training. Duration: 0.82 2023-10-04 16:05:25,062 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.1.balancer1.prob, batch_count=1712813.3333333333, ans=0.125 2023-10-04 16:05:26,286 WARNING [train.py:1204] (2/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_381-317791-0_sp0.9 from training. Duration: 0.811125 2023-10-04 16:05:31,643 WARNING [train.py:1204] (2/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_63-267585-0_sp1.1 from training. Duration: 0.8181875 2023-10-04 16:05:31,645 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_326-303945-0_sp1.1 from training. Duration: 0.9 2023-10-04 16:05:33,402 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_194-143381-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 16:05:33,480 WARNING [train.py:1204] (2/4) Exclude cut with ID _39290_210_4220_1_1533862778760_3075118_194-16667-0_sp1.1 from training. Duration: 0.963625 2023-10-04 16:05:34,873 WARNING [train.py:1204] (2/4) Exclude cut with ID _15012_210_1968_1_1530856820429_5095350_662-210469-0_sp0.9 from training. Duration: 0.6333125 2023-10-04 16:05:34,907 WARNING [train.py:1204] (2/4) Exclude cut with ID _4697_210_2069_1_1526815801953_3823098_68-219807-0_sp1.1 from training. Duration: 0.42725 2023-10-04 16:05:36,340 WARNING [train.py:1204] (2/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_321-22821-0_sp0.9 from training. Duration: 0.988875 2023-10-04 16:05:38,999 INFO [train.py:1046] (2/4) Epoch 49, batch 1950, loss[loss=0.1577, simple_loss=0.2353, pruned_loss=0.04003, over 23392.00 frames. ], tot_loss[loss=0.1524, simple_loss=0.2329, pruned_loss=0.03593, over 4715186.74 frames. ], batch size: 285, lr: 2.07e-03, grad_scale: 8.0 2023-10-04 16:05:39,037 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_14056_210_4163_1_1530604483_7572827_1037-45317-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 16:05:39,039 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_403-39619-0_sp1.1 from training. Duration: 0.763625 2023-10-04 16:05:40,468 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_20120_210_6753_1_1531643035_5482859_510-96996-0_sp0.9 from training. Duration: 0.9666875 2023-10-04 16:05:40,470 WARNING [train.py:1204] (2/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_225-180155-0_sp1.1 from training. Duration: 0.92725 2023-10-04 16:05:40,526 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_278-272612-0_sp0.9 from training. Duration: 0.811125 2023-10-04 16:05:41,907 WARNING [train.py:1204] (2/4) Exclude cut with ID _53713_210_24116_1_1534314487762_7416751_691-70244-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 16:05:44,688 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6788_210_1388_1_1527898698_4562021_91-197981-0_sp1.1 from training. Duration: 0.7545625 2023-10-04 16:05:46,108 WARNING [train.py:1204] (2/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_60-18123-0_sp0.9 from training. Duration: 0.97775 2023-10-04 16:05:46,973 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.conv_module2.balancer2.prob, batch_count=1712880.0, ans=0.125 2023-10-04 16:05:48,067 WARNING [train.py:1204] (2/4) Exclude cut with ID _35799_210_5854_1_1533263487887_3529071_5-214617-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 16:05:48,073 WARNING [train.py:1204] (2/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_890-5117-0_sp1.1 from training. Duration: 0.7090625 2023-10-04 16:05:50,644 WARNING [train.py:1204] (2/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_660-211088-0 from training. Duration: 0.62 2023-10-04 16:05:52,005 WARNING [train.py:1204] (2/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_314-126819-0_sp0.9 from training. Duration: 0.5555625 2023-10-04 16:05:52,046 WARNING [train.py:1204] (2/4) Exclude cut with ID _21632_210_2253_1_1531994001765_3744140_362-66225-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 16:05:53,994 WARNING [train.py:1204] (2/4) Exclude cut with ID _61118_210_9527_1_1534897668884_3752630_173-258555-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 16:05:55,849 WARNING [train.py:1204] (2/4) Exclude cut with ID _7719_210_3525_1_1528456153163_4296960_88-250259-0_sp0.9 from training. Duration: 0.9333125 2023-10-04 16:05:57,168 WARNING [train.py:1204] (2/4) Exclude cut with ID _15140_210_9288_1_1530791388450_4880149_539-153708-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 16:05:57,204 WARNING [train.py:1204] (2/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_253-42544-0_sp1.1 from training. Duration: 0.97275 2023-10-04 16:05:58,732 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.0.feed_forward1.out_proj.dropout_p, batch_count=1712946.6666666667, ans=0.1 2023-10-04 16:05:58,743 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.1.conv_module2.balancer2.prob, batch_count=1712946.6666666667, ans=0.125 2023-10-04 16:06:00,030 WARNING [train.py:1204] (2/4) Exclude cut with ID _33640_210_2655_1_1533117605479_4855469_479-296877-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 16:06:02,691 WARNING [train.py:1204] (2/4) Exclude cut with ID 3033-130750-0096-55598-0_sp1.1 from training. Duration: 0.7545625 2023-10-04 16:06:02,702 WARNING [train.py:1204] (2/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_191-41023-0_sp1.1 from training. Duration: 0.6454375 2023-10-04 16:06:02,718 WARNING [train.py:1204] (2/4) Exclude cut with ID _8365_210_2064_1_1528592311070_4677690_431-294102-0_sp1.1 from training. Duration: 0.8090625 2023-10-04 16:06:02,753 WARNING [train.py:1204] (2/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_893-296030-0_sp1.1 from training. Duration: 0.97275 2023-10-04 16:06:06,147 WARNING [train.py:1204] (2/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_401-343820-0_sp1.1 from training. Duration: 0.97275 2023-10-04 16:06:07,668 WARNING [train.py:1204] (2/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_127-566-0_sp1.1 from training. Duration: 0.763625 2023-10-04 16:06:07,669 WARNING [train.py:1204] (2/4) Exclude cut with ID _14100_210_8341_1_1530529320613_3678069_126-272847-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 16:06:07,683 WARNING [train.py:1204] (2/4) Exclude cut with ID _15133_210_6228_1_1530794512074_3080379_29-264458-0_sp1.1 from training. Duration: 0.52725 2023-10-04 16:06:07,683 WARNING [train.py:1204] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_127-119109-0 from training. Duration: 0.72 2023-10-04 16:06:09,050 WARNING [train.py:1204] (2/4) Exclude cut with ID _6537_210_1883_1_1527987559710_3597129_382-133062-0_sp1.1 from training. Duration: 0.6181875 2023-10-04 16:06:09,068 WARNING [train.py:1204] (2/4) Exclude cut with ID _29529_210_7234_1_1532393961455_3977460_391-219682-0_sp1.1 from training. Duration: 0.936375 2023-10-04 16:06:10,419 WARNING [train.py:1204] (2/4) Exclude cut with ID _37537_210_2253_1_1533171758568_3421949_96-137189-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 16:06:14,431 WARNING [train.py:1204] (2/4) Exclude cut with ID _58414_210_8254_1_1535421609162_7151182_15-276301-0_sp1.1 from training. Duration: 0.97275 2023-10-04 16:06:15,902 WARNING [train.py:1204] (2/4) Exclude cut with ID _59502_210_12210_1_1535107024698_1835139_110-289074-0_sp1.1 from training. Duration: 0.92725 2023-10-04 16:06:23,012 WARNING [train.py:1204] (2/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_367-61330-0_sp1.1 from training. Duration: 0.6818125 2023-10-04 16:06:25,042 WARNING [train.py:1204] (2/4) Exclude cut with ID _45297_210_2721_1_1533715351381_3264949_353-9541-0_sp1.1 from training. Duration: 0.8818125 2023-10-04 16:06:25,069 WARNING [train.py:1204] (2/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_85-138257-0_sp0.9 from training. Duration: 0.788875 2023-10-04 16:06:26,379 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_3787_210_2040_1_1526477544_5478586_603-120324-0 from training. Duration: 0.81 2023-10-04 16:06:26,428 WARNING [train.py:1204] (2/4) Exclude cut with ID _42863_210_9070_1_1533902398788_3696920_411-204131-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 16:06:26,638 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.ff2_skip_rate, batch_count=1713080.0, ans=0.0 2023-10-04 16:06:31,135 WARNING [train.py:1204] (2/4) Exclude cut with ID _30196_210_1664_1_1533708496522_7074140_822-289909-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 16:06:32,489 WARNING [train.py:1204] (2/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_32-335103-0_sp0.9 from training. Duration: 0.911125 2023-10-04 16:06:32,541 WARNING [train.py:1204] (2/4) Exclude cut with ID _6493_210_1747_1_1528459210424_4012470_513-140967-0_sp0.9 from training. Duration: 0.87775 2023-10-04 16:06:34,550 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.5.encoder.layers.1.conv_module1.whiten, num_groups=1, num_channels=256, metric=3.40 vs. limit=15.0 2023-10-04 16:06:38,893 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_47896_210_20508_1_1534492518_5958868_439-257766-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 16:06:40,175 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_1294-63152-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 16:06:41,625 WARNING [train.py:1204] (2/4) Exclude cut with ID _59170_210_6721_1_1534983633960_4203335_319-166512-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 16:06:43,053 WARNING [train.py:1204] (2/4) Exclude cut with ID _3541_210_2477_1_1526475408709_3263973_146-3470-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 16:06:44,503 WARNING [train.py:1204] (2/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_678-159307-0_sp1.1 from training. Duration: 0.77275 2023-10-04 16:06:45,872 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_343-48822-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 16:06:47,200 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_4692_210_3528_1_1526899755_4323254_107-44401-0 from training. Duration: 0.86 2023-10-04 16:06:47,213 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6291_210_3273_1_1527683649_1078498_74-292608-0_sp1.1 from training. Duration: 0.6545625 2023-10-04 16:06:49,225 WARNING [train.py:1204] (2/4) Exclude cut with ID _55644_210_11856_1_1534593599582_4103719_293-248694-0_sp1.1 from training. Duration: 0.97275 2023-10-04 16:06:49,307 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_3172_210_2087_1_1526292929_3987865_197-97762-0 from training. Duration: 0.75 2023-10-04 16:06:51,838 WARNING [train.py:1204] (2/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_264-348896-0_sp0.9 from training. Duration: 0.9444375 2023-10-04 16:06:53,174 INFO [train.py:1046] (2/4) Epoch 49, batch 2000, loss[loss=0.1433, simple_loss=0.2323, pruned_loss=0.02717, over 24506.00 frames. ], tot_loss[loss=0.153, simple_loss=0.2335, pruned_loss=0.0362, over 4707646.74 frames. ], batch size: 66, lr: 2.07e-03, grad_scale: 16.0 2023-10-04 16:06:55,283 WARNING [train.py:1204] (2/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_209-205365-0_sp0.9 from training. Duration: 0.87775 2023-10-04 16:06:55,582 INFO [scaling.py:1118] (2/4) WithLoss: name=encoder.encoders.5.encoder.layers.0.self_attn_weights, loss-sum=0.000e+00 2023-10-04 16:06:55,800 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.4.encoder.layers.2.feed_forward1.out_whiten, num_groups=1, num_channels=384, metric=10.70 vs. limit=15.0 2023-10-04 16:06:57,126 WARNING [train.py:1204] (2/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_222-198127-0_sp0.9 from training. Duration: 0.9333125 2023-10-04 16:06:57,166 WARNING [train.py:1204] (2/4) Exclude cut with ID _57572_210_5517_1_1534921219624_3809484_52-43530-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 16:06:58,565 WARNING [train.py:1204] (2/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_96-320736-0_sp1.1 from training. Duration: 0.7818125 2023-10-04 16:06:59,999 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_13091_210_7925_1_1530609447_8987354_1159-225508-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 16:07:04,627 WARNING [train.py:1204] (2/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_237-44332-0 from training. Duration: 0.94 2023-10-04 16:07:05,855 WARNING [train.py:1204] (2/4) Exclude cut with ID _7970_210_1883_1_1528455616296_3472230_25-178407-0_sp0.9 from training. Duration: 0.788875 2023-10-04 16:07:07,143 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.576e+02 2.055e+02 2.276e+02 2.688e+02 4.506e+02, threshold=4.553e+02, percent-clipped=2.0 2023-10-04 16:07:08,683 WARNING [train.py:1204] (2/4) Exclude cut with ID _29003_210_19596_1_1532507391720_3894098_357-257062-0_sp1.1 from training. Duration: 0.936375 2023-10-04 16:07:08,822 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_809-17156-0 from training. Duration: 0.73 2023-10-04 16:07:10,215 WARNING [train.py:1204] (2/4) Exclude cut with ID _7160_210_1876_1_1527940923231_3703799_302-150728-0_sp1.1 from training. Duration: 0.7090625 2023-10-04 16:07:10,230 WARNING [train.py:1204] (2/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_444-28041-0_sp0.9 from training. Duration: 0.9444375 2023-10-04 16:07:12,910 WARNING [train.py:1204] (2/4) Exclude cut with ID _52892_210_6721_1_1534378941259_4124129_278-251666-0_sp1.1 from training. Duration: 0.92725 2023-10-04 16:07:14,227 WARNING [train.py:1204] (2/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_115-326778-0 from training. Duration: 0.93 2023-10-04 16:07:15,561 WARNING [train.py:1204] (2/4) Exclude cut with ID _41391_210_7994_1_1533603771872_3638201_31-188961-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 16:07:16,993 WARNING [train.py:1204] (2/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_1073-6264-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 16:07:17,030 WARNING [train.py:1204] (2/4) Exclude cut with ID _66868_210_9527_1_1535509287954_4561140_435-259515-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 16:07:19,035 WARNING [train.py:1204] (2/4) Exclude cut with ID _14886_210_4220_1_1530702020355_3516190_159-73748-0 from training. Duration: 0.83 2023-10-04 16:07:19,084 WARNING [train.py:1204] (2/4) Exclude cut with ID _15150_210_8341_1_1530752411803_7256384_469-239509-0_sp1.1 from training. Duration: 0.6181875 2023-10-04 16:07:20,783 WARNING [train.py:1204] (2/4) Exclude cut with ID _8064_210_5399_1_1528372299291_4282973_395-340852-0 from training. Duration: 0.93 2023-10-04 16:07:20,786 WARNING [train.py:1204] (2/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_138-187031-0_sp1.1 from training. Duration: 0.9 2023-10-04 16:07:25,302 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_13757_210_3528_1_1530440682_4107239_48-106347-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 16:07:27,089 WARNING [train.py:1204] (2/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_164-224052-0_sp1.1 from training. Duration: 0.636375 2023-10-04 16:07:27,101 WARNING [train.py:1204] (2/4) Exclude cut with ID _59288_210_5854_1_1535077757774_3412246_408-322648-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 16:07:27,143 WARNING [train.py:1204] (2/4) Exclude cut with ID _30191_210_1664_1_1533189915047_7196670_827-36359-0_sp1.1 from training. Duration: 0.963625 2023-10-04 16:07:28,505 WARNING [train.py:1204] (2/4) Exclude cut with ID _4680_210_3525_1_1527249757953_893386_75-78979-0_sp1.1 from training. Duration: 0.82725 2023-10-04 16:07:29,846 WARNING [train.py:1204] (2/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_383-165234-0 from training. Duration: 0.94 2023-10-04 16:07:30,049 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.ff3_skip_rate, batch_count=1713346.6666666667, ans=0.0 2023-10-04 16:07:32,607 WARNING [train.py:1204] (2/4) Exclude cut with ID _19896_210_1883_1_1531709022662_6649569_94-229927-0 from training. Duration: 0.97 2023-10-04 16:07:32,613 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6788_210_1388_1_1527898698_4562021_180-75288-0_sp1.1 from training. Duration: 0.9 2023-10-04 16:07:32,627 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_26433_210_12108_1_1532157158_4056480_54-321349-0_sp1.1 from training. Duration: 0.97275 2023-10-04 16:07:38,703 WARNING [train.py:1204] (2/4) Exclude cut with ID _56108_210_16380_1_1534505446159_3887959_265-161493-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 16:07:40,215 WARNING [train.py:1204] (2/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_395-111602-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 16:07:40,221 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_154-316450-0_sp0.9 from training. Duration: 0.8444375 2023-10-04 16:07:40,283 WARNING [train.py:1204] (2/4) Exclude cut with ID _43552_210_8913_1_1533726031059_3523429_381-116041-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 16:07:41,069 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.1.encoder.layers.0.feed_forward3.out_whiten, num_groups=1, num_channels=256, metric=11.57 vs. limit=15.0 2023-10-04 16:07:41,687 WARNING [train.py:1204] (2/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_65-104124-0_sp1.1 from training. Duration: 0.963625 2023-10-04 16:07:41,748 WARNING [train.py:1204] (2/4) Exclude cut with ID _6536_210_1883_1_1527850783339_3319069_169-23811-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 16:07:43,081 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_289-180904-0_sp0.9 from training. Duration: 0.8444375 2023-10-04 16:07:43,098 WARNING [train.py:1204] (2/4) Exclude cut with ID _56406_210_9288_1_1534899618664_3496149_226-242400-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 16:07:43,318 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.conv_module2.balancer1.prob, batch_count=1713413.3333333333, ans=0.125 2023-10-04 16:07:44,419 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_24899_210_16554_1_1532046422_3827462_276-216330-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 16:07:48,466 WARNING [train.py:1204] (2/4) Exclude cut with ID _29345_210_4381_1_1532689168263_3860169_198-174328-0_sp1.1 from training. Duration: 0.82725 2023-10-04 16:07:48,528 WARNING [train.py:1204] (2/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_338-96520-0 from training. Duration: 0.94 2023-10-04 16:07:53,017 WARNING [train.py:1204] (2/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_7-111954-0_sp1.1 from training. Duration: 0.5090625 2023-10-04 16:07:56,024 WARNING [train.py:1204] (2/4) Exclude cut with ID _36692_210_5665_1_1533608967420_398809_8-16261-0_sp1.1 from training. Duration: 0.97275 2023-10-04 16:07:59,273 WARNING [train.py:1204] (2/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_86-212657-0_sp1.1 from training. Duration: 0.97275 2023-10-04 16:07:59,281 WARNING [train.py:1204] (2/4) Exclude cut with ID _44659_210_2721_1_1534036120061_6614860_626-298884-0_sp1.1 from training. Duration: 0.8818125 2023-10-04 16:08:02,205 WARNING [train.py:1204] (2/4) Exclude cut with ID _49755_210_18053_1_1534581005846_3218709_120-257918-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 16:08:04,902 WARNING [train.py:1204] (2/4) Exclude cut with ID _21881_210_3525_1_1531825242757_3798380_400-113821-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 16:08:04,905 WARNING [train.py:1204] (2/4) Exclude cut with ID _21783_210_11357_1_1531742458730_3762679_387-236773-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 16:08:06,302 WARNING [train.py:1204] (2/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_613-232856-0_sp1.1 from training. Duration: 0.4909375 2023-10-04 16:08:06,328 WARNING [train.py:1204] (2/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_181-78632-0_sp0.9 from training. Duration: 0.8666875 2023-10-04 16:08:08,076 INFO [train.py:1046] (2/4) Epoch 49, batch 2050, loss[loss=0.1545, simple_loss=0.2348, pruned_loss=0.03715, over 23375.00 frames. ], tot_loss[loss=0.1531, simple_loss=0.2335, pruned_loss=0.03636, over 4702081.96 frames. ], batch size: 93, lr: 2.07e-03, grad_scale: 8.0 2023-10-04 16:08:08,198 WARNING [train.py:1204] (2/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_175-17485-0_sp1.1 from training. Duration: 0.97275 2023-10-04 16:08:09,545 WARNING [train.py:1204] (2/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_1004-183422-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 16:08:12,380 WARNING [train.py:1204] (2/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_32-65962-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 16:08:13,757 WARNING [train.py:1204] (2/4) Exclude cut with ID _15458_210_8341_1_1530858614263_3834410_40-298664-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 16:08:17,998 WARNING [train.py:1204] (2/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_380-261863-0_sp1.1 from training. Duration: 0.963625 2023-10-04 16:08:20,704 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_372-173012-0_sp1.1 from training. Duration: 0.863625 2023-10-04 16:08:20,750 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_57-82205-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 16:08:22,052 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_3691_210_3528_1_1526468169_4062053_355-334432-0_sp0.9 from training. Duration: 0.9666875 2023-10-04 16:08:22,811 WARNING [train.py:1204] (2/4) Exclude cut with ID _19023_210_4353_1_1531378892552_5820780_28-36615-0 from training. Duration: 0.97 2023-10-04 16:08:22,816 WARNING [train.py:1204] (2/4) Exclude cut with ID _29853_210_7497_1_1532505600851_6642110_286-251333-0_sp1.1 from training. Duration: 0.92725 2023-10-04 16:08:25,300 WARNING [train.py:1204] (2/4) Exclude cut with ID _31602_210_5854_1_1532677021456_4458250_518-80679-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 16:08:25,344 WARNING [train.py:1204] (2/4) Exclude cut with ID _14922_210_6228_1_1530707435273_3693310_38-187851-0_sp0.9 from training. Duration: 0.688875 2023-10-04 16:08:34,615 WARNING [train.py:1204] (2/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_39-248870-0_sp1.1 from training. Duration: 0.62725 2023-10-04 16:08:34,640 WARNING [train.py:1204] (2/4) Exclude cut with ID _16908_210_5724_1_1531119157248_3772700_154-31455-0_sp1.1 from training. Duration: 0.97275 2023-10-04 16:08:37,736 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8453_210_4211_1_1528502940_8562730_347-330673-0 from training. Duration: 0.72 2023-10-04 16:08:37,956 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.conv_module2.balancer1.prob, batch_count=1713680.0, ans=0.125 2023-10-04 16:08:39,069 WARNING [train.py:1204] (2/4) Exclude cut with ID _30191_210_1664_1_1533189915047_7196670_674-204247-0_sp1.1 from training. Duration: 0.97275 2023-10-04 16:08:40,493 WARNING [train.py:1204] (2/4) Exclude cut with ID _8833_210_5219_1_1528592665210_7553059_329-125156-0 from training. Duration: 0.82 2023-10-04 16:08:40,530 WARNING [train.py:1204] (2/4) Exclude cut with ID _4779_210_2084_1_1527318022944_3914893_500-285013-0_sp1.1 from training. Duration: 0.62725 2023-10-04 16:08:42,065 WARNING [train.py:1204] (2/4) Exclude cut with ID _14421_210_8750_1_1530604795825_3542290_190-313578-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 16:08:44,765 WARNING [train.py:1204] (2/4) Exclude cut with ID _35799_210_5854_1_1533263487887_3529071_538-273574-0_sp1.1 from training. Duration: 0.936375 2023-10-04 16:08:44,837 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_14928_210_5632_1_1530773461_4714000_454-112198-0_sp1.1 from training. Duration: 0.6 2023-10-04 16:08:46,130 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_468-143126-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 16:08:47,470 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_4528_210_3273_1_1527076654_3276777_258-121681-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 16:08:48,759 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_397-132565-0_sp1.1 from training. Duration: 0.9 2023-10-04 16:08:48,777 WARNING [train.py:1204] (2/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_454-123002-0_sp0.9 from training. Duration: 0.7666875 2023-10-04 16:08:52,119 WARNING [train.py:1204] (2/4) Exclude cut with ID _27052_210_5986_1_1532329197187_3585009_297-86890-0_sp1.1 from training. Duration: 0.936375 2023-10-04 16:08:52,365 INFO [scaling.py:1118] (2/4) WithLoss: name=encoder.encoders.3.encoder.layers.0.self_attn_weights, loss-sum=0.000e+00 2023-10-04 16:08:53,499 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_51225_210_3601_1_1534400634_2327383_55-301844-0_sp1.1 from training. Duration: 0.8454375 2023-10-04 16:08:55,536 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6786_210_1388_1_1527839699_4194495_378-46070-0_sp0.9 from training. Duration: 0.8 2023-10-04 16:08:56,981 WARNING [train.py:1204] (2/4) Exclude cut with ID _42334_210_7770_1_1534206610396_3605880_55-19758-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 16:09:00,486 WARNING [train.py:1204] (2/4) Exclude cut with ID _14884_210_2252_1_1530707459689_5561250_114-201399-0_sp0.9 from training. Duration: 0.8444375 2023-10-04 16:09:00,613 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.feed_forward2.hidden_balancer.prob, batch_count=1713746.6666666667, ans=0.125 2023-10-04 16:09:06,121 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_99-251314-0_sp0.9 from training. Duration: 0.9666875 2023-10-04 16:09:07,827 WARNING [train.py:1204] (2/4) Exclude cut with ID _26528_210_9070_1_1532570410842_3851310_497-97218-0 from training. Duration: 0.86 2023-10-04 16:09:13,341 WARNING [train.py:1204] (2/4) Exclude cut with ID _55328_210_27767_1_1534580973260_4088170_230-317241-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 16:09:13,412 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_125-203105-0_sp0.9 from training. Duration: 0.888875 2023-10-04 16:09:16,133 WARNING [train.py:1204] (2/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_55-31904-0_sp1.1 from training. Duration: 0.8 2023-10-04 16:09:18,921 WARNING [train.py:1204] (2/4) Exclude cut with ID _8064_210_5399_1_1528372299291_4282973_18-174647-0 from training. Duration: 0.91 2023-10-04 16:09:21,476 INFO [train.py:1046] (2/4) Epoch 49, batch 2100, loss[loss=0.1471, simple_loss=0.2272, pruned_loss=0.03355, over 24302.00 frames. ], tot_loss[loss=0.152, simple_loss=0.2325, pruned_loss=0.03576, over 4701164.76 frames. ], batch size: 56, lr: 2.07e-03, grad_scale: 8.0 2023-10-04 16:09:21,606 WARNING [train.py:1204] (2/4) Exclude cut with ID IC0165W0005-81726 from training. Duration: 0.952 2023-10-04 16:09:21,606 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_31583_210_3852_1_1532595851_4024413_148-171847-0_sp1.1 from training. Duration: 0.963625 2023-10-04 16:09:23,548 WARNING [train.py:1204] (2/4) Exclude cut with ID _21989_210_5854_1_1531899214931_3271360_116-138769-0_sp1.1 from training. Duration: 0.936375 2023-10-04 16:09:23,612 WARNING [train.py:1204] (2/4) Exclude cut with ID _66070_210_7640_1_1535441393096_7805310_489-292631-0_sp1.1 from training. Duration: 0.7181875 2023-10-04 16:09:26,807 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_20120_210_6753_1_1531643035_5482859_202-27960-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 16:09:26,823 WARNING [train.py:1204] (2/4) Exclude cut with ID _5294_210_1747_1_1527249643928_3696179_342-270278-0 from training. Duration: 0.55 2023-10-04 16:09:26,855 WARNING [train.py:1204] (2/4) Exclude cut with ID _38323_210_12108_1_1533121233369_3303049_416-11919-0 from training. Duration: 0.96 2023-10-04 16:09:29,470 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6545_210_2040_1_1527774670_4245258_407-48070-0_sp0.9 from training. Duration: 0.8444375 2023-10-04 16:09:32,837 WARNING [train.py:1204] (2/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_503-30719-0_sp1.1 from training. Duration: 0.8909375 2023-10-04 16:09:32,902 WARNING [train.py:1204] (2/4) Exclude cut with ID _49643_210_7989_1_1534640244224_3939031_234-326497-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 16:09:35,783 WARNING [train.py:1204] (2/4) Exclude cut with ID _14170_210_5219_1_1530525949868_4469650_301-77873-0_sp1.1 from training. Duration: 0.963625 2023-10-04 16:09:35,831 WARNING [train.py:1204] (2/4) Exclude cut with ID _45297_210_2721_1_1533715351381_3264949_152-154697-0_sp1.1 from training. Duration: 0.97275 2023-10-04 16:09:35,846 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_3371_210_1794_1_1526384462_5580777_396-339812-0 from training. Duration: 0.85 2023-10-04 16:09:37,105 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.705e+02 2.123e+02 2.389e+02 2.758e+02 4.259e+02, threshold=4.778e+02, percent-clipped=0.0 2023-10-04 16:09:37,230 WARNING [train.py:1204] (2/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_399-156791-0_sp1.1 from training. Duration: 0.7545625 2023-10-04 16:09:37,275 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7696_210_2040_1_1528207167_3716099_117-125434-0 from training. Duration: 0.76 2023-10-04 16:09:37,279 WARNING [train.py:1204] (2/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_96-244299-0 from training. Duration: 0.88 2023-10-04 16:09:40,572 WARNING [train.py:1204] (2/4) Exclude cut with ID _37892_210_19596_1_1533198677897_3964058_519-343462-0_sp1.1 from training. Duration: 0.92725 2023-10-04 16:09:40,598 WARNING [train.py:1204] (2/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_202-183922-0_sp1.1 from training. Duration: 0.77275 2023-10-04 16:09:40,604 WARNING [train.py:1204] (2/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_453-133983-0 from training. Duration: 0.78 2023-10-04 16:09:40,639 WARNING [train.py:1204] (2/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_178-39722-0_sp1.1 from training. Duration: 0.4454375 2023-10-04 16:09:46,180 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_15126_210_6753_1_1530778677_2591185_113-157651-0 from training. Duration: 0.68 2023-10-04 16:09:46,181 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_271-234962-0_sp1.1 from training. Duration: 0.7181875 2023-10-04 16:09:48,904 WARNING [train.py:1204] (2/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_195-64967-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 16:09:48,951 WARNING [train.py:1204] (2/4) Exclude cut with ID _43552_210_8913_1_1533726031059_3523429_365-215065-0_sp1.1 from training. Duration: 0.936375 2023-10-04 16:09:52,361 WARNING [train.py:1204] (2/4) Exclude cut with ID _31602_210_5854_1_1532677021456_4458250_249-96604-0_sp0.9 from training. Duration: 0.988875 2023-10-04 16:09:52,416 WARNING [train.py:1204] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_307-158462-0 from training. Duration: 0.69 2023-10-04 16:09:53,576 WARNING [train.py:1204] (2/4) Exclude cut with ID _30918_210_6400_1_1532754078600_3125920_407-223394-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 16:09:53,580 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6673_210_3273_1_1527767790_3173240_351-167954-0_sp0.9 from training. Duration: 0.5333125 2023-10-04 16:09:56,468 WARNING [train.py:1204] (2/4) Exclude cut with ID _4866_210_2577_1_1527404412284_4364659_256-306830-0 from training. Duration: 0.87 2023-10-04 16:09:56,740 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.ff3_skip_rate, batch_count=1714013.3333333333, ans=0.0 2023-10-04 16:09:57,761 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_17613_210_2151_1_1531137569_2366160_222-131118-0_sp1.1 from training. Duration: 0.963625 2023-10-04 16:09:57,781 WARNING [train.py:1204] (2/4) Exclude cut with ID _45560_210_21982_1_1533812603951_3584999_111-6376-0 from training. Duration: 0.84 2023-10-04 16:09:57,803 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_216-73841-0 from training. Duration: 0.96 2023-10-04 16:09:57,854 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_14058_210_4163_1_1530690953_6327180_49-210687-0 from training. Duration: 0.94 2023-10-04 16:09:59,807 WARNING [train.py:1204] (2/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_506-42614-0_sp0.9 from training. Duration: 0.87775 2023-10-04 16:10:02,560 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_3133_210_2087_1_1526106446_2324690_25-332138-0_sp1.1 from training. Duration: 0.82725 2023-10-04 16:10:05,357 WARNING [train.py:1204] (2/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_589-9070-0_sp1.1 from training. Duration: 0.6454375 2023-10-04 16:10:06,795 WARNING [train.py:1204] (2/4) Exclude cut with ID _6194_210_1534_1_1527511826160_3941194_14-282876-0_sp1.1 from training. Duration: 0.6454375 2023-10-04 16:10:06,906 WARNING [train.py:1204] (2/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_1166-90654-0_sp1.1 from training. Duration: 0.92725 2023-10-04 16:10:08,266 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_14056_210_4163_1_1530604483_7572827_218-255858-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 16:10:08,268 WARNING [train.py:1204] (2/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_1285-256622-0 from training. Duration: 0.84 2023-10-04 16:10:08,294 WARNING [train.py:1204] (2/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_205-286261-0_sp1.1 from training. Duration: 0.963625 2023-10-04 16:10:10,258 WARNING [train.py:1204] (2/4) Exclude cut with ID _68777_210_4353_1_1535810390731_3117904_6-154119-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 16:10:10,321 WARNING [train.py:1204] (2/4) Exclude cut with ID _22982_210_1883_1_1531882790447_5449409_565-199393-0_sp1.1 from training. Duration: 0.92725 2023-10-04 16:10:10,337 WARNING [train.py:1204] (2/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_278-154492-0 from training. Duration: 0.76 2023-10-04 16:10:11,754 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_426-313557-0 from training. Duration: 0.53 2023-10-04 16:10:13,055 WARNING [train.py:1204] (2/4) Exclude cut with ID _41817_210_18835_1_1533434477160_7423049_555-293448-0 from training. Duration: 0.8 2023-10-04 16:10:15,775 WARNING [train.py:1204] (2/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_32-335103-0_sp1.1 from training. Duration: 0.7454375 2023-10-04 16:10:17,400 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.bypass_mid.scale_min, batch_count=1714080.0, ans=0.2 2023-10-04 16:10:18,035 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.1.encoder.layers.1.nonlin_attention.whiten1, num_groups=1, num_channels=192, metric=5.66 vs. limit=10.0 2023-10-04 16:10:18,540 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_2655_210_2087_1_1525688959_3897400_202-147557-0_sp1.1 from training. Duration: 0.77275 2023-10-04 16:10:18,564 WARNING [train.py:1204] (2/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_158-250116-0 from training. Duration: 0.62 2023-10-04 16:10:23,289 WARNING [train.py:1204] (2/4) Exclude cut with ID _55429_210_10626_1_1534467588527_7174143_528-88083-0_sp1.1 from training. Duration: 0.963625 2023-10-04 16:10:25,930 WARNING [train.py:1204] (2/4) Exclude cut with ID _8030_210_7497_1_1528444886781_3531089_32-31837-0_sp1.1 from training. Duration: 0.8818125 2023-10-04 16:10:25,965 WARNING [train.py:1204] (2/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_744-86168-0_sp1.1 from training. Duration: 0.97275 2023-10-04 16:10:25,973 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1113-7369-0_sp0.9 from training. Duration: 0.9444375 2023-10-04 16:10:27,288 WARNING [train.py:1204] (2/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_124-345840-0_sp0.9 from training. Duration: 0.57775 2023-10-04 16:10:27,340 WARNING [train.py:1204] (2/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_328-264776-0_sp0.9 from training. Duration: 0.7444375 2023-10-04 16:10:31,034 WARNING [train.py:1204] (2/4) Exclude cut with ID _18895_210_6228_1_1531357274234_3784930_469-166615-0_sp1.1 from training. Duration: 0.963625 2023-10-04 16:10:31,040 WARNING [train.py:1204] (2/4) Exclude cut with ID _8395_210_1531_1_1528597871523_4411579_431-132470-0_sp0.9 from training. Duration: 0.82225 2023-10-04 16:10:32,418 WARNING [train.py:1204] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_554-45617-0_sp1.1 from training. Duration: 0.9 2023-10-04 16:10:32,439 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_35361_210_4238_1_1533262810_7793476_524-188242-0_sp1.1 from training. Duration: 0.92725 2023-10-04 16:10:33,141 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.3.encoder.layers.2.whiten, num_groups=1, num_channels=512, metric=3.85 vs. limit=12.0 2023-10-04 16:10:33,887 WARNING [train.py:1204] (2/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_938-205135-0 from training. Duration: 0.87 2023-10-04 16:10:34,007 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_5931_210_2040_1_1527428481_4547999_321-193614-0 from training. Duration: 0.67 2023-10-04 16:10:34,019 WARNING [train.py:1204] (2/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_445-269521-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 16:10:36,845 INFO [train.py:1046] (2/4) Epoch 49, batch 2150, loss[loss=0.1724, simple_loss=0.254, pruned_loss=0.04537, over 23266.00 frames. ], tot_loss[loss=0.1514, simple_loss=0.2315, pruned_loss=0.03566, over 4685767.93 frames. ], batch size: 93, lr: 2.07e-03, grad_scale: 8.0 2023-10-04 16:10:36,896 WARNING [train.py:1204] (2/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_3-33116-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 16:10:36,906 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_4633_210_2040_1_1526824163_4145343_416-76117-0_sp1.1 from training. Duration: 0.863625 2023-10-04 16:10:36,932 WARNING [train.py:1204] (2/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_1033-274376-0_sp1.1 from training. Duration: 0.7909375 2023-10-04 16:10:36,975 WARNING [train.py:1204] (2/4) Exclude cut with ID _8772_210_1850_1_1528635595972_3664011_398-320049-0_sp0.9 from training. Duration: 0.97775 2023-10-04 16:10:42,847 WARNING [train.py:1204] (2/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_321-215742-0_sp0.9 from training. Duration: 0.5555625 2023-10-04 16:10:45,683 WARNING [train.py:1204] (2/4) Exclude cut with ID _28394_210_8913_1_1532430049518_3650118_272-190950-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 16:10:45,788 WARNING [train.py:1204] (2/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_31-299371-0_sp1.1 from training. Duration: 0.92725 2023-10-04 16:10:48,586 WARNING [train.py:1204] (2/4) Exclude cut with ID _55383_210_10635_1_1534468426172_2231669_134-128246-0_sp0.9 from training. Duration: 0.988875 2023-10-04 16:10:48,588 WARNING [train.py:1204] (2/4) Exclude cut with ID _40519_210_13794_1_1533715228655_7337390_887-9547-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 16:10:48,618 WARNING [train.py:1204] (2/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_310-109461-0_sp0.9 from training. Duration: 0.888875 2023-10-04 16:10:53,097 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_2009_210_2071_1_1525481134_4588937_258-250196-0_sp1.1 from training. Duration: 0.92725 2023-10-04 16:10:53,149 WARNING [train.py:1204] (2/4) Exclude cut with ID _9235_210_5219_1_1528891154253_8046120_1186-72702-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 16:10:53,151 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_14831_210_7927_1_1530700200_5583610_181-27467-0_sp1.1 from training. Duration: 0.82725 2023-10-04 16:10:56,614 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.2.encoder.layers.2.feed_forward1.out_whiten, num_groups=1, num_channels=384, metric=6.20 vs. limit=15.0 2023-10-04 16:10:57,192 WARNING [train.py:1204] (2/4) Exclude cut with ID _40402_210_18219_1_1533522324115_2953150_278-132109-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 16:10:58,486 WARNING [train.py:1204] (2/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_261-297503-0 from training. Duration: 0.99 2023-10-04 16:11:03,184 WARNING [train.py:1204] (2/4) Exclude cut with ID _19896_210_1883_1_1531709022662_6649569_313-265903-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 16:11:03,278 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_105-114797-0_sp1.1 from training. Duration: 0.763625 2023-10-04 16:11:05,127 WARNING [train.py:1204] (2/4) Exclude cut with ID _53755_210_8254_1_1534577312941_4707360_9-65843-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 16:11:05,165 WARNING [train.py:1204] (2/4) Exclude cut with ID _18516_210_9206_1_1531704653206_3692406_361-224549-0_sp1.1 from training. Duration: 0.97275 2023-10-04 16:11:05,185 WARNING [train.py:1204] (2/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_403-310975-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 16:11:06,512 WARNING [train.py:1204] (2/4) Exclude cut with ID _5444_210_2151_1_1527418808464_5312210_581-315411-0_sp0.9 from training. Duration: 0.688875 2023-10-04 16:11:07,811 WARNING [train.py:1204] (2/4) Exclude cut with ID _23825_210_13449_1_1531915313526_3497150_464-154904-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 16:11:07,821 WARNING [train.py:1204] (2/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_937-208281-0_sp0.9 from training. Duration: 0.9444375 2023-10-04 16:11:07,893 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_37839_210_7234_1_1533542462_3689303_158-181212-0_sp1.1 from training. Duration: 0.963625 2023-10-04 16:11:08,118 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.conv_skip_rate, batch_count=1714346.6666666667, ans=0.0 2023-10-04 16:11:09,860 WARNING [train.py:1204] (2/4) Exclude cut with ID _8772_210_1850_1_1528635595972_3664011_398-320049-0 from training. Duration: 0.88 2023-10-04 16:11:12,505 WARNING [train.py:1204] (2/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_718-250907-0_sp1.1 from training. Duration: 0.62725 2023-10-04 16:11:13,890 WARNING [train.py:1204] (2/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_641-126395-0_sp1.1 from training. Duration: 0.92725 2023-10-04 16:11:13,924 WARNING [train.py:1204] (2/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_166-241493-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 16:11:15,285 WARNING [train.py:1204] (2/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_526-62079-0_sp0.9 from training. Duration: 0.7444375 2023-10-04 16:11:16,677 WARNING [train.py:1204] (2/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_439-275083-0_sp1.1 from training. Duration: 0.936375 2023-10-04 16:11:19,404 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_66907_210_21581_1_1535615661_3590177_493-229032-0_sp1.1 from training. Duration: 0.92725 2023-10-04 16:11:19,439 WARNING [train.py:1204] (2/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_89-321955-0_sp1.1 from training. Duration: 0.72725 2023-10-04 16:11:20,896 WARNING [train.py:1204] (2/4) Exclude cut with ID _29857_210_20034_1_1532395886753_3798160_273-226195-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 16:11:20,902 WARNING [train.py:1204] (2/4) Exclude cut with ID _22826_210_9994_1_1531908201522_3279169_404-75889-0 from training. Duration: 0.89 2023-10-04 16:11:20,943 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_271-234962-0_sp0.9 from training. Duration: 0.87775 2023-10-04 16:11:24,199 WARNING [train.py:1204] (2/4) Exclude cut with ID _22961_210_8254_1_1531976366672_4278180_156-339685-0_sp1.1 from training. Duration: 0.97275 2023-10-04 16:11:24,238 WARNING [train.py:1204] (2/4) Exclude cut with ID _37892_210_19596_1_1533198677897_3964058_425-231927-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 16:11:25,647 WARNING [train.py:1204] (2/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_102-288670-0_sp1.1 from training. Duration: 0.97275 2023-10-04 16:11:27,008 WARNING [train.py:1204] (2/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_283-220372-0_sp1.1 from training. Duration: 0.5090625 2023-10-04 16:11:28,266 WARNING [train.py:1204] (2/4) Exclude cut with ID _44304_210_15374_1_1533639352765_4613129_86-1699-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 16:11:29,630 WARNING [train.py:1204] (2/4) Exclude cut with ID _2891_210_2550_1_1525874398521_4427710_327-121590-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 16:11:29,637 WARNING [train.py:1204] (2/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_369-222060-0 from training. Duration: 0.71 2023-10-04 16:11:31,043 WARNING [train.py:1204] (2/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_762-109886-0 from training. Duration: 0.91 2023-10-04 16:11:32,332 WARNING [train.py:1204] (2/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_477-255782-0_sp0.9 from training. Duration: 0.911125 2023-10-04 16:11:32,371 WARNING [train.py:1204] (2/4) Exclude cut with ID IC0669W0061-330906 from training. Duration: 0.768 2023-10-04 16:11:34,325 WARNING [train.py:1204] (2/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_347-213071-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 16:11:34,368 WARNING [train.py:1204] (2/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_507-303370-0_sp1.1 from training. Duration: 0.87275 2023-10-04 16:11:37,057 WARNING [train.py:1204] (2/4) Exclude cut with ID _15012_210_1968_1_1530856820429_5095350_688-178849-0 from training. Duration: 0.64 2023-10-04 16:11:37,063 WARNING [train.py:1204] (2/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_28-70987-0_sp0.9 from training. Duration: 0.9666875 2023-10-04 16:11:37,073 WARNING [train.py:1204] (2/4) Exclude cut with ID _5910_210_1389_1_1527337972454_5050100_555-159815-0 from training. Duration: 0.98 2023-10-04 16:11:37,093 WARNING [train.py:1204] (2/4) Exclude cut with ID IC0679W0064-335871 from training. Duration: 0.768 2023-10-04 16:11:37,093 WARNING [train.py:1204] (2/4) Exclude cut with ID IC0679W0079-335886 from training. Duration: 0.896 2023-10-04 16:11:37,136 WARNING [train.py:1204] (2/4) Exclude cut with ID _5608_210_2106_1_1527415439089_6819768_909-82767-0 from training. Duration: 0.61 2023-10-04 16:11:38,908 WARNING [train.py:1204] (2/4) Exclude cut with ID _23038_210_9570_1_1532001584158_3652419_411-323967-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 16:11:40,225 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_40896_210_15710_1_1533433956_7100802_350-231748-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 16:11:40,240 WARNING [train.py:1204] (2/4) Exclude cut with ID _5289_210_2084_1_1527678202736_3779299_142-103317-0_sp0.9 from training. Duration: 0.9333125 2023-10-04 16:11:40,322 WARNING [train.py:1204] (2/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_314-144055-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 16:11:41,715 WARNING [train.py:1204] (2/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_177-236809-0_sp0.9 from training. Duration: 0.5444375 2023-10-04 16:11:43,752 WARNING [train.py:1204] (2/4) Exclude cut with ID _23038_210_9570_1_1532001584158_3652419_82-334733-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 16:11:43,759 WARNING [train.py:1204] (2/4) Exclude cut with ID _43552_210_8913_1_1533726031059_3523429_225-22526-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 16:11:49,450 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.conv_module2.balancer1.prob, batch_count=1714546.6666666667, ans=0.125 2023-10-04 16:11:50,541 INFO [train.py:1046] (2/4) Epoch 49, batch 2200, loss[loss=0.1472, simple_loss=0.213, pruned_loss=0.04073, over 19354.00 frames. ], tot_loss[loss=0.1518, simple_loss=0.2322, pruned_loss=0.03572, over 4694260.82 frames. ], batch size: 388, lr: 2.07e-03, grad_scale: 8.0 2023-10-04 16:11:52,080 WARNING [train.py:1204] (2/4) Exclude cut with ID _30327_210_16049_1_1532484161295_3924169_401-70819-0_sp1.1 from training. Duration: 0.7818125 2023-10-04 16:11:52,126 WARNING [train.py:1204] (2/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_538-32291-0 from training. Duration: 0.83 2023-10-04 16:11:56,836 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_37357_210_17024_1_1533089504_2316121_258-211605-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 16:11:59,694 WARNING [train.py:1204] (2/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_550-335198-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 16:11:59,885 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=1714546.6666666667, ans=0.1 2023-10-04 16:12:00,938 WARNING [train.py:1204] (2/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_98-741-0_sp0.9 from training. Duration: 0.888875 2023-10-04 16:12:00,998 WARNING [train.py:1204] (2/4) Exclude cut with ID _38323_210_12108_1_1533121233369_3303049_251-238988-0_sp1.1 from training. Duration: 0.92725 2023-10-04 16:12:02,392 WARNING [train.py:1204] (2/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_259-34038-0_sp0.9 from training. Duration: 0.82225 2023-10-04 16:12:03,810 WARNING [train.py:1204] (2/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_1060-180633-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 16:12:03,847 WARNING [train.py:1204] (2/4) Exclude cut with ID _8755_210_1876_1_1528628559357_3258209_211-37473-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 16:12:03,852 WARNING [train.py:1204] (2/4) Exclude cut with ID _4122_210_2084_1_1526713164417_4353280_528-206771-0 from training. Duration: 0.88 2023-10-04 16:12:05,034 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.707e+02 2.091e+02 2.261e+02 2.558e+02 4.417e+02, threshold=4.522e+02, percent-clipped=0.0 2023-10-04 16:12:10,628 WARNING [train.py:1204] (2/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_64-248359-0 from training. Duration: 0.69 2023-10-04 16:12:10,779 WARNING [train.py:1204] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_402-253027-0_sp1.1 from training. Duration: 0.4909375 2023-10-04 16:12:14,257 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.conv_module2.balancer2.min_positive, batch_count=1714613.3333333333, ans=0.05 2023-10-04 16:12:16,749 WARNING [train.py:1204] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_625-102955-0 from training. Duration: 0.77 2023-10-04 16:12:18,484 WARNING [train.py:1204] (2/4) Exclude cut with ID _14998_210_5219_1_1530705582080_7474970_611-219324-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 16:12:18,618 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.conv_module2.balancer2.prob, batch_count=1714613.3333333333, ans=0.125 2023-10-04 16:12:19,739 WARNING [train.py:1204] (2/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_718-68505-0_sp0.9 from training. Duration: 0.988875 2023-10-04 16:12:19,786 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_353-182855-0_sp1.1 from training. Duration: 0.87275 2023-10-04 16:12:24,415 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_5960_210_1794_1_1527403865_3990999_515-2613-0_sp0.9 from training. Duration: 0.97775 2023-10-04 16:12:24,443 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_5575_210_1410_1_1527301305_2570170_342-265572-0 from training. Duration: 0.94 2023-10-04 16:12:28,620 WARNING [train.py:1204] (2/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_329-113173-0_sp0.9 from training. Duration: 0.688875 2023-10-04 16:12:31,256 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_15396_210_6890_1_1531308473_1938936_213-312506-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 16:12:31,320 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_521-214276-0_sp1.1 from training. Duration: 0.4 2023-10-04 16:12:35,314 WARNING [train.py:1204] (2/4) Exclude cut with ID _15026_210_8750_1_1530777602316_3291850_211-195043-0_sp1.1 from training. Duration: 0.72725 2023-10-04 16:12:37,343 WARNING [train.py:1204] (2/4) Exclude cut with ID _29343_210_4381_1_1532602757752_3674109_108-257494-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 16:12:38,931 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.conv_module1.balancer1.min_positive, batch_count=1714746.6666666667, ans=0.025 2023-10-04 16:12:40,067 WARNING [train.py:1204] (2/4) Exclude cut with ID _64414_210_17301_1_1535243252048_3757759_403-58151-0_sp1.1 from training. Duration: 0.963625 2023-10-04 16:12:42,038 WARNING [train.py:1204] (2/4) Exclude cut with ID _36512_210_6987_1_1533033143998_3126069_109-177787-0_sp1.1 from training. Duration: 0.92725 2023-10-04 16:12:42,252 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.ff2_skip_rate, batch_count=1714746.6666666667, ans=0.0 2023-10-04 16:12:44,711 WARNING [train.py:1204] (2/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_498-40153-0 from training. Duration: 0.63 2023-10-04 16:12:44,784 WARNING [train.py:1204] (2/4) Exclude cut with ID _56447_210_1389_1_1535074245910_3697189_161-334523-0_sp1.1 from training. Duration: 0.936375 2023-10-04 16:12:46,766 WARNING [train.py:1204] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_238-245655-0 from training. Duration: 0.81 2023-10-04 16:12:49,456 WARNING [train.py:1204] (2/4) Exclude cut with ID _64631_210_27767_1_1535349580068_4165160_108-43865-0_sp1.1 from training. Duration: 0.92725 2023-10-04 16:12:49,461 WARNING [train.py:1204] (2/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_243-42116-0_sp1.1 from training. Duration: 0.663625 2023-10-04 16:12:49,492 WARNING [train.py:1204] (2/4) Exclude cut with ID _66868_210_9527_1_1535509287954_4561140_594-132160-0_sp1.1 from training. Duration: 0.92725 2023-10-04 16:12:49,641 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.0.conv_module2.balancer2.prob, batch_count=1714813.3333333333, ans=0.125 2023-10-04 16:12:52,272 WARNING [train.py:1204] (2/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_48-338976-0_sp0.9 from training. Duration: 0.988875 2023-10-04 16:12:52,306 WARNING [train.py:1204] (2/4) Exclude cut with ID _5250_210_5219_1_1527249561806_8692110_137-100595-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 16:12:52,311 WARNING [train.py:1204] (2/4) Exclude cut with ID _49748_210_24116_1_1533986971737_3158910_12-271669-0_sp1.1 from training. Duration: 0.936375 2023-10-04 16:12:52,342 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_37357_210_17024_1_1533089504_2316121_206-80421-0_sp1.1 from training. Duration: 0.936375 2023-10-04 16:12:53,825 WARNING [train.py:1204] (2/4) Exclude cut with ID _7537_210_2084_1_1528502395431_3924359_113-199933-0_sp1.1 from training. Duration: 0.536375 2023-10-04 16:12:53,872 WARNING [train.py:1204] (2/4) Exclude cut with ID _55440_210_12651_1_1534496713364_2980520_31-59289-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 16:12:55,323 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1083-220122-0_sp0.9 from training. Duration: 0.5666875 2023-10-04 16:12:58,770 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_426-313557-0_sp1.1 from training. Duration: 0.4818125 2023-10-04 16:12:58,830 WARNING [train.py:1204] (2/4) Exclude cut with ID _35799_210_5854_1_1533263487887_3529071_324-94843-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 16:13:00,180 WARNING [train.py:1204] (2/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_264-108685-0_sp1.1 from training. Duration: 0.5 2023-10-04 16:13:01,634 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9012W0006-501141 from training. Duration: 0.896 2023-10-04 16:13:02,942 WARNING [train.py:1204] (2/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_890-5117-0_sp0.9 from training. Duration: 0.8666875 2023-10-04 16:13:04,315 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9023W0008-506301 from training. Duration: 0.896 2023-10-04 16:13:05,467 INFO [train.py:1046] (2/4) Epoch 49, batch 2250, loss[loss=0.1834, simple_loss=0.2591, pruned_loss=0.05391, over 19632.00 frames. ], tot_loss[loss=0.152, simple_loss=0.2325, pruned_loss=0.03576, over 4699622.60 frames. ], batch size: 388, lr: 2.07e-03, grad_scale: 8.0 2023-10-04 16:13:05,596 WARNING [train.py:1204] (2/4) Exclude cut with ID _13916_210_9994_1_1530788341126_3763149_60-191556-0_sp0.9 from training. Duration: 0.67775 2023-10-04 16:13:05,641 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9029W0331-509721 from training. Duration: 0.896 2023-10-04 16:13:07,027 WARNING [train.py:1204] (2/4) Exclude cut with ID _35823_210_6917_1_1533187881914_7297830_567-108596-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 16:13:08,883 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7543_210_6753_1_1528544010_1110402_25-183860-0_sp1.1 from training. Duration: 0.42725 2023-10-04 16:13:10,385 WARNING [train.py:1204] (2/4) Exclude cut with ID _47278_210_12041_1_1533902418349_3576110_495-260035-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 16:13:11,910 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9051W0015-520281 from training. Duration: 0.9045625 2023-10-04 16:13:13,861 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.nonlin_attention.balancer.prob, batch_count=1714880.0, ans=0.125 2023-10-04 16:13:14,949 WARNING [train.py:1204] (2/4) Exclude cut with ID _14459_210_2228_1_1531443641946_7516700_707-66193-0_sp1.1 from training. Duration: 0.97275 2023-10-04 16:13:17,024 WARNING [train.py:1204] (2/4) Exclude cut with ID _14087_210_2253_1_1530529224512_3014029_219-224445-0_sp1.1 from training. Duration: 0.763625 2023-10-04 16:13:21,146 WARNING [train.py:1204] (2/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_345-167986-0_sp0.9 from training. Duration: 0.9333125 2023-10-04 16:13:23,764 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_34815_210_6388_1_1533381320_3571903_279-305565-0_sp1.1 from training. Duration: 0.836375 2023-10-04 16:13:26,608 WARNING [train.py:1204] (2/4) Exclude cut with ID _21987_210_5854_1_1531821500482_3637038_317-179212-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 16:13:26,676 WARNING [train.py:1204] (2/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_395-37222-0_sp1.1 from training. Duration: 0.6909375 2023-10-04 16:13:28,022 WARNING [train.py:1204] (2/4) Exclude cut with ID _8064_210_5399_1_1528372299291_4282973_168-50337-0_sp1.1 from training. Duration: 0.763625 2023-10-04 16:13:30,024 WARNING [train.py:1204] (2/4) Exclude cut with ID _60113_210_16271_1_1535164212481_4386079_559-167191-0 from training. Duration: 0.93 2023-10-04 16:13:30,034 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_47228_210_4983_1_1533780021_3672027_344-179642-0_sp1.1 from training. Duration: 0.92725 2023-10-04 16:13:31,320 WARNING [train.py:1204] (2/4) Exclude cut with ID _22961_210_8254_1_1531976366672_4278180_317-199546-0_sp0.9 from training. Duration: 0.9555625 2023-10-04 16:13:32,852 WARNING [train.py:1204] (2/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_141-89727-0 from training. Duration: 0.71 2023-10-04 16:13:32,916 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_78-116075-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 16:13:34,261 WARNING [train.py:1204] (2/4) Exclude cut with ID _64086_210_7994_1_1535270417019_7435949_52-235756-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 16:13:35,658 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7615_210_4211_1_1528110965_4580160_72-235211-0_sp1.1 from training. Duration: 0.6909375 2023-10-04 16:13:38,472 WARNING [train.py:1204] (2/4) Exclude cut with ID _14069_210_5632_1_1530512692033_4803390_287-27950-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 16:13:40,460 WARNING [train.py:1204] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_74-48717-0_sp0.9 from training. Duration: 0.6444375 2023-10-04 16:13:40,492 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7544_210_6753_1_1528591327_1471426_172-348777-0_sp1.1 from training. Duration: 0.6 2023-10-04 16:13:43,718 WARNING [train.py:1204] (2/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_220-191080-0 from training. Duration: 0.98 2023-10-04 16:13:43,935 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.1.ff3_skip_rate, batch_count=1715013.3333333333, ans=0.0 2023-10-04 16:13:45,175 WARNING [train.py:1204] (2/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_1081-137641-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 16:13:48,338 WARNING [train.py:1204] (2/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_278-235459-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 16:13:48,404 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder_embed.convnext.hidden_balancer.prob, batch_count=1715013.3333333333, ans=0.125 2023-10-04 16:13:48,560 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.feed_forward1.out_proj.dropout_p, batch_count=1715013.3333333333, ans=0.1 2023-10-04 16:13:49,895 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.feed_forward3.hidden_balancer.prob, batch_count=1715080.0, ans=0.125 2023-10-04 16:13:54,011 WARNING [train.py:1204] (2/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_1181-77470-0_sp1.1 from training. Duration: 0.936375 2023-10-04 16:13:54,173 WARNING [train.py:1204] (2/4) Exclude cut with ID _60860_210_8254_1_1534896015875_3557169_198-5531-0_sp1.1 from training. Duration: 0.936375 2023-10-04 16:13:55,592 WARNING [train.py:1204] (2/4) Exclude cut with ID _12369_210_4381_1_1530356078581_4388520_17-289781-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 16:13:55,597 WARNING [train.py:1204] (2/4) Exclude cut with ID _50310_210_15488_1_1534068043345_3641080_146-199752-0_sp1.1 from training. Duration: 0.92725 2023-10-04 16:13:58,304 WARNING [train.py:1204] (2/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_969-213354-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 16:13:58,406 WARNING [train.py:1204] (2/4) Exclude cut with ID _70519_210_4163_1_1535961595994_7458230_66-344156-0_sp1.1 from training. Duration: 0.963625 2023-10-04 16:14:04,646 WARNING [train.py:1204] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_380-204756-0_sp1.1 from training. Duration: 0.7818125 2023-10-04 16:14:04,889 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.ff3_skip_rate, batch_count=1715146.6666666667, ans=0.0 2023-10-04 16:14:06,065 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_128-202104-0_sp0.9 from training. Duration: 0.7 2023-10-04 16:14:06,292 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.feed_forward2.hidden_balancer.prob, batch_count=1715146.6666666667, ans=0.125 2023-10-04 16:14:08,996 WARNING [train.py:1204] (2/4) Exclude cut with ID _4697_210_2069_1_1526815801953_3823098_51-1049-0_sp0.9 from training. Duration: 0.8333125 2023-10-04 16:14:09,014 WARNING [train.py:1204] (2/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_86-66192-0_sp1.1 from training. Duration: 0.5 2023-10-04 16:14:10,381 WARNING [train.py:1204] (2/4) Exclude cut with ID _14069_210_5632_1_1530512692033_4803390_95-1896-0_sp1.1 from training. Duration: 0.8909375 2023-10-04 16:14:10,636 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.conv_module2.balancer1.prob, batch_count=1715146.6666666667, ans=0.125 2023-10-04 16:14:15,506 WARNING [train.py:1204] (2/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_29-293896-0_sp1.1 from training. Duration: 0.5545625 2023-10-04 16:14:18,732 WARNING [train.py:1204] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_167-137255-0_sp0.9 from training. Duration: 0.711125 2023-10-04 16:14:18,733 WARNING [train.py:1204] (2/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_217-95056-0 from training. Duration: 0.98 2023-10-04 16:14:18,747 WARNING [train.py:1204] (2/4) Exclude cut with ID _47278_210_12041_1_1533902418349_3576110_97-266746-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 16:14:18,803 WARNING [train.py:1204] (2/4) Exclude cut with ID _14224_210_4381_1_1530529062037_4264010_380-343670-0_sp1.1 from training. Duration: 0.8 2023-10-04 16:14:21,492 INFO [train.py:1046] (2/4) Epoch 49, batch 2300, loss[loss=0.1637, simple_loss=0.2411, pruned_loss=0.0431, over 23433.00 frames. ], tot_loss[loss=0.1523, simple_loss=0.2329, pruned_loss=0.03588, over 4708415.87 frames. ], batch size: 93, lr: 2.07e-03, grad_scale: 8.0 2023-10-04 16:14:21,600 WARNING [train.py:1204] (2/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_107-332398-0 from training. Duration: 0.78 2023-10-04 16:14:24,377 WARNING [train.py:1204] (2/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_91-267696-0_sp1.1 from training. Duration: 0.7909375 2023-10-04 16:14:24,405 WARNING [train.py:1204] (2/4) Exclude cut with ID _41298_210_9019_1_1533430371450_4822143_498-186993-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 16:14:30,594 WARNING [train.py:1204] (2/4) Exclude cut with ID _17956_210_4353_1_1531209568125_6979970_54-77610-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 16:14:30,641 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_40047_210_2290_1_1533290519_2374311_257-335162-0_sp1.1 from training. Duration: 0.863625 2023-10-04 16:14:33,156 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9653W0193-675471 from training. Duration: 0.9656875 2023-10-04 16:14:34,524 WARNING [train.py:1204] (2/4) Exclude cut with ID _25248_210_8750_1_1532062825941_5554180_243-240536-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 16:14:34,771 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.balancer2.prob, batch_count=1715280.0, ans=0.125 2023-10-04 16:14:35,845 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.800e+02 2.150e+02 2.484e+02 2.874e+02 4.816e+02, threshold=4.968e+02, percent-clipped=1.0 2023-10-04 16:14:41,357 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_32360_210_4198_1_1532689160_3648436_60-51908-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 16:14:41,374 WARNING [train.py:1204] (2/4) Exclude cut with ID _4092_210_2151_1_1526643044449_3135759_227-213972-0_sp0.9 from training. Duration: 0.6 2023-10-04 16:14:43,079 WARNING [train.py:1204] (2/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_392-173500-0_sp1.1 from training. Duration: 0.97275 2023-10-04 16:14:43,116 WARNING [train.py:1204] (2/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_71-329150-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 16:14:43,118 WARNING [train.py:1204] (2/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_866-173440-0 from training. Duration: 0.9 2023-10-04 16:14:43,192 WARNING [train.py:1204] (2/4) Exclude cut with ID _14832_210_8750_1_1530691351539_3171348_1-54315-0_sp0.9 from training. Duration: 0.9666875 2023-10-04 16:14:46,456 WARNING [train.py:1204] (2/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_430-116139-0_sp1.1 from training. Duration: 0.77275 2023-10-04 16:14:47,804 WARNING [train.py:1204] (2/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_356-45054-0_sp0.9 from training. Duration: 0.9555625 2023-10-04 16:14:51,083 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_3531_210_3528_1_1526294203_5216456_309-227302-0_sp0.9 from training. Duration: 0.7666875 2023-10-04 16:14:52,632 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.bypass_mid.scale_min, batch_count=1715346.6666666667, ans=0.2 2023-10-04 16:14:53,889 WARNING [train.py:1204] (2/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_301-330455-0_sp1.1 from training. Duration: 0.72725 2023-10-04 16:14:56,690 WARNING [train.py:1204] (2/4) Exclude cut with ID _23557_210_8750_1_1531890106370_6846255_667-145945-0_sp1.1 from training. Duration: 0.92725 2023-10-04 16:15:00,890 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.conv_module2.balancer2.prob, batch_count=1715346.6666666667, ans=0.125 2023-10-04 16:15:02,054 WARNING [train.py:1204] (2/4) Exclude cut with ID _14860_210_4915_1_1530702020340_7343240_836-328489-0_sp1.1 from training. Duration: 0.8181875 2023-10-04 16:15:03,983 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_37152_210_9697_1_1533106573_3844188_416-333249-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 16:15:04,161 WARNING [train.py:1204] (2/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_538-32291-0_sp0.9 from training. Duration: 0.92225 2023-10-04 16:15:06,810 WARNING [train.py:1204] (2/4) Exclude cut with ID _32704_210_8954_1_1533017488840_6747370_965-176824-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 16:15:10,090 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.5.encoder.layers.0.conv_module1.whiten, num_groups=1, num_channels=256, metric=7.46 vs. limit=15.0 2023-10-04 16:15:10,869 WARNING [train.py:1204] (2/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_86-19601-0_sp1.1 from training. Duration: 0.77275 2023-10-04 16:15:10,948 WARNING [train.py:1204] (2/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_436-318113-0_sp1.1 from training. Duration: 0.6818125 2023-10-04 16:15:12,286 WARNING [train.py:1204] (2/4) Exclude cut with ID _41817_210_18835_1_1533434477160_7423049_555-293448-0_sp0.9 from training. Duration: 0.888875 2023-10-04 16:15:12,296 WARNING [train.py:1204] (2/4) Exclude cut with ID _4697_210_2069_1_1526815801953_3823098_68-219807-0 from training. Duration: 0.47 2023-10-04 16:15:15,708 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7544_210_6753_1_1528591327_1471426_33-346031-0_sp1.1 from training. Duration: 0.5545625 2023-10-04 16:15:15,710 WARNING [train.py:1204] (2/4) Exclude cut with ID _49748_210_24116_1_1533986971737_3158910_263-52113-0_sp1.1 from training. Duration: 0.97275 2023-10-04 16:15:15,747 WARNING [train.py:1204] (2/4) Exclude cut with ID _64639_210_5816_1_1535367619437_3528000_340-24580-0_sp1.1 from training. Duration: 0.92725 2023-10-04 16:15:17,577 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_47228_210_4983_1_1533780021_3672027_479-114941-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 16:15:17,613 WARNING [train.py:1204] (2/4) Exclude cut with ID _44585_210_18628_1_1533605350387_4518199_506-119659-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 16:15:19,068 WARNING [train.py:1204] (2/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_314-126819-0_sp1.1 from training. Duration: 0.4545625 2023-10-04 16:15:19,074 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_2541_210_1794_1_1525590269_3084765_156-173713-0_sp0.9 from training. Duration: 0.9 2023-10-04 16:15:20,295 WARNING [train.py:1204] (2/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_108-255430-0 from training. Duration: 0.97 2023-10-04 16:15:20,303 WARNING [train.py:1204] (2/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_791-45149-0_sp1.1 from training. Duration: 0.7818125 2023-10-04 16:15:20,315 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_53378_210_17282_1_1534227054_1261425_10-118295-0_sp1.1 from training. Duration: 0.97275 2023-10-04 16:15:20,377 WARNING [train.py:1204] (2/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_148-158509-0 from training. Duration: 0.41 2023-10-04 16:15:26,455 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_34726_210_4525_1_1533019474_5030395_687-291305-0_sp1.1 from training. Duration: 0.936375 2023-10-04 16:15:29,303 WARNING [train.py:1204] (2/4) Exclude cut with ID _14860_210_4915_1_1530702020340_7343240_264-324537-0_sp1.1 from training. Duration: 0.963625 2023-10-04 16:15:33,357 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_41251_210_15368_1_1533429767_4820695_591-46386-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 16:15:33,382 WARNING [train.py:1204] (2/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_86-19601-0_sp0.9 from training. Duration: 0.9444375 2023-10-04 16:15:35,176 INFO [train.py:1046] (2/4) Epoch 49, batch 2350, loss[loss=0.1494, simple_loss=0.2437, pruned_loss=0.02755, over 24317.00 frames. ], tot_loss[loss=0.1526, simple_loss=0.2334, pruned_loss=0.03587, over 4716370.39 frames. ], batch size: 74, lr: 2.07e-03, grad_scale: 8.0 2023-10-04 16:15:35,216 WARNING [train.py:1204] (2/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_401-255491-0_sp0.9 from training. Duration: 0.77775 2023-10-04 16:15:36,657 WARNING [train.py:1204] (2/4) Exclude cut with ID _8696_210_1876_1_1528545632440_3345058_209-54069-0_sp1.1 from training. Duration: 0.6090625 2023-10-04 16:15:36,665 WARNING [train.py:1204] (2/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_369-325184-0_sp1.1 from training. Duration: 0.9 2023-10-04 16:15:36,725 WARNING [train.py:1204] (2/4) Exclude cut with ID _14687_210_5854_1_1530703838259_3664889_115-239046-0_sp0.9 from training. Duration: 0.7444375 2023-10-04 16:15:38,107 WARNING [train.py:1204] (2/4) Exclude cut with ID _8064_210_5399_1_1528372299291_4282973_168-50337-0 from training. Duration: 0.84 2023-10-04 16:15:41,122 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.conv_module1.balancer1.max_abs, batch_count=1715546.6666666667, ans=10.0 2023-10-04 16:15:45,558 WARNING [train.py:1204] (2/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_909-64151-0_sp1.1 from training. Duration: 0.8545625 2023-10-04 16:15:45,568 WARNING [train.py:1204] (2/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_597-154640-0 from training. Duration: 0.9 2023-10-04 16:15:48,620 INFO [scaling.py:1118] (2/4) WithLoss: name=encoder.encoders.0.layers.1.self_attn_weights, loss-sum=0.000e+00 2023-10-04 16:15:49,946 WARNING [train.py:1204] (2/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_126-274694-0 from training. Duration: 0.98 2023-10-04 16:15:53,109 WARNING [train.py:1204] (2/4) Exclude cut with ID _21783_210_11357_1_1531742458730_3762679_344-269786-0_sp1.1 from training. Duration: 0.97275 2023-10-04 16:15:54,611 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_37818_210_4238_1_1533620910_4720083_322-253154-0_sp1.1 from training. Duration: 0.92725 2023-10-04 16:15:54,615 WARNING [train.py:1204] (2/4) Exclude cut with ID _58895_210_8750_1_1535184081320_3649089_221-15823-0_sp1.1 from training. Duration: 0.92725 2023-10-04 16:15:54,625 WARNING [train.py:1204] (2/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_9-21737-0_sp1.1 from training. Duration: 0.8818125 2023-10-04 16:15:54,658 WARNING [train.py:1204] (2/4) Exclude cut with ID _14069_210_5632_1_1530512692033_4803390_522-341588-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 16:15:56,021 WARNING [train.py:1204] (2/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_363-185703-0 from training. Duration: 0.95 2023-10-04 16:15:58,953 WARNING [train.py:1204] (2/4) Exclude cut with ID _41817_210_18835_1_1533434477160_7423049_265-76216-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 16:16:03,726 WARNING [train.py:1204] (2/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_841-341679-0 from training. Duration: 0.58 2023-10-04 16:16:04,951 WARNING [train.py:1204] (2/4) Exclude cut with ID _55429_210_10626_1_1534467588527_7174143_97-335084-0_sp1.1 from training. Duration: 0.8818125 2023-10-04 16:16:08,986 WARNING [train.py:1204] (2/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_690-126306-0_sp0.9 from training. Duration: 0.8666875 2023-10-04 16:16:08,995 WARNING [train.py:1204] (2/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_102-92751-0_sp1.1 from training. Duration: 0.9 2023-10-04 16:16:10,510 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_3787_210_2040_1_1526477544_5478586_603-120324-0_sp1.1 from training. Duration: 0.736375 2023-10-04 16:16:13,701 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_33348_210_5632_1_1533086912_3324030_85-28583-0 from training. Duration: 0.82 2023-10-04 16:16:13,764 WARNING [train.py:1204] (2/4) Exclude cut with ID _60057_210_10640_1_1534903065793_3333150_362-230377-0_sp1.1 from training. Duration: 0.7909375 2023-10-04 16:16:15,160 WARNING [train.py:1204] (2/4) Exclude cut with ID _60113_210_16271_1_1535164212481_4386079_194-146486-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 16:16:15,180 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_53753_210_7234_1_1534320003_3594148_171-277668-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 16:16:15,212 WARNING [train.py:1204] (2/4) Exclude cut with ID _34423_210_8750_1_1533430798456_3330199_251-332788-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 16:16:20,118 WARNING [train.py:1204] (2/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_663-166973-0_sp0.9 from training. Duration: 0.888875 2023-10-04 16:16:23,196 WARNING [train.py:1204] (2/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_927-43827-0 from training. Duration: 0.74 2023-10-04 16:16:23,267 WARNING [train.py:1204] (2/4) Exclude cut with ID _28323_210_16187_1_1532480395667_4100300_306-74561-0_sp1.1 from training. Duration: 0.8545625 2023-10-04 16:16:24,727 WARNING [train.py:1204] (2/4) Exclude cut with ID _15037_210_2253_1_1530752294104_3267650_139-337989-0_sp1.1 from training. Duration: 0.92725 2023-10-04 16:16:24,743 WARNING [train.py:1204] (2/4) Exclude cut with ID _49039_210_6315_1_1533967537541_3846118_460-263670-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 16:16:27,524 WARNING [train.py:1204] (2/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_186-309161-0 from training. Duration: 0.54 2023-10-04 16:16:28,901 WARNING [train.py:1204] (2/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_87-278686-0_sp1.1 from training. Duration: 0.7 2023-10-04 16:16:30,351 WARNING [train.py:1204] (2/4) Exclude cut with ID _27052_210_5986_1_1532329197187_3585009_314-106499-0 from training. Duration: 0.92 2023-10-04 16:16:31,642 WARNING [train.py:1204] (2/4) Exclude cut with ID _3465_210_3525_1_1526175667060_3574049_412-287526-0_sp0.9 from training. Duration: 0.8 2023-10-04 16:16:36,391 WARNING [train.py:1204] (2/4) Exclude cut with ID _22961_210_8254_1_1531976366672_4278180_317-199546-0 from training. Duration: 0.86 2023-10-04 16:16:40,485 WARNING [train.py:1204] (2/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_381-317791-0 from training. Duration: 0.73 2023-10-04 16:16:41,846 WARNING [train.py:1204] (2/4) Exclude cut with ID _44010_210_4525_1_1533884262964_4013620_560-310411-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 16:16:41,852 WARNING [train.py:1204] (2/4) Exclude cut with ID _5294_210_1747_1_1527249643928_3696179_342-270278-0_sp0.9 from training. Duration: 0.611125 2023-10-04 16:16:41,887 WARNING [train.py:1204] (2/4) Exclude cut with ID ID0473W0002-918951 from training. Duration: 0.924 2023-10-04 16:16:41,909 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1083-220122-0 from training. Duration: 0.51 2023-10-04 16:16:45,117 WARNING [train.py:1204] (2/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_138-154812-0 from training. Duration: 0.78 2023-10-04 16:16:46,712 WARNING [train.py:1204] (2/4) Exclude cut with ID _44982_210_1511_1_1534312820108_3726029_331-19420-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 16:16:49,293 INFO [train.py:1046] (2/4) Epoch 49, batch 2400, loss[loss=0.1464, simple_loss=0.2283, pruned_loss=0.03225, over 24363.00 frames. ], tot_loss[loss=0.1528, simple_loss=0.2334, pruned_loss=0.03613, over 4702151.41 frames. ], batch size: 61, lr: 2.07e-03, grad_scale: 16.0 2023-10-04 16:16:52,123 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_2655_210_2087_1_1525688959_3897400_202-147557-0_sp0.9 from training. Duration: 0.9444375 2023-10-04 16:16:52,394 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.feed_forward1.hidden_balancer.prob, batch_count=1715880.0, ans=0.125 2023-10-04 16:16:55,459 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_23850_210_4238_1_1532232597_1632433_32-185135-0_sp0.9 from training. Duration: 0.9555625 2023-10-04 16:16:56,963 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_14056_210_4163_1_1530604483_7572827_46-93547-0_sp1.1 from training. Duration: 0.863625 2023-10-04 16:16:57,009 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7931_210_1794_1_1528594728_5564059_280-279560-0 from training. Duration: 0.98 2023-10-04 16:16:57,209 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.conv_skip_rate, batch_count=1715880.0, ans=0.0 2023-10-04 16:16:58,359 WARNING [train.py:1204] (2/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_937-208281-0 from training. Duration: 0.85 2023-10-04 16:17:03,998 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.682e+02 2.122e+02 2.325e+02 2.661e+02 3.983e+02, threshold=4.649e+02, percent-clipped=0.0 2023-10-04 16:17:05,371 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_14928_210_5632_1_1530773461_4714000_454-112198-0_sp0.9 from training. Duration: 0.7333125 2023-10-04 16:17:05,372 WARNING [train.py:1204] (2/4) Exclude cut with ID _32704_210_8954_1_1533017488840_6747370_944-153333-0_sp1.1 from training. Duration: 0.963625 2023-10-04 16:17:08,179 WARNING [train.py:1204] (2/4) Exclude cut with ID _14821_210_8750_1_1530680503991_6829359_2-227151-0 from training. Duration: 0.83 2023-10-04 16:17:08,202 WARNING [train.py:1204] (2/4) Exclude cut with ID _28461_210_2253_1_1532771775189_3041299_96-152158-0_sp1.1 from training. Duration: 0.77275 2023-10-04 16:17:09,542 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7837_210_1792_1_1528453445_5841305_654-54195-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 16:17:09,589 WARNING [train.py:1204] (2/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_344-152526-0 from training. Duration: 0.53 2023-10-04 16:17:13,926 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_30167_210_3528_1_1532428012_4772375_480-142230-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 16:17:14,090 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=1715946.6666666667, ans=0.1 2023-10-04 16:17:15,481 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_160-218721-0 from training. Duration: 0.96 2023-10-04 16:17:15,665 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.conv_skip_rate, batch_count=1715946.6666666667, ans=0.0 2023-10-04 16:17:20,022 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.ff3_skip_rate, batch_count=1716013.3333333333, ans=0.0 2023-10-04 16:17:21,176 WARNING [train.py:1204] (2/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_65-88242-0_sp0.9 from training. Duration: 0.62225 2023-10-04 16:17:21,841 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.2.encoder.layers.2.self_attn2.whiten, num_groups=1, num_channels=384, metric=17.65 vs. limit=22.5 2023-10-04 16:17:24,627 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_34886_210_10633_1_1533203882_1755661_76-156096-0 from training. Duration: 0.87 2023-10-04 16:17:27,452 WARNING [train.py:1204] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_188-38388-0_sp1.1 from training. Duration: 0.92725 2023-10-04 16:17:27,573 WARNING [train.py:1204] (2/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_81-289335-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 16:17:31,629 WARNING [train.py:1204] (2/4) Exclude cut with ID _59493_210_17282_1_1535078419077_3225879_110-8778-0_sp1.1 from training. Duration: 0.936375 2023-10-04 16:17:32,462 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.bypass.scale_min, batch_count=1716080.0, ans=0.2 2023-10-04 16:17:33,376 WARNING [train.py:1204] (2/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_188-260011-0 from training. Duration: 0.73 2023-10-04 16:17:33,436 WARNING [train.py:1204] (2/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_453-133983-0_sp0.9 from training. Duration: 0.8666875 2023-10-04 16:17:40,437 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8054_210_7927_1_1528627721_4984094_553-17379-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 16:17:41,842 WARNING [train.py:1204] (2/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_341-171072-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 16:17:45,142 WARNING [train.py:1204] (2/4) Exclude cut with ID _30800_210_22382_1_1532499347904_4515959_265-257286-0_sp1.1 from training. Duration: 0.963625 2023-10-04 16:17:45,237 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_403-39619-0_sp0.9 from training. Duration: 0.9333125 2023-10-04 16:17:45,242 WARNING [train.py:1204] (2/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_464-33931-0_sp0.9 from training. Duration: 0.6 2023-10-04 16:17:46,495 WARNING [train.py:1204] (2/4) Exclude cut with ID _60804_210_9642_1_1534831199859_7268299_632-143863-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 16:17:46,504 WARNING [train.py:1204] (2/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_431-310997-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 16:17:46,559 WARNING [train.py:1204] (2/4) Exclude cut with ID _31649_210_6228_1_1532584608063_4091078_114-263860-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 16:17:46,571 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6961_210_2087_1_1527937168_6815011_562-165518-0_sp0.9 from training. Duration: 0.8555625 2023-10-04 16:17:48,936 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.0.layers.1.self_attn_weights.whiten_keys, num_groups=4, num_channels=128, metric=2.28 vs. limit=6.0 2023-10-04 16:17:52,400 WARNING [train.py:1204] (2/4) Exclude cut with ID _27052_210_5986_1_1532329197187_3585009_338-325870-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 16:17:52,453 WARNING [train.py:1204] (2/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_469-94673-0_sp1.1 from training. Duration: 0.7181875 2023-10-04 16:17:53,751 WARNING [train.py:1204] (2/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_259-34038-0 from training. Duration: 0.74 2023-10-04 16:17:55,133 WARNING [train.py:1204] (2/4) Exclude cut with ID _14922_210_6228_1_1530707435273_3693310_388-129552-0 from training. Duration: 0.96 2023-10-04 16:17:55,273 WARNING [train.py:1204] (2/4) Exclude cut with ID _28394_210_8913_1_1532430049518_3650118_342-251455-0_sp1.1 from training. Duration: 0.936375 2023-10-04 16:17:57,019 WARNING [train.py:1204] (2/4) Exclude cut with ID _53731_210_7994_1_1534553979244_3757214_8-191390-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 16:17:57,065 WARNING [train.py:1204] (2/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_613-232856-0 from training. Duration: 0.54 2023-10-04 16:17:58,496 WARNING [train.py:1204] (2/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_557-349534-0 from training. Duration: 0.91 2023-10-04 16:17:58,505 WARNING [train.py:1204] (2/4) Exclude cut with ID _14224_210_4381_1_1530529062037_4264010_379-184223-0 from training. Duration: 0.85 2023-10-04 16:17:58,509 WARNING [train.py:1204] (2/4) Exclude cut with ID IC0125W0001-61807 from training. Duration: 0.966 2023-10-04 16:17:58,603 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_229-45300-0 from training. Duration: 0.57 2023-10-04 16:17:59,933 WARNING [train.py:1204] (2/4) Exclude cut with ID _30304_210_6319_1_1532476827261_7582060_1069-24743-0_sp1.1 from training. Duration: 0.92725 2023-10-04 16:18:01,258 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_41452_210_7291_1_1533437687_3941281_323-172873-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 16:18:01,276 WARNING [train.py:1204] (2/4) Exclude cut with ID _37086_210_9038_1_1532999540891_7021250_802-226709-0_sp1.1 from training. Duration: 0.97275 2023-10-04 16:18:02,544 INFO [train.py:1046] (2/4) Epoch 49, batch 2450, loss[loss=0.1435, simple_loss=0.229, pruned_loss=0.02895, over 24629.00 frames. ], tot_loss[loss=0.1522, simple_loss=0.2328, pruned_loss=0.03584, over 4703512.83 frames. ], batch size: 65, lr: 2.07e-03, grad_scale: 16.0 2023-10-04 16:18:02,639 WARNING [train.py:1204] (2/4) Exclude cut with ID IC0144W0500-71767 from training. Duration: 0.931875 2023-10-04 16:18:02,714 WARNING [train.py:1204] (2/4) Exclude cut with ID _39846_210_12651_1_1533605310265_2115212_233-129699-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 16:18:04,638 WARNING [train.py:1204] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_574-45541-0_sp0.9 from training. Duration: 0.72225 2023-10-04 16:18:07,325 WARNING [train.py:1204] (2/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_372-298093-0_sp1.1 from training. Duration: 0.72725 2023-10-04 16:18:07,339 WARNING [train.py:1204] (2/4) Exclude cut with ID _62574_210_2655_1_1535180407058_3933249_34-26343-0_sp1.1 from training. Duration: 0.97275 2023-10-04 16:18:07,588 INFO [scaling.py:1118] (2/4) WithLoss: name=encoder.encoders.3.encoder.layers.3.self_attn_weights, loss-sum=0.000e+00 2023-10-04 16:18:11,394 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_512-256238-0_sp1.1 from training. Duration: 0.963625 2023-10-04 16:18:11,406 WARNING [train.py:1204] (2/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_196-55502-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 16:18:11,529 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.1.ff2_skip_rate, batch_count=1716213.3333333333, ans=0.0 2023-10-04 16:18:12,753 WARNING [train.py:1204] (2/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_195-167011-0 from training. Duration: 0.75 2023-10-04 16:18:19,365 WARNING [train.py:1204] (2/4) Exclude cut with ID _13678_210_7497_1_1530530069226_3463170_345-230416-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 16:18:19,376 WARNING [train.py:1204] (2/4) Exclude cut with ID _51078_210_9070_1_1534161557424_3805250_225-327809-0_sp1.1 from training. Duration: 0.963625 2023-10-04 16:18:22,054 WARNING [train.py:1204] (2/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_18-189198-0_sp1.1 from training. Duration: 0.8181875 2023-10-04 16:18:22,068 WARNING [train.py:1204] (2/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_329-82235-0_sp1.1 from training. Duration: 0.7454375 2023-10-04 16:18:22,083 WARNING [train.py:1204] (2/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_726-270780-0_sp0.9 from training. Duration: 0.9555625 2023-10-04 16:18:23,418 WARNING [train.py:1204] (2/4) Exclude cut with ID _66873_210_5517_1_1535454136506_4293370_654-52713-0 from training. Duration: 0.77 2023-10-04 16:18:26,415 WARNING [train.py:1204] (2/4) Exclude cut with ID _20455_210_5219_1_1531565996775_8098100_191-16006-0_sp1.1 from training. Duration: 0.963625 2023-10-04 16:18:28,359 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_3171_210_2087_1_1526109084_2330007_66-260583-0_sp1.1 from training. Duration: 0.6545625 2023-10-04 16:18:29,597 WARNING [train.py:1204] (2/4) Exclude cut with ID _38670_210_15379_1_1533448810659_4391059_63-129006-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 16:18:33,818 WARNING [train.py:1204] (2/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_393-57888-0_sp0.9 from training. Duration: 0.711125 2023-10-04 16:18:33,849 WARNING [train.py:1204] (2/4) Exclude cut with ID _6275_210_2084_1_1528110062213_7669049_857-298942-0_sp0.9 from training. Duration: 0.9444375 2023-10-04 16:18:35,850 WARNING [train.py:1204] (2/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_239-22076-0_sp0.9 from training. Duration: 0.9444375 2023-10-04 16:18:35,881 WARNING [train.py:1204] (2/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_424-227947-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 16:18:37,463 WARNING [train.py:1204] (2/4) Exclude cut with ID _14069_210_5632_1_1530512692033_4803390_95-1896-0 from training. Duration: 0.98 2023-10-04 16:18:38,809 WARNING [train.py:1204] (2/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_96-244299-0_sp0.9 from training. Duration: 0.97775 2023-10-04 16:18:39,477 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.3.encoder.layers.2.whiten, num_groups=1, num_channels=512, metric=4.15 vs. limit=12.0 2023-10-04 16:18:40,950 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.2.encoder.layers.0.conv_module1.whiten, num_groups=1, num_channels=384, metric=10.61 vs. limit=15.0 2023-10-04 16:18:47,565 WARNING [train.py:1204] (2/4) Exclude cut with ID _46683_210_9642_1_1533730913492_4591116_562-319825-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 16:18:47,658 WARNING [train.py:1204] (2/4) Exclude cut with ID _64639_210_5816_1_1535367619437_3528000_284-173474-0_sp1.1 from training. Duration: 0.963625 2023-10-04 16:18:48,990 WARNING [train.py:1204] (2/4) Exclude cut with ID _29529_210_7234_1_1532393961455_3977460_497-100133-0_sp1.1 from training. Duration: 0.936375 2023-10-04 16:18:50,855 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_12868_210_6753_1_1530701932_530275_18-76322-0_sp0.9 from training. Duration: 0.9666875 2023-10-04 16:18:50,889 WARNING [train.py:1204] (2/4) Exclude cut with ID _50093_210_4220_1_1534417141207_3432299_116-303447-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 16:18:50,977 WARNING [train.py:1204] (2/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_44-79427-0_sp1.1 from training. Duration: 0.8909375 2023-10-04 16:18:51,044 WARNING [train.py:1204] (2/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_887-249555-0 from training. Duration: 0.77 2023-10-04 16:18:53,946 WARNING [train.py:1204] (2/4) Exclude cut with ID _28461_210_2253_1_1532771775189_3041299_96-152158-0_sp0.9 from training. Duration: 0.9444375 2023-10-04 16:18:55,258 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_14058_210_4163_1_1530690953_6327180_49-210687-0_sp1.1 from training. Duration: 0.8545625 2023-10-04 16:18:58,077 WARNING [train.py:1204] (2/4) Exclude cut with ID _40399_210_4353_1_1533305658194_3279406_351-127903-0_sp1.1 from training. Duration: 0.97275 2023-10-04 16:18:58,083 WARNING [train.py:1204] (2/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_184-206939-0_sp1.1 from training. Duration: 0.936375 2023-10-04 16:19:02,769 WARNING [train.py:1204] (2/4) Exclude cut with ID _14100_210_8341_1_1530529320613_3678069_258-224599-0_sp1.1 from training. Duration: 0.7 2023-10-04 16:19:02,793 WARNING [train.py:1204] (2/4) Exclude cut with ID _6493_210_1747_1_1528459210424_4012470_324-323854-0 from training. Duration: 0.92 2023-10-04 16:19:02,887 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_12868_210_6753_1_1530702479_3610564_151-325626-0_sp1.1 from training. Duration: 0.8818125 2023-10-04 16:19:04,358 WARNING [train.py:1204] (2/4) Exclude cut with ID _14528_210_8254_1_1530669700514_4306939_106-43145-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 16:19:04,372 WARNING [train.py:1204] (2/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_686-54383-0 from training. Duration: 0.87 2023-10-04 16:19:04,421 WARNING [train.py:1204] (2/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_801-102362-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 16:19:06,118 WARNING [train.py:1204] (2/4) Exclude cut with ID _48677_210_5663_1_1533902405919_3892989_408-40740-0_sp0.9 from training. Duration: 0.92225 2023-10-04 16:19:08,955 WARNING [train.py:1204] (2/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_448-212135-0_sp1.1 from training. Duration: 0.82725 2023-10-04 16:19:11,644 WARNING [train.py:1204] (2/4) Exclude cut with ID _59170_210_6721_1_1534983633960_4203335_320-124047-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 16:19:11,679 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_4692_210_3528_1_1526899755_4323254_107-44401-0_sp1.1 from training. Duration: 0.7818125 2023-10-04 16:19:16,260 WARNING [train.py:1204] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_731-2428-0 from training. Duration: 0.83 2023-10-04 16:19:17,554 INFO [train.py:1046] (2/4) Epoch 49, batch 2500, loss[loss=0.1639, simple_loss=0.2392, pruned_loss=0.0443, over 23870.00 frames. ], tot_loss[loss=0.1515, simple_loss=0.2321, pruned_loss=0.03545, over 4719999.89 frames. ], batch size: 195, lr: 2.07e-03, grad_scale: 16.0 2023-10-04 16:19:17,681 WARNING [train.py:1204] (2/4) Exclude cut with ID _29857_210_20034_1_1532395886753_3798160_335-163562-0_sp1.1 from training. Duration: 0.77275 2023-10-04 16:19:23,791 WARNING [train.py:1204] (2/4) Exclude cut with ID _14832_210_8750_1_1530691351539_3171348_171-275004-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 16:19:24,315 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.5.encoder.layers.0.self_attn1.whiten, num_groups=1, num_channels=256, metric=10.83 vs. limit=22.5 2023-10-04 16:19:32,427 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.795e+02 2.204e+02 2.495e+02 3.004e+02 4.519e+02, threshold=4.991e+02, percent-clipped=0.0 2023-10-04 16:19:32,536 WARNING [train.py:1204] (2/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_105-328174-0_sp0.9 from training. Duration: 0.8444375 2023-10-04 16:19:32,578 WARNING [train.py:1204] (2/4) Exclude cut with ID _68965_210_1883_1_1535698962272_3497490_164-118421-0_sp1.1 from training. Duration: 0.936375 2023-10-04 16:19:33,939 WARNING [train.py:1204] (2/4) Exclude cut with ID _19896_210_1883_1_1531709022662_6649569_380-24170-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 16:19:33,944 WARNING [train.py:1204] (2/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_505-205270-0_sp1.1 from training. Duration: 0.3454375 2023-10-04 16:19:41,226 WARNING [train.py:1204] (2/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_427-208901-0_sp1.1 from training. Duration: 0.6181875 2023-10-04 16:19:41,269 WARNING [train.py:1204] (2/4) Exclude cut with ID _23818_210_6228_1_1531917063884_3676920_85-326916-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 16:19:43,226 WARNING [train.py:1204] (2/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_101-293367-0_sp1.1 from training. Duration: 0.57275 2023-10-04 16:19:43,240 WARNING [train.py:1204] (2/4) Exclude cut with ID _5409_210_1388_1_1527292850189_5865191_178-127970-0_sp1.1 from training. Duration: 0.5181875 2023-10-04 16:19:44,629 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_403-39619-0 from training. Duration: 0.84 2023-10-04 16:19:46,050 WARNING [train.py:1204] (2/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_427-87841-0_sp1.1 from training. Duration: 0.963625 2023-10-04 16:19:46,086 WARNING [train.py:1204] (2/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_286-68181-0_sp1.1 from training. Duration: 0.97275 2023-10-04 16:19:47,357 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6673_210_3273_1_1527767790_3173240_327-306185-0 from training. Duration: 0.83 2023-10-04 16:19:47,372 WARNING [train.py:1204] (2/4) Exclude cut with ID _3675_210_4557_1_1526639934408_4182089_106-203713-0_sp1.1 from training. Duration: 0.963625 2023-10-04 16:19:47,431 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_53071_210_16554_1_1534493434_1276796_130-36076-0 from training. Duration: 0.95 2023-10-04 16:19:47,458 WARNING [train.py:1204] (2/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_586-93054-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 16:19:52,253 WARNING [train.py:1204] (2/4) Exclude cut with ID _23825_210_13449_1_1531915313526_3497150_262-10571-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 16:19:52,430 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.1.conv_module2.balancer1.prob, batch_count=1716680.0, ans=0.125 2023-10-04 16:19:53,635 WARNING [train.py:1204] (2/4) Exclude cut with ID _28783_210_7943_1_1532671164808_7155479_432-67885-0_sp1.1 from training. Duration: 0.97275 2023-10-04 16:19:56,565 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6287_210_3273_1_1527509506_2653716_182-199887-0_sp0.9 from training. Duration: 0.5666875 2023-10-04 16:19:56,642 WARNING [train.py:1204] (2/4) Exclude cut with ID _3231_210_2106_1_1526205864934_6784220_171-256004-0 from training. Duration: 0.86 2023-10-04 16:19:57,887 WARNING [train.py:1204] (2/4) Exclude cut with ID _69051_210_1531_1_1535885566901_3829538_332-92090-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 16:19:59,328 WARNING [train.py:1204] (2/4) Exclude cut with ID _3675_210_4557_1_1526639934408_4182089_104-324125-0_sp1.1 from training. Duration: 0.963625 2023-10-04 16:20:02,207 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_41764_210_2521_1_1533686110_4089313_501-2119-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 16:20:02,432 INFO [scaling.py:1118] (2/4) WithLoss: name=encoder.encoders.3.encoder.layers.1.self_attn_weights, loss-sum=0.000e+00 2023-10-04 16:20:06,719 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_14056_210_4163_1_1530604483_7572827_56-100988-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 16:20:10,014 WARNING [train.py:1204] (2/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_398-239819-0_sp1.1 from training. Duration: 0.9 2023-10-04 16:20:14,434 WARNING [train.py:1204] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_74-48717-0_sp1.1 from training. Duration: 0.52725 2023-10-04 16:20:17,758 WARNING [train.py:1204] (2/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_177-49092-0 from training. Duration: 0.91 2023-10-04 16:20:17,792 WARNING [train.py:1204] (2/4) Exclude cut with ID _62337_210_12392_1_1535009273105_3412229_44-176353-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 16:20:17,809 WARNING [train.py:1204] (2/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_110-130961-0_sp1.1 from training. Duration: 0.42725 2023-10-04 16:20:20,539 WARNING [train.py:1204] (2/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_37-189262-0_sp1.1 from training. Duration: 0.8909375 2023-10-04 16:20:20,539 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_3172_210_2087_1_1526292929_3987865_197-97762-0_sp0.9 from training. Duration: 0.8333125 2023-10-04 16:20:21,806 WARNING [train.py:1204] (2/4) Exclude cut with ID IC0677W0065-334897 from training. Duration: 0.896 2023-10-04 16:20:21,807 WARNING [train.py:1204] (2/4) Exclude cut with ID IC0677W0080-334912 from training. Duration: 0.768 2023-10-04 16:20:21,812 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_120-41668-0 from training. Duration: 0.7 2023-10-04 16:20:26,415 WARNING [train.py:1204] (2/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_460-205287-0_sp1.1 from training. Duration: 0.963625 2023-10-04 16:20:27,849 WARNING [train.py:1204] (2/4) Exclude cut with ID _14632_210_9988_1_1530691279392_3923200_102-204477-0 from training. Duration: 0.97 2023-10-04 16:20:27,854 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_34815_210_6388_1_1533381320_3571903_279-305565-0 from training. Duration: 0.92 2023-10-04 16:20:27,924 WARNING [train.py:1204] (2/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_91-148831-0_sp1.1 from training. Duration: 0.9 2023-10-04 16:20:29,244 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6291_210_3273_1_1527683649_1078498_82-221196-0 from training. Duration: 0.76 2023-10-04 16:20:31,902 INFO [train.py:1046] (2/4) Epoch 49, batch 2550, loss[loss=0.1533, simple_loss=0.2269, pruned_loss=0.03986, over 23712.00 frames. ], tot_loss[loss=0.1515, simple_loss=0.2321, pruned_loss=0.03547, over 4719920.64 frames. ], batch size: 212, lr: 2.07e-03, grad_scale: 16.0 2023-10-04 16:20:32,058 WARNING [train.py:1204] (2/4) Exclude cut with ID _38020_210_5986_1_1533112252259_7006089_493-102745-0 from training. Duration: 0.98 2023-10-04 16:20:35,357 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.balancer1.prob, batch_count=1716880.0, ans=0.125 2023-10-04 16:20:36,456 WARNING [train.py:1204] (2/4) Exclude cut with ID _61133_210_9969_1_1535196777227_3696409_40-93442-0_sp1.1 from training. Duration: 0.92725 2023-10-04 16:20:36,594 WARNING [train.py:1204] (2/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_45-258852-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 16:20:38,044 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_53071_210_16554_1_1534493434_1276796_130-36076-0_sp1.1 from training. Duration: 0.863625 2023-10-04 16:20:38,199 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8694_210_6400_1_1529065754_3260756_332-13553-0_sp1.1 from training. Duration: 0.92725 2023-10-04 16:20:39,935 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_405-339986-0 from training. Duration: 0.51 2023-10-04 16:20:41,207 WARNING [train.py:1204] (2/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_927-43827-0_sp1.1 from training. Duration: 0.67275 2023-10-04 16:20:43,980 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_397-147196-0 from training. Duration: 0.8 2023-10-04 16:20:45,898 WARNING [train.py:1204] (2/4) Exclude cut with ID 3557-8342-0013-54691-0_sp1.1 from training. Duration: 0.836375 2023-10-04 16:20:48,584 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_47438_210_5399_1_1533797471_4513356_626-127722-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 16:20:50,017 WARNING [train.py:1204] (2/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_256-323883-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 16:20:51,339 WARNING [train.py:1204] (2/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_839-173017-0_sp0.9 from training. Duration: 0.4555625 2023-10-04 16:20:51,375 WARNING [train.py:1204] (2/4) Exclude cut with ID _5289_210_2084_1_1527678202736_3779299_392-261988-0_sp1.1 from training. Duration: 0.5909375 2023-10-04 16:20:51,418 WARNING [train.py:1204] (2/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_119-224384-0_sp1.1 from training. Duration: 0.936375 2023-10-04 16:20:52,731 WARNING [train.py:1204] (2/4) Exclude cut with ID _57523_210_1856_1_1534766294545_3732989_262-267933-0_sp1.1 from training. Duration: 0.963625 2023-10-04 16:20:54,549 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_35361_210_4238_1_1533262810_7793476_675-87012-0_sp0.9 from training. Duration: 0.988875 2023-10-04 16:20:54,567 WARNING [train.py:1204] (2/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_17-90541-0 from training. Duration: 0.94 2023-10-04 16:20:55,922 WARNING [train.py:1204] (2/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_535-14410-0_sp1.1 from training. Duration: 0.42725 2023-10-04 16:20:55,929 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_24673_210_6400_1_1531968633_778659_94-154511-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 16:20:55,933 WARNING [train.py:1204] (2/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_212-10501-0 from training. Duration: 0.94 2023-10-04 16:20:57,558 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=1716946.6666666667, ans=0.1 2023-10-04 16:21:08,916 WARNING [train.py:1204] (2/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_383-285196-0_sp1.1 from training. Duration: 0.8818125 2023-10-04 16:21:13,130 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.1.encoder.layers.0.whiten, num_groups=1, num_channels=256, metric=5.15 vs. limit=12.0 2023-10-04 16:21:13,546 WARNING [train.py:1204] (2/4) Exclude cut with ID _45560_210_21982_1_1533812603951_3584999_238-47020-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 16:21:13,556 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_21500_210_2054_1_1532149310_2716758_278-18053-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 16:21:13,563 WARNING [train.py:1204] (2/4) Exclude cut with ID _56235_210_10640_1_1534557335235_4161099_453-9626-0_sp1.1 from training. Duration: 0.936375 2023-10-04 16:21:14,959 WARNING [train.py:1204] (2/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_153-97441-0_sp1.1 from training. Duration: 0.5090625 2023-10-04 16:21:18,426 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder_embed.conv.8.prob, batch_count=1717080.0, ans=0.125 2023-10-04 16:21:21,075 WARNING [train.py:1204] (2/4) Exclude cut with ID _67725_210_27767_1_1535608756671_5105359_431-120895-0_sp1.1 from training. Duration: 0.963625 2023-10-04 16:21:24,272 WARNING [train.py:1204] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_574-45541-0_sp1.1 from training. Duration: 0.5909375 2023-10-04 16:21:24,290 WARNING [train.py:1204] (2/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_162-103620-0_sp1.1 from training. Duration: 0.8454375 2023-10-04 16:21:24,301 WARNING [train.py:1204] (2/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_734-129761-0_sp1.1 from training. Duration: 0.7454375 2023-10-04 16:21:24,331 WARNING [train.py:1204] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_620-276793-0_sp1.1 from training. Duration: 0.536375 2023-10-04 16:21:24,381 WARNING [train.py:1204] (2/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_153-97441-0_sp0.9 from training. Duration: 0.62225 2023-10-04 16:21:28,482 WARNING [train.py:1204] (2/4) Exclude cut with ID _12733_210_8341_1_1530167424556_3646380_523-85621-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 16:21:28,495 WARNING [train.py:1204] (2/4) Exclude cut with ID _64048_210_10626_1_1535198382802_3835179_352-179834-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 16:21:32,825 WARNING [train.py:1204] (2/4) Exclude cut with ID _55162_210_17071_1_1534408248583_3753480_43-242364-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 16:21:32,848 WARNING [train.py:1204] (2/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_661-264857-0 from training. Duration: 0.93 2023-10-04 16:21:32,855 WARNING [train.py:1204] (2/4) Exclude cut with ID _6274_210_2084_1_1527937583837_7494020_325-237800-0_sp1.1 from training. Duration: 0.97275 2023-10-04 16:21:32,893 WARNING [train.py:1204] (2/4) Exclude cut with ID _55328_210_27767_1_1534580973260_4088170_385-171459-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 16:21:34,283 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_3780_210_2087_1_1526467389_2145572_176-107140-0_sp1.1 from training. Duration: 0.62725 2023-10-04 16:21:35,858 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.out_combiner.scale_min, batch_count=1717146.6666666667, ans=0.2 2023-10-04 16:21:37,014 WARNING [train.py:1204] (2/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_375-244976-0_sp1.1 from training. Duration: 0.6818125 2023-10-04 16:21:37,119 WARNING [train.py:1204] (2/4) Exclude cut with ID _35941_210_15379_1_1532995294515_4023089_309-33162-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 16:21:43,250 WARNING [train.py:1204] (2/4) Exclude cut with ID _14459_210_2228_1_1531443641946_7516700_1185-22038-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 16:21:43,501 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.balancer1.prob, batch_count=1717146.6666666667, ans=0.125 2023-10-04 16:21:45,132 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_1197-69460-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 16:21:45,500 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.out_combiner.scale_min, batch_count=1717213.3333333333, ans=0.2 2023-10-04 16:21:46,533 INFO [train.py:1046] (2/4) Epoch 49, batch 2600, loss[loss=0.1422, simple_loss=0.2167, pruned_loss=0.03387, over 19897.00 frames. ], tot_loss[loss=0.1516, simple_loss=0.2322, pruned_loss=0.03554, over 4721933.17 frames. ], batch size: 43, lr: 2.07e-03, grad_scale: 16.0 2023-10-04 16:21:47,839 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9012W0022-501157 from training. Duration: 0.896 2023-10-04 16:21:49,350 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9023W0009-506302 from training. Duration: 0.896 2023-10-04 16:21:49,374 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_2821_210_2715_1_1526108385_3763560_73-296673-0_sp1.1 from training. Duration: 0.7545625 2023-10-04 16:21:51,028 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9025W0113-507442 from training. Duration: 0.896 2023-10-04 16:21:51,104 WARNING [train.py:1204] (2/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_739-2098-0 from training. Duration: 0.92 2023-10-04 16:21:51,115 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9029W0332-509722 from training. Duration: 0.768 2023-10-04 16:21:53,835 WARNING [train.py:1204] (2/4) Exclude cut with ID _16908_210_5724_1_1531119157248_3772700_68-101953-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 16:21:53,852 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9042W0011-515617 from training. Duration: 0.768 2023-10-04 16:21:53,982 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.1.feed_forward1.out_proj.dropout_p, batch_count=1717213.3333333333, ans=0.1 2023-10-04 16:21:56,546 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_3019_210_2028_1_1526044245_3264865_191-72235-0 from training. Duration: 0.97 2023-10-04 16:21:57,953 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9052W0006-520792 from training. Duration: 0.896 2023-10-04 16:21:59,386 WARNING [train.py:1204] (2/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_245-174553-0_sp1.1 from training. Duration: 0.8 2023-10-04 16:22:00,745 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.681e+02 2.147e+02 2.547e+02 3.024e+02 6.453e+02, threshold=5.093e+02, percent-clipped=2.0 2023-10-04 16:22:00,824 WARNING [train.py:1204] (2/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_202-67017-0 from training. Duration: 0.82 2023-10-04 16:22:02,173 WARNING [train.py:1204] (2/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_304-59227-0 from training. Duration: 0.77 2023-10-04 16:22:03,484 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_3133_210_2087_1_1526106446_2324690_246-125000-0_sp0.9 from training. Duration: 0.688875 2023-10-04 16:22:03,515 WARNING [train.py:1204] (2/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_243-42116-0 from training. Duration: 0.73 2023-10-04 16:22:06,372 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9087W0121-538492 from training. Duration: 0.896 2023-10-04 16:22:06,394 WARNING [train.py:1204] (2/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_120-76843-0 from training. Duration: 0.55 2023-10-04 16:22:08,074 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=1717280.0, ans=0.1 2023-10-04 16:22:13,357 WARNING [train.py:1204] (2/4) Exclude cut with ID _7852_210_5219_1_1528286471316_8139400_850-304621-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 16:22:13,378 WARNING [train.py:1204] (2/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_154-285415-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 16:22:13,392 WARNING [train.py:1204] (2/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_538-278194-0_sp1.1 from training. Duration: 0.92725 2023-10-04 16:22:13,393 WARNING [train.py:1204] (2/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_119-207162-0 from training. Duration: 0.71 2023-10-04 16:22:16,211 WARNING [train.py:1204] (2/4) Exclude cut with ID _2334_210_2667_1_1525608238880_4705129_459-104997-0_sp1.1 from training. Duration: 0.863625 2023-10-04 16:22:16,377 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.1.feed_forward2.hidden_balancer.prob, batch_count=1717346.6666666667, ans=0.125 2023-10-04 16:22:21,694 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9147W0019-567847 from training. Duration: 0.768 2023-10-04 16:22:22,516 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.2.encoder.layers.0.conv_module2.whiten, num_groups=1, num_channels=384, metric=17.51 vs. limit=15.0 2023-10-04 16:22:29,329 WARNING [train.py:1204] (2/4) Exclude cut with ID _69884_210_9771_1_1535846392818_4166169_228-189439-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 16:22:29,365 WARNING [train.py:1204] (2/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_196-77172-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 16:22:29,440 WARNING [train.py:1204] (2/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_625-338546-0 from training. Duration: 0.88 2023-10-04 16:22:30,771 WARNING [train.py:1204] (2/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_193-327630-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 16:22:30,773 WARNING [train.py:1204] (2/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_391-24006-0_sp1.1 from training. Duration: 0.92725 2023-10-04 16:22:32,205 WARNING [train.py:1204] (2/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_197-28164-0 from training. Duration: 0.8 2023-10-04 16:22:34,972 WARNING [train.py:1204] (2/4) Exclude cut with ID _8395_210_1531_1_1528597871523_4411579_431-132470-0_sp1.1 from training. Duration: 0.67275 2023-10-04 16:22:34,999 WARNING [train.py:1204] (2/4) Exclude cut with ID _35350_210_16040_1_1533040191183_4075299_465-248326-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 16:22:36,501 WARNING [train.py:1204] (2/4) Exclude cut with ID _16756_210_4915_1_1531047803763_7104259_101-141922-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 16:22:41,048 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9212W0005-600172 from training. Duration: 0.896 2023-10-04 16:22:41,077 WARNING [train.py:1204] (2/4) Exclude cut with ID _63186_210_2084_1_1535414479705_3908649_378-166820-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 16:22:41,097 WARNING [train.py:1204] (2/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_52-211683-0_sp1.1 from training. Duration: 0.8181875 2023-10-04 16:22:48,803 WARNING [train.py:1204] (2/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_1147-237729-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 16:22:48,874 WARNING [train.py:1204] (2/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_234-14174-0_sp0.9 from training. Duration: 0.911125 2023-10-04 16:22:48,893 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_454-276429-0 from training. Duration: 0.87 2023-10-04 16:22:48,959 WARNING [train.py:1204] (2/4) Exclude cut with ID _53713_210_24116_1_1534314487762_7416751_755-165313-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 16:22:50,561 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.conv_module2.balancer1.prob, batch_count=1717480.0, ans=0.125 2023-10-04 16:22:51,687 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8278_210_2550_1_1528637099_4220413_436-317072-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 16:22:51,755 WARNING [train.py:1204] (2/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_28-324729-0_sp1.1 from training. Duration: 0.8545625 2023-10-04 16:22:54,737 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.nonlin_attention.balancer.prob, batch_count=1717480.0, ans=0.125 2023-10-04 16:22:55,051 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.4.encoder.layers.1.self_attn1.whiten, num_groups=1, num_channels=384, metric=12.89 vs. limit=22.5 2023-10-04 16:22:56,569 WARNING [train.py:1204] (2/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_326-165900-0 from training. Duration: 0.69 2023-10-04 16:22:56,622 WARNING [train.py:1204] (2/4) Exclude cut with ID _53489_210_6228_1_1534298357169_3835970_96-30246-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 16:22:58,205 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.feed_forward1.hidden_balancer.prob, batch_count=1717480.0, ans=0.125 2023-10-04 16:22:59,243 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8275_210_4198_1_1528455320_3892571_491-17707-0_sp1.1 from training. Duration: 0.5818125 2023-10-04 16:23:00,770 INFO [train.py:1046] (2/4) Epoch 49, batch 2650, loss[loss=0.1706, simple_loss=0.2442, pruned_loss=0.04852, over 22691.00 frames. ], tot_loss[loss=0.1519, simple_loss=0.2327, pruned_loss=0.03556, over 4723612.22 frames. ], batch size: 322, lr: 2.07e-03, grad_scale: 16.0 2023-10-04 16:23:02,195 WARNING [train.py:1204] (2/4) Exclude cut with ID _14348_210_8341_1_1530599434996_3778640_240-232370-0 from training. Duration: 0.84 2023-10-04 16:23:02,211 WARNING [train.py:1204] (2/4) Exclude cut with ID _45012_210_12651_1_1533608435136_5371380_54-112623-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 16:23:03,534 WARNING [train.py:1204] (2/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_328-264776-0_sp1.1 from training. Duration: 0.6090625 2023-10-04 16:23:04,906 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9325W0001-648427 from training. Duration: 0.768 2023-10-04 16:23:04,915 WARNING [train.py:1204] (2/4) Exclude cut with ID _56480_210_4220_1_1534679942079_3075409_49-74717-0_sp1.1 from training. Duration: 0.936375 2023-10-04 16:23:06,379 WARNING [train.py:1204] (2/4) Exclude cut with ID _59500_210_8254_1_1534809610680_3736009_6-241869-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 16:23:09,187 WARNING [train.py:1204] (2/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_172-215932-0_sp0.9 from training. Duration: 0.7333125 2023-10-04 16:23:10,552 WARNING [train.py:1204] (2/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_17-90541-0_sp1.1 from training. Duration: 0.8545625 2023-10-04 16:23:12,431 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_43831_210_2158_1_1533725502_5872791_525-89447-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 16:23:14,268 WARNING [train.py:1204] (2/4) Exclude cut with ID _12302_210_5219_1_1529924446466_8107280_907-182949-0 from training. Duration: 0.92 2023-10-04 16:23:14,289 WARNING [train.py:1204] (2/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_402-102311-0_sp0.9 from training. Duration: 0.7444375 2023-10-04 16:23:14,319 WARNING [train.py:1204] (2/4) Exclude cut with ID _8021_210_7994_1_1528376451358_3653069_179-45206-0_sp0.9 from training. Duration: 0.9555625 2023-10-04 16:23:18,875 WARNING [train.py:1204] (2/4) Exclude cut with ID _40137_210_12210_1_1533290484046_3795180_249-187059-0 from training. Duration: 0.88 2023-10-04 16:23:20,353 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9653W0194-675472 from training. Duration: 0.9658125 2023-10-04 16:23:23,057 WARNING [train.py:1204] (2/4) Exclude cut with ID _51332_210_9988_1_1534244371836_3801270_336-255604-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 16:23:25,804 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_20120_210_6753_1_1531643035_5482859_510-96996-0 from training. Duration: 0.87 2023-10-04 16:23:25,823 WARNING [train.py:1204] (2/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_1167-20437-0_sp1.1 from training. Duration: 0.92725 2023-10-04 16:23:27,620 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_3475_210_2028_1_1526193578_3622569_265-276625-0 from training. Duration: 0.76 2023-10-04 16:23:30,632 WARNING [train.py:1204] (2/4) Exclude cut with ID _14860_210_4915_1_1530702020340_7343240_470-172418-0_sp1.1 from training. Duration: 0.97275 2023-10-04 16:23:30,640 WARNING [train.py:1204] (2/4) Exclude cut with ID _5444_210_2151_1_1527418808464_5312210_581-315411-0_sp1.1 from training. Duration: 0.563625 2023-10-04 16:23:31,974 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_34450_210_9547_1_1533099443_4109050_519-342439-0_sp1.1 from training. Duration: 0.97275 2023-10-04 16:23:32,014 WARNING [train.py:1204] (2/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_50-175961-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 16:23:37,324 WARNING [train.py:1204] (2/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_372-6869-0 from training. Duration: 0.44 2023-10-04 16:23:37,333 WARNING [train.py:1204] (2/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_3-237434-0 from training. Duration: 0.76 2023-10-04 16:23:40,007 WARNING [train.py:1204] (2/4) Exclude cut with ID _8772_210_1850_1_1528635595972_3664011_247-336448-0_sp1.1 from training. Duration: 0.67275 2023-10-04 16:23:42,796 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_427-78097-0 from training. Duration: 0.8 2023-10-04 16:23:42,811 WARNING [train.py:1204] (2/4) Exclude cut with ID _69884_210_9771_1_1535846392818_4166169_466-252727-0_sp1.1 from training. Duration: 0.97275 2023-10-04 16:23:44,187 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_15972_210_6400_1_1530877934_3405692_316-35478-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 16:23:44,835 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_809-17156-0_sp0.9 from training. Duration: 0.811125 2023-10-04 16:23:44,873 WARNING [train.py:1204] (2/4) Exclude cut with ID _57572_210_5517_1_1534921219624_3809484_594-263474-0_sp1.1 from training. Duration: 0.936375 2023-10-04 16:23:44,908 WARNING [train.py:1204] (2/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_190-228204-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 16:23:47,524 WARNING [train.py:1204] (2/4) Exclude cut with ID _19514_210_9259_1_1531530071330_5084230_117-21193-0_sp1.1 from training. Duration: 0.936375 2023-10-04 16:23:47,730 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.1.feed_forward1.hidden_balancer.prob, batch_count=1717746.6666666667, ans=0.125 2023-10-04 16:23:48,846 WARNING [train.py:1204] (2/4) Exclude cut with ID _69584_210_5219_1_1535785133775_4044450_190-21315-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 16:23:48,920 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_53071_210_16554_1_1534493434_1276796_184-302050-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 16:23:49,082 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=1717746.6666666667, ans=0.1 2023-10-04 16:23:50,300 WARNING [train.py:1204] (2/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_120-76843-0_sp1.1 from training. Duration: 0.5 2023-10-04 16:23:50,417 WARNING [train.py:1204] (2/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_318-162017-0_sp1.1 from training. Duration: 0.82725 2023-10-04 16:23:53,607 WARNING [train.py:1204] (2/4) Exclude cut with ID _46067_210_4220_1_1534125571066_3307450_79-46864-0_sp1.1 from training. Duration: 0.92725 2023-10-04 16:23:53,667 WARNING [train.py:1204] (2/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_358-215558-0_sp1.1 from training. Duration: 0.8454375 2023-10-04 16:23:54,994 WARNING [train.py:1204] (2/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_621-66764-0_sp1.1 from training. Duration: 0.92725 2023-10-04 16:23:55,212 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.1.ff3_skip_rate, batch_count=1717746.6666666667, ans=0.0 2023-10-04 16:23:55,227 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.conv_module1.balancer2.prob, batch_count=1717746.6666666667, ans=0.125 2023-10-04 16:23:56,417 WARNING [train.py:1204] (2/4) Exclude cut with ID _42863_210_9070_1_1533902398788_3696920_338-156851-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 16:23:56,441 WARNING [train.py:1204] (2/4) Exclude cut with ID _5289_210_2084_1_1527678202736_3779299_392-261988-0_sp0.9 from training. Duration: 0.72225 2023-10-04 16:23:58,697 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.attention_skip_rate, batch_count=1717813.3333333333, ans=0.0 2023-10-04 16:23:59,870 WARNING [train.py:1204] (2/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_409-84682-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 16:24:01,267 WARNING [train.py:1204] (2/4) Exclude cut with ID _9235_210_5219_1_1528891154253_8046120_30-326681-0_sp0.9 from training. Duration: 0.92225 2023-10-04 16:24:01,272 WARNING [train.py:1204] (2/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_389-184695-0_sp1.1 from training. Duration: 0.92725 2023-10-04 16:24:01,313 WARNING [train.py:1204] (2/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_442-320317-0 from training. Duration: 0.82 2023-10-04 16:24:04,227 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.balancer2.prob, batch_count=1717813.3333333333, ans=0.125 2023-10-04 16:24:05,456 WARNING [train.py:1204] (2/4) Exclude cut with ID _44778_210_2054_1_1533798127203_7443090_756-29911-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 16:24:06,805 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_401-112036-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 16:24:08,194 WARNING [train.py:1204] (2/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_181-308827-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 16:24:09,556 WARNING [train.py:1204] (2/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_332-258725-0_sp1.1 from training. Duration: 0.963625 2023-10-04 16:24:09,642 WARNING [train.py:1204] (2/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_188-260011-0_sp0.9 from training. Duration: 0.811125 2023-10-04 16:24:10,944 WARNING [train.py:1204] (2/4) Exclude cut with ID _49039_210_6315_1_1533967537541_3846118_99-302239-0_sp1.1 from training. Duration: 0.963625 2023-10-04 16:24:13,977 INFO [train.py:1046] (2/4) Epoch 49, batch 2700, loss[loss=0.1524, simple_loss=0.2289, pruned_loss=0.03798, over 20116.00 frames. ], tot_loss[loss=0.1525, simple_loss=0.2338, pruned_loss=0.03562, over 4723229.01 frames. ], batch size: 43, lr: 2.07e-03, grad_scale: 16.0 2023-10-04 16:24:14,015 WARNING [train.py:1204] (2/4) Exclude cut with ID _4122_210_2084_1_1526713164417_4353280_312-294828-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 16:24:14,021 WARNING [train.py:1204] (2/4) Exclude cut with ID _14922_210_6228_1_1530707435273_3693310_303-228738-0 from training. Duration: 0.84 2023-10-04 16:24:15,470 WARNING [train.py:1204] (2/4) Exclude cut with ID _14349_210_8341_1_1530615710818_3442470_115-285975-0_sp0.9 from training. Duration: 0.9555625 2023-10-04 16:24:16,854 WARNING [train.py:1204] (2/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_125-144174-0_sp1.1 from training. Duration: 0.3545625 2023-10-04 16:24:20,117 WARNING [train.py:1204] (2/4) Exclude cut with ID _53565_210_10635_1_1534337218335_3820259_351-54570-0_sp1.1 from training. Duration: 0.97275 2023-10-04 16:24:20,151 WARNING [train.py:1204] (2/4) Exclude cut with ID _28317_210_12742_1_1532480302381_8510325_410-40065-0_sp1.1 from training. Duration: 0.963625 2023-10-04 16:24:20,174 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_38965_210_4852_1_1533174906_3484735_167-339442-0_sp1.1 from training. Duration: 0.963625 2023-10-04 16:24:22,101 WARNING [train.py:1204] (2/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_265-164486-0_sp1.1 from training. Duration: 0.87275 2023-10-04 16:24:22,118 WARNING [train.py:1204] (2/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_725-182484-0_sp1.1 from training. Duration: 0.92725 2023-10-04 16:24:22,151 WARNING [train.py:1204] (2/4) Exclude cut with ID _28317_210_12742_1_1532480302381_8510325_607-213951-0_sp0.9 from training. Duration: 0.9333125 2023-10-04 16:24:23,331 WARNING [train.py:1204] (2/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_605-133395-0_sp1.1 from training. Duration: 0.6 2023-10-04 16:24:23,354 WARNING [train.py:1204] (2/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_351-294650-0 from training. Duration: 0.63 2023-10-04 16:24:24,710 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_14953_210_2151_1_1530701710_5616165_710-165400-0_sp1.1 from training. Duration: 0.7909375 2023-10-04 16:24:27,391 WARNING [train.py:1204] (2/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_237-58433-0_sp1.1 from training. Duration: 0.67275 2023-10-04 16:24:28,699 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.707e+02 2.037e+02 2.270e+02 2.571e+02 4.005e+02, threshold=4.540e+02, percent-clipped=0.0 2023-10-04 16:24:28,832 WARNING [train.py:1204] (2/4) Exclude cut with ID _5190_210_4557_1_1527244368799_373439_28-197030-0_sp1.1 from training. Duration: 0.6545625 2023-10-04 16:24:30,643 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_743-327651-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 16:24:33,581 WARNING [train.py:1204] (2/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_76-203867-0_sp0.9 from training. Duration: 0.8 2023-10-04 16:24:33,722 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.conv_module2.balancer1.prob, batch_count=1717946.6666666667, ans=0.125 2023-10-04 16:24:33,786 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.conv_skip_rate, batch_count=1717946.6666666667, ans=0.0 2023-10-04 16:24:34,987 WARNING [train.py:1204] (2/4) Exclude cut with ID _14603_210_4220_1_1530615688441_3097319_274-274798-0 from training. Duration: 0.98 2023-10-04 16:24:36,255 WARNING [train.py:1204] (2/4) Exclude cut with ID _14087_210_2253_1_1530529224512_3014029_226-242140-0_sp1.1 from training. Duration: 0.736375 2023-10-04 16:24:40,443 WARNING [train.py:1204] (2/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_352-59958-0_sp1.1 from training. Duration: 0.7818125 2023-10-04 16:24:40,447 WARNING [train.py:1204] (2/4) Exclude cut with ID _39411_210_5986_1_1533538825030_3343649_191-251819-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 16:24:41,959 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.conv_module2.balancer2.min_positive, batch_count=1718013.3333333333, ans=0.05 2023-10-04 16:24:46,146 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7046_210_6753_1_1527895337_6920917_642-106799-0_sp1.1 from training. Duration: 0.836375 2023-10-04 16:24:46,153 WARNING [train.py:1204] (2/4) Exclude cut with ID _22826_210_9994_1_1531908201522_3279169_412-3442-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 16:24:46,173 WARNING [train.py:1204] (2/4) Exclude cut with ID _60057_210_10640_1_1534903065793_3333150_171-181133-0_sp0.9 from training. Duration: 0.97775 2023-10-04 16:24:46,182 WARNING [train.py:1204] (2/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_154-135696-0_sp1.1 from training. Duration: 0.72725 2023-10-04 16:24:49,580 WARNING [train.py:1204] (2/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_148-186805-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 16:24:51,060 WARNING [train.py:1204] (2/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_605-165641-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 16:24:51,067 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_174-277364-0_sp0.9 from training. Duration: 0.788875 2023-10-04 16:24:51,080 WARNING [train.py:1204] (2/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_89-321955-0_sp0.9 from training. Duration: 0.888875 2023-10-04 16:24:57,010 WARNING [train.py:1204] (2/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_438-105429-0_sp1.1 from training. Duration: 0.963625 2023-10-04 16:24:57,013 WARNING [train.py:1204] (2/4) Exclude cut with ID _14970_210_15709_1_1530773543493_1715730_187-20259-0_sp0.9 from training. Duration: 0.988875 2023-10-04 16:24:59,999 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.balancer1.prob, batch_count=1718080.0, ans=0.125 2023-10-04 16:25:04,366 WARNING [train.py:1204] (2/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_385-193052-0_sp0.9 from training. Duration: 0.9555625 2023-10-04 16:25:05,653 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7113_210_1849_1_1527942685_3448768_194-311626-0_sp1.1 from training. Duration: 0.92725 2023-10-04 16:25:09,693 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_3780_210_2087_1_1526467389_2145572_176-107140-0_sp0.9 from training. Duration: 0.7666875 2023-10-04 16:25:09,695 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_36756_210_7234_1_1533026991_966276_123-213457-0_sp1.1 from training. Duration: 0.936375 2023-10-04 16:25:12,526 WARNING [train.py:1204] (2/4) Exclude cut with ID _23557_210_8750_1_1531890106370_6846255_456-244861-0_sp1.1 from training. Duration: 0.963625 2023-10-04 16:25:13,924 WARNING [train.py:1204] (2/4) Exclude cut with ID _37086_210_9038_1_1532999540891_7021250_242-349170-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 16:25:15,343 WARNING [train.py:1204] (2/4) Exclude cut with ID _41298_210_9019_1_1533430371450_4822143_471-281351-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 16:25:15,447 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_53378_210_17282_1_1534221071_5769509_572-126176-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 16:25:16,841 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_1507-263032-0_sp1.1 from training. Duration: 0.963625 2023-10-04 16:25:16,886 WARNING [train.py:1204] (2/4) Exclude cut with ID _13679_210_7497_1_1530534293662_3355098_411-186088-0_sp1.1 from training. Duration: 0.8545625 2023-10-04 16:25:20,000 WARNING [train.py:1204] (2/4) Exclude cut with ID _29345_210_4381_1_1532689168263_3860169_149-257268-0_sp1.1 from training. Duration: 0.736375 2023-10-04 16:25:21,976 WARNING [train.py:1204] (2/4) Exclude cut with ID _47790_210_16187_1_1533862814066_4207519_381-252014-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 16:25:21,989 WARNING [train.py:1204] (2/4) Exclude cut with ID _35799_210_5854_1_1533263487887_3529071_512-2717-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 16:25:24,791 WARNING [train.py:1204] (2/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_505-205270-0_sp0.9 from training. Duration: 0.42225 2023-10-04 16:25:24,874 WARNING [train.py:1204] (2/4) Exclude cut with ID _14998_210_5219_1_1530705582080_7474970_731-20117-0_sp1.1 from training. Duration: 0.936375 2023-10-04 16:25:26,305 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_37351_210_7925_1_1533012931_3812113_265-85780-0_sp1.1 from training. Duration: 0.8 2023-10-04 16:25:26,312 WARNING [train.py:1204] (2/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_313-134930-0 from training. Duration: 0.97 2023-10-04 16:25:27,702 INFO [train.py:1046] (2/4) Epoch 49, batch 2750, loss[loss=0.1425, simple_loss=0.2204, pruned_loss=0.03232, over 24270.00 frames. ], tot_loss[loss=0.1525, simple_loss=0.2339, pruned_loss=0.0355, over 4728390.80 frames. ], batch size: 56, lr: 2.07e-03, grad_scale: 16.0 2023-10-04 16:25:27,875 WARNING [train.py:1204] (2/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_374-138971-0 from training. Duration: 0.94 2023-10-04 16:25:27,921 WARNING [train.py:1204] (2/4) Exclude cut with ID _30304_210_6319_1_1532476827261_7582060_1227-287886-0_sp1.1 from training. Duration: 0.936375 2023-10-04 16:25:30,656 WARNING [train.py:1204] (2/4) Exclude cut with ID _59493_210_17282_1_1535078419077_3225879_332-217544-0_sp1.1 from training. Duration: 0.97275 2023-10-04 16:25:32,390 WARNING [train.py:1204] (2/4) Exclude cut with ID _23514_210_1389_1_1531832139785_3031439_361-119640-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 16:25:32,553 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.bypass_mid.scale_min, batch_count=1718213.3333333333, ans=0.2 2023-10-04 16:25:35,021 WARNING [train.py:1204] (2/4) Exclude cut with ID _69584_210_5219_1_1535785133775_4044450_142-118058-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 16:25:35,798 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.1.encoder.layers.0.nonlin_attention.whiten2, num_groups=1, num_channels=256, metric=5.98 vs. limit=15.0 2023-10-04 16:25:36,196 WARNING [train.py:1204] (2/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_178-177582-0_sp1.1 from training. Duration: 0.67275 2023-10-04 16:25:36,249 WARNING [train.py:1204] (2/4) Exclude cut with ID _25303_210_17519_1_1532050207576_3635970_41-195572-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 16:25:39,138 WARNING [train.py:1204] (2/4) Exclude cut with ID _55328_210_27767_1_1534580973260_4088170_329-341322-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 16:25:39,170 WARNING [train.py:1204] (2/4) Exclude cut with ID _5608_210_2106_1_1527415439089_6819768_909-82767-0_sp1.1 from training. Duration: 0.5545625 2023-10-04 16:25:40,480 WARNING [train.py:1204] (2/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_261-297503-0_sp1.1 from training. Duration: 0.9 2023-10-04 16:25:40,487 WARNING [train.py:1204] (2/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_459-291706-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 16:25:40,494 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7762_210_1794_1_1528199508_4057408_422-227034-0 from training. Duration: 0.73 2023-10-04 16:25:40,505 WARNING [train.py:1204] (2/4) Exclude cut with ID _3067_210_2667_1_1526212974640_4503168_250-285937-0_sp0.9 from training. Duration: 0.888875 2023-10-04 16:25:40,528 WARNING [train.py:1204] (2/4) Exclude cut with ID _66873_210_5517_1_1535454136506_4293370_332-253745-0_sp1.1 from training. Duration: 0.936375 2023-10-04 16:25:46,121 WARNING [train.py:1204] (2/4) Exclude cut with ID _13938_210_9988_1_1530518718978_3693960_304-112784-0 from training. Duration: 0.46 2023-10-04 16:25:47,555 WARNING [train.py:1204] (2/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_226-267957-0_sp1.1 from training. Duration: 0.92725 2023-10-04 16:25:47,593 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_41822_210_13270_1_1533955976_4379284_608-254567-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 16:25:49,375 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_124-113562-0_sp1.1 from training. Duration: 0.8545625 2023-10-04 16:25:49,402 WARNING [train.py:1204] (2/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_117-290916-0_sp1.1 from training. Duration: 0.463625 2023-10-04 16:25:49,463 WARNING [train.py:1204] (2/4) Exclude cut with ID _28427_210_4220_1_1532581240967_3061069_102-66533-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 16:25:51,406 WARNING [train.py:1204] (2/4) Exclude cut with ID _55383_210_10635_1_1534468426172_2231669_134-128246-0_sp1.1 from training. Duration: 0.8090625 2023-10-04 16:25:53,126 WARNING [train.py:1204] (2/4) Exclude cut with ID _45871_210_5665_1_1533864614245_3296060_49-308316-0_sp1.1 from training. Duration: 0.97275 2023-10-04 16:25:53,166 WARNING [train.py:1204] (2/4) Exclude cut with ID _57572_210_5517_1_1534921219624_3809484_176-199205-0_sp1.1 from training. Duration: 0.97275 2023-10-04 16:25:56,003 WARNING [train.py:1204] (2/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_45-265684-0_sp1.1 from training. Duration: 0.6545625 2023-10-04 16:25:56,024 WARNING [train.py:1204] (2/4) Exclude cut with ID _15150_210_8341_1_1530752411803_7256384_469-239509-0_sp0.9 from training. Duration: 0.7555625 2023-10-04 16:25:56,064 WARNING [train.py:1204] (2/4) Exclude cut with ID _28461_210_2253_1_1532771775189_3041299_136-270644-0_sp1.1 from training. Duration: 0.8454375 2023-10-04 16:25:57,324 WARNING [train.py:1204] (2/4) Exclude cut with ID _69584_210_5219_1_1535785133775_4044450_302-47302-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 16:25:58,686 WARNING [train.py:1204] (2/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_166-128941-0_sp1.1 from training. Duration: 0.4909375 2023-10-04 16:26:05,809 WARNING [train.py:1204] (2/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_559-120504-0_sp1.1 from training. Duration: 0.97275 2023-10-04 16:26:08,642 WARNING [train.py:1204] (2/4) Exclude cut with ID _6456_210_1534_1_1527681634212_3181049_345-252877-0_sp1.1 from training. Duration: 0.5818125 2023-10-04 16:26:08,666 WARNING [train.py:1204] (2/4) Exclude cut with ID _28317_210_12742_1_1532480302381_8510325_902-233003-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 16:26:11,498 WARNING [train.py:1204] (2/4) Exclude cut with ID _36538_210_9994_1_1533204020363_3795620_204-261385-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 16:26:11,503 WARNING [train.py:1204] (2/4) Exclude cut with ID _14821_210_8750_1_1530680503991_6829359_189-297451-0_sp1.1 from training. Duration: 0.5 2023-10-04 16:26:11,543 WARNING [train.py:1204] (2/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_1211-226066-0_sp0.9 from training. Duration: 0.8444375 2023-10-04 16:26:15,522 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.2.encoder.layers.0.conv_module2.whiten, num_groups=1, num_channels=384, metric=10.84 vs. limit=15.0 2023-10-04 16:26:18,745 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6786_210_1388_1_1527839699_4194495_273-149841-0_sp1.1 from training. Duration: 0.836375 2023-10-04 16:26:18,781 WARNING [train.py:1204] (2/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_380-162648-0_sp1.1 from training. Duration: 0.8545625 2023-10-04 16:26:18,782 WARNING [train.py:1204] (2/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_494-199538-0 from training. Duration: 0.75 2023-10-04 16:26:23,456 WARNING [train.py:1204] (2/4) Exclude cut with ID _49097_210_16049_1_1533952780207_4360979_276-87053-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 16:26:24,893 WARNING [train.py:1204] (2/4) Exclude cut with ID _5444_210_2151_1_1527418808464_5312210_726-27165-0 from training. Duration: 0.67 2023-10-04 16:26:29,108 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_128-202104-0_sp1.1 from training. Duration: 0.57275 2023-10-04 16:26:31,840 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_11349_210_6753_1_1529800861_1101207_26-65602-0_sp1.1 from training. Duration: 0.87275 2023-10-04 16:26:31,869 WARNING [train.py:1204] (2/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_159-328157-0 from training. Duration: 0.52 2023-10-04 16:26:33,697 WARNING [train.py:1204] (2/4) Exclude cut with ID _14069_210_5632_1_1530512692033_4803390_98-21934-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 16:26:35,131 WARNING [train.py:1204] (2/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_352-59958-0_sp0.9 from training. Duration: 0.9555625 2023-10-04 16:26:35,166 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6163_210_6400_1_1527506746_4263068_283-137811-0 from training. Duration: 0.77 2023-10-04 16:26:36,485 WARNING [train.py:1204] (2/4) Exclude cut with ID _21989_210_5854_1_1531899214931_3271360_133-44759-0_sp1.1 from training. Duration: 0.7 2023-10-04 16:26:39,325 WARNING [train.py:1204] (2/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_21-174514-0_sp0.9 from training. Duration: 0.47775 2023-10-04 16:26:39,361 WARNING [train.py:1204] (2/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_201-78811-0_sp1.1 from training. Duration: 0.963625 2023-10-04 16:26:40,614 INFO [train.py:1046] (2/4) Epoch 49, batch 2800, loss[loss=0.1553, simple_loss=0.2387, pruned_loss=0.03596, over 23446.00 frames. ], tot_loss[loss=0.1517, simple_loss=0.2321, pruned_loss=0.0357, over 4713854.59 frames. ], batch size: 93, lr: 2.07e-03, grad_scale: 32.0 2023-10-04 16:26:40,681 WARNING [train.py:1204] (2/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_537-164630-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 16:26:40,756 WARNING [train.py:1204] (2/4) Exclude cut with ID _14087_210_2253_1_1530529224512_3014029_156-257607-0 from training. Duration: 0.83 2023-10-04 16:26:41,904 WARNING [train.py:1204] (2/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_308-154450-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 16:26:41,945 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_45399_210_4852_1_1533624725_7579737_214-240574-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 16:26:44,631 WARNING [train.py:1204] (2/4) Exclude cut with ID _29303_210_2575_1_1532392237303_7410649_615-305517-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 16:26:45,879 WARNING [train.py:1204] (2/4) Exclude cut with ID IC0125W0002-61808 from training. Duration: 0.708 2023-10-04 16:26:45,880 WARNING [train.py:1204] (2/4) Exclude cut with ID IC0125W0017-61823 from training. Duration: 0.759 2023-10-04 16:26:47,970 WARNING [train.py:1204] (2/4) Exclude cut with ID _53219_210_4525_1_1534834836008_4064825_4-145291-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 16:26:48,173 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.conv_module1.balancer2.prob, batch_count=1718546.6666666667, ans=0.125 2023-10-04 16:26:51,304 WARNING [train.py:1204] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_185-174805-0_sp1.1 from training. Duration: 0.6181875 2023-10-04 16:26:52,574 WARNING [train.py:1204] (2/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_635-193046-0_sp1.1 from training. Duration: 0.92725 2023-10-04 16:26:55,726 WARNING [train.py:1204] (2/4) Exclude cut with ID _46067_210_4220_1_1534125571066_3307450_321-258866-0_sp1.1 from training. Duration: 0.97275 2023-10-04 16:26:56,050 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.bypass_mid.scale_min, batch_count=1718613.3333333333, ans=0.2 2023-10-04 16:26:56,075 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=1718613.3333333333, ans=0.1 2023-10-04 16:26:57,061 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.772e+02 2.030e+02 2.375e+02 2.876e+02 4.786e+02, threshold=4.750e+02, percent-clipped=2.0 2023-10-04 16:26:58,602 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8558_210_1410_1_1528505225_7613406_131-329524-0 from training. Duration: 0.79 2023-10-04 16:27:00,039 WARNING [train.py:1204] (2/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_847-41334-0_sp0.9 from training. Duration: 0.488875 2023-10-04 16:27:00,131 WARNING [train.py:1204] (2/4) Exclude cut with ID _14227_210_4381_1_1530701804286_3940259_125-275642-0 from training. Duration: 0.99 2023-10-04 16:27:01,582 WARNING [train.py:1204] (2/4) Exclude cut with ID _25248_210_8750_1_1532062825941_5554180_248-177255-0_sp1.1 from training. Duration: 0.963625 2023-10-04 16:27:02,947 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_40047_210_2290_1_1533290519_2374311_195-165693-0_sp1.1 from training. Duration: 0.8090625 2023-10-04 16:27:02,951 WARNING [train.py:1204] (2/4) Exclude cut with ID _23109_210_4381_1_1532150828777_901949_29-258672-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 16:27:03,199 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.conv_skip_rate, batch_count=1718613.3333333333, ans=0.0 2023-10-04 16:27:07,573 WARNING [train.py:1204] (2/4) Exclude cut with ID _14821_210_8750_1_1530680503991_6829359_2-227151-0_sp1.1 from training. Duration: 0.7545625 2023-10-04 16:27:07,601 WARNING [train.py:1204] (2/4) Exclude cut with ID _49643_210_7989_1_1534640244224_3939031_565-53982-0_sp1.1 from training. Duration: 0.963625 2023-10-04 16:27:08,933 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7544_210_6753_1_1528591327_1471426_33-346031-0_sp0.9 from training. Duration: 0.67775 2023-10-04 16:27:09,015 WARNING [train.py:1204] (2/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_95-61782-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 16:27:15,851 WARNING [train.py:1204] (2/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_686-54383-0_sp0.9 from training. Duration: 0.9666875 2023-10-04 16:27:15,969 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_14056_210_4163_1_1530604483_7572827_505-276823-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 16:27:16,200 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.ff3_skip_rate, batch_count=1718680.0, ans=0.0 2023-10-04 16:27:19,210 WARNING [train.py:1204] (2/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_745-128443-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 16:27:20,556 WARNING [train.py:1204] (2/4) Exclude cut with ID _3844_210_1390_1_1526727803582_2871209_20-251664-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 16:27:20,617 WARNING [train.py:1204] (2/4) Exclude cut with ID _36538_210_9994_1_1533204020363_3795620_487-164628-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 16:27:26,946 WARNING [train.py:1204] (2/4) Exclude cut with ID _5166_210_2084_1_1527073197160_4087993_309-222545-0_sp1.1 from training. Duration: 0.763625 2023-10-04 16:27:26,960 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_3531_210_3528_1_1526294203_5216456_309-227302-0 from training. Duration: 0.69 2023-10-04 16:27:27,022 WARNING [train.py:1204] (2/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_42-192460-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 16:27:28,332 WARNING [train.py:1204] (2/4) Exclude cut with ID _22083_210_8254_1_1531702862439_3678970_49-234546-0_sp1.1 from training. Duration: 0.7545625 2023-10-04 16:27:28,336 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_427-78097-0_sp0.9 from training. Duration: 0.888875 2023-10-04 16:27:31,165 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_43802_210_2290_1_1533516203_8829504_378-205254-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 16:27:32,976 WARNING [train.py:1204] (2/4) Exclude cut with ID _45329_210_7005_1_1534157951211_6774339_672-314830-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 16:27:37,274 WARNING [train.py:1204] (2/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_90-30673-0_sp1.1 from training. Duration: 0.763625 2023-10-04 16:27:38,714 WARNING [train.py:1204] (2/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_563-329943-0_sp1.1 from training. Duration: 0.9 2023-10-04 16:27:38,731 WARNING [train.py:1204] (2/4) Exclude cut with ID _21632_210_2253_1_1531994001765_3744140_154-84090-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 16:27:38,732 WARNING [train.py:1204] (2/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_103-336064-0_sp0.9 from training. Duration: 0.5666875 2023-10-04 16:27:39,998 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_43-71989-0_sp0.9 from training. Duration: 0.6333125 2023-10-04 16:27:40,066 WARNING [train.py:1204] (2/4) Exclude cut with ID _3867_210_1883_1_1526641200575_4475830_319-349850-0_sp0.9 from training. Duration: 0.7444375 2023-10-04 16:27:41,394 WARNING [train.py:1204] (2/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_629-179454-0_sp1.1 from training. Duration: 0.963625 2023-10-04 16:27:42,630 WARNING [train.py:1204] (2/4) Exclude cut with ID _65263_210_22356_1_1535763598691_3921037_49-222917-0 from training. Duration: 0.92 2023-10-04 16:27:42,652 WARNING [train.py:1204] (2/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_602-331772-0_sp1.1 from training. Duration: 0.936375 2023-10-04 16:27:44,014 WARNING [train.py:1204] (2/4) Exclude cut with ID _58414_210_8254_1_1535421609162_7151182_292-134978-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 16:27:44,053 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_38793_210_2544_1_1533176114_3849320_200-299292-0_sp1.1 from training. Duration: 0.936375 2023-10-04 16:27:45,387 WARNING [train.py:1204] (2/4) Exclude cut with ID _14970_210_15709_1_1530775547361_1966965_6-23328-0 from training. Duration: 0.86 2023-10-04 16:27:46,801 WARNING [train.py:1204] (2/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_386-225511-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 16:27:46,812 WARNING [train.py:1204] (2/4) Exclude cut with ID _38323_210_12108_1_1533121233369_3303049_416-11919-0_sp1.1 from training. Duration: 0.87275 2023-10-04 16:27:48,171 WARNING [train.py:1204] (2/4) Exclude cut with ID _4866_210_2577_1_1527404412284_4364659_256-306830-0_sp1.1 from training. Duration: 0.7909375 2023-10-04 16:27:49,568 WARNING [train.py:1204] (2/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_53-300544-0 from training. Duration: 0.67 2023-10-04 16:27:52,977 WARNING [train.py:1204] (2/4) Exclude cut with ID _41314_210_12651_1_1533459476543_3766460_80-74184-0_sp1.1 from training. Duration: 0.7545625 2023-10-04 16:27:52,991 WARNING [train.py:1204] (2/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_332-256168-0_sp1.1 from training. Duration: 0.6818125 2023-10-04 16:27:54,865 INFO [train.py:1046] (2/4) Epoch 49, batch 2850, loss[loss=0.1503, simple_loss=0.221, pruned_loss=0.03978, over 23639.00 frames. ], tot_loss[loss=0.1514, simple_loss=0.232, pruned_loss=0.03547, over 4708286.39 frames. ], batch size: 232, lr: 2.07e-03, grad_scale: 16.0 2023-10-04 16:27:54,966 WARNING [train.py:1204] (2/4) Exclude cut with ID _48836_210_12210_1_1534075848293_3904150_429-35475-0_sp1.1 from training. Duration: 0.8090625 2023-10-04 16:27:55,266 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.bypass.skip_rate, batch_count=1718880.0, ans=0.04949747468305833 2023-10-04 16:27:58,150 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_144-135833-0_sp1.1 from training. Duration: 0.97275 2023-10-04 16:28:00,946 WARNING [train.py:1204] (2/4) Exclude cut with ID _14224_210_4381_1_1530529062037_4264010_380-343670-0_sp0.9 from training. Duration: 0.97775 2023-10-04 16:28:00,972 WARNING [train.py:1204] (2/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_213-265174-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 16:28:02,717 WARNING [train.py:1204] (2/4) Exclude cut with ID _20455_210_5219_1_1531565996775_8098100_86-314283-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 16:28:04,208 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_41750_210_1960_1_1533520678_3948042_68-223813-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 16:28:04,261 WARNING [train.py:1204] (2/4) Exclude cut with ID _55153_210_2253_1_1534384786677_3162096_24-198429-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 16:28:05,751 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.balancer1.prob, batch_count=1718880.0, ans=0.125 2023-10-04 16:28:06,997 WARNING [train.py:1204] (2/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_119-207162-0_sp0.9 from training. Duration: 0.788875 2023-10-04 16:28:07,049 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_12868_210_6753_1_1530702479_3610564_148-172861-0 from training. Duration: 0.5 2023-10-04 16:28:13,938 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_5522_210_3273_1_1527249606_3909096_318-203153-0 from training. Duration: 0.43 2023-10-04 16:28:13,952 WARNING [train.py:1204] (2/4) Exclude cut with ID _14884_210_2252_1_1530707459689_5561250_288-189949-0_sp1.1 from training. Duration: 0.97275 2023-10-04 16:28:15,367 WARNING [train.py:1204] (2/4) Exclude cut with ID _5294_210_1747_1_1527249643928_3696179_529-242298-0 from training. Duration: 0.95 2023-10-04 16:28:16,680 WARNING [train.py:1204] (2/4) Exclude cut with ID _36538_210_9994_1_1533204020363_3795620_503-110289-0_sp1.1 from training. Duration: 0.936375 2023-10-04 16:28:19,479 WARNING [train.py:1204] (2/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_385-193052-0 from training. Duration: 0.86 2023-10-04 16:28:19,545 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_456-22141-0 from training. Duration: 0.88 2023-10-04 16:28:21,540 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_38965_210_4852_1_1533174906_3484735_284-117840-0_sp1.1 from training. Duration: 0.936375 2023-10-04 16:28:32,088 WARNING [train.py:1204] (2/4) Exclude cut with ID _32704_210_8954_1_1533017488840_6747370_75-201625-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 16:28:32,267 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.nonlin_attention.balancer.prob, batch_count=1719013.3333333333, ans=0.125 2023-10-04 16:28:33,977 WARNING [train.py:1204] (2/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_255-312654-0_sp1.1 from training. Duration: 0.82725 2023-10-04 16:28:33,988 WARNING [train.py:1204] (2/4) Exclude cut with ID _47251_210_18628_1_1533814063128_4137250_214-88076-0_sp0.9 from training. Duration: 0.97775 2023-10-04 16:28:35,345 WARNING [train.py:1204] (2/4) Exclude cut with ID _4092_210_2151_1_1526643044449_3135759_227-213972-0_sp1.1 from training. Duration: 0.4909375 2023-10-04 16:28:35,368 WARNING [train.py:1204] (2/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_912-47473-0_sp1.1 from training. Duration: 0.6181875 2023-10-04 16:28:35,388 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6767_210_3273_1_1527855783_1718932_173-256601-0_sp1.1 from training. Duration: 0.72725 2023-10-04 16:28:38,204 WARNING [train.py:1204] (2/4) Exclude cut with ID _6274_210_2084_1_1527937583837_7494020_32-205288-0_sp1.1 from training. Duration: 0.7090625 2023-10-04 16:28:38,481 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.feed_forward1.hidden_balancer.prob, batch_count=1719080.0, ans=0.125 2023-10-04 16:28:39,535 WARNING [train.py:1204] (2/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_39-248870-0 from training. Duration: 0.69 2023-10-04 16:28:40,933 WARNING [train.py:1204] (2/4) Exclude cut with ID _3541_210_2477_1_1526475408709_3263973_377-296839-0_sp1.1 from training. Duration: 0.5 2023-10-04 16:28:40,954 WARNING [train.py:1204] (2/4) Exclude cut with ID _13679_210_7497_1_1530534293662_3355098_413-56154-0_sp1.1 from training. Duration: 0.963625 2023-10-04 16:28:40,999 WARNING [train.py:1204] (2/4) Exclude cut with ID _14970_210_15709_1_1530775547361_1966965_201-84544-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 16:28:42,328 WARNING [train.py:1204] (2/4) Exclude cut with ID _21783_210_11357_1_1531742458730_3762679_105-95775-0_sp1.1 from training. Duration: 0.936375 2023-10-04 16:28:45,174 WARNING [train.py:1204] (2/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_131-191669-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 16:28:45,214 WARNING [train.py:1204] (2/4) Exclude cut with ID _14860_210_4915_1_1530702020340_7343240_695-4323-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 16:28:46,544 WARNING [train.py:1204] (2/4) Exclude cut with ID _39867_210_9826_1_1533553222030_9344259_991-130565-0_sp1.1 from training. Duration: 0.97275 2023-10-04 16:28:48,013 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_50-3289-0_sp1.1 from training. Duration: 0.82725 2023-10-04 16:28:49,440 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_54-347670-0_sp1.1 from training. Duration: 0.92725 2023-10-04 16:28:49,509 WARNING [train.py:1204] (2/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_200-153445-0_sp1.1 from training. Duration: 0.936375 2023-10-04 16:28:49,693 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.bypass.skip_rate, batch_count=1719080.0, ans=0.09899494936611666 2023-10-04 16:28:52,800 WARNING [train.py:1204] (2/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_279-302911-0_sp1.1 from training. Duration: 0.97275 2023-10-04 16:28:54,138 WARNING [train.py:1204] (2/4) Exclude cut with ID _14886_210_4220_1_1530702020355_3516190_159-73748-0_sp0.9 from training. Duration: 0.92225 2023-10-04 16:28:58,817 WARNING [train.py:1204] (2/4) Exclude cut with ID _53379_210_20034_1_1534222027363_3854654_23-222715-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 16:29:01,947 WARNING [train.py:1204] (2/4) Exclude cut with ID _12555_210_8750_1_1530075594402_3780239_449-267986-0 from training. Duration: 0.83 2023-10-04 16:29:01,966 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7544_210_6753_1_1528587722_3576686_295-258479-0 from training. Duration: 0.84 2023-10-04 16:29:03,484 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7002_210_1794_1_1527935634_4034444_487-236501-0_sp0.9 from training. Duration: 0.7333125 2023-10-04 16:29:04,781 WARNING [train.py:1204] (2/4) Exclude cut with ID _21783_210_11357_1_1531742458730_3762679_244-19107-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 16:29:04,803 WARNING [train.py:1204] (2/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_390-339137-0 from training. Duration: 0.56 2023-10-04 16:29:04,856 WARNING [train.py:1204] (2/4) Exclude cut with ID _7562_210_2084_1_1528887767916_3946229_186-326422-0_sp1.1 from training. Duration: 0.763625 2023-10-04 16:29:06,649 WARNING [train.py:1204] (2/4) Exclude cut with ID _33640_210_2655_1_1533117605479_4855469_645-145235-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 16:29:06,665 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_25767_210_2161_1_1532307483_3867748_506-171777-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 16:29:06,684 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_5438_210_1794_1_1527336687_2600009_233-58072-0_sp1.1 from training. Duration: 0.7 2023-10-04 16:29:06,684 WARNING [train.py:1204] (2/4) Exclude cut with ID IC0669W0063-330908 from training. Duration: 0.896 2023-10-04 16:29:06,723 WARNING [train.py:1204] (2/4) Exclude cut with ID IC0671W0070-331913 from training. Duration: 0.896 2023-10-04 16:29:06,727 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_258-310570-0_sp1.1 from training. Duration: 0.7454375 2023-10-04 16:29:06,966 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.self_attn_weights.pos_emb_skip_rate, batch_count=1719146.6666666667, ans=0.0 2023-10-04 16:29:08,046 WARNING [train.py:1204] (2/4) Exclude cut with ID _9461_210_4557_1_1529060514726_3677451_162-177507-0_sp1.1 from training. Duration: 0.97275 2023-10-04 16:29:09,288 INFO [train.py:1046] (2/4) Epoch 49, batch 2900, loss[loss=0.1471, simple_loss=0.2395, pruned_loss=0.02738, over 24595.00 frames. ], tot_loss[loss=0.1511, simple_loss=0.2319, pruned_loss=0.03517, over 4723100.06 frames. ], batch size: 71, lr: 2.07e-03, grad_scale: 16.0 2023-10-04 16:29:12,202 WARNING [train.py:1204] (2/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_26-175553-0_sp0.9 from training. Duration: 0.67775 2023-10-04 16:29:12,218 WARNING [train.py:1204] (2/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_7-150481-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 16:29:13,474 WARNING [train.py:1204] (2/4) Exclude cut with ID _23557_210_8750_1_1531890106370_6846255_72-273262-0_sp1.1 from training. Duration: 0.963625 2023-10-04 16:29:14,835 WARNING [train.py:1204] (2/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_367-61330-0 from training. Duration: 0.75 2023-10-04 16:29:17,872 WARNING [train.py:1204] (2/4) Exclude cut with ID _39867_210_9826_1_1533553222030_9344259_1337-11141-0_sp1.1 from training. Duration: 0.936375 2023-10-04 16:29:17,898 WARNING [train.py:1204] (2/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_101-293367-0 from training. Duration: 0.63 2023-10-04 16:29:18,158 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.conv_module1.balancer2.min_positive, batch_count=1719213.3333333333, ans=0.05 2023-10-04 16:29:19,293 WARNING [train.py:1204] (2/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_10-100367-0 from training. Duration: 0.98 2023-10-04 16:29:21,849 WARNING [train.py:1204] (2/4) Exclude cut with ID _60057_210_10640_1_1534903065793_3333150_171-181133-0_sp1.1 from training. Duration: 0.8 2023-10-04 16:29:21,852 WARNING [train.py:1204] (2/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_844-96989-0_sp0.9 from training. Duration: 0.788875 2023-10-04 16:29:23,313 WARNING [train.py:1204] (2/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_301-144595-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 16:29:24,587 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.683e+02 2.064e+02 2.211e+02 2.560e+02 4.990e+02, threshold=4.422e+02, percent-clipped=1.0 2023-10-04 16:29:24,687 WARNING [train.py:1204] (2/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_115-147343-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 16:29:28,514 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_3171_210_2087_1_1526109084_2330007_37-64334-0_sp1.1 from training. Duration: 0.7454375 2023-10-04 16:29:28,563 WARNING [train.py:1204] (2/4) Exclude cut with ID _19514_210_9259_1_1531530071330_5084230_769-272914-0_sp1.1 from training. Duration: 0.936375 2023-10-04 16:29:31,813 WARNING [train.py:1204] (2/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_76-311963-0_sp0.9 from training. Duration: 0.77775 2023-10-04 16:29:31,842 WARNING [train.py:1204] (2/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_526-62079-0 from training. Duration: 0.67 2023-10-04 16:29:32,617 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.2.encoder.layers.2.self_attn2.whiten, num_groups=1, num_channels=384, metric=17.19 vs. limit=22.5 2023-10-04 16:29:33,244 WARNING [train.py:1204] (2/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_7-111954-0_sp0.9 from training. Duration: 0.62225 2023-10-04 16:29:33,364 WARNING [train.py:1204] (2/4) Exclude cut with ID _57997_210_4163_1_1534766425889_3961049_360-279140-0_sp1.1 from training. Duration: 0.97275 2023-10-04 16:29:36,147 WARNING [train.py:1204] (2/4) Exclude cut with ID _4866_210_2577_1_1527404412284_4364659_133-1696-0 from training. Duration: 0.99 2023-10-04 16:29:38,047 WARNING [train.py:1204] (2/4) Exclude cut with ID _5190_210_4557_1_1527244368799_373439_28-197030-0 from training. Duration: 0.72 2023-10-04 16:29:38,654 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.5.encoder.layers.0.self_attn1.whiten, num_groups=1, num_channels=256, metric=12.26 vs. limit=22.5 2023-10-04 16:29:40,681 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_53-39003-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 16:29:40,684 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6673_210_3273_1_1527767790_3173240_351-167954-0 from training. Duration: 0.48 2023-10-04 16:29:40,706 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_2822_210_2715_1_1526112586_1999587_165-36656-0_sp1.1 from training. Duration: 0.8090625 2023-10-04 16:29:43,395 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_328-264892-0_sp1.1 from training. Duration: 0.92725 2023-10-04 16:29:43,399 WARNING [train.py:1204] (2/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_29-293896-0_sp0.9 from training. Duration: 0.67775 2023-10-04 16:29:43,606 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.bypass_mid.scale_min, batch_count=1719346.6666666667, ans=0.2 2023-10-04 16:29:46,074 WARNING [train.py:1204] (2/4) Exclude cut with ID _65263_210_22356_1_1535763598691_3921037_465-55448-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 16:29:47,400 WARNING [train.py:1204] (2/4) Exclude cut with ID _21989_210_5854_1_1531899214931_3271360_311-53106-0_sp1.1 from training. Duration: 0.97275 2023-10-04 16:29:51,539 WARNING [train.py:1204] (2/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_389-277915-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 16:29:54,449 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_69630_210_7234_1_1535791094_677837_63-277139-0_sp1.1 from training. Duration: 0.963625 2023-10-04 16:29:55,871 WARNING [train.py:1204] (2/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_85-138257-0 from training. Duration: 0.71 2023-10-04 16:29:57,659 WARNING [train.py:1204] (2/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_686-247600-0 from training. Duration: 0.92 2023-10-04 16:29:57,664 WARNING [train.py:1204] (2/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_108-255430-0_sp1.1 from training. Duration: 0.8818125 2023-10-04 16:30:00,239 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.2.encoder.layers.0.conv_module1.whiten, num_groups=1, num_channels=384, metric=6.37 vs. limit=15.0 2023-10-04 16:30:00,863 WARNING [train.py:1204] (2/4) Exclude cut with ID _14884_210_2252_1_1530707459689_5561250_114-201399-0_sp1.1 from training. Duration: 0.6909375 2023-10-04 16:30:02,310 WARNING [train.py:1204] (2/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_506-42614-0 from training. Duration: 0.79 2023-10-04 16:30:03,656 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_43802_210_2290_1_1533516203_8829504_85-195240-0_sp1.1 from training. Duration: 0.7909375 2023-10-04 16:30:10,090 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_141-36243-0_sp1.1 from training. Duration: 0.97275 2023-10-04 16:30:16,953 WARNING [train.py:1204] (2/4) Exclude cut with ID _7852_210_5219_1_1528286471316_8139400_358-287492-0_sp1.1 from training. Duration: 0.87275 2023-10-04 16:30:16,975 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_805-71497-0_sp0.9 from training. Duration: 0.788875 2023-10-04 16:30:18,242 WARNING [train.py:1204] (2/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_178-39722-0 from training. Duration: 0.49 2023-10-04 16:30:21,046 WARNING [train.py:1204] (2/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_428-181925-0_sp1.1 from training. Duration: 0.963625 2023-10-04 16:30:21,050 WARNING [train.py:1204] (2/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_770-200430-0 from training. Duration: 0.89 2023-10-04 16:30:22,470 INFO [train.py:1046] (2/4) Epoch 49, batch 2950, loss[loss=0.1397, simple_loss=0.2221, pruned_loss=0.02861, over 24588.00 frames. ], tot_loss[loss=0.1518, simple_loss=0.2325, pruned_loss=0.03554, over 4722006.20 frames. ], batch size: 60, lr: 2.07e-03, grad_scale: 16.0 2023-10-04 16:30:22,530 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_1104-271009-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 16:30:23,880 WARNING [train.py:1204] (2/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_128-296581-0_sp0.9 from training. Duration: 0.82225 2023-10-04 16:30:28,842 WARNING [train.py:1204] (2/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_19-62511-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 16:30:31,431 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.3.encoder.layers.1.nonlin_attention.whiten2, num_groups=1, num_channels=512, metric=6.28 vs. limit=15.0 2023-10-04 16:30:32,193 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_805-71497-0 from training. Duration: 0.71 2023-10-04 16:30:33,473 WARNING [train.py:1204] (2/4) Exclude cut with ID _44778_210_2054_1_1533798127203_7443090_573-152077-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 16:30:33,478 WARNING [train.py:1204] (2/4) Exclude cut with ID _42875_210_14856_1_1533704397487_4012433_393-177816-0_sp1.1 from training. Duration: 0.963625 2023-10-04 16:30:33,587 WARNING [train.py:1204] (2/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_278-35159-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 16:30:33,735 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=1719546.6666666667, ans=0.1 2023-10-04 16:30:34,938 WARNING [train.py:1204] (2/4) Exclude cut with ID _15037_210_2253_1_1530752294104_3267650_13-130361-0_sp1.1 from training. Duration: 0.9 2023-10-04 16:30:36,775 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_3019_210_2028_1_1526044245_3264865_127-135615-0 from training. Duration: 0.94 2023-10-04 16:30:38,171 WARNING [train.py:1204] (2/4) Exclude cut with ID _28317_210_12742_1_1532480302381_8510325_607-213951-0 from training. Duration: 0.84 2023-10-04 16:30:38,229 WARNING [train.py:1204] (2/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_436-318113-0_sp0.9 from training. Duration: 0.8333125 2023-10-04 16:30:38,230 WARNING [train.py:1204] (2/4) Exclude cut with ID _13938_210_9988_1_1530518718978_3693960_148-152225-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 16:30:43,450 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.3.encoder.layers.1.nonlin_attention.whiten2, num_groups=1, num_channels=512, metric=4.68 vs. limit=15.0 2023-10-04 16:30:44,311 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_2541_210_1794_1_1525590269_3084765_156-173713-0_sp1.1 from training. Duration: 0.736375 2023-10-04 16:30:46,988 WARNING [train.py:1204] (2/4) Exclude cut with ID _28323_210_16187_1_1532480395667_4100300_531-24157-0_sp1.1 from training. Duration: 0.8909375 2023-10-04 16:30:48,367 WARNING [train.py:1204] (2/4) Exclude cut with ID _29303_210_2575_1_1532392237303_7410649_382-217492-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 16:30:48,406 WARNING [train.py:1204] (2/4) Exclude cut with ID _58895_210_8750_1_1535184081320_3649089_270-158013-0_sp1.1 from training. Duration: 0.92725 2023-10-04 16:30:51,221 WARNING [train.py:1204] (2/4) Exclude cut with ID _29277_210_3279_1_1532581109293_3173069_311-206227-0_sp1.1 from training. Duration: 0.936375 2023-10-04 16:30:51,238 WARNING [train.py:1204] (2/4) Exclude cut with ID _7160_210_1876_1_1527940923231_3703799_282-322611-0_sp0.9 from training. Duration: 0.9666875 2023-10-04 16:30:52,762 WARNING [train.py:1204] (2/4) Exclude cut with ID _53219_210_4525_1_1534834836008_4064825_562-32295-0_sp1.1 from training. Duration: 0.963625 2023-10-04 16:30:54,154 WARNING [train.py:1204] (2/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_408-87330-0_sp1.1 from training. Duration: 0.963625 2023-10-04 16:30:54,160 WARNING [train.py:1204] (2/4) Exclude cut with ID _59202_210_26984_1_1534816810612_3549174_137-318710-0_sp1.1 from training. Duration: 0.8090625 2023-10-04 16:30:55,571 WARNING [train.py:1204] (2/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_372-30302-0 from training. Duration: 0.86 2023-10-04 16:31:01,581 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7543_210_6753_1_1528541907_1399322_178-56269-0_sp1.1 from training. Duration: 0.95275 2023-10-04 16:31:02,858 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9109W0012-549758 from training. Duration: 0.512 2023-10-04 16:31:02,940 WARNING [train.py:1204] (2/4) Exclude cut with ID _5410_210_1388_1_1527397055795_1719020_39-112221-0_sp0.9 from training. Duration: 0.7666875 2023-10-04 16:31:04,722 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9121W0012-554933 from training. Duration: 0.896 2023-10-04 16:31:04,969 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.bypass.skip_rate, batch_count=1719680.0, ans=0.07 2023-10-04 16:31:06,593 WARNING [train.py:1204] (2/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_476-272757-0 from training. Duration: 0.93 2023-10-04 16:31:06,615 WARNING [train.py:1204] (2/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_1085-171718-0_sp1.1 from training. Duration: 0.92725 2023-10-04 16:31:07,987 WARNING [train.py:1204] (2/4) Exclude cut with ID _8480_210_1874_1_1528589391205_6728330_309-139896-0_sp1.1 from training. Duration: 0.736375 2023-10-04 16:31:07,988 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9130W0113-559703 from training. Duration: 0.896 2023-10-04 16:31:07,992 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_292-265928-0_sp1.1 from training. Duration: 0.463625 2023-10-04 16:31:09,487 WARNING [train.py:1204] (2/4) Exclude cut with ID _8772_210_1850_1_1528635595972_3664011_96-12756-0 from training. Duration: 0.98 2023-10-04 16:31:10,816 WARNING [train.py:1204] (2/4) Exclude cut with ID _12556_210_8341_1_1530081024100_7327690_725-193477-0_sp1.1 from training. Duration: 0.8909375 2023-10-04 16:31:10,865 WARNING [train.py:1204] (2/4) Exclude cut with ID _5289_210_2084_1_1527678202736_3779299_142-103317-0_sp1.1 from training. Duration: 0.763625 2023-10-04 16:31:12,929 WARNING [train.py:1204] (2/4) Exclude cut with ID _45012_210_12651_1_1533608435136_5371380_206-51075-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 16:31:14,338 WARNING [train.py:1204] (2/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_176-56120-0_sp1.1 from training. Duration: 0.7545625 2023-10-04 16:31:14,376 WARNING [train.py:1204] (2/4) Exclude cut with ID _40284_210_2084_1_1533513679814_3704279_220-201864-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 16:31:15,575 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9165W0004-576638 from training. Duration: 0.896 2023-10-04 16:31:15,622 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_5438_210_1794_1_1527336687_2600009_218-343336-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 16:31:15,652 WARNING [train.py:1204] (2/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_318-162017-0 from training. Duration: 0.91 2023-10-04 16:31:21,208 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_49544_210_1794_1_1534379770_5262083_483-349298-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 16:31:22,586 WARNING [train.py:1204] (2/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_197-28164-0_sp0.9 from training. Duration: 0.888875 2023-10-04 16:31:22,647 WARNING [train.py:1204] (2/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_643-52007-0 from training. Duration: 0.57 2023-10-04 16:31:22,673 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_38965_210_4852_1_1533174906_3484735_403-332963-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 16:31:22,872 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.self_attn_weights.pos_emb_skip_rate, batch_count=1719813.3333333333, ans=0.0 2023-10-04 16:31:24,084 WARNING [train.py:1204] (2/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_340-174176-0 from training. Duration: 0.49 2023-10-04 16:31:26,927 WARNING [train.py:1204] (2/4) Exclude cut with ID _56708_210_13177_1_1535252351282_3422899_363-917-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 16:31:28,840 WARNING [train.py:1204] (2/4) Exclude cut with ID _19896_210_1883_1_1531709022662_6649569_525-159063-0_sp1.1 from training. Duration: 0.936375 2023-10-04 16:31:30,066 WARNING [train.py:1204] (2/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_430-116139-0_sp0.9 from training. Duration: 0.9444375 2023-10-04 16:31:31,065 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.0.layers.0.self_attn2.whiten, num_groups=1, num_channels=192, metric=10.97 vs. limit=22.5 2023-10-04 16:31:31,436 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_53753_210_7234_1_1534320003_3594148_86-329250-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 16:31:31,453 WARNING [train.py:1204] (2/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_406-275649-0_sp1.1 from training. Duration: 0.3818125 2023-10-04 16:31:32,777 WARNING [train.py:1204] (2/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_1083-147536-0_sp1.1 from training. Duration: 0.8818125 2023-10-04 16:31:32,842 WARNING [train.py:1204] (2/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_54-311741-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 16:31:32,853 WARNING [train.py:1204] (2/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_237-58433-0_sp0.9 from training. Duration: 0.82225 2023-10-04 16:31:34,721 WARNING [train.py:1204] (2/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_98-51676-0_sp1.1 from training. Duration: 0.663625 2023-10-04 16:31:36,392 WARNING [train.py:1204] (2/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_386-270472-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 16:31:36,480 WARNING [train.py:1204] (2/4) Exclude cut with ID _19023_210_4353_1_1531378892552_5820780_83-254794-0_sp1.1 from training. Duration: 0.7818125 2023-10-04 16:31:37,787 INFO [train.py:1046] (2/4) Epoch 49, batch 3000, loss[loss=0.1513, simple_loss=0.2361, pruned_loss=0.0332, over 24476.00 frames. ], tot_loss[loss=0.1526, simple_loss=0.2334, pruned_loss=0.03594, over 4723096.27 frames. ], batch size: 66, lr: 2.07e-03, grad_scale: 16.0 2023-10-04 16:31:37,787 INFO [train.py:1069] (2/4) Computing validation loss 2023-10-04 16:31:46,519 INFO [zipformer.py:1853] (2/4) name=encoder.encoders.2.encoder.layers.2.self_attn_weights, attn_weights_entropy = tensor([3.5789, 3.2795, 2.8431, 2.8574], device='cuda:2') 2023-10-04 16:31:49,872 INFO [train.py:1078] (2/4) Epoch 49, validation: loss=0.3542, simple_loss=0.2825, pruned_loss=0.2129, over 1125622.00 frames. 2023-10-04 16:31:49,873 INFO [train.py:1079] (2/4) Maximum memory allocated so far is 20967MB 2023-10-04 16:31:49,990 WARNING [train.py:1204] (2/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_314-93656-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 16:31:50,001 WARNING [train.py:1204] (2/4) Exclude cut with ID _14170_210_5219_1_1530525949868_4469650_336-213530-0 from training. Duration: 0.9 2023-10-04 16:31:51,312 WARNING [train.py:1204] (2/4) Exclude cut with ID _36686_210_5665_1_1533778225913_3646410_65-221120-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 16:31:54,081 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_26433_210_12108_1_1532157158_4056480_271-68178-0_sp1.1 from training. Duration: 0.92725 2023-10-04 16:31:54,118 WARNING [train.py:1204] (2/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_46-207599-0_sp0.9 from training. Duration: 0.711125 2023-10-04 16:31:57,311 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9303W0114-637133 from training. Duration: 0.896 2023-10-04 16:31:57,941 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.3.encoder.layers.1.feed_forward1.out_whiten, num_groups=1, num_channels=512, metric=7.47 vs. limit=15.0 2023-10-04 16:31:58,719 WARNING [train.py:1204] (2/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_490-207861-0 from training. Duration: 0.98 2023-10-04 16:32:00,221 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6767_210_3273_1_1527855783_1718932_173-256601-0_sp0.9 from training. Duration: 0.888875 2023-10-04 16:32:00,253 WARNING [train.py:1204] (2/4) Exclude cut with ID _13921_210_9994_1_1530841782272_3503520_327-69677-0_sp1.1 from training. Duration: 0.8181875 2023-10-04 16:32:02,143 WARNING [train.py:1204] (2/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_508-335369-0 from training. Duration: 0.48 2023-10-04 16:32:02,176 WARNING [train.py:1204] (2/4) Exclude cut with ID _64631_210_27767_1_1535349580068_4165160_24-46567-0_sp1.1 from training. Duration: 0.963625 2023-10-04 16:32:06,781 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.702e+02 2.107e+02 2.317e+02 2.666e+02 4.231e+02, threshold=4.633e+02, percent-clipped=0.0 2023-10-04 16:32:08,257 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_89-35748-0_sp1.1 from training. Duration: 0.7181875 2023-10-04 16:32:17,506 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_26433_210_12108_1_1532157158_4056480_578-262107-0_sp1.1 from training. Duration: 0.9 2023-10-04 16:32:22,982 WARNING [train.py:1204] (2/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_129-337802-0 from training. Duration: 0.72 2023-10-04 16:32:26,060 WARNING [train.py:1204] (2/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_356-167816-0_sp0.9 from training. Duration: 0.8 2023-10-04 16:32:27,621 WARNING [train.py:1204] (2/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_137-116442-0_sp1.1 from training. Duration: 0.7090625 2023-10-04 16:32:27,823 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.conv_module1.balancer1.prob, batch_count=1720013.3333333333, ans=0.125 2023-10-04 16:32:28,912 WARNING [train.py:1204] (2/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_358-172940-0_sp1.1 from training. Duration: 0.963625 2023-10-04 16:32:28,943 WARNING [train.py:1204] (2/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_668-344147-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 16:32:30,900 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.4.encoder.layers.1.nonlin_attention.whiten2, num_groups=1, num_channels=384, metric=3.15 vs. limit=15.0 2023-10-04 16:32:31,693 WARNING [train.py:1204] (2/4) Exclude cut with ID _62337_210_12392_1_1535009273105_3412229_255-157667-0_sp1.1 from training. Duration: 0.936375 2023-10-04 16:32:31,697 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_486-151131-0 from training. Duration: 0.85 2023-10-04 16:32:33,963 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.0.layers.0.conv_module2.whiten, num_groups=1, num_channels=192, metric=11.10 vs. limit=15.0 2023-10-04 16:32:34,496 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_53638_210_21581_1_1534297319_5808449_885-80172-0 from training. Duration: 0.96 2023-10-04 16:32:35,937 WARNING [train.py:1204] (2/4) Exclude cut with ID _50093_210_4220_1_1534417141207_3432299_105-183067-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 16:32:36,000 WARNING [train.py:1204] (2/4) Exclude cut with ID _7562_210_2084_1_1528887767916_3946229_345-169156-0_sp0.9 from training. Duration: 0.6444375 2023-10-04 16:32:39,303 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_4528_210_3273_1_1527076654_3276777_209-219804-0_sp0.9 from training. Duration: 0.7666875 2023-10-04 16:32:39,328 WARNING [train.py:1204] (2/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_35-153886-0_sp0.9 from training. Duration: 0.9333125 2023-10-04 16:32:41,176 WARNING [train.py:1204] (2/4) Exclude cut with ID _30800_210_22382_1_1532499347904_4515959_332-121925-0_sp1.1 from training. Duration: 0.92725 2023-10-04 16:32:41,178 WARNING [train.py:1204] (2/4) Exclude cut with ID _59202_210_26984_1_1534816810612_3549174_495-65515-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 16:32:43,925 WARNING [train.py:1204] (2/4) Exclude cut with ID _6274_210_2084_1_1527937583837_7494020_769-168302-0_sp1.1 from training. Duration: 0.6454375 2023-10-04 16:32:45,259 WARNING [train.py:1204] (2/4) Exclude cut with ID _24271_210_12658_1_1531987270022_7190519_708-71345-0_sp1.1 from training. Duration: 0.936375 2023-10-04 16:32:45,259 WARNING [train.py:1204] (2/4) Exclude cut with ID 3033-130750-0096-55598-0_sp0.9 from training. Duration: 0.92225 2023-10-04 16:32:45,386 WARNING [train.py:1204] (2/4) Exclude cut with ID _14087_210_2253_1_1530529224512_3014029_219-224445-0_sp0.9 from training. Duration: 0.9333125 2023-10-04 16:32:46,861 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_174-277364-0 from training. Duration: 0.71 2023-10-04 16:32:48,148 WARNING [train.py:1204] (2/4) Exclude cut with ID _45021_210_9994_1_1533619751094_3330288_96-239010-0_sp1.1 from training. Duration: 0.82725 2023-10-04 16:32:49,499 WARNING [train.py:1204] (2/4) Exclude cut with ID _31602_210_5854_1_1532677021456_4458250_489-305116-0_sp1.1 from training. Duration: 0.97275 2023-10-04 16:32:49,526 WARNING [train.py:1204] (2/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_269-154726-0_sp0.9 from training. Duration: 0.9555625 2023-10-04 16:32:51,096 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=1720146.6666666667, ans=0.1 2023-10-04 16:32:53,481 WARNING [train.py:1204] (2/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_126-54418-0_sp1.1 from training. Duration: 0.92725 2023-10-04 16:32:53,522 WARNING [train.py:1204] (2/4) Exclude cut with ID _42326_210_7770_1_1533814282531_3707269_182-224159-0_sp1.1 from training. Duration: 0.92725 2023-10-04 16:32:56,224 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7002_210_1794_1_1527939683_5157286_1-281264-0_sp0.9 from training. Duration: 0.488875 2023-10-04 16:32:56,264 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_86-16881-0 from training. Duration: 0.58 2023-10-04 16:32:56,290 WARNING [train.py:1204] (2/4) Exclude cut with ID _19514_210_9259_1_1531530071330_5084230_637-279094-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 16:32:56,315 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_26433_210_12108_1_1532157158_4056480_578-262107-0 from training. Duration: 0.99 2023-10-04 16:32:58,233 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6291_210_3273_1_1527683649_1078498_82-221196-0_sp1.1 from training. Duration: 0.6909375 2023-10-04 16:33:01,015 WARNING [train.py:1204] (2/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_402-102311-0 from training. Duration: 0.67 2023-10-04 16:33:02,498 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1015-343884-0_sp1.1 from training. Duration: 0.563625 2023-10-04 16:33:03,778 INFO [train.py:1046] (2/4) Epoch 49, batch 3050, loss[loss=0.1562, simple_loss=0.2397, pruned_loss=0.03636, over 23603.00 frames. ], tot_loss[loss=0.1534, simple_loss=0.2342, pruned_loss=0.03633, over 4727919.75 frames. ], batch size: 85, lr: 2.07e-03, grad_scale: 16.0 2023-10-04 16:33:03,835 WARNING [train.py:1204] (2/4) Exclude cut with ID _13684_210_7497_1_1530619354863_3598359_312-235606-0_sp1.1 from training. Duration: 0.5454375 2023-10-04 16:33:03,866 WARNING [train.py:1204] (2/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_55-31904-0 from training. Duration: 0.88 2023-10-04 16:33:03,944 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_3133_210_2087_1_1526106446_2324690_25-332138-0 from training. Duration: 0.91 2023-10-04 16:33:03,946 WARNING [train.py:1204] (2/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_912-282412-0_sp0.9 from training. Duration: 0.5444375 2023-10-04 16:33:05,880 WARNING [train.py:1204] (2/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_458-299435-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 16:33:05,980 WARNING [train.py:1204] (2/4) Exclude cut with ID _69584_210_5219_1_1535785133775_4044450_275-88403-0_sp1.1 from training. Duration: 0.92725 2023-10-04 16:33:05,988 WARNING [train.py:1204] (2/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_841-341679-0_sp1.1 from training. Duration: 0.52725 2023-10-04 16:33:07,894 WARNING [train.py:1204] (2/4) Exclude cut with ID _41391_210_7994_1_1533603771872_3638201_1-54521-0_sp1.1 from training. Duration: 0.97275 2023-10-04 16:33:07,945 WARNING [train.py:1204] (2/4) Exclude cut with ID _56235_210_10640_1_1534557335235_4161099_495-186609-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 16:33:09,456 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.0.conv_module2.balancer1.prob, batch_count=1720213.3333333333, ans=0.125 2023-10-04 16:33:10,765 WARNING [train.py:1204] (2/4) Exclude cut with ID _4681_210_3525_1_1527250689464_120889_15-126760-0 from training. Duration: 0.81 2023-10-04 16:33:13,897 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_3019_210_2028_1_1526044245_3264865_127-135615-0_sp1.1 from training. Duration: 0.8545625 2023-10-04 16:33:15,393 WARNING [train.py:1204] (2/4) Exclude cut with ID _30191_210_1664_1_1533189915047_7196670_915-21561-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 16:33:16,724 WARNING [train.py:1204] (2/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_6-214815-0_sp0.9 from training. Duration: 0.9444375 2023-10-04 16:33:19,942 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.5.encoder.layers.0.self_attn1.whiten, num_groups=1, num_channels=256, metric=11.79 vs. limit=22.5 2023-10-04 16:33:20,854 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_45399_210_4852_1_1533624725_7579737_190-137494-0_sp1.1 from training. Duration: 0.97275 2023-10-04 16:33:22,326 WARNING [train.py:1204] (2/4) Exclude cut with ID _41314_210_12651_1_1533459476543_3766460_207-132038-0 from training. Duration: 0.94 2023-10-04 16:33:25,446 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.conv_skip_rate, batch_count=1720280.0, ans=0.0 2023-10-04 16:33:29,595 WARNING [train.py:1204] (2/4) Exclude cut with ID _14603_210_4220_1_1530615688441_3097319_90-31383-0 from training. Duration: 0.87 2023-10-04 16:33:29,628 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_15517_210_6753_1_1530952293_1664795_119-31001-0 from training. Duration: 0.94 2023-10-04 16:33:29,658 WARNING [train.py:1204] (2/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_684-85342-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 16:33:32,505 WARNING [train.py:1204] (2/4) Exclude cut with ID _14603_210_4220_1_1530615688441_3097319_97-325053-0_sp1.1 from training. Duration: 0.836375 2023-10-04 16:33:34,076 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_68702_210_23689_1_1535799577_3420068_168-25150-0_sp1.1 from training. Duration: 0.97275 2023-10-04 16:33:34,084 WARNING [train.py:1204] (2/4) Exclude cut with ID _65134_210_14763_1_1535335393985_3505368_111-27984-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 16:33:35,868 WARNING [train.py:1204] (2/4) Exclude cut with ID _48678_210_5663_1_1533988830947_3769199_276-226230-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 16:33:38,548 WARNING [train.py:1204] (2/4) Exclude cut with ID _28427_210_4220_1_1532581240967_3061069_43-84928-0_sp1.1 from training. Duration: 0.9 2023-10-04 16:33:38,577 WARNING [train.py:1204] (2/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_158-250116-0_sp1.1 from training. Duration: 0.563625 2023-10-04 16:33:38,607 WARNING [train.py:1204] (2/4) Exclude cut with ID _66873_210_5517_1_1535454136506_4293370_279-332556-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 16:33:38,656 WARNING [train.py:1204] (2/4) Exclude cut with ID _42863_210_9070_1_1533902398788_3696920_488-230146-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 16:33:38,656 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_387-247159-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 16:33:40,448 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6729_210_2550_1_1528032481_1788725_261-42619-0_sp1.1 from training. Duration: 0.97275 2023-10-04 16:33:40,577 WARNING [train.py:1204] (2/4) Exclude cut with ID _19896_210_1883_1_1531709022662_6649569_831-44724-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 16:33:43,715 WARNING [train.py:1204] (2/4) Exclude cut with ID _55153_210_2253_1_1534384786677_3162096_280-285944-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 16:33:43,743 WARNING [train.py:1204] (2/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_448-4048-0 from training. Duration: 0.96 2023-10-04 16:33:43,858 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.0.bypass.skip_rate, batch_count=1720346.6666666667, ans=0.035 2023-10-04 16:33:45,078 WARNING [train.py:1204] (2/4) Exclude cut with ID _29678_210_7893_1_1532421042787_3906118_456-202548-0_sp1.1 from training. Duration: 0.97275 2023-10-04 16:33:45,097 WARNING [train.py:1204] (2/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_107-332398-0_sp1.1 from training. Duration: 0.7090625 2023-10-04 16:33:45,300 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=1720346.6666666667, ans=0.1 2023-10-04 16:33:48,998 WARNING [train.py:1204] (2/4) Exclude cut with ID _13921_210_9994_1_1530838702800_3003429_46-149444-0_sp1.1 from training. Duration: 0.8545625 2023-10-04 16:33:49,041 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_257-20733-0_sp0.9 from training. Duration: 0.8444375 2023-10-04 16:33:50,342 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_53753_210_7234_1_1534320003_3594148_385-234809-0_sp1.1 from training. Duration: 0.963625 2023-10-04 16:33:50,407 WARNING [train.py:1204] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_292-215346-0_sp1.1 from training. Duration: 0.8818125 2023-10-04 16:33:54,576 WARNING [train.py:1204] (2/4) Exclude cut with ID _68965_210_1883_1_1535698962272_3497490_196-124268-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 16:33:55,837 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_23801_210_16223_1_1532066392_3294657_570-5439-0_sp1.1 from training. Duration: 0.8818125 2023-10-04 16:34:00,734 WARNING [train.py:1204] (2/4) Exclude cut with ID _5166_210_2084_1_1527073197160_4087993_349-6928-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 16:34:01,969 WARNING [train.py:1204] (2/4) Exclude cut with ID _60621_210_17071_1_1535013053530_3671739_226-255202-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 16:34:01,973 WARNING [train.py:1204] (2/4) Exclude cut with ID _41817_210_18835_1_1533434477160_7423049_448-130302-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 16:34:05,253 WARNING [train.py:1204] (2/4) Exclude cut with ID _6653_210_2629_1_1527919172441_6732679_758-143677-0_sp1.1 from training. Duration: 0.8909375 2023-10-04 16:34:05,303 WARNING [train.py:1204] (2/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_912-47473-0_sp0.9 from training. Duration: 0.7555625 2023-10-04 16:34:05,337 WARNING [train.py:1204] (2/4) Exclude cut with ID _6694_210_4599_1_1527852637490_3965529_356-169233-0_sp1.1 from training. Duration: 0.9 2023-10-04 16:34:06,643 WARNING [train.py:1204] (2/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_1-99680-0 from training. Duration: 0.67 2023-10-04 16:34:08,481 WARNING [train.py:1204] (2/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_199-86432-0_sp1.1 from training. Duration: 0.8909375 2023-10-04 16:34:09,862 WARNING [train.py:1204] (2/4) Exclude cut with ID _61148_210_6897_1_1535094022726_4217610_362-182430-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 16:34:11,264 WARNING [train.py:1204] (2/4) Exclude cut with ID _49376_210_5986_1_1533985278732_3370740_301-154763-0 from training. Duration: 0.98 2023-10-04 16:34:11,539 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.conv_module2.balancer1.prob, batch_count=1720480.0, ans=0.125 2023-10-04 16:34:13,010 WARNING [train.py:1204] (2/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_572-185150-0_sp1.1 from training. Duration: 0.8818125 2023-10-04 16:34:17,315 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_47896_210_20508_1_1534492518_5958868_558-314572-0_sp1.1 from training. Duration: 0.8818125 2023-10-04 16:34:18,651 INFO [train.py:1046] (2/4) Epoch 49, batch 3100, loss[loss=0.1374, simple_loss=0.2043, pruned_loss=0.03518, over 23457.00 frames. ], tot_loss[loss=0.1527, simple_loss=0.2336, pruned_loss=0.03589, over 4737492.02 frames. ], batch size: 285, lr: 2.07e-03, grad_scale: 16.0 2023-10-04 16:34:18,695 WARNING [train.py:1204] (2/4) Exclude cut with ID _45560_210_21982_1_1533812603951_3584999_111-6376-0_sp0.9 from training. Duration: 0.9333125 2023-10-04 16:34:21,517 WARNING [train.py:1204] (2/4) Exclude cut with ID _14170_210_5219_1_1530525949868_4469650_412-317077-0_sp1.1 from training. Duration: 0.6818125 2023-10-04 16:34:22,854 WARNING [train.py:1204] (2/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_718-68505-0 from training. Duration: 0.89 2023-10-04 16:34:25,576 WARNING [train.py:1204] (2/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_77-311404-0 from training. Duration: 0.94 2023-10-04 16:34:25,659 WARNING [train.py:1204] (2/4) Exclude cut with ID _60229_210_27767_1_1534838406158_3655350_264-279573-0 from training. Duration: 0.83 2023-10-04 16:34:25,816 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.0.feed_forward1.out_proj.dropout_p, batch_count=1720546.6666666667, ans=0.1 2023-10-04 16:34:26,970 WARNING [train.py:1204] (2/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_106-300985-0_sp1.1 from training. Duration: 0.8181875 2023-10-04 16:34:28,159 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.5.encoder.layers.1.feed_forward1.out_whiten, num_groups=1, num_channels=256, metric=7.29 vs. limit=15.0 2023-10-04 16:34:30,294 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_4183_210_2087_1_1526729100_1869838_79-277189-0_sp1.1 from training. Duration: 0.963625 2023-10-04 16:34:30,303 WARNING [train.py:1204] (2/4) Exclude cut with ID _28427_210_4220_1_1532581240967_3061069_195-295268-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 16:34:33,079 WARNING [train.py:1204] (2/4) Exclude cut with ID _4092_210_2151_1_1526643044449_3135759_356-278694-0_sp0.9 from training. Duration: 0.52225 2023-10-04 16:34:34,929 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.668e+02 2.156e+02 2.470e+02 2.977e+02 4.757e+02, threshold=4.941e+02, percent-clipped=2.0 2023-10-04 16:34:35,233 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.0.balancer2.prob, batch_count=1720613.3333333333, ans=0.125 2023-10-04 16:34:36,550 WARNING [train.py:1204] (2/4) Exclude cut with ID _14459_210_2228_1_1531443641946_7516700_777-71446-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 16:34:41,188 WARNING [train.py:1204] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_520-126729-0 from training. Duration: 0.7 2023-10-04 16:34:42,745 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.bypass_mid.scale_min, batch_count=1720613.3333333333, ans=0.2 2023-10-04 16:34:45,267 WARNING [train.py:1204] (2/4) Exclude cut with ID _13684_210_7497_1_1530619354863_3598359_312-235606-0_sp0.9 from training. Duration: 0.6666875 2023-10-04 16:34:46,555 WARNING [train.py:1204] (2/4) Exclude cut with ID _37537_210_2253_1_1533171758568_3421949_283-349402-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 16:34:46,585 WARNING [train.py:1204] (2/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_29-117932-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 16:34:46,611 WARNING [train.py:1204] (2/4) Exclude cut with ID _29303_210_2575_1_1532392237303_7410649_721-283351-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 16:34:46,669 WARNING [train.py:1204] (2/4) Exclude cut with ID _14886_210_4220_1_1530702020355_3516190_175-229073-0_sp0.9 from training. Duration: 0.5 2023-10-04 16:34:48,102 WARNING [train.py:1204] (2/4) Exclude cut with ID _15458_210_8341_1_1530858614263_3834410_70-268830-0_sp1.1 from training. Duration: 0.97275 2023-10-04 16:34:48,132 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_2295_210_2087_1_1525500993_2407863_234-83104-0 from training. Duration: 0.62 2023-10-04 16:34:48,133 WARNING [train.py:1204] (2/4) Exclude cut with ID _22961_210_8254_1_1531976366672_4278180_317-199546-0_sp1.1 from training. Duration: 0.7818125 2023-10-04 16:34:50,851 WARNING [train.py:1204] (2/4) Exclude cut with ID _27894_210_12392_1_1532415496078_6902439_547-196022-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 16:34:52,242 WARNING [train.py:1204] (2/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_46-207599-0 from training. Duration: 0.64 2023-10-04 16:34:54,849 WARNING [train.py:1204] (2/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_426-326246-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 16:34:56,461 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder_embed.convnext.out_balancer.prob, batch_count=1720680.0, ans=0.125 2023-10-04 16:34:58,241 WARNING [train.py:1204] (2/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_445-117398-0_sp0.9 from training. Duration: 0.988875 2023-10-04 16:34:58,290 WARNING [train.py:1204] (2/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_563-347832-0 from training. Duration: 0.89 2023-10-04 16:34:59,668 WARNING [train.py:1204] (2/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_252-216674-0 from training. Duration: 0.54 2023-10-04 16:35:01,024 WARNING [train.py:1204] (2/4) Exclude cut with ID _59611_210_8895_1_1534741198240_3650361_187-212779-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 16:35:01,085 WARNING [train.py:1204] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_282-159367-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 16:35:02,533 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_68702_210_23689_1_1535799577_3420068_33-136883-0_sp1.1 from training. Duration: 0.936375 2023-10-04 16:35:02,552 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_47228_210_4983_1_1533780021_3672027_184-293706-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 16:35:02,575 WARNING [train.py:1204] (2/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_656-188361-0_sp0.9 from training. Duration: 0.9666875 2023-10-04 16:35:03,954 WARNING [train.py:1204] (2/4) Exclude cut with ID _4866_210_2577_1_1527404412284_4364659_352-157930-0_sp0.9 from training. Duration: 0.9 2023-10-04 16:35:03,955 WARNING [train.py:1204] (2/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_210-106047-0_sp1.1 from training. Duration: 0.8818125 2023-10-04 16:35:09,311 WARNING [train.py:1204] (2/4) Exclude cut with ID _9389_210_1664_1_1529217337301_5289269_289-213417-0_sp1.1 from training. Duration: 0.8090625 2023-10-04 16:35:09,349 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_3910_210_1794_1_1526777667_3747260_178-41186-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 16:35:09,353 WARNING [train.py:1204] (2/4) Exclude cut with ID _27052_210_5986_1_1532329197187_3585009_24-172263-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 16:35:09,357 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_5522_210_3273_1_1527249606_3909096_318-203153-0_sp1.1 from training. Duration: 0.3909375 2023-10-04 16:35:13,538 WARNING [train.py:1204] (2/4) Exclude cut with ID _27894_210_12392_1_1532415496078_6902439_222-51982-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 16:35:14,918 WARNING [train.py:1204] (2/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_87-131427-0 from training. Duration: 0.78 2023-10-04 16:35:17,613 WARNING [train.py:1204] (2/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_372-298093-0_sp0.9 from training. Duration: 0.888875 2023-10-04 16:35:18,950 WARNING [train.py:1204] (2/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_663-166973-0 from training. Duration: 0.8 2023-10-04 16:35:18,993 WARNING [train.py:1204] (2/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_429-177469-0_sp1.1 from training. Duration: 0.936375 2023-10-04 16:35:19,026 WARNING [train.py:1204] (2/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_724-172651-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 16:35:20,288 WARNING [train.py:1204] (2/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_103-336064-0 from training. Duration: 0.51 2023-10-04 16:35:30,485 WARNING [train.py:1204] (2/4) Exclude cut with ID _48677_210_5663_1_1533902405919_3892989_111-11439-0 from training. Duration: 0.96 2023-10-04 16:35:31,988 INFO [train.py:1046] (2/4) Epoch 49, batch 3150, loss[loss=0.1455, simple_loss=0.2213, pruned_loss=0.03486, over 23653.00 frames. ], tot_loss[loss=0.1522, simple_loss=0.2326, pruned_loss=0.03586, over 4729492.46 frames. ], batch size: 149, lr: 2.07e-03, grad_scale: 16.0 2023-10-04 16:35:32,098 WARNING [train.py:1204] (2/4) Exclude cut with ID _8085_210_4557_1_1528455633493_3716100_95-103398-0_sp1.1 from training. Duration: 0.963625 2023-10-04 16:35:33,400 WARNING [train.py:1204] (2/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_1041-110837-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 16:35:33,545 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_50050_210_16223_1_1535435796_7290138_1155-121757-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 16:35:33,548 WARNING [train.py:1204] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_602-275260-0_sp0.9 from training. Duration: 0.911125 2023-10-04 16:35:34,935 WARNING [train.py:1204] (2/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_265-164486-0 from training. Duration: 0.96 2023-10-04 16:35:36,301 WARNING [train.py:1204] (2/4) Exclude cut with ID _46067_210_4220_1_1534125571066_3307450_142-229375-0_sp1.1 from training. Duration: 0.963625 2023-10-04 16:35:37,600 WARNING [train.py:1204] (2/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_310-26186-0_sp1.1 from training. Duration: 0.636375 2023-10-04 16:35:39,496 WARNING [train.py:1204] (2/4) Exclude cut with ID _28461_210_2253_1_1532771775189_3041299_136-270644-0 from training. Duration: 0.93 2023-10-04 16:35:41,433 WARNING [train.py:1204] (2/4) Exclude cut with ID _14860_210_4915_1_1530702020340_7343240_438-286372-0_sp1.1 from training. Duration: 0.936375 2023-10-04 16:35:42,875 WARNING [train.py:1204] (2/4) Exclude cut with ID IC0128W0004-63309 from training. Duration: 0.831 2023-10-04 16:35:45,665 WARNING [train.py:1204] (2/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_135-174080-0 from training. Duration: 0.53 2023-10-04 16:35:45,707 WARNING [train.py:1204] (2/4) Exclude cut with ID _41326_210_5854_1_1533778076509_3492080_277-59511-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 16:35:47,035 WARNING [train.py:1204] (2/4) Exclude cut with ID IC0144W0112-71379 from training. Duration: 0.992 2023-10-04 16:35:48,368 WARNING [train.py:1204] (2/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_711-15688-0_sp0.9 from training. Duration: 0.47775 2023-10-04 16:35:49,820 WARNING [train.py:1204] (2/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_237-332027-0 from training. Duration: 0.95 2023-10-04 16:35:49,862 WARNING [train.py:1204] (2/4) Exclude cut with ID _7970_210_1883_1_1528455616296_3472230_25-178407-0 from training. Duration: 0.71 2023-10-04 16:35:49,864 WARNING [train.py:1204] (2/4) Exclude cut with ID _8480_210_1874_1_1528589391205_6728330_309-139896-0 from training. Duration: 0.81 2023-10-04 16:35:50,051 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.conv_module1.balancer1.prob, batch_count=1720946.6666666667, ans=0.125 2023-10-04 16:35:51,049 WARNING [train.py:1204] (2/4) Exclude cut with ID _39571_210_13449_1_1533801584143_3651077_319-268877-0_sp1.1 from training. Duration: 0.936375 2023-10-04 16:35:51,053 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_43802_210_2290_1_1533516203_8829504_155-246880-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 16:35:51,156 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_566-122599-0_sp1.1 from training. Duration: 0.936375 2023-10-04 16:35:53,861 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_12868_210_6753_1_1530701932_530275_18-76322-0 from training. Duration: 0.87 2023-10-04 16:35:54,076 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.bypass.skip_rate, batch_count=1720946.6666666667, ans=0.07 2023-10-04 16:35:55,322 WARNING [train.py:1204] (2/4) Exclude cut with ID _56494_210_4220_1_1535284837625_3033169_159-106035-0_sp1.1 from training. Duration: 0.963625 2023-10-04 16:35:55,351 WARNING [train.py:1204] (2/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_98-276013-0_sp1.1 from training. Duration: 0.963625 2023-10-04 16:35:55,399 WARNING [train.py:1204] (2/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_330-216124-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 16:35:57,200 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_177-58005-0_sp0.9 from training. Duration: 0.688875 2023-10-04 16:36:00,171 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.1.ff2_skip_rate, batch_count=1721013.3333333333, ans=0.0 2023-10-04 16:36:01,369 WARNING [train.py:1204] (2/4) Exclude cut with ID _56235_210_10640_1_1534557335235_4161099_343-139898-0 from training. Duration: 0.96 2023-10-04 16:36:01,430 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6961_210_2087_1_1527937168_6815011_395-185060-0_sp1.1 from training. Duration: 0.863625 2023-10-04 16:36:02,763 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7002_210_1794_1_1527939683_5157286_184-229630-0_sp0.9 from training. Duration: 0.82225 2023-10-04 16:36:02,812 WARNING [train.py:1204] (2/4) Exclude cut with ID _59170_210_6721_1_1534983633960_4203335_153-18895-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 16:36:03,822 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.0.layers.0.self_attn1.whiten, num_groups=1, num_channels=192, metric=11.93 vs. limit=22.5 2023-10-04 16:36:04,193 WARNING [train.py:1204] (2/4) Exclude cut with ID _49643_210_7989_1_1534640244224_3939031_598-161926-0 from training. Duration: 0.9 2023-10-04 16:36:07,291 WARNING [train.py:1204] (2/4) Exclude cut with ID _4068_210_2629_1_1526697350395_1546259_224-288693-0 from training. Duration: 0.59 2023-10-04 16:36:07,477 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.bypass_mid.scale_min, batch_count=1721013.3333333333, ans=0.2 2023-10-04 16:36:09,073 WARNING [train.py:1204] (2/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_718-68505-0_sp1.1 from training. Duration: 0.8090625 2023-10-04 16:36:09,125 WARNING [train.py:1204] (2/4) Exclude cut with ID _4068_210_2629_1_1526697350395_1546259_224-288693-0_sp0.9 from training. Duration: 0.6555625 2023-10-04 16:36:09,136 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_2296_210_2087_1_1525503448_3397335_57-185430-0_sp0.9 from training. Duration: 0.6666875 2023-10-04 16:36:10,994 WARNING [train.py:1204] (2/4) Exclude cut with ID _15458_210_8341_1_1530858614263_3834410_49-235472-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 16:36:10,996 WARNING [train.py:1204] (2/4) Exclude cut with ID _64135_210_2253_1_1535677267876_7608972_498-5039-0_sp0.9 from training. Duration: 0.8666875 2023-10-04 16:36:12,384 WARNING [train.py:1204] (2/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_283-220372-0_sp0.9 from training. Duration: 0.62225 2023-10-04 16:36:12,392 WARNING [train.py:1204] (2/4) Exclude cut with ID _6473_210_1741_1_1527901307396_3815380_108-17335-0_sp0.9 from training. Duration: 0.72225 2023-10-04 16:36:13,867 WARNING [train.py:1204] (2/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_113-92608-0 from training. Duration: 0.54 2023-10-04 16:36:14,020 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.conv_module1.balancer1.prob, batch_count=1721013.3333333333, ans=0.125 2023-10-04 16:36:14,687 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.1.encoder.layers.1.whiten, num_groups=1, num_channels=256, metric=3.83 vs. limit=12.0 2023-10-04 16:36:15,166 WARNING [train.py:1204] (2/4) Exclude cut with ID _14170_210_5219_1_1530525949868_4469650_272-249985-0_sp1.1 from training. Duration: 0.6090625 2023-10-04 16:36:15,175 WARNING [train.py:1204] (2/4) Exclude cut with ID _50376_210_13401_1_1534031502101_4217740_518-187390-0_sp1.1 from training. Duration: 0.97275 2023-10-04 16:36:16,591 WARNING [train.py:1204] (2/4) Exclude cut with ID _59738_210_15379_1_1534849512562_3941470_165-248260-0_sp1.1 from training. Duration: 0.92725 2023-10-04 16:36:16,606 WARNING [train.py:1204] (2/4) Exclude cut with ID _23818_210_6228_1_1531917063884_3676920_400-343925-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 16:36:17,913 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6961_210_2087_1_1527937168_6815011_562-165518-0 from training. Duration: 0.77 2023-10-04 16:36:19,243 WARNING [train.py:1204] (2/4) Exclude cut with ID _24271_210_12658_1_1531987270022_7190519_289-207931-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 16:36:20,618 WARNING [train.py:1204] (2/4) Exclude cut with ID _14632_210_9988_1_1530691279392_3923200_332-105904-0 from training. Duration: 0.9 2023-10-04 16:36:20,629 WARNING [train.py:1204] (2/4) Exclude cut with ID _40402_210_18219_1_1533522324115_2953150_203-232438-0_sp1.1 from training. Duration: 0.97275 2023-10-04 16:36:21,987 WARNING [train.py:1204] (2/4) Exclude cut with ID _13252_210_1925_1_1530671372047_3336909_263-168388-0 from training. Duration: 0.86 2023-10-04 16:36:23,318 WARNING [train.py:1204] (2/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_174-205058-0 from training. Duration: 0.95 2023-10-04 16:36:26,150 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_94-195801-0_sp1.1 from training. Duration: 0.8545625 2023-10-04 16:36:26,174 WARNING [train.py:1204] (2/4) Exclude cut with ID _55153_210_2253_1_1534384786677_3162096_269-103204-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 16:36:27,437 WARNING [train.py:1204] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_748-36150-0 from training. Duration: 0.91 2023-10-04 16:36:28,915 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_12868_210_6753_1_1530702479_3610564_148-172861-0_sp1.1 from training. Duration: 0.4545625 2023-10-04 16:36:30,904 WARNING [train.py:1204] (2/4) Exclude cut with ID _67499_210_17282_1_1535509805220_3692430_460-77730-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 16:36:32,378 WARNING [train.py:1204] (2/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_374-20753-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 16:36:33,784 WARNING [train.py:1204] (2/4) Exclude cut with ID _45329_210_7005_1_1534157951211_6774339_374-289983-0_sp1.1 from training. Duration: 0.97275 2023-10-04 16:36:33,809 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_34886_210_10633_1_1533203882_1755661_76-156096-0_sp0.9 from training. Duration: 0.9666875 2023-10-04 16:36:39,149 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_238-256668-0_sp0.9 from training. Duration: 0.7666875 2023-10-04 16:36:40,893 WARNING [train.py:1204] (2/4) Exclude cut with ID _46683_210_9642_1_1533730913492_4591116_489-146727-0_sp1.1 from training. Duration: 0.97275 2023-10-04 16:36:42,961 WARNING [train.py:1204] (2/4) Exclude cut with ID _13938_210_9988_1_1530518718978_3693960_304-112784-0_sp0.9 from training. Duration: 0.511125 2023-10-04 16:36:44,681 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.bypass.skip_rate, batch_count=1721213.3333333333, ans=0.07 2023-10-04 16:36:45,685 INFO [train.py:1046] (2/4) Epoch 49, batch 3200, loss[loss=0.1395, simple_loss=0.2244, pruned_loss=0.02729, over 24505.00 frames. ], tot_loss[loss=0.152, simple_loss=0.2328, pruned_loss=0.03562, over 4725572.84 frames. ], batch size: 66, lr: 2.07e-03, grad_scale: 32.0 2023-10-04 16:36:48,437 WARNING [train.py:1204] (2/4) Exclude cut with ID _24271_210_12658_1_1531987270022_7190519_243-46549-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 16:36:48,449 WARNING [train.py:1204] (2/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_78-182912-0_sp1.1 from training. Duration: 0.536375 2023-10-04 16:36:51,104 WARNING [train.py:1204] (2/4) Exclude cut with ID _57572_210_5517_1_1534921219624_3809484_121-321334-0_sp1.1 from training. Duration: 0.97275 2023-10-04 16:36:51,200 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_40047_210_2290_1_1533290519_2374311_254-102149-0_sp1.1 from training. Duration: 0.963625 2023-10-04 16:36:51,201 WARNING [train.py:1204] (2/4) Exclude cut with ID _28461_210_2253_1_1532771775189_3041299_96-152158-0 from training. Duration: 0.85 2023-10-04 16:36:52,692 WARNING [train.py:1204] (2/4) Exclude cut with ID _29877_210_6228_1_1532397791106_5276149_161-102979-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 16:36:56,950 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_3787_210_2040_1_1526477544_5478586_603-120324-0_sp0.9 from training. Duration: 0.9 2023-10-04 16:37:00,358 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_41750_210_1960_1_1533520678_3948042_247-213456-0_sp1.1 from training. Duration: 0.97275 2023-10-04 16:37:01,620 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.632e+02 1.995e+02 2.234e+02 2.632e+02 4.209e+02, threshold=4.468e+02, percent-clipped=0.0 2023-10-04 16:37:07,722 WARNING [train.py:1204] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_127-119109-0_sp0.9 from training. Duration: 0.8 2023-10-04 16:37:18,778 WARNING [train.py:1204] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_215-202973-0 from training. Duration: 0.88 2023-10-04 16:37:18,849 WARNING [train.py:1204] (2/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_17-320491-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 16:37:24,177 WARNING [train.py:1204] (2/4) Exclude cut with ID _5166_210_2084_1_1527073197160_4087993_310-114452-0 from training. Duration: 0.59 2023-10-04 16:37:24,265 WARNING [train.py:1204] (2/4) Exclude cut with ID _15133_210_6228_1_1530794512074_3080379_29-264458-0_sp0.9 from training. Duration: 0.6444375 2023-10-04 16:37:24,497 INFO [scaling.py:1118] (2/4) WithLoss: name=encoder.encoders.2.encoder.layers.1.self_attn_weights, loss-sum=0.000e+00 2023-10-04 16:37:28,419 WARNING [train.py:1204] (2/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_159-281310-0_sp1.1 from training. Duration: 0.863625 2023-10-04 16:37:28,429 WARNING [train.py:1204] (2/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_195-167011-0_sp0.9 from training. Duration: 0.8333125 2023-10-04 16:37:30,460 WARNING [train.py:1204] (2/4) Exclude cut with ID _14087_210_2253_1_1530529224512_3014029_156-257607-0_sp1.1 from training. Duration: 0.7545625 2023-10-04 16:37:33,129 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_2295_210_2087_1_1525500993_2407863_116-238894-0 from training. Duration: 0.7 2023-10-04 16:37:34,480 WARNING [train.py:1204] (2/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_321-335475-0_sp0.9 from training. Duration: 0.5 2023-10-04 16:37:34,716 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.feed_forward1.hidden_balancer.prob, batch_count=1721413.3333333333, ans=0.125 2023-10-04 16:37:35,937 WARNING [train.py:1204] (2/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_188-114464-0 from training. Duration: 0.82 2023-10-04 16:37:36,079 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.conv_skip_rate, batch_count=1721413.3333333333, ans=0.0 2023-10-04 16:37:38,645 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_2592_210_2290_1_1525697025_4882925_157-70512-0 from training. Duration: 0.91 2023-10-04 16:37:41,834 WARNING [train.py:1204] (2/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_138-64210-0_sp1.1 from training. Duration: 0.663625 2023-10-04 16:37:46,685 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_11305_210_1794_1_1529750428_7178148_651-95702-0_sp1.1 from training. Duration: 0.963625 2023-10-04 16:37:47,996 WARNING [train.py:1204] (2/4) Exclude cut with ID _14224_210_4381_1_1530529062037_4264010_379-184223-0_sp0.9 from training. Duration: 0.9444375 2023-10-04 16:37:48,023 WARNING [train.py:1204] (2/4) Exclude cut with ID _8394_210_6702_1_1528590504316_3540189_381-125325-0_sp1.1 from training. Duration: 0.963625 2023-10-04 16:37:48,076 WARNING [train.py:1204] (2/4) Exclude cut with ID IC0622W0021-307659 from training. Duration: 0.896 2023-10-04 16:37:48,079 WARNING [train.py:1204] (2/4) Exclude cut with ID _7624_210_1531_1_1528109802198_4933101_418-349834-0_sp1.1 from training. Duration: 0.5454375 2023-10-04 16:37:48,343 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.conv_module2.balancer1.prob, batch_count=1721480.0, ans=0.125 2023-10-04 16:37:51,097 WARNING [train.py:1204] (2/4) Exclude cut with ID _49728_210_12651_1_1534041034294_6788150_607-3416-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 16:37:52,440 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8275_210_4198_1_1528455320_3892571_491-17707-0 from training. Duration: 0.64 2023-10-04 16:37:52,494 WARNING [train.py:1204] (2/4) Exclude cut with ID _5287_210_2084_1_1527418883040_3878670_333-48508-0 from training. Duration: 0.6 2023-10-04 16:37:53,869 WARNING [train.py:1204] (2/4) Exclude cut with ID _8365_210_2064_1_1528592311070_4677690_371-122295-0 from training. Duration: 0.55 2023-10-04 16:37:55,485 INFO [scaling.py:1118] (2/4) WithLoss: name=encoder.encoders.5.encoder.layers.1.self_attn_weights, loss-sum=0.000e+00 2023-10-04 16:37:56,534 WARNING [train.py:1204] (2/4) Exclude cut with ID _14860_210_4915_1_1530702020340_7343240_836-328489-0 from training. Duration: 0.9 2023-10-04 16:37:57,860 WARNING [train.py:1204] (2/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_1033-274376-0_sp0.9 from training. Duration: 0.9666875 2023-10-04 16:37:59,231 INFO [train.py:1046] (2/4) Epoch 49, batch 3250, loss[loss=0.1634, simple_loss=0.2371, pruned_loss=0.04484, over 23756.00 frames. ], tot_loss[loss=0.1514, simple_loss=0.2321, pruned_loss=0.03538, over 4734041.15 frames. ], batch size: 212, lr: 2.07e-03, grad_scale: 32.0 2023-10-04 16:38:00,005 WARNING [train.py:1204] (2/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_475-127455-0_sp0.9 from training. Duration: 0.77775 2023-10-04 16:38:00,011 WARNING [train.py:1204] (2/4) Exclude cut with ID IC0671W0071-331914 from training. Duration: 0.896 2023-10-04 16:38:00,031 WARNING [train.py:1204] (2/4) Exclude cut with ID _30327_210_16049_1_1532484161295_3924169_2-1523-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 16:38:00,033 WARNING [train.py:1204] (2/4) Exclude cut with ID _24314_210_4915_1_1532135078116_7170179_460-208279-0_sp1.1 from training. Duration: 0.97275 2023-10-04 16:38:01,388 WARNING [train.py:1204] (2/4) Exclude cut with ID IC0679W0052-335859 from training. Duration: 0.64 2023-10-04 16:38:04,291 WARNING [train.py:1204] (2/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_3-237434-0_sp1.1 from training. Duration: 0.6909375 2023-10-04 16:38:07,022 WARNING [train.py:1204] (2/4) Exclude cut with ID _69884_210_9771_1_1535846392818_4166169_405-185283-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 16:38:07,318 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.feed_forward1.out_proj.dropout_p, batch_count=1721546.6666666667, ans=0.1 2023-10-04 16:38:12,795 WARNING [train.py:1204] (2/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_927-83684-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 16:38:12,803 WARNING [train.py:1204] (2/4) Exclude cut with ID _6694_210_4599_1_1527852637490_3965529_356-169233-0 from training. Duration: 0.99 2023-10-04 16:38:14,176 WARNING [train.py:1204] (2/4) Exclude cut with ID _19896_210_1883_1_1531709022662_6649569_562-233001-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 16:38:14,216 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_20120_210_6753_1_1531643035_5482859_354-78629-0_sp1.1 from training. Duration: 0.963625 2023-10-04 16:38:14,217 WARNING [train.py:1204] (2/4) Exclude cut with ID _60135_210_13427_1_1534935557922_3651319_96-168198-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 16:38:15,531 WARNING [train.py:1204] (2/4) Exclude cut with ID _31602_210_5854_1_1532677021456_4458250_249-96604-0_sp1.1 from training. Duration: 0.8090625 2023-10-04 16:38:16,895 WARNING [train.py:1204] (2/4) Exclude cut with ID _14100_210_8341_1_1530529320613_3678069_258-224599-0_sp0.9 from training. Duration: 0.8555625 2023-10-04 16:38:18,255 WARNING [train.py:1204] (2/4) Exclude cut with ID _19708_210_9206_1_1532136650733_4238364_215-160135-0_sp1.1 from training. Duration: 0.97275 2023-10-04 16:38:19,466 WARNING [train.py:1204] (2/4) Exclude cut with ID _21878_210_3525_1_1531738772002_7264330_113-18163-0_sp0.9 from training. Duration: 0.988875 2023-10-04 16:38:19,511 WARNING [train.py:1204] (2/4) Exclude cut with ID _64135_210_2253_1_1535677267876_7608972_552-337340-0_sp1.1 from training. Duration: 0.92725 2023-10-04 16:38:19,542 WARNING [train.py:1204] (2/4) Exclude cut with ID _67725_210_27767_1_1535608756671_5105359_10-12887-0_sp1.1 from training. Duration: 0.97275 2023-10-04 16:38:19,551 WARNING [train.py:1204] (2/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_401-284840-0_sp1.1 from training. Duration: 0.97275 2023-10-04 16:38:19,577 WARNING [train.py:1204] (2/4) Exclude cut with ID _44659_210_2721_1_1534036120061_6614860_483-141285-0_sp1.1 from training. Duration: 0.936375 2023-10-04 16:38:21,035 WARNING [train.py:1204] (2/4) Exclude cut with ID _9461_210_4557_1_1529060514726_3677451_358-332734-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 16:38:22,419 WARNING [train.py:1204] (2/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_788-298907-0_sp1.1 from training. Duration: 0.8090625 2023-10-04 16:38:25,170 WARNING [train.py:1204] (2/4) Exclude cut with ID _39411_210_5986_1_1533538825030_3343649_354-267196-0_sp1.1 from training. Duration: 0.92725 2023-10-04 16:38:25,204 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_14953_210_2151_1_1530701710_5616165_192-309792-0_sp1.1 from training. Duration: 0.97275 2023-10-04 16:38:27,119 WARNING [train.py:1204] (2/4) Exclude cut with ID _59170_210_6721_1_1534983633960_4203335_290-151255-0_sp1.1 from training. Duration: 0.92725 2023-10-04 16:38:28,255 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_5575_210_1410_1_1527301305_2570170_249-93769-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 16:38:28,276 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_46013_210_7234_1_1533776362_3744032_463-52378-0_sp1.1 from training. Duration: 0.7818125 2023-10-04 16:38:33,124 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.2.encoder.layers.2.conv_module2.whiten, num_groups=1, num_channels=384, metric=6.74 vs. limit=15.0 2023-10-04 16:38:33,860 WARNING [train.py:1204] (2/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_580-93798-0 from training. Duration: 0.56 2023-10-04 16:38:34,103 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.out_combiner.scale_min, batch_count=1721680.0, ans=0.2 2023-10-04 16:38:35,210 WARNING [train.py:1204] (2/4) Exclude cut with ID _19514_210_9259_1_1531530071330_5084230_385-13762-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 16:38:35,215 WARNING [train.py:1204] (2/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_458-67321-0_sp1.1 from training. Duration: 0.8818125 2023-10-04 16:38:35,332 WARNING [train.py:1204] (2/4) Exclude cut with ID _41298_210_9019_1_1533430371450_4822143_188-154791-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 16:38:37,225 WARNING [train.py:1204] (2/4) Exclude cut with ID _15037_210_2253_1_1530752294104_3267650_296-192597-0_sp0.9 from training. Duration: 0.9 2023-10-04 16:38:41,344 WARNING [train.py:1204] (2/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_661-264857-0_sp1.1 from training. Duration: 0.8454375 2023-10-04 16:38:44,687 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.conv_skip_rate, batch_count=1721746.6666666667, ans=0.0 2023-10-04 16:38:50,002 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_34008_210_10431_1_1533345424_8183247_662-317331-0_sp1.1 from training. Duration: 0.8545625 2023-10-04 16:38:51,241 WARNING [train.py:1204] (2/4) Exclude cut with ID _46502_210_13794_1_1533724452198_3657159_241-278329-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 16:38:51,242 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_11349_210_6753_1_1529800861_1101207_26-65602-0 from training. Duration: 0.96 2023-10-04 16:38:51,246 WARNING [train.py:1204] (2/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_448-4048-0_sp1.1 from training. Duration: 0.87275 2023-10-04 16:38:51,260 WARNING [train.py:1204] (2/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_662-275621-0_sp1.1 from training. Duration: 0.3818125 2023-10-04 16:38:52,624 WARNING [train.py:1204] (2/4) Exclude cut with ID _13938_210_9988_1_1530518718978_3693960_256-54454-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 16:38:54,108 WARNING [train.py:1204] (2/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_256-32662-0 from training. Duration: 0.9 2023-10-04 16:38:55,535 WARNING [train.py:1204] (2/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_326-17061-0 from training. Duration: 0.94 2023-10-04 16:38:55,575 WARNING [train.py:1204] (2/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_505-97987-0_sp1.1 from training. Duration: 0.7818125 2023-10-04 16:38:56,364 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.2.encoder.layers.2.nonlin_attention.whiten1, num_groups=1, num_channels=288, metric=4.61 vs. limit=10.0 2023-10-04 16:38:56,917 WARNING [train.py:1204] (2/4) Exclude cut with ID _66873_210_5517_1_1535454136506_4293370_316-294364-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 16:38:58,284 WARNING [train.py:1204] (2/4) Exclude cut with ID _19514_210_9259_1_1531530071330_5084230_762-231063-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 16:38:58,509 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.self_attn_weights.pos_emb_skip_rate, batch_count=1721813.3333333333, ans=0.0 2023-10-04 16:38:59,617 WARNING [train.py:1204] (2/4) Exclude cut with ID _4697_210_2069_1_1526815801953_3823098_68-219807-0_sp0.9 from training. Duration: 0.52225 2023-10-04 16:38:59,660 WARNING [train.py:1204] (2/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_479-158053-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 16:39:04,145 WARNING [train.py:1204] (2/4) Exclude cut with ID _66112_210_10635_1_1535590236305_3801484_83-38181-0_sp1.1 from training. Duration: 0.936375 2023-10-04 16:39:04,160 WARNING [train.py:1204] (2/4) Exclude cut with ID _55857_210_2346_1_1534644007831_6898209_255-146346-0_sp1.1 from training. Duration: 0.963625 2023-10-04 16:39:05,556 WARNING [train.py:1204] (2/4) Exclude cut with ID _48836_210_12210_1_1534075848293_3904150_429-35475-0 from training. Duration: 0.89 2023-10-04 16:39:05,577 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_50154_210_13731_1_1534750862_1080961_119-187163-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 16:39:08,760 WARNING [train.py:1204] (2/4) Exclude cut with ID _8793_210_6310_1_1528542985139_2310009_121-179782-0_sp0.9 from training. Duration: 0.888875 2023-10-04 16:39:08,763 WARNING [train.py:1204] (2/4) Exclude cut with ID _14884_210_2252_1_1530707459689_5561250_591-173406-0 from training. Duration: 0.96 2023-10-04 16:39:12,831 INFO [train.py:1046] (2/4) Epoch 49, batch 3300, loss[loss=0.1601, simple_loss=0.2361, pruned_loss=0.04202, over 23854.00 frames. ], tot_loss[loss=0.1525, simple_loss=0.2334, pruned_loss=0.03581, over 4738364.11 frames. ], batch size: 195, lr: 2.07e-03, grad_scale: 32.0 2023-10-04 16:39:12,903 WARNING [train.py:1204] (2/4) Exclude cut with ID _15133_210_6228_1_1530794512074_3080379_54-198193-0_sp1.1 from training. Duration: 0.8545625 2023-10-04 16:39:12,912 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6545_210_2040_1_1527774670_4245258_407-48070-0 from training. Duration: 0.76 2023-10-04 16:39:14,797 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.5.encoder.layers.0.conv_module2.whiten, num_groups=1, num_channels=256, metric=4.23 vs. limit=15.0 2023-10-04 16:39:15,693 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1032-236863-0 from training. Duration: 0.84 2023-10-04 16:39:15,762 WARNING [train.py:1204] (2/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_507-303370-0 from training. Duration: 0.96 2023-10-04 16:39:17,579 WARNING [train.py:1204] (2/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_164-282366-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 16:39:19,700 WARNING [train.py:1204] (2/4) Exclude cut with ID _39867_210_9826_1_1533553222030_9344259_312-158727-0_sp1.1 from training. Duration: 0.963625 2023-10-04 16:39:21,082 WARNING [train.py:1204] (2/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_174-205058-0_sp1.1 from training. Duration: 0.863625 2023-10-04 16:39:21,112 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_11305_210_1794_1_1529750428_7178148_166-237358-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 16:39:22,508 WARNING [train.py:1204] (2/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_26-175553-0_sp1.1 from training. Duration: 0.5545625 2023-10-04 16:39:23,712 WARNING [train.py:1204] (2/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_96-290039-0_sp1.1 from training. Duration: 0.5090625 2023-10-04 16:39:26,350 WARNING [train.py:1204] (2/4) Exclude cut with ID _53219_210_4525_1_1534834836008_4064825_261-101691-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 16:39:27,802 WARNING [train.py:1204] (2/4) Exclude cut with ID _24271_210_12658_1_1531987270022_7190519_96-293390-0_sp1.1 from training. Duration: 0.936375 2023-10-04 16:39:29,111 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.731e+02 2.089e+02 2.371e+02 2.866e+02 4.389e+02, threshold=4.743e+02, percent-clipped=0.0 2023-10-04 16:39:30,672 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_271-234962-0 from training. Duration: 0.79 2023-10-04 16:39:31,329 WARNING [train.py:1204] (2/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_136-30660-0_sp1.1 from training. Duration: 0.97275 2023-10-04 16:39:31,343 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_20120_210_6753_1_1531643035_5482859_625-281046-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 16:39:32,700 WARNING [train.py:1204] (2/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_333-168193-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 16:39:34,026 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9029W0269-509664 from training. Duration: 0.896 2023-10-04 16:39:35,373 WARNING [train.py:1204] (2/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_450-147393-0_sp1.1 from training. Duration: 0.92725 2023-10-04 16:39:35,428 WARNING [train.py:1204] (2/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_643-52007-0_sp1.1 from training. Duration: 0.5181875 2023-10-04 16:39:35,606 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.balancer1.prob, batch_count=1721946.6666666667, ans=0.125 2023-10-04 16:39:36,639 WARNING [train.py:1204] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_330-327861-0_sp0.9 from training. Duration: 0.7444375 2023-10-04 16:39:36,655 WARNING [train.py:1204] (2/4) Exclude cut with ID _47790_210_16187_1_1533862814066_4207519_260-317651-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 16:39:36,673 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9044W0010-516654 from training. Duration: 0.896 2023-10-04 16:39:41,297 WARNING [train.py:1204] (2/4) Exclude cut with ID _14087_210_2253_1_1530529224512_3014029_265-118378-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 16:39:41,303 WARNING [train.py:1204] (2/4) Exclude cut with ID _6194_210_1534_1_1527511826160_3941194_321-148641-0_sp0.9 from training. Duration: 0.711125 2023-10-04 16:39:42,696 WARNING [train.py:1204] (2/4) Exclude cut with ID _54871_210_22356_1_1534665798253_3856359_101-169176-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 16:39:42,706 WARNING [train.py:1204] (2/4) Exclude cut with ID 497-129325-0061-62254-0_sp1.1 from training. Duration: 0.97725 2023-10-04 16:39:44,090 WARNING [train.py:1204] (2/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_795-33252-0_sp1.1 from training. Duration: 0.336375 2023-10-04 16:39:44,111 WARNING [train.py:1204] (2/4) Exclude cut with ID _28317_210_12742_1_1532480302381_8510325_168-246176-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 16:39:45,487 WARNING [train.py:1204] (2/4) Exclude cut with ID _7160_210_1876_1_1527940923231_3703799_120-150943-0_sp1.1 from training. Duration: 0.736375 2023-10-04 16:39:47,602 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9084W0005-536844 from training. Duration: 0.768 2023-10-04 16:39:50,802 WARNING [train.py:1204] (2/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_234-260682-0 from training. Duration: 0.84 2023-10-04 16:39:50,834 WARNING [train.py:1204] (2/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_188-114464-0_sp0.9 from training. Duration: 0.911125 2023-10-04 16:39:53,494 WARNING [train.py:1204] (2/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_81-75849-0 from training. Duration: 0.39 2023-10-04 16:39:56,218 WARNING [train.py:1204] (2/4) Exclude cut with ID _14224_210_4381_1_1530529062037_4264010_379-184223-0_sp1.1 from training. Duration: 0.77275 2023-10-04 16:39:57,652 WARNING [train.py:1204] (2/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_454-123002-0_sp1.1 from training. Duration: 0.62725 2023-10-04 16:39:58,987 WARNING [train.py:1204] (2/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_887-249555-0_sp1.1 from training. Duration: 0.7 2023-10-04 16:39:59,173 WARNING [train.py:1204] (2/4) Exclude cut with ID _31088_210_16446_1_1532520012284_3440919_20-260902-0_sp1.1 from training. Duration: 0.97275 2023-10-04 16:40:00,512 WARNING [train.py:1204] (2/4) Exclude cut with ID _45329_210_7005_1_1534157951211_6774339_856-344574-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 16:40:00,514 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_36756_210_7234_1_1533026991_966276_4-164118-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 16:40:00,538 WARNING [train.py:1204] (2/4) Exclude cut with ID _8365_210_2064_1_1528592311070_4677690_371-122295-0_sp1.1 from training. Duration: 0.5 2023-10-04 16:40:02,697 WARNING [train.py:1204] (2/4) Exclude cut with ID _5910_210_1389_1_1527337972454_5050100_555-159815-0_sp1.1 from training. Duration: 0.8909375 2023-10-04 16:40:02,707 WARNING [train.py:1204] (2/4) Exclude cut with ID _37086_210_9038_1_1532999540891_7021250_469-147941-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 16:40:04,015 WARNING [train.py:1204] (2/4) Exclude cut with ID _3067_210_2667_1_1526212974640_4503168_459-173867-0_sp0.9 from training. Duration: 0.97775 2023-10-04 16:40:05,377 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9156W0006-572499 from training. Duration: 0.896 2023-10-04 16:40:05,453 WARNING [train.py:1204] (2/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_277-210798-0 from training. Duration: 0.55 2023-10-04 16:40:07,087 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.conv_module2.balancer1.prob, batch_count=1722080.0, ans=0.125 2023-10-04 16:40:08,696 WARNING [train.py:1204] (2/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_103-336064-0_sp1.1 from training. Duration: 0.463625 2023-10-04 16:40:08,754 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_3414_210_1988_1_1526212718_4813195_331-290945-0_sp1.1 from training. Duration: 0.936375 2023-10-04 16:40:08,755 WARNING [train.py:1204] (2/4) Exclude cut with ID _67499_210_17282_1_1535509805220_3692430_365-29276-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 16:40:11,476 WARNING [train.py:1204] (2/4) Exclude cut with ID _14821_210_8750_1_1530680503991_6829359_69-250847-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 16:40:11,477 WARNING [train.py:1204] (2/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_1102-61301-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 16:40:12,113 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.3.encoder.layers.3.whiten, num_groups=1, num_channels=512, metric=3.51 vs. limit=12.0 2023-10-04 16:40:13,045 WARNING [train.py:1204] (2/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_166-246557-0_sp1.1 from training. Duration: 0.5818125 2023-10-04 16:40:14,334 WARNING [train.py:1204] (2/4) Exclude cut with ID _15351_210_10626_1_1530856635653_3576210_214-129918-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 16:40:14,353 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_120-41668-0_sp0.9 from training. Duration: 0.77775 2023-10-04 16:40:15,673 WARNING [train.py:1204] (2/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_27-72278-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 16:40:17,655 WARNING [train.py:1204] (2/4) Exclude cut with ID _24314_210_4915_1_1532135078116_7170179_521-258868-0_sp1.1 from training. Duration: 0.7181875 2023-10-04 16:40:19,183 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.conv_skip_rate, batch_count=1722146.6666666667, ans=0.0 2023-10-04 16:40:22,352 WARNING [train.py:1204] (2/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_143-86344-0 from training. Duration: 0.77 2023-10-04 16:40:22,395 WARNING [train.py:1204] (2/4) Exclude cut with ID _53489_210_6228_1_1534298357169_3835970_465-157433-0_sp1.1 from training. Duration: 0.92725 2023-10-04 16:40:22,484 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_248-319353-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 16:40:25,275 WARNING [train.py:1204] (2/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_102-77064-0_sp1.1 from training. Duration: 0.8454375 2023-10-04 16:40:25,315 WARNING [train.py:1204] (2/4) Exclude cut with ID _9389_210_1664_1_1529217337301_5289269_379-128794-0_sp1.1 from training. Duration: 0.77275 2023-10-04 16:40:26,713 WARNING [train.py:1204] (2/4) Exclude cut with ID _3231_210_2106_1_1526205864934_6784220_236-260835-0_sp1.1 from training. Duration: 0.97275 2023-10-04 16:40:27,992 INFO [train.py:1046] (2/4) Epoch 49, batch 3350, loss[loss=0.1513, simple_loss=0.2316, pruned_loss=0.03553, over 24673.00 frames. ], tot_loss[loss=0.1528, simple_loss=0.2339, pruned_loss=0.03581, over 4754983.36 frames. ], batch size: 65, lr: 2.07e-03, grad_scale: 16.0 2023-10-04 16:40:28,139 WARNING [train.py:1204] (2/4) Exclude cut with ID _24271_210_12658_1_1531987270022_7190519_184-342363-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 16:40:28,140 WARNING [train.py:1204] (2/4) Exclude cut with ID _27894_210_12392_1_1532415496078_6902439_67-221663-0_sp1.1 from training. Duration: 0.92725 2023-10-04 16:40:32,121 WARNING [train.py:1204] (2/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_304-59227-0_sp1.1 from training. Duration: 0.7 2023-10-04 16:40:33,463 WARNING [train.py:1204] (2/4) Exclude cut with ID _58605_210_12210_1_1534723195939_3904259_458-42232-0_sp1.1 from training. Duration: 0.92725 2023-10-04 16:40:33,543 WARNING [train.py:1204] (2/4) Exclude cut with ID _14922_210_6228_1_1530707435273_3693310_13-165512-0_sp0.9 from training. Duration: 0.911125 2023-10-04 16:40:36,366 WARNING [train.py:1204] (2/4) Exclude cut with ID _58602_210_2346_1_1534903210085_6908830_792-102241-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 16:40:38,266 WARNING [train.py:1204] (2/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_588-125165-0_sp0.9 from training. Duration: 0.92225 2023-10-04 16:40:39,733 WARNING [train.py:1204] (2/4) Exclude cut with ID _54218_210_12210_1_1534413646388_4409259_239-98429-0_sp1.1 from training. Duration: 0.97275 2023-10-04 16:40:41,036 WARNING [train.py:1204] (2/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_538-32291-0_sp1.1 from training. Duration: 0.7545625 2023-10-04 16:40:42,979 WARNING [train.py:1204] (2/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_419-317062-0 from training. Duration: 0.91 2023-10-04 16:40:44,358 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9309W0202-640329 from training. Duration: 0.896 2023-10-04 16:40:44,402 WARNING [train.py:1204] (2/4) Exclude cut with ID _3231_210_2106_1_1526205864934_6784220_419-313291-0_sp1.1 from training. Duration: 0.97275 2023-10-04 16:40:46,016 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=1722280.0, ans=0.1 2023-10-04 16:40:47,190 WARNING [train.py:1204] (2/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_619-344499-0 from training. Duration: 0.63 2023-10-04 16:40:47,202 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_278-272612-0 from training. Duration: 0.73 2023-10-04 16:40:47,296 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_103-84010-0_sp0.9 from training. Duration: 0.7666875 2023-10-04 16:40:47,316 WARNING [train.py:1204] (2/4) Exclude cut with ID _26528_210_9070_1_1532570410842_3851310_497-97218-0_sp1.1 from training. Duration: 0.7818125 2023-10-04 16:40:49,061 WARNING [train.py:1204] (2/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_262-84613-0_sp1.1 from training. Duration: 0.936375 2023-10-04 16:40:50,001 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.0.layers.1.nonlin_attention.whiten1, num_groups=1, num_channels=144, metric=6.90 vs. limit=10.0 2023-10-04 16:40:50,386 WARNING [train.py:1204] (2/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_387-131538-0 from training. Duration: 0.81 2023-10-04 16:40:50,423 WARNING [train.py:1204] (2/4) Exclude cut with ID _8030_210_7497_1_1528444886781_3531089_224-300489-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 16:40:50,445 WARNING [train.py:1204] (2/4) Exclude cut with ID _14349_210_8341_1_1530615710818_3442470_170-336541-0_sp0.9 from training. Duration: 0.9555625 2023-10-04 16:40:53,467 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_47069_210_15368_1_1533781267_4431320_180-31746-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 16:40:56,226 WARNING [train.py:1204] (2/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_447-7041-0_sp1.1 from training. Duration: 0.92725 2023-10-04 16:40:56,253 WARNING [train.py:1204] (2/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_2-55283-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 16:40:56,326 WARNING [train.py:1204] (2/4) Exclude cut with ID _12555_210_8750_1_1530075594402_3780239_70-181911-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 16:41:00,492 WARNING [train.py:1204] (2/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_311-328022-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 16:41:03,229 WARNING [train.py:1204] (2/4) Exclude cut with ID _14528_210_8254_1_1530669700514_4306939_330-2987-0_sp1.1 from training. Duration: 0.92725 2023-10-04 16:41:03,427 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.conv_module1.balancer2.prob, batch_count=1722346.6666666667, ans=0.125 2023-10-04 16:41:03,477 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.feed_forward2.hidden_balancer.prob, batch_count=1722346.6666666667, ans=0.125 2023-10-04 16:41:04,549 WARNING [train.py:1204] (2/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_385-184858-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 16:41:07,275 WARNING [train.py:1204] (2/4) Exclude cut with ID _64414_210_17301_1_1535243252048_3757759_348-333360-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 16:41:09,009 WARNING [train.py:1204] (2/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_753-214751-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 16:41:10,514 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_17613_210_2151_1_1531137569_2366160_202-208013-0_sp1.1 from training. Duration: 0.92725 2023-10-04 16:41:10,522 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_43831_210_2158_1_1533725502_5872791_610-129300-0_sp1.1 from training. Duration: 0.936375 2023-10-04 16:41:13,228 WARNING [train.py:1204] (2/4) Exclude cut with ID _54218_210_12210_1_1534413646388_4409259_321-23852-0_sp1.1 from training. Duration: 0.936375 2023-10-04 16:41:16,345 WARNING [train.py:1204] (2/4) Exclude cut with ID _39411_210_5986_1_1533538825030_3343649_173-238248-0 from training. Duration: 0.83 2023-10-04 16:41:16,352 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_405-339986-0_sp0.9 from training. Duration: 0.5666875 2023-10-04 16:41:16,384 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_2009_210_2071_1_1525481134_4588937_385-302100-0 from training. Duration: 0.87 2023-10-04 16:41:16,416 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_161-276996-0_sp1.1 from training. Duration: 0.863625 2023-10-04 16:41:17,866 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_177-58005-0 from training. Duration: 0.62 2023-10-04 16:41:19,165 WARNING [train.py:1204] (2/4) Exclude cut with ID _64639_210_5816_1_1535367619437_3528000_146-343734-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 16:41:20,573 WARNING [train.py:1204] (2/4) Exclude cut with ID _3845_210_1390_1_1526731326141_3523610_72-61441-0_sp1.1 from training. Duration: 0.92725 2023-10-04 16:41:22,783 INFO [scaling.py:1118] (2/4) WithLoss: name=encoder.encoders.3.encoder.layers.1.self_attn_weights, loss-sum=0.000e+00 2023-10-04 16:41:26,779 WARNING [train.py:1204] (2/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_991-125028-0_sp1.1 from training. Duration: 0.936375 2023-10-04 16:41:26,832 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7544_210_6753_1_1528591327_1471426_172-348777-0 from training. Duration: 0.66 2023-10-04 16:41:28,659 WARNING [train.py:1204] (2/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_416-257730-0_sp1.1 from training. Duration: 0.5909375 2023-10-04 16:41:29,978 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1113-7369-0_sp1.1 from training. Duration: 0.77275 2023-10-04 16:41:30,071 WARNING [train.py:1204] (2/4) Exclude cut with ID _40402_210_18219_1_1533522324115_2953150_242-76943-0_sp1.1 from training. Duration: 0.8545625 2023-10-04 16:41:35,495 WARNING [train.py:1204] (2/4) Exclude cut with ID _44718_210_5816_1_1534050051869_6469030_190-49648-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 16:41:35,637 WARNING [train.py:1204] (2/4) Exclude cut with ID _47251_210_18628_1_1533814063128_4137250_214-88076-0 from training. Duration: 0.88 2023-10-04 16:41:36,860 WARNING [train.py:1204] (2/4) Exclude cut with ID _5585_210_2667_1_1527418813324_8007340_89-332072-0_sp1.1 from training. Duration: 0.5818125 2023-10-04 16:41:36,901 WARNING [train.py:1204] (2/4) Exclude cut with ID _41817_210_18835_1_1533434477160_7423049_555-293448-0_sp1.1 from training. Duration: 0.72725 2023-10-04 16:41:38,297 WARNING [train.py:1204] (2/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_286-331314-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 16:41:40,113 WARNING [train.py:1204] (2/4) Exclude cut with ID _13921_210_9994_1_1530838702800_3003429_46-149444-0 from training. Duration: 0.94 2023-10-04 16:41:40,182 WARNING [train.py:1204] (2/4) Exclude cut with ID _44778_210_2054_1_1533798127203_7443090_768-43595-0_sp1.1 from training. Duration: 0.936375 2023-10-04 16:41:40,191 WARNING [train.py:1204] (2/4) Exclude cut with ID _35355_210_16040_1_1533212988977_4016229_74-190076-0 from training. Duration: 0.8 2023-10-04 16:41:41,485 INFO [train.py:1046] (2/4) Epoch 49, batch 3400, loss[loss=0.1591, simple_loss=0.2411, pruned_loss=0.03853, over 24363.00 frames. ], tot_loss[loss=0.1536, simple_loss=0.2349, pruned_loss=0.0361, over 4747088.10 frames. ], batch size: 77, lr: 2.07e-03, grad_scale: 16.0 2023-10-04 16:41:41,570 WARNING [train.py:1204] (2/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_1007-316380-0_sp1.1 from training. Duration: 0.963625 2023-10-04 16:41:41,622 WARNING [train.py:1204] (2/4) Exclude cut with ID _23557_210_8750_1_1531890106370_6846255_150-283437-0_sp1.1 from training. Duration: 0.963625 2023-10-04 16:41:42,296 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.2.encoder.layers.1.self_attn1.whiten, num_groups=1, num_channels=384, metric=17.70 vs. limit=22.5 2023-10-04 16:41:42,970 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_3474_210_2028_1_1526188513_4825246_145-309131-0_sp1.1 from training. Duration: 0.67275 2023-10-04 16:41:44,842 WARNING [train.py:1204] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_391-22148-0_sp1.1 from training. Duration: 0.7 2023-10-04 16:41:44,858 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_238-256668-0 from training. Duration: 0.69 2023-10-04 16:41:49,668 WARNING [train.py:1204] (2/4) Exclude cut with ID _8394_210_6702_1_1528590504316_3540189_372-198878-0 from training. Duration: 0.6 2023-10-04 16:41:49,678 WARNING [train.py:1204] (2/4) Exclude cut with ID ID0174W0006-766659 from training. Duration: 0.953 2023-10-04 16:41:49,694 WARNING [train.py:1204] (2/4) Exclude cut with ID _6274_210_2084_1_1527937583837_7494020_668-23019-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 16:41:52,672 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.ff3_skip_rate, batch_count=1722546.6666666667, ans=0.0 2023-10-04 16:41:53,011 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.3.encoder.layers.2.self_attn1.whiten, num_groups=1, num_channels=512, metric=12.22 vs. limit=22.5 2023-10-04 16:41:53,794 WARNING [train.py:1204] (2/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_322-74328-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 16:41:53,807 WARNING [train.py:1204] (2/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_872-264525-0_sp1.1 from training. Duration: 0.5909375 2023-10-04 16:41:55,159 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_53378_210_17282_1_1534221071_5769509_641-286201-0_sp1.1 from training. Duration: 0.97275 2023-10-04 16:41:57,894 WARNING [train.py:1204] (2/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_199-295611-0_sp1.1 from training. Duration: 0.5 2023-10-04 16:41:59,746 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.696e+02 2.147e+02 2.480e+02 2.893e+02 4.349e+02, threshold=4.960e+02, percent-clipped=0.0 2023-10-04 16:42:02,724 WARNING [train.py:1204] (2/4) Exclude cut with ID _5250_210_5219_1_1527249561806_8692110_1270-317971-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 16:42:02,835 WARNING [train.py:1204] (2/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_784-213099-0 from training. Duration: 0.93 2023-10-04 16:42:08,544 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_115-233929-0_sp0.9 from training. Duration: 0.8 2023-10-04 16:42:11,119 WARNING [train.py:1204] (2/4) Exclude cut with ID _54871_210_22356_1_1534665798253_3856359_256-223809-0_sp1.1 from training. Duration: 0.97275 2023-10-04 16:42:11,166 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_20120_210_6753_1_1531643035_5482859_285-87781-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 16:42:12,925 WARNING [train.py:1204] (2/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_619-344499-0_sp1.1 from training. Duration: 0.57275 2023-10-04 16:42:17,835 WARNING [train.py:1204] (2/4) Exclude cut with ID _60229_210_27767_1_1534838406158_3655350_264-279573-0_sp0.9 from training. Duration: 0.92225 2023-10-04 16:42:22,607 WARNING [train.py:1204] (2/4) Exclude cut with ID _7719_210_3525_1_1528456153163_4296960_333-110399-0 from training. Duration: 0.96 2023-10-04 16:42:26,937 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_50423_210_13270_1_1534128598_5021734_759-90833-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 16:42:28,320 WARNING [train.py:1204] (2/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_151-254798-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 16:42:28,362 WARNING [train.py:1204] (2/4) Exclude cut with ID _3465_210_3525_1_1526175667060_3574049_450-203637-0 from training. Duration: 0.92 2023-10-04 16:42:28,398 WARNING [train.py:1204] (2/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_673-36456-0_sp1.1 from training. Duration: 0.936375 2023-10-04 16:42:28,449 WARNING [train.py:1204] (2/4) Exclude cut with ID _36538_210_9994_1_1533204020363_3795620_131-8685-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 16:42:29,755 WARNING [train.py:1204] (2/4) Exclude cut with ID _12556_210_8341_1_1530081024100_7327690_713-55626-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 16:42:31,583 WARNING [train.py:1204] (2/4) Exclude cut with ID _2322_210_2667_1_1525604548971_3656299_440-80494-0_sp0.9 from training. Duration: 0.97775 2023-10-04 16:42:34,382 WARNING [train.py:1204] (2/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_210-325081-0_sp1.1 from training. Duration: 0.97275 2023-10-04 16:42:38,513 WARNING [train.py:1204] (2/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_375-244976-0_sp0.9 from training. Duration: 0.8333125 2023-10-04 16:42:38,521 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_15517_210_6753_1_1530952293_1664795_119-31001-0_sp1.1 from training. Duration: 0.8545625 2023-10-04 16:42:42,869 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_41452_210_7291_1_1533437687_3941281_319-42326-0_sp1.1 from training. Duration: 0.92725 2023-10-04 16:42:44,787 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_372-173012-0 from training. Duration: 0.95 2023-10-04 16:42:49,442 WARNING [train.py:1204] (2/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_41-31157-0_sp1.1 from training. Duration: 0.5181875 2023-10-04 16:42:51,678 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.nonlin_attention.balancer.min_positive, batch_count=1722813.3333333333, ans=0.05 2023-10-04 16:42:53,023 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.balancer2.prob, batch_count=1722813.3333333333, ans=0.125 2023-10-04 16:42:55,512 WARNING [train.py:1204] (2/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_91-212486-0 from training. Duration: 0.68 2023-10-04 16:42:56,865 INFO [train.py:1046] (2/4) Epoch 49, batch 3450, loss[loss=0.1461, simple_loss=0.2402, pruned_loss=0.02598, over 24311.00 frames. ], tot_loss[loss=0.1539, simple_loss=0.2356, pruned_loss=0.03608, over 4736183.71 frames. ], batch size: 74, lr: 2.07e-03, grad_scale: 16.0 2023-10-04 16:42:58,364 WARNING [train.py:1204] (2/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_1267-272693-0 from training. Duration: 0.73 2023-10-04 16:42:58,403 WARNING [train.py:1204] (2/4) Exclude cut with ID _13938_210_9988_1_1530518718978_3693960_152-67046-0_sp1.1 from training. Duration: 0.936375 2023-10-04 16:42:59,781 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_4528_210_3273_1_1527076654_3276777_342-168068-0_sp1.1 from training. Duration: 0.7909375 2023-10-04 16:43:01,066 WARNING [train.py:1204] (2/4) Exclude cut with ID _7606_210_4915_1_1528894253277_5080690_127-177761-0 from training. Duration: 0.64 2023-10-04 16:43:01,148 WARNING [train.py:1204] (2/4) Exclude cut with ID _45329_210_7005_1_1534157951211_6774339_335-137972-0_sp1.1 from training. Duration: 0.92725 2023-10-04 16:43:04,399 WARNING [train.py:1204] (2/4) Exclude cut with ID _5166_210_2084_1_1527073197160_4087993_331-117829-0_sp1.1 from training. Duration: 0.563625 2023-10-04 16:43:05,307 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.0.layers.0.conv_module1.whiten, num_groups=1, num_channels=192, metric=10.68 vs. limit=15.0 2023-10-04 16:43:08,573 WARNING [train.py:1204] (2/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_125-125818-0_sp1.1 from training. Duration: 0.87275 2023-10-04 16:43:10,004 WARNING [train.py:1204] (2/4) Exclude cut with ID _6789_210_1534_1_1527854471671_2870519_48-294897-0_sp1.1 from training. Duration: 0.963625 2023-10-04 16:43:11,399 WARNING [train.py:1204] (2/4) Exclude cut with ID _6694_210_4599_1_1527852637490_3965529_116-109398-0_sp1.1 from training. Duration: 0.663625 2023-10-04 16:43:11,408 WARNING [train.py:1204] (2/4) Exclude cut with ID _52892_210_6721_1_1534378941259_4124129_222-172685-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 16:43:13,390 WARNING [train.py:1204] (2/4) Exclude cut with ID _50655_210_18628_1_1534229942193_3772154_320-132275-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 16:43:19,263 WARNING [train.py:1204] (2/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_76-311963-0 from training. Duration: 0.7 2023-10-04 16:43:19,533 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=1722946.6666666667, ans=0.1 2023-10-04 16:43:26,566 WARNING [train.py:1204] (2/4) Exclude cut with ID _6274_210_2084_1_1527937583837_7494020_32-205288-0 from training. Duration: 0.78 2023-10-04 16:43:26,592 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_86-16881-0_sp0.9 from training. Duration: 0.6444375 2023-10-04 16:43:26,627 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_271-297505-0_sp1.1 from training. Duration: 0.9 2023-10-04 16:43:28,036 WARNING [train.py:1204] (2/4) Exclude cut with ID _57572_210_5517_1_1534921219624_3809484_187-295293-0_sp1.1 from training. Duration: 0.963625 2023-10-04 16:43:35,482 WARNING [train.py:1204] (2/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_563-329943-0 from training. Duration: 0.99 2023-10-04 16:43:35,542 WARNING [train.py:1204] (2/4) Exclude cut with ID _7160_210_1876_1_1527940923231_3703799_302-150728-0_sp0.9 from training. Duration: 0.8666875 2023-10-04 16:43:39,686 WARNING [train.py:1204] (2/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_926-7526-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 16:43:39,707 WARNING [train.py:1204] (2/4) Exclude cut with ID _62574_210_2655_1_1535180407058_3933249_467-22348-0_sp1.1 from training. Duration: 0.936375 2023-10-04 16:43:41,162 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.0.conv_module2.balancer1.prob, batch_count=1723080.0, ans=0.125 2023-10-04 16:43:42,226 WARNING [train.py:1204] (2/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_280-161633-0_sp0.9 from training. Duration: 0.811125 2023-10-04 16:43:43,604 WARNING [train.py:1204] (2/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_385-193052-0_sp1.1 from training. Duration: 0.7818125 2023-10-04 16:43:45,111 WARNING [train.py:1204] (2/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_444-28041-0 from training. Duration: 0.85 2023-10-04 16:43:45,116 WARNING [train.py:1204] (2/4) Exclude cut with ID _66070_210_7640_1_1535441393096_7805310_341-249237-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 16:43:46,992 WARNING [train.py:1204] (2/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_425-269826-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 16:43:51,381 WARNING [train.py:1204] (2/4) Exclude cut with ID _53950_210_5816_1_1534417264919_3862951_124-185597-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 16:43:52,836 WARNING [train.py:1204] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_380-204756-0 from training. Duration: 0.86 2023-10-04 16:43:57,377 WARNING [train.py:1204] (2/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_282-257791-0_sp0.9 from training. Duration: 0.9555625 2023-10-04 16:43:57,681 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.conv_module1.balancer2.prob, batch_count=1723146.6666666667, ans=0.125 2023-10-04 16:43:57,703 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.feed_forward1.out_proj.dropout_p, batch_count=1723146.6666666667, ans=0.1 2023-10-04 16:44:00,186 WARNING [train.py:1204] (2/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_372-131163-0_sp1.1 from training. Duration: 0.8545625 2023-10-04 16:44:01,605 WARNING [train.py:1204] (2/4) Exclude cut with ID _28317_210_12742_1_1532480302381_8510325_447-233513-0_sp1.1 from training. Duration: 0.963625 2023-10-04 16:44:02,971 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_54-118333-0_sp1.1 from training. Duration: 0.97275 2023-10-04 16:44:03,226 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.feed_forward3.hidden_balancer.prob, batch_count=1723146.6666666667, ans=0.125 2023-10-04 16:44:07,588 WARNING [train.py:1204] (2/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_388-230239-0_sp1.1 from training. Duration: 0.963625 2023-10-04 16:44:07,605 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_40047_210_2290_1_1533290519_2374311_141-54639-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 16:44:08,871 WARNING [train.py:1204] (2/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_91-267696-0_sp0.9 from training. Duration: 0.9666875 2023-10-04 16:44:10,144 INFO [train.py:1046] (2/4) Epoch 49, batch 3500, loss[loss=0.1427, simple_loss=0.2315, pruned_loss=0.02697, over 24505.00 frames. ], tot_loss[loss=0.1529, simple_loss=0.2344, pruned_loss=0.03572, over 4735198.43 frames. ], batch size: 63, lr: 2.07e-03, grad_scale: 16.0 2023-10-04 16:44:10,191 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_14056_210_4163_1_1530604483_7572827_532-137847-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 16:44:12,968 WARNING [train.py:1204] (2/4) Exclude cut with ID _8064_210_5399_1_1528372299291_4282973_155-283231-0_sp1.1 from training. Duration: 0.97275 2023-10-04 16:44:17,076 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_2245_210_2715_1_1525511548_3540254_310-52483-0_sp1.1 from training. Duration: 0.8 2023-10-04 16:44:17,132 WARNING [train.py:1204] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_185-174805-0 from training. Duration: 0.68 2023-10-04 16:44:18,965 WARNING [train.py:1204] (2/4) Exclude cut with ID _15012_210_1968_1_1530856820429_5095350_662-210469-0_sp1.1 from training. Duration: 0.5181875 2023-10-04 16:44:21,155 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.nonlin_attention.balancer.min_positive, batch_count=1723213.3333333333, ans=0.05 2023-10-04 16:44:22,768 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_426-313557-0_sp0.9 from training. Duration: 0.588875 2023-10-04 16:44:25,536 WARNING [train.py:1204] (2/4) Exclude cut with ID _42326_210_7770_1_1533814282531_3707269_407-204343-0_sp1.1 from training. Duration: 0.97275 2023-10-04 16:44:25,549 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_258-310570-0 from training. Duration: 0.82 2023-10-04 16:44:28,204 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.770e+02 2.140e+02 2.425e+02 2.856e+02 5.490e+02, threshold=4.850e+02, percent-clipped=2.0 2023-10-04 16:44:28,390 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_4737_210_2040_1_1526998689_3293008_378-178650-0_sp1.1 from training. Duration: 0.863625 2023-10-04 16:44:29,781 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_47896_210_20508_1_1534492518_5958868_432-327451-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 16:44:31,026 WARNING [train.py:1204] (2/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_188-114464-0_sp1.1 from training. Duration: 0.7454375 2023-10-04 16:44:31,033 WARNING [train.py:1204] (2/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_798-106179-0_sp1.1 from training. Duration: 0.7545625 2023-10-04 16:44:31,061 WARNING [train.py:1204] (2/4) Exclude cut with ID _14368_210_4220_1_1530532714870_3595529_143-117421-0_sp1.1 from training. Duration: 0.52725 2023-10-04 16:44:32,356 WARNING [train.py:1204] (2/4) Exclude cut with ID _67499_210_17282_1_1535509805220_3692430_361-78786-0_sp1.1 from training. Duration: 0.963625 2023-10-04 16:44:33,611 WARNING [train.py:1204] (2/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_712-342563-0_sp1.1 from training. Duration: 0.8818125 2023-10-04 16:44:33,658 WARNING [train.py:1204] (2/4) Exclude cut with ID _9235_210_5219_1_1528891154253_8046120_387-159008-0 from training. Duration: 0.86 2023-10-04 16:44:35,963 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.balancer1.prob, batch_count=1723280.0, ans=0.125 2023-10-04 16:44:37,097 WARNING [train.py:1204] (2/4) Exclude cut with ID _33640_210_2655_1_1533117605479_4855469_959-18820-0_sp1.1 from training. Duration: 0.963625 2023-10-04 16:44:37,139 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_12868_210_6753_1_1530702479_3610564_133-564-0_sp0.9 from training. Duration: 0.87775 2023-10-04 16:44:38,612 WARNING [train.py:1204] (2/4) Exclude cut with ID _23514_210_1389_1_1531832139785_3031439_12-29287-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 16:44:42,799 WARNING [train.py:1204] (2/4) Exclude cut with ID _23514_210_1389_1_1531832139785_3031439_489-185052-0_sp1.1 from training. Duration: 0.963625 2023-10-04 16:44:42,864 WARNING [train.py:1204] (2/4) Exclude cut with ID _5444_210_2151_1_1527418808464_5312210_634-194174-0 from training. Duration: 0.97 2023-10-04 16:44:44,179 WARNING [train.py:1204] (2/4) Exclude cut with ID _6275_210_2084_1_1528110062213_7669049_91-26919-0_sp1.1 from training. Duration: 0.8818125 2023-10-04 16:44:47,354 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_37351_210_7925_1_1533012931_3812113_516-82303-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 16:44:47,464 WARNING [train.py:1204] (2/4) Exclude cut with ID _41314_210_12651_1_1533459476543_3766460_80-74184-0_sp0.9 from training. Duration: 0.92225 2023-10-04 16:44:48,722 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_9460_210_4211_1_1528954587_10049667_495-202963-0_sp1.1 from training. Duration: 0.963625 2023-10-04 16:44:50,134 WARNING [train.py:1204] (2/4) Exclude cut with ID _14632_210_9988_1_1530691279392_3923200_332-105904-0_sp1.1 from training. Duration: 0.8181875 2023-10-04 16:44:50,157 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_40764_210_1925_1_1533882359_3752345_493-167886-0_sp1.1 from training. Duration: 0.92725 2023-10-04 16:44:53,879 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_196-289637-0 from training. Duration: 0.75 2023-10-04 16:44:55,254 WARNING [train.py:1204] (2/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_26-175553-0 from training. Duration: 0.61 2023-10-04 16:44:55,293 WARNING [train.py:1204] (2/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_890-5117-0 from training. Duration: 0.78 2023-10-04 16:44:55,351 WARNING [train.py:1204] (2/4) Exclude cut with ID _41314_210_12651_1_1533459476543_3766460_206-306860-0_sp1.1 from training. Duration: 0.92725 2023-10-04 16:44:56,878 WARNING [train.py:1204] (2/4) Exclude cut with ID _25882_210_3036_1_1532775665815_6479510_570-79824-0_sp1.1 from training. Duration: 0.963625 2023-10-04 16:44:56,937 WARNING [train.py:1204] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_731-2428-0_sp1.1 from training. Duration: 0.7545625 2023-10-04 16:44:56,957 WARNING [train.py:1204] (2/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_138-154812-0_sp0.9 from training. Duration: 0.8666875 2023-10-04 16:45:00,970 WARNING [train.py:1204] (2/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_178-39722-0_sp0.9 from training. Duration: 0.5444375 2023-10-04 16:45:01,022 WARNING [train.py:1204] (2/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_1285-256622-0_sp0.9 from training. Duration: 0.9333125 2023-10-04 16:45:06,917 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_41428_210_2161_1_1533774184_3992532_65-271323-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 16:45:08,365 WARNING [train.py:1204] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_308-161067-0 from training. Duration: 0.66 2023-10-04 16:45:08,383 WARNING [train.py:1204] (2/4) Exclude cut with ID _3067_210_2667_1_1526212974640_4503168_459-173867-0 from training. Duration: 0.88 2023-10-04 16:45:08,385 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_2221_210_1988_1_1525597051_4935541_415-296882-0_sp1.1 from training. Duration: 0.7 2023-10-04 16:45:11,181 WARNING [train.py:1204] (2/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_354-192266-0_sp1.1 from training. Duration: 0.936375 2023-10-04 16:45:11,260 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_258-310570-0_sp0.9 from training. Duration: 0.911125 2023-10-04 16:45:12,613 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_548-141039-0_sp1.1 from training. Duration: 0.963625 2023-10-04 16:45:14,093 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_15101_210_7927_1_1530786415_5033274_320-327527-0 from training. Duration: 0.96 2023-10-04 16:45:15,926 WARNING [train.py:1204] (2/4) Exclude cut with ID _2895_210_2477_1_1525956785794_4063750_289-11475-0_sp0.9 from training. Duration: 0.911125 2023-10-04 16:45:17,266 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7543_210_6753_1_1528543318_4552_1-85072-0_sp1.1 from training. Duration: 0.7545625 2023-10-04 16:45:17,347 WARNING [train.py:1204] (2/4) Exclude cut with ID _64639_210_5816_1_1535367619437_3528000_461-329156-0 from training. Duration: 0.96 2023-10-04 16:45:19,217 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_211-339622-0 from training. Duration: 0.45 2023-10-04 16:45:22,329 WARNING [train.py:1204] (2/4) Exclude cut with ID _20455_210_5219_1_1531565996775_8098100_402-112739-0_sp1.1 from training. Duration: 0.963625 2023-10-04 16:45:23,716 WARNING [train.py:1204] (2/4) Exclude cut with ID _53489_210_6228_1_1534298357169_3835970_256-115565-0_sp1.1 from training. Duration: 0.936375 2023-10-04 16:45:23,737 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_26326_210_7925_1_1533002493_3592406_360-305260-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 16:45:23,776 WARNING [train.py:1204] (2/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_18-204383-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 16:45:25,083 INFO [train.py:1046] (2/4) Epoch 49, batch 3550, loss[loss=0.1597, simple_loss=0.2486, pruned_loss=0.03536, over 24355.00 frames. ], tot_loss[loss=0.1519, simple_loss=0.2328, pruned_loss=0.03554, over 4716994.29 frames. ], batch size: 77, lr: 2.07e-03, grad_scale: 16.0 2023-10-04 16:45:26,560 WARNING [train.py:1204] (2/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_306-232480-0_sp1.1 from training. Duration: 0.87275 2023-10-04 16:45:30,853 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.conv_module1.balancer1.prob, batch_count=1723546.6666666667, ans=0.125 2023-10-04 16:45:35,259 WARNING [train.py:1204] (2/4) Exclude cut with ID _41161_210_20034_1_1533431249657_3555314_116-299186-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 16:45:38,007 WARNING [train.py:1204] (2/4) Exclude cut with ID _13913_210_9994_1_1530579615187_3383897_473-277682-0_sp1.1 from training. Duration: 0.3545625 2023-10-04 16:45:40,018 WARNING [train.py:1204] (2/4) Exclude cut with ID _58895_210_8750_1_1535184081320_3649089_49-225702-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 16:45:41,364 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_3080_210_2071_1_1526086102_4365166_312-8726-0_sp1.1 from training. Duration: 0.82725 2023-10-04 16:45:43,361 WARNING [train.py:1204] (2/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_489-190175-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 16:45:44,485 WARNING [train.py:1204] (2/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_345-95717-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 16:45:44,496 WARNING [train.py:1204] (2/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_85-138257-0_sp1.1 from training. Duration: 0.6454375 2023-10-04 16:45:47,431 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7046_210_6753_1_1527895337_6920917_783-278109-0_sp1.1 from training. Duration: 0.7 2023-10-04 16:45:47,465 WARNING [train.py:1204] (2/4) Exclude cut with ID _3465_210_3525_1_1526175667060_3574049_450-203637-0_sp1.1 from training. Duration: 0.836375 2023-10-04 16:45:48,657 WARNING [train.py:1204] (2/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_416-132893-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 16:45:48,685 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_2655_210_2087_1_1525688959_3897400_437-337690-0_sp0.9 from training. Duration: 0.7 2023-10-04 16:45:48,758 WARNING [train.py:1204] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_700-183549-0_sp1.1 from training. Duration: 0.6181875 2023-10-04 16:45:53,950 WARNING [train.py:1204] (2/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_191-59784-0_sp1.1 from training. Duration: 0.8 2023-10-04 16:45:53,961 WARNING [train.py:1204] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_218-295097-0_sp1.1 from training. Duration: 0.7 2023-10-04 16:45:55,345 WARNING [train.py:1204] (2/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_310-109461-0_sp1.1 from training. Duration: 0.72725 2023-10-04 16:45:55,349 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_33348_210_5632_1_1533086912_3324030_131-321149-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 16:45:56,714 WARNING [train.py:1204] (2/4) Exclude cut with ID _64135_210_2253_1_1535677267876_7608972_133-188266-0_sp1.1 from training. Duration: 0.736375 2023-10-04 16:45:58,030 WARNING [train.py:1204] (2/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_505-205270-0 from training. Duration: 0.38 2023-10-04 16:45:58,040 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_67223_210_4238_1_1535612120_2537431_102-88248-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 16:45:58,154 WARNING [train.py:1204] (2/4) Exclude cut with ID _54871_210_22356_1_1534665798253_3856359_60-235433-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 16:45:59,027 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.0.layers.1.whiten, num_groups=1, num_channels=192, metric=3.92 vs. limit=12.0 2023-10-04 16:45:59,401 WARNING [train.py:1204] (2/4) Exclude cut with ID _5189_210_4557_1_1527248319428_3769229_199-53526-0_sp1.1 from training. Duration: 0.3818125 2023-10-04 16:45:59,640 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.self_attn_weights.pos_emb_skip_rate, batch_count=1723680.0, ans=0.0 2023-10-04 16:46:03,755 WARNING [train.py:1204] (2/4) Exclude cut with ID _45329_210_7005_1_1534157951211_6774339_439-90979-0_sp1.1 from training. Duration: 0.963625 2023-10-04 16:46:05,013 WARNING [train.py:1204] (2/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_1162-132880-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 16:46:05,092 WARNING [train.py:1204] (2/4) Exclude cut with ID _56423_210_5854_1_1534896061049_3165969_220-22317-0_sp1.1 from training. Duration: 0.963625 2023-10-04 16:46:07,809 WARNING [train.py:1204] (2/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_909-64151-0 from training. Duration: 0.94 2023-10-04 16:46:07,869 WARNING [train.py:1204] (2/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_465-156500-0_sp0.9 from training. Duration: 0.8 2023-10-04 16:46:09,693 WARNING [train.py:1204] (2/4) Exclude cut with ID _44304_210_15374_1_1533639352765_4613129_302-22084-0 from training. Duration: 0.86 2023-10-04 16:46:09,735 WARNING [train.py:1204] (2/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_663-166973-0_sp1.1 from training. Duration: 0.72725 2023-10-04 16:46:12,839 WARNING [train.py:1204] (2/4) Exclude cut with ID _45329_210_7005_1_1534157951211_6774339_503-287883-0_sp0.9 from training. Duration: 0.97775 2023-10-04 16:46:12,854 WARNING [train.py:1204] (2/4) Exclude cut with ID _60057_210_10640_1_1534903065793_3333150_362-230377-0_sp0.9 from training. Duration: 0.9666875 2023-10-04 16:46:15,675 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_4183_210_2087_1_1526729100_1869838_143-280962-0 from training. Duration: 0.56 2023-10-04 16:46:15,761 WARNING [train.py:1204] (2/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_281-1313-0_sp1.1 from training. Duration: 0.936375 2023-10-04 16:46:20,527 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.conv_module2.balancer1.prob, batch_count=1723746.6666666667, ans=0.125 2023-10-04 16:46:22,468 INFO [scaling.py:1118] (2/4) WithLoss: name=encoder.encoders.2.encoder.layers.2.self_attn_weights, loss-sum=0.000e+00 2023-10-04 16:46:22,802 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.4.encoder.layers.2.nonlin_attention.whiten2, num_groups=1, num_channels=384, metric=3.30 vs. limit=15.0 2023-10-04 16:46:23,531 WARNING [train.py:1204] (2/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_64-215118-0_sp1.1 from training. Duration: 0.936375 2023-10-04 16:46:23,586 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_2822_210_2715_1_1526112586_1999587_165-36656-0 from training. Duration: 0.89 2023-10-04 16:46:24,939 WARNING [train.py:1204] (2/4) Exclude cut with ID _22961_210_8254_1_1531976366672_4278180_402-208830-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 16:46:29,198 WARNING [train.py:1204] (2/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_98-193494-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 16:46:30,533 WARNING [train.py:1204] (2/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_383-285196-0 from training. Duration: 0.97 2023-10-04 16:46:37,413 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6767_210_3273_1_1527855783_1718932_142-127132-0 from training. Duration: 0.95 2023-10-04 16:46:37,456 WARNING [train.py:1204] (2/4) Exclude cut with ID _39867_210_9826_1_1533553222030_9344259_1018-339635-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 16:46:37,518 WARNING [train.py:1204] (2/4) Exclude cut with ID _14603_210_4220_1_1530615688441_3097319_274-274798-0_sp1.1 from training. Duration: 0.8909375 2023-10-04 16:46:38,826 INFO [train.py:1046] (2/4) Epoch 49, batch 3600, loss[loss=0.1471, simple_loss=0.2328, pruned_loss=0.03065, over 24625.00 frames. ], tot_loss[loss=0.1517, simple_loss=0.232, pruned_loss=0.03569, over 4707280.58 frames. ], batch size: 65, lr: 2.07e-03, grad_scale: 16.0 2023-10-04 16:46:38,991 WARNING [train.py:1204] (2/4) Exclude cut with ID _2334_210_2667_1_1525608238880_4705129_376-263393-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 16:46:40,745 WARNING [train.py:1204] (2/4) Exclude cut with ID _29303_210_2575_1_1532392237303_7410649_363-344675-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 16:46:40,820 WARNING [train.py:1204] (2/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_58-68956-0_sp1.1 from training. Duration: 0.7545625 2023-10-04 16:46:40,999 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.feed_forward3.hidden_balancer.prob, batch_count=1723880.0, ans=0.125 2023-10-04 16:46:44,244 WARNING [train.py:1204] (2/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_709-311571-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 16:46:45,703 WARNING [train.py:1204] (2/4) Exclude cut with ID _46067_210_4220_1_1534125571066_3307450_136-257407-0_sp1.1 from training. Duration: 0.963625 2023-10-04 16:46:47,036 WARNING [train.py:1204] (2/4) Exclude cut with ID _6493_210_1747_1_1528459210424_4012470_324-323854-0_sp1.1 from training. Duration: 0.836375 2023-10-04 16:46:48,240 WARNING [train.py:1204] (2/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_234-260682-0_sp1.1 from training. Duration: 0.763625 2023-10-04 16:46:48,299 WARNING [train.py:1204] (2/4) Exclude cut with ID _36634_210_1794_1_1533296352450_1399884_55-119451-0_sp1.1 from training. Duration: 0.963625 2023-10-04 16:46:48,302 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_40047_210_2290_1_1533290519_2374311_195-165693-0 from training. Duration: 0.89 2023-10-04 16:46:53,009 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_5522_210_3273_1_1527249606_3909096_366-172508-0_sp0.9 from training. Duration: 0.7333125 2023-10-04 16:46:54,913 WARNING [train.py:1204] (2/4) Exclude cut with ID _21878_210_3525_1_1531738772002_7264330_187-10504-0_sp1.1 from training. Duration: 0.963625 2023-10-04 16:46:57,692 WARNING [train.py:1204] (2/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_282-257791-0_sp1.1 from training. Duration: 0.7818125 2023-10-04 16:46:59,088 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.720e+02 2.052e+02 2.311e+02 2.640e+02 4.130e+02, threshold=4.623e+02, percent-clipped=0.0 2023-10-04 16:47:00,615 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_43831_210_2158_1_1533725502_5872791_467-288776-0_sp1.1 from training. Duration: 0.92725 2023-10-04 16:47:01,927 WARNING [train.py:1204] (2/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_138-154812-0_sp1.1 from training. Duration: 0.7090625 2023-10-04 16:47:03,242 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_1154-185930-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 16:47:04,587 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_2655_210_2087_1_1525688959_3897400_52-279899-0 from training. Duration: 0.69 2023-10-04 16:47:06,070 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_141-89113-0_sp1.1 from training. Duration: 0.7818125 2023-10-04 16:47:06,289 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.0.conv_module2.balancer1.max_abs, batch_count=1723946.6666666667, ans=10.0 2023-10-04 16:47:07,602 WARNING [train.py:1204] (2/4) Exclude cut with ID _55134_210_21982_1_1534590028377_3882170_297-47310-0_sp1.1 from training. Duration: 0.963625 2023-10-04 16:47:08,995 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_486-151131-0_sp1.1 from training. Duration: 0.77275 2023-10-04 16:47:10,391 WARNING [train.py:1204] (2/4) Exclude cut with ID _44778_210_2054_1_1533798127203_7443090_21-232980-0_sp1.1 from training. Duration: 0.936375 2023-10-04 16:47:11,922 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6165_210_3528_1_1527590982_4379841_338-106380-0_sp1.1 from training. Duration: 0.92725 2023-10-04 16:47:13,210 WARNING [train.py:1204] (2/4) Exclude cut with ID _55162_210_17071_1_1534408248583_3753480_355-257218-0_sp1.1 from training. Duration: 0.97275 2023-10-04 16:47:14,943 WARNING [train.py:1204] (2/4) Exclude cut with ID _2895_210_2477_1_1525956785794_4063750_289-11475-0 from training. Duration: 0.82 2023-10-04 16:47:20,688 WARNING [train.py:1204] (2/4) Exclude cut with ID _16756_210_4915_1_1531047803763_7104259_839-183431-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 16:47:22,075 WARNING [train.py:1204] (2/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_427-208901-0_sp0.9 from training. Duration: 0.7555625 2023-10-04 16:47:23,366 WARNING [train.py:1204] (2/4) Exclude cut with ID _36512_210_6987_1_1533033143998_3126069_181-306500-0 from training. Duration: 0.95 2023-10-04 16:47:28,491 WARNING [train.py:1204] (2/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_28-70987-0_sp1.1 from training. Duration: 0.7909375 2023-10-04 16:47:32,776 WARNING [train.py:1204] (2/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_590-58996-0_sp1.1 from training. Duration: 0.936375 2023-10-04 16:47:34,262 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_48730_210_5399_1_1533885886_2212769_108-47561-0_sp1.1 from training. Duration: 0.936375 2023-10-04 16:47:40,050 WARNING [train.py:1204] (2/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_401-158239-0_sp0.9 from training. Duration: 0.788875 2023-10-04 16:47:40,070 WARNING [train.py:1204] (2/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_278-154492-0_sp1.1 from training. Duration: 0.6909375 2023-10-04 16:47:40,076 WARNING [train.py:1204] (2/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_137-116442-0 from training. Duration: 0.78 2023-10-04 16:47:42,675 WARNING [train.py:1204] (2/4) Exclude cut with ID _3541_210_2477_1_1526475408709_3263973_189-317158-0 from training. Duration: 0.9 2023-10-04 16:47:44,027 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_47896_210_20508_1_1534492518_5958868_558-314572-0 from training. Duration: 0.97 2023-10-04 16:47:46,073 WARNING [train.py:1204] (2/4) Exclude cut with ID _66070_210_7640_1_1535441393096_7805310_290-40284-0_sp1.1 from training. Duration: 0.97275 2023-10-04 16:47:46,107 WARNING [train.py:1204] (2/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_110-224688-0_sp1.1 from training. Duration: 0.8818125 2023-10-04 16:47:47,474 WARNING [train.py:1204] (2/4) Exclude cut with ID _7719_210_3525_1_1528456153163_4296960_88-250259-0 from training. Duration: 0.84 2023-10-04 16:47:47,531 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8453_210_4211_1_1528502940_8562730_1157-114102-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 16:47:48,852 WARNING [train.py:1204] (2/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_317-175062-0_sp1.1 from training. Duration: 0.7454375 2023-10-04 16:47:48,867 WARNING [train.py:1204] (2/4) Exclude cut with ID _29857_210_20034_1_1532395886753_3798160_254-347254-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 16:47:50,264 WARNING [train.py:1204] (2/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_28-70987-0 from training. Duration: 0.87 2023-10-04 16:47:50,336 WARNING [train.py:1204] (2/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_321-335475-0 from training. Duration: 0.45 2023-10-04 16:47:51,018 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.3.encoder.layers.0.nonlin_attention.whiten2, num_groups=1, num_channels=512, metric=5.53 vs. limit=15.0 2023-10-04 16:47:53,501 INFO [train.py:1046] (2/4) Epoch 49, batch 3650, loss[loss=0.1416, simple_loss=0.2194, pruned_loss=0.03191, over 23587.00 frames. ], tot_loss[loss=0.152, simple_loss=0.2324, pruned_loss=0.03579, over 4700973.79 frames. ], batch size: 149, lr: 2.07e-03, grad_scale: 16.0 2023-10-04 16:47:53,612 WARNING [train.py:1204] (2/4) Exclude cut with ID _40901_210_5665_1_1533363789958_3856130_356-107330-0_sp1.1 from training. Duration: 0.936375 2023-10-04 16:47:53,670 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_3780_210_2087_1_1526467389_2145572_176-107140-0 from training. Duration: 0.69 2023-10-04 16:47:56,578 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.1.conv_skip_rate, batch_count=1724213.3333333333, ans=0.0 2023-10-04 16:47:59,668 WARNING [train.py:1204] (2/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_80-135200-0 from training. Duration: 0.79 2023-10-04 16:48:01,061 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_33348_210_5632_1_1533086912_3324030_85-28583-0_sp0.9 from training. Duration: 0.911125 2023-10-04 16:48:01,242 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.balancer2.prob, batch_count=1724213.3333333333, ans=0.125 2023-10-04 16:48:03,355 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.0.layers.0.self_attn_weights.whiten_keys, num_groups=4, num_channels=128, metric=2.80 vs. limit=6.0 2023-10-04 16:48:03,937 WARNING [train.py:1204] (2/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_29-293896-0 from training. Duration: 0.61 2023-10-04 16:48:05,290 WARNING [train.py:1204] (2/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_194-252800-0 from training. Duration: 0.97 2023-10-04 16:48:08,202 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_360-52446-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 16:48:08,204 WARNING [train.py:1204] (2/4) Exclude cut with ID _9389_210_1664_1_1529217337301_5289269_289-213417-0_sp0.9 from training. Duration: 0.988875 2023-10-04 16:48:08,262 WARNING [train.py:1204] (2/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_143-86344-0_sp0.9 from training. Duration: 0.8555625 2023-10-04 16:48:08,521 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.nonlin_attention.balancer.prob, batch_count=1724280.0, ans=0.125 2023-10-04 16:48:11,005 WARNING [train.py:1204] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_520-126729-0_sp0.9 from training. Duration: 0.77775 2023-10-04 16:48:11,032 WARNING [train.py:1204] (2/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_76-308663-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 16:48:11,215 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.conv_module1.balancer2.prob, batch_count=1724280.0, ans=0.125 2023-10-04 16:48:13,524 WARNING [train.py:1204] (2/4) Exclude cut with ID _8021_210_7994_1_1528376451358_3653069_100-140231-0 from training. Duration: 0.67 2023-10-04 16:48:13,599 WARNING [train.py:1204] (2/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_387-131538-0_sp0.9 from training. Duration: 0.9 2023-10-04 16:48:13,631 WARNING [train.py:1204] (2/4) Exclude cut with ID _6275_210_2084_1_1528110062213_7669049_365-182054-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 16:48:15,454 WARNING [train.py:1204] (2/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_345-340690-0 from training. Duration: 0.93 2023-10-04 16:48:15,540 WARNING [train.py:1204] (2/4) Exclude cut with ID _5409_210_1388_1_1527292850189_5865191_178-127970-0_sp0.9 from training. Duration: 0.6333125 2023-10-04 16:48:15,597 WARNING [train.py:1204] (2/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_352-12734-0_sp1.1 from training. Duration: 0.92725 2023-10-04 16:48:15,600 WARNING [train.py:1204] (2/4) Exclude cut with ID _65134_210_14763_1_1535335393985_3505368_121-129373-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 16:48:18,750 WARNING [train.py:1204] (2/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_557-324258-0_sp1.1 from training. Duration: 0.736375 2023-10-04 16:48:20,224 WARNING [train.py:1204] (2/4) Exclude cut with ID _48678_210_5663_1_1533988830947_3769199_306-255886-0 from training. Duration: 0.77 2023-10-04 16:48:22,080 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7544_210_6753_1_1528587000_691468_87-178599-0 from training. Duration: 0.87 2023-10-04 16:48:22,142 WARNING [train.py:1204] (2/4) Exclude cut with ID _2941_210_2577_1_1526199673150_2489759_208-236402-0_sp1.1 from training. Duration: 0.963625 2023-10-04 16:48:23,549 WARNING [train.py:1204] (2/4) Exclude cut with ID _38670_210_15379_1_1533448810659_4391059_65-169397-0 from training. Duration: 0.97 2023-10-04 16:48:26,688 WARNING [train.py:1204] (2/4) Exclude cut with ID _30304_210_6319_1_1532476827261_7582060_306-328175-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 16:48:26,706 WARNING [train.py:1204] (2/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_212-149947-0_sp1.1 from training. Duration: 0.663625 2023-10-04 16:48:30,498 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.1.encoder.layers.1.conv_module2.whiten, num_groups=1, num_channels=256, metric=12.57 vs. limit=15.0 2023-10-04 16:48:32,453 WARNING [train.py:1204] (2/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_321-22821-0_sp1.1 from training. Duration: 0.8090625 2023-10-04 16:48:35,204 WARNING [train.py:1204] (2/4) Exclude cut with ID _44010_210_4525_1_1533884262964_4013620_483-69110-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 16:48:35,214 WARNING [train.py:1204] (2/4) Exclude cut with ID _14922_210_6228_1_1530707435273_3693310_38-187851-0_sp1.1 from training. Duration: 0.563625 2023-10-04 16:48:36,547 WARNING [train.py:1204] (2/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_129-337802-0_sp0.9 from training. Duration: 0.8 2023-10-04 16:48:36,623 WARNING [train.py:1204] (2/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_268-87909-0_sp1.1 from training. Duration: 0.9 2023-10-04 16:48:38,079 WARNING [train.py:1204] (2/4) Exclude cut with ID _21878_210_3525_1_1531738772002_7264330_559-184862-0_sp1.1 from training. Duration: 0.8545625 2023-10-04 16:48:40,852 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_46032_210_14656_1_1533781412_4507969_237-120766-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 16:48:42,357 WARNING [train.py:1204] (2/4) Exclude cut with ID _28427_210_4220_1_1532581240967_3061069_330-170906-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 16:48:42,363 WARNING [train.py:1204] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_401-51578-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 16:48:43,824 WARNING [train.py:1204] (2/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_209-205365-0_sp1.1 from training. Duration: 0.7181875 2023-10-04 16:48:45,105 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_23801_210_16223_1_1532066392_3294657_57-242170-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 16:48:45,154 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_127-37371-0_sp1.1 from training. Duration: 0.936375 2023-10-04 16:48:51,612 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9171W0019-578905 from training. Duration: 0.896 2023-10-04 16:48:54,884 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_24605_210_1794_1_1532047863_4149755_349-49056-0_sp1.1 from training. Duration: 0.92725 2023-10-04 16:48:56,679 WARNING [train.py:1204] (2/4) Exclude cut with ID _12302_210_5219_1_1529924446466_8107280_897-21237-0_sp1.1 from training. Duration: 0.936375 2023-10-04 16:48:57,989 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_9460_210_4211_1_1528954587_10049667_480-260269-0_sp0.9 from training. Duration: 0.62225 2023-10-04 16:48:58,042 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_348-152548-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 16:48:59,373 WARNING [train.py:1204] (2/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_301-330455-0_sp0.9 from training. Duration: 0.888875 2023-10-04 16:49:00,728 WARNING [train.py:1204] (2/4) Exclude cut with ID _61008_210_10633_1_1534852769198_6974370_345-91705-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 16:49:02,149 WARNING [train.py:1204] (2/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_304-273433-0 from training. Duration: 0.99 2023-10-04 16:49:02,152 WARNING [train.py:1204] (2/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_359-345972-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 16:49:04,847 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_2296_210_2087_1_1525503448_3397335_57-185430-0_sp1.1 from training. Duration: 0.5454375 2023-10-04 16:49:06,238 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_1256-120974-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 16:49:07,597 INFO [train.py:1046] (2/4) Epoch 49, batch 3700, loss[loss=0.1422, simple_loss=0.2259, pruned_loss=0.02928, over 24676.00 frames. ], tot_loss[loss=0.1521, simple_loss=0.2327, pruned_loss=0.03577, over 4719277.85 frames. ], batch size: 65, lr: 2.07e-03, grad_scale: 16.0 2023-10-04 16:49:07,694 WARNING [train.py:1204] (2/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_372-30302-0_sp0.9 from training. Duration: 0.9555625 2023-10-04 16:49:07,943 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.attention_skip_rate, batch_count=1724546.6666666667, ans=0.0 2023-10-04 16:49:10,369 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_42012_210_7927_1_1533439936_3871556_451-344262-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 16:49:10,369 WARNING [train.py:1204] (2/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_28-324729-0 from training. Duration: 0.94 2023-10-04 16:49:10,378 WARNING [train.py:1204] (2/4) Exclude cut with ID _64135_210_2253_1_1535677267876_7608972_313-18394-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 16:49:11,744 WARNING [train.py:1204] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_620-276793-0_sp0.9 from training. Duration: 0.6555625 2023-10-04 16:49:11,770 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7544_210_6753_1_1528591327_1471426_172-348777-0_sp0.9 from training. Duration: 0.7333125 2023-10-04 16:49:12,026 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.feed_forward3.hidden_balancer.prob, batch_count=1724546.6666666667, ans=0.125 2023-10-04 16:49:13,392 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.feed_forward2.hidden_balancer.prob, batch_count=1724546.6666666667, ans=0.125 2023-10-04 16:49:14,557 WARNING [train.py:1204] (2/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_332-221057-0_sp0.9 from training. Duration: 0.7555625 2023-10-04 16:49:19,852 WARNING [train.py:1204] (2/4) Exclude cut with ID _19023_210_4353_1_1531378892552_5820780_5-300035-0_sp1.1 from training. Duration: 0.92725 2023-10-04 16:49:19,900 WARNING [train.py:1204] (2/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_343-309023-0_sp1.1 from training. Duration: 0.963625 2023-10-04 16:49:19,978 WARNING [train.py:1204] (2/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_37-41919-0_sp1.1 from training. Duration: 0.7909375 2023-10-04 16:49:20,002 WARNING [train.py:1204] (2/4) Exclude cut with ID _3541_210_2477_1_1526475408709_3263973_41-182910-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 16:49:21,134 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_229-45300-0_sp0.9 from training. Duration: 0.6333125 2023-10-04 16:49:22,625 WARNING [train.py:1204] (2/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_40-214716-0_sp1.1 from training. Duration: 0.963625 2023-10-04 16:49:23,981 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9309W0203-640330 from training. Duration: 0.896 2023-10-04 16:49:26,955 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.672e+02 2.043e+02 2.235e+02 2.541e+02 4.177e+02, threshold=4.470e+02, percent-clipped=0.0 2023-10-04 16:49:31,610 WARNING [train.py:1204] (2/4) Exclude cut with ID _60229_210_27767_1_1534838406158_3655350_264-279573-0_sp1.1 from training. Duration: 0.7545625 2023-10-04 16:49:32,938 WARNING [train.py:1204] (2/4) Exclude cut with ID _8394_210_6702_1_1528590504316_3540189_372-198878-0_sp0.9 from training. Duration: 0.6666875 2023-10-04 16:49:34,331 WARNING [train.py:1204] (2/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_359-118926-0_sp1.1 from training. Duration: 0.6545625 2023-10-04 16:49:34,342 WARNING [train.py:1204] (2/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_85-65056-0 from training. Duration: 0.96 2023-10-04 16:49:34,369 WARNING [train.py:1204] (2/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_89-242335-0_sp1.1 from training. Duration: 0.863625 2023-10-04 16:49:37,372 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.self_attn_weights.pos_emb_skip_rate, batch_count=1724680.0, ans=0.0 2023-10-04 16:49:38,456 WARNING [train.py:1204] (2/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_128-49444-0_sp1.1 from training. Duration: 0.936375 2023-10-04 16:49:39,771 WARNING [train.py:1204] (2/4) Exclude cut with ID _30327_210_16049_1_1532484161295_3924169_401-70819-0 from training. Duration: 0.86 2023-10-04 16:49:41,178 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_41822_210_13270_1_1533955976_4379284_260-256391-0_sp1.1 from training. Duration: 0.936375 2023-10-04 16:49:42,560 WARNING [train.py:1204] (2/4) Exclude cut with ID _67335_210_5219_1_1535525975652_4275229_599-247317-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 16:49:43,986 WARNING [train.py:1204] (2/4) Exclude cut with ID _29857_210_20034_1_1532395886753_3798160_209-239267-0_sp1.1 from training. Duration: 0.936375 2023-10-04 16:49:44,019 WARNING [train.py:1204] (2/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_46-207599-0_sp1.1 from training. Duration: 0.5818125 2023-10-04 16:49:46,759 WARNING [train.py:1204] (2/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_912-282412-0_sp1.1 from training. Duration: 0.4454375 2023-10-04 16:49:51,796 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_4183_210_2087_1_1526729100_1869838_141-166600-0_sp1.1 from training. Duration: 0.863625 2023-10-04 16:49:51,801 WARNING [train.py:1204] (2/4) Exclude cut with ID _15037_210_2253_1_1530752294104_3267650_296-192597-0 from training. Duration: 0.81 2023-10-04 16:49:51,857 WARNING [train.py:1204] (2/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_571-220562-0_sp1.1 from training. Duration: 0.963625 2023-10-04 16:49:51,875 WARNING [train.py:1204] (2/4) Exclude cut with ID _5287_210_2084_1_1527418883040_3878670_12-221186-0 from training. Duration: 0.98 2023-10-04 16:49:57,896 WARNING [train.py:1204] (2/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_149-85602-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 16:49:57,920 WARNING [train.py:1204] (2/4) Exclude cut with ID _9235_210_5219_1_1528891154253_8046120_1003-101638-0_sp1.1 from training. Duration: 0.663625 2023-10-04 16:49:59,950 WARNING [train.py:1204] (2/4) Exclude cut with ID _55844_210_2346_1_1534471212569_7625707_1099-33689-0_sp1.1 from training. Duration: 0.92725 2023-10-04 16:50:01,270 WARNING [train.py:1204] (2/4) Exclude cut with ID _7606_210_4915_1_1528894253277_5080690_301-10732-0 from training. Duration: 0.81 2023-10-04 16:50:04,066 WARNING [train.py:1204] (2/4) Exclude cut with ID _22826_210_9994_1_1531908201522_3279169_403-34698-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 16:50:04,072 WARNING [train.py:1204] (2/4) Exclude cut with ID _8480_210_1874_1_1528589391205_6728330_755-112625-0_sp0.9 from training. Duration: 0.82225 2023-10-04 16:50:04,084 WARNING [train.py:1204] (2/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_527-238990-0_sp1.1 from training. Duration: 0.8181875 2023-10-04 16:50:04,104 WARNING [train.py:1204] (2/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_426-7328-0_sp1.1 from training. Duration: 0.92725 2023-10-04 16:50:07,214 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.ff2_skip_rate, batch_count=1724813.3333333333, ans=0.0 2023-10-04 16:50:09,715 WARNING [train.py:1204] (2/4) Exclude cut with ID _3541_210_2477_1_1526475408709_3263973_189-317158-0_sp1.1 from training. Duration: 0.8181875 2023-10-04 16:50:11,051 WARNING [train.py:1204] (2/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_162-103620-0 from training. Duration: 0.93 2023-10-04 16:50:12,445 WARNING [train.py:1204] (2/4) Exclude cut with ID _22083_210_8254_1_1531702862439_3678970_49-234546-0 from training. Duration: 0.83 2023-10-04 16:50:12,508 WARNING [train.py:1204] (2/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_370-331600-0_sp1.1 from training. Duration: 0.8545625 2023-10-04 16:50:13,815 WARNING [train.py:1204] (2/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_166-59042-0_sp1.1 from training. Duration: 0.97275 2023-10-04 16:50:15,197 WARNING [train.py:1204] (2/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_138-332773-0_sp1.1 from training. Duration: 0.736375 2023-10-04 16:50:15,269 WARNING [train.py:1204] (2/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_90-30673-0_sp0.9 from training. Duration: 0.9333125 2023-10-04 16:50:18,227 WARNING [train.py:1204] (2/4) Exclude cut with ID _27967_210_12140_1_1532588391859_8419130_737-41898-0_sp1.1 from training. Duration: 0.936375 2023-10-04 16:50:19,597 WARNING [train.py:1204] (2/4) Exclude cut with ID _8833_210_5219_1_1528592665210_7553059_329-125156-0_sp1.1 from training. Duration: 0.7454375 2023-10-04 16:50:20,891 INFO [train.py:1046] (2/4) Epoch 49, batch 3750, loss[loss=0.1656, simple_loss=0.2579, pruned_loss=0.03669, over 24458.00 frames. ], tot_loss[loss=0.1536, simple_loss=0.2344, pruned_loss=0.03644, over 4714425.10 frames. ], batch size: 69, lr: 2.07e-03, grad_scale: 16.0 2023-10-04 16:50:21,600 WARNING [train.py:1204] (2/4) Exclude cut with ID _41817_210_18835_1_1533434477160_7423049_352-42537-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 16:50:22,868 WARNING [train.py:1204] (2/4) Exclude cut with ID _8365_210_2064_1_1528592311070_4677690_431-294102-0 from training. Duration: 0.89 2023-10-04 16:50:22,983 WARNING [train.py:1204] (2/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_454-314901-0_sp1.1 from training. Duration: 0.4181875 2023-10-04 16:50:26,093 WARNING [train.py:1204] (2/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_80-135200-0_sp0.9 from training. Duration: 0.87775 2023-10-04 16:50:27,380 WARNING [train.py:1204] (2/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_181-78632-0 from training. Duration: 0.78 2023-10-04 16:50:27,435 WARNING [train.py:1204] (2/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_300-27235-0_sp0.9 from training. Duration: 0.9666875 2023-10-04 16:50:29,139 WARNING [train.py:1204] (2/4) Exclude cut with ID _47790_210_16187_1_1533862814066_4207519_96-177176-0_sp1.1 from training. Duration: 0.97275 2023-10-04 16:50:29,237 WARNING [train.py:1204] (2/4) Exclude cut with ID _35941_210_15379_1_1532995294515_4023089_246-348742-0_sp1.1 from training. Duration: 0.97275 2023-10-04 16:50:31,054 WARNING [train.py:1204] (2/4) Exclude cut with ID _38670_210_15379_1_1533448810659_4391059_65-169397-0_sp1.1 from training. Duration: 0.8818125 2023-10-04 16:50:33,878 WARNING [train.py:1204] (2/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_1003-99642-0_sp1.1 from training. Duration: 0.963625 2023-10-04 16:50:36,525 WARNING [train.py:1204] (2/4) Exclude cut with ID _40402_210_18219_1_1533522324115_2953150_320-14078-0_sp1.1 from training. Duration: 0.863625 2023-10-04 16:50:37,964 WARNING [train.py:1204] (2/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_367-61330-0_sp0.9 from training. Duration: 0.8333125 2023-10-04 16:50:40,605 WARNING [train.py:1204] (2/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_515-298514-0_sp1.1 from training. Duration: 0.92725 2023-10-04 16:50:43,291 WARNING [train.py:1204] (2/4) Exclude cut with ID _53731_210_7994_1_1534553979244_3757214_471-320815-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 16:50:44,733 WARNING [train.py:1204] (2/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_306-232480-0 from training. Duration: 0.96 2023-10-04 16:50:44,808 WARNING [train.py:1204] (2/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_478-162247-0_sp0.9 from training. Duration: 0.911125 2023-10-04 16:50:46,188 WARNING [train.py:1204] (2/4) Exclude cut with ID _56423_210_5854_1_1534896061049_3165969_345-284696-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 16:50:46,222 WARNING [train.py:1204] (2/4) Exclude cut with ID _44778_210_2054_1_1533798127203_7443090_600-122253-0_sp1.1 from training. Duration: 0.963625 2023-10-04 16:50:50,293 WARNING [train.py:1204] (2/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_199-295611-0 from training. Duration: 0.55 2023-10-04 16:50:55,155 WARNING [train.py:1204] (2/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_734-129761-0 from training. Duration: 0.82 2023-10-04 16:50:56,671 WARNING [train.py:1204] (2/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_349-169257-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 16:50:56,709 WARNING [train.py:1204] (2/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_202-67017-0_sp0.9 from training. Duration: 0.911125 2023-10-04 16:50:58,179 WARNING [train.py:1204] (2/4) Exclude cut with ID _16908_210_5724_1_1531119157248_3772700_61-258978-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 16:51:03,316 WARNING [train.py:1204] (2/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_172-156931-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 16:51:04,689 WARNING [train.py:1204] (2/4) Exclude cut with ID _4092_210_2151_1_1526643044449_3135759_356-278694-0_sp1.1 from training. Duration: 0.42725 2023-10-04 16:51:04,940 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.conv_skip_rate, batch_count=1725080.0, ans=0.0 2023-10-04 16:51:06,399 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.nonlin_attention.balancer.min_positive, batch_count=1725080.0, ans=0.05 2023-10-04 16:51:08,832 WARNING [train.py:1204] (2/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_116-332008-0 from training. Duration: 0.98 2023-10-04 16:51:11,686 WARNING [train.py:1204] (2/4) Exclude cut with ID _29003_210_19596_1_1532507391720_3894098_358-114789-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 16:51:14,491 WARNING [train.py:1204] (2/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_152-212533-0_sp1.1 from training. Duration: 0.8818125 2023-10-04 16:51:14,563 WARNING [train.py:1204] (2/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_267-214116-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 16:51:17,396 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7543_210_6753_1_1528541907_1399322_165-323619-0_sp0.9 from training. Duration: 0.7666875 2023-10-04 16:51:22,226 WARNING [train.py:1204] (2/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_577-332600-0_sp0.9 from training. Duration: 0.6333125 2023-10-04 16:51:23,438 WARNING [train.py:1204] (2/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_342-100157-0_sp0.9 from training. Duration: 0.6 2023-10-04 16:51:25,431 WARNING [train.py:1204] (2/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_141-89727-0_sp1.1 from training. Duration: 0.6454375 2023-10-04 16:51:28,158 WARNING [train.py:1204] (2/4) Exclude cut with ID _3992_210_1747_1_1526644816031_3731060_107-251434-0_sp1.1 from training. Duration: 0.9 2023-10-04 16:51:29,610 WARNING [train.py:1204] (2/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_401-53423-0_sp1.1 from training. Duration: 0.5 2023-10-04 16:51:36,221 INFO [train.py:1046] (2/4) Epoch 49, batch 3800, loss[loss=0.1476, simple_loss=0.2257, pruned_loss=0.03475, over 22481.00 frames. ], tot_loss[loss=0.1532, simple_loss=0.2335, pruned_loss=0.03648, over 4707112.46 frames. ], batch size: 49, lr: 2.07e-03, grad_scale: 16.0 2023-10-04 16:51:37,762 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7762_210_1794_1_1528199508_4057408_422-227034-0_sp1.1 from training. Duration: 0.663625 2023-10-04 16:51:42,069 WARNING [train.py:1204] (2/4) Exclude cut with ID _4184_210_2550_1_1526730192013_2219380_102-250299-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 16:51:42,235 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.1.conv_module2.balancer2.prob, batch_count=1725213.3333333333, ans=0.125 2023-10-04 16:51:43,337 WARNING [train.py:1204] (2/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_253-308377-0_sp0.9 from training. Duration: 0.5444375 2023-10-04 16:51:43,406 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_271-297505-0 from training. Duration: 0.99 2023-10-04 16:51:44,822 WARNING [train.py:1204] (2/4) Exclude cut with ID _35941_210_15379_1_1532995294515_4023089_213-142973-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 16:51:47,415 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7544_210_6753_1_1528587722_3576686_162-63018-0_sp1.1 from training. Duration: 0.92725 2023-10-04 16:51:47,493 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_365-217119-0_sp0.9 from training. Duration: 0.7 2023-10-04 16:51:48,515 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.0.layers.0.self_attn_weights.whiten_keys, num_groups=4, num_channels=128, metric=2.68 vs. limit=6.0 2023-10-04 16:51:50,181 WARNING [train.py:1204] (2/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_662-275621-0_sp0.9 from training. Duration: 0.4666875 2023-10-04 16:51:50,183 WARNING [train.py:1204] (2/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_374-187555-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 16:51:51,494 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_289-180904-0_sp1.1 from training. Duration: 0.6909375 2023-10-04 16:51:53,397 WARNING [train.py:1204] (2/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_189-47206-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 16:51:53,441 WARNING [train.py:1204] (2/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_234-14174-0_sp1.1 from training. Duration: 0.7454375 2023-10-04 16:51:54,676 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.696e+02 2.120e+02 2.292e+02 2.756e+02 3.883e+02, threshold=4.583e+02, percent-clipped=0.0 2023-10-04 16:51:54,759 WARNING [train.py:1204] (2/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_576-76621-0_sp1.1 from training. Duration: 0.963625 2023-10-04 16:51:54,857 WARNING [train.py:1204] (2/4) Exclude cut with ID _13253_210_1925_1_1530757775819_3577873_253-246386-0 from training. Duration: 0.9 2023-10-04 16:51:59,407 WARNING [train.py:1204] (2/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_372-6869-0_sp1.1 from training. Duration: 0.4 2023-10-04 16:51:59,467 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_21434_210_4238_1_1531916339_1222252_22-54954-0_sp1.1 from training. Duration: 0.936375 2023-10-04 16:52:02,918 WARNING [train.py:1204] (2/4) Exclude cut with ID _29853_210_7497_1_1532505600851_6642110_492-170361-0_sp1.1 from training. Duration: 0.92725 2023-10-04 16:52:03,216 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.attention_skip_rate, batch_count=1725280.0, ans=0.0 2023-10-04 16:52:05,659 WARNING [train.py:1204] (2/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_505-97987-0_sp0.9 from training. Duration: 0.9555625 2023-10-04 16:52:05,703 WARNING [train.py:1204] (2/4) Exclude cut with ID _8365_210_2064_1_1528592311070_4677690_371-122295-0_sp0.9 from training. Duration: 0.611125 2023-10-04 16:52:07,711 WARNING [train.py:1204] (2/4) Exclude cut with ID _14268_210_9206_1_1530622771285_4714099_637-86630-0_sp1.1 from training. Duration: 0.463625 2023-10-04 16:52:07,732 WARNING [train.py:1204] (2/4) Exclude cut with ID _29857_210_20034_1_1532395886753_3798160_372-5678-0_sp1.1 from training. Duration: 0.963625 2023-10-04 16:52:10,449 WARNING [train.py:1204] (2/4) Exclude cut with ID _52892_210_6721_1_1534378941259_4124129_90-165108-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 16:52:11,872 WARNING [train.py:1204] (2/4) Exclude cut with ID _27052_210_5986_1_1532329197187_3585009_143-339198-0_sp1.1 from training. Duration: 0.963625 2023-10-04 16:52:16,140 WARNING [train.py:1204] (2/4) Exclude cut with ID _14268_210_9206_1_1530622771285_4714099_637-86630-0_sp0.9 from training. Duration: 0.5666875 2023-10-04 16:52:16,142 WARNING [train.py:1204] (2/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_401-255491-0 from training. Duration: 0.7 2023-10-04 16:52:17,553 WARNING [train.py:1204] (2/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_178-6946-0_sp1.1 from training. Duration: 0.97275 2023-10-04 16:52:24,015 WARNING [train.py:1204] (2/4) Exclude cut with ID _27485_210_4916_1_1532739191677_3668269_460-163885-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 16:52:28,019 WARNING [train.py:1204] (2/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_270-26053-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 16:52:31,184 WARNING [train.py:1204] (2/4) Exclude cut with ID _14170_210_5219_1_1530525949868_4469650_412-317077-0 from training. Duration: 0.75 2023-10-04 16:52:33,089 WARNING [train.py:1204] (2/4) Exclude cut with ID _5289_210_2084_1_1527678202736_3779299_142-103317-0 from training. Duration: 0.84 2023-10-04 16:52:34,496 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_150-143438-0_sp1.1 from training. Duration: 0.92725 2023-10-04 16:52:35,990 WARNING [train.py:1204] (2/4) Exclude cut with ID _27894_210_12392_1_1532415496078_6902439_668-269779-0_sp1.1 from training. Duration: 0.97275 2023-10-04 16:52:37,293 WARNING [train.py:1204] (2/4) Exclude cut with ID _18513_210_9206_1_1531618215223_3862620_56-222075-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 16:52:37,414 WARNING [train.py:1204] (2/4) Exclude cut with ID _4092_210_2151_1_1526643044449_3135759_356-278694-0 from training. Duration: 0.47 2023-10-04 16:52:38,068 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.3.encoder.layers.1.nonlin_attention.whiten1, num_groups=1, num_channels=384, metric=4.50 vs. limit=10.0 2023-10-04 16:52:42,148 WARNING [train.py:1204] (2/4) Exclude cut with ID _25808_210_13292_1_1532343514562_7366109_633-47524-0 from training. Duration: 0.99 2023-10-04 16:52:42,166 WARNING [train.py:1204] (2/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_872-264525-0 from training. Duration: 0.65 2023-10-04 16:52:43,514 WARNING [train.py:1204] (2/4) Exclude cut with ID _14632_210_9988_1_1530691279392_3923200_239-232545-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 16:52:44,849 WARNING [train.py:1204] (2/4) Exclude cut with ID _14349_210_8341_1_1530615710818_3442470_153-27432-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 16:52:45,163 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.ff3_skip_rate, batch_count=1725480.0, ans=0.0 2023-10-04 16:52:49,112 WARNING [train.py:1204] (2/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_37-41919-0_sp0.9 from training. Duration: 0.9666875 2023-10-04 16:52:50,408 INFO [train.py:1046] (2/4) Epoch 49, batch 3850, loss[loss=0.1498, simple_loss=0.2362, pruned_loss=0.03168, over 24468.00 frames. ], tot_loss[loss=0.152, simple_loss=0.2325, pruned_loss=0.03573, over 4708169.87 frames. ], batch size: 66, lr: 2.07e-03, grad_scale: 16.0 2023-10-04 16:52:50,488 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_196-289637-0_sp0.9 from training. Duration: 0.8333125 2023-10-04 16:52:54,543 WARNING [train.py:1204] (2/4) Exclude cut with ID _13678_210_7497_1_1530530069226_3463170_371-3221-0_sp1.1 from training. Duration: 0.8181875 2023-10-04 16:52:56,458 WARNING [train.py:1204] (2/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_557-324258-0 from training. Duration: 0.81 2023-10-04 16:52:57,754 WARNING [train.py:1204] (2/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_1012-279473-0_sp1.1 from training. Duration: 0.6090625 2023-10-04 16:52:57,827 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_66916_210_14656_1_1535725525_326566_7-92649-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 16:53:01,149 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_9460_210_4211_1_1528954587_10049667_480-260269-0_sp1.1 from training. Duration: 0.5090625 2023-10-04 16:53:05,555 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_40896_210_15710_1_1533433956_7100802_121-155001-0_sp1.1 from training. Duration: 0.92725 2023-10-04 16:53:06,980 WARNING [train.py:1204] (2/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_188-171673-0_sp1.1 from training. Duration: 0.52725 2023-10-04 16:53:07,049 WARNING [train.py:1204] (2/4) Exclude cut with ID _47790_210_16187_1_1533862814066_4207519_2-39653-0 from training. Duration: 0.98 2023-10-04 16:53:13,303 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_23850_210_4238_1_1532232597_1632433_112-229012-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 16:53:13,552 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.feed_forward3.hidden_balancer.prob, batch_count=1725613.3333333333, ans=0.125 2023-10-04 16:53:14,627 WARNING [train.py:1204] (2/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_626-79528-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 16:53:15,207 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.5.encoder.layers.1.self_attn1.whiten, num_groups=1, num_channels=256, metric=12.86 vs. limit=22.5 2023-10-04 16:53:16,030 WARNING [train.py:1204] (2/4) Exclude cut with ID _30304_210_6319_1_1532476827261_7582060_856-90052-0_sp1.1 from training. Duration: 0.963625 2023-10-04 16:53:17,352 WARNING [train.py:1204] (2/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_122-109023-0_sp0.9 from training. Duration: 0.8444375 2023-10-04 16:53:20,181 WARNING [train.py:1204] (2/4) Exclude cut with ID _64414_210_17301_1_1535243252048_3757759_37-70328-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 16:53:21,529 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_20120_210_6753_1_1531643035_5482859_622-344907-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 16:53:21,585 WARNING [train.py:1204] (2/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_410-321956-0_sp1.1 from training. Duration: 0.92725 2023-10-04 16:53:21,600 WARNING [train.py:1204] (2/4) Exclude cut with ID _7562_210_2084_1_1528887767916_3946229_186-326422-0_sp0.9 from training. Duration: 0.9333125 2023-10-04 16:53:21,846 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.conv_skip_rate, batch_count=1725680.0, ans=0.0 2023-10-04 16:53:23,027 WARNING [train.py:1204] (2/4) Exclude cut with ID _37537_210_2253_1_1533171758568_3421949_127-285192-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 16:53:24,436 WARNING [train.py:1204] (2/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_723-228168-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 16:53:24,503 WARNING [train.py:1204] (2/4) Exclude cut with ID _13678_210_7497_1_1530530069226_3463170_426-246984-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 16:53:24,518 WARNING [train.py:1204] (2/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_399-156791-0_sp0.9 from training. Duration: 0.92225 2023-10-04 16:53:25,884 WARNING [train.py:1204] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_391-22148-0 from training. Duration: 0.77 2023-10-04 16:53:26,572 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_3475_210_2028_1_1526193578_3622569_64-297072-0 from training. Duration: 0.91 2023-10-04 16:53:26,639 WARNING [train.py:1204] (2/4) Exclude cut with ID _17875_210_2087_1_1531224012907_1821339_225-280015-0_sp1.1 from training. Duration: 0.963625 2023-10-04 16:53:27,883 WARNING [train.py:1204] (2/4) Exclude cut with ID _50655_210_18628_1_1534229942193_3772154_325-246169-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 16:53:30,635 WARNING [train.py:1204] (2/4) Exclude cut with ID _29529_210_7234_1_1532393961455_3977460_597-257348-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 16:53:30,694 WARNING [train.py:1204] (2/4) Exclude cut with ID _58895_210_8750_1_1535184081320_3649089_77-173485-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 16:53:30,716 WARNING [train.py:1204] (2/4) Exclude cut with ID _6270_210_2084_1_1527850911573_3856420_26-277343-0 from training. Duration: 0.82 2023-10-04 16:53:35,380 WARNING [train.py:1204] (2/4) Exclude cut with ID _14368_210_4220_1_1530532714870_3595529_144-3287-0 from training. Duration: 0.67 2023-10-04 16:53:35,494 WARNING [train.py:1204] (2/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_293-145278-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 16:53:38,688 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_40047_210_2290_1_1533290519_2374311_257-335162-0 from training. Duration: 0.95 2023-10-04 16:53:38,973 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.bypass.skip_rate, batch_count=1725746.6666666667, ans=0.07 2023-10-04 16:53:40,152 WARNING [train.py:1204] (2/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_619-344499-0_sp0.9 from training. Duration: 0.7 2023-10-04 16:53:44,398 WARNING [train.py:1204] (2/4) Exclude cut with ID _19504_210_9994_1_1531648863242_3325220_456-315379-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 16:53:45,690 WARNING [train.py:1204] (2/4) Exclude cut with ID _55857_210_2346_1_1534644007831_6898209_631-221692-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 16:53:49,933 WARNING [train.py:1204] (2/4) Exclude cut with ID _64631_210_27767_1_1535349580068_4165160_324-3682-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 16:53:50,411 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.5.encoder.layers.0.self_attn2.whiten, num_groups=1, num_channels=256, metric=11.96 vs. limit=22.5 2023-10-04 16:53:51,084 WARNING [train.py:1204] (2/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_317-211107-0 from training. Duration: 0.92 2023-10-04 16:53:53,888 WARNING [train.py:1204] (2/4) Exclude cut with ID _6789_210_1534_1_1527857937218_3118950_311-61262-0 from training. Duration: 0.65 2023-10-04 16:53:55,348 WARNING [train.py:1204] (2/4) Exclude cut with ID _28323_210_16187_1_1532480395667_4100300_365-68882-0_sp1.1 from training. Duration: 0.92725 2023-10-04 16:53:55,377 WARNING [train.py:1204] (2/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_816-181825-0_sp1.1 from training. Duration: 0.92725 2023-10-04 16:53:56,392 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.self_attn_weights.pos_emb_skip_rate, batch_count=1725813.3333333333, ans=0.0 2023-10-04 16:53:57,194 WARNING [train.py:1204] (2/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_132-269633-0_sp1.1 from training. Duration: 0.6181875 2023-10-04 16:53:57,212 WARNING [train.py:1204] (2/4) Exclude cut with ID _44585_210_18628_1_1533605350387_4518199_280-73534-0_sp1.1 from training. Duration: 0.8090625 2023-10-04 16:53:59,063 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_68095_210_20185_1_1535718226_8888855_1009-164755-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 16:53:59,142 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_40223_210_6228_1_1533298404_4812267_400-103813-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 16:53:59,143 WARNING [train.py:1204] (2/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_491-261923-0_sp1.1 from training. Duration: 0.8909375 2023-10-04 16:53:59,148 WARNING [train.py:1204] (2/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_712-342563-0 from training. Duration: 0.97 2023-10-04 16:54:00,586 WARNING [train.py:1204] (2/4) Exclude cut with ID _41161_210_20034_1_1533431249657_3555314_175-190032-0_sp1.1 from training. Duration: 0.963625 2023-10-04 16:54:02,030 WARNING [train.py:1204] (2/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_788-298907-0 from training. Duration: 0.89 2023-10-04 16:54:02,058 WARNING [train.py:1204] (2/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_225-70961-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 16:54:02,065 WARNING [train.py:1204] (2/4) Exclude cut with ID _58583_210_5236_1_1534726835266_3574249_235-175994-0_sp1.1 from training. Duration: 0.92725 2023-10-04 16:54:03,469 WARNING [train.py:1204] (2/4) Exclude cut with ID _12302_210_5219_1_1529924446466_8107280_907-182949-0_sp1.1 from training. Duration: 0.836375 2023-10-04 16:54:05,410 INFO [train.py:1046] (2/4) Epoch 49, batch 3900, loss[loss=0.1612, simple_loss=0.2406, pruned_loss=0.04087, over 23981.00 frames. ], tot_loss[loss=0.1513, simple_loss=0.2315, pruned_loss=0.03553, over 4691600.26 frames. ], batch size: 80, lr: 2.07e-03, grad_scale: 16.0 2023-10-04 16:54:05,474 WARNING [train.py:1204] (2/4) Exclude cut with ID _54871_210_22356_1_1534665798253_3856359_94-193692-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 16:54:06,796 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_4528_210_3273_1_1527076654_3276777_342-168068-0_sp0.9 from training. Duration: 0.9666875 2023-10-04 16:54:06,852 WARNING [train.py:1204] (2/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_368-74239-0_sp1.1 from training. Duration: 0.92725 2023-10-04 16:54:06,860 WARNING [train.py:1204] (2/4) Exclude cut with ID _44831_210_13427_1_1533816011549_3689035_291-295928-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 16:54:08,219 WARNING [train.py:1204] (2/4) Exclude cut with ID _56406_210_9288_1_1534899618664_3496149_455-131997-0_sp1.1 from training. Duration: 0.97275 2023-10-04 16:54:08,226 WARNING [train.py:1204] (2/4) Exclude cut with ID _6473_210_1741_1_1527901307396_3815380_108-17335-0 from training. Duration: 0.65 2023-10-04 16:54:08,267 WARNING [train.py:1204] (2/4) Exclude cut with ID _14224_210_4381_1_1530529062037_4264010_286-135600-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 16:54:11,037 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_14056_210_4163_1_1530604483_7572827_928-321675-0_sp1.1 from training. Duration: 0.936375 2023-10-04 16:54:12,426 WARNING [train.py:1204] (2/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_81-300212-0_sp1.1 from training. Duration: 0.5454375 2023-10-04 16:54:13,734 WARNING [train.py:1204] (2/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_29-280622-0_sp0.9 from training. Duration: 0.911125 2023-10-04 16:54:13,817 WARNING [train.py:1204] (2/4) Exclude cut with ID _61079_210_4525_1_1534921008198_4011419_228-176447-0_sp1.1 from training. Duration: 0.936375 2023-10-04 16:54:15,234 WARNING [train.py:1204] (2/4) Exclude cut with ID _8394_210_6702_1_1528590504316_3540189_372-198878-0_sp1.1 from training. Duration: 0.5454375 2023-10-04 16:54:15,247 WARNING [train.py:1204] (2/4) Exclude cut with ID _64521_210_14699_1_1535243427259_3815328_244-208725-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 16:54:15,441 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.feed_forward3.hidden_balancer.prob, batch_count=1725880.0, ans=0.125 2023-10-04 16:54:16,648 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8275_210_4198_1_1528455320_3892571_491-17707-0_sp0.9 from training. Duration: 0.711125 2023-10-04 16:54:16,766 WARNING [train.py:1204] (2/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_375-245677-0 from training. Duration: 0.81 2023-10-04 16:54:16,773 WARNING [train.py:1204] (2/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_690-286055-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 16:54:19,404 WARNING [train.py:1204] (2/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_89-321955-0 from training. Duration: 0.8 2023-10-04 16:54:19,435 WARNING [train.py:1204] (2/4) Exclude cut with ID _18895_210_6228_1_1531357274234_3784930_213-98721-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 16:54:20,672 WARNING [train.py:1204] (2/4) Exclude cut with ID _19708_210_9206_1_1532136650733_4238364_347-65240-0 from training. Duration: 0.97 2023-10-04 16:54:23,187 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.722e+02 2.321e+02 2.754e+02 3.506e+02 6.937e+02, threshold=5.508e+02, percent-clipped=5.0 2023-10-04 16:54:23,291 WARNING [train.py:1204] (2/4) Exclude cut with ID _64135_210_2253_1_1535677267876_7608972_498-5039-0 from training. Duration: 0.78 2023-10-04 16:54:27,290 WARNING [train.py:1204] (2/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_146-276944-0_sp1.1 from training. Duration: 0.7545625 2023-10-04 16:54:28,821 WARNING [train.py:1204] (2/4) Exclude cut with ID _63171_210_15488_1_1535104792794_3295110_155-34488-0_sp1.1 from training. Duration: 0.97275 2023-10-04 16:54:28,841 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_5438_210_1794_1_1527336687_2600009_233-58072-0_sp0.9 from training. Duration: 0.8555625 2023-10-04 16:54:30,200 WARNING [train.py:1204] (2/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_63-101996-0_sp0.9 from training. Duration: 0.97775 2023-10-04 16:54:33,522 WARNING [train.py:1204] (2/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_271-269084-0_sp1.1 from training. Duration: 0.7545625 2023-10-04 16:54:36,213 WARNING [train.py:1204] (2/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_345-167986-0_sp1.1 from training. Duration: 0.763625 2023-10-04 16:54:37,651 WARNING [train.py:1204] (2/4) Exclude cut with ID _39290_210_4220_1_1533862778760_3075118_92-311600-0_sp1.1 from training. Duration: 0.8 2023-10-04 16:54:37,658 WARNING [train.py:1204] (2/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_209-65306-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 16:54:37,748 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_26433_210_12108_1_1532157158_4056480_317-335807-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 16:54:43,237 WARNING [train.py:1204] (2/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_238-94127-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 16:54:43,384 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.conv_skip_rate, batch_count=1726013.3333333333, ans=0.0 2023-10-04 16:54:44,584 WARNING [train.py:1204] (2/4) Exclude cut with ID _41314_210_12651_1_1533459476543_3766460_207-132038-0_sp1.1 from training. Duration: 0.8545625 2023-10-04 16:54:51,628 WARNING [train.py:1204] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_167-137255-0_sp1.1 from training. Duration: 0.5818125 2023-10-04 16:54:53,007 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_2009_210_2071_1_1525481134_4588937_385-302100-0_sp0.9 from training. Duration: 0.9666875 2023-10-04 16:55:02,363 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_15517_210_6753_1_1530954008_3487336_253-110617-0_sp1.1 from training. Duration: 0.936375 2023-10-04 16:55:06,825 WARNING [train.py:1204] (2/4) Exclude cut with ID _39290_210_4220_1_1533862778760_3075118_92-311600-0_sp0.9 from training. Duration: 0.97775 2023-10-04 16:55:06,873 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_42-142473-0 from training. Duration: 0.72 2023-10-04 16:55:08,187 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_2296_210_2087_1_1525503448_3397335_57-185430-0 from training. Duration: 0.6 2023-10-04 16:55:08,207 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_11305_210_1794_1_1529750428_7178148_257-312870-0_sp0.9 from training. Duration: 0.97775 2023-10-04 16:55:09,603 WARNING [train.py:1204] (2/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_222-198127-0 from training. Duration: 0.84 2023-10-04 16:55:10,951 WARNING [train.py:1204] (2/4) Exclude cut with ID _65263_210_22356_1_1535763598691_3921037_423-202699-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 16:55:12,239 WARNING [train.py:1204] (2/4) Exclude cut with ID _15037_210_2253_1_1530752294104_3267650_13-130361-0 from training. Duration: 0.99 2023-10-04 16:55:16,527 WARNING [train.py:1204] (2/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_628-198080-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 16:55:17,733 INFO [train.py:1046] (2/4) Epoch 49, batch 3950, loss[loss=0.1595, simple_loss=0.2347, pruned_loss=0.0421, over 23803.00 frames. ], tot_loss[loss=0.1511, simple_loss=0.2318, pruned_loss=0.0352, over 4708009.75 frames. ], batch size: 164, lr: 2.06e-03, grad_scale: 16.0 2023-10-04 16:55:17,836 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_3171_210_2087_1_1526109084_2330007_37-64334-0 from training. Duration: 0.82 2023-10-04 16:55:17,887 WARNING [train.py:1204] (2/4) Exclude cut with ID _5287_210_2084_1_1527418883040_3878670_12-221186-0_sp1.1 from training. Duration: 0.8909375 2023-10-04 16:55:20,584 WARNING [train.py:1204] (2/4) Exclude cut with ID _6653_210_2629_1_1527919172441_6732679_1103-56445-0_sp1.1 from training. Duration: 0.663625 2023-10-04 16:55:23,156 WARNING [train.py:1204] (2/4) Exclude cut with ID _65263_210_22356_1_1535763598691_3921037_169-140068-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 16:55:29,275 WARNING [train.py:1204] (2/4) Exclude cut with ID IC0671W0118-331961 from training. Duration: 0.896 2023-10-04 16:55:30,661 WARNING [train.py:1204] (2/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_402-102311-0_sp1.1 from training. Duration: 0.6090625 2023-10-04 16:55:31,337 WARNING [train.py:1204] (2/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_202-293523-0 from training. Duration: 0.72 2023-10-04 16:55:31,520 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.feed_forward2.hidden_balancer.prob, batch_count=1726280.0, ans=0.125 2023-10-04 16:55:32,433 WARNING [train.py:1204] (2/4) Exclude cut with ID IC0679W0038-335846 from training. Duration: 0.768 2023-10-04 16:55:32,472 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_37357_210_17024_1_1533089504_2316121_79-198866-0_sp1.1 from training. Duration: 0.936375 2023-10-04 16:55:36,096 WARNING [train.py:1204] (2/4) Exclude cut with ID _7606_210_4915_1_1528894253277_5080690_292-271285-0_sp1.1 from training. Duration: 0.963625 2023-10-04 16:55:36,115 WARNING [train.py:1204] (2/4) Exclude cut with ID _3541_210_2477_1_1526475408709_3263973_377-296839-0_sp0.9 from training. Duration: 0.611125 2023-10-04 16:55:36,120 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_45886_210_17024_1_1533704917_2435333_58-131069-0_sp1.1 from training. Duration: 0.963625 2023-10-04 16:55:40,274 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6961_210_2087_1_1527937168_6815011_395-185060-0 from training. Duration: 0.95 2023-10-04 16:55:41,698 WARNING [train.py:1204] (2/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_304-266479-0_sp1.1 from training. Duration: 0.97275 2023-10-04 16:55:42,996 WARNING [train.py:1204] (2/4) Exclude cut with ID _3867_210_1883_1_1526641200575_4475830_319-349850-0_sp1.1 from training. Duration: 0.6090625 2023-10-04 16:55:43,012 WARNING [train.py:1204] (2/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_304-59227-0_sp0.9 from training. Duration: 0.8555625 2023-10-04 16:55:43,210 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.ff2_skip_rate, batch_count=1726280.0, ans=0.0 2023-10-04 16:55:44,340 WARNING [train.py:1204] (2/4) Exclude cut with ID _8021_210_7994_1_1528376451358_3653069_100-140231-0_sp0.9 from training. Duration: 0.7444375 2023-10-04 16:55:44,396 WARNING [train.py:1204] (2/4) Exclude cut with ID _8833_210_5219_1_1528592665210_7553059_329-125156-0_sp0.9 from training. Duration: 0.911125 2023-10-04 16:55:46,065 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.conv_module2.balancer2.prob, batch_count=1726346.6666666667, ans=0.125 2023-10-04 16:55:55,583 WARNING [train.py:1204] (2/4) Exclude cut with ID _14459_210_2228_1_1531443641946_7516700_792-287234-0_sp1.1 from training. Duration: 0.92725 2023-10-04 16:55:55,599 WARNING [train.py:1204] (2/4) Exclude cut with ID _19708_210_9206_1_1532136650733_4238364_347-65240-0_sp1.1 from training. Duration: 0.8818125 2023-10-04 16:56:01,240 WARNING [train.py:1204] (2/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_70-27732-0 from training. Duration: 0.94 2023-10-04 16:56:05,925 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_397-132565-0 from training. Duration: 0.99 2023-10-04 16:56:05,928 WARNING [train.py:1204] (2/4) Exclude cut with ID _49728_210_12651_1_1534041034294_6788150_624-195301-0 from training. Duration: 0.85 2023-10-04 16:56:06,787 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.2.encoder.layers.0.nonlin_attention.whiten2, num_groups=1, num_channels=384, metric=7.05 vs. limit=15.0 2023-10-04 16:56:07,235 WARNING [train.py:1204] (2/4) Exclude cut with ID _55429_210_10626_1_1534467588527_7174143_377-169391-0_sp1.1 from training. Duration: 0.87275 2023-10-04 16:56:08,641 WARNING [train.py:1204] (2/4) Exclude cut with ID _49376_210_5986_1_1533985278732_3370740_354-344695-0_sp1.1 from training. Duration: 0.9 2023-10-04 16:56:14,859 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.conv_module2.balancer1.min_positive, batch_count=1726413.3333333333, ans=0.025 2023-10-04 16:56:16,069 WARNING [train.py:1204] (2/4) Exclude cut with ID _4697_210_2069_1_1526815801953_3823098_276-96593-0_sp1.1 from training. Duration: 0.7 2023-10-04 16:56:16,077 WARNING [train.py:1204] (2/4) Exclude cut with ID _15012_210_1968_1_1530856820429_5095350_688-178849-0_sp0.9 from training. Duration: 0.711125 2023-10-04 16:56:17,361 WARNING [train.py:1204] (2/4) Exclude cut with ID _25565_210_12392_1_1532423009382_3952098_372-69023-0_sp1.1 from training. Duration: 0.936375 2023-10-04 16:56:17,394 WARNING [train.py:1204] (2/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_61-327062-0_sp1.1 from training. Duration: 0.736375 2023-10-04 16:56:18,665 WARNING [train.py:1204] (2/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_215-141059-0 from training. Duration: 0.62 2023-10-04 16:56:22,278 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.1.encoder.layers.0.conv_module2.whiten, num_groups=1, num_channels=256, metric=10.11 vs. limit=15.0 2023-10-04 16:56:22,893 WARNING [train.py:1204] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_6-226750-0_sp1.1 from training. Duration: 0.8545625 2023-10-04 16:56:24,388 WARNING [train.py:1204] (2/4) Exclude cut with ID _14348_210_8341_1_1530599434996_3778640_240-232370-0_sp1.1 from training. Duration: 0.763625 2023-10-04 16:56:27,187 WARNING [train.py:1204] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_468-189058-0 from training. Duration: 0.96 2023-10-04 16:56:28,822 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.ff2_skip_rate, batch_count=1726480.0, ans=0.0 2023-10-04 16:56:31,256 INFO [train.py:1046] (2/4) Epoch 49, batch 4000, loss[loss=0.1656, simple_loss=0.252, pruned_loss=0.0396, over 24393.00 frames. ], tot_loss[loss=0.1518, simple_loss=0.2327, pruned_loss=0.03543, over 4720639.95 frames. ], batch size: 77, lr: 2.06e-03, grad_scale: 16.0 2023-10-04 16:56:36,454 WARNING [train.py:1204] (2/4) Exclude cut with ID _21987_210_5854_1_1531821500482_3637038_247-93340-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 16:56:38,098 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.feed_forward1.out_proj.dropout_p, batch_count=1726546.6666666667, ans=0.1 2023-10-04 16:56:43,910 WARNING [train.py:1204] (2/4) Exclude cut with ID _66868_210_9527_1_1535509287954_4561140_436-191602-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 16:56:46,767 WARNING [train.py:1204] (2/4) Exclude cut with ID _18895_210_6228_1_1531357274234_3784930_394-58203-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 16:56:48,124 WARNING [train.py:1204] (2/4) Exclude cut with ID _58605_210_12210_1_1534723195939_3904259_258-281498-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 16:56:48,166 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_33348_210_5632_1_1533086912_3324030_245-292752-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 16:56:48,185 WARNING [train.py:1204] (2/4) Exclude cut with ID _67831_210_12392_1_1535612464202_3092612_346-2148-0 from training. Duration: 0.93 2023-10-04 16:56:49,567 WARNING [train.py:1204] (2/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_369-222060-0_sp0.9 from training. Duration: 0.788875 2023-10-04 16:56:49,632 WARNING [train.py:1204] (2/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_245-174553-0 from training. Duration: 0.88 2023-10-04 16:56:50,838 WARNING [train.py:1204] (2/4) Exclude cut with ID _3541_210_2477_1_1526475408709_3263973_231-104736-0_sp1.1 from training. Duration: 0.7090625 2023-10-04 16:56:50,844 WARNING [train.py:1204] (2/4) Exclude cut with ID _44585_210_18628_1_1533605350387_4518199_280-73534-0 from training. Duration: 0.89 2023-10-04 16:56:52,173 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.724e+02 1.984e+02 2.160e+02 2.381e+02 3.286e+02, threshold=4.320e+02, percent-clipped=0.0 2023-10-04 16:56:52,308 WARNING [train.py:1204] (2/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_22-201476-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 16:56:52,390 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder_embed.conv.5.prob, batch_count=1726613.3333333333, ans=0.125 2023-10-04 16:56:55,111 WARNING [train.py:1204] (2/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_325-62802-0_sp0.9 from training. Duration: 0.9666875 2023-10-04 16:56:55,124 WARNING [train.py:1204] (2/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_199-274062-0_sp1.1 from training. Duration: 0.97275 2023-10-04 16:56:55,127 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_8-319957-0_sp1.1 from training. Duration: 0.87275 2023-10-04 16:56:56,467 WARNING [train.py:1204] (2/4) Exclude cut with ID _29343_210_4381_1_1532602757752_3674109_224-349726-0_sp1.1 from training. Duration: 0.936375 2023-10-04 16:56:56,472 WARNING [train.py:1204] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_520-126729-0_sp1.1 from training. Duration: 0.636375 2023-10-04 16:56:58,033 WARNING [train.py:1204] (2/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_239-22076-0_sp1.1 from training. Duration: 0.77275 2023-10-04 16:57:00,724 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9012W0011-501146 from training. Duration: 0.896 2023-10-04 16:57:00,798 WARNING [train.py:1204] (2/4) Exclude cut with ID _29857_210_20034_1_1532395886753_3798160_279-185801-0_sp1.1 from training. Duration: 0.82725 2023-10-04 16:57:00,847 WARNING [train.py:1204] (2/4) Exclude cut with ID _43552_210_8913_1_1533726031059_3523429_261-223933-0_sp1.1 from training. Duration: 0.92725 2023-10-04 16:57:03,997 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9029W0025-509426 from training. Duration: 0.896 2023-10-04 16:57:05,389 WARNING [train.py:1204] (2/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_337-181502-0_sp1.1 from training. Duration: 0.5909375 2023-10-04 16:57:05,403 WARNING [train.py:1204] (2/4) Exclude cut with ID _68635_210_9317_1_1535770813410_4315870_159-332599-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 16:57:13,678 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_161-276996-0 from training. Duration: 0.95 2023-10-04 16:57:13,719 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7088_210_1874_1_1527939776_1101113_127-68870-0_sp1.1 from training. Duration: 0.936375 2023-10-04 16:57:16,547 WARNING [train.py:1204] (2/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_322-217220-0_sp1.1 from training. Duration: 0.7818125 2023-10-04 16:57:17,877 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9072W0002-530636 from training. Duration: 0.768 2023-10-04 16:57:19,272 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_340-336408-0_sp1.1 from training. Duration: 0.8454375 2023-10-04 16:57:19,338 WARNING [train.py:1204] (2/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_395-37222-0 from training. Duration: 0.76 2023-10-04 16:57:19,341 WARNING [train.py:1204] (2/4) Exclude cut with ID _64099_210_17301_1_1535526977656_1578188_159-209156-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 16:57:20,573 WARNING [train.py:1204] (2/4) Exclude cut with ID _53950_210_5816_1_1534417264919_3862951_28-29681-0_sp1.1 from training. Duration: 0.92725 2023-10-04 16:57:21,917 WARNING [train.py:1204] (2/4) Exclude cut with ID _4681_210_3525_1_1527250689464_120889_15-126760-0_sp1.1 from training. Duration: 0.736375 2023-10-04 16:57:23,521 WARNING [train.py:1204] (2/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_242-3472-0_sp1.1 from training. Duration: 0.663625 2023-10-04 16:57:24,792 WARNING [train.py:1204] (2/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_436-186311-0_sp0.9 from training. Duration: 0.72225 2023-10-04 16:57:24,820 WARNING [train.py:1204] (2/4) Exclude cut with ID _3675_210_4557_1_1526639934408_4182089_452-172147-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 16:57:27,485 WARNING [train.py:1204] (2/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_80-147301-0 from training. Duration: 0.84 2023-10-04 16:57:27,510 WARNING [train.py:1204] (2/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_351-107588-0_sp1.1 from training. Duration: 0.92725 2023-10-04 16:57:28,999 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9111W0002-550781 from training. Duration: 0.896 2023-10-04 16:57:31,897 WARNING [train.py:1204] (2/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_465-156500-0_sp1.1 from training. Duration: 0.6545625 2023-10-04 16:57:32,484 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.5.encoder.layers.0.whiten, num_groups=1, num_channels=256, metric=3.45 vs. limit=12.0 2023-10-04 16:57:35,324 WARNING [train.py:1204] (2/4) Exclude cut with ID _13938_210_9988_1_1530518718978_3693960_304-112784-0_sp1.1 from training. Duration: 0.4181875 2023-10-04 16:57:36,613 WARNING [train.py:1204] (2/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_202-67017-0_sp1.1 from training. Duration: 0.7454375 2023-10-04 16:57:37,895 WARNING [train.py:1204] (2/4) Exclude cut with ID _58414_210_8254_1_1535421609162_7151182_448-4358-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 16:57:37,970 WARNING [train.py:1204] (2/4) Exclude cut with ID _66840_210_5219_1_1535452168426_4196349_203-243234-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 16:57:39,731 WARNING [train.py:1204] (2/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_201-213513-0_sp1.1 from training. Duration: 0.97275 2023-10-04 16:57:44,414 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6291_210_3273_1_1527683649_1078498_117-187560-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 16:57:45,683 INFO [train.py:1046] (2/4) Epoch 49, batch 4050, loss[loss=0.1537, simple_loss=0.2463, pruned_loss=0.03054, over 24658.00 frames. ], tot_loss[loss=0.1523, simple_loss=0.2335, pruned_loss=0.03549, over 4723758.58 frames. ], batch size: 73, lr: 2.06e-03, grad_scale: 8.0 2023-10-04 16:57:47,147 WARNING [train.py:1204] (2/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_135-174080-0_sp0.9 from training. Duration: 0.588875 2023-10-04 16:57:47,188 WARNING [train.py:1204] (2/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_45-265684-0 from training. Duration: 0.72 2023-10-04 16:57:47,416 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.feed_forward2.hidden_balancer.prob, batch_count=1726880.0, ans=0.125 2023-10-04 16:57:48,512 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_435-313305-0_sp1.1 from training. Duration: 0.6454375 2023-10-04 16:57:48,545 WARNING [train.py:1204] (2/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_235-307337-0_sp1.1 from training. Duration: 0.8545625 2023-10-04 16:57:49,931 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7762_210_1794_1_1528199508_4057408_419-125009-0_sp0.9 from training. Duration: 0.92225 2023-10-04 16:57:51,705 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.5.encoder.layers.1.nonlin_attention.whiten2, num_groups=1, num_channels=256, metric=9.70 vs. limit=15.0 2023-10-04 16:57:52,609 WARNING [train.py:1204] (2/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_215-141059-0_sp1.1 from training. Duration: 0.563625 2023-10-04 16:57:54,003 WARNING [train.py:1204] (2/4) Exclude cut with ID _27013_210_1389_1_1532437173240_3252649_222-125573-0_sp1.1 from training. Duration: 0.8909375 2023-10-04 16:57:56,845 WARNING [train.py:1204] (2/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_126-274694-0_sp1.1 from training. Duration: 0.8909375 2023-10-04 16:57:57,431 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.4.encoder.layers.1.self_attn2.whiten, num_groups=1, num_channels=384, metric=11.59 vs. limit=22.5 2023-10-04 16:57:59,797 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.balancer2.prob, batch_count=1726946.6666666667, ans=0.125 2023-10-04 16:58:00,890 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_2592_210_2290_1_1525697025_4882925_157-70512-0_sp1.1 from training. Duration: 0.82725 2023-10-04 16:58:00,931 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_292-265928-0_sp0.9 from training. Duration: 0.5666875 2023-10-04 16:58:01,665 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.3.encoder.layers.0.conv_module2.whiten, num_groups=1, num_channels=512, metric=3.29 vs. limit=15.0 2023-10-04 16:58:02,372 WARNING [train.py:1204] (2/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_1211-226066-0_sp1.1 from training. Duration: 0.6909375 2023-10-04 16:58:03,787 WARNING [train.py:1204] (2/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_394-265747-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 16:58:07,077 WARNING [train.py:1204] (2/4) Exclude cut with ID _48782_210_12658_1_1534467701544_2300422_239-67139-0_sp1.1 from training. Duration: 0.97275 2023-10-04 16:58:09,786 WARNING [train.py:1204] (2/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_329-113173-0_sp1.1 from training. Duration: 0.563625 2023-10-04 16:58:14,188 WARNING [train.py:1204] (2/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_81-75849-0_sp0.9 from training. Duration: 0.4333125 2023-10-04 16:58:14,938 WARNING [train.py:1204] (2/4) Exclude cut with ID _9235_210_5219_1_1528891154253_8046120_1003-101638-0 from training. Duration: 0.73 2023-10-04 16:58:14,960 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9309W0189-640316 from training. Duration: 0.64 2023-10-04 16:58:15,243 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.ff2_skip_rate, batch_count=1727013.3333333333, ans=0.0 2023-10-04 16:58:17,683 WARNING [train.py:1204] (2/4) Exclude cut with ID _45329_210_7005_1_1534157951211_6774339_503-287883-0_sp1.1 from training. Duration: 0.8 2023-10-04 16:58:23,354 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.ff2_skip_rate, batch_count=1727013.3333333333, ans=0.0 2023-10-04 16:58:24,431 WARNING [train.py:1204] (2/4) Exclude cut with ID _8833_210_5219_1_1528592665210_7553059_611-80313-0 from training. Duration: 0.64 2023-10-04 16:58:24,506 WARNING [train.py:1204] (2/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_335-106626-0_sp1.1 from training. Duration: 0.963625 2023-10-04 16:58:27,345 WARNING [train.py:1204] (2/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_237-44332-0_sp1.1 from training. Duration: 0.8545625 2023-10-04 16:58:30,105 WARNING [train.py:1204] (2/4) Exclude cut with ID _46952_210_5986_1_1534230010357_3107379_178-147341-0_sp1.1 from training. Duration: 0.97275 2023-10-04 16:58:30,146 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_13091_210_7925_1_1530609447_8987354_946-154426-0_sp1.1 from training. Duration: 0.936375 2023-10-04 16:58:30,172 WARNING [train.py:1204] (2/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_326-17061-0_sp1.1 from training. Duration: 0.8545625 2023-10-04 16:58:30,411 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=1727080.0, ans=0.1 2023-10-04 16:58:34,635 WARNING [train.py:1204] (2/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_38-169920-0_sp1.1 from training. Duration: 0.82725 2023-10-04 16:58:37,406 WARNING [train.py:1204] (2/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_283-220372-0 from training. Duration: 0.56 2023-10-04 16:58:37,413 WARNING [train.py:1204] (2/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_252-216674-0_sp1.1 from training. Duration: 0.4909375 2023-10-04 16:58:38,841 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_202-91290-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 16:58:40,189 WARNING [train.py:1204] (2/4) Exclude cut with ID _41314_210_12651_1_1533459476543_3766460_80-74184-0 from training. Duration: 0.83 2023-10-04 16:58:46,495 WARNING [train.py:1204] (2/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_411-271358-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 16:58:52,677 WARNING [train.py:1204] (2/4) Exclude cut with ID _68777_210_4353_1_1535810390731_3117904_129-166012-0 from training. Duration: 0.85 2023-10-04 16:58:54,053 WARNING [train.py:1204] (2/4) Exclude cut with ID _44982_210_1511_1_1534312820108_3726029_410-109759-0_sp1.1 from training. Duration: 0.963625 2023-10-04 16:58:54,058 WARNING [train.py:1204] (2/4) Exclude cut with ID _14224_210_4381_1_1530529062037_4264010_364-22817-0_sp1.1 from training. Duration: 0.6454375 2023-10-04 16:58:56,314 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.2.encoder.layers.0.conv_module1.whiten, num_groups=1, num_channels=384, metric=5.84 vs. limit=15.0 2023-10-04 16:58:56,833 WARNING [train.py:1204] (2/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_89-242335-0 from training. Duration: 0.95 2023-10-04 16:58:56,841 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_12868_210_6753_1_1530702479_3610564_151-325626-0 from training. Duration: 0.97 2023-10-04 16:58:56,842 WARNING [train.py:1204] (2/4) Exclude cut with ID _6275_210_2084_1_1528110062213_7669049_786-264332-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 16:58:58,162 INFO [train.py:1046] (2/4) Epoch 49, batch 4100, loss[loss=0.1521, simple_loss=0.2421, pruned_loss=0.03109, over 24646.00 frames. ], tot_loss[loss=0.1529, simple_loss=0.2346, pruned_loss=0.03565, over 4733673.79 frames. ], batch size: 65, lr: 2.06e-03, grad_scale: 8.0 2023-10-04 16:58:59,590 WARNING [train.py:1204] (2/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_35-153886-0_sp1.1 from training. Duration: 0.763625 2023-10-04 16:59:01,050 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_24007_210_1794_1_1531918925_1663875_116-100916-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 16:59:01,065 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_162-262717-0_sp1.1 from training. Duration: 0.7545625 2023-10-04 16:59:03,185 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.3.encoder.layers.3.self_attn_weights.whiten_keys, num_groups=8, num_channels=256, metric=2.29 vs. limit=6.0 2023-10-04 16:59:04,637 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.self_attn_weights.pos_emb_skip_rate, batch_count=1727213.3333333333, ans=0.0 2023-10-04 16:59:07,177 WARNING [train.py:1204] (2/4) Exclude cut with ID _8263_210_1664_1_1528438953494_294100_40-204731-0 from training. Duration: 0.5 2023-10-04 16:59:08,567 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_12868_210_6753_1_1530702479_3610564_416-293746-0 from training. Duration: 0.9 2023-10-04 16:59:09,948 WARNING [train.py:1204] (2/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_605-219732-0 from training. Duration: 0.94 2023-10-04 16:59:10,033 WARNING [train.py:1204] (2/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_454-123002-0 from training. Duration: 0.69 2023-10-04 16:59:10,039 WARNING [train.py:1204] (2/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_25-304273-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 16:59:11,404 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6860_210_4852_1_1527917457_3833327_451-39702-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 16:59:11,437 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_304-16505-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 16:59:11,450 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6291_210_3273_1_1527683649_1078498_4-266348-0_sp1.1 from training. Duration: 0.7090625 2023-10-04 16:59:12,831 WARNING [train.py:1204] (2/4) Exclude cut with ID ID0149W0001-753701 from training. Duration: 0.989 2023-10-04 16:59:17,301 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_41251_210_15368_1_1533429767_4820695_394-55393-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 16:59:17,373 WARNING [train.py:1204] (2/4) Exclude cut with ID _8793_210_6310_1_1528542985139_2310009_121-179782-0_sp1.1 from training. Duration: 0.72725 2023-10-04 16:59:17,386 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_2729_210_3528_1_1525863505_3882214_421-73047-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 16:59:17,447 WARNING [train.py:1204] (2/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_278-154492-0_sp0.9 from training. Duration: 0.8444375 2023-10-04 16:59:20,690 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.809e+02 2.160e+02 2.515e+02 2.947e+02 5.173e+02, threshold=5.031e+02, percent-clipped=2.0 2023-10-04 16:59:23,565 WARNING [train.py:1204] (2/4) Exclude cut with ID _14170_210_5219_1_1530525949868_4469650_412-317077-0_sp0.9 from training. Duration: 0.8333125 2023-10-04 16:59:23,852 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.feed_forward3.hidden_balancer.prob, batch_count=1727280.0, ans=0.125 2023-10-04 16:59:24,986 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_727-138317-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 16:59:25,033 WARNING [train.py:1204] (2/4) Exclude cut with ID _25808_210_13292_1_1532343514562_7366109_345-335048-0_sp1.1 from training. Duration: 0.92725 2023-10-04 16:59:26,239 WARNING [train.py:1204] (2/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_506-23200-0 from training. Duration: 0.97 2023-10-04 16:59:26,319 WARNING [train.py:1204] (2/4) Exclude cut with ID _44718_210_5816_1_1534050051869_6469030_643-245187-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 16:59:26,329 WARNING [train.py:1204] (2/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_42-186162-0_sp1.1 from training. Duration: 0.836375 2023-10-04 16:59:26,343 WARNING [train.py:1204] (2/4) Exclude cut with ID _3231_210_2106_1_1526205864934_6784220_171-256004-0_sp1.1 from training. Duration: 0.7818125 2023-10-04 16:59:27,693 WARNING [train.py:1204] (2/4) Exclude cut with ID _25914_210_6400_1_1532141317058_644740_43-248478-0_sp1.1 from training. Duration: 0.936375 2023-10-04 16:59:27,740 WARNING [train.py:1204] (2/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_577-287150-0 from training. Duration: 0.82 2023-10-04 16:59:30,587 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_46015_210_7234_1_1533863319_3087837_376-276068-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 16:59:30,841 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.self_attn_weights.pos_emb_skip_rate, batch_count=1727346.6666666667, ans=0.0 2023-10-04 16:59:31,932 WARNING [train.py:1204] (2/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_51-200220-0 from training. Duration: 0.56 2023-10-04 16:59:33,326 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_452-96101-0_sp1.1 from training. Duration: 0.963625 2023-10-04 16:59:34,662 WARNING [train.py:1204] (2/4) Exclude cut with ID _13252_210_1925_1_1530671372047_3336909_263-168388-0_sp1.1 from training. Duration: 0.7818125 2023-10-04 16:59:34,664 WARNING [train.py:1204] (2/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_469-94673-0 from training. Duration: 0.79 2023-10-04 16:59:34,766 WARNING [train.py:1204] (2/4) Exclude cut with ID _14922_210_6228_1_1530707435273_3693310_303-228738-0_sp1.1 from training. Duration: 0.763625 2023-10-04 16:59:36,749 WARNING [train.py:1204] (2/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_262-250826-0_sp1.1 from training. Duration: 0.9 2023-10-04 16:59:36,897 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.bypass_mid.scale_min, batch_count=1727346.6666666667, ans=0.2 2023-10-04 16:59:38,058 WARNING [train.py:1204] (2/4) Exclude cut with ID _68777_210_4353_1_1535810390731_3117904_129-166012-0_sp1.1 from training. Duration: 0.77275 2023-10-04 16:59:39,426 WARNING [train.py:1204] (2/4) Exclude cut with ID _58895_210_8750_1_1535184081320_3649089_265-323810-0 from training. Duration: 0.96 2023-10-04 16:59:40,770 WARNING [train.py:1204] (2/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_120-76843-0_sp0.9 from training. Duration: 0.611125 2023-10-04 16:59:40,828 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7544_210_6753_1_1528587722_3576686_295-258479-0_sp0.9 from training. Duration: 0.9333125 2023-10-04 16:59:42,384 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7046_210_6753_1_1527895337_6920917_783-278109-0 from training. Duration: 0.77 2023-10-04 16:59:42,437 WARNING [train.py:1204] (2/4) Exclude cut with ID _42326_210_7770_1_1533814282531_3707269_113-308323-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 16:59:43,781 WARNING [train.py:1204] (2/4) Exclude cut with ID _7852_210_5219_1_1528286471316_8139400_242-105336-0_sp0.9 from training. Duration: 0.8 2023-10-04 16:59:45,693 WARNING [train.py:1204] (2/4) Exclude cut with ID _56235_210_10640_1_1534557335235_4161099_114-277905-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 16:59:50,403 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_13118_210_7925_1_1530755612_7608297_42-66683-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 16:59:52,086 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.feed_forward1.hidden_balancer.prob, batch_count=1727413.3333333333, ans=0.125 2023-10-04 16:59:53,248 WARNING [train.py:1204] (2/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_901-79438-0_sp1.1 from training. Duration: 0.8545625 2023-10-04 16:59:54,539 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_30280_210_1881_1_1532872779_3660139_211-182194-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 16:59:56,289 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.5.encoder.layers.1.feed_forward2.out_whiten, num_groups=1, num_channels=256, metric=10.75 vs. limit=15.0 2023-10-04 16:59:58,775 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=1727480.0, ans=0.1 2023-10-04 17:00:01,427 WARNING [train.py:1204] (2/4) Exclude cut with ID _14898_210_9206_1_1530766812377_4177190_621-45732-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 17:00:01,434 WARNING [train.py:1204] (2/4) Exclude cut with ID _35350_210_16040_1_1533040191183_4075299_224-133775-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 17:00:05,974 WARNING [train.py:1204] (2/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_147-293311-0_sp1.1 from training. Duration: 0.8545625 2023-10-04 17:00:06,140 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.ff2_skip_rate, batch_count=1727480.0, ans=0.0 2023-10-04 17:00:08,742 WARNING [train.py:1204] (2/4) Exclude cut with ID _12302_210_5219_1_1529924446466_8107280_835-279754-0_sp1.1 from training. Duration: 0.8181875 2023-10-04 17:00:11,544 INFO [train.py:1046] (2/4) Epoch 49, batch 4150, loss[loss=0.1492, simple_loss=0.2214, pruned_loss=0.03848, over 23635.00 frames. ], tot_loss[loss=0.1537, simple_loss=0.2349, pruned_loss=0.03624, over 4722781.22 frames. ], batch size: 256, lr: 2.06e-03, grad_scale: 8.0 2023-10-04 17:00:11,664 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8453_210_4211_1_1528502940_8562730_347-330673-0_sp0.9 from training. Duration: 0.8 2023-10-04 17:00:13,504 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_213-103282-0_sp0.9 from training. Duration: 0.8666875 2023-10-04 17:00:14,828 WARNING [train.py:1204] (2/4) Exclude cut with ID _49728_210_12651_1_1534041034294_6788150_66-43496-0_sp1.1 from training. Duration: 0.97275 2023-10-04 17:00:14,830 WARNING [train.py:1204] (2/4) Exclude cut with ID _19023_210_4353_1_1531378892552_5820780_28-36615-0_sp1.1 from training. Duration: 0.8818125 2023-10-04 17:00:18,160 WARNING [train.py:1204] (2/4) Exclude cut with ID _14349_210_8341_1_1530615710818_3442470_115-285975-0 from training. Duration: 0.86 2023-10-04 17:00:19,486 WARNING [train.py:1204] (2/4) Exclude cut with ID _12734_210_8341_1_1530183614296_3891929_236-47464-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 17:00:19,536 WARNING [train.py:1204] (2/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_177-236809-0 from training. Duration: 0.49 2023-10-04 17:00:19,606 WARNING [train.py:1204] (2/4) Exclude cut with ID _13678_210_7497_1_1530530069226_3463170_371-3221-0 from training. Duration: 0.9 2023-10-04 17:00:19,620 WARNING [train.py:1204] (2/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_48-338976-0 from training. Duration: 0.89 2023-10-04 17:00:21,446 WARNING [train.py:1204] (2/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_1167-309744-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 17:00:25,807 WARNING [train.py:1204] (2/4) Exclude cut with ID _30327_210_16049_1_1532484161295_3924169_401-70819-0_sp0.9 from training. Duration: 0.9555625 2023-10-04 17:00:25,815 WARNING [train.py:1204] (2/4) Exclude cut with ID _55134_210_21982_1_1534590028377_3882170_346-91022-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 17:00:29,866 WARNING [train.py:1204] (2/4) Exclude cut with ID _14459_210_2228_1_1531443641946_7516700_174-118801-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 17:00:31,159 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_269-278006-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 17:00:32,462 WARNING [train.py:1204] (2/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_390-339137-0_sp0.9 from training. Duration: 0.62225 2023-10-04 17:00:33,863 WARNING [train.py:1204] (2/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_253-308377-0_sp1.1 from training. Duration: 0.4454375 2023-10-04 17:00:33,875 WARNING [train.py:1204] (2/4) Exclude cut with ID _23825_210_13449_1_1531915313526_3497150_240-271367-0_sp1.1 from training. Duration: 0.8818125 2023-10-04 17:00:33,963 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1015-343884-0_sp0.9 from training. Duration: 0.688875 2023-10-04 17:00:38,731 WARNING [train.py:1204] (2/4) Exclude cut with ID _23514_210_1389_1_1531832139785_3031439_335-48210-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 17:00:43,408 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_309-65177-0_sp1.1 from training. Duration: 0.8 2023-10-04 17:00:43,479 WARNING [train.py:1204] (2/4) Exclude cut with ID _14368_210_4220_1_1530532714870_3595529_231-137892-0 from training. Duration: 0.99 2023-10-04 17:00:43,686 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.out_combiner.scale_min, batch_count=1727680.0, ans=0.2 2023-10-04 17:00:46,255 WARNING [train.py:1204] (2/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_57-270037-0 from training. Duration: 0.77 2023-10-04 17:00:46,260 WARNING [train.py:1204] (2/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_465-214726-0_sp1.1 from training. Duration: 0.7818125 2023-10-04 17:00:47,721 WARNING [train.py:1204] (2/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_106-300985-0 from training. Duration: 0.9 2023-10-04 17:00:47,726 WARNING [train.py:1204] (2/4) Exclude cut with ID _14632_210_9988_1_1530691279392_3923200_161-129530-0_sp1.1 from training. Duration: 0.87275 2023-10-04 17:00:47,738 WARNING [train.py:1204] (2/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_514-88370-0_sp1.1 from training. Duration: 0.963625 2023-10-04 17:00:47,955 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.1.balancer1.prob, batch_count=1727680.0, ans=0.125 2023-10-04 17:00:49,737 WARNING [train.py:1204] (2/4) Exclude cut with ID _56495_210_4234_1_1534748080334_4136219_305-334505-0_sp1.1 from training. Duration: 0.92725 2023-10-04 17:00:51,128 WARNING [train.py:1204] (2/4) Exclude cut with ID _42875_210_14856_1_1533704397487_4012433_61-264914-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 17:00:55,405 WARNING [train.py:1204] (2/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_91-148831-0 from training. Duration: 0.99 2023-10-04 17:00:58,248 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_435-313305-0_sp0.9 from training. Duration: 0.788875 2023-10-04 17:00:59,748 WARNING [train.py:1204] (2/4) Exclude cut with ID _16744_210_5986_1_1531724405384_3907567_242-111316-0_sp1.1 from training. Duration: 0.8454375 2023-10-04 17:01:01,146 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_257-20733-0 from training. Duration: 0.76 2023-10-04 17:01:01,192 WARNING [train.py:1204] (2/4) Exclude cut with ID _8064_210_5399_1_1528372299291_4282973_4-91037-0_sp1.1 from training. Duration: 0.8 2023-10-04 17:01:02,533 WARNING [train.py:1204] (2/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_3-276701-0 from training. Duration: 0.91 2023-10-04 17:01:03,864 WARNING [train.py:1204] (2/4) Exclude cut with ID _3992_210_1747_1_1526644816031_3731060_36-147855-0_sp1.1 from training. Duration: 0.7090625 2023-10-04 17:01:05,666 WARNING [train.py:1204] (2/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_1296-298671-0_sp1.1 from training. Duration: 0.963625 2023-10-04 17:01:07,090 WARNING [train.py:1204] (2/4) Exclude cut with ID _68965_210_1883_1_1535698962272_3497490_309-334593-0_sp1.1 from training. Duration: 0.92725 2023-10-04 17:01:07,196 WARNING [train.py:1204] (2/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_128-296581-0 from training. Duration: 0.74 2023-10-04 17:01:07,196 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_17613_210_2151_1_1531137569_2366160_170-299079-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 17:01:07,198 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6287_210_3273_1_1527509506_2653716_182-199887-0_sp1.1 from training. Duration: 0.463625 2023-10-04 17:01:09,066 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.4.encoder.layers.1.conv_module2.whiten, num_groups=1, num_channels=384, metric=2.70 vs. limit=15.0 2023-10-04 17:01:09,906 WARNING [train.py:1204] (2/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_80-135200-0_sp1.1 from training. Duration: 0.7181875 2023-10-04 17:01:12,665 WARNING [train.py:1204] (2/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_34-183685-0 from training. Duration: 0.98 2023-10-04 17:01:12,681 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8278_210_2550_1_1528637099_4220413_418-210155-0_sp1.1 from training. Duration: 0.92725 2023-10-04 17:01:12,685 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8853_210_3273_1_1528633899_818968_51-187685-0_sp0.9 from training. Duration: 0.7666875 2023-10-04 17:01:12,699 WARNING [train.py:1204] (2/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_332-221057-0_sp1.1 from training. Duration: 0.6181875 2023-10-04 17:01:14,624 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_2221_210_1988_1_1525597051_4935541_415-296882-0 from training. Duration: 0.77 2023-10-04 17:01:14,654 WARNING [train.py:1204] (2/4) Exclude cut with ID _33640_210_2655_1_1533117605479_4855469_770-278185-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 17:01:14,681 WARNING [train.py:1204] (2/4) Exclude cut with ID _5287_210_2084_1_1527418883040_3878670_333-48508-0_sp1.1 from training. Duration: 0.5454375 2023-10-04 17:01:14,738 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_43802_210_2290_1_1533516203_8829504_142-276839-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 17:01:18,021 WARNING [train.py:1204] (2/4) Exclude cut with ID _28394_210_8913_1_1532430049518_3650118_126-139371-0_sp1.1 from training. Duration: 0.92725 2023-10-04 17:01:18,044 WARNING [train.py:1204] (2/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_76-203867-0 from training. Duration: 0.72 2023-10-04 17:01:18,089 WARNING [train.py:1204] (2/4) Exclude cut with ID _6194_210_1534_1_1527511826160_3941194_14-282876-0_sp0.9 from training. Duration: 0.788875 2023-10-04 17:01:24,122 WARNING [train.py:1204] (2/4) Exclude cut with ID _35355_210_16040_1_1533212988977_4016229_74-190076-0_sp1.1 from training. Duration: 0.72725 2023-10-04 17:01:24,442 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.ff2_skip_rate, batch_count=1727813.3333333333, ans=0.0 2023-10-04 17:01:25,587 WARNING [train.py:1204] (2/4) Exclude cut with ID _33920_210_4220_1_1533257851500_3084399_198-334063-0 from training. Duration: 0.94 2023-10-04 17:01:26,911 INFO [train.py:1046] (2/4) Epoch 49, batch 4200, loss[loss=0.1279, simple_loss=0.1975, pruned_loss=0.02909, over 23587.00 frames. ], tot_loss[loss=0.153, simple_loss=0.2339, pruned_loss=0.03607, over 4723040.84 frames. ], batch size: 256, lr: 2.06e-03, grad_scale: 8.0 2023-10-04 17:01:27,009 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6291_210_3273_1_1527683649_1078498_4-266348-0_sp0.9 from training. Duration: 0.8666875 2023-10-04 17:01:29,827 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_23856_210_4238_1_1531994785_3087934_122-38075-0_sp1.1 from training. Duration: 0.936375 2023-10-04 17:01:31,181 WARNING [train.py:1204] (2/4) Exclude cut with ID _4697_210_2069_1_1526815801953_3823098_276-96593-0_sp0.9 from training. Duration: 0.8555625 2023-10-04 17:01:31,236 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_308-278734-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 17:01:31,238 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_5190_210_4557_1_1527245078_2965646_353-250603-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 17:01:32,679 WARNING [train.py:1204] (2/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_472-216663-0 from training. Duration: 0.92 2023-10-04 17:01:37,045 WARNING [train.py:1204] (2/4) Exclude cut with ID _7537_210_2084_1_1528502395431_3924359_14-312906-0 from training. Duration: 0.84 2023-10-04 17:01:38,287 WARNING [train.py:1204] (2/4) Exclude cut with ID _29003_210_19596_1_1532507391720_3894098_559-236660-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 17:01:39,799 WARNING [train.py:1204] (2/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_312-204798-0_sp1.1 from training. Duration: 0.8454375 2023-10-04 17:01:40,108 INFO [scaling.py:1118] (2/4) WithLoss: name=encoder.encoders.3.encoder.layers.1.self_attn_weights, loss-sum=0.000e+00 2023-10-04 17:01:41,487 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.feed_forward1.out_proj.dropout_p, batch_count=1727946.6666666667, ans=0.1 2023-10-04 17:01:42,575 WARNING [train.py:1204] (2/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_17-309070-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 17:01:44,054 WARNING [train.py:1204] (2/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_344-152526-0_sp0.9 from training. Duration: 0.588875 2023-10-04 17:01:45,819 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_41452_210_7291_1_1533437687_3941281_405-11559-0_sp1.1 from training. Duration: 0.82725 2023-10-04 17:01:45,855 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_48730_210_5399_1_1533885886_2212769_157-124052-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 17:01:47,212 WARNING [train.py:1204] (2/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_415-78792-0 from training. Duration: 0.81 2023-10-04 17:01:47,216 WARNING [train.py:1204] (2/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_345-340690-0_sp1.1 from training. Duration: 0.8454375 2023-10-04 17:01:47,952 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.2.encoder.layers.0.self_attn2.whiten, num_groups=1, num_channels=384, metric=16.49 vs. limit=22.5 2023-10-04 17:01:49,211 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.800e+02 2.163e+02 2.357e+02 2.679e+02 4.452e+02, threshold=4.714e+02, percent-clipped=0.0 2023-10-04 17:01:49,342 WARNING [train.py:1204] (2/4) Exclude cut with ID _23825_210_13449_1_1531915313526_3497150_124-8138-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 17:01:50,675 WARNING [train.py:1204] (2/4) Exclude cut with ID _49258_210_7994_1_1534122017863_3738861_194-24864-0_sp1.1 from training. Duration: 0.936375 2023-10-04 17:01:50,692 WARNING [train.py:1204] (2/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_369-222060-0_sp1.1 from training. Duration: 0.6454375 2023-10-04 17:01:51,989 WARNING [train.py:1204] (2/4) Exclude cut with ID _12369_210_4381_1_1530356078581_4388520_480-13146-0_sp0.9 from training. Duration: 0.9444375 2023-10-04 17:01:53,743 WARNING [train.py:1204] (2/4) Exclude cut with ID _13253_210_1925_1_1530757775819_3577873_191-144977-0 from training. Duration: 0.78 2023-10-04 17:01:54,983 WARNING [train.py:1204] (2/4) Exclude cut with ID _30304_210_6319_1_1532476827261_7582060_1128-140266-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 17:01:59,160 WARNING [train.py:1204] (2/4) Exclude cut with ID _8772_210_1850_1_1528635595972_3664011_247-336448-0_sp0.9 from training. Duration: 0.82225 2023-10-04 17:01:59,241 WARNING [train.py:1204] (2/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_318-59097-0_sp1.1 from training. Duration: 0.6909375 2023-10-04 17:02:01,989 WARNING [train.py:1204] (2/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_540-131164-0_sp1.1 from training. Duration: 0.836375 2023-10-04 17:02:04,708 WARNING [train.py:1204] (2/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_39-110836-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 17:02:06,120 WARNING [train.py:1204] (2/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_860-96198-0_sp1.1 from training. Duration: 0.663625 2023-10-04 17:02:06,128 WARNING [train.py:1204] (2/4) Exclude cut with ID _15133_210_6228_1_1530794512074_3080379_29-264458-0 from training. Duration: 0.58 2023-10-04 17:02:06,158 WARNING [train.py:1204] (2/4) Exclude cut with ID _55209_210_1531_1_1534675828996_4597160_797-348343-0_sp1.1 from training. Duration: 0.963625 2023-10-04 17:02:07,664 WARNING [train.py:1204] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_23-189234-0_sp1.1 from training. Duration: 0.7545625 2023-10-04 17:02:12,216 WARNING [train.py:1204] (2/4) Exclude cut with ID _5410_210_1388_1_1527397055795_1719020_39-112221-0_sp1.1 from training. Duration: 0.62725 2023-10-04 17:02:12,497 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.feed_forward1.out_proj.dropout_p, batch_count=1728080.0, ans=0.1 2023-10-04 17:02:13,657 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_43802_210_2290_1_1533516203_8829504_100-348441-0_sp1.1 from training. Duration: 0.82725 2023-10-04 17:02:18,444 WARNING [train.py:1204] (2/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_143-86344-0_sp1.1 from training. Duration: 0.7 2023-10-04 17:02:18,605 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.conv_module1.balancer1.min_positive, batch_count=1728080.0, ans=0.025 2023-10-04 17:02:22,810 WARNING [train.py:1204] (2/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_126-29104-0 from training. Duration: 0.85 2023-10-04 17:02:24,238 WARNING [train.py:1204] (2/4) Exclude cut with ID _39846_210_12651_1_1533603651992_1318030_138-31771-0_sp1.1 from training. Duration: 0.963625 2023-10-04 17:02:26,370 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.2.encoder.layers.2.whiten, num_groups=1, num_channels=384, metric=3.70 vs. limit=12.0 2023-10-04 17:02:28,589 WARNING [train.py:1204] (2/4) Exclude cut with ID _2220_210_1819_1_1525509109898_3658680_50-99837-0_sp1.1 from training. Duration: 0.4909375 2023-10-04 17:02:28,709 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.1.ff3_skip_rate, batch_count=1728146.6666666667, ans=0.0 2023-10-04 17:02:29,961 WARNING [train.py:1204] (2/4) Exclude cut with ID _6274_210_2084_1_1527937583837_7494020_836-210550-0_sp1.1 from training. Duration: 0.97275 2023-10-04 17:02:31,508 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.conv_skip_rate, batch_count=1728146.6666666667, ans=0.0 2023-10-04 17:02:32,594 WARNING [train.py:1204] (2/4) Exclude cut with ID _28323_210_16187_1_1532480395667_4100300_306-74561-0 from training. Duration: 0.94 2023-10-04 17:02:36,767 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_177-58005-0_sp1.1 from training. Duration: 0.563625 2023-10-04 17:02:40,191 INFO [train.py:1046] (2/4) Epoch 49, batch 4250, loss[loss=0.16, simple_loss=0.2502, pruned_loss=0.03493, over 24621.00 frames. ], tot_loss[loss=0.1524, simple_loss=0.2331, pruned_loss=0.03578, over 4727399.18 frames. ], batch size: 68, lr: 2.06e-03, grad_scale: 8.0 2023-10-04 17:02:40,281 WARNING [train.py:1204] (2/4) Exclude cut with ID _54871_210_22356_1_1534665798253_3856359_19-342632-0_sp1.1 from training. Duration: 0.82725 2023-10-04 17:02:40,305 WARNING [train.py:1204] (2/4) Exclude cut with ID _35355_210_16040_1_1533212988977_4016229_74-190076-0_sp0.9 from training. Duration: 0.888875 2023-10-04 17:02:42,881 WARNING [train.py:1204] (2/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_285-97786-0_sp1.1 from training. Duration: 0.97275 2023-10-04 17:02:48,172 WARNING [train.py:1204] (2/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_337-181502-0_sp0.9 from training. Duration: 0.72225 2023-10-04 17:02:48,214 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_353-182855-0 from training. Duration: 0.96 2023-10-04 17:02:48,248 WARNING [train.py:1204] (2/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_409-327730-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 17:02:51,779 WARNING [train.py:1204] (2/4) Exclude cut with ID _8030_210_7497_1_1528444886781_3531089_373-19403-0_sp1.1 from training. Duration: 0.97275 2023-10-04 17:02:53,460 INFO [scaling.py:1118] (2/4) WithLoss: name=encoder.encoders.3.encoder.layers.3.self_attn_weights, loss-sum=0.000e+00 2023-10-04 17:02:54,543 WARNING [train.py:1204] (2/4) Exclude cut with ID _6789_210_1534_1_1527857937218_3118950_231-340237-0_sp1.1 from training. Duration: 0.936375 2023-10-04 17:02:55,190 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.3.encoder.layers.3.self_attn_weights.whiten_keys, num_groups=8, num_channels=256, metric=2.28 vs. limit=6.0 2023-10-04 17:02:58,674 WARNING [train.py:1204] (2/4) Exclude cut with ID _6493_210_1747_1_1528459210424_4012470_536-57832-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 17:02:58,687 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7046_210_6753_1_1527895337_6920917_756-116002-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 17:03:01,417 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_37152_210_9697_1_1533106573_3844188_29-239070-0_sp1.1 from training. Duration: 0.8909375 2023-10-04 17:03:01,420 WARNING [train.py:1204] (2/4) Exclude cut with ID _19023_210_4353_1_1531378892552_5820780_61-334539-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 17:03:01,612 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.1.bypass_mid.scale_min, batch_count=1728280.0, ans=0.2 2023-10-04 17:03:01,657 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.conv_skip_rate, batch_count=1728280.0, ans=0.0 2023-10-04 17:03:02,947 WARNING [train.py:1204] (2/4) Exclude cut with ID _14170_210_5219_1_1530525949868_4469650_159-28488-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 17:03:04,281 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_40896_210_15710_1_1533433956_7100802_764-71272-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 17:03:05,649 WARNING [train.py:1204] (2/4) Exclude cut with ID _44902_210_16187_1_1533888006062_3792078_258-178274-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 17:03:10,200 WARNING [train.py:1204] (2/4) Exclude cut with ID _7537_210_2084_1_1528502395431_3924359_14-312906-0_sp1.1 from training. Duration: 0.763625 2023-10-04 17:03:11,604 WARNING [train.py:1204] (2/4) Exclude cut with ID _49643_210_7989_1_1534640244224_3939031_674-116639-0_sp1.1 from training. Duration: 0.963625 2023-10-04 17:03:13,042 WARNING [train.py:1204] (2/4) Exclude cut with ID _22826_210_9994_1_1531908201522_3279169_415-161-0 from training. Duration: 0.98 2023-10-04 17:03:16,109 WARNING [train.py:1204] (2/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_264-348896-0 from training. Duration: 0.85 2023-10-04 17:03:16,116 WARNING [train.py:1204] (2/4) Exclude cut with ID _51899_210_20034_1_1534129254466_3669220_233-161492-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 17:03:16,176 WARNING [train.py:1204] (2/4) Exclude cut with ID _31255_210_13427_1_1532865619495_3815143_233-200023-0_sp1.1 from training. Duration: 0.936375 2023-10-04 17:03:17,965 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_23850_210_4238_1_1532232597_1632433_100-229879-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 17:03:19,428 WARNING [train.py:1204] (2/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_375-245677-0_sp0.9 from training. Duration: 0.9 2023-10-04 17:03:19,431 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_40223_210_6228_1_1533298404_4812267_355-317415-0_sp1.1 from training. Duration: 0.97275 2023-10-04 17:03:19,467 WARNING [train.py:1204] (2/4) Exclude cut with ID _9679_210_7497_1_1529129212906_4363929_235-81488-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 17:03:24,393 WARNING [train.py:1204] (2/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_172-215932-0_sp1.1 from training. Duration: 0.6 2023-10-04 17:03:24,473 WARNING [train.py:1204] (2/4) Exclude cut with ID _12369_210_4381_1_1530356078581_4388520_64-298917-0_sp1.1 from training. Duration: 0.8 2023-10-04 17:03:28,770 WARNING [train.py:1204] (2/4) Exclude cut with ID _38020_210_5986_1_1533112252259_7006089_131-53533-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 17:03:30,113 WARNING [train.py:1204] (2/4) Exclude cut with ID _42334_210_7770_1_1534206610396_3605880_392-48154-0_sp1.1 from training. Duration: 0.963625 2023-10-04 17:03:30,176 WARNING [train.py:1204] (2/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_337-181502-0 from training. Duration: 0.65 2023-10-04 17:03:31,540 WARNING [train.py:1204] (2/4) Exclude cut with ID _6267_210_2084_1_1527764658526_3894730_375-219887-0_sp0.9 from training. Duration: 0.7444375 2023-10-04 17:03:31,632 WARNING [train.py:1204] (2/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_60-146189-0 from training. Duration: 0.98 2023-10-04 17:03:34,240 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_3133_210_2087_1_1526106446_2324690_243-143119-0_sp1.1 from training. Duration: 0.7 2023-10-04 17:03:35,543 WARNING [train.py:1204] (2/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_1267-272693-0_sp0.9 from training. Duration: 0.811125 2023-10-04 17:03:37,087 WARNING [train.py:1204] (2/4) Exclude cut with ID _69884_210_9771_1_1535846392818_4166169_83-182645-0_sp1.1 from training. Duration: 0.97275 2023-10-04 17:03:37,116 WARNING [train.py:1204] (2/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_403-292260-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 17:03:39,950 WARNING [train.py:1204] (2/4) Exclude cut with ID _55429_210_10626_1_1534467588527_7174143_377-169391-0 from training. Duration: 0.96 2023-10-04 17:03:41,963 WARNING [train.py:1204] (2/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_494-199538-0_sp1.1 from training. Duration: 0.6818125 2023-10-04 17:03:42,036 WARNING [train.py:1204] (2/4) Exclude cut with ID _3067_210_2667_1_1526212974640_4503168_119-175776-0_sp1.1 from training. Duration: 0.67275 2023-10-04 17:03:46,159 WARNING [train.py:1204] (2/4) Exclude cut with ID _3231_210_2106_1_1526205864934_6784220_183-303839-0_sp1.1 from training. Duration: 0.97275 2023-10-04 17:03:48,886 WARNING [train.py:1204] (2/4) Exclude cut with ID _18513_210_9206_1_1531618215223_3862620_165-38958-0_sp1.1 from training. Duration: 0.963625 2023-10-04 17:03:48,974 WARNING [train.py:1204] (2/4) Exclude cut with ID _41314_210_12651_1_1533459476543_3766460_87-272068-0_sp1.1 from training. Duration: 0.7545625 2023-10-04 17:03:52,867 WARNING [train.py:1204] (2/4) Exclude cut with ID _3541_210_2477_1_1526475408709_3263973_353-95-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 17:03:54,348 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_2009_210_2071_1_1525481134_4588937_73-302813-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 17:03:54,462 WARNING [train.py:1204] (2/4) Exclude cut with ID _64220_210_9527_1_1535250151772_4875259_88-95011-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 17:03:54,510 WARNING [train.py:1204] (2/4) Exclude cut with ID _44902_210_16187_1_1533888006062_3792078_160-12107-0_sp1.1 from training. Duration: 0.92725 2023-10-04 17:03:54,516 WARNING [train.py:1204] (2/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_547-5424-0 from training. Duration: 0.94 2023-10-04 17:03:55,780 INFO [train.py:1046] (2/4) Epoch 49, batch 4300, loss[loss=0.1349, simple_loss=0.2118, pruned_loss=0.02897, over 23608.00 frames. ], tot_loss[loss=0.152, simple_loss=0.2328, pruned_loss=0.03562, over 4716840.02 frames. ], batch size: 149, lr: 2.06e-03, grad_scale: 8.0 2023-10-04 17:03:57,748 WARNING [train.py:1204] (2/4) Exclude cut with ID _21783_210_11357_1_1531742458730_3762679_268-85656-0_sp1.1 from training. Duration: 0.936375 2023-10-04 17:04:02,047 WARNING [train.py:1204] (2/4) Exclude cut with ID _66210_210_12651_1_1535446812922_3365850_42-284985-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 17:04:02,078 WARNING [train.py:1204] (2/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_465-214726-0_sp0.9 from training. Duration: 0.9555625 2023-10-04 17:04:04,970 WARNING [train.py:1204] (2/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_50-95770-0_sp1.1 from training. Duration: 0.936375 2023-10-04 17:04:12,301 WARNING [train.py:1204] (2/4) Exclude cut with ID _30800_210_22382_1_1532499347904_4515959_269-77567-0_sp1.1 from training. Duration: 0.963625 2023-10-04 17:04:12,303 WARNING [train.py:1204] (2/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_7-111954-0 from training. Duration: 0.56 2023-10-04 17:04:13,693 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_43802_210_2290_1_1533516203_8829504_85-195240-0_sp0.9 from training. Duration: 0.9666875 2023-10-04 17:04:15,156 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_2822_210_2715_1_1526112586_1999587_165-36656-0_sp0.9 from training. Duration: 0.988875 2023-10-04 17:04:16,464 WARNING [train.py:1204] (2/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_181-78632-0_sp1.1 from training. Duration: 0.7090625 2023-10-04 17:04:16,476 WARNING [train.py:1204] (2/4) Exclude cut with ID IC0671W0044-331887 from training. Duration: 0.896 2023-10-04 17:04:16,736 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.feed_forward1.hidden_balancer.prob, batch_count=1728613.3333333333, ans=0.125 2023-10-04 17:04:17,833 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.775e+02 2.072e+02 2.260e+02 2.526e+02 3.398e+02, threshold=4.520e+02, percent-clipped=0.0 2023-10-04 17:04:17,986 WARNING [train.py:1204] (2/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_266-137104-0_sp1.1 from training. Duration: 0.5909375 2023-10-04 17:04:19,812 WARNING [train.py:1204] (2/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_137-116442-0_sp0.9 from training. Duration: 0.8666875 2023-10-04 17:04:24,078 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_23801_210_16223_1_1532066392_3294657_570-5439-0 from training. Duration: 0.97 2023-10-04 17:04:24,089 WARNING [train.py:1204] (2/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_29-280622-0_sp1.1 from training. Duration: 0.7454375 2023-10-04 17:04:24,117 WARNING [train.py:1204] (2/4) Exclude cut with ID 3557-8342-0013-54691-0 from training. Duration: 0.92 2023-10-04 17:04:26,897 WARNING [train.py:1204] (2/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_711-15688-0_sp1.1 from training. Duration: 0.3909375 2023-10-04 17:04:28,869 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_15126_210_6753_1_1530778677_2591185_120-277707-0_sp1.1 from training. Duration: 0.72725 2023-10-04 17:04:30,342 WARNING [train.py:1204] (2/4) Exclude cut with ID _14970_210_15709_1_1530775547361_1966965_8-169924-0_sp1.1 from training. Duration: 0.836375 2023-10-04 17:04:30,344 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_4692_210_3528_1_1526899755_4323254_107-44401-0_sp0.9 from training. Duration: 0.9555625 2023-10-04 17:04:31,775 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_3475_210_2028_1_1526193578_3622569_265-276625-0_sp0.9 from training. Duration: 0.8444375 2023-10-04 17:04:31,969 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.conv_module1.balancer2.prob, batch_count=1728680.0, ans=0.125 2023-10-04 17:04:33,187 WARNING [train.py:1204] (2/4) Exclude cut with ID _55153_210_2253_1_1534384786677_3162096_148-79022-0_sp1.1 from training. Duration: 0.87275 2023-10-04 17:04:33,282 WARNING [train.py:1204] (2/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_292-36336-0_sp1.1 from training. Duration: 0.92725 2023-10-04 17:04:34,606 WARNING [train.py:1204] (2/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_329-82235-0 from training. Duration: 0.82 2023-10-04 17:04:35,273 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.3.encoder.layers.0.feed_forward3.out_whiten, num_groups=1, num_channels=512, metric=12.30 vs. limit=15.0 2023-10-04 17:04:35,909 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_11305_210_1794_1_1529750428_7178148_257-312870-0 from training. Duration: 0.88 2023-10-04 17:04:37,297 WARNING [train.py:1204] (2/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_436-4493-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 17:04:40,126 WARNING [train.py:1204] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_321-54129-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 17:04:40,133 WARNING [train.py:1204] (2/4) Exclude cut with ID _5287_210_2084_1_1527418883040_3878670_333-48508-0_sp0.9 from training. Duration: 0.6666875 2023-10-04 17:04:40,155 WARNING [train.py:1204] (2/4) Exclude cut with ID _26528_210_9070_1_1532570410842_3851310_244-278158-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 17:04:40,182 WARNING [train.py:1204] (2/4) Exclude cut with ID _46952_210_5986_1_1534230010357_3107379_232-344503-0_sp1.1 from training. Duration: 0.87275 2023-10-04 17:04:40,200 WARNING [train.py:1204] (2/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_43-206757-0 from training. Duration: 0.75 2023-10-04 17:04:40,201 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_4183_210_2087_1_1526729100_1869838_141-166600-0 from training. Duration: 0.95 2023-10-04 17:04:41,879 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_296-287678-0 from training. Duration: 0.95 2023-10-04 17:04:41,952 WARNING [train.py:1204] (2/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_265-60343-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 17:04:43,286 WARNING [train.py:1204] (2/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_332-256168-0 from training. Duration: 0.75 2023-10-04 17:04:43,331 WARNING [train.py:1204] (2/4) Exclude cut with ID _13679_210_7497_1_1530534293662_3355098_411-186088-0 from training. Duration: 0.94 2023-10-04 17:04:46,219 WARNING [train.py:1204] (2/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_458-126779-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 17:04:49,329 WARNING [train.py:1204] (2/4) Exclude cut with ID IC0803W0380-397572 from training. Duration: 0.032 2023-10-04 17:04:50,684 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_278-272612-0_sp1.1 from training. Duration: 0.663625 2023-10-04 17:04:52,689 WARNING [train.py:1204] (2/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_491-263101-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 17:04:52,703 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_53555_210_5632_1_1534595220_7232297_334-336778-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 17:04:54,066 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_50-3289-0 from training. Duration: 0.91 2023-10-04 17:04:54,129 WARNING [train.py:1204] (2/4) Exclude cut with ID _8699_210_2069_1_1528630346831_3960409_155-39766-0_sp0.9 from training. Duration: 0.8666875 2023-10-04 17:04:54,133 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_502-82145-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 17:04:55,607 WARNING [train.py:1204] (2/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_550-43132-0_sp1.1 from training. Duration: 0.97275 2023-10-04 17:04:56,962 WARNING [train.py:1204] (2/4) Exclude cut with ID _14998_210_5219_1_1530705582080_7474970_446-112117-0_sp1.1 from training. Duration: 0.8181875 2023-10-04 17:04:57,028 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_3691_210_3528_1_1526468169_4062053_355-334432-0_sp1.1 from training. Duration: 0.7909375 2023-10-04 17:04:59,718 WARNING [train.py:1204] (2/4) Exclude cut with ID _58605_210_12210_1_1534723195939_3904259_239-94720-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 17:05:03,035 WARNING [train.py:1204] (2/4) Exclude cut with ID _49376_210_5986_1_1533985278732_3370740_391-344072-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 17:05:04,417 WARNING [train.py:1204] (2/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_558-322014-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 17:05:04,456 WARNING [train.py:1204] (2/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_256-32662-0_sp1.1 from training. Duration: 0.8181875 2023-10-04 17:05:08,679 WARNING [train.py:1204] (2/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_572-185150-0 from training. Duration: 0.97 2023-10-04 17:05:08,716 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_89-35748-0_sp0.9 from training. Duration: 0.87775 2023-10-04 17:05:10,014 INFO [train.py:1046] (2/4) Epoch 49, batch 4350, loss[loss=0.1534, simple_loss=0.2467, pruned_loss=0.03008, over 24527.00 frames. ], tot_loss[loss=0.1526, simple_loss=0.2337, pruned_loss=0.03575, over 4718196.58 frames. ], batch size: 71, lr: 2.06e-03, grad_scale: 8.0 2023-10-04 17:05:14,613 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6291_210_3273_1_1527683649_1078498_88-66747-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 17:05:17,325 WARNING [train.py:1204] (2/4) Exclude cut with ID _19504_210_9994_1_1531648863242_3325220_357-237029-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 17:05:18,926 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.nonlin_attention.balancer.prob, batch_count=1728880.0, ans=0.125 2023-10-04 17:05:21,409 WARNING [train.py:1204] (2/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_302-2550-0_sp1.1 from training. Duration: 0.736375 2023-10-04 17:05:21,410 WARNING [train.py:1204] (2/4) Exclude cut with ID _9461_210_4557_1_1529060514726_3677451_59-194017-0_sp1.1 from training. Duration: 0.963625 2023-10-04 17:05:25,568 INFO [scaling.py:1118] (2/4) WithLoss: name=encoder.encoders.5.encoder.layers.1.self_attn_weights, loss-sum=0.000e+00 2023-10-04 17:05:26,669 WARNING [train.py:1204] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_665-196155-0_sp1.1 from training. Duration: 0.6090625 2023-10-04 17:05:29,612 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_326-315007-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 17:05:32,980 WARNING [train.py:1204] (2/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_105-328174-0_sp1.1 from training. Duration: 0.6909375 2023-10-04 17:05:34,309 WARNING [train.py:1204] (2/4) Exclude cut with ID _56494_210_4220_1_1535284837625_3033169_106-263722-0_sp1.1 from training. Duration: 0.97275 2023-10-04 17:05:37,036 WARNING [train.py:1204] (2/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_770-200430-0_sp0.9 from training. Duration: 0.988875 2023-10-04 17:05:37,936 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.2.encoder.layers.0.feed_forward3.out_whiten, num_groups=1, num_channels=384, metric=11.34 vs. limit=15.0 2023-10-04 17:05:38,530 WARNING [train.py:1204] (2/4) Exclude cut with ID _65640_210_6310_1_1535368201445_3173250_209-13008-0_sp1.1 from training. Duration: 0.9 2023-10-04 17:05:38,628 WARNING [train.py:1204] (2/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_212-149947-0_sp0.9 from training. Duration: 0.811125 2023-10-04 17:05:43,944 WARNING [train.py:1204] (2/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_188-171673-0 from training. Duration: 0.58 2023-10-04 17:05:45,275 WARNING [train.py:1204] (2/4) Exclude cut with ID _58895_210_8750_1_1535184081320_3649089_286-281724-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 17:05:45,341 WARNING [train.py:1204] (2/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_16-254909-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 17:05:50,097 WARNING [train.py:1204] (2/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_52-95655-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 17:05:52,120 WARNING [train.py:1204] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_554-45617-0 from training. Duration: 0.99 2023-10-04 17:05:55,218 WARNING [train.py:1204] (2/4) Exclude cut with ID _14683_210_5219_1_1530698352608_3974859_473-344611-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 17:05:56,601 WARNING [train.py:1204] (2/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_17-25045-0_sp0.9 from training. Duration: 0.6555625 2023-10-04 17:06:00,753 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9072W0003-530637 from training. Duration: 0.768 2023-10-04 17:06:02,124 WARNING [train.py:1204] (2/4) Exclude cut with ID _38670_210_15379_1_1533448810659_4391059_76-74025-0_sp1.1 from training. Duration: 0.936375 2023-10-04 17:06:03,822 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6291_210_3273_1_1527683649_1078498_74-292608-0_sp0.9 from training. Duration: 0.8 2023-10-04 17:06:03,891 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9084W0025-536862 from training. Duration: 0.896 2023-10-04 17:06:03,947 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9087W0021-538392 from training. Duration: 0.768 2023-10-04 17:06:03,961 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7543_210_6753_1_1528543323_645515_56-163234-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 17:06:05,308 WARNING [train.py:1204] (2/4) Exclude cut with ID _45560_210_21982_1_1533812603951_3584999_44-332643-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 17:06:05,389 WARNING [train.py:1204] (2/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_1285-256622-0_sp1.1 from training. Duration: 0.763625 2023-10-04 17:06:05,438 WARNING [train.py:1204] (2/4) Exclude cut with ID _52781_210_16187_1_1534297729099_1118149_141-325632-0_sp1.1 from training. Duration: 0.936375 2023-10-04 17:06:05,679 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=1729080.0, ans=0.1 2023-10-04 17:06:06,777 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_40896_210_15710_1_1533433956_7100802_1051-137631-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 17:06:07,944 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_13091_210_7925_1_1530609447_8987354_233-18333-0_sp1.1 from training. Duration: 0.97275 2023-10-04 17:06:10,783 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_34008_210_10431_1_1533345424_8183247_662-317331-0 from training. Duration: 0.94 2023-10-04 17:06:10,789 WARNING [train.py:1204] (2/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_291-248920-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 17:06:10,791 WARNING [train.py:1204] (2/4) Exclude cut with ID _65263_210_22356_1_1535763598691_3921037_274-288481-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 17:06:12,099 WARNING [train.py:1204] (2/4) Exclude cut with ID _5189_210_4557_1_1527248319428_3769229_233-168080-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 17:06:13,428 WARNING [train.py:1204] (2/4) Exclude cut with ID _54871_210_22356_1_1534665798253_3856359_135-184281-0 from training. Duration: 0.99 2023-10-04 17:06:16,120 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9123W0012-555972 from training. Duration: 0.896 2023-10-04 17:06:16,124 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9123W0118-556077 from training. Duration: 0.896 2023-10-04 17:06:16,133 WARNING [train.py:1204] (2/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_406-275649-0 from training. Duration: 0.42 2023-10-04 17:06:19,468 WARNING [train.py:1204] (2/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_309-13684-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 17:06:19,496 WARNING [train.py:1204] (2/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_129-337802-0_sp1.1 from training. Duration: 0.6545625 2023-10-04 17:06:20,880 WARNING [train.py:1204] (2/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_649-266825-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 17:06:20,949 WARNING [train.py:1204] (2/4) Exclude cut with ID _40519_210_13794_1_1533715228655_7337390_895-224742-0_sp1.1 from training. Duration: 0.92725 2023-10-04 17:06:23,581 INFO [train.py:1046] (2/4) Epoch 49, batch 4400, loss[loss=0.1412, simple_loss=0.232, pruned_loss=0.02519, over 24457.00 frames. ], tot_loss[loss=0.1529, simple_loss=0.2341, pruned_loss=0.03579, over 4714359.24 frames. ], batch size: 69, lr: 2.06e-03, grad_scale: 16.0 2023-10-04 17:06:24,351 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.3.encoder.layers.2.self_attn_weights.whiten_keys, num_groups=8, num_channels=256, metric=3.14 vs. limit=6.0 2023-10-04 17:06:25,010 WARNING [train.py:1204] (2/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_234-14174-0 from training. Duration: 0.82 2023-10-04 17:06:27,021 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9156W0009-572502 from training. Duration: 0.768 2023-10-04 17:06:27,030 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_424-245529-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 17:06:30,482 WARNING [train.py:1204] (2/4) Exclude cut with ID _26528_210_9070_1_1532570410842_3851310_232-10149-0_sp1.1 from training. Duration: 0.963625 2023-10-04 17:06:30,497 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_17292_210_6753_1_1531125388_2943734_152-30410-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 17:06:31,915 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_35441_210_16554_1_1533347572_4092680_291-82075-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 17:06:33,379 WARNING [train.py:1204] (2/4) Exclude cut with ID _12369_210_4381_1_1530356078581_4388520_64-298917-0 from training. Duration: 0.88 2023-10-04 17:06:33,400 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_5618_210_1874_1_1527311008_5851_1-214501-0_sp0.9 from training. Duration: 0.7655625 2023-10-04 17:06:34,668 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_9460_210_4211_1_1528954587_10049667_572-37035-0 from training. Duration: 0.8 2023-10-04 17:06:34,701 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9193W0023-590322 from training. Duration: 0.896 2023-10-04 17:06:35,985 WARNING [train.py:1204] (2/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_206-10097-0_sp1.1 from training. Duration: 0.4454375 2023-10-04 17:06:35,997 WARNING [train.py:1204] (2/4) Exclude cut with ID _55134_210_21982_1_1534590028377_3882170_17-51731-0_sp1.1 from training. Duration: 0.963625 2023-10-04 17:06:37,811 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_14953_210_2151_1_1530701710_5616165_710-165400-0 from training. Duration: 0.87 2023-10-04 17:06:40,632 WARNING [train.py:1204] (2/4) Exclude cut with ID _53203_210_2087_1_1534246218309_475590_61-85777-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 17:06:40,713 WARNING [train.py:1204] (2/4) Exclude cut with ID _55844_210_2346_1_1534471212569_7625707_1050-10453-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 17:06:40,722 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9218W0013-603297 from training. Duration: 0.896 2023-10-04 17:06:43,381 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_20120_210_6753_1_1531643035_5482859_267-102818-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 17:06:43,382 WARNING [train.py:1204] (2/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_191-41023-0 from training. Duration: 0.71 2023-10-04 17:06:44,835 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9231W0202-610227 from training. Duration: 0.896 2023-10-04 17:06:46,253 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.847e+02 2.222e+02 2.528e+02 3.076e+02 4.813e+02, threshold=5.055e+02, percent-clipped=1.0 2023-10-04 17:06:46,428 WARNING [train.py:1204] (2/4) Exclude cut with ID _6537_210_1883_1_1527987559710_3597129_382-133062-0 from training. Duration: 0.68 2023-10-04 17:06:46,459 WARNING [train.py:1204] (2/4) Exclude cut with ID _28427_210_4220_1_1532581240967_3061069_43-84928-0 from training. Duration: 0.99 2023-10-04 17:06:47,749 WARNING [train.py:1204] (2/4) Exclude cut with ID _59256_210_5986_1_1535076085997_3589120_121-237386-0 from training. Duration: 0.96 2023-10-04 17:06:47,772 WARNING [train.py:1204] (2/4) Exclude cut with ID _8480_210_1874_1_1528589391205_6728330_137-256986-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 17:06:47,995 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.conv_module2.balancer1.prob, batch_count=1729280.0, ans=0.125 2023-10-04 17:06:49,717 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_9417_210_1794_1_1529235261_4614131_479-346963-0_sp1.1 from training. Duration: 0.936375 2023-10-04 17:06:51,254 WARNING [train.py:1204] (2/4) Exclude cut with ID _26474_210_9570_1_1532520022699_3623380_267-7362-0_sp1.1 from training. Duration: 0.936375 2023-10-04 17:06:51,330 WARNING [train.py:1204] (2/4) Exclude cut with ID _55328_210_27767_1_1534580973260_4088170_287-61272-0_sp1.1 from training. Duration: 0.97275 2023-10-04 17:06:52,682 WARNING [train.py:1204] (2/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_589-9070-0 from training. Duration: 0.71 2023-10-04 17:06:52,689 WARNING [train.py:1204] (2/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_148-158509-0_sp1.1 from training. Duration: 0.37275 2023-10-04 17:06:54,464 WARNING [train.py:1204] (2/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_164-184548-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 17:06:54,748 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.conv_module1.balancer1.prob, batch_count=1729346.6666666667, ans=0.125 2023-10-04 17:06:55,881 WARNING [train.py:1204] (2/4) Exclude cut with ID _14970_210_15709_1_1530775547361_1966965_6-23328-0_sp1.1 from training. Duration: 0.7818125 2023-10-04 17:06:55,886 WARNING [train.py:1204] (2/4) Exclude cut with ID _29303_210_2575_1_1532392237303_7410649_612-165069-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 17:06:57,369 WARNING [train.py:1204] (2/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_383-321224-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 17:06:57,410 WARNING [train.py:1204] (2/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_283-160525-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 17:06:57,421 WARNING [train.py:1204] (2/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_505-97987-0 from training. Duration: 0.86 2023-10-04 17:06:58,793 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9309W0190-640317 from training. Duration: 0.768 2023-10-04 17:07:03,380 WARNING [train.py:1204] (2/4) Exclude cut with ID _50093_210_4220_1_1534417141207_3432299_280-297390-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 17:07:07,993 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_17292_210_6753_1_1531125388_2943734_320-71385-0_sp1.1 from training. Duration: 0.97275 2023-10-04 17:07:11,831 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_213-103282-0 from training. Duration: 0.78 2023-10-04 17:07:14,619 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7615_210_4211_1_1528110965_4580160_72-235211-0_sp0.9 from training. Duration: 0.8444375 2023-10-04 17:07:17,459 WARNING [train.py:1204] (2/4) Exclude cut with ID _14867_210_1968_1_1530684179890_3846969_173-320977-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 17:07:18,809 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7046_210_6753_1_1527895337_6920917_783-278109-0_sp0.9 from training. Duration: 0.8555625 2023-10-04 17:07:18,848 WARNING [train.py:1204] (2/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_90-30673-0 from training. Duration: 0.84 2023-10-04 17:07:18,872 WARNING [train.py:1204] (2/4) Exclude cut with ID _22826_210_9994_1_1531908201522_3279169_415-161-0_sp1.1 from training. Duration: 0.8909375 2023-10-04 17:07:18,888 WARNING [train.py:1204] (2/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_86-66192-0_sp0.9 from training. Duration: 0.611125 2023-10-04 17:07:18,891 WARNING [train.py:1204] (2/4) Exclude cut with ID _4626_210_3532_1_1526817652717_3916859_446-206656-0_sp1.1 from training. Duration: 0.6909375 2023-10-04 17:07:20,252 WARNING [train.py:1204] (2/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_166-128941-0_sp0.9 from training. Duration: 0.6 2023-10-04 17:07:22,268 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.self_attn_weights.pos_emb_skip_rate, batch_count=1729480.0, ans=0.0 2023-10-04 17:07:25,519 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=1729480.0, ans=0.1 2023-10-04 17:07:26,612 WARNING [train.py:1204] (2/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_345-167986-0 from training. Duration: 0.84 2023-10-04 17:07:28,219 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.bypass.scale_min, batch_count=1729480.0, ans=0.2 2023-10-04 17:07:29,428 WARNING [train.py:1204] (2/4) Exclude cut with ID _6275_210_2084_1_1528110062213_7669049_973-198085-0 from training. Duration: 0.75 2023-10-04 17:07:30,758 WARNING [train.py:1204] (2/4) Exclude cut with ID _60057_210_10640_1_1534903065793_3333150_171-181133-0 from training. Duration: 0.88 2023-10-04 17:07:31,592 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.1.encoder.layers.1.self_attn2.whiten, num_groups=1, num_channels=256, metric=11.88 vs. limit=22.5 2023-10-04 17:07:32,668 WARNING [train.py:1204] (2/4) Exclude cut with ID _19504_210_9994_1_1531648863242_3325220_336-196513-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 17:07:32,684 WARNING [train.py:1204] (2/4) Exclude cut with ID _6493_210_1747_1_1528459210424_4012470_513-140967-0 from training. Duration: 0.79 2023-10-04 17:07:32,763 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6673_210_3273_1_1527767790_3173240_327-306185-0_sp0.9 from training. Duration: 0.92225 2023-10-04 17:07:35,613 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1032-236863-0_sp1.1 from training. Duration: 0.763625 2023-10-04 17:07:38,760 INFO [train.py:1046] (2/4) Epoch 49, batch 4450, loss[loss=0.1486, simple_loss=0.2463, pruned_loss=0.0255, over 24330.00 frames. ], tot_loss[loss=0.1536, simple_loss=0.2352, pruned_loss=0.03604, over 4727653.18 frames. ], batch size: 74, lr: 2.06e-03, grad_scale: 16.0 2023-10-04 17:07:38,828 WARNING [train.py:1204] (2/4) Exclude cut with ID _14867_210_1968_1_1530684179890_3846969_413-29292-0 from training. Duration: 0.9 2023-10-04 17:07:40,563 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.conv_skip_rate, batch_count=1729546.6666666667, ans=0.0 2023-10-04 17:07:41,801 WARNING [train.py:1204] (2/4) Exclude cut with ID _47251_210_18628_1_1533814063128_4137250_186-343322-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 17:07:44,543 WARNING [train.py:1204] (2/4) Exclude cut with ID _56235_210_10640_1_1534557335235_4161099_624-46091-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 17:07:44,585 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_791-197891-0_sp0.9 from training. Duration: 0.9333125 2023-10-04 17:07:51,323 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_50050_210_16223_1_1535435796_7290138_786-288457-0_sp1.1 from training. Duration: 0.92725 2023-10-04 17:07:51,347 WARNING [train.py:1204] (2/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_213-227283-0_sp1.1 from training. Duration: 0.963625 2023-10-04 17:07:54,134 WARNING [train.py:1204] (2/4) Exclude cut with ID _42659_210_2070_1_1533607442765_3120380_58-341121-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 17:07:57,953 WARNING [train.py:1204] (2/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_506-23200-0_sp1.1 from training. Duration: 0.8818125 2023-10-04 17:08:00,685 WARNING [train.py:1204] (2/4) Exclude cut with ID _14170_210_5219_1_1530525949868_4469650_336-213530-0_sp1.1 from training. Duration: 0.8181875 2023-10-04 17:08:00,721 WARNING [train.py:1204] (2/4) Exclude cut with ID _64135_210_2253_1_1535677267876_7608972_180-5312-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 17:08:02,161 WARNING [train.py:1204] (2/4) Exclude cut with ID _12302_210_5219_1_1529924446466_8107280_835-279754-0 from training. Duration: 0.9 2023-10-04 17:08:02,163 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_20120_210_6753_1_1531643035_5482859_510-96996-0_sp1.1 from training. Duration: 0.7909375 2023-10-04 17:08:03,517 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_15517_210_6753_1_1530954008_3487336_216-187834-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 17:08:03,551 WARNING [train.py:1204] (2/4) Exclude cut with ID _19504_210_9994_1_1531648863242_3325220_414-113427-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 17:08:03,552 WARNING [train.py:1204] (2/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_264-108685-0_sp0.9 from training. Duration: 0.611125 2023-10-04 17:08:06,796 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_5438_210_1794_1_1527336687_2600009_104-288274-0_sp0.9 from training. Duration: 0.6444375 2023-10-04 17:08:11,344 WARNING [train.py:1204] (2/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_520-39496-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 17:08:11,386 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_37818_210_4238_1_1533620910_4720083_315-308565-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 17:08:12,812 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_2009_210_2071_1_1525481134_4588937_385-302100-0_sp1.1 from training. Duration: 0.7909375 2023-10-04 17:08:12,869 WARNING [train.py:1204] (2/4) Exclude cut with ID _58895_210_8750_1_1535184081320_3649089_277-53219-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 17:08:15,487 WARNING [train.py:1204] (2/4) Exclude cut with ID _26664_210_7770_1_1532258943280_3418270_121-111258-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 17:08:18,235 WARNING [train.py:1204] (2/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_99-167392-0_sp1.1 from training. Duration: 0.5545625 2023-10-04 17:08:19,626 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1113-7369-0 from training. Duration: 0.85 2023-10-04 17:08:19,649 WARNING [train.py:1204] (2/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_352-59958-0 from training. Duration: 0.86 2023-10-04 17:08:19,653 WARNING [train.py:1204] (2/4) Exclude cut with ID _14886_210_4220_1_1530702020355_3516190_159-73748-0_sp1.1 from training. Duration: 0.7545625 2023-10-04 17:08:22,481 WARNING [train.py:1204] (2/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_131-119569-0_sp1.1 from training. Duration: 0.92725 2023-10-04 17:08:23,906 WARNING [train.py:1204] (2/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_216-343007-0 from training. Duration: 0.66 2023-10-04 17:08:28,544 WARNING [train.py:1204] (2/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_113-92608-0_sp0.9 from training. Duration: 0.6 2023-10-04 17:08:32,566 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_40047_210_2290_1_1533290519_2374311_132-123271-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 17:08:32,621 WARNING [train.py:1204] (2/4) Exclude cut with ID _8793_210_6310_1_1528542985139_2310009_121-179782-0 from training. Duration: 0.8 2023-10-04 17:08:32,636 WARNING [train.py:1204] (2/4) Exclude cut with ID _43997_210_4525_1_1533538632555_5189329_283-218639-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 17:08:32,638 WARNING [train.py:1204] (2/4) Exclude cut with ID _49376_210_5986_1_1533985278732_3370740_297-78397-0_sp1.1 from training. Duration: 0.936375 2023-10-04 17:08:32,661 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_3475_210_2028_1_1526193578_3622569_377-166133-0_sp0.9 from training. Duration: 0.9555625 2023-10-04 17:08:32,667 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_520-213149-0_sp1.1 from training. Duration: 0.92725 2023-10-04 17:08:33,808 WARNING [train.py:1204] (2/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_567-144483-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 17:08:35,843 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.4.encoder.layers.2.self_attn1.whiten, num_groups=1, num_channels=384, metric=11.72 vs. limit=22.5 2023-10-04 17:08:36,692 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_4183_210_2087_1_1526729100_1869838_143-280962-0_sp0.9 from training. Duration: 0.62225 2023-10-04 17:08:36,737 WARNING [train.py:1204] (2/4) Exclude cut with ID _49643_210_7989_1_1534640244224_3939031_611-185515-0 from training. Duration: 0.94 2023-10-04 17:08:38,095 WARNING [train.py:1204] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_88-341718-0_sp0.9 from training. Duration: 0.7333125 2023-10-04 17:08:41,246 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_51225_210_3601_1_1534400634_2327383_49-235441-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 17:08:41,323 WARNING [train.py:1204] (2/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_240-294400-0_sp1.1 from training. Duration: 0.936375 2023-10-04 17:08:42,740 WARNING [train.py:1204] (2/4) Exclude cut with ID _55429_210_10626_1_1534467588527_7174143_640-304825-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 17:08:44,057 WARNING [train.py:1204] (2/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_187-221566-0_sp1.1 from training. Duration: 0.3909375 2023-10-04 17:08:45,524 WARNING [train.py:1204] (2/4) Exclude cut with ID _7606_210_4915_1_1528894253277_5080690_127-177761-0_sp0.9 from training. Duration: 0.711125 2023-10-04 17:08:48,389 WARNING [train.py:1204] (2/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_430-116139-0 from training. Duration: 0.85 2023-10-04 17:08:49,766 WARNING [train.py:1204] (2/4) Exclude cut with ID _3675_210_4557_1_1526639934408_4182089_456-18305-0_sp0.9 from training. Duration: 0.8444375 2023-10-04 17:08:52,468 INFO [train.py:1046] (2/4) Epoch 49, batch 4500, loss[loss=0.1502, simple_loss=0.2411, pruned_loss=0.02965, over 24549.00 frames. ], tot_loss[loss=0.1532, simple_loss=0.2348, pruned_loss=0.03581, over 4720572.97 frames. ], batch size: 71, lr: 2.06e-03, grad_scale: 16.0 2023-10-04 17:08:55,363 WARNING [train.py:1204] (2/4) Exclude cut with ID _18513_210_9206_1_1531618215223_3862620_402-5957-0_sp1.1 from training. Duration: 0.9 2023-10-04 17:08:57,286 WARNING [train.py:1204] (2/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_465-156500-0 from training. Duration: 0.72 2023-10-04 17:08:57,287 WARNING [train.py:1204] (2/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_714-310538-0 from training. Duration: 0.86 2023-10-04 17:08:59,971 WARNING [train.py:1204] (2/4) Exclude cut with ID _39411_210_5986_1_1533538825030_3343649_335-301187-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 17:09:05,000 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_41750_210_1960_1_1533520678_3948042_241-158616-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 17:09:06,274 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_46013_210_7234_1_1533776362_3744032_50-141535-0_sp1.1 from training. Duration: 0.9 2023-10-04 17:09:07,644 WARNING [train.py:1204] (2/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_43-206757-0_sp0.9 from training. Duration: 0.8333125 2023-10-04 17:09:07,705 WARNING [train.py:1204] (2/4) Exclude cut with ID _22961_210_8254_1_1531976366672_4278180_172-276296-0_sp1.1 from training. Duration: 0.97275 2023-10-04 17:09:07,731 WARNING [train.py:1204] (2/4) Exclude cut with ID _17956_210_4353_1_1531209568125_6979970_1305-301903-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 17:09:08,366 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.2.encoder.layers.0.feed_forward2.out_whiten, num_groups=1, num_channels=384, metric=4.59 vs. limit=15.0 2023-10-04 17:09:08,985 WARNING [train.py:1204] (2/4) Exclude cut with ID _28323_210_16187_1_1532480395667_4100300_604-44507-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 17:09:12,627 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.feed_forward1.out_whiten.whitening_limit, batch_count=1729946.6666666667, ans=15.0 2023-10-04 17:09:15,227 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.757e+02 2.089e+02 2.391e+02 2.805e+02 4.651e+02, threshold=4.782e+02, percent-clipped=0.0 2023-10-04 17:09:20,833 WARNING [train.py:1204] (2/4) Exclude cut with ID _23514_210_1389_1_1531832139785_3031439_88-337544-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 17:09:20,900 WARNING [train.py:1204] (2/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_932-3672-0_sp1.1 from training. Duration: 0.963625 2023-10-04 17:09:22,458 WARNING [train.py:1204] (2/4) Exclude cut with ID _46502_210_13794_1_1533724452198_3657159_207-109991-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 17:09:23,803 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_216-73841-0_sp1.1 from training. Duration: 0.87275 2023-10-04 17:09:25,173 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8453_210_4211_1_1528502940_8562730_347-330673-0_sp1.1 from training. Duration: 0.6545625 2023-10-04 17:09:28,195 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_4183_210_2087_1_1526729100_1869838_143-280962-0_sp1.1 from training. Duration: 0.5090625 2023-10-04 17:09:33,209 WARNING [train.py:1204] (2/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_325-113798-0_sp1.1 from training. Duration: 0.77275 2023-10-04 17:09:33,399 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.attention_skip_rate, batch_count=1730013.3333333333, ans=0.0 2023-10-04 17:09:37,819 WARNING [train.py:1204] (2/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_91-212486-0_sp0.9 from training. Duration: 0.7555625 2023-10-04 17:09:41,651 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8776_210_2040_1_1528638837_4088036_244-278763-0_sp1.1 from training. Duration: 0.8181875 2023-10-04 17:09:41,694 WARNING [train.py:1204] (2/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_106-286130-0 from training. Duration: 0.98 2023-10-04 17:09:41,751 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_2746_210_1988_1_1525872816_3795627_372-205861-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 17:09:43,624 WARNING [train.py:1204] (2/4) Exclude cut with ID _30800_210_22382_1_1532499347904_4515959_240-54646-0_sp1.1 from training. Duration: 0.936375 2023-10-04 17:09:45,065 WARNING [train.py:1204] (2/4) Exclude cut with ID _59500_210_8254_1_1534809610680_3736009_162-180695-0_sp1.1 from training. Duration: 0.936375 2023-10-04 17:09:45,100 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7544_210_6753_1_1528591327_1471426_146-93357-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 17:09:47,974 WARNING [train.py:1204] (2/4) Exclude cut with ID _42657_210_2070_1_1533520654016_3327008_405-87432-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 17:09:49,185 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_4528_210_3273_1_1527076654_3276777_209-219804-0 from training. Duration: 0.69 2023-10-04 17:09:49,186 WARNING [train.py:1204] (2/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_78-182912-0_sp0.9 from training. Duration: 0.6555625 2023-10-04 17:09:49,201 WARNING [train.py:1204] (2/4) Exclude cut with ID _31255_210_13427_1_1532865619495_3815143_322-114998-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 17:09:53,483 WARNING [train.py:1204] (2/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_848-280957-0_sp0.9 from training. Duration: 0.9666875 2023-10-04 17:09:53,514 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7696_210_2040_1_1528207167_3716099_117-125434-0_sp0.9 from training. Duration: 0.8444375 2023-10-04 17:09:56,203 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_1002-109192-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 17:09:57,642 WARNING [train.py:1204] (2/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_138-64210-0_sp0.9 from training. Duration: 0.811125 2023-10-04 17:09:57,669 WARNING [train.py:1204] (2/4) Exclude cut with ID _25808_210_13292_1_1532343514562_7366109_633-47524-0_sp1.1 from training. Duration: 0.9 2023-10-04 17:10:00,835 WARNING [train.py:1204] (2/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_297-5233-0 from training. Duration: 0.95 2023-10-04 17:10:00,964 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.conv_module2.balancer1.prob, batch_count=1730146.6666666667, ans=0.125 2023-10-04 17:10:04,066 WARNING [train.py:1204] (2/4) Exclude cut with ID _8394_210_6702_1_1528590504316_3540189_335-274780-0 from training. Duration: 0.78 2023-10-04 17:10:04,071 WARNING [train.py:1204] (2/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_301-330455-0 from training. Duration: 0.8 2023-10-04 17:10:07,349 INFO [train.py:1046] (2/4) Epoch 49, batch 4550, loss[loss=0.1537, simple_loss=0.2443, pruned_loss=0.03154, over 24446.00 frames. ], tot_loss[loss=0.153, simple_loss=0.2346, pruned_loss=0.03565, over 4725506.79 frames. ], batch size: 69, lr: 2.06e-03, grad_scale: 16.0 2023-10-04 17:10:07,430 WARNING [train.py:1204] (2/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_383-205833-0 from training. Duration: 0.95 2023-10-04 17:10:10,166 WARNING [train.py:1204] (2/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_48-182155-0 from training. Duration: 0.87 2023-10-04 17:10:10,258 WARNING [train.py:1204] (2/4) Exclude cut with ID _14832_210_8750_1_1530691351539_3171348_1-54315-0_sp1.1 from training. Duration: 0.7909375 2023-10-04 17:10:13,588 WARNING [train.py:1204] (2/4) Exclude cut with ID _44982_210_1511_1_1534312820108_3726029_413-100814-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 17:10:13,636 WARNING [train.py:1204] (2/4) Exclude cut with ID _3231_210_2106_1_1526205864934_6784220_104-261392-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 17:10:16,393 WARNING [train.py:1204] (2/4) Exclude cut with ID _24314_210_4915_1_1532135078116_7170179_1-13588-0_sp1.1 from training. Duration: 0.97275 2023-10-04 17:10:16,601 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.ff2_skip_rate, batch_count=1730213.3333333333, ans=0.0 2023-10-04 17:10:20,432 WARNING [train.py:1204] (2/4) Exclude cut with ID _44304_210_15374_1_1533639352765_4613129_141-8356-0_sp1.1 from training. Duration: 0.963625 2023-10-04 17:10:21,844 WARNING [train.py:1204] (2/4) Exclude cut with ID _37892_210_19596_1_1533198677897_3964058_374-335034-0_sp1.1 from training. Duration: 0.936375 2023-10-04 17:10:23,301 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_195-257512-0_sp0.9 from training. Duration: 0.7444375 2023-10-04 17:10:23,303 WARNING [train.py:1204] (2/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_1267-272693-0_sp1.1 from training. Duration: 0.663625 2023-10-04 17:10:23,304 WARNING [train.py:1204] (2/4) Exclude cut with ID _30196_210_1664_1_1533708496522_7074140_690-326155-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 17:10:24,745 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_68847_210_14656_1_1536070214_2586843_38-20717-0_sp1.1 from training. Duration: 0.97275 2023-10-04 17:10:26,060 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_99-251314-0_sp1.1 from training. Duration: 0.7909375 2023-10-04 17:10:29,413 WARNING [train.py:1204] (2/4) Exclude cut with ID _49376_210_5986_1_1533985278732_3370740_225-95630-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 17:10:33,305 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_2655_210_2087_1_1525688959_3897400_202-147557-0 from training. Duration: 0.85 2023-10-04 17:10:33,353 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_5618_210_1874_1_1527311008_5851_1-214501-0_sp1.1 from training. Duration: 0.626375 2023-10-04 17:10:34,670 WARNING [train.py:1204] (2/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_225-43381-0_sp1.1 from training. Duration: 0.8545625 2023-10-04 17:10:35,879 WARNING [train.py:1204] (2/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_242-3472-0 from training. Duration: 0.73 2023-10-04 17:10:37,467 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.balancer1.prob, batch_count=1730346.6666666667, ans=0.125 2023-10-04 17:10:40,021 WARNING [train.py:1204] (2/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_339-104791-0 from training. Duration: 0.49 2023-10-04 17:10:41,381 WARNING [train.py:1204] (2/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_488-92261-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 17:10:42,946 WARNING [train.py:1204] (2/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_175-163894-0 from training. Duration: 0.74 2023-10-04 17:10:44,399 WARNING [train.py:1204] (2/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_41-234353-0_sp0.9 from training. Duration: 0.7555625 2023-10-04 17:10:47,771 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.conv_skip_rate, batch_count=1730346.6666666667, ans=0.0 2023-10-04 17:10:48,918 WARNING [train.py:1204] (2/4) Exclude cut with ID _47251_210_18628_1_1533814063128_4137250_116-275927-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 17:10:50,084 WARNING [train.py:1204] (2/4) Exclude cut with ID _4697_210_2069_1_1526815801953_3823098_61-223324-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 17:10:50,096 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6961_210_2087_1_1527937168_6815011_562-165518-0_sp1.1 from training. Duration: 0.7 2023-10-04 17:10:51,530 WARNING [train.py:1204] (2/4) Exclude cut with ID _21989_210_5854_1_1531899214931_3271360_133-44759-0 from training. Duration: 0.77 2023-10-04 17:10:51,763 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.bypass.scale_min, batch_count=1730413.3333333333, ans=0.2 2023-10-04 17:10:54,309 WARNING [train.py:1204] (2/4) Exclude cut with ID _23557_210_8750_1_1531890106370_6846255_77-284923-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 17:10:57,036 WARNING [train.py:1204] (2/4) Exclude cut with ID _13678_210_7497_1_1530530069226_3463170_464-296127-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 17:10:57,049 WARNING [train.py:1204] (2/4) Exclude cut with ID _22613_210_5219_1_1532253599686_4090170_393-36663-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 17:10:58,368 WARNING [train.py:1204] (2/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_235-262124-0_sp0.9 from training. Duration: 0.7444375 2023-10-04 17:10:59,791 WARNING [train.py:1204] (2/4) Exclude cut with ID _8480_210_1874_1_1528589391205_6728330_755-112625-0 from training. Duration: 0.74 2023-10-04 17:10:59,837 WARNING [train.py:1204] (2/4) Exclude cut with ID _6456_210_1534_1_1527681634212_3181049_215-309359-0 from training. Duration: 0.88 2023-10-04 17:10:59,858 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_3019_210_2028_1_1526044245_3264865_191-72235-0_sp1.1 from training. Duration: 0.8818125 2023-10-04 17:11:01,682 WARNING [train.py:1204] (2/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_380-162648-0 from training. Duration: 0.94 2023-10-04 17:11:03,743 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_92-46009-0 from training. Duration: 0.54 2023-10-04 17:11:03,762 WARNING [train.py:1204] (2/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_53-300544-0_sp0.9 from training. Duration: 0.7444375 2023-10-04 17:11:05,205 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_68235_210_1925_1_1535863247_4484656_101-250943-0_sp1.1 from training. Duration: 0.97275 2023-10-04 17:11:05,219 WARNING [train.py:1204] (2/4) Exclude cut with ID _14970_210_15709_1_1530775547361_1966965_204-22182-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 17:11:06,651 WARNING [train.py:1204] (2/4) Exclude cut with ID _55153_210_2253_1_1534384786677_3162096_310-130092-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 17:11:06,662 WARNING [train.py:1204] (2/4) Exclude cut with ID _60113_210_16271_1_1535164212481_4386079_559-167191-0_sp1.1 from training. Duration: 0.8454375 2023-10-04 17:11:07,962 WARNING [train.py:1204] (2/4) Exclude cut with ID _14368_210_4220_1_1530532714870_3595529_93-58002-0_sp1.1 from training. Duration: 0.5181875 2023-10-04 17:11:09,304 WARNING [train.py:1204] (2/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_110-224688-0 from training. Duration: 0.97 2023-10-04 17:11:10,701 WARNING [train.py:1204] (2/4) Exclude cut with ID _40402_210_18219_1_1533522324115_2953150_178-178004-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 17:11:10,716 WARNING [train.py:1204] (2/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_321-335475-0_sp1.1 from training. Duration: 0.4090625 2023-10-04 17:11:10,777 WARNING [train.py:1204] (2/4) Exclude cut with ID _66863_210_18053_1_1535698758334_8590319_1092-271784-0 from training. Duration: 0.96 2023-10-04 17:11:10,783 WARNING [train.py:1204] (2/4) Exclude cut with ID _28317_210_12742_1_1532480302381_8510325_607-213951-0_sp1.1 from training. Duration: 0.763625 2023-10-04 17:11:10,792 WARNING [train.py:1204] (2/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_468-283113-0 from training. Duration: 0.96 2023-10-04 17:11:14,906 WARNING [train.py:1204] (2/4) Exclude cut with ID _14922_210_6228_1_1530707435273_3693310_13-165512-0_sp1.1 from training. Duration: 0.7454375 2023-10-04 17:11:14,917 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_69630_210_7234_1_1535788670_910989_56-169762-0_sp1.1 from training. Duration: 0.92725 2023-10-04 17:11:15,237 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.feed_forward2.hidden_balancer.prob, batch_count=1730480.0, ans=0.125 2023-10-04 17:11:18,295 WARNING [train.py:1204] (2/4) Exclude cut with ID _67335_210_5219_1_1535525975652_4275229_121-80357-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 17:11:18,364 WARNING [train.py:1204] (2/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_126-12079-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 17:11:19,668 WARNING [train.py:1204] (2/4) Exclude cut with ID _5294_210_1747_1_1527249643928_3696179_342-270278-0_sp1.1 from training. Duration: 0.5 2023-10-04 17:11:20,943 INFO [train.py:1046] (2/4) Epoch 49, batch 4600, loss[loss=0.1615, simple_loss=0.2512, pruned_loss=0.03587, over 23952.00 frames. ], tot_loss[loss=0.1519, simple_loss=0.2326, pruned_loss=0.03559, over 4714397.53 frames. ], batch size: 80, lr: 2.06e-03, grad_scale: 16.0 2023-10-04 17:11:20,991 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_30322_210_1925_1_1532843478_826817_35-328264-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 17:11:22,460 WARNING [train.py:1204] (2/4) Exclude cut with ID _8480_210_1874_1_1528589391205_6728330_755-112625-0_sp1.1 from training. Duration: 0.67275 2023-10-04 17:11:25,255 WARNING [train.py:1204] (2/4) Exclude cut with ID _38670_210_15379_1_1533448810659_4391059_132-146412-0_sp1.1 from training. Duration: 0.97275 2023-10-04 17:11:25,328 WARNING [train.py:1204] (2/4) Exclude cut with ID _22020_210_2461_1_1531724164513_6326900_563-39691-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 17:11:27,940 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_9460_210_4211_1_1528954587_10049667_572-37035-0_sp1.1 from training. Duration: 0.72725 2023-10-04 17:11:29,253 WARNING [train.py:1204] (2/4) Exclude cut with ID _68777_210_4353_1_1535810390731_3117904_129-166012-0_sp0.9 from training. Duration: 0.9444375 2023-10-04 17:11:29,327 WARNING [train.py:1204] (2/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_487-91921-0_sp1.1 from training. Duration: 0.963625 2023-10-04 17:11:31,167 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_195-257512-0 from training. Duration: 0.67 2023-10-04 17:11:33,106 WARNING [train.py:1204] (2/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_188-260011-0_sp1.1 from training. Duration: 0.663625 2023-10-04 17:11:33,283 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.balancer2.prob, batch_count=1730546.6666666667, ans=0.125 2023-10-04 17:11:37,836 WARNING [train.py:1204] (2/4) Exclude cut with ID _8480_210_1874_1_1528589391205_6728330_309-139896-0_sp0.9 from training. Duration: 0.9 2023-10-04 17:11:37,893 WARNING [train.py:1204] (2/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_866-321248-0_sp1.1 from training. Duration: 0.963625 2023-10-04 17:11:40,543 WARNING [train.py:1204] (2/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_510-216513-0_sp1.1 from training. Duration: 0.97275 2023-10-04 17:11:43,293 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.620e+02 2.059e+02 2.232e+02 2.627e+02 4.263e+02, threshold=4.464e+02, percent-clipped=0.0 2023-10-04 17:11:45,530 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.2.encoder.layers.2.self_attn2.whiten, num_groups=1, num_channels=384, metric=17.87 vs. limit=22.5 2023-10-04 17:11:46,192 WARNING [train.py:1204] (2/4) Exclude cut with ID _6653_210_2629_1_1527919172441_6732679_758-143677-0 from training. Duration: 0.98 2023-10-04 17:11:47,804 WARNING [train.py:1204] (2/4) Exclude cut with ID _55134_210_21982_1_1534590028377_3882170_50-214427-0_sp1.1 from training. Duration: 0.97275 2023-10-04 17:11:51,075 WARNING [train.py:1204] (2/4) Exclude cut with ID _31649_210_6228_1_1532584608063_4091078_482-301330-0_sp1.1 from training. Duration: 0.97275 2023-10-04 17:11:52,460 WARNING [train.py:1204] (2/4) Exclude cut with ID _35799_210_5854_1_1533263487887_3529071_490-326847-0_sp1.1 from training. Duration: 0.9 2023-10-04 17:11:52,475 WARNING [train.py:1204] (2/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_984-90716-0_sp1.1 from training. Duration: 0.963625 2023-10-04 17:11:54,923 INFO [scaling.py:1022] (2/4) Whitening: name=encoder_embed.convnext.out_whiten, num_groups=1, num_channels=128, metric=4.72 vs. limit=5.0 2023-10-04 17:11:59,251 WARNING [train.py:1204] (2/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_166-246557-0 from training. Duration: 0.64 2023-10-04 17:11:59,251 WARNING [train.py:1204] (2/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_51-200220-0_sp1.1 from training. Duration: 0.5090625 2023-10-04 17:12:00,571 WARNING [train.py:1204] (2/4) Exclude cut with ID _51382_210_23862_1_1534492801838_3650104_275-92796-0_sp1.1 from training. Duration: 0.936375 2023-10-04 17:12:05,449 WARNING [train.py:1204] (2/4) Exclude cut with ID _29853_210_7497_1_1532505600851_6642110_461-142829-0_sp1.1 from training. Duration: 0.97275 2023-10-04 17:12:07,288 WARNING [train.py:1204] (2/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_788-298907-0_sp0.9 from training. Duration: 0.988875 2023-10-04 17:12:08,747 WARNING [train.py:1204] (2/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_739-2098-0_sp1.1 from training. Duration: 0.836375 2023-10-04 17:12:12,989 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7543_210_6753_1_1528541907_1399322_165-323619-0 from training. Duration: 0.69 2023-10-04 17:12:13,085 WARNING [train.py:1204] (2/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_401-255491-0_sp1.1 from training. Duration: 0.636375 2023-10-04 17:12:17,175 WARNING [train.py:1204] (2/4) Exclude cut with ID _66873_210_5517_1_1535454136506_4293370_24-143283-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 17:12:18,550 WARNING [train.py:1204] (2/4) Exclude cut with ID _14459_210_2228_1_1531443641946_7516700_1221-280439-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 17:12:20,444 WARNING [train.py:1204] (2/4) Exclude cut with ID _59202_210_26984_1_1534816810612_3549174_469-213482-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 17:12:20,445 WARNING [train.py:1204] (2/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_148-158509-0_sp0.9 from training. Duration: 0.4555625 2023-10-04 17:12:20,479 WARNING [train.py:1204] (2/4) Exclude cut with ID _24314_210_4915_1_1532135078116_7170179_863-259095-0_sp1.1 from training. Duration: 0.97275 2023-10-04 17:12:21,925 WARNING [train.py:1204] (2/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_321-22821-0 from training. Duration: 0.89 2023-10-04 17:12:21,959 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_43831_210_2158_1_1533725502_5872791_55-163137-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 17:12:23,347 WARNING [train.py:1204] (2/4) Exclude cut with ID _27430_210_7770_1_1532604689375_1378590_26-25207-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 17:12:24,674 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_17613_210_2151_1_1531137569_2366160_223-316461-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 17:12:26,042 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_53679_210_4852_1_1534556726_4186141_541-213370-0_sp1.1 from training. Duration: 0.963625 2023-10-04 17:12:26,118 WARNING [train.py:1204] (2/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_369-295086-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 17:12:27,387 WARNING [train.py:1204] (2/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_91-267696-0 from training. Duration: 0.87 2023-10-04 17:12:27,432 WARNING [train.py:1204] (2/4) Exclude cut with ID _5294_210_1747_1_1527249643928_3696179_180-100713-0 from training. Duration: 0.81 2023-10-04 17:12:28,908 WARNING [train.py:1204] (2/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_32-335103-0 from training. Duration: 0.82 2023-10-04 17:12:28,913 WARNING [train.py:1204] (2/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_133-312727-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 17:12:29,002 WARNING [train.py:1204] (2/4) Exclude cut with ID _59288_210_5854_1_1535077757774_3412246_391-165934-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 17:12:30,344 WARNING [train.py:1204] (2/4) Exclude cut with ID _28323_210_16187_1_1532480395667_4100300_488-1281-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 17:12:31,577 WARNING [train.py:1204] (2/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_124-193207-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 17:12:32,150 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.5.encoder.layers.1.conv_module1.whiten, num_groups=1, num_channels=256, metric=3.70 vs. limit=15.0 2023-10-04 17:12:34,805 INFO [train.py:1046] (2/4) Epoch 49, batch 4650, loss[loss=0.1521, simple_loss=0.2497, pruned_loss=0.02721, over 24283.00 frames. ], tot_loss[loss=0.1514, simple_loss=0.2321, pruned_loss=0.03532, over 4716790.73 frames. ], batch size: 74, lr: 2.06e-03, grad_scale: 16.0 2023-10-04 17:12:40,181 WARNING [train.py:1204] (2/4) Exclude cut with ID _14087_210_2253_1_1530529224512_3014029_226-242140-0_sp0.9 from training. Duration: 0.9 2023-10-04 17:12:44,401 WARNING [train.py:1204] (2/4) Exclude cut with ID _29277_210_3279_1_1532581109293_3173069_392-3694-0_sp1.1 from training. Duration: 0.936375 2023-10-04 17:12:44,438 WARNING [train.py:1204] (2/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_563-329842-0_sp1.1 from training. Duration: 0.97275 2023-10-04 17:12:44,484 WARNING [train.py:1204] (2/4) Exclude cut with ID _58583_210_5236_1_1534726835266_3574249_391-87755-0_sp1.1 from training. Duration: 0.92725 2023-10-04 17:12:44,512 WARNING [train.py:1204] (2/4) Exclude cut with ID _28427_210_4220_1_1532581240967_3061069_281-237827-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 17:12:44,526 WARNING [train.py:1204] (2/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_44-147898-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 17:12:45,856 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_24007_210_1794_1_1531918925_1663875_125-320772-0_sp1.1 from training. Duration: 0.97275 2023-10-04 17:12:47,428 WARNING [train.py:1204] (2/4) Exclude cut with ID _14368_210_4220_1_1530532714870_3595529_143-117421-0 from training. Duration: 0.58 2023-10-04 17:12:52,106 WARNING [train.py:1204] (2/4) Exclude cut with ID _69584_210_5219_1_1535785133775_4044450_325-7656-0_sp0.9 from training. Duration: 0.9555625 2023-10-04 17:12:54,840 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_37351_210_7925_1_1533012931_3812113_265-85780-0 from training. Duration: 0.88 2023-10-04 17:12:54,865 WARNING [train.py:1204] (2/4) Exclude cut with ID _18516_210_9206_1_1531704653206_3692406_331-237896-0_sp1.1 from training. Duration: 0.936375 2023-10-04 17:12:54,939 WARNING [train.py:1204] (2/4) Exclude cut with ID _14629_210_15709_1_1530604298182_956899_6-232695-0 from training. Duration: 0.94 2023-10-04 17:12:56,297 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7544_210_6753_1_1528587000_691468_87-178599-0_sp1.1 from training. Duration: 0.7909375 2023-10-04 17:12:56,340 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_326-303945-0 from training. Duration: 0.99 2023-10-04 17:12:56,370 WARNING [train.py:1204] (2/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_503-30719-0 from training. Duration: 0.98 2023-10-04 17:12:56,378 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_11305_210_1794_1_1529750428_7178148_299-344640-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 17:12:57,753 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_53071_210_16554_1_1534490405_2954066_265-170472-0_sp1.1 from training. Duration: 0.7545625 2023-10-04 17:12:59,432 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.attention_skip_rate, batch_count=1730946.6666666667, ans=0.0 2023-10-04 17:13:00,611 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_174-277364-0_sp1.1 from training. Duration: 0.6454375 2023-10-04 17:13:00,759 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.ff3_skip_rate, batch_count=1730946.6666666667, ans=0.0 2023-10-04 17:13:01,923 WARNING [train.py:1204] (2/4) Exclude cut with ID _58414_210_8254_1_1535421609162_7151182_193-151884-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 17:13:01,941 WARNING [train.py:1204] (2/4) Exclude cut with ID IC0651W0104-322063 from training. Duration: 0.768 2023-10-04 17:13:05,791 WARNING [train.py:1204] (2/4) Exclude cut with ID _21881_210_3525_1_1531825242757_3798380_139-305982-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 17:13:05,897 WARNING [train.py:1204] (2/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_79-200392-0 from training. Duration: 0.97 2023-10-04 17:13:09,351 WARNING [train.py:1204] (2/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_451-280894-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 17:13:09,359 WARNING [train.py:1204] (2/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_304-273433-0_sp1.1 from training. Duration: 0.9 2023-10-04 17:13:10,718 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_5959_210_1794_1_1527400400_3445726_412-44052-0 from training. Duration: 0.77 2023-10-04 17:13:12,138 WARNING [train.py:1204] (2/4) Exclude cut with ID _4681_210_3525_1_1527251040321_1235713_148-26159-0_sp1.1 from training. Duration: 0.963625 2023-10-04 17:13:14,858 WARNING [train.py:1204] (2/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_887-249555-0_sp0.9 from training. Duration: 0.8555625 2023-10-04 17:13:17,599 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_25236_210_2158_1_1532051968_280567_37-6860-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 17:13:22,587 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.self_attn_weights.pos_emb_skip_rate, batch_count=1731080.0, ans=0.0 2023-10-04 17:13:23,731 WARNING [train.py:1204] (2/4) Exclude cut with ID _29277_210_3279_1_1532581109293_3173069_417-115527-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 17:13:25,092 WARNING [train.py:1204] (2/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_552-134748-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 17:13:26,412 WARNING [train.py:1204] (2/4) Exclude cut with ID _43176_210_8254_1_1533524417469_4174339_188-174143-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 17:13:26,459 WARNING [train.py:1204] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_127-119109-0_sp1.1 from training. Duration: 0.6545625 2023-10-04 17:13:29,149 WARNING [train.py:1204] (2/4) Exclude cut with ID _14922_210_6228_1_1530707435273_3693310_38-187851-0 from training. Duration: 0.62 2023-10-04 17:13:29,195 WARNING [train.py:1204] (2/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_588-125165-0 from training. Duration: 0.83 2023-10-04 17:13:30,497 WARNING [train.py:1204] (2/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_344-152526-0_sp1.1 from training. Duration: 0.4818125 2023-10-04 17:13:30,498 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7002_210_1794_1_1527935634_4034444_487-236501-0 from training. Duration: 0.66 2023-10-04 17:13:31,936 WARNING [train.py:1204] (2/4) Exclude cut with ID _23825_210_13449_1_1531915313526_3497150_379-16981-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 17:13:39,699 WARNING [train.py:1204] (2/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_297-5233-0_sp1.1 from training. Duration: 0.863625 2023-10-04 17:13:39,703 WARNING [train.py:1204] (2/4) Exclude cut with ID _41298_210_9019_1_1533430371450_4822143_178-283538-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 17:13:41,012 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7046_210_6753_1_1527895337_6920917_642-106799-0 from training. Duration: 0.92 2023-10-04 17:13:41,034 WARNING [train.py:1204] (2/4) Exclude cut with ID _6274_210_2084_1_1527937583837_7494020_532-291952-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 17:13:42,378 WARNING [train.py:1204] (2/4) Exclude cut with ID _47278_210_12041_1_1533902418349_3576110_260-270468-0_sp1.1 from training. Duration: 0.92725 2023-10-04 17:13:42,384 WARNING [train.py:1204] (2/4) Exclude cut with ID _14368_210_4220_1_1530532714870_3595529_144-3287-0_sp0.9 from training. Duration: 0.7444375 2023-10-04 17:13:43,820 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_162-262717-0_sp0.9 from training. Duration: 0.92225 2023-10-04 17:13:44,018 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.feed_forward1.hidden_balancer.prob, batch_count=1731146.6666666667, ans=0.125 2023-10-04 17:13:46,606 WARNING [train.py:1204] (2/4) Exclude cut with ID _3465_210_3525_1_1526175667060_3574049_62-335552-0_sp0.9 from training. Duration: 0.9666875 2023-10-04 17:13:46,618 WARNING [train.py:1204] (2/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_726-272852-0_sp1.1 from training. Duration: 0.92725 2023-10-04 17:13:48,045 WARNING [train.py:1204] (2/4) Exclude cut with ID _14898_210_9206_1_1530766812377_4177190_26-135492-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 17:13:49,402 INFO [train.py:1046] (2/4) Epoch 49, batch 4700, loss[loss=0.1391, simple_loss=0.2196, pruned_loss=0.0293, over 24454.00 frames. ], tot_loss[loss=0.1515, simple_loss=0.2326, pruned_loss=0.03519, over 4719770.08 frames. ], batch size: 58, lr: 2.06e-03, grad_scale: 16.0 2023-10-04 17:13:50,891 WARNING [train.py:1204] (2/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_200-95304-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 17:13:50,935 WARNING [train.py:1204] (2/4) Exclude cut with ID _45012_210_12651_1_1533608435136_5371380_240-37101-0_sp1.1 from training. Duration: 0.8454375 2023-10-04 17:13:50,949 WARNING [train.py:1204] (2/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_132-269633-0_sp0.9 from training. Duration: 0.7555625 2023-10-04 17:13:52,273 WARNING [train.py:1204] (2/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_907-151398-0_sp0.9 from training. Duration: 0.411125 2023-10-04 17:13:53,612 WARNING [train.py:1204] (2/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_45-265684-0_sp0.9 from training. Duration: 0.8 2023-10-04 17:13:53,703 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6291_210_3273_1_1527683649_1078498_74-292608-0 from training. Duration: 0.72 2023-10-04 17:13:55,619 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.1.feed_forward1.out_proj.dropout_p, batch_count=1731213.3333333333, ans=0.1 2023-10-04 17:13:58,544 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.bypass.skip_rate, batch_count=1731213.3333333333, ans=0.04949747468305833 2023-10-04 17:14:02,343 WARNING [train.py:1204] (2/4) Exclude cut with ID _59202_210_26984_1_1534816810612_3549174_221-84819-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 17:14:02,498 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.1.feed_forward1.out_proj.dropout_p, batch_count=1731280.0, ans=0.1 2023-10-04 17:14:03,764 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_33696_210_3852_1_1533082780_2663375_181-325595-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 17:14:03,809 WARNING [train.py:1204] (2/4) Exclude cut with ID _47251_210_18628_1_1533814063128_4137250_351-154105-0_sp1.1 from training. Duration: 0.963625 2023-10-04 17:14:03,896 WARNING [train.py:1204] (2/4) Exclude cut with ID _35501_210_9288_1_1532998507986_3192189_351-334740-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 17:14:07,569 WARNING [train.py:1204] (2/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_342-100157-0_sp1.1 from training. Duration: 0.4909375 2023-10-04 17:14:11,012 WARNING [train.py:1204] (2/4) Exclude cut with ID _6789_210_1534_1_1527857937218_3118950_262-48853-0 from training. Duration: 0.56 2023-10-04 17:14:11,048 WARNING [train.py:1204] (2/4) Exclude cut with ID _69884_210_9771_1_1535846392818_4166169_430-201020-0 from training. Duration: 0.97 2023-10-04 17:14:12,379 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.767e+02 2.015e+02 2.160e+02 2.418e+02 3.879e+02, threshold=4.320e+02, percent-clipped=0.0 2023-10-04 17:14:13,961 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_71-136518-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 17:14:15,325 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_28-291982-0_sp1.1 from training. Duration: 0.8818125 2023-10-04 17:14:16,666 WARNING [train.py:1204] (2/4) Exclude cut with ID _54218_210_12210_1_1534413646388_4409259_437-275326-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 17:14:18,303 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=1731346.6666666667, ans=0.1 2023-10-04 17:14:19,378 WARNING [train.py:1204] (2/4) Exclude cut with ID _55134_210_21982_1_1534590028377_3882170_40-309139-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 17:14:22,492 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.attention_skip_rate, batch_count=1731346.6666666667, ans=0.0 2023-10-04 17:14:23,711 WARNING [train.py:1204] (2/4) Exclude cut with ID _32704_210_8954_1_1533017488840_6747370_266-329755-0_sp1.1 from training. Duration: 0.8090625 2023-10-04 17:14:25,103 WARNING [train.py:1204] (2/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_21-174514-0_sp1.1 from training. Duration: 0.3909375 2023-10-04 17:14:25,298 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.1.feed_forward3.hidden_balancer.prob, batch_count=1731346.6666666667, ans=0.125 2023-10-04 17:14:27,054 WARNING [train.py:1204] (2/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_409-207378-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 17:14:33,743 WARNING [train.py:1204] (2/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_132-269633-0 from training. Duration: 0.68 2023-10-04 17:14:35,592 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_2245_210_2715_1_1525511548_3540254_310-52483-0_sp0.9 from training. Duration: 0.97775 2023-10-04 17:14:37,393 WARNING [train.py:1204] (2/4) Exclude cut with ID _27430_210_7770_1_1532604689375_1378590_47-77403-0_sp1.1 from training. Duration: 0.92725 2023-10-04 17:14:41,901 WARNING [train.py:1204] (2/4) Exclude cut with ID _7562_210_2084_1_1528887767916_3946229_186-326422-0 from training. Duration: 0.84 2023-10-04 17:14:43,439 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_14056_210_4163_1_1530604483_7572827_230-35993-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 17:14:46,344 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_23854_210_1794_1_1531969696_1329157_57-123491-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 17:14:47,641 WARNING [train.py:1204] (2/4) Exclude cut with ID _14747_210_8341_1_1530685797463_3654089_147-131229-0 from training. Duration: 0.76 2023-10-04 17:14:47,747 WARNING [train.py:1204] (2/4) Exclude cut with ID _30191_210_1664_1_1533189915047_7196670_430-19539-0_sp1.1 from training. Duration: 0.92725 2023-10-04 17:14:47,767 WARNING [train.py:1204] (2/4) Exclude cut with ID _67725_210_27767_1_1535608756671_5105359_44-50762-0_sp1.1 from training. Duration: 0.963625 2023-10-04 17:14:50,657 WARNING [train.py:1204] (2/4) Exclude cut with ID _27485_210_4916_1_1532739191677_3668269_217-161755-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 17:14:50,697 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_5931_210_2040_1_1527428481_4547999_321-193614-0_sp1.1 from training. Duration: 0.6090625 2023-10-04 17:14:52,003 WARNING [train.py:1204] (2/4) Exclude cut with ID _6267_210_2084_1_1527764658526_3894730_375-219887-0 from training. Duration: 0.67 2023-10-04 17:14:52,084 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9087W0022-538393 from training. Duration: 0.768 2023-10-04 17:14:53,431 WARNING [train.py:1204] (2/4) Exclude cut with ID _68777_210_4353_1_1535810390731_3117904_83-196459-0_sp1.1 from training. Duration: 0.963625 2023-10-04 17:14:56,675 WARNING [train.py:1204] (2/4) Exclude cut with ID _69884_210_9771_1_1535846392818_4166169_455-343812-0_sp1.1 from training. Duration: 0.92725 2023-10-04 17:14:56,676 WARNING [train.py:1204] (2/4) Exclude cut with ID _13916_210_9994_1_1530788341126_3763149_414-320174-0_sp1.1 from training. Duration: 0.92725 2023-10-04 17:14:56,687 WARNING [train.py:1204] (2/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_398-239819-0 from training. Duration: 0.99 2023-10-04 17:14:56,765 WARNING [train.py:1204] (2/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_199-232314-0_sp1.1 from training. Duration: 0.92725 2023-10-04 17:15:00,885 WARNING [train.py:1204] (2/4) Exclude cut with ID _2334_210_2667_1_1525608238880_4705129_459-104997-0 from training. Duration: 0.95 2023-10-04 17:15:02,365 WARNING [train.py:1204] (2/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_441-209395-0_sp1.1 from training. Duration: 0.936375 2023-10-04 17:15:03,630 INFO [train.py:1046] (2/4) Epoch 49, batch 4750, loss[loss=0.1471, simple_loss=0.2302, pruned_loss=0.03196, over 24466.00 frames. ], tot_loss[loss=0.152, simple_loss=0.2329, pruned_loss=0.03553, over 4702035.92 frames. ], batch size: 63, lr: 2.06e-03, grad_scale: 16.0 2023-10-04 17:15:03,736 WARNING [train.py:1204] (2/4) Exclude cut with ID _27052_210_5986_1_1532329197187_3585009_320-80393-0_sp1.1 from training. Duration: 0.97275 2023-10-04 17:15:08,979 WARNING [train.py:1204] (2/4) Exclude cut with ID _21783_210_11357_1_1531742458730_3762679_89-217253-0_sp1.1 from training. Duration: 0.97275 2023-10-04 17:15:09,000 WARNING [train.py:1204] (2/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_453-282588-0_sp1.1 from training. Duration: 0.8909375 2023-10-04 17:15:10,412 WARNING [train.py:1204] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_88-341718-0 from training. Duration: 0.66 2023-10-04 17:15:12,353 WARNING [train.py:1204] (2/4) Exclude cut with ID _43552_210_8913_1_1533726031059_3523429_230-3521-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 17:15:12,680 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.nonlin_attention.balancer.prob, batch_count=1731546.6666666667, ans=0.125 2023-10-04 17:15:15,346 WARNING [train.py:1204] (2/4) Exclude cut with ID _9679_210_7497_1_1529129212906_4363929_512-313889-0 from training. Duration: 0.81 2023-10-04 17:15:16,781 WARNING [train.py:1204] (2/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_194-252800-0_sp1.1 from training. Duration: 0.8818125 2023-10-04 17:15:16,820 WARNING [train.py:1204] (2/4) Exclude cut with ID _55209_210_1531_1_1534675828996_4597160_239-293541-0_sp1.1 from training. Duration: 0.963625 2023-10-04 17:15:18,109 WARNING [train.py:1204] (2/4) Exclude cut with ID _35799_210_5854_1_1533263487887_3529071_15-325131-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 17:15:24,958 WARNING [train.py:1204] (2/4) Exclude cut with ID _14100_210_8341_1_1530529320613_3678069_279-55760-0 from training. Duration: 0.93 2023-10-04 17:15:29,639 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_489-230703-0_sp1.1 from training. Duration: 0.77275 2023-10-04 17:15:31,061 WARNING [train.py:1204] (2/4) Exclude cut with ID _6694_210_4599_1_1527852637490_3965529_144-185051-0 from training. Duration: 0.96 2023-10-04 17:15:31,127 WARNING [train.py:1204] (2/4) Exclude cut with ID _28461_210_2253_1_1532771775189_3041299_1-228155-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 17:15:33,880 WARNING [train.py:1204] (2/4) Exclude cut with ID _32704_210_8954_1_1533017488840_6747370_686-252323-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 17:15:33,883 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_40896_210_15710_1_1533433956_7100802_1075-294633-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 17:15:35,125 WARNING [train.py:1204] (2/4) Exclude cut with ID _22083_210_8254_1_1531702862439_3678970_30-92879-0_sp1.1 from training. Duration: 0.97275 2023-10-04 17:15:35,212 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9245W0121-616888 from training. Duration: 0.896 2023-10-04 17:15:35,215 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6786_210_1388_1_1527839699_4194495_378-46070-0 from training. Duration: 0.72 2023-10-04 17:15:41,622 WARNING [train.py:1204] (2/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_317-175062-0 from training. Duration: 0.82 2023-10-04 17:15:44,957 WARNING [train.py:1204] (2/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_238-89390-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 17:15:46,416 WARNING [train.py:1204] (2/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_524-310811-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 17:15:47,905 WARNING [train.py:1204] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_307-158462-0_sp0.9 from training. Duration: 0.7666875 2023-10-04 17:15:47,905 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9309W0191-640318 from training. Duration: 0.512 2023-10-04 17:15:47,912 WARNING [train.py:1204] (2/4) Exclude cut with ID _55153_210_2253_1_1534384786677_3162096_100-204617-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 17:15:50,676 WARNING [train.py:1204] (2/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_112-149213-0_sp1.1 from training. Duration: 0.863625 2023-10-04 17:15:53,418 WARNING [train.py:1204] (2/4) Exclude cut with ID _67831_210_12392_1_1535612464202_3092612_346-2148-0_sp1.1 from training. Duration: 0.8454375 2023-10-04 17:15:54,825 WARNING [train.py:1204] (2/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_51-117576-0_sp0.9 from training. Duration: 0.4444375 2023-10-04 17:15:54,847 WARNING [train.py:1204] (2/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_300-27235-0 from training. Duration: 0.87 2023-10-04 17:15:56,225 WARNING [train.py:1204] (2/4) Exclude cut with ID _40399_210_4353_1_1533305658194_3279406_195-116825-0_sp1.1 from training. Duration: 0.97275 2023-10-04 17:15:56,263 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7002_210_1794_1_1527939683_5157286_163-87552-0_sp0.9 from training. Duration: 0.9555625 2023-10-04 17:15:57,650 WARNING [train.py:1204] (2/4) Exclude cut with ID _19514_210_9259_1_1531530071330_5084230_560-237597-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 17:15:57,725 WARNING [train.py:1204] (2/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_41-234353-0_sp1.1 from training. Duration: 0.6181875 2023-10-04 17:15:57,742 WARNING [train.py:1204] (2/4) Exclude cut with ID _6274_210_2084_1_1527937583837_7494020_769-168302-0 from training. Duration: 0.71 2023-10-04 17:16:00,923 WARNING [train.py:1204] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_574-45541-0 from training. Duration: 0.65 2023-10-04 17:16:03,710 WARNING [train.py:1204] (2/4) Exclude cut with ID _50310_210_15488_1_1534068043345_3641080_262-67120-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 17:16:05,153 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_53071_210_16554_1_1534493434_1276796_35-11927-0_sp1.1 from training. Duration: 0.963625 2023-10-04 17:16:05,155 WARNING [train.py:1204] (2/4) Exclude cut with ID _3992_210_1747_1_1526644816031_3731060_107-251434-0 from training. Duration: 0.99 2023-10-04 17:16:06,903 WARNING [train.py:1204] (2/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_412-54488-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 17:16:08,312 WARNING [train.py:1204] (2/4) Exclude cut with ID _18516_210_9206_1_1531704653206_3692406_77-258490-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 17:16:09,750 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_40047_210_2290_1_1533290519_2374311_195-165693-0_sp0.9 from training. Duration: 0.988875 2023-10-04 17:16:11,516 WARNING [train.py:1204] (2/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_296-36971-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 17:16:11,586 WARNING [train.py:1204] (2/4) Exclude cut with ID _14970_210_15709_1_1530773543493_1715730_187-20259-0_sp1.1 from training. Duration: 0.8090625 2023-10-04 17:16:16,080 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_53373_210_5399_1_1534229471_4715695_135-305927-0_sp1.1 from training. Duration: 0.936375 2023-10-04 17:16:16,119 WARNING [train.py:1204] (2/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_60-18123-0 from training. Duration: 0.88 2023-10-04 17:16:18,818 INFO [train.py:1046] (2/4) Epoch 49, batch 4800, loss[loss=0.1621, simple_loss=0.2444, pruned_loss=0.03993, over 24057.00 frames. ], tot_loss[loss=0.1523, simple_loss=0.2336, pruned_loss=0.03553, over 4720347.10 frames. ], batch size: 80, lr: 2.06e-03, grad_scale: 32.0 2023-10-04 17:16:18,856 WARNING [train.py:1204] (2/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_166-166990-0 from training. Duration: 0.96 2023-10-04 17:16:18,943 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_5438_210_1794_1_1527336687_2600009_233-58072-0 from training. Duration: 0.77 2023-10-04 17:16:20,412 WARNING [train.py:1204] (2/4) Exclude cut with ID _8833_210_5219_1_1528592665210_7553059_611-80313-0_sp0.9 from training. Duration: 0.711125 2023-10-04 17:16:21,714 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_53555_210_5632_1_1534595220_7232297_118-43239-0_sp1.1 from training. Duration: 0.936375 2023-10-04 17:16:23,224 WARNING [train.py:1204] (2/4) Exclude cut with ID _12369_210_4381_1_1530356078581_4388520_480-13146-0 from training. Duration: 0.85 2023-10-04 17:16:27,349 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_892-28814-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 17:16:27,395 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_24899_210_16554_1_1532046422_3827462_160-21255-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 17:16:31,567 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_154-316450-0_sp1.1 from training. Duration: 0.6909375 2023-10-04 17:16:34,853 WARNING [train.py:1204] (2/4) Exclude cut with ID _30196_210_1664_1_1533708496522_7074140_1225-301232-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 17:16:34,886 WARNING [train.py:1204] (2/4) Exclude cut with ID _28783_210_7943_1_1532671164808_7155479_414-208100-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 17:16:34,936 WARNING [train.py:1204] (2/4) Exclude cut with ID _19023_210_4353_1_1531378892552_5820780_83-254794-0 from training. Duration: 0.86 2023-10-04 17:16:36,304 WARNING [train.py:1204] (2/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_591-83752-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 17:16:38,231 WARNING [train.py:1204] (2/4) Exclude cut with ID _29877_210_6228_1_1532397791106_5276149_266-62291-0_sp1.1 from training. Duration: 0.7909375 2023-10-04 17:16:38,341 WARNING [train.py:1204] (2/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_280-161633-0_sp1.1 from training. Duration: 0.663625 2023-10-04 17:16:40,914 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.792e+02 2.105e+02 2.314e+02 2.580e+02 3.922e+02, threshold=4.627e+02, percent-clipped=0.0 2023-10-04 17:16:43,670 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7543_210_6753_1_1528539468_333764_39-156634-0_sp1.1 from training. Duration: 0.97275 2023-10-04 17:16:45,612 WARNING [train.py:1204] (2/4) Exclude cut with ID _67499_210_17282_1_1535509805220_3692430_505-312248-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 17:16:45,634 WARNING [train.py:1204] (2/4) Exclude cut with ID _5294_210_1747_1_1527249643928_3696179_180-100713-0_sp0.9 from training. Duration: 0.9 2023-10-04 17:16:47,721 WARNING [train.py:1204] (2/4) Exclude cut with ID _26528_210_9070_1_1532570410842_3851310_508-148132-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 17:16:49,001 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_211-339622-0_sp1.1 from training. Duration: 0.4090625 2023-10-04 17:16:49,013 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_40896_210_15710_1_1533433956_7100802_339-138356-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 17:16:50,358 WARNING [train.py:1204] (2/4) Exclude cut with ID _27060_210_5986_1_1532674838922_3522549_211-19933-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 17:16:51,895 WARNING [train.py:1204] (2/4) Exclude cut with ID _60229_210_27767_1_1534838406158_3655350_457-117923-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 17:16:54,609 WARNING [train.py:1204] (2/4) Exclude cut with ID _55429_210_10626_1_1534467588527_7174143_460-141439-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 17:16:56,110 WARNING [train.py:1204] (2/4) Exclude cut with ID _59170_210_6721_1_1534983633960_4203335_263-10780-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 17:16:56,129 WARNING [train.py:1204] (2/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_872-264525-0_sp0.9 from training. Duration: 0.72225 2023-10-04 17:16:57,514 WARNING [train.py:1204] (2/4) Exclude cut with ID _7624_210_1531_1_1528109802198_4933101_418-349834-0_sp0.9 from training. Duration: 0.6666875 2023-10-04 17:16:58,877 WARNING [train.py:1204] (2/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_27-53998-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 17:17:00,286 WARNING [train.py:1204] (2/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_202-183922-0 from training. Duration: 0.85 2023-10-04 17:17:00,301 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7002_210_1794_1_1527939683_5157286_1-281264-0 from training. Duration: 0.44 2023-10-04 17:17:01,642 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_53753_210_7234_1_1534320003_3594148_493-9314-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 17:17:01,658 WARNING [train.py:1204] (2/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_217-95056-0_sp1.1 from training. Duration: 0.8909375 2023-10-04 17:17:01,864 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.balancer1.prob, batch_count=1732080.0, ans=0.125 2023-10-04 17:17:03,015 WARNING [train.py:1204] (2/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_406-45086-0_sp1.1 from training. Duration: 0.72725 2023-10-04 17:17:03,021 WARNING [train.py:1204] (2/4) Exclude cut with ID _31649_210_6228_1_1532584608063_4091078_505-213821-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 17:17:03,028 WARNING [train.py:1204] (2/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_317-175062-0_sp0.9 from training. Duration: 0.911125 2023-10-04 17:17:03,177 WARNING [train.py:1204] (2/4) Exclude cut with ID _5444_210_2151_1_1527418808464_5312210_726-27165-0_sp1.1 from training. Duration: 0.6090625 2023-10-04 17:17:04,425 WARNING [train.py:1204] (2/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_494-170437-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 17:17:07,824 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_474-64054-0_sp1.1 from training. Duration: 0.936375 2023-10-04 17:17:10,870 WARNING [train.py:1204] (2/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_521-7556-0_sp1.1 from training. Duration: 0.97275 2023-10-04 17:17:13,468 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_269-348505-0_sp1.1 from training. Duration: 0.92725 2023-10-04 17:17:18,298 WARNING [train.py:1204] (2/4) Exclude cut with ID _21878_210_3525_1_1531738772002_7264330_559-184862-0 from training. Duration: 0.94 2023-10-04 17:17:18,335 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_68702_210_23689_1_1535799577_3420068_92-76040-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 17:17:18,367 WARNING [train.py:1204] (2/4) Exclude cut with ID _22826_210_9994_1_1531908201522_3279169_227-102052-0_sp1.1 from training. Duration: 0.97275 2023-10-04 17:17:19,651 WARNING [train.py:1204] (2/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_718-250907-0_sp0.9 from training. Duration: 0.7666875 2023-10-04 17:17:19,716 WARNING [train.py:1204] (2/4) Exclude cut with ID _14069_210_5632_1_1530512692033_4803390_352-143891-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 17:17:21,753 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.attention_skip_rate, batch_count=1732146.6666666667, ans=0.0 2023-10-04 17:17:24,299 WARNING [train.py:1204] (2/4) Exclude cut with ID _6267_210_2084_1_1527764658526_3894730_65-106602-0_sp1.1 from training. Duration: 0.936375 2023-10-04 17:17:25,635 WARNING [train.py:1204] (2/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_202-183922-0_sp0.9 from training. Duration: 0.9444375 2023-10-04 17:17:25,648 WARNING [train.py:1204] (2/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_465-268827-0_sp1.1 from training. Duration: 0.97275 2023-10-04 17:17:25,901 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.self_attn_weights.pos_emb_skip_rate, batch_count=1732146.6666666667, ans=0.0 2023-10-04 17:17:26,951 WARNING [train.py:1204] (2/4) Exclude cut with ID _12369_210_4381_1_1530356078581_4388520_58-29064-0_sp1.1 from training. Duration: 0.9 2023-10-04 17:17:26,992 WARNING [train.py:1204] (2/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_1-99680-0_sp0.9 from training. Duration: 0.7444375 2023-10-04 17:17:28,379 WARNING [train.py:1204] (2/4) Exclude cut with ID _13253_210_1925_1_1530757775819_3577873_191-144977-0_sp1.1 from training. Duration: 0.7090625 2023-10-04 17:17:31,439 WARNING [train.py:1204] (2/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_276-112684-0_sp1.1 from training. Duration: 0.92725 2023-10-04 17:17:31,453 WARNING [train.py:1204] (2/4) Exclude cut with ID _64639_210_5816_1_1535367619437_3528000_358-411-0_sp1.1 from training. Duration: 0.97275 2023-10-04 17:17:31,460 WARNING [train.py:1204] (2/4) Exclude cut with ID _31649_210_6228_1_1532584608063_4091078_106-21199-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 17:17:32,801 INFO [train.py:1046] (2/4) Epoch 49, batch 4850, loss[loss=0.1468, simple_loss=0.2246, pruned_loss=0.03452, over 23413.00 frames. ], tot_loss[loss=0.1525, simple_loss=0.2339, pruned_loss=0.03553, over 4726143.02 frames. ], batch size: 119, lr: 2.06e-03, grad_scale: 16.0 2023-10-04 17:17:32,897 WARNING [train.py:1204] (2/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_901-79438-0 from training. Duration: 0.94 2023-10-04 17:17:34,364 WARNING [train.py:1204] (2/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_718-250907-0 from training. Duration: 0.69 2023-10-04 17:17:34,370 WARNING [train.py:1204] (2/4) Exclude cut with ID _5287_210_2084_1_1527418883040_3878670_363-194161-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 17:17:34,373 WARNING [train.py:1204] (2/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_586-321321-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 17:17:35,812 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_68900_210_2087_1_1535713157_3708529_164-292661-0_sp1.1 from training. Duration: 0.963625 2023-10-04 17:17:35,813 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_26433_210_12108_1_1532157158_4056480_114-144878-0_sp1.1 from training. Duration: 0.97275 2023-10-04 17:17:39,130 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_15517_210_6753_1_1530954008_3487336_96-341874-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 17:17:42,470 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.4.encoder.layers.0.whiten, num_groups=1, num_channels=384, metric=5.67 vs. limit=12.0 2023-10-04 17:17:46,514 WARNING [train.py:1204] (2/4) Exclude cut with ID _14268_210_9206_1_1530622771285_4714099_637-86630-0 from training. Duration: 0.51 2023-10-04 17:17:47,950 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_28211_210_6388_1_1532861506_4523224_121-320092-0_sp1.1 from training. Duration: 0.92725 2023-10-04 17:17:54,012 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_141-89113-0_sp0.9 from training. Duration: 0.9555625 2023-10-04 17:17:55,379 WARNING [train.py:1204] (2/4) Exclude cut with ID _6789_210_1534_1_1527857937218_3118950_311-61262-0_sp1.1 from training. Duration: 0.5909375 2023-10-04 17:17:55,408 WARNING [train.py:1204] (2/4) Exclude cut with ID _58480_210_12392_1_1534811121111_3646160_389-200461-0_sp1.1 from training. Duration: 0.97275 2023-10-04 17:17:56,950 WARNING [train.py:1204] (2/4) Exclude cut with ID _41326_210_5854_1_1533778076509_3492080_28-156553-0_sp1.1 from training. Duration: 0.92725 2023-10-04 17:17:58,344 WARNING [train.py:1204] (2/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_577-287150-0_sp1.1 from training. Duration: 0.7454375 2023-10-04 17:17:58,437 WARNING [train.py:1204] (2/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_40-129756-0_sp1.1 from training. Duration: 0.736375 2023-10-04 17:17:59,762 WARNING [train.py:1204] (2/4) Exclude cut with ID _60113_210_16271_1_1535164212481_4386079_490-280290-0 from training. Duration: 0.98 2023-10-04 17:18:02,590 WARNING [train.py:1204] (2/4) Exclude cut with ID _58059_210_5854_1_1534901471149_3164808_34-39905-0_sp1.1 from training. Duration: 0.963625 2023-10-04 17:18:04,027 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_791-197891-0_sp1.1 from training. Duration: 0.763625 2023-10-04 17:18:04,049 WARNING [train.py:1204] (2/4) Exclude cut with ID _6789_210_1534_1_1527857937218_3118950_262-48853-0_sp1.1 from training. Duration: 0.5090625 2023-10-04 17:18:05,407 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_2655_210_2087_1_1525688959_3897400_52-279899-0_sp0.9 from training. Duration: 0.7666875 2023-10-04 17:18:05,411 WARNING [train.py:1204] (2/4) Exclude cut with ID _14884_210_2252_1_1530707459689_5561250_114-201399-0 from training. Duration: 0.76 2023-10-04 17:18:08,189 WARNING [train.py:1204] (2/4) Exclude cut with ID _19023_210_4353_1_1531378892552_5820780_83-254794-0_sp0.9 from training. Duration: 0.9555625 2023-10-04 17:18:08,205 WARNING [train.py:1204] (2/4) Exclude cut with ID _53489_210_6228_1_1534298357169_3835970_10-297919-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 17:18:10,463 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.self_attn_weights.pos_emb_skip_rate, batch_count=1732346.6666666667, ans=0.0 2023-10-04 17:18:13,541 WARNING [train.py:1204] (2/4) Exclude cut with ID _44585_210_18628_1_1533605350387_4518199_418-3055-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 17:18:13,560 WARNING [train.py:1204] (2/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_310-109461-0 from training. Duration: 0.8 2023-10-04 17:18:13,596 WARNING [train.py:1204] (2/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_235-262124-0 from training. Duration: 0.67 2023-10-04 17:18:13,676 WARNING [train.py:1204] (2/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_195-167011-0_sp1.1 from training. Duration: 0.6818125 2023-10-04 17:18:21,865 WARNING [train.py:1204] (2/4) Exclude cut with ID _7852_210_5219_1_1528286471316_8139400_188-55162-0_sp1.1 from training. Duration: 0.936375 2023-10-04 17:18:21,915 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_35361_210_4238_1_1533262810_7793476_675-87012-0 from training. Duration: 0.89 2023-10-04 17:18:24,504 WARNING [train.py:1204] (2/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_465-81427-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 17:18:24,512 WARNING [train.py:1204] (2/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_597-154640-0_sp1.1 from training. Duration: 0.8181875 2023-10-04 17:18:26,010 WARNING [train.py:1204] (2/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_186-309161-0_sp0.9 from training. Duration: 0.6 2023-10-04 17:18:26,163 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.conv_module2.balancer2.min_positive, batch_count=1732413.3333333333, ans=0.05 2023-10-04 17:18:27,401 WARNING [train.py:1204] (2/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_546-119312-0 from training. Duration: 0.99 2023-10-04 17:18:27,405 WARNING [train.py:1204] (2/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_330-240060-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 17:18:28,796 WARNING [train.py:1204] (2/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_477-255782-0 from training. Duration: 0.82 2023-10-04 17:18:28,824 WARNING [train.py:1204] (2/4) Exclude cut with ID _14970_210_15709_1_1530773543493_1715730_150-248939-0_sp1.1 from training. Duration: 0.97275 2023-10-04 17:18:30,212 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_556-206615-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 17:18:30,304 WARNING [train.py:1204] (2/4) Exclude cut with ID _3465_210_3525_1_1526175667060_3574049_308-231735-0 from training. Duration: 0.73 2023-10-04 17:18:33,305 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.attention_skip_rate, batch_count=1732480.0, ans=0.0 2023-10-04 17:18:38,452 WARNING [train.py:1204] (2/4) Exclude cut with ID _27485_210_4916_1_1532739191677_3668269_66-236571-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 17:18:38,684 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.ff3_skip_rate, batch_count=1732480.0, ans=0.0 2023-10-04 17:18:42,982 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.3.encoder.layers.0.self_attn_weights.whiten_keys, num_groups=8, num_channels=256, metric=4.29 vs. limit=6.0 2023-10-04 17:18:44,930 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_2221_210_1988_1_1525597051_4935541_415-296882-0_sp0.9 from training. Duration: 0.8555625 2023-10-04 17:18:44,938 WARNING [train.py:1204] (2/4) Exclude cut with ID _64521_210_14699_1_1535243427259_3815328_231-78630-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 17:18:45,247 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.conv_module1.balancer2.prob, batch_count=1732480.0, ans=0.125 2023-10-04 17:18:48,040 INFO [train.py:1046] (2/4) Epoch 49, batch 4900, loss[loss=0.1337, simple_loss=0.1996, pruned_loss=0.03389, over 22536.00 frames. ], tot_loss[loss=0.1525, simple_loss=0.233, pruned_loss=0.03598, over 4700472.75 frames. ], batch size: 322, lr: 2.06e-03, grad_scale: 16.0 2023-10-04 17:18:50,814 WARNING [train.py:1204] (2/4) Exclude cut with ID _13942_210_9988_1_1530604906036_3733360_499-55676-0 from training. Duration: 0.62 2023-10-04 17:18:50,816 WARNING [train.py:1204] (2/4) Exclude cut with ID _49376_210_5986_1_1533985278732_3370740_301-154763-0_sp1.1 from training. Duration: 0.8909375 2023-10-04 17:18:54,958 WARNING [train.py:1204] (2/4) Exclude cut with ID _27485_210_4916_1_1532739191677_3668269_255-216698-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 17:18:55,087 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.1.balancer1.prob, batch_count=1732546.6666666667, ans=0.125 2023-10-04 17:18:55,399 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.5.encoder.layers.1.nonlin_attention.whiten1, num_groups=1, num_channels=192, metric=2.74 vs. limit=10.0 2023-10-04 17:18:56,360 WARNING [train.py:1204] (2/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_137-334143-0_sp1.1 from training. Duration: 0.97275 2023-10-04 17:18:56,387 WARNING [train.py:1204] (2/4) Exclude cut with ID _6456_210_1534_1_1527681634212_3181049_215-309359-0_sp1.1 from training. Duration: 0.8 2023-10-04 17:18:57,119 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.2.encoder.layers.2.whiten, num_groups=1, num_channels=384, metric=3.41 vs. limit=12.0 2023-10-04 17:18:59,213 WARNING [train.py:1204] (2/4) Exclude cut with ID _65640_210_6310_1_1535368201445_3173250_209-13008-0 from training. Duration: 0.99 2023-10-04 17:19:03,388 WARNING [train.py:1204] (2/4) Exclude cut with ID _12369_210_4381_1_1530356078581_4388520_58-29064-0 from training. Duration: 0.99 2023-10-04 17:19:06,157 WARNING [train.py:1204] (2/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_410-287857-0 from training. Duration: 0.93 2023-10-04 17:19:07,461 WARNING [train.py:1204] (2/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_153-153537-0 from training. Duration: 0.91 2023-10-04 17:19:07,494 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7544_210_6753_1_1528587722_3576686_172-171750-0_sp0.9 from training. Duration: 0.72225 2023-10-04 17:19:07,537 WARNING [train.py:1204] (2/4) Exclude cut with ID _64521_210_14699_1_1535243427259_3815328_464-183853-0_sp1.1 from training. Duration: 0.97275 2023-10-04 17:19:07,551 WARNING [train.py:1204] (2/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_385-210308-0_sp1.1 from training. Duration: 0.963625 2023-10-04 17:19:07,558 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_46013_210_7234_1_1533776362_3744032_354-261865-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 17:19:07,565 WARNING [train.py:1204] (2/4) Exclude cut with ID _3067_210_2667_1_1526212974640_4503168_250-285937-0_sp1.1 from training. Duration: 0.72725 2023-10-04 17:19:07,796 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.attention_skip_rate, batch_count=1732613.3333333333, ans=0.0 2023-10-04 17:19:08,529 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.0.layers.1.self_attn_weights.whiten_keys, num_groups=4, num_channels=128, metric=2.44 vs. limit=6.0 2023-10-04 17:19:09,392 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_40036_210_13405_1_1533648647_1315388_48-146016-0 from training. Duration: 0.93 2023-10-04 17:19:11,282 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.726e+02 2.088e+02 2.355e+02 2.729e+02 5.318e+02, threshold=4.711e+02, percent-clipped=2.0 2023-10-04 17:19:12,790 WARNING [train.py:1204] (2/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_280-161633-0 from training. Duration: 0.73 2023-10-04 17:19:14,084 WARNING [train.py:1204] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_308-161067-0_sp0.9 from training. Duration: 0.7333125 2023-10-04 17:19:16,151 WARNING [train.py:1204] (2/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_68-212801-0_sp0.9 from training. Duration: 0.811125 2023-10-04 17:19:17,420 WARNING [train.py:1204] (2/4) Exclude cut with ID _6789_210_1534_1_1527857937218_3118950_311-61262-0_sp0.9 from training. Duration: 0.72225 2023-10-04 17:19:20,180 WARNING [train.py:1204] (2/4) Exclude cut with ID _14632_210_9988_1_1530691279392_3923200_102-204477-0_sp1.1 from training. Duration: 0.8818125 2023-10-04 17:19:21,566 WARNING [train.py:1204] (2/4) Exclude cut with ID _65242_210_18628_1_1535369393717_3823149_290-117517-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 17:19:21,667 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_69630_210_7234_1_1535788670_910989_35-337140-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 17:19:21,682 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_5618_210_1874_1_1527311008_5851_1-214501-0 from training. Duration: 0.689 2023-10-04 17:19:23,620 WARNING [train.py:1204] (2/4) Exclude cut with ID _8064_210_5399_1_1528372299291_4282973_168-50337-0_sp0.9 from training. Duration: 0.9333125 2023-10-04 17:19:24,919 WARNING [train.py:1204] (2/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_185-14438-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 17:19:24,941 WARNING [train.py:1204] (2/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_61-327062-0 from training. Duration: 0.81 2023-10-04 17:19:24,947 WARNING [train.py:1204] (2/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_98-741-0 from training. Duration: 0.8 2023-10-04 17:19:29,132 WARNING [train.py:1204] (2/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_312-204798-0 from training. Duration: 0.93 2023-10-04 17:19:30,491 WARNING [train.py:1204] (2/4) Exclude cut with ID _14998_210_5219_1_1530705582080_7474970_787-294416-0_sp0.9 from training. Duration: 0.911125 2023-10-04 17:19:31,908 WARNING [train.py:1204] (2/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_140-181176-0_sp1.1 from training. Duration: 0.836375 2023-10-04 17:19:31,951 WARNING [train.py:1204] (2/4) Exclude cut with ID _14687_210_5854_1_1530703838259_3664889_115-239046-0_sp1.1 from training. Duration: 0.6090625 2023-10-04 17:19:33,272 WARNING [train.py:1204] (2/4) Exclude cut with ID _56423_210_5854_1_1534896061049_3165969_370-230614-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 17:19:33,318 WARNING [train.py:1204] (2/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_406-275649-0_sp0.9 from training. Duration: 0.4666875 2023-10-04 17:19:33,329 WARNING [train.py:1204] (2/4) Exclude cut with ID _15150_210_8341_1_1530752411803_7256384_22-138274-0_sp1.1 from training. Duration: 0.87275 2023-10-04 17:19:33,373 WARNING [train.py:1204] (2/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_125-125818-0 from training. Duration: 0.96 2023-10-04 17:19:36,517 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_4399_210_1874_1_1526705485_2400757_220-47974-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 17:19:37,924 WARNING [train.py:1204] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_308-161067-0_sp1.1 from training. Duration: 0.6 2023-10-04 17:19:39,388 WARNING [train.py:1204] (2/4) Exclude cut with ID _53489_210_6228_1_1534298357169_3835970_102-57645-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 17:19:41,274 WARNING [train.py:1204] (2/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_178-177582-0 from training. Duration: 0.74 2023-10-04 17:19:41,334 WARNING [train.py:1204] (2/4) Exclude cut with ID _14349_210_8341_1_1530615710818_3442470_170-336541-0_sp1.1 from training. Duration: 0.7818125 2023-10-04 17:19:42,608 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_521-214276-0_sp0.9 from training. Duration: 0.488875 2023-10-04 17:19:42,661 WARNING [train.py:1204] (2/4) Exclude cut with ID _66070_210_7640_1_1535441393096_7805310_489-292631-0 from training. Duration: 0.79 2023-10-04 17:19:50,451 WARNING [train.py:1204] (2/4) Exclude cut with ID _14069_210_5632_1_1530512692033_4803390_438-340023-0_sp1.1 from training. Duration: 0.936375 2023-10-04 17:19:50,692 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=1732813.3333333333, ans=0.1 2023-10-04 17:19:51,871 WARNING [train.py:1204] (2/4) Exclude cut with ID _7852_210_5219_1_1528286471316_8139400_242-105336-0_sp1.1 from training. Duration: 0.6545625 2023-10-04 17:19:51,977 WARNING [train.py:1204] (2/4) Exclude cut with ID _3867_210_1883_1_1526641200575_4475830_272-215563-0 from training. Duration: 0.57 2023-10-04 17:19:51,984 WARNING [train.py:1204] (2/4) Exclude cut with ID _14368_210_4220_1_1530532714870_3595529_143-117421-0_sp0.9 from training. Duration: 0.6444375 2023-10-04 17:19:51,991 WARNING [train.py:1204] (2/4) Exclude cut with ID _9235_210_5219_1_1528891154253_8046120_30-326681-0_sp1.1 from training. Duration: 0.7545625 2023-10-04 17:19:54,796 WARNING [train.py:1204] (2/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_538-218448-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 17:19:58,811 WARNING [train.py:1204] (2/4) Exclude cut with ID _33640_210_2655_1_1533117605479_4855469_60-192970-0_sp1.1 from training. Duration: 0.963625 2023-10-04 17:19:58,819 WARNING [train.py:1204] (2/4) Exclude cut with ID _65585_210_12210_1_1535450451584_3764098_112-294301-0_sp1.1 from training. Duration: 0.8 2023-10-04 17:19:58,860 WARNING [train.py:1204] (2/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_643-37412-0_sp1.1 from training. Duration: 0.936375 2023-10-04 17:20:00,154 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_9302_210_2550_1_1528889962_4655416_638-301620-0_sp1.1 from training. Duration: 0.42725 2023-10-04 17:20:01,453 INFO [train.py:1046] (2/4) Epoch 49, batch 4950, loss[loss=0.1351, simple_loss=0.2106, pruned_loss=0.02974, over 24326.00 frames. ], tot_loss[loss=0.1515, simple_loss=0.2317, pruned_loss=0.03563, over 4701551.35 frames. ], batch size: 56, lr: 2.06e-03, grad_scale: 16.0 2023-10-04 17:20:01,542 WARNING [train.py:1204] (2/4) Exclude cut with ID _3867_210_1883_1_1526641200575_4475830_272-215563-0_sp1.1 from training. Duration: 0.5181875 2023-10-04 17:20:03,153 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_53679_210_4852_1_1534556726_4186141_358-154143-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 17:20:03,172 WARNING [train.py:1204] (2/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_841-341679-0_sp0.9 from training. Duration: 0.6444375 2023-10-04 17:20:03,426 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.nonlin_attention.balancer.prob, batch_count=1732880.0, ans=0.125 2023-10-04 17:20:07,648 WARNING [train.py:1204] (2/4) Exclude cut with ID _7160_210_1876_1_1527940923231_3703799_302-150728-0 from training. Duration: 0.78 2023-10-04 17:20:07,690 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_125-203105-0 from training. Duration: 0.8 2023-10-04 17:20:07,711 WARNING [train.py:1204] (2/4) Exclude cut with ID _5585_210_2667_1_1527418813324_8007340_89-332072-0_sp0.9 from training. Duration: 0.711125 2023-10-04 17:20:08,385 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.2.encoder.layers.1.self_attn1.whiten, num_groups=1, num_channels=384, metric=15.40 vs. limit=22.5 2023-10-04 17:20:09,562 WARNING [train.py:1204] (2/4) Exclude cut with ID _3992_210_1747_1_1526644816031_3731060_36-147855-0 from training. Duration: 0.78 2023-10-04 17:20:09,579 WARNING [train.py:1204] (2/4) Exclude cut with ID _59256_210_5986_1_1535076085997_3589120_172-239045-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 17:20:09,587 WARNING [train.py:1204] (2/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_472-216663-0_sp1.1 from training. Duration: 0.836375 2023-10-04 17:20:10,922 WARNING [train.py:1204] (2/4) Exclude cut with ID _36512_210_6987_1_1533033143998_3126069_181-306500-0_sp1.1 from training. Duration: 0.863625 2023-10-04 17:20:10,938 WARNING [train.py:1204] (2/4) Exclude cut with ID _56524_210_9642_1_1534572011779_3619170_492-95008-0_sp1.1 from training. Duration: 0.97275 2023-10-04 17:20:12,376 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_33348_210_5632_1_1533086912_3324030_119-90326-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 17:20:13,605 WARNING [train.py:1204] (2/4) Exclude cut with ID _14368_210_4220_1_1530532714870_3595529_231-137892-0_sp1.1 from training. Duration: 0.9 2023-10-04 17:20:14,998 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8453_210_4211_1_1528502940_8562730_596-306820-0_sp1.1 from training. Duration: 0.8818125 2023-10-04 17:20:16,508 WARNING [train.py:1204] (2/4) Exclude cut with ID _27052_210_5986_1_1532329197187_3585009_50-242750-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 17:20:17,965 WARNING [train.py:1204] (2/4) Exclude cut with ID _13913_210_9994_1_1530579615187_3383897_358-98435-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 17:20:17,988 WARNING [train.py:1204] (2/4) Exclude cut with ID _27013_210_1389_1_1532437173240_3252649_399-229778-0_sp1.1 from training. Duration: 0.963625 2023-10-04 17:20:21,900 WARNING [train.py:1204] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_185-174805-0_sp0.9 from training. Duration: 0.7555625 2023-10-04 17:20:23,469 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.balancer1.prob, batch_count=1732946.6666666667, ans=0.125 2023-10-04 17:20:25,373 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.0.layers.1.whiten, num_groups=1, num_channels=192, metric=3.89 vs. limit=12.0 2023-10-04 17:20:27,331 WARNING [train.py:1204] (2/4) Exclude cut with ID _65585_210_12210_1_1535450451584_3764098_196-221014-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 17:20:28,659 WARNING [train.py:1204] (2/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_202-293523-0_sp1.1 from training. Duration: 0.6545625 2023-10-04 17:20:30,073 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8275_210_4198_1_1528455320_3892571_213-127634-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 17:20:31,346 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_40896_210_15710_1_1533433956_7100802_921-231678-0_sp1.1 from training. Duration: 0.97275 2023-10-04 17:20:31,555 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.conv_module2.balancer2.min_positive, batch_count=1733013.3333333333, ans=0.05 2023-10-04 17:20:31,988 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.3.encoder.layers.2.conv_module2.whiten, num_groups=1, num_channels=512, metric=7.56 vs. limit=15.0 2023-10-04 17:20:32,693 WARNING [train.py:1204] (2/4) Exclude cut with ID _38020_210_5986_1_1533112252259_7006089_493-102745-0_sp1.1 from training. Duration: 0.8909375 2023-10-04 17:20:34,023 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6846_210_6753_1_1527850767_4295158_2-315073-0 from training. Duration: 0.88 2023-10-04 17:20:35,324 WARNING [train.py:1204] (2/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_249-281349-0 from training. Duration: 0.85 2023-10-04 17:20:36,722 WARNING [train.py:1204] (2/4) Exclude cut with ID _18513_210_9206_1_1531618215223_3862620_314-83429-0_sp1.1 from training. Duration: 0.97275 2023-10-04 17:20:38,634 WARNING [train.py:1204] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_215-202973-0_sp0.9 from training. Duration: 0.97775 2023-10-04 17:20:38,643 WARNING [train.py:1204] (2/4) Exclude cut with ID _7719_210_3525_1_1528456153163_4296960_88-250259-0_sp1.1 from training. Duration: 0.763625 2023-10-04 17:20:40,492 WARNING [train.py:1204] (2/4) Exclude cut with ID _7606_210_4915_1_1528894253277_5080690_301-10732-0_sp1.1 from training. Duration: 0.736375 2023-10-04 17:20:41,759 WARNING [train.py:1204] (2/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_818-11907-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 17:20:41,847 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_5959_210_1794_1_1527400400_3445726_412-44052-0_sp1.1 from training. Duration: 0.7 2023-10-04 17:20:43,284 WARNING [train.py:1204] (2/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_6-279195-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 17:20:44,802 WARNING [train.py:1204] (2/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_252-216674-0_sp0.9 from training. Duration: 0.6 2023-10-04 17:20:46,231 WARNING [train.py:1204] (2/4) Exclude cut with ID _14368_210_4220_1_1530532714870_3595529_144-3287-0_sp1.1 from training. Duration: 0.6090625 2023-10-04 17:20:47,762 WARNING [train.py:1204] (2/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_342-3879-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 17:20:47,977 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.feed_forward1.out_proj.dropout_p, batch_count=1733080.0, ans=0.1 2023-10-04 17:20:49,439 WARNING [train.py:1204] (2/4) Exclude cut with ID _33640_210_2655_1_1533117605479_4855469_584-65630-0_sp1.1 from training. Duration: 0.97275 2023-10-04 17:20:49,924 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.5.encoder.layers.1.conv_module1.whiten, num_groups=1, num_channels=256, metric=4.59 vs. limit=15.0 2023-10-04 17:20:51,038 WARNING [train.py:1204] (2/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_37-189262-0 from training. Duration: 0.98 2023-10-04 17:20:51,070 WARNING [train.py:1204] (2/4) Exclude cut with ID _9389_210_1664_1_1529217337301_5289269_379-128794-0_sp0.9 from training. Duration: 0.9444375 2023-10-04 17:20:52,928 WARNING [train.py:1204] (2/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_119-207162-0_sp1.1 from training. Duration: 0.6454375 2023-10-04 17:20:57,274 WARNING [train.py:1204] (2/4) Exclude cut with ID _2941_210_2577_1_1526199673150_2489759_200-333643-0_sp1.1 from training. Duration: 0.936375 2023-10-04 17:20:59,915 WARNING [train.py:1204] (2/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_479-125611-0_sp1.1 from training. Duration: 0.87275 2023-10-04 17:20:59,925 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_3475_210_2028_1_1526193578_3622569_64-297072-0_sp1.1 from training. Duration: 0.82725 2023-10-04 17:20:59,956 WARNING [train.py:1204] (2/4) Exclude cut with ID _53565_210_10635_1_1534337218335_3820259_139-346291-0_sp1.1 from training. Duration: 0.97275 2023-10-04 17:21:00,012 WARNING [train.py:1204] (2/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_588-125165-0_sp1.1 from training. Duration: 0.7545625 2023-10-04 17:21:01,388 WARNING [train.py:1204] (2/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_938-205135-0_sp1.1 from training. Duration: 0.7909375 2023-10-04 17:21:02,261 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.1.encoder.layers.1.feed_forward1.out_whiten, num_groups=1, num_channels=256, metric=5.32 vs. limit=15.0 2023-10-04 17:21:04,153 WARNING [train.py:1204] (2/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_216-48707-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 17:21:04,207 WARNING [train.py:1204] (2/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_477-255782-0_sp1.1 from training. Duration: 0.7454375 2023-10-04 17:21:04,247 WARNING [train.py:1204] (2/4) Exclude cut with ID _16909_210_5724_1_1531292079912_4118919_583-92842-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 17:21:05,547 WARNING [train.py:1204] (2/4) Exclude cut with ID _60860_210_8254_1_1534896015875_3557169_245-256666-0 from training. Duration: 0.97 2023-10-04 17:21:08,864 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_45399_210_4852_1_1533624725_7579737_349-116173-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 17:21:14,849 WARNING [train.py:1204] (2/4) Exclude cut with ID _8699_210_2069_1_1528630346831_3960409_155-39766-0 from training. Duration: 0.78 2023-10-04 17:21:14,858 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_86-16881-0_sp1.1 from training. Duration: 0.52725 2023-10-04 17:21:16,190 INFO [train.py:1046] (2/4) Epoch 49, batch 5000, loss[loss=0.1747, simple_loss=0.2618, pruned_loss=0.04379, over 23719.00 frames. ], tot_loss[loss=0.1518, simple_loss=0.2319, pruned_loss=0.03581, over 4707882.01 frames. ], batch size: 85, lr: 2.06e-03, grad_scale: 16.0 2023-10-04 17:21:21,012 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_14056_210_4163_1_1530604483_7572827_191-159384-0_sp1.1 from training. Duration: 0.97275 2023-10-04 17:21:21,026 WARNING [train.py:1204] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_307-158462-0_sp1.1 from training. Duration: 0.62725 2023-10-04 17:21:23,757 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7544_210_6753_1_1528591327_1471426_33-346031-0 from training. Duration: 0.61 2023-10-04 17:21:23,848 WARNING [train.py:1204] (2/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_112-149213-0 from training. Duration: 0.95 2023-10-04 17:21:24,091 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.bypass_mid.scale_min, batch_count=1733213.3333333333, ans=0.2 2023-10-04 17:21:25,874 WARNING [train.py:1204] (2/4) Exclude cut with ID _55844_210_2346_1_1534471212569_7625707_925-191342-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 17:21:28,549 WARNING [train.py:1204] (2/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_87-278686-0 from training. Duration: 0.77 2023-10-04 17:21:28,577 WARNING [train.py:1204] (2/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_415-78792-0_sp1.1 from training. Duration: 0.736375 2023-10-04 17:21:28,594 WARNING [train.py:1204] (2/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_107-332398-0_sp0.9 from training. Duration: 0.8666875 2023-10-04 17:21:29,511 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.0.layers.0.feed_forward2.out_whiten, num_groups=1, num_channels=192, metric=5.97 vs. limit=15.0 2023-10-04 17:21:29,980 WARNING [train.py:1204] (2/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_72-251255-0 from training. Duration: 0.94 2023-10-04 17:21:30,020 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_431-227273-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 17:21:31,279 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7544_210_6753_1_1528587000_691468_87-178599-0_sp0.9 from training. Duration: 0.9666875 2023-10-04 17:21:31,364 WARNING [train.py:1204] (2/4) Exclude cut with ID _13916_210_9994_1_1530788341126_3763149_60-191556-0 from training. Duration: 0.61 2023-10-04 17:21:31,366 WARNING [train.py:1204] (2/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_746-164782-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 17:21:31,394 WARNING [train.py:1204] (2/4) Exclude cut with ID _33640_210_2655_1_1533117605479_4855469_596-21066-0_sp1.1 from training. Duration: 0.92725 2023-10-04 17:21:31,602 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.ff2_skip_rate, batch_count=1733280.0, ans=0.0 2023-10-04 17:21:34,160 WARNING [train.py:1204] (2/4) Exclude cut with ID _40402_210_18219_1_1533522324115_2953150_320-14078-0 from training. Duration: 0.95 2023-10-04 17:21:34,194 WARNING [train.py:1204] (2/4) Exclude cut with ID _29877_210_6228_1_1532397791106_5276149_266-62291-0 from training. Duration: 0.87 2023-10-04 17:21:35,524 WARNING [train.py:1204] (2/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_176-56120-0_sp0.9 from training. Duration: 0.92225 2023-10-04 17:21:37,179 WARNING [train.py:1204] (2/4) Exclude cut with ID _6275_210_2084_1_1528110062213_7669049_91-26919-0 from training. Duration: 0.97 2023-10-04 17:21:37,188 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_224-322394-0_sp0.9 from training. Duration: 0.7333125 2023-10-04 17:21:37,244 WARNING [train.py:1204] (2/4) Exclude cut with ID _44982_210_1511_1_1534312820108_3726029_586-297059-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 17:21:38,608 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_196-289637-0_sp1.1 from training. Duration: 0.6818125 2023-10-04 17:21:38,610 WARNING [train.py:1204] (2/4) Exclude cut with ID _18513_210_9206_1_1531618215223_3862620_402-5957-0 from training. Duration: 0.99 2023-10-04 17:21:38,617 WARNING [train.py:1204] (2/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_81-300212-0 from training. Duration: 0.6 2023-10-04 17:21:39,928 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.736e+02 2.001e+02 2.201e+02 2.789e+02 6.311e+02, threshold=4.402e+02, percent-clipped=4.0 2023-10-04 17:21:40,049 WARNING [train.py:1204] (2/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_166-128941-0 from training. Duration: 0.54 2023-10-04 17:21:40,069 WARNING [train.py:1204] (2/4) Exclude cut with ID _58059_210_5854_1_1534901471149_3164808_57-309660-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 17:21:41,354 WARNING [train.py:1204] (2/4) Exclude cut with ID _60621_210_17071_1_1535013053530_3671739_73-155814-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 17:21:46,590 WARNING [train.py:1204] (2/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_726-270780-0 from training. Duration: 0.86 2023-10-04 17:21:46,603 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8853_210_3273_1_1528633899_818968_51-187685-0_sp1.1 from training. Duration: 0.62725 2023-10-04 17:21:46,740 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_315-212794-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 17:21:48,011 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_53373_210_5399_1_1534229471_4715695_314-122795-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 17:21:49,960 WARNING [train.py:1204] (2/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_187-221566-0_sp0.9 from training. Duration: 0.47775 2023-10-04 17:21:52,724 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_12868_210_6753_1_1530702479_3610564_133-564-0 from training. Duration: 0.79 2023-10-04 17:21:54,062 WARNING [train.py:1204] (2/4) Exclude cut with ID _55153_210_2253_1_1534384786677_3162096_255-228344-0_sp1.1 from training. Duration: 0.936375 2023-10-04 17:21:55,442 WARNING [train.py:1204] (2/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_383-165234-0_sp1.1 from training. Duration: 0.8545625 2023-10-04 17:21:58,685 WARNING [train.py:1204] (2/4) Exclude cut with ID IC0677W0056-334889 from training. Duration: 0.768 2023-10-04 17:22:01,843 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_14953_210_2151_1_1530701710_5616165_710-165400-0_sp0.9 from training. Duration: 0.9666875 2023-10-04 17:22:01,971 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder_embed.convnext.layerdrop_rate, batch_count=1733413.3333333333, ans=0.015 2023-10-04 17:22:03,261 WARNING [train.py:1204] (2/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_355-236205-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 17:22:03,274 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_34450_210_9547_1_1533099443_4109050_503-246893-0_sp1.1 from training. Duration: 0.963625 2023-10-04 17:22:03,503 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.conv_module1.balancer1.prob, batch_count=1733413.3333333333, ans=0.125 2023-10-04 17:22:06,084 WARNING [train.py:1204] (2/4) Exclude cut with ID _43552_210_8913_1_1533726031059_3523429_25-120161-0 from training. Duration: 0.98 2023-10-04 17:22:06,114 WARNING [train.py:1204] (2/4) Exclude cut with ID _66112_210_10635_1_1535590236305_3801484_205-311930-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 17:22:06,135 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_48049_210_5632_1_1533864553_3519937_185-281218-0_sp1.1 from training. Duration: 0.92725 2023-10-04 17:22:07,478 WARNING [train.py:1204] (2/4) Exclude cut with ID _48782_210_12658_1_1534467701544_2300422_191-332415-0_sp1.1 from training. Duration: 0.97275 2023-10-04 17:22:10,641 WARNING [train.py:1204] (2/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_907-151398-0_sp1.1 from training. Duration: 0.336375 2023-10-04 17:22:10,691 WARNING [train.py:1204] (2/4) Exclude cut with ID _67499_210_17282_1_1535509805220_3692430_20-179346-0_sp1.1 from training. Duration: 0.8818125 2023-10-04 17:22:13,403 WARNING [train.py:1204] (2/4) Exclude cut with ID _14998_210_5219_1_1530705582080_7474970_441-91466-0_sp1.1 from training. Duration: 0.8818125 2023-10-04 17:22:14,792 WARNING [train.py:1204] (2/4) Exclude cut with ID _46681_210_9642_1_1533893005571_7603359_786-164007-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 17:22:17,825 INFO [scaling.py:1118] (2/4) WithLoss: name=encoder.encoders.3.encoder.layers.2.self_attn_weights, loss-sum=0.000e+00 2023-10-04 17:22:20,315 WARNING [train.py:1204] (2/4) Exclude cut with ID _14970_210_15709_1_1530773543493_1715730_187-20259-0 from training. Duration: 0.89 2023-10-04 17:22:23,924 WARNING [train.py:1204] (2/4) Exclude cut with ID _12734_210_8341_1_1530183614296_3891929_383-339075-0_sp1.1 from training. Duration: 0.963625 2023-10-04 17:22:33,220 INFO [train.py:1046] (2/4) Epoch 49, batch 5050, loss[loss=0.1475, simple_loss=0.2295, pruned_loss=0.03273, over 24576.00 frames. ], tot_loss[loss=0.1517, simple_loss=0.2324, pruned_loss=0.03546, over 4721274.93 frames. ], batch size: 60, lr: 2.06e-03, grad_scale: 16.0 2023-10-04 17:22:33,310 WARNING [train.py:1204] (2/4) Exclude cut with ID _37086_210_9038_1_1532999540891_7021250_834-322225-0_sp1.1 from training. Duration: 0.92725 2023-10-04 17:22:34,742 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_10987_210_4163_1_1529760555_4123902_56-290559-0_sp1.1 from training. Duration: 0.963625 2023-10-04 17:22:34,749 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_92-246270-0_sp0.9 from training. Duration: 0.9444375 2023-10-04 17:22:34,775 WARNING [train.py:1204] (2/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_56-13561-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 17:22:34,799 WARNING [train.py:1204] (2/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_393-57888-0_sp1.1 from training. Duration: 0.5818125 2023-10-04 17:22:34,826 WARNING [train.py:1204] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_168-32906-0_sp1.1 from training. Duration: 0.62725 2023-10-04 17:22:34,867 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_51225_210_3601_1_1534400634_2327383_127-207553-0_sp1.1 from training. Duration: 0.963625 2023-10-04 17:22:38,867 WARNING [train.py:1204] (2/4) Exclude cut with ID _58583_210_5236_1_1534726835266_3574249_494-203521-0_sp1.1 from training. Duration: 0.963625 2023-10-04 17:22:38,890 WARNING [train.py:1204] (2/4) Exclude cut with ID _49376_210_5986_1_1533985278732_3370740_354-344695-0 from training. Duration: 0.99 2023-10-04 17:22:40,784 WARNING [train.py:1204] (2/4) Exclude cut with ID _8021_210_7994_1_1528376451358_3653069_179-45206-0_sp1.1 from training. Duration: 0.7818125 2023-10-04 17:22:42,133 WARNING [train.py:1204] (2/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_742-154017-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 17:22:43,483 WARNING [train.py:1204] (2/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_222-198127-0_sp1.1 from training. Duration: 0.763625 2023-10-04 17:22:43,533 WARNING [train.py:1204] (2/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_235-307337-0 from training. Duration: 0.94 2023-10-04 17:22:46,174 WARNING [train.py:1204] (2/4) Exclude cut with ID _8393_210_6702_1_1528505860249_3457328_453-176500-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 17:22:46,209 WARNING [train.py:1204] (2/4) Exclude cut with ID _53565_210_10635_1_1534337218335_3820259_102-179528-0_sp1.1 from training. Duration: 0.97275 2023-10-04 17:22:48,939 WARNING [train.py:1204] (2/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_43-206757-0_sp1.1 from training. Duration: 0.6818125 2023-10-04 17:22:50,375 WARNING [train.py:1204] (2/4) Exclude cut with ID _8394_210_6702_1_1528590504316_3540189_335-274780-0_sp1.1 from training. Duration: 0.7090625 2023-10-04 17:22:50,410 WARNING [train.py:1204] (2/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_128-296581-0_sp1.1 from training. Duration: 0.67275 2023-10-04 17:22:59,900 WARNING [train.py:1204] (2/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_111-112362-0 from training. Duration: 0.91 2023-10-04 17:22:59,943 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_5522_210_3273_1_1527249606_3909096_366-172508-0_sp1.1 from training. Duration: 0.6 2023-10-04 17:23:00,023 WARNING [train.py:1204] (2/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_47-167373-0_sp1.1 from training. Duration: 0.836375 2023-10-04 17:23:00,058 WARNING [train.py:1204] (2/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_41-234353-0 from training. Duration: 0.68 2023-10-04 17:23:01,577 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6291_210_3273_1_1527683649_1078498_82-221196-0_sp0.9 from training. Duration: 0.8444375 2023-10-04 17:23:02,944 WARNING [train.py:1204] (2/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_47-4814-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 17:23:02,979 WARNING [train.py:1204] (2/4) Exclude cut with ID _43541_210_10626_1_1533796233418_3635440_40-247993-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 17:23:04,298 WARNING [train.py:1204] (2/4) Exclude cut with ID _21989_210_5854_1_1531899214931_3271360_133-44759-0_sp0.9 from training. Duration: 0.8555625 2023-10-04 17:23:04,303 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_94-195801-0 from training. Duration: 0.94 2023-10-04 17:23:05,687 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_141-89113-0 from training. Duration: 0.86 2023-10-04 17:23:07,019 WARNING [train.py:1204] (2/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_683-284578-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 17:23:08,456 WARNING [train.py:1204] (2/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_6-214815-0_sp1.1 from training. Duration: 0.77275 2023-10-04 17:23:09,241 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.2.encoder.layers.1.nonlin_attention.whiten1, num_groups=1, num_channels=288, metric=5.36 vs. limit=10.0 2023-10-04 17:23:11,762 WARNING [train.py:1204] (2/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_556-212082-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 17:23:13,132 WARNING [train.py:1204] (2/4) Exclude cut with ID _44902_210_16187_1_1533888006062_3792078_149-67212-0 from training. Duration: 0.87 2023-10-04 17:23:14,530 WARNING [train.py:1204] (2/4) Exclude cut with ID _61148_210_6897_1_1535094022726_4217610_333-159440-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 17:23:17,310 WARNING [train.py:1204] (2/4) Exclude cut with ID _3120_210_3525_1_1526036143112_4072980_404-249994-0 from training. Duration: 0.99 2023-10-04 17:23:18,506 WARNING [train.py:1204] (2/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_234-260682-0_sp0.9 from training. Duration: 0.9333125 2023-10-04 17:23:18,534 WARNING [train.py:1204] (2/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_374-138971-0_sp1.1 from training. Duration: 0.8545625 2023-10-04 17:23:18,597 WARNING [train.py:1204] (2/4) Exclude cut with ID _53713_210_24116_1_1534314487762_7416751_743-302085-0_sp1.1 from training. Duration: 0.92725 2023-10-04 17:23:19,961 WARNING [train.py:1204] (2/4) Exclude cut with ID _18516_210_9206_1_1531704653206_3692406_198-334870-0_sp1.1 from training. Duration: 0.836375 2023-10-04 17:23:20,086 WARNING [train.py:1204] (2/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_395-73570-0_sp1.1 from training. Duration: 0.9 2023-10-04 17:23:22,752 WARNING [train.py:1204] (2/4) Exclude cut with ID _46683_210_9642_1_1533730913492_4591116_421-19547-0_sp1.1 from training. Duration: 0.936375 2023-10-04 17:23:24,496 WARNING [train.py:1204] (2/4) Exclude cut with ID _19896_210_1883_1_1531709022662_6649569_714-8668-0_sp1.1 from training. Duration: 0.963625 2023-10-04 17:23:24,526 WARNING [train.py:1204] (2/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_49-101072-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 17:23:24,532 WARNING [train.py:1204] (2/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_220-191080-0_sp1.1 from training. Duration: 0.8909375 2023-10-04 17:23:24,560 WARNING [train.py:1204] (2/4) Exclude cut with ID _3465_210_3525_1_1526175667060_3574049_62-335552-0 from training. Duration: 0.87 2023-10-04 17:23:25,865 WARNING [train.py:1204] (2/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_111-112362-0_sp1.1 from training. Duration: 0.82725 2023-10-04 17:23:27,323 WARNING [train.py:1204] (2/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_318-59097-0_sp0.9 from training. Duration: 0.8444375 2023-10-04 17:23:30,614 WARNING [train.py:1204] (2/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_98-255710-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 17:23:30,621 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9042W0003-515609 from training. Duration: 0.896 2023-10-04 17:23:30,630 WARNING [train.py:1204] (2/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_158-250116-0_sp0.9 from training. Duration: 0.688875 2023-10-04 17:23:33,723 WARNING [train.py:1204] (2/4) Exclude cut with ID _26750_210_5665_1_1532226678442_4063230_7-346885-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 17:23:33,761 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_37839_210_7234_1_1533542462_3689303_357-227078-0_sp1.1 from training. Duration: 0.963625 2023-10-04 17:23:33,787 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9052W0119-520904 from training. Duration: 0.896 2023-10-04 17:23:37,806 WARNING [train.py:1204] (2/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_249-281349-0_sp1.1 from training. Duration: 0.77275 2023-10-04 17:23:37,812 WARNING [train.py:1204] (2/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_147-293311-0 from training. Duration: 0.94 2023-10-04 17:23:37,812 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_158-21166-0_sp1.1 from training. Duration: 0.963625 2023-10-04 17:23:40,782 WARNING [train.py:1204] (2/4) Exclude cut with ID _14100_210_8341_1_1530529320613_3678069_207-332194-0_sp1.1 from training. Duration: 0.92725 2023-10-04 17:23:40,825 WARNING [train.py:1204] (2/4) Exclude cut with ID _9389_210_1664_1_1529217337301_5289269_324-211031-0_sp1.1 from training. Duration: 0.963625 2023-10-04 17:23:40,844 WARNING [train.py:1204] (2/4) Exclude cut with ID _14603_210_4220_1_1530615688441_3097319_133-158308-0 from training. Duration: 0.84 2023-10-04 17:23:44,088 WARNING [train.py:1204] (2/4) Exclude cut with ID _13252_210_1925_1_1530671372047_3336909_166-314607-0 from training. Duration: 0.54 2023-10-04 17:23:46,727 INFO [train.py:1046] (2/4) Epoch 49, batch 5100, loss[loss=0.1624, simple_loss=0.2411, pruned_loss=0.04182, over 23779.00 frames. ], tot_loss[loss=0.1523, simple_loss=0.233, pruned_loss=0.03579, over 4716122.19 frames. ], batch size: 212, lr: 2.06e-03, grad_scale: 16.0 2023-10-04 17:23:46,787 WARNING [train.py:1204] (2/4) Exclude cut with ID _32704_210_8954_1_1533017488840_6747370_649-79538-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 17:23:46,795 WARNING [train.py:1204] (2/4) Exclude cut with ID _28394_210_8913_1_1532430049518_3650118_343-115667-0_sp1.1 from training. Duration: 0.97275 2023-10-04 17:23:46,856 WARNING [train.py:1204] (2/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_87-278686-0_sp0.9 from training. Duration: 0.8555625 2023-10-04 17:23:50,856 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9109W0003-549749 from training. Duration: 0.768 2023-10-04 17:23:52,340 WARNING [train.py:1204] (2/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_126-29104-0_sp1.1 from training. Duration: 0.77275 2023-10-04 17:23:55,181 WARNING [train.py:1204] (2/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_325-113798-0 from training. Duration: 0.85 2023-10-04 17:23:55,226 WARNING [train.py:1204] (2/4) Exclude cut with ID _6694_210_4599_1_1527852637490_3965529_116-109398-0 from training. Duration: 0.73 2023-10-04 17:23:57,108 WARNING [train.py:1204] (2/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_46-57119-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 17:23:58,484 WARNING [train.py:1204] (2/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_546-119312-0_sp1.1 from training. Duration: 0.9 2023-10-04 17:23:58,822 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.conv_module2.balancer1.prob, batch_count=1733880.0, ans=0.125 2023-10-04 17:24:00,584 WARNING [train.py:1204] (2/4) Exclude cut with ID _60057_210_10640_1_1534903065793_3333150_468-306939-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 17:24:01,700 WARNING [train.py:1204] (2/4) Exclude cut with ID _15150_210_8341_1_1530752411803_7256384_22-138274-0 from training. Duration: 0.96 2023-10-04 17:24:01,715 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_289-180904-0 from training. Duration: 0.76 2023-10-04 17:24:06,693 WARNING [train.py:1204] (2/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_64-133721-0_sp1.1 from training. Duration: 0.92725 2023-10-04 17:24:06,726 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_195-257512-0_sp1.1 from training. Duration: 0.6090625 2023-10-04 17:24:10,748 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.757e+02 2.155e+02 2.470e+02 3.044e+02 5.202e+02, threshold=4.940e+02, percent-clipped=2.0 2023-10-04 17:24:10,871 WARNING [train.py:1204] (2/4) Exclude cut with ID _66873_210_5517_1_1535454136506_4293370_704-101963-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 17:24:12,816 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.conv_module2.balancer2.min_positive, batch_count=1733946.6666666667, ans=0.05 2023-10-04 17:24:15,193 WARNING [train.py:1204] (2/4) Exclude cut with ID _4366_210_1388_1_1526792053633_3613819_231-211402-0 from training. Duration: 0.94 2023-10-04 17:24:15,252 WARNING [train.py:1204] (2/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_679-190385-0_sp1.1 from training. Duration: 0.97275 2023-10-04 17:24:18,019 WARNING [train.py:1204] (2/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_298-81459-0_sp1.1 from training. Duration: 0.963625 2023-10-04 17:24:18,027 WARNING [train.py:1204] (2/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_658-282610-0_sp0.9 from training. Duration: 0.588875 2023-10-04 17:24:20,839 WARNING [train.py:1204] (2/4) Exclude cut with ID _57572_210_5517_1_1534921219624_3809484_196-283734-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 17:24:22,174 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_53071_210_16554_1_1534490405_2954066_294-198554-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 17:24:22,179 WARNING [train.py:1204] (2/4) Exclude cut with ID _4866_210_2577_1_1527404412284_4364659_352-157930-0 from training. Duration: 0.81 2023-10-04 17:24:22,455 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.balancer2.prob, batch_count=1734013.3333333333, ans=0.125 2023-10-04 17:24:24,975 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9231W0113-610139 from training. Duration: 0.896 2023-10-04 17:24:26,330 WARNING [train.py:1204] (2/4) Exclude cut with ID _24271_210_12658_1_1531987270022_7190519_638-171906-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 17:24:26,361 WARNING [train.py:1204] (2/4) Exclude cut with ID _14170_210_5219_1_1530525949868_4469650_272-249985-0 from training. Duration: 0.67 2023-10-04 17:24:26,371 WARNING [train.py:1204] (2/4) Exclude cut with ID _64086_210_7994_1_1535270417019_7435949_449-31567-0 from training. Duration: 0.97 2023-10-04 17:24:29,227 WARNING [train.py:1204] (2/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_109-282568-0_sp1.1 from training. Duration: 0.97275 2023-10-04 17:24:31,416 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.1.conv_skip_rate, batch_count=1734080.0, ans=0.0 2023-10-04 17:24:39,175 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_121-45934-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 17:24:40,599 WARNING [train.py:1204] (2/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_314-126819-0 from training. Duration: 0.5 2023-10-04 17:24:41,895 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9309W0192-640319 from training. Duration: 0.512 2023-10-04 17:24:41,912 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9310W0003-640649 from training. Duration: 0.896 2023-10-04 17:24:43,356 WARNING [train.py:1204] (2/4) Exclude cut with ID _7160_210_1876_1_1527940923231_3703799_120-150943-0 from training. Duration: 0.81 2023-10-04 17:24:43,364 WARNING [train.py:1204] (2/4) Exclude cut with ID _40380_210_9059_1_1533380594759_4187380_6-33508-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 17:24:46,672 WARNING [train.py:1204] (2/4) Exclude cut with ID _59202_210_26984_1_1534816810612_3549174_137-318710-0 from training. Duration: 0.89 2023-10-04 17:24:49,530 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_435-313305-0 from training. Duration: 0.71 2023-10-04 17:24:50,378 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.0.layers.1.feed_forward2.out_whiten, num_groups=1, num_channels=192, metric=4.15 vs. limit=15.0 2023-10-04 17:24:52,218 WARNING [train.py:1204] (2/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_91-212486-0_sp1.1 from training. Duration: 0.6181875 2023-10-04 17:24:53,641 WARNING [train.py:1204] (2/4) Exclude cut with ID _14603_210_4220_1_1530615688441_3097319_133-158308-0_sp1.1 from training. Duration: 0.763625 2023-10-04 17:24:55,179 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_99-251314-0 from training. Duration: 0.87 2023-10-04 17:24:57,865 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_365-217119-0_sp1.1 from training. Duration: 0.57275 2023-10-04 17:24:57,902 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_3475_210_2028_1_1526193578_3622569_377-166133-0 from training. Duration: 0.86 2023-10-04 17:25:00,572 INFO [train.py:1046] (2/4) Epoch 49, batch 5150, loss[loss=0.1568, simple_loss=0.2418, pruned_loss=0.0359, over 24663.00 frames. ], tot_loss[loss=0.1526, simple_loss=0.2336, pruned_loss=0.03578, over 4731472.22 frames. ], batch size: 68, lr: 2.06e-03, grad_scale: 16.0 2023-10-04 17:25:03,964 WARNING [train.py:1204] (2/4) Exclude cut with ID _13916_210_9994_1_1530788341126_3763149_116-21034-0_sp1.1 from training. Duration: 0.87275 2023-10-04 17:25:03,974 WARNING [train.py:1204] (2/4) Exclude cut with ID _64220_210_9527_1_1535250151772_4875259_142-216831-0_sp1.1 from training. Duration: 0.936375 2023-10-04 17:25:03,974 WARNING [train.py:1204] (2/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_490-207861-0_sp1.1 from training. Duration: 0.8909375 2023-10-04 17:25:04,038 WARNING [train.py:1204] (2/4) Exclude cut with ID _41314_210_12651_1_1533459476543_3766460_87-272068-0_sp0.9 from training. Duration: 0.92225 2023-10-04 17:25:05,365 WARNING [train.py:1204] (2/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_18-46756-0_sp1.1 from training. Duration: 0.7181875 2023-10-04 17:25:07,462 WARNING [train.py:1204] (2/4) Exclude cut with ID _39411_210_5986_1_1533538825030_3343649_98-311335-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 17:25:08,699 WARNING [train.py:1204] (2/4) Exclude cut with ID _21878_210_3525_1_1531738772002_7264330_113-18163-0 from training. Duration: 0.89 2023-10-04 17:25:08,701 WARNING [train.py:1204] (2/4) Exclude cut with ID _6653_210_2629_1_1527919172441_6732679_1103-56445-0 from training. Duration: 0.73 2023-10-04 17:25:08,739 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_3474_210_2028_1_1526188513_4825246_145-309131-0 from training. Duration: 0.74 2023-10-04 17:25:08,765 WARNING [train.py:1204] (2/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_120-211630-0_sp1.1 from training. Duration: 0.836375 2023-10-04 17:25:08,783 WARNING [train.py:1204] (2/4) Exclude cut with ID _8030_210_7497_1_1528444886781_3531089_32-31837-0 from training. Duration: 0.97 2023-10-04 17:25:10,540 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_19949_210_2161_1_1531706337_928079_8-160956-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 17:25:11,882 WARNING [train.py:1204] (2/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_321-215742-0_sp1.1 from training. Duration: 0.4545625 2023-10-04 17:25:13,286 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_583-124650-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 17:25:14,674 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_30434_210_13049_1_1532519384_981746_164-182755-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 17:25:19,488 WARNING [train.py:1204] (2/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_339-104791-0_sp1.1 from training. Duration: 0.4454375 2023-10-04 17:25:19,512 WARNING [train.py:1204] (2/4) Exclude cut with ID _15026_210_8750_1_1530777602316_3291850_211-195043-0 from training. Duration: 0.8 2023-10-04 17:25:22,282 WARNING [train.py:1204] (2/4) Exclude cut with ID _29877_210_6228_1_1532397791106_5276149_487-182647-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 17:25:22,318 WARNING [train.py:1204] (2/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_127-566-0_sp0.9 from training. Duration: 0.9333125 2023-10-04 17:25:22,680 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.conv_module1.balancer2.prob, batch_count=1734280.0, ans=0.125 2023-10-04 17:25:23,743 WARNING [train.py:1204] (2/4) Exclude cut with ID _8263_210_1664_1_1528439452891_970220_68-181805-0_sp0.9 from training. Duration: 0.711125 2023-10-04 17:25:23,744 WARNING [train.py:1204] (2/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_504-72513-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 17:25:23,753 WARNING [train.py:1204] (2/4) Exclude cut with ID _64639_210_5816_1_1535367619437_3528000_324-265233-0_sp1.1 from training. Duration: 0.963625 2023-10-04 17:25:23,818 WARNING [train.py:1204] (2/4) Exclude cut with ID _6456_210_1534_1_1527681634212_3181049_215-309359-0_sp0.9 from training. Duration: 0.97775 2023-10-04 17:25:23,823 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_115-233929-0_sp1.1 from training. Duration: 0.6545625 2023-10-04 17:25:23,858 WARNING [train.py:1204] (2/4) Exclude cut with ID _14087_210_2253_1_1530529224512_3014029_219-224445-0 from training. Duration: 0.84 2023-10-04 17:25:25,207 WARNING [train.py:1204] (2/4) Exclude cut with ID _14867_210_1968_1_1530684179890_3846969_413-29292-0_sp1.1 from training. Duration: 0.8181875 2023-10-04 17:25:25,261 WARNING [train.py:1204] (2/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_1012-279473-0_sp0.9 from training. Duration: 0.7444375 2023-10-04 17:25:28,025 WARNING [train.py:1204] (2/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_605-133395-0_sp0.9 from training. Duration: 0.7333125 2023-10-04 17:25:29,469 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_89-35748-0 from training. Duration: 0.79 2023-10-04 17:25:29,571 WARNING [train.py:1204] (2/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_476-272757-0_sp1.1 from training. Duration: 0.8454375 2023-10-04 17:25:31,146 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.ff3_skip_rate, batch_count=1734346.6666666667, ans=0.0 2023-10-04 17:25:33,809 WARNING [train.py:1204] (2/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_449-15508-0_sp0.9 from training. Duration: 0.911125 2023-10-04 17:25:35,636 WARNING [train.py:1204] (2/4) Exclude cut with ID _29345_210_4381_1_1532689168263_3860169_198-174328-0 from training. Duration: 0.91 2023-10-04 17:25:40,165 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_15517_210_6753_1_1530952293_1664795_125-158157-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 17:25:40,556 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.bypass.skip_rate, batch_count=1734346.6666666667, ans=0.07 2023-10-04 17:25:43,107 WARNING [train.py:1204] (2/4) Exclude cut with ID _25565_210_12392_1_1532423009382_3952098_420-113180-0_sp1.1 from training. Duration: 0.963625 2023-10-04 17:25:45,098 WARNING [train.py:1204] (2/4) Exclude cut with ID _51471_210_1889_1_1534399391390_3094250_236-146773-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 17:25:49,142 WARNING [train.py:1204] (2/4) Exclude cut with ID _30039_210_13270_1_1532484612900_2725150_175-35012-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 17:25:49,183 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_23850_210_4238_1_1532232597_1632433_99-275843-0_sp1.1 from training. Duration: 0.936375 2023-10-04 17:25:51,897 WARNING [train.py:1204] (2/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_658-282610-0 from training. Duration: 0.53 2023-10-04 17:25:55,960 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_37818_210_4238_1_1533620910_4720083_234-65934-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 17:25:56,034 WARNING [train.py:1204] (2/4) Exclude cut with ID _3067_210_2667_1_1526212974640_4503168_459-173867-0_sp1.1 from training. Duration: 0.8 2023-10-04 17:25:56,061 WARNING [train.py:1204] (2/4) Exclude cut with ID _8696_210_1876_1_1528545632440_3345058_209-54069-0_sp0.9 from training. Duration: 0.7444375 2023-10-04 17:26:00,032 WARNING [train.py:1204] (2/4) Exclude cut with ID _62574_210_2655_1_1535180407058_3933249_420-236465-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 17:26:00,118 WARNING [train.py:1204] (2/4) Exclude cut with ID _46681_210_9642_1_1533893005571_7603359_333-4042-0_sp1.1 from training. Duration: 0.936375 2023-10-04 17:26:01,401 WARNING [train.py:1204] (2/4) Exclude cut with ID _3541_210_2477_1_1526475408709_3263973_377-296839-0 from training. Duration: 0.55 2023-10-04 17:26:07,945 WARNING [train.py:1204] (2/4) Exclude cut with ID _43997_210_4525_1_1533538632555_5189329_426-92599-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 17:26:09,474 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_43-71989-0_sp1.1 from training. Duration: 0.5181875 2023-10-04 17:26:10,942 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_2009_210_2071_1_1525481134_4588937_140-345530-0_sp1.1 from training. Duration: 0.963625 2023-10-04 17:26:10,957 WARNING [train.py:1204] (2/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_138-332773-0_sp0.9 from training. Duration: 0.9 2023-10-04 17:26:12,845 WARNING [train.py:1204] (2/4) Exclude cut with ID _13942_210_9988_1_1530604906036_3733360_499-55676-0_sp0.9 from training. Duration: 0.688875 2023-10-04 17:26:12,868 WARNING [train.py:1204] (2/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_259-34038-0_sp1.1 from training. Duration: 0.67275 2023-10-04 17:26:12,880 WARNING [train.py:1204] (2/4) Exclude cut with ID _3120_210_3525_1_1526036143112_4072980_404-249994-0_sp1.1 from training. Duration: 0.9 2023-10-04 17:26:12,924 WARNING [train.py:1204] (2/4) Exclude cut with ID _40402_210_18219_1_1533522324115_2953150_222-108810-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 17:26:15,744 INFO [train.py:1046] (2/4) Epoch 49, batch 5200, loss[loss=0.1897, simple_loss=0.2646, pruned_loss=0.05744, over 19553.00 frames. ], tot_loss[loss=0.1532, simple_loss=0.2338, pruned_loss=0.03628, over 4732817.98 frames. ], batch size: 388, lr: 2.06e-03, grad_scale: 32.0 2023-10-04 17:26:17,170 WARNING [train.py:1204] (2/4) Exclude cut with ID _22083_210_8254_1_1531702862439_3678970_49-234546-0_sp0.9 from training. Duration: 0.92225 2023-10-04 17:26:17,413 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.self_attn_weights.pos_emb_skip_rate, batch_count=1734546.6666666667, ans=0.0 2023-10-04 17:26:18,585 WARNING [train.py:1204] (2/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_199-295611-0_sp0.9 from training. Duration: 0.611125 2023-10-04 17:26:21,407 WARNING [train.py:1204] (2/4) Exclude cut with ID _43552_210_8913_1_1533726031059_3523429_43-192945-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 17:26:24,296 WARNING [train.py:1204] (2/4) Exclude cut with ID _14224_210_4381_1_1530529062037_4264010_364-22817-0 from training. Duration: 0.71 2023-10-04 17:26:25,735 WARNING [train.py:1204] (2/4) Exclude cut with ID _14922_210_6228_1_1530707435273_3693310_388-129552-0_sp1.1 from training. Duration: 0.87275 2023-10-04 17:26:27,111 WARNING [train.py:1204] (2/4) Exclude cut with ID _6537_210_1883_1_1527987559710_3597129_190-280227-0_sp1.1 from training. Duration: 0.97275 2023-10-04 17:26:27,335 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.feed_forward3.hidden_balancer.prob, batch_count=1734546.6666666667, ans=0.125 2023-10-04 17:26:28,584 WARNING [train.py:1204] (2/4) Exclude cut with ID _55844_210_2346_1_1534471212569_7625707_398-56897-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 17:26:29,909 WARNING [train.py:1204] (2/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_48-182155-0_sp1.1 from training. Duration: 0.7909375 2023-10-04 17:26:29,941 WARNING [train.py:1204] (2/4) Exclude cut with ID _29529_210_7234_1_1532393961455_3977460_582-18084-0_sp1.1 from training. Duration: 0.97275 2023-10-04 17:26:32,626 WARNING [train.py:1204] (2/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_912-282412-0 from training. Duration: 0.49 2023-10-04 17:26:37,242 WARNING [train.py:1204] (2/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_332-256168-0_sp0.9 from training. Duration: 0.8333125 2023-10-04 17:26:37,293 WARNING [train.py:1204] (2/4) Exclude cut with ID _64220_210_9527_1_1535250151772_4875259_55-250624-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 17:26:39,110 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.681e+02 2.105e+02 2.384e+02 2.750e+02 4.154e+02, threshold=4.767e+02, percent-clipped=0.0 2023-10-04 17:26:41,187 WARNING [train.py:1204] (2/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_264-108685-0 from training. Duration: 0.55 2023-10-04 17:26:42,957 WARNING [train.py:1204] (2/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_613-232856-0_sp0.9 from training. Duration: 0.6 2023-10-04 17:26:44,269 WARNING [train.py:1204] (2/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_68-212801-0_sp1.1 from training. Duration: 0.663625 2023-10-04 17:26:44,319 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6290_210_3273_1_1527595224_1282787_115-87095-0 from training. Duration: 0.55 2023-10-04 17:26:45,680 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_3080_210_2071_1_1526086102_4365166_312-8726-0 from training. Duration: 0.91 2023-10-04 17:26:46,186 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.5.encoder.layers.1.self_attn1.whiten, num_groups=1, num_channels=256, metric=11.78 vs. limit=22.5 2023-10-04 17:26:48,891 WARNING [train.py:1204] (2/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_105-63309-0 from training. Duration: 0.98 2023-10-04 17:26:48,958 WARNING [train.py:1204] (2/4) Exclude cut with ID _2941_210_2577_1_1526199673150_2489759_201-160779-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 17:26:48,961 WARNING [train.py:1204] (2/4) Exclude cut with ID ID0406W0226-885434 from training. Duration: 0.997 2023-10-04 17:26:48,967 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_17292_210_6753_1_1531125388_2943734_353-328201-0_sp1.1 from training. Duration: 0.97275 2023-10-04 17:26:51,723 WARNING [train.py:1204] (2/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_572-96029-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 17:26:51,765 WARNING [train.py:1204] (2/4) Exclude cut with ID _14970_210_15709_1_1530775547361_1966965_6-23328-0_sp0.9 from training. Duration: 0.9555625 2023-10-04 17:26:51,806 WARNING [train.py:1204] (2/4) Exclude cut with ID _45012_210_12651_1_1533608435136_5371380_240-37101-0 from training. Duration: 0.93 2023-10-04 17:26:51,867 WARNING [train.py:1204] (2/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_685-90183-0_sp1.1 from training. Duration: 0.936375 2023-10-04 17:26:52,001 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.1.feed_forward1.hidden_balancer.prob, batch_count=1734680.0, ans=0.125 2023-10-04 17:26:52,626 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.2.encoder.layers.1.nonlin_attention.whiten1, num_groups=1, num_channels=288, metric=5.58 vs. limit=10.0 2023-10-04 17:26:54,486 WARNING [train.py:1204] (2/4) Exclude cut with ID _13252_210_1925_1_1530671372047_3336909_41-22245-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 17:26:55,929 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_15126_210_6753_1_1530778677_2591185_120-277707-0 from training. Duration: 0.8 2023-10-04 17:26:55,958 WARNING [train.py:1204] (2/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_310-26186-0 from training. Duration: 0.7 2023-10-04 17:26:57,247 WARNING [train.py:1204] (2/4) Exclude cut with ID _14998_210_5219_1_1530705582080_7474970_787-294416-0 from training. Duration: 0.82 2023-10-04 17:27:00,164 WARNING [train.py:1204] (2/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_795-33252-0 from training. Duration: 0.37 2023-10-04 17:27:01,499 WARNING [train.py:1204] (2/4) Exclude cut with ID _7537_210_2084_1_1528502395431_3924359_113-199933-0_sp0.9 from training. Duration: 0.6555625 2023-10-04 17:27:07,816 WARNING [train.py:1204] (2/4) Exclude cut with ID _3465_210_3525_1_1526175667060_3574049_308-231735-0_sp0.9 from training. Duration: 0.811125 2023-10-04 17:27:07,841 WARNING [train.py:1204] (2/4) Exclude cut with ID _19504_210_9994_1_1531648863242_3325220_354-296765-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 17:27:09,238 WARNING [train.py:1204] (2/4) Exclude cut with ID _5409_210_1388_1_1527292850189_5865191_178-127970-0 from training. Duration: 0.57 2023-10-04 17:27:10,996 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_1466-248056-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 17:27:11,027 WARNING [train.py:1204] (2/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_535-14410-0_sp0.9 from training. Duration: 0.52225 2023-10-04 17:27:11,031 WARNING [train.py:1204] (2/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_747-129941-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 17:27:12,439 WARNING [train.py:1204] (2/4) Exclude cut with ID _15012_210_1968_1_1530856820429_5095350_688-178849-0_sp1.1 from training. Duration: 0.5818125 2023-10-04 17:27:15,115 WARNING [train.py:1204] (2/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_356-45054-0_sp1.1 from training. Duration: 0.7818125 2023-10-04 17:27:16,525 WARNING [train.py:1204] (2/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_146-276944-0_sp0.9 from training. Duration: 0.92225 2023-10-04 17:27:19,426 WARNING [train.py:1204] (2/4) Exclude cut with ID _39204_210_4220_1_1533880547368_3191120_346-180306-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 17:27:21,433 WARNING [train.py:1204] (2/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_641-158017-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 17:27:21,433 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_19952_210_2161_1_1531875469_3851750_274-42137-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 17:27:25,513 WARNING [train.py:1204] (2/4) Exclude cut with ID _60804_210_9642_1_1534831199859_7268299_87-327819-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 17:27:25,587 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_9302_210_2550_1_1528889962_4655416_638-301620-0 from training. Duration: 0.47 2023-10-04 17:27:26,843 WARNING [train.py:1204] (2/4) Exclude cut with ID _44304_210_15374_1_1533639352765_4613129_302-22084-0_sp1.1 from training. Duration: 0.7818125 2023-10-04 17:27:26,863 WARNING [train.py:1204] (2/4) Exclude cut with ID _18513_210_9206_1_1531618215223_3862620_177-227063-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 17:27:26,963 WARNING [train.py:1204] (2/4) Exclude cut with ID _41326_210_5854_1_1533778076509_3492080_510-45444-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 17:27:28,271 WARNING [train.py:1204] (2/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_76-311963-0_sp1.1 from training. Duration: 0.636375 2023-10-04 17:27:29,639 INFO [train.py:1046] (2/4) Epoch 49, batch 5250, loss[loss=0.1576, simple_loss=0.25, pruned_loss=0.0326, over 24419.00 frames. ], tot_loss[loss=0.153, simple_loss=0.2335, pruned_loss=0.03628, over 4720285.17 frames. ], batch size: 69, lr: 2.06e-03, grad_scale: 16.0 2023-10-04 17:27:29,704 WARNING [train.py:1204] (2/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_156-302675-0_sp0.9 from training. Duration: 0.611125 2023-10-04 17:27:31,101 WARNING [train.py:1204] (2/4) Exclude cut with ID _53489_210_6228_1_1534298357169_3835970_367-148411-0_sp1.1 from training. Duration: 0.936375 2023-10-04 17:27:34,257 WARNING [train.py:1204] (2/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_389-144525-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 17:27:35,576 WARNING [train.py:1204] (2/4) Exclude cut with ID _14069_210_5632_1_1530512692033_4803390_158-238427-0_sp1.1 from training. Duration: 0.963625 2023-10-04 17:27:37,327 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_486-151131-0_sp0.9 from training. Duration: 0.9444375 2023-10-04 17:27:42,186 WARNING [train.py:1204] (2/4) Exclude cut with ID _39846_210_12651_1_1533603651992_1318030_188-71831-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 17:27:43,732 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.feed_forward1.hidden_balancer.prob, batch_count=1734946.6666666667, ans=0.125 2023-10-04 17:27:44,821 WARNING [train.py:1204] (2/4) Exclude cut with ID _39411_210_5986_1_1533538825030_3343649_173-238248-0_sp1.1 from training. Duration: 0.7545625 2023-10-04 17:27:44,958 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.1.feed_forward1.out_proj.dropout_p, batch_count=1734946.6666666667, ans=0.1 2023-10-04 17:27:47,530 WARNING [train.py:1204] (2/4) Exclude cut with ID _20684_210_6917_1_1531567751646_4203930_469-49081-0_sp1.1 from training. Duration: 0.92725 2023-10-04 17:27:48,926 WARNING [train.py:1204] (2/4) Exclude cut with ID _8833_210_5219_1_1528592665210_7553059_611-80313-0_sp1.1 from training. Duration: 0.5818125 2023-10-04 17:27:50,956 WARNING [train.py:1204] (2/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_440-141811-0 from training. Duration: 0.95 2023-10-04 17:27:50,970 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_41211_210_2544_1_1533348859_3702910_236-223652-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 17:27:52,421 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_40222_210_6228_1_1533385298_4481939_220-163998-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 17:27:57,813 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.balancer1.prob, batch_count=1735013.3333333333, ans=0.125 2023-10-04 17:28:38,910 INFO [train.py:1046] (2/4) Epoch 49, batch 5300, loss[loss=0.1638, simple_loss=0.2498, pruned_loss=0.03893, over 23744.00 frames. ], tot_loss[loss=0.1516, simple_loss=0.2314, pruned_loss=0.03587, over 4699111.41 frames. ], batch size: 85, lr: 2.06e-03, grad_scale: 16.0 2023-10-04 17:28:45,995 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.4.encoder.layers.2.self_attn1.whiten, num_groups=1, num_channels=384, metric=14.46 vs. limit=22.5 2023-10-04 17:28:46,947 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.0.feed_forward2.hidden_balancer.prob, batch_count=1735213.3333333333, ans=0.125 2023-10-04 17:28:53,061 WARNING [train.py:1204] (2/4) Exclude cut with ID _14629_210_15709_1_1530604298182_956899_6-232695-0_sp1.1 from training. Duration: 0.8545625 2023-10-04 17:28:53,095 WARNING [train.py:1204] (2/4) Exclude cut with ID _13921_210_9994_1_1530841782272_3503520_327-69677-0 from training. Duration: 0.9 2023-10-04 17:28:53,120 WARNING [train.py:1204] (2/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_372-298093-0 from training. Duration: 0.8 2023-10-04 17:28:53,134 WARNING [train.py:1204] (2/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_515-139111-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 17:28:53,369 WARNING [train.py:1204] (2/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_85-65056-0_sp1.1 from training. Duration: 0.87275 2023-10-04 17:28:53,410 WARNING [train.py:1204] (2/4) Exclude cut with ID _15113_210_8254_1_1530842438579_3716279_209-54533-0_sp1.1 from training. Duration: 0.87275 2023-10-04 17:28:53,459 WARNING [train.py:1204] (2/4) Exclude cut with ID _70447_210_12392_1_1535972499300_3160420_123-312838-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 17:28:53,487 WARNING [train.py:1204] (2/4) Exclude cut with ID _60113_210_16271_1_1535164212481_4386079_70-63596-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 17:28:53,515 WARNING [train.py:1204] (2/4) Exclude cut with ID _14632_210_9988_1_1530691279392_3923200_225-188509-0_sp1.1 from training. Duration: 0.963625 2023-10-04 17:28:53,568 WARNING [train.py:1204] (2/4) Exclude cut with ID _34734_210_4525_1_1533365977542_4448720_584-180532-0_sp1.1 from training. Duration: 0.97275 2023-10-04 17:28:53,582 WARNING [train.py:1204] (2/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_351-294650-0_sp1.1 from training. Duration: 0.57275 2023-10-04 17:28:53,917 WARNING [train.py:1204] (2/4) Exclude cut with ID _13252_210_1925_1_1530671372047_3336909_263-168388-0_sp0.9 from training. Duration: 0.9555625 2023-10-04 17:28:53,945 WARNING [train.py:1204] (2/4) Exclude cut with ID _22083_210_8254_1_1531702862439_3678970_201-85738-0 from training. Duration: 0.9 2023-10-04 17:28:54,392 WARNING [train.py:1204] (2/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_237-58433-0 from training. Duration: 0.74 2023-10-04 17:28:54,419 WARNING [train.py:1204] (2/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_262-250826-0 from training. Duration: 0.99 2023-10-04 17:28:54,523 WARNING [train.py:1204] (2/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_927-43827-0_sp0.9 from training. Duration: 0.82225 2023-10-04 17:28:54,555 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_2821_210_2715_1_1526108385_3763560_73-296673-0 from training. Duration: 0.83 2023-10-04 17:28:54,632 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6287_210_3273_1_1527509506_2653716_182-199887-0 from training. Duration: 0.51 2023-10-04 17:28:54,768 WARNING [train.py:1204] (2/4) Exclude cut with ID _70519_210_4163_1_1535961595994_7458230_899-80223-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 17:28:55,029 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_210-234949-0_sp1.1 from training. Duration: 0.97275 2023-10-04 17:28:55,069 WARNING [train.py:1204] (2/4) Exclude cut with ID _30196_210_1664_1_1533708496522_7074140_768-278176-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 17:28:55,149 WARNING [train.py:1204] (2/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_315-128283-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 17:28:55,231 WARNING [train.py:1204] (2/4) Exclude cut with ID _14886_210_4220_1_1530702020355_3516190_330-323941-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 17:28:55,493 WARNING [train.py:1204] (2/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_264-348896-0_sp1.1 from training. Duration: 0.77275 2023-10-04 17:28:55,515 WARNING [train.py:1204] (2/4) Exclude cut with ID _37086_210_9038_1_1532999540891_7021250_433-10563-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 17:28:55,552 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_53638_210_21581_1_1534297319_5808449_885-80172-0_sp1.1 from training. Duration: 0.87275 2023-10-04 17:28:55,652 WARNING [train.py:1204] (2/4) Exclude cut with ID _14623_210_1925_1_1530773354962_3632609_157-218771-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 17:28:55,663 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_1368-78165-0_sp1.1 from training. Duration: 0.97275 2023-10-04 17:28:55,667 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_309-65177-0_sp0.9 from training. Duration: 0.97775 2023-10-04 17:28:55,673 WARNING [train.py:1204] (2/4) Exclude cut with ID _21632_210_2253_1_1531994001765_3744140_291-138812-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 17:28:55,686 WARNING [train.py:1204] (2/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_669-93163-0_sp1.1 from training. Duration: 0.8909375 2023-10-04 17:28:56,290 WARNING [train.py:1204] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_292-215346-0 from training. Duration: 0.97 2023-10-04 17:28:56,719 WARNING [train.py:1204] (2/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_286-222073-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 17:28:57,013 WARNING [train.py:1204] (2/4) Exclude cut with ID _59256_210_5986_1_1535076085997_3589120_121-237386-0_sp1.1 from training. Duration: 0.87275 2023-10-04 17:28:57,030 WARNING [train.py:1204] (2/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_96-320736-0 from training. Duration: 0.86 2023-10-04 17:28:57,040 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_128-202104-0 from training. Duration: 0.63 2023-10-04 17:28:57,131 WARNING [train.py:1204] (2/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_242-3472-0_sp0.9 from training. Duration: 0.811125 2023-10-04 17:28:57,180 WARNING [train.py:1204] (2/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_154-74182-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 17:28:57,184 WARNING [train.py:1204] (2/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_187-221566-0 from training. Duration: 0.43 2023-10-04 17:28:57,356 WARNING [train.py:1204] (2/4) Exclude cut with ID _6194_210_1534_1_1527511826160_3941194_14-282876-0 from training. Duration: 0.71 2023-10-04 17:28:57,400 WARNING [train.py:1204] (2/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_660-211088-0_sp1.1 from training. Duration: 0.563625 2023-10-04 17:28:57,765 WARNING [train.py:1204] (2/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_526-62079-0_sp1.1 from training. Duration: 0.6090625 2023-10-04 17:28:57,924 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_92-246270-0_sp1.1 from training. Duration: 0.77275 2023-10-04 17:28:58,019 WARNING [train.py:1204] (2/4) Exclude cut with ID IC0287W0023-142350 from training. Duration: 0.956 2023-10-04 17:28:58,099 WARNING [train.py:1204] (2/4) Exclude cut with ID _58605_210_12210_1_1534723195939_3904259_48-269402-0 from training. Duration: 0.96 2023-10-04 17:28:58,114 WARNING [train.py:1204] (2/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_416-257730-0_sp0.9 from training. Duration: 0.72225 2023-10-04 17:28:58,190 WARNING [train.py:1204] (2/4) Exclude cut with ID _7877_210_6014_1_1528286358099_3750136_185-17516-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 17:28:58,296 WARNING [train.py:1204] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_23-189234-0 from training. Duration: 0.83 2023-10-04 17:28:58,313 WARNING [train.py:1204] (2/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_237-89684-0 from training. Duration: 0.96 2023-10-04 17:28:58,383 WARNING [train.py:1204] (2/4) Exclude cut with ID _13916_210_9994_1_1530788341126_3763149_116-21034-0 from training. Duration: 0.96 2023-10-04 17:28:58,524 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_3133_210_2087_1_1526106446_2324690_246-125000-0_sp1.1 from training. Duration: 0.563625 2023-10-04 17:29:05,609 INFO [train.py:1046] (2/4) Epoch 50, batch 0, loss[loss=0.1586, simple_loss=0.2411, pruned_loss=0.03807, over 23302.00 frames. ], tot_loss[loss=0.1586, simple_loss=0.2411, pruned_loss=0.03807, over 23302.00 frames. ], batch size: 93, lr: 2.04e-03, grad_scale: 32.0 2023-10-04 17:29:05,609 INFO [train.py:1069] (2/4) Computing validation loss 2023-10-04 17:29:16,752 INFO [zipformer.py:1853] (2/4) name=encoder.encoders.2.encoder.layers.1.self_attn_weights, attn_weights_entropy = tensor([3.2377, 2.0120, 2.8153, 2.9874], device='cuda:2') 2023-10-04 17:29:18,990 INFO [train.py:1078] (2/4) Epoch 50, validation: loss=0.3435, simple_loss=0.2762, pruned_loss=0.2054, over 1125622.00 frames. 2023-10-04 17:29:18,991 INFO [train.py:1079] (2/4) Maximum memory allocated so far is 20967MB 2023-10-04 17:29:21,692 WARNING [train.py:1204] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_402-253027-0 from training. Duration: 0.54 2023-10-04 17:29:21,770 WARNING [train.py:1204] (2/4) Exclude cut with ID _59288_210_5854_1_1535077757774_3412246_158-289345-0_sp1.1 from training. Duration: 0.92725 2023-10-04 17:29:23,232 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_28211_210_6388_1_1532861506_4523224_128-328162-0_sp1.1 from training. Duration: 0.8181875 2023-10-04 17:29:25,813 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.807e+02 2.206e+02 2.557e+02 2.974e+02 5.162e+02, threshold=5.113e+02, percent-clipped=2.0 2023-10-04 17:29:28,794 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_788-278281-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 17:29:28,802 WARNING [train.py:1204] (2/4) Exclude cut with ID _49728_210_12651_1_1534041034294_6788150_624-195301-0_sp0.9 from training. Duration: 0.9444375 2023-10-04 17:29:30,066 WARNING [train.py:1204] (2/4) Exclude cut with ID _14860_210_4915_1_1530702020340_7343240_267-139775-0_sp1.1 from training. Duration: 0.963625 2023-10-04 17:29:31,442 WARNING [train.py:1204] (2/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_332-221057-0 from training. Duration: 0.68 2023-10-04 17:29:32,818 WARNING [train.py:1204] (2/4) Exclude cut with ID _40402_210_18219_1_1533522324115_2953150_242-76943-0 from training. Duration: 0.94 2023-10-04 17:29:34,291 WARNING [train.py:1204] (2/4) Exclude cut with ID _16747_210_5986_1_1531811544733_2749109_87-349587-0_sp1.1 from training. Duration: 0.963625 2023-10-04 17:29:35,588 WARNING [train.py:1204] (2/4) Exclude cut with ID _22982_210_1883_1_1531882790447_5449409_505-346452-0_sp1.1 from training. Duration: 0.963625 2023-10-04 17:29:40,300 WARNING [train.py:1204] (2/4) Exclude cut with ID _17956_210_4353_1_1531209568125_6979970_13-192107-0_sp1.1 from training. Duration: 0.963625 2023-10-04 17:29:40,328 WARNING [train.py:1204] (2/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_430-307666-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 17:29:40,372 WARNING [train.py:1204] (2/4) Exclude cut with ID _2895_210_2477_1_1525956785794_4063750_289-11475-0_sp1.1 from training. Duration: 0.7454375 2023-10-04 17:29:40,383 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_3910_210_1794_1_1526742306_334530_28-238909-0_sp0.9 from training. Duration: 0.9 2023-10-04 17:29:41,775 WARNING [train.py:1204] (2/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_18-130336-0 from training. Duration: 0.79 2023-10-04 17:29:43,292 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_548-294316-0_sp0.9 from training. Duration: 0.9 2023-10-04 17:29:51,292 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_42-142473-0_sp1.1 from training. Duration: 0.6545625 2023-10-04 17:29:51,298 WARNING [train.py:1204] (2/4) Exclude cut with ID _59170_210_6721_1_1534983633960_4203335_326-48066-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 17:29:53,986 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_45886_210_17024_1_1533709215_471063_7-79165-0 from training. Duration: 0.98 2023-10-04 17:29:56,742 WARNING [train.py:1204] (2/4) Exclude cut with ID _44585_210_18628_1_1533605350387_4518199_280-73534-0_sp0.9 from training. Duration: 0.988875 2023-10-04 17:29:56,749 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_3475_210_2028_1_1526193578_3622569_265-276625-0_sp1.1 from training. Duration: 0.6909375 2023-10-04 17:29:59,543 WARNING [train.py:1204] (2/4) Exclude cut with ID _64631_210_27767_1_1535349580068_4165160_88-225232-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 17:30:03,772 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_205-265518-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 17:30:04,047 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.ff3_skip_rate, batch_count=1735493.3333333333, ans=0.0 2023-10-04 17:30:04,578 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.2.encoder.layers.0.conv_module1.whiten, num_groups=1, num_channels=384, metric=6.10 vs. limit=15.0 2023-10-04 17:30:07,980 WARNING [train.py:1204] (2/4) Exclude cut with ID _40402_210_18219_1_1533522324115_2953150_271-253734-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 17:30:14,021 WARNING [train.py:1204] (2/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_282-257791-0 from training. Duration: 0.86 2023-10-04 17:30:14,419 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.balancer2.prob, batch_count=1735493.3333333333, ans=0.125 2023-10-04 17:30:15,654 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder_embed.dropout.p, batch_count=1735560.0, ans=0.1 2023-10-04 17:30:17,269 WARNING [train.py:1204] (2/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_356-45054-0 from training. Duration: 0.86 2023-10-04 17:30:17,319 WARNING [train.py:1204] (2/4) Exclude cut with ID _69584_210_5219_1_1535785133775_4044450_38-348116-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 17:30:17,321 WARNING [train.py:1204] (2/4) Exclude cut with ID _66763_210_17301_1_1535788667617_241750_21-328480-0_sp1.1 from training. Duration: 0.936375 2023-10-04 17:30:18,691 WARNING [train.py:1204] (2/4) Exclude cut with ID _26528_210_9070_1_1532570410842_3851310_497-97218-0_sp0.9 from training. Duration: 0.9555625 2023-10-04 17:30:19,998 WARNING [train.py:1204] (2/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_592-101075-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 17:30:21,384 WARNING [train.py:1204] (2/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_678-159307-0 from training. Duration: 0.85 2023-10-04 17:30:23,477 WARNING [train.py:1204] (2/4) Exclude cut with ID _28427_210_4220_1_1532581240967_3061069_166-319773-0_sp1.1 from training. Duration: 0.936375 2023-10-04 17:30:23,565 WARNING [train.py:1204] (2/4) Exclude cut with ID _29877_210_6228_1_1532397791106_5276149_606-224984-0_sp1.1 from training. Duration: 0.936375 2023-10-04 17:30:27,752 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_125-203105-0_sp1.1 from training. Duration: 0.72725 2023-10-04 17:30:29,285 WARNING [train.py:1204] (2/4) Exclude cut with ID IC0620W0113-306765 from training. Duration: 0.768 2023-10-04 17:30:29,394 WARNING [train.py:1204] (2/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_410-287857-0_sp1.1 from training. Duration: 0.8454375 2023-10-04 17:30:32,034 INFO [train.py:1046] (2/4) Epoch 50, batch 50, loss[loss=0.1521, simple_loss=0.2296, pruned_loss=0.03731, over 23820.00 frames. ], tot_loss[loss=0.1528, simple_loss=0.2347, pruned_loss=0.03548, over 1081513.00 frames. ], batch size: 179, lr: 2.04e-03, grad_scale: 16.0 2023-10-04 17:30:33,449 WARNING [train.py:1204] (2/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_78-95260-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 17:30:34,844 WARNING [train.py:1204] (2/4) Exclude cut with ID _58059_210_5854_1_1534901471149_3164808_6-73080-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 17:30:35,721 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.0.layers.1.self_attn1.whiten, num_groups=1, num_channels=192, metric=9.83 vs. limit=22.5 2023-10-04 17:30:36,090 WARNING [train.py:1204] (2/4) Exclude cut with ID _5166_210_2084_1_1527073197160_4087993_331-117829-0 from training. Duration: 0.62 2023-10-04 17:30:36,156 WARNING [train.py:1204] (2/4) Exclude cut with ID _14880_210_8895_1_1530680426655_3795259_289-212817-0_sp0.9 from training. Duration: 0.7666875 2023-10-04 17:30:37,483 WARNING [train.py:1204] (2/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_620-144779-0_sp1.1 from training. Duration: 0.92725 2023-10-04 17:30:38,599 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.5.encoder.layers.1.whiten, num_groups=1, num_channels=256, metric=1.98 vs. limit=12.0 2023-10-04 17:30:39,427 WARNING [train.py:1204] (2/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_561-288575-0_sp1.1 from training. Duration: 0.963625 2023-10-04 17:30:40,818 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_22657_210_6890_1_1532086002_1637216_122-188128-0_sp1.1 from training. Duration: 0.963625 2023-10-04 17:30:42,233 WARNING [train.py:1204] (2/4) Exclude cut with ID _16756_210_4915_1_1531047803763_7104259_604-86607-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 17:30:45,062 WARNING [train.py:1204] (2/4) Exclude cut with ID _15150_210_8341_1_1530752411803_7256384_469-239509-0 from training. Duration: 0.68 2023-10-04 17:30:46,835 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_10987_210_4163_1_1529760555_4123902_302-87926-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 17:30:52,932 WARNING [train.py:1204] (2/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_469-94673-0_sp0.9 from training. Duration: 0.87775 2023-10-04 17:30:53,470 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.4.encoder.layers.1.feed_forward3.out_whiten, num_groups=1, num_channels=384, metric=13.44 vs. limit=15.0 2023-10-04 17:30:54,894 WARNING [train.py:1204] (2/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_798-106179-0 from training. Duration: 0.83 2023-10-04 17:30:56,312 WARNING [train.py:1204] (2/4) Exclude cut with ID _16744_210_5986_1_1531724405384_3907567_242-111316-0 from training. Duration: 0.93 2023-10-04 17:30:56,812 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.5.encoder.layers.1.conv_module2.whiten, num_groups=1, num_channels=256, metric=5.59 vs. limit=15.0 2023-10-04 17:30:57,659 WARNING [train.py:1204] (2/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_938-205135-0_sp0.9 from training. Duration: 0.9666875 2023-10-04 17:30:59,069 WARNING [train.py:1204] (2/4) Exclude cut with ID _64135_210_2253_1_1535677267876_7608972_133-188266-0_sp0.9 from training. Duration: 0.9 2023-10-04 17:30:59,073 WARNING [train.py:1204] (2/4) Exclude cut with ID _44585_210_18628_1_1533605350387_4518199_623-346820-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 17:31:00,412 WARNING [train.py:1204] (2/4) Exclude cut with ID _42326_210_7770_1_1533814282531_3707269_454-266903-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 17:31:00,504 WARNING [train.py:1204] (2/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_5-55893-0_sp1.1 from training. Duration: 0.5 2023-10-04 17:31:00,534 WARNING [train.py:1204] (2/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_577-332600-0_sp1.1 from training. Duration: 0.5181875 2023-10-04 17:31:00,535 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_957-138882-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 17:31:07,701 WARNING [train.py:1204] (2/4) Exclude cut with ID _12556_210_8341_1_1530081024100_7327690_219-38004-0_sp1.1 from training. Duration: 0.97275 2023-10-04 17:31:09,637 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_397-147196-0_sp1.1 from training. Duration: 0.72725 2023-10-04 17:31:09,647 WARNING [train.py:1204] (2/4) Exclude cut with ID _3541_210_2477_1_1526475408709_3263973_231-104736-0_sp0.9 from training. Duration: 0.8666875 2023-10-04 17:31:10,963 WARNING [train.py:1204] (2/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_398-334558-0 from training. Duration: 0.96 2023-10-04 17:31:12,322 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_257-20733-0_sp1.1 from training. Duration: 0.6909375 2023-10-04 17:31:13,733 WARNING [train.py:1204] (2/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_146-282400-0_sp1.1 from training. Duration: 0.6454375 2023-10-04 17:31:13,737 WARNING [train.py:1204] (2/4) Exclude cut with ID _51899_210_20034_1_1534129254466_3669220_265-215428-0 from training. Duration: 0.98 2023-10-04 17:31:13,802 WARNING [train.py:1204] (2/4) Exclude cut with ID _34423_210_8750_1_1533430798456_3330199_219-106541-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 17:31:15,087 WARNING [train.py:1204] (2/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_458-67321-0 from training. Duration: 0.97 2023-10-04 17:31:17,130 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.ff2_skip_rate, batch_count=1735826.6666666667, ans=0.0 2023-10-04 17:31:21,540 WARNING [train.py:1204] (2/4) Exclude cut with ID _6653_210_2629_1_1527919172441_6732679_1051-182606-0_sp1.1 from training. Duration: 0.936375 2023-10-04 17:31:21,564 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_53555_210_5632_1_1534595220_7232297_603-4396-0_sp1.1 from training. Duration: 0.97275 2023-10-04 17:31:24,302 WARNING [train.py:1204] (2/4) Exclude cut with ID _54218_210_12210_1_1534413646388_4409259_159-144367-0_sp1.1 from training. Duration: 0.963625 2023-10-04 17:31:24,406 WARNING [train.py:1204] (2/4) Exclude cut with ID _7606_210_4915_1_1528894253277_5080690_301-10732-0_sp0.9 from training. Duration: 0.9 2023-10-04 17:31:26,264 WARNING [train.py:1204] (2/4) Exclude cut with ID _15026_210_8750_1_1530777602316_3291850_211-195043-0_sp0.9 from training. Duration: 0.888875 2023-10-04 17:31:28,919 WARNING [train.py:1204] (2/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_146-282400-0 from training. Duration: 0.71 2023-10-04 17:31:28,963 WARNING [train.py:1204] (2/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_65-88242-0 from training. Duration: 0.56 2023-10-04 17:31:31,622 WARNING [train.py:1204] (2/4) Exclude cut with ID _55857_210_2346_1_1534644007831_6898209_80-107757-0_sp1.1 from training. Duration: 0.963625 2023-10-04 17:31:31,684 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_15126_210_6753_1_1530778677_2591185_120-277707-0_sp0.9 from training. Duration: 0.888875 2023-10-04 17:31:33,048 WARNING [train.py:1204] (2/4) Exclude cut with ID _44778_210_2054_1_1533798127203_7443090_1033-168593-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 17:31:34,396 WARNING [train.py:1204] (2/4) Exclude cut with ID _46683_210_9642_1_1533730913492_4591116_475-30711-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 17:31:34,428 WARNING [train.py:1204] (2/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_253-308377-0 from training. Duration: 0.49 2023-10-04 17:31:34,486 WARNING [train.py:1204] (2/4) Exclude cut with ID _4680_210_3525_1_1527249757953_893386_75-78979-0 from training. Duration: 0.91 2023-10-04 17:31:35,899 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_5522_210_3273_1_1527249606_3909096_318-203153-0_sp0.9 from training. Duration: 0.47775 2023-10-04 17:31:37,234 WARNING [train.py:1204] (2/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_550-170116-0_sp1.1 from training. Duration: 0.92725 2023-10-04 17:31:38,455 WARNING [train.py:1204] (2/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_277-210798-0_sp0.9 from training. Duration: 0.611125 2023-10-04 17:31:38,525 WARNING [train.py:1204] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_620-276793-0 from training. Duration: 0.59 2023-10-04 17:31:38,535 WARNING [train.py:1204] (2/4) Exclude cut with ID _3067_210_2667_1_1526212974640_4503168_119-175776-0 from training. Duration: 0.74 2023-10-04 17:31:39,892 WARNING [train.py:1204] (2/4) Exclude cut with ID _55209_210_1531_1_1534675828996_4597160_681-195827-0_sp1.1 from training. Duration: 0.92725 2023-10-04 17:31:39,961 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_427-78097-0_sp1.1 from training. Duration: 0.72725 2023-10-04 17:31:41,868 WARNING [train.py:1204] (2/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_96-290039-0_sp0.9 from training. Duration: 0.62225 2023-10-04 17:31:41,877 WARNING [train.py:1204] (2/4) Exclude cut with ID _45329_210_7005_1_1534157951211_6774339_287-292174-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 17:31:45,909 INFO [train.py:1046] (2/4) Epoch 50, batch 100, loss[loss=0.194, simple_loss=0.2641, pruned_loss=0.06192, over 19151.00 frames. ], tot_loss[loss=0.1535, simple_loss=0.2351, pruned_loss=0.036, over 1895232.05 frames. ], batch size: 388, lr: 2.04e-03, grad_scale: 16.0 2023-10-04 17:31:45,963 WARNING [train.py:1204] (2/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_200-292228-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 17:31:47,527 WARNING [train.py:1204] (2/4) Exclude cut with ID _69884_210_9771_1_1535846392818_4166169_291-194999-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 17:31:50,950 WARNING [train.py:1204] (2/4) Exclude cut with ID _58895_210_8750_1_1535184081320_3649089_265-323810-0_sp1.1 from training. Duration: 0.87275 2023-10-04 17:31:52,377 WARNING [train.py:1204] (2/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_58-68956-0 from training. Duration: 0.83 2023-10-04 17:31:52,382 WARNING [train.py:1204] (2/4) Exclude cut with ID _39867_210_9826_1_1533553222030_9344259_322-115153-0_sp1.1 from training. Duration: 0.963625 2023-10-04 17:31:55,784 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.772e+02 2.084e+02 2.344e+02 2.870e+02 5.145e+02, threshold=4.687e+02, percent-clipped=1.0 2023-10-04 17:31:57,264 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_3371_210_1794_1_1526384462_5580777_396-339812-0_sp1.1 from training. Duration: 0.77275 2023-10-04 17:31:57,275 WARNING [train.py:1204] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_731-2428-0_sp0.9 from training. Duration: 0.92225 2023-10-04 17:31:58,595 WARNING [train.py:1204] (2/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_98-741-0_sp1.1 from training. Duration: 0.72725 2023-10-04 17:31:58,597 WARNING [train.py:1204] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_238-245655-0_sp0.9 from training. Duration: 0.9 2023-10-04 17:31:58,634 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6788_210_1388_1_1527898698_4562021_91-197981-0_sp0.9 from training. Duration: 0.92225 2023-10-04 17:32:00,494 WARNING [train.py:1204] (2/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_860-96198-0 from training. Duration: 0.73 2023-10-04 17:32:03,197 WARNING [train.py:1204] (2/4) Exclude cut with ID _14268_210_9206_1_1530622771285_4714099_331-105351-0_sp0.9 from training. Duration: 0.72225 2023-10-04 17:32:04,522 WARNING [train.py:1204] (2/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_204-137133-0_sp1.1 from training. Duration: 0.92725 2023-10-04 17:32:04,549 WARNING [train.py:1204] (2/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_5-46496-0_sp1.1 from training. Duration: 0.936375 2023-10-04 17:32:04,563 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_160-218721-0_sp1.1 from training. Duration: 0.87275 2023-10-04 17:32:08,778 WARNING [train.py:1204] (2/4) Exclude cut with ID _45329_210_7005_1_1534157951211_6774339_503-287883-0 from training. Duration: 0.88 2023-10-04 17:32:10,149 WARNING [train.py:1204] (2/4) Exclude cut with ID _57572_210_5517_1_1534921219624_3809484_110-79134-0_sp1.1 from training. Duration: 0.92725 2023-10-04 17:32:10,226 WARNING [train.py:1204] (2/4) Exclude cut with ID _44585_210_18628_1_1533605350387_4518199_421-37641-0_sp1.1 from training. Duration: 0.936375 2023-10-04 17:32:11,702 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_42-142473-0_sp0.9 from training. Duration: 0.8 2023-10-04 17:32:13,116 WARNING [train.py:1204] (2/4) Exclude cut with ID _5166_210_2084_1_1527073197160_4087993_310-114452-0_sp0.9 from training. Duration: 0.6555625 2023-10-04 17:32:17,652 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9029W0120-509520 from training. Duration: 0.768 2023-10-04 17:32:17,673 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9029W0433-509820 from training. Duration: 0.896 2023-10-04 17:32:19,203 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.bypass.scale_min, batch_count=1736093.3333333333, ans=0.2 2023-10-04 17:32:20,365 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_40222_210_6228_1_1533385298_4481939_163-334586-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 17:32:20,365 WARNING [train.py:1204] (2/4) Exclude cut with ID _48677_210_5663_1_1533902405919_3892989_408-40740-0_sp1.1 from training. Duration: 0.7545625 2023-10-04 17:32:20,589 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=1736093.3333333333, ans=0.1 2023-10-04 17:32:25,206 WARNING [train.py:1204] (2/4) Exclude cut with ID _24314_210_4915_1_1532135078116_7170179_521-258868-0_sp0.9 from training. Duration: 0.87775 2023-10-04 17:32:26,710 WARNING [train.py:1204] (2/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_576-51198-0_sp1.1 from training. Duration: 0.92725 2023-10-04 17:32:28,062 WARNING [train.py:1204] (2/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_306-216871-0_sp1.1 from training. Duration: 0.97275 2023-10-04 17:32:31,655 WARNING [train.py:1204] (2/4) Exclude cut with ID _20684_210_6917_1_1531567751646_4203930_463-119982-0_sp1.1 from training. Duration: 0.97275 2023-10-04 17:32:33,028 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9084W0012-536850 from training. Duration: 0.64 2023-10-04 17:32:34,812 WARNING [train.py:1204] (2/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_847-41334-0_sp1.1 from training. Duration: 0.4 2023-10-04 17:32:37,486 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.balancer2.prob, batch_count=1736160.0, ans=0.125 2023-10-04 17:32:38,743 WARNING [train.py:1204] (2/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_326-165900-0_sp1.1 from training. Duration: 0.62725 2023-10-04 17:32:38,809 WARNING [train.py:1204] (2/4) Exclude cut with ID _7537_210_2084_1_1528502395431_3924359_128-268029-0_sp1.1 from training. Duration: 0.8909375 2023-10-04 17:32:42,902 WARNING [train.py:1204] (2/4) Exclude cut with ID _67499_210_17282_1_1535509805220_3692430_333-155030-0_sp1.1 from training. Duration: 0.97275 2023-10-04 17:32:44,396 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_34008_210_10431_1_1533345424_8183247_588-257177-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 17:32:44,652 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.conv_module2.balancer2.prob, batch_count=1736226.6666666667, ans=0.125 2023-10-04 17:32:46,438 WARNING [train.py:1204] (2/4) Exclude cut with ID _25914_210_6400_1_1532141317058_644740_52-96808-0_sp1.1 from training. Duration: 0.82725 2023-10-04 17:32:46,906 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.nonlin_attention.whiten1.whitening_limit, batch_count=1736226.6666666667, ans=10.0 2023-10-04 17:32:49,037 WARNING [train.py:1204] (2/4) Exclude cut with ID _56494_210_4220_1_1535284837625_3033169_69-138150-0_sp1.1 from training. Duration: 0.8545625 2023-10-04 17:32:50,594 WARNING [train.py:1204] (2/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_19-143553-0_sp1.1 from training. Duration: 0.97275 2023-10-04 17:32:50,646 WARNING [train.py:1204] (2/4) Exclude cut with ID _21878_210_3525_1_1531738772002_7264330_582-130735-0_sp1.1 from training. Duration: 0.936375 2023-10-04 17:32:53,470 WARNING [train.py:1204] (2/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_943-201356-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 17:32:53,480 WARNING [train.py:1204] (2/4) Exclude cut with ID _3231_210_2106_1_1526205864934_6784220_422-229276-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 17:32:53,498 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_48049_210_5632_1_1533864553_3519937_73-198717-0_sp1.1 from training. Duration: 0.97275 2023-10-04 17:32:53,570 WARNING [train.py:1204] (2/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_329-285801-0 from training. Duration: 0.91 2023-10-04 17:32:55,340 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9171W0114-579000 from training. Duration: 0.896 2023-10-04 17:32:55,352 WARNING [train.py:1204] (2/4) Exclude cut with ID _3120_210_3525_1_1526036143112_4072980_210-132675-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 17:32:55,424 WARNING [train.py:1204] (2/4) Exclude cut with ID _55857_210_2346_1_1534644007831_6898209_745-98391-0_sp0.9 from training. Duration: 0.8444375 2023-10-04 17:32:55,489 WARNING [train.py:1204] (2/4) Exclude cut with ID _5250_210_5219_1_1527249561806_8692110_615-322035-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 17:32:55,493 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_36756_210_7234_1_1533026991_966276_119-315767-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 17:32:56,804 WARNING [train.py:1204] (2/4) Exclude cut with ID _13916_210_9994_1_1530788341126_3763149_60-191556-0_sp1.1 from training. Duration: 0.5545625 2023-10-04 17:32:56,827 WARNING [train.py:1204] (2/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_177-236809-0_sp1.1 from training. Duration: 0.4454375 2023-10-04 17:32:56,879 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7002_210_1794_1_1527939683_5157286_184-229630-0_sp1.1 from training. Duration: 0.67275 2023-10-04 17:32:56,885 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_50428_210_5724_1_1534247532_4126418_464-18322-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 17:32:58,692 WARNING [train.py:1204] (2/4) Exclude cut with ID _35799_210_5854_1_1533263487887_3529071_466-131402-0_sp1.1 from training. Duration: 0.936375 2023-10-04 17:32:58,934 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.conv_module1.balancer2.min_abs, batch_count=1736226.6666666667, ans=0.5 2023-10-04 17:33:00,096 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_1259-132768-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 17:33:00,151 WARNING [train.py:1204] (2/4) Exclude cut with ID _6694_210_4599_1_1527852637490_3965529_144-185051-0_sp1.1 from training. Duration: 0.87275 2023-10-04 17:33:01,410 INFO [train.py:1046] (2/4) Epoch 50, batch 150, loss[loss=0.1519, simple_loss=0.2382, pruned_loss=0.03279, over 24480.00 frames. ], tot_loss[loss=0.1538, simple_loss=0.2353, pruned_loss=0.03618, over 2531767.58 frames. ], batch size: 66, lr: 2.04e-03, grad_scale: 16.0 2023-10-04 17:33:01,468 WARNING [train.py:1204] (2/4) Exclude cut with ID _64639_210_5816_1_1535367619437_3528000_59-133290-0_sp1.1 from training. Duration: 0.963625 2023-10-04 17:33:04,650 WARNING [train.py:1204] (2/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_223-49525-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 17:33:07,431 WARNING [train.py:1204] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_748-36150-0_sp1.1 from training. Duration: 0.82725 2023-10-04 17:33:07,431 WARNING [train.py:1204] (2/4) Exclude cut with ID _19896_210_1883_1_1531709022662_6649569_52-221627-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 17:33:07,482 WARNING [train.py:1204] (2/4) Exclude cut with ID _6789_210_1534_1_1527854471671_2870519_296-115766-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 17:33:10,180 WARNING [train.py:1204] (2/4) Exclude cut with ID _14624_210_1925_1_1530858690657_3642219_100-19871-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 17:33:11,477 WARNING [train.py:1204] (2/4) Exclude cut with ID _12555_210_8750_1_1530075594402_3780239_50-293675-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 17:33:12,946 WARNING [train.py:1204] (2/4) Exclude cut with ID _14880_210_8895_1_1530680426655_3795259_289-212817-0_sp1.1 from training. Duration: 0.62725 2023-10-04 17:33:14,253 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_14056_210_4163_1_1530604483_7572827_388-211626-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 17:33:17,668 WARNING [train.py:1204] (2/4) Exclude cut with ID _5289_210_2084_1_1527678202736_3779299_392-261988-0 from training. Duration: 0.65 2023-10-04 17:33:17,676 WARNING [train.py:1204] (2/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_124-345840-0 from training. Duration: 0.52 2023-10-04 17:33:17,680 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_2655_210_2087_1_1525688959_3897400_437-337690-0 from training. Duration: 0.63 2023-10-04 17:33:19,149 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_3780_210_2087_1_1526467391_239068_13-274633-0_sp1.1 from training. Duration: 0.92725 2023-10-04 17:33:19,153 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_805-71497-0_sp1.1 from training. Duration: 0.6454375 2023-10-04 17:33:20,491 WARNING [train.py:1204] (2/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_434-83000-0_sp1.1 from training. Duration: 0.8909375 2023-10-04 17:33:21,801 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_17613_210_2151_1_1531137569_2366160_285-219531-0_sp1.1 from training. Duration: 0.97275 2023-10-04 17:33:21,806 WARNING [train.py:1204] (2/4) Exclude cut with ID _56108_210_16380_1_1534505446159_3887959_22-37858-0_sp1.1 from training. Duration: 0.936375 2023-10-04 17:33:23,162 WARNING [train.py:1204] (2/4) Exclude cut with ID _28427_210_4220_1_1532581240967_3061069_200-88444-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 17:33:24,962 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_43802_210_2290_1_1533516203_8829504_77-279042-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 17:33:26,356 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9309W0193-640320 from training. Duration: 0.896 2023-10-04 17:33:27,811 WARNING [train.py:1204] (2/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_516-324055-0_sp1.1 from training. Duration: 0.936375 2023-10-04 17:33:29,407 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.ff3_skip_rate, batch_count=1736426.6666666667, ans=0.0 2023-10-04 17:33:35,219 WARNING [train.py:1204] (2/4) Exclude cut with ID _40094_210_16380_1_1533294005773_3641159_10-261240-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 17:33:39,934 WARNING [train.py:1204] (2/4) Exclude cut with ID _6267_210_2084_1_1527764658526_3894730_375-219887-0_sp1.1 from training. Duration: 0.6090625 2023-10-04 17:33:41,242 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6788_210_1388_1_1527898698_4562021_180-75288-0 from training. Duration: 0.99 2023-10-04 17:33:45,235 WARNING [train.py:1204] (2/4) Exclude cut with ID _8030_210_7497_1_1528444886781_3531089_262-86750-0_sp1.1 from training. Duration: 0.736375 2023-10-04 17:33:45,263 WARNING [train.py:1204] (2/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_934-325498-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 17:33:45,295 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_456-22141-0_sp0.9 from training. Duration: 0.97775 2023-10-04 17:33:46,725 WARNING [train.py:1204] (2/4) Exclude cut with ID _49643_210_7989_1_1534640244224_3939031_598-161926-0_sp1.1 from training. Duration: 0.8181875 2023-10-04 17:33:48,072 WARNING [train.py:1204] (2/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_345-49721-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 17:33:49,410 WARNING [train.py:1204] (2/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_440-141811-0_sp1.1 from training. Duration: 0.863625 2023-10-04 17:33:51,347 WARNING [train.py:1204] (2/4) Exclude cut with ID _64048_210_10626_1_1535198382802_3835179_394-212120-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 17:33:52,636 WARNING [train.py:1204] (2/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_91-277684-0 from training. Duration: 0.74 2023-10-04 17:33:55,714 WARNING [train.py:1204] (2/4) Exclude cut with ID _23038_210_9570_1_1532001584158_3652419_191-346343-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 17:33:57,074 WARNING [train.py:1204] (2/4) Exclude cut with ID _15140_210_9288_1_1530791388450_4880149_351-347974-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 17:33:57,115 WARNING [train.py:1204] (2/4) Exclude cut with ID _58605_210_12210_1_1534723195939_3904259_48-269402-0_sp1.1 from training. Duration: 0.87275 2023-10-04 17:33:57,121 WARNING [train.py:1204] (2/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_401-53423-0_sp0.9 from training. Duration: 0.611125 2023-10-04 17:34:00,268 WARNING [train.py:1204] (2/4) Exclude cut with ID _42659_210_2070_1_1533607442765_3120380_158-49004-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 17:34:01,634 WARNING [train.py:1204] (2/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_81-75849-0_sp1.1 from training. Duration: 0.3545625 2023-10-04 17:34:03,653 WARNING [train.py:1204] (2/4) Exclude cut with ID _27052_210_5986_1_1532329197187_3585009_314-106499-0_sp1.1 from training. Duration: 0.836375 2023-10-04 17:34:05,093 WARNING [train.py:1204] (2/4) Exclude cut with ID _4626_210_3532_1_1526817652717_3916859_446-206656-0_sp0.9 from training. Duration: 0.8444375 2023-10-04 17:34:06,483 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_53378_210_17282_1_1534221071_5769509_158-114960-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 17:34:07,852 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_3171_210_2087_1_1526109084_2330007_37-64334-0_sp0.9 from training. Duration: 0.911125 2023-10-04 17:34:09,144 WARNING [train.py:1204] (2/4) Exclude cut with ID _5410_210_1388_1_1527397055795_1719020_39-112221-0 from training. Duration: 0.69 2023-10-04 17:34:09,169 WARNING [train.py:1204] (2/4) Exclude cut with ID _8064_210_5399_1_1528372299291_4282973_4-91037-0_sp0.9 from training. Duration: 0.97775 2023-10-04 17:34:09,184 WARNING [train.py:1204] (2/4) Exclude cut with ID ID0079W0443-717795 from training. Duration: 0.541 2023-10-04 17:34:09,471 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.balancer1.prob, batch_count=1736560.0, ans=0.125 2023-10-04 17:34:09,765 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.4.encoder.layers.0.whiten, num_groups=1, num_channels=384, metric=4.41 vs. limit=12.0 2023-10-04 17:34:12,528 WARNING [train.py:1204] (2/4) Exclude cut with ID _34734_210_4525_1_1533365977542_4448720_766-297928-0_sp1.1 from training. Duration: 0.936375 2023-10-04 17:34:15,179 INFO [train.py:1046] (2/4) Epoch 50, batch 200, loss[loss=0.1824, simple_loss=0.2578, pruned_loss=0.0535, over 19785.00 frames. ], tot_loss[loss=0.1554, simple_loss=0.2368, pruned_loss=0.03698, over 3018441.53 frames. ], batch size: 389, lr: 2.04e-03, grad_scale: 8.0 2023-10-04 17:34:16,668 WARNING [train.py:1204] (2/4) Exclude cut with ID _51471_210_1889_1_1534399391390_3094250_310-337793-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 17:34:16,687 WARNING [train.py:1204] (2/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_678-159307-0_sp0.9 from training. Duration: 0.9444375 2023-10-04 17:34:20,104 WARNING [train.py:1204] (2/4) Exclude cut with ID _50093_210_4220_1_1534417141207_3432299_259-322252-0 from training. Duration: 0.92 2023-10-04 17:34:20,158 WARNING [train.py:1204] (2/4) Exclude cut with ID _47790_210_16187_1_1533862814066_4207519_67-103444-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 17:34:21,479 WARNING [train.py:1204] (2/4) Exclude cut with ID _40551_210_10635_1_1533776424632_3416179_311-247578-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 17:34:24,256 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_4633_210_2040_1_1526824163_4145343_416-76117-0 from training. Duration: 0.95 2023-10-04 17:34:25,500 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.740e+02 2.066e+02 2.212e+02 2.512e+02 4.565e+02, threshold=4.424e+02, percent-clipped=0.0 2023-10-04 17:34:25,618 WARNING [train.py:1204] (2/4) Exclude cut with ID _5166_210_2084_1_1527073197160_4087993_331-117829-0_sp0.9 from training. Duration: 0.688875 2023-10-04 17:34:28,212 WARNING [train.py:1204] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_299-247059-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 17:34:29,597 WARNING [train.py:1204] (2/4) Exclude cut with ID _64002_210_9570_1_1535198605157_38979_2-240845-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 17:34:32,819 WARNING [train.py:1204] (2/4) Exclude cut with ID _4681_210_3525_1_1527250689464_120889_15-126760-0_sp0.9 from training. Duration: 0.9 2023-10-04 17:34:34,146 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_21500_210_2054_1_1532149310_2716758_256-156611-0_sp1.1 from training. Duration: 0.936375 2023-10-04 17:34:34,169 WARNING [train.py:1204] (2/4) Exclude cut with ID _16908_210_5724_1_1531119157248_3772700_624-203729-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 17:34:51,752 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.nonlin_attention.balancer.prob, batch_count=1736760.0, ans=0.125 2023-10-04 17:34:54,815 WARNING [train.py:1204] (2/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_605-219732-0_sp1.1 from training. Duration: 0.8545625 2023-10-04 17:34:54,850 WARNING [train.py:1204] (2/4) Exclude cut with ID _44718_210_5816_1_1534050051869_6469030_380-167520-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 17:34:56,224 WARNING [train.py:1204] (2/4) Exclude cut with ID _19896_210_1883_1_1531709022662_6649569_94-229927-0_sp1.1 from training. Duration: 0.8818125 2023-10-04 17:34:56,287 WARNING [train.py:1204] (2/4) Exclude cut with ID _19514_210_9259_1_1531530071330_5084230_519-174300-0_sp1.1 from training. Duration: 0.963625 2023-10-04 17:34:57,675 WARNING [train.py:1204] (2/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_340-174176-0_sp0.9 from training. Duration: 0.5444375 2023-10-04 17:34:57,679 WARNING [train.py:1204] (2/4) Exclude cut with ID _8263_210_1664_1_1528439452891_970220_68-181805-0_sp1.1 from training. Duration: 0.5818125 2023-10-04 17:34:57,796 WARNING [train.py:1204] (2/4) Exclude cut with ID _3120_210_3525_1_1526036143112_4072980_348-257723-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 17:34:59,212 WARNING [train.py:1204] (2/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_453-133983-0_sp1.1 from training. Duration: 0.7090625 2023-10-04 17:35:00,524 WARNING [train.py:1204] (2/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_17-86060-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 17:35:00,545 WARNING [train.py:1204] (2/4) Exclude cut with ID _29877_210_6228_1_1532397791106_5276149_228-184657-0_sp1.1 from training. Duration: 0.97275 2023-10-04 17:35:01,925 WARNING [train.py:1204] (2/4) Exclude cut with ID _18516_210_9206_1_1531704653206_3692406_198-334870-0 from training. Duration: 0.92 2023-10-04 17:35:03,132 WARNING [train.py:1204] (2/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_340-174176-0_sp1.1 from training. Duration: 0.4454375 2023-10-04 17:35:03,153 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_3133_210_2087_1_1526106446_2324690_14-139186-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 17:35:07,816 WARNING [train.py:1204] (2/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_126-29104-0_sp0.9 from training. Duration: 0.9444375 2023-10-04 17:35:12,530 WARNING [train.py:1204] (2/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_84-10310-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 17:35:18,842 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.feed_forward1.hidden_balancer.prob, batch_count=1736893.3333333333, ans=0.125 2023-10-04 17:35:19,997 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_15517_210_6753_1_1530954008_3487336_79-273076-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 17:35:20,051 WARNING [train.py:1204] (2/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_371-18169-0_sp0.9 from training. Duration: 0.9555625 2023-10-04 17:35:24,456 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.feed_forward2.hidden_balancer.prob, batch_count=1736893.3333333333, ans=0.125 2023-10-04 17:35:28,248 INFO [train.py:1046] (2/4) Epoch 50, batch 250, loss[loss=0.1529, simple_loss=0.2393, pruned_loss=0.03326, over 24367.00 frames. ], tot_loss[loss=0.1547, simple_loss=0.2373, pruned_loss=0.03599, over 3408685.75 frames. ], batch size: 61, lr: 2.04e-03, grad_scale: 8.0 2023-10-04 17:35:28,289 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_68702_210_23689_1_1535799577_3420068_170-92788-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 17:35:30,322 WARNING [train.py:1204] (2/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_78-182912-0 from training. Duration: 0.59 2023-10-04 17:35:30,378 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_3019_210_2028_1_1526044245_3264865_297-12384-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 17:35:30,383 WARNING [train.py:1204] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_147-111137-0_sp1.1 from training. Duration: 0.863625 2023-10-04 17:35:30,388 WARNING [train.py:1204] (2/4) Exclude cut with ID _28323_210_16187_1_1532480395667_4100300_117-8859-0_sp1.1 from training. Duration: 0.97275 2023-10-04 17:35:31,810 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7762_210_1794_1_1528199508_4057408_419-125009-0_sp1.1 from training. Duration: 0.7545625 2023-10-04 17:35:31,907 WARNING [train.py:1204] (2/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_268-87909-0 from training. Duration: 0.99 2023-10-04 17:35:33,392 WARNING [train.py:1204] (2/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_118-311357-0_sp1.1 from training. Duration: 0.92725 2023-10-04 17:35:33,413 WARNING [train.py:1204] (2/4) Exclude cut with ID ID0377W0003-870225 from training. Duration: 0.852 2023-10-04 17:35:35,183 WARNING [train.py:1204] (2/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_56-5838-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 17:35:36,562 WARNING [train.py:1204] (2/4) Exclude cut with ID _14603_210_4220_1_1530615688441_3097319_90-31383-0_sp0.9 from training. Duration: 0.9666875 2023-10-04 17:35:37,992 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_31583_210_3852_1_1532595851_4024413_58-86657-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 17:35:38,036 WARNING [train.py:1204] (2/4) Exclude cut with ID _30304_210_6319_1_1532476827261_7582060_344-113875-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 17:35:41,260 WARNING [train.py:1204] (2/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_278-104676-0_sp1.1 from training. Duration: 0.8909375 2023-10-04 17:35:42,562 WARNING [train.py:1204] (2/4) Exclude cut with ID _55844_210_2346_1_1534471212569_7625707_764-2387-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 17:35:43,997 WARNING [train.py:1204] (2/4) Exclude cut with ID _27430_210_7770_1_1532604689375_1378590_157-14963-0_sp1.1 from training. Duration: 0.963625 2023-10-04 17:35:46,819 WARNING [train.py:1204] (2/4) Exclude cut with ID _7160_210_1876_1_1527940923231_3703799_282-322611-0_sp1.1 from training. Duration: 0.7909375 2023-10-04 17:35:47,516 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.3.encoder.layers.0.self_attn2.whiten, num_groups=1, num_channels=512, metric=13.90 vs. limit=22.5 2023-10-04 17:35:54,679 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.feed_forward1.hidden_balancer.prob, batch_count=1737026.6666666667, ans=0.125 2023-10-04 17:35:55,763 WARNING [train.py:1204] (2/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_190-138640-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 17:35:58,527 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_33348_210_5632_1_1533086912_3324030_63-270922-0_sp1.1 from training. Duration: 0.97275 2023-10-04 17:35:58,552 WARNING [train.py:1204] (2/4) Exclude cut with ID _54871_210_22356_1_1534665798253_3856359_135-184281-0_sp1.1 from training. Duration: 0.9 2023-10-04 17:36:05,049 WARNING [train.py:1204] (2/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_277-210798-0_sp1.1 from training. Duration: 0.5 2023-10-04 17:36:05,109 WARNING [train.py:1204] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_402-253027-0_sp0.9 from training. Duration: 0.6 2023-10-04 17:36:06,511 WARNING [train.py:1204] (2/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_540-275685-0_sp0.9 from training. Duration: 0.988875 2023-10-04 17:36:07,874 WARNING [train.py:1204] (2/4) Exclude cut with ID _59493_210_17282_1_1535078419077_3225879_118-18960-0_sp1.1 from training. Duration: 0.936375 2023-10-04 17:36:07,946 WARNING [train.py:1204] (2/4) Exclude cut with ID _6194_210_1534_1_1527511826160_3941194_321-148641-0_sp1.1 from training. Duration: 0.5818125 2023-10-04 17:36:07,952 WARNING [train.py:1204] (2/4) Exclude cut with ID _61133_210_9969_1_1535196777227_3696409_333-270325-0_sp1.1 from training. Duration: 0.8454375 2023-10-04 17:36:08,012 WARNING [train.py:1204] (2/4) Exclude cut with ID _49376_210_5986_1_1533985278732_3370740_307-145221-0_sp1.1 from training. Duration: 0.936375 2023-10-04 17:36:11,305 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6846_210_6753_1_1527850767_4295158_2-315073-0_sp0.9 from training. Duration: 0.97775 2023-10-04 17:36:12,992 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.bypass_mid.scale_min, batch_count=1737160.0, ans=0.2 2023-10-04 17:36:14,178 WARNING [train.py:1204] (2/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_17-25045-0 from training. Duration: 0.59 2023-10-04 17:36:14,192 WARNING [train.py:1204] (2/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_503-333510-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 17:36:14,242 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder_embed.convnext.layerdrop_rate, batch_count=1737160.0, ans=0.015 2023-10-04 17:36:14,425 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.feed_forward1.out_proj.dropout_p, batch_count=1737160.0, ans=0.1 2023-10-04 17:36:16,863 WARNING [train.py:1204] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_238-245655-0_sp1.1 from training. Duration: 0.736375 2023-10-04 17:36:16,890 WARNING [train.py:1204] (2/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_329-82235-0_sp0.9 from training. Duration: 0.911125 2023-10-04 17:36:16,901 WARNING [train.py:1204] (2/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_325-113798-0_sp0.9 from training. Duration: 0.9444375 2023-10-04 17:36:17,118 INFO [scaling.py:1118] (2/4) WithLoss: name=encoder.encoders.3.encoder.layers.2.self_attn_weights, loss-sum=0.000e+00 2023-10-04 17:36:18,198 WARNING [train.py:1204] (2/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_395-37222-0_sp0.9 from training. Duration: 0.8444375 2023-10-04 17:36:20,135 WARNING [train.py:1204] (2/4) Exclude cut with ID _64135_210_2253_1_1535677267876_7608972_498-5039-0_sp1.1 from training. Duration: 0.7090625 2023-10-04 17:36:20,144 WARNING [train.py:1204] (2/4) Exclude cut with ID _14922_210_6228_1_1530707435273_3693310_303-228738-0_sp0.9 from training. Duration: 0.9333125 2023-10-04 17:36:21,556 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_577-220899-0_sp1.1 from training. Duration: 0.92725 2023-10-04 17:36:22,960 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_3475_210_2028_1_1526193578_3622569_377-166133-0_sp1.1 from training. Duration: 0.7818125 2023-10-04 17:36:24,305 WARNING [train.py:1204] (2/4) Exclude cut with ID _49376_210_5986_1_1533985278732_3370740_272-313512-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 17:36:25,904 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7762_210_1794_1_1528199508_4057408_422-227034-0_sp0.9 from training. Duration: 0.811125 2023-10-04 17:36:30,143 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.attention_skip_rate, batch_count=1737226.6666666667, ans=0.0 2023-10-04 17:36:31,994 WARNING [train.py:1204] (2/4) Exclude cut with ID _18895_210_6228_1_1531357274234_3784930_74-341428-0_sp1.1 from training. Duration: 0.92725 2023-10-04 17:36:35,224 WARNING [train.py:1204] (2/4) Exclude cut with ID _44902_210_16187_1_1533888006062_3792078_149-67212-0_sp1.1 from training. Duration: 0.7909375 2023-10-04 17:36:35,407 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=1737226.6666666667, ans=0.1 2023-10-04 17:36:39,329 WARNING [train.py:1204] (2/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_243-126020-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 17:36:40,620 WARNING [train.py:1204] (2/4) Exclude cut with ID _54218_210_12210_1_1534413646388_4409259_259-6561-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 17:36:43,258 INFO [train.py:1046] (2/4) Epoch 50, batch 300, loss[loss=0.1503, simple_loss=0.2399, pruned_loss=0.0304, over 24429.00 frames. ], tot_loss[loss=0.1524, simple_loss=0.2345, pruned_loss=0.03512, over 3696243.04 frames. ], batch size: 69, lr: 2.04e-03, grad_scale: 8.0 2023-10-04 17:36:43,383 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_41452_210_7291_1_1533437687_3941281_405-11559-0 from training. Duration: 0.91 2023-10-04 17:36:45,419 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_20120_210_6753_1_1531643035_5482859_759-225084-0_sp1.1 from training. Duration: 0.97275 2023-10-04 17:36:45,437 WARNING [train.py:1204] (2/4) Exclude cut with ID _14747_210_8341_1_1530685797463_3654089_147-131229-0_sp0.9 from training. Duration: 0.8444375 2023-10-04 17:36:46,845 WARNING [train.py:1204] (2/4) Exclude cut with ID _14867_210_1968_1_1530684179890_3846969_319-250868-0 from training. Duration: 0.99 2023-10-04 17:36:46,875 WARNING [train.py:1204] (2/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_580-93798-0_sp0.9 from training. Duration: 0.62225 2023-10-04 17:36:48,330 WARNING [train.py:1204] (2/4) Exclude cut with ID _9679_210_7497_1_1529129212906_4363929_512-313889-0_sp0.9 from training. Duration: 0.9 2023-10-04 17:36:48,346 WARNING [train.py:1204] (2/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_18-189198-0 from training. Duration: 0.9 2023-10-04 17:36:53,090 WARNING [train.py:1204] (2/4) Exclude cut with ID _49755_210_18053_1_1534581005846_3218709_137-64179-0_sp1.1 from training. Duration: 0.92725 2023-10-04 17:36:53,142 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_48049_210_5632_1_1533864553_3519937_306-283794-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 17:36:54,341 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.715e+02 2.135e+02 2.414e+02 2.952e+02 4.730e+02, threshold=4.828e+02, percent-clipped=1.0 2023-10-04 17:36:57,181 WARNING [train.py:1204] (2/4) Exclude cut with ID _43552_210_8913_1_1533726031059_3523429_25-120161-0_sp1.1 from training. Duration: 0.8909375 2023-10-04 17:36:57,258 WARNING [train.py:1204] (2/4) Exclude cut with ID _8833_210_5219_1_1528592665210_7553059_319-349181-0 from training. Duration: 0.99 2023-10-04 17:36:58,600 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_296-332325-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 17:36:58,693 WARNING [train.py:1204] (2/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_436-186311-0_sp1.1 from training. Duration: 0.5909375 2023-10-04 17:36:58,705 WARNING [train.py:1204] (2/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_348-127789-0 from training. Duration: 0.85 2023-10-04 17:36:58,710 WARNING [train.py:1204] (2/4) Exclude cut with ID _3992_210_1747_1_1526644816031_3731060_443-195183-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 17:37:03,503 WARNING [train.py:1204] (2/4) Exclude cut with ID _3465_210_3525_1_1526175667060_3574049_308-231735-0_sp1.1 from training. Duration: 0.663625 2023-10-04 17:37:08,150 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6786_210_1388_1_1527839699_4194495_378-46070-0_sp1.1 from training. Duration: 0.6545625 2023-10-04 17:37:08,209 WARNING [train.py:1204] (2/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_63-101996-0 from training. Duration: 0.88 2023-10-04 17:37:12,298 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_2541_210_1794_1_1525590269_3084765_156-173713-0 from training. Duration: 0.81 2023-10-04 17:37:12,335 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_217-239796-0_sp1.1 from training. Duration: 0.963625 2023-10-04 17:37:15,588 WARNING [train.py:1204] (2/4) Exclude cut with ID _21878_210_3525_1_1531738772002_7264330_530-329400-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 17:37:15,746 WARNING [train.py:1204] (2/4) Exclude cut with ID _21783_210_11357_1_1531742458730_3762679_150-62920-0_sp1.1 from training. Duration: 0.963625 2023-10-04 17:37:15,747 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_5522_210_3273_1_1527249606_3909096_366-172508-0 from training. Duration: 0.66 2023-10-04 17:37:15,749 WARNING [train.py:1204] (2/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_303-340446-0_sp1.1 from training. Duration: 0.8181875 2023-10-04 17:37:19,042 WARNING [train.py:1204] (2/4) Exclude cut with ID _8699_210_2069_1_1528630346831_3960409_160-24408-0_sp1.1 from training. Duration: 0.82725 2023-10-04 17:37:21,747 WARNING [train.py:1204] (2/4) Exclude cut with ID _41314_210_12651_1_1533459476543_3766460_299-187801-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 17:37:23,024 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_34886_210_10633_1_1533203882_1755661_92-345288-0_sp1.1 from training. Duration: 0.97275 2023-10-04 17:37:23,188 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.0.bypass.scale_min, batch_count=1737426.6666666667, ans=0.2 2023-10-04 17:37:25,878 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_5438_210_1794_1_1527336687_2600009_104-288274-0_sp1.1 from training. Duration: 0.52725 2023-10-04 17:37:25,882 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_154-316450-0 from training. Duration: 0.76 2023-10-04 17:37:27,336 WARNING [train.py:1204] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_23-189234-0_sp0.9 from training. Duration: 0.92225 2023-10-04 17:37:28,831 WARNING [train.py:1204] (2/4) Exclude cut with ID _4779_210_2084_1_1527318022944_3914893_346-168820-0_sp1.1 from training. Duration: 0.963625 2023-10-04 17:37:30,640 WARNING [train.py:1204] (2/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_5-55893-0 from training. Duration: 0.55 2023-10-04 17:37:30,729 WARNING [train.py:1204] (2/4) Exclude cut with ID _20455_210_5219_1_1531565996775_8098100_180-24517-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 17:37:36,740 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_3133_210_2087_1_1526106446_2324690_243-143119-0_sp0.9 from training. Duration: 0.8555625 2023-10-04 17:37:38,170 WARNING [train.py:1204] (2/4) Exclude cut with ID _65585_210_12210_1_1535450451584_3764098_293-212045-0_sp1.1 from training. Duration: 0.92725 2023-10-04 17:37:38,175 WARNING [train.py:1204] (2/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_577-332600-0 from training. Duration: 0.57 2023-10-04 17:37:41,080 WARNING [train.py:1204] (2/4) Exclude cut with ID _30039_210_13270_1_1532484612900_2725150_104-289874-0_sp1.1 from training. Duration: 0.963625 2023-10-04 17:37:41,087 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_3172_210_2087_1_1526292929_3987865_197-97762-0_sp1.1 from training. Duration: 0.6818125 2023-10-04 17:37:44,427 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_256-150908-0_sp1.1 from training. Duration: 0.963625 2023-10-04 17:37:45,720 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_296-287678-0_sp1.1 from training. Duration: 0.863625 2023-10-04 17:37:45,752 WARNING [train.py:1204] (2/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_443-30034-0 from training. Duration: 0.64 2023-10-04 17:37:45,779 WARNING [train.py:1204] (2/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_494-199538-0_sp0.9 from training. Duration: 0.8333125 2023-10-04 17:37:47,173 WARNING [train.py:1204] (2/4) Exclude cut with ID _19708_210_9206_1_1532136650733_4238364_61-162325-0_sp1.1 from training. Duration: 0.97275 2023-10-04 17:37:49,132 WARNING [train.py:1204] (2/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_475-127455-0 from training. Duration: 0.7 2023-10-04 17:37:50,484 WARNING [train.py:1204] (2/4) Exclude cut with ID _48677_210_5663_1_1533902405919_3892989_188-11581-0_sp1.1 from training. Duration: 0.963625 2023-10-04 17:37:50,515 WARNING [train.py:1204] (2/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_136-8381-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 17:37:51,961 WARNING [train.py:1204] (2/4) Exclude cut with ID _55328_210_27767_1_1534580973260_4088170_270-229555-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 17:37:51,995 WARNING [train.py:1204] (2/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_372-266951-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 17:37:53,355 WARNING [train.py:1204] (2/4) Exclude cut with ID _47790_210_16187_1_1533862814066_4207519_14-242149-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 17:37:57,823 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.conv_module1.balancer1.prob, batch_count=1737626.6666666667, ans=0.125 2023-10-04 17:37:58,852 INFO [train.py:1046] (2/4) Epoch 50, batch 350, loss[loss=0.1422, simple_loss=0.2284, pruned_loss=0.02797, over 24644.00 frames. ], tot_loss[loss=0.1504, simple_loss=0.2316, pruned_loss=0.03463, over 3900679.83 frames. ], batch size: 68, lr: 2.04e-03, grad_scale: 8.0 2023-10-04 17:37:58,918 WARNING [train.py:1204] (2/4) Exclude cut with ID _7877_210_6014_1_1528286358099_3750136_159-76512-0_sp1.1 from training. Duration: 0.936375 2023-10-04 17:37:58,921 WARNING [train.py:1204] (2/4) Exclude cut with ID _8263_210_1664_1_1528438953494_294100_40-204731-0_sp1.1 from training. Duration: 0.4545625 2023-10-04 17:38:00,857 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8278_210_2550_1_1528637099_4220413_694-238413-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 17:38:01,487 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.3.encoder.layers.0.conv_module2.whiten, num_groups=1, num_channels=512, metric=6.75 vs. limit=15.0 2023-10-04 17:38:06,317 WARNING [train.py:1204] (2/4) Exclude cut with ID _47278_210_12041_1_1533902418349_3576110_421-172710-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 17:38:06,573 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.nonlin_attention.balancer.prob, batch_count=1737626.6666666667, ans=0.125 2023-10-04 17:38:09,576 WARNING [train.py:1204] (2/4) Exclude cut with ID _14970_210_15709_1_1530773543493_1715730_215-282680-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 17:38:10,422 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.1.encoder.layers.0.self_attn2.whiten, num_groups=1, num_channels=256, metric=12.40 vs. limit=22.5 2023-10-04 17:38:10,862 WARNING [train.py:1204] (2/4) Exclude cut with ID _64639_210_5816_1_1535367619437_3528000_306-149449-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 17:38:12,859 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_28-291982-0 from training. Duration: 0.97 2023-10-04 17:38:14,260 WARNING [train.py:1204] (2/4) Exclude cut with ID _55134_210_21982_1_1534590028377_3882170_412-15870-0_sp1.1 from training. Duration: 0.936375 2023-10-04 17:38:14,301 WARNING [train.py:1204] (2/4) Exclude cut with ID _13684_210_7497_1_1530619354863_3598359_312-235606-0 from training. Duration: 0.6 2023-10-04 17:38:15,851 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.balancer2.prob, batch_count=1737693.3333333333, ans=0.125 2023-10-04 17:38:18,295 WARNING [train.py:1204] (2/4) Exclude cut with ID _37112_210_2575_1_1532997007673_7268610_347-238993-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 17:38:18,341 WARNING [train.py:1204] (2/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_121-284152-0 from training. Duration: 0.59 2023-10-04 17:38:19,611 WARNING [train.py:1204] (2/4) Exclude cut with ID _2941_210_2577_1_1526199673150_2489759_228-2961-0_sp1.1 from training. Duration: 0.97275 2023-10-04 17:38:22,918 WARNING [train.py:1204] (2/4) Exclude cut with ID _28323_210_16187_1_1532480395667_4100300_531-24157-0 from training. Duration: 0.98 2023-10-04 17:38:23,045 WARNING [train.py:1204] (2/4) Exclude cut with ID _32704_210_8954_1_1533017488840_6747370_266-329755-0_sp0.9 from training. Duration: 0.988875 2023-10-04 17:38:24,474 WARNING [train.py:1204] (2/4) Exclude cut with ID _6653_210_2629_1_1527919172441_6732679_1038-146421-0_sp1.1 from training. Duration: 0.97275 2023-10-04 17:38:25,785 WARNING [train.py:1204] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_391-22148-0_sp0.9 from training. Duration: 0.8555625 2023-10-04 17:38:27,123 WARNING [train.py:1204] (2/4) Exclude cut with ID _64521_210_14699_1_1535243427259_3815328_543-341117-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 17:38:28,375 WARNING [train.py:1204] (2/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_52-168712-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 17:38:28,425 WARNING [train.py:1204] (2/4) Exclude cut with ID _33920_210_4220_1_1533257851500_3084399_198-334063-0_sp1.1 from training. Duration: 0.8545625 2023-10-04 17:38:28,437 WARNING [train.py:1204] (2/4) Exclude cut with ID _50093_210_4220_1_1534417141207_3432299_69-111987-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 17:38:30,246 WARNING [train.py:1204] (2/4) Exclude cut with ID _14821_210_8750_1_1530680503991_6829359_189-297451-0_sp0.9 from training. Duration: 0.611125 2023-10-04 17:38:32,946 WARNING [train.py:1204] (2/4) Exclude cut with ID _8030_210_7497_1_1528444886781_3531089_262-86750-0_sp0.9 from training. Duration: 0.9 2023-10-04 17:38:32,952 WARNING [train.py:1204] (2/4) Exclude cut with ID _24612_210_8913_1_1531998054193_3806359_294-191196-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 17:38:36,311 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.conv_module1.whiten.whitening_limit, batch_count=1737760.0, ans=15.0 2023-10-04 17:38:39,852 WARNING [train.py:1204] (2/4) Exclude cut with ID _16909_210_5724_1_1531292079912_4118919_707-342896-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 17:38:39,867 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_9460_210_4211_1_1528954587_10049667_572-37035-0_sp0.9 from training. Duration: 0.888875 2023-10-04 17:38:41,748 WARNING [train.py:1204] (2/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_762-109886-0_sp1.1 from training. Duration: 0.82725 2023-10-04 17:38:41,770 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_14953_210_2151_1_1530701710_5616165_519-326393-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 17:38:43,454 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=1737826.6666666667, ans=0.1 2023-10-04 17:38:47,812 WARNING [train.py:1204] (2/4) Exclude cut with ID _25914_210_6400_1_1532141317058_644740_52-96808-0 from training. Duration: 0.91 2023-10-04 17:38:47,816 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_68095_210_20185_1_1535718226_8888855_1079-94525-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 17:38:51,978 WARNING [train.py:1204] (2/4) Exclude cut with ID _14860_210_4915_1_1530702020340_7343240_746-3647-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 17:38:51,979 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_70379_210_7925_1_1535941455_557455_49-101720-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 17:38:51,991 WARNING [train.py:1204] (2/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_8-307291-0_sp1.1 from training. Duration: 0.936375 2023-10-04 17:38:53,326 WARNING [train.py:1204] (2/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_117-290916-0 from training. Duration: 0.51 2023-10-04 17:38:55,313 WARNING [train.py:1204] (2/4) Exclude cut with ID _22020_210_2461_1_1531724164513_6326900_944-339840-0_sp1.1 from training. Duration: 0.963625 2023-10-04 17:38:56,731 WARNING [train.py:1204] (2/4) Exclude cut with ID IC0515W0043-254911 from training. Duration: 0.976 2023-10-04 17:38:59,396 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_3910_210_1794_1_1526742306_334530_28-238909-0 from training. Duration: 0.81 2023-10-04 17:38:59,412 WARNING [train.py:1204] (2/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_269-267275-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 17:39:02,672 WARNING [train.py:1204] (2/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_72-251255-0_sp1.1 from training. Duration: 0.8545625 2023-10-04 17:39:02,692 WARNING [train.py:1204] (2/4) Exclude cut with ID _14886_210_4220_1_1530702020355_3516190_175-229073-0 from training. Duration: 0.45 2023-10-04 17:39:05,457 WARNING [train.py:1204] (2/4) Exclude cut with ID _40519_210_13794_1_1533715228655_7337390_921-209452-0_sp1.1 from training. Duration: 0.963625 2023-10-04 17:39:06,896 WARNING [train.py:1204] (2/4) Exclude cut with ID _29857_210_20034_1_1532395886753_3798160_335-163562-0_sp0.9 from training. Duration: 0.9444375 2023-10-04 17:39:07,126 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.conv_module2.balancer2.prob, batch_count=1737893.3333333333, ans=0.125 2023-10-04 17:39:08,260 WARNING [train.py:1204] (2/4) Exclude cut with ID _58583_210_5236_1_1534726835266_3574249_384-58445-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 17:39:09,753 WARNING [train.py:1204] (2/4) Exclude cut with ID _44011_210_2046_1_1533722401691_3631310_96-315656-0_sp1.1 from training. Duration: 0.963625 2023-10-04 17:39:09,757 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_68484_210_1794_1_1535849804_7678097_549-170613-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 17:39:13,039 INFO [train.py:1046] (2/4) Epoch 50, batch 400, loss[loss=0.1671, simple_loss=0.2495, pruned_loss=0.04235, over 23989.00 frames. ], tot_loss[loss=0.1506, simple_loss=0.2315, pruned_loss=0.03484, over 4086344.81 frames. ], batch size: 80, lr: 2.04e-03, grad_scale: 16.0 2023-10-04 17:39:13,109 WARNING [train.py:1204] (2/4) Exclude cut with ID _37892_210_19596_1_1533198677897_3964058_214-148058-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 17:39:15,875 WARNING [train.py:1204] (2/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_415-78792-0_sp0.9 from training. Duration: 0.9 2023-10-04 17:39:17,322 WARNING [train.py:1204] (2/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_383-205833-0_sp1.1 from training. Duration: 0.863625 2023-10-04 17:39:19,176 WARNING [train.py:1204] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_167-137255-0 from training. Duration: 0.64 2023-10-04 17:39:19,184 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8275_210_4198_1_1528455320_3892571_484-154786-0_sp1.1 from training. Duration: 0.963625 2023-10-04 17:39:19,239 WARNING [train.py:1204] (2/4) Exclude cut with ID _40284_210_2084_1_1533513679814_3704279_396-319412-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 17:39:20,630 WARNING [train.py:1204] (2/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_280-120942-0_sp0.9 from training. Duration: 0.8555625 2023-10-04 17:39:20,672 WARNING [train.py:1204] (2/4) Exclude cut with ID _45630_210_10556_1_1533949174498_3902049_558-240889-0_sp1.1 from training. Duration: 0.92725 2023-10-04 17:39:20,840 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.0.feed_forward2.hidden_balancer.prob, batch_count=1737960.0, ans=0.125 2023-10-04 17:39:23,325 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.850e+02 2.204e+02 2.503e+02 2.955e+02 6.313e+02, threshold=5.007e+02, percent-clipped=5.0 2023-10-04 17:39:23,427 WARNING [train.py:1204] (2/4) Exclude cut with ID _56235_210_10640_1_1534557335235_4161099_197-307458-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 17:39:24,924 WARNING [train.py:1204] (2/4) Exclude cut with ID _16909_210_5724_1_1531292079912_4118919_163-254843-0_sp1.1 from training. Duration: 0.92725 2023-10-04 17:39:27,046 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.5.encoder.layers.1.conv_module2.whiten, num_groups=1, num_channels=256, metric=6.07 vs. limit=15.0 2023-10-04 17:39:27,999 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_103-84010-0 from training. Duration: 0.69 2023-10-04 17:39:29,482 WARNING [train.py:1204] (2/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_401-158239-0 from training. Duration: 0.71 2023-10-04 17:39:29,483 WARNING [train.py:1204] (2/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_899-233658-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 17:39:31,458 WARNING [train.py:1204] (2/4) Exclude cut with ID _23825_210_13449_1_1531915313526_3497150_240-271367-0 from training. Duration: 0.97 2023-10-04 17:39:32,786 WARNING [train.py:1204] (2/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_424-134619-0_sp1.1 from training. Duration: 0.92725 2023-10-04 17:39:35,676 WARNING [train.py:1204] (2/4) Exclude cut with ID _44831_210_13427_1_1533816011549_3689035_32-188147-0_sp1.1 from training. Duration: 0.87275 2023-10-04 17:39:35,687 WARNING [train.py:1204] (2/4) Exclude cut with ID _59170_210_6721_1_1534983633960_4203335_314-63879-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 17:39:35,701 WARNING [train.py:1204] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_602-275260-0 from training. Duration: 0.82 2023-10-04 17:39:37,032 WARNING [train.py:1204] (2/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_511-306767-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 17:39:37,075 WARNING [train.py:1204] (2/4) Exclude cut with ID _41817_210_18835_1_1533434477160_7423049_565-201775-0_sp1.1 from training. Duration: 0.92725 2023-10-04 17:39:37,085 WARNING [train.py:1204] (2/4) Exclude cut with ID _28317_210_12742_1_1532480302381_8510325_778-173151-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 17:39:37,120 WARNING [train.py:1204] (2/4) Exclude cut with ID _14998_210_5219_1_1530705582080_7474970_680-51001-0_sp1.1 from training. Duration: 0.963625 2023-10-04 17:39:39,942 WARNING [train.py:1204] (2/4) Exclude cut with ID IC0677W0059-334891 from training. Duration: 0.896 2023-10-04 17:39:39,998 WARNING [train.py:1204] (2/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_370-331600-0 from training. Duration: 0.94 2023-10-04 17:39:44,747 WARNING [train.py:1204] (2/4) Exclude cut with ID _66840_210_5219_1_1535452168426_4196349_446-240192-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 17:39:46,220 WARNING [train.py:1204] (2/4) Exclude cut with ID _65242_210_18628_1_1535369393717_3823149_38-6732-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 17:39:47,554 WARNING [train.py:1204] (2/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_302-2550-0 from training. Duration: 0.81 2023-10-04 17:39:48,966 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_43802_210_2290_1_1533516203_8829504_100-348441-0 from training. Duration: 0.91 2023-10-04 17:39:50,889 WARNING [train.py:1204] (2/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_329-285801-0_sp1.1 from training. Duration: 0.82725 2023-10-04 17:39:52,352 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_30167_210_3528_1_1532428012_4772375_343-334712-0_sp1.1 from training. Duration: 0.936375 2023-10-04 17:39:59,591 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7543_210_6753_1_1528543318_4552_1-85072-0 from training. Duration: 0.83 2023-10-04 17:39:59,836 INFO [scaling.py:1118] (2/4) WithLoss: name=encoder.encoders.3.encoder.layers.0.self_attn_weights, loss-sum=0.000e+00 2023-10-04 17:40:02,768 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7002_210_1794_1_1527935634_4034444_487-236501-0_sp1.1 from training. Duration: 0.6 2023-10-04 17:40:04,164 WARNING [train.py:1204] (2/4) Exclude cut with ID _7160_210_1876_1_1527940923231_3703799_282-322611-0 from training. Duration: 0.87 2023-10-04 17:40:06,886 WARNING [train.py:1204] (2/4) Exclude cut with ID _67335_210_5219_1_1535525975652_4275229_570-262983-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 17:40:08,342 WARNING [train.py:1204] (2/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_625-338546-0_sp1.1 from training. Duration: 0.8 2023-10-04 17:40:08,371 WARNING [train.py:1204] (2/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_176-56120-0 from training. Duration: 0.83 2023-10-04 17:40:11,302 WARNING [train.py:1204] (2/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_963-187109-0_sp1.1 from training. Duration: 0.9 2023-10-04 17:40:14,513 WARNING [train.py:1204] (2/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_121-284152-0_sp0.9 from training. Duration: 0.6555625 2023-10-04 17:40:15,826 WARNING [train.py:1204] (2/4) Exclude cut with ID _45021_210_9994_1_1533619751094_3330288_104-81711-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 17:40:18,580 WARNING [train.py:1204] (2/4) Exclude cut with ID _51382_210_23862_1_1534492801838_3650104_129-343914-0_sp1.1 from training. Duration: 0.936375 2023-10-04 17:40:18,615 WARNING [train.py:1204] (2/4) Exclude cut with ID _14970_210_15709_1_1530775547361_1966965_8-169924-0 from training. Duration: 0.92 2023-10-04 17:40:21,735 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_2295_210_2087_1_1525500993_2407863_234-83104-0_sp1.1 from training. Duration: 0.563625 2023-10-04 17:40:22,987 WARNING [train.py:1204] (2/4) Exclude cut with ID _14687_210_5854_1_1530703838259_3664889_115-239046-0 from training. Duration: 0.67 2023-10-04 17:40:24,401 WARNING [train.py:1204] (2/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_690-126306-0_sp1.1 from training. Duration: 0.7090625 2023-10-04 17:40:24,415 WARNING [train.py:1204] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_625-102955-0_sp0.9 from training. Duration: 0.8555625 2023-10-04 17:40:25,874 WARNING [train.py:1204] (2/4) Exclude cut with ID _9389_210_1664_1_1529217337301_5289269_289-213417-0 from training. Duration: 0.89 2023-10-04 17:40:27,278 INFO [train.py:1046] (2/4) Epoch 50, batch 450, loss[loss=0.1521, simple_loss=0.2279, pruned_loss=0.03817, over 16893.00 frames. ], tot_loss[loss=0.1508, simple_loss=0.2318, pruned_loss=0.03493, over 4217076.95 frames. ], batch size: 36, lr: 2.04e-03, grad_scale: 16.0 2023-10-04 17:40:27,402 WARNING [train.py:1204] (2/4) Exclude cut with ID _13253_210_1925_1_1530757775819_3577873_191-144977-0_sp0.9 from training. Duration: 0.8666875 2023-10-04 17:40:27,460 WARNING [train.py:1204] (2/4) Exclude cut with ID _21783_210_11357_1_1531742458730_3762679_325-155703-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 17:40:27,495 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_2295_210_2087_1_1525500993_2407863_116-238894-0_sp0.9 from training. Duration: 0.77775 2023-10-04 17:40:28,931 WARNING [train.py:1204] (2/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_94-126128-0 from training. Duration: 0.86 2023-10-04 17:40:30,246 WARNING [train.py:1204] (2/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_395-310220-0_sp0.9 from training. Duration: 0.988875 2023-10-04 17:40:30,326 WARNING [train.py:1204] (2/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_821-110852-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 17:40:30,377 WARNING [train.py:1204] (2/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_64-248359-0_sp1.1 from training. Duration: 0.62725 2023-10-04 17:40:31,195 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.feed_forward3.hidden_balancer.prob, batch_count=1738293.3333333333, ans=0.125 2023-10-04 17:40:32,121 WARNING [train.py:1204] (2/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_138-187031-0 from training. Duration: 0.99 2023-10-04 17:40:32,172 WARNING [train.py:1204] (2/4) Exclude cut with ID _4122_210_2084_1_1526713164417_4353280_528-206771-0_sp0.9 from training. Duration: 0.97775 2023-10-04 17:40:34,087 WARNING [train.py:1204] (2/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_844-96989-0_sp1.1 from training. Duration: 0.6454375 2023-10-04 17:40:36,681 WARNING [train.py:1204] (2/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_106-30782-0_sp1.1 from training. Duration: 0.72725 2023-10-04 17:40:46,878 WARNING [train.py:1204] (2/4) Exclude cut with ID _14683_210_5219_1_1530698352608_3974859_281-250040-0_sp1.1 from training. Duration: 0.936375 2023-10-04 17:40:46,910 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_46013_210_7234_1_1533776362_3744032_463-52378-0_sp0.9 from training. Duration: 0.9555625 2023-10-04 17:40:49,730 WARNING [train.py:1204] (2/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_299-34413-0 from training. Duration: 0.65 2023-10-04 17:40:51,120 WARNING [train.py:1204] (2/4) Exclude cut with ID _35799_210_5854_1_1533263487887_3529071_490-326847-0 from training. Duration: 0.99 2023-10-04 17:40:53,155 WARNING [train.py:1204] (2/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_589-9070-0_sp0.9 from training. Duration: 0.788875 2023-10-04 17:40:53,425 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.attention_skip_rate, batch_count=1738360.0, ans=0.0 2023-10-04 17:40:55,813 WARNING [train.py:1204] (2/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_884-29515-0_sp1.1 from training. Duration: 0.936375 2023-10-04 17:40:57,253 WARNING [train.py:1204] (2/4) Exclude cut with ID _15012_210_1968_1_1530856820429_5095350_474-119995-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 17:41:02,558 WARNING [train.py:1204] (2/4) Exclude cut with ID _65585_210_12210_1_1535450451584_3764098_211-97124-0_sp1.1 from training. Duration: 0.963625 2023-10-04 17:41:03,220 WARNING [train.py:1204] (2/4) Exclude cut with ID _33640_210_2655_1_1533117605479_4855469_666-224463-0_sp1.1 from training. Duration: 0.963625 2023-10-04 17:41:03,450 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.ff3_skip_rate, batch_count=1738426.6666666667, ans=0.0 2023-10-04 17:41:05,787 WARNING [train.py:1204] (2/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_35-153886-0 from training. Duration: 0.84 2023-10-04 17:41:05,850 WARNING [train.py:1204] (2/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_210-106047-0 from training. Duration: 0.97 2023-10-04 17:41:07,336 WARNING [train.py:1204] (2/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_479-125611-0 from training. Duration: 0.96 2023-10-04 17:41:07,359 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_9417_210_1794_1_1529235261_4614131_427-153963-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 17:41:08,819 WARNING [train.py:1204] (2/4) Exclude cut with ID _23818_210_6228_1_1531917063884_3676920_133-170571-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 17:41:08,892 WARNING [train.py:1204] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_602-275260-0_sp1.1 from training. Duration: 0.7454375 2023-10-04 17:41:11,651 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9029W0121-509521 from training. Duration: 0.896 2023-10-04 17:41:11,660 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9029W0434-509821 from training. Duration: 0.64 2023-10-04 17:41:11,686 WARNING [train.py:1204] (2/4) Exclude cut with ID _23038_210_9570_1_1532001584158_3652419_289-307034-0_sp1.1 from training. Duration: 0.936375 2023-10-04 17:41:13,060 WARNING [train.py:1204] (2/4) Exclude cut with ID _29345_210_4381_1_1532689168263_3860169_149-257268-0_sp0.9 from training. Duration: 0.9 2023-10-04 17:41:14,424 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_405-339986-0_sp1.1 from training. Duration: 0.463625 2023-10-04 17:41:17,740 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_2655_210_2087_1_1525688959_3897400_437-337690-0_sp1.1 from training. Duration: 0.57275 2023-10-04 17:41:17,774 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7543_210_6753_1_1528541907_1399322_165-323619-0_sp1.1 from training. Duration: 0.62725 2023-10-04 17:41:19,211 WARNING [train.py:1204] (2/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_508-335369-0_sp1.1 from training. Duration: 0.436375 2023-10-04 17:41:19,275 WARNING [train.py:1204] (2/4) Exclude cut with ID _7852_210_5219_1_1528286471316_8139400_358-287492-0 from training. Duration: 0.96 2023-10-04 17:41:22,504 WARNING [train.py:1204] (2/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_322-217220-0_sp0.9 from training. Duration: 0.9555625 2023-10-04 17:41:23,958 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_3171_210_2087_1_1526109084_2330007_66-260583-0_sp0.9 from training. Duration: 0.8 2023-10-04 17:41:25,275 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8558_210_1410_1_1528505225_7613406_131-329524-0_sp1.1 from training. Duration: 0.7181875 2023-10-04 17:41:26,685 WARNING [train.py:1204] (2/4) Exclude cut with ID _9235_210_5219_1_1528891154253_8046120_30-326681-0 from training. Duration: 0.83 2023-10-04 17:41:30,841 WARNING [train.py:1204] (2/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_58-68956-0_sp0.9 from training. Duration: 0.92225 2023-10-04 17:41:30,908 WARNING [train.py:1204] (2/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_255-312654-0 from training. Duration: 0.91 2023-10-04 17:41:30,982 WARNING [train.py:1204] (2/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_99-167392-0 from training. Duration: 0.61 2023-10-04 17:41:32,356 WARNING [train.py:1204] (2/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_94-126128-0_sp0.9 from training. Duration: 0.9555625 2023-10-04 17:41:37,386 WARNING [train.py:1204] (2/4) Exclude cut with ID _68635_210_9317_1_1535770813410_4315870_187-192817-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 17:41:38,823 WARNING [train.py:1204] (2/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_277-204970-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 17:41:40,200 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6163_210_6400_1_1527506746_4263068_283-137811-0_sp0.9 from training. Duration: 0.8555625 2023-10-04 17:41:40,228 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9144W0121-566446 from training. Duration: 0.896 2023-10-04 17:41:41,552 INFO [train.py:1046] (2/4) Epoch 50, batch 500, loss[loss=0.1582, simple_loss=0.2449, pruned_loss=0.03571, over 24658.00 frames. ], tot_loss[loss=0.1519, simple_loss=0.2328, pruned_loss=0.03549, over 4329667.73 frames. ], batch size: 68, lr: 2.04e-03, grad_scale: 16.0 2023-10-04 17:41:44,410 WARNING [train.py:1204] (2/4) Exclude cut with ID _49371_210_12071_1_1533952761021_3812190_351-315519-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 17:41:45,733 WARNING [train.py:1204] (2/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_115-326778-0_sp1.1 from training. Duration: 0.8454375 2023-10-04 17:41:45,786 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_17292_210_6753_1_1531125388_2943734_436-198965-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 17:41:45,795 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9165W0117-576751 from training. Duration: 0.896 2023-10-04 17:41:47,075 WARNING [train.py:1204] (2/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_212-149947-0 from training. Duration: 0.73 2023-10-04 17:41:47,083 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_13757_210_3528_1_1530440682_4107239_65-167544-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 17:41:47,317 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.conv_module2.balancer2.prob, batch_count=1738626.6666666667, ans=0.125 2023-10-04 17:41:50,708 WARNING [train.py:1204] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_700-183549-0_sp0.9 from training. Duration: 0.7555625 2023-10-04 17:41:52,337 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.792e+02 2.023e+02 2.217e+02 2.721e+02 3.663e+02, threshold=4.434e+02, percent-clipped=0.0 2023-10-04 17:41:56,533 WARNING [train.py:1204] (2/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_216-343007-0_sp0.9 from training. Duration: 0.7333125 2023-10-04 17:41:57,949 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7544_210_6753_1_1528587722_3576686_295-258479-0_sp1.1 from training. Duration: 0.763625 2023-10-04 17:41:59,398 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_365-287106-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 17:41:59,417 WARNING [train.py:1204] (2/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_300-3747-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 17:42:00,758 WARNING [train.py:1204] (2/4) Exclude cut with ID _14069_210_5632_1_1530512692033_4803390_487-303201-0_sp1.1 from training. Duration: 0.92725 2023-10-04 17:42:05,235 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.attention_skip_rate, batch_count=1738693.3333333333, ans=0.0 2023-10-04 17:42:10,687 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.conv_module2.balancer2.prob, batch_count=1738760.0, ans=0.125 2023-10-04 17:42:11,906 WARNING [train.py:1204] (2/4) Exclude cut with ID _14459_210_2228_1_1531443641946_7516700_668-123026-0_sp1.1 from training. Duration: 0.936375 2023-10-04 17:42:11,936 WARNING [train.py:1204] (2/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_108-53929-0_sp1.1 from training. Duration: 0.536375 2023-10-04 17:42:13,228 WARNING [train.py:1204] (2/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_266-137104-0_sp0.9 from training. Duration: 0.72225 2023-10-04 17:42:13,255 WARNING [train.py:1204] (2/4) Exclude cut with ID _12302_210_5219_1_1529924446466_8107280_902-168619-0_sp1.1 from training. Duration: 0.936375 2023-10-04 17:42:13,286 WARNING [train.py:1204] (2/4) Exclude cut with ID _59502_210_12210_1_1535107024698_1835139_146-162495-0 from training. Duration: 0.92 2023-10-04 17:42:13,316 WARNING [train.py:1204] (2/4) Exclude cut with ID _3465_210_3525_1_1526175667060_3574049_412-287526-0_sp1.1 from training. Duration: 0.6545625 2023-10-04 17:42:17,532 WARNING [train.py:1204] (2/4) Exclude cut with ID _7160_210_1876_1_1527940923231_3703799_120-150943-0_sp0.9 from training. Duration: 0.9 2023-10-04 17:42:17,594 WARNING [train.py:1204] (2/4) Exclude cut with ID _15037_210_2253_1_1530752294104_3267650_296-192597-0_sp1.1 from training. Duration: 0.736375 2023-10-04 17:42:17,619 WARNING [train.py:1204] (2/4) Exclude cut with ID _13252_210_1925_1_1530671372047_3336909_397-216497-0_sp1.1 from training. Duration: 0.87275 2023-10-04 17:42:18,891 WARNING [train.py:1204] (2/4) Exclude cut with ID _40094_210_16380_1_1533294005773_3641159_76-269320-0_sp1.1 from training. Duration: 0.936375 2023-10-04 17:42:20,244 WARNING [train.py:1204] (2/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_449-15508-0 from training. Duration: 0.82 2023-10-04 17:42:23,446 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9309W0194-640321 from training. Duration: 0.64 2023-10-04 17:42:26,931 WARNING [train.py:1204] (2/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_284-312178-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 17:42:27,041 WARNING [train.py:1204] (2/4) Exclude cut with ID _29853_210_7497_1_1532505600851_6642110_401-55937-0_sp1.1 from training. Duration: 0.92725 2023-10-04 17:42:28,273 WARNING [train.py:1204] (2/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_716-24918-0_sp1.1 from training. Duration: 0.92725 2023-10-04 17:42:28,295 WARNING [train.py:1204] (2/4) Exclude cut with ID _19637_210_4220_1_1531537167676_3205060_314-305638-0_sp1.1 from training. Duration: 0.92725 2023-10-04 17:42:29,620 WARNING [train.py:1204] (2/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_475-127455-0_sp1.1 from training. Duration: 0.636375 2023-10-04 17:42:31,046 WARNING [train.py:1204] (2/4) Exclude cut with ID _5444_210_2151_1_1527418808464_5312210_581-315411-0 from training. Duration: 0.62 2023-10-04 17:42:35,244 WARNING [train.py:1204] (2/4) Exclude cut with ID _44902_210_16187_1_1533888006062_3792078_149-67212-0_sp0.9 from training. Duration: 0.9666875 2023-10-04 17:42:36,618 WARNING [train.py:1204] (2/4) Exclude cut with ID _31602_210_5854_1_1532677021456_4458250_63-63253-0_sp1.1 from training. Duration: 0.963625 2023-10-04 17:42:39,479 WARNING [train.py:1204] (2/4) Exclude cut with ID _55209_210_1531_1_1534675828996_4597160_660-203992-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 17:42:43,261 WARNING [train.py:1204] (2/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_83-190624-0_sp1.1 from training. Duration: 0.92725 2023-10-04 17:42:48,743 WARNING [train.py:1204] (2/4) Exclude cut with ID _58605_210_12210_1_1534723195939_3904259_154-139635-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 17:42:51,560 WARNING [train.py:1204] (2/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_164-224052-0 from training. Duration: 0.7 2023-10-04 17:42:52,792 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_1126-189071-0_sp1.1 from training. Duration: 0.963625 2023-10-04 17:42:52,804 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_15396_210_6890_1_1531308473_1938936_310-214789-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 17:42:54,861 WARNING [train.py:1204] (2/4) Exclude cut with ID _8395_210_1531_1_1528597871523_4411579_431-132470-0 from training. Duration: 0.74 2023-10-04 17:42:56,133 INFO [train.py:1046] (2/4) Epoch 50, batch 550, loss[loss=0.1472, simple_loss=0.2371, pruned_loss=0.02864, over 24632.00 frames. ], tot_loss[loss=0.1532, simple_loss=0.2343, pruned_loss=0.03604, over 4419184.20 frames. ], batch size: 65, lr: 2.04e-03, grad_scale: 16.0 2023-10-04 17:42:56,208 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_3780_210_2087_1_1526467760_2130483_170-259044-0_sp0.9 from training. Duration: 0.77775 2023-10-04 17:42:57,566 WARNING [train.py:1204] (2/4) Exclude cut with ID _12733_210_8341_1_1530167424556_3646380_537-230640-0_sp1.1 from training. Duration: 0.963625 2023-10-04 17:42:58,533 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.0.layers.0.self_attn2.whiten, num_groups=1, num_channels=192, metric=13.87 vs. limit=22.5 2023-10-04 17:43:02,196 WARNING [train.py:1204] (2/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_159-281310-0 from training. Duration: 0.95 2023-10-04 17:43:03,831 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.conv_module2.balancer1.prob, batch_count=1738960.0, ans=0.125 2023-10-04 17:43:04,945 WARNING [train.py:1204] (2/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_434-83000-0 from training. Duration: 0.98 2023-10-04 17:43:04,967 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_4405_210_1849_1_1526732205_3593939_493-32464-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 17:43:04,976 WARNING [train.py:1204] (2/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_342-100157-0 from training. Duration: 0.54 2023-10-04 17:43:05,032 WARNING [train.py:1204] (2/4) Exclude cut with ID _44778_210_2054_1_1533798127203_7443090_765-250133-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 17:43:05,036 WARNING [train.py:1204] (2/4) Exclude cut with ID _38670_210_15379_1_1533448810659_4391059_81-312245-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 17:43:06,353 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_50050_210_16223_1_1535435796_7290138_1273-247611-0_sp1.1 from training. Duration: 0.936375 2023-10-04 17:43:07,595 WARNING [train.py:1204] (2/4) Exclude cut with ID _44831_210_13427_1_1533816011549_3689035_399-18591-0_sp1.1 from training. Duration: 0.936375 2023-10-04 17:43:07,615 WARNING [train.py:1204] (2/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_557-324258-0_sp0.9 from training. Duration: 0.9 2023-10-04 17:43:07,684 WARNING [train.py:1204] (2/4) Exclude cut with ID _3465_210_3525_1_1526175667060_3574049_62-335552-0_sp1.1 from training. Duration: 0.7909375 2023-10-04 17:43:10,927 WARNING [train.py:1204] (2/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_673-264125-0_sp1.1 from training. Duration: 0.963625 2023-10-04 17:43:12,272 WARNING [train.py:1204] (2/4) Exclude cut with ID _8030_210_7497_1_1528444886781_3531089_262-86750-0 from training. Duration: 0.81 2023-10-04 17:43:12,286 WARNING [train.py:1204] (2/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_444-28041-0_sp1.1 from training. Duration: 0.77275 2023-10-04 17:43:16,931 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_34008_210_10431_1_1533345424_8183247_516-14858-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 17:43:16,951 WARNING [train.py:1204] (2/4) Exclude cut with ID _32704_210_8954_1_1533017488840_6747370_54-248218-0_sp1.1 from training. Duration: 0.936375 2023-10-04 17:43:17,166 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.ff3_skip_rate, batch_count=1739026.6666666667, ans=0.0 2023-10-04 17:43:19,657 WARNING [train.py:1204] (2/4) Exclude cut with ID _13253_210_1925_1_1530757775819_3577873_300-119782-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 17:43:19,722 WARNING [train.py:1204] (2/4) Exclude cut with ID _39290_210_4220_1_1533862778760_3075118_21-240703-0_sp1.1 from training. Duration: 0.936375 2023-10-04 17:43:22,579 WARNING [train.py:1204] (2/4) Exclude cut with ID 774-127930-0014-10412-0_sp1.1 from training. Duration: 0.95 2023-10-04 17:43:23,396 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.1.encoder.layers.1.feed_forward1.out_whiten, num_groups=1, num_channels=256, metric=8.55 vs. limit=15.0 2023-10-04 17:43:24,367 WARNING [train.py:1204] (2/4) Exclude cut with ID _67499_210_17282_1_1535509805220_3692430_20-179346-0 from training. Duration: 0.97 2023-10-04 17:43:27,660 WARNING [train.py:1204] (2/4) Exclude cut with ID _8365_210_2064_1_1528592311070_4677690_431-294102-0_sp0.9 from training. Duration: 0.988875 2023-10-04 17:43:28,571 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.2.encoder.layers.2.feed_forward3.out_whiten, num_groups=1, num_channels=384, metric=9.81 vs. limit=15.0 2023-10-04 17:43:30,354 WARNING [train.py:1204] (2/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_365-1492-0_sp1.1 from training. Duration: 0.82725 2023-10-04 17:43:30,386 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_105-114797-0_sp0.9 from training. Duration: 0.9333125 2023-10-04 17:43:31,839 WARNING [train.py:1204] (2/4) Exclude cut with ID _6274_210_2084_1_1527937583837_7494020_769-168302-0_sp0.9 from training. Duration: 0.788875 2023-10-04 17:43:33,440 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.ff2_skip_rate, batch_count=1739093.3333333333, ans=0.0 2023-10-04 17:43:33,960 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.2.encoder.layers.2.self_attn_weights.whiten_keys, num_groups=4, num_channels=128, metric=2.75 vs. limit=6.0 2023-10-04 17:43:34,573 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_40340_210_9024_1_1533433916_4132944_320-270277-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 17:43:34,578 WARNING [train.py:1204] (2/4) Exclude cut with ID ID0203W0003-781711 from training. Duration: 0.958 2023-10-04 17:43:35,926 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_30434_210_13049_1_1532519384_981746_52-12932-0_sp1.1 from training. Duration: 0.936375 2023-10-04 17:43:37,971 WARNING [train.py:1204] (2/4) Exclude cut with ID _8263_210_1664_1_1528438953494_294100_40-204731-0_sp0.9 from training. Duration: 0.5555625 2023-10-04 17:43:41,991 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1032-236863-0_sp0.9 from training. Duration: 0.9333125 2023-10-04 17:43:42,040 WARNING [train.py:1204] (2/4) Exclude cut with ID _8394_210_6702_1_1528590504316_3540189_335-274780-0_sp0.9 from training. Duration: 0.8666875 2023-10-04 17:43:42,047 WARNING [train.py:1204] (2/4) Exclude cut with ID _66873_210_5517_1_1535454136506_4293370_654-52713-0_sp1.1 from training. Duration: 0.7 2023-10-04 17:43:43,399 WARNING [train.py:1204] (2/4) Exclude cut with ID _5250_210_5219_1_1527249561806_8692110_742-267856-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 17:43:44,786 WARNING [train.py:1204] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_74-48717-0 from training. Duration: 0.58 2023-10-04 17:43:46,640 WARNING [train.py:1204] (2/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_18-46756-0 from training. Duration: 0.79 2023-10-04 17:43:46,699 WARNING [train.py:1204] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_612-56061-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 17:43:46,708 WARNING [train.py:1204] (2/4) Exclude cut with ID _56423_210_5854_1_1534896061049_3165969_412-2403-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 17:43:48,073 WARNING [train.py:1204] (2/4) Exclude cut with ID _62574_210_2655_1_1535180407058_3933249_120-196848-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 17:43:48,073 WARNING [train.py:1204] (2/4) Exclude cut with ID _40519_210_13794_1_1533715228655_7337390_916-51000-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 17:43:49,571 WARNING [train.py:1204] (2/4) Exclude cut with ID _65263_210_22356_1_1535763598691_3921037_108-21902-0_sp1.1 from training. Duration: 0.9 2023-10-04 17:43:50,994 WARNING [train.py:1204] (2/4) Exclude cut with ID _4122_210_2084_1_1526713164417_4353280_528-206771-0_sp1.1 from training. Duration: 0.8 2023-10-04 17:43:53,030 WARNING [train.py:1204] (2/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_43-325657-0_sp1.1 from training. Duration: 0.92725 2023-10-04 17:43:53,086 WARNING [train.py:1204] (2/4) Exclude cut with ID _44659_210_2721_1_1534036120061_6614860_314-82041-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 17:43:53,151 WARNING [train.py:1204] (2/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_81-300212-0_sp0.9 from training. Duration: 0.6666875 2023-10-04 17:43:56,346 WARNING [train.py:1204] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_665-196155-0_sp0.9 from training. Duration: 0.7444375 2023-10-04 17:43:56,454 WARNING [train.py:1204] (2/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_140-336881-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 17:43:57,768 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6290_210_3273_1_1527595224_1282787_115-87095-0_sp0.9 from training. Duration: 0.611125 2023-10-04 17:43:57,827 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_53373_210_5399_1_1534229471_4715695_607-259685-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 17:44:00,416 WARNING [train.py:1204] (2/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_175-163894-0_sp1.1 from training. Duration: 0.67275 2023-10-04 17:44:00,442 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7002_210_1794_1_1527939683_5157286_1-281264-0_sp1.1 from training. Duration: 0.4 2023-10-04 17:44:04,771 WARNING [train.py:1204] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_605-257205-0 from training. Duration: 0.59 2023-10-04 17:44:09,279 WARNING [train.py:1204] (2/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_454-314901-0 from training. Duration: 0.46 2023-10-04 17:44:10,497 INFO [train.py:1046] (2/4) Epoch 50, batch 600, loss[loss=0.1431, simple_loss=0.227, pruned_loss=0.02964, over 23379.00 frames. ], tot_loss[loss=0.1536, simple_loss=0.2346, pruned_loss=0.03624, over 4482054.66 frames. ], batch size: 119, lr: 2.04e-03, grad_scale: 8.0 2023-10-04 17:44:10,593 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_46032_210_14656_1_1533781412_4507969_287-130422-0_sp1.1 from training. Duration: 0.97275 2023-10-04 17:44:10,610 WARNING [train.py:1204] (2/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_186-309161-0_sp1.1 from training. Duration: 0.4909375 2023-10-04 17:44:11,914 WARNING [train.py:1204] (2/4) Exclude cut with ID _42659_210_2070_1_1533607442765_3120380_45-342307-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 17:44:17,881 WARNING [train.py:1204] (2/4) Exclude cut with ID _12555_210_8750_1_1530075594402_3780239_478-326818-0_sp1.1 from training. Duration: 0.963625 2023-10-04 17:44:19,274 WARNING [train.py:1204] (2/4) Exclude cut with ID _7606_210_4915_1_1528894253277_5080690_127-177761-0_sp1.1 from training. Duration: 0.5818125 2023-10-04 17:44:19,388 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6788_210_1388_1_1527898698_4562021_91-197981-0 from training. Duration: 0.83 2023-10-04 17:44:22,520 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.822e+02 2.089e+02 2.281e+02 2.529e+02 5.225e+02, threshold=4.562e+02, percent-clipped=2.0 2023-10-04 17:44:22,633 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8558_210_1410_1_1528505225_7613406_131-329524-0_sp0.9 from training. Duration: 0.87775 2023-10-04 17:44:24,525 WARNING [train.py:1204] (2/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_153-153537-0_sp1.1 from training. Duration: 0.82725 2023-10-04 17:44:27,322 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_17292_210_6753_1_1531125388_2943734_285-349105-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 17:44:28,748 WARNING [train.py:1204] (2/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_37-41919-0 from training. Duration: 0.87 2023-10-04 17:44:28,771 WARNING [train.py:1204] (2/4) Exclude cut with ID _60860_210_8254_1_1534896015875_3557169_172-294852-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 17:44:35,760 WARNING [train.py:1204] (2/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_146-276944-0 from training. Duration: 0.83 2023-10-04 17:44:39,011 WARNING [train.py:1204] (2/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_1254-44082-0_sp1.1 from training. Duration: 0.936375 2023-10-04 17:44:39,023 WARNING [train.py:1204] (2/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_1115-180350-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 17:44:39,054 WARNING [train.py:1204] (2/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_60-146189-0_sp1.1 from training. Duration: 0.8909375 2023-10-04 17:44:41,900 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.feed_forward1.out_proj.dropout_p, batch_count=1739426.6666666667, ans=0.1 2023-10-04 17:44:44,600 WARNING [train.py:1204] (2/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_517-153649-0_sp1.1 from training. Duration: 0.92725 2023-10-04 17:44:44,619 WARNING [train.py:1204] (2/4) Exclude cut with ID _39571_210_13449_1_1533801584143_3651077_37-266257-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 17:44:46,436 WARNING [train.py:1204] (2/4) Exclude cut with ID _6653_210_2629_1_1527919172441_6732679_527-276118-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 17:44:53,882 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7696_210_2040_1_1528207167_3716099_117-125434-0_sp1.1 from training. Duration: 0.6909375 2023-10-04 17:44:58,518 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_25767_210_2161_1_1532307483_3867748_478-156360-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 17:44:59,786 WARNING [train.py:1204] (2/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_177-49092-0_sp1.1 from training. Duration: 0.82725 2023-10-04 17:44:59,793 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_11305_210_1794_1_1529750428_7178148_680-317073-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 17:45:01,381 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.1.bypass.skip_rate, batch_count=1739493.3333333333, ans=0.035 2023-10-04 17:45:06,691 WARNING [train.py:1204] (2/4) Exclude cut with ID _53565_210_10635_1_1534337218335_3820259_287-18120-0 from training. Duration: 0.97 2023-10-04 17:45:11,554 WARNING [train.py:1204] (2/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_498-40153-0_sp1.1 from training. Duration: 0.57275 2023-10-04 17:45:11,578 WARNING [train.py:1204] (2/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_555-63066-0_sp1.1 from training. Duration: 0.963625 2023-10-04 17:45:15,794 WARNING [train.py:1204] (2/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_395-310220-0 from training. Duration: 0.89 2023-10-04 17:45:15,876 WARNING [train.py:1204] (2/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_937-208281-0_sp1.1 from training. Duration: 0.77275 2023-10-04 17:45:17,712 WARNING [train.py:1204] (2/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_120-211630-0 from training. Duration: 0.92 2023-10-04 17:45:19,028 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_454-276429-0_sp1.1 from training. Duration: 0.7909375 2023-10-04 17:45:19,066 WARNING [train.py:1204] (2/4) Exclude cut with ID _22826_210_9994_1_1531908201522_3279169_404-75889-0_sp1.1 from training. Duration: 0.8090625 2023-10-04 17:45:20,698 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.feed_forward1.hidden_balancer.prob, batch_count=1739560.0, ans=0.125 2023-10-04 17:45:25,059 INFO [train.py:1046] (2/4) Epoch 50, batch 650, loss[loss=0.1524, simple_loss=0.2421, pruned_loss=0.03134, over 24401.00 frames. ], tot_loss[loss=0.1525, simple_loss=0.2333, pruned_loss=0.03585, over 4539428.31 frames. ], batch size: 77, lr: 2.04e-03, grad_scale: 8.0 2023-10-04 17:45:25,110 WARNING [train.py:1204] (2/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_282-136641-0_sp0.9 from training. Duration: 0.4666875 2023-10-04 17:45:26,567 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_809-17156-0_sp1.1 from training. Duration: 0.663625 2023-10-04 17:45:28,035 WARNING [train.py:1204] (2/4) Exclude cut with ID _9235_210_5219_1_1528891154253_8046120_1003-101638-0_sp0.9 from training. Duration: 0.811125 2023-10-04 17:45:29,431 WARNING [train.py:1204] (2/4) Exclude cut with ID _22826_210_9994_1_1531908201522_3279169_404-75889-0_sp0.9 from training. Duration: 0.988875 2023-10-04 17:45:30,866 WARNING [train.py:1204] (2/4) Exclude cut with ID _59288_210_5854_1_1535077757774_3412246_570-295255-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 17:45:34,981 WARNING [train.py:1204] (2/4) Exclude cut with ID _3465_210_3525_1_1526175667060_3574049_412-287526-0 from training. Duration: 0.72 2023-10-04 17:45:35,051 WARNING [train.py:1204] (2/4) Exclude cut with ID _44010_210_4525_1_1533884262964_4013620_649-79655-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 17:45:39,140 WARNING [train.py:1204] (2/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_57-270037-0_sp0.9 from training. Duration: 0.8555625 2023-10-04 17:45:39,140 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_34815_210_6388_1_1533381320_3571903_302-255963-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 17:45:42,440 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_47449_210_5399_1_1534076445_4006929_573-301852-0_sp1.1 from training. Duration: 0.97275 2023-10-04 17:45:47,307 WARNING [train.py:1204] (2/4) Exclude cut with ID 6709-74022-0004-86860-0_sp1.1 from training. Duration: 0.9409375 2023-10-04 17:45:48,730 WARNING [train.py:1204] (2/4) Exclude cut with ID _40519_210_13794_1_1533715228655_7337390_111-71991-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 17:45:50,069 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_30167_210_3528_1_1532428012_4772375_169-214186-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 17:45:53,454 WARNING [train.py:1204] (2/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_20-235313-0_sp1.1 from training. Duration: 0.963625 2023-10-04 17:45:53,506 WARNING [train.py:1204] (2/4) Exclude cut with ID _4681_210_3525_1_1527251040321_1235713_123-24428-0_sp0.9 from training. Duration: 0.5333125 2023-10-04 17:45:55,030 WARNING [train.py:1204] (2/4) Exclude cut with ID _31602_210_5854_1_1532677021456_4458250_444-198752-0_sp1.1 from training. Duration: 0.97275 2023-10-04 17:45:56,417 WARNING [train.py:1204] (2/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_9-279442-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 17:45:56,609 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.feed_forward3.hidden_balancer.prob, batch_count=1739760.0, ans=0.125 2023-10-04 17:45:57,654 WARNING [train.py:1204] (2/4) Exclude cut with ID _6275_210_2084_1_1528110062213_7669049_973-198085-0_sp0.9 from training. Duration: 0.8333125 2023-10-04 17:45:58,998 WARNING [train.py:1204] (2/4) Exclude cut with ID _29853_210_7497_1_1532505600851_6642110_567-250221-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 17:46:00,390 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6545_210_2040_1_1527774670_4245258_407-48070-0_sp1.1 from training. Duration: 0.6909375 2023-10-04 17:46:03,017 WARNING [train.py:1204] (2/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_224-343295-0_sp0.9 from training. Duration: 0.611125 2023-10-04 17:46:03,032 WARNING [train.py:1204] (2/4) Exclude cut with ID IC0125W0011-61817 from training. Duration: 0.88 2023-10-04 17:46:03,041 WARNING [train.py:1204] (2/4) Exclude cut with ID _60113_210_16271_1_1535164212481_4386079_435-154371-0_sp1.1 from training. Duration: 0.97275 2023-10-04 17:46:03,053 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_25836_210_16554_1_1532138167_53197_3-9394-0_sp1.1 from training. Duration: 0.92725 2023-10-04 17:46:04,598 WARNING [train.py:1204] (2/4) Exclude cut with ID _29343_210_4381_1_1532602757752_3674109_57-55125-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 17:46:05,914 WARNING [train.py:1204] (2/4) Exclude cut with ID _56235_210_10640_1_1534557335235_4161099_267-23560-0_sp1.1 from training. Duration: 0.936375 2023-10-04 17:46:07,217 WARNING [train.py:1204] (2/4) Exclude cut with ID _30196_210_1664_1_1533708496522_7074140_1150-278867-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 17:46:08,607 WARNING [train.py:1204] (2/4) Exclude cut with ID _6270_210_2084_1_1527850911573_3856420_26-277343-0_sp0.9 from training. Duration: 0.911125 2023-10-04 17:46:09,784 WARNING [train.py:1204] (2/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_325-62802-0 from training. Duration: 0.87 2023-10-04 17:46:09,859 WARNING [train.py:1204] (2/4) Exclude cut with ID _56524_210_9642_1_1534572011779_3619170_145-310842-0_sp1.1 from training. Duration: 0.9 2023-10-04 17:46:09,897 WARNING [train.py:1204] (2/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_860-96198-0_sp0.9 from training. Duration: 0.811125 2023-10-04 17:46:11,212 WARNING [train.py:1204] (2/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_17-25045-0_sp1.1 from training. Duration: 0.536375 2023-10-04 17:46:11,239 WARNING [train.py:1204] (2/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_349-69727-0_sp1.1 from training. Duration: 0.936375 2023-10-04 17:46:12,643 WARNING [train.py:1204] (2/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_580-93798-0_sp1.1 from training. Duration: 0.5090625 2023-10-04 17:46:14,510 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7762_210_1794_1_1528199508_4057408_419-125009-0 from training. Duration: 0.83 2023-10-04 17:46:15,361 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.1.encoder.layers.1.self_attn1.whiten, num_groups=1, num_channels=256, metric=12.64 vs. limit=22.5 2023-10-04 17:46:15,904 WARNING [train.py:1204] (2/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_356-167816-0 from training. Duration: 0.72 2023-10-04 17:46:15,940 WARNING [train.py:1204] (2/4) Exclude cut with ID _29345_210_4381_1_1532689168263_3860169_225-97100-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 17:46:15,950 WARNING [train.py:1204] (2/4) Exclude cut with ID _14227_210_4381_1_1530701804286_3940259_374-194712-0_sp1.1 from training. Duration: 0.92725 2023-10-04 17:46:17,647 WARNING [train.py:1204] (2/4) Exclude cut with ID _40137_210_12210_1_1533290484046_3795180_249-187059-0_sp0.9 from training. Duration: 0.97775 2023-10-04 17:46:17,672 WARNING [train.py:1204] (2/4) Exclude cut with ID _22826_210_9994_1_1531908201522_3279169_389-317194-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 17:46:19,085 WARNING [train.py:1204] (2/4) Exclude cut with ID _27485_210_4916_1_1532739191677_3668269_239-122918-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 17:46:22,478 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_2245_210_2715_1_1525511548_3540254_128-117609-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 17:46:24,443 WARNING [train.py:1204] (2/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_160-276178-0_sp1.1 from training. Duration: 0.963625 2023-10-04 17:46:24,538 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_31920_210_10633_1_1532860009_1686094_1-279735-0_sp1.1 from training. Duration: 0.97275 2023-10-04 17:46:27,258 WARNING [train.py:1204] (2/4) Exclude cut with ID _51078_210_9070_1_1534161557424_3805250_325-262383-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 17:46:28,458 WARNING [train.py:1204] (2/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_643-52007-0_sp0.9 from training. Duration: 0.6333125 2023-10-04 17:46:28,511 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_51937_210_5014_1_1534573610_4022025_518-299248-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 17:46:35,547 WARNING [train.py:1204] (2/4) Exclude cut with ID _13252_210_1925_1_1530671372047_3336909_166-314607-0_sp1.1 from training. Duration: 0.4909375 2023-10-04 17:46:35,551 WARNING [train.py:1204] (2/4) Exclude cut with ID _44778_210_2054_1_1533798127203_7443090_1087-282939-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 17:46:35,581 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_23850_210_4238_1_1532232597_1632433_32-185135-0_sp1.1 from training. Duration: 0.7818125 2023-10-04 17:46:35,612 WARNING [train.py:1204] (2/4) Exclude cut with ID _59500_210_8254_1_1534809610680_3736009_145-89359-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 17:46:38,245 INFO [train.py:1046] (2/4) Epoch 50, batch 700, loss[loss=0.1606, simple_loss=0.247, pruned_loss=0.03715, over 24061.00 frames. ], tot_loss[loss=0.1517, simple_loss=0.2324, pruned_loss=0.03549, over 4579824.76 frames. ], batch size: 80, lr: 2.04e-03, grad_scale: 8.0 2023-10-04 17:46:39,710 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_340-336408-0 from training. Duration: 0.93 2023-10-04 17:46:41,159 WARNING [train.py:1204] (2/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_9-21737-0 from training. Duration: 0.97 2023-10-04 17:46:44,426 WARNING [train.py:1204] (2/4) Exclude cut with ID _2220_210_1819_1_1525509109898_3658680_50-99837-0 from training. Duration: 0.54 2023-10-04 17:46:44,495 WARNING [train.py:1204] (2/4) Exclude cut with ID _30304_210_6319_1_1532476827261_7582060_879-299350-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 17:46:45,901 WARNING [train.py:1204] (2/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_905-160074-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 17:46:47,640 WARNING [train.py:1204] (2/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_395-73570-0 from training. Duration: 0.99 2023-10-04 17:46:50,804 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.795e+02 2.097e+02 2.406e+02 2.781e+02 3.851e+02, threshold=4.811e+02, percent-clipped=0.0 2023-10-04 17:46:54,209 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7264_210_4938_1_1527939911_8132095_800-91995-0_sp1.1 from training. Duration: 0.7818125 2023-10-04 17:46:55,615 WARNING [train.py:1204] (2/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_77-311404-0_sp1.1 from training. Duration: 0.8545625 2023-10-04 17:46:57,015 WARNING [train.py:1204] (2/4) Exclude cut with ID _13679_210_7497_1_1530534293662_3355098_107-80772-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 17:46:59,666 WARNING [train.py:1204] (2/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_224-343295-0_sp1.1 from training. Duration: 0.5 2023-10-04 17:46:59,695 WARNING [train.py:1204] (2/4) Exclude cut with ID _54871_210_22356_1_1534665798253_3856359_156-253482-0_sp1.1 from training. Duration: 0.963625 2023-10-04 17:47:02,340 WARNING [train.py:1204] (2/4) Exclude cut with ID _49728_210_12651_1_1534041034294_6788150_309-125532-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 17:47:05,183 WARNING [train.py:1204] (2/4) Exclude cut with ID _13913_210_9994_1_1530579615187_3383897_473-277682-0_sp0.9 from training. Duration: 0.4333125 2023-10-04 17:47:05,187 WARNING [train.py:1204] (2/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_3-276701-0_sp1.1 from training. Duration: 0.82725 2023-10-04 17:47:06,590 WARNING [train.py:1204] (2/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_282-136641-0 from training. Duration: 0.42 2023-10-04 17:47:06,785 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.conv_module2.balancer2.prob, batch_count=1740093.3333333333, ans=0.125 2023-10-04 17:47:09,291 WARNING [train.py:1204] (2/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_127-566-0 from training. Duration: 0.84 2023-10-04 17:47:12,082 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6163_210_6400_1_1527506746_4263068_283-137811-0_sp1.1 from training. Duration: 0.7 2023-10-04 17:47:13,947 WARNING [train.py:1204] (2/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_34-183685-0_sp1.1 from training. Duration: 0.8909375 2023-10-04 17:47:15,334 WARNING [train.py:1204] (2/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_154-135696-0_sp0.9 from training. Duration: 0.888875 2023-10-04 17:47:18,174 WARNING [train.py:1204] (2/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_676-51014-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 17:47:20,056 WARNING [train.py:1204] (2/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_38-169920-0 from training. Duration: 0.91 2023-10-04 17:47:24,673 WARNING [train.py:1204] (2/4) Exclude cut with ID _6653_210_2629_1_1527919172441_6732679_747-114465-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 17:47:24,790 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.0.attention_skip_rate, batch_count=1740160.0, ans=0.0 2023-10-04 17:47:25,998 WARNING [train.py:1204] (2/4) Exclude cut with ID _8021_210_7994_1_1528376451358_3653069_100-140231-0_sp1.1 from training. Duration: 0.6090625 2023-10-04 17:47:26,015 WARNING [train.py:1204] (2/4) Exclude cut with ID _64135_210_2253_1_1535677267876_7608972_133-188266-0 from training. Duration: 0.81 2023-10-04 17:47:27,625 WARNING [train.py:1204] (2/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_714-310538-0_sp1.1 from training. Duration: 0.7818125 2023-10-04 17:47:29,255 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.feed_forward3.hidden_balancer.prob, batch_count=1740160.0, ans=0.125 2023-10-04 17:47:30,383 WARNING [train.py:1204] (2/4) Exclude cut with ID _39867_210_9826_1_1533553222030_9344259_1134-288498-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 17:47:31,816 WARNING [train.py:1204] (2/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_211-237289-0_sp1.1 from training. Duration: 0.936375 2023-10-04 17:47:35,849 WARNING [train.py:1204] (2/4) Exclude cut with ID _59202_210_26984_1_1534816810612_3549174_137-318710-0_sp0.9 from training. Duration: 0.988875 2023-10-04 17:47:37,211 WARNING [train.py:1204] (2/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_86-66192-0 from training. Duration: 0.55 2023-10-04 17:47:40,062 WARNING [train.py:1204] (2/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_540-275685-0 from training. Duration: 0.89 2023-10-04 17:47:40,111 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8776_210_2040_1_1528638837_4088036_244-278763-0 from training. Duration: 0.9 2023-10-04 17:47:42,278 WARNING [train.py:1204] (2/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_9-219972-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 17:47:43,569 WARNING [train.py:1204] (2/4) Exclude cut with ID _44659_210_2721_1_1534036120061_6614860_656-283823-0_sp1.1 from training. Duration: 0.92725 2023-10-04 17:47:44,816 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_69630_210_7234_1_1535789601_661049_77-216385-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 17:47:46,228 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_19952_210_2161_1_1531875469_3851750_210-308919-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 17:47:46,240 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_4737_210_2040_1_1526998689_3293008_378-178650-0 from training. Duration: 0.95 2023-10-04 17:47:50,992 WARNING [train.py:1204] (2/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_224-343295-0 from training. Duration: 0.55 2023-10-04 17:47:51,017 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7002_210_1794_1_1527939683_5157286_184-229630-0 from training. Duration: 0.74 2023-10-04 17:47:51,043 WARNING [train.py:1204] (2/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_6-214815-0 from training. Duration: 0.85 2023-10-04 17:47:52,325 INFO [train.py:1046] (2/4) Epoch 50, batch 750, loss[loss=0.1537, simple_loss=0.2166, pruned_loss=0.04536, over 19473.00 frames. ], tot_loss[loss=0.1512, simple_loss=0.2317, pruned_loss=0.03536, over 4613118.28 frames. ], batch size: 388, lr: 2.04e-03, grad_scale: 8.0 2023-10-04 17:47:52,409 WARNING [train.py:1204] (2/4) Exclude cut with ID _8699_210_2069_1_1528630346831_3960409_160-24408-0 from training. Duration: 0.91 2023-10-04 17:47:53,763 WARNING [train.py:1204] (2/4) Exclude cut with ID _5585_210_2667_1_1527418813324_8007340_89-332072-0 from training. Duration: 0.64 2023-10-04 17:47:53,798 WARNING [train.py:1204] (2/4) Exclude cut with ID _5250_210_5219_1_1527249561806_8692110_528-259800-0_sp1.1 from training. Duration: 0.963625 2023-10-04 17:47:54,064 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.conv_module1.balancer1.prob, batch_count=1740293.3333333333, ans=0.125 2023-10-04 17:47:55,137 WARNING [train.py:1204] (2/4) Exclude cut with ID _55383_210_10635_1_1534468426172_2231669_134-128246-0 from training. Duration: 0.89 2023-10-04 17:47:55,212 WARNING [train.py:1204] (2/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_400-338274-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 17:47:56,512 WARNING [train.py:1204] (2/4) Exclude cut with ID _14224_210_4381_1_1530529062037_4264010_364-22817-0_sp0.9 from training. Duration: 0.788875 2023-10-04 17:47:57,871 WARNING [train.py:1204] (2/4) Exclude cut with ID _42863_210_9070_1_1533902398788_3696920_331-291057-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 17:47:59,393 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_5436_210_1794_1_1527331317_2541494_229-211016-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 17:47:59,430 WARNING [train.py:1204] (2/4) Exclude cut with ID _6456_210_1534_1_1527681634212_3181049_345-252877-0_sp0.9 from training. Duration: 0.711125 2023-10-04 17:48:00,752 WARNING [train.py:1204] (2/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_96-81499-0_sp1.1 from training. Duration: 0.92725 2023-10-04 17:48:03,432 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_5575_210_1410_1_1527301305_2570170_342-265572-0_sp1.1 from training. Duration: 0.8545625 2023-10-04 17:48:04,826 WARNING [train.py:1204] (2/4) Exclude cut with ID _5444_210_2151_1_1527418808464_5312210_726-27165-0_sp0.9 from training. Duration: 0.7444375 2023-10-04 17:48:06,185 WARNING [train.py:1204] (2/4) Exclude cut with ID _22083_210_8254_1_1531702862439_3678970_304-327750-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 17:48:08,915 WARNING [train.py:1204] (2/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_245-226199-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 17:48:10,280 WARNING [train.py:1204] (2/4) Exclude cut with ID _42875_210_14856_1_1533704397487_4012433_102-199499-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 17:48:11,557 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_51225_210_3601_1_1534400634_2327383_55-301844-0 from training. Duration: 0.93 2023-10-04 17:48:12,310 WARNING [train.py:1204] (2/4) Exclude cut with ID _28317_210_12742_1_1532480302381_8510325_916-195848-0_sp1.1 from training. Duration: 0.836375 2023-10-04 17:48:14,942 WARNING [train.py:1204] (2/4) Exclude cut with ID _44718_210_5816_1_1534050051869_6469030_497-311999-0_sp1.1 from training. Duration: 0.936375 2023-10-04 17:48:16,329 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_34886_210_10633_1_1533203882_1755661_91-194765-0_sp1.1 from training. Duration: 0.936375 2023-10-04 17:48:17,729 WARNING [train.py:1204] (2/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_310-26186-0_sp0.9 from training. Duration: 0.77775 2023-10-04 17:48:17,823 WARNING [train.py:1204] (2/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_416-257730-0 from training. Duration: 0.65 2023-10-04 17:48:17,829 WARNING [train.py:1204] (2/4) Exclude cut with ID _9389_210_1664_1_1529217337301_5289269_209-154760-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 17:48:17,893 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder_embed.conv.5.prob, batch_count=1740360.0, ans=0.125 2023-10-04 17:48:21,868 WARNING [train.py:1204] (2/4) Exclude cut with ID _4092_210_2151_1_1526643044449_3135759_227-213972-0 from training. Duration: 0.54 2023-10-04 17:48:21,904 WARNING [train.py:1204] (2/4) Exclude cut with ID IC0679W0044-335852 from training. Duration: 0.768 2023-10-04 17:48:23,706 WARNING [train.py:1204] (2/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_42-186162-0 from training. Duration: 0.92 2023-10-04 17:48:23,712 WARNING [train.py:1204] (2/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_302-2550-0_sp0.9 from training. Duration: 0.9 2023-10-04 17:48:23,727 WARNING [train.py:1204] (2/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_356-31216-0_sp0.9 from training. Duration: 0.8333125 2023-10-04 17:48:26,387 WARNING [train.py:1204] (2/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_125-322172-0_sp1.1 from training. Duration: 0.8454375 2023-10-04 17:48:30,772 WARNING [train.py:1204] (2/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_141-89727-0_sp0.9 from training. Duration: 0.788875 2023-10-04 17:48:30,797 WARNING [train.py:1204] (2/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_647-337784-0_sp1.1 from training. Duration: 0.97275 2023-10-04 17:48:30,810 WARNING [train.py:1204] (2/4) Exclude cut with ID _6493_210_1747_1_1528459210424_4012470_513-140967-0_sp1.1 from training. Duration: 0.7181875 2023-10-04 17:48:31,001 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=1740426.6666666667, ans=0.1 2023-10-04 17:48:33,391 WARNING [train.py:1204] (2/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_87-266334-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 17:48:34,841 WARNING [train.py:1204] (2/4) Exclude cut with ID _26528_210_9070_1_1532570410842_3851310_526-261338-0_sp1.1 from training. Duration: 0.92725 2023-10-04 17:48:34,891 WARNING [train.py:1204] (2/4) Exclude cut with ID _14224_210_4381_1_1530529062037_4264010_380-343670-0 from training. Duration: 0.88 2023-10-04 17:48:34,934 WARNING [train.py:1204] (2/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_449-15508-0_sp1.1 from training. Duration: 0.7454375 2023-10-04 17:48:36,195 WARNING [train.py:1204] (2/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_372-6869-0_sp0.9 from training. Duration: 0.488875 2023-10-04 17:48:37,532 WARNING [train.py:1204] (2/4) Exclude cut with ID _26907_210_7234_1_1532221740694_3205263_337-2494-0_sp0.9 from training. Duration: 0.97775 2023-10-04 17:48:40,358 WARNING [train.py:1204] (2/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_10-100367-0_sp1.1 from training. Duration: 0.8909375 2023-10-04 17:48:41,663 WARNING [train.py:1204] (2/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_47-167373-0 from training. Duration: 0.92 2023-10-04 17:48:42,345 WARNING [train.py:1204] (2/4) Exclude cut with ID _28323_210_16187_1_1532480395667_4100300_628-282179-0_sp1.1 from training. Duration: 0.97275 2023-10-04 17:48:46,481 WARNING [train.py:1204] (2/4) Exclude cut with ID _20455_210_5219_1_1531565996775_8098100_213-280898-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 17:48:47,953 WARNING [train.py:1204] (2/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_383-316000-0_sp1.1 from training. Duration: 0.8181875 2023-10-04 17:48:47,988 WARNING [train.py:1204] (2/4) Exclude cut with ID _18513_210_9206_1_1531618215223_3862620_16-78180-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 17:48:51,263 WARNING [train.py:1204] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_330-327861-0_sp1.1 from training. Duration: 0.6090625 2023-10-04 17:48:51,624 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.conv_skip_rate, batch_count=1740560.0, ans=0.0 2023-10-04 17:48:54,088 WARNING [train.py:1204] (2/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_356-31216-0 from training. Duration: 0.75 2023-10-04 17:48:55,424 WARNING [train.py:1204] (2/4) Exclude cut with ID _25248_210_8750_1_1532062825941_5554180_255-148539-0_sp1.1 from training. Duration: 0.963625 2023-10-04 17:48:55,464 WARNING [train.py:1204] (2/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_269-154726-0_sp1.1 from training. Duration: 0.7818125 2023-10-04 17:48:56,920 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7002_210_1794_1_1527939683_5157286_163-87552-0_sp1.1 from training. Duration: 0.7818125 2023-10-04 17:48:58,201 WARNING [train.py:1204] (2/4) Exclude cut with ID _54871_210_22356_1_1534665798253_3856359_97-151896-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 17:49:01,015 WARNING [train.py:1204] (2/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_207-7026-0_sp1.1 from training. Duration: 0.97275 2023-10-04 17:49:01,045 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_92-46009-0_sp0.9 from training. Duration: 0.6 2023-10-04 17:49:05,299 INFO [train.py:1046] (2/4) Epoch 50, batch 800, loss[loss=0.1354, simple_loss=0.213, pruned_loss=0.02893, over 24627.00 frames. ], tot_loss[loss=0.1514, simple_loss=0.2323, pruned_loss=0.0353, over 4638066.89 frames. ], batch size: 60, lr: 2.04e-03, grad_scale: 16.0 2023-10-04 17:49:09,524 WARNING [train.py:1204] (2/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_122-178847-0_sp1.1 from training. Duration: 0.97275 2023-10-04 17:49:09,528 WARNING [train.py:1204] (2/4) Exclude cut with ID _44778_210_2054_1_1533798127203_7443090_94-348956-0_sp1.1 from training. Duration: 0.92725 2023-10-04 17:49:10,074 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.5.encoder.layers.1.feed_forward1.out_whiten, num_groups=1, num_channels=256, metric=5.24 vs. limit=15.0 2023-10-04 17:49:10,918 WARNING [train.py:1204] (2/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_1018-244089-0_sp1.1 from training. Duration: 0.963625 2023-10-04 17:49:10,938 WARNING [train.py:1204] (2/4) Exclude cut with ID _56235_210_10640_1_1534557335235_4161099_87-118423-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 17:49:12,176 WARNING [train.py:1204] (2/4) Exclude cut with ID _53713_210_24116_1_1534314487762_7416751_504-212292-0_sp1.1 from training. Duration: 0.92725 2023-10-04 17:49:12,204 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_446-150327-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 17:49:14,104 WARNING [train.py:1204] (2/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_682-245989-0_sp1.1 from training. Duration: 0.92725 2023-10-04 17:49:16,753 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.769e+02 2.060e+02 2.355e+02 2.799e+02 5.307e+02, threshold=4.709e+02, percent-clipped=2.0 2023-10-04 17:49:20,067 WARNING [train.py:1204] (2/4) Exclude cut with ID _35723_210_1794_1_1533088341043_628250_8-234524-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 17:49:20,144 WARNING [train.py:1204] (2/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_64-248359-0_sp0.9 from training. Duration: 0.7666875 2023-10-04 17:49:23,028 WARNING [train.py:1204] (2/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_49-97158-0 from training. Duration: 0.76 2023-10-04 17:49:24,927 WARNING [train.py:1204] (2/4) Exclude cut with ID _24314_210_4915_1_1532135078116_7170179_743-915-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 17:49:26,208 WARNING [train.py:1204] (2/4) Exclude cut with ID _61194_210_18628_1_1535007555628_3750909_353-323468-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 17:49:26,237 WARNING [train.py:1204] (2/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_387-131538-0_sp1.1 from training. Duration: 0.736375 2023-10-04 17:49:26,268 WARNING [train.py:1204] (2/4) Exclude cut with ID _44424_210_18163_1_1533623394848_1213110_79-187603-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 17:49:27,647 WARNING [train.py:1204] (2/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_86-19601-0 from training. Duration: 0.85 2023-10-04 17:49:27,679 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_50050_210_16223_1_1535435796_7290138_103-283529-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 17:49:28,968 WARNING [train.py:1204] (2/4) Exclude cut with ID _13913_210_9994_1_1530579615187_3383897_473-277682-0 from training. Duration: 0.39 2023-10-04 17:49:30,438 WARNING [train.py:1204] (2/4) Exclude cut with ID _42326_210_7770_1_1533814282531_3707269_368-68029-0_sp1.1 from training. Duration: 0.92725 2023-10-04 17:49:33,265 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_41428_210_2161_1_1533774184_3992532_446-339645-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 17:49:35,869 WARNING [train.py:1204] (2/4) Exclude cut with ID _69584_210_5219_1_1535785133775_4044450_325-7656-0_sp1.1 from training. Duration: 0.7818125 2023-10-04 17:49:35,877 WARNING [train.py:1204] (2/4) Exclude cut with ID _21632_210_2253_1_1531994001765_3744140_134-184387-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 17:49:38,626 WARNING [train.py:1204] (2/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_550-184707-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 17:49:38,649 WARNING [train.py:1204] (2/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_106-341780-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 17:49:43,205 WARNING [train.py:1204] (2/4) Exclude cut with ID _45871_210_5665_1_1533864614245_3296060_253-132552-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 17:49:44,020 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.2.encoder.layers.1.self_attn1.whiten, num_groups=1, num_channels=384, metric=18.17 vs. limit=22.5 2023-10-04 17:49:44,543 WARNING [train.py:1204] (2/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_395-310220-0_sp1.1 from training. Duration: 0.8090625 2023-10-04 17:49:44,558 WARNING [train.py:1204] (2/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_795-33252-0_sp0.9 from training. Duration: 0.411125 2023-10-04 17:49:45,404 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.nonlin_attention.whiten1.whitening_limit, batch_count=1740760.0, ans=10.0 2023-10-04 17:49:45,982 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9002W0003-495962 from training. Duration: 0.946 2023-10-04 17:49:46,011 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_43-71989-0 from training. Duration: 0.57 2023-10-04 17:49:47,285 WARNING [train.py:1204] (2/4) Exclude cut with ID _8699_210_2069_1_1528630346831_3960409_155-39766-0_sp1.1 from training. Duration: 0.7090625 2023-10-04 17:49:47,305 WARNING [train.py:1204] (2/4) Exclude cut with ID _27894_210_12392_1_1532415496078_6902439_788-232684-0_sp1.1 from training. Duration: 0.97275 2023-10-04 17:49:48,752 WARNING [train.py:1204] (2/4) Exclude cut with ID _56494_210_4220_1_1535284837625_3033169_10-39296-0_sp1.1 from training. Duration: 0.92725 2023-10-04 17:49:48,782 WARNING [train.py:1204] (2/4) Exclude cut with ID _40901_210_5665_1_1533363789958_3856130_341-20819-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 17:49:51,780 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9029W0435-509822 from training. Duration: 0.896 2023-10-04 17:49:53,715 WARNING [train.py:1204] (2/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_51-117576-0 from training. Duration: 0.4 2023-10-04 17:49:55,106 WARNING [train.py:1204] (2/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_197-28164-0_sp1.1 from training. Duration: 0.72725 2023-10-04 17:49:57,077 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.conv_skip_rate, batch_count=1740826.6666666667, ans=0.0 2023-10-04 17:49:58,287 WARNING [train.py:1204] (2/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_145-137790-0_sp1.1 from training. Duration: 0.7545625 2023-10-04 17:50:01,671 WARNING [train.py:1204] (2/4) Exclude cut with ID _47790_210_16187_1_1533862814066_4207519_2-39653-0_sp1.1 from training. Duration: 0.8909375 2023-10-04 17:50:05,727 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_3780_210_2087_1_1526467760_2130483_186-208240-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 17:50:07,090 WARNING [train.py:1204] (2/4) Exclude cut with ID _24055_210_12651_1_1532310907143_976979_92-59356-0 from training. Duration: 0.91 2023-10-04 17:50:07,138 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_3910_210_1794_1_1526742306_334530_28-238909-0_sp1.1 from training. Duration: 0.736375 2023-10-04 17:50:10,036 WARNING [train.py:1204] (2/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_269-154726-0 from training. Duration: 0.86 2023-10-04 17:50:15,074 WARNING [train.py:1204] (2/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_326-165900-0_sp0.9 from training. Duration: 0.7666875 2023-10-04 17:50:17,746 WARNING [train.py:1204] (2/4) Exclude cut with ID _5294_210_1747_1_1527249643928_3696179_470-276077-0_sp1.1 from training. Duration: 0.963625 2023-10-04 17:50:17,795 WARNING [train.py:1204] (2/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_372-131163-0 from training. Duration: 0.94 2023-10-04 17:50:17,849 WARNING [train.py:1204] (2/4) Exclude cut with ID _45297_210_2721_1_1533715351381_3264949_351-147136-0_sp1.1 from training. Duration: 0.7909375 2023-10-04 17:50:19,242 INFO [train.py:1046] (2/4) Epoch 50, batch 850, loss[loss=0.1522, simple_loss=0.2284, pruned_loss=0.03795, over 23574.00 frames. ], tot_loss[loss=0.1526, simple_loss=0.2333, pruned_loss=0.03595, over 4649548.76 frames. ], batch size: 232, lr: 2.04e-03, grad_scale: 16.0 2023-10-04 17:50:19,347 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_1072-12671-0_sp1.1 from training. Duration: 0.92725 2023-10-04 17:50:20,695 WARNING [train.py:1204] (2/4) Exclude cut with ID _15133_210_6228_1_1530794512074_3080379_54-198193-0 from training. Duration: 0.94 2023-10-04 17:50:20,719 WARNING [train.py:1204] (2/4) Exclude cut with ID _30196_210_1664_1_1533708496522_7074140_1194-270988-0_sp1.1 from training. Duration: 0.97275 2023-10-04 17:50:22,067 WARNING [train.py:1204] (2/4) Exclude cut with ID _50310_210_15488_1_1534068043345_3641080_289-193637-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 17:50:22,163 WARNING [train.py:1204] (2/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_313-263495-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 17:50:24,070 WARNING [train.py:1204] (2/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_18-130336-0_sp1.1 from training. Duration: 0.7181875 2023-10-04 17:50:25,448 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_47896_210_20508_1_1534492518_5958868_646-287209-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 17:50:28,504 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_294-163793-0 from training. Duration: 0.82 2023-10-04 17:50:28,556 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_8-319957-0 from training. Duration: 0.96 2023-10-04 17:50:28,559 WARNING [train.py:1204] (2/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_1211-226066-0 from training. Duration: 0.76 2023-10-04 17:50:31,873 WARNING [train.py:1204] (2/4) Exclude cut with ID _4779_210_2084_1_1527318022944_3914893_500-285013-0_sp0.9 from training. Duration: 0.7666875 2023-10-04 17:50:31,891 WARNING [train.py:1204] (2/4) Exclude cut with ID _22826_210_9994_1_1531908201522_3279169_347-232268-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 17:50:33,358 WARNING [train.py:1204] (2/4) Exclude cut with ID _29877_210_6228_1_1532397791106_5276149_111-55056-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 17:50:33,381 WARNING [train.py:1204] (2/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_812-271646-0_sp1.1 from training. Duration: 0.92725 2023-10-04 17:50:34,785 WARNING [train.py:1204] (2/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_235-262124-0_sp1.1 from training. Duration: 0.6090625 2023-10-04 17:50:36,530 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.feed_forward1.out_proj.dropout_p, batch_count=1741026.6666666667, ans=0.1 2023-10-04 17:50:39,024 WARNING [train.py:1204] (2/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_14-16530-0_sp1.1 from training. Duration: 0.97275 2023-10-04 17:50:40,363 WARNING [train.py:1204] (2/4) Exclude cut with ID _44718_210_5816_1_1534050051869_6469030_568-304512-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 17:50:40,391 WARNING [train.py:1204] (2/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_1033-274376-0 from training. Duration: 0.87 2023-10-04 17:50:40,622 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.nonlin_attention.balancer.prob, batch_count=1741026.6666666667, ans=0.125 2023-10-04 17:50:43,187 WARNING [train.py:1204] (2/4) Exclude cut with ID _4681_210_3525_1_1527251040321_1235713_123-24428-0 from training. Duration: 0.48 2023-10-04 17:50:46,366 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_37818_210_4238_1_1533620910_4720083_251-288764-0_sp1.1 from training. Duration: 0.97275 2023-10-04 17:50:47,655 WARNING [train.py:1204] (2/4) Exclude cut with ID _65585_210_12210_1_1535450451584_3764098_112-294301-0 from training. Duration: 0.88 2023-10-04 17:50:50,528 WARNING [train.py:1204] (2/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_329-113173-0 from training. Duration: 0.62 2023-10-04 17:50:51,936 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6767_210_3273_1_1527855783_1718932_173-256601-0 from training. Duration: 0.8 2023-10-04 17:50:54,662 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9266W0001-627677 from training. Duration: 0.896 2023-10-04 17:50:55,872 WARNING [train.py:1204] (2/4) Exclude cut with ID _49643_210_7989_1_1534640244224_3939031_611-185515-0_sp1.1 from training. Duration: 0.8545625 2023-10-04 17:50:55,876 WARNING [train.py:1204] (2/4) Exclude cut with ID _31649_210_6228_1_1532584608063_4091078_687-237771-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 17:50:55,886 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_12868_210_6753_1_1530702479_3610564_148-172861-0_sp0.9 from training. Duration: 0.5555625 2023-10-04 17:50:57,874 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_5129_210_6400_1_1527161005_3579054_107-69738-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 17:50:57,972 WARNING [train.py:1204] (2/4) Exclude cut with ID _40137_210_12210_1_1533290484046_3795180_36-301164-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 17:50:57,994 WARNING [train.py:1204] (2/4) Exclude cut with ID _7537_210_2084_1_1528502395431_3924359_113-199933-0 from training. Duration: 0.59 2023-10-04 17:51:01,235 WARNING [train.py:1204] (2/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_547-5424-0_sp1.1 from training. Duration: 0.8545625 2023-10-04 17:51:01,288 WARNING [train.py:1204] (2/4) Exclude cut with ID _43552_210_8913_1_1533726031059_3523429_147-284024-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 17:51:02,643 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_294-163793-0_sp1.1 from training. Duration: 0.7454375 2023-10-04 17:51:02,665 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_3780_210_2087_1_1526467760_2130483_170-259044-0_sp1.1 from training. Duration: 0.636375 2023-10-04 17:51:04,120 WARNING [train.py:1204] (2/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_726-270780-0_sp1.1 from training. Duration: 0.7818125 2023-10-04 17:51:05,394 WARNING [train.py:1204] (2/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_214-291369-0_sp1.1 from training. Duration: 0.57275 2023-10-04 17:51:05,436 WARNING [train.py:1204] (2/4) Exclude cut with ID _21881_210_3525_1_1531825242757_3798380_247-102591-0 from training. Duration: 0.98 2023-10-04 17:51:10,673 WARNING [train.py:1204] (2/4) Exclude cut with ID _8064_210_5399_1_1528372299291_4282973_18-174647-0_sp1.1 from training. Duration: 0.82725 2023-10-04 17:51:10,675 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_415-52611-0_sp1.1 from training. Duration: 0.936375 2023-10-04 17:51:10,738 WARNING [train.py:1204] (2/4) Exclude cut with ID _8394_210_6702_1_1528590504316_3540189_379-49457-0_sp0.9 from training. Duration: 0.9666875 2023-10-04 17:51:10,766 WARNING [train.py:1204] (2/4) Exclude cut with ID _64521_210_14699_1_1535243427259_3815328_368-195106-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 17:51:12,067 WARNING [train.py:1204] (2/4) Exclude cut with ID _28394_210_8913_1_1532430049518_3650118_51-334124-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 17:51:15,259 WARNING [train.py:1204] (2/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_317-161769-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 17:51:16,543 WARNING [train.py:1204] (2/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_40-129756-0_sp0.9 from training. Duration: 0.9 2023-10-04 17:51:19,255 WARNING [train.py:1204] (2/4) Exclude cut with ID _3067_210_2667_1_1526212974640_4503168_119-175776-0_sp0.9 from training. Duration: 0.82225 2023-10-04 17:51:20,550 WARNING [train.py:1204] (2/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_1004-304915-0_sp1.1 from training. Duration: 0.92725 2023-10-04 17:51:20,604 WARNING [train.py:1204] (2/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_359-118926-0_sp0.9 from training. Duration: 0.8 2023-10-04 17:51:28,141 WARNING [train.py:1204] (2/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_51-200220-0_sp0.9 from training. Duration: 0.62225 2023-10-04 17:51:30,065 WARNING [train.py:1204] (2/4) Exclude cut with ID _46683_210_9642_1_1533730913492_4591116_530-326202-0_sp1.1 from training. Duration: 0.936375 2023-10-04 17:51:30,109 WARNING [train.py:1204] (2/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_567-135114-0 from training. Duration: 0.87 2023-10-04 17:51:30,144 WARNING [train.py:1204] (2/4) Exclude cut with ID _66863_210_18053_1_1535698758334_8590319_1092-271784-0_sp1.1 from training. Duration: 0.87275 2023-10-04 17:51:30,162 WARNING [train.py:1204] (2/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_762-282614-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 17:51:32,660 INFO [train.py:1046] (2/4) Epoch 50, batch 900, loss[loss=0.1532, simple_loss=0.2324, pruned_loss=0.037, over 23507.00 frames. ], tot_loss[loss=0.1531, simple_loss=0.2338, pruned_loss=0.03617, over 4666861.56 frames. ], batch size: 256, lr: 2.03e-03, grad_scale: 8.0 2023-10-04 17:51:32,741 WARNING [train.py:1204] (2/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_280-120942-0 from training. Duration: 0.77 2023-10-04 17:51:38,334 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_31583_210_3852_1_1532595851_4024413_27-170931-0_sp1.1 from training. Duration: 0.963625 2023-10-04 17:51:41,133 WARNING [train.py:1204] (2/4) Exclude cut with ID _12369_210_4381_1_1530356078581_4388520_266-271628-0_sp1.1 from training. Duration: 0.92725 2023-10-04 17:51:41,166 WARNING [train.py:1204] (2/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_41-31157-0 from training. Duration: 0.57 2023-10-04 17:51:42,626 WARNING [train.py:1204] (2/4) Exclude cut with ID _21878_210_3525_1_1531738772002_7264330_113-18163-0_sp1.1 from training. Duration: 0.8090625 2023-10-04 17:51:42,656 WARNING [train.py:1204] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_147-111137-0 from training. Duration: 0.95 2023-10-04 17:51:44,439 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1083-220122-0_sp1.1 from training. Duration: 0.463625 2023-10-04 17:51:44,524 WARNING [train.py:1204] (2/4) Exclude cut with ID _14884_210_2252_1_1530707459689_5561250_591-173406-0_sp1.1 from training. Duration: 0.87275 2023-10-04 17:51:44,530 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_24677_210_1794_1_1532002693_4341296_149-303793-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 17:51:45,743 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.794e+02 2.220e+02 2.526e+02 3.144e+02 5.127e+02, threshold=5.052e+02, percent-clipped=1.0 2023-10-04 17:51:45,828 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_12868_210_6753_1_1530702479_3610564_133-564-0_sp1.1 from training. Duration: 0.7181875 2023-10-04 17:51:45,851 WARNING [train.py:1204] (2/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_625-338546-0_sp0.9 from training. Duration: 0.97775 2023-10-04 17:51:45,972 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder_embed.convnext.out_balancer.prob, batch_count=1741360.0, ans=0.125 2023-10-04 17:51:56,783 WARNING [train.py:1204] (2/4) Exclude cut with ID _25882_210_3036_1_1532775665815_6479510_624-320169-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 17:51:56,798 WARNING [train.py:1204] (2/4) Exclude cut with ID _20622_210_3852_1_1531622427080_1093839_127-325326-0_sp1.1 from training. Duration: 0.92725 2023-10-04 17:51:56,836 WARNING [train.py:1204] (2/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_464-33931-0_sp1.1 from training. Duration: 0.4909375 2023-10-04 17:52:00,787 INFO [scaling.py:1118] (2/4) WithLoss: name=encoder.encoders.4.encoder.layers.0.self_attn_weights, loss-sum=0.000e+00 2023-10-04 17:52:01,875 WARNING [train.py:1204] (2/4) Exclude cut with ID _66112_210_10635_1_1535590236305_3801484_120-304588-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 17:52:07,158 WARNING [train.py:1204] (2/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_110-130961-0 from training. Duration: 0.47 2023-10-04 17:52:07,299 WARNING [train.py:1204] (2/4) Exclude cut with ID _16908_210_5724_1_1531119157248_3772700_64-54020-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 17:52:12,996 WARNING [train.py:1204] (2/4) Exclude cut with ID _4068_210_2629_1_1526697350395_1546259_224-288693-0_sp1.1 from training. Duration: 0.536375 2023-10-04 17:52:14,346 WARNING [train.py:1204] (2/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_191-41023-0_sp0.9 from training. Duration: 0.788875 2023-10-04 17:52:16,099 WARNING [train.py:1204] (2/4) Exclude cut with ID ID0205W0255-783002 from training. Duration: 0.958 2023-10-04 17:52:16,175 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_162-262717-0 from training. Duration: 0.83 2023-10-04 17:52:20,495 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_224-322394-0_sp1.1 from training. Duration: 0.6 2023-10-04 17:52:20,501 WARNING [train.py:1204] (2/4) Exclude cut with ID _4866_210_2577_1_1527404412284_4364659_133-1696-0_sp1.1 from training. Duration: 0.9 2023-10-04 17:52:21,831 WARNING [train.py:1204] (2/4) Exclude cut with ID _6275_210_2084_1_1528110062213_7669049_973-198085-0_sp1.1 from training. Duration: 0.6818125 2023-10-04 17:52:28,065 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_47449_210_5399_1_1534076445_4006929_410-237806-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 17:52:28,076 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7543_210_6753_1_1528543318_4552_1-85072-0_sp0.9 from training. Duration: 0.92225 2023-10-04 17:52:29,533 WARNING [train.py:1204] (2/4) Exclude cut with ID _65263_210_22356_1_1535763598691_3921037_108-21902-0 from training. Duration: 0.99 2023-10-04 17:52:30,789 WARNING [train.py:1204] (2/4) Exclude cut with ID _65585_210_12210_1_1535450451584_3764098_309-174818-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 17:52:33,948 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_309-65177-0 from training. Duration: 0.88 2023-10-04 17:52:35,780 WARNING [train.py:1204] (2/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_363-185703-0_sp1.1 from training. Duration: 0.863625 2023-10-04 17:52:35,821 WARNING [train.py:1204] (2/4) Exclude cut with ID _42326_210_7770_1_1533814282531_3707269_442-238692-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 17:52:37,270 WARNING [train.py:1204] (2/4) Exclude cut with ID _69884_210_9771_1_1535846392818_4166169_226-254446-0_sp1.1 from training. Duration: 0.936375 2023-10-04 17:52:38,736 WARNING [train.py:1204] (2/4) Exclude cut with ID _29853_210_7497_1_1532505600851_6642110_287-299903-0_sp1.1 from training. Duration: 0.963625 2023-10-04 17:52:41,751 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.balancer2.prob, batch_count=1741560.0, ans=0.125 2023-10-04 17:52:42,907 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1015-343884-0 from training. Duration: 0.62 2023-10-04 17:52:42,945 WARNING [train.py:1204] (2/4) Exclude cut with ID ID0305W0008-833402 from training. Duration: 0.91 2023-10-04 17:52:44,304 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6673_210_3273_1_1527767790_3173240_351-167954-0_sp1.1 from training. Duration: 0.436375 2023-10-04 17:52:44,324 WARNING [train.py:1204] (2/4) Exclude cut with ID _6456_210_1534_1_1527681634212_3181049_345-252877-0 from training. Duration: 0.64 2023-10-04 17:52:45,555 INFO [train.py:1046] (2/4) Epoch 50, batch 950, loss[loss=0.1573, simple_loss=0.2427, pruned_loss=0.03597, over 23967.00 frames. ], tot_loss[loss=0.1536, simple_loss=0.2343, pruned_loss=0.03641, over 4663133.75 frames. ], batch size: 86, lr: 2.03e-03, grad_scale: 8.0 2023-10-04 17:52:47,549 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_13-205182-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 17:52:50,363 WARNING [train.py:1204] (2/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_427-208901-0 from training. Duration: 0.68 2023-10-04 17:52:50,596 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.conv_module1.balancer2.prob, batch_count=1741626.6666666667, ans=0.125 2023-10-04 17:52:51,958 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.balancer2.prob, batch_count=1741626.6666666667, ans=0.125 2023-10-04 17:52:54,550 WARNING [train.py:1204] (2/4) Exclude cut with ID _49039_210_6315_1_1533967537541_3846118_9-148921-0_sp1.1 from training. Duration: 0.97275 2023-10-04 17:52:56,014 WARNING [train.py:1204] (2/4) Exclude cut with ID _59493_210_17282_1_1535078419077_3225879_143-70317-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 17:52:57,308 WARNING [train.py:1204] (2/4) Exclude cut with ID _27967_210_12140_1_1532588391859_8419130_1056-236369-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 17:52:57,359 WARNING [train.py:1204] (2/4) Exclude cut with ID _6537_210_1883_1_1527987559710_3597129_382-133062-0_sp0.9 from training. Duration: 0.7555625 2023-10-04 17:52:59,367 WARNING [train.py:1204] (2/4) Exclude cut with ID ID0376W0255-869957 from training. Duration: 0.9510625 2023-10-04 17:53:00,219 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.0.layers.1.whiten, num_groups=1, num_channels=192, metric=4.13 vs. limit=12.0 2023-10-04 17:53:03,715 WARNING [train.py:1204] (2/4) Exclude cut with ID _49371_210_12071_1_1533952761021_3812190_483-240691-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 17:53:05,428 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_12868_210_6753_1_1530701932_530275_18-76322-0_sp1.1 from training. Duration: 0.7909375 2023-10-04 17:53:06,752 WARNING [train.py:1204] (2/4) Exclude cut with ID _53950_210_5816_1_1534417264919_3862951_367-26344-0_sp1.1 from training. Duration: 0.97275 2023-10-04 17:53:06,781 WARNING [train.py:1204] (2/4) Exclude cut with ID _5444_210_2151_1_1527418808464_5312210_634-194174-0_sp1.1 from training. Duration: 0.8818125 2023-10-04 17:53:06,805 WARNING [train.py:1204] (2/4) Exclude cut with ID _56494_210_4220_1_1535284837625_3033169_69-138150-0 from training. Duration: 0.94 2023-10-04 17:53:08,232 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7543_210_6753_1_1528544010_1110402_25-183860-0_sp0.9 from training. Duration: 0.52225 2023-10-04 17:53:09,104 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.1.encoder.layers.0.nonlin_attention.whiten2, num_groups=1, num_channels=256, metric=5.05 vs. limit=15.0 2023-10-04 17:53:09,571 WARNING [train.py:1204] (2/4) Exclude cut with ID _40284_210_2084_1_1533513679814_3704279_287-191918-0_sp1.1 from training. Duration: 0.963625 2023-10-04 17:53:10,250 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.2.encoder.layers.2.self_attn2.whiten, num_groups=1, num_channels=384, metric=17.57 vs. limit=22.5 2023-10-04 17:53:12,283 WARNING [train.py:1204] (2/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_291-270881-0 from training. Duration: 0.97 2023-10-04 17:53:12,321 WARNING [train.py:1204] (2/4) Exclude cut with ID _12555_210_8750_1_1530075594402_3780239_449-267986-0_sp0.9 from training. Duration: 0.92225 2023-10-04 17:53:15,257 WARNING [train.py:1204] (2/4) Exclude cut with ID _3992_210_1747_1_1526644816031_3731060_268-64520-0_sp1.1 from training. Duration: 0.963625 2023-10-04 17:53:15,270 WARNING [train.py:1204] (2/4) Exclude cut with ID _39411_210_5986_1_1533538825030_3343649_173-238248-0_sp0.9 from training. Duration: 0.92225 2023-10-04 17:53:15,297 WARNING [train.py:1204] (2/4) Exclude cut with ID _32704_210_8954_1_1533017488840_6747370_828-313097-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 17:53:15,439 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=1741760.0, ans=0.1 2023-10-04 17:53:18,613 WARNING [train.py:1204] (2/4) Exclude cut with ID _26907_210_7234_1_1532221740694_3205263_337-2494-0 from training. Duration: 0.88 2023-10-04 17:53:19,979 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_229-45300-0_sp1.1 from training. Duration: 0.5181875 2023-10-04 17:53:22,741 WARNING [train.py:1204] (2/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_300-27235-0_sp1.1 from training. Duration: 0.7909375 2023-10-04 17:53:22,844 WARNING [train.py:1204] (2/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_48-338976-0_sp1.1 from training. Duration: 0.8090625 2023-10-04 17:53:27,024 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_13091_210_7925_1_1530609447_8987354_792-195353-0_sp1.1 from training. Duration: 0.936375 2023-10-04 17:53:27,030 WARNING [train.py:1204] (2/4) Exclude cut with ID _56524_210_9642_1_1534572011779_3619170_311-54500-0_sp1.1 from training. Duration: 0.97275 2023-10-04 17:53:28,730 INFO [scaling.py:1118] (2/4) WithLoss: name=encoder.encoders.3.encoder.layers.3.self_attn_weights, loss-sum=0.000e+00 2023-10-04 17:53:29,823 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7543_210_6753_1_1528544010_1110402_25-183860-0 from training. Duration: 0.47 2023-10-04 17:53:31,650 WARNING [train.py:1204] (2/4) Exclude cut with ID 2411-132532-0017-82279-0_sp1.1 from training. Duration: 0.9681875 2023-10-04 17:53:31,651 WARNING [train.py:1204] (2/4) Exclude cut with ID _14170_210_5219_1_1530525949868_4469650_272-249985-0_sp0.9 from training. Duration: 0.7444375 2023-10-04 17:53:31,908 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.ff3_skip_rate, batch_count=1741826.6666666667, ans=0.0 2023-10-04 17:53:32,943 WARNING [train.py:1204] (2/4) Exclude cut with ID _55644_210_11856_1_1534593599582_4103719_320-229006-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 17:53:34,838 WARNING [train.py:1204] (2/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_177-100554-0_sp1.1 from training. Duration: 0.963625 2023-10-04 17:53:34,840 WARNING [train.py:1204] (2/4) Exclude cut with ID _14998_210_5219_1_1530705582080_7474970_787-294416-0_sp1.1 from training. Duration: 0.7454375 2023-10-04 17:53:37,784 WARNING [train.py:1204] (2/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_318-59097-0 from training. Duration: 0.76 2023-10-04 17:53:37,856 WARNING [train.py:1204] (2/4) Exclude cut with ID _12369_210_4381_1_1530356078581_4388520_480-13146-0_sp1.1 from training. Duration: 0.77275 2023-10-04 17:53:38,146 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.ff2_skip_rate, batch_count=1741826.6666666667, ans=0.0 2023-10-04 17:53:40,481 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_469-338558-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 17:53:40,529 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_40896_210_15710_1_1533433956_7100802_367-126897-0_sp1.1 from training. Duration: 0.963625 2023-10-04 17:53:40,561 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6545_210_2040_1_1527774670_4245258_379-14182-0 from training. Duration: 0.8 2023-10-04 17:53:40,780 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.feed_forward3.hidden_balancer.prob, batch_count=1741826.6666666667, ans=0.125 2023-10-04 17:53:41,906 WARNING [train.py:1204] (2/4) Exclude cut with ID _21631_210_2253_1_1531907880778_3160339_197-79498-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 17:53:41,910 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_12868_210_6753_1_1530702479_3610564_416-293746-0_sp1.1 from training. Duration: 0.8181875 2023-10-04 17:53:41,952 WARNING [train.py:1204] (2/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_52-211683-0 from training. Duration: 0.9 2023-10-04 17:53:46,685 WARNING [train.py:1204] (2/4) Exclude cut with ID _48678_210_5663_1_1533988830947_3769199_306-255886-0_sp0.9 from training. Duration: 0.8555625 2023-10-04 17:53:49,442 WARNING [train.py:1204] (2/4) Exclude cut with ID _65134_210_14763_1_1535335393985_3505368_296-24733-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 17:53:53,683 WARNING [train.py:1204] (2/4) Exclude cut with ID _14898_210_9206_1_1530766812377_4177190_335-99364-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 17:53:56,330 WARNING [train.py:1204] (2/4) Exclude cut with ID _54871_210_22356_1_1534665798253_3856359_19-342632-0 from training. Duration: 0.91 2023-10-04 17:53:56,346 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_53071_210_16554_1_1534490405_2954066_265-170472-0 from training. Duration: 0.83 2023-10-04 17:53:59,215 INFO [train.py:1046] (2/4) Epoch 50, batch 1000, loss[loss=0.1437, simple_loss=0.2272, pruned_loss=0.03016, over 19949.00 frames. ], tot_loss[loss=0.1525, simple_loss=0.2329, pruned_loss=0.03602, over 4664103.86 frames. ], batch size: 43, lr: 2.03e-03, grad_scale: 8.0 2023-10-04 17:53:59,359 WARNING [train.py:1204] (2/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_189-298172-0_sp1.1 from training. Duration: 0.963625 2023-10-04 17:53:59,547 INFO [scaling.py:1118] (2/4) WithLoss: name=encoder.encoders.0.layers.1.self_attn_weights, loss-sum=0.000e+00 2023-10-04 17:54:03,309 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7615_210_4211_1_1528110965_4580160_72-235211-0 from training. Duration: 0.76 2023-10-04 17:54:03,352 WARNING [train.py:1204] (2/4) Exclude cut with ID _47790_210_16187_1_1533862814066_4207519_155-90874-0_sp1.1 from training. Duration: 0.936375 2023-10-04 17:54:09,314 WARNING [train.py:1204] (2/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_164-216310-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 17:54:10,696 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_46013_210_7234_1_1533776362_3744032_463-52378-0 from training. Duration: 0.86 2023-10-04 17:54:10,700 WARNING [train.py:1204] (2/4) Exclude cut with ID _61133_210_9969_1_1535196777227_3696409_333-270325-0 from training. Duration: 0.93 2023-10-04 17:54:12,342 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.feed_forward1.hidden_balancer.prob, batch_count=1741960.0, ans=0.125 2023-10-04 17:54:12,391 INFO [scaling.py:1118] (2/4) WithLoss: name=encoder.encoders.5.encoder.layers.0.self_attn_weights, loss-sum=0.000e+00 2023-10-04 17:54:13,325 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.769e+02 2.086e+02 2.281e+02 2.987e+02 4.437e+02, threshold=4.562e+02, percent-clipped=0.0 2023-10-04 17:54:16,206 WARNING [train.py:1204] (2/4) Exclude cut with ID _5172_210_1883_1_1527246062567_3577219_18-101380-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 17:54:16,220 WARNING [train.py:1204] (2/4) Exclude cut with ID _30800_210_22382_1_1532499347904_4515959_577-236858-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 17:54:18,171 WARNING [train.py:1204] (2/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_1006-177345-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 17:54:20,984 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6291_210_3273_1_1527683649_1078498_4-266348-0 from training. Duration: 0.78 2023-10-04 17:54:23,712 WARNING [train.py:1204] (2/4) Exclude cut with ID _55857_210_2346_1_1534644007831_6898209_745-98391-0 from training. Duration: 0.76 2023-10-04 17:54:25,348 INFO [scaling.py:1118] (2/4) WithLoss: name=encoder.encoders.2.encoder.layers.0.self_attn_weights, loss-sum=0.000e+00 2023-10-04 17:54:26,376 WARNING [train.py:1204] (2/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_375-244976-0 from training. Duration: 0.75 2023-10-04 17:54:26,438 WARNING [train.py:1204] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_4-248404-0_sp1.1 from training. Duration: 0.97275 2023-10-04 17:54:27,890 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8453_210_4211_1_1528502940_8562730_596-306820-0 from training. Duration: 0.97 2023-10-04 17:54:29,338 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_211-339622-0_sp0.9 from training. Duration: 0.5 2023-10-04 17:54:29,365 WARNING [train.py:1204] (2/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_393-57888-0 from training. Duration: 0.64 2023-10-04 17:54:31,411 WARNING [train.py:1204] (2/4) Exclude cut with ID _23825_210_13449_1_1531915313526_3497150_143-343575-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 17:54:32,738 WARNING [train.py:1204] (2/4) Exclude cut with ID _58605_210_12210_1_1534723195939_3904259_36-12256-0_sp1.1 from training. Duration: 0.936375 2023-10-04 17:54:39,048 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.bypass.skip_rate, batch_count=1742093.3333333333, ans=0.04949747468305833 2023-10-04 17:54:40,260 WARNING [train.py:1204] (2/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_38-67795-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 17:54:41,464 WARNING [train.py:1204] (2/4) Exclude cut with ID _64631_210_27767_1_1535349580068_4165160_308-327832-0_sp1.1 from training. Duration: 0.92725 2023-10-04 17:54:41,522 WARNING [train.py:1204] (2/4) Exclude cut with ID _37086_210_9038_1_1532999540891_7021250_726-81176-0_sp1.1 from training. Duration: 0.936375 2023-10-04 17:54:42,821 WARNING [train.py:1204] (2/4) Exclude cut with ID _47790_210_16187_1_1533862814066_4207519_47-141521-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 17:54:42,826 WARNING [train.py:1204] (2/4) Exclude cut with ID _3067_210_2667_1_1526212974640_4503168_250-285937-0 from training. Duration: 0.8 2023-10-04 17:54:42,851 WARNING [train.py:1204] (2/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_239-24000-0_sp1.1 from training. Duration: 0.97275 2023-10-04 17:54:44,167 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_3371_210_1794_1_1526384462_5580777_396-339812-0_sp0.9 from training. Duration: 0.9444375 2023-10-04 17:54:44,216 WARNING [train.py:1204] (2/4) Exclude cut with ID _16909_210_5724_1_1531292079912_4118919_580-173454-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 17:54:44,269 WARNING [train.py:1204] (2/4) Exclude cut with ID IC0124W0002-61308 from training. Duration: 0.982 2023-10-04 17:54:47,500 WARNING [train.py:1204] (2/4) Exclude cut with ID _31602_210_5854_1_1532677021456_4458250_249-96604-0 from training. Duration: 0.89 2023-10-04 17:54:47,577 WARNING [train.py:1204] (2/4) Exclude cut with ID _69584_210_5219_1_1535785133775_4044450_325-7656-0 from training. Duration: 0.86 2023-10-04 17:54:48,964 WARNING [train.py:1204] (2/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_478-162247-0 from training. Duration: 0.82 2023-10-04 17:54:51,553 WARNING [train.py:1204] (2/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_686-54383-0_sp1.1 from training. Duration: 0.7909375 2023-10-04 17:54:57,243 WARNING [train.py:1204] (2/4) Exclude cut with ID _29303_210_2575_1_1532392237303_7410649_85-270092-0_sp1.1 from training. Duration: 0.936375 2023-10-04 17:54:57,261 WARNING [train.py:1204] (2/4) Exclude cut with ID _2322_210_2667_1_1525604548971_3656299_440-80494-0_sp1.1 from training. Duration: 0.8 2023-10-04 17:54:57,295 WARNING [train.py:1204] (2/4) Exclude cut with ID _47346_210_9476_1_1533815971553_3536160_201-40644-0_sp1.1 from training. Duration: 0.936375 2023-10-04 17:54:58,712 WARNING [train.py:1204] (2/4) Exclude cut with ID _56235_210_10640_1_1534557335235_4161099_343-139898-0_sp1.1 from training. Duration: 0.87275 2023-10-04 17:55:01,756 WARNING [train.py:1204] (2/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_1012-279473-0 from training. Duration: 0.67 2023-10-04 17:55:03,728 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7931_210_1794_1_1528594728_5564059_280-279560-0_sp1.1 from training. Duration: 0.8909375 2023-10-04 17:55:05,095 WARNING [train.py:1204] (2/4) Exclude cut with ID _8263_210_1664_1_1528439452891_970220_68-181805-0 from training. Duration: 0.64 2023-10-04 17:55:05,143 WARNING [train.py:1204] (2/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_605-133395-0 from training. Duration: 0.66 2023-10-04 17:55:05,244 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_43-171109-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 17:55:05,248 WARNING [train.py:1204] (2/4) Exclude cut with ID _40094_210_16380_1_1533294005773_3641159_107-280739-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 17:55:09,883 WARNING [train.py:1204] (2/4) Exclude cut with ID _14821_210_8750_1_1530680503991_6829359_2-227151-0_sp0.9 from training. Duration: 0.92225 2023-10-04 17:55:11,301 WARNING [train.py:1204] (2/4) Exclude cut with ID _14268_210_9206_1_1530622771285_4714099_331-105351-0_sp1.1 from training. Duration: 0.5909375 2023-10-04 17:55:12,655 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_26326_210_7925_1_1533002493_3592406_451-82741-0_sp1.1 from training. Duration: 0.97275 2023-10-04 17:55:13,945 INFO [train.py:1046] (2/4) Epoch 50, batch 1050, loss[loss=0.1428, simple_loss=0.2251, pruned_loss=0.03026, over 24363.00 frames. ], tot_loss[loss=0.1515, simple_loss=0.2313, pruned_loss=0.03583, over 4670578.19 frames. ], batch size: 61, lr: 2.03e-03, grad_scale: 8.0 2023-10-04 17:55:16,145 WARNING [train.py:1204] (2/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_191-59784-0_sp0.9 from training. Duration: 0.97775 2023-10-04 17:55:17,436 WARNING [train.py:1204] (2/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_445-117398-0_sp1.1 from training. Duration: 0.8090625 2023-10-04 17:55:18,912 WARNING [train.py:1204] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_605-257205-0_sp0.9 from training. Duration: 0.6555625 2023-10-04 17:55:18,975 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_50050_210_16223_1_1535435796_7290138_592-258841-0_sp1.1 from training. Duration: 0.936375 2023-10-04 17:55:21,660 WARNING [train.py:1204] (2/4) Exclude cut with ID _14348_210_8341_1_1530599434996_3778640_240-232370-0_sp0.9 from training. Duration: 0.9333125 2023-10-04 17:55:23,016 WARNING [train.py:1204] (2/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_239-316802-0_sp0.9 from training. Duration: 0.7666875 2023-10-04 17:55:23,136 WARNING [train.py:1204] (2/4) Exclude cut with ID _45560_210_21982_1_1533812603951_3584999_111-6376-0_sp1.1 from training. Duration: 0.763625 2023-10-04 17:55:27,259 WARNING [train.py:1204] (2/4) Exclude cut with ID _54189_210_16380_1_1534332670039_3911710_606-311481-0_sp1.1 from training. Duration: 0.963625 2023-10-04 17:55:28,546 WARNING [train.py:1204] (2/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_237-332027-0_sp1.1 from training. Duration: 0.863625 2023-10-04 17:55:28,578 WARNING [train.py:1204] (2/4) Exclude cut with ID _66070_210_7640_1_1535441393096_7805310_489-292631-0_sp0.9 from training. Duration: 0.87775 2023-10-04 17:55:30,400 WARNING [train.py:1204] (2/4) Exclude cut with ID _49728_210_12651_1_1534041034294_6788150_624-195301-0_sp1.1 from training. Duration: 0.77275 2023-10-04 17:55:30,460 WARNING [train.py:1204] (2/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_44-79427-0 from training. Duration: 0.98 2023-10-04 17:55:32,221 WARNING [train.py:1204] (2/4) Exclude cut with ID _35350_210_16040_1_1533040191183_4075299_490-198354-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 17:55:33,550 WARNING [train.py:1204] (2/4) Exclude cut with ID _5166_210_2084_1_1527073197160_4087993_309-222545-0 from training. Duration: 0.84 2023-10-04 17:55:36,236 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_13091_210_7925_1_1530609447_8987354_790-199469-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 17:55:36,242 WARNING [train.py:1204] (2/4) Exclude cut with ID _21632_210_2253_1_1531994001765_3744140_110-274654-0 from training. Duration: 0.82 2023-10-04 17:55:36,249 WARNING [train.py:1204] (2/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_498-40153-0_sp0.9 from training. Duration: 0.7 2023-10-04 17:55:42,343 WARNING [train.py:1204] (2/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_363-282909-0_sp1.1 from training. Duration: 0.936375 2023-10-04 17:55:43,710 WARNING [train.py:1204] (2/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_178-177582-0_sp0.9 from training. Duration: 0.82225 2023-10-04 17:55:44,937 WARNING [train.py:1204] (2/4) Exclude cut with ID _24612_210_8913_1_1531998054193_3806359_357-35487-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 17:55:46,466 WARNING [train.py:1204] (2/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_138-332773-0 from training. Duration: 0.81 2023-10-04 17:55:46,493 WARNING [train.py:1204] (2/4) Exclude cut with ID _7624_210_1531_1_1528109802198_4933101_418-349834-0 from training. Duration: 0.6 2023-10-04 17:55:46,513 WARNING [train.py:1204] (2/4) Exclude cut with ID _7537_210_2084_1_1528502395431_3924359_14-312906-0_sp0.9 from training. Duration: 0.9333125 2023-10-04 17:55:51,186 WARNING [train.py:1204] (2/4) Exclude cut with ID _60057_210_10640_1_1534903065793_3333150_362-230377-0 from training. Duration: 0.87 2023-10-04 17:55:55,203 WARNING [train.py:1204] (2/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_138-64210-0 from training. Duration: 0.73 2023-10-04 17:55:55,272 WARNING [train.py:1204] (2/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_454-339504-0_sp1.1 from training. Duration: 0.97275 2023-10-04 17:55:58,063 WARNING [train.py:1204] (2/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_508-335369-0_sp0.9 from training. Duration: 0.5333125 2023-10-04 17:55:59,488 WARNING [train.py:1204] (2/4) Exclude cut with ID _5608_210_2106_1_1527415439089_6819768_909-82767-0_sp0.9 from training. Duration: 0.67775 2023-10-04 17:55:59,520 WARNING [train.py:1204] (2/4) Exclude cut with ID _6694_210_4599_1_1527852637490_3965529_99-11611-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 17:56:00,813 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_2655_210_2087_1_1525688959_3897400_52-279899-0_sp1.1 from training. Duration: 0.62725 2023-10-04 17:56:01,053 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.balancer1.prob, batch_count=1742493.3333333333, ans=0.125 2023-10-04 17:56:01,146 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.conv_module2.balancer2.prob, batch_count=1742493.3333333333, ans=0.125 2023-10-04 17:56:04,418 WARNING [train.py:1204] (2/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_375-245677-0_sp1.1 from training. Duration: 0.736375 2023-10-04 17:56:07,040 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_23850_210_4238_1_1532232597_1632433_32-185135-0 from training. Duration: 0.86 2023-10-04 17:56:07,157 WARNING [train.py:1204] (2/4) Exclude cut with ID _15113_210_8254_1_1530842438579_3716279_209-54533-0 from training. Duration: 0.96 2023-10-04 17:56:08,428 WARNING [train.py:1204] (2/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_401-53423-0 from training. Duration: 0.55 2023-10-04 17:56:08,466 WARNING [train.py:1204] (2/4) Exclude cut with ID _14867_210_1968_1_1530684179890_3846969_319-250868-0_sp1.1 from training. Duration: 0.9 2023-10-04 17:56:08,487 WARNING [train.py:1204] (2/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_106-30782-0_sp0.9 from training. Duration: 0.888875 2023-10-04 17:56:10,499 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_5960_210_1794_1_1527403865_3990999_515-2613-0 from training. Duration: 0.88 2023-10-04 17:56:10,667 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.balancer1.prob, batch_count=1742493.3333333333, ans=0.125 2023-10-04 17:56:13,298 WARNING [train.py:1204] (2/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_798-106179-0_sp0.9 from training. Duration: 0.92225 2023-10-04 17:56:14,595 WARNING [train.py:1204] (2/4) Exclude cut with ID _14227_210_4381_1_1530701804286_3940259_125-275642-0_sp1.1 from training. Duration: 0.9 2023-10-04 17:56:14,598 WARNING [train.py:1204] (2/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_79-200392-0_sp1.1 from training. Duration: 0.8818125 2023-10-04 17:56:16,486 WARNING [train.py:1204] (2/4) Exclude cut with ID _48836_210_12210_1_1534075848293_3904150_429-35475-0_sp0.9 from training. Duration: 0.988875 2023-10-04 17:56:16,501 WARNING [train.py:1204] (2/4) Exclude cut with ID _69884_210_9771_1_1535846392818_4166169_224-172517-0_sp1.1 from training. Duration: 0.97275 2023-10-04 17:56:16,804 INFO [scaling.py:1118] (2/4) WithLoss: name=encoder.encoders.3.encoder.layers.2.self_attn_weights, loss-sum=0.000e+00 2023-10-04 17:56:20,678 WARNING [train.py:1204] (2/4) Exclude cut with ID _15113_210_8254_1_1530842438579_3716279_76-232507-0_sp1.1 from training. Duration: 0.97275 2023-10-04 17:56:20,680 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_135-5501-0 from training. Duration: 0.98 2023-10-04 17:56:20,793 WARNING [train.py:1204] (2/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_563-347832-0_sp0.9 from training. Duration: 0.988875 2023-10-04 17:56:20,803 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6786_210_1388_1_1527839699_4194495_273-149841-0 from training. Duration: 0.92 2023-10-04 17:56:22,150 WARNING [train.py:1204] (2/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_172-215932-0 from training. Duration: 0.66 2023-10-04 17:56:23,570 WARNING [train.py:1204] (2/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_939-132910-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 17:56:25,701 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.3.encoder.layers.3.feed_forward2.out_whiten, num_groups=1, num_channels=512, metric=4.92 vs. limit=15.0 2023-10-04 17:56:26,408 WARNING [train.py:1204] (2/4) Exclude cut with ID _49371_210_12071_1_1533952761021_3812190_16-246001-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 17:56:27,628 INFO [train.py:1046] (2/4) Epoch 50, batch 1100, loss[loss=0.1468, simple_loss=0.2198, pruned_loss=0.03686, over 22753.00 frames. ], tot_loss[loss=0.1509, simple_loss=0.2308, pruned_loss=0.03552, over 4674289.98 frames. ], batch size: 322, lr: 2.03e-03, grad_scale: 8.0 2023-10-04 17:56:30,960 WARNING [train.py:1204] (2/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_680-189165-0_sp1.1 from training. Duration: 0.936375 2023-10-04 17:56:31,137 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.feed_forward1.hidden_balancer.prob, batch_count=1742626.6666666667, ans=0.125 2023-10-04 17:56:36,952 WARNING [train.py:1204] (2/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_53-300544-0_sp1.1 from training. Duration: 0.6090625 2023-10-04 17:56:37,099 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.0.ff3_skip_rate, batch_count=1742626.6666666667, ans=0.0 2023-10-04 17:56:38,306 WARNING [train.py:1204] (2/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_540-275685-0_sp1.1 from training. Duration: 0.8090625 2023-10-04 17:56:39,561 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_38965_210_4852_1_1533174906_3484735_453-106579-0_sp1.1 from training. Duration: 0.92725 2023-10-04 17:56:39,616 WARNING [train.py:1204] (2/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_105-328174-0 from training. Duration: 0.76 2023-10-04 17:56:39,713 WARNING [train.py:1204] (2/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_279-126205-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 17:56:41,302 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.636e+02 2.164e+02 2.522e+02 3.230e+02 4.448e+02, threshold=5.045e+02, percent-clipped=0.0 2023-10-04 17:56:43,040 WARNING [train.py:1204] (2/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_5-55893-0_sp0.9 from training. Duration: 0.611125 2023-10-04 17:56:46,445 WARNING [train.py:1204] (2/4) Exclude cut with ID _13678_210_7497_1_1530530069226_3463170_141-10835-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 17:56:48,948 WARNING [train.py:1204] (2/4) Exclude cut with ID _22083_210_8254_1_1531702862439_3678970_201-85738-0_sp1.1 from training. Duration: 0.8181875 2023-10-04 17:56:48,985 WARNING [train.py:1204] (2/4) Exclude cut with ID _4697_210_2069_1_1526815801953_3823098_51-1049-0 from training. Duration: 0.75 2023-10-04 17:56:49,063 WARNING [train.py:1204] (2/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_206-10097-0_sp0.9 from training. Duration: 0.5444375 2023-10-04 17:56:51,760 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_4528_210_3273_1_1527076654_3276777_360-63661-0_sp1.1 from training. Duration: 0.92725 2023-10-04 17:56:51,776 WARNING [train.py:1204] (2/4) Exclude cut with ID _64086_210_7994_1_1535270417019_7435949_449-31567-0_sp1.1 from training. Duration: 0.8818125 2023-10-04 17:56:54,471 WARNING [train.py:1204] (2/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_245-174553-0_sp0.9 from training. Duration: 0.97775 2023-10-04 17:56:55,905 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6545_210_2040_1_1527774670_4245258_379-14182-0_sp1.1 from training. Duration: 0.72725 2023-10-04 17:57:00,596 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_3133_210_2087_1_1526106446_2324690_256-206070-0_sp1.1 from training. Duration: 0.963625 2023-10-04 17:57:03,980 WARNING [train.py:1204] (2/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_278-104676-0 from training. Duration: 0.98 2023-10-04 17:57:05,241 WARNING [train.py:1204] (2/4) Exclude cut with ID IC0671W0065-331908 from training. Duration: 0.896 2023-10-04 17:57:05,296 WARNING [train.py:1204] (2/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_244-129917-0_sp1.1 from training. Duration: 0.97275 2023-10-04 17:57:08,015 WARNING [train.py:1204] (2/4) Exclude cut with ID _7852_210_5219_1_1528286471316_8139400_1129-170857-0_sp1.1 from training. Duration: 0.97275 2023-10-04 17:57:09,441 WARNING [train.py:1204] (2/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_443-30034-0_sp0.9 from training. Duration: 0.711125 2023-10-04 17:57:10,671 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_31583_210_3852_1_1532595851_4024413_250-332999-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 17:57:10,782 WARNING [train.py:1204] (2/4) Exclude cut with ID _8772_210_1850_1_1528635595972_3664011_247-336448-0 from training. Duration: 0.74 2023-10-04 17:57:10,846 WARNING [train.py:1204] (2/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_567-135114-0_sp0.9 from training. Duration: 0.9666875 2023-10-04 17:57:12,174 WARNING [train.py:1204] (2/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_557-349534-0_sp1.1 from training. Duration: 0.82725 2023-10-04 17:57:12,193 WARNING [train.py:1204] (2/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_1341-26728-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 17:57:13,548 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_40222_210_6228_1_1533385298_4481939_375-328051-0_sp1.1 from training. Duration: 0.97275 2023-10-04 17:57:13,581 WARNING [train.py:1204] (2/4) Exclude cut with ID _4697_210_2069_1_1526815801953_3823098_276-96593-0 from training. Duration: 0.77 2023-10-04 17:57:18,453 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_53071_210_16554_1_1534490405_2954066_265-170472-0_sp0.9 from training. Duration: 0.92225 2023-10-04 17:57:18,471 WARNING [train.py:1204] (2/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_365-1492-0 from training. Duration: 0.91 2023-10-04 17:57:20,492 WARNING [train.py:1204] (2/4) Exclude cut with ID _66873_210_5517_1_1535454136506_4293370_654-52713-0_sp0.9 from training. Duration: 0.8555625 2023-10-04 17:57:23,184 INFO [scaling.py:1118] (2/4) WithLoss: name=encoder.encoders.1.encoder.layers.0.self_attn_weights, loss-sum=0.000e+00 2023-10-04 17:57:24,353 WARNING [train.py:1204] (2/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_113-92608-0_sp1.1 from training. Duration: 0.4909375 2023-10-04 17:57:25,767 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_46013_210_7234_1_1533776362_3744032_50-141535-0 from training. Duration: 0.99 2023-10-04 17:57:25,775 WARNING [train.py:1204] (2/4) Exclude cut with ID _13942_210_9988_1_1530604906036_3733360_499-55676-0_sp1.1 from training. Duration: 0.563625 2023-10-04 17:57:25,997 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.bypass.skip_rate, batch_count=1742893.3333333333, ans=0.07 2023-10-04 17:57:27,785 WARNING [train.py:1204] (2/4) Exclude cut with ID _30800_210_22382_1_1532499347904_4515959_375-65143-0_sp1.1 from training. Duration: 0.97275 2023-10-04 17:57:29,314 WARNING [train.py:1204] (2/4) Exclude cut with ID _16756_210_4915_1_1531047803763_7104259_199-325608-0_sp1.1 from training. Duration: 0.92725 2023-10-04 17:57:29,349 WARNING [train.py:1204] (2/4) Exclude cut with ID _31649_210_6228_1_1532584608063_4091078_443-124391-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 17:57:32,456 WARNING [train.py:1204] (2/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_146-218763-0 from training. Duration: 0.86 2023-10-04 17:57:32,507 WARNING [train.py:1204] (2/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_567-135114-0_sp1.1 from training. Duration: 0.7909375 2023-10-04 17:57:33,898 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_26326_210_7925_1_1533002493_3592406_462-288299-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 17:57:35,097 WARNING [train.py:1204] (2/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_322-217220-0 from training. Duration: 0.86 2023-10-04 17:57:35,131 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_238-256668-0_sp1.1 from training. Duration: 0.62725 2023-10-04 17:57:35,172 WARNING [train.py:1204] (2/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_371-18169-0 from training. Duration: 0.86 2023-10-04 17:57:36,579 WARNING [train.py:1204] (2/4) Exclude cut with ID _58480_210_12392_1_1534811121111_3646160_23-74512-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 17:57:36,584 WARNING [train.py:1204] (2/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_784-213099-0_sp1.1 from training. Duration: 0.8454375 2023-10-04 17:57:39,217 WARNING [train.py:1204] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_625-102955-0_sp1.1 from training. Duration: 0.7 2023-10-04 17:57:41,899 INFO [train.py:1046] (2/4) Epoch 50, batch 1150, loss[loss=0.1554, simple_loss=0.2419, pruned_loss=0.03444, over 24026.00 frames. ], tot_loss[loss=0.1515, simple_loss=0.2317, pruned_loss=0.03561, over 4689347.56 frames. ], batch size: 80, lr: 2.03e-03, grad_scale: 8.0 2023-10-04 17:57:44,705 WARNING [train.py:1204] (2/4) Exclude cut with ID _49748_210_24116_1_1533986971737_3158910_176-174327-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 17:57:48,046 WARNING [train.py:1204] (2/4) Exclude cut with ID _50310_210_15488_1_1534068043345_3641080_432-96140-0_sp1.1 from training. Duration: 0.936375 2023-10-04 17:57:49,528 WARNING [train.py:1204] (2/4) Exclude cut with ID _40402_210_18219_1_1533522324115_2953150_76-14910-0_sp1.1 from training. Duration: 0.92725 2023-10-04 17:57:49,560 WARNING [train.py:1204] (2/4) Exclude cut with ID _21783_210_11357_1_1531742458730_3762679_40-19458-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 17:57:51,449 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_14056_210_4163_1_1530604483_7572827_46-93547-0 from training. Duration: 0.95 2023-10-04 17:57:51,487 WARNING [train.py:1204] (2/4) Exclude cut with ID _21631_210_2253_1_1531907880778_3160339_321-35325-0_sp1.1 from training. Duration: 0.963625 2023-10-04 17:57:54,210 WARNING [train.py:1204] (2/4) Exclude cut with ID _4779_210_2084_1_1527318022944_3914893_500-285013-0 from training. Duration: 0.69 2023-10-04 17:57:55,575 WARNING [train.py:1204] (2/4) Exclude cut with ID _43552_210_8913_1_1533726031059_3523429_301-93324-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 17:57:56,227 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.conv_module2.whiten.whitening_limit, batch_count=1743026.6666666667, ans=15.0 2023-10-04 17:57:56,847 WARNING [train.py:1204] (2/4) Exclude cut with ID _6473_210_1741_1_1527901307396_3815380_108-17335-0_sp1.1 from training. Duration: 0.5909375 2023-10-04 17:58:01,631 WARNING [train.py:1204] (2/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_206-10097-0 from training. Duration: 0.49 2023-10-04 17:58:01,767 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.feed_forward1.hidden_balancer.prob, batch_count=1743026.6666666667, ans=0.125 2023-10-04 17:58:04,444 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_36756_210_7234_1_1533026991_966276_103-77513-0_sp1.1 from training. Duration: 0.97275 2023-10-04 17:58:07,889 WARNING [train.py:1204] (2/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_662-18193-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 17:58:09,230 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_26433_210_12108_1_1532157158_4056480_439-52438-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 17:58:09,895 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.2.encoder.layers.0.self_attn2.whiten, num_groups=1, num_channels=384, metric=17.34 vs. limit=22.5 2023-10-04 17:58:10,547 WARNING [train.py:1204] (2/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_464-33931-0 from training. Duration: 0.54 2023-10-04 17:58:10,553 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_3474_210_2028_1_1526188513_4825246_145-309131-0_sp0.9 from training. Duration: 0.82225 2023-10-04 17:58:10,570 WARNING [train.py:1204] (2/4) Exclude cut with ID _9679_210_7497_1_1529129212906_4363929_446-308321-0_sp1.1 from training. Duration: 0.963625 2023-10-04 17:58:12,150 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.1.conv_module2.balancer1.prob, batch_count=1743093.3333333333, ans=0.125 2023-10-04 17:58:13,679 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.conv_skip_rate, batch_count=1743093.3333333333, ans=0.0 2023-10-04 17:58:14,817 WARNING [train.py:1204] (2/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_662-275621-0 from training. Duration: 0.42 2023-10-04 17:58:16,229 WARNING [train.py:1204] (2/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_344-337148-0_sp1.1 from training. Duration: 0.97275 2023-10-04 17:58:16,421 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.feed_forward1.hidden_balancer.prob, batch_count=1743093.3333333333, ans=0.125 2023-10-04 17:58:17,616 WARNING [train.py:1204] (2/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_759-179071-0_sp1.1 from training. Duration: 0.92725 2023-10-04 17:58:24,301 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.1.ff2_skip_rate, batch_count=1743093.3333333333, ans=0.0 2023-10-04 17:58:25,427 WARNING [train.py:1204] (2/4) Exclude cut with ID _28783_210_7943_1_1532671164808_7155479_293-12188-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 17:58:31,330 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_17613_210_2151_1_1531137569_2366160_235-320304-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 17:58:31,355 WARNING [train.py:1204] (2/4) Exclude cut with ID _14349_210_8341_1_1530615710818_3442470_170-336541-0 from training. Duration: 0.86 2023-10-04 17:58:32,679 WARNING [train.py:1204] (2/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_375-218677-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 17:58:32,725 WARNING [train.py:1204] (2/4) Exclude cut with ID _28317_210_12742_1_1532480302381_8510325_829-202956-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 17:58:38,707 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9029W0436-509823 from training. Duration: 0.768 2023-10-04 17:58:41,392 WARNING [train.py:1204] (2/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_231-112214-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 17:58:46,985 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9059W0470-524883 from training. Duration: 0.3415625 2023-10-04 17:58:49,855 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_43266_210_1794_1_1533521789_4497218_206-79512-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 17:58:51,224 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_294-163793-0_sp0.9 from training. Duration: 0.911125 2023-10-04 17:58:51,881 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6290_210_3273_1_1527595224_1282787_115-87095-0_sp1.1 from training. Duration: 0.5 2023-10-04 17:58:53,034 WARNING [train.py:1204] (2/4) Exclude cut with ID _14100_210_8341_1_1530529320613_3678069_279-55760-0_sp1.1 from training. Duration: 0.8454375 2023-10-04 17:58:56,182 INFO [train.py:1046] (2/4) Epoch 50, batch 1200, loss[loss=0.2026, simple_loss=0.2704, pruned_loss=0.06738, over 19115.00 frames. ], tot_loss[loss=0.1521, simple_loss=0.2326, pruned_loss=0.03579, over 4702565.62 frames. ], batch size: 388, lr: 2.03e-03, grad_scale: 16.0 2023-10-04 17:58:57,529 WARNING [train.py:1204] (2/4) Exclude cut with ID _14621_210_1925_1_1530686901185_3675420_419-234200-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 17:59:01,752 WARNING [train.py:1204] (2/4) Exclude cut with ID _60229_210_27767_1_1534838406158_3655350_353-213656-0_sp1.1 from training. Duration: 0.863625 2023-10-04 17:59:01,761 WARNING [train.py:1204] (2/4) Exclude cut with ID _48678_210_5663_1_1533988830947_3769199_306-255886-0_sp1.1 from training. Duration: 0.7 2023-10-04 17:59:05,171 WARNING [train.py:1204] (2/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_466-3492-0_sp1.1 from training. Duration: 0.97275 2023-10-04 17:59:05,171 WARNING [train.py:1204] (2/4) Exclude cut with ID _8755_210_1876_1_1528628559357_3258209_199-242167-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 17:59:06,479 WARNING [train.py:1204] (2/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_237-89684-0_sp1.1 from training. Duration: 0.87275 2023-10-04 17:59:07,925 WARNING [train.py:1204] (2/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_70-84121-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 17:59:09,092 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.691e+02 2.143e+02 2.355e+02 2.830e+02 4.818e+02, threshold=4.710e+02, percent-clipped=0.0 2023-10-04 17:59:09,270 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7544_210_6753_1_1528587722_3576686_172-171750-0_sp1.1 from training. Duration: 0.5909375 2023-10-04 17:59:11,260 WARNING [train.py:1204] (2/4) Exclude cut with ID _24271_210_12658_1_1531987270022_7190519_40-321822-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 17:59:11,284 WARNING [train.py:1204] (2/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_227-83612-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 17:59:12,775 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9156W0015-572508 from training. Duration: 0.896 2023-10-04 17:59:15,554 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_292-265928-0 from training. Duration: 0.51 2023-10-04 17:59:17,046 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=1743360.0, ans=0.1 2023-10-04 17:59:19,639 WARNING [train.py:1204] (2/4) Exclude cut with ID _14747_210_8341_1_1530685797463_3654089_147-131229-0_sp1.1 from training. Duration: 0.6909375 2023-10-04 17:59:22,335 WARNING [train.py:1204] (2/4) Exclude cut with ID _45297_210_2721_1_1533715351381_3264949_351-147136-0_sp0.9 from training. Duration: 0.9666875 2023-10-04 17:59:23,647 WARNING [train.py:1204] (2/4) Exclude cut with ID _5250_210_5219_1_1527249561806_8692110_1102-324568-0_sp1.1 from training. Duration: 0.97275 2023-10-04 17:59:25,807 WARNING [train.py:1204] (2/4) Exclude cut with ID _18516_210_9206_1_1531704653206_3692406_465-75446-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 17:59:25,808 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9203W0126-595623 from training. Duration: 0.896 2023-10-04 17:59:27,085 WARNING [train.py:1204] (2/4) Exclude cut with ID _55383_210_10635_1_1534468426172_2231669_139-328410-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 17:59:36,312 WARNING [train.py:1204] (2/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_166-246557-0_sp0.9 from training. Duration: 0.711125 2023-10-04 17:59:36,317 WARNING [train.py:1204] (2/4) Exclude cut with ID _30039_210_13270_1_1532484612900_2725150_239-304872-0_sp1.1 from training. Duration: 0.963625 2023-10-04 17:59:36,347 WARNING [train.py:1204] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_6-226750-0 from training. Duration: 0.94 2023-10-04 17:59:36,418 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_135-5501-0_sp1.1 from training. Duration: 0.8909375 2023-10-04 17:59:38,053 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.conv_skip_rate, batch_count=1743426.6666666667, ans=0.0 2023-10-04 17:59:41,089 WARNING [train.py:1204] (2/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_359-118926-0 from training. Duration: 0.72 2023-10-04 17:59:45,277 WARNING [train.py:1204] (2/4) Exclude cut with ID _8064_210_5399_1_1528372299291_4282973_4-91037-0 from training. Duration: 0.88 2023-10-04 17:59:45,305 WARNING [train.py:1204] (2/4) Exclude cut with ID _6653_210_2629_1_1527919172441_6732679_415-288492-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 17:59:46,664 WARNING [train.py:1204] (2/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_544-287179-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 17:59:46,800 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.feed_forward2.hidden_balancer.prob, batch_count=1743493.3333333333, ans=0.125 2023-10-04 17:59:49,369 WARNING [train.py:1204] (2/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_122-32606-0_sp1.1 from training. Duration: 0.92725 2023-10-04 17:59:49,432 WARNING [train.py:1204] (2/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_121-284152-0_sp1.1 from training. Duration: 0.536375 2023-10-04 17:59:50,778 WARNING [train.py:1204] (2/4) Exclude cut with ID _44718_210_5816_1_1534050051869_6469030_350-299477-0_sp1.1 from training. Duration: 0.97275 2023-10-04 17:59:50,788 WARNING [train.py:1204] (2/4) Exclude cut with ID _6653_210_2629_1_1527919172441_6732679_1103-56445-0_sp0.9 from training. Duration: 0.811125 2023-10-04 17:59:50,939 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.1.conv_module1.balancer2.prob, batch_count=1743493.3333333333, ans=0.125 2023-10-04 17:59:52,157 WARNING [train.py:1204] (2/4) Exclude cut with ID _12369_210_4381_1_1530356078581_4388520_64-298917-0_sp0.9 from training. Duration: 0.97775 2023-10-04 17:59:52,207 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_2245_210_2715_1_1525511548_3540254_310-52483-0 from training. Duration: 0.88 2023-10-04 17:59:53,536 WARNING [train.py:1204] (2/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_401-158239-0_sp1.1 from training. Duration: 0.6454375 2023-10-04 17:59:53,579 WARNING [train.py:1204] (2/4) Exclude cut with ID _59502_210_12210_1_1535107024698_1835139_146-162495-0_sp1.1 from training. Duration: 0.836375 2023-10-04 17:59:53,595 WARNING [train.py:1204] (2/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_41-31157-0_sp0.9 from training. Duration: 0.6333125 2023-10-04 17:59:54,178 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.3.encoder.layers.0.nonlin_attention.whiten2, num_groups=1, num_channels=512, metric=5.77 vs. limit=15.0 2023-10-04 17:59:56,438 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_68717_210_3528_1_1535628683_1556759_95-36722-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 17:59:56,445 WARNING [train.py:1204] (2/4) Exclude cut with ID _30196_210_1664_1_1533708496522_7074140_654-162611-0_sp1.1 from training. Duration: 0.92725 2023-10-04 17:59:59,811 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_9302_210_2550_1_1528889962_4655416_638-301620-0_sp0.9 from training. Duration: 0.52225 2023-10-04 18:00:01,104 WARNING [train.py:1204] (2/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_478-162247-0_sp1.1 from training. Duration: 0.7454375 2023-10-04 18:00:04,358 WARNING [train.py:1204] (2/4) Exclude cut with ID _45297_210_2721_1_1533715351381_3264949_353-9541-0 from training. Duration: 0.97 2023-10-04 18:00:07,783 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9653W0190-675468 from training. Duration: 0.9935 2023-10-04 18:00:09,145 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_49730_210_13049_1_1534075294_1830676_214-84436-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 18:00:10,951 INFO [train.py:1046] (2/4) Epoch 50, batch 1250, loss[loss=0.1502, simple_loss=0.2429, pruned_loss=0.02872, over 24530.00 frames. ], tot_loss[loss=0.1523, simple_loss=0.233, pruned_loss=0.03573, over 4716799.37 frames. ], batch size: 71, lr: 2.03e-03, grad_scale: 16.0 2023-10-04 18:00:12,355 WARNING [train.py:1204] (2/4) Exclude cut with ID _50093_210_4220_1_1534417141207_3432299_259-322252-0_sp1.1 from training. Duration: 0.836375 2023-10-04 18:00:13,710 WARNING [train.py:1204] (2/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_599-50220-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 18:00:14,427 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.3.encoder.layers.0.self_attn_weights.whiten_keys, num_groups=8, num_channels=256, metric=4.36 vs. limit=6.0 2023-10-04 18:00:15,093 WARNING [train.py:1204] (2/4) Exclude cut with ID _21878_210_3525_1_1531738772002_7264330_174-109016-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 18:00:16,536 WARNING [train.py:1204] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_665-196155-0 from training. Duration: 0.67 2023-10-04 18:00:20,636 WARNING [train.py:1204] (2/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_656-188361-0_sp1.1 from training. Duration: 0.7909375 2023-10-04 18:00:20,702 WARNING [train.py:1204] (2/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_60-18123-0_sp1.1 from training. Duration: 0.8 2023-10-04 18:00:22,175 WARNING [train.py:1204] (2/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_535-14410-0 from training. Duration: 0.47 2023-10-04 18:00:24,824 WARNING [train.py:1204] (2/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_94-126128-0_sp1.1 from training. Duration: 0.7818125 2023-10-04 18:00:26,258 WARNING [train.py:1204] (2/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_356-31216-0_sp1.1 from training. Duration: 0.6818125 2023-10-04 18:00:26,526 INFO [scaling.py:1118] (2/4) WithLoss: name=encoder.encoders.4.encoder.layers.2.self_attn_weights, loss-sum=0.000e+00 2023-10-04 18:00:27,814 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.bypass_mid.scale_min, batch_count=1743693.3333333333, ans=0.2 2023-10-04 18:00:30,937 WARNING [train.py:1204] (2/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_443-30034-0_sp1.1 from training. Duration: 0.5818125 2023-10-04 18:00:31,019 WARNING [train.py:1204] (2/4) Exclude cut with ID _26907_210_7234_1_1532221740694_3205263_337-2494-0_sp1.1 from training. Duration: 0.8 2023-10-04 18:00:32,370 WARNING [train.py:1204] (2/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_49-97158-0_sp0.9 from training. Duration: 0.8444375 2023-10-04 18:00:32,387 WARNING [train.py:1204] (2/4) Exclude cut with ID _7877_210_6014_1_1528286358099_3750136_553-170128-0_sp1.1 from training. Duration: 0.963625 2023-10-04 18:00:35,450 WARNING [train.py:1204] (2/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_202-293523-0_sp0.9 from training. Duration: 0.8 2023-10-04 18:00:38,303 WARNING [train.py:1204] (2/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_188-171673-0_sp0.9 from training. Duration: 0.6444375 2023-10-04 18:00:38,326 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_120-41668-0_sp1.1 from training. Duration: 0.636375 2023-10-04 18:00:38,337 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_40223_210_6228_1_1533298404_4812267_613-78769-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 18:00:40,192 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_53861_210_5399_1_1534422156_4272077_150-336899-0_sp1.1 from training. Duration: 0.963625 2023-10-04 18:00:41,554 WARNING [train.py:1204] (2/4) Exclude cut with ID _14459_210_2228_1_1531443641946_7516700_1146-56843-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 18:00:41,881 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.conv_module1.balancer2.min_positive, batch_count=1743760.0, ans=0.05 2023-10-04 18:00:43,191 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.conv_module1.balancer1.max_abs, batch_count=1743760.0, ans=10.0 2023-10-04 18:00:43,589 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.3.encoder.layers.1.self_attn2.whiten, num_groups=1, num_channels=512, metric=12.58 vs. limit=22.5 2023-10-04 18:00:44,262 WARNING [train.py:1204] (2/4) Exclude cut with ID _39867_210_9826_1_1533553222030_9344259_1194-299928-0_sp1.1 from training. Duration: 0.92725 2023-10-04 18:00:45,776 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.balancer1.prob, batch_count=1743760.0, ans=0.125 2023-10-04 18:00:46,822 WARNING [train.py:1204] (2/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_214-291369-0_sp0.9 from training. Duration: 0.7 2023-10-04 18:00:47,165 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=1743760.0, ans=0.1 2023-10-04 18:00:51,227 WARNING [train.py:1204] (2/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_358-215558-0 from training. Duration: 0.93 2023-10-04 18:00:51,275 WARNING [train.py:1204] (2/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_577-287150-0_sp0.9 from training. Duration: 0.911125 2023-10-04 18:00:54,025 WARNING [train.py:1204] (2/4) Exclude cut with ID _60860_210_8254_1_1534896015875_3557169_229-187367-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 18:00:54,081 WARNING [train.py:1204] (2/4) Exclude cut with ID _6275_210_2084_1_1528110062213_7669049_857-298942-0 from training. Duration: 0.85 2023-10-04 18:00:54,129 WARNING [train.py:1204] (2/4) Exclude cut with ID _40137_210_12210_1_1533290484046_3795180_249-187059-0_sp1.1 from training. Duration: 0.8 2023-10-04 18:00:54,136 WARNING [train.py:1204] (2/4) Exclude cut with ID ID0171W0508-765603 from training. Duration: 0.541 2023-10-04 18:00:54,159 WARNING [train.py:1204] (2/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_521-219439-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 18:00:55,560 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_40223_210_6228_1_1533298404_4812267_741-302616-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 18:00:57,241 INFO [scaling.py:1118] (2/4) WithLoss: name=encoder.encoders.4.encoder.layers.0.self_attn_weights, loss-sum=0.000e+00 2023-10-04 18:00:58,437 WARNING [train.py:1204] (2/4) Exclude cut with ID _65293_210_9070_1_1535545549765_4004210_438-154735-0_sp1.1 from training. Duration: 0.92725 2023-10-04 18:01:01,771 WARNING [train.py:1204] (2/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_564-132950-0_sp1.1 from training. Duration: 0.92725 2023-10-04 18:01:01,816 WARNING [train.py:1204] (2/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_291-270881-0_sp1.1 from training. Duration: 0.8818125 2023-10-04 18:01:03,179 WARNING [train.py:1204] (2/4) Exclude cut with ID _60229_210_27767_1_1534838406158_3655350_353-213656-0 from training. Duration: 0.95 2023-10-04 18:01:03,197 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_3133_210_2087_1_1526106446_2324690_243-143119-0 from training. Duration: 0.77 2023-10-04 18:01:03,218 WARNING [train.py:1204] (2/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_690-126306-0 from training. Duration: 0.78 2023-10-04 18:01:05,924 WARNING [train.py:1204] (2/4) Exclude cut with ID _30304_210_6319_1_1532476827261_7582060_347-95003-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 18:01:09,206 WARNING [train.py:1204] (2/4) Exclude cut with ID _2322_210_2667_1_1525604548971_3656299_440-80494-0 from training. Duration: 0.88 2023-10-04 18:01:09,228 WARNING [train.py:1204] (2/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_370-229434-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 18:01:12,534 WARNING [train.py:1204] (2/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_124-345840-0_sp1.1 from training. Duration: 0.47275 2023-10-04 18:01:12,545 WARNING [train.py:1204] (2/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_165-123581-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 18:01:13,928 WARNING [train.py:1204] (2/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_453-282588-0 from training. Duration: 0.98 2023-10-04 18:01:13,953 WARNING [train.py:1204] (2/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_299-34413-0_sp0.9 from training. Duration: 0.72225 2023-10-04 18:01:13,998 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_40036_210_13405_1_1533648647_1315388_48-146016-0_sp1.1 from training. Duration: 0.8454375 2023-10-04 18:01:14,026 WARNING [train.py:1204] (2/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_99-167392-0_sp0.9 from training. Duration: 0.67775 2023-10-04 18:01:14,070 WARNING [train.py:1204] (2/4) Exclude cut with ID _48677_210_5663_1_1533902405919_3892989_37-207114-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 18:01:16,867 WARNING [train.py:1204] (2/4) Exclude cut with ID _29857_210_20034_1_1532395886753_3798160_335-163562-0 from training. Duration: 0.85 2023-10-04 18:01:19,554 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_36492_210_7927_1_1533261987_4790834_703-343351-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 18:01:21,083 WARNING [train.py:1204] (2/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_80-147301-0_sp0.9 from training. Duration: 0.9333125 2023-10-04 18:01:22,459 WARNING [train.py:1204] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_218-295097-0_sp0.9 from training. Duration: 0.8555625 2023-10-04 18:01:23,773 INFO [train.py:1046] (2/4) Epoch 50, batch 1300, loss[loss=0.1467, simple_loss=0.2263, pruned_loss=0.03354, over 24452.00 frames. ], tot_loss[loss=0.1526, simple_loss=0.2336, pruned_loss=0.03586, over 4722996.30 frames. ], batch size: 63, lr: 2.03e-03, grad_scale: 16.0 2023-10-04 18:01:25,132 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_2295_210_2087_1_1525500993_2407863_116-238894-0_sp1.1 from training. Duration: 0.636375 2023-10-04 18:01:27,948 WARNING [train.py:1204] (2/4) Exclude cut with ID _3231_210_2106_1_1526205864934_6784220_697-171068-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 18:01:27,978 WARNING [train.py:1204] (2/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_102-77064-0 from training. Duration: 0.93 2023-10-04 18:01:28,207 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.bypass.scale_min, batch_count=1743960.0, ans=0.2 2023-10-04 18:01:29,666 INFO [scaling.py:1118] (2/4) WithLoss: name=encoder.encoders.5.encoder.layers.0.self_attn_weights, loss-sum=5.260e-03 2023-10-04 18:01:30,948 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.0.feed_forward1.hidden_balancer.prob, batch_count=1743960.0, ans=0.125 2023-10-04 18:01:32,927 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_13091_210_7925_1_1530609447_8987354_1186-261957-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 18:01:34,334 WARNING [train.py:1204] (2/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_91-277684-0_sp1.1 from training. Duration: 0.67275 2023-10-04 18:01:34,421 WARNING [train.py:1204] (2/4) Exclude cut with ID _31088_210_16446_1_1532520012284_3440919_159-313751-0_sp1.1 from training. Duration: 0.936375 2023-10-04 18:01:35,814 WARNING [train.py:1204] (2/4) Exclude cut with ID _42326_210_7770_1_1533814282531_3707269_227-203721-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 18:01:37,115 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.751e+02 2.152e+02 2.609e+02 3.024e+02 4.878e+02, threshold=5.218e+02, percent-clipped=1.0 2023-10-04 18:01:37,219 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_3531_210_3528_1_1526294203_5216456_309-227302-0_sp1.1 from training. Duration: 0.62725 2023-10-04 18:01:38,947 WARNING [train.py:1204] (2/4) Exclude cut with ID _14087_210_2253_1_1530529224512_3014029_226-242140-0 from training. Duration: 0.81 2023-10-04 18:01:43,737 WARNING [train.py:1204] (2/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_122-109023-0_sp1.1 from training. Duration: 0.6909375 2023-10-04 18:01:46,360 WARNING [train.py:1204] (2/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_660-211088-0_sp0.9 from training. Duration: 0.688875 2023-10-04 18:01:47,715 WARNING [train.py:1204] (2/4) Exclude cut with ID _39290_210_4220_1_1533862778760_3075118_92-311600-0 from training. Duration: 0.88 2023-10-04 18:01:50,525 WARNING [train.py:1204] (2/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_390-339137-0_sp1.1 from training. Duration: 0.5090625 2023-10-04 18:01:53,385 WARNING [train.py:1204] (2/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_449-341946-0_sp1.1 from training. Duration: 0.963625 2023-10-04 18:01:54,668 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_68095_210_20185_1_1535718226_8888855_884-327007-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 18:01:56,044 WARNING [train.py:1204] (2/4) Exclude cut with ID _42326_210_7770_1_1533814282531_3707269_308-222998-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 18:01:56,130 WARNING [train.py:1204] (2/4) Exclude cut with ID _40399_210_4353_1_1533305658194_3279406_344-238708-0_sp1.1 from training. Duration: 0.963625 2023-10-04 18:01:57,551 WARNING [train.py:1204] (2/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_87-131427-0_sp1.1 from training. Duration: 0.7090625 2023-10-04 18:01:57,596 WARNING [train.py:1204] (2/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_110-130961-0_sp0.9 from training. Duration: 0.52225 2023-10-04 18:01:57,641 WARNING [train.py:1204] (2/4) Exclude cut with ID _55153_210_2253_1_1534384786677_3162096_148-79022-0 from training. Duration: 0.96 2023-10-04 18:02:04,304 WARNING [train.py:1204] (2/4) Exclude cut with ID _9679_210_7497_1_1529129212906_4363929_512-313889-0_sp1.1 from training. Duration: 0.736375 2023-10-04 18:02:04,323 WARNING [train.py:1204] (2/4) Exclude cut with ID _7970_210_1883_1_1528455616296_3472230_25-178407-0_sp1.1 from training. Duration: 0.6454375 2023-10-04 18:02:05,742 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_3133_210_2087_1_1526106446_2324690_246-125000-0 from training. Duration: 0.62 2023-10-04 18:02:07,006 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_15126_210_6753_1_1530778677_2591185_113-157651-0_sp0.9 from training. Duration: 0.7555625 2023-10-04 18:02:08,438 WARNING [train.py:1204] (2/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_105-63309-0_sp1.1 from training. Duration: 0.8909375 2023-10-04 18:02:09,888 WARNING [train.py:1204] (2/4) Exclude cut with ID _29877_210_6228_1_1532397791106_5276149_578-177271-0_sp1.1 from training. Duration: 0.936375 2023-10-04 18:02:11,818 WARNING [train.py:1204] (2/4) Exclude cut with ID _44831_210_13427_1_1533816011549_3689035_32-188147-0 from training. Duration: 0.96 2023-10-04 18:02:13,139 WARNING [train.py:1204] (2/4) Exclude cut with ID _12369_210_4381_1_1530356078581_4388520_300-254337-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 18:02:13,152 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_128-241008-0 from training. Duration: 0.94 2023-10-04 18:02:15,170 WARNING [train.py:1204] (2/4) Exclude cut with ID _21878_210_3525_1_1531738772002_7264330_53-279178-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 18:02:19,165 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_41452_210_7291_1_1533437687_3941281_273-334088-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 18:02:19,174 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_128-241008-0_sp1.1 from training. Duration: 0.8545625 2023-10-04 18:02:19,250 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder_embed.conv.5.prob, batch_count=1744160.0, ans=0.125 2023-10-04 18:02:23,316 WARNING [train.py:1204] (2/4) Exclude cut with ID _8394_210_6702_1_1528590504316_3540189_379-49457-0 from training. Duration: 0.87 2023-10-04 18:02:24,538 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_105-114797-0 from training. Duration: 0.84 2023-10-04 18:02:24,624 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_14928_210_5632_1_1530773461_4714000_454-112198-0 from training. Duration: 0.66 2023-10-04 18:02:29,289 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_456-22141-0_sp1.1 from training. Duration: 0.8 2023-10-04 18:02:32,018 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_115-233929-0 from training. Duration: 0.72 2023-10-04 18:02:32,782 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_616-16795-0_sp1.1 from training. Duration: 0.963625 2023-10-04 18:02:32,906 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.0.balancer2.prob, batch_count=1744226.6666666667, ans=0.125 2023-10-04 18:02:36,068 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.3.encoder.layers.0.feed_forward3.out_whiten, num_groups=1, num_channels=512, metric=10.37 vs. limit=15.0 2023-10-04 18:02:38,156 INFO [train.py:1046] (2/4) Epoch 50, batch 1350, loss[loss=0.1458, simple_loss=0.2317, pruned_loss=0.02991, over 24642.00 frames. ], tot_loss[loss=0.1527, simple_loss=0.2332, pruned_loss=0.03604, over 4723350.73 frames. ], batch size: 68, lr: 2.03e-03, grad_scale: 16.0 2023-10-04 18:02:39,665 WARNING [train.py:1204] (2/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_448-212135-0 from training. Duration: 0.91 2023-10-04 18:02:42,474 WARNING [train.py:1204] (2/4) Exclude cut with ID _46683_210_9642_1_1533730913492_4591116_201-205677-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 18:02:44,465 WARNING [train.py:1204] (2/4) Exclude cut with ID _64086_210_7994_1_1535270417019_7435949_43-80483-0_sp1.1 from training. Duration: 0.97275 2023-10-04 18:02:47,321 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_240-134124-0_sp1.1 from training. Duration: 0.963625 2023-10-04 18:02:47,344 WARNING [train.py:1204] (2/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_907-120940-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 18:02:47,448 WARNING [train.py:1204] (2/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_848-280957-0_sp1.1 from training. Duration: 0.7909375 2023-10-04 18:02:48,758 WARNING [train.py:1204] (2/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_243-42116-0_sp0.9 from training. Duration: 0.811125 2023-10-04 18:02:52,688 WARNING [train.py:1204] (2/4) Exclude cut with ID _6694_210_4599_1_1527852637490_3965529_116-109398-0_sp0.9 from training. Duration: 0.811125 2023-10-04 18:02:53,948 WARNING [train.py:1204] (2/4) Exclude cut with ID _24314_210_4915_1_1532135078116_7170179_521-258868-0 from training. Duration: 0.79 2023-10-04 18:02:55,927 WARNING [train.py:1204] (2/4) Exclude cut with ID _2220_210_1819_1_1525509109898_3658680_50-99837-0_sp0.9 from training. Duration: 0.6 2023-10-04 18:02:55,969 WARNING [train.py:1204] (2/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_313-134930-0_sp1.1 from training. Duration: 0.8818125 2023-10-04 18:02:59,927 WARNING [train.py:1204] (2/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_125-144174-0 from training. Duration: 0.39 2023-10-04 18:03:01,214 WARNING [train.py:1204] (2/4) Exclude cut with ID _59288_210_5854_1_1535077757774_3412246_275-293897-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 18:03:01,307 WARNING [train.py:1204] (2/4) Exclude cut with ID _40519_210_13794_1_1533715228655_7337390_722-52880-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 18:03:03,108 WARNING [train.py:1204] (2/4) Exclude cut with ID _7537_210_2084_1_1528502395431_3924359_128-268029-0 from training. Duration: 0.98 2023-10-04 18:03:04,485 WARNING [train.py:1204] (2/4) Exclude cut with ID _8021_210_7994_1_1528376451358_3653069_179-45206-0 from training. Duration: 0.86 2023-10-04 18:03:05,921 WARNING [train.py:1204] (2/4) Exclude cut with ID _13252_210_1925_1_1530671372047_3336909_88-276668-0 from training. Duration: 0.99 2023-10-04 18:03:07,499 WARNING [train.py:1204] (2/4) Exclude cut with ID _22613_210_5219_1_1532253599686_4090170_290-212862-0_sp1.1 from training. Duration: 0.97275 2023-10-04 18:03:07,504 WARNING [train.py:1204] (2/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_465-214726-0 from training. Duration: 0.86 2023-10-04 18:03:17,999 WARNING [train.py:1204] (2/4) Exclude cut with ID _56480_210_4220_1_1534679942079_3075409_138-36031-0_sp1.1 from training. Duration: 0.97275 2023-10-04 18:03:22,339 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.conv_module1.balancer1.prob, batch_count=1744493.3333333333, ans=0.125 2023-10-04 18:03:26,890 WARNING [train.py:1204] (2/4) Exclude cut with ID _58605_210_12210_1_1534723195939_3904259_300-285878-0_sp1.1 from training. Duration: 0.97275 2023-10-04 18:03:28,178 WARNING [train.py:1204] (2/4) Exclude cut with ID _40901_210_5665_1_1533363789958_3856130_114-340229-0_sp1.1 from training. Duration: 0.936375 2023-10-04 18:03:28,216 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_489-230703-0 from training. Duration: 0.85 2023-10-04 18:03:29,768 WARNING [train.py:1204] (2/4) Exclude cut with ID _66873_210_5517_1_1535454136506_4293370_322-81243-0_sp1.1 from training. Duration: 0.936375 2023-10-04 18:03:31,826 WARNING [train.py:1204] (2/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_271-269084-0 from training. Duration: 0.83 2023-10-04 18:03:31,837 WARNING [train.py:1204] (2/4) Exclude cut with ID _13252_210_1925_1_1530671372047_3336909_166-314607-0_sp0.9 from training. Duration: 0.6 2023-10-04 18:03:32,975 WARNING [train.py:1204] (2/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_340-343302-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 18:03:35,782 WARNING [train.py:1204] (2/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_286-112015-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 18:03:37,243 WARNING [train.py:1204] (2/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_328-264776-0 from training. Duration: 0.67 2023-10-04 18:03:38,636 WARNING [train.py:1204] (2/4) Exclude cut with ID _4866_210_2577_1_1527404412284_4364659_256-306830-0_sp0.9 from training. Duration: 0.9666875 2023-10-04 18:03:44,526 WARNING [train.py:1204] (2/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_239-316802-0 from training. Duration: 0.69 2023-10-04 18:03:47,331 WARNING [train.py:1204] (2/4) Exclude cut with ID _29857_210_20034_1_1532395886753_3798160_279-185801-0 from training. Duration: 0.91 2023-10-04 18:03:47,485 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.0.bypass.scale_min, batch_count=1744560.0, ans=0.2 2023-10-04 18:03:51,539 INFO [train.py:1046] (2/4) Epoch 50, batch 1400, loss[loss=0.149, simple_loss=0.2232, pruned_loss=0.03735, over 23764.00 frames. ], tot_loss[loss=0.1514, simple_loss=0.2316, pruned_loss=0.03556, over 4704822.51 frames. ], batch size: 164, lr: 2.03e-03, grad_scale: 16.0 2023-10-04 18:03:52,895 WARNING [train.py:1204] (2/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_656-188361-0 from training. Duration: 0.87 2023-10-04 18:03:53,011 WARNING [train.py:1204] (2/4) Exclude cut with ID _44718_210_5816_1_1534050051869_6469030_100-230287-0_sp1.1 from training. Duration: 0.936375 2023-10-04 18:03:54,488 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8275_210_4198_1_1528455320_3892571_276-215622-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 18:03:56,375 WARNING [train.py:1204] (2/4) Exclude cut with ID _30304_210_6319_1_1532476827261_7582060_698-309430-0_sp1.1 from training. Duration: 0.963625 2023-10-04 18:03:59,958 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_365-217119-0 from training. Duration: 0.63 2023-10-04 18:04:00,167 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.attention_skip_rate, batch_count=1744626.6666666667, ans=0.0 2023-10-04 18:04:01,259 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7264_210_4938_1_1527939911_8132095_800-91995-0 from training. Duration: 0.86 2023-10-04 18:04:04,017 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=1744626.6666666667, ans=0.1 2023-10-04 18:04:05,071 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.753e+02 2.179e+02 2.424e+02 2.840e+02 4.477e+02, threshold=4.849e+02, percent-clipped=0.0 2023-10-04 18:04:09,299 WARNING [train.py:1204] (2/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_49-97158-0_sp1.1 from training. Duration: 0.6909375 2023-10-04 18:04:11,351 WARNING [train.py:1204] (2/4) Exclude cut with ID _61132_210_9969_1_1535021792964_3755419_61-57720-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 18:04:14,010 WARNING [train.py:1204] (2/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_345-306832-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 18:04:14,040 WARNING [train.py:1204] (2/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_164-224052-0_sp0.9 from training. Duration: 0.77775 2023-10-04 18:04:16,843 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_34886_210_10633_1_1533203882_1755661_76-156096-0_sp1.1 from training. Duration: 0.7909375 2023-10-04 18:04:18,148 WARNING [train.py:1204] (2/4) Exclude cut with ID 4133-6541-0027-40495-0_sp1.1 from training. Duration: 0.9681875 2023-10-04 18:04:25,863 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.conv_skip_rate, batch_count=1744760.0, ans=0.0 2023-10-04 18:04:30,321 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_514-211165-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 18:04:30,381 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_41211_210_2544_1_1533348859_3702910_97-212695-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 18:04:34,456 WARNING [train.py:1204] (2/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_406-45086-0 from training. Duration: 0.8 2023-10-04 18:04:35,921 WARNING [train.py:1204] (2/4) Exclude cut with ID _65263_210_22356_1_1535763598691_3921037_49-222917-0_sp1.1 from training. Duration: 0.836375 2023-10-04 18:04:35,972 WARNING [train.py:1204] (2/4) Exclude cut with ID _4866_210_2577_1_1527404412284_4364659_352-157930-0_sp1.1 from training. Duration: 0.736375 2023-10-04 18:04:37,284 WARNING [train.py:1204] (2/4) Exclude cut with ID _4366_210_1388_1_1526792053633_3613819_231-211402-0_sp1.1 from training. Duration: 0.8545625 2023-10-04 18:04:37,333 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_98-21576-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 18:04:38,685 WARNING [train.py:1204] (2/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_348-127789-0_sp0.9 from training. Duration: 0.9444375 2023-10-04 18:04:38,707 WARNING [train.py:1204] (2/4) Exclude cut with ID _45871_210_5665_1_1533864614245_3296060_98-329557-0_sp1.1 from training. Duration: 0.97275 2023-10-04 18:04:40,542 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6846_210_6753_1_1527850767_4295158_2-315073-0_sp1.1 from training. Duration: 0.8 2023-10-04 18:04:43,624 WARNING [train.py:1204] (2/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_145-137790-0 from training. Duration: 0.83 2023-10-04 18:04:43,646 WARNING [train.py:1204] (2/4) Exclude cut with ID _14603_210_4220_1_1530615688441_3097319_133-158308-0_sp0.9 from training. Duration: 0.9333125 2023-10-04 18:04:46,570 WARNING [train.py:1204] (2/4) Exclude cut with ID _14898_210_9206_1_1530766812377_4177190_320-11758-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 18:04:50,797 WARNING [train.py:1204] (2/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_406-45086-0_sp0.9 from training. Duration: 0.888875 2023-10-04 18:04:52,358 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.feed_forward2.hidden_balancer.prob, batch_count=1744893.3333333333, ans=0.125 2023-10-04 18:04:56,502 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_5438_210_1794_1_1527336687_2600009_104-288274-0 from training. Duration: 0.58 2023-10-04 18:04:57,831 WARNING [train.py:1204] (2/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_108-53929-0_sp0.9 from training. Duration: 0.6555625 2023-10-04 18:05:00,861 WARNING [train.py:1204] (2/4) Exclude cut with ID _30800_210_22382_1_1532499347904_4515959_444-286390-0_sp1.1 from training. Duration: 0.963625 2023-10-04 18:05:02,265 WARNING [train.py:1204] (2/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_125-144174-0_sp0.9 from training. Duration: 0.4333125 2023-10-04 18:05:02,342 WARNING [train.py:1204] (2/4) Exclude cut with ID _55209_210_1531_1_1534675828996_4597160_118-345621-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 18:05:05,576 INFO [train.py:1046] (2/4) Epoch 50, batch 1450, loss[loss=0.157, simple_loss=0.2361, pruned_loss=0.03897, over 23558.00 frames. ], tot_loss[loss=0.1511, simple_loss=0.2313, pruned_loss=0.03548, over 4705868.03 frames. ], batch size: 120, lr: 2.03e-03, grad_scale: 16.0 2023-10-04 18:05:05,663 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_45886_210_17024_1_1533709215_471063_7-79165-0_sp1.1 from training. Duration: 0.8909375 2023-10-04 18:05:05,924 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=1744960.0, ans=0.1 2023-10-04 18:05:07,211 WARNING [train.py:1204] (2/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_175-163894-0_sp0.9 from training. Duration: 0.82225 2023-10-04 18:05:08,677 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_50188_210_4198_1_1534503621_3646041_353-81682-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 18:05:08,683 WARNING [train.py:1204] (2/4) Exclude cut with ID _17875_210_2087_1_1531224012907_1821339_175-80565-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 18:05:08,692 WARNING [train.py:1204] (2/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_159-328157-0_sp1.1 from training. Duration: 0.47275 2023-10-04 18:05:13,467 WARNING [train.py:1204] (2/4) Exclude cut with ID _60057_210_10640_1_1534903065793_3333150_137-267579-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 18:05:13,635 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.conv_module2.balancer1.prob, batch_count=1744960.0, ans=0.125 2023-10-04 18:05:13,926 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.5.encoder.layers.1.feed_forward1.out_whiten, num_groups=1, num_channels=256, metric=5.60 vs. limit=15.0 2023-10-04 18:05:15,297 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_92-46009-0_sp1.1 from training. Duration: 0.4909375 2023-10-04 18:05:16,691 WARNING [train.py:1204] (2/4) Exclude cut with ID _53203_210_2087_1_1534246218309_475590_48-68507-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 18:05:16,713 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_28211_210_6388_1_1532861506_4523224_128-328162-0 from training. Duration: 0.9 2023-10-04 18:05:18,190 WARNING [train.py:1204] (2/4) Exclude cut with ID _4697_210_2069_1_1526815801953_3823098_51-1049-0_sp1.1 from training. Duration: 0.6818125 2023-10-04 18:05:18,282 WARNING [train.py:1204] (2/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_383-316000-0 from training. Duration: 0.9 2023-10-04 18:05:18,339 WARNING [train.py:1204] (2/4) Exclude cut with ID _4779_210_2084_1_1527318022944_3914893_185-295153-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 18:05:19,670 WARNING [train.py:1204] (2/4) Exclude cut with ID _3231_210_2106_1_1526205864934_6784220_171-256004-0_sp0.9 from training. Duration: 0.9555625 2023-10-04 18:05:19,671 WARNING [train.py:1204] (2/4) Exclude cut with ID _41314_210_12651_1_1533459476543_3766460_87-272068-0 from training. Duration: 0.83 2023-10-04 18:05:21,045 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_164-152777-0_sp1.1 from training. Duration: 0.92725 2023-10-04 18:05:22,341 WARNING [train.py:1204] (2/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_18-130336-0_sp0.9 from training. Duration: 0.87775 2023-10-04 18:05:23,631 WARNING [train.py:1204] (2/4) Exclude cut with ID _14886_210_4220_1_1530702020355_3516190_175-229073-0_sp1.1 from training. Duration: 0.4090625 2023-10-04 18:05:23,651 WARNING [train.py:1204] (2/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_791-45149-0_sp0.9 from training. Duration: 0.9555625 2023-10-04 18:05:25,024 WARNING [train.py:1204] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_468-189058-0_sp1.1 from training. Duration: 0.87275 2023-10-04 18:05:25,129 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8278_210_2550_1_1528637099_4220413_32-142780-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 18:05:26,851 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.conv_module1.balancer2.prob, batch_count=1745026.6666666667, ans=0.125 2023-10-04 18:05:27,927 WARNING [train.py:1204] (2/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_146-218763-0_sp0.9 from training. Duration: 0.9555625 2023-10-04 18:05:30,006 WARNING [train.py:1204] (2/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_348-127789-0_sp1.1 from training. Duration: 0.77275 2023-10-04 18:05:30,018 WARNING [train.py:1204] (2/4) Exclude cut with ID _47790_210_16187_1_1533862814066_4207519_288-285445-0_sp1.1 from training. Duration: 0.936375 2023-10-04 18:05:30,159 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.conv_module1.balancer1.prob, batch_count=1745026.6666666667, ans=0.125 2023-10-04 18:05:33,165 WARNING [train.py:1204] (2/4) Exclude cut with ID _14603_210_4220_1_1530615688441_3097319_308-181732-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 18:05:33,189 WARNING [train.py:1204] (2/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_895-117428-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 18:05:35,951 WARNING [train.py:1204] (2/4) Exclude cut with ID _9235_210_5219_1_1528891154253_8046120_387-159008-0_sp0.9 from training. Duration: 0.9555625 2023-10-04 18:05:35,963 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_37351_210_7925_1_1533012931_3812113_265-85780-0_sp0.9 from training. Duration: 0.97775 2023-10-04 18:05:35,987 WARNING [train.py:1204] (2/4) Exclude cut with ID _40551_210_10635_1_1533776424632_3416179_361-3964-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 18:05:36,027 WARNING [train.py:1204] (2/4) Exclude cut with ID _25882_210_3036_1_1532775665815_6479510_485-122742-0_sp1.1 from training. Duration: 0.97275 2023-10-04 18:05:41,993 WARNING [train.py:1204] (2/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_98-51676-0 from training. Duration: 0.73 2023-10-04 18:05:45,174 WARNING [train.py:1204] (2/4) Exclude cut with ID _30548_210_16040_1_1532608097130_3146969_371-280610-0_sp1.1 from training. Duration: 0.92725 2023-10-04 18:05:47,992 WARNING [train.py:1204] (2/4) Exclude cut with ID IC0671W0081-331924 from training. Duration: 0.896 2023-10-04 18:05:50,775 WARNING [train.py:1204] (2/4) Exclude cut with ID _40284_210_2084_1_1533513679814_3704279_380-160775-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 18:05:52,145 WARNING [train.py:1204] (2/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_57-270037-0_sp1.1 from training. Duration: 0.7 2023-10-04 18:05:53,666 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_853-150237-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 18:05:55,014 WARNING [train.py:1204] (2/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_491-261923-0 from training. Duration: 0.98 2023-10-04 18:05:58,107 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.conv_module1.balancer1.max_abs, batch_count=1745160.0, ans=10.0 2023-10-04 18:05:58,119 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.balancer2.prob, batch_count=1745160.0, ans=0.125 2023-10-04 18:05:59,272 WARNING [train.py:1204] (2/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_836-244045-0_sp1.1 from training. Duration: 0.97275 2023-10-04 18:06:00,644 WARNING [train.py:1204] (2/4) Exclude cut with ID _14880_210_8895_1_1530680426655_3795259_289-212817-0 from training. Duration: 0.69 2023-10-04 18:06:02,050 WARNING [train.py:1204] (2/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_303-340446-0 from training. Duration: 0.9 2023-10-04 18:06:03,414 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_37152_210_9697_1_1533106573_3844188_418-41736-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 18:06:03,653 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.0.nonlin_attention.balancer.prob, batch_count=1745226.6666666667, ans=0.125 2023-10-04 18:06:06,814 WARNING [train.py:1204] (2/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_371-18169-0_sp1.1 from training. Duration: 0.7818125 2023-10-04 18:06:06,864 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_187-219984-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 18:06:09,917 WARNING [train.py:1204] (2/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_63-267585-0 from training. Duration: 0.9 2023-10-04 18:06:12,586 WARNING [train.py:1204] (2/4) Exclude cut with ID _27013_210_1389_1_1532437173240_3252649_222-125573-0 from training. Duration: 0.98 2023-10-04 18:06:13,901 WARNING [train.py:1204] (2/4) Exclude cut with ID _14821_210_8750_1_1530680503991_6829359_189-297451-0 from training. Duration: 0.55 2023-10-04 18:06:15,697 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_557-163391-0_sp1.1 from training. Duration: 0.97275 2023-10-04 18:06:15,784 WARNING [train.py:1204] (2/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_65-88242-0_sp1.1 from training. Duration: 0.5090625 2023-10-04 18:06:20,485 INFO [train.py:1046] (2/4) Epoch 50, batch 1500, loss[loss=0.1292, simple_loss=0.2021, pruned_loss=0.02813, over 22393.00 frames. ], tot_loss[loss=0.1509, simple_loss=0.2317, pruned_loss=0.0351, over 4718192.80 frames. ], batch size: 49, lr: 2.03e-03, grad_scale: 16.0 2023-10-04 18:06:22,290 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.balancer2.prob, batch_count=1745293.3333333333, ans=0.125 2023-10-04 18:06:23,575 WARNING [train.py:1204] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_218-295097-0 from training. Duration: 0.77 2023-10-04 18:06:23,601 WARNING [train.py:1204] (2/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_686-247600-0_sp1.1 from training. Duration: 0.836375 2023-10-04 18:06:23,604 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_397-147196-0_sp0.9 from training. Duration: 0.888875 2023-10-04 18:06:24,904 WARNING [train.py:1204] (2/4) Exclude cut with ID _53950_210_5816_1_1534417264919_3862951_237-136449-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 18:06:24,971 WARNING [train.py:1204] (2/4) Exclude cut with ID _3120_210_3525_1_1526036143112_4072980_74-178400-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 18:06:26,337 WARNING [train.py:1204] (2/4) Exclude cut with ID _5166_210_2084_1_1527073197160_4087993_309-222545-0_sp0.9 from training. Duration: 0.9333125 2023-10-04 18:06:26,409 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7544_210_6753_1_1528587722_3576686_172-171750-0 from training. Duration: 0.65 2023-10-04 18:06:27,812 WARNING [train.py:1204] (2/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_3-237434-0_sp0.9 from training. Duration: 0.8444375 2023-10-04 18:06:29,416 WARNING [train.py:1204] (2/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_101-293367-0_sp0.9 from training. Duration: 0.7 2023-10-04 18:06:29,423 WARNING [train.py:1204] (2/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_818-213552-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 18:06:30,719 WARNING [train.py:1204] (2/4) Exclude cut with ID _14349_210_8341_1_1530615710818_3442470_115-285975-0_sp1.1 from training. Duration: 0.7818125 2023-10-04 18:06:30,948 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.conv_module1.balancer1.prob, batch_count=1745293.3333333333, ans=0.125 2023-10-04 18:06:33,288 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.778e+02 2.060e+02 2.266e+02 2.721e+02 4.158e+02, threshold=4.532e+02, percent-clipped=0.0 2023-10-04 18:06:33,407 WARNING [train.py:1204] (2/4) Exclude cut with ID _30800_210_22382_1_1532499347904_4515959_291-309749-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 18:06:34,857 WARNING [train.py:1204] (2/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_715-3462-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 18:06:40,212 WARNING [train.py:1204] (2/4) Exclude cut with ID _59502_210_12210_1_1535107024698_1835139_13-331827-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 18:06:40,222 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_4528_210_3273_1_1527076654_3276777_342-168068-0 from training. Duration: 0.87 2023-10-04 18:06:40,401 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.conv_module1.balancer1.prob, batch_count=1745360.0, ans=0.125 2023-10-04 18:06:41,560 WARNING [train.py:1204] (2/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_215-141059-0_sp0.9 from training. Duration: 0.688875 2023-10-04 18:06:41,594 WARNING [train.py:1204] (2/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_106-286130-0_sp1.1 from training. Duration: 0.8909375 2023-10-04 18:06:42,902 WARNING [train.py:1204] (2/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_480-77655-0_sp1.1 from training. Duration: 0.97275 2023-10-04 18:06:46,114 WARNING [train.py:1204] (2/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_848-280957-0 from training. Duration: 0.87 2023-10-04 18:06:49,452 WARNING [train.py:1204] (2/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_122-109023-0 from training. Duration: 0.76 2023-10-04 18:06:50,967 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_11305_210_1794_1_1529750428_7178148_645-233480-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 18:06:52,268 WARNING [train.py:1204] (2/4) Exclude cut with ID _7562_210_2084_1_1528887767916_3946229_345-169156-0 from training. Duration: 0.58 2023-10-04 18:06:54,940 WARNING [train.py:1204] (2/4) Exclude cut with ID _7562_210_2084_1_1528887767916_3946229_345-169156-0_sp1.1 from training. Duration: 0.52725 2023-10-04 18:06:56,364 WARNING [train.py:1204] (2/4) Exclude cut with ID _13253_210_1925_1_1530757775819_3577873_253-246386-0_sp1.1 from training. Duration: 0.8181875 2023-10-04 18:06:57,667 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_136-264444-0_sp1.1 from training. Duration: 0.97275 2023-10-04 18:06:57,682 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_2168_210_2290_1_1525606920_1849400_136-222991-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 18:06:59,077 WARNING [train.py:1204] (2/4) Exclude cut with ID _7852_210_5219_1_1528286471316_8139400_242-105336-0 from training. Duration: 0.72 2023-10-04 18:07:00,409 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_5960_210_1794_1_1527403865_3990999_515-2613-0_sp1.1 from training. Duration: 0.8 2023-10-04 18:07:00,424 WARNING [train.py:1204] (2/4) Exclude cut with ID _39688_210_19596_1_1533371626725_4154159_551-167834-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 18:07:00,486 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_43802_210_2290_1_1533516203_8829504_85-195240-0 from training. Duration: 0.87 2023-10-04 18:07:01,888 WARNING [train.py:1204] (2/4) Exclude cut with ID _63171_210_15488_1_1535104792794_3295110_113-113534-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 18:07:04,743 WARNING [train.py:1204] (2/4) Exclude cut with ID _8833_210_5219_1_1528592665210_7553059_319-349181-0_sp1.1 from training. Duration: 0.9 2023-10-04 18:07:04,744 WARNING [train.py:1204] (2/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_225-43381-0 from training. Duration: 0.94 2023-10-04 18:07:09,907 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_5959_210_1794_1_1527400400_3445726_412-44052-0_sp0.9 from training. Duration: 0.8555625 2023-10-04 18:07:10,096 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.out_combiner.scale_min, batch_count=1745493.3333333333, ans=0.2 2023-10-04 18:07:12,993 WARNING [train.py:1204] (2/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_356-167816-0_sp1.1 from training. Duration: 0.6545625 2023-10-04 18:07:17,506 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9012W0112-501244 from training. Duration: 0.768 2023-10-04 18:07:17,563 WARNING [train.py:1204] (2/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_177-326952-0_sp1.1 from training. Duration: 0.92725 2023-10-04 18:07:17,566 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9016W0116-502789 from training. Duration: 0.896 2023-10-04 18:07:18,934 WARNING [train.py:1204] (2/4) Exclude cut with ID _56235_210_10640_1_1534557335235_4161099_557-204815-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 18:07:19,244 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.conv_module2.balancer1.prob, batch_count=1745560.0, ans=0.125 2023-10-04 18:07:20,353 WARNING [train.py:1204] (2/4) Exclude cut with ID _55857_210_2346_1_1534644007831_6898209_279-229469-0_sp1.1 from training. Duration: 0.936375 2023-10-04 18:07:20,429 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9029W0375-509764 from training. Duration: 0.896 2023-10-04 18:07:21,717 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_2295_210_2087_1_1525500993_2407863_234-83104-0_sp0.9 from training. Duration: 0.688875 2023-10-04 18:07:24,491 WARNING [train.py:1204] (2/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_266-137104-0 from training. Duration: 0.65 2023-10-04 18:07:25,901 WARNING [train.py:1204] (2/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_289-163927-0_sp1.1 from training. Duration: 0.92725 2023-10-04 18:07:26,210 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.balancer2.prob, batch_count=1745560.0, ans=0.125 2023-10-04 18:07:27,369 WARNING [train.py:1204] (2/4) Exclude cut with ID _12733_210_8341_1_1530167424556_3646380_86-225314-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 18:07:27,382 WARNING [train.py:1204] (2/4) Exclude cut with ID _7877_210_6014_1_1528286358099_3750136_618-284064-0_sp1.1 from training. Duration: 0.92725 2023-10-04 18:07:28,750 WARNING [train.py:1204] (2/4) Exclude cut with ID _55844_210_2346_1_1534471212569_7625707_1017-31469-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 18:07:28,800 WARNING [train.py:1204] (2/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_617-241446-0_sp1.1 from training. Duration: 0.92725 2023-10-04 18:07:28,851 WARNING [train.py:1204] (2/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_866-173440-0_sp1.1 from training. Duration: 0.8181875 2023-10-04 18:07:29,061 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.feed_forward1.out_proj.dropout_p, batch_count=1745560.0, ans=0.1 2023-10-04 18:07:30,274 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_9460_210_4211_1_1528954587_10049667_480-260269-0 from training. Duration: 0.56 2023-10-04 18:07:31,577 WARNING [train.py:1204] (2/4) Exclude cut with ID _44659_210_2721_1_1534036120061_6614860_626-298884-0 from training. Duration: 0.97 2023-10-04 18:07:31,614 WARNING [train.py:1204] (2/4) Exclude cut with ID _53565_210_10635_1_1534337218335_3820259_287-18120-0_sp1.1 from training. Duration: 0.8818125 2023-10-04 18:07:32,927 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_791-197891-0 from training. Duration: 0.84 2023-10-04 18:07:32,979 WARNING [train.py:1204] (2/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_527-238990-0 from training. Duration: 0.9 2023-10-04 18:07:34,224 INFO [train.py:1046] (2/4) Epoch 50, batch 1550, loss[loss=0.1391, simple_loss=0.2195, pruned_loss=0.02939, over 23620.00 frames. ], tot_loss[loss=0.1518, simple_loss=0.2326, pruned_loss=0.03549, over 4718929.19 frames. ], batch size: 149, lr: 2.03e-03, grad_scale: 16.0 2023-10-04 18:07:35,554 WARNING [train.py:1204] (2/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_449-65969-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 18:07:37,009 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_553-341452-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 18:07:37,048 WARNING [train.py:1204] (2/4) Exclude cut with ID _16909_210_5724_1_1531292079912_4118919_300-47574-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 18:07:38,847 WARNING [train.py:1204] (2/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_70-27732-0_sp1.1 from training. Duration: 0.8545625 2023-10-04 18:07:38,948 WARNING [train.py:1204] (2/4) Exclude cut with ID _44718_210_5816_1_1534050051869_6469030_727-28074-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 18:07:40,961 WARNING [train.py:1204] (2/4) Exclude cut with ID _55857_210_2346_1_1534644007831_6898209_357-123664-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 18:07:45,781 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9123W0004-555964 from training. Duration: 0.896 2023-10-04 18:07:45,816 WARNING [train.py:1204] (2/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_445-145208-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 18:07:47,130 WARNING [train.py:1204] (2/4) Exclude cut with ID _3675_210_4557_1_1526639934408_4182089_456-18305-0_sp1.1 from training. Duration: 0.6909375 2023-10-04 18:07:47,198 WARNING [train.py:1204] (2/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_1-99680-0_sp1.1 from training. Duration: 0.6090625 2023-10-04 18:07:48,737 WARNING [train.py:1204] (2/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_146-282400-0_sp0.9 from training. Duration: 0.788875 2023-10-04 18:07:48,751 WARNING [train.py:1204] (2/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_844-96989-0 from training. Duration: 0.71 2023-10-04 18:07:51,482 WARNING [train.py:1204] (2/4) Exclude cut with ID _29853_210_7497_1_1532505600851_6642110_128-146364-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 18:07:51,513 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_224-322394-0 from training. Duration: 0.66 2023-10-04 18:07:52,885 WARNING [train.py:1204] (2/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_191-59784-0 from training. Duration: 0.88 2023-10-04 18:07:52,903 WARNING [train.py:1204] (2/4) Exclude cut with ID _12556_210_8341_1_1530081024100_7327690_725-193477-0 from training. Duration: 0.98 2023-10-04 18:07:52,948 WARNING [train.py:1204] (2/4) Exclude cut with ID _5166_210_2084_1_1527073197160_4087993_523-79838-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 18:07:54,312 WARNING [train.py:1204] (2/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_398-334558-0_sp1.1 from training. Duration: 0.87275 2023-10-04 18:07:57,130 WARNING [train.py:1204] (2/4) Exclude cut with ID _6456_210_1534_1_1527681634212_3181049_171-36697-0_sp1.1 from training. Duration: 0.936375 2023-10-04 18:07:59,752 WARNING [train.py:1204] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_700-183549-0 from training. Duration: 0.68 2023-10-04 18:07:59,754 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8853_210_3273_1_1528633899_818968_51-187685-0 from training. Duration: 0.69 2023-10-04 18:08:01,366 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.conv_module2.balancer2.min_abs, batch_count=1745693.3333333333, ans=0.5 2023-10-04 18:08:08,221 WARNING [train.py:1204] (2/4) Exclude cut with ID _7719_210_3525_1_1528456153163_4296960_333-110399-0_sp1.1 from training. Duration: 0.87275 2023-10-04 18:08:08,504 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=1745760.0, ans=0.1 2023-10-04 18:08:12,965 WARNING [train.py:1204] (2/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_1022-83740-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 18:08:12,993 WARNING [train.py:1204] (2/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_61-327062-0_sp0.9 from training. Duration: 0.9 2023-10-04 18:08:12,998 WARNING [train.py:1204] (2/4) Exclude cut with ID _47251_210_18628_1_1533814063128_4137250_214-88076-0_sp1.1 from training. Duration: 0.8 2023-10-04 18:08:13,052 WARNING [train.py:1204] (2/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_839-173017-0 from training. Duration: 0.41 2023-10-04 18:08:19,787 WARNING [train.py:1204] (2/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_299-34413-0_sp1.1 from training. Duration: 0.5909375 2023-10-04 18:08:21,066 WARNING [train.py:1204] (2/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_664-159457-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 18:08:22,614 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.self_attn_weights.pos_emb_skip_rate, batch_count=1745826.6666666667, ans=0.0 2023-10-04 18:08:23,796 WARNING [train.py:1204] (2/4) Exclude cut with ID _66873_210_5517_1_1535454136506_4293370_483-291760-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 18:08:25,259 WARNING [train.py:1204] (2/4) Exclude cut with ID _30800_210_22382_1_1532499347904_4515959_287-255474-0_sp1.1 from training. Duration: 0.92725 2023-10-04 18:08:26,511 WARNING [train.py:1204] (2/4) Exclude cut with ID _48677_210_5663_1_1533902405919_3892989_111-11439-0_sp1.1 from training. Duration: 0.87275 2023-10-04 18:08:26,518 WARNING [train.py:1204] (2/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_399-156791-0 from training. Duration: 0.83 2023-10-04 18:08:26,550 WARNING [train.py:1204] (2/4) Exclude cut with ID _3992_210_1747_1_1526644816031_3731060_36-147855-0_sp0.9 from training. Duration: 0.8666875 2023-10-04 18:08:26,688 WARNING [train.py:1204] (2/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_442-320317-0_sp1.1 from training. Duration: 0.7454375 2023-10-04 18:08:27,988 WARNING [train.py:1204] (2/4) Exclude cut with ID _55429_210_10626_1_1534467588527_7174143_953-80868-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 18:08:28,067 WARNING [train.py:1204] (2/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_159-328157-0_sp0.9 from training. Duration: 0.57775 2023-10-04 18:08:29,317 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9309W0197-640324 from training. Duration: 0.896 2023-10-04 18:08:29,479 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.1.conv_module2.balancer2.min_positive, batch_count=1745826.6666666667, ans=0.05 2023-10-04 18:08:32,089 WARNING [train.py:1204] (2/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_361-30822-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 18:08:36,460 WARNING [train.py:1204] (2/4) Exclude cut with ID _5250_210_5219_1_1527249561806_8692110_636-251213-0 from training. Duration: 0.94 2023-10-04 18:08:42,551 WARNING [train.py:1204] (2/4) Exclude cut with ID _49728_210_12651_1_1534041034294_6788150_468-333827-0_sp1.1 from training. Duration: 0.963625 2023-10-04 18:08:45,256 WARNING [train.py:1204] (2/4) Exclude cut with ID _25882_210_3036_1_1532775665815_6479510_623-191759-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 18:08:45,326 WARNING [train.py:1204] (2/4) Exclude cut with ID _5189_210_4557_1_1527248319428_3769229_199-53526-0 from training. Duration: 0.42 2023-10-04 18:08:49,103 INFO [train.py:1046] (2/4) Epoch 50, batch 1600, loss[loss=0.1631, simple_loss=0.2462, pruned_loss=0.03997, over 24049.00 frames. ], tot_loss[loss=0.1526, simple_loss=0.2336, pruned_loss=0.03576, over 4715874.52 frames. ], batch size: 86, lr: 2.03e-03, grad_scale: 32.0 2023-10-04 18:08:49,150 WARNING [train.py:1204] (2/4) Exclude cut with ID _6274_210_2084_1_1527937583837_7494020_32-205288-0_sp0.9 from training. Duration: 0.8666875 2023-10-04 18:08:49,241 WARNING [train.py:1204] (2/4) Exclude cut with ID _65263_210_22356_1_1535763598691_3921037_401-346646-0_sp1.1 from training. Duration: 0.963625 2023-10-04 18:08:49,246 WARNING [train.py:1204] (2/4) Exclude cut with ID _12555_210_8750_1_1530075594402_3780239_449-267986-0_sp1.1 from training. Duration: 0.7545625 2023-10-04 18:08:51,058 WARNING [train.py:1204] (2/4) Exclude cut with ID _69884_210_9771_1_1535846392818_4166169_430-201020-0_sp1.1 from training. Duration: 0.8818125 2023-10-04 18:08:52,448 WARNING [train.py:1204] (2/4) Exclude cut with ID _8394_210_6702_1_1528590504316_3540189_379-49457-0_sp1.1 from training. Duration: 0.7909375 2023-10-04 18:08:54,141 WARNING [train.py:1204] (2/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_200-105218-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 18:08:54,421 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.conv_skip_rate, batch_count=1745960.0, ans=0.0 2023-10-04 18:08:55,502 WARNING [train.py:1204] (2/4) Exclude cut with ID _3867_210_1883_1_1526641200575_4475830_319-349850-0 from training. Duration: 0.67 2023-10-04 18:08:56,773 WARNING [train.py:1204] (2/4) Exclude cut with ID _28317_210_12742_1_1532480302381_8510325_916-195848-0 from training. Duration: 0.92 2023-10-04 18:08:59,412 WARNING [train.py:1204] (2/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_436-186311-0 from training. Duration: 0.65 2023-10-04 18:08:59,614 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.1.conv_module2.balancer1.prob, batch_count=1745960.0, ans=0.125 2023-10-04 18:09:00,899 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_3910_210_1794_1_1526777667_3747260_302-180617-0_sp1.1 from training. Duration: 0.97275 2023-10-04 18:09:02,117 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.717e+02 2.102e+02 2.370e+02 2.617e+02 3.812e+02, threshold=4.739e+02, percent-clipped=0.0 2023-10-04 18:09:02,274 WARNING [train.py:1204] (2/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_102-92751-0 from training. Duration: 0.99 2023-10-04 18:09:03,757 WARNING [train.py:1204] (2/4) Exclude cut with ID _51899_210_20034_1_1534129254466_3669220_265-215428-0_sp1.1 from training. Duration: 0.8909375 2023-10-04 18:09:06,510 WARNING [train.py:1204] (2/4) Exclude cut with ID _58583_210_5236_1_1534726835266_3574249_438-198993-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 18:09:11,955 WARNING [train.py:1204] (2/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_166-166990-0_sp1.1 from training. Duration: 0.87275 2023-10-04 18:09:14,859 WARNING [train.py:1204] (2/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_907-151398-0 from training. Duration: 0.37 2023-10-04 18:09:16,332 WARNING [train.py:1204] (2/4) Exclude cut with ID _38670_210_15379_1_1533448810659_4391059_452-256669-0_sp1.1 from training. Duration: 0.936375 2023-10-04 18:09:18,176 WARNING [train.py:1204] (2/4) Exclude cut with ID _48677_210_5663_1_1533902405919_3892989_408-40740-0 from training. Duration: 0.83 2023-10-04 18:09:19,527 WARNING [train.py:1204] (2/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_903-158756-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 18:09:20,814 WARNING [train.py:1204] (2/4) Exclude cut with ID _13252_210_1925_1_1530671372047_3336909_397-216497-0 from training. Duration: 0.96 2023-10-04 18:09:22,910 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.conv_module1.balancer2.min_abs, batch_count=1746093.3333333333, ans=0.5 2023-10-04 18:09:26,442 WARNING [train.py:1204] (2/4) Exclude cut with ID _55429_210_10626_1_1534467588527_7174143_97-335084-0 from training. Duration: 0.97 2023-10-04 18:09:32,019 WARNING [train.py:1204] (2/4) Exclude cut with ID _45329_210_7005_1_1534157951211_6774339_453-267730-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 18:09:32,122 WARNING [train.py:1204] (2/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_153-97441-0 from training. Duration: 0.56 2023-10-04 18:09:33,389 WARNING [train.py:1204] (2/4) Exclude cut with ID _40901_210_5665_1_1533363789958_3856130_312-329122-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 18:09:33,439 WARNING [train.py:1204] (2/4) Exclude cut with ID _30800_210_22382_1_1532499347904_4515959_97-126471-0_sp1.1 from training. Duration: 0.97275 2023-10-04 18:09:33,440 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_11305_210_1794_1_1529750428_7178148_257-312870-0_sp1.1 from training. Duration: 0.8 2023-10-04 18:09:34,881 WARNING [train.py:1204] (2/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_454-314901-0_sp0.9 from training. Duration: 0.511125 2023-10-04 18:09:35,768 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.0.layers.1.self_attn2.whiten, num_groups=1, num_channels=192, metric=7.99 vs. limit=22.5 2023-10-04 18:09:38,979 WARNING [train.py:1204] (2/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_282-136641-0_sp1.1 from training. Duration: 0.3818125 2023-10-04 18:09:40,353 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_46013_210_7234_1_1533776362_3744032_493-160724-0_sp1.1 from training. Duration: 0.963625 2023-10-04 18:09:40,398 WARNING [train.py:1204] (2/4) Exclude cut with ID _67831_210_12392_1_1535612464202_3092612_259-33442-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 18:09:41,741 WARNING [train.py:1204] (2/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_289-62384-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 18:09:43,105 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6545_210_2040_1_1527774670_4245258_379-14182-0_sp0.9 from training. Duration: 0.888875 2023-10-04 18:09:44,473 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_548-294316-0_sp1.1 from training. Duration: 0.736375 2023-10-04 18:09:46,290 WARNING [train.py:1204] (2/4) Exclude cut with ID _6275_210_2084_1_1528110062213_7669049_857-298942-0_sp1.1 from training. Duration: 0.77275 2023-10-04 18:09:46,410 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6673_210_3273_1_1527767790_3173240_327-306185-0_sp1.1 from training. Duration: 0.7545625 2023-10-04 18:09:52,659 WARNING [train.py:1204] (2/4) Exclude cut with ID _61008_210_10633_1_1534852769198_6974370_814-107647-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 18:09:52,735 WARNING [train.py:1204] (2/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_116-332008-0_sp1.1 from training. Duration: 0.8909375 2023-10-04 18:09:54,159 WARNING [train.py:1204] (2/4) Exclude cut with ID _32704_210_8954_1_1533017488840_6747370_266-329755-0 from training. Duration: 0.89 2023-10-04 18:09:54,162 WARNING [train.py:1204] (2/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_271-269084-0_sp0.9 from training. Duration: 0.92225 2023-10-04 18:09:56,843 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_3171_210_2087_1_1526109084_2330007_66-260583-0 from training. Duration: 0.72 2023-10-04 18:09:58,531 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.attention_skip_rate, batch_count=1746226.6666666667, ans=0.0 2023-10-04 18:10:02,471 INFO [train.py:1046] (2/4) Epoch 50, batch 1650, loss[loss=0.1537, simple_loss=0.2411, pruned_loss=0.03315, over 24486.00 frames. ], tot_loss[loss=0.153, simple_loss=0.2337, pruned_loss=0.03608, over 4708415.55 frames. ], batch size: 66, lr: 2.03e-03, grad_scale: 16.0 2023-10-04 18:10:02,509 WARNING [train.py:1204] (2/4) Exclude cut with ID _5250_210_5219_1_1527249561806_8692110_1326-276386-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 18:10:02,685 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.out_combiner.scale_min, batch_count=1746293.3333333333, ans=0.2 2023-10-04 18:10:03,940 WARNING [train.py:1204] (2/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_372-30302-0_sp1.1 from training. Duration: 0.7818125 2023-10-04 18:10:05,336 WARNING [train.py:1204] (2/4) Exclude cut with ID _25882_210_3036_1_1532775665815_6479510_631-257029-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 18:10:05,344 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_3691_210_3528_1_1526468169_4062053_355-334432-0 from training. Duration: 0.87 2023-10-04 18:10:05,361 WARNING [train.py:1204] (2/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_839-173017-0_sp1.1 from training. Duration: 0.37275 2023-10-04 18:10:05,369 WARNING [train.py:1204] (2/4) Exclude cut with ID _14998_210_5219_1_1530705582080_7474970_441-91466-0 from training. Duration: 0.97 2023-10-04 18:10:06,699 WARNING [train.py:1204] (2/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_108-53929-0 from training. Duration: 0.59 2023-10-04 18:10:09,563 WARNING [train.py:1204] (2/4) Exclude cut with ID _9389_210_1664_1_1529217337301_5289269_260-1942-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 18:10:09,624 WARNING [train.py:1204] (2/4) Exclude cut with ID _56423_210_5854_1_1534896061049_3165969_198-61547-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 18:10:09,658 WARNING [train.py:1204] (2/4) Exclude cut with ID _44778_210_2054_1_1533798127203_7443090_412-142122-0_sp1.1 from training. Duration: 0.92725 2023-10-04 18:10:10,922 WARNING [train.py:1204] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_88-341718-0_sp1.1 from training. Duration: 0.6 2023-10-04 18:10:12,320 WARNING [train.py:1204] (2/4) Exclude cut with ID _61079_210_4525_1_1534921008198_4011419_334-298697-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 18:10:13,735 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_5465_210_4211_1_1527227253_55357_8-2499-0_sp1.1 from training. Duration: 0.996375 2023-10-04 18:10:15,633 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_447-201565-0_sp1.1 from training. Duration: 0.97275 2023-10-04 18:10:15,639 WARNING [train.py:1204] (2/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_253-161789-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 18:10:15,642 WARNING [train.py:1204] (2/4) Exclude cut with ID _44304_210_15374_1_1533639352765_4613129_302-22084-0_sp0.9 from training. Duration: 0.9555625 2023-10-04 18:10:15,659 WARNING [train.py:1204] (2/4) Exclude cut with ID _26664_210_7770_1_1532258943280_3418270_187-80815-0_sp1.1 from training. Duration: 0.8454375 2023-10-04 18:10:17,047 WARNING [train.py:1204] (2/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_68-212801-0 from training. Duration: 0.73 2023-10-04 18:10:18,359 WARNING [train.py:1204] (2/4) Exclude cut with ID _29345_210_4381_1_1532689168263_3860169_149-257268-0 from training. Duration: 0.81 2023-10-04 18:10:23,179 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_213-103282-0_sp1.1 from training. Duration: 0.7090625 2023-10-04 18:10:24,646 WARNING [train.py:1204] (2/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_156-302675-0_sp1.1 from training. Duration: 0.5 2023-10-04 18:10:32,885 WARNING [train.py:1204] (2/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_125-322172-0 from training. Duration: 0.93 2023-10-04 18:10:32,957 WARNING [train.py:1204] (2/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_493-60474-0_sp1.1 from training. Duration: 0.963625 2023-10-04 18:10:34,478 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.0.ff3_skip_rate, batch_count=1746426.6666666667, ans=0.0 2023-10-04 18:10:35,663 WARNING [train.py:1204] (2/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_156-302675-0 from training. Duration: 0.55 2023-10-04 18:10:37,097 WARNING [train.py:1204] (2/4) Exclude cut with ID _41314_210_12651_1_1533459476543_3766460_123-267862-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 18:10:39,896 WARNING [train.py:1204] (2/4) Exclude cut with ID _28427_210_4220_1_1532581240967_3061069_201-143747-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 18:10:39,923 WARNING [train.py:1204] (2/4) Exclude cut with ID _8772_210_1850_1_1528635595972_3664011_96-12756-0_sp1.1 from training. Duration: 0.8909375 2023-10-04 18:10:41,242 WARNING [train.py:1204] (2/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_1347-145675-0_sp1.1 from training. Duration: 0.936375 2023-10-04 18:10:41,348 WARNING [train.py:1204] (2/4) Exclude cut with ID _9235_210_5219_1_1528891154253_8046120_387-159008-0_sp1.1 from training. Duration: 0.7818125 2023-10-04 18:10:42,044 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.3.encoder.layers.1.feed_forward2.out_whiten, num_groups=1, num_channels=512, metric=5.96 vs. limit=15.0 2023-10-04 18:10:42,664 WARNING [train.py:1204] (2/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_400-201896-0_sp1.1 from training. Duration: 0.963625 2023-10-04 18:10:44,143 WARNING [train.py:1204] (2/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_326-174612-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 18:10:44,384 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.nonlin_attention.balancer.prob, batch_count=1746493.3333333333, ans=0.125 2023-10-04 18:10:46,574 WARNING [train.py:1204] (2/4) Exclude cut with ID _55844_210_2346_1_1534471212569_7625707_390-116233-0_sp1.1 from training. Duration: 0.963625 2023-10-04 18:10:46,607 WARNING [train.py:1204] (2/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_419-317062-0_sp1.1 from training. Duration: 0.82725 2023-10-04 18:10:47,753 WARNING [train.py:1204] (2/4) Exclude cut with ID _29877_210_6228_1_1532397791106_5276149_266-62291-0_sp0.9 from training. Duration: 0.9666875 2023-10-04 18:10:49,089 WARNING [train.py:1204] (2/4) Exclude cut with ID _7857_210_1534_1_1528286515745_6831470_431-338528-0_sp1.1 from training. Duration: 0.92725 2023-10-04 18:10:51,013 WARNING [train.py:1204] (2/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_87-131427-0_sp0.9 from training. Duration: 0.8666875 2023-10-04 18:10:54,516 WARNING [train.py:1204] (2/4) Exclude cut with ID _24055_210_12651_1_1532310907143_976979_92-59356-0_sp1.1 from training. Duration: 0.82725 2023-10-04 18:10:55,983 WARNING [train.py:1204] (2/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_96-290039-0 from training. Duration: 0.56 2023-10-04 18:10:57,511 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.0.balancer1.prob, batch_count=1746493.3333333333, ans=0.125 2023-10-04 18:10:58,753 WARNING [train.py:1204] (2/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_48-182155-0_sp0.9 from training. Duration: 0.9666875 2023-10-04 18:10:58,791 WARNING [train.py:1204] (2/4) Exclude cut with ID _4626_210_3532_1_1526817652717_3916859_446-206656-0 from training. Duration: 0.76 2023-10-04 18:11:00,157 WARNING [train.py:1204] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_168-32906-0 from training. Duration: 0.69 2023-10-04 18:11:00,178 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_3780_210_2087_1_1526467760_2130483_170-259044-0 from training. Duration: 0.7 2023-10-04 18:11:01,611 WARNING [train.py:1204] (2/4) Exclude cut with ID _65134_210_14763_1_1535335393985_3505368_153-216718-0_sp1.1 from training. Duration: 0.92725 2023-10-04 18:11:01,679 WARNING [train.py:1204] (2/4) Exclude cut with ID _64639_210_5816_1_1535367619437_3528000_461-329156-0_sp1.1 from training. Duration: 0.87275 2023-10-04 18:11:02,451 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.2.encoder.layers.1.self_attn_weights.whiten_keys, num_groups=4, num_channels=128, metric=2.63 vs. limit=6.0 2023-10-04 18:11:02,997 WARNING [train.py:1204] (2/4) Exclude cut with ID _21989_210_5854_1_1531899214931_3271360_126-347531-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 18:11:03,038 WARNING [train.py:1204] (2/4) Exclude cut with ID _7614_210_4915_1_1529326764092_4537359_96-162197-0_sp1.1 from training. Duration: 0.963625 2023-10-04 18:11:03,039 WARNING [train.py:1204] (2/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_1083-147536-0 from training. Duration: 0.97 2023-10-04 18:11:06,005 WARNING [train.py:1204] (2/4) Exclude cut with ID _64029_210_4381_1_1535178502772_3756459_275-16710-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 18:11:07,357 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7264_210_4938_1_1527939911_8132095_800-91995-0_sp0.9 from training. Duration: 0.9555625 2023-10-04 18:11:08,653 WARNING [train.py:1204] (2/4) Exclude cut with ID _3844_210_1390_1_1526727803582_2871209_50-193879-0_sp1.1 from training. Duration: 0.936375 2023-10-04 18:11:10,019 WARNING [train.py:1204] (2/4) Exclude cut with ID _46952_210_5986_1_1534230010357_3107379_232-344503-0 from training. Duration: 0.96 2023-10-04 18:11:15,454 INFO [train.py:1046] (2/4) Epoch 50, batch 1700, loss[loss=0.138, simple_loss=0.2104, pruned_loss=0.03281, over 23637.00 frames. ], tot_loss[loss=0.1525, simple_loss=0.2334, pruned_loss=0.03584, over 4713066.52 frames. ], batch size: 232, lr: 2.03e-03, grad_scale: 16.0 2023-10-04 18:11:15,505 WARNING [train.py:1204] (2/4) Exclude cut with ID _25808_210_13292_1_1532343514562_7366109_206-14076-0_sp1.1 from training. Duration: 0.936375 2023-10-04 18:11:15,506 WARNING [train.py:1204] (2/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_55-31904-0_sp0.9 from training. Duration: 0.97775 2023-10-04 18:11:15,524 WARNING [train.py:1204] (2/4) Exclude cut with ID _14632_210_9988_1_1530691279392_3923200_161-129530-0 from training. Duration: 0.96 2023-10-04 18:11:16,853 WARNING [train.py:1204] (2/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_770-200430-0_sp1.1 from training. Duration: 0.8090625 2023-10-04 18:11:16,870 WARNING [train.py:1204] (2/4) Exclude cut with ID _6270_210_2084_1_1527850911573_3856420_26-277343-0_sp1.1 from training. Duration: 0.7454375 2023-10-04 18:11:16,871 WARNING [train.py:1204] (2/4) Exclude cut with ID _69884_210_9771_1_1535846392818_4166169_88-23776-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 18:11:18,883 WARNING [train.py:1204] (2/4) Exclude cut with ID _60860_210_8254_1_1534896015875_3557169_245-256666-0_sp1.1 from training. Duration: 0.8818125 2023-10-04 18:11:18,920 WARNING [train.py:1204] (2/4) Exclude cut with ID _5250_210_5219_1_1527249561806_8692110_636-251213-0_sp1.1 from training. Duration: 0.8545625 2023-10-04 18:11:18,933 WARNING [train.py:1204] (2/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_540-131164-0 from training. Duration: 0.92 2023-10-04 18:11:21,597 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.2.encoder.layers.2.feed_forward3.out_whiten, num_groups=1, num_channels=384, metric=9.50 vs. limit=15.0 2023-10-04 18:11:22,699 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_5931_210_2040_1_1527428481_4547999_321-193614-0_sp0.9 from training. Duration: 0.7444375 2023-10-04 18:11:31,430 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.694e+02 2.062e+02 2.401e+02 2.672e+02 3.684e+02, threshold=4.801e+02, percent-clipped=0.0 2023-10-04 18:11:32,896 WARNING [train.py:1204] (2/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_373-225403-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 18:11:34,360 WARNING [train.py:1204] (2/4) Exclude cut with ID _20622_210_3852_1_1531622427080_1093839_57-326027-0_sp1.1 from training. Duration: 0.97275 2023-10-04 18:11:34,506 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.balancer1.prob, batch_count=1746693.3333333333, ans=0.125 2023-10-04 18:11:38,738 WARNING [train.py:1204] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_605-257205-0_sp1.1 from training. Duration: 0.536375 2023-10-04 18:11:40,031 WARNING [train.py:1204] (2/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_442-320317-0_sp0.9 from training. Duration: 0.911125 2023-10-04 18:11:40,078 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_35361_210_4238_1_1533262810_7793476_675-87012-0_sp1.1 from training. Duration: 0.8090625 2023-10-04 18:11:40,103 WARNING [train.py:1204] (2/4) Exclude cut with ID _4184_210_2550_1_1526730192013_2219380_306-44736-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 18:11:42,749 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_124-113562-0 from training. Duration: 0.94 2023-10-04 18:11:44,172 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_2821_210_2715_1_1526108385_3763560_73-296673-0_sp0.9 from training. Duration: 0.92225 2023-10-04 18:11:44,187 WARNING [train.py:1204] (2/4) Exclude cut with ID _10301_210_5886_1_1529222275332_3910620_216-46959-0_sp1.1 from training. Duration: 0.92725 2023-10-04 18:11:45,639 WARNING [train.py:1204] (2/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_91-277684-0_sp0.9 from training. Duration: 0.82225 2023-10-04 18:11:47,027 WARNING [train.py:1204] (2/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_351-294650-0_sp0.9 from training. Duration: 0.7 2023-10-04 18:11:50,833 WARNING [train.py:1204] (2/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_209-205365-0 from training. Duration: 0.79 2023-10-04 18:11:52,147 WARNING [train.py:1204] (2/4) Exclude cut with ID _14603_210_4220_1_1530615688441_3097319_97-325053-0 from training. Duration: 0.92 2023-10-04 18:11:53,881 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_49348_210_7927_1_1533954500_3691300_109-276430-0_sp1.1 from training. Duration: 0.92725 2023-10-04 18:11:55,795 WARNING [train.py:1204] (2/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_912-47473-0 from training. Duration: 0.68 2023-10-04 18:11:57,232 WARNING [train.py:1204] (2/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_96-320736-0_sp0.9 from training. Duration: 0.9555625 2023-10-04 18:12:00,275 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=1746826.6666666667, ans=0.1 2023-10-04 18:12:02,911 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.0.conv_skip_rate, batch_count=1746826.6666666667, ans=0.0 2023-10-04 18:12:03,035 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.conv_module2.balancer1.prob, batch_count=1746826.6666666667, ans=0.125 2023-10-04 18:12:04,113 WARNING [train.py:1204] (2/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_1252-235481-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 18:12:04,361 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.feed_forward1.hidden_balancer.prob, batch_count=1746826.6666666667, ans=0.125 2023-10-04 18:12:05,548 WARNING [train.py:1204] (2/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_813-261554-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 18:12:05,617 WARNING [train.py:1204] (2/4) Exclude cut with ID _21632_210_2253_1_1531994001765_3744140_110-274654-0_sp0.9 from training. Duration: 0.911125 2023-10-04 18:12:07,017 WARNING [train.py:1204] (2/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_381-317791-0_sp1.1 from training. Duration: 0.663625 2023-10-04 18:12:08,274 WARNING [train.py:1204] (2/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_51-117576-0_sp1.1 from training. Duration: 0.363625 2023-10-04 18:12:08,299 WARNING [train.py:1204] (2/4) Exclude cut with ID _42321_210_7770_1_1533468631352_4366895_104-215915-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 18:12:10,003 INFO [scaling.py:1118] (2/4) WithLoss: name=encoder.encoders.4.encoder.layers.1.self_attn_weights, loss-sum=0.000e+00 2023-10-04 18:12:11,103 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_40896_210_15710_1_1533433956_7100802_216-22824-0_sp1.1 from training. Duration: 0.92725 2023-10-04 18:12:11,104 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_14831_210_7927_1_1530700200_5583610_181-27467-0 from training. Duration: 0.91 2023-10-04 18:12:11,169 WARNING [train.py:1204] (2/4) Exclude cut with ID _30304_210_6319_1_1532476827261_7582060_1024-265645-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 18:12:11,172 WARNING [train.py:1204] (2/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_412-233904-0_sp1.1 from training. Duration: 0.936375 2023-10-04 18:12:12,434 WARNING [train.py:1204] (2/4) Exclude cut with ID _30191_210_1664_1_1533189915047_7196670_911-223461-0_sp1.1 from training. Duration: 0.92725 2023-10-04 18:12:12,442 WARNING [train.py:1204] (2/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_558-148266-0_sp1.1 from training. Duration: 0.963625 2023-10-04 18:12:13,934 WARNING [train.py:1204] (2/4) Exclude cut with ID _58480_210_12392_1_1534811121111_3646160_212-164495-0_sp1.1 from training. Duration: 0.936375 2023-10-04 18:12:13,935 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_454-276429-0_sp0.9 from training. Duration: 0.9666875 2023-10-04 18:12:15,353 WARNING [train.py:1204] (2/4) Exclude cut with ID _24736_210_3279_1_1532332894218_2838700_73-197086-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 18:12:15,400 WARNING [train.py:1204] (2/4) Exclude cut with ID _5294_210_1747_1_1527249643928_3696179_529-242298-0_sp1.1 from training. Duration: 0.863625 2023-10-04 18:12:16,704 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_53555_210_5632_1_1534595220_7232297_345-171438-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 18:12:19,964 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_28211_210_6388_1_1532861506_4523224_22-25766-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 18:12:21,322 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7002_210_1794_1_1527939683_5157286_163-87552-0 from training. Duration: 0.86 2023-10-04 18:12:24,645 WARNING [train.py:1204] (2/4) Exclude cut with ID _56927_210_9969_1_1534848970840_3695229_409-225129-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 18:12:24,728 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_26192_210_7927_1_1532137269_1040115_21-219331-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 18:12:24,897 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.nonlin_attention.balancer.prob, batch_count=1746893.3333333333, ans=0.125 2023-10-04 18:12:29,436 WARNING [train.py:1204] (2/4) Exclude cut with ID _15012_210_1968_1_1530856820429_5095350_662-210469-0 from training. Duration: 0.57 2023-10-04 18:12:30,758 INFO [train.py:1046] (2/4) Epoch 50, batch 1750, loss[loss=0.1493, simple_loss=0.2371, pruned_loss=0.03079, over 24630.00 frames. ], tot_loss[loss=0.1516, simple_loss=0.2322, pruned_loss=0.03549, over 4709264.44 frames. ], batch size: 65, lr: 2.03e-03, grad_scale: 16.0 2023-10-04 18:12:33,039 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.3.encoder.layers.0.nonlin_attention.whiten2, num_groups=1, num_channels=512, metric=4.90 vs. limit=15.0 2023-10-04 18:12:33,626 WARNING [train.py:1204] (2/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_582-191016-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 18:12:35,154 WARNING [train.py:1204] (2/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_270-210032-0_sp1.1 from training. Duration: 0.963625 2023-10-04 18:12:35,191 WARNING [train.py:1204] (2/4) Exclude cut with ID _6789_210_1534_1_1527857937218_3118950_262-48853-0_sp0.9 from training. Duration: 0.62225 2023-10-04 18:12:36,489 WARNING [train.py:1204] (2/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_847-41334-0 from training. Duration: 0.44 2023-10-04 18:12:36,521 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_40896_210_15710_1_1533433956_7100802_573-46532-0_sp1.1 from training. Duration: 0.92725 2023-10-04 18:12:39,230 WARNING [train.py:1204] (2/4) Exclude cut with ID _21881_210_3525_1_1531825242757_3798380_247-102591-0_sp1.1 from training. Duration: 0.8909375 2023-10-04 18:12:39,265 WARNING [train.py:1204] (2/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_200-21395-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 18:12:43,393 WARNING [train.py:1204] (2/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_711-15688-0 from training. Duration: 0.43 2023-10-04 18:12:44,859 WARNING [train.py:1204] (2/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_53-75025-0_sp1.1 from training. Duration: 0.963625 2023-10-04 18:12:48,265 WARNING [train.py:1204] (2/4) Exclude cut with ID _6194_210_1534_1_1527511826160_3941194_321-148641-0 from training. Duration: 0.64 2023-10-04 18:12:48,310 WARNING [train.py:1204] (2/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_337-294431-0_sp1.1 from training. Duration: 0.936375 2023-10-04 18:12:49,605 WARNING [train.py:1204] (2/4) Exclude cut with ID _21632_210_2253_1_1531994001765_3744140_110-274654-0_sp1.1 from training. Duration: 0.7454375 2023-10-04 18:12:51,063 WARNING [train.py:1204] (2/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_135-174080-0_sp1.1 from training. Duration: 0.4818125 2023-10-04 18:12:52,845 WARNING [train.py:1204] (2/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_21-174514-0 from training. Duration: 0.43 2023-10-04 18:12:56,178 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_38965_210_4852_1_1533174906_3484735_215-294589-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 18:12:57,510 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_548-294316-0 from training. Duration: 0.81 2023-10-04 18:13:05,971 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_103-84010-0_sp1.1 from training. Duration: 0.62725 2023-10-04 18:13:07,495 WARNING [train.py:1204] (2/4) Exclude cut with ID _18199_210_6109_1_1531475789864_4288790_295-198247-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 18:13:07,522 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_13757_210_3528_1_1530440682_4107239_103-313224-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 18:13:07,746 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.feed_forward3.hidden_balancer.prob, batch_count=1747093.3333333333, ans=0.125 2023-10-04 18:13:10,322 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_34886_210_10633_1_1533203882_1755661_67-294673-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 18:13:10,332 WARNING [train.py:1204] (2/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_350-101734-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 18:13:12,995 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_37357_210_17024_1_1533089504_2316121_204-153986-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 18:13:13,759 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.3.encoder.layers.0.whiten, num_groups=1, num_channels=512, metric=5.07 vs. limit=12.0 2023-10-04 18:13:14,407 WARNING [train.py:1204] (2/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_673-115569-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 18:13:15,866 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_489-230703-0_sp0.9 from training. Duration: 0.9444375 2023-10-04 18:13:17,798 WARNING [train.py:1204] (2/4) Exclude cut with ID _5250_210_5219_1_1527249561806_8692110_575-102496-0_sp1.1 from training. Duration: 0.936375 2023-10-04 18:13:19,141 WARNING [train.py:1204] (2/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_669-93163-0 from training. Duration: 0.98 2023-10-04 18:13:19,270 WARNING [train.py:1204] (2/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_249-281349-0_sp0.9 from training. Duration: 0.9444375 2023-10-04 18:13:21,986 WARNING [train.py:1204] (2/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_106-30782-0 from training. Duration: 0.8 2023-10-04 18:13:23,644 WARNING [train.py:1204] (2/4) Exclude cut with ID _8772_210_1850_1_1528635595972_3664011_398-320049-0_sp1.1 from training. Duration: 0.8 2023-10-04 18:13:25,430 WARNING [train.py:1204] (2/4) Exclude cut with ID _44778_210_2054_1_1533798127203_7443090_516-151727-0_sp1.1 from training. Duration: 0.963625 2023-10-04 18:13:26,789 WARNING [train.py:1204] (2/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_325-62802-0_sp1.1 from training. Duration: 0.7909375 2023-10-04 18:13:30,191 WARNING [train.py:1204] (2/4) Exclude cut with ID _14368_210_4220_1_1530532714870_3595529_93-58002-0_sp0.9 from training. Duration: 0.6333125 2023-10-04 18:13:31,497 WARNING [train.py:1204] (2/4) Exclude cut with ID _4681_210_3525_1_1527251040321_1235713_123-24428-0_sp1.1 from training. Duration: 0.436375 2023-10-04 18:13:31,550 WARNING [train.py:1204] (2/4) Exclude cut with ID _5250_210_5219_1_1527249561806_8692110_1185-56473-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 18:13:33,061 WARNING [train.py:1204] (2/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_96-244299-0_sp1.1 from training. Duration: 0.8 2023-10-04 18:13:37,241 WARNING [train.py:1204] (2/4) Exclude cut with ID _59500_210_8254_1_1534809610680_3736009_104-72815-0_sp1.1 from training. Duration: 0.963625 2023-10-04 18:13:40,162 WARNING [train.py:1204] (2/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_468-283113-0_sp1.1 from training. Duration: 0.87275 2023-10-04 18:13:40,357 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.feed_forward3.hidden_balancer.prob, batch_count=1747226.6666666667, ans=0.125 2023-10-04 18:13:41,600 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6239_210_4211_1_1527765584_4306698_528-102186-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 18:13:42,912 WARNING [train.py:1204] (2/4) Exclude cut with ID _45297_210_2721_1_1533715351381_3264949_351-147136-0 from training. Duration: 0.87 2023-10-04 18:13:42,924 WARNING [train.py:1204] (2/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_520-51744-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 18:13:44,237 INFO [train.py:1046] (2/4) Epoch 50, batch 1800, loss[loss=0.1504, simple_loss=0.2415, pruned_loss=0.02962, over 24631.00 frames. ], tot_loss[loss=0.1507, simple_loss=0.2309, pruned_loss=0.03526, over 4701038.91 frames. ], batch size: 68, lr: 2.03e-03, grad_scale: 16.0 2023-10-04 18:13:44,322 WARNING [train.py:1204] (2/4) Exclude cut with ID _5166_210_2084_1_1527073197160_4087993_310-114452-0_sp1.1 from training. Duration: 0.536375 2023-10-04 18:13:44,332 WARNING [train.py:1204] (2/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_450-165710-0_sp1.1 from training. Duration: 0.92725 2023-10-04 18:13:44,342 WARNING [train.py:1204] (2/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_280-120942-0_sp1.1 from training. Duration: 0.7 2023-10-04 18:13:44,368 WARNING [train.py:1204] (2/4) Exclude cut with ID _65585_210_12210_1_1535450451584_3764098_112-294301-0_sp0.9 from training. Duration: 0.97775 2023-10-04 18:13:45,765 WARNING [train.py:1204] (2/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_98-51676-0_sp0.9 from training. Duration: 0.811125 2023-10-04 18:13:48,913 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_33348_210_5632_1_1533086912_3324030_85-28583-0_sp1.1 from training. Duration: 0.7454375 2023-10-04 18:13:48,994 WARNING [train.py:1204] (2/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_336-208232-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 18:13:49,212 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.conv_module2.balancer1.prob, batch_count=1747293.3333333333, ans=0.125 2023-10-04 18:13:50,334 WARNING [train.py:1204] (2/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_339-104791-0_sp0.9 from training. Duration: 0.5444375 2023-10-04 18:13:50,596 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.balancer2.prob, batch_count=1747293.3333333333, ans=0.125 2023-10-04 18:13:54,488 WARNING [train.py:1204] (2/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_559-29840-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 18:13:56,482 WARNING [train.py:1204] (2/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_117-290916-0_sp0.9 from training. Duration: 0.5666875 2023-10-04 18:13:57,869 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_43802_210_2290_1_1533516203_8829504_185-112289-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 18:13:59,698 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.698e+02 2.181e+02 2.519e+02 2.906e+02 5.254e+02, threshold=5.039e+02, percent-clipped=1.0 2023-10-04 18:14:01,889 WARNING [train.py:1204] (2/4) Exclude cut with ID _59256_210_5986_1_1535076085997_3589120_155-298797-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 18:14:04,558 WARNING [train.py:1204] (2/4) Exclude cut with ID _44659_210_2721_1_1534036120061_6614860_246-206531-0_sp1.1 from training. Duration: 0.92725 2023-10-04 18:14:04,592 WARNING [train.py:1204] (2/4) Exclude cut with ID _23818_210_6228_1_1531917063884_3676920_469-151944-0_sp1.1 from training. Duration: 0.92725 2023-10-04 18:14:05,963 WARNING [train.py:1204] (2/4) Exclude cut with ID _13252_210_1925_1_1530671372047_3336909_88-276668-0_sp1.1 from training. Duration: 0.9 2023-10-04 18:14:07,300 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_15101_210_7927_1_1530786415_5033274_320-327527-0_sp1.1 from training. Duration: 0.87275 2023-10-04 18:14:08,713 WARNING [train.py:1204] (2/4) Exclude cut with ID _14998_210_5219_1_1530705582080_7474970_446-112117-0 from training. Duration: 0.9 2023-10-04 18:14:08,787 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_30280_210_1881_1_1532872779_3660139_400-254159-0_sp1.1 from training. Duration: 0.936375 2023-10-04 18:14:12,929 WARNING [train.py:1204] (2/4) Exclude cut with ID _40284_210_2084_1_1533513679814_3704279_158-345341-0_sp1.1 from training. Duration: 0.936375 2023-10-04 18:14:17,107 WARNING [train.py:1204] (2/4) Exclude cut with ID 3033-130750-0096-55598-0 from training. Duration: 0.83 2023-10-04 18:14:18,504 WARNING [train.py:1204] (2/4) Exclude cut with ID _9389_210_1664_1_1529217337301_5289269_379-128794-0 from training. Duration: 0.85 2023-10-04 18:14:18,528 WARNING [train.py:1204] (2/4) Exclude cut with ID _3675_210_4557_1_1526639934408_4182089_456-18305-0 from training. Duration: 0.76 2023-10-04 18:14:18,563 WARNING [train.py:1204] (2/4) Exclude cut with ID _29853_210_7497_1_1532505600851_6642110_571-249460-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 18:14:19,950 WARNING [train.py:1204] (2/4) Exclude cut with ID _4068_210_2629_1_1526697350395_1546259_230-325568-0_sp1.1 from training. Duration: 0.92725 2023-10-04 18:14:19,950 WARNING [train.py:1204] (2/4) Exclude cut with ID _34415_210_8750_1_1533085213375_3451116_213-267570-0_sp1.1 from training. Duration: 0.963625 2023-10-04 18:14:21,930 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6767_210_3273_1_1527855783_1718932_142-127132-0_sp1.1 from training. Duration: 0.863625 2023-10-04 18:14:29,812 WARNING [train.py:1204] (2/4) Exclude cut with ID IC0653W0001-322925 from training. Duration: 0.896 2023-10-04 18:14:31,169 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_4528_210_3273_1_1527076654_3276777_209-219804-0_sp1.1 from training. Duration: 0.62725 2023-10-04 18:14:32,486 WARNING [train.py:1204] (2/4) Exclude cut with ID _55844_210_2346_1_1534471212569_7625707_187-204971-0_sp1.1 from training. Duration: 0.936375 2023-10-04 18:14:32,607 WARNING [train.py:1204] (2/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_369-325184-0 from training. Duration: 0.99 2023-10-04 18:14:32,633 WARNING [train.py:1204] (2/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_791-45149-0 from training. Duration: 0.86 2023-10-04 18:14:34,514 WARNING [train.py:1204] (2/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_317-211107-0_sp1.1 from training. Duration: 0.836375 2023-10-04 18:14:35,844 WARNING [train.py:1204] (2/4) Exclude cut with ID _30800_210_22382_1_1532499347904_4515959_488-330913-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 18:14:35,957 WARNING [train.py:1204] (2/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_506-42614-0_sp1.1 from training. Duration: 0.7181875 2023-10-04 18:14:40,076 WARNING [train.py:1204] (2/4) Exclude cut with ID _8696_210_1876_1_1528545632440_3345058_209-54069-0 from training. Duration: 0.67 2023-10-04 18:14:44,250 WARNING [train.py:1204] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_215-202973-0_sp1.1 from training. Duration: 0.8 2023-10-04 18:14:45,555 WARNING [train.py:1204] (2/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_321-215742-0 from training. Duration: 0.5 2023-10-04 18:14:45,614 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_40896_210_15710_1_1533433956_7100802_428-32114-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 18:14:45,622 WARNING [train.py:1204] (2/4) Exclude cut with ID _27013_210_1389_1_1532437173240_3252649_467-263151-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 18:14:46,979 WARNING [train.py:1204] (2/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_734-129761-0_sp0.9 from training. Duration: 0.911125 2023-10-04 18:14:47,013 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_521-214276-0 from training. Duration: 0.44 2023-10-04 18:14:48,534 WARNING [train.py:1204] (2/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_80-147301-0_sp1.1 from training. Duration: 0.763625 2023-10-04 18:14:48,536 WARNING [train.py:1204] (2/4) Exclude cut with ID _9679_210_7497_1_1529129212906_4363929_56-307796-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 18:14:53,053 WARNING [train.py:1204] (2/4) Exclude cut with ID _14832_210_8750_1_1530691351539_3171348_1-54315-0 from training. Duration: 0.87 2023-10-04 18:14:53,054 WARNING [train.py:1204] (2/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_558-155750-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 18:14:54,937 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_142-309370-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 18:14:54,965 WARNING [train.py:1204] (2/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_18-46756-0_sp0.9 from training. Duration: 0.87775 2023-10-04 18:14:54,972 WARNING [train.py:1204] (2/4) Exclude cut with ID _61148_210_6897_1_1535094022726_4217610_279-212159-0_sp1.1 from training. Duration: 0.936375 2023-10-04 18:14:58,085 WARNING [train.py:1204] (2/4) Exclude cut with ID _21987_210_5854_1_1531821500482_3637038_164-2641-0_sp1.1 from training. Duration: 0.936375 2023-10-04 18:14:58,129 WARNING [train.py:1204] (2/4) Exclude cut with ID _8064_210_5399_1_1528372299291_4282973_395-340852-0_sp1.1 from training. Duration: 0.8454375 2023-10-04 18:14:59,460 INFO [train.py:1046] (2/4) Epoch 50, batch 1850, loss[loss=0.1396, simple_loss=0.2273, pruned_loss=0.02598, over 24634.00 frames. ], tot_loss[loss=0.1509, simple_loss=0.2314, pruned_loss=0.0352, over 4714094.24 frames. ], batch size: 65, lr: 2.03e-03, grad_scale: 16.0 2023-10-04 18:14:59,566 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1077-184261-0_sp1.1 from training. Duration: 0.963625 2023-10-04 18:14:59,574 WARNING [train.py:1204] (2/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_1176-204992-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 18:15:02,309 WARNING [train.py:1204] (2/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_563-347832-0_sp1.1 from training. Duration: 0.8090625 2023-10-04 18:15:02,379 WARNING [train.py:1204] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_380-204756-0_sp0.9 from training. Duration: 0.9555625 2023-10-04 18:15:09,771 WARNING [train.py:1204] (2/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_212-10501-0_sp1.1 from training. Duration: 0.8545625 2023-10-04 18:15:09,800 WARNING [train.py:1204] (2/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_445-117398-0 from training. Duration: 0.89 2023-10-04 18:15:12,711 WARNING [train.py:1204] (2/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_29-280622-0 from training. Duration: 0.82 2023-10-04 18:15:15,481 WARNING [train.py:1204] (2/4) Exclude cut with ID _26664_210_7770_1_1532258943280_3418270_187-80815-0 from training. Duration: 0.93 2023-10-04 18:15:18,832 WARNING [train.py:1204] (2/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_414-349640-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 18:15:19,957 WARNING [train.py:1204] (2/4) Exclude cut with ID _45021_210_9994_1_1533619751094_3330288_96-239010-0 from training. Duration: 0.91 2023-10-04 18:15:19,960 WARNING [train.py:1204] (2/4) Exclude cut with ID _5189_210_4557_1_1527248319428_3769229_199-53526-0_sp0.9 from training. Duration: 0.4666875 2023-10-04 18:15:29,470 WARNING [train.py:1204] (2/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_63-101996-0_sp1.1 from training. Duration: 0.8 2023-10-04 18:15:29,599 WARNING [train.py:1204] (2/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_199-86432-0 from training. Duration: 0.98 2023-10-04 18:15:29,776 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.conv_skip_rate, batch_count=1747760.0, ans=0.0 2023-10-04 18:15:32,365 WARNING [train.py:1204] (2/4) Exclude cut with ID _36399_210_16049_1_1532998771377_3698333_207-259013-0_sp1.1 from training. Duration: 0.92725 2023-10-04 18:15:32,408 WARNING [train.py:1204] (2/4) Exclude cut with ID _62574_210_2655_1_1535180407058_3933249_567-111756-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 18:15:35,589 WARNING [train.py:1204] (2/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_436-318113-0 from training. Duration: 0.75 2023-10-04 18:15:35,635 WARNING [train.py:1204] (2/4) Exclude cut with ID _53379_210_20034_1_1534222027363_3854654_43-305214-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 18:15:35,657 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_15126_210_6753_1_1530778677_2591185_113-157651-0_sp1.1 from training. Duration: 0.6181875 2023-10-04 18:15:38,258 WARNING [train.py:1204] (2/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_146-218763-0_sp1.1 from training. Duration: 0.7818125 2023-10-04 18:15:39,664 WARNING [train.py:1204] (2/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_714-310538-0_sp0.9 from training. Duration: 0.9555625 2023-10-04 18:15:42,253 WARNING [train.py:1204] (2/4) Exclude cut with ID _24271_210_12658_1_1531987270022_7190519_709-16721-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 18:15:45,137 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.1.conv_module1.balancer1.prob, batch_count=1747826.6666666667, ans=0.125 2023-10-04 18:15:45,588 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.3.encoder.layers.1.feed_forward3.out_whiten, num_groups=1, num_channels=512, metric=9.26 vs. limit=15.0 2023-10-04 18:15:46,254 WARNING [train.py:1204] (2/4) Exclude cut with ID _5190_210_4557_1_1527244368799_373439_28-197030-0_sp0.9 from training. Duration: 0.8 2023-10-04 18:15:46,310 WARNING [train.py:1204] (2/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_267-197536-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 18:15:46,420 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.1.self_attn_weights.pos_emb_skip_rate, batch_count=1747826.6666666667, ans=0.0 2023-10-04 18:15:47,775 WARNING [train.py:1204] (2/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_658-282610-0_sp1.1 from training. Duration: 0.4818125 2023-10-04 18:15:47,793 WARNING [train.py:1204] (2/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_257-67050-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 18:15:49,762 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_14906_210_7927_1_1530754190_9408633_961-53402-0_sp1.1 from training. Duration: 0.936375 2023-10-04 18:15:51,199 WARNING [train.py:1204] (2/4) Exclude cut with ID _60113_210_16271_1_1535164212481_4386079_490-280290-0_sp1.1 from training. Duration: 0.8909375 2023-10-04 18:15:54,475 WARNING [train.py:1204] (2/4) Exclude cut with ID _14100_210_8341_1_1530529320613_3678069_258-224599-0 from training. Duration: 0.77 2023-10-04 18:15:54,536 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6961_210_2087_1_1527937168_6815011_565-326898-0_sp1.1 from training. Duration: 0.936375 2023-10-04 18:15:59,832 WARNING [train.py:1204] (2/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_239-316802-0_sp1.1 from training. Duration: 0.62725 2023-10-04 18:15:59,887 WARNING [train.py:1204] (2/4) Exclude cut with ID _55857_210_2346_1_1534644007831_6898209_745-98391-0_sp1.1 from training. Duration: 0.6909375 2023-10-04 18:15:59,891 WARNING [train.py:1204] (2/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_154-135696-0 from training. Duration: 0.8 2023-10-04 18:15:59,893 WARNING [train.py:1204] (2/4) Exclude cut with ID _14368_210_4220_1_1530532714870_3595529_93-58002-0 from training. Duration: 0.57 2023-10-04 18:16:01,765 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9025W0005-507335 from training. Duration: 0.896 2023-10-04 18:16:03,166 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9029W0034-509435 from training. Duration: 0.64 2023-10-04 18:16:04,628 WARNING [train.py:1204] (2/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_76-203867-0_sp1.1 from training. Duration: 0.6545625 2023-10-04 18:16:04,637 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_46639_210_8341_1_1533726088_4495500_66-346325-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 18:16:04,645 WARNING [train.py:1204] (2/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_145-137790-0_sp0.9 from training. Duration: 0.92225 2023-10-04 18:16:06,360 WARNING [train.py:1204] (2/4) Exclude cut with ID _36522_210_8887_1_1533177030611_1772478_7-327299-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 18:16:06,433 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9042W0009-515615 from training. Duration: 0.896 2023-10-04 18:16:07,739 WARNING [train.py:1204] (2/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_39-248870-0_sp0.9 from training. Duration: 0.7666875 2023-10-04 18:16:07,796 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_35441_210_16554_1_1533347572_4092680_362-242912-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 18:16:09,228 WARNING [train.py:1204] (2/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_216-343007-0_sp1.1 from training. Duration: 0.6 2023-10-04 18:16:09,304 WARNING [train.py:1204] (2/4) Exclude cut with ID _3867_210_1883_1_1526641200575_4475830_272-215563-0_sp0.9 from training. Duration: 0.6333125 2023-10-04 18:16:09,383 WARNING [train.py:1204] (2/4) Exclude cut with ID _21783_210_11357_1_1531742458730_3762679_553-175603-0_sp1.1 from training. Duration: 0.92725 2023-10-04 18:16:09,393 WARNING [train.py:1204] (2/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_963-187109-0 from training. Duration: 0.99 2023-10-04 18:16:12,136 WARNING [train.py:1204] (2/4) Exclude cut with ID _44973_210_1511_1_1533794381163_3634770_495-211466-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 18:16:12,153 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9068W0002-529085 from training. Duration: 0.896 2023-10-04 18:16:13,470 INFO [train.py:1046] (2/4) Epoch 50, batch 1900, loss[loss=0.1494, simple_loss=0.2267, pruned_loss=0.036, over 23356.00 frames. ], tot_loss[loss=0.1515, simple_loss=0.2321, pruned_loss=0.03541, over 4723558.33 frames. ], batch size: 106, lr: 2.03e-03, grad_scale: 16.0 2023-10-04 18:16:13,524 WARNING [train.py:1204] (2/4) Exclude cut with ID _5294_210_1747_1_1527249643928_3696179_180-100713-0_sp1.1 from training. Duration: 0.736375 2023-10-04 18:16:13,615 WARNING [train.py:1204] (2/4) Exclude cut with ID _35799_210_5854_1_1533263487887_3529071_149-49689-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 18:16:17,636 WARNING [train.py:1204] (2/4) Exclude cut with ID _67725_210_27767_1_1535608756671_5105359_36-220794-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 18:16:19,165 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.conv_skip_rate, batch_count=1747960.0, ans=0.0 2023-10-04 18:16:20,339 WARNING [train.py:1204] (2/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_338-96520-0_sp1.1 from training. Duration: 0.8545625 2023-10-04 18:16:22,485 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9104W0004-547160 from training. Duration: 0.896 2023-10-04 18:16:22,565 WARNING [train.py:1204] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_330-327861-0 from training. Duration: 0.67 2023-10-04 18:16:22,687 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.0.bypass_mid.scale_min, batch_count=1747960.0, ans=0.2 2023-10-04 18:16:22,704 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.feed_forward2.hidden_balancer.prob, batch_count=1747960.0, ans=0.125 2023-10-04 18:16:23,802 WARNING [train.py:1204] (2/4) Exclude cut with ID _14087_210_2253_1_1530529224512_3014029_156-257607-0_sp0.9 from training. Duration: 0.92225 2023-10-04 18:16:23,861 WARNING [train.py:1204] (2/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_476-322050-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 18:16:23,869 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9118W0017-553385 from training. Duration: 0.896 2023-10-04 18:16:25,160 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9120W0028-554435 from training. Duration: 0.896 2023-10-04 18:16:27,852 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.683e+02 2.104e+02 2.393e+02 2.783e+02 5.984e+02, threshold=4.786e+02, percent-clipped=4.0 2023-10-04 18:16:28,218 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.conv_module2.balancer1.prob, batch_count=1748026.6666666667, ans=0.125 2023-10-04 18:16:29,434 WARNING [train.py:1204] (2/4) Exclude cut with ID _56524_210_9642_1_1534572011779_3619170_145-310842-0 from training. Duration: 0.99 2023-10-04 18:16:30,914 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.conv_module1.balancer1.prob, batch_count=1748026.6666666667, ans=0.125 2023-10-04 18:16:32,743 WARNING [train.py:1204] (2/4) Exclude cut with ID _51631_210_8254_1_1534129228420_4084794_270-222508-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 18:16:34,369 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=1748026.6666666667, ans=0.1 2023-10-04 18:16:35,607 WARNING [train.py:1204] (2/4) Exclude cut with ID _14268_210_9206_1_1530622771285_4714099_331-105351-0 from training. Duration: 0.65 2023-10-04 18:16:37,604 WARNING [train.py:1204] (2/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_40-129756-0 from training. Duration: 0.81 2023-10-04 18:16:47,154 WARNING [train.py:1204] (2/4) Exclude cut with ID _3541_210_2477_1_1526475408709_3263973_231-104736-0 from training. Duration: 0.78 2023-10-04 18:16:49,944 WARNING [train.py:1204] (2/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_214-291369-0 from training. Duration: 0.63 2023-10-04 18:16:49,952 WARNING [train.py:1204] (2/4) Exclude cut with ID _62574_210_2655_1_1535180407058_3933249_369-84347-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 18:16:50,013 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9210W0111-599240 from training. Duration: 0.896 2023-10-04 18:16:50,017 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_92-246270-0 from training. Duration: 0.85 2023-10-04 18:16:50,043 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_37152_210_9697_1_1533106573_3844188_29-239070-0 from training. Duration: 0.98 2023-10-04 18:16:51,349 WARNING [train.py:1204] (2/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_239-22076-0 from training. Duration: 0.85 2023-10-04 18:16:51,352 WARNING [train.py:1204] (2/4) Exclude cut with ID _65962_210_29927_1_1535436026519_4174930_368-44639-0_sp1.1 from training. Duration: 0.97275 2023-10-04 18:16:56,086 WARNING [train.py:1204] (2/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_152-212533-0 from training. Duration: 0.97 2023-10-04 18:16:59,404 WARNING [train.py:1204] (2/4) Exclude cut with ID _14603_210_4220_1_1530615688441_3097319_90-31383-0_sp1.1 from training. Duration: 0.7909375 2023-10-04 18:17:00,915 WARNING [train.py:1204] (2/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_196-152868-0_sp1.1 from training. Duration: 0.92725 2023-10-04 18:17:00,917 WARNING [train.py:1204] (2/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_140-181176-0 from training. Duration: 0.92 2023-10-04 18:17:03,675 WARNING [train.py:1204] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_168-32906-0_sp0.9 from training. Duration: 0.7666875 2023-10-04 18:17:04,649 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.0.layers.0.conv_module2.whiten, num_groups=1, num_channels=192, metric=10.13 vs. limit=15.0 2023-10-04 18:17:08,931 WARNING [train.py:1204] (2/4) Exclude cut with ID _14922_210_6228_1_1530707435273_3693310_13-165512-0 from training. Duration: 0.82 2023-10-04 18:17:10,220 WARNING [train.py:1204] (2/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_381-317791-0_sp0.9 from training. Duration: 0.811125 2023-10-04 18:17:14,638 WARNING [train.py:1204] (2/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_63-267585-0_sp1.1 from training. Duration: 0.8181875 2023-10-04 18:17:14,640 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_326-303945-0_sp1.1 from training. Duration: 0.9 2023-10-04 18:17:16,068 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_194-143381-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 18:17:16,151 WARNING [train.py:1204] (2/4) Exclude cut with ID _39290_210_4220_1_1533862778760_3075118_194-16667-0_sp1.1 from training. Duration: 0.963625 2023-10-04 18:17:18,944 WARNING [train.py:1204] (2/4) Exclude cut with ID _15012_210_1968_1_1530856820429_5095350_662-210469-0_sp0.9 from training. Duration: 0.6333125 2023-10-04 18:17:18,976 WARNING [train.py:1204] (2/4) Exclude cut with ID _4697_210_2069_1_1526815801953_3823098_68-219807-0_sp1.1 from training. Duration: 0.42725 2023-10-04 18:17:20,419 WARNING [train.py:1204] (2/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_321-22821-0_sp0.9 from training. Duration: 0.988875 2023-10-04 18:17:21,953 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.conv_skip_rate, batch_count=1748226.6666666667, ans=0.0 2023-10-04 18:17:23,133 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_14056_210_4163_1_1530604483_7572827_1037-45317-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 18:17:23,135 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_403-39619-0_sp1.1 from training. Duration: 0.763625 2023-10-04 18:17:26,380 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_20120_210_6753_1_1531643035_5482859_510-96996-0_sp0.9 from training. Duration: 0.9666875 2023-10-04 18:17:26,383 WARNING [train.py:1204] (2/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_225-180155-0_sp1.1 from training. Duration: 0.92725 2023-10-04 18:17:27,676 INFO [train.py:1046] (2/4) Epoch 50, batch 1950, loss[loss=0.1546, simple_loss=0.2495, pruned_loss=0.0298, over 24607.00 frames. ], tot_loss[loss=0.1526, simple_loss=0.2335, pruned_loss=0.03586, over 4729099.72 frames. ], batch size: 71, lr: 2.03e-03, grad_scale: 16.0 2023-10-04 18:17:27,754 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_278-272612-0_sp0.9 from training. Duration: 0.811125 2023-10-04 18:17:29,130 WARNING [train.py:1204] (2/4) Exclude cut with ID _53713_210_24116_1_1534314487762_7416751_691-70244-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 18:17:32,441 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6788_210_1388_1_1527898698_4562021_91-197981-0_sp1.1 from training. Duration: 0.7545625 2023-10-04 18:17:35,154 WARNING [train.py:1204] (2/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_60-18123-0_sp0.9 from training. Duration: 0.97775 2023-10-04 18:17:35,196 WARNING [train.py:1204] (2/4) Exclude cut with ID _35799_210_5854_1_1533263487887_3529071_5-214617-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 18:17:35,201 WARNING [train.py:1204] (2/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_890-5117-0_sp1.1 from training. Duration: 0.7090625 2023-10-04 18:17:38,063 WARNING [train.py:1204] (2/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_660-211088-0 from training. Duration: 0.62 2023-10-04 18:17:39,903 WARNING [train.py:1204] (2/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_314-126819-0_sp0.9 from training. Duration: 0.5555625 2023-10-04 18:17:39,934 WARNING [train.py:1204] (2/4) Exclude cut with ID _21632_210_2253_1_1531994001765_3744140_362-66225-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 18:17:40,016 WARNING [train.py:1204] (2/4) Exclude cut with ID _61118_210_9527_1_1534897668884_3752630_173-258555-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 18:17:43,096 WARNING [train.py:1204] (2/4) Exclude cut with ID _7719_210_3525_1_1528456153163_4296960_88-250259-0_sp0.9 from training. Duration: 0.9333125 2023-10-04 18:17:43,119 WARNING [train.py:1204] (2/4) Exclude cut with ID _15140_210_9288_1_1530791388450_4880149_539-153708-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 18:17:43,155 WARNING [train.py:1204] (2/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_253-42544-0_sp1.1 from training. Duration: 0.97275 2023-10-04 18:17:44,598 WARNING [train.py:1204] (2/4) Exclude cut with ID _33640_210_2655_1_1533117605479_4855469_479-296877-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 18:17:47,186 WARNING [train.py:1204] (2/4) Exclude cut with ID 3033-130750-0096-55598-0_sp1.1 from training. Duration: 0.7545625 2023-10-04 18:17:47,204 WARNING [train.py:1204] (2/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_191-41023-0_sp1.1 from training. Duration: 0.6454375 2023-10-04 18:17:47,231 WARNING [train.py:1204] (2/4) Exclude cut with ID _8365_210_2064_1_1528592311070_4677690_431-294102-0_sp1.1 from training. Duration: 0.8090625 2023-10-04 18:17:47,355 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.nonlin_attention.balancer.prob, batch_count=1748360.0, ans=0.125 2023-10-04 18:17:48,496 WARNING [train.py:1204] (2/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_893-296030-0_sp1.1 from training. Duration: 0.97275 2023-10-04 18:17:51,278 WARNING [train.py:1204] (2/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_401-343820-0_sp1.1 from training. Duration: 0.97275 2023-10-04 18:17:53,976 WARNING [train.py:1204] (2/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_127-566-0_sp1.1 from training. Duration: 0.763625 2023-10-04 18:17:53,977 WARNING [train.py:1204] (2/4) Exclude cut with ID _14100_210_8341_1_1530529320613_3678069_126-272847-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 18:17:53,991 WARNING [train.py:1204] (2/4) Exclude cut with ID _15133_210_6228_1_1530794512074_3080379_29-264458-0_sp1.1 from training. Duration: 0.52725 2023-10-04 18:17:53,992 WARNING [train.py:1204] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_127-119109-0 from training. Duration: 0.72 2023-10-04 18:17:55,360 WARNING [train.py:1204] (2/4) Exclude cut with ID _6537_210_1883_1_1527987559710_3597129_382-133062-0_sp1.1 from training. Duration: 0.6181875 2023-10-04 18:17:55,371 WARNING [train.py:1204] (2/4) Exclude cut with ID _29529_210_7234_1_1532393961455_3977460_391-219682-0_sp1.1 from training. Duration: 0.936375 2023-10-04 18:17:56,120 WARNING [train.py:1204] (2/4) Exclude cut with ID _37537_210_2253_1_1533171758568_3421949_96-137189-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 18:17:57,580 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=1748426.6666666667, ans=0.1 2023-10-04 18:18:00,077 WARNING [train.py:1204] (2/4) Exclude cut with ID _58414_210_8254_1_1535421609162_7151182_15-276301-0_sp1.1 from training. Duration: 0.97275 2023-10-04 18:18:01,452 WARNING [train.py:1204] (2/4) Exclude cut with ID _59502_210_12210_1_1535107024698_1835139_110-289074-0_sp1.1 from training. Duration: 0.92725 2023-10-04 18:18:07,702 WARNING [train.py:1204] (2/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_367-61330-0_sp1.1 from training. Duration: 0.6818125 2023-10-04 18:18:07,986 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.nonlin_attention.balancer.prob, batch_count=1748426.6666666667, ans=0.125 2023-10-04 18:18:09,095 WARNING [train.py:1204] (2/4) Exclude cut with ID _45297_210_2721_1_1533715351381_3264949_353-9541-0_sp1.1 from training. Duration: 0.8818125 2023-10-04 18:18:11,042 WARNING [train.py:1204] (2/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_85-138257-0_sp0.9 from training. Duration: 0.788875 2023-10-04 18:18:11,083 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_3787_210_2040_1_1526477544_5478586_603-120324-0 from training. Duration: 0.81 2023-10-04 18:18:12,950 WARNING [train.py:1204] (2/4) Exclude cut with ID _42863_210_9070_1_1533902398788_3696920_411-204131-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 18:18:15,919 WARNING [train.py:1204] (2/4) Exclude cut with ID _30196_210_1664_1_1533708496522_7074140_822-289909-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 18:18:17,379 WARNING [train.py:1204] (2/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_32-335103-0_sp0.9 from training. Duration: 0.911125 2023-10-04 18:18:17,437 WARNING [train.py:1204] (2/4) Exclude cut with ID _6493_210_1747_1_1528459210424_4012470_513-140967-0_sp0.9 from training. Duration: 0.87775 2023-10-04 18:18:22,029 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.conv_module1.balancer2.prob, batch_count=1748493.3333333333, ans=0.125 2023-10-04 18:18:24,512 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_47896_210_20508_1_1534492518_5958868_439-257766-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 18:18:26,276 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_1294-63152-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 18:18:27,636 WARNING [train.py:1204] (2/4) Exclude cut with ID _59170_210_6721_1_1534983633960_4203335_319-166512-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 18:18:30,361 WARNING [train.py:1204] (2/4) Exclude cut with ID _3541_210_2477_1_1526475408709_3263973_146-3470-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 18:18:33,725 WARNING [train.py:1204] (2/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_678-159307-0_sp1.1 from training. Duration: 0.77275 2023-10-04 18:18:33,772 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_343-48822-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 18:18:34,426 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.2.encoder.layers.2.nonlin_attention.whiten1, num_groups=1, num_channels=288, metric=8.03 vs. limit=10.0 2023-10-04 18:18:35,061 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_4692_210_3528_1_1526899755_4323254_107-44401-0 from training. Duration: 0.86 2023-10-04 18:18:35,066 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6291_210_3273_1_1527683649_1078498_74-292608-0_sp1.1 from training. Duration: 0.6545625 2023-10-04 18:18:35,133 WARNING [train.py:1204] (2/4) Exclude cut with ID _55644_210_11856_1_1534593599582_4103719_293-248694-0_sp1.1 from training. Duration: 0.97275 2023-10-04 18:18:36,454 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_3172_210_2087_1_1526292929_3987865_197-97762-0 from training. Duration: 0.75 2023-10-04 18:18:37,957 WARNING [train.py:1204] (2/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_264-348896-0_sp0.9 from training. Duration: 0.9444375 2023-10-04 18:18:39,494 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.0.balancer_ff2.min_abs, batch_count=1748560.0, ans=0.1 2023-10-04 18:18:41,305 WARNING [train.py:1204] (2/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_209-205365-0_sp0.9 from training. Duration: 0.87775 2023-10-04 18:18:43,191 INFO [train.py:1046] (2/4) Epoch 50, batch 2000, loss[loss=0.1502, simple_loss=0.2348, pruned_loss=0.03282, over 24619.00 frames. ], tot_loss[loss=0.1528, simple_loss=0.2337, pruned_loss=0.03599, over 4727280.18 frames. ], batch size: 65, lr: 2.03e-03, grad_scale: 32.0 2023-10-04 18:18:43,259 WARNING [train.py:1204] (2/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_222-198127-0_sp0.9 from training. Duration: 0.9333125 2023-10-04 18:18:44,504 WARNING [train.py:1204] (2/4) Exclude cut with ID _57572_210_5517_1_1534921219624_3809484_52-43530-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 18:18:45,867 WARNING [train.py:1204] (2/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_96-320736-0_sp1.1 from training. Duration: 0.7818125 2023-10-04 18:18:47,293 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_13091_210_7925_1_1530609447_8987354_1159-225508-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 18:18:50,102 WARNING [train.py:1204] (2/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_237-44332-0 from training. Duration: 0.94 2023-10-04 18:18:51,459 WARNING [train.py:1204] (2/4) Exclude cut with ID _7970_210_1883_1_1528455616296_3472230_25-178407-0_sp0.9 from training. Duration: 0.788875 2023-10-04 18:18:54,266 WARNING [train.py:1204] (2/4) Exclude cut with ID _29003_210_19596_1_1532507391720_3894098_357-257062-0_sp1.1 from training. Duration: 0.936375 2023-10-04 18:18:55,660 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_809-17156-0 from training. Duration: 0.73 2023-10-04 18:18:57,498 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.701e+02 2.194e+02 2.589e+02 3.041e+02 4.664e+02, threshold=5.178e+02, percent-clipped=0.0 2023-10-04 18:18:58,900 WARNING [train.py:1204] (2/4) Exclude cut with ID _7160_210_1876_1_1527940923231_3703799_302-150728-0_sp1.1 from training. Duration: 0.7090625 2023-10-04 18:18:58,916 WARNING [train.py:1204] (2/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_444-28041-0_sp0.9 from training. Duration: 0.9444375 2023-10-04 18:19:01,558 WARNING [train.py:1204] (2/4) Exclude cut with ID _52892_210_6721_1_1534378941259_4124129_278-251666-0_sp1.1 from training. Duration: 0.92725 2023-10-04 18:19:03,618 WARNING [train.py:1204] (2/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_115-326778-0 from training. Duration: 0.93 2023-10-04 18:19:06,221 WARNING [train.py:1204] (2/4) Exclude cut with ID _41391_210_7994_1_1533603771872_3638201_31-188961-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 18:19:06,348 WARNING [train.py:1204] (2/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_1073-6264-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 18:19:07,654 WARNING [train.py:1204] (2/4) Exclude cut with ID _66868_210_9527_1_1535509287954_4561140_435-259515-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 18:19:07,743 WARNING [train.py:1204] (2/4) Exclude cut with ID _14886_210_4220_1_1530702020355_3516190_159-73748-0 from training. Duration: 0.83 2023-10-04 18:19:07,780 WARNING [train.py:1204] (2/4) Exclude cut with ID _15150_210_8341_1_1530752411803_7256384_469-239509-0_sp1.1 from training. Duration: 0.6181875 2023-10-04 18:19:09,500 INFO [scaling.py:1118] (2/4) WithLoss: name=encoder.encoders.5.encoder.layers.0.self_attn_weights, loss-sum=0.000e+00 2023-10-04 18:19:10,475 WARNING [train.py:1204] (2/4) Exclude cut with ID _8064_210_5399_1_1528372299291_4282973_395-340852-0 from training. Duration: 0.93 2023-10-04 18:19:10,486 WARNING [train.py:1204] (2/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_138-187031-0_sp1.1 from training. Duration: 0.9 2023-10-04 18:19:13,096 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_13757_210_3528_1_1530440682_4107239_48-106347-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 18:19:13,162 WARNING [train.py:1204] (2/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_164-224052-0_sp1.1 from training. Duration: 0.636375 2023-10-04 18:19:13,166 WARNING [train.py:1204] (2/4) Exclude cut with ID _59288_210_5854_1_1535077757774_3412246_408-322648-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 18:19:14,868 WARNING [train.py:1204] (2/4) Exclude cut with ID _30191_210_1664_1_1533189915047_7196670_827-36359-0_sp1.1 from training. Duration: 0.963625 2023-10-04 18:19:14,958 WARNING [train.py:1204] (2/4) Exclude cut with ID _4680_210_3525_1_1527249757953_893386_75-78979-0_sp1.1 from training. Duration: 0.82725 2023-10-04 18:19:16,862 WARNING [train.py:1204] (2/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_383-165234-0 from training. Duration: 0.94 2023-10-04 18:19:19,652 WARNING [train.py:1204] (2/4) Exclude cut with ID _19896_210_1883_1_1531709022662_6649569_94-229927-0 from training. Duration: 0.97 2023-10-04 18:19:19,659 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6788_210_1388_1_1527898698_4562021_180-75288-0_sp1.1 from training. Duration: 0.9 2023-10-04 18:19:20,940 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_26433_210_12108_1_1532157158_4056480_54-321349-0_sp1.1 from training. Duration: 0.97275 2023-10-04 18:19:26,574 WARNING [train.py:1204] (2/4) Exclude cut with ID _56108_210_16380_1_1534505446159_3887959_265-161493-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 18:19:27,330 WARNING [train.py:1204] (2/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_395-111602-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 18:19:27,336 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_154-316450-0_sp0.9 from training. Duration: 0.8444375 2023-10-04 18:19:28,652 WARNING [train.py:1204] (2/4) Exclude cut with ID _43552_210_8913_1_1533726031059_3523429_381-116041-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 18:19:31,357 WARNING [train.py:1204] (2/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_65-104124-0_sp1.1 from training. Duration: 0.963625 2023-10-04 18:19:31,414 WARNING [train.py:1204] (2/4) Exclude cut with ID _6536_210_1883_1_1527850783339_3319069_169-23811-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 18:19:31,454 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_289-180904-0_sp0.9 from training. Duration: 0.8444375 2023-10-04 18:19:31,465 WARNING [train.py:1204] (2/4) Exclude cut with ID _56406_210_9288_1_1534899618664_3496149_226-242400-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 18:19:34,070 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_24899_210_16554_1_1532046422_3827462_276-216330-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 18:19:36,079 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.3.encoder.layers.1.whiten, num_groups=1, num_channels=512, metric=8.30 vs. limit=12.0 2023-10-04 18:19:36,910 WARNING [train.py:1204] (2/4) Exclude cut with ID _29345_210_4381_1_1532689168263_3860169_198-174328-0_sp1.1 from training. Duration: 0.82725 2023-10-04 18:19:36,977 WARNING [train.py:1204] (2/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_338-96520-0 from training. Duration: 0.94 2023-10-04 18:19:41,703 WARNING [train.py:1204] (2/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_7-111954-0_sp1.1 from training. Duration: 0.5090625 2023-10-04 18:19:41,857 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.attention_skip_rate, batch_count=1748893.3333333333, ans=0.0 2023-10-04 18:19:43,091 WARNING [train.py:1204] (2/4) Exclude cut with ID _36692_210_5665_1_1533608967420_398809_8-16261-0_sp1.1 from training. Duration: 0.97275 2023-10-04 18:19:44,548 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.balancer2.prob, batch_count=1748893.3333333333, ans=0.125 2023-10-04 18:19:47,519 WARNING [train.py:1204] (2/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_86-212657-0_sp1.1 from training. Duration: 0.97275 2023-10-04 18:19:47,526 WARNING [train.py:1204] (2/4) Exclude cut with ID _44659_210_2721_1_1534036120061_6614860_626-298884-0_sp1.1 from training. Duration: 0.8818125 2023-10-04 18:19:50,914 WARNING [train.py:1204] (2/4) Exclude cut with ID _49755_210_18053_1_1534581005846_3218709_120-257918-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 18:19:52,320 WARNING [train.py:1204] (2/4) Exclude cut with ID _21881_210_3525_1_1531825242757_3798380_400-113821-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 18:19:52,322 WARNING [train.py:1204] (2/4) Exclude cut with ID _21783_210_11357_1_1531742458730_3762679_387-236773-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 18:19:52,405 WARNING [train.py:1204] (2/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_613-232856-0_sp1.1 from training. Duration: 0.4909375 2023-10-04 18:19:52,426 WARNING [train.py:1204] (2/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_181-78632-0_sp0.9 from training. Duration: 0.8666875 2023-10-04 18:19:55,848 WARNING [train.py:1204] (2/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_175-17485-0_sp1.1 from training. Duration: 0.97275 2023-10-04 18:19:57,088 INFO [train.py:1046] (2/4) Epoch 50, batch 2050, loss[loss=0.1483, simple_loss=0.2273, pruned_loss=0.03471, over 24441.00 frames. ], tot_loss[loss=0.1525, simple_loss=0.2328, pruned_loss=0.03606, over 4712302.70 frames. ], batch size: 63, lr: 2.03e-03, grad_scale: 32.0 2023-10-04 18:19:57,169 WARNING [train.py:1204] (2/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_1004-183422-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 18:20:01,194 WARNING [train.py:1204] (2/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_32-65962-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 18:20:01,270 WARNING [train.py:1204] (2/4) Exclude cut with ID _15458_210_8341_1_1530858614263_3834410_40-298664-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 18:20:05,504 WARNING [train.py:1204] (2/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_380-261863-0_sp1.1 from training. Duration: 0.963625 2023-10-04 18:20:06,972 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_372-173012-0_sp1.1 from training. Duration: 0.863625 2023-10-04 18:20:08,831 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_57-82205-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 18:20:08,910 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_3691_210_3528_1_1526468169_4062053_355-334432-0_sp0.9 from training. Duration: 0.9666875 2023-10-04 18:20:11,474 WARNING [train.py:1204] (2/4) Exclude cut with ID _19023_210_4353_1_1531378892552_5820780_28-36615-0 from training. Duration: 0.97 2023-10-04 18:20:11,479 WARNING [train.py:1204] (2/4) Exclude cut with ID _29853_210_7497_1_1532505600851_6642110_286-251333-0_sp1.1 from training. Duration: 0.92725 2023-10-04 18:20:11,594 WARNING [train.py:1204] (2/4) Exclude cut with ID _31602_210_5854_1_1532677021456_4458250_518-80679-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 18:20:11,630 WARNING [train.py:1204] (2/4) Exclude cut with ID _14922_210_6228_1_1530707435273_3693310_38-187851-0_sp0.9 from training. Duration: 0.688875 2023-10-04 18:20:16,507 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.conv_module2.balancer1.prob, batch_count=1749026.6666666667, ans=0.125 2023-10-04 18:20:20,174 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.2.encoder.layers.2.whiten, num_groups=1, num_channels=384, metric=4.42 vs. limit=12.0 2023-10-04 18:20:22,137 WARNING [train.py:1204] (2/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_39-248870-0_sp1.1 from training. Duration: 0.62725 2023-10-04 18:20:22,163 WARNING [train.py:1204] (2/4) Exclude cut with ID _16908_210_5724_1_1531119157248_3772700_154-31455-0_sp1.1 from training. Duration: 0.97275 2023-10-04 18:20:24,081 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8453_210_4211_1_1528502940_8562730_347-330673-0 from training. Duration: 0.72 2023-10-04 18:20:25,556 WARNING [train.py:1204] (2/4) Exclude cut with ID _30191_210_1664_1_1533189915047_7196670_674-204247-0_sp1.1 from training. Duration: 0.97275 2023-10-04 18:20:26,929 WARNING [train.py:1204] (2/4) Exclude cut with ID _8833_210_5219_1_1528592665210_7553059_329-125156-0 from training. Duration: 0.82 2023-10-04 18:20:28,383 WARNING [train.py:1204] (2/4) Exclude cut with ID _4779_210_2084_1_1527318022944_3914893_500-285013-0_sp1.1 from training. Duration: 0.62725 2023-10-04 18:20:31,116 WARNING [train.py:1204] (2/4) Exclude cut with ID _14421_210_8750_1_1530604795825_3542290_190-313578-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 18:20:31,437 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.bypass.skip_rate, batch_count=1749093.3333333333, ans=0.07 2023-10-04 18:20:32,579 WARNING [train.py:1204] (2/4) Exclude cut with ID _35799_210_5854_1_1533263487887_3529071_538-273574-0_sp1.1 from training. Duration: 0.936375 2023-10-04 18:20:32,647 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_14928_210_5632_1_1530773461_4714000_454-112198-0_sp1.1 from training. Duration: 0.6 2023-10-04 18:20:34,013 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_468-143126-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 18:20:34,120 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_4528_210_3273_1_1527076654_3276777_258-121681-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 18:20:37,451 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_397-132565-0_sp1.1 from training. Duration: 0.9 2023-10-04 18:20:37,471 WARNING [train.py:1204] (2/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_454-123002-0_sp0.9 from training. Duration: 0.7666875 2023-10-04 18:20:39,033 WARNING [train.py:1204] (2/4) Exclude cut with ID _27052_210_5986_1_1532329197187_3585009_297-86890-0_sp1.1 from training. Duration: 0.936375 2023-10-04 18:20:41,689 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_51225_210_3601_1_1534400634_2327383_55-301844-0_sp1.1 from training. Duration: 0.8454375 2023-10-04 18:20:41,820 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.1.bypass_mid.scale_min, batch_count=1749160.0, ans=0.2 2023-10-04 18:20:44,342 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6786_210_1388_1_1527839699_4194495_378-46070-0_sp0.9 from training. Duration: 0.8 2023-10-04 18:20:44,434 WARNING [train.py:1204] (2/4) Exclude cut with ID _42334_210_7770_1_1534206610396_3605880_55-19758-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 18:20:48,176 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.4.encoder.layers.0.self_attn_weights.whiten_keys, num_groups=4, num_channels=128, metric=1.67 vs. limit=6.0 2023-10-04 18:20:48,816 WARNING [train.py:1204] (2/4) Exclude cut with ID _14884_210_2252_1_1530707459689_5561250_114-201399-0_sp0.9 from training. Duration: 0.8444375 2023-10-04 18:20:49,076 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.bypass.skip_rate, batch_count=1749160.0, ans=0.07 2023-10-04 18:20:54,069 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_99-251314-0_sp0.9 from training. Duration: 0.9666875 2023-10-04 18:20:55,541 WARNING [train.py:1204] (2/4) Exclude cut with ID _26528_210_9070_1_1532570410842_3851310_497-97218-0 from training. Duration: 0.86 2023-10-04 18:20:58,476 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.feed_forward3.hidden_balancer.prob, batch_count=1749226.6666666667, ans=0.125 2023-10-04 18:20:58,488 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.conv_skip_rate, batch_count=1749226.6666666667, ans=0.0 2023-10-04 18:21:00,992 WARNING [train.py:1204] (2/4) Exclude cut with ID _55328_210_27767_1_1534580973260_4088170_230-317241-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 18:21:02,323 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_125-203105-0_sp0.9 from training. Duration: 0.888875 2023-10-04 18:21:03,739 WARNING [train.py:1204] (2/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_55-31904-0_sp1.1 from training. Duration: 0.8 2023-10-04 18:21:07,106 WARNING [train.py:1204] (2/4) Exclude cut with ID _8064_210_5399_1_1528372299291_4282973_18-174647-0 from training. Duration: 0.91 2023-10-04 18:21:09,986 WARNING [train.py:1204] (2/4) Exclude cut with ID IC0165W0005-81726 from training. Duration: 0.952 2023-10-04 18:21:09,987 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_31583_210_3852_1_1532595851_4024413_148-171847-0_sp1.1 from training. Duration: 0.963625 2023-10-04 18:21:10,045 WARNING [train.py:1204] (2/4) Exclude cut with ID _21989_210_5854_1_1531899214931_3271360_116-138769-0_sp1.1 from training. Duration: 0.936375 2023-10-04 18:21:11,259 INFO [train.py:1046] (2/4) Epoch 50, batch 2100, loss[loss=0.1513, simple_loss=0.2299, pruned_loss=0.03629, over 18511.00 frames. ], tot_loss[loss=0.1514, simple_loss=0.2317, pruned_loss=0.0356, over 4701565.10 frames. ], batch size: 40, lr: 2.03e-03, grad_scale: 16.0 2023-10-04 18:21:11,324 WARNING [train.py:1204] (2/4) Exclude cut with ID _66070_210_7640_1_1535441393096_7805310_489-292631-0_sp1.1 from training. Duration: 0.7181875 2023-10-04 18:21:11,415 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_20120_210_6753_1_1531643035_5482859_202-27960-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 18:21:12,695 WARNING [train.py:1204] (2/4) Exclude cut with ID _5294_210_1747_1_1527249643928_3696179_342-270278-0 from training. Duration: 0.55 2023-10-04 18:21:12,740 WARNING [train.py:1204] (2/4) Exclude cut with ID _38323_210_12108_1_1533121233369_3303049_416-11919-0 from training. Duration: 0.96 2023-10-04 18:21:13,715 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.0.layers.1.feed_forward1.out_whiten, num_groups=1, num_channels=192, metric=4.40 vs. limit=15.0 2023-10-04 18:21:14,160 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6545_210_2040_1_1527774670_4245258_407-48070-0_sp0.9 from training. Duration: 0.8444375 2023-10-04 18:21:14,272 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder_embed.conv.2.prob, batch_count=1749293.3333333333, ans=0.125 2023-10-04 18:21:16,909 WARNING [train.py:1204] (2/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_503-30719-0_sp1.1 from training. Duration: 0.8909375 2023-10-04 18:21:18,691 WARNING [train.py:1204] (2/4) Exclude cut with ID _49643_210_7989_1_1534640244224_3939031_234-326497-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 18:21:22,275 WARNING [train.py:1204] (2/4) Exclude cut with ID _14170_210_5219_1_1530525949868_4469650_301-77873-0_sp1.1 from training. Duration: 0.963625 2023-10-04 18:21:22,331 WARNING [train.py:1204] (2/4) Exclude cut with ID _45297_210_2721_1_1533715351381_3264949_152-154697-0_sp1.1 from training. Duration: 0.97275 2023-10-04 18:21:22,527 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.conv_module1.balancer1.prob, batch_count=1749293.3333333333, ans=0.125 2023-10-04 18:21:23,653 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_3371_210_1794_1_1526384462_5580777_396-339812-0 from training. Duration: 0.85 2023-10-04 18:21:23,844 INFO [scaling.py:1118] (2/4) WithLoss: name=encoder.encoders.1.encoder.layers.0.self_attn_weights, loss-sum=0.000e+00 2023-10-04 18:21:25,042 WARNING [train.py:1204] (2/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_399-156791-0_sp1.1 from training. Duration: 0.7545625 2023-10-04 18:21:26,339 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7696_210_2040_1_1528207167_3716099_117-125434-0 from training. Duration: 0.76 2023-10-04 18:21:26,344 WARNING [train.py:1204] (2/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_96-244299-0 from training. Duration: 0.88 2023-10-04 18:21:27,636 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.746e+02 2.172e+02 2.598e+02 3.189e+02 5.506e+02, threshold=5.196e+02, percent-clipped=2.0 2023-10-04 18:21:28,470 WARNING [train.py:1204] (2/4) Exclude cut with ID _37892_210_19596_1_1533198677897_3964058_519-343462-0_sp1.1 from training. Duration: 0.92725 2023-10-04 18:21:28,495 WARNING [train.py:1204] (2/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_202-183922-0_sp1.1 from training. Duration: 0.77275 2023-10-04 18:21:28,501 WARNING [train.py:1204] (2/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_453-133983-0 from training. Duration: 0.78 2023-10-04 18:21:28,539 WARNING [train.py:1204] (2/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_178-39722-0_sp1.1 from training. Duration: 0.4454375 2023-10-04 18:21:35,102 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_15126_210_6753_1_1530778677_2591185_113-157651-0 from training. Duration: 0.68 2023-10-04 18:21:35,104 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_271-234962-0_sp1.1 from training. Duration: 0.7181875 2023-10-04 18:21:36,615 WARNING [train.py:1204] (2/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_195-64967-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 18:21:36,669 WARNING [train.py:1204] (2/4) Exclude cut with ID _43552_210_8913_1_1533726031059_3523429_365-215065-0_sp1.1 from training. Duration: 0.936375 2023-10-04 18:21:41,247 WARNING [train.py:1204] (2/4) Exclude cut with ID _31602_210_5854_1_1532677021456_4458250_249-96604-0_sp0.9 from training. Duration: 0.988875 2023-10-04 18:21:41,297 WARNING [train.py:1204] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_307-158462-0 from training. Duration: 0.69 2023-10-04 18:21:42,568 WARNING [train.py:1204] (2/4) Exclude cut with ID _30918_210_6400_1_1532754078600_3125920_407-223394-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 18:21:42,572 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6673_210_3273_1_1527767790_3173240_351-167954-0_sp0.9 from training. Duration: 0.5333125 2023-10-04 18:21:42,849 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.conv_skip_rate, batch_count=1749426.6666666667, ans=0.0 2023-10-04 18:21:44,004 WARNING [train.py:1204] (2/4) Exclude cut with ID _4866_210_2577_1_1527404412284_4364659_256-306830-0 from training. Duration: 0.87 2023-10-04 18:21:44,067 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_17613_210_2151_1_1531137569_2366160_222-131118-0_sp1.1 from training. Duration: 0.963625 2023-10-04 18:21:44,080 WARNING [train.py:1204] (2/4) Exclude cut with ID _45560_210_21982_1_1533812603951_3584999_111-6376-0 from training. Duration: 0.84 2023-10-04 18:21:45,392 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_216-73841-0 from training. Duration: 0.96 2023-10-04 18:21:45,436 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_14058_210_4163_1_1530690953_6327180_49-210687-0 from training. Duration: 0.94 2023-10-04 18:21:48,114 WARNING [train.py:1204] (2/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_506-42614-0_sp0.9 from training. Duration: 0.87775 2023-10-04 18:21:49,483 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_3133_210_2087_1_1526106446_2324690_25-332138-0_sp1.1 from training. Duration: 0.82725 2023-10-04 18:21:52,917 WARNING [train.py:1204] (2/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_589-9070-0_sp1.1 from training. Duration: 0.6454375 2023-10-04 18:21:53,186 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.ff3_skip_rate, batch_count=1749426.6666666667, ans=0.0 2023-10-04 18:21:54,992 WARNING [train.py:1204] (2/4) Exclude cut with ID _6194_210_1534_1_1527511826160_3941194_14-282876-0_sp1.1 from training. Duration: 0.6454375 2023-10-04 18:21:56,354 WARNING [train.py:1204] (2/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_1166-90654-0_sp1.1 from training. Duration: 0.92725 2023-10-04 18:21:59,495 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_14056_210_4163_1_1530604483_7572827_218-255858-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 18:21:59,497 WARNING [train.py:1204] (2/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_1285-256622-0 from training. Duration: 0.84 2023-10-04 18:21:59,512 WARNING [train.py:1204] (2/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_205-286261-0_sp1.1 from training. Duration: 0.963625 2023-10-04 18:21:59,527 WARNING [train.py:1204] (2/4) Exclude cut with ID _68777_210_4353_1_1535810390731_3117904_6-154119-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 18:21:59,600 WARNING [train.py:1204] (2/4) Exclude cut with ID _22982_210_1883_1_1531882790447_5449409_565-199393-0_sp1.1 from training. Duration: 0.92725 2023-10-04 18:22:00,901 WARNING [train.py:1204] (2/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_278-154492-0 from training. Duration: 0.76 2023-10-04 18:22:01,022 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_426-313557-0 from training. Duration: 0.53 2023-10-04 18:22:02,282 WARNING [train.py:1204] (2/4) Exclude cut with ID _41817_210_18835_1_1533434477160_7423049_555-293448-0 from training. Duration: 0.8 2023-10-04 18:22:06,386 WARNING [train.py:1204] (2/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_32-335103-0_sp1.1 from training. Duration: 0.7454375 2023-10-04 18:22:09,236 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_2655_210_2087_1_1525688959_3897400_202-147557-0_sp1.1 from training. Duration: 0.77275 2023-10-04 18:22:10,506 WARNING [train.py:1204] (2/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_158-250116-0 from training. Duration: 0.62 2023-10-04 18:22:13,910 WARNING [train.py:1204] (2/4) Exclude cut with ID _55429_210_10626_1_1534467588527_7174143_528-88083-0_sp1.1 from training. Duration: 0.963625 2023-10-04 18:22:16,664 WARNING [train.py:1204] (2/4) Exclude cut with ID _8030_210_7497_1_1528444886781_3531089_32-31837-0_sp1.1 from training. Duration: 0.8818125 2023-10-04 18:22:16,693 WARNING [train.py:1204] (2/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_744-86168-0_sp1.1 from training. Duration: 0.97275 2023-10-04 18:22:16,708 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1113-7369-0_sp0.9 from training. Duration: 0.9444375 2023-10-04 18:22:16,730 WARNING [train.py:1204] (2/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_124-345840-0_sp0.9 from training. Duration: 0.57775 2023-10-04 18:22:17,992 WARNING [train.py:1204] (2/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_328-264776-0_sp0.9 from training. Duration: 0.7444375 2023-10-04 18:22:20,694 WARNING [train.py:1204] (2/4) Exclude cut with ID _18895_210_6228_1_1531357274234_3784930_469-166615-0_sp1.1 from training. Duration: 0.963625 2023-10-04 18:22:20,700 WARNING [train.py:1204] (2/4) Exclude cut with ID _8395_210_1531_1_1528597871523_4411579_431-132470-0_sp0.9 from training. Duration: 0.82225 2023-10-04 18:22:22,026 WARNING [train.py:1204] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_554-45617-0_sp1.1 from training. Duration: 0.9 2023-10-04 18:22:22,057 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_35361_210_4238_1_1533262810_7793476_524-188242-0_sp1.1 from training. Duration: 0.92725 2023-10-04 18:22:23,509 WARNING [train.py:1204] (2/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_938-205135-0 from training. Duration: 0.87 2023-10-04 18:22:23,773 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.balancer2.prob, batch_count=1749626.6666666667, ans=0.125 2023-10-04 18:22:25,407 INFO [train.py:1046] (2/4) Epoch 50, batch 2150, loss[loss=0.1391, simple_loss=0.2228, pruned_loss=0.0277, over 24457.00 frames. ], tot_loss[loss=0.151, simple_loss=0.2313, pruned_loss=0.03536, over 4711700.26 frames. ], batch size: 63, lr: 2.03e-03, grad_scale: 16.0 2023-10-04 18:22:25,515 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_5931_210_2040_1_1527428481_4547999_321-193614-0 from training. Duration: 0.67 2023-10-04 18:22:25,535 WARNING [train.py:1204] (2/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_445-269521-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 18:22:25,820 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.feed_forward3.hidden_balancer.prob, batch_count=1749626.6666666667, ans=0.125 2023-10-04 18:22:28,326 WARNING [train.py:1204] (2/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_3-33116-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 18:22:28,327 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_4633_210_2040_1_1526824163_4145343_416-76117-0_sp1.1 from training. Duration: 0.863625 2023-10-04 18:22:28,362 WARNING [train.py:1204] (2/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_1033-274376-0_sp1.1 from training. Duration: 0.7909375 2023-10-04 18:22:29,652 WARNING [train.py:1204] (2/4) Exclude cut with ID _8772_210_1850_1_1528635595972_3664011_398-320049-0_sp0.9 from training. Duration: 0.97775 2023-10-04 18:22:33,027 WARNING [train.py:1204] (2/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_321-215742-0_sp0.9 from training. Duration: 0.5555625 2023-10-04 18:22:35,769 WARNING [train.py:1204] (2/4) Exclude cut with ID _28394_210_8913_1_1532430049518_3650118_272-190950-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 18:22:37,069 WARNING [train.py:1204] (2/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_31-299371-0_sp1.1 from training. Duration: 0.92725 2023-10-04 18:22:38,459 WARNING [train.py:1204] (2/4) Exclude cut with ID _55383_210_10635_1_1534468426172_2231669_134-128246-0_sp0.9 from training. Duration: 0.988875 2023-10-04 18:22:38,462 WARNING [train.py:1204] (2/4) Exclude cut with ID _40519_210_13794_1_1533715228655_7337390_887-9547-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 18:22:39,773 WARNING [train.py:1204] (2/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_310-109461-0_sp0.9 from training. Duration: 0.888875 2023-10-04 18:22:41,901 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_2009_210_2071_1_1525481134_4588937_258-250196-0_sp1.1 from training. Duration: 0.92725 2023-10-04 18:22:43,253 WARNING [train.py:1204] (2/4) Exclude cut with ID _9235_210_5219_1_1528891154253_8046120_1186-72702-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 18:22:43,255 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_14831_210_7927_1_1530700200_5583610_181-27467-0_sp1.1 from training. Duration: 0.82725 2023-10-04 18:22:47,519 WARNING [train.py:1204] (2/4) Exclude cut with ID _40402_210_18219_1_1533522324115_2953150_278-132109-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 18:22:48,865 WARNING [train.py:1204] (2/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_261-297503-0 from training. Duration: 0.99 2023-10-04 18:22:51,745 WARNING [train.py:1204] (2/4) Exclude cut with ID _19896_210_1883_1_1531709022662_6649569_313-265903-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 18:22:51,847 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_105-114797-0_sp1.1 from training. Duration: 0.763625 2023-10-04 18:22:53,765 WARNING [train.py:1204] (2/4) Exclude cut with ID _53755_210_8254_1_1534577312941_4707360_9-65843-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 18:22:53,787 WARNING [train.py:1204] (2/4) Exclude cut with ID _18516_210_9206_1_1531704653206_3692406_361-224549-0_sp1.1 from training. Duration: 0.97275 2023-10-04 18:22:55,229 WARNING [train.py:1204] (2/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_403-310975-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 18:22:55,275 WARNING [train.py:1204] (2/4) Exclude cut with ID _5444_210_2151_1_1527418808464_5312210_581-315411-0_sp0.9 from training. Duration: 0.688875 2023-10-04 18:22:55,322 WARNING [train.py:1204] (2/4) Exclude cut with ID _23825_210_13449_1_1531915313526_3497150_464-154904-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 18:22:55,331 WARNING [train.py:1204] (2/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_937-208281-0_sp0.9 from training. Duration: 0.9444375 2023-10-04 18:22:57,075 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_37839_210_7234_1_1533542462_3689303_158-181212-0_sp1.1 from training. Duration: 0.963625 2023-10-04 18:22:58,380 WARNING [train.py:1204] (2/4) Exclude cut with ID _8772_210_1850_1_1528635595972_3664011_398-320049-0 from training. Duration: 0.88 2023-10-04 18:23:00,577 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.feed_forward2.hidden_balancer.prob, batch_count=1749760.0, ans=0.125 2023-10-04 18:23:00,603 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.feed_forward1.hidden_balancer.prob, batch_count=1749760.0, ans=0.125 2023-10-04 18:23:01,565 WARNING [train.py:1204] (2/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_718-250907-0_sp1.1 from training. Duration: 0.62725 2023-10-04 18:23:01,638 WARNING [train.py:1204] (2/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_641-126395-0_sp1.1 from training. Duration: 0.92725 2023-10-04 18:23:01,669 WARNING [train.py:1204] (2/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_166-241493-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 18:23:03,039 WARNING [train.py:1204] (2/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_526-62079-0_sp0.9 from training. Duration: 0.7444375 2023-10-04 18:23:04,405 WARNING [train.py:1204] (2/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_439-275083-0_sp1.1 from training. Duration: 0.936375 2023-10-04 18:23:07,195 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_66907_210_21581_1_1535615661_3590177_493-229032-0_sp1.1 from training. Duration: 0.92725 2023-10-04 18:23:07,236 WARNING [train.py:1204] (2/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_89-321955-0_sp1.1 from training. Duration: 0.72725 2023-10-04 18:23:08,632 WARNING [train.py:1204] (2/4) Exclude cut with ID _29857_210_20034_1_1532395886753_3798160_273-226195-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 18:23:08,639 WARNING [train.py:1204] (2/4) Exclude cut with ID _22826_210_9994_1_1531908201522_3279169_404-75889-0 from training. Duration: 0.89 2023-10-04 18:23:09,976 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_271-234962-0_sp0.9 from training. Duration: 0.87775 2023-10-04 18:23:11,426 WARNING [train.py:1204] (2/4) Exclude cut with ID _22961_210_8254_1_1531976366672_4278180_156-339685-0_sp1.1 from training. Duration: 0.97275 2023-10-04 18:23:13,321 WARNING [train.py:1204] (2/4) Exclude cut with ID _37892_210_19596_1_1533198677897_3964058_425-231927-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 18:23:14,694 WARNING [train.py:1204] (2/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_102-288670-0_sp1.1 from training. Duration: 0.97275 2023-10-04 18:23:14,894 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.feed_forward2.hidden_balancer.prob, batch_count=1749826.6666666667, ans=0.125 2023-10-04 18:23:16,032 WARNING [train.py:1204] (2/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_283-220372-0_sp1.1 from training. Duration: 0.5090625 2023-10-04 18:23:17,312 WARNING [train.py:1204] (2/4) Exclude cut with ID _44304_210_15374_1_1533639352765_4613129_86-1699-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 18:23:18,653 WARNING [train.py:1204] (2/4) Exclude cut with ID _2891_210_2550_1_1525874398521_4427710_327-121590-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 18:23:18,659 WARNING [train.py:1204] (2/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_369-222060-0 from training. Duration: 0.71 2023-10-04 18:23:20,081 WARNING [train.py:1204] (2/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_762-109886-0 from training. Duration: 0.91 2023-10-04 18:23:21,374 WARNING [train.py:1204] (2/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_477-255782-0_sp0.9 from training. Duration: 0.911125 2023-10-04 18:23:21,401 WARNING [train.py:1204] (2/4) Exclude cut with ID IC0669W0061-330906 from training. Duration: 0.768 2023-10-04 18:23:21,458 WARNING [train.py:1204] (2/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_347-213071-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 18:23:21,486 WARNING [train.py:1204] (2/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_507-303370-0_sp1.1 from training. Duration: 0.87275 2023-10-04 18:23:22,839 WARNING [train.py:1204] (2/4) Exclude cut with ID _15012_210_1968_1_1530856820429_5095350_688-178849-0 from training. Duration: 0.64 2023-10-04 18:23:22,844 WARNING [train.py:1204] (2/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_28-70987-0_sp0.9 from training. Duration: 0.9666875 2023-10-04 18:23:22,846 WARNING [train.py:1204] (2/4) Exclude cut with ID _5910_210_1389_1_1527337972454_5050100_555-159815-0 from training. Duration: 0.98 2023-10-04 18:23:22,866 WARNING [train.py:1204] (2/4) Exclude cut with ID IC0679W0064-335871 from training. Duration: 0.768 2023-10-04 18:23:22,867 WARNING [train.py:1204] (2/4) Exclude cut with ID IC0679W0079-335886 from training. Duration: 0.896 2023-10-04 18:23:24,768 WARNING [train.py:1204] (2/4) Exclude cut with ID _5608_210_2106_1_1527415439089_6819768_909-82767-0 from training. Duration: 0.61 2023-10-04 18:23:26,148 WARNING [train.py:1204] (2/4) Exclude cut with ID _23038_210_9570_1_1532001584158_3652419_411-323967-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 18:23:26,201 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_40896_210_15710_1_1533433956_7100802_350-231748-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 18:23:26,211 WARNING [train.py:1204] (2/4) Exclude cut with ID _5289_210_2084_1_1527678202736_3779299_142-103317-0_sp0.9 from training. Duration: 0.9333125 2023-10-04 18:23:27,572 WARNING [train.py:1204] (2/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_314-144055-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 18:23:27,642 WARNING [train.py:1204] (2/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_177-236809-0_sp0.9 from training. Duration: 0.5444375 2023-10-04 18:23:29,632 WARNING [train.py:1204] (2/4) Exclude cut with ID _23038_210_9570_1_1532001584158_3652419_82-334733-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 18:23:29,640 WARNING [train.py:1204] (2/4) Exclude cut with ID _43552_210_8913_1_1533726031059_3523429_225-22526-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 18:23:33,297 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.balancer2.prob, batch_count=1749893.3333333333, ans=0.125 2023-10-04 18:23:38,028 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.0.layers.1.nonlin_attention.whiten2, num_groups=1, num_channels=192, metric=5.69 vs. limit=15.0 2023-10-04 18:23:38,664 WARNING [train.py:1204] (2/4) Exclude cut with ID _30327_210_16049_1_1532484161295_3924169_401-70819-0_sp1.1 from training. Duration: 0.7818125 2023-10-04 18:23:39,923 INFO [train.py:1046] (2/4) Epoch 50, batch 2200, loss[loss=0.1649, simple_loss=0.2257, pruned_loss=0.05208, over 19219.00 frames. ], tot_loss[loss=0.151, simple_loss=0.2311, pruned_loss=0.03543, over 4700139.66 frames. ], batch size: 388, lr: 2.03e-03, grad_scale: 16.0 2023-10-04 18:23:39,998 WARNING [train.py:1204] (2/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_538-32291-0 from training. Duration: 0.83 2023-10-04 18:23:43,348 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_37357_210_17024_1_1533089504_2316121_258-211605-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 18:23:46,465 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.self_attn_weights.pos_emb_skip_rate, batch_count=1749960.0, ans=0.0 2023-10-04 18:23:47,576 WARNING [train.py:1204] (2/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_550-335198-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 18:23:47,626 WARNING [train.py:1204] (2/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_98-741-0_sp0.9 from training. Duration: 0.888875 2023-10-04 18:23:48,906 WARNING [train.py:1204] (2/4) Exclude cut with ID _38323_210_12108_1_1533121233369_3303049_251-238988-0_sp1.1 from training. Duration: 0.92725 2023-10-04 18:23:50,288 WARNING [train.py:1204] (2/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_259-34038-0_sp0.9 from training. Duration: 0.82225 2023-10-04 18:23:52,874 WARNING [train.py:1204] (2/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_1060-180633-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 18:23:54,200 WARNING [train.py:1204] (2/4) Exclude cut with ID _8755_210_1876_1_1528628559357_3258209_211-37473-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 18:23:54,206 WARNING [train.py:1204] (2/4) Exclude cut with ID _4122_210_2084_1_1526713164417_4353280_528-206771-0 from training. Duration: 0.88 2023-10-04 18:23:55,571 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.726e+02 2.058e+02 2.327e+02 2.714e+02 4.351e+02, threshold=4.654e+02, percent-clipped=0.0 2023-10-04 18:23:58,523 WARNING [train.py:1204] (2/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_64-248359-0 from training. Duration: 0.69 2023-10-04 18:24:01,845 WARNING [train.py:1204] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_402-253027-0_sp1.1 from training. Duration: 0.4909375 2023-10-04 18:24:02,179 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.conv_module2.balancer2.prob, batch_count=1750026.6666666667, ans=0.125 2023-10-04 18:24:06,693 WARNING [train.py:1204] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_625-102955-0 from training. Duration: 0.77 2023-10-04 18:24:09,640 WARNING [train.py:1204] (2/4) Exclude cut with ID _14998_210_5219_1_1530705582080_7474970_611-219324-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 18:24:09,714 WARNING [train.py:1204] (2/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_718-68505-0_sp0.9 from training. Duration: 0.988875 2023-10-04 18:24:11,107 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_353-182855-0_sp1.1 from training. Duration: 0.87275 2023-10-04 18:24:14,435 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_5960_210_1794_1_1527403865_3990999_515-2613-0_sp0.9 from training. Duration: 0.97775 2023-10-04 18:24:14,464 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_5575_210_1410_1_1527301305_2570170_342-265572-0 from training. Duration: 0.94 2023-10-04 18:24:17,242 WARNING [train.py:1204] (2/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_329-113173-0_sp0.9 from training. Duration: 0.688875 2023-10-04 18:24:17,786 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.4.encoder.layers.2.conv_module2.whiten, num_groups=1, num_channels=384, metric=2.96 vs. limit=15.0 2023-10-04 18:24:18,571 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_15396_210_6890_1_1531308473_1938936_213-312506-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 18:24:18,632 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_521-214276-0_sp1.1 from training. Duration: 0.4 2023-10-04 18:24:22,532 WARNING [train.py:1204] (2/4) Exclude cut with ID _15026_210_8750_1_1530777602316_3291850_211-195043-0_sp1.1 from training. Duration: 0.72725 2023-10-04 18:24:23,900 WARNING [train.py:1204] (2/4) Exclude cut with ID _29343_210_4381_1_1532602757752_3674109_108-257494-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 18:24:25,251 WARNING [train.py:1204] (2/4) Exclude cut with ID _64414_210_17301_1_1535243252048_3757759_403-58151-0_sp1.1 from training. Duration: 0.963625 2023-10-04 18:24:26,600 WARNING [train.py:1204] (2/4) Exclude cut with ID _36512_210_6987_1_1533033143998_3126069_109-177787-0_sp1.1 from training. Duration: 0.92725 2023-10-04 18:24:29,286 WARNING [train.py:1204] (2/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_498-40153-0 from training. Duration: 0.63 2023-10-04 18:24:31,244 WARNING [train.py:1204] (2/4) Exclude cut with ID _56447_210_1389_1_1535074245910_3697189_161-334523-0_sp1.1 from training. Duration: 0.936375 2023-10-04 18:24:32,634 WARNING [train.py:1204] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_238-245655-0 from training. Duration: 0.81 2023-10-04 18:24:32,883 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.conv_module2.balancer1.max_abs, batch_count=1750160.0, ans=10.0 2023-10-04 18:24:35,259 WARNING [train.py:1204] (2/4) Exclude cut with ID _64631_210_27767_1_1535349580068_4165160_108-43865-0_sp1.1 from training. Duration: 0.92725 2023-10-04 18:24:35,273 WARNING [train.py:1204] (2/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_243-42116-0_sp1.1 from training. Duration: 0.663625 2023-10-04 18:24:36,315 WARNING [train.py:1204] (2/4) Exclude cut with ID _66868_210_9527_1_1535509287954_4561140_594-132160-0_sp1.1 from training. Duration: 0.92725 2023-10-04 18:24:39,115 WARNING [train.py:1204] (2/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_48-338976-0_sp0.9 from training. Duration: 0.988875 2023-10-04 18:24:39,163 WARNING [train.py:1204] (2/4) Exclude cut with ID _5250_210_5219_1_1527249561806_8692110_137-100595-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 18:24:39,171 WARNING [train.py:1204] (2/4) Exclude cut with ID _49748_210_24116_1_1533986971737_3158910_12-271669-0_sp1.1 from training. Duration: 0.936375 2023-10-04 18:24:39,197 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_37357_210_17024_1_1533089504_2316121_206-80421-0_sp1.1 from training. Duration: 0.936375 2023-10-04 18:24:40,529 WARNING [train.py:1204] (2/4) Exclude cut with ID _7537_210_2084_1_1528502395431_3924359_113-199933-0_sp1.1 from training. Duration: 0.536375 2023-10-04 18:24:40,584 WARNING [train.py:1204] (2/4) Exclude cut with ID _55440_210_12651_1_1534496713364_2980520_31-59289-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 18:24:43,273 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1083-220122-0_sp0.9 from training. Duration: 0.5666875 2023-10-04 18:24:46,608 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_426-313557-0_sp1.1 from training. Duration: 0.4818125 2023-10-04 18:24:47,915 WARNING [train.py:1204] (2/4) Exclude cut with ID _35799_210_5854_1_1533263487887_3529071_324-94843-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 18:24:49,387 WARNING [train.py:1204] (2/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_264-108685-0_sp1.1 from training. Duration: 0.5 2023-10-04 18:24:50,829 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9012W0006-501141 from training. Duration: 0.896 2023-10-04 18:24:53,540 INFO [train.py:1046] (2/4) Epoch 50, batch 2250, loss[loss=0.1538, simple_loss=0.2386, pruned_loss=0.03453, over 23070.00 frames. ], tot_loss[loss=0.1517, simple_loss=0.2323, pruned_loss=0.03558, over 4702651.44 frames. ], batch size: 105, lr: 2.03e-03, grad_scale: 8.0 2023-10-04 18:24:53,610 WARNING [train.py:1204] (2/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_890-5117-0_sp0.9 from training. Duration: 0.8666875 2023-10-04 18:24:53,662 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9023W0008-506301 from training. Duration: 0.896 2023-10-04 18:24:54,999 WARNING [train.py:1204] (2/4) Exclude cut with ID _13916_210_9994_1_1530788341126_3763149_60-191556-0_sp0.9 from training. Duration: 0.67775 2023-10-04 18:24:56,280 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9029W0331-509721 from training. Duration: 0.896 2023-10-04 18:24:56,398 WARNING [train.py:1204] (2/4) Exclude cut with ID _35823_210_6917_1_1533187881914_7297830_567-108596-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 18:24:57,721 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7543_210_6753_1_1528544010_1110402_25-183860-0_sp1.1 from training. Duration: 0.42725 2023-10-04 18:25:00,391 WARNING [train.py:1204] (2/4) Exclude cut with ID _47278_210_12041_1_1533902418349_3576110_495-260035-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 18:25:00,502 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9051W0015-520281 from training. Duration: 0.9045625 2023-10-04 18:25:03,149 WARNING [train.py:1204] (2/4) Exclude cut with ID _14459_210_2228_1_1531443641946_7516700_707-66193-0_sp1.1 from training. Duration: 0.97275 2023-10-04 18:25:07,678 WARNING [train.py:1204] (2/4) Exclude cut with ID _14087_210_2253_1_1530529224512_3014029_219-224445-0_sp1.1 from training. Duration: 0.763625 2023-10-04 18:25:11,742 WARNING [train.py:1204] (2/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_345-167986-0_sp0.9 from training. Duration: 0.9333125 2023-10-04 18:25:11,840 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_34815_210_6388_1_1533381320_3571903_279-305565-0_sp1.1 from training. Duration: 0.836375 2023-10-04 18:25:13,379 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.self_attn_weights.pos_emb_skip_rate, batch_count=1750360.0, ans=0.0 2023-10-04 18:25:14,541 WARNING [train.py:1204] (2/4) Exclude cut with ID _21987_210_5854_1_1531821500482_3637038_317-179212-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 18:25:14,597 WARNING [train.py:1204] (2/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_395-37222-0_sp1.1 from training. Duration: 0.6909375 2023-10-04 18:25:14,660 WARNING [train.py:1204] (2/4) Exclude cut with ID _8064_210_5399_1_1528372299291_4282973_168-50337-0_sp1.1 from training. Duration: 0.763625 2023-10-04 18:25:14,908 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.balancer1.prob, batch_count=1750360.0, ans=0.125 2023-10-04 18:25:17,931 WARNING [train.py:1204] (2/4) Exclude cut with ID _60113_210_16271_1_1535164212481_4386079_559-167191-0 from training. Duration: 0.93 2023-10-04 18:25:17,940 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_47228_210_4983_1_1533780021_3672027_344-179642-0_sp1.1 from training. Duration: 0.92725 2023-10-04 18:25:17,978 WARNING [train.py:1204] (2/4) Exclude cut with ID _22961_210_8254_1_1531976366672_4278180_317-199546-0_sp0.9 from training. Duration: 0.9555625 2023-10-04 18:25:20,642 WARNING [train.py:1204] (2/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_141-89727-0 from training. Duration: 0.71 2023-10-04 18:25:21,779 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_78-116075-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 18:25:21,798 WARNING [train.py:1204] (2/4) Exclude cut with ID _64086_210_7994_1_1535270417019_7435949_52-235756-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 18:25:23,177 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7615_210_4211_1_1528110965_4580160_72-235211-0_sp1.1 from training. Duration: 0.6909375 2023-10-04 18:25:28,656 WARNING [train.py:1204] (2/4) Exclude cut with ID _14069_210_5632_1_1530512692033_4803390_287-27950-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 18:25:30,083 WARNING [train.py:1204] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_74-48717-0_sp0.9 from training. Duration: 0.6444375 2023-10-04 18:25:30,115 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7544_210_6753_1_1528591327_1471426_172-348777-0_sp1.1 from training. Duration: 0.6 2023-10-04 18:25:31,527 WARNING [train.py:1204] (2/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_220-191080-0 from training. Duration: 0.98 2023-10-04 18:25:34,139 WARNING [train.py:1204] (2/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_1081-137641-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 18:25:34,295 WARNING [train.py:1204] (2/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_278-235459-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 18:25:38,260 WARNING [train.py:1204] (2/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_1181-77470-0_sp1.1 from training. Duration: 0.936375 2023-10-04 18:25:41,008 WARNING [train.py:1204] (2/4) Exclude cut with ID _60860_210_8254_1_1534896015875_3557169_198-5531-0_sp1.1 from training. Duration: 0.936375 2023-10-04 18:25:41,104 WARNING [train.py:1204] (2/4) Exclude cut with ID _12369_210_4381_1_1530356078581_4388520_17-289781-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 18:25:42,357 WARNING [train.py:1204] (2/4) Exclude cut with ID _50310_210_15488_1_1534068043345_3641080_146-199752-0_sp1.1 from training. Duration: 0.92725 2023-10-04 18:25:45,107 WARNING [train.py:1204] (2/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_969-213354-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 18:25:45,198 WARNING [train.py:1204] (2/4) Exclude cut with ID _70519_210_4163_1_1535961595994_7458230_66-344156-0_sp1.1 from training. Duration: 0.963625 2023-10-04 18:25:49,933 WARNING [train.py:1204] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_380-204756-0_sp1.1 from training. Duration: 0.7818125 2023-10-04 18:25:52,818 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_128-202104-0_sp0.9 from training. Duration: 0.7 2023-10-04 18:25:54,921 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.3.encoder.layers.2.nonlin_attention.whiten2, num_groups=1, num_channels=512, metric=5.45 vs. limit=15.0 2023-10-04 18:25:58,379 WARNING [train.py:1204] (2/4) Exclude cut with ID _4697_210_2069_1_1526815801953_3823098_51-1049-0_sp0.9 from training. Duration: 0.8333125 2023-10-04 18:25:58,403 WARNING [train.py:1204] (2/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_86-66192-0_sp1.1 from training. Duration: 0.5 2023-10-04 18:25:59,660 WARNING [train.py:1204] (2/4) Exclude cut with ID _14069_210_5632_1_1530512692033_4803390_95-1896-0_sp1.1 from training. Duration: 0.8909375 2023-10-04 18:26:05,084 WARNING [train.py:1204] (2/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_29-293896-0_sp1.1 from training. Duration: 0.5545625 2023-10-04 18:26:06,939 INFO [train.py:1046] (2/4) Epoch 50, batch 2300, loss[loss=0.1582, simple_loss=0.2285, pruned_loss=0.04393, over 23661.00 frames. ], tot_loss[loss=0.1527, simple_loss=0.2332, pruned_loss=0.03611, over 4701059.19 frames. ], batch size: 164, lr: 2.03e-03, grad_scale: 8.0 2023-10-04 18:26:07,615 WARNING [train.py:1204] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_167-137255-0_sp0.9 from training. Duration: 0.711125 2023-10-04 18:26:07,615 WARNING [train.py:1204] (2/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_217-95056-0 from training. Duration: 0.98 2023-10-04 18:26:07,639 WARNING [train.py:1204] (2/4) Exclude cut with ID _47278_210_12041_1_1533902418349_3576110_97-266746-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 18:26:07,687 WARNING [train.py:1204] (2/4) Exclude cut with ID _14224_210_4381_1_1530529062037_4264010_380-343670-0_sp1.1 from training. Duration: 0.8 2023-10-04 18:26:09,066 WARNING [train.py:1204] (2/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_107-332398-0 from training. Duration: 0.78 2023-10-04 18:26:11,872 WARNING [train.py:1204] (2/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_91-267696-0_sp1.1 from training. Duration: 0.7909375 2023-10-04 18:26:13,016 WARNING [train.py:1204] (2/4) Exclude cut with ID _41298_210_9019_1_1533430371450_4822143_498-186993-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 18:26:17,380 WARNING [train.py:1204] (2/4) Exclude cut with ID _17956_210_4353_1_1531209568125_6979970_54-77610-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 18:26:17,528 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.bypass.skip_rate, batch_count=1750626.6666666667, ans=0.04949747468305833 2023-10-04 18:26:19,357 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_40047_210_2290_1_1533290519_2374311_257-335162-0_sp1.1 from training. Duration: 0.863625 2023-10-04 18:26:22,141 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9653W0193-675471 from training. Duration: 0.9656875 2023-10-04 18:26:23,532 WARNING [train.py:1204] (2/4) Exclude cut with ID _25248_210_8750_1_1532062825941_5554180_243-240536-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 18:26:24,814 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.720e+02 2.106e+02 2.353e+02 2.933e+02 4.045e+02, threshold=4.706e+02, percent-clipped=0.0 2023-10-04 18:26:29,158 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_32360_210_4198_1_1532689160_3648436_60-51908-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 18:26:30,468 WARNING [train.py:1204] (2/4) Exclude cut with ID _4092_210_2151_1_1526643044449_3135759_227-213972-0_sp0.9 from training. Duration: 0.6 2023-10-04 18:26:30,497 WARNING [train.py:1204] (2/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_392-173500-0_sp1.1 from training. Duration: 0.97275 2023-10-04 18:26:30,537 WARNING [train.py:1204] (2/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_71-329150-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 18:26:30,539 WARNING [train.py:1204] (2/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_866-173440-0 from training. Duration: 0.9 2023-10-04 18:26:30,617 WARNING [train.py:1204] (2/4) Exclude cut with ID _14832_210_8750_1_1530691351539_3171348_1-54315-0_sp0.9 from training. Duration: 0.9666875 2023-10-04 18:26:33,383 WARNING [train.py:1204] (2/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_430-116139-0_sp1.1 from training. Duration: 0.77275 2023-10-04 18:26:33,422 WARNING [train.py:1204] (2/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_356-45054-0_sp0.9 from training. Duration: 0.9555625 2023-10-04 18:26:39,101 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_3531_210_3528_1_1526294203_5216456_309-227302-0_sp0.9 from training. Duration: 0.7666875 2023-10-04 18:26:41,814 WARNING [train.py:1204] (2/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_301-330455-0_sp1.1 from training. Duration: 0.72725 2023-10-04 18:26:44,687 WARNING [train.py:1204] (2/4) Exclude cut with ID _23557_210_8750_1_1531890106370_6846255_667-145945-0_sp1.1 from training. Duration: 0.92725 2023-10-04 18:26:47,591 WARNING [train.py:1204] (2/4) Exclude cut with ID _14860_210_4915_1_1530702020340_7343240_836-328489-0_sp1.1 from training. Duration: 0.8181875 2023-10-04 18:26:49,358 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_37152_210_9697_1_1533106573_3844188_416-333249-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 18:26:52,302 WARNING [train.py:1204] (2/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_538-32291-0_sp0.9 from training. Duration: 0.92225 2023-10-04 18:26:53,782 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.conv_module2.balancer2.prob, batch_count=1750826.6666666667, ans=0.125 2023-10-04 18:26:54,881 WARNING [train.py:1204] (2/4) Exclude cut with ID _32704_210_8954_1_1533017488840_6747370_965-176824-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 18:26:55,222 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.bypass.scale_min, batch_count=1750826.6666666667, ans=0.2 2023-10-04 18:26:55,778 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.1.encoder.layers.0.self_attn2.whiten, num_groups=1, num_channels=256, metric=12.82 vs. limit=22.5 2023-10-04 18:26:57,797 WARNING [train.py:1204] (2/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_86-19601-0_sp1.1 from training. Duration: 0.77275 2023-10-04 18:26:57,860 WARNING [train.py:1204] (2/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_436-318113-0_sp1.1 from training. Duration: 0.6818125 2023-10-04 18:26:59,207 WARNING [train.py:1204] (2/4) Exclude cut with ID _41817_210_18835_1_1533434477160_7423049_555-293448-0_sp0.9 from training. Duration: 0.888875 2023-10-04 18:26:59,224 WARNING [train.py:1204] (2/4) Exclude cut with ID _4697_210_2069_1_1526815801953_3823098_68-219807-0 from training. Duration: 0.47 2023-10-04 18:27:02,047 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7544_210_6753_1_1528591327_1471426_33-346031-0_sp1.1 from training. Duration: 0.5545625 2023-10-04 18:27:02,049 WARNING [train.py:1204] (2/4) Exclude cut with ID _49748_210_24116_1_1533986971737_3158910_263-52113-0_sp1.1 from training. Duration: 0.97275 2023-10-04 18:27:03,384 WARNING [train.py:1204] (2/4) Exclude cut with ID _64639_210_5816_1_1535367619437_3528000_340-24580-0_sp1.1 from training. Duration: 0.92725 2023-10-04 18:27:03,399 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_47228_210_4983_1_1533780021_3672027_479-114941-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 18:27:03,432 WARNING [train.py:1204] (2/4) Exclude cut with ID _44585_210_18628_1_1533605350387_4518199_506-119659-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 18:27:05,372 WARNING [train.py:1204] (2/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_314-126819-0_sp1.1 from training. Duration: 0.4545625 2023-10-04 18:27:05,376 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_2541_210_1794_1_1525590269_3084765_156-173713-0_sp0.9 from training. Duration: 0.9 2023-10-04 18:27:05,436 WARNING [train.py:1204] (2/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_108-255430-0 from training. Duration: 0.97 2023-10-04 18:27:05,444 WARNING [train.py:1204] (2/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_791-45149-0_sp1.1 from training. Duration: 0.7818125 2023-10-04 18:27:05,449 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_53378_210_17282_1_1534227054_1261425_10-118295-0_sp1.1 from training. Duration: 0.97275 2023-10-04 18:27:06,582 WARNING [train.py:1204] (2/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_148-158509-0 from training. Duration: 0.41 2023-10-04 18:27:11,562 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_34726_210_4525_1_1533019474_5030395_687-291305-0_sp1.1 from training. Duration: 0.936375 2023-10-04 18:27:14,332 WARNING [train.py:1204] (2/4) Exclude cut with ID _14860_210_4915_1_1530702020340_7343240_264-324537-0_sp1.1 from training. Duration: 0.963625 2023-10-04 18:27:17,673 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_41251_210_15368_1_1533429767_4820695_591-46386-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 18:27:17,705 WARNING [train.py:1204] (2/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_86-19601-0_sp0.9 from training. Duration: 0.9444375 2023-10-04 18:27:17,738 WARNING [train.py:1204] (2/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_401-255491-0_sp0.9 from training. Duration: 0.77775 2023-10-04 18:27:20,439 WARNING [train.py:1204] (2/4) Exclude cut with ID _8696_210_1876_1_1528545632440_3345058_209-54069-0_sp1.1 from training. Duration: 0.6090625 2023-10-04 18:27:20,456 WARNING [train.py:1204] (2/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_369-325184-0_sp1.1 from training. Duration: 0.9 2023-10-04 18:27:21,696 INFO [train.py:1046] (2/4) Epoch 50, batch 2350, loss[loss=0.1461, simple_loss=0.2144, pruned_loss=0.03884, over 22584.00 frames. ], tot_loss[loss=0.1533, simple_loss=0.2337, pruned_loss=0.0364, over 4689481.24 frames. ], batch size: 322, lr: 2.03e-03, grad_scale: 8.0 2023-10-04 18:27:21,752 WARNING [train.py:1204] (2/4) Exclude cut with ID _14687_210_5854_1_1530703838259_3664889_115-239046-0_sp0.9 from training. Duration: 0.7444375 2023-10-04 18:27:21,811 WARNING [train.py:1204] (2/4) Exclude cut with ID _8064_210_5399_1_1528372299291_4282973_168-50337-0 from training. Duration: 0.84 2023-10-04 18:27:28,573 WARNING [train.py:1204] (2/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_909-64151-0_sp1.1 from training. Duration: 0.8545625 2023-10-04 18:27:28,592 WARNING [train.py:1204] (2/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_597-154640-0 from training. Duration: 0.9 2023-10-04 18:27:33,504 WARNING [train.py:1204] (2/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_126-274694-0 from training. Duration: 0.98 2023-10-04 18:27:36,333 WARNING [train.py:1204] (2/4) Exclude cut with ID _21783_210_11357_1_1531742458730_3762679_344-269786-0_sp1.1 from training. Duration: 0.97275 2023-10-04 18:27:37,718 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_37818_210_4238_1_1533620910_4720083_322-253154-0_sp1.1 from training. Duration: 0.92725 2023-10-04 18:27:37,721 WARNING [train.py:1204] (2/4) Exclude cut with ID _58895_210_8750_1_1535184081320_3649089_221-15823-0_sp1.1 from training. Duration: 0.92725 2023-10-04 18:27:37,733 WARNING [train.py:1204] (2/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_9-21737-0_sp1.1 from training. Duration: 0.8818125 2023-10-04 18:27:39,639 WARNING [train.py:1204] (2/4) Exclude cut with ID _14069_210_5632_1_1530512692033_4803390_522-341588-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 18:27:39,870 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.balancer2.prob, batch_count=1751026.6666666667, ans=0.125 2023-10-04 18:27:41,015 WARNING [train.py:1204] (2/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_363-185703-0 from training. Duration: 0.95 2023-10-04 18:27:43,740 WARNING [train.py:1204] (2/4) Exclude cut with ID _41817_210_18835_1_1533434477160_7423049_265-76216-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 18:27:43,792 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder_embed.conv.5.prob, batch_count=1751026.6666666667, ans=0.125 2023-10-04 18:27:43,930 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.feed_forward1.out_proj.dropout_p, batch_count=1751026.6666666667, ans=0.1 2023-10-04 18:27:49,899 WARNING [train.py:1204] (2/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_841-341679-0 from training. Duration: 0.58 2023-10-04 18:27:52,631 WARNING [train.py:1204] (2/4) Exclude cut with ID _55429_210_10626_1_1534467588527_7174143_97-335084-0_sp1.1 from training. Duration: 0.8818125 2023-10-04 18:27:55,481 WARNING [train.py:1204] (2/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_690-126306-0_sp0.9 from training. Duration: 0.8666875 2023-10-04 18:27:55,490 WARNING [train.py:1204] (2/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_102-92751-0_sp1.1 from training. Duration: 0.9 2023-10-04 18:27:56,930 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_3787_210_2040_1_1526477544_5478586_603-120324-0_sp1.1 from training. Duration: 0.736375 2023-10-04 18:27:58,475 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_33348_210_5632_1_1533086912_3324030_85-28583-0 from training. Duration: 0.82 2023-10-04 18:27:59,875 WARNING [train.py:1204] (2/4) Exclude cut with ID _60057_210_10640_1_1534903065793_3333150_362-230377-0_sp1.1 from training. Duration: 0.7909375 2023-10-04 18:28:00,019 WARNING [train.py:1204] (2/4) Exclude cut with ID _60113_210_16271_1_1535164212481_4386079_194-146486-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 18:28:00,032 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_53753_210_7234_1_1534320003_3594148_171-277668-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 18:28:01,322 WARNING [train.py:1204] (2/4) Exclude cut with ID _34423_210_8750_1_1533430798456_3330199_251-332788-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 18:28:04,846 WARNING [train.py:1204] (2/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_663-166973-0_sp0.9 from training. Duration: 0.888875 2023-10-04 18:28:06,308 WARNING [train.py:1204] (2/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_927-43827-0 from training. Duration: 0.74 2023-10-04 18:28:06,361 WARNING [train.py:1204] (2/4) Exclude cut with ID _28323_210_16187_1_1532480395667_4100300_306-74561-0_sp1.1 from training. Duration: 0.8545625 2023-10-04 18:28:06,884 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.4.encoder.layers.0.nonlin_attention.whiten2, num_groups=1, num_channels=384, metric=9.25 vs. limit=15.0 2023-10-04 18:28:10,836 WARNING [train.py:1204] (2/4) Exclude cut with ID _15037_210_2253_1_1530752294104_3267650_139-337989-0_sp1.1 from training. Duration: 0.92725 2023-10-04 18:28:10,857 WARNING [train.py:1204] (2/4) Exclude cut with ID _49039_210_6315_1_1533967537541_3846118_460-263670-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 18:28:12,326 WARNING [train.py:1204] (2/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_186-309161-0 from training. Duration: 0.54 2023-10-04 18:28:13,616 WARNING [train.py:1204] (2/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_87-278686-0_sp1.1 from training. Duration: 0.7 2023-10-04 18:28:15,147 WARNING [train.py:1204] (2/4) Exclude cut with ID _27052_210_5986_1_1532329197187_3585009_314-106499-0 from training. Duration: 0.92 2023-10-04 18:28:16,463 WARNING [train.py:1204] (2/4) Exclude cut with ID _3465_210_3525_1_1526175667060_3574049_412-287526-0_sp0.9 from training. Duration: 0.8 2023-10-04 18:28:18,617 WARNING [train.py:1204] (2/4) Exclude cut with ID _22961_210_8254_1_1531976366672_4278180_317-199546-0 from training. Duration: 0.86 2023-10-04 18:28:22,768 WARNING [train.py:1204] (2/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_381-317791-0 from training. Duration: 0.73 2023-10-04 18:28:24,074 WARNING [train.py:1204] (2/4) Exclude cut with ID _44010_210_4525_1_1533884262964_4013620_560-310411-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 18:28:24,087 WARNING [train.py:1204] (2/4) Exclude cut with ID _5294_210_1747_1_1527249643928_3696179_342-270278-0_sp0.9 from training. Duration: 0.611125 2023-10-04 18:28:24,116 WARNING [train.py:1204] (2/4) Exclude cut with ID ID0473W0002-918951 from training. Duration: 0.924 2023-10-04 18:28:24,140 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1083-220122-0 from training. Duration: 0.51 2023-10-04 18:28:26,948 WARNING [train.py:1204] (2/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_138-154812-0 from training. Duration: 0.78 2023-10-04 18:28:31,035 WARNING [train.py:1204] (2/4) Exclude cut with ID _44982_210_1511_1_1534312820108_3726029_331-19420-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 18:28:31,796 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.3.encoder.layers.1.self_attn_weights.whiten_keys, num_groups=8, num_channels=256, metric=3.59 vs. limit=6.0 2023-10-04 18:28:35,673 INFO [train.py:1046] (2/4) Epoch 50, batch 2400, loss[loss=0.1355, simple_loss=0.2058, pruned_loss=0.03266, over 23400.00 frames. ], tot_loss[loss=0.1529, simple_loss=0.2329, pruned_loss=0.0364, over 4692208.69 frames. ], batch size: 285, lr: 2.03e-03, grad_scale: 8.0 2023-10-04 18:28:35,773 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_2655_210_2087_1_1525688959_3897400_202-147557-0_sp0.9 from training. Duration: 0.9444375 2023-10-04 18:28:41,726 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_23850_210_4238_1_1532232597_1632433_32-185135-0_sp0.9 from training. Duration: 0.9555625 2023-10-04 18:28:43,481 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_14056_210_4163_1_1530604483_7572827_46-93547-0_sp1.1 from training. Duration: 0.863625 2023-10-04 18:28:43,521 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7931_210_1794_1_1528594728_5564059_280-279560-0 from training. Duration: 0.98 2023-10-04 18:28:44,910 WARNING [train.py:1204] (2/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_937-208281-0 from training. Duration: 0.85 2023-10-04 18:28:50,781 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_14928_210_5632_1_1530773461_4714000_454-112198-0_sp0.9 from training. Duration: 0.7333125 2023-10-04 18:28:50,782 WARNING [train.py:1204] (2/4) Exclude cut with ID _32704_210_8954_1_1533017488840_6747370_944-153333-0_sp1.1 from training. Duration: 0.963625 2023-10-04 18:28:53,251 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.4.encoder.layers.0.conv_module1.whiten, num_groups=1, num_channels=384, metric=4.47 vs. limit=15.0 2023-10-04 18:28:54,025 WARNING [train.py:1204] (2/4) Exclude cut with ID _14821_210_8750_1_1530680503991_6829359_2-227151-0 from training. Duration: 0.83 2023-10-04 18:28:54,042 WARNING [train.py:1204] (2/4) Exclude cut with ID _28461_210_2253_1_1532771775189_3041299_96-152158-0_sp1.1 from training. Duration: 0.77275 2023-10-04 18:28:55,231 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.709e+02 2.154e+02 2.535e+02 3.094e+02 5.336e+02, threshold=5.070e+02, percent-clipped=5.0 2023-10-04 18:28:55,300 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7837_210_1792_1_1528453445_5841305_654-54195-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 18:28:55,334 WARNING [train.py:1204] (2/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_344-152526-0 from training. Duration: 0.53 2023-10-04 18:29:01,008 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_30167_210_3528_1_1532428012_4772375_480-142230-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 18:29:02,406 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_160-218721-0 from training. Duration: 0.96 2023-10-04 18:29:05,294 WARNING [train.py:1204] (2/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_65-88242-0_sp0.9 from training. Duration: 0.62225 2023-10-04 18:29:09,097 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_34886_210_10633_1_1533203882_1755661_76-156096-0 from training. Duration: 0.87 2023-10-04 18:29:11,960 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.bypass.scale_min, batch_count=1751426.6666666667, ans=0.2 2023-10-04 18:29:13,084 WARNING [train.py:1204] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_188-38388-0_sp1.1 from training. Duration: 0.92725 2023-10-04 18:29:14,879 WARNING [train.py:1204] (2/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_81-289335-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 18:29:17,843 WARNING [train.py:1204] (2/4) Exclude cut with ID _59493_210_17282_1_1535078419077_3225879_110-8778-0_sp1.1 from training. Duration: 0.936375 2023-10-04 18:29:17,896 WARNING [train.py:1204] (2/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_188-260011-0 from training. Duration: 0.73 2023-10-04 18:29:17,933 WARNING [train.py:1204] (2/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_453-133983-0_sp0.9 from training. Duration: 0.8666875 2023-10-04 18:29:25,483 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.feed_forward1.out_proj.dropout_p, batch_count=1751493.3333333333, ans=0.1 2023-10-04 18:29:26,673 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8054_210_7927_1_1528627721_4984094_553-17379-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 18:29:29,346 WARNING [train.py:1204] (2/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_341-171072-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 18:29:30,855 WARNING [train.py:1204] (2/4) Exclude cut with ID _30800_210_22382_1_1532499347904_4515959_265-257286-0_sp1.1 from training. Duration: 0.963625 2023-10-04 18:29:32,280 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_403-39619-0_sp0.9 from training. Duration: 0.9333125 2023-10-04 18:29:32,285 WARNING [train.py:1204] (2/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_464-33931-0_sp0.9 from training. Duration: 0.6 2023-10-04 18:29:32,313 WARNING [train.py:1204] (2/4) Exclude cut with ID _60804_210_9642_1_1534831199859_7268299_632-143863-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 18:29:32,325 WARNING [train.py:1204] (2/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_431-310997-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 18:29:33,595 WARNING [train.py:1204] (2/4) Exclude cut with ID _31649_210_6228_1_1532584608063_4091078_114-263860-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 18:29:33,613 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6961_210_2087_1_1527937168_6815011_562-165518-0_sp0.9 from training. Duration: 0.8555625 2023-10-04 18:29:36,974 WARNING [train.py:1204] (2/4) Exclude cut with ID _27052_210_5986_1_1532329197187_3585009_338-325870-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 18:29:38,700 WARNING [train.py:1204] (2/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_469-94673-0_sp1.1 from training. Duration: 0.7181875 2023-10-04 18:29:38,925 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.out_combiner.scale_min, batch_count=1751560.0, ans=0.2 2023-10-04 18:29:40,000 WARNING [train.py:1204] (2/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_259-34038-0 from training. Duration: 0.74 2023-10-04 18:29:40,117 WARNING [train.py:1204] (2/4) Exclude cut with ID _14922_210_6228_1_1530707435273_3693310_388-129552-0 from training. Duration: 0.96 2023-10-04 18:29:41,574 WARNING [train.py:1204] (2/4) Exclude cut with ID _28394_210_8913_1_1532430049518_3650118_342-251455-0_sp1.1 from training. Duration: 0.936375 2023-10-04 18:29:41,602 WARNING [train.py:1204] (2/4) Exclude cut with ID _53731_210_7994_1_1534553979244_3757214_8-191390-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 18:29:42,802 WARNING [train.py:1204] (2/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_613-232856-0 from training. Duration: 0.54 2023-10-04 18:29:42,870 WARNING [train.py:1204] (2/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_557-349534-0 from training. Duration: 0.91 2023-10-04 18:29:42,885 WARNING [train.py:1204] (2/4) Exclude cut with ID _14224_210_4381_1_1530529062037_4264010_379-184223-0 from training. Duration: 0.85 2023-10-04 18:29:42,889 WARNING [train.py:1204] (2/4) Exclude cut with ID IC0125W0001-61807 from training. Duration: 0.966 2023-10-04 18:29:44,567 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_229-45300-0 from training. Duration: 0.57 2023-10-04 18:29:45,942 WARNING [train.py:1204] (2/4) Exclude cut with ID _30304_210_6319_1_1532476827261_7582060_1069-24743-0_sp1.1 from training. Duration: 0.92725 2023-10-04 18:29:46,055 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_41452_210_7291_1_1533437687_3941281_323-172873-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 18:29:46,066 WARNING [train.py:1204] (2/4) Exclude cut with ID _37086_210_9038_1_1532999540891_7021250_802-226709-0_sp1.1 from training. Duration: 0.97275 2023-10-04 18:29:48,700 WARNING [train.py:1204] (2/4) Exclude cut with ID IC0144W0500-71767 from training. Duration: 0.931875 2023-10-04 18:29:50,520 INFO [train.py:1046] (2/4) Epoch 50, batch 2450, loss[loss=0.1177, simple_loss=0.1726, pruned_loss=0.03133, over 19270.00 frames. ], tot_loss[loss=0.1519, simple_loss=0.232, pruned_loss=0.0359, over 4698628.57 frames. ], batch size: 388, lr: 2.03e-03, grad_scale: 8.0 2023-10-04 18:29:50,594 WARNING [train.py:1204] (2/4) Exclude cut with ID _39846_210_12651_1_1533605310265_2115212_233-129699-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 18:29:50,646 WARNING [train.py:1204] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_574-45541-0_sp0.9 from training. Duration: 0.72225 2023-10-04 18:29:52,194 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=1751626.6666666667, ans=0.1 2023-10-04 18:29:53,506 WARNING [train.py:1204] (2/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_372-298093-0_sp1.1 from training. Duration: 0.72725 2023-10-04 18:29:53,526 WARNING [train.py:1204] (2/4) Exclude cut with ID _62574_210_2655_1_1535180407058_3933249_34-26343-0_sp1.1 from training. Duration: 0.97275 2023-10-04 18:29:56,383 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_512-256238-0_sp1.1 from training. Duration: 0.963625 2023-10-04 18:29:56,388 WARNING [train.py:1204] (2/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_196-55502-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 18:29:57,751 WARNING [train.py:1204] (2/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_195-167011-0 from training. Duration: 0.75 2023-10-04 18:30:01,928 INFO [scaling.py:1118] (2/4) WithLoss: name=encoder.encoders.1.encoder.layers.1.self_attn_weights, loss-sum=0.000e+00 2023-10-04 18:30:04,519 WARNING [train.py:1204] (2/4) Exclude cut with ID _13678_210_7497_1_1530530069226_3463170_345-230416-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 18:30:04,538 WARNING [train.py:1204] (2/4) Exclude cut with ID _51078_210_9070_1_1534161557424_3805250_225-327809-0_sp1.1 from training. Duration: 0.963625 2023-10-04 18:30:08,380 WARNING [train.py:1204] (2/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_18-189198-0_sp1.1 from training. Duration: 0.8181875 2023-10-04 18:30:09,642 WARNING [train.py:1204] (2/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_329-82235-0_sp1.1 from training. Duration: 0.7454375 2023-10-04 18:30:09,651 WARNING [train.py:1204] (2/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_726-270780-0_sp0.9 from training. Duration: 0.9555625 2023-10-04 18:30:09,679 WARNING [train.py:1204] (2/4) Exclude cut with ID _66873_210_5517_1_1535454136506_4293370_654-52713-0 from training. Duration: 0.77 2023-10-04 18:30:13,907 WARNING [train.py:1204] (2/4) Exclude cut with ID _20455_210_5219_1_1531565996775_8098100_191-16006-0_sp1.1 from training. Duration: 0.963625 2023-10-04 18:30:16,668 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_3171_210_2087_1_1526109084_2330007_66-260583-0_sp1.1 from training. Duration: 0.6545625 2023-10-04 18:30:17,996 WARNING [train.py:1204] (2/4) Exclude cut with ID _38670_210_15379_1_1533448810659_4391059_63-129006-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 18:30:23,032 WARNING [train.py:1204] (2/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_393-57888-0_sp0.9 from training. Duration: 0.711125 2023-10-04 18:30:24,471 WARNING [train.py:1204] (2/4) Exclude cut with ID _6275_210_2084_1_1528110062213_7669049_857-298942-0_sp0.9 from training. Duration: 0.9444375 2023-10-04 18:30:24,574 WARNING [train.py:1204] (2/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_239-22076-0_sp0.9 from training. Duration: 0.9444375 2023-10-04 18:30:24,604 WARNING [train.py:1204] (2/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_424-227947-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 18:30:25,992 WARNING [train.py:1204] (2/4) Exclude cut with ID _14069_210_5632_1_1530512692033_4803390_95-1896-0 from training. Duration: 0.98 2023-10-04 18:30:27,350 WARNING [train.py:1204] (2/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_96-244299-0_sp0.9 from training. Duration: 0.97775 2023-10-04 18:30:34,227 WARNING [train.py:1204] (2/4) Exclude cut with ID _46683_210_9642_1_1533730913492_4591116_562-319825-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 18:30:35,632 WARNING [train.py:1204] (2/4) Exclude cut with ID _64639_210_5816_1_1535367619437_3528000_284-173474-0_sp1.1 from training. Duration: 0.963625 2023-10-04 18:30:35,652 WARNING [train.py:1204] (2/4) Exclude cut with ID _29529_210_7234_1_1532393961455_3977460_497-100133-0_sp1.1 from training. Duration: 0.936375 2023-10-04 18:30:35,681 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_12868_210_6753_1_1530701932_530275_18-76322-0_sp0.9 from training. Duration: 0.9666875 2023-10-04 18:30:35,725 WARNING [train.py:1204] (2/4) Exclude cut with ID _50093_210_4220_1_1534417141207_3432299_116-303447-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 18:30:37,100 WARNING [train.py:1204] (2/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_44-79427-0_sp1.1 from training. Duration: 0.8909375 2023-10-04 18:30:37,155 WARNING [train.py:1204] (2/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_887-249555-0 from training. Duration: 0.77 2023-10-04 18:30:42,047 WARNING [train.py:1204] (2/4) Exclude cut with ID _28461_210_2253_1_1532771775189_3041299_96-152158-0_sp0.9 from training. Duration: 0.9444375 2023-10-04 18:30:42,096 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_14058_210_4163_1_1530690953_6327180_49-210687-0_sp1.1 from training. Duration: 0.8545625 2023-10-04 18:30:44,828 WARNING [train.py:1204] (2/4) Exclude cut with ID _40399_210_4353_1_1533305658194_3279406_351-127903-0_sp1.1 from training. Duration: 0.97275 2023-10-04 18:30:44,834 WARNING [train.py:1204] (2/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_184-206939-0_sp1.1 from training. Duration: 0.936375 2023-10-04 18:30:49,529 WARNING [train.py:1204] (2/4) Exclude cut with ID _14100_210_8341_1_1530529320613_3678069_258-224599-0_sp1.1 from training. Duration: 0.7 2023-10-04 18:30:49,561 WARNING [train.py:1204] (2/4) Exclude cut with ID _6493_210_1747_1_1528459210424_4012470_324-323854-0 from training. Duration: 0.92 2023-10-04 18:30:50,971 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_12868_210_6753_1_1530702479_3610564_151-325626-0_sp1.1 from training. Duration: 0.8818125 2023-10-04 18:30:52,698 WARNING [train.py:1204] (2/4) Exclude cut with ID _14528_210_8254_1_1530669700514_4306939_106-43145-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 18:30:52,711 WARNING [train.py:1204] (2/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_686-54383-0 from training. Duration: 0.87 2023-10-04 18:30:52,761 WARNING [train.py:1204] (2/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_801-102362-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 18:30:54,145 WARNING [train.py:1204] (2/4) Exclude cut with ID _48677_210_5663_1_1533902405919_3892989_408-40740-0_sp0.9 from training. Duration: 0.92225 2023-10-04 18:30:58,344 WARNING [train.py:1204] (2/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_448-212135-0_sp1.1 from training. Duration: 0.82725 2023-10-04 18:30:58,691 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.feed_forward3.hidden_balancer.prob, batch_count=1751893.3333333333, ans=0.125 2023-10-04 18:30:59,781 WARNING [train.py:1204] (2/4) Exclude cut with ID _59170_210_6721_1_1534983633960_4203335_320-124047-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 18:31:00,030 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.conv_module1.balancer2.prob, batch_count=1751893.3333333333, ans=0.125 2023-10-04 18:31:01,118 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_4692_210_3528_1_1526899755_4323254_107-44401-0_sp1.1 from training. Duration: 0.7818125 2023-10-04 18:31:03,727 INFO [train.py:1046] (2/4) Epoch 50, batch 2500, loss[loss=0.1518, simple_loss=0.2261, pruned_loss=0.03882, over 23367.00 frames. ], tot_loss[loss=0.1515, simple_loss=0.2319, pruned_loss=0.03556, over 4717269.35 frames. ], batch size: 119, lr: 2.03e-03, grad_scale: 8.0 2023-10-04 18:31:03,848 WARNING [train.py:1204] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_731-2428-0 from training. Duration: 0.83 2023-10-04 18:31:04,085 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.feed_forward1.hidden_balancer.prob, batch_count=1751960.0, ans=0.125 2023-10-04 18:31:05,143 WARNING [train.py:1204] (2/4) Exclude cut with ID _29857_210_20034_1_1532395886753_3798160_335-163562-0_sp1.1 from training. Duration: 0.77275 2023-10-04 18:31:13,052 WARNING [train.py:1204] (2/4) Exclude cut with ID _14832_210_8750_1_1530691351539_3171348_171-275004-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 18:31:17,680 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.feed_forward2.hidden_balancer.prob, batch_count=1752026.6666666667, ans=0.125 2023-10-04 18:31:19,501 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.3.encoder.layers.1.nonlin_attention.whiten2, num_groups=1, num_channels=512, metric=8.41 vs. limit=15.0 2023-10-04 18:31:20,746 WARNING [train.py:1204] (2/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_105-328174-0_sp0.9 from training. Duration: 0.8444375 2023-10-04 18:31:20,771 WARNING [train.py:1204] (2/4) Exclude cut with ID _68965_210_1883_1_1535698962272_3497490_164-118421-0_sp1.1 from training. Duration: 0.936375 2023-10-04 18:31:21,993 WARNING [train.py:1204] (2/4) Exclude cut with ID _19896_210_1883_1_1531709022662_6649569_380-24170-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 18:31:21,998 WARNING [train.py:1204] (2/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_505-205270-0_sp1.1 from training. Duration: 0.3454375 2023-10-04 18:31:23,685 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.711e+02 2.188e+02 2.534e+02 3.108e+02 6.481e+02, threshold=5.068e+02, percent-clipped=2.0 2023-10-04 18:31:28,046 WARNING [train.py:1204] (2/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_427-208901-0_sp1.1 from training. Duration: 0.6181875 2023-10-04 18:31:29,436 WARNING [train.py:1204] (2/4) Exclude cut with ID _23818_210_6228_1_1531917063884_3676920_85-326916-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 18:31:29,528 WARNING [train.py:1204] (2/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_101-293367-0_sp1.1 from training. Duration: 0.57275 2023-10-04 18:31:29,536 WARNING [train.py:1204] (2/4) Exclude cut with ID _5409_210_1388_1_1527292850189_5865191_178-127970-0_sp1.1 from training. Duration: 0.5181875 2023-10-04 18:31:30,941 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_403-39619-0 from training. Duration: 0.84 2023-10-04 18:31:32,261 WARNING [train.py:1204] (2/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_427-87841-0_sp1.1 from training. Duration: 0.963625 2023-10-04 18:31:33,627 WARNING [train.py:1204] (2/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_286-68181-0_sp1.1 from training. Duration: 0.97275 2023-10-04 18:31:33,660 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6673_210_3273_1_1527767790_3173240_327-306185-0 from training. Duration: 0.83 2023-10-04 18:31:33,683 WARNING [train.py:1204] (2/4) Exclude cut with ID _3675_210_4557_1_1526639934408_4182089_106-203713-0_sp1.1 from training. Duration: 0.963625 2023-10-04 18:31:34,388 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.2.encoder.layers.0.conv_module2.whiten, num_groups=1, num_channels=384, metric=13.03 vs. limit=15.0 2023-10-04 18:31:34,977 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_53071_210_16554_1_1534493434_1276796_130-36076-0 from training. Duration: 0.95 2023-10-04 18:31:35,006 WARNING [train.py:1204] (2/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_586-93054-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 18:31:40,980 WARNING [train.py:1204] (2/4) Exclude cut with ID _23825_210_13449_1_1531915313526_3497150_262-10571-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 18:31:41,055 WARNING [train.py:1204] (2/4) Exclude cut with ID _28783_210_7943_1_1532671164808_7155479_432-67885-0_sp1.1 from training. Duration: 0.97275 2023-10-04 18:31:44,309 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6287_210_3273_1_1527509506_2653716_182-199887-0_sp0.9 from training. Duration: 0.5666875 2023-10-04 18:31:45,639 WARNING [train.py:1204] (2/4) Exclude cut with ID _3231_210_2106_1_1526205864934_6784220_171-256004-0 from training. Duration: 0.86 2023-10-04 18:31:45,691 WARNING [train.py:1204] (2/4) Exclude cut with ID _69051_210_1531_1_1535885566901_3829538_332-92090-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 18:31:45,912 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.conv_module2.balancer1.prob, batch_count=1752093.3333333333, ans=0.125 2023-10-04 18:31:47,040 WARNING [train.py:1204] (2/4) Exclude cut with ID _3675_210_4557_1_1526639934408_4182089_104-324125-0_sp1.1 from training. Duration: 0.963625 2023-10-04 18:31:51,149 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_41764_210_2521_1_1533686110_4089313_501-2119-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 18:31:53,128 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_14056_210_4163_1_1530604483_7572827_56-100988-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 18:31:56,452 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.ff3_skip_rate, batch_count=1752160.0, ans=0.0 2023-10-04 18:31:57,640 WARNING [train.py:1204] (2/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_398-239819-0_sp1.1 from training. Duration: 0.9 2023-10-04 18:31:57,968 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.balancer1.prob, batch_count=1752160.0, ans=0.125 2023-10-04 18:32:01,793 WARNING [train.py:1204] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_74-48717-0_sp1.1 from training. Duration: 0.52725 2023-10-04 18:32:03,733 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.5.encoder.layers.1.conv_module1.whiten, num_groups=1, num_channels=256, metric=3.42 vs. limit=15.0 2023-10-04 18:32:04,615 WARNING [train.py:1204] (2/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_177-49092-0 from training. Duration: 0.91 2023-10-04 18:32:04,634 WARNING [train.py:1204] (2/4) Exclude cut with ID _62337_210_12392_1_1535009273105_3412229_44-176353-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 18:32:04,649 WARNING [train.py:1204] (2/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_110-130961-0_sp1.1 from training. Duration: 0.42725 2023-10-04 18:32:05,433 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.2.encoder.layers.2.feed_forward1.out_whiten, num_groups=1, num_channels=384, metric=5.43 vs. limit=15.0 2023-10-04 18:32:06,094 WARNING [train.py:1204] (2/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_37-189262-0_sp1.1 from training. Duration: 0.8909375 2023-10-04 18:32:06,095 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_3172_210_2087_1_1526292929_3987865_197-97762-0_sp0.9 from training. Duration: 0.8333125 2023-10-04 18:32:08,710 WARNING [train.py:1204] (2/4) Exclude cut with ID IC0677W0065-334897 from training. Duration: 0.896 2023-10-04 18:32:08,710 WARNING [train.py:1204] (2/4) Exclude cut with ID IC0677W0080-334912 from training. Duration: 0.768 2023-10-04 18:32:08,723 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_120-41668-0 from training. Duration: 0.7 2023-10-04 18:32:11,395 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.3.encoder.layers.1.self_attn_weights.whiten_keys, num_groups=8, num_channels=256, metric=3.20 vs. limit=6.0 2023-10-04 18:32:11,998 WARNING [train.py:1204] (2/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_460-205287-0_sp1.1 from training. Duration: 0.963625 2023-10-04 18:32:13,912 WARNING [train.py:1204] (2/4) Exclude cut with ID _14632_210_9988_1_1530691279392_3923200_102-204477-0 from training. Duration: 0.97 2023-10-04 18:32:13,923 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_34815_210_6388_1_1533381320_3571903_279-305565-0 from training. Duration: 0.92 2023-10-04 18:32:15,425 WARNING [train.py:1204] (2/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_91-148831-0_sp1.1 from training. Duration: 0.9 2023-10-04 18:32:15,492 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6291_210_3273_1_1527683649_1078498_82-221196-0 from training. Duration: 0.76 2023-10-04 18:32:15,724 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.conv_module1.balancer1.prob, batch_count=1752226.6666666667, ans=0.125 2023-10-04 18:32:17,972 INFO [train.py:1046] (2/4) Epoch 50, batch 2550, loss[loss=0.1468, simple_loss=0.2281, pruned_loss=0.03276, over 23496.00 frames. ], tot_loss[loss=0.1518, simple_loss=0.2323, pruned_loss=0.03562, over 4708954.99 frames. ], batch size: 134, lr: 2.03e-03, grad_scale: 8.0 2023-10-04 18:32:19,415 WARNING [train.py:1204] (2/4) Exclude cut with ID _38020_210_5986_1_1533112252259_7006089_493-102745-0 from training. Duration: 0.98 2023-10-04 18:32:22,584 WARNING [train.py:1204] (2/4) Exclude cut with ID _61133_210_9969_1_1535196777227_3696409_40-93442-0_sp1.1 from training. Duration: 0.92725 2023-10-04 18:32:22,784 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.0.conv_module2.balancer1.prob, batch_count=1752293.3333333333, ans=0.125 2023-10-04 18:32:24,011 WARNING [train.py:1204] (2/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_45-258852-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 18:32:25,379 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_53071_210_16554_1_1534493434_1276796_130-36076-0_sp1.1 from training. Duration: 0.863625 2023-10-04 18:32:27,271 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8694_210_6400_1_1529065754_3260756_332-13553-0_sp1.1 from training. Duration: 0.92725 2023-10-04 18:32:27,343 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_405-339986-0 from training. Duration: 0.51 2023-10-04 18:32:27,395 WARNING [train.py:1204] (2/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_927-43827-0_sp1.1 from training. Duration: 0.67275 2023-10-04 18:32:30,256 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_397-147196-0 from training. Duration: 0.8 2023-10-04 18:32:31,786 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.conv_module2.balancer1.prob, batch_count=1752360.0, ans=0.125 2023-10-04 18:32:32,922 WARNING [train.py:1204] (2/4) Exclude cut with ID 3557-8342-0013-54691-0_sp1.1 from training. Duration: 0.836375 2023-10-04 18:32:34,273 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_47438_210_5399_1_1533797471_4513356_626-127722-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 18:32:36,918 WARNING [train.py:1204] (2/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_256-323883-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 18:32:36,923 WARNING [train.py:1204] (2/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_839-173017-0_sp0.9 from training. Duration: 0.4555625 2023-10-04 18:32:36,958 WARNING [train.py:1204] (2/4) Exclude cut with ID _5289_210_2084_1_1527678202736_3779299_392-261988-0_sp1.1 from training. Duration: 0.5909375 2023-10-04 18:32:36,997 WARNING [train.py:1204] (2/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_119-224384-0_sp1.1 from training. Duration: 0.936375 2023-10-04 18:32:37,034 WARNING [train.py:1204] (2/4) Exclude cut with ID _57523_210_1856_1_1534766294545_3732989_262-267933-0_sp1.1 from training. Duration: 0.963625 2023-10-04 18:32:37,292 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.feed_forward1.hidden_balancer.prob, batch_count=1752360.0, ans=0.125 2023-10-04 18:32:39,737 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_35361_210_4238_1_1533262810_7793476_675-87012-0_sp0.9 from training. Duration: 0.988875 2023-10-04 18:32:39,764 WARNING [train.py:1204] (2/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_17-90541-0 from training. Duration: 0.94 2023-10-04 18:32:41,613 WARNING [train.py:1204] (2/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_535-14410-0_sp1.1 from training. Duration: 0.42725 2023-10-04 18:32:41,619 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_24673_210_6400_1_1531968633_778659_94-154511-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 18:32:41,631 WARNING [train.py:1204] (2/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_212-10501-0 from training. Duration: 0.94 2023-10-04 18:32:47,921 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.2.encoder.layers.1.nonlin_attention.whiten2, num_groups=1, num_channels=384, metric=5.50 vs. limit=15.0 2023-10-04 18:32:53,252 WARNING [train.py:1204] (2/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_383-285196-0_sp1.1 from training. Duration: 0.8818125 2023-10-04 18:32:54,973 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.feed_forward3.hidden_balancer.prob, batch_count=1752426.6666666667, ans=0.125 2023-10-04 18:32:59,432 WARNING [train.py:1204] (2/4) Exclude cut with ID _45560_210_21982_1_1533812603951_3584999_238-47020-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 18:32:59,442 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_21500_210_2054_1_1532149310_2716758_278-18053-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 18:32:59,456 WARNING [train.py:1204] (2/4) Exclude cut with ID _56235_210_10640_1_1534557335235_4161099_453-9626-0_sp1.1 from training. Duration: 0.936375 2023-10-04 18:33:00,802 WARNING [train.py:1204] (2/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_153-97441-0_sp1.1 from training. Duration: 0.5090625 2023-10-04 18:33:08,262 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.0.feed_forward1.out_proj.dropout_p, batch_count=1752493.3333333333, ans=0.1 2023-10-04 18:33:09,558 WARNING [train.py:1204] (2/4) Exclude cut with ID _67725_210_27767_1_1535608756671_5105359_431-120895-0_sp1.1 from training. Duration: 0.963625 2023-10-04 18:33:12,294 WARNING [train.py:1204] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_574-45541-0_sp1.1 from training. Duration: 0.5909375 2023-10-04 18:33:12,318 WARNING [train.py:1204] (2/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_162-103620-0_sp1.1 from training. Duration: 0.8454375 2023-10-04 18:33:12,330 WARNING [train.py:1204] (2/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_734-129761-0_sp1.1 from training. Duration: 0.7454375 2023-10-04 18:33:13,581 WARNING [train.py:1204] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_620-276793-0_sp1.1 from training. Duration: 0.536375 2023-10-04 18:33:13,623 WARNING [train.py:1204] (2/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_153-97441-0_sp0.9 from training. Duration: 0.62225 2023-10-04 18:33:16,873 WARNING [train.py:1204] (2/4) Exclude cut with ID _12733_210_8341_1_1530167424556_3646380_523-85621-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 18:33:16,886 WARNING [train.py:1204] (2/4) Exclude cut with ID _64048_210_10626_1_1535198382802_3835179_352-179834-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 18:33:21,032 WARNING [train.py:1204] (2/4) Exclude cut with ID _55162_210_17071_1_1534408248583_3753480_43-242364-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 18:33:21,047 WARNING [train.py:1204] (2/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_661-264857-0 from training. Duration: 0.93 2023-10-04 18:33:21,056 WARNING [train.py:1204] (2/4) Exclude cut with ID _6274_210_2084_1_1527937583837_7494020_325-237800-0_sp1.1 from training. Duration: 0.97275 2023-10-04 18:33:21,112 WARNING [train.py:1204] (2/4) Exclude cut with ID _55328_210_27767_1_1534580973260_4088170_385-171459-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 18:33:22,418 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_3780_210_2087_1_1526467389_2145572_176-107140-0_sp1.1 from training. Duration: 0.62725 2023-10-04 18:33:23,747 WARNING [train.py:1204] (2/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_375-244976-0_sp1.1 from training. Duration: 0.6818125 2023-10-04 18:33:23,855 WARNING [train.py:1204] (2/4) Exclude cut with ID _35941_210_15379_1_1532995294515_4023089_309-33162-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 18:33:30,370 WARNING [train.py:1204] (2/4) Exclude cut with ID _14459_210_2228_1_1531443641946_7516700_1185-22038-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 18:33:31,748 INFO [train.py:1046] (2/4) Epoch 50, batch 2600, loss[loss=0.1822, simple_loss=0.2532, pruned_loss=0.05558, over 19317.00 frames. ], tot_loss[loss=0.1521, simple_loss=0.2329, pruned_loss=0.03562, over 4709222.98 frames. ], batch size: 389, lr: 2.03e-03, grad_scale: 8.0 2023-10-04 18:33:33,145 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_1197-69460-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 18:33:34,556 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9012W0022-501157 from training. Duration: 0.896 2023-10-04 18:33:36,738 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.1.encoder.layers.0.feed_forward1.out_whiten, num_groups=1, num_channels=256, metric=5.06 vs. limit=15.0 2023-10-04 18:33:37,347 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9023W0009-506302 from training. Duration: 0.896 2023-10-04 18:33:37,367 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_2821_210_2715_1_1526108385_3763560_73-296673-0_sp1.1 from training. Duration: 0.7545625 2023-10-04 18:33:37,397 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9025W0113-507442 from training. Duration: 0.896 2023-10-04 18:33:38,764 WARNING [train.py:1204] (2/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_739-2098-0 from training. Duration: 0.92 2023-10-04 18:33:38,777 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9029W0332-509722 from training. Duration: 0.768 2023-10-04 18:33:43,223 WARNING [train.py:1204] (2/4) Exclude cut with ID _16908_210_5724_1_1531119157248_3772700_68-101953-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 18:33:43,241 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9042W0011-515617 from training. Duration: 0.768 2023-10-04 18:33:43,506 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.conv_module2.balancer2.prob, batch_count=1752626.6666666667, ans=0.125 2023-10-04 18:33:45,192 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_3019_210_2028_1_1526044245_3264865_191-72235-0 from training. Duration: 0.97 2023-10-04 18:33:45,282 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9052W0006-520792 from training. Duration: 0.896 2023-10-04 18:33:46,863 WARNING [train.py:1204] (2/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_245-174553-0_sp1.1 from training. Duration: 0.8 2023-10-04 18:33:47,099 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.conv_module1.balancer1.prob, batch_count=1752693.3333333333, ans=0.125 2023-10-04 18:33:48,384 WARNING [train.py:1204] (2/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_202-67017-0 from training. Duration: 0.82 2023-10-04 18:33:51,031 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.596e+02 2.019e+02 2.212e+02 2.492e+02 4.115e+02, threshold=4.424e+02, percent-clipped=0.0 2023-10-04 18:33:51,129 WARNING [train.py:1204] (2/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_304-59227-0 from training. Duration: 0.77 2023-10-04 18:33:52,473 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_3133_210_2087_1_1526106446_2324690_246-125000-0_sp0.9 from training. Duration: 0.688875 2023-10-04 18:33:52,519 WARNING [train.py:1204] (2/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_243-42116-0 from training. Duration: 0.73 2023-10-04 18:33:55,230 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9087W0121-538492 from training. Duration: 0.896 2023-10-04 18:33:55,245 WARNING [train.py:1204] (2/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_120-76843-0 from training. Duration: 0.55 2023-10-04 18:34:02,320 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.0.layers.1.whiten, num_groups=1, num_channels=192, metric=3.95 vs. limit=12.0 2023-10-04 18:34:04,169 WARNING [train.py:1204] (2/4) Exclude cut with ID _7852_210_5219_1_1528286471316_8139400_850-304621-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 18:34:04,193 WARNING [train.py:1204] (2/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_154-285415-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 18:34:04,204 WARNING [train.py:1204] (2/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_538-278194-0_sp1.1 from training. Duration: 0.92725 2023-10-04 18:34:04,204 WARNING [train.py:1204] (2/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_119-207162-0 from training. Duration: 0.71 2023-10-04 18:34:04,339 WARNING [train.py:1204] (2/4) Exclude cut with ID _2334_210_2667_1_1525608238880_4705129_459-104997-0_sp1.1 from training. Duration: 0.863625 2023-10-04 18:34:08,593 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.conv_module2.balancer2.prob, batch_count=1752760.0, ans=0.125 2023-10-04 18:34:11,162 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9147W0019-567847 from training. Duration: 0.768 2023-10-04 18:34:14,737 WARNING [train.py:1204] (2/4) Exclude cut with ID _69884_210_9771_1_1535846392818_4166169_228-189439-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 18:34:14,769 WARNING [train.py:1204] (2/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_196-77172-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 18:34:16,769 WARNING [train.py:1204] (2/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_625-338546-0 from training. Duration: 0.88 2023-10-04 18:34:16,811 WARNING [train.py:1204] (2/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_193-327630-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 18:34:16,812 WARNING [train.py:1204] (2/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_391-24006-0_sp1.1 from training. Duration: 0.92725 2023-10-04 18:34:18,153 WARNING [train.py:1204] (2/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_197-28164-0 from training. Duration: 0.8 2023-10-04 18:34:20,935 WARNING [train.py:1204] (2/4) Exclude cut with ID _8395_210_1531_1_1528597871523_4411579_431-132470-0_sp1.1 from training. Duration: 0.67275 2023-10-04 18:34:20,952 WARNING [train.py:1204] (2/4) Exclude cut with ID _35350_210_16040_1_1533040191183_4075299_465-248326-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 18:34:21,194 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.nonlin_attention.balancer.prob, batch_count=1752826.6666666667, ans=0.125 2023-10-04 18:34:23,602 WARNING [train.py:1204] (2/4) Exclude cut with ID _16756_210_4915_1_1531047803763_7104259_101-141922-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 18:34:27,659 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9212W0005-600172 from training. Duration: 0.896 2023-10-04 18:34:27,679 WARNING [train.py:1204] (2/4) Exclude cut with ID _63186_210_2084_1_1535414479705_3908649_378-166820-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 18:34:27,692 WARNING [train.py:1204] (2/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_52-211683-0_sp1.1 from training. Duration: 0.8181875 2023-10-04 18:34:33,813 WARNING [train.py:1204] (2/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_1147-237729-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 18:34:33,893 WARNING [train.py:1204] (2/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_234-14174-0_sp0.9 from training. Duration: 0.911125 2023-10-04 18:34:33,907 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_454-276429-0 from training. Duration: 0.87 2023-10-04 18:34:35,201 WARNING [train.py:1204] (2/4) Exclude cut with ID _53713_210_24116_1_1534314487762_7416751_755-165313-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 18:34:36,649 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8278_210_2550_1_1528637099_4220413_436-317072-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 18:34:38,022 WARNING [train.py:1204] (2/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_28-324729-0_sp1.1 from training. Duration: 0.8545625 2023-10-04 18:34:39,654 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.self_attn_weights.pos_emb_skip_rate, batch_count=1752893.3333333333, ans=0.0 2023-10-04 18:34:43,853 WARNING [train.py:1204] (2/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_326-165900-0 from training. Duration: 0.69 2023-10-04 18:34:45,716 INFO [train.py:1046] (2/4) Epoch 50, batch 2650, loss[loss=0.1544, simple_loss=0.2334, pruned_loss=0.0377, over 23868.00 frames. ], tot_loss[loss=0.1526, simple_loss=0.2335, pruned_loss=0.03587, over 4723159.76 frames. ], batch size: 195, lr: 2.03e-03, grad_scale: 8.0 2023-10-04 18:34:45,755 WARNING [train.py:1204] (2/4) Exclude cut with ID _53489_210_6228_1_1534298357169_3835970_96-30246-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 18:34:48,558 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8275_210_4198_1_1528455320_3892571_491-17707-0_sp1.1 from training. Duration: 0.5818125 2023-10-04 18:34:51,459 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.self_attn_weights.pos_emb_skip_rate, batch_count=1752960.0, ans=0.0 2023-10-04 18:34:52,662 WARNING [train.py:1204] (2/4) Exclude cut with ID _14348_210_8341_1_1530599434996_3778640_240-232370-0 from training. Duration: 0.84 2023-10-04 18:34:52,672 WARNING [train.py:1204] (2/4) Exclude cut with ID _45012_210_12651_1_1533608435136_5371380_54-112623-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 18:34:54,017 WARNING [train.py:1204] (2/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_328-264776-0_sp1.1 from training. Duration: 0.6090625 2023-10-04 18:34:54,086 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9325W0001-648427 from training. Duration: 0.768 2023-10-04 18:34:55,337 WARNING [train.py:1204] (2/4) Exclude cut with ID _56480_210_4220_1_1534679942079_3075409_49-74717-0_sp1.1 from training. Duration: 0.936375 2023-10-04 18:34:56,700 WARNING [train.py:1204] (2/4) Exclude cut with ID _59500_210_8254_1_1534809610680_3736009_6-241869-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 18:34:58,417 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=1753026.6666666667, ans=0.1 2023-10-04 18:34:59,582 WARNING [train.py:1204] (2/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_172-215932-0_sp0.9 from training. Duration: 0.7333125 2023-10-04 18:35:00,952 WARNING [train.py:1204] (2/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_17-90541-0_sp1.1 from training. Duration: 0.8545625 2023-10-04 18:35:02,341 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_43831_210_2158_1_1533725502_5872791_525-89447-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 18:35:04,301 WARNING [train.py:1204] (2/4) Exclude cut with ID _12302_210_5219_1_1529924446466_8107280_907-182949-0 from training. Duration: 0.92 2023-10-04 18:35:04,323 WARNING [train.py:1204] (2/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_402-102311-0_sp0.9 from training. Duration: 0.7444375 2023-10-04 18:35:04,354 WARNING [train.py:1204] (2/4) Exclude cut with ID _8021_210_7994_1_1528376451358_3653069_179-45206-0_sp0.9 from training. Duration: 0.9555625 2023-10-04 18:35:07,131 WARNING [train.py:1204] (2/4) Exclude cut with ID _40137_210_12210_1_1533290484046_3795180_249-187059-0 from training. Duration: 0.88 2023-10-04 18:35:08,480 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9653W0194-675472 from training. Duration: 0.9658125 2023-10-04 18:35:09,898 WARNING [train.py:1204] (2/4) Exclude cut with ID _51332_210_9988_1_1534244371836_3801270_336-255604-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 18:35:12,707 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_20120_210_6753_1_1531643035_5482859_510-96996-0 from training. Duration: 0.87 2023-10-04 18:35:12,721 WARNING [train.py:1204] (2/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_1167-20437-0_sp1.1 from training. Duration: 0.92725 2023-10-04 18:35:12,772 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_3475_210_2028_1_1526193578_3622569_265-276625-0 from training. Duration: 0.76 2023-10-04 18:35:17,999 WARNING [train.py:1204] (2/4) Exclude cut with ID _14860_210_4915_1_1530702020340_7343240_470-172418-0_sp1.1 from training. Duration: 0.97275 2023-10-04 18:35:18,007 WARNING [train.py:1204] (2/4) Exclude cut with ID _5444_210_2151_1_1527418808464_5312210_581-315411-0_sp1.1 from training. Duration: 0.563625 2023-10-04 18:35:18,030 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_34450_210_9547_1_1533099443_4109050_519-342439-0_sp1.1 from training. Duration: 0.97275 2023-10-04 18:35:18,062 WARNING [train.py:1204] (2/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_50-175961-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 18:35:22,200 WARNING [train.py:1204] (2/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_372-6869-0 from training. Duration: 0.44 2023-10-04 18:35:22,216 WARNING [train.py:1204] (2/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_3-237434-0 from training. Duration: 0.76 2023-10-04 18:35:22,485 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.conv_module1.balancer2.prob, batch_count=1753093.3333333333, ans=0.125 2023-10-04 18:35:23,650 WARNING [train.py:1204] (2/4) Exclude cut with ID _8772_210_1850_1_1528635595972_3664011_247-336448-0_sp1.1 from training. Duration: 0.67275 2023-10-04 18:35:29,107 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_427-78097-0 from training. Duration: 0.8 2023-10-04 18:35:29,131 WARNING [train.py:1204] (2/4) Exclude cut with ID _69884_210_9771_1_1535846392818_4166169_466-252727-0_sp1.1 from training. Duration: 0.97275 2023-10-04 18:35:31,029 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_15972_210_6400_1_1530877934_3405692_316-35478-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 18:35:31,064 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_809-17156-0_sp0.9 from training. Duration: 0.811125 2023-10-04 18:35:31,096 WARNING [train.py:1204] (2/4) Exclude cut with ID _57572_210_5517_1_1534921219624_3809484_594-263474-0_sp1.1 from training. Duration: 0.936375 2023-10-04 18:35:31,134 WARNING [train.py:1204] (2/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_190-228204-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 18:35:32,488 WARNING [train.py:1204] (2/4) Exclude cut with ID _19514_210_9259_1_1531530071330_5084230_117-21193-0_sp1.1 from training. Duration: 0.936375 2023-10-04 18:35:34,336 WARNING [train.py:1204] (2/4) Exclude cut with ID _69584_210_5219_1_1535785133775_4044450_190-21315-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 18:35:35,704 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_53071_210_16554_1_1534493434_1276796_184-302050-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 18:35:35,758 WARNING [train.py:1204] (2/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_120-76843-0_sp1.1 from training. Duration: 0.5 2023-10-04 18:35:37,180 WARNING [train.py:1204] (2/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_318-162017-0_sp1.1 from training. Duration: 0.82725 2023-10-04 18:35:39,803 WARNING [train.py:1204] (2/4) Exclude cut with ID _46067_210_4220_1_1534125571066_3307450_79-46864-0_sp1.1 from training. Duration: 0.92725 2023-10-04 18:35:39,854 WARNING [train.py:1204] (2/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_358-215558-0_sp1.1 from training. Duration: 0.8454375 2023-10-04 18:35:39,934 WARNING [train.py:1204] (2/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_621-66764-0_sp1.1 from training. Duration: 0.92725 2023-10-04 18:35:43,117 WARNING [train.py:1204] (2/4) Exclude cut with ID _42863_210_9070_1_1533902398788_3696920_338-156851-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 18:35:43,150 WARNING [train.py:1204] (2/4) Exclude cut with ID _5289_210_2084_1_1527678202736_3779299_392-261988-0_sp0.9 from training. Duration: 0.72225 2023-10-04 18:35:47,322 WARNING [train.py:1204] (2/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_409-84682-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 18:35:47,394 WARNING [train.py:1204] (2/4) Exclude cut with ID _9235_210_5219_1_1528891154253_8046120_30-326681-0_sp0.9 from training. Duration: 0.92225 2023-10-04 18:35:48,612 WARNING [train.py:1204] (2/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_389-184695-0_sp1.1 from training. Duration: 0.92725 2023-10-04 18:35:48,644 WARNING [train.py:1204] (2/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_442-320317-0 from training. Duration: 0.82 2023-10-04 18:35:52,018 WARNING [train.py:1204] (2/4) Exclude cut with ID _44778_210_2054_1_1533798127203_7443090_756-29911-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 18:35:53,517 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_401-112036-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 18:35:53,603 WARNING [train.py:1204] (2/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_181-308827-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 18:35:54,891 WARNING [train.py:1204] (2/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_332-258725-0_sp1.1 from training. Duration: 0.963625 2023-10-04 18:35:56,154 WARNING [train.py:1204] (2/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_188-260011-0_sp0.9 from training. Duration: 0.811125 2023-10-04 18:35:56,195 WARNING [train.py:1204] (2/4) Exclude cut with ID _49039_210_6315_1_1533967537541_3846118_99-302239-0_sp1.1 from training. Duration: 0.963625 2023-10-04 18:35:56,393 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.balancer1.prob, batch_count=1753226.6666666667, ans=0.125 2023-10-04 18:35:58,861 INFO [train.py:1046] (2/4) Epoch 50, batch 2700, loss[loss=0.1473, simple_loss=0.2259, pruned_loss=0.03435, over 19874.00 frames. ], tot_loss[loss=0.1536, simple_loss=0.2343, pruned_loss=0.03642, over 4712356.41 frames. ], batch size: 43, lr: 2.03e-03, grad_scale: 8.0 2023-10-04 18:35:58,912 WARNING [train.py:1204] (2/4) Exclude cut with ID _4122_210_2084_1_1526713164417_4353280_312-294828-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 18:35:58,926 WARNING [train.py:1204] (2/4) Exclude cut with ID _14922_210_6228_1_1530707435273_3693310_303-228738-0 from training. Duration: 0.84 2023-10-04 18:36:02,265 WARNING [train.py:1204] (2/4) Exclude cut with ID _14349_210_8341_1_1530615710818_3442470_115-285975-0_sp0.9 from training. Duration: 0.9555625 2023-10-04 18:36:03,763 WARNING [train.py:1204] (2/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_125-144174-0_sp1.1 from training. Duration: 0.3545625 2023-10-04 18:36:05,750 WARNING [train.py:1204] (2/4) Exclude cut with ID _53565_210_10635_1_1534337218335_3820259_351-54570-0_sp1.1 from training. Duration: 0.97275 2023-10-04 18:36:06,945 WARNING [train.py:1204] (2/4) Exclude cut with ID _28317_210_12742_1_1532480302381_8510325_410-40065-0_sp1.1 from training. Duration: 0.963625 2023-10-04 18:36:06,966 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_38965_210_4852_1_1533174906_3484735_167-339442-0_sp1.1 from training. Duration: 0.963625 2023-10-04 18:36:08,621 WARNING [train.py:1204] (2/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_265-164486-0_sp1.1 from training. Duration: 0.87275 2023-10-04 18:36:08,638 WARNING [train.py:1204] (2/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_725-182484-0_sp1.1 from training. Duration: 0.92725 2023-10-04 18:36:08,673 WARNING [train.py:1204] (2/4) Exclude cut with ID _28317_210_12742_1_1532480302381_8510325_607-213951-0_sp0.9 from training. Duration: 0.9333125 2023-10-04 18:36:08,862 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.conv_module1.balancer2.prob, batch_count=1753293.3333333333, ans=0.125 2023-10-04 18:36:09,967 WARNING [train.py:1204] (2/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_605-133395-0_sp1.1 from training. Duration: 0.6 2023-10-04 18:36:09,989 WARNING [train.py:1204] (2/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_351-294650-0 from training. Duration: 0.63 2023-10-04 18:36:11,371 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_14953_210_2151_1_1530701710_5616165_710-165400-0_sp1.1 from training. Duration: 0.7909375 2023-10-04 18:36:14,615 WARNING [train.py:1204] (2/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_237-58433-0_sp1.1 from training. Duration: 0.67275 2023-10-04 18:36:14,700 WARNING [train.py:1204] (2/4) Exclude cut with ID _5190_210_4557_1_1527244368799_373439_28-197030-0_sp1.1 from training. Duration: 0.6545625 2023-10-04 18:36:14,754 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_743-327651-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 18:36:18,703 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.730e+02 2.205e+02 2.516e+02 3.164e+02 5.488e+02, threshold=5.032e+02, percent-clipped=3.0 2023-10-04 18:36:18,792 WARNING [train.py:1204] (2/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_76-203867-0_sp0.9 from training. Duration: 0.8 2023-10-04 18:36:20,284 WARNING [train.py:1204] (2/4) Exclude cut with ID _14603_210_4220_1_1530615688441_3097319_274-274798-0 from training. Duration: 0.98 2023-10-04 18:36:20,323 WARNING [train.py:1204] (2/4) Exclude cut with ID _14087_210_2253_1_1530529224512_3014029_226-242140-0_sp1.1 from training. Duration: 0.736375 2023-10-04 18:36:20,500 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=1753360.0, ans=0.1 2023-10-04 18:36:26,443 WARNING [train.py:1204] (2/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_352-59958-0_sp1.1 from training. Duration: 0.7818125 2023-10-04 18:36:26,447 WARNING [train.py:1204] (2/4) Exclude cut with ID _39411_210_5986_1_1533538825030_3343649_191-251819-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 18:36:26,670 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.bypass_mid.scale_min, batch_count=1753360.0, ans=0.2 2023-10-04 18:36:30,702 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7046_210_6753_1_1527895337_6920917_642-106799-0_sp1.1 from training. Duration: 0.836375 2023-10-04 18:36:30,709 WARNING [train.py:1204] (2/4) Exclude cut with ID _22826_210_9994_1_1531908201522_3279169_412-3442-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 18:36:30,731 WARNING [train.py:1204] (2/4) Exclude cut with ID _60057_210_10640_1_1534903065793_3333150_171-181133-0_sp0.9 from training. Duration: 0.97775 2023-10-04 18:36:30,746 WARNING [train.py:1204] (2/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_154-135696-0_sp1.1 from training. Duration: 0.72725 2023-10-04 18:36:33,677 WARNING [train.py:1204] (2/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_148-186805-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 18:36:36,888 WARNING [train.py:1204] (2/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_605-165641-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 18:36:36,894 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_174-277364-0_sp0.9 from training. Duration: 0.788875 2023-10-04 18:36:36,914 WARNING [train.py:1204] (2/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_89-321955-0_sp0.9 from training. Duration: 0.888875 2023-10-04 18:36:41,316 WARNING [train.py:1204] (2/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_438-105429-0_sp1.1 from training. Duration: 0.963625 2023-10-04 18:36:41,319 WARNING [train.py:1204] (2/4) Exclude cut with ID _14970_210_15709_1_1530773543493_1715730_187-20259-0_sp0.9 from training. Duration: 0.988875 2023-10-04 18:36:41,484 INFO [scaling.py:1118] (2/4) WithLoss: name=encoder.encoders.0.layers.1.self_attn_weights, loss-sum=0.000e+00 2023-10-04 18:36:41,577 INFO [scaling.py:1118] (2/4) WithLoss: name=encoder.encoders.3.encoder.layers.2.self_attn_weights, loss-sum=0.000e+00 2023-10-04 18:36:48,860 WARNING [train.py:1204] (2/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_385-193052-0_sp0.9 from training. Duration: 0.9555625 2023-10-04 18:36:50,203 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7113_210_1849_1_1527942685_3448768_194-311626-0_sp1.1 from training. Duration: 0.92725 2023-10-04 18:36:53,595 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_3780_210_2087_1_1526467389_2145572_176-107140-0_sp0.9 from training. Duration: 0.7666875 2023-10-04 18:36:53,598 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_36756_210_7234_1_1533026991_966276_123-213457-0_sp1.1 from training. Duration: 0.936375 2023-10-04 18:36:57,631 WARNING [train.py:1204] (2/4) Exclude cut with ID _23557_210_8750_1_1531890106370_6846255_456-244861-0_sp1.1 from training. Duration: 0.963625 2023-10-04 18:36:59,037 WARNING [train.py:1204] (2/4) Exclude cut with ID _37086_210_9038_1_1532999540891_7021250_242-349170-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 18:37:00,370 WARNING [train.py:1204] (2/4) Exclude cut with ID _41298_210_9019_1_1533430371450_4822143_471-281351-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 18:37:00,472 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_53378_210_17282_1_1534221071_5769509_572-126176-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 18:37:01,860 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_1507-263032-0_sp1.1 from training. Duration: 0.963625 2023-10-04 18:37:03,152 WARNING [train.py:1204] (2/4) Exclude cut with ID _13679_210_7497_1_1530534293662_3355098_411-186088-0_sp1.1 from training. Duration: 0.8545625 2023-10-04 18:37:05,223 WARNING [train.py:1204] (2/4) Exclude cut with ID _29345_210_4381_1_1532689168263_3860169_149-257268-0_sp1.1 from training. Duration: 0.736375 2023-10-04 18:37:07,889 WARNING [train.py:1204] (2/4) Exclude cut with ID _47790_210_16187_1_1533862814066_4207519_381-252014-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 18:37:07,902 WARNING [train.py:1204] (2/4) Exclude cut with ID _35799_210_5854_1_1533263487887_3529071_512-2717-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 18:37:11,160 WARNING [train.py:1204] (2/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_505-205270-0_sp0.9 from training. Duration: 0.42225 2023-10-04 18:37:11,742 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.4.encoder.layers.2.nonlin_attention.whiten2, num_groups=1, num_channels=384, metric=4.20 vs. limit=15.0 2023-10-04 18:37:12,568 WARNING [train.py:1204] (2/4) Exclude cut with ID _14998_210_5219_1_1530705582080_7474970_731-20117-0_sp1.1 from training. Duration: 0.936375 2023-10-04 18:37:13,957 INFO [train.py:1046] (2/4) Epoch 50, batch 2750, loss[loss=0.1451, simple_loss=0.2297, pruned_loss=0.03023, over 24672.00 frames. ], tot_loss[loss=0.1539, simple_loss=0.2348, pruned_loss=0.03652, over 4710418.90 frames. ], batch size: 65, lr: 2.03e-03, grad_scale: 8.0 2023-10-04 18:37:14,695 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_37351_210_7925_1_1533012931_3812113_265-85780-0_sp1.1 from training. Duration: 0.8 2023-10-04 18:37:14,702 WARNING [train.py:1204] (2/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_313-134930-0 from training. Duration: 0.97 2023-10-04 18:37:15,974 WARNING [train.py:1204] (2/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_374-138971-0 from training. Duration: 0.94 2023-10-04 18:37:16,014 WARNING [train.py:1204] (2/4) Exclude cut with ID _30304_210_6319_1_1532476827261_7582060_1227-287886-0_sp1.1 from training. Duration: 0.936375 2023-10-04 18:37:20,060 WARNING [train.py:1204] (2/4) Exclude cut with ID _59493_210_17282_1_1535078419077_3225879_332-217544-0_sp1.1 from training. Duration: 0.97275 2023-10-04 18:37:20,111 WARNING [train.py:1204] (2/4) Exclude cut with ID _23514_210_1389_1_1531832139785_3031439_361-119640-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 18:37:22,757 WARNING [train.py:1204] (2/4) Exclude cut with ID _69584_210_5219_1_1535785133775_4044450_142-118058-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 18:37:22,788 WARNING [train.py:1204] (2/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_178-177582-0_sp1.1 from training. Duration: 0.67275 2023-10-04 18:37:22,863 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder_embed.conv.2.prob, batch_count=1753626.6666666667, ans=0.125 2023-10-04 18:37:23,944 WARNING [train.py:1204] (2/4) Exclude cut with ID _25303_210_17519_1_1532050207576_3635970_41-195572-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 18:37:26,059 WARNING [train.py:1204] (2/4) Exclude cut with ID _55328_210_27767_1_1534580973260_4088170_329-341322-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 18:37:26,094 WARNING [train.py:1204] (2/4) Exclude cut with ID _5608_210_2106_1_1527415439089_6819768_909-82767-0_sp1.1 from training. Duration: 0.5545625 2023-10-04 18:37:27,412 WARNING [train.py:1204] (2/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_261-297503-0_sp1.1 from training. Duration: 0.9 2023-10-04 18:37:27,418 WARNING [train.py:1204] (2/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_459-291706-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 18:37:27,419 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7762_210_1794_1_1528199508_4057408_422-227034-0 from training. Duration: 0.73 2023-10-04 18:37:27,428 WARNING [train.py:1204] (2/4) Exclude cut with ID _3067_210_2667_1_1526212974640_4503168_250-285937-0_sp0.9 from training. Duration: 0.888875 2023-10-04 18:37:27,442 WARNING [train.py:1204] (2/4) Exclude cut with ID _66873_210_5517_1_1535454136506_4293370_332-253745-0_sp1.1 from training. Duration: 0.936375 2023-10-04 18:37:31,677 WARNING [train.py:1204] (2/4) Exclude cut with ID _13938_210_9988_1_1530518718978_3693960_304-112784-0 from training. Duration: 0.46 2023-10-04 18:37:31,926 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.feed_forward1.out_proj.dropout_p, batch_count=1753693.3333333333, ans=0.1 2023-10-04 18:37:33,110 WARNING [train.py:1204] (2/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_226-267957-0_sp1.1 from training. Duration: 0.92725 2023-10-04 18:37:34,892 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_41822_210_13270_1_1533955976_4379284_608-254567-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 18:37:34,956 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_124-113562-0_sp1.1 from training. Duration: 0.8545625 2023-10-04 18:37:36,349 WARNING [train.py:1204] (2/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_117-290916-0_sp1.1 from training. Duration: 0.463625 2023-10-04 18:37:36,423 WARNING [train.py:1204] (2/4) Exclude cut with ID _28427_210_4220_1_1532581240967_3061069_102-66533-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 18:37:38,281 WARNING [train.py:1204] (2/4) Exclude cut with ID _55383_210_10635_1_1534468426172_2231669_134-128246-0_sp1.1 from training. Duration: 0.8090625 2023-10-04 18:37:38,321 WARNING [train.py:1204] (2/4) Exclude cut with ID _45871_210_5665_1_1533864614245_3296060_49-308316-0_sp1.1 from training. Duration: 0.97275 2023-10-04 18:37:39,515 WARNING [train.py:1204] (2/4) Exclude cut with ID _57572_210_5517_1_1534921219624_3809484_176-199205-0_sp1.1 from training. Duration: 0.97275 2023-10-04 18:37:39,771 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.self_attn_weights.pos_emb_skip_rate, batch_count=1753693.3333333333, ans=0.0 2023-10-04 18:37:43,030 WARNING [train.py:1204] (2/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_45-265684-0_sp1.1 from training. Duration: 0.6545625 2023-10-04 18:37:43,050 WARNING [train.py:1204] (2/4) Exclude cut with ID _15150_210_8341_1_1530752411803_7256384_469-239509-0_sp0.9 from training. Duration: 0.7555625 2023-10-04 18:37:44,132 WARNING [train.py:1204] (2/4) Exclude cut with ID _28461_210_2253_1_1532771775189_3041299_136-270644-0_sp1.1 from training. Duration: 0.8454375 2023-10-04 18:37:44,214 WARNING [train.py:1204] (2/4) Exclude cut with ID _69584_210_5219_1_1535785133775_4044450_302-47302-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 18:37:45,514 WARNING [train.py:1204] (2/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_166-128941-0_sp1.1 from training. Duration: 0.4909375 2023-10-04 18:37:51,056 WARNING [train.py:1204] (2/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_559-120504-0_sp1.1 from training. Duration: 0.97275 2023-10-04 18:37:52,462 WARNING [train.py:1204] (2/4) Exclude cut with ID _6456_210_1534_1_1527681634212_3181049_345-252877-0_sp1.1 from training. Duration: 0.5818125 2023-10-04 18:37:54,181 WARNING [train.py:1204] (2/4) Exclude cut with ID _28317_210_12742_1_1532480302381_8510325_902-233003-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 18:37:55,826 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.0.feed_forward2.hidden_balancer.prob, batch_count=1753760.0, ans=0.125 2023-10-04 18:37:58,272 WARNING [train.py:1204] (2/4) Exclude cut with ID _36538_210_9994_1_1533204020363_3795620_204-261385-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 18:37:58,276 WARNING [train.py:1204] (2/4) Exclude cut with ID _14821_210_8750_1_1530680503991_6829359_189-297451-0_sp1.1 from training. Duration: 0.5 2023-10-04 18:37:58,309 WARNING [train.py:1204] (2/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_1211-226066-0_sp0.9 from training. Duration: 0.8444375 2023-10-04 18:38:02,783 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6786_210_1388_1_1527839699_4194495_273-149841-0_sp1.1 from training. Duration: 0.836375 2023-10-04 18:38:04,113 WARNING [train.py:1204] (2/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_380-162648-0_sp1.1 from training. Duration: 0.8545625 2023-10-04 18:38:04,113 WARNING [train.py:1204] (2/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_494-199538-0 from training. Duration: 0.75 2023-10-04 18:38:04,398 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.feed_forward1.hidden_balancer.prob, batch_count=1753826.6666666667, ans=0.125 2023-10-04 18:38:08,794 WARNING [train.py:1204] (2/4) Exclude cut with ID _49097_210_16049_1_1533952780207_4360979_276-87053-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 18:38:10,206 WARNING [train.py:1204] (2/4) Exclude cut with ID _5444_210_2151_1_1527418808464_5312210_726-27165-0 from training. Duration: 0.67 2023-10-04 18:38:16,233 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_128-202104-0_sp1.1 from training. Duration: 0.57275 2023-10-04 18:38:18,852 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_11349_210_6753_1_1529800861_1101207_26-65602-0_sp1.1 from training. Duration: 0.87275 2023-10-04 18:38:18,882 WARNING [train.py:1204] (2/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_159-328157-0 from training. Duration: 0.52 2023-10-04 18:38:18,965 WARNING [train.py:1204] (2/4) Exclude cut with ID _14069_210_5632_1_1530512692033_4803390_98-21934-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 18:38:21,775 WARNING [train.py:1204] (2/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_352-59958-0_sp0.9 from training. Duration: 0.9555625 2023-10-04 18:38:23,129 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6163_210_6400_1_1527506746_4263068_283-137811-0 from training. Duration: 0.77 2023-10-04 18:38:23,165 WARNING [train.py:1204] (2/4) Exclude cut with ID _21989_210_5854_1_1531899214931_3271360_133-44759-0_sp1.1 from training. Duration: 0.7 2023-10-04 18:38:26,403 WARNING [train.py:1204] (2/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_21-174514-0_sp0.9 from training. Duration: 0.47775 2023-10-04 18:38:26,435 WARNING [train.py:1204] (2/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_201-78811-0_sp1.1 from training. Duration: 0.963625 2023-10-04 18:38:26,467 WARNING [train.py:1204] (2/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_537-164630-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 18:38:27,750 INFO [train.py:1046] (2/4) Epoch 50, batch 2800, loss[loss=0.1542, simple_loss=0.2449, pruned_loss=0.0318, over 24645.00 frames. ], tot_loss[loss=0.1523, simple_loss=0.2325, pruned_loss=0.03603, over 4712347.69 frames. ], batch size: 73, lr: 2.03e-03, grad_scale: 16.0 2023-10-04 18:38:27,859 WARNING [train.py:1204] (2/4) Exclude cut with ID _14087_210_2253_1_1530529224512_3014029_156-257607-0 from training. Duration: 0.83 2023-10-04 18:38:28,079 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.bypass.scale_min, batch_count=1753960.0, ans=0.2 2023-10-04 18:38:29,170 WARNING [train.py:1204] (2/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_308-154450-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 18:38:29,217 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_45399_210_4852_1_1533624725_7579737_214-240574-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 18:38:30,590 WARNING [train.py:1204] (2/4) Exclude cut with ID _29303_210_2575_1_1532392237303_7410649_615-305517-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 18:38:30,626 WARNING [train.py:1204] (2/4) Exclude cut with ID IC0125W0002-61808 from training. Duration: 0.708 2023-10-04 18:38:30,627 WARNING [train.py:1204] (2/4) Exclude cut with ID IC0125W0017-61823 from training. Duration: 0.759 2023-10-04 18:38:33,392 WARNING [train.py:1204] (2/4) Exclude cut with ID _53219_210_4525_1_1534834836008_4064825_4-145291-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 18:38:35,399 WARNING [train.py:1204] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_185-174805-0_sp1.1 from training. Duration: 0.6181875 2023-10-04 18:38:35,419 WARNING [train.py:1204] (2/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_635-193046-0_sp1.1 from training. Duration: 0.92725 2023-10-04 18:38:39,577 WARNING [train.py:1204] (2/4) Exclude cut with ID _46067_210_4220_1_1534125571066_3307450_321-258866-0_sp1.1 from training. Duration: 0.97275 2023-10-04 18:38:40,111 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.4.encoder.layers.0.self_attn1.whiten, num_groups=1, num_channels=384, metric=14.42 vs. limit=22.5 2023-10-04 18:38:41,729 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.bypass.skip_rate, batch_count=1754026.6666666667, ans=0.09899494936611666 2023-10-04 18:38:42,806 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8558_210_1410_1_1528505225_7613406_131-329524-0 from training. Duration: 0.79 2023-10-04 18:38:44,346 WARNING [train.py:1204] (2/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_847-41334-0_sp0.9 from training. Duration: 0.488875 2023-10-04 18:38:45,840 WARNING [train.py:1204] (2/4) Exclude cut with ID _14227_210_4381_1_1530701804286_3940259_125-275642-0 from training. Duration: 0.99 2023-10-04 18:38:47,175 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.767e+02 2.165e+02 2.479e+02 3.124e+02 4.666e+02, threshold=4.957e+02, percent-clipped=0.0 2023-10-04 18:38:47,249 WARNING [train.py:1204] (2/4) Exclude cut with ID _25248_210_8750_1_1532062825941_5554180_248-177255-0_sp1.1 from training. Duration: 0.963625 2023-10-04 18:38:47,299 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_40047_210_2290_1_1533290519_2374311_195-165693-0_sp1.1 from training. Duration: 0.8090625 2023-10-04 18:38:47,305 WARNING [train.py:1204] (2/4) Exclude cut with ID _23109_210_4381_1_1532150828777_901949_29-258672-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 18:38:50,354 WARNING [train.py:1204] (2/4) Exclude cut with ID _14821_210_8750_1_1530680503991_6829359_2-227151-0_sp1.1 from training. Duration: 0.7545625 2023-10-04 18:38:51,663 WARNING [train.py:1204] (2/4) Exclude cut with ID _49643_210_7989_1_1534640244224_3939031_565-53982-0_sp1.1 from training. Duration: 0.963625 2023-10-04 18:38:51,667 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7544_210_6753_1_1528591327_1471426_33-346031-0_sp0.9 from training. Duration: 0.67775 2023-10-04 18:38:51,898 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.conv_module2.balancer2.prob, batch_count=1754026.6666666667, ans=0.125 2023-10-04 18:38:53,068 WARNING [train.py:1204] (2/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_95-61782-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 18:39:00,677 WARNING [train.py:1204] (2/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_686-54383-0_sp0.9 from training. Duration: 0.9666875 2023-10-04 18:39:02,055 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_14056_210_4163_1_1530604483_7572827_505-276823-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 18:39:03,528 WARNING [train.py:1204] (2/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_745-128443-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 18:39:04,872 WARNING [train.py:1204] (2/4) Exclude cut with ID _3844_210_1390_1_1526727803582_2871209_20-251664-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 18:39:06,815 WARNING [train.py:1204] (2/4) Exclude cut with ID _36538_210_9994_1_1533204020363_3795620_487-164628-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 18:39:12,118 WARNING [train.py:1204] (2/4) Exclude cut with ID _5166_210_2084_1_1527073197160_4087993_309-222545-0_sp1.1 from training. Duration: 0.763625 2023-10-04 18:39:12,138 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_3531_210_3528_1_1526294203_5216456_309-227302-0 from training. Duration: 0.69 2023-10-04 18:39:12,196 WARNING [train.py:1204] (2/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_42-192460-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 18:39:12,859 WARNING [train.py:1204] (2/4) Exclude cut with ID _22083_210_8254_1_1531702862439_3678970_49-234546-0_sp1.1 from training. Duration: 0.7545625 2023-10-04 18:39:12,862 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_427-78097-0_sp0.9 from training. Duration: 0.888875 2023-10-04 18:39:14,265 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=1754160.0, ans=0.1 2023-10-04 18:39:16,885 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_43802_210_2290_1_1533516203_8829504_378-205254-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 18:39:17,082 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.conv_skip_rate, batch_count=1754160.0, ans=0.0 2023-10-04 18:39:18,087 WARNING [train.py:1204] (2/4) Exclude cut with ID _45329_210_7005_1_1534157951211_6774339_672-314830-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 18:39:19,735 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.conv_module2.balancer1.max_abs, batch_count=1754160.0, ans=10.0 2023-10-04 18:39:20,945 WARNING [train.py:1204] (2/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_90-30673-0_sp1.1 from training. Duration: 0.763625 2023-10-04 18:39:23,575 WARNING [train.py:1204] (2/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_563-329943-0_sp1.1 from training. Duration: 0.9 2023-10-04 18:39:23,590 WARNING [train.py:1204] (2/4) Exclude cut with ID _21632_210_2253_1_1531994001765_3744140_154-84090-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 18:39:23,591 WARNING [train.py:1204] (2/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_103-336064-0_sp0.9 from training. Duration: 0.5666875 2023-10-04 18:39:23,635 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_43-71989-0_sp0.9 from training. Duration: 0.6333125 2023-10-04 18:39:23,859 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=1754160.0, ans=0.1 2023-10-04 18:39:25,131 WARNING [train.py:1204] (2/4) Exclude cut with ID _3867_210_1883_1_1526641200575_4475830_319-349850-0_sp0.9 from training. Duration: 0.7444375 2023-10-04 18:39:26,459 WARNING [train.py:1204] (2/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_629-179454-0_sp1.1 from training. Duration: 0.963625 2023-10-04 18:39:26,468 WARNING [train.py:1204] (2/4) Exclude cut with ID _65263_210_22356_1_1535763598691_3921037_49-222917-0 from training. Duration: 0.92 2023-10-04 18:39:26,497 WARNING [train.py:1204] (2/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_602-331772-0_sp1.1 from training. Duration: 0.936375 2023-10-04 18:39:28,230 WARNING [train.py:1204] (2/4) Exclude cut with ID _58414_210_8254_1_1535421609162_7151182_292-134978-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 18:39:28,259 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_38793_210_2544_1_1533176114_3849320_200-299292-0_sp1.1 from training. Duration: 0.936375 2023-10-04 18:39:29,714 WARNING [train.py:1204] (2/4) Exclude cut with ID _14970_210_15709_1_1530775547361_1966965_6-23328-0 from training. Duration: 0.86 2023-10-04 18:39:30,996 WARNING [train.py:1204] (2/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_386-225511-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 18:39:31,007 WARNING [train.py:1204] (2/4) Exclude cut with ID _38323_210_12108_1_1533121233369_3303049_416-11919-0_sp1.1 from training. Duration: 0.87275 2023-10-04 18:39:31,057 WARNING [train.py:1204] (2/4) Exclude cut with ID _4866_210_2577_1_1527404412284_4364659_256-306830-0_sp1.1 from training. Duration: 0.7909375 2023-10-04 18:39:32,553 WARNING [train.py:1204] (2/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_53-300544-0 from training. Duration: 0.67 2023-10-04 18:39:39,267 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.1.encoder.layers.0.feed_forward2.out_whiten, num_groups=1, num_channels=256, metric=3.94 vs. limit=15.0 2023-10-04 18:39:40,016 WARNING [train.py:1204] (2/4) Exclude cut with ID _41314_210_12651_1_1533459476543_3766460_80-74184-0_sp1.1 from training. Duration: 0.7545625 2023-10-04 18:39:40,034 WARNING [train.py:1204] (2/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_332-256168-0_sp1.1 from training. Duration: 0.6818125 2023-10-04 18:39:40,098 WARNING [train.py:1204] (2/4) Exclude cut with ID _48836_210_12210_1_1534075848293_3904150_429-35475-0_sp1.1 from training. Duration: 0.8090625 2023-10-04 18:39:41,402 INFO [train.py:1046] (2/4) Epoch 50, batch 2850, loss[loss=0.155, simple_loss=0.2487, pruned_loss=0.03063, over 24425.00 frames. ], tot_loss[loss=0.1513, simple_loss=0.2315, pruned_loss=0.0355, over 4695063.23 frames. ], batch size: 69, lr: 2.03e-03, grad_scale: 16.0 2023-10-04 18:39:42,847 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_144-135833-0_sp1.1 from training. Duration: 0.97275 2023-10-04 18:39:47,481 WARNING [train.py:1204] (2/4) Exclude cut with ID _14224_210_4381_1_1530529062037_4264010_380-343670-0_sp0.9 from training. Duration: 0.97775 2023-10-04 18:39:47,508 WARNING [train.py:1204] (2/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_213-265174-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 18:39:47,532 WARNING [train.py:1204] (2/4) Exclude cut with ID _20455_210_5219_1_1531565996775_8098100_86-314283-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 18:39:49,059 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_41750_210_1960_1_1533520678_3948042_68-223813-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 18:39:49,107 WARNING [train.py:1204] (2/4) Exclude cut with ID _55153_210_2253_1_1534384786677_3162096_24-198429-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 18:39:50,540 WARNING [train.py:1204] (2/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_119-207162-0_sp0.9 from training. Duration: 0.788875 2023-10-04 18:39:51,846 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_12868_210_6753_1_1530702479_3610564_148-172861-0 from training. Duration: 0.5 2023-10-04 18:39:57,537 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_5522_210_3273_1_1527249606_3909096_318-203153-0 from training. Duration: 0.43 2023-10-04 18:39:57,544 WARNING [train.py:1204] (2/4) Exclude cut with ID _14884_210_2252_1_1530707459689_5561250_288-189949-0_sp1.1 from training. Duration: 0.97275 2023-10-04 18:39:59,522 WARNING [train.py:1204] (2/4) Exclude cut with ID _5294_210_1747_1_1527249643928_3696179_529-242298-0 from training. Duration: 0.95 2023-10-04 18:40:00,811 WARNING [train.py:1204] (2/4) Exclude cut with ID _36538_210_9994_1_1533204020363_3795620_503-110289-0_sp1.1 from training. Duration: 0.936375 2023-10-04 18:40:03,553 WARNING [train.py:1204] (2/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_385-193052-0 from training. Duration: 0.86 2023-10-04 18:40:03,606 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_456-22141-0 from training. Duration: 0.88 2023-10-04 18:40:06,884 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_38965_210_4852_1_1533174906_3484735_284-117840-0_sp1.1 from training. Duration: 0.936375 2023-10-04 18:40:10,619 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.bypass.skip_rate, batch_count=1754426.6666666667, ans=0.04949747468305833 2023-10-04 18:40:19,269 WARNING [train.py:1204] (2/4) Exclude cut with ID _32704_210_8954_1_1533017488840_6747370_75-201625-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 18:40:19,377 WARNING [train.py:1204] (2/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_255-312654-0_sp1.1 from training. Duration: 0.82725 2023-10-04 18:40:19,387 WARNING [train.py:1204] (2/4) Exclude cut with ID _47251_210_18628_1_1533814063128_4137250_214-88076-0_sp0.9 from training. Duration: 0.97775 2023-10-04 18:40:22,007 WARNING [train.py:1204] (2/4) Exclude cut with ID _4092_210_2151_1_1526643044449_3135759_227-213972-0_sp1.1 from training. Duration: 0.4909375 2023-10-04 18:40:22,024 WARNING [train.py:1204] (2/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_912-47473-0_sp1.1 from training. Duration: 0.6181875 2023-10-04 18:40:22,056 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6767_210_3273_1_1527855783_1718932_173-256601-0_sp1.1 from training. Duration: 0.72725 2023-10-04 18:40:22,187 WARNING [train.py:1204] (2/4) Exclude cut with ID _6274_210_2084_1_1527937583837_7494020_32-205288-0_sp1.1 from training. Duration: 0.7090625 2023-10-04 18:40:23,468 WARNING [train.py:1204] (2/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_39-248870-0 from training. Duration: 0.69 2023-10-04 18:40:26,338 WARNING [train.py:1204] (2/4) Exclude cut with ID _3541_210_2477_1_1526475408709_3263973_377-296839-0_sp1.1 from training. Duration: 0.5 2023-10-04 18:40:26,353 WARNING [train.py:1204] (2/4) Exclude cut with ID _13679_210_7497_1_1530534293662_3355098_413-56154-0_sp1.1 from training. Duration: 0.963625 2023-10-04 18:40:27,753 WARNING [train.py:1204] (2/4) Exclude cut with ID _14970_210_15709_1_1530775547361_1966965_201-84544-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 18:40:27,808 WARNING [train.py:1204] (2/4) Exclude cut with ID _21783_210_11357_1_1531742458730_3762679_105-95775-0_sp1.1 from training. Duration: 0.936375 2023-10-04 18:40:30,624 WARNING [train.py:1204] (2/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_131-191669-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 18:40:30,650 WARNING [train.py:1204] (2/4) Exclude cut with ID _14860_210_4915_1_1530702020340_7343240_695-4323-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 18:40:32,576 WARNING [train.py:1204] (2/4) Exclude cut with ID _39867_210_9826_1_1533553222030_9344259_991-130565-0_sp1.1 from training. Duration: 0.97275 2023-10-04 18:40:33,985 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_50-3289-0_sp1.1 from training. Duration: 0.82725 2023-10-04 18:40:35,427 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_54-347670-0_sp1.1 from training. Duration: 0.92725 2023-10-04 18:40:35,483 WARNING [train.py:1204] (2/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_200-153445-0_sp1.1 from training. Duration: 0.936375 2023-10-04 18:40:36,884 WARNING [train.py:1204] (2/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_279-302911-0_sp1.1 from training. Duration: 0.97275 2023-10-04 18:40:38,736 WARNING [train.py:1204] (2/4) Exclude cut with ID _14886_210_4220_1_1530702020355_3516190_159-73748-0_sp0.9 from training. Duration: 0.92225 2023-10-04 18:40:43,759 WARNING [train.py:1204] (2/4) Exclude cut with ID _53379_210_20034_1_1534222027363_3854654_23-222715-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 18:40:45,060 WARNING [train.py:1204] (2/4) Exclude cut with ID _12555_210_8750_1_1530075594402_3780239_449-267986-0 from training. Duration: 0.83 2023-10-04 18:40:45,071 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7544_210_6753_1_1528587722_3576686_295-258479-0 from training. Duration: 0.84 2023-10-04 18:40:47,788 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7002_210_1794_1_1527935634_4034444_487-236501-0_sp0.9 from training. Duration: 0.7333125 2023-10-04 18:40:47,836 WARNING [train.py:1204] (2/4) Exclude cut with ID _21783_210_11357_1_1531742458730_3762679_244-19107-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 18:40:47,856 WARNING [train.py:1204] (2/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_390-339137-0 from training. Duration: 0.56 2023-10-04 18:40:49,197 WARNING [train.py:1204] (2/4) Exclude cut with ID _7562_210_2084_1_1528887767916_3946229_186-326422-0_sp1.1 from training. Duration: 0.763625 2023-10-04 18:40:49,274 WARNING [train.py:1204] (2/4) Exclude cut with ID _33640_210_2655_1_1533117605479_4855469_645-145235-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 18:40:50,450 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_25767_210_2161_1_1532307483_3867748_506-171777-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 18:40:50,470 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_5438_210_1794_1_1527336687_2600009_233-58072-0_sp1.1 from training. Duration: 0.7 2023-10-04 18:40:50,471 WARNING [train.py:1204] (2/4) Exclude cut with ID IC0669W0063-330908 from training. Duration: 0.896 2023-10-04 18:40:50,529 WARNING [train.py:1204] (2/4) Exclude cut with ID IC0671W0070-331913 from training. Duration: 0.896 2023-10-04 18:40:50,532 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_258-310570-0_sp1.1 from training. Duration: 0.7454375 2023-10-04 18:40:51,886 WARNING [train.py:1204] (2/4) Exclude cut with ID _9461_210_4557_1_1529060514726_3677451_162-177507-0_sp1.1 from training. Duration: 0.97275 2023-10-04 18:40:52,206 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.feed_forward2.hidden_balancer.prob, batch_count=1754560.0, ans=0.125 2023-10-04 18:40:54,807 WARNING [train.py:1204] (2/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_26-175553-0_sp0.9 from training. Duration: 0.67775 2023-10-04 18:40:54,832 WARNING [train.py:1204] (2/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_7-150481-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 18:40:56,081 INFO [train.py:1046] (2/4) Epoch 50, batch 2900, loss[loss=0.151, simple_loss=0.2328, pruned_loss=0.03455, over 23236.00 frames. ], tot_loss[loss=0.1519, simple_loss=0.2323, pruned_loss=0.03578, over 4692038.30 frames. ], batch size: 105, lr: 2.03e-03, grad_scale: 16.0 2023-10-04 18:40:56,189 WARNING [train.py:1204] (2/4) Exclude cut with ID _23557_210_8750_1_1531890106370_6846255_72-273262-0_sp1.1 from training. Duration: 0.963625 2023-10-04 18:40:57,500 WARNING [train.py:1204] (2/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_367-61330-0 from training. Duration: 0.75 2023-10-04 18:41:00,311 WARNING [train.py:1204] (2/4) Exclude cut with ID _39867_210_9826_1_1533553222030_9344259_1337-11141-0_sp1.1 from training. Duration: 0.936375 2023-10-04 18:41:00,330 WARNING [train.py:1204] (2/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_101-293367-0 from training. Duration: 0.63 2023-10-04 18:41:01,708 WARNING [train.py:1204] (2/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_10-100367-0 from training. Duration: 0.98 2023-10-04 18:41:03,577 WARNING [train.py:1204] (2/4) Exclude cut with ID _60057_210_10640_1_1534903065793_3333150_171-181133-0_sp1.1 from training. Duration: 0.8 2023-10-04 18:41:03,581 WARNING [train.py:1204] (2/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_844-96989-0_sp0.9 from training. Duration: 0.788875 2023-10-04 18:41:05,175 WARNING [train.py:1204] (2/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_301-144595-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 18:41:07,060 WARNING [train.py:1204] (2/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_115-147343-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 18:41:11,666 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_3171_210_2087_1_1526109084_2330007_37-64334-0_sp1.1 from training. Duration: 0.7454375 2023-10-04 18:41:11,702 WARNING [train.py:1204] (2/4) Exclude cut with ID _19514_210_9259_1_1531530071330_5084230_769-272914-0_sp1.1 from training. Duration: 0.936375 2023-10-04 18:41:13,189 WARNING [train.py:1204] (2/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_76-311963-0_sp0.9 from training. Duration: 0.77775 2023-10-04 18:41:15,121 WARNING [train.py:1204] (2/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_526-62079-0 from training. Duration: 0.67 2023-10-04 18:41:15,187 WARNING [train.py:1204] (2/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_7-111954-0_sp0.9 from training. Duration: 0.62225 2023-10-04 18:41:15,356 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.conv_module2.balancer2.min_positive, batch_count=1754693.3333333333, ans=0.05 2023-10-04 18:41:16,326 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.833e+02 2.129e+02 2.392e+02 2.896e+02 5.102e+02, threshold=4.784e+02, percent-clipped=2.0 2023-10-04 18:41:16,462 WARNING [train.py:1204] (2/4) Exclude cut with ID _57997_210_4163_1_1534766425889_3961049_360-279140-0_sp1.1 from training. Duration: 0.97275 2023-10-04 18:41:19,191 WARNING [train.py:1204] (2/4) Exclude cut with ID _4866_210_2577_1_1527404412284_4364659_133-1696-0 from training. Duration: 0.99 2023-10-04 18:41:20,574 WARNING [train.py:1204] (2/4) Exclude cut with ID _5190_210_4557_1_1527244368799_373439_28-197030-0 from training. Duration: 0.72 2023-10-04 18:41:23,281 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_53-39003-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 18:41:23,284 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6673_210_3273_1_1527767790_3173240_351-167954-0 from training. Duration: 0.48 2023-10-04 18:41:23,300 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_2822_210_2715_1_1526112586_1999587_165-36656-0_sp1.1 from training. Duration: 0.8090625 2023-10-04 18:41:26,043 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_328-264892-0_sp1.1 from training. Duration: 0.92725 2023-10-04 18:41:26,048 WARNING [train.py:1204] (2/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_29-293896-0_sp0.9 from training. Duration: 0.67775 2023-10-04 18:41:28,261 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.2.encoder.layers.1.feed_forward2.out_whiten, num_groups=1, num_channels=384, metric=3.80 vs. limit=15.0 2023-10-04 18:41:28,773 WARNING [train.py:1204] (2/4) Exclude cut with ID _65263_210_22356_1_1535763598691_3921037_465-55448-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 18:41:30,082 WARNING [train.py:1204] (2/4) Exclude cut with ID _21989_210_5854_1_1531899214931_3271360_311-53106-0_sp1.1 from training. Duration: 0.97275 2023-10-04 18:41:32,868 WARNING [train.py:1204] (2/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_389-277915-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 18:41:34,685 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_69630_210_7234_1_1535791094_677837_63-277139-0_sp1.1 from training. Duration: 0.963625 2023-10-04 18:41:36,141 WARNING [train.py:1204] (2/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_85-138257-0 from training. Duration: 0.71 2023-10-04 18:41:36,157 WARNING [train.py:1204] (2/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_686-247600-0 from training. Duration: 0.92 2023-10-04 18:41:36,162 WARNING [train.py:1204] (2/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_108-255430-0_sp1.1 from training. Duration: 0.8818125 2023-10-04 18:41:41,335 WARNING [train.py:1204] (2/4) Exclude cut with ID _14884_210_2252_1_1530707459689_5561250_114-201399-0_sp1.1 from training. Duration: 0.6909375 2023-10-04 18:41:43,996 WARNING [train.py:1204] (2/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_506-42614-0 from training. Duration: 0.79 2023-10-04 18:41:44,745 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.3.encoder.layers.2.conv_module2.whiten, num_groups=1, num_channels=512, metric=11.58 vs. limit=15.0 2023-10-04 18:41:45,550 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_43802_210_2290_1_1533516203_8829504_85-195240-0_sp1.1 from training. Duration: 0.7909375 2023-10-04 18:41:47,928 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.4.encoder.layers.2.conv_module1.whiten, num_groups=1, num_channels=384, metric=2.78 vs. limit=15.0 2023-10-04 18:41:51,479 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_141-36243-0_sp1.1 from training. Duration: 0.97275 2023-10-04 18:41:59,738 WARNING [train.py:1204] (2/4) Exclude cut with ID _7852_210_5219_1_1528286471316_8139400_358-287492-0_sp1.1 from training. Duration: 0.87275 2023-10-04 18:41:59,762 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_805-71497-0_sp0.9 from training. Duration: 0.788875 2023-10-04 18:42:01,202 WARNING [train.py:1204] (2/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_178-39722-0 from training. Duration: 0.49 2023-10-04 18:42:03,950 WARNING [train.py:1204] (2/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_428-181925-0_sp1.1 from training. Duration: 0.963625 2023-10-04 18:42:03,954 WARNING [train.py:1204] (2/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_770-200430-0 from training. Duration: 0.89 2023-10-04 18:42:05,325 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_1104-271009-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 18:42:05,371 WARNING [train.py:1204] (2/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_128-296581-0_sp0.9 from training. Duration: 0.82225 2023-10-04 18:42:10,537 INFO [train.py:1046] (2/4) Epoch 50, batch 2950, loss[loss=0.1497, simple_loss=0.2212, pruned_loss=0.03912, over 23864.00 frames. ], tot_loss[loss=0.152, simple_loss=0.2327, pruned_loss=0.03561, over 4700406.15 frames. ], batch size: 195, lr: 2.03e-03, grad_scale: 16.0 2023-10-04 18:42:11,957 WARNING [train.py:1204] (2/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_19-62511-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 18:42:13,317 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_805-71497-0 from training. Duration: 0.71 2023-10-04 18:42:15,206 WARNING [train.py:1204] (2/4) Exclude cut with ID _44778_210_2054_1_1533798127203_7443090_573-152077-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 18:42:15,211 WARNING [train.py:1204] (2/4) Exclude cut with ID _42875_210_14856_1_1533704397487_4012433_393-177816-0_sp1.1 from training. Duration: 0.963625 2023-10-04 18:42:16,624 WARNING [train.py:1204] (2/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_278-35159-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 18:42:18,019 WARNING [train.py:1204] (2/4) Exclude cut with ID _15037_210_2253_1_1530752294104_3267650_13-130361-0_sp1.1 from training. Duration: 0.9 2023-10-04 18:42:19,808 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_3019_210_2028_1_1526044245_3264865_127-135615-0 from training. Duration: 0.94 2023-10-04 18:42:19,874 WARNING [train.py:1204] (2/4) Exclude cut with ID _28317_210_12742_1_1532480302381_8510325_607-213951-0 from training. Duration: 0.84 2023-10-04 18:42:20,071 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.balancer1.prob, batch_count=1754960.0, ans=0.125 2023-10-04 18:42:21,207 WARNING [train.py:1204] (2/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_436-318113-0_sp0.9 from training. Duration: 0.8333125 2023-10-04 18:42:21,208 WARNING [train.py:1204] (2/4) Exclude cut with ID _13938_210_9988_1_1530518718978_3693960_148-152225-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 18:42:21,531 INFO [scaling.py:1118] (2/4) WithLoss: name=encoder.encoders.4.encoder.layers.2.self_attn_weights, loss-sum=0.000e+00 2023-10-04 18:42:25,614 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_2541_210_1794_1_1525590269_3084765_156-173713-0_sp1.1 from training. Duration: 0.736375 2023-10-04 18:42:26,957 WARNING [train.py:1204] (2/4) Exclude cut with ID _28323_210_16187_1_1532480395667_4100300_531-24157-0_sp1.1 from training. Duration: 0.8909375 2023-10-04 18:42:28,372 WARNING [train.py:1204] (2/4) Exclude cut with ID _29303_210_2575_1_1532392237303_7410649_382-217492-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 18:42:29,718 WARNING [train.py:1204] (2/4) Exclude cut with ID _58895_210_8750_1_1535184081320_3649089_270-158013-0_sp1.1 from training. Duration: 0.92725 2023-10-04 18:42:33,817 WARNING [train.py:1204] (2/4) Exclude cut with ID _29277_210_3279_1_1532581109293_3173069_311-206227-0_sp1.1 from training. Duration: 0.936375 2023-10-04 18:42:33,835 WARNING [train.py:1204] (2/4) Exclude cut with ID _7160_210_1876_1_1527940923231_3703799_282-322611-0_sp0.9 from training. Duration: 0.9666875 2023-10-04 18:42:35,230 WARNING [train.py:1204] (2/4) Exclude cut with ID _53219_210_4525_1_1534834836008_4064825_562-32295-0_sp1.1 from training. Duration: 0.963625 2023-10-04 18:42:35,514 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.conv_skip_rate, batch_count=1755026.6666666667, ans=0.0 2023-10-04 18:42:37,094 WARNING [train.py:1204] (2/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_408-87330-0_sp1.1 from training. Duration: 0.963625 2023-10-04 18:42:37,102 WARNING [train.py:1204] (2/4) Exclude cut with ID _59202_210_26984_1_1534816810612_3549174_137-318710-0_sp1.1 from training. Duration: 0.8090625 2023-10-04 18:42:39,876 WARNING [train.py:1204] (2/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_372-30302-0 from training. Duration: 0.86 2023-10-04 18:42:44,433 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7543_210_6753_1_1528541907_1399322_178-56269-0_sp1.1 from training. Duration: 0.95275 2023-10-04 18:42:44,461 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9109W0012-549758 from training. Duration: 0.512 2023-10-04 18:42:45,904 WARNING [train.py:1204] (2/4) Exclude cut with ID _5410_210_1388_1_1527397055795_1719020_39-112221-0_sp0.9 from training. Duration: 0.7666875 2023-10-04 18:42:47,410 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9121W0012-554933 from training. Duration: 0.896 2023-10-04 18:42:49,251 WARNING [train.py:1204] (2/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_476-272757-0 from training. Duration: 0.93 2023-10-04 18:42:50,539 WARNING [train.py:1204] (2/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_1085-171718-0_sp1.1 from training. Duration: 0.92725 2023-10-04 18:42:50,591 WARNING [train.py:1204] (2/4) Exclude cut with ID _8480_210_1874_1_1528589391205_6728330_309-139896-0_sp1.1 from training. Duration: 0.736375 2023-10-04 18:42:50,591 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9130W0113-559703 from training. Duration: 0.896 2023-10-04 18:42:50,595 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_292-265928-0_sp1.1 from training. Duration: 0.463625 2023-10-04 18:42:53,881 WARNING [train.py:1204] (2/4) Exclude cut with ID _8772_210_1850_1_1528635595972_3664011_96-12756-0 from training. Duration: 0.98 2023-10-04 18:42:55,220 WARNING [train.py:1204] (2/4) Exclude cut with ID _12556_210_8341_1_1530081024100_7327690_725-193477-0_sp1.1 from training. Duration: 0.8909375 2023-10-04 18:42:55,275 WARNING [train.py:1204] (2/4) Exclude cut with ID _5289_210_2084_1_1527678202736_3779299_142-103317-0_sp1.1 from training. Duration: 0.763625 2023-10-04 18:42:56,776 WARNING [train.py:1204] (2/4) Exclude cut with ID _45012_210_12651_1_1533608435136_5371380_206-51075-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 18:42:58,123 WARNING [train.py:1204] (2/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_176-56120-0_sp1.1 from training. Duration: 0.7545625 2023-10-04 18:42:58,155 WARNING [train.py:1204] (2/4) Exclude cut with ID _40284_210_2084_1_1533513679814_3704279_220-201864-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 18:42:58,183 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9165W0004-576638 from training. Duration: 0.896 2023-10-04 18:42:58,218 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_5438_210_1794_1_1527336687_2600009_218-343336-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 18:42:59,531 WARNING [train.py:1204] (2/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_318-162017-0 from training. Duration: 0.91 2023-10-04 18:43:03,843 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_49544_210_1794_1_1534379770_5262083_483-349298-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 18:43:05,191 WARNING [train.py:1204] (2/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_197-28164-0_sp0.9 from training. Duration: 0.888875 2023-10-04 18:43:06,562 WARNING [train.py:1204] (2/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_643-52007-0 from training. Duration: 0.57 2023-10-04 18:43:06,586 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_38965_210_4852_1_1533174906_3484735_403-332963-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 18:43:09,735 WARNING [train.py:1204] (2/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_340-174176-0 from training. Duration: 0.49 2023-10-04 18:43:12,959 WARNING [train.py:1204] (2/4) Exclude cut with ID _56708_210_13177_1_1535252351282_3422899_363-917-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 18:43:14,316 WARNING [train.py:1204] (2/4) Exclude cut with ID _19896_210_1883_1_1531709022662_6649569_525-159063-0_sp1.1 from training. Duration: 0.936375 2023-10-04 18:43:14,357 WARNING [train.py:1204] (2/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_430-116139-0_sp0.9 from training. Duration: 0.9444375 2023-10-04 18:43:16,969 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_53753_210_7234_1_1534320003_3594148_86-329250-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 18:43:16,987 WARNING [train.py:1204] (2/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_406-275649-0_sp1.1 from training. Duration: 0.3818125 2023-10-04 18:43:17,099 WARNING [train.py:1204] (2/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_1083-147536-0_sp1.1 from training. Duration: 0.8818125 2023-10-04 18:43:18,433 WARNING [train.py:1204] (2/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_54-311741-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 18:43:18,439 WARNING [train.py:1204] (2/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_237-58433-0_sp0.9 from training. Duration: 0.82225 2023-10-04 18:43:19,700 WARNING [train.py:1204] (2/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_98-51676-0_sp1.1 from training. Duration: 0.663625 2023-10-04 18:43:19,770 WARNING [train.py:1204] (2/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_386-270472-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 18:43:21,150 WARNING [train.py:1204] (2/4) Exclude cut with ID _19023_210_4353_1_1531378892552_5820780_83-254794-0_sp1.1 from training. Duration: 0.7818125 2023-10-04 18:43:24,343 INFO [train.py:1046] (2/4) Epoch 50, batch 3000, loss[loss=0.1475, simple_loss=0.2227, pruned_loss=0.03609, over 23861.00 frames. ], tot_loss[loss=0.1523, simple_loss=0.2336, pruned_loss=0.03546, over 4717956.07 frames. ], batch size: 179, lr: 2.03e-03, grad_scale: 8.0 2023-10-04 18:43:24,343 INFO [train.py:1069] (2/4) Computing validation loss 2023-10-04 18:43:36,826 INFO [train.py:1078] (2/4) Epoch 50, validation: loss=0.3701, simple_loss=0.2758, pruned_loss=0.2322, over 1125622.00 frames. 2023-10-04 18:43:36,827 INFO [train.py:1079] (2/4) Maximum memory allocated so far is 20967MB 2023-10-04 18:43:36,895 WARNING [train.py:1204] (2/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_314-93656-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 18:43:36,906 WARNING [train.py:1204] (2/4) Exclude cut with ID _14170_210_5219_1_1530525949868_4469650_336-213530-0 from training. Duration: 0.9 2023-10-04 18:43:37,015 WARNING [train.py:1204] (2/4) Exclude cut with ID _36686_210_5665_1_1533778225913_3646410_65-221120-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 18:43:39,878 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_26433_210_12108_1_1532157158_4056480_271-68178-0_sp1.1 from training. Duration: 0.92725 2023-10-04 18:43:41,747 WARNING [train.py:1204] (2/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_46-207599-0_sp0.9 from training. Duration: 0.711125 2023-10-04 18:43:45,969 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9303W0114-637133 from training. Duration: 0.896 2023-10-04 18:43:46,012 WARNING [train.py:1204] (2/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_490-207861-0 from training. Duration: 0.98 2023-10-04 18:43:46,234 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.conv_module2.balancer2.prob, batch_count=1755293.3333333333, ans=0.125 2023-10-04 18:43:48,722 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6767_210_3273_1_1527855783_1718932_173-256601-0_sp0.9 from training. Duration: 0.888875 2023-10-04 18:43:48,779 WARNING [train.py:1204] (2/4) Exclude cut with ID _13921_210_9994_1_1530841782272_3503520_327-69677-0_sp1.1 from training. Duration: 0.8181875 2023-10-04 18:43:50,138 WARNING [train.py:1204] (2/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_508-335369-0 from training. Duration: 0.48 2023-10-04 18:43:50,160 WARNING [train.py:1204] (2/4) Exclude cut with ID _64631_210_27767_1_1535349580068_4165160_24-46567-0_sp1.1 from training. Duration: 0.963625 2023-10-04 18:43:57,837 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.709e+02 2.159e+02 2.457e+02 2.749e+02 3.497e+02, threshold=4.915e+02, percent-clipped=0.0 2023-10-04 18:43:57,921 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_89-35748-0_sp1.1 from training. Duration: 0.7181875 2023-10-04 18:43:58,171 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.conv_module1.balancer2.prob, batch_count=1755360.0, ans=0.125 2023-10-04 18:44:06,239 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_26433_210_12108_1_1532157158_4056480_578-262107-0_sp1.1 from training. Duration: 0.9 2023-10-04 18:44:07,854 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.balancer1.prob, batch_count=1755426.6666666667, ans=0.125 2023-10-04 18:44:12,603 WARNING [train.py:1204] (2/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_129-337802-0 from training. Duration: 0.72 2023-10-04 18:44:12,696 WARNING [train.py:1204] (2/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_356-167816-0_sp0.9 from training. Duration: 0.8 2023-10-04 18:44:15,996 WARNING [train.py:1204] (2/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_137-116442-0_sp1.1 from training. Duration: 0.7090625 2023-10-04 18:44:16,046 WARNING [train.py:1204] (2/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_358-172940-0_sp1.1 from training. Duration: 0.963625 2023-10-04 18:44:16,065 WARNING [train.py:1204] (2/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_668-344147-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 18:44:18,819 WARNING [train.py:1204] (2/4) Exclude cut with ID _62337_210_12392_1_1535009273105_3412229_255-157667-0_sp1.1 from training. Duration: 0.936375 2023-10-04 18:44:18,823 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_486-151131-0 from training. Duration: 0.85 2023-10-04 18:44:20,216 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_53638_210_21581_1_1534297319_5808449_885-80172-0 from training. Duration: 0.96 2023-10-04 18:44:21,584 WARNING [train.py:1204] (2/4) Exclude cut with ID _50093_210_4220_1_1534417141207_3432299_105-183067-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 18:44:22,931 WARNING [train.py:1204] (2/4) Exclude cut with ID _7562_210_2084_1_1528887767916_3946229_345-169156-0_sp0.9 from training. Duration: 0.6444375 2023-10-04 18:44:24,921 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_4528_210_3273_1_1527076654_3276777_209-219804-0_sp0.9 from training. Duration: 0.7666875 2023-10-04 18:44:24,956 WARNING [train.py:1204] (2/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_35-153886-0_sp0.9 from training. Duration: 0.9333125 2023-10-04 18:44:25,021 WARNING [train.py:1204] (2/4) Exclude cut with ID _30800_210_22382_1_1532499347904_4515959_332-121925-0_sp1.1 from training. Duration: 0.92725 2023-10-04 18:44:25,023 WARNING [train.py:1204] (2/4) Exclude cut with ID _59202_210_26984_1_1534816810612_3549174_495-65515-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 18:44:29,615 WARNING [train.py:1204] (2/4) Exclude cut with ID _6274_210_2084_1_1527937583837_7494020_769-168302-0_sp1.1 from training. Duration: 0.6454375 2023-10-04 18:44:29,650 WARNING [train.py:1204] (2/4) Exclude cut with ID _24271_210_12658_1_1531987270022_7190519_708-71345-0_sp1.1 from training. Duration: 0.936375 2023-10-04 18:44:29,650 WARNING [train.py:1204] (2/4) Exclude cut with ID 3033-130750-0096-55598-0_sp0.9 from training. Duration: 0.92225 2023-10-04 18:44:31,119 WARNING [train.py:1204] (2/4) Exclude cut with ID _14087_210_2253_1_1530529224512_3014029_219-224445-0_sp0.9 from training. Duration: 0.9333125 2023-10-04 18:44:33,801 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_174-277364-0 from training. Duration: 0.71 2023-10-04 18:44:35,151 WARNING [train.py:1204] (2/4) Exclude cut with ID _45021_210_9994_1_1533619751094_3330288_96-239010-0_sp1.1 from training. Duration: 0.82725 2023-10-04 18:44:35,173 WARNING [train.py:1204] (2/4) Exclude cut with ID _31602_210_5854_1_1532677021456_4458250_489-305116-0_sp1.1 from training. Duration: 0.97275 2023-10-04 18:44:35,200 WARNING [train.py:1204] (2/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_269-154726-0_sp0.9 from training. Duration: 0.9555625 2023-10-04 18:44:38,633 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.out_combiner.scale_min, batch_count=1755560.0, ans=0.2 2023-10-04 18:44:38,700 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.balancer1.prob, batch_count=1755560.0, ans=0.125 2023-10-04 18:44:39,817 WARNING [train.py:1204] (2/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_126-54418-0_sp1.1 from training. Duration: 0.92725 2023-10-04 18:44:39,850 WARNING [train.py:1204] (2/4) Exclude cut with ID _42326_210_7770_1_1533814282531_3707269_182-224159-0_sp1.1 from training. Duration: 0.92725 2023-10-04 18:44:41,239 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7002_210_1794_1_1527939683_5157286_1-281264-0_sp0.9 from training. Duration: 0.488875 2023-10-04 18:44:41,266 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_86-16881-0 from training. Duration: 0.58 2023-10-04 18:44:42,506 WARNING [train.py:1204] (2/4) Exclude cut with ID _19514_210_9259_1_1531530071330_5084230_637-279094-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 18:44:42,548 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_26433_210_12108_1_1532157158_4056480_578-262107-0 from training. Duration: 0.99 2023-10-04 18:44:42,585 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6291_210_3273_1_1527683649_1078498_82-221196-0_sp1.1 from training. Duration: 0.6909375 2023-10-04 18:44:45,413 WARNING [train.py:1204] (2/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_402-102311-0 from training. Duration: 0.67 2023-10-04 18:44:45,633 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.out_combiner.scale_min, batch_count=1755560.0, ans=0.2 2023-10-04 18:44:47,455 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1015-343884-0_sp1.1 from training. Duration: 0.563625 2023-10-04 18:44:47,579 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.0.conv_skip_rate, batch_count=1755560.0, ans=0.0 2023-10-04 18:44:48,802 WARNING [train.py:1204] (2/4) Exclude cut with ID _13684_210_7497_1_1530619354863_3598359_312-235606-0_sp1.1 from training. Duration: 0.5454375 2023-10-04 18:44:50,067 WARNING [train.py:1204] (2/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_55-31904-0 from training. Duration: 0.88 2023-10-04 18:44:51,413 INFO [train.py:1046] (2/4) Epoch 50, batch 3050, loss[loss=0.147, simple_loss=0.2377, pruned_loss=0.02821, over 24564.00 frames. ], tot_loss[loss=0.1524, simple_loss=0.2337, pruned_loss=0.03559, over 4720751.00 frames. ], batch size: 71, lr: 2.03e-03, grad_scale: 8.0 2023-10-04 18:44:51,453 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_3133_210_2087_1_1526106446_2324690_25-332138-0 from training. Duration: 0.91 2023-10-04 18:44:51,454 WARNING [train.py:1204] (2/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_912-282412-0_sp0.9 from training. Duration: 0.5444375 2023-10-04 18:44:51,530 WARNING [train.py:1204] (2/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_458-299435-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 18:44:52,821 WARNING [train.py:1204] (2/4) Exclude cut with ID _69584_210_5219_1_1535785133775_4044450_275-88403-0_sp1.1 from training. Duration: 0.92725 2023-10-04 18:44:52,836 WARNING [train.py:1204] (2/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_841-341679-0_sp1.1 from training. Duration: 0.52725 2023-10-04 18:44:54,085 WARNING [train.py:1204] (2/4) Exclude cut with ID _41391_210_7994_1_1533603771872_3638201_1-54521-0_sp1.1 from training. Duration: 0.97275 2023-10-04 18:44:54,144 WARNING [train.py:1204] (2/4) Exclude cut with ID _56235_210_10640_1_1534557335235_4161099_495-186609-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 18:44:59,195 WARNING [train.py:1204] (2/4) Exclude cut with ID _4681_210_3525_1_1527250689464_120889_15-126760-0 from training. Duration: 0.81 2023-10-04 18:45:00,646 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_3019_210_2028_1_1526044245_3264865_127-135615-0_sp1.1 from training. Duration: 0.8545625 2023-10-04 18:45:03,358 WARNING [train.py:1204] (2/4) Exclude cut with ID _30191_210_1664_1_1533189915047_7196670_915-21561-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 18:45:04,683 WARNING [train.py:1204] (2/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_6-214815-0_sp0.9 from training. Duration: 0.9444375 2023-10-04 18:45:04,992 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.conv_module2.balancer2.min_positive, batch_count=1755693.3333333333, ans=0.05 2023-10-04 18:45:06,274 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_45399_210_4852_1_1533624725_7579737_190-137494-0_sp1.1 from training. Duration: 0.97275 2023-10-04 18:45:09,029 WARNING [train.py:1204] (2/4) Exclude cut with ID _41314_210_12651_1_1533459476543_3766460_207-132038-0 from training. Duration: 0.94 2023-10-04 18:45:14,882 WARNING [train.py:1204] (2/4) Exclude cut with ID _14603_210_4220_1_1530615688441_3097319_90-31383-0 from training. Duration: 0.87 2023-10-04 18:45:14,929 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_15517_210_6753_1_1530952293_1664795_119-31001-0 from training. Duration: 0.94 2023-10-04 18:45:14,953 WARNING [train.py:1204] (2/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_684-85342-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 18:45:18,275 WARNING [train.py:1204] (2/4) Exclude cut with ID _14603_210_4220_1_1530615688441_3097319_97-325053-0_sp1.1 from training. Duration: 0.836375 2023-10-04 18:45:18,598 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.feed_forward1.hidden_balancer.prob, batch_count=1755693.3333333333, ans=0.125 2023-10-04 18:45:21,129 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_68702_210_23689_1_1535799577_3420068_168-25150-0_sp1.1 from training. Duration: 0.97275 2023-10-04 18:45:21,136 WARNING [train.py:1204] (2/4) Exclude cut with ID _65134_210_14763_1_1535335393985_3505368_111-27984-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 18:45:21,315 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.feed_forward1.out_proj.dropout_p, batch_count=1755760.0, ans=0.1 2023-10-04 18:45:22,470 WARNING [train.py:1204] (2/4) Exclude cut with ID _48678_210_5663_1_1533988830947_3769199_276-226230-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 18:45:23,934 WARNING [train.py:1204] (2/4) Exclude cut with ID _28427_210_4220_1_1532581240967_3061069_43-84928-0_sp1.1 from training. Duration: 0.9 2023-10-04 18:45:25,132 WARNING [train.py:1204] (2/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_158-250116-0_sp1.1 from training. Duration: 0.563625 2023-10-04 18:45:26,406 WARNING [train.py:1204] (2/4) Exclude cut with ID _66873_210_5517_1_1535454136506_4293370_279-332556-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 18:45:26,457 WARNING [train.py:1204] (2/4) Exclude cut with ID _42863_210_9070_1_1533902398788_3696920_488-230146-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 18:45:26,458 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_387-247159-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 18:45:28,898 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6729_210_2550_1_1528032481_1788725_261-42619-0_sp1.1 from training. Duration: 0.97275 2023-10-04 18:45:31,491 WARNING [train.py:1204] (2/4) Exclude cut with ID _19896_210_1883_1_1531709022662_6649569_831-44724-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 18:45:34,285 WARNING [train.py:1204] (2/4) Exclude cut with ID _55153_210_2253_1_1534384786677_3162096_280-285944-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 18:45:34,316 WARNING [train.py:1204] (2/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_448-4048-0 from training. Duration: 0.96 2023-10-04 18:45:34,365 WARNING [train.py:1204] (2/4) Exclude cut with ID _29678_210_7893_1_1532421042787_3906118_456-202548-0_sp1.1 from training. Duration: 0.97275 2023-10-04 18:45:34,378 WARNING [train.py:1204] (2/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_107-332398-0_sp1.1 from training. Duration: 0.7090625 2023-10-04 18:45:37,223 WARNING [train.py:1204] (2/4) Exclude cut with ID _13921_210_9994_1_1530838702800_3003429_46-149444-0_sp1.1 from training. Duration: 0.8545625 2023-10-04 18:45:38,517 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_257-20733-0_sp0.9 from training. Duration: 0.8444375 2023-10-04 18:45:38,575 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_53753_210_7234_1_1534320003_3594148_385-234809-0_sp1.1 from training. Duration: 0.963625 2023-10-04 18:45:39,878 WARNING [train.py:1204] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_292-215346-0_sp1.1 from training. Duration: 0.8818125 2023-10-04 18:45:44,467 WARNING [train.py:1204] (2/4) Exclude cut with ID _68965_210_1883_1_1535698962272_3497490_196-124268-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 18:45:44,514 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_23801_210_16223_1_1532066392_3294657_570-5439-0_sp1.1 from training. Duration: 0.8818125 2023-10-04 18:45:51,837 WARNING [train.py:1204] (2/4) Exclude cut with ID _5166_210_2084_1_1527073197160_4087993_349-6928-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 18:45:51,888 WARNING [train.py:1204] (2/4) Exclude cut with ID _60621_210_17071_1_1535013053530_3671739_226-255202-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 18:45:51,892 WARNING [train.py:1204] (2/4) Exclude cut with ID _41817_210_18835_1_1533434477160_7423049_448-130302-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 18:45:54,713 WARNING [train.py:1204] (2/4) Exclude cut with ID _6653_210_2629_1_1527919172441_6732679_758-143677-0_sp1.1 from training. Duration: 0.8909375 2023-10-04 18:45:56,077 WARNING [train.py:1204] (2/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_912-47473-0_sp0.9 from training. Duration: 0.7555625 2023-10-04 18:45:56,110 WARNING [train.py:1204] (2/4) Exclude cut with ID _6694_210_4599_1_1527852637490_3965529_356-169233-0_sp1.1 from training. Duration: 0.9 2023-10-04 18:45:56,201 WARNING [train.py:1204] (2/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_1-99680-0 from training. Duration: 0.67 2023-10-04 18:45:57,888 WARNING [train.py:1204] (2/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_199-86432-0_sp1.1 from training. Duration: 0.8909375 2023-10-04 18:45:57,911 WARNING [train.py:1204] (2/4) Exclude cut with ID _61148_210_6897_1_1535094022726_4217610_362-182430-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 18:45:58,092 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.conv_module2.balancer1.prob, batch_count=1755893.3333333333, ans=0.125 2023-10-04 18:45:59,148 WARNING [train.py:1204] (2/4) Exclude cut with ID _49376_210_5986_1_1533985278732_3370740_301-154763-0 from training. Duration: 0.98 2023-10-04 18:46:01,153 WARNING [train.py:1204] (2/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_572-185150-0_sp1.1 from training. Duration: 0.8818125 2023-10-04 18:46:05,224 INFO [train.py:1046] (2/4) Epoch 50, batch 3100, loss[loss=0.16, simple_loss=0.247, pruned_loss=0.03655, over 23355.00 frames. ], tot_loss[loss=0.1528, simple_loss=0.234, pruned_loss=0.03585, over 4713066.44 frames. ], batch size: 93, lr: 2.03e-03, grad_scale: 8.0 2023-10-04 18:46:05,355 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_47896_210_20508_1_1534492518_5958868_558-314572-0_sp1.1 from training. Duration: 0.8818125 2023-10-04 18:46:06,725 WARNING [train.py:1204] (2/4) Exclude cut with ID _45560_210_21982_1_1533812603951_3584999_111-6376-0_sp0.9 from training. Duration: 0.9333125 2023-10-04 18:46:09,559 WARNING [train.py:1204] (2/4) Exclude cut with ID _14170_210_5219_1_1530525949868_4469650_412-317077-0_sp1.1 from training. Duration: 0.6818125 2023-10-04 18:46:12,221 WARNING [train.py:1204] (2/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_718-68505-0 from training. Duration: 0.89 2023-10-04 18:46:13,977 WARNING [train.py:1204] (2/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_77-311404-0 from training. Duration: 0.94 2023-10-04 18:46:15,307 WARNING [train.py:1204] (2/4) Exclude cut with ID _60229_210_27767_1_1534838406158_3655350_264-279573-0 from training. Duration: 0.83 2023-10-04 18:46:15,542 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.feed_forward1.out_proj.dropout_p, batch_count=1755960.0, ans=0.1 2023-10-04 18:46:16,676 WARNING [train.py:1204] (2/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_106-300985-0_sp1.1 from training. Duration: 0.8181875 2023-10-04 18:46:20,014 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_4183_210_2087_1_1526729100_1869838_79-277189-0_sp1.1 from training. Duration: 0.963625 2023-10-04 18:46:20,023 WARNING [train.py:1204] (2/4) Exclude cut with ID _28427_210_4220_1_1532581240967_3061069_195-295268-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 18:46:22,750 WARNING [train.py:1204] (2/4) Exclude cut with ID _4092_210_2151_1_1526643044449_3135759_356-278694-0_sp0.9 from training. Duration: 0.52225 2023-10-04 18:46:25,458 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.774e+02 2.135e+02 2.495e+02 2.950e+02 6.010e+02, threshold=4.989e+02, percent-clipped=3.0 2023-10-04 18:46:25,558 WARNING [train.py:1204] (2/4) Exclude cut with ID _14459_210_2228_1_1531443641946_7516700_777-71446-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 18:46:32,060 WARNING [train.py:1204] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_520-126729-0 from training. Duration: 0.7 2023-10-04 18:46:32,361 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.1.attention_skip_rate, batch_count=1756026.6666666667, ans=0.0 2023-10-04 18:46:35,061 WARNING [train.py:1204] (2/4) Exclude cut with ID _13684_210_7497_1_1530619354863_3598359_312-235606-0_sp0.9 from training. Duration: 0.6666875 2023-10-04 18:46:35,107 WARNING [train.py:1204] (2/4) Exclude cut with ID _37537_210_2253_1_1533171758568_3421949_283-349402-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 18:46:36,441 WARNING [train.py:1204] (2/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_29-117932-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 18:46:36,475 WARNING [train.py:1204] (2/4) Exclude cut with ID _29303_210_2575_1_1532392237303_7410649_721-283351-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 18:46:37,804 WARNING [train.py:1204] (2/4) Exclude cut with ID _14886_210_4220_1_1530702020355_3516190_175-229073-0_sp0.9 from training. Duration: 0.5 2023-10-04 18:46:39,170 WARNING [train.py:1204] (2/4) Exclude cut with ID _15458_210_8341_1_1530858614263_3834410_70-268830-0_sp1.1 from training. Duration: 0.97275 2023-10-04 18:46:39,193 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_2295_210_2087_1_1525500993_2407863_234-83104-0 from training. Duration: 0.62 2023-10-04 18:46:39,194 WARNING [train.py:1204] (2/4) Exclude cut with ID _22961_210_8254_1_1531976366672_4278180_317-199546-0_sp1.1 from training. Duration: 0.7818125 2023-10-04 18:46:40,626 WARNING [train.py:1204] (2/4) Exclude cut with ID _27894_210_12392_1_1532415496078_6902439_547-196022-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 18:46:42,576 WARNING [train.py:1204] (2/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_46-207599-0 from training. Duration: 0.64 2023-10-04 18:46:43,918 WARNING [train.py:1204] (2/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_426-326246-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 18:46:46,740 WARNING [train.py:1204] (2/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_445-117398-0_sp0.9 from training. Duration: 0.988875 2023-10-04 18:46:48,054 WARNING [train.py:1204] (2/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_563-347832-0 from training. Duration: 0.89 2023-10-04 18:46:48,132 WARNING [train.py:1204] (2/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_252-216674-0 from training. Duration: 0.54 2023-10-04 18:46:49,942 WARNING [train.py:1204] (2/4) Exclude cut with ID _59611_210_8895_1_1534741198240_3650361_187-212779-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 18:46:49,989 WARNING [train.py:1204] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_282-159367-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 18:46:52,067 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.4.encoder.layers.1.nonlin_attention.whiten2, num_groups=1, num_channels=384, metric=3.15 vs. limit=15.0 2023-10-04 18:46:54,109 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_68702_210_23689_1_1535799577_3420068_33-136883-0_sp1.1 from training. Duration: 0.936375 2023-10-04 18:46:54,119 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_47228_210_4983_1_1533780021_3672027_184-293706-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 18:46:54,148 WARNING [train.py:1204] (2/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_656-188361-0_sp0.9 from training. Duration: 0.9666875 2023-10-04 18:46:56,125 WARNING [train.py:1204] (2/4) Exclude cut with ID _4866_210_2577_1_1527404412284_4364659_352-157930-0_sp0.9 from training. Duration: 0.9 2023-10-04 18:46:56,126 WARNING [train.py:1204] (2/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_210-106047-0_sp1.1 from training. Duration: 0.8818125 2023-10-04 18:46:57,325 WARNING [train.py:1204] (2/4) Exclude cut with ID _9389_210_1664_1_1529217337301_5289269_289-213417-0_sp1.1 from training. Duration: 0.8090625 2023-10-04 18:46:57,362 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_3910_210_1794_1_1526777667_3747260_178-41186-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 18:46:57,366 WARNING [train.py:1204] (2/4) Exclude cut with ID _27052_210_5986_1_1532329197187_3585009_24-172263-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 18:46:57,375 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_5522_210_3273_1_1527249606_3909096_318-203153-0_sp1.1 from training. Duration: 0.3909375 2023-10-04 18:47:02,151 WARNING [train.py:1204] (2/4) Exclude cut with ID _27894_210_12392_1_1532415496078_6902439_222-51982-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 18:47:03,497 WARNING [train.py:1204] (2/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_87-131427-0 from training. Duration: 0.78 2023-10-04 18:47:04,896 WARNING [train.py:1204] (2/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_372-298093-0_sp0.9 from training. Duration: 0.888875 2023-10-04 18:47:06,284 WARNING [train.py:1204] (2/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_663-166973-0 from training. Duration: 0.8 2023-10-04 18:47:07,628 WARNING [train.py:1204] (2/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_429-177469-0_sp1.1 from training. Duration: 0.936375 2023-10-04 18:47:07,652 WARNING [train.py:1204] (2/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_724-172651-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 18:47:07,696 WARNING [train.py:1204] (2/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_103-336064-0 from training. Duration: 0.51 2023-10-04 18:47:17,925 WARNING [train.py:1204] (2/4) Exclude cut with ID _48677_210_5663_1_1533902405919_3892989_111-11439-0 from training. Duration: 0.96 2023-10-04 18:47:19,631 INFO [train.py:1046] (2/4) Epoch 50, batch 3150, loss[loss=0.1469, simple_loss=0.2298, pruned_loss=0.03201, over 23280.00 frames. ], tot_loss[loss=0.152, simple_loss=0.2328, pruned_loss=0.03557, over 4706493.89 frames. ], batch size: 93, lr: 2.03e-03, grad_scale: 8.0 2023-10-04 18:47:21,106 WARNING [train.py:1204] (2/4) Exclude cut with ID _8085_210_4557_1_1528455633493_3716100_95-103398-0_sp1.1 from training. Duration: 0.963625 2023-10-04 18:47:21,168 WARNING [train.py:1204] (2/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_1041-110837-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 18:47:22,553 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_50050_210_16223_1_1535435796_7290138_1155-121757-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 18:47:22,555 WARNING [train.py:1204] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_602-275260-0_sp0.9 from training. Duration: 0.911125 2023-10-04 18:47:22,770 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=1756293.3333333333, ans=0.1 2023-10-04 18:47:23,920 WARNING [train.py:1204] (2/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_265-164486-0 from training. Duration: 0.96 2023-10-04 18:47:25,403 WARNING [train.py:1204] (2/4) Exclude cut with ID _46067_210_4220_1_1534125571066_3307450_142-229375-0_sp1.1 from training. Duration: 0.963625 2023-10-04 18:47:25,433 WARNING [train.py:1204] (2/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_310-26186-0_sp1.1 from training. Duration: 0.636375 2023-10-04 18:47:25,599 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.0.ff3_skip_rate, batch_count=1756293.3333333333, ans=0.0 2023-10-04 18:47:28,457 WARNING [train.py:1204] (2/4) Exclude cut with ID _28461_210_2253_1_1532771775189_3041299_136-270644-0 from training. Duration: 0.93 2023-10-04 18:47:29,864 WARNING [train.py:1204] (2/4) Exclude cut with ID _14860_210_4915_1_1530702020340_7343240_438-286372-0_sp1.1 from training. Duration: 0.936375 2023-10-04 18:47:34,468 WARNING [train.py:1204] (2/4) Exclude cut with ID IC0128W0004-63309 from training. Duration: 0.831 2023-10-04 18:47:34,649 WARNING [train.py:1204] (2/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_135-174080-0 from training. Duration: 0.53 2023-10-04 18:47:36,112 WARNING [train.py:1204] (2/4) Exclude cut with ID _41326_210_5854_1_1533778076509_3492080_277-59511-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 18:47:36,182 WARNING [train.py:1204] (2/4) Exclude cut with ID IC0144W0112-71379 from training. Duration: 0.992 2023-10-04 18:47:37,505 WARNING [train.py:1204] (2/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_711-15688-0_sp0.9 from training. Duration: 0.47775 2023-10-04 18:47:40,206 WARNING [train.py:1204] (2/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_237-332027-0 from training. Duration: 0.95 2023-10-04 18:47:40,255 WARNING [train.py:1204] (2/4) Exclude cut with ID _7970_210_1883_1_1528455616296_3472230_25-178407-0 from training. Duration: 0.71 2023-10-04 18:47:40,257 WARNING [train.py:1204] (2/4) Exclude cut with ID _8480_210_1874_1_1528589391205_6728330_309-139896-0 from training. Duration: 0.81 2023-10-04 18:47:40,281 WARNING [train.py:1204] (2/4) Exclude cut with ID _39571_210_13449_1_1533801584143_3651077_319-268877-0_sp1.1 from training. Duration: 0.936375 2023-10-04 18:47:40,284 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_43802_210_2290_1_1533516203_8829504_155-246880-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 18:47:41,697 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_566-122599-0_sp1.1 from training. Duration: 0.936375 2023-10-04 18:47:43,031 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_12868_210_6753_1_1530701932_530275_18-76322-0 from training. Duration: 0.87 2023-10-04 18:47:43,180 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.ff2_skip_rate, batch_count=1756360.0, ans=0.0 2023-10-04 18:47:44,428 WARNING [train.py:1204] (2/4) Exclude cut with ID _56494_210_4220_1_1535284837625_3033169_159-106035-0_sp1.1 from training. Duration: 0.963625 2023-10-04 18:47:46,134 WARNING [train.py:1204] (2/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_98-276013-0_sp1.1 from training. Duration: 0.963625 2023-10-04 18:47:46,207 WARNING [train.py:1204] (2/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_330-216124-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 18:47:47,641 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_177-58005-0_sp0.9 from training. Duration: 0.688875 2023-10-04 18:47:52,275 WARNING [train.py:1204] (2/4) Exclude cut with ID _56235_210_10640_1_1534557335235_4161099_343-139898-0 from training. Duration: 0.96 2023-10-04 18:47:52,344 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6961_210_2087_1_1527937168_6815011_395-185060-0_sp1.1 from training. Duration: 0.863625 2023-10-04 18:47:53,791 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7002_210_1794_1_1527939683_5157286_184-229630-0_sp0.9 from training. Duration: 0.82225 2023-10-04 18:47:53,831 WARNING [train.py:1204] (2/4) Exclude cut with ID _59170_210_6721_1_1534983633960_4203335_153-18895-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 18:47:55,209 WARNING [train.py:1204] (2/4) Exclude cut with ID _49643_210_7989_1_1534640244224_3939031_598-161926-0 from training. Duration: 0.9 2023-10-04 18:47:55,572 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.nonlin_attention.balancer.prob, batch_count=1756426.6666666667, ans=0.125 2023-10-04 18:47:56,752 WARNING [train.py:1204] (2/4) Exclude cut with ID _4068_210_2629_1_1526697350395_1546259_224-288693-0 from training. Duration: 0.59 2023-10-04 18:47:58,675 WARNING [train.py:1204] (2/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_718-68505-0_sp1.1 from training. Duration: 0.8090625 2023-10-04 18:47:58,731 WARNING [train.py:1204] (2/4) Exclude cut with ID _4068_210_2629_1_1526697350395_1546259_224-288693-0_sp0.9 from training. Duration: 0.6555625 2023-10-04 18:47:58,749 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_2296_210_2087_1_1525503448_3397335_57-185430-0_sp0.9 from training. Duration: 0.6666875 2023-10-04 18:48:00,497 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.5.encoder.layers.0.conv_module2.whiten, num_groups=1, num_channels=256, metric=4.83 vs. limit=15.0 2023-10-04 18:48:01,512 WARNING [train.py:1204] (2/4) Exclude cut with ID _15458_210_8341_1_1530858614263_3834410_49-235472-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 18:48:01,514 WARNING [train.py:1204] (2/4) Exclude cut with ID _64135_210_2253_1_1535677267876_7608972_498-5039-0_sp0.9 from training. Duration: 0.8666875 2023-10-04 18:48:01,620 WARNING [train.py:1204] (2/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_283-220372-0_sp0.9 from training. Duration: 0.62225 2023-10-04 18:48:01,631 WARNING [train.py:1204] (2/4) Exclude cut with ID _6473_210_1741_1_1527901307396_3815380_108-17335-0_sp0.9 from training. Duration: 0.72225 2023-10-04 18:48:03,407 WARNING [train.py:1204] (2/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_113-92608-0 from training. Duration: 0.54 2023-10-04 18:48:03,459 WARNING [train.py:1204] (2/4) Exclude cut with ID _14170_210_5219_1_1530525949868_4469650_272-249985-0_sp1.1 from training. Duration: 0.6090625 2023-10-04 18:48:03,468 WARNING [train.py:1204] (2/4) Exclude cut with ID _50376_210_13401_1_1534031502101_4217740_518-187390-0_sp1.1 from training. Duration: 0.97275 2023-10-04 18:48:04,840 WARNING [train.py:1204] (2/4) Exclude cut with ID _59738_210_15379_1_1534849512562_3941470_165-248260-0_sp1.1 from training. Duration: 0.92725 2023-10-04 18:48:04,850 WARNING [train.py:1204] (2/4) Exclude cut with ID _23818_210_6228_1_1531917063884_3676920_400-343925-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 18:48:06,306 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6961_210_2087_1_1527937168_6815011_562-165518-0 from training. Duration: 0.77 2023-10-04 18:48:06,375 WARNING [train.py:1204] (2/4) Exclude cut with ID _24271_210_12658_1_1531987270022_7190519_289-207931-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 18:48:07,811 WARNING [train.py:1204] (2/4) Exclude cut with ID _14632_210_9988_1_1530691279392_3923200_332-105904-0 from training. Duration: 0.9 2023-10-04 18:48:07,822 WARNING [train.py:1204] (2/4) Exclude cut with ID _40402_210_18219_1_1533522324115_2953150_203-232438-0_sp1.1 from training. Duration: 0.97275 2023-10-04 18:48:09,207 WARNING [train.py:1204] (2/4) Exclude cut with ID _13252_210_1925_1_1530671372047_3336909_263-168388-0 from training. Duration: 0.86 2023-10-04 18:48:09,434 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.self_attn_weights.pos_emb_skip_rate, batch_count=1756493.3333333333, ans=0.0 2023-10-04 18:48:10,429 WARNING [train.py:1204] (2/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_174-205058-0 from training. Duration: 0.95 2023-10-04 18:48:10,662 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.self_attn_weights.pos_emb_skip_rate, batch_count=1756493.3333333333, ans=0.0 2023-10-04 18:48:11,839 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_94-195801-0_sp1.1 from training. Duration: 0.8545625 2023-10-04 18:48:11,861 WARNING [train.py:1204] (2/4) Exclude cut with ID _55153_210_2253_1_1534384786677_3162096_269-103204-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 18:48:12,004 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.balancer2.prob, batch_count=1756493.3333333333, ans=0.125 2023-10-04 18:48:13,661 WARNING [train.py:1204] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_748-36150-0 from training. Duration: 0.91 2023-10-04 18:48:13,731 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_12868_210_6753_1_1530702479_3610564_148-172861-0_sp1.1 from training. Duration: 0.4545625 2023-10-04 18:48:15,123 WARNING [train.py:1204] (2/4) Exclude cut with ID _67499_210_17282_1_1535509805220_3692430_460-77730-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 18:48:19,758 WARNING [train.py:1204] (2/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_374-20753-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 18:48:21,105 WARNING [train.py:1204] (2/4) Exclude cut with ID _45329_210_7005_1_1534157951211_6774339_374-289983-0_sp1.1 from training. Duration: 0.97275 2023-10-04 18:48:21,143 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_34886_210_10633_1_1533203882_1755661_76-156096-0_sp0.9 from training. Duration: 0.9666875 2023-10-04 18:48:25,406 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_238-256668-0_sp0.9 from training. Duration: 0.7666875 2023-10-04 18:48:25,585 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.1.feed_forward1.out_proj.dropout_p, batch_count=1756560.0, ans=0.1 2023-10-04 18:48:27,334 WARNING [train.py:1204] (2/4) Exclude cut with ID _46683_210_9642_1_1533730913492_4591116_489-146727-0_sp1.1 from training. Duration: 0.97275 2023-10-04 18:48:28,553 WARNING [train.py:1204] (2/4) Exclude cut with ID _13938_210_9988_1_1530518718978_3693960_304-112784-0_sp0.9 from training. Duration: 0.511125 2023-10-04 18:48:33,319 WARNING [train.py:1204] (2/4) Exclude cut with ID _24271_210_12658_1_1531987270022_7190519_243-46549-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 18:48:33,324 WARNING [train.py:1204] (2/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_78-182912-0_sp1.1 from training. Duration: 0.536375 2023-10-04 18:48:34,623 INFO [train.py:1046] (2/4) Epoch 50, batch 3200, loss[loss=0.1451, simple_loss=0.2181, pruned_loss=0.03608, over 23862.00 frames. ], tot_loss[loss=0.1512, simple_loss=0.2318, pruned_loss=0.03525, over 4708267.69 frames. ], batch size: 212, lr: 2.03e-03, grad_scale: 16.0 2023-10-04 18:48:36,172 WARNING [train.py:1204] (2/4) Exclude cut with ID _57572_210_5517_1_1534921219624_3809484_121-321334-0_sp1.1 from training. Duration: 0.97275 2023-10-04 18:48:36,322 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.0.nonlin_attention.balancer.max_positive, batch_count=1756626.6666666667, ans=0.95 2023-10-04 18:48:38,918 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_40047_210_2290_1_1533290519_2374311_254-102149-0_sp1.1 from training. Duration: 0.963625 2023-10-04 18:48:38,919 WARNING [train.py:1204] (2/4) Exclude cut with ID _28461_210_2253_1_1532771775189_3041299_96-152158-0 from training. Duration: 0.85 2023-10-04 18:48:40,383 WARNING [train.py:1204] (2/4) Exclude cut with ID _29877_210_6228_1_1532397791106_5276149_161-102979-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 18:48:43,339 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_3787_210_2040_1_1526477544_5478586_603-120324-0_sp0.9 from training. Duration: 0.9 2023-10-04 18:48:49,359 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_41750_210_1960_1_1533520678_3948042_247-213456-0_sp1.1 from training. Duration: 0.97275 2023-10-04 18:48:54,866 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.736e+02 2.049e+02 2.254e+02 2.662e+02 4.145e+02, threshold=4.508e+02, percent-clipped=0.0 2023-10-04 18:48:56,991 WARNING [train.py:1204] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_127-119109-0_sp0.9 from training. Duration: 0.8 2023-10-04 18:48:59,098 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.2.encoder.layers.1.nonlin_attention.whiten2, num_groups=1, num_channels=384, metric=4.45 vs. limit=15.0 2023-10-04 18:49:06,013 WARNING [train.py:1204] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_215-202973-0 from training. Duration: 0.88 2023-10-04 18:49:06,095 WARNING [train.py:1204] (2/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_17-320491-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 18:49:09,870 WARNING [train.py:1204] (2/4) Exclude cut with ID _5166_210_2084_1_1527073197160_4087993_310-114452-0 from training. Duration: 0.59 2023-10-04 18:49:11,251 WARNING [train.py:1204] (2/4) Exclude cut with ID _15133_210_6228_1_1530794512074_3080379_29-264458-0_sp0.9 from training. Duration: 0.6444375 2023-10-04 18:49:16,427 WARNING [train.py:1204] (2/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_159-281310-0_sp1.1 from training. Duration: 0.863625 2023-10-04 18:49:16,436 WARNING [train.py:1204] (2/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_195-167011-0_sp0.9 from training. Duration: 0.8333125 2023-10-04 18:49:16,503 WARNING [train.py:1204] (2/4) Exclude cut with ID _14087_210_2253_1_1530529224512_3014029_156-257607-0_sp1.1 from training. Duration: 0.7545625 2023-10-04 18:49:20,611 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_2295_210_2087_1_1525500993_2407863_116-238894-0 from training. Duration: 0.7 2023-10-04 18:49:21,926 WARNING [train.py:1204] (2/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_321-335475-0_sp0.9 from training. Duration: 0.5 2023-10-04 18:49:24,777 WARNING [train.py:1204] (2/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_188-114464-0 from training. Duration: 0.82 2023-10-04 18:49:28,257 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_2592_210_2290_1_1525697025_4882925_157-70512-0 from training. Duration: 0.91 2023-10-04 18:49:29,668 WARNING [train.py:1204] (2/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_138-64210-0_sp1.1 from training. Duration: 0.663625 2023-10-04 18:49:35,703 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_11305_210_1794_1_1529750428_7178148_651-95702-0_sp1.1 from training. Duration: 0.963625 2023-10-04 18:49:35,724 WARNING [train.py:1204] (2/4) Exclude cut with ID _14224_210_4381_1_1530529062037_4264010_379-184223-0_sp0.9 from training. Duration: 0.9444375 2023-10-04 18:49:36,959 WARNING [train.py:1204] (2/4) Exclude cut with ID _8394_210_6702_1_1528590504316_3540189_381-125325-0_sp1.1 from training. Duration: 0.963625 2023-10-04 18:49:37,020 WARNING [train.py:1204] (2/4) Exclude cut with ID IC0622W0021-307659 from training. Duration: 0.896 2023-10-04 18:49:37,022 WARNING [train.py:1204] (2/4) Exclude cut with ID _7624_210_1531_1_1528109802198_4933101_418-349834-0_sp1.1 from training. Duration: 0.5454375 2023-10-04 18:49:38,628 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.ff3_skip_rate, batch_count=1756893.3333333333, ans=0.0 2023-10-04 18:49:39,841 WARNING [train.py:1204] (2/4) Exclude cut with ID _49728_210_12651_1_1534041034294_6788150_607-3416-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 18:49:41,197 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8275_210_4198_1_1528455320_3892571_491-17707-0 from training. Duration: 0.64 2023-10-04 18:49:41,247 WARNING [train.py:1204] (2/4) Exclude cut with ID _5287_210_2084_1_1527418883040_3878670_333-48508-0 from training. Duration: 0.6 2023-10-04 18:49:42,664 WARNING [train.py:1204] (2/4) Exclude cut with ID _8365_210_2064_1_1528592311070_4677690_371-122295-0 from training. Duration: 0.55 2023-10-04 18:49:45,814 WARNING [train.py:1204] (2/4) Exclude cut with ID _14860_210_4915_1_1530702020340_7343240_836-328489-0 from training. Duration: 0.9 2023-10-04 18:49:47,708 WARNING [train.py:1204] (2/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_1033-274376-0_sp0.9 from training. Duration: 0.9666875 2023-10-04 18:49:48,987 INFO [train.py:1046] (2/4) Epoch 50, batch 3250, loss[loss=0.1584, simple_loss=0.2292, pruned_loss=0.04378, over 23708.00 frames. ], tot_loss[loss=0.1513, simple_loss=0.2321, pruned_loss=0.03528, over 4703313.41 frames. ], batch size: 164, lr: 2.03e-03, grad_scale: 16.0 2023-10-04 18:49:50,441 WARNING [train.py:1204] (2/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_475-127455-0_sp0.9 from training. Duration: 0.77775 2023-10-04 18:49:50,448 WARNING [train.py:1204] (2/4) Exclude cut with ID IC0671W0071-331914 from training. Duration: 0.896 2023-10-04 18:49:50,475 WARNING [train.py:1204] (2/4) Exclude cut with ID _30327_210_16049_1_1532484161295_3924169_2-1523-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 18:49:50,477 WARNING [train.py:1204] (2/4) Exclude cut with ID _24314_210_4915_1_1532135078116_7170179_460-208279-0_sp1.1 from training. Duration: 0.97275 2023-10-04 18:49:53,246 WARNING [train.py:1204] (2/4) Exclude cut with ID IC0679W0052-335859 from training. Duration: 0.64 2023-10-04 18:49:57,425 WARNING [train.py:1204] (2/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_3-237434-0_sp1.1 from training. Duration: 0.6909375 2023-10-04 18:50:00,699 WARNING [train.py:1204] (2/4) Exclude cut with ID _69884_210_9771_1_1535846392818_4166169_405-185283-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 18:50:07,870 WARNING [train.py:1204] (2/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_927-83684-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 18:50:07,881 WARNING [train.py:1204] (2/4) Exclude cut with ID _6694_210_4599_1_1527852637490_3965529_356-169233-0 from training. Duration: 0.99 2023-10-04 18:50:07,962 WARNING [train.py:1204] (2/4) Exclude cut with ID _19896_210_1883_1_1531709022662_6649569_562-233001-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 18:50:09,728 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_20120_210_6753_1_1531643035_5482859_354-78629-0_sp1.1 from training. Duration: 0.963625 2023-10-04 18:50:09,729 WARNING [train.py:1204] (2/4) Exclude cut with ID _60135_210_13427_1_1534935557922_3651319_96-168198-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 18:50:09,853 WARNING [train.py:1204] (2/4) Exclude cut with ID _31602_210_5854_1_1532677021456_4458250_249-96604-0_sp1.1 from training. Duration: 0.8090625 2023-10-04 18:50:11,148 WARNING [train.py:1204] (2/4) Exclude cut with ID _14100_210_8341_1_1530529320613_3678069_258-224599-0_sp0.9 from training. Duration: 0.8555625 2023-10-04 18:50:13,879 WARNING [train.py:1204] (2/4) Exclude cut with ID _19708_210_9206_1_1532136650733_4238364_215-160135-0_sp1.1 from training. Duration: 0.97275 2023-10-04 18:50:13,901 WARNING [train.py:1204] (2/4) Exclude cut with ID _21878_210_3525_1_1531738772002_7264330_113-18163-0_sp0.9 from training. Duration: 0.988875 2023-10-04 18:50:13,958 WARNING [train.py:1204] (2/4) Exclude cut with ID _64135_210_2253_1_1535677267876_7608972_552-337340-0_sp1.1 from training. Duration: 0.92725 2023-10-04 18:50:15,173 WARNING [train.py:1204] (2/4) Exclude cut with ID _67725_210_27767_1_1535608756671_5105359_10-12887-0_sp1.1 from training. Duration: 0.97275 2023-10-04 18:50:15,179 WARNING [train.py:1204] (2/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_401-284840-0_sp1.1 from training. Duration: 0.97275 2023-10-04 18:50:15,209 WARNING [train.py:1204] (2/4) Exclude cut with ID _44659_210_2721_1_1534036120061_6614860_483-141285-0_sp1.1 from training. Duration: 0.936375 2023-10-04 18:50:19,735 WARNING [train.py:1204] (2/4) Exclude cut with ID _9461_210_4557_1_1529060514726_3677451_358-332734-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 18:50:21,096 WARNING [train.py:1204] (2/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_788-298907-0_sp1.1 from training. Duration: 0.8090625 2023-10-04 18:50:23,097 WARNING [train.py:1204] (2/4) Exclude cut with ID _39411_210_5986_1_1533538825030_3343649_354-267196-0_sp1.1 from training. Duration: 0.92725 2023-10-04 18:50:23,134 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_14953_210_2151_1_1530701710_5616165_192-309792-0_sp1.1 from training. Duration: 0.97275 2023-10-04 18:50:24,597 WARNING [train.py:1204] (2/4) Exclude cut with ID _59170_210_6721_1_1534983633960_4203335_290-151255-0_sp1.1 from training. Duration: 0.92725 2023-10-04 18:50:24,628 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_5575_210_1410_1_1527301305_2570170_249-93769-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 18:50:24,636 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_46013_210_7234_1_1533776362_3744032_463-52378-0_sp1.1 from training. Duration: 0.7818125 2023-10-04 18:50:26,245 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.feed_forward1.hidden_balancer.prob, batch_count=1757093.3333333333, ans=0.125 2023-10-04 18:50:28,910 WARNING [train.py:1204] (2/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_580-93798-0 from training. Duration: 0.56 2023-10-04 18:50:28,948 WARNING [train.py:1204] (2/4) Exclude cut with ID _19514_210_9259_1_1531530071330_5084230_385-13762-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 18:50:28,953 WARNING [train.py:1204] (2/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_458-67321-0_sp1.1 from training. Duration: 0.8818125 2023-10-04 18:50:30,737 WARNING [train.py:1204] (2/4) Exclude cut with ID _41298_210_9019_1_1533430371450_4822143_188-154791-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 18:50:30,796 WARNING [train.py:1204] (2/4) Exclude cut with ID _15037_210_2253_1_1530752294104_3267650_296-192597-0_sp0.9 from training. Duration: 0.9 2023-10-04 18:50:31,054 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=1757093.3333333333, ans=0.1 2023-10-04 18:50:37,389 WARNING [train.py:1204] (2/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_661-264857-0_sp1.1 from training. Duration: 0.8454375 2023-10-04 18:50:44,944 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_34008_210_10431_1_1533345424_8183247_662-317331-0_sp1.1 from training. Duration: 0.8545625 2023-10-04 18:50:44,975 WARNING [train.py:1204] (2/4) Exclude cut with ID _46502_210_13794_1_1533724452198_3657159_241-278329-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 18:50:44,976 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_11349_210_6753_1_1529800861_1101207_26-65602-0 from training. Duration: 0.96 2023-10-04 18:50:44,980 WARNING [train.py:1204] (2/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_448-4048-0_sp1.1 from training. Duration: 0.87275 2023-10-04 18:50:44,987 WARNING [train.py:1204] (2/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_662-275621-0_sp1.1 from training. Duration: 0.3818125 2023-10-04 18:50:46,315 WARNING [train.py:1204] (2/4) Exclude cut with ID _13938_210_9988_1_1530518718978_3693960_256-54454-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 18:50:49,641 WARNING [train.py:1204] (2/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_256-32662-0 from training. Duration: 0.9 2023-10-04 18:50:49,684 WARNING [train.py:1204] (2/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_326-17061-0 from training. Duration: 0.94 2023-10-04 18:50:49,881 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.conv_skip_rate, batch_count=1757226.6666666667, ans=0.0 2023-10-04 18:50:51,416 WARNING [train.py:1204] (2/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_505-97987-0_sp1.1 from training. Duration: 0.7818125 2023-10-04 18:50:52,798 WARNING [train.py:1204] (2/4) Exclude cut with ID _66873_210_5517_1_1535454136506_4293370_316-294364-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 18:50:52,860 WARNING [train.py:1204] (2/4) Exclude cut with ID _19514_210_9259_1_1531530071330_5084230_762-231063-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 18:50:52,912 WARNING [train.py:1204] (2/4) Exclude cut with ID _4697_210_2069_1_1526815801953_3823098_68-219807-0_sp0.9 from training. Duration: 0.52225 2023-10-04 18:50:54,242 WARNING [train.py:1204] (2/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_479-158053-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 18:50:56,999 WARNING [train.py:1204] (2/4) Exclude cut with ID _66112_210_10635_1_1535590236305_3801484_83-38181-0_sp1.1 from training. Duration: 0.936375 2023-10-04 18:50:57,015 WARNING [train.py:1204] (2/4) Exclude cut with ID _55857_210_2346_1_1534644007831_6898209_255-146346-0_sp1.1 from training. Duration: 0.963625 2023-10-04 18:50:58,411 WARNING [train.py:1204] (2/4) Exclude cut with ID _48836_210_12210_1_1534075848293_3904150_429-35475-0 from training. Duration: 0.89 2023-10-04 18:50:58,423 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_50154_210_13731_1_1534750862_1080961_119-187163-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 18:50:58,668 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.conv_skip_rate, batch_count=1757226.6666666667, ans=0.0 2023-10-04 18:50:59,848 WARNING [train.py:1204] (2/4) Exclude cut with ID _8793_210_6310_1_1528542985139_2310009_121-179782-0_sp0.9 from training. Duration: 0.888875 2023-10-04 18:50:59,851 WARNING [train.py:1204] (2/4) Exclude cut with ID _14884_210_2252_1_1530707459689_5561250_591-173406-0 from training. Duration: 0.96 2023-10-04 18:51:00,037 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=1757226.6666666667, ans=0.1 2023-10-04 18:51:02,931 INFO [train.py:1046] (2/4) Epoch 50, batch 3300, loss[loss=0.1468, simple_loss=0.2199, pruned_loss=0.03688, over 23783.00 frames. ], tot_loss[loss=0.1519, simple_loss=0.2332, pruned_loss=0.03534, over 4713678.68 frames. ], batch size: 164, lr: 2.03e-03, grad_scale: 16.0 2023-10-04 18:51:03,065 WARNING [train.py:1204] (2/4) Exclude cut with ID _15133_210_6228_1_1530794512074_3080379_54-198193-0_sp1.1 from training. Duration: 0.8545625 2023-10-04 18:51:03,073 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6545_210_2040_1_1527774670_4245258_407-48070-0 from training. Duration: 0.76 2023-10-04 18:51:05,796 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1032-236863-0 from training. Duration: 0.84 2023-10-04 18:51:05,865 WARNING [train.py:1204] (2/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_507-303370-0 from training. Duration: 0.96 2023-10-04 18:51:05,886 WARNING [train.py:1204] (2/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_164-282366-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 18:51:10,564 WARNING [train.py:1204] (2/4) Exclude cut with ID _39867_210_9826_1_1533553222030_9344259_312-158727-0_sp1.1 from training. Duration: 0.963625 2023-10-04 18:51:12,117 WARNING [train.py:1204] (2/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_174-205058-0_sp1.1 from training. Duration: 0.863625 2023-10-04 18:51:12,140 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_11305_210_1794_1_1529750428_7178148_166-237358-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 18:51:12,360 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.balancer2.prob, batch_count=1757293.3333333333, ans=0.125 2023-10-04 18:51:13,600 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.conv_module1.balancer2.prob, batch_count=1757293.3333333333, ans=0.125 2023-10-04 18:51:14,757 WARNING [train.py:1204] (2/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_26-175553-0_sp1.1 from training. Duration: 0.5545625 2023-10-04 18:51:14,776 WARNING [train.py:1204] (2/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_96-290039-0_sp1.1 from training. Duration: 0.5090625 2023-10-04 18:51:16,868 WARNING [train.py:1204] (2/4) Exclude cut with ID _53219_210_4525_1_1534834836008_4064825_261-101691-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 18:51:18,318 WARNING [train.py:1204] (2/4) Exclude cut with ID _24271_210_12658_1_1531987270022_7190519_96-293390-0_sp1.1 from training. Duration: 0.936375 2023-10-04 18:51:24,177 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.724e+02 2.069e+02 2.302e+02 2.665e+02 3.368e+02, threshold=4.603e+02, percent-clipped=0.0 2023-10-04 18:51:24,306 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_271-234962-0 from training. Duration: 0.79 2023-10-04 18:51:25,546 WARNING [train.py:1204] (2/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_136-30660-0_sp1.1 from training. Duration: 0.97275 2023-10-04 18:51:25,572 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_20120_210_6753_1_1531643035_5482859_625-281046-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 18:51:27,060 WARNING [train.py:1204] (2/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_333-168193-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 18:51:28,426 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9029W0269-509664 from training. Duration: 0.896 2023-10-04 18:51:29,809 WARNING [train.py:1204] (2/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_450-147393-0_sp1.1 from training. Duration: 0.92725 2023-10-04 18:51:29,863 WARNING [train.py:1204] (2/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_643-52007-0_sp1.1 from training. Duration: 0.5181875 2023-10-04 18:51:31,232 WARNING [train.py:1204] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_330-327861-0_sp0.9 from training. Duration: 0.7444375 2023-10-04 18:51:31,241 WARNING [train.py:1204] (2/4) Exclude cut with ID _47790_210_16187_1_1533862814066_4207519_260-317651-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 18:51:31,276 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9044W0010-516654 from training. Duration: 0.896 2023-10-04 18:51:33,352 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.3.encoder.layers.2.feed_forward3.out_whiten, num_groups=1, num_channels=512, metric=7.31 vs. limit=15.0 2023-10-04 18:51:35,419 WARNING [train.py:1204] (2/4) Exclude cut with ID _14087_210_2253_1_1530529224512_3014029_265-118378-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 18:51:35,425 WARNING [train.py:1204] (2/4) Exclude cut with ID _6194_210_1534_1_1527511826160_3941194_321-148641-0_sp0.9 from training. Duration: 0.711125 2023-10-04 18:51:37,472 WARNING [train.py:1204] (2/4) Exclude cut with ID _54871_210_22356_1_1534665798253_3856359_101-169176-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 18:51:37,475 WARNING [train.py:1204] (2/4) Exclude cut with ID 497-129325-0061-62254-0_sp1.1 from training. Duration: 0.97725 2023-10-04 18:51:40,138 WARNING [train.py:1204] (2/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_795-33252-0_sp1.1 from training. Duration: 0.336375 2023-10-04 18:51:40,151 WARNING [train.py:1204] (2/4) Exclude cut with ID _28317_210_12742_1_1532480302381_8510325_168-246176-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 18:51:40,246 WARNING [train.py:1204] (2/4) Exclude cut with ID _7160_210_1876_1_1527940923231_3703799_120-150943-0_sp1.1 from training. Duration: 0.736375 2023-10-04 18:51:41,764 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9084W0005-536844 from training. Duration: 0.768 2023-10-04 18:51:44,919 WARNING [train.py:1204] (2/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_234-260682-0 from training. Duration: 0.84 2023-10-04 18:51:44,951 WARNING [train.py:1204] (2/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_188-114464-0_sp0.9 from training. Duration: 0.911125 2023-10-04 18:51:47,782 WARNING [train.py:1204] (2/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_81-75849-0 from training. Duration: 0.39 2023-10-04 18:51:51,177 WARNING [train.py:1204] (2/4) Exclude cut with ID _14224_210_4381_1_1530529062037_4264010_379-184223-0_sp1.1 from training. Duration: 0.77275 2023-10-04 18:51:52,638 WARNING [train.py:1204] (2/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_454-123002-0_sp1.1 from training. Duration: 0.62725 2023-10-04 18:51:52,688 WARNING [train.py:1204] (2/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_887-249555-0_sp1.1 from training. Duration: 0.7 2023-10-04 18:51:55,339 WARNING [train.py:1204] (2/4) Exclude cut with ID _31088_210_16446_1_1532520012284_3440919_20-260902-0_sp1.1 from training. Duration: 0.97275 2023-10-04 18:51:55,388 WARNING [train.py:1204] (2/4) Exclude cut with ID _45329_210_7005_1_1534157951211_6774339_856-344574-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 18:51:55,390 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_36756_210_7234_1_1533026991_966276_4-164118-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 18:51:57,101 WARNING [train.py:1204] (2/4) Exclude cut with ID _8365_210_2064_1_1528592311070_4677690_371-122295-0_sp1.1 from training. Duration: 0.5 2023-10-04 18:51:58,564 WARNING [train.py:1204] (2/4) Exclude cut with ID _5910_210_1389_1_1527337972454_5050100_555-159815-0_sp1.1 from training. Duration: 0.8909375 2023-10-04 18:51:58,582 WARNING [train.py:1204] (2/4) Exclude cut with ID _37086_210_9038_1_1532999540891_7021250_469-147941-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 18:51:58,853 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.ff3_skip_rate, batch_count=1757493.3333333333, ans=0.0 2023-10-04 18:51:59,880 WARNING [train.py:1204] (2/4) Exclude cut with ID _3067_210_2667_1_1526212974640_4503168_459-173867-0_sp0.9 from training. Duration: 0.97775 2023-10-04 18:51:59,994 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9156W0006-572499 from training. Duration: 0.896 2023-10-04 18:52:01,379 WARNING [train.py:1204] (2/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_277-210798-0 from training. Duration: 0.55 2023-10-04 18:52:03,401 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.2.encoder.layers.0.feed_forward2.out_whiten, num_groups=1, num_channels=384, metric=7.20 vs. limit=15.0 2023-10-04 18:52:04,031 WARNING [train.py:1204] (2/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_103-336064-0_sp1.1 from training. Duration: 0.463625 2023-10-04 18:52:04,079 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_3414_210_1988_1_1526212718_4813195_331-290945-0_sp1.1 from training. Duration: 0.936375 2023-10-04 18:52:04,081 WARNING [train.py:1204] (2/4) Exclude cut with ID _67499_210_17282_1_1535509805220_3692430_365-29276-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 18:52:06,691 WARNING [train.py:1204] (2/4) Exclude cut with ID _14821_210_8750_1_1530680503991_6829359_69-250847-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 18:52:06,693 WARNING [train.py:1204] (2/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_1102-61301-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 18:52:10,143 WARNING [train.py:1204] (2/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_166-246557-0_sp1.1 from training. Duration: 0.5818125 2023-10-04 18:52:11,482 WARNING [train.py:1204] (2/4) Exclude cut with ID _15351_210_10626_1_1530856635653_3576210_214-129918-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 18:52:11,495 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_120-41668-0_sp0.9 from training. Duration: 0.77775 2023-10-04 18:52:11,559 WARNING [train.py:1204] (2/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_27-72278-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 18:52:12,991 WARNING [train.py:1204] (2/4) Exclude cut with ID _24314_210_4915_1_1532135078116_7170179_521-258868-0_sp1.1 from training. Duration: 0.7181875 2023-10-04 18:52:14,436 WARNING [train.py:1204] (2/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_143-86344-0 from training. Duration: 0.77 2023-10-04 18:52:15,756 WARNING [train.py:1204] (2/4) Exclude cut with ID _53489_210_6228_1_1534298357169_3835970_465-157433-0_sp1.1 from training. Duration: 0.92725 2023-10-04 18:52:15,840 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_248-319353-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 18:52:17,876 INFO [train.py:1046] (2/4) Epoch 50, batch 3350, loss[loss=0.1592, simple_loss=0.2403, pruned_loss=0.03905, over 23380.00 frames. ], tot_loss[loss=0.1522, simple_loss=0.2335, pruned_loss=0.03539, over 4712698.68 frames. ], batch size: 93, lr: 2.03e-03, grad_scale: 8.0 2023-10-04 18:52:17,931 WARNING [train.py:1204] (2/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_102-77064-0_sp1.1 from training. Duration: 0.8454375 2023-10-04 18:52:17,964 WARNING [train.py:1204] (2/4) Exclude cut with ID _9389_210_1664_1_1529217337301_5289269_379-128794-0_sp1.1 from training. Duration: 0.77275 2023-10-04 18:52:19,351 WARNING [train.py:1204] (2/4) Exclude cut with ID _3231_210_2106_1_1526205864934_6784220_236-260835-0_sp1.1 from training. Duration: 0.97275 2023-10-04 18:52:20,782 WARNING [train.py:1204] (2/4) Exclude cut with ID _24271_210_12658_1_1531987270022_7190519_184-342363-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 18:52:20,783 WARNING [train.py:1204] (2/4) Exclude cut with ID _27894_210_12392_1_1532415496078_6902439_67-221663-0_sp1.1 from training. Duration: 0.92725 2023-10-04 18:52:24,099 WARNING [train.py:1204] (2/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_304-59227-0_sp1.1 from training. Duration: 0.7 2023-10-04 18:52:27,334 WARNING [train.py:1204] (2/4) Exclude cut with ID _58605_210_12210_1_1534723195939_3904259_458-42232-0_sp1.1 from training. Duration: 0.92725 2023-10-04 18:52:28,664 WARNING [train.py:1204] (2/4) Exclude cut with ID _14922_210_6228_1_1530707435273_3693310_13-165512-0_sp0.9 from training. Duration: 0.911125 2023-10-04 18:52:30,178 WARNING [train.py:1204] (2/4) Exclude cut with ID _58602_210_2346_1_1534903210085_6908830_792-102241-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 18:52:31,548 WARNING [train.py:1204] (2/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_588-125165-0_sp0.9 from training. Duration: 0.92225 2023-10-04 18:52:32,903 WARNING [train.py:1204] (2/4) Exclude cut with ID _54218_210_12210_1_1534413646388_4409259_239-98429-0_sp1.1 from training. Duration: 0.97275 2023-10-04 18:52:34,254 WARNING [train.py:1204] (2/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_538-32291-0_sp1.1 from training. Duration: 0.7545625 2023-10-04 18:52:35,645 WARNING [train.py:1204] (2/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_419-317062-0 from training. Duration: 0.91 2023-10-04 18:52:37,037 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9309W0202-640329 from training. Duration: 0.896 2023-10-04 18:52:37,081 WARNING [train.py:1204] (2/4) Exclude cut with ID _3231_210_2106_1_1526205864934_6784220_419-313291-0_sp1.1 from training. Duration: 0.97275 2023-10-04 18:52:40,568 INFO [scaling.py:1118] (2/4) WithLoss: name=encoder.encoders.5.encoder.layers.0.self_attn_weights, loss-sum=0.000e+00 2023-10-04 18:52:41,672 WARNING [train.py:1204] (2/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_619-344499-0 from training. Duration: 0.63 2023-10-04 18:52:41,684 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_278-272612-0 from training. Duration: 0.73 2023-10-04 18:52:43,040 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_103-84010-0_sp0.9 from training. Duration: 0.7666875 2023-10-04 18:52:43,061 WARNING [train.py:1204] (2/4) Exclude cut with ID _26528_210_9070_1_1532570410842_3851310_497-97218-0_sp1.1 from training. Duration: 0.7818125 2023-10-04 18:52:43,156 WARNING [train.py:1204] (2/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_262-84613-0_sp1.1 from training. Duration: 0.936375 2023-10-04 18:52:44,521 WARNING [train.py:1204] (2/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_387-131538-0 from training. Duration: 0.81 2023-10-04 18:52:44,549 WARNING [train.py:1204] (2/4) Exclude cut with ID _8030_210_7497_1_1528444886781_3531089_224-300489-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 18:52:44,572 WARNING [train.py:1204] (2/4) Exclude cut with ID _14349_210_8341_1_1530615710818_3442470_170-336541-0_sp0.9 from training. Duration: 0.9555625 2023-10-04 18:52:47,268 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_47069_210_15368_1_1533781267_4431320_180-31746-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 18:52:49,240 WARNING [train.py:1204] (2/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_447-7041-0_sp1.1 from training. Duration: 0.92725 2023-10-04 18:52:50,580 WARNING [train.py:1204] (2/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_2-55283-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 18:52:50,648 WARNING [train.py:1204] (2/4) Exclude cut with ID _12555_210_8750_1_1530075594402_3780239_70-181911-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 18:52:53,502 WARNING [train.py:1204] (2/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_311-328022-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 18:52:55,505 WARNING [train.py:1204] (2/4) Exclude cut with ID _14528_210_8254_1_1530669700514_4306939_330-2987-0_sp1.1 from training. Duration: 0.92725 2023-10-04 18:52:55,636 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.1.balancer2.prob, batch_count=1757760.0, ans=0.125 2023-10-04 18:52:56,862 WARNING [train.py:1204] (2/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_385-184858-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 18:53:01,421 WARNING [train.py:1204] (2/4) Exclude cut with ID _64414_210_17301_1_1535243252048_3757759_348-333360-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 18:53:01,486 WARNING [train.py:1204] (2/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_753-214751-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 18:53:04,200 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_17613_210_2151_1_1531137569_2366160_202-208013-0_sp1.1 from training. Duration: 0.92725 2023-10-04 18:53:04,208 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_43831_210_2158_1_1533725502_5872791_610-129300-0_sp1.1 from training. Duration: 0.936375 2023-10-04 18:53:05,699 WARNING [train.py:1204] (2/4) Exclude cut with ID _54218_210_12210_1_1534413646388_4409259_321-23852-0_sp1.1 from training. Duration: 0.936375 2023-10-04 18:53:08,384 WARNING [train.py:1204] (2/4) Exclude cut with ID _39411_210_5986_1_1533538825030_3343649_173-238248-0 from training. Duration: 0.83 2023-10-04 18:53:08,391 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_405-339986-0_sp0.9 from training. Duration: 0.5666875 2023-10-04 18:53:08,423 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_2009_210_2071_1_1525481134_4588937_385-302100-0 from training. Duration: 0.87 2023-10-04 18:53:08,454 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_161-276996-0_sp1.1 from training. Duration: 0.863625 2023-10-04 18:53:09,852 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_177-58005-0 from training. Duration: 0.62 2023-10-04 18:53:11,611 WARNING [train.py:1204] (2/4) Exclude cut with ID _64639_210_5816_1_1535367619437_3528000_146-343734-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 18:53:12,988 WARNING [train.py:1204] (2/4) Exclude cut with ID _3845_210_1390_1_1526731326141_3523610_72-61441-0_sp1.1 from training. Duration: 0.92725 2023-10-04 18:53:20,674 WARNING [train.py:1204] (2/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_991-125028-0_sp1.1 from training. Duration: 0.936375 2023-10-04 18:53:20,744 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7544_210_6753_1_1528591327_1471426_172-348777-0 from training. Duration: 0.66 2023-10-04 18:53:20,788 WARNING [train.py:1204] (2/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_416-257730-0_sp1.1 from training. Duration: 0.5909375 2023-10-04 18:53:22,095 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1113-7369-0_sp1.1 from training. Duration: 0.77275 2023-10-04 18:53:23,405 WARNING [train.py:1204] (2/4) Exclude cut with ID _40402_210_18219_1_1533522324115_2953150_242-76943-0_sp1.1 from training. Duration: 0.8545625 2023-10-04 18:53:27,981 WARNING [train.py:1204] (2/4) Exclude cut with ID _44718_210_5816_1_1534050051869_6469030_190-49648-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 18:53:29,919 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.3.encoder.layers.1.self_attn2.whiten, num_groups=1, num_channels=512, metric=12.98 vs. limit=22.5 2023-10-04 18:53:31,251 WARNING [train.py:1204] (2/4) Exclude cut with ID _47251_210_18628_1_1533814063128_4137250_214-88076-0 from training. Duration: 0.88 2023-10-04 18:53:32,603 INFO [train.py:1046] (2/4) Epoch 50, batch 3400, loss[loss=0.1404, simple_loss=0.2208, pruned_loss=0.02995, over 23523.00 frames. ], tot_loss[loss=0.1532, simple_loss=0.2348, pruned_loss=0.03583, over 4718403.61 frames. ], batch size: 149, lr: 2.03e-03, grad_scale: 8.0 2023-10-04 18:53:32,650 WARNING [train.py:1204] (2/4) Exclude cut with ID _5585_210_2667_1_1527418813324_8007340_89-332072-0_sp1.1 from training. Duration: 0.5818125 2023-10-04 18:53:32,675 WARNING [train.py:1204] (2/4) Exclude cut with ID _41817_210_18835_1_1533434477160_7423049_555-293448-0_sp1.1 from training. Duration: 0.72725 2023-10-04 18:53:32,797 WARNING [train.py:1204] (2/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_286-331314-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 18:53:34,140 WARNING [train.py:1204] (2/4) Exclude cut with ID _13921_210_9994_1_1530838702800_3003429_46-149444-0 from training. Duration: 0.94 2023-10-04 18:53:34,205 WARNING [train.py:1204] (2/4) Exclude cut with ID _44778_210_2054_1_1533798127203_7443090_768-43595-0_sp1.1 from training. Duration: 0.936375 2023-10-04 18:53:34,220 WARNING [train.py:1204] (2/4) Exclude cut with ID _35355_210_16040_1_1533212988977_4016229_74-190076-0 from training. Duration: 0.8 2023-10-04 18:53:35,726 WARNING [train.py:1204] (2/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_1007-316380-0_sp1.1 from training. Duration: 0.963625 2023-10-04 18:53:37,126 WARNING [train.py:1204] (2/4) Exclude cut with ID _23557_210_8750_1_1531890106370_6846255_150-283437-0_sp1.1 from training. Duration: 0.963625 2023-10-04 18:53:37,165 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_3474_210_2028_1_1526188513_4825246_145-309131-0_sp1.1 from training. Duration: 0.67275 2023-10-04 18:53:38,567 WARNING [train.py:1204] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_391-22148-0_sp1.1 from training. Duration: 0.7 2023-10-04 18:53:38,589 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_238-256668-0 from training. Duration: 0.69 2023-10-04 18:53:43,190 WARNING [train.py:1204] (2/4) Exclude cut with ID _8394_210_6702_1_1528590504316_3540189_372-198878-0 from training. Duration: 0.6 2023-10-04 18:53:44,502 WARNING [train.py:1204] (2/4) Exclude cut with ID ID0174W0006-766659 from training. Duration: 0.953 2023-10-04 18:53:44,519 WARNING [train.py:1204] (2/4) Exclude cut with ID _6274_210_2084_1_1527937583837_7494020_668-23019-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 18:53:48,834 WARNING [train.py:1204] (2/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_322-74328-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 18:53:48,840 WARNING [train.py:1204] (2/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_872-264525-0_sp1.1 from training. Duration: 0.5909375 2023-10-04 18:53:50,651 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_53378_210_17282_1_1534221071_5769509_641-286201-0_sp1.1 from training. Duration: 0.97275 2023-10-04 18:53:50,728 WARNING [train.py:1204] (2/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_199-295611-0_sp1.1 from training. Duration: 0.5 2023-10-04 18:53:51,019 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=1758026.6666666667, ans=0.1 2023-10-04 18:53:54,664 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.748e+02 2.083e+02 2.309e+02 2.717e+02 5.994e+02, threshold=4.619e+02, percent-clipped=1.0 2023-10-04 18:53:54,864 WARNING [train.py:1204] (2/4) Exclude cut with ID _5250_210_5219_1_1527249561806_8692110_1270-317971-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 18:53:56,694 WARNING [train.py:1204] (2/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_784-213099-0 from training. Duration: 0.93 2023-10-04 18:53:56,879 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=1758026.6666666667, ans=0.1 2023-10-04 18:54:01,400 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_115-233929-0_sp0.9 from training. Duration: 0.8 2023-10-04 18:54:04,105 WARNING [train.py:1204] (2/4) Exclude cut with ID _54871_210_22356_1_1534665798253_3856359_256-223809-0_sp1.1 from training. Duration: 0.97275 2023-10-04 18:54:05,415 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_20120_210_6753_1_1531643035_5482859_285-87781-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 18:54:06,737 WARNING [train.py:1204] (2/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_619-344499-0_sp1.1 from training. Duration: 0.57275 2023-10-04 18:54:10,890 WARNING [train.py:1204] (2/4) Exclude cut with ID _60229_210_27767_1_1534838406158_3655350_264-279573-0_sp0.9 from training. Duration: 0.92225 2023-10-04 18:54:15,468 WARNING [train.py:1204] (2/4) Exclude cut with ID _7719_210_3525_1_1528456153163_4296960_333-110399-0 from training. Duration: 0.96 2023-10-04 18:54:18,644 INFO [scaling.py:1118] (2/4) WithLoss: name=encoder.encoders.4.encoder.layers.1.self_attn_weights, loss-sum=0.000e+00 2023-10-04 18:54:18,882 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.4.encoder.layers.2.self_attn2.whiten, num_groups=1, num_channels=384, metric=11.87 vs. limit=22.5 2023-10-04 18:54:19,843 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_50423_210_13270_1_1534128598_5021734_759-90833-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 18:54:19,901 WARNING [train.py:1204] (2/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_151-254798-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 18:54:21,641 WARNING [train.py:1204] (2/4) Exclude cut with ID _3465_210_3525_1_1526175667060_3574049_450-203637-0 from training. Duration: 0.92 2023-10-04 18:54:21,681 WARNING [train.py:1204] (2/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_673-36456-0_sp1.1 from training. Duration: 0.936375 2023-10-04 18:54:23,000 WARNING [train.py:1204] (2/4) Exclude cut with ID _36538_210_9994_1_1533204020363_3795620_131-8685-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 18:54:23,045 WARNING [train.py:1204] (2/4) Exclude cut with ID _12556_210_8341_1_1530081024100_7327690_713-55626-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 18:54:24,368 WARNING [train.py:1204] (2/4) Exclude cut with ID _2322_210_2667_1_1525604548971_3656299_440-80494-0_sp0.9 from training. Duration: 0.97775 2023-10-04 18:54:26,415 WARNING [train.py:1204] (2/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_210-325081-0_sp1.1 from training. Duration: 0.97275 2023-10-04 18:54:30,514 WARNING [train.py:1204] (2/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_375-244976-0_sp0.9 from training. Duration: 0.8333125 2023-10-04 18:54:30,519 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_15517_210_6753_1_1530952293_1664795_119-31001-0_sp1.1 from training. Duration: 0.8545625 2023-10-04 18:54:35,409 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_41452_210_7291_1_1533437687_3941281_319-42326-0_sp1.1 from training. Duration: 0.92725 2023-10-04 18:54:36,788 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_372-173012-0 from training. Duration: 0.95 2023-10-04 18:54:40,888 WARNING [train.py:1204] (2/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_41-31157-0_sp1.1 from training. Duration: 0.5181875 2023-10-04 18:54:45,598 WARNING [train.py:1204] (2/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_91-212486-0 from training. Duration: 0.68 2023-10-04 18:54:46,855 INFO [train.py:1046] (2/4) Epoch 50, batch 3450, loss[loss=0.1614, simple_loss=0.229, pruned_loss=0.04693, over 19525.00 frames. ], tot_loss[loss=0.1535, simple_loss=0.2349, pruned_loss=0.03609, over 4706495.54 frames. ], batch size: 389, lr: 2.03e-03, grad_scale: 8.0 2023-10-04 18:54:48,421 WARNING [train.py:1204] (2/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_1267-272693-0 from training. Duration: 0.73 2023-10-04 18:54:50,058 WARNING [train.py:1204] (2/4) Exclude cut with ID _13938_210_9988_1_1530518718978_3693960_152-67046-0_sp1.1 from training. Duration: 0.936375 2023-10-04 18:54:51,485 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_4528_210_3273_1_1527076654_3276777_342-168068-0_sp1.1 from training. Duration: 0.7909375 2023-10-04 18:54:51,500 WARNING [train.py:1204] (2/4) Exclude cut with ID _7606_210_4915_1_1528894253277_5080690_127-177761-0 from training. Duration: 0.64 2023-10-04 18:54:52,702 WARNING [train.py:1204] (2/4) Exclude cut with ID _45329_210_7005_1_1534157951211_6774339_335-137972-0_sp1.1 from training. Duration: 0.92725 2023-10-04 18:54:55,567 WARNING [train.py:1204] (2/4) Exclude cut with ID _5166_210_2084_1_1527073197160_4087993_331-117829-0_sp1.1 from training. Duration: 0.563625 2023-10-04 18:55:02,177 WARNING [train.py:1204] (2/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_125-125818-0_sp1.1 from training. Duration: 0.87275 2023-10-04 18:55:02,241 WARNING [train.py:1204] (2/4) Exclude cut with ID _6789_210_1534_1_1527854471671_2870519_48-294897-0_sp1.1 from training. Duration: 0.963625 2023-10-04 18:55:03,647 WARNING [train.py:1204] (2/4) Exclude cut with ID _6694_210_4599_1_1527852637490_3965529_116-109398-0_sp1.1 from training. Duration: 0.663625 2023-10-04 18:55:03,656 WARNING [train.py:1204] (2/4) Exclude cut with ID _52892_210_6721_1_1534378941259_4124129_222-172685-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 18:55:06,390 WARNING [train.py:1204] (2/4) Exclude cut with ID _50655_210_18628_1_1534229942193_3772154_320-132275-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 18:55:11,311 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.3.encoder.layers.3.feed_forward1.out_whiten, num_groups=1, num_channels=512, metric=4.98 vs. limit=15.0 2023-10-04 18:55:12,094 WARNING [train.py:1204] (2/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_76-311963-0 from training. Duration: 0.7 2023-10-04 18:55:16,598 WARNING [train.py:1204] (2/4) Exclude cut with ID _6274_210_2084_1_1527937583837_7494020_32-205288-0 from training. Duration: 0.78 2023-10-04 18:55:16,623 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_86-16881-0_sp0.9 from training. Duration: 0.6444375 2023-10-04 18:55:16,668 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_271-297505-0_sp1.1 from training. Duration: 0.9 2023-10-04 18:55:19,354 WARNING [train.py:1204] (2/4) Exclude cut with ID _57572_210_5517_1_1534921219624_3809484_187-295293-0_sp1.1 from training. Duration: 0.963625 2023-10-04 18:55:24,031 WARNING [train.py:1204] (2/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_563-329943-0 from training. Duration: 0.99 2023-10-04 18:55:24,111 WARNING [train.py:1204] (2/4) Exclude cut with ID _7160_210_1876_1_1527940923231_3703799_302-150728-0_sp0.9 from training. Duration: 0.8666875 2023-10-04 18:55:27,690 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.attention_skip_rate, batch_count=1758426.6666666667, ans=0.0 2023-10-04 18:55:28,657 WARNING [train.py:1204] (2/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_926-7526-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 18:55:28,692 WARNING [train.py:1204] (2/4) Exclude cut with ID _62574_210_2655_1_1535180407058_3933249_467-22348-0_sp1.1 from training. Duration: 0.936375 2023-10-04 18:55:30,174 WARNING [train.py:1204] (2/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_280-161633-0_sp0.9 from training. Duration: 0.811125 2023-10-04 18:55:31,589 WARNING [train.py:1204] (2/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_385-193052-0_sp1.1 from training. Duration: 0.7818125 2023-10-04 18:55:33,778 WARNING [train.py:1204] (2/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_444-28041-0 from training. Duration: 0.85 2023-10-04 18:55:34,942 WARNING [train.py:1204] (2/4) Exclude cut with ID _66070_210_7640_1_1535441393096_7805310_341-249237-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 18:55:35,032 WARNING [train.py:1204] (2/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_425-269826-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 18:55:37,757 WARNING [train.py:1204] (2/4) Exclude cut with ID _53950_210_5816_1_1534417264919_3862951_124-185597-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 18:55:39,223 WARNING [train.py:1204] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_380-204756-0 from training. Duration: 0.86 2023-10-04 18:55:43,547 WARNING [train.py:1204] (2/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_282-257791-0_sp0.9 from training. Duration: 0.9555625 2023-10-04 18:55:46,540 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.balancer2.prob, batch_count=1758560.0, ans=0.125 2023-10-04 18:55:47,825 WARNING [train.py:1204] (2/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_372-131163-0_sp1.1 from training. Duration: 0.8545625 2023-10-04 18:55:49,150 WARNING [train.py:1204] (2/4) Exclude cut with ID _28317_210_12742_1_1532480302381_8510325_447-233513-0_sp1.1 from training. Duration: 0.963625 2023-10-04 18:55:50,687 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_54-118333-0_sp1.1 from training. Duration: 0.97275 2023-10-04 18:55:50,918 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=1758560.0, ans=0.1 2023-10-04 18:55:52,884 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.self_attn_weights.pos_emb_skip_rate, batch_count=1758560.0, ans=0.0 2023-10-04 18:55:54,614 WARNING [train.py:1204] (2/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_388-230239-0_sp1.1 from training. Duration: 0.963625 2023-10-04 18:55:54,633 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_40047_210_2290_1_1533290519_2374311_141-54639-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 18:55:55,962 WARNING [train.py:1204] (2/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_91-267696-0_sp0.9 from training. Duration: 0.9666875 2023-10-04 18:55:55,997 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_14056_210_4163_1_1530604483_7572827_532-137847-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 18:55:59,963 WARNING [train.py:1204] (2/4) Exclude cut with ID _8064_210_5399_1_1528372299291_4282973_155-283231-0_sp1.1 from training. Duration: 0.97275 2023-10-04 18:56:01,203 INFO [train.py:1046] (2/4) Epoch 50, batch 3500, loss[loss=0.1482, simple_loss=0.2317, pruned_loss=0.03235, over 23565.00 frames. ], tot_loss[loss=0.1525, simple_loss=0.2331, pruned_loss=0.03589, over 4700216.59 frames. ], batch size: 134, lr: 2.02e-03, grad_scale: 8.0 2023-10-04 18:56:04,505 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_2245_210_2715_1_1525511548_3540254_310-52483-0_sp1.1 from training. Duration: 0.8 2023-10-04 18:56:04,578 WARNING [train.py:1204] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_185-174805-0 from training. Duration: 0.68 2023-10-04 18:56:07,235 WARNING [train.py:1204] (2/4) Exclude cut with ID _15012_210_1968_1_1530856820429_5095350_662-210469-0_sp1.1 from training. Duration: 0.5181875 2023-10-04 18:56:10,106 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_426-313557-0_sp0.9 from training. Duration: 0.588875 2023-10-04 18:56:10,368 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.ff2_skip_rate, batch_count=1758626.6666666667, ans=0.0 2023-10-04 18:56:12,247 WARNING [train.py:1204] (2/4) Exclude cut with ID _42326_210_7770_1_1533814282531_3707269_407-204343-0_sp1.1 from training. Duration: 0.97275 2023-10-04 18:56:12,260 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_258-310570-0 from training. Duration: 0.82 2023-10-04 18:56:16,450 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_4737_210_2040_1_1526998689_3293008_378-178650-0_sp1.1 from training. Duration: 0.863625 2023-10-04 18:56:17,892 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_47896_210_20508_1_1534492518_5958868_432-327451-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 18:56:20,503 WARNING [train.py:1204] (2/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_188-114464-0_sp1.1 from training. Duration: 0.7454375 2023-10-04 18:56:20,510 WARNING [train.py:1204] (2/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_798-106179-0_sp1.1 from training. Duration: 0.7545625 2023-10-04 18:56:20,540 WARNING [train.py:1204] (2/4) Exclude cut with ID _14368_210_4220_1_1530532714870_3595529_143-117421-0_sp1.1 from training. Duration: 0.52725 2023-10-04 18:56:20,579 WARNING [train.py:1204] (2/4) Exclude cut with ID _67499_210_17282_1_1535509805220_3692430_361-78786-0_sp1.1 from training. Duration: 0.963625 2023-10-04 18:56:20,616 WARNING [train.py:1204] (2/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_712-342563-0_sp1.1 from training. Duration: 0.8818125 2023-10-04 18:56:20,668 WARNING [train.py:1204] (2/4) Exclude cut with ID _9235_210_5219_1_1528891154253_8046120_387-159008-0 from training. Duration: 0.86 2023-10-04 18:56:24,331 WARNING [train.py:1204] (2/4) Exclude cut with ID _33640_210_2655_1_1533117605479_4855469_959-18820-0_sp1.1 from training. Duration: 0.963625 2023-10-04 18:56:24,365 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_12868_210_6753_1_1530702479_3610564_133-564-0_sp0.9 from training. Duration: 0.87775 2023-10-04 18:56:25,599 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.705e+02 2.089e+02 2.430e+02 2.862e+02 4.477e+02, threshold=4.860e+02, percent-clipped=0.0 2023-10-04 18:56:26,984 WARNING [train.py:1204] (2/4) Exclude cut with ID _23514_210_1389_1_1531832139785_3031439_12-29287-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 18:56:28,638 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=1758693.3333333333, ans=0.1 2023-10-04 18:56:30,324 WARNING [train.py:1204] (2/4) Exclude cut with ID _23514_210_1389_1_1531832139785_3031439_489-185052-0_sp1.1 from training. Duration: 0.963625 2023-10-04 18:56:31,671 WARNING [train.py:1204] (2/4) Exclude cut with ID _5444_210_2151_1_1527418808464_5312210_634-194174-0 from training. Duration: 0.97 2023-10-04 18:56:31,689 WARNING [train.py:1204] (2/4) Exclude cut with ID _6275_210_2084_1_1528110062213_7669049_91-26919-0_sp1.1 from training. Duration: 0.8818125 2023-10-04 18:56:34,506 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_37351_210_7925_1_1533012931_3812113_516-82303-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 18:56:35,755 WARNING [train.py:1204] (2/4) Exclude cut with ID _41314_210_12651_1_1533459476543_3766460_80-74184-0_sp0.9 from training. Duration: 0.92225 2023-10-04 18:56:37,098 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_9460_210_4211_1_1528954587_10049667_495-202963-0_sp1.1 from training. Duration: 0.963625 2023-10-04 18:56:38,530 WARNING [train.py:1204] (2/4) Exclude cut with ID _14632_210_9988_1_1530691279392_3923200_332-105904-0_sp1.1 from training. Duration: 0.8181875 2023-10-04 18:56:38,546 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_40764_210_1925_1_1533882359_3752345_493-167886-0_sp1.1 from training. Duration: 0.92725 2023-10-04 18:56:41,184 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_196-289637-0 from training. Duration: 0.75 2023-10-04 18:56:42,576 WARNING [train.py:1204] (2/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_26-175553-0 from training. Duration: 0.61 2023-10-04 18:56:43,225 WARNING [train.py:1204] (2/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_890-5117-0 from training. Duration: 0.78 2023-10-04 18:56:43,282 WARNING [train.py:1204] (2/4) Exclude cut with ID _41314_210_12651_1_1533459476543_3766460_206-306860-0_sp1.1 from training. Duration: 0.92725 2023-10-04 18:56:45,896 WARNING [train.py:1204] (2/4) Exclude cut with ID _25882_210_3036_1_1532775665815_6479510_570-79824-0_sp1.1 from training. Duration: 0.963625 2023-10-04 18:56:47,309 WARNING [train.py:1204] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_731-2428-0_sp1.1 from training. Duration: 0.7545625 2023-10-04 18:56:47,337 WARNING [train.py:1204] (2/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_138-154812-0_sp0.9 from training. Duration: 0.8666875 2023-10-04 18:56:48,975 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.nonlin_attention.balancer.prob, batch_count=1758826.6666666667, ans=0.125 2023-10-04 18:56:50,176 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.nonlin_attention.balancer.prob, batch_count=1758826.6666666667, ans=0.125 2023-10-04 18:56:51,213 WARNING [train.py:1204] (2/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_178-39722-0_sp0.9 from training. Duration: 0.5444375 2023-10-04 18:56:51,278 WARNING [train.py:1204] (2/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_1285-256622-0_sp0.9 from training. Duration: 0.9333125 2023-10-04 18:56:54,285 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.balancer2.prob, batch_count=1758826.6666666667, ans=0.125 2023-10-04 18:56:55,509 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_41428_210_2161_1_1533774184_3992532_65-271323-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 18:56:58,718 WARNING [train.py:1204] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_308-161067-0 from training. Duration: 0.66 2023-10-04 18:56:58,723 WARNING [train.py:1204] (2/4) Exclude cut with ID _3067_210_2667_1_1526212974640_4503168_459-173867-0 from training. Duration: 0.88 2023-10-04 18:56:58,725 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_2221_210_1988_1_1525597051_4935541_415-296882-0_sp1.1 from training. Duration: 0.7 2023-10-04 18:57:01,406 WARNING [train.py:1204] (2/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_354-192266-0_sp1.1 from training. Duration: 0.936375 2023-10-04 18:57:01,631 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.ff3_skip_rate, batch_count=1758893.3333333333, ans=0.0 2023-10-04 18:57:03,313 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_258-310570-0_sp0.9 from training. Duration: 0.911125 2023-10-04 18:57:04,665 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_548-141039-0_sp1.1 from training. Duration: 0.963625 2023-10-04 18:57:06,143 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_15101_210_7927_1_1530786415_5033274_320-327527-0 from training. Duration: 0.96 2023-10-04 18:57:07,496 WARNING [train.py:1204] (2/4) Exclude cut with ID _2895_210_2477_1_1525956785794_4063750_289-11475-0_sp0.9 from training. Duration: 0.911125 2023-10-04 18:57:07,629 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7543_210_6753_1_1528543318_4552_1-85072-0_sp1.1 from training. Duration: 0.7545625 2023-10-04 18:57:08,998 WARNING [train.py:1204] (2/4) Exclude cut with ID _64639_210_5816_1_1535367619437_3528000_461-329156-0 from training. Duration: 0.96 2023-10-04 18:57:10,430 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_211-339622-0 from training. Duration: 0.45 2023-10-04 18:57:13,533 WARNING [train.py:1204] (2/4) Exclude cut with ID _20455_210_5219_1_1531565996775_8098100_402-112739-0_sp1.1 from training. Duration: 0.963625 2023-10-04 18:57:13,626 WARNING [train.py:1204] (2/4) Exclude cut with ID _53489_210_6228_1_1534298357169_3835970_256-115565-0_sp1.1 from training. Duration: 0.936375 2023-10-04 18:57:13,648 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_26326_210_7925_1_1533002493_3592406_360-305260-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 18:57:14,911 INFO [train.py:1046] (2/4) Epoch 50, batch 3550, loss[loss=0.1434, simple_loss=0.2177, pruned_loss=0.03451, over 23613.00 frames. ], tot_loss[loss=0.1518, simple_loss=0.232, pruned_loss=0.03581, over 4698413.83 frames. ], batch size: 256, lr: 2.02e-03, grad_scale: 8.0 2023-10-04 18:57:14,989 WARNING [train.py:1204] (2/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_18-204383-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 18:57:16,647 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.feed_forward1.out_proj.dropout_p, batch_count=1758960.0, ans=0.1 2023-10-04 18:57:17,068 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.3.encoder.layers.3.self_attn2.whiten, num_groups=1, num_channels=512, metric=11.88 vs. limit=22.5 2023-10-04 18:57:17,755 WARNING [train.py:1204] (2/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_306-232480-0_sp1.1 from training. Duration: 0.87275 2023-10-04 18:57:26,110 WARNING [train.py:1204] (2/4) Exclude cut with ID _41161_210_20034_1_1533431249657_3555314_116-299186-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 18:57:28,076 WARNING [train.py:1204] (2/4) Exclude cut with ID _13913_210_9994_1_1530579615187_3383897_473-277682-0_sp1.1 from training. Duration: 0.3545625 2023-10-04 18:57:29,682 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.attention_skip_rate, batch_count=1759026.6666666667, ans=0.0 2023-10-04 18:57:30,850 WARNING [train.py:1204] (2/4) Exclude cut with ID _58895_210_8750_1_1535184081320_3649089_49-225702-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 18:57:33,984 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_3080_210_2071_1_1526086102_4365166_312-8726-0_sp1.1 from training. Duration: 0.82725 2023-10-04 18:57:35,362 WARNING [train.py:1204] (2/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_489-190175-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 18:57:35,445 WARNING [train.py:1204] (2/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_345-95717-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 18:57:35,457 WARNING [train.py:1204] (2/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_85-138257-0_sp1.1 from training. Duration: 0.6454375 2023-10-04 18:57:39,681 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7046_210_6753_1_1527895337_6920917_783-278109-0_sp1.1 from training. Duration: 0.7 2023-10-04 18:57:39,715 WARNING [train.py:1204] (2/4) Exclude cut with ID _3465_210_3525_1_1526175667060_3574049_450-203637-0_sp1.1 from training. Duration: 0.836375 2023-10-04 18:57:39,754 WARNING [train.py:1204] (2/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_416-132893-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 18:57:39,769 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_2655_210_2087_1_1525688959_3897400_437-337690-0_sp0.9 from training. Duration: 0.7 2023-10-04 18:57:41,011 WARNING [train.py:1204] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_700-183549-0_sp1.1 from training. Duration: 0.6181875 2023-10-04 18:57:48,333 WARNING [train.py:1204] (2/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_191-59784-0_sp1.1 from training. Duration: 0.8 2023-10-04 18:57:48,352 WARNING [train.py:1204] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_218-295097-0_sp1.1 from training. Duration: 0.7 2023-10-04 18:57:49,688 WARNING [train.py:1204] (2/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_310-109461-0_sp1.1 from training. Duration: 0.72725 2023-10-04 18:57:49,692 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_33348_210_5632_1_1533086912_3324030_131-321149-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 18:57:49,742 WARNING [train.py:1204] (2/4) Exclude cut with ID _64135_210_2253_1_1535677267876_7608972_133-188266-0_sp1.1 from training. Duration: 0.736375 2023-10-04 18:57:49,766 WARNING [train.py:1204] (2/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_505-205270-0 from training. Duration: 0.38 2023-10-04 18:57:49,786 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_67223_210_4238_1_1535612120_2537431_102-88248-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 18:57:51,310 WARNING [train.py:1204] (2/4) Exclude cut with ID _54871_210_22356_1_1534665798253_3856359_60-235433-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 18:57:51,395 WARNING [train.py:1204] (2/4) Exclude cut with ID _5189_210_4557_1_1527248319428_3769229_199-53526-0_sp1.1 from training. Duration: 0.3818125 2023-10-04 18:57:57,203 WARNING [train.py:1204] (2/4) Exclude cut with ID _45329_210_7005_1_1534157951211_6774339_439-90979-0_sp1.1 from training. Duration: 0.963625 2023-10-04 18:57:57,271 WARNING [train.py:1204] (2/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_1162-132880-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 18:57:58,579 WARNING [train.py:1204] (2/4) Exclude cut with ID _56423_210_5854_1_1534896061049_3165969_220-22317-0_sp1.1 from training. Duration: 0.963625 2023-10-04 18:58:00,548 WARNING [train.py:1204] (2/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_909-64151-0 from training. Duration: 0.94 2023-10-04 18:58:01,000 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.5.encoder.layers.1.conv_module1.whiten, num_groups=1, num_channels=256, metric=3.55 vs. limit=15.0 2023-10-04 18:58:01,948 WARNING [train.py:1204] (2/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_465-156500-0_sp0.9 from training. Duration: 0.8 2023-10-04 18:58:02,127 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.bypass_mid.scale_min, batch_count=1759160.0, ans=0.2 2023-10-04 18:58:02,527 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.4.encoder.layers.2.self_attn1.whiten, num_groups=1, num_channels=384, metric=11.83 vs. limit=22.5 2023-10-04 18:58:03,348 WARNING [train.py:1204] (2/4) Exclude cut with ID _44304_210_15374_1_1533639352765_4613129_302-22084-0 from training. Duration: 0.86 2023-10-04 18:58:03,401 WARNING [train.py:1204] (2/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_663-166973-0_sp1.1 from training. Duration: 0.72725 2023-10-04 18:58:06,775 WARNING [train.py:1204] (2/4) Exclude cut with ID _45329_210_7005_1_1534157951211_6774339_503-287883-0_sp0.9 from training. Duration: 0.97775 2023-10-04 18:58:06,797 WARNING [train.py:1204] (2/4) Exclude cut with ID _60057_210_10640_1_1534903065793_3333150_362-230377-0_sp0.9 from training. Duration: 0.9666875 2023-10-04 18:58:09,867 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_4183_210_2087_1_1526729100_1869838_143-280962-0 from training. Duration: 0.56 2023-10-04 18:58:09,956 WARNING [train.py:1204] (2/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_281-1313-0_sp1.1 from training. Duration: 0.936375 2023-10-04 18:58:16,047 WARNING [train.py:1204] (2/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_64-215118-0_sp1.1 from training. Duration: 0.936375 2023-10-04 18:58:17,438 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_2822_210_2715_1_1526112586_1999587_165-36656-0 from training. Duration: 0.89 2023-10-04 18:58:18,776 WARNING [train.py:1204] (2/4) Exclude cut with ID _22961_210_8254_1_1531976366672_4278180_402-208830-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 18:58:21,719 WARNING [train.py:1204] (2/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_98-193494-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 18:58:21,825 WARNING [train.py:1204] (2/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_383-285196-0 from training. Duration: 0.97 2023-10-04 18:58:29,010 INFO [train.py:1046] (2/4) Epoch 50, batch 3600, loss[loss=0.1519, simple_loss=0.2289, pruned_loss=0.03744, over 23345.00 frames. ], tot_loss[loss=0.1513, simple_loss=0.2315, pruned_loss=0.03557, over 4690504.29 frames. ], batch size: 285, lr: 2.02e-03, grad_scale: 16.0 2023-10-04 18:58:29,072 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6767_210_3273_1_1527855783_1718932_142-127132-0 from training. Duration: 0.95 2023-10-04 18:58:29,114 WARNING [train.py:1204] (2/4) Exclude cut with ID _39867_210_9826_1_1533553222030_9344259_1018-339635-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 18:58:30,478 WARNING [train.py:1204] (2/4) Exclude cut with ID _14603_210_4220_1_1530615688441_3097319_274-274798-0_sp1.1 from training. Duration: 0.8909375 2023-10-04 18:58:31,852 WARNING [train.py:1204] (2/4) Exclude cut with ID _2334_210_2667_1_1525608238880_4705129_376-263393-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 18:58:33,736 WARNING [train.py:1204] (2/4) Exclude cut with ID _29303_210_2575_1_1532392237303_7410649_363-344675-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 18:58:33,819 WARNING [train.py:1204] (2/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_58-68956-0_sp1.1 from training. Duration: 0.7545625 2023-10-04 18:58:38,673 WARNING [train.py:1204] (2/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_709-311571-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 18:58:39,973 WARNING [train.py:1204] (2/4) Exclude cut with ID _46067_210_4220_1_1534125571066_3307450_136-257407-0_sp1.1 from training. Duration: 0.963625 2023-10-04 18:58:41,311 WARNING [train.py:1204] (2/4) Exclude cut with ID _6493_210_1747_1_1528459210424_4012470_324-323854-0_sp1.1 from training. Duration: 0.836375 2023-10-04 18:58:42,586 WARNING [train.py:1204] (2/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_234-260682-0_sp1.1 from training. Duration: 0.763625 2023-10-04 18:58:43,937 WARNING [train.py:1204] (2/4) Exclude cut with ID _36634_210_1794_1_1533296352450_1399884_55-119451-0_sp1.1 from training. Duration: 0.963625 2023-10-04 18:58:43,940 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_40047_210_2290_1_1533290519_2374311_195-165693-0 from training. Duration: 0.89 2023-10-04 18:58:47,230 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_5522_210_3273_1_1527249606_3909096_366-172508-0_sp0.9 from training. Duration: 0.7333125 2023-10-04 18:58:48,484 WARNING [train.py:1204] (2/4) Exclude cut with ID _21878_210_3525_1_1531738772002_7264330_187-10504-0_sp1.1 from training. Duration: 0.963625 2023-10-04 18:58:49,945 WARNING [train.py:1204] (2/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_282-257791-0_sp1.1 from training. Duration: 0.7818125 2023-10-04 18:58:50,247 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.ff3_skip_rate, batch_count=1759360.0, ans=0.0 2023-10-04 18:58:52,625 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.822e+02 2.097e+02 2.364e+02 2.970e+02 4.964e+02, threshold=4.728e+02, percent-clipped=1.0 2023-10-04 18:58:54,088 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_43831_210_2158_1_1533725502_5872791_467-288776-0_sp1.1 from training. Duration: 0.92725 2023-10-04 18:58:55,464 WARNING [train.py:1204] (2/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_138-154812-0_sp1.1 from training. Duration: 0.7090625 2023-10-04 18:58:55,516 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_1154-185930-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 18:58:55,535 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_2655_210_2087_1_1525688959_3897400_52-279899-0 from training. Duration: 0.69 2023-10-04 18:58:56,939 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_141-89113-0_sp1.1 from training. Duration: 0.7818125 2023-10-04 18:58:58,498 WARNING [train.py:1204] (2/4) Exclude cut with ID _55134_210_21982_1_1534590028377_3882170_297-47310-0_sp1.1 from training. Duration: 0.963625 2023-10-04 18:59:00,201 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_486-151131-0_sp1.1 from training. Duration: 0.77275 2023-10-04 18:59:01,577 WARNING [train.py:1204] (2/4) Exclude cut with ID _44778_210_2054_1_1533798127203_7443090_21-232980-0_sp1.1 from training. Duration: 0.936375 2023-10-04 18:59:04,269 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6165_210_3528_1_1527590982_4379841_338-106380-0_sp1.1 from training. Duration: 0.92725 2023-10-04 18:59:06,203 WARNING [train.py:1204] (2/4) Exclude cut with ID _55162_210_17071_1_1534408248583_3753480_355-257218-0_sp1.1 from training. Duration: 0.97275 2023-10-04 18:59:06,282 WARNING [train.py:1204] (2/4) Exclude cut with ID _2895_210_2477_1_1525956785794_4063750_289-11475-0 from training. Duration: 0.82 2023-10-04 18:59:08,497 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.feed_forward2.hidden_balancer.prob, batch_count=1759426.6666666667, ans=0.125 2023-10-04 18:59:12,456 WARNING [train.py:1204] (2/4) Exclude cut with ID _16756_210_4915_1_1531047803763_7104259_839-183431-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 18:59:15,074 WARNING [train.py:1204] (2/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_427-208901-0_sp0.9 from training. Duration: 0.7555625 2023-10-04 18:59:15,109 WARNING [train.py:1204] (2/4) Exclude cut with ID _36512_210_6987_1_1533033143998_3126069_181-306500-0 from training. Duration: 0.95 2023-10-04 18:59:18,610 WARNING [train.py:1204] (2/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_28-70987-0_sp1.1 from training. Duration: 0.7909375 2023-10-04 18:59:24,046 WARNING [train.py:1204] (2/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_590-58996-0_sp1.1 from training. Duration: 0.936375 2023-10-04 18:59:28,176 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_48730_210_5399_1_1533885886_2212769_108-47561-0_sp1.1 from training. Duration: 0.936375 2023-10-04 18:59:32,370 WARNING [train.py:1204] (2/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_401-158239-0_sp0.9 from training. Duration: 0.788875 2023-10-04 18:59:32,392 WARNING [train.py:1204] (2/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_278-154492-0_sp1.1 from training. Duration: 0.6909375 2023-10-04 18:59:32,397 WARNING [train.py:1204] (2/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_137-116442-0 from training. Duration: 0.78 2023-10-04 18:59:35,659 WARNING [train.py:1204] (2/4) Exclude cut with ID _3541_210_2477_1_1526475408709_3263973_189-317158-0 from training. Duration: 0.9 2023-10-04 18:59:37,010 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_47896_210_20508_1_1534492518_5958868_558-314572-0 from training. Duration: 0.97 2023-10-04 18:59:40,781 WARNING [train.py:1204] (2/4) Exclude cut with ID _66070_210_7640_1_1535441393096_7805310_290-40284-0_sp1.1 from training. Duration: 0.97275 2023-10-04 18:59:40,947 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.conv_module1.balancer2.prob, batch_count=1759560.0, ans=0.125 2023-10-04 18:59:42,134 WARNING [train.py:1204] (2/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_110-224688-0_sp1.1 from training. Duration: 0.8818125 2023-10-04 18:59:42,213 WARNING [train.py:1204] (2/4) Exclude cut with ID _7719_210_3525_1_1528456153163_4296960_88-250259-0 from training. Duration: 0.84 2023-10-04 18:59:43,481 INFO [train.py:1046] (2/4) Epoch 50, batch 3650, loss[loss=0.1487, simple_loss=0.2371, pruned_loss=0.0301, over 24335.00 frames. ], tot_loss[loss=0.1512, simple_loss=0.2317, pruned_loss=0.03533, over 4708285.19 frames. ], batch size: 74, lr: 2.02e-03, grad_scale: 16.0 2023-10-04 18:59:43,544 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8453_210_4211_1_1528502940_8562730_1157-114102-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 18:59:43,567 WARNING [train.py:1204] (2/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_317-175062-0_sp1.1 from training. Duration: 0.7454375 2023-10-04 18:59:43,573 WARNING [train.py:1204] (2/4) Exclude cut with ID _29857_210_20034_1_1532395886753_3798160_254-347254-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 18:59:43,648 WARNING [train.py:1204] (2/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_28-70987-0 from training. Duration: 0.87 2023-10-04 18:59:46,263 WARNING [train.py:1204] (2/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_321-335475-0 from training. Duration: 0.45 2023-10-04 18:59:47,710 WARNING [train.py:1204] (2/4) Exclude cut with ID _40901_210_5665_1_1533363789958_3856130_356-107330-0_sp1.1 from training. Duration: 0.936375 2023-10-04 18:59:48,363 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_3780_210_2087_1_1526467389_2145572_176-107140-0 from training. Duration: 0.69 2023-10-04 18:59:49,855 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.ff3_skip_rate, batch_count=1759626.6666666667, ans=0.0 2023-10-04 18:59:52,781 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.self_attn_weights.pos_emb_skip_rate, batch_count=1759626.6666666667, ans=0.0 2023-10-04 18:59:53,873 WARNING [train.py:1204] (2/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_80-135200-0 from training. Duration: 0.79 2023-10-04 18:59:53,983 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_33348_210_5632_1_1533086912_3324030_85-28583-0_sp0.9 from training. Duration: 0.911125 2023-10-04 18:59:58,116 WARNING [train.py:1204] (2/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_29-293896-0 from training. Duration: 0.61 2023-10-04 18:59:59,520 WARNING [train.py:1204] (2/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_194-252800-0 from training. Duration: 0.97 2023-10-04 19:00:03,677 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_360-52446-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 19:00:03,679 WARNING [train.py:1204] (2/4) Exclude cut with ID _9389_210_1664_1_1529217337301_5289269_289-213417-0_sp0.9 from training. Duration: 0.988875 2023-10-04 19:00:03,735 WARNING [train.py:1204] (2/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_143-86344-0_sp0.9 from training. Duration: 0.8555625 2023-10-04 19:00:08,263 WARNING [train.py:1204] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_520-126729-0_sp0.9 from training. Duration: 0.77775 2023-10-04 19:00:08,290 WARNING [train.py:1204] (2/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_76-308663-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 19:00:08,346 WARNING [train.py:1204] (2/4) Exclude cut with ID _8021_210_7994_1_1528376451358_3653069_100-140231-0 from training. Duration: 0.67 2023-10-04 19:00:10,188 WARNING [train.py:1204] (2/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_387-131538-0_sp0.9 from training. Duration: 0.9 2023-10-04 19:00:10,212 WARNING [train.py:1204] (2/4) Exclude cut with ID _6275_210_2084_1_1528110062213_7669049_365-182054-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 19:00:10,253 WARNING [train.py:1204] (2/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_345-340690-0 from training. Duration: 0.93 2023-10-04 19:00:10,335 WARNING [train.py:1204] (2/4) Exclude cut with ID _5409_210_1388_1_1527292850189_5865191_178-127970-0_sp0.9 from training. Duration: 0.6333125 2023-10-04 19:00:11,686 WARNING [train.py:1204] (2/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_352-12734-0_sp1.1 from training. Duration: 0.92725 2023-10-04 19:00:11,689 WARNING [train.py:1204] (2/4) Exclude cut with ID _65134_210_14763_1_1535335393985_3505368_121-129373-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 19:00:11,837 WARNING [train.py:1204] (2/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_557-324258-0_sp1.1 from training. Duration: 0.736375 2023-10-04 19:00:13,236 WARNING [train.py:1204] (2/4) Exclude cut with ID _48678_210_5663_1_1533988830947_3769199_306-255886-0 from training. Duration: 0.77 2023-10-04 19:00:14,606 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7544_210_6753_1_1528587000_691468_87-178599-0 from training. Duration: 0.87 2023-10-04 19:00:14,665 WARNING [train.py:1204] (2/4) Exclude cut with ID _2941_210_2577_1_1526199673150_2489759_208-236402-0_sp1.1 from training. Duration: 0.963625 2023-10-04 19:00:17,769 WARNING [train.py:1204] (2/4) Exclude cut with ID _38670_210_15379_1_1533448810659_4391059_65-169397-0 from training. Duration: 0.97 2023-10-04 19:00:20,200 WARNING [train.py:1204] (2/4) Exclude cut with ID _30304_210_6319_1_1532476827261_7582060_306-328175-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 19:00:20,217 WARNING [train.py:1204] (2/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_212-149947-0_sp1.1 from training. Duration: 0.663625 2023-10-04 19:00:24,440 WARNING [train.py:1204] (2/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_321-22821-0_sp1.1 from training. Duration: 0.8090625 2023-10-04 19:00:27,164 WARNING [train.py:1204] (2/4) Exclude cut with ID _44010_210_4525_1_1533884262964_4013620_483-69110-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 19:00:27,173 WARNING [train.py:1204] (2/4) Exclude cut with ID _14922_210_6228_1_1530707435273_3693310_38-187851-0_sp1.1 from training. Duration: 0.563625 2023-10-04 19:00:27,264 WARNING [train.py:1204] (2/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_129-337802-0_sp0.9 from training. Duration: 0.8 2023-10-04 19:00:28,559 WARNING [train.py:1204] (2/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_268-87909-0_sp1.1 from training. Duration: 0.9 2023-10-04 19:00:30,059 WARNING [train.py:1204] (2/4) Exclude cut with ID _21878_210_3525_1_1531738772002_7264330_559-184862-0_sp1.1 from training. Duration: 0.8545625 2023-10-04 19:00:30,921 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.nonlin_attention.whiten2.whitening_limit, batch_count=1759826.6666666667, ans=15.0 2023-10-04 19:00:34,710 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_46032_210_14656_1_1533781412_4507969_237-120766-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 19:00:35,355 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.4.encoder.layers.0.feed_forward2.out_whiten, num_groups=1, num_channels=384, metric=5.48 vs. limit=15.0 2023-10-04 19:00:36,050 WARNING [train.py:1204] (2/4) Exclude cut with ID _28427_210_4220_1_1532581240967_3061069_330-170906-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 19:00:36,061 WARNING [train.py:1204] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_401-51578-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 19:00:38,038 WARNING [train.py:1204] (2/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_209-205365-0_sp1.1 from training. Duration: 0.7181875 2023-10-04 19:00:38,112 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_23801_210_16223_1_1532066392_3294657_57-242170-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 19:00:39,408 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_127-37371-0_sp1.1 from training. Duration: 0.936375 2023-10-04 19:00:46,404 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9171W0019-578905 from training. Duration: 0.896 2023-10-04 19:00:46,925 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.5.encoder.layers.1.feed_forward2.out_whiten, num_groups=1, num_channels=256, metric=9.68 vs. limit=15.0 2023-10-04 19:00:47,855 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder_embed.convnext.layerdrop_rate, batch_count=1759893.3333333333, ans=0.015 2023-10-04 19:00:50,896 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_24605_210_1794_1_1532047863_4149755_349-49056-0_sp1.1 from training. Duration: 0.92725 2023-10-04 19:00:50,908 WARNING [train.py:1204] (2/4) Exclude cut with ID _12302_210_5219_1_1529924446466_8107280_897-21237-0_sp1.1 from training. Duration: 0.936375 2023-10-04 19:00:50,997 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_9460_210_4211_1_1528954587_10049667_480-260269-0_sp0.9 from training. Duration: 0.62225 2023-10-04 19:00:51,045 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_348-152548-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 19:00:52,378 WARNING [train.py:1204] (2/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_301-330455-0_sp0.9 from training. Duration: 0.888875 2023-10-04 19:00:52,484 WARNING [train.py:1204] (2/4) Exclude cut with ID _61008_210_10633_1_1534852769198_6974370_345-91705-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 19:00:53,866 WARNING [train.py:1204] (2/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_304-273433-0 from training. Duration: 0.99 2023-10-04 19:00:53,869 WARNING [train.py:1204] (2/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_359-345972-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 19:00:56,553 INFO [train.py:1046] (2/4) Epoch 50, batch 3700, loss[loss=0.1516, simple_loss=0.2271, pruned_loss=0.03801, over 23675.00 frames. ], tot_loss[loss=0.1515, simple_loss=0.2322, pruned_loss=0.03543, over 4721479.80 frames. ], batch size: 256, lr: 2.02e-03, grad_scale: 16.0 2023-10-04 19:00:57,843 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_2296_210_2087_1_1525503448_3397335_57-185430-0_sp1.1 from training. Duration: 0.5454375 2023-10-04 19:00:58,155 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.nonlin_attention.balancer.prob, batch_count=1759960.0, ans=0.125 2023-10-04 19:00:59,232 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_1256-120974-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 19:00:59,313 WARNING [train.py:1204] (2/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_372-30302-0_sp0.9 from training. Duration: 0.9555625 2023-10-04 19:01:02,037 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_42012_210_7927_1_1533439936_3871556_451-344262-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 19:01:02,038 WARNING [train.py:1204] (2/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_28-324729-0 from training. Duration: 0.94 2023-10-04 19:01:02,053 WARNING [train.py:1204] (2/4) Exclude cut with ID _64135_210_2253_1_1535677267876_7608972_313-18394-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 19:01:03,403 WARNING [train.py:1204] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_620-276793-0_sp0.9 from training. Duration: 0.6555625 2023-10-04 19:01:03,438 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7544_210_6753_1_1528591327_1471426_172-348777-0_sp0.9 from training. Duration: 0.7333125 2023-10-04 19:01:11,285 WARNING [train.py:1204] (2/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_332-221057-0_sp0.9 from training. Duration: 0.7555625 2023-10-04 19:01:14,077 WARNING [train.py:1204] (2/4) Exclude cut with ID _19023_210_4353_1_1531378892552_5820780_5-300035-0_sp1.1 from training. Duration: 0.92725 2023-10-04 19:01:15,461 WARNING [train.py:1204] (2/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_343-309023-0_sp1.1 from training. Duration: 0.963625 2023-10-04 19:01:15,532 WARNING [train.py:1204] (2/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_37-41919-0_sp1.1 from training. Duration: 0.7909375 2023-10-04 19:01:16,949 WARNING [train.py:1204] (2/4) Exclude cut with ID _3541_210_2477_1_1526475408709_3263973_41-182910-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 19:01:17,007 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_229-45300-0_sp0.9 from training. Duration: 0.6333125 2023-10-04 19:01:18,500 WARNING [train.py:1204] (2/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_40-214716-0_sp1.1 from training. Duration: 0.963625 2023-10-04 19:01:18,617 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9309W0203-640330 from training. Duration: 0.896 2023-10-04 19:01:22,811 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.773e+02 2.080e+02 2.404e+02 2.897e+02 4.526e+02, threshold=4.809e+02, percent-clipped=0.0 2023-10-04 19:01:26,998 WARNING [train.py:1204] (2/4) Exclude cut with ID _60229_210_27767_1_1534838406158_3655350_264-279573-0_sp1.1 from training. Duration: 0.7545625 2023-10-04 19:01:27,048 WARNING [train.py:1204] (2/4) Exclude cut with ID _8394_210_6702_1_1528590504316_3540189_372-198878-0_sp0.9 from training. Duration: 0.6666875 2023-10-04 19:01:27,265 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.conv_module1.balancer1.min_positive, batch_count=1760093.3333333333, ans=0.025 2023-10-04 19:01:28,416 WARNING [train.py:1204] (2/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_359-118926-0_sp1.1 from training. Duration: 0.6545625 2023-10-04 19:01:28,426 WARNING [train.py:1204] (2/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_85-65056-0 from training. Duration: 0.96 2023-10-04 19:01:28,450 WARNING [train.py:1204] (2/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_89-242335-0_sp1.1 from training. Duration: 0.863625 2023-10-04 19:01:32,775 WARNING [train.py:1204] (2/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_128-49444-0_sp1.1 from training. Duration: 0.936375 2023-10-04 19:01:32,848 WARNING [train.py:1204] (2/4) Exclude cut with ID _30327_210_16049_1_1532484161295_3924169_401-70819-0 from training. Duration: 0.86 2023-10-04 19:01:34,236 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_41822_210_13270_1_1533955976_4379284_260-256391-0_sp1.1 from training. Duration: 0.936375 2023-10-04 19:01:37,437 WARNING [train.py:1204] (2/4) Exclude cut with ID _67335_210_5219_1_1535525975652_4275229_599-247317-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 19:01:40,658 WARNING [train.py:1204] (2/4) Exclude cut with ID _29857_210_20034_1_1532395886753_3798160_209-239267-0_sp1.1 from training. Duration: 0.936375 2023-10-04 19:01:40,684 WARNING [train.py:1204] (2/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_46-207599-0_sp1.1 from training. Duration: 0.5818125 2023-10-04 19:01:42,114 WARNING [train.py:1204] (2/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_912-282412-0_sp1.1 from training. Duration: 0.4454375 2023-10-04 19:01:44,116 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.0.conv_module1.balancer1.prob, batch_count=1760160.0, ans=0.125 2023-10-04 19:01:45,611 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.ff2_skip_rate, batch_count=1760160.0, ans=0.0 2023-10-04 19:01:48,093 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_4183_210_2087_1_1526729100_1869838_141-166600-0_sp1.1 from training. Duration: 0.863625 2023-10-04 19:01:48,097 WARNING [train.py:1204] (2/4) Exclude cut with ID _15037_210_2253_1_1530752294104_3267650_296-192597-0 from training. Duration: 0.81 2023-10-04 19:01:48,152 WARNING [train.py:1204] (2/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_571-220562-0_sp1.1 from training. Duration: 0.963625 2023-10-04 19:01:48,174 WARNING [train.py:1204] (2/4) Exclude cut with ID _5287_210_2084_1_1527418883040_3878670_12-221186-0 from training. Duration: 0.98 2023-10-04 19:01:55,534 WARNING [train.py:1204] (2/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_149-85602-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 19:01:55,564 WARNING [train.py:1204] (2/4) Exclude cut with ID _9235_210_5219_1_1528891154253_8046120_1003-101638-0_sp1.1 from training. Duration: 0.663625 2023-10-04 19:01:57,083 WARNING [train.py:1204] (2/4) Exclude cut with ID _55844_210_2346_1_1534471212569_7625707_1099-33689-0_sp1.1 from training. Duration: 0.92725 2023-10-04 19:01:57,131 WARNING [train.py:1204] (2/4) Exclude cut with ID _7606_210_4915_1_1528894253277_5080690_301-10732-0 from training. Duration: 0.81 2023-10-04 19:01:59,930 WARNING [train.py:1204] (2/4) Exclude cut with ID _22826_210_9994_1_1531908201522_3279169_403-34698-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 19:01:59,936 WARNING [train.py:1204] (2/4) Exclude cut with ID _8480_210_1874_1_1528589391205_6728330_755-112625-0_sp0.9 from training. Duration: 0.82225 2023-10-04 19:01:59,949 WARNING [train.py:1204] (2/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_527-238990-0_sp1.1 from training. Duration: 0.8181875 2023-10-04 19:01:59,976 WARNING [train.py:1204] (2/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_426-7328-0_sp1.1 from training. Duration: 0.92725 2023-10-04 19:02:05,384 WARNING [train.py:1204] (2/4) Exclude cut with ID _3541_210_2477_1_1526475408709_3263973_189-317158-0_sp1.1 from training. Duration: 0.8181875 2023-10-04 19:02:05,459 WARNING [train.py:1204] (2/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_162-103620-0 from training. Duration: 0.93 2023-10-04 19:02:08,129 WARNING [train.py:1204] (2/4) Exclude cut with ID _22083_210_8254_1_1531702862439_3678970_49-234546-0 from training. Duration: 0.83 2023-10-04 19:02:08,178 WARNING [train.py:1204] (2/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_370-331600-0_sp1.1 from training. Duration: 0.8545625 2023-10-04 19:02:08,190 WARNING [train.py:1204] (2/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_166-59042-0_sp1.1 from training. Duration: 0.97275 2023-10-04 19:02:09,569 WARNING [train.py:1204] (2/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_138-332773-0_sp1.1 from training. Duration: 0.736375 2023-10-04 19:02:09,640 WARNING [train.py:1204] (2/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_90-30673-0_sp0.9 from training. Duration: 0.9333125 2023-10-04 19:02:09,787 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.conv_module2.balancer2.prob, batch_count=1760226.6666666667, ans=0.125 2023-10-04 19:02:12,772 INFO [train.py:1046] (2/4) Epoch 50, batch 3750, loss[loss=0.1398, simple_loss=0.2197, pruned_loss=0.02993, over 23625.00 frames. ], tot_loss[loss=0.1526, simple_loss=0.2335, pruned_loss=0.03581, over 4726946.97 frames. ], batch size: 149, lr: 2.02e-03, grad_scale: 16.0 2023-10-04 19:02:12,934 WARNING [train.py:1204] (2/4) Exclude cut with ID _27967_210_12140_1_1532588391859_8419130_737-41898-0_sp1.1 from training. Duration: 0.936375 2023-10-04 19:02:16,591 WARNING [train.py:1204] (2/4) Exclude cut with ID _8833_210_5219_1_1528592665210_7553059_329-125156-0_sp1.1 from training. Duration: 0.7454375 2023-10-04 19:02:16,691 WARNING [train.py:1204] (2/4) Exclude cut with ID _41817_210_18835_1_1533434477160_7423049_352-42537-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 19:02:18,067 WARNING [train.py:1204] (2/4) Exclude cut with ID _8365_210_2064_1_1528592311070_4677690_431-294102-0 from training. Duration: 0.89 2023-10-04 19:02:19,473 WARNING [train.py:1204] (2/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_454-314901-0_sp1.1 from training. Duration: 0.4181875 2023-10-04 19:02:22,188 WARNING [train.py:1204] (2/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_80-135200-0_sp0.9 from training. Duration: 0.87775 2023-10-04 19:02:22,243 WARNING [train.py:1204] (2/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_181-78632-0 from training. Duration: 0.78 2023-10-04 19:02:25,350 WARNING [train.py:1204] (2/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_300-27235-0_sp0.9 from training. Duration: 0.9666875 2023-10-04 19:02:26,741 WARNING [train.py:1204] (2/4) Exclude cut with ID _47790_210_16187_1_1533862814066_4207519_96-177176-0_sp1.1 from training. Duration: 0.97275 2023-10-04 19:02:26,838 WARNING [train.py:1204] (2/4) Exclude cut with ID _35941_210_15379_1_1532995294515_4023089_246-348742-0_sp1.1 from training. Duration: 0.97275 2023-10-04 19:02:27,022 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.conv_module2.balancer2.min_positive, batch_count=1760360.0, ans=0.05 2023-10-04 19:02:28,171 WARNING [train.py:1204] (2/4) Exclude cut with ID _38670_210_15379_1_1533448810659_4391059_65-169397-0_sp1.1 from training. Duration: 0.8818125 2023-10-04 19:02:29,755 WARNING [train.py:1204] (2/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_1003-99642-0_sp1.1 from training. Duration: 0.963625 2023-10-04 19:02:32,612 WARNING [train.py:1204] (2/4) Exclude cut with ID _40402_210_18219_1_1533522324115_2953150_320-14078-0_sp1.1 from training. Duration: 0.863625 2023-10-04 19:02:33,990 WARNING [train.py:1204] (2/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_367-61330-0_sp0.9 from training. Duration: 0.8333125 2023-10-04 19:02:35,566 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.conv_module2.balancer2.prob, batch_count=1760360.0, ans=0.125 2023-10-04 19:02:36,759 WARNING [train.py:1204] (2/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_515-298514-0_sp1.1 from training. Duration: 0.92725 2023-10-04 19:02:38,313 WARNING [train.py:1204] (2/4) Exclude cut with ID _53731_210_7994_1_1534553979244_3757214_471-320815-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 19:02:39,648 WARNING [train.py:1204] (2/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_306-232480-0 from training. Duration: 0.96 2023-10-04 19:02:40,960 WARNING [train.py:1204] (2/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_478-162247-0_sp0.9 from training. Duration: 0.911125 2023-10-04 19:02:41,273 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.bypass.skip_rate, batch_count=1760426.6666666667, ans=0.07 2023-10-04 19:02:42,940 WARNING [train.py:1204] (2/4) Exclude cut with ID _56423_210_5854_1_1534896061049_3165969_345-284696-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 19:02:43,115 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.1.feed_forward1.out_proj.dropout_p, batch_count=1760426.6666666667, ans=0.1 2023-10-04 19:02:44,781 WARNING [train.py:1204] (2/4) Exclude cut with ID _44778_210_2054_1_1533798127203_7443090_600-122253-0_sp1.1 from training. Duration: 0.963625 2023-10-04 19:02:49,375 WARNING [train.py:1204] (2/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_199-295611-0 from training. Duration: 0.55 2023-10-04 19:02:53,443 WARNING [train.py:1204] (2/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_734-129761-0 from training. Duration: 0.82 2023-10-04 19:02:54,127 WARNING [train.py:1204] (2/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_349-169257-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 19:02:55,537 WARNING [train.py:1204] (2/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_202-67017-0_sp0.9 from training. Duration: 0.911125 2023-10-04 19:02:55,673 WARNING [train.py:1204] (2/4) Exclude cut with ID _16908_210_5724_1_1531119157248_3772700_61-258978-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 19:02:58,585 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.conv_module2.balancer2.prob, batch_count=1760493.3333333333, ans=0.125 2023-10-04 19:03:01,004 WARNING [train.py:1204] (2/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_172-156931-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 19:03:01,101 WARNING [train.py:1204] (2/4) Exclude cut with ID _4092_210_2151_1_1526643044449_3135759_356-278694-0_sp1.1 from training. Duration: 0.42725 2023-10-04 19:03:04,001 WARNING [train.py:1204] (2/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_116-332008-0 from training. Duration: 0.98 2023-10-04 19:03:06,817 WARNING [train.py:1204] (2/4) Exclude cut with ID _29003_210_19596_1_1532507391720_3894098_358-114789-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 19:03:10,821 WARNING [train.py:1204] (2/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_152-212533-0_sp1.1 from training. Duration: 0.8818125 2023-10-04 19:03:10,889 WARNING [train.py:1204] (2/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_267-214116-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 19:03:14,178 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7543_210_6753_1_1528541907_1399322_165-323619-0_sp0.9 from training. Duration: 0.7666875 2023-10-04 19:03:18,309 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.1.encoder.layers.1.self_attn1.whiten, num_groups=1, num_channels=256, metric=11.35 vs. limit=22.5 2023-10-04 19:03:18,823 WARNING [train.py:1204] (2/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_577-332600-0_sp0.9 from training. Duration: 0.6333125 2023-10-04 19:03:20,204 WARNING [train.py:1204] (2/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_342-100157-0_sp0.9 from training. Duration: 0.6 2023-10-04 19:03:21,647 WARNING [train.py:1204] (2/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_141-89727-0_sp1.1 from training. Duration: 0.6454375 2023-10-04 19:03:23,583 WARNING [train.py:1204] (2/4) Exclude cut with ID _3992_210_1747_1_1526644816031_3731060_107-251434-0_sp1.1 from training. Duration: 0.9 2023-10-04 19:03:26,228 WARNING [train.py:1204] (2/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_401-53423-0_sp1.1 from training. Duration: 0.5 2023-10-04 19:03:27,425 INFO [train.py:1046] (2/4) Epoch 50, batch 3800, loss[loss=0.1517, simple_loss=0.2373, pruned_loss=0.03301, over 24558.00 frames. ], tot_loss[loss=0.1526, simple_loss=0.2334, pruned_loss=0.0359, over 4716487.37 frames. ], batch size: 71, lr: 2.02e-03, grad_scale: 16.0 2023-10-04 19:03:30,651 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.conv_skip_rate, batch_count=1760626.6666666667, ans=0.0 2023-10-04 19:03:33,277 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7762_210_1794_1_1528199508_4057408_422-227034-0_sp1.1 from training. Duration: 0.663625 2023-10-04 19:03:35,356 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.3.encoder.layers.3.nonlin_attention.whiten2, num_groups=1, num_channels=512, metric=4.78 vs. limit=15.0 2023-10-04 19:03:36,116 WARNING [train.py:1204] (2/4) Exclude cut with ID _4184_210_2550_1_1526730192013_2219380_102-250299-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 19:03:37,584 WARNING [train.py:1204] (2/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_253-308377-0_sp0.9 from training. Duration: 0.5444375 2023-10-04 19:03:37,648 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_271-297505-0 from training. Duration: 0.99 2023-10-04 19:03:39,056 WARNING [train.py:1204] (2/4) Exclude cut with ID _35941_210_15379_1_1532995294515_4023089_213-142973-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 19:03:40,452 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7544_210_6753_1_1528587722_3576686_162-63018-0_sp1.1 from training. Duration: 0.92725 2023-10-04 19:03:40,533 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_365-217119-0_sp0.9 from training. Duration: 0.7 2023-10-04 19:03:41,895 WARNING [train.py:1204] (2/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_662-275621-0_sp0.9 from training. Duration: 0.4666875 2023-10-04 19:03:41,896 WARNING [train.py:1204] (2/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_374-187555-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 19:03:43,798 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_289-180904-0_sp1.1 from training. Duration: 0.6909375 2023-10-04 19:03:45,990 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.1.encoder.layers.0.self_attn_weights.whiten_keys, num_groups=4, num_channels=128, metric=3.71 vs. limit=6.0 2023-10-04 19:03:46,480 WARNING [train.py:1204] (2/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_189-47206-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 19:03:46,518 WARNING [train.py:1204] (2/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_234-14174-0_sp1.1 from training. Duration: 0.7454375 2023-10-04 19:03:47,775 WARNING [train.py:1204] (2/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_576-76621-0_sp1.1 from training. Duration: 0.963625 2023-10-04 19:03:47,855 WARNING [train.py:1204] (2/4) Exclude cut with ID _13253_210_1925_1_1530757775819_3577873_253-246386-0 from training. Duration: 0.9 2023-10-04 19:03:50,491 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.844e+02 2.154e+02 2.528e+02 3.068e+02 4.826e+02, threshold=5.056e+02, percent-clipped=1.0 2023-10-04 19:03:52,580 WARNING [train.py:1204] (2/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_372-6869-0_sp1.1 from training. Duration: 0.4 2023-10-04 19:03:53,820 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_21434_210_4238_1_1531916339_1222252_22-54954-0_sp1.1 from training. Duration: 0.936375 2023-10-04 19:03:53,976 WARNING [train.py:1204] (2/4) Exclude cut with ID _29853_210_7497_1_1532505600851_6642110_492-170361-0_sp1.1 from training. Duration: 0.92725 2023-10-04 19:03:55,470 WARNING [train.py:1204] (2/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_505-97987-0_sp0.9 from training. Duration: 0.9555625 2023-10-04 19:03:56,800 WARNING [train.py:1204] (2/4) Exclude cut with ID _8365_210_2064_1_1528592311070_4677690_371-122295-0_sp0.9 from training. Duration: 0.611125 2023-10-04 19:03:58,191 WARNING [train.py:1204] (2/4) Exclude cut with ID _14268_210_9206_1_1530622771285_4714099_637-86630-0_sp1.1 from training. Duration: 0.463625 2023-10-04 19:03:58,211 WARNING [train.py:1204] (2/4) Exclude cut with ID _29857_210_20034_1_1532395886753_3798160_372-5678-0_sp1.1 from training. Duration: 0.963625 2023-10-04 19:04:02,186 WARNING [train.py:1204] (2/4) Exclude cut with ID _52892_210_6721_1_1534378941259_4124129_90-165108-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 19:04:02,270 WARNING [train.py:1204] (2/4) Exclude cut with ID _27052_210_5986_1_1532329197187_3585009_143-339198-0_sp1.1 from training. Duration: 0.963625 2023-10-04 19:04:02,522 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.bypass.scale_min, batch_count=1760760.0, ans=0.2 2023-10-04 19:04:06,365 WARNING [train.py:1204] (2/4) Exclude cut with ID _14268_210_9206_1_1530622771285_4714099_637-86630-0_sp0.9 from training. Duration: 0.5666875 2023-10-04 19:04:06,367 WARNING [train.py:1204] (2/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_401-255491-0 from training. Duration: 0.7 2023-10-04 19:04:07,862 WARNING [train.py:1204] (2/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_178-6946-0_sp1.1 from training. Duration: 0.97275 2023-10-04 19:04:09,405 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.1.conv_module1.balancer1.prob, batch_count=1760826.6666666667, ans=0.125 2023-10-04 19:04:16,314 WARNING [train.py:1204] (2/4) Exclude cut with ID _27485_210_4916_1_1532739191677_3668269_460-163885-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 19:04:20,466 WARNING [train.py:1204] (2/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_270-26053-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 19:04:22,488 WARNING [train.py:1204] (2/4) Exclude cut with ID _14170_210_5219_1_1530525949868_4469650_412-317077-0 from training. Duration: 0.75 2023-10-04 19:04:25,337 WARNING [train.py:1204] (2/4) Exclude cut with ID _5289_210_2084_1_1527678202736_3779299_142-103317-0 from training. Duration: 0.84 2023-10-04 19:04:25,390 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_150-143438-0_sp1.1 from training. Duration: 0.92725 2023-10-04 19:04:28,008 WARNING [train.py:1204] (2/4) Exclude cut with ID _27894_210_12392_1_1532415496078_6902439_668-269779-0_sp1.1 from training. Duration: 0.97275 2023-10-04 19:04:29,314 WARNING [train.py:1204] (2/4) Exclude cut with ID _18513_210_9206_1_1531618215223_3862620_56-222075-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 19:04:30,728 WARNING [train.py:1204] (2/4) Exclude cut with ID _4092_210_2151_1_1526643044449_3135759_356-278694-0 from training. Duration: 0.47 2023-10-04 19:04:34,813 WARNING [train.py:1204] (2/4) Exclude cut with ID _25808_210_13292_1_1532343514562_7366109_633-47524-0 from training. Duration: 0.99 2023-10-04 19:04:34,837 WARNING [train.py:1204] (2/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_872-264525-0 from training. Duration: 0.65 2023-10-04 19:04:34,869 WARNING [train.py:1204] (2/4) Exclude cut with ID _14632_210_9988_1_1530691279392_3923200_239-232545-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 19:04:34,954 WARNING [train.py:1204] (2/4) Exclude cut with ID _14349_210_8341_1_1530615710818_3442470_153-27432-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 19:04:35,091 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.1.conv_module1.balancer1.prob, batch_count=1760893.3333333333, ans=0.125 2023-10-04 19:04:39,162 WARNING [train.py:1204] (2/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_37-41919-0_sp0.9 from training. Duration: 0.9666875 2023-10-04 19:04:40,392 INFO [train.py:1046] (2/4) Epoch 50, batch 3850, loss[loss=0.1478, simple_loss=0.2278, pruned_loss=0.03386, over 23312.00 frames. ], tot_loss[loss=0.1517, simple_loss=0.2321, pruned_loss=0.03564, over 4703314.43 frames. ], batch size: 119, lr: 2.02e-03, grad_scale: 16.0 2023-10-04 19:04:41,839 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_196-289637-0_sp0.9 from training. Duration: 0.8333125 2023-10-04 19:04:45,238 WARNING [train.py:1204] (2/4) Exclude cut with ID _13678_210_7497_1_1530530069226_3463170_371-3221-0_sp1.1 from training. Duration: 0.8181875 2023-10-04 19:04:47,178 WARNING [train.py:1204] (2/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_557-324258-0 from training. Duration: 0.81 2023-10-04 19:04:48,653 WARNING [train.py:1204] (2/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_1012-279473-0_sp1.1 from training. Duration: 0.6090625 2023-10-04 19:04:50,068 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_66916_210_14656_1_1535725525_326566_7-92649-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 19:04:53,356 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_9460_210_4211_1_1528954587_10049667_480-260269-0_sp1.1 from training. Duration: 0.5090625 2023-10-04 19:04:53,532 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.1.self_attn_weights.pos_emb_skip_rate, batch_count=1760960.0, ans=0.0 2023-10-04 19:04:56,090 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_40896_210_15710_1_1533433956_7100802_121-155001-0_sp1.1 from training. Duration: 0.92725 2023-10-04 19:04:57,487 WARNING [train.py:1204] (2/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_188-171673-0_sp1.1 from training. Duration: 0.52725 2023-10-04 19:04:58,839 WARNING [train.py:1204] (2/4) Exclude cut with ID _47790_210_16187_1_1533862814066_4207519_2-39653-0 from training. Duration: 0.98 2023-10-04 19:05:03,129 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_23850_210_4238_1_1532232597_1632433_112-229012-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 19:05:04,407 WARNING [train.py:1204] (2/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_626-79528-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 19:05:07,126 WARNING [train.py:1204] (2/4) Exclude cut with ID _30304_210_6319_1_1532476827261_7582060_856-90052-0_sp1.1 from training. Duration: 0.963625 2023-10-04 19:05:07,163 WARNING [train.py:1204] (2/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_122-109023-0_sp0.9 from training. Duration: 0.8444375 2023-10-04 19:05:10,246 WARNING [train.py:1204] (2/4) Exclude cut with ID _64414_210_17301_1_1535243252048_3757759_37-70328-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 19:05:10,350 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder_embed.convnext.out_balancer.prob, batch_count=1761093.3333333333, ans=0.125 2023-10-04 19:05:11,526 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_20120_210_6753_1_1531643035_5482859_622-344907-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 19:05:11,585 WARNING [train.py:1204] (2/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_410-321956-0_sp1.1 from training. Duration: 0.92725 2023-10-04 19:05:11,599 WARNING [train.py:1204] (2/4) Exclude cut with ID _7562_210_2084_1_1528887767916_3946229_186-326422-0_sp0.9 from training. Duration: 0.9333125 2023-10-04 19:05:12,986 WARNING [train.py:1204] (2/4) Exclude cut with ID _37537_210_2253_1_1533171758568_3421949_127-285192-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 19:05:14,973 WARNING [train.py:1204] (2/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_723-228168-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 19:05:16,277 WARNING [train.py:1204] (2/4) Exclude cut with ID _13678_210_7497_1_1530530069226_3463170_426-246984-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 19:05:16,301 WARNING [train.py:1204] (2/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_399-156791-0_sp0.9 from training. Duration: 0.92225 2023-10-04 19:05:16,363 WARNING [train.py:1204] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_391-22148-0 from training. Duration: 0.77 2023-10-04 19:05:17,630 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_3475_210_2028_1_1526193578_3622569_64-297072-0 from training. Duration: 0.91 2023-10-04 19:05:18,958 WARNING [train.py:1204] (2/4) Exclude cut with ID _17875_210_2087_1_1531224012907_1821339_225-280015-0_sp1.1 from training. Duration: 0.963625 2023-10-04 19:05:18,992 WARNING [train.py:1204] (2/4) Exclude cut with ID _50655_210_18628_1_1534229942193_3772154_325-246169-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 19:05:22,131 WARNING [train.py:1204] (2/4) Exclude cut with ID _29529_210_7234_1_1532393961455_3977460_597-257348-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 19:05:22,180 WARNING [train.py:1204] (2/4) Exclude cut with ID _58895_210_8750_1_1535184081320_3649089_77-173485-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 19:05:22,200 WARNING [train.py:1204] (2/4) Exclude cut with ID _6270_210_2084_1_1527850911573_3856420_26-277343-0 from training. Duration: 0.82 2023-10-04 19:05:22,831 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.4.encoder.layers.0.self_attn_weights.whiten_keys, num_groups=4, num_channels=128, metric=1.70 vs. limit=6.0 2023-10-04 19:05:25,160 WARNING [train.py:1204] (2/4) Exclude cut with ID _14368_210_4220_1_1530532714870_3595529_144-3287-0 from training. Duration: 0.67 2023-10-04 19:05:26,587 WARNING [train.py:1204] (2/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_293-145278-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 19:05:27,950 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_40047_210_2290_1_1533290519_2374311_257-335162-0 from training. Duration: 0.95 2023-10-04 19:05:29,409 WARNING [train.py:1204] (2/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_619-344499-0_sp0.9 from training. Duration: 0.7 2023-10-04 19:05:33,647 WARNING [train.py:1204] (2/4) Exclude cut with ID _19504_210_9994_1_1531648863242_3325220_456-315379-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 19:05:35,019 WARNING [train.py:1204] (2/4) Exclude cut with ID _55857_210_2346_1_1534644007831_6898209_631-221692-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 19:05:39,015 WARNING [train.py:1204] (2/4) Exclude cut with ID _64631_210_27767_1_1535349580068_4165160_324-3682-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 19:05:40,305 WARNING [train.py:1204] (2/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_317-211107-0 from training. Duration: 0.92 2023-10-04 19:05:41,853 WARNING [train.py:1204] (2/4) Exclude cut with ID _6789_210_1534_1_1527857937218_3118950_311-61262-0 from training. Duration: 0.65 2023-10-04 19:05:45,093 WARNING [train.py:1204] (2/4) Exclude cut with ID _28323_210_16187_1_1532480395667_4100300_365-68882-0_sp1.1 from training. Duration: 0.92725 2023-10-04 19:05:45,124 WARNING [train.py:1204] (2/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_816-181825-0_sp1.1 from training. Duration: 0.92725 2023-10-04 19:05:47,285 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.conv_module2.balancer2.prob, batch_count=1761226.6666666667, ans=0.125 2023-10-04 19:05:48,377 WARNING [train.py:1204] (2/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_132-269633-0_sp1.1 from training. Duration: 0.6181875 2023-10-04 19:05:48,394 WARNING [train.py:1204] (2/4) Exclude cut with ID _44585_210_18628_1_1533605350387_4518199_280-73534-0_sp1.1 from training. Duration: 0.8090625 2023-10-04 19:05:49,685 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_68095_210_20185_1_1535718226_8888855_1009-164755-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 19:05:51,644 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_40223_210_6228_1_1533298404_4812267_400-103813-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 19:05:51,645 WARNING [train.py:1204] (2/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_491-261923-0_sp1.1 from training. Duration: 0.8909375 2023-10-04 19:05:51,652 WARNING [train.py:1204] (2/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_712-342563-0 from training. Duration: 0.97 2023-10-04 19:05:51,730 WARNING [train.py:1204] (2/4) Exclude cut with ID _41161_210_20034_1_1533431249657_3555314_175-190032-0_sp1.1 from training. Duration: 0.963625 2023-10-04 19:05:52,874 WARNING [train.py:1204] (2/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_788-298907-0 from training. Duration: 0.89 2023-10-04 19:05:52,890 WARNING [train.py:1204] (2/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_225-70961-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 19:05:54,184 INFO [train.py:1046] (2/4) Epoch 50, batch 3900, loss[loss=0.1468, simple_loss=0.2207, pruned_loss=0.03647, over 23654.00 frames. ], tot_loss[loss=0.1504, simple_loss=0.2306, pruned_loss=0.03509, over 4700712.82 frames. ], batch size: 232, lr: 2.02e-03, grad_scale: 16.0 2023-10-04 19:05:54,226 WARNING [train.py:1204] (2/4) Exclude cut with ID _58583_210_5236_1_1534726835266_3574249_235-175994-0_sp1.1 from training. Duration: 0.92725 2023-10-04 19:05:55,648 WARNING [train.py:1204] (2/4) Exclude cut with ID _12302_210_5219_1_1529924446466_8107280_907-182949-0_sp1.1 from training. Duration: 0.836375 2023-10-04 19:05:55,686 WARNING [train.py:1204] (2/4) Exclude cut with ID _54871_210_22356_1_1534665798253_3856359_94-193692-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 19:05:57,015 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_4528_210_3273_1_1527076654_3276777_342-168068-0_sp0.9 from training. Duration: 0.9666875 2023-10-04 19:05:57,065 WARNING [train.py:1204] (2/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_368-74239-0_sp1.1 from training. Duration: 0.92725 2023-10-04 19:05:57,066 WARNING [train.py:1204] (2/4) Exclude cut with ID _44831_210_13427_1_1533816011549_3689035_291-295928-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 19:05:58,411 WARNING [train.py:1204] (2/4) Exclude cut with ID _56406_210_9288_1_1534899618664_3496149_455-131997-0_sp1.1 from training. Duration: 0.97275 2023-10-04 19:05:58,418 WARNING [train.py:1204] (2/4) Exclude cut with ID _6473_210_1741_1_1527901307396_3815380_108-17335-0 from training. Duration: 0.65 2023-10-04 19:05:59,757 WARNING [train.py:1204] (2/4) Exclude cut with ID _14224_210_4381_1_1530529062037_4264010_286-135600-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 19:06:02,606 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_14056_210_4163_1_1530604483_7572827_928-321675-0_sp1.1 from training. Duration: 0.936375 2023-10-04 19:06:02,689 WARNING [train.py:1204] (2/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_81-300212-0_sp1.1 from training. Duration: 0.5454375 2023-10-04 19:06:04,018 WARNING [train.py:1204] (2/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_29-280622-0_sp0.9 from training. Duration: 0.911125 2023-10-04 19:06:05,611 WARNING [train.py:1204] (2/4) Exclude cut with ID _61079_210_4525_1_1534921008198_4011419_228-176447-0_sp1.1 from training. Duration: 0.936375 2023-10-04 19:06:08,376 WARNING [train.py:1204] (2/4) Exclude cut with ID _8394_210_6702_1_1528590504316_3540189_372-198878-0_sp1.1 from training. Duration: 0.5454375 2023-10-04 19:06:08,387 WARNING [train.py:1204] (2/4) Exclude cut with ID _64521_210_14699_1_1535243427259_3815328_244-208725-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 19:06:09,745 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8275_210_4198_1_1528455320_3892571_491-17707-0_sp0.9 from training. Duration: 0.711125 2023-10-04 19:06:11,401 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.conv_module2.balancer2.min_abs, batch_count=1761360.0, ans=0.5 2023-10-04 19:06:12,447 WARNING [train.py:1204] (2/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_375-245677-0 from training. Duration: 0.81 2023-10-04 19:06:12,467 WARNING [train.py:1204] (2/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_690-286055-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 19:06:14,000 WARNING [train.py:1204] (2/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_89-321955-0 from training. Duration: 0.8 2023-10-04 19:06:14,040 WARNING [train.py:1204] (2/4) Exclude cut with ID _18895_210_6228_1_1531357274234_3784930_213-98721-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 19:06:15,875 WARNING [train.py:1204] (2/4) Exclude cut with ID _19708_210_9206_1_1532136650733_4238364_347-65240-0 from training. Duration: 0.97 2023-10-04 19:06:17,796 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.736e+02 2.032e+02 2.234e+02 2.595e+02 4.358e+02, threshold=4.468e+02, percent-clipped=0.0 2023-10-04 19:06:17,961 WARNING [train.py:1204] (2/4) Exclude cut with ID _64135_210_2253_1_1535677267876_7608972_498-5039-0 from training. Duration: 0.78 2023-10-04 19:06:18,210 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.conv_module1.balancer2.min_positive, batch_count=1761360.0, ans=0.05 2023-10-04 19:06:22,382 WARNING [train.py:1204] (2/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_146-276944-0_sp1.1 from training. Duration: 0.7545625 2023-10-04 19:06:23,973 WARNING [train.py:1204] (2/4) Exclude cut with ID _63171_210_15488_1_1535104792794_3295110_155-34488-0_sp1.1 from training. Duration: 0.97275 2023-10-04 19:06:23,986 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_5438_210_1794_1_1527336687_2600009_233-58072-0_sp0.9 from training. Duration: 0.8555625 2023-10-04 19:06:24,033 WARNING [train.py:1204] (2/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_63-101996-0_sp0.9 from training. Duration: 0.97775 2023-10-04 19:06:26,853 WARNING [train.py:1204] (2/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_271-269084-0_sp1.1 from training. Duration: 0.7545625 2023-10-04 19:06:29,602 WARNING [train.py:1204] (2/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_345-167986-0_sp1.1 from training. Duration: 0.763625 2023-10-04 19:06:30,954 WARNING [train.py:1204] (2/4) Exclude cut with ID _39290_210_4220_1_1533862778760_3075118_92-311600-0_sp1.1 from training. Duration: 0.8 2023-10-04 19:06:30,960 WARNING [train.py:1204] (2/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_209-65306-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 19:06:32,365 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_26433_210_12108_1_1532157158_4056480_317-335807-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 19:06:39,302 WARNING [train.py:1204] (2/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_238-94127-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 19:06:39,346 WARNING [train.py:1204] (2/4) Exclude cut with ID _41314_210_12651_1_1533459476543_3766460_207-132038-0_sp1.1 from training. Duration: 0.8545625 2023-10-04 19:06:45,156 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.bypass.skip_rate, batch_count=1761493.3333333333, ans=0.09899494936611666 2023-10-04 19:06:46,716 WARNING [train.py:1204] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_167-137255-0_sp1.1 from training. Duration: 0.5818125 2023-10-04 19:06:46,824 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_2009_210_2071_1_1525481134_4588937_385-302100-0_sp0.9 from training. Duration: 0.9666875 2023-10-04 19:06:57,465 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_15517_210_6753_1_1530954008_3487336_253-110617-0_sp1.1 from training. Duration: 0.936375 2023-10-04 19:06:58,959 WARNING [train.py:1204] (2/4) Exclude cut with ID _39290_210_4220_1_1533862778760_3075118_92-311600-0_sp0.9 from training. Duration: 0.97775 2023-10-04 19:07:00,254 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_42-142473-0 from training. Duration: 0.72 2023-10-04 19:07:00,282 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_2296_210_2087_1_1525503448_3397335_57-185430-0 from training. Duration: 0.6 2023-10-04 19:07:00,294 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_11305_210_1794_1_1529750428_7178148_257-312870-0_sp0.9 from training. Duration: 0.97775 2023-10-04 19:07:00,512 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.conv_module2.balancer2.prob, batch_count=1761560.0, ans=0.125 2023-10-04 19:07:01,702 WARNING [train.py:1204] (2/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_222-198127-0 from training. Duration: 0.84 2023-10-04 19:07:02,971 WARNING [train.py:1204] (2/4) Exclude cut with ID _65263_210_22356_1_1535763598691_3921037_423-202699-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 19:07:04,287 WARNING [train.py:1204] (2/4) Exclude cut with ID _15037_210_2253_1_1530752294104_3267650_13-130361-0 from training. Duration: 0.99 2023-10-04 19:07:04,509 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.balancer2.prob, batch_count=1761560.0, ans=0.125 2023-10-04 19:07:06,884 INFO [train.py:1046] (2/4) Epoch 50, batch 3950, loss[loss=0.1421, simple_loss=0.2254, pruned_loss=0.02935, over 17233.00 frames. ], tot_loss[loss=0.1503, simple_loss=0.231, pruned_loss=0.03481, over 4699278.53 frames. ], batch size: 37, lr: 2.02e-03, grad_scale: 16.0 2023-10-04 19:07:09,936 WARNING [train.py:1204] (2/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_628-198080-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 19:07:11,292 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_3171_210_2087_1_1526109084_2330007_37-64334-0 from training. Duration: 0.82 2023-10-04 19:07:12,654 WARNING [train.py:1204] (2/4) Exclude cut with ID _5287_210_2084_1_1527418883040_3878670_12-221186-0_sp1.1 from training. Duration: 0.8909375 2023-10-04 19:07:15,856 WARNING [train.py:1204] (2/4) Exclude cut with ID _6653_210_2629_1_1527919172441_6732679_1103-56445-0_sp1.1 from training. Duration: 0.663625 2023-10-04 19:07:17,770 WARNING [train.py:1204] (2/4) Exclude cut with ID _65263_210_22356_1_1535763598691_3921037_169-140068-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 19:07:21,099 WARNING [train.py:1204] (2/4) Exclude cut with ID IC0671W0118-331961 from training. Duration: 0.896 2023-10-04 19:07:22,402 WARNING [train.py:1204] (2/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_402-102311-0_sp1.1 from training. Duration: 0.6090625 2023-10-04 19:07:22,423 WARNING [train.py:1204] (2/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_202-293523-0 from training. Duration: 0.72 2023-10-04 19:07:22,482 WARNING [train.py:1204] (2/4) Exclude cut with ID IC0679W0038-335846 from training. Duration: 0.768 2023-10-04 19:07:22,512 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_37357_210_17024_1_1533089504_2316121_79-198866-0_sp1.1 from training. Duration: 0.936375 2023-10-04 19:07:25,712 WARNING [train.py:1204] (2/4) Exclude cut with ID _7606_210_4915_1_1528894253277_5080690_292-271285-0_sp1.1 from training. Duration: 0.963625 2023-10-04 19:07:25,723 WARNING [train.py:1204] (2/4) Exclude cut with ID _3541_210_2477_1_1526475408709_3263973_377-296839-0_sp0.9 from training. Duration: 0.611125 2023-10-04 19:07:25,726 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_45886_210_17024_1_1533704917_2435333_58-131069-0_sp1.1 from training. Duration: 0.963625 2023-10-04 19:07:28,403 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6961_210_2087_1_1527937168_6815011_395-185060-0 from training. Duration: 0.95 2023-10-04 19:07:31,089 WARNING [train.py:1204] (2/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_304-266479-0_sp1.1 from training. Duration: 0.97275 2023-10-04 19:07:31,159 WARNING [train.py:1204] (2/4) Exclude cut with ID _3867_210_1883_1_1526641200575_4475830_319-349850-0_sp1.1 from training. Duration: 0.6090625 2023-10-04 19:07:32,439 WARNING [train.py:1204] (2/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_304-59227-0_sp0.9 from training. Duration: 0.8555625 2023-10-04 19:07:33,842 WARNING [train.py:1204] (2/4) Exclude cut with ID _8021_210_7994_1_1528376451358_3653069_100-140231-0_sp0.9 from training. Duration: 0.7444375 2023-10-04 19:07:35,220 WARNING [train.py:1204] (2/4) Exclude cut with ID _8833_210_5219_1_1528592665210_7553059_329-125156-0_sp0.9 from training. Duration: 0.911125 2023-10-04 19:07:43,802 WARNING [train.py:1204] (2/4) Exclude cut with ID _14459_210_2228_1_1531443641946_7516700_792-287234-0_sp1.1 from training. Duration: 0.92725 2023-10-04 19:07:45,601 WARNING [train.py:1204] (2/4) Exclude cut with ID _19708_210_9206_1_1532136650733_4238364_347-65240-0_sp1.1 from training. Duration: 0.8818125 2023-10-04 19:07:50,214 WARNING [train.py:1204] (2/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_70-27732-0 from training. Duration: 0.94 2023-10-04 19:07:51,895 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.nonlin_attention.balancer.prob, batch_count=1761826.6666666667, ans=0.125 2023-10-04 19:07:54,424 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_397-132565-0 from training. Duration: 0.99 2023-10-04 19:07:54,427 WARNING [train.py:1204] (2/4) Exclude cut with ID _49728_210_12651_1_1534041034294_6788150_624-195301-0 from training. Duration: 0.85 2023-10-04 19:07:55,119 WARNING [train.py:1204] (2/4) Exclude cut with ID _55429_210_10626_1_1534467588527_7174143_377-169391-0_sp1.1 from training. Duration: 0.87275 2023-10-04 19:07:56,428 WARNING [train.py:1204] (2/4) Exclude cut with ID _49376_210_5986_1_1533985278732_3370740_354-344695-0_sp1.1 from training. Duration: 0.9 2023-10-04 19:08:02,101 WARNING [train.py:1204] (2/4) Exclude cut with ID _4697_210_2069_1_1526815801953_3823098_276-96593-0_sp1.1 from training. Duration: 0.7 2023-10-04 19:08:02,116 WARNING [train.py:1204] (2/4) Exclude cut with ID _15012_210_1968_1_1530856820429_5095350_688-178849-0_sp0.9 from training. Duration: 0.711125 2023-10-04 19:08:03,445 WARNING [train.py:1204] (2/4) Exclude cut with ID _25565_210_12392_1_1532423009382_3952098_372-69023-0_sp1.1 from training. Duration: 0.936375 2023-10-04 19:08:03,486 WARNING [train.py:1204] (2/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_61-327062-0_sp1.1 from training. Duration: 0.736375 2023-10-04 19:08:04,764 WARNING [train.py:1204] (2/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_215-141059-0 from training. Duration: 0.62 2023-10-04 19:08:08,909 WARNING [train.py:1204] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_6-226750-0_sp1.1 from training. Duration: 0.8545625 2023-10-04 19:08:09,085 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.conv_module2.balancer1.prob, batch_count=1761893.3333333333, ans=0.125 2023-10-04 19:08:10,278 WARNING [train.py:1204] (2/4) Exclude cut with ID _14348_210_8341_1_1530599434996_3778640_240-232370-0_sp1.1 from training. Duration: 0.763625 2023-10-04 19:08:15,428 WARNING [train.py:1204] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_468-189058-0 from training. Duration: 0.96 2023-10-04 19:08:21,569 INFO [train.py:1046] (2/4) Epoch 50, batch 4000, loss[loss=0.1765, simple_loss=0.2596, pruned_loss=0.0467, over 23718.00 frames. ], tot_loss[loss=0.151, simple_loss=0.2321, pruned_loss=0.03497, over 4705114.60 frames. ], batch size: 85, lr: 2.02e-03, grad_scale: 32.0 2023-10-04 19:08:25,714 WARNING [train.py:1204] (2/4) Exclude cut with ID _21987_210_5854_1_1531821500482_3637038_247-93340-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 19:08:27,730 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.bypass.scale_min, batch_count=1761960.0, ans=0.2 2023-10-04 19:08:32,986 WARNING [train.py:1204] (2/4) Exclude cut with ID _66868_210_9527_1_1535509287954_4561140_436-191602-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 19:08:38,557 WARNING [train.py:1204] (2/4) Exclude cut with ID _18895_210_6228_1_1531357274234_3784930_394-58203-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 19:08:38,609 WARNING [train.py:1204] (2/4) Exclude cut with ID _58605_210_12210_1_1534723195939_3904259_258-281498-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 19:08:38,635 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_33348_210_5632_1_1533086912_3324030_245-292752-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 19:08:38,655 WARNING [train.py:1204] (2/4) Exclude cut with ID _67831_210_12392_1_1535612464202_3092612_346-2148-0 from training. Duration: 0.93 2023-10-04 19:08:40,011 WARNING [train.py:1204] (2/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_369-222060-0_sp0.9 from training. Duration: 0.788875 2023-10-04 19:08:40,078 WARNING [train.py:1204] (2/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_245-174553-0 from training. Duration: 0.88 2023-10-04 19:08:40,083 WARNING [train.py:1204] (2/4) Exclude cut with ID _3541_210_2477_1_1526475408709_3263973_231-104736-0_sp1.1 from training. Duration: 0.7090625 2023-10-04 19:08:40,089 WARNING [train.py:1204] (2/4) Exclude cut with ID _44585_210_18628_1_1533605350387_4518199_280-73534-0 from training. Duration: 0.89 2023-10-04 19:08:42,847 WARNING [train.py:1204] (2/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_22-201476-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 19:08:45,963 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.773e+02 2.094e+02 2.391e+02 2.911e+02 5.164e+02, threshold=4.782e+02, percent-clipped=3.0 2023-10-04 19:08:46,067 WARNING [train.py:1204] (2/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_325-62802-0_sp0.9 from training. Duration: 0.9666875 2023-10-04 19:08:46,079 WARNING [train.py:1204] (2/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_199-274062-0_sp1.1 from training. Duration: 0.97275 2023-10-04 19:08:46,082 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_8-319957-0_sp1.1 from training. Duration: 0.87275 2023-10-04 19:08:46,113 WARNING [train.py:1204] (2/4) Exclude cut with ID _29343_210_4381_1_1532602757752_3674109_224-349726-0_sp1.1 from training. Duration: 0.936375 2023-10-04 19:08:46,117 WARNING [train.py:1204] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_520-126729-0_sp1.1 from training. Duration: 0.636375 2023-10-04 19:08:48,063 WARNING [train.py:1204] (2/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_239-22076-0_sp1.1 from training. Duration: 0.77275 2023-10-04 19:08:49,552 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9012W0011-501146 from training. Duration: 0.896 2023-10-04 19:08:50,898 WARNING [train.py:1204] (2/4) Exclude cut with ID _29857_210_20034_1_1532395886753_3798160_279-185801-0_sp1.1 from training. Duration: 0.82725 2023-10-04 19:08:50,980 WARNING [train.py:1204] (2/4) Exclude cut with ID _43552_210_8913_1_1533726031059_3523429_261-223933-0_sp1.1 from training. Duration: 0.92725 2023-10-04 19:08:55,572 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9029W0025-509426 from training. Duration: 0.896 2023-10-04 19:08:55,648 WARNING [train.py:1204] (2/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_337-181502-0_sp1.1 from training. Duration: 0.5909375 2023-10-04 19:08:55,669 WARNING [train.py:1204] (2/4) Exclude cut with ID _68635_210_9317_1_1535770813410_4315870_159-332599-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 19:09:02,026 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.attention_skip_rate, batch_count=1762093.3333333333, ans=0.0 2023-10-04 19:09:03,098 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_161-276996-0 from training. Duration: 0.95 2023-10-04 19:09:03,157 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7088_210_1874_1_1527939776_1101113_127-68870-0_sp1.1 from training. Duration: 0.936375 2023-10-04 19:09:05,811 WARNING [train.py:1204] (2/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_322-217220-0_sp1.1 from training. Duration: 0.7818125 2023-10-04 19:09:07,132 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9072W0002-530636 from training. Duration: 0.768 2023-10-04 19:09:08,545 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_340-336408-0_sp1.1 from training. Duration: 0.8454375 2023-10-04 19:09:08,612 WARNING [train.py:1204] (2/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_395-37222-0 from training. Duration: 0.76 2023-10-04 19:09:09,857 WARNING [train.py:1204] (2/4) Exclude cut with ID _64099_210_17301_1_1535526977656_1578188_159-209156-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 19:09:11,157 WARNING [train.py:1204] (2/4) Exclude cut with ID _53950_210_5816_1_1534417264919_3862951_28-29681-0_sp1.1 from training. Duration: 0.92725 2023-10-04 19:09:11,459 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.conv_skip_rate, batch_count=1762160.0, ans=0.0 2023-10-04 19:09:12,499 WARNING [train.py:1204] (2/4) Exclude cut with ID _4681_210_3525_1_1527250689464_120889_15-126760-0_sp1.1 from training. Duration: 0.736375 2023-10-04 19:09:13,861 WARNING [train.py:1204] (2/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_242-3472-0_sp1.1 from training. Duration: 0.663625 2023-10-04 19:09:13,910 WARNING [train.py:1204] (2/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_436-186311-0_sp0.9 from training. Duration: 0.72225 2023-10-04 19:09:15,206 WARNING [train.py:1204] (2/4) Exclude cut with ID _3675_210_4557_1_1526639934408_4182089_452-172147-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 19:09:16,651 WARNING [train.py:1204] (2/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_80-147301-0 from training. Duration: 0.84 2023-10-04 19:09:16,676 WARNING [train.py:1204] (2/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_351-107588-0_sp1.1 from training. Duration: 0.92725 2023-10-04 19:09:18,053 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9111W0002-550781 from training. Duration: 0.896 2023-10-04 19:09:22,856 WARNING [train.py:1204] (2/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_465-156500-0_sp1.1 from training. Duration: 0.6545625 2023-10-04 19:09:25,692 WARNING [train.py:1204] (2/4) Exclude cut with ID _13938_210_9988_1_1530518718978_3693960_304-112784-0_sp1.1 from training. Duration: 0.4181875 2023-10-04 19:09:27,604 WARNING [train.py:1204] (2/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_202-67017-0_sp1.1 from training. Duration: 0.7454375 2023-10-04 19:09:27,652 WARNING [train.py:1204] (2/4) Exclude cut with ID _58414_210_8254_1_1535421609162_7151182_448-4358-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 19:09:28,977 WARNING [train.py:1204] (2/4) Exclude cut with ID _66840_210_5219_1_1535452168426_4196349_203-243234-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 19:09:30,821 WARNING [train.py:1204] (2/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_201-213513-0_sp1.1 from training. Duration: 0.97275 2023-10-04 19:09:34,807 INFO [train.py:1046] (2/4) Epoch 50, batch 4050, loss[loss=0.16, simple_loss=0.2358, pruned_loss=0.04213, over 23911.00 frames. ], tot_loss[loss=0.1514, simple_loss=0.2329, pruned_loss=0.03499, over 4719604.76 frames. ], batch size: 180, lr: 2.02e-03, grad_scale: 16.0 2023-10-04 19:09:34,935 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6291_210_3273_1_1527683649_1078498_117-187560-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 19:09:39,122 WARNING [train.py:1204] (2/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_135-174080-0_sp0.9 from training. Duration: 0.588875 2023-10-04 19:09:39,170 WARNING [train.py:1204] (2/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_45-265684-0 from training. Duration: 0.72 2023-10-04 19:09:40,533 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_435-313305-0_sp1.1 from training. Duration: 0.6454375 2023-10-04 19:09:41,999 WARNING [train.py:1204] (2/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_235-307337-0_sp1.1 from training. Duration: 0.8545625 2023-10-04 19:09:42,076 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7762_210_1794_1_1528199508_4057408_419-125009-0_sp0.9 from training. Duration: 0.92225 2023-10-04 19:09:43,498 WARNING [train.py:1204] (2/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_215-141059-0_sp1.1 from training. Duration: 0.563625 2023-10-04 19:09:44,868 WARNING [train.py:1204] (2/4) Exclude cut with ID _27013_210_1389_1_1532437173240_3252649_222-125573-0_sp1.1 from training. Duration: 0.8909375 2023-10-04 19:09:47,552 WARNING [train.py:1204] (2/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_126-274694-0_sp1.1 from training. Duration: 0.8909375 2023-10-04 19:09:49,688 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.feed_forward3.hidden_balancer.prob, batch_count=1762360.0, ans=0.125 2023-10-04 19:09:51,433 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_2592_210_2290_1_1525697025_4882925_157-70512-0_sp1.1 from training. Duration: 0.82725 2023-10-04 19:09:51,476 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_292-265928-0_sp0.9 from training. Duration: 0.5666875 2023-10-04 19:09:52,149 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.conv_module1.whiten.whitening_limit, batch_count=1762360.0, ans=15.0 2023-10-04 19:09:52,857 WARNING [train.py:1204] (2/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_1211-226066-0_sp1.1 from training. Duration: 0.6909375 2023-10-04 19:09:52,924 WARNING [train.py:1204] (2/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_394-265747-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 19:09:56,997 WARNING [train.py:1204] (2/4) Exclude cut with ID _48782_210_12658_1_1534467701544_2300422_239-67139-0_sp1.1 from training. Duration: 0.97275 2023-10-04 19:09:58,230 WARNING [train.py:1204] (2/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_329-113173-0_sp1.1 from training. Duration: 0.563625 2023-10-04 19:10:00,264 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.balancer2.prob, batch_count=1762360.0, ans=0.125 2023-10-04 19:10:00,589 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.5.encoder.layers.0.feed_forward1.out_whiten, num_groups=1, num_channels=256, metric=7.88 vs. limit=15.0 2023-10-04 19:10:02,063 WARNING [train.py:1204] (2/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_81-75849-0_sp0.9 from training. Duration: 0.4333125 2023-10-04 19:10:03,395 WARNING [train.py:1204] (2/4) Exclude cut with ID _9235_210_5219_1_1528891154253_8046120_1003-101638-0 from training. Duration: 0.73 2023-10-04 19:10:03,425 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9309W0189-640316 from training. Duration: 0.64 2023-10-04 19:10:06,165 WARNING [train.py:1204] (2/4) Exclude cut with ID _45329_210_7005_1_1534157951211_6774339_503-287883-0_sp1.1 from training. Duration: 0.8 2023-10-04 19:10:10,413 WARNING [train.py:1204] (2/4) Exclude cut with ID _8833_210_5219_1_1528592665210_7553059_611-80313-0 from training. Duration: 0.64 2023-10-04 19:10:11,812 WARNING [train.py:1204] (2/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_335-106626-0_sp1.1 from training. Duration: 0.963625 2023-10-04 19:10:15,785 WARNING [train.py:1204] (2/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_237-44332-0_sp1.1 from training. Duration: 0.8545625 2023-10-04 19:10:15,966 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.feed_forward2.hidden_balancer.prob, batch_count=1762426.6666666667, ans=0.125 2023-10-04 19:10:19,370 WARNING [train.py:1204] (2/4) Exclude cut with ID _46952_210_5986_1_1534230010357_3107379_178-147341-0_sp1.1 from training. Duration: 0.97275 2023-10-04 19:10:20,688 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_13091_210_7925_1_1530609447_8987354_946-154426-0_sp1.1 from training. Duration: 0.936375 2023-10-04 19:10:20,717 WARNING [train.py:1204] (2/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_326-17061-0_sp1.1 from training. Duration: 0.8545625 2023-10-04 19:10:23,383 WARNING [train.py:1204] (2/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_38-169920-0_sp1.1 from training. Duration: 0.82725 2023-10-04 19:10:26,247 WARNING [train.py:1204] (2/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_283-220372-0 from training. Duration: 0.56 2023-10-04 19:10:26,265 WARNING [train.py:1204] (2/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_252-216674-0_sp1.1 from training. Duration: 0.4909375 2023-10-04 19:10:27,681 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_202-91290-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 19:10:29,040 WARNING [train.py:1204] (2/4) Exclude cut with ID _41314_210_12651_1_1533459476543_3766460_80-74184-0 from training. Duration: 0.83 2023-10-04 19:10:34,501 WARNING [train.py:1204] (2/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_411-271358-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 19:10:40,163 WARNING [train.py:1204] (2/4) Exclude cut with ID _68777_210_4353_1_1535810390731_3117904_129-166012-0 from training. Duration: 0.85 2023-10-04 19:10:40,234 WARNING [train.py:1204] (2/4) Exclude cut with ID _44982_210_1511_1_1534312820108_3726029_410-109759-0_sp1.1 from training. Duration: 0.963625 2023-10-04 19:10:40,239 WARNING [train.py:1204] (2/4) Exclude cut with ID _14224_210_4381_1_1530529062037_4264010_364-22817-0_sp1.1 from training. Duration: 0.6454375 2023-10-04 19:10:41,671 WARNING [train.py:1204] (2/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_89-242335-0 from training. Duration: 0.95 2023-10-04 19:10:41,679 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_12868_210_6753_1_1530702479_3610564_151-325626-0 from training. Duration: 0.97 2023-10-04 19:10:41,680 WARNING [train.py:1204] (2/4) Exclude cut with ID _6275_210_2084_1_1528110062213_7669049_786-264332-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 19:10:44,994 WARNING [train.py:1204] (2/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_35-153886-0_sp1.1 from training. Duration: 0.763625 2023-10-04 19:10:46,355 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_24007_210_1794_1_1531918925_1663875_116-100916-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 19:10:46,369 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_162-262717-0_sp1.1 from training. Duration: 0.7545625 2023-10-04 19:10:47,814 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.feed_forward1.out_proj.dropout_p, batch_count=1762626.6666666667, ans=0.1 2023-10-04 19:10:48,831 INFO [train.py:1046] (2/4) Epoch 50, batch 4100, loss[loss=0.1544, simple_loss=0.2289, pruned_loss=0.03993, over 23835.00 frames. ], tot_loss[loss=0.153, simple_loss=0.2345, pruned_loss=0.03574, over 4718648.98 frames. ], batch size: 180, lr: 2.02e-03, grad_scale: 16.0 2023-10-04 19:10:55,005 WARNING [train.py:1204] (2/4) Exclude cut with ID _8263_210_1664_1_1528438953494_294100_40-204731-0 from training. Duration: 0.5 2023-10-04 19:10:56,402 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_12868_210_6753_1_1530702479_3610564_416-293746-0 from training. Duration: 0.9 2023-10-04 19:10:58,092 WARNING [train.py:1204] (2/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_605-219732-0 from training. Duration: 0.94 2023-10-04 19:10:59,430 WARNING [train.py:1204] (2/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_454-123002-0 from training. Duration: 0.69 2023-10-04 19:10:59,444 WARNING [train.py:1204] (2/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_25-304273-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 19:11:00,774 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6860_210_4852_1_1527917457_3833327_451-39702-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 19:11:00,805 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_304-16505-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 19:11:00,817 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6291_210_3273_1_1527683649_1078498_4-266348-0_sp1.1 from training. Duration: 0.7090625 2023-10-04 19:11:02,118 WARNING [train.py:1204] (2/4) Exclude cut with ID ID0149W0001-753701 from training. Duration: 0.989 2023-10-04 19:11:05,790 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_41251_210_15368_1_1533429767_4820695_394-55393-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 19:11:05,962 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.0.ff2_skip_rate, batch_count=1762693.3333333333, ans=0.0 2023-10-04 19:11:07,214 WARNING [train.py:1204] (2/4) Exclude cut with ID _8793_210_6310_1_1528542985139_2310009_121-179782-0_sp1.1 from training. Duration: 0.72725 2023-10-04 19:11:07,239 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_2729_210_3528_1_1525863505_3882214_421-73047-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 19:11:07,299 WARNING [train.py:1204] (2/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_278-154492-0_sp0.9 from training. Duration: 0.8444375 2023-10-04 19:11:12,694 WARNING [train.py:1204] (2/4) Exclude cut with ID _14170_210_5219_1_1530525949868_4469650_412-317077-0_sp0.9 from training. Duration: 0.8333125 2023-10-04 19:11:12,779 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder_embed.conv.8.prob, batch_count=1762693.3333333333, ans=0.125 2023-10-04 19:11:13,952 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.700e+02 2.059e+02 2.320e+02 2.888e+02 5.611e+02, threshold=4.640e+02, percent-clipped=1.0 2023-10-04 19:11:14,037 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_727-138317-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 19:11:14,091 WARNING [train.py:1204] (2/4) Exclude cut with ID _25808_210_13292_1_1532343514562_7366109_345-335048-0_sp1.1 from training. Duration: 0.92725 2023-10-04 19:11:14,112 WARNING [train.py:1204] (2/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_506-23200-0 from training. Duration: 0.97 2023-10-04 19:11:16,077 WARNING [train.py:1204] (2/4) Exclude cut with ID _44718_210_5816_1_1534050051869_6469030_643-245187-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 19:11:16,083 WARNING [train.py:1204] (2/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_42-186162-0_sp1.1 from training. Duration: 0.836375 2023-10-04 19:11:16,099 WARNING [train.py:1204] (2/4) Exclude cut with ID _3231_210_2106_1_1526205864934_6784220_171-256004-0_sp1.1 from training. Duration: 0.7818125 2023-10-04 19:11:16,134 WARNING [train.py:1204] (2/4) Exclude cut with ID _25914_210_6400_1_1532141317058_644740_43-248478-0_sp1.1 from training. Duration: 0.936375 2023-10-04 19:11:16,180 WARNING [train.py:1204] (2/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_577-287150-0 from training. Duration: 0.82 2023-10-04 19:11:20,392 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_46015_210_7234_1_1533863319_3087837_376-276068-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 19:11:21,714 WARNING [train.py:1204] (2/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_51-200220-0 from training. Duration: 0.56 2023-10-04 19:11:23,046 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_452-96101-0_sp1.1 from training. Duration: 0.963625 2023-10-04 19:11:25,811 WARNING [train.py:1204] (2/4) Exclude cut with ID _13252_210_1925_1_1530671372047_3336909_263-168388-0_sp1.1 from training. Duration: 0.7818125 2023-10-04 19:11:25,812 WARNING [train.py:1204] (2/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_469-94673-0 from training. Duration: 0.79 2023-10-04 19:11:27,747 WARNING [train.py:1204] (2/4) Exclude cut with ID _14922_210_6228_1_1530707435273_3693310_303-228738-0_sp1.1 from training. Duration: 0.763625 2023-10-04 19:11:27,798 WARNING [train.py:1204] (2/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_262-250826-0_sp1.1 from training. Duration: 0.9 2023-10-04 19:11:27,827 WARNING [train.py:1204] (2/4) Exclude cut with ID _68777_210_4353_1_1535810390731_3117904_129-166012-0_sp1.1 from training. Duration: 0.77275 2023-10-04 19:11:30,619 WARNING [train.py:1204] (2/4) Exclude cut with ID _58895_210_8750_1_1535184081320_3649089_265-323810-0 from training. Duration: 0.96 2023-10-04 19:11:32,041 WARNING [train.py:1204] (2/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_120-76843-0_sp0.9 from training. Duration: 0.611125 2023-10-04 19:11:32,089 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7544_210_6753_1_1528587722_3576686_295-258479-0_sp0.9 from training. Duration: 0.9333125 2023-10-04 19:11:35,366 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7046_210_6753_1_1527895337_6920917_783-278109-0 from training. Duration: 0.77 2023-10-04 19:11:35,413 WARNING [train.py:1204] (2/4) Exclude cut with ID _42326_210_7770_1_1533814282531_3707269_113-308323-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 19:11:36,056 WARNING [train.py:1204] (2/4) Exclude cut with ID _7852_210_5219_1_1528286471316_8139400_242-105336-0_sp0.9 from training. Duration: 0.8 2023-10-04 19:11:38,580 WARNING [train.py:1204] (2/4) Exclude cut with ID _56235_210_10640_1_1534557335235_4161099_114-277905-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 19:11:38,743 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.bypass_mid.scale_min, batch_count=1762826.6666666667, ans=0.2 2023-10-04 19:11:42,900 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_13118_210_7925_1_1530755612_7608297_42-66683-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 19:11:45,655 WARNING [train.py:1204] (2/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_901-79438-0_sp1.1 from training. Duration: 0.8545625 2023-10-04 19:11:47,351 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_30280_210_1881_1_1532872779_3660139_211-182194-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 19:11:54,260 WARNING [train.py:1204] (2/4) Exclude cut with ID _14898_210_9206_1_1530766812377_4177190_621-45732-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 19:11:54,268 WARNING [train.py:1204] (2/4) Exclude cut with ID _35350_210_16040_1_1533040191183_4075299_224-133775-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 19:11:57,587 WARNING [train.py:1204] (2/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_147-293311-0_sp1.1 from training. Duration: 0.8545625 2023-10-04 19:11:58,944 WARNING [train.py:1204] (2/4) Exclude cut with ID _12302_210_5219_1_1529924446466_8107280_835-279754-0_sp1.1 from training. Duration: 0.8181875 2023-10-04 19:12:03,065 INFO [train.py:1046] (2/4) Epoch 50, batch 4150, loss[loss=0.1536, simple_loss=0.2165, pruned_loss=0.04534, over 23877.00 frames. ], tot_loss[loss=0.1531, simple_loss=0.2345, pruned_loss=0.03582, over 4724523.99 frames. ], batch size: 195, lr: 2.02e-03, grad_scale: 16.0 2023-10-04 19:12:03,227 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8453_210_4211_1_1528502940_8562730_347-330673-0_sp0.9 from training. Duration: 0.8 2023-10-04 19:12:04,654 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.nonlin_attention.balancer.prob, batch_count=1762960.0, ans=0.125 2023-10-04 19:12:06,438 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_213-103282-0_sp0.9 from training. Duration: 0.8666875 2023-10-04 19:12:06,512 WARNING [train.py:1204] (2/4) Exclude cut with ID _49728_210_12651_1_1534041034294_6788150_66-43496-0_sp1.1 from training. Duration: 0.97275 2023-10-04 19:12:06,514 WARNING [train.py:1204] (2/4) Exclude cut with ID _19023_210_4353_1_1531378892552_5820780_28-36615-0_sp1.1 from training. Duration: 0.8818125 2023-10-04 19:12:09,460 WARNING [train.py:1204] (2/4) Exclude cut with ID _14349_210_8341_1_1530615710818_3442470_115-285975-0 from training. Duration: 0.86 2023-10-04 19:12:09,481 WARNING [train.py:1204] (2/4) Exclude cut with ID _12734_210_8341_1_1530183614296_3891929_236-47464-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 19:12:10,785 WARNING [train.py:1204] (2/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_177-236809-0 from training. Duration: 0.49 2023-10-04 19:12:10,865 WARNING [train.py:1204] (2/4) Exclude cut with ID _13678_210_7497_1_1530530069226_3463170_371-3221-0 from training. Duration: 0.9 2023-10-04 19:12:10,879 WARNING [train.py:1204] (2/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_48-338976-0 from training. Duration: 0.89 2023-10-04 19:12:12,266 WARNING [train.py:1204] (2/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_1167-309744-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 19:12:14,048 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.feed_forward3.hidden_balancer.prob, batch_count=1762960.0, ans=0.125 2023-10-04 19:12:14,142 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.conv_module2.balancer2.prob, batch_count=1762960.0, ans=0.125 2023-10-04 19:12:15,365 WARNING [train.py:1204] (2/4) Exclude cut with ID _30327_210_16049_1_1532484161295_3924169_401-70819-0_sp0.9 from training. Duration: 0.9555625 2023-10-04 19:12:15,380 WARNING [train.py:1204] (2/4) Exclude cut with ID _55134_210_21982_1_1534590028377_3882170_346-91022-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 19:12:20,096 WARNING [train.py:1204] (2/4) Exclude cut with ID _14459_210_2228_1_1531443641946_7516700_174-118801-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 19:12:21,338 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_269-278006-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 19:12:21,391 WARNING [train.py:1204] (2/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_390-339137-0_sp0.9 from training. Duration: 0.62225 2023-10-04 19:12:22,819 WARNING [train.py:1204] (2/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_253-308377-0_sp1.1 from training. Duration: 0.4454375 2023-10-04 19:12:24,158 WARNING [train.py:1204] (2/4) Exclude cut with ID _23825_210_13449_1_1531915313526_3497150_240-271367-0_sp1.1 from training. Duration: 0.8818125 2023-10-04 19:12:25,484 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1015-343884-0_sp0.9 from training. Duration: 0.688875 2023-10-04 19:12:28,967 WARNING [train.py:1204] (2/4) Exclude cut with ID _23514_210_1389_1_1531832139785_3031439_335-48210-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 19:12:32,446 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_309-65177-0_sp1.1 from training. Duration: 0.8 2023-10-04 19:12:33,607 WARNING [train.py:1204] (2/4) Exclude cut with ID _14368_210_4220_1_1530532714870_3595529_231-137892-0 from training. Duration: 0.99 2023-10-04 19:12:36,777 WARNING [train.py:1204] (2/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_57-270037-0 from training. Duration: 0.77 2023-10-04 19:12:36,782 WARNING [train.py:1204] (2/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_465-214726-0_sp1.1 from training. Duration: 0.7818125 2023-10-04 19:12:36,867 WARNING [train.py:1204] (2/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_106-300985-0 from training. Duration: 0.9 2023-10-04 19:12:38,151 WARNING [train.py:1204] (2/4) Exclude cut with ID _14632_210_9988_1_1530691279392_3923200_161-129530-0_sp1.1 from training. Duration: 0.87275 2023-10-04 19:12:38,171 WARNING [train.py:1204] (2/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_514-88370-0_sp1.1 from training. Duration: 0.963625 2023-10-04 19:12:40,885 WARNING [train.py:1204] (2/4) Exclude cut with ID _56495_210_4234_1_1534748080334_4136219_305-334505-0_sp1.1 from training. Duration: 0.92725 2023-10-04 19:12:40,967 WARNING [train.py:1204] (2/4) Exclude cut with ID _42875_210_14856_1_1533704397487_4012433_61-264914-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 19:12:46,386 WARNING [train.py:1204] (2/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_91-148831-0 from training. Duration: 0.99 2023-10-04 19:12:46,616 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.feed_forward2.hidden_balancer.prob, batch_count=1763160.0, ans=0.125 2023-10-04 19:12:49,539 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_435-313305-0_sp0.9 from training. Duration: 0.788875 2023-10-04 19:12:50,944 WARNING [train.py:1204] (2/4) Exclude cut with ID _16744_210_5986_1_1531724405384_3907567_242-111316-0_sp1.1 from training. Duration: 0.8454375 2023-10-04 19:12:51,014 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_257-20733-0 from training. Duration: 0.76 2023-10-04 19:12:52,375 WARNING [train.py:1204] (2/4) Exclude cut with ID _8064_210_5399_1_1528372299291_4282973_4-91037-0_sp1.1 from training. Duration: 0.8 2023-10-04 19:12:53,699 WARNING [train.py:1204] (2/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_3-276701-0 from training. Duration: 0.91 2023-10-04 19:12:55,171 WARNING [train.py:1204] (2/4) Exclude cut with ID _3992_210_1747_1_1526644816031_3731060_36-147855-0_sp1.1 from training. Duration: 0.7090625 2023-10-04 19:12:55,386 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=1763160.0, ans=0.1 2023-10-04 19:12:56,541 WARNING [train.py:1204] (2/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_1296-298671-0_sp1.1 from training. Duration: 0.963625 2023-10-04 19:12:58,467 WARNING [train.py:1204] (2/4) Exclude cut with ID _68965_210_1883_1_1535698962272_3497490_309-334593-0_sp1.1 from training. Duration: 0.92725 2023-10-04 19:12:59,810 WARNING [train.py:1204] (2/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_128-296581-0 from training. Duration: 0.74 2023-10-04 19:12:59,811 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_17613_210_2151_1_1531137569_2366160_170-299079-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 19:12:59,813 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6287_210_3273_1_1527509506_2653716_182-199887-0_sp1.1 from training. Duration: 0.463625 2023-10-04 19:13:00,561 WARNING [train.py:1204] (2/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_80-135200-0_sp1.1 from training. Duration: 0.7181875 2023-10-04 19:13:01,796 WARNING [train.py:1204] (2/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_34-183685-0 from training. Duration: 0.98 2023-10-04 19:13:01,815 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8278_210_2550_1_1528637099_4220413_418-210155-0_sp1.1 from training. Duration: 0.92725 2023-10-04 19:13:01,819 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8853_210_3273_1_1528633899_818968_51-187685-0_sp0.9 from training. Duration: 0.7666875 2023-10-04 19:13:01,834 WARNING [train.py:1204] (2/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_332-221057-0_sp1.1 from training. Duration: 0.6181875 2023-10-04 19:13:03,258 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_2221_210_1988_1_1525597051_4935541_415-296882-0 from training. Duration: 0.77 2023-10-04 19:13:03,296 WARNING [train.py:1204] (2/4) Exclude cut with ID _33640_210_2655_1_1533117605479_4855469_770-278185-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 19:13:04,533 WARNING [train.py:1204] (2/4) Exclude cut with ID _5287_210_2084_1_1527418883040_3878670_333-48508-0_sp1.1 from training. Duration: 0.5454375 2023-10-04 19:13:04,603 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_43802_210_2290_1_1533516203_8829504_142-276839-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 19:13:07,629 WARNING [train.py:1204] (2/4) Exclude cut with ID _28394_210_8913_1_1532430049518_3650118_126-139371-0_sp1.1 from training. Duration: 0.92725 2023-10-04 19:13:07,642 WARNING [train.py:1204] (2/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_76-203867-0 from training. Duration: 0.72 2023-10-04 19:13:07,676 WARNING [train.py:1204] (2/4) Exclude cut with ID _6194_210_1534_1_1527511826160_3941194_14-282876-0_sp0.9 from training. Duration: 0.788875 2023-10-04 19:13:13,224 WARNING [train.py:1204] (2/4) Exclude cut with ID _35355_210_16040_1_1533212988977_4016229_74-190076-0_sp1.1 from training. Duration: 0.72725 2023-10-04 19:13:13,455 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.feed_forward1.out_proj.dropout_p, batch_count=1763226.6666666667, ans=0.1 2023-10-04 19:13:14,803 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.conv_module1.balancer1.prob, batch_count=1763226.6666666667, ans=0.125 2023-10-04 19:13:16,048 WARNING [train.py:1204] (2/4) Exclude cut with ID _33920_210_4220_1_1533257851500_3084399_198-334063-0 from training. Duration: 0.94 2023-10-04 19:13:17,308 INFO [train.py:1046] (2/4) Epoch 50, batch 4200, loss[loss=0.1254, simple_loss=0.2046, pruned_loss=0.02305, over 24285.00 frames. ], tot_loss[loss=0.152, simple_loss=0.2325, pruned_loss=0.03569, over 4699626.55 frames. ], batch size: 56, lr: 2.02e-03, grad_scale: 8.0 2023-10-04 19:13:17,409 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6291_210_3273_1_1527683649_1078498_4-266348-0_sp0.9 from training. Duration: 0.8666875 2023-10-04 19:13:19,408 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_23856_210_4238_1_1531994785_3087934_122-38075-0_sp1.1 from training. Duration: 0.936375 2023-10-04 19:13:20,821 WARNING [train.py:1204] (2/4) Exclude cut with ID _4697_210_2069_1_1526815801953_3823098_276-96593-0_sp0.9 from training. Duration: 0.8555625 2023-10-04 19:13:20,884 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_308-278734-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 19:13:20,886 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_5190_210_4557_1_1527245078_2965646_353-250603-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 19:13:22,280 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder_embed.conv.8.prob, batch_count=1763293.3333333333, ans=0.125 2023-10-04 19:13:23,592 WARNING [train.py:1204] (2/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_472-216663-0 from training. Duration: 0.92 2023-10-04 19:13:26,335 WARNING [train.py:1204] (2/4) Exclude cut with ID _7537_210_2084_1_1528502395431_3924359_14-312906-0 from training. Duration: 0.84 2023-10-04 19:13:28,074 WARNING [train.py:1204] (2/4) Exclude cut with ID _29003_210_19596_1_1532507391720_3894098_559-236660-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 19:13:29,625 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=1763293.3333333333, ans=0.1 2023-10-04 19:13:30,829 WARNING [train.py:1204] (2/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_312-204798-0_sp1.1 from training. Duration: 0.8454375 2023-10-04 19:13:33,968 WARNING [train.py:1204] (2/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_17-309070-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 19:13:34,320 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=1763360.0, ans=0.1 2023-10-04 19:13:35,474 WARNING [train.py:1204] (2/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_344-152526-0_sp0.9 from training. Duration: 0.588875 2023-10-04 19:13:36,868 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_41452_210_7291_1_1533437687_3941281_405-11559-0_sp1.1 from training. Duration: 0.82725 2023-10-04 19:13:38,519 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_48730_210_5399_1_1533885886_2212769_157-124052-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 19:13:38,572 WARNING [train.py:1204] (2/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_415-78792-0 from training. Duration: 0.81 2023-10-04 19:13:38,581 WARNING [train.py:1204] (2/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_345-340690-0_sp1.1 from training. Duration: 0.8454375 2023-10-04 19:13:41,319 WARNING [train.py:1204] (2/4) Exclude cut with ID _23825_210_13449_1_1531915313526_3497150_124-8138-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 19:13:42,644 WARNING [train.py:1204] (2/4) Exclude cut with ID _49258_210_7994_1_1534122017863_3738861_194-24864-0_sp1.1 from training. Duration: 0.936375 2023-10-04 19:13:42,672 WARNING [train.py:1204] (2/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_369-222060-0_sp1.1 from training. Duration: 0.6454375 2023-10-04 19:13:43,864 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.744e+02 2.065e+02 2.338e+02 2.693e+02 5.755e+02, threshold=4.677e+02, percent-clipped=2.0 2023-10-04 19:13:44,025 WARNING [train.py:1204] (2/4) Exclude cut with ID _12369_210_4381_1_1530356078581_4388520_480-13146-0_sp0.9 from training. Duration: 0.9444375 2023-10-04 19:13:44,232 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.conv_module1.balancer1.max_abs, batch_count=1763360.0, ans=10.0 2023-10-04 19:13:45,461 WARNING [train.py:1204] (2/4) Exclude cut with ID _13253_210_1925_1_1530757775819_3577873_191-144977-0 from training. Duration: 0.78 2023-10-04 19:13:45,479 WARNING [train.py:1204] (2/4) Exclude cut with ID _30304_210_6319_1_1532476827261_7582060_1128-140266-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 19:13:50,291 WARNING [train.py:1204] (2/4) Exclude cut with ID _8772_210_1850_1_1528635595972_3664011_247-336448-0_sp0.9 from training. Duration: 0.82225 2023-10-04 19:13:51,655 WARNING [train.py:1204] (2/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_318-59097-0_sp1.1 from training. Duration: 0.6909375 2023-10-04 19:13:53,088 WARNING [train.py:1204] (2/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_540-131164-0_sp1.1 from training. Duration: 0.836375 2023-10-04 19:13:54,519 WARNING [train.py:1204] (2/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_39-110836-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 19:13:56,422 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.4.encoder.layers.2.self_attn_weights.whiten_keys, num_groups=4, num_channels=128, metric=2.60 vs. limit=6.0 2023-10-04 19:13:58,514 WARNING [train.py:1204] (2/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_860-96198-0_sp1.1 from training. Duration: 0.663625 2023-10-04 19:13:58,521 WARNING [train.py:1204] (2/4) Exclude cut with ID _15133_210_6228_1_1530794512074_3080379_29-264458-0 from training. Duration: 0.58 2023-10-04 19:13:58,548 WARNING [train.py:1204] (2/4) Exclude cut with ID _55209_210_1531_1_1534675828996_4597160_797-348343-0_sp1.1 from training. Duration: 0.963625 2023-10-04 19:14:00,575 WARNING [train.py:1204] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_23-189234-0_sp1.1 from training. Duration: 0.7545625 2023-10-04 19:14:05,214 WARNING [train.py:1204] (2/4) Exclude cut with ID _5410_210_1388_1_1527397055795_1719020_39-112221-0_sp1.1 from training. Duration: 0.62725 2023-10-04 19:14:06,680 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_43802_210_2290_1_1533516203_8829504_100-348441-0_sp1.1 from training. Duration: 0.82725 2023-10-04 19:14:06,853 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.0.nonlin_attention.balancer.prob, batch_count=1763493.3333333333, ans=0.125 2023-10-04 19:14:10,859 WARNING [train.py:1204] (2/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_143-86344-0_sp1.1 from training. Duration: 0.7 2023-10-04 19:14:14,140 WARNING [train.py:1204] (2/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_126-29104-0 from training. Duration: 0.85 2023-10-04 19:14:16,887 WARNING [train.py:1204] (2/4) Exclude cut with ID _39846_210_12651_1_1533603651992_1318030_138-31771-0_sp1.1 from training. Duration: 0.963625 2023-10-04 19:14:17,144 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.out_combiner.scale_min, batch_count=1763560.0, ans=0.2 2023-10-04 19:14:17,170 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.conv_module1.balancer1.min_positive, batch_count=1763560.0, ans=0.025 2023-10-04 19:14:21,596 WARNING [train.py:1204] (2/4) Exclude cut with ID _2220_210_1819_1_1525509109898_3658680_50-99837-0_sp1.1 from training. Duration: 0.4909375 2023-10-04 19:14:21,663 WARNING [train.py:1204] (2/4) Exclude cut with ID _6274_210_2084_1_1527937583837_7494020_836-210550-0_sp1.1 from training. Duration: 0.97275 2023-10-04 19:14:21,896 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.conv_skip_rate, batch_count=1763560.0, ans=0.0 2023-10-04 19:14:23,056 WARNING [train.py:1204] (2/4) Exclude cut with ID _28323_210_16187_1_1532480395667_4100300_306-74561-0 from training. Duration: 0.94 2023-10-04 19:14:27,354 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_177-58005-0_sp1.1 from training. Duration: 0.563625 2023-10-04 19:14:31,950 INFO [train.py:1046] (2/4) Epoch 50, batch 4250, loss[loss=0.1625, simple_loss=0.2433, pruned_loss=0.04081, over 23448.00 frames. ], tot_loss[loss=0.1511, simple_loss=0.2314, pruned_loss=0.03536, over 4699033.29 frames. ], batch size: 120, lr: 2.02e-03, grad_scale: 8.0 2023-10-04 19:14:32,036 WARNING [train.py:1204] (2/4) Exclude cut with ID _54871_210_22356_1_1534665798253_3856359_19-342632-0_sp1.1 from training. Duration: 0.82725 2023-10-04 19:14:32,051 WARNING [train.py:1204] (2/4) Exclude cut with ID _35355_210_16040_1_1533212988977_4016229_74-190076-0_sp0.9 from training. Duration: 0.888875 2023-10-04 19:14:34,109 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.0.conv_module2.balancer2.min_abs, batch_count=1763626.6666666667, ans=0.5 2023-10-04 19:14:35,257 WARNING [train.py:1204] (2/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_285-97786-0_sp1.1 from training. Duration: 0.97275 2023-10-04 19:14:36,653 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder_embed.conv.8.prob, batch_count=1763626.6666666667, ans=0.125 2023-10-04 19:14:40,731 WARNING [train.py:1204] (2/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_337-181502-0_sp0.9 from training. Duration: 0.72225 2023-10-04 19:14:40,765 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_353-182855-0 from training. Duration: 0.96 2023-10-04 19:14:40,798 WARNING [train.py:1204] (2/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_409-327730-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 19:14:43,987 WARNING [train.py:1204] (2/4) Exclude cut with ID _8030_210_7497_1_1528444886781_3531089_373-19403-0_sp1.1 from training. Duration: 0.97275 2023-10-04 19:14:48,425 WARNING [train.py:1204] (2/4) Exclude cut with ID _6789_210_1534_1_1527857937218_3118950_231-340237-0_sp1.1 from training. Duration: 0.936375 2023-10-04 19:14:52,572 WARNING [train.py:1204] (2/4) Exclude cut with ID _6493_210_1747_1_1528459210424_4012470_536-57832-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 19:14:52,585 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7046_210_6753_1_1527895337_6920917_756-116002-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 19:14:54,027 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_37152_210_9697_1_1533106573_3844188_29-239070-0_sp1.1 from training. Duration: 0.8909375 2023-10-04 19:14:54,032 WARNING [train.py:1204] (2/4) Exclude cut with ID _19023_210_4353_1_1531378892552_5820780_61-334539-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 19:14:56,700 WARNING [train.py:1204] (2/4) Exclude cut with ID _14170_210_5219_1_1530525949868_4469650_159-28488-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 19:14:56,774 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_40896_210_15710_1_1533433956_7100802_764-71272-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 19:14:58,200 WARNING [train.py:1204] (2/4) Exclude cut with ID _44902_210_16187_1_1533888006062_3792078_258-178274-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 19:14:58,402 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.bypass_mid.scale_min, batch_count=1763693.3333333333, ans=0.2 2023-10-04 19:15:00,763 WARNING [train.py:1204] (2/4) Exclude cut with ID _7537_210_2084_1_1528502395431_3924359_14-312906-0_sp1.1 from training. Duration: 0.763625 2023-10-04 19:15:00,861 WARNING [train.py:1204] (2/4) Exclude cut with ID _49643_210_7989_1_1534640244224_3939031_674-116639-0_sp1.1 from training. Duration: 0.963625 2023-10-04 19:15:02,712 WARNING [train.py:1204] (2/4) Exclude cut with ID _22826_210_9994_1_1531908201522_3279169_415-161-0 from training. Duration: 0.98 2023-10-04 19:15:07,402 WARNING [train.py:1204] (2/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_264-348896-0 from training. Duration: 0.85 2023-10-04 19:15:07,409 WARNING [train.py:1204] (2/4) Exclude cut with ID _51899_210_20034_1_1534129254466_3669220_233-161492-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 19:15:08,636 WARNING [train.py:1204] (2/4) Exclude cut with ID _31255_210_13427_1_1532865619495_3815143_233-200023-0_sp1.1 from training. Duration: 0.936375 2023-10-04 19:15:08,658 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_23850_210_4238_1_1532232597_1632433_100-229879-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 19:15:10,019 WARNING [train.py:1204] (2/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_375-245677-0_sp0.9 from training. Duration: 0.9 2023-10-04 19:15:10,022 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_40223_210_6228_1_1533298404_4812267_355-317415-0_sp1.1 from training. Duration: 0.97275 2023-10-04 19:15:10,069 WARNING [train.py:1204] (2/4) Exclude cut with ID _9679_210_7497_1_1529129212906_4363929_235-81488-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 19:15:10,669 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.3.encoder.layers.2.feed_forward2.out_whiten, num_groups=1, num_channels=512, metric=11.85 vs. limit=15.0 2023-10-04 19:15:13,364 WARNING [train.py:1204] (2/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_172-215932-0_sp1.1 from training. Duration: 0.6 2023-10-04 19:15:14,720 WARNING [train.py:1204] (2/4) Exclude cut with ID _12369_210_4381_1_1530356078581_4388520_64-298917-0_sp1.1 from training. Duration: 0.8 2023-10-04 19:15:14,961 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.nonlin_attention.balancer.prob, batch_count=1763826.6666666667, ans=0.125 2023-10-04 19:15:19,428 WARNING [train.py:1204] (2/4) Exclude cut with ID _38020_210_5986_1_1533112252259_7006089_131-53533-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 19:15:20,813 WARNING [train.py:1204] (2/4) Exclude cut with ID _42334_210_7770_1_1534206610396_3605880_392-48154-0_sp1.1 from training. Duration: 0.963625 2023-10-04 19:15:22,056 WARNING [train.py:1204] (2/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_337-181502-0 from training. Duration: 0.65 2023-10-04 19:15:22,064 WARNING [train.py:1204] (2/4) Exclude cut with ID _6267_210_2084_1_1527764658526_3894730_375-219887-0_sp0.9 from training. Duration: 0.7444375 2023-10-04 19:15:23,450 WARNING [train.py:1204] (2/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_60-146189-0 from training. Duration: 0.98 2023-10-04 19:15:24,907 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_3133_210_2087_1_1526106446_2324690_243-143119-0_sp1.1 from training. Duration: 0.7 2023-10-04 19:15:25,183 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.feed_forward1.hidden_balancer.prob, batch_count=1763826.6666666667, ans=0.125 2023-10-04 19:15:27,568 WARNING [train.py:1204] (2/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_1267-272693-0_sp0.9 from training. Duration: 0.811125 2023-10-04 19:15:28,975 WARNING [train.py:1204] (2/4) Exclude cut with ID _69884_210_9771_1_1535846392818_4166169_83-182645-0_sp1.1 from training. Duration: 0.97275 2023-10-04 19:15:29,001 WARNING [train.py:1204] (2/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_403-292260-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 19:15:30,419 WARNING [train.py:1204] (2/4) Exclude cut with ID _55429_210_10626_1_1534467588527_7174143_377-169391-0 from training. Duration: 0.96 2023-10-04 19:15:31,774 WARNING [train.py:1204] (2/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_494-199538-0_sp1.1 from training. Duration: 0.6818125 2023-10-04 19:15:31,831 WARNING [train.py:1204] (2/4) Exclude cut with ID _3067_210_2667_1_1526212974640_4503168_119-175776-0_sp1.1 from training. Duration: 0.67275 2023-10-04 19:15:35,214 WARNING [train.py:1204] (2/4) Exclude cut with ID _3231_210_2106_1_1526205864934_6784220_183-303839-0_sp1.1 from training. Duration: 0.97275 2023-10-04 19:15:38,416 WARNING [train.py:1204] (2/4) Exclude cut with ID _18513_210_9206_1_1531618215223_3862620_165-38958-0_sp1.1 from training. Duration: 0.963625 2023-10-04 19:15:38,519 WARNING [train.py:1204] (2/4) Exclude cut with ID _41314_210_12651_1_1533459476543_3766460_87-272068-0_sp1.1 from training. Duration: 0.7545625 2023-10-04 19:15:41,047 WARNING [train.py:1204] (2/4) Exclude cut with ID _3541_210_2477_1_1526475408709_3263973_353-95-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 19:15:42,786 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_2009_210_2071_1_1525481134_4588937_73-302813-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 19:15:44,198 WARNING [train.py:1204] (2/4) Exclude cut with ID _64220_210_9527_1_1535250151772_4875259_88-95011-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 19:15:45,509 INFO [train.py:1046] (2/4) Epoch 50, batch 4300, loss[loss=0.1628, simple_loss=0.2374, pruned_loss=0.04417, over 23792.00 frames. ], tot_loss[loss=0.1511, simple_loss=0.2318, pruned_loss=0.03522, over 4703868.87 frames. ], batch size: 164, lr: 2.02e-03, grad_scale: 8.0 2023-10-04 19:15:45,563 WARNING [train.py:1204] (2/4) Exclude cut with ID _44902_210_16187_1_1533888006062_3792078_160-12107-0_sp1.1 from training. Duration: 0.92725 2023-10-04 19:15:45,569 WARNING [train.py:1204] (2/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_547-5424-0 from training. Duration: 0.94 2023-10-04 19:15:47,588 WARNING [train.py:1204] (2/4) Exclude cut with ID _21783_210_11357_1_1531742458730_3762679_268-85656-0_sp1.1 from training. Duration: 0.936375 2023-10-04 19:15:51,827 WARNING [train.py:1204] (2/4) Exclude cut with ID _66210_210_12651_1_1535446812922_3365850_42-284985-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 19:15:51,862 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder_embed.convnext.layerdrop_rate, batch_count=1763960.0, ans=0.015 2023-10-04 19:15:53,163 WARNING [train.py:1204] (2/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_465-214726-0_sp0.9 from training. Duration: 0.9555625 2023-10-04 19:15:58,567 WARNING [train.py:1204] (2/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_50-95770-0_sp1.1 from training. Duration: 0.936375 2023-10-04 19:16:02,951 WARNING [train.py:1204] (2/4) Exclude cut with ID _30800_210_22382_1_1532499347904_4515959_269-77567-0_sp1.1 from training. Duration: 0.963625 2023-10-04 19:16:02,953 WARNING [train.py:1204] (2/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_7-111954-0 from training. Duration: 0.56 2023-10-04 19:16:05,663 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_43802_210_2290_1_1533516203_8829504_85-195240-0_sp0.9 from training. Duration: 0.9666875 2023-10-04 19:16:05,954 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.conv_module2.balancer1.min_positive, batch_count=1764026.6666666667, ans=0.025 2023-10-04 19:16:06,022 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.feed_forward2.hidden_balancer.prob, batch_count=1764026.6666666667, ans=0.125 2023-10-04 19:16:08,157 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_2822_210_2715_1_1526112586_1999587_165-36656-0_sp0.9 from training. Duration: 0.988875 2023-10-04 19:16:08,180 WARNING [train.py:1204] (2/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_181-78632-0_sp1.1 from training. Duration: 0.7090625 2023-10-04 19:16:08,192 WARNING [train.py:1204] (2/4) Exclude cut with ID IC0671W0044-331887 from training. Duration: 0.896 2023-10-04 19:16:12,133 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.793e+02 2.156e+02 2.411e+02 2.835e+02 5.289e+02, threshold=4.821e+02, percent-clipped=1.0 2023-10-04 19:16:12,239 WARNING [train.py:1204] (2/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_266-137104-0_sp1.1 from training. Duration: 0.5909375 2023-10-04 19:16:13,654 WARNING [train.py:1204] (2/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_137-116442-0_sp0.9 from training. Duration: 0.8666875 2023-10-04 19:16:15,698 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_23801_210_16223_1_1532066392_3294657_570-5439-0 from training. Duration: 0.97 2023-10-04 19:16:15,710 WARNING [train.py:1204] (2/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_29-280622-0_sp1.1 from training. Duration: 0.7454375 2023-10-04 19:16:17,088 WARNING [train.py:1204] (2/4) Exclude cut with ID 3557-8342-0013-54691-0 from training. Duration: 0.92 2023-10-04 19:16:18,548 WARNING [train.py:1204] (2/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_711-15688-0_sp1.1 from training. Duration: 0.3909375 2023-10-04 19:16:20,367 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_15126_210_6753_1_1530778677_2591185_120-277707-0_sp1.1 from training. Duration: 0.72725 2023-10-04 19:16:20,679 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.balancer2.prob, batch_count=1764093.3333333333, ans=0.125 2023-10-04 19:16:23,162 WARNING [train.py:1204] (2/4) Exclude cut with ID _14970_210_15709_1_1530775547361_1966965_8-169924-0_sp1.1 from training. Duration: 0.836375 2023-10-04 19:16:23,163 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_4692_210_3528_1_1526899755_4323254_107-44401-0_sp0.9 from training. Duration: 0.9555625 2023-10-04 19:16:24,498 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_3475_210_2028_1_1526193578_3622569_265-276625-0_sp0.9 from training. Duration: 0.8444375 2023-10-04 19:16:24,618 WARNING [train.py:1204] (2/4) Exclude cut with ID _55153_210_2253_1_1534384786677_3162096_148-79022-0_sp1.1 from training. Duration: 0.87275 2023-10-04 19:16:25,929 WARNING [train.py:1204] (2/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_292-36336-0_sp1.1 from training. Duration: 0.92725 2023-10-04 19:16:27,201 WARNING [train.py:1204] (2/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_329-82235-0 from training. Duration: 0.82 2023-10-04 19:16:27,274 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_11305_210_1794_1_1529750428_7178148_257-312870-0 from training. Duration: 0.88 2023-10-04 19:16:30,109 WARNING [train.py:1204] (2/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_436-4493-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 19:16:32,872 WARNING [train.py:1204] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_321-54129-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 19:16:32,881 WARNING [train.py:1204] (2/4) Exclude cut with ID _5287_210_2084_1_1527418883040_3878670_333-48508-0_sp0.9 from training. Duration: 0.6666875 2023-10-04 19:16:32,895 WARNING [train.py:1204] (2/4) Exclude cut with ID _26528_210_9070_1_1532570410842_3851310_244-278158-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 19:16:32,938 WARNING [train.py:1204] (2/4) Exclude cut with ID _46952_210_5986_1_1534230010357_3107379_232-344503-0_sp1.1 from training. Duration: 0.87275 2023-10-04 19:16:32,954 WARNING [train.py:1204] (2/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_43-206757-0 from training. Duration: 0.75 2023-10-04 19:16:32,956 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_4183_210_2087_1_1526729100_1869838_141-166600-0 from training. Duration: 0.95 2023-10-04 19:16:33,153 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.conv_module2.balancer1.prob, batch_count=1764160.0, ans=0.125 2023-10-04 19:16:34,381 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_296-287678-0 from training. Duration: 0.95 2023-10-04 19:16:35,764 WARNING [train.py:1204] (2/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_265-60343-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 19:16:35,790 WARNING [train.py:1204] (2/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_332-256168-0 from training. Duration: 0.75 2023-10-04 19:16:35,830 WARNING [train.py:1204] (2/4) Exclude cut with ID _13679_210_7497_1_1530534293662_3355098_411-186088-0 from training. Duration: 0.94 2023-10-04 19:16:39,750 WARNING [train.py:1204] (2/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_458-126779-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 19:16:42,455 WARNING [train.py:1204] (2/4) Exclude cut with ID IC0803W0380-397572 from training. Duration: 0.032 2023-10-04 19:16:42,522 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_278-272612-0_sp1.1 from training. Duration: 0.663625 2023-10-04 19:16:43,953 WARNING [train.py:1204] (2/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_491-263101-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 19:16:43,964 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_53555_210_5632_1_1534595220_7232297_334-336778-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 19:16:47,179 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_50-3289-0 from training. Duration: 0.91 2023-10-04 19:16:48,548 WARNING [train.py:1204] (2/4) Exclude cut with ID _8699_210_2069_1_1528630346831_3960409_155-39766-0_sp0.9 from training. Duration: 0.8666875 2023-10-04 19:16:48,560 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_502-82145-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 19:16:48,606 WARNING [train.py:1204] (2/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_550-43132-0_sp1.1 from training. Duration: 0.97275 2023-10-04 19:16:48,632 WARNING [train.py:1204] (2/4) Exclude cut with ID _14998_210_5219_1_1530705582080_7474970_446-112117-0_sp1.1 from training. Duration: 0.8181875 2023-10-04 19:16:49,972 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_3691_210_3528_1_1526468169_4062053_355-334432-0_sp1.1 from training. Duration: 0.7909375 2023-10-04 19:16:51,371 WARNING [train.py:1204] (2/4) Exclude cut with ID _58605_210_12210_1_1534723195939_3904259_239-94720-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 19:16:53,395 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.bypass.skip_rate, batch_count=1764226.6666666667, ans=0.09899494936611666 2023-10-04 19:16:54,523 WARNING [train.py:1204] (2/4) Exclude cut with ID _49376_210_5986_1_1533985278732_3370740_391-344072-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 19:16:55,846 WARNING [train.py:1204] (2/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_558-322014-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 19:16:55,893 WARNING [train.py:1204] (2/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_256-32662-0_sp1.1 from training. Duration: 0.8181875 2023-10-04 19:17:00,004 INFO [train.py:1046] (2/4) Epoch 50, batch 4350, loss[loss=0.1475, simple_loss=0.2294, pruned_loss=0.03276, over 23730.00 frames. ], tot_loss[loss=0.1511, simple_loss=0.2317, pruned_loss=0.03522, over 4704011.23 frames. ], batch size: 149, lr: 2.02e-03, grad_scale: 8.0 2023-10-04 19:17:01,505 WARNING [train.py:1204] (2/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_572-185150-0 from training. Duration: 0.97 2023-10-04 19:17:01,549 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_89-35748-0_sp0.9 from training. Duration: 0.87775 2023-10-04 19:17:05,898 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6291_210_3273_1_1527683649_1078498_88-66747-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 19:17:06,124 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.0.conv_skip_rate, batch_count=1764293.3333333333, ans=0.0 2023-10-04 19:17:07,734 WARNING [train.py:1204] (2/4) Exclude cut with ID _19504_210_9994_1_1531648863242_3325220_357-237029-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 19:17:10,981 WARNING [train.py:1204] (2/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_302-2550-0_sp1.1 from training. Duration: 0.736375 2023-10-04 19:17:10,982 WARNING [train.py:1204] (2/4) Exclude cut with ID _9461_210_4557_1_1529060514726_3677451_59-194017-0_sp1.1 from training. Duration: 0.963625 2023-10-04 19:17:18,134 WARNING [train.py:1204] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_665-196155-0_sp1.1 from training. Duration: 0.6090625 2023-10-04 19:17:20,955 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_326-315007-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 19:17:21,219 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.conv_skip_rate, batch_count=1764360.0, ans=0.0 2023-10-04 19:17:22,970 WARNING [train.py:1204] (2/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_105-328174-0_sp1.1 from training. Duration: 0.6909375 2023-10-04 19:17:24,256 WARNING [train.py:1204] (2/4) Exclude cut with ID _56494_210_4220_1_1535284837625_3033169_106-263722-0_sp1.1 from training. Duration: 0.97275 2023-10-04 19:17:25,732 WARNING [train.py:1204] (2/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_770-200430-0_sp0.9 from training. Duration: 0.988875 2023-10-04 19:17:27,180 WARNING [train.py:1204] (2/4) Exclude cut with ID _65640_210_6310_1_1535368201445_3173250_209-13008-0_sp1.1 from training. Duration: 0.9 2023-10-04 19:17:28,564 WARNING [train.py:1204] (2/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_212-149947-0_sp0.9 from training. Duration: 0.811125 2023-10-04 19:17:35,291 WARNING [train.py:1204] (2/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_188-171673-0 from training. Duration: 0.58 2023-10-04 19:17:35,358 WARNING [train.py:1204] (2/4) Exclude cut with ID _58895_210_8750_1_1535184081320_3649089_286-281724-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 19:17:35,432 WARNING [train.py:1204] (2/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_16-254909-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 19:17:35,594 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.1.feed_forward1.out_proj.dropout_p, batch_count=1764426.6666666667, ans=0.1 2023-10-04 19:17:38,400 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.1.attention_skip_rate, batch_count=1764426.6666666667, ans=0.0 2023-10-04 19:17:41,950 WARNING [train.py:1204] (2/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_52-95655-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 19:17:44,659 WARNING [train.py:1204] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_554-45617-0 from training. Duration: 0.99 2023-10-04 19:17:47,422 WARNING [train.py:1204] (2/4) Exclude cut with ID _14683_210_5219_1_1530698352608_3974859_473-344611-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 19:17:47,533 WARNING [train.py:1204] (2/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_17-25045-0_sp0.9 from training. Duration: 0.6555625 2023-10-04 19:17:50,967 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.conv_skip_rate, batch_count=1764493.3333333333, ans=0.0 2023-10-04 19:17:51,509 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.3.encoder.layers.3.self_attn1.whiten, num_groups=1, num_channels=512, metric=13.04 vs. limit=22.5 2023-10-04 19:17:52,269 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9072W0003-530637 from training. Duration: 0.768 2023-10-04 19:17:52,377 WARNING [train.py:1204] (2/4) Exclude cut with ID _38670_210_15379_1_1533448810659_4391059_76-74025-0_sp1.1 from training. Duration: 0.936375 2023-10-04 19:17:54,327 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6291_210_3273_1_1527683649_1078498_74-292608-0_sp0.9 from training. Duration: 0.8 2023-10-04 19:17:55,692 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9084W0025-536862 from training. Duration: 0.896 2023-10-04 19:17:55,752 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9087W0021-538392 from training. Duration: 0.768 2023-10-04 19:17:55,764 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7543_210_6753_1_1528543323_645515_56-163234-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 19:17:55,789 WARNING [train.py:1204] (2/4) Exclude cut with ID _45560_210_21982_1_1533812603951_3584999_44-332643-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 19:17:58,374 WARNING [train.py:1204] (2/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_1285-256622-0_sp1.1 from training. Duration: 0.763625 2023-10-04 19:17:58,424 WARNING [train.py:1204] (2/4) Exclude cut with ID _52781_210_16187_1_1534297729099_1118149_141-325632-0_sp1.1 from training. Duration: 0.936375 2023-10-04 19:17:58,642 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.conv_module1.balancer1.prob, batch_count=1764560.0, ans=0.125 2023-10-04 19:17:59,786 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_40896_210_15710_1_1533433956_7100802_1051-137631-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 19:17:59,828 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_13091_210_7925_1_1530609447_8987354_233-18333-0_sp1.1 from training. Duration: 0.97275 2023-10-04 19:18:01,272 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_34008_210_10431_1_1533345424_8183247_662-317331-0 from training. Duration: 0.94 2023-10-04 19:18:01,286 WARNING [train.py:1204] (2/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_291-248920-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 19:18:01,288 WARNING [train.py:1204] (2/4) Exclude cut with ID _65263_210_22356_1_1535763598691_3921037_274-288481-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 19:18:01,316 WARNING [train.py:1204] (2/4) Exclude cut with ID _5189_210_4557_1_1527248319428_3769229_233-168080-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 19:18:02,634 WARNING [train.py:1204] (2/4) Exclude cut with ID _54871_210_22356_1_1534665798253_3856359_135-184281-0 from training. Duration: 0.99 2023-10-04 19:18:03,986 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9123W0012-555972 from training. Duration: 0.896 2023-10-04 19:18:03,990 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9123W0118-556077 from training. Duration: 0.896 2023-10-04 19:18:05,250 WARNING [train.py:1204] (2/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_406-275649-0 from training. Duration: 0.42 2023-10-04 19:18:08,353 WARNING [train.py:1204] (2/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_309-13684-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 19:18:08,378 WARNING [train.py:1204] (2/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_129-337802-0_sp1.1 from training. Duration: 0.6545625 2023-10-04 19:18:09,663 WARNING [train.py:1204] (2/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_649-266825-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 19:18:10,997 WARNING [train.py:1204] (2/4) Exclude cut with ID _40519_210_13794_1_1533715228655_7337390_895-224742-0_sp1.1 from training. Duration: 0.92725 2023-10-04 19:18:12,976 WARNING [train.py:1204] (2/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_234-14174-0 from training. Duration: 0.82 2023-10-04 19:18:14,252 INFO [train.py:1046] (2/4) Epoch 50, batch 4400, loss[loss=0.1998, simple_loss=0.2675, pruned_loss=0.06601, over 19462.00 frames. ], tot_loss[loss=0.1518, simple_loss=0.2324, pruned_loss=0.03554, over 4700988.66 frames. ], batch size: 388, lr: 2.02e-03, grad_scale: 16.0 2023-10-04 19:18:14,355 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9156W0009-572502 from training. Duration: 0.768 2023-10-04 19:18:14,362 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_424-245529-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 19:18:16,233 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.feed_forward3.hidden_balancer.prob, batch_count=1764626.6666666667, ans=0.125 2023-10-04 19:18:17,414 WARNING [train.py:1204] (2/4) Exclude cut with ID _26528_210_9070_1_1532570410842_3851310_232-10149-0_sp1.1 from training. Duration: 0.963625 2023-10-04 19:18:17,429 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_17292_210_6753_1_1531125388_2943734_152-30410-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 19:18:18,842 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_35441_210_16554_1_1533347572_4092680_291-82075-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 19:18:19,513 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.3.encoder.layers.2.feed_forward1.out_whiten, num_groups=1, num_channels=512, metric=6.55 vs. limit=15.0 2023-10-04 19:18:20,728 WARNING [train.py:1204] (2/4) Exclude cut with ID _12369_210_4381_1_1530356078581_4388520_64-298917-0 from training. Duration: 0.88 2023-10-04 19:18:22,063 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_5618_210_1874_1_1527311008_5851_1-214501-0_sp0.9 from training. Duration: 0.7655625 2023-10-04 19:18:22,092 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_9460_210_4211_1_1528954587_10049667_572-37035-0 from training. Duration: 0.8 2023-10-04 19:18:22,112 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9193W0023-590322 from training. Duration: 0.896 2023-10-04 19:18:23,509 WARNING [train.py:1204] (2/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_206-10097-0_sp1.1 from training. Duration: 0.4454375 2023-10-04 19:18:23,528 WARNING [train.py:1204] (2/4) Exclude cut with ID _55134_210_21982_1_1534590028377_3882170_17-51731-0_sp1.1 from training. Duration: 0.963625 2023-10-04 19:18:26,573 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_14953_210_2151_1_1530701710_5616165_710-165400-0 from training. Duration: 0.87 2023-10-04 19:18:26,717 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.feed_forward3.hidden_balancer.prob, batch_count=1764626.6666666667, ans=0.125 2023-10-04 19:18:29,211 WARNING [train.py:1204] (2/4) Exclude cut with ID _53203_210_2087_1_1534246218309_475590_61-85777-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 19:18:30,702 WARNING [train.py:1204] (2/4) Exclude cut with ID _55844_210_2346_1_1534471212569_7625707_1050-10453-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 19:18:30,712 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9218W0013-603297 from training. Duration: 0.896 2023-10-04 19:18:33,470 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_20120_210_6753_1_1531643035_5482859_267-102818-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 19:18:33,471 WARNING [train.py:1204] (2/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_191-41023-0 from training. Duration: 0.71 2023-10-04 19:18:34,903 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9231W0202-610227 from training. Duration: 0.896 2023-10-04 19:18:36,457 WARNING [train.py:1204] (2/4) Exclude cut with ID _6537_210_1883_1_1527987559710_3597129_382-133062-0 from training. Duration: 0.68 2023-10-04 19:18:37,726 WARNING [train.py:1204] (2/4) Exclude cut with ID _28427_210_4220_1_1532581240967_3061069_43-84928-0 from training. Duration: 0.99 2023-10-04 19:18:37,752 WARNING [train.py:1204] (2/4) Exclude cut with ID _59256_210_5986_1_1535076085997_3589120_121-237386-0 from training. Duration: 0.96 2023-10-04 19:18:37,779 WARNING [train.py:1204] (2/4) Exclude cut with ID _8480_210_1874_1_1528589391205_6728330_137-256986-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 19:18:37,922 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.0.nonlin_attention.balancer.prob, batch_count=1764693.3333333333, ans=0.125 2023-10-04 19:18:39,194 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_9417_210_1794_1_1529235261_4614131_479-346963-0_sp1.1 from training. Duration: 0.936375 2023-10-04 19:18:40,796 WARNING [train.py:1204] (2/4) Exclude cut with ID _26474_210_9570_1_1532520022699_3623380_267-7362-0_sp1.1 from training. Duration: 0.936375 2023-10-04 19:18:41,511 WARNING [train.py:1204] (2/4) Exclude cut with ID _55328_210_27767_1_1534580973260_4088170_287-61272-0_sp1.1 from training. Duration: 0.97275 2023-10-04 19:18:42,813 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.843e+02 2.185e+02 2.399e+02 2.720e+02 3.791e+02, threshold=4.798e+02, percent-clipped=0.0 2023-10-04 19:18:43,012 WARNING [train.py:1204] (2/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_589-9070-0 from training. Duration: 0.71 2023-10-04 19:18:43,019 WARNING [train.py:1204] (2/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_148-158509-0_sp1.1 from training. Duration: 0.37275 2023-10-04 19:18:43,188 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.1.balancer1.prob, batch_count=1764760.0, ans=0.125 2023-10-04 19:18:44,296 WARNING [train.py:1204] (2/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_164-184548-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 19:18:45,824 WARNING [train.py:1204] (2/4) Exclude cut with ID _14970_210_15709_1_1530775547361_1966965_6-23328-0_sp1.1 from training. Duration: 0.7818125 2023-10-04 19:18:45,832 WARNING [train.py:1204] (2/4) Exclude cut with ID _29303_210_2575_1_1532392237303_7410649_612-165069-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 19:18:47,247 WARNING [train.py:1204] (2/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_383-321224-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 19:18:47,289 WARNING [train.py:1204] (2/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_283-160525-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 19:18:47,300 WARNING [train.py:1204] (2/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_505-97987-0 from training. Duration: 0.86 2023-10-04 19:18:48,734 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9309W0190-640317 from training. Duration: 0.768 2023-10-04 19:18:50,224 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.attention_skip_rate, batch_count=1764760.0, ans=0.0 2023-10-04 19:18:51,969 WARNING [train.py:1204] (2/4) Exclude cut with ID _50093_210_4220_1_1534417141207_3432299_280-297390-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 19:18:52,159 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.nonlin_attention.balancer.prob, batch_count=1764760.0, ans=0.125 2023-10-04 19:18:59,060 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_17292_210_6753_1_1531125388_2943734_320-71385-0_sp1.1 from training. Duration: 0.97275 2023-10-04 19:19:00,642 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_213-103282-0 from training. Duration: 0.78 2023-10-04 19:19:06,093 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7615_210_4211_1_1528110965_4580160_72-235211-0_sp0.9 from training. Duration: 0.8444375 2023-10-04 19:19:08,793 WARNING [train.py:1204] (2/4) Exclude cut with ID _14867_210_1968_1_1530684179890_3846969_173-320977-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 19:19:12,206 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7046_210_6753_1_1527895337_6920917_783-278109-0_sp0.9 from training. Duration: 0.8555625 2023-10-04 19:19:12,259 WARNING [train.py:1204] (2/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_90-30673-0 from training. Duration: 0.84 2023-10-04 19:19:12,275 WARNING [train.py:1204] (2/4) Exclude cut with ID _22826_210_9994_1_1531908201522_3279169_415-161-0_sp1.1 from training. Duration: 0.8909375 2023-10-04 19:19:12,285 WARNING [train.py:1204] (2/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_86-66192-0_sp0.9 from training. Duration: 0.611125 2023-10-04 19:19:12,288 WARNING [train.py:1204] (2/4) Exclude cut with ID _4626_210_3532_1_1526817652717_3916859_446-206656-0_sp1.1 from training. Duration: 0.6909375 2023-10-04 19:19:13,549 WARNING [train.py:1204] (2/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_166-128941-0_sp0.9 from training. Duration: 0.6 2023-10-04 19:19:16,915 WARNING [train.py:1204] (2/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_345-167986-0 from training. Duration: 0.84 2023-10-04 19:19:21,006 WARNING [train.py:1204] (2/4) Exclude cut with ID _6275_210_2084_1_1528110062213_7669049_973-198085-0 from training. Duration: 0.75 2023-10-04 19:19:21,094 WARNING [train.py:1204] (2/4) Exclude cut with ID _60057_210_10640_1_1534903065793_3333150_171-181133-0 from training. Duration: 0.88 2023-10-04 19:19:22,285 WARNING [train.py:1204] (2/4) Exclude cut with ID _19504_210_9994_1_1531648863242_3325220_336-196513-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 19:19:22,292 WARNING [train.py:1204] (2/4) Exclude cut with ID _6493_210_1747_1_1528459210424_4012470_513-140967-0 from training. Duration: 0.79 2023-10-04 19:19:22,378 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6673_210_3273_1_1527767790_3173240_327-306185-0_sp0.9 from training. Duration: 0.92225 2023-10-04 19:19:25,804 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1032-236863-0_sp1.1 from training. Duration: 0.763625 2023-10-04 19:19:28,735 INFO [train.py:1046] (2/4) Epoch 50, batch 4450, loss[loss=0.1616, simple_loss=0.2344, pruned_loss=0.04443, over 23790.00 frames. ], tot_loss[loss=0.152, simple_loss=0.2331, pruned_loss=0.03548, over 4707679.98 frames. ], batch size: 212, lr: 2.02e-03, grad_scale: 8.0 2023-10-04 19:19:28,772 WARNING [train.py:1204] (2/4) Exclude cut with ID _14867_210_1968_1_1530684179890_3846969_413-29292-0 from training. Duration: 0.9 2023-10-04 19:19:31,994 WARNING [train.py:1204] (2/4) Exclude cut with ID _47251_210_18628_1_1533814063128_4137250_186-343322-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 19:19:32,161 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.attention_skip_rate, batch_count=1764960.0, ans=0.0 2023-10-04 19:19:33,676 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.feed_forward2.hidden_balancer.prob, batch_count=1764960.0, ans=0.125 2023-10-04 19:19:34,698 WARNING [train.py:1204] (2/4) Exclude cut with ID _56235_210_10640_1_1534557335235_4161099_624-46091-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 19:19:34,751 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_791-197891-0_sp0.9 from training. Duration: 0.9333125 2023-10-04 19:19:37,754 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=1764960.0, ans=0.1 2023-10-04 19:19:41,669 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_50050_210_16223_1_1535435796_7290138_786-288457-0_sp1.1 from training. Duration: 0.92725 2023-10-04 19:19:41,686 WARNING [train.py:1204] (2/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_213-227283-0_sp1.1 from training. Duration: 0.963625 2023-10-04 19:19:45,033 WARNING [train.py:1204] (2/4) Exclude cut with ID _42659_210_2070_1_1533607442765_3120380_58-341121-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 19:19:48,271 WARNING [train.py:1204] (2/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_506-23200-0_sp1.1 from training. Duration: 0.8818125 2023-10-04 19:19:49,760 WARNING [train.py:1204] (2/4) Exclude cut with ID _14170_210_5219_1_1530525949868_4469650_336-213530-0_sp1.1 from training. Duration: 0.8181875 2023-10-04 19:19:50,581 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.0.layers.1.conv_module2.whiten, num_groups=1, num_channels=192, metric=9.91 vs. limit=15.0 2023-10-04 19:19:51,020 WARNING [train.py:1204] (2/4) Exclude cut with ID _64135_210_2253_1_1535677267876_7608972_180-5312-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 19:19:51,105 WARNING [train.py:1204] (2/4) Exclude cut with ID _12302_210_5219_1_1529924446466_8107280_835-279754-0 from training. Duration: 0.9 2023-10-04 19:19:51,106 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_20120_210_6753_1_1531643035_5482859_510-96996-0_sp1.1 from training. Duration: 0.7909375 2023-10-04 19:19:52,480 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_15517_210_6753_1_1530954008_3487336_216-187834-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 19:19:52,523 WARNING [train.py:1204] (2/4) Exclude cut with ID _19504_210_9994_1_1531648863242_3325220_414-113427-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 19:19:52,525 WARNING [train.py:1204] (2/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_264-108685-0_sp0.9 from training. Duration: 0.611125 2023-10-04 19:19:55,241 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_5438_210_1794_1_1527336687_2600009_104-288274-0_sp0.9 from training. Duration: 0.6444375 2023-10-04 19:20:00,390 WARNING [train.py:1204] (2/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_520-39496-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 19:20:01,671 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_37818_210_4238_1_1533620910_4720083_315-308565-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 19:20:01,873 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.feed_forward3.hidden_balancer.prob, batch_count=1765093.3333333333, ans=0.125 2023-10-04 19:20:03,019 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_2009_210_2071_1_1525481134_4588937_385-302100-0_sp1.1 from training. Duration: 0.7909375 2023-10-04 19:20:03,063 WARNING [train.py:1204] (2/4) Exclude cut with ID _58895_210_8750_1_1535184081320_3649089_277-53219-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 19:20:04,392 WARNING [train.py:1204] (2/4) Exclude cut with ID _26664_210_7770_1_1532258943280_3418270_121-111258-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 19:20:07,158 WARNING [train.py:1204] (2/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_99-167392-0_sp1.1 from training. Duration: 0.5545625 2023-10-04 19:20:08,537 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1113-7369-0 from training. Duration: 0.85 2023-10-04 19:20:09,950 WARNING [train.py:1204] (2/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_352-59958-0 from training. Duration: 0.86 2023-10-04 19:20:09,955 WARNING [train.py:1204] (2/4) Exclude cut with ID _14886_210_4220_1_1530702020355_3516190_159-73748-0_sp1.1 from training. Duration: 0.7545625 2023-10-04 19:20:12,756 WARNING [train.py:1204] (2/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_131-119569-0_sp1.1 from training. Duration: 0.92725 2023-10-04 19:20:12,828 WARNING [train.py:1204] (2/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_216-343007-0 from training. Duration: 0.66 2023-10-04 19:20:16,322 WARNING [train.py:1204] (2/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_113-92608-0_sp0.9 from training. Duration: 0.6 2023-10-04 19:20:19,102 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_40047_210_2290_1_1533290519_2374311_132-123271-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 19:20:20,446 WARNING [train.py:1204] (2/4) Exclude cut with ID _8793_210_6310_1_1528542985139_2310009_121-179782-0 from training. Duration: 0.8 2023-10-04 19:20:20,467 WARNING [train.py:1204] (2/4) Exclude cut with ID _43997_210_4525_1_1533538632555_5189329_283-218639-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 19:20:20,477 WARNING [train.py:1204] (2/4) Exclude cut with ID _49376_210_5986_1_1533985278732_3370740_297-78397-0_sp1.1 from training. Duration: 0.936375 2023-10-04 19:20:20,498 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_3475_210_2028_1_1526193578_3622569_377-166133-0_sp0.9 from training. Duration: 0.9555625 2023-10-04 19:20:20,709 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.feed_forward1.hidden_balancer.prob, batch_count=1765160.0, ans=0.125 2023-10-04 19:20:21,850 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_520-213149-0_sp1.1 from training. Duration: 0.92725 2023-10-04 19:20:21,975 WARNING [train.py:1204] (2/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_567-144483-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 19:20:24,706 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_4183_210_2087_1_1526729100_1869838_143-280962-0_sp0.9 from training. Duration: 0.62225 2023-10-04 19:20:24,763 WARNING [train.py:1204] (2/4) Exclude cut with ID _49643_210_7989_1_1534640244224_3939031_611-185515-0 from training. Duration: 0.94 2023-10-04 19:20:28,037 WARNING [train.py:1204] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_88-341718-0_sp0.9 from training. Duration: 0.7333125 2023-10-04 19:20:29,995 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_51225_210_3601_1_1534400634_2327383_49-235441-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 19:20:31,378 WARNING [train.py:1204] (2/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_240-294400-0_sp1.1 from training. Duration: 0.936375 2023-10-04 19:20:32,769 WARNING [train.py:1204] (2/4) Exclude cut with ID _55429_210_10626_1_1534467588527_7174143_640-304825-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 19:20:32,789 WARNING [train.py:1204] (2/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_187-221566-0_sp1.1 from training. Duration: 0.3909375 2023-10-04 19:20:34,331 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.attention_skip_rate, batch_count=1765226.6666666667, ans=0.0 2023-10-04 19:20:35,540 WARNING [train.py:1204] (2/4) Exclude cut with ID _7606_210_4915_1_1528894253277_5080690_127-177761-0_sp0.9 from training. Duration: 0.711125 2023-10-04 19:20:37,078 WARNING [train.py:1204] (2/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_430-116139-0 from training. Duration: 0.85 2023-10-04 19:20:38,404 WARNING [train.py:1204] (2/4) Exclude cut with ID _3675_210_4557_1_1526639934408_4182089_456-18305-0_sp0.9 from training. Duration: 0.8444375 2023-10-04 19:20:42,365 INFO [train.py:1046] (2/4) Epoch 50, batch 4500, loss[loss=0.1436, simple_loss=0.2282, pruned_loss=0.02953, over 24677.00 frames. ], tot_loss[loss=0.1526, simple_loss=0.2336, pruned_loss=0.0358, over 4711850.89 frames. ], batch size: 65, lr: 2.02e-03, grad_scale: 8.0 2023-10-04 19:20:42,519 WARNING [train.py:1204] (2/4) Exclude cut with ID _18513_210_9206_1_1531618215223_3862620_402-5957-0_sp1.1 from training. Duration: 0.9 2023-10-04 19:20:45,919 WARNING [train.py:1204] (2/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_465-156500-0 from training. Duration: 0.72 2023-10-04 19:20:45,921 WARNING [train.py:1204] (2/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_714-310538-0 from training. Duration: 0.86 2023-10-04 19:20:47,341 WARNING [train.py:1204] (2/4) Exclude cut with ID _39411_210_5986_1_1533538825030_3343649_335-301187-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 19:20:50,625 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.4.encoder.layers.1.conv_module1.whiten, num_groups=1, num_channels=384, metric=3.01 vs. limit=15.0 2023-10-04 19:20:54,291 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_41750_210_1960_1_1533520678_3948042_241-158616-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 19:20:54,340 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_46013_210_7234_1_1533776362_3744032_50-141535-0_sp1.1 from training. Duration: 0.9 2023-10-04 19:20:54,408 WARNING [train.py:1204] (2/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_43-206757-0_sp0.9 from training. Duration: 0.8333125 2023-10-04 19:20:55,773 WARNING [train.py:1204] (2/4) Exclude cut with ID _22961_210_8254_1_1531976366672_4278180_172-276296-0_sp1.1 from training. Duration: 0.97275 2023-10-04 19:20:55,804 WARNING [train.py:1204] (2/4) Exclude cut with ID _17956_210_4353_1_1531209568125_6979970_1305-301903-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 19:20:57,183 WARNING [train.py:1204] (2/4) Exclude cut with ID _28323_210_16187_1_1532480395667_4100300_604-44507-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 19:21:08,668 WARNING [train.py:1204] (2/4) Exclude cut with ID _23514_210_1389_1_1531832139785_3031439_88-337544-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 19:21:09,971 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.736e+02 2.200e+02 2.403e+02 2.900e+02 5.127e+02, threshold=4.806e+02, percent-clipped=1.0 2023-10-04 19:21:10,063 WARNING [train.py:1204] (2/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_932-3672-0_sp1.1 from training. Duration: 0.963625 2023-10-04 19:21:11,415 WARNING [train.py:1204] (2/4) Exclude cut with ID _46502_210_13794_1_1533724452198_3657159_207-109991-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 19:21:12,684 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_216-73841-0_sp1.1 from training. Duration: 0.87275 2023-10-04 19:21:12,777 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8453_210_4211_1_1528502940_8562730_347-330673-0_sp1.1 from training. Duration: 0.6545625 2023-10-04 19:21:13,033 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.conv_module2.balancer2.prob, batch_count=1765426.6666666667, ans=0.125 2023-10-04 19:21:18,962 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_4183_210_2087_1_1526729100_1869838_143-280962-0_sp1.1 from training. Duration: 0.5090625 2023-10-04 19:21:20,682 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.conv_skip_rate, batch_count=1765426.6666666667, ans=0.0 2023-10-04 19:21:21,848 WARNING [train.py:1204] (2/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_325-113798-0_sp1.1 from training. Duration: 0.77275 2023-10-04 19:21:24,724 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.bypass_mid.scale_min, batch_count=1765493.3333333333, ans=0.2 2023-10-04 19:21:26,070 WARNING [train.py:1204] (2/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_91-212486-0_sp0.9 from training. Duration: 0.7555625 2023-10-04 19:21:29,193 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8776_210_2040_1_1528638837_4088036_244-278763-0_sp1.1 from training. Duration: 0.8181875 2023-10-04 19:21:29,248 WARNING [train.py:1204] (2/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_106-286130-0 from training. Duration: 0.98 2023-10-04 19:21:30,964 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_2746_210_1988_1_1525872816_3795627_372-205861-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 19:21:31,009 WARNING [train.py:1204] (2/4) Exclude cut with ID _30800_210_22382_1_1532499347904_4515959_240-54646-0_sp1.1 from training. Duration: 0.936375 2023-10-04 19:21:32,414 WARNING [train.py:1204] (2/4) Exclude cut with ID _59500_210_8254_1_1534809610680_3736009_162-180695-0_sp1.1 from training. Duration: 0.936375 2023-10-04 19:21:33,686 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7544_210_6753_1_1528591327_1471426_146-93357-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 19:21:36,386 WARNING [train.py:1204] (2/4) Exclude cut with ID _42657_210_2070_1_1533520654016_3327008_405-87432-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 19:21:36,407 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_4528_210_3273_1_1527076654_3276777_209-219804-0 from training. Duration: 0.69 2023-10-04 19:21:36,408 WARNING [train.py:1204] (2/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_78-182912-0_sp0.9 from training. Duration: 0.6555625 2023-10-04 19:21:36,415 WARNING [train.py:1204] (2/4) Exclude cut with ID _31255_210_13427_1_1532865619495_3815143_322-114998-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 19:21:39,644 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=1765560.0, ans=0.1 2023-10-04 19:21:40,790 WARNING [train.py:1204] (2/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_848-280957-0_sp0.9 from training. Duration: 0.9666875 2023-10-04 19:21:40,809 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7696_210_2040_1_1528207167_3716099_117-125434-0_sp0.9 from training. Duration: 0.8444375 2023-10-04 19:21:45,485 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_1002-109192-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 19:21:46,997 WARNING [train.py:1204] (2/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_138-64210-0_sp0.9 from training. Duration: 0.811125 2023-10-04 19:21:47,651 WARNING [train.py:1204] (2/4) Exclude cut with ID _25808_210_13292_1_1532343514562_7366109_633-47524-0_sp1.1 from training. Duration: 0.9 2023-10-04 19:21:50,283 WARNING [train.py:1204] (2/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_297-5233-0 from training. Duration: 0.95 2023-10-04 19:21:51,652 WARNING [train.py:1204] (2/4) Exclude cut with ID _8394_210_6702_1_1528590504316_3540189_335-274780-0 from training. Duration: 0.78 2023-10-04 19:21:51,658 WARNING [train.py:1204] (2/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_301-330455-0 from training. Duration: 0.8 2023-10-04 19:21:54,391 WARNING [train.py:1204] (2/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_383-205833-0 from training. Duration: 0.95 2023-10-04 19:21:55,722 INFO [train.py:1046] (2/4) Epoch 50, batch 4550, loss[loss=0.1514, simple_loss=0.2434, pruned_loss=0.02971, over 24657.00 frames. ], tot_loss[loss=0.1527, simple_loss=0.2337, pruned_loss=0.03582, over 4717969.02 frames. ], batch size: 68, lr: 2.02e-03, grad_scale: 8.0 2023-10-04 19:21:57,267 WARNING [train.py:1204] (2/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_48-182155-0 from training. Duration: 0.87 2023-10-04 19:21:58,717 WARNING [train.py:1204] (2/4) Exclude cut with ID _14832_210_8750_1_1530691351539_3171348_1-54315-0_sp1.1 from training. Duration: 0.7909375 2023-10-04 19:21:59,016 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.balancer2.prob, batch_count=1765626.6666666667, ans=0.125 2023-10-04 19:22:01,961 WARNING [train.py:1204] (2/4) Exclude cut with ID _44982_210_1511_1_1534312820108_3726029_413-100814-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 19:22:02,014 WARNING [train.py:1204] (2/4) Exclude cut with ID _3231_210_2106_1_1526205864934_6784220_104-261392-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 19:22:04,033 WARNING [train.py:1204] (2/4) Exclude cut with ID _24314_210_4915_1_1532135078116_7170179_1-13588-0_sp1.1 from training. Duration: 0.97275 2023-10-04 19:22:09,519 WARNING [train.py:1204] (2/4) Exclude cut with ID _44304_210_15374_1_1533639352765_4613129_141-8356-0_sp1.1 from training. Duration: 0.963625 2023-10-04 19:22:10,862 WARNING [train.py:1204] (2/4) Exclude cut with ID _37892_210_19596_1_1533198677897_3964058_374-335034-0_sp1.1 from training. Duration: 0.936375 2023-10-04 19:22:12,764 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_195-257512-0_sp0.9 from training. Duration: 0.7444375 2023-10-04 19:22:12,766 WARNING [train.py:1204] (2/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_1267-272693-0_sp1.1 from training. Duration: 0.663625 2023-10-04 19:22:12,767 WARNING [train.py:1204] (2/4) Exclude cut with ID _30196_210_1664_1_1533708496522_7074140_690-326155-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 19:22:15,501 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_68847_210_14656_1_1536070214_2586843_38-20717-0_sp1.1 from training. Duration: 0.97275 2023-10-04 19:22:15,544 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_99-251314-0_sp1.1 from training. Duration: 0.7909375 2023-10-04 19:22:15,847 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.self_attn_weights.pos_emb_skip_rate, batch_count=1765693.3333333333, ans=0.0 2023-10-04 19:22:19,104 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.conv_skip_rate, batch_count=1765693.3333333333, ans=0.0 2023-10-04 19:22:20,030 WARNING [train.py:1204] (2/4) Exclude cut with ID _49376_210_5986_1_1533985278732_3370740_225-95630-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 19:22:20,314 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.conv_module2.balancer1.prob, batch_count=1765693.3333333333, ans=0.125 2023-10-04 19:22:21,482 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_2655_210_2087_1_1525688959_3897400_202-147557-0 from training. Duration: 0.85 2023-10-04 19:22:21,535 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_5618_210_1874_1_1527311008_5851_1-214501-0_sp1.1 from training. Duration: 0.626375 2023-10-04 19:22:22,834 WARNING [train.py:1204] (2/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_225-43381-0_sp1.1 from training. Duration: 0.8545625 2023-10-04 19:22:24,221 WARNING [train.py:1204] (2/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_242-3472-0 from training. Duration: 0.73 2023-10-04 19:22:28,157 WARNING [train.py:1204] (2/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_339-104791-0 from training. Duration: 0.49 2023-10-04 19:22:29,495 WARNING [train.py:1204] (2/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_488-92261-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 19:22:34,784 WARNING [train.py:1204] (2/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_175-163894-0 from training. Duration: 0.74 2023-10-04 19:22:34,912 WARNING [train.py:1204] (2/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_41-234353-0_sp0.9 from training. Duration: 0.7555625 2023-10-04 19:22:38,932 WARNING [train.py:1204] (2/4) Exclude cut with ID _47251_210_18628_1_1533814063128_4137250_116-275927-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 19:22:38,967 WARNING [train.py:1204] (2/4) Exclude cut with ID _4697_210_2069_1_1526815801953_3823098_61-223324-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 19:22:38,980 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6961_210_2087_1_1527937168_6815011_562-165518-0_sp1.1 from training. Duration: 0.7 2023-10-04 19:22:40,425 WARNING [train.py:1204] (2/4) Exclude cut with ID _21989_210_5854_1_1531899214931_3271360_133-44759-0 from training. Duration: 0.77 2023-10-04 19:22:42,442 WARNING [train.py:1204] (2/4) Exclude cut with ID _23557_210_8750_1_1531890106370_6846255_77-284923-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 19:22:43,893 WARNING [train.py:1204] (2/4) Exclude cut with ID _13678_210_7497_1_1530530069226_3463170_464-296127-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 19:22:45,199 WARNING [train.py:1204] (2/4) Exclude cut with ID _22613_210_5219_1_1532253599686_4090170_393-36663-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 19:22:46,622 WARNING [train.py:1204] (2/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_235-262124-0_sp0.9 from training. Duration: 0.7444375 2023-10-04 19:22:48,598 WARNING [train.py:1204] (2/4) Exclude cut with ID _8480_210_1874_1_1528589391205_6728330_755-112625-0 from training. Duration: 0.74 2023-10-04 19:22:48,658 WARNING [train.py:1204] (2/4) Exclude cut with ID _6456_210_1534_1_1527681634212_3181049_215-309359-0 from training. Duration: 0.88 2023-10-04 19:22:49,675 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_3019_210_2028_1_1526044245_3264865_191-72235-0_sp1.1 from training. Duration: 0.8818125 2023-10-04 19:22:49,758 WARNING [train.py:1204] (2/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_380-162648-0 from training. Duration: 0.94 2023-10-04 19:22:51,262 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_92-46009-0 from training. Duration: 0.54 2023-10-04 19:22:51,282 WARNING [train.py:1204] (2/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_53-300544-0_sp0.9 from training. Duration: 0.7444375 2023-10-04 19:22:52,646 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_68235_210_1925_1_1535863247_4484656_101-250943-0_sp1.1 from training. Duration: 0.97275 2023-10-04 19:22:52,664 WARNING [train.py:1204] (2/4) Exclude cut with ID _14970_210_15709_1_1530775547361_1966965_204-22182-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 19:22:54,119 WARNING [train.py:1204] (2/4) Exclude cut with ID _55153_210_2253_1_1534384786677_3162096_310-130092-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 19:22:54,131 WARNING [train.py:1204] (2/4) Exclude cut with ID _60113_210_16271_1_1535164212481_4386079_559-167191-0_sp1.1 from training. Duration: 0.8454375 2023-10-04 19:22:55,629 WARNING [train.py:1204] (2/4) Exclude cut with ID _14368_210_4220_1_1530532714870_3595529_93-58002-0_sp1.1 from training. Duration: 0.5181875 2023-10-04 19:22:56,962 WARNING [train.py:1204] (2/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_110-224688-0 from training. Duration: 0.97 2023-10-04 19:22:57,251 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.conv_skip_rate, batch_count=1765893.3333333333, ans=0.0 2023-10-04 19:22:58,218 WARNING [train.py:1204] (2/4) Exclude cut with ID _40402_210_18219_1_1533522324115_2953150_178-178004-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 19:22:58,229 WARNING [train.py:1204] (2/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_321-335475-0_sp1.1 from training. Duration: 0.4090625 2023-10-04 19:22:58,299 WARNING [train.py:1204] (2/4) Exclude cut with ID _66863_210_18053_1_1535698758334_8590319_1092-271784-0 from training. Duration: 0.96 2023-10-04 19:22:58,306 WARNING [train.py:1204] (2/4) Exclude cut with ID _28317_210_12742_1_1532480302381_8510325_607-213951-0_sp1.1 from training. Duration: 0.763625 2023-10-04 19:22:58,322 WARNING [train.py:1204] (2/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_468-283113-0 from training. Duration: 0.96 2023-10-04 19:23:02,915 WARNING [train.py:1204] (2/4) Exclude cut with ID _14922_210_6228_1_1530707435273_3693310_13-165512-0_sp1.1 from training. Duration: 0.7454375 2023-10-04 19:23:02,936 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_69630_210_7234_1_1535788670_910989_56-169762-0_sp1.1 from training. Duration: 0.92725 2023-10-04 19:23:06,313 WARNING [train.py:1204] (2/4) Exclude cut with ID _67335_210_5219_1_1535525975652_4275229_121-80357-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 19:23:06,362 WARNING [train.py:1204] (2/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_126-12079-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 19:23:06,398 WARNING [train.py:1204] (2/4) Exclude cut with ID _5294_210_1747_1_1527249643928_3696179_342-270278-0_sp1.1 from training. Duration: 0.5 2023-10-04 19:23:07,838 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_30322_210_1925_1_1532843478_826817_35-328264-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 19:23:09,737 WARNING [train.py:1204] (2/4) Exclude cut with ID _8480_210_1874_1_1528589391205_6728330_755-112625-0_sp1.1 from training. Duration: 0.67275 2023-10-04 19:23:09,928 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.1.balancer2.prob, batch_count=1765960.0, ans=0.125 2023-10-04 19:23:10,991 INFO [train.py:1046] (2/4) Epoch 50, batch 4600, loss[loss=0.1614, simple_loss=0.2385, pruned_loss=0.04221, over 23771.00 frames. ], tot_loss[loss=0.1517, simple_loss=0.2323, pruned_loss=0.03559, over 4706445.72 frames. ], batch size: 179, lr: 2.02e-03, grad_scale: 8.0 2023-10-04 19:23:12,498 WARNING [train.py:1204] (2/4) Exclude cut with ID _38670_210_15379_1_1533448810659_4391059_132-146412-0_sp1.1 from training. Duration: 0.97275 2023-10-04 19:23:13,847 WARNING [train.py:1204] (2/4) Exclude cut with ID _22020_210_2461_1_1531724164513_6326900_563-39691-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 19:23:16,558 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_9460_210_4211_1_1528954587_10049667_572-37035-0_sp1.1 from training. Duration: 0.72725 2023-10-04 19:23:17,209 WARNING [train.py:1204] (2/4) Exclude cut with ID _68777_210_4353_1_1535810390731_3117904_129-166012-0_sp0.9 from training. Duration: 0.9444375 2023-10-04 19:23:17,397 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.ff2_skip_rate, batch_count=1765960.0, ans=0.0 2023-10-04 19:23:18,508 WARNING [train.py:1204] (2/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_487-91921-0_sp1.1 from training. Duration: 0.963625 2023-10-04 19:23:18,609 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_195-257512-0 from training. Duration: 0.67 2023-10-04 19:23:20,056 WARNING [train.py:1204] (2/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_188-260011-0_sp1.1 from training. Duration: 0.663625 2023-10-04 19:23:23,928 WARNING [train.py:1204] (2/4) Exclude cut with ID _8480_210_1874_1_1528589391205_6728330_309-139896-0_sp0.9 from training. Duration: 0.9 2023-10-04 19:23:23,992 WARNING [train.py:1204] (2/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_866-321248-0_sp1.1 from training. Duration: 0.963625 2023-10-04 19:23:26,732 WARNING [train.py:1204] (2/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_510-216513-0_sp1.1 from training. Duration: 0.97275 2023-10-04 19:23:32,758 WARNING [train.py:1204] (2/4) Exclude cut with ID _6653_210_2629_1_1527919172441_6732679_758-143677-0 from training. Duration: 0.98 2023-10-04 19:23:32,851 WARNING [train.py:1204] (2/4) Exclude cut with ID _55134_210_21982_1_1534590028377_3882170_50-214427-0_sp1.1 from training. Duration: 0.97275 2023-10-04 19:23:36,175 WARNING [train.py:1204] (2/4) Exclude cut with ID _31649_210_6228_1_1532584608063_4091078_482-301330-0_sp1.1 from training. Duration: 0.97275 2023-10-04 19:23:38,161 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.4.encoder.layers.0.feed_forward3.out_whiten, num_groups=1, num_channels=384, metric=12.38 vs. limit=15.0 2023-10-04 19:23:39,342 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.716e+02 2.190e+02 2.507e+02 2.915e+02 5.152e+02, threshold=5.014e+02, percent-clipped=2.0 2023-10-04 19:23:39,480 WARNING [train.py:1204] (2/4) Exclude cut with ID _35799_210_5854_1_1533263487887_3529071_490-326847-0_sp1.1 from training. Duration: 0.9 2023-10-04 19:23:39,488 WARNING [train.py:1204] (2/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_984-90716-0_sp1.1 from training. Duration: 0.963625 2023-10-04 19:23:44,948 WARNING [train.py:1204] (2/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_166-246557-0 from training. Duration: 0.64 2023-10-04 19:23:44,948 WARNING [train.py:1204] (2/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_51-200220-0_sp1.1 from training. Duration: 0.5090625 2023-10-04 19:23:47,018 WARNING [train.py:1204] (2/4) Exclude cut with ID _51382_210_23862_1_1534492801838_3650104_275-92796-0_sp1.1 from training. Duration: 0.936375 2023-10-04 19:23:48,351 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.feed_forward1.hidden_balancer.prob, batch_count=1766093.3333333333, ans=0.125 2023-10-04 19:23:52,366 WARNING [train.py:1204] (2/4) Exclude cut with ID _29853_210_7497_1_1532505600851_6642110_461-142829-0_sp1.1 from training. Duration: 0.97275 2023-10-04 19:23:52,403 WARNING [train.py:1204] (2/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_788-298907-0_sp0.9 from training. Duration: 0.988875 2023-10-04 19:23:53,779 WARNING [train.py:1204] (2/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_739-2098-0_sp1.1 from training. Duration: 0.836375 2023-10-04 19:23:57,892 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7543_210_6753_1_1528541907_1399322_165-323619-0 from training. Duration: 0.69 2023-10-04 19:23:57,987 WARNING [train.py:1204] (2/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_401-255491-0_sp1.1 from training. Duration: 0.636375 2023-10-04 19:24:02,234 WARNING [train.py:1204] (2/4) Exclude cut with ID _66873_210_5517_1_1535454136506_4293370_24-143283-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 19:24:04,030 WARNING [train.py:1204] (2/4) Exclude cut with ID _14459_210_2228_1_1531443641946_7516700_1221-280439-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 19:24:05,473 WARNING [train.py:1204] (2/4) Exclude cut with ID _59202_210_26984_1_1534816810612_3549174_469-213482-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 19:24:05,474 WARNING [train.py:1204] (2/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_148-158509-0_sp0.9 from training. Duration: 0.4555625 2023-10-04 19:24:07,328 WARNING [train.py:1204] (2/4) Exclude cut with ID _24314_210_4915_1_1532135078116_7170179_863-259095-0_sp1.1 from training. Duration: 0.97275 2023-10-04 19:24:07,404 WARNING [train.py:1204] (2/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_321-22821-0 from training. Duration: 0.89 2023-10-04 19:24:08,760 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_43831_210_2158_1_1533725502_5872791_55-163137-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 19:24:08,812 WARNING [train.py:1204] (2/4) Exclude cut with ID _27430_210_7770_1_1532604689375_1378590_26-25207-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 19:24:10,203 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_17613_210_2151_1_1531137569_2366160_223-316461-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 19:24:12,196 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_53679_210_4852_1_1534556726_4186141_541-213370-0_sp1.1 from training. Duration: 0.963625 2023-10-04 19:24:12,278 WARNING [train.py:1204] (2/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_369-295086-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 19:24:12,902 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.conv_module1.whiten.whitening_limit, batch_count=1766226.6666666667, ans=15.0 2023-10-04 19:24:13,654 WARNING [train.py:1204] (2/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_91-267696-0 from training. Duration: 0.87 2023-10-04 19:24:13,699 WARNING [train.py:1204] (2/4) Exclude cut with ID _5294_210_1747_1_1527249643928_3696179_180-100713-0 from training. Duration: 0.81 2023-10-04 19:24:13,727 WARNING [train.py:1204] (2/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_32-335103-0 from training. Duration: 0.82 2023-10-04 19:24:13,732 WARNING [train.py:1204] (2/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_133-312727-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 19:24:15,062 WARNING [train.py:1204] (2/4) Exclude cut with ID _59288_210_5854_1_1535077757774_3412246_391-165934-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 19:24:16,405 WARNING [train.py:1204] (2/4) Exclude cut with ID _28323_210_16187_1_1532480395667_4100300_488-1281-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 19:24:17,493 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.5.encoder.layers.0.conv_module2.whiten, num_groups=1, num_channels=256, metric=3.96 vs. limit=15.0 2023-10-04 19:24:18,191 WARNING [train.py:1204] (2/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_124-193207-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 19:24:19,921 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=1766226.6666666667, ans=0.1 2023-10-04 19:24:25,033 INFO [train.py:1046] (2/4) Epoch 50, batch 4650, loss[loss=0.1506, simple_loss=0.2278, pruned_loss=0.03667, over 23342.00 frames. ], tot_loss[loss=0.1514, simple_loss=0.2319, pruned_loss=0.03548, over 4711348.57 frames. ], batch size: 119, lr: 2.02e-03, grad_scale: 8.0 2023-10-04 19:24:26,528 WARNING [train.py:1204] (2/4) Exclude cut with ID _14087_210_2253_1_1530529224512_3014029_226-242140-0_sp0.9 from training. Duration: 0.9 2023-10-04 19:24:29,309 WARNING [train.py:1204] (2/4) Exclude cut with ID _29277_210_3279_1_1532581109293_3173069_392-3694-0_sp1.1 from training. Duration: 0.936375 2023-10-04 19:24:29,339 WARNING [train.py:1204] (2/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_563-329842-0_sp1.1 from training. Duration: 0.97275 2023-10-04 19:24:29,397 WARNING [train.py:1204] (2/4) Exclude cut with ID _58583_210_5236_1_1534726835266_3574249_391-87755-0_sp1.1 from training. Duration: 0.92725 2023-10-04 19:24:29,417 WARNING [train.py:1204] (2/4) Exclude cut with ID _28427_210_4220_1_1532581240967_3061069_281-237827-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 19:24:30,680 WARNING [train.py:1204] (2/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_44-147898-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 19:24:30,771 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_24007_210_1794_1_1531918925_1663875_125-320772-0_sp1.1 from training. Duration: 0.97275 2023-10-04 19:24:34,238 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.balancer1.prob, batch_count=1766293.3333333333, ans=0.125 2023-10-04 19:24:35,336 WARNING [train.py:1204] (2/4) Exclude cut with ID _14368_210_4220_1_1530532714870_3595529_143-117421-0 from training. Duration: 0.58 2023-10-04 19:24:40,048 WARNING [train.py:1204] (2/4) Exclude cut with ID _69584_210_5219_1_1535785133775_4044450_325-7656-0_sp0.9 from training. Duration: 0.9555625 2023-10-04 19:24:42,003 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_37351_210_7925_1_1533012931_3812113_265-85780-0 from training. Duration: 0.88 2023-10-04 19:24:43,289 WARNING [train.py:1204] (2/4) Exclude cut with ID _18516_210_9206_1_1531704653206_3692406_331-237896-0_sp1.1 from training. Duration: 0.936375 2023-10-04 19:24:44,641 WARNING [train.py:1204] (2/4) Exclude cut with ID _14629_210_15709_1_1530604298182_956899_6-232695-0 from training. Duration: 0.94 2023-10-04 19:24:44,668 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7544_210_6753_1_1528587000_691468_87-178599-0_sp1.1 from training. Duration: 0.7909375 2023-10-04 19:24:46,041 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_326-303945-0 from training. Duration: 0.99 2023-10-04 19:24:46,056 WARNING [train.py:1204] (2/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_503-30719-0 from training. Duration: 0.98 2023-10-04 19:24:46,064 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_11305_210_1794_1_1529750428_7178148_299-344640-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 19:24:47,403 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_53071_210_16554_1_1534490405_2954066_265-170472-0_sp1.1 from training. Duration: 0.7545625 2023-10-04 19:24:49,539 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_174-277364-0_sp1.1 from training. Duration: 0.6454375 2023-10-04 19:24:50,740 WARNING [train.py:1204] (2/4) Exclude cut with ID _58414_210_8254_1_1535421609162_7151182_193-151884-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 19:24:50,764 WARNING [train.py:1204] (2/4) Exclude cut with ID IC0651W0104-322063 from training. Duration: 0.768 2023-10-04 19:24:50,995 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.bypass_mid.scale_min, batch_count=1766360.0, ans=0.2 2023-10-04 19:24:54,836 WARNING [train.py:1204] (2/4) Exclude cut with ID _21881_210_3525_1_1531825242757_3798380_139-305982-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 19:24:56,260 WARNING [train.py:1204] (2/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_79-200392-0 from training. Duration: 0.97 2023-10-04 19:24:57,795 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.1.conv_module2.balancer1.prob, batch_count=1766426.6666666667, ans=0.125 2023-10-04 19:24:59,025 WARNING [train.py:1204] (2/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_451-280894-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 19:24:59,041 WARNING [train.py:1204] (2/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_304-273433-0_sp1.1 from training. Duration: 0.9 2023-10-04 19:25:00,358 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_5959_210_1794_1_1527400400_3445726_412-44052-0 from training. Duration: 0.77 2023-10-04 19:25:01,847 WARNING [train.py:1204] (2/4) Exclude cut with ID _4681_210_3525_1_1527251040321_1235713_148-26159-0_sp1.1 from training. Duration: 0.963625 2023-10-04 19:25:04,711 WARNING [train.py:1204] (2/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_887-249555-0_sp0.9 from training. Duration: 0.8555625 2023-10-04 19:25:07,970 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_25236_210_2158_1_1532051968_280567_37-6860-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 19:25:12,274 WARNING [train.py:1204] (2/4) Exclude cut with ID _29277_210_3279_1_1532581109293_3173069_417-115527-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 19:25:14,311 WARNING [train.py:1204] (2/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_552-134748-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 19:25:14,459 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.conv_skip_rate, batch_count=1766493.3333333333, ans=0.0 2023-10-04 19:25:15,620 WARNING [train.py:1204] (2/4) Exclude cut with ID _43176_210_8254_1_1533524417469_4174339_188-174143-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 19:25:16,950 WARNING [train.py:1204] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_127-119109-0_sp1.1 from training. Duration: 0.6545625 2023-10-04 19:25:19,672 WARNING [train.py:1204] (2/4) Exclude cut with ID _14922_210_6228_1_1530707435273_3693310_38-187851-0 from training. Duration: 0.62 2023-10-04 19:25:19,709 WARNING [train.py:1204] (2/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_588-125165-0 from training. Duration: 0.83 2023-10-04 19:25:21,059 WARNING [train.py:1204] (2/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_344-152526-0_sp1.1 from training. Duration: 0.4818125 2023-10-04 19:25:21,060 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7002_210_1794_1_1527935634_4034444_487-236501-0 from training. Duration: 0.66 2023-10-04 19:25:23,005 WARNING [train.py:1204] (2/4) Exclude cut with ID _23825_210_13449_1_1531915313526_3497150_379-16981-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 19:25:28,610 WARNING [train.py:1204] (2/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_297-5233-0_sp1.1 from training. Duration: 0.863625 2023-10-04 19:25:28,620 WARNING [train.py:1204] (2/4) Exclude cut with ID _41298_210_9019_1_1533430371450_4822143_178-283538-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 19:25:29,935 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7046_210_6753_1_1527895337_6920917_642-106799-0 from training. Duration: 0.92 2023-10-04 19:25:29,948 WARNING [train.py:1204] (2/4) Exclude cut with ID _6274_210_2084_1_1527937583837_7494020_532-291952-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 19:25:31,407 WARNING [train.py:1204] (2/4) Exclude cut with ID _47278_210_12041_1_1533902418349_3576110_260-270468-0_sp1.1 from training. Duration: 0.92725 2023-10-04 19:25:31,412 WARNING [train.py:1204] (2/4) Exclude cut with ID _14368_210_4220_1_1530532714870_3595529_144-3287-0_sp0.9 from training. Duration: 0.7444375 2023-10-04 19:25:32,900 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_162-262717-0_sp0.9 from training. Duration: 0.92225 2023-10-04 19:25:35,640 WARNING [train.py:1204] (2/4) Exclude cut with ID _3465_210_3525_1_1526175667060_3574049_62-335552-0_sp0.9 from training. Duration: 0.9666875 2023-10-04 19:25:35,645 WARNING [train.py:1204] (2/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_726-272852-0_sp1.1 from training. Duration: 0.92725 2023-10-04 19:25:35,733 WARNING [train.py:1204] (2/4) Exclude cut with ID _14898_210_9206_1_1530766812377_4177190_26-135492-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 19:25:38,798 INFO [train.py:1046] (2/4) Epoch 50, batch 4700, loss[loss=0.166, simple_loss=0.2439, pruned_loss=0.04401, over 23607.00 frames. ], tot_loss[loss=0.1512, simple_loss=0.2319, pruned_loss=0.03522, over 4720505.05 frames. ], batch size: 256, lr: 2.02e-03, grad_scale: 8.0 2023-10-04 19:25:38,967 WARNING [train.py:1204] (2/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_200-95304-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 19:25:38,996 WARNING [train.py:1204] (2/4) Exclude cut with ID _45012_210_12651_1_1533608435136_5371380_240-37101-0_sp1.1 from training. Duration: 0.8454375 2023-10-04 19:25:39,004 WARNING [train.py:1204] (2/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_132-269633-0_sp0.9 from training. Duration: 0.7555625 2023-10-04 19:25:40,943 WARNING [train.py:1204] (2/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_907-151398-0_sp0.9 from training. Duration: 0.411125 2023-10-04 19:25:41,108 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.0.bypass.scale_min, batch_count=1766626.6666666667, ans=0.2 2023-10-04 19:25:42,353 WARNING [train.py:1204] (2/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_45-265684-0_sp0.9 from training. Duration: 0.8 2023-10-04 19:25:43,723 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6291_210_3273_1_1527683649_1078498_74-292608-0 from training. Duration: 0.72 2023-10-04 19:25:50,120 WARNING [train.py:1204] (2/4) Exclude cut with ID _59202_210_26984_1_1534816810612_3549174_221-84819-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 19:25:51,414 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_33696_210_3852_1_1533082780_2663375_181-325595-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 19:25:51,462 WARNING [train.py:1204] (2/4) Exclude cut with ID _47251_210_18628_1_1533814063128_4137250_351-154105-0_sp1.1 from training. Duration: 0.963625 2023-10-04 19:25:53,381 WARNING [train.py:1204] (2/4) Exclude cut with ID _35501_210_9288_1_1532998507986_3192189_351-334740-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 19:25:56,024 WARNING [train.py:1204] (2/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_342-100157-0_sp1.1 from training. Duration: 0.4909375 2023-10-04 19:26:01,487 WARNING [train.py:1204] (2/4) Exclude cut with ID _6789_210_1534_1_1527857937218_3118950_262-48853-0 from training. Duration: 0.56 2023-10-04 19:26:01,529 WARNING [train.py:1204] (2/4) Exclude cut with ID _69884_210_9771_1_1535846392818_4166169_430-201020-0 from training. Duration: 0.97 2023-10-04 19:26:04,324 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_71-136518-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 19:26:05,639 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_28-291982-0_sp1.1 from training. Duration: 0.8818125 2023-10-04 19:26:07,000 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.807e+02 2.151e+02 2.397e+02 2.969e+02 5.110e+02, threshold=4.793e+02, percent-clipped=1.0 2023-10-04 19:26:07,065 WARNING [train.py:1204] (2/4) Exclude cut with ID _54218_210_12210_1_1534413646388_4409259_437-275326-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 19:26:10,493 WARNING [train.py:1204] (2/4) Exclude cut with ID _55134_210_21982_1_1534590028377_3882170_40-309139-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 19:26:15,363 WARNING [train.py:1204] (2/4) Exclude cut with ID _32704_210_8954_1_1533017488840_6747370_266-329755-0_sp1.1 from training. Duration: 0.8090625 2023-10-04 19:26:15,449 WARNING [train.py:1204] (2/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_21-174514-0_sp1.1 from training. Duration: 0.3909375 2023-10-04 19:26:18,059 WARNING [train.py:1204] (2/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_409-207378-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 19:26:22,880 WARNING [train.py:1204] (2/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_132-269633-0 from training. Duration: 0.68 2023-10-04 19:26:24,592 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_2245_210_2715_1_1525511548_3540254_310-52483-0_sp0.9 from training. Duration: 0.97775 2023-10-04 19:26:26,059 WARNING [train.py:1204] (2/4) Exclude cut with ID _27430_210_7770_1_1532604689375_1378590_47-77403-0_sp1.1 from training. Duration: 0.92725 2023-10-04 19:26:31,543 WARNING [train.py:1204] (2/4) Exclude cut with ID _7562_210_2084_1_1528887767916_3946229_186-326422-0 from training. Duration: 0.84 2023-10-04 19:26:32,897 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_14056_210_4163_1_1530604483_7572827_230-35993-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 19:26:35,720 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_23854_210_1794_1_1531969696_1329157_57-123491-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 19:26:37,506 WARNING [train.py:1204] (2/4) Exclude cut with ID _14747_210_8341_1_1530685797463_3654089_147-131229-0 from training. Duration: 0.76 2023-10-04 19:26:38,871 WARNING [train.py:1204] (2/4) Exclude cut with ID _30191_210_1664_1_1533189915047_7196670_430-19539-0_sp1.1 from training. Duration: 0.92725 2023-10-04 19:26:38,886 WARNING [train.py:1204] (2/4) Exclude cut with ID _67725_210_27767_1_1535608756671_5105359_44-50762-0_sp1.1 from training. Duration: 0.963625 2023-10-04 19:26:40,477 WARNING [train.py:1204] (2/4) Exclude cut with ID _27485_210_4916_1_1532739191677_3668269_217-161755-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 19:26:40,662 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.bypass.scale_min, batch_count=1766893.3333333333, ans=0.2 2023-10-04 19:26:41,854 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_5931_210_2040_1_1527428481_4547999_321-193614-0_sp1.1 from training. Duration: 0.6090625 2023-10-04 19:26:41,871 WARNING [train.py:1204] (2/4) Exclude cut with ID _6267_210_2084_1_1527764658526_3894730_375-219887-0 from training. Duration: 0.67 2023-10-04 19:26:41,948 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9087W0022-538393 from training. Duration: 0.768 2023-10-04 19:26:42,898 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.0.layers.1.feed_forward2.out_whiten, num_groups=1, num_channels=192, metric=4.15 vs. limit=15.0 2023-10-04 19:26:43,319 WARNING [train.py:1204] (2/4) Exclude cut with ID _68777_210_4353_1_1535810390731_3117904_83-196459-0_sp1.1 from training. Duration: 0.963625 2023-10-04 19:26:45,351 WARNING [train.py:1204] (2/4) Exclude cut with ID _69884_210_9771_1_1535846392818_4166169_455-343812-0_sp1.1 from training. Duration: 0.92725 2023-10-04 19:26:45,351 WARNING [train.py:1204] (2/4) Exclude cut with ID _13916_210_9994_1_1530788341126_3763149_414-320174-0_sp1.1 from training. Duration: 0.92725 2023-10-04 19:26:45,355 WARNING [train.py:1204] (2/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_398-239819-0 from training. Duration: 0.99 2023-10-04 19:26:46,671 WARNING [train.py:1204] (2/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_199-232314-0_sp1.1 from training. Duration: 0.92725 2023-10-04 19:26:50,779 WARNING [train.py:1204] (2/4) Exclude cut with ID _2334_210_2667_1_1525608238880_4705129_459-104997-0 from training. Duration: 0.95 2023-10-04 19:26:54,055 INFO [train.py:1046] (2/4) Epoch 50, batch 4750, loss[loss=0.159, simple_loss=0.2525, pruned_loss=0.0328, over 24341.00 frames. ], tot_loss[loss=0.1522, simple_loss=0.2332, pruned_loss=0.03565, over 4723614.87 frames. ], batch size: 74, lr: 2.02e-03, grad_scale: 8.0 2023-10-04 19:26:54,134 WARNING [train.py:1204] (2/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_441-209395-0_sp1.1 from training. Duration: 0.936375 2023-10-04 19:26:56,009 WARNING [train.py:1204] (2/4) Exclude cut with ID _27052_210_5986_1_1532329197187_3585009_320-80393-0_sp1.1 from training. Duration: 0.97275 2023-10-04 19:27:00,159 WARNING [train.py:1204] (2/4) Exclude cut with ID _21783_210_11357_1_1531742458730_3762679_89-217253-0_sp1.1 from training. Duration: 0.97275 2023-10-04 19:27:00,331 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.bypass_mid.scale_min, batch_count=1766960.0, ans=0.2 2023-10-04 19:27:01,442 WARNING [train.py:1204] (2/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_453-282588-0_sp1.1 from training. Duration: 0.8909375 2023-10-04 19:27:04,102 WARNING [train.py:1204] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_88-341718-0 from training. Duration: 0.66 2023-10-04 19:27:04,151 WARNING [train.py:1204] (2/4) Exclude cut with ID _43552_210_8913_1_1533726031059_3523429_230-3521-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 19:27:05,059 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.0.layers.1.whiten, num_groups=1, num_channels=192, metric=3.79 vs. limit=12.0 2023-10-04 19:27:06,985 WARNING [train.py:1204] (2/4) Exclude cut with ID _9679_210_7497_1_1529129212906_4363929_512-313889-0 from training. Duration: 0.81 2023-10-04 19:27:09,108 WARNING [train.py:1204] (2/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_194-252800-0_sp1.1 from training. Duration: 0.8818125 2023-10-04 19:27:09,134 WARNING [train.py:1204] (2/4) Exclude cut with ID _55209_210_1531_1_1534675828996_4597160_239-293541-0_sp1.1 from training. Duration: 0.963625 2023-10-04 19:27:10,466 WARNING [train.py:1204] (2/4) Exclude cut with ID _35799_210_5854_1_1533263487887_3529071_15-325131-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 19:27:14,692 WARNING [train.py:1204] (2/4) Exclude cut with ID _14100_210_8341_1_1530529320613_3678069_279-55760-0 from training. Duration: 0.93 2023-10-04 19:27:20,764 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_489-230703-0_sp1.1 from training. Duration: 0.77275 2023-10-04 19:27:23,339 WARNING [train.py:1204] (2/4) Exclude cut with ID _6694_210_4599_1_1527852637490_3965529_144-185051-0 from training. Duration: 0.96 2023-10-04 19:27:23,403 WARNING [train.py:1204] (2/4) Exclude cut with ID _28461_210_2253_1_1532771775189_3041299_1-228155-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 19:27:26,694 WARNING [train.py:1204] (2/4) Exclude cut with ID _32704_210_8954_1_1533017488840_6747370_686-252323-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 19:27:26,696 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_40896_210_15710_1_1533433956_7100802_1075-294633-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 19:27:26,715 WARNING [train.py:1204] (2/4) Exclude cut with ID _22083_210_8254_1_1531702862439_3678970_30-92879-0_sp1.1 from training. Duration: 0.97275 2023-10-04 19:27:28,089 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9245W0121-616888 from training. Duration: 0.896 2023-10-04 19:27:28,092 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6786_210_1388_1_1527839699_4194495_378-46070-0 from training. Duration: 0.72 2023-10-04 19:27:28,989 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=1767093.3333333333, ans=0.1 2023-10-04 19:27:34,303 WARNING [train.py:1204] (2/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_317-175062-0 from training. Duration: 0.82 2023-10-04 19:27:34,573 INFO [scaling.py:1118] (2/4) WithLoss: name=encoder.encoders.3.encoder.layers.1.self_attn_weights, loss-sum=0.000e+00 2023-10-04 19:27:36,996 WARNING [train.py:1204] (2/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_238-89390-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 19:27:39,627 WARNING [train.py:1204] (2/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_524-310811-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 19:27:41,158 WARNING [train.py:1204] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_307-158462-0_sp0.9 from training. Duration: 0.7666875 2023-10-04 19:27:41,159 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9309W0191-640318 from training. Duration: 0.512 2023-10-04 19:27:41,163 WARNING [train.py:1204] (2/4) Exclude cut with ID _55153_210_2253_1_1534384786677_3162096_100-204617-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 19:27:44,347 WARNING [train.py:1204] (2/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_112-149213-0_sp1.1 from training. Duration: 0.863625 2023-10-04 19:27:44,609 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.self_attn_weights.pos_emb_skip_rate, batch_count=1767160.0, ans=0.0 2023-10-04 19:27:45,878 WARNING [train.py:1204] (2/4) Exclude cut with ID _67831_210_12392_1_1535612464202_3092612_346-2148-0_sp1.1 from training. Duration: 0.8454375 2023-10-04 19:27:47,459 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.conv_module1.balancer2.prob, batch_count=1767160.0, ans=0.125 2023-10-04 19:27:48,558 WARNING [train.py:1204] (2/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_51-117576-0_sp0.9 from training. Duration: 0.4444375 2023-10-04 19:27:48,587 WARNING [train.py:1204] (2/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_300-27235-0 from training. Duration: 0.87 2023-10-04 19:27:49,978 WARNING [train.py:1204] (2/4) Exclude cut with ID _40399_210_4353_1_1533305658194_3279406_195-116825-0_sp1.1 from training. Duration: 0.97275 2023-10-04 19:27:49,999 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7002_210_1794_1_1527939683_5157286_163-87552-0_sp0.9 from training. Duration: 0.9555625 2023-10-04 19:27:50,036 WARNING [train.py:1204] (2/4) Exclude cut with ID _19514_210_9259_1_1531530071330_5084230_560-237597-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 19:27:51,385 WARNING [train.py:1204] (2/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_41-234353-0_sp1.1 from training. Duration: 0.6181875 2023-10-04 19:27:51,404 WARNING [train.py:1204] (2/4) Exclude cut with ID _6274_210_2084_1_1527937583837_7494020_769-168302-0 from training. Duration: 0.71 2023-10-04 19:27:53,438 WARNING [train.py:1204] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_574-45541-0 from training. Duration: 0.65 2023-10-04 19:27:56,694 WARNING [train.py:1204] (2/4) Exclude cut with ID _50310_210_15488_1_1534068043345_3641080_262-67120-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 19:27:58,353 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_53071_210_16554_1_1534493434_1276796_35-11927-0_sp1.1 from training. Duration: 0.963625 2023-10-04 19:27:58,355 WARNING [train.py:1204] (2/4) Exclude cut with ID _3992_210_1747_1_1526644816031_3731060_107-251434-0 from training. Duration: 0.99 2023-10-04 19:27:59,658 WARNING [train.py:1204] (2/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_412-54488-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 19:28:02,792 WARNING [train.py:1204] (2/4) Exclude cut with ID _18516_210_9206_1_1531704653206_3692406_77-258490-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 19:28:04,138 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_40047_210_2290_1_1533290519_2374311_195-165693-0_sp0.9 from training. Duration: 0.988875 2023-10-04 19:28:04,186 WARNING [train.py:1204] (2/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_296-36971-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 19:28:04,394 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.out_combiner.scale_min, batch_count=1767226.6666666667, ans=0.2 2023-10-04 19:28:05,508 WARNING [train.py:1204] (2/4) Exclude cut with ID _14970_210_15709_1_1530773543493_1715730_187-20259-0_sp1.1 from training. Duration: 0.8090625 2023-10-04 19:28:08,313 INFO [train.py:1046] (2/4) Epoch 50, batch 4800, loss[loss=0.1354, simple_loss=0.221, pruned_loss=0.0249, over 24334.00 frames. ], tot_loss[loss=0.1525, simple_loss=0.2336, pruned_loss=0.03567, over 4718735.95 frames. ], batch size: 61, lr: 2.02e-03, grad_scale: 16.0 2023-10-04 19:28:08,358 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_53373_210_5399_1_1534229471_4715695_135-305927-0_sp1.1 from training. Duration: 0.936375 2023-10-04 19:28:08,390 WARNING [train.py:1204] (2/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_60-18123-0 from training. Duration: 0.88 2023-10-04 19:28:09,878 WARNING [train.py:1204] (2/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_166-166990-0 from training. Duration: 0.96 2023-10-04 19:28:11,143 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_5438_210_1794_1_1527336687_2600009_233-58072-0 from training. Duration: 0.77 2023-10-04 19:28:14,383 WARNING [train.py:1204] (2/4) Exclude cut with ID _8833_210_5219_1_1528592665210_7553059_611-80313-0_sp0.9 from training. Duration: 0.711125 2023-10-04 19:28:14,407 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_53555_210_5632_1_1534595220_7232297_118-43239-0_sp1.1 from training. Duration: 0.936375 2023-10-04 19:28:14,502 WARNING [train.py:1204] (2/4) Exclude cut with ID _12369_210_4381_1_1530356078581_4388520_480-13146-0 from training. Duration: 0.85 2023-10-04 19:28:18,816 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_892-28814-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 19:28:20,212 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_24899_210_16554_1_1532046422_3827462_160-21255-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 19:28:24,387 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_154-316450-0_sp1.1 from training. Duration: 0.6909375 2023-10-04 19:28:26,291 WARNING [train.py:1204] (2/4) Exclude cut with ID _30196_210_1664_1_1533708496522_7074140_1225-301232-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 19:28:26,325 WARNING [train.py:1204] (2/4) Exclude cut with ID _28783_210_7943_1_1532671164808_7155479_414-208100-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 19:28:28,183 WARNING [train.py:1204] (2/4) Exclude cut with ID _19023_210_4353_1_1531378892552_5820780_83-254794-0 from training. Duration: 0.86 2023-10-04 19:28:28,249 WARNING [train.py:1204] (2/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_591-83752-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 19:28:28,299 WARNING [train.py:1204] (2/4) Exclude cut with ID _29877_210_6228_1_1532397791106_5276149_266-62291-0_sp1.1 from training. Duration: 0.7909375 2023-10-04 19:28:29,012 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.1.encoder.layers.0.whiten, num_groups=1, num_channels=256, metric=5.10 vs. limit=12.0 2023-10-04 19:28:29,721 WARNING [train.py:1204] (2/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_280-161633-0_sp1.1 from training. Duration: 0.663625 2023-10-04 19:28:34,326 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7543_210_6753_1_1528539468_333764_39-156634-0_sp1.1 from training. Duration: 0.97275 2023-10-04 19:28:35,729 WARNING [train.py:1204] (2/4) Exclude cut with ID _67499_210_17282_1_1535509805220_3692430_505-312248-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 19:28:35,760 WARNING [train.py:1204] (2/4) Exclude cut with ID _5294_210_1747_1_1527249643928_3696179_180-100713-0_sp0.9 from training. Duration: 0.9 2023-10-04 19:28:37,018 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.867e+02 2.213e+02 2.544e+02 3.489e+02 5.983e+02, threshold=5.088e+02, percent-clipped=6.0 2023-10-04 19:28:37,160 WARNING [train.py:1204] (2/4) Exclude cut with ID _26528_210_9070_1_1532570410842_3851310_508-148132-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 19:28:37,169 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_211-339622-0_sp1.1 from training. Duration: 0.4090625 2023-10-04 19:28:37,181 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_40896_210_15710_1_1533433956_7100802_339-138356-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 19:28:38,505 WARNING [train.py:1204] (2/4) Exclude cut with ID _27060_210_5986_1_1532674838922_3522549_211-19933-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 19:28:40,000 WARNING [train.py:1204] (2/4) Exclude cut with ID _60229_210_27767_1_1534838406158_3655350_457-117923-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 19:28:43,399 WARNING [train.py:1204] (2/4) Exclude cut with ID _55429_210_10626_1_1534467588527_7174143_460-141439-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 19:28:46,080 WARNING [train.py:1204] (2/4) Exclude cut with ID _59170_210_6721_1_1534983633960_4203335_263-10780-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 19:28:46,099 WARNING [train.py:1204] (2/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_872-264525-0_sp0.9 from training. Duration: 0.72225 2023-10-04 19:28:47,433 WARNING [train.py:1204] (2/4) Exclude cut with ID _7624_210_1531_1_1528109802198_4933101_418-349834-0_sp0.9 from training. Duration: 0.6666875 2023-10-04 19:28:48,806 WARNING [train.py:1204] (2/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_27-53998-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 19:28:50,198 WARNING [train.py:1204] (2/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_202-183922-0 from training. Duration: 0.85 2023-10-04 19:28:50,216 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7002_210_1794_1_1527939683_5157286_1-281264-0 from training. Duration: 0.44 2023-10-04 19:28:51,666 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_53753_210_7234_1_1534320003_3594148_493-9314-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 19:28:51,682 WARNING [train.py:1204] (2/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_217-95056-0_sp1.1 from training. Duration: 0.8909375 2023-10-04 19:28:51,728 WARNING [train.py:1204] (2/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_406-45086-0_sp1.1 from training. Duration: 0.72725 2023-10-04 19:28:51,734 WARNING [train.py:1204] (2/4) Exclude cut with ID _31649_210_6228_1_1532584608063_4091078_505-213821-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 19:28:51,742 WARNING [train.py:1204] (2/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_317-175062-0_sp0.9 from training. Duration: 0.911125 2023-10-04 19:28:53,184 WARNING [train.py:1204] (2/4) Exclude cut with ID _5444_210_2151_1_1527418808464_5312210_726-27165-0_sp1.1 from training. Duration: 0.6090625 2023-10-04 19:28:53,223 WARNING [train.py:1204] (2/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_494-170437-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 19:28:53,417 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.1.conv_skip_rate, batch_count=1767493.3333333333, ans=0.0 2023-10-04 19:28:56,993 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_474-64054-0_sp1.1 from training. Duration: 0.936375 2023-10-04 19:28:58,317 WARNING [train.py:1204] (2/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_521-7556-0_sp1.1 from training. Duration: 0.97275 2023-10-04 19:29:00,918 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_269-348505-0_sp1.1 from training. Duration: 0.92725 2023-10-04 19:29:05,060 WARNING [train.py:1204] (2/4) Exclude cut with ID _21878_210_3525_1_1531738772002_7264330_559-184862-0 from training. Duration: 0.94 2023-10-04 19:29:05,088 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_68702_210_23689_1_1535799577_3420068_92-76040-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 19:29:05,127 WARNING [train.py:1204] (2/4) Exclude cut with ID _22826_210_9994_1_1531908201522_3279169_227-102052-0_sp1.1 from training. Duration: 0.97275 2023-10-04 19:29:05,170 WARNING [train.py:1204] (2/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_718-250907-0_sp0.9 from training. Duration: 0.7666875 2023-10-04 19:29:06,463 WARNING [train.py:1204] (2/4) Exclude cut with ID _14069_210_5632_1_1530512692033_4803390_352-143891-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 19:29:09,461 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.balancer1.prob, batch_count=1767560.0, ans=0.125 2023-10-04 19:29:10,617 WARNING [train.py:1204] (2/4) Exclude cut with ID _6267_210_2084_1_1527764658526_3894730_65-106602-0_sp1.1 from training. Duration: 0.936375 2023-10-04 19:29:10,688 WARNING [train.py:1204] (2/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_202-183922-0_sp0.9 from training. Duration: 0.9444375 2023-10-04 19:29:10,694 WARNING [train.py:1204] (2/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_465-268827-0_sp1.1 from training. Duration: 0.97275 2023-10-04 19:29:10,750 WARNING [train.py:1204] (2/4) Exclude cut with ID _12369_210_4381_1_1530356078581_4388520_58-29064-0_sp1.1 from training. Duration: 0.9 2023-10-04 19:29:12,782 WARNING [train.py:1204] (2/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_1-99680-0_sp0.9 from training. Duration: 0.7444375 2023-10-04 19:29:13,026 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.balancer2.prob, batch_count=1767560.0, ans=0.125 2023-10-04 19:29:14,049 WARNING [train.py:1204] (2/4) Exclude cut with ID _13253_210_1925_1_1530757775819_3577873_191-144977-0_sp1.1 from training. Duration: 0.7090625 2023-10-04 19:29:14,312 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.conv_module1.balancer2.prob, batch_count=1767560.0, ans=0.125 2023-10-04 19:29:16,938 WARNING [train.py:1204] (2/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_276-112684-0_sp1.1 from training. Duration: 0.92725 2023-10-04 19:29:16,946 WARNING [train.py:1204] (2/4) Exclude cut with ID _64639_210_5816_1_1535367619437_3528000_358-411-0_sp1.1 from training. Duration: 0.97275 2023-10-04 19:29:18,171 WARNING [train.py:1204] (2/4) Exclude cut with ID _31649_210_6228_1_1532584608063_4091078_106-21199-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 19:29:19,595 WARNING [train.py:1204] (2/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_901-79438-0 from training. Duration: 0.94 2023-10-04 19:29:21,054 WARNING [train.py:1204] (2/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_718-250907-0 from training. Duration: 0.69 2023-10-04 19:29:21,060 WARNING [train.py:1204] (2/4) Exclude cut with ID _5287_210_2084_1_1527418883040_3878670_363-194161-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 19:29:21,064 WARNING [train.py:1204] (2/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_586-321321-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 19:29:22,382 INFO [train.py:1046] (2/4) Epoch 50, batch 4850, loss[loss=0.1635, simple_loss=0.2515, pruned_loss=0.03774, over 24416.00 frames. ], tot_loss[loss=0.1532, simple_loss=0.2341, pruned_loss=0.03612, over 4707375.94 frames. ], batch size: 77, lr: 2.02e-03, grad_scale: 16.0 2023-10-04 19:29:22,501 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_68900_210_2087_1_1535713157_3708529_164-292661-0_sp1.1 from training. Duration: 0.963625 2023-10-04 19:29:22,502 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_26433_210_12108_1_1532157158_4056480_114-144878-0_sp1.1 from training. Duration: 0.97275 2023-10-04 19:29:28,335 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_15517_210_6753_1_1530954008_3487336_96-341874-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 19:29:33,887 WARNING [train.py:1204] (2/4) Exclude cut with ID _14268_210_9206_1_1530622771285_4714099_637-86630-0 from training. Duration: 0.51 2023-10-04 19:29:33,995 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_28211_210_6388_1_1532861506_4523224_121-320092-0_sp1.1 from training. Duration: 0.92725 2023-10-04 19:29:39,672 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_141-89113-0_sp0.9 from training. Duration: 0.9555625 2023-10-04 19:29:40,955 WARNING [train.py:1204] (2/4) Exclude cut with ID _6789_210_1534_1_1527857937218_3118950_311-61262-0_sp1.1 from training. Duration: 0.5909375 2023-10-04 19:29:40,983 WARNING [train.py:1204] (2/4) Exclude cut with ID _58480_210_12392_1_1534811121111_3646160_389-200461-0_sp1.1 from training. Duration: 0.97275 2023-10-04 19:29:44,196 WARNING [train.py:1204] (2/4) Exclude cut with ID _41326_210_5854_1_1533778076509_3492080_28-156553-0_sp1.1 from training. Duration: 0.92725 2023-10-04 19:29:45,575 WARNING [train.py:1204] (2/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_577-287150-0_sp1.1 from training. Duration: 0.7454375 2023-10-04 19:29:48,214 WARNING [train.py:1204] (2/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_40-129756-0_sp1.1 from training. Duration: 0.736375 2023-10-04 19:29:48,224 WARNING [train.py:1204] (2/4) Exclude cut with ID _60113_210_16271_1_1535164212481_4386079_490-280290-0 from training. Duration: 0.98 2023-10-04 19:29:49,914 INFO [scaling.py:1118] (2/4) WithLoss: name=encoder.encoders.5.encoder.layers.1.self_attn_weights, loss-sum=0.000e+00 2023-10-04 19:29:51,082 WARNING [train.py:1204] (2/4) Exclude cut with ID _58059_210_5854_1_1534901471149_3164808_34-39905-0_sp1.1 from training. Duration: 0.963625 2023-10-04 19:29:53,750 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_791-197891-0_sp1.1 from training. Duration: 0.763625 2023-10-04 19:29:53,771 WARNING [train.py:1204] (2/4) Exclude cut with ID _6789_210_1534_1_1527857937218_3118950_262-48853-0_sp1.1 from training. Duration: 0.5090625 2023-10-04 19:29:55,124 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_2655_210_2087_1_1525688959_3897400_52-279899-0_sp0.9 from training. Duration: 0.7666875 2023-10-04 19:29:55,129 WARNING [train.py:1204] (2/4) Exclude cut with ID _14884_210_2252_1_1530707459689_5561250_114-201399-0 from training. Duration: 0.76 2023-10-04 19:29:55,283 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder_embed.convnext.hidden_balancer.prob, batch_count=1767760.0, ans=0.125 2023-10-04 19:29:57,038 WARNING [train.py:1204] (2/4) Exclude cut with ID _19023_210_4353_1_1531378892552_5820780_83-254794-0_sp0.9 from training. Duration: 0.9555625 2023-10-04 19:29:58,699 WARNING [train.py:1204] (2/4) Exclude cut with ID _53489_210_6228_1_1534298357169_3835970_10-297919-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 19:29:59,169 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.5.encoder.layers.1.conv_module2.whiten, num_groups=1, num_channels=256, metric=5.34 vs. limit=15.0 2023-10-04 19:30:02,809 WARNING [train.py:1204] (2/4) Exclude cut with ID _44585_210_18628_1_1533605350387_4518199_418-3055-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 19:30:02,821 WARNING [train.py:1204] (2/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_310-109461-0 from training. Duration: 0.8 2023-10-04 19:30:02,866 WARNING [train.py:1204] (2/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_235-262124-0 from training. Duration: 0.67 2023-10-04 19:30:04,234 WARNING [train.py:1204] (2/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_195-167011-0_sp1.1 from training. Duration: 0.6818125 2023-10-04 19:30:11,207 WARNING [train.py:1204] (2/4) Exclude cut with ID _7852_210_5219_1_1528286471316_8139400_188-55162-0_sp1.1 from training. Duration: 0.936375 2023-10-04 19:30:12,409 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_35361_210_4238_1_1533262810_7793476_675-87012-0 from training. Duration: 0.89 2023-10-04 19:30:13,792 WARNING [train.py:1204] (2/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_465-81427-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 19:30:13,806 WARNING [train.py:1204] (2/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_597-154640-0_sp1.1 from training. Duration: 0.8181875 2023-10-04 19:30:15,193 WARNING [train.py:1204] (2/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_186-309161-0_sp0.9 from training. Duration: 0.6 2023-10-04 19:30:17,365 WARNING [train.py:1204] (2/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_546-119312-0 from training. Duration: 0.99 2023-10-04 19:30:17,370 WARNING [train.py:1204] (2/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_330-240060-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 19:30:18,738 WARNING [train.py:1204] (2/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_477-255782-0 from training. Duration: 0.82 2023-10-04 19:30:18,764 WARNING [train.py:1204] (2/4) Exclude cut with ID _14970_210_15709_1_1530773543493_1715730_150-248939-0_sp1.1 from training. Duration: 0.97275 2023-10-04 19:30:18,959 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.feed_forward2.hidden_balancer.prob, batch_count=1767826.6666666667, ans=0.125 2023-10-04 19:30:20,300 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_556-206615-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 19:30:20,370 WARNING [train.py:1204] (2/4) Exclude cut with ID _3465_210_3525_1_1526175667060_3574049_308-231735-0 from training. Duration: 0.73 2023-10-04 19:30:29,566 WARNING [train.py:1204] (2/4) Exclude cut with ID _27485_210_4916_1_1532739191677_3668269_66-236571-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 19:30:31,927 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.5.encoder.layers.0.self_attn1.whiten, num_groups=1, num_channels=256, metric=11.16 vs. limit=22.5 2023-10-04 19:30:33,160 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.bypass.scale_min, batch_count=1767893.3333333333, ans=0.2 2023-10-04 19:30:34,356 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_2221_210_1988_1_1525597051_4935541_415-296882-0_sp0.9 from training. Duration: 0.8555625 2023-10-04 19:30:35,646 WARNING [train.py:1204] (2/4) Exclude cut with ID _64521_210_14699_1_1535243427259_3815328_231-78630-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 19:30:36,957 INFO [train.py:1046] (2/4) Epoch 50, batch 4900, loss[loss=0.1371, simple_loss=0.2073, pruned_loss=0.03346, over 23432.00 frames. ], tot_loss[loss=0.1523, simple_loss=0.2329, pruned_loss=0.03587, over 4703932.92 frames. ], batch size: 285, lr: 2.02e-03, grad_scale: 16.0 2023-10-04 19:30:39,811 WARNING [train.py:1204] (2/4) Exclude cut with ID _13942_210_9988_1_1530604906036_3733360_499-55676-0 from training. Duration: 0.62 2023-10-04 19:30:39,813 WARNING [train.py:1204] (2/4) Exclude cut with ID _49376_210_5986_1_1533985278732_3370740_301-154763-0_sp1.1 from training. Duration: 0.8909375 2023-10-04 19:30:44,074 WARNING [train.py:1204] (2/4) Exclude cut with ID _27485_210_4916_1_1532739191677_3668269_255-216698-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 19:30:45,742 WARNING [train.py:1204] (2/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_137-334143-0_sp1.1 from training. Duration: 0.97275 2023-10-04 19:30:45,771 WARNING [train.py:1204] (2/4) Exclude cut with ID _6456_210_1534_1_1527681634212_3181049_215-309359-0_sp1.1 from training. Duration: 0.8 2023-10-04 19:30:49,029 WARNING [train.py:1204] (2/4) Exclude cut with ID _65640_210_6310_1_1535368201445_3173250_209-13008-0 from training. Duration: 0.99 2023-10-04 19:30:53,406 WARNING [train.py:1204] (2/4) Exclude cut with ID _12369_210_4381_1_1530356078581_4388520_58-29064-0 from training. Duration: 0.99 2023-10-04 19:30:56,344 WARNING [train.py:1204] (2/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_410-287857-0 from training. Duration: 0.93 2023-10-04 19:30:58,957 WARNING [train.py:1204] (2/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_153-153537-0 from training. Duration: 0.91 2023-10-04 19:30:59,193 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.bypass_mid.scale_min, batch_count=1768026.6666666667, ans=0.2 2023-10-04 19:31:00,163 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7544_210_6753_1_1528587722_3576686_172-171750-0_sp0.9 from training. Duration: 0.72225 2023-10-04 19:31:00,198 WARNING [train.py:1204] (2/4) Exclude cut with ID _64521_210_14699_1_1535243427259_3815328_464-183853-0_sp1.1 from training. Duration: 0.97275 2023-10-04 19:31:02,164 WARNING [train.py:1204] (2/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_385-210308-0_sp1.1 from training. Duration: 0.963625 2023-10-04 19:31:02,183 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_46013_210_7234_1_1533776362_3744032_354-261865-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 19:31:02,189 WARNING [train.py:1204] (2/4) Exclude cut with ID _3067_210_2667_1_1526212974640_4503168_250-285937-0_sp1.1 from training. Duration: 0.72725 2023-10-04 19:31:02,250 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_40036_210_13405_1_1533648647_1315388_48-146016-0 from training. Duration: 0.93 2023-10-04 19:31:02,454 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=1768026.6666666667, ans=0.1 2023-10-04 19:31:04,272 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.3.encoder.layers.3.self_attn2.whiten, num_groups=1, num_channels=512, metric=10.92 vs. limit=22.5 2023-10-04 19:31:05,022 WARNING [train.py:1204] (2/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_280-161633-0 from training. Duration: 0.73 2023-10-04 19:31:05,071 WARNING [train.py:1204] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_308-161067-0_sp0.9 from training. Duration: 0.7333125 2023-10-04 19:31:06,262 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.714e+02 2.161e+02 2.469e+02 3.032e+02 6.207e+02, threshold=4.938e+02, percent-clipped=2.0 2023-10-04 19:31:07,742 WARNING [train.py:1204] (2/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_68-212801-0_sp0.9 from training. Duration: 0.811125 2023-10-04 19:31:09,005 WARNING [train.py:1204] (2/4) Exclude cut with ID _6789_210_1534_1_1527857937218_3118950_311-61262-0_sp0.9 from training. Duration: 0.72225 2023-10-04 19:31:10,487 WARNING [train.py:1204] (2/4) Exclude cut with ID _14632_210_9988_1_1530691279392_3923200_102-204477-0_sp1.1 from training. Duration: 0.8818125 2023-10-04 19:31:11,865 WARNING [train.py:1204] (2/4) Exclude cut with ID _65242_210_18628_1_1535369393717_3823149_290-117517-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 19:31:13,285 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_69630_210_7234_1_1535788670_910989_35-337140-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 19:31:13,299 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_5618_210_1874_1_1527311008_5851_1-214501-0 from training. Duration: 0.689 2023-10-04 19:31:16,019 WARNING [train.py:1204] (2/4) Exclude cut with ID _8064_210_5399_1_1528372299291_4282973_168-50337-0_sp0.9 from training. Duration: 0.9333125 2023-10-04 19:31:17,387 WARNING [train.py:1204] (2/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_185-14438-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 19:31:17,402 WARNING [train.py:1204] (2/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_61-327062-0 from training. Duration: 0.81 2023-10-04 19:31:17,407 WARNING [train.py:1204] (2/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_98-741-0 from training. Duration: 0.8 2023-10-04 19:31:20,533 WARNING [train.py:1204] (2/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_312-204798-0 from training. Duration: 0.93 2023-10-04 19:31:22,007 WARNING [train.py:1204] (2/4) Exclude cut with ID _14998_210_5219_1_1530705582080_7474970_787-294416-0_sp0.9 from training. Duration: 0.911125 2023-10-04 19:31:22,092 WARNING [train.py:1204] (2/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_140-181176-0_sp1.1 from training. Duration: 0.836375 2023-10-04 19:31:22,131 WARNING [train.py:1204] (2/4) Exclude cut with ID _14687_210_5854_1_1530703838259_3664889_115-239046-0_sp1.1 from training. Duration: 0.6090625 2023-10-04 19:31:23,462 WARNING [train.py:1204] (2/4) Exclude cut with ID _56423_210_5854_1_1534896061049_3165969_370-230614-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 19:31:23,506 WARNING [train.py:1204] (2/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_406-275649-0_sp0.9 from training. Duration: 0.4666875 2023-10-04 19:31:23,517 WARNING [train.py:1204] (2/4) Exclude cut with ID _15150_210_8341_1_1530752411803_7256384_22-138274-0_sp1.1 from training. Duration: 0.87275 2023-10-04 19:31:23,558 WARNING [train.py:1204] (2/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_125-125818-0 from training. Duration: 0.96 2023-10-04 19:31:27,879 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_4399_210_1874_1_1526705485_2400757_220-47974-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 19:31:29,310 WARNING [train.py:1204] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_308-161067-0_sp1.1 from training. Duration: 0.6 2023-10-04 19:31:31,260 WARNING [train.py:1204] (2/4) Exclude cut with ID _53489_210_6228_1_1534298357169_3835970_102-57645-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 19:31:33,330 WARNING [train.py:1204] (2/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_178-177582-0 from training. Duration: 0.74 2023-10-04 19:31:34,657 WARNING [train.py:1204] (2/4) Exclude cut with ID _14349_210_8341_1_1530615710818_3442470_170-336541-0_sp1.1 from training. Duration: 0.7818125 2023-10-04 19:31:34,718 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_521-214276-0_sp0.9 from training. Duration: 0.488875 2023-10-04 19:31:34,763 WARNING [train.py:1204] (2/4) Exclude cut with ID _66070_210_7640_1_1535441393096_7805310_489-292631-0 from training. Duration: 0.79 2023-10-04 19:31:39,034 WARNING [train.py:1204] (2/4) Exclude cut with ID _14069_210_5632_1_1530512692033_4803390_438-340023-0_sp1.1 from training. Duration: 0.936375 2023-10-04 19:31:40,352 WARNING [train.py:1204] (2/4) Exclude cut with ID _7852_210_5219_1_1528286471316_8139400_242-105336-0_sp1.1 from training. Duration: 0.6545625 2023-10-04 19:31:41,710 WARNING [train.py:1204] (2/4) Exclude cut with ID _3867_210_1883_1_1526641200575_4475830_272-215563-0 from training. Duration: 0.57 2023-10-04 19:31:41,717 WARNING [train.py:1204] (2/4) Exclude cut with ID _14368_210_4220_1_1530532714870_3595529_143-117421-0_sp0.9 from training. Duration: 0.6444375 2023-10-04 19:31:41,723 WARNING [train.py:1204] (2/4) Exclude cut with ID _9235_210_5219_1_1528891154253_8046120_30-326681-0_sp1.1 from training. Duration: 0.7545625 2023-10-04 19:31:43,129 WARNING [train.py:1204] (2/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_538-218448-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 19:31:47,837 WARNING [train.py:1204] (2/4) Exclude cut with ID _33640_210_2655_1_1533117605479_4855469_60-192970-0_sp1.1 from training. Duration: 0.963625 2023-10-04 19:31:47,854 WARNING [train.py:1204] (2/4) Exclude cut with ID _65585_210_12210_1_1535450451584_3764098_112-294301-0_sp1.1 from training. Duration: 0.8 2023-10-04 19:31:47,878 WARNING [train.py:1204] (2/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_643-37412-0_sp1.1 from training. Duration: 0.936375 2023-10-04 19:31:47,904 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_9302_210_2550_1_1528889962_4655416_638-301620-0_sp1.1 from training. Duration: 0.42725 2023-10-04 19:31:49,324 WARNING [train.py:1204] (2/4) Exclude cut with ID _3867_210_1883_1_1526641200575_4475830_272-215563-0_sp1.1 from training. Duration: 0.5181875 2023-10-04 19:31:51,975 INFO [train.py:1046] (2/4) Epoch 50, batch 4950, loss[loss=0.1612, simple_loss=0.2349, pruned_loss=0.04377, over 23798.00 frames. ], tot_loss[loss=0.1514, simple_loss=0.2318, pruned_loss=0.03551, over 4715345.75 frames. ], batch size: 164, lr: 2.02e-03, grad_scale: 16.0 2023-10-04 19:31:52,048 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_53679_210_4852_1_1534556726_4186141_358-154143-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 19:31:52,068 WARNING [train.py:1204] (2/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_841-341679-0_sp0.9 from training. Duration: 0.6444375 2023-10-04 19:31:57,134 WARNING [train.py:1204] (2/4) Exclude cut with ID _7160_210_1876_1_1527940923231_3703799_302-150728-0 from training. Duration: 0.78 2023-10-04 19:31:57,165 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_125-203105-0 from training. Duration: 0.8 2023-10-04 19:31:57,187 WARNING [train.py:1204] (2/4) Exclude cut with ID _5585_210_2667_1_1527418813324_8007340_89-332072-0_sp0.9 from training. Duration: 0.711125 2023-10-04 19:31:58,520 WARNING [train.py:1204] (2/4) Exclude cut with ID _3992_210_1747_1_1526644816031_3731060_36-147855-0 from training. Duration: 0.78 2023-10-04 19:31:58,545 WARNING [train.py:1204] (2/4) Exclude cut with ID _59256_210_5986_1_1535076085997_3589120_172-239045-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 19:31:58,554 WARNING [train.py:1204] (2/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_472-216663-0_sp1.1 from training. Duration: 0.836375 2023-10-04 19:32:00,482 WARNING [train.py:1204] (2/4) Exclude cut with ID _36512_210_6987_1_1533033143998_3126069_181-306500-0_sp1.1 from training. Duration: 0.863625 2023-10-04 19:32:00,506 WARNING [train.py:1204] (2/4) Exclude cut with ID _56524_210_9642_1_1534572011779_3619170_492-95008-0_sp1.1 from training. Duration: 0.97275 2023-10-04 19:32:03,373 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_33348_210_5632_1_1533086912_3324030_119-90326-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 19:32:03,425 WARNING [train.py:1204] (2/4) Exclude cut with ID _14368_210_4220_1_1530532714870_3595529_231-137892-0_sp1.1 from training. Duration: 0.9 2023-10-04 19:32:04,993 INFO [scaling.py:1118] (2/4) WithLoss: name=encoder.encoders.2.encoder.layers.0.self_attn_weights, loss-sum=0.000e+00 2023-10-04 19:32:06,093 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8453_210_4211_1_1528502940_8562730_596-306820-0_sp1.1 from training. Duration: 0.8818125 2023-10-04 19:32:07,486 WARNING [train.py:1204] (2/4) Exclude cut with ID _27052_210_5986_1_1532329197187_3585009_50-242750-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 19:32:08,933 WARNING [train.py:1204] (2/4) Exclude cut with ID _13913_210_9994_1_1530579615187_3383897_358-98435-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 19:32:08,946 WARNING [train.py:1204] (2/4) Exclude cut with ID _27013_210_1389_1_1532437173240_3252649_399-229778-0_sp1.1 from training. Duration: 0.963625 2023-10-04 19:32:13,035 WARNING [train.py:1204] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_185-174805-0_sp0.9 from training. Duration: 0.7555625 2023-10-04 19:32:14,727 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.ff3_skip_rate, batch_count=1768360.0, ans=0.0 2023-10-04 19:32:15,916 WARNING [train.py:1204] (2/4) Exclude cut with ID _65585_210_12210_1_1535450451584_3764098_196-221014-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 19:32:18,021 WARNING [train.py:1204] (2/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_202-293523-0_sp1.1 from training. Duration: 0.6545625 2023-10-04 19:32:19,392 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8275_210_4198_1_1528455320_3892571_213-127634-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 19:32:19,450 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_40896_210_15710_1_1533433956_7100802_921-231678-0_sp1.1 from training. Duration: 0.97275 2023-10-04 19:32:20,745 WARNING [train.py:1204] (2/4) Exclude cut with ID _38020_210_5986_1_1533112252259_7006089_493-102745-0_sp1.1 from training. Duration: 0.8909375 2023-10-04 19:32:23,461 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6846_210_6753_1_1527850767_4295158_2-315073-0 from training. Duration: 0.88 2023-10-04 19:32:23,531 WARNING [train.py:1204] (2/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_249-281349-0 from training. Duration: 0.85 2023-10-04 19:32:26,710 WARNING [train.py:1204] (2/4) Exclude cut with ID _18513_210_9206_1_1531618215223_3862620_314-83429-0_sp1.1 from training. Duration: 0.97275 2023-10-04 19:32:28,646 WARNING [train.py:1204] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_215-202973-0_sp0.9 from training. Duration: 0.97775 2023-10-04 19:32:28,665 WARNING [train.py:1204] (2/4) Exclude cut with ID _7719_210_3525_1_1528456153163_4296960_88-250259-0_sp1.1 from training. Duration: 0.763625 2023-10-04 19:32:30,013 WARNING [train.py:1204] (2/4) Exclude cut with ID _7606_210_4915_1_1528894253277_5080690_301-10732-0_sp1.1 from training. Duration: 0.736375 2023-10-04 19:32:30,023 WARNING [train.py:1204] (2/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_818-11907-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 19:32:31,342 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_5959_210_1794_1_1527400400_3445726_412-44052-0_sp1.1 from training. Duration: 0.7 2023-10-04 19:32:33,376 WARNING [train.py:1204] (2/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_6-279195-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 19:32:34,640 WARNING [train.py:1204] (2/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_252-216674-0_sp0.9 from training. Duration: 0.6 2023-10-04 19:32:36,003 WARNING [train.py:1204] (2/4) Exclude cut with ID _14368_210_4220_1_1530532714870_3595529_144-3287-0_sp1.1 from training. Duration: 0.6090625 2023-10-04 19:32:37,385 WARNING [train.py:1204] (2/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_342-3879-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 19:32:37,417 WARNING [train.py:1204] (2/4) Exclude cut with ID _33640_210_2655_1_1533117605479_4855469_584-65630-0_sp1.1 from training. Duration: 0.97275 2023-10-04 19:32:38,764 WARNING [train.py:1204] (2/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_37-189262-0 from training. Duration: 0.98 2023-10-04 19:32:38,791 WARNING [train.py:1204] (2/4) Exclude cut with ID _9389_210_1664_1_1529217337301_5289269_379-128794-0_sp0.9 from training. Duration: 0.9444375 2023-10-04 19:32:38,905 WARNING [train.py:1204] (2/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_119-207162-0_sp1.1 from training. Duration: 0.6454375 2023-10-04 19:32:43,097 WARNING [train.py:1204] (2/4) Exclude cut with ID _2941_210_2577_1_1526199673150_2489759_200-333643-0_sp1.1 from training. Duration: 0.936375 2023-10-04 19:32:44,475 WARNING [train.py:1204] (2/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_479-125611-0_sp1.1 from training. Duration: 0.87275 2023-10-04 19:32:46,232 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_3475_210_2028_1_1526193578_3622569_64-297072-0_sp1.1 from training. Duration: 0.82725 2023-10-04 19:32:46,277 WARNING [train.py:1204] (2/4) Exclude cut with ID _53565_210_10635_1_1534337218335_3820259_139-346291-0_sp1.1 from training. Duration: 0.97275 2023-10-04 19:32:47,547 WARNING [train.py:1204] (2/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_588-125165-0_sp1.1 from training. Duration: 0.7545625 2023-10-04 19:32:47,611 WARNING [train.py:1204] (2/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_938-205135-0_sp1.1 from training. Duration: 0.7909375 2023-10-04 19:32:50,304 WARNING [train.py:1204] (2/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_216-48707-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 19:32:50,357 WARNING [train.py:1204] (2/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_477-255782-0_sp1.1 from training. Duration: 0.7454375 2023-10-04 19:32:50,394 WARNING [train.py:1204] (2/4) Exclude cut with ID _16909_210_5724_1_1531292079912_4118919_583-92842-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 19:32:51,788 WARNING [train.py:1204] (2/4) Exclude cut with ID _60860_210_8254_1_1534896015875_3557169_245-256666-0 from training. Duration: 0.97 2023-10-04 19:32:56,448 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_45399_210_4852_1_1533624725_7579737_349-116173-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 19:32:56,669 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.attention_skip_rate, batch_count=1768560.0, ans=0.0 2023-10-04 19:33:03,042 WARNING [train.py:1204] (2/4) Exclude cut with ID _8699_210_2069_1_1528630346831_3960409_155-39766-0 from training. Duration: 0.78 2023-10-04 19:33:03,058 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_86-16881-0_sp1.1 from training. Duration: 0.52725 2023-10-04 19:33:07,073 INFO [train.py:1046] (2/4) Epoch 50, batch 5000, loss[loss=0.1543, simple_loss=0.2329, pruned_loss=0.03781, over 23552.00 frames. ], tot_loss[loss=0.1515, simple_loss=0.2321, pruned_loss=0.03539, over 4722506.93 frames. ], batch size: 120, lr: 2.02e-03, grad_scale: 16.0 2023-10-04 19:33:08,744 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_14056_210_4163_1_1530604483_7572827_191-159384-0_sp1.1 from training. Duration: 0.97275 2023-10-04 19:33:08,752 WARNING [train.py:1204] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_307-158462-0_sp1.1 from training. Duration: 0.62725 2023-10-04 19:33:10,104 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7544_210_6753_1_1528591327_1471426_33-346031-0 from training. Duration: 0.61 2023-10-04 19:33:11,471 WARNING [train.py:1204] (2/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_112-149213-0 from training. Duration: 0.95 2023-10-04 19:33:12,891 WARNING [train.py:1204] (2/4) Exclude cut with ID _55844_210_2346_1_1534471212569_7625707_925-191342-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 19:33:13,708 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.2.encoder.layers.0.whiten, num_groups=1, num_channels=384, metric=5.84 vs. limit=12.0 2023-10-04 19:33:14,267 WARNING [train.py:1204] (2/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_87-278686-0 from training. Duration: 0.77 2023-10-04 19:33:14,293 WARNING [train.py:1204] (2/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_415-78792-0_sp1.1 from training. Duration: 0.736375 2023-10-04 19:33:14,305 WARNING [train.py:1204] (2/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_107-332398-0_sp0.9 from training. Duration: 0.8666875 2023-10-04 19:33:16,262 WARNING [train.py:1204] (2/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_72-251255-0 from training. Duration: 0.94 2023-10-04 19:33:16,297 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_431-227273-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 19:33:17,630 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7544_210_6753_1_1528587000_691468_87-178599-0_sp0.9 from training. Duration: 0.9666875 2023-10-04 19:33:18,982 WARNING [train.py:1204] (2/4) Exclude cut with ID _13916_210_9994_1_1530788341126_3763149_60-191556-0 from training. Duration: 0.61 2023-10-04 19:33:18,984 WARNING [train.py:1204] (2/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_746-164782-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 19:33:19,025 WARNING [train.py:1204] (2/4) Exclude cut with ID _33640_210_2655_1_1533117605479_4855469_596-21066-0_sp1.1 from training. Duration: 0.92725 2023-10-04 19:33:19,141 WARNING [train.py:1204] (2/4) Exclude cut with ID _40402_210_18219_1_1533522324115_2953150_320-14078-0 from training. Duration: 0.95 2023-10-04 19:33:20,372 WARNING [train.py:1204] (2/4) Exclude cut with ID _29877_210_6228_1_1532397791106_5276149_266-62291-0 from training. Duration: 0.87 2023-10-04 19:33:21,781 WARNING [train.py:1204] (2/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_176-56120-0_sp0.9 from training. Duration: 0.92225 2023-10-04 19:33:21,825 WARNING [train.py:1204] (2/4) Exclude cut with ID _6275_210_2084_1_1528110062213_7669049_91-26919-0 from training. Duration: 0.97 2023-10-04 19:33:21,832 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_224-322394-0_sp0.9 from training. Duration: 0.7333125 2023-10-04 19:33:23,173 WARNING [train.py:1204] (2/4) Exclude cut with ID _44982_210_1511_1_1534312820108_3726029_586-297059-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 19:33:24,512 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_196-289637-0_sp1.1 from training. Duration: 0.6818125 2023-10-04 19:33:24,513 WARNING [train.py:1204] (2/4) Exclude cut with ID _18513_210_9206_1_1531618215223_3862620_402-5957-0 from training. Duration: 0.99 2023-10-04 19:33:24,520 WARNING [train.py:1204] (2/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_81-300212-0 from training. Duration: 0.6 2023-10-04 19:33:25,950 WARNING [train.py:1204] (2/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_166-128941-0 from training. Duration: 0.54 2023-10-04 19:33:25,963 WARNING [train.py:1204] (2/4) Exclude cut with ID _58059_210_5854_1_1534901471149_3164808_57-309660-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 19:33:27,901 WARNING [train.py:1204] (2/4) Exclude cut with ID _60621_210_17071_1_1535013053530_3671739_73-155814-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 19:33:29,315 WARNING [train.py:1204] (2/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_726-270780-0 from training. Duration: 0.86 2023-10-04 19:33:29,336 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8853_210_3273_1_1528633899_818968_51-187685-0_sp1.1 from training. Duration: 0.62725 2023-10-04 19:33:31,288 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_315-212794-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 19:33:32,605 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_53373_210_5399_1_1534229471_4715695_314-122795-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 19:33:34,511 WARNING [train.py:1204] (2/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_187-221566-0_sp0.9 from training. Duration: 0.47775 2023-10-04 19:33:35,955 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_12868_210_6753_1_1530702479_3610564_133-564-0 from training. Duration: 0.79 2023-10-04 19:33:37,191 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.743e+02 2.194e+02 2.487e+02 3.099e+02 5.334e+02, threshold=4.974e+02, percent-clipped=1.0 2023-10-04 19:33:37,277 WARNING [train.py:1204] (2/4) Exclude cut with ID _55153_210_2253_1_1534384786677_3162096_255-228344-0_sp1.1 from training. Duration: 0.936375 2023-10-04 19:33:37,974 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.3.encoder.layers.0.self_attn1.whiten, num_groups=1, num_channels=512, metric=14.42 vs. limit=22.5 2023-10-04 19:33:38,712 WARNING [train.py:1204] (2/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_383-165234-0_sp1.1 from training. Duration: 0.8545625 2023-10-04 19:33:42,782 WARNING [train.py:1204] (2/4) Exclude cut with ID IC0677W0056-334889 from training. Duration: 0.768 2023-10-04 19:33:44,298 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_14953_210_2151_1_1530701710_5616165_710-165400-0_sp0.9 from training. Duration: 0.9666875 2023-10-04 19:33:46,333 WARNING [train.py:1204] (2/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_355-236205-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 19:33:46,335 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_34450_210_9547_1_1533099443_4109050_503-246893-0_sp1.1 from training. Duration: 0.963625 2023-10-04 19:33:49,182 WARNING [train.py:1204] (2/4) Exclude cut with ID _43552_210_8913_1_1533726031059_3523429_25-120161-0 from training. Duration: 0.98 2023-10-04 19:33:50,403 WARNING [train.py:1204] (2/4) Exclude cut with ID _66112_210_10635_1_1535590236305_3801484_205-311930-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 19:33:50,432 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_48049_210_5632_1_1533864553_3519937_185-281218-0_sp1.1 from training. Duration: 0.92725 2023-10-04 19:33:51,723 WARNING [train.py:1204] (2/4) Exclude cut with ID _48782_210_12658_1_1534467701544_2300422_191-332415-0_sp1.1 from training. Duration: 0.97275 2023-10-04 19:33:53,187 WARNING [train.py:1204] (2/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_907-151398-0_sp1.1 from training. Duration: 0.336375 2023-10-04 19:33:53,230 WARNING [train.py:1204] (2/4) Exclude cut with ID _67499_210_17282_1_1535509805220_3692430_20-179346-0_sp1.1 from training. Duration: 0.8818125 2023-10-04 19:33:57,430 WARNING [train.py:1204] (2/4) Exclude cut with ID _14998_210_5219_1_1530705582080_7474970_441-91466-0_sp1.1 from training. Duration: 0.8818125 2023-10-04 19:33:57,511 WARNING [train.py:1204] (2/4) Exclude cut with ID _46681_210_9642_1_1533893005571_7603359_786-164007-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 19:34:03,643 WARNING [train.py:1204] (2/4) Exclude cut with ID _14970_210_15709_1_1530773543493_1715730_187-20259-0 from training. Duration: 0.89 2023-10-04 19:34:07,002 WARNING [train.py:1204] (2/4) Exclude cut with ID _12734_210_8341_1_1530183614296_3891929_383-339075-0_sp1.1 from training. Duration: 0.963625 2023-10-04 19:34:17,230 WARNING [train.py:1204] (2/4) Exclude cut with ID _37086_210_9038_1_1532999540891_7021250_834-322225-0_sp1.1 from training. Duration: 0.92725 2023-10-04 19:34:18,648 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_10987_210_4163_1_1529760555_4123902_56-290559-0_sp1.1 from training. Duration: 0.963625 2023-10-04 19:34:18,655 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_92-246270-0_sp0.9 from training. Duration: 0.9444375 2023-10-04 19:34:18,683 WARNING [train.py:1204] (2/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_56-13561-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 19:34:19,951 WARNING [train.py:1204] (2/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_393-57888-0_sp1.1 from training. Duration: 0.5818125 2023-10-04 19:34:19,982 WARNING [train.py:1204] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_168-32906-0_sp1.1 from training. Duration: 0.62725 2023-10-04 19:34:20,026 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_51225_210_3601_1_1534400634_2327383_127-207553-0_sp1.1 from training. Duration: 0.963625 2023-10-04 19:34:21,295 INFO [train.py:1046] (2/4) Epoch 50, batch 5050, loss[loss=0.1399, simple_loss=0.2215, pruned_loss=0.02914, over 24466.00 frames. ], tot_loss[loss=0.1521, simple_loss=0.233, pruned_loss=0.03564, over 4728307.18 frames. ], batch size: 58, lr: 2.02e-03, grad_scale: 8.0 2023-10-04 19:34:25,459 WARNING [train.py:1204] (2/4) Exclude cut with ID _58583_210_5236_1_1534726835266_3574249_494-203521-0_sp1.1 from training. Duration: 0.963625 2023-10-04 19:34:25,480 WARNING [train.py:1204] (2/4) Exclude cut with ID _49376_210_5986_1_1533985278732_3370740_354-344695-0 from training. Duration: 0.99 2023-10-04 19:34:26,897 WARNING [train.py:1204] (2/4) Exclude cut with ID _8021_210_7994_1_1528376451358_3653069_179-45206-0_sp1.1 from training. Duration: 0.7818125 2023-10-04 19:34:28,340 WARNING [train.py:1204] (2/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_742-154017-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 19:34:28,426 WARNING [train.py:1204] (2/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_222-198127-0_sp1.1 from training. Duration: 0.763625 2023-10-04 19:34:29,777 WARNING [train.py:1204] (2/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_235-307337-0 from training. Duration: 0.94 2023-10-04 19:34:31,123 WARNING [train.py:1204] (2/4) Exclude cut with ID _8393_210_6702_1_1528505860249_3457328_453-176500-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 19:34:33,048 WARNING [train.py:1204] (2/4) Exclude cut with ID _53565_210_10635_1_1534337218335_3820259_102-179528-0_sp1.1 from training. Duration: 0.97275 2023-10-04 19:34:35,689 WARNING [train.py:1204] (2/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_43-206757-0_sp1.1 from training. Duration: 0.6818125 2023-10-04 19:34:37,625 WARNING [train.py:1204] (2/4) Exclude cut with ID _8394_210_6702_1_1528590504316_3540189_335-274780-0_sp1.1 from training. Duration: 0.7090625 2023-10-04 19:34:37,671 WARNING [train.py:1204] (2/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_128-296581-0_sp1.1 from training. Duration: 0.67275 2023-10-04 19:34:46,412 WARNING [train.py:1204] (2/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_111-112362-0 from training. Duration: 0.91 2023-10-04 19:34:47,709 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_5522_210_3273_1_1527249606_3909096_366-172508-0_sp1.1 from training. Duration: 0.6 2023-10-04 19:34:47,785 WARNING [train.py:1204] (2/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_47-167373-0_sp1.1 from training. Duration: 0.836375 2023-10-04 19:34:49,224 WARNING [train.py:1204] (2/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_41-234353-0 from training. Duration: 0.68 2023-10-04 19:34:49,250 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6291_210_3273_1_1527683649_1078498_82-221196-0_sp0.9 from training. Duration: 0.8444375 2023-10-04 19:34:50,609 WARNING [train.py:1204] (2/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_47-4814-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 19:34:50,637 WARNING [train.py:1204] (2/4) Exclude cut with ID _43541_210_10626_1_1533796233418_3635440_40-247993-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 19:34:50,683 WARNING [train.py:1204] (2/4) Exclude cut with ID _21989_210_5854_1_1531899214931_3271360_133-44759-0_sp0.9 from training. Duration: 0.8555625 2023-10-04 19:34:50,686 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_94-195801-0 from training. Duration: 0.94 2023-10-04 19:34:53,993 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_141-89113-0 from training. Duration: 0.86 2023-10-04 19:34:54,083 WARNING [train.py:1204] (2/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_683-284578-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 19:34:56,133 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.4.encoder.layers.0.self_attn1.whiten, num_groups=1, num_channels=384, metric=14.79 vs. limit=22.5 2023-10-04 19:34:56,904 WARNING [train.py:1204] (2/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_6-214815-0_sp1.1 from training. Duration: 0.77275 2023-10-04 19:34:59,740 WARNING [train.py:1204] (2/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_556-212082-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 19:34:59,782 WARNING [train.py:1204] (2/4) Exclude cut with ID _44902_210_16187_1_1533888006062_3792078_149-67212-0 from training. Duration: 0.87 2023-10-04 19:35:01,149 WARNING [train.py:1204] (2/4) Exclude cut with ID _61148_210_6897_1_1535094022726_4217610_333-159440-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 19:35:04,528 WARNING [train.py:1204] (2/4) Exclude cut with ID _3120_210_3525_1_1526036143112_4072980_404-249994-0 from training. Duration: 0.99 2023-10-04 19:35:04,814 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.ff2_skip_rate, batch_count=1769160.0, ans=0.0 2023-10-04 19:35:05,800 WARNING [train.py:1204] (2/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_234-260682-0_sp0.9 from training. Duration: 0.9333125 2023-10-04 19:35:05,833 WARNING [train.py:1204] (2/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_374-138971-0_sp1.1 from training. Duration: 0.8545625 2023-10-04 19:35:07,222 WARNING [train.py:1204] (2/4) Exclude cut with ID _53713_210_24116_1_1534314487762_7416751_743-302085-0_sp1.1 from training. Duration: 0.92725 2023-10-04 19:35:07,299 WARNING [train.py:1204] (2/4) Exclude cut with ID _18516_210_9206_1_1531704653206_3692406_198-334870-0_sp1.1 from training. Duration: 0.836375 2023-10-04 19:35:09,275 WARNING [train.py:1204] (2/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_395-73570-0_sp1.1 from training. Duration: 0.9 2023-10-04 19:35:12,366 WARNING [train.py:1204] (2/4) Exclude cut with ID _46683_210_9642_1_1533730913492_4591116_421-19547-0_sp1.1 from training. Duration: 0.936375 2023-10-04 19:35:13,699 WARNING [train.py:1204] (2/4) Exclude cut with ID _19896_210_1883_1_1531709022662_6649569_714-8668-0_sp1.1 from training. Duration: 0.963625 2023-10-04 19:35:13,728 WARNING [train.py:1204] (2/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_49-101072-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 19:35:13,734 WARNING [train.py:1204] (2/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_220-191080-0_sp1.1 from training. Duration: 0.8909375 2023-10-04 19:35:15,054 WARNING [train.py:1204] (2/4) Exclude cut with ID _3465_210_3525_1_1526175667060_3574049_62-335552-0 from training. Duration: 0.87 2023-10-04 19:35:16,408 WARNING [train.py:1204] (2/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_111-112362-0_sp1.1 from training. Duration: 0.82725 2023-10-04 19:35:17,833 WARNING [train.py:1204] (2/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_318-59097-0_sp0.9 from training. Duration: 0.8444375 2023-10-04 19:35:20,685 WARNING [train.py:1204] (2/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_98-255710-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 19:35:20,691 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9042W0003-515609 from training. Duration: 0.896 2023-10-04 19:35:20,706 WARNING [train.py:1204] (2/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_158-250116-0_sp0.9 from training. Duration: 0.688875 2023-10-04 19:35:22,062 WARNING [train.py:1204] (2/4) Exclude cut with ID _26750_210_5665_1_1532226678442_4063230_7-346885-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 19:35:22,099 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_37839_210_7234_1_1533542462_3689303_357-227078-0_sp1.1 from training. Duration: 0.963625 2023-10-04 19:35:22,129 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9052W0119-520904 from training. Duration: 0.896 2023-10-04 19:35:25,477 WARNING [train.py:1204] (2/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_249-281349-0_sp1.1 from training. Duration: 0.77275 2023-10-04 19:35:25,489 WARNING [train.py:1204] (2/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_147-293311-0 from training. Duration: 0.94 2023-10-04 19:35:25,490 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_158-21166-0_sp1.1 from training. Duration: 0.963625 2023-10-04 19:35:29,660 WARNING [train.py:1204] (2/4) Exclude cut with ID _14100_210_8341_1_1530529320613_3678069_207-332194-0_sp1.1 from training. Duration: 0.92725 2023-10-04 19:35:31,027 WARNING [train.py:1204] (2/4) Exclude cut with ID _9389_210_1664_1_1529217337301_5289269_324-211031-0_sp1.1 from training. Duration: 0.963625 2023-10-04 19:35:31,049 WARNING [train.py:1204] (2/4) Exclude cut with ID _14603_210_4220_1_1530615688441_3097319_133-158308-0 from training. Duration: 0.84 2023-10-04 19:35:32,404 WARNING [train.py:1204] (2/4) Exclude cut with ID _13252_210_1925_1_1530671372047_3336909_166-314607-0 from training. Duration: 0.54 2023-10-04 19:35:34,232 WARNING [train.py:1204] (2/4) Exclude cut with ID _32704_210_8954_1_1533017488840_6747370_649-79538-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 19:35:34,248 WARNING [train.py:1204] (2/4) Exclude cut with ID _28394_210_8913_1_1532430049518_3650118_343-115667-0_sp1.1 from training. Duration: 0.97275 2023-10-04 19:35:34,287 WARNING [train.py:1204] (2/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_87-278686-0_sp0.9 from training. Duration: 0.8555625 2023-10-04 19:35:35,536 INFO [train.py:1046] (2/4) Epoch 50, batch 5100, loss[loss=0.158, simple_loss=0.2311, pruned_loss=0.04244, over 23865.00 frames. ], tot_loss[loss=0.1518, simple_loss=0.233, pruned_loss=0.03531, over 4736274.13 frames. ], batch size: 195, lr: 2.02e-03, grad_scale: 8.0 2023-10-04 19:35:37,636 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.3.encoder.layers.3.conv_module1.whiten, num_groups=1, num_channels=512, metric=4.64 vs. limit=15.0 2023-10-04 19:35:38,300 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9109W0003-549749 from training. Duration: 0.768 2023-10-04 19:35:41,546 WARNING [train.py:1204] (2/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_126-29104-0_sp1.1 from training. Duration: 0.77275 2023-10-04 19:35:44,771 WARNING [train.py:1204] (2/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_325-113798-0 from training. Duration: 0.85 2023-10-04 19:35:44,826 WARNING [train.py:1204] (2/4) Exclude cut with ID _6694_210_4599_1_1527852637490_3965529_116-109398-0 from training. Duration: 0.73 2023-10-04 19:35:44,874 WARNING [train.py:1204] (2/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_46-57119-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 19:35:47,584 WARNING [train.py:1204] (2/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_546-119312-0_sp1.1 from training. Duration: 0.9 2023-10-04 19:35:50,273 WARNING [train.py:1204] (2/4) Exclude cut with ID _60057_210_10640_1_1534903065793_3333150_468-306939-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 19:35:50,324 WARNING [train.py:1204] (2/4) Exclude cut with ID _15150_210_8341_1_1530752411803_7256384_22-138274-0 from training. Duration: 0.96 2023-10-04 19:35:50,345 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_289-180904-0 from training. Duration: 0.76 2023-10-04 19:35:54,860 WARNING [train.py:1204] (2/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_64-133721-0_sp1.1 from training. Duration: 0.92725 2023-10-04 19:35:54,903 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_195-257512-0_sp1.1 from training. Duration: 0.6090625 2023-10-04 19:35:57,906 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.bypass.skip_rate, batch_count=1769360.0, ans=0.09899494936611666 2023-10-04 19:35:59,190 WARNING [train.py:1204] (2/4) Exclude cut with ID _66873_210_5517_1_1535454136506_4293370_704-101963-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 19:36:00,577 WARNING [train.py:1204] (2/4) Exclude cut with ID _4366_210_1388_1_1526792053633_3613819_231-211402-0 from training. Duration: 0.94 2023-10-04 19:36:01,951 WARNING [train.py:1204] (2/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_679-190385-0_sp1.1 from training. Duration: 0.97275 2023-10-04 19:36:04,713 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.702e+02 2.080e+02 2.334e+02 2.937e+02 4.893e+02, threshold=4.667e+02, percent-clipped=0.0 2023-10-04 19:36:04,787 WARNING [train.py:1204] (2/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_298-81459-0_sp1.1 from training. Duration: 0.963625 2023-10-04 19:36:04,804 WARNING [train.py:1204] (2/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_658-282610-0_sp0.9 from training. Duration: 0.588875 2023-10-04 19:36:04,975 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.conv_module2.balancer1.max_abs, batch_count=1769426.6666666667, ans=10.0 2023-10-04 19:36:08,132 WARNING [train.py:1204] (2/4) Exclude cut with ID _57572_210_5517_1_1534921219624_3809484_196-283734-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 19:36:08,202 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_53071_210_16554_1_1534490405_2954066_294-198554-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 19:36:08,207 WARNING [train.py:1204] (2/4) Exclude cut with ID _4866_210_2577_1_1527404412284_4364659_352-157930-0 from training. Duration: 0.81 2023-10-04 19:36:10,674 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.4.encoder.layers.2.feed_forward2.out_whiten, num_groups=1, num_channels=384, metric=7.73 vs. limit=15.0 2023-10-04 19:36:11,448 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9231W0113-610139 from training. Duration: 0.896 2023-10-04 19:36:11,732 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=1769426.6666666667, ans=0.1 2023-10-04 19:36:12,802 WARNING [train.py:1204] (2/4) Exclude cut with ID _24271_210_12658_1_1531987270022_7190519_638-171906-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 19:36:12,852 WARNING [train.py:1204] (2/4) Exclude cut with ID _14170_210_5219_1_1530525949868_4469650_272-249985-0 from training. Duration: 0.67 2023-10-04 19:36:12,861 WARNING [train.py:1204] (2/4) Exclude cut with ID _64086_210_7994_1_1535270417019_7435949_449-31567-0 from training. Duration: 0.97 2023-10-04 19:36:16,397 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.1.encoder.layers.1.conv_module1.whiten, num_groups=1, num_channels=256, metric=9.79 vs. limit=15.0 2023-10-04 19:36:16,947 WARNING [train.py:1204] (2/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_109-282568-0_sp1.1 from training. Duration: 0.97275 2023-10-04 19:36:17,850 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.feed_forward3.hidden_balancer.prob, batch_count=1769426.6666666667, ans=0.125 2023-10-04 19:36:24,608 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_121-45934-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 19:36:27,938 WARNING [train.py:1204] (2/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_314-126819-0 from training. Duration: 0.5 2023-10-04 19:36:27,970 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9309W0192-640319 from training. Duration: 0.512 2023-10-04 19:36:27,977 WARNING [train.py:1204] (2/4) Exclude cut with ID IC9310W0003-640649 from training. Duration: 0.896 2023-10-04 19:36:29,395 WARNING [train.py:1204] (2/4) Exclude cut with ID _7160_210_1876_1_1527940923231_3703799_120-150943-0 from training. Duration: 0.81 2023-10-04 19:36:29,397 WARNING [train.py:1204] (2/4) Exclude cut with ID _40380_210_9059_1_1533380594759_4187380_6-33508-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 19:36:33,744 WARNING [train.py:1204] (2/4) Exclude cut with ID _59202_210_26984_1_1534816810612_3549174_137-318710-0 from training. Duration: 0.89 2023-10-04 19:36:37,151 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_435-313305-0 from training. Duration: 0.71 2023-10-04 19:36:39,903 WARNING [train.py:1204] (2/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_91-212486-0_sp1.1 from training. Duration: 0.6181875 2023-10-04 19:36:41,254 WARNING [train.py:1204] (2/4) Exclude cut with ID _14603_210_4220_1_1530615688441_3097319_133-158308-0_sp1.1 from training. Duration: 0.763625 2023-10-04 19:36:44,498 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_99-251314-0 from training. Duration: 0.87 2023-10-04 19:36:46,067 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_365-217119-0_sp1.1 from training. Duration: 0.57275 2023-10-04 19:36:46,111 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_3475_210_2028_1_1526193578_3622569_377-166133-0 from training. Duration: 0.86 2023-10-04 19:36:50,138 INFO [train.py:1046] (2/4) Epoch 50, batch 5150, loss[loss=0.1677, simple_loss=0.2511, pruned_loss=0.04214, over 23613.00 frames. ], tot_loss[loss=0.1524, simple_loss=0.2338, pruned_loss=0.03551, over 4740862.43 frames. ], batch size: 85, lr: 2.02e-03, grad_scale: 8.0 2023-10-04 19:36:51,985 WARNING [train.py:1204] (2/4) Exclude cut with ID _13916_210_9994_1_1530788341126_3763149_116-21034-0_sp1.1 from training. Duration: 0.87275 2023-10-04 19:36:52,003 WARNING [train.py:1204] (2/4) Exclude cut with ID _64220_210_9527_1_1535250151772_4875259_142-216831-0_sp1.1 from training. Duration: 0.936375 2023-10-04 19:36:52,003 WARNING [train.py:1204] (2/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_490-207861-0_sp1.1 from training. Duration: 0.8909375 2023-10-04 19:36:53,316 WARNING [train.py:1204] (2/4) Exclude cut with ID _41314_210_12651_1_1533459476543_3766460_87-272068-0_sp0.9 from training. Duration: 0.92225 2023-10-04 19:36:53,337 WARNING [train.py:1204] (2/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_18-46756-0_sp1.1 from training. Duration: 0.7181875 2023-10-04 19:36:53,404 WARNING [train.py:1204] (2/4) Exclude cut with ID _39411_210_5986_1_1533538825030_3343649_98-311335-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 19:36:54,837 WARNING [train.py:1204] (2/4) Exclude cut with ID _21878_210_3525_1_1531738772002_7264330_113-18163-0 from training. Duration: 0.89 2023-10-04 19:36:54,839 WARNING [train.py:1204] (2/4) Exclude cut with ID _6653_210_2629_1_1527919172441_6732679_1103-56445-0 from training. Duration: 0.73 2023-10-04 19:36:54,879 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_3474_210_2028_1_1526188513_4825246_145-309131-0 from training. Duration: 0.74 2023-10-04 19:36:54,904 WARNING [train.py:1204] (2/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_120-211630-0_sp1.1 from training. Duration: 0.836375 2023-10-04 19:36:54,915 WARNING [train.py:1204] (2/4) Exclude cut with ID _8030_210_7497_1_1528444886781_3531089_32-31837-0 from training. Duration: 0.97 2023-10-04 19:36:56,333 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_19949_210_2161_1_1531706337_928079_8-160956-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 19:36:57,666 WARNING [train.py:1204] (2/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_321-215742-0_sp1.1 from training. Duration: 0.4545625 2023-10-04 19:37:00,812 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_583-124650-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 19:37:00,912 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_30434_210_13049_1_1532519384_981746_164-182755-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 19:37:05,114 WARNING [train.py:1204] (2/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_339-104791-0_sp1.1 from training. Duration: 0.4454375 2023-10-04 19:37:05,134 WARNING [train.py:1204] (2/4) Exclude cut with ID _15026_210_8750_1_1530777602316_3291850_211-195043-0 from training. Duration: 0.8 2023-10-04 19:37:06,542 WARNING [train.py:1204] (2/4) Exclude cut with ID _29877_210_6228_1_1532397791106_5276149_487-182647-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 19:37:06,591 WARNING [train.py:1204] (2/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_127-566-0_sp0.9 from training. Duration: 0.9333125 2023-10-04 19:37:09,700 WARNING [train.py:1204] (2/4) Exclude cut with ID _8263_210_1664_1_1528439452891_970220_68-181805-0_sp0.9 from training. Duration: 0.711125 2023-10-04 19:37:09,701 WARNING [train.py:1204] (2/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_504-72513-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 19:37:09,716 WARNING [train.py:1204] (2/4) Exclude cut with ID _64639_210_5816_1_1535367619437_3528000_324-265233-0_sp1.1 from training. Duration: 0.963625 2023-10-04 19:37:09,773 WARNING [train.py:1204] (2/4) Exclude cut with ID _6456_210_1534_1_1527681634212_3181049_215-309359-0_sp0.9 from training. Duration: 0.97775 2023-10-04 19:37:09,777 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_115-233929-0_sp1.1 from training. Duration: 0.6545625 2023-10-04 19:37:11,119 WARNING [train.py:1204] (2/4) Exclude cut with ID _14087_210_2253_1_1530529224512_3014029_219-224445-0 from training. Duration: 0.84 2023-10-04 19:37:12,480 WARNING [train.py:1204] (2/4) Exclude cut with ID _14867_210_1968_1_1530684179890_3846969_413-29292-0_sp1.1 from training. Duration: 0.8181875 2023-10-04 19:37:12,529 WARNING [train.py:1204] (2/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_1012-279473-0_sp0.9 from training. Duration: 0.7444375 2023-10-04 19:37:14,542 WARNING [train.py:1204] (2/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_605-133395-0_sp0.9 from training. Duration: 0.7333125 2023-10-04 19:37:16,062 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_89-35748-0 from training. Duration: 0.79 2023-10-04 19:37:17,420 WARNING [train.py:1204] (2/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_476-272757-0_sp1.1 from training. Duration: 0.8454375 2023-10-04 19:37:21,961 WARNING [train.py:1204] (2/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_449-15508-0_sp0.9 from training. Duration: 0.911125 2023-10-04 19:37:24,591 WARNING [train.py:1204] (2/4) Exclude cut with ID _29345_210_4381_1_1532689168263_3860169_198-174328-0 from training. Duration: 0.91 2023-10-04 19:37:28,721 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_15517_210_6753_1_1530952293_1664795_125-158157-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 19:37:32,843 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.4.encoder.layers.1.nonlin_attention.whiten1, num_groups=1, num_channels=288, metric=3.85 vs. limit=10.0 2023-10-04 19:37:36,229 WARNING [train.py:1204] (2/4) Exclude cut with ID _25565_210_12392_1_1532423009382_3952098_420-113180-0_sp1.1 from training. Duration: 0.963625 2023-10-04 19:37:37,628 WARNING [train.py:1204] (2/4) Exclude cut with ID _51471_210_1889_1_1534399391390_3094250_236-146773-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 19:37:40,997 WARNING [train.py:1204] (2/4) Exclude cut with ID _30039_210_13270_1_1532484612900_2725150_175-35012-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 19:37:41,041 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_23850_210_4238_1_1532232597_1632433_99-275843-0_sp1.1 from training. Duration: 0.936375 2023-10-04 19:37:43,772 WARNING [train.py:1204] (2/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_658-282610-0 from training. Duration: 0.53 2023-10-04 19:37:47,064 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_37818_210_4238_1_1533620910_4720083_234-65934-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 19:37:48,439 WARNING [train.py:1204] (2/4) Exclude cut with ID _3067_210_2667_1_1526212974640_4503168_459-173867-0_sp1.1 from training. Duration: 0.8 2023-10-04 19:37:48,472 WARNING [train.py:1204] (2/4) Exclude cut with ID _8696_210_1876_1_1528545632440_3345058_209-54069-0_sp0.9 from training. Duration: 0.7444375 2023-10-04 19:37:53,032 WARNING [train.py:1204] (2/4) Exclude cut with ID _62574_210_2655_1_1535180407058_3933249_420-236465-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 19:37:53,123 WARNING [train.py:1204] (2/4) Exclude cut with ID _46681_210_9642_1_1533893005571_7603359_333-4042-0_sp1.1 from training. Duration: 0.936375 2023-10-04 19:37:53,395 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.bypass_mid.scale_min, batch_count=1769893.3333333333, ans=0.2 2023-10-04 19:37:54,461 WARNING [train.py:1204] (2/4) Exclude cut with ID _3541_210_2477_1_1526475408709_3263973_377-296839-0 from training. Duration: 0.55 2023-10-04 19:37:58,719 WARNING [train.py:1204] (2/4) Exclude cut with ID _43997_210_4525_1_1533538632555_5189329_426-92599-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 19:37:58,810 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_43-71989-0_sp1.1 from training. Duration: 0.5181875 2023-10-04 19:38:00,208 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_2009_210_2071_1_1525481134_4588937_140-345530-0_sp1.1 from training. Duration: 0.963625 2023-10-04 19:38:00,223 WARNING [train.py:1204] (2/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_138-332773-0_sp0.9 from training. Duration: 0.9 2023-10-04 19:38:02,028 WARNING [train.py:1204] (2/4) Exclude cut with ID _13942_210_9988_1_1530604906036_3733360_499-55676-0_sp0.9 from training. Duration: 0.688875 2023-10-04 19:38:02,054 WARNING [train.py:1204] (2/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_259-34038-0_sp1.1 from training. Duration: 0.67275 2023-10-04 19:38:02,063 WARNING [train.py:1204] (2/4) Exclude cut with ID _3120_210_3525_1_1526036143112_4072980_404-249994-0_sp1.1 from training. Duration: 0.9 2023-10-04 19:38:03,343 WARNING [train.py:1204] (2/4) Exclude cut with ID _40402_210_18219_1_1533522324115_2953150_222-108810-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 19:38:04,696 INFO [train.py:1046] (2/4) Epoch 50, batch 5200, loss[loss=0.1455, simple_loss=0.2229, pruned_loss=0.03403, over 23132.00 frames. ], tot_loss[loss=0.1529, simple_loss=0.234, pruned_loss=0.03589, over 4731412.28 frames. ], batch size: 105, lr: 2.02e-03, grad_scale: 16.0 2023-10-04 19:38:06,109 WARNING [train.py:1204] (2/4) Exclude cut with ID _22083_210_8254_1_1531702862439_3678970_49-234546-0_sp0.9 from training. Duration: 0.92225 2023-10-04 19:38:08,774 WARNING [train.py:1204] (2/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_199-295611-0_sp0.9 from training. Duration: 0.611125 2023-10-04 19:38:12,011 WARNING [train.py:1204] (2/4) Exclude cut with ID _43552_210_8913_1_1533726031059_3523429_43-192945-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 19:38:13,537 WARNING [train.py:1204] (2/4) Exclude cut with ID _14224_210_4381_1_1530529062037_4264010_364-22817-0 from training. Duration: 0.71 2023-10-04 19:38:15,517 WARNING [train.py:1204] (2/4) Exclude cut with ID _14922_210_6228_1_1530707435273_3693310_388-129552-0_sp1.1 from training. Duration: 0.87275 2023-10-04 19:38:15,580 WARNING [train.py:1204] (2/4) Exclude cut with ID _6537_210_1883_1_1527987559710_3597129_190-280227-0_sp1.1 from training. Duration: 0.97275 2023-10-04 19:38:16,900 WARNING [train.py:1204] (2/4) Exclude cut with ID _55844_210_2346_1_1534471212569_7625707_398-56897-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 19:38:18,314 WARNING [train.py:1204] (2/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_48-182155-0_sp1.1 from training. Duration: 0.7909375 2023-10-04 19:38:18,333 WARNING [train.py:1204] (2/4) Exclude cut with ID _29529_210_7234_1_1532393961455_3977460_582-18084-0_sp1.1 from training. Duration: 0.97275 2023-10-04 19:38:21,618 WARNING [train.py:1204] (2/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_912-282412-0 from training. Duration: 0.49 2023-10-04 19:38:22,167 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.5.encoder.layers.0.nonlin_attention.whiten1, num_groups=1, num_channels=192, metric=4.57 vs. limit=10.0 2023-10-04 19:38:22,187 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.4.encoder.layers.2.self_attn2.whiten, num_groups=1, num_channels=384, metric=11.85 vs. limit=22.5 2023-10-04 19:38:24,106 WARNING [train.py:1204] (2/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_332-256168-0_sp0.9 from training. Duration: 0.8333125 2023-10-04 19:38:24,164 WARNING [train.py:1204] (2/4) Exclude cut with ID _64220_210_9527_1_1535250151772_4875259_55-250624-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 19:38:24,439 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=1770026.6666666667, ans=0.1 2023-10-04 19:38:28,336 WARNING [train.py:1204] (2/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_264-108685-0 from training. Duration: 0.55 2023-10-04 19:38:29,805 WARNING [train.py:1204] (2/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_613-232856-0_sp0.9 from training. Duration: 0.6 2023-10-04 19:38:31,667 WARNING [train.py:1204] (2/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_68-212801-0_sp1.1 from training. Duration: 0.663625 2023-10-04 19:38:31,720 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_6290_210_3273_1_1527595224_1282787_115-87095-0 from training. Duration: 0.55 2023-10-04 19:38:31,918 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.ff3_skip_rate, batch_count=1770026.6666666667, ans=0.0 2023-10-04 19:38:33,017 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_3080_210_2071_1_1526086102_4365166_312-8726-0 from training. Duration: 0.91 2023-10-04 19:38:34,485 WARNING [train.py:1204] (2/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_105-63309-0 from training. Duration: 0.98 2023-10-04 19:38:35,661 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.843e+02 2.266e+02 2.653e+02 3.411e+02 5.060e+02, threshold=5.305e+02, percent-clipped=4.0 2023-10-04 19:38:37,103 WARNING [train.py:1204] (2/4) Exclude cut with ID _2941_210_2577_1_1526199673150_2489759_201-160779-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 19:38:37,106 WARNING [train.py:1204] (2/4) Exclude cut with ID ID0406W0226-885434 from training. Duration: 0.997 2023-10-04 19:38:37,113 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_17292_210_6753_1_1531125388_2943734_353-328201-0_sp1.1 from training. Duration: 0.97275 2023-10-04 19:38:37,235 WARNING [train.py:1204] (2/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_572-96029-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 19:38:38,452 WARNING [train.py:1204] (2/4) Exclude cut with ID _14970_210_15709_1_1530775547361_1966965_6-23328-0_sp0.9 from training. Duration: 0.9555625 2023-10-04 19:38:38,497 WARNING [train.py:1204] (2/4) Exclude cut with ID _45012_210_12651_1_1533608435136_5371380_240-37101-0 from training. Duration: 0.93 2023-10-04 19:38:39,853 WARNING [train.py:1204] (2/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_685-90183-0_sp1.1 from training. Duration: 0.936375 2023-10-04 19:38:41,887 WARNING [train.py:1204] (2/4) Exclude cut with ID _13252_210_1925_1_1530671372047_3336909_41-22245-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 19:38:45,996 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_15126_210_6753_1_1530778677_2591185_120-277707-0 from training. Duration: 0.8 2023-10-04 19:38:46,034 WARNING [train.py:1204] (2/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_310-26186-0 from training. Duration: 0.7 2023-10-04 19:38:46,064 WARNING [train.py:1204] (2/4) Exclude cut with ID _14998_210_5219_1_1530705582080_7474970_787-294416-0 from training. Duration: 0.82 2023-10-04 19:38:50,714 WARNING [train.py:1204] (2/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_795-33252-0 from training. Duration: 0.37 2023-10-04 19:38:52,079 WARNING [train.py:1204] (2/4) Exclude cut with ID _7537_210_2084_1_1528502395431_3924359_113-199933-0_sp0.9 from training. Duration: 0.6555625 2023-10-04 19:38:56,793 WARNING [train.py:1204] (2/4) Exclude cut with ID _3465_210_3525_1_1526175667060_3574049_308-231735-0_sp0.9 from training. Duration: 0.811125 2023-10-04 19:38:56,819 WARNING [train.py:1204] (2/4) Exclude cut with ID _19504_210_9994_1_1531648863242_3325220_354-296765-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 19:38:58,247 WARNING [train.py:1204] (2/4) Exclude cut with ID _5409_210_1388_1_1527292850189_5865191_178-127970-0 from training. Duration: 0.57 2023-10-04 19:38:59,639 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_1466-248056-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 19:38:59,669 WARNING [train.py:1204] (2/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_535-14410-0_sp0.9 from training. Duration: 0.52225 2023-10-04 19:38:59,672 WARNING [train.py:1204] (2/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_747-129941-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 19:38:59,705 WARNING [train.py:1204] (2/4) Exclude cut with ID _15012_210_1968_1_1530856820429_5095350_688-178849-0_sp1.1 from training. Duration: 0.5818125 2023-10-04 19:39:04,174 WARNING [train.py:1204] (2/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_356-45054-0_sp1.1 from training. Duration: 0.7818125 2023-10-04 19:39:04,278 WARNING [train.py:1204] (2/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_146-276944-0_sp0.9 from training. Duration: 0.92225 2023-10-04 19:39:08,354 WARNING [train.py:1204] (2/4) Exclude cut with ID _39204_210_4220_1_1533880547368_3191120_346-180306-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 19:39:08,440 WARNING [train.py:1204] (2/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_641-158017-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 19:39:08,441 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_19952_210_2161_1_1531875469_3851750_274-42137-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 19:39:11,324 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.1.ff3_skip_rate, batch_count=1770226.6666666667, ans=0.0 2023-10-04 19:39:13,030 WARNING [train.py:1204] (2/4) Exclude cut with ID _60804_210_9642_1_1534831199859_7268299_87-327819-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 19:39:14,400 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_9302_210_2550_1_1528889962_4655416_638-301620-0 from training. Duration: 0.47 2023-10-04 19:39:15,756 WARNING [train.py:1204] (2/4) Exclude cut with ID _44304_210_15374_1_1533639352765_4613129_302-22084-0_sp1.1 from training. Duration: 0.7818125 2023-10-04 19:39:15,771 WARNING [train.py:1204] (2/4) Exclude cut with ID _18513_210_9206_1_1531618215223_3862620_177-227063-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 19:39:17,177 WARNING [train.py:1204] (2/4) Exclude cut with ID _41326_210_5854_1_1533778076509_3492080_510-45444-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 19:39:18,828 INFO [train.py:1046] (2/4) Epoch 50, batch 5250, loss[loss=0.1362, simple_loss=0.2149, pruned_loss=0.02874, over 19617.00 frames. ], tot_loss[loss=0.1522, simple_loss=0.2332, pruned_loss=0.03561, over 4729791.40 frames. ], batch size: 43, lr: 2.02e-03, grad_scale: 8.0 2023-10-04 19:39:18,901 WARNING [train.py:1204] (2/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_76-311963-0_sp1.1 from training. Duration: 0.636375 2023-10-04 19:39:19,008 WARNING [train.py:1204] (2/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_156-302675-0_sp0.9 from training. Duration: 0.611125 2023-10-04 19:39:19,115 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.0.nonlin_attention.balancer.min_positive, batch_count=1770293.3333333333, ans=0.05 2023-10-04 19:39:21,725 WARNING [train.py:1204] (2/4) Exclude cut with ID _53489_210_6228_1_1534298357169_3835970_367-148411-0_sp1.1 from training. Duration: 0.936375 2023-10-04 19:39:25,154 WARNING [train.py:1204] (2/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_389-144525-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 19:39:25,195 WARNING [train.py:1204] (2/4) Exclude cut with ID _14069_210_5632_1_1530512692033_4803390_158-238427-0_sp1.1 from training. Duration: 0.963625 2023-10-04 19:39:27,808 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_486-151131-0_sp0.9 from training. Duration: 0.9444375 2023-10-04 19:39:32,083 WARNING [train.py:1204] (2/4) Exclude cut with ID _39846_210_12651_1_1533603651992_1318030_188-71831-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 19:39:34,080 WARNING [train.py:1204] (2/4) Exclude cut with ID _39411_210_5986_1_1533538825030_3343649_173-238248-0_sp1.1 from training. Duration: 0.7545625 2023-10-04 19:39:36,775 WARNING [train.py:1204] (2/4) Exclude cut with ID _20684_210_6917_1_1531567751646_4203930_469-49081-0_sp1.1 from training. Duration: 0.92725 2023-10-04 19:39:36,870 WARNING [train.py:1204] (2/4) Exclude cut with ID _8833_210_5219_1_1528592665210_7553059_611-80313-0_sp1.1 from training. Duration: 0.5818125 2023-10-04 19:39:38,243 WARNING [train.py:1204] (2/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_440-141811-0 from training. Duration: 0.95 2023-10-04 19:39:38,257 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_41211_210_2544_1_1533348859_3702910_236-223652-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 19:39:40,912 WARNING [train.py:1204] (2/4) Exclude cut with ID 210_40222_210_6228_1_1533385298_4481939_220-163998-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 19:39:53,158 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.bypass_mid.scale_min, batch_count=1770426.6666666667, ans=0.2 2023-10-04 19:39:58,125 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.feed_forward1.hidden_balancer.prob, batch_count=1770426.6666666667, ans=0.125 2023-10-04 19:40:09,795 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.4.encoder.layers.2.self_attn_weights.whiten_keys, num_groups=4, num_channels=128, metric=2.62 vs. limit=6.0 2023-10-04 19:40:10,674 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.ff2_skip_rate, batch_count=1770493.3333333333, ans=0.0 2023-10-04 19:40:19,224 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.2.encoder.layers.0.nonlin_attention.whiten2, num_groups=1, num_channels=384, metric=6.84 vs. limit=15.0 2023-10-04 19:40:27,590 INFO [train.py:1046] (2/4) Epoch 50, batch 5300, loss[loss=0.1623, simple_loss=0.2519, pruned_loss=0.03638, over 24648.00 frames. ], tot_loss[loss=0.1513, simple_loss=0.2318, pruned_loss=0.03538, over 4712674.83 frames. ], batch size: 68, lr: 2.02e-03, grad_scale: 8.0 2023-10-04 19:40:33,566 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.feed_forward3.hidden_balancer.prob, batch_count=1770626.6666666667, ans=0.125 2023-10-04 19:40:42,132 INFO [train.py:1310] (2/4) Done!